Queuing, Scheduling, Congestion Avoidance/Management

References:
https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k_r5-3/qos/configuration/guide/b_qos_cg53xasr9k/b_qos_cg53xasr9k_chapter_0101.html
https://supportforums.cisco.com/t5/service-providers-documents/asr9000-xr-feature-order-of-operation/ta-p/3135041
https://www.cisco.com/c/en/us/support/docs/switches/catalyst-4000-series-switches/21389-137.html
http://www.ciscopress.com/articles/article.asp?p=352991&seqNum=4

Order of Operations:

The order of operations may vary by device type; the following is typical for a Cisco device:

Ingress classification > ingress policing > ingress marking/re-marking > ingress queuing (congestion management/congestion avoidance) > ingress scheduling > (packet moves across device fabric/backplane) > egress classification > egress policing > egress marking/re-marking > egress queuing (congestion avoidance/congestion management) > egress scheduling.

Cisco ASR9001 specific QoS order-of-operations:

Packet received on the wire > ingress classification > ingress policing action > ingress QoS stats update > L2 rewrite > ingress QoS marking > ingress WRED lookup > Queuing/WRED action into the fabric.

Packet from fabric/punt inject > L2 rewrite > egress QoS classification > egress QoS policing/marking/stats update > egress QoS WRED classification > egress queueing/scheduling/WRED action.

 

Classification:
Classification is the process of placing traffic into different categories based on its existing properties (IP address, port number, CoS/ToS/DSCP/IPP/EXP values, etc.). Classification does not, however, alter bits in the frame or packet.
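As a rough illustration of classification as a read-only step, the sketch below sorts packets into classes from fields they already carry. The Packet type, the class names ("voice", "web", "class-default") and the match rules are hypothetical and chosen only for the example; they are not taken from any Cisco configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the classifier can read fields but not alter them
class Packet:
    src_ip: str
    dst_port: int
    dscp: int


def classify(pkt: Packet) -> str:
    """Return a traffic-class name based only on existing packet fields."""
    if pkt.dscp == 46:              # DSCP 46 is EF (Expedited Forwarding)
        return "voice"
    if pkt.dst_port in (80, 443):   # hypothetical rule: HTTP/HTTPS ports
        return "web"
    return "class-default"          # everything that matched no rule


pkt = Packet(src_ip="192.0.2.10", dst_port=443, dscp=0)
print(classify(pkt))  # web
```

The packet is unchanged after classify() returns; only the class label is new.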

 

Marking:
Marking alters bits (for example, bits in the ToS byte) within a frame, cell, or packet to indicate how the network should treat that traffic. Marking alone does not change how the network treats a packet. Other tools (for example, queuing tools) can, however, reference those markings and make decisions based on them.
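To make the bit layout concrete, here is a minimal sketch of how a DSCP value sits in the ToS byte: DSCP occupies the upper six bits, and IP Precedence is the upper three bits of the same byte. The function names are made up for this example.

```python
def dscp_to_tos(dscp: int, ecn: int = 0) -> int:
    """Build a ToS byte: DSCP in the upper 6 bits, ECN in the lower 2 bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN is a 2-bit value (0-3)")
    return (dscp << 2) | ecn


def tos_to_dscp(tos: int) -> int:
    """Extract the 6-bit DSCP value from a ToS byte."""
    return (tos >> 2) & 0x3F


def tos_to_ip_precedence(tos: int) -> int:
    """Extract the 3-bit IP Precedence value from a ToS byte."""
    return (tos >> 5) & 0x07


tos = dscp_to_tos(46)                 # mark as EF (DSCP 46)
print(hex(tos))                       # 0xb8
print(tos_to_ip_precedence(tos))      # 5
```

A device that only understands IP Precedence therefore reads an EF-marked packet as precedence 5.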

 

 

Congestion Avoidance:
If an interface's output queue fills to capacity, newly arriving packets are discarded ("tail-dropped"), regardless of the priority assigned to the discarded packet. To prevent this behaviour, Cisco uses a congestion avoidance technique called Weighted Random Early Detection (WRED). Once the queue depth reaches a configurable level (the minimum threshold) for a particular priority marking (for example, IP Precedence or DSCP), WRED starts discarding packets with that marking at random, with a low probability. As the queue depth continues to increase, the probability of discard increases until a configurable maximum threshold is reached. Once the queue depth exceeds the maximum threshold for traffic with a specific marking, every packet with that marking is discarded. Congestion avoidance techniques monitor traffic flow in an effort to anticipate and avoid congestion at common network bottlenecks. They act before congestion occurs, whereas congestion management techniques control congestion after it has occurred.
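The threshold behaviour described above can be sketched as a drop-probability function. This is a simplified model, not Cisco's implementation: it assumes a linear ramp between the two thresholds and takes the (average) queue depth as an input rather than computing it. The parameter names and the per-DSCP profile values are hypothetical.

```python
def wred_drop_probability(avg_depth: float, min_th: float,
                          max_th: float, max_prob: float) -> float:
    """Drop probability for one marking, given its WRED profile.

    Below min_th nothing is dropped; between the thresholds the
    probability ramps linearly up to max_prob; above max_th every
    packet is dropped.
    """
    if avg_depth < min_th:
        return 0.0
    if avg_depth > max_th:
        return 1.0
    return max_prob * (avg_depth - min_th) / (max_th - min_th)


# Hypothetical per-DSCP profiles: (min_th, max_th, max_prob), depths in packets.
# The lower-priority marking starts dropping earlier.
profiles = {
    10: (20, 40, 0.10),   # AF11
    46: (35, 40, 0.10),   # EF
}

for dscp, (min_th, max_th, max_prob) in profiles.items():
    p = wred_drop_probability(30, min_th, max_th, max_prob)
    print(f"DSCP {dscp}: drop probability at depth 30 = {p:.2f}")
```

At a depth of 30 packets, the AF11 profile is already dropping some packets while the EF profile is not, which is the "weighted" part of WRED.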

Congestion avoidance techniques include:
Taildrop/FIFO
RED/WRED (Random Early Detect / Weighted Random Early Detect)

 

Congestion Management:
Congestion management means queuing. When an interface's output software queue contains packets, the interface's queuing strategy determines how the packets are emptied from the queue. For example, some traffic types can be given priority treatment, and bandwidth amounts can be made available for specific classes of traffic. Congestion management controls congestion after it has occurred on a network.

Congestion management techniques include:
MDRR (Modified Deficit Round Robin) - MDRR is a class-based composite scheduling mechanism that allows for queueing of up to eight traffic classes. It operates in the same manner as class-based weighted fair queueing (CBWFQ) and allows definition of traffic classes based on customer match criteria (such as access lists); however, MDRR does not use the weighted fair queuing algorithm.

ECN (Explicit Congestion Notification) - ECN is an extension to WRED (Weighted Random Early Detection). With ECN, packets are marked instead of dropped when the average queue length exceeds a specific threshold value. When configured, ECN signals to routers and end hosts that the network is congested so that senders slow down. However, if the number of packets in the queue is above the maximum threshold, packets are dropped based on the drop probability. This is the identical treatment a packet receives when WRED is enabled without ECN configured on the router.

PQ/SP/LLQ (Priority Queueing/Strict Priority/Low Latency Queueing) - A priority queue + any of the scheduling mechanisms (RR/WRR/WFQ/DRR/WDRR).
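The ECN behaviour described in the list above can be sketched as a per-packet decision on top of WRED. This is a simplified model under stated assumptions: a linear drop-probability ramp between the thresholds, and a caller-supplied random draw u in [0, 1) so that the result is deterministic. The function name and the returned action strings are hypothetical; the two-bit ECN codepoints are those defined in RFC 3168.

```python
NOT_ECT = 0b00   # sender is not ECN-capable
ECT_1 = 0b01     # ECN-capable transport
ECT_0 = 0b10     # ECN-capable transport
CE = 0b11        # congestion experienced


def wred_ecn_action(avg_depth: float, min_th: float, max_th: float,
                    max_prob: float, ecn_bits: int, u: float) -> str:
    """Return 'transmit', 'mark-ce' or 'drop' for one arriving packet."""
    if avg_depth < min_th:
        return "transmit"                 # no congestion signal needed
    if avg_depth > max_th:
        return "drop"                     # same as WRED without ECN
    p = max_prob * (avg_depth - min_th) / (max_th - min_th)
    if u >= p:
        return "transmit"                 # packet not selected by WRED
    if ecn_bits == NOT_ECT:
        return "drop"                     # endpoint cannot react to a mark
    if ecn_bits == CE:
        return "transmit"                 # already marked upstream
    return "mark-ce"                      # set CE instead of dropping


print(wred_ecn_action(30, 20, 40, 0.1, ECT_0, u=0.01))    # mark-ce
print(wred_ecn_action(30, 20, 40, 0.1, NOT_ECT, u=0.01))  # drop
```

In a real queue, u would come from a random number generator for each packet.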

 

Scheduling Techniques:
The following are some examples of the scheduling techniques that can be used during congestion management. They are different ways of deciding which packets should be de-queued first to the interface for transmission; many of them don't provide bandwidth guarantees:

FIFO (First-In, First-Out):
First-in first-out (FIFO) queuing is not truly performing QoS operations. As its name suggests, the first packet to come into the queue is the first packet sent out of the queue.

PQ (Priority Queueing):
This type of queuing places traffic into one of four queues. Each queue has a different level of priority, and higher-priority queues must be emptied before packets are emptied from lower-priority queues. This behaviour can "starve out" lower-priority traffic. There is no bandwidth guarantee or reservation here, simply a priority ranking.
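A minimal sketch of strict priority dequeuing with four queues follows. The class and method names are hypothetical; the queue names mirror the high/medium/normal/low levels of legacy Cisco PQ.

```python
from collections import deque


class PriorityQueuing:
    """Four FIFO queues served in strict priority order."""

    LEVELS = ("high", "medium", "normal", "low")

    def __init__(self):
        self.queues = {level: deque() for level in self.LEVELS}

    def enqueue(self, level: str, packet) -> None:
        self.queues[level].append(packet)

    def dequeue(self):
        """Return the next packet, always from the highest non-empty queue."""
        for level in self.LEVELS:
            if self.queues[level]:
                return self.queues[level].popleft()
        return None  # all queues are empty


pq = PriorityQueuing()
pq.enqueue("low", "L1")
pq.enqueue("high", "H1")
pq.enqueue("low", "L2")
pq.enqueue("high", "H2")
print([pq.dequeue() for _ in range(4)])  # ['H1', 'H2', 'L1', 'L2']
```

If the high queue is never empty, dequeue() never reaches the lower queues, which is the starvation problem described above.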

P-WFQ (Priority-Weighted Fair Queueing):
A set of priority queues, with WFQ or a variant used between subsets of queues at the same priority.

RR/WRR (Round Robin, Weighted Round Robin):
Round-Robin queuing places traffic into multiple queues, and packets are removed from these queues in a round-robin fashion, which avoids the protocol-starvation issue that PQ suffered from. Weighted Round Robin can place a weight on the various queues, to service a different number of bytes or packets from the queues during a round-robin cycle. Custom Queuing (CQ) is an example of a WRR queuing approach.
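As an illustration, the sketch below implements a packet-count WRR: each queue may send up to its weight in packets per cycle. The function name, queue names and weights are hypothetical; a byte-count variant would compare packet sizes against a byte allowance instead.

```python
from collections import deque


def weighted_round_robin(queues: dict, weights: dict) -> list:
    """Drain all queues, serving up to weights[name] packets per cycle."""
    order = []
    while any(queues.values()):          # stop once every queue is empty
        for name, q in queues.items():
            for _ in range(weights[name]):
                if not q:
                    break                # nothing left in this queue
                order.append(q.popleft())
    return order


queues = {
    "gold": deque(["g1", "g2", "g3"]),
    "bronze": deque(["b1", "b2", "b3"]),
}
print(weighted_round_robin(queues, {"gold": 2, "bronze": 1}))
# ['g1', 'g2', 'b1', 'g3', 'b2', 'b3']
```

The bronze queue is served in every cycle, so it is never starved, but gold gets twice as many transmit opportunities.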

DRR/WDRR (Deficit Round Robin, Weighted Deficit Round Robin):
WRR-style queuing such as CQ can suffer from a "deficit" issue. For example, if CQ is configured to remove 1500 bytes from a queue during each round-robin cycle, and the queue holds a 1499-byte packet followed by a 1500-byte packet, both packets are sent. Once the 1499-byte packet has been transmitted, one byte of the allowance is still unserviced, so CQ starts servicing the 1500-byte packet, and because CQ cannot send a partial packet, it sends all of it. DRR keeps track of the number of extra bytes that are sent during a round and subtracts that number from the number of bytes that can be sent during the next round.
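The sketch below follows the deficit behaviour as described above: a queue may overshoot its per-round allowance, and the overshoot is debited from its next round. Note that this differs from the classic DRR algorithm, which only sends a packet when the deficit counter already covers its full size. The function name and the rule that an idle queue forgets its credit or debt are simplifying assumptions.

```python
from collections import deque


def deficit_round_robin(queues: dict, quantum: dict, rounds: int) -> list:
    """Return (round, queue, size) for each packet sent.

    queues maps a name to a deque of packet sizes in bytes;
    quantum maps a name to the bytes added to its allowance per round.
    """
    deficit = {name: 0 for name in queues}
    sent = []
    for rnd in range(rounds):
        for name, q in queues.items():
            if not q:
                deficit[name] = 0        # idle queue carries nothing forward
                continue
            deficit[name] += quantum[name]
            # Send while some allowance remains; the last packet may overshoot,
            # leaving a negative deficit that is repaid in the next round.
            while q and deficit[name] > 0:
                size = q.popleft()
                deficit[name] -= size
                sent.append((rnd, name, size))
    return sent


queues = {"a": deque([1499, 1500, 1499, 1500]), "b": deque([500, 500])}
for entry in deficit_round_robin(queues, {"a": 1500, "b": 500}, rounds=3):
    print(entry)
```

Queue "a" sends 2999 bytes in round 0, exactly as in the CQ example, but then only one packet in each of rounds 1 and 2 because it starts those rounds 1499 and 1498 bytes in debt.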


WFQ/CB-WFQ (Weighted Fair Queueing, Class-Based Weighted Fair Queueing):
Weighted Fair Queuing (by default) uses IP Precedence values to provide a weighting to Fair Queuing (FQ). When emptying the queues, FQ does byte-by-byte scheduling. Specifically, FQ looks 1 byte deep into each queue to determine whether an entire packet can be sent. FQ then looks another byte deep into the queue to determine whether an entire packet can be sent. As a result, smaller traffic flows and smaller packet sizes have priority over bandwidth-hungry flows with large packets.
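One common way to approximate byte-by-byte scheduling is to give every packet a "finish" number (the point at which its last byte would be sent if the queues were served one byte at a time) and transmit packets in order of that number. The sketch below is a simplified model that assumes all packets are already queued at the start, so no virtual clock is needed. The function name and the weights are hypothetical; a larger weight means a larger share.

```python
import heapq


def wfq_order(flows: dict, weights: dict) -> list:
    """Return (flow, size) tuples in WFQ transmission order.

    flows maps a flow name to its queued packet sizes in bytes (FIFO order).
    """
    heap = []
    seq = 0                                   # tie-breaker for equal finish numbers
    for flow, sizes in flows.items():
        finish = 0.0
        for size in sizes:
            finish += size / weights[flow]    # weighted byte-by-byte finish number
            heapq.heappush(heap, (finish, seq, flow, size))
            seq += 1
    order = []
    while heap:
        _, _, flow, size = heapq.heappop(heap)
        order.append((flow, size))
    return order


flows = {"bulk": [1500, 1500], "voice": [100, 100, 100]}
print(wfq_order(flows, {"bulk": 1, "voice": 1}))
```

With equal weights, all three 100-byte voice packets are sent before the first 1500-byte bulk packet, which shows why small packets and low-volume flows are favoured.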

The WFQ mechanism makes sure that no traffic is starved out. However, WFQ does not make a specific amount of bandwidth available for defined traffic types. One can, however, specify a minimum amount of bandwidth to make available for various traffic types using the CB-WFQ mechanism. Traffic for each class-map goes into a separate queue. Therefore, one queue can be overflowing while other queues are still accepting packets. Bandwidth allocations for various class-maps can be specified in one of three ways: bandwidth, percentage of bandwidth, and percentage of remaining bandwidth.
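The arithmetic behind the three allocation styles can be sketched as follows. This is only an illustration: the function name and the tuple format are hypothetical, and "remaining" is assumed here to mean the link bandwidth minus whatever has already been reserved for other classes.

```python
def class_bandwidth_kbps(link_kbps: float, spec: tuple,
                         reserved_kbps: float = 0.0) -> float:
    """Resolve one class's minimum bandwidth to kbps.

    spec is (kind, value) where kind is one of:
      'bandwidth'          value is an absolute rate in kbps
      'percent'            value is a percentage of the link bandwidth
      'remaining-percent'  value is a percentage of (link - reserved)
    """
    kind, value = spec
    if kind == "bandwidth":
        return float(value)
    if kind == "percent":
        return link_kbps * value / 100
    if kind == "remaining-percent":
        return (link_kbps - reserved_kbps) * value / 100
    raise ValueError(f"unknown allocation kind: {kind}")


link = 10_000  # hypothetical 10 Mbps link
print(class_bandwidth_kbps(link, ("bandwidth", 512)))                  # 512.0
print(class_bandwidth_kbps(link, ("percent", 20)))                     # 2000.0
print(class_bandwidth_kbps(link, ("remaining-percent", 50), 2_000))    # 4000.0
```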

LLQ (Low Latency Queueing):
Low Latency Queuing (LLQ) is almost identical to CB-WFQ. However, with LLQ, one or more class-maps can direct traffic into a priority queue. Placing packets in a priority queue not only allocates a bandwidth amount for that traffic but also polices it (that is, limits the bandwidth available to it). The policing is necessary to prevent higher-priority traffic from starving out lower-priority traffic.
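The interaction between the priority queue and its policer can be sketched with a deliberately coarse model: one scheduling interval on a congested link, a byte budget for the whole link, and a smaller policed budget for the priority queue. The function name, the interval model and the single non-priority queue are assumptions made for this example; a real LLQ implementation polices with a token bucket and schedules the remaining classes with CB-WFQ.

```python
from collections import deque


def llq_interval(priority: deque, other: deque,
                 link_bytes: int, priority_limit_bytes: int):
    """Serve one interval. Returns (sent, dropped).

    Priority packets go first, but only up to priority_limit_bytes;
    priority packets beyond that limit are dropped (policed).
    The other queue gets whatever link capacity is left.
    """
    sent, dropped = [], []
    budget = min(priority_limit_bytes, link_bytes)
    used = 0
    while priority:
        size = priority.popleft()
        if used + size <= budget:
            used += size
            sent.append(("priority", size))
        else:
            dropped.append(size)          # exceeds the policed rate
    remaining = link_bytes - used
    while other and other[0] <= remaining:
        size = other.popleft()
        remaining -= size
        sent.append(("other", size))
    return sent, dropped


sent, dropped = llq_interval(deque([200] * 5), deque([500, 500, 500]),
                             link_bytes=1500, priority_limit_bytes=600)
print(sent)     # three priority packets, then one 500-byte packet
print(dropped)  # [200, 200]
```

Without the 600-byte limit, the five priority packets would consume 1000 of the 1500 bytes and the other queue would send only one packet per interval at best; with a larger priority burst it would send none.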

Hierarchical Scheduling:
Here there are multiple levels of schedulers, apportioning the link bandwidth at successive levels of the hierarchy.

 

Policing and Shaping:
Collectively, these tools are called traffic conditioners. Policing can be used in either the inbound or outbound direction, and it typically discards packets that exceed the configured rate limit. Because policing drops packets, resulting in retransmissions, it is recommended for use on higher-speed interfaces. Policing mechanisms also allow one to rewrite packet markings (for example, IP Precedence markings) if the configured rate is exceeded.
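A policer is commonly modelled as a token bucket. The sketch below is a minimal single-rate version, assuming tokens are counted in bytes and timestamps are supplied by the caller. The class name and the "conform"/"exceed" result strings are hypothetical labels; what happens to an exceeding packet (drop or re-mark) is left to the caller.

```python
class TokenBucketPolicer:
    """Single-rate token bucket: rate in bits per second, burst in bytes."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8          # refill rate in bytes per second
        self.burst = burst_bytes          # bucket depth
        self.tokens = burst_bytes         # start with a full bucket
        self.last = 0.0                   # time of the previous packet

    def offer(self, now: float, size: int) -> str:
        """Classify a packet of `size` bytes arriving at time `now` (seconds)."""
        # Refill for the elapsed time, never beyond the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return "conform"
        return "exceed"                   # caller drops or re-marks the packet


policer = TokenBucketPolicer(rate_bps=8000, burst_bytes=1500)
print(policer.offer(0.0, 1500))  # conform (uses the whole burst)
print(policer.offer(0.0, 100))   # exceed (bucket is empty)
print(policer.offer(1.0, 1000))  # conform (1000 bytes refilled in 1 s)
```

The policer never holds a packet: each one is judged the moment it arrives.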

Shaping can be applied only in the outbound direction. Instead of discarding traffic that exceeds the configured rate limit, shaping delays the exceeding traffic by buffering it until bandwidth becomes available. That is why shaping preserves bandwidth, as compared to policing, at the expense of increased delay. Therefore, shaping is recommended for use on slower-speed interfaces. Also, shaping does not have policing's ability to rewrite packet markings.
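To contrast with policing, the sketch below computes when each packet would leave a shaper that delays excess traffic instead of dropping it. It is a simplified model: no burst allowance, an unlimited buffer, and packets released back to back at the shaped rate. The function name is hypothetical.

```python
def shape_departures(arrivals: list, rate_bps: float) -> list:
    """Return the transmit start time for each packet.

    arrivals is a list of (time_seconds, size_bytes) sorted by time.
    A packet is held until the previous one has been sent at the shaped rate.
    """
    rate = rate_bps / 8                   # bytes per second
    next_free = 0.0                       # earliest time the shaper can send again
    departures = []
    for arrival, size in arrivals:
        start = max(arrival, next_free)   # wait in the buffer if necessary
        departures.append(start)
        next_free = start + size / rate
    return departures


arrivals = [(0.0, 1000), (0.0, 1000), (0.5, 500)]
print(shape_departures(arrivals, rate_bps=8000))  # [0.0, 1.0, 2.0]
```

No packet is lost, but the third packet waits 1.5 seconds in the buffer, which is the extra delay that shaping trades for fewer drops.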