ME3600X/ME3800X Buffer Oversubscription

On the ME3600X/ME3800X, buffers can be allocated at the port level, the VLAN level and the class level (in that order), so nested class-map and policy-map configurations can be used to push the queue increases right down to the class level (see config below). Below is an example that sets the queue-limit to 100 percent for each class, for each customer, on the same port. Technically this is oversubscribed, but any typical service provider already operates this way with regards to bandwidth, so buffers either need to be plentiful enough to accommodate the bandwidth oversubscription or need to be oversubscribed themselves. In this respect, consider "queue-limit percent" to be for queue/buffer resources what PIR is for bandwidth, i.e. the sum of the PIRs of the queues on a port can exceed the port's PIR. "queue-limit percent" provides flexibility, while an absolute queue-limit provides determinism, i.e. a specific bound on burst tolerance per queue.
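To put a number on the PIR analogy, the sketch below totals the queue-limit percentages for the example port used in this document (two customers, six classes each, every class at 100 percent). The function name is made up for illustration, and it assumes every percentage is taken against the same shared pool.

```python
def total_queue_limit_percent(customers, classes_per_customer, percent_per_class):
    """Sum of the configured queue-limit percentages on one port."""
    return customers * classes_per_customer * percent_per_class


# Two customers share the port, each with the six classes of the child policy.
total = total_queue_limit_percent(customers=2, classes_per_customer=6,
                                  percent_per_class=100)
ratio = total / 100

print(f"Sum of queue-limit percentages on the port: {total}%")
print(f"Buffer oversubscription ratio: {ratio:.0f}:1")
```

This is the same arithmetic as adding up the PIRs of the queues on a port and comparing the total with the port rate.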

Using the nested "branch" and "leaf" policies, one can oversubscribe the queues right down at the class level for every class. When the "queue-limit percent X" command is used, the internal buffer resource allocator grows the queue size dynamically, live as traffic passes through the switch and the buffers are required, then frees that memory and returns it to the shared memory pool when it is unused.

The ME3600X-24FS/24TS have two "Nile" ASICs, one for the 24x1G ports and one for the 2x10G ports. Each has 22MB of memory, less 4MB for ECC, which leaves 18MB on each ASIC for buffers. The ME3800X-24FS also has two "Nile" ASICs, one for the 24x1G ports and one for the 2x10G ports. Each has 176MB of memory, less 30MB for ECC, which leaves 146MB on each ASIC for buffers.
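The grow-and-release behaviour described above can be pictured with a toy model. This is only a sketch: the class and method names are invented for illustration, the queue name is just a label, and it is not a description of how the ASIC's allocator is really implemented. The 18MB pool size is the ME3600X figure given above.

```python
class SharedBufferPool:
    """Toy model of a shared buffer pool (not the real ASIC allocator)."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.in_use = {}  # queue name -> bytes currently held

    def free_bytes(self):
        return self.total_bytes - sum(self.in_use.values())

    def enqueue(self, queue, nbytes, limit_percent=100):
        """Grow a queue if both its percentage limit and the pool allow it."""
        limit = self.total_bytes * limit_percent // 100
        held = self.in_use.get(queue, 0)
        if held + nbytes > limit or nbytes > self.free_bytes():
            return False  # would be a tail drop
        self.in_use[queue] = held + nbytes
        return True

    def dequeue(self, queue, nbytes):
        """Return memory to the shared pool as packets leave the queue."""
        held = self.in_use.get(queue, 0)
        self.in_use[queue] = max(0, held - nbytes)


MB = 1_000_000
pool = SharedBufferPool(18 * MB)  # ME3600X: 22MB less 4MB for ECC

accepted = pool.enqueue("CUST1/APP-1-QG", 3 * MB)  # burst arrives
free_during_burst = pool.free_bytes()
pool.dequeue("CUST1/APP-1-QG", 3 * MB)             # queue drains
free_after_drain = pool.free_bytes()

print(accepted, free_during_burst, free_after_drain)
```

The point of the model is that a queue allowed to use 100 percent only holds memory while it actually has packets queued, which is why every class can be given the same generous limit.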

At present the maximum buffer size any class can be assigned is as follows:

LAB-3800(config-pmap-c)#queue-limit ?
in bytes, in us, in packets by default

819200 bytes is 8MBs. When tested with a 10G link into the ME3600 and a 100Mbps link to the end site, tail drops were present with no queue-limit configured. With 1MB of queue configured, tail drops were partially reduced. With 2MB there were still tail drops, but fewer of them. 3MB or 4MB would probably have removed the tail drops completely, but it's easier to configure "100 percent" instead, otherwise each customer scenario would require testing and assigning the correct queue-limit each time. Configuring 8MB everywhere would stop the drops just as well, but when the giant queue size isn't needed one doesn't want to introduce any unnecessary latency.
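The relationship between queue size and tail drops in that test can be approximated with simple rate arithmetic. The sketch below assumes an idealised burst that arrives back to back at the 10G line rate and drains at a constant 100Mbps, with 1MB taken as 1,000,000 bytes. The function names are illustrative, and real traffic will not be this tidy.

```python
def peak_queue_bytes(burst_bytes, ingress_bps, egress_bps):
    """Peak queue depth while a line-rate burst arrives faster than it drains."""
    if ingress_bps <= egress_bps:
        return 0.0  # the egress keeps up, nothing accumulates
    return burst_bytes * (1 - egress_bps / ingress_bps)


def largest_burst_bytes(queue_bytes, ingress_bps, egress_bps):
    """Largest line-rate burst a queue of queue_bytes absorbs without tail drop."""
    if ingress_bps <= egress_bps:
        return float("inf")
    return queue_bytes / (1 - egress_bps / ingress_bps)


MB = 1_000_000
INGRESS = 10_000_000_000  # 10G link into the switch
EGRESS = 100_000_000      # 100Mbps link to the end site

for queue_mb in (1, 2, 3, 4):
    burst = largest_burst_bytes(queue_mb * MB, INGRESS, EGRESS)
    print(f"{queue_mb}MB queue absorbs a burst of about {burst / MB:.2f}MB")
```

With a 100:1 speed mismatch the egress drains only about 1 percent of a burst while it is arriving, so the queue has to hold very nearly the whole burst. That fits the observation that 1MB and 2MB only reduced the drops rather than removing them.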

Setting the queue limit based on time doesn't seem like a massively better option either, as the maximum of 512000us is just over half a second (512ms). The ideal approach is to avoid the additional buffering when it's not needed and still prevent the queue drops. For example, 3MB might have been enough for the switch to step 10Gbps down to 100Mbps, so configuring 4MB or more might let the buffer fill beyond 3MB before the packets are released. Pushing out the "queue-limit percent 100" configuration, so that buffers are grown as needed and then shrunk again, has worked.
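The latency cost of a deep queue is easy to estimate. The sketch below converts between a time-based limit and bytes, and shows how long a full queue takes to drain. It assumes the conversion is done at the queue's drain rate (100Mbps here), which is an assumption about the platform rather than something confirmed above, and it treats 1MB as 1,000,000 bytes. The function names are illustrative.

```python
def queue_bytes_for_time(limit_us, drain_bps):
    """Bytes that drain in limit_us microseconds at drain_bps."""
    return limit_us * drain_bps / 8_000_000


def drain_time_ms(queue_bytes, drain_bps):
    """Milliseconds needed to empty a full queue at drain_bps."""
    return queue_bytes * 8 * 1000 / drain_bps


MB = 1_000_000
DRAIN = 100_000_000  # 100Mbps link to the end site

max_time_bytes = queue_bytes_for_time(512_000, DRAIN)
print(f"512000us at 100Mbps is {max_time_bytes / MB:.1f}MB of queue")

for queue_mb in (3, 4, 8):
    delay = drain_time_ms(queue_mb * MB, DRAIN)
    print(f"A full {queue_mb}MB queue adds up to {delay:.0f}ms of delay")
```

At 100Mbps every extra megabyte of queue that actually fills adds 80ms of worst-case delay, which is the reason for not configuring a fixed giant queue where it isn't needed.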

policy-map CPE-QOS-OUT-QUEUE-TEST
 class NC-QG
  bandwidth percent 2
  queue-limit percent 100
  police cir percent 10
   conform-action transmit
   exceed-action drop
 class APP-1-QG
  bandwidth percent 22
  queue-limit percent 100
 class APP-2-QG
  bandwidth percent 24
  queue-limit percent 100
 class APP-4-QG
  bandwidth percent 5
  queue-limit percent 100
 class APP-3-QG
  bandwidth percent 12
  queue-limit percent 100
 class class-default
  bandwidth percent 25
  queue-limit percent 100

class-map match-any CUST1-VLANs
 match vlan 100
 match vlan 200

class-map match-any CUST2-VLANs
 match vlan 300
 match vlan 400

! Applying this policy, which matches on VLAN (rather than DSCP, for example),
! directly at the port level means this policy map exists at the VLAN level,
! so its child policy will match at the class level (port queues > VLAN queues > class queues)
policy-map CPE-QOS-SHAPE-CUSTOMERS
 class CUST1-VLANs
  shape average 100000000
  service-policy CPE-QOS-OUT-QUEUE-TEST
 class CUST2-VLANs
  shape average 200000000
  service-policy CPE-QOS-OUT-QUEUE-TEST

interface GigabitEthernet0/1
 description Customer Tail-Circuit Aggregation
 switchport trunk allowed vlan none
 switchport mode trunk
 service-policy output CPE-QOS-SHAPE-CUSTOMERS
 service instance 100 ethernet
  description Cust1 VLAN 100
  encapsulation dot1q 100
  rewrite ingress tag pop 1 symmetric
  bridge-domain 100
 !
 service instance 200 ethernet
  description Cust1 VLAN 200
  encapsulation dot1q 200
  rewrite ingress tag pop 1 symmetric
  bridge-domain 200
 !
 service instance 300 ethernet
  description Cust2 VLAN 300
  encapsulation dot1q 300
  rewrite ingress tag pop 1 symmetric
  bridge-domain 300
 !
 service instance 400 ethernet
  description Cust2 VLAN 400
  encapsulation dot1q 400
  rewrite ingress tag pop 1 symmetric
  bridge-domain 400
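The service instances on the aggregation port all follow the same pattern, so with more than a handful of customers it may be easier to generate them than to type them. The following is a minimal sketch of that idea. The helper name and the customers dictionary are hypothetical, and the output should be reviewed before being pasted into a switch.

```python
def service_instance(vlan, description):
    """Return the config lines for one EVC, mirroring the example above."""
    return [
        f" service instance {vlan} ethernet",
        f"  description {description}",
        f"  encapsulation dot1q {vlan}",
        "  rewrite ingress tag pop 1 symmetric",
        f"  bridge-domain {vlan}",
        " !",
    ]


# Customer name -> VLANs carried on the aggregation port (from the example).
customers = {"Cust1": [100, 200], "Cust2": [300, 400]}

lines = []
for name, vlans in customers.items():
    for vlan in vlans:
        lines.extend(service_instance(vlan, f"{name} VLAN {vlan}"))

config = "\n".join(lines)
print(config)
```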

The following configuration fails to increase the queue-limit for individual traffic classes below the VLAN level because the parent policy simply matches "class-default", so the switch assumes it is a port-level policy.

policy-map PARENT
 class class-default
  ! class-default is assumed to be at the port level, so its child will
  ! only be the second level, which is the VLAN level
  shape average 5000000
  service-policy CHILD

policy-map CHILD
 class class-default
  ! So here the switch thinks we are trying to raise the queue for class-default
  ! at the VLAN level
  queue-limit 512 packets

interface GigabitEthernet0/5
 switchport trunk allowed vlan none
 switchport mode trunk

 service instance 20 ethernet
  encapsulation dot1q 20
  rewrite ingress tag pop 1 symmetric
  bridge-domain 20

S1(config)#interface GigabitEthernet0/5
S1(config-if-srv)#service-policy output PARENT
QOS: queue-limit command not supported in non-leaf classes
QoS: Policy attachment failed for policymap PARENT
%QOSMGR-3-QLIMIT_LEVEL_ERROR: Qlimit command not supported in non-leaf classes

When using QoS with EVCs, a branched hierarchy of policies with the following structure is required, as per the working example above, to push the changes down to the bottom of the port > VLAN > class tree:

policy-map root
 class class-default
  service-policy logical

policy-map logical
 class class-default
  service-policy leaf

policy-map leaf
 class class-default
  queue-limit percent 100