Date created: Wednesday, August 26, 2015 9:03:33 AM. Last modified: Thursday, August 11, 2022 8:39:07 AM
IOS-XR Troubleshooting and Diagnostics
For ASR9000 Series Routers
References:
https://supportforums.cisco.com/document/135941/asr9000xr-understanding-platform-diags-3-puntfabricdatapathfailed#What_to_collect_if_there_is_still_an_issue
BRKARC-2017 - Packet Journey Inside ASR 9000
BRKSPG-3612 Troubleshooting IOS-XR
http://www.cisco.com/c/en/us/support/docs/routers/asr-9000-series-aggregation-services-routers/116999-problem-line-card-00.html
Contents:
System OS and RSP
Forwarding Plane
Line Card / MPA Level
NP / Bridge / FIA Level
PHY / MAC / Interface / NP Level
Top 10 processes by CPU usage; by default this command refreshes every second for 5 iterations. Check other variants with "run top --help" and "run top_procs --help":
RP/0/RSP0/CPU0:abr1#run top_procs -D -d 1 -i 5 -c -n 10
Wed Dec 23 14:35:00.634 UTC
Computing times...
node0_RSP0_CPU0: 343 procs, 4 cpus, 1.04 delta, 418:20:11 uptime
Memory: 8192 MB total, 5.395 GB free, sample time: Wed Dec 23 14:35:01 2015
cpu 0 idle: 73.10%, cpu 1 idle: 50.29%, cpu 2 idle: 54.48%, cpu 3 idle: 52.78%, total idle: 57.66%, kernel: 1.26%
pid mem MB user cpu kernel cpu delta % ker % tot name
12689629 325.039 547565.102 42929.053 0.836 0.77 20.81 bgp
12681378 15.496 318671.569 15964.698 0.515 0.24 12.82 mpls_lsd
12124317 3.792 165476.455 4056.820 0.124 0.02 3.08 fib_mgr
192592 8.167 145014.753 22639.398 0.047 0.12 1.17 gsp
565552 200.871 62974.333 1847.591 0.040 0.02 0.99 ipv4_rib
241797 6.503 12548.085 1469.881 0.024 0.02 0.59 netio
57385 53.511 64834.412 10031.991 0.017 0.12 0.42 eth_server
192607 3.027 30455.305 1447.194 0.014 0.02 0.34 cluster_clm_rp
241793 1.578 10269.683 1606.126 0.010 0.02 0.24 nrssvr
192608 1.375 9321.330 1919.364 0.005 0.04 0.12 cluster_dlm_rp
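As a sanity check on the figures above: the "total idle" value is simply the mean of the per-core idle percentages, and a single runaway thread can only saturate one core, so on a four-core RSP it tops out around 25% of total utilisation. A minimal sketch using the sample values from the output above:

```python
# Sketch: derive overall CPU figures from the per-core idle percentages
# reported by "run top_procs" (values taken from the sample output above).
per_core_idle = [73.10, 50.29, 54.48, 52.78]

# Overall idle is the mean across cores.
total_idle = sum(per_core_idle) / len(per_core_idle)
print(f"total idle: {total_idle:.2f}%")  # matches the 57.66% in the output

# A single spinning thread can only occupy one core, so on a 4-core CPU
# it shows up as at most ~25% of "total" utilisation.
one_core_pinned = 100.0 / len(per_core_idle)
print(f"one pinned core: {one_core_pinned:.1f}% overall")
```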
One can view and restart processes with:
show processes fib_mgr location 0/0/CPU0
process restart fib_mgr location 0/0/CPU0
or:
process restart bgp location all
Process/thread/job debugging:
In this example BGP is stuck in a loop and chewing up one of four CPU cores at nearly 100% (25% overall)
RP/0/RSP0/CPU0:abr1#show proc cpu | ex 0%
Mon Dec 21 13:53:20.237 UTC
CPU utilization for one minute: 61%; five minutes: 54%; fifteen minutes: 53%
PID 1Min 5Min 15Min Process
57385 1% 1% 1% eth_server
176174 1% 1% 1% sysdb_svr_admin
192592 6% 3% 3% gsp
192605 2% 1% 1% sysdb_mc
192607 1% 1% 1% cluster_clm_rp
241776 1% 1% 1% sysdb_shared_nc
565552 1% 1% 1% ipv4_rib
569666 1% 1% 1% snmpd
569671 2% 2% 2% mibd_route
12124317 6% 6% 6% fib_mgr
12681378 14% 14% 14% mpls_lsd
12689629 23% 23% 23% bgp
RP/0/RSP0/CPU0:abr1#show processes distribution bgp all
Mon Dec 21 14:00:55.321 UTC
1 process found
NODE PID JID #THR TYPE PROGRAM
0/RSP0/CPU0 12689629 1049 23 RP bgp
RP/0/RSP0/CPU0:abr1#show processes bgp loc 0/RSP0/CPU0
Mon Dec 21 13:48:53.521 UTC
Job Id: 1049
PID: 12689629
Executable path: /disk0/iosxr-routing-4.3.4.sp10-1.0.0/0x100000/bin/bgp
RP/0/RSP0/CPU0:abr1#show processes threadname 1049 location 0/RSP0/CPU0
Mon Dec 21 13:50:27.921 UTC
JID TID ThreadName pri state TimeInState NAME
1049 1 bgp-io-control 10 Receive 0:00:01:0425 bgp
1049 2 cdm_evm_thread 10 Receive 161:22:27:0247 bgp
1049 3 10 Receive 161:22:27:0228 bgp
1049 4 bgp-rpki 10 Receive 0:00:00:0376 bgp
1049 5 spp_nsr_client_conn_monitor 10 Receive 161:22:25:0691 bgp
1049 6 10 Sigwaitinfo 161:22:26:0585 bgp
1049 7 bgp-label 10 Reply 0:00:00:0000 bgp
1049 8 bgp-mgmt 10 Receive 0:01:17:0202 bgp
1049 9 bgp-rib-upd-0 10 Receive 0:00:00:0113 bgp
1049 10 lspv_lib BGPv4 10 Nanosleep 0:00:01:0148 bgp
1049 11 bgp-rib-upd-1 10 Receive 0:00:04:0836 bgp
1049 12 bgp-io-read 10 Receive 0:00:00:0001 bgp
1049 13 bgp-io-write 10 Receive 0:00:00:0001 bgp
1049 14 bgp-router 10 Receive 0:00:00:0000 bgp
1049 15 bgp-import 10 Receive 0:00:00:0000 bgp
1049 16 bgp-upd-gen 10 Receive 0:00:00:0153 bgp
1049 17 bgp-sync-active 10 Receive 0:01:04:0920 bgp
1049 18 bgp-crit-event 10 Receive 0:00:10:0717 bgp
1049 19 bgp-event 10 Receive 0:00:37:0026 bgp
1049 20 bgp-mib-trap 10 Receive 161:22:25:0509 bgp
1049 21 bgp-io-ka 10 Receive 0:00:01:0609 bgp
1049 22 bgp-l2vpn-thr 10 Receive 0:01:04:0847 bgp
1049 23 async 10 Receive 161:22:24:0439 bgp
RP/0/RSP0/CPU0:abr1#show processes blocked
Mon Dec 21 13:52:49.491 UTC
Jid Pid Tid Name State TimeInState Blocked-on
65551 16399 1 ksh Reply 369:37:44:0916 16395 devc-conaux
97 53279 2 umass-enum Reply 369:37:49:0704 1 kernel
97 53279 6 umass-enum Reply 369:37:47:0517 57380 io-usb
97 53279 7 umass-enum Reply 369:37:47:0517 1 kernel
65568 65568 2 devb-umass Reply 0:00:00:0041 57380 io-usb
52 90154 2 attachd Reply 369:37:46:0151 57385 eth_server
52 90154 3 attachd Reply 369:37:46:0108 24595 mqueue
89 90155 6 qnet Reply 0:00:03:0533 57385 eth_server
51 90161 2 attach_server Reply 369:37:45:0843 24595 mqueue
432 192602 1 tftp_server Reply 161:37:06:0385 24595 mqueue
315 557276 2 lpts_fm Reply 0:00:00:0239 245879 lpts_pa
1049 12689629 7 bgp Reply 0:00:00:0002 12681378 mpls_lsd
326 569668 7 mibd_entity Reply 0:00:00:0003 192605 sysdb_mc
65870 21451086 1 exec Reply 0:00:00:0258 1 kernel
65877 21598549 1 more Reply 0:00:00:0010 24593 pipe
65878 21598550 1 show_processes Reply 0:00:00:0000 1 kernel
RP/0/RSP0/CPU0:abr1#show proc pidin | in bgp
RP/0/RSP0/CPU0:abr1#follow process 12689629
RP/0/RSP0/CPU0:abr1#follow process 12689629 thread 7 verbose location 0/RSP0/CPU0
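When scanning "show processes blocked" output like the above, what matters is how long a thread has been Reply-blocked and on what: console and USB waiters (ksh, umass-enum, etc.) are blocked forever by design, while an application thread stuck on another process for a long time hints at a wedged IPC chain. A minimal sketch of that triage logic, assuming TimeInState is hours:minutes:seconds:milliseconds (the filter names below are taken from the sample output, not an authoritative list):

```python
# Sketch: flag long-blocked threads from "show processes blocked".
# Threads that are Reply-blocked forever by design are filtered out;
# this HARMLESS set is illustrative, based on the sample output above.
HARMLESS = {"ksh", "umass-enum", "devb-umass", "attachd",
            "attach_server", "tftp_server", "qnet"}

def blocked_seconds(time_in_state: str) -> float:
    # TimeInState is rendered as H:MM:SS:ms in the CLI output.
    hours, minutes, seconds, millis = (int(x) for x in time_in_state.split(":"))
    return hours * 3600 + minutes * 60 + seconds + millis / 1000.0

def suspicious(name: str, time_in_state: str, threshold_s: float = 60.0) -> bool:
    return name not in HARMLESS and blocked_seconds(time_in_state) > threshold_s

# bgp briefly Reply-blocked on mpls_lsd: normal IPC, not a hang.
print(suspicious("bgp", "0:00:00:0002"))           # False
# l2vpn_mgr Reply-blocked on lspv_server for months: worth chasing.
print(suspicious("l2vpn_mgr", "6021:12:51:0240"))  # True
```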
Below, the "ce_switch" process (the EOBC process in IOS-XR) is stuck on the active RSP in an ASR9006, chewing up one of the four CPU cores at 100% (so 25% overall):
RP/0/RSP1/CPU0:ASR9006#show processes distribution ce_switch all
Wed Sep 20 15:19:31.209 BST
2 processes found
NODE PID JID #THR TYPE PROGRAM
0/RSP0/CPU0 106542 54 17 RP ce_switch
0/RSP1/CPU0 106542 54 17 RP ce_switch
RP/0/RSP1/CPU0:ASR9006#show proc cpu location 0/RSP1/CPU0 | ex 0%
Wed Sep 20 15:24:14.247 BST
CPU utilization for one minute: 26%; five minutes: 26%; fifteen minutes: 26%
PID 1Min 5Min 15Min Process
106542 25% 25% 25% ce_switch
RP/0/RSP1/CPU0:ASR9006#sh proc block location 0/RSP1/CPU0
Wed Sep 20 15:20:35.944 BST
Jid Pid Tid Name State TimeInState Blocked-on
65548 12300 1 ksh Reply 18114:21:39:0973 12299 devc-conaux
95 36892 2 umass-enum Reply 18114:21:41:0191 1 kernel
95 36892 6 umass-enum Reply 18114:21:38:0829 106546 io-usb
95 36892 7 umass-enum Reply 18114:21:38:0829 1 kernel
53 102433 2 attachd Reply 18114:21:41:0542 98343 eth_server
53 102433 3 attachd Reply 18114:21:41:0543 16399 mqueue
87 102441 6 qnet Reply 0:00:00:0023 98343 eth_server
52 106543 2 attach_server Reply 18114:21:41:0507 16399 mqueue
65587 106547 2 devb-umass Reply 0:00:08:0132 106546 io-usb
443 307290 1 tftp_server Reply 912:30:15:0642 16399 mqueue
212 340088 1 envmon Mutex 0:00:01:0948 340088-05 #1
326 667887 2 lpts_fm Reply 0:00:00:0051 352397 lpts_pa
1181 676137 13 l2vpn_mgr Reply 6021:12:51:0240 680280 lspv_server
1048 680281 12 mpls_ldp Reply 6021:12:51:0251 680280 lspv_server
65882 542187866 1 exec Reply 0:00:00:0077 1 kernel
1054 684387 11 bgp Reply 18114:17:00:0032 680280 lspv_server
65899 542634347 1 more Reply 0:00:00:0013 16397 pipe
65901 542392685 1 exec Reply 0:05:26:0223 667876 devc-vty
65902 542634350 1 show_processes Reply 0:00:00:0000 1 kernel
RP/0/RSP1/CPU0:ASR9006#sh processes threadname location 0/RSP1/CPU0 | i "NAME|ce_switch"
Wed Sep 20 15:21:09.067 BST
JID TID ThreadName pri state TimeInState NAME
54 1 main 10 Receive 0:00:01:0634 ce_switch
54 2 10 Receive 18114:21:30:0688 ce_switch
54 3 bcmDPC 50 Running 0:00:00:0000 ce_switch
54 4 bcmCNTR.0 10 Nanosleep 0:00:00:0069 ce_switch
54 5 bcmTX 56 Sem 18114:22:09:0575 ce_switch
54 6 bcmXGS3AsyncTX 56 Sem 18114:22:09:0573 ce_switch
54 7 bcmLINK.0 50 Nanosleep 0:00:00:0039 ce_switch
54 8 pfm_svr 10 Receive 0:00:00:0451 ce_switch
54 9 interruptThread 56 Intr 0:00:00:0000 ce_switch
54 10 udld_thread 56 Receive 0:00:00:0027 ce_switch
54 11 clm punch counter 10 Receive 0:00:00:0009 ce_switch
54 12 bcmRX 56 Nanosleep 0:00:00:0004 ce_switch
54 13 clm_eth_server_rx_thread 10 Receive 0:00:00:0399 ce_switch
54 14 clm_timer_event 10 Receive 18114:21:30:0688 ce_switch
54 15 clm status thread 10 Receive 0:00:00:0057 ce_switch
54 16 clm_active_eobc_periodic_update_thread 10 Nanosleep 0:00:00:0460 ce_switch
54 17 async 10 Receive 5:48:11:0857 ce_switch
RP/0/RSP1/CPU0:ASR9006#show proc pidin location 0/RSP1/CPU0 | in "STATE|ce_switch"
Wed Sep 20 15:23:32.012 BST
pid tid name prio STATE Blocked
106542 1 pkg/bin/ce_switch 10r RECEIVE 1
106542 2 pkg/bin/ce_switch 10r RECEIVE 5
106542 3 pkg/bin/ce_switch 50r SEM
106542 4 pkg/bin/ce_switch 10r NANOSLEEP
106542 5 pkg/bin/ce_switch 56r SEM f9dec890
106542 6 pkg/bin/ce_switch 56r SEM f9dec8a4
106542 7 pkg/bin/ce_switch 50r NANOSLEEP
106542 8 pkg/bin/ce_switch 10r RECEIVE 8
106542 9 pkg/bin/ce_switch 56r INTR
106542 10 pkg/bin/ce_switch 56r RECEIVE 15
106542 11 pkg/bin/ce_switch 10r RECEIVE 13
106542 12 pkg/bin/ce_switch 56r NANOSLEEP
106542 13 pkg/bin/ce_switch 10r RECEIVE 23
106542 14 pkg/bin/ce_switch 10r RECEIVE 19
106542 15 pkg/bin/ce_switch 10r RECEIVE 26
106542 16 pkg/bin/ce_switch 10r NANOSLEEP
106542 17 pkg/bin/ce_switch 10r RECEIVE 29
RP/0/RSP1/CPU0:ASR9006#follow process 106542 stackonly iteration 1
RP/0/RSP1/CPU0:ASR9006#top
Wed Sep 20 15:26:41.362 BST
Computing times...
365 processes; 1934 threads;
CPU states: 69.0% idle, 30.7% user, 0.1% kernel
Memory: 6144M total, 3286M avail, page size 4K
Time: Wed Sep 20 15:28:33.814 BST
JID TID LAST_CPU PRI STATE HH:MM:SS CPU COMMAND
54 3 2 50 Run 1487:43:37 17.47% ce_switch
54 9 1 56 Intr 464:55:26 5.74% ce_switch
426 3 2 10 Rcv 16:17:58 0.66% sysdb_mc
61 15 1 10 Rcv 42:37:35 0.57% eth_server
251 11 1 10 CdV 28:25:13 0.45% gsp
251 13 1 10 CdV 32:14:29 0.44% gsp
251 12 3 10 CdV 27:40:11 0.44% gsp
61 1 1 10 Rcv 42:35:16 0.43% eth_server
54 4 2 10 NSlp 46:58:51 0.40% ce_switch
232 1 1 10 Rcv 199:31:01 0.38% fiarsp
RP/0/RSP1/CPU0:ASR9006#follow process 106542 stackonly thread 3 verbose location 0/RSP1/CPU0
Wed Sep 20 15:29:53.565 BST
Attaching to process pid = 106542 (pkg/bin/ce_switch)
Iteration 1 of 5
------------------------------
Current process = "pkg/bin/ce_switch", PID = 106542 TID = 3 (bcmDPC)
EAX 0x00000000 EBX 0x00000000 ECX 0x0415df50 EDX 0x08297252
ESI 0xffffffff EDI 0x10000368 EIP 0x08297252
ESP 0x0415df50 EBP 0x0415df6c EFL 0x00001046
PC 0x08297252 FP 0x0415df6c RA 0x087805aa
Priority: 50 real_priority: 50
Last system call: 85
pid: 106542
State: Running
Elapsed Time(h:m:s:ms): 1487:44:33:0051
trace_back: #0 0x087c1222 [soc_schan_op]
ENDOFSTACKTRACE
One can observe if CEF is OOR (out-of-resources) with "show cef platform resource summary location 0/0/CPU0" or "show cef platform oor location 0/0/CPU0"
abr1#show cef platform oor location 0/0/CPU0
The "PD USAGE" column should not be relied upon
for accurate tracking of the PD resources.
This is due to Asynchronous nature of CEF programming
between PD and PRM.
OBJECT PD USAGE(MAX) PRM USAGE(MAX) PRM CREDITS
RPF_STRICT(0) 0(262144) 0(262144) 262144
IPv4_LEAF_P(1) 573380(4194304) 573380(4194304) 3620924
IPv6_LEAF_P(2) 37962(2097152) 286690(2097152) 1810462
LEAF(3) 573539(4194304) 573539(4194304) 3620765
TX_ADJ(4) 799(524288) 799(524288) 523489
NR_LDI(5) 1336(2097152) 1336(2097152) 2095816
TE_NH_ADJ(6) 0(65536) 0(65536) 65536
RX_ADJ(7) 71(131072) 71(131072) 131001
R_LDI(8) 901(131072) 901(131072) 130171
L2VPN_LDI(9) 2(32768) 2(32768) 32766
EXT_LSPA(10) 0(524288) 630(524288) 523658
IPv6_LL_LEAF_P(11) 0(262144) 0(262144) 262144
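Judging by the sample numbers above, the "PRM CREDITS" column is simply the maximum minus the PRM usage, i.e. the remaining headroom per object. A small sketch computing headroom and percentage used from those figures (the column semantics are an assumption inferred from the sample, not documented behaviour):

```python
# Sketch: "PRM CREDITS" in "show cef platform oor" appears to be
# Max minus PRM usage (consistent with every row of the sample above).
def credits(prm_used: int, prm_max: int) -> int:
    return prm_max - prm_used

def pct_used(prm_used: int, prm_max: int) -> float:
    return 100.0 * prm_used / prm_max

# IPv4_LEAF_P row from the sample output:
print(credits(573380, 4194304))                              # 3620924
print(f"{pct_used(573380, 4194304):.1f}% of IPv4 leaf capacity used")
```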
RP/0/RSP0/CPU0:abr1#show cef platform resource location 0/0/CPU0
Wed Dec 23 12:11:12.785 UTC
Node: 0/0/CPU0
----------------------------------------------------------------
IP_LEAF_P usage is same on all NPs
NP: 0 struct 23: IP_LEAF_P (maps to ucode stru = 9)
Used Entries: 11066 Max Entries: 4194304
-------------------------------------------------------------
IP_LEAF_P usage is same on all NPs
NP: 0 struct 24: IP_LEAF_P (maps to ucode stru = 42)
Used Entries: 483 Max Entries: 2097152
-------------------------------------------------------------
NP: 0 struct 4: IP_LEAF (maps to ucode stru = 11)
Used Entries: 573471 Max Entries: 4194304
-------------------------------------------------------------
NP: 1 struct 4: IP_LEAF (maps to ucode stru = 11)
Used Entries: 573470 Max Entries: 4194304
-------------------------------------------------------------
RP/0/RSP0/CPU0:abr1#show cef resource location 0/0/CPU0
Wed Dec 23 12:01:33.203 UTC
CEF resource availability summary state: GREEN
CEF will work normally
ipv4 shared memory resource: GREEN
ipv6 shared memory resource: GREEN
mpls shared memory resource: GREEN
common shared memory resource: GREEN
DATA_TYPE_TABLE_SET hardware resource: GREEN
DATA_TYPE_TABLE hardware resource: GREEN
DATA_TYPE_IDB hardware resource: GREEN
DATA_TYPE_IDB_EXT hardware resource: GREEN
DATA_TYPE_LEAF hardware resource: GREEN
DATA_TYPE_LOADINFO hardware resource: GREEN
DATA_TYPE_PATH_LIST hardware resource: GREEN
DATA_TYPE_NHINFO hardware resource: GREEN
DATA_TYPE_LABEL_INFO hardware resource: GREEN
DATA_TYPE_FRR_NHINFO hardware resource: GREEN
DATA_TYPE_ECD hardware resource: GREEN
DATA_TYPE_RECURSIVE_NH hardware resource: GREEN
DATA_TYPE_TUNNEL_ENDPOINT hardware resource: GREEN
DATA_TYPE_LOCAL_TUNNEL_INTF hardware resource: GREEN
DATA_TYPE_ECD_TRACKER hardware resource: GREEN
DATA_TYPE_ECD_V2 hardware resource: GREEN
DATA_TYPE_ATTRIBUTE hardware resource: GREEN
DATA_TYPE_LSPA hardware resource: GREEN
DATA_TYPE_LDI_LW hardware resource: GREEN
DATA_TYPE_LDSH_ARRAY hardware resource: GREEN
DATA_TYPE_TE_TUN_INFO hardware resource: GREEN
DATA_TYPE_DUMMY hardware resource: GREEN
DATA_TYPE_IDB_VRF_LCL_CEF hardware resource: GREEN
DATA_TYPE_TABLE_UNRESOLVED hardware resource: GREEN
DATA_TYPE_MOL hardware resource: GREEN
DATA_TYPE_MPI hardware resource: GREEN
DATA_TYPE_SUBS_INFO hardware resource: GREEN
DATA_TYPE_GRE_TUNNEL_INFO hardware resource: GREEN
DATA_TYPE_LISP_RLOC hardware resource: GREEN
DATA_TYPE_LSM_ID hardware resource: GREEN
DATA_TYPE_INTF_LIST hardware resource: GREEN
DATA_TYPE_TUNNEL_ENCAP_STR hardware resource: GREEN
DATA_TYPE_LABEL_RPF hardware resource: GREEN
DATA_TYPE_L2_SUBS_INFO hardware resource: GREEN
DATA_TYPE_LISP_IID_MAPPING hardware resource: GREEN
DATA_TYPE_LISP_RLOC_TBL hardware resource: GREEN
Trident-based line cards support a maximum of 512,000 Layer 3 (L3) prefixes by default. Typhoon line cards support a maximum of four million IPv4 and two million IPv6 prefixes by default. Both can be tuned by changing the scaling profile.
IPv4/VPNv4/IPv6/VPNv6 current and max prefix limits:
#show cef misc | inc Num cef entries
Thu Oct 12 08:57:28.193 UTC
Num cef entries : 653811 gbl, 22108 vrf ! IPv4 GRT prefixes, VPNv4 prefixes
Num cef entries : 40883 gbl, 707 vrf ! IPv6 GRT prefixes, VPNv6 prefixes
#show controllers np struct IPV4-LEAF-FAST-P np0 | in Entries
Reserved Entries: 0, Used Entries: 653842, Max Entries: 4194304 ! IPv4 GRT prefixes, IPv4 GRT and VPNv4 routes share the same 4M limit
#show controllers np struct IPV4-LEAF-P np0 | in Entries
Reserved Entries: 0, Used Entries: 22291, Max Entries: 4194304 ! VPNv4 prefixes, IPv4 GRT and VPNv4 routes share the same 4M limit
#show controllers np struct IPV6-LEAF-P np0 | in Entries
Reserved Entries: 0, Used Entries: 41591, Max Entries: 2097152 ! One shared counter for IPv6 GRT and VPNv6 prefixes; they share the same TCAM space as IPv4 but use two TCAM entries per IPv6 prefix, so the max is 2M on Typhoon
#show controllers np struct Label-UFIB np0 | in Entries
Reserved Entries: 0, Used Entries: 216475, Max Entries: 2097152 ! MPLS TCAM space used and free; divide both numbers in half: ~108k labels in use and 1M label entries available. Typhoon TCAM holds in+out label entries, so it displays the 1M label space as 2M entries.
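The per-prefix entry costs noted above mean the raw Label-UFIB counters need scaling before comparing them against route or label counts. A minimal sketch under the assumption stated in the comments (Typhoon counts ingress+egress label entries, so both used and max are halved; values from the sample output):

```python
# Sketch: convert raw Label-UFIB "Used/Max Entries" counters into actual
# label figures, per the note above: Typhoon counts in+out entries, so
# halve both numbers (sample values from the output above).
def label_usage(used_entries: int, max_entries: int) -> tuple:
    return used_entries // 2, max_entries // 2

in_use, capacity = label_usage(216475, 2097152)
print(in_use)    # ~108k labels actually in use
print(capacity)  # 1048576 (1M) label entries of space
```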
When working on issues such as routes not being programmed into hardware, label recycling or stale CEF entries, the following actions can be performed and should be non-service-affecting:
! Restart the IPv4 RIB manager process; it does not stop forwarding
process restart ipv4_rib
! Wait 10-20 seconds, then restart it again
process restart ipv4_rib
! Then redownload CEF into the MPA/LC
clear cef linecard location 0/0/CPU0
! Restart the MPLS label switch database manager process; as above, this doesn't remove the entries or stop forwarding, it is only the manager process
process restart mpls_lsd
One can view which features are enabled at the hardware level to check for a discrepancy between what is configured in software and what has been pushed down into the hardware:
RP/0/RSP0/CPU0:abr1#show uidb data location 0/0/CPU0 TenGigE 0/0/2/0 ingress | i Enable.*0x1
QOS Enable 0x1
MPLS Enable 0x1
RP/0/RSP0/CPU0:abr1#show uidb data shadow location 0/0/CPU0 TenGigE 0/0/2/0 ingress | i Enable.*0x1
QOS Enable 0x1
MPLS Enable 0x1
RP/0/RSP0/CPU0:abr1#show uidb data compare location 0/0/CPU0 Te0/0/2/0 ingress
--------------------------------------------------------------------------
Location = 0/0/CPU0
Ifname = TenGigE0_0_2_0
Index = 19
INGRESS table
------------
Layer 3
------------
No differences were detected between hardware and shadow memory.
One can see interface indexes with "show uidb index"
To clear stale CEF information and trigger a reprogram use "clear cef linecard location ..."
To show ASIC errors on a line card "show asic-errors all location 0/0/CPU0"
To reboot a linecard "hw-module location 0/0/cpu0 reload".
Check MPA online status with "admin show shelfmgr status location 0/1/CPU0".
Check the fabric connectivity of the line card:
show asic-errors arbiter 0 all location 0/RSP0/CPU0
show asic-errors crossbar 0 all location 0/RSP0/CPU0
show asic-errors fia 0 all location 0/1/CPU0
Show the prefix-carrying capacity of the linecard: "show tbm ipv4 unicast dual detail location <loc>"
Show the number of free and used pages per memory channel (memory usage for prefixes): "show plu server summary ingress location 0/0/cpu0"
Check overall NP health:
show controllers np summary all
show controllers np counters all
Check which NP a port is attached to using "show controllers np ports all".
View NP statistics (the meanings of some of these counters are covered in the references above; if the version of IOS-XR supports it, they should all be described in the output from "show controllers np descriptions location 0/0/CPU0"):
RP/0/RSP0/CPU0:abr1#show controllers np counters np0 location 0/0/CPU0
Mon Dec 21 12:11:01.109 UTC
Node: 0/0/CPU0:
----------------------------------------------------------------
Show global stats counters for NP0, revision v2
Read 99 non-zero NP counters:
Offset Counter FrameValue Rate (pps)
-------------------------------------------------------------------------------
0 NULL_STAT_0 3170536 6
16 MDF_TX_LC_CPU 54875532 109
17 MDF_TX_WIRE 97615034464 246143
21 MDF_TX_FABRIC 88731095971 240104
29 PARSE_FAB_RECEIVE_CNT 97604054631 246124
33 PARSE_INTR_RECEIVE_CNT 12816383 22
37 PARSE_INJ_RECEIVE_CNT 30960923 30
41 PARSE_ENET_RECEIVE_CNT 88918940116 240595
45 PARSE_TM_LOOP_RECEIVE_CNT 41453197 135
49 PARSE_TOP_LOOP_RECEIVE_CNT 878705137 3259
57 PARSE_ING_DISCARD 5305245 9
63 PRS_HEALTH_MON 2788521 5
68 INTR_FRAME_TYPE_3 185276 0
72 INTR_FRAME_TYPE_7 6875512 12
...
The ASR9000 series NPUs have thousands of registers that store information: some are counters, some are settings, some are flags, and so on. This text file shows all the NPU registers on an ASR9001 (Typhoon NPU); it also shows, at the start, the "blocks" that the registers are grouped into. One can either query the value of a specific register using "show controllers np register <reg-id> np<np-id>" or a group/block of registers using "show controllers np register block <block-id> np<np-id>".
View bridge statistics with "show controllers fabric fia bridge stats location 0/0/CPU0" (bridge is non-blocking so drops are very rare here, any drops are likely QoS back-pressure downstream).
View Fabric Interconnect ASIC statistics with "show controllers fabric fia stats location 0/0/CPU0".
Check for drops across the NPs, Bridges and FIAs (reset the stats with "clear controller np counters all"):
RP/0/RSP0/CPU0:abr1#show drops location 0/0/CPU0
Mon Dec 21 14:54:19.769 UTC
Node: 0/0/CPU0:
----------------------------------------------------------------
NP 0 Drops:
----------------------------------------------------------------
PARSE_EGR_INJ_PKT_TYP_UNKNOWN 503011
PARSE_DROP_IN_UIDB_TCAM_MISS 174899472
PARSE_DROP_IN_UIDB_DOWN 37
PARSE_DROP_IPV6_DISABLED 343125
PARSE_L3_TAGGED_PUNT_DROP 2043450
UNKNOWN_L2_ON_L3_DISCARD 3009072
RSV_DROP_ING_BFD 5
RSV_DROP_ING_IFIB_OPT 1
RSV_DROP_MPLS_RXADJ_DROP 7
RSV_DROP_IPV4_NRLDI_NOT_LOCAL 1
RSV_DROP_EGR_LAG_NO_MATCH 1
RSV_DROP_IPV4_URPF_CHK 159339
RSV_DROP_MPLS_LEAF_NO_MATCH 5618
RSV_DROP_IN_L3_NOT_MYMAC 2
RSV_ING_VPWS_ERR_DROP 56
PUNT_NO_MATCH_EXCD 8406
PUNT_IPV4_ADJ_NULL_RTE_EXCD 299166
MDF_PUNT_POLICE_DROP 307572
MODIFY_PUNT_REASON_MISS_DROP 1
----------------------------------------------------------------
NP 1 Drops:
----------------------------------------------------------------
PARSE_EGR_INJ_PKT_TYP_UNKNOWN 18531
PARSE_DROP_IN_UIDB_TCAM_MISS 147
PARSE_DROP_IN_UIDB_DOWN 18
PARSE_DROP_IPV6_DISABLED 49315
PARSE_L3_TAGGED_PUNT_DROP 1459380
UNKNOWN_L2_ON_L3_DISCARD 19469
RSV_DROP_ING_BFD 1
RSV_DROP_IPV4_NRLDI_NOT_LOCAL 10
RSV_DROP_IPV4_URPF_CHK 170600
RSV_DROP_MPLS_LEAF_NO_MATCH 3
RSV_ING_VPWS_ERR_DROP 1
PUNT_IPV4_ADJ_NULL_RTE_EXCD 22351
MDF_PUNT_POLICE_DROP 22351
MODIFY_PUNT_REASON_MISS_DROP 1
----------------------------------------------------------------
No Bridge 0 Drops
----------------------------------------------------------------
No Bridge 1 Drops
----------------------------------------------------------------
FIA 0 Drops:
----------------------------------------------------------------
Total drop: 6
Ingress drop: 6
Ingress sp0 align fail 3
Ingress sp1 align fail 3
----------------------------------------------------------------
FIA 1 Drops:
----------------------------------------------------------------
Total drop: 7
Ingress drop: 7
Ingress sp0 align fail 3
Ingress sp1 crc err 1
Ingress sp1 align fail 3
----------------------------------------------------------------
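The per-FIA "Total drop" and "Ingress drop" lines in the output above are just the sum of the individual reason counters beneath them, which makes it easy to cross-check a capture programmatically. A minimal sketch using the sample figures (the dicts below are hand-transcribed from the output, not parsed):

```python
# Sketch: cross-check the FIA "Total drop" lines against the individual
# reason counters (numbers from the "show drops" output above).
fia0 = {"ingress sp0 align fail": 3, "ingress sp1 align fail": 3}
fia1 = {"ingress sp0 align fail": 3, "ingress sp1 crc err": 1,
        "ingress sp1 align fail": 3}

for name, counters in (("FIA 0", fia0), ("FIA 1", fia1)):
    # Each FIA total should equal the sum of its reason counters.
    print(name, "total:", sum(counters.values()))
```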
One can look into the hardware structs that are hard limits on NPs/LCs:
! Shows prefix usage for compressed ACL tables
show controller np struct SACL-PREFIX summary loc 0/0/CPU0
! Show the number of VRRP MAC entries
show controller np struct 33 det all-entries np0 loc 0/0/CPU0
! Below, show TCAM usage
RP/0/RSP0/CPU0:abr1#show controllers np struct TCAM-Results summary
Mon Dec 21 12:19:28.977 UTC
Node: 0/0/CPU0:
----------------------------------------------------------------
NP: 0 Struct 0: TCAM_RESULTS
Struct is a PHYSICAL entity
Reserved Entries: 0, Used Entries: 351, Max Entries: 524288
NP: 1 Struct 0: TCAM_RESULTS
Struct is a PHYSICAL entity
Reserved Entries: 0, Used Entries: 303, Max Entries: 524288
! Show IPv6 usage in search memory
show controllers np struct IPV6-LEAF-P np0
! Show IPv4 usage in search memory
show controllers np struct IPV4-LEAF-FAST-P np0
show controllers np struct IPV4-LEAF-P np0
! Show MPLS label usage in search memory
show controllers np struct Label-UFIB summary
! Show LFIB summary
show mpls forwarding summary
! Show all search memory usage
show cef platform resource summary location 0/0/CPU0
One can look for errors in the NP driver log because not all NP errors trigger syslog messages or alarms, meaning they are "silent" unless the NP driver log is explicitly checked:
show controllers np drvlog
One can also check for NP interrupts; the count "Cnt" should be zero for all of them under normal working conditions:
show controllers np interrupts all
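Since any non-zero Cnt is of interest, scanning the parsed counters for offenders is trivial. A minimal sketch, where the dict of counter names and values is a hypothetical parsed sample rather than real output:

```python
# Sketch: report any NP interrupt counter with a non-zero Cnt, which
# should be zero in normal operation per the note above. The dict is a
# hypothetical parsed sample, not real "show controllers np interrupts"
# output.
counters = {"GLB_INT0": 0, "PAR_INT3": 0, "RSV_INT1": 2}

nonzero = {name: cnt for name, cnt in counters.items() if cnt > 0}
print(nonzero if nonzero else "all interrupt counters zero")
```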
PHY / MAC / Interface / NP Level:
Show interface statistics with "show controllers te0/0/1/0 stats". Show live interface statistics with "monitor interface te0/0/1/0".
Show interface controller/device/driver configuration and state with "show controllers te0/0/1/0 all|control|internal|mac|phy|regs|xgxs", "show ethernet infra internal all trunks"
View the interface uIDB (Micro Interface Descriptor Block), the NP uIDB for a specific interface, with "show uidb data location 0/0/CPU0 Gi0/0/0/6.100 ingress | ex 0x0". This will show which features are enabled in hardware for this interface. This is also linked to LPTS: if a feature such as BGP is missing, for example, then BGP packets destined to the control plane on this interface will be filtered.
Similarly, the command "show im database interface Hu0/1/0/3.3003 [detail]" shows the interface flags, encapsulation type, protocols enabled on the interface and the MTU of those protocols.