
How to use the line card console to troubleshoot card boot problems

Rory

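Cisco SP-series high-end routers are built from route processors (RP), line cards (LC, also called service cards), fabric cards (FC), and so on. Only the RP has a physical console port; the other cards have no physical console line. So when a line card, or any other card without a physical console, fails to boot, usually all we can do is log in to the device and collect that card's boot log. These logs are often too brief to tell us what went wrong with the card. There is, however, a built-in console path that lets us jump to the console of another card and view its detailed boot log. The methods are as follows: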

CRS

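CRS runs a 32-bit system, which we also call cXR. It only ever had a 32-bit release; the last software version is 6.7.4, and there will be no further software updates. In general, the cause of a CRS hardware fault is easy to identify from the logs, so the method below for jumping to a card's console is rarely needed; it is enough to know it exists. The escape character is CTRL+/ (displayed as '^_' in the output below).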

RP/0/RP0/CPU0:CRS-D# run console_tunneling 0/0/cpu0
Thu Jun 23 15:46:27.360 EDT
Trying to connect to node 0/0/CPU0...
Connected to node 0/0/CPU0.
Escape character is '^_'.


Connection to node 0/0/CPU0 closed.

ASR9000

cXR(32 bit)

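The command is run attachCon. To exit the console, type the word detach at the prompt. The second example below appends a baud rate (115200) to the command.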

RP/0/RSP0/CPU0:ASR-9006-A#run attachCon 0/0/CPU0
Thu Jun 23 15:42:15.600 UTC

attachCon: Starting console session to node 0/0/CPU0
attachCon: To quit console session type 'detach'
attachCon: WARNING - Type only 'detach' in the shell. Do not combine other keys.

 TESTING enabling interrupt for uart: mask 0x40, enable 1 reg_val 0x40
Current Baud 115200
Setting Baud to 9600

#
# detach
Terminating console to node 0/0/CPU0..

 TESTING enabling interrupt for uart: mask 0x40, enable 0 reg_val 0x0
RP/0/RSP0/CPU0:ASR-9006-A#
RP/0/RSP0/CPU0:ASR-9006-A#run attachCon 0/0/CPU0 115200
Thu Jun 23 15:42:51.083 UTC

attachCon: Starting console session to node 0/0/CPU0
attachCon: To quit console session type 'detach'
attachCon: WARNING - Type only 'detach' in the shell. Do not combine other keys.

 TESTING enabling interrupt for uart: mask 0x40, enable 1 reg_val 0x40
Current Baud 115200
Setting Baud to 115200

Terminating console to node 0/0/CPU0..

 TESTING enabling interrupt for uart: mask 0x40, enable 0 reg_val 0x0
RP/0/RSP0/CPU0:ASR-9006-A#

eXR(64 bit)

The command is attachCon, run from the sysadmin shell as shown below; press CTRL+W to exit.

RP/0/RP0/CPU0:ASR-9912-A#admin
sysadmin-vm:0_RP0# run
[sysadmin-vm:0_RP0:~]$
[sysadmin-vm:0_RP0:~]$chvrf 0 bash
[sysadmin-vm:0_RP0:~]$attachCon 0/1
===============================================
====       Connecting to Line Card        =====
===============================================
Line Card: No 1
Press <Ctrl-W> to disconnect
Enabling 16550 on UART 1 baud rate 115200

NCS55

sysadmin-vm:0_RP0# run
Thu Jun  23 16:27:28.804 UTC+00:00

[sysadmin-vm:0_RP0:~]$
[sysadmin-vm:0_RP0:~]$
[sysadmin-vm:0_RP0:~]$.attach_console 1
MASTER-RP:Proceding to remote console of Slot: 1
old data: 0 new data: 80010000
Escape character is '^x' [control+x]


host:0_LC0 login:

host:0_LC0 login:

host:0_LC0 login: Exited
[sysadmin-vm:0_RP0:~]$


Or

sysadmin-vm:0_RP0#
sysadmin-vm:0_RP0#
sysadmin-vm:0_RP0# run /opt/cisco/calvados/sbin/rconsole -l 0/SC0

NCS6000

NCS6008:

sysadmin-vm:0_RP0# run
Thu Jun  23 17:35:39.459 UTC

[sysadmin-vm:0_RP0:~]$
[sysadmin-vm:0_RP0:~]$
[sysadmin-vm:0_RP0:~]$
[sysadmin-vm:0_RP0:~]$chvrf 0 /opt/cisco/calvados/sbin/rconsole -l 0/0
Connecting to location 0/0 (backplane-slotid 16, console 0)

Escape sequence is "end"

Cisco 8000

On Cisco 8000 the command is rconsole. Slot 0 corresponds to 0/0/CPU0. This command cannot connect to FC cards.

RP/0/RP0/CPU0:8808-A#run rconsole -s 0
Tue Aug  8 16:43:43.894 UTC
----------------------------------------------------------
8 slot chassis
Target slot # : Logical=0 Physical=2
Target slot board_id status register (0x10BC): 0x2420017a
Source slot # : Logical=0 Physical=30
Source slot board_id status register (0x10BC): 0x22000177
Source slot type : 0x20
Setting up required configs on source slot (RP0) ..
Setting up required configs on target slot (0) ..
----------------------------------------------------------
Launching rconsole to slot 0x2, Use Ctrl-x to quit
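
The per-platform commands in the transcripts above can be collected into a small lookup table for quick reference. This is only a sketch: the `ConsoleAccess` type, its field names, and the `describe` helper are hypothetical names introduced here, while the command strings and exit hints are copied from the examples in this post and may differ on other software versions.

```python
from typing import NamedTuple


class ConsoleAccess(NamedTuple):
    """One row of the summary; field names are illustrative, not Cisco terms."""
    shell: str      # where the command is typed in the transcript
    command: str    # command exactly as it appears in the transcript
    exit_hint: str  # how the transcript says to leave the session


# Command strings and exit hints are taken from the transcripts in this post.
CONSOLE_ACCESS = {
    "CRS": ConsoleAccess(
        "XR exec", "run console_tunneling 0/0/cpu0", "CTRL+/ (shown as ^_)"),
    "ASR9000 cXR": ConsoleAccess(
        "XR exec", "run attachCon 0/0/CPU0", "typing 'detach'"),
    "ASR9000 eXR": ConsoleAccess(
        "sysadmin shell after 'chvrf 0 bash'", "attachCon 0/1", "Ctrl-W"),
    "NCS55": ConsoleAccess(
        "sysadmin shell", ".attach_console 1", "control+x"),
    "NCS6000": ConsoleAccess(
        "sysadmin shell",
        "chvrf 0 /opt/cisco/calvados/sbin/rconsole -l 0/0",
        "the escape sequence \"end\""),
    "Cisco 8000": ConsoleAccess(
        "XR exec", "run rconsole -s 0", "Ctrl-x"),
}


def describe(platform: str) -> str:
    """Return a one-line reminder for the given platform key."""
    entry = CONSOLE_ACCESS[platform]
    return (f"{platform}: in {entry.shell}, run `{entry.command}`; "
            f"exit with {entry.exit_hint}")


if __name__ == "__main__":
    for name in CONSOLE_ACCESS:
        print(describe(name))
```

The slot or location argument in each command (for example 0/0/CPU0 or 0/1) is the one used in the transcript and has to be replaced with the card being investigated.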

ASR9000 eXR LC console log

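Pressing ESC interrupts the card boot. The boot menu order is shown below; the first entry is XR EOBC BOOT, which can be understood as a PXE boot that copies the image from the active RP.

The Build-in Shell option can be used to boot certain special images.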

https://rory-1251435693.cos.ap-beijing.myqcloud.com/img/image-20220624021002985.png

If the card boot is not interrupted, the card automatically selects XR EOBC BOOT and starts booting the image from the active RP, as follows:

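The card performs a PXE boot and sets its own IP address to 127.1.2.2; this address belongs to a network interface in the card's own host VM.

127.1.1.27 is a network interface in the active RP's host VM.

The card copies the image from the active RP over this network.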
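
When working from a saved console capture, the addresses described above can be pulled out of the log and checked mechanically. The sketch below is an assumption-laden illustration rather than a Cisco tool: `parse_pxe` and `same_network` are hypothetical helper names, and the sample lines are copied from the console log in this post.

```python
import ipaddress
import re

# A few lines copied from the console log in this post.
SAMPLE_LOG = """\
  Station IP address is 127.1.2.2
  Server IP address is 127.1.1.27
  NBP filename is ipxe-lc.efi
net0: 127.1.2.2/255.255.0.0
"""

# Regular expressions for the fields of interest (hypothetical key names).
PATTERNS = {
    "station_ip": r"Station IP address is (\S+)",
    "server_ip": r"Server IP address is (\S+)",
    "nbp_file": r"NBP filename is (\S+)",
    # Matches only the IPv4 address/netmask form of the net0 line.
    "net0": r"net0: (\d+\.\d+\.\d+\.\d+/\d+\.\d+\.\d+\.\d+)",
}


def parse_pxe(log: str) -> dict:
    """Collect the first match of each pattern found in the log text."""
    found = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, log)
        if match:
            found[key] = match.group(1)
    return found


def same_network(info: dict) -> bool:
    """Check that the server address lies in the network configured on net0."""
    iface = ipaddress.ip_interface(info["net0"])
    return ipaddress.ip_address(info["server_ip"]) in iface.network


if __name__ == "__main__":
    info = parse_pxe(SAMPLE_LOG)
    print(info)
    print("server reachable on net0 network:", same_network(info))
```

If a field is missing from a capture (for example, the card never printed a server address), `parse_pxe` simply omits that key, which is itself a hint about where the boot stalled.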

 ------------XR EOBC Boot------------
Brdcm_bringup_active_loop_free_links---efi
Trying LOCAL SC Path ..Enabling port 34 Disabling port 35
Port [23] (0-base, flag:0x00000200):  no_lpbk    link_up    2.5G full duplex
Port [34] (0-base, flag:0x00000200):  no_lpbk    link_up    1G full duplex
Port [35] (0-base, flag:0x00000000):  no_lpbk    link_down  2.5G full duplex


>>Checking Media Presence......
>>Media Present......
>>Start PXE over IPv4.
  Station IP address is 127.1.2.2

XR-EOBC get boot file size request

.
  Server IP address is 127.1.1.27
  NBP filename is ipxe-lc.efi
  NBP filesize is 280446 Bytes

>>Checking Media Presence......
>>Media Present......
 Downloading NBP file...

 Downloading NBP file-Mtftp:

.
  Succeed to download NBP file.

PXE Image Name = ipxe-lc.efi

PXE Image Size = 280446 Bytes

 ------------Cisco Secure Boot: Begin ------------

 -----------Cisco Secure Boot: Verifing-----------
Certificate parsing success

Root Certificate found.
Successfully got public key from issuer cert

Image verified successfully. Booting..

 ------------Cisco Secure Boot: End ------------
iPXE initialising devices...
GBE0 NetBoot Installed

GBE0 NetBoot Installed
Sysconf checksum failed. Using default values
Fail to retrieve boot mode
ok



iPXE 1.0.0+ (a59e9a) -- Open Source Network Boot Firmware -- http://ipxe.org
Features: DNS HTTP HTTPS TFTP VLAN EFI ISO9660 NBI Menu
Trying net0...
net0: 00:00:01:02:00:00 using NII on NII-PCI00:14.0 (open)
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Unknown (http://ipxe.org/1a086194)]
Configuring (net0 00:00:01:02:00:00).................. ok
net0: 127.1.2.2/255.255.0.0
net0: fe80::200:1ff:fe02:0/64
Next server: 127.1.1.27
Filename: http://127.1.1.27/ipxe-lc.ipxe.slot26
http://127.1.1.27/ipxe-lc.ipxe.slot26... ok
http://127.1.1.27/system_image.iso... ok
Memory required for image[system_image.iso]: 1739829248, available: 2145112064
Certificate parsing success

Root Certificate found.
Successfully got public key from issuer cert

Image verified sucessfully. Booting...
Booting iso-image@0x3d84c5000(1739829248), bzImage@0x3d8502800(6276814)
Certificate parsing success

Root Certificate found.
Successfully got public key from issuer cert

Image verified successfully. Booting..
**** PASS: secure boot verification of image: bzImage****
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.14.23-WR7.0.0.2_standard (hetsoi@calcium-99.cisco.com) (gcc version 4.9.1 (Wind River Linux 4.9.1-7) ) #1 SMP Tue Jun 30 22:17:10 PDT 2020
[    0.000000] Command line: bzImage root=/dev/ram platform=fretta install=/dev/sda boardtype=LC vmtype=hostos prod=1 crashkernel=192M@0 bigphysarea=10M pci=assign-busses noissu pci=hpmemsize=0M,hpiosize=0M  console=ttyS1,115200 initrd=initrd.img
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000be0a5fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000be0a6000-0x00000000bed49fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000bed4a000-0x00000000bed59fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000bed5a000-0x00000000bf42afff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bf42b000-0x00000000bf639fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000bf63a000-0x00000000bf7fffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000e3ffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed01000-0x00000000fed03fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed08000-0x00000000fed08fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed0c000-0x00000000fed0ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1cfff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fef00000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff800000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000043fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] extended physical RAM map:
[    0.000000] reserve setup_data: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] reserve setup_data: [mem 0x0000000000100000-0x00000000bb3fb017] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb3fb018-0x00000000bb40b657] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb40b658-0x00000000bb40c017] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb40c018-0x00000000bb41c657] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb41c658-0x00000000bb41d017] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb41d018-0x00000000bb42d657] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb42d658-0x00000000bb42e017] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb42e018-0x00000000bb43e657] usable
[    0.000000] reserve setup_data: [mem 0x00000000bb43e658-0x00000000be0a5fff] usable
[    0.000000] reserve setup_data: [mem 0x00000000be0a6000-0x00000000bed49fff] reserved
[    0.000000] reserve setup_data: [mem 0x00000000bed4a000-0x00000000bed59fff] ACPI data
[    0.000000] reserve setup_data: [mem 0x00000000bed5a000-0x00000000bf42afff] ACPI NVS
[    0.000000] reserve setup_data: [mem 0x00000000bf42b000-0x00000000bf639fff] reserved
[    0.000000] reserve setup_data: [mem 0x00000000bf63a000-0x00000000bf7fffff] usable
[    0.000000] reserve setup_data: [mem 0x00000000e0000000-0x00000000e3ffffff] reserved