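PAM (Platform Automated Monitoring) was introduced in release 6.1.2 (64-bit only, not 32-bit) and is enabled by default. It monitors for process crashes, memory leaks, CPU hogs, tracebacks, disk usage, and similar events. When it detects such an event, it automatically collects relevant information and, by default, saves it under harddisk:/cisco_support for troubleshooting. This function is fully automatic and currently cannot be configured manually. For concrete examples, refer to the related PAM documentation.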
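Release 6.6.1 introduced a new feature, on-demand EDCD (Event Driven CLI Database). Combined with PAM, it enables two functions:
- PAM Schedule: collects information at a fixed interval
- PAM EEM Agent: monitors syslog and triggers a collection when a message matches the configured condition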
EDCD Ondemand – Create
RP/0/RSP0/CPU0:ASR9910-B#edcd ondemand ?
add-update Add or update ondemand EDCD entries
add-update-trigger Add or update ondemand EDCD entries
delete Delete ondemand EDCD entries
delete-all Delete all EDCD entries
trigger Trigger the collection of traces associated with given identifier
// Create a command list, for example:
RP/0/RSP0/CPU0:ASR9910-B#edcd ondemand add-update identifier xuxing_test commands "show run;show plat;show install active su"
Sun Apr 25 09:30:18.903 UTC
Ondemand EDCD has been updated (execute 'show edcd ondemand database' to verify.)
RP/0/RSP0/CPU0:ASR9910-B#
RP/0/RSP0/CPU0:ASR9910-B#show edcd ondemand database
Sun Apr 25 09:30:58.713 UTC
============================================================
Identifier: xuxing_test
============================================================
1: show run
2: show plat
3: show install active su
------------------------------------------------------------
// To add commands to an existing command list, use the same command:
RP/0/RSP0/CPU0:ASR9910-B#edcd ondemand add-update identifier xuxing_test commands "show clock"
Sun Apr 25 09:41:01.362 UTC
Ondemand EDCD has been updated (execute 'show edcd ondemand database' to verify.)
RP/0/RSP0/CPU0:ASR9910-B#
RP/0/RSP0/CPU0:ASR9910-B#show edcd ondemand database
Sun Apr 25 09:41:08.848 UTC
============================================================
Identifier: xuxing_test
============================================================
1: show run
2: show plat
3: show install active su
4: show clock <<<<<
------------------------------------------------------------
RP/0/RSP0/CPU0:ASR9910-B#
// Admin CLI and shell CLI commands are supported as well:
RP/0/RSP0/CPU0:ASR9910-B#edcd ondemand add-update identifier xuxing_test commands "admin show plat;run ng_show_version"
Sun Apr 25 09:48:36.510 UTC
Ondemand EDCD has been updated (execute 'show edcd ondemand database' to verify.)
RP/0/RSP0/CPU0:ASR9910-B#show edcd ondemand database
Sun Apr 25 09:48:39.145 UTC
============================================================
Identifier: xuxing_test
============================================================
1: show run
2: show plat
3: show install active su
4: admin show plat <<<<
5: run ng_show_version <<<<
------------------------------------------------------------
RP/0/RSP0/CPU0:ASR9910-B#
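The add-update examples above pass the command list as one quoted string, with individual commands separated by semicolons. A minimal Python sketch of assembling that line from a list follows. `build_add_update` is a hypothetical helper for illustration, not part of IOS XR, and rejecting embedded `;` or `"` characters is a conservative assumption, since this write-up does not show how the CLI treats them.

```python
def build_add_update(identifier, commands):
    """Return an 'edcd ondemand add-update' line for the given commands.

    Hypothetical helper: it only formats text in the shape shown in the
    transcript and does not talk to a router.
    """
    if not commands:
        raise ValueError("at least one command is required")
    cleaned = [cmd.strip() for cmd in commands]
    for cmd in cleaned:
        # Assumption: ';' separates commands and '"' delimits the whole list,
        # so this sketch refuses commands that contain either character.
        if not cmd or ";" in cmd or '"' in cmd:
            raise ValueError(f"unsupported command text: {cmd!r}")
    return (
        f"edcd ondemand add-update identifier {identifier} "
        f'commands "{";".join(cleaned)}"'
    )


line = build_add_update(
    "xuxing_test", ["show run", "show plat", "show install active su"]
)
print(line)
```

The printed line has the same shape as the first add-update command in the transcript, so it can be pasted into a session or sent by whatever automation tool is already in use.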
EDCD Ondemand – Delete
You can delete a single command or the entire list:
RP/0/RSP0/CPU0:ASR9910-B#edcd ondemand delete identifier xuxing_test ?
commands Specify a list of commands that to be deleted (if missing all entries under this sub-pattern will be deleted)
<cr>
RP/0/RSP0/CPU0:ASR9910-B#edcd ondemand delete identifier xuxing_test commands "show clock"
Sun Apr 25 09:43:31.815 UTC
Ondemand EDCD has been updated (execute 'show edcd ondemand database' to verify.)
RP/0/RSP0/CPU0:ASR9910-B#show edcd ondemand database
Sun Apr 25 09:43:34.277 UTC
============================================================
Identifier: xuxing_test
============================================================
1: show run
2: show plat
3: show install active su
------------------------------------------------------------
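When the database is checked from a script after an add-update or delete, the `show edcd ondemand database` output can be turned into a dictionary. The sketch below assumes the layout is exactly what the transcripts here show (an `Identifier:` header followed by numbered entries); `parse_edcd_database` and `SAMPLE` are illustrative names, not part of any Cisco tool.

```python
import re

# Sample text copied from the transcript above.
SAMPLE = """\
Sun Apr 25 09:43:34.277 UTC
============================================================
Identifier: xuxing_test
============================================================
1: show run
2: show plat
3: show install active su
------------------------------------------------------------
"""


def parse_edcd_database(text):
    """Map each identifier to its ordered list of commands."""
    database = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"Identifier:\s*(\S+)", line)
        if header:
            current = header.group(1)
            database[current] = []
            continue
        # Numbered entries look like "3: show install active su".
        entry = re.match(r"(\d+):\s*(.+)", line)
        if entry and current is not None:
            database[current].append(entry.group(2).strip())
    return database


print(parse_edcd_database(SAMPLE))
```

Comparing the parsed list before and after a delete is a simple way to confirm that only the intended command was removed.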
EDCD Ondemand – Trigger
To test whether the command list works, trigger it manually with the following command:
RP/0/RSP0/CPU0:ASR9910-B#edcd ondemand trigger identifier xuxing_test
Sun Apr 25 09:49:43.479 UTC
RP/0/RSP0/CPU0:ASR9910-B#
RP/0/RSP0/CPU0:Apr 25 09:36:40.033 UTC: run_cmd[69017]: %INFRA-INFRA_MSG-5-RUN_LOGIN : User cisco logged into shell from vty0
RP/0/RSP0/CPU0:Apr 25 09:36:46.775 UTC: run_cmd[69017]: %INFRA-INFRA_MSG-5-RUN_LOGOUT : User cisco logged out of shell from vty0
RP/0/RSP0/CPU0:Apr 25 09:49:54.118 UTC: logger[67945]: %OS-SYSLOG-4-LOG_WARNING : PAM has completed on-demand data collection for xuxing_test. All files are archived and saved at 0/RSP0/CPU0 : harddisk:/cisco_support/PAM-asr9k-ondemand-xr-xuxing_test-2021Apr25-094953.tgz (Please copy tgz file out of the router and send to Cisco support. This tgz file will be removed after 14 days.
As shown above, the system generates a tar archive, "harddisk:/cisco_support/PAM-asr9k-ondemand-xr-xuxing_test-2021Apr25-094953.tgz". Copy it off the device and extract it to view the collected command output.
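Once the archive has been copied off the router, it can be handled with ordinary tooling. The Python sketch below reads the identifier and timestamp out of the file name and unpacks the archive. The file-name pattern is inferred from the single example in the syslog message, and the 14-day figure is assumed to count from the archive's creation time; `parse_archive_name` and `extract_pam_archive` are hypothetical helper names.

```python
import re
import tarfile
from datetime import datetime, timedelta
from pathlib import Path

# Assumed pattern, based only on the one file name shown in the log above.
ARCHIVE_RE = re.compile(
    r"PAM-(?P<platform>[^-]+)-ondemand-xr-(?P<identifier>.+)"
    r"-(?P<stamp>\d{4}[A-Za-z]{3}\d{2}-\d{6})\.tgz$"
)
RETENTION = timedelta(days=14)  # "removed after 14 days" per the syslog text


def parse_archive_name(name):
    """Extract platform, identifier, and timestamps from an archive name."""
    match = ARCHIVE_RE.search(name)
    if match is None:
        raise ValueError(f"unexpected archive name: {name!r}")
    # %b matches the English month abbreviation in the default C locale.
    created = datetime.strptime(match["stamp"], "%Y%b%d-%H%M%S")
    return {
        "platform": match["platform"],
        "identifier": match["identifier"],
        "created": created,
        "remove_after": created + RETENTION,
    }


def extract_pam_archive(archive_path, dest_dir):
    """Unpack a local copy of the .tgz and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions offer a safer extraction filter.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return names


info = parse_archive_name(
    "harddisk:/cisco_support/"
    "PAM-asr9k-ondemand-xr-xuxing_test-2021Apr25-094953.tgz"
)
print(info["identifier"], info["created"], info["remove_after"])
```

After copying the file locally, a call such as `extract_pam_archive("PAM-asr9k-ondemand-xr-xuxing_test-2021Apr25-094953.tgz", "pam_out")` would unpack it into a `pam_out` directory (an arbitrary name chosen for this example).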
PAM Schedule
RP/0/RSP0/CPU0:ASR9910-B#edcd scheduler add-update cadence '*/10 * * * *' ? <<<< Two options: schedule a single command, or schedule a previously configured command list
command Command to be executed at the above cadence
identifier An identifier linked to a list of CLIs (defined in ondemand EDCD)
<cr>
RP/0/RSP0/CPU0:ASR9910-B#edcd scheduler add-update cadence '*/10 * * * *' identifier xuxing_test
Sun Apr 25 10:03:26.302 UTC
Adding */10 * * * * root /pkg/bin/pam_is_active_rp && /pkg/bin/edcd_cli.py ondemand --operation trigger -i xuxing_test
Updating job file on remote RP
The following job has been added successfully:
*/10 * * * * root /pkg/bin/pam_is_active_rp && /pkg/bin/edcd_cli.py ondemand --operation trigger -i xuxing_test
RP/0/RSP0/CPU0:ASR9910-B#
RP/0/RSP0/CPU0:ASR9910-B#show edcd scheduler <<<< View the existing scheduler jobs
Sun Apr 25 10:03:33.842 UTC
<Job ID>: <job content>
1: */10 * * * * root /pkg/bin/pam_is_active_rp && /pkg/bin/edcd_cli.py ondemand --operation trigger -i xuxing_test
RP/0/RSP0/CPU0:ASR9910-B#
'*/10 * * * *' means the job runs every 10 minutes. The cadence follows Linux crontab syntax; see the crontab documentation for how to set each field.
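To make the cadence fields concrete, here is a minimal Python sketch that expands a five-field crontab expression into the values each field matches. It is a simplified illustration of standard crontab semantics, not the scheduler's own parser: it handles only `*`, single numbers, ranges, steps, and comma lists, and it ignores month and weekday names. `expand_field` and `expand_cadence` are names invented for this example.

```python
# Field order and numeric limits of a standard five-field crontab entry.
CRON_FIELDS = (
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 6),  # simplification: 0 is Sunday, 7 is not accepted
)


def expand_field(expr, lo, hi):
    """Return the sorted values matched by one crontab field."""
    values = set()
    for part in expr.split(","):
        base, slash, step_text = part.partition("/")
        step = int(step_text) if slash else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:
            if slash:
                # This sketch only allows a step after '*' or a range.
                raise ValueError(f"step needs '*' or a range: {part!r}")
            start = end = int(base)
        if step < 1 or not (lo <= start <= end <= hi):
            raise ValueError(f"value out of range in {part!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)


def expand_cadence(cadence):
    """Expand all five fields of a cadence string into value lists."""
    parts = cadence.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("a cadence needs exactly five fields")
    return {
        name: expand_field(expr, lo, hi)
        for expr, (name, lo, hi) in zip(parts, CRON_FIELDS)
    }


schedule = expand_cadence("*/10 * * * *")
print(schedule["minute"])  # [0, 10, 20, 30, 40, 50]
```

Because the other four fields are `*`, the job fires at each of those minutes in every hour of every day.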
To delete the schedule:
RP/0/RSP0/CPU0:ASR9910-B#edcd scheduler delete job-id 1 <<<< Delete by job ID; the job-id comes from 'show edcd scheduler'
Sun Apr 25 10:08:42.937 UTC
The following job has been deleted:
*/10 * * * * root /pkg/bin/pam_is_active_rp && /pkg/bin/edcd_cli.py ondemand --operation trigger -i xuxing_test
Updating job file on remote RP
RP/0/RSP0/CPU0:ASR9910-B#
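Since deletion works by job ID, a script that manages schedules needs to map each job back to its ID. The Python sketch below splits `show edcd scheduler` output into job ID, cadence, user, and command. It assumes every job line has the layout seen in the transcripts here (ID, five cadence fields, a user name, then the command); jobs created in other ways may look different. `parse_scheduler_output` is a hypothetical helper name.

```python
import re

# Sample text copied from the 'show edcd scheduler' transcript above.
SAMPLE = """\
Sun Apr 25 10:03:33.842 UTC
<Job ID>: <job content>
1: */10 * * * * root /pkg/bin/pam_is_active_rp && /pkg/bin/edcd_cli.py ondemand --operation trigger -i xuxing_test
"""

# Assumed layout: "<id>: <five cadence fields> <user> <command>".
JOB_RE = re.compile(
    r"^(?P<job_id>\d+):\s+"
    r"(?P<cadence>(?:\S+\s+){4}\S+)\s+"
    r"(?P<user>\S+)\s+"
    r"(?P<command>.+)$"
)


def parse_scheduler_output(text):
    """Return one dictionary per job line found in the output."""
    jobs = []
    for raw in text.splitlines():
        match = JOB_RE.match(raw.strip())
        if match:
            jobs.append(
                {
                    "job_id": int(match["job_id"]),
                    # Normalize any repeated whitespace between fields.
                    "cadence": " ".join(match["cadence"].split()),
                    "user": match["user"],
                    "command": match["command"],
                }
            )
    return jobs


for job in parse_scheduler_output(SAMPLE):
    print(job["job_id"], job["cadence"], job["user"])
```

The `job_id` value is what `edcd scheduler delete job-id` expects, so the parsed list can be filtered by identifier or cadence before deciding which job to remove.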
PAM EEM