Here are my notes on a Telemetry demo that uses pipeline + InfluxDB + Grafana to build a visual monitoring setup.
- pipeline: collects telemetry data from the devices.
- InfluxDB: stores the collected telemetry data in a database.
- Grafana: visualizes the data stored in the database.
Installation and Configuration #
Pipeline: #
This demo uses "bigmuddy-network-telemetry-pipeline". The upstream project has been archived, so I forked it; the fork can be cloned with the following command:
git clone https://github.com/xuxing3/bigmuddy-network-telemetry-pipeline.git
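The repository ships a prebuilt Linux binary under bin/ (the same ./bin/pipeline path used in the startup log later in this post), so no Go toolchain should be needed; a minimal sketch, assuming the clone landed in the current directory:
cd bigmuddy-network-telemetry-pipeline
./bin/pipeline -help        # verify the binary runs and list the supported flags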
The configuration file is as follows:
[root@xuxing239 bigmuddy-network-telemetry-pipeline]# grep -v ^# pipeline-influxdb.conf | grep -v ^$
[default]
id = pipeline
metamonitoring_prometheus_resource = /metrics
metamonitoring_prometheus_server = :8989
[testbed]
stage = xport_input
type = tcp
encap = st
listen = :5432 <<<< listen on port 5432; this can be changed
[inspector]
stage = xport_output
type = tap
file = /opt/dump-xuxing.txt <<<<< where the collected telemetry data is dumped
datachanneldepth = 1000
[metrics_influx]
stage = xport_output
type = metrics
file = metrics-nms.json <<<<<< the collected telemetry data is formatted according to this file and exported to InfluxDB
dump = metrics-dump-xuxing.txt <<<< a copy of the data is dumped to this file before it is formatted and exported
output = influx
influx = http://10.70.80.197:8086 <<<<<<< the InfluxDB endpoint; the default port is 8086
database = mdt_db <<<<<<< which InfluxDB database is used to store the telemetry data
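With this config loaded, two quick checks confirm that telemetry packets are actually arriving; a sketch assuming pipeline runs locally, using the tap file and metamonitoring endpoint defined above:
tail -f /opt/dump-xuxing.txt                       # decoded telemetry written by the [inspector] tap
curl -s http://localhost:8989/metrics | head -20   # pipeline's own counters served by the [default] metamonitoring server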
metrics-nms.json looks like the following. As for how to write this JSON file, the only method I have found so far is to look at the data pipeline collects and adapt the JSON by hand. The point of the metrics file is to translate the collected data and import it into the database.
[root@xuxing bigmuddy-network-telemetry-pipeline]# cat metrics-nms.json
[
{
"basepath" : "Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters",
"spec" : {
"fields" : [
{"name":"node-name", "tag": true}, <<<<<<< "tag": true marks the fields that can later be used to filter the data
{"name":"np-name"},
{
"name":"np-counter",
"fields" : [
{"name":"counter-name", "tag": true}, <<<<<<
{"name":"counter-value"},
{"name":"rate"},
{"name":"counter-type"},
{"name":"counter-index"}
]
}
]
}
}
]
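In InfluxDB, the entries marked with "tag": true (node-name and counter-name) become tags, which is what later allows filtering and grouping in Grafana, while the remaining entries become fields. To check how the JSON actually mapped into the database, the influx CLI can be queried from the server; a sketch assuming the influx 1.x client is installed there:
influx -host 10.70.80.197 -port 8086 -database mdt_db -execute 'SHOW TAG KEYS'
influx -host 10.70.80.197 -port 8086 -database mdt_db -execute 'SHOW FIELD KEYS'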
InfluxDB/Grafana #
docker run -d -v grafana-storage:/var/lib/grafana -p 3000:3000 grafana/grafana:5.4.3
docker run -d --volume=/opt/influxdb:/data -p 8080:8083 -p 8086:8086 tutum/influxdb
Both are run with Docker here; once the containers are up, access http://
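A quick way to confirm that both containers came up before moving on (10.70.80.197 is the same server used in the pipeline config above):
docker ps                                                                 # both containers should show as Up
curl -i http://10.70.80.197:8086/ping                                     # InfluxDB answers 204 No Content when healthy
curl -s -o /dev/null -w '%{http_code}\n' http://10.70.80.197:3000/login   # Grafana login page should answer 200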
- Create the mdt_db database and adjust how long the database keeps its data
CREATE DATABASE "mdt_db"
CREATE RETENTION POLICY "24h" ON "db_name" DURATION 1d REPLICATION 1 DEFAULT #1 day
CREATE RETENTION POLICY "one_month" ON "mdt_db" DURATION 30d REPLICATION 1 DEFAULT # one_month
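These statements can be typed into the influx shell, or run non-interactively from the Docker host; a sketch assuming the tutum/influxdb image ships the influx client (otherwise the same statements can be pasted into the admin UI mapped to port 8080 above):
docker exec $(docker ps -qf ancestor=tutum/influxdb) influx -execute 'CREATE DATABASE "mdt_db"'
docker exec $(docker ps -qf ancestor=tutum/influxdb) influx -execute 'CREATE RETENTION POLICY "one_month" ON "mdt_db" DURATION 30d REPLICATION 1 DEFAULT'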
Some other commonly used commands:
SHOW MEASUREMENTS
select * from ""
SHOW RETENTION POLICIES ON "mdt_db"
For InfluxDB commands, refer to the article below.
- Connect Grafana to the database
Telemetry Data Collection #
The IOS XR device configuration is as follows: #
RP/0/RSP1/CPU0:ASR9906-B#show run telemetry model-driven
Wed May 19 07:52:54.048 UTC
telemetry model-driven
destination-group xuxing
vrf calo-mgmt
address-family ipv4 10.70.80.197 port 5432 <<<<<<
encoding self-describing-gpb
protocol tcp
!
sensor-group xuxing
sensor-path Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters/np-counter
!
subscription xuxing
sensor-group-id xuxing strict-timer
sensor-group-id xuxing sample-interval 5000
destination-id xuxing
source-interface MgmtEth0/RSP1/CPU0/0
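On the router, a few show commands help verify that the subscription is active and the collector session is established (shown here as a suggestion; the exact output varies by IOS XR release):
show telemetry model-driven subscription xuxing
show telemetry model-driven destination
show telemetry model-driven sensor-group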
Server Configuration #
[root@xuxing bigmuddy-network-telemetry-pipeline]# ./bin/pipeline -log= -debug -config=pipeline-influxdb.conf
INFO[2021-05-19 00:58:24.900661] Conductor says hello, loading config config=pipeline-influxdb.conf debug=true fluentd= logfile= maxthreads=12 tag=pipeline version="v1.0.0(bigmuddy)"
DEBU[2021-05-19 00:58:24.901561] Conductor processing section... name=conductor section=inspector tag=pipeline
DEBU[2021-05-19 00:58:24.901576] Conductor processing section, type... name=conductor section=inspector tag=pipeline type=tap
INFO[2021-05-19 00:58:24.901588] Conductor starting up section name=conductor section=inspector stage="xport_output" tag=pipeline
DEBU[2021-05-19 00:58:24.901619] Conductor processing section... name=conductor section="metrics_influx" tag=pipeline
DEBU[2021-05-19 00:58:24.901655] Conductor processing section, type... name=conductor section="metrics_influx" tag=pipeline type=metrics
INFO[2021-05-19 00:58:24.901665] Conductor starting up section name=conductor section="metrics_influx" stage="xport_output" tag=pipeline
INFO[2021-05-19 00:58:24.901676] Metamonitoring: serving pipeline metrics to prometheus name=default resource="/metrics" server=":8989" tag=pipeline
INFO[2021-05-19 00:58:24.901767] Starting up tap countonly=false filename="/opt/dump-xuxing.txt" name=inspector streamSpec=&{2 <nil>} tag=pipeline
CRYPT Client [metrics_influx],[http://10.70.80.197:8086]
Enter username: admin
Enter password:
INFO[2021-05-19 00:58:28.195240] setup authentication authenticator="http://10.70.80.197:8086" name="metrics_influx" pem= tag=pipeline username=admin
INFO[2021-05-19 00:58:28.195357] setup metrics collection basepath="Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters" name="metrics_influx" tag=pipeline
DEBU[2021-05-19 00:58:28.195371] metrics export configured file=metrics-nms.json metricSpec={[{Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters 0xc420344e60}] map[Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters:0xc420344e60] 0xc420328bd0} name="metrics_influx" output=influx tag=pipeline
DEBU[2021-05-19 00:58:28.195429] Conductor processing section... name=conductor section=testbed tag=pipeline
DEBU[2021-05-19 00:58:28.195452] Conductor processing section, type... name=conductor section=testbed tag=pipeline type=tcp
INFO[2021-05-19 00:58:28.195466] Conductor starting up section name=conductor section=testbed stage="xport_input" tag=pipeline
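Once the testbed input section is up and the router starts pushing (every 5000 ms per the subscription), data should appear in mdt_db within a few intervals; a quick sanity check, again assuming the influx client is available on the server:
influx -host 10.70.80.197 -port 8086 -database mdt_db -execute 'SHOW MEASUREMENTS'
influx -host 10.70.80.197 -port 8086 -database mdt_db -execute 'SHOW SERIES LIMIT 5'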
The basic Grafana configuration is as follows: #
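The data source can be added from the Grafana web UI (Configuration > Data Sources > InfluxDB, pointing at http://10.70.80.197:8086 and database mdt_db) or through the HTTP API; a minimal sketch of the API call, assuming the default admin/admin credentials of the grafana/grafana:5.4.3 image:
curl -s -X POST http://admin:admin@10.70.80.197:3000/api/datasources \
  -H 'Content-Type: application/json' \
  -d '{"name":"mdt_db","type":"influxdb","access":"proxy","url":"http://10.70.80.197:8086","database":"mdt_db"}'
A dashboard panel can then query the NP counters and group by the counter-name and node-name tags defined in metrics-nms.json.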
Other Optimizations #
Telemetry produces a huge amount of data, so it is recommended to clear the dump files on the server periodically; otherwise the disk fills up quickly.
[root@xuxing bigmuddy-network-telemetry-pipeline]# crontab -l
*/60 * * * * echo "" > /opt/dump-xuxing.txt
*/60 * * * * echo "" > /root/bigmuddy-network-telemetry-pipeline/metrics-dump-xuxing.txt_wkid0
[root@xuxing bigmuddy-network-telemetry-pipeline]#
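The same cleanup can also be written with truncate; this is just an alternative form, not what the original crontab used (*/60 in the minute field effectively means once per hour at minute 0):
0 * * * * truncate -s 0 /opt/dump-xuxing.txt /root/bigmuddy-network-telemetry-pipeline/metrics-dump-xuxing.txt_wkid0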