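The BGP Monitoring Protocol (BMP) provides real-time monitoring of the BGP running state of devices in a network, including the establishment and teardown of peer relationships and route updates.
A router only needs to establish a single TCP connection to the BMP server; after that it can start sending BMP messages.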
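Because everything travels over that one TCP connection, the receiving side has to split the byte stream back into individual messages. The sketch below shows one way to do that, assuming the 6-byte common header from RFC 7854 (version, total length including the header, message type). The helper names `recv_exact` and `read_bmp_message` are made up for illustration and are not part of any BMP library.

```python
import socket
import struct
from typing import Tuple

# Common header per RFC 7854: version (1 byte), message length (4 bytes,
# header included), message type (1 byte). Network byte order, no padding.
BMP_HEADER = struct.Struct("!BIB")


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise EOFError if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError("connection closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def read_bmp_message(sock: socket.socket) -> Tuple[int, int, bytes]:
    """Return (version, msg_type, body) for the next BMP message on sock."""
    header = recv_exact(sock, BMP_HEADER.size)
    version, length, msg_type = BMP_HEADER.unpack(header)
    if length < BMP_HEADER.size:
        raise ValueError(f"invalid BMP message length {length}")
    body = recv_exact(sock, length - BMP_HEADER.size)
    return version, msg_type, body
```

A toy collector could call `read_bmp_message` in a loop on a socket accepted from `socket.create_server(("0.0.0.0", 5000))`; a real collector such as OpenBMP does considerably more than this.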
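BMP defines 7 message types, 6 of which are listed here:
- Initiation message: sent at session start to announce vendor information, version number, and similar details to the monitoring server.
- PU (Peer Up Notification) message: reports to the monitoring server that a BGP connection with a peer has been established.
- RM (Route Monitoring) message: sends the monitoring server all routes received from a peer, then reports route additions and withdrawals as they occur.
- PD (Peer Down Notification) message: reports to the monitoring server that a BGP connection with a peer has gone down.
- SR (Stats Reports) message: reports statistics about the router's running state to the monitoring server.
- Termination message: tells the monitoring server why the BMP session is being closed.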
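On the wire, these message types are distinguished by a one-byte type code in the BMP common header. The sketch below maps the codes assigned in RFC 7854 to names and decodes a single header; the class `BmpMsgType` and function `describe_header` are illustrative names chosen here, not an existing API.

```python
import struct
from enum import IntEnum


class BmpMsgType(IntEnum):
    """Message type codes from the BMP common header (RFC 7854)."""
    ROUTE_MONITORING = 0
    STATISTICS_REPORT = 1
    PEER_DOWN = 2
    PEER_UP = 3
    INITIATION = 4
    TERMINATION = 5
    ROUTE_MIRRORING = 6


def describe_header(header: bytes) -> str:
    """Decode the 6-byte common header: version, total length, type code."""
    version, length, code = struct.unpack("!BIB", header[:6])
    try:
        name = BmpMsgType(code).name
    except ValueError:
        # Codes outside the RFC 7854 range are reported rather than rejected.
        name = f"UNKNOWN({code})"
    return f"BMPv{version} {name} ({length} bytes)"


# A bare header for a version 3 Initiation message with no body.
print(describe_header(bytes([3, 0, 0, 0, 6, 4])))  # BMPv3 INITIATION (6 bytes)
```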
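The seventh type is "Route Mirroring", which carries verbatim copies of BGP messages received from a peer. Compared with other vendors' BMP implementations, Cisco additionally lists this message, although it is defined in RFC 7854 rather than being Cisco-specific. In testing on an IOS XR device, however, no Route Mirroring messages were observed.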
RP/0/RP0/CPU0:CRS-H#show bgp bmp server 1
Thu Jul 29 06:15:37.127 UTC
BMP server 1
Host 10.70.79.197 Port 5000
Connected for 01:18:08
Last Disconnect event received : 00:00:00
Precedence: internet
BGP neighbors: 1
VRF: calo-mgmt (0x60000002)
Update Source: 172.18.87.66 (Mg0/RP0/CPU0/0)
Update Source Vrf ID: 0x60000002
Queue write pulse sent : Jul 29 06:15:26.601, Jul 29 04:57:03.914 (all)
Queue write pulse received : Jul 29 06:15:26.601
Update Mode : Route Monitoring Post-Policy
Queue Route Mon Msg buffer limit : 143093 KB (Current Server Up Count: 1)
Queue Route Mon Msg buffer usage : 0 B
Update Generation in Progress : No
Reset Walk in Progress : No
IPv4 Unicast
Version : 42954698
Init EOR Version : 15416361
Init EOR Pending count : 0
Update Generation
Last Run : Jul 29 06:15:26.593, Count 1620
Walk Currently Stalled : No, Last Stalled : Jul 29 04:38:04.837, Count 224
IPv6 Unicast
Version : 0
Init EOR Version : 0
Init EOR Pending count : 0
Update Generation
Last Run : not set, Count 0
Walk Currently Stalled : No, Last Stalled : not set, Count 0
TCP:
Last message sent: Jul 29 06:15:30.993, Status: No Pending Data
Last write pulse received: Jul 29 06:15:31.393, Waiting: FALSE
Message Stats:
Total msgs dropped : 5372968
Total msgs pending : 0, Max: 2070488 at Jul 29 04:25:08.541
Total messages sent : 1471520
Total bytes sent : 267883849, Time spent: 8.500 secs
INITIATION: 3
TERMINATION: 0
STATS-REPORT: 0
PER-PEER messages: 1471517
ROUTE-MON messages : 1471512
EOR messages : 2
Update messages : 17680 (Prefixes: 18940950, Err: 0)
Withdraw messages : 72092 (Prefixes: 19232017, Err: 0)
Discarded msgs: 454 (reason : peer-down)
Discarded pfx : 33658 (reason : peer-down)
Update gen time spent: 81.519 secs
Neighbor 100.1.0.2
Messages pending: 0
Messages dropped: 5372968
Messages sent : 1471517 <<<<<<
PEER-UP : 4
PEER-DOWN : 1
ROUTE-MON : 1471512
EOR : 2
Update : 17680 (Prefixes: 18940950, Err: 0)
Withdraw : 72092 (Prefixes: 19232017, Err: 0)
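The counters in this output can be cross-checked against each other. In this sample, "Total messages sent" equals the sum of the INITIATION, TERMINATION, STATS-REPORT and PER-PEER counters, and the neighbor's "Messages sent" equals PEER-UP + PEER-DOWN + ROUTE-MON. The short sketch below does that arithmetic with the values copied from the output; the relationship is inferred from this one sample, not taken from Cisco documentation, and `parts_match_total` is a hypothetical helper.

```python
def parts_match_total(total: int, parts: dict) -> bool:
    """True when the component counters add up to the reported total."""
    return sum(parts.values()) == total


# Server-level counters from "show bgp bmp server 1".
server_ok = parts_match_total(
    1471520,
    {"INITIATION": 3, "TERMINATION": 0, "STATS-REPORT": 0, "PER-PEER": 1471517},
)

# Per-neighbor counters for 100.1.0.2 from the same output.
neighbor_ok = parts_match_total(
    1471517,
    {"PEER-UP": 4, "PEER-DOWN": 1, "ROUTE-MON": 1471512},
)

print(server_ok, neighbor_ok)  # True True
```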
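OpenBMP
For the BMP server we can use the open-source OpenBMP project; see its GitHub repository for details.
For installation and testing, follow the link below, which uses docker-compose to bring up an instance quickly.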
https://www.openbmp.org/getting_started.html
Installation example:
[root@localhost BMP]# pip install docker-compose
[root@localhost BMP]# wget https://raw.githubusercontent.com/OpenBMP/obmp-docker/main/docker-compose.yml
[root@localhost BMP]# git clone https://github.com/OpenBMP/obmp-grafana.git
[root@localhost BMP]# mkdir -p /var/openbmp
[root@localhost BMP]# export OBMP_DATA_ROOT=/var/openbmp
[root@localhost BMP]# sudo mkdir -p $OBMP_DATA_ROOT
[root@localhost BMP]# sudo chmod -R 7777 $OBMP_DATA_ROOT
[root@localhost BMP]# mkdir -p ${OBMP_DATA_ROOT}/config
[root@localhost BMP]# mkdir -p ${OBMP_DATA_ROOT}/kafka-data
[root@localhost BMP]# mkdir -p ${OBMP_DATA_ROOT}/zk-data
[root@localhost BMP]# mkdir -p ${OBMP_DATA_ROOT}/zk-log
[root@localhost BMP]# mkdir -p ${OBMP_DATA_ROOT}/postgres/data
[root@localhost BMP]# mkdir -p ${OBMP_DATA_ROOT}/postgres/ts
[root@localhost BMP]# mkdir -p ${OBMP_DATA_ROOT}/grafana
[root@localhost BMP]# chmod -R 7777 $OBMP_DATA_ROOT/*
[root@localhost BMP]# cp -r obmp-grafana/dashboards obmp-grafana/provisioning ${OBMP_DATA_ROOT}/grafana/
[root@localhost BMP]# vim docker-compose.yml <<<< Edit the MEM field in the file to limit the memory Docker uses to 2G, which is enough for testing
[root@localhost BMP]# OBMP_DATA_ROOT=/var/openbmp docker-compose -f ./docker-compose.yml -p obmp up -d <<<
Creating obmp-zookeeper ...
Creating obmp-grafana ...
Creating obmp-psql ...
Creating obmp-collector ...
Creating obmp-psql-app ...
Creating obmp-kafka ...
[root@localhost BMP]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b4d4e3bd12d0 confluentinc/cp-kafka:6.0.2 "/etc/confluent/dock…" About a minute ago Up About a minute 0.0.0.0:9092->9092/tcp, :::9092->9092/tcp obmp-kafka
8ebbe99a86d6 openbmp/psql-app:latest "/usr/sbin/run" 2 minutes ago Up About a minute 8080/tcp, 0.0.0.0:9005->9005/tcp, :::9005->9005/tcp obmp-psql-app
a52de93a5645 openbmp/postgres:latest "docker-entrypoint.s…" 2 minutes ago Up About a minute 0.0.0.0:5432->5432/tcp, :::5432->5432/tcp obmp-psql
a9866e9ef534 openbmp/collector:latest "/usr/sbin/run" 2 minutes ago Up About a minute 0.0.0.0:5000->5000/tcp, :::5000->5000/tcp obmp-collector
a19cca96d0cc grafana/grafana:latest "/run.sh" 2 minutes ago Up About a minute 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp obmp-grafana
b9402148e6ea confluentinc/cp-zookeeper:6.0.2 "/etc/confluent/dock…" 2 minutes ago Up About a minute 2181/tcp, 2888/tcp, 3888/tcp obmp-zookeeper
85a59886ecbe portainer/portainer-ce "/portainer" 5 weeks ago Up 5 weeks 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp portainer
[root@localhost BMP]# OBMP_DATA_ROOT=/var/openbmp docker-compose -p obmp down <<<< stop and remove all containers
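While the stack is running (that is, before the `down` command), it can be useful to confirm that the published ports actually accept connections. The sketch below tries a TCP connect to each host port shown in the `docker ps` output above. The names `OBMP_PORTS`, `port_open` and `check_stack` are illustrative, and the default host `127.0.0.1` assumes the check runs on the Docker host itself.

```python
import socket

# Host ports published by the containers in the "docker ps" output above.
OBMP_PORTS = {
    "collector": 5000,
    "grafana": 3000,
    "kafka": 9092,
    "postgres": 5432,
    "psql-app": 9005,
}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_stack(host: str = "127.0.0.1") -> dict:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for name, port in OBMP_PORTS.items()}
```

Calling `check_stack()` returns a dictionary of booleans; a `False` for `collector` would suggest the router cannot reach the BMP server on port 5000 either.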
Router BMP Configuration
Taking a Cisco IOS XR device as an example: by default, OpenBMP exposes port 5000 for connections from BMP clients (the routers) and port 3000 for Grafana.
RP/0/RP0/CPU0:CRS-H#show run bmp
Thu Jul 29 06:37:09.334 UTC
bmp server 1
host 10.70.79.197 port 5000
vrf calo-mgmt
update-source MgmtEth0/RP0/CPU0/0
!
RP/0/RP0/CPU0:CRS-H#show run router bgp 65001
Thu Jul 29 06:37:22.173 UTC
router bgp 65001
nsr
bgp router-id 20.20.20.20
bgp log neighbor changes detail
address-family ipv4 unicast
network 20.20.20.20/32
!
address-family ipv6 unicast
!
neighbor 100.1.0.2
remote-as 100
bmp-activate server 1 <<<< Activate BMP for a specific peer
address-family ipv4 unicast
route-policy pass in
route-policy pass out
!
RP/0/RP0/CPU0:CRS-H#show bgp bmp summary
Thu Jul 29 06:38:08.503 UTC
ID Host Port State Time NBRs
1 10.70.79.197 5000 ESTAB 01:40:39 1
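Adj-RIB-In and Adj-RIB-In Post-policy
As the figure shows, Adj-RIB-In holds the routes received from a neighbor before any RPL processing, while Adj-RIB-In Post-policy holds the routes after RPL has been applied.
By default, an IOS XR device monitors the pre-policy Adj-RIB-In, that is, routes before the inbound policy is applied (all routes the device receives from the neighbor).
To have the monitoring server see only the routes after the inbound policy (the routes that survive route-policy filtering and can be installed in the routing table), use the following configuration: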
RP/0/RP0/CPU0:CRS-H#show run bmp
Thu Jul 29 06:53:19.792 UTC
bmp server all
route-monitoring policy post inbound
!
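On the collector side, the pre-policy and post-policy views can be told apart by the L flag in the BMP per-peer header that precedes each Route Monitoring message. The sketch below decodes that 42-byte header following the layout in RFC 7854 section 4.2; `decode_per_peer` is a hypothetical helper, and the BGP ID used in the example is a placeholder rather than a value from the router above.

```python
import ipaddress
import struct

# Per-peer header, RFC 7854 section 4.2: peer type, flags, distinguisher,
# 16-byte address, peer AS, peer BGP ID, timestamp seconds and microseconds.
PER_PEER = struct.Struct("!BB8s16sI4sII")
FLAG_V = 0x80  # peer address is IPv6
FLAG_L = 0x40  # message reflects the post-policy Adj-RIB-In


def decode_per_peer(header: bytes) -> dict:
    """Extract peer address, AS number and policy view from the header."""
    (_peer_type, flags, _dist, addr, peer_as, _bgp_id, _sec, _usec) = (
        PER_PEER.unpack(header[:PER_PEER.size])
    )
    if flags & FLAG_V:
        peer = ipaddress.IPv6Address(addr)
    else:
        # IPv4 addresses occupy the last 4 bytes of the 16-byte field.
        peer = ipaddress.IPv4Address(addr[12:])
    return {
        "peer": str(peer),
        "peer_as": peer_as,
        "policy": "post-policy" if flags & FLAG_L else "pre-policy",
    }


# Example header for neighbor 100.1.0.2 (AS 100) with the L flag set.
# The BGP ID below is a placeholder value for illustration only.
example = PER_PEER.pack(
    0, FLAG_L, bytes(8), bytes(12) + bytes([100, 1, 0, 2]),
    100, bytes([100, 1, 0, 2]), 0, 0,
)
print(decode_per_peer(example)["policy"])  # post-policy
```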