SNMP support for Keepalived
Vincent Bernat
Keepalived is a high-availability and load-balancing solution. With VRRP, it allows you to share IP addresses between several servers or routers. At a given moment, only one of them (the master) holds the IP address. VRRP is usually set up for redundant routers, but you can also set up redundant services. If the master becomes unavailable or fails, one of the backups takes over the IP address. VRRP is not a cluster resource manager. It is targeted at sharing one (or several) IP addresses between a set of hosts. If you want to share other resources with complex conditions like “at least two web servers should run and the database should not run on the same node as a web server,” you should look at Heartbeat instead.
For load-balancing, Keepalived relies on IPVS, a Linux subsystem featuring layer-4 switching. Keepalived checks whether your servers are alive and reports the result to IPVS. IPVS handles incoming connections and sends them to an appropriate live server, depending on the policy you configure. For example, one policy is round-robin: servers are handed new connections in circular order.
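To make the round-robin policy concrete, here is a toy sketch in shell. It only illustrates the circular order; IPVS does the real scheduling in the kernel. The `pick_server` helper is hypothetical and the server names are placeholders.

```shell
#!/bin/sh
# Toy round-robin scheduler: each new connection goes to the next server
# in the list, wrapping around at the end. This is an illustration only,
# not how IPVS is implemented.

# pick_server INDEX SERVER...
# Prints the server that the INDEX-th connection (counting from 0) gets.
pick_server() {
    index=$1
    shift
    # "$#" is now the number of servers; skip to the selected position.
    shift $(( index % $# ))
    printf '%s\n' "$1"
}

for n in 0 1 2 3 4 5; do
    printf 'connection %d -> %s\n' "$n" "$(pick_server "$n" W1 W2 W3 W4)"
done
```

With four servers, the fifth connection (index 4) wraps around to the first server again.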
Keepalived & SNMP
Keepalived did not feature native SNMP support. I have added complete SNMP support and I hope these patches will be accepted upstream. Meanwhile, you can grab the source code of Keepalived with SNMP support from GitHub. Currently, it is based on the latest version of Keepalived.
Update (2014-03)
SNMP support has been merged in Keepalived 1.2.5.
Here is what you can do:
- query the configuration of the running Keepalived without parsing configuration files;
- query runtime status (like VRRP status, priority, or current state of a virtual server) without looking in the logs;
- query runtime statistics about virtual servers (how many
   connections were handled by this real server, for example) without
   parsing the output of ipvsadm;
- get notified through SNMP traps when something changes (a VRRP transition or a real server becoming unavailable);
- change the priority of a VRRP instance (to force a transition to master); and
- change the weight of a real server (to remove it from a pool of servers).
Compilation
Update (2022-01)
The Git repository below is not up-to-date. As SNMP support has been present since 1.2.5, use a regular Keepalived release instead.
To use it:
$ git clone https://github.com/vincentbernat/keepalived.git
$ cd keepalived
$ git checkout snmp
$ ./configure --enable-snmp
$ make
You get bin/keepalived which is the SNMP-enabled Keepalived
daemon. You can use it directly or install it with make install as
root.
On RHEL, SNMP is not properly packaged. Try the following command
instead of make:
$ make LDFLAGS="$(net-snmp-config --agent-libs) -lpopt -lssl -lcrypto"
Use
To use it, you need to ensure that your main snmpd daemon can become
a master agent. Check for the presence of the master agentx line in
snmpd.conf. The next step is to start Keepalived with the -x
switch. Keepalived connects to your main snmpd daemon. You can
check the logs for a line like this:
May 28 17:21:19 L1 Keepalived_vrrp: NET-SNMP version 5.4.3 AgentX subagent connected
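The check for the master agentx line can be scripted before starting Keepalived with -x. This is a minimal sketch: the default path /etc/snmp/snmpd.conf and the SNMPD_CONF override variable are assumptions, and `has_master_agentx` is a hypothetical helper, not part of Net-SNMP or Keepalived.

```shell
#!/bin/sh
# has_master_agentx FILE
# Succeeds when FILE contains an uncommented "master agentx" directive.
has_master_agentx() {
    grep -Eq '^[[:space:]]*master[[:space:]]+agentx([[:space:]]|$)' "$1"
}

# Assumed default location; set SNMPD_CONF (hypothetical) to override it.
conf=${SNMPD_CONF:-/etc/snmp/snmpd.conf}
if has_master_agentx "$conf" 2>/dev/null; then
    echo "AgentX master enabled in $conf"
else
    echo "no 'master agentx' line found in $conf"
fi
```

A commented-out directive (`# master agentx`) does not match, so the check fails until the line is actually enabled.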
Update (2012-02)
The interface between Keepalived and the
master agent snmpd through the AgentX protocol should be
asynchronous. However, the regular check of the connection is done
synchronously. If the master agent becomes unresponsive, Keepalived
will also be unresponsive, which is critical for the VRRP part. You may
mitigate the problem by applying some workarounds.
Lab
To test it, we can set up a little lab with UML. You can find
additional details about this kind of lab in my post about
network lab with UML. Grab the
complete lab from GitHub. Ensure that you correct the path to
Keepalived sources at the top of the setup script.
The lab features four web servers that are merged into a single web service with the help of Keepalived. Since Keepalived now features full IPv6 support, we also use IPv6 in our lab.
Here is the VRRP configuration of the first Keepalived:
vrrp_instance VI_1 {
  state MASTER
  interface eth2
  dont_track_primary
  track_interface {
    eth0
    eth1
  }
  virtual_router_id 1
  priority 150
  advert_int 2
  authentication {
    auth_type PASS
    auth_pass blibli
  }
  virtual_ipaddress {
    1.1.1.15/24 dev eth0
    2001:db8::15/64 dev eth0
    192.168.1.1/24 dev eth1
    fd00::1/64 dev eth1
  }
}
We define two virtual servers, one for IPv4, one for IPv6. Here is an excerpt of the configuration:
virtual_server_group VS_GROUP_IPv4 {
  1.1.1.15 80
}
virtual_server group VS_GROUP_IPv4 {
  delay_loop 10
  lb_algo rr
  lb_kind NAT
  protocol TCP
  real_server 192.168.1.10 80 {
    weight 1
    HTTP_GET {
      url {
        path /
        status_code 200
      }
      connect_timeout 10
    }
  }
  […]
}
When running the lab (with ./setup), we can check that everything
works as expected. From the client node, we can make several requests,
both with IPv4 and IPv6:
$ curl http://1.1.1.15
W3
$ curl http://1.1.1.15
W2
$ wget -o /dev/null -O - 'http://[2001:db8::15]'
W3
$ wget -o /dev/null -O - 'http://[2001:db8::15]'
W2
If we bring down eth1 on L1, L2 becomes master and everything
keeps working as expected. Our client node can still make requests.
SNMP
We can query the configuration of our running Keepalived from
L1. The appropriate MIB is KEEPALIVED-MIB. The base OID for
this MIB is .1.3.6.1.4.1.9586.100.5. It is hosted in the
OID space allocated by IANA to the Debian project.
For example, to get the configuration of our VRRP instance:
# snmpwalk -v2c -cpublic localhost KEEPALIVED-MIB::vrrpInstanceTable
KEEPALIVED-MIB::vrrpInstanceName.1 = STRING: VI_1
KEEPALIVED-MIB::vrrpInstanceVirtualRouterId.1 = Gauge32: 1
KEEPALIVED-MIB::vrrpInstanceState.1 = INTEGER: master(2)
KEEPALIVED-MIB::vrrpInstanceInitialState.1 = INTEGER: master(2)
KEEPALIVED-MIB::vrrpInstanceWantedState.1 = INTEGER: master(2)
KEEPALIVED-MIB::vrrpInstanceBasePriority.1 = INTEGER: 150
KEEPALIVED-MIB::vrrpInstanceEffectivePriority.1 = INTEGER: 150
KEEPALIVED-MIB::vrrpInstanceVipsStatus.1 = INTEGER: allSet(1)
KEEPALIVED-MIB::vrrpInstancePrimaryInterface.1 = STRING: eth2
KEEPALIVED-MIB::vrrpInstanceTrackPrimaryIf.1 = INTEGER: tracked(1)
KEEPALIVED-MIB::vrrpInstanceAdvertisementsInt.1 = Gauge32: 2 seconds
KEEPALIVED-MIB::vrrpInstancePreempt.1 = INTEGER: preempt(1)
KEEPALIVED-MIB::vrrpInstancePreemptDelay.1 = Gauge32: 0 seconds
KEEPALIVED-MIB::vrrpInstanceAuthType.1 = INTEGER: password(1)
KEEPALIVED-MIB::vrrpInstanceLvsSyncDaemon.1 = INTEGER: disabled(2)
KEEPALIVED-MIB::vrrpInstanceGarpDelay.1 = Gauge32: 0 seconds
KEEPALIVED-MIB::vrrpInstanceSmtpAlert.1 = INTEGER: disabled(2)
KEEPALIVED-MIB::vrrpInstanceNotifyExec.1 = INTEGER: disabled(2)
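For monitoring, this walk can be reduced to the one field a check usually cares about. The sketch below assumes output in exactly the format shown above; `vrrp_states` is a hypothetical helper name, not part of Keepalived or Net-SNMP. The sample input is copied from the walk above.

```shell
#!/bin/sh
# vrrp_states
# Reads snmpwalk output on stdin and prints "INDEX STATE" for every
# vrrpInstanceState row, ignoring InitialState and WantedState rows.
vrrp_states() {
    sed -n 's/^KEEPALIVED-MIB::vrrpInstanceState\.\([0-9][0-9]*\) = INTEGER: \([A-Za-z]*\)([0-9]*)$/\1 \2/p'
}

# Sample input taken from the walk above.
vrrp_states <<'EOF'
KEEPALIVED-MIB::vrrpInstanceName.1 = STRING: VI_1
KEEPALIVED-MIB::vrrpInstanceState.1 = INTEGER: master(2)
KEEPALIVED-MIB::vrrpInstanceInitialState.1 = INTEGER: master(2)
KEEPALIVED-MIB::vrrpInstanceWantedState.1 = INTEGER: master(2)
EOF
```

This prints `1 master`. On a live system, the here-document would be replaced by a pipe from the snmpwalk command.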
We can get statistics from the IPVS subsystem:
# snmpwalk -v 2c -c public localhost KEEPALIVED-MIB::virtualServerTable | grep Stats
KEEPALIVED-MIB::virtualServerStatsConns.1 = Gauge32: 6 connections
KEEPALIVED-MIB::virtualServerStatsConns.2 = Gauge32: 6 connections
KEEPALIVED-MIB::virtualServerStatsInPkts.1 = Counter32: 36 packets
KEEPALIVED-MIB::virtualServerStatsInPkts.2 = Counter32: 36 packets
KEEPALIVED-MIB::virtualServerStatsOutPkts.1 = Counter32: 24 packets
KEEPALIVED-MIB::virtualServerStatsOutPkts.2 = Counter32: 24 packets
KEEPALIVED-MIB::virtualServerStatsInBytes.1 = Counter64: 2970 bytes
KEEPALIVED-MIB::virtualServerStatsInBytes.2 = Counter64: 3312 bytes
KEEPALIVED-MIB::virtualServerStatsOutBytes.1 = Counter64: 2592 bytes
KEEPALIVED-MIB::virtualServerStatsOutBytes.2 = Counter64: 3072 bytes
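These per-virtual-server rows are easy to aggregate. As a rough sketch, the hypothetical `sum_stat` helper below totals one statistic across all virtual servers. It assumes the output format shown above, with the numeric value in the fourth whitespace-separated field.

```shell
#!/bin/sh
# sum_stat COLUMN
# Reads snmpwalk output on stdin and sums COLUMN (for example
# virtualServerStatsConns) over all virtual servers.
sum_stat() {
    awk -v col="KEEPALIVED-MIB::$1." '
        # Keep only rows whose object name starts with the wanted column.
        index($1, col) == 1 { total += $4 }
        END { print total + 0 }
    '
}

# Sample input taken from the walk above.
sum_stat virtualServerStatsConns <<'EOF'
KEEPALIVED-MIB::virtualServerStatsConns.1 = Gauge32: 6 connections
KEEPALIVED-MIB::virtualServerStatsConns.2 = Gauge32: 6 connections
KEEPALIVED-MIB::virtualServerStatsInPkts.1 = Counter32: 36 packets
KEEPALIVED-MIB::virtualServerStatsInPkts.2 = Counter32: 36 packets
EOF
```

With the sample rows, this prints `12`, the total number of connections across the IPv4 and IPv6 virtual servers.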
We can also disable a real server, for example W1:
# snmpget -v2c -cpublic localhost KEEPALIVED-MIB::realServerAddrType.1.1 \
>     KEEPALIVED-MIB::realServerAddress.1.1
KEEPALIVED-MIB::realServerAddrType.1.1 = INTEGER: ipv4(1)
KEEPALIVED-MIB::realServerAddress.1.1 = Hex-STRING: C0 A8 01 0A
# snmpget -v2c -cpublic localhost KEEPALIVED-MIB::realServerAddress.2.1 \
>     KEEPALIVED-MIB::realServerAddress.2.1
KEEPALIVED-MIB::realServerAddress.2.1 = Hex-STRING: FD 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10
KEEPALIVED-MIB::realServerAddress.2.1 = Hex-STRING: FD 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10
# snmpset -v2c -cprivate localhost KEEPALIVED-MIB::realServerWeight.1.1 = 0
KEEPALIVED-MIB::realServerWeight.1.1 = INTEGER: 0
# snmpset -v2c -cprivate localhost KEEPALIVED-MIB::realServerWeight.2.1 = 0
KEEPALIVED-MIB::realServerWeight.2.1 = INTEGER: 0
From C1, you can now check that W1 is not queried anymore.
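The addresses returned by snmpget are raw bytes shown as a Hex-STRING, which makes it hard to see which real server is being disabled. Here is a small sketch to turn them back into readable addresses. The helper names `hex_to_ipv4` and `hex_to_ipv6` are hypothetical, and the IPv6 form is printed uncompressed.

```shell
#!/bin/sh
# hex_to_ipv4 "C0 A8 01 0A"
# Converts the four hex bytes of a Hex-STRING into dotted-quad notation.
hex_to_ipv4() {
    # Word splitting on the unquoted argument is intentional.
    set -- $1
    printf '%d.%d.%d.%d\n' "0x$1" "0x$2" "0x$3" "0x$4"
}

# hex_to_ipv6 "FD 00 ... 10"
# Converts sixteen hex bytes into eight colon-separated groups
# (no "::" compression is attempted).
hex_to_ipv6() {
    set -- $1
    printf '%s%s:%s%s:%s%s:%s%s:%s%s:%s%s:%s%s:%s%s\n' "$@" | tr 'A-F' 'a-f'
}

hex_to_ipv4 "C0 A8 01 0A"
hex_to_ipv6 "FD 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10"
```

The first call prints `192.168.1.10`, which matches the real server declared in the virtual server configuration. The second prints the long form of fd00::10.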
Let’s graph the number of connections per second each real server
gets. With a simple script using rrdtool, we can get the
following graph:
