The return of the Linux router... (from pfSense to Debian, part 4: from CARP to VRRP)

Hi again mighty World!

So... a lot of things going on, and little time for my blog, as more and more stuff piles up as article candidates.
After some struggle, I now have my v6 CCNA Routing and Switching... CCNP starts after summer, and I'm very committed to learning AngularJS and Express... and overall, this year, contrary to what I thought, it seems I'll barely have any holidays...

But here is yet another article of the series!!!
In this one I'm going to share how I replaced the pfSense CARP feature (that is, router redundancy) with a solid, Debian-packaged alternative.
While studying for the v6 CCNA, I learned about the HSRP and VRRP protocols for router redundancy / failover, and it turned out that a Debian solution I have lived with many times (although not directly messing with it), Keepalived, is just an implementation of the VRRP protocol!

In the past I had seen and learnt how to use Keepalived as a means to have failover/redundant HAProxy reverse proxy appliances... and I never thought of it as a kind of hot stand-by / redundant router solution, but, well ... it is!
Now I'm using Keepalived/VRRP instead of pfSense CARP, in scenarios with up to three redundant Debian gateway appliances, each one having more than ten interfaces/addresses (and sharing a floating address per connected network), up, running, completely virtualized under PROXMOX, heavily loaded.... and rock solid!

The thing is that.... well... Keepalived is also one of those clever, easy, straightforward, Linux/Debian admin friendly tools to install and set up... here's a very quick example/guide on what I needed to do!

Scenario and goals

Well, it is a continuation of the one explained in earlier articles of this series.
In a nutshell, the idea is to have Debian VMs, running under PROXMOX, and using OVH's vRack feature (an inter-datacenter Layer 2 broadcast domain for mixed physical, virtual and storage assets).

OVH's vRack, together with PROXMOX and OpenVSwitch VLAN tagged traffic, makes it possible to create multiple, isolated broadcast domains... and by using bridges on the routing VMs, it is possible to bridge OpenVPN tap Layer 2 tunnel endpoint interfaces directly onto those broadcast domains!

So, the network interfaces seen here under configuration are bridges (because that's what I have in my notes) ... but normally it would be exactly the same with plain virtual/physical NICs.
Also, to keep it as short as possible, I'll show an example with just two redundant routers, each having only two interfaces configured as redundant.

 

Installation

Ummm ... a piece of cake:

apt-get install keepalived

 
we got it! :-P

 

Prepare system

This is perhaps the only 'tricky' thing... because Debian, by default, has the kernel networking stack configured not to regard any traffic as directed to itself if the destination address is not among the addresses configured on its interfaces (which makes sense, of course), and it does not let services bind to such addresses either.
So, by default, Debian not only does not act as a router (forwarding), it also ignores arriving packets addressed to a 'floating' IP it does not currently hold.

Again, like when enabling forwarding, the trick is tuning the system to do what we need by editing the /etc/sysctl.conf config file.
This time, I just added the following at the end of the file:

#Allow binding to VRRP floating IP
net.ipv4.ip_nonlocal_bind=1

 
And that's it... now services on the system are allowed to bind to the non-local (floating) IP addresses configured in Keepalived, even while another router is holding them.
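
If you want to double-check the file before relying on it, here is a minimal sketch of a helper. The name check_nonlocal_bind is made up for this article, and it only inspects the file you give it, not the running kernel. The commented lines at the end show the stock sysctl commands to load and read back the live value.

```shell
#!/bin/sh
# check_nonlocal_bind FILE
# Hypothetical helper: succeeds (exit 0) if FILE contains an uncommented
# line that sets net.ipv4.ip_nonlocal_bind to 1, fails otherwise.
check_nonlocal_bind() {
    grep -Eq '^[[:space:]]*net\.ipv4\.ip_nonlocal_bind[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Usage (as root, on the router):
#   check_nonlocal_bind /etc/sysctl.conf && echo "setting present"
#   sysctl -p                            # load /etc/sysctl.conf without rebooting
#   sysctl net.ipv4.ip_nonlocal_bind     # read the live value back
```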

 

Setup

Now for the interesting part, configuration.
Keepalived has a single configuration file: /etc/keepalived/keepalived.conf ... old school fashion and Debianist friendly!

To understand the example depicted here, let's briefly comment on some points about VRRP itself:

  • We have a MASTER and a number of SLAVES (VRRP and Keepalived call them BACKUP routers) sharing one or more non-local (floating) IP addresses.
  • Those addresses are not configured in /etc/network/interfaces... they're entirely Keepalived managed: it adds them to the interface of whichever router is master at the moment, and they 'just work'.
  • More than one slave is possible; here we depict the classical mirroring, but it can scale up as needed.
  • Who's the MASTER at any time is controlled by priority ... the higher the priority value, the higher the priority.
  • The configuration file allows you to declare a MASTER or BACKUP state ... but think of this as either an initial state or a means to handle equal priority systems... what really matters is priority. It doesn't matter if you configure a state of MASTER on one host while you set up another host as BACKUP but with a higher priority: in that situation, the backup becomes master as soon as both systems negotiate.
  • Each virtual address is handled by a VRRP instance, which is bound to an interface, has its own excerpt in the configuration file, and is identified by a virtual_router_id. Appliances negotiating VRRP mastership of a given floating address MUST USE THE SAME ID.
  • Every instance, even on the same router, has a different ID (here I use the VLAN number, since in my case the same interface is connected to the same VLAN on all routers; this works as long as the number fits in the 1-255 range VRRP allows for IDs).
  • So, the config includes an initial set of common things, and then as many instance configurations as needed.
  • Per VRRP instance details, such as authentication, advertising intervals, etc., have to match on all appliances (along with the ID).
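
To make the "what really matters is priority" point concrete, here is a toy sketch in shell. It is not what Keepalived runs internally: elect_master is a hypothetical name, the "name priority" input format is my own, and real VRRP also deals with ties and preemption settings, which this ignores.

```shell
#!/bin/sh
# elect_master: read "name priority" lines on stdin and print the name
# with the highest priority value. On a tie the first line seen wins
# (a simplification; it is not how VRRP resolves ties).
elect_master() {
    awk 'NR == 1 || $2 + 0 > max { max = $2 + 0; name = $1 }
         END { print name }'
}

# Usage: the configured "state" does not appear here at all, only priority.
#   printf 'router1 10\nrouter2 5\n' | elect_master
```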

So here's an example of what the common, global, initial part of the configuration file may look like (basically it is email notification and overall router naming/identification.... not very interesting really)

global_defs {
   notification_email {
     sysadmins@example.net
   }
   notification_email_from keepalived@router1.example.net
   smtp_server localhost
   smtp_connect_timeout 30
   router_id ROUTER1
}

 
This is basically what I'm using... just email me when Keepalived has had to act to keep a virtual IP up...

... but Keepalived is able to do many, many things for you, not just email.
So I encourage you to look at its documentation, because it is a really amazing piece of software!

Now for the really interesting stuff... OK, consider we have the following:

  • Our intended MASTER router. It is acting as gateway for two connected networks, so we have two interfaces; let's assume they are br50 for VLAN 50 and br100 for VLAN 100.
    br50, VLAN 50, is on the 192.168.50.0/24 network and has its own IP address (no matter which!) defined in /etc/network/interfaces (or maybe by DHCP!), but we want a common, floating 192.168.50.254 address shared by all routers.
  • Following the same logic/scheme, we have the br100 interface connected to VLAN 100, with its own IP address, whatever it is, but it has to respond on the virtual/floating 192.168.100.254 address for the 192.168.100.0/24 subnet.
  • Our intended SLAVE (BACKUP) routers share the same scheme: they are connected to the same networks/VLANs, use the same interface names, and have their own (of course different) IP addresses, which do not matter here, but they participate in keeping those same floating IPs (192.168.50.254 and 192.168.100.254) alive.

Here's what the master may look like:

vrrp_instance VLAN50 {
    state MASTER
    interface br50
    virtual_router_id 50 
    priority 10
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass verysecret
    }
    virtual_ipaddress {
        192.168.50.254
    }
}

vrrp_instance VLAN100 {
    state MASTER
    interface br100
    virtual_router_id 100 
    priority 10
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass verysecret
    }
    virtual_ipaddress {
        192.168.100.254
    }
}

 
Isn't it lovely!?!?! It can't be more self-explanatory!!! And when compared with what one of the slave configs looks like (Keepalived's keyword for that state is BACKUP), well, it is the non-plus-ultra!!! .... here it is:

vrrp_instance VLAN50 {
    # Keepalived's state keyword takes MASTER or BACKUP
    state BACKUP
    interface br50
    virtual_router_id 50 
    priority 5
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass verysecret
    }
    virtual_ipaddress {
        192.168.50.254
    }
}

vrrp_instance VLAN100 {
    state BACKUP
    interface br100
    virtual_router_id 100 
    priority 5
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass verysecret
    }
    virtual_ipaddress {
        192.168.100.254
    }
}
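
Since the IDs and the rest of the per-instance details have to line up between routers, a small helper that prints the key fields of every instance makes comparing two routers easier. This is only a sketch: vrrp_summary is a name I made up, and it assumes the layout used in this article (one setting per line, and the closing brace of each instance at the very start of its line).

```shell
#!/bin/sh
# vrrp_summary FILE
# Hypothetical helper: print "instance interface virtual_router_id priority"
# for every vrrp_instance block found in FILE.
vrrp_summary() {
    awk '
        $1 == "vrrp_instance"     { name = $2 }
        $1 == "interface"         { ifname = $2 }
        $1 == "virtual_router_id" { vrid = $2 }
        $1 == "priority"          { prio = $2 }
        # An unindented closing brace ends the instance block in this layout;
        # indented ones (authentication, virtual_ipaddress) are skipped.
        /^}/ && name != "" { print name, ifname, vrid, prio; name = "" }
    ' "$1"
}

# Usage: run on each router and compare; only the priority column should differ.
#   vrrp_summary /etc/keepalived/keepalived.conf
```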

 
Words are very unnecessary... they can only do harm...

Checking that it's working

Ok.... issue the usual service command (on systemd-based Debian it hands over to systemctl) with either the start or restart argument...

service keepalived start

 
and check with the 'status' argument that it runs on all participating hosts... something like this:

root@router1:~# service keepalived status
● keepalived.service - Keepalive Daemon (LVS and VRRP)
    Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2018-04-24 23:19:46 CEST; 3 months 18 days ago
  Main PID: 3183 (keepalived)
    Tasks: 3 (limit: 4915)
   CGroup: /system.slice/keepalived.service
           ├─3183 /usr/sbin/keepalived
           ├─3184 /usr/sbin/keepalived
           └─3185 /usr/sbin/keepalived

 
Now let's use the 'ip a' command to see the IP addressing on both master and slave systems, to check that the floating IPs are up, and on the intended system.

First let's look at the 'MASTER', where we expect to have both the normal and the floating IP addresses on the interesting interfaces ...

root@router1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
...
(omitted output)
...
10: br50: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.50.2/24 brd 192.168.50.255 scope global br50
       valid_lft forever preferred_lft forever
    inet 192.168.50.254/32 scope global br50
       valid_lft forever preferred_lft forever
    inet6 fe80:xxxxxxxx/64 scope link 
       valid_lft forever preferred_lft forever

14: br100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.2/24 brd 192.168.100.255 scope global br100
       valid_lft forever preferred_lft forever
    inet 192.168.100.254/32 scope global br100
       valid_lft forever preferred_lft forever
    inet6 fe80:xxxxxxxx/64 scope link 
       valid_lft forever preferred_lft forever
...
(omitted output)

 
...while on any of the 'SLAVE's we don't see the floating IP address active on any interface.

root@router2:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
...
(omitted output)
...
10: br50: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.50.3/24 brd 192.168.50.255 scope global br50
       valid_lft forever preferred_lft forever
    inet6 fe80:xxxxxxxx/64 scope link 
       valid_lft forever preferred_lft forever

14: br100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.3/24 brd 192.168.100.255 scope global br100
       valid_lft forever preferred_lft forever
    inet6 fe80:xxxxxxxx/64 scope link 
       valid_lft forever preferred_lft forever
...
(omitted output)
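
Eyeballing 'ip a' gets tedious with many interfaces, so here is a small sketch that answers "does this router hold the floating IP right now?". The name holds_vip is made up for this article; it reads the one-line-per-address output of 'ip -o -4 addr show' on stdin and compares the address exactly, so 192.168.50.25 will not match 192.168.50.254.

```shell
#!/bin/sh
# holds_vip ADDRESS
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and
# succeeds (exit 0) if ADDRESS (given without prefix length) is listed.
holds_vip() {
    awk -v vip="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), a, "/")   # strip the /prefix part
                    if (a[1] == vip) found = 1
                }
            }
        }
        END { exit !found }
    '
}

# Usage, on any of the routers:
#   ip -o -4 addr show dev br50 | holds_vip 192.168.50.254 \
#       && echo "holding the floating IP" || echo "not holding it"
```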

 
You may check that, even though those floating IP addresses are not part of the /etc/network/interfaces file, they actually work, by pinging them.
The master claims to be the endpoint for those addresses, and takes over any traffic arriving for them.

root@router2:~# ping 192.168.100.254
PING 192.168.100.254 (192.168.100.254) 56(84) bytes of data.
64 bytes from 192.168.100.254: icmp_seq=1 ttl=64 time=21.4 ms
64 bytes from 192.168.100.254: icmp_seq=2 ttl=64 time=10.3 ms
64 bytes from 192.168.100.254: icmp_seq=3 ttl=64 time=10.3 ms
^C
--- 192.168.100.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms

 
You may reboot your master router/appliance and run 'ip a' on the slave/s, and you'll see that as soon as the master goes down, the host with the next highest priority takes over responding on the floating IP addresses!!!!
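
If you prefer to see the takeover in the logs rather than in 'ip a', a filter like the following sketch can help. The name vrrp_transitions is made up, and the 'Entering ... STATE' wording is an assumption based on what Keepalived usually logs; it may differ between versions, so adjust the pattern to what your logs actually show.

```shell
#!/bin/sh
# vrrp_transitions
# Hypothetical helper: keep only the log lines on stdin that report a
# VRRP state change (pattern assumed, see the note above).
vrrp_transitions() {
    grep -E 'Entering (MASTER|BACKUP|FAULT) STATE'
}

# Usage, on a slave while the master reboots:
#   journalctl -u keepalived | vrrp_transitions
```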

So... that's it!!! bye bye CARP ... hello VRRP!!!!!