Route-based IPsec VPN on Linux with strongSwan
Vincent Bernat
A common way to establish an IPsec tunnel on Linux is to use an IKE daemon, like the one from the strongSwan project, with a minimal configuration:1
conn V2-1 left = 2001:db8:1::1 leftsubnet = 2001:db8:a1::/64 right = 2001:db8:2::1 rightsubnet = 2001:db8:a2::/64 authby = psk auto = route
The same configuration can be used on both sides. Each side will figure out if
it is “left” or “right.” The IPsec site-to-site tunnel endpoints are
2001:db8:1::1
and 2001:db8:2::1
. The protected subnets are
2001:db8:a1::/64
and 2001:db8:a2::/64
. As a result, strongSwan
configures the following policies in the kernel:
$ ip xfrm policy src 2001:db8:a1::/64 dst 2001:db8:a2::/64 dir out priority 399999 ptype main tmpl src 2001:db8:1::1 dst 2001:db8:2::1 proto esp reqid 4 mode tunnel src 2001:db8:a2::/64 dst 2001:db8:a1::/64 dir fwd priority 399999 ptype main tmpl src 2001:db8:2::1 dst 2001:db8:1::1 proto esp reqid 4 mode tunnel src 2001:db8:a2::/64 dst 2001:db8:a1::/64 dir in priority 399999 ptype main tmpl src 2001:db8:2::1 dst 2001:db8:1::1 proto esp reqid 4 mode tunnel […]
This kind of IPsec tunnel is a policy-based VPN: encapsulation and decapsulation are governed by these policies. Each of them contains the following elements:
- a direction (
out
,in
orfwd
2); - a selector (source subnet, destination subnet, protocol, ports);
- a mode (
transport
ortunnel
); - an encapsulation protocol (
esp
orah
); and - the endpoint source and destination addresses.
When a matching policy is found, the kernel will look for a corresponding
security association (using reqid
and the endpoint source and destination
addresses):
$ ip xfrm state src 2001:db8:1::1 dst 2001:db8:2::1 proto esp spi 0xc1890b6e reqid 4 mode tunnel replay-window 0 flag af-unspec auth-trunc hmac(sha256) 0x5b68[…]8ba2904 128 enc cbc(aes) 0x8e0e377ad8fd91e8553648340ff0fa06 anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000 […]
If no security association is found, the packet is put on hold and the IKE daemon is asked to negotiate an appropriate one. Otherwise, the packet is encapsulated. The receiving end identifies the appropriate security association using the SPI in the header. Two security associations are needed to establish a bidirectionnal tunnel:
$ tcpdump -pni eth0 -c2 -s0 esp 13:07:30.871150 IP6 2001:db8:1::1 > 2001:db8:2::1: ESP(spi=0xc1890b6e,seq=0x222) 13:07:30.872297 IP6 2001:db8:2::1 > 2001:db8:1::1: ESP(spi=0xcf2426b6,seq=0x204)
All IPsec implementations are compatible with policy-based VPNs. However, some configurations are difficult to implement. For example, consider the following proposition for redundant site-to-site VPNs:
A possible configuration between V1-1
and V2-1
could be:
conn V1-1-to-V2-1 left = 2001:db8:1::1 leftsubnet = 2001:db8:a1::/64,2001:db8:a6::cc:1/128,2001:db8:a6::cc:5/128 right = 2001:db8:2::1 rightsubnet = 2001:db8:a2::/64,2001:db8:a6::/64,2001:db8:a8::/64 authby = psk keyexchange = ikev2 auto = route
Each time a subnet is modified on one site, the configurations need to be
updated on all sites. Moreover, overlapping subnets (2001:db8:a6::/64
on one
side and 2001:db8:a6::cc:1/128
at the other) can also be problematic.
The alternative is to use route-based VPNs: any packet traversing a pseudo-interface will be encapsulated using a security policy bound to the interface. This brings two features:
- Routing daemons can be used to distribute routes to be protected by the VPN. This decreases the administrative burden when many subnets are present on each side.
- Encapsulation and decapsulation can be executed in a different routing instance or namespace. This enables a clean separation between a private routing instance (where VPN users are) and a public routing instance (where VPN endpoints are).
Route-based VPN on Juniper#
Before looking at how to achieve that on Linux, let’s have a look at the way it works with a Junos-based platform (like a Juniper vSRX). This platform as long-standing history of supporting route-based VPNs (a feature already present in the Netscreen ISG platform).
Let’s assume we want to configure the IPsec VPN from V3-2
to V1-1
. First, we
need to configure the tunnel interface and bind it to the “private” routing
instance containing only internal routes (with IPv4, they would have been RFC 1918 routes):
interfaces { st0 { unit 1 { family inet6 { address 2001:db8:ff::7/127; } } } } routing-instances { private { instance-type virtual-router; interface st0.1; } }
The second step is to configure the VPN:
security { /* Phase 1 configuration */ ike { proposal IKE-P1 { authentication-method pre-shared-keys; dh-group group20; encryption-algorithm aes-256-gcm; } policy IKE-V1-1 { mode main; proposals IKE-P1; pre-shared-key ascii-text "d8bdRxaY22oH1j89Z2nATeYyrXfP9ga6xC5mi0RG1uc"; } gateway GW-V1-1 { ike-policy IKE-V1-1; address 2001:db8:1::1; external-interface lo0.1; general-ikeid; version v2-only; } } /* Phase 2 configuration */ ipsec { proposal ESP-P2 { protocol esp; encryption-algorithm aes-256-gcm; } policy IPSEC-V1-1 { perfect-forward-secrecy keys group20; proposals ESP-P2; } vpn VPN-V1-1 { bind-interface st0.1; df-bit copy; ike { gateway GW-V1-1; ipsec-policy IPSEC-V1-1; } establish-tunnels on-traffic; } } }
We get a route-based VPN because we bind the st0.1
interface to the
VPN-V1-1
VPN. Once the VPN is up, any packet entering st0.1
will be
encapsulated and sent to the 2001:db8:1::1
endpoint.
The last step is to configure BGP in the “private” routing instance to exchange routes with the remote site:
routing-instances { private { routing-options { router-id 1.0.3.2; maximum-paths 16; } protocols { bgp { preference 140; log-updown; group v4-VPN { type external; local-as 65003; hold-time 6; neighbor 2001:db8:ff::6 peer-as 65001; multipath; export [ NEXT-HOP-SELF OUR-ROUTES NOTHING ]; } } } } }
The export filter OUR-ROUTES
needs to select the routes to be advertised to
the other peers. For example:
policy-options { policy-statement OUR-ROUTES { term 10 { from { protocol ospf3; route-type internal; } then { metric 0; accept; } } } }
The configuration needs to be repeated for the other peers. The complete version
is available on GitHub. Once the BGP sessions are up, we start
learning routes from the other sites. For example, here is the route for
2001:db8:a1::/64
:
> show route 2001:db8:a1::/64 protocol bgp table private.inet6.0 best-path private.inet6.0: 15 destinations, 19 routes (15 active, 0 holddown, 0 hidden) + = Active Route, - = Last Active, * = Both 2001:db8:a1::/64 *[BGP/140] 01:12:32, localpref 100, from 2001:db8:ff::6 AS path: 65001 I, validation-state: unverified to 2001:db8:ff::6 via st0.1 > to 2001:db8:ff::14 via st0.2
It was learnt both from V1-1
(through st0.1
) and V1-2
(through
st0.2
). The route is part of the private
routing instance but encapsulated
packets are sent/received in the public
routing instance. No route-leaking is
needed for this configuration. The VPN cannot be used as a gateway from internal
hosts to external hosts (or vice-versa). This could also have been done with
Junos’ security policies (stateful firewall rules) but doing the separation with
routing instances also ensure routes from different domains are not mixed and a
simple policy misconfiguration won’t lead to a disaster.
Route-based VPN on Linux#
Starting from Linux 3.15, a similar configuration is possible with the help of a virtual tunnel interface.3 First, we create the “private” namespace:
# ip netns add private # ip netns exec private sysctl -qw net.ipv6.conf.all.forwarding=1
Any “private” interface needs to be moved to this namespace (no IP is configured as we can use IPv6 link-local addresses):
# ip link set netns private dev eth1 # ip link set netns private dev eth2 # ip netns exec private ip link set up dev eth1 # ip netns exec private ip link set up dev eth2
Then, we create vti6
, a tunnel interface (similar to st0.1
in the Junos
example):
# ip -6 tunnel add vti6 \ > mode vti6 \ > local 2001:db8:1::1 \ > remote 2001:db8:3::2 \ > key 6 # ip link set netns private dev vti6 # ip netns exec private ip addr add 2001:db8:ff::6/127 dev vti6 # ip netns exec private sysctl -qw net.ipv4.conf.vti6.disable_policy=1 # ip netns exec private ip link set vti6 mtu 1500 # ip netns exec private ip link set vti6 up
The tunnel interface is created in the initial namespace and moved to the “private” one. It will remember its original namespace where it will process encapsulated packets. Any packet entering the interface will temporarily get a firewall mark of 6 that will be used only to match the appropriate IPsec policy4 below. The kernel sets a low MTU on the interface to handle any possible combination of ciphers and protocols. We set it to 1500 and let PMTUD do its work.
Update (2018-04)
The MTU is also too low due to a bug that is fixed in commit c6741fbed6dc (released with Linux 4.17).
We can then configure strongSwan:5
conn V3-2 left = 2001:db8:1::1 leftsubnet = ::/0 right = 2001:db8:3::2 rightsubnet = ::/0 authby = psk mark = 6 auto = route keyexchange = ikev2 keyingtries = %forever ike = aes256gcm16-prfsha384-ecp384! esp = aes256gcm16-prfsha384-ecp384! mobike = no
The IKE daemon configures the following policies in the kernel:
$ ip xfrm policy src ::/0 dst ::/0 dir out priority 399999 ptype main mark 0x6/0xffffffff tmpl src 2001:db8:1::1 dst 2001:db8:3::2 proto esp reqid 1 mode tunnel src ::/0 dst ::/0 dir fwd priority 399999 ptype main mark 0x6/0xffffffff tmpl src 2001:db8:3::2 dst 2001:db8:1::1 proto esp reqid 1 mode tunnel src ::/0 dst ::/0 dir in priority 399999 ptype main mark 0x6/0xffffffff tmpl src 2001:db8:3::2 dst 2001:db8:1::1 proto esp reqid 1 mode tunnel […]
These policies are used for any source or destination as long as the firewall mark is equal to 6, which matches the mark configured for the tunnel interface.
The last step is to configure BGP to exchange routes. We can use BIRD for this:
router id 1.0.1.1; protocol device { scan time 10; } protocol kernel { persist; learn; import all; export all; merge paths yes; } protocol bgp IBGP_V3_2 { local 2001:db8:ff::6 as 65001; neighbor 2001:db8:ff::7 as 65003; import all; export where ifname ~ "eth*"; preference 160; hold time 6; }
Once BIRD is started in the “private” namespace, we can check routes are learned correctly:
$ ip netns exec private ip -6 route show 2001:db8:a3::/64 2001:db8:a3::/64 proto bird metric 1024 nexthop via 2001:db8:ff::5 dev vti5 weight 1 nexthop via 2001:db8:ff::7 dev vti6 weight 1
The above route was learnt from both V3-1
(through vti5
) and V3-2
(through
vti6
). Like for the Junos version, there is no route-leaking between the
“private” namespace and the initial one. The VPN cannot be used as a gateway
between the two namespaces, only for encapsulation. This also prevent a
misconfiguration (for example, IKE daemon not running) from allowing packets to
leave the private network.
As a bonus, unencrypted traffic can be observed with tcpdump
on the tunnel
interface:
$ ip netns exec private tcpdump -pni vti6 icmp6 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on vti6, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes 20:51:15.258708 IP6 2001:db8:a1::1 > 2001:db8:a3::1: ICMP6, echo request, seq 69 20:51:15.260874 IP6 2001:db8:a3::1 > 2001:db8:a1::1: ICMP6, echo reply, seq 69
You can find all the configuration files for this example on GitHub. The documentation of strongSwan also features a page about route-based VPNs. It is possible to replace IPsec by WireGuard, a fast and modern VPN implementation.
Update (2018-11)
It is also possible to transport IPv4 on top of IPv6 IPsec tunnels. The lab has been updated to support such a scenario.
Update (2018-11)
There are some serious bugs starting from Linux 4.14 impacting this setup. Be sure to apply the following patches: 9e1437937807 and 0152eee6fc3b—both applied to stable trees.
-
fwd
is for incoming packets on non-local addresses. It only makes sense intransport
mode and is a Linux-only specificity. ↩︎ -
Virtual tunnel interfaces (VTI) were introduced in Linux 3.6 (for IPv4) and Linux 3.12 (for IPv6). Appropriate namespace support was added in 3.15. KLIPS, an alternative out-of-tree stack available since Linux 2.2, also features tunnel interfaces. ↩︎
-
The mark is set right before doing a policy lookup and restored after that. Consequently, it doesn’t affect other possible uses (filtering, routing). However, as Netfilter can also set a mark, one should be careful for conflicts. ↩︎
-
The ciphers used here are the strongest ones currently possible while keeping compatibility with Junos. The documentation for strongSwan contains a complete list of supported algorithms as well as security recommendations to choose them. ↩︎