Failover Internet Connection Using IP SLA Tracking and EIGRP Routing for Inter-Site Links

posted 16 May 2013, 06:22 by Tristan Self   [ updated 17 May 2013, 00:29 ]
This is a bit of an interesting one. The brief is to provide Internet connectivity from two sites, A and B. There is a point-to-point LES link between the two sites for inter-site connectivity, and a wireless point-to-point link as a backup for the LES link. EIGRP runs between the site core switches, so that if the main inter-site LES link goes down, traffic fails over to the wireless link automatically. Each site has its own local Internet connection, which that site's clients use in normal operation. If a local Internet link fails, the default route should adjust automatically to push Internet traffic out through the other site (via the inter-site link).
 
 
I'm only going to show the juicy bits of the config, i.e. the bits that refer to the routing, EIGRP or IP SLA.
 
In normal use the Site A switch knows about two default routes to the Internet: its local static default route, and the Site B default route learned across the EIGRP network. The local one is preferred because a static route has a lower administrative distance (1) than a route redistributed into EIGRP (170), so the Site B route sits in reserve. If Site A's Internet connection goes down, the IP SLA probe fails, track 1 goes down and the tracked route, i.e. the local default route, is removed. This leaves only the Site B default route, which is then used. If both Internet connections go down at the same time, both routes disappear.
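To check which default route is actually in use at Site A, something like the following should do. This is a sketch using standard show commands from enable mode; output is omitted because it varies by platform and IOS version.

```
! The default route currently installed in the routing table
show ip route 0.0.0.0
! Every default route EIGRP knows about, including the one learned from Site B
show ip eigrp topology 0.0.0.0 0.0.0.0
```

In normal operation the first command should show the static route via 172.20.2.254; after a local Internet failure it should show the EIGRP route learned from Site B instead.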
 
The route-map is there to force the IP SLA probe out of the local Internet connection only. Without it, when the local connection failed the IP SLA would see the tracked IP address go away and the route would be removed, but the probe would then follow the remaining default route out via the Internet connection at Site B. Oh dear: it would find the tracked IP address reachable again via Site B and put the local route back in, at which point the probe would fail again and the route would be removed again. You are now in "flap-city", and the route will continue to flap back and forth. Forcing the probe out of the local connection only means that if the Internet connection goes down, it goes down hard and stays down until the connection is repaired, at which point the route comes back in.
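Once the configuration below is in place, the policy route can be checked with standard show commands. This is a sketch only; the route-map and access list names are the Site A ones used in this post.

```
! Confirm the route-map is applied to traffic generated by the switch itself
show ip local policy
! The policy routing match counters should increase as the probe fires
show route-map IP_SLA_SiteA
! The access list entry the route-map matches on
show access-lists 101
```

If the match counters stay at zero, the probe is probably not matching access list 101, which usually points at the source address or the tracked IP.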
 
Site A

<The VLAN interfaces are for each end of the point to point links.> 

interface Vlan50
description *** SiteA to SiteB - 1Gbit Fibre Link ***
ip address 192.168.50.1 255.255.255.252

interface Vlan57
description *** SiteA to SiteC - 300Mbit Wireless Link ***
ip address 192.168.50.41 255.255.255.248

< The EIGRP settings, to advertise the locally connected routes to the rest of the EIGRP autonomous system. >

router eigrp 1
redistribute static
network 172.20.0.0
network 192.168.50.0 0.0.0.3
network 192.168.50.40 0.0.0.7
no auto-summary

< Firstly we need to set up a route map; this says that any traffic matching access list 101 will be forwarded to the Site A firewall on 172.20.2.254. >
route-map IP_SLA_SiteA
  match ip address 101
  set ip next-hop 172.20.2.254

< Now we create an access list on the switch to match any traffic emerging from the IP SLA probe on 172.20.2.1 and going to Site A's tracked IP. Because of the route map, only this traffic is forced out via the local firewall, irrespective of the current routing table. >
access-list 101 permit icmp host 172.20.2.1 host <Tracked IP>

< The route-map also requires this setting. It applies the route-map to traffic that originates from the switch itself, which is essential because the IP SLA probes are generated by the switch itself. >
ip local policy route-map IP_SLA_SiteA

< Set up the IP SLA to monitor the IP address, with a source IP address of the switch IP. It's set up with a timeout of 2000 ms, a threshold (maximum tolerated RTT) of 300 ms, and a frequency of 15 seconds, i.e. how often it attempts to ping. >
ip sla monitor 1
 type echo protocol ipIcmpEcho <Tracked IP> source-ipaddr 172.20.2.1
 timeout 2000
 threshold 300
 frequency 15

< Start the IP SLA operation running; it will then start probing the IP address shown above and reporting its reachability. >
ip sla monitor schedule 1 life forever start-time now

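< Assign track 1 to the IP SLA; this means that if IP SLA 1 reports no reachability, any routes associated with the track are removed. The two lines below are alternative forms of the same command, so only one is needed: use whichever your IOS version accepts. >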
track 1 ip sla 1 reachability <== This is the syntax for a 3700 series switch
track 1 rtr 1 reachability

< Create the static default route for all outbound traffic and assign track 1 to it; if the track goes down, this route is removed from the routing table automatically. >
ip route 0.0.0.0 0.0.0.0 172.20.2.254 track 1
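
With the Site A configuration applied, the probe, the track object and the tracked route can each be checked in turn. This is a sketch using standard show commands; the first one assumes the older "ip sla monitor" syntax that this switch uses.

```
! Latest probe result for operation 1
show ip sla monitor statistics 1
! State of the tracked object (up or down) and when it last changed
show track 1
! Static routes bound to a track object, with their current state
show ip route track-table
```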

 
Site B

<The VLAN interfaces are for each end of the point to point links.> 

interface Vlan50
description *** SiteB to SiteA - 1Gbit Fibre Link ***
ip address 192.168.50.2 255.255.255.252

interface Vlan55
description *** SiteB to SiteC - 300Mbit Wireless Link ***
ip address 192.168.50.33 255.255.255.248
delay 1000

< The EIGRP settings, to advertise the locally connected routes to the rest of the EIGRP autonomous system. >

router eigrp 1
redistribute static
network 172.21.0.0
network 192.168.50.0 0.0.0.3
network 192.168.50.32 0.0.0.7

< Firstly we need to set up a route map; this says that any traffic matching access list 101 will be forwarded to the Site B firewall on 172.21.2.254. >
route-map IP_SLA_SiteB
  match ip address 101
  set ip next-hop 172.21.2.254

< Now we create an access list on the switch to match any traffic emerging from the IP SLA probe on 172.21.2.1 and going to the tracked IP. Because of the route map, only this traffic is forced out via the local firewall, irrespective of the current routing table. >
access-list 101 permit icmp host 172.21.2.1 host <tracked IP>

< The route-map also requires this setting. It applies the route-map to traffic that originates from the switch itself, which is essential because the IP SLA probes are generated by the switch itself. >
ip local policy route-map IP_SLA_SiteB

< Set up the IP SLA to monitor the IP address, with a source IP address of the switch IP. It's set up with a timeout of 2000 ms, a threshold (maximum tolerated RTT) of 300 ms, and a frequency of 15 seconds, i.e. how often it attempts to ping. >
ip sla 1
  icmp-echo <tracked IP> source-ip 172.21.2.1
  timeout 2000
  threshold 300
  frequency 15

< Start the IP SLA operation running; it will then start probing the IP address shown above and reporting its reachability. >
ip sla schedule 1 life forever start-time now

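< Assign track 1 to the IP SLA; this means that if IP SLA 1 reports no reachability, any routes associated with the track are removed. The two lines below are alternative forms of the same command, so only one is needed: use whichever your IOS version accepts. >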
track 1 ip sla 1 reachability
track 1 rtr 1 reachability  <== This appears to be the syntax for the 4500 series switch.

< Create the static default route for all outbound traffic and assign track 1 to it; if the track goes down, this route is removed from the routing table automatically. >

ip route 0.0.0.0 0.0.0.0 172.21.2.254 track 1
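
Site B uses the newer "ip sla" syntax, so the matching show commands drop the "monitor" keyword. A quick sketch for confirming the operation is configured and running as intended:

```
! Configured target, source, timeout, threshold and frequency for operation 1
show ip sla configuration 1
! Latest return code and round-trip time for operation 1
show ip sla statistics 1
```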

 
Site C (i.e. the switch halfway along the point-to-point wireless link)

interface Vlan55
 description *** HAT to WGC - 300MBit Radio Link ***
 ip address 192.168.50.36 255.255.255.248

interface Vlan57
 description *** HAT to SMF - 300MBit Radio Link ***
 ip address 192.168.50.44 255.255.255.248

interface Vlan202
 description *** DATA-Management ***
 ip address 172.21.2.1 255.255.255.0


router eigrp 1
 redistribute static
 no auto-summary
 network 172.21.0.0 0.0.7.255
 network 192.168.50.32 0.0.0.7
 network 192.168.50.40 0.0.0.7
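
As Site C sits in the middle of the wireless backup path, it is a handy place to confirm that EIGRP has formed over both radio hops. A sketch using standard show commands:

```
! Site C should list a neighbour on both Vlan55 (towards Site B) and Vlan57 (towards Site A)
show ip eigrp neighbors
! Routes learned via EIGRP, including any default routes redistributed by Site A and Site B
show ip route eigrp
```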
 
Debugging and Testing
 
# debug ip sla (monitor) error
IP SLAs ERROR debugging for all operations is on
# debug track
# debug ip routing
IP routing debugging is on
 
Then turn on “terminal monitor” and you’ll be able to see any of the messages created by these sub-systems.
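
Debug output keeps running until it is switched off, so it is worth cleaning up once testing is finished. A minimal sketch of a test session using standard IOS commands:

```
! Send debug and log messages to this telnet/SSH session
terminal monitor
! ... run the failover test and watch the track and routing messages ...
! Turn off all debugging, then stop sending messages to this session
undebug all
terminal no monitor
```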
 