VRRPDNS (16 Feb 2010, AntonIvanov)

Using DNS as a VRRP Availability Indicator

This article is abandonware. It has not been maintained properly since 2007. It is not supported and not updated.

Introduction

VRRP (RFC 2338) is designed to provide clients with a highly available router address, and it does this quite well. The problems lie not so much with VRRP itself as with its implementations and the way they fit into an overall network design. Even moderately complex designs have antispoofing problems in most of their failure modes. On top of that, nearly all implementations are incapable of feeding uplink availability information into VRRP. The most advanced ones can track the operational status of an interface (or several). No implementation can use routing protocol information to alter VRRP parameters (as of 2005; I hope this will change in the future). The situation with HSRP and CARP is not much different.

Sometime around the beginning of 2005 I was handed the task of providing high availability for some firewall kit for my employer. I had the following options:

  • Say that it is impossible until we buy 20K more connectivity and 20K more kit. This would have been the classic UK IT style choice.
  • Write an interface into a routing protocol daemon or a router proper to control it. Been there, done it, could do it again...
  • Invent something new...

After some further consideration I decided to pass on the routing protocol idea because it did not seem feasible with the upstream Vogons my employer used at the time. After all, you do not expect someone who refuses to use email to talk to a customer about support matters to talk a routing protocol to a sub-/24 customer. I decided to use feedback from a non-routing protocol instead.

This article revolves around two issues: why you would use DNS as an indicator of network availability for VRRP, and how to do it. All the usual disclaimers apply. You are welcome to use it provided that you have the patience to read through all of it, especially the Caveats section. As far as support is concerned - TANSTAAFL.

Known Problems

The antispoofing problem

When using VRRP the aim is that all traffic should pass only through the currently active router (the one serving the virtual IP address). Secondary routers should remain idle (VRRP does not aim to balance traffic, only to fail-over).

This is in fact the traffic pattern when the VRRP routers handle only the link to upstream for a single network and provide redundant service for it.

Figure 1

This is never the case when the routers have other links (or LAN segments) hanging off them and run a routing protocol to serve those segments (with the segments injected into or originated in the IGP). In that case, for many of the failure modes, the traffic ends up hitting routers that are expected to be idle, and it arrives on an "unexpected" interface.

Figure 2

This happens in Figure 2 if LAN A uses VRRP, router 2 is primary, and LAN B and the uplinks use OSPF. If the red link fails, the traffic from LAN A to the uplink ends up traversing LAN B and appearing on the "wrong" interface of router 1. This triggers the "antispoofing" checks present in most OSes and routers, and the traffic is dropped. There are several ways to solve this:

  • Turn off some or all antispoofing checks altogether. Obviously the wrong thing to do on a firewall.
  • On some platforms it is possible to use interface status tracking. Obviously this will not help in more complex cases, or if the failing link is further upstream and the network is not fully meshed.
  • It is possible to use a pair of VRRP routers per subnet and connect them to uplink system(s) that do the actual intersubnet routing. This obviously carries the cost of the extra routers. It also requires meshing all routers in a manner which provides symmetric routing for all single failure modes (usually impossible for many double failures). As the number of routers grows so does the level of annoyance, so this usually ends up with antispoofing turned off on all but the upstream interfaces.
  • Instead of controlling the VRRP priority on upstream failure (the current default behaviour in nearly all implementations), we can control the operational status of the interfaces on the VRRP served LANs. As a result, paths through routers that do not have a working uplink are completely pruned from the network topology. This makes it possible to retain antispoofing checks for a network of arbitrary complexity. The problem here is that no commercial implementation actually does this, so you have to implement it yourself on Linux, BSD or some other *nix. Alternatively it is possible to control a Cisco (or other "proper" router) externally from a "controller" platform. If you are interested - an implementation is presented further in the document.

The isolated router problem

This is the most common problem encountered when implementing VRRP. Most commercial implementations can track the status of an interface to decide whether they should announce an address or not. This makes it possible to tie VRRP to the presence of upstream connectivity in some cases. Quite obviously it does not cover the case of losing a link further upstream. This case is known as the "isolated router": the router thinks that it is connected and keeps announcing itself into VRRP, so traffic sent to it goes nowhere. There are two solutions to this problem:

  • Make VRRP dependent on the presence of an entry in a routing table. Something similar to the approach used by Cisco for default origination in OSPF or BGP should be more than enough (a route map which triggers the origination if it returns a non-zero number of matches). Unfortunately AFAIK nobody does this yet.

  • Control VRRP externally using an application that detects the presence or lack of connectivity. Once again, no commercial implementation does this. If you are interested - an implementation is presented further in the document.

Configuration issues on Linux

While all current distributions ship vrrpd, I have yet to see one that ships a reasonable configuration system for it. Usually all you get is the daemon and nothing else. No scripts to start it, no centralized configuration, nothing. This may be good enough for a simple case where you configure it with fixed priorities and start it from an init.d script. Unfortunately, if we intend to run 20 instances on 20 interfaces this will not do.

The easiest way to get around this problem is to use the up and down scripts run by ifup/ifdown to bring up and down VRRP for an interface as it is brought up and down. If you do not want to store actual AH or password parameters in /etc/network/interfaces you can write a global config wrapper that reads them from elsewhere. No real rocket science in this so I am not going to quote sample code for this.
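For readers who want a starting point anyway, such a wrapper might be sketched as follows. This is only an illustration under several assumptions: the config file path and its one-line-per-interface format (interface, VRID, priority, virtual IP) are invented for the example, and the -i, -v and -p flags follow the classic vrrpd usage line, so check them against the vrrpd build on your system. ifupdown passes the interface name to hook scripts in the IFACE environment variable.

```perl
#!/usr/bin/perl
# Sketch of an ifupdown "up" hook that starts vrrpd for one interface.
use strict;
use warnings;

# Parse lines of the (hypothetical) form "eth0 51 100 192.0.2.1".
# Returns the vrrpd argument list for $iface, or an empty list if the
# interface has no VRRP entry.
sub vrrpd_command {
    my ($iface, $conf_text) = @_;
    foreach my $line (split /\n/, $conf_text) {
        next if $line =~ /^\s*(?:#|$)/;          # skip comments and blanks
        my ($name, $vrid, $prio, $vip) = split ' ', $line;
        next unless defined $vip && $name eq $iface;
        # -i/-v/-p are assumed flags, verify them with "man vrrpd"
        return ('vrrpd', '-i', $name, '-v', $vrid, '-p', $prio, $vip);
    }
    return ();
}

# When run as a hook: read the config and start vrrpd for $IFACE.
# Whether vrrpd backgrounds itself depends on the build, so you may
# need to add your own backgrounding here.
if (!caller && defined $ENV{IFACE}) {
    my $conf_file = '/etc/vrrp/instances';       # hypothetical path
    if (open(my $fh, '<', $conf_file)) {
        my $text = do { local $/; <$fh> } // ''; # slurp the whole file
        close($fh);
        my @cmd = vrrpd_command($ENV{IFACE}, $text);
        system(@cmd) if @cmd;
    }
}
```

Keeping the parsing in a function of its own means the same config can drive a matching "down" hook, and authentication parameters can live in that file instead of /etc/network/interfaces.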

The real stuff - controlling VRRP based on upstream connectivity

Choosing the indicator method

Once we have VRRP going up and down together with the interfaces, we can start controlling it based on external connectivity. First of all we need to choose a good indicator method.

  • ICMP (echo request/echo reply) - easiest, but least reliable. Most likely to be filtered somewhere for operational reasons and the first to be dropped. As a result there is a considerable likelihood of false positives.
  • HTTP (and other high level protocols) - too complex and too dependent on external factors. The likelihood of false positives is quite high.
  • DNS - my personal favourite. It is stateless. It is easy enough to query and a good indication of whether a link is usable or not. If you cannot get DNS over a link you might as well forget about using it for anything more useful.

DNS as an indicator method

We have decided to use DNS for detecting external connectivity. In order to do that we need to choose a target zone and a target DNS server which will be used to test that connectivity is available. The target zone should be one that is least likely to disappear off the face of the planet due to insolvency, incompetence, contractual problems or all of these at the same time.

The following are two examples. They are by no means complete. Choose your own based on your experience and requirements.

  • Yahoo is a good target zone. Their DNS is well managed. I do not recall a case where they have been off the radar.
  • Microsoft is not. In the past Microsoft domains have disappeared off the face of the planet due to incompetence. More specifically, they did not operate a properly redundant infrastructure and got knocked out by a DoS in 2001. That was the first, but not the last case.

What not to choose:

  • Do not choose your own zones. There is nothing more unpleasant than cutting off your own access after a DNS mistake, so that you cannot fix the mistake either.

The next step is to choose a set of name servers. The target name servers should be highly available and if possible geographically distributed as per RFC 3258. They should also be very well managed and correctly designed. The following is a list based on servers I have used for this (or other) purposes. It is by no means complete and it is up to date as of December 2005. It is provided as an example only (note, I deliberately do not provide IP addresses for any of these).

You can use any of the following servers for detecting upstream connectivity.

  • Level3 (EU or US) resolvers are a good target nameserver set. Their DNS is geographically distributed with up to 20-30 servers in different locations answering for the same address. They are also correctly designed from an operational perspective, with strict separation between resolvers and authoritative nameservers.
  • The same goes for NTT/Verio, which also has a geographically distributed system. It also runs like clockwork.
  • Nildram UK - a UK ISP which has a reasonably reliable infrastructure (for its size, of course). It is not geographically distributed as far as I know. This is no longer valid, as Nildram as such no longer exists.

I would recommend being careful when using:

  • Easynet. It was neither correctly designed nor correctly operated the last time I used it. Resolvers contain authoritative zones and vice versa. On top of that I have had problems with them not propagating entries to all slave nameservers and not removing dead zones for customers that have moved on. It may work if you use it as a resolver for most "BIG" zones, but I would not use it myself. Its state is too unsanitary to be relied on.

I would recommend staying away from:

  • Claranet. The worst ISP DNS I have ever seen in my 10 years of doing DNS related work for a living. In fact it is the only DNS in the world that I have observed returning authoritative "host not found" answers to queries for zones it is not authoritative for. Definitely stay away from this one. It takes some truly stellar effort to achieve this result on an ISP DNS installation. Frankly, I am curious how they manage it (I have a ticket open with them on this one and it has gone unanswered for 3 months so far, despite being filed on behalf of a paying customer). Calling them Vogons would be both a compliment and an understatement.

Implementation

The implementation is derived from the source of the mon DNS monitor module. I will quote only the interesting bits from it. The initialization, configuration, etc. are deliberately omitted. I have done this for the following reasons:

  • As I said before there is no VRRP configuration standard. I am not about to force one so you might as well adapt this to your case.
  • I am leaving the parts of the code which deal with topology alteration to be as generic as possible. As a result they can be easily rewritten to control a "proper" router.
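The listing that follows calls init_vars with references to its configuration variables but leaves the body to the reader. A minimal sketch of one way to fill it in, using the core Getopt::Long module, is shown here. The option names are invented for this example and the optional seventh argument exists only so that the function can be exercised without touching @ARGV; neither comes from the original code.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Fill the caller's variables from command-line style arguments.
# All option names below are hypothetical.
sub init_vars {
    my ($servers, $zones, $masters, $delay, $interfaces, $debug, $argv) = @_;
    $argv ||= \@ARGV;
    GetOptionsFromArray($argv,
        'server=s'    => $servers,     # repeatable: --server a --server b
        'zone=s'      => $zones,       # repeatable
        'master=s'    => $masters,     # repeatable
        'delay=i'     => $delay,       # seconds between check rounds
        'interface=s' => $interfaces,  # repeatable
        'debug'       => $debug,
    ) or die "usage: $0 --server ADDR --zone NAME --interface IF "
           . "[--delay SEC] [--debug]\n";
    die "need at least one --server, --zone and --interface\n"
        unless @$servers && @$zones && @$interfaces;
    return;
}
```

Because the array options are passed as array references, each of them can be repeated on the command line, which matches the nested server and zone loops in the listing.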

#!/usr/bin/perl
#
# Bring VRRP served interfaces up or down depending on whether DNS
# queries through the uplink succeed.
#
use strict;
use warnings;
use Net::DNS;

my (@Servers)    = ();  # servers to use
my (@Zones)      = ();  # zones to check
my (@Masters)    = ();  # masters to check
my ($Delay)      = 30;  # check interval in seconds
my (@Interfaces) = ();  # which interfaces to bring up/down
my $debug;              # debug flag
my $pidfile;            # set this if you want a pid file written

# init the params - I use getopts, what you use depends on how you configure VRRP
init_vars(\@Servers, \@Zones, \@Masters, \$Delay, \@Interfaces, \$debug);

# if you do not have this handy function defined somewhere, time to do so.
# BSD folks are right to have it. what it does is obvious from their manpages
# it is expected to return the child pid in the parent and 0 in the child
if (my $pid = daemon()) {
   if ($pidfile) {
      open(my $pid_fh, '>', $pidfile) or die "cannot write $pidfile: $!";
      print $pid_fh "$pid\n";
      close($pid_fh);
   }
   exit(0);
}

while (42) {
   # one round of checks: every zone against every server
   my $err_cnt = 0;
   foreach my $test_server (@Servers) {
      foreach my $test_zone (@Zones) {
         if ($debug) {
            print STDERR "testing $test_zone \@ $test_server\n";
         }
         $err_cnt += dns_verify($test_zone, $test_server);
      }
   }
   foreach my $interface (@Interfaces) {
      my $up = we_are_up($interface);
      if (!$up && ($err_cnt == 0)) {
         # all checks passed and the interface is down: bring it up
         print STDERR "will bring up $interface\n" if $debug;
         system("/sbin/ifup", $interface);
      } elsif ($up && ($err_cnt > 0) && we_can_go_down()) {
         # checks failed and a secondary can take over: bring it down
         print STDERR "will bring down $interface\n" if $debug;
         system("/sbin/ifdown", $interface);
      }
   }
   sleep($Delay);
}

sub dns_verify {
   # Most of the code is lifted from mon so this will have to be GPL I guess...
   # Returns 0 if $Test_server answers the SOA query for $Zone, 1 otherwise.
   my ($Zone, $Test_server) = @_;
   return 1 unless defined($Zone) && defined($Test_server);
   my $res = Net::DNS::Resolver->new;
   $res->defnames(0);           # don't append default zone
   $res->recurse(1);            # ask for recursion, we are querying resolvers
   $res->retry(1);              # try each query once, no retries
   $res->nameservers($Test_server);
   my $soa_req = $res->query($Zone, "SOA");
   if (!defined($soa_req) || ($soa_req->header->ancount <= 0)) {
      if ($debug) {
         printf STDERR "SOA query for %s from %s failed: %s\n",
            $Zone, $Test_server, $res->errorstring;
      }
      return 1;
   }
   return 0;
}

sub we_are_up {
   # debian specific, any r00th4t users are on their own here
   my $iface = shift;
   my $flag  = 0;
   open(my $ifstate, '<', '/etc/network/run/ifstate') or return 0;
   while (<$ifstate>) {
      if (/^\Q$iface\E=\Q$iface\E$/) {
         $flag = 42;
         last;
      }
   }
   close($ifstate);
   return $flag;
}

sub we_can_go_down {
   die 'this is the most critical part of the logic - determine if you have an active
secondary in the network which still has an uplink path.
I count IP addresses on an interface for a subnet where I am supposed to be secondary.
If there is one - secondary is alive, if there are two secondary is dead or in fallback state.
That approach is not valid for everyone so you are on your own here. It is also not
fully clean as far as race conditions are concerned';
}

sub init_vars {
   die 'If you cannot create a config and read it you should not be here in first place';
}
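
The address counting check described in the we_can_go_down message can be sketched roughly as below. This is an illustration, not the original implementation: it assumes that the VRRP daemon adds the virtual IP as an extra address on the interface while it is master, that the addresses are visible in the output of "ip -o -4 addr show", and the function names and parameters are made up for the example. It inherits the race conditions mentioned in the listing.

```perl
use strict;
use warnings;

# Convert a dotted quad such as "192.0.2.1" to an unsigned 32-bit integer.
sub ip_to_int {
    my ($ip) = @_;
    return unpack('N', pack('C4', split /\./, $ip));
}

# Count the IPv4 addresses in "ip -o -4 addr show" output that fall
# inside the subnet $net/$len.
sub count_addrs_in_subnet {
    my ($ip_output, $net, $len) = @_;
    my $mask  = $len ? (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF : 0;
    my $want  = ip_to_int($net) & $mask;
    my $count = 0;
    while ($ip_output =~ /\binet (\d+\.\d+\.\d+\.\d+)\//g) {
        $count++ if (ip_to_int($1) & $mask) == $want;
    }
    return $count;
}

# On a subnet where we are supposed to be secondary: exactly one address
# means we hold only our own, so the peer still owns the virtual IP.
sub secondary_is_alive {
    my ($iface, $net, $len) = @_;      # hypothetical parameters
    open(my $ip, '-|', 'ip', '-o', '-4', 'addr', 'show', 'dev', $iface)
        or return 0;                   # cannot tell, so do not go down
    my $out = do { local $/; <$ip> } // '';
    close($ip);
    return count_addrs_in_subnet($out, $net, $len) == 1;
}
```

A body for we_can_go_down could then call secondary_is_alive with the interface and subnet for which this router is the VRRP secondary. Keeping the parsing separate from the call to ip makes the counting logic easy to check against captured output.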

Caveats

  • Realistically, if you try this on anything less than 10 Mbit without QoS you are bound to DoS yourself. If you do not have QoS, "do not try that at home".
  • Net::DNS leaks memory like there is no tomorrow (note - I have not checked this since 2007, it may be OK nowadays). You need to whack the control subsystem periodically to avoid it eating all of your memory.
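
One way to do the periodic whacking is to have the monitor replace itself with a fresh copy every so many check rounds. The following is a minimal sketch of that idea; the threshold and the function names are arbitrary choices for the example, not part of the original code.

```perl
use strict;
use warnings;

my $MAX_CYCLES = 1000;    # hypothetical threshold, tune to taste

# True once the loop has run often enough that a restart is due.
sub restart_due {
    my ($cycles_done, $max_cycles) = @_;
    return $cycles_done >= $max_cycles;
}

# Replace the running process with a fresh copy of the script, which
# hands all leaked memory back to the system. Returns only if exec fails.
sub restart_self {
    my ($script, @args) = @_;
    exec($^X, $script, @args) or warn "re-exec failed: $!";
    return;
}

# Inside the monitoring loop, with @saved_argv copied from @ARGV at
# startup before any option parsing consumes it:
#    restart_self($0, @saved_argv) if restart_due(++$cycles, $MAX_CYCLES);
```

If the script daemonizes on startup, the fresh copy will fork again and get a new pid, so anything that relies on the pid file needs to cope with that.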

Topic attachments

  • VRRP-0.png (13.9 K, 16 Feb 2010): Figure 1
  • VRRP-1.png (18.0 K, 16 Feb 2010): Figure 2

Topic revision: r1 - 16 Feb 2010 - 20:46:59 - AntonIvanov?

