Managing Multiple Routers at a Single Site

Improving the availability of network services at a site requires more than simply installing a second router. Author Vincent Jones discusses three major configuration concerns that are unique to a LAN with multiple routers providing WAN connectivity.

Sites that require high network availability usually require more than just redundant communications links. It is common practice to place a second router at a site to eliminate the downtime associated with diagnosing a failed router and scheduling a replacement at a remote site. While routers do not fail very often, if the only router at a site fails, all network communications are lost and there is no way to determine whether the problem is a sitewide disaster, a major communications cable cut, or just a failed router. Then there is the time required to dispatch service to the site—this can be particularly problematic in a large network that spans multiple time zones or, even worse, continents. Even if the site is one on which service personnel and spare parts are readily available, having multiple routers online makes normal maintenance much easier. It means a router can be brought down to implement upgrades or perform other maintenance without interrupting production network traffic.

Generally, the primary challenge in implementing a multiple router site is not in configuring a second or third router on the LAN to support some additional WAN links. That part of the job could be done by any network administrator with a basic knowledge of IP addressing. Rather, the challenge is in designing the implementation so that the availability benefits of having two routers go beyond just simplifying routine maintenance and surviving random router failures. For the addition of a second router to really contribute to network availability, it is essential that all support systems either be equally reliable or be adequately redundant to ensure the availability of at least one router at all times.

Indeed, as we shall see in the rest of this chapter, the configuration of two routers so that one can handle the load of the other is frequently trivial. This is not surprising, as the whole philosophy of network design is centered around the concept of alternate routes and routers exchanging hello packets with one another to determine which routes are available at any given point in time.

The challenges arise from the inadequacy of other components—such as simple-minded IP hosts that only understand a single default gateway for access to other subnetworks—and from LAN component failures that can split the network, creating illegal topologies. We will also see how the specific protocols and applications being supported change the impact of various failure modes. As we have seen in our discussions on multiple communications paths, there is more to providing useful redundancy than simply installing a second router. Careful planning is required to minimize the number of common points of failure. In the effort to eliminate some single points of failure, it is quite common to introduce other single points of failure that are not as obvious.

For example, to minimize the danger of service disruption due to physical harm (such as a fire, burst pipe, or loss of air conditioning), a popular approach is to use two routers, each in a different part of the building. This is a good idea—particularly when there are two independent communications service entrances—since it minimizes the probability that a single incident will disconnect both routers from the WAN. But it also increases the likelihood that a single incident could physically split the site LAN into two disconnected networks sharing a common subnetwork ID. Such a broken topology cannot be supported by IP, IPX, and many other popular protocol suites, resulting in disrupted communications even though working links remain.

In this chapter, we will look at three major configuration concerns that are unique to a LAN with multiple routers providing WAN connectivity. These are

  • Minimizing the disruption of normal user system connectivity should the router serving as their default gateway fail

  • Having one router provide dial backup for a link serviced by another router

  • Providing continued access to the WAN for as many users as possible when a LAN served by multiple routers is segmented into disconnected LANs

Protecting LAN Users from Router Loss

Historically, routers have been very expensive network components. Most protocol suites were developed at a time when having more than one router accessible to a host system was rare. As a result, while most popular protocol suites have strong protocols for dynamic routing at the router-to-router level (intermediate-system-to-intermediate-system, or IS-IS, in OSI parlance), they still tend to have weak end-system-to-intermediate system (ES-IS) routing protocols.

TCP/IP is, perhaps, the worst offender in this regard. While it has the strongest and widest variety of IS-IS protocols available for any protocol suite (including OSPF, EIGRP, Integrated IS-IS, and BGP), most of its end-system implementations still depend upon the very weak default gateway approach to ES-IS routing. We covered TCP/IP ES-IS weaknesses from the viewpoint of router-to-end system routing, and ways to get around them, in Chapter 3's discussion on multihomed hosts. But we basically ignored the other direction, which is how an end system with a single LAN connection can find the appropriate router on that LAN to deliver IP packets to destinations that are not on the same physical LAN.

The original definition of TCP/IP pretty much assumed that routers never failed. Hosts were configured with a default gateway (remember that TCP/IP referred to routers as "gateways" until the late 1980s and still uses the old terminology in many places). When powered up, a TCP/IP end system knows its IP address, its subnetwork mask, and the IP address of a single local router. When sending a packet, the end system applies its subnetwork mask to both its own IP address and that of the destination. If the two results are identical, the destination is on the same subnetwork as the sender, and the packet is delivered directly to the destination end system using the appropriate protocols for that subnetwork. If, however, the results are not identical, then the destination is on a different subnetwork that cannot be reached directly.
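The forwarding decision just described can be sketched in a few lines of Python. This is only an illustration of the mask comparison, not any particular protocol stack's code; the function name and the host address 192.168.110.25 are assumptions made for the example.

```python
import ipaddress


def next_hop(src_ip: str, mask: str, dst_ip: str, default_gateway: str) -> str:
    """Return the LAN address an end system hands a packet to.

    Hypothetical helper: applies the subnetwork mask to both the sender's
    address and the destination's address and compares the results.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    m = int(ipaddress.IPv4Address(mask))
    if src & m == dst & m:
        # Same subnetwork: deliver directly to the destination.
        return dst_ip
    # Different subnetwork: the only router the host knows is its default gateway.
    return default_gateway


# Illustrative addresses; the host address is an assumption.
print(next_hop("192.168.110.25", "255.255.255.0", "192.168.110.77", "192.168.110.201"))
print(next_hop("192.168.110.25", "255.255.255.0", "10.0.0.1", "192.168.110.201"))
```

The first call returns the destination itself (direct delivery); the second returns the default gateway, which is the only off-subnetwork path the end system knows.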

Delivery of a packet to another subnetwork requires that the packet be sent to a router on the same subnetwork as the sender for forwarding to the ultimate destination. Since the only router known to the end system is the default gateway, this would be the router used. Routers, on the other hand, learn through their router-to-router routing protocols of all other routers that they can reach and that provide the best path to each possible destination. If another router on the same subnetwork as the sender has a better route to the desired destination, the router initially receiving the packet would forward the original packet along the best path available. It would also send an Internet Control Message Protocol (ICMP) re-direct packet back to the sending end system, informing it of the better router available for that destination. Depending upon the sophistication of the protocol implementation on the sending end system, that new router information might be remembered for future use with packets destined to the same destination, or it might be ignored. Either way, the off-network packets would be delivered correctly.

Effectively, routing from end system to local router is via static routes. Because IP is a connectionless protocol, there is no intrinsic mechanism for an end system to determine whether a router that used to be good has failed. ICMP re-directs assume that the initial poor-choice router is online to provide redirection to a better router. Once redirection has happened, there is no mechanism for falling back to the original router should the better router fail. Indeed, there are some protocol stack implementations in use wherein the only mechanism for removing an ICMP re-direct installed route is to reboot the system. Even implementations that support configuring multiple default gateways may have the same problem, because on some the first gateway that responds to an address resolution request is treated as the one and only default gateway.

IP implementations are supposed to protect themselves from routers that turn into black holes. The problem is that there is no standard mechanism at the IP level to determine whether or not an IP packet sent to an address on the same subnetwork arrived correctly. This is not a defect in the IP protocol. Rather, it is a fundamental property of the class of best-effort datagram protocols in which IP is included. The decision was made as part of the original design of the TCP/IP architecture that any recovery mechanism would be part of a protocol running above IP and not included in the IP protocol itself.

Unfortunately, although the need for black hole protection was recognized and indeed mandated by RFC 1122 back in 1989, proper detection is non-trivial and cannot be depended upon. Testing of specific implementations is essential if predictable response to router failure is required. This is true unless a protocol—such as Cisco's Hot Standby Router Protocol (HSRP, RFC 2281) or the recently standardized Virtual Router Redundancy Protocol (VRRP, RFC 2338), popular with Nortel and other vendors—is used on the routers to shield users from host implementation inconsistencies.

So let's look at some of the protocols designed to run above IP to provide router independence for end systems.

Passive RIP

The original and still most popular technique for providing router independence for TCP/IP end systems is passive RIP. Historically, both IP and IPX networks used variations on the Routing Information Protocol (RIP) when dynamic routing was desired for intradomain routers. RIP depended upon LAN broadcasts for communications between routers on a LAN and had no security mechanisms. So it was easy for an end system on a LAN to eavesdrop on the router-to-router RIP exchanges and extract the appropriate routing entries for all reachable, advertised networks.

This approach has many advantages over the default gateway approach. First of all, it eliminates the need to include a default gateway definition on every end system. It also avoids the extra traffic and overhead of handling ICMP re-directs, as packets should always be sent to the optimal router to begin with. Most important, it allows end systems to dynamically adjust to local routers' coming online and going offline. A new router coming online will be installed in the end systems' routing tables when its first reachability broadcast is received. Similarly, failed routers will be removed from the end systems' routing tables when update packets stop arriving and the entries time out.

As with all networking capabilities, there are, of course, trade-offs to be made when using passive RIP. In general, these are the same trade-offs we discussed in Chapter 3 when we used RIP to support hosts with interfaces on more than one subnetwork. RIP is only useful on LANs that support broadcasting. By default, each router on the LAN supporting RIP broadcasts update packets every 30 seconds, requiring one update packet for every 25 destination subnetworks advertised.
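To get a feel for the broadcast load, the figures above (one packet per 25 advertised subnetworks, an update every 30 seconds per router) can be turned into a rough estimate. The function names are made up for this sketch, and the route and router counts in the usage lines are arbitrary examples.

```python
import math

ROUTES_PER_PACKET = 25    # destinations carried by one RIP update packet
UPDATE_INTERVAL_S = 30    # default interval between updates


def rip_packets_per_update(routes: int) -> int:
    """Packets one router must broadcast to advertise the given number of routes."""
    return math.ceil(routes / ROUTES_PER_PACKET)


def rip_packets_per_hour(routes: int, routers: int) -> int:
    """Broadcast packets per hour seen by every system on the LAN."""
    updates_per_hour = 3600 // UPDATE_INTERVAL_S
    return rip_packets_per_update(routes) * updates_per_hour * routers


print(rip_packets_per_update(300))       # 12 packets every 30 seconds
print(rip_packets_per_hour(300, 2))      # 2880 broadcasts per hour with two routers
print(rip_packets_per_hour(1, 2))        # 240 per hour if only a default route is advertised
```

The last line shows why limiting advertisements to just the default route keeps the overhead small.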

Unless the end system RIP implementation supports setting RIP detection intervals to non-default values, passive RIP responds slowly to router failures. Using default timers, passive RIP requires at least 60 to 90 seconds to detect a dead router and remove it from the end system's routing table. Some implementations, designed to allow the end system to support routing between interfaces, will take even longer to respond due to route holddown. Even in the best case, passive RIP is slow enough to recover that router failure will be quite noticeable to users.
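The behavior of a passive listener can be sketched as a small table keyed by destination, with entries aged out when updates stop arriving. This is a simplified model rather than the logic of routed or any other real implementation: the class and method names are invented here, the second router address 192.168.110.202 is hypothetical, and the timeout is a parameter (the usage lines use 180 seconds, the route timeout given in RFC 1058).

```python
from typing import Dict, Optional, Tuple

RIP_INFINITY = 16  # metric meaning "unreachable" in RIP


class PassiveRipTable:
    """Toy routing table for an end system that only listens to RIP."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        # destination -> (advertising router, metric, time last heard)
        self.routes: Dict[str, Tuple[str, int, float]] = {}

    def expire(self, now: float) -> None:
        """Drop entries whose router has been silent longer than the timeout."""
        stale = [d for d, (_, _, heard) in self.routes.items()
                 if now - heard > self.timeout_s]
        for dest in stale:
            del self.routes[dest]

    def hear_update(self, router: str, dest: str, metric: int, now: float) -> None:
        """Process one overheard route advertisement."""
        self.expire(now)
        current = self.routes.get(dest)
        if metric >= RIP_INFINITY:
            # The router we were using withdrew the route.
            if current is not None and current[0] == router:
                del self.routes[dest]
            return
        # Install a new or better route, or refresh the one already in use.
        if current is None or metric < current[1] or current[0] == router:
            self.routes[dest] = (router, metric, now)

    def gateway_for(self, dest: str, now: float) -> Optional[str]:
        self.expire(now)
        entry = self.routes.get(dest)
        return entry[0] if entry else None


table = PassiveRipTable(timeout_s=180)
# The first router sends one last update at time 0 and then fails silently.
table.hear_update("192.168.110.201", "0.0.0.0", 1, now=0)
for now in range(0, 151, 30):
    table.hear_update("192.168.110.202", "0.0.0.0", 1, now=now)
print(table.gateway_for("0.0.0.0", now=150))   # 192.168.110.201 (dead, but not yet timed out)
for now in range(180, 241, 30):
    table.hear_update("192.168.110.202", "0.0.0.0", 1, now=now)
print(table.gateway_for("0.0.0.0", now=240))   # 192.168.110.202
```

In this model the end system keeps sending to the dead router until its entry times out, which is exactly the slow recovery users will notice.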

There is also the question of inefficiency in trying to map real route metrics into RIP's 15 available distance values. Attempting to do so almost invariably results in an unmaintainable morass of manually configured route metric conversions. Consequently, it is highly likely that ICMP re-directs will be required even in a well-connected network with many nearly equal routes. Combined with a lack of support for variable-length subnetwork masking, this makes passive RIP usually only suitable in simple networks—or when it is acceptable to limit advertisements to just the default route.

One final limitation is that not all end system IP protocol stacks support passive RIP, so it may not even be an available choice. But some end systems extend the concept beyond just listening in on RIP version 1 (RIPv1) broadcasts. They can detect routers and routes by eavesdropping on other routing protocols that use broadcasts or multicasts for router-to-router exchanges, such as RIPv2, OSPF, and EIGRP.

On the other hand, passive RIP does work, and, despite its limitations, it is still superior to alternatives such as using only a static definition of default gateway.

Configuration Example: Passive RIP

Because of the many limitations of RIP, even with the release of the RIPv2 protocol, large networks today generally depend upon more modern routing protocols and no longer routinely run any version of RIP. However, virtually all router implementations still support RIP and allow you to redistribute routes learned from one routing protocol into another routing protocol. This makes it easy to run a routing protocol chosen for fast and efficient routing between the routers and still provide RIP updates for end systems to eavesdrop on. Listing 6-1 shows a Cisco router configuration excerpt that simply broadcasts a default route using RIP version 1 (Cisco introduced their support for RIPv2 in IOS 11.1) if the routing table on the router has a path to 10.0.0.1. The mechanism used to determine the reachability of 10.0.0.1 is immaterial to the functioning of passive RIP.

Listing 6-1.  Minimal passive RIP example

version 11.0
!
interface Ethernet0
ip address 192.168.110.201 255.255.255.0
!
router rip
network 192.168.110.0
redistribute static metric 1
distribute-list 10 in Ethernet0
distribute-list 11 out Ethernet0
!
ip route 0.0.0.0 0.0.0.0 10.0.0.1
access-list 10 deny  any
access-list 11 permit 0.0.0.0
!
end

Note the filter applied to RIP updates received from the Ethernet by the line distribute-list 10 in Ethernet0. The filter prevents the router from learning any routes via RIP. This prevents a malicious or misconfigured end system from creating routing problems for the routers. However, there is no way the router can prevent a malicious end system from broadcasting invalid RIP updates and confusing other end systems on the LAN.

There is no intent in this example to use passive RIP to optimize router selection. The line distribute-list 11 out Ethernet0 filters RIP updates being sent out onto the Ethernet so that they only include the default route. This keeps the update packet size minimal. It also avoids confusing passive RIP listeners with updates based on routes to networks with subnetwork mask usage incompatible with RIP version 1. If this router has a path to the network default gateway 10.0.0.1, it will advertise the default route using RIP version 1. If this router does not have a path to 10.0.0.1 in its routing tables, it will stop sending RIP updates altogether, because with no default route defined it has no routes to advertise.

If we wanted to advertise a default route unconditionally, we could add the floating static route ip route 0.0.0.0 0.0.0.0 Null0 250 to the configuration. That way, if we did not have a useful default route learned through the production routing protocol, we would still have one to advertise.

Listing 6-2 shows how the static default route used in Listing 6-1 can be replaced with a default route learned from the routing protocol actually used for router-to-router path determination. We still make no attempt to optimize router selection by distributing more detailed routes; there is poor resolution of the route metric and a lack of support for variable-length subnetwork masks in RIPv1. While RIPv2 does handle variable-length subnetwork masks, it still retains the crude zero-to-15 route metric of RIPv1. Plus, many passive RIP implementations are still based on the original Berkeley routed implementation of RIPv1. If router selection optimization is an objective, limited success may be possible using route-maps, as was done in the example in Chapter 3. But keep in mind that neither version of RIP is a good mechanism for the task. Accurate route selections are going to be very difficult, if not impossible, to achieve if the topology is at all complex.

Listing 6-2.  Passive RIP support combined with active RIP and OSPF

version 11.0
!
interface Ethernet0
ip address 192.168.110.201 255.255.255.0
ip ospf authentication-key ospfSecret1
!
interface Serial0
ip address 172.19.254.5 255.255.255.252
ip ospf authentication-key ospfSecret2
!
router ospf 200
network 192.168.110.0 0.0.0.255 area 192.42.110.0
network 172.19.0.0 0.0.255.255 area 0
area 0 authentication
area 192.42.110.0 authentication
area 192.42.110.0 range 192.42.110.0 255.255.255.0
area 0 range 172.19.254.4 255.255.255.252
redistribute rip metric 1 subnets
!
router rip
network 192.168.110.0
redistribute ospf 200 metric 1
distribute-list 10 in ethernet0
distribute-list 11 out ethernet0
access-list 10 permit 192.168.0.198
access-list 10 permit 192.168.0.199
access-list 11 permit 0.0.0.0
!
end

Note that because there is no default gateway defined, it is essential that a default route be learned through OSPF if it is to be redistributed to passive RIP users. The assumption here is that the OSPF routing domain is a much better place to control the default route for the network as a whole, with the alternative being the configuration nightmare of a static default route definition on every router on every LAN.

In this second example, we also enable RIP to accept routing updates from the two hosts 192.168.0.198 and 192.168.0.199. These are dual homed hosts with multiple LAN connections for higher reliability (the second connection is not shown in the example). The example shows how we can combine the use of passive RIP and active RIP on the same LAN. But be extremely careful, as the lack of subnetwork mask specification in RIPv1 update packets can cause the incorrect interpretation of advertisements received. Plan on verifying correct redistribution of routes learned by RIPv1 every time any changes are made on the end system or the router, such as IOS upgrades. It may be necessary to define two loopback interfaces in the same major network used to identify the dual homed systems to force RIPv1 to assume the correct subnetwork mask. Check the routing table on another router that is not on the same LAN to verify that the routes are being learned and redistributed correctly into the primary routing protocol.

If possible, RIPv2 or another routing protocol should be used instead of RIPv1. Protocols like RIPv2 and OSPF not only provide much more efficient use of address space; they also allow cleaner, more maintainable designs. They achieve this through their use of multicasting rather than broadcasting of updates or hello packets, their support for security features that reduce the risk of accidental or intentional misroutes, and—in the case of OSPF and BGP—their ability to distribute accurate routing metrics and detect one-way link failures. However, as mentioned earlier, we are much less likely to find support for protocols other than RIPv1 available on all the platforms that need to be supported.

Proxy ARP

Those working with end systems that do not support passive RIP can get some of these benefits simply by using proxy Address Resolution Protocol (ARP). The proxy ARP protocol was designed to assist network management by reducing the need to reconfigure end systems every time the subnetwork mask changes or when moving from one subnetwork to another within a larger subnetwork address space. However, we can take advantage of the way it makes a large network divided into many subnetworks look like a single large network to reduce the vulnerability of the static default gateway definition on each end system.

Proxy ARP works by allowing a tighter subnetwork mask on the router than on the end system. For example, consider a network built around allocations of the 10.0.0.0/8 block for private networks. A LAN might be assigned the subnetwork 10.0.10.0/24. The end systems on that LAN are configured with addresses that are correctly contained in the subnetwork assigned to the LAN but that use a subnetwork mask of 255.0.0.0. Only the routers on the LAN are configured with the correct subnetwork mask for the LAN.

That way, when an end system on the LAN attempts to communicate with another system that is outside the local subnetwork of 10.0.10.0/24, but still within 10.0.0.0/8, the end system determines that the destination is local and simply issues an ARP request to get the destination's MAC address. With proxy ARP enabled on the router, when the router receives the ARP, it identifies it as one for a system that is not on the local LAN. It responds as if the router were the remote system addressed, with an ARP response associating the router's MAC address with the remote destination's IP address. The local end system believes it is directly connected to the destination, while in reality its packets are being forwarded from the local subnetwork toward the destination subnetwork by their local router.
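The two halves of this exchange can be modeled in Python. This is a sketch of the decision logic only: the function names are invented, the host address 10.0.10.5 and destination 10.7.3.9 are illustrative, and the rule that the router answers only when it has a route to the target is an assumption of the sketch, not a statement about any specific router software.

```python
import ipaddress
from typing import Optional


def host_arps_directly(host_ip: str, host_mask: str, destination: str) -> bool:
    """True if the end system believes the destination is on its own subnetwork."""
    believed = ipaddress.ip_network(f"{host_ip}/{host_mask}", strict=False)
    return ipaddress.ip_address(destination) in believed


def router_proxy_arp_reply(lan_subnet: str, target_ip: str,
                           router_mac: str, has_route: bool) -> Optional[str]:
    """MAC address the router answers with, or None if it stays silent."""
    lan = ipaddress.ip_network(lan_subnet)
    if ipaddress.ip_address(target_ip) in lan:
        return None          # target really is local; its owner answers the ARP
    if not has_route:
        return None          # assumption: no route, no proxy reply
    return router_mac        # answer on behalf of the remote system


# End system with the deliberately loose 255.0.0.0 mask on LAN 10.0.10.0/24.
print(host_arps_directly("10.0.10.5", "255.0.0.0", "10.7.3.9"))        # True: it ARPs
print(router_proxy_arp_reply("10.0.10.0/24", "10.7.3.9",
                             "00c0ab111111", has_route=True))          # 00c0ab111111
```

With the correct 255.255.255.0 mask on the end system, the first function would return False and the host would fall back to its default gateway instead of issuing an ARP for the remote address.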

The benefit of proxy ARP, from our viewpoint, comes when we define the default gateway to be an address outside the real local subnetwork but inside the overall address block (so it looks local to the end system). The default gateway known to the end system is like any other non-local IP address that the end system thinks is local. So when the end system ARPs on the LAN to reach the default gateway, the local router replies. Under normal conditions, operation is totally transparent—the effect is identical to the local router's having two IP addresses, its real one and the default gateway's.

When proxy ARP is configured on all local routers and the router that initially responded fails, the end system automatically gets another router the next time the IP address is ARPed. The two main drawbacks of depending on proxy ARP are the speed of recovery and the need to use only a subset of the assigned address space on the LAN (to leave room for the default gateway that is being proxied).

Recovery speed can be slow, as there is no commonly implemented standard for how long to keep ARP results locally cached. This results in wide variations among systems. Most systems routinely refresh their ARP cache whenever there is a significant gap in traffic to the IP address—but the definition of significant can range from seconds to hours. On many end systems, it is possible to manually flush the ARP cache to force discovery of the remaining router. But some systems may recover faster simply by being shut down and rebooted, which could make the 3-minute wait for passive RIP much more palatable.

The addressing requirements tend to be an either/or decision. Either the network address can be subdivided or the entire network address space must be assigned to a single LAN. In the latter case, proxy ARP is not useful. Note that the default gateway does not have to be in unused address space. It only needs to be outside the address space assigned to the LAN. For example, if the address space is split between two LANs, end systems on each LAN could use a default gateway definition from the address space of the other LAN. The address does not even need to be a router—as soon as a packet being sent via the default gateway reaches the router local to the end system, it is routed based on the real destination, not on what the end system thinks should be the next hop.

Since proxy ARP is enabled by default on Cisco and most other routers, there is no need for a configuration example. However, be aware that using secondary addresses may be incompatible with proxy ARP. Cisco routers will respond with a proxy ARP only to ARP requests that come from the primary address space defined on the LAN interface. ARP requests from addresses in the secondary address space will not generate proxy ARP responses. Since this constraint is not documented, it should not be depended on in a design.

Even when proxy ARP is not required for getting around default gateway assignment limitations, it is still worth using. In today's environment of tight IP address allocations, proxy ARP can be a great time saver. When a LAN segment's subnetwork address mask has to be modified to squeeze out a few more usable addresses, only the routers on the affected LAN segments need to be reconfigured—not every single device.

IRDP, BootP, and DHCP

ICMP Router Discovery Protocol (IRDP, RFC 1256), Bootstrap Protocol (BootP), and Dynamic Host Configuration Protocol (DHCP) are all designed to eliminate the need to hard configure all systems on an IP LAN. IRDP was specifically designed to allow systems on a LAN to dynamically detect all available routers and make a real-time choice of the appropriate default gateway to use. It bases its choice on the routers available and their advertised priority. Unfortunately, IRDP does not provide an answer to the critical function of determining that a router currently in use to reach an IP address is no longer available. While RFC 1256 discusses this problem and suggests some alternative approaches, it leaves the solution up to the host system, with no support from the routers. Indeed, the basic conclusion reached in the alternatives discussion was that every alternative considered by the IRDP team had either significant defects or performance problems when it was scaled to large LANs. As a result, IRDP is neither widely nor consistently implemented, making its use problematic. As a solution to the multiple router problem, it automates the listing of default gateways to try. But it ignores the questions of which available router should be used for any specific IP packet and how to recover when an available router becomes unavailable without warning.

BootP and DHCP use MAC-level broadcasts on the LAN to contact a BootP or DHCP server on the LAN and get a valid IP address, subnetwork mask, and default gateway. This takes place during the initial system boot-load process. DHCP provides additional flexibility by incorporating the concept that the IP address information provided is leased rather than sold. That is, DHCP attaches a time-out to the information provided, allowing the same IP address to be shared among multiple systems. BootP, on the other hand, assumes that the address information is dedicated to that MAC address. It has no facilities for time sharing of address space.

What does this have to do with providing support for multiple routers on a LAN? While it is rarely the reason for using either, we can consider BootP or DHCP as an ES-IS of last resort. For example, the default configuration could assign half the users to one router and the other half to the other router, providing simple load sharing. A background process running on a local computer could periodically test the routers to verify that they were still alive. Should one of the routers fail to respond, the process could alert the help desk that there is a router problem at the location. It could also modify the DHCP or BootP database so that only IP addresses for working routers are given out as default gateway values. While this would not immediately help the systems already running with the broken default gateway, a simple reboot would automatically reconfigure the isolated system to use the working default gateway, and work could resume.
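The core of such a background process is small. The sketch below covers only the assignment step: the function name, the host labels, and the is_alive probe are all hypothetical, and how the result is written into a real DHCP or BootP database depends entirely on the server in use.

```python
from typing import Callable, Dict, List


def assign_gateways(hosts: List[str], routers: List[str],
                    is_alive: Callable[[str], bool]) -> Dict[str, str]:
    """Spread hosts across the routers that currently answer the probe.

    is_alive is supplied by the caller (for example, a ping wrapper); it is
    injected here so the sketch stays independent of any probing method.
    """
    working = [r for r in routers if is_alive(r)]
    if not working:
        raise RuntimeError("no working router to hand out as default gateway")
    # Round-robin gives simple load sharing when more than one router is up.
    return {host: working[i % len(working)] for i, host in enumerate(hosts)}


hosts = ["host-a", "host-b", "host-c", "host-d"]     # placeholder host labels
routers = ["10.0.0.1", "10.0.0.2"]

print(assign_gateways(hosts, routers, is_alive=lambda r: True))
print(assign_gateways(hosts, routers, is_alive=lambda r: r != "10.0.0.1"))
```

With both routers alive, the hosts alternate between them; once 10.0.0.1 stops responding, every host is given 10.0.0.2 the next time it asks.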

Theoretically, this could even be fully automated with DHCP by making the lease times very short. However, short lease times rarely work in practice because there are many DHCP implementations currently in use that do not properly handle expiring leases. Given that there are better and more general solutions to this problem, the use of BootP or DHCP to provide a pseudo-dynamic ES-IS protocol does not make sense—unless there is no other choice available for a specific environment.

VRRP and Cisco HSRP

Given the problem of end systems' inconsistent detection of routing black holes, Cisco developed the Hot Standby Router Protocol (HSRP, RFC 2281). This protocol runs only on the routers and is transparent to the end systems. It works by defining a virtual router IP address and MAC address that are shared by multiple routers. One router acts as the active router and responds to the HSRP IP address and HSRP MAC address. Should this router fail to send timely keepalives to the other routers on the LAN, the next router reconfigures its interface and assumes the IP and MAC addresses of the virtual router. Since even the MAC address is constant, the existing ARP cache contents remain valid. The router swap is transparent to the IP protocol, but it can be a problem for IPX, OSI, DECnet, and other protocols.

Other router vendors provide similar protocols. Most follow the lead of Bay/Nortel and support the Internet standards track Virtual Router Redundancy Protocol (VRRP, RFC 2338). This and the HSRP protocol are very similar in function and performance, although the terminology used to define each is different. To minimize confusion, we will restrict this discussion to HSRP—but keep in mind that what we do here with HSRP on Cisco routers can be accomplished using the same philosophy (albeit different commands) on routers supporting VRRP.

HSRP and VRRP are not perfect, but they are quick. Using Cisco's default HSRP timer settings (a hello every 3 seconds and a hold time of 10 seconds), a standby router will take over for a failed active router in about 10 seconds. This is fast enough that most protocols will not even notice the downtime unless they are actively sending at the time. Even then, the lost packets will typically be recovered transparently by higher layer protocols.
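The takeover timing follows directly from the keepalive mechanism: the standby router becomes active once it has heard nothing from the active router for a full hold time. The function below is a simplified model of that rule, not the HSRP state machine; its name is invented, and the timer values in the usage lines are assumptions chosen for illustration that should be replaced with whatever is configured on the routers.

```python
from typing import Iterable, Optional


def takeover_time(hello_times: Iterable[float], hold_time_s: float,
                  end_s: float, step_s: float = 1.0) -> Optional[float]:
    """Time at which the standby router goes active, or None if it never does.

    hello_times are the instants at which keepalives from the active router
    actually reach the standby router.
    """
    pending = sorted(hello_times)
    last_hello = 0.0
    now = 0.0
    while now <= end_s:
        # Record every keepalive that has arrived by this instant.
        while pending and pending[0] <= now:
            last_hello = pending.pop(0)
        if now - last_hello >= hold_time_s:
            return now
        now += step_s
    return None


# Active router sends keepalives at 0, 3, and 6 seconds, then fails.
print(takeover_time([0, 3, 6], hold_time_s=10, end_s=60))        # 16.0
# Healthy active router: keepalives keep arriving, so no takeover.
print(takeover_time(range(0, 61, 3), hold_time_s=10, end_s=60))  # None
```

In this model the outage seen by users lasts from the moment the active router fails until the hold time measured from its last keepalive expires.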

The primary challenge with HSRP is the hardware limitation to a single MAC address in Cisco Ethernet interfaces based on the Lance chip set. These interfaces, used on Cisco 25xx series and other routers, can support only one MAC address at a time. This limits the number of HSRP virtual router definitions to no more than one per interface. More importantly, it causes the MAC address of the standby router to disappear without warning when the standby router must take over for a failed active router.

As a result, extreme care must be taken when designing HSRP capabilities into implementations using less capable routers. Consider Figure 6-1, in which we define the HSRP standby IP address to be 10.0.0.254 and the physical router IP addresses to be 10.0.0.1 and 10.0.0.2, respectively. Further assume that the MAC addresses are 00c0ab000001 for HSRP and 00c0ab111111 and 00c0ab222222 for physical Routers 1 and 2, respectively. When HSRP is enabled and Router 1 is the active router, the ARP table on the user system will look like Table 6-1.

Figure 6-1.  HSRP operation

Table 6-1.  ARP Table with Router 1 Active and Router 2 in Standby

IP Address    MAC Address     Physical Router
10.0.0.254    00c0ab000001    Router 1
10.0.0.1      00c0ab000001    Router 1
10.0.0.2      00c0ab222222    Router 2


When Router 1 fails, Router 2 takes over as active router, and the ARP tables look like Table 6-2 instead.

Table 6-2.  ARP Table with Router 1 Dead and Router 2 Alive

IP Address    MAC Address     Physical Router
10.0.0.254    00c0ab000001    Router 2
10.0.0.1      N/A             Router 1 (dead)
10.0.0.2      00c0ab000001    Router 2


At this point, any applications attempting to reach Router 1 at IP address 10.0.0.1 will fail—but not as expected. The MAC address that delivered frames to Router 1 now delivers frames to the wrong router. This is not a problem, as Router 1 is dead anyway. However, consider systems that were communicating with Router 2. Even though Router 2 is alive and well, its MAC address has changed, making 10.0.0.2 inaccessible until the ARP cache times out and is renewed. All IP users on the LAN must be sure to use only the HSRP virtual router IP address, as it is the only IP address with a dependable, consistent MAC address. However, other protocol suites, such as IPX, do not have this luxury. They can be broken unless the routers are capable of maintaining their permanent MAC address, as well as the HSRP MAC address.
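The effect on host ARP caches can be sketched with a small model. This is only an illustration under the assumption stated in the text (single-MAC interfaces, where the active router answers on the HSRP MAC for every address it owns); the function name `live_bindings` and the data layout are hypothetical, while the addresses come from the Figure 6-1 discussion.

```python
# Addresses taken from the Figure 6-1 discussion.
VIRTUAL_IP = "10.0.0.254"
HSRP_MAC = "00c0ab000001"
ROUTERS = {
    "Router 1": {"ip": "10.0.0.1", "mac": "00c0ab111111"},
    "Router 2": {"ip": "10.0.0.2", "mac": "00c0ab222222"},
}


def live_bindings(active, alive):
    """Return the IP-to-MAC bindings that actually answer on the LAN.

    Assumes single-MAC interfaces: the active router answers on the HSRP
    MAC for every address it owns; a standby router uses its own MAC.
    """
    bindings = {VIRTUAL_IP: HSRP_MAC}
    for name in alive:
        router = ROUTERS[name]
        bindings[router["ip"]] = HSRP_MAC if name == active else router["mac"]
    return bindings


before = live_bindings("Router 1", alive=["Router 1", "Router 2"])
after = live_bindings("Router 2", alive=["Router 2"])

# Entries a host cached before the failure that no longer match reality.
stale = {ip: mac for ip, mac in before.items() if after.get(ip) != mac}

for ip, mac in sorted(stale.items()):
    print(f"{ip} cached as {mac}, now {after.get(ip, 'unreachable')}")
```

In this model only the virtual address 10.0.0.254 keeps the same binding across the failover, which is why it is the only address hosts can depend on.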

Even with fully capable routers, care must be taken to ensure that efficiency is preserved. Enabling HSRP on a LAN interface disables ICMP redirects for that interface. As a result, if one router on the LAN has better connectivity to some destinations than do other routers on the same LAN, it may be preferable to give that router priority for becoming the active HSRP router. In the worst case, virtually every packet sent to the default router would need to be resent over the LAN to get to the correct outbound router, doubling the traffic density on the LAN.

Configuration Example: Simple HSRP

The simple configuration in Figure 6-2, using HSRP on two routers to support IP and IPX, highlights many of these considerations. This example also illustrates some of the useful features provided by Cisco in its HSRP implementation.

Figure 6-2.
Simple HSRP example

Key to the configuration used for this example is that Router 1 has the only permanent link to the rest of the network. The dial backup link provided by Router 2 is only activated when the primary link via Router 1 is down. There is no need for load balancing because all traffic should go to the router with the active link. This is fortunate, as there is no way to provide load balancing when limited to only one HSRP virtual router. (In the next example, we will show how to achieve limited load balancing by defining two HSRP virtual routers.)

Starting with the configuration for Router 1 in Listing 6-3, we configure HSRP using standby statements under the interface definition of the LAN on which HSRP is to be supported. The no ip redirects directive is technically unnecessary, as turning on HSRP automatically disables the issuing of ICMP redirect indications. If it is not included in the configuration, it will be automatically added by the router.

Listing 6-3.  Simple HSRP example, Router 1 configuration

version 11.0
!
hostname Router1
!
interface Ethernet0
 description Remote Facility router 1
 ip address 10.0.0.2 255.255.255.0
 no ip redirects
 ipx network 120251 encapsulation SAP
 standby 1 priority 110
 standby 1 preempt
 standby 1 ip 10.0.0.1
 standby 1 track serial0 20
!
end

To ensure that the router with the active connection is the active router, we use a number of controls provided by Cisco as part of their HSRP implementation. This implementation determines which router will be the active router, which will be relegated to the standby mode, and when switching from one to the other will be permitted. The standby priority command sets the normal priority for the router. The router with the highest priority will become the active router and assume the identity of the virtual router. Listing 6-4 shows an example implementation with the goal of having whichever router is carrying the traffic be the active router. Normally that would be Router 1, so we give Router 1 a priority of 110 and Router 2 a priority of 95.

We use the standby track command to modify the relative priorities of the routers based on which links are up and which are down. In this case, should the primary link on Router 1 go down, the standby 1 track serial0 20 command will reduce the priority of Router 1 by 20, giving Router 1 an effective priority of 90 for standby group 1. This is not enough to cause a change of active routers, however, as the default behavior of HSRP is only to change active routers when there is no choice. To get around this default behavior, we add the standby 1 preempt line, which tells each router to take over as active router any time it has the higher priority.
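The interaction between tracking and preemption can be sketched as follows. This is a simplified model rather than a description of the router's internal logic: the helper names `effective_priority` and `next_active` are hypothetical, both routers are assumed to be alive, and tie-breaking between equal priorities is not modeled.

```python
def effective_priority(base, tracked):
    """Apply standby track decrements.

    tracked is a list of (interface_is_up, decrement) pairs; the
    decrement is subtracted only while the tracked interface is down.
    """
    return base - sum(dec for is_up, dec in tracked if not is_up)


def next_active(current, priorities, preempt):
    """Return the active router after a priority change.

    Without preempt, the current active router keeps the role for as
    long as it is alive, even if another router now has a higher priority.
    """
    best = max(priorities, key=priorities.get)
    if preempt and priorities[best] > priorities[current]:
        return best
    return current


# Primary link on Router 1 fails; the tracked interface on Router 2 is up.
priorities = {
    "Router 1": effective_priority(110, [(False, 20)]),  # 110 - 20 = 90
    "Router 2": effective_priority(95, [(True, 10)]),    # stays at 95
}

print(next_active("Router 1", priorities, preempt=False))
print(next_active("Router 1", priorities, preempt=True))
```

With preemption disabled the model leaves Router 1 active despite its lower priority; with preemption enabled Router 2 takes over.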

Listing 6-4.  Simple HSRP example, Router 2 configuration

version 11.0
!
hostname Router2
!
interface Ethernet0
 description Remote Facility router 2
 ip address 10.0.0.3 255.255.255.0
 no ip redirects
 ipx network 120251 encapsulation SAP
 standby 1 priority 95
 standby 1 preempt
 standby 1 ip 10.0.0.1
 standby 1 track bri0 10
!
end

The best way to verify the priorities and tracking weights is to set up a truth table and ensure that for every combination of states of tracked interfaces, the desired router is active. Table 6-3 gives the truth table for HSRP with tracking and preemption enabled for this example.

Table 6-3.  Truth Table for Simple HSRP Configuration Example

| Frame Relay Status | ISDN Link Status | Router 1 Base Priority | Router 1 Frame State Adjustment | Router 1 Adjusted Priority | Router 2 Base Priority | Router 2 ISDN Link Adjustment | Router 2 Adjusted Priority | Active Router |
|---|---|---|---|---|---|---|---|---|
| Up | Down | 110 | 0 | 110 | 95 | 10 | 85 | ONE |
| Down | Down | 110 | 20 | 90 | 95 | 10 | 85 | ONE |
| Down | Up | 110 | 20 | 90 | 95 | 0 | 95 | TWO |
| Up | Up | 110 | 0 | 110 | 95 | 0 | 95 | ONE |


We must be careful when interpreting the truth table to avoid being misled. It would appear that Router 1 is always active unless the only connection to the data center is the ISDN link on Router 2 (the third line where the Router 2 adjusted weight of 95 is greater than the Router 1 adjusted weight of 90). But this is not actually true. The problem is that the ISDN link, as a dial-on-demand link, will always appear to be up, whether or not a call is actually in progress. Whether the status is UP with the line dialed and active, or UP (spoofing) with the line ready to dial, the standby tracking command sees the link as being up.

As a result, the actual behavior of the configuration is that Router 1 is the active router only while the primary link is up. Unless the ISDN link is administratively shut down, Router 2 will preempt Router 1 whenever the primary link goes down, regardless of the functional state of the ISDN link. In this example, such behavior is acceptable; it really makes no difference which router is active when both the primary link and the dial backup link are unavailable. In other scenarios, however, this could lead to undesired HSRP behavior if the mechanism is not understood.
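A truth table like Table 6-3 can be generated rather than worked out by hand, which makes it easier to recheck after changing a priority or tracking weight. The sketch below uses the priorities and decrements from Listings 6-3 and 6-4; the function names and the `isdn_always_up` flag are hypothetical, the flag being a stand-in for the spoofing behavior described above.

```python
from itertools import product


def adjusted(base, link_up, decrement):
    """Priority after applying the tracking decrement for one interface."""
    return base if link_up else base - decrement


def truth_table(isdn_always_up=False):
    """Enumerate every link-state combination for the simple example.

    With preemption enabled on both routers, the router with the higher
    adjusted priority is taken to be the active router.
    """
    rows = []
    for frame_up, isdn_up in product([True, False], repeat=2):
        # A dial-on-demand interface reports "up" even when no call exists.
        seen_isdn_up = True if isdn_always_up else isdn_up
        r1 = adjusted(110, frame_up, 20)
        r2 = adjusted(95, seen_isdn_up, 10)
        rows.append((frame_up, isdn_up, r1, r2, "ONE" if r1 > r2 else "TWO"))
    return rows


for label, table in (("as tabulated", truth_table()),
                     ("with spoofing", truth_table(isdn_always_up=True))):
    print(label)
    for frame_up, isdn_up, r1, r2, active in table:
        print(f"  frame={'Up' if frame_up else 'Down':4} "
              f"isdn={'Up' if isdn_up else 'Down':4} "
              f"R1={r1} R2={r2} active={active}")
```

The first pass reproduces Table 6-3. The second pass shows the practical result: Router 2 is active in both rows where the Frame Relay link is down.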

Anomalous behavior can also occur due to the delay from the time the primary interface first comes back to life (changing the standby weights) until the link is fully connected and recognized by the routing protocol. (At this time traffic actually stops flowing out the ISDN link and resumes flowing over the Frame Relay link.) Since this is a temporary disturbance, it is only a minor inconvenience for IP users. However, it is potentially a critical window for failure for IPX (Novell) users.

It was mentioned earlier that HSRP can create problems for IPX users. This sample configuration provides an example of the kinds of problems HSRP can cause and shows how it is sometimes possible to work around the problems. The primary problem comes from the automatic inclusion of the LAN MAC address in the IPX network address of a device. IPX works in this configuration because there are only clients at the remote site; all servers are at the data center. When a Novell client comes to life, it locates the nearest server and the router providing the optimal path to the server. Since that router will be the one with the working link to the core, the router address will be based on the active router MAC address. This will work correctly except when both links are up from the viewpoint of HSRP, forcing the active router to switch to Router 1 while the traffic is still flowing via the ISDN link. It can take some time for IPXRIP to time out and discover the correct path to the data center.

But the real IPX problem comes when the Frame Relay link fails enough to force a routing change but not enough to take the Frame Relay interface down at the physical level. Under these conditions, the IPX routing protocol learns the correct route via Router 2, and clients start sending their packets straight to Router 2. IPX communications are lost when the Frame Relay link finally does fail enough to change the standby priority and Router 2 preempts. At this point, the MAC address used for Router 2 disappears and clients are sending their IPX packets into a black hole. HSRP and IPX should not be mixed if the routers are not capable of supporting multiple MAC addresses simultaneously.

Configuration Examples: Load Balancing with HSRP

HSRP makes it possible to implement simple load-sharing schemes. With routers that can support more than one MAC address per interface, two or more default gateways can be defined as HSRP virtual routers on each LAN, and the users assigned randomly or based on need. Alternatively, users can be distributed across multiple LANs, and the HSRP configurations on each LAN independently adjusted to achieve load balancing at the router level.

Consider the network diagrammed in Figure 6-3. Rather than assign 10.0.0.1 and 10.0.0.2 as the real IP addresses of the two routers, we assign non-router addresses to the Ethernet interfaces and define the two router IP addresses as HSRP virtual routers. (Remember that this approach requires more than one MAC address per interface and therefore is not suitable for 25xx and other Lance chip-based Cisco routers.) We then configure half the users to use 10.0.0.1 as their default gateway and the other half to use 10.0.0.2. In addition to balancing the traffic between the two routers, we use standby track to force a switch over if one of the routers becomes significantly more disconnected from the rest of the enterprise than the other.

Figure 6-3.
Small-site load balancing with HSRP

A realistic HSRP configuration implementing this policy for the two routers is given in Listings 6-5 and 6-6. Note that the tracking adjustments are assigned so that preemption will occur only if the standby router has at least two more serial links up than the active router. This minimizes flip-flopping of active and standby routers. But it still provides a high probability that by the time the network degrades to the point where the majority of the packets sent to the active router are being passed on to the standby router, the standby router will have taken over and eliminated the extra LAN bandwidth consumption.

Listing 6-5.  Small-site load balancing with HSRP, Router 1 configuration

version 11.0
!
hostname Router1
!
interface Ethernet0/0
 description Remote Facility router 1
 ip address 10.0.0.101 255.255.0.0
 no ip redirects
 standby 1 priority 110
 standby 1 preempt
 standby 1 ip 10.0.0.1
 standby 1 track serial1/0 10 
 standby 1 track serial1/1 10
 standby 1 track serial1/2 10
 standby 2 priority 95
 standby 2 preempt
 standby 2 ip 10.0.0.2
 standby 2 track serial1/0 10
 standby 2 track serial1/1 10
 standby 2 track serial1/2 10
!
end

Listing 6-6.  Small-site load balancing with HSRP, Router 2 configuration

version 11.0
!
hostname Router2
!
interface Ethernet0/0
 description Remote Facility router 2
 ip address 10.0.0.102 255.255.0.0
 no ip redirects
 standby 1 priority 95
 standby 1 preempt
 standby 1 ip 10.0.0.1
 standby 1 track serial1/0 10
 standby 1 track serial1/1 10
 standby 1 track serial1/2 10
 standby 2 priority 110
 standby 2 preempt
 standby 2 ip 10.0.0.2
 standby 2 track serial1/0 10
 standby 2 track serial1/1 10
 standby 2 track serial1/2 10
!
end
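The two-link margin built into Listings 6-5 and 6-6 can be checked by enumeration. This is a minimal sketch assuming the base priorities (110 and 95) and the per-link decrement (10) used in those listings; the function name `standby_preempts` is hypothetical.

```python
def standby_preempts(active_down, standby_down,
                     active_base=110, standby_base=95, decrement=10):
    """True when the standby router's adjusted priority exceeds the active's.

    active_down and standby_down count the tracked serial links that are
    down on each router.
    """
    active_priority = active_base - decrement * active_down
    standby_priority = standby_base - decrement * standby_down
    return standby_priority > active_priority


# Every (links down on active, links down on standby) pair that flips roles.
flips = [(a, s) for a in range(4) for s in range(4) if standby_preempts(a, s)]
print(flips)
```

Every pair in the result has the active router at least two links worse off than the standby, matching the stated design goal.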

The degree of load balancing provided by the single LAN configuration will depend upon the user traffic characteristics. It also has the disadvantage of confusing users and administrators by establishing two default gateways on the LAN. Most important, if each router has different connectivity to the outside world, the loss of ICMP redirection could cause a significant performance hit as well as increasing the router loading.

Using multiple LANs to achieve simple load balancing in a configuration like that in Figure 6-4 is also a popular technique. Here we define only one HSRP active router per LAN, and distribute the load on a LAN-by-LAN basis. In Listings 6-7 and 6-8, we make Router 1 the active router for the left-hand LAN and Router 2 the active router for the right. Again, this assumes there are similar loads on each LAN and minimal performance variation between delivering to one router and to the other.

Note that we can still use interface tracking to force the active router to a specific real router if the external connectivity becomes skewed. And, of course, if one router should fail, the other will quickly take over all routing operations.

Figure 6-4.
Dual-LAN load balancing with HSRP

Listing 6-7.  Dual-LAN load balancing with HSRP, Router 1 configuration

version 11.0
!
hostname Router1
!
interface Ethernet0/0
 description Remote Facility router 1
 ip address 10.1.0.101 255.255.0.0
 no ip redirects
 standby 1 priority 110
 standby 1 preempt
 standby 1 ip 10.1.0.1
 standby 1 track serial1/0 10
 standby 1 track serial1/1 10
 standby 1 track serial1/2 10
!
interface Ethernet0/1
 description Remote Facility router 1
 ip address 10.2.0.101 255.255.0.0
 no ip redirects
 standby 2 priority 95
 standby 2 preempt
 standby 2 ip 10.2.0.1
 standby 2 track serial1/0 10
 standby 2 track serial1/1 10
 standby 2 track serial1/2 10
!
end

Listing 6-8.  Dual-LAN load balancing with HSRP, Router 2 configuration

version 11.0
!
hostname Router2
!
interface Ethernet0/0
 description Remote Facility router 2
 ip address 10.1.0.102 255.255.0.0
 no ip redirects
 standby 1 priority 95
 standby 1 preempt
 standby 1 ip 10.1.0.1
 standby 1 track serial1/0 10
 standby 1 track serial1/1 10
 standby 1 track serial1/2 10
!
interface Ethernet0/1
 description Remote Facility router 2
 ip address 10.2.0.102 255.255.0.0
 no ip redirects
 standby 2 priority 110
 standby 2 preempt
 standby 2 ip 10.2.0.1
 standby 2 track serial1/0 10
 standby 2 track serial1/1 10
 standby 2 track serial1/2 10
!
end

Configuration Example: Meeting Special Needs with HSRP

To get an idea of the extent to which HSRP can support unique requirements, we will look at how HSRP is used to handle a number of special requirements in the production network configuration pictured in Figure 6-5. In this network, most users on the LAN communicate with local servers and random other sites, half of which connect to Router 1 and half of which connect to Router 2. A few special users communicate heavily with a unique service site that connects via a dedicated link to Router 1, with ISDN backup to Router 2.

Figure 6-5.
Special needs HSRP example

Before we get to the HSRP configurations, we'll discuss the design of this site. It combines a number of redundancy features that have already been discussed, plus a few that are somewhat unusual. As is often the case in real life, this is due to a number of political and historical factors. All normal users are on the main LAN rather than being split equally between the two LANs at the site, as recommended earlier in this chapter. Originally, there was only one LAN, and a number of applications and operational procedures were built around the assumption of a single LAN. The cost of upgrading the installed applications exceeds the estimated cost of the downtime that could be eliminated by moving half the users over to the backup LAN.

Similarly, cost considerations caused the lack of Frame Relay redundancy to other normal sites. The Frame Relay links at this site were highly oversubscribed; configuring redundant logical permanent virtual circuits (PVCs) to remote locations would have required bringing more high-speed lines into this site. Most remote sites have single routers with a single low-speed 56-Kbps access link. The added PVC, then, would provide only a minor reliability improvement—the majority of all Frame Relay failures experienced are physical layer problems with the low-speed access links. Not shown on this diagram, because it is irrelevant to HSRP configuration, are the provisions for ISDN backup of the Frame Relay connections.

A key design feature that is not obvious from the diagram is the ISDN backup for the leased line from Router 1, interface Serial1/3 to the Special Site. The ISDN call setup is handled by the DSU on Router 2's interface serial 1/3, rather than by an ISDN interface on the router. As a result, the ISDN link looks to the router like a leased line that is usually in the failed state but occasionally comes to life.

One unusual requirement here is the support of a secondary IP address, range 192.168.45.0/24, for users who have not yet been migrated to the 10.201.0.0/16 subnetwork. In today's tight IPv4 address space, the need to move LANs from one address space to another is becoming more common. The migration can be much less stressful if a controlled transition is implemented. This can be done by supporting both address schemes for a period of time while systems are modified to support the new addressing. Of course, it is essential to guard against the temptation to just leave the duplicate assignments in place.

Based on these conditions, the following HSRP loading plan is desired:

  1. The normal users' default gateway (10.201.0.1) should be Router 2 unless Router 1 has the only active Frame Relay connection.

  2. The special users' default gateway (10.201.0.6) should be the router that has the active special link.

  3. The obsolete users' default gateway (192.168.45.1) should be the router that is not supporting special users.

  4. If one router dies, the other router should take over for all user classes.

As it turns out, the third goal cannot be met, since Cisco does not provide the ability in HSRP to use negative adjustments as part of the standby track command. Instead, the obsolete subnetwork default gateway is unilaterally assigned to Router 2 as long as Router 2's Ethernet interface is alive. Should Router 2 become incapable of supporting IP on interface Ethernet0/0, control will transfer over to Router 1 until Router 2 comes back to life, at which point Router 2 will preempt Router 1 and take back control.

Listings 6-9 and 6-10 show how we can meet goals 1, 2, and 4 using HSRP on the Ethernet interfaces to the main LAN. The authentication statements are not particularly useful from a security viewpoint, since the passwords are sent over the LAN as clear text, but they do serve two worthwhile purposes. They document the purpose of each standby group and they help prevent configuration errors from going unnoticed. A log entry is made for every HSRP packet exchange with mismatched passwords, which at one exchange per second tends to be hard to ignore.

Listing 6-9.  Special needs HSRP example, Router 1 configuration

version 11.0
!
hostname Router1
!
interface Ethernet0/0
 description Core Router 1 with link to Special
 ip address 10.201.0.9 255.255.0.0 
 ip address 192.168.45.9 255.255.255.0 secondary
 no ip redirects
 ipx network 1234321 encapsulation SAP
 standby 1 priority 90
 standby 1 preempt
 standby 1 authentication Normal
 standby 1 ip 10.201.0.1
 standby 1 track Serial1/0 25
 standby 2 priority 110
 standby 2 preempt
 standby 2 authentication Special
 standby 2 ip 10.201.0.6
 standby 2 track Serial1/3 25
 standby 4 priority 90
 standby 4 authentication Obsolete
 standby 4 ip 192.168.45.1
!
end

Listing 6-10.  Special needs HSRP example, Router 2 configuration

version 11.0
!
hostname Router2
!
interface Ethernet0/0
 description Core Router 2 with backup link to Special
 ip address 10.201.0.2 255.255.0.0 
 ip address 192.168.45.2 255.255.255.0 secondary
 no ip redirects
 ipx network 1234321 encapsulation SAP
 standby 1 priority 110
 standby 1 preempt
 standby 1 authentication Normal
 standby 1 ip 10.201.0.1
 standby 1 track Serial1/0 25
 standby 2 priority 90
 standby 2 preempt
 standby 2 authentication Special
 standby 2 ip 10.201.0.6
 standby 2 track Serial1/3 25
 standby 4 priority 110
 standby 4 preempt
 standby 4 authentication Obsolete
 standby 4 ip 192.168.45.1
!
end
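The loading plan can be checked against the two listings with a small model. This is a sketch under simplifying assumptions: it picks the live router with the highest adjusted priority in each group, which matches the steady state here because Router 2 preempts in every group and Router 1 preempts in groups 1 and 2. The `GROUPS` table and `active_router` function are hypothetical names; the priorities and decrements are copied from Listings 6-9 and 6-10.

```python
# group: {router: (base priority, tracked interface or None, decrement)}
GROUPS = {
    1: {"Router 1": (90, "Serial1/0", 25), "Router 2": (110, "Serial1/0", 25)},
    2: {"Router 1": (110, "Serial1/3", 25), "Router 2": (90, "Serial1/3", 25)},
    4: {"Router 1": (90, None, 0), "Router 2": (110, None, 0)},
}


def active_router(group, links_down=frozenset(), dead=frozenset()):
    """Return the live router with the highest adjusted priority.

    links_down holds (router, interface) pairs for tracked links that are
    down; dead holds the names of routers that have failed entirely.
    """
    best_name, best_priority = None, None
    for name, (base, interface, decrement) in GROUPS[group].items():
        if name in dead:
            continue
        down = (name, interface) in links_down
        priority = base - decrement if down else base
        if best_priority is None or priority > best_priority:
            best_name, best_priority = name, priority
    return best_name


# Normal state: the DSU-dialed backup on Router 2 Serial1/3 is down.
normal = frozenset({("Router 2", "Serial1/3")})
# Leased line on Router 1 fails and the backup on Router 2 comes up.
special_failover = frozenset({("Router 1", "Serial1/3")})

print(active_router(1, normal), active_router(2, normal), active_router(4, normal))
print(active_router(2, special_failover))
print(active_router(1, dead=frozenset({"Router 2"})))
```

Under these assumptions, normal and obsolete users are served by Router 2, special users follow whichever router has the working special link, and either router takes over every group if the other dies.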