Other Design and Deployment Issues
Beyond deciding whether or not to divide an IS-IS network into multiple domains, and where the domain borders should be if you decide to separate the domain, what other issues should you look for when designing and deploying an IS-IS network? There are a few things that you should look at regardless of what routing protocol you are using, such as summarizing IP prefixes or dialer interfaces, while there are others, which are more specific to IS-IS, such as tuning the various IS-IS timers.
Summarizing IP Prefixes
One of the most important issues to address when designing or deploying an IP network is prefix summarization. Summarization is a very simple concept: the basic premise is to reduce the amount of information that intermediate systems must handle while computing the best paths through the network by hiding information about reachable destinations. For instance, in the network illustrated in Figure 4-4, each time intermediate systems B and C need to run SPF,
Figure 4-4 A highly summarizable address space
Since all of these networks are reachable through a single point, why not describe them with a single prefix (or advertisement), rather than 16? That is exactly what summarization does: rather than advertise a large number of destinations, they are summarized into a single prefix at intermediate system A, which then advertises it towards B and C. In this case, 10.1.0.0/20 would include all of the destinations from 10.1.0.0 through 10.1.15.255.
Calculating IP Summaries
IP summarization, although simple in principle, is a confusing topic to many people. The general idea is to shorten the prefix length (the number of bits set in the subnet mask) so a single advertisement covers, or represents, more address space. For instance, in the network illustrated in Figure 4-4, we begin with 16 different destinations, each with a 24-bit mask.
Figure 4-5 IP address summarization
What we want to do is to find a single destination we can advertise from intermediate system A that will represent all of the destinations covered by these 16 advertisements. The most straightforward way of finding the single prefix is to figure out the lowest and highest addresses represented by the range of addresses we would like to replace, and then try to find one prefix that will represent all of those addresses by itself. In this example, we would begin with the address 10.1.0.0, since that is the lowest address in the range, and end with the highest address of 10.1.15.255. Is there any prefix we can use to represent this entire address range? If we were to lay all of these addresses out in binary, we would find that the top 20 bits of every address remain the same throughout the entire address range.
So, we can use a single prefix with 20 bits set in its subnet mask to represent all of the addresses covered by the 16 individual prefixes, each with 24 bits set in their subnet masks. This 20-bit prefix is called a summary.
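The lowest-and-highest-address method described above can be sketched in a few lines of Python. This is only an illustration using the standard ipaddress module; the function name covering_prefix is an assumption of this sketch and it handles IPv4 only.

```python
import ipaddress

def covering_prefix(networks):
    """Return the shortest single IPv4 prefix that covers every network given."""
    nets = [ipaddress.ip_network(n) for n in networks]
    low = min(int(n.network_address) for n in nets)
    high = max(int(n.broadcast_address) for n in nets)
    # Bits that differ between the lowest and highest address cannot be
    # part of the shared prefix; every bit above them can.
    differing_bits = (low ^ high).bit_length()
    prefix_len = 32 - differing_bits
    mask = ((1 << 32) - 1) ^ ((1 << differing_bits) - 1)
    return ipaddress.ip_network((low & mask, prefix_len))

# The 16 component prefixes from the example: 10.1.0.0/24 through 10.1.15.0/24.
components = [f"10.1.{i}.0/24" for i in range(16)]
summary = covering_prefix(components)
print(summary)           # 10.1.0.0/20
print(summary.netmask)   # 255.255.240.0
```

Note that when the component range does not fall on a clean bit boundary, the prefix returned this way also covers addresses that are not in any component, which is worth checking before advertising it.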
Summarization in IS-IS can only be configured at the intermediate system that is injecting the IP destinations or at a domain border. For instance, in Figure 4-4, the specific routes are static routes in intermediate system A, so it is the only node that could summarize the routes into a single advertisement. To configure a summary on a Cisco router in IS-IS, use the summary-address command within the router isis submode.
!
router isis
 summary-address 10.1.0.0 255.255.240.0
 redistribute static metric 10 level-2
 net 49.0001.1111.1111.1111.00
!
Once these commands are configured on A, its routing table will show the 16 individual 24-bit prefixes plus the single 20-bit summary prefix.
router-a#show ip route
....
     10.0.0.0/8 is variably subnetted, 16 subnets, 2 masks
S       10.1.11.0/24 [115/10] via 10.1.11.1, Serial11
S       10.1.10.0/24 [115/10] via 10.1.10.1, Serial10
S       10.1.9.0/24 [115/10] via 10.1.9.1, Serial9
S       10.1.8.0/24 [115/10] via 10.1.8.1, Serial8
S       10.1.14.0/24 [115/10] via 10.1.14.1, Serial14
S       10.1.13.0/24 [115/10] via 10.1.13.1, Serial13
S       10.1.12.0/24 [115/10] via 10.1.12.1, Serial12
S       10.1.3.0/24 [115/10] via 10.1.3.1, Serial3
S       10.1.2.0/24 [115/10] via 10.1.2.1, Serial2
S       10.1.1.0/24 [115/10] via 10.1.1.1, Serial1
i su    10.1.0.0/20 [115/0] via 0.0.0.0, Null0
S       10.1.0.0/24 [115/10] via 10.1.10.1, Serial0
S       10.1.7.0/24 [115/10] via 10.1.7.1, Serial7
S       10.1.6.0/24 [115/10] via 10.1.6.1, Serial6
S       10.1.5.0/24 [115/10] via 10.1.5.1, Serial5
S       10.1.4.0/24 [115/10] via 10.1.4.1, Serial4
Note the summary route in the output above (the entry marked i su, 10.1.0.0/20 via Null0); IS-IS automatically injects the route into the routing table when the summary is configured on A. On intermediate system B, issuing the show ip route command shows only the summary route; none of the components are shown.
router-b#show ip route
....
     10.0.0.0/20 is subnetted, 1 subnets
i L2    10.1.0.0 [115/20] via 184.108.40.206, Serial0/3
The summary route A is advertising to intermediate system B looks like any other L2 route B is receiving.
IP Summaries and Routing Black Holes
One issue to be careful of when summarizing destinations in a network with redundant level 1 and level 2 connections is routing black holes. Figure 4-6 illustrates a network in which this can happen.
Figure 4-6 A routing black hole in the making
This network design appears to be perfectly reasonable when it's initially deployed, with redundancy provided for destinations within the 192.168.50.0/24 network. However, when the link between B and D fails, the network administrators suddenly discover that 192.168.50.0/24 is no longer reachable. Why? Since both intermediate systems B and C are advertising the same route to A, A must choose between the two routes when calculating the best paths through the network. In this case, we will assume A chooses the path through the level 1/level 2 border B for all traffic destined to 192.168.48.0/20, which includes all the destinations on the 192.168.50.0/24 network. When the link between intermediate systems B and D fails, there is no reason for B to stop advertising the 192.168.48.0/20 IP summary, since it still has at least one component in the summary, 192.168.49.0/24, which is directly attached. In fact, neither B nor C will advertise any changes in this summary at all.
Router A will continue routing all traffic for any destination within 192.168.48.0/20 through intermediate system B, and B will drop the traffic destined to 192.168.50.0/24, since it no longer has a route to that destination network. This result is a very common problem in network designs with redundant connections between L1 and L2 domains.
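The failure can be made concrete with a toy longest-prefix-match lookup. The sketch below is only a model: the table contents are simplified assumptions based on the description above (A holds just the summary and prefers B; B is left with only its directly attached component after the B-D link fails), and the lookup function is hypothetical rather than any router's actual forwarding code.

```python
import ipaddress

def lookup(table, destination):
    """Longest-prefix match: return the next hop, or None if no route matches."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if dest in net]
    if not matches:
        return None
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

net = ipaddress.ip_network

# A sees only the summary and, by assumption, prefers the path through B.
table_a = {net("192.168.48.0/20"): "B"}

# B after the B-D link fails: only its directly attached component remains.
table_b = {net("192.168.49.0/24"): "connected"}

print(lookup(table_a, "192.168.50.1"))   # B
print(lookup(table_b, "192.168.50.1"))   # None
```

A still forwards toward B because the summary has not changed, while B has nothing more specific to match, so the traffic is dropped.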
How can you resolve this problem? One option is to simply not have redundant paths between L1 and L2 domains. This is not a very good solution, because it results in single points of failure. Another option is to simply not summarize between L1 and L2. Again, this is not a very good option either, since it will likely limit the scaling of your network over the long haul.
Yet another option is to provide an alternate link between the redundant connecting intermediate systems between the domains, as illustrated in Figure 4-7.
Figure 4-7 Resolving the routing black hole
This solution provides intermediate system B with an alternate link to reach the 192.168.50.0/24 network should the link between B and D fail. Links of this type must be used with some caution, however, and their capacity must be carefully planned. Assuming that the link between B and C will only carry one remote site's worth of traffic probably is not a good idea. If there are, say, 200 remote sites dual-homed between Routers B and C, a single massive failure could redirect all of the traffic to and from these sites across this single link. Careful network traffic flow and capacity planning are required to make this solution work well.
Some IS-IS implementations offer still another solution: automatic deaggregation. The details of the operation might be rather involved in some cases. Both intermediate systems B and C are part of the same level 1 and level 2 domains. Before any failure, intermediate system B might notice that another node (C in this example) is advertising the same summary through the level 2 domain. At the same time, intermediate system B can verify that C is reachable through the level 1 domain when it runs SPF for the domain and C's LSP is part of the resulting tree. If the second condition changes (C is not reachable anymore through the level 1 domain), but the first one is maintained, then B can deaggregate the summary and advertise the more specific routes. Of course, C would follow a similar process and it would also advertise the specific routes it can reach. The result is that intermediate system A (and all the other nodes in the level 2 domain) now have specific knowledge of which prefixes are reachable through which entry into the level 1 domain. Once the partition is resolved, both intermediate systems will continue advertising just the summary.
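The two conditions described above reduce to a small decision rule. The following is a hypothetical sketch of that rule only; the function and parameter names are assumptions, and real implementations that offer automatic deaggregation may differ in detail.

```python
def routes_to_advertise(summary, reachable_components,
                        peer_advertises_summary, peer_reachable_in_l1):
    """Advertise the summary normally; fall back to specifics on a partition.

    peer_advertises_summary: another border node advertises the same summary
        into the level 2 domain.
    peer_reachable_in_l1: that node's LSP is part of the level 1 SPF tree.
    """
    if peer_advertises_summary and not peer_reachable_in_l1:
        # The level 1 domain looks partitioned: advertise only what this
        # node can actually reach.
        return sorted(reachable_components)
    return [summary]

# Before any failure, B advertises just the summary.
print(routes_to_advertise("192.168.48.0/20",
                          ["192.168.49.0/24", "192.168.50.0/24"],
                          peer_advertises_summary=True,
                          peer_reachable_in_l1=True))

# After B loses level 1 reachability to C, B advertises its specifics.
print(routes_to_advertise("192.168.48.0/20",
                          ["192.168.49.0/24"],
                          peer_advertises_summary=True,
                          peer_reachable_in_l1=False))
```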
Within the IS-IS protocol, there are many timers which may be configured or set to defaults, including the flooding timers, the hello interval, the hold interval (or multiplier), shortest path first interval, and the link state packet generation interval. When you are designing a network with a particular goal in mind, such as minimum control traffic or very fast convergence times, it is sometimes advantageous to change these timers to better fit the network.
Routing protocols are deployed in many situations that can be radically different from the circumstances their designers anticipated. Because of that, many of the default timers chosen when routing protocols were first designed can (and should) be reviewed against the goals for any given intermediate system, to see whether changing them might improve network performance.
IS-IS Flooding Timers
By default, IS-IS intermediate systems reflood the LSPs they created (self-originated LSPs) on a regular basis, about every 20 minutes. What is the purpose of this flooding? To guard against corrupted LSPs.
But isn't reflooding about every 20 minutes a little excessive? What are the odds that a mistake in the database, through packet corruption or other means, is going to surface in an IS-IS network? At one time, the odds were believed to be reasonably high, which is why the timers are set for such a relatively short interval.
With network infrastructure improving all the time, however, the chances of a packet being corrupted as it transits the network, or of bad information getting into and staying in the LSP database on any one intermediate system for a long period of time, seem low. This reliability, together with recent efforts at optimizing the use of network bandwidth (which typically means reducing the ratio of control to user traffic passed across a network), has resulted in some investigation into how the flooding of information through an IS-IS domain can be reduced.
Thus, one timer that can be considered for adjustment is the IS-IS reflooding interval. To configure the flooding interval in the Cisco IOS Software, use the router isis submode command lsp-refresh-interval:
router(config-router)#lsp-refresh-interval ?
  <1-65535>  LSP refresh time in seconds
When configuring the LSP refresh interval, you should remember to set the max-lsp-lifetime value, also within the router isis submode, to something a bit longer than the lsp-refresh-interval value.
router(config-router)#max-lsp-lifetime ?
  <1-65535>  Maximum LSP lifetime in seconds
A higher max-lsp-lifetime setting causes the intermediate system to generate LSPs that have a longer lifetime, which means that the LSPs will be considered valid longer by other intermediate systems in the network. Setting lsp-refresh-interval to a high value without changing the max-lsp-lifetime setting might cause other intermediate systems to time out the LSPs received before they have received a replacement (refreshed) LSP.
Many large-scale IS-IS networks are using the maximum time possible as their refresh rate: 65,535 seconds, or about 18.2 hours. Using this large a timer for the refresh and max-lsp-lifetime values can significantly cut the amount of traffic that IS-IS generates on a very large network.
These two timers do not need to be set the same on all the intermediate systems in the network; each intermediate system can have different lsp-refresh-interval and max-lsp-lifetime values. If you decide to change the value in your network, make sure that max-lsp-lifetime is set to a number larger than the lsp-refresh-interval.
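The relationship between the two timers is easy to check before a change is rolled out. The sketch below is a hypothetical pre-deployment check, not a Cisco tool; the 1-65535 range is taken from the CLI help shown above, and the sample values passed in are made up for illustration.

```python
MAX_TIMER_SECONDS = 65535  # upper bound shown in the CLI help above

def check_lsp_timers(refresh_interval, max_lifetime):
    """Return a list of problems with a refresh/lifetime pair (empty if fine)."""
    problems = []
    for name, value in (("lsp-refresh-interval", refresh_interval),
                        ("max-lsp-lifetime", max_lifetime)):
        if not 1 <= value <= MAX_TIMER_SECONDS:
            problems.append(f"{name} {value} is outside 1-{MAX_TIMER_SECONDS}")
    if max_lifetime <= refresh_interval:
        # Other systems would age the LSP out before the refresh arrives.
        problems.append("max-lsp-lifetime must be larger than lsp-refresh-interval")
    return problems

print(check_lsp_timers(65000, 65535))   # []
print(check_lsp_timers(65000, 1200))    # lifetime shorter than refresh
```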
Hello and Hold Intervals
In the early days of the Internet, when IS-IS and the other routing protocols were first designed and deployed, many of the links in the Internet (and other networks) were slow links that frequently had high error rates on them. Furthermore, much of the traffic carried across networks was not anywhere close to real time in nature; that is, it didn't matter much if there was a delay of nine or ten seconds in delivering a given packet, or if it had to be retransmitted several times before it was finally delivered.
In this environment, when deciding how often intermediate systems would send hellos to each other (and how long they would wait since receiving a hello from a neighbor before timing out and assuming the neighbor had failed), slightly longer intervals of time seemed more appropriate than shorter intervals of time. Who cared if it took nine seconds for an IS-IS intermediate system to discover that its peers were down, and it needed to reroute traffic?
Today's networks are generally designed assuming much higher speeds and more reliable links, and today's traffic is much less tolerant of a nine-second delay in rerouting traffic. Therefore, it seems that these timers should be reconsidered, and possibly shortened. In fact, on most links over T1 speed (1.544Mb/s), there doesn't seem to be much of a reason not to set these timers to a much lower number than their default values, which are three seconds for the hello interval, and nine seconds for the hold interval.
In the Cisco IOS Software, to set the hello interval, use the isis hello-interval command in the interface submode.
router(config-if)#isis hello-interval ?
  <1-65535>  Hello interval value
  minimal    Holdtime 1 second, interval depends on multiplier
As the help string states for isis hello-interval, the hold interval depends on the hello multiplier, which is set using the isis hello-multiplier command under the interface submode.
router(config-if)#isis hello-multiplier ?
  <3-1000>  Hello multiplier value
The hello interval is multiplied by the hello multiplier to determine the hold interval. Therefore, if the hello interval is two seconds, and the hello multiplier is four, the hold interval will be eight seconds. The Hello and Hold intervals do not need to match on all interfaces or with the settings of the neighbors, as each Hello packet carries its own timer.
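As a quick illustration of the arithmetic, and of why mismatched settings between neighbors are harmless, consider the following sketch. The function names are assumptions of this example; the point is only that the receiver times a neighbor out using the hold time that neighbor advertised, not its own.

```python
def hold_interval(hello_interval, hello_multiplier):
    """Hold interval in seconds: hello interval times hello multiplier."""
    return hello_interval * hello_multiplier

def neighbor_expired(seconds_since_last_hello, advertised_hold):
    """A neighbor times out once the hold time it advertised has elapsed."""
    return seconds_since_last_hello >= advertised_hold

# The example from the text: 2-second hellos with a multiplier of 4.
print(hold_interval(2, 4))                          # 8

# A neighbor advertising an 8-second hold time is still up after 5 seconds
# of silence, whatever the local interface is configured with.
print(neighbor_expired(5, hold_interval(2, 4)))     # False
print(neighbor_expired(8, hold_interval(2, 4)))     # True
```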
Very Fast Hellos
While running very fast hellos with low hold intervals may seem like a very good solution for producing quick convergence times on first examination, further scrutiny indicates that this approach is not necessarily practical or scalable. First, fast hellos will most likely not improve the detection of neighbors which are no longer responding on point-to-point links, since the operating system running on most intermediate systems will notify IS-IS immediately of the loss of line protocol on point-to-point links.
For broadcast networks, the problem of scalability can come into play very quickly. For instance, if you have 101 intermediate systems attached to a single broadcast network, and each of them is expecting to receive a hello from every other intermediate system on the network every second, each intermediate system must be able to receive and process a hello every ten milliseconds. The same problem exists on the transmit side: an intermediate system attached to 100 networks, each with a hello timer of 330 milliseconds, must be able to transmit a hello packet every three milliseconds or so. Due to limitations within the timers and other architectural issues beyond the scope of this book, these numbers are unrealistic.
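The two figures above come from simple division, which the following sketch reproduces. The function names are illustrative assumptions; the model treats hellos as evenly spaced and ignores jitter.

```python
def ms_between_hellos_received(systems_on_lan, hello_interval_ms):
    """Average gap between hellos arriving from all other systems on one LAN."""
    neighbors = systems_on_lan - 1
    return hello_interval_ms / neighbors

def ms_between_hellos_sent(attached_networks, hello_interval_ms):
    """Average gap between hellos one system sends across all its networks."""
    return hello_interval_ms / attached_networks

# 101 systems on one broadcast network, one-second hellos.
print(ms_between_hellos_received(101, 1000))        # 10.0

# One system attached to 100 networks, 330 ms hellos.
print(round(ms_between_hellos_sent(100, 330), 1))   # 3.3
```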
An alternate mechanism, which uses protocol independent signaling at layer 2, has been developed and is supported in some implementations. This new mechanism allows for the subsecond detection of link failures in shared multi-access media while avoiding the scalability concerns mentioned above.
Shortest Path First Interval
It does not make a lot of sense, on first consideration, that a routing protocol would want to hold information for some time before processing it, and, in fact, no distance vector protocol, such as the Routing Information Protocol (RIP) or the Enhanced Interior Gateway Routing Protocol (EIGRP), ever holds routing information for any length of time before processing it. Why would IS-IS hold routing information before processing it, then? There are two reasons, one dealing with the flooding process, and the other dealing with the shortest path first algorithm.
First, let's deal with why holding onto SPF computations after receiving a link state packet is desirable for flooding. Link state protocols, generally, would prefer to flood information to all the intermediate systems in the network, and then let all of them compute their shortest paths through the network at the same time. This approach is not as reasonable as it sounds, since it would require some sort of synchronized timer running through the network, so IS-IS does the next best thing: it tries to make the SPF computation always run after any new information is flooded to the other neighbors of this intermediate system.
Why? Most implementations of routing protocols run on single processor devices; thus, the IS can only do one thing at a time. If the processor is busy running the shortest path first computation on some new data, it cannot also be flooding that new information to its neighbors. If an intermediate system always ran SPF across new information before flooding it, the propagation of information through the network would slow down dramatically; each IS would hold the information until it had finished running SPF. Thus, it is more efficient (in network terms) for an intermediate system to flood any link state information before it begins running SPF over the data. In this way, its neighbors can flood the same information and then begin running SPF in parallel.
The second reason most intermediate systems wait for some time after receiving a link state packet before running shortest path first is to reduce the overall load on the processor and memory. For instance, if a single link flaps, two intermediate systems will flood new network topology information: the two intermediate systems attached to either side of the link. So, assume an intermediate system runs SPF immediately upon receiving the changed information generated by the IS attached to one end of a link. While it is running SPF, it receives the second link state packet. It cannot stop running the SPF to take the second piece of information into account, nor can it insert the new information into the database SPF is currently being run over. Instead, it must wait until the current shortest path first computation is finished, flood the new information, and run SPF again. Intermediate systems batch information (or SPF runs) by delaying the SPF run for a period of time after receiving new link state packets.
There are actually three shortest path first interval timers in some implementations, rather than just one. In the Cisco IOS Software, the spf-interval command is used in the router submode.
router(config-router)#spf-interval ?
  <1-120>  Minimum interval between consecutive SPFs in seconds
router(config-router)#spf-interval 1 ?
  <1-10000>  Initial wait before first SPF in milliseconds
router(config-router)#spf-interval 1 40 ?
  <1-10000>  Minimum wait between first and second SPF in milliseconds
  <cr>
The first interval is the minimum time that should elapse between consecutive shortest path first computations. It is generally set in seconds, with the shortest interval being one second. Even though we call this time the minimum, it really represents the maximum time between SPF runs; this concept will be clearer a little later.
The second timer is the number of milliseconds between the receipt of new link state information and running SPF. While it can be set as low as one millisecond, really low settings may cause the intermediate system to run SPF before flooding new information. It's better to set it to a larger number that allows for the flooding to occur; you might want to start with something as high as 200 milliseconds and reduce it incrementally until you find the network converging slower. Generally, it should be set no lower than 40 or 50 milliseconds.
Finally, there is the minimum wait time between the first and subsequent SPF calculations, which is set in milliseconds. This timer addresses, in another way, the issue raised above: receiving new data while still running SPF. Generally, it is best to set this timer to the average (or maximum) length of a shortest path first computation in the network. In an intermediate system using Cisco IOS Software, this information can be obtained from the show isis spf-log command.
In the Cisco IOS Software implementation of IS-IS, the timers shown for the spf-interval command (and other commands in this section) interact to create a backoff algorithm, such that the interval between any two successive operations will increase until a maximum is reached. After some time, the interval will be reset to the minimum configured. To make this concept clearer, let's use the following configuration command: spf-interval 2 40 100.
As explained above, the intermediate system will wait 40 milliseconds after an LSP is received before running SPF. If a second SPF run is needed, then the wait will be 100 ms (the third timer). If a third SPF run is needed, then the wait will be 200 ms (or two times the third timer). For subsequent SPF runs, the wait will keep increasing (twice the last wait) to 400 ms, 800 ms, 1,600 ms, and so on, until the limit set by the first option (or argument) included with the command: 2,000 ms. The wait timer will be reset to the original minimum value (40 ms, in this case) when no more triggers are present for two times the maximum value (four seconds, in this case).
It is easily observed that the effect of this algorithm is to react quickly to a change, but to slow down if the network is showing instability (changing too fast).
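The backoff sequence described above can be reproduced with a short simulation. This is a minimal sketch of the behavior as the text describes it, not the Cisco IOS implementation; the function names are assumptions, and the arguments mirror the three values of spf-interval 2 40 100.

```python
def spf_waits(max_interval_s, initial_wait_ms, second_wait_ms, runs):
    """Return the wait (ms) applied before each of `runs` back-to-back SPF runs."""
    cap_ms = max_interval_s * 1000
    waits = []
    for run in range(runs):
        if run == 0:
            wait = initial_wait_ms       # second argument of the command
        elif run == 1:
            wait = second_wait_ms        # third argument of the command
        else:
            wait = waits[-1] * 2         # double the previous wait
        waits.append(min(wait, cap_ms))  # never exceed the first argument
    return waits

def quiet_time_before_reset_ms(max_interval_s):
    """Trigger-free time after which the wait drops back to the initial value."""
    return 2 * max_interval_s * 1000

print(spf_waits(2, 40, 100, 8))
# [40, 100, 200, 400, 800, 1600, 2000, 2000]
print(quiet_time_before_reset_ms(2))   # 4000
```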
The partial route calculation (PRC) timers are similar in purpose, and should be set using the same sorts of parameters as the SPF timers above. The only difference might be in the setting of the minimum wait between the first and subsequent PRC runs. In this case, the timer can be set to a smaller number because the PRC involves only the calculation of the leaf routes of a particular LSP, so it takes a lot less time to execute. In the Cisco IOS Software, the prc-interval command is used in the IS-IS router configuration submode.
router(config-router)#prc-interval ?
  <1-120>  Minimum PRC interval in seconds
router(config-router)#prc-interval 1 ?
  <1-10000>  Initial wait for PRC in milliseconds
  <cr>
router(config-router)#prc-interval 1 40 ?
  <1-10000>  Minimum wait between first and second PRC in milliseconds
  <cr>
Link State Packet Generation
Most implementations of IS-IS allow you to determine how long after a network topology change occurs the intermediate system should wait before building and flooding a link state packet. This wait is to prevent a constantly flapping link (or some other network situation which would normally generate constant updates) from causing an intermediate system to send a massive number of link state packets through the network. In Cisco IOS Software, the lsp-gen-interval command is used in the IS-IS router configuration submode.
router(config-router)#lsp-gen-interval ?
  <1-120>  Minimum interval in seconds
router(config-router)#lsp-gen-interval 1 ?
  <1-10000>  Initial wait in milliseconds
  <cr>
router(config-router)#lsp-gen-interval 1 1 ?
  <1-10000>  Wait between first and second lsp generation in milliseconds
  <cr>
There are, as with the SPF timers above, three timers: one for the minimum interval between generated link state packets, one for the minimum time between the first event and the first link state packet being generated, and one for the minimum interval between the first link state packet flooded in a series and subsequent LSPs. You would generally set these timers so the first change will immediately cause a link state packet to be generated and flooded, while link state packets generated for events that immediately follow are delayed a short time. The reason for these settings is that LSPs are generated, most often, when a change in the state of a link (up to down or down to up) occurs. By setting the first and third timers to the same value and the initial wait to a low number, no penalty is given to "good" or "bad" news in the network; both are treated equally.
Link State Packet Interval
Beyond the speed at which an intermediate system will generate link state packets in response to topology change events, IS-IS also limits the number of packets per second an intermediate system can transmit. The default value is one packet every 33 milliseconds, which allows about 360 kilobits (roughly 45 kilobytes) of data to be flooded in any second (given the intermediate system is flooding maximum-sized packets on links with a 1,500-byte MTU). With higher speed links, this is a rather conservative amount of information flow; in fact, pacing packets at this rate can slow down network convergence.
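The flooding rate quoted above follows from the pacing interval and the packet size. The sketch below works through that arithmetic; it assumes the figure is meant in kilobits per second and that every LSP is a full 1,500 bytes, and the function name is hypothetical.

```python
def flooding_rate(lsp_interval_ms, packet_bytes):
    """Return (packets per second, bits per second) for a given pacing interval."""
    packets_per_second = 1000 / lsp_interval_ms
    bits_per_second = packets_per_second * packet_bytes * 8
    return packets_per_second, bits_per_second

pps, bps = flooding_rate(33, 1500)
print(round(pps, 1))        # 30.3 packets per second
print(round(bps / 1000))    # 364 kilobits per second
```

Halving the interval doubles the ceiling, which gives a rough feel for how much headroom a smaller pacing value buys on faster links.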
To configure the interval between packets (packet pacing) on an intermediate system running Cisco IOS Software, use the interface submode command isis lsp-interval.
Multiple Net Statements
Given the network illustrated in Figure 4-8, what would be the result of configuring multiple net commands in the router isis submode of router B with the Cisco IOS Software?
Figure 4-8 Multiple net statements in the Cisco IOS Software
The problem here is that intermediate systems A and C are in different domains. How will intermediate system B treat these two different domains if we enter two net statements in the router isis submode? On B, we would have:
!
router isis
 net 49.2222.1111.1111.1112.00
 net 47.1111.1111.1111.1112.00
!
Would intermediate system C learn about the 10.1.1.0/24 network? Let's look at C's routing table to find out.
router-c#show ip route
....
     10.0.0.0/20 is subnetted, 1 subnets
i L2    10.1.0.0 [115/30] via 18.104.22.168, Serial0/3
It looks like it does. Intermediate system B shows both intermediate systems A and C as neighbors.
router-b#sho clns neighbor
System Id      Interface   SNPA      State  Holdtime  Type  Protocol
router-a       Se0/2       *HDLC*    Up     29        L2    IS-IS
router-c       Se0/3       *HDLC*    Up     21        L1L2  IS-IS
In fact, the only odd thing we see is in the show isis database detail command.
router-c#show isis database detail
IS-IS Level-1 Link State Database:
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime
router-c.00-00        0x000000B2   0x210B        710
  Area Address: 47.1111
  Hostname: router-c
  Metric: 10 IS router-b.00
router-b.00-00        0x000000B6   0xC573        824
  Area Address: 49.2222
  Area Address: 47.1111
  Hostname: router-b
  Metric: 10 IS router-c.00
....
Intermediate system C's database appears normal: it shows a single connection to intermediate system B in domain 47.1111. However, B's LSP has two domains in it, since B has two domains configured. Intermediate system B essentially merges the two databases into one database, and treats the domain border as a level 1/level 2 domain border.
This mechanism is primarily defined for use while switching from one domain address to another; it's not designed to provide any advantages or a more conventional configuration, nor for long-term use in a large-scale network. There is no advantage to dividing a network into multiple domains and using multiple net statements on the connecting intermediate system to merge them.