- Physical Network Topology and Availability
- Layer 2 Availability: Trunking—802.3ad—Link Aggregation
- Layer 2 Availability: Spanning Tree Protocol
- Layer 3—VRRP Router Redundancy
- Layer 3—IPMP—Host Network Interface Redundancy
- Layer 3—Integrated VRRP and IPMP
- Layer 3—OSPF Network Redundancy—Rapid Convergence
- Layer 3—RIP Network Redundancy
- Conclusion
Layer 2 Availability: Trunking—802.3ad—Link Aggregation
Link aggregation, or trunking, increases availability by distributing network traffic over multiple physical links. If one link breaks, its load is transferred to the remaining links.
IEEE 802.3ad is an industry standard created to allow the trunking solutions of various vendors to interoperate. As with most standards, there are many ways to implement the specification. Link aggregation can be thought of as a layer of indirection between the MAC and PHY layers. Instead of one fixed MAC address that is bound to a physical port, a logical MAC address is exposed to the IP layer and implements the Data Link Provider Interface (DLPI). This logical MAC address can be bound to many physical ports. The remote side must have the same capabilities and use the same algorithm for distributing packets among the physical ports. FIGURE 3 shows a breakdown of the sub-layers.
FIGURE 3 Trunking Software Architecture
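The 802.3ad standard deliberately leaves the distribution function open; a common implementation choice is to hash a frame's address pair to select a physical port, which keeps each conversation on one link and preserves frame ordering. The following Python sketch illustrates that idea; the hash inputs and port names are illustrative assumptions, not part of the standard.

# Sketch: hash-based distribution of frames across aggregated ports.
# 802.3ad does not mandate a distribution function; the practical
# requirement is that frames of one conversation stay on one link so
# that ordering is preserved. Hashing the address pair is one common
# way to achieve this.

def select_port(src_mac: str, dst_mac: str, ports: list) -> str:
    """Pick one physical port for a frame based on its address pair."""
    flow_key = hash((src_mac, dst_mac))   # same pair -> same port
    return ports[flow_key % len(ports)]

trunk = ["qfe0", "qfe1", "qfe2", "qfe3"]  # four aggregated ports
print(select_port("8:0:20:a:b:c", "8:0:20:1:2:3", trunk))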
Theory of Operation
The Link Aggregation Control Protocol (LACP) allows both ends of the trunk to exchange trunking or link aggregation information. The first command sent is the Query command, with which the link partners discover each other's link aggregation capabilities. If both partners are willing and capable, a Start Group command is sent, indicating that a link aggregation group is to be created, followed by commands that add segments to this group, including link identifiers tied to the ports participating in the aggregation.
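A minimal sketch of this negotiation follows, using the command names from the description above (Query, Start Group, Add Segment). A real LACP implementation carries this state in periodic LACPDUs rather than discrete commands, so treat this as a model of the sequence, not of the wire protocol.

# Model of the aggregation handshake described above. The message
# names follow the text; real LACP encodes partner state in LACPDUs.

class LinkPartner:
    def __init__(self, name, ports, willing=True):
        self.name = name
        self.ports = ports        # physical ports able to aggregate
        self.willing = willing

    def query(self):
        """Query: report whether this partner can and will aggregate."""
        return self.willing and len(self.ports) > 1

def negotiate(local, remote):
    # Both partners must be willing and capable.
    if not (local.query() and remote.query()):
        return None
    group = []                    # Start Group: create the group
    for port in local.ports:     # Add Segment: one link id per port
        group.append((local.name, port))
    return group

client = LinkPartner("client", ["qfe0", "qfe1", "qfe2", "qfe3"])
switch = LinkPartner("switch", ["port1", "port2", "port3", "port4"])
print(negotiate(client, switch))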
LACP can also delete a link, for example when a failed link is detected. Instead of rebalancing the load across the remaining ports, the algorithm simply places the failed link's traffic onto one of the remaining links. The collector reassembles traffic coming from the different links; the distributor takes an input stream and spreads the traffic across the ports belonging to a trunk group or link aggregation group.
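In sketch form, the failover policy just described looks like the following; this is a simplification in which the failed port's entire share is reassigned to a single survivor rather than rebalanced evenly.

# Sketch of the failover policy described above: a failed link's
# whole traffic share moves onto one surviving link instead of being
# redistributed across all survivors.

def fail_link(load, failed):
    """load: dict mapping port name to traffic share; failed: port name."""
    survivors = [p for p in load if p != failed]
    target = survivors[0]               # pick one remaining link
    load[target] += load.pop(failed)    # move the whole share onto it
    return load

shares = {"qfe0": 0.25, "qfe1": 0.25, "qfe2": 0.25, "qfe3": 0.25}
print(fail_link(shares, "qfe0"))        # qfe1 now carries ~0.50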
Availability Issues
To assess its suitability for network availability, Sun Trunking 1.2 software was installed on several quad fast Ethernet cards. The client has four links connected to the switch, and the server also has four links connected to the switch. This setup allows the load to be distributed across the four links, as shown in FIGURE 4.
FIGURE 4 Trunking Failover Test Setup
The following output shows the traffic on the client's qfe0 moving to qfe1 under load balancing (see the qfe1 line in the last sample).
Jan 10 14:22:05 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0    210      0    130      0       0    0  100.00   25.00
qfe1      0      0    130      0       0    0    0.00   25.00
qfe2      0      0    130      0       0    0    0.00   25.00
qfe3      0      0    130      0       0    0    0.00   25.00
(Aggregate Throughput(Mb/sec): 5.73(New Peak) 31.51(Past Peak) 18.18%(New/Past))

Jan 10 14:22:06 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0      0      0      0      0       0    0    0.00    0.00
qfe1      0      0      0      0       0    0    0.00    0.00
qfe2      0      0      0      0       0    0    0.00    0.00
qfe3      0      0      0      0       0    0    0.00    0.00
(Aggregate Throughput(Mb/sec): 0.00(New Peak) 31.51(Past Peak) 0.00%(New/Past))

Jan 10 14:22:07 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0      0      0      0      0       0    0    0.00    0.00
qfe1      0      0      0      0       0    0    0.00    0.00
qfe2      0      0      0      0       0    0    0.00    0.00
qfe3      0      0      0      0       0    0    0.00    0.00
(Aggregate Throughput(Mb/sec): 0.00(New Peak) 31.51(Past Peak) 0.00%(New/Past))

Jan 10 14:22:08 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0      0      0      0      0       0    0    0.00    0.00
qfe1   1028      0   1105      0       0    0  100.00   51.52
qfe2      0      0    520      0       0    0    0.00   24.24
qfe3      0      0    520      0       0    0    0.00   24.24
(Aggregate Throughput(Mb/sec): 23.70(New Peak) 31.51(Past Peak) 75.21%(New/Past))
Several test TCP (TTCP) streams were pumped from one host to the other. When all links were up, the load was balanced evenly and each port carried a 25 percent load. When one link was cut, the traffic of the failed link (qfe0) was transferred onto one of the remaining links (qfe1), which then showed a 51 percent load.
The failover took three seconds. However, if all links are heavily loaded, the algorithm can saturate a link, because one survivor must carry its original load in addition to the failed link's traffic. For example, if all links were running at 55 percent capacity and one link failed, one of the remaining links would be offered 55 percent + 55 percent = 110 percent of its capacity. Link aggregation is suitable for increasing the availability of point-to-point links, where nodes are on the same segment. However, there is a trade-off of port cost on the switch side as well as the host side.
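The arithmetic generalizes: under this failover policy one surviving link is offered twice the per-link utilization, so failover is only lossless when each link runs at 50 percent or less. A quick check of the example above:

# Quick check of the saturation example: one surviving link absorbs
# the failed link's entire share under this failover policy.

u = 0.55               # per-link utilization before the failure
post_failover = u + u  # survivor's own load plus the failed link's
print(post_failover)   # 1.10, i.e. 110 percent: the link saturates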