- Physical Network Topology and Availability
- Layer 2 Availability: Trunking—802.3ad—Link Aggregation
- Layer 2 Trunking Availability Strategies using SMLT and DMLT
- Layer 2 Availability: Spanning Tree Protocol
- Layer 3—VRRP Router Redundancy
- Layer 3—IPMP—Host Network Interface Redundancy
- Layer 3—Integrated VRRP and IPMP
- Layer 3—OSPF Network Redundancy—Rapid Convergence
- Layer 3—RIP Network Redundancy
- About the Authors
Layer 2 Availability: Trunking—802.3ad—Link Aggregation
Link aggregation or trunking increases availability by distributing network traffic over multiple physical links. If one link breaks, the load on the broken link is transferred to the remaining links.
IEEE 802.3ad is an industry standard created to allow the trunking solutions of various vendors to interoperate. Like most standards, its specification can be implemented in many ways. Link aggregation can be thought of as a layer of indirection between the MAC and PHY layers. Instead of one fixed MAC address bound to a physical port, a logical MAC address is exposed to the IP layer and implements the Data Link Provider Interface (DLPI). This logical MAC address can be bound to many physical ports. The remote side must have the same capabilities and use the same algorithm for distributing packets among the physical ports. FIGURE 3 shows a breakdown of the sub-layers.
FIGURE 3 Trunking Software Architecture
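The indirection described above can be sketched as follows. This is a minimal illustration, not the DLPI interface itself; the class, the MAC address, and the port names are hypothetical stand-ins.

```python
# Sketch of link aggregation as a layer of indirection: one logical MAC
# address is exposed upward while frames fan out over several physical
# ports underneath. All names and values here are illustrative.

class Aggregation:
    def __init__(self, logical_mac, ports):
        self.logical_mac = logical_mac   # single MAC the IP layer sees
        self.ports = list(ports)         # physical ports bound underneath

    def transmit(self, frame):
        # Choose a physical port for this frame. The distribution policy
        # (round robin, MAC hash, IP hash) is discussed under Load
        # Sharing Principles; a simple hash stands in for it here.
        return self.ports[hash(frame) % len(self.ports)]

agg = Aggregation("8:0:20:a:b:c", ["qfe0", "qfe1", "qfe2", "qfe3"])
```

Any frame handed to `transmit` lands on exactly one of the bound physical ports, while the IP layer only ever sees the one logical MAC address.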
Theory of Operation
The Link Aggregation Control Protocol (LACP) allows both ends of the trunk to exchange trunking or link aggregation information. The first command sent is the Query command, with which each link partner discovers the other's link aggregation capabilities. If both partners are willing and capable, a Start Group command is sent, indicating that a link aggregation group is to be created. Segments are then added to this group, including link identifiers tied to the ports participating in the aggregation.
LACP can also delete a link, for example when a failed link is detected. Instead of rebalancing the load across the remaining ports, the algorithm simply places the failed link's traffic onto one of the remaining links. The collector reassembles traffic coming from the different links. The distributor takes an input stream and spreads the traffic across the ports belonging to a trunk group or link aggregation group.
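The failover behavior described above, where the failed link's flows all land on a single survivor rather than being rebalanced, can be sketched as follows. The flow and port names are hypothetical.

```python
# Sketch of the failover behavior described in the text: every flow that
# was assigned to the failed port moves to ONE surviving port; flows on
# healthy ports stay where they are.

def fail_link(assignments, failed_port, surviving_ports):
    target = surviving_ports[0]   # single survivor absorbs the traffic
    return {flow: (target if port == failed_port else port)
            for flow, port in assignments.items()}

before = {"flow1": "qfe0", "flow2": "qfe1",
          "flow3": "qfe2", "flow4": "qfe3"}
after = fail_link(before, "qfe0", ["qfe1", "qfe2", "qfe3"])
# flow1 joins flow2 on qfe1; flow3 and flow4 are untouched
```

Note that qfe1 now carries two flows while qfe2 and qfe3 still carry one each, which is exactly the uneven post-failover distribution observed in the test output below.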
To assess its suitability for network availability, Sun Trunking™ 1.2 software was installed on several quad fast Ethernet cards. The client has four links trunked to the switch, and the server likewise has four links to the switch. This setup allows the load to be distributed across the four links, as shown in FIGURE 4.
FIGURE 4 Trunking Failover Test Setup
The qfe1 line in the last interval of the following output shows the traffic from the failed qfe0 link moved to qfe1 under load balancing.
Jan 10 14:22:05 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0    210      0    130      0       0    0  100.00   25.00
qfe1      0      0    130      0       0    0    0.00   25.00
qfe2      0      0    130      0       0    0    0.00   25.00
qfe3      0      0    130      0       0    0    0.00   25.00
(Aggregate Throughput(Mb/sec): 5.73(New Peak) 31.51(Past Peak) 18.18%(New/Past))

Jan 10 14:22:06 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0      0      0      0      0       0    0    0.00    0.00
qfe1      0      0      0      0       0    0    0.00    0.00
qfe2      0      0      0      0       0    0    0.00    0.00
qfe3      0      0      0      0       0    0    0.00    0.00
(Aggregate Throughput(Mb/sec): 0.00(New Peak) 31.51(Past Peak) 0.00%(New/Past))

Jan 10 14:22:07 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0      0      0      0      0       0    0    0.00    0.00
qfe1      0      0      0      0       0    0    0.00    0.00
qfe2      0      0      0      0       0    0    0.00    0.00
qfe3      0      0      0      0       0    0    0.00    0.00
(Aggregate Throughput(Mb/sec): 0.00(New Peak) 31.51(Past Peak) 0.00%(New/Past))

Jan 10 14:22:08 2002
Name  Ipkts  Ierrs  Opkts  Oerrs  Collis  Crc  %Ipkts  %Opkts
qfe0      0      0      0      0       0    0    0.00    0.00
qfe1   1028      0   1105      0       0    0  100.00   51.52
qfe2      0      0    520      0       0    0    0.00   24.24
qfe3      0      0    520      0       0    0    0.00   24.24
(Aggregate Throughput(Mb/sec): 23.70(New Peak) 31.51(Past Peak) 75.21%(New/Past))
Several Test TCP (TTCP) streams were pumped from one host to the other. When all links were up, the load was balanced evenly and each port carried 25 percent of the load. When one link was cut, the traffic of the failed link (qfe0) was transferred onto one of the remaining links (qfe1), which then showed a 51 percent load.
The failover took three seconds. However, if all links were heavily loaded, the algorithm might saturate one link with its original load plus the failed link's traffic. For example, if all links were running at 55 percent capacity and one link failed, one link would be offered 55 percent + 55 percent = 110 percent of its capacity. Link aggregation is suitable for point-to-point links for increased availability, where nodes are on the same segment. However, there is a trade-off of port cost on the switch side as well as the host side.
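The worst-case arithmetic above can be made concrete, and contrasted with a hypothetical even rebalancing that the tested implementation does not perform:

```python
# Post-failover load arithmetic from the text. Loads are fractions of
# link capacity (0.55 = 55 percent).

def absorbed_load(per_link_load):
    # Actual behavior: one surviving link takes its own load plus the
    # failed link's entire load.
    return per_link_load + per_link_load

def rebalanced_load(per_link_load, total_links):
    # Hypothetical alternative (NOT what the tested implementation
    # does): the failed link's load is spread evenly over the
    # remaining total_links - 1 survivors.
    return per_link_load + per_link_load / (total_links - 1)

print(absorbed_load(0.55))        # 1.10 -> the absorbing link saturates
print(rebalanced_load(0.55, 4))   # ~0.733 -> no link would saturate
```

With four links at 55 percent, dumping the load wholesale drives one link to 110 percent, while even rebalancing would leave every survivor at roughly 73 percent.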
Load Sharing Principles
The trunking layer breaks up packets on a frame boundary. This means as long as the server and switch know that a trunk is spanning certain physical ports, neither side needs to know which algorithm is being used to distribute the load across the trunked ports. However, it is important to understand the traffic characteristics to distribute the load optimally across the trunked ports. The following diagrams describe how to configure load sharing across trunks, based on the nature of the traffic, which is often asymmetric.
FIGURE 5 Correct Trunking Policy on Switch
FIGURE 5 shows a correct trunking policy configured on a typical network switch. The incoming, or ingress, network traffic is expected to originate from multiple distinct source nodes, so the flows will have distributed source IP and source MAC addresses. This lends itself nicely to a trunking policy that distributes load based on the following algorithms:
- Round Robin: trivial case
- Source MAC/Destination MAC: used for this particular traffic
- Source IP/Destination IP: used for this particular traffic
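The three policies above can be sketched as follows. Real switches compute a deterministic hash over the frame header fields; Python's built-in `hash()` stands in for that here, and the port names are illustrative.

```python
# Illustrative port-selection policies for a four-port trunk. Python's
# hash() substitutes for the deterministic header hash real hardware uses.

PORTS = ["qfe0", "qfe1", "qfe2", "qfe3"]

def policy_src_dst_mac(src_mac, dst_mac, ports=PORTS):
    # Source MAC / Destination MAC policy: hash both addresses, so
    # distinct MAC pairs spread across the ports.
    return ports[hash((src_mac, dst_mac)) % len(ports)]

def policy_src_dst_ip(src_ip, dst_ip, ports=PORTS):
    # Source IP / Destination IP policy: same idea at layer 3.
    return ports[hash((src_ip, dst_ip)) % len(ports)]

def policy_round_robin(ports=PORTS):
    # Round robin: cycle through the ports regardless of frame contents.
    while True:
        for port in ports:
            yield port
```

A given flow always hashes to the same port under the MAC and IP policies, which keeps its frames in order; round robin ignores the headers entirely.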
FIGURE 6 Incorrect Trunking Policy on Switch
FIGURE 6 shows an incorrect trunking policy on a switch. In this case the ingress traffic has a single target IP address and a single target MAC address, so it should not use a trunking policy based solely on the destination IP address or destination MAC address.
FIGURE 7 Correct Trunking Policy on Server
FIGURE 7 shows a correct trunking policy on a server. The egress traffic has distributed target IP addresses but a single target MAC address, that of the default router, so it should use a trunking policy based only on round robin or on the destination IP address. A destination MAC policy will not work because the destination MAC always points to the default router (:0:0:8:8:1), not the actual client MAC.
FIGURE 8 Incorrect Trunking Policy on a Server
FIGURE 8 shows an incorrect trunking policy on a server. In this example the egress traffic has distributed target IP addresses, but every frame carries the target MAC of the default router, so a trunking policy based on the destination MAC must not be used: the destination MAC always points to the default router (:0:0:8:8:1), not the actual client MAC. A policy based on the source IP address or source MAC address is equally ineffective, because those fields are the same for all egress traffic from the server.
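The failure mode above is easy to demonstrate: when every egress frame carries the router's MAC as its destination, a destination-MAC hash maps all traffic to one port. This sketch reuses the router MAC from the text; the port names are illustrative.

```python
# Why a destination-MAC policy fails for server egress traffic: the
# destination MAC is the default router's for every frame, so the hash
# always selects the same physical port regardless of the client.

ports = ["qfe0", "qfe1", "qfe2", "qfe3"]
router_mac = "0:0:8:8:1"   # default router MAC from the text

def dst_mac_port(dst_mac):
    return ports[hash(dst_mac) % len(ports)]

# Simulate 1000 egress frames to different clients: the destination MAC
# never varies, so exactly one port is ever chosen.
chosen = {dst_mac_port(router_mac) for _ in range(1000)}
```

The set `chosen` ends up with a single element, meaning the other three trunked ports carry no egress traffic at all.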
FIGURE 9 Incorrect Trunking Policy on a Server
FIGURE 9 shows an incorrect trunking policy on a server. Even though the egress traffic uses round robin, the load is not distributed evenly because all the traffic belongs to the same session. In this case, trunking is not effective in distributing load across physical interfaces.