Ethernet Physical Layer Troubleshooting

At the physical layer, failures can prevent the link from coming up. Or worse, the link comes up and the duplex is mismatched, giving rise to less visible problems. The key tool for looking at the physical layer is the kstat command. Refer to the kstat man page for more information about this useful tool.

The following table lists the general Ethernet MII/GMII kernel statistics and describes their meaning.

TABLE 1 Physical Layer Configuration Properties and Kernel Statistics

| Statistic | Values | Description |
|---|---|---|
| cap_autoneg, lp_cap_autoneg, adv_cap_autoneg | 0-1 | Local interface, advertised, and link partner capability. 0 = Forced mode; 1 = Auto-negotiation. |
| cap_1000fdx, lp_cap_1000fdx, adv_cap_1000fdx | 0-1 | Local interface, advertised, and link partner capability. 0 = Not 1000 Mbit/sec full-duplex capable; 1 = 1000 Mbit/sec full-duplex capable. |
| cap_1000hdx, lp_cap_1000hdx, adv_cap_1000hdx | 0-1 | Local interface, advertised, and link partner capability. 0 = Not 1000 Mbit/sec half-duplex capable; 1 = 1000 Mbit/sec half-duplex capable. |
| cap_100T4, lp_cap_100T4, adv_cap_100T4 | 0-1 | Local interface, advertised, and link partner capability. 0 = Not 100-T4 capable; 1 = 100-T4 capable. |
| cap_100fdx, lp_cap_100fdx, adv_cap_100fdx | 0-1 | Local interface, advertised, and link partner capability. 0 = Not 100 Mbit/sec full-duplex capable; 1 = 100 Mbit/sec full-duplex capable. |
| cap_100hdx, lp_cap_100hdx, adv_cap_100hdx | 0-1 | Local interface, advertised, and link partner capability. 0 = Not 100 Mbit/sec half-duplex capable; 1 = 100 Mbit/sec half-duplex capable. |
| cap_10fdx, lp_cap_10fdx, adv_cap_10fdx | 0-1 | Local interface, advertised, and link partner capability. 0 = Not 10 Mbit/sec full-duplex capable; 1 = 10 Mbit/sec full-duplex capable. |
| cap_10hdx, lp_cap_10hdx, adv_cap_10hdx | 0-1 | Local interface, advertised, and link partner capability. 0 = Not 10 Mbit/sec half-duplex capable; 1 = 10 Mbit/sec half-duplex capable. |
| cap_asmpause, lp_cap_asmpause, adv_cap_asmpause | 0-1 | Local interface, advertised, and link partner capability for asymmetric pause Ethernet flow control. |
| cap_pause, lp_cap_pause, adv_cap_pause | 0-1 | Local interface, advertised, and link partner capability for symmetric pause Ethernet flow control, when set to 1 and *cap_asmpause is 0. The meaning differs when *cap_asmpause is 1: *cap_pause = 0 means transmit pauses based on receive congestion; *cap_pause = 1 means receive pauses and slow down transmit to avoid congestion. |
| link_asmpause | 0-1 | Shared link asymmetric pause setting. The value is based on the local resolution column of Table 37-4 of the IEEE 802.3 specification. 0 = Link is symmetric pause; 1 = Link is asymmetric pause. |
| link_pause | 0-1 | Shared link pause setting. The value is based on the same local resolution as link_asmpause. If link_asmpause = 0: 0 = the link has no flow control; 1 = the link can flow control in both directions. If link_asmpause = 1: 0 = local flow control setting can limit link partner; 1 = link will flow control local Tx. |
| link_speed | state | The current speed of the network connection in megabits per second. |
| link_duplex | 0-2 | Link duplex. 0 = link is down and duplex is unknown; 1 = link is up in half-duplex mode; 2 = link is up in full-duplex mode. |
| link_up | 0-1 | Whether the link is up or down. 1 = link is up; 0 = link is down. |
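The link_asmpause and link_pause combinations in Table 1 can be turned into a small lookup. The following is a minimal sketch, not part of any Solaris tool: the function name describe_pause is hypothetical, and it only restates the four combinations listed in the table.

```shell
# describe_pause LINK_ASMPAUSE LINK_PAUSE
# Hypothetical helper: maps the two kstat values to the meaning given in Table 1.
describe_pause() {
    case "$1$2" in
        00) echo "no flow control" ;;
        01) echo "flow control in both directions" ;;
        10) echo "local flow control setting can limit the link partner" ;;
        11) echo "link will flow control local Tx" ;;
        *)  echo "unknown pause setting" >&2; return 1 ;;
    esac
}
```

For the link shown later in this article (link_asmpause 0, link_pause 0), `describe_pause 0 0` reports that the link has no flow control.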


The following example shows a system with ce interfaces present so that some physical layer troubleshooting capabilities can be demonstrated.

hostname# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
    inet 127.0.0.1 netmask ff000000
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
    inet 67.123.171.236 netmask ffffffe0 broadcast 67.123.171.255
    ether 8:0:20:b1:29:20
ce1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
    inet 67.123.181.236 netmask ffffffe0 broadcast 67.123.181.255
    ether 8:0:20:b1:29:21

The first step in checking the physical layer is to check if the link is up.

hostname# kstat ce:0 | grep link_
link_asmpause    0
link_duplex      2
link_pause       0
link_speed       1000
link_up          1
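This first check is easy to script. The following is a minimal sketch under stated assumptions: the function name check_link and its expected-speed argument are hypothetical, and the input is assumed to be kstat-style "name value" lines like those shown above, supplied on standard input.

```shell
# check_link EXPECTED_SPEED
# Reads "name value" statistic lines on stdin (for example from kstat).
# Returns 0 if the link is up at the expected speed, 1 if the link is down,
# and 2 if the link is up at a different speed.
check_link() {
    awk -v want="$1" '
        $1 == "link_up"     { up = $2 }
        $1 == "link_speed"  { speed = $2 }
        $1 == "link_duplex" { duplex = $2 }
        END {
            if (up != 1) { print "link down"; exit 1 }
            # link_duplex: 1 = half, 2 = full, anything else = unknown
            d = (duplex == 2) ? "full" : ((duplex == 1) ? "half" : "unknown")
            printf "link up: %s Mbit/sec, %s duplex\n", speed, d
            if (speed != want) { print "speed mismatch: expected " want; exit 2 }
        }
    '
}
```

On a live system this might be invoked as `kstat ce:0 | check_link 1000`.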

If link_up is 1 and a physical connection is present, the link has been established. Also confirm that the speed matches your expectation. For example, if the interface is 1000BASE-T and you expect the link to run at 1000 Mbit/sec, then link_speed should be 1000. If it is not, check the link partner capabilities to see whether they are the limiting factor. The following kstat command line shows them:

hostname# kstat ce:0 | grep lp_cap
lp_cap_1000fdx     1
lp_cap_1000hdx     1
lp_cap_100T4       1
lp_cap_100fdx      1
lp_cap_100hdx      1
lp_cap_10fdx       1
lp_cap_10hdx       1
lp_cap_asmpause    0
lp_cap_autoneg     1
lp_cap_pause       0

If the link partner appears to be capable of all the desired speeds, then the problem might be local. There are two possibilities: either the NIC itself is not capable of the desired speed, or the configuration has no shared capabilities that can be agreed on. In either case, the link will not come up. You can check this using the following kstat command line.

hostname# kstat ce:0 | grep cap_
cap_1000fdx     1
cap_1000hdx     1
cap_100T4       1
cap_100fdx      1
cap_100hdx      1
cap_10fdx       1
cap_10hdx       1
cap_asmpause    0
cap_autoneg     1
cap_pause       0
.....

If all the required capabilities are available for the desired speed and duplex, yet there remains a problem with achieving the desired speed, the only remaining possibility is an incorrect configuration. You can check this by looking at individual ndd adv_cap_* parameters, or you can use the kstat command:

hostname# kstat ce:0 | grep adv_cap_
adv_cap_1000fdx     1
adv_cap_1000hdx     1
adv_cap_100T4       1
adv_cap_100fdx      1
adv_cap_100hdx      1
adv_cap_10fdx       1
adv_cap_10hdx       1
adv_cap_asmpause    0
adv_cap_autoneg     1
adv_cap_pause       0

Most problems are configuration problems. You can address them by using the kstat commands above to establish the local and remote configuration, and then adjusting the adv_cap_* parameters with ndd to correct the problem.
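Comparing the advertised and link partner capabilities by eye is error prone, so the comparison can be sketched as a small filter. This is an illustration only: the function name best_common_mode is hypothetical, and the priority order it uses (1000fdx, 1000hdx, 100fdx, 100T4, 100hdx, 10fdx, 10hdx) is an assumption based on the usual IEEE 802.3 auto-negotiation priority resolution rather than something stated in this article.

```shell
# best_common_mode
# Reads kstat-style "name value" lines on stdin and prints the highest-priority
# mode that is set to 1 in both adv_cap_* and lp_cap_*, or "none".
best_common_mode() {
    awk '
        $1 ~ /^adv_cap_/ { adv[substr($1, 9)] = $2 }   # strip "adv_cap_"
        $1 ~ /^lp_cap_/  { lp[substr($1, 8)] = $2 }    # strip "lp_cap_"
        END {
            # Assumed priority order, highest first.
            n = split("1000fdx 1000hdx 100fdx 100T4 100hdx 10fdx 10hdx", order, " ")
            for (i = 1; i <= n; i++) {
                m = order[i]
                if (adv[m] == 1 && lp[m] == 1) { print m; exit 0 }
            }
            print "none"
            exit 1
        }
    '
}
```

On a live system this might be invoked as `kstat ce:0 | best_common_mode`. A result of "none" suggests that the two ends share no capability and the link cannot come up.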

The most common configuration problem is duplex mismatch, which arises when one side of a link has auto-negotiation enabled and the other does not. Operating with auto-negotiation disabled is known as Forced mode, and it can be guaranteed to work only for 10/100 operation. For 1000BASE-T UTP operation, Forced mode is not guaranteed because not all vendors support it.

If Auto-negotiation is turned off on one end, you must ensure that the other end of the connection is also in Forced mode, and that the speed and duplex match exactly.

If the modes are not matched in gigabit operation, the link will not come up at all. This result is quite different from the 10/100 case. In 10/100 operation, if only one end of the connection is auto-negotiating (with full capabilities advertised), the link comes up at the correct speed, but the auto-negotiating end always sets itself to half duplex. This creates a duplex mismatch if the forced end is set to full duplex.

If both sides are set to Forced mode and you fail to match speeds, the link will never come up.

If both sides are set to Forced mode and you fail to match duplex, the link will come up, but you will have a duplex mismatch.
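The mode combinations described above can be summarized as a decision procedure. The sketch below is only a restatement of those rules: the function name predict_link and the "auto" / "forced:SPEED:DUPLEX" argument format are hypothetical, and an auto-negotiating end is assumed to advertise all capabilities.

```shell
# predict_link END_A END_B
# Each end is "auto" or "forced:SPEED:DUPLEX", for example "forced:100:full".
predict_link() {
    a=$1 b=$2
    # Normalize so that if exactly one end is auto, it is $a.
    if [ "$a" != auto ] && [ "$b" = auto ]; then
        tmp=$a; a=$b; b=$tmp
    fi
    if [ "$a" = auto ] && [ "$b" = auto ]; then
        echo "link up: negotiated speed and duplex"
        return
    fi
    # $b is forced from here on: split "forced:SPEED:DUPLEX".
    b_rest=${b#forced:}; b_speed=${b_rest%%:*}; b_duplex=${b_rest#*:}
    if [ "$a" = auto ]; then
        if [ "$b_speed" = 1000 ]; then
            echo "link down: gigabit needs matching modes"
        elif [ "$b_duplex" = full ]; then
            echo "link up at $b_speed: duplex mismatch (auto end falls back to half)"
        else
            echo "link up at $b_speed: half duplex on both ends"
        fi
        return
    fi
    # Both ends forced: speed must match for link, duplex must match for health.
    a_rest=${a#forced:}; a_speed=${a_rest%%:*}; a_duplex=${a_rest#*:}
    if [ "$a_speed" != "$b_speed" ]; then
        echo "link down: forced speeds differ"
    elif [ "$a_duplex" != "$b_duplex" ]; then
        echo "link up at $a_speed: duplex mismatch"
    else
        echo "link up at $a_speed: $a_duplex duplex on both ends"
    fi
}
```

For example, `predict_link auto forced:100:full` reports a duplex mismatch, which is the classic case of a switch port forced to 100 Mbit/sec full duplex facing a host left at its defaults.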

Duplex mismatch is a silent failure. From an upper-layer point of view it shows up as very poor performance, because many packets are lost to collisions and late collisions on the half-duplex end of the connection. These are caused by the full-duplex end, which violates the half-duplex Ethernet protocol by transmitting whenever it likes.

The half-duplex end experiences collisions and late collisions, while the full-duplex end receives all manner of damaged packets, which increments the MIB counters for CRC errors, runts, giants, and alignment errors.

If the node experiencing poor performance is the half duplex end of the connection, you can look at the kstat values for collisions and late_collisions.

hostname# kstat ce:0 | grep collisions
collisions         22332
late_collisions    15432

If the node experiencing poor performance is the full duplex end of the connection, you can look at the packet corruption counters, for example, crc_err, alignment_err.

hostname# kstat ce:0 | grep crc_err
crc_err          22332
hostname# kstat ce:0 | grep alignment_err
alignment_err    224532
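The two checks above can be combined into one filter that looks for either signature. This is a rough sketch: the function name mismatch_hint is hypothetical, and non-zero counters are only a hint, because CRC and alignment errors can also come from cabling faults.

```shell
# mismatch_hint
# Reads kstat-style "name value" lines on stdin and reports which duplex
# mismatch signature, if any, the counters show.
mismatch_hint() {
    awk '
        $1 == "link_duplex"     { duplex = $2 }
        $1 == "late_collisions" { late = $2 }
        $1 == "crc_err"         { crc = $2 }
        $1 == "alignment_err"   { align = $2 }
        END {
            if (duplex == 1 && late > 0) {
                print "half-duplex end with late collisions: possible duplex mismatch"
            } else if (duplex == 2 && (crc > 0 || align > 0)) {
                print "full-duplex end with CRC/alignment errors: possible duplex mismatch"
            } else {
                print "no duplex mismatch signature found"
            }
        }
    '
}
```

On a live system this might be invoked as `kstat ce:0 | mismatch_hint`.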

Depending on the capabilities of the switch, or whatever device is at the remote end of the connection, you may be able to take similar measurements there.

Besides creating the potential for a duplex mismatch, Forced mode has the drawback of hiding the link partner capabilities from the local station. In Forced mode, you cannot view the lp_cap* values to determine the capabilities of the remote link partner locally.

Where possible, use the default of Auto-negotiation with all capabilities advertised and avoid tuning the physical link parameters.

Given the maturity of the Auto-negotiation protocol, and the fact that the IEEE 802.3ab specification requires it for one-gigabit UTP physical implementations, ensure that Auto-negotiation is enabled.

Deviation from General Ethernet MII/GMII Conventions

There remains some deviation from the general Ethernet MII/GMII kernel statistics which we must address.

In the case of the ge interface, the statistics for local capabilities and link partner capabilities are all read-only ndd properties, so they cannot be read using the kstat command as described previously. The same troubleshooting approach still applies, however.

To read the corresponding lp_cap_* using ge, use the following commands:

hostname# ndd -set /dev/ge instance 0
hostname# ndd -get /dev/ge lp_1000fdx_cap
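Reading several ge parameters follows the same two-step pattern of selecting the instance and then getting each value. The following is a minimal sketch: the function name ge_get and the NDD override variable are hypothetical conveniences, and only the ndd invocations shown above are assumed to exist.

```shell
# ge_get INSTANCE PARAM...
# Selects a ge instance, then prints "PARAM = VALUE" for each parameter.
# NDD may be set to a stand-in command for dry runs; it defaults to ndd.
ge_get() {
    ndd_cmd=${NDD:-ndd}
    instance=$1; shift
    "$ndd_cmd" -set /dev/ge instance "$instance" || return 1
    for param in "$@"; do
        printf '%s = ' "$param"
        "$ndd_cmd" -get /dev/ge "$param"
    done
}
```

For example, `ge_get 0 lp_1000fdx_cap` would run the two commands shown above and label the result.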

Alternatively, you can use the ndd interactive mode, described previously. The mechanism for enabling Ethernet flow control on the ge interface is also different. It uses the parameters in the following table.

TABLE 2 Physical Layer Configuration Properties

| Statistic | Values | Description |
|---|---|---|
| adv_pauseTX | 0-1 | Transmit Pause if the Rx buffer is full. |
| adv_pauseRX | 0-1 | When you receive a pause, slow down Tx. |


The ge interface also deviates in how ndd parameters are adjusted. For example, when you modify ndd parameters such as adv_1000fdx_cap, the changes do not take effect until the adv_autoneg_cap parameter is toggled to change state (from 0 to 1 or from 1 to 0). This deviates from the general Ethernet MII/GMII convention that ndd changes take effect immediately.
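The set-then-toggle sequence on ge can be wrapped in a helper. The sketch below rests on several assumptions: the function name ge_apply and the NDD override variable are hypothetical, and flipping adv_autoneg_cap and then restoring its original value is assumed to count as the required change of state. Each flip is likely to renegotiate the link, so expect a brief interruption.

```shell
# ge_apply INSTANCE PARAM VALUE
# Sets one ge parameter, then toggles adv_autoneg_cap and restores it
# so that the new value takes effect.
# NDD may be set to a stand-in command for dry runs; it defaults to ndd.
ge_apply() {
    ndd_cmd=${NDD:-ndd}
    "$ndd_cmd" -set /dev/ge instance "$1" || return 1
    "$ndd_cmd" -set /dev/ge "$2" "$3" || return 1
    # Read the current auto-negotiation setting so it can be restored.
    current=$("$ndd_cmd" -get /dev/ge adv_autoneg_cap) || return 1
    if [ "$current" = 1 ]; then flipped=0; else flipped=1; fi
    "$ndd_cmd" -set /dev/ge adv_autoneg_cap "$flipped" || return 1
    "$ndd_cmd" -set /dev/ge adv_autoneg_cap "$current"
}
```

For example, `ge_apply 0 adv_1000fdx_cap 0` would stop advertising 1000 Mbit/sec full duplex on ge0 and leave auto-negotiation in its original state.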
