1.3 Line Coding of Digital Signals
When binary data is sent through a link, it is represented by a physical quantity in the transport medium. In electrical links, that's usually a voltage or current; optical systems use the intensity of light; and wireless radio links often use the phase and frequency of a signal carrier. Line coding determines how the binary data is represented on the link.
Numerous coding schemes are available, and which one is best for any given application depends on many factors. Coding can influence the frequency spectrum, the direct current content, and the transition density of the resulting data stream. Coding efficiency determines the required link bandwidth, and the cost of implementation depends on the complexity of the code.
1.3.1 Properties of Binary Data
1.3.1.1 Mark Density
The mark density (MD) of a binary data pattern is defined as the number of one bits in the pattern, divided by the length of the pattern:
$$\mathrm{MD} = \frac{N_{\mathrm{One}}}{N_{\mathrm{One}} + N_{\mathrm{Zero}}}$$
where $N_{\mathrm{One}}$ is the number of ones in the pattern, and $N_{\mathrm{Zero}}$ is the number of zeros. The mark density ranges from 0.0 to 1.0, where the extremes are marked by all-zeros data ($N_{\mathrm{One}} = 0$) and all-ones data ($N_{\mathrm{Zero}} = 0$). Random data sits exactly in the middle of the range: on average, it contains as many one bits as zero bits, and its long-term mark density is therefore 0.5. If we look only at a short subsection of the random data pattern, however, its mark density can be very different.
If we represent a zero bit by 0.0 and a one bit by 1.0, the mark density is equal to the time average over the pattern and is thus a direct measure of the DC content of the signal. A pattern with a mark density of 0.5 is therefore also called a DC-balanced pattern. DC balance is an important property in some applications: if the link has to maintain a DC level, then amplifiers and other system components need to be DC coupled, which often leads to a more complicated and problematic design.
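As a minimal sketch of this definition, the following Python function computes the mark density of a pattern given as a list of 0/1 values. The function name and the 15-bit example pattern are illustrative assumptions; they are not taken from the figures in this chapter.

```python
def mark_density(bits):
    """Return N_One / (N_One + N_Zero) for a sequence of 0/1 values."""
    if not bits:
        raise ValueError("pattern must not be empty")
    return sum(bits) / len(bits)


# Arbitrary 15-bit example pattern (an assumption, not taken from a figure).
pattern = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]

print(mark_density(pattern))        # whole pattern: 8 ones in 15 bits
print(mark_density(pattern[1:4]))   # window "000": mark density 0.0
print(mark_density(pattern[12:]))   # window "111": mark density 1.0
```

The last two calls illustrate how strongly a short window can deviate from the mark density of the whole pattern.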
1.3.1.2 Transition Density
The transition density (TD) of a data pattern is defined as the number of transitions in the pattern, divided by the length of the pattern:
$$\mathrm{TD} = \frac{N_{\mathrm{T}}}{N_{\mathrm{One}} + N_{\mathrm{Zero}}}$$
where $N_{\mathrm{T}}$ is the number of transitions in the pattern, $N_{\mathrm{One}}$ is the number of ones, and $N_{\mathrm{Zero}}$ is the number of zeros. The transition density ranges from 0.0 to 1.0, where the extremes are marked by static patterns (all-zeros or all-ones) and toggle patterns. Random data is again exactly in the middle of the range: because the probability that two consecutive bits are identical is 0.5, the transition density is 0.5, too.
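The definition can be sketched in a few lines of Python. The function name and the `cyclic` parameter are assumptions made for illustration: a finite toggle burst of n bits has only n - 1 transitions, so it reaches a transition density of exactly 1.0 only if the pattern is treated as repeating.

```python
def transition_density(bits, cyclic=False):
    """Return N_T / (N_One + N_Zero) for a sequence of 0/1 values.

    With cyclic=True the pattern is treated as repeating, so the change
    from the last bit back to the first one is counted as well.
    """
    if not bits:
        raise ValueError("pattern must not be empty")
    pairs = list(zip(bits, bits[1:]))
    if cyclic:
        pairs.append((bits[-1], bits[0]))
    transitions = sum(a != b for a, b in pairs)
    return transitions / len(bits)


print(transition_density([0, 0, 0, 0]))               # static pattern
print(transition_density([1, 0, 1, 0], cyclic=True))  # repeating toggle pattern
print(transition_density([1, 0, 1, 0]))               # isolated 4-bit toggle burst
```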
1.3.1.3 Run Length Distribution
The run length distribution of a data pattern gives the relative probabilities for runs of identical consecutive bits. Longer runs create stress in many applications, because of either excessive intersymbol interference (ISI) or baseline wander due to local disparity.
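One way to obtain a run length distribution is to split the pattern into runs of identical bits and count how often each run length occurs. The sketch below does this with the Python standard library; the function name and the choice to normalize by the number of runs (rather than by the number of bits) are assumptions.

```python
from collections import Counter
from itertools import groupby


def run_length_distribution(bits):
    """Map each run length to its relative frequency among all runs."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    counts = Counter(runs)
    total = len(runs)
    return {length: counts[length] / total for length in sorted(counts)}


# Runs in this example: 11, 0, 11, 000 -> lengths 2, 1, 2, 3
print(run_length_distribution([1, 1, 0, 1, 1, 0, 0, 0]))
```

The largest key in the returned dictionary is the maximum run length, which is the figure of interest when assessing baseline wander or ISI stress.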
1.3.2 Binary Line Codes
1.3.2.1 Non-Return-to-Zero Code
The non-return-to-zero (NRZ) format is the prototypical representation of binary data: A logical zero state is transmitted as one signal level, and a logical one state as another level. Levels change at bit boundaries only if the bit value changes and remain stable for the entire duration of the bit period. If the level representing the zero logical bit state is lower than the level for the one state, we call this positive logic, and the respective levels are then called low level and high level. NRZ coding is essentially free because binary data is already stored in this format in CPUs and other digital devices. It is therefore the most commonly used coding scheme and the reference for all other coding schemes in terms of signal properties, efficiency, and implementation effort.
NRZ signals always have a clock signal associated with them, even if it is not transmitted along with the data. Figure 1-15 shows the NRZ representation of a short data sequence, together with a clock signal. Note how the data signal changes on the falling edge of the clock; the receiver samples it on the rising edge. There are also systems that work with an inverted clock. The data then changes on the rising edge, and the receiver samples at the falling clock edge. The clock signal for NRZ transmission usually runs at the base frequency of the data: for a 10 Gbit/s signal, the clock rate is 10 GHz (single data rate, SDR). A variant of NRZ transmission uses a clock signal at half rate (5 GHz for 10 Gbit/s), and the receiver samples the data both at the rising and falling edges of the clock. This is called double data rate (DDR) transmission.
Figure 1-15 NRZ coding of a short data sequence (PRBS 2^4-1). Top: single data rate clock. Bottom: double data rate clock.
The properties of NRZ-formatted data depend entirely on the data itself: the DC content, frequency spectrum, and transition density all follow the data sequence, which is the main drawback of NRZ coding. Long runs of zeros or ones cause problems in some applications because of effects such as baseline wander and ISI, or because there are not enough transitions for clock data recovery.
Figure 1-16 shows the power spectral densities of two short NRZ-formatted data sequences. Note how both spectra have zero power at multiples of the signal base rate (e.g., 1 GHz, 2 GHz, 3 GHz). The PRBS spectrum follows the typical sinc envelope, with nulls at multiples of the data rate. Because of the very fast rise times that we used to create the spectrum, there is significant spectral content at very high frequencies. The spectrum for the toggle pattern equals that of a 500 MHz square wave. The spectra of all-zeros or all-ones patterns are zero, with the exception of a DC value.
Figure 1-16 Power spectral density for NRZ-formatted data at 1 Gbit/s. Left: PRBS 2^4-1. Right: Toggle pattern (101010 . . .). Power density is normalized to a maximum power of 1.0.
1.3.2.2 Return-to-Zero Code
The return-to-zero (RZ) code represents the zero logical state as a static low level and the one state as a short high-level pulse. The signal always returns to the level representing a zero state immediately after the high level, hence the name. RZ signals can be easily created from NRZ signals, by a binary AND of the NRZ and a clock. The width of the pulses depends on the duty cycle of the clock. Figure 1-17 shows the RZ representation of a short data sequence, with 50% and 25% duty cycles.
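The AND operation described above can be sketched on sampled waveforms. In this illustration every bit period is divided into `samples_per_bit` samples, and the clock is assumed to be high at the start of each bit period for a fraction `duty` of the period; these names and the clock phase are assumptions, not part of the original description.

```python
def nrz_samples(bits, samples_per_bit=4):
    """Hold each bit value for a full bit period."""
    return [bit for bit in bits for _ in range(samples_per_bit)]


def clock_samples(n_bits, samples_per_bit=4, duty=0.5):
    """One clock period per bit, high for the first `duty` fraction."""
    high = round(samples_per_bit * duty)
    one_period = [1] * high + [0] * (samples_per_bit - high)
    return one_period * n_bits


def rz_encode(bits, samples_per_bit=4, duty=0.5):
    """RZ waveform as the sample-wise AND of NRZ data and clock."""
    nrz = nrz_samples(bits, samples_per_bit)
    clk = clock_samples(len(bits), samples_per_bit, duty)
    return [d & c for d, c in zip(nrz, clk)]


print(rz_encode([1, 0, 1, 1], duty=0.5))
print(rz_encode([1, 0, 1, 1], duty=0.25))
```

With four samples per bit, a 50% duty cycle yields pulses that are two samples wide, and a 25% duty cycle yields pulses that are one sample wide.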
Figure 1-17 RZ coding of a short data sequence (PRBS 2^4-1). Top: 50% duty cycle. Bottom: 25% duty cycle.
RZ coding is used primarily in optical transmission systems because it minimizes power consumption and the effects of system dispersion on optical signal distortion. Every one bit produces its own pulse, even within a run of consecutive ones, so clock data recovery is fairly easy with this coding, provided the data does not contain long runs of zeros. The signals also carry significant DC content, although this is not a concern in optical links.
The signal bandwidth of RZ-coded data is significantly higher than that of NRZ data, by at least a factor of two (for a 50% duty cycle). The spectral densities for the RZ-coded signals from Figure 1-17 are shown in Figure 1-18. The signal with a 50% duty cycle has significantly less energy at lower frequencies than the NRZ signal and very distinct spikes at the data rate and its odd multiples. The 25% duty cycle signal has even less low-frequency content but distinct spikes at all integer multiples of the data rate.
Figure 1-18 Power spectral density for a short RZ-formatted data sequence (PRBS 2^4-1), at 1 Gbit/s. Left: 50% duty cycle. Right: 25% duty cycle. Power density is normalized for comparison with NRZ format (dotted line).
1.3.2.3 Return-to-One Code
Return-to-one (R1) code uses a static high level for the logical one state and a short low-level pulse for a zero. Creating an R1-formatted signal from NRZ data is a bit more complicated than for the RZ format: the inverted NRZ data is combined with the clock by a binary AND, and the result is inverted again. Figure 1-19 shows an example. The properties of R1-coded data are very similar to those of RZ-coded data, with the exception of the DC content, which is significantly higher than for RZ-coded signals.
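The invert, AND, invert sequence can be sketched on sampled waveforms as follows. The sampling model (four samples per bit, clock high during the first part of each bit period) and all names are assumptions made for illustration.

```python
def r1_encode(bits, samples_per_bit=4, duty=0.5):
    """R1 waveform: NOT(NOT(data) AND clock), evaluated sample by sample."""
    high = round(samples_per_bit * duty)
    clock = [1] * high + [0] * (samples_per_bit - high)
    out = []
    for bit in bits:
        inverted = 1 - bit
        out.extend(1 - (inverted & c) for c in clock)
    return out


data = [1, 0, 1, 1, 0, 0]      # arbitrary example with mark density 0.5
r1 = r1_encode(data)
print(r1)
print(sum(r1) / len(r1))       # average level of the R1 waveform
```

For this balanced example the average level is 0.75, whereas the corresponding RZ waveform with a 50% duty cycle would average 0.25, which illustrates the higher DC content of R1.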
Figure 1-19 R1 coding of a short data sequence (PRBS 2^4-1)
1.3.2.4 Manchester Code
Manchester code is generated from NRZ data by a binary XOR with a clock signal. Since there are two possible clock phases, there are also two variants of Manchester code. The coded data has a transition in the middle of every bit, and the direction of this transition indicates a binary zero or one. The original Manchester variant uses a falling edge for a one and a rising edge for a zero; the other variant (which is used in IEEE 802.3 10Base-T Ethernet, for example) is the exact inverse. Figure 1-20 shows both variants.
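The XOR construction and the two clock phases can be sketched with two half-bit samples per bit. The function names and the `ieee_802_3` flag are assumptions for illustration; the flag selects the variant in which a one is sent as a rising edge.

```python
def manchester_encode(bits, ieee_802_3=False):
    """XOR each bit with one clock period, given as two half-bit samples.

    Default (original variant): one -> (1, 0) falling edge, zero -> (0, 1).
    ieee_802_3=True:            one -> (0, 1) rising edge,  zero -> (1, 0).
    """
    clock = (1, 0) if ieee_802_3 else (0, 1)
    return [bit ^ c for bit in bits for c in clock]


def manchester_decode(halves, ieee_802_3=False):
    """Recover the bits from the first half of every bit period."""
    first_halves = halves[0::2]
    return [h ^ 1 if ieee_802_3 else h for h in first_halves]


data = [1, 0, 0, 1, 1]
print(manchester_encode(data))
print(manchester_encode(data, ieee_802_3=True))
print(manchester_encode([1, 1, 1, 1]))   # constant ones become a square wave
```

Every encoded bit consists of one high and one low half, which is why the code is DC balanced regardless of the data.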
Figure 1-20 Manchester code representation of a short data sequence (PRBS 2^4-1). Top: "10" variant. Bottom: "01" variant.
Manchester code is very attractive for embedded clock applications because it forces at least one transition per bit, even if the data is a constant zero or one. It is also a DC-balanced code. However, the price for this is a significantly higher bandwidth relative to NRZ data. Figure 1-21 shows the spectral densities for two short data sequences. Compared to the NRZ spectrum (dotted line), the PRBS spectrum has significantly less spectral content at low frequencies but more at higher frequencies. Spectral nulls are at even harmonics. The spectrum for the constant one pattern is equal to a 1 GHz square wave.
Figure 1-21 Power spectral density for Manchester-coded data at 1 Gbit/s. Left: PRBS 2^4-1. Right: Constant one (111111 . . .). Power density is normalized for comparison with NRZ format (dotted line, left plot only).
1.3.2.5 Non-Return-to-Zero Inverted Code
Non-return-to-zero inverted (NRZI) code is not, as the name suggests, the mere inversion of an NRZ-coded signal; it is an example of a differential code, where the state of the signal depends on both the current and the previous bit. An NRZI-coded signal changes its state when the current bit is a logic one but stays constant if the current bit is a logic zero (Figure 1-22). Using transitions rather than levels makes detection less error-prone in noisy environments, and the signal polarity is insignificant. NRZI coding is used, for example, in USB.
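A minimal sketch of NRZI encoding and decoding, following the convention described above (toggle on a one, hold on a zero). The function names and the `initial_level` parameter are assumptions; the initial level stands for the line state before the first bit.

```python
def nrzi_encode(bits, initial_level=0):
    """Toggle the line level for every one bit, hold it for every zero bit."""
    level = initial_level
    levels = []
    for bit in bits:
        level ^= bit
        levels.append(level)
    return levels


def nrzi_decode(levels, initial_level=0):
    """A bit is one exactly when the level differs from the previous level."""
    previous = initial_level
    bits = []
    for level in levels:
        bits.append(level ^ previous)
        previous = level
    return bits


data = [1, 0, 1, 1, 0, 0, 1]
line = nrzi_encode(data)
print(line)
print(nrzi_decode(line))

# Inverting the whole line (including its idle level) does not change the data.
inverted_line = [1 - level for level in line]
print(nrzi_decode(inverted_line, initial_level=1))
```

The last call shows the polarity insensitivity: because only level changes carry information, an inverted line decodes to the same bits.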
Figure 1-22 NRZI coding of a short data sequence (PRBS 2^4-1)
The signal properties of NRZI-coded data are similar to those of NRZ data: The transition density can be between 0.0 (for a constant zero pattern) and 1.0 (for a constant one pattern), and the spectral content for random data is exactly the same as for NRZ. The NRZI code is therefore not sufficient to enable data transmission with clock recovery, or to limit the amount of ISI.
1.3.2.6 Differential Manchester Code
Differential Manchester code (DMC) is a combination of Manchester and NRZI: It uses transitions in the middle of the bit, but the transition direction changes with every one in the data stream (Figure 1-23). This coding can be generated by an XOR function of NRZI-coded data and a clock signal. DMC is also known as conditioned diphase (CDP) code and is used in token ring LANs (IEEE 802.5).
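The construction as an XOR of NRZI-coded data and a clock can be sketched with two half-bit samples per bit. The clock phase (high in the first half of the bit), the initial line level, and the function names are assumptions made for illustration.

```python
def differential_manchester_encode(bits, initial_level=0):
    """NRZI-encode the data, then XOR each level with a clock period (1, 0)."""
    level = initial_level
    halves = []
    for bit in bits:
        level ^= bit                        # NRZI step: toggle on a one
        halves.extend((level ^ 1, level))   # XOR with clock samples 1 and 0
    return halves


def differential_manchester_decode(halves, initial_level=0):
    """The second half of each bit carries the NRZI level; undo the NRZI step."""
    previous = initial_level
    bits = []
    for level in halves[1::2]:
        bits.append(level ^ previous)
        previous = level
    return bits


data = [0, 0, 1, 0, 1, 1]
coded = differential_manchester_encode(data)
print(coded)
print(differential_manchester_decode(coded))
```

In the output, the two halves of every bit always differ (the mid-bit transition), and the order of the two halves flips whenever the data bit is a one.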
Figure 1-23 Differential Manchester coding of a short data sequence (PRBS 2^4-1)
The properties of data that is coded with DMC are very similar to those of pure Manchester code: The signal is DC balanced, there is at least one transition per bit, and the spectrum has low content at lower frequencies but significantly more high-frequency content than NRZ data has.
1.3.3 Multilevel Line Codes
1.3.3.1 Bipolar Return-to-Zero Code
A variant of the RZ code is bipolar return-to-zero (BPRZ) coding, where the signal returns to an intermediate zero level after both zero and one bits (Figure 1-24). There are two transitions per bit, which makes synchronization of the receiver fairly easy. The drawback is the fairly complicated circuitry and an even higher bandwidth requirement than for RZ and R1 data. Figure 1-25 shows the power spectral density for a BPRZ-formatted data sequence.
Figure 1-24 BPRZ coding of a short data sequence (PRBS 2^4-1)
Figure 1-25 Power spectral density for a short BPRZ-formatted data sequence (PRBS 2^4-1), at 1 Gbit/s. Left: 50% duty cycle. Right: 25% duty cycle. Power density is normalized for comparison with NRZ format (dotted line).
1.3.3.2 Pulse Amplitude Modulation
Pulse amplitude modulation (PAM) is a class of multilevel codes that encodes several consecutive bits into one of several levels. PAM-4, for example, encodes two bits into one of four levels (Figure 1-26). Demodulation is performed by detecting the signal level once per symbol period. PAM-4-encoded data has much less high-frequency content than, for example, NRZ data because the signal level can change only once for every two bits. However, the cost is increased transmitter and especially receiver complexity, and a lower signal-to-noise ratio if the same total signal swing is used, because adjacent levels are then closer together. PAM-4 alone is not sufficient for embedded clock systems, as it does not guarantee transition density: Constant zero or one patterns are encoded as DC levels. Figure 1-27 shows the power spectral density for a PAM-4-coded data sequence.
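A minimal sketch of the two-bits-per-symbol mapping follows. The natural binary assignment of bit pairs to the levels 0 to 3 is an assumption chosen for simplicity; the text does not specify the mapping, and practical systems may use a different one.

```python
# Assumed mapping of bit pairs to four levels (natural binary order).
PAM4_LEVELS = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}


def pam4_encode(bits):
    """Encode consecutive bit pairs into one of four levels per symbol."""
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


print(pam4_encode([1, 1, 0, 0, 1, 1, 0, 0]))   # half-rate toggle pattern
print(pam4_encode([0, 0, 0, 0, 0, 0]))         # constant zeros: a DC level
print(pam4_encode([1, 1, 1, 1, 1, 1]))         # constant ones: a DC level
```

The last two calls show why PAM-4 alone does not guarantee transitions: constant data maps to a constant level.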
Figure 1-26 PAM-4 coding of a short data sequence (PRBS 2^4-1)
Figure 1-27 Power spectral density for PAM-4-coded data at 1 Gbit/s. Left: PRBS 2^4-1. Right: Half-rate toggle (11001100 . . .). Power density is normalized for comparison with NRZ format (dotted line).
1.3.4 Block Codes
1.3.4.1 mBnB Block Codes
Block codes of type mBnB take m bits of the original data and encode them into n bits, following very specific rules. Several of the coding schemes from the previous sections can be expressed as 1B2B codes; RZ coding with a 50% duty cycle, for example, encodes every one bit as a one followed by a zero, and every zero bit as two zeros. The 4B5B and, in particular, 8B10B codes are widely used in serial high-speed applications. As the dominant encoding scheme in computing applications, 8B10B seems to hit a sweet spot between relatively low overhead (25% relative to the payload), ease of implementation, and coding properties such as a bounded maximum run length. Chapter 3 describes 4B5B and 8B10B coding in greater detail.
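The 1B2B view of RZ coding and the overhead figure can be illustrated with a small table-driven encoder. The function names and the table name are assumptions for illustration; the overhead is computed relative to the payload, as (n - m) / m.

```python
def block_encode(bits, table, m):
    """Encode the input m bits at a time using a lookup table of code words."""
    if len(bits) % m:
        raise ValueError("input length must be a multiple of m")
    out = []
    for i in range(0, len(bits), m):
        out.extend(table[tuple(bits[i:i + m])])
    return out


# RZ with a 50% duty cycle expressed as a 1B2B code.
RZ_AS_1B2B = {(0,): (0, 0), (1,): (1, 0)}


def overhead(m, n):
    """Extra transmitted bits relative to the payload bits."""
    return (n - m) / m


print(block_encode([1, 0, 1, 1], RZ_AS_1B2B, m=1))
print(overhead(1, 2), overhead(4, 5), overhead(8, 10))
```

By this measure a 1B2B code doubles the required line rate, whereas 4B5B and 8B10B both add 25%.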
1.3.4.2 Error Detection and Forward Error Correction
Some of the block codes from Section 1.3.4.1 enable the receiver to detect some transmission errors, either by calculating disparity or by detecting invalid code words. A system that is based on such coding techniques can issue a packet resend command and transmit the packet again, this time hopefully without an error. Ideally, however, the receiver would be able to not only detect errors (all errors, not just a few) but also correct them.
The process of adding redundancy to the data stream and analyzing and correcting errors in real time is called forward error correction (FEC). Systems that use FEC can operate with less margin in transmission than non-FEC systems. In practical applications, this means a longer range between sender and receiver or reduced transmission power. Especially under difficult transmission conditions, FEC systems are more effective than non-FEC systems because fewer packets need to be retransmitted.
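The principle of adding redundancy and correcting errors at the receiver can be illustrated with a triple repetition code and majority voting. This is only a toy sketch chosen for brevity, and it is an assumption of this example rather than a scheme named in the text; practical links use far more efficient codes. The function names are hypothetical.

```python
def repetition_encode(bits, n=3):
    """Add redundancy by sending every bit n times."""
    return [bit for bit in bits for _ in range(n)]


def repetition_decode(coded, n=3):
    """Correct errors by majority vote over each group of n received bits."""
    return [
        1 if sum(coded[i:i + n]) > n // 2 else 0
        for i in range(0, len(coded), n)
    ]


data = [1, 0, 1, 1]
sent = repetition_encode(data)

received = list(sent)
received[4] ^= 1        # a single bit error on the link

print(repetition_decode(received))   # the error is corrected without a resend
```

With n = 3, any single error within a group of three is corrected, at the cost of tripling the transmitted bit count.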