Memory Diagnostics—Parity

The first thing a PC tests when it runs through the POST (Power-On Self-Test) is memory integrity. On many machines we can see this taking place as a rapidly increasing number displayed on the monitor before anything else happens. The testing is designed to verify the structural fitness of each cell (usually a capacitor) in every main memory module. When all the cells have been checked, the boot process continues. The POST is a simple, one-time test, and may not uncover a bad memory module.

NOTE

A bad memory module can cause strange, intermittent errors having to do with read failures, page faults, or even more obscure error messages. Before you tear apart Windows in an attempt to diagnose a possible operating system problem, run a comprehensive hardware diagnostics program on the machine. These applications do a much more exhaustive test of each memory cell, and produce a report of a failed module by its location in the memory banks.

Memory modules may or may not use parity checking, depending on how they're manufactured. The circuitry must be built into the module for it to be capable of parity checking. Keep in mind that parity checking is not the same as the initial test of the cells. Parity checking takes place only after the machine is up and running, and is used to check read/write operations.

Originally, parity checking was a major development in data protection. At the time, memory chips were nowhere near as reliable as they are today, and the process went a long way toward keeping data accurate. Parity checking is still the most common (and least expensive) way to check whether a memory cell can accurately hold data. A more sophisticated (and expensive) method uses Error Correction Code (ECC).

NOTE

Most DRAM chips in SIMMs or DIMMs require a parity bit because memory can be corrupted even if the computer hasn't actually been bashed with a hammer. Alpha particles can disturb memory cells with ionizing radiation, resulting in lost data. Electromagnetic interference (EMI) also can change stored information.

Even or Odd Parity

In odd and even parity, every byte gets 1 parity bit attached, making a combined 9-bit unit. Therefore, a 16-bit word has 2 parity bits, a 32-bit word has 4 parity bits, and so forth. This produces extra pins on the memory module, and this is one of the reasons why various DIMMs and SIMMs have a different number of pins.

Again, parity adds 1 bit to every 8-bit byte going into memory. If parity is set to odd, the circuit totals the number of binary "1s" in the byte and then adds a 1 or a 0 in the ninth place to make the total odd. When the same byte is read from memory, the circuit totals up all nine bits (the eight data bits plus the parity bit) to ensure the total is still odd. If the total has changed to even, an error has occurred and a parity error message is generated. Figure 3.7 shows various bytes of data with their additional parity bit.

With even parity checking, the total number of 1 bits, counting the parity bit, must be even. If five of the data bits are set to 1, the parity bit will also be set to 1 to total six (an even number). If six data bits were set to 1, the parity bit would be set to 0 to maintain the even number six.
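The odd and even rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; real parity logic is done in hardware, and the function names `parity_bit` and `check` are made up for this example.

```python
def parity_bit(byte: int, mode: str = "odd") -> int:
    """Return the ninth (parity) bit for an 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    if mode not in ("odd", "even"):
        raise ValueError("mode must be 'odd' or 'even'")
    ones = bin(byte).count("1")      # number of 1 bits in the data byte
    if mode == "even":
        return ones % 2              # add a 1 only if the count is odd
    return 1 - ones % 2              # add a 1 only if the count is even


def check(byte: int, stored_parity: int, mode: str = "odd") -> bool:
    """Return True if all nine bits still satisfy the chosen parity."""
    total = bin(byte).count("1") + stored_parity
    return total % 2 == (1 if mode == "odd" else 0)


data = 0b10110101                    # five bits set to 1
print(parity_bit(data, "even"))      # 1, bringing the total to six
print(parity_bit(data, "odd"))       # 0, leaving the total at five
```

A byte with five 1 bits gets a parity bit of 1 under even parity and 0 under odd parity, matching the description in the text.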

Figure 3.7 Odd and even parity.

Fake or Disabled Parity

Some computer manufacturers install a less expensive "fake" parity chip that simply sends a 1 or a 0 to the parity circuit to supply parity on the basis of which parity state is expected. Regardless of whether the parity is valid, the computer is fooled into thinking that everything is valid. This method means no connection whatsoever exists between the parity bit being sent and the associated byte of data.

A more common way for manufacturers to reduce the cost of SIMMs is to simply disable the parity completely, or to build a computer without any parity checking capability installed. Some of today's PCs are being shipped this way, and they make no reference to the disabled or missing parity. The purchaser must ensure that the SIMMs have parity capabilities, and must configure the motherboard to turn parity on.

Error Correction Code (ECC)

Parity checking is limited in the sense that it can only detect an error; it can't repair or correct the error. The circuit can't tell which one of the bits is invalid. Additionally, if an even number of bits are wrong, the total still comes out odd or even as expected, and the circuit passes the invalid data as okay.

CAUTION

You'll receive a parity error if the parity is odd and the circuit gets an even number, or if the parity is even and the parity circuit gets an odd number. The circuit can't correct the error, but it can detect that the data is wrong.
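As a rough illustration of these limits, the sketch below stores one byte with odd parity and then corrupts it. The helper names and the sample value are assumptions chosen for the example, not taken from any real memory controller.

```python
def odd_parity_bit(byte: int) -> int:
    """Parity bit that makes the nine-bit total of 1s odd."""
    return 1 - bin(byte & 0xFF).count("1") % 2


def passes_odd_check(byte: int, parity: int) -> bool:
    """True if data bits plus parity bit contain an odd number of 1s."""
    return (bin(byte & 0xFF).count("1") + parity) % 2 == 1


stored = 0b10110101
parity = odd_parity_bit(stored)          # written alongside the byte

one_flip = stored ^ 0b00000100           # a single bit changes in memory
two_flips = stored ^ 0b00000110          # two bits change in memory

print(passes_odd_check(stored, parity))     # True: data is intact
print(passes_odd_check(one_flip, parity))   # False: parity error raised
print(passes_odd_check(two_flips, parity))  # True: corruption goes unnoticed

# Every single-bit flip fails the check in exactly the same way,
# so the check alone gives no hint about which bit is wrong.
print(all(not passes_odd_check(stored ^ (1 << i), parity) for i in range(8)))
```

The two-bit case is the one to notice: the data is wrong, yet the parity total is still odd, so no error is reported.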

Error correction code (ECC) uses a special algorithm to work with the memory controller, adding error correction code bits to data bytes when they're sent to memory. When the CPU calls for data, the memory controller decodes the error correction bits and determines the validity of its attached data. Depending on the system, a 32-bit word (4 bytes) might use four bits for the overall accuracy test, and another two bits for specific errors. This example uses 6 ECC bits, but there may be more.
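The text does not say which algorithm produces the ECC bits. Assuming a Hamming code, which is one classic choice, the minimum number of check bits r needed to correct a single-bit error in m data bits is the smallest r satisfying 2^r >= m + r + 1. The following sketch, with the hypothetical helper `min_check_bits`, works that out for a few word sizes.

```python
def min_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    if data_bits < 1:
        raise ValueError("data_bits must be positive")
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


for width in (8, 16, 32, 64):
    sec = min_check_bits(width)
    # One extra overall parity bit lets the code also detect two-bit errors.
    print(f"{width}-bit word: {sec} check bits to correct one bit, "
          f"{sec + 1} to also detect two-bit errors")
```

Under that assumption a 32-bit word needs 6 check bits for single-bit correction, which is consistent with the 6-bit example above, and one more bit if two-bit errors must also be detected.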

ECC requires more check bits than simple parity, but the benefit is that it can correct a single-bit error in place, rather than only reporting that the word is bad. (We discuss bytes and words in the next chapter.) Because approximately 90% of data errors are single-bit errors, ECC does a very good job. On the other hand, ECC costs more, because of the additional number of bits.

CAUTION

Remember that ECC can correct single-bit errors. It can also detect (but not correct) multi-bit errors. Regular parity checking understands only that the overall byte coming out of memory doesn't match what was sent into memory: It cannot correct anything, and it misses errors in which an even number of bits have changed.
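To make the correct-one, detect-two behavior concrete, here is a toy sketch using a Hamming(7,4) code plus one overall parity bit, applied to 4 data bits instead of a full 32-bit word. The choice of Hamming code, the bit layout, and the function names `encode` and `decode` are assumptions for illustration; an actual memory controller may use a different code and word size.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits as 8 bits: Hamming(7,4) plus overall parity."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                       # index 0 holds the overall parity bit
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]    # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]    # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]    # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2              # makes the total count of 1s even
    return code


def decode(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos              # XOR of the positions holding a 1
    overall_ok = sum(code) % 2 == 0
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        # Odd number of flips: treat as one error. The syndrome is the
        # position of the bad bit (0 means the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Overall parity looks fine but the syndrome is nonzero: two flips.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
word[5] ^= 1                             # one bit is disturbed in memory
print(decode(word))                      # ('corrected', 11)
word[2] ^= 1                             # a second bit is disturbed
print(decode(word))                      # ('uncorrectable', None)
```

With one flipped bit the original value is recovered; with two flipped bits the decoder can only report that the word is bad.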

Usually, whoever is buying the computer decides which type of data integrity checking he or she wants, depending mainly on cost benefits. The buyer can choose ECC, parity checking, or nothing. High-end computers (file servers, for example) typically use an ECC-capable memory controller. Midrange desktop business computers typically are configured with parity checking. Low-cost home computers often have non-parity memory (no parity checking or "fake" parity).
