Intel Pentium 4 (Seventh-Generation) Processors
The Pentium 4 was introduced in November 2000 and represented a new generation in processors (see Figure 3.60). If this one had a number instead of a name, it might be called the 786 because it represents a generation beyond the previous 686 class processors. Three main variations on the Pentium 4 have been released, based on the processor die and architecture. They are called the Willamette, Northwood, and Prescott. The processor dies are shown in Figure 3.61.
Figure 3.60 Pentium 4 FC-PGA2 processor.
Figure 3.61 The CPU dies for the Pentium 4 CPU based on the Willamette, Northwood, and Prescott cores.
The main technical details for the Pentium 4 include
- Speeds range from 1.3GHz to 3.8GHz.
- 42 million transistors, 0.18-micron process, 217 sq. mm die (Willamette).
- 55 million transistors, 0.13-micron process, 131 sq. mm die (Northwood).
- 125 million transistors, 0.09-micron process, 112 sq. mm die (Prescott).
- Software compatible with previous Intel 32-bit processors.
- Some Prescott versions support EM64T (64-bit extensions) and Execute Disable Bit (buffer overflow protection).
- Processor (front-side) bus runs at 400MHz, 533MHz, 800MHz, or 1066MHz.
- Arithmetic logic units (ALUs) run at twice the processor core frequency.
- Hyper-pipelined (20-stage or 31-stage) technology.
- Hyper-threading technology support in all 2.4GHz and faster processors running an 800MHz bus and all 3.06GHz and faster processors running a 533MHz bus.
- Very deep out-of-order instruction execution.
- Enhanced branch prediction.
- 8KB or 16KB L1 cache plus 12K micro-op execution trace cache.
- 256KB, 512KB, or 1MB of on-die, full-core speed 256-bit-wide L2 cache with eight-way associativity.
- L2 cache can handle up to 4GB RAM and supports ECC.
- 2MB of on-die, full-speed L3 cache (Extreme Edition).
- SSE2—SSE plus 144 new instructions for graphics and sound processing (Willamette and Northwood).
- SSE3—SSE2 plus 13 new instructions for graphics and sound processing (Prescott).
- Enhanced floating-point unit.
- Multiple low-power states.
Intel abandoned Roman numerals for a standard Arabic numeral 4 designation to identify the Pentium 4. Internally, the Pentium 4 introduces a new architecture Intel calls NetBurst microarchitecture, which is a marketing term and not a technical term. Intel uses NetBurst to describe hyper-pipelined technology, a rapid execution engine, a high-speed (400MHz, 533MHz, 800MHz, or 1066MHz) system bus, and an execution trace cache. The hyper-pipelined technology doubles or triples the instruction pipeline depth as compared to the Pentium III (or Athlon/Athlon 64), meaning more and smaller steps are required to execute instructions. Even though this might seem less efficient, it enables much higher clock speeds to be more easily attained. The rapid execution engine enables the two integer arithmetic logic units (ALUs) to run at twice the processor core frequency, which means instructions can execute in half a clock cycle. The 400MHz/533MHz/800MHz/1066MHz system bus is a quad-pumped bus running off a 100MHz/133MHz/200MHz/266MHz system clock transferring data four times per clock cycle. The execution trace cache is a high-performance Level 1 cache that stores approximately 12K decoded micro-operations. This removes the instruction decoder from the main execution pipeline, increasing performance.
Of these, the high-speed processor bus is most notable. Technically speaking, the processor bus is a 100MHz, 133MHz, 200MHz, or 266MHz quad-pumped bus that transfers four times per cycle (4x), for a 400MHz, 533MHz, 800MHz, or 1066MHz effective rate. Because the bus is 64 bits (8 bytes) wide, this results in a throughput rate of 3200MBps, 4266MBps, 6400MBps, or 8532MBps.
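The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the 133MHz and 266MHz clocks are assumed to be the nominal 133.33MHz and 266.67MHz values. Exact arithmetic gives slightly different last digits than the truncated figures quoted in the text.

```python
from fractions import Fraction

BUS_WIDTH_BYTES = 8      # 64-bit processor bus
TRANSFERS_PER_CLOCK = 4  # quad-pumped: four transfers per clock cycle

# Base clocks in MHz; 133 and 266 are assumed to be 133.33 and 266.67.
BASE_CLOCKS_MHZ = [Fraction(100), Fraction(400, 3), Fraction(200), Fraction(800, 3)]


def bus_throughput_mbps(base_clock_mhz):
    """Return (effective bus rate in MHz, peak throughput in MBps)."""
    effective_mhz = base_clock_mhz * TRANSFERS_PER_CLOCK
    return effective_mhz, effective_mhz * BUS_WIDTH_BYTES


for clock in BASE_CLOCKS_MHZ:
    effective, mbps = bus_throughput_mbps(clock)
    print(f"{float(clock):7.2f}MHz clock -> {float(effective):8.2f}MHz effective "
          f"-> {float(mbps):8.2f}MBps")
```

These are peak theoretical rates; sustained throughput in a real system is lower.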
Table 3.46 shows how this transfer rate compares to various speeds of dual-channel RDRAM and DDR SDRAM.
As you can see from Table 3.46, the throughput of the Pentium 4's processor bus is an exact match for the most common types of RDRAM and DDR SDRAM memory. The use of dual-channel memory means that modules must be added in matched pairs. Dual banks of PC1600 (DDR200), PC2100 (DDR266), or PC3200 (DDR400) DDR SDRAM are less expensive than equivalent RDRAM solutions, which is why virtually all newer Pentium 4 chipsets support DDR SDRAM or the newer DDR2 SDRAM.
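The matching of memory to bus throughput can be checked with a small sketch. It assumes that the number in a DDR module name (PC1600, PC2100, and so on) is the module's approximate peak bandwidth in MBps and that a dual-channel pair doubles it. The function name and the 1% tolerance are hypothetical choices made for the illustration.

```python
# Approximate peak bandwidth of a single module, in MBps.
MODULE_MBPS = {
    "PC1600": 1600,
    "PC2100": 2133,
    "PC2700": 2667,
    "PC3200": 3200,
}


def matching_modules(bus_mbps, tolerance=0.01):
    """Return module types whose dual-channel bandwidth matches the bus.

    A matched pair delivers twice the single-module bandwidth; a module type
    counts as a match when that figure is within `tolerance` of the bus rate.
    """
    return [
        name
        for name, mbps in MODULE_MBPS.items()
        if abs(2 * mbps - bus_mbps) <= tolerance * bus_mbps
    ]


for bus in (3200, 4266, 6400):
    print(bus, "MBps bus ->", matching_modules(bus))
```

Under these assumptions, each of the three bus rates pairs with exactly one of the DDR module types named above.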
In the Pentium 4's 20-stage or 31-stage pipelined internal architecture, individual instructions are broken down into many more substages than with previous processors such as the Pentium III, making this almost like a RISC processor. Unfortunately, this can add to the number of cycles taken to execute instructions if they are not optimized for this processor. Early benchmarks running existing software showed that existing Pentium III or AMD Athlon processors could easily keep pace with or even exceed the Pentium 4 in specific tasks; however, this is changing now that applications are being recompiled to work smoothly with the Pentium 4's deep pipelined architecture.
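The trade-off between pipeline depth and clock speed can be illustrated with a textbook-style model. Everything in this sketch is an assumption made for illustration, not measured data: the branch frequency, the misprediction rate, the 10-stage baseline for the Pentium III, the clock speeds, and the simplification that a mispredicted branch costs roughly one pipeline refill.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction, including branch misprediction stalls."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


def time_per_instruction_ns(cpi, clock_ghz):
    """Average time per instruction in nanoseconds at the given clock speed."""
    return cpi / clock_ghz


# Hypothetical workload: 20% of instructions are branches, 5% are mispredicted.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

# (label, pipeline stages, clock in GHz); clocks are illustrative only.
designs = [
    ("10-stage pipeline at 1.0GHz", 10, 1.0),
    ("20-stage pipeline at 1.5GHz", 20, 1.5),
    ("31-stage pipeline at 3.0GHz", 31, 3.0),
]

for label, stages, clock_ghz in designs:
    cpi = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, stages)
    ns = time_per_instruction_ns(cpi, clock_ghz)
    print(f"{label}: CPI {cpi:.2f}, {ns:.3f}ns per instruction")
```

In this model the deeper pipeline always needs more cycles per instruction, so it comes out ahead only when its clock speed advantage outweighs the extra stall cycles.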
Another important architectural advantage is hyper-threading technology, which can be found in all Pentium 4 2.4GHz and faster processors running an 800MHz bus and all 3.06GHz and faster processors running a 533MHz bus. Hyper-threading enables a single processor to run two threads simultaneously, thereby acting as if it were two processors instead of one. For more information on hyper-threading technology, see the section "Hyper-Threading Technology," earlier in this chapter.
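The hyper-threading rule stated above can be written as a small lookup. The function name is hypothetical, and the sketch encodes only the two bus speeds the rule mentions; for any other bus speed it returns None rather than guessing.

```python
def supports_hyper_threading(clock_ghz, bus_mhz):
    """Apply the rule from the text to a Pentium 4 clock/bus combination.

    Returns True or False for the 800MHz and 533MHz buses, and None for
    bus speeds the rule does not cover.
    """
    if bus_mhz == 800:
        return clock_ghz >= 2.4
    if bus_mhz == 533:
        return clock_ghz >= 3.06
    return None  # not covered by the rule; consult the processor's specifications


print(supports_hyper_threading(2.4, 800))
print(supports_hyper_threading(2.8, 533))
print(supports_hyper_threading(1.5, 400))
```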
The Pentium 4 initially used Socket 423, which has 423 pins in a 39x39 SPGA arrangement. Later versions used Socket 478; recent versions use Socket T (LGA775), which has additional pins to support new features such as EM64T (64-bit extensions), Execute Disable Bit (protection against buffer overflow attacks), Intel Virtualization Technology, and other advanced features. The Celeron was never designed to work in Socket 423, but Celeron and Celeron D versions are available for Socket 478 and Socket T (LGA775), allowing for lower-cost systems compatible with the Pentium 4. Voltage selection is made via an automatic voltage regulator module installed on the motherboard and wired to the socket.
Use Table 3.47 as a comprehensive guide to Pentium 4 processor features. As you review the many Pentium 4 models listed in this table, you can easily see that there have actually been at least six distinct Pentium 4 generations, based on the most significant technology changes listed here:
- Socket 423
- Socket 478
- Socket 478 Hyper-Threading Technology
- Socket 478 Extreme Edition (L3 cache)
- Socket T (LGA775)
- Socket T (LGA775) EM64T (64-bit extensions)
Table 3.47. Pentium 4 Processor Information
For some time now, it has been obvious that "Pentium 4" has been far more of a brand than a single processor family, leading to endless confusion when users have considered processor upgrades or new system purchases. Because of the three form factors (Socket 423, Socket 478, and Socket 775) and the wide range of features available in the Pentium 4 family, it's essential that you determine exactly what the features are of a particular processor before you purchase it as an upgrade to an existing processor or as part of a complete system.
Pentium 4 Extreme Edition
In November 2003, Intel introduced the Extreme Edition of the Pentium 4, which is notable for being the first desktop PC processor to incorporate L3 cache. The Extreme Edition (or Pentium 4EE) is basically a revamped version of the Gallatin core Xeon workstation/server processor, which has used L3 cache since November 2002. The Pentium 4EE has 512KB of L2 cache and 2MB of L3 cache, which increases the transistor count to 178 million and makes the die significantly larger than that of the standard Pentium 4. Because of the large die based on the 130-nanometer process, this chip is expensive to produce, and the extremely high selling price reflects that. The Extreme Edition is targeted toward the gaming market, where people are willing to spend extra money for additional performance. The additional cache helps power-hungry 3D games more than it helps standard business applications.
In 2004, revised versions of the Pentium 4 Extreme Edition were introduced. These processors are based on the 90-nanometer (0.09-micron) Pentium 4 Prescott core but with a larger 2MB L2 cache in place of the 512KB L2 cache design used by the standard Prescott-core Pentium 4. Pentium 4 Extreme Edition processors based on the Prescott core do not have L3 cache.
The Pentium 4 Extreme Edition is available in both Socket 478 and Socket T form factors, with clock speeds ranging from 3.2GHz to 3.4GHz (Socket 478) and from 3.4GHz to 3.73GHz (Socket T). For specific features of a particular Pentium 4 Extreme Edition processor, see Table 3.47.
The various Pentium 4 and Pentium 4 Extreme Edition versions, including thermal and power specifications, are shown in Table 3.47.
Pentium 4–based motherboards use RDRAM, SDRAM, DDR SDRAM, or DDR2 SDRAM memory, depending on the chipset; however, most Pentium 4 systems use DDR or DDR2 SDRAM. Since Intel's contract with Rambus expired in 2001, DDR SDRAM and DDR2 SDRAM have become Intel's preferred memory types for mainstream systems.
Pentium 4 Power Supply and Cooling Issues
Compared to older processors, the Pentium 4 requires a lot of electrical power, and because of this, most Pentium 4 motherboards use a new design voltage regulator module powered from 12V instead of 3.3V or 5V, as with previous designs. By using the 12V power, more 3.3V and 5V power is available to run the rest of the system and the overall current draw is greatly reduced with the higher voltage as a source. PC power supplies generate more than enough 12V power, but the ATX motherboard and power supply design originally allotted only one pin for 12V power (each pin is rated for only 6 amps), so additional 12V lines were necessary to carry this power to the motherboard.
The fix appears in the form of a third power connector, called the ATX12V connector. This new connector is used in addition to the standard 20-pin ATX power supply connector and 6-pin auxiliary (3.3V/5V) connector. Fortunately, the system doesn't require a redesigned power supply; more than enough 12V power is available from the drive connectors. To utilize this, companies such as PC Power and Cooling sell an inexpensive ($8) adapter that converts a standard Molex-type drive power connector to the ATX12V connector. Typically, a 300-watt (the minimum recommended) or larger power supply has more than adequate levels of 12V power for both the drives and the ATX12V connector.
If your power supply is rated below the recommended 300-watt minimum, you need to purchase a replacement. Because the ATX12V power supply connector is required for most Intel-based systems from the past few years, virtually all vendors sell an off-the-shelf ATX12V-ready model or one that uses the adapter mentioned previously.
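The reasoning for feeding the voltage regulator from 12V, and for needing more than one 12V pin, follows from current = power / voltage. The sketch below uses a hypothetical 100-watt regulator input load and applies the 6-amp-per-pin rating mentioned earlier to every rail, which is an assumption. It also ignores regulator efficiency losses.

```python
import math

PIN_RATING_AMPS = 6.0  # per-pin rating cited in the text, assumed for all rails


def current_amps(power_watts, volts):
    """Current needed to deliver the given power at the given voltage."""
    return power_watts / volts


def pins_needed(power_watts, volts, pin_rating=PIN_RATING_AMPS):
    """Minimum number of connector pins needed to carry that current."""
    return math.ceil(current_amps(power_watts, volts) / pin_rating)


LOAD_WATTS = 100  # hypothetical processor regulator load, for illustration only

for rail in (3.3, 5.0, 12.0):
    amps = current_amps(LOAD_WATTS, rail)
    print(f"{rail:4.1f}V rail: {amps:5.2f}A, "
          f"{pins_needed(LOAD_WATTS, rail)} pin(s) at 6A each")
```

For the same load, the 12V rail carries well under half the current of the 5V rail, but that current can still exceed what a single 6-amp pin can carry.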
See "Motherboard Power Connectors," p. 1167.
Cooling a high-wattage unit such as the Pentium 4 requires a large active heatsink. These heavy (sometimes more than 1 lb.) heatsinks can damage a CPU or destroy a motherboard when subjected to vibration or shock, especially during shipping. To solve this problem with Pentium 4 motherboards, various methods have been used. Intel's specifications for Socket 423 added four standoffs to the ATX chassis design flanking the Socket 423 to support the heatsink retention brackets. These standoffs enabled the chassis to support the weight of the heatsink instead of depending on the motherboard, as with older designs. Vendors also used other means to reinforce the CPU location without requiring a direct chassis attachment. For example, Asus's P4T motherboard was supplied with a metal reinforcing plate to enable off-the-shelf ATX cases to work with the motherboard.
Socket 478 systems do not require any special standoffs or reinforcement plates; instead they use a unique scheme in which the CPU heatsink attaches directly to the motherboard rather than to the CPU socket or chassis. Motherboards with Socket 478 can be installed into any ATX chassis—no special standoffs are required.
Socket T (LGA775) systems use a unique clamping mechanism that holds the processor in place. The heatsink is attached over the processor and clamping mechanism and attaches to the motherboard.
Because the Pentium 4 processor family has been manufactured in three socket types with a wide variation in clock speed and power dissipation, it's essential that you choose a heatsink made specifically for the processor form factor and speed you have purchased (or intend to purchase). This is just one more reason I think it's worth getting a boxed processor instead of an OEM version when building or upgrading a system. If you purchase the shrink-wrapped or "boxed" processor, you get an Intel-specified high-quality heatsink in the box with the processor. In addition, you get a 3-year warranty from Intel, making the boxed version ideal for upgraders and system builders.
Xeon processors are based on the Pentium 4 and are designed for Socket 603 and Socket 604. Xeon DP processors (often referred to simply as Xeon) are designed for single- and dual-processor workstations:
- Xeon DP processors with a 400MHz CPU bus feature clock speeds from 1.4GHz to 3GHz.
- Xeon DP processors with a 533MHz CPU bus feature clock speeds from 2GHz to 3.2GHz.
- Xeon DP processors with a 667MHz CPU bus (a speed never used by the Pentium 4, by the way) feature clock speeds from 3.33GHz to 3.66GHz.
- Xeon DP processors with an 800MHz CPU bus feature clock speeds from 2.8GHz to 3.8GHz.
Xeon MP processors are designed for four-way and larger servers. They are available in speeds ranging from 1.4GHz to 3GHz, and all support the 400MHz CPU bus.
For more information about Xeon DP and Xeon MP processors, see my book Upgrading and Repairing Servers.