GPGPU: Is a Supercomputer Hiding in Your PC?
Growth of the GPU
Were you aware that the most powerful chip in an average or better PC is not the CPU, but the GPU (graphics processing unit)? More than likely, you had an inkling that some serious technology was driving your graphics card, and over the past couple of years you saw the accoutrements of raw power appear: larger fans and heatsinks, and even supplementary power connectors. If you're on the bleeding edge, you may have painstakingly built a benchmark-crushing PC with a water-cooling system. The growing rage is a single PC with multiple GPUs in a Scalable Link Interface (SLI) configuration.
Nowadays, high-performance GPUs have become more rational, with quiet thermal solutions and reasonable power requirements. But the ever-advancing video game industry entices consumers with a consistent and growing hunger for higher performance. This mass market drives a tremendous technology arms race in the GPU industry, with consumers being the overall winners. In turn, low prices for high-performance GPUs provide a great opportunity for software developers and their customers, letting them capitalize on otherwise idle transistors for their computational needs.
Perhaps management won't pay for you to use all that GPU performance during the workday, and the only exercise your powerful, computationally starved GPU gets is your lunchtime and after-work gaming sessions. That simply isn't fair to all those transistors.
So how do you convince your boss that you really need to buy the latest GPU? In fact, the argument isn't difficult, provided that your programmers are willing and able to recast your problems to fit the massively parallel architecture of the GPU. If you can stay on the computational superhighway of the GPU, you'll be able to unleash a tremendous number of gigaflops. Naturally, today's graphically intense applications do exactly that, using one of the two primary APIs for programming these chips: the cross-platform OpenGL API, or Microsoft's Direct3D component of the DirectX API. For example, leading games have inner loops with core algorithms that run "shaders" across an array of pixels. A shader takes in a variety of inputs (geometry and light positions, texture maps, bump maps, shadow maps, material properties, and other data for special effects such as fog or glow) and computes a resulting color, and possibly other data such as transparency and depth. All this action happens for each pixel on the screen, at 60 or more frames per second, which adds up to a massive amount of computation.
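To make the idea of a shader concrete, here is a minimal sketch in plain Python. It is only an illustration: real shaders are written in a GPU shading language and run in parallel on the graphics hardware. The names `shade_pixel` and `render` are hypothetical, and the lighting model (simple diffuse lighting plus exponential fog) is an assumption chosen for brevity, not something taken from any particular game or API.

```python
import math

def shade_pixel(normal, light_dir, base_color, depth,
                fog_color=(0.5, 0.5, 0.5), fog_density=0.1):
    """Compute one pixel's (r, g, b, a) from its inputs (hypothetical helper)."""
    # Diffuse lighting: brightness follows the angle between the surface
    # normal and the light direction, clamped so back-facing pixels are dark.
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    lit = tuple(c * n_dot_l for c in base_color)

    # Exponential fog: the farther away the pixel, the more fog color is mixed in.
    fog = 1.0 - math.exp(-fog_density * depth)
    r, g, b = (c * (1.0 - fog) + f * fog for c, f in zip(lit, fog_color))
    return (r, g, b, 1.0)  # alpha fixed at 1.0 (fully opaque)

def render(width, height, shader):
    """Run the shader once per pixel; returns a list of rows of colors."""
    # Each pixel depends only on its own inputs, which is why a GPU can
    # process many of them at once. This CPU loop visits them one by one.
    return [[shader(x, y) for x in range(width)] for y in range(height)]

# Usage: a flat orange surface facing the light, receding into fog row by row.
frame = render(4, 3, lambda x, y: shade_pixel(
    normal=(0.0, 0.0, 1.0), light_dir=(0.0, 0.0, 1.0),
    base_color=(1.0, 0.5, 0.25), depth=float(y) * 5.0))
```

The point of the sketch is that no pixel needs the result of any other pixel, so the work can be spread across as many processing units as the hardware provides.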
At its core, this processing is similar to how most computer-animated and special effects–rich movies, such as Shrek, are produced. A C-like high-level programming language is used to write shaders, and then a commercial or in-house rendering program chugs away, computing scenes on the film studio's render farm. Often comprising thousands of CPUs, these render farms handle the enormous amounts of geometry and texture data required to hit the increasing quality threshold demanded by discerning audiences of CG feature films. This rendering process is commonly done overnight, and reviews of the "dailies" (those scenes rendered overnight) are done the following morning.
GPUs go through a simplified version of this process 60 or more times per second, and each year the gap between real-time and offline rendering shrinks. In fact, many recent GPU technology demonstrations show real-time renderings of content that was generated offline just a few years ago.