“The GPU Gems series features a collection of the most essential algorithms required by Next-Generation 3D Engines.”
—Martin Mittring, Lead Graphics Programmer, Crytek
This third volume of the best-selling GPU Gems series provides a snapshot of today’s latest Graphics Processing Unit (GPU) programming techniques. The programmability of modern GPUs allows developers not only to distinguish themselves from one another but also to use this awesome processing power for non-graphics applications, such as physics simulation, financial analysis, and even virus detection, particularly with the CUDA architecture. Graphics remains the leading application for GPUs, and readers will find that the latest algorithms create ultra-realistic characters, better lighting, and post-rendering compositing effects.
Major topics include
Geometry
Light and Shadows
Rendering
Image Effects
Physics Simulation
GPU Computing
Contributors are from the following corporations and universities:
3Dfacto
Adobe Systems
Apple
Budapest University of Technology and Economics
CGGVeritas
The Chinese University of Hong Kong
Cornell University
Crytek
Czech Technical University in Prague
Dartmouth College
Digital Illusions Creative Entertainment
Eindhoven University of Technology
Electronic Arts
Havok
Helsinki University of Technology
Imperial College London
Infinity Ward
Juniper Networks
LaBRI–INRIA, University of Bordeaux
mental images
Microsoft Research
Move Interactive
NCsoft Corporation
NVIDIA Corporation
Perpetual Entertainment
Playlogic Game Factory
Polytime
Rainbow Studios
SEGA Corporation
UFRGS (Brazil)
Ulm University
University of California, Davis
University of Central Florida
University of Copenhagen
University of Girona
University of Illinois at Urbana-Champaign
University of North Carolina Chapel Hill
University of Tokyo
University of Waterloo
Section Editors include NVIDIA engineers: Cyril Zeller, Evan Hart, Ignacio Castaño, Kevin Bjorke, Kevin Myers, and Nolan Goodnight.
The accompanying DVD includes complementary examples and sample programs.
1.1 Introduction 7
1.2 Marching Cubes and the Density Function 7
1.3 An Overview of the Terrain Generation System 12
1.4 Generating the Polygons Within a Block of Terrain 20
1.5 Texturing and Shading 29
1.6 Considerations for Real-World Applications 35
1.7 Conclusion 37
1.8 References 37
2.1 Motivation 39
2.2 A Brief Review of Instancing 40
2.3 Details of the Technique 42
2.4 Other Considerations 50
2.5 Conclusion 51
2.6 References 52
3.1 Introduction 53
3.2 How Does It Work? 56
3.3 Running the Sample 66
3.4 Performance 66
3.5 References 67
4.1 Introduction 69
4.2 Silhouette Clipping 69
4.3 Shadows 76
4.4 Leaf Lighting 81
4.5 High Dynamic Range and Antialiasing 85
4.6 Alpha to Coverage 85
4.7 Conclusion 88
4.8 References 91
5.1 Introduction 94
5.2 Overview 95
5.3 Adaptive Refinement Patterns 96
5.4 Rendering Workflow 98
5.5 Results 100
5.6 Conclusion and Improvements 103
5.7 References 104
6.1 Introduction 105
6.2 Procedural Animations on the GPU 106
6.3 A Phenomenological Approach 106
6.4 The Simulation Step 113
6.5 Rendering the Tree 117
6.6 Analysis and Comparison 118
6.7 Summary 119
6.8 References 120
7.1 Metaballs, Smoothed Particle Hydrodynamics, and Surface Particles 124
7.2 Constraining Particles 127
7.3 Local Particle Repulsion 135
7.4 Global Particle Dispersion 140
7.5 Performance 145
7.6 Rendering 146
7.7 Conclusion 147
7.8 References 148
8.1 Introduction 157
8.2 Related Work 158
8.3 Percentage-Closer Filtering 159
8.4 Variance Shadow Maps 161
8.5 Summed-Area Variance Shadow Maps 174
8.6 Percentage-Closer Soft Shadows 178
8.7 Conclusion 181
8.8 References 181
9.1 Introduction 183
9.2 An Overview of the Algorithm 184
9.3 Gather Samples 186
9.4 One-Bounce Indirect Illumination 188
9.5 Wavelets for Compression 189
9.6 Adding Multiple Bounces 192
9.7 Packing Sparse Matrix Data 193
9.8 A GPU-Based Relighting Engine 195
9.9 Results 200
9.10 Conclusion 201
9.11 References 201
10.1 Introduction 203
10.2 The Algorithm 205
10.3 Hardware-Specific Implementations 214
10.4 Further Optimizations 232
10.5 Results 233
10.6 Conclusion 233
10.7 References 235
11.1 Introduction 239
11.2 An Overview of Shadow Volumes 240
11.3 Our Implementation 244
11.4 Conclusion 254
11.5 References 254
12.1 Review 257
12.2 Problems 258
12.3 A Robust Solution 261
12.4 Results 267
12.5 Performance 269
12.6 Caveats 270
12.7 Future Work 273
12.8 References 274
13.1 Introduction 275
13.2 Crepuscular Rays 276
13.3 Volumetric Light Scattering 277
13.4 The Post-Process Pixel Shader 279
13.5 Screen-Space Occlusion Methods 281
13.6 Caveats 282
13.7 The Demo 283
13.8 Extensions 284
13.9 Summary 284
13.10 References 284
14.1 The Appearance of Skin 293
14.2 An Overview of the Skin-Rendering System 297
14.3 Specular Surface Reflectance 299
14.4 Scattering Theory 305
14.5 Advanced Subsurface Scattering 314
14.6 A Fast Bloom Filter 342
14.7 Conclusion 342
14.8 References 345
15.1 Introduction 349
15.2 The Data Acquisition Pipeline 350
15.3 Compression and Decompression of the Animated Textures 352
15.4 Sequencing Performances 363
15.5 Conclusion 363
15.6 References 370
16.1 Procedural Animation 373
16.2 Vegetation Shading 378
16.3 Conclusion 384
16.4 References 384
17.1 Introduction 388
17.2 Tracing Secondary Rays 389
17.3 Reflections and Refractions 396
17.4 Results 400
17.5 Conclusion 402
17.6 References 406
18.1 Introduction 409
18.2 A Brief Review of Relief Mapping 411
18.3 Cone Step Mapping 415
18.4 Relaxed Cone Stepping 416
18.5 Conclusion 425
18.6 References 427
19.1 Introduction 429
19.2 Some Background 430
19.3 Forward Shading Support 431
19.4 Advanced Lighting Features 434
19.5 Benefits of a Readable Depth and Normal Buffer 440
19.6 Caveats 445
19.7 Optimizations 448
19.8 Issues 450
19.9 Results 454
19.10 Conclusion 454
19.11 References 457
20.1 Introduction 459
20.2 Rendering Formulation 459
20.3 Quasirandom Low-Discrepancy Sequences 465
20.4 Mipmap Filtered Samples 466
20.5 Performance 470
20.6 Conclusion 471
20.7 Further Reading and References 474
21.1 Introduction 481
21.2 Algorithm and Implementation Details 482
21.3 Results 487
21.4 Conclusion 489
21.5 References 489
22.1 The Traditional Implementation 492
22.2 Acceleration Structures 493
22.3 Feeding the GPU 496
22.4 Implementation 498
22.5 Results 508
22.6 Conclusion 511
22.7 References 511
23.1 Motivation 513
23.2 Off-Screen Rendering 514
23.3 Downsampling Depth 517
23.4 Depth Testing and Soft Particles 519
23.5 Alpha Blending 520
23.6 Mixed-Resolution Rendering 522
23.7 Results 525
23.8 Conclusion 527
23.9 References 528
24.1 Introduction 529
24.2 Light, Displays, and Color Spaces 529
24.3 The Symptoms 533
24.4 The Cure 538
24.5 Conclusion 541
24.6 Further Reading 542
25.1 Introduction 543
25.2 Quadratic Splines 544
25.3 Cubic Splines 546
25.4 Triangulation 555
25.5 Antialiasing 556
25.6 Code 558
25.7 Conclusion 559
25.8 References 560
26.1 Image Processing Abstracted 564
26.2 Object Detection by Color 567
26.3 Conclusion 574
26.4 Further Reading 574
27.1 Introduction 575
27.2 Extracting Object Positions from the Depth Buffer 576
27.3 Performing the Motion Blur 579
27.4 Handling Dynamic Objects 580
27.5 Masking Off Objects 580
27.6 Additional Work 581
27.7 Conclusion 581
27.8 References 581
28.1 Introduction 583
28.2 Related Work 583
28.3 Depth of Field 585
28.4 Evolution of the Algorithm 587
28.5 The Complete Algorithm 592
28.6 Conclusion 602
28.7 Limitations and Future Work 603
28.8 References 605
29.1 Introduction 613
29.2 Rigid Body Simulation on the GPU 618
29.3 Applications 627
29.4 Conclusion 629
29.5 Appendix 631
29.6 References 631
30.1 Introduction 633
30.2 Simulation 634
30.3 Rendering 665
30.4 Conclusion 672
30.5 References 673
31.1 Introduction 677
31.2 All-Pairs N-Body Simulation 679
31.3 A CUDA Implementation of the All-Pairs N-Body Algorithm 680
31.4 Performance Results 686
31.5 Previous Methods Using GPUs for N-Body Simulation 691
31.6 Hierarchical N-Body Methods 692
31.7 Conclusion 693
31.8 References 694
32.1 Broad-Phase Algorithms 697
32.2 A CUDA Implementation of Spatial Subdivision 702
32.3 Performance Results 719
32.4 Conclusion 721
32.5 References 721
33.1 Parallel Processing 724
33.2 The Physics Pipeline 724
33.3 Determining Contact Points 726
33.4 Mathematical Optimization 728
33.5 The Convex Distance Calculation 731
33.6 The Parallel LCP Solution Using CUDA 732
33.7 Results 738
33.8 References 739
34.1 Introduction 741
34.2 Leaking Artifacts in Scan Methods 742
34.3 Our Tetrahedra GPU Scan Method 747
34.4 Results 756
34.5 Conclusion 758
34.6 Future Work 759
34.7 Further Reading 760
34.8 References 762
35.1 Introduction 771
35.2 Pattern Matching 773
35.3 The GPU Implementation 775
35.4 Results 779
35.5 Conclusions and Future Work 782
35.6 References 783
36.1 New Functions for Integer Stream Processing 786
36.2 An Overview of the AES Algorithm 788
36.3 The AES Implementation on the GPU 790
36.4 Performance 797
36.5 Considerations for Parallelism 799
36.6 Conclusion and Future Work 802
36.7 References 802
37.1 Monte Carlo Simulations 806
37.2 Random Number Generators 809
37.3 Example Applications 821
37.4 Conclusion 829
37.5 References 829
38.1 Introduction 831
38.2 Seismic Data 832
38.3 Seismic Processing 834
38.4 The GPU Implementation 841
38.5 Performance 849
38.6 Conclusion 849
38.7 References 850
39.1 Introduction 851
39.2 Implementation 853
39.3 Applications of Scan 866
39.4 Conclusion 875
39.5 References 875
40.1 Introduction and Related Work 877
40.2 Polynomial Forward Differencing 879
40.3 The Incremental Gaussian Algorithm 882
40.4 Error Analysis 885
40.5 Performance 887
40.6 Conclusion 888
40.7 References 888
41.1 Introduction 891
41.2 Why Use the Geometry Shader? 892
41.3 Dynamic Output with the Geometry Shader 893
41.4 Algorithms and Applications 895
41.5 Benefits: GPU Locality and SLI 903
41.6 Performance and Limits 905
41.7 Conclusion 907
41.8 References 907