"I just wish this book had been available years ago."
-Bobby Prince, composer and sound designer for computer games including Doom and Duke Nukem 3D
"This is the kind of book that will stand as one of the defining works in the specialization of audio programming."
-Gene Turnbow, senior programmer and game designer for Sound Source Interactive, Inc.
"I'm amazed at the breadth and depth of Tim's coverage."
Information Systems, Loyola College, Maryland
A Programmer's Guide To Sound provides detailed technical information about audio storage, processing, and compression, and includes tested C++ source code. Developers who want to add sound technology to their applications will find all the details they need to:
This book also includes accessible introductions to related topics, such as instrument synthesis, musical tuning, human sound perception, digital filtering, and Fourier Transforms.
Developers will especially appreciate the emphasis placed on practical details. For every topic, the author provides complete source code to demonstrate the principles involved. The source code from the book compiles into a sample program that reads and plays a wide variety of different sound files on Win32, Mac OS, and UNIX. The CD-ROM includes all 40,000 lines of source code from the book, in addition to project files for popular compilers, sample sound files, and contributed software and related information.
Whether you are an audio professional who wants to learn more about programming or a computer programmer who wants to know more about implementing audio, this comprehensive resource will be an invaluable reference for years to come.
About This Book.
I. BASICS.
1. From Hollow Logs to Cyberspace.
What is Sound?
Sounds We Hear.
Other Resources.
2. Human Sound Perception.
Frequency and Pitch.
Pitch and Frequency Aren’t the Same.
Loudness, Amplitude, and Power.
Overall Quality.
3. Storing Sound Digitally.
Sampled Sound Formats.
Pulse Amplitude Modulation (PAM).
Pulse Width Modulation (PWM).
Pulse Code Modulation (PCM).
Side Effects of Sampling.
Floating-Point Samples.
4. A C++ Sound Framework.
The AudioAbstract Class.
Reading and Writing Integers.
A SineWave Class.
II. SYSTEM SPECIFICS.
5. Player Objects.
Implementing the Queue.
Opening an Unknown Sound File.
6. Playing Audio on Windows.
Selecting an Audio Device.
A Sample Windows Application.
7. Playing Audio on Mac OS.
Double Buffered Sound Under Mac OS.
A Sample Mac OS Program.
Playing a Mac File.
8. UNIX and the Network Audio System.
Servers and Flows.
Callbacks and Events.
Configuring a Flow and Attaching to the Server.
A Sample UNIX Application.
III. COMPRESSION.
9. Audio Compression.
Adaptive Differential PCM.
Human Speech Compression.
Progressive Compression.
10. Decompression Classes.
Signed 8-Bit PCM.
Unsigned 8-Bit PCM.
MSB 16-Bit PCM.
LSB 16-Bit PCM.
11. Nonlinear Sound Formats.
Properties of Logarithmic Encodings.
12. Differential PCM.
Two DPCM Encodings.
13. IMA ADPCM Compression.
Practical IMA ADPCM Decompression.
Microsoft’s IMA ADPCM Variant.
Apple’s IMA ADPCM Variant.
Comparing Microsoft’s and Apple’s IMA ADPCM Codecs.
Notes on IMA ADPCM.
Unraveling ADPCM Formats.
Credits.
14. MPEG Audio.
A Survey of the MPEG Standards.
Managing the Byte Stream.
MPEG’s Frame Header.
Slots and Frames.
Layer 1/Layer 2 Subband Synthesis.
Synthesis Window Coefficients.
MPEG Stereo Encoding.
Layer 1 Allocation Storage.
Layer 1 Scale Factors.
Layer 1 Sample Storage.
Layer 1 Requantization and Scaling.
Layer 2 Allocation Storage.
Layer 2 Scale Factors.
Reading Layer 2 Samples.
Layer 2 Requantization.
A Reader for MPEG Files.
IV. GENERAL FILE FORMATS.
15. AU File Format.
Identifying AU Files.
Reading AU Files.
Writing AU Files.
A Simple AU Filter.
16. VOC File Format.
Identifying VOC Files.
Terminator Block (Type 0).
Sound Data Block (Type 1).
Sound Continuation Block (Type 2).
Silence Block (Type 3).
Marker Block (Type 4).
Text Block (Type 5).
Repeat Loops (Types 6 and 7).
Extension Block (Type 8).
Extension Block (Type 9).
Reading VOC Files.
17. WAVE File Format.
Identifying WAVE Files.
About RIFF and other IFF Files.
An Overview of WAVE.
The WaveRead Class.
Reading WAVE Files.
The RIFF WAVE Container.
The fmt Chunk.
Creating a Decompression Object.
IMA ADPCM Data.
mu-Law and A-Law.
Other Compression Methods.
The data Chunk.
Text chunks.
18. AIFF and AIFF-C File Formats.
Identifying AIFF Files.
The AiffRead Class.
Reading AIFF Files.
The FORM AIFF Container.
The FVER Chunk.
The COMM Chunk.
IMA ADPCM Data.
The SSND Chunk.
19. IFF/8SVX File Format.
Identifying IFF/8SVX Files.
An Overview of IFF/8SVX.
Reading IFF/8SVX Files.
The FORM 8SVX Container.
The VHDR Chunk.
Fibonacci DPCM Encoding.
Exponential DPCM Encoding.
The BODY Chunk.
V. MUSIC FILE FORMATS.
20. Programming Music.
Notes.
21. Synthesizing Instruments.
A Sine Wave Generator.
Envelope Control with Sampled Instruments.
Other Types of Envelope Control.
FM and Wavetable Synthesis.
Implementing the Plucked-String Algorithm.
Testing Notes.
22. MIDI.
Standard MIDI Files.
Identifying MIDI Files.
MIDI Header Chunk.
Reading MIDI Tracks.
Managing MIDI Events.
Reading MIDI Events.
Track Sequence Number Meta Event (Type 0).
Text Meta Events (Types 1 through 15).
End of Track Meta Event (Type 47).
Tempo Meta Event (Type 81).
Time Signature (Type 88).
Key Signature Meta Event (Type 89).
Sequencer-Specific Meta Events (Type 127).
A MIDI Player.
Post-Processing the MIDI Event Stream.
Base and Extended MIDI.
Playing a MIDI Event Stream.
Aftertouch and Pitch Wheel.
Controllers and Modes.
MIDI Wire Protocol.
Other MIDI File Formats.
23. MOD.
Identifying MOD Files.
Amiga Sound Hardware.
MOD Format Overview.
Overall File Structure.
Playing MOD Files.
The PlayBeat Method.
Playing Notes with Effects.
Effect 0: Arpeggio.
Effect 1: Slide Up.
Effect 2: Slide Down.
Effect 3: Pitch Slide.
Effect 4: Vibrato.
Effect 5: Pitch Slide plus Volume Slide.
Effect 6: Vibrato plus Volume Slide.
Effect 7: Tremolo.
Effect 8: Unused.
Effect 9: Set Sample Offset.
Effect 10: Volume Slide.
Effect 11: Far Jump.
Effect 12: Set Volume.
Effect 13: Pattern Break (Small Jump).
Effect 14/0: Set Filter.
Effect 14/1: Fine Slide Up.
Effect 14/2: Fine Slide Down.
Effect 14/3: Set Glissando.
Effect 14/4: Set Vibrato Waveform.
Effect 14/5: Set Finetune.
Effect 14/6: Pattern Loop.
Effect 14/7: Set Tremolo Waveform.
Effect 14/8: Unused.
Effect 14/9: Retrigger.
Effect 14/10: Volume Up.
Effect 14/11: Volume Down.
Effect 14/12: Cut Note.
Effect 14/13: Delay Note.
Effect 14/14: Delay Pattern.
Effect 14/15: Invert Loop.
Effect 15: Set Speed.
VI. AUDIO PROCESSING.
24. Fourier Transforms.
Fourier Transform Basics.
Measuring One Frequency at a Time.
Accounting for Phase.
Implementing the DFT.
Measuring the Entire Spectrum at Once.
Decomposing Long FFTs.
Formal Derivation of the FFT.
Programming the FFT.
Playing with FFTs.
Adding Sine Waves.
Designing Simple FIR Filters.
Implementing FIR Filters.
Music Synthesis with Filters.
VII. APPENDICES.
A. About the CD-ROM.
Cool Edit 96.
Articles on Music Production by Donald Griffin.
XAudio.
B. A Brief Introduction to C++.
Improving C: A Better struct.
Toward Object-Oriented Programming.
Real OOP: Classes and Inheritance.
Defining Methods in Separate Files.
Friends.
C. Coding Style.
Some years ago, I found myself researching a variety of different file formats. For graphics formats such as GIF, I had no trouble finding good, detailed descriptions of the overall format and the bit-by-bit details of the underlying compression. However, I was hard pressed to find comparable information for even the most popular audio formats. Although basic formats are outlined in several places, solid information about the compression methods used is surprisingly hard to find.
I'm clearly not the only person to have encountered this problem. I've seen many audio tools that boast of support for a vast number of file formats but lack support for any compression methods at all.
In the intervening years, I have pieced together much of the necessary information, and I wrote this book to bring it together in a single place. This book documents a variety of common audio file formats and audio compression standards and also discusses many related issues that arise in programming audio on a variety of systems.
Source Code
As a programmer, I'm often frustrated when otherwise excellent books stop just short of giving me the details I need. For that reason, when I write about programming, I include detailed, tested source code. Even if the text description omits some critical detail, you can always look at the source code, which must somehow address that issue. By organizing the book around the source code, I hope to ensure that every detail you need is here.
In many cases, you may be able to use my source code directly in your software project. I encourage you to do so. But please pay attention to the conditions listed at the top of each source file. If you have any questions, please feel free to contact me through the publisher. Even if you don't have questions, I'd like to hear about how you used my code and what your experiences were. If there's enough interest and the publisher is willing, I may update this book to better fit your needs.
About This Book
The software in this book was tested by automatically extracting the source code from the electronic files for the book. The book was produced using noweb, LaTeX2e, and dvips typesetting software running on FreeBSD 2.1. The source code was tested under FreeBSD 2.1 with the GNU GCC compiler suite, Windows 95 using Microsoft Visual C++ 5.0, and Mac OS 7.6 using Metrowerks CodeWarrior Gold 11. NuMega's BoundsChecker and ParaSoft's CodeWizard were also used to test the source. The text fonts are Adobe Garamond and Computer Modern Typewriter; headings are in Adobe Helvetica and Monotype Arial. Artwork is from the Digitart Musicville collection from Image Club Graphics.
Acknowledgements
Many people have generously contributed to the production of this book: Mary Treseler and the staff at Addison-Wesley patiently endured my seemingly endless revisions and last-minute changes. George Wright, John Miles, Bobby Prince, Gene Turnbow, Tom White, and several others provided thoughtful, honest criticism of my early drafts, which immeasurably improved the end product. Jon Erickson and the staff at Dr. Dobb's supported my efforts. Above all, Beth provided invaluable help in organizing, editing, indexing, and endless other tasks.
As always, any errors that remain are my own.