The "get started" developer's guide to MPEG-4, the future of Internet and Web multimedia!
MPEG-4 will transform Internet- and Web-based multimedia by enabling breakthrough audio, video, 2D, and 3D capabilities. This book is for every developer and technical decision-maker. In MPEG-4 Jump-Start, two leaders of the Internet multimedia community introduce the MPEG-4 standard: its key concepts, capabilities, applications, requirements, and limitations. Aaron E. Walsh and Mikael Bourges-Sevenier cover what you need to know to start developing MPEG-4 players and content today.
MPEG-4 offers unprecedented opportunities for delivering media in any networked communications environment: Internet, Web, broadcast, satellite, or even wireless. MPEG-4 Jump-Start gives you specific, insider's techniques for building tomorrow's breakthrough media applications, starting right now.
Every Jump-Start book is:
AUTHORITATIVE: written by world-class experts personally involved with the design and development of that technology
FOCUSED: providing exactly what you need to know to get started immediately with minimum effort
PRACTICAL: teaching you the skills and techniques that you need to develop professional, real-world software applications
1. Introduction to MPEG-4.
The MPEG Brain and Its Artifacts. MPEG and Memes. The MPEG Brain. MPEG Standards. MPEG-4 in a Nutshell. MPEG-4 Design Goals and Principles. Navigating MPEG-4. MPEG-4 and Other Multimedia Standards. Architecture and Tools. End-to-End Architecture. MPEG-4 Browser Architecture and Tools. Applications. Next Generation of Portals. Interactive Broadcast. Multimedia Conferencing and Communities. Conclusion. References.
Introduction to VRML97. How MPEG-4 BIFS Extends VRML. Major Features and Related Jargon. VRML Browsers. VRML Plug-Ins. Objects, Scenes, and Worlds. VRML Files. Scene Graphs. Nodes. Fields. Events. eventIn and eventOut Fields. Shapes, Geometry, and Appearance. Primitive Objects. Complex Objects. Text Objects. Coordinate Systems. Moving, Scaling, and Rotating. Summary.
Binary Format for Scenes (BIFS). Fundamental Differences with VRML97. A Note on Coordinate Systems. New Scene Features and New Nodes. Geometry Nodes. Interpolator Nodes. Grouping Nodes. Material Nodes. Sensor Nodes. Audio Nodes. Face and Body Animation Nodes. Stream Synchronization and Control Nodes. Other Nodes. Summary.
Overview of the Binary Format for Scenes (BIFS). BIFS-Update Decoding Process. Overview of the SDL Language. Structures Used by the Decoder. Configuring the Decoder. Decoding Commands. Examples. Writing BIFS-Update Codecs: Considerations. Summary. References.
Why We Need Quantization. A Crash Course on Quantization. Fourteen BIFS Quantizers. Local and Global QuantizationParameter Nodes. Quantizers 1 to 8. Quantizing Normals and Rotations (9, 10). Quantizing Object Size (11, 12). Linear Scalar Quantizers (13, 14). Efficient Float Coding. Using Quantization in BIFS. Example: Using QuantizationParameter Node. Summary. References.
Synthetic Animations in MPEG-4. The BIFS-Anim Protocol. Encoding BIFS-Anim. Decoding the BIFS-Anim Mask. Decoding the BIFS-Anim Frames. Interpolators Encoded by PredictiveMFFields. PredictiveMFFields and BIFS Scenes. PredictiveMFFields Decoding. Summary.
2D Mesh Animation Tools in MPEG-4. Tools. Profiles. 2D Mesh Generation and Coding. Mesh Generation. Mesh Coding. Summary. References.
Face Definition Parameters (FDPs). Face Animation Parameters (FAPs). FAP Interpolation. FDP Information vs. the FDP Node. FAP Estimation from Video. FBA File Formats. Connecting the Head to an Animated Body. FAP/BAP Quantization. Applications. Streaming. Implications of FAP Normalization. Real Faces vs. Animated Characters. Authoring Considerations. Real Time. FBA Visual Bitstream Syntax. FBA Object. FBA Object Plane. FBA Temporal Header. Decode Frame Rate and Skip Frames. Decode New Minmax. Decode ifap. Decode pfap. Decode Viseme and Expression. FBA Systems BIFS Nodes. Text To Speech (TTS) Node. Face Model Predictability Hierarchy. Summary. References.
Body Modeling and Animation. Body System BIFS Nodes. Body Node. BAP Node. BDP Node. BodySceneGraph Node. BodyDefTables Node. Body Bitstream Syntax. Summary. References.
Functionalities of 3DMC. Introduction to IndexedFaceSet. 3D Mesh Coding. Topological Analysis—Topological Surgery. Connectivity Coding. Geometry and Properties Coding. Entropy Coding. Bitstream Syntax. 3D Mesh Object. Stitching Mode. Error Resilience Mode. Progressive Transmission Mode. Performance Demonstrations. Compression. Incremental Rendering. Error Resilience. Summary. References.
Cross-Standard Interoperability. XMT Two-Tier Architecture. XMT-Ω Format. Reusing SMIL in XMT-Ω. Extensible Media (xMedia) Objects. Animation. Timing. Spatial Layout. XMT-Ω Examples. XMT-A Format. Document Structure. Timing. Scene Description. Object Descriptor Framework. Deterministic Mapping. Interoperability with X3D. Summary. References.
Systems Profiles. Facial and Body Animation Profiles.
Welcome to MPEG-4 Jump-Start. This book was designed to enable professional programmers to jump directly into the new and exciting world of MPEG-4 development. Written by developers for developers, MPEG-4 Jump-Start covers the fundamentals you need to know to dive headfirst into MPEG-4 content and application development. As a collaborative effort by a number of designers and architects responsible for the MPEG-4 international standard, the book you hold in your hands is rich with details and insights from expert programmers whose careers revolve around MPEG-4 and related technologies. We hope you enjoy this book as much as we enjoyed writing it; welcome to our growing community of MPEG-4 developers!
Organization of This Book
In August 2000, MPEG-4 Systems version 3 was almost finalized. Most of us had spent many years improving and refining the algorithms in the standard. Before it could be approved as a formal standard, we had to write reference software to validate not only the decoding of the bitstreams but also the authoring tools needed to create them. These components were assembled in IM1, MPEG-4's reference software. The standard was mature enough that we were able to write a book that explains our experiences to managers, developers, and content authors, with the aim of helping them create MPEG-4 content and applications.
At the time this book went to press, VRML97 was our reference for 3D on the Web, while Extensible 3D (X3D) and VRML200x were still in their infancy. Many of the new functionalities under consideration for X3D and VRML200x were also in MPEG-4. Moreover, because MPEG-4's Binary Format for Scenes (BIFS) follows the same structure as VRML97 and encompasses all nodes defined by that international standard, we decided to start MPEG-4 Jump-Start by familiarizing you with VRML and related scene graph programming concepts. This allowed us to then introduce BIFS, and all the extensions introduced by MPEG-4, in more detail.
MPEG-4 Jump-Start contains 12 chapters and 3 appendixes. Apart from the introduction (Chapter 1), the conclusion (Chapter 12), and the appendixes, the conceptual organization of the book follows the workflow of creating MPEG-4 content.
MPEG-4 content is first created using an authoring tool. The author designs the different streams, such as BIFS, OD, Audio, Video, and MPEG-J, and the author's intent is captured in an XMT file. The output of the authoring tool is stored in the MPEG-4 File Format (.mp4), which can be played locally or streamed to MPEG-4 terminals using a server. At a terminal, the various streams are synchronized in time and composed before being rendered. At the rendering stage, a user can interact with the content.
MPEG-4 Jump-Start is organized as follows:
We start our journey with an introduction to MPEG standards in general, and an overview of MPEG-4 in particular.
Before diving headlong into MPEG-4, we start with a brief review of VRML, as VRML forms the basis of MPEG-4 scenes.
Then, we will guide you through the new nodes proposed by MPEG-4's Binary Format for Scenes (BIFS).
As MPEG-4 is made of binary streams, you will learn how to represent the BIFS scene graph in binary format. You will also learn how to send commands to modify the scene graph at any time.
Although a straight binary representation of the scene graph already provides a substantial compression ratio, you will learn how to further reduce its size using quantization.
We review how animations can be created in BIFS, and you will learn how to make synthetic streamed animations (i.e., modify the values of the components of the scene graph at a constant frame rate).
We reveal in this chapter that 2D mesh animation uses a dedicated synthetic animation stream.
Facial animations use other specialized compression algorithms to reduce the amount of data needed to animate faces. Facial animation parameters are model-independent, so they can be used to animate models different from the ones used to generate the bitstream.
Body animation extends facial animation with a dedicated framework. A generic humanoid is defined with special animation points and joints.
3D meshes can consume a large share of the bandwidth of a piece of content, so specific compression methods have been developed for them.
XMT enables you to store your content in an XML-based format that includes not only MPEG-4 commands but also high-level commands suitable for authoring.
Finally, we conclude MPEG-4 Jump-Start and, as MPEG-4 is constantly evolving, we detail what extensions to expect in the next couple of years.
As many MPEG-4 compression algorithms use arithmetic coding, we give the code of a generic arithmetic coder.
BIFS uses node coding tables to encode the nodes of the scene graph. We explain how they are built.
MPEG-4 can be thought of as a large suite of multimedia tools, but not every application needs all of them. To help developers decide which tools to use for a given application, we introduce MPEG-4 profiles, which are sets of tools defined for classes of applications, and levels, which establish upper bounds on complexity.
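To give a flavor of the quantization idea mentioned in the chapter overview above, here is a minimal sketch of a generic uniform (linear) scalar quantizer. It is not the normative BIFS syntax and is not taken from the book: the function names `quantize` and `dequantize`, and the parameters `vmin`, `vmax`, and `nbits`, are assumptions chosen for illustration. The sketch only shows why storing a bounded float in a few bits saves space at the cost of a small, bounded error.

```python
def quantize(value, vmin, vmax, nbits):
    """Map a float in [vmin, vmax] to an integer in [0, 2**nbits - 1].

    Illustrative only: a generic uniform quantizer, not the BIFS bitstream format.
    """
    if nbits < 1:
        raise ValueError("nbits must be at least 1")
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    levels = (1 << nbits) - 1                  # highest integer code
    clamped = min(max(value, vmin), vmax)      # out-of-range values saturate
    scaled = (clamped - vmin) / (vmax - vmin) * levels
    return int(scaled + 0.5)                   # round to nearest (scaled >= 0)


def dequantize(code, vmin, vmax, nbits):
    """Recover an approximation of the original float from its integer code."""
    levels = (1 << nbits) - 1
    return vmin + code * (vmax - vmin) / levels


if __name__ == "__main__":
    # A coordinate known to lie in [-1, 1], stored in 8 bits instead of 32.
    code = quantize(0.3, -1.0, 1.0, 8)
    approx = dequantize(code, -1.0, 1.0, 8)
    print(code, approx, abs(approx - 0.3))
```

With this scheme the reconstruction error is at most half a quantization step, that is, (vmax - vmin) / (2 * (2**nbits - 1)), so the author can trade precision against size by choosing the range and the number of bits.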
MPEG-4 Jump-Start details the bitstream representation of many MPEG-4 tools using the Syntactic Description Language. We have chosen to follow the standard and to give the methods to decode the bitstreams. We indicate considerations for writing the encoders in general; there exist many ways to encode, and many tricks and optimizations are possible. However, for interoperability purposes, there should be only one way to interpret the bitstreams. In many cases, writing the encoder is straightforward from the methods given. For updates, tools, and node coding tables, please visit the companion Web site established for this book (http://web3dbooks.com/mpeg-4/jumpstart/).
This book's companion volume, More MPEG-4 Jump-Start, will cover the following topics:
http://web3dbooks.com/. To jump directly to the MPEG-4 Jump-Start support site, visit