The Object Model II
Last updated Jan 1, 2003.
The second part of this series discusses inheritance in its various flavors (single, multiple, and virtual) and shows how each one affects the underlying memory layout of a C++ object.
Single Inheritance
Every object-oriented programming language supports at least one type of inheritance: single inheritance. When you derive a new class from a base class, the user-defined data members of the base are implicitly re-declared in the derived class, preserving the same declaration order as in the base class. This generalization leaves us with three open questions:
- Do different access types in the base class affect its members' layout in the derived class?
- Are implicitly-defined data members (the vptr) copied into the derived class, too?
- Are all member functions of the base class implicitly re-declared in the derived class?
The C++ standard gives implementers considerable leeway with respect to the memory layout of a derived object; it even leaves the allocation order of base class subobjects unspecified. In practice, however, compilers allocate the base class subobject before any data members of the derived class. Consider the following class hierarchy:
class Base
{
private:
    int a;
    char b;
    void * p;
public:
    explicit Base(int a);
};
class Derived : public Base
{
private:
    double d;
public:
    Derived() : Base(0) {} //passes 0 to Base's constructor
};
The memory layout of Derived is similar to this:
struct S
{
    int a;
    char b;
    void * p; //last member of base subobject
    double d; //first member of derived
};
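The "base part first" layout can be observed indirectly by comparing addresses. The following is a minimal sketch, not a guarantee made by the Standard: it redefines Base and Derived with inline constructors so that it compiles on its own, and the helper name base_subobject_comes_first is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the Base and Derived classes discussed above, with inline
// constructors added so that the sketch is self-contained.
class Base
{
private:
    int a;
    char b;
    void* p;
public:
    explicit Base(int a_) : a(a_), b(0), p(nullptr) {}
};

class Derived : public Base
{
private:
    double d;
public:
    Derived() : Base(0), d(0.0) {}
};

// Hypothetical helper: returns true if the Base subobject starts at the same
// address as the complete Derived object, i.e. the base part is laid out first.
bool base_subobject_comes_first()
{
    Derived obj;
    const Base* as_base = &obj; // implicit derived-to-base conversion
    assert(as_base != nullptr);
    return reinterpret_cast<std::uintptr_t>(as_base) ==
           reinterpret_cast<std::uintptr_t>(&obj);
}
```

On the mainstream compilers I'm aware of, base_subobject_comes_first() returns true for a single, non-virtual base like this one; treat the result as an observation about a particular implementation rather than a portable rule.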
As noted in the first part of this series, the Standard requires that data members declared without an intervening access specifier be allocated in declaration order: the first member has the address that is identical or closest to this, and the last member has the address that is farthest from this.
Officially, the allocation order of nonstatic data members separated by an access specifier is unspecified. In practice, however, virtually all compilers ignore access specifiers with respect to members' order. There have been a few attempts to automatically reorder members with intervening access specifiers so that the resulting class is laid out more efficiently. For example, an implementation that processes the following suboptimal class declaration:
class C
{
    bool b1;
protected:
    int n;
protected:
    bool b2;
};
is allowed to reorder the members' layout like this:
class C_ //after compiler-initiated member layout optimization
{
    bool b1;
    bool b2;
    int n;
};
This way, sizeof(C_) is only 8 bytes on a typical 32-bit implementation, as opposed to the 12 bytes that the original declaration order of C dictates. In reality, however, tampering with the original declaration order could cause serious binary incompatibilities and surprise programmers. Therefore, compilers don't implement member layout optimization in practice, although there are source code analyzers that can suggest a better member declaration order.
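Since compilers won't reorder members for you, the reordering has to be done by hand. The sketch below compares the two declaration orders; the struct names AsDeclared and Reordered and the helper bytes_saved are hypothetical, and the exact sizes depend on the platform's alignment rules.

```cpp
#include <cassert>
#include <cstddef>

// Members in the original, suboptimal order (access specifiers omitted so
// that the declaration order is the order the compiler must honor).
struct AsDeclared
{
    bool b1;
    int  n;
    bool b2;
};

// The same members, manually reordered so that the two bools are adjacent.
struct Reordered
{
    bool b1;
    bool b2;
    int  n;
};

// Hypothetical helper: how many bytes the manual reordering saves here.
std::size_t bytes_saved()
{
    assert(sizeof(Reordered) <= sizeof(AsDeclared));
    return sizeof(AsDeclared) - sizeof(Reordered);
}
```

On an implementation with a 1-byte bool and a 4-byte, 4-byte-aligned int, I would expect bytes_saved() to report the same 4-byte difference described above.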
Alignment and Padding
The C++ standard guarantees the relative order of user-declared data members without an intervening access specifier. However, there's no guarantee that members are allocated at adjacent memory addresses. Alignment requirements might cause the compiler to insert padding bytes after a member so that the member that follows starts at a suitably aligned address. For this reason, a typical 32-bit compiler allocates the member p of class Base at an address that is four bytes farther than b's address, even though b occupies only a single byte. The three padding bytes between b and p are unnamed and ignored; they usually contain an indeterminate value, unless the object in question has static storage duration:
void func()
{
    static C c1; //all padding bytes are zeroed
    C c2; //padding bytes have indeterminate values
}
Figuring out ways to examine the content of the padding bytes of c1 and c2 is left as an exercise to the reader.
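One possible starting point, sketched under a few assumptions: the structs BaseLayout and Flags are hypothetical public-member mirrors of Base and C (so that offsetof is well defined), and the helper names are mine. The sketch measures the gap between b and p and inspects the raw bytes of a static object through an unsigned char pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical standard-layout mirror of Base's data members.
struct BaseLayout
{
    int   a;
    char  b;
    void* p;
};

// Number of unnamed padding bytes between the end of b and the start of p.
std::size_t padding_after_b()
{
    assert(offsetof(BaseLayout, p) > offsetof(BaseLayout, b));
    return offsetof(BaseLayout, p) - (offsetof(BaseLayout, b) + sizeof(char));
}

// Hypothetical mirror of class C with public members.
struct Flags
{
    bool b1;
    int  n;
    bool b2;
};

// Returns true if every byte of the object, padding included, is zero.
bool all_bytes_zero(const void* obj, std::size_t size)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(obj);
    for (std::size_t i = 0; i < size; ++i)
        if (bytes[i] != 0)
            return false;
    return true;
}

// A static object is zero-initialized, padding bytes included.
bool static_object_is_zeroed()
{
    static Flags f;
    return all_bytes_zero(&f, sizeof f);
}
```

Applying all_bytes_zero() to an uninitialized automatic object is the interesting half of the exercise: the bytes it sees there are whatever happened to be on the stack, so the result can differ from run to run.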
Inheritance and Polymorphism
In the previous examples, I used an unrealistic inheritance model, in which the base class doesn't declare any virtual member functions. In practice, base classes declare at least a virtual destructor. When a base class has a virtual destructor, the memory layout of a derived object is slightly different because the vptr must be inserted into the derived class.
Compilers can be divided into two categories with respect to their vptr's position. UNIX compilers typically place the vptr after the last user-declared data member, whereas Windows compilers place it as the first data member of the object, before any user-declared data members. Each approach has its pros and cons:
class A
{
    int x;
public:
    virtual int f();
};
A a;
The memory layout of a resembles that of either S1 or S2:
struct S1 //UNIX style, vptr at the end
{
    int x;
    void * __vptr;
};
struct S2 //Windows style, vptr at offset 0
{
    void * __vptr;
    int x;
};
Placing the vptr at offset 0 ensures that its position in a polymorphic object always coincides with this. Windows compilers employ certain optimizations based on the vptr's fixed position. The UNIX tradition of placing the vptr at the end means that its precise location must be calculated for each class. On the other hand, placing the vptr at the end keeps the user-declared members of a C++ object at the same offsets as in an equivalent POD C struct. In the early days of C++, C++ compilers were implemented as translators that produced intermediary C code. Therefore, such binary compatibility was useful for debugging and testing. In addition, it enabled programmers to combine legacy C code with new C++ applications more easily.
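You can find out which convention your own compiler follows by measuring where the first data member lands. This is a rough sketch: Plain and Poly are hypothetical stand-ins for class A with a public x, and the helper names are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Plain // no virtual functions, hence no vptr
{
    int x;
};

struct Poly // like class A above, but with x public for the measurement
{
    int x;
    virtual int f() { return x; }
};

// Byte distance from the start of a Poly object to its member x.
std::ptrdiff_t offset_of_x()
{
    Poly obj;
    obj.x = 0;
    const char* start  = reinterpret_cast<const char*>(&obj);
    const char* member = reinterpret_cast<const char*>(&obj.x);
    assert(member >= start);
    return member - start;
}

// True if something (presumably the vptr) is stored ahead of x.
bool vptr_precedes_members()
{
    return offset_of_x() > 0;
}
```

A nonzero offset suggests the S2 layout, while an offset of zero suggests S1. Note that many current compilers, including those that follow the Itanium C++ ABI, place the vptr at offset 0 regardless of the operating system, so don't rely on the UNIX/Windows split when interpreting the result.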
Is the vptr Reduplicated in a Derived Class?
What happens when a class inherits from a class that already has a vptr? The derived class doesn't get two vptrs. Instead, it gets a single vptr that points to the correct table of virtual functions. In other words, in a single-inheritance hierarchy every polymorphic object has exactly one vptr, whose value depends on the object's class:
class A //has a vptr
{
    int x;
public:
    virtual int f();
};
class Derived: public A
{
    double y;
public:
    virtual int g();
};
struct D1 //UNIX style vptr location
{
    int x;
    double y;
    void * vptr; //pushed after all user-declared data
};
struct D2 //Windows style vptr location
{
    void * vptr; //always at offset 0
    int x;
    double y;
};
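A quick way to convince yourself that no second vptr is added is to derive a class that introduces new virtual functions but no data and compare sizes. The class names Shape and Triangle and the helpers below are hypothetical, and the size equality is what I'd expect from mainstream compilers rather than something the Standard promises.

```cpp
#include <cassert>

struct Shape // polymorphic base: carries one vptr
{
    int id;
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};

struct Triangle : Shape // adds a new virtual function, but no data
{
    int sides() const override { return 3; }
    virtual int angles() const { return 3; }
};

// Calls through a base reference pick the function from the object's vtable.
int sides_through_base(const Shape& s)
{
    return s.sides();
}

// If deriving added a second vptr, Triangle would be larger than Shape.
bool derived_adds_no_second_vptr()
{
    return sizeof(Triangle) == sizeof(Shape);
}

void check_dispatch()
{
    Shape s;
    Triangle t;
    assert(sides_through_base(s) == 0); // Shape's vtable
    assert(sides_through_base(t) == 3); // Triangle's vtable, same vptr slot
}
```

The two objects store their vptr in the same slot; only the value differs, which is why the same call through Shape& reaches different functions.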
Multiple Inheritance
When a class has two or more direct base classes, the compiler juxtaposes each base class subobject into the resulting object according to the bases' declaration order:
class A
{
    int one;
};
class B
{
    double two;
};
class D: public A, public B
{
    void * three;
};
The memory layout of a D object is:
struct D
{
    int one; //first base class
    double two; //second base class
    void * three; //D's own data
};
If any of the three classes has a virtual function, the compiler will insert the vptr either before one or after three.
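The juxtaposition of base subobjects shows up as a pointer adjustment when a D* is converted to a pointer to its second base. Here is a minimal sketch, using struct versions of A, B and D and a hypothetical helper template base_offset; the concrete offsets are implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

struct A { int one; };
struct B { double two; };
struct D : A, B { void* three; };

// Hypothetical helper: byte offset of the BaseT subobject within d.
template <class BaseT>
std::ptrdiff_t base_offset(const D& d)
{
    const BaseT* base = &d; // the compiler adjusts the address if needed
    return reinterpret_cast<const char*>(base) -
           reinterpret_cast<const char*>(&d);
}

void check_base_order()
{
    D d = D();
    // The two base subobjects cannot occupy the same bytes.
    assert(base_offset<A>(d) != base_offset<B>(d));
}
```

On the compilers I've tried, base_offset<A>(d) is 0 and base_offset<B>(d) is at least sizeof(A), which matches the declaration order of the bases.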
Virtual Inheritance
When virtual inheritance is used, all hell breaks loose. The problem is that the compiler tries its best to ensure that the resulting object shall not contain multiple copies of any of the virtual base classes. The common solution employed by virtually all compilers (pun unintended) is to use a pointer to the virtual (shared) base class subobject. The most derived object thus accesses members from its virtual base classes through a pointer. This extra layer of indirection has runtime and memory layout ramifications: accessing data members of a virtual base class might be slower, and the exact location of the shared subobject within the most derived object is compiler-dependent.
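The "single shared copy" property itself is guaranteed, even though the layout that implements it is not. The following sketch contrasts a diamond built with virtual inheritance against one built without it; all class and function names are hypothetical.

```cpp
#include <cassert>

struct Top
{
    int value;
    Top() : value(0) {}
};

// Diamond with virtual inheritance: one shared Top subobject.
struct Left   : virtual Top {};
struct Right  : virtual Top {};
struct Bottom : Left, Right {};

// Same shape without virtual inheritance: two separate Top subobjects.
struct LeftCopy     : Top {};
struct RightCopy    : Top {};
struct BottomCopies : LeftCopy, RightCopy {};

// Both paths to Top must lead to the same subobject.
bool shares_one_top(Bottom& b)
{
    Top* via_left  = static_cast<Left*>(&b);
    Top* via_right = static_cast<Right*>(&b);
    return via_left == via_right;
}

// Without virtual inheritance, each path reaches its own copy of Top.
bool has_two_tops(BottomCopies& b)
{
    Top* via_left  = static_cast<LeftCopy*>(&b);
    Top* via_right = static_cast<RightCopy*>(&b);
    return via_left != via_right;
}

void check_shared_state()
{
    Bottom b;
    static_cast<Left&>(b).value = 42;          // write through one path
    assert(static_cast<Right&>(b).value == 42); // read through the other
}
```

Comparing sizeof(Bottom) with sizeof(BottomCopies) on your own compiler gives a feel for the bookkeeping cost of the extra indirection; I haven't listed numbers here because they vary between implementations.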
Member Functions
The last of the three questions above still remains: Are all member functions of the base class implicitly redeclared in the derived class? Yes, they are. Even if the base class declares private member functions that the derived class can't access, the member functions of the base class are recognized in the derived class by their name. More importantly, these member functions may affect the derived class' memory layout. Think for example of a virtual private member function declared in a base class. Any class derived from that base class will have a vptr, even if the derived class doesn't declare any virtual member functions of its own:
class Base
{
private:
    virtual void func(); //private, yet it makes Base polymorphic
};
class Derived: public Base
{
    int g();
};
C++ has a simple rule: once polymorphic, always polymorphic. When you add a virtual member function to a class, any classes derived from the latter shall be polymorphic as well. Therefore, class Derived is polymorphic and has an implicit vptr member even if it cannot call the private member func() directly. If private or protected inheritance were used, Derived would still have the same memory layout:
class Derived: private Base
{
    int g();
};
struct layout_of_Derived
{
    void* __vptr;
};
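The "once polymorphic, always polymorphic" rule can be checked at compile time. This sketch assumes a C++11 compiler for static_assert and std::is_polymorphic; the class names PublicDerived and PrivateDerived are hypothetical variations on the Derived class above.

```cpp
#include <cassert>
#include <type_traits>

class Base
{
private:
    virtual void func() {} // private, yet it still makes Base polymorphic
};

class PublicDerived : public Base
{
public:
    int g() { return 0; } // non-virtual; adds nothing to the layout
};

class PrivateDerived : private Base
{
public:
    int g() { return 0; }
};

// Polymorphism is inherited regardless of the access level involved.
static_assert(std::is_polymorphic<Base>::value, "Base has a virtual function");
static_assert(std::is_polymorphic<PublicDerived>::value, "inherited via public base");
static_assert(std::is_polymorphic<PrivateDerived>::value, "inherited via private base");

// The kind of inheritance should not change the object's size.
bool same_size_either_way()
{
    assert(sizeof(PublicDerived) >= sizeof(void*)); // room for a vptr
    return sizeof(PublicDerived) == sizeof(PrivateDerived);
}
```

If the virtual keyword is removed from func(), the three static_asserts fail to compile, which is a convenient way to see that the vptr comes from Base alone.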
