C++ Reference Guide

The Object Model II

Last updated Jan 1, 2003.

The second part of this series discusses inheritance in its various flavors (single inheritance, multiple inheritance, and virtual inheritance) and shows how each of them affects the underlying memory layout of a C++ object.

Single Inheritance

Every object-oriented programming language supports at least one type of inheritance: single inheritance. When you derive a new class from a base class, the user-defined data members of the base are implicitly re-declared in the derived class, preserving the same declaration order as in the base class. This generalization leaves us with three open questions:

  • Do different access specifiers in the base class affect its members' layout in the derived class?
  • Are implicitly-defined data members (the vptr) copied into the derived class, too?
  • Are all member functions of the base class implicitly re-declared in the derived class?

The C++ standard gives implementers considerable leeway with respect to the memory layout of a derived object; it doesn't even specify the order in which base class subobjects are allocated. In practice, however, compilers allocate the base class subobject before any data members of the derived class. If we have the following class hierarchy:

class Base
{
private:
int a;
char b;
void * p;
public:
explicit Base(int a);
};
class Derived : public Base
{
private:
double d;
public:
Derived() : Base(0) {}
};
 

The memory layout of Derived is similar to this:

struct S
{
int a;
char b;
void * p; //last member of base subobject
double d; //first member of derived
};
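
The ordering can be checked on a given compiler by comparing member addresses. The following is only a minimal sketch, not part of the original example: the members are made public and the constructors are dropped so that the addresses are reachable, and the helper names offset_in, base_precedes_derived and check_member_order are hypothetical. The relative order of a, b and p is guaranteed; the position of d after them is what mainstream compilers do in practice.

```cpp
#include <cassert>
#include <cstddef>

// Same data members as Base and Derived, made public for inspection (an assumption).
struct Base {
    int a;
    char b;
    void* p;
};

struct Derived : Base {
    double d;
};

// Distance in bytes from the start of obj to one of its members.
template <class Object, class Member>
std::ptrdiff_t offset_in(const Object& obj, const Member& member) {
    return reinterpret_cast<const char*>(&member) -
           reinterpret_cast<const char*>(&obj);
}

// True when a, b, p and d sit at increasing addresses, in that order.
bool base_precedes_derived() {
    Derived obj{};
    return offset_in(obj, obj.a) < offset_in(obj, obj.b) &&
           offset_in(obj, obj.b) < offset_in(obj, obj.p) &&
           offset_in(obj, obj.p) < offset_in(obj, obj.d);
}

// Expected to pass on mainstream compilers; the position of d is not a language guarantee.
void check_member_order() {
    assert(base_precedes_derived());
}
```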
 

As noted in the first part of this series, the Standard requires that data members declared without an intervening access specifier be allocated in declaration order: the first member has the address that is identical or closest to this, and the last member has the address farthest from this.

Officially, the allocation order of nonstatic data members separated by an access-specifier is unspecified. In practice, however, virtually all compilers ignore access specifiers with respect to members' order. There have been a few attempts to automatically reorder members with intervening access specifiers so that the resulting class would be laid out more efficiently. For example, an implementation that processes the following suboptimal class declaration:

class C
{
bool b1;
protected:
int n;
protected:
bool b2;
};
 

is allowed to reorder the members' layout like this:

class C_ //after compiler-initiated member layout optimization
{
bool b1;
bool b2;
int n;
};
 

This way, sizeof(C_) is only 8 bytes on a typical 32-bit implementation, as opposed to the 12 bytes that the original declaration order of C dictates. In reality, however, tampering with the original declaration order could cause serious binary incompatibilities and surprise programmers. Therefore, compilers don't implement member layout optimization in practice, but there are source code analyzers that can suggest a better member declaration order.
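
The effect of declaration order on size is easy to measure by hand. Here is a small sketch that declares the two orderings as plain structs; the names Original, Reordered, bytes_saved and check_reordering are hypothetical, and the access specifiers are omitted because they don't matter for the comparison.

```cpp
#include <cassert>
#include <cstddef>

struct Original {    // declaration order of C
    bool b1;
    int n;
    bool b2;
};

struct Reordered {   // declaration order of C_
    bool b1;
    bool b2;
    int n;
};

// Bytes saved by declaring the two bool members next to each other.
std::size_t bytes_saved() {
    return sizeof(Original) - sizeof(Reordered);
}

// Grouping the small members should never make the struct larger.
void check_reordering() {
    assert(sizeof(Reordered) <= sizeof(Original));
}
```

On an implementation with a 4-byte, 4-byte-aligned int and a 1-byte bool, I would expect bytes_saved() to return 4, which matches the 12 versus 8 bytes mentioned above.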

Alignment and Padding

The C++ standard guarantees the relative order of user-declared data members without an intervening access specifier. However, there's no guarantee that members shall be allocated on adjacent memory addresses. Alignment requirements might cause the compiler to insert padding bytes after members that are smaller than the hardware's native word size. For this reason, a typical 32-bit compiler will allocate the member p of class Base on a memory address that is four bytes farther than b's address, even though b occupies only a single byte. The three padding bytes between b and p are unnamed and ignored; they usually contain an indeterminate value, unless the object in question has static storage duration:

void func()
{
static C c1; //all padding bytes are zeroed
C c2; //padding bytes have indeterminate values
}
 

Figuring out ways to examine the content of the padding bytes of c1 and c2 is left as an exercise to the reader.
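
Measuring the size of the gap, as opposed to its content, is straightforward. The sketch below assumes a plain struct with the same three members as Base, so that offsetof is well defined; the names Members, padding_after_b, p_is_aligned and check_padding are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Same data members as Base, as a plain struct (an assumption made for offsetof).
struct Members {
    int a;
    char b;
    void* p;
};

// Number of unnamed padding bytes between the end of b and the start of p.
std::size_t padding_after_b() {
    return offsetof(Members, p) - (offsetof(Members, b) + sizeof(char));
}

// True when p starts at an offset that is a multiple of its alignment.
bool p_is_aligned() {
    return offsetof(Members, p) % alignof(void*) == 0;
}

// The gap exists only to align p, so it is always smaller than p's alignment.
void check_padding() {
    assert(p_is_aligned());
    assert(padding_after_b() < alignof(void*));
}
```

With a 4-byte int, padding_after_b() should return 3 on common 32-bit and 64-bit targets alike, because p lands at offset 8 in both cases.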

Inheritance and Polymorphism

In the previous examples, I used an unrealistic inheritance model, in which the base class doesn't declare any virtual member functions. In practice, base classes declare at least a virtual destructor. When a base class has a virtual destructor, the memory layout of a derived object is slightly different because the vptr must be inserted into the derived class.

Compilers can be divided into two categories with respect to their vptr's position. UNIX compilers typically place the vptr after the last user-declared data member, whereas Windows compilers place it as the first data member of the object, before any user-declared data members. Each approach has its pros and cons:

class A
{
int x;
public:
virtual int f();
};
A a;
 

The memory layout of a resembles that of either S1 or S2:

struct S1 //Unix style, vptr at the end
{
int x; 
void * __vptr;
};
   
struct S2 //Windows style, vptr at offset 0
{
void * __vptr;
int x;
};
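
Which of the two layouts a particular compiler uses can be probed by looking at where x lands inside the object. This is only a sketch under a few assumptions: x is made public so that its address can be taken, f() is given a body, and the names offset_of_x and vptr_precedes_members are hypothetical. The result says nothing about other compilers or other class shapes.

```cpp
#include <cassert>
#include <cstddef>

class A {
public:
    int x = 0;                      // public here only so its address is reachable
    virtual int f() { return x; }
};

// Byte offset of A::x from the start of an A object.
std::ptrdiff_t offset_of_x() {
    A a;
    std::ptrdiff_t off = reinterpret_cast<const char*>(&a.x) -
                         reinterpret_cast<const char*>(&a);
    // Sanity check: x must lie inside the object.
    assert(off >= 0 && static_cast<std::size_t>(off) < sizeof(A));
    return off;
}

// True when something precedes x, which suggests the S2 layout (vptr first).
// False suggests the S1 layout (vptr after the user-declared members).
bool vptr_precedes_members() {
    return offset_of_x() > 0;
}
```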
 

Placing the vptr at offset 0 ensures that its address in a polymorphic object is always the same as this. Windows compilers employ certain optimizations based on the vptr's fixed position. The UNIX tradition of placing the vptr at the end means that its precise location must be calculated for each class. And yet, placing the vptr at the end keeps the leading portion of a C++ object layout-compatible with a POD C struct. In the early days of C++, C++ compilers were implemented as translators that produced intermediary C code. Therefore, such binary compatibility was useful for debugging and testing. In addition, it enabled programmers to combine legacy C code in new C++ apps more easily.

Is the vptr Duplicated in a Derived Class?

What happens when a class inherits from a class that already has a vptr? The derived class doesn't get two vptrs. Instead, it gets a single vptr that points to the correct table of virtual functions. In other words, every object in a single-inheritance hierarchy has exactly one vptr, and its value depends on the object's class:

class A //has a vptr
{
int x;
public:
virtual int f();
};
   
class Derived: public A
{
double y;
public:
virtual int g();
};
struct D1 //UNIX style vptr location
{
int x;
double y;
void * vptr; //pushed after all user-declared data
};
struct D2 //Windows style vptr location
{
void * vptr;//always at offset 0
int x;
double y;
};
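
A rough way to confirm that there is only one vptr is to compare sizeof(Derived) with plain structs that hold one and two pointers. The sketch below is an approximation, not a proof: the structs OneVptr and TwoVptrs, the override of f() in Derived, the function bodies, and the helper names are all assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>

class A {
    int x = 1;
public:
    virtual int f() { return x; }
};

class Derived : public A {
    double y = 2.0;
public:
    int f() override { return static_cast<int>(y); }  // added here to show dispatch
    virtual int g() { return 3; }
};

// Plain structs used only as size yardsticks.
struct OneVptr  { void* vptr; int x; double y; };
struct TwoVptrs { void* vptr1; int x; void* vptr2; double y; };

// True when Derived needs no more room than a struct with a single hidden pointer.
bool derived_fits_one_vptr() {
    return sizeof(Derived) <= sizeof(OneVptr) &&
           sizeof(Derived) < sizeof(TwoVptrs);
}

// Calls f() through a base reference; the object's vptr selects the function.
int call_f(A& obj) { return obj.f(); }

// Expected to pass on mainstream compilers; the size comparison is not guaranteed.
void check_single_vptr() {
    A a;
    Derived d;
    assert(derived_fits_one_vptr());
    assert(call_f(a) == 1);   // a's vptr refers to A's table
    assert(call_f(d) == 2);   // d's single vptr refers to Derived's table
}
```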
 

Multiple Inheritance

When a class has two or more direct base classes, the compiler juxtaposes each base class subobject into the resulting object according to the bases' declaration order:

class A
{
int one;
};
class B
{
double two;
};
class D: public A, public B
{
void * three;
};
 

The memory layout of a D object is:

struct D
{
int one; //first base class
double two; //second base class
void * three; //D's own data 
};
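
One visible consequence of this layout is that converting a D pointer to a pointer to its second base changes the address. The following sketch measures that adjustment; it uses structs with default member initializers instead of the classes above, and the names subobject_offset and check_base_order are hypothetical. The declaration-order placement it checks is common practice, not a guarantee of the Standard.

```cpp
#include <cassert>
#include <cstddef>

struct A { int one = 1; };
struct B { double two = 2.0; };
struct D : public A, public B { void* three = nullptr; };

// Byte offset of the BaseT subobject inside a D object.
template <class BaseT>
std::ptrdiff_t subobject_offset(const D& d) {
    const BaseT* base = &d;   // the implicit conversion adjusts the pointer
    return reinterpret_cast<const char*>(base) -
           reinterpret_cast<const char*>(&d);
}

// Expected to pass on mainstream compilers: A comes first, B follows it.
void check_base_order() {
    D d;
    assert(subobject_offset<A>(d) == 0);
    assert(subobject_offset<B>(d) >= static_cast<std::ptrdiff_t>(sizeof(A)));
}
```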
 

If any of the three classes has a virtual function, the compiler will insert the vptr either before one or after three.

Virtual Inheritance

When virtual inheritance is used, all hell breaks loose. The problem is that the compiler tries its best to ensure that the resulting object shall not contain multiple copies of any of the virtual base classes. The common solution employed by virtually all compilers (pun unintended) is to use a pointer to the virtual (shared) base class subobject. The most derived object thus accesses members from its virtual base classes through a pointer. This extra layer of indirection has runtime and memory layout ramifications: accessing data members of a virtual base class might be slower, and the exact location of the shared subobject within the most derived object is compiler-dependent.
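
The sharing itself is easy to observe, even though the mechanism behind it is compiler-dependent. Here is a minimal sketch with a diamond-shaped hierarchy of my own invention (V, Left, Right, Joined and their Plain counterparts are hypothetical names): with virtual inheritance both paths lead to the same V subobject, whereas with ordinary inheritance each path leads to its own copy.

```cpp
#include <cassert>

struct V { int value = 0; };

// Virtual inheritance: Left and Right share a single V inside Joined.
struct Left  : virtual V {};
struct Right : virtual V {};
struct Joined : Left, Right {};

// Ordinary inheritance: each intermediate class carries its own V.
struct PlainLeft  : V {};
struct PlainRight : V {};
struct PlainJoined : PlainLeft, PlainRight {};

// True when both paths through the hierarchy reach the same V subobject.
bool virtual_base_is_shared() {
    Joined j;
    V* via_left  = static_cast<Left*>(&j);
    V* via_right = static_cast<Right*>(&j);
    return via_left == via_right;
}

// True when the two paths reach two distinct V subobjects.
bool plain_bases_are_distinct() {
    PlainJoined j;
    V* via_left  = static_cast<PlainLeft*>(&j);
    V* via_right = static_cast<PlainRight*>(&j);
    return via_left != via_right;
}

// Both properties follow from the language rules, not from a particular layout.
void check_virtual_base() {
    assert(virtual_base_is_shared());
    assert(plain_bases_are_distinct());
}
```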

Member Functions

The last of the three questions above still remains: Are all member functions of the base class implicitly redeclared in the derived class? Yes, they are. Even if the base class declares private member functions that the derived class can't access, the member functions of the base class are recognized in the derived class by their name. More importantly, these member functions may affect the derived class' memory layout. Think for example of a virtual private member function declared in a base class. Any class derived from that base class will have a vptr, even if the derived class doesn't declare any virtual member functions of its own:

class Base 
{
private: 
virtual void func();
};
class Derived: public Base
{
int g();
};
 

C++ has a simple rule: once polymorphic, always polymorphic. When you add a virtual member function to a class, any classes derived from the latter shall be polymorphic as well. Therefore, class Derived is polymorphic and has an implicit vptr member even if it cannot call the private member func() directly. If private or protected inheritance were used, Derived would still have the same memory layout:

class Derived: private Base
{
int g();
};
struct layout_of_Derived
{
void* __vptr;
};
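
The "once polymorphic, always polymorphic" rule can be verified with the standard type trait std::is_polymorphic. The sketch below assumes a Base whose func() is both private and virtual and gives the member functions empty bodies so that the code links; the names derived_carries_hidden_data and check_polymorphism are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

class Base {
private:
    virtual void func() {}   // private and virtual
};

class Derived : private Base {
    int g() { return 0; }    // not virtual, declares nothing polymorphic itself
};

// Holds on every conforming compiler: Derived inherits a virtual function.
static_assert(std::is_polymorphic<Derived>::value,
              "Derived is polymorphic through Base::func");

// True when Derived, which declares no data members, is at least pointer-sized.
bool derived_carries_hidden_data() {
    return sizeof(Derived) >= sizeof(void*);
}

// The size check reflects common practice; the trait check is guaranteed.
void check_polymorphism() {
    assert(std::is_polymorphic<Derived>::value);
    assert(derived_carries_hidden_data());
}
```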