C++ Reference Guide

Constructors

Last updated Jan 1, 2003.

A constructor initializes an object. A default constructor is one that can be invoked without any arguments. If there is no user-declared constructor for a class, C++ implicitly declares a default constructor for it. (If the class contains reference data members or const data members of a built-in type, the implicit constructor cannot be used, and you must write a constructor that initializes those members.) Such an implicitly declared default constructor performs the initialization operations needed to create an object of this type. Note, however, that these operations don't involve initialization of user-declared data members of built-in types. For example:

class C
{
private:
 int n;
 char *p;
public:
 virtual ~C() {}
};

void f()
{
 C obj; // 1 implicitly-defined constructor is invoked
}

C++ synthesized a constructor for class C because it contains a virtual member function. Upon construction, C++ initializes a hidden data member called the virtual pointer, which every polymorphic class has. This pointer holds the address of a dispatch table that contains all the virtual member functions' addresses for that class. The synthesized constructor doesn't initialize the data members n and p, nor does it allocate memory for the data pointed to by the latter. These data members have an indeterminate value once obj has been constructed. This is because the synthesized default constructor performs only the initialization operations that are required by the implementation—not the programmer—to construct an object.

Other implementation-required operations that are performed by implicitly-defined constructors are the invocation of base class constructors and the invocation of the constructors of embedded objects. However, C++ doesn't declare a default constructor for a class if the programmer has declared any constructor for it. For example:

class C
{
private:
 int n;
 char *p;
public:
 C() : n(0), p(0) {}
 virtual ~C() {}
};
C obj; // 1 user-defined constructor is invoked

Now the data members of the object obj are initialized because the user-defined constructor was invoked to create it. This still leaves us with a mystery: The user-defined constructor only initializes the data members n and p. When did the initialization of the virtual pointer take place? Here's the answer: The compiler augments the user-defined constructor with additional code, which is inserted into the constructor's body before any user-written code, and performs the necessary initialization of the virtual pointer.
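The effect of this augmentation can be observed indirectly. The following is a minimal sketch, not the article's own code: the classes Shape and Triangle and the helper sides_of() are hypothetical names introduced for illustration. The user-written constructor only touches n and p, yet virtual dispatch works on the finished object, which suggests that the compiler has taken care of the virtual pointer on its own.

```cpp
#include <cassert>

// Hypothetical classes, introduced only for this illustration.
class Shape
{
private:
    int n;
    char *p;
public:
    Shape() : n(0), p(0) {}              // user code initializes n and p only
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
    int count() const { return n; }
    bool has_buffer() const { return p != 0; }
};

class Triangle : public Shape
{
public:
    virtual int sides() const { return 3; }
};

// Calls a virtual function through a reference to the base class.
int sides_of(const Shape &s)
{
    return s.sides();
}

void demo_virtual_dispatch()
{
    Triangle t;
    assert(t.count() == 0);       // set by the user-written constructor
    assert(!t.has_buffer());      // p was initialized to a null pointer
    assert(sides_of(t) == 3);     // dispatch reaches Triangle::sides()
}
```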

Trivial Constructors

As noted previously, compilers synthesize a default constructor for every class or struct, unless a constructor was already defined by the user. However, in certain conditions, such a synthesized constructor is redundant:

class Empty {};
struct Person
{
 char name[20];
 double salary;
 int age;
};
int main()
{
 Empty e;
 Person p;
 p.age = 30; // public access allowed, no constructor needed
 return 0;
}

C++ can instantiate Empty and Person objects without a constructor. In such cases, the implicitly declared constructor is said to be trivial. The fact that a constructor is considered trivial means that neither the programmer nor the compiler generates code for it. We'll discuss this in greater detail shortly.
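If your compiler supports C++11 or later (an assumption; the text above predates that standard), you can ask the implementation whether it considers a default constructor trivial. The sketch below uses std::is_trivially_default_constructible from <type_traits>. The helper has_trivial_default_ctor() and the class Polymorphic are hypothetical names added for contrast.

```cpp
#include <cassert>
#include <type_traits>

class Empty {};

struct Person
{
    char name[20];
    double salary;
    int age;
};

// Hypothetical class: the virtual destructor makes its constructor non-trivial.
struct Polymorphic
{
    virtual ~Polymorphic() {}
};

// Hypothetical helper wrapping the C++11 type trait.
template <typename T>
bool has_trivial_default_ctor()
{
    return std::is_trivially_default_constructible<T>::value;
}

void demo_trivial_ctors()
{
    assert(has_trivial_default_ctor<Empty>());
    assert(has_trivial_default_ctor<Person>());
    assert(!has_trivial_default_ctor<Polymorphic>());
}
```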

Constructors of Built-in Types

You might be surprised to hear this: Built-in types such as char, int, and float also have constructors. You can initialize a variable by explicitly invoking its default constructor:

char c = char();
int n = int();

This expression:

char c = char();

is equivalent to this one:

char c = char(0);

Of course, it's possible to initialize a fundamental type with values other than 0:

float f = float (0.333);
char c = char ('a');
int *pi= new int (10);
float *pf = new float (0.333);

Note that this form is just "syntactic sugar." It's interchangeable with the more widely used form:

char c = 'a';
float f = 0.333;
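As a quick check of the claims above, here is a small sketch showing that the empty-parentheses form yields zero for fundamental types, both for automatic objects and for objects allocated with new. The function names are hypothetical and exist only to make the results easy to inspect.

```cpp
#include <cassert>

// Each function returns a value-initialized object of a fundamental type.
int default_int() { return int(); }
char default_char() { return char(); }
double default_double() { return double(); }

// new int() zero-initializes the allocated object; new int (no parentheses) would not.
int heap_default_int()
{
    int *pi = new int();
    int value = *pi;
    delete pi;
    return value;
}

void demo_builtin_defaults()
{
    assert(default_int() == 0);
    assert(default_char() == '\0');
    assert(default_double() == 0.0);
    assert(heap_default_int() == 0);
}
```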

Explicit Constructors

Constructors have many peculiar characteristics. They don't have a name (and therefore can't be called directly, as opposed to ordinary member functions); they don't have a return value; and their address cannot be taken. Non-default constructors are even more odd. A constructor that takes a single argument operates as an implicit conversion operator by default. Today, I will explore this phenomenon in further detail and explain how to use the explicit qualifier to restrain constructors' behavior.

Constructor Categories

In essence, constructors are what differentiates a real object from a POD struct, as they automatically shape a raw chunk of memory into an object with a determinate state. A class may have several constructors (in this discussion, I'm referring to "plain constructors," not to copy constructors), each taking a different number or type of arguments. A constructor that may be called without any arguments is a default constructor.

A non-default constructor is one that takes one or more arguments. (The C++ literature provides no special name for this type of constructor, so I'm referring to it as a "non-default constructor.") Non-default constructors are further divided into two subcategories: those that take a single argument and thus operate as implicit conversion operators, and those that take multiple arguments. For example,

#include <ctime>
using std::time_t;

class Date
{
public:
 Date(); // default ctor; no arguments required
 // non-default ctors:
 Date(time_t t); // extracts date from a time_t value
 Date(int d, int m, int y); // builds a date from day, month, and year
};

The class Date has three constructors. The first one is a default constructor. The second one, which takes a single argument, operates as an implicit time_t to Date conversion operator. As such, it will be invoked automatically in a context that requires a time_t to Date conversion (remember that time_t is typically a synonym for long or a similar integral type). For example,

Date d=std::time(0);// invokes ctor # 2 to create d

You're probably wondering how it works. Under the hood, the compiler transforms this code into something like this:

//pseudo C++ code
time_t __value;
__value = std::time(0);
Date __temp(__value); //create a temporary Date object
Date d = __temp; //copy-construct d from the temporary
__temp.Date::~Date(); //destroy the temporary

In this example, the automatic conversion is intentional and useful. Yet, there are many cases in which an implicit conversion is undesirable, for example:

Date p=NULL; //actually meant 'Date *p=NULL'; compiles OK

Can you see what's happening here? NULL is defined as 0 or 0L in C++. The compiler silently transformed the code above into:

//pseudo C++ code
Date temp(0L); //create a temporary Date object
Date p = temp; //copy-construct p from the temporary
temp.Date::~Date(); //destroy temp

The explicit Keyword

The problem here is that the implicit conversion (taking place behind the programmer's back) switches off the compiler's static type checking. Without this implicit conversion, the compiler would have complained about a type mismatch. C++ creators perceived this problem long ago. They decided to add a "patch" to the language in the form of the keyword explicit. Constructors declared explicit will refuse to perform such implicit conversions:

class Date
{
//...
 explicit Date(time_t t); // no implicit conversions
};

Now, the previous examples will not compile:

Date d =std::time(0); //error, can't convert 'long' to 'Date' 
Date p=NULL; //also an error

To convert a time_t or any other integral value to a Date object, you need to use an explicit conversion now:

Date d(std::time(0)); //OK
Date d2= Date(std::time(0)); //same meaning as above
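To see the difference side by side, consider the following sketch. LooseDate and StrictDate are hypothetical stand-ins for the two versions of Date, and stamp_of() is a hypothetical helper. The sketch also assumes a C++11 compiler, because it uses std::is_convertible from <type_traits> to query whether an implicit conversion exists without triggering a compilation error.

```cpp
#include <cassert>
#include <ctime>
#include <type_traits>

// Hypothetical stand-in for Date with a converting constructor.
class LooseDate
{
    std::time_t stamp;
public:
    LooseDate(std::time_t t) : stamp(t) {}
    std::time_t value() const { return stamp; }
};

// Hypothetical stand-in for Date with an explicit constructor.
class StrictDate
{
    std::time_t stamp;
public:
    explicit StrictDate(std::time_t t) : stamp(t) {}
    std::time_t value() const { return stamp; }
};

// Accepts a LooseDate; a time_t argument is converted implicitly.
std::time_t stamp_of(const LooseDate &d) { return d.value(); }

bool loose_converts() { return std::is_convertible<std::time_t, LooseDate>::value; }
bool strict_converts() { return std::is_convertible<std::time_t, StrictDate>::value; }

void demo_explicit()
{
    std::time_t now = 5;
    LooseDate a = now;             // implicit conversion compiles
    StrictDate b(now);             // direct initialization is required here
    assert(a.value() == b.value());
    assert(stamp_of(now) == 5);    // a temporary LooseDate is created silently
    assert(loose_converts() && !strict_converts());
}
```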

Advice

If you examine a code corpus, say the Standard Library, you will see that most of the constructors taking one argument are explicit. One therefore could argue that this should have been the default behavior of constructors, whereas constructors that permit implicit conversions should have been the exception. Put differently, instead of explicit, C++ creators should have introduced the keyword implicit and changed the semantics of constructors so that they didn't function as implicit conversion operators.

However, this approach would have caused existing code to break. In some classes, for example std::complex and other mathematical classes, implicit conversions are rather useful. The C++ creators therefore decided to leave the original semantics of constructors intact, while introducing a mechanism for disabling implicit conversions when necessary.

As a rule, every constructor that takes a single argument, including constructors that take multiple arguments with default values such as the following, should be explicit, unless you have a good reason to allow implicit conversions:

#include <ios>

class File
{
public:
 //this ctor may be called with a single argument
 //it's therefore declared explicit:
 explicit File(const char *name,
        std::ios_base::openmode mode = std::ios_base::out,
        long protection = 0666);
};

Member Initialization Lists

A constructor may include a member initialization (mem-initialization for short) list that initializes the object's data members. For example:

class Cellphone //1: initialization by mem-init
{
private:
 long number;
 bool on;
public:
 Cellphone (long n, bool ison) : number(n), on(ison) {}
};

The constructor of Cellphone can also be written as follows:

Cellphone (long n, bool ison) //2: initialization within ctor's body
{
 number = n;
 on = ison;
}

There is no substantial difference between the two forms in this case because the compiler scans the mem-initialization list and inserts its code into the constructor's body before any user-written code. Thus, the constructor in the first example is expanded by the compiler into the constructor in the second example. Nonetheless, the choice between using a mem-initialization list and initialization inside the constructor's body is significant in the following four cases:

  • Initialization of const members. For example:

    class Allocator
    {
    private:
     const int chunk_size;
    public:
     Allocator(int size) : chunk_size(size) {}
    };
  • Initialization of reference members. For example:

    class Phone;
    class Modem
    {
    private:
     Phone & line;
    public:
      Modem(Phone & ln) : line(ln) {}
    };
  • Passing arguments to a constructor of a base class or an embedded object. For example:

    class base
    {
    //...
    public:
      //no default ctor
     base(int n1, char * t) {num1 = n1; text = t; }
    };
    class derived : public base
    {
    private:
     char *buf;
    public:
     //pass arguments to base constructor
     derived (int n, char * t) : base(n, t) { buf = (new char[100]);}
    };
  • Initialization of member objects. For example:

    #include<string>
    using std::string;
    class Website
    {
    private:
     string URL;
     unsigned int IP;
    public:
     Website()
     {
      URL = "";
      IP = 0;
     }
    };

In the first three cases, a mem-initialization list is mandatory because these initializations must be completed before the constructor's body executes. Conceptually, initializations listed in a member-initialization list take place before the constructor's body executes. In the fourth case, the use of a mem-init list is optional. However, it can improve performance in certain cases because it eliminates the unnecessary creation and subsequent destruction of objects that would occur if the initialization were performed inside the constructor's body.
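The extra work in the fourth case can be made visible with a member type that counts its own constructions and assignments. The sketch below is an illustration under stated assumptions: Tracer, ViaList, ViaBody, and reset_counts() are hypothetical names, and a real class such as std::string may behave differently in detail, although the pattern is the same.

```cpp
#include <cassert>

// Hypothetical member type that counts how it gets initialized.
struct Tracer
{
    static int constructions;
    static int assignments;
    int value;
    Tracer() : value(0) { ++constructions; }
    explicit Tracer(int v) : value(v) { ++constructions; }
    Tracer &operator=(const Tracer &rhs)
    {
        value = rhs.value;
        ++assignments;
        return *this;
    }
};
int Tracer::constructions = 0;
int Tracer::assignments = 0;

class ViaList   // initializes the member in the mem-initialization list
{
    Tracer member;
public:
    explicit ViaList(int v) : member(v) {}
    int value() const { return member.value; }
};

class ViaBody   // default-constructs the member, then assigns in the body
{
    Tracer member;
public:
    explicit ViaBody(int v) { member = Tracer(v); }
    int value() const { return member.value; }
};

void reset_counts() { Tracer::constructions = 0; Tracer::assignments = 0; }

void demo_counts()
{
    reset_counts();
    ViaList a(5);
    assert(Tracer::constructions == 1 && Tracer::assignments == 0);

    reset_counts();
    ViaBody b(5);   // default ctor + temporary + assignment
    assert(Tracer::constructions == 2 && Tracer::assignments == 1);
    assert(a.value() == b.value());
}
```

Both classes end up in the same state; the body-assignment form simply pays for one additional construction and one assignment along the way.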

Due to the performance difference between the two forms of initializing embedded objects, some programmers use mem-initialization exclusively. Note, however, that the order of the initialization list should match the order of declarations within the class. This is because the compiler initializes the members in the order in which they are declared in the class, regardless of the order specified by the programmer in the list. For example:

class Website
{
private:
 string URL; //1
 unsigned int IP; //2
public:
 Website() : IP(0), URL("") {} // initialized in reverse order
};

In the mem-initialization list, the programmer first initializes the member IP and then URL, even though IP is declared after URL. The compiler transforms the initialization list to the order of the member declarations within the class. In this case, the reverse order is harmless. When one initializer depends on the value of another member, however, this transformation can cause surprises. For example:

class Mystring
{
private:
 char *buff;
 int capacity;
public:
 explicit Mystring(int size) :
 capacity(size), buff (new char [capacity]) {} // undefined behavior
};

The mem-initialization list in the constructor of Mystring doesn't follow the order of declaration of Mystring's members. Consequently, the compiler transforms the list like this:

explicit Mystring(int size) :
buff (new char [capacity]), capacity(size) {}

The member capacity specifies the number of bytes that new has to allocate, but it hasn't been initialized. The results in this case are undefined. There are two ways to avert this pitfall: Change the order of member declarations so that capacity is declared before buff, or move the initialization of buff into the constructor's body.
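The first remedy can be sketched as follows. This is one possible arrangement, not the only one: the destructor, the disabled copy operations, and the accessors size() and data() are additions made for this illustration so that the class can be used safely. They aren't part of the original Mystring.

```cpp
#include <cassert>

class Mystring
{
private:
    int capacity;   // declared first, so it is initialized first
    char *buff;     // may now safely depend on capacity
    // Copying is disabled in this sketch to avoid double deletion.
    Mystring(const Mystring &);
    Mystring &operator=(const Mystring &);
public:
    explicit Mystring(int size) :
        capacity(size), buff(new char[capacity]) {}
    ~Mystring() { delete[] buff; }
    int size() const { return capacity; }   // hypothetical accessor
    char *data() { return buff; }           // hypothetical accessor
};

void demo_mystring()
{
    Mystring s(16);
    assert(s.size() == 16);
    s.data()[15] = 'x';            // the last byte of the buffer is writable
    assert(s.data()[15] == 'x');
}
```

Because the declaration order now agrees with the order of the list, the transformation performed by the compiler leaves the list unchanged.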

Copy Constructor

A copy constructor initializes its object with another object of the same class. If there is no user-declared copy constructor for a class, C++ implicitly declares one. A copy constructor is said to be trivial if it's implicitly declared, if its class has no virtual member functions and no virtual base classes, and if all of its direct base classes and embedded objects have trivial copy constructors. The implicitly defined copy constructor performs a memberwise copy of its sub-objects, as in the following example:

#include<string>
using std::string;
class Website //no user-defined copy constructor
{
private:
 string URL;
 unsigned int IP;
public:
 Website() : URL(""), IP(0) {}
};
int main ()
{
 Website site1;
 Website site2(site1); //invoke implicitly defined copy constructor
}

The programmer didn't declare a copy constructor for class Website. Because Website has an embedded object of type std::string, which happens to have a user-defined copy constructor, the implementation implicitly defines a non-trivial copy constructor for class Website and uses it to copy-construct the object site2 from site1. The synthesized copy constructor first invokes the copy constructor of std::string to copy URL, and then performs a bitwise copy of the remaining data member, IP, from site1 into site2.
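The following sketch illustrates this memberwise behavior. It extends Website with a two-argument constructor and a few accessors (url(), ip(), and set_url()), all of which are hypothetical additions made so that the result of the copy can be inspected. No copy constructor is declared.

```cpp
#include <cassert>
#include <string>

class Website // no user-declared copy constructor
{
private:
    std::string URL;
    unsigned int IP;
public:
    Website(const std::string &url, unsigned int ip) : URL(url), IP(ip) {}
    const std::string &url() const { return URL; }
    unsigned int ip() const { return IP; }
    void set_url(const std::string &url) { URL = url; }
};

void demo_memberwise_copy()
{
    Website site1("example.org", 7u);
    Website site2(site1);               // implicitly defined copy constructor

    assert(site2.url() == "example.org");
    assert(site2.ip() == 7u);

    // The string member was copied by std::string's own copy constructor,
    // so the two objects don't share state.
    site2.set_url("changed.example");
    assert(site1.url() == "example.org");
}
```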

Novices are sometimes encouraged to define the four special member functions for every class they write. As can be seen in the case of the Website class, not only is this unnecessary, but it's even undesirable under some conditions. The synthesized copy constructor (and the assignment operator, described in the next section) already "do the right thing." They automatically invoke the constructors of base and member sub-objects, they initialize the virtual pointer (if one exists), and they perform a bitwise copying of fundamental types. In many cases, this is exactly the programmer's intention anyway. Furthermore, the synthesized constructor and copy constructor enable the implementation to create code that's more efficient than user-written code, because it can apply optimizations that aren't always possible otherwise.