Pointers and Virtual Functions
Do not mistake the pointing finger for the moon.
Zen saying
In this section, we explore some of the more subtle points about virtual functions.
Virtual Functions and Extended Type Compatibility
If Derived is a derived class of the base class Base, then you can assign an object of type Derived to a variable (or parameter) of type Base, but not the other way around. A concrete example shows why this is sensible. DiscountSale is a derived class of Sale. (Refer to Displays 1 and 3.) You can assign an object of the class DiscountSale to a variable of type Sale, since a DiscountSale is a Sale. However, you cannot do the reverse assignment, since a Sale is not necessarily a DiscountSale. The fact that you can assign an object of a derived class to a variable (or parameter) of its base class is critically important for the reuse of code via inheritance. However, it does have its problems.
For example, suppose a program or unit contains the following class definitions:
    class Pet
    {
    public:
        string name;
        virtual void print( ) const;
    };

    class Dog : public Pet
    {
    public:
        string breed;
        virtual void print( ) const; //keyword virtual not needed,
                                     //but put here for clarity.
    };

    Dog vdog;
    Pet vpet;
Now concentrate on the data members, name and breed. (To keep this example simple, we have made the member variables public. In a real application, they should be private and have functions to manipulate them.)
Anything that is a Dog is also a Pet. It would seem to make sense to allow programs to consider values of type Dog to also be values of type Pet and hence the following should be allowed:
    vdog.name = "Tiny";
    vdog.breed = "Great Dane";
    vpet = vdog;
C++ does allow this sort of assignment. You may assign a value, such as the value of vdog, to a variable of a parent type, such as vpet (but you are not allowed to perform the reverse assignment). Although the above assignment is allowed, the value that is assigned to the variable vpet loses its breed field. This is called the slicing problem. The following attempted access will produce an error message:
cout << vpet.breed; // Illegal: class Pet has no member named breed
You can argue that this makes sense, since once a Dog is moved to a variable of type Pet it should be treated like any other Pet and not have properties peculiar to Dogs. This makes for a lively philosophical debate, but it usually just makes for a nuisance when programming. The dog named Tiny is still a Great Dane and we would like to refer to its breed, even if we treated it as a Pet someplace along the way.
Fortunately, C++ does offer us a way to treat a Dog as a Pet without throwing away the name of the breed. To do this, we use pointers to dynamic variables.
Suppose we add the following declarations:
Pet *ppet; Dog *pdog;
If we use pointers and dynamic variables we can treat Tiny as a Pet without losing his breed. The following is allowed.
    pdog = new Dog;
    pdog->name = "Tiny";
    pdog->breed = "Great Dane";
    ppet = pdog;
Moreover, we can still access the breed field of the dynamic variable pointed to by ppet, provided we do so through a virtual member function. Suppose that
Dog::print( ) const;
has been defined as follows:
    void Dog::print( ) const
    {
        cout << "name: " << name << endl;
        cout << "breed: " << breed << endl;
    }
The statement
ppet->print( );
will cause the following to be printed on the screen:
    name: Tiny
    breed: Great Dane
This nice output happens by virtue of the fact that print( ) is a virtual member function. (No pun intended.) We have included test code in Display 7.
Display 7 Defeating the Slicing Problem
    //Program to illustrate use of a virtual function to defeat the slicing problem.
    #include <string>
    #include <iostream>
    using std::string;
    using std::cout;
    using std::endl;

    class Pet
    {
    public:
        //We have made the member variables public to keep the example simple. In a
        //real application they should be private and accessed via member functions.
        string name;
        virtual void print( ) const;
    };

    class Dog : public Pet
    {
    public:
        string breed;
        virtual void print( ) const;
    };

    int main( )
    {
        Dog vdog;
        Pet vpet;

        vdog.name = "Tiny";
        vdog.breed = "Great Dane";
        vpet = vdog;
        cout << "The slicing problem:\n";
        //vpet.breed; is illegal since class Pet has no member named breed.
        vpet.print( );
        cout << "Note that it was print from Pet that was invoked.\n";

        cout << "The slicing problem defeated:\n";
        Pet *ppet;
        Dog *pdog;
        pdog = new Dog;
        pdog->name = "Tiny";
        pdog->breed = "Great Dane";
        ppet = pdog;
        //These two print the same output:
        //name: Tiny
        //breed: Great Dane
        ppet->print( );
        pdog->print( );

        //The following, which accesses member variables directly
        //rather than via virtual functions would produce an error:
        //cout << "name: " << ppet->name << " breed: "
        //     << ppet->breed << endl;

        delete pdog; //Return the dynamic variable to the freestore.
        return 0;
    }

    void Dog::print( ) const
    {
        cout << "name: " << name << endl;
        cout << "breed: " << breed << endl;
    }

    void Pet::print( ) const
    {
        cout << "name: " << name << endl;
    }
Sample Dialogue
    The slicing problem:
    name: Tiny
    Note that it was print from Pet that was invoked.
    The slicing problem defeated:
    name: Tiny
    breed: Great Dane
    name: Tiny
    breed: Great Dane
Object-oriented programming with dynamic variables is a very different way of viewing programming. This can all be bewildering at first. It will help if you keep two simple rules in mind:
1. If the domain type of the pointer pAncestor is an ancestor class for the domain type of the pointer pDescendent, then the following assignment of pointers is allowed:
       pAncestor = pDescendent;
   Moreover, none of the data members or member functions of the dynamic variable being pointed to by pDescendent will be lost.
2. Although all the extra fields of the dynamic variable are there, you will need virtual member functions to access them.
Pitfall: The Slicing Problem
Although it is legal to assign a derived class object to a base class variable, doing so slices off data. Any data members in the derived class object that are not also in the base class will be lost in the assignment, and any member functions that are not defined in the base class are similarly unavailable to the resulting base class object.
For example, if Dog is a derived class of Pet, then the following is legal:
    Dog vdog;
    Pet vpet;
    vpet = vdog;
However, vpet cannot be a calling object for a member function from Dog unless the function is also a member function of Pet, and all the member variables of vdog that are not inherited from the class Pet are lost. This is the slicing problem.
Note that simply making a member function virtual does not defeat the slicing problem. Note the following code from Display 7:
    Dog vdog;
    Pet vpet;
    vdog.name = "Tiny";
    vdog.breed = "Great Dane";
    vpet = vdog;
    . . .
    vpet.print( );
Although the object in vdog is of type Dog, when vdog is assigned to the variable vpet (of type Pet) it becomes an object of type Pet. So, vpet.print( ) invokes the version of print( ) defined in Pet, not the version defined in Dog. This happens despite the fact that print( ) is virtual. In order to defeat the slicing problem, the function must be virtual and you must use pointers and dynamic variables.
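The same pitfall applies to parameters as well as variables. The following is a minimal sketch under some assumptions: the member function describe and the helper functions describeByValue and describeByPointer are hypothetical names introduced only for illustration, and describe returns a string (rather than printing) so that the two results are easy to compare. A call-by-value parameter of type Pet slices a Dog argument, while a pointer parameter does not:

```cpp
#include <string>
using std::string;

class Pet
{
public:
    string name;
    //Hypothetical stand-in for print( ) that returns its text.
    virtual string describe( ) const { return "name: " + name; }
};

class Dog : public Pet
{
public:
    string breed;
    virtual string describe( ) const
    {
        return "name: " + name + ", breed: " + breed;
    }
};

//Value parameter: the argument is copied into a Pet object,
//so a Dog argument is sliced and Pet::describe is used.
string describeByValue(Pet p)
{
    return p.describe( );
}

//Pointer parameter: no copy is made, so a Dog stays a Dog
//and the virtual call reaches Dog::describe.
string describeByPointer(const Pet *p)
{
    return p->describe( );
}
```

If pdog points to a dynamic variable of type Dog with name "Tiny" and breed "Great Dane", then describeByValue(*pdog) yields only the name, while describeByPointer(pdog) yields both the name and the breed.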
Programming Tip: Make Destructors Virtual
It is a good policy to always make destructors virtual, but before we explain why this is a good policy we need to say a word or two about how destructors and pointers interact and about what it means for a destructor to be virtual.
Consider the following code, where SomeClass is a class with a destructor that is not virtual:
    SomeClass *p = new SomeClass;
    . . .
    delete p;
When delete is invoked with p, the destructor of the class SomeClass is automatically invoked. Now, let's see what happens when a destructor is marked as virtual.
The easiest way to describe how destructors interact with the virtual function mechanism is that destructors are treated as if all destructors had the same name (even though they do not really have the same name). For example, suppose Derived is a derived class of the class Base and suppose the destructor in the class Base is marked virtual. Now consider the following code:
    Base *pBase = new Derived;
    . . .
    delete pBase;
When delete is invoked with pBase, a destructor is called. Since the destructor in the class Base was marked virtual and the object pointed to is of type Derived, the destructor for the class Derived is called (and it in turn calls the destructor for the class Base). If the destructor in the class Base had not been declared as virtual, then in practice only the destructor in the class Base would be called. (Strictly speaking, the C++ standard leaves the behavior undefined in that case.)
Another point to keep in mind is that when a destructor is marked as virtual, then all destructors of derived classes are automatically virtual (whether or not they are marked virtual). Again, this behavior is as if all destructors had the same name (even though they do not).
Now we are ready to explain why all destructors should be virtual. To do this, consider what happens when destructors are not declared as virtual in a base class. In particular, consider the base class PFArrayD (partially filled array of doubles) and its derived class PFArrayDBak (partially filled array of doubles with backup). These classes were defined before we knew about virtual functions, and so the destructor in the base class PFArrayD was not marked virtual. In Display 8, we have summarized all the facts we need about the classes PFArrayD and PFArrayDBak.
Display 8 Review of the Classes PFArrayD and PFArrayDBak
    //Some details about the base class PFArrayD.
    //A more complete definition of PFArrayD is given in Chapter 14,
    //but this display has all the details you need for this chapter.
    class PFArrayD
    {
    public:
        PFArrayD( );
        . . .
        ~PFArrayD( ); //Should be virtual, but is not virtual.
    protected:
        double *a; //for an array of doubles.
        int capacity; //for the size of the array.
        int used; //for the number of array positions currently in use.
    };

    PFArrayD::PFArrayD( ) : capacity(50), used(0)
    {
        a = new double[capacity];
    }

    PFArrayD::~PFArrayD( )
    {
        delete [] a;
    }

    class PFArrayDBak : public PFArrayD
    {
    public:
        PFArrayDBak( );
        . . .
        ~PFArrayDBak( );
    private:
        double *b; //for a backup of main array.
        int usedB; //backup for inherited member variable used.
    };

    PFArrayDBak::PFArrayDBak( ) : PFArrayD( ), usedB(0)
    {
        b = new double[capacity];
    }

    PFArrayDBak::~PFArrayDBak( )
    {
        delete [] b;
    }
Consider the following code:
    PFArrayD *p = new PFArrayDBak;
    . . .
    delete p;
Since the destructor in the base class is not marked as virtual, only the destructor for the base class (PFArrayD) will be invoked. This will return the memory for the member array a (declared in PFArrayD) to the freestore, but the memory for the member array b (declared in PFArrayDBak) will never be returned to the freestore (until the program ends).
On the other hand, if (unlike Display 8) the destructor for the base class PFArrayD were marked virtual, then when delete is applied to p, the destructor for the class PFArrayDBak would be invoked (since the object pointed to is of type PFArrayDBak). The destructor for the class PFArrayDBak would delete the array b and then automatically invoke the destructor for the base class PFArrayD, and that would delete the member array a. So, with the base class destructor marked as virtual, all the memory is returned to the freestore. To prepare for eventualities such as these, it is best to always mark destructors as virtual.
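The order of destructor calls can be observed directly. The following is a minimal sketch, not the PFArrayD code itself: it reuses the class names Base and Derived from the discussion above, and the variable destructorLog and the function deleteThroughBasePointer are hypothetical names introduced only to record what happens when delete is applied to a base class pointer.

```cpp
#include <string>
using std::string;

//Hypothetical log that records which destructors run, in order.
string destructorLog;

class Base
{
public:
    virtual ~Base( ) { destructorLog += "~Base "; }
};

class Derived : public Base
{
public:
    Derived( ) : b(new double[50]) { }
    //Automatically virtual, because the destructor of Base is virtual.
    ~Derived( )
    {
        delete [] b;
        destructorLog += "~Derived ";
    }
private:
    double *b; //Would never be returned if ~Derived were not invoked.
};

//Deletes a Derived object through a pointer of type Base*
//and returns the record of the destructors that were invoked.
string deleteThroughBasePointer( )
{
    destructorLog = "";
    Base *pBase = new Derived;
    delete pBase; //Invokes ~Derived first, which then invokes ~Base.
    return destructorLog;
}
```

Calling deleteThroughBasePointer( ) returns "~Derived ~Base ", showing that the derived class destructor runs first and the base class destructor runs after it.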
Downcasting and Upcasting
You might think some sort of type casting would allow you to easily get around the slicing problem. However, things are not that simple. The following is illegal:
    Pet vpet;
    Dog vdog; //Dog is a derived class with base class Pet.
    . . .
    vdog = static_cast<Dog>(vpet); //ILLEGAL!
However, casting in the other direction is perfectly legal and does not even need a casting operator:
vpet = vdog; //Legal (but does produce the slicing problem.)
When you cast from a descendent type to an ancestor type, it is known as upcasting, since you are moving up the class hierarchy. Upcasting is safe, since you are simply disregarding some information (disregarding member variables and functions). So, the following is perfectly safe:
vpet = vdog;
When you cast from an ancestor type to a descendent type, it is called downcasting and is very dangerous, since you are assuming that information is being added (added member variables and functions). The dynamic_cast is used for downcasting. It is of some possible use in defeating the slicing problem, but is dangerously unreliable and fraught with pitfalls. A dynamic_cast may allow you to downcast, but it only works for pointer (or reference) types, as in the following:
    Pet *ppet;
    ppet = new Dog;
    Dog *pdog = dynamic_cast<Dog*>(ppet); //Dangerous!
We have had downcasting fail even in situations as simple as this and so we do not recommend it.
The dynamic_cast is supposed to inform you if it fails. If the cast fails the dynamic_cast should return NULL (which is really the integer 0). (The standard says, "The value of a failed cast to pointer type is the null pointer of the required result type. A failed cast to a reference type throws a bad_cast.")
If you want to try downcasting, keep the following points in mind:
1. You need to keep track of things so that you know the information to be added is indeed present.
2. The base class must have at least one virtual member function, since dynamic_cast uses the virtual function information to perform the cast.
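If you do downcast, always test the result before using it. The following is a minimal sketch of that check; the function breedOf is a hypothetical helper introduced for illustration, and Pet is given a virtual destructor here simply so that it has the virtual function that dynamic_cast requires.

```cpp
#include <cstddef>
#include <string>
using std::string;

class Pet
{
public:
    string name;
    virtual ~Pet( ) { } //Supplies the virtual function that dynamic_cast needs.
};

class Dog : public Pet
{
public:
    string breed;
};

//Returns the breed if ppet really points to a Dog,
//and the string "unknown" otherwise.
string breedOf(const Pet *ppet)
{
    const Dog *pdog = dynamic_cast<const Dog*>(ppet);
    if (pdog == NULL) //The cast failed: the object is not a Dog.
        return "unknown";
    return pdog->breed;
}
```

When ppet points to a Dog, the cast succeeds and the breed is returned. When it points to a plain Pet, the cast yields the null pointer and the function falls back to "unknown" instead of reading a member variable that does not exist.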
How C++ Implements Virtual Functions
You need not know how a compiler works in order to use it. That is the principle of information hiding, which is basic to all good program design philosophies. In particular, you need not know how virtual functions are implemented in order to use virtual functions. However, many people find that a concrete model of the implementation helps their understanding, and when reading about virtual functions in other books you are likely to encounter references to the implementation of virtual functions. So, we will give a brief outline of how they are implemented. Compilers for languages that have virtual functions (including C++) typically implement them in basically the same way.
If a class has one or more member functions that are virtual, then the compiler creates what is called a virtual function table for that class. This table has a pointer (memory address) for each virtual member function. The pointer points to the location of the correct code for that member function. So if one virtual function was inherited and not changed, then its table entry points to the definition for that function that was given in the parent class (or other ancestor class if need be). If another virtual function had a new definition in the class, then the pointer in the table for that member function points to that definition. (Remember that the property of being a virtual function is inherited, so once a class has a virtual function table, then all its descendent classes have a virtual function table.)
Whenever an object of a class with one or more virtual functions is created, another pointer is added to the description of the object that is stored in memory. This pointer points to the class's virtual function table. When you make a call to a member function using a pointer (yep, another one) to the object, the run-time system uses the virtual function table to decide which definition of a member function to use; it does not use the type of the pointer.
Of course, this all happens automatically, so you need not worry about it. A compiler writer is even free to implement virtual functions in some other way, so long as it works correctly (although in practice compilers rarely, if ever, implement them in a different way).
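The table-and-pointer model just described can be imitated by hand with ordinary function pointers. The following is only a sketch of the idea, not what any particular compiler actually generates. All of the names (VTable, PetModel, DogModel, petTable, dogTable, makePet, makeDog, and callDescribe) are hypothetical and were chosen for illustration.

```cpp
#include <string>
using std::string;

struct PetModel; //Defined below.

//A "virtual function table": one function pointer per virtual function.
struct VTable
{
    string (*describe)(const PetModel *self);
};

//Every object carries an extra pointer to its class's table.
struct PetModel
{
    const VTable *vptr;
    string name;
};

struct DogModel : PetModel
{
    string breed;
};

string petDescribe(const PetModel *self)
{
    return "name: " + self->name;
}

string dogDescribe(const PetModel *self)
{
    //Valid only because this function is reached solely through dogTable,
    //so self is known to point to a DogModel.
    const DogModel *dog = static_cast<const DogModel*>(self);
    return "name: " + dog->name + ", breed: " + dog->breed;
}

//One table per class, shared by all objects of that class.
const VTable petTable = { petDescribe };
const VTable dogTable = { dogDescribe };

PetModel makePet(const string& name)
{
    PetModel p;
    p.vptr = &petTable; //Done automatically by a real compiler.
    p.name = name;
    return p;
}

DogModel makeDog(const string& name, const string& breed)
{
    DogModel d;
    d.vptr = &dogTable;
    d.name = name;
    d.breed = breed;
    return d;
}

//Models a virtual call: the function is chosen by following the object's
//table pointer, not by looking at the type of the pointer p.
string callDescribe(const PetModel *p)
{
    return p->vptr->describe(p);
}
```

If tiny is produced by makeDog("Tiny", "Great Dane") and its address is stored in a pointer of type const PetModel*, then callDescribe still reaches dogDescribe, because the choice is made through the table pointer stored in the object.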