C++ Reference Guide

A Few Pedagogical Insights about C++ Teaching: Public Data Members

Last updated Jan 1, 2003.

Another controversial issue in Koenig's "Teaching C++ Badly: Introduce Constructors and Destructors at the Same Time" is public data members. Are they the biggest OOP heresy, or a commendable out-of-the-box design?

Public Data Members?!

Koenig uses this class to illustrate the validity of public data members:

  struct Point
  {
    Point(double x = 0, double y = 0): x(x), y(y) { }
    double x, y; //public data members
  };

Don't let the keyword struct mislead you -- Point is a non-POD type because it has a user-declared constructor. And yet, it exposes its data members, seemingly violating the most sacred principle of OOP -- encapsulation.
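The non-POD claim can be checked mechanically. The following is a minimal sketch that assumes a C++11 compiler, which the article predates: in C++11 terms, a POD type is one that is both trivial and standard-layout. PlainPoint and is_pod_like are hypothetical names introduced only for this illustration.

```cpp
#include <type_traits>

struct Point
{
    Point(double x = 0, double y = 0) : x(x), y(y) { }
    double x, y; // public data members
};

// Hypothetical C-style counterpart without a constructor, for comparison.
struct PlainPoint
{
    double x, y;
};

// Hypothetical helper: the C++11 definition of POD is "trivial and standard-layout".
template <typename T>
constexpr bool is_pod_like()
{
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

// Point's user-provided constructor makes it non-trivial, hence non-POD.
static_assert(!is_pod_like<Point>(), "Point is expected to be non-POD");
// Its layout is still that of a plain C struct.
static_assert(std::is_standard_layout<Point>::value, "Point is standard-layout");
// Without the constructor, the same two doubles form a POD.
static_assert(is_pod_like<PlainPoint>(), "PlainPoint is expected to be POD");
```

If the file compiles, all three assertions hold; the checks cost nothing at run time.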

Koenig explains: "I suppose that what I am about to say borders on heresy, but I see nothing wrong a priori with a class with public data members. You may cry, 'But it's not encapsulated!' but I will answer, 'You're right. So what?' This is a very simple class with an interface that is fundamentally the same as its implementation."

It takes a lot of courage to write such a class. But is it a good idea?

Analysis

Point isn't a hypothetical example. The C++ Standard Library has at least two classes whose data members have been the subject of a similar debate, namely std::pair<T1,T2> and std::complex<T>.

First, let's look at pair<T1,T2>. This class template merely packs two objects together. Therefore, its designers thought that there was no need to over-engineer its interface with getters and setters. Instead, they exposed its data members:

  #include <utility>
  std::pair<int,int> pi;
  pi.first=0; //direct access to a public data member
  pi.second=2; //another public data member

Where do you draw the line between well-behaved exposure of data members and blatant violations of OOP principles? Koenig presents a criterion: "[t]his is a very simple class with an interface that is fundamentally the same as its implementation." A similar argument was made when std::pair<>'s data members were declared public. And yet, when you look at std::complex, which is a very similar case (after all, a complex is nothing but two floating-point variables and a set of overloaded operators), the cracks begin to show: the data members of std::complex are private. To access them you have to use setters and getters:

  #include <complex>
  std::complex<double> cd(0.0);
  double d=cd.real(); //getter
  cd.real(11.4); //setter; this overload was standardized in C++11

Looking For an Objective Criterion

What's the difference between std::pair and std::complex? Why is the design of two similar classes so different? No one has a good answer. And that's exactly the problem with the rationale for class Point. Exposing data members whose "interface is fundamentally the same as its implementation" may sound convincing. Alas, you can stretch this argument to other cases as well. What's a string after all? Isn't it typically implemented as a char* and an integral type that stores its size? Why not declare string's data members public too, then? What about vector's data members?

Once you permit public data members "because they are fundamentally the same as the interface" you may find yourself with classes that are merely C structs with member functions. No one has come up with an objective litmus test that can tell good public data members from bad ones.

When there's no objective criterion, it's prudent and wise to stick to the most sacred OOP rule -- data members shall be private. Why? Because once you grant indiscriminate access to data members, you're ushering in the well-known ailment of procedural programming, i.e., dependencies on specific implementation details. It won't take long before your clients start writing code that depends on a certain implementation of Point. Changing the implementation later would then be impossible without breaking that code.

A Hypothetical Risk?

One might argue that the risk of implementation changes for Point (or std::pair<>) is hypothetical. Which other implementation options are possible anyway?

This question is in fact the litmus test we are seeking. Whenever you consider exposing data members because allegedly "they are fundamentally the same as the interface" ask yourself: "can this implementation ever change?" The answer is almost always "yes, it can". Consequently, public data members are almost always out of the question. To demonstrate this point, let's look again at class std::pair<>. Can its implementation change? Yes it can, in more than one way.

Think of a pair as a special case of an array with two elements. Instead of declaring two data members of type T, you can pack a pair as T[2]:

  template <typename T> class pair
  {
    T p[2];
    // ...
  };

This design, however unusual, offers a few benefits. An array ensures that the compiler doesn't insert padding bytes between the two data members of pair. Another benefit is that pointer arithmetic can be used to reach either element from the other.
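A slightly fuller sketch of the array-based design might look as follows. The class name array_pair and the accessor names first() and second() are assumptions made for this illustration; they are not part of the Standard Library. The sketch assumes a C++11 compiler and a default-constructible, copyable T.

```cpp
#include <cassert>

// Hypothetical class; named array_pair to avoid any confusion with std::pair.
template <typename T>
class array_pair
{
public:
    array_pair(const T& a = T(), const T& b = T()) : p{a, b} { }

    // Accessor functions replace the public data members first and second.
    T& first() { return p[0]; }
    const T& first() const { return p[0]; }
    T& second() { return p[1]; }
    const T& second() const { return p[1]; }

private:
    T p[2]; // array elements are contiguous: no padding between them
};

// Expected to hold on mainstream compilers. The language guarantees that
// array elements are contiguous, not that the class has no trailing padding.
static_assert(sizeof(array_pair<int>) == 2 * sizeof(int),
              "array_pair<int> should occupy exactly two ints");

void array_pair_demo()
{
    array_pair<int> ap(3, 4);
    assert(ap.first() == 3 && ap.second() == 4);

    ap.first() = 10; // write through the accessor
    assert(ap.first() == 10);

    // Pointer arithmetic leads from the first element to the second.
    assert(&ap.first() + 1 == &ap.second());
}
```

Because clients only ever call first() and second(), the array could later be swapped back for two separate members without touching their code.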

If the use of an array sounds contrived, consider another example: implementing pair as tuple<T,T> (technically, pair is a special case of tuple, but since the latter was added to C++ only in TR1, the two are implemented as independent classes). What about Point? Here again, alternative implementations are possible: implementing Point as std::pair<double,double>, as double[2], or even as a three-dimensional point whose third member isn't used.
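To make the argument concrete, here is a minimal sketch of two interchangeable Point implementations hidden behind the same accessor interface. The names PointXY, PointArray, distance_from_origin and check_same_behavior are hypothetical, introduced only for this illustration; Koenig's Point has no accessors.

```cpp
#include <cassert>
#include <cmath>

// Implementation 1: two separate data members.
class PointXY
{
public:
    PointXY(double x = 0, double y = 0) : x_(x), y_(y) { }
    double x() const { return x_; }
    double y() const { return y_; }
private:
    double x_, y_;
};

// Implementation 2: the same coordinates stored in an array.
class PointArray
{
public:
    PointArray(double x = 0, double y = 0) { c_[0] = x; c_[1] = y; }
    double x() const { return c_[0]; }
    double y() const { return c_[1]; }
private:
    double c_[2];
};

// Client code written against the interface only; it compiles unchanged
// with either implementation.
template <typename P>
double distance_from_origin(const P& p)
{
    return std::sqrt(p.x() * p.x() + p.y() * p.y());
}

void check_same_behavior()
{
    assert(distance_from_origin(PointXY(3, 4)) == 5.0);
    assert(distance_from_origin(PointArray(3, 4)) == 5.0);
}
```

Had the client read p.x and p.y as data members instead, switching to the array-based representation would have broken it.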

std::complex is also subject to such possible design changes. In retrospect, its authors were right when they decided to hide its implementation details by declaring them private.

In Conclusion

Classic OOD is pretty much obsolete in contemporary C++. You'll rarely see virtual member functions and inheritance in the Standard Library these days. However, it appears that encapsulation has lived up to its promise. I would think twice before exposing data members, as tempting as it may seem.