Modularity in C++17

By Bjarne Stroustrup
Aug 13, 2018

This chapter is from the book A Tour of C++, 2nd Edition.

Learn More Buy

3.6 Function Arguments and Return Values

The primary and recommended way of passing information from one part of a program to another is through a function call. Information needed to perform a task is passed as arguments to a function and the results produced are passed back as return values. For example:

int sum(const vector<int>& v)
{
     int s = 0;
     for (const int i : v)
           s += i;
     return s;
}

vector fib = {1,2,3,5,8,13,21};

int x = sum(fib);          // x becomes 53

There are other paths through which information can be passed between functions, such as global variables (§1.5), pointer and reference parameters (§3.6.1), and shared state in a class object (Chapter 4). Global variables are strongly discouraged as a known source of errors, and state should typically be shared only between functions jointly implementing a well-defined abstraction (e.g., member functions of a class; §2.3).

Given the importance of passing information to and from functions, it is not surprising that there are a variety of ways of doing it. Key concerns are:

Is an object copied or shared?
If an object is shared, is it mutable?
Is an object moved, leaving an “empty object” behind (§5.2.2)?

The default behavior for both argument passing and value return is “copy” (§1.9), but some copies can implicitly be optimized to moves.

In the sum() example, the resulting int is copied out of sum() but it would be inefficient and pointless to copy the potentially very large vector into sum(), so the argument is passed by reference (indicated by the &; §1.7).

The sum() function has no reason to modify its argument. This immutability is indicated by declaring the vector argument const (§1.6), so the vector is passed by const-reference.
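The three key concerns above (copy, share, move) can be illustrated with a small sketch. The function names by_value(), by_const_ref(), by_ref(), take(), and demo_passing() are made up for this illustration and do not come from the book; the asserts state the behavior we expect.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Copied: the function gets its own vector; the caller's vector is untouched.
std::size_t by_value(std::vector<int> v) { v.push_back(0); return v.size(); }

// Shared and immutable: no copy is made, and the function cannot modify v.
std::size_t by_const_ref(const std::vector<int>& v) { return v.size(); }

// Shared and mutable: the function modifies the caller's vector.
void by_ref(std::vector<int>& v) { v.push_back(0); }

// Moved: the function takes over the elements of its argument.
std::vector<int> take(std::vector<int>&& v) { return std::move(v); }

void demo_passing()
{
    std::vector<int> a = {1, 2, 3};

    std::size_t n = by_value(a);
    assert(n == 4 && a.size() == 3);    // only the copy grew

    assert(by_const_ref(a) == 3);       // read access, no copy

    by_ref(a);
    assert(a.size() == 4);              // the caller's vector grew

    std::vector<int> b = take(std::move(a));
    assert(b.size() == 4);              // b now owns the elements
    // a is left in a valid moved-from state (typically empty)
}
```

Calling demo_passing() exercises all four forms. Which one to choose depends on whether the callee needs its own copy, read-only access, write access, or ownership.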

3.6.1 Argument Passing

First consider how to get values into a function. By default we copy (“pass-by-value”) and if we want to refer to an object in the caller’s environment, we use a reference (“pass-by-reference”). For example:

void test(vector<int> v, vector<int>& rv)       // v is passed by value; rv is passed by reference
{
     v[1] = 99;     // modify v (a local variable)
     rv[2] = 66;    // modify whatever rv refers to
}

int main()
{
     vector fib = {1,2,3,5,8,13,21};
     test(fib,fib);
     cout << fib[1] << ' ' << fib[2] << '\n';     // prints 2 66
}

When we care about performance, we usually pass small values by-value and larger ones by-reference. Here “small” means “something that’s really cheap to copy.” Exactly what “small” means depends on machine architecture, but “the size of two or three pointers or less” is a good rule of thumb.

If we want to pass by reference for performance reasons but don’t need to modify the argument, we pass-by-const-reference as in the sum() example. This is by far the most common case in ordinary good code: it is fast and not error-prone.
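The rule of thumb for “small” can be written down as a compile-time check. This is only a sketch under stated assumptions: the names Point, norm2(), total(), and cheap_to_copy are hypothetical, and the extra requirement that the type be trivially copyable is our addition (a vector object is itself only a few pointers in size, but copying it copies all of its elements).

```cpp
#include <type_traits>
#include <vector>

struct Point { int x, y; };    // hypothetical small type

// Small and cheap to copy: pass by value.
int norm2(Point p) { return p.x*p.x + p.y*p.y; }

// Potentially large: pass by const reference to avoid the copy.
long total(const std::vector<int>& v)
{
    long s = 0;
    for (int i : v)
        s += i;
    return s;
}

// Assumption of this sketch: "small" means at most three pointers in size
// and copyable by simply copying the bytes.
template<typename T>
constexpr bool cheap_to_copy =
    sizeof(T) <= 3*sizeof(void*) && std::is_trivially_copyable_v<T>;

static_assert(cheap_to_copy<Point>);
static_assert(!cheap_to_copy<std::vector<int>>);
```

A check like this is a guideline, not a guarantee; measuring is the only reliable way to settle a performance question.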

It is not uncommon for a function argument to have a default value; that is, a value that is considered preferred or just the most common. We can specify such a default by a default function argument. For example:

void print(int value, int base = 10);  // print value in base "base"

print(x,16);    // hexadecimal
print(x,60);    // sexagesimal (Sumerian)
print(x);       // use the default: decimal

This is a notationally simpler alternative to overloading:

void print(int value, int base);    // print value in base "base"

void print(int value)               // print value in base 10
{
     print(value,10);
}
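To make the default-argument version concrete, here is a minimal sketch that returns the formatted text instead of printing it. The function to_string_base() is a hypothetical helper, not the book's print(), and it assumes bases from 2 to 36 only (so the sexagesimal call above is outside its range).

```cpp
#include <string>

// Hypothetical helper: format value in the given base; the default is decimal.
std::string to_string_base(int value, int base = 10)
{
    if (base < 2 || 36 < base)
        return "";                      // unsupported base (a choice made for this sketch)

    const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    const unsigned int b = static_cast<unsigned int>(base);
    const bool negative = value < 0;

    // Work on the magnitude as an unsigned value so that the most negative int is handled
    unsigned int u = negative ? 0u - static_cast<unsigned int>(value)
                              : static_cast<unsigned int>(value);

    std::string s;
    do {
        s.insert(s.begin(), digits[u % b]);   // prepend the next digit
        u /= b;
    } while (u != 0);

    if (negative)
        s.insert(s.begin(), '-');
    return s;
}
```

With this definition, to_string_base(255,16) yields "ff" and to_string_base(255) yields "255": the caller omits the base only when the default is wanted.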

3.6.2 Value Return

Once we have computed a result, we need to get it out of the function and back to the caller. Again, the default for value return is to copy and for small objects that’s ideal. We return “by reference” only when we want to grant a caller access to something that is not local to the function. For example:

class Vector {
public:
     // ...
     double& operator[](int i) { return elem[i]; }    // return reference to ith element
private:
     double* elem;     // elem points to an array of sz
     // ...
};

The ith element of a Vector exists independently of the call of the subscript operator, so we can return a reference to it.
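The Vector fragment above omits construction and destruction. A minimal complete sketch, assuming a simple constructor, a destructor, and disabled copying (all of which are our additions for illustration), shows the returned reference being used to modify an element:

```cpp
#include <cassert>

class Vector {
public:
    explicit Vector(int s) : elem{new double[s]{}}, sz{s} {}   // elements start as 0
    ~Vector() { delete[] elem; }

    Vector(const Vector&) = delete;              // copying omitted to keep the sketch short
    Vector& operator=(const Vector&) = delete;

    double& operator[](int i) { return elem[i]; }    // return reference to ith element
    int size() const { return sz; }
private:
    double* elem;     // elem points to an array of sz doubles
    int sz;
};

double demo_subscript()
{
    Vector v(3);
    v[1] = 2.5;          // write through the returned reference
    double& r = v[1];    // r refers to an element that outlives the call of operator[]
    r += 1;
    assert(v.size() == 3);
    return v[1];         // 3.5
}
```

The reference stays valid for as long as v exists, which is why returning it is safe here.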

On the other hand, a local variable disappears when the function returns, so we should not return a pointer or reference to it:

int& bad()
{
     int x;
     // ...
     return x;  // bad: return a reference to the local variable x
}

Fortunately, all major C++ compilers will catch the obvious error in bad(), typically by issuing a warning.

Returning a reference or a value of a “small” type is efficient, but how do we pass large amounts of information out of a function? Consider:

Matrix operator+(const Matrix& x, const Matrix& y)
{
     Matrix res;
     // ... for all res[i,j], res[i,j] = x[i,j]+y[i,j] ...
     return res;
}

Matrix m1, m2;
// ...
Matrix m3 = m1+m2;     // no copy

A Matrix may be very large and expensive to copy even on modern hardware. So we don’t copy, we give Matrix a move constructor (§5.2.2) and very cheaply move the Matrix out of operator+(). We do not need to regress to using manual memory management:

Matrix* add(const Matrix& x, const Matrix& y)     // complicated and error-prone 20th century style
{
     Matrix* p = new Matrix;
     // ... for all (*p)[i,j], (*p)[i,j] = x[i,j]+y[i,j] ...
     return p;
}

Matrix m1, m2;
// ...
Matrix* m3 = add(m1,m2);     // just copy a pointer
// ...
delete m3;                   // easily forgotten

Unfortunately, returning a large object by returning a pointer to it is common in older code and a major source of hard-to-find errors. Don’t write such code. Note that operator+() is as efficient as add(), but far easier to define, easier to use, and less error-prone.
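The claim that operator+() returns its result without a copy can be checked with a small sketch. This Matrix is a stand-in, not the book's class: the vector-based storage, the (i,j) call-operator indexing, and the copies counter are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Matrix {
public:
    Matrix(std::size_t r, std::size_t c) : rows{r}, cols{c}, data(r*c) {}

    Matrix(const Matrix& m) : rows{m.rows}, cols{m.cols}, data(m.data) { ++copies; }
    Matrix(Matrix&& m) noexcept                   // move: take over m's elements
        : rows{m.rows}, cols{m.cols}, data(std::move(m.data))
    {
        m.rows = m.cols = 0;
    }

    double& operator()(std::size_t i, std::size_t j) { return data[i*cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i*cols + j]; }
    std::size_t row_count() const { return rows; }
    std::size_t col_count() const { return cols; }

    static inline int copies = 0;   // number of copy constructions so far
private:
    std::size_t rows, cols;
    std::vector<double> data;
};

Matrix operator+(const Matrix& x, const Matrix& y)
{
    Matrix res(x.row_count(), x.col_count());   // assumes x and y have the same dimensions
    for (std::size_t i = 0; i != res.row_count(); ++i)
        for (std::size_t j = 0; j != res.col_count(); ++j)
            res(i,j) = x(i,j) + y(i,j);
    return res;                                 // moved out, or the move is elided
}

int demo_add()
{
    Matrix m1(2,2), m2(2,2);
    m1(0,0) = 1;
    m2(0,0) = 2;
    Matrix m3 = m1 + m2;
    assert(m3(0,0) == 3);
    return Matrix::copies;      // the copy constructor was never called
}
```

Here demo_add() returns 0: the result is either constructed directly in m3 or moved into it, but never copied.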

If a function cannot perform its required task, it can throw an exception (§3.5.1). This can help keep code from being littered with error-code tests for “exceptional problems.”

The return type of a function can be deduced from its return value. For example:

auto mul(int i, double d) { return i*d; }       // here, "auto" means "deduce the return type"

This can be convenient, especially for generic functions (function templates; §6.3.1) and lambdas (§6.3.3), but should be used carefully because a deduced type does not offer a stable interface: a change to the implementation of the function (or lambda) can change the type.
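The instability of a deduced return type can be made visible with static_assert. In this sketch, mul_v2() is a hypothetical later revision of mul() with the same parameters but a different implementation:

```cpp
#include <type_traits>

auto mul(int i, double d) { return i*d; }       // int*double: the deduced return type is double

// Hypothetical revision: same parameters, different implementation.
auto mul_v2(int i, double d) { return static_cast<float>(i*d); }   // now deduced as float

static_assert(std::is_same_v<decltype(mul(2, 1.5)), double>);
static_assert(std::is_same_v<decltype(mul_v2(2, 1.5)), float>);
```

A caller that stored the result in an auto variable would silently change type after such a revision; writing the return type explicitly avoids that.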

3.6.3 Structured Binding

A function can return only a single value, but that value can be a class object with many members. This allows us to efficiently return many values. For example:

struct Entry {
     string name;
     int value;
};

Entry read_entry(istream& is)     // naive read function (for a better version, see §10.5)
{
     string s;
     int i;
     is >> s >> i;
     return {s,i};
}

auto e = read_entry(cin);

cout << "{ " << e.name << "," << e.value << " }\n";

Here, {s,i} is used to construct the Entry return value. Similarly, we can “unpack” an Entry’s members into local variables:

auto [n,v] = read_entry(cin);
cout << "{ " << n << "," << v << " }\n";

The auto [n,v] declares two local variables n and v with their types deduced from read_entry()’s return type. This mechanism for giving local names to members of a class object is called structured binding.
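A self-contained version of this example can read from a string stream rather than from cin, which makes the result easy to check. The input text "answer 42" and the function demo_read() are made up for this sketch:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

struct Entry {
    std::string name;
    int value;
};

Entry read_entry(std::istream& is)     // naive read function, as in the text
{
    std::string s;
    int i = 0;
    is >> s >> i;
    return {s,i};
}

void demo_read()
{
    std::istringstream input{"answer 42"};     // stands in for cin
    auto [n,v] = read_entry(input);            // n is a string, v is an int
    assert(n == "answer");
    assert(v == 42);
}
```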

Consider another example:

map<string,int> m;
// ... fill m ...
for (const auto [key,value] : m)
      cout << "{" << key << "," << value << "}\n";

As usual, we can decorate auto with const and &. For example:

void incr(map<string,int>& m)     // increment the value of each element of m
{
     for (auto& [key,value] : m)
           ++value;
}
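The two forms of binding can be compared in a short runnable sketch. The function sum_values() and the sample map are additions for illustration; note that const auto& avoids copying each pair, whereas a plain const auto binding copies it.

```cpp
#include <map>
#include <string>

void incr(std::map<std::string,int>& m)     // increment the value of each element of m
{
    for (auto& [key,value] : m)             // value refers to the element in the map
        ++value;
}

int sum_values(const std::map<std::string,int>& m)
{
    int s = 0;
    for (const auto& [key,value] : m)       // read-only access, no copy of the pair
        s += value;
    return s;
}

int demo_incr()
{
    std::map<std::string,int> m = {{"a",1}, {"b",2}};
    incr(m);                    // m is now {a:2, b:3}
    return sum_values(m);       // 5
}
```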

When structured binding is used for a class with no private data, it is easy to see how the binding is done: there must be the same number of names defined for the binding as there are nonstatic data members of the class, and each name introduced in the binding names the corresponding member. There will not be any difference in the object code quality compared to explicitly using a composite object; the use of structured binding is all about how best to express an idea.

It is also possible to handle classes where access is through member functions. For example:

complex<double> z = {1,2};
auto [re,im] = z+2.0;       // re=3; im=2

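A complex has two data members, but its interface consists of access functions, such as real() and imag(). Mapping a complex<double> to two local variables, such as re and im, is feasible and efficient, but the technique for doing so is beyond the scope of this book. Note that the C++17 standard library does not itself supply this mapping for complex, so the example above assumes that such support has been added.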
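For the curious, one way to enable structured binding for a class whose data is private is the tuple-like protocol: specialize std::tuple_size and std::tuple_element and provide a get<I>() function. The sketch below applies it to a hypothetical class Cplx, not to std::complex, and is only one possible approach; it should not be read as the technique the book has in mind.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

class Cplx {      // hypothetical class: private data, access through member functions
public:
    Cplx(double r, double i) : re{r}, im{i} {}
    double real() const { return re; }
    double imag() const { return im; }
private:
    double re, im;
};

// get<I>() is found by argument-dependent lookup when a Cplx is unpacked
template<std::size_t I>
double get(const Cplx& c)
{
    static_assert(I < 2, "Cplx has exactly two components");
    if constexpr (I == 0)
        return c.real();
    else
        return c.imag();
}

namespace std {
    // Tell the language that a Cplx unpacks into two doubles
    template<> struct tuple_size<Cplx> : integral_constant<size_t, 2> {};
    template<size_t I> struct tuple_element<I, Cplx> { using type = double; };
}

double demo_binding()
{
    auto [re,im] = Cplx{3,2};     // re=3; im=2
    return re*10 + im;            // 32
}
```

Because get<I>() returns by value, re and im are copies of the components; changing them does not affect the Cplx object.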
