
2.4 Logical and Physical Name Cohesion

The ability to identify the physical location of the definition of essentially every logical construct — directly from its point of use — is an important aspect of design that distinguishes our methodology from others used in the software industry. The practical advantages of this aspect of design are many and are explored in this section.

2.4.1 History of Addressing Namespace Pollution

Global namespace pollution — specifically, local constructs usurping short common names — is an age-old problem. All of us have learned that naming a class Link or a function max at file scope — even in a .cpp file — is just asking for trouble. Left unmanaged, the probability of name conflicts increases combinatorially with program size. Developers have traditionally responded to this problem with ad hoc conventions for naming logical constructs based on what are hopefully unique prefixes (e.g., ls_Link, myMax, size_t). When the use of a logical construct is confined to a single .cpp file, we can always make individual functions static and nest local classes within the unnamed namespace. The problem of name collisions, however, extends to header files as well.
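As a minimal sketch of the two file-local techniques just mentioned, consider the following implementation file. The file name and the function largestOfThree are hypothetical, chosen only for illustration; Link and max are the short common names discussed above.

```cpp
// my_list.cpp  (hypothetical file name, for illustration only)

namespace {

// Local class nested within the unnamed namespace: 'Link' has internal
// linkage, so it cannot collide with a 'Link' defined in another .cpp file.
struct Link {
    int         d_value;
    const Link *d_next_p;
};

}  // close unnamed namespace

// File-scope helper made 'static': 'max' is likewise local to this file.
static int max(int a, int b)
{
    return a < b ? b : a;
}

// The only externally visible name defined by this (hypothetical) file.
int largestOfThree(int a, int b, int c)
{
    const Link third  = { c, nullptr };
    const Link second = { b, &third };
    const Link first  = { a, &second };

    int result = first.d_value;
    for (const Link *p = first.d_next_p; p != nullptr; p = p->d_next_p) {
        result = max(result, p->d_value);
    }
    return result;
}
```

Neither technique helps once a name must appear in a header, which is where the remainder of this section picks up.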

2.4.2 Unique Naming Is Required; Cohesive Naming Is Good for Humans

Recall from section 2.2.6 that a logical or physical entity is architecturally significant if its name (or symbol) is intentionally visible from outside of the UOR that defines it. To refer to each architecturally significant entity unambiguously, we require the name of each such entity to be globally unique. How we achieve this uniqueness is, to some extent, an implementation detail — at least from the compiler’s perspective. When it comes to human beings, however, cohesive naming, as we will elucidate in this section, has proven to provide powerful cognitive reinforcement.

Suppose we want to implement an architecturally significant type, say one that represents a price — e.g., for a financial instrument. How should we ensure that the name of this type is globally unique? In theory, there are many ways to achieve unique naming. We could, for example, maintain a central registry of logical names. The first developer to choose Price gets it! The next developer implementing a similar concept (there are many ways to characterize a price) would be forced to choose something else (e.g., MyPrice, Price23). The same approach could just as easily be used to reserve unique filenames.

2.4.3 Absurd Extreme of Neither Cohesive nor Mnemonic Naming

Taking this approach to the extreme, we could even have the registry generate unique type names based on a global counter — e.g., T125061, T125062, T125063, and so on. We could do similarly for component names (e.g., c05684, c05685, c05686) and even for units of release (e.g., u1401, u1135, u1564), as illustrated in Figure 2-16. It all works just fine as far as the compiler and linker are concerned. Moreover, physically moving a component from one aggregate to another would have no nominal implications. Human cognition, however, is not served by this approach.


FIGURE 2-16: Absurdly opaque, noncohesive generated unique names (BAD IDEA)

Maintaining a central database to reserve individual class or component names is not practical and clearly not the best answer. Instead, we will exploit hierarchy to allocate multiple levels of namespaces at once. This hierarchy, however, is neither ad hoc nor arbitrary; with the exception of an overarching enterprise-wide namespace (see below), each namespace that we employ in our methodology will correspond to a coherent, architecturally significant, logically and physically cohesive aggregate.

2.4.4 Things to Make Cohesive

For every architecturally significant logical entity there are at least three related architectural names:

  1. The name (or symbol) of the logical entity itself

  2. The name of the component (or header) that declares the logical entity

  3. The name of the UOR that implements the logical entity

Ensuring that these names are deliberately cohesive will have significant implications with respect to development and maintenance. Hence, how and at what physical levels we achieve nominal cohesion is a distinctive and very important design consideration within our methodology.

2.4.5 Past/Current Definition of Package

A package (see section 2.8) is an architecturally significant — i.e., globally visible — unit of logical and physical design that serves to aggregate components, subject to explicitly stated, allowed dependency criteria (section 2.2.14). A package is also a means for making related components physically and, as we are about to see, nominally cohesive. In these ways, packages enable designers to capture and reflect, in source code, important architectural information not easily expressed in terms of components alone.

Historically,20 a package was defined as a collection of components organized as a (logically and) physically cohesive unit (see section 2.8.1). Although every package we write ourselves will necessarily be implemented exclusively in terms of components, other kinds of well-reasoned architecturally significant physical entities comprising multiple header files, yet not aggregating components, are certainly possible.21

With the definition as worded above, the word package can serve as a unifying term to describe any architecturally significant body of code that is larger than a component, but without necessarily being component-based. We will, however, consistently characterize packages that are not composed entirely of components adhering to our design rules — especially those pertaining to our cohesive naming conventions delineated throughout the remainder of this section (section 2.4) — as irregular (see section 2.12).

Suppose now that we have a logical subsystem called the Bond Trading System (referred to in code as bts for short). Suppose further that this logical subsystem consists of a number of classes (including a price class) that have been implemented in terms of components, which, in turn, have been aggregated into a package to be deployed atomically as an independent library (e.g., libbts.a). How should we distinguish the bts bond price class from other price classes, and what should be the name of the component in which that price class is defined?

2.4.6 The Point of Use Should Be Sufficient to Identify Location

Whenever we see a logical construct used in code, we want to know immediately to which component, package, and UOR it belongs. Without an explicit policy to do otherwise, the name of a class, the header file declaring that class, and the UOR implementing that class might all have unrelated names, as illustrated in Figure 2-17. Clients reading BondPrice will not be able to predict, from usage alone, which header file defines it, nor which library implements it; hence, global search tools would be required during all subsequent maintenance of client code.


FIGURE 2-17: Noncohesive logical and physical naming (BAD IDEA)

By the same token, other components packaged together to implement this logical subsystem might well have names that are unrelated to each other, obscuring the cohesive physical modularity of this subsystem. Although not strictly necessary, experience shows that human cognition is facilitated by explicit “visual” associations within the source code. This nominal cohesion, in turn, reinforces the more critical requirement of logical/physical coherence (section 2.3). Hence, logical and physical name cohesion across related architecturally significant entities is an explicit design goal of our packaging methodology.

Components implemented as .h/.cpp pairs, by their nature, already exhibit some degree of physical name cohesion. Note that as recently as the writing of my first book (1996), however, such was not the case. Due to unreasonable restrictions on the length of names that could be accommodated to distinguish .o files contained in library archive (.a) files of the day, the names of .o files often had to be shortened; hence, an external cross-reference needed to be maintained in order to reestablish the cohesive nature of components.22

Recall from section 2.2.23 that every globally visible physical entity must itself be uniquely named. Since library component headers are at least potentially (see section 3.9.7) clearly visible from outside their respective units of release, and their corresponding .cpp file(s) derive from the same root name and yet are distinct among themselves, they too must be globally unique. Note that, unlike library components, the names of components residing in application packages (see section 2.13) do not have to be distinct from those in other application packages so long as their logical and physical names do not conflict with those in our library as, in our methodology, no two such application packages would ever be present in the same program.

Components, which are intended to address a highly focused purpose and are tailored to bolster hierarchical reuse (section 0.4), are invariably too fine grained to be practical to release individually (section 2.2.20). Hence, in our methodology, each component is necessarily nested within a higher-level, architecturally significant aggregate, which (by definition) is a package. Although the benefits of physical uniformity — enhanced understandability and facilitation of automation tools — as outlined in section 0.7 alone are compelling, mindless adherence to this rule will fall far short of the potential benefit it seeks to motivate. The intent here is not just to provide a uniform and balanced physical representation of software, but also to craft a hierarchical repository where the contained elements, from a logical as well as a physical perspective, are cohesive and synergistic (see section 2.8.3). Moreover, we want to ensure that each library component we write has a natural and obvious place in the physical hierarchy of our firm-wide repository (see sections 3.1.4 and 3.12).

A first step toward ensuring overt visible cohesion between architecturally significant names is making sure that the component name reflects the name of the package in which it resides, as shown in Figure 2-18. Just by looking at the name of the bts_cost component, we know that there exist two component files named bts_cost.h and bts_cost.cpp, which reside in the bts package.23,24


FIGURE 2-18: Component names always reflect their enclosing package.

Our preference that the names of physical entities (e.g., files, packages, and libraries) not contain any uppercase letters (section 1.7.1) begins with the observation that some popular file systems — Microsoft’s NTFS, in particular — do not distinguish between uppercase and lowercase.25 Theoretically, it is sufficient that the lowercased rendering of all filenames be unique. Practically, however, having any unnecessary extra degree of freedom in our physical packaging, thereby complicating development/deployment tools, let alone human comprehension, makes the use of mixed-case filenames for C++ source code suboptimal.26

Separately, and perhaps most importantly, we find that having class names, which we consistently render in mixed case (section 1.7.1) — being distinct from physical names, which we render in all lowercase — is notationally convenient and also visually reinforces the distinction between these two distinct dimensions of design, e.g., in component/class diagrams such as the one shown above (Figure 2-18). The utility afforded by this visual distinction within source code and external documents, such as this book, should not be underestimated.

Although the namespace construct can and will be used effectively with respect to logical names, it cannot address the corresponding physical ones — i.e., component filenames. That is, even with namespaces, having a header file employing a simple name such as date.h is still problematic. We could, as many do, force clients to embed a partial (relative) path to the appropriate header file (e.g., #include <bts/date.h>) within their source code; however, ensuring enterprise-wide uniqueness in the filename itself (e.g., #include <bts_date.h>) provides superior flexibility with respect to deployment.27 In other words, by making all component filenames themselves unique by design (irrespective of relative directory paths), we enable much more robustness and flexibility with respect to repackaging during deployment (see section 2.15.2).

Taking a software vendor’s perspective, an early explicit requirement of our packaging methodology was the ability to select one component, or an arbitrary set of specific components, from a vast repository, extract (copies of) them along with just the components on which those components depended (directly or indirectly), and make these components available to customers as a library having a single (“flat”) include directory and a single archive. Had we allowed our development directory structure to adulterate our source files, we would be forced to replicate a perhaps very large and sparsely populated directory structure on our clients’ systems. Similarly, nonunique .cpp filenames would make re-archiving .o files from multiple packages into a single library archive anything but straightforward.

This unnecessarily sparse directory structure would be exacerbated by a third level of physical aggregation. For example, the same header that resided within the package-level #include directory during development can co-exist (i.e., within a single group-level #include directory) alongside headers from other packages grouped together within the same UOR, which can be more convenient (and also more efficient28) for use by external clients. Having this superior flexibility in deployment — especially for library software — trumps any arguments based on aesthetics or “common practice.”

There are other collateral benefits for ensuring globally unique filenames. Having the filename embody its unique package prefix also simplifies predicting include-guard names. As illustrated in Figure 1-40, in section 1.5.2, the guard name is simply the prefix INCLUDED_ followed by the root filename in uppercase (e.g., for file bts_bondprice.h the guard symbol is simply INCLUDED_BTS_BONDPRICE). Compilers often make use of the implementation filename as the basis for generating unique symbols within a program — e.g., for virtual tables or constructs in an unnamed namespace. Hard-coding the unique package prefix in the filename also means that its globally unique identity is preserved outside the directory structure in which it was created — e.g., in ~/tmp, as an email attachment, or on the printer tray. Consistently repeating the filename as a comment on the very first line of each component file, as we do (see section 2.5), further reinforces its identity. Knowing the context of a file simply by looking at its name is a valuable property that one soon comes to expect and then depend on.
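The guard-naming rule just described is mechanical enough to express in a few lines. The following is a sketch only: the function includeGuardName is a hypothetical helper invented for this illustration (it is not part of any library discussed here), and it assumes the header name has the form root.h.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derive the include-guard symbol for a component
// header by dropping the extension, uppercasing the root file name, and
// prepending "INCLUDED_".
std::string includeGuardName(const std::string& headerFileName)
{
    // Everything before the last '.' is the root name; if there is no '.',
    // 'substr(0, npos)' yields the whole string.
    const std::string::size_type dot  = headerFileName.rfind('.');
    const std::string            root = headerFileName.substr(0, dot);

    std::string guard = "INCLUDED_";
    for (char c : root) {
        guard += static_cast<char>(
                           std::toupper(static_cast<unsigned char>(c)));
    }
    return guard;
}
```

Because the package prefix is part of the filename, the resulting symbol inherits the filename's global uniqueness without any additional bookkeeping.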

Before the introduction of the namespace keyword into the C++ language (and currently for languages such as C that do not provide a logical namespace construct), the best solution available was to require that (where possible) the name of every logical entity declared at file scope begin with a (registered) prefix unique to the architecturally significant physically cohesive aggregate immediately enclosing them, namely, a package.29 Attaching a logical package prefix to the name of every architecturally significant logical entity within a component, albeit aesthetically displeasing to many, was effective not only at avoiding name collisions, but also at achieving nominal cohesion, thereby reinforcing logical/physical coherence. A reimplementation of the physical module of Figure 2-17 (above) using logical package prefixes (now deprecated) is shown for reference only in Figure 2-19.


FIGURE 2-19: (Classical) logical package prefixes (deprecated)
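Since the figure itself is not reproduced here, the following sketch suggests what a header using classical logical package prefixes might look like. The class body (a single double and one accessor) is an assumption made purely for illustration; only the naming pattern is taken from the text.

```cpp
// bts_bondprice.h  (hypothetical rendering, for illustration only)
#ifndef INCLUDED_BTS_BONDPRICE
#define INCLUDED_BTS_BONDPRICE

// The logical name itself carries the package prefix 'bts_', mirroring the
// physical prefix of the component file name.
class bts_BondPrice {
    double d_value;  // assumed representation

  public:
    explicit bts_BondPrice(double value) : d_value(value) {}

    double value() const { return d_value; }
};

// Free operator declared in the same component as the type it operates on.
inline bool operator==(const bts_BondPrice& lhs, const bts_BondPrice& rhs)
{
    return lhs.value() == rhs.value();
}

#endif
```

At any point of use, bts_BondPrice identifies both the package (bts) and the component (bts_bondprice) without consulting any other source.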

Now that the namespace construct has long since been supported by all relevant C++ compilers, there has been an inculcation toward having concise, unadulterated logical names. Hence, we now (since c. 2005) nest each logical entity within a namespace having the same name as the package containing the component that defines the construct, as shown in Figure 2-20. Our use of logical package namespaces is isomorphic to our original use of logical package prefixes, and therefore consistent with our continued use of physical package prefixes for component filenames to preserve logical and physical name cohesion.


FIGURE 2-20: (Modern) logical package and enterprise namespaces
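A sketch of the namespace-based rendering follows. The enterprise namespace name MyLongCompanyName and the class body are placeholders assumed for this illustration; the text specifies only that the package namespace has the same name as the package and is nested within an enterprise-wide namespace.

```cpp
// bts_bondprice.h  (hypothetical rendering, for illustration only)
#ifndef INCLUDED_BTS_BONDPRICE
#define INCLUDED_BTS_BONDPRICE

namespace MyLongCompanyName {  // enterprise-wide namespace (assumed name)
namespace bts {                // package namespace: same name as the package

class BondPrice {
    double d_value;  // assumed representation

  public:
    explicit BondPrice(double value) : d_value(value) {}

    double value() const { return d_value; }
};

}  // close package namespace
}  // close enterprise namespace

#endif
```

The file name keeps its physical bts_ prefix, so the qualified logical name bts::BondPrice and the component name bts_bondprice remain visibly related.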

30 Note that when namespaces are not appropriate (e.g., functions having extern "C" linkage), we revert to the use of logical package prefixes (see section 3.11.7).

2.4.7 Proprietary Software Requires an Enterprise Namespace

Notice how Figure 2-20, section 2.4.6, anticipates that we now also recommend an overarching enterprise-wide namespace as a way of enabling us to disambiguate (albeit extremely rare in practice) collisions with other software that might follow our (or a similar) naming methodology.

By shielding all of our proprietary code (other than application main functions, see section 2.13) behind a single enterprise-wide name, e.g., our full company name (as illustrated in Figure 2-20, section 2.4.6), we all but eliminate any chance of accidental external collision. And, since all of our components reside within the same enterprise namespace, there is no need or temptation to employ using declarations or directives.31 In the very unlikely event that a collision with external software occurs — even in the presence of using directives — all that is required to disambiguate the collision is to prepend (1) the firm-wide symbol, (2) the third-party product’s symbol, or (3) :: if the third-party code failed to take this precaution.

Having, instead, each individual package represented by a namespace at the highest level would lead, at least conceptually, to myriad short global symbols, combinatorially increasing the probability of collision with vendors adopting a similar strategy (see the birthday problem in Volume III, section 8.3).32 In any event, having a single (somehow unique) enterprise-wide “umbrella” namespace for our own code serves to mitigate risk and is therefore desirable.

The next step in achieving logical and physical name cohesion is to formalize how logical entities defined within a component are named so that their use alone identifies the component in which they are defined. To simplify the description, we provide the following definition of a component’s base name.

For example, the base name of the component illustrated in Figure 2-20, section 2.4.6, is cost (i.e., the component name bts_cost without its bts_ package prefix). This name, however, fails to achieve nominal cohesion with the class BondPrice, which it defines.

2.4.8 Logical Constructs Should Be Nominally Anchored to Their Component

Naming a component after its principal class or struct (but in all lowercase), as shown in Figure 2-21, usually resolves most potential ambiguity. For example, we would expect that class bts::PackedCalendar would be defined in a component called bts_packedcalendar (or conceivably, bts_packed, if the component defined other intimately related “packed” types). Note that in our methodology, however, we tend to have a single (principal) class per component unless there is one of four specific countervailing reasons to do otherwise (see section 3.3.1). Whenever there is more than one class defined at package-namespace scope within a single component, each such class name will incorporate that component’s base name (albeit in “UpperCamelCase”) as a prefix.33


FIGURE 2-21: Nominally cohesive class and component (GOOD IDEA)

Where appropriate, we routinely define outwardly accessible (“public”) auxiliary classes, such as iterators, in the same component either by appending to the name of the primary class (e.g., bdlt::PackedCalendarHolidayIterator), or else by nesting the auxiliary class within the principal class itself (e.g., PackedCalendar::HolidayIterator).34 Note, however, that some detective work might be unavoidable when operators, inheritance, or user-defined conversion are involved. The rules surrounding the placement of free operators within components are discussed below.

2.4.9 Only Classes, structs, and Free Operators at Package-Namespace Scope

To minimize clutter, we have consistently avoided declaring individual functions as well as enumerations, variables, constants, etc., at namespace scope in component header files, preferring instead always to nest these logical constructs within the scope of an appropriate class or struct.35 In so doing, we anchor these less substantial constructs within a larger, architecturally significant logical entity that, unlike a namespace (section 1.3.18), is necessarily fully contained within a single component (section 0.7). We understand that this rule, like the previous one, might not be applicable when there are valid countervailing business reasons such as an externally specified (“client-facing”) interface.36

Having modifiable global variables at namespace scope is simply a bad idea. Nesting such variables within a class as static data members and providing only functional access is also generally a bad idea, but at least addresses the issue of nominal cohesion. On the other hand, nesting compile-time-initialized constants along with typedef declarations37 within the scope of a class or struct is perfectly fine. Requiring that enumerations be nested within a class, struct, or function ensures that all of the enumerators are scoped locally and cannot collide with those in other components within the same package namespace.38
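To make the enumeration rule concrete, here is a minimal sketch. The struct names Side and Currency, their enumerators, the constant, and the typedef are all hypothetical; the enterprise namespace is omitted for brevity.

```cpp
namespace bts {  // package namespace (enterprise namespace omitted)

struct Side {
    // Enumerators are scoped to 'Side'.
    enum Enum { e_UNKNOWN, e_BUY, e_SELL };

    // Compile-time-initialized constant nested within the struct.
    static const int k_NUM_KNOWN_SIDES = 2;

    // Nested typedef, anchored to this struct rather than to the namespace.
    typedef unsigned char Representation;
};

struct Currency {
    // A second 'e_UNKNOWN' in the same package namespace: no collision,
    // because each enumerator is local to its enclosing struct.
    enum Enum { e_UNKNOWN, e_USD, e_EUR };
};

}  // close package namespace
```

Had both enumerations been declared directly at package-namespace scope, the two e_UNKNOWN enumerators would have been a redefinition error as soon as one translation unit saw both headers.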

would be a BAD IDEA. Separately, there would ideally be a single C++ type to represent each truly distinct platonic type used widely across interface boundaries (see Volume II, section 4.4).

The justification for avoiding free functions, except operator and operator-like “aspect” functions, which might benefit from argument-dependent lookup (ADL), derives from our desire to encapsulate an appropriate amount of logically and physically coherent functionality within a nominally cohesive component. While classes are substantial architectural entities that are easily identifiable from their names, individual functions are generally too small and specific for each to be made nominally cohesive with the single component that defines them, as in Figure 2-22a.39


FIGURE 2-22: Ensuring nominal cohesion for free functions and components

Creating components that hold multiple functions in which there is no nominal cohesion (Figure 2-22b) makes human reasoning about such physical nodes much more difficult and is therefore also a bad idea. Forcing the name of each function to have, as a prefix, the initial-lowercased rendering of the base name of the component (Figure 2-22c) achieves nominal cohesion, but is awkward at best, and fails to emphasize logical coherence (section 2.3). We could employ a third level of namespace (Figure 2-22d), but for reasons discussed below (Figure 2-23) and also near the end of section 2.5, we feel that would be suboptimal.


FIGURE 2-23: Prefer struct to namespace for aggregating “free” functions.

We therefore generally avoid declaring free (nonoperator) functions at package-namespace scope, and instead achieve both nominal logical and physical cohesion by grouping related functionality within an extra level of namespace matching the component name using static methods within a struct (Figure 2-22e), which we will consistently refer to as a utility (see section 3.2.7) and so indicate with a Util suffix (e.g., xyza::MathUtil).40 Additional, collateral advantages for preferring a struct (e.g., Figure 2-22e) over a third level of namespace (e.g., Figure 2-22d) for implementing a utility are summarized in Figure 2-23.41
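A utility of this kind might be sketched as follows. The name xyza::MathUtil comes from the text; the particular functions gcd and lcm, and the component file name, are assumptions chosen only to give the struct some content.

```cpp
// xyza_mathutil.h  (hypothetical component, for illustration only)
#ifndef INCLUDED_XYZA_MATHUTIL
#define INCLUDED_XYZA_MATHUTIL

namespace xyza {  // package namespace (enterprise namespace omitted)

// A utility 'struct': it holds no data and serves only as a named scope
// that is, unlike a namespace, fully contained within this one component.
struct MathUtil {
    // Return the greatest common divisor of the specified 'a' and 'b'.
    // The behavior is undefined unless '0 < a' and '0 < b'.
    static int gcd(int a, int b);

    // Return the least common multiple of the specified 'a' and 'b'.
    // The behavior is undefined unless '0 < a' and '0 < b'.
    static int lcm(int a, int b);
};

inline int MathUtil::gcd(int a, int b)
{
    while (b != 0) {  // Euclid's algorithm
        const int remainder = a % b;
        a = b;
        b = remainder;
    }
    return a;
}

inline int MathUtil::lcm(int a, int b)
{
    return a / gcd(a, b) * b;  // divide first to limit overflow
}

}  // close package namespace

#endif
```

A call such as xyza::MathUtil::gcd(12, 18) names the package (xyza) and, through the struct name, the component (xyza_mathutil) at the point of use.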

42 Although using declarations can be used to import declarations of overloaded functions of a given name from a private (or protected) base class into a public one, we generally discourage such use, as it would require a public client to view otherwise private (or protected) detail; instead, we prefer to create (and document) an inline forwarding function. Note that a similar issue arises with forwarding constructors as of C++11.

43 Titus Winters of Google has recently (c. 2018) expressed increasing concerns as to the scalability and stability of such overload sets (winters18a, “ADL”); see also winters18b, particularly starting at the 11:30 time marker.

In our methodology, operators, whether member or free, are by their nature fundamental to the type(s) on which they operate. Every unary and homogeneous binary operator — i.e., one written in terms of a single user-defined type, e.g.,

bool operator==(const BondPrice& lhs, const BondPrice& rhs);

is declared and defined within the same component (e.g., bts_bondprice) as the type (e.g., bts::BondPrice) on which it operates. Note that, except for forms of assignment (e.g., =, +=, *=), we will always choose to make a binary operator free (as opposed to a member) to ensure symmetry with respect to user-defined conversions (see Volume II, section 6.13). For conventionally heterogeneous operators such as

std::ostream& operator<<(std::ostream&    stream,
                         const BondPrice& price);

the motivation to make them free is born of extensibility without modification, as in the open-closed principle (section 0.5). In any event, the place to look for the definition of an operator (entirely consistent with ADL) is within a component that defines a type on which that operator operates.
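The symmetry argument can be seen in a small sketch. The implicit converting constructor from double is an assumption added solely to create a user-defined conversion to demonstrate; it is not suggested by the text as a design for BondPrice.

```cpp
#include <ostream>

namespace bts {  // package namespace (enterprise namespace omitted)

class BondPrice {
    double d_value;  // assumed representation

  public:
    BondPrice(double value) : d_value(value) {}  // implicit, for illustration

    double value() const { return d_value; }
};

// FREE OPERATORS, defined in the same component as 'BondPrice'.

// Because this operator is free, a user-defined conversion may be applied to
// either operand.  Had it been a member, 'price == 100.0' would compile but
// '100.0 == price' would not.
inline bool operator==(const BondPrice& lhs, const BondPrice& rhs)
{
    return lhs.value() == rhs.value();
}

// Heterogeneous operator: free so that output can be added for new types
// without modifying 'std::ostream'.
inline std::ostream& operator<<(std::ostream&    stream,
                                const BondPrice& price)
{
    return stream << price.value();
}

}  // close package namespace
```

Both operators are found by argument-dependent lookup because they live in the namespace of bts::BondPrice, which is also where a reader would look for them.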

If we were to allow free operators to be defined in arbitrary components, how could we even know if they exist? If we saw one being used, how would we track down its definition? Even more insidious is the possibility that a client unwittingly duplicates such a definition locally. The resulting latent incompatibilities, manifested by future multiply-defined-symbol linker errors, would threaten to destabilize our development process.

As an important, relevant example, consider the standard template container class, std::vector, for which no standard output operator is defined. Referring to Figure 2-24, suppose that the author of component my_stuff finds outputting a vector to be generally useful, and so “thoughtfully” provides


FIGURE 2-24: Problems with defining operators in unexpected components

template <class TYPE>
std::ostream& operator<<(std::ostream&            lhs,
                         const std::vector<TYPE>& rhs);

(along with an appropriate definition) in its header for general use by clients. It is not hard to imagine that component your_stuff might do so as well. Now consider what happens when their_stuff.cpp includes both my_stuff.h and your_stuff.h. The inevitable result is multiply defined symbols!44

Instead, the functionality should have been implemented as a static member function of a utility struct (see section 3.2.7) in a separate component, as illustrated in Figure 2-25.


FIGURE 2-25: Avoiding free operators on nonlocal types
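One way such a utility might look is sketched below. The package name my, the struct name VectorPrintUtil, the function names, and the bracketed output format are all hypothetical choices for this illustration; the figure itself is not reproduced here.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace my {  // hypothetical package namespace

// Utility 'struct' providing named (nonoperator) output functions for
// 'std::vector', a type that this component does not define.
struct VectorPrintUtil {
    // Write the elements of the specified 'object' to the specified
    // 'stream' in the form "[ e0 e1 ... ]", and return 'stream'.
    template <class TYPE>
    static std::ostream& print(std::ostream&            stream,
                               const std::vector<TYPE>& object)
    {
        stream << '[';
        for (std::size_t i = 0; i < object.size(); ++i) {
            stream << ' ' << object[i];
        }
        return stream << " ]";
    }

    // Return a string holding the same text that 'print' would write.
    template <class TYPE>
    static std::string toString(const std::vector<TYPE>& object)
    {
        std::ostringstream stream;
        print(stream, object);
        return stream.str();
    }
};

}  // close package namespace
```

Because the functions are qualified by a uniquely named struct, a second component offering its own vector-printing utility under a different name can coexist with this one in the same translation unit.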

As illustrated in Figure 2-26, providing an output operator on a type my::Type — or conceivably even on a std::vector<my::Type> — in component my_type is perfectly fine. The general design concept being illustrated here is to follow the teachings of the philosopher Immanuel Kant and avoid doing those things that, if also done by others, would adversely affect society (see section 3.9.1). By adhering to this simple rule for operators, we ensure that (1) we know where to look for each operator, and (2) operator definitions will not be duplicated (and therefore cannot conflict at higher levels in the physical hierarchy).


FIGURE 2-26: Overloading free operators on types within the same component

If a single free operator refers to two types implemented in separate components, where one depends on the other, the operator would of course be defined in the higher-level component. If, however, the components are otherwise independent (as illustrated in Figure 2-27a), we have two alternatives:


FIGURE 2-27: Implementing “free operators” referring to multiple peer types

  1. [Suboptimal] Arbitrarily choose one of the components to be at a higher-level and place the free operator there, as in Figure 2-27b (thus introducing additional physical dependency for one of the components).

  2. [Preferred] Create a utility class in a separate component, as in Figure 2-27c, and define one or more nonoperator functions nested within a struct that serves the same purpose (see section 3.2.7). Note that it is never appropriate to escalate (see section 3.5.2) co-dependent free operators to a separate component.

Use of operators for anything but the most fundamental, obvious, and intuitive operations (see Volume II, section 6.11) is almost always a bad idea and should generally be avoided; any valid, practical need for operators across otherwise independent user-defined types is virtually nonexistent.45

2.4.10 Package Prefixes Are Not Just Style

Make no mistake, how packages are named is not just a matter of style; package names have profound architectural significance. As an example, consider Figure 2-28, which shows a hierarchy of components whose dependencies form a binary tree. Clearly these components are levelizable (section 1.10) and, hence, have no cycles. However, it is not in general possible to assign components of a multipackage subsystem to arbitrary packages without introducing package-level cycles. In this example, the packages containing these components (as implied by the package prefixes embedded in the component names) would be cyclic and therefore not levelizable.


FIGURE 2-28: Implied cyclic package dependencies (BAD IDEA)

The problem, identified by Figure 2-29, can easily arise in practice. Consider the design of a single package that is intended to contain everything that is directly usable by clients of a multipackage subsystem. If this presentation package (subc) defines both protocol (i.e., pure abstract interface) classes (which are inherently very low level) and wrapper components (which are inherently very high level), it will not be possible to interleave components from a separate implementation package (subim).46


FIGURE 2-29: Acyclic component hierarchy; cyclic package hierarchy (BAD IDEA)

Allowing cyclic dependencies among packages, like any other aggregate, would make our software qualitatively more complicated. Ultimately, all cyclically involved packages would have to be treated as a unit. A general solution to this common problem, illustrated in Figure 2-30, is simply to provide two separate client-facing packages. One package (subw) will reside at the top of the subsystem and contain components that define only wrappers47 (e.g., subw_comp1); the second (subv) will reside at the bottom of the package hierarchy and incorporate components (e.g., subv_comp1) that define protocol and other vocabulary types (see Volume II, section 4.4) exposed programmatically through the wrapper interface.48


FIGURE 2-30: Repackaging of components to avoid cyclic package dependencies

Components that are used in the interface of the wrapper components (subw), and also in name only by low-level protocols, typically reside either in the same package as the protocols (e.g., subv in Figure 2-30) or in a separate, lower-level package, as illustrated in Figure 2-31b, as opposed to at the same level (Figure 2-31a), in order to enable concrete test implementations of the protocols to properly reside along with them (e.g., in subp), yet allow such test implementations to depend on the actual concrete vocabulary types (e.g., in subt) rather than having to mock them.


FIGURE 2-31: Alternative packaging strategies

2.4.11 Package Prefixes Are How We Name Package Groups

Although packages, being architecturally significant aggregates, have unique names (and namespaces), it is often advantageous to bundle packages having similar purposes and/or similar envelopes of physical dependency into a larger, logically and physically coherent, nominally cohesive aggregate. We could make a big deal about this issue (and perhaps we should, given its importance). Instead we will avoid the drama and just make our point: The first three letters of a package name identify the physically cohesive package group in which a grouped package resides.

The reason for this simple approach is, well, simple (see section 2.10.1): We simply must have an ultra-efficient way to specify the package group and package of each component and class in order to obviate noisome and debilitating using directives and declarations (see section 2.4.12). The choice of three letters (as opposed to, say, two or four) is simply an engineering trade-off. This simple, concise, and effective approach to naming package groups is illustrated in Figure 2-32. We will revisit our package-naming rules (in much greater depth) in section 2.10.


FIGURE 2-32: Logically and physically cohesive package group
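
Because the rule is purely lexical, the package and package group of a component can be recovered mechanically from its name. The sketch below assumes component names of the form package_component (as in subw_comp1, used earlier in this section). The function names packageOf and packageGroupOf are hypothetical, and the sketch deliberately ignores any special cases addressed by the fuller naming rules of section 2.10.

```cpp
#include <string>

// Return the package name of the specified 'componentName', i.e., the part
// before the first underscore (e.g., "subw_comp1" yields "subw").  If there
// is no underscore, the whole name is returned.
inline std::string packageOf(const std::string& componentName)
{
    return componentName.substr(0, componentName.find('_'));
}

// Return the package-group name of the specified 'packageName', i.e., its
// first three letters (e.g., "subw" yields "sub").  Names shorter than three
// letters are returned unchanged.
inline std::string packageGroupOf(const std::string& packageName)
{
    return packageName.substr(0, 3);
}
```

Under these assumptions, packageGroupOf(packageOf("subw_comp1")) yields "sub", which is the sense in which a qualified name identifies both the package and the package group of an entity at its point of use.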

2.4.12 using Directives and Declarations Are Generally a BAD IDEA

Let us now take a closer look at our use of the C++ namespace construct to partition logical entities along package boundaries. One of the solid benefits of package namespaces is that access to other entities local to that package does not require explicit qualification. This advantage is particularly pronounced at the application level, where much of the code that interoperates is defined locally (see section 2.13). Absent using directives and declarations, an unqualified reference is as informative as a qualified one: An unqualified reference implies that the entity is local to this package.49

In the code example of Figure 2-33, we cannot simply look at the definition of the insertAfterLink helper function and know which Link class we are talking about without potentially having to scan back through the entire file for preceding occurrences of using.


FIGURE 2-33: Nonlocal namespace names are optional! (BAD IDEA)
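
The figure is not reproduced here, so the following is only a sketch of the kind of code it describes. The package names xyza and xyzb, the contents of both Link classes, and the function insertAfterLinkQualified are hypothetical; only the names Link and insertAfterLink come from the text.

```cpp
namespace xyza {
struct Link {            // hypothetical: a link in some list
    Link *d_next_p;
    int   d_value;
};
}  // close package namespace 'xyza'

namespace xyzb {
struct Link {            // hypothetical: an unrelated class, same short name
    Link  *d_next_p;
    double d_weight;
};
}  // close package namespace 'xyzb'

// ... imagine this directive sitting hundreds of lines above its use ...
using namespace xyza;    // BAD IDEA

// Which 'Link' is meant?  Nothing at the point of use says; the reader has
// to scan backward for a 'using' (here it happens to be 'xyza::Link').
void insertAfterLink(Link *link, Link *newLink)
{
    newLink->d_next_p = link->d_next_p;
    link->d_next_p    = newLink;
}

// Preferred: the package-qualified name identifies the class, and hence the
// component that defines it, directly from the point of use.
void insertAfterLinkQualified(xyza::Link *link, xyza::Link *newLink)
{
    newLink->d_next_p = link->d_next_p;
    link->d_next_p    = newLink;
}
```

Both functions behave identically; the difference is only in what a reader can determine without leaving the function definition.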

What’s worse, it might be that using directives or declarations are not even local to the implementation file, but are instead imported quietly in one or more of many included header files as illustrated in Figure 2-34. And, unlike the C++ Standard Library (or std in code), which is comparatively small, unchanging, and well known, we cannot be expected to know every class within every component of every package throughout our enterprise. Still worse, nesting a variety of using directives and declarations within header files risks making relevant the relative order in which these headers are incorporated into a translation unit!50


FIGURE 2-34: using directives/declarations can be included! (BAD IDEA)

No matter what, we must forbid any using directives or declarations in header files outside of function scope.51,52,53,54 Perhaps some advocates of using in headers might not yet have realized that the effect of incorporating names from one namespace, A, into another, B, does not end with the closing brace of B into which the names from A were imported; those names remain visible in B until the end of the translation unit. Consequently, using directives or declarations are sometimes used (we should say horribly misused) in header files when declaring class member data and function prototypes to shorten the names of types declared in distant namespaces (BAD IDEA).55 Instead, we must use the package-qualified name of each logical entity not local to the enclosing package. For this reason, we will want to ensure that widely used (“package”) namespace names, like std, are very short indeed.
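
That a using directive outlives the closing brace of the namespace in which it appears can be demonstrated directly. In this sketch the namespaces a and b stand in for the A and B of the text, and the names Thing and makeThing are hypothetical.

```cpp
#include <type_traits>

namespace a {
struct Thing {
    int d_value;
};
}  // close namespace 'a'

namespace b {
using namespace a;  // BAD IDEA at namespace scope in a header
}  // close namespace 'b'; the using directive nevertheless stays in effect

namespace b {       // 'b' reopened later, perhaps by an unrelated header
inline Thing makeThing(int value)  // unqualified 'Thing' finds 'a::Thing'
{
    Thing result = { value };
    return result;
}
}  // close namespace 'b'

// Qualified lookup into 'b' is affected as well: 'b::Thing' now names
// 'a::Thing' for the rest of the translation unit.
static_assert(std::is_same<b::Thing, a::Thing>::value,
              "b::Thing and a::Thing are the same type");
```

Any header included after the directive that reopens b, or that refers to b::Thing, silently depends on it, which is one way the relative order of #include directives can become significant.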

Nonetheless, our recommended approach is to avoid such uses of (typically structural) inheritance (see Volume II, section 4.6), preferring the more compositional Has-A (section 1.7.4) approach to layering (see section 3.7.2) instead.

That said, exceptional cases do exist. Alisdair Meredith further points out (again, via personal email, 2018) that we ourselves have, on occasion, been known to introduce a base class having fewer template parameters, and then use structural inheritance and using declarations to expose that functionality as the public interface. If we were now to replace using declarations with, say, inline forwarding functions, we would negate the intended effect of reducing template-induced code bloat (see Volume II, section 4.5).
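
A sketch of the technique described above follows. The class names FixedStackBase and FixedStack are hypothetical, and the text does not say whether the base is inherited publicly or privately; private inheritance is assumed here so that the using declarations are what make the members public.

```cpp
#include <cstddef>

// All logic that does not depend on the capacity lives in this base class,
// which has one template parameter fewer and is therefore instantiated once
// per element type rather than once per (type, capacity) pair.
template <class T>
class FixedStackBase {
    T           *d_data_p;    // storage supplied by the derived class
    std::size_t  d_capacity;
    std::size_t  d_size;

  protected:
    FixedStackBase(T *data, std::size_t capacity)
    : d_data_p(data), d_capacity(capacity), d_size(0) {}

  public:
    // Push the specified 'value'; return 'false' if the stack is full.
    bool push(const T& value)
    {
        if (d_size == d_capacity) {
            return false;
        }
        d_data_p[d_size++] = value;
        return true;
    }

    // Return the top element; behavior is undefined if the stack is empty.
    const T& top() const { return d_data_p[d_size - 1]; }

    std::size_t size() const { return d_size; }
};

template <class T, std::size_t N>
class FixedStack : private FixedStackBase<T> {
    T d_buffer[N];

  public:
    FixedStack() : FixedStackBase<T>(d_buffer, N) {}

    // Copying would leave the copy pointing at the original's buffer.
    FixedStack(const FixedStack&) = delete;
    FixedStack& operator=(const FixedStack&) = delete;

    // Expose the base-class functions without generating any new code.
    using FixedStackBase<T>::push;
    using FixedStackBase<T>::top;
    using FixedStackBase<T>::size;
};
```

With this arrangement, FixedStack<int, 8> and FixedStack<int, 64> share a single instantiation of push, top, and size. Replacing the using declarations with forwarding functions would add one more function per member for every distinct capacity.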

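#include <map>
#include <string>
#include <vector>

class Book {
    // ...
    // Class-scope typedefs shorten long type names without requiring any
    // using directive or declaration in the header.
    typedef std::map<std::string, std::string>       StrStrMap;
    typedef std::map<std::string, std::vector<int> > StrIntarrayMap;
    // ...
    StrStrMap      d_glossary;
    StrIntarrayMap d_index;
    // ...
};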

We recognize that C++11 offers alias declarations (using Name = Type;) as a syntactic alternative to typedef, and that thoughtful (discriminating) use of auto can also help eliminate redundant (or otherwise superfluous) explicit type information in source code. See lakos21.

The use of using declarations for function forwarding during private (never mind protected) inheritance is also to be avoided because (1) our ability to document and understand such functionality in the derived header itself is compromised, and (2) inheritance necessarily implies compile-time coupling (section 1.9; see also section 3.10). We generally prefer to avoid private inheritance, in favor of layering (a.k.a. composition), and explicit (inline) function forwarding.
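
The preferred alternative can be sketched as follows. The class IntStack is hypothetical; it layers on std::vector via a data member (Has-A) and forwards explicitly, rather than inheriting privately and re-exposing members with using declarations.

```cpp
#include <cstddef>
#include <vector>

class IntStack {
    std::vector<int> d_elements;  // layered (Has-A), not privately inherited

  public:
    // Each forwarding function is declared, and can be documented, right
    // here in the header of the class that offers it.

    // Push the specified 'value' onto the top of this stack.
    void push(int value) { d_elements.push_back(value); }

    // Remove the top element; behavior is undefined if the stack is empty.
    void pop() { d_elements.pop_back(); }

    // Return the top element; behavior is undefined if the stack is empty.
    int top() const { return d_elements.back(); }

    // Return the number of elements on this stack.
    std::size_t size() const { return d_elements.size(); }
};
```

The public interface is fully visible in this one class definition, so a reader need not consult the definition of std::vector to learn what IntStack offers or how each function is meant to behave.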

Finally, using namespaces to define a logical “location” independent of its physical location, say, to avoid changing #include directives (should some class be logically “repackaged”) is — in our view — misguided. If we change the logical location of a class then — in our methodology — that class must be moved to its proper physical location as well. Unless logical and physical locations coincide, many of the advantages of sound physical design — e.g., reduced compile time, link time, and executable size (not to mention organization and understandability) — are compromised.

Adhering to these cohesive naming rules does, however, impose some extra burden on library developers. That is, if a logical construct were to “move” from one architectural location to another, its address (i.e., its component name), and therefore some aspect of its fully qualified logical name, must necessarily change as well. This “deficiency” is actually a feature in that it allows for a reasonable deprecation strategy: During refactoring, it is possible for two versions of the same logical entity to co-exist for a period of time as clients rework their code to refer to the new component before the original one is finally removed.56

2.4.13 Section Summary

In summary, our rigorous approach to cohesive naming — packages, components, classes, and free (operator) functions — not only avoids collisions, it also provides valuable visual cues within the source code that serve to identify the physical location of all architecturally significant entities. Experience shows that human cognition is facilitated by such visual associations. In turn, this nominal cohesion reinforces the even more critical requirement of logical/physical coherence (section 2.3). Hence, logical and physical name cohesion across related architecturally significant entities is an integral part of our component-based packaging methodology.
