Real-World Build System Scenarios
As discussed in the introduction to Part II, "The Build Tools," it's important to compare how each of the available build tools can be used in realistic scenarios. After all, not until you actually solve a technical problem do you get a true sense of whether the tool is easy to use. From now on, this chapter focuses less on the syntax of GNU Make and more on how everything fits together.
Scenario 1: Source Code in a Single Directory
In the simple case in which you have a C program stored entirely within a single directory, you have three solutions. The first is a repeat of what you saw earlier in the chapter. The second shows how to improve upon that solution, and the third uses an external scanner tool to find dependencies.
Consider the solution you've already seen:
1   SRCS = add.c calc.c mult.c sub.c
2   PROG = calculator
3   CC = gcc
4   CFLAGS = -g
5   OBJS = $(SRCS:.c=.o)
6
7   $(PROG): $(OBJS)
8       $(CC) $(CFLAGS) -o $@ $^
9
10  $(OBJS): numbers.h
This type of makefile is common for projects that start small. When developers first write their code, they often don't put much effort into planning their build system, given that a simple makefile will suffice. They can add new source files by appending to the SRCS variable, and everything continues to work perfectly—at least, for a while.
Focus on line 10, which states that every object file depends on the numbers.h header file. What would happen if a newly added source file didn't actually include numbers.h? What if additional header files were added, but you forgot to list them in the makefile? In both cases, a lot of manual work is required to keep the makefile consistent with the source files. An unneeded dependency merely causes extra recompilation, but a missing one can leave you with an incorrect executable program.
The second approach is to automate the detection of header files. The following solution scans the source files and computes the correct set of dependencies.
1   SRCS = add.c calc.c mult.c sub.c
2   PROG = calculator
3   CC = gcc
4   CFLAGS = -g
5   OBJS = $(SRCS:.c=.o)
6
7   $(PROG): $(OBJS)
8       $(CC) $(CFLAGS) -o $@ $^
9
10  -include $(SRCS:.c=.d)
11
12  %.d: %.c
13      @$(CC) -MM $(CPPFLAGS) $< | sed 's#\(.*\)\.o: #\1.o \1\.d: #g' > $@
This code looks rather complex (and it is), so let's break it down in detail. The approach is to automatically generate a new dependency information file (with .d suffix), corresponding to each C source file. In this case, you generate add.d, calc.d, mult.d, and sub.d. Here's what these dependency files look like (in this case, it's add.d):
add.o add.d: add.c numbers.h
On line 10 of the makefile, you explicitly include all these .d files, ensuring that everything is added to the same dependency graph. On line 12, a new rule informs GNU Make how to generate these .d files if they're missing or if the corresponding .c or .h files have changed.
Line 13 works a bit of magic to obtain the dependency information in the first place. Most of the work is done by passing the -MM option to the GCC compiler. This asks the compiler to generate the list of .c and .h files that it reads in but to stop immediately after doing so (instead of doing any real compile work). Finally, the cryptic sed command adds the name of the .d file on the left side of the rule, because GCC won't put it there by itself.
To fully understand this example, you need to know that GNU Make determines when makefile fragments (such as .d files) have changed and restarts the entire parsing process as a result. That's more detail than you'll want to get into, but hopefully you can see what's involved in automatically detecting header file dependencies.
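GCC can also write the dependency file as a side effect of the normal compile step, which avoids the separate -MM pass and the sed command. The following is a minimal sketch of that variation, not the approach used in this chapter. It assumes a GCC-compatible compiler that supports the -MMD and -MP options, and the DEPS variable is a name introduced here for illustration.

```make
# Sketch: generate .d files while compiling instead of in a separate pass.
# Recipe lines must begin with a tab character.
SRCS = add.c calc.c mult.c sub.c
PROG = calculator
CC = gcc
CFLAGS = -g
OBJS = $(SRCS:.c=.o)
DEPS = $(OBJS:.o=.d)

$(PROG): $(OBJS)
	$(CC) $(CFLAGS) -o $@ $^

# -MMD writes add.d alongside add.o, listing add.c and the non-system
# headers it includes. -MP adds an empty rule for each header so that
# deleting a header does not stop the build with a "no rule" error.
%.o: %.c
	$(CC) $(CFLAGS) $(CPPFLAGS) -MMD -MP -c -o $@ $<

# The leading "-" tells GNU Make not to complain when a .d file is missing.
-include $(DEPS)
```

On the first build no .d files exist, which is harmless because every object file has to be compiled anyway. On later builds, the included .d files supply the header dependencies.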
A third solution uses the makedepend command. This tool is similar in nature to gcc -MM, although it provides its own scanner for analyzing C source files instead of relying on the compiler itself. Chapter 19, "Faster Builds," discusses build system performance and covers makedepend in more detail.
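As a rough sketch of how makedepend might be wired into the same makefile, the fragment below writes its output to a separate file and includes it. The .depend filename is an assumption made for this example. The sketch also regenerates the file only when a .c file changes, so a change to the #include lines inside a header would not be noticed until the next regeneration.

```make
# Sketch: header dependencies computed by makedepend (assumed to be installed).
SRCS = add.c calc.c mult.c sub.c
PROG = calculator
CC = gcc
CFLAGS = -g
OBJS = $(SRCS:.c=.o)

$(PROG): $(OBJS)
	$(CC) $(CFLAGS) -o $@ $^

# "-f-" sends the generated rules to standard output instead of editing
# the makefile in place. Options between the two "--" markers are passed
# through, so any -I and -D flags in CPPFLAGS are honored by the scanner.
.depend: $(SRCS)
	makedepend -f- -- $(CPPFLAGS) -- $(SRCS) > $@

-include .depend
```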
Let's continue by addressing scalability and see how to write a makefile for multidirectory programs.
Scenario 2(a): Source Code in Multiple Directories
Constructing a multidirectory build system is not as simple as the single directory case, so next you'll see three different attempts to achieve what you need. In these cases, the source code files are no longer colocated in the same directory, but are instead spread across a larger source tree. As a reminder, Figure 6.1 shows the tree for the example software described at the start of Part II.
Figure 6.1 The source tree for the multidirectory calculator example.
For the first attempt, you use a similar makefile to the single-directory program, but the SRCS variable now contains the full path to each file.
1   SRCS = libmath/clock.c libmath/letter.c libmath/number.c \
2          libprint/banner.c libprint/center.c libprint/normal.c \
3          calc/calc.c
4   ...
Although this is easy to understand and it works properly for simple programs, this approach doesn't work in a large-scale build environment, for several reasons:
Harder dependency generation: With automatic generation of .d files, the dependency rules are no longer created properly. Instead, you end up with a rule that doesn't contain the correct pathname on the left side (it's missing the directory component).
clock.o: libmath/clock.c libmath/math.h
Of course, this can be fixed by adding more complexity to the rule that generates .d files, but let's not look into that approach yet.
Developer contention on the single makefile: The SRCS variable is already spread over three lines in the makefile. What would happen if you had a hundred files or a thousand files? This single makefile would be unmanageable, becoming a point of contention when all software engineers (perhaps hundreds of them) needed to modify the same file at the same time.
Inability to subdivide the program: This makefile solution doesn't enable the use of libraries, such as libmath.a or libprint.a. For large programs, it's convenient to subdivide the code into libraries that help delineate areas of code, making it possible to reuse code across different executable programs.
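The first item above notes that the rule for generating .d files can be repaired at the cost of more complexity. As a preview, here is one possible repair, sketched under the assumption that the compiler is GCC and supports the -MT option, which sets the target names written into the generated rule. The other two problems in the list remain.

```make
# Sketch: keep the directory component in generated dependency rules.
SRCS = libmath/clock.c libmath/letter.c libmath/number.c \
       libprint/banner.c libprint/center.c libprint/normal.c \
       calc/calc.c
PROG = calculator
CC = gcc
OBJS = $(SRCS:.c=.o)

$(PROG): $(OBJS)
	$(CC) -o $@ $^

-include $(SRCS:.c=.d)

# For libmath/clock.c, $@ is libmath/clock.d and $(@:.d=.o) is
# libmath/clock.o, so the rule written into the .d file starts with
# "libmath/clock.o libmath/clock.d:" and no sed command is needed.
%.d: %.c
	@$(CC) -MM -MT '$(@:.d=.o) $@' $(CPPFLAGS) $< > $@
```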
For these reasons, it's uncommon to find a large build system that uses a single makefile. A more common solution is to divide the build description across more than one makefile. That leads to the next solution.
Scenario 2(b): Recursive Make over Multiple Directories
The second approach, known as recursive Make, is a common solution in the software industry. The basic approach is to have a different makefile in each source directory, with the high-level makefile (in the high-level directories) recursively invoking each lower-level makefile. Figure 6.2 shows the revised directory tree, with each directory having its own makefile.
Figure 6.2 Multidirectory example, showing the location of makefiles and library files.
Observe that the build tree now has four different files named Makefile: one at the top level and one within each of the libmath, libprint, and calc subdirectories. Going a step further, two static libraries, libmath.a and libprint.a, were added, each archiving the object files from their specific directories.
The advantage of recursive Make is that each makefile needs to list only the files in the current source directory. When necessary, a makefile can recursively call upon another makefile if there's a requirement to build other parts of the source tree. Listing long pathnames in the makefile is unnecessary because all file references are relative to the directory itself. Less contention also arises between different developers who need to make changes to a makefile. The odds of two developers changing the same small makefile are significantly lower than with a single large makefile.
Now look at the content of each makefile, starting with libmath/Makefile:
1   SRCS = clock.c letter.c number.c
2   LIB = libmath.a
3   CC = gcc
4   CFLAGS = -g
5   OBJS = $(SRCS:.c=.o)
6
7   $(LIB): $(OBJS)
8       $(AR) cr $(LIB) $(OBJS)
9
10  $(OBJS): math.h
The code looks similar to the makefile used in the single-directory case, which, of course, is a major reason for using recursive Make. The files listed in the SRCS variable are all relative to the current directory, and you can use GNU Make's built-in rule for compiling C source files. Notice that the code is a bit lazy here: Line 10 contains an explicit dependency for the math.h header file instead of automatically detecting it.
The big difference is in lines 7 and 8, where, instead of linking together a final executable program, a static library is created by archiving the files listed in $(OBJS) into libmath.a. In another makefile that you'll see shortly, this archive is linked into the executable program.
The next makefile, in the libprint subdirectory, is essentially the same.
1   SRCS = banner.c center.c normal.c
2   LIB = libprint.a
3   CC = gcc
4   CFLAGS = -g
5   OBJS = $(SRCS:.c=.o)
6
7   $(LIB): $(OBJS)
8       $(AR) cr $(LIB) $(OBJS)
9
10  $(OBJS): printers.h
This makefile is so similar to libmath/Makefile that you might wonder whether you could factor out the common code. This is certainly the case, and many build systems extract the common code into a framework makefile. Each individual makefile uses the include directive to incorporate the shared functionality. For example, you could rewrite libprint/Makefile as follows:
1   SRCS = banner.c center.c normal.c
2   LIB = libprint.a
3   include lib.mk
4   $(OBJS): printers.h
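The chapter doesn't show the content of lib.mk, so the following is only a guess at what such a framework file might contain, assembled from the lines that libmath/Makefile and libprint/Makefile have in common. How each makefile locates lib.mk (for example, through a relative path or GNU Make's -I option) is left open here.

```make
# lib.mk: shared rules for building a static library (hypothetical contents).
# The including makefile must define SRCS and LIB before its include line.
CC = gcc
CFLAGS = -g
OBJS = $(SRCS:.c=.o)

# Archive this directory's object files into the library.
$(LIB): $(OBJS)
	$(AR) cr $(LIB) $(OBJS)
```

Because OBJS is defined with = rather than :=, it's expanded only when used, so the header dependency that follows the include line still sees the correct list of object files.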
The third makefile, in the calc directory, is different from the other two, in that it creates the final executable program by combining libprint.a and libmath.a, along with a small main program.
1   SRCS = calc.c
2   PROG = calculator
3   LIBS = ../libmath/libmath.a ../libprint/libprint.a
4   CC = gcc
5   CFLAGS = -g
6   OBJS = $(SRCS:.c=.o)
7
8   $(PROG): $(OBJS) $(LIBS)
9       $(CC) -o $@ $^
Note the use of relative paths on line 3 to access the static libraries from the libmath and libprint directories. An assumption is clearly being made that calc/Makefile is executed only after the two libraries have already been brought up-to-date. If the ordering of the steps was incorrect, you'd end up with a broken build, or worse, would build an executable program with outdated libraries.
To make sure everything is built properly, the top-level makefile recursively calls every other makefile in the correct order.
1   .PHONY: all
2   all:
3       $(MAKE) -C libmath
4       $(MAKE) -C libprint
5       $(MAKE) -C calc
This top-level makefile uses only the most basic features of GNU Make and doesn't have much of a dependency graph. Each of the shell commands is executed in the specified order, and there's no choice about whether they'll be executed. The all target has no prerequisites, so each of the recursive calls to $(MAKE) happens every time the developer executes the makefile.
Although recursive Make is simple to understand, it isn't the most efficient solution available. It might be commonly used in the software industry, but it still has a number of flaws that tend to cause slow or incorrect builds. Even though recursive Make enables developers to keep each makefile small and self-contained, with operations being done in an explicit sequence, those are the exact reasons the solution sometimes fails.
The example had only three directories to think about: libmath, libprint, and calc. The relationship between these directories was clearly defined, so the explicit sequence of $(MAKE) calls was easy to determine. On the other hand, what if you had a hundred directories with a complex network of dependencies between them? Trying to build everything in the correct order becomes an impossible task, especially if developers create more interdirectory dependencies as they write new code. After a while, you'd start wishing you'd used GNU Make's dependency-analysis system to figure out the correct ordering for you.
As an example, what would happen if the source code in the libmath directory started to use the libprint.a library? In the current system, libmath is compiled first and, therefore, runs the risk of using an outdated version of the libprint.a library or simply failing if the library didn't yet exist. The easiest solution is to modify the top-level makefile to build libprint first, but that solution doesn't scale to hundreds of directories with complex ordering requirements.
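One common way to soften the ordering problem, while still using recursive Make, is to give each subdirectory its own phony target and state the ordering as prerequisites. The sketch below is a variation on the top-level makefile, not the version used in this chapter, and the SUBDIRS variable is a name chosen for this example.

```make
# Sketch: top-level makefile with one phony target per subdirectory.
SUBDIRS = libmath libprint calc

.PHONY: all $(SUBDIRS)
all: $(SUBDIRS)

# Each directory is still built by a recursive call into that directory.
$(SUBDIRS):
	$(MAKE) -C $@

# Ordering is stated as prerequisites instead of by position in a recipe.
calc: libmath libprint

# If libmath began to use libprint.a, one more line would record that:
# libmath: libprint
```

GNU Make now works out the order of the directory-level calls, but each recursive call still builds its own private dependency graph, so file-level dependencies that cross directory boundaries remain invisible.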
A similar problem occurs if you want to build only part of the program. Imagine that you tried to cut corners by not building the calculator example from the top-level makefile. If you started in the calc subdirectory and typed gmake, you'd simply be recompiling the calc.c source file (if required). Because calc/Makefile doesn't know how to build libmath.a or libprint.a, it doesn't attempt to rebuild either library, even if it's out-of-date.
To phrase these problems in more technical terms, each makefile is executed by a separate instance of the $(MAKE) process and, therefore, has a completely different dependency graph. In no place in the build system is the entire dependency graph available, which, of course, is the root cause of invalid builds. If GNU Make isn't provided with full dependency information, it can't compile the correct set of files in the correct order.
In most large-scale recursive Make systems, developers end up seeing a lot of redundancy. To avoid risking the chance of building an executable program using outdated libraries, each makefile rebuilds the same libraries many times, just to make sure no dependencies were missed. For example, you might choose to build libmath.a, followed by libprint.a, and then repeat the compilation of libmath.a, just in case something in the libprint directory changed since the first time it was compiled. This type of paranoia is common when developers don't trust the build system to do the right thing.
This sequencing technique clearly results in building libmath.a twice, although because the library is already up-to-date, there's probably no extra work to do the second time—well, almost no work. In reality, there's still the overhead of starting a new GNU Make process, parsing the makefile to build the dependency graph, and then reading the file time stamps to see if anything has changed. Unfortunately, this overhead isn't free: It could slow the build by anything from a few seconds to a few minutes, depending on the size of libmath.a.
These problems and several others are detailed in a classic research paper titled "Recursive Make Considered Harmful." This paper also discusses solutions to the recursive Make problem, including the next solution you'll evaluate.
Scenario 2(c): Inclusive Make over Multiple Directories
The third multidirectory solution adopts the good practices of the recursive Make approach, while ensuring that only one instance of the GNU Make process is ever executed. As a result, you benefit from the full power of GNU Make's dependency system so that important dependencies aren't missed. In contrast to the previous method, this new solution is called inclusive Make.
Consider the benefits:
- Only one instance of GNU Make is running, with a lower start-up time. This contrasts with starting hundreds of processes over the lifetime of the build.
- You still have a single makefile per directory to describe all the files in that directory. This makes it possible to encapsulate each directory's build description, and it reduces contention between developers when they modify each makefile.
- All source filenames are specified by their filename component only, so there's no need to include the full path to each file (as in the first example).
- A single dependency graph contains all dependencies in the entire build system, reducing the chance of incorrect builds.
- Because there's no recursion, you don't need to explicitly sequence all the recursive $(MAKE) calls and risk possibly getting it wrong. GNU Make executes the rules in the correct order.
Although this sounds like an excellent solution, the major downside is the additional complexity. If you're new to GNU Make, the solution you're about to see will stretch your knowledge of how the tool works. In most production build systems, an experienced GNU Make guru would create the inclusive build system in the first place, with junior GNU Make programmers scratching their heads to understand how everything works. This example just covers the basic framework and doesn't go into much detail.
Figure 6.3 illustrates the inclusive Make build tree. This is a larger example because a two-level directory structure doesn't show the full extent of this solution.
Figure 6.3 A larger source tree, illustrating the inclusive Make system.
This example has one main makefile, at the top of the source tree. You can also see the make/framework.mk file, which contains most of the complexity of the build system. Finally, each source directory contains a short makefile fragment, named Files.mk, for describing the source files in that particular directory.
Because of the complexity of the inclusive Make framework, this separation of files is important. Software developers are encouraged to view and edit only the Files.mk files, where they can find the list of source files, the list of subdirectories to traverse, and the list of compiler flags. On the other hand, the GNU Make complexity is deliberately hidden inside the make/framework.mk file so that nonguru software engineers don't attempt to change the build mechanism by mistake.
Start by examining a few of the Files.mk files. These are designed to be readable and editable by software developers, and they contain only variables that developers care about:
src/Files.mk:
1   SUBDIRS := libraries application
2   SRC := main.c
3   CFLAGS := -g

src/libraries/Files.mk:
1   SUBDIRS := math protocols sql widgets

src/libraries/math/Files.mk:
1   SRC := add.c mult.c sub.c
2   CFLAGS := -DBIG_MATH
First consider the SUBDIRS variable definitions. For directories (such as src and src/libraries) that contain subdirectories of their own, the SUBDIRS variable lists the directories to be included in the build process. As you can see, src/libraries/Files.mk includes the math subdirectory, so the inclusive framework must incorporate src/libraries/math/Files.mk into the build process. On the other hand, src/libraries/math/Files.mk doesn't contain a definition for SUBDIRS, so the build system won't search any lower in the build tree.
Next, the SRC variable within each Files.mk fragment informs the build system about the C source files that should be included from that directory. Given that src/libraries/Files.mk doesn't include the SRC variable, none of the source files from that directory (if there were any) would be included.
Finally, the CFLAGS variable states which C compiler flags should be used for all the source files in this directory. Each directory can have a different set of C flags instead of using a global set of flags for all files in the build tree.
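A single Files.mk fragment can set SUBDIRS and SRC together. As an illustration, the fragment below is a reconstruction of what src/application/database/Files.mk presumably looks like, inferred from the debugging output shown later in this section rather than copied from the actual file. The order of the entries in SUBDIRS is a guess.

```make
# src/application/database/Files.mk (reconstructed, not the original file)
# This directory has two subdirectories and three source files of its own.
# CFLAGS is left undefined, so no extra compiler flags apply here.
SUBDIRS := load save
SRC := persistence.c backup.c optimize.c
```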
In the inclusive Make example, these Files.mk fragments are all that an average software developer is interested in seeing. The question remains of how GNU Make interprets these Files.mk files and how the SRC, SUBDIRS, and CFLAGS variables are used.
Continue by examining src/Makefile, which is the main entry point to the GNU Make program. As a reminder, only build gurus would be interested in reading or modifying this file.
1   _subdirs :=
2   _curdir :=
3   FRAMEWORK := $(CURDIR)/make/framework.mk
4   include Files.mk
5   include $(FRAMEWORK)
6
7   VARS := $(sort $(filter srcs-% cflags-%, $(.VARIABLES)))
8   $(foreach var, $(VARS), $(info $(var) = $($(var))))
9
10  .PHONY: all
11  all:
12      @# do nothing
Again, a detailed explanation is in order. The inclusive Make solution is complex, so examine each line in turn.
On line 1, the _subdirs variable is initialized to the empty string. This variable is used as a space-separated list of subdirectories to be traversed. Within each of these directories, you can expect to find a Files.mk file, which itself could potentially include a definition for the SUBDIRS variable. Each time you find another SUBDIRS definition, you append the new subdirectories onto _subdirs, effectively creating a queue of directories to visit.
For example, after you've visited src/Files.mk, the _subdirs variable contains the following:
libraries application
In the next step, you pop the libraries path off the front of the queue and parse src/libraries/Files.mk. After discovering the new definition for SUBDIRS in that file, the _subdirs variable changes to this:
application libraries/math libraries/protocols libraries/sql libraries/widgets
Following this process repeatedly, you end up traversing the entire build tree and reading every Files.mk file. Note that the src directory name isn't included in these pathnames because that's the current working directory. Everything is already relative to the src directory.
Line 2 of src/Makefile initializes _curdir to the empty string. This variable represents the current directory you're traversing. It starts empty because you're at the top level of the build tree (inside the src directory). As you traverse the build tree, by popping entries off the start of the _subdirs queue, the value of _curdir reflects the current point of traversal.
Line 3 defines FRAMEWORK to be the path of the framework makefile. You'll be calling upon this makefile often, so it's convenient to have a variable referring to it.
Line 4 starts everything in motion by including the src/Files.mk file. From this, you get the top-level definition of SRC, SUBDIRS, and CFLAGS. Note the distinction here between including a file with the include directive and calling upon another makefile using $(MAKE). Because you're using include, the same Make instance is used, and you'll be adding to the same dependency graph each time (instead of creating a new one).
Line 5 calls the inclusive Make framework to process the content of the SRC, SUBDIRS, and CFLAGS variables; the framework then continues traversing the remainder of the source tree. By the time you return from this particular include directive, all the Files.mk files will have been processed.
Lines 7 and 8 are executed after the entire tree of Files.mk fragments has been processed. This code takes the complete list of variables that GNU Make knows about (automatically stored in $(.VARIABLES)) and selects all variable names that start with srcs- or cflags-. It then displays each one on the program's output so that you can see the computed values. You haven't seen it yet, but the framework file defines the srcs-* and cflags-* variables as it traverses the build tree.
This mechanism isn't normally part of the build system, but it's used as a means of debugging the inclusive Make algorithm to ensure that everything is working correctly. You'll take a look at the output shortly.
Now examine the content of make/framework.mk, which is the main algorithm for traversing the build tree and collecting the values from each Files.mk fragment:
1   srcs-$(_curdir) := $(addprefix $(_curdir),$(SRC))
2   cflags-$(_curdir) := $(CFLAGS)
3   _subdirs := $(_subdirs) $(addprefix $(_curdir), $(SUBDIRS))
4
5   ifneq ($(words $(_subdirs)),0)
6     _curdir := $(firstword $(_subdirs))/
7     _subdirs := $(wordlist 2, $(words $(_subdirs)), $(_subdirs))
8     SUBDIRS :=
9     SRC :=
10    CFLAGS :=
11    include $(_curdir)Files.mk
12    include $(FRAMEWORK)
13  endif
As with the previous file, this makefile framework requires detailed explanation. Recall that this file is included immediately after one of the Files.mk files has been parsed. Therefore the SRC, SUBDIRS, and CFLAGS variables have just been set to the appropriate value for the directory you're currently processing.
Line 1 records the set of source files for the current directory. The variable name on the left side of the assignment also contains a variable, so you'll be creating a different GNU Make variable for each directory you visit. This syntax seems odd at first, but having the capability to dynamically construct variable names is equivalent to defining arrays or hashes in other languages. That is, the srcs- variable has many subelements, each indexed by the name of the directory.
On the right side of line 1, you take the current definition of the SRC variable and add the current directory as a prefix to each element in that list. For example, if _curdir is set to libraries/math/, then you've just finished parsing the src/libraries/math/Files.mk file. Line 1 of the framework makefile is therefore equivalent to this:
srcs-libraries/math/ := libraries/math/add.c libraries/math/mult.c libraries/math/sub.c
Although it might seem odd, it's perfectly acceptable to have punctuation within variable names.
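To see computed variable names in isolation, consider this small stand-alone makefile. It's a sketch for experimentation only: the dir variable is a name invented for the example and stands in for _curdir.

```make
# Sketch: a variable whose name is built from another variable's value.
dir := libraries/math/
srcs-$(dir) := $(addprefix $(dir),add.c mult.c sub.c)

# Read the value back, first by its literal name and then by computing
# the name again. Both lines print the same list of three paths.
$(info literal:  $(srcs-libraries/math/))
$(info computed: $(srcs-$(dir)))

.PHONY: all
all:
	@:
```

The second lookup is what makes the array analogy work: as long as you can rebuild the directory name, you can retrieve that directory's entry.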
Line 2 is similar and stores the current directory's CFLAGS definition inside a directory-specific cflags-* variable. In the simple inclusive framework, you won't be doing anything with these variables aside from displaying them for debugging purposes.
Line 3 is responsible for queuing up any additional SUBDIRS values that the current Files.mk fragment might contain. Again, you prefix the elements in SUBDIRS with the current directory, but this time you append these values to the end of the existing $(_subdirs) value.
Lines 5–13 are where the tree traversal takes place. Assuming that there are more entries in the queue of pending subdirectories, you'll extract the first of them and visit the Files.mk file in the corresponding source code directory.
Lines 6 and 7 remove the first queue element. Line 6 sets the first item in the _subdirs list as the current directory (_curdir). Line 7 deletes this first element from the queue by reassigning _subdirs with all the words from position 2 to the end of the current _subdirs value.
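The queue manipulation on lines 6 and 7 can be tried on its own. The following stand-alone makefile is a sketch that starts with the queue as it looks after src/Files.mk has been read and then removes the first entry.

```make
# Sketch: pop the first word off a space-separated queue.
_subdirs := libraries application

# Take the head of the queue, then keep words 2..N as the new queue.
_curdir  := $(firstword $(_subdirs))/
_subdirs := $(wordlist 2,$(words $(_subdirs)),$(_subdirs))

$(info _curdir  = $(_curdir))
$(info _subdirs = $(_subdirs))

.PHONY: all
all:
	@:
```

Running gmake on this file reports that _curdir is libraries/ and that _subdirs has shrunk to application. When only one entry is left, the wordlist call asks for words 2 through 1, which GNU Make treats as an empty range, so the queue becomes empty.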
Line 11 now includes the Files.mk fragment that resides within the current directory. Given that Files.mk isn't required to contain all the variable definitions (SRC, SUBDIRS, CFLAGS), you first set them to the empty string (lines 8–10) to make sure that the values from the previous directory don't "leak through" to the current directory.
Finally, line 12 includes the framework file again, which stores the values of SRC and CFLAGS and then traverses any additional directories listed in SUBDIRS.
That's the end of the example. For completeness, let's see the output of executing the makefile on the example build tree. The values for the srcs- and cflags- variables should match the original diagram.
cflags- = -g
cflags-application/ =
cflags-application/database/ =
cflags-application/database/load/ =
cflags-application/database/save/ =
cflags-application/graphics/ =
cflags-libraries/ =
cflags-libraries/math/ = -DBIG_MATH
cflags-libraries/protocols/ = -DFAST_SEND
cflags-libraries/sql/ = -O2
cflags-libraries/widgets/ = -DCOLOR="red"
srcs- = main.c
srcs-application/ =
srcs-application/database/ = application/database/persistence.c application/database/backup.c application/database/optimize.c
srcs-application/database/load/ = application/database/load/loading.c
srcs-application/database/save/ = application/database/save/saving.c
srcs-application/graphics/ = application/graphics/line-drawing.c application/graphics/vector-size.c application/graphics/3d.c
srcs-libraries/ =
srcs-libraries/math/ = libraries/math/add.c libraries/math/mult.c libraries/math/sub.c
srcs-libraries/protocols/ = libraries/protocols/tcp.d libraries/protocols/udp.c libraries/protocols/ip.c
srcs-libraries/sql/ = libraries/sql/select.c libraries/sql/view.c libraries/sql/create.c libraries/sql/drop.c
srcs-libraries/widgets/ = libraries/widgets/button.c libraries/widgets/list.c libraries/widgets/window.c libraries/widgets/tree.c
At this point, it should be clear that you haven't built a complete inclusive Make system, but you should have a basic idea of how it could be done. The important factors are that each directory has its own Files.mk files (with paths specified relative to that directory) and that using one instance of the GNU Make process enables you to have a single unified dependency graph.
To make a fully functional build system, you need to add the following features:
- GNU Make code to define the dependencies between object files, source files, and header files (using automatic dependency analysis).
- Rules for compiling the code. (You'd need to override the built-in rules for C compilation.)
- Code to link object files into static libraries.
- Code to link together the final executable programs (possibly more than one program could be compiled).
- The capability to start the GNU Make process from a subdirectory. (Currently, the only makefile is in the top-level src directory.)
- Support for compiling on multiple CPU architectures.
- C compiler flags on a per-file basis instead of just on a per-directory basis.
- Inheritance of compiler flags from parent directories to child directories.
Certainly, the list goes on. In summary, an inclusive Make build system is not an easy system to create. Definitely budget plenty of time if you decide to create your own. Luckily, several experts have provided systems you can use as a starting point.
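To give a flavor of the first steps, the following sketch shows one way the srcs-* and cflags-* variables could be turned into a link rule and per-directory compiler flags. It is not part of the framework described above. The names PROG, SRC_VARS, ALL_OBJS, and per_dir_flags are invented for this example, the two directories are copied by hand from the earlier output so that the fragment stands alone, and for brevity it leans on GNU Make's built-in C compilation rule instead of overriding it.

```make
# Stand-ins for values that make/framework.mk computes during traversal.
srcs- := main.c
cflags- := -g
srcs-libraries/math/ := libraries/math/add.c libraries/math/mult.c libraries/math/sub.c
cflags-libraries/math/ := -DBIG_MATH

# Hypothetical names from here on.
PROG := calculator
SRC_VARS := $(filter srcs-%,$(.VARIABLES))
ALL_OBJS := $(foreach v,$(SRC_VARS),$(patsubst %.c,%.o,$($(v))))

# One link rule covering every object file in the tree.
$(PROG): $(ALL_OBJS)
	$(CC) -o $@ $^

# $(1) is the name of a srcs-* variable. The generated line gives that
# directory's object files the matching cflags-* value as a
# target-specific CFLAGS, which the built-in %.o: %.c rule then uses.
define per_dir_flags
$(patsubst %.c,%.o,$($(1))): CFLAGS := $($(subst srcs-,cflags-,$(1)))
endef

# Skip directories that contribute no source files.
$(foreach v,$(SRC_VARS),$(if $(strip $($(v))),$(eval $(call per_dir_flags,$(v)))))
```

Header dependencies, static libraries, and the remaining items in the list are not addressed by this sketch.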
Scenario 3: Defining New Compilation Tools
The next real-world scenario looks at adding a new type of compilation tool into the makefile. So far, this chapter has focused exclusively on compiling C-language source files, but the same concepts extend nicely to other languages. In fact, this GNU Make code will appear simple compared to some you've seen so far.
To make use of the mathcomp compiler (discussed in the introduction to Part II), you need to add the following:
- A list of source files that are in .math file format, to be read by the mathcomp compiler
- A GNU Make rule that describes how to compile .math files into .c files
- A new type of dependency file (with .d1 suffix) to record the relationship between .math files and the .mathinc files they depend upon
Now jump right into the final solution, which isn't too different from what you've already seen.
1   MATHCOMP := /tools/bin/mathcomp
2   CC := gcc
3   MATHSRC := equations.math
4   CSRC := calculator.c
5   PROG := calculator
6   OBJS := $(CSRC:.c=.o) $(MATHSRC:.math=.o)
7
8   $(PROG): $(OBJS)
9       $(CC) -o $@ $^
10
11  %.c: %.math
12      $(MATHCOMP) -c $<
13
14  -include $(CSRC:.c=.d)
15  -include $(MATHSRC:.math=.d1)
16
17  %.d: %.c
18      @$(CC) -MM $(CPPFLAGS) $< | sed 's#\(.*\)\.o: #\1.o \1.d: #g' > $@
19
20  %.d1: %.math
21      echo -n "$@ $(*F).c: " > $@;
22      $(MATHCOMP) -d $< >> $@
Here's a line-by-line explanation, but only for the new portions of the makefile. Everything else should look familiar.
Line 1 defines the path of the mathcomp compiler. An absolute path is used for the tool here instead of relying on users to have their $PATH variable set correctly.
Line 3 defines the list of source files (MATHSRC) in the .math file format, just as line 4 defines the list (CSRC) of C-language source files. Line 6 forms a list of object files by replacing .c and .math file extensions with the .o extension.
Lines 11 and 12 define a dependency rule to generate .c files from their corresponding .math files. For example, to generate equations.o (required by line 8), you first need to generate equations.c (defined by the built-in C compilation rule). To do this, GNU Make triggers the rule on line 11 to generate equations.c from equations.math.
Lines 15 and 20–22 perform the magic necessary for autodetecting makefile dependencies. Similarly to the C compiler, you pass the -d option to the mathcomp compiler and have it generate the list of source files it includes (namely, .mathinc files). The additional echo command on line 21 adds a small amount of extra information that mathcomp doesn't provide by default. The resulting equations.d1 file looks like this:
equations.d1 equations.c: equations.math equ1.mathinc equ2.mathinc
With those key points covered and all the previous examples you've seen, the rest of the makefile should be easy to understand. In summary, adding a new compilation tool in GNU Make is not too difficult, except perhaps when it comes to automatically detecting dependencies.
Scenario 4: Building with Multiple Variants
GNU Make is the most common means of compiling C and C++ code, and both of these languages usually compile to native machine code. Clearly, you need a way to select which CPU type to use. This example allows the software developer to compile for the Intel x86 series, the PowerPC series, or the Alpha CPUs. In fact, you allow them to compile for all three architectures within the same build tree at the same time.
To select a target architecture, developers should provide a value for the PLATFORM variable. If they don't provide a value, the compilation defaults to using the x86 architecture. For example:
$ gmake PLATFORM=powerpc   # build for PowerPC CPUs
$ gmake                    # build for i386 CPUs
$ gmake PLATFORM=xbox      # OOPS! Not allowed.
Makefile:8: *** Invalid PLATFORM: xbox. Stop.
Here's the necessary GNU Make code for compiling platform-specific code:
1 SRCS = add.c calc.c mult.c sub.c
2 PROG = calculator
3 CFLAGS = -g
4 PLATFORM ?= i386
5 VALID_PLATFORMS = i386 powerpc alpha
6
7 ifeq ($(filter $(PLATFORM), $(VALID_PLATFORMS)),)
8   $(error Invalid PLATFORM: $(PLATFORM))
9 endif
10
11 OBJDIR=obj/$(PLATFORM)
12 $(shell mkdir -p $(OBJDIR))
13
14 CC := gcc-$(PLATFORM)
15 OBJS = $(addprefix $(OBJDIR)/, $(SRCS:.c=.o))
16
17 $(OBJDIR)/$(PROG): $(OBJS)
18 	$(CC) $(CFLAGS) -o $@ $^
19
20 $(OBJDIR)/%.o: %.c
21 	$(CC) -c -o $@ $<
22
23 $(OBJS): numbers.h
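This makefile example includes a few new concepts. Line 4 provides the default value for the PLATFORM variable. If the user doesn't set the variable on the command line, it defaults to i386. You don't technically need the ?= operator to support command-line overrides; any variable defined on the command line automatically overrides the value assigned in the makefile. What ?= adds is that a PLATFORM value inherited from the environment is also respected, because the assignment takes effect only if the variable isn't already defined.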
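To see where a variable's value came from, GNU Make's $(origin) function can be printed alongside it. The following is only a small sketch, separate from the calculator makefile; the empty all target is a placeholder so that the file can be run on its own.

```makefile
# Default used only when PLATFORM is not already defined
# (for example, by the environment or the command line).
PLATFORM ?= i386

# $(origin) reports how the variable was defined, such as
# "file", "environment", or "command line".
$(info PLATFORM=$(PLATFORM) [origin: $(origin PLATFORM)])

# Placeholder target so this sketch runs on its own.
all: ;
```

Running it three ways (plain gmake, gmake PLATFORM=powerpc, and with PLATFORM exported in the environment) shows the origin changing while the makefile itself stays the same.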
Lines 7–9 test whether $(PLATFORM) is one of the acceptable values. The $(filter) function returns the empty string if it's unable to find $(PLATFORM) in the list of valid platforms. The ifeq directive tests for this empty string, and the $(error) function on line 8 stops the build with an appropriate message.
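The behavior of $(filter) is easy to check in isolation. This sketch reuses the VALID_PLATFORMS list from the makefile above and prints the result of one matching and one nonmatching lookup; the empty all target is only a placeholder.

```makefile
VALID_PLATFORMS = i386 powerpc alpha

# $(filter) keeps only the words that match the pattern.
# A value that is not in the list leaves nothing between the brackets.
$(info match:    [$(filter powerpc,$(VALID_PLATFORMS))])
$(info no match: [$(filter xbox,$(VALID_PLATFORMS))])

# Placeholder target so this sketch runs on its own.
all: ;
```

The first message shows [powerpc] and the second shows [], which is the empty string that ifeq compares against on line 7.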
Lines 11 and 12 determine the directory in which the object files will be placed. All the examples so far have stored the object files in the same directory as the source files because that's the default behavior. However, with object files from three different architectures, you need to explicitly store them in an architecture-specific location (obj/i386, obj/powerpc, or obj/alpha). Line 12 creates the selected object directory if it doesn't already exist, so it's in place before any compilation starts.
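An alternative to creating the directory while the makefile is parsed is an order-only prerequisite, which GNU Make marks with the | separator. This is a different technique from the one used on line 12, shown here only as a sketch; it assumes the OBJDIR and OBJS variables from the makefile above and would replace line 12.

```makefile
# The directory must exist before any object file is compiled, but a
# change in the directory's timestamp should not force a recompile.
# Prerequisites listed after "|" are order-only.
$(OBJS): | $(OBJDIR)

# Create the object directory on demand.
$(OBJDIR):
	mkdir -p $@
```

With this form, the directory is created only when something actually needs to be built, rather than on every invocation of gmake.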
Line 14 selects the appropriate C compiler to use and assigns the name to the CC variable. Assume that each CPU architecture requires a different version of GCC, as opposed to a single compiler instance supporting multiple targets.
Line 15 computes the list of object files to be compiled. Given that each CPU's object files are stored in a different object directory, you need to explicitly state which object files are to be built. In this case, you prefix each element in the object file list with $(OBJDIR).
Finally, lines 17–21 rewrite the rules you've seen many times before. The only difference is that here you've added $(OBJDIR) on the left side of each rule, whereas in the past you've assumed that object files are placed in the source directory. This code relies on GNU Make pattern rules, in which the target pattern and the prerequisite pattern can name different directories, so the source and object files can be located in different places.
With this additional functionality, you now can support multiple CPU architectures. To help clarify how this build system works, examine the output:
$ gmake
gcc-i386 -c -o obj/i386/add.o add.c
gcc-i386 -c -o obj/i386/calc.o calc.c
gcc-i386 -c -o obj/i386/mult.o mult.c
gcc-i386 -c -o obj/i386/sub.o sub.c
gcc-i386 -g -o obj/i386/calculator obj/i386/add.o obj/i386/calc.o obj/i386/mult.o obj/i386/sub.o
$ gmake PLATFORM=powerpc
gcc-powerpc -c -o obj/powerpc/add.o add.c
gcc-powerpc -c -o obj/powerpc/calc.o calc.c
gcc-powerpc -c -o obj/powerpc/mult.o mult.c
gcc-powerpc -c -o obj/powerpc/sub.o sub.c
gcc-powerpc -g -o obj/powerpc/calculator obj/powerpc/add.o obj/powerpc/calc.o obj/powerpc/mult.o obj/powerpc/sub.o
Of course, in a realistic environment, you'd integrate this code into a recursive Make or inclusive Make solution; otherwise, you're limited to compiling files in a single source directory.
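Because each platform writes to its own object directory, building every variant in one tree is a matter of invoking the makefile once per platform. The wrapper below is a sketch rather than part of the original makefile: the all-platforms target name is hypothetical, and the fragment assumes it is appended to the end of the Scenario 4 makefile so that VALID_PLATFORMS is defined and the default goal is unchanged.

```makefile
# Hypothetical convenience target: build every valid platform in turn.
# Each sub-make receives its own PLATFORM value on its command line.
.PHONY: all-platforms
all-platforms:
	for p in $(VALID_PLATFORMS); do \
		$(MAKE) PLATFORM=$$p || exit 1; \
	done
```

The loop stops at the first platform that fails to build, which is usually what you want when one of the cross-compilers is missing.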
Scenario 5: Cleaning a Build Tree
The next real-world scenario involves cleaning a build tree by removing all the generated files. Sometimes you want this functionality on a per-directory basis, but in other cases, you're happy to remove all object files from the build tree. In either case, it's important that your cleaning operation remove the exact set of object files your build process created in the first place.
The way a build system cleans a build tree depends entirely on how your build system was constructed. For recursive Make systems, each makefile is responsible for generating the object files in its own directory; therefore, it should be responsible for removing them, too.
For example, in the top-level makefile, you'd have a rule that recursively cleans the subdirectories.
.PHONY: clean
clean:
	$(MAKE) -C libmath clean
	$(MAKE) -C libprint clean
	$(MAKE) -C calc clean
And in each of the subdirectories, you'd have a rule to actually remove the files.
.PHONY: clean
clean:
	rm -f $(OBJS) $(LIB)
One advantage of this system is that developers can easily clean the content of any subdirectory by simply issuing the gmake clean command at that level.
For inclusive Make systems, you can take advantage of the fact that the entire dependency graph is available within the single GNU Make process. Because you have a complete list of source files being compiled, you also know the complete set of object files. Things get a little more complicated when you have other generated files (such as equations.c being generated from equations.math), but this simply requires additional logic to record the relevant filenames. Cleaning specific subdirectories is also possible by filtering each file based on its pathname.
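One way to record the generated files and filter them by pathname is sketched below. The GENERATED variable, the clean-% target, and the individual filenames are hypothetical; only the libmath and calc directory names come from the earlier example.

```makefile
# Hypothetical bookkeeping: each included makefile fragment appends the
# files it generates, including intermediate files such as equations.c.
GENERATED :=
GENERATED += libmath/add.o libmath/mult.o libmath/libmath.a
GENERATED += calc/calc.o calc/equations.c calc/equations.o calc/calculator

# Remove every recorded file in the build tree.
.PHONY: clean
clean:
	rm -f $(GENERATED)

# "gmake clean-libmath" removes only the recorded files under libmath/.
# $* holds the directory name matched by the % in the target.
clean-%:
	rm -f $(filter $*/%,$(GENERATED))
```

The filter pattern $*/% matches any recorded path that starts with the requested directory, so the same list serves both the full and the per-directory clean.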
The tricky part about cleaning a build tree is that you're not always aware of which files are generated. Sometimes this is a sign that your interfile dependencies are not well understood, but sometimes a compilation tool creates extraneous files that you don't really care about. Although these files are never used and are never included in the dependency graph, they still need to be deleted from the build tree.
One good practice for testing your clean target is to fully build a source tree and then fully clean that same tree. Next, compare the list of disk files against a completely fresh source tree and see if any discrepancies arise. If any files are left over, you can explicitly add them to the clean target to make sure they're properly deleted. On the other hand, you might wonder why those files weren't already accounted for in $(OBJS) and, therefore, already deleted.
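The build-then-clean comparison can be automated with a small target. This is a sketch under several assumptions: the check-clean name and the /tmp filenames are hypothetical, the makefile is assumed to have all and clean targets, and the check is assumed to start from a freshly checked-out tree.

```makefile
# Hypothetical self-test for the clean target.
# Run this from a fresh source tree that has never been built.
.PHONY: check-clean
check-clean:
	find . -type f | sort > /tmp/files-before.txt
	$(MAKE) all
	$(MAKE) clean
	find . -type f | sort > /tmp/files-after.txt
	diff /tmp/files-before.txt /tmp/files-after.txt
```

If diff reports any leftover files, it exits with a nonzero status and the target fails. Note that this version lists only regular files, so leftover empty directories would go unnoticed.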
Finally, one advantage of storing all generated files in a special object directory instead of the source code tree is that a single delete command (such as rm -rf in UNIX) is guaranteed to remove all generated files.
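Applied to the layout from Scenario 4, where everything generated lives under obj/, the clean rules could be as short as the following sketch. The clean-all target name is hypothetical, and OBJDIR is assumed to be defined as in that makefile.

```makefile
# Remove the generated files for the currently selected platform only.
.PHONY: clean
clean:
	rm -rf $(OBJDIR)

# Hypothetical target: remove the generated files for every platform.
.PHONY: clean-all
clean-all:
	rm -rf obj
```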
Scenario 6: Debugging Incorrect Builds
Locating bugs in your GNU Make build system is often challenging. Given the nature of the pattern-matching algorithm, GNU Make doesn't use the line-by-line sequencing that most programmers are comfortable with. Rules from any part of the makefile system can be triggered at any time.
In a real-world development project, you'll likely experience the following makefile problems:
- A target file isn't being generated when it should be. In this case, there's probably a missing link in the dependency graph, and you need to add an additional rule.
- A file is being generated when it shouldn't be, which makes you wonder if an incorrect dependency is causing too much work to be performed.
- The content of the target file is incorrect, which suggests that a compilation tool is being invoked with the wrong command-line options.
- GNU Make is reporting that no rule is available to create a specific target. You need to add the missing rule or determine why an existing rule isn't triggering when it should.
- Rules are being triggered in the wrong order, most likely when you're trying to build multiple jobs in parallel. This, too, usually means that links are missing in the dependency graph.
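The last problem in the list is easiest to see with a generated header. In this sketch, the prog, main.c, and version.h names are hypothetical. A serial build would happen to create version.h first because it is listed first, but a parallel build is free to start both prerequisites at once unless main.o declares the dependency itself.

```makefile
# Hypothetical program that includes a generated header.
prog: version.h main.o
	$(CC) -o $@ main.o

# The generated header.
version.h:
	echo '#define VERSION "1.0"' > $@

# Listing version.h here is the missing link: without it, "gmake -j"
# may compile main.c before version.h has been written.
main.o: main.c version.h
	$(CC) $(CFLAGS) -c -o $@ main.c
```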
You can resolve each of these problems by first determining which compilation tool has the incorrect behavior and then working backward to determine where the associated rules and variables are defined. The steps are as follows:
- Examine the build output log to determine which of the compilation tools is doing the wrong thing. This might involve scanning through hundreds or thousands of lines of output to find the offending command.
- Locate the makefile rule that's responsible for generating the bad command line. Given that rules (including the built-in rules) can be spread across a number of different makefiles in a build system, finding where everything is defined can take time.
- Check that the command-line options in this rule are valid. If necessary, double-check the variable definitions used in the rule. This can be challenging if some of the variables use deferred evaluation, making use of subvariables that are defined in other parts of the build system.
- Examine the dependencies in the rule to make sure they're correct. This might involve searching for related rules to ensure that prerequisite files are also being created.
To help with this debugging effort, GNU Make provides a number of command-line options:
- gmake -n: Displays the list of shell commands to be executed, without actually executing them. This saves you a lot of time when trying to find an offending compilation tool, without waiting for a long build to complete.
- gmake -p: Displays the content of GNU Make's internal database. This contains the complete list of rules and variables defined in each makefile, as well as GNU Make's built-in rules. Line number information is recorded so you can easily track down where something is defined.
- gmake -d: Displays a trace log of GNU Make's pattern-matching algorithm as it parses and executes a makefile. The output can be extremely verbose, but it provides everything you need to know.
In addition to these command-line options, you can use the print debugging approach to display useful messages on the program's output. The exact sequence in which these messages appear helps the developer understand how the makefile is executing. The $(warning) function displays a text message, along with information on where in the makefile the function was called.
$(warning CFLAGS is set to $(CFLAGS))
This function expands to the empty string, so it can be inserted at any point in the makefile where a function call is permitted without changing the surrounding text. Another clever trick is to use $(warning) within the definition of variables. Whenever the variable is accessed, a suitable message is displayed.
CFLAGS = $(warning Accessing CFLAGS) -g
Also, if you redefine the $(SHELL) variable to include a call to $(warning), you display a message on the program's output whenever a rule is triggered.
SHELL = $(warning Target is $@) /bin/sh
Now see how all this fits together. Going back to the first calculator program, you now get a much better view of when variables are accessed, what they're defined as, and when the rules are being triggered.
Makefile:8: Accessing CFLAGS
Makefile:8: CFLAGS is set to -g
Makefile:13: Accessing CFLAGS
Makefile:13: Target is add.o
gcc -g -c -o add.o add.c
Makefile:13: Accessing CFLAGS
Makefile:13: Target is calc.o
gcc -g -c -o calc.o calc.c
Makefile:13: Accessing CFLAGS
Makefile:13: Target is mult.o
gcc -g -c -o mult.o mult.c
Makefile:13: Accessing CFLAGS
Makefile:13: Target is sub.o
gcc -g -c -o sub.o sub.c
Makefile:16: Accessing CFLAGS
Makefile:16: Target is calculator
gcc -g -o calculator add.o calc.o mult.o sub.o
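A makefile along the following lines combines the three $(warning) tricks. It is a reconstruction for illustration, not the exact file that produced the trace above, so the reported line numbers and the order of the rules will differ.

```makefile
SRCS = add.c calc.c mult.c sub.c
PROG = calculator
CC = gcc
OBJS = $(SRCS:.c=.o)

# Trick 1: report every access to CFLAGS (deferred evaluation with "=").
CFLAGS = $(warning Accessing CFLAGS) -g

# Trick 2: report each target as its recipe is about to run.
SHELL = $(warning Target is $@) /bin/sh

# Trick 3: print a value while the makefile is being parsed.
$(warning CFLAGS is set to $(CFLAGS))

$(PROG): $(OBJS)
	$(CC) $(CFLAGS) -o $@ $^

%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<
```

Both CFLAGS and SHELL must use the = operator here. With :=, the $(warning) call would be expanded once at the point of definition instead of on every access.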
Finally, to make life much easier, the third-party GNU Make debugger tool uses these underlying tricks to provide a more traditional debugging environment. You can interactively print the value of variables, find out how they're defined, and set breakpoints on specific makefile rules. Consider using this tool when debugging a nontrivial makefile.