The Art of Computer Programming, Volume 4, Fascicle 7: Constraint Satisfaction
Preface
There is quite a good deal of information in the book. I regret this very much; but really it could not be helped: information appears to stew out of me naturally, like the precious ottar of roses out of the otter. Sometimes it has seemed to me that I would give worlds if I could retain my facts; but it cannot be. The more I calk up the sources, and the tighter I get, the more I leak wisdom. Therefore, I can only claim indulgence at the hands of the reader, not justification.
— MARK TWAIN, Roughing It (1872)
FREEDOM is a wonderful thing. But we can also be thankful for the constraints that give structure to our lives and provide a focus for our creative juices. We experience moments of great satisfaction when challenges have been met. Combinatorial and musical patterns that fit together almost magically can be supremely satisfying.
This booklet is Fascicle 7 of The Art of Computer Programming, Volume 4: Combinatorial Algorithms. As explained in the preface to Fascicle 1 of Volume 1, I’m circulating the material in this preliminary form because I know that the task of completing Volume 4 will take many years; I can’t wait for people to begin reading what I’ve written so far and to provide valuable feedback.
To put the material in context, this lengthy fascicle contains part of a long, long chapter on combinatorial searching. Chapter 7 will eventually fill at least four volumes (namely Volumes 4A, 4B, 4C, and 4D), assuming that I’m able to remain healthy. It begins in Volume 4A with a short review of graph theory and a longer discussion of “Zeros and Ones” (Section 7.1); that volume concludes with Section 7.2.1, “Generating Basic Combinatorial Patterns,” which is the first part of Section 7.2, “Generating All Possibilities.” Volume 4B resumes the story with Section 7.2.2, about backtracking in general. Section 7.2.2.1 discusses a family of methods called “dancing links,” for updating data structures while backtracking, and shows that such methods are particularly successful when applied to XCC problems—“exact covering with colors.” That sets the scene for Section 7.2.2.2, which fills the rest of Volume 4B; it is devoted to the important problem of Boolean satisfiability, aka ‘SAT’.
Now comes Volume 4C, whose first third will eventually consist of the contents of the present fascicle. Our theme, “constraint satisfaction,” is Section 7.2.2.3, and it takes the concepts of Sections 7.2.2.1 and 7.2.2.2 to a higher level. A fresh look at the basic notions leads to significant improvements and many individual topics of independent interest. Once again it has been a great pleasure for me to put these stories together, and I hope that I can convey the joy of what I’ve learned to as many people as possible.
Dozens of examples tie the subject matter of this fascicle to numerous other parts of computer science and mathematics. We’ll see, for instance, applications to scene analysis (computer vision); we’ll construct fascinating instances of “graceful graphs”; we’ll discuss efficient algorithms that embed one graph as a subgraph of another; we’ll learn new ways to look ahead when backtracking, and we’ll investigate new heuristics for guiding a search that backtracks through a massive space of possibilities; we’ll also learn how to avoid backtracking altogether, when possible. We’ll study new sparse-set data structures that lead to “dancing cells”—a technique that often is even better than “dancing links”! Of course there are recreational topics galore, even including some new takes on the classic problem of a knight’s tour. Puzzle connoisseurs will enjoy learning more about fillomino patterns. And so on.
As usual, more than half of this book is devoted to exercises and their answers. In fact, Section 7.2.2.3 has approximately 500 exercises, some of which are routine and some of which aren’t yet solved; all of them have been designed for self-study. I’ve tried to make it easy for a reader to navigate through this maze of problems by presenting them in an order that roughly parallels the threads of the main text.
Look, for example, at a “random” page—page 42, say, which discusses a variety of ways to translate graph coloring problems into SAT clauses. On that page you’ll see that exercises 221, 226, 224, 225, 228 are mentioned. So you can guess that the main exercises about SAT encoding probably have numbers near 225. The exercises have also been carefully indexed.
My goal as always has been to strive for a good balance between theory and practice (and fun). I try to discuss the theoretical discoveries that are most relevant to people who write programs in the “real world,” and to present those theories without using advanced jargon. I also strive to encapsulate the history of the subject, so that readers can understand the “human dimension” and the process of discovery.
While writing this section I also wrote hundreds of programs for my own edification. (I usually can’t understand things well until I’ve tried to explain them to a machine.) Most of those programs were quite short, of course; but several of them are rather substantial, and possibly of interest to others. Therefore I’ve made a selection available by listing some of them on the following webpage:
Prototypes of the main algorithms can be found there, together with a few other programs that are mentioned in the answers to certain exercises. If you want to see a program called FOO, look for FOO on that webpage. In particular, you can download the programs SSXCCO, SSXCC, SSXCC-BINARY, SSMCC, and XCCDC; those experimental versions of Algorithms C, C+, B, F, and S were my constant companions while writing the middle portions of this fascicle.
Interested readers who wish to do their own experiments can find data files for the benchmark examples in the following two collections:
Incidentally, the illustrations in Section 7.2.2.3 are numbered beginning with Fig. 100, because the final illustration of Volume 4B is Fig. 99. The editor has decided to treat Chapter 7 as a single unit, even though it is being split across several physical volumes.
Special thanks are due to Christian Bessière, Víctor Dalmau, Ralph Freese, Daniel Horsley, Peter Jeavons, Ciaran McCreesh, Patrick Prosser, George Sicherman, Christine Solnon, Filip Stappers, Peter Stuckey, Kokichi Sugihara, James Trimble, Udo Wermuth, Ross Willard, and Dmitriy Zhuk, for their detailed comments on my early attempts at exposition, as well as to numerous other correspondents who have contributed crucial corrections. Thanks also to Stanford’s InfoLab for providing extra computer power when my workstation was inadequate. And above all, I thank my wife for her constant support and for help with Figs. 100 and 101.
I happily offer a “finder’s fee” of $2.56 for each error in this draft when it is first reported to me, whether that error be typographical, technical, or historical. The same reward holds for items that I forgot to put in the index. And valuable suggestions for improvements to the text are worth 32¢ each. (Furthermore, if you find a better solution to an exercise, I’ll actually do my best to give you immortal glory, by publishing your name in the eventual book:–)
Cross references to yet-unwritten material sometimes appear as ‘00’; this impossible value is a placeholder for the actual numbers to be supplied later.
Happy reading!
D. E. K.
Stanford, California
23 October 2024
P.S.: A note on notations. Some formulas in this booklet use the notation ‘υ x’ for the “sideways sum” or “population count” function, as well as the notation ‘ρx’ for the “ruler” function. Those functions, and other bitwise notations, are discussed extensively in Section 7.1.3 of Volume 4A.
Other formulas use the notation 〈xyz〉 for the median function, which is discussed extensively in Section 7.1.1. Hexadecimal constants are preceded by a number sign or hash mark: # 123 means (123)16.
Furthermore, there’s a special list of ‘Notational conventions’ in the index.
If you run across other notations that appear strange, please look at the Index to Notations (Appendix B) at the end of Volume 4A or 4B. Volume 4C will, of course, have its own Appendix B some day.
The field of combinatorial algorithms is too vast to cover in a single paper or even in a single book.
— ROBERT ENDRE TARJAN, SIAM Review (1978)
