This book is a significant milestone among books on UNIX kernel internals. Symmetric multiprocessing and cache memory systems are important, cost-effective technologies for improving the performance of today's state-of-the-art systems.
Written for the UNIX kernel developer, this book provides a complete yet comprehensible explanation of the operation of caches and symmetric multiprocessors, how they work together, and the issues operating systems must address in order to run on the machines that incorporate them.
After a review of UNIX kernel internals, Curt Schimmel launches into a detailed description of cache memory systems, including several kinds of virtual and physical caches, as well as a chapter on efficient cache management. For each type of cache, the book covers the impact on the software and the operating system changes these systems require. The next section details the operation of the tightly coupled, shared memory, symmetric multiprocessor. It examines the problems these multiprocessors present to the operating system, such as race conditions, deadlocks, and the ordering of memory operations, and looks at how the UNIX kernel can be adapted to run on such systems. Finally, the book looks at the interaction between cache memory systems and multiprocessors and the new problems that this interaction presents to the kernel. Techniques for solving these problems are then explained.
Numerous examples representing CISC and RISC processors, such as the Intel 80486 and Pentium, the Motorola 68040 and 88000, as well as the MIPS and SPARC processors, illustrate the concepts presented. To reinforce the concepts, each chapter contains a set of exercises, with answers to selected exercises included in the back.
"This book [UNIX Systems for Modern Architectures] for the systems programmer covers almost everything you wanted to know about caches, multiprocessor systems, and cached multiprocessor systems, especially as related to UNIX." - Unix Review
1. Review of UNIX Kernel Internals.
Processes, Programs, and Threads.
The Process Address Space.
Memory and Process Management System Calls.
I. CACHE MEMORY SYSTEMS.
2. Introduction to Cache Memory Systems.
Direct Mapped Caches.
Two-Way Set Associative Caches.
n-Way Set Associative Caches.
Fully Associative Caches.
Summary of n-Way Set Associative Caches.
Separate Instruction and Data Caches.
How Cache Architectures Differ.
Further Reading.
3. Virtual Caches.
Virtual Cache Operation.
Problems with Virtual Caches.
Managing a Virtual Cache.
Further Reading.
4. Virtual Caches with Keys.
The Operation of a Virtual Cache with Keys.
Managing a Virtual Cache with Keys.
Virtual Cache Usage in MMUs.
Further Reading.
5. Virtual Caches with Physical Address Tags.
The Organization of a Virtual Cache with Physical Tags.
Managing a Virtual Cache with Physical Tags.
Further Reading.
6. Physical Caches.
The Organization of a Physical Cache.
Managing a Physical Cache.
Primary Virtual Cache with Secondary Physical Cache.
Further Reading.
7. Efficient Cache Management Techniques.
Address Space Layout.
Cache Size Bounded Flushing—Delayed Cache Invalidations.
Cache-Aligning Data Structures.
II. MULTIPROCESSOR SYSTEMS.
8. Introduction to Multiprocessor Systems.
The Tightly Coupled, Shared Memory, Symmetric Multiprocessor.
The MP Memory Model.
Review of Mutual Exclusion on Uniprocessor.
Problems Using UP Mutual Exclusion Policies on MPs.
Further Reading.
9. Master-Slave Kernels.
Master-Slave Kernel Implementation.
Further Reading.
10. Spin-Locked Kernels.
Multithreading Cases Requiring No Locks.
Effects of Sleep and Wakeup on Multiprocessors.
Further Reading.
11. Semaphored Kernels.
Coarse-Grained Semaphore Implementations.
Multithreading with Semaphores.
Further Reading.
12. Other MP Primitives.
Eventcounts and Sequencers.
The MP Primitives of SVR4.2 MP.
Comparison of MP Synchronization Primitives.
Further Reading.
13. Other Memory Models.
Other Memory Models.
Total Store Ordering.
Partial Store Ordering.
The Store Buffer as Part of the Memory Hierarchy.
III. MULTIPROCESSOR SYSTEMS WITH CACHES.
14. Introduction to MP Cache Consistency.
The Cache Consistency Problem.
Software Cache Consistency.
Further Reading.
15. Hardware Cache Consistency.
Consistency of Read-Modify-Write Operations.
Hardware Consistency for Multilevel Caches.
Other Main Memory Architectures.
Effects on the Software.
Hardware Consistency for Nonsequential Memory Models.
Performance Considerations for Software.
Further Reading.
Appendix A: Architecture Summary.
The goal of this book is to provide practical information on the issues operating systems must address in order to run on modern computer systems that employ cache memories and/or multiprocessors. At the time of this writing, a number of books describe UNIX system implementations, but none describes in detail how caches and multiprocessors should be managed. Many computer architecture books describe caches and multiprocessors from the hardware aspect, but none successfully deals with the operating system issues that these modern architectures present. This book is intended to fill these gaps by bridging computer architecture and operating systems.
Written with the operating system developer in mind, this book explains the operation of caches and multiprocessors from the system programmer's point of view. While targeted toward UNIX system programmers, the book has been written so that the information can be applied to any operating system, including all UNIX variations. This is accomplished by explaining the issues and solutions at a conceptual level and using the UNIX system services as examples of where the issues will be encountered. The solutions can then be applied to other operating systems in the corresponding situations.
This book is intended to assist the operating system developer in two ways. First, the reader will learn how existing operating systems must be adapted to run on modern architectures. This is accomplished by a detailed examination of the operation of these architectures from the operating system perspective and an explanation of what the operating system must do to manage them. Second, the reader will learn the trade-offs involved in the different approaches taken by modern architectures. This will give the operating system developer the background needed when involved in the design of new computer systems employing caches and multiprocessors.
The reader is assumed to be familiar with the UNIX system call interface and the high-level concepts of UNIX kernel internals. The reader should also be familiar with computer architecture and computer system organization as would be taught in an undergraduate-level computer science course.
This book is an extension of a course I developed for UNIX system professionals in the computer industry. The course has been taught during the past four years in the United States at USENIX conferences, and in Europe at the EurOpen and UKUUG conferences. The course is a one-day tutorial and as such is limited in the amount of material that can be covered. This book covers all the course material on cache memories and multiprocessors in greater detail and includes additional topics.
This book is suitable for use in an upper-division undergraduate-level course or at the graduate level. Each chapter concludes with a list of exercises. The questions were chosen so that they can be solved with the information provided in the chapter plus some additional thought, rather than by simply parroting the material. In many cases, the exercises build upon the examples presented in the chapter. Answers are generally expected to take the form of a short paragraph (four to five sentences in most cases, sometimes longer). The reader is urged to try all the questions in order to reinforce the concepts learned. Answers to selected exercises are provided in the back of the book.
We begin with a review of the UNIX system internals that are relevant to the discussion in the remainder of the book. The purpose of the review is to reinforce the concepts of the UNIX operating system and to define terminology used later. The book is then divided into three main parts: cache memory systems, multiprocessor UNIX implementations, and multiprocessor cache consistency. The first part, cache memory systems, introduces cache architecture, terminology, and concepts. It then proceeds to take a detailed look at four common cache implementations: three variations of the virtual cache and then the physical cache. The second part, multiprocessor UNIX implementations, looks at the problems and design issues faced when adapting a uniprocessor kernel implementation to run on a tightly coupled, shared memory multiprocessor. Several different implementations are examined. The final part, multiprocessor cache consistency, combines the concepts of the first two parts by examining the operating system and cache architecture issues that occur when caches are added to a tightly coupled, shared memory multiprocessor system.
A selected set of modern microprocessor architectures is used to illustrate the concepts where appropriate. Representing the traditional CISC (complex instruction set computer) processors are the Motorola 68040 and the Intel 80X86 line (80386, 80486, and Pentium). The RISC (reduced instruction set computer) approach is represented by the MIPS line (R2000, R3000, and R4000), the Motorola 88000, and the SPARC version 8 compatible processors from Texas Instruments (the MicroSPARC and the SuperSPARC). Several other examples, including Sun and Apollo workstations and the Intel i860, are also presented. A summary of the characteristics of these processors can be found in Appendix A.
I owe my gratitude to the people who offered their time to review the manuscript before publication. In particular, I would like to thank Steve Albert, Paul Borman, Steve Buroff, Clement Cole, Peter Collinson, Geoff Collyer, Bruce Curtis, Mukesh Kacker, Brian Kernighan, Steve Rago, Mike Scheer, Brian Silverio, Rich Stevens, Manu Thapar, Chris Walquist, and Erez Zadok. I would also like to thank the Addison-Wesley staff for their help and advice on this project, particularly Kim Dawley, Kathleen Duff, Tiffany Moore, Simone Payment, Marty Rabinowitz, and John Wait. They have helped make this a better book than I could have done on my own. I would also like to thank the many people who took the time to provide thoughtful feedback by filling out the course evaluations during the tutorial sessions.
Comments, suggestions, and bug fixes regarding the contents of this book are welcome and can be sent by email to firstname.lastname@example.org.