Code Reading: The Open Source Perspective
- By Diomidis Spinellis
- Published May 27, 2003 by Addison-Wesley Professional. Part of the Effective Software Development Series series.
- Copyright 2003
- Dimensions: 7-3/8x9-1/4
- Pages: 528
- Edition: 1st
- ISBN-10: 0-201-79940-5
- ISBN-13: 978-0-201-79940-8
Register your product to gain access to bonus material or receive a coupon.
Product Author Bios
Diomidis Spinellis has been developing the concepts presented in this book since 1985, while also writing groundbreaking software applications and working on multimillion-line code bases. Spinellis holds an M.Eng. degree in software engineering and a Ph.D. in computer science from Imperial College London. Currently he is an associate professor in the Department of Management Science and Technology at the Athens University of Economics and Business.
If you are a programmer, you need this book.
- You've got a day to add a new feature in a 34,000-line program: Where do you start? Page 333
- How can you understand and simplify an inscrutable piece of code? Page 39
- Where do you start when disentangling a complicated build process? Page 167
- How do you comprehend code that appears to be doing five things in parallel? Page 132
You may read code because you have to--to fix it, inspect it, or improve it. You may read code the way an engineer examines a machine--to discover what makes it tick. Or you may read code because you are scavenging--looking for material to reuse.
Code-reading requires its own set of skills, and the ability to determine which technique you use when is crucial. In this indispensable book, Diomidis Spinellis uses more than 600 real-world examples to show you how to identify good (and bad) code: how to read it, what to look for, and how to use this knowledge to improve your own code.
Fact: If you make a habit of reading good code, you will write better code yourself.
Download the complete source code base (134 MB) for the book: Code Reading: The Open Source Perspective.
62 of 67 people found the following review helpful
Cool concept, but disappointing,
Amazon Verified Purchase(What's this?)
This review is from: Code Reading: The Open Source Perspective (v. 1) (Paperback)I purchased the book to help me out with the recurring task of quickly understanding the nature of unfamiliar large software projects. Kudos to Mr. Spinellis for tackling this subject, which is a large part of the everyday work of programming.
Unfortunately, I feel that this book was of very limited use to me as an experienced programmer, and suffers from a rather basic flaw (as a topic). The problem is that the art of code reading is really the intersection of a deep and/or broad understanding of programming, in conjunction with a deep and/or broad understanding of the tools and practices employed. One could well assert that this book is about *debugging* unfamiliar codebases as much as it is about *reading* them, since code comprehension is a component of code debugging. This is a rather apt analogy, since many have attempted to describe the black art of debugging just as Mr. Spinellis has attempted with reading, and with no definitive "must-have" coverage to... Read more
61 of 78 people found the following review helpful
This review is from: Code Reading: The Open Source Perspective (v. 1) (Paperback)Programmers need to be able to look at code and analyze what it does in order to change it or fix it. The concept behind this book is to use many of the open source code samples to discuss how to read code and how to spot potential trouble areas in code. Unfortunately the book doesn't stay focused on this single goal and that detracts from its overall value. The book spends too much time explaining the basics of programming instead of concentrating on reading code. It also bounces around from one language to another, from C to C++ to Perl to Java, which is very confusing. For example, if you are a Java programmer do you really care how the C compiler optimizes strcmp calls? And what does that have to do with reading code?
Some of the advice is fairly basic such as try to realign indentations properly and replace complex code structures with simple placeholders when doing analysis. Although there are parts of the book that are excellent, too many of these good parts are... Read more
9 of 10 people found the following review helpful
Imparts benefits from much experience with wisdom & humor,
This review is from: Code Reading: The Open Source Perspective (v. 1) (Paperback)This book is exactly what I was looking for to lead a seminar in bioinformatics at UNC Chapel Hill that brings together bio-chem-phys students with computer science students to try to raise the level of programming sophistication of the former, and raise the level of biochem/biophys sophistication of the latter. It collects examples of why and how to read code, pointing out lessons about the idioms and pitfalls that can help you write, maintain, or evolve code under your control. Full of good ideas, drawn from a lot of experience, and written with humor.
The only problem is that inexperienced programmers, who would benefit most from this book, are unlikely to pick up a book on how to read C programs unless someone tells them to. Experts will find that they have already learned most of these things from their experience, although they may still enjoy this book for confirming what they know. But I think that experts will also enjoy being able to loan this book to inexperienced... Read more
› See all 19 customer reviews...
Download the Index
file related to this title.
What do we ever get nowadays from reading to equal the excitement and the revelation in those first fourteen years? -- Graham Greene
The reading of code is likely to be one the most common activities of a computing professional, yet it is seldom taught as a subject, or formally used as a method for learning how to design and program. One reason for this sad situation may have been the lack of high-quality code to read. Companies often protect source code as a trade secret and rarely allow others to read, comment, experiment, and learn from it. In the few cases where important proprietary code was allowed out of a company's closet, it has spurred enormous interest and creative advancements. As an example, a generation of programmers benefited from John Lions' Commentary on the UNIX Operating System that listed and annotated the complete source code of the sixth edition UNIX kernel. Although Lions' book was originally written under a grant from AT&T for use in an operating system course and was not available to the general public, copies of it circulated for years as bootleg nth-generation photocopies.
In the last few years however, the popularity of open-source software has provided us with a large body of code that we can all freely read. Some of the most popular software systems used today, such as the Apache Web server, the Perl language, the gnu/Linux operating system, the BIND domain-name server, and the sendmail mail-transfer agent are in fact available in open-source form. I was thus fortunate to be able to use open-source software, such as the above to write this book as a primer and reader for software code. My goal was to provide background knowledge and techniques for reading code written by others. By using real-life examples taken out of working, open-source projects I tried to cover most concepts related to code that are likely to appear before a software developer's eyes including programming constructs, data types, data structures, control flow, project organization, coding standards, documentation, and architectures. A companion title to this book will cover interfacing, and application-oriented code including the issues of internationalization and portability, the elements of commonly used libraries and operating systems, low-level code, domain-specific and declarative languages, scripting languages, and mixed language systems.
This book is--as far as I know--the first one to exclusively deal with code-reading as a distinct activity, one worthy on its own. As such I am sure that there will be inevitable shortcomings, better ways some of its contents could have been treated, and important material I have missed. I firmly believe that the reading of code should both be properly taught, and used as a method for improving one's programming abilities. I therefore hope that this book will spur interest to include code reading courses, activities, and exercises into the computing education curriculum so that in a few years our students will learn from existing open-source systems, just as their peers studying a language learn from the great literature.Supplementary Material
Many of the source code examples provided come from the source distribution of NetBSD. NetBSD is a free, highly portable UNIX-like operating system available for many platforms, from 64-bit AlphaServers to handheld devices. Its clean design and advanced features make it an excellent choice for both production and research environments. I selected NetBSD over other similarly admirable and very popular free UNIX-like systems such as GNU/Linux, FreeBSD, and OpenBSD, because the primary goal of the NetBSD project is to emphasize correct design and well written code thus making it a superb choice for providing example source code. According to its developers, some systems seem to have the philosophy of "if it works, it's right" whereas NetBSD could be described as "it doesn't work unless it's right." In addition, some other NetBSD goals fitted particularly well with the objectives of this book. Specifically, the NetBSD project avoids encumbering licenses, provides a portable system running on many hardware platforms, interoperates well with other systems, and conforms to open systems standards as much as is practical. The code used in this book is a (now historic) export-19980407 snapshot. A few examples refer to errors I found in the code; as the NetBSD code continuously evolves, presenting examples from a more recent version would mean risking that those realistic gems would have been corrected.
I chose the rest of the systems I used in the book's examples for similar reasons: code quality, structure, design, utility, popularity, and a license that would not make my publisher nervous. I strived to balance the selection of languages, actively looking for suitable Java and C++ code. However, where similar concepts could be demonstrated using different languages I chose to use C as the least common denominator.
I sometimes used real code examples to illustrate unsafe, non-portable, unreadable, or otherwise condemnable coding practices. I appreciate that I can be accused of disparaging code that was contributed by its authors in good faith to further the open-source movement and to be improved upon rather than be merely criticized. I sincerely apologize in advance if my comments cause any offence to a source code author. In defense I argue that in most cases the comments do not target the particular code excerpt, but rather use it to illustrate a practice that should be avoided. Often the code I am using as a counter example is a lame duck, as it was written at a time where technological and other restrictions justified the particular coding practice, or the particular practice is criticized out of the context. In any case, I hope that the comments will be received good-humouredly, and openly admit that my own code contains similar, and probably worse, misdeeds.
We're programmers. Our job (and in many cases our passion) is to make things happen by writing code. We don't meet our user's requirements with acres of diagrams, with detailed project schedules, with four-foot-high piles of design documentation. These are all wishes--expressions of what we'd like to be true. No, we deliver by writing code: code is reality.
So that's what we're taught. Seems reasonable. Our job is to write code, so we need to learn how to write code. College courses teach us to to write programs. Training courses tell us how to code to new libraries and APIs. And that's one of the biggest tragedies in the industry.
Because the way to learn to write great code is by reading code. Lots of code. High-quality code, low-quality code. Code in assembler, code in Haskell. Code written by strangers ten thousand miles away, and code written by ourselves last week. Because unless we do that, we're continually reinventing what has already been done, repeating both the successes and mistakes of the past.
I wonder how many great novelists have never read someone else's work, how many great painters never studied another's brush strokes, how many skilled surgeons never learned by looking over a colleague's shoulder, how many 767 captains didn't first spend time in the copilot's seat watching how it's really done.
And yet that's what we expect programmers to do. "This week's assignment is to write. ..." We teach developers the rules of syntax and construction, and then we expect them to be able to write the software equivalent of a great novel.
The irony is that there's never been a better time to read code. Thanks to the huge contributions of the open-source community, we now have gigabytes of source code floating around the 'net just waiting to be read. Choose any language, and you'll be able to find source code. Select a problem domain, and there'll be source code. Pick a level, from microcode up to high-level business functions, and you'll be able to look at a wide body of source code.
Code reading is fun. I love to read others' code. I read it to learn tricks and to study traps. Sometimes I come across small but precious gems. I still remember the pleasure I got when I came across a binary-to-octal conversion routine in PDP-11 assembler that managed to output the six octal digits in a tight loop with no loop counter.
I sometimes read code for the narrative, like a book you'd pick up at an airport before a long flight. I expect to be entertained by clever plotting and unexpected symmetries. Jame Clark's gpic program (part of his GNU groff package) is a wonderful example of this kind of code. It implements something that's apparently very complex (a declarative, device-independent picture-drawing language) in a compact and elegant structure. I came away feeling inspired to try to structure my own code as tidily.
Sometimes I read code more critically. This is slower going. While I'm reading, I'm asking myself questions such as "Why is this written this way?" or "What in the author's background would lead her to this choice?" Often I'm in this mode because I'm reviewing code for problems. I'm looking for patterns and clues that might give me pointers. If I see that the author failed to take a lock on a shared data structure in one part of the code, I might suspect that the same might hold elsewhere and then wonder if that mistake could account for the problem I'm seeing. I also use the incongruities I find as a double check on my understanding; often I find what I think is a problem, but it on closer examination it turns out to be perfectly good code. Thus I learn something.
In fact, code reading is one of the most effective ways to eliminate problems in programs. Robert Glass, one of this book's reviewers, says, "by using (code) inspections properly, more than 90 percent of the errors can be removed from a software product before its first test. In the same article he cites research that shows "Code-focused inspectors were finding 90 percent more errors than process-focused inspectors." Interestingly, while reading the code snippets quoted in this book I came across a couple of bugs and a couple of dubious coding practices. These are problems in code that's running at tens of thousands of sites worldwide. None were critical in nature, but the exercise shows that there's always room to improve the code we write. Code-reading skills clearly have a great practical benefit, something you already know if you've ever been in a code review with folks who clearly don't know how to read code.
And then there's maintenance, the ugly cousin of software development. There are no accurate statistics, but most researchers agree that more than half of the time we spend on software is used looking at existing code: adding new functionality, fixing bugs, integrating it into new environments, and so on. Code-reading skills are crucial. There's a bug in a 100,000-line program, and you've got an hour to find it. How do you start? How do you know what you're looking at? And how can you assess the impact of a change you're thinking of making?
For all these reasons, and many more, I like this book. At its heart it is pragmatic. Rather than taking an abstract, academic approach, it instead focuses on the code itself. It analyzes hundreds of code fragments, pointing out tricks, traps and (as importantly) idioms. It talks about code in its environment and discusses how that environment affects the code. It highlights the important tools of the code reader's trade, from common tools such as grep and find to the more exotic. And it stresses the importance of tool building: write code to help you read code. And, being pragmatic, it comes with all the code it discusses, conveniently cross-referenced on a CD-ROM.
This book should be included in every programming course and should be on every developer's bookshelf. If as a community we pay more attention to the art of code reading we'll save ourselves both time and pain. We'll save our industry money. And we'll have more fun while we're doing it.
The Pragmatic Programmers, LLC
Download the sample pages (includes Chapter 2 and Index)
Table of Contents
2. Basic Programming Elements.
3. Advanced C Data Types.
4. C Data Structures.
5. Advanced Control Flow.
6. Tackling Large Projects.
7. Coding Standards and Conventions.
10. Code-Reading Tools.
11. A Complete Example.
Appendix A. Outline of the Code Provided.
Appendix B. Source Code Credits.
Appendix C. Referenced Source Files.
Appendix D. Source Code Licenses.
Appendix E. Maxims for Reading Code.
Author Index. 0201799405T01162003
Downloadable Sample Chapter
Download the Sample Chapter related to this title.
Click for the Errata related to this title.
Book + eBook Bundle
Book Price $55.99
eBook Price $19.60
eBook formats included
This book includes free shipping!
This book includes free shipping!
Includes EPUB, MOBI, and PDF
About eBook Formats
This eBook includes the following formats, accessible from your Account page after purchase:
EPUBThe open industry format known for its reflowable content and usability on supported mobile devices.
MOBIThe eBook format compatible with the Amazon Kindle and Amazon Kindle applications.
PDFThe popular standard, used most often with the free Adobe® Reader® software.
This eBook requires no passwords or activation to read. We customize your eBook by discretely watermarking it with your name, making it uniquely yours.
Get access to thousands of books and training videos about technology, professional development and digital media from more than 40 leading publishers, including Addison-Wesley, Prentice Hall, Cisco Press, IBM Press, O'Reilly Media, Wrox, Apress, and many more. If you continue your subscription after your 30-day trial, you can receive 30% off a monthly subscription to the Safari Library for up to 12 months. That's a total savings of $199.