"This sobering description of many computer-related failures throughout our world deflates the hype and hubris of the industry. Peter Neumann analyzes the failure modes, recommends sequences for prevention and ends his unique book with some broadening reflections on the future."--Ralph Nader, Consumer Advocate
This book is much more than a collection of computer mishaps; it is a serious, technically oriented book written by one of the world's leading experts on computer risks. The book summarizes many real events involving computer technologies and the people who depend on those technologies, with widely ranging causes and effects. It considers problems attributable to hardware, software, people, and natural causes. Examples include disasters (such as the Black Hawk helicopter and Iranian Airbus shootdowns, the Exxon Valdez, and various transportation accidents); malicious hacker attacks; outages of telephone systems and computer networks; financial losses; and many other strange happenstances (squirrels downing power grids, and April Fool's Day pranks).
Computer-Related Risks addresses problems involving reliability, safety, security, privacy, and human well-being. It includes analyses of why these cases happened and discussions of what might be done to avoid recurrences of similar events. It is readable by technologists as well as by people merely interested in the uses and limits of technology. It is must reading for anyone with even a remote involvement with computers and communications--which today means almost everyone.
If you wish to catch up with current events and are able to browse the Internet, you are encouraged to peruse the ongoing RISKS archives at ftp://ftp.sri.com/risks . Further instructions for on-line access are given in Appendix Section A.2.1, including how to subscribe to receive RISKS by direct e-mail: send e-mail to risks-request@CSL.sri.com with the one-line message "subscribe", or "help" for further information.
A summary of one-liners of essentially all interesting RISKS cases (Illustrative Risks to the Public in the Use of Computer Systems and Related Technology), updated regularly, is browsable at http://www.csl.sri.com/neumann/illustrative.html (and is also available for ftp in .ps and .pdf form). Further RISKS-related material also appears in each regular issue of the ACM Software Engineering Notes and in the Inside Risks column in each issue of the Communications of the ACM.
In addition, Peter Neumann's Web site http://www.csl.sri.com/neumann.html contains further information, including his 1996 testimony for the Permanent Subcommittee on Investigations of the Senate Committee on Governmental Affairs and 1998 testimony to the full Committee on security risks in the infrastructure, analysis of risks relating to the Social Security Administration's PEBES (Personal Earnings and Benefit Estimate Statement) Web site and related identity-related risks, and his 1997 testimony for the Senate Judiciary Committee on risks in key-recovery.
Computer systems enable us to do tasks that we could not even dream of doing otherwise --- for example, carrying out extremely complex operations, or searching rapidly through mammoth amounts of information. They can also fail to live up to our expectations in many different ways, sometimes with devastating consequences.
This book analyzes a large collection of problems experienced to date, and provides insights that may be helpful in avoiding such consequences in the future. Hindsight can be valuable when it leads to new foresight. I hope that this book will serve that purpose, helping us to attain a safer, sounder, and more secure future in our interactions with and dependence on computers and related technologies.
This book refers to vulnerabilities, threats, and risks in computers and related systems. Strict definitions always tend to cause arguments about subtle nuances, and tend to break down in specific cases. Because the primary emphasis in this book is on the big picture, we seek definitions that are intuitively motivated.
The following intuitively-based English language terms are applied to computer and communication systems throughout this book, and are introduced here. These definitions are generally consistent with common technical usage. (Each term tends to have a conceptual meaning as well as a relative qualitative meaning.) Further definitions are given as they are needed.
These three terms appear to overlap significantly, and indeed that is the case no matter how carefully definitions are chosen. These concepts are inherently interrelated; uses of technology must necessarily consider them together. Furthermore, there is never an absolute sense in which a system is secure or reliable.
Increasingly, we depend on computer systems to behave acceptably in applications with extremely critical requirements, by which we mean that the failure of systems to meet their requirements may result in serious consequences. Examples of critical requirements include protection of human lives and resources on which we depend for our well-being, and the attainment (with high assurance) of adequate system reliability, data confidentiality, and timely responsiveness, particularly under high-risk situations. A challenge of many computer-system designs is that the entire application environment (rather than just the computer systems) must satisfy simultaneously a variety of critical requirements, and must continue to do so throughout its operation, maintenance, and long-term evolution. The satisfaction of a single requirement is difficult enough, but the simultaneous and continued satisfaction of diverse and possibly conflicting requirements is typically much more difficult.
The collection of diverse failures presented here is representative of computer-related problems that have arisen in the past. Understanding the reasons for these cases can help us at least to reduce the chances of the same mistakes recurring in the future. However, we see that similar problems continue to arise. Furthermore, major gaps still exist between theory and practice, and between research and development.
We take a broad, system-oriented view of computer-related technologies that includes computer systems, communication systems, control systems, and robots, for example. We examine the role of computers themselves from a broad perspective. Hardware, software, and people are all sources of difficulties. Human safety and personal well-being are of special concern.
We explore various inherent limitations both of the technology and of the people who interact with it. Certain limitations can be overcome --- albeit only with significant effort. We must strive to promote the development and systematic use of techniques that can help us to identify the intrinsic limitations, and to reduce those that are not intrinsic --- for example, through better systems and better operational practices. We must also be keenly aware of the limitations.
There are many pitfalls in designing a system to meet critical requirements. Software-engineering techniques provide directions, but no guarantees. Experience shows that even the most carefully designed systems may have significant flaws. Both testing and formal verification have serious deficiencies, such as the intrinsic incompleteness of the former and the considerable care and effort necessary in carrying out the latter. All in all, there are no easy answers.
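The intrinsic incompleteness of testing can be made concrete with a little arithmetic. The following is a minimal sketch in Python, not drawn from the book: the function name, the 64-bit input width, and the rate of one billion tests per second are all assumptions chosen for illustration.

```python
# Hypothetical illustration: why exhaustive testing is out of reach
# even for a very small interface. All figures here are assumptions.

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def exhaustive_test_years(input_bits: int, tests_per_second: float) -> float:
    """Return the years needed to try every possible input once."""
    total_inputs = 2 ** input_bits
    return total_inputs / tests_per_second / SECONDS_PER_YEAR


# A routine taking two 32-bit arguments has 2**64 distinct inputs.
years = exhaustive_test_years(input_bits=64, tests_per_second=1e9)
print(f"Exhaustive testing would take roughly {years:.0f} years")
```

Under these assumptions the answer is measured in centuries, and the estimate ignores internal state, timing, and the environment, each of which enlarges the space further. Any practical test suite therefore samples only a small fraction of possible behavior.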
Unfortunately, even if an ideal system were designed such that its specifications were shown to be consistent with its critical requirements, and then the design were implemented correctly such that the code could be shown to be consistent with the specifications, the system would still not be totally trustworthy. Desired behavior could be undermined by the failure of the underlying assumptions (whether implicit or explicit), even temporarily. Such assumptions are far-reaching, yet often are not even stated --- for example, that the requirements are complete and correct, that there are no lurking design flaws, that no harmful malicious code has been inserted, that there are no malicious users (insiders or outsiders), that neither the data nor the system has been altered improperly, and that the hardware behaves predictably enough that the expected worst-case fault coverage is adequate. In addition, human misuse or other unanticipated problems can subvert even the most carefully designed systems.
Thus, there is good news and there is bad news. The good news is that computer system technology is advancing. Given well-defined and reasonably modest requirements, talented and diligent people, enlightened and altruistic management, adequate financial and physical resources, and suitably reliable hardware, systems can be built that are likely to satisfy certain stringent requirements most of the time. There have been significant advances --- particularly in the research community --- in techniques for building such computer systems. The bad news is that guaranteed system behavior is impossible to achieve --- with or without people in the operational loop. There can always be circumstances beyond anyone's control, such as floods, lightning strikes, and cosmic radiation, to name a few. Besides, people are fallible. Thus, there are always inherent risks in relying on computer systems operating under critical requirements --- especially those that are complex and are necessary for controlling real-time environments, such as in fly-by-wire aircraft that are aerodynamically unstable and cannot be flown without active computer control. Even the most exhausting (but still not exhaustive) testing leaves doubts. Furthermore, the development community tends to be slow in adopting those emerging research and development concepts that are practical, and in discarding many other ideas that are not practical. Even more important, it is inherent in a development effort and in system operation that all potential disasters cannot be foreseen --- yet it is often the unforeseen circumstances that are the most disastrous, typically because of a combination of circumstances involving both people and computers. A fundamental conclusion is that, even if we are extremely cautious and lucky, we must still anticipate the occurrences of serious catastrophes in using computer systems in critical applications. 
This concept is also explored by Charles Perrow, who illustrates why accidents must be considered to be normal, rather than exceptional, events.
Because most of the examples cited here illustrate what has gone wrong in the past, the casual reader may wonder if anything has ever gone right. I have always sought to identify true success stories. However, there are few cases in which system developments met their requirements, on budget, and on time; even among those, there have been many developmental and operational problems.
The technological aspects of this book consider how we can improve the state of the art and enhance the level of human awareness, to avoid in the future the problems that have plagued us in the past. It is important to consider the collection of examples as representing lessons from which we must learn. We must never assume infallibility of either the technology or the people developing and applying that technology. Indeed, in certain situations, the risks may simply be too great for us to rely on either computers or people, and it would be better not to entrust the application to automation in the first place. For other applications, suitable care in system development and operation may be sufficient to keep the risks within acceptable limits. But all such conclusions depend on an accurate assessment of the risks and their consequences, an assessment that is typically lacking.
Some of the many stages of system development and system use during which risks may arise are listed in Sections 1.2.1 and 1.2.2, along with a few examples of what might go wrong. Indeed, every one of these categories is illustrated throughout the book with problems that have actually occurred. Techniques for overcoming these problems are considered in Chapter 7.
Problems may occur during each stage of system development (often involving people as an underlying cause), including the following:
Problems may also arise during system operation and use (typically involving people or external factors), including the following:
There are many areas in which computers affect our lives, and in which risks must be anticipated or accommodated. Several of these areas are listed here, along with just a few types of potential risks associated with the causes listed in Sections 1.2.1 and 1.2.2. All of these areas are represented in the following text.
Given these potential sources of risks and their consequent adverse effects, appropriate countermeasures are essential. Chapter 7 includes discussion of techniques for increasing reliability, safety, security, and other system properties, and protecting privacy. That chapter also addresses techniques for improving the system-development process, including system engineering and software engineering.
The reader who is technically inclined will find useful discussions throughout, and many pointers to further references --- in various sections and in the summaries at the end of each chapter. Reliability and safety techniques are considered in Section 7.7. Security techniques are considered in Sections 3.8 and 6.3, and in several of the sections of Chapter 7 (particularly Section 7.9). Assurance of desired system properties in turn relies heavily on software engineering and system-development approaches considered in Sections 7.6 and 7.8. The reader who is primarily interested in the illustrative material may wish to skip over those sections.
This introductory chapter provides a brief overview of the types of harmful causes and adverse effects that can arise in connection with the use of computers. It introduces the enormous variety of problems that we confront throughout this book.
C1.1 -- Examine the sources of risks given in Section 1.2. Identify those sources that have affected your life. Describe how they have done so.
C1.2 -- Examine the illustrative list of adverse consequences given in Section 1.3. Identify those consequences that have affected your life. Describe them.
C1.3 -- Describe an incident in which your work with computer systems has been adversely affected, and analyze why that happened and what could have been done to avoid or mitigate the effects.
C1.4 -- (Essay question) What are your expectations of computer-communication technologies? What effects do you think these technologies might have on civilization, both in the short term and in the long term? What kinds of risks do you think might be acceptable, and under what circumstances? Is the technology making life better or worse? Be specific in your answers. (This challenge is a preliminary one. Your views may well change after you read the rest of the book, at which time you may wish to reassess your answers to this question.)
Identify ways in which failures could affect your life in the future.
1. The Nature Of Risks.
Background on Risks.
Sources of Risks.
Guide to Summary Tables.
Problems in Space.
Robotics and Safety.
Medical Health and Safety.
Computer Calendar Clocks.
Security Vulnerabilities and Misuse Types.
Pest Programs and Deferred Effects.
Bypass of Intended Controls.
Other Attack Methods.
Comparison of the Attack Methods.
Classical Security Vulnerabilities.
Avoidance of Security Vulnerabilities.
Weak Links and Multiple Causes.
Accidental versus Intentional Causes.
Spoofs and Pranks.
Intentional Denials of Service.
Unintentional Denials of Service.
Financial Fraud by Computer.
Accidental Financial Losses.
Risks in Computer-Based Elections.
Needs for Privacy Protection.
Prevention of Privacy Abuses.
Annoyances in Life, Death, and Taxes.
What's in a Name?
Use of Names as Identifiers.
The Not-So-Accidental Holist: A System View.
Putting Your Best Interface Forward.
Woes of System Development.
Modeling and Simulation.
Coping with Complexity.
Techniques for Increasing Reliability.
Techniques for Software Development.
Techniques for Increasing Security.
Risks in Risk Analysis.
Risks Considered Global(ly).
The Human Element.
Trust in Computer-Related Systems and in People.
Computers, Ethics, and the Law.
Mixed Signals on Social Responsibility.
Certification of Computer Professionals.
Where to Place the Blame.
Expect the Unexpected!
Avoidance of Weak Links.
Assessment of the Risks.
Assessment of the Feasibility of Avoiding Risks.
Risks in the Information Infrastructure.
Questions Concerning the NII.
Avoidance of Risks.
Assessment of the Future.
Some books are to be tasted,
others to be swallowed,
and a few to be chewed and digested.
This book is based on a remarkable collection of mishaps and oddities relating to computer technology. It considers what has gone wrong in the past, what is likely to go wrong in the future, and what can be done to minimize the occurrence of further problems. It may provide meat and potatoes to some readers and tasty desserts to others --- and yet may seem almost indigestible to some would-be readers. However, it should be intellectually and technologically thought-provoking to all.
Many of the events described here have been discussed in the on-line computer newsgroup, the Risks Forum, Risks to the Public in the Use of Computers and Related Systems (referred to here simply as RISKS), which I have moderated since its inception in 1985, under the auspices of the Association for Computing Machinery (ACM). Most of these events have been summarized in the quarterly publication of the ACM Special Interest Group on Software Engineering (SIGSOFT), Software Engineering Notes (SEN), which I edited from its beginnings in 1976 through 1993 and to which I continue to contribute the "RISKS" section. Because those sources represent a fascinating archive that is not widely available, I have distilled the more important material and added further discussion and analysis.
Most of the events selected for inclusion relate to roles that computers and communication systems play in our lives. Some events exhibit problems with technology and its application; some events illustrate a wide range of human behavior, such as malice, inadvertent actions, incompetence, ignorance, carelessness, or lack of experience; some events are attributable to causes over which we have little control, such as natural disasters. Some of the events are old; others are recent, although some of the newer ones seem strangely reminiscent of earlier ones. Because such events continue to happen and because they affect us in so many different ways, it is essential that we draw realistic conclusions from this collection --- particularly if the book is to help us avoid future disasters. Indeed, the later chapters focus on the technology itself and discuss what can be done to overcome or control the risks.
I hope that the events described and the conclusions drawn are such that much of the material will be accessible to readers with widely differing backgrounds. I have attempted to find a middle ground for a diverse set of readers, so that the book can be interesting and informative for students and professionals in the computer field, practitioners and technologists in other fields, and people with only a general interest in technology. The book is particularly relevant to students of software engineering, system engineering, and computer science, for whom it could be used as a companion source. It is also valuable for anyone studying reliability, fault tolerance, safety, or security; some introductory material is included for people who have not been exposed to those topics. In addition, the book is appropriate for people who develop or use computer-based applications. Less technically oriented readers may skip some of the details and instead read the book primarily for its anecdotal material. Other readers may wish to pursue the technological aspects more thoroughly, chasing down relevant cited references --- for historical, academic, or professional reasons. The book is relatively self-contained, but includes many references and notes for the reader who wishes to pursue the details further. Some readers may indeed wish to browse, whereas others may find the book to be the tip of an enormous iceberg that demands closer investigation.
In my presentations of the cases, I have attempted to be specific about the causes and actual circumstances wherever specifics were both available and helpful. Inevitably, the exact causes of some of the cases still remain unknown to me. I have also opted to cite actual names, although I realize that certain organizations may be embarrassed by having some of their old dirty laundry hung out yet again. The alternative would have been to make those cases anonymous --- which would have defeated one of the main purposes of the book, namely, to increase reader awareness of the pervasiveness and real-life nature of the problems addressed here.
Challenges for the reader are suggested at the end of each chapter. They include both thought-provoking questions of general interest and exercises that may be of concern primarily to the more technically minded reader. They are intended to offer some opportunities to reflect on the issues raised in the book. At the urging of my EditriX, some specific numbers are given (such as number of cases you are asked to examine, or the number of examples you might generate); however, these numbers should be considered as parameters that can be altered to suit the occasion. Students and professors using this book for a course are invited to invent their own challenges.
Appendix A provides useful background material. Section A.1 gives a table relating Software Engineering Notes (SEN) volume and issue numbers to dates, which are omitted in the text for simplicity. Section A.2 gives information on how to access relevant on-line sources, including RISKS, PRIVACY, and VIRUS-L newsgroups. Section A.3 suggests some selected further readings. The back of the book includes a glossary of acronyms and terms, the notes referred to throughout the text, an extensive bibliography that is still only a beginning, and the index.
Many different organizations could have been used for the book. I chose to present the experiential material according to threats that relate to specific attributes (notably reliability, safety, security, privacy, and well-being in Chapters 2 through 6), and, within those attributes, by types of applications. Chapters 7 and 8 provide broader perspectives from a system viewpoint and from a human viewpoint, respectively. That order reinforces the principal conclusions of the book and exhibits the diversity, perversity, and universality of the problems encountered.
Alternatively, the book could have been organized according to the causes of problems --- for example, the diverse sources of risks summarized in Section 1.2; it could have been organized according to the effects that have been experienced or that can be expected to occur in the future --- such as those summarized in Section 1.3; it could have been organized according to the types of defensive measures necessary to combat the problems inherent in those causes and effects --- such as the diverse types of defensive measures summarized in Section 1.4. Evidently, no one order is best suited to all readers. However, I have tried to help each reader to find his or her own path through the book, and have provided different viewpoints and cross-references.
The book may be read from cover to cover, which is intended to be a natural order of presentation. However, a linear order may not be suitable for everyone. A reader with selective interests may wish to read the introductory material of Chapter 1, to choose among those sections of greatest interest in Chapters 2 through 6, and then to read the final three chapters. A reader not particularly interested in the technological details of how the risks might be avoided or reduced can skip Chapter 7.
Certain cases recur in different contexts, and are interesting precisely because they illustrate multiple concepts. For example, a particular case might appear in the context of its application (such as communications or space), its types of problems (distributed systems, human interfaces), its requirements (reliability, security), and its implications with respect to software engineering. Certain key details are repeated in a few essential cases so that the reader is not compelled to search for the original mention.
I am deeply indebted to the numerous people who contributed source material to the Risks Forum and helped to make this book possible. My interactions with them and with the newsgroup's countless readers have made the RISKS experience most enjoyable for me. Many contributors are identified in the text. Others are noted in the referenced items from the ACM Software Engineering Notes.
I thank Adele Goldberg, who in 1985 as ACM President named me to be the Chairman of the ACM Committee on Computers and Public Policy and gave me the charter to establish what became the Risks Forum [50]. Peter Denning, Jim Horning, Nancy Leveson, David Parnas, and Jerry Saltzer are the "old reliables" of the RISKS community; they contributed regularly from the very beginning. I am delighted to be able to include the "CACM Inside Risks" guest columns written by Bob Charette (Section 7.10), Robert Dorsett (Section 2.4.1), Don Norman (Section 6.6), Ronni Rosenberg (Section 8.4), Marc Rotenberg (Section 6.1), and Barbara Simons (Section 9.7). I thank Jack Garman and Eric Rosen for the incisive articles they contributed to Software Engineering Notes, discussing the first shuttle launch problem [47] and the 1980 ARPAnet collapse [139], respectively. I also thank Matt Jaffe for his extemporaneous discussion on the Aegis system in response to my lecture at the Fifth International Workshop on Software Specification and Design in 1989. (My summary of his talk appears in [58].)
I would like to express my appreciation to John Markoff of The New York Times. Our interactions began long before the Wily Hackers [162, 163] and the Internet Worm [35, 57, 138, 150, 159]. John has been a media leader in the effort to increase public awareness with respect to many of the concepts discussed in RISKS.
I am grateful to many people for having helped me in the quest to explore the risks involved in the design and implementation of computer systems --- especially my 1960s colleagues from the Multics effort, F.J. Corbató, Bob Daley, Jerry Saltzer, and the late E.L. (Ted) Glaser at MIT; and Vic Vyssotsky, Doug McIlroy, Bob Morris, Ken Thompson, Ed David, and the late Joe Ossanna at Bell Laboratories. My interactions over the years with Tony Oettinger, Dave Huffman, and Edsger W. Dijkstra have provided great intellectual stimulation. Mae Churchill encouraged me to explore the issues in electronic voting. Henry Petroski enriched my perspective on the nature of the problems discussed here. Two of Jerry Mander's books were particularly reinforcing [87, 88].
Special thanks go to Don Nielson and Mark Moriconi for their support at SRI International (formerly Stanford Research Institute). The on-line Risks Forum has been primarily a pro bono effort on my part, but SRI has contributed valuable resources --- including the Internet archive facility. I also thank Jack Goldberg, who invited me to join SRI's Computer Science Laboratory (CSL) in 1971 and encouraged my pursuits of reliability and security issues in a socially conscious context. Among others in CSL, Teresa Lunt and John Rushby have been particularly thoughtful colleagues. Liz Luntzel provided cheerful assistance throughout. Donn Parker and Bruce Baker provided opportunities for inputs and outputs through their International Information Integrity Institute (I-4) and as part of SRI's Business and Policy Group.
Maestro Herbert Blomstedt has greatly enriched my life through his music and teaching over the past 10 years. My Tai Chi teachers Martin and Emily Lee [77] contributed subliminally to the writing of this book, which in a Taoist way seems to have written itself.
I thank Lyn Dupré, my high-tech EditriX, for her X-acting X-pertise (despite her predilection for the "staffed space program" and "fisherpersons" --- which I carefully eschewed, in Sections 2.2.1 and 2.6, respectively); the high-TeX Marsha Finley (who claims she did only the dog work in burying the bones of my LaTeX, but whose bark and bite were both terrific); Paul Anagnostopoulos, who transmogrified the LaTeX into ZzTeX; Peter Gordon of Addison-Wesley for his patient goading; Helen Goldstein of Addison-Wesley for her wonderful encouragement and help; and Helen Wythe, who oversaw the production of the book for Addison-Wesley. I am indebted to the anonymous reviewers, who made many useful suggestions --- although some of their diverse recommendations were mutually incompatible, further illustrating the difficulties in trying to satisfy a heterogeneous audience within a single book.
I am pleased to acknowledge two marvelous examples of nonproprietary software: Richard Stallman's GNU Emacs and Les Lamport's LaTeX, both of which were used extensively in the preparation of the text.
I would be happy to hear from readers who have corrections, additions, new sagas, or other contributions that might enhance the accuracy and completeness of this book in any future revisions.
I thank you all for being part of my extended family, the RISKS community.
Peter G. Neumann
Much has happened since this book originally went to press. There have been many new instances of the problems documented here, but relatively few new types of problems. In some cases, the technology has progressed a little -- although in those cases the threats, vulnerabilities, risks, and expectations of system capabilities have also escalated. On the other hand, social, economic, and political considerations have not resulted in any noticeable lessening of the risks. Basically, all of the conclusions of the book seem to be just as relevant now -- if not more so.
The archives of the Risks Forum have been growing dramatically. Because recent events are always a moving target, a significant body of new material has been assembled and made available on-line, rather than trying to keep the printed form of the book up-to-date in terms of those recent events. An on-line summary of events since the first printing of this book is updated periodically, and is available at http://www.awl.com/cseng/titles/0-201-55805-X/ --- along with pointers to substantial new material that might otherwise go into a revised edition of this book that would be much longer.
Page 193, lines 4-7. The "LaserWriter" (Apple product) saga is in error. The parenthetical should apply to "Laserwriter" not "Laserjet" as it was in the first three printings. It should read as follows:
"Scott Siege noted that the spelling checker for a Unix system failed to recognize the word Unix, and that a checker on the Apple Macintosh suggested that Laserwriter (which is Apple's printer) be changed to a competitor's Laserjet!"
Page 288, relating to Enos, the space monkey. Enos in Greek is WINE (oivos, oi=e and v=n), not Man. George Roussos noted that this breaks the macho monkey joke, but goes well with my 'expect the unexpected' motto, in the sense that a computer-related risks book bug is due to an unchecked Greek literature reference. [Efcharisto', George!] An appropriate correction to the book would be simply to change "man" to "wine". (Alternatively, we could remove the end of the parenthetical, "which is Greek for man" [anthropos], as it is now gratuitous.) I am inclined just to change "man" to "wine".
Page 314, the table can be updated as follows:
Page 314, The correct ftp address is ftp://ftp.sri.com [188.8.131.52], rather than unix.sri.com. The last sentence "RISKS is also available through the Wide Area Information Server" should be replaced with the following text.
A searchable archive site is maintained by Lindsay Marshall at http://catless.ncl.ac.uk/Risks/ .
Page 314, Please remove the entire paragraph on RISKS by FAX. RISKS is no longer available by fax.
Page 315, VIRUS-L has moved again. In the paragraph on VIRUS-L, the rest of the paragraph beginning with "A FAQ (Frequently Asked Questions) ..." should be replaced with this:
A FAQ (Frequently Asked Questions) document and pointers to all of the back issues are available at http://www.faqs.org/faqs/computer-virus/faq, along with instructions for subscribing and contributing.