Preventing and Detecting Bugs
In light of inevitable failure, it seems appropriate to discuss the various ways to keep bugs out of the software ecosystem so as to minimize failures and produce the best software possible. In fact, there are two major categories of such techniques: bug prevention and bug detection.
Bug-prevention techniques are generally developer-oriented and consist of things such as writing better specs, performing code reviews, running static analysis tools, and performing unit testing (which is often automated). All of these prevention techniques suffer from some fundamental problems that limit their efficacy, as discussed in the following subsections.
The "Developer Makes the Worst Tester" Problem
The idea that developers can find bugs in their own code is suspect. If they are good at finding bugs, shouldn't they have known not to write the bugs in the first place? Developers have blind spots because they approach development from a perspective of building the application. This is why most organizations that care about good software hire a second set of eyes to test it. There's simply nothing like a fresh perspective free from development bias to detect defects. And there is no replacement for the tester attitude of how can I break this to complement the developer attitude of how can I build this.
This is not to say that developers should do no testing at all. Test-driven development or TDD is clearly a task meant as a development exercise, and I am a big believer in unit testing done by the original developer. There are any number of formatting, data-validity, and exception conditions that need to be caught while the software is still in development. For the reasons stated previously, however, a second set of eyes is needed for more subtle problems that otherwise might wait for a user to stumble across.
The "Software at Rest" Problem
Any technique such as code reviews or static analysis that doesn't require the software to actually run, necessarily analyzes the software at rest. In general, this means techniques based on analyzing the source code, object code, or the contents of the compiled binary files or assemblies. Unfortunately, many bugs don't surface until the software is running in a real operational environment. This is true of most of the bugs shown previously in Chapter 1, "The Case for Quality Software": Unless you run the software and provide it with real input, those bugs will remain hidden.
The "No Data" Problem
Software needs input and data to execute its myriad code paths. Which code paths actually get executed depends on the inputs applied, the software's internal state (the values stored in internal data structures and variables), and external influences such as databases and data files. It's often the accumulation of data over time that causes software to fail. This simple fact limits the scope of developer testing, which tends to be short in duration.
Perhaps tools and techniques will one day emerge that enable developers to write code without introducing bugs.3 Certainly it is the case that for narrow classes of bugs such as buffer overflows4 developer techniques can and have driven them to near extinction. If this trend continues, the need for a great deal of testing will be negated. But we are a very long way, decades in my mind, from realizing that dream. Until then, we need a second set of eyes, running the software in an environment similar to real usage and using data that is as rich as real user data.
Who provides this second set of eyes? Software testers provide this service, using techniques to detect bugs and then skillfully reporting them so that they get fixed. This is a dynamic process of executing the software in varying environments, with realistic data and with as much input variation as can be managed in the short cycles in which testing occurs. Such is the domain of the software tester.
Testers generally use two forms of dynamic analysis: automated testing (writing test code to test an application) and manual testing (using shipping user interfaces to apply inputs manually).
Automated testing carries both stigma and respect.
The stigma comes from the fact that tests are code, and writing tests means that the tester is necessarily also a developer. Can a developer really be a good tester? Many can, many cannot, but the fact that bugs in test automation are a regular occurrence means that they will spend significant time writing code, debugging it, and rewriting it. Once testing becomes a development project, one must wonder how much time testers are spending thinking about testing the software as opposed to maintaining the test automation. It's not hard to imagine a bias toward the latter.
The respect comes from the fact that automation is cool. One can write a single program that will execute an unlimited number of tests and find tons of bugs. Automated tests can be run and then rerun when the application code has been churned or whenever a regression test is required. Wonderful! Outstanding! How we must worship this automation! If testers are judged based on the number of tests they run, automation will win every time. If they are based on the quality of tests they run, it's a different matter altogether.
The kicker is that we've been automating for years, decades even, and we still produce software that readily falls down when it gets on the desktop of a real user. Why? Because automation suffers from many of the same problems that other forms of developer testing suffers from: It's run in a laboratory environment, not a real user environment, and we seldom risk automation working with real customer databases because automation is generally not very reliable. (It is software after all.) Imagine automation that adds and deletes records of a database—what customers in their right mind would allow that automation anywhere near their real databases? And there is one Achilles heel of automated testing that no one has ever solved: the oracle problem.
The oracle problem is a nice name for one of the biggest challenges in testing: How do we know that the software did what it was supposed to do when we ran a given test case? Did it produce the right output? Did it do so without unwanted side effects? How can we be sure? Is there an oracle we can consult that will tell us—given a user environment, data configuration, and input sequence—that the software performed exactly as it was designed to do? Given the reality of imperfect (or nonexistent) specs, this just is not a reality for modern software testers.
Without an oracle, test automation can find only the most egregious of failures: crashes, hangs (maybe), and exceptions. And the fact that automation is itself software often means that the crash is in the test case and not in the software! Subtle/complex failures are missed. One need look no further than the previous chapter to see that many such crucial failures readily slip into released code. Automation is important, but it is not enough, and an overreliance on it can endanger the success of a product in the field.
So where does that leave the tester? If a tester cannot rely on developer bug prevention or automation, where should she place her hope? The only answer can be in manual testing.