Why We Need More Description and Less Prescription
Cargo Cult Science
In a 1974 commencement address entitled Cargo Cult Science at Caltech, the physicist Richard Feynman railed against pretend science. In particular, Feynman was worried about work that seems scientific but is lacking in "a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty." Computer security has many cargo cult characteristics.
For readers who don't know what a cargo cult is, here is a quick introduction. During World War II, planes landed for the first time on far-flung Pacific islands inhabited by indigenous populations who had little or no interaction with the outside world. The planes were viewed as miraculous things associated with the "ancestors," imbued with metaphysical powers, and able to bring in mind-boggling modern treasures such as T-shirts, canned food, tents, fuel, and weapons. To the islanders, these items were tantalizing; but the war moved on, and the planes stopped landing.
The islanders, eager to enable the planes to return with their incredible stuff, developed quasi-religious theories about what made the planes come in the first place. They built simulated airstrips. They lit fires around the "airstrips." They even carved radio headphones out of wood and put up bamboo antennas. These were the Cargo Cults. As Wikipedia reports, "The cults are focused on obtaining the material wealth (the 'cargo') of the advanced culture through magical thinking and religious rituals and practices, believing that the wealth was intended for them by their deities and ancestors." The planes never came back.
Figure 1 A picture of an "airplane" and a "control tower" created by a Cargo Cult.
Feynman began his 1974 address mostly worried about combating UFOs, extrasensory perception, witch doctors and the like, but zeroed in on a more subtle and important issue — a skeptical stance. Science, Feynman argues, should "bend over backwards" and try to get to the bottom of explaining a phenomenon by thinking carefully about all possible causes that can explain results (not just the most convenient or most popular theory). "In summary," Feynman says, "the idea is to try to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one direction or another."
Computer Security, and especially my own subfield of software security, suffers from a Cargo Cult mentality that may have roots in its relative youth as a field. When a field is just getting started, advocacy and evangelism are the order of the day. "Because I said so," is often seen as a convincing argument, especially if uttered by a self-proclaimed expert. This is a problem as the field progresses.
Of course the problem exists in other aspects of Computer Security as well. Sometimes it seems that security has more to do with clever marketing and moving product than with actually making things better. As a modicum of evidence, consider the basic sniff/grep/react technologies that are continually renamed and repackaged as trends come and go (e.g., logging, firewalls, exfiltration protection, GRC, SIM/SIEM, and WAFs). The clearest sign of our overall immaturity in computer security is that the "market" is more important than the problem or the solution.
If we want to combat this phenomenon (and I do), then the time has come to turn to science. But not Cargo Cult Science — real science.
Monkeys Eat Bananas
The BSIMM is a descriptive model of software security. The idea behind the BSIMM is fairly novel for computer science: gather data first, and then build a model that explains the data. Continue the data gathering and refine the model. Share your results with others and look for facts that disprove the model. (The usual approach in computer science is to build a model first and only then to gather various dribs and drabs of data, if any at all, that seem to justify the model. The problem here is that any model, no matter how ridiculous, will always find a proponent or two, providing "evidence" that the model somehow simulates a useful reality.)
Models are important. They allow experimentation and prediction when reality isn't readily available. Gathering data and refining a model makes it more representative of reality and more useful to others. There is a fundamental difference between an a posteriori descriptive model accurately reflecting a growing slice of reality and an a priori prescriptive model in search of proponents.
A descriptive model purports only to observe and report. My co-authors and I joke around that we wandered off into the jungle to see what we could see and discovered that "monkeys eat bananas." Notice that in the BSIMM we don't report "you should only eat yellow bananas," "do not run while eating a banana," "thou shalt not steal thy neighbors' bananas," or any other value-judgment statements. Simple observations, simply reported. In our work, counts are kept for each activity, just in case we happened upon the one banana-eating monkey in the entire jungle.
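The bookkeeping behind keeping counts for each activity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the firm names and activity labels are invented, and tally_activities is a hypothetical helper, not part of the BSIMM or its data.

```python
from collections import Counter

# Hypothetical observations: each firm maps to the set of activities
# observed there. Names and labels are invented for illustration only.
observations = {
    "firm_a": {"code review", "security training", "penetration testing"},
    "firm_b": {"code review", "penetration testing"},
    "firm_c": {"code review", "security middleware"},
}


def tally_activities(observations):
    """Count how many firms were observed performing each activity."""
    counts = Counter()
    for activities in observations.values():
        # Each firm contributes at most one observation per activity,
        # because its activities are stored as a set.
        counts.update(activities)
    return counts


counts = tally_activities(observations)

# Report most frequently observed activities first, ties in name order.
for activity, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{activity}: observed in {n} of {len(observations)} firms")
```

An activity with a count of one is the lone banana-eating monkey: it was seen, so it gets reported, but the count tells the reader how much weight the observation can bear.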
Prescriptive models purport to tell you what you should do. Promulgators of such models say things more like, "the model is chocked full of value judgments [sic] about what organizations SHOULD be doing." That's just dandy, as long as any prescriptive model only became prescriptive over time based on sufficient observation and testing.
The problem is we have way too many prescriptive models in Computer Security that are not backed up by any data at all. Pseudoscience. Perhaps we should focus our field for the time being on gathering lots of descriptive data and making sure that any prescriptive theories we come up with cohere with actual observables.
What the Data Say
I'll tip my hat to the new constitution
Take a bow for the new revolution
Smile and grin at the change all around me
Pick up my guitar and play
Just like yesterday
And I'll get on my knees and pray
We don't get fooled again
Don't get fooled again
Of course descriptive data can be misused and abused, with wild extrapolation probably being the biggest transgression. Feynman clearly understands this when he says, "The first principle is that you must not fool yourself, and you are the easiest person to fool." An important aspect of avoiding Cargo Cult Science is to make sure that you don't only refer to data when they are convenient or when they justify your foregone conclusion. Otherwise the planes don't land.
Two examples from the BSIMM work may help. The first is obvious, and is the subject of last month's column, You Really Need a Software Security Group. There are a number of lone wolf consultants and software security soothsayers who continue to claim that an SSG is unnecessary for software security, even in the face of overwhelming data. Such ideas would surely be welcome if they were backed by anything other than strident proclamations. Disagreement is healthy — it plays an essential role in scientific progress. But disagreement without data leads to absurd ideas about the existence of global warming.
A more subtle example can be found in overly generous claims about OWASP ESAPI by its proponents. The idea of building middleware for developers to use in their work is certainly a good one (this is one of ten activities observed in the "Security Features and Design" BSIMM practice). Furthermore, data from the BSIMM show that 18 of 30 firms in the current dataset make use of various kinds of middleware frameworks for security controls (not OWASP ESAPI, mind you, but something similar). But claiming that the use of an enterprise security API will single-handedly solve the software security problem is just plain silly and flies in the face of the BSIMM's other 109 activities.
Compared to What?
Another distinct advantage that descriptive models have over prescriptive models is the ability to compare current observations with past observations. In the case of the BSIMM, the idea is to compare observations about activities observed in a target company to observations made over groups of other firms represented in the data. For example, we can compare a major credit card supplier to eleven other financial services firms directly. Or we can compare an ISV to the multiple ISVs already represented in the data set.
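A comparison of this kind can be sketched as follows. It is a minimal illustration rather than the BSIMM's actual method: the peer firms, the activity labels, and the compare_to_peers function are all hypothetical.

```python
def compare_to_peers(target, peers):
    """Compare one firm's observed activities with a peer group.

    target: set of activities observed at the target firm.
    peers:  dict mapping peer name -> set of activities observed there.
    Returns {activity: (observed_at_target, share_of_peers_observed)}.
    """
    if not peers:
        raise ValueError("at least one peer firm is required")
    all_activities = set(target).union(*peers.values())
    report = {}
    for activity in sorted(all_activities):
        peer_count = sum(activity in acts for acts in peers.values())
        report[activity] = (activity in target, peer_count / len(peers))
    return report


# Hypothetical data, invented for illustration only.
target = {"code review", "security training"}
peers = {
    "peer_1": {"code review", "penetration testing"},
    "peer_2": {"code review"},
    "peer_3": {"penetration testing", "security training"},
    "peer_4": {"code review", "penetration testing"},
}

report = compare_to_peers(target, peers)
for activity, (seen, share) in report.items():
    status = "observed" if seen else "not observed"
    print(f"{activity}: {status} at target, seen at {share:.0%} of peers")
```

Note that the output stays descriptive: it says what was observed at the target and how common each activity is among peers, and leaves the decision about what to do with that information to the firm.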
This is an extremely powerful technique. Data that suggest how the software security activities observed in your firm compare to activities observed in similar firms are a much more useful guide for reasonable strategy than the studied opinion of a witch doctor. Just for the record, I was one of the first software security witch doctors, and along with my co-authors I am proud to be in the first wave of software security practitioners to leave that era behind.
Also worthy of mention in this section is the "one size fits all" problem that many prescriptive models suffer from. The fact is, nobody knows your organizational culture like you do. A descriptive comparison allows you to gather descriptive data and adapt good ideas from others while taking your culture into account.
Last Word to Feynman
As I have said before, the time has come to put away the bug parade boogeyman, the top 25 tea leaves, black box web app goat sacrifice, and the occult reading of pen testing entrails. It's science time. And the more descriptive and data-driven we are, the better.
Feynman said it best in 1974:
We've learned from experience that the truth will come out. Other experimenters will repeat your experiment and find out whether you were wrong or right. Nature's phenomena will agree or they'll disagree with your theory. And, although you may gain some temporary fame and excitement, you will not gain a good reputation as a scientist if you haven't tried to be very careful in this kind of work. And it's this type of integrity, this kind of care not to fool yourself, that is missing to a large extent in much of the research in cargo cult science.