16.2 Attacks and Defenses
16.2.1 Physical Attacks
So far, we’ve discussed how a computing device may or may not keep secrets from an adversary with physical access. We now discuss some ways an adversary may use physical access to mount an attack. To start with, we might consider the security perimeter: what the designers regarded as the boundary between the internal trusted part of the system and the external part under the potential control of the adversary.
Perhaps the first model to consider is the single trusted chip. The designer/deployer wants to trust the internal operation of the chip, but the adversary controls the outside. Over the years, this model has received perhaps the most attention—in the public literature, anyway—owing to the long and widespread use of low-cost chip cards—often considered synonymous with smart cards—in commercial applications, such as controlling the ability to make telephone calls or to view licensed satellite TV. The ubiquity creates a large community of adversaries; the applications give them motivation; and the cost makes experimentation feasible.
The work of Anderson and Kuhn provides many nice examples of attack techniques on such single-chip devices [AK96, AK97]. Perhaps the most straightforward family of attacks are the many variations of "open up the device and play with it." Various low-cost lab techniques can enable the adversary to open up the chip and start probing: reading bits, changing bits, resetting devices back to special factory modes by re-fusing fuses, and so on. Historically, we’ve seen a cycle here.
- The vendor community claims that such attacks are either not possible or are far too sophisticated for all but high-end state-sponsored adversaries.
- The adversary community demonstrates otherwise.
- The vendor community thinks a bit, reengineers its defense technology, and the loop repeats.
It’s anyone’s guess where we will be in this cycle—and whether the loop will keep repeating—when this book is published.
By manipulating the device’s environment, the adversary can also use more devious ways to influence the computation of such devices. For an amusing and effective example of attacks, we refer back to Anderson and Kuhn. The device may execute an internal program that brings it to a conditional branch instruction. Let’s say that the device compares a register to 0 and jumps to a different address if the two are equal. However, in typical chip card applications, the device obtains its power from an outside source. This means that the adversary can deviously manipulate the power, such as by driving it way out of specification. Generally speaking, the CPU will not function correctly under such conditions. If the adversary applies such a carefully timed spike at the moment the device is executing this comparison instruction, the adversary can cause the CPU to always take one direction of the branch—whether or not it’s correct. Finding examples where such an attack lets the adversary subvert the correctness of the system is an exercise for the reader.
In some sense, such environmental attacks are the flip side of side-channel attacks. Rather than exploiting an unexpected communication path coming out of the device, the adversary is exploiting an unexpected communication path going into it. Another example of this family of attack is differential fault analysis (DFA), sometimes also known as the Bellcore attack. Usually framed in the context of a chip card performing a cryptographic operation, this type of attack has the adversary somehow causing a transient hardware error: for example, by bombarding the chip with some kind of radiation and causing a gate to fail. This error then causes the chip to do something other than the correct cryptographic operation. In some situations, the adversary can then derive the chip’s critical secrets from these incorrect results.
Bellcore attacks were originally suggested as a theoretical exercise (e.g., [BDL97]). However, they soon became a practical concern (e.g, [ABF+03]), to the point where countermeasures became a serious concern. How does one design a circuit to carry out a particular cryptographic operation but that also doesn’t yield anything useful to the adversary if a transient error occurs? Some researchers have even begun formally studying this model: how to transform a circuit so that an adversary who can probe and perhaps alter the state of a limited subset of wires still cannot subvert the computation [ISW03]. We touch on these attacks again in Section 16.5.
Multichip modules provide both more avenues for the attacker and more potential for defense.
Getting inside the chassis is the first step. Here we see another cat-and-mouse game, featuring such defenses as one-way bolt heads and microswitches on service doors, and corresponding counterattacks, such as using a pencil eraser as a drill bit or putting superglue on the microswitch after drilling through the door.
An attacker who can get inside the chassis might start monitoring and manipulating the connections on the circuit boards themselves. The attacker might hook logic analyzers or similar tools to the lines or insert an interposer between a memory or processor module and the circuit board, allowing easy monitoring and altering of the signals coming in and out. Other potential attacks misusing debugging hooks include using an in-circuit emulator (ICE) to replace a CPU and using a JTAG port to suspend execution and probe/alter the internal state of a CPU.1
The attacker might also exploit properties of the internal buses, without actually modifying hardware. For one example, the PCI bus includes a busmastering feature that allows a peripheral card to communicate directly with system memory, without bothering the CPU. Intended to support direct memory access (DMA), occasionally a desirably form of I/O, busmastering can also support malicious DMA, through which a malicious PCI card reads and/or writes memory and other system resources illicitly.
When focusing on these subtle ways that an adversary might access secrets by sneakily bypassing the ways a system might have tried to block this access, it’s easy to overlook the even more subtle approach of trying to use the front door instead. The APIs that systems offer through which legitimate users can access data are becoming increasingly complex. A consequence of this complexity can be extra, unintended functionality: ways to put calls together that lead to behavior that should have been disallowed. Bond and Anderson made the first big splash here, finding holes in the API for the Common Cryptographic Architecture (CCA) application that IBM offered for the IBM 4758 platform [BA01]. More recently, Jonathan Herzog has been exploring the use of automated formal methods to discover such flaws systematically [Her06].
16.2.2 Defense Strategies
As with attacks, we might start discussing defenses by considering the trust perimeter: what part of the system the designer cedes to the adversary.
As we observed earlier, attacks and defenses for single-chip modules have been a continual cat-and-mouse game, as vendors and adversaries take turns with innovation. In addition, some new techniques and frameworks are beginning to emerge from academic research laboratories. Researchers have proposed physical one-way functions: using a device’s physical properties to embody functionality that, one hopes, cannot be accessed or reverse engineered any other way. The intention here is that an adversary who tries to use some type of physical attack to extract the functionality will destroy the physical process that generated the functionality in the first place.
In an early manifestation of this concept, researchers embedded reflective elements within a piece of optical-grade epoxy [PRTG02]. When entering this device, a laser beam reflects off the various obstacles and leaves in a rearranged pattern. Thus, the device computes the function that maps the input consisting of the laser angle to the output consisting of the pattern produced by that input. Since the details of the mapping follow randomly from the manufacturing process, we call this a random function: the designer cannot choose what it is, and, one hopes, the adversary cannot predict its output with any accuracy, even after seeing some reasonable number of x, f (x) pairs. (Formalizing and reasoning about what it means for the function to resist reverse engineering by the adversary requires the tools of theoretical computer science—recall Section 7.1 or see the Appendix.)
It’s hard to use these bouncing lasers in a computing system. Fortunately, researchers [GCvD02] subsequently explored silicon physical random functions (SPUF), apparently from the earlier acronym silicon physical unknown functions. The central idea here is that the length of time it takes a signal to move across an internal connector depends on environmental conditions, such as temperature and, one hopes, on random manufacturing variations. If we instead compare the relative speed of two connectors, then we have a random bit that remains constant even across the environmental variations. Researchers then built up more elaborate architectures, starting with this basic foundation.
Outside the Chip
Even if we harden a chip or other module against the adversary, the chip must still interact with other elements in the system. The adversary can observe and perhaps manipulate this interaction and may even control the other elements of the system. A number of defense techniques—many theoretical, so far—may apply here. However, it’s not clear what the right answer is. Figuring out the right balance of security against performance impact has been an area of ongoing research; many of the current and emerging tools we discuss later in this chapter must wrestle with these design choices.
For example, suppose that the device is a CPU fetching instructions from an external memory. An obvious idea might be to encrypt the instructions and, of course, check their integrity, in order to keep the adversary from learning details of the computation. Although perhaps natural, this idea has several drawbacks. One is figuring out key management: Who has the right to encrypt the instructions in the first place? Another drawback is that the adversary still sees a detailed trace of instruction fetches, with only the opcodes obfuscated. However, there’s nothing like the real thing—the most damning indictment of this technique is the way Anderson and Kuhn broke it on a real device that tried it [AK96].
We might go beyond this basic idea and think about using external devices as memory, which makes sense, since that’s where the RAM and ROM will likely be. What can the adversary do to us? An obvious attack is spying on the memory contents; encryption can protect against this, although one must take care with using initialization vectors (IVs) or clever key management to prevent the same plaintext from going to the same ciphertext—or the same initial blocks from going to the same initial blocks. (Note, however, that straightforward use of an IV will cause the ciphertext to be one block larger than the plaintext, which might lead to considerable overhead if we’re encrypting on the granularity of a memory word.)
Beyond this, two more subtle categories of attacks emerge:
Learning access patterns. The adversary who can see the buses or the memory devices can see what the trusted chip is touching when. One potential countermeasure here lies in aggregation: If it has sufficient internal storage, the chip can implement virtual memory and cryptopage to the external memory, treated as a backing store [Yee94].
The world of crypto and theory give us a more thorough and expensive technique: oblivious RAM (ORAM) [GO96]. In a basic version, the trusted device knows a permutation p of addresses. When it wants to touch location i 1, the device issues the address p(i 1) instead. If it only ever touches one address, then this suffices to hide the access pattern from the adversary. If it needs to then touch an i 2, then the device issues p(i 1) and then p(i 2)—unless, of course, i 2 = i 1, in which case the device makes up a random and issues p( ) instead. The adversary knows that two addresses were touched but doesn’t know which two they were or even whether they were distinct. To generalize this technique, the device must generate an encrypted shuffle of the external memory; the kth fetch since the last shuffle requires touching k memory addresses. (One might wonder whether we could turn around and use the same technique on the k fetches—in fact, Goldreich and Ostrevsky came up with an approach that asymptotically costs O(log4 n) per access.)
Freshness of contents. Earlier, we mentioned the obvious attack of spying on the stored memory and the obvious countermeasure of encrypting it. However, the adversary might also change memory, even if it’s encrypted. An effective countermeasure here is less obvious. Naturally, one might think of using a standard cryptographic integrity-checking technique, such as hashes or MACs, although doing so incurs even more memory overhead. However, if the device is using the external memory for both writing and reading, then we have a problem. If we use a standard MAC on the stored data, then we can replace the MAC with a new value when we rewrite the memory. But then nothing stops the adversary from simply replacing our new value-MAC pair with an older one! We could stop this attack by storing some per location data, such as the MAC, inside the trusted device, but then that defeats the purpose of using external memory in the first place.
Two techniques from the crypto toolkit can help here. One is the use of Merkle trees (recall Section 7.6 and Figure 7.19). Rather than storing a per location hash inside the trusted device, we build a Merkle tree on the hashes of a large set of locations and store only the root inside the device. This approach saves internal memory but at the cost of increased calculation for each integrity/freshness check. Another idea is to use incremental multiset hashing, a newer crypto idea, whereby the device calculates a hash of the contents of memory—"multiset"—but can do so in an incremental fashion. (Srini Devadas’ group at MIT came up with these ideas—for example, see [CDvD+03, SCG+03].)
The preceding approaches considered how the trusted device might use the rest of the system during its computation. We might also consider the other direction: how the rest of the system might use the trusted device. A general approach that emerged from secure coprocessing research is program partitioning: sheltering inside the trusted device some hard to reverse engineer core of the program but running the rest of the program on the external system. Doing this systematically, for general programs, in a way that accommodates the usually limited power and size of the trusted device, while also preserving overall system performance, while also being secure, appears to be an open problem.
However, researchers have made progress by sacrificing some of these goals. For example, theoreticians have long considered the problem of secure function evaluation (SFE), also known as secure multiparty computation. Alice and Bob would like to evaluate a function f which they both know, on the input (x A , x B ), which they each know (respectively), but don’t want to share. In 1986, Yao published an algorithm to do this—an inefficient algorithm, to be sure, but one that works [Yao86]. 2004 brought an implementation—still inefficient, but we’re making progress [MNPS04].
The economic game of enforcing site licenses on software also used to manifest a version of this program-partitioning problem. Software vendors occasionally provide a dongle—a small device trusted by the vendor—along with the program. The program runs on the user’s larger machine but periodically interacts with the dongle. In theory, absence of the dongle causes the program to stop running. Many software vendors are moving toward electronic methods and are abandoning the hardware dongle approach. For example, many modern PC games require an original copy of the game CD to be inserted into the machine in order to play the game; a copied CD generally will not work.
Building a module larger than a single chip gives the designer more opportunity to consider hardware security, as a system. For example, a larger package lets one more easily use internal power sources, environmental sensing, more robust filtering on the power the device demands from external sources, and so on.
However, colleagues who work in building "tamper-proof hardware" will quickly assert that there is no such thing as "tamper-proof hardware." Instead, they advocate looking at a systems approach interleaving several concepts:
- Tamper resistance. It should be hard to penetrate the module.
- Tamper evidence. Penetration attempts should leave some visible signal.
- Tamper detection. The device itself should notice penetration attempts.
- Tamper response. The device itself should be able to take appropriate countermeasures when penetration is detected.
Integrating these concepts into a broader system requires considering many tradeoffs and design issues. Tamper evidence makes sense only if the deployment scenario allows for a trustworthy party to actually observe this evidence. Tamper resistance can work in conjunction with tamper detection—the stronger the force required to break into the module, the more likely it might be to trigger detection mechanisms. Tamper response may require consideration of the data remanance issues discussed earlier. What should happen when the adversary breaks in? Can we erase the sensitive data before the adversary can reach it? These questions can in turn lead to consideration of protocol issues—for example, if only a small amount of SRAM can be zeroized on attack, then system software and key management may need to keep larger sensitive items encrypted in FLASH and to be sure that the sensitive SRAM is regularly inverted. The choice of tamper-response technology can also lead to new tamper-detection requirements, since the tamper-response methods may require that the device environment remain inside some operating envelope for the methods to work.
Recently, a new aspect of tamper protection has entered the research agenda. U.S. government agencies have been expressing concern about whether the chips and devices they use in sensitive systems have themselves been tampered with somehow—for example, an adversary who infiltrated the design and build process for a memory chip might have included (in hardware) a Trojan horse that attacks its contents when a prespecified signal arrives. We can find ourselves running into contradictions here—to protect against this type of attack, we might need to be able to probe inside the device, which violates the other type of tamper protection. (Some recent research here tries to use the techniques of side-channel analysis—typically used to attack systems—in order to discover the presence of hardware-based Trojan horses; the idea is that even a passive Trojan will still influence such things as power consumption. [ABK+07].)
So far in this section, we’ve discussed techniques that various types of trusted hardware might use to help defend themselves and the computation in which they’re participating against attack by an adversary. However, we might also consider what software alone might do against tamper. The toolkit offers a couple of interesting families of techniques.
- Software tamper-resistance (e.g., [Auc96]) techniques try to ensure that a program stops working correctly if the adversary tampers with critical pieces—for example, the adversary might try to run the program without a proper license. Effective use of dongles often requires some notions of software tamper resistance. As noted earlier, if the program simply checks for the dongle’s presence and then jumps to the program start, then the adversary might simply bypass this check—so the tamper response needs to be more subtle. Related to this topic are techniques to produce binary code that is difficult to disassemble.
- Software-based attestation techniques (e.g., [SLS+05, SPvDK04]) try to assure an external relying party that a piece of software is running on a particular platform in a trustworthy way. The basic idea is that the relying party knows full operational details of the target system and crafts a checksum program that requires using all the resources of the system in order to produce a timely but correct response; a trusted path between the relying party and the target system is usually assumed. These techniques are still early but promising.