Secure Coding in C and C++: An Interview with Robert Seacord
Danny Kalev: Eight years after the first edition, which new techniques and insights does the second edition of Secure Coding in C and C++ reveal? More generally, what has changed in the threat maps of C and C++ programming since 2005?
Robert Seacord: One of the big changes in the C and C++ languages has been the support for multiple threads of execution, including improved memory models and atomic objects and operations. In the long term, this is a good thing, but in the short term it brings concurrency class vulnerabilities into both languages. To deal with this specific issue, we have added a chapter on concurrency to the second edition. There have also been significant security improvements, including the addition of support for bounds-checking interfaces (originally specified in ISO/IEC TR 24731-1:2007) that caused us to significantly revise the chapter on strings. We’ve also taken pains to align the second edition more carefully with the C and C++ standards and to explain the effects of undefined behaviors in these languages on your code. To make the book more practical, we’ve eliminated research mitigation strategies in favor of existing solutions.
Danny: Speaking of the new C and C++ standards, are there new features that you consider inherently risky or more dangerous than programmers realize? For example, lambdas or perhaps move semantics that let programmers bind temporaries and literal values to rvalue references? Or maybe C99’s variable length arrays (which became optional in C11)?
Robert: Any new language feature with undefined behaviors is probably more dangerous than most programmers realize. Variable length arrays (VLAs) are essentially the same as traditional C arrays except that they are declared with a size that is not a constant integer expression. Consequently, the integer expression size and the declaration of a VLA are both evaluated at runtime. If the size argument supplied to a VLA is not a positive integer value, the behavior is undefined. Additionally, if the magnitude of the argument is excessive, the program may behave in an unexpected way. An attacker may be able to leverage this behavior to execute arbitrary code. The programmer must ensure that size arguments to VLAs, especially those derived from untrusted data, are in a valid range.
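As a rough illustration of the range check described above, here is a minimal sketch in C. The bound MAX_TABLE_ELEMS and the helpers vla_size_ok() and sum_bytes() are hypothetical names invented for this example, and the code assumes a compiler that supports VLAs (that is, one that does not define __STDC_NO_VLA__).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound chosen for illustration; a real program should
   pick a limit that fits its own stack budget. */
#define MAX_TABLE_ELEMS 1024u

/* Returns 1 if n is an acceptable element count for a VLA, 0 otherwise. */
int vla_size_ok(size_t n)
{
    return n > 0 && n <= MAX_TABLE_ELEMS;
}

/* Sums the first n bytes of src by way of a scratch VLA.
   Returns -1 if src is null or n is out of range, so the VLA is never
   declared with a zero or excessive size. */
long sum_bytes(const unsigned char *src, size_t n)
{
    if (src == NULL || !vla_size_ok(n)) {
        return -1;
    }

    unsigned char scratch[n];   /* VLA: the size is evaluated at runtime */
    memcpy(scratch, src, n);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += scratch[i];
    }
    return total;
}
```

The point of the sketch is the ordering: the untrusted size is validated before control reaches the VLA declaration, not after.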
Danny: In your experience, is there a noticeable difference between programming in C and C++ versus programming in other programming languages as far as security is concerned? Are there languages that are inherently more secure, or will attackers simply find and exploit loopholes in every programming language, no matter what?
Robert: There is a noticeable difference between programming in C and C++ and programming in other languages, such as Java. The difference is not so much that one language is more secure than the other but that the attack surface is different. Buffer overflows are still a major problem in C and C++ but not a problem in Java. On the other hand, Java allows code from different code sources to run in the same virtual machine, greatly expanding the attack surface of the language and leading to multiple, recent, well-publicized vulnerabilities that can allow a remote, unauthenticated attacker to execute arbitrary code on a vulnerable system. The biggest mistake you can make is still the false assumption that your program is immune to vulnerabilities because of your choice of programming language.
Danny: In a similar vein, are there platforms that are more vulnerable to security risks than others? Are there platforms that you would consider bulletproof or at least close to that?
Robert: There have been many comparisons of the relative security of the Windows and Linux platforms but little scientific evidence to suggest one platform is more or less secure than the other. Operating systems such as OpenBSD emphasize security, but no platform should be viewed as bulletproof. Platform security tends to improve with time, so modern OS platforms that feature exploit mitigations such as address space layout randomization (ASLR) or implement a W^X policy such as data execution prevention (DEP) on Windows are preferable to older platforms that do not provide similar runtime protections. Overall security is a function of both vulnerability and threat. From this perspective, ubiquitous systems come under attack more frequently and might consequently be considered less secure.
Danny: Let’s focus on the new secure C11 library functions, such as strcpy_s and fopen_s. How important is replacing the traditional library functions with their secure counterparts? Which security aspects are not covered by the new library functions?
Robert: The bounds-checked interfaces defined in Annex K of the new C Standard are significantly improved over the functions they are designed to replace, meaning that developers using these interfaces are less likely to inadvertently code a buffer overflow. The strcpy_s() function is a close replacement for strcpy() but takes an additional parameter, the size of the destination array, to prevent buffer overflow. There are vulnerabilities that are not fully addressed by the APIs. The fopen_s() function opens files with exclusive (nonshared) access, but only if supported by the underlying operating system. This can leave a system open to vulnerabilities resulting from attackers exploiting race conditions when the file is accessed.
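To make the extra size parameter concrete, the following sketch wraps strcpy_s() where the library provides Annex K and otherwise falls back to a hand-written bounded copy. The wrapper name copy_string() and the fallback logic are assumptions made for this illustration; they are not part of the standard, and many C libraries do not ship Annex K at all.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* request Annex K declarations, if any */
#include <stddef.h>
#include <string.h>

/* Copies src into dst, which holds dst_size bytes.
   Returns 0 on success and nonzero if src plus its terminator does not fit.
   On failure dst is set to the empty string when that is possible. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
#ifdef __STDC_LIB_EXT1__
    /* Annex K is available: strcpy_s() is told how large dst is. */
    return strcpy_s(dst, dst_size, src) != 0;
#else
    /* Fallback for libraries without Annex K. This branch is an assumption
       of the sketch, not the standard's implementation. */
    if (dst == NULL || dst_size == 0) {
        return 1;
    }
    if (src == NULL) {
        dst[0] = '\0';
        return 1;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return 1;
    }
    memcpy(dst, src, len + 1);   /* len + 1 includes the terminating null */
    return 0;
#endif
}
```

Where Annex K is present, a violation is first reported to the currently installed runtime-constraint handler, whose default behavior is implementation-defined, so the nonzero return is only observed if that handler returns.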
Danny: Most programmers are aware of the risks of buffer overflows. However, there are other techniques that attackers exploit, such as stack smashing and return-oriented programming attacks. Can you briefly explain what those two are and under which circumstances they might occur?
Robert: Stack smashing occurs when a buffer overflow in an automatic array on the stack overwrites other information on the stack, such as the return address and base pointer of the returning function. If the return address is overwritten by an attacker with the address of malicious code inserted in the program, the attacker can run arbitrary code with the permissions of the vulnerable process. Return-oriented programming is the latest wrinkle. Rather than returning to code that the attacker inserts in the stack or data segment, the attacker returns to small code snippets that already exist in the code segment. Any number of these code snippets can be called sequentially, basically providing a return-oriented programming language that an attacker can use to write malicious programs that execute in the code segment, bypassing mitigation strategies that prevent data execution.
Danny: If you could suggest a few concrete changes to the current standards of either C or C++ that would make these programming languages more secure, what would those be?
Robert: CERT has been active in the C and C++ Standards Committees for many years, and we have made numerous proposals to improve the security of both these languages. Many of our proposals have been adopted; others have not. Some of our adopted proposals include analyzability, static assertions, “no-return” functions, and support for opening files with exclusive mode ('x' as the last character in the mode argument). We also supported the adoption of the bounds-checking interfaces and the removal of the insecure gets() function. In C++, we supported the requirement that an exception thrown from a function marked noexcept should immediately terminate the program.
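Of the adopted proposals listed above, the exclusive file mode is perhaps the easiest to show in code. The sketch below assumes a C11 library whose fopen() honors the 'x' mode character; the helper name create_exclusive() is hypothetical.

```c
#include <stdio.h>

/* Creates path for writing only if it does not already exist.
   Returns 0 on success, -1 if the file exists or cannot be created. */
int create_exclusive(const char *path)
{
    /* C11: a trailing 'x' makes the open fail if the file already exists. */
    FILE *fp = fopen(path, "wx");
    if (fp == NULL) {
        return -1;
    }
    if (fputs("created exclusively\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

Calling create_exclusive() twice with the same path should succeed the first time and fail the second, which is the property that lets a program detect a file planted by someone else before the call.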
Among our C proposals that were not accepted was a proposal to add two functions to the C11 Standard Library to encrypt and decrypt pointers. These functions would be similar to the EncodePointer() and DecodePointer() functions in Microsoft Windows. In many systems and applications, pointers to functions are stored unencrypted in locations addressable by exploit code. An attacker exploiting a vulnerability in a program could potentially overwrite the function pointer and thereby hijack the process when the function is called. The proposal was not adopted by the committee because it was felt that the appropriate solution was for compilers to automatically encrypt and decrypt these pointers without requiring the programmer to invoke these library calls. This analysis is correct; however, whether a compiler implements this feature is now a quality-of-implementation issue, and I am not aware of any that do.
Danny: What is your best advice to C and C++ programmers who want to write more secure code (in addition to reading your book, that is)?
Robert: C and C++ programmers should familiarize themselves with the CERT Secure Coding Standards at www.securecoding.cert.org. They should also acquire the appropriate language standards and reference them frequently. Both the C and C++ standards are available from the ANSI eStandards Store at webstore.ansi.org for a reasonable price. After that, I suggest programmers start with a subset of the language they have studied and grow this subset carefully. For example, bit-fields have complex, nonintuitive semantics; you might not want to use this C language feature until you acquire the appropriate expertise.
Danny: What is the biggest misconception about secure coding that you’ve encountered in your long career as a Senior Vulnerability Analyst?
Robert: It is very hard to pick just one. I’ve already covered “just program in Java and you’ll be okay.” “Replace strcpy() with strncpy()” was another good one. Actually, I’ve thought of the winner: programs have vulnerabilities because programmers are stupid and lazy. Secure coding is extremely difficult, and programmers get less help than they should from their compilers and tools. Large, complex languages can be problematic because the subset programmers might know does not match the subset needed for the current projects they are working on. This is particularly true during maintenance. Developers are frequently under pressure to deliver functionality on schedule; code quality and security may suffer as a result.
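One reason the strncpy() advice is commonly considered a misconception is that strncpy() does not write a terminating null when the source has as many characters as the limit or more. The sketch below illustrates that behavior; the function names are hypothetical, and the repair shown is only one common option, not a recommendation taken from the interview.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf contains a null terminator within its first size bytes. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Naive "fix": strncpy() stops after dst_size bytes but adds no terminator
   when src has dst_size or more characters. */
void naive_copy(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size);
}

/* One common repair: copy at most dst_size - 1 bytes, then terminate
   explicitly. This truncates silently, which may itself be a bug for
   some callers. */
void terminated_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

Copying "abcdef" into a four-byte array with naive_copy() leaves the array unterminated, so a later strlen() or printf() on it would read past the end; terminated_copy() leaves "abc" instead.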
Danny: In what ways, then, can compilers and software tools be improved so that programmers may get more help with respect to security?
Robert: There are many ways, but I would like to provide a very specific example. For several years, CERT has been working with WG14 to produce the draft TS 17961 C Secure Coding Rules Technical Specification. ISO/IEC TS 17961 defines secure coding rules for analyzers that wish to diagnose insecure code beyond the requirements of the C Language Standard. The application of static analysis to security has been performed in an ad hoc manner by different vendors, resulting in nonuniform coverage of significant security issues. This specification enumerates secure coding rules and requires analysis engines to diagnose violations of these rules as a matter of conformance to this specification. These rules may be extended in an implementation-dependent manner, which provides a minimum coverage guarantee to customers of any and all conforming static analysis implementations. Various types of analyzers, including static analysis tools and C language compilers, can be used to check if a program contains any violations of the coding rules.
Danny: What are your estimates regarding the security threats of the future? Do you see new kinds of threats looming ahead, for example?
Robert: One of the reasons the CERT Program at the Software Engineering Institute (SEI) works in this area is that software security is a very hard problem, one that is unlikely to be solved any time soon. A January report by the Defense Science Board (DSB) Task Force on Resilient Military Systems likened the problem to complex national security challenges of the past, such as “the counter U-boat strategy in WWII and nuclear deterrence in the Cold War.” One of the primary problems is that market forces are still driving greater performance in preference to software security, so in general, systems are frequently becoming less and not more secure. A good example of this trend is described in Vulnerability Note VU#162289, “C compilers may silently discard some wraparound checks.”
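The kind of check that vulnerability note concerns can be sketched as follows. The problematic pattern appears only in a comment; the function range_fits() is a hypothetical alternative written for this illustration, and it expresses the same test using sizes instead of pointer arithmetic.

```c
#include <stddef.h>

/* Problematic pattern (shown for reference, not compiled):
 *
 *     if (buf + len < buf) { ... handle wraparound ... }
 *
 * If buf + len overflows, the behavior is undefined, so an optimizing
 * compiler may assume it cannot happen and silently remove the test.
 */

/* Returns 1 if len bytes starting at offset fit inside a buffer of
   buf_size bytes, 0 otherwise. Only unsigned size arithmetic is used,
   and the subtraction cannot wrap because offset <= buf_size is
   established first. */
int range_fits(size_t buf_size, size_t offset, size_t len)
{
    if (offset > buf_size) {
        return 0;
    }
    return len <= buf_size - offset;
}
```

Because nothing in range_fits() depends on undefined behavior, the compiler has no license to discard the comparison.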