This chapter is from the book

2.8 Mitigation Strategies

Because errors in string manipulation have long been recognized as a leading source of buffer overflows in C and C++, a number of mitigation strategies have been devised. These include strategies designed to prevent buffer overflows from occurring and strategies designed to detect buffer overflows and securely recover without allowing the failure to be exploited.

Rather than completely relying on a given mitigation strategy, it is often advantageous to follow a defense-in-depth strategy of combining multiple strategies. A common approach is to consistently apply a secure approach to implementing strings (a prevention strategy), and back it up with one or more run-time detection and recovery schemes.

Prevention

Prevention strategies can be further categorized as static or dynamic based on how they allocate space.

Statically allocated buffers assume a fixed size, meaning that once the buffer has been filled it is impossible to add data. Examples include the standard C strncpy() and strncat() and OpenBSD's strlcpy() and strlcat(). Because the static approach discards excess data, there is always a chance that actual program data will be lost. Consequently, the resulting string must be fully validated [Wheeler 04].

Dynamically allocated buffers resize as additional memory is required. Dynamic approaches scale better and do not discard excess data. The major disadvantage is that if inputs are not limited, they can exhaust memory on a machine and consequently be used in denial-of-service attacks.
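One way to keep the benefits of a dynamic buffer while limiting the denial-of-service risk is to cap how far it may grow. The following is a minimal sketch under stated assumptions: the dynbuf type, the dynbuf_append() function, and the max_cap parameter are hypothetical names introduced here for illustration and do not come from any particular library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer (illustration only). */
typedef struct {
    char  *data;  /* heap storage, always null terminated once non-NULL */
    size_t len;   /* bytes in use, not counting the terminator */
    size_t cap;   /* bytes allocated */
} dynbuf;

/* Appends n bytes from src. Grows by doubling but never beyond max_cap.
 * Returns 0 on success, -1 if the cap would be exceeded or allocation fails.
 * On failure the buffer is left unchanged. */
int dynbuf_append(dynbuf *b, const char *src, size_t n, size_t max_cap) {
    /* Require len + n + 1 <= max_cap, written so it cannot overflow. */
    if (n >= max_cap || b->len >= max_cap - n) {
        return -1;
    }
    size_t need = b->len + n + 1;
    if (need > b->cap) {
        size_t new_cap = b->cap ? b->cap : 16;
        while (new_cap < need) {
            new_cap = (new_cap > max_cap / 2) ? max_cap : new_cap * 2;
        }
        char *p = realloc(b->data, new_cap);
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

A caller would start from a zero-initialized dynbuf, choose max_cap from the largest input the application is prepared to accept, and free data when done.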

Input Validation. Buffer overflows are often the result of unbounded string or memory copies. Buffer overflows can be prevented by ensuring that input data does not exceed the size of the smallest buffer in which it is stored. Figure 2–29 shows an example of a simple function that performs input validation.

Any data that arrives at a program interface across a security boundary requires validation. Examples of such data include argv, environment, sockets, pipes, files, signals, shared memory, and devices.

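#include <stdlib.h>   /* abort() */
#include <string.h>   /* strlen() */

int myfunc(const char *arg) {
  char buff[100];
  /* Reject any input that cannot fit in buff, including its null terminator. */
  if (strlen(arg) >= sizeof(buff)) {
    abort();
  }
  /* ... arg can now be copied into buff safely ... */
  return 0;
}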

Figure 2–29. Input validation

Input validation works for all classes of buffer exploits but requires that a developer correctly identify and validate all of the external inputs that might result in buffer overflows. Because this process is error prone, it is usually prudent to combine this avoidance strategy with others (for example, replacing suspect functions).

fgets() and gets_s(). If there were ever a hard-and-fast rule in secure programming in C and C++, it is this: Never use gets(). The gets() function has been used extensively in the examples of vulnerable programs in this chapter. The gets() function reads a line from standard input into a buffer until a terminating newline or EOF is found. No check for buffer overrun is performed. The following quote is from the man page for the function:

Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security.

There are two alternative functions that can be used: fgets() and gets_s(). Figure 2–30 shows how all three functions are used.

The fgets() function is defined in C99 and has similar behavior to gets(). The fgets() function accepts two additional arguments: the number of characters to read and an input stream. By specifying stdin as the stream, fgets() can be used to simulate the behavior of gets(), as shown in lines 6–10 of Figure 2–30. Unlike gets(), the fgets() function retains the newline character, meaning that the function cannot be used as a direct replacement for gets().

 1. #define BUFFSIZE 8
 2. int _tmain(int argc, _TCHAR* argv[]){
 3.  char buff[BUFFSIZE];

     // insecure use of gets()
 4.  gets(buff);
 5.  printf("gets: %s.\n", buff);

 6.  if (fgets(buff, BUFFSIZE, stdin) == NULL) {
 7.   printf("read error.\n");
 8.   abort();
 9.  }
10.  printf("fgets: %s.\n", buff);

11.  if (gets_s(buff, BUFFSIZE) == NULL) {
12.   printf("invalid input.\n");
13.   abort();
14.  }
15.  printf("gets_s: %s.\n", buff);

16.  return 0;
17. }

Figure 2–30. Use of gets() versus fgets() versus gets_s()

When using fgets(), it is possible to read a partial line. Truncation of user input can be detected, however, because the input buffer will not contain a newline character.
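The newline test described above can be wrapped in a small helper. This is a sketch, not a standard function: read_line() and its return codes are hypothetical names chosen for illustration. It also peeks at the next character so that a line which exactly fills the buffer is not misreported as truncated, and it discards the unread tail of an overlong line.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (capacity size).
 * Returns  1 if a complete line was read (newline stripped),
 *          0 if the line was too long and has been truncated,
 *         -1 on end-of-file, read error, or an unusable size. */
int read_line(FILE *in, char *buf, size_t size) {
    if (size == 0 || size > INT_MAX || fgets(buf, (int)size, in) == NULL) {
        return -1;
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';        /* complete line: drop the newline */
        return 1;
    }
    /* No newline stored. Check whether anything is left on this line. */
    int c = getc(in);
    if (c == EOF || c == '\n') {
        return 1;                   /* the text fit; only the newline did not */
    }
    /* Overlong line: discard the rest so the next read starts cleanly. */
    while ((c = getc(in)) != EOF && c != '\n') {
        /* skip */
    }
    return 0;
}
```

Whether to reject, truncate, or retry on a return value of 0 is a policy decision left to the caller.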

The fgets() function reads at most one less than the number of characters specified from the stream into an array. No additional characters are read after a newline character or after end-of-file. A null character is written immediately after the last character read into the array. The C99 standard does not define how fgets() behaves if the number of characters to read is specified as zero or if the pointer to the character array to be written to is NULL.

The gets_s() function is defined by ISO/IEC TR 24731 [ISO/IEC 05] to provide a compatible version of gets() that is less prone to buffer overflow. This function is closer to a direct replacement for the gets() function than fgets() in that it only reads from the stream pointed to by stdin. The gets_s() function, however, accepts an additional argument of type rsize_t that specifies the maximum number of characters to input. An error condition occurs if this argument is equal to zero or greater than RSIZE_MAX, or if the pointer to the destination character array is null. If an error condition occurs, no input is performed and the character array is not modified. Otherwise, the function reads at most one less than the number of characters specified, and a null character is written immediately after the last character read into the array. Lines 11–15 of Figure 2–30 show how gets_s() can be used in a program.

The gets_s() function returns a pointer to the character array if successful. A NULL pointer is returned if the function arguments are invalid, an end-of-file is encountered and no characters have been read into the array, or if a read error occurs during the operation.

The gets_s() function only succeeds if it reads a complete line (that is, it reads a newline character). If a complete line cannot be read, the function returns NULL, sets the buffer to the null string, and clears the input stream to the next newline character.

The fgets() and gets_s() functions can still result in a buffer overflow if the specified number of characters to input exceeds the length of the destination buffer.

memcpy_s() and memmove_s(). The memcpy_s() and memmove_s() functions defined in ISO/IEC TR 24731 are similar to the corresponding less-secure memcpy() and memmove() functions but provide some additional safeguards. The secure versions of these functions add an additional argument that specifies the maximum size of the destination. The memcpy_s() and memmove_s() functions return zero if successful. A nonzero value is returned if either the source or destination pointer is NULL, if the specified number of characters to copy/move is greater than the maximum size of the destination buffer, or if the number of characters to copy/move or the maximum size of the destination buffer is greater than RSIZE_MAX.
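Because many C libraries do not ship the TR 24731 functions, the checks listed above can be illustrated with a stand-in. The sketch below is not memcpy_s() itself: checked_memcpy() is a hypothetical name, and SKETCH_RSIZE_MAX is an assumed substitute for RSIZE_MAX, which plain C99 does not define. It implements only the failure conditions described in the preceding paragraph.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a stand-in limit for RSIZE_MAX. */
#define SKETCH_RSIZE_MAX (SIZE_MAX >> 1)

/* Hypothetical stand-in that mirrors the checks described for memcpy_s().
 * Returns 0 on success and nonzero if any check fails; nothing is copied
 * on failure. */
int checked_memcpy(void *dest, size_t destmax, const void *src, size_t n) {
    if (dest == NULL || src == NULL) {
        return 1;                   /* null source or destination */
    }
    if (destmax > SKETCH_RSIZE_MAX || n > SKETCH_RSIZE_MAX) {
        return 1;                   /* implausibly large size */
    }
    if (n > destmax) {
        return 1;                   /* copy would overrun the destination */
    }
    memcpy(dest, src, n);
    return 0;
}
```

As with the real functions, the check is only as good as the destmax value the caller supplies.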

strcpy() and strcat(). The strcpy() and strcat() routines have been villainized as a major source of buffer overflows, and many prevention strategies provide more secure variants of these functions. However, not all applications of strcpy() are flawed. For example, it is often possible to dynamically allocate the required space as follows:

dest = (char *)malloc(strlen(source) + 1);
if (dest) {
 strcpy(dest, source); 
} else {
 /* handle error */
 ... 
} 

For this example to work, it is necessary that the source string be fully validated; for example, to ensure that the string is not overly long. There are also other cases where it is clear that there is no potential for writing beyond the array bounds. As a result, it may not be cost effective to replace or otherwise secure every call to strcpy(). This depends on the overall mitigation strategy adopted, however, as some strategies require an overall retooling of string manipulation.
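The length validation mentioned above can be folded into the allocation step. In this minimal sketch, dup_validated() and its max_len parameter are hypothetical names introduced for illustration, and the source is assumed to be a properly null-terminated string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicates source only if its length is below
 * max_len. Returns a heap-allocated copy that the caller must free(),
 * or NULL if the input is missing, too long, or allocation fails. */
char *dup_validated(const char *source, size_t max_len) {
    if (source == NULL) {
        return NULL;
    }
    size_t len = strlen(source);    /* assumes source is null terminated */
    if (len >= max_len) {
        return NULL;                /* reject overly long input */
    }
    char *dest = malloc(len + 1);
    if (dest != NULL) {
        strcpy(dest, source);       /* safe: dest holds len + 1 bytes */
    }
    return dest;
}
```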

strcpy_s() and strcat_s(). The strcpy_s() and strcat_s() functions are defined in ISO/IEC TR 24731 as close replacements for strcpy() and strcat(). These functions take an extra argument of type rsize_t that specifies the maximum length of the destination buffer.

The strcpy_s() function is similar to strcpy() when there are no constraint violations. The strcpy_s() function copies characters from a source string to a destination character array up to and including the terminating null character. The function returns zero on success.

The strcpy_s() function only succeeds when the source string can be fully copied to the destination without overflowing the destination buffer. If either the source or destination pointers are NULL or if the maximum length of the destination buffer is equal to zero, greater than RSIZE_MAX, or less than or equal to the length of the source string, the destination string is set to the null string and the function returns a nonzero value.

The strcat_s() function appends the characters of the source string, up to and including the null character, to the end of the destination string. The initial character from the source string overwrites the null character at the end of the destination string.

The strcat_s() function returns zero on success. However, the destination string is set to the null string and a nonzero value is returned if either the source or destination pointers are NULL or if the maximum length of the destination buffer is equal to zero or greater than RSIZE_MAX. The strcat_s() function will also fail if the destination string is already full or if there is not enough room to fully append the source string.

The strcpy_s() and strcat_s() functions can still result in a buffer overflow if the maximum length of the destination buffer is incorrectly specified.
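The all-or-nothing behavior described for strcpy_s() can be sketched as follows. This is an illustration of the semantics, not the library function: sketch_strcpy_s() is a hypothetical name, and SKETCH_RSIZE_MAX is an assumed stand-in for RSIZE_MAX.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a stand-in limit for RSIZE_MAX. */
#define SKETCH_RSIZE_MAX (SIZE_MAX >> 1)

/* Hypothetical illustration of the strcpy_s() behavior described above.
 * Returns 0 on success. On failure it returns nonzero and, when dest is
 * usable, sets dest to the null string instead of copying a truncated one. */
int sketch_strcpy_s(char *dest, size_t destmax, const char *src) {
    if (dest == NULL || destmax == 0 || destmax > SKETCH_RSIZE_MAX) {
        return 1;
    }
    if (src == NULL) {
        dest[0] = '\0';
        return 1;
    }
    /* Bounded scan: never examine more than destmax bytes of src. */
    size_t len = 0;
    while (len < destmax && src[len] != '\0') {
        len++;
    }
    if (len >= destmax) {           /* no room for the string and its null */
        dest[0] = '\0';
        return 1;
    }
    memcpy(dest, src, len + 1);     /* copy including the terminator */
    return 0;
}
```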

strncpy() and strncat(). The standard C library includes functions that are designed to prevent buffer overflows, particularly strncpy() and strncat(). These universally available functions take a static allocation approach and discard data that doesn't fit into the buffer.

The strncpy() library function performs a similar function to strcpy() but allows a maximum size to be specified:

strncpy(dest, source, dest_size - 1);
dest[dest_size - 1] = '\0';

The strcat() function concatenates a string to the end of a buffer. Like strcpy(), strcat() has a more secure version, strncat(). Functions like strncpy() and strncat() restrict the number of bytes written and are generally more secure, but they are not foolproof. The following is an actual code example resulting from a simplistic transformation of existing code:

strncpy(record, user, MAX_STRING_LEN - 1);
strncat(record, cpw, MAX_STRING_LEN - 1);

The problem is that the last argument to strncat() should not be the total buffer length; it should be the space remaining after the call to strncpy(). Both functions require that you specify the remaining space and not the total size of the buffer. Because the remaining space changes every time data is added or removed, programmers must track or constantly recompute the remaining space. These processes are error prone and can lead to vulnerabilities. The following call correctly calculates the remaining space when concatenating a string using strncat():

strncat(dest, source, dest_size-strlen(dest)-1); 
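Putting the copy and the concatenation together, a corrected version of the two-call pattern might look like the sketch below. The build_record() function and its parameter names are hypothetical, chosen to echo the flawed example earlier in this section.

```c
#include <string.h>

/* Hypothetical helper: stores user followed by cpw in record, which has
 * room for size bytes. Excess data is silently discarded. */
void build_record(char *record, size_t size,
                  const char *user, const char *cpw) {
    if (size == 0) {
        return;
    }
    strncpy(record, user, size - 1);
    record[size - 1] = '\0';        /* strncpy() may leave it unterminated */
    /* Pass the space remaining, not the total size of the buffer. */
    strncat(record, cpw, size - strlen(record) - 1);
}
```

With an 8-byte record, the inputs "bob:" and "secretpw" yield "bob:sec": the result is truncated but stays within bounds and is null terminated.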

Another problem with strncpy() and strncat() is that neither function provides a status code or reports when the resulting string is truncated. Both functions return a pointer to the destination buffer, requiring significant effort by the programmer to determine whether the resulting string was truncated.

The strncpy() function doesn't null terminate the destination string if the source string is at least as long as the destination. As a result, the destination string must be null terminated after calling strncpy().
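Both shortcomings, the missing status code and the missing terminator, can be papered over with a thin wrapper. The following is a sketch; copy_checked() is a hypothetical name, and src is assumed to be null terminated.

```c
#include <string.h>

/* Hypothetical wrapper around strncpy(). Always null terminates dest.
 * Returns 0 if all of src fit and 1 if the copy was truncated
 * (or dest_size is zero). */
int copy_checked(char *dest, size_t dest_size, const char *src) {
    if (dest_size == 0) {
        return 1;
    }
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';     /* terminate even when src was too long */
    return strlen(src) >= dest_size;
}
```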

There's also a performance problem with strncpy() in that it fills the entire destination buffer with null bytes after the source data has been exhausted. Although there is no good reason for this behavior, many programs now depend on it and as a result it is difficult to change.

strncpy_s() and strncat_s(). ISO/IEC TR 24731 specifies the strncpy_s() and strncat_s() functions as close replacements for strncpy() and strncat().

The strncpy_s() function copies not more than a specified number of successive characters (characters that follow a null character are not copied) from a source string to a destination character array. If no null character was copied, then the last character of the destination character array is set to a null character.

The strncpy_s() function returns zero to indicate success. If the input arguments are invalid, strncpy_s() returns a nonzero value and sets the destination string to the null string. Input validation fails if either the source or destination pointers are NULL or if the maximum size of the destination string is zero or greater than RSIZE_MAX. The input is also considered invalid when the specified number of characters to be copied exceeds RSIZE_MAX.

A strncpy_s() operation can actually succeed when the number of characters specified to be copied exceeds the maximum length of the destination string as long as the actual source string is shorter than the maximum length of the destination string. If the number of characters to copy is greater than or equal to the maximum size of the destination string and the source string is longer than the destination buffer, the operation will fail.

Users of these functions are less likely to introduce a security flaw because the size of the destination buffer and the maximum number of characters to append must be specified. The strncat_s() function also ensures null termination of the destination string. For example, the first call to strncpy_s() on line 5 of the sample program shown in Figure 2–31 assigns the value zero to r1 and the sequence hello\0 to dst1. The second call on line 6 assigns a non-zero value to r2 and the sequence \0 to dst2. The third call on line 7 assigns the value zero to r3 and the sequence good\0 to dst3. If strncpy() had been used instead of strncpy_s() a buffer overflow would have occurred during the execution of line 6.

1. char src1[100] = "hello";
2. char src2[7] = {'g','o','o','d','b','y','e'};
3. char dst1[6], dst2[5], dst3[5];

4. int r1, r2, r3;
5. r1 = strncpy_s(dst1, 6, src1, 100);
6. r2 = strncpy_s(dst2, 5, src2, 7);
7. r3 = strncpy_s(dst3, 5, src2, 4);

Figure 2–31. Sample use of strncpy_s() function

The strncat_s() function appends not more than a specified number of successive characters (characters that follow a null character are not copied) from a source string to a destination character array. The initial character from the source string overwrites the null character at the end of the destination array. If no null character was copied from the source string, then a null character is written at the end of the appended string.

The strncat_s() function fails and returns a nonzero value if either the source or destination pointers are NULL, or if the maximum length of the destination buffer is equal to zero or greater than RSIZE_MAX. The function also fails when the destination string is already full or if there is not enough room to fully append the source string.

The strncpy_s() and strncat_s() functions are still capable of overflowing a buffer if the maximum length of the destination buffer and number of characters to copy are incorrectly specified.

strlen(). The strlen() function is not particularly flawed, but its operations can be subverted because of weaknesses in the underlying string representation. The strlen() function accepts a pointer to a character array and returns the number of characters that precede the terminating null character. If the character array is not properly null-terminated, the strlen() function may return an erroneously large number that could result in a vulnerability.

One solution is to ensure that a string is null terminated before passing it to strlen() by inserting a null character in the last byte of the array. Another solution is to use the strnlen() function. In addition to a character pointer, the strnlen() function accepts a maximum size. If the string is longer than the maximum size specified, the maximum size is returned rather than the actual size of the string. The strnlen() function is available in GCC and in the beta release of Visual Studio 2005. ISO/IEC TR 24731 defines a strnlen_s() function that has similar behavior.
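Because strnlen() returns the maximum size when no terminator is found, it can double as a termination check. In this sketch, terminated_length() is a hypothetical helper name; strnlen() itself is specified by POSIX.1-2008 rather than ISO C99, which is why the feature-test macro is defined first.

```c
#define _POSIX_C_SOURCE 200809L     /* strnlen() is POSIX.1-2008 */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores the length in *len if buf
 * contains a null terminator within its first size bytes. Returns 0 and
 * leaves *len untouched otherwise. */
int terminated_length(const char *buf, size_t size, size_t *len) {
    size_t n = strnlen(buf, size);  /* never reads past buf[size - 1] */
    if (n == size) {
        return 0;                   /* no terminator inside the array */
    }
    *len = n;
    return 1;
}
```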

Strsafe.h. Microsoft provides a set of safer string handling functions for the C programming language called Strsafe.h. These functions are intended to replace their built-in C/C++ counterparts, as well as any legacy Microsoft-specific string handling functions.

The Strsafe functions support both ANSI and Unicode characters, always return a status code, and require that the programmer always specifies the size of the destination buffer. Separate functions are provided that allow the programmer to specify the size of the destination buffer using either character or byte counts.

The Microsoft Strsafe library functions guarantee that all strings are null terminated (even if they are truncated) and that a write does not occur past the end of the destination buffer. These functions are safe as long as the programmer inputs the actual starting address of the destination buffer and correct length. As a result, care must still be taken when using these functions.

Figure 2–32 shows an example program that performs a secure string copy on line 5 and a secure string concatenation on line 10.

It is also important to remember that the Strsafe functions, such as StringCchCopy() and StringCchCat(), do not have the same semantics as the strncpy_s() and strncat_s() functions discussed earlier in this chapter. When strncat_s() detects an error, it sets the destination string to a null string while StringCchCat() fills the destination with as much data as possible, and then null-terminates the string.

strlcpy() and strlcat(). The strlcpy() and strlcat() functions copy and concatenate strings in a less error-prone manner than the corresponding C99 functions. These functions' prototypes are as follows:

size_t strlcpy(char *dst, const char *src, size_t size);
size_t strlcat(char *dst, const char *src, size_t size);

The strlcpy() function copies the null-terminated string from src to dst (up to size characters). The strlcat() function appends the null-terminated string src to the end of dst (but no more than size characters will be in the destination).

 1. #include <Strsafe.h>
 2. int main(int argc, char *argv[]) {
 3. char MyString[128];
 4. HRESULT Res;
 5. Res=StringCbCopy(MyString, sizeof(MyString), "Program 1. Name is ");
 6. if (Res != S_OK) {
 7.  printf("StringCbCopy Failed: %s\n", MyString);
 8.  exit(-1);
 9. } 
10. Res=StringCbCat(MyString,sizeof(MyString),argv[0]); 
11.  if (Res != S_OK) { 
12.  printf("StringCbCat Failed: %s\n", MyString); 
13.  exit(-1); 
14.  } 
15. printf("%s\n", MyString); 
16. return 0; 
17. }

Figure 2–32. Microsoft Strsafe example

To help prevent writing outside the bounds of the array, the strlcpy() and strlcat() functions accept the full size of the destination string as a size parameter. In most cases, this value is easily computed at compile time using the sizeof operator.

Both functions guarantee that the destination string is null terminated for all nonzero-length buffers.

The strlcpy() and strlcat() functions return the total length of the string they tried to create. For strlcpy() that is simply the length of the source; for strlcat() it is the length of the destination (before concatenation) plus the length of the source. To check for truncation, the programmer needs to verify that the return value is less than the size parameter. If the resulting string is truncated, the programmer now has the number of bytes needed to store the entire string and may reallocate and recopy.

Neither strlcpy() nor strlcat() zero-fills its destination string (other than the compulsory null byte to terminate the string). This results in performance close to that of strcpy() and much better than strncpy() [Miller 99].

Unfortunately, strlcpy() and strlcat() are not universally available in the standard libraries of UNIX systems. Both functions are defined in string.h for many UNIX variants, including Solaris, but not for GNU/Linux. Because these are relatively small functions, however, you can easily include them in your own program's source whenever the underlying system doesn't provide them. It is still possible (however unlikely) that the incorrect use of these functions will result in a buffer overflow if the specified buffer size is larger than the actual buffer length.
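For systems that lack these functions, a copy routine with the semantics described above is short enough to write by hand. The sketch below uses the hypothetical name sketch_strlcpy() so that it cannot clash with a system-provided strlcpy(); it is an illustration of the described behavior, not the OpenBSD source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical strlcpy()-style copy. Copies at most size - 1 bytes of src
 * into dst and null terminates whenever size is nonzero. Returns the
 * length of src, so a return value >= size signals truncation. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size) {
    size_t src_len = strlen(src);
    if (size != 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';              /* no zero-fill beyond the terminator */
    }
    return src_len;
}
```

Copying "hello, world" into a 6-byte buffer stores "hello" and returns 12. Because 12 is not less than 6, the caller knows the string was truncated and that 13 bytes would hold it in full.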

C++ std::string. Section 2.2 described a common programming flaw using the C++ extraction operator operator>> to read input from cin into a character array. Although setting the field width eliminates the buffer overflow vulnerability, it does not address the issue of truncation. Also, unexpected program behavior could result when the maximum field width is reached and the remainder of characters in the input stream are consumed by the next call to the extraction operator.

C++ programmers have the option of using the standard std::string class defined in ISO/IEC 14882 [ISO/IEC 98]. The std::string class is the char instantiation of the std::basic_string template class, and it uses a dynamic approach to strings in that memory is allocated as required—meaning that in all cases, size() <= capacity(). The std::string class is convenient because the language supports the class directly. Also, many existing libraries already use this class, which simplifies integration.

Figure 2–33 shows another solution to extracting characters from cin into a string, using std::string instead of a character array. This program is simple, elegant, handles buffer overflows and string truncation, and behaves in a predictable fashion. What more could you possibly want?

The std::string generally protects against buffer overflow, but there are still situations in which programming errors can lead to buffer overflows. While C++ generally throws an out_of_range exception when an operation references memory outside the bounds of the string, the subscript operator [] (which does not perform bounds checking) does not [Viega 03].

#include <iostream>
#include <string>
using namespace std;

int main() {
  string str;

  cin >> str;
  cout << "str 1: " << str << endl;
}

Figure 2–33. Extracting characters from cin into an std::string object

Another problem occurs when converting std::string objects to C-style strings. If you use string::c_str() to do the conversion, you get a properly null-terminated C-style string. However, if you use string::data(), which returns a pointer to an array holding the string's characters, the C++98 standard cited above does not guarantee that the buffer is null terminated. Under that standard, the only difference between c_str() and data() is that c_str() adds a trailing null byte. (Since C++11, data() is also required to return a null-terminated array.)

Finally, many existing C++ programs and libraries have their own string classes. To use these libraries, you may have to use these string types or constantly convert back and forth. Such libraries are of varying quality when it comes to security. It is generally best to use the standard library (when possible) or to understand the semantics of the selected library. Generally speaking, libraries should be evaluated based on how easy or complex they are to use, the type of errors that can be made, how easy these errors are to make, and what the potential consequences may be.

SafeStr. The C String Library (SafeStr) from Messier and Viega provides a rich string-handling library for C that has secure semantics yet is interoperable with legacy library code in a straightforward manner [Messier 03].

The SafeStr library uses a dynamic approach for C that automatically resizes strings as required. SafeStr accomplishes this by reallocating memory and moving the contents of the string whenever an operation requires that a string grow in size. As a result, buffer overflows should not result from using the library.

The SafeStr library is built around the safestr_t type. The safestr_t type is compatible with char* and allows safestr_t structures to be cast as char* and behave as C-style strings. The safestr_t type keeps accounting information (for example, the actual and allocated length) in memory directly preceding the memory referenced by the pointer. This is similar to the approach used by dynamic memory managers described in Chapter 4.

The SafeStr library supports immutable strings. Strings can be specified as immutable during initialization or by calling

void safestr_makereadonly(safestr_t s); 

Immutable strings cannot be modified using the SafeStr API. However, the memory can still be overwritten. The library only prevents writes initiated through SafeStr functions.

The SafeStr API can help track trusted and untrusted data in the style of Perl's taint mode. A developer can use this mechanism to mark strings originating from untrusted sources as such. Strings that have been checked for potentially malicious input could subsequently be marked as trusted. When modifying a string, the trusted property of that string is set to "untrusted" if any of the operands are untrusted. When creating a new string from operations on other strings, the new string is only marked as trusted if all the strings that influence its value are trusted.
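The propagation rule can be illustrated without the library. The sketch below is not the SafeStr API: tagged_str and tagged_concat() are hypothetical names, and the fixed 64-byte buffer is an arbitrary simplification. It shows only that a result is trusted when every operand is trusted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical string with a trust flag (not safestr_t). */
typedef struct {
    char text[64];
    bool trusted;
} tagged_str;

/* Concatenates a and b into out, truncating if necessary. The result is
 * trusted only if both operands are. out must not alias a or b. */
void tagged_concat(tagged_str *out, const tagged_str *a, const tagged_str *b) {
    snprintf(out->text, sizeof out->text, "%s%s", a->text, b->text);
    out->trusted = a->trusted && b->trusted;
}
```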

The trust property will not properly propagate if the SafeStr API is circumvented. The SafeStr API does not currently provide any routines that check the trusted flag. However, you can explicitly check the flag yourself as shown in Figure 2–34.

Error handling in SafeStr is performed using XXL, a library that provides both exceptions and asset management for C and C++. The caller is responsible for handling exceptions thrown by SafeStr and XXL. If no exception handler is specified, the default action is to output a message to stderr and call abort(). The dependency on XXL can be an issue because both libraries need to be adopted to support this solution.

The sample program shown in Figure 2–35 uses the SafeStr library to allocate two strings and copies one string to the other. The use of XXL provides a convenient mechanism for error checking.

int safer_system(safestr_t cmd) {
  /* Refuse to run any command string that is not marked as trusted. */
  if (!safestr_istrusted(cmd)) {
    printf("Untrusted data in safer_system!\n");
    abort();
  }
  return system((char *)cmd);
}

Figure 2–34. Trusted and untrusted data in SafeStr

#include <stdio.h>
#include "safestr.h"
#include "xxl.h"

int main(int argc, char *argv[]) {
  safestr_t str1;
  safestr_t str2;
  XXL_TRY_BEGIN {
    str1 = safestr_alloc(12, 0);
    str2 = safestr_create("hello, world\n", 0);
    safestr_copy(&str1, str2);
    safestr_printf(str1);
    safestr_printf(str2);
  }
  XXL_CATCH (SAFESTR_ERROR_OUT_OF_MEMORY) {
    /* handle exception */
  }
  XXL_EXCEPT {
    /* handle exception */
  }
  XXL_TRY_END;
  return 0;
}

Figure 2–35. "Hello World" using SafeStr and XXL

Vstr. The Vstr string library is optimized for input/output using readv()/writev() [Antill 04]. For example, you can readv() data to the end of the string and writev() data from the beginning of the string without allocating or moving memory. Vstr also works with data containing zero bytes.

Figure 2–36 shows a simple example of a program that uses Vstr to print out "Hello World." The library is initialized on line 8 of this example. The call to the vstr_dup_cstr_buf() function on line 10 creates a vstr from a C-style string literal. The string is then output to the user using the vstr_sc_write_fd() function on line 13. This call to the vstr_sc_write_fd() function writes the contents of the s1 vstr to STDOUT. Lines 17 and 18 of the example are used to clean up allocated resources.

Unlike most string libraries, Vstr does not have an internal representation of the string that allows direct access from a character pointer. Instead, the internal representation is of multiple nodes, each containing a portion of the string data. This data representation model means that memory usage increases linearly as a string gets larger. Adding, substituting, or moving data anywhere in the string can be optimized to an O(1) algorithm.

 1. #define VSTR_COMPILE_INCLUDE 1
 2. #include <vstr.h>
 3. #include <errno.h>
 4. #include <err.h>
 5. #include <unistd.h>

 6. int main(void) {
 7.  Vstr_base *s1 = NULL;

 8.  if (!vstr_init())
 9.   err(EXIT_FAILURE, "init");
10.  if (!(s1 = vstr_dup_cstr_buf(NULL, "HelloWorld\n")))
11.   err(EXIT_FAILURE, "Create string");
12.  while (s1->len)
13.   if (!vstr_sc_write_fd(s1, 1, s1->len, STDOUT_FILENO, NULL)){
14.    if ((errno != EAGAIN) && (errno != EINTR))
15.     err(EXIT_FAILURE, "write");
16.   }
17.  vstr_free_base(s1);
18.  vstr_exit();
19.  exit (EXIT_SUCCESS);
20. }

Figure 2–36. "Hello World" using Vstr

String Streams

The GNU library allows you to define streams that do not correspond to open files. One such type of stream takes input from or writes output to a string. These streams are used by the GNU library to implement the sprintf() and sscanf() functions. You can also create a string stream explicitly using the fmemopen() and open_memstream() functions. These functions allow you to perform I/O to a string or memory buffer. Both functions are declared in stdio.h as follows:

FILE * fmemopen(void *buf, size_t size, const char *ot);
FILE * open_memstream(char **ptr, size_t *sizeloc);

The fmemopen() function opens a stream that allows you to read from or write to a specified buffer. The open_memstream() function opens a stream for writing to a buffer. The buffer is allocated dynamically and grown as necessary. When the stream is closed with fclose() or flushed with fflush(), the locations ptr and sizeloc are updated to contain the pointer to the buffer and its size. These values only remain valid as long as no further output on the stream takes place. If you perform additional output, you must flush the stream again to store new values before you use them again. A null character is written at the end of the buffer. This null character is not included in the size value stored at sizeloc.

 1. #include <stdio.h>

 2. int main (void) {
 3.  char *bp;
 4.  size_t size;
 5.  FILE *stream;

 6.  stream = open_memstream(&bp, &size);
 7.  fprintf(stream, "hello");
 8.  fflush(stream);
 9.  printf("buf = '%s', size = %zu\n", bp, size);
10.  fprintf(stream, ", world");
11.  fclose(stream);
12.  printf("buf = '%s', size = %zu\n", bp, size);

13.  return 0;
14. }

Figure 2–37. Using open_memstream() to write to memory

Figure 2–37 shows a sample program that opens a stream to write to memory on line 6. The string "hello" is written to the stream on line 7, and the stream is flushed on line 8. The call to fflush() updates bp and size so that the printf() function on line 9 outputs:

buf = 'hello', size = 5 

After the string ", world" is written to the stream on line 10, the stream is closed (line 11). Closing the stream also updates bp and size so that the printf() function on line 12 outputs:

buf = 'hello, world', size = 12 

The size is the cumulative (total) size of the buffer. The open_memstream() function provides a safer mechanism for writing to memory because it uses a dynamic approach that allocates memory as required. The downside is that the user must manage the memory (which could lead to some of the common memory management errors described in Chapter 4).

Detection and Recovery

Detection and recovery mitigation strategies generally make changes to the runtime environment to detect buffer overflows when they occur so that the application or operating system can recover from the error (or at least fail safely). Detection and recovery mitigations generally form a second line of defense in case the "outer perimeter" is compromised. Because attackers have numerous options for controlling execution after a buffer overflow has occurred, detection and recovery is not as effective as prevention and should not be depended on as the only mitigation strategy.

Compiler Generated Runtime Checks. Visual C++ provides native runtime checks to catch common runtime errors such as stack pointer corruption and overruns of local arrays. Visual C++ also provides a runtime_checks pragma that disables or restores the /RTC settings.

Stack Pointer Corruption. Stack pointer verification detects stack pointer corruption. Stack pointer corruption can be caused by a calling convention mismatch.

Overruns of Local Arrays. The /RTCs option enables stack frame runtime error checking for writes outside the bounds of local variables such as arrays but does not detect overruns when accessing memory that results from compiler padding within a structure.

Nonexecutable Stacks. Nonexecutable stacks are a runtime solution to buffer overflows that is designed to prevent executable code from running in the stack segment. Many operating systems can be configured to use nonexecutable stacks.

Nonexecutable stacks are often represented as a panacea in securing against buffer overflow vulnerabilities. However, nonexecutable stacks do not prevent buffer overflows from occurring in the stack, heap, or data segments. They do not prevent an attacker from using a buffer overflow to modify a return address, variable, data pointer, or function pointer. Nonexecutable stacks do not prevent arc injection or injection of executable code in the heap or data segments. Not allowing an attacker to run executable code on the stack can prevent the exploitation of some vulnerabilities, but it is often only a minor inconvenience to an attacker.

Depending on how they are implemented, nonexecutable stacks can affect performance. Nonexecutable stacks can also break programs that execute code in the stack segment, including Linux signal delivery and GCC trampolines.

Stackgap. Many stack-based buffer overflow exploits rely on the buffer being at a known location in memory. If the attacker can overwrite the function return address, which is at a fixed location in the overflow buffer, execution of the attacker-supplied code starts. Introducing a randomly sized gap of space upon allocation of stack memory makes it more difficult for an attacker to locate a return value on the stack while costing no more than one page of real memory. This mitigation can be relatively easy to add to an operating system. Figure 2–38 shows the change to the Linux kernel required to implement Stackgap.

Although Stackgap may make it more difficult to exploit a vulnerability, it does not prevent exploits if an attacker can use relative, rather than absolute, values. Section 6.4 describes stack randomization in more detail and also demonstrates how it can be thwarted.
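The gap computation can be illustrated outside the kernel. The following is a simplified sketch, not the kernel code: the constants GAP_BASE, GAP_ALIGN, and GAP_RANGE and the function stack_gap() are hypothetical, and the sketch assumes that the alignment and the randomization window are both powers of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants (assumptions); a real kernel chooses its own. */
#define GAP_BASE  512u   /* fixed minimum gap */
#define GAP_ALIGN 16u    /* gap granularity, a power of two */
#define GAP_RANGE 4096u  /* randomization window of one page, a power of two */

/* Returns the stack gap for a given random word. Multiplying by the
 * alignment keeps the low bits zero, and masking with GAP_RANGE - 1
 * bounds the random part to less than one page. */
size_t stack_gap(uint32_t random_word) {
    size_t gap = GAP_BASE;
    gap += ((size_t)random_word * GAP_ALIGN) & (GAP_RANGE - 1u);
    return gap;
}
```

Under these assumptions the random part takes one of 256 values, which shows why an attacker who can retry, or who can use offsets relative to the overflowed buffer, is only slowed down.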

Runtime Bound Checkers. If using a type-safe language such as Java is impractical, it may still be possible to use a compiler that performs array bounds checking for C programs.

Jones and Kelly [Jones 97] propose an approach for bounds checking using referent objects. This approach is based on the principle that an address computed from an in-bounds pointer must share the same referent object as the original pointer. Unfortunately, a surprisingly large number of programs generate and store out-of-bounds addresses and later retrieve these values in their computation without causing buffer overflows, which makes these programs incompatible with this bounds-checking approach. This approach to runtime bounds checking also has significant performance costs, particularly in pointer-intensive programs, where performance may slow down by up to 30 times [Cowan 00].

    sgap = STACKGAPLEN;
    if (stackgap_random != 0)
        sgap += (arc4random() * ALIGNBYTES) & (stackgap_random - 1);
    /* Check if args & environ fit into new stack */
    len = ((argc + envc + 2 + pack.ep_emul->e_arglen) *
        sizeof(char *) + sizeof(long) + dp + sgap +
        sizeof(struct ps_strings)) - argp;

Figure 2–38. Linux kernel modification to support Stackgap

Ruwase and Lam have improved the Jones and Kelly approach in their C Range Error Detector (CRED) [Ruwase 04]. According to the authors, CRED enforces a relaxed standard of correctness by allowing program manipulations of out-of-bounds addresses that do not result in buffer overflows. This relaxed standard of correctness provides higher compatibility with existing software.

CRED can be configured to check all bounds of all data or of string data only. Full bounds checking, like the Jones and Kelly approach, imposes significant performance overhead. Limiting the bounds checking to strings improves the performance for most programs. Overhead ranges from 1% to 130% depending on the use of strings in the application.

Bounds checking is effective in preventing most overflow conditions but is not perfect. The CRED solution, for example, is unable to detect conditions where an out-of-bounds pointer is cast to an integer, used in an arithmetic operation, and cast back to a pointer. The approach does prevent overflows in the stack, heap, and data segments. CRED was effective in detecting 20 different buffer overflow attacks developed by Wilander and Kamkar for evaluating dynamic buffer overflow detectors [Wilander 03], even when optimized to check only for overflows in strings.
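The difference between the strict referent-object rule and the relaxed standard of correctness can be modeled in ordinary C. The following is a conceptual sketch only, not the implementation of either checker: the checked_ptr type and its functions are hypothetical, and a pointer is represented as a referent object plus a signed offset so that out-of-bounds values can exist without undefined behavior. Offsets are assumed to be small enough that adding them cannot overflow ptrdiff_t.

```c
#include <stdbool.h>
#include <stddef.h>

/* A pointer modeled as a referent object plus a signed offset. */
typedef struct {
    const char *base;  /* start of the referent object */
    size_t size;       /* size of the referent object in bytes */
    ptrdiff_t offset;  /* current position relative to base */
} checked_ptr;

/* True if the offset denotes a byte inside the referent object. */
static bool in_bounds(const checked_ptr *p) {
    return p->offset >= 0 && (size_t)p->offset < p->size;
}

/* Strict policy: arithmetic must stay within the referent object or
 * one past its end. Returns false and leaves *p unchanged otherwise. */
bool strict_add(checked_ptr *p, ptrdiff_t delta) {
    ptrdiff_t next = p->offset + delta;
    if (next < 0 || (size_t)next > p->size) {
        return false;
    }
    p->offset = next;
    return true;
}

/* Relaxed policy: arithmetic always succeeds, so an out-of-bounds
 * value may be stored and later brought back in bounds. */
void relaxed_add(checked_ptr *p, ptrdiff_t delta) {
    p->offset += delta;
}

/* A dereference is checked under both policies. */
bool checked_read(const checked_ptr *p, char *out) {
    if (!in_bounds(p)) {
        return false;
    }
    *out = p->base[p->offset];
    return true;
}
```

With a 4-byte object, strict_add() rejects a step of 10 outright, whereas relaxed_add() accepts it, checked_read() refuses to read through the result, and a later step of -8 yields a usable pointer again. That second pattern is the kind of program behavior the relaxed standard is meant to tolerate.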

CRED has been merged into the latest Jones and Kelly checker for GCC 3.3.1, which is currently maintained by Herman ten Brugge.

Canaries. Canaries are another mechanism used to detect and prevent stack smashing attacks. Instead of performing generalized bounds checking, canaries are used to protect the return address on the stack from sequential writes through memory (for example, resulting from a strcpy()). Canaries consist of a hard-to-insert or hard-to-spoof value written to an address below the section of the stack being protected. A sequential write would therefore need to overwrite this value on the way to the protected region. The canary is initialized immediately after the return address is saved and checked immediately before the return address is accessed.

A hard-to-insert or terminator canary consists of four different string terminators (CR, LF, NULL, and –1). This guards against buffer overflows caused by string operations but not memory copy operations.

A hard-to-spoof or random canary is a 32-bit secret random number that changes each time the program is executed. This approach works well as long as the canary remains a secret.
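The detection step can be simulated with a structure that stands in for a stack frame. This is a minimal sketch under stated assumptions: the sim_frame layout and the function names are hypothetical, the terminator bytes follow the description above, and the deliberately unsafe write is clamped to the simulated frame so that the sketch itself remains well defined. A random canary would replace the constant with a secret chosen at program startup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_SIZE = 16 };

/* Simulated stack frame: the canary sits between the buffer and the
 * saved return address, so a sequential overflow reaches it first. */
typedef struct {
    char buf[BUF_SIZE];
    uint32_t canary;
    uint32_t return_address;
} sim_frame;

/* Terminator canary bytes: null, CR, LF, and -1 (0xFF). */
static const unsigned char TERMINATOR_CANARY[4] = { 0x00, 0x0D, 0x0A, 0xFF };

/* Function entry: save the return address and initialize the canary. */
void frame_enter(sim_frame *f, uint32_t return_address) {
    memset(f, 0, sizeof *f);
    memcpy(&f->canary, TERMINATOR_CANARY, sizeof f->canary);
    f->return_address = return_address;
}

/* Deliberately unchecked sequential write starting at buf, clamped to
 * the size of the simulated frame. */
void frame_unchecked_write(sim_frame *f, const void *src, size_t n) {
    if (n > sizeof *f) {
        n = sizeof *f;
    }
    memcpy(f, src, n);
}

/* Function exit: verify the canary before trusting the return address. */
bool frame_canary_intact(const sim_frame *f) {
    return memcmp(&f->canary, TERMINATOR_CANARY, sizeof f->canary) == 0;
}
```

Writing 8 bytes leaves the canary intact; writing 24 bytes overwrites both the canary and the return address, and the exit check reports the corruption before the return address would be used.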

Canaries are implemented in StackGuard [Cowan 98]. Various StackGuard versions have been used with GCC for Immunix OS 6.2, 7.0, and 7+ and for Red Hat 7.3. Plans exist to merge StackGuard 3 into the GCC 3.x mainline compiler. Canaries have also been used in ProPolice and Microsoft's Visual C++ .NET.

Canaries are useful only against exploits that overflow a buffer on the stack and attempt to overwrite the stack pointer or other protected region.

Canaries do not protect against exploits that modify variables, data pointers, or function pointers. Canaries do not prevent buffer overflows from occurring in any location, including the stack segment.

Neither the terminator nor the random canary offers complete protection against exploits that overwrite the return address. Exploits that write four bytes directly to the location of the return address on the stack can defeat terminator and random canaries [Bulba 00]. To counter these direct access exploits, StackGuard added Random XOR canaries [Wagle 03] that XOR the return address with the canary. Again, this works well as long as the canary remains a secret.
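The effect of binding the canary to the return address can be sketched as follows. This is an illustration of the idea rather than StackGuard's code: the xor_frame type and its functions are hypothetical, and the secret is passed as a parameter here, whereas a real implementation would generate it randomly at startup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated frame slots for a random XOR canary. */
typedef struct {
    uintptr_t canary_word;     /* stored as return_address XOR secret */
    uintptr_t return_address;
} xor_frame;

/* Function entry: bind the canary word to the saved return address. */
void xor_frame_enter(xor_frame *f, uintptr_t return_address, uintptr_t secret) {
    f->return_address = return_address;
    f->canary_word = return_address ^ secret;
}

/* Function exit: the check fails if either slot changed independently. */
bool xor_frame_check(const xor_frame *f, uintptr_t secret) {
    return (f->canary_word ^ secret) == f->return_address;
}
```

An exploit that writes only the return address slot leaves canary_word unchanged, so the check fails even though the canary itself was never touched. An attacker who learns the secret can still compute a matching pair.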

Stack Smashing Protector (ProPolice). A popular mitigation approach derived from StackGuard is the GCC Stack Smashing Protector (SSP, also known as ProPolice) [Etoh 00]. SSP is a GCC extension for protecting applications written in C from the most common forms of stack buffer overflow exploits and is implemented as an intermediate language translator of GCC. SSP provides buffer overflow detection and variable reordering to avoid the corruption of pointers. Specifically, SSP reorders local variables to place buffers after pointers and copies pointers in function arguments to an area preceding local variable buffers. Both measures avoid the corruption of pointers that could be used to further corrupt arbitrary memory locations.

The SSP feature is enabled using GCC options. The -fstack-protector and -fno-stack-protector options enable and disable stack smashing protection. The -fstack-protector-all and -fno-stack-protector-all options enable and disable the protection of every function, not just the functions with character arrays.

SSP works by introducing a guard variable to prevent changes to the arguments, return address, and previous frame pointer. Given the source code of a function, a preprocessing step inserts code fragments into appropriate locations as follows:

  • Declaration of local variables
    volatile int guard; 
  • Entry point
    guard = guard_value; 
  • Exit point
    if (guard != guard_value) {
     /* output error log */
     /* halt execution */ 
    } 

Figure 2–39. SSP safe frame structure

A random number is generated for the guard value during application initialization, preventing discovery by a nonprivileged user.

SSP also provides a safer stack structure as shown in Figure 2–39. This structure establishes the following constraints:

  • Location (A) has no array or pointer variables.
  • Location (B) has arrays or structures that contain arrays.
  • Location (C) has no arrays.

Placing the guard after the section containing the arrays (B) prevents a buffer overflow from overwriting the arguments, return address, previous frame pointer, or local variables (but not other arrays).
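The effect of this ordering can be simulated with a structure whose members are laid out from lowest to highest address. This is a sketch under assumptions: the ssp_frame type and its function names are hypothetical, the member order is a reading of the constraints listed above (the arguments in location (A) are omitted), and the overflow is clamped to the simulated frame so that the sketch itself is well defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated frame, lowest address first. */
typedef struct {
    void (*callback)(void);    /* (C) locals that are not arrays */
    long counter;              /* (C) */
    char name[16];             /* (B) arrays */
    uintptr_t guard;           /* guard value */
    uintptr_t prev_frame_ptr;  /* previous frame pointer */
    uintptr_t return_address;  /* return address */
} ssp_frame;

/* Writes n fill bytes starting at the first byte of name, clamped to
 * the end of the simulated frame. */
void overflow_name(ssp_frame *f, size_t n) {
    size_t start = offsetof(ssp_frame, name);
    size_t room = sizeof *f - start;
    if (n > room) {
        n = room;
    }
    memset((unsigned char *)f + start, 'A', n);
}

/* Exit-point check corresponding to the inserted guard comparison. */
bool guard_intact(const ssp_frame *f, uintptr_t guard_value) {
    return f->guard == guard_value;
}
```

Because writes run toward higher addresses, an overflow of name reaches the guard before the frame pointer and return address, while callback and counter sit below the array and are not touched by it.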

Libsafe and Libverify. Libsafe is a dynamic library available from Avaya Labs Research for limiting the impact of buffer overflows on the stack. The library intercepts and bounds checks arguments to C library functions that are susceptible to buffer overflow [Baratloo 00]. The library makes sure that frame pointers and return addresses cannot be overwritten by an intercepted function. The Libverify library, also described by Baratloo and colleagues, implements a return address verification scheme similar to that used in StackGuard but does not require recompilation of source code, which allows it to be used with existing binaries.
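The bounds check performed by an intercepted function can be sketched in isolation. The following is a hypothetical stand-in, not Libsafe's code: bounded_strcpy() and its limit parameter are assumptions made for illustration, and the limit, which an interposing library would have to derive from the location of the destination within the stack frame, is simply passed in by the caller here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an intercepted strcpy(). `limit` is the
 * number of bytes available at dst before protected data begins.
 * Returns dst on success, or NULL if the copy would not fit. */
char *bounded_strcpy(char *dst, size_t limit, const char *src) {
    size_t needed = strlen(src) + 1;  /* include the terminating null */
    if (needed > limit) {
        return NULL;                  /* refuse the copy; dst is untouched */
    }
    memcpy(dst, src, needed);
    return dst;
}
```

This sketch reports the failure to its caller. How a real interposing library responds to a detected overflow is a policy decision that the sketch does not model.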
