
C++ Reference Guide

A Tour of C1X, Part II

Last updated Jan 1, 2003.

This part covers the features of C1X that are independent of C++11. These include Unicode support, the removal of gets(), bounds-checking functions, new macros for constructing complex values, static assertions and a new fopen() mode.

Unicode support

The Unicode standard supports three encoding forms: UTF-8, UTF-16 and UTF-32. Each form has its own advantages and disadvantages with respect to space efficiency, processing speed and portability. Currently, C programmers implement UTF-8 using the char type, UTF-16 using unsigned short or wchar_t, and UTF-32 using unsigned long or wchar_t. This situation isn't ideal, to say the least, because the size of wchar_t is implementation-defined and there are no string literals for UTF-16 and UTF-32. C1X provides appropriate data types for all the Unicode encoding forms by introducing two new data types with platform-independent widths: char16_t and char32_t. UTF-8 encoding sticks to char, as before. Additionally, C1X defines Unicode conversion functions in <uchar.h>, the corresponding u and U string literal prefixes for char16_t and char32_t strings, and the u8 prefix for UTF-8 encoded literals.
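The following is a minimal sketch of the new types and literal prefixes, assuming a C11 compiler whose library ships <uchar.h> and which encodes u and U literals as UTF-16 and UTF-32 (signalled by the __STDC_UTF_16__ and __STDC_UTF_32__ macros). The helper names utf8_units(), utf16_units() and utf32_units() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>   /* char16_t, char32_t */

/* U+1F600 lies outside the Basic Multilingual Plane, so each encoding
   form needs a different number of code units to represent it. */
const char     smile8[]  = u8"\U0001F600"; /* UTF-8 literal  */
const char16_t smile16[] = u"\U0001F600";  /* UTF-16 literal */
const char32_t smile32[] = U"\U0001F600";  /* UTF-32 literal */

/* Count code units up to, but not including, the terminating zero. */
size_t utf8_units(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

size_t utf16_units(const char16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

size_t utf32_units(const char32_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

Under those assumptions, the same character occupies four code units in UTF-8, two in UTF-16 (a surrogate pair) and one in UTF-32, which is the space trade-off mentioned above.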

Removal of gets()

gets() is declared in the header file <stdio.h>. It reads a line from the standard input and stores it in a buffer provided by the caller. For many years, gets() has been exploited in buffer overflow attacks because it doesn't know the size of its buffer and might write data past the buffer's boundaries. That is why gets() was deprecated in C99. C1X removes it entirely, offering a safer alternative called gets_s():

char *gets_s(char *buffer, rsize_t nch);

gets_s() reads at most nch - 1 characters from the standard input, reserving one element for the terminating null character, and therefore cannot overflow a buffer of nch elements.
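gets_s() belongs to the bounds-checking interfaces, and not every C library ships them. As a portable stand-in, the sketch below builds a size-aware line reader on top of fgets(). The name read_line() is hypothetical, and its behavior is an assumption of this example rather than a copy of gets_s(): when a line doesn't fit, the remainder stays in the stream instead of being discarded.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into `buf`, which holds
   `size` bytes. Returns buf on success, or NULL on end-of-file or error.
   The trailing newline, if one was read, is removed. */
char *read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    if (size > INT_MAX)          /* fgets() takes an int count */
        size = INT_MAX;

    /* fgets() stores at most size - 1 characters plus the terminator. */
    if (fgets(buf, (int)size, in) == NULL)
        return NULL;

    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

A call such as read_line(stdin, line, sizeof line) replaces gets(line); the important difference is that the callee now knows how large the buffer is.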

Bounds-Checking Functions

Technical Report 24731-1 (incorporated into C1X) defines bounds-checking versions of standard C library string handling functions. The bounds-checking versions have the _s suffix. For example, strcat() and strncpy() have safer counterparts called strcat_s() and strncpy_s(), respectively. The main difference between the unsafe functions and the bounds-checking functions is that the latter take an additional parameter that indicates the size of the buffer to which data will be written. Some of the bounds-checking functions also perform additional runtime checks to detect certain types of runtime errors. For example,

errno_t strcat_s(char * restrict s1, rsize_t s1max, const char * restrict s2); //safe

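strcat_s() takes s1max, the total size of the destination buffer s1, and never writes more than s1max bytes to it, including the terminating null character. Similarly, consider strcpy_s():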

errno_t strcpy_s(char * restrict s1, rsize_t s1max, const char * restrict s2);

This function requires that s1max be greater than the length of s2, so that the copy, including the terminating null character, fits in s1. This prevents an out-of-bounds write.
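Since the _s functions aren't available in every C library, the following sketch imitates the size check described above with a hypothetical function named copy_checked(). It is not the standardized strcpy_s(): the int return type, the error value 1 and the absence of a constraint handler are all simplifications made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strcpy_s(). Copies src into dst only if the
   string and its terminator fit in dstmax bytes.
   Returns 0 on success and 1 on failure. */
int copy_checked(char *dst, size_t dstmax, const char *src)
{
    if (dst == NULL || dstmax == 0)
        return 1;
    if (src == NULL) {
        dst[0] = '\0';           /* leave dst as a valid empty string */
        return 1;
    }

    /* Scan at most dstmax bytes of src, so that an oversized source is
       rejected without reading it any further. */
    size_t len = 0;
    while (len < dstmax && src[len] != '\0')
        len++;

    if (len == dstmax) {         /* no room left for the terminator */
        dst[0] = '\0';
        return 1;
    }

    assert(len < dstmax);        /* invariant: len + 1 bytes fit in dst */
    memcpy(dst, src, len + 1);
    return 0;
}
```

With an 8-byte buffer, copying "1234567" succeeds while copying "12345678" fails and leaves the destination as an empty string. The caller is expected to test the return value, which an unchecked strcpy() call doesn't allow.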

The original prototypes of the bounds-checking functions were developed by Microsoft's Visual C++ team. At that time, the whole issue raised controversy because the Visual C++ compiler would report that the unsafe versions of the standard C functions were allegedly deprecated, which they weren't (I reported this story in 2005). The good news is that Microsoft's bounds-checking functions, which were at that time non-portable, will become standard in C1X, although the standardized versions aren't 100% identical to the original Microsoft implementation.

Macros for Complex Values

New macros for constructing complex values (CMPLX, CMPLXF and CMPLXL, declared in <complex.h>) are available in C1X. The aim is to solve a problem with expressions such as real + imaginary*I, which might not produce the expected value if imaginary is infinite or NaN.
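A short sketch of the difference, assuming a C11 library that defines the CMPLX macro in <complex.h>. The function names make_naive() and cmplx_demo() are hypothetical. On implementations where I is a complex constant rather than an imaginary one, the naive expression may turn the real part into NaN when the imaginary part is infinite, because infinity gets multiplied by the zero real part of I.

```c
#include <assert.h>
#include <complex.h>
#include <math.h>

/* The expression form the macros are meant to replace. With an infinite
   or NaN `im`, the real part of the result may not equal `re`. */
double complex make_naive(double re, double im)
{
    return re + im * I;
}

/* CMPLX stores the two parts directly, with no arithmetic involved. */
void cmplx_demo(void)
{
    double complex z = CMPLX(0.0, INFINITY);

    assert(creal(z) == 0.0);     /* real part is preserved exactly */
    assert(isinf(cimag(z)));     /* imaginary part is infinite     */
}
```

CMPLXF and CMPLXL do the same for float complex and long double complex values.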

Static Assertions

Static assertions are a mechanism for reporting errors during source file translation. As opposed to the traditional #if and #error preprocessor directives, static assertions are evaluated at a later translation phase, when the types of expressions are known. Therefore, they enable the programmer to catch errors that are impossible to detect during the preprocessing phase.
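As an illustration, here is a sketch that uses the _Static_assert keyword (also spelled static_assert once <assert.h> is included). The struct packet_header and its expected 4-byte layout are assumptions invented for this example. They hold on common platforms with a 2-byte unsigned short, and the build stops with the given message wherever they don't.

```c
#include <assert.h>   /* provides the static_assert spelling in C11 */
#include <limits.h>
#include <stddef.h>

/* Hypothetical wire-format header used only for this example. */
struct packet_header {
    unsigned char  tag;
    unsigned char  flags;
    unsigned short length;
};

/* This check could also be written with #if and #error, because
   CHAR_BIT is a preprocessor constant. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

/* These checks cannot: the preprocessor knows nothing about types,
   so sizeof and offsetof are unavailable inside #if. */
_Static_assert(sizeof(struct packet_header) == 4,
               "packet_header must occupy exactly 4 bytes");
static_assert(offsetof(struct packet_header, length) == 2,
              "length must start at byte offset 2");

size_t packet_header_size(void)
{
    return sizeof(struct packet_header);
}
```

None of the assertions generates code. They either pass silently or abort the translation.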

New fopen() Interface

C1X introduces a new exclusive create-and-open mode ("...x") for fopen(). It also introduces a safer version of fopen() called fopen_s(). The new mode behaves like O_CREAT|O_EXCL in POSIX and is commonly used for lock files. More specifically, the "x" family of modes includes the following combinations:

  • wx: create a text file for writing with exclusive access.
  • wbx: create a binary file for writing with exclusive access.
  • w+x: create a text file for update with exclusive access.
  • w+bx or wb+x: create a binary file for update with exclusive access.

Opening a file with any of the exclusive modes above fails if the file already exists or cannot be created. Otherwise, the file is created with exclusive (non-shared) access to the extent that the underlying system supports exclusive access.
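The lock-file idiom mentioned above can be sketched as follows, assuming a C library that implements the "x" mode. The function name try_create_lock() and the file content are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: try to create `path` as a lock file.
   Returns 1 if this call created the file, or 0 if the file already
   existed or could not be created. */
int try_create_lock(const char *path)
{
    assert(path != NULL);

    /* "wx" fails instead of truncating when the file already exists. */
    FILE *fp = fopen(path, "wx");
    if (fp == NULL)
        return 0;

    fputs("locked\n", fp);
    fclose(fp);
    return 1;
}
```

The first call for a given path returns 1, and every later call returns 0 until the file is deleted, for example with remove(). With plain "w", both calls would succeed and the second would silently truncate the file.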

In Conclusion

C1X is in many respects a better version of C99. Not only does it bring C and C++11 closer by adopting some C++11 features, it also closes certain loopholes in the C99 standard, particularly with respect to code security. Additionally, it introduces an important new feature: multithreading support. These features make the C1X standard suitable for 21st century C programmers.