
Writing Portable C

C is often called a "portable assembly language," yet it's easy to write nonportable C, sometimes by design and sometimes by accident. David Chisnall considers how to avoid portability issues when writing C code.

When you choose a language to use for a particular project, you have a whole spectrum of potential choices. At the lowest level, you have assembly languages, tied to a specific CPU, and requiring you to encode things like calling conventions that tie the result to a particular operating system. At the opposite extreme, you have languages like Prolog, where you provide a high-level description of the problem and get a (typically not very efficient) solution.

The C programming language was intended to be as low-level as possible, without being tied to a specific architecture. People often refer to C as "portable assembly." It's not an assembly language, but it's about as close as you can get and remain portable, at least in theory.

Unfortunately, C does expose a number of things that are specific to various processors. One of the first languages that I learned was PL/M, an even lower-level language than C that exposed details such as the x86 segmented memory model to the programmer. C doesn't draw as sharp a distinction between portable and nonportable features: most compilers won't warn you when you write nonportable code, and often they can't even recognize it.

Size of Type

The most obvious problem with writing portable C is that the sizes of the basic types vary between platforms. C defines five integer types, in signed and unsigned variants: char, short, int, long, and long long. On the system where I first learned C, these were 8, 16, 16, and 32 bits, with long long nonexistent. That last type was only standardized in C99, which postdated my introduction to C by several years.

On the slightly more modern systems where I've done most C programming, the sizes are 8, 16, 32, 32, and 64 bits, respectively. On other, increasingly common systems, long is 64 bits, or both int and long are. The C standard requires each of these types to be at least as big as the previous one. It also sets minimum ranges for them, which work out to at least 8, 16, 16, 32, and 64 bits, but that's all.
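One way to see these differences is to ask the compiler directly. The following is a minimal sketch rather than a definitive tool; the helper names bits_of and print_integer_widths are invented for this illustration. It multiplies sizeof by CHAR_BIT (from limits.h), which gives the storage size of a type in bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: convert a size in bytes (as reported by sizeof)
 * into a size in bits. CHAR_BIT is the number of bits in a char. */
size_t bits_of(size_t bytes)
{
    return bytes * CHAR_BIT;
}

/* Hypothetical helper: print the storage size of each standard integer
 * type as seen by the compiler that built this file. */
void print_integer_widths(void)
{
    /* These minimums follow from the ranges the C standard requires. */
    assert(bits_of(sizeof(short)) >= 16);
    assert(bits_of(sizeof(long)) >= 32);
    assert(bits_of(sizeof(long long)) >= 64);

    printf("char:      %zu bits\n", bits_of(sizeof(char)));
    printf("short:     %zu bits\n", bits_of(sizeof(short)));
    printf("int:       %zu bits\n", bits_of(sizeof(int)));
    printf("long:      %zu bits\n", bits_of(sizeof(long)));
    printf("long long: %zu bits\n", bits_of(sizeof(long long)));
}
```

Calling print_integer_widths() from main on two different platforms should show the variation described above: typically only the line for long differs between a 64-bit UNIX system and Win64.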

On some systems, char is not 8 bits. This is quite uncommon, but can cause huge problems when porting code. If you combine the constraints imposed by C and POSIX, char must be exactly 8 bits, so you can assume that it will be if you only target POSIX-compliant platforms. Another issue with the char type is that, without a qualifier, it may be signed or unsigned, depending on the compiler and target platform.
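The signedness issue can be checked, and worked around, along the following lines. This is a sketch under assumptions: the function names plain_char_is_signed and byte_value are hypothetical, chosen for this example only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if plain char is signed on this
 * implementation and 0 if it is unsigned. CHAR_MIN is 0 when plain
 * char is unsigned and negative when it is signed. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: read the byte at index i as a non-negative
 * value. Converting to unsigned char first gives the same result
 * whether plain char is signed or unsigned. */
int byte_value(const char *s, size_t i)
{
    assert(s != NULL);
    return (unsigned char)s[i];
}
```

Code that indexes a lookup table with a plain char, or compares one against a value above 127, is the usual place this bites. Going through unsigned char, as byte_value does, sidesteps the difference.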

The simplest problem related to types is assuming that one of these types has a specific size. For example, a lot of Windows applications assume that long is 32 bits, and they use packed structures containing longs to represent headers in file formats. If long is 64 bits, all of these will break. This is why, unlike almost all other 64-bit platforms, Win64 comes with 32-bit longs.
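To illustrate the failure mode, here is a sketch of a file header declared two ways. The format, the struct names, and the field names are all invented for this example and do not correspond to any real file format.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Fragile (hypothetical) header: its size follows the width of long,
 * so it is 8 bytes where long is 32 bits and 16 where long is 64. */
struct header_with_long {
    long magic;
    long length;
};

/* Same hypothetical header using fixed-width types from stdint.h:
 * each field is exactly 32 bits wherever int32_t exists. */
struct header_fixed {
    int32_t magic;
    int32_t length;
};

size_t long_header_size(void)
{
    return sizeof(struct header_with_long);
}

size_t fixed_header_size(void)
{
    return sizeof(struct header_fixed);
}

/* Hypothetical startup check: aborts if the compiler laid out the
 * fixed-width header differently from the 64 bits the format expects. */
void check_header_layout(void)
{
    assert(fixed_header_size() * CHAR_BIT == 64);
}
```

With this layout, only header_with_long changes size when the code moves between a platform with 32-bit longs and one with 64-bit longs.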

Unfortunately, keeping long at 32 bits introduces some other problems. None of the primitive C types is required by the standard to be large enough to store a pointer. In practice, however, long almost always is large enough. On Win16 and Win32 it was, and it is on common UNIX systems and most embedded platforms. Therefore, a lot of existing code assumes that you can cast a void* to a long without truncation. Unless you need your code to run on Win64, you probably can.

C99 introduced the stdint.h header to address this problem. If you need a fixed-size integer, you can use int8_t, int16_t, int32_t, or int64_t, or the unsigned equivalents uint8_t through uint64_t. If you need an integer that's big enough to store a pointer, you can use intptr_t (or uintptr_t). There are a number of other useful types, such as ptrdiff_t (defined in stddef.h), for representing pointer offsets. This is typically the same size as an intptr_t, but may be smaller if a system has a flat address space but only allows individual programs to use a subset of it.
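A minimal sketch of how these types might be used follows. The function names are hypothetical; the types uintptr_t and ptrdiff_t are the standard ones from stdint.h and stddef.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a pointer in an integer. uintptr_t is
 * wide enough for this, which long is not on Win64. */
uintptr_t pointer_to_integer(void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: recover the pointer stored by
 * pointer_to_integer. */
void *integer_to_pointer(uintptr_t bits)
{
    return (void *)bits;
}

/* Hypothetical helper: number of elements from first to last, which
 * must both point into the same array. The result of subtracting two
 * pointers has type ptrdiff_t. */
ptrdiff_t elements_between(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return last - first;
}
```

A void* converted to uintptr_t and back compares equal to the original pointer, which is the guarantee that code casting through long was relying on informally.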
