31.5 Common Security-Related Programming Problems
Unfortunately, programmers are not perfect. They make mistakes. These errors can have disastrous consequences in programs that change the protection domains. Attackers who exploit these errors may acquire extra privileges (e.g., access to a system account such as root or Administrator). They may disrupt the normal functioning of the system by deleting or altering services over which they should have no control. They may simply be able to read files to which they should have no access.21 So the problem of avoiding these errors, or security holes, is a necessary issue to ensure that the programs and system function as required.
We present both management rules (installation, configuration, and maintenance) and programming rules together. Although there is some benefit in separating them, doing so creates an artificial distinction by implying that they can be considered separately. In fact, the limits on installation, configuration, and maintenance affect the implementation, just as the limits of implementation affect the installation, configuration, and maintenance procedures.
Researchers have developed several models for analyzing systems for these security holes.22 These models provide a framework for characterizing the problems. The goal of the characterization guides the selection of the model. Because we are interested in technical modeling and not in the reason or time of introduction, many of the categories of the NRL model23 are inappropriate for our needs. We also wish to analyze the multiple components of vulnerabilities rather than force each vulnerability into a particular point of view, as Aslam’s model24 does. So either the PA model25 or the RISOS model26 is appropriate. We have chosen the PA model for our analysis.
We examine each of the categories and subcategories separately. We consider first the general rules that we can draw from the vulnerability class, and then we focus on applying those rules to the program under discussion.
31.5.1 Improper Choice of Initial Protection Domain
Flaws involving improper choice of initial protection domain arise from incorrect setting of permissions or privileges. There are three objects for which permissions need to be set properly: the file containing the program, the access control file, and the memory space of the process. We will consider them separately.
22.214.171.124 Process Privileges
The principle of least privilege27 dictates that no process have more privileges than it needs to complete its task, but the process must have enough privileges to complete its task successfully.
Ideally, one set of privileges should meet both criteria. In practice, different portions of the process will need different sets of privileges. For example, a process may need special privileges to access a resource (such as a log file) at the beginning and end of its task, but may not need those privileges at other times. The process structure and initial protection domain should reflect this.
Implementation Rule 31.1. Structure the process so that all sections requiring extra privileges are modules. The modules should be as small as possible and should perform only those tasks that require those privileges.
The basis for this rule lies in the reference monitor.28 The reference monitor is verifiable, complete (it is always invoked to access the resource it protects), and tamperproof (it cannot be compromised). Here, the modules are kept small and simple (verifiable), access to the privileged resource requires the process to invoke these modules (complete), and the use of separate modules with well-defined interfaces minimizes the chances of other parts of the program corrupting the module (tamperproof).
Management Rule 31.1. Check that the process privileges are set properly.
Insufficient privileges could cause a denial of service. Excessive privileges could enable an attacker to exploit vulnerabilities in the program. To avoid these problems, the privileges of the process, and the times at which the process has these privileges, must be chosen and managed carefully.
One of the requirements of this program is availability (Requirements 31.1 and 31.4). On Linux and UNIX systems, the program must change the effective identity of the user from the user’s account to the role account. This requires special (setuid) privileges of either the role account or the superuser.29 The principle of least privilege30 says that the former is better than the latter, but if one of the role accounts is root, then having multiple copies of the program with limited privileges is irrelevant, because the program with privileges to access the root role account is the logical target of attack. After all, if one can compromise a less privileged account through this program, the same attack will probably work against the root account. Because the Drib plans to control access to root in some cases, the program requires setuid to root privileges.
If the program does not have root privileges initially, the UNIX protection model does not allow the process to acquire them; the permissions on the program file corresponding to the program must be changed. The process must log enough information for the system administrator to identify the problem,31 and should notify users of the problem so that the users can notify the system administrator. An alternative is to develop a server that will periodically check the permissions on the program file and reset them if needed, or a server that the program can notify should it have insufficient privileges. The designers felt that the benefits of these servers were not sufficient to warrant their development. In particular, they were concerned that the system administrators investigate any unexpected change in file permissions, and an automated server that changed the permissions back would provide insufficient incentive for an analysis of the problem.
As a result, the developers required that the program acquire root permission at start-up. The access control module is executed. Within that module, the privileges are reset to the user’s once the log file and access control file have been opened.32 Superuser privileges are needed only once more—to change the privileges to those of the role account should access be granted. This routine, also in a separate module, supplies the granularity required to provide the needed functionality while minimizing the time spent executing with root privileges.
126.96.36.199 Access Control File Permissions
Biba’s models33 emphasize that the integrity of the process relies on both the integrity of the program and the integrity of the access control file. The former requires that the program be properly protected so that only authorized personnel can alter it. The system managers must determine who the “authorized personnel” are. Among the considerations here are the principle of separation of duty34 and the principle of least privilege.35
Verifying the integrity of the access control file is critical, because that file controls the access to role accounts. Some external mechanism, such as a file integrity checking tool, can provide some degree of assurance that the file has not changed. However, these checks are usually periodic, and the file might change after the check. So the program itself should check the integrity of the file when the program is run.
Management Rule 31.2. The program that is executed to create the process, and all associated control files, must be protected from unauthorized use and modification. Any such modification must be detected.
In many cases, the process will rely on the settings of other files or on some other external resources. Whenever possible, the program should check these dependencies to ensure that they are valid. The dependencies must be documented so that installers and maintainers will understand what else must be maintained in order to ensure that the program works correctly.
Implementation Rule 31.2. Ensure that any assumptions in the program are validated. If this is not possible, document them for the installers and maintainers, so they know the assumptions that attackers will try to invalidate.
The permissions of the program, and its containing directory, are to be set so only root can alter or move the program. According to Requirement 31.2, only root can alter the access control file. Hence, the file must be owned by root, and only root can write to it. The program should check the ownership and permissions of this file, and the containing directories, to validate that only root can alter it.
EXAMPLE: The naive way to check that only root can write to the file is to check that the owner is root and that the file permissions allow only the owner to write to it. But consider the group permissions. If root is the only member of the group, then the group permissions may allow members of the group to write to the file. The problem is that checking group membership is more complicated than looking up the members of the group. A user may belong to a group without being listed as a member, because the GID of the user is assigned from the password file, and group membership lists are contained in a different file.36 Either the password file and the group membership list must both be checked, or the program should simply report an error if anyone other than the user can write to the file. For simplicity,37 the designers chose the second approach.
188.8.131.52 Memory Protection
As the program runs, it depends on the values of variables and other objects in memory. This includes the executable instructions themselves. Thus, protecting memory against unauthorized or unexpected alteration is critical.
Consider sharing memory. If two subjects can alter the contents of memory, then one could change data on which the second relies. Unless such sharing is required (for example, by concurrent processes), it poses a security problem because the modifying process can alter variables that control the action of the other process. Thus, each process should have a protected, unshared memory space.
If the memory is represented by an object that processes can alter, it should be protected so that only trusted processes can access it. Access here includes not only modification but also reading, because passwords reside in memory after they are types. Multiple abstractions are discussed in more detail in the next section.
Implementation Rule 31.3. Ensure that the program does not share objects in memory with any other program, and that other programs cannot access the memory of a privileged process.
Interaction with other processes cannot be eliminated. If the running process obtains input or data from other processes, then that interface provides a point through which other processes can reach the memory. The most common version of this attack is the buffer overflow.
Buffer overflows involve either altering of data or injecting of instructions that can be executed later. There are a wide variety of techniques for this [32, 706].38 Several remedies exist. For example, if buffers reside in sections of memory that are not executable, injecting instructions will not work. Similarly, if some data is to remain unaltered, the data can be stored in read-only memory.
Management Rule 31.3. Configure memory to enforce the principle of least privilege. If a section of memory is not to contain executable instructions, turn execute permission off for that section of memory. If the contents of a section of memory are not to be altered, make that section read-only.
These rules appear in three ways in our program. First, the implementers use the language constructs to flag unchanging data as constant (in the C programming language, this is the keyword const). This will cause compile-time errors if the variables are assigned to, or runtime errors if instructions try to alter those constants.
The other two ways involve program loading. The system’s loader places data in three areas: the data (initialized data) segment, the stack (used for function calls and variables local to the functions), and the heap (used for dynamically allocated storage). A common attack is to trick a program into executing instructions injected into three areas. The vector of injection can be a buffer overflow,39 for example. The characteristic under discussion does not stop such alteration, but it should prevent the data from being executed by making the segments or pages of all three areas nonexecutable. This suffices for the data and stack segments and follows Management Rule 31.3.
If the program uses dynamic loading to load functions at runtime, the functions that are loaded may change over the lifetime of the program. This means that the assumptions the programmers make may no longer be valid.40 One solution to this problem is to compile the program in such a way that it does not use dynamic loading. This also also prevents the program from trying to load a module at runtime that may be missing. This could occur if a second process deleted the appropriate library. So disabling of dynamic loading also follows Implementation Rule 31.3.41
Finally, some UNIX-like systems (including the one on which this program is being developed) allow execution permission to be turned off for the stack. The boot file sets the kernel flag to enforce this.
184.108.40.206 Trust in the System
This analysis overlooks several system components. For example, the program relies on the system authentication mechanisms to authenticate the user, and on the user information database to map users and roles into their corresponding UIDs (and, therefore, privileges). It also relies on the inability of ordinary users to alter the system clock. If any of this supporting infrastructure can be compromised, the program will not work correctly. The best that can be done is to identify these points of trust in the installation and operation documentation so that the system administrators are aware of the dependencies of the program on the system.
Management Rule 31.4. Identify all system components on which the program depends. Check for errors whenever possible, and identify those components for which error checking will not work.
For this program, the implementers should identify the system databases and information on which the program depends, and should prepare a list of these dependencies. They should discuss these dependencies with system managers to determine if the program can check for errors. When this is not possible, or when the program cannot identify all errors, they should describe the possible consequences of the errors. This document should be distributed with the program so that system administrators can check their systems before installing the program.
31.5.2 Improper Isolation of Implementation Detail
The problem of improper isolation of implementation detail arises when an abstraction is improperly mapped into an implementation detail. Consider how abstractions are mapped into implementations. Typically, some function (such as a database query) occurs, or the abstraction corresponds to an object in the system. What happens if the function produces an error or fails in some other way, or if the object can be manipulated without reference to the abstraction?
The first rule is to catch errors and failures of the mappings. This requires an analysis of the functions and a knowledge of their implementation. The action to take on failure also requires thought. In general, if the cause cannot be determined, the program should fail by returning the relevant parts of the system to the states they were in when the program began.42
Implementation Rule 31.4. The error status of every function must be checked. Do not try to recover unless the cause of the error, and its effects, do not affect any security considerations. The program should restore the state of the system to the state before the process began, and then terminate.
The abstractions in this program are the notion of a user and a role, the access control information, and the creation of a process with the rights of the role. We will examine these abstractions separately.
220.127.116.11 Resource Exhaustion and User Identifiers
The notion of a user and a role is an abstraction because the program can work with role names and the operating system uses integers (UIDs). The question is how those user and role names are mapped to UIDs. Typically, this is done with a user information database that contains the requisite mapping, but the program must detect any failures of the query and respond appropriately.
EXAMPLE: A mail server allowed users to forward mail by creating a forwarding file . The forwarding file could specify files to which the mail should be appended. In this case, the mail server would deliver the letter with the privileges of the owner of the forwarding file (represented on the system as an integer UID). In some cases, the mail server would queue the message for later delivery. When it did so, it would write the name (not the UID) of the user into a control file. The system queried a database, supplying the UID, and obtaining the corresponding name. If the query failed, the mail server used a default name specified by the system administrator.
Attackers discovered how to make the queries fail. As a result, the user was set to a default user, usually a system-level user (such as daemon). This enabled the attackers to have the mail server append mail to any file to which the default user could write. They used this to implant Trojan horses into system programs. These Trojan horses gave them extra privileges, compromising the system.
The designers and implementers decided to have the program fail if, for any reason, the query failed. This application of the principle of fail-safe defaults43 ensured that in case of error, the users would not get access to the role account.
18.104.22.168 Validating the Access Control Entries
The access control information implements the access control policy (an abstraction). The expression of the access control information is therefore the result of mapping an abstraction to an implementation. The question is whether or not the given access control information correctly implements the policy. Answering this question requires someone to examine the implementation expression of the policy.
The programmers developed a second program that used the same routines as the role-assuming program to analyze the access control entries. This program prints the access control information in an easily readable format. It allows the system managers to check that the access control information is correct. A specific procedure requires that this information be checked periodically, and always after the file or the program is altered.
22.214.171.124 Restricting the Protection Domain of the Role Process
Creating a role process is the third abstraction. There are two approaches. Under UNIX-like systems, the program can spawn a second, child, process. It can also simply start up a second program in such a way that the parent process is replaced by the new process. This technique, called overlaying, is intrinsically simpler than creating a child process and exiting. It allows the process to replace its own protection domain with the (possibly) more limited one corresponding to the role. The programmers elected to use this method. The new process inherits the protection domain of the original one. Before the overlaying, the original process must reset its protection domain to that of the role. The programmers do so by closing all files that the original process opened, and changing its privileges to those of the role.
EXAMPLE: The effective UIDs and GIDs44 control privileges. Hence, the programmers reset the effective GID first, and then the effective UID (if resetting were done in the opposite order, the change to GIDs would fail because such changes require root privileges). However, if the UNIX-like system supports saved UIDs, an authorized user may be able to acquire root privileges even if the role account is not root. The problem is that resetting the effective UID sets the saved UID to the previous UID—namely, root. A process may then reacquire the rights of its saved UID. To avoid this problem, the programmers used the setuid system call to reset all of the real, effective, and saved UIDs to the UID of the role. Thus, all traces of the root UID are eliminated and the user cannot reacquire those privileges.
Similarly, UNIX-like systems check access permissions only when the file is opened. If a root process opens a privileged file and then the process drops root privileges, it can still read from (or write to) the file.
The components of the protection domain that the process must reset before the overlay are the open files (except for standard input, output, and error), which must be closed, the signal handlers, which must be reset to their default values, and any user-specific information, which must be cleared.
31.5.3 Improper Change
This category describes data and instructions that change over time. The danger is that the changed values may be inconsistent with the previous values. The previous values dictate the flow of control of the process. The changed values cause the program to take incorrect or nonsecure actions on that path of control.
The data and instructions can reside in shared memory, in nonshared memory, or on disk. The last includes file attribute information such as ownership and access control list.
First comes the data in shared memory. Any process that can access shared memory can manipulate data in that memory. Unless all processes that can access the shared memory implement a concurrent protocol for managing changes, one process can change data on which a second process relies. As stated above, this could cause the second process to violate the security policy.
EXAMPLE: Two processes share memory. One process reads authentication data and writes it into the shared memory space. The second process performs the authentication, and writes a boolean true back into the shared memory space if the authentication succeeds, and false if it fails. Unless the two processes use concurrent constructs to synchronize their reading and writing, the first process may read the result before the second process has completed the computation for the current data. This could allow access when it should be denied, or vice versa.
Implementation Rule 31.5. If a process interacts with other processes, the interactions should be synchronized. In particular, all possible sequences of interactions must be known and, for all such interactions, the process must enforce the required security policy.
A variant of this situation is the asynchronous exception handler. If the handler alters variables and then returns to the previous point in the program, the changes in the variables could cause problems similar to the problem of concurrent processes. For this reason, if the exception handler alters any variables on which other portions of the code depend, the programmer must understand the possible effects of such changes. This is just like the earlier situation in which a concurrent process changes another’s variables in a shared memory space.
Implementation Rule 31.6. Asynchronous exception handlers should not alter any variables except those that are local to the exception handling module. An exception handler should block all other exceptions when begun, and should not release the block until the handler completes execution, unless the handler has been designed to handle exceptions within itself (or calls an uninvoked exception handler).
A second approach applies whether the memory is shared or not. A user feeds bogus information to the program, and the program accepts it. The bogus data overflows its buffer, changing other data, or inserting instructions that can be executed later.
EXAMPLE: The buffer overflow attack on fingerd described in Section 126.96.36.199 illustrates this approach. The return address is pushed onto the stack when the input routine is called. That address is not expected to change between its being pushed onto the stack and its being popped from the stack, but the buffer overflow changes it. When the input function returns, the address popped from the stack is that of the input buffer. Execution resumes at that point, and the input instructions are used.
This suggests one way to detect such transformations (the stack guard approach) . Immediately after the return address is pushed onto the stack, push a random number onto the stack (the canary). Assume that the input overflows the buffer on the stack and alters the return address on the stack. If the canary is n bits long and has been chosen randomly, the probability of the attacker not changing that cookie is 2–n. When the input procedure returns, the canary is popped and compared with the value that was pushed onto the stack. If the two differ, there has been an overflow.45
In terms of trust, the return address (a trusted datum) can be affected by untrusted data (from the input). This lowers the trustworthiness of the return address to that of input data. One need not supply instructions to breach security.
EXAMPLE: One (possibly apocryphal) version of a UNIX login program allocated two adjacent arrays. The first held the user’s cleartext password and was 80 characters long, and the second held the password hash46 and was 13 characters long. The program’s logic loaded the password hash into the second array as soon as the user’s name was determined. It then read the user’s cleartext password and stored it in the first array. If the contents of the first array hashed to the contents of the second array, the user was authenticated. An attacker simply selected a random password (for example, “password”) and generated a valid hash for it (here, “12CsGd8FRcMSM”). The attacker then identified herself as root. When asked for a password, the attacker entered “password”, typed 72 spaces, and then typed “12CsGd8FRcMSM”. The system hashed “password”, got “12CsGd8FRcMSM”, and logged the user in as root.
A technique in which canaries protect data, not only the return address, would work, but raises many implementation problems (see Exercise 7).
Implementation Rule 31.7. Whenever possible, data that the process trusts and data that it receives from untrusted sources (such as input) should be kept in separate areas of memory. If data from a trusted source is overwritten with data from an untrusted source, a memory error will occur.
In more formal terms, the principle of least common mechanism47 indicates that memory should not be shared in this way.
These rules apply to our program in several ways. First, the program does not interact with any other program except through exception handling.48 So Implementation Rule 31.5 does not apply. Exception handling consists of calling a procedure that disables further exception handling, logs the exception, and immediately terminates the program.
Illicit alteration of data in memory is the second potential problem. If the user-supplied data is read into memory that overlaps with other program data, it could erase or alter that data. To satisfy Implementation Rule 31.7, the programmers did not reuse variables into which users could input data. They also ensured that each access to a buffer did not overlap with other buffers.
The problem of buffer overflow is solved by checking all array and pointer references within the code. Any reference that is out of bounds causes the program to fail after logging an error message to help the programmers track down the error.
188.8.131.52 Changes in File Contents
File contents may change improperly. In most cases, this means that the file permissions are set incorrectly or that multiple processes are accessing the file, which is similar to the problem of concurrent processes accessing shared memory. Management Rule 31.2 and Implementation Rule 31.5 cover these two cases.
A nonobvious corollary is to be careful of dynamic loading. Dynamic load libraries are not part of this program’s executable. They are loaded, as needed, when the program runs. Suppose one of the libraries is changed, and the change causes a side effect. The program may cease to function or, even worse, work incorrectly.
If the dynamic load modules cannot be altered, then this concern is minimal, but if they can be upgraded or otherwise altered, it is important. Because one of the reasons for using dynamic load libraries is to allow upgrades without having to recompile programs that depend on the library, security-related programs using dynamic load libraries are at risk.
Implementation Rule 31.8. Do not use components that may change between the time the program is created and the time it is run.
This is another reason that the developers decided not to use dynamic loading.
184.108.40.206 Race Conditions in File Accesses
A race condition in this context is the time-of-check-to-time-of-use problem. As with memory accesses, the file being used is changed after validation but before access.49 To thwart it, either the file must be protected so that no untrusted user can alter it, or the process must validate the file and use it indivisibly. The former requires appropriate settings of permission, so Management Rule 31.2 applies. Section 31.5.7, “Improper Indivisibility,” discusses the latter.
This program validates that the owner and access control permissions for the access control file are correct (the check). It then opens the file (the use). If an attacker can change the file after the validation but before the opening, so that the file checked is not the file opened, then the attacker can have the program obtain access control information from a file other than the legitimate access control file. Presumably, the attacker would supply a set of access control entries allowing unauthorized accesses.
EXAMPLE: The UNIX operating system allows programs to refer to files in two ways: by name and by file descriptor.50 Once a file descriptor is bound to a file, the referent of the descriptor does not change. Each access through the file descriptor always refers to the bound file (until the descriptor is closed). However, the kernel reprocesses the file name at each reference, so two references to the same file name may refer to two different files. An attacker who is able to alter the file system in such a way that this occurs is exploiting a race condition. So any checks made to the file corresponding to the first use of the name may not apply to the file corresponding to the second use of the name. This can result in a process making unwarranted assumptions about the trustworthiness of the file and the data it contains.
In the xterm example51 the program can be fixed by opening the file and then using the file descriptor (handle) to obtain the owner and access permissions.52 Those permissions belong to the opened file, because they were obtained using the file descriptor. The validation is now ensured to be that of the access control file.
The program does exactly this. It opens the access control file and uses the file descriptor, which references the file attribute information directly to obtain the owner, group, and access control permissions. Those permissions are checked. If they are correct, the program uses the file descriptor to read the file. Otherwise, the file is closed and the program reports a failure.
31.5.4 Improper Naming
Improper naming refers to an ambiguity in identifying an object. Most commonly, two different objects have the same name. The programmer intends the name to refer to one of the objects, but an attacker manipulates the environment and the process so that the name refers to a different object. Avoiding this flaw requires that every object be unambiguously identified. This is both a management concern and an implementation concern.
Objects must be uniquely identifiable or completely interchangeable. Managing these objects means identifying those that are interchangeable and those that are not. The former objects need a controller (or set of controllers) that, when given a name, selects one of the objects. The latter objects need unique names. The managers of the objects must supply those names.
Management Rule 31.5. Unique objects require unique names. Interchangeable objects may share a name.
A name is interpreted within a context. At the implementation level, the process must force its own context into the interpretation, to ensure that the object referred to is the intended object. The context includes information about the character sets, process and file hierarchies, network domains, and any accessible variables such as the search path.
EXAMPLE: Stage 3 in Section 24.2.9 discussed an attack in which a privileged program called loadmodule executed a second program named ld.so. The attack exploited loadmodule’s failure to specify the context in which ld.so was named. Loadmodule used the context of the user invoking the program. Normally, this caused the correct ld.so to be invoked. In the example, the attacker changed the context so that another version of ld.so was executed. This version had a Trojan horse that would grant privileged access. When the attacker executed loadmodule, the Trojan horse was triggered and maximum privileges were acquired.
Implementation Rule 31.9. The process must ensure that the context in which an object is named identifies the correct object.
This program uses names for external objects in four places: the name of the access control file, the names of the users and roles, the names of the hosts, and the name of the command interpreter (the shell) that the program uses to execute commands in the role account.
The two file names (access control file and command interpreter) must identify specific files. Absolute path names specify the location of the object with respect to a distinguished directory called / or the “root directory.” However, a privileged process can redefine / to be any directory.53 This program does not do so. Furthermore, if the root directory is anything other than the root directory of the system, a trusted process has executed it. No untrusted user could have done so. Thus, as long as absolute path names are specified, the files are unambiguously named.
The name provided may be interpreted in light of other aspects of the environment. For example, differences in the encoding of characters can transform file names. Whether characters are made up of 16 bits, 8 bits, or 7 bits can change the interpretation, and therefore the referent, of a file name. Other environment variables can change the interpretation of the path name. This program simply creates a new, known, safe environment for execution of the commands.54
This has two advantages over sanitization of the existing context. First, it avoids having the program analyze the environment in detail. The meaning of each aspect of the environment need not be analyzed and examined. The environment is simply replaced. Second, it allows the system to evolve without compromising the security of the program. For example, if a new environment variable is assigned a meaning that affects how programs are executed, the variable will not affect how this program executes its commands because that variable will not appear in the command’s environment. So this program closes all file descriptors, resets signal handlers, and passes a new set of environment variables for the command.
These actions satisfy Implementation Rule 31.9.
The developers assumed that the system was properly maintained, so that the names of the users and roles would map into the correct UIDs. (Section 220.127.116.11 discusses this.) This applies to Management Rule 31.5.
The host names are the final set of names. These may be specified by names or IP addresses. If the former, they must be fully qualified domain names to avoid ambiguity. To see this, suppose an access control entry allows user matt to access the role wheel when logging in from the system amelia. Does this mean the system named amelia in the local domain, or any system named amelia from any domain? Either interpretation is valid. The former is more reasonable,55 and applying this interpretation resolves the ambiguity. (The program implicitly maps names to fully qualified domain names using the former interpretation. Thus, amelia in the access control entry would match a host named amelia in the local domain, and not a host named amelia in another domain.) This implements Implementation Rule 31.9.56
As a side note, if the local network is mismanaged or compromised, the name amelia may refer to a system other than the one intended. For example, the real host amelia may crash or go offline. An attacker can then reset the address of his host to correspond to amelia. This program will not detect the impersonation.
31.5.5 Improper Deallocation or Deletion
Failing to delete sensitive information raises the possibility of another process seeing that data at a later time. In particular, cryptographic keywords, passwords, and other authentication information should be discarded once they have been used. Similarly, once a process has finished with a resource, that resource should be deallocated. This allows other processes to use that resource, inhibiting denial of service attacks.
A consequence of not deleting sensitive information is that dumps of memory, which may occur if the program receives an exception or crashes for some other reason, contain the sensitive data. If the process fails to release sensitive resources before spawning unprivileged subprocesses, those unprivileged subprocesses may have access to the resource.
Implementation Rule 31.10. When the process finishes using a sensitive object (one that contains confidential information or one that should not be altered), the object should be erased, then deallocated or deleted. Any resources not needed should also be released.
Our program uses three pieces of sensitive information. The first is the cleartext password, which authenticates the user. The password is hashed, and the hash is compared with the stored hash. Once the hash of the entered password has been computed, the process must delete the cleartext password. So it overwrites the array holding the password with random bytes.
The second piece of sensitive information is the access control information. Suppose an attacker wanted to gain access to a role account. The access control entries would tell the attacker which users could access that account using this program. To prevent the attacker from gaining this information, the developers decided to keep the contents of the access control file confidential. The program accesses this file using a file descriptor. File descriptors remain open when a new program overlays a process. Hence, the program closes the file descriptor corresponding to the access control file once the request has been validated (or has failed to be validated).
The third piece of sensitive information is the log file. The program alters this file. If an unprivileged program such as one run by this program were to inherit the file descriptor, it could flood the log. Were the log to fill up, the program could no longer log failures. So the program also closes the log file before spawning the role’s command.
31.5.6 Improper Validation
The problem of improper validation arises when data is not checked for consistency and correctness. Ideally, a process would validate the data against the more abstract policies to ensure correctness. In practice, the process can check correctness only by looking for error codes (indicating failure of functions and procedures) or by looking for patently incorrect values (such as negative numbers when positive ones are required).
As the program is designed, the developers should determine what conditions must hold at each interface and each block of code. They should then validate that these conditions hold.
What follows is a set of validations that are commonly overlooked. Each program requires its own analysis, and other types of validation may be critical to the correct, secure functioning of the program, so this list is by no means complete.
18.104.22.168 Bounds Checking
Errors of validation often occur when data is supposed to lie within bounds. For example, a buffer may contain entries numbered from 0 to 99. If the index used to access the buffer elements takes on a value less than 0 or greater than 99, it is an invalid operand because it accesses a nonexistent entry. The variable used to access the element may not be an integer (for example, it may be a set element or pointer), but in any case it must reference an existing element.
Implementation Rule 31.11. Ensure that all array references access existing elements of the array. If a function that manipulates arrays cannot ensure that only valid elements are referenced, do not use that function. Find one that does, write a new version, or create a wrapper.
In this example program, all loops involving arrays compare the value of the variable referencing the array against the indexes (or addresses) of both the first and last elements of the array. The loop terminates if the variable’s value is outside those two values. This covers all loops within the program, but it does not cover the loops in the library functions.
For loops in the library functions, bounds checking requires an analysis of the functions used to manipulate arrays. The most common type of array for which library functions are used is the character string, which is a sequence of characters (bytes) terminating with a 0 byte. Because the length of the string is not encoded as part of the string, functions cannot determine the size of the array containing the string. They simply operate on all bytes until a 0 byte is found.
EXAMPLE: The program sometimes must copy character strings (defined in C as arrays of character data terminating with a byte containing 0). The canonical function for copying strings does no bounds checking. This function, strcpy(x, y), copies the string from the array y to the array x, even if the string is too long for x. A different function, strncpy(x, y, n), copies at most n characters from array y to array x. However, unlike strcpy, strncpy may not copy the terminating 0 byte.57 The program must take two actions when strncpy is called. First, it must insert a 0 byte at the end of the x array. This ensures that the contents of x meet the definition of a string in C. Second, the process must check that both x and y are arrays of characters, and that n is a positive integer.
The programmers use only those functions that bound the sizes of arrays. In particular, the function fgets is used to read input, because it allows the programmer to specify the maximum number of characters to be read. (This solves the problem that plagued fingerd.58)
22.214.171.124 Type Checking
Failure to check types is another common validation problem. If a function parameter is an integer, but the actual argument passed is a floating point number, the function will interpret the bit pattern of the floating point number as an integer and will produce an incorrect result.
Implementation Rule 31.12. Check the types of functions and parameters.
A good compiler and well-written code will handle this particular problem. All functions should be declared before they are used. Most programming languages allow the programmer to specify the number and types of arguments, as well as the type of the return value (if any). The compiler can then check the types of the declarations against the types of the actual arguments and return values.
Implementation Rule 31.13. When compiling programs, ensure that the compiler reports inconsistencies in types. Investigate all such warnings and either fix the problem or document the warning and why it is spurious.
126.96.36.199 Error Checking
A third common problem involving improper validation is failure to check return values of functions. For example, suppose a program needs to determine ownership of a file. It calls a system function that returns a record containing information from the file attribute table. The program obtains the owner of the file from the appropriate field of the record. If the function fails, the information in the record is meaningless. So, if the function’s return status is not checked, the program may act erroneously.
Implementation Rule 31.14. Check all function and procedure executions for errors.
This program makes extensive use of system and library functions, as well as its own internal functions (such as the access control module). Every function returns a value, and the value is checked for an error before the results of the function are used. For example, the function that obtains the ownership and access permissions of the access control file would return meaningless information should the function fail. So the function’s return value is checked first for an error; if no error has occurred, then the file attribute information is used.
As another example, the program opens a log file. If the open fails, and the program tries to write to the (invalid) file descriptor obtained from the function that failed, the program will terminate as a result of an exception. Hence, the program checks the result of opening the log file.
188.8.131.52 Checking for Valid, Not Invalid, Data
Validation should apply the principle of fail-safe defaults.59 This principle requires that valid values be known, and that all other values be rejected. Unfortunately, programmers often check for invalid data and assume that the rest is valid.
EXAMPLE: A metacharacter is a character that is interpreted as something other than itself. For example, to the UNIX shells, the character “?” is a metacharacter that represents all single character files. A vendor upgraded its version of the command interpreter for its UNIX system. The new command interpreter (shell) treated the character “ `” (back quote) as a delimiter for a command (and hence a metacharacter). The old shell treated the back quote as an ordinary character. Included in the distribution was a program for executing commands on remote systems. The set of allowed commands was restricted. This program carefully checked that the command was allowed, and that it contained no metacharacters, before sending it to a shell on the remote system. Unfortunately, the program checked a list of metacharacters to be rejected, rather than checking a list of characters that were allowed in the commands. As a result, one could embed a disallowed command within a valid command request, because the list of metacharacters was not updated to include the back quote.
Implementation Rule 31.15. Check that a variable’s values are valid.
This program checks that the command to be executed matches one of the authorized commands. It does not have a set of commands that are to be denied. The program will detect an invalid command as one that is not listed in the set of authorized commands for that user accessing that role at the time and place allowed.
As discussed in Section 184.108.40.206, it is possible to allow all users except some specific users access to a role by an appropriate access control entry (using the keyword not). The developers debated whether having this ability was appropriate because its use could lead to violations of the principle of fail-safe defaults. On one key system, however, the only authorized users were system administrators and one or two trainees. The administrators wanted the ability to shut the trainees out of certain roles. So the developers added the keyword and recommended against its use except in that single specific situation.
Implementation Rule 31.16. If a trade-off between security and other factors results in a mechanism or procedure that can weaken security, document the reasons for the decision, the possible effects, and the situations in which the compromise method should be used. This informs others of the trade-off and the attendant risks.
220.127.116.11 Checking Input
All data from untrusted sources must be checked. Users are untrusted sources. The checking done depends on the way the data is received: into an input buffer (bounds checking) or read in as an integer (checking the magnitude and sign of the input).
Implementation Rule 31.17. Check all user input for both form and content. In particular, check integers for values that are too big or too small, and check character data for length and valid characters.
The program determines what to do on the basis of at least two pieces of data that the user provides: the role name and the command (which, if omitted, means unrestricted access).60 Users must also authenticate themselves appropriately. The program must first validate that the supplied password is correct. It then checks the access control information to determine whether the user is allowed access to the role at that time and from that location.
The length of the input password must be no longer than the buffer in which it is placed. Similarly, the lines of the access control file must not overflow the buffer allocated for it. The contents of the lines of the access control file must make up a valid access control entry. This is most easily done by constraining the format of the contents of the file, as discussed in the next section.
An excellent example of the need to constrain user input comes from formatted print statements in C.
EXAMPLE: The printf function’s first parameter is a character string that indicates how printf is to format output data. The following parameters contain the data. For example,
printf("%d %d\n", i, j);
prints the values of i and j. Some versions of this library function allow the user to store the number of characters printed at any point in the string. For example, if i contains 2, j contains 21, and m and n are integer variables,
printf("%d %d%n %d\n%n", i, j, &m, i, &n);
2 21 2
and stores 4 in m and 7 in n, because four characters are printed before the first “%n” and seven before the second “%n” (the sequence “\n” is interpreted as a single character, the newline). Now, suppose the user is asked for a file name. This input is stored in the array str. The program then prints the file name with
If the user enters the file name “log%n”, the function will overwrite some memory location with the integer 3. The exact location depends on the contents of the program stack, and with some experimentation it is possible to cause the program to change the return address stored on the stack. This leads to the buffer overflow attack described earlier.
18.104.22.168 Designing for Validation
Sometimes data cannot be validated completely. For example, in the C programming language, a programmer can test for a NULL pointer (meaning that the pointer does not hold the address of any object), but if the pointer is not NULL, checking the validity of the pointer may be very difficult (or impossible). Using a language with strong type checking is another example.
The consequence of the need for validation requires that data structures and functions be designed and implemented in such a way that they can be validated. For example, because C pointers cannot be properly validated, programmers should not pass pointers or use them in situations in which they must be validated. Methods of data hiding, type checking, and object-oriented programming often provide mechanisms for doing this.
Implementation Rule 31.18. Create data structures and functions in such a way that they can be validated.
An example will show the level of detail necessary for validation. The entries in the access control file are designed to allow the program to detect obvious errors. Each access control entry consists of a block of information in the following format:
role name user comma-separated list of users location comma-separated list of locations time comma-separated list of times command program and arguments ... command program and arguments endrole
This defines each component of the entry. (The lines need not be in any particular order.) The syntax is well-defined, and the access control module in the program checks for syntax errors. The module also performs other checks, such as searching for invalid user names in the user field and requiring that the full path names of all commands be specified. Finally, note that the module computes the number of commands for the module’s internal record. This eliminates a possible source of error—namely, that the user may miscount the number of commands.
In case of any error, the process logs the error, if possible, and terminates. It does not allow the user to access the role.
31.5.7 Improper Indivisibility
Improper indivisibility61 arises when an operation is considered as one unit (indivisible) in the abstract but is implemented as two units (divisible). The race conditions discussed in Section 22.214.171.124 provide one example. The checking of the access control file attributes and the opening of that file are to be executed as one operation. Unfortunately, they may be implemented as two separate operations, and an attacker who can alter the file after the first but before the second operation can obtain access illicitly. Another example arises in exception handling. Often, program statements and system calls are considered as single units or operations when the implementation uses many operations. An exception divides those operations into two sets: the set before the exception, and the set after the exception. If the system calls or statements rely on data not changing during their execution, exception handlers must not alter the data.
Section 31.5.3 discusses handling of these situations when the operations cannot be made indivisible. Approaches to making them indivisible include disabling interrupts and having the kernel perform operations. The latter assumes that the operation is indivisible when performed by the kernel, which may be an incorrect assumption.
Implementation Rule 31.19. If two operations must be performed sequentially without an intervening operation, use a mechanism to ensure that the two cannot be divided.
In UNIX systems, the problem of divisibility arises with root processes such as the program under consideration. UNIX-like systems do not enforce the principle of complete mediation.62 For root, access permissions are not checked. Recall the xterm example in Section 24.3.1. A user needed to log information from the execution of xterm, and specified a log file. Before appending to that file, xterm needed to ensure that the real UID could write to the log file. This required an extra system call. As a result, operations that should have been indivisible (the access check followed by the opening of the file) were actually divisible. One way to make these operations indivisible on UNIX-like systems is to drop privileges to those of the real UID, then open the file. The access checking is done in the kernel as part of the open.
Improper indivisibility arises in our program when the access control module validates and then opens the access control file. This should be a single operation, but because of the semantics of UNIX-like systems, it must be performed as two distinct operations. It is not possible to ensure the indivisibility of the two operations. However, it is possible to ensure that the target of the operations does not change, as discussed in Section 31.5.3, and this suffices for our purposes.
126.96.36.199 Improper Sequencing
Improper sequencing means that operations are performed in an incorrect order. For example, a process may create a lock file and then write to a log file. A second process may also write to the log file, and then check to see if the lock file exists. The first program uses the correct sequence of calls; the second does not (because that sequence allows multiple writers to access the log file simultaneously).
Implementation Rule 31.20. Describe the legal sequences of operations on a resource or object. Check that all possible sequences of the program(s) involved match one (or more) legal sequences.
In our program, the sequence of operations in the design shown in Section 188.8.131.52 follows a proper order. The user is first authenticated. Then the program uses the access control information to determine if the requested access is valid. If it is, the appropriate command is executed using a new, safe environment.
A second sequence of operations occurs when privileges to the role are dropped. First, group privileges are changed to those of the role. Then all user identification numbers are changed to those of the role. A common error is to switch the user identification numbers first, followed by the change in group privileges. Because changing group privileges requires root privileges, the change will fail. Hence, the programmers used the stated ordering.
31.5.8 Improper Choice of Operand or Operation
Preventing errors of choosing the wrong operand or operation requires that the algorithms be thought through carefully (to ensure that they are appropriate). At the implementation level, this requires that operands be of an appropriate type and value, and that operations be selected to perform the desired functions. The difference between this type of error and improper validation lies in the program. Improper implementation refers to a validation failure. The operands may be appropriate, but no checking is done. In this category, even though the operands may have been checked, they may still be inappropriate.
EXAMPLE: The UNIX program su allows a user to substitute another user’s identity, obtaining the second user’s privileges. According to an apocryphal story, one version of this program granted the user root privileges if the user information database did not exist (see Exercise 10 in Chapter 14). If the program could not open the user information database file, it assumed that the database did not exist. This was an inappropriate choice of operation because one could block access to the file even when the database existed.
Assurance techniques63 help detect these problems. The programmer documents the purpose of each function and then checks (or, preferably, others check) that the algorithms in the function work properly and that the code correctly implements the algorithms.
Management Rule 31.6. Use software engineering and assurance techniques (such as documentation, design reviews, and code reviews) to ensure that operations and operands are appropriate.
Within our program, many operands and operations control the granting (and denying) of access, the changing to the role, and the execution of the command. We first focus on the access part of the program, and afterwards we consider two other issues.
First, a user is granted access only when an access control entry matches all characteristics of the current session. The relevant characteristics are the role name, the user’s UID, the role’s name (or UID), the location, the time, and the command. We begin by checking that if the characteristics match, the access control module returns true (allowing access). We also check that the caller grants access when the module returns true and denies access when the module returns false.
Next, we consider the user’s UID. That object is of type uid t. If the interface to the system database returns an object of a different type, conversion becomes an issue. Specifically, many interfaces treat the UID as an integer. The difference between the types int and uid t may cause problems. On the systems involved, uid t is an unsigned integer. Since we are comparing signed and unsigned integers, C simply converts the signed integers to unsigned integers, and the comparison succeeds. Hence, the choice of operation (comparison here) is proper.
Checking location requires the program to derive the user’s location, as discussed above, and pass it to the validator. The validator takes a string and determines whether it matches the pattern in the location field of the access control entry. If the string matches, the module should continue; otherwise, it should terminate and return false.
Unlike the location, a variable of type time t contains the current time. The time checking portion of the module processes the string representing the allowed times and determines if the current time falls in the range of allowed times. Checking time is different than checking location because legal times are ranges, except in one specific situation: when an allowed time is specified to the exact second. A specification of an exact time is useless, because the program may not obtain the time at the exact second required. This would lead to a denial of service, violating Requirement 31.4. Also, allowing exact times leads to ambiguity.
EXAMPLE: The system administrator specifies that user matt is allowed access to the role mail at 9 a.m. on Tuesdays. Should this be interpreted as exactly 9 a.m. (that is, 9:00:00 a.m.) or as sometime during the 9 a.m. hour (that is, from 9:00:00 to 9:59:59 a.m.)? The latter interprets the specification as a range rather than an exact time, so the access control module uses that interpretation.
The use of signal handlers provides a second situation in which an improper choice of operation could occur. A signal indicates either an error in the program or a request from the user to terminate, so a signal should cause the program to terminate. If the program continues to run, and then grants the user access to the role account, either the program has continued in the face of an error or it has overridden the user’s attempt to terminate the program.
This type of top-down analysis differs from the more usual approach of taking a checklist of common vulnerabilities and using it to examine code. There is a place for each of these approaches. The top-down approach presented here is a design approach, and should be applied at each level of design and implementation. It emphasizes documentation, analysis, and understanding of the program, its interfaces, and the environment in which it executes. A security analysis document should describe the analysis and the reasons for each security-related decision. This document will help other analysts examine the program and, more importantly, will provide future developers and maintainers of the program with insight into potential problems they may encounter in porting the program to a different environment, adding new features, or changing existing features.
Once the appropriate phase of the program has been completed, the developers should use a checklist to validate that the design or implementation has no common errors. Given the complexity of security design and implementation, such checklists provide valuable confirmation that the developers have taken common security problems into account.
Appendix H lists the implementation and management rules in a convenient form.