- Apr 5, 2002
10.3.2 Process Management System Calls in UNIX
Let us now look at the UNIX system calls dealing with process management. The main ones are listed in Fig. 10-3. Fork is a good place to start the discussion. Fork is the only way to create a new process in UNIX systems. It creates an exact duplicate of the original process, including all the file descriptors, registers and everything else. After the fork, the original process and the copy (the parent and child) go their separate ways. All the variables have identical values at the time of the fork, but since the entire parent core image is copied to create the child, subsequent changes in one of them do not affect the other one. The fork call returns a value, which is zero in the child, and equal to the child's PID in the parent. Using the returned PID, the two processes can see which is the parent and which is the child.
In most cases, after a fork, the child will need to execute different code from the parent. Consider the case of the shell. It reads a command from the terminal, forks off a child process, waits for the child to execute the command, and then reads the next command when the child terminates. To wait for the child to finish, the parent executes a waitpid system call, which just waits until the child terminates (any child if more than one exists). Waitpid has three parameters. The first one allows the caller to wait for a specific child. If it is -1, any old child (i.e., the first child to terminate) will do. The second parameter is the address of a variable that will be set to the child's exit status (normal or abnormal termination and exit value). The third one determines whether the caller blocks or returns immediately if no child has terminated yet.
Figure 10-2. The signals required by POSIX.
Figure 10-3. Some system calls relating to processes. The return code s is -1 if an error has occurred, pid is a process ID, and residual is the remaining time in the previous alarm. The parameters are what the name suggests.
In the case of the shell, the child process must execute the command typed by the user. It does this by using the exec system call, which causes its entire core image to be replaced by the file named in its first parameter. A highly simplified shell illustrating the use of fork, waitpid, and exec is shown in Fig. 10-4.
Figure 10-4. A highly simplified shell.
In the most general case, exec has three parameters: the name of the file to be executed, a pointer to the argument array, and a pointer to the environment array. These will be described shortly. Various library procedures, including execl, execv, execle, and execve, are provided to allow the parameters to be omitted or specified in various ways. All of these procedures invoke the same underlying system call. Although the system call is exec, there is no library procedure with this name; one of the others must be used.
Let us consider the case of a command typed to the shell such as
cp file1 file2
used to copy file1 to file2. After the shell has forked, the child locates and executes the file cp and passes it information about the files to be copied.
The main program of cp (and many other programs) contains the function declaration
main(argc, argv, envp)
where argc is a count of the number of items on the command line, including the program name. For the example above, argc is 3.
The second parameter, argv, is a pointer to an array. Element i of that array is a pointer to the i-th string on the command line. In our example, argv[0] would point to the string ''cp''. Similarly, argv[1] would point to the 5-character string ''file1'' and argv[2] would point to the 5-character string ''file2''.
The third parameter of main, envp, is a pointer to the environment, an array of strings containing assignments of the form name = value used to pass information such as the terminal type and home directory name to a program. In Fig. 10-4, no environment is passed to the child, so the third parameter of execve is a zero in this case.
If exec seems complicated, do not despair; it is the most complex system call. All the rest are much simpler. As an example of a simple one, consider exit, which processes should use when they are finished executing. It has one parameter, the exit status (0 to 255), which is returned to the parent in the variable status of the waitpid system call. The low-order byte of status contains the termination status, with 0 being normal termination and the other values being various error conditions. The high-order byte contains the child's exit status (0 to 255), as specified in the child's call to exit. For example, if a parent process executes the statement
n = waitpid(-1, &status, 0);
it will be suspended until some child process terminates. If the child exits with, say, 4 as the parameter to exit, the parent will be awakened with n set to the child's PID and status set to 0x0400 (0x as a prefix means hexadecimal in C). The low-order byte of status relates to signals; the next one is the value the child returned in its call to exit.
If a process exits and its parent has not yet waited for it, the process enters a kind of suspended animation called the zombie state. When the parent finally waits for it, the process terminates.
Several system calls relate to signals, which are used in a variety of ways. For example, if a user accidentally tells a text editor to display the entire contents of a very long file, and then realizes the error, some way is needed to interrupt the editor. The usual choice is for the user to hit some special key (e.g., DEL or CTRL-C), which sends a signal to the editor. The editor catches the signal and stops the print-out.
To announce its willingness to catch this (or any other) signal, the process can use the sigaction system call. The first parameter is the signal to be caught (see Fig. 10-2). The second is a pointer to a structure giving a pointer to the signal handling procedure, as well as some other bits and flags. The third one points to a structure where the system returns information about signal handling currently in effect, in case it must be restored later.
The signal handler may run for as long as it wants to. In practice, though, signal handlers are usually fairly short. When the signal handling procedure is done, it returns to the point from which it was interrupted. The sigaction system call can also be used to cause a signal to be ignored, or to restore the default action, which for most signals is killing the process.
Hitting the DEL key is not the only way to send a signal. The kill system call allows a process to signal another related process. The choice of the name ''kill'' for this system call is not an especially good one, since most processes send signals to other ones with the intention that they be caught.
For many real-time applications, a process needs to be interrupted after a specific time interval to do something, such as to retransmit a potentially lost packet over an unreliable communication line. To handle this situation, the alarm system call has been provided. The parameter specifies an interval, in seconds, after which a SIGALRM signal is sent to the process. A process may have only one alarm outstanding at any instant. If an alarm call is made with a parameter of 10 seconds, and then 3 seconds later another alarm call is made with a parameter of 20 seconds, only one signal will be generated, 20 seconds after the second call. The first signal is canceled by the second call to alarm. If the parameter to alarm is zero, any pending alarm signal is canceled. If an alarm signal is not caught, the default action is taken and the signaled process is killed. Technically, alarm signals may be ignored, but that is a pointless thing to do.
It sometimes occurs that a process has nothing to do until a signal arrives. For example, consider a computer-aided instruction program that is testing reading speed and comprehension. It displays some text on the screen and then calls alarm to signal it after 30 seconds. While the student is reading the text, the program has nothing to do. It could sit in a tight loop doing nothing, but that would waste CPU time that a background process or other user might need. A better solution is to use the pause system call, which tells UNIX to suspend the process until the next signal arrives.
Thread Management System Calls
The first versions of UNIX did not have threads. That feature was added many years later. Initially there were many threads packages in use, but the proliferation of threads packages made writing portable code difficult. Eventually, the system calls used to manage threads were standardized as part of POSIX (P1003.1c).
The POSIX specification did not take a position on whether threads should be implemented in the kernel or in user space. The advantage of having user-space threads is that they can be implemented without having to change the kernel and thread switching is very efficient. The disadvantage of user-space threads is that if one thread blocks (e.g., on I/O, a semaphore, or a page fault), all the threads in the process block because the kernel thinks there is only one thread and does not schedule the process until the blocking thread is released. Thus the calls defined in P1003.1c were carefully chosen to be implementable either way. As long as user programs adhere carefully to the P1003.1c semantics, both implementations should work correctly. The most commonly used thread calls are listed in Fig. 10-5. When kernel threads are used, these calls are true system calls; when user threads are used, these calls are implemented entirely in a user-space runtime library.
(For the truly alert reader, note that we have a typographical problem now. If the kernel manages threads, then calls such as ''pthread_create,'' are system calls and following our convention should be set in Helvetica, like this: pthread_create. However, if they are simply user-space library calls, our convention for all procedure names is to use Times Italics, like this: pthread_create. Without prejudice, we will simply use Helvetica, also in the next chapter, in which it is never clear which Win32 API calls are really system calls. It could be worse: in the Algol 68 Report there was a period that changed the grammar of the language slightly when printed in the wrong font.)
Figure 10-5. The principal POSIX thread calls.
Let us briefly examine the thread calls shown in Fig. 10-5. The first call, pthread_create, creates a new thread. It is called by
err = pthread_create(&tid, attr, function, arg);
This call creates a new thread in the current process running the code function with arg passed to it as a parameter. The new thread's ID is stored in memory at the location pointed to by the first parameter. The attr parameter can be used to specify certain attributes for the new thread, such as its scheduling priority. After successful completion, one more thread is running in the caller's address space than was before the call.
A thread that has done its job and wants to terminate calls pthread_exit. A thread can wait for another thread to exit by calling pthread_join. If the thread waited for has already exited, the pthread_join finishes immediately. Otherwise it blocks.
Threads can synchronize using locks called mutexes. Typically a mutex guards some resource, such as a buffer shared by two threads. To make sure that only one thread at a time accesses the shared resource, threads are expected to lock the mutex before touching the resource and unlock it when they are done. As long as all threads obey this protocol, race conditions can be avoided. Mutexes are like binary semaphores, that is, semaphores that can take on only the values of 0 and 1. The name ''mutex'' comes from the fact that mutexes are used to ensure mutual exclusion on some resource.
Mutexes can be created and destroyed by the calls pthread_mutex_init and pthread_mutex_destroy, respectively. A mutex can be in one of two states: locked or unlocked. When a thread needs to set a lock on an unlocked mutex (using pthread_mutex_lock), the lock is set and the thread continues. However, when a thread tries to lock a mutex that is already locked, it blocks. When the locking thread is finished with the shared resource, it is expected to unlock the corresponding mutex by calling pthread_mutex_unlock.
Mutexes are intended for short-term locking, such as protecting a shared variable. They are not intended for long-term synchronization, such as waiting for a tape drive to become free. For long-term synchronization, condition variables are provided. These are created and destroyed by calls to pthread_cond_init and pthread_cond_destroy, respectively.
A condition variable is used by having one thread wait on it, and another thread signal it. For example, having discovered that the tape drive it needs is busy, a thread would do pthread_cond_wait on a condition variable that all the threads have agreed to associate with the tape drive. When the thread using the tape drive is finally done with it (possibly even hours later), it uses pthread_cond_signal to release exactly one thread waiting on that condition variable (if any). If no thread is waiting, the signal is lost. In other words, condition variables do not count like semaphores. A few other operations are also defined on threads, mutexes, and condition variables.