Kernel API Subset
IN THIS APPENDIX
Tasks
Threads
Locking
Signals
Files and So On
This appendix lists all the manual pages of the kernel library and system calls that are not directly related to sockets but are typically used in conjunction with sockets.
Tasks
Tasks include both processes and threads. Threads (pThreads) are defined in the next section; this section covers processes and low-level tasks (clones).
fork()
Create a new process (independent task) at this call. This call creates a child process to run with the parent. You must be careful that you capture the child and direct it to its assigned task; otherwise, the child runs each statement the parent does (they run together).
Prototype
#include <unistd.h>
pid_t fork(void);
Return Value
0 | The task that gets this is the child.
>0 | The task that gets this is the parent.
<0 | The parent failed to create a new child; check errno.
Parameters
(none)
Possible Errors
EAGAIN | The fork() cannot allocate sufficient memory to copy the parent's page tables and allocate a task structure for the child.
ENOMEM | The fork() failed to allocate the necessary kernel structures because memory is tight.
Example
pid_t PID;
if ( (PID = fork()) == 0 )
{
    /*--- CHILD ---*/
    /**** Run the child's assignment ****/
    exit(0);
}
else if ( PID > 0 )
{
    /*--- PARENT ---*/
    int status;
    /**** Do parent's work ****/
    wait(&status); /* may be done in SIGCHLD signal handler */
}
else /*--- ERROR ---*/
    perror("fork() failed");
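The fork-and-wait pattern above can be packaged as one small routine. The following is a minimal sketch, not part of the original listing; the helper name run_child() is hypothetical, and the child does nothing except exit with the code it is given.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`,
 * then return the exit status the parent collects (or -1). */
int run_child(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork() failed; errno says why */
    if (pid == 0)
        _exit(code);                    /* child: do its job and leave */

    if (waitpid(pid, &status, 0) < 0)   /* parent: reap the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child(7) from main() should hand back 7 to the parent. The child uses _exit() rather than exit() so that it does not flush stdio buffers it inherited from the parent.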
__clone()
This is a low-level system call for creating tasks. You can directly control what is shared between the parent and the child. This is not for amateur programmers; you can create very unpredictable programs. (See Chapter 7, Dividing the Load: Multitasking, for a complete description of this call.)
Prototype
#include <sched.h>
int __clone(int (*fn)(void* arg), void* stacktop, int flags, void* arg);
Return Value
The process ID. If negative, errno has the exact error code.
Parameters
fn | The home for the child task. Create a function (or procedure) that accepts a void* parameter argument. When the routine attempts to return, the operating system terminates the task for you.
stacktop | You must create a stack for the child task. This parameter points to the top of that stack (the highest address of the data block). Because you provide the stack, the stack is fixed in size and cannot grow like a normal task's stack.
flags | Two types of information arithmetically ORed together: the VM spaces to share and the termination signal. This flag supports all signal types and, when the task terminates, the operating system raises the signal you define.
The available VM spaces are as follows:
CLONE_VM Share the data space between tasks. Use this flag to share all static data, preinitialized data, and the allocation heap. Otherwise, copy data space.
CLONE_FS Share the file system information: current working directory, root file system, and default file creation permissions. Otherwise, copy settings.
CLONE_FILES Share open files. When one task changes the file pointer, the other tasks see the change. Likewise, if the task closes the file, the other tasks are not able to access the file any longer. Otherwise, create new references to open inodes.
CLONE_SIGHAND Share signal tables. Individual tasks may choose to ignore open signals (using sigprocmask()) without affecting peers. Otherwise, copy tables.
CLONE_PID Share Process ID. Use this flag carefully; not all the existing tools support this feature. The PThreads library does not use this option. Otherwise, allocate new PID.
arg | You can pass a pointer reference to any data value using this parameter. When the operating system finishes creating the child task, it calls the routine fn with the arg parameter. If you use this feature, be sure to place the value arg points to in the shared data region (CLONE_VM).
Possible Errors
EAGAIN | The __clone() cannot allocate sufficient memory to copy the parent's page tables and allocate a task structure for the child.
ENOMEM | The __clone() failed to allocate the necessary kernel structures because memory is tight.
Example
#define STACKSIZE 1024
int Child(void* arg)
{
    /*---child's responsibility---*/
    exit(0);
}
...
int main(void)
{
    int cchild;
    char *stack = malloc(STACKSIZE);
    /* The child starts in Child(); only the parent sees the return value */
    if ( (cchild = __clone(&Child, stack+STACKSIZE-1, SIGCHLD, 0)) < 0 )
        perror("__clone() failed");
    ...
}
exec()
Run an external program (either a binary or an executable script with #! <interpreter> [arg] in the first line). This call replaces the currently running task with the external program's context. The new program keeps the caller's PID and open files.
The calls execl(), execlp(), execle(), execv(), and execvp() are all front ends to execve().
Prototype
#include <unistd.h>
int execve(const char* path, char* const argv[], char* const envp[]);
int execl(const char* path, const char* arg, ...);
int execlp(const char* file, const char* arg, ...);
int execle(const char* path, const char* arg, ..., char* const envp[]);
int execv(const char* path, char* const argv[]);
int execvp(const char* file, char* const argv[]);
Return Value
This call does not return if successful. If it fails, the return value is -1.
Parameters
file | The program to execute. The call searches for the name in this variable using the defined PATH.
path | The absolute path and filename of the program to execute.
argv | The string array of command-line parameters. The first array element value must be arg0 (or the name of the program). The last array element is always zero (0).
arg | A command-line parameter. This is followed by an ellipsis (...) to indicate that there are several arguments. The first arg is always the name of the program, and the last arg is always zero (0).
envp | The string array of environment parameters. Each parameter is in the form <param>=<value> (for example, TERM=vt100). The last array element is always zero (0).
Possible Errors
EACCES | The file or a script interpreter is not a regular file, or execute permission is denied for the file or a script interpreter, or the file system is mounted noexec.
EPERM | The file system is mounted nosuid, the user is not the superuser, and the file has an SUID or SGID bit set.
EPERM | The process is being traced, the user is not the superuser, and the file has an SUID or SGID bit set.
E2BIG | The argument list is too big.
ENOEXEC | An executable is not in a recognized format, is for the wrong architecture, or has some other format error that means it cannot be executed.
EFAULT | The filename points outside your accessible address space.
ENAMETOOLONG | The filename is too long.
ENOENT | The filename or a script or ELF interpreter does not exist.
ENOMEM | Insufficient kernel memory was available.
ENOTDIR | A component of the path prefix of filename, script, or ELF interpreter is not a directory.
EACCES | Search permission is denied on a component of the path prefix of filename or the name of a script interpreter.
ELOOP | Too many symbolic links were encountered in resolving filename, the name of a script, or ELF interpreter.
ETXTBSY | Executable was open for writing by one or more processes.
EIO | An I/O error occurred.
ENFILE | The limit on the total number of files open on the system has been reached.
EMFILE | The process has the maximum number of files open.
EINVAL | An ELF executable had more than one PT_INTERP segment.
EISDIR | An ELF interpreter was a directory.
ELIBBAD | An ELF interpreter was not in a recognized format.
Example
execl("/bin/ls", "/bin/ls", "-al", "/home", "/boot", (char*)0);
perror("execl() failed"); /* No IF needed here: if successful, no return */

char *args[]={"ls", "-al", "/home", "/boot", 0};
execvp(args[0], args);
perror("execvp() failed");
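Because a successful exec call never returns, a program that wants to keep running usually calls it from a forked child. The sketch below shows that combination under stated assumptions: run_program() is a hypothetical helper name, and the exit code 127 for a failed exec is a shell convention rather than anything this appendix defines.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (searched for in PATH) in a child
 * and return its exit code, 127 if it could not be executed, or -1. */
int run_program(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);          /* returns only on failure */
        _exit(127);                     /* conventional "could not exec" code */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, passing the array {"sh", "-c", "exit 3", NULL} should yield 3 on a system that has sh in its PATH.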
sched_yield()
Relinquish control of the CPU without blocking. This routine tells the scheduler that the currently running task wants to give up the remains of its current timeslice. The call returns on the next timeslice.
Prototype
#include <sched.h> int sched_yield(void);
Return Value
Zero if all goes okay and control is transferred; otherwise, -1.
Parameters
(none)
Possible Errors
(none defined)
Example
#include <sched.h> sched_yield();
wait(), waitpid()
Wait for and acknowledge the termination of a child process. This is important to keep zombie processes from lingering in the process table and to free up valuable resources. The wait() call waits for any child process to terminate, and the waitpid() call permits you to specify a specific process or group. You can use macros such as WIFEXITED(), WIFSIGNALED(), and WIFSTOPPED() (see the example) to get the meaning from the status.
Prototype
#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *status);
pid_t waitpid(pid_t PID, int *status, int options);
Return Value
Both calls return the PID of the child that terminated, or -1 on error (check errno). With WNOHANG, waitpid() returns zero if no child has exited yet.
Parameters
status | Returns the ending status of the child. If not zero or NULL, this parameter picks up the child's termination code and exit() value.
PID | Indicates which process to wait for:
< -1 Wait for any child process whose process group ID is equal to the absolute value of PID.
== -1 Wait for any child process; this is the same behavior that wait() exhibits.
== 0 Wait for any child process whose process group ID is equal to that of the calling process.
> 0 Wait for the child whose process ID is equal to the value of PID.
options | Zero, or any of the following flags ORed together:
WNOHANG Return immediately if no child has exited.
WUNTRACED Return for children who are stopped and whose status has not been reported.
Possible Errors
ECHILD | If the process specified in PID does not exist or is not a child of the calling process. (This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN.)
EINVAL | If the options argument was invalid.
EINTR | If WNOHANG was not set and an unblocked signal or a SIGCHLD was caught. Just try again.
Example
void sig_child(int signum) /* This handler only gets one waiting zombie */
{
    int status;
    wait(&status);
    if ( WIFEXITED(status) )
        printf("Child exited with the value of %d\n", WEXITSTATUS(status));
    if ( WIFSIGNALED(status) )
        printf("Child aborted due to signal #%d\n", WTERMSIG(status));
    if ( WIFSTOPPED(status) )
        printf("Child stopped on signal #%d\n", WSTOPSIG(status));
}

/* Alternative version of the same handler */
void sig_child(int signum) /* This handler removes all waiting zombies */
{
    while ( waitpid(-1, 0, WNOHANG) > 0 );
}
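Outside a signal handler, the same cleanup can be done with a blocking loop that honors the "just try again" advice for EINTR and stops on ECHILD. This is a sketch only; spawn_and_reap() is a hypothetical name, and the children do no work so that the reaping logic stays visible.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start n children that exit at once,
 * then reap every one of them. Returns the number reaped. */
int spawn_and_reap(int n)
{
    int i, reaped = 0;

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* could not fork; reap what exists */
        if (pid == 0)
            _exit(0);                   /* child has nothing to do */
    }
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);    /* wait for any child */
        if (pid > 0)
            reaped++;
        else if (errno == EINTR)
            continue;                   /* interrupted: just try again */
        else
            break;                      /* ECHILD: no children remain */
    }
    return reaped;
}
```

The loop ends only when waitpid() reports that nothing is left to wait for, so no zombie from this batch survives the call.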
Threads
Threads are another kind of task. This section defines a few library calls from the pThreads library.
pthread_create()
This call creates a lightweight kernel process (thread). The thread starts in the function that start_fn points to, using arg as the function's parameter. When the function returns, the thread terminates. The function should return a void* value, but if it doesn't, the thread still terminates and the result is set to NULL.
Prototype
#include <pthread.h>
int pthread_create(pthread_t *tchild, const pthread_attr_t *attr, void *(*start_fn)(void *), void *arg);
Return Value
Zero if successful. If the thread-create call encountered any errors, the call returns a nonzero error number; it does not set errno.
Parameters
tchild | The thread handle (passed by reference). If successful, the call places the thread handle in this parameter.
attr | The thread's starting attributes. See pthread_attr_init for more information.
start_fn | The routine in which the thread is to start. This function should return a void* value.
arg | The parameter passed to start_fn. You should make this parameter a nonshared (unless you plan on locking it), nonstack memory reference.
Possible Errors
EAGAIN | Not enough system resources to create a process for the new thread.
EAGAIN | More than PTHREAD_THREADS_MAX threads are already active.
Example
void* child(void *arg)
{
    /**** Do something! ****/
    pthread_exit(arg); /* terminate and return arg */
}
int main()
{
    pthread_t tchild;
    if ( pthread_create(&tchild, 0, child, 0) != 0 )
        fprintf(stderr, "Can't create thread!\n");
    /**** Do something! ****/
    if ( pthread_join(tchild, 0) != 0 )
        fprintf(stderr, "Join failed\n");
}
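The advice to pass nonstack memory as arg, and to collect the result with pthread_join(), can be sketched as follows. The names sum_worker() and sum_in_thread() are hypothetical; the thread reuses the heap block it was given to carry its result back.

```c
#include <pthread.h>
#include <stdlib.h>

/* Thread body: sums 1..n, where n arrives through heap memory. */
static void *sum_worker(void *arg)
{
    long *n = arg;
    long i, total = 0;

    for (i = 1; i <= *n; i++)
        total += i;
    *n = total;                 /* reuse the heap block for the result */
    return n;                   /* same effect as pthread_exit(n) */
}

/* Hypothetical helper: compute 1+2+...+n in a second thread. */
long sum_in_thread(long n)
{
    pthread_t tchild;
    void *result;
    long total;
    long *arg = malloc(sizeof(*arg));   /* nonstack memory */

    if (arg == NULL)
        return -1;
    *arg = n;
    if (pthread_create(&tchild, NULL, sum_worker, arg) != 0) {
        free(arg);
        return -1;
    }
    if (pthread_join(tchild, &result) != 0)
        return -1;
    total = *(long *)result;
    free(result);               /* the joiner owns the block now */
    return total;
}
```

With n set to 10 the joined result should be 55. Depending on the toolchain, the program may need to be linked with the threads library (for example, -lpthread).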
pthread_join()
Similar to the wait() system call, this call waits for and accepts the return value of the child thread.
Prototype
#include <pthread.h> int pthread_join(pthread_t tchild, void **retval);
Return Value
Zero if successful. If the join call encountered any errors, the call returns a nonzero error number; it does not set errno.
Parameters
tchild | The thread handle to wait on
retval | The pointer to the value passed back (passed by reference)
Possible Errors
ESRCH | No thread could be found corresponding to that specified by tchild.
EINVAL | The tchild thread has been detached.
EINVAL | Another thread is already waiting on termination of tchild.
EDEADLK | The tchild argument refers to the calling thread.
Example
(see pthread_create())
pthread_exit()
Explicitly terminates the current thread, returning retval. You can use a simple return statement as well.
Prototype
#include <pthread.h> void pthread_exit(void *retval);
Return Value
(none)
Parameter
retval
The void* value to return. Make sure that this value is non-stack memory.
Possible Errors
(none)
Example
(see pthread_create())
pthread_detach()
Detaches the tchild thread from the parent. Normally, you need to join or wait for every process and thread. This call lets you create several threads and ignore them. This is the same as setting the thread's detached attribute upon creation.
Prototype
#include <pthread.h>
int pthread_detach(pthread_t tchild);
Return Value
A zero if successful. If the detach call encountered any errors, the call returns a nonzero error number; it does not set errno.
Parameter
tchild | The child thread to detach
Possible Errors
ESRCH | No thread could be found corresponding to that specified by tchild.
EINVAL | The tchild thread has been detached.
EINVAL | Another thread is already waiting on termination of tchild.
EDEADLK | The tchild argument refers to the calling thread.
Example
void* child(void *arg)
{
    /**** Do something! ****/
    pthread_exit(arg); /* terminate and return arg */
}
int main()
{
    pthread_t tchild;
    if ( pthread_create(&tchild, 0, child, 0) != 0 )
        fprintf(stderr, "Can't create thread!\n");
    else
        pthread_detach(tchild);
    /**** Do something! ****/
}
Locking
The primary advantage of using threads is sharing data memory. Because the threads may try to revise the memory at the same time, you need to lock the memory for exclusive access. This section describes pThread calls that you can use (even with clones) to lock memory.
pthread_mutex_init(), pthread_mutex_destroy()
These calls create and destroy mutex semaphore variables. You may not need the initializer because the defined variables are easier and faster to use. The destroy call normally frees up any resources. However, the Linux implementation uses no allocated resources, so the call does nothing more than check whether the resource is unlocked.
Prototype
#include <pthread.h>
/*---Predefined mutex settings---*/
pthread_mutex_t fastmutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t recmutex = PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP;
pthread_mutex_t errchkmutex = PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP;
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *mutexattr);
int pthread_mutex_destroy(pthread_mutex_t *mutex);
Return Value
Always zero.
Parameters
mutex | The mutex to create or destroy.
mutexattr | Any attributes to set. If NULL, the call uses the default setting (PTHREAD_MUTEX_INITIALIZER).
Possible Errors
(none)
pthread_mutex_lock(), pthread_mutex_trylock()
Lock or try to lock a semaphore for entering a critical section. The parameter is simply a variable that acts like a reservation ticket. If another thread tries to lock a reserved spot, it blocks until the reserving thread releases the semaphore.
Prototype
#include <pthread.h>
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
Return Value
The call returns zero on success and a nonzero error code on failure. The error code is the return value itself; errno is not set.
Parameter
mutex | The semaphore variable
Possible Errors
EINVAL | The mutex has not been properly initialized.
EDEADLK | (pthread_mutex_lock) The calling thread has already locked the mutex (error-checking mutexes only).
EBUSY | (pthread_mutex_trylock) The calling thread can't acquire the mutex because it is currently locked.
Example
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
if ( pthread_mutex_lock(&mutex) == 0 )
{
    /**** work on critical data ****/
    pthread_mutex_unlock(&mutex);
}

/*---Do other processing while waiting for semaphore---*/
int err;
while ( (err = pthread_mutex_trylock(&mutex)) == EBUSY )
{
    /**** Work on something else while waiting ****/
}
/*---Got the semaphore! Now work on the critical section---*/
if ( err == 0 )
{
    /**** work on critical data ****/
    pthread_mutex_unlock(&mutex);
}
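To see what the lock buys you, the sketch below lets several threads increment one shared counter. It is an illustration under assumptions, not the original author's code: the names bump() and locked_count(), and the thread and loop counts, are all hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4
#define NLOOPS   10000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Thread body: add NLOOPS to the shared counter, one step at a time. */
static void *bump(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < NLOOPS; i++) {
        pthread_mutex_lock(&lock);      /* enter critical section */
        counter++;
        pthread_mutex_unlock(&lock);    /* leave critical section */
    }
    return NULL;
}

/* Hypothetical helper: returns the final counter value, or -1
 * if not every thread could be started. */
long locked_count(void)
{
    pthread_t t[NTHREADS];
    int i, started;

    counter = 0;
    for (i = 0; i < NTHREADS; i++)
        if (pthread_create(&t[i], NULL, bump, NULL) != 0)
            break;
    started = i;
    for (i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return started == NTHREADS ? counter : -1;
}
```

With the lock in place the total is always NTHREADS * NLOOPS. Removing the lock and unlock calls would leave the increments racing, and the final value could come up short.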
pthread_mutex_unlock()
Unlock a mutex semaphore.
Prototype
#include <pthread.h> int pthread_mutex_unlock(pthread_mutex_t *mutex);
Return Value
The call returns zero on success and a nonzero error code on failure. The error code is the return value itself; errno is not set.
Parameter
mutex | The semaphore variable
Possible Errors
EINVAL | The mutex has not been properly initialized.
EPERM | The calling thread does not own the mutex (error-checking mutexes only).
Example
(see pthread_mutex_lock())
Signals
When working with tasks, your program may get signals (or asynchronous notifications). This section describes system calls that let you capture and process them.
signal()
Register the sig_fn routine to answer the signum signal. The default behavior is a single shot; the signal handler reverts to the default after getting the first signal. Use sigaction() instead if you want to control the behavior more.
Prototype
#include <signal.h>
void (*signal(int signum, void (*sig_fn)(int signum)))(int signum);
-or-
typedef void (*TSigFn)(int signum);
TSigFn signal(int signum, TSigFn sig_fn);
Return Value
The previous handler for signum if successful. If the call encountered any errors, it returns SIG_ERR.
Parameters
signum | The signal number to capture
sig_fn | The program routine that the scheduler calls
Possible Error
EINVAL | The signal number is invalid or cannot be caught.
Example
void sig_handler(int signum)
{
    switch ( signum )
    {
        case SIGFPE:
            /**** Respond to the signal ****/
            break;
    }
}
if ( signal(SIGFPE, sig_handler) == SIG_ERR )
    perror("signal() failed");
sigaction()
Similar to signal(), sigaction() establishes the receiver of certain signals. Unlike signal(), however, this call gives you a lot more control over how the signaling notification behaves. It is also a little more complicated to use.
Prototype
#include <signal.h>
int sigaction(int signum, const struct sigaction *sigact, struct sigaction *oldsigact);
Return Value
Zero upon success; otherwise, nonzero.
Parameters
signum | The signal to capture.
sigact | The desired behavior and signal handler, using the following structure:
struct sigaction
{
    void (*sa_handler)(int);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer)(void);
};
sa_handler | Signal handler function pointer.
sa_mask | The set of signals to block while servicing a signal in the signal handler.
sa_restorer | Obsolete; do not use.
sa_flags | How to handle the signals. You can use the following flags:
SA_NOCLDSTOP If the signal is SIGCHLD, ignore cases when the child stops or pauses.
SA_ONESHOT or SA_RESETHAND Reset the handler to the default after getting the first signal.
SA_RESTART Try to restart an interrupted system call. Normally, system calls that are interrupted return an EINTR error. This option tries to restart the call and avoid EINTR errors.
SA_NOMASK or SA_NODEFER Allow like signals to interrupt the handler. Normally, if your handler is responding to a particular signal like SIGCHLD, the kernel suspends other SIGCHLD signals. This can lead to lost signals. Using this option permits your handler to be interrupted. Be careful using this option.
oldsigact | A repository of the old behaviors. You can copy the old settings here.
Possible Errors
EINVAL | An invalid signal was specified. This will also be generated if an attempt is made to change the action for SIGKILL or SIGSTOP, which cannot be caught.
EFAULT | The sigact or oldsigact parameter points to memory that is not a valid part of the process address space.
EINTR | System call was interrupted.
Example
void sig_handler(int signum)
{
    switch ( signum )
    {
        case SIGCHLD:
            /**** Respond to the signal ****/
            break;
    }
}
struct sigaction sigact;
bzero(&sigact, sizeof(sigact));
sigact.sa_handler = sig_handler; /* set the handler */
sigact.sa_flags = SA_NOCLDSTOP | SA_RESTART; /* set options */
if ( sigaction(SIGCHLD, &sigact, 0) != 0 )
    perror("sigaction() failed");
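A complete round trip (install a handler, trigger the signal, restore the old behavior) might look like the sketch below. The names on_usr1() and catch_usr1() are hypothetical, and SIGUSR1 is chosen only because it is harmless to raise in a test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal;

/* Handler: record the event and return; keep handlers this small. */
static void on_usr1(int signum)
{
    (void)signum;
    got_signal = 1;
}

/* Hypothetical helper: returns 1 if the handler ran, 0 if it did not,
 * or -1 if the handler could not be installed. */
int catch_usr1(void)
{
    struct sigaction sigact, oldsigact;

    memset(&sigact, 0, sizeof(sigact));
    sigact.sa_handler = on_usr1;        /* set the handler */
    sigact.sa_flags = SA_RESTART;       /* restart interrupted calls */
    sigemptyset(&sigact.sa_mask);       /* block nothing extra */
    if (sigaction(SIGUSR1, &sigact, &oldsigact) != 0)
        return -1;

    got_signal = 0;
    raise(SIGUSR1);                     /* send the signal to ourselves */
    sigaction(SIGUSR1, &oldsigact, NULL);   /* put the old behavior back */
    return got_signal;
}
```

Saving the previous settings in oldsigact and restoring them afterward keeps the helper from permanently changing how the rest of the program treats the signal.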
sigprocmask()
Examines or changes the set of signals that are currently blocked from interrupting the task (for example, while it services another signal).
Prototype
#include <signal.h>
int sigprocmask(int how, const sigset_t *sigset, sigset_t *oldsigset);
Return Value
Nonzero upon error; otherwise, zero.
Parameters
how | How the call combines sigset with the current set of blocked signals:
SIG_BLOCK The set of blocked signals is the union of the current set and the sigset argument.
SIG_UNBLOCK The signals in sigset are removed from the current set of blocked signals. It is legal to attempt to unblock a signal that is not blocked.
SIG_SETMASK The set of blocked signals is set to the argument sigset.
sigset | The signal set to apply.
oldsigset | If non-NULL, the call places a copy of the old values in here.
Possible Errors
EFAULT | The sigset or oldsigset parameter points to memory that is not a valid part of the process address space.
EINTR | System call was interrupted.
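The effect of blocking is easiest to see by raising a signal while it is blocked and checking that it only becomes pending. The following is a sketch for a single-threaded program; note_signal() and defer_usr1() are hypothetical names, and sigpending() is used purely to observe the held signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t delivered;

static void note_signal(int signum)
{
    (void)signum;
    delivered = 1;
}

/* Hypothetical helper: returns 1 when SIGUSR1 is held back while blocked
 * and then delivered after unblocking; 0 otherwise; -1 on setup errors. */
int defer_usr1(void)
{
    struct sigaction sigact, oldsigact;
    sigset_t block, oldmask, pending;
    int held;

    memset(&sigact, 0, sizeof(sigact));
    sigact.sa_handler = note_signal;
    sigemptyset(&sigact.sa_mask);
    if (sigaction(SIGUSR1, &sigact, &oldsigact) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &oldmask) != 0)
        return -1;

    delivered = 0;
    raise(SIGUSR1);                     /* blocked: it only becomes pending */
    sigpending(&pending);
    held = sigismember(&pending, SIGUSR1) == 1 && delivered == 0;

    sigprocmask(SIG_SETMASK, &oldmask, NULL);   /* unblock: handler runs */
    sigaction(SIGUSR1, &oldsigact, NULL);
    return held && delivered;
}
```

The sketch assumes SIGUSR1 was not already blocked when the helper is called; if it was, restoring the old mask keeps the signal pending and the function returns 0.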
Files and So On
This section describes a few library and system calls for file management.
bzero(), memset()
bzero() initializes the specified block to zeros. This call is deprecated, so you might want to use memset() instead.
memset() sets the specified block to val.
Prototype
#include <string.h>
void bzero(void *mem, size_t bytes);
void* memset(void *mem, int val, size_t bytes);
Return Value
bzero() returns no value.
memset() returns the reference mem.
Parameters
mem | The memory segment to initialize
val | The value to fill the segment with
bytes | The number of bytes to write (the size of the memory segment)
Possible Errors
(none)
Example
bzero(&addr, sizeof(addr));
memset(&addr, 0, sizeof(addr));
fcntl()
Manipulate the file or socket handle.
Prototype
#include <unistd.h>
#include <fcntl.h>
int fcntl(int fd, int cmd);
int fcntl(int fd, int cmd, long arg);
int fcntl(int fd, int cmd, struct flock *flock);
Return Value
On error, -1 is returned and errno is set appropriately. For a successful call, the return value depends on the operation:
F_DUPFD | The new descriptor
F_GETFD | Value of flag
F_GETFL | Value of flags
F_GETOWN | Value of descriptor owner
F_GETSIG | Value of signal sent when read or write becomes possible, or zero for traditional SIGIO behavior
All other commands return zero.
Parameters
fd | The descriptor to manipulate.
cmd | The operation to perform. Some operations are duplicates of existing functions. Some operations require an operand (arg or flock). Each operation is grouped into specific functions:
Duplicate descriptor (F_DUPFD) Similar to dup(), this operation returns a new descriptor that is a copy of fd, using the lowest available descriptor number greater than or equal to arg.
Manipulate close-on-exec (F_GETFD, F_SETFD) The kernel does not pass all file descriptors to the exec-child process. With this parameter, you can test or set the close-on-exec.
Manipulate descriptor flags (F_GETFL, F_SETFL) Using these commands, you can get the flags (set by the open() system call) of the descriptor. You can only set O_APPEND, O_NONBLOCK, and O_ASYNC.
Manipulate file locks (F_GETLK, F_SETLK, F_SETLKW) GETLK retrieves the lock structure that currently holds the file. If the file is not locked
Determine who owns I/O signals (F_GETOWN, F_SETOWN) Return or set the PID of the current owner of the SIGIO signal.
Determine the kind of signal to send (F_GETSIG, F_SETSIG) Gets or sets the signal type when more I/O operations can be performed. Default is SIGIO.
arg | The value to set.
flock | The locking key.
Possible Errors
EACCES | Operation is prohibited by locks held by other processes.
EAGAIN | Operation is prohibited because the file has been memory-mapped by another process.
EBADF | fd is not an open file descriptor.
EDEADLK | It was detected that the specified F_SETLKW command would cause a deadlock.
EFAULT | lock is outside your accessible address space.
EINTR | For F_SETLKW, the command was interrupted by a signal. For F_GETLK and F_SETLK, the command was interrupted by a signal before the lock was checked or acquired, most likely when locking a remote file (locking over NFS), but it can sometimes happen locally.
EINVAL | For F_DUPFD, arg is negative or is greater than the maximum allowable value. For F_SETSIG, arg is not an allowable signal number.
EMFILE | For F_DUPFD, the process already has the maximum number of file descriptors open.
ENOLCK | Too many segment locks open, lock table is full, or a remote locking protocol failed (locking over NFS, for example).
EPERM | Attempted to clear the O_APPEND flag on a file that has the append-only attribute set.
Example
...
printf("PID which owns SIGIO: %d", fcntl(fd, F_GETOWN));
...
if ( fcntl(fd, F_SETSIG, SIGKILL) != 0 )
    perror("Can't set signal");
...
if ( (fd_copy = fcntl(fd, F_DUPFD, 0)) < 0 )
    perror("Can't dup fd");
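One of the most common uses of F_GETFL and F_SETFL with sockets is switching a descriptor to nonblocking mode. The sketch below reads the current flags first so that existing settings are preserved; set_nonblocking() and read_would_block() are hypothetical helper names.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd's flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);     /* read the current flags */

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: 1 if a one-byte read on fd fails because no
 * data is ready. (If data is ready, the byte read is discarded.) */
int read_would_block(int fd)
{
    char c;
    ssize_t n = read(fd, &c, 1);

    return n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

After set_nonblocking() succeeds on the read end of an empty pipe, read_would_block() should report 1 instead of stalling the caller.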
pipe()
Creates a pipe that points to itself. Each file descriptor in fd[] coincides with input (fd[0]) and output (fd[1]). If you write to fd[1], you can read the data on fd[0]. Used mostly with fork().
Prototype
#include <unistd.h>
int pipe(int fd[2]);
Return Value
Zero if okay; -1 on error.
Parameter
fd | An array of two integers to receive the new file descriptor values
Possible Errors
EMFILE | Too many file descriptors are already in use by the current process.
ENFILE | The system's file table is full.
EFAULT | The process does not own the memory that fd points to (invalid memory reference).
Example
int fd[2];
pipe(fd); /* create pipe */
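Since a pipe is mostly useful together with fork(), here is a minimal sketch of a child sending a short message to its parent. The helper name pipe_from_child() is hypothetical, and the code assumes the message is small enough to arrive in a single read().

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child writes msg into a pipe; the parent reads
 * it into out (NUL-terminated). Returns the bytes read, or -1. */
ssize_t pipe_from_child(const char *msg, char *out, size_t outlen)
{
    int fd[2];
    pid_t pid;
    ssize_t n;

    if (outlen == 0 || pipe(fd) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: the writer */
        close(fd[0]);                   /* it never reads */
        if (write(fd[1], msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }
    close(fd[1]);                       /* parent: the reader, never writes */
    n = read(fd[0], out, outlen - 1);
    if (n >= 0)
        out[n] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);              /* reap the child */
    return n;
}
```

Each side closes the end it does not use. If the parent kept fd[1] open, a read() on fd[0] could never see end-of-file once the child exits.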
poll()
Similar to select(), this call waits on any one of several I/O channels for changes. Instead of using macros for managing and controlling the descriptor list, the programmer uses structure entries.
Prototype
#include <sys/poll.h>
int poll(struct pollfd *ufds, unsigned int nfds, int timeout);
Return Value
If less than zero, an error occurred; a zero returned means that the call timed out. Otherwise, the call returns the number of descriptor records that changed.
Parameters
ufds | An array of pollfd structures. Each record tracks a different file descriptor.
struct pollfd
{
    int fd; /* file descriptor */
    short events; /* requested events */
    short revents; /* returned events */
};
The fd field is the file descriptor to check. The events and revents fields indicate the events to check and the events that occurred, respectively. The bit-values available are as follows:
POLLIN There is data to read.
POLLPRI There is urgent data to read.
POLLOUT Writing now will not block.
POLLERR Error condition.
POLLHUP Hung up.
POLLNVAL Invalid request; fd not open.
POLLRDNORM Normal read (Linux only).
POLLRDBAND Read out-of-band (Linux only).
POLLWRNORM Normal write (Linux only).
POLLWRBAND Write out-of-band (Linux only).
nfds | The number of records to check during the call.
timeout | The timeout in milliseconds. If timeout is negative, the call waits forever.
Possible Errors
ENOMEM | There was no space to allocate file descriptor tables.
EFAULT | The array given as argument was not contained in the calling program's address space.
EINTR | A signal occurred before any requested event.
Example
int fd_count=0;
struct pollfd fds[MAXFDs];
fds[fd_count].fd = socket(PF_INET, SOCK_STREAM, 0);
/*** bind() and listen() socket ***/
fds[fd_count++].events = POLLIN;
for (;;)
{
    if ( poll(fds, fd_count, TIMEOUT_MS) > 0 )
    {
        int i;
        if ( (fds[0].revents & POLLIN) != 0 )
        {
            fds[fd_count].events = POLLIN | POLLHUP;
            fds[fd_count++].fd = accept(fds[0].fd, 0, 0);
        }
        for ( i = 1; i < fd_count; i++ )
        {
            if ( (fds[i].revents & POLLHUP) != 0 )
            {
                close(fds[i].fd);
                /*** Move up FDs to fill empty slot ***/
                fd_count--;
            }
            else if ( (fds[i].revents & POLLIN) != 0 )
            {
                /*** Read and process data ***/
            }
        }
    }
}
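For a single descriptor, the same call gives you a read with a timeout. The sketch below is one way to arrange it; read_byte_timeout() and its return-code convention (-2 for a timeout) are hypothetical choices made for this illustration.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for one byte on fd.
 * Returns the byte (0-255), -2 on timeout, or -1 on error or hang-up. */
int read_byte_timeout(int fd, int timeout_ms)
{
    struct pollfd pfd;
    unsigned char c;
    int ready;

    pfd.fd = fd;
    pfd.events = POLLIN;                /* interested in readable data */
    pfd.revents = 0;
    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;                      /* timed out */
    if ((pfd.revents & POLLIN) == 0)
        return -1;                      /* POLLERR, POLLHUP, or POLLNVAL */
    return read(fd, &c, 1) == 1 ? c : -1;
}
```

The revents check matters: poll() can return a positive count because of an error or hang-up, in which case a read would not deliver data.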
read()
Read buf_len bytes from the fd file descriptor into the buffer. You can use this system call for sockets as well as files, but this call does not provide as much control as the recv() system call.
Prototype
#include <unistd.h>
ssize_t read(int fd, void *buffer, size_t buf_len);
Return Value
The number of bytes actually read. Zero indicates the end of the file (or a closed connection); -1 indicates an error, with the cause in errno.
Parameters
fd | File (or socket) descriptor
buffer | The memory buffer to accept the read data
buf_len | The number of bytes to read and the number of legal bytes in the buffer
Possible Errors
EINTR | The call was interrupted by a signal before any data was read.
EAGAIN | Non-blocking I/O has been selected using O_NONBLOCK and no data was immediately available for reading.
EIO | I/O error. This will happen when the process is in a background process group, tries to read from its controlling tty, and either is ignoring or blocking SIGTTIN or its process group is orphaned. It can also occur when there is a low-level I/O error while reading from a disk or tape.
EISDIR | fd refers to a directory.
EBADF | fd is not a valid file descriptor or is not open for reading.
EINVAL | fd is attached to an object that is unsuitable for reading.
EFAULT | buffer is outside your accessible address space.
Example
int sockfd;
int bytes_read;
char buffer[1024];
/*---create socket & connect to server---*/
if ( (bytes_read = read(sockfd, buffer, sizeof(buffer))) < 0 )
    perror("read");
select()
Wait for any I/O status changes from the file descriptor sets. When any of the specified sets changes, the call returns. You have four macros to help construct and manage the file descriptor sets:
FD_CLR | Remove a descriptor from the set.
FD_SET | Add a descriptor to a set.
FD_ISSET | Test if specified descriptor is ready for I/O.
FD_ZERO | Initialize the set to empty.
Prototype
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
int select(int hi_fd, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
FD_CLR(int fd, fd_set *set);
FD_ISSET(int fd, fd_set *set);
FD_SET(int fd, fd_set *set);
FD_ZERO(fd_set *set);
Return Value
The number of descriptors that have changed states. If an error occurred, the return value is negative. If the timeout expired, the return value is zero.
Parameters
hi_fd | This is the highest file descriptor number + 1. For example, if you have four files open plus the stdio, your descriptors could be 0, 1, 2, 3, 5, 6, and 8. The highest is 8. If you include fd(8) in your select statement, hi_fd would equal 9. If the highest fd were 5, this parameter would be 6.
readfds | The set of descriptors to test for readability.
writefds | The set of descriptors to test for writing.
exceptfds | The set of descriptors to test for out-of-band data.
timeout | The maximum time to wait for data to arrive. This is a pointer to a struct timeval, which holds the time in seconds (tv_sec) and microseconds (tv_usec). If both fields are zero (not the pointer), the call returns immediately after checking all the descriptors. If the pointer is NULL (zero), the select's timeout feature is disabled.
fd | The file descriptor to add, remove, or test.
set | The file descriptor set.
Possible Errors
EBADF | An invalid file descriptor was given in one of the sets.
EINTR | A non-blocked signal was caught.
EINVAL | hi_fd is negative.
ENOMEM | select was unable to allocate memory for internal tables.
Example
int i, ports[]={9001, 9002, 9004, -1};
int sockfd, max=0;
fd_set set;
struct sockaddr_in addr;
struct timeval timeout={2,500000}; /* 2.5 sec. */

FD_ZERO(&set);
bzero(&addr, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = INADDR_ANY;
for ( i = 0; ports[i] > 0; i++ )
{
    sockfd = socket(PF_INET, SOCK_STREAM, 0);
    addr.sin_port = htons(ports[i]);
    if ( bind(sockfd, (struct sockaddr*)&addr, sizeof(addr)) != 0
            || listen(sockfd, 5) != 0 )
        perror("bind() or listen() failed");
    else
    {
        FD_SET(sockfd, &set);
        if ( max < sockfd )
            max = sockfd;
    }
}
if ( select(max+1, &set, 0, 0, &timeout) > 0 )
{
    for ( i = 0; i <= max; i++ )
        if ( FD_ISSET(i, &set) )
        {
            int client = accept(i, 0, 0);
            /**** process the client's requests ****/
        }
}
write()
Write msg_len bytes to the fd file descriptor from buffer. You can use a socket descriptor as well, but it does not provide you with as much control as the send() system call.
Prototype
#include <unistd.h>
ssize_t write(int fd, const void *buffer, size_t msg_len);
Return Value
Number of bytes written. The byte count can be less than msg_len. If the call does not succeed in writing all required bytes, you can use a loop for successive writes. If negative, the call stores the error detail in errno.
Parameters
fd | File descriptor (can be a socket descriptor)
buffer | The message to write
msg_len | The length of the message
Possible Errors
EBADF | fd is not a valid file descriptor or is not open for writing.
EINVAL | fd is attached to an object that is unsuitable for writing.
EFAULT | buffer is outside your accessible address space.
EPIPE | fd is connected to a pipe or socket whose reading end is closed. When this happens, the writing process will receive a SIGPIPE signal; if it catches, blocks, or ignores the signal, EPIPE is returned.
EAGAIN | Non-blocking I/O has been selected using O_NONBLOCK and there was no room in the pipe or socket connected to fd to write the data immediately.
EINTR | The call was interrupted by a signal before any data was written.
ENOSPC | The device containing the file referred to by fd has no room for the data.
EIO | A low-level I/O error occurred while modifying the inode.
Example
/*** Write a message (TCP, UDP or Raw) ***/
int sockfd;
int bytes, bytes_wrote=0;
/*--- Create socket, connect to server ---*/
/* Resume after the bytes already sent on each pass */
while ( (bytes = write(sockfd, buffer+bytes_wrote, msg_len-bytes_wrote)) > 0 )
    if ( (bytes_wrote += bytes) >= msg_len )
        break;
if ( bytes < 0 )
    perror("write");
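The loop for successive writes can be wrapped in a reusable routine that also retries when a signal interrupts the call. This is a sketch under assumptions: write_all() is a hypothetical name, and it treats any error other than EINTR as fatal.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes are
 * accepted. Returns len on success, or -1 on error. */
ssize_t write_all(int fd, const void *buffer, size_t len)
{
    const char *p = buffer;
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = write(fd, p + sent, len - sent);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted before writing: retry */
            return -1;
        }
        sent += (size_t)n;              /* advance past what was accepted */
    }
    return (ssize_t)sent;
}
```

On a nonblocking descriptor this version would fail with EAGAIN when the channel is full; a caller in that situation would normally wait with poll() or select() before retrying.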
close()
Closes a descriptor (file or socket). If the socket is connected to a server or client, the call requests that the connection be closed. The channel actually remains active after the close until the channel empties or times out. Every process has a limit to the number of open descriptors it can have. getdtablesize() returns 1024 in Linux 2.2.14, and the /usr/include/linux/limits.h file defines this limit with NR_OPEN. Also, the first three descriptors default to stdin (0), stdout (1), and stderr (2).
Prototype
#include <unistd.h> int close(int fd);
Return Value
Zero if everything goes well. If an error occurs, you can find the cause in errno.
Parameter
fd | The file or socket descriptor
Possible Error
EBADF | fd isn't a valid open file descriptor.
Example
int sockfd;
sockfd = socket(PF_INET, SOCK_RAW, htons(99));
if ( sockfd < 0 )
    PANIC("Raw socket create failed");
...
if ( close(sockfd) != 0 )
    PANIC("Raw socket close failed");