6.3 User and Group Names

While the operating system works with user and group ID numbers for storage of file ownership and for permission checking, humans prefer to work with user and group names.

Early Unix systems kept the information that mapped names to ID numbers in simple text files, /etc/passwd and /etc/group. These files still exist on modern systems, and their format is unchanged from that of V7 Unix. However, they no longer tell the complete story. Large installations with many networked hosts keep the information in network databases: a small number of servers store the information, and the other hosts access it over the network. This usage is transparent to most applications, since the information is retrieved through the same API that was used for reading the text files. It is for this reason that POSIX standardizes only the APIs; the /etc/passwd and /etc/group files need not exist as such for a system to be POSIX compliant.

The APIs to the two databases are similar; most of our discussion focuses on the user database.

6.3.1 User Database

The traditional /etc/passwd format maintains one line per user. Each line has seven fields, each of which is separated from the next by a colon character:

$ grep arnold /etc/passwd
arnold:x:1000:1000:Arnold D. Robbins,,,:/home/arnold:/bin/bash

In order, the fields are as follows:

  • The user name

    This is what the user types to log in. It is also what shows up for ‘ls -l’ and in any other context that displays users.

  • The password field

    Once upon a time, this was the user’s encrypted password. Today, this field is likely to be an x (as shown), meaning that the password information is held in a different file that isn’t world-readable. This separation is a security measure; if the encrypted password isn’t available to nonprivileged users, it is much harder to “crack.”

  • The user ID number

    This should be unique—one number per user.

  • The group ID number

    This is the user’s initial group ID number. As is discussed later, processes have multiple groups associated with them.

  • The user’s real name

    This is at least a first and last name. Some systems allow for comma-separated fields, for office location, phone number, and so on (again, as shown), but this is not standardized.

  • The login directory

    This directory becomes the home directory for users when they log in ($HOME—the default for the cd command).

  • The login program

    The program to run when the user logs in. This is usually a shell, but it need not be. If this field is left empty, the default is /bin/sh.
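To make the seven-field layout concrete, here is a minimal sketch that splits one passwd-format line in place. The helper name split_passwd_line() and the PASSWD_NFIELDS macro are our own inventions for illustration, not part of any library. The sketch scans for colons by hand because fields may legitimately be empty, which strtok() would silently skip.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PASSWD_NFIELDS 7    /* hypothetical name for the field count */

/* split_passwd_line --- split a passwd-format line in place (illustrative helper).
 * Fills fields[] and returns 0 on success; returns -1 if the line does
 * not have exactly seven colon-separated fields. */
int
split_passwd_line(char *line, char *fields[PASSWD_NFIELDS])
{
    int n = 0;
    char *p = line;
    char *colon;

    assert(line != NULL && fields != NULL);

    line[strcspn(line, "\n")] = '\0';    /* drop a trailing newline, if any */

    for (;;) {
        if (n == PASSWD_NFIELDS)
            return -1;                   /* too many fields */
        fields[n++] = p;
        colon = strchr(p, ':');
        if (colon == NULL)
            break;                       /* last field reached */
        *colon = '\0';                   /* terminate this field */
        p = colon + 1;
    }

    return (n == PASSWD_NFIELDS) ? 0 : -1;
}
```

For the sample line shown earlier, fields[0] would be "arnold" and fields[6] would be "/bin/bash". Real programs should not parse the file themselves, since the file may not hold the whole database; the sketch only illustrates the format.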

Access to the user database is through the routines declared in <pwd.h>:

#include <pwd.h>                                              POSIX

struct passwd *getpwnam(const char *name);
struct passwd *getpwuid(uid_t uid);

struct passwd *getpwent(void);                                POSIX XSI
void setpwent(void);
void endpwent(void);

The fields in the struct passwd used by the various API routines correspond directly to the fields in the password file:

struct passwd {
    char    *pw_name;      /* user name */
    char    *pw_passwd;    /* user password */
    uid_t   pw_uid;        /* user id */
    gid_t   pw_gid;        /* group id */
    char    *pw_gecos;     /* real name */
    char    *pw_dir;       /* home directory */
    char    *pw_shell;     /* shell program */
};

(The name pw_gecos is historical; when the early Unix systems were being developed, this field held the corresponding information for the user’s account on the Bell Labs Honeywell systems running the GECOS operating system.)
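As a small illustration of how the structure members line up with the file's fields, the following sketch formats a struct passwd back into a colon-separated line. The function name format_passwd() is hypothetical, and the sketch assumes that every string member is non-NULL and that the system's struct passwd has a pw_gecos member, as in the declaration above.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>

/* format_passwd --- write pw as a seven-field line into buf (illustrative helper).
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
int
format_passwd(const struct passwd *pw, char *buf, size_t len)
{
    int n;

    assert(pw != NULL && buf != NULL);

    /* uid_t and gid_t have no printf format of their own; cast to long */
    n = snprintf(buf, len, "%s:%s:%ld:%ld:%s:%s:%s",
                 pw->pw_name, pw->pw_passwd,
                 (long) pw->pw_uid, (long) pw->pw_gid,
                 pw->pw_gecos, pw->pw_dir, pw->pw_shell);

    return (n >= 0 && (size_t) n < len) ? 0 : -1;
}
```

The casts to long mirror the approach used elsewhere in this chapter for printing ID numbers.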

The purpose of each routine is described in the following list:

  • struct passwd *getpwent(void)

    Return a pointer to an internal static struct passwd structure containing the “current” user’s information. This routine reads through the entire password database one record at a time, returning a pointer to a structure for each user. The same pointer is returned each time; that is, the internal struct passwd is overwritten for each user’s entry. When getpwent() reaches the end of the password database, it returns NULL. Thus it lets you step through the entire database, one user at a time. The order in which records are returned is undefined.

  • void setpwent(void)

    Reset the internal state such that the next call to getpwent() returns the first record in the password database.

  • void endpwent(void)

    “Close the database,” so to speak, be it a simple file, network connection, or something else.

  • struct passwd *getpwnam(const char *name)

    Look up the user with a pw_name member equal to name, returning a pointer to a static struct passwd describing the user or NULL if the user is not found.

  • struct passwd *getpwuid(uid_t uid)

    Similarly, look up the user with the user ID number given by uid, returning a pointer to a static struct passwd describing the user or NULL if the user is not found.
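The three enumeration routines are typically used together. The sketch below, built around a hypothetical helper named count_users_with_shell(), walks the whole database once and counts the records whose login program matches a given path. It assumes that a feature-test macro such as _XOPEN_SOURCE is what makes the XSI routines visible on your system; some systems declare them without it.

```c
#define _XOPEN_SOURCE 700    /* request the XSI routines, such as getpwent() */
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <string.h>

/* count_users_with_shell --- count records whose pw_shell equals shell */
size_t
count_users_with_shell(const char *shell)
{
    struct passwd *pw;
    size_t n = 0;

    assert(shell != NULL);

    setpwent();                              /* start at the first record */
    while ((pw = getpwent()) != NULL) {      /* same static struct each time */
        if (pw->pw_shell != NULL && strcmp(pw->pw_shell, shell) == 0)
            n++;
    }
    endpwent();                              /* release the database */

    return n;
}
```

Because the function calls setpwent() before looping, it gives the same answer each time it is called, regardless of any earlier enumeration in the program.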

getpwuid() is what’s needed when you have a user ID number (such as from a struct stat) and you wish to print the corresponding user name. It’s also useful if you want to look up your own information, such as home directory or login shell, based on the return value of getuid(). (getuid() is presented later, in Section 11.2, “Retrieving User and Group IDs,” page 385.) getpwnam() converts a name to a user ID number, for example, if you wish to use chown() or fchown() on a file. In theory, both of these routines do a linear search through the password database to find the desired information. This is true in practice when a password file is used; however, behind-the-scenes databases (network or otherwise, as on BSD systems) tend to use more efficient methods of storage, so these calls may be less expensive in such cases.
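The two conversions just described might be wrapped as follows. This is only a sketch; the helper names uid_to_name() and name_to_uid() are ours. Both helpers copy the value they need out of the static struct passwd right away, since a later lookup may overwrite it.

```c
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* uid_to_name --- copy the user name for uid into buf.
 * Returns 0 on success, -1 if uid is unknown or buf is too small. */
int
uid_to_name(uid_t uid, char *buf, size_t len)
{
    struct passwd *pw;

    assert(buf != NULL && len > 0);

    pw = getpwuid(uid);
    if (pw == NULL || strlen(pw->pw_name) >= len)
        return -1;

    strcpy(buf, pw->pw_name);    /* copy out of the static structure */
    return 0;
}

/* name_to_uid --- store the user ID for name in *uid.
 * Returns 0 on success, -1 if name is unknown. */
int
name_to_uid(const char *name, uid_t *uid)
{
    struct passwd *pw;

    assert(name != NULL && uid != NULL);

    pw = getpwnam(name);
    if (pw == NULL)
        return -1;

    *uid = pw->pw_uid;
    return 0;
}
```

A caller that wants to report the owner of a file would pass the st_uid member of a struct stat to uid_to_name(); a caller preparing a chown() call would use name_to_uid().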

getpwent() is useful when you need to go through the entire password database. For instance, you might wish to read it all into memory, sort it, and then search it quickly with bsearch(). This is very useful for avoiding the multiple linear searches inherent in looking things up one at a time with getpwuid() or getpwnam().
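Here is one way the read, sort, and search approach might look. It is a sketch under stated assumptions: struct user_entry, load_users(), and find_user() are hypothetical names, only the user ID and name are kept, and the table is sorted by user ID. Each name is deep-copied because getpwent() reuses its internal structure.

```c
#define _XOPEN_SOURCE 700    /* request the XSI routines, such as getpwent() */
#include <assert.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* struct user_entry --- hypothetical record holding only what we keep */
struct user_entry {
    uid_t uid;
    char *name;
};

/* cmp_uid --- order entries by user ID, for qsort() and bsearch() */
static int
cmp_uid(const void *a, const void *b)
{
    const struct user_entry *ua = a;
    const struct user_entry *ub = b;

    return (ua->uid > ub->uid) - (ua->uid < ub->uid);
}

/* load_users --- copy every record into a malloc()ed array sorted by uid.
 * Stores the count in *np; returns NULL on failure or if there are no records. */
struct user_entry *
load_users(size_t *np)
{
    struct user_entry *tab = NULL, *tmp;
    struct passwd *pw;
    size_t n = 0, cap = 0;

    assert(np != NULL);
    *np = 0;

    setpwent();
    while ((pw = getpwent()) != NULL) {
        if (n == cap) {                      /* grow the array as needed */
            cap = (cap == 0) ? 64 : cap * 2;
            tmp = realloc(tab, cap * sizeof *tab);
            if (tmp == NULL)
                goto fail;
            tab = tmp;
        }
        /* Deep copy: getpwent() reuses its static struct passwd */
        tab[n].name = malloc(strlen(pw->pw_name) + 1);
        if (tab[n].name == NULL)
            goto fail;
        strcpy(tab[n].name, pw->pw_name);
        tab[n].uid = pw->pw_uid;
        n++;
    }
    endpwent();

    if (n > 0)
        qsort(tab, n, sizeof *tab, cmp_uid);
    *np = n;
    return tab;

fail:
    endpwent();
    while (n > 0)
        free(tab[--n].name);
    free(tab);
    return NULL;
}

/* find_user --- binary search the sorted table; NULL if uid is absent */
const struct user_entry *
find_user(const struct user_entry *tab, size_t n, uid_t uid)
{
    struct user_entry key;

    if (tab == NULL || n == 0)
        return NULL;
    key.uid = uid;
    key.name = NULL;
    return bsearch(&key, tab, n, sizeof *tab, cmp_uid);
}
```

If several names share one user ID, bsearch() may return any of the matching entries. A program that needs lookups by name as well could keep a second array sorted with strcmp().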

6.3.2 Group Database

The format of the /etc/group group database is similar to that of /etc/passwd, but with fewer fields:

$ grep arnold /etc/group | sort
adm:x:4:syslog,arnold,miriam
arnold:x:1000:
audio:x:29:pulse,arnold,miriam
cdrom:x:24:arnold,miriam,videos
dialout:x:20:miriam,arnold,videos
...

Again, there is one line per group, with fields separated by colons. The fields are as follows:

  • The group name

    This is the name of the group, as shown in ‘ls -l’ or in any other context in which a group name is needed.

  • The group password

    This field is historical. It is no longer used.

  • The group ID number

    As with the user ID, this should be unique to each group.

  • The user list

    This is a comma-separated list of users who are members of the group.

In the previous example, we see that user arnold is a member of multiple groups. This membership is reflected in practice in what is termed the group set. Besides the main user ID and group ID number that processes have, the group set is a set of additional group ID numbers that each process carries around with it. The system checks all of these group ID numbers against a file’s group ID number when performing permission checking. This subject is discussed in more detail in Chapter 11, “Permissions and User and Group ID Numbers,” page 383.

The group database APIs are similar to those for the user database. The following functions are declared in <grp.h>:

#include <grp.h>                                        POSIX

struct group *getgrnam(const char *name);
struct group *getgrgid(gid_t gid);

struct group *getgrent(void);                           POSIX XSI
void setgrent(void);
void endgrent(void);

The struct group corresponds to the records in /etc/group:

struct group {
    char    *gr_name;        /* group name */
    char    *gr_passwd;      /* group password */
    gid_t   gr_gid;          /* group id */
    char    **gr_mem;        /* group members */
};

The gr_mem field bears some explanation. While declared as a pointer to a pointer (char **), it is best thought of as an array of strings (like argv). The last element in the array is set to NULL. When no members are listed, the first element in the array is NULL.
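Treating gr_mem like argv leads to loops of the following shape. The two helpers here, group_member_count() and group_has_member(), are hypothetical names used only for this sketch; both rely on nothing more than the NULL terminator described above.

```c
#include <assert.h>
#include <grp.h>
#include <stddef.h>
#include <string.h>

/* group_member_count --- return the number of names in gr->gr_mem */
size_t
group_member_count(const struct group *gr)
{
    size_t n = 0;

    assert(gr != NULL && gr->gr_mem != NULL);

    while (gr->gr_mem[n] != NULL)    /* stop at the terminating NULL */
        n++;
    return n;
}

/* group_has_member --- return nonzero if user is listed in gr->gr_mem */
int
group_has_member(const struct group *gr, const char *user)
{
    char **mp;

    assert(gr != NULL && gr->gr_mem != NULL && user != NULL);

    for (mp = gr->gr_mem; *mp != NULL; mp++)
        if (strcmp(*mp, user) == 0)
            return 1;
    return 0;
}
```

For a group with no listed members, the first element is NULL, so group_member_count() returns 0 and group_has_member() returns 0 without ever comparing a string.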

ch-general1-groupinfo.c demonstrates how to use the struct group and the gr_mem field. The program accepts a single user name on the command line and prints all group records in which that user name appears:

 1  /* ch-general1-groupinfo.c --- Demonstrate getgrent() and struct group */
 2
 3  #include <stdio.h>
 4  #include <stdlib.h>
 5  #include <string.h>
 6  #include <grp.h>
 7
 8  extern void print_group(const struct group *gr);
 9
10  /* main --- print group lines for user named in argv[1] */
11
12  int
13  main(int argc, char **argv)
14  {
15      struct group *gr;
16      int i;
17
18      if (argc != 2) {                                       /* Check arguments */
19          fprintf(stderr, "usage: %s user\n", argv[0]);
20          exit(1);
21      }
22
23      while ((gr = getgrent()) != NULL)                      /* Get each group record */
24          for (i = 0; gr->gr_mem[i] != NULL; i++)            /* Look at each member */
25              if (strcmp(gr->gr_mem[i], argv[1]) == 0)       /* If found the user ... */
26                  print_group(gr);                           /* Print the record */
27
28      endgrent();
29
30      exit(0);
31  }

The main() routine first does error checking (lines 18–21). The heart of the program is a nested loop. The outer loop (line 23) loops over all the group database records. The inner loop (line 24) loops over the members of the gr_mem array. If one of the members matches the name from the command line (line 25), then print_group() is called to print the record (line 26). Here is print_group():

33  /* print_group --- print a group record */
34
35  void
36  print_group(const struct group *gr)
37  {
38      int i;
39
40      printf("%s:%s:%ld:", gr->gr_name, gr->gr_passwd, (long) gr->gr_gid);
41
42      for (i = 0; gr->gr_mem[i] != NULL; i++) {
43          printf("%s", gr->gr_mem[i]);
44          if (gr->gr_mem[i+1] != NULL)
45              putchar(',');
46      }
47
48      putchar('\n');
49  }

The print_group() function (lines 35–49) is straightforward, with logic similar to that of main() for printing the member list. Group list members are comma separated; thus the loop body has to check that the next element in the array is not NULL before printing a comma. This code works correctly, even if there are no members in the group. However, for this program, we know there are members, or print_group() wouldn’t have been called! Here’s what happens when the program is run:

$ ch-general1-groupinfo arnold | sort
adm:x:4:syslog,arnold,miriam
audio:x:29:pulse,arnold,miriam
cdrom:x:24:arnold,miriam,videos
dialout:x:20:miriam,arnold,videos
...

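Interestingly, the line ‘arnold:x:1000:’ is missing. This is because that group’s member list is empty, and our program examines only the members of that list. User arnold belongs to group 1000 through the group ID field of his /etc/passwd entry, not through the member list in /etc/group.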
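One way to account for the user's initial group as well is to look up the user's pw_gid first and then compare it against each group's gr_gid while scanning. The following is a sketch only; is_user_group() and print_user_groups() are hypothetical names, and the function prints just the group names rather than full records.

```c
#define _XOPEN_SOURCE 700    /* request the XSI routines, such as getgrent() */
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* is_user_group --- nonzero if gr is user's initial group or lists user */
static int
is_user_group(const struct group *gr, const char *user, gid_t primary)
{
    char **mp;

    if (gr->gr_gid == primary)           /* membership via /etc/passwd */
        return 1;
    for (mp = gr->gr_mem; *mp != NULL; mp++)
        if (strcmp(*mp, user) == 0)      /* membership via the member list */
            return 1;
    return 0;
}

/* print_user_groups --- print the name of each group that user belongs to.
 * Returns the number of groups printed, or -1 if user is unknown. */
int
print_user_groups(const char *user, FILE *out)
{
    struct passwd *pw;
    struct group *gr;
    gid_t primary;
    int count = 0;

    assert(user != NULL && out != NULL);

    pw = getpwnam(user);
    if (pw == NULL)
        return -1;
    primary = pw->pw_gid;                /* save it before any other lookup */

    setgrent();
    while ((gr = getgrent()) != NULL) {
        if (is_user_group(gr, user, primary)) {
            fprintf(out, "%s\n", gr->gr_name);
            count++;
        }
    }
    endgrent();

    return count;
}
```

With the sample data shown in this section, a call such as print_user_groups("arnold", stdout) would be expected to include the arnold group along with adm, audio, and the others. The user argument should not point into the static struct passwd, so callers should copy the name first if it came from an earlier lookup.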
