Other Tools

In addition to the general tools and subsystem-specific ones, you have a variety of mixed and other performance-monitoring tools at your disposal. The next sections look at these in more detail.

ps

The ps command is another heavily used tool where performance is concerned. Most often it is used to isolate a particular process, but it also has numerous options that can save you time while doing so.

The ps command basically reports process status. When invoked without any options, the output looks something like this:

$ ps
  PID TTY          TIME CMD
 3220 pts/0    00:00:00 bash
 3251 pts/0    00:00:00 ps

This shows only the processes that belong to the current session of the user who invoked it.

Obviously, just seeing what you are doing in your current session is not always all that helpful—unless, of course, you are doing something very detrimental in the background!

To look at other users or the system as a whole, ps requires some further options. The ps command's options on Linux are actually grouped into sections based on selection criteria.

Let's look at these sections and what they can do.

Simple Process Selection

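Using simple process selection, you can be a little selective about what you see. For example, if you want to see only processes that are attached to your current terminal, you would use the -T option (on current procps releases, -T shows threads instead, and the undashed T selects the processes on the current terminal):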

[jfink@kerry jfink]$ ps -T
  PID TTY      STAT   TIME COMMAND
 1668 pts/0    S      0:00 login -- jfink
 1669 pts/0    S      0:00 -bash
 1708 pts/0    R      0:00 ps -T

Process Selection by List

Another way to control what you see with ps is to select by a list. For example, if you want to see all the identd processes running, you would use the -C option from this group, which selects processes by command name:

[jfink@kerry jfink]$ ps -C identd
  PID TTY          TIME CMD
  535 ?        00:00:00 identd
  542 ?        00:00:00 identd
  545 ?        00:00:00 identd
  546 ?        00:00:00 identd
  550 ?        00:00:00 identd

Output Format Control

Following process selection is output control. This is helpful when you want to see information in a particular format. A good example is using the jobs format with the -j option:

[jfink@kerry jfink]$ ps -j
  PID  PGID   SID TTY          TIME CMD
 1669  1669  1668 pts/0    00:00:00 bash
 1729  1729  1668 pts/0    00:00:00 ps

Output Modifiers

Output modifiers can apply high-level changes to the output. The following is the output using the e option, which shows each process's environment after its command:

[jfink@kerry jfink]$ ps ae
  PID TTY      STAT   TIME COMMAND
 1668 pts/0    S      0:00 login -- jfink
 1669 pts/0    S      0:00 -bash TERM=ansi REMOTEHOST=172.16.14.102 HOME=/home/j
 1754 pts/0    R      0:00 ps ae LESSOPEN=|/usr/bin/lesspipe.sh %s 

The remaining sections are INFORMATION, which provides versioning information and help, and OBSOLETE options. The next three sections give some specific cases of using ps with certain options.

Some Sample ps Output

Of course, reading the man page helps, but a few practical applied examples always light the way a little better.

The most commonly used ps switch on Linux and BSD systems is this:

$ ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  1116  380 ?        S    Jan27   0:01 init [3]
root         2  0.0  0.0     0    0 ?        SW   Jan27   0:03 [kflushd]
root         3  0.0  0.0     0    0 ?        SW   Jan27   0:18 [kupdate]
root         4  0.0  0.0     0    0 ?        SW   Jan27   0:00 [kpiod]
root         5  0.0  0.0     0    0 ?        SW   Jan27   0:38 [kswapd]
bin        260  0.0  0.0  1112  452 ?        S    Jan27   0:00 portmap
root       283  0.0  0.0  1292  564 ?        S    Jan27   0:00 syslogd -m 0
root       294  0.0  0.0  1480  700 ?        S    Jan27   0:00 klogd
daemon     308  0.0  0.0  1132  460 ?        S    Jan27   0:00 /usr/sbin/atd
root       322  0.0  0.0  1316  460 ?        S    Jan27   0:00 crond
root       336  0.0  0.0  1260  412 ?        S    Jan27   0:00 inetd
root       371  0.0  0.0  1096  408 ?        S    Jan27   0:00 rpc.rquotad
root       382  0.0  0.0  1464  160 ?        S    Jan27   0:00 [rpc.mountd]
root       393  0.0  0.0     0    0 ?        SW   Jan27   2:15 [nfsd]
root       394  0.0  0.0     0    0 ?        SW   Jan27   2:13 [nfsd]
root       395  0.0  0.0     0    0 ?        SW   Jan27   2:13 [nfsd]
root       396  0.0  0.0     0    0 ?        SW   Jan27   2:12 [nfsd]
root       397  0.0  0.0     0    0 ?        SW   Jan27   2:12 [nfsd]
root       398  0.0  0.0     0    0 ?        SW   Jan27   2:12 [nfsd]
root       399  0.0  0.0     0    0 ?        SW   Jan27   2:11 [nfsd]
root       400  0.0  0.0     0    0 ?        SW   Jan27   2:14 [nfsd]
root       428  0.0  0.0  1144  488 ?        S    Jan27   0:00 gpm -t ps/2
root       466  0.0  0.0  1080  408 tty1     S    Jan27   0:00 /sbin/mingetty tt
root       467  0.0  0.0  1080  408 tty2     S    Jan27   0:00 /sbin/mingetty tt
root       468  0.0  0.0  1080  408 tty3     S    Jan27   0:00 /sbin/mingetty tt
root       469  0.0  0.0  1080  408 tty4     S    Jan27   0:00 /sbin/mingetty tt
root       470  0.0  0.0  1080  408 tty5     S    Jan27   0:00 /sbin/mingetty tt
root       471  0.0  0.0  1080  408 tty6     S    Jan27   0:00 /sbin/mingetty tt
root      3326  0.0  0.0  1708  892 ?        R    Jan30   0:00 in.telnetd
root      3327  0.0  0.1  2196 1096 pts/0    S    Jan30   0:00 login -- jfink
jfink     3328  0.0  0.0  1764 1012 pts/0    S    Jan30   0:00 -bash
jfink     3372  0.0  0.0  2692 1008 pts/0    R    Jan30   0:00 ps aux

The output implies that this system's main job is to serve files via NFS, and indeed it is. It also doubles as an FTP server, but no connections were active when this output was captured.

The output of ps can tell you a lot more, sometimes just simple things that can improve performance. Looking at this NFS server again, you can see that it is not too busy; actually, it gets used only a few times a day. So what are some simple things that could be done to make it run even faster? Well, for starters, you could reduce the number of virtual consoles that are accessible via the system console. I like to have a minimum of three running (in case I lock up one or two). A total of six are shown in the output (the mingetty processes). There are also eight nfsd processes available; if the system is not used very often and only by a few users, that number can be reduced to something a little more reasonable.

Now you can see where tuning can be applied outside the kernel. Sometimes entire processes do not need to be running at all, and services that run multiple instances (such as NFS, MySQL, or HTTP) can be trimmed to the number required for good operation.
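
As a rough aid for spotting services that run many instances, the following sketch counts processes by command name. The function name count_by_command is made up for this illustration, and the input is assumed to be one command name per line, such as the output of ps -eo comm=.

```shell
# Count how many processes share each command name.
# Reads one command name per line on stdin (for example, from `ps -eo comm=`)
# and prints "count name" pairs, most numerous first.
# count_by_command is a hypothetical helper name, not a standard tool.
count_by_command() {
    awk '{ n[$0]++ } END { for (c in n) print n[c], c }' |
        sort -k1,1nr -k2
}
```

Running ps -eo comm= | count_by_command on the server above would be expected to put nfsd and mingetty near the top of the list, which is a hint about where instance counts could be trimmed.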

The Process Forest

The process forest is a great way of seeing exactly how processes and their parents are related; ps draws it when you add the f (or --forest) option. The following output is a portion of the forest from the same system used in the previous section:

...
root       336  0.0  0.0  1260  412 ?        S    Jan27   0:00 inetd
root      3326  0.0  0.0  1708  892 ?        S    Jan30   0:00  \_ in.telnetd
root      3327  0.0  0.1  2196 1096 pts/0    S    Jan30   0:00      \_ login --
jfink     3328  0.0  0.0  1768 1016 pts/0    S    Jan30   0:00          \_ -bash
jfink     3384  0.0  0.0  2680  976 pts/0    R    Jan30   0:00              \_ ps
...

Based on that output, you can easily see how the fork system call got its name.

This view is very useful in practice. Sometimes a process itself is not to blame: what if you kill an offending process only to find that it has respawned? The tree view can help you track down the parent that keeps restarting it and kill that instead.
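
When the tree is too large to read by eye, the same parent chain can be followed with a small script. The sketch below is only an illustration: the function name ancestors is invented here, and it assumes input rows of the form "pid ppid command", as printed by ps -eo pid=,ppid=,comm=.

```shell
# Print the chain of ancestors for a PID, starting with the process itself.
# Usage: ps -eo pid=,ppid=,comm= | ancestors PID
# stdin must hold "pid ppid command" rows, as produced by the ps call above.
# ancestors is a hypothetical helper name used only for this sketch.
ancestors() {
    awk -v start="$1" '
        { parent[$1] = $2; name[$1] = $3 }
        END {
            pid = start
            # Walk upward until we reach a PID that is not in the table
            # (the parent of init is reported as 0) or we detect a loop.
            while ((pid in parent) && !(pid in seen)) {
                seen[pid] = 1
                printf "%s %s\n", pid, name[pid]
                pid = parent[pid]
            }
        }'
}
```

For the telnet session shown in the forest, ps -eo pid=,ppid=,comm= | ancestors 3328 would be expected to list bash, login, in.telnetd, inetd, and finally init, which makes it clear which parent to look at if a child keeps coming back.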

Singling Out a User

Last but definitely not least, you might need (or want) to look at a particular user's activities. On this particular system, my user account is the only userland account that does anything. I have chosen root to be the user to look at:

$ ps u --User root
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  1116  380 ?        S    Jan27   0:01 init [3]
root         2  0.0  0.0     0    0 ?        SW   Jan27   0:03 [kflushd]
root         3  0.0  0.0     0    0 ?        SW   Jan27   0:18 [kupdate]
root         4  0.0  0.0     0    0 ?        SW   Jan27   0:00 [kpiod]
root         5  0.0  0.0     0    0 ?        SW   Jan27   0:38 [kswapd]
root       283  0.0  0.0  1292  564 ?        S    Jan27   0:00 syslogd -m 0
root       294  0.0  0.0  1480  700 ?        S    Jan27   0:00 klogd
daemon     308  0.0  0.0  1132  460 ?        S    Jan27   0:00 /usr/sbin/atd
root       322  0.0  0.0  1316  460 ?        S    Jan27   0:00 crond
root       336  0.0  0.0  1260  412 ?        S    Jan27   0:00 inetd
root       350  0.0  0.0  1312  512 ?        S    Jan27   0:00 lpd
root       371  0.0  0.0  1096  408 ?        S    Jan27   0:00 rpc.rquotad
root       382  0.0  0.0  1464  160 ?        S    Jan27   0:00 [rpc.mountd]
root       393  0.0  0.0     0    0 ?        SW   Jan27   2:15 [nfsd]
root       394  0.0  0.0     0    0 ?        SW   Jan27   2:13 [nfsd]
root       395  0.0  0.0     0    0 ?        SW   Jan27   2:13 [nfsd]
root       396  0.0  0.0     0    0 ?        SW   Jan27   2:12 [nfsd]
root       397  0.0  0.0     0    0 ?        SW   Jan27   2:12 [nfsd]
root       398  0.0  0.0     0    0 ?        SW   Jan27   2:12 [nfsd]
root       399  0.0  0.0     0    0 ?        SW   Jan27   2:11 [nfsd]
root       400  0.0  0.0     0    0 ?        SW   Jan27   2:14 [nfsd]
root       428  0.0  0.0  1144  488 ?        S    Jan27   0:00 gpm -t ps/2
root       466  0.0  0.0  1080  408 tty1     S    Jan27   0:00 /sbin/mingetty tty
root       467  0.0  0.0  1080  408 tty2     S    Jan27   0:00 /sbin/mingetty tty
root       468  0.0  0.0  1080  408 tty3     S    Jan27   0:00 /sbin/mingetty tty
root       469  0.0  0.0  1080  408 tty4     S    Jan27   0:00 /sbin/mingetty tty
root       470  0.0  0.0  1080  408 tty5     S    Jan27   0:00 /sbin/mingetty tty
root       471  0.0  0.0  1080  408 tty6     S    Jan27   0:00 /sbin/mingetty tty
root      3326  0.0  0.0  1708  892 ?        R    Jan30   0:00 in.telnetd
root      3327  0.0  0.1  2196 1096 pts/0    S    Jan30   0:00 login -- jfink

Viewing only a single user's processes is helpful when that user might have a runaway process. Here's a quick example: A particular piece of software used by the company for which I work did not properly die when an attached terminal disappeared (it has been cleaned up since then). It collected error messages in memory until it was killed. To make matters worse, these error messages went into shared memory queues.

The only solution was for the system administrator to log in and kill the offending process. Of course, after a period of time, a script was written that would allow users to do this in a safe manner. On this particular system, there were thousands of concurrent processes. Only by filtering based on the user or doing a grep from the whole process table was it possible to figure out which process it was and any other processes that it might be affecting.
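
On a system with thousands of processes, a filter like the following can narrow the search to one user's largest processes. This is a sketch under stated assumptions: top_rss_for_user is a name invented for this example, and the input is assumed to be "user pid rss command" rows, as printed by ps -eo user=,pid=,rss=,comm=.

```shell
# Print the processes of one user, largest resident set size (RSS) first.
# Usage: ps -eo user=,pid=,rss=,comm= | top_rss_for_user USER [COUNT]
# Output columns: rss (KB), pid, command. COUNT defaults to 5.
# top_rss_for_user is a hypothetical helper name used only for this sketch.
top_rss_for_user() {
    awk -v user="$1" '$1 == user { print $3, $2, $4 }' |
        sort -k1,1nr |
        head -n "${2:-5}"
}
```

A process that keeps collecting messages in memory, like the one described above, would tend to rise to the top of this list over time. Note that ps may abbreviate long user names, so the name passed in has to match what ps actually prints.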

free

The free command rapidly snags information about the state of memory on your Linux system. The syntax for free is pretty straightforward:

$ free

The following is an example of free's output:

$ free
             total       used       free     shared    buffers     cached
Mem:       1036152    1033560       2592       8596      84848     932080
-/+ buffers/cache:      16632    1019520
Swap:       265064        380     264684

The first line of output shows the physical memory, and the last line shows similar information about swap. Table 3.9 explains the output of free.

Table 3.9  free Command Output Fields

Field      Description
total      Total amount of user-available memory, excluding kernel memory. (Don't be alarmed when this is lower than the memory on the machine.)
used       Total amount of used memory.
free       Total amount of memory that is free.
shared     Total amount of shared memory that is in use.
buffers    Current size of the disk buffer cache.
cached     Amount of memory used to cache file data read from disk (the page cache).
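
The "-/+ buffers/cache" line in the sample output can be derived from the Mem: line: memory used by applications is used minus buffers minus cached, and memory available to them is free plus buffers plus cached. The sketch below redoes that arithmetic. The function name buffers_cache_adjust is invented for this example, and it assumes the older six-column layout shown in the sample; newer releases of free print different columns.

```shell
# Recompute the "-/+ buffers/cache" figures from the Mem: line of the
# older free layout (columns: total used free shared buffers cached).
# Reads free output on stdin; prints "used_by_apps available_to_apps" in KB.
# buffers_cache_adjust is a hypothetical helper name used only for this sketch.
buffers_cache_adjust() {
    awk '$1 == "Mem:" { print $3 - $6 - $7, $4 + $6 + $7; exit }'
}
```

Fed the first sample, this gives 1033560 - 84848 - 932080 = 16632 KB in use by applications and 2592 + 84848 + 932080 = 1019520 KB available, which matches the second line of that output.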

An analysis of the sample output shows that this system seems to be pretty healthy. Of course, this is only one measurement. What if you want to watch the memory usage over time? The free command provides an option to do just that: the -s option. The -s option activates polling at a specified interval. The following is an example:

[jfink@kerry jfink]$ free -s 60
             total       used       free     shared    buffers     cached
Mem:        257584      65244     192340      12132      40064       4576
-/+ buffers/cache:      20604     236980
Swap:      1028120          0    1028120

             total       used       free     shared    buffers     cached
Mem:        257584      66424     191160      12200      40084       5728
-/+ buffers/cache:      20612     236972
Swap:      1028120          0    1028120

             total       used       free     shared    buffers     cached
Mem:        257584      66528     191056      12200      40084       5812
-/+ buffers/cache:      20632     236952
Swap:      1028120          0    1028120
...

To stop free from polling, press the interrupt key (usually Ctrl+C).

These measurements show a pretty quiet system, but the free command can come in handy if you want to see the effect of one particular command on the system. Run the command when the system is idling, and poll memory with free. free is well suited for this because of the granularity that you get in the output.
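
To make the trend in polled output easier to read, the free column can be pulled out of each sample and compared with the previous one. The following is a minimal sketch; free_deltas is a name made up for this example, and it assumes that the fourth field of each Mem: line is the free column, as in the samples above.

```shell
# Extract the "free" column from each Mem: line of polled free output
# and print it next to the change since the previous sample (in KB).
# The first sample has no predecessor, so its change is printed as 0.
# free_deltas is a hypothetical helper name used only for this sketch.
free_deltas() {
    awk '$1 == "Mem:" {
        delta = (seen ? $4 - prev : 0)
        print $4, delta
        prev = $4
        seen = 1
    }'
}
```

Applied to the three samples above, this would show free memory dropping by 1180 KB and then by 104 KB. Something like free -s 5 -c 3 | free_deltas, started just before the command under test, gives a quick before-and-after picture.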

time

One very simple tool for examining the system is the time command. The time command comes in handy for relatively quick checks of how the system performs when a certain command is invoked. The way this works is simple: you launch time with the command to be measured, and when that command finishes, time reports how long it ran and what resources it used:

$ time <command_name> [options]

Here is an example:

$ time cc hello.c -o hello

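The output from the time command looks like this (the format shown is the one printed by the external GNU time program; the time keyword built into bash prints separate real, user, and sys lines instead):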

$ time cc hello.c -o hello
0.08user 0.04system 0:00.11elapsed 107%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (985major+522minor)pagefaults 0swaps

Even though this output is quite low-level, the time command can return very enlightening information about a particular command or program. It becomes very helpful in large environments in which operations normally take a long time. An example of this is comparing kernel compile times between different machines.
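
When comparing runs across machines, it can help to reduce each report to a single figure. The sketch below adds the user and system CPU seconds from a report in the GNU time format shown above. The function name cpu_seconds is invented for this example, and the parsing assumes that exact default format.

```shell
# Sum user and system CPU seconds from the first line of GNU time output,
# e.g. "0.08user 0.04system 0:00.11elapsed 107%CPU ...".
# Reads the report on stdin and prints the total with two decimals.
# cpu_seconds is a hypothetical helper name used only for this sketch.
cpu_seconds() {
    awk '/user .*system / {
        sub(/user$/, "", $1)
        sub(/system$/, "", $2)
        printf "%.2f\n", $1 + $2
        exit
    }'
}
```

For the compile shown above, this yields 0.12. GNU time writes its report to standard error, so a call would look something like /usr/bin/time cc hello.c -o hello 2>&1 | cpu_seconds, assuming the compiler itself prints nothing that resembles the report line.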
