Home > Articles > Operating Systems, Server > Solaris

Drill-Down Monitoring of Database Servers

  • Print
  • + Share This
This article presents a process for identifying and resolving problems with the performance of database servers. Learn how to monitor and tune databases, as well as how to recognize if a shared memory segment uses ISM (Intimate Shared Memory).
Like this article? We recommend

Introduction

Editor's Note

This material is extracted from Chapter 21 of Configuring and Tuning Databases on the Solaris Platform, by Allan N. Packer, (c) 2002, Sun Microsystems Press (ISBN 0-13-083417-3). Chapter 22, Monitoring and Tuning Oracle, will be presented in the July and August 2002 editions of Sun BluePrintsTM OnLine.

The time has come. Armed only with your wits, a little common sense, and some basic system knowledge, you're going to crack the performance problem bedevilling your database server. You roll up your sleeves and seat yourself firmly in front of a keyboard. A cluster of slightly awed colleagues watches wide-eyed over your shoulder.

OK, perhaps I'm getting a bit carried away here. Suffice it to say that the aim of this chapter is to develop a simple method for identifying performance problems on database servers.

I'm assuming you have already looked at the issues affecting server performance (covered in earlier chapters of Configuring and Tuning Databases on the Solaris Platform). Your application's behavior is well understood and consultants in small doses have already done wonders with application performance. You've done what you can to fix any database schema design problems and the addition of a couple of crucial indexes has already calmed the users down a little.

You've checked out obscure things like environment variables and racked your brains for other issues that might need attention. But performance problems still persist. Perhaps you need to upgrade your hardware, but at this point you're not sure.

Where should you start? I'm going to suggest a five-step process that will walk you through the major components of the system: memory, disk I/O, network, and CPU, followed by database monitoring and tuning. If a problem becomes apparent in one of these areas, there may be further steps to narrow the problem. This kind of "drill down" approach is an effective way to identify and ultimately solve problems.

If you find a bottleneck (by which I mean a constriction of performance, just as the neck of a bottle limits the flow of liquid into or out of a bottle), does that mean you should look no further? I would suggest going through the whole process anyway to see what you can discover.

Bear in mind, though, that fixing a bottleneck in one place might expose another elsewhere. Suppose, for example, your system is paging severely due to a lack of memory, but no problems are apparent elsewhere. Adding memory might allow your throughput to improve to the point where one of the disks becomes overutilized, resulting in a disk bottleneck. Checking out the disks as well might give a hint of problems to come.

Once you've found and resolved a bottleneck, go through the entire process again.

Finally, is the order of the steps important? Of course there are many possible ways of tackling system monitoring, but I suggest you go through the steps in the order shown.

STEP 1. Monitoring Memory

To check for a memory bottleneck, use the vmstat utility, which shows, among other things, memory behavior for the system. A 5-second interval is a good choice for live monitoring.

The vmstat trace in Figure 1 shows a system with no evidence of memory shortfall.

Figure 1Figure 1 vmstat trace with no memory shortfall

Figure 2 shows a vmstat trace from the same system during a severe memory shortfall.

Figure 2Figure 2 vmstat trace with severe memory shortfall

What to Look For

Look for po (pageouts—the kilobytes paged out per second) and sr (scan rate—the number of pages scanned by the clock algorithm). When both are consistently high at the same time (much more than 100 per second, say, on a system with up to 4 Gbytes of memory, more on a larger system), then it is possible the page daemon is being forced to steal free memory from running processes. Do you need to add more memory to the system? Maybe.

More memory may not help, though. That might sound crazy, but unfortunately the water is a little muddy here. Some explanation might help clarify the situation.

Pageouts can happen for a number of reasons, including the following:

  • Dirty (modified) file system pages are being flushed to disk. Such flushing is normal behavior and does not represent a problem. If database files are placed on file system files, expect to see this kind of pageout.

  • Application pages are being pushed out to the swap device to free up memory for other purposes. If the applications in question are active or about to become active, paging is bad!

  • New memory has been allocated by an application and swap space is being assigned to it. This, too, is normal behavior and does not represent a problem.

  • Memory pages have been freed by applications and are being flushed to disk. Isn't paging a waste of time if the memory is no longer required by the applications? You bet! Solaris 8 introduced a new madvise() flag called MADV_FREE to enable developers to tell the operating system not to bother to flush such pages to swap.

The scan rate is a measure of the activity of the page daemon. The page daemon wakes up and looks for memory pages to free when an application is unable to find enough memory on the free list (memory has fallen to the lotsfree system parameter). The greater the memory shortfall, the faster the page daemon will scan pages in main memory.

The major consumers of memory in a system are:

  • Applications, including text (binary code), stack (which contains information related to the current phase of execution of the program and its functions), heap (which contains program working space), and shared memory.

  • The file system page cache, which contains file system data (all file system disk blocks must first be read into memory before they can be used). This cache becomes important when database files are stored on file systems.

  • The operating system kernel.

Normal Paging Behavior Prior to Solaris 8

Before Solaris 8, the free column reported by vmstat may not be a good indication of the available memory in the system. The reason is that once memory pages are used by the file system page cache, they are not returned to the free list. Instead, the file system data blocks are left in the cache in case they are needed again in the future.

When the page daemon detects a memory shortfall and scans for pages to free, it may well choose to free some of the pages in the file system page cache. If the pages have been modified, they are first flushed to disk. There is no simple way of finding out how much of main memory is being used by the file system page cache at any point, but you can bet it will be substantial if database files are located on UFS files rather than raw devices. The memtool package (Richard McDougall's memory monitoring tool), available on the book website, can identify memory use by UFS files.

The problem is that the page daemon may free application memory pages as well as file system page cache pages since it doesn't know which is which. The result can be severe paging and major performance problems. Adding more memory won't help much, either. It will simply mean that more database pages can be cached in the memory. Fortunately, there is a solution.

Priority Paging

As of Solaris 7, a new feature called priority paging has been added. Priority paging lowers the priority of file system pages in memory so that the page daemon will choose to free them ahead of application pages. This behavior can make a huge difference to paging problems; priority paging should be enabled wherever databases coexist with active file systems, and especially where database files are placed on file systems.

You can activate priority paging by adding the following line to /etc/system and rebooting:

set priority_paging = 1

Patches are available for Solaris 2.5.1 and Solaris 2.6 to add priority paging functionality. From Solaris 8, changes to the virtual memory system mean that priority paging is no longer required and should not be used.

UFS Files and Paging

If your database files are UFS files rather than raw devices, you may observe significant scanning even once you have enabled priority paging. In fact, the scan rate may increase since priority paging causes the page daemon to become active sooner. This behavior is a natural consequence of the need to bring all database pages into the UFS page cache before they can be accessed by the database. The ongoing need to find free memory gives rise to constant scanning activity on busy database servers using UFS files.

If your application carries out updates, inserts, and deletes, you should also expect to see pageout activity. All database writes must go through the UFS page cache before being written to disk. Although the page being written to disk would previously have been read into the UFS page cache, the memory might have since been reused if the scan rate is high. In that case the page must be reread from disk before the write to disk can proceed. This process in turn displaces another page, and the cycle continues.

How do you stop all this paging activity? To eliminate scanning (assuming you have enough memory for your applications), either use raw devices or mount your database partitions with the Direct I/O flag (forcedirectio).

A word of caution: eliminating paging activity with Direct I/O may not always result in instant performance improvements. The file system page cache acts as a second-level cache for database pages, and removing it from the picture will expose any inadequacies in the sizing of your database buffer cache. Make sure that your database buffer cache is adequately sized; otherwise, you may find yourself with plenty of free memory and a database buffer cache starved of buffers.

Enabling Direct I/O for database files, and especially for database logs, can offer significant performance benefits as of the Solaris 8 1/01 release; earlier versions of Direct I/O may not offer significant performance gains. Direct I/O is described in more detail in "Unix File System Enhancements" on page 21 of Configuring and Tuning Databases on the Solaris Platform.

As a final caution, although Direct I/O can prove very useful for database files, do not enable it for nondatabase files without first examining carefully the performance implications of doing so.

Normal Paging Behavior as of Solaris 8

As we have seen, priority paging doesn't go all the way to solving the problem. Although the page daemon will choose file system pages in preference to application pages, the page daemon still has to search through the whole of memory to find them. And if large database buffer caches are being used, file system pages may only represent a small proportion of the total memory, so a lot of searching will be necessary.

As of Solaris 8, file system pages are separately accounted for, so they can be freed without a memory scan to find them. Consequently, the page daemon is not needed at all unless there is a major memory problem. As a result, the likelihood of paging problems is greatly diminished.

Drilling Down Further

If you want to find out where the memory is going, there are a number of options:

  • For the final answer on memory consumption, use memtool. The procmem script described below will provide a detailed breakdown of process memory usage. Both memtool and procmem are available on the book website.

  • Run dmesg and look for Avail mem. The difference between available memory and physical memory (use prtconf or /usr/platform/'arch -k'/sbin/prtdiag to find out the physical memory) indicates the amount of memory reserved for the kernel.

  • Use /usr/ucb/ps -aux to find out which processes are the major memory hogs. This command lists the percentage of memory used by each process. Beware, though! The memory listed is virtual memory, not physical memory, and it may not be a good indication of how much physical memory is actually being consumed by the process at any given moment.

If you have installed the unbundled memtool package, the procmem script will use the pmem program (similar to the standard Solaris pmap program, but with various bugs fixed on some Solaris releases). Some users are unwilling to install an unbundled package on a production system—procmem uses pmap instead if memtool (and therefore pmem) is not available. The procmem script summarizes memory use for all processes and gives a breakdown into resident, shared, and private memory usage. Please note that both procmem and memtool are unsupported software.

Passing the -h parameter to procmem results in the following usage information:

usage: procmem [-v] [-h | -p pidlist | [ -u username ] [ searchstring ]]
Examples:
procmem -p 10784 10759
   - show memory usage for processes with pids 10784 10759
procmem -u root
   - show memory usage for all processes owned by the root user
procmem -u "daemon root"
     - show memory usage for all processes owned by the root & daemon users
procmem netscape
   - show memory usage for process(es) in 'ps -ef' with "netscape"
procmem -u fred netscape
   - show memory usage for "netscape" processes owned by fred
procmem
   - show memory usage for all processes (provided current user has
    superuser access privileges)
Definition of terms
 'Kbytes' is the total memory size of the process or file.
 'Resident' is that portion currently occupying physical memory.
 'Shared' is resident memory capable of being shared.
 'Private' is resident memory unique to this process or file.
 Resident = Shared + Private
Sizing
 For reporting purposes, the 'Shared' component has been counted once
 only while the 'Private' component has been summed for each process
 or file. The /usr/lib shared libraries have been reported separately
 since they tend to be widely used across applications. To be totally
 accurate, though, the shared component of these shared libraries
 should only be counted once across all applications, not once for
 every group of applications. The same logic may apply to other
 shared libraries also used by multiple applications.

The -v flag offers additional detail. An example of procmem output follows for all processes on a server running an Oracle database.

Processes                  Kbytes Resident  Shared Private
---------                  ------ --------  ------ -------
Process Summary (Count)
 -csh (3)                    5376     2456    1064    1392
 -ksh (8)                   14624     2912    1456    1456
 automountd (1)              4088     3656    1792    1864
 cimomboot (1)               1576     1384    1256     128
 cron (1)                    1936     1744    1464     280
 devfsadmd (1)               2776     2536    1592     944
 devfseventd (1)             1272     1232     952     280
 dmispd (1)                  3160     2648    1744     904
 dtlogin (1)                 4920     2856    2192     664
 dwhttpd (2)                20080     7440    4496    2944
 esd (2)                    24712    21768    3880   17888
 grep (1)                     968      936     832     104
 in.ndpd (1)                 1856     1488    1304     184
 in.rdisc (1)                1616     1416    1272     144
 in.rlogind (6)             10368     2464    1360    1104
 inetd (1)                   2648     2384    1384    1000
 init (1)                    1888     1608    1136     472
 iostat (1)                  1824     1784     904     880
 ksh (5)                     9040     2320    1456     864
 lockd (1)                   1896     1656    1264     392
 lpsched (1)                 3040     1768    1552     216
 mibiisa (1)                 2952     2752    1504    1248
 mountd (1)                  2952     2536    1528    1008
 nfsd (1)                    1888     1672    1264     408
 nscd (1)                    3200     2928    1528    1400
 ora_ckpt_bench (1)       1405264  1372488 1371824     664
 ora_dbw0_bench (1)       1406976  1374208 1371824    2384
 ora_dbw1_bench (1)       1406968  1374200 1371824    2376
 ora_dbw2_bench (1)       1406968  1374200 1371824    2376
 ora_dbw3_bench (1)       1406968  1374200 1371824    2376
 ora_lgwr_bench (1)       1405248  1372472 1371824     648
 ora_pmon_bench (1)       1405696  1372920 1371824    1096
 ora_reco_bench (1)       1405160  1372384 1371824     560
 ora_smon_bench (1)       1405192  1372424 1371824     600
 oraclebench (50)        70258944  1400768 1371824   28944
 powerd (1)                  1632     1576     976     600
 rpcbind (1)                 2584     2088    1296     792
 sac (1)                     1736     1488    1296     192
 sendmail (1)                2936     2280    1816     464
 sh (4)                      4176     1320     912     408
 snmpXdmid (1)               3744     3184    2024    1160
 snmpdx (1)                  2144     1920    1520     400
 statd (1)                   2592     2208    1464     744
 syslogd (1)                 4192     3008    1456    1552
 tail (1)                     968      936     792     144
 tee (2)                     1808      952     792     160
 tpccload (50)             418800    38944    3344   35600
 ttymon (2)                  3464     1744    1336     408
 utmpd (1)                   1000      944     816     128
 vmstat (1)                  1312     1280     808     472
 vold (1)                    2696     2408    1768     640
 vxconfigd (1)              14656    13944    1328   12616
-----------------------------------------------------------------------------
File (Count)                      Kbytes Resident  Shared Private
-----------                       ------ --------  ------ -------
/usr/lib Shared Library Totals    291936    27928    2856   25072
Other Shared Library Totals      1473544    29304    7248   22056
Mapped File Totals                   560      488     488       0
Binary File Totals               1378632    18864   13328    5536
Shared Memory Totals            80280128  1360688 1360688       0
Anonymous Memory Totals            89680    84008       0   84008
-----------------------------------------------------------------------------
Grand Totals                    83514480  1521280 1384608  136672

The bulk of the 1.5 Gbytes of resident memory used on this server is accounted for by 1.3 Gbytes of shared memory, which also constitutes the major component of the memory used for the Oracle processes.

The script can be used to report the physical and virtual memory consumption for a group of processes. For example, procmem ora will report memory consumption for all processes that have the string ora in a ps -ef report (Oracle processes typically meet this criterion). If another Oracle user running the same applications is added to the system, you would not expect an increase in the Shared component of memory for /usr/lib shared libraries, other shared libraries, binary files, or shared memory segments. The Private component would be expected to grow, though, for the shared libraries, the mapped files, and anonymous memory. The additional private memory required would probably be roughly equivalent to the private memory total for these applications divided by the current number of users.

Detail is available for all processes, as well as summaries for /usr/lib shared libraries (which tend to be used by many processes throughout a system and so should be counted only once for sizing purposes), other shared libraries (for example, Oracle shared libraries), mapped files (memory-mapped file system files), binary files (executable programs), shared memory segments, and anonymous memory (heap and stack).

The procmem script will accurately show all memory directly used by processes, but not memory belonging to UFS files that are resident in the file system page cache. Since pages from UFS database files are not directly mapped into the address spaces of database processes, they will not appear in the totals. The memps -m command from memtool provides this information (it requires the memtool kernel module—installed when the memtool package is first set up—to be loaded).

What You Can Do to Reduce Paging

If you are using file systems for your database files, the first step is to upgrade to Solaris 8 or else enable priority paging for earlier releases of Solaris. If necessary, you could also consider include the following steps to relieve memory pressures on your database server:

  • Add more memory to the system.

  • Use Direct I/O for database files.

  • Reduce the size of the database buffer cache. This reduction may result in additional database I/O, but that is almost always preferable to paging.

  • Remove applications from the server. If applications are running on the server, move them to a client system and run the applications in client/server mode. Memory should be freed up as a result.

  • Reduce the number of users on the system.

  • + Share This
  • 🔖 Save To Your Account