Working With Multiple Page Sizes in the Solaris OS

This section introduces a strategy for measuring the potential performance gain that could be yielded from an increase in page size. We begin by describing a powerful tool in the Solaris 9 software, trapstat, for easily quantifying the potential gains of using a larger page size. This description is followed by sections that explain the methods we use to estimate the gain in the Solaris 8 OS using the cpustat command.

Deciding When to Use Large Pages

To determine whether we can improve application performance by using a larger page size, we need to determine the amount of time the microprocessor spends servicing TLB misses on behalf of a target application. (See "Understanding Why Virtual-to-Physical Address Translation Affects Performance" on page 2 for further information on why translation misses affect application performance.)

TLB misses are typically accounted for in the context of the running process. For example, if a TLB miss occurs in a user-mode application, it will be counted as user time. Thus, an application might spend a large amount of time having TLB misses serviced, but still report that it spends 100 percent of its time in user mode, as shown in the following sample.

sol8# mpstat 1 3
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
 0  2  0  1  234 134  91  46  0  0  0  25 100  0  0  0
 0  2  0  1  234 134  91  46  0  0  0  25 100  0  0  0
 0  2  0  1  234 134  91  46  0  0  0  25 100  0  0  0

Measuring Application Performance

Two different types of page size observability tools are available in Solaris software: those that describe the page sizes in use by the system or application, and those that help determine whether using large pages will benefit performance.

The pmap(1) and pagesize(1) commands, together with the getpagesize(3C) function and the meminfo(2) interface, report the page sizes in use and the system's ability to support different TLB page sizes. The trapstat(1M) and cpustat(1M) commands can approximate the amount of time that our target application spends waiting for the platform to service TLB misses.

We can use two methods to approximate the amount of time spent on servicing TLB misses:

  • We can observe the rate of TLB misses and multiply that rate by the cost of a single TLB miss.

  • Or, if TLB misses are serviced by system software, we can directly measure the time spent in TLB miss handlers.

In the Solaris 8 OS, the cpustat(1M) command measures the rate of TLB misses, whereas Solaris 9 software provides a new command, trapstat, which computes and displays the amount of time spent servicing TLB misses.

Determine the Number of TLB Misses With trapstat(1M)

The trapstat command in the Solaris 9 software provides information about processor exceptions on UltraSPARC platforms. Because TLB misses are serviced in software on UltraSPARC microprocessors, trapstat can also provide statistics about TLB misses.

Using the trapstat command, we can observe the number of TLB misses and the amount of time spent servicing them. The -t and -T options provide this information. We can then use the time spent servicing TLB misses to approximate the potential gain from using a larger page size or from moving to a platform that uses a microprocessor with a larger TLB.

The -t option provides first-level summary statistics. The time spent servicing TLB misses is summarized in the lower right corner; in the following example, 46.2 percent of the total execution time is spent servicing misses. Miss details are provided separately for the instruction portion and the data portion of the address space, and for user-mode and kernel-mode misses. We are primarily interested in the user-mode misses because our application likely runs in user mode.

sol9# trapstat -t 1 111
cpu m| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
-----+-------------------------------+-------------------------------+----
  0 u|         1  0.0         0  0.0 |   2171237 45.7         0  0.0 |45.7
  0 k|         2  0.0         0  0.0 |      3751  0.1         7  0.0 | 0.1
=====+===============================+===============================+====
 ttl |         3  0.0         0  0.0 |   2192238 46.2         7  0.0 |46.2

For further detail, use the -T option to provide a per-page-size breakdown. In this example, trapstat shows that all of the misses are occurring on 8-kilobyte pages.

sol9# trapstat -T 1 111
cpu m size| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+-------------------------------+----
  0 u   8k|        30  0.0         0  0.0 |   2170236 46.1         0  0.0 |46.1
  0 u  64k|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
  0 u 512k|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
  0 u   4m|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
- - - - - + - - - - - - - - - - - - - - - + - - - - - - - - - - - - - - - + - -
  0 k   8k|         1  0.0         0  0.0 |      4174  0.1        10  0.0 | 0.1
  0 k  64k|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
  0 k 512k|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
  0 k   4m|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
==========+===============================+===============================+====
      ttl |        31  0.0         0  0.0 |   2174410 46.2        10  0.0 |46.2

We can conclude from this analysis that our application could potentially run almost twice as fast if we eliminated the majority of the TLB misses. Our objective in using the mechanisms discussed in the following sections is to minimize the user-mode data TLB (dTLB) misses, potentially by instructing the application to use larger pages for its data segments. Typically, data misses are incurred in the program's heap or stack segments. We can use the Solaris 9 software multiple-page size support commands to direct the application to use 4-megabyte pages for its heap, stack, or anonymous memory mappings.
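
The "almost twice as fast" estimate follows from simple arithmetic on the %tim column. The following Python sketch makes that arithmetic explicit. The function name is ours, and the model assumes that removing TLB miss time leaves all other execution time unchanged, so the result is an upper bound rather than a prediction.

```python
def max_speedup(miss_time_pct, fraction_eliminated=1.0):
    """Upper bound on speedup when part of the TLB miss time is removed.

    miss_time_pct: the %tim value reported by trapstat (0 to just under 100).
    fraction_eliminated: share of that miss time assumed to disappear (0 to 1).
    """
    if not 0 <= miss_time_pct < 100:
        raise ValueError("miss_time_pct must be in the range [0, 100)")
    if not 0 <= fraction_eliminated <= 1:
        raise ValueError("fraction_eliminated must be in the range [0, 1]")
    # Fraction of the original run time that remains after the reduction.
    remaining = 1.0 - (miss_time_pct / 100.0) * fraction_eliminated
    return 1.0 / remaining


if __name__ == "__main__":
    # 46.2 percent is the total %tim from the trapstat sample above.
    print(f"best case:        {max_speedup(46.2):.2f}x")
    print(f"90% of misses cut: {max_speedup(46.2, 0.9):.2f}x")
```

With the 46.2 percent figure from the sample, the best case is a little under 1.9 times the original speed, which is where "almost twice as fast" comes from.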

Assess the Amount of Time Spent on TLB Misses With cpustat(1M)

The cpustat command programs and reads the hardware counters in the microprocessor. These counters measure hardware events within the processor itself. Typically, two counters are available, each of which can be programmed to count one of a larger set of events. The UltraSPARC III processors can count TLB miss events. Because the Solaris 8 OS lacks trapstat, we can use the CPU counters to estimate the amount of time spent servicing TLB misses.

For example, the following cpustat command instructs the system to measure the number of dTLB miss events and the number of microprocessor cycles on each processor.

sol8# cpustat -c pic0=Cycle_cnt,pic1=DTLB_miss 1 
 time  cpu event pic0    pic1
 1.006  0 tick 663839993  3540016
 2.006  0 tick 651943834  3514443
 3.006  0 tick 630482518  3398061
 4.006  0 tick 634483028  3418046
 5.006  0 tick 651910256  3511458
 6.006  0 tick 651432039  3510201
 7.006  0 tick 651512695  3512047
 8.006  0 tick 613888365  3309406
 9.006  0 tick 650806115  3510292

By default, the cpustat command reports only counts that represent user-mode processes. This cpustat output shows us that on processor 0, a user-mode process consumes approximately 650 million cycles per second and that about 3.5 million dTLB misses per second are serviced. Servicing an UltraSPARC TLB miss typically costs from about 50 cycles, if the TLB entry being loaded is found in the microprocessor's cache, to about 300 cycles, if a memory load is required to fetch the new TLB entry. Multiplying 3.5 million misses by each cost, we can approximate that between 175 million and 1050 million cycles are spent servicing TLB misses per one-second sample.

A quick check of the processor speed allows us to calculate the ratio of time spent servicing misses.

sol8# psrinfo -v
Status of processor 0 as of: 11/10/2002 20:14:09
 Processor has been on-line since 11/05/2002 20:59:17.
 The sparcv9 processor operates at 900 MHz,
    and has a sparcv9 floating point processor.

Our microprocessor is running at 900 megahertz, providing 900 million cycles per second. Therefore, at least 175/900, or about 19 percent, of the time is spent servicing TLB misses. The actual figure could be larger if a large fraction of the TLB misses require memory loads.
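
The same estimate can be scripted. The following Python sketch averages the dTLB miss counts from the cpustat sample above and applies the 50-cycle and 300-cycle costs quoted in the text. The function and constant names are ours, and capping the upper bound at 100 percent is our assumption: the 300-cycle cost cannot apply to every miss in this sample, because the product would exceed the cycles available in one second.

```python
# (pic0 = Cycle_cnt, pic1 = DTLB_miss) pairs copied from the cpustat output.
# Each row covers a one-second interval, so the counts are per-second rates.
SAMPLES = [
    (663839993, 3540016),
    (651943834, 3514443),
    (630482518, 3398061),
    (634483028, 3418046),
    (651910256, 3511458),
    (651432039, 3510201),
    (651512695, 3512047),
    (613888365, 3309406),
    (650806115, 3510292),
]

CLOCK_HZ = 900_000_000  # 900 MHz, from psrinfo -v


def miss_time_bounds(samples, clock_hz, min_cost=50, max_cost=300):
    """Return (misses per second, low fraction, high fraction) of CPU time."""
    misses_per_sec = sum(misses for _, misses in samples) / len(samples)
    low = misses_per_sec * min_cost / clock_hz
    # Cap at 1.0: a CPU cannot spend more than all of its cycles on misses.
    high = min(1.0, misses_per_sec * max_cost / clock_hz)
    return misses_per_sec, low, high


if __name__ == "__main__":
    rate, low, high = miss_time_bounds(SAMPLES, CLOCK_HZ)
    print(f"average dTLB misses per second: {rate:,.0f}")
    print(f"time servicing misses: {low:.1%} to {high:.1%}")
```

The lower bound comes out at roughly 19 percent, in line with the hand calculation. The capped upper bound shows how wide the uncertainty is with this method, which is why trapstat's direct measurement is preferable when it is available.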

Determining Which Page Sizes Have Been Allocated

The pmap command allows us to query a target process for page size information, and the meminfo system call lets a program ask the OS which page sizes back its own address space.

Query a Process for Page Size Information With pmap(1)

The pmap command displays the page sizes of memory mappings within the address space of a process. The -sx option directs pmap to show the page size for each mapping.

sol9# pmap -sx `pgrep testprog`
2909:  ./testprog
 Address Kbytes   RSS  Anon Locked Pgsz Mode  Mapped File
00010000    8    8    -    -  8K r-x-- dev:277,83 ino:114875
00020000    8    8    8    -  8K rwx-- dev:277,83 ino:114875
00022000 131088 131088 131088    -  8K rwx--  [ heap ]
FF280000   120   120    -    -  8K r-x-- libc.so.1
FF29E000   136   128    -    -  - r-x-- libc.so.1
FF2C0000   72   72    -    -  8K r-x-- libc.so.1
FF2D2000   192   192    -    -  - r-x-- libc.so.1
FF302000   112   112    -    -  8K r-x-- libc.so.1
FF31E000   48   32    -    -  - r-x-- libc.so.1
FF33A000   24   24   24    -  8K rwx-- libc.so.1
FF340000    8    8    8    -  8K rwx-- libc.so.1
FF390000    8    8    -    -  8K r-x-- libc_psr.so.1
FF3A0000    8    8    -    -  8K r-x-- libdl.so.1
FF3B0000    8    8    8    -  8K rwx--  [ anon ]
FF3C0000   152   152    -    -  8K r-x-- ld.so.1
FF3F6000    8    8    8    -  8K rwx-- ld.so.1
FFBFA000   24   24   24    -  8K rwx--  [ stack ]
-------- ------- ------- ------- -------
total Kb 132024 132000 131168    -

The pmap command shows the MMU page size for each mapping. In this case, 8 kilobytes are used for all mappings. To demonstrate a larger page size, we can use the ppgsz command in the Solaris 9 software to set the page size for the heap of our test program to 4 megabytes. The ppgsz command is described in more detail in a later section.

sol9# ppgsz -o heap=4M ./testprog &
sol9# pmap -sx `pgrep testprog`
2953:  ./testprog
 Address Kbytes   RSS  Anon Locked Pgsz Mode  Mapped File
00010000    8    8    -    -  8K r-x-- dev:277,83 ino:114875
00020000    8    8    8    -  8K rwx-- dev:277,83 ino:114875
00022000  3960  3960  3960    -  8K rwx--  [ heap ]
00400000 131072 131072 131072    -  4M rwx--  [ heap ]
FF280000   120   120    -    -  8K r-x-- libc.so.1
FF29E000   136   128    -    -  - r-x-- libc.so.1
FF2C0000   72   72    -    -  8K r-x-- libc.so.1
FF2D2000   192   192    -    -  - r-x-- libc.so.1
FF302000   112   112    -    -  8K r-x-- libc.so.1
FF31E000   48   32    -    -  - r-x-- libc.so.1
FF33A000   24   24   24    -  8K rwx-- libc.so.1
FF340000    8    8    8    -  8K rwx-- libc.so.1
FF390000    8    8    -    -  8K r-x-- libc_psr.so.1
FF3A0000    8    8    -    -  8K r-x-- libdl.so.1
FF3B0000    8    8    8    -  8K rwx--  [ anon ]
FF3C0000   152   152    -    -  8K r-x-- ld.so.1
FF3F6000    8    8    8    -  8K rwx-- ld.so.1
FFBFA000   24   24   24    -  8K rwx--  [ stack ]
-------- ------- ------- ------- -------
total Kb 135968 135944 135112    -
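
For a process with many mappings, it can help to total the Kbytes column per page size. The following Python sketch does this for text in the pmap -sx layout shown above. The function name is ours, the parser assumes the column order in these listings (Address, Kbytes, RSS, Anon, Locked, Pgsz, Mode, Mapped File), and the sample text is a subset of the second listing.

```python
from collections import defaultdict


def kbytes_by_page_size(pmap_output):
    """Sum the Kbytes column of pmap -sx output, grouped by the Pgsz column."""
    totals = defaultdict(int)
    for line in pmap_output.splitlines():
        fields = line.split()
        # Mapping lines have at least seven columns and begin with a hex
        # address; the header, separator, and total lines do not qualify.
        if len(fields) < 7:
            continue
        try:
            int(fields[0], 16)
            kbytes = int(fields[1])
        except ValueError:
            continue
        totals[fields[5]] += kbytes
    return dict(totals)


SAMPLE = """\
2953:  ./testprog
 Address Kbytes   RSS  Anon Locked Pgsz Mode  Mapped File
00010000    8    8    -    -  8K r-x-- dev:277,83 ino:114875
00020000    8    8    8    -  8K rwx-- dev:277,83 ino:114875
00022000  3960  3960  3960    -  8K rwx--  [ heap ]
00400000 131072 131072 131072    -  4M rwx--  [ heap ]
FF29E000   136   128    -    -  - r-x-- libc.so.1
FFBFA000   24   24   24    -  8K rwx--  [ stack ]
"""

if __name__ == "__main__":
    for pgsz, kbytes in sorted(kbytes_by_page_size(SAMPLE).items()):
        print(f"{pgsz:>3}: {kbytes} Kbytes")
```

In the sample, nearly all of the address space is now covered by 4M pages, and only the first 3960 kilobytes of the heap, below the first 4-megabyte boundary, remain on 8K pages.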

Retrieve a Page Description With meminfo(2)

The meminfo() system call enables a program to inquire about the physical pages that back its address space. This system call provides a programmatic way of determining the page sizes allocated within a process's address space. An array is filled with a description of each page that backs the mapping. For more information, refer to the meminfo(2) man page.

Discovering Supported Page Sizes

This section describes the command and the two library functions that enable us to determine the page sizes supported by the Solaris 9 OS.

Determine Page Size With pagesize(1)

The pagesize command displays the default page size used by the Solaris OS on the given microprocessor. The default is currently 8 kilobytes for all UltraSPARC platforms.

sol8# pagesize
8192

The pagesize command can also display the available page sizes on the given microprocessor in the Solaris 9 OS. In this example, we can see that four page sizes are available on our UltraSPARC processor.

sol9# pagesize -a
8192
65536
524288
4194304
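
To get a feel for what these sizes mean in practice, we can count how many pages, and therefore how many translations, are needed to cover a region at each size. The following Python sketch is illustrative only: the function name is ours, the region is the 131072-kilobyte heap from the earlier pmap example, and alignment requirements are ignored.

```python
# Page sizes in bytes, as reported by `pagesize -a` in the example above.
PAGE_SIZES = [8192, 65536, 524288, 4194304]


def pages_needed(region_bytes, page_size):
    """Number of pages of page_size bytes required to cover region_bytes."""
    if region_bytes < 0 or page_size <= 0:
        raise ValueError("region_bytes must be >= 0 and page_size must be > 0")
    return -(-region_bytes // page_size)  # ceiling division


if __name__ == "__main__":
    region = 131072 * 1024  # 131072 Kbytes, expressed in bytes
    for size in PAGE_SIZES:
        print(f"{size:>8}-byte pages: {pages_needed(region, size)}")
```

The same region that needs 16384 translations with 8-kilobyte pages needs only 32 with 4-megabyte pages, which is why a larger page size can reduce TLB misses so sharply.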

Retrieve the Base Page Size With getpagesize(3C)

The getpagesize() function returns the base page size in bytes.
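
The same value is available from scripting languages on UNIX systems. As a small illustration that is not specific to the Solaris OS, Python's standard resource module provides resource.getpagesize(), and os.sysconf("SC_PAGESIZE") should report the same number. The wrapper function name below is ours.

```python
import os
import resource


def base_page_size():
    """Return the base page size in bytes, as reported by the OS."""
    return resource.getpagesize()


if __name__ == "__main__":
    size = base_page_size()
    # Cross-check against the sysconf value for the same quantity.
    same = size == os.sysconf("SC_PAGESIZE")
    print(f"base page size: {size} bytes (matches sysconf: {same})")
```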

Retrieve the Microprocessor's Available Page Sizes With getpagesizes(3C)

The getpagesizes() function reports the available page sizes on the given microprocessor. For more information, refer to the getpagesizes(3C) man page.
