Reliability and Performance Monitor

Whereas Event Viewer allows you to monitor system and application events, Reliability and Performance Monitor (RPM) allows you to monitor and log the reliability and performance of your computer. This is the new, upgraded version of the old PerfMon tool that has been around since the NT days.

RPM has three monitoring tools:

  • Resource View—Provides a quick look at CPU, disk, network, and memory utilization in real time.
  • Performance Monitor—Uses collections of counters (a Data Collector Set) to monitor specific resource components in real time or to record them in log files for historical review and analysis.
  • Reliability Monitor—Monitors and logs software, operating system, and hardware failures to present an overview of the system's stability over time.

The RPM tool can be accessed in Administrative Tools. The main dialog box for RPM is shown in Figure 3.31.

Figure 3.31 The Resource Overview is presented when you open the Reliability and Performance Monitor.

Data Collector Sets (DCSs)

The actual data collection and logging are performed by Data Collector Sets (DCSs).

You can also create your own DCSs to log any combination of performance counters available on the system. Additional performance counters may be added to the system over time as you add features and services and install applications on the computer. A sample custom Data Collector Set is shown in Figure 3.32.

Figure 3.32 A custom Data Collector Set.

On the General tab, you can describe your custom DCS and configure the credentials under which it runs. On the Directory tab, you can configure where the log files are written and the naming convention used for them. On the Security tab, you can configure who can access and modify the DCS parameters. On the Schedule tab, shown in Figure 3.33, you configure the start conditions for the DCS. You can schedule the collector to run on a daily basis, and you can add multiple schedules.

Figure 3.33 The Schedule tab on a Data Collector Set indicates when the collector begins collecting.

On the Stop Condition tab, you configure what terminates the DCS. The stop condition can be an elapsed amount of time or a maximum log file size in megabytes. This tab is shown in Figure 3.34.

Figure 3.34 The Stop Condition tab on a Data Collector Set indicates when the collector stops collecting.

Finally, the Task tab allows you to configure an executable or script to run when the DCS stops. The DCS relies on the Task Scheduler to launch the specified task.
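
The same kind of settings can also be scripted. The following is a minimal Python sketch, not a definitive procedure, that assembles a command line for the Windows logman utility to create a counter-based collector with a sample interval, an output location, and the two stop conditions described above (a duration and a maximum log size). The function name build_logman_create, the collector name BaselineDCS, the counter paths, and the C:\PerfLogs\baseline path are illustrative assumptions. The sketch only builds the argument list; it does not run logman.

```python
import subprocess
from typing import List, Optional


def build_logman_create(
    name: str,
    counters: List[str],
    output_path: str,
    sample_interval: str = "00:00:15",
    run_for: Optional[str] = None,
    max_size_mb: Optional[int] = None,
) -> List[str]:
    """Return an argument list for `logman create counter` (hypothetical helper)."""
    if not counters:
        raise ValueError("at least one counter path is required")
    args = [
        "logman", "create", "counter", name,
        "-c", *counters,          # counters to collect
        "-si", sample_interval,   # sample interval, hh:mm:ss
        "-o", output_path,        # where the log file is written
        "-f", "csv",              # log format
    ]
    if run_for is not None:
        args += ["-rf", run_for]            # stop after this much time
    if max_size_mb is not None:
        args += ["-max", str(max_size_mb)]  # stop at this log size (MB)
    return args


# Example values are assumptions chosen for illustration only.
cmd = build_logman_create(
    "BaselineDCS",
    [r"\Memory\Pages/sec", r"\Processor(_Total)\% Processor Time"],
    r"C:\PerfLogs\baseline",
    run_for="01:00:00",
    max_size_mb=50,
)
# list2cmdline quotes arguments that contain spaces, Windows style.
print(subprocess.list2cmdline(cmd))
```

On a Windows system, the resulting list could be passed to subprocess.run from an elevated prompt. Verify the options against the logman help on your own system before relying on them.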

The Performance Monitor

The Performance Monitor, shown in Figure 3.35, is a real-time display of system resources. As with a DCS, you configure the specific counters that you want to monitor and display. This tool does not record any information. Each time the display wraps around, new samples overwrite the old ones, and the overwritten data is lost. If you need to keep a record of the data for later review, you must use a Data Collector Set that generates a written log file.

Figure 3.35 The Performance Monitor does not record log files. DCSs do record them.
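
Once a DCS has written a log, the data can be reviewed outside the tool. The following Python sketch averages each counter in a comma-separated log. It assumes the log was saved in CSV format with a timestamp in the first column and one column per counter path. The SAMPLE_LOG text, the host name HOST1, and the function name average_counters are made up for illustration; a real log may differ in layout.

```python
import csv
import io
from statistics import mean

# Hypothetical excerpt of a CSV counter log (assumed layout).
SAMPLE_LOG = r'''"(PDH-CSV 4.0) (Eastern Standard Time)(300)","\\HOST1\Memory\Pages/sec","\\HOST1\Processor(_Total)\% Processor Time"
"05/01/2009 10:00:00.000"," ","55.0"
"05/01/2009 10:00:15.000","12.0","65.0"
"05/01/2009 10:00:30.000","18.0","75.0"
'''


def average_counters(log_text):
    """Return {counter path: average value} for a CSV counter log."""
    reader = csv.reader(io.StringIO(log_text))
    header = next(reader)
    paths = header[1:]  # the first column holds the timestamp
    columns = {path: [] for path in paths}
    for row in reader:
        for path, cell in zip(paths, row[1:]):
            try:
                columns[path].append(float(cell))
            except ValueError:
                continue  # blank cell: no sample was recorded
    return {path: mean(vals) for path, vals in columns.items() if vals}


averages = average_counters(SAMPLE_LOG)
for path, value in averages.items():
    print(f"{path}: {value:.1f}")
```

Blank cells are skipped rather than counted as zero, so a missing first sample does not drag the average down.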

In general, there are four main resource targets for monitoring:

  • RAM—Monitor Pages Per Second. This value should be less than 20 (average). If it is greater than 20, the system needs more RAM.
  • CPU—Monitor Percent Processor Time. This value should be less than 70–80% (average). If it stays above that range, install a faster CPU, add a second CPU, or move some processes to a less loaded system.
  • Disk Subsystem—Monitor Percent Disk Time. This value should be less than 50% (average). If it is greater than 50%, install a faster disk, use a faster disk array (RAID 0 or RAID 5), or move some frequently accessed content to a less-used disk.
  • Network Subsystem—Monitor Bytes Total per Second. This value should be less than 6MB/s (average). At 6MB/s, the NIC is using about 50% of the capacity of a 100Mbps network, which is too much. If the value is greater than 6MB/s, find out what is sending and receiving over the network. In many cases, the real problem turns out to be a failing NIC that should be replaced.
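
The four rules of thumb above can be expressed as a simple check. The following Python sketch averages a set of samples per resource and flags any that exceed the thresholds listed. The dictionary keys, the function names, and the sample values are assumptions made for illustration, and the CPU limit uses the lower end (70%) of the stated range. The link_utilization helper also works through the arithmetic behind the 6MB/s figure: 6,000,000 bytes per second is 48,000,000 bits per second, or 48% of a 100Mbps link.

```python
from statistics import mean

# Average-value limits taken from the list above.
THRESHOLDS = {
    "pages_per_sec": 20.0,             # RAM
    "cpu_percent": 70.0,               # CPU (lower end of 70-80%)
    "disk_percent": 50.0,              # disk subsystem
    "net_bytes_per_sec": 6_000_000.0,  # network, 6MB/s
}


def link_utilization(bytes_per_sec, link_mbps=100.0):
    """Return the fraction of link capacity used (0.0 to 1.0)."""
    return bytes_per_sec * 8 / (link_mbps * 1_000_000)


def flag_bottlenecks(samples):
    """Return the names whose average sample exceeds its threshold."""
    return sorted(
        name
        for name, values in samples.items()
        if values and mean(values) > THRESHOLDS[name]
    )


# Made-up sample data for illustration.
samples = {
    "pages_per_sec": [5, 12, 40],
    "cpu_percent": [85, 90, 70],
    "disk_percent": [30, 45, 50],
    "net_bytes_per_sec": [7_000_000, 6_500_000],
}
print(flag_bottlenecks(samples))
print(f"{link_utilization(6_000_000):.0%} of a 100Mbps link")
```

The check compares averages, not individual samples, because a brief spike (such as the single reading of 40 pages per second here) does not by itself indicate a bottleneck.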

The Reliability Monitor

The Reliability Monitor tracks application, operating system, and hardware failures to present a trend analysis of system stability. The Reliability Monitor is shown in Figure 3.36.

Figure 3.36 The Reliability Monitor tracks system failures.

The Index rating in the upper-right corner indicates the reliability and stability of the system. You can select any incident on the calendar chart to review its details.

