Profiling in Linux Performance Tuning
In this chapter
- stopwatch
- date
- time
- clock
- gettimeofday
- Performance Tuning Using GNU gprof
- gcc Option Needed for gprof
- kprof
- Summary
- Web Resources for Profiling
In general, performance tuning consists of the following steps:
1. Define the performance problem.
2. Identify the bottlenecks by using monitoring and measurement tools. (This chapter focuses on measuring from the timing aspect.)
3. Remove bottlenecks by applying a tuning methodology.
4. Repeat steps 2 and 3 until you find a satisfactory resolution.
A sound understanding of the problem is critical in monitoring and tuning the system. Once the problem is defined, a realistic goal for improvement needs to be agreed on. Once a bottleneck is found, you need to verify whether it is indeed a bottleneck and devise possible solutions to alleviate it. Be aware that once a bottleneck is identified and steps are taken to relieve it, another bottleneck may suddenly appear. This may be caused by several variables in the system running near capacity.
Bottlenecks occur at points in the system where requests are arriving faster than they can be handled, or where resources, such as buffers, are insufficient to hold adequate amounts of data. Finding a bottleneck is essentially a step-by-step process of narrowing down the problem’s causes.
Change only one thing at a time. Changing more than one variable clouds the results, because it becomes difficult to determine which variable had what effect on system performance. The general rule is perhaps better stated as "Change the minimum number of related things." In some situations, changing "one thing at a time" may mean changing multiple parameters, since a change to the parameter of interest may require changes to related parameters. A second key rule is to start each iteration of your test with the system in the same state. For example, if you are doing database benchmarking, reset the values in the database to the same settings before each test run.
This chapter covers several methods to measure execution time and real-time performance. The methods offer different levels of granularity, from the program’s complete execution time down to how long each function in the program takes. The first three methods (stopwatch, date, and time) require no changes to the program being measured. The next two methods (clock and gettimeofday) require calls to be added directly to the program’s source code. These timing routines can be coded so that they can be switched on or off, depending on whether performance measurements are needed all the time or only when the program’s performance is in question. The last method, gprof, requires the application to be compiled with an additional compiler flag that lets the compiler add the performance instrumentation directly to the code. Choosing one method over another can depend on whether the application’s source code is available. Profiling the program with gprof is a very effective way to see which functions account for a large percentage of the program’s overall execution time.
Application performance tuning is a complex process that requires correlating many types of information with source code to locate and analyze performance bottlenecks. This chapter shows a sample program that we’ll tune using gprof and gcov.
stopwatch
The stopwatch method uses the chronograph feature of a digital watch. The steps are simple. Reset the watch to zero. When the program begins, start the watch. When the program ends, stop the watch. The total execution time is shown on the watch. Figure 1.1 shows the file system benchmark dbench being timed this way: the stopwatch starts when dbench starts and stops when dbench finishes.
Figure 1.1 Timing dbench with stopwatch.
With the stopwatch method, dbench’s execution time came out to 13 minutes and 56 seconds, as shown in Figure 1.2.
Figure 1.2 The execution time is shown on the watch.