Home > Articles > Operating Systems, Server > Linux/UNIX/Open Source

This chapter is from the book

gcc Option Needed for gprof

Before programs can be profiled using gprof, they must be compiled with the -pg gcc option. To get complete information about gprof, you can use the command info gprof or man gprof.

Listing 1.5 shows the benefits that profiling can have on a small program. The sample1 program prints the prime numbers up to 50,000. You can use the output from gprof to increase this program’s performance by changing the program to sample2, shown later in Listing 1.8.

Listing 1.5 sample1.c

1 #include <stdlib.h>
2 #include <stdio.h>
3
4 int prime (int num);
5
6 int main()
7 {
8  int i;
9  int colcnt = 0;
10 for (i=2; i <= 50000; i++)
11  if (prime(i)) {
12   colcnt++;
13   if (colcnt%9 == 0) {
14     printf("%5d\n",i);
15     colcnt = 0;
16   }
17  else
18    printf("%5d ", i);
19  }
20  putchar(’\n’);
21  return 0;
22 }
23
24 int prime (int num) {
25   /* check to see if the number is a prime? */
26   int i;
27   for (i=2; i < num; i++)
28   if (num %i == 0)
29    return 0;
30   return 1;
31 }

Building the sample1 Program and Using gprof

The sample1.c program needs to be compiled with the option -pg to have profile data generated, as shown in Figure 1.11.

Figure 1.11

Figure 1.11 Building and running sample1.

When the sample1 program is run, the gmon.out file is created.

To view the profiling data, the gprof utility must be on your system. If your system is rpm-based, the rpm command shows the version of gprof, as shown in Figure 1.12.

Figure 1.12

Figure 1.12 The version of gprof.

gprof is in the binutils package. For you to use the utility, the package must be installed on your system. One useful gprof option is -b. The -b option eliminates the text output that explains the data output provided by gprof:

# gprof -b ./sample1

The output shown in Listing 1.6 from gprof gives some high-level information like the total running time, which is 103.74 seconds. The main routine running time is 0.07 seconds, and the prime routine running time is 103.67 seconds. The prime routine is called 49,999 times.

Listing 1.6 Output from gprof for sample1

Flat profile:

Each sample counts as 0.01 seconds.
 %      cumulative   self              self    total
 time    seconds   seconds  calls   ms/call  ms/call  name
 99.93    103.67   103.67   49999     2.07     2.07   prime
 0.07     103.74     0.07                             main

         Call graph


granularity: each sample hit covers 4 byte(s) for 0.01% of 103.74 seconds

index  % time    self   children    called        name
                                                      <spontaneous>
[1]    100.0     0.07      103.67                 main [1]
               103.67      0.00     49999/49999        prime [2]
---------------------------------------------------------------------------
               103.67      0.00     49999/49999   main [1]
[2]     99.9   103.67      0.00     49999         prime [2]
---------------------------------------------------------------------------

Index by function name

  [1] main          [2] prime

Next we can use the gcov program to look at the actual number of times each line of the program was executed. (See Chapter 2, "Code Coverage," for more about gcov.)

We will build the sample1 program with two additional options—-fprofile-arcs and -ftest-coverage, as shown in Figure 1.13. These options let you look at the program using gcov, as shown in Figure 1.14.

Figure 1.13

Figure 1.13 Building sample1 with gcov options.

Figure 1.14

Figure 1.14 Running sample1 and creating gcov output.

Running gcov on the source code produces the file sample1.c.gcov. It shows the actual number of times each line of the program was executed. Listing 1.7 is the output of gcov on sample1.

Listing 1.7 Output from gcov for sample1

        -:   0:Source:sample1.c
        -:   0:Graph:sample1.bbg
        -:   0:Data:sample1.da
        -:   1:#include <stdlib.h>
        -:   2:#include <stdio.h>
        -:   3:
        -:   4:int prime (int num);
        -:   5:
        -:   6:int main()
        1:   7: {
        1:   8: int i;
        1:   9: int colcnt = 0;
    50000:  10: for (i=2; i <= 50000; i++)
    49999:  11:  if (prime(i)) {
     5133:  12:   colcnt++;
     5133:  13:   if (colcnt%9 == 0) {
      570:  14: printf("%5d\n",i);
      570:  15: colcnt = 0;
        -:  16:  }
        -:  17: else
     4563:  18:  printf("%5d ", i);
        -:  19:  }
        1:  20:   putchar(’\n’);
        1:  21:   return 0;
        -:  22: }
        -:  23:
    49999:  24:int prime (int num) {
        -:  25:   /* check to see if the number is a prime? */
    49999:  26:   int i;
121337004:  27:   for (i=2; i < num; i++)
121331871:  28:   if (num %i == 0)
    44866:  29:    return 0;
     5133:  30:   return 1;
        -:  31:   }
        -:  32:

There are 5,133 prime numbers. The expensive operations in the routine prime are the for loop (line 27) and the if statement (line 28). The "hot spots" are the loop and the if test inside the prime routine. This is where we will work to increase the program’s performance. One change that will help this program is to use the sqrt() function, which returns the nonnegative square root function of the number passed in. sample2, shown inListing 1.8, has been changed to use the sqrt function in the newly created function called faster.

Listing 1.8 sample2.c

1 #include <stdlib.h>
2 #include <stdio.h>
3 #include <math.h>
4
5 int prime (int num);
6 int faster (int num);
7
8 int main()
9 {
10  int i;
11  int colcnt = 0;
12  for (i=2; i <= 50000; i++)
13   if (prime(i)) {
14    colcnt++;
15   if (colcnt%9 == 0) {
16     printf("%5d\n",i);
17     colcnt = 0;
18   }
19   else
20     printf("%5d ", i);
21   }
22   putchar(’\n’);
23   return 0;
24 }
25
26 int prime (int num) {
27   /* check to see if the number is a prime? */
28  int i;
29   for (i=2; i <= faster(num); i++)
30   if (num %i == 0)
31    return 0;
32   return 1;
33 }
34
35 int faster (int num)
36 {
37 return (int) sqrt( (float) num);
38 }

Now you can build the sample2 program (see Figure 1.15) and use gprof to check how long the program will take to run (see Figure 1.16). Also, the gcov output shows the reduced number of times each line needs to be executed. In Listing 1.9, the total running time has been reduced from 103.74 seconds to 2.80 seconds.

Listing 1.9 shows the output of gprof for the sample2 program.

Listing 1.9 Output from gprof for sample2

Flat profile:

Each sample counts as 0.01 seconds.
%      cumulative   self                   self      total
time     seconds    seconds     calls     us/call   us/call    name
52.68      1.48     1.48       1061109       1.39      1.39    faster
46.61      2.78     1.30         49999      26.10     55.60    prime
 0.71      2.80     0.02                                  main

         Call graph


granularity: each sample hit covers 4 byte(s) for 0.36% of 2.80 seconds

index   % time    self   children   called       name
                                                    <spontaneous>
[1]     100.0     0.02     2.78                   main [1]
                  1.30     1.48     49999/49999      prime [2]
-------------------------------------------------------------
                  1.30     1.48     49999/49999      main [1]
[2]      99.3     1.30     1.48     49999         prime [2]
                  1.48     0.00   1061109/1061109     faster [3]
------------------------------------------------------------- 
                  1.48     0.00   1061109/1061109     prime [2]
[3]      52.7     1.48     0.00   1061109         faster [3]
------------------------------------------------------------- 

Index by function name

[3] faster         [1] main          [2] prime
Figure 1.15

Figure 1.15 Building and running sample2.

Figure 1.16

Figure 1.16 Using gprof on sample2.

Now we’ll run gcov on the sample2 program, as shown in Figures 1.17 and 1.18.

Figure 1.17

Figure 1.17 Building sample2 with gcov and running sample2.

Figure 1.18

Figure 1.18 Running sample2 and getting gcov output.

Listing 1.10 shows gcov output for the sample2 program.

Listing 1.10 Output of sample2.c.gcov

      -:  0:Source:sample2.c
      -:  0:Graph:sample2.bbg
      -:  0:Data:sample2.da
      -:  1:#include <stdlib.h>
      -:  2:#include <stdio.h>
      -:  3:#include <math.h>
      -:  4:
      -:  5:int prime (int num);
      -:  6:int faster (int num);
      -:  7:
      -:  8:int main()
      1:  9:{
      1:  10:  int i;
      1:  11:  int colcnt = 0;
  50000:  12:  for (i=2; i <= 50000; i++)
  49999:  13:   if (prime(i)) {
   5133:  14:    colcnt++;
   5133:  15:   if (colcnt%9 == 0) {
    570:  16:printf("%5d\n",i);
    570:  17:colcnt = 0;
      -:  18:   }
      -:  19:  else
   4563:  20:   printf("%5d ", i);
      -:  21:  }
      1:  22:   putchar(’\n’);
      1:  23:   return 0;
      -:  24: }
      -:  25:
  49999:  26:int prime (int num) {
      -:  27:   /* check to see if the number is a prime? */
  49999:  28:  int i;
1061109:  29:   for (i=2; i <= faster(num); i++)
1055976:  30:   if (num %i == 0)
  44866:  31:    return 0;
   5133:  32:   return 1;
      -:  33:   }
      -:  34:
      -:  35:int faster (int num)
1061109:  36: {
1061109:  37: return (int) sqrt( (float) num);
      -:  38: }
      -:  39:

The for loop in the prime routine has been reduced from 121 million executions to 1 million executions. Therefore, the total time has been reduced from 103.74 seconds to 2.80 seconds.

The tools gprof and gcov helped find the "hot spots" in this sample program. After the "hot spots" were found, the program was changed to increase its overall performance. It is interesting how changing a few lines of code can have a great impact on a program’s performance.

Listing 1.11, sample3.cpp, has three different functions (1, 2, and 3). It shows a more complex use of profiling, with both flat and graphic profiles. We’ll also use kprof, which can use gprof output. It presents the information in list or tree views, which make the information easier to understand when programs are more complicated. Let’s start by building the sample3.cpp program and displaying the flat and graphic profiles and then displaying the data using kprof.

Listing 1.11 sample3.cpp

1 #include <iostream>
2
3 void function1(){
4   for(int i=0;i<1000000;i++);
5 }
6
7 void function2(){
8   function1();
9   for (int i=0;i<2000000;i++);
10 }
11
12 void function3(){
13  function1();
14  function2();
15  for (int i=0;i<3000000;i++);
16    function1();
17 }
18
19 int main(){
20  for(int i=0;i<10;i++)
21  function1();
22
23  for (int i=0;i<5000000;i++);
24
25  for(int i=0;i<10;i++)
26    function2();
27    for(int i=0; i<13;i++);
28     {
29     function3();
30     function2();
31     function1();
32     }
33 }

Figure 1.19 shows the commands used to build and run the sample3 program. gprof is also run on sample3 to get the profile data from sample3.

Figure 1.19

Figure 1.19 Building and capturing gprof output for sample3.

We won’t use the -b option on the gprof output on the sample3 program so that we can see all the descriptive information that gprof can display.

The sample3.gprof should look similar to this:

Flat profile:
Each sample counts as 0.01 seconds.
%       cumulative   self              self      total
time     seconds     seconds   calls   ms/call   ms/call   name
43.36    4.21        4.21      12      0.35      0.52      function2()
42.84    8.37        4.16      25      0.17      0.17      function1()
 8.65    9.21        0.84                                  main
 5.15    9.71        0.50       1      0.50      1.35      function3()
 0.00    9.71        0.00       1      0.00      0.00      global constructors
                                                           keyed to function1()
 0.00    9.71        0.00       1      0.00      0.00      __static_initialization_and_destruction_0(int, int)

Field

Description

% time

The percentage of the program’s total running time used by this function.

cumulative seconds

A running sum of the number of seconds accounted for by this function and those listed above it.

self seconds

The number of seconds accounted for by this function alone. This is the major sort for this listing.

calls

The number of times this function was invoked if this function is profiled; otherwise, it is blank.

self ms/call

The average number of milliseconds spent in this function per call if this function is profiled; otherwise, it is blank.

total ms/call

The average number of milliseconds spent in this function and its descendents per call if this function is profiled; otherwise, it is blank.

name

The function’s name. This is the minor sort for this listing. The index shows the location of the function in the gprof listing. If the index is in parentheses, it shows where it would appear in the gprof listing if it were to be printed.

Call graph (explanation follows)
granularity: each sample hit covers 4 byte(s) for 0.10% of 9.71 seconds
index     % time    self     children    called       name
                                                         <spontaneous>
[1]       100.0    0.84      8.87                     main [1]
                   3.86      1.83        11/12              function2() [2]
                   1.83      0.00        11/25              function1() [3]
                   0.50      0.85        1/1                function3() [4]
----------------------------------- 
                   0.35      0.17        1/12               function3() [4]
                   3.86      1.83        11/12        main [1]
[2]       63.9     4.21      2.00        12                 function2() [2]
                   2.00      0.00        12/25              function1() [3]
----------------------------------- 
                   0.33      0.00        2/25               function3() [4]
                   1.83      0.00        11/25        main [1]
                   2.00      0.00        12/25              function2() [2]
[3]       42.8     4.16      0.00        25                 function1() [3]
----------------------------------- 
                   0.50      0.85        1/1          main [1]
[4]       13.9     0.50      0.85        1                  function3() [4]
                   0.35      0.17        1/12               function2() [2]
                   0.33      0.00        2/25               function1() [3]
----------------------------------- 
                   0.00      0.00        1/1          __do_global_ctors_aux [13]
[11]      0.0      0.00      0.00        1            global constructors keyed to function1() [11]
                   0.00      0.00        1/1          __static_initialization_and_destruction_0(int, int) [12]
----------------------------------- 
                   0.00      0.00        1/1          global constructors keyed to function1() [11]
[12]      0.0      0.00      0.00        1            __static_initialization_and_destruction_0(int, int) [12]
----------------------------------- 

This table describes the program’s call tree. It is sorted by the total amount of time spent in each function and its children.

Each entry in this table consists of several lines. The line with the index number at the left margin lists the current function. The lines above it list the functions that called this function, and the lines below it list the functions this one called.

You see the following:

Field

Description

index

A unique number given to each element of the table. Index numbers are sorted numerically. The index number is printed next to every function name so that it is easier to look up the function in the table.

% time

The percentage of the total time that was spent in this function and its children. Note that due to different viewpoints, functions excluded by options, and so on, these numbers do not add up to 100%.<\p>

self<\p>

The total amount of time spent in this function.<\p>

children<\p>

The total amount of time propagated into this function by its children.<\p>

called<\p>

The number of times the function was called. If the function called itself recursively, the number includes only nonrecursive calls and is followed by a + and the number of recursive calls.<\p>

name

The name of the current function. The index number is printed after it. If the function is a member of a cycle, the cycle number is printed between the function’s name and the index number.

For the function’s parents, the fields have the following meanings:

Field

Description

self

The amount of time that was propagated directly from the function into this parent.

children<\p>

The amount of time that was propagated from the function’s children into this parent.<\p>

called<\p>

The number of times this parent called the function and the total number of times the function was called. Recursive calls to the function are not included in the number after the /.<\p>

name

The parent’s name. The parent’s index number is printed after it. If the parent is a member of a cycle, the cycle number is printed between the name and the index number.

If the function’s parents cannot be determined, the word <spontaneous> is printed in the name field, and all the other fields are blank.

For the function’s children, the fields have the following meanings:

Field

Description

self

The amount of time that was propagated directly from the child into the function.

children<\p>

The amount of time that was propagated from the child’s children to the function.<\p>

called<\p>

The number of times the function called this child and the total number of times the child was called. Recursive calls by the child are not listed in the number after the /.<\p>

name

The child’s name. The child’s index number is printed after it. If the child is a member of a cycle, the cycle number is printed between the name and the index number.

If the call graph has any cycles (circles), there is an entry for the cycle as a whole. This entry shows who called the cycle (as parents) and the members of the cycle (as children). The + recursive calls entry shows how many function calls were internal to the cycle. The calls entry for each member shows, for that member, how many times it was called from other members of the cycle.

Index by function name
[11] global constructors keyed to function1() [3] function1() [4] function3()
[12] __static_initialization_and_destruction_0(int, int) [2] function2() [1] main

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020