CallGraphProfile Predefined Probe

Call Graph Profile Probe: apcgp.ual

[Since 4.4.9]

The apcgp.ual predefined probe profiles the execution of functions within an application program. The probe records the number of calls made to each function and the number of calls made by each function. The profiling information is reported for the program as a whole in a call graph tree.

Call graph profiling lets you determine, for each function, which functions it calls and which functions call it. The call graph profile information can be used to locate "hot spots" within your application on which to focus your attention when trying to improve your program's performance.

While this is a very powerful tool, it is not going to tell you in the first run what needs to be re-coded, unless it's a small application with a very obvious problem. Performance analysis of an application is an iterative process that requires an understanding of what the application does, and a feeling for how long operations should take.

See statprof.ual and profile.ual for other approaches to profiling.

Usage

This probe is applied at runtime using aprobe as described under Apcgp UAL Parameters below. The only functions for which profiling data will be collected are those selected with the PROFILE keyword in the configuration file (see Configuration File). The REMOVE keyword allows you to remove a function from the set to be profiled, and the SNAPSHOT keyword logs a snapshot of the profile information at a specified function.

The profile information is logged to a file for later processing by apformat. A call graph tree of profile information is generated for the program as a whole. See Apcgp Probe Output for an example.

Apcgp UAL Parameters

apcgp.ual is specified on the aprobe command line or in an APO file as described in Command Line. The specific options are:

aprobe -u apcgp.ual [-p "[-h] [-v] [-a size] [-b size] [-c config_filename] [-S signum] [-t size]"] your_program

where:

-a size
sets the arc density. Default 2. Minimum 2. Programs with functions which call lots of other functions should have higher arc densities.
-b size
sets the sampling bin size. Default 8. Minimum 8. A larger bin size requires less profile data memory but gives coarser (less accurate) results. Programs with a large text segment may require a larger bin size to fit profile data in the address space. Programs with lots of small functions may want to reduce the bin size for more accuracy.
-c config_filename
specifies that the name of the probe configuration options file will follow immediately after -c. The default file name is your_program.apcgp.cfg. For example, if your executable program is called wilbur.exe, then the default file name would be wilbur.exe.apcgp.cfg.
-h
produces brief help text.
-S signum
sets the signal number for a snapshot signal handler. Default 0 (no signal handler). Range 0-63. On receipt of the numbered signal, the probe will perform a snapshot to save profile data.
-t size
sets the call-to table size. Default 5,000. Range 5,000-200,000. Programs with lots of functions should use larger values.
-v
verbose mode, which produces additional progress messages.
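
As a rough illustration of how these options fit together, the following Python sketch assembles an aprobe argument list and checks the limits stated above. The helper name build_aprobe_command and its keyword arguments are hypothetical and not part of Aprobe; the sketch only builds the command and does not run it.

```python
def build_aprobe_command(program, arc_density=None, bin_size=None,
                         config=None, signum=None, to_table=None,
                         verbose=False):
    """Return an argv list that runs `program` under apcgp.ual.

    Hypothetical helper: limits are taken from the option list above.
    Assumes the config file name contains no spaces, because the
    probe options are joined into a single -p string.
    """
    opts = []
    if arc_density is not None:
        if arc_density < 2:
            raise ValueError("-a: arc density minimum is 2")
        opts += ["-a", str(arc_density)]
    if bin_size is not None:
        if bin_size < 8:
            raise ValueError("-b: bin size minimum is 8")
        opts += ["-b", str(bin_size)]
    if config is not None:
        opts += ["-c", config]
    if signum is not None:
        if not 0 <= signum <= 63:
            raise ValueError("-S: signal number range is 0-63")
        opts += ["-S", str(signum)]
    if to_table is not None:
        if not 5000 <= to_table <= 200000:
            raise ValueError("-t: table size range is 5,000-200,000")
        opts += ["-t", str(to_table)]
    if verbose:
        opts.append("-v")

    argv = ["aprobe", "-u", "apcgp.ual"]
    if opts:
        # All probe options travel inside one quoted -p argument.
        argv += ["-p", " ".join(opts)]
    argv.append(program)
    return argv


print(build_aprobe_command("wilbur.exe", bin_size=16, to_table=10000))
```

The resulting list can be passed to subprocess.run on a machine where aprobe is installed.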

Apcgp Configuration File

The Apcgp configuration file is used to specify what subprograms are to be analyzed, when snapshots are to be taken, and other options, as described in Configuration File. The example below shows one possible Apcgp configuration file.

 APCGP CONFIGURATION FILE FOR PROFILE VERSION 2.0.0
 
 Verbose                   FALSE
 ProfilingEnabledInitially TRUE
 ResetSnapshotCounts       TRUE
 BinSize                   8
 ArcDensity                5
 ToLimit                   5000
 
 // see who calls and is called by system functions
 PROFILE "open()" in "libc.so"
 PROFILE "read()" in "libc.so"
 
 // profile ''almost'' everything in the Motif Library
 PROFILE "*" in "libXm.so"
 REMOVE "_XmRecordEvent" in "libXm.so"
 
 // log a snapshot of profile data
 SNAPSHOT "RefreshCallback()"
 

Example apcgp.cfg File

Configuration Variables

The following are the only valid keywords that identify lines to set configuration variables. Each such line must begin with one of these keywords, and the keyword must be followed by its value. Nothing else is allowed on the same line.

Verbose

This must be followed by the value TRUE or FALSE. The default is FALSE. The value TRUE indicates that progress messages should be produced by the profile probe.

ProfilingEnabledInitially

This must be followed by the value TRUE or FALSE. The default is TRUE. The value TRUE indicates that data logging will begin as soon as the application program starts running. The value FALSE indicates that data logging will begin only after a call is made to the probe's function ap_Cgp_Enable().

ResetSnapshotCounts

This must be followed by the value TRUE or FALSE. The default is TRUE. The value TRUE indicates that at format time profile counts will be reset between snapshots.

BinSize

This must be followed by an unsigned integer, minimum 8. This sets the sampling bin size, which controls how fine-grained the profile is: a smaller bin size provides finer-grained information at the expense of a larger memory requirement, while a larger bin size requires less memory but provides coarser (less accurate) profile data. Coarser data may not be able to distinguish between small functions, but will allow a larger program to be profiled.
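
The trade-off can be pictured with a small Python sketch. It assumes that each code address is mapped to a bin by integer division, as in gprof-style profilers; this mapping is an assumption made for illustration and is not taken from the apcgp.ual documentation. The addresses and function names are hypothetical.

```python
def bin_index(address, text_start, bin_size):
    """Bin that an address falls into (assumed integer-division mapping)."""
    return (address - text_start) // bin_size


def bins_needed(text_size, bin_size):
    """Number of bins needed to cover a text segment (ceiling division)."""
    return -(-text_size // bin_size)


TEXT_START = 0x1000
# Hypothetical function entry addresses: two small functions close together.
FUNCTIONS = {"tiny_a": 0x1000, "tiny_b": 0x1010, "big_c": 0x1040}


def colliding(bin_size):
    """Groups of functions that share a bin and cannot be told apart."""
    seen = {}
    for name, addr in FUNCTIONS.items():
        seen.setdefault(bin_index(addr, TEXT_START, bin_size), []).append(name)
    return [names for names in seen.values() if len(names) > 1]


for size in (8, 32):
    print(size, bins_needed(4096, size), colliding(size))
```

Under these assumptions, moving from a bin size of 8 to 32 cuts the number of bins for a 4096-byte text segment to a quarter, but places tiny_a and tiny_b in the same bin.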

ArcDensity

This must be followed by an unsigned integer, minimum 2. This sets the call arc density used to calculate the size of profile data tables. A larger arc density is needed for programs whose functions call lots of other functions, and requires more memory; a lower arc density is sufficient for programs with simpler functions, and requires less memory.

ToLimit

This must be followed by an unsigned integer, minimum 2. This sets the call-to table size used to calculate the size of profile data tables. A larger value is required to record more data and will require more memory space.

Configuration of Profiled Functions

Each function to be profiled must be specified explicitly using the PROFILE keyword followed by the name of the function, as described in Configuration of Selected Functions.

The REMOVE keyword allows you to specify functions that should not be instrumented for profiling. This is useful when used in conjunction with a wildcard ("*"), to gather data about everything except certain routines.

A line beginning with the keyword SNAPSHOT specifies the name of a snapshot function in the usual manner. Entry to the snapshot function will cause the probe to save a snapshot of the profile data.
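
For projects that generate their configuration, the layout described above can be sketched in Python as follows. The functions render_config and selector are hypothetical helpers, not part of Aprobe. The header line and the variable names (including ResetSnapshotCounts) are copied from the example file; whether the header line is required is not stated here, so treat it as an assumption.

```python
VARIABLES = ("Verbose", "ProfilingEnabledInitially", "ResetSnapshotCounts",
             "BinSize", "ArcDensity", "ToLimit")


def selector(keyword, name, module=None):
    """Format a PROFILE, REMOVE, or SNAPSHOT line as in the example file."""
    line = f'{keyword} "{name}"'
    if module:
        line += f' in "{module}"'
    return line


def render_config(variables, profile=(), remove=(), snapshot=()):
    """Return the text of a configuration file (hypothetical helper)."""
    lines = ["APCGP CONFIGURATION FILE FOR PROFILE VERSION 2.0.0", ""]
    for key, value in variables.items():
        if key not in VARIABLES:
            raise ValueError(f"unknown configuration variable: {key}")
        # Check bool first: booleans are written as TRUE or FALSE.
        if isinstance(value, bool):
            value = "TRUE" if value else "FALSE"
        # One keyword and one value per line, nothing else.
        lines.append(f"{key} {value}")
    lines.append("")
    for keyword, entries in (("PROFILE", profile),
                             ("REMOVE", remove),
                             ("SNAPSHOT", snapshot)):
        for name, module in entries:
            lines.append(selector(keyword, name, module))
    return "\n".join(lines) + "\n"


print(render_config(
    {"Verbose": False, "BinSize": 8},
    profile=[("*", "libXm.so")],
    remove=[("_XmRecordEvent", "libXm.so")],
    snapshot=[("RefreshCallback()", None)],
))
```

The example call reproduces the wildcard-plus-REMOVE pattern from the sample file: profile everything in libXm.so except _XmRecordEvent, and take a snapshot on entry to RefreshCallback().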

Apcgp API

Users can control the behavior of the apcgp.ual probe by calls from within their own probes. The API is defined by $APROBE/include/apcgp.h. Some of the functions exported by apcgp.ual are:

ap_Cgp_Enable
Enables collection of profile data.
ap_Cgp_Disable
Disables collection of profile data.
ap_Cgp_Snapshot
Takes a snapshot of profile data for the program.

Apcgp Demand Actions

(Since 4.4.8).

You can control apcgp.ual using demand.ual and apdemand.

Include demand.ual on the Aprobe command line:

 aprobe -u apcgp -u demand myapp.exe
 

then use apdemand to send actions:

 apdemand apcgp snapshot
 

apcgp.ual responds to the following actions:

  • apcgp snapshot
  • apcgp enable
  • apcgp disable

Note that all actions containing the action string will be triggered.

Profile Performance Issues

See Performance_Issues for a general discussion of factors that affect performance.

Apcgp Report

The reports generated by running apformat vary depending upon the options chosen as specified above. The default looks like the example below:

 apcgp: Snapshot #1 @ 13:55:26.605, Snapshot on program exit 
-----------------------------------------------------------
 apcgp: Bin Size: 8  Arc Density: 10  Number Bins: 463  Number Arcs: 5000

 Call Graph Profile

 Indx        Calls  Recursive   Name 
 --------------------------------------------------------
 	         1                call_pth_init()   [5]
 [6]	         1             __pth_init()   [6]
 -----------------------------------------------------
 	                          <spontaneous>
 [1]	         1             __start()   [1]
 	         1                exit()   [3]
 	         1                main()   [2]
 -----------------------------------------------------
 	                          <spontaneous>
 [4]	         1             __threads_init()   [4]
 	         1                call_pth_init()   [5]
 -----------------------------------------------------
 	         1                __threads_init()   [4]
 [5]	         1             call_pth_init()   [5]
 	         1                __pth_init()   [6]
 -----------------------------------------------------
 	         1                __start()   [1]
 [3]	         1             exit()   [3]
 -----------------------------------------------------
 	         1                __start()   [1]
 [2]	         1             main()   [2]
 	         2                pthread_create()   [8]
 	         2                pthread_join()   [10]
 	         1                pthread_mutex_init()   [7]
 	         1                sleep()   [9]
 -----------------------------------------------------
 	         1                threadmain()   [11]
 [13]	       288             mylib1_work()   [13]
 -----------------------------------------------------
 	         1                threadmain()   [11]
 [14]	       288             mylib2_work()   [14]
 -----------------------------------------------------
 	         1                threadmain()   [11]
 [12]	         4             printf()   [12]
 -----------------------------------------------------
 	         1                main()   [2]
 [8]	         2             pthread_create()   [8]
 -----------------------------------------------------
 	         1                main()   [2]
 [10]	         2             pthread_join()   [10]
 -----------------------------------------------------
 	         1                main()   [2]
 [7]	         1             pthread_mutex_init()   [7]
 -----------------------------------------------------
 	         1                main()   [2]
 [9]	         1             sleep()   [9]
 -----------------------------------------------------
 	                          <spontaneous>
 [11]	         1             threadmain()   [11]
 	       288                mylib1_work()   [13]
 	       288                mylib2_work()   [14]
 	         4                printf()   [12]
 -----------------------------------------------------
 
 Symbol Index
 --------------------------------------------------------
 [6]	__pth_init()  
 [1]	__start()  
 [4]	__threads_init()  
 [5]	call_pth_init()  
 [3]	exit()  
 [2]	main()  
 [13]	mylib1_work()  
 [14]	mylib2_work()  
 [12]	printf()  
 [8]	pthread_create()  
 [10]	pthread_join()  
 [7]	pthread_mutex_init()  
 [9]	sleep()  
 [11]	threadmain()  
 
 apcgp: Snapshot #1 end.
 -----------------------
 

In the report, each profiled function is given an entry in the table. The table is sorted either by function address or by name; the default is by name.

Each entry focuses on one function. That function's index is given in the left-hand column, the center column displays call counts, and the right-hand column shows the function names. The entry's focus function is outdented; the functions that called it are listed above it, and the functions it called are listed below it.

If a function recurses, the recursive call counts will be displayed separately in the Recursive column.

Example D-8. Apcgp Probe Output
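
The entry layout described above can be read mechanically. The following Python sketch parses report text into a dictionary of callers and callees per focus function. It is an illustration only: parse_call_graph is a hypothetical helper, the regular expressions are inferred from the sample output rather than from a format specification, and recursive counts are not handled.

```python
import re

# A focus row starts with an index in brackets; other rows start with a count.
FOCUS = re.compile(r"^\s*\[(\d+)\]\s+(\d+)\s+(\S+)\s+\[\d+\]\s*$")
OTHER = re.compile(r"^\s*(\d+)\s+(\S+)\s+\[\d+\]\s*$")


def parse_call_graph(text):
    """Map each focus function to its count, caller rows, and callee rows."""
    graph = {}
    before, focus = [], None
    for line in text.splitlines():
        if set(line.strip()) == {"-"}:
            # A line of dashes separates entries.
            before, focus = [], None
            continue
        match = FOCUS.match(line)
        if match:
            focus = match.group(3)
            graph[focus] = {"calls": int(match.group(2)),
                            "callers": before, "callees": []}
            continue
        match = OTHER.match(line)
        if match:
            row = (match.group(2), int(match.group(1)))
            # Rows above the focus row are callers, rows below are callees.
            if focus is None:
                before.append(row)
            else:
                graph[focus]["callees"].append(row)
        elif line.strip() == "<spontaneous>" and focus is None:
            before.append(("<spontaneous>", None))
    return graph


SAMPLE = """\
-----------------------------------------------------
         1                __start()   [1]
[2]      1             main()   [2]
         2                pthread_create()   [8]
         1                sleep()   [9]
-----------------------------------------------------
                          <spontaneous>
[11]     1             threadmain()   [11]
       288                mylib1_work()   [13]
-----------------------------------------------------
"""

graph = parse_call_graph(SAMPLE)
for name, entry in graph.items():
    print(name, entry)
```

Each row is stored with the count printed on that row; the sketch does not interpret what the count on a caller row measures.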