CallGraphProfile Predefined Probe
Call Graph Profile Probe: apcgp.ual
The apcgp.ual predefined probe profiles the execution of functions within an application program. The probe records the number of calls made to functions and the number of calls made by functions. The profiling information is reported for the program as a whole in a call graph tree.
Call graph profiling allows you to determine, for each function, which functions it calls and which functions call it. The call graph profile information can be used to locate "hot spots" within your application on which to focus your attention when trying to improve your program's performance.
While this is a very powerful tool, it is not going to tell you in the first run what needs to be re-coded, unless it's a small application with a very obvious problem. Performance analysis of an application is an iterative process that requires an understanding of what the application does, and a feeling for how long operations should take.
Usage
This probe is applied at runtime using aprobe as described under Apcgp UAL Parameters below. The only functions for which profiling data will be collected are those selected with the PROFILE keyword in the configuration file (see Configuration File). The REMOVE keyword allows you to remove a function from the set to be profiled, and the SNAPSHOT keyword logs a snapshot of the profile information at a specified function.
The profile information is logged to a file for later processing by apformat. A call graph tree of profile information is generated for the program as a whole. See Apcgp Probe Output for an example.
Apcgp UAL Parameters
apcgp.ual is specified on the aprobe command line or in an APO file as described in Command Line. The specific options are:
aprobe -u apcgp.ual [-p "[-h] [-v] [-a size] [-b size] [-c config_filename] [-S signum] [-t size]"] your_program
where:
- -a size
- sets the arc density. Default 2. Minimum 2. Programs with functions which call lots of other functions should have higher arc densities.
- -b size
- sets the sampling bin size. Default 8. Minimum 8. A larger bin size requires less profile data memory but gives coarser-grained results. Programs with a large text segment may require a larger bin size to fit profile data in the address space. Programs with lots of small functions benefit from a smaller bin size for more accuracy.
- -c config_filename
- specifies that the name of the probe configuration options file will follow immediately after -c. The default file name is your_program.apcgp.cfg. For example, if your executable program is called wilbur.exe, then the default file name would be wilbur.exe.apcgp.cfg.
- -h
- produces brief help text.
- -S signum
- sets the signal number for a snapshot signal handler. Default 0 (no signal handler). Range 0-63. On receipt of the numbered signal, the probe will perform a snapshot to save profile data.
- -t size
- sets the call to table size. Default 5,000. Range 5,000-200,000. Programs with lots of functions should use larger values.
- -v
- verbose mode, which produces additional progress messages.
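The option ranges above can be checked before aprobe is ever launched. The following is a minimal sketch in Python, not part of Aprobe: the helper names build_apcgp_command and default_config_name are hypothetical, and the sketch assumes option values contain no spaces so that they can be joined into the single -p string shown in the synopsis.

```python
def default_config_name(program):
    """Default configuration file name: your_program.apcgp.cfg."""
    return program + ".apcgp.cfg"


def build_apcgp_command(program, arc_density=None, bin_size=None,
                        config=None, signum=None, table_size=None,
                        verbose=False):
    """Return an aprobe argument list for apcgp.ual (hypothetical helper).

    Only the limits stated in the option list are enforced.
    """
    opts = []
    if arc_density is not None:
        if arc_density < 2:
            raise ValueError("-a: minimum is 2")
        opts += ["-a", str(arc_density)]
    if bin_size is not None:
        if bin_size < 8:
            raise ValueError("-b: minimum is 8")
        opts += ["-b", str(bin_size)]
    if config is not None:
        opts += ["-c", config]
    if signum is not None:
        if not 0 <= signum <= 63:
            raise ValueError("-S: range is 0-63")
        opts += ["-S", str(signum)]
    if table_size is not None:
        if not 5000 <= table_size <= 200000:
            raise ValueError("-t: range is 5,000-200,000")
        opts += ["-t", str(table_size)]
    if verbose:
        opts.append("-v")

    cmd = ["aprobe", "-u", "apcgp.ual"]
    if opts:
        # Assumption: values contain no spaces, so a plain join is enough.
        cmd += ["-p", " ".join(opts)]
    cmd.append(program)
    return cmd


if __name__ == "__main__":
    print(build_apcgp_command("wilbur.exe", arc_density=5, bin_size=16))
    print(default_config_name("wilbur.exe"))
```

The returned list can be passed to subprocess.run without going through a shell, which avoids quoting the -p string by hand.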
Apcgp Configuration File
The Apcgp configuration file is used to specify what subprograms are to be analyzed, when snapshots are to be taken, and other options, as described in Configuration File. The example below shows one possible Apcgp configuration file.
APCGP CONFIGURATION FILE FOR PROFILE VERSION 2.0.0
Verbose FALSE
ProfilingEnabledInitially TRUE
ResetSnapshotCounts TRUE
BinSize 2
ArcDensity 5
ToLimit 5000

// see who calls and is called by system functions
PROFILE "open()" in "libc.so"
PROFILE "read()" in "libc.so"

// profile ''almost'' everything in the Motif Library
PROFILE "*" in "libXm.so"
REMOVE "_XmRecordEvent" in "libXm.so"

// log a snapshot of profile data
SNAPSHOT "RefreshCallback()"
Example apcgp.cfg File
Configuration Variables
The following are the only valid keywords that identify lines to set configuration variables. Each such line must begin with one of these keywords, and the keyword must be followed by its value. Nothing else is allowed on the same line.
Verbose
This must be followed by the value TRUE or FALSE. The default is FALSE. The value TRUE indicates that progress messages should be produced by the profile probe.
ProfilingEnabledInitially
This must be followed by the value TRUE or FALSE. The default is TRUE. The value TRUE indicates that data logging will begin as soon as the application program starts running. The value FALSE indicates that data logging will begin only after a call is made to the probe's function ap_Cgp_Enable().
ResetSnapshotCounts
This must be followed by the value TRUE or FALSE. The default is TRUE. The value TRUE indicates that at format time profile counts will be reset between snapshots.
BinSize
This must be followed by an unsigned integer, minimum 8. This sets the sampling bin size, which controls how fine-grained the profile is: a smaller bin size provides finer-grained information at the expense of a larger memory requirement, while a larger bin size requires less memory but provides coarser (less accurate) profile data. Coarser data may not be able to distinguish between small functions, but will allow a larger program to be profiled.
ArcDensity
This must be followed by an unsigned integer, minimum 2. This sets the call arc density used to calculate the size of profile data tables. A larger arc density is needed for programs whose functions call lots of other functions, and will require more memory space, while a lower arc density is sufficient for simpler program functions, and requires less memory.
ToLimit
This must be followed by an unsigned integer, minimum 2. This sets the call to arc size used to calculate the size of profile data tables. A larger value is required to record more data and will require more memory space.
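The BinSize trade-off can be made concrete with a toy model. The sketch below is an illustration only and does not reproduce how apcgp.ual lays out its tables: it assumes that each bin covers bin_size consecutive bytes of the text segment and that an address is assigned to a bin by integer division. The function names and the addresses are hypothetical.

```python
def bin_index(address, text_start, bin_size):
    """Bin that an address falls into (assumed model: fixed-width bins)."""
    return (address - text_start) // bin_size


def bins_needed(text_size, bin_size):
    """Number of bins needed to cover the text segment (ceiling division)."""
    return -(-text_size // bin_size)


if __name__ == "__main__":
    text_start = 0x1000
    # Two hypothetical small functions whose entry points are 16 bytes apart.
    func_a, func_b = 0x1000, 0x1010

    for size in (8, 32):
        a = bin_index(func_a, text_start, size)
        b = bin_index(func_b, text_start, size)
        print(size, bins_needed(4096, size), a, b, "distinct" if a != b else "same bin")
```

Under this model a bin size of 8 keeps the two functions in separate bins, while a bin size of 32 puts both in the same bin but needs a quarter of the bins for the same text segment.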
Configuration of Profiled Functions
Each function to be profiled must be specified explicitly using the PROFILE keyword followed by the name of the function, as described in Configuration of Selected Functions.
The REMOVE keyword allows you to specify functions that should not be instrumented for profiling. This is useful when used in conjunction with a wildcard ("*"), to gather data about everything except certain routines.
A line beginning with the keyword SNAPSHOT specifies the name of a snapshot function in the usual manner. Entry to the snapshot function will cause the probe to save a snapshot of the profile data.
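The way PROFILE with a wildcard combines with REMOVE can be pictured as a set difference: everything matched by a PROFILE pattern, minus everything matched by a REMOVE pattern. The sketch below illustrates that idea in Python with the standard fnmatch module. It is an assumption about the semantics, not Aprobe's actual matching code, and select_functions and the sample symbol list are hypothetical.

```python
from fnmatch import fnmatchcase


def select_functions(symbols, profile_patterns, remove_patterns):
    """Return the symbols matched by a PROFILE pattern and by no REMOVE pattern."""
    selected = []
    for name in symbols:
        profiled = any(fnmatchcase(name, p) for p in profile_patterns)
        removed = any(fnmatchcase(name, p) for p in remove_patterns)
        if profiled and not removed:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Sample symbol names standing in for the contents of one library.
    symbols = ["XmCreateLabel", "_XmRecordEvent", "XmStringCreate"]
    print(select_functions(symbols, ["*"], ["_XmRecordEvent"]))
```

With the patterns from the example configuration file, every sample symbol except _XmRecordEvent is selected.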
Apcgp API
Users can control the behavior of the apcgp.ual probe by calls from within their own probes. The API is defined by $APROBE/include/apcgp.h. Some of the functions exported by apcgp.ual are:
- ap_Cgp_Enable
- Enables collection of profile data.
- ap_Cgp_Disable
- Disables collection of profile data.
- ap_Cgp_Snapshot
- Takes a snapshot of profile data for the program.
Apcgp Demand Actions
(Since 4.4.8).
You can control apcgp.ual using demand.ual and apdemand.
Include demand.ual on the Aprobe command line:
aprobe -u apcgp -u demand myapp.exe
then use apdemand to send actions:
apdemand apcgp snapshot
apcgp.ual responds to the following actions:
- apcgp snapshot
- apcgp enable
- apcgp disable
Note that all actions containing the action string will be triggered.
Profile Performance Issues
See Performance_Issues for a general discussion of factors that affect performance.
Apcgp Report
The reports generated by running apformat vary depending upon the options chosen as specified above. The default looks like the example below:
apcgp: Snapshot #1 @ 13:55:26.605, Snapshot on program exit
-----------------------------------------------------------
apcgp: Bin Size: 8 Arc Density: 10 Number Bins: 463 Number Arcs: 5000

Call Graph Profile

Indx   Calls  Recursive  Name
--------------------------------------------------------
           1                 call_pth_init() [5]
[6]        1             __pth_init() [6]
-----------------------------------------------------
                             <spontaneous>
[1]        1             __start() [1]
           1                 exit() [3]
           1                 main() [2]
-----------------------------------------------------
                             <spontaneous>
[4]        1             __threads_init() [4]
           1                 call_pth_init() [5]
-----------------------------------------------------
           1                 __threads_init() [4]
[5]        1             call_pth_init() [5]
           1                 __pth_init() [6]
-----------------------------------------------------
           1                 __start() [1]
[3]        1             exit() [3]
-----------------------------------------------------
           1                 __start() [1]
[2]        1             main() [2]
           2                 pthread_create() [8]
           2                 pthread_join() [10]
           1                 pthread_mutex_init() [7]
           1                 sleep() [9]
-----------------------------------------------------
           1                 threadmain() [11]
[13]     288             mylib1_work() [13]
-----------------------------------------------------
           1                 threadmain() [11]
[14]     288             mylib2_work() [14]
-----------------------------------------------------
           1                 threadmain() [11]
[12]       4             printf() [12]
-----------------------------------------------------
           1                 main() [2]
[8]        2             pthread_create() [8]
-----------------------------------------------------
           1                 main() [2]
[10]       2             pthread_join() [10]
-----------------------------------------------------
           1                 main() [2]
[7]        1             pthread_mutex_init() [7]
-----------------------------------------------------
           1                 main() [2]
[9]        1             sleep() [9]
-----------------------------------------------------
                             <spontaneous>
[11]       1             threadmain() [11]
         288                 mylib1_work() [13]
         288                 mylib2_work() [14]
           4                 printf() [12]
-----------------------------------------------------

Symbol Index
--------------------------------------------------------
[6]   __pth_init()
[1]   __start()
[4]   __threads_init()
[5]   call_pth_init()
[3]   exit()
[2]   main()
[13]  mylib1_work()
[14]  mylib2_work()
[12]  printf()
[8]   pthread_create()
[10]  pthread_join()
[7]   pthread_mutex_init()
[9]   sleep()
[11]  threadmain()

apcgp: Snapshot #1 end.
-----------------------
In the report, each profiled function is given an entry in the table. The table is sorted either by function address or by name; the default is by name.
Each entry focuses on one function. That function's index is given in the lefthand column, the center column displays call counts, and the righthand column shows the function names. Each entry's focus function is outdented; above it are the functions that called it, and below it are the functions it called.
If a function recurses, the recursive call counts will be displayed separately in the Recursive column.
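The entry layout described above (callers, then the outdented focus function, then callees) can be sketched from a table of call arcs. This is an illustration only and is not how apformat is implemented: the inputs calls and arcs are hypothetical structures, and printing the per-arc count next to each caller and callee is an assumption that may differ from the convention apformat uses for those lines.

```python
def entry_lines(focus, calls, arcs):
    """Lay out one call graph entry as a list of text lines.

    calls: dict mapping function name -> total number of calls to it.
    arcs:  dict mapping (caller, callee) -> number of calls along that arc.
    """
    callers = sorted(c for (c, callee) in arcs if callee == focus)
    callees = sorted(callee for (c, callee) in arcs if c == focus)

    lines = []
    if callers:
        for caller in callers:
            lines.append(f"    {arcs[(caller, focus)]} {caller}()")
    else:
        # No recorded caller: shown as <spontaneous> in the report.
        lines.append("    <spontaneous>")
    # The focus function is outdented relative to its callers and callees.
    lines.append(f"{calls[focus]} {focus}()")
    for callee in callees:
        lines.append(f"    {arcs[(focus, callee)]} {callee}()")
    return lines


if __name__ == "__main__":
    calls = {"__start": 1, "exit": 1, "main": 1, "pthread_create": 2, "sleep": 1}
    arcs = {
        ("__start", "exit"): 1,
        ("__start", "main"): 1,
        ("main", "pthread_create"): 2,
        ("main", "sleep"): 1,
    }
    for name in sorted(calls):
        print("\n".join(entry_lines(name, calls, arcs)))
        print("-" * 40)
```

Keeping the arcs in a single dictionary keyed by (caller, callee) means the same data yields both the "called by" and the "calls" half of every entry.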