Difference between revisions of "Statprof Predefined Probe"

From OC Systems Wiki!
Jump to: navigation, search
m (Statistical Profiling: statprof.ual)
m (Statprof UAL parameters)
Line 37: Line 37:
 
: This performs a coarse profile to see overall usage in shared libraries. This uses a very large "cell" size which allows you to determine which shared library was active when the kernel tick occurred but not which routine.
 
: This performs a coarse profile to see overall usage in shared libraries. This uses a very large "cell" size which allows you to determine which shared library was active when the kernel tick occurred but not which routine.
 
; '''-i'''
 
; '''-i'''
: This sets the profile timer interval.  The default value is 10 milliseconds.  Valid values range from 10 milliseconds to 999 milliseconds.
+
: This sets the profile timer interval.  The default value is 10 milliseconds.  Valid values range from 10 milliseconds to 999 milliseconds.  (Since 4.4.9).
 
; '''-l'''
 
; '''-l'''
 
: This displays a breakdown of time within a function by line number as well as by function. The bigger the cell size the less reliable it will be but this can give useful information.
 
: This displays a breakdown of time within a function by line number as well as by function. The bigger the cell size the less reliable it will be but this can give useful information.

Revision as of 19:39, 25 May 2018

Statistical Profiling: statprof.ual

The statprof.ual predefined probe uses the Unix statistical profiling functions to provide a very low overhead mechanism to sample CPU timing. This is the underlying mechanism used when linking an application with the -p flag (or equivalent), the output of which is viewed using the prof tool. Since this is based on Unix services it is not available for the Windows 2000 version of Aprobe.

The probe works by using the Unix profil function. Normally this would be called from the startup and exit code of an application linked with -p; with Aprobe we call it on program entry and exit. A buffer is allocated and passed to the kernel with a "slot" (or cell) for every couple of instructions. When a kernel clock "ticks", if one of the threads for the application is running, the value in that slot is incremented.

When the application exits (or we take a snapshot), the memory buffer is saved to the APD file. At format time we convert each cell index to an address and use it to compute the time spent in each routine. Obviously, since the kernel is sampling the application at intervals, this is a statistical profiling method: Theoretically an application could spend 99% of it's time in one routine and jump to another routine at exactly the time we are sampling and then immediately return back again. Although this is theoretically possible, in practice it will never happen (given enough samples).

Statprof.ual has a number of advantages over using -p when building your application:

  • You do not need to rebuild your application to use it!
  • With -p you can only get information about the application module. While -p does link in some special static libraries to replace the standard libs this is not a perfect solution. Statprof allows you to access any of your modules.
  • You can take snapshots with staprof, control the cell size, display lines and generally control things to a much finer level.

One thing that statprof doesn't replicate from prof is the call counts facility since this is adequately covered by profile.ual.

Note that unlike profile, which measures wallclock time, statprof only measures CPU time. If the application is paused waiting on input, this time will not be seen. Sometimes you will be more interested in this than wallclock time, other times it will be the reverse. The [../examples/predefined_probes/statprof/README $APROBE/examples/learn/statprof] directory has a worked example where both statprof and profile are used as appropriate.

Usage

Using statprof is straightforward - there is no configuration file or GUI to worry about. All options can be set by command line options (either at runtime or format-time).

Statprof UAL parameters

statprof.ual is specified on the aprobe command line or in an APO file as described in Command Line. The specific options are:


aprobe  -u statprof.ual [-p "[-a] [-c] [-h] [-l] [-s size] [-z] [-i interval]
    [module name]]" your_program

where:

-a
This performs a profile on the whole address space. This uses the smallest cell size possible, though it will generally be rather coarse. Be careful to set the APD file size large enough to hold all the data.
-c
This performs a coarse profile to see overall usage in shared libraries. This uses a very large "cell" size which allows you to determine which shared library was active when the kernel tick occurred but not which routine.
-i
This sets the profile timer interval. The default value is 10 milliseconds. Valid values range from 10 milliseconds to 999 milliseconds. (Since 4.4.9).
-l
This displays a breakdown of time within a function by line number as well as by function. The bigger the cell size the less reliable it will be but this can give useful information.
-o filename
This outputs comma-delimited data to the given filename. This is suitable for input into a spreadsheet or similar program. Note that the line breaks are PC-based rather than Unix-based as this seems to be more generally accepted by programs we've tested with.
-s
This sets the number of bytes used for each sampling cell. It's unlikely that you would modify this but if you have a huge application and limited memory you may wish to do so. Basically the larger the cell size, the smaller the amount of storage required to hold the sampling table but it's more likely that the output data will be incorrect. By default two instructions are mapped to a cell which has nearly a 100% chance of both instructions belonging to the same routine due to the padding usually used by compilers. But if there are four instructions for the same cell, there's a chance that some of the instructions are for one function and some for another. However, when formatted they are treated as one address (obviously) so wrong information can result.
-z
This displays symbols or modules which have no associated CPU: Normally just the names of the routines (or modules) that have matches in the sampling table are displayed. This switch will also display the list of all other routines (or modules).

If you do not provide a module name, the application will be profiled.

At format time, the -l, -o and -z options may be provided if they were not specified at runtime.

Statprof API

Users can control the behavior of the statprof probe by calls from within their own probes. The API is defined by $APROBE/include/statprof.h. The functions exported by statprof.ual are:

ap_Statprof_Enable

Enables collection of statprof data.
ap_Statprof_Disable

Disables collection of statprof data.
ap_Statprof_Snapshot

Takes a snapshot of current statprof data.

Statprof report

The reports generated by statprof differ depending on the options you choose. The following is a sample output using the default options:

 Snapshot #1 @ 11:26:39.952, Final Snapshot
 ------------------------------------------
 Statistical Profiling (cell size 8) for module thisone, sorted by time:
  %Time   Seconds   Name
  ------  -------   -----------------------------------------------
   75.0    0.48     extern:"DoWork()"
   23.4    0.15     extern:"DoLessWork()"
    1.6    0.01     Other functions (not in profiled module)
 
 

Example D-9. Statprof Report