Performance Probes

The trace, perf_cpu, and perf_rusage probes work together to gather performance statistics on the application. Additional statistics can be collected in a straightforward manner by using the "perf_*" probes as models. The Event Summary Tree displays how much of each sampled resource (e.g., elapsed CPU time) was used by each traced function and method.

Event Summary Tree

All statistics are shown in the Event Summary Tree, found under the EVENT_SUMMARY node at the end of your Trace output in the Trace Display window.

The Event Summary Tree is a call tree constructed from the trace events in the Event Tree. Each traced function or method appears exactly once for each unique invocation point in each thread in the program. By "unique invocation point" we mean a unique call stack. For example, if function C is called once from B and twice from A, then the Event Summary Tree shows function C twice: once as a child of B and once as a child of A. The details for each node show the number of times it was called and the elapsed "Wall Time", that is, the time between entry and exit.
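The aggregation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not RootCause's implementation: the event tuple layout (thread, kind, function, timestamp) and the summarize name are assumptions made for the example.

```python
from collections import defaultdict

def summarize(events):
    """Aggregate enter/exit events into one summary node per unique call stack.

    events: iterable of (thread, kind, function, timestamp) tuples, where kind
    is "enter" or "exit". This layout is hypothetical, chosen for the sketch.
    """
    stacks = defaultdict(list)  # thread -> open calls as (function, entry_time)
    summary = defaultdict(lambda: {"calls": 0, "wall_time": 0.0})
    for thread, kind, func, ts in events:
        stack = stacks[thread]
        if kind == "enter":
            stack.append((func, ts))
        else:
            name, start = stack.pop()
            # The key is the thread plus the full chain of callers, so the same
            # function reached through different callers gets separate nodes.
            key = (thread,) + tuple(f for f, _ in stack) + (name,)
            summary[key]["calls"] += 1
            summary[key]["wall_time"] += ts - start
    return dict(summary)

# C is called twice from A and once from B, all in thread 1.
events = [
    (1, "enter", "main", 0.0),
    (1, "enter", "A", 1.0),
    (1, "enter", "C", 2.0), (1, "exit", "C", 3.0),
    (1, "enter", "C", 4.0), (1, "exit", "C", 6.0),
    (1, "exit", "A", 7.0),
    (1, "enter", "B", 8.0),
    (1, "enter", "C", 9.0), (1, "exit", "C", 10.0),
    (1, "exit", "B", 11.0),
    (1, "exit", "main", 12.0),
]
summary = summarize(events)
```

Here summary holds two separate entries for C, one under main/A with two calls and one under main/B with one call, mirroring how the tree lists C once per invocation point.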

To get the most accurate summary, you should format all available data. To do this, click the Examine button and select all data files (clear, then re-check, the checkbox for the entire Process Data Set in the Examine Data Dialog).

trace

The trace probe records the entry to and exit from each function and method selected in the Trace Setup dialog. The "time of day" is automatically recorded with each event, so trace events can be correctly ordered by time, and events from different programs can be placed in a single sequence.

An "added bonus" of recording the time is that we can display the elapsed time spent in each traced function. The elapsed time is shown for each individual call in the Event Tree, and for all calls to a function from a unique invocation point in the Event Summary Tree.
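As a rough illustration of why recording the time of day matters, the sketch below interleaves events from two programs by timestamp. The log layout (timestamp, program, kind, function), the program and function names, and merge_logs are all hypothetical; the sketch also assumes each log is already sorted by time and that the programs' clocks are comparable.

```python
import heapq

def merge_logs(*logs):
    """Merge per-program event logs into one sequence ordered by timestamp.

    Each log must already be sorted by its first field, the timestamp.
    """
    return list(heapq.merge(*logs, key=lambda event: event[0]))

# Hypothetical events recorded by two separate programs.
prog_a = [
    (100.0, "prog_a", "enter", "connect"),
    (100.4, "prog_a", "exit", "connect"),
]
prog_b = [
    (100.1, "prog_b", "enter", "accept"),
    (100.3, "prog_b", "exit", "accept"),
]
merged = merge_logs(prog_a, prog_b)
```

In the merged sequence, the accept call in prog_b falls entirely inside the connect call in prog_a, which is the kind of cross-program ordering the recorded timestamps make possible.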

perf_cpu

The perf_cpu probe samples the CPU time on entry and exit of each function and method being traced. The total CPU time used over all invocations of a function or method from a given point, as well as the average, minimum and maximum values, is shown in the Event Summary Tree.
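The entry/exit sampling idea can be sketched with a Python decorator. This is an analogy only, not how the perf_cpu probe is implemented: sample_cpu, cpu_stats, average and busy are hypothetical names, the sketch uses Python's time.process_time() as the CPU clock, and for brevity it keys statistics by function name rather than by invocation point.

```python
import time
from functools import wraps

cpu_stats = {}  # function name -> {"calls", "total", "min", "max"}

def sample_cpu(func):
    """Sample process CPU time on entry and exit, and accumulate statistics."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.process_time()  # sample on entry
        try:
            return func(*args, **kwargs)
        finally:
            used = time.process_time() - start  # sample on exit
            stats = cpu_stats.setdefault(
                func.__name__,
                {"calls": 0, "total": 0.0, "min": float("inf"), "max": 0.0},
            )
            stats["calls"] += 1
            stats["total"] += used
            stats["min"] = min(stats["min"], used)
            stats["max"] = max(stats["max"], used)
    return wrapper

def average(name):
    """Average CPU time per call for one traced function."""
    stats = cpu_stats[name]
    return stats["total"] / stats["calls"]

@sample_cpu
def busy(n):
    return sum(i * i for i in range(n))

busy(10_000)
busy(200_000)
```

After the two calls, cpu_stats["busy"] holds the call count, total, minimum and maximum CPU time, and average("busy") gives the mean, which corresponds to the figures the summary reports per invocation point.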

To enable the perf_cpu probe:

Then run your application under RootCause, and Examine the data. To get the most accurate data, select all the data files for your run (see Event Summary Tree above). The CPU time consumed by each traced function and method is shown in the EVENT_SUMMARY node at the end of the data. CPU time is not shown for individual calls, only under EVENT_SUMMARY.

perf_rusage (Unix only)

The perf_rusage probe calls the getrusage() system call periodically and logs the statistics it provides:

RU_IXRSS  -  integral shared memory size
RU_IDRSS  -  integral unshared data
RU_ISRSS  -  integral unshared stack
RU_MINFLT  -  page reclaims
RU_MAJFLT  -  page faults
RU_NSWAP  -  swaps
RU_INBLOCK  -  block input operations
RU_OUBLOCK  -  block output operations
RU_MSGSND  -  messages sent
RU_MSGRCV  -  messages received
RU_NSIGNALS  -  signals received
RU_NVCSW  -  voluntary context switches
RU_NIVCSW  -  involuntary context switches

The results of each call are available as an event in the Trace Event Tree, and the change in each value between the entry and exit of each function is recorded. The sum of all the resource usage for a given function or method is shown in the Event Summary Tree.
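To make the entry/exit difference concrete, here is a minimal Python sketch that calls getrusage() before and after a function and reports the change in each counter. It is not the probe's implementation: rusage_delta, FIELDS and allocate are hypothetical names, and the lowercase field names are those of Python's resource module (Unix only), which correspond to the statistics listed above. Some systems do not maintain every field and report zero for it.

```python
import resource

# Counters corresponding to the statistics listed above.
FIELDS = (
    "ru_ixrss", "ru_idrss", "ru_isrss", "ru_minflt", "ru_majflt",
    "ru_nswap", "ru_inblock", "ru_oublock", "ru_msgsnd", "ru_msgrcv",
    "ru_nsignals", "ru_nvcsw", "ru_nivcsw",
)

def rusage_delta(func, *args, **kwargs):
    """Call func and return (result, change in each rusage counter)."""
    before = resource.getrusage(resource.RUSAGE_SELF)  # sample on entry
    result = func(*args, **kwargs)
    after = resource.getrusage(resource.RUSAGE_SELF)   # sample on exit
    delta = {f: getattr(after, f) - getattr(before, f) for f in FIELDS}
    return result, delta

def allocate(n_bytes):
    """Hypothetical workload: touch some freshly allocated memory."""
    buf = bytearray(n_bytes)
    return len(buf)

size, delta = rusage_delta(allocate, 8 * 1024 * 1024)
```

Summing such deltas over every call to a function gives the per-function totals of the kind shown in the Event Summary Tree.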

Use the perf_rusage probe exactly as described for the perf_cpu probe above.

