Memleak Predefined Probe

From OC Systems Wiki!
Revision as of 23:39, 28 March 2018 by Swn


Memory Leak Probe: memleak.ual

The memleak predefined probe is useful for tracking down memory leaks in an application. Such leaks occur when an application allocates memory dynamically (e.g. by directly or indirectly calling malloc or a similar memory allocation routine) and never frees it.

The probe works by keeping a list of outstanding allocations. When a deallocation occurs, the corresponding entry is removed from this list. When a snapshot is taken, the probe logs the entries still on the list, that is, the items allocated since the last snapshot that have not yet been freed. An entry in a snapshot is not necessarily a leak, since the free could occur after the snapshot point. To avoid such false alarms, young allocations are carried over into the next snapshot instead of being reported.

NOTE: The memleak.ual predefined probe is most appropriate for shorter-running programs which can afford a certain amount of overhead in both CPU time and memory. If you are looking for memory leaks in very long-running programs, or in programs where you need a very light touch, you probably want to use memstat.ual.

Motivation

Experience with three earlier probes dealing with memory leaks taught enough to tailor a probe specifically for finding them.

The first lesson was that leak output is for programmers, and they only need to be told once, not many times, to examine a suspicious allocation.

A second lesson was that leaks are needles in a haystack, and the hay is of no interest to programmers. Output unrelated to a leak is best avoided.

Overview

Instead of trying to report the status of all allocations, or a sampling of them, memleak reports only allocations that reach a suspicious age. Memleak also distinguishes long-lived allocations made in the first snapshot interval from those made in later intervals.

Many allocations made at the beginning of a program are kept for the life of the program. These aren't leaks, or at least, if some are significant leaks, they will repeat in later intervals.

So the output from the first snapshot interval is kept separate. It is logged, but programmers can generally ignore it when seeking a leak. In the ideal case, a program makes all its long-term allocations during the first snapshot interval, and everything reported in later snapshots is a leak. In the non-ideal case, valid long-lived allocations also occur later, and a programmer seeking a leak needs to distinguish allocations that are meant to be long-term from those that are accidentally long-term. Memleak makes that task as easy as possible:

  1. it only reports long-term allocations
  2. it reports only an index into the allocator tracebacks, not the tracebacks themselves

Memleak avoids reporting short-term allocations by carrying allocations of a non-suspicious age over into the next interval. Most get freed in the next interval.

Memleak uses a small snapshot interval so that very little memory is needed to remember the allocations. It frees its record of an allocation when the allocation is deallocated, and frees the records of suspicious-age allocations as soon as it logs them.

In other words, memleak doesn't bother programmers with all allocations, just the aged allocations. And by logging them quickly, it frees memory quickly.
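
The carry-over behaviour described above can be modelled with a few lines of awk. This is a toy model for illustration only, not the probe's implementation: the event format ("<time> alloc <addr> <size>", "<time> free <addr>", "<time> snap") and the function name simulate_memleak are invented here, and the 10-second default simply mirrors the -l LeakTime default listed under Configuration.

```shell
#!/bin/sh
# Toy model of age-based leak reporting (hypothetical, NOT the probe's code).
# Reads events on stdin, one per line:
#   <time> alloc <addr> <size>
#   <time> free  <addr>
#   <time> snap
# Argument 1: age in seconds at which an allocation becomes suspicious.
simulate_memleak() {
    awk -v leak_time="${1:-10}" '
        # Remember each outstanding allocation and when it was made.
        $2 == "alloc" { size[$3] = $4; born[$3] = $1 }
        # A deallocation removes the entry from the list.
        $2 == "free"  { delete size[$3]; delete born[$3] }
        # At a snapshot, report old entries once and forget them;
        # young entries stay on the list and are carried over.
        $2 == "snap"  {
            for (a in born) {
                age = $1 - born[a]
                if (age >= leak_time) {
                    printf("snap@%s: %s bytes at %s (age %s)\n", $1, size[a], a, age)
                    delete size[a]
                    delete born[a]
                }
            }
        }
    '
}
```

Feeding it an allocation at time 25 that is freed at time 32, with snapshots at 30 and 60, produces no report for that allocation: it is too young at the first snapshot and already gone at the second.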

Output

The following disk files are created:

  • memleak.tbi - tracebacks index, referred to by snapshots
  • memleak.snap - periodic snapshot output
  • memleak.init - initial snapshots collected during startup
  • memleak.exit - final snapshot on_exit

All memleak.* files are created in the current directory where the application is run, unless the 'memleak' prefix is overridden by the argument to '-f' in the memleak.ual argument list, e.g.,

aprobe -u memleak.ual -p "-f /u/logs/memleak_<pid>" /u/app1/app1.exe

would write the files to /u/logs/memleak_123456.tbi, etc., where 123456 is the process ID.

All descriptions below assume the default names.

In addition, progress lines are written to stderr, like

10 new kept in 00:00:07.502683 seconds before Tue Mar 25 12:44:35.925683 2008

which means that 10 new possible leak points were recorded in the roughly 7.5 seconds preceding the given time.
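
If the progress lines are captured to a file, for example by redirecting stderr with 2>, they can be summarized with awk. The sketch below assumes the lines keep the "N new kept in ..." format shown above; the function name sum_new_kept is made up for this example. Note that a redirect also captures anything the application itself writes to stderr, so the filter matches only lines whose second and third words are "new kept".

```shell
#!/bin/sh
# Sum the counts from memleak progress lines read on stdin.
# Assumes the "<N> new kept in ..." format; other lines are ignored.
sum_new_kept() {
    awk '$2 == "new" && $3 == "kept" { total += $1 }
         END { print total + 0 }'
}
```

For the five progress lines shown in the Usage section, this prints 135.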

Usage

Here is one way to use memleak.ual and interpret its output:

1. run your program with the memleak UAL

aprobe -u memleak MyProgram.exe

21 new kept in 00:00:30.016827 seconds before Wed Sep  6 16:16:43.710582 2014
30 new kept in 00:00:30.015614 seconds before Wed Sep  6 16:17:13.726770 2014
30 new kept in 00:00:30.016394 seconds before Wed Sep  6 16:17:43.744305 2014
30 new kept in 00:00:30.015884 seconds before Wed Sep  6 16:18:13.760815 2014
24 new kept in 00:02:40.037675 seconds before Wed Sep  6 16:21:03.799458 2014

2. grep the log output after the first interval for long-term allocations

grep 'memleak.tbi' memleak.snap

Note that the periodic snapshots are logged to memleak.snap. Short applications that finish before the first periodic snapshot is due log to memleak.exit instead. Early allocations, if they are of interest, are in memleak.init.

memleak.tbi Id 12: 3072 bytes at 0x804fd88
memleak.tbi Id 11: 4096 bytes at 0x804ed80
memleak.tbi Id 13: 2048 bytes at 0x804e578
memleak.tbi Id 10: 17408 bytes at 0x8052f40
memleak.tbi Id 17: 1024 bytes at 0x8051d98
memleak.tbi Id 16: 5120 bytes at 0x8050990

3. sort the stream by traceback ID

grep 'memleak.tbi' memleak.snap | sort -k 3 -n

memleak.tbi Id 10: 17408 bytes at 0x8052f40
memleak.tbi Id 11: 4096 bytes at 0x804ed80
memleak.tbi Id 12: 3072 bytes at 0x804fd88
memleak.tbi Id 13: 2048 bytes at 0x804e578
memleak.tbi Id 16: 5120 bytes at 0x8050990
memleak.tbi Id 17: 1024 bytes at 0x8051d98

4. look up the frequently occurring tracebacks in the memleak.tbi file

Id 10:
         "interp.c":"__GI___libc_malloc" in "libc.so"
     ==> extern:"SimpleSpot" + 0x0046
     ==> extern:"spot3" + 0x0022
     ==> extern:"main" + 0x02ac
     ==> extern:"__libc_start_main" + 0x00d2 in "libc.so"
     ==> extern:"_start" + 0x0020

5. go to the source code to see if the allocation is intended to be long-term

6. if not intended to be long-term, a deallocation is missing from the code

7. if a deallocation does exist, it is too guarded: some condition around it prevents it from running on every path

If more runs are needed, grep -v can skip uninteresting traceback Ids, and tail can skip uninteresting early intervals.
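
Steps 3 and 4 can be partly automated. The two helpers below are a sketch under stated assumptions: the names count_ids and show_traceback are invented for this example, the snapshot lines are assumed to have the "memleak.tbi Id N: SIZE bytes at ADDR" shape shown above, and each traceback in memleak.tbi is assumed to begin with an "Id N:" line as in the sample for Id 10.

```shell
#!/bin/sh
# Count how often each traceback Id appears in a snapshot file,
# most frequent first. Argument 1: snapshot file (default memleak.snap).
count_ids() {
    grep 'memleak.tbi' "${1:-memleak.snap}" |
        awk '{ sub(/:$/, "", $3); n[$3]++ }
             END { for (id in n) printf("%d\tId %s\n", n[id], id) }' |
        sort -k1,1nr -k3,3n
}

# Print the traceback block for one Id from the index file.
# Argument 1: traceback Id; argument 2: index file (default memleak.tbi).
# Assumes every block starts with a line of the form "Id N:".
show_traceback() {
    awk -v id="$1" '
        /^[ \t]*Id [0-9]+:/ { show = ($2 == id ":") }
        show
    ' "${2:-memleak.tbi}"
}
```

Typical use would be count_ids memleak.snap to find the most frequent Id, followed by show_traceback 10 to print its call chain. Ids already judged intentional can be filtered out of later runs with grep -v -e 'Id 10:' -e 'Id 11:'.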

Analyzing Memleak Data

The following script helps sort through memleak snapshot data. It produces three reports, sorted by leak size, count, and total size, so that you can focus on the biggest problems first.

#!/bin/sh
#-------------------------------------------------------------------
# Produce reports from memleak snapshot data.  This script produces
# several reports:  
# 
#  - leak points sorted by size of (each) leaked object
#  - leak points sorted by number of leaked objects
#  - leak points sorted by total size of leaked objects
# 
# This script assumes that the file memleak.snap is located in the 
# current directory and produces 3 files in the current directory:
#
#  - memleak.size.txt
#  - memleak.num.txt
#  - memleak.total.txt
#
# corresponding to the three reports above.
# 
# You can provide a parameter to this script to operate on specific
# snapshot files, like memleak.init (for early allocations) or 
# memleak.exit (for short-lived programs).
# 
#-------------------------------------------------------------------
SNAPFILE=${1:-memleak.snap}

# report KEY TITLE OUTFILE
# Aggregate the snapshot lines by (Id, size) and sort them in descending
# order on column KEY (2 = count, 3 = size, 4 = total), then append the
# traceback index so the Ids can be looked up in the same file.
report() {
    {
        printf 'Memleak Data Sorted By %s\nID\tCount\tSize\tTotal\n--\t-----\t----\t----\n' "$2"
        grep 'memleak.tbi' "$SNAPFILE" |
            awk '{printf("%s\t%s\n", $3, $4)}' | sort -k 1 -n | uniq -c |
            awk '{printf("%s\t%s\t%s\t%s\n", $2, $1, $3, $1 * $3)}' |
            sort -k "$1" -n -r
        printf '\n\nTraceback Ids:\n\n'
        cat memleak.tbi
    } > "$3"
}

report 3 Size  memleak.size.txt
report 2 Count memleak.num.txt
report 4 Total memleak.total.txt
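
The script above groups lines by traceback Id and size together, so an allocation point that leaks objects of several different sizes appears on several rows. Where a single total per traceback Id is more useful, something like the following could be used instead. It is a sketch, not part of the distributed script; the function name totals_by_id is invented here, and it assumes the same "memleak.tbi Id N: SIZE bytes at ADDR" line format.

```shell
#!/bin/sh
# Print one row per traceback Id: Id, number of reported allocations,
# and total bytes, largest total first.
# Argument 1: snapshot file (default memleak.snap).
totals_by_id() {
    grep 'memleak.tbi' "${1:-memleak.snap}" |
        awk '{ sub(/:$/, "", $3); cnt[$3]++; bytes[$3] += $4 }
             END { for (id in cnt) printf("%s\t%d\t%d\n", id, cnt[id], bytes[id]) }' |
        sort -k3,3nr -k1,1n
}
```

As with the script, a different snapshot file such as memleak.exit can be passed as the argument.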

Configuration

Command-line options. These must be given in quotes following -u memleak -p, like

aprobe -u memleak -p "-s 5 -d 20 -f foo<pid>" app.exe

-d TbDepth
    the maximum number of frames in the call chain to record for each allocation point. Default: 9.
-f FileName
    the name of the output file(s). Default: "./memleak". E.g., -p "-f /tmp/foo<pid>" creates /tmp/foo12345.tbi, /tmp/foo12345.exit.
-i InitTime
    the number of seconds before taking regular snapshots. Default: SnapTime.
-l LeakTime
    the number of seconds of age an allocation needs to have before it is reported. Default: 10.
-s SnapTime
    the number of seconds between snapshots after InitTime. Default: 30.
-w WaitTime
    the number of seconds for the application to initialize before tracking begins. Default: 0 (Aprobe 4.4.9)
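
When experimenting with these options it can help to assemble the command line in one place and review it before running. The helper below is a hypothetical convenience, not something shipped with Aprobe: memleak_cmd and its argument order are invented for this example, it uses only the options listed above, and it prints the command line without running it.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints an aprobe command line, does not run it.
# Arguments: application, wait seconds, snapshot seconds, leak-age seconds,
# output file prefix. Omitted arguments fall back to the documented defaults.
memleak_cmd() {
    app=$1
    wait_s=${2:-0}          # -w WaitTime default
    snap_s=${3:-30}         # -s SnapTime default
    leak_s=${4:-10}         # -l LeakTime default
    prefix=${5:-./memleak}  # -f FileName default
    printf 'aprobe -u memleak -p "-w %s -s %s -l %s -f %s" %s\n' \
        "$wait_s" "$snap_s" "$leak_s" "$prefix" "$app"
}
```

For example, memleak_cmd app.exe 60 30 10 '/tmp/foo<pid>' prints a command that waits 60 seconds before tracking begins and writes its files under the /tmp/foo<pid> prefix.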