malloc+utrace, tracking memory leaks in a running program.

Alfred Perlstein

Dec 21, 2012, 10:37:52 PM
Hey guys.

So the other day in an effort to debug a memory leak I decided to take a
look at malloc+utrace(2) and decided to make a tool to debug where leaks
are coming from.

A few hours later I have:
1) a new version of utrace(2) (utrace2(2)) that uses structured data to
prevent overloading of data. (utrace2.diff)
2) changes to ktrace and kdump to decode the new format. (also in
utrace2.diff)
3) changes to jemalloc to include the new format AND the function caller
so it's easy to get the source of the leaks. (also in utrace2.diff)
4) a program that can take a pipe of kdump(1) and figure out what memory
has leaked. (alloctrace.py)
5) simple test program (test_utrace.c)

If you want to get a trace now you can do this:
gcc -Wall -O ./test_utrace.c

env MALLOC_CONF='utrace:true' ktrace ./a.out
kdump | ./alloctrace.py


Now the problem I am having is making this work on a running program:
1) Turning on "opt_utrace" in a running program is almost impossible,
because libc is installed stripped. Unfortunately my gdb-fu is weak,
and I was unable to load the symbol file without a really bad hack.

The only way I could get it done was to use a trick from Ed Maste:
1.1) install a debug copy of libc.so over the installed one. <- dislike!
1.2) launch gdb ./a.out <pid>
1.3) set __jemalloc_opt_utrace = 1
1.4) enable ktrace on the running binary:
       ktrace -p <pid> -t U    # this is utrace2 enabled
1.5) run 'cont' in gdb to proceed.

There has to be an easier way to access the symbol __jemalloc_opt_utrace
besides copying over the installed libc.

Is there a workaround for 1.1?

Is it time to start installing with some form of debug symbols? This
would help us also with dtrace.

Ideas?

-Alfred

utrace2.diff
test_utrace.c

Ed Maste

Dec 22, 2012, 11:56:05 AM
On 21 December 2012 22:37, Alfred Perlstein <bri...@mu.org> wrote:

> Is it time to start installing with some form of debug symbols? This would
> help us also with dtrace.

I just posted a patch to add a knob to build and install standalone
debug files. My intent is that we will build releases with this
enabled, and add a base-dbg.txz distribution that contains the debug
data for the base system, so that one can install it along with
everything else, or add it later on when needed.

We could perhaps teach dtrace to read its data from standalone .ctf
files, or have it read DWARF directly and use the same debug files.
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hacke...@freebsd.org"

Alfred Perlstein

Dec 22, 2012, 1:08:05 PM
On 12/22/12 8:56 AM, Ed Maste wrote:
> On 21 December 2012 22:37, Alfred Perlstein <bri...@mu.org> wrote:
>
>> Is it time to start installing with some form of debug symbols? This would
>> help us also with dtrace.
> I just posted a patch to add a knob to build and install standalone
> debug files. My intent is that we will build releases with this
> enabled, and add a base-dbg.txz distribution that contains the debug
> data for the base system, so that one can install it along with
> everything else, or add it later on when needed.
>
> We could perhaps teach dtrace to read its data from standalone .ctf
> files, or have it read DWARF directly and use the same debug files.
>
Thank you.

Added Rui Paulo to the CC. Rui, do you think it's easy to get dtrace to
honor these conventions?

-Alfred

Jason Evans

Dec 23, 2012, 12:28:52 PM
On Dec 21, 2012, at 7:37 PM, Alfred Perlstein <bri...@mu.org> wrote:
> So the other day in an effort to debug a memory leak I decided to take a look at malloc+utrace(2) and decided to make a tool to debug where leaks are coming from.
>
> A few hours later I have:
> 1) a new version of utrace(2) (utrace2(2)) that uses structured data to prevent overloading of data. (utrace2.diff)
> 2) changes to ktrace and kdump to decode the new format. (also in utrace2.diff)
> 3) changes to jemalloc to include the new format AND the function caller so it's easy to get the source of the leaks. (also in utrace2.diff)
> 4) a program that can take a pipe of kdump(1) and figure out what memory has leaked. (alloctrace.py)
> 5) simple test program (test_utrace.c)
>
> […]

Have you looked at the heap profiling functionality built into jemalloc? It's not currently enabled on FreeBSD, but as far as I know, the only issue keeping it from being useful is the absence of a Linux-compatible /proc/<pid>/maps (and the gperftools folks may already have a solution for that; I haven't looked). I think it makes more sense to get that sorted out than to develop a separate trace-based leak checker. The problem with tracing is that it doesn't scale beyond some relatively small number of allocator events.

> Is it time to start installing with some form of debug symbols? This would help us also with dtrace.

Re: debug symbols, frame pointers, etc. necessary to make userland dtrace work by default, IMO we should strongly prefer such defaults. It's more reasonable to expect people who need every last bit of performance to remove functionality than to expect people who want to figure out what the system is doing to figure out what functionality to turn on.

Thanks,
Jason

Alfred Perlstein

Dec 23, 2012, 1:02:19 PM
On 12/23/12 9:28 AM, Jason Evans wrote:
> On Dec 21, 2012, at 7:37 PM, Alfred Perlstein <bri...@mu.org> wrote:
>> So the other day in an effort to debug a memory leak I decided to take a look at malloc+utrace(2) and decided to make a tool to debug where leaks are coming from.
>>
>> A few hours later I have:
>> 1) a new version of utrace(2) (utrace2(2)) that uses structured data to prevent overloading of data. (utrace2.diff)
>> 2) changes to ktrace and kdump to decode the new format. (also in utrace2.diff)
>> 3) changes to jemalloc to include the new format AND the function caller so it's easy to get the source of the leaks. (also in utrace2.diff)
>> 4) a program that can take a pipe of kdump(1) and figure out what memory has leaked. (alloctrace.py)
>> 5) simple test program (test_utrace.c)
>>
>> […]
> Have you looked at the heap profiling functionality built into jemalloc? It's not currently enabled on FreeBSD, but as far as I know, the only issue keeping it from being useful is the absence of a Linux-compatible /proc/<pid>/maps (and the gperftools folks may already have a solution for that; I haven't looked). I think it makes more sense to get that sorted out than to develop a separate trace-based leak checker. The problem with tracing is that it doesn't scale beyond some relatively small number of allocator events.
OK, we are in agreement on all of this.

Paul Saab recommended profiling to me, but yes, the problem is that none
of this stuff works on FreeBSD out of the box due to missing bits here
or there. Augmenting the existing utrace stuff to get what I needed
seemed much simpler than figuring out how to get dtrace, pidmaps and
whatnot into the system. Accomplishing these higher-order things
requires X=(skill+time+ability_to_socialize_these_changes), where X >
alfred->skill_and_time_and_socialize(). :)

To be honest, if dtrace just worked, then I could get the same
information I'm getting from utrace2(2) from dtrace with no problem.
(at least I think so).

As far as scaling goes, I agree that it does not work for long-running
programs. However, there are a few instances of programs leaking large
amounts of memory in a short time, which I can track down by ktracing
them temporarily.

>> Is it time to start installing with some form of debug symbols? This would help us also with dtrace.
> Re: debug symbols, frame pointers, etc. necessary to make userland dtrace work by default, IMO we should strongly prefer such defaults. It's more reasonable to expect people who need every last bit of performance to remove functionality than to expect people who want to figure out what the system is doing to figure out what functionality to turn on.
Yes!!! :)

Is there an easy way to go about this?

Rui says it's really a matter of just turning off stripping of shlibs
and adding -fno-omit-frame-pointer and WITH_CTF.

I'm going to give this a shot, if it works, can you help me refine this?

I'll post diffs later today if I don't get completely stuck somehow.

-Alfred

Alfred Perlstein

Jan 10, 2013, 1:41:05 AM
On 12/23/12 12:28 PM, Jason Evans wrote:
> On Dec 21, 2012, at 7:37 PM, Alfred Perlstein <bri...@mu.org> wrote:
>> So the other day in an effort to debug a memory leak I decided to take a look at malloc+utrace(2) and decided to make a tool to debug where leaks are coming from.
>>
>> A few hours later I have:
>> 1) a new version of utrace(2) (utrace2(2)) that uses structured data to prevent overloading of data. (utrace2.diff)
>> 2) changes to ktrace and kdump to decode the new format. (also in utrace2.diff)
>> 3) changes to jemalloc to include the new format AND the function caller so it's easy to get the source of the leaks. (also in utrace2.diff)
>> 4) a program that can take a pipe of kdump(1) and figure out what memory has leaked. (alloctrace.py)
>> 5) simple test program (test_utrace.c)
>>
>> […]
> Have you looked at the heap profiling functionality built into jemalloc? It's not currently enabled on FreeBSD, but as far as I know, the only issue keeping it from being useful is the absence of a Linux-compatible /proc/<pid>/maps (and the gperftools folks may already have a solution for that; I haven't looked). I think it makes more sense to get that sorted out than to develop a separate trace-based leak checker. The problem with tracing is that it doesn't scale beyond some relatively small number of allocator events.

I have looked at some of this functionality (heap profiling), but alas it
is not implemented yet. In addition, the dtrace work appears to be quite
far from a workable solution, with too many performance penalties until
some serious hacking is done.

I am just not sure how to proceed. I do not really have the skill to fix
the /proc/pid/maps problem, nor to figure out how to get dtrace into the
system in any reasonable time frame.

All a few of us need is the addition of the trace back into the existing
utrace framework.

>> Is it time to start installing with some form of debug symbols? This would help us also with dtrace.
> Re: debug symbols, frame pointers, etc. necessary to make userland dtrace work by default, IMO we should strongly prefer such defaults. It's more reasonable to expect people who need every last bit of performance to remove functionality than to expect people who want to figure out what the system is doing to figure out what functionality to turn on.
>

This is very true. I'm going to continue to work towards this end with
a few people and get up to speed on it, so that hopefully we can get to
this point in the next release cycle or two.

If you have a few moments, can you have a look at the "utrace2" branches
here:
https://github.com/alfredperlstein/freebsd/tree/utrace2

This branch adds the utrace2 system call, which is needed to pass
structured data via utrace(2). The point is to spare kdump(1) from having
to discern the type of a ktrace record from its size or other arbitrary
parameters, and to introduce an extensible protocol for new types of
utrace data.
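
As a rough illustration of the idea only (this is not the actual utrace2
record format, which lives in the branch above), a type-tagged record
lets a decoder dispatch on an explicit tag instead of guessing from the
payload size. The header layout, the tag value, and the payload fields
below are all assumptions made up for this sketch.

```python
import struct

# Hypothetical header: 32-bit record type, 32-bit payload length (little-endian).
HEADER = struct.Struct("<II")
# Hypothetical payload for a malloc event: size, result pointer, caller address.
MALLOC_PAYLOAD = struct.Struct("<QQQ")
REC_MALLOC = 1  # made-up tag value, not taken from the utrace2 patch


def encode(rec_type, payload):
    """Prefix a payload with its type tag and length."""
    return HEADER.pack(rec_type, len(payload)) + payload


def decode(buf):
    """Yield (rec_type, payload) pairs from a buffer of concatenated records."""
    offset = 0
    while offset < len(buf):
        rec_type, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        yield rec_type, buf[offset:offset + length]
        offset += length


def describe(rec_type, payload):
    """Dispatch on the tag; unknown types are skipped by length, not guessed at."""
    if rec_type == REC_MALLOC:
        size, ptr, caller = MALLOC_PAYLOAD.unpack(payload)
        return f"malloc({size}) = {ptr:#x} from {caller:#x}"
    return f"unknown record type {rec_type} ({len(payload)} bytes)"
```

Because every record carries its own length, a decoder that does not
know a newer record type can still step over it, which is the
extensibility property described above.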

The utrace2 branch here augments jemalloc to use utrace2 to pass the old
utrace records, but in addition to pass the return address along with
the type and size of the allocation:
https://github.com/alfredperlstein/jemalloc/tree/utrace2

Alfred Perlstein

Jan 10, 2013, 1:56:48 AM
Jason,

Here are more convenient links that give diffs against FreeBSD and
jemalloc for the proposed changes:

FreeBSD:
https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2

jemalloc:
https://github.com/alfredperlstein/jemalloc/compare/master...utrace2

Konstantin Belousov

Jan 10, 2013, 2:38:54 AM
On Thu, Jan 10, 2013 at 01:56:48AM -0500, Alfred Perlstein wrote:
> Here are more convenient links that give diffs against FreeBSD and
> jemalloc for the proposed changes:
>
> FreeBSD:
> https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2
>
Why do you need to route the records through ktrace at all?
Wouldn't direct write(2)s to a file perform better, since they would
not stress the kernel memory allocator and the single writing thread?
It would also avoid coupling malloc to a single-system interface.

I believe that other usermode tracers behave in a similar way,
using writes and not a private kernel interface.

Also, what /proc issues did you mention? There is
sysctl kern.proc.vmmap, which is much more convenient than /proc/pid/map
and does not require /proc to be mounted.

Alfred Perlstein

Jan 10, 2013, 10:16:46 AM
On 1/10/13 2:38 AM, Konstantin Belousov wrote:
> On Thu, Jan 10, 2013 at 01:56:48AM -0500, Alfred Perlstein wrote:
>> Here are more convenient links that give diffs against FreeBSD and
>> jemalloc for the proposed changes:
>>
>> FreeBSD:
>> https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2
>>
> Why do you need to route the records through ktrace at all?
> Wouldn't direct write(2)s to a file perform better, since they would
> not stress the kernel memory allocator and the single writing thread?
> It would also avoid coupling malloc to a single-system interface.
>
> I believe that other usermode tracers behave in a similar way,
> using writes and not a private kernel interface.
>
> Also, what /proc issues did you mention? There is
> sysctl kern.proc.vmmap, which is much more convenient than /proc/pid/map
> and does not require /proc to be mounted.
>
>> jemalloc:
>> https://github.com/alfredperlstein/jemalloc/compare/master...utrace2
>>

Konstantin, you are right, it is a strange thing this utrace. I am not
sure why it was done this way.

You are correct that a much more efficient system could be made using
writes gathered into a single write(2).
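
To make the "gathered writes" idea concrete, here is a minimal sketch,
assuming a made-up fixed-size event layout and a hypothetical
TraceBuffer helper (neither comes from jemalloc or the utrace2 patch).
Events accumulate in memory and go out in one write per batch rather
than one system call per allocator event.

```python
import os
import struct

# Hypothetical fixed-size event: operation code, size, pointer.
EVENT = struct.Struct("<IQQ")


class TraceBuffer:
    """Accumulate events in memory and flush them in batches."""

    def __init__(self, fd, capacity=4096):
        self.fd = fd
        self.capacity = capacity  # flush once this many bytes are pending
        self.pending = bytearray()

    def record(self, op, size, ptr):
        """Append one event; flush when the batch is full."""
        self.pending += EVENT.pack(op, size, ptr)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        """Write the whole batch; normally this is a single write(2)."""
        data = bytes(self.pending)
        self.pending.clear()
        while data:  # loop only to cope with short writes
            written = os.write(self.fd, data)
            data = data[written:]
```

A tracer living inside the allocator itself would need a statically
allocated buffer and locking, since it cannot call malloc while tracing
malloc; the sketch only illustrates the batching.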

Do you think there is any reason they may have re-used the kernel paths
for ktrace even at the cost of efficiency?

About kern.proc.vmmap I will look into that.

Konstantin Belousov

Jan 10, 2013, 1:05:14 PM
On Thu, Jan 10, 2013 at 10:16:46AM -0500, Alfred Perlstein wrote:
> On 1/10/13 2:38 AM, Konstantin Belousov wrote:
> > On Thu, Jan 10, 2013 at 01:56:48AM -0500, Alfred Perlstein wrote:
> >> Here are more convenient links that give diffs against FreeBSD and
> >> jemalloc for the proposed changes:
> >>
> >> FreeBSD:
> >> https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2
> >>
> > Why do you need to route the records through ktrace at all?
> > Wouldn't direct write(2)s to a file perform better, since they would
> > not stress the kernel memory allocator and the single writing thread?
> > It would also avoid coupling malloc to a single-system interface.
> >
> > I believe that other usermode tracers behave in a similar way,
> > using writes and not a private kernel interface.
> >
> > Also, what /proc issues did you mention? There is
> > sysctl kern.proc.vmmap, which is much more convenient than /proc/pid/map
> > and does not require /proc to be mounted.
> >
> >> jemalloc:
> >> https://github.com/alfredperlstein/jemalloc/compare/master...utrace2
> >>
>
> Konstantin, you are right, it is a strange thing this utrace. I am not
> sure why it was done this way.
>
> You are correct that a much more efficient system could be made using
> writes gathered into a single write(2).
Even without write gathering, non-coalesced writes should be faster than
utrace.

>
> Do you think there is any reason they may have re-used the kernel paths
> for ktrace even at the cost of efficiency?
I can only speculate. The utracing of the malloc calls in the context
of the ktrace stream is useful for a human reading the trace. Instead
of seeing a sequence of unexplainable calls allocating and freeing
memory, you would see something more intelligible. For example, you would
see accept/malloc/read/write/free, which could be usefully interpreted
as a network server serving a client.

This context is not needed for a leak detector.

Alfred Perlstein

Jan 10, 2013, 1:29:38 PM
Now I may be wrong here, but I think it's an artifact of someone
noticing how useful it would be to fit this into the ktrace system and
leverage existing code.

Even though there are significant performance deficiencies, the
existing framework may have been such a useful stepping stone towards
tracing that it was just used.

Right now the code already exists; however, it logs just {operation,
size, ptr}. For example:
malloc, 512, -> 0xdeadbeef
free, 0, 0xdeadbeef
realloc, 512, 0 -> 0xdeadc0de
realloc, 1024, 0xdeadc0de -> 0xffff0000
free, 0, 0xffff0000
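
A leak finder over records like these can be sketched in a few lines.
This is not alloctrace.py and does not parse real kdump(1) output; it
assumes input in exactly the informal notation shown above, and the
parse and find_leaks names are invented for the illustration.

```python
def parse(line):
    """Split 'op, size, [old] [-> new]' into (op, size, old_ptr, new_ptr)."""
    op, size, rest = (field.strip() for field in line.split(",", 2))
    if "->" in rest:
        old, new = (part.strip() for part in rest.split("->"))
        return op, int(size, 0), (int(old, 0) if old else 0), int(new, 0)
    return op, int(size, 0), int(rest, 0), None  # free: only an old pointer


def find_leaks(lines):
    """Return {pointer: size} for allocations that were never released."""
    live = {}
    for line in lines:
        op, size, old, new = parse(line)
        if op in ("free", "realloc") and old:
            live.pop(old, None)  # the old block is gone (freed or moved)
        if op in ("malloc", "realloc") and new:
            live[new] = size     # a new block is now outstanding
    return live


TRACE = [
    "malloc, 512, -> 0xdeadbeef",
    "free, 0, 0xdeadbeef",
    "realloc, 512, 0 -> 0xdeadc0de",
    "realloc, 1024, 0xdeadc0de -> 0xffff0000",
    "free, 0, 0xffff0000",
]
print(find_leaks(TRACE))      # every block is released: nothing outstanding
print(find_leaks(TRACE[:4]))  # drop the final free: 0xffff0000 is still live
```

With a caller address added to each record, the dictionary value could
carry the caller as well, so that outstanding blocks can be reported per
call site.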

What do you think of just adding the address of the caller of
malloc/free/realloc to these already existing tracepoints?

Konstantin Belousov

Jan 10, 2013, 4:03:13 PM
In most real-world applications I have seen, malloc() was not the function
called directly to do the allocation. Usually there is either an
app-specific wrapper or a language runtime that calls malloc(), e.g. the
new operator in C++ code. Then the caller address is constant for the
whole duration of the program run.

What would be useful is the full backtrace of each allocation. Tools
like libunwind are indeed optimized for this usage pattern.
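
The difference shows up as soon as leaked bytes are grouped by call
site. In the sketch below the frame names (xmalloc, parse_request,
load_config) and the sizes are invented for illustration; stacks are
listed innermost frame first.

```python
from collections import Counter

# Each leaked allocation: (size, stack), innermost frame first.
# All frame names here are hypothetical.
leaks = [
    (512, ("xmalloc", "parse_request", "main")),
    (512, ("xmalloc", "parse_request", "main")),
    (64, ("xmalloc", "load_config", "main")),
]


def bytes_by_key(leaks, key):
    """Sum leaked bytes, grouping allocations by key(stack)."""
    totals = Counter()
    for size, stack in leaks:
        totals[key(stack)] += size
    return totals


# Grouping by the immediate caller collapses everything onto the wrapper.
by_caller = bytes_by_key(leaks, lambda stack: stack[0])
# Grouping by the full backtrace separates the two real leak sites.
by_stack = bytes_by_key(leaks, lambda stack: stack)
```

Here by_caller has a single entry for the wrapper, while by_stack
distinguishes the parse_request path from the load_config path.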

From this POV, the libc malloc(3) might do better to offer a set of
well-defined hooks for a pluggable tracer to use. I am on the fence
here: you could override malloc/free without hooks, using the ELF symbol
interposing technique, but hooks would also offer features not easily
implementable with interposing.