Message from discussion Crash in CentralFreeList::Populate() under low-memory conditions

In-Reply-To: <feac16b7-279a-4030-9de1-8a0d9c93e168@googlegroups.com>
References: <51dea7ea-0785-47f7-9a9b-b183e4e4b507@googlegroups.com>
	<feac16b7-279a-4030-9de1-8a0d9c93e168@googlegroups.com>
Date: Mon, 15 Oct 2012 14:36:11 -0400
Message-ID: <CALj-ATJzW1WJQ2ywVVC6YHOMp9hy5F772bSreEFXZ9ttPpj...@mail.gmail.com>
Subject: Re: Crash in CentralFreeList::Populate() under low-memory conditions
From: David Chappelle <chapp...@gmail.com>
To: google-perftools@googlegroups.com
Content-Type: multipart/alternative; boundary=14dae9cdc0fdcda0a804cc1d4fab

--14dae9cdc0fdcda0a804cc1d4fab
Content-Type: text/plain; charset=ISO-8859-1

Martin, I don't believe that the upper limit is writeable. The usable
address space should be interpreted as:

    [0xff930000 - 0xfffd0000)

If you do the math this is 0x6a0000 bytes, about 6.6MB. Any chance you can
reproduce the crash using a version of tcmalloc that contains symbols? Is it
possible that we are using some uninitialized object as a result of the
allocation failure?
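
As a quick check of that arithmetic, here is a minimal sketch that computes
the size of the half-open range above. The variable names are illustrative
and are not taken from tcmalloc or gdb; the range comes to a little over 6MB.

```python
# Size of the half-open range [0xff930000, 0xfffd0000) from the mapping list.
# Names here are illustrative only; they do not come from tcmalloc or gdb.
start = 0xff930000
end = 0xfffd0000  # exclusive upper bound, so this address itself is not usable

size_bytes = end - start
size_mib = size_bytes / (1024 * 1024)

print(hex(size_bytes))  # size of the range in bytes, in hex
print(size_mib)         # the same size expressed in MiB
```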

-Dave

On Mon, Oct 15, 2012 at 1:35 PM, Martin Lemke <terrade...@gmail.com> wrote:

> I forgot to mention how I found that the 0xfff90000-0xfffd0000 range is not
> one contiguous memory allocation: not from the "info files" gdb command, but
> by trying to examine the memory. The allocation apparently ends somewhere
> between 0xfff9a000 and 0xfff9a100, which is really weird. It could be that
> the core file is damaged.
>
> Thank you,
> Martin
>
>
> On Monday, October 15, 2012 7:23:06 PM UTC+2, Martin Lemke wrote:
>>
>> I am experiencing repeated crashes in tcmalloc's CentralFreeList::Populate()
>> under low-memory conditions on Linux (Debian 5.0.8 with the 2.6.32.15-686
>> kernel). The application is 32-bit and uses version 2.0-1 of libgperftools.
>> tcmalloc is loaded via the LD_PRELOAD environment variable.
>>
>> The problem occurs when memory allocations reach their limit at ~4G.
>> Normally, under such conditions, tcmalloc logs a failure to stderr and
>> exits the process, which is more or less OK. However, in some cases a crash
>> occurs.
>>
>> I investigated one of the crashes. It looks like a race condition between
>> two threads that are both calling CentralFreeList::Populate(). One of them
>> fails to allocate a span and is already logging the failure, though I don't
>> see that it has called exit() yet. The second thread seems to be more
>> successful: it appears to have allocated the span it needed, but then the
>> crash occurs. Here is what I found (after a long reverse-engineering
>> session, since the tcmalloc we use does not have debugging information):
>>
>> a span is allocated:
>>  (gdb) x/10x 0x75852b78
>> 0x75852b78:     0x0007ffc8      0x00000020      0x00000000      0x00000000
>> 0x75852b88:     0xfff90000      0x00550000      0x000297a8      0x00000001
>>
>> span->start is 0x0007ffc8 with length 0x00000020. npages is 0x00000020 as
>> well, which I assume is correct.
>>
>> tcmalloc was apparently compiled with TCMALLOC_LARGE_PAGES off, so
>> kPageShift = 13.
>>
>> Then the function starts populating the linked list. The first address,
>> span->objects, is calculated as 0xfff90000, as we can see above. The value
>> of the "limit" variable is calculated as 0xfffd0000.
>>
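The two addresses quoted above can be reproduced from the dumped span fields.
The following is a minimal sketch, assuming the first object address is the
start page shifted left by kPageShift and "limit" is that address plus the
span length in bytes. The names below are illustrative and are not the
tcmalloc source.

```python
# Illustrative names; values are taken from the gdb dump quoted above.
K_PAGE_SHIFT = 13        # page shift quoted for a build without large pages
MASK32 = 0xFFFFFFFF      # model pointer arithmetic in a 32-bit process

span_start_page = 0x0007ffc8  # first word of the span dump
span_npages = 0x00000020      # second word of the span dump

# Assumed derivation: page number -> byte address, then add the span length.
span_bytes = span_npages << K_PAGE_SHIFT
first = (span_start_page << K_PAGE_SHIFT) & MASK32
limit = (first + span_bytes) & MASK32

print(hex(first), hex(limit), hex(span_bytes))
```

Under these assumptions the span covers 0x40000 bytes, which is the same
value reported for "size" further down.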
>> The content of span->objects:
>>
>> (gdb) x/20x *(0x75852b78+0x10)
>> 0xfff90000:     0xfffd0000      0x00000000      0x00000000      0x00000000
>>
>> (I don't really understand this, because 0xfffd0000 is essentially the
>> "limit" variable. However, I also see that the value of "num" is 3, meaning
>> that we made more than two iterations!)
>>
>> But the main problem is the memory mapping:
>>
>> info files:
>>         0xf77ce000 - 0xff90e000 is load1215
>>         0xff91a000 - 0xff92f000 is load1216
>>         0xff930000 - 0xfffd0000 is load1217
>> (there are no allocations after 0xfffd0000)
>>
>> The range between 0xfff90000 and 0xfffd0000 (span->objects and "limit") is
>> not a contiguous memory allocation.
>>
>> The crash apparently occurs in the line "*tail = ptr", which tries to store
>> a ridiculous address, 0x10000 (0xfffd0000 + 0x40000 ("size")), at
>> 0xfffd0000.
>>
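The value 0x10000 is what 0xfffd0000 + 0x40000 becomes when the sum is
truncated to 32 bits. The sketch below is a simplified model and not the
gperftools source. It assumes a carve loop whose condition has the form
"ptr + size <= limit", with each iteration doing "*tail = ptr". The function
carve() and the use of None for the list head slot are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit pointer arithmetic


def carve(first, limit, size):
    """Model a naive carve loop and return the (address, value) stores it makes.

    None stands for the list head slot. The model stops after the first store
    whose target lies outside [first, limit), which is where a real process
    would fault if that address is unmapped.
    """
    stores = []
    tail = None
    ptr = first
    # The sum is masked to 32 bits, so it can wrap past the top of memory.
    while ((ptr + size) & MASK32) <= limit:
        stores.append((tail, ptr))
        if tail is not None and not (first <= tail < limit):
            break
        tail = ptr
        ptr = (ptr + size) & MASK32
    return stores


wrapped = (0xfffd0000 + 0x40000) & MASK32
stores = carve(0xfff90000, 0xfffd0000, 0x40000)
for addr, value in stores:
    print("head" if addr is None else hex(addr), "<-", hex(value))
```

Under these assumptions the model stores 0xfffd0000 at 0xfff90000, which
matches the dump above, and then tries to store 0x10000 at 0xfffd0000, the
first address past the end of the mapping.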
>> First of all, I have a feeling that tcmalloc expects the ranges
>> 0xfff90000-0xfffd0000 and even 0xfffd0000-0x10000 to be contiguous memory
>> allocations. My assumption may be wrong, though. Can you please confirm or
>> reject it?
>>
>> Then, the address 0xfffd0000, "limit", is expected to be a valid memory
>> allocation, but it is not. The 0x10000 I can't explain at all. (In fact, the
>> core already contains a successful allocation starting as low as 0x8000
>> (0x00008000 - 0x08048000 is load1), which also surprises me. This
>> allocation is filled with my application's dynamic data; in particular, I
>> clearly see that it contains one of the strings that the application
>> generates. I would expect such low addresses to be used for the stack, for
>> example, and this one comes even before the main application binary!)
>>
>> I would like to ask your advice: is there something that can be done, or
>> is it expected that multi-threaded allocations can fail randomly under
>> low-memory conditions? Is the data that I see consistent?
>>
>> It is clear that the real solution would be to avoid such low-memory
>> conditions. Still, I have a feeling that what I see is a kind of stress test
>> that makes tcmalloc hit some inconsistency and fail. So I wonder whether a
>> similar error may occur without such stress, under normal conditions.
>>
>> Thank you,
>> Martin
>>

--14dae9cdc0fdcda0a804cc1d4fab--