Fwd: [GHC] #698: GHC's internal memory allocator never releases memory back to the OS


Gwern Branwen

May 31, 2009, 12:11:38 PM
to Gitit
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

In case anyone is interested, this is part of the reason why Gitit's
memory use can be so high.

- --
gwern

- ---------- Forwarded message ----------
From: GHC
Date: Sun, May 31, 2009 at 12:01 PM
Subject: Re: [GHC] #698: GHC's internal memory allocator never
releases memory back to the OS
To:
Cc: glasgow-ha...@haskell.org


#698: GHC's internal memory allocator never releases memory back to the OS
- ---------------------------------+------------------------------------------
Reporter: guest | Owner: igloo
Type: bug | Status: new
Priority: low | Milestone: 6.12 branch
Component: Runtime System | Version: 6.4.1
Severity: normal | Resolution:
Keywords: | Difficulty: Moderate (1 day)
Testcase: N/A | Os: Linux
Architecture: Unknown/Multiple |
- ---------------------------------+------------------------------------------
Changes (by guest):

* cc: gwe...@gmail.com (added)

Comment:

I second Bulat's comment. This is *very* important for long-running
servers. Consider the Gitit wiki server. There are a few pages like
'Recent changes' which get the entire revision history; each time this
happens, the memory usage goes up a bit - even though every last bit of the
history will get discarded once the page has been constructed and sent off
to the client. (We've checked for memory leaks.) This is especially
problematic when Gitit is being used, surprisingly enough, as a web host,
since typically this is done on virtualized slices or is otherwise
resource-constrained. It looks bad for wiki.darcs.net that an idling gitit
takes 31% of RAM.

- --
Ticket URL:
GHC
The Glasgow Haskell Compiler
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEAREKAAYFAkoirDYACgkQvpDo5Pfl1oJkpQCdGohj2WbEHHHEjperUEaDhXgO
M/UAn3h7AHHTSMLoy16aY892QgqGK+3r
=hEJj
-----END PGP SIGNATURE-----

John MacFarlane

Jun 2, 2009, 11:39:00 AM
to gitit-...@googlegroups.com
Interesting -- can you post a link to the bug report?
(I didn't see one below.)

+++ Gwern Branwen [May 31 09 12:11 ]:

Gwern Branwen

Jun 2, 2009, 12:06:39 PM
to gitit-...@googlegroups.com
On Tue, Jun 2, 2009 at 11:39 AM, John MacFarlane <fiddlo...@gmail.com> wrote:
>
> Interesting -- can you post a link to the bug report?
> (I didn't see one below.)

http://hackage.haskell.org/trac/ghc/ticket/698

--
gwern

Anton van Straaten

Jun 3, 2009, 1:10:16 PM
to gitit-...@googlegroups.com
Gwern Branwen wrote:
> This is *very* important for long-running
> servers. Consider the Gitit wiki server. There are a few pages like
> 'Recent changes' which get the entire revision history; each time this
> happens, the memory usage goes up a bit - even though every last bit of the
> history will get discarded once the page has been constructed and sent off
> to the client. (We've checked for memory leaks.) This is especially
> problematic when Gitit is being used, surprisingly enough, as a web host,
> since typically this is done on virtualized slices or is otherwise
> resource-constrained. It looks bad for wiki.darcs.net that an idling gitit
> takes 31% of RAM.

Not to defend GHC's behavior, but from this description, it sounds as
though the root of the problem in this case is the need to load the
entire revision history into memory at once.

Anton

John MacFarlane

Jun 3, 2009, 4:09:06 PM
to gitit-...@googlegroups.com
+++ Anton van Straaten [Jun 03 09 13:10 ]:

Well, it's not true that the "recent changes" page retrieves
the entire revision history. It only gets the last month's worth.

John

Gwern Branwen

Jun 4, 2009, 1:06:48 PM
to gitit-...@googlegroups.com
On Wed, Jun 3, 2009 at 4:09 PM, John MacFarlane <fiddlo...@gmail.com> wrote:
> Well, it's not true that the "recent changes" page retrieves
> the entire revision history.  It only gets the last month's worth.
>
> John

One of my problems with optimizing gitit and darcs is that watching
the output of ps, there is an unconditional/no-options 'darcs changes'
call *somewhere*, but for the life of me, I haven't been able to
figure out where and profiling doesn't help. It really perplexes me.

--
gwern

Thomas Hartman

Jun 4, 2009, 1:10:41 PM
to gitit-...@googlegroups.com
probably being called from the filestore module, no?
--
Thomas Hartman

Darcs hosting: patch-tag.com
Build a webapp with haskell: happstack.com

Simon Michael

Jun 4, 2009, 1:17:05 PM
to gitit-...@googlegroups.com
> One of my problems with optimizing gitit and darcs is that watching
> the output of ps, there is an unconditional/no-options 'darcs changes'
> call *somewhere*, but for the life of me, I haven't been able to
> figure out where and profiling doesn't help. It really perplexes me.

Maybe you could instrument filestore with some optional logging? I
think that would be quite useful.

Anton van Straaten

Jun 4, 2009, 1:47:53 PM
to gitit-...@googlegroups.com

In the Filestore lib, in Data/FileStore/Darcs.hs, the function
darcsGetRevision includes the following call:

darcsLog repo [] (TimeRange Nothing Nothing)

Afaict from reading code, this would cause darcsLog to invoke darcs as
follows:

darcs changes --xml-output --summary

Could that be the call you're seeing?

Anton

Gwern Branwen

Jun 4, 2009, 2:02:24 PM
to gitit-...@googlegroups.com

No. I believe that call was removed long ago; it's certainly not in
darcs FileStore.

--
gwern

Anton van Straaten

Jun 4, 2009, 2:51:50 PM
to gitit-...@googlegroups.com

Oh, sorry. The filestore code I was looking at was from
http://johnmacfarlane.net/repos/filestore on April 7.

Still, the latest darcsLog function seems like it would be capable of
generating a call like the one I mentioned, if something called it with
no filenames and Nothing timestamps. Some of the calls to darcsLog are
indirect, through the 'history' function. Perhaps putting a trace in
darcsLog to watch for the case where begin & end are both Nothing might
turn something up?
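
A rough sketch of what such a trace might look like, using only base. It
is not filestore's actual code: the cut-down TimeRange (with plain
Integers as times) and the names isUnbounded and traceUnbounded are
made up for illustration.

```haskell
import Debug.Trace (trace)

-- Stand-in for filestore's TimeRange; times are plain Integers here
-- so the sketch needs nothing beyond base (an assumption).
data TimeRange = TimeRange (Maybe Integer) (Maybe Integer)

-- True when a log request has no file names and no time bounds,
-- i.e. the case that would produce an unqualified 'darcs changes'.
isUnbounded :: [FilePath] -> TimeRange -> Bool
isUnbounded [] (TimeRange Nothing Nothing) = True
isUnbounded _  _                           = False

-- Hypothetical wrapper: emits a trace message on stderr for the
-- unbounded case, then returns the wrapped value unchanged.
traceUnbounded :: [FilePath] -> TimeRange -> a -> a
traceUnbounded names range x
  | isUnbounded names range = trace "darcsLog: no file names, no time range" x
  | otherwise               = x
```

The body of darcsLog could then be wrapped in something like
traceUnbounded names range (...). Because of laziness, the message only
appears when the wrapped result is actually demanded.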

Anton

John MacFarlane

Jun 4, 2009, 3:17:02 PM
to gitit-...@googlegroups.com
+++ Gwern Branwen [Jun 04 09 14:02 ]:

darcsLatestRevId will call 'darcs changes --xml-output' unless you
compiled with the maxcount flag. Are you sure you did?

As suggested, it would be easy enough to put a debugging statement
in runDarcsCommand that would print each darcs command as it was
run...then you could see which gitit action triggered the unqualified
'darcs changes'.
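
Something along these lines would probably do, as a sketch only. The
names renderCommand and withCommandLog are hypothetical, and 'runner'
stands in for whatever function actually spawns darcs; its real type in
filestore may differ.

```haskell
import System.IO (hPutStrLn, stderr)

-- Render a command and its arguments as a single log line.
renderCommand :: String -> [String] -> String
renderCommand cmd args = unwords (cmd : map quoteIfNeeded args)
  where
    quoteIfNeeded a
      | null a || any (== ' ') a = show a   -- quote empty or spaced arguments
      | otherwise                = a

-- Wrap any runner so that each command is printed to stderr before it
-- runs. 'runner' is an assumption standing in for the real spawner.
withCommandLog :: (String -> [String] -> IO a) -> String -> [String] -> IO a
withCommandLog runner cmd args = do
  hPutStrLn stderr ("[darcs] " ++ renderCommand cmd args)
  runner cmd args
```

Writing to stderr keeps the log lines out of anything parsed from the
command's stdout, and matching the timestamps of the log lines against
the gitit request log should show which action is responsible.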

John

Gwern Branwen

Jun 4, 2009, 3:59:22 PM
to gitit-...@googlegroups.com
On Thu, Jun 4, 2009 at 3:17 PM, John MacFarlane <fiddlo...@gmail.com> wrote:
> darcsLatestRevId will call 'darcs changes --xml-output' unless you
> compiled with the maxcount flag.  Are you sure you did?

Quite sure. When we added the maxcount flag, I made a local change to
always enable it, just so there could be no confusion.

> As suggested, it would be easy enough to put a debugging statement
> in runDarcsCommand that would print each darcs command as it was
> run...then you could see which gitit action triggered the unqualified
> 'darcs changes'.
>
> John

I already have a fairly good idea that editing pages triggers it; but
I'm not sure *where* it's triggered and it doesn't seem to show up in
the profiling call hierarchy. This is why I said I'm so perplexed.

--
gwern

John MacFarlane

Jun 4, 2009, 4:25:30 PM
to gitit-...@googlegroups.com
+++ Gwern Branwen [Jun 04 09 15:59 ]:

It couldn't hurt to get a more fine-grained picture, as opposed to
a fairly good idea.
