Serialization code seems to require excessive amounts of memory

stepcut

unread,

Jul 30, 2009, 6:54:39 PM7/30/09

to HAppS

Hello,

I just filed this bug:

http://code.google.com/p/happstack/issues/detail?id=103

This simple program:

main =
let list = [1..1000000] :: [Int]
bin = B.runPut (safePut list)
list' = B.runGet safeGet bin :: [Int]
in putStrLn (show . length $ takeWhile (< 10000000) list) >>
getLine >> return ()

requires 50 times more RSS and 10 times more VIRT if you change list
in the last line to list'.

I have a real server where my checkpoint file is only 11MB, but once
it gets loaded, the server requires 500-800MB. That seems pretty
excessive. Anyone got any ideas why the serialization code bumps up
the RAM usage to so much? It's not temporary either.. in the above
example, when sitting at the getLine, it is still using all that
memory.

I am using using GHC 6.10.4 on Linux.

- jeremy

Lemmih

unread,

Jul 30, 2009, 7:33:31 PM7/30/09

to HA...@googlegroups.com

Same thing seems to happen with plain Data.Binary. Also, +RTS -s -RTS
is very accurate for measuring heap usage.

--
Cheers,
Lemmih

MightyByte

unread,

Jul 30, 2009, 7:42:36 PM7/30/09

to HA...@googlegroups.com

I've also been experiencing similar memory size anomalies. My
checkpoint file is six megs and I'm using 600+ megs of RAM. I haven't
yet identified the source of the problem, but this sounds very similar.

stepcut

unread,

Jul 31, 2009, 4:58:16 PM7/31/09

to HAppS

Cool. I started a thread on haskell-cafe, now we just have to see if
dons rises to the challenge:

http://www.haskell.org/pipermail/haskell-cafe/2009-July/064779.html

This does indeed seem to be an issue with Binary, and not anything
happstack specific.

- jeremy

Matthew Elder

unread,

Jul 31, 2009, 5:40:52 PM7/31/09

to HA...@googlegroups.com

Looks good stepcut, I only have one complaint.

*puts on his linux hat*

its Linux, ok? not GNU/Linux.

*puts on his RMS hat*

Linux is actually only the kernel that the GNU operating system runs on. You see HERD was a really advanced kernel and it was going to be the best but Linux beat us.

Since Linux is actually the GNU operating system, you should call it GNU/Linux.

*thank you*

--
Need somewhere to put your code? http://patch-tag.com
Want to build a webapp? http://happstack.com

Don Stewart

unread,

Jul 31, 2009, 5:49:31 PM7/31/09

to HA...@googlegroups.com

Is there a general task here to write a better Binary instance for IxSet
et al?

If so, I could look at it. Let me know what you need.

-- Don

matt:

MightyByte

unread,

Jul 31, 2009, 6:05:35 PM7/31/09

to HA...@googlegroups.com

Well, we need something that both won't overflow stack *and* doesn't
impose this 60x space explosion.

Don Stewart

unread,

Jul 31, 2009, 6:16:02 PM7/31/09

to HA...@googlegroups.com

Where's the data type defined? And is there some test data?

I'd be happy to help design a good serialization instance for the
happstack team.

-- Don

mightybyte:

Lemmih

unread,

Jul 31, 2009, 6:31:58 PM7/31/09

to HA...@googlegroups.com

On Sat, Aug 1, 2009 at 12:16 AM, Don Stewart<do...@galois.com> wrote:
>
> Where's the data type defined? And is there some test data?

The test case is a list of ints. It doesn't seem to be related to happstack.

--
Cheers,
Lemmih

Tom Tobin

unread,

Jul 31, 2009, 6:10:52 PM7/31/09

to HA...@googlegroups.com

On Fri, Jul 31, 2009 at 4:40 PM, Matthew Elder<ma...@mattelder.org> wrote:
> *puts on his RMS hat*

That's one frightening hat. ^_^

MightyByte

unread,

Jul 31, 2009, 7:10:48 PM7/31/09

to HA...@googlegroups.com

Dons, the code is in Happstack.Data.IxSet but I agree with lemmih that
it should be able to be fixed outside of Happstack.

Don Stewart

unread,

Jul 31, 2009, 7:14:37 PM7/31/09

to HA...@googlegroups.com

So the problem is how to serialise lists lazily.

There are several encodings.

* write the length of the list, then the elements
* interleave bits of length with elements

We choose the more space efficient disk encoding. Also, we can't change
the disk encoding of the default instances (so we can't switch away from
a length encoding).

Specific applications may wish to use different disk representations,
due to other constaints. happstack should just use their own encoding
that satisifies the constrains they have.

mightybyte:

Lemmih

unread,

Jul 31, 2009, 7:41:59 PM7/31/09

to HA...@googlegroups.com

On Sat, Aug 1, 2009 at 1:14 AM, Don Stewart<do...@galois.com> wrote:
>
>
> So the problem is how to serialise lists lazily.
>
> There are several encodings.
>
> * write the length of the list, then the elements
> * interleave bits of length with elements
>
> We choose the more space efficient disk encoding. Also, we can't change
> the disk encoding of the default instances (so we can't switch away from
> a length encoding).
>
> Specific applications may wish to use different disk representations,
> due to other constaints. happstack should just use their own encoding
> that satisifies the constrains they have.

We aren't concerned about efficient disk encodings. It's the memory
retention we're worried about. Serializing 1,000,000 ints requires
32megs of memory (101 with GC overhead). Serializing 5,000,000 ints
requires 186megs of memory (475 with GC overhead). Serializing the
list with 'show' runs in constant space; hopefully this could be made
true for Data.Binary as well.

--
Cheers,
Lemmih

Lemmih

unread,

Jul 31, 2009, 7:43:01 PM7/31/09

to HA...@googlegroups.com

Oh darn, I'm a fool. Why do I always realize these things right after
I send the mail. Please ignore me.

--
Cheers,
Lemmih

Lemmih

unread,

Jul 31, 2009, 9:11:24 PM7/31/09

to HA...@googlegroups.com

To show that I'm not a complete idiot, here's a chunked list instance
that runs in constant space, and a "strict" list instance that
serializes the elements before writing the length of the list. The
strict list instance is a drop-in replacement for the normal list
instance and it is about 10x more memory efficient for lazy lists of
ints. It does not run in constant space, though.

The two instances obviously only use less memory when serializing lazy
data. I guess the lesson here (if there is any) is to avoid data
structures that uses 40 bytes per entry when storing objects in the
millions.

import Data.Binary
import qualified Data.Binary.Put as B
import qualified Data.ByteString.Lazy as BS
import Control.Monad

newtype ChunkedList a = ChunkedList [a]

instance Binary a ⇒ Binary (ChunkedList a) where
put (ChunkedList lst)
= worker (toChunks lst)
where worker [] = do putWord8 0
worker (x:xs) = do putWord8 1
put x
worker xs
get = liftM ChunkedList worker
where worker = do n ← getWord8
if n == 0
then do return []
else do liftM2 (:) get worker

toChunks ∷ [a] → [[a]]
toChunks [] = []
toChunks lst = take n lst : toChunks (drop n lst)
where n = 1024

newtype StrictList a = StrictList [a]

instance Binary a ⇒ Binary (StrictList a) where
put (StrictList lst)
= do let (len,bs) = B.runPutM (builder 0 lst)
BS.length bs `seq` put len >> B.putLazyByteString bs
where builder !n [] = return n
builder !n (x:xs) = do put x
builder (n+1 ∷ Int) xs
get = liftM StrictList get

--
Cheers,
Lemmih

stepcut

unread,

Jul 31, 2009, 10:59:14 PM7/31/09

to HAppS

I believe I ultimately determined that my test case was bogus. Here is
my revised test, which does a better job of keeping the whole list in
memory (which is the behavior we generally want with happstack).

main :: IO ()

main =
let list = [1..1000000] :: [Int]

bin = encode list
list' = decode bin :: [Int]
in do putStrLn (show . length $ takeWhile (< 10000000) list)
getLine
putStrLn (show . length $ takeWhile (< 10000001) list)

This version consumes about 40MB in the control version. and 60MB when
you swap list' for list. Further testing using decodeFile/encodeFile
showed that writing to disk, restarting the app, and reloading the
disk state resulted in 50MB of space instead of 40MB, which is not too
bad.

For applications that use lots of Strings the best option is probably
to use Data.Text from the text library.

Now that I have a better control, I am going to do some additional
testing and see if happstack-data or IxSet are causing any unneeded
memory usage. It is quite possible that the difference between the
checkpoint file and the loaded state sizes is because the binary
format is very compact, and typical Haskell data structures are not.
The 'fix' for this is to use Haskell data structures like Data.Text
which aim to provide compact in-memory representations.

- jeremy

MightyByte

unread,

Aug 4, 2009, 1:16:37 PM8/4/09

to HA...@googlegroups.com

Is there a good way we could integrate this into Happstack as an
alternative way to serialize IxSets?

Don Stewart

unread,

Aug 4, 2009, 1:33:45 PM8/4/09

to HA...@googlegroups.com

mightybyte:

I'm happy to change the implementation of [a] serialization (as long as
we prefer on-disk format), or to add a chunked list newtype, since that
seems to be useful too.

-- Don

Gregory Collins

unread,

Aug 4, 2009, 3:22:43 PM8/4/09

to HA...@googlegroups.com

Don Stewart <do...@galois.com> writes:

> I'm happy to change the implementation of [a] serialization (as long
> as we prefer on-disk format), or to add a chunked list newtype, since
> that seems to be useful too.

+1 for the newtype, changing the default list encoding will cause
headaches for people who use binary.

G.
--
Gregory Collins <gr...@gregorycollins.net>

MightyByte

unread,

Aug 4, 2009, 4:15:34 PM8/4/09

to HA...@googlegroups.com

On Tue, Aug 4, 2009 at 3:22 PM, Gregory Collins<gr...@gregorycollins.net> wrote:
>
> Don Stewart <do...@galois.com> writes:
>
>> I'm happy to change the implementation of [a] serialization (as long
>> as we prefer on-disk format), or to add a chunked list newtype, since
>> that seems to be useful too.
>
> +1 for the newtype, changing the default list encoding will cause
> headaches for people who use binary.

Yeah, I agree. Of course the next question is whether we should
change IxSet to use the chunked list or create something like
Happstack.Data.IxSet.Chunked.

S. Alexander Jacobson

unread,

Aug 7, 2009, 3:34:04 PM8/7/09

to HA...@googlegroups.com

Can you clarify the issue here? Do state types need special
customization to be serialize correctly within HAppS as distinct from
other contexts?

-Alex-

MightyByte

unread,

Aug 7, 2009, 4:01:56 PM8/7/09

to HA...@googlegroups.com

The issue is that Happstack's memory usage seems a little high, and it
seems that the serialization may be partly responsible for that. I'm
currently looking for every possible way I can reduce my memory
footprint, because it is what is driving up my hosting costs the most.
I'm not coming close to my bandwidth, CPU, and disk usage limits. So
I would argue that Happstack needs a list serialization that runs in
constant space.

S. Alexander Jacobson

unread,

Aug 7, 2009, 7:50:47 PM8/7/09

to HA...@googlegroups.com

We can just shift back to serializing using Read/Show. It is a very
easy change of code.

MightyByte

unread,

Aug 7, 2009, 8:14:26 PM8/7/09

to HA...@googlegroups.com

That sounds fine, but we need to make sure we don't break existing
state.

On Aug 7, 2009, at 7:50 PM, "S. Alexander Jacobson" <al...@alexjacobson.com

stepcut

unread,

Aug 8, 2009, 1:32:41 PM8/8/09

to HAppS

In my experience, the tricky part so far has been figuring out why
memory is being used, and if it is justified.

If you use memory profiling to examine the memory consumption when
restoring from a checkpoint file, you find out that all the memory is
being allocated by various instances of getCopy. Which is not
surprising, but also does not tell you anything useful.

So, the first thing to do is to figure how to get a good picture of
where all the memory is going. In my case, it may all be due to the
fact that my state has a lot of Strings in it. These 'compress' really
well when serialized (because they are converted to a utf-8
bytestring). So that could explain the big different between the on
disk checkpoint file size, and the amount of memory used after it is
loaded. If that is the case, then the method used for serialization/
deserialization is not going to change anything -- I need to change my
types instead, to use a more efficient representation in memory.

Not sure how to profile this type of thing though.

- jeremy

MightyByte

unread,

Aug 8, 2009, 3:08:28 PM8/8/09

to HA...@googlegroups.com

Yeah, I've also had difficulty figuring out why so much memory is
being used. My state doesn't have very many strings, and when I did
convert some of them to bytestrings I didn't see a substantial
improvement. The memory intensive part of my data structure may be
that my main IxSet has 4 static indexes and 4 calculated indexes that
probably aren't stored in the checkpoint file. (Although 60x bloat
still seems excessive.)

Don Stewart

unread,

Aug 8, 2009, 3:07:23 PM8/8/09

to HA...@googlegroups.com

Can you profile?

mightybyte:

MightyByte

unread,

Aug 8, 2009, 3:32:26 PM8/8/09

to HA...@googlegroups.com

Yes, I've done some profiling. The flatten function in
Happstack.Data.IxSet is the biggest allocator. It seems difficult to
distinguish valid memory usage from unnecessary usage, but maybe I
just don't have enough experience with Haskell profiling.

stepcut

unread,

Aug 8, 2009, 6:12:33 PM8/8/09

to HAppS

On Aug 8, 2:07 pm, Don Stewart <d...@galois.com> wrote:
> Can you profile?

Profiling tells you what functions allocated the memory. But that does
not really tell you if too much memory was allocated.

In happstack, almost all the memory is being used by the 'state'
value. If you look at the runTxLoop function at the end of this file:

http://hackage.haskell.org/packages/archive/happstack-state/0.3.4/doc/html/src/Happstack-State-Transaction.html

the 'st0' parameter holds the current state. If there was some way to
calculate the size of that data structure at runtime that would be
useful. Especially if you could traverse the value and show how much
each piece was using up.

So, if your state had the type:

data State = State Int (IxSet String) (IxSet Foo)

It would be nice to get some sort of break down of how much space Int,
IxSet String, and IxSet Foo were taking up. And then be able to break
it down further and see how much each Foo record in the IxSet was
taking up.

This would be a bit like trying to figure out why you have no disk
space. You might start by going:

du -sh /

To see which directories are taking up a lot of space. Then you might
drill down into those directories and do it again:

du -sh /home

and eventually you might find that a bunch of old linux ISOs or
something that you forgot about which is taking up all your space. Or
you may find that you just have a lot of valuable data and need a
bigger hard drive.

Not sure how to do this in GHC though. Any ideas?

- jeremy

MightyByte

unread,

Aug 8, 2009, 6:31:41 PM8/8/09

to HA...@googlegroups.com

On Sat, Aug 8, 2009 at 6:12 PM, stepcut<jer...@n-heptane.com> wrote:
>
> On Aug 8, 2:07 pm, Don Stewart <d...@galois.com> wrote:
>> Can you profile?
>

> the 'st0' parameter holds the current state. If there was some way to
> calculate the size of that data structure at runtime that would be
> useful. Especially if you could traverse the value and show how much
> each piece was using up.

I have wanted this exact feature as well. For IxSet, you'd want to
see both the number of items and the total memory allocated for the
whole set. This would help in estimating the relationship between
number of items, number of indexes, and total size. The problem is
that I have no idea if such a thing is possible, let alone how you
might go about implementing it if it was.

Lemmih

unread,

Aug 8, 2009, 6:36:22 PM8/8/09

to HA...@googlegroups.com

Isn't heap profiling an option?

--
Cheers,
Lemmih

Don Stewart

unread,

Aug 8, 2009, 6:38:09 PM8/8/09

to HA...@googlegroups.com

lemmih:

Yes, seems like a good case for heap profiling (with the per-type option
set, so allocs are broken down by type).

-- Don

MightyByte

unread,

Aug 8, 2009, 6:45:26 PM8/8/09

to HA...@googlegroups.com

Ok, I'll try this. I've done some heap profiling before, but maybe I
just wasn't using the right options.

Don Stewart

unread,

Aug 8, 2009, 6:44:53 PM8/8/09

to HA...@googlegroups.com

mightybyte:

I wrote about it here,

http://book.realworldhaskell.org/read/profiling-and-optimization.html#id678078

with some examples.

MightyByte

unread,

Aug 9, 2009, 11:13:44 AM8/9/09

to HA...@googlegroups.com

Thanks. Somehow I had forgotten about that chapter.

I've been profiling a bit, but I'm still not really learning anything
new. For cost centers, the two big hitters are IxSet.insert and
IxSet.flatten. For types, the big hitter is []. And for
constructors, the big one is obviously ':'.

The problem here is that laziness makes it difficult to tell the
difference between allocations that were caused by reading the
checkpoint into memory and allocations that may have been due to
actual leaks (such as stack and heap leaks like you describe in that
chapter) that are happening when I query the data. It would be nice
if the complete state could be loaded into memory at program start
without me executing any queries. Then it would be easier to tell the
difference between the inherent memory size of my state and other
memory being allocated by the operations I'm doing on my state.
Although you still would have a hard time finding if there were any
leaks in the state loading code.

Lemmih

unread,

Aug 9, 2009, 11:24:22 AM8/9/09

to HA...@googlegroups.com

Shouldn't you be looking at memory retention instead of memory allocation?

MightyByte

unread,

Aug 9, 2009, 11:49:01 AM8/9/09

to HA...@googlegroups.com

To some extent you're right. I've tried some of that too, and it
still doesn't really give me clear answers to the question, "Why is
this data being retained? Is it being retained because the in memory
implementation of IxSet with 8 indexes requires 50x more space than
the checkpoint file? Or is it being retained because of a leak
somewhere?" The big hitters are the ones I expect, but I don't know
whether they're using more than I expect, and whether there's a leak,
or my expectations are wrong.

But in another sense, allocation is just as important as retention.
If my state uses one meg, but a query/update allocates 900 megs that
doesn't need to be retained (and possibly didn't need to be
allocated), I'm still no better off because I still require that more
expensive server. And this situation won't necessarily be obvious
because ghc never releases memory back to the OS.

Lemmih

unread,

Aug 9, 2009, 11:54:51 AM8/9/09

to HA...@googlegroups.com

Try committing a new checkpoint. That should force the state to memory
and you won't get interference from queries.

stepcut

unread,

Aug 9, 2009, 2:45:41 PM8/9/09

to HAppS

On Aug 8, 5:38 pm, Don Stewart <d...@galois.com> wrote:
> Yes, seems like a good case for heap profiling (with the per-type option
> set, so allocs are broken down by type).

Ok, so I ran my real app with, +RTS -I0 -p -s -hy

It has been modified so that it just loads the state, makes a
checkpoint, and then exits. The state is an IxSet of Reports. A Report
is a record with with 20+ fields, many of which are Strings, some of
which are other data types.

Here is the graph:

http://src.seereason.com/~jeremy/rs.ps

It is clear that I have a lot of lists, but it is still pretty opaque
to me whether there is anything wrong on not. Those lists could be the
Strings in my Report, or they could be lists created by IxSet, or
deserialization.

- jeremy

Reply all

Reply to author

Forward