Pipes, StateT and memory consumption


Alexey Raga

Aug 15, 2015, 6:52:09 AM
to Haskell Pipes
Hi All,

I have noticed that as soon as I introduce StateT into my pipeline the memory consumption increases a lot, and it keeps increasing until the program stops.

Here is a simple counter as an example; even this one reproduces the problem:

counter :: Pipe a a (StateT Integer IO) ()
counter = do
    x <- await
    S.modify (+1)
    yield x
    counter

Piping a very large file (~40M rows) through a simple in/out pipeline consumes only 3 MB of memory, but once I introduce this "counter" step, memory consumption keeps growing and reaches ~700 MB by the time the pipeline finishes.

Am I doing it wrong? And what would be the correct way of implementing such a "stateful" Pipe?

Cheers,
Alexey.

Daniel Díaz

Aug 15, 2015, 7:06:41 AM
to Haskell Pipes
What happens if you use modify' instead of modify? A common gotcha with Writer and State is that even the "strict versions" of these monads aren't strict in the accumulators. You have to explicitly use modify' in the case of State.
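
To illustrate the point about modify': lazy modify only wraps the previous state in another unevaluated (+1) application, so a chain of thunks grows with every element until something finally demands the count. Below is a minimal sketch, not the original poster's code. It assumes the pipes and transformers packages, uses lift with transformers' modify' rather than the mtl-style S.modify' from the thread, and the name countItems and the list-based producer are made up for illustration.

```haskell
import Control.Monad (forever)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify')
import Pipes
import qualified Pipes.Prelude as P

-- Pass every element through unchanged while counting it.
-- modify' forces the new count before storing it, so the state
-- never accumulates a chain of pending (+1) applications.
counter :: Monad m => Pipe a a (StateT Integer m) ()
counter = forever $ do
    x <- await
    lift (modify' (+1))
    yield x

-- Hypothetical driver: stream n numbers through the counter,
-- discard them, and return the final count.
countItems :: Int -> IO Integer
countItems n =
    execStateT (runEffect (each [1 .. n] >-> counter >-> P.drain)) 0

main :: IO ()
main = countItems 1000000 >>= print
```

Swapping modify' for modify in this sketch should bring back the thunk build-up described above, although the optimiser may hide it for small inputs.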

Alexey Raga

Aug 15, 2015, 7:45:12 AM
to Haskell Pipes
Wow, Daniel, that helped a lot with the counter, thank you!

But my "real" code continues eating memory like crazy even after updating it to modify'. I assume it could be something similar?

Can you spot something bad in the following one?

data RefField a = RefField String (a -> ByteString)

ref :: [RefField a] -> Pipe a a (StateT (IndexTables ByteString) IO) ()
ref fields =
    let updateTables x ts = F.foldl' (\s' (RefField n f) -> tableInsert n (f x) s') ts fields
    in forever $ await >>= (\x -> do { S.modify' (updateTables x); yield x })

And IndexTables is just a wrapper for Map. I tried using Data.Map.Strict, but it didn't help...

Cheers,
Alexey.

Alexey Raga

Aug 15, 2015, 8:00:04 AM
to Haskell Pipes

Hmmm, the culprit was actually one of my functions that operates on the Map. I forced its argument to be strict (using the BangPatterns extension) and now the memory footprint is as expected.
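
The function that was fixed is not shown in the thread, so the following is only a sketch of the general technique. bumpCount and tally are hypothetical stand-ins (they are not tableInsert or IndexTables), and the only assumption is the containers package. A bang pattern on an argument forces it to weak head normal form when the function is called, which stops unevaluated maps from piling up between updates.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for a table-update function.
-- The bangs force the key and the incoming map as soon as the
-- function is entered; Data.Map.Strict forces the stored values.
bumpCount :: Ord k => k -> M.Map k Int -> M.Map k Int
bumpCount !k !m = M.insertWith (+) k 1 m

-- Fold a list of keys into a table of occurrence counts.
tally :: Ord k => [k] -> M.Map k Int
tally = foldl' (flip bumpCount) M.empty

main :: IO ()
main = print (M.toList (tally "abracadabra"))
```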

Thank you again for your help with modify', it was really puzzling until you explained.

Cheers,
Alexey.

John Wiegley

Aug 15, 2015, 5:18:39 PM
to haskel...@googlegroups.com
>>>>> Alexey Raga <alexe...@gmail.com> writes:

> forever $ await >>= (\x -> do { S.modify' (updateTables x); yield x })

Does it help at all to use:

mapM $ \x -> S.modify' (updateTables x) >> return x

John
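
A sketch of how the mapM variant might look in a complete program, assuming the suggestion refers to mapM from Pipes.Prelude (imported here as P) rather than the Prelude's mapM. The names countWithMapM and total are invented for illustration, and a plain Integer counter stands in for the updateTables state from the thread.

```haskell
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify')
import Pipes
import qualified Pipes.Prelude as P

-- P.mapM runs the action in the base monad for each element,
-- so no explicit await/yield loop (and no lift) is needed.
countWithMapM :: Monad m => Pipe a a (StateT Integer m) r
countWithMapM = P.mapM $ \x -> modify' (+1) >> return x

-- Hypothetical driver: stream n numbers through the pipe and
-- return the final counter value.
total :: Int -> IO Integer
total n =
    execStateT (runEffect (each [1 .. n] >-> countWithMapM >-> P.drain)) 0

main :: IO ()
main = total 100000 >>= print
```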