[Haskell-cafe] Resumable applicative parsing from tail end?

15 views

Skip to first unread message

martin

unread,

Jun 20, 2016, 6:43:05 AM6/20/16

to haskell Cafe

Hello all,

in my attempt to write a simulation program, I am facing the following problem:

as the simulation progresses, I collect "primary" log entries, each one associated with a timestamp. These entries carry
the information I am interested in (and not so much the final state). Currently I am using a simple list to collect log
entries. Since (:) is so much easier than (++) I prepend new entries to the head of the list. So my log runs backwards
in time with the oldest entries at the tail end.

Because the log-entries are typically too fine-grained to be useful, I want to aggregate several of them into a single
log entry in a secondary log. I want to do this "as I go" and discard primary log entries which are already aggregated.
Thus I can keep the primary log reasonably small.

I thought that aggregation is just a special case of parsing. And indeed, I could write an applicative parser

PrimaryLog -> (PrimaryLog, SecondaryLog)

which does the trick, except I had to reverse the ordering of the primary log. This is because the parser/aggregator
must parse from past to future, because there may be new log entries coming up in the future. I need to be able to
resume parsing where I left off whenever new primary log entries are entered, becuase I may then have just enough
unconsumed primary log entries to create another secondary entry.

Currently I'm trying Data.Sequence to let the primary log grow to the left and let parsing to begin at the far-right.
But I have a bad feeling about it. Particularly, I don't have any hard reasons why a simple List wouldn't be sufficient.
But I just wasn't able to write the code.

What do you think? How would you parse a list from the tail end? Or how else would you deal with this problem?
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Ruben Astudillo

unread,

Jul 6, 2016, 11:54:17 PM7/6/16

to martin, haskel...@haskell.org

On 20/06/16 06:42, martin wrote:
> as the simulation progresses, I collect "primary" log entries, each one associated with a timestamp. These entries carry
> the information I am interested in (and not so much the final state). Currently I am using a simple list to collect log
> entries. Since (:) is so much easier than (++) I prepend new entries to the head of the list. So my log runs backwards
> in time with the oldest entries at the tail end.
>
> Because the log-entries are typically too fine-grained to be useful, I want to aggregate several of them into a single
> log entry in a secondary log. I want to do this "as I go" and discard primary log entries which are already aggregated.
> Thus I can keep the primary log reasonably small.
>
> I thought that aggregation is just a special case of parsing. And indeed, I could write an applicative parser
>
> PrimaryLog -> (PrimaryLog, SecondaryLog)
>
> which does the trick, except I had to reverse the ordering of the primary log. This is because the parser/aggregator
> must parse from past to future, because there may be new log entries coming up in the future. I need to be able to
> resume parsing where I left off whenever new primary log entries are entered, becuase I may then have just enough
> unconsumed primary log entries to create another secondary entry.

You are dealing with the classic "streaming problem" when you keep a
growing "Log" that you can't reduce until the end of the process;
instead of it being returned incrementally as when you `map a list`.

Luckily this problem has two standards solution namely pipes/conduit.
They let you recover the streaming behaviour you want and will make your
parsing go left-to-right.