best tool for log file processing?


nondu...@gmail.com

Dec 18, 2007, 1:03:37 PM
to CUFP
I would like to try FP for log file processing, i.e. reading files,
parsing, sorting, merging and writing files.
Not sure which language/environment to choose? Erlang, Haskell, OCaml?
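
For reference, the pipeline described here (read, parse, sort, merge, write) can be sketched in a functional style. This is only a minimal sketch in Python, not one of the three languages asked about. It assumes a hypothetical log format of `<ISO timestamp> <message>` per line, and the names `Entry`, `parse` and `merge_logs` are made up for illustration.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, List, NamedTuple


class Entry(NamedTuple):
    timestamp: datetime
    message: str


def parse(lines: Iterable[str]) -> Iterator[Entry]:
    """Parse lines of the assumed form '<ISO timestamp> <message>'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        stamp, _, message = line.partition(" ")
        yield Entry(datetime.fromisoformat(stamp), message)


def merge_logs(*sources: Iterable[str]) -> List[str]:
    """Sort each source by timestamp, then merge them into one ordered list."""
    sorted_sources = [sorted(parse(src)) for src in sources]
    merged = heapq.merge(*sorted_sources)  # lazy k-way merge of sorted inputs
    return [f"{e.timestamp.isoformat()} {e.message}" for e in merged]


if __name__ == "__main__":
    a = ["2007-12-18T13:05:00 server-a ready", "2007-12-18T13:03:37 server-a started"]
    b = ["2007-12-18T13:04:10 server-b started"]
    for line in merge_logs(a, b):
        print(line)
```

Open file objects are iterables of lines, so they can be passed to `merge_logs` directly, and the result can be written out with `writelines` after adding newlines.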



Per

Clifford Beshers

Dec 18, 2007, 1:43:54 PM
to cu...@googlegroups.com
http://software.complete.org/hslogger

John Goerzen has done some work on logging from Haskell (hslogger), but I think it's mostly for creating log files, not analyzing them.

Ulf Wiger

Dec 18, 2007, 5:25:56 PM
to cu...@googlegroups.com
2007/12/18, nondu...@gmail.com <nondu...@gmail.com>:

>
> I would like to try FP for log file processing, i.e. reading files,
> parsing, sorting, merging and writing files.
> Not sure which language/environment to choose? Erlang, Haskell, OCaml?

A colleague of mine (the now famous Hans Nilsson) has written some
absolutely gorgeous little programs in Prolog for analyzing logs.
In one program, he laid down the rules for a successful SIP dialogue,
and then had it scour a large log looking for calls where something
went wrong (e.g. missing ACK message). The program ran really fast,
and found a small number of odd cases among 10,000 or so successful
calls. Ok, Prolog isn't FP, but I believe it's a pretty marvellous tool for
log analysis.
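
To give a rough idea of that kind of rule check (this is not the Prolog program mentioned above, only a hedged sketch in Python): assume the log has already been parsed into hypothetical `(call_id, message)` records, and assume a much simplified rule that a successful call contains INVITE, 200 OK, ACK and BYE in that order. The names `EXPECTED`, `first_missing` and `failed_calls` are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Simplified, assumed rule for a successful call: these messages must
# appear in this order (other messages may be interleaved).
EXPECTED = ["INVITE", "200 OK", "ACK", "BYE"]


def first_missing(messages: List[str]) -> str:
    """Return the first expected message not found in order, or '' if none."""
    position = 0
    for expected in EXPECTED:
        try:
            # Search only after the previously matched message.
            position = messages.index(expected, position) + 1
        except ValueError:
            return expected
    return ""


def failed_calls(records: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Group (call_id, message) records by call and report rule violations."""
    by_call: Dict[str, List[str]] = defaultdict(list)
    for call_id, message in records:
        by_call[call_id].append(message)
    report = {}
    for call_id, messages in by_call.items():
        missing = first_missing(messages)
        if missing:
            report[call_id] = missing
    return report
```

A declarative language lets the rule itself be written as the program; here the rule is just the `EXPECTED` list, and anything more realistic (retransmissions, error responses, timeouts) would need a richer description.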

BR,
Ulf W

nondu...@gmail.com

Dec 19, 2007, 11:01:58 AM
to CUFP
Cool, I have looked at this a bit further and decided to go with
Erlang, because of process distribution.

I remember working with the Erlang guys back in the early nineties, we
got an Ericsson MD110 PBX, and the logic
in it was overridden with Erlang code on a Sun machine :-)


Per




Valery V. Vorotyntsev

Dec 19, 2007, 11:10:42 AM
to nondu...@gmail.com, cu...@googlegroups.com
On 12/19/07, nondu...@gmail.com <nondu...@gmail.com> wrote:
>
> Cool, I have looked at this a bit further and decided to go with
> Erlang, because of process distribution.
>
> I remember working with the Erlang guys back in the early nineties, we
> got an Ericsson MD110 PBX, and the logic
> in it was overridden with Erlang code on a Sun machine :-)

Could you show a sample of your log format, please?

--
vvv

thomas_h

Dec 19, 2007, 12:51:39 PM
to cu...@googlegroups.com
> I would like to try FP for log file processing, i.e. reading files,
> parsing, sorting, merging and writing files.
> Not sure which language/environment to choose? Erlang, Haskell, OCaml?

Ok, you've already made up your mind, but from the choices you offered
I would have probably picked OCaml: it produces very fast code, and
has this impressive "camlp4o" module for parsing.

=T.

Ulf Wiger

Dec 19, 2007, 5:39:55 PM
to cu...@googlegroups.com
2007/12/19, nondu...@gmail.com <nondu...@gmail.com>:

>
> Cool, I have looked at this a bit further and decided to go with
> Erlang, because of process distribution.
>
> I remember working with the Erlang guys back in the early nineties, we
> got an Ericsson MD110 PBX, and the logic
> in it was overridden with Erlang code on a Sun machine :-)

I have a working simulator of that setup, and used it to
illustrate some different ways of programming multi-way
concurrency. I instrumented it somewhat so that I could
delay answers from the switch, and so fairly easily demonstrate
timing bugs in the software e.g. by hanging up too fast in
the graphical simulator.

I had this wish that someone would plug something else into
it, and illustrate what it could look like, e.g. when programming
telephony in OCaml, OHaskell, C++, or whatever... So far, I haven't
had any takers.

http://www.erlang.se/euc/05/1500Wiger.ppt

It was fun to play with it. When I first learned Erlang back in 1992,
it was in a lecture series, where we were taught Erlang in four
lectures, and then got to complete a control program for the
MD 110 switch as our assignment. We could play with it in the
simulator, and then try it out on real hardware, actually making
the phones ring and talking to each other. (:

It then became the final lab assignment in the two-day basic
Erlang course given at Ericsson, and if the students remembered
anything at all from that course, it was that lab assignment.

BR,
Ulf W

Francois

Dec 20, 2007, 10:44:18 AM
to CUFP
If it helps, there's recently been some brouhaha around fast/parallel
parsing of large log files. It was started by Tim Bray on his ongoing
blog. Lots of people have competed in many languages, including the
three you mentioned...

http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder

nondu...@gmail.com

Dec 20, 2007, 11:21:28 AM
to CUFP
That is an interesting article, thanks!

I think the MapReduce/Hadoop style of architecture fits well, as the
actual CPU time is less important when we have a massive log volume
and many servers.
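
As a rough illustration of that style (a sketch only, in Python, and not Hadoop itself): each chunk of log lines is mapped to partial counts independently, and the partial results are then reduced into one total. The Apache-style access log format, the regular expression, and the names `map_chunk`, `reduce_counts` and `top_paths` are all assumptions made for this example.

```python
import re
from collections import Counter
from functools import reduce
from typing import Iterable, List, Tuple

# Assumed log format: Apache-style access lines containing '"GET <path> '.
REQUEST = re.compile(r'"GET (\S+) ')


def map_chunk(lines: Iterable[str]) -> Counter:
    """Map step: count requested paths within one chunk of log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def reduce_counts(partials: Iterable[Counter]) -> Counter:
    """Reduce step: merge the per-chunk counts into one total."""
    return reduce(lambda a, b: a + b, partials, Counter())


def top_paths(chunks: List[List[str]], n: int = 3) -> List[Tuple[str, int]]:
    # The built-in map runs sequentially here. Because map_chunk shares no
    # state between chunks, the map step could be spread over processes or
    # servers without changing the reduce step.
    return reduce_counts(map(map_chunk, chunks)).most_common(n)
```

Since the map step has no side effects, the chunks can live on different servers and only the small per-chunk counts need to travel over the network.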


Per

Alex Jacobson

Dec 20, 2007, 12:04:39 PM
to cu...@googlegroups.com
Another good tool if you have multiple servers is http://spread.org

There is now a pure Haskell interface to Spread called hspread that we will be using inside HAppS soon.

-Alex-