Gmail Calendar Documents Reader Web more »
Recently Visited Groups | Help | Sign in
Google Groups Home
Wide Finder 2 results (OCaml implementation with trivial parallelism)
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  2 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Mauricio Fernandez  
View profile  
 More options Jun 8 2008, 8:54 pm
From: Mauricio Fernandez <m...@acm.org>
Date: Mon, 9 Jun 2008 02:54:56 +0200
Local: Sun, Jun 8 2008 8:54 pm
Subject: Wide Finder 2 results (OCaml implementation with trivial parallelism)
(Tim: I don't know if/when I'll be able to blog about this, so feel free to
link to this message.)

Here are the results from my last full run (and actually only the 4th
successful one, since I ran into issues with fork() and the 32 bit address
space):

Language: OCaml
LoCs: 124 + ~20 (depends on how the code is partitioned into modules)
http://eigenclass.org/misc/wf2_multicore.ml

real    8m12.155s
user    137m24.981s
sys     19m52.597s

I have introduced a couple modifications since that could save one minute or
two (real time). Unfortunately, the machine has been pretty busy with
rwaldin's debugging and it's too late in my locale now, so this will remain
hypothetical for a while.

I have just learned about the canonical results at
/wf1/data/code/twbray/coolstack.log, and fortunately the output is identical,
as was the case in the first actual run. Not bad :)

The program is mostly I/O-bound, which is why the CPU time is not that large
compared to the wall-clock time. I haven't optimized I/O properly: the program
just uses a number of workers which perform line-oriented(!) I/O in separate
regions of the file. It still manages to read at >100MB/s, and the workers are
waiting for the data from disk half of the time.

I didn't use anything fancy to achieve concurrency this time, just good old
fork+pipe, wrapped in a neat "invoke" function (due to Jon Harrop) of type
  ('a -> 'b) -> 'a -> unit -> 'b

(For non-ML speakers: this is a function that takes a function, applies it to
the given argument in another process, and returns a function that will block
until the result is ready. You get parallelism by applying the "invoke"
function repeatedly, thereby launching several processes at once.)

I am pretty fond of the simple 7-line O(n * log(m)) function that keeps the
top m elements from a collection of n items (the original Ruby implementation
and most others are O(n * m * log(m)) or worse). Generating the reports for
the 42GB file takes 5s in my OCaml program, so it might take up to 1 minute in
most of the other solutions. The time will however remain nearly constant if
the top 100 or 1000 items are requested; the original Ruby solution could take
10 and 100 times longer (easily several minutes to a couple hours!) than with
10 elements.

--
Mauricio Fernandez  -   http://eigenclass.org


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Mauricio Fernandez  
View profile  
 More options Jun 8 2008, 9:07 pm
From: Mauricio Fernandez <m...@acm.org>
Date: Mon, 9 Jun 2008 03:07:17 +0200
Local: Sun, Jun 8 2008 9:07 pm
Subject: Re: Wide Finder 2 results (OCaml implementation with trivial parallelism)

On Mon, Jun 09, 2008 at 02:54:56AM +0200, Mauricio Fernandez wrote:
> Here are the results from my last full run (and actually only the 4th
> successful one, since I ran into issues with fork() and the 32 bit address
> space):

> Language: OCaml
> LoCs: 124 + ~20 (depends on how the code is partitioned into modules)
> http://eigenclass.org/misc/wf2_multicore.ml

> real    8m12.155s
> user    137m24.981s
> sys     19m52.597s

I forgot to add a pointer to the 68-line, single-core, minimalistic OCaml
implementation equivalent to the original Ruby one. It should be able to
process the 42GB dataset in around 2 hours, compared to ~25H for Ruby
(I haven't actually executed it, but it's essentially the same code as in the
multicore case, and it can only take less than the CPU time of the latter
since there's no interprocess communication cost --- I/O is largely irrelevant
in this case).

Here it is: http://eigenclass.org/misc/wf2_simple.ml

--
Mauricio Fernandez  -   http://eigenclass.org


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2009 Google