[Caml-list] Shared memory parallel application: kernel threads

Hugo Ferreira

Mar 12, 2010, 6:56:11 AM
to caml...@yquem.inria.fr
Hello,

I need to implement (meta) heuristic algorithms that
use parallelism in order to (attempt to) solve a (hard)
machine learning problem that is inherently exponential.
The aim is to take maximum advantage of the multi-core
processors I have access to.

To that effect I have revisited many of the lively
discussions in threads related to concurrency, parallelism
and shared memory in this mailing list. I however still
have many doubts, some of which are very basic.

My initial objective is to write a very simple test that
launches k jobs. Each of these jobs must access
a common data set that is read-only. Each of the k threads
in turn generates its own data. The data generated by the k
jobs are then placed in a queue for further processing.

The process continues by launching (or reusing) k/2 jobs.
Each job consumes two elements from the queue that were
previously generated (the common data set must still be
available). The process repeats itself until k=1. Note
that the queued data is not small nor can I determine
a fixed maximum size for it.
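
The rounds described above can be sketched sequentially as follows. This is only an illustration of the queue discipline, not a parallel implementation: the names initial_round, reduce, generate and merge are hypothetical, and the jobs run one after another in a single process.

```ocaml
(* Sketch of the pairwise reduction rounds, run sequentially.
   [generate] and [merge] are hypothetical stand-ins for the real jobs. *)

(* Round 0: k jobs each read the shared data set and enqueue a result. *)
let initial_round ~generate ~shared k =
  let q = Queue.create () in
  for job = 0 to k - 1 do
    Queue.add (generate shared job) q
  done;
  q

(* Later rounds: each job consumes two queued items and enqueues one,
   roughly halving the queue until a single item is left. *)
let rec reduce ~merge ~shared q =
  if Queue.length q <= 1 then Queue.take_opt q
  else begin
    let next = Queue.create () in
    while Queue.length q >= 2 do
      let a = Queue.take q in
      let b = Queue.take q in
      Queue.add (merge shared a b) next
    done;
    (* With an odd count, the leftover item moves on to the next round. *)
    Queue.transfer q next;
    reduce ~merge ~shared next
  end

(* Toy usage: 8 jobs, each producing a one-element list; merging
   concatenates the two lists. *)
let result =
  let shared = [| 1; 2; 3; 4 |] in
  let generate data job = [ data.(job mod Array.length data) ] in
  let merge _data a b = a @ b in
  reduce ~merge ~shared (initial_round ~generate ~shared 8)
```

Because the queue is an ordinary growable Queue.t, no maximum size has to be fixed in advance; in a multi-process version the queue would have to live in the coordinating process or in shared memory.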

I have opted to use "kernel-level threads" that allow use
of the (multi-core) processors but still allow "easy"
access to shared memory.

I have taken a cursory look at:
- Ocaml.Threads
- Ocaml.Unix (LinuxThreads)
- coThreads
- Ocamlnet2/3 (netshm, netcamlbox)
(An eThreads library exists in the forge but I did not examine this)

My first concern is to take advantage of the multi-cores so:

1. The thread library is not the answer
Chapter 24 - "The threads library is implemented by time-sharing on
a single processor. It will not take advantage of multi-processor
machines." [1]

2. LinuxThreads seems to be what I need
"The main strength of this approach is that it can take full
advantage of multiprocessors." [2]


Issue 1

In the manual [3] I see only references to functions for the creation
and use of processes. I see no calls that allow me to simply generate
and assign a function (job) to a thread (such as val create : ('a -> 'b)
-> 'a -> t in the Thread module). The unix library where LinuxThreads
is now integrated shows the same API. Am I missing something or
is there no way to launch "threaded functions" from the Unix module?
Naturally I assume that threads and processes are not the same thing.

Issue 2

If I cannot launch kernel-threads to allow for easy memory sharing, what
other options do I have besides netshm? The data I must share is defined
by a recursive variant and is not simple numerical data.

I would appreciate any comments.

TIA,
Hugo F.


[1] http://caml.inria.fr/pub/docs/manual-ocaml/manual038.html
[2] http://pauillac.inria.fr/~xleroy/linuxthreads/
[3] http://caml.inria.fr/pub/docs/manual-ocaml/libref/ThreadUnix.html
[4] http://caml.inria.fr/pub/docs/manual-ocaml/manual035.html

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Gerd Stolpmann

Mar 12, 2010, 7:34:57 AM
to Hugo Ferreira, caml...@yquem.inria.fr

I think you are mixing up several things here. LinuxThreads has nothing
to do with OCaml. It is an implementation of kernel threads for Linux at
the C level. It is considered outdated today, and has usually been replaced
by a better implementation (NPTL) that conforms more strictly to the
POSIX standard.

OCaml's multi-threading implementation uses the multi-threading
API the OS provides. This might be LinuxThreads or NPTL or something
else. So, in the lower half of the implementation the threads are kernel
threads, and multi-core-enabled. However, OCaml prevents more than
one of the kernel threads from running inside its runtime at any time. So
OCaml code will always run on only one core (but you can call C code,
and this can then take full advantage of multi-cores).

This is the primary reason I am going with multi-processing in my
projects, and why Ocamlnet focuses on it.

The Netcamlbox module of Ocamlnet 3 might be interesting for you. Here
is an example program that mass-multiplies matrices on several cores:

https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/camlbox/manymult.ml

Netcamlbox can move complex values to shared memory, so you are not
restricted to bigarrays. The matrix example uses float array array as
representation. Recursive variants should also be fine.

To provide shared data to all workers, you can simply load it into
the master process before the child processes are forked off. Another
option is (especially when it is a lot of data, and you cannot afford to
have n copies) to create another camlbox in the master process before
forking, and to copy the shared data into it before forking. This avoids
copying the data at fork time.
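
As a rough illustration of the first option (load the data in the master, then fork), here is a minimal sketch that uses only the standard unix library and Marshal rather than Netcamlbox. The tree type and the spawn, collect and job names are hypothetical, and results travel back as marshalled copies over a pipe, so this shows the fork-based structure, not camlbox-style shared memory. Unix._exit assumes a reasonably recent OCaml.

```ocaml
(* Requires the unix library. [tree], [size], [spawn], [collect] and the
   toy job below are hypothetical names used only for illustration. *)
type tree = Leaf of int | Node of tree * tree

let rec size = function
  | Leaf _ -> 1
  | Node (l, r) -> size l + size r

(* Fork one child per job. The child sees [shared] through the memory
   image inherited from the master, computes a result and marshals it
   back over a pipe. *)
let spawn shared job arg =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
    Unix.close rd;
    let oc = Unix.out_channel_of_descr wr in
    Marshal.to_channel oc (job shared arg) [];
    close_out oc;
    (* Leave immediately, without running the master's at_exit handlers. *)
    Unix._exit 0
  | pid ->
    Unix.close wr;
    (pid, Unix.in_channel_of_descr rd)

(* Read the child's result first, then reap the child. *)
let collect (pid, ic) =
  let result : tree = Marshal.from_channel ic in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  result

(* The shared data is built once in the master, before any fork. *)
let shared = Node (Leaf 1, Node (Leaf 2, Leaf 3))

let results =
  let job data n = Node (data, Leaf n) in
  List.init 4 (spawn shared job) |> List.map collect
```

All children are started before any result is read, so the jobs can run on different cores. A real version would also need to handle exceptions raised inside the child, which this sketch ignores.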

One drawback of Netcamlbox is that it is unsafe, and violating the
programming rules is punished with crashes. (But this also applies, to
some extent, to multi-threading, only that the rules are different.)

Gerd


--
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany
ge...@gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------

Hugo Ferreira

Mar 12, 2010, 8:36:23 AM
to Gerd Stolpmann, caml...@yquem.inria.fr
Hi,

Gerd Stolpmann wrote:
> On Fri, 2010-03-12 at 11:55 +0000, Hugo Ferreira wrote:
>> Hello,
>>
>> I need to implement (meta) heuristic algorithms that
>> use parallelism in order to (attempt to) solve a (hard)
>> machine learning problem that is inherently exponential.
>> The aim is to take maximum advantage of the multi-core
>> processors I have access to.
>>

snip


>> My first concern is to take advantage of the multi-cores so:
>>
>> 1. The thread library is not the answer
>> Chapter 24 - "The threads library is implemented by time-sharing on
>> a single processor. It will not take advantage of multi-processor
>> machines." [1]
>>
>> 2. LinuxThreads seems to be what I need
>> "The main strength of this approach is that it can take full
>> advantage of multiprocessors." [2]
>
> I think you are mixing up several things here. LinuxThreads has nothing
> to do with OCaml. It is an implementation of kernel threads for Linux at
> the C level. It is considered outdated today, and has usually been replaced
> by a better implementation (NPTL) that conforms more strictly to the
> POSIX standard.
>

Oops. Silly me.

> OCaml's multi-threading implementation uses the multi-threading
> API the OS provides. This might be LinuxThreads or NPTL or something
> else. So, in the lower half of the implementation the threads are kernel
> threads, and multi-core-enabled.

Ok. Should have read more carefully. As stated in the manual "Two
implementations of the threads library are available, depending on the
capabilities of the operating system:" So I have a recent glibc and
therefore "multi-core-enabled" threads.

> However, OCaml prevents more than
> one of the kernel threads from running inside its runtime at any time. So
> OCaml code will always run on only one core (but you can call C code,
> and this can then take full advantage of multi-cores).
>

Ok. I was under the (wrong) impression that the native OS threads did
run simultaneously (multi-core) but were intermittently stopped due to
the GC. So threads won't help.

> This is the primary reason I am going with multi-processing in my
> projects, and why Ocamlnet focuses on it.
>

Understood.

> The Netcamlbox module of Ocamlnet 3 might be interesting for you. Here
> is an example program that mass-multiplies matrices on several cores:
>
> https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/camlbox/manymult.ml
>
> Netcamlbox can move complex values to shared memory, so you are not
> restricted to bigarrays. The matrix example uses float array array as
> representation. Recursive variants should also be fine.
>
> To provide shared data to all workers, you can simply load it into
> the master process before the child processes are forked off. Another
> option is (especially when it is a lot of data, and you cannot afford to
> have n copies) to create another camlbox in the master process before
> forking, and to copy the shared data into it before forking. This avoids
> copying the data at fork time.
>

The main data set is large, so I will opt for the latter.

> One drawback of Netcamlbox is that it is unsafe, and violating the
> programming rules is punished with crashes. (But this also applies, to
> some extent, to multi-threading, only that the rules are different.)
>

Not an issue for me.
Going to read up on and install ocamlnet3.

Thanks,
Hugo F.

Sylvain Le Gall

Mar 12, 2010, 9:30:46 AM
to caml...@inria.fr
On 12-03-2010, Hugo Ferreira <h...@inescporto.pt> wrote:
> Hello,

>
> I have opted to use "kernel-level threads" that allow use
> of the (multi-core) processors but still allow "easy"
> access to shared memory.
>
> I have done a cursory look at:
> - Ocaml.Threads
> - Ocaml.Unix (LinuxThreads)
> - coThreads
> - Ocamlnet2/3 (netshm, netcamlbox)
> (An eThreads library exists in the forge but I did not examine this)
>

I think you should also have a look at ocaml/mpi for communication:
http://forge.ocamlcore.org/projects/ocamlmpi/
and ancient for accessing read-only memory:
http://merjis.com/developers/ancient

MPI can work on a single computer to take advantage of multi-core
through multiple processes.

Regards,
Sylvain Le Gall

Hugo Ferreira

Mar 12, 2010, 9:54:43 AM
to Sylvain Le Gall, caml...@inria.fr
Sylvain Le Gall wrote:
> On 12-03-2010, Hugo Ferreira <h...@inescporto.pt> wrote:
>> Hello,
>>
>> I have opted to use "kernel-level threads" that allow use
>> of the (multi-core) processors but still allow "easy"
>> access to shared memory.
>>
>> I have done a cursory look at:
>> - Ocaml.Threads
>> - Ocaml.Unix (LinuxThreads)
>> - coThreads
>> - Ocamlnet2/3 (netshm, netcamlbox)
>> (An eThreads library exists in the forge but I did not examine this)
>>
>
> I think you should also have a look at ocaml/mpi for communication:
> http://forge.ocamlcore.org/projects/ocamlmpi/
> and ancient for accessing read-only memory:
> http://merjis.com/developers/ancient
>
> MPI can work on a single computer to take advantage of multi-core
> through multiple processes.
>

Indeed. I did not list these because I was specifically looking for
a shared-memory solution among threads. Seeing as I am forced to use
processes, ancient is worth considering.

Thanks,
Hugo F.

Philippe Wang

Mar 12, 2010, 7:00:12 PM
to Hugo Ferreira, caml...@inria.fr
Hi,

If your program doesn't need field-proven stability, you may be
interested in the "OCaml for Multicore" project which provides an
alternative runtime library (prototype quality) which allows threads
to compute in parallel.
http://www.algo-prog.info/ocmc/

If you choose to give it a try, we would welcome your feedback.

Cheers,

--
Philippe Wang
ma...@philippewang.info

Hugo Ferreira

Mar 13, 2010, 4:13:07 AM
to Philippe Wang, caml...@inria.fr
Philippe Wang wrote:
> Hi,
>
> If your program doesn't need field-proven stability, you may be
> interested in the "OCaml for Multicore" project which provides an
> alternative runtime library (prototype quality) which allows threads
> to compute in parallel.
> http://www.algo-prog.info/ocmc/
>

Appreciate the suggestion. Too risky though. 8-(
I am having enough difficulty as it is to design and
implement this stuff.

Regards,
Hugo F.

> If you choose to give it a try, we would welcome your feedback.
>
> Cheers,
>

Richard Jones

Mar 13, 2010, 8:56:34 AM
to Hugo Ferreira, caml...@yquem.inria.fr

Also add my 2 cents here:

At least look at OCaml Ancient for sharing the data. You may end up
not using it, but it was designed pretty much for what you have in mind.
The README should be informative:

http://merjis.com/developers/ancient
http://merjis.com/_file/ancient-readme.txt

(You should also look at the API and source.) You didn't mention how
large your read-only data set is, but OCaml Ancient should be able to
handle hundreds of gigabytes, assuming a 64-bit machine. We used to use
it with data sets of tens of gigabytes without any issues.

As others have said, don't use threads to launch your jobs. Look at
one of the fork-based libraries. In addition to the ones mentioned,
take a look at PreludeML, which is internally very simple and should
give you a good start if you decide to write your own:

http://github.com/kig/preludeml

If you want to spread the jobs over multiple machines, then OCaml MPI
is probably the way to go.

Rich.

--
Richard Jones
Red Hat

Hugo Ferreira

Mar 13, 2010, 9:29:47 AM
to Richard Jones, caml...@yquem.inria.fr
Hi Richard,

Richard Jones wrote:
> Also add my 2 cents here:
>
> At least look at OCaml Ancient for sharing the data. You may end up
> not using it, but it was designed pretty much for what you have in mind.
> The README should be informative:
>
> http://merjis.com/developers/ancient
> http://merjis.com/_file/ancient-readme.txt
>

Sylvain Le Gall already pointed this out. I have looked at the
readme and checked that I have it in GODI (version 0.8, but I
think it will do).

> (You should also look at the API and source.) You didn't mention how
> large your read-only data set is, but OCaml Ancient should be able to
> handle hundreds of gigabytes, assuming a 64-bit machine. We used to use
> it with data sets of tens of gigabytes without any issues.
>

Don't think I will need so much.

> As others have said, don't use threads to launch your jobs. Look at
> one of the fork-based libraries.

Going to experiment with the ocamlnet stuff (version 3).

> In addition to the ones mentioned,
> take a look at PreludeML, which is internally very simple and should
> give you a good start if you decide to write your own:
>
> http://github.com/kig/preludeml
>

Ok.

> If you want to spread the jobs over multiple machines, then OCaml MPI
> is probably the way to go.
>

I will be using a single 8-CPU machine for the experiments.
For the problem at hand the shared-memory model seems to be
a better fit than messaging.

Thanks,
Hugo F.


> Rich.

Richard Jones

Mar 13, 2010, 10:10:47 AM
to Hugo Ferreira, caml...@yquem.inria.fr
On Sat, Mar 13, 2010 at 02:29:35PM +0000, Hugo Ferreira wrote:
> I will be using a single 8-CPU machine for the experiments.
> For the problem at hand the shared-memory model seems to be
> a better fit than messaging.

I've not actually tried it, but I bet you can use a hybrid model --
Use MPI as an easy way to spread the jobs over the nodes and for
coordinating the processes, then have each process open a shared
Ancient file (backed by NFS) for the read-only data.

Rich.

--
Richard Jones
Red Hat

Hugo Ferreira

Mar 13, 2010, 10:37:44 AM
to Richard Jones, caml...@yquem.inria.fr
Richard Jones wrote:
> On Sat, Mar 13, 2010 at 02:29:35PM +0000, Hugo Ferreira wrote:
>> I will be using a single 8-CPU machine for the experiments.
>> For the problem at hand the shared-memory model seems to be
>> a better fit than messaging.
>
> I've not actually tried it, but I bet you can use a hybrid model --
> Use MPI as an easy way to spread the jobs over the nodes and for
> coordinating the processes, then have each process open a shared
> Ancient file (backed by NFS) for the read-only data.
>

I figure I will need some form of messaging to coordinate the jobs.
The problem however is that the jobs will initially generate large
amounts of data. This data is progressively "merged" and reduced.
I don't think sending messages is appropriate. At least not in
the initial stages. I will have to experiment.

Thanks for the suggestion,
Hugo F.


> Rich.