The video interview with Leslie Lamport
http://channel9.msdn.com/Shows/Going+Deep/E2E-Erik-Meijer-and-Leslie-Lamport-Mathematical-Reasoning-and-Distributed-Systems
Time, Clocks and the Ordering of Events in a Distributed System
http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf
In particular, I can highly recommend Leslie Lamport's site:
http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html
Tim Daly
On 22 Dec 2010, at 09:28, Tim Daly wrote:
> Clojure works well for concurrency but does not really address
> the parallel question well. For that I've turned to MPI.
> I am working on using MPI from Clojure.
That's a topic I am very interested in as well, although unfortunately
I never find the time to do anything about it. Some random thoughts
based on what I did look at in the past:
1) Parallel computing vs. distributed computing: these are two
different levels of complexity in my opinion. Parallel computing in a
shared-memory environment (e.g. fork/join style) is a much simpler
problem than parallel computing on distributed-memory systems, where
you have to take care of distributing data among the machines and try
to minimize data exchange in addition to balancing CPU load. There are
some interesting approaches in Clojure's par branch for the first
problem. The second one deserves to be tackled as well, but we should
use a label other than "parallel" to reduce confusion.
2) MPI via Java - which one do you plan to use?
3) Exchanging data between nodes: as far as I know many Clojure data
types, in particular closures, are not serializable yet.
4) Efficient data exchange between nodes: it would be nice to be able
to profit from MPI's efficiency for large homogeneous data sets (read:
arrays) in Clojure as well. Java arrays should be easy to handle
efficiently, but Clojure code tends to avoid them. Perhaps
primitive-type vectors could be transferred as arrays as well?
5) High-level layer: MPI is much too low-level for daily use. For
distributed programming in Clojure, I'd like to have a higher-level
model which abstracts away the synchronization issues that lead to
deadlocks, race conditions, and ultimately a miserable life for
programmers. There are some good ideas in the PGAS languages that
would perhaps work fine in a Clojure context as well.
> These are some links others might find interesting.
At first glance this looks promising - they are on my "to watch" list.
Thanks!
Konrad.
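To make the shared-memory case from point 1 concrete, here is a minimal
fork/join sketch in plain Java. It is not taken from Clojure's par branch;
the class name ParallelSum and the cutoff value are assumptions chosen for
illustration. All workers see the same array, so no data has to be
distributed, and only the work is split.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example class: sums a shared array by recursive splitting.
class ParallelSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000; // arbitrary cutoff (assumption)
    private final long[] data;
    private final int from;
    private final int to;

    ParallelSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: sum this slice sequentially.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        ParallelSum left = new ParallelSum(data, from, mid);
        ParallelSum right = new ParallelSum(data, mid, to);
        left.fork();                       // run the left half asynchronously
        long rightSum = right.compute();   // compute the right half in this thread
        return left.join() + rightSum;     // wait for the left half and combine
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ParallelSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(sum(data));
    }
}
```

On a distributed-memory system the same computation would also need a
decision about which node holds which slice of the array, which is the
extra complexity point 1 refers to.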
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
> I am interested in distributed parallel computing too ... I have
> prior experience coding with MPI and C .. but that's beside the
> point .. while I was looking at options with Clojure .. I recently
> came across swarmiji. https://github.com/amitrathore/swarmiji
Thanks for the link! Judging from the example in the README, it's a
library for task farming in Clojure. While that's a limited form of
parallelism, there are still lots of applications where it is useful,
so I'd say this library is definitely worth a closer look. However, it
doesn't seem to deal with distributed data.
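For readers unfamiliar with the term, task farming can be sketched in a few
lines of plain Java. This is not swarmiji's API; the class TaskFarm and its
farm method are hypothetical names, and the "farm" here is a thread pool on
one machine rather than a set of remote nodes. The point is only the shape:
independent inputs are handed to workers, and the results are collected in
input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical example class: farms independent tasks out to a worker pool.
class TaskFarm {
    static <A, B> List<B> farm(List<A> inputs, Function<A, B> task, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Submit one task per input; tasks do not communicate with each other.
            List<CompletableFuture<B>> pending = new ArrayList<>();
            for (A input : inputs) {
                pending.add(CompletableFuture.supplyAsync(() -> task.apply(input), pool));
            }
            // Collect results in the same order as the inputs.
            List<B> results = new ArrayList<>();
            for (CompletableFuture<B> future : pending) {
                results.add(future.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> squares = farm(Arrays.asList(1, 2, 3, 4), x -> x * x, 2);
        System.out.println(squares);
    }
}
```

Each task carries its whole input with it, which is why this style says
nothing about data that must stay distributed across nodes.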
> I come from the scientific computing community .. the likes of
> Computational Fluid Dynamics and related topics.. large matrix
> operations and such stuff..
My background is somewhat similar: molecular simulations and analysis
of large data sets.
Konrad.
> Thanks for the link! Judging from the example in the README, it's a
> library for task farming in Clojure. While that's a limited form of
> parallelism, there are still lots of applications where it is useful,
> so I'd say this library is definitely worth a closer look. However, it
> doesn't seem to deal with distributed data.
Distributed data is hard, though, partly because the kind of distribution
you need depends on your calculation. Every time I've had to do a
distributed calculation, I've just used the filesystem for data.
I see a lot of frameworks that assume the data is small and can be
entirely contained in the "message," while I need some kind of data
affinity. (I do model estimation on large data sets, so I'd like to send
a lump of data to different nodes, leave it there, then exchange
parameter vectors and error scores with a controller.)
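The data-affinity pattern described above could look roughly like the
following Java sketch. Everything in it is an assumption made for
illustration: the names AffinitySketch, Worker, errorScore and totalError,
the linear model y = a + b*x, and the use of threads in one process as
stand-ins for nodes. Each worker is given its lump of data once and keeps
it; per iteration only a small parameter vector goes out and a single error
score comes back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class: workers hold data, the controller exchanges parameters.
class AffinitySketch {
    /** Stand-in for a node: holds its chunk of (x, y) observations permanently. */
    static final class Worker {
        private final double[] xs;
        private final double[] ys;

        Worker(double[] xs, double[] ys) {
            this.xs = xs;
            this.ys = ys;
        }

        /** Sum of squared errors of y = params[0] + params[1] * x on the local chunk. */
        double errorScore(double[] params) {
            double sse = 0.0;
            for (int i = 0; i < xs.length; i++) {
                double residual = ys[i] - (params[0] + params[1] * xs[i]);
                sse += residual * residual;
            }
            return sse;
        }
    }

    /** Controller step: send the parameters to every worker and add up the scores. */
    static double totalError(List<Worker> workers, double[] params, ExecutorService pool) {
        List<CompletableFuture<Double>> pending = new ArrayList<>();
        for (Worker worker : workers) {
            pending.add(CompletableFuture.supplyAsync(() -> worker.errorScore(params), pool));
        }
        double total = 0.0;
        for (CompletableFuture<Double> score : pending) {
            total += score.join();
        }
        return total;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Worker> workers = Arrays.asList(
                new Worker(new double[] {0, 1}, new double[] {1, 3}),
                new Worker(new double[] {2, 3, 4}, new double[] {5, 7, 9}));
            System.out.println(totalError(workers, new double[] {1.0, 2.0}, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real distributed setting the two arrays inside each Worker would live
on a remote machine, and only the params array and the returned score would
cross the network.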
In today's world, I've found I get more done faster with a single 8-core
machine with a lot of RAM (96 GB now; at a previous employer I had
access to a 512 GB monster) than I would with a farm of machines with
only 4 GB or 8 GB, so I'm back to concurrency. Of course, that's just
because my data is large, but not too large.
>> I come from the scientific computing community .. the likes of
>> Computational Fluid Dynamics and related topics.. large matrix
>> operations and such stuff..
>
> My background is somewhat similar: molecular simulations and analysis
> of large data sets.
I did astronomy, but mostly small-scale stuff: integration, cascade
calculations, and the like. These days, though, I'm doing finance,
mortgages in particular. That's a field that's been fun for the past
few years.
-Johann