Benchmarking Julia parallel computing


Kapil Agarwal

Sep 29, 2014, 2:58:33 PM
to julia...@googlegroups.com
Hi

I am looking to benchmark some standard parallel algorithms using Julia, and am thinking of comparing its performance with MPI and other parallel programming paradigms. I couldn't find much in the way of existing benchmarks, so I would welcome any benchmarks done by the julia-dev team, or pointers to any such benchmarks others are aware of, to help me get started.

Thanks

Kapil

Tony Kelman

Sep 30, 2014, 12:12:30 AM
to julia...@googlegroups.com
I'm not aware of any such data comparing Julia on large-scale parallel tasks with nontrivial communication patterns vs a more conventional HPC cluster approach using MPI. There are some benchmark problems (generally intended for serial or small-scale parallel execution, I believe) under test/perf, but it might be more interesting if you have your own parallel application you want to test with.

Keep us posted as you get further into this.

Erik Schnetter

Sep 30, 2014, 8:23:05 AM
to julia...@googlegroups.com
There is a package "MPI.jl" that allows calling MPI from Julia. With
this, any MPI-based algorithm can be translated to Julia, and the
speed of MPI should be the same as for any other language.
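[A rough sketch of what calling MPI from Julia via MPI.jl looks like, assuming MPI.jl and a system MPI library are installed; exact function signatures may differ between MPI.jl versions. Launched with mpirun rather than julia -p:]

```julia
# Minimal MPI.jl sketch: each rank contributes its rank number and
# Allreduce sums them, so every rank ends up with 0 + 1 + ... + (nprocs-1).
# Run with: mpirun -np 4 julia mpi_hello.jl
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
nprocs = MPI.Comm_size(comm)

total = MPI.Allreduce(rank, +, comm)
println("rank $rank of $nprocs: sum of ranks = $total")

MPI.Finalize()
```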

-erik
--
Erik Schnetter <schn...@cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/

Kapil Agarwal

Oct 1, 2014, 9:18:56 AM
to julia...@googlegroups.com
Hi


On Tuesday, 30 September 2014 08:23:05 UTC-4, Erik Schnetter wrote:
There is a package "MPI.jl" that allows calling MPI from Julia. With
this, any MPI-based algorithm can be translated to Julia, and the
speed of MPI should be the same as for any other language.

-erik

I thought Julia has its own message-passing paradigm, which provides functionality similar to MPI but in a different way. So I wanted to benchmark it against a native MPI implementation. If no such benchmarks exist, would it be a good idea to compare MPI and Julia performance? After all, you would want users to use Julia's parallel computing API rather than port existing libraries to Julia?
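[The built-in primitives in question look roughly like this. A minimal sketch using the current Distributed API; in 2014-era Julia these were built in and the remotecall argument order differed:]

```julia
# Julia's native one-sided message-passing primitives.
using Distributed   # needed on Julia >= 0.7; built in before that
addprocs(2)         # or start with: julia -p 2

# remotecall launches a computation on a worker and returns a Future;
# fetch waits for and retrieves the result.
f = remotecall(sum, 2, 1:100)   # run sum(1:100) on worker 2
@assert fetch(f) == 5050

# pmap is a higher-level equivalent that distributes work over all workers.
results = pmap(x -> x^2, 1:4)
@assert results == [1, 4, 9, 16]
```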
 

On Tue, Sep 30, 2014 at 12:12 AM, Tony Kelman <to...@kelman.net> wrote:
> I'm not aware of any such data comparing Julia on large-scale parallel tasks
> with nontrivial communication patterns vs a more conventional HPC cluster
> approach using MPI. There are some benchmark problems (generally intended
> for serial or small-scale parallel execution, I believe) under test/perf,
> but it might be more interesting if you have your own parallel application
> you want to test with.
>
> Keep us posted as you get further into this.
>

I checked out test/perf and could not find any parallel implementations, so I believe it would be a good idea to benchmark highly parallel applications with Julia.
Actually, I am looking at Julia for a college project: I have access to an HPC cluster and am thinking of implementing graph algorithms. I would appreciate any suggestions on this idea.

 
Thanks
Kapil

Viral Shah

Oct 1, 2014, 9:57:32 AM
to julia...@googlegroups.com
Take a look at Graphs.jl. The patterns used there should make it possible to parallelize some of the routines. I doubt that on a small cluster there will be much difference from MPI.

Many problems do not have non-trivial communication requirements, and the point-to-point TCP/IP that Julia uses is sufficient. With graph algorithms and such, the communication patterns are quite irregular, and MPI may not always have the right abstraction. In any case, we can always use MPI as the underlying transport in Julia as an alternative.

A good project would be to implement the http://graph500.org/ benchmark in Julia in parallel and try to achieve the highest performance. You will have lots of things to compare against. 
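[For reference, the Graph500 kernel is a breadth-first search. A serial sketch of the level-synchronous formulation; the parallel version would partition the adjacency lists across workers:]

```julia
# Level-synchronous BFS: expand the current frontier one level at a time.
# `adj[u]` lists the neighbours of vertex u; unreachable vertices keep -1.
function bfs_levels(adj::Vector{Vector{Int}}, root::Int)
    n = length(adj)
    level = fill(-1, n)
    level[root] = 0
    frontier = [root]
    d = 0
    while !isempty(frontier)
        next = Int[]
        for u in frontier, v in adj[u]
            if level[v] == -1
                level[v] = d + 1
                push!(next, v)
            end
        end
        frontier = next
        d += 1
    end
    return level
end

# Path graph 1-2-3-4: distances from vertex 1 are 0, 1, 2, 3.
adj = [[2], [1, 3], [2, 4], [3]]
@assert bfs_levels(adj, 1) == [0, 1, 2, 3]
```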

-viral

Erik Schnetter

Oct 1, 2014, 12:26:11 PM
to julia...@googlegroups.com
On Wed, Oct 1, 2014 at 9:18 AM, Kapil Agarwal <kapi...@gmail.com> wrote:
> Hi
>
> On Tuesday, 30 September 2014 08:23:05 UTC-4, Erik Schnetter wrote:
>>
>> There is a package "MPI.jl" that allows calling MPI from Julia. With
>> this, any MPI-based algorithm can be translated to Julia, and the
>> speed of MPI should be the same as for any other language.
>>
>> -erik
>
>
> I thought Julia has its own message passing paradigm which provides
> functionality similar to MPI but in a different way. So, I wanted to
> benchmark it against a native MPI implementation. If no such benchmarks
> exist, would it be a good idea to compare MPI and Julia performance ? After
> all, you would want users to use the Julia parallel computing API rather
> than port existing libraries to Julia ?

Yes, this is correct: Julia offers a higher-level API than raw
MPI. However, when it comes to benchmarking, one should probably
distinguish between two things:

(1) The high-level programming API that is offered
(2) The underlying transport mechanisms (how things are mapped to hardware)

For example, when running Julia on a cluster, there may be special
high-speed low-latency communication hardware such as InfiniBand
interconnects. Currently, Julia would not use these, but MPI would,
giving MPI an unfair advantage until Julia is using such hardware as
well.

Similarly, Julia's current cluster manager should work fine and
efficiently if you are using (say) 10 workstations. However, if you
are using many more -- say 1000 -- then Julia's current cluster
manager implementation will not scale. Again, this is only a
limitation of the current implementation, not of the high-level API
that Julia is offering, and I'm sure this will be improved in time.

I am currently working on a "cluster manager" based on MPI; see
<https://bitbucket.org/eschnett/funhpc.jl/>. This is still in an
experimental stage, and it's not clear to me yet how this can be
merged into Julia's ClusterManager class, as certain fundamental
implementation aspects of the current ClusterManager implementation
may need to be changed. However, the high-level API is very similar.
Fortunately, this part of ClusterManager is already being redesigned.

If you want to benchmark large-scale applications, then you will
probably need to use a ClusterManager that uses an efficient
communication protocol such as MPI; otherwise, you would benchmark
implementation limitations instead of comparing the high-level API.
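[To illustrate that the high-level API is independent of the transport: the same program should run unchanged whether workers are local, reached over ssh, or launched by an MPI-backed manager. `MPIManager` below is a placeholder name for such a manager, not an existing API at the time of writing:]

```julia
# The high-level API is transport-agnostic: this Monte Carlo pi estimate
# does not care how the workers were started or how messages travel.
using Distributed

addprocs(2)                      # local workers over TCP (the default)
# addprocs(MPIManager(np = 2))   # hypothetical MPI-backed ClusterManager

@everywhere estimate_pi(n) = 4 * count(_ -> rand()^2 + rand()^2 < 1, 1:n) / n

est = sum(pmap(estimate_pi, fill(100_000, nworkers()))) / nworkers()
println("pi is approximately $est")
```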

-erik

Kapil Agarwal

Oct 1, 2014, 5:55:46 PM
to julia...@googlegroups.com
Hi

What I understand is that comparing Julia and MPI directly may not be a good option, as the two have different internal implementations.

So Julia may be benchmarked against mpi4py and similar MPI ports? Has such benchmarking already been done? If so, could you please share the results?

Another idea I had was implementing the benchmark tests at http://icl.cs.utk.edu/hpcc/ in Julia in parallel. Would that be a good project, and beneficial to the Julia community?


Regards,
Kapil

Erik Schnetter

Oct 1, 2014, 6:21:50 PM
to julia...@googlegroups.com
Julia's MPI.jl is very similar to mpi4py. Both are thin wrappers
around MPI, and should give the same performance as using MPI from C.

HPCC is a set of benchmarks for systems, not for languages. The
algorithms used are prescribed, and using a high-level language has no
chance of improving performance. This may test the "abstraction penalty",
i.e. whether one can write efficient low-level code in a high-level
language, but given that the benchmark implementations can be
arbitrarily complicated and would not need to look like "well-written
Julia" (whatever that means), I assume that Julia would show the same
performance as C.

An interesting benchmark would be to define a high-level, non-trivial
problem, such as "solve the Poisson equation on a grid with
adaptive mesh refinement". Then one can compare both speed and code
complexity across languages, and people could choose
according to their needs. Naively, I would expect Python to be good at
code complexity and C/C++ to be good at speed, and even more naively I
would hope that Julia is close to both of them...
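[A toy version of such a problem (plain Jacobi iteration for the Poisson equation on a uniform grid, without mesh refinement) might look like this deliberately simplified sketch:]

```julia
# Jacobi iteration for -laplace(u) = f on a uniform grid with zero
# boundary values: each interior point is replaced by the average of
# its four neighbours plus the scaled source term.
function jacobi!(u, f, h, iters)
    unew = copy(u)
    for _ in 1:iters
        for j in 2:size(u, 2)-1, i in 2:size(u, 1)-1
            unew[i, j] = 0.25 * (u[i-1, j] + u[i+1, j] +
                                 u[i, j-1] + u[i, j+1] + h^2 * f[i, j])
        end
        u, unew = unew, u   # swap buffers
    end
    return u
end

n = 64; h = 1.0 / (n - 1)
u = zeros(n, n)            # boundary held at 0
f = ones(n, n)             # constant source term
u = jacobi!(u, f, h, 500)
@assert maximum(u) > 0     # the interior solution has moved off zero
```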

-erik

Ben Arthur

Oct 6, 2014, 6:03:44 PM
to julia...@googlegroups.com
thanks for your post erik.

could you please elaborate on why Julia does not scale well to larger clusters?

also, what would it take to get Julia to utilize InfiniBand, were it available?

thanks,

ben

Erik Schnetter

Oct 7, 2014, 6:56:51 PM
to julia...@googlegroups.com
Ben

With "large", I mean clusters containing more than 100 or 1000 nodes.
Also -- I have not measured this; depending on the application, the
non-scalability may be acceptable. For example, if Julia mostly farms
out processes that work mostly independently, then communication
overhead is much less relevant than if the processes communicate
tightly.

The reason I expect Julia not to scale is the way in which the
communication is set up. Currently, each pair of processes that
exchanges data needs to open a TCP port. For N processes with a
complex communication pattern, this requires O(N^2) TCP connections.
Other communication mechanisms -- such as MPI -- do not require O(N^2)
work, but only e.g. O(N log N). This is possible in MPI since
efficient implementations include a routing or multiplexing mechanism
that is not present in TCP or Julia.
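[The scaling gap is easy to quantify. The O(N log N) formula below is purely illustrative, standing in for a tree-structured routing scheme:]

```julia
# All-to-all TCP needs one connection per pair of processes: binomial(N, 2).
npairs(N) = binomial(N, 2)            # N*(N-1)/2, i.e. O(N^2)
tree(N)   = N * ceil(Int, log2(N))    # illustrative O(N log N) routing cost

@assert npairs(10) == 45              # 10 workstations: harmless
@assert npairs(1000) == 499_500       # 1000 processes: ~half a million sockets
@assert tree(1000) == 10_000          # same 1000 processes with routing
```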

To make Julia use InfiniBand, I would use MPI instead of TCP as the
transport mechanism; this is probably the easiest high-level interface
to use, and it supports many other network types as well. Open MPI
<http://www.open-mpi.org/> is one well-known open-source MPI
implementation, but there are several widely used others.
There are also lower-level APIs to access InfiniBand, but I have no
experience with them.

As is, MPI probably cannot be used as a drop-in replacement for TCP,
since MPI uses its own startup mechanism. Instead of starting several
processes (e.g. via ssh) and having them look for and connect to each
other, one passes MPI a list of host names, and it then starts all
processes, including the main process. That is, instead of

julia -p 8 code.jl

one would write

mpirun -np 9 julia code.jl

I assume one can rather easily modify the cluster manager to skip the
ssh and TCP connection part. When I looked at the cluster manager a few
weeks ago, it depended a lot on TCP ports and port numbers. I
haven't looked in the meantime, but if this dependency were also
removed (so that the sending Julia worker process does not need to be
identified via a TCP socket), one could use MPI for communication.

-erik