A superlinear decrease in execution time is possible in any parallel
environment when running an iterative algorithm for which the communication
required between processes is very small.
But it is not a true "speedup" in pure, Amdahlian terms. The reason is that
in order to achieve a superlinear decrease in execution time, the algorithm
must converge faster in a parallel implementation than it does in a serial
implementation. So, you're not really running the same code anymore; the
"faster" test runs only a subset of the code of the "slower" test. I know
this doesn't make any difference to the end user, who does see an "apparent"
speedup, but to those of us who rely on these benchmarks to architect the
underlying hardware and software, it's an important distinction.
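To make the "pure Amdahlian" point concrete: under Amdahl's law, speedup on N processors with parallel fraction p is S(N) = 1 / ((1 - p) + p/N), which can never exceed N. The following toy calculation (my own illustration, not code from this thread) shows that bound:

```python
# Amdahl's law: a program with parallel fraction p on n processors
# has ideal speedup S(n) = 1 / ((1 - p) + p / n).  Since the serial
# fraction (1 - p) never shrinks, S(n) <= n always -- superlinear
# speedup is impossible in these pure terms.

def amdahl_speedup(p, n):
    """Ideal speedup on n processors when fraction p of the work parallelizes."""
    return 1.0 / ((1.0 - p) + p / n)

if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        s = amdahl_speedup(0.95, n)
        print(f"N={n:2d}  speedup={s:5.2f}  (linear bound: {n})")
```

Even with 95% of the work parallelized, the speedup stays strictly below N, and as N grows it saturates at 1/(1 - p) = 20.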
However, that being said, here are some links that may lead you to a
code/application that will produce these results. I've never worked with
either of these personally; these are just headlines that have caught my
eye within the past few months:
http://www.ncsa.uiuc.edu/SCD/SciHi/Pottenger0797.html
http://www.marc.com/Headlines/k73bnchmrk/benchmarks.htm
D. R. Commander
if all you want to do is demonstrate the cause of "superlinear speedup"
then the method is fairly simple, actually.
find a distributed program/application that uses less memory per process
for the parallel case. run it with a problem size that will cause it to
swap to disk on a uniprocessor. add processors (and memory) until it
stops swapping.
voila! superlinear speedup! :-) (most likely)
now replace the words "memory/disk" with "cache/memory" in the above
paragraph and reread it.
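bill's recipe can be sketched with a toy cost model (my own illustration with made-up numbers, not a benchmark): each of the p processes handles problem_size / p items, and an item is cheap if the per-process working set fits in local fast memory (RAM vs. disk, or equally cache vs. RAM), expensive otherwise.

```python
# Toy cost model of the swap-to-disk recipe (illustrative numbers only):
# each of nprocs processes handles problem_size / nprocs items; an item
# costs FAST if the per-process working set fits in local fast memory,
# else SLOW (swapping).  Processes run concurrently.

FAST, SLOW = 1.0, 100.0          # arbitrary per-item cost units (assumed)
LOCAL_CAPACITY = 1_000_000       # fast-memory items per processor (assumed)

def run_time(problem_size, nprocs):
    per_proc = problem_size / nprocs
    cost = FAST if per_proc <= LOCAL_CAPACITY else SLOW
    return per_proc * cost

def speedup(problem_size, nprocs):
    return run_time(problem_size, 1) / run_time(problem_size, nprocs)

if __name__ == "__main__":
    n = 4_000_000                # swaps on 1 or 2 processors, fits on 4+
    for p in (1, 2, 4, 8):
        print(f"{p} procs: speedup = {speedup(n, p):.1f}")
```

With these numbers, 2 processors still swap and give exactly linear speedup, but at 4 processors the working set suddenly fits in fast memory and the measured "speedup" jumps far above 4 -- exactly the effect described above.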
-bill
--
bill rankin ...................................... philosopher/coffee-drinker
wra...@ee.duke.edu ........................................ doctoral wannabe
duke university dept. of electrical engr ......... scientific computing group
As an example you can have a look at:
http://www.genias.de/projects/parasol/newsletter/issue-2/newsletter_2.html
Or directly at:
http://www.genias.de/projects/parasol/newsletter/issue-2/newsletter_2.html#FETI_Solver_shows
Whether you call this effect super-linear speedup or not is up to you ;-)
Hubert Ertl
--
- - - - - - - - - - - - - - - - - - - - - -- - - - - - - - - -
Dr. Hubert Ertl, GENIAS Software GmbH, Phone: +49 9401 9200-50
*** PaTENT MPI 4.0 http://www.genias.de/products/patent/ ***
@PUB:EMAIL.COM EMDIR wrote:
> Does anyone know whether we can have speedup using MPI? If so, please direct me to the place
> to get the code. I need the C code that will work on dec machine.
>
> Regards
--
Jinsheng
please comment on my reasoning, since superlinear speedup is
something strange to me...
it seems more than logical to me that superlinear speedup is impossible
if the sequential program is well written...
I understand the argument proposed here that you can improve cache locality
in some algorithms by dividing the work into small chunks, and thus run faster in parallel.
but you can do the same thing either on a parallel machine, or on a sequential machine
that reproduces the scheduling...
the case of blocked matrix algorithms is a good example.
real parallelism just adds communication costs.
but I am also supposing that your sequential machine has as much memory
as the whole parallel machine has... and as much cache...
so superlinear speedup may just be the effect of the parallel machine's gain
in aggregate cache size and aggregate memory size over the sequential machine...
anyway this is an interesting effect since, like
CPU performance, cache size is limited by technology,
so parallelism is a way to push aggregate CPU power and cache size beyond the limit of a single processor...
am I wrong in my reasoning?
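The aggregate-cache argument above can be made concrete with a toy model (my own illustration; all costs and capacities are made up): if the single processor is granted the same total fast memory as the whole parallel machine, the superlinear effect disappears and only the communication overhead separates the two.

```python
# Toy model of the aggregate-cache argument (illustrative numbers only):
# a "fair" sequential baseline gets the combined fast memory of the
# parallel machine; the parallel run pays a small per-item communication
# overhead.  Superlinear speedup only appears against the "unfair"
# baseline with a single processor's worth of fast memory.

FAST, SLOW = 1.0, 100.0          # arbitrary per-item cost units (assumed)
CACHE_PER_CPU = 1_000_000        # fast-memory items per processor (assumed)
COMM = 0.05                      # per-item communication overhead (assumed)

def time_seq(n, total_cache):
    """Sequential time given the machine's total fast-memory capacity."""
    cost = FAST if n <= total_cache else SLOW
    return n * cost

def time_par(n, p):
    """Parallel time on p processors, each with CACHE_PER_CPU fast memory."""
    per_proc = n / p
    cost = FAST if per_proc <= CACHE_PER_CPU else SLOW
    return per_proc * (cost + COMM)

if __name__ == "__main__":
    n, p = 4_000_000, 4
    unfair = time_seq(n, CACHE_PER_CPU) / time_par(n, p)
    fair = time_seq(n, p * CACHE_PER_CPU) / time_par(n, p)
    print(f"vs. 1-CPU cache baseline:   speedup = {unfair:.1f}")
    print(f"vs. aggregate-cache baseline: speedup = {fair:.2f}")
```

Against the one-processor-cache baseline the speedup is wildly superlinear, but against a sequential machine with the same aggregate cache it drops below p, since real parallelism just adds communication cost -- which is the point of the reasoning above.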