Critique of OpenMP ......

Solomon_Man

Nov 10, 2007, 1:03:26 PM

All,
My partner and I are working on a research paper for a graduate course
and wanted to see what shortcomings OpenMP has compared to MPI.
As students, we are especially interested in its shortcomings in a
commercial environment, as we may not be exposed to these in an academic
atmosphere.

We are also interested in portability issues, as the academic studies
that we have run across have shown conflicting results. We would be
interested in anything anyone would have to say about OpenMP.

Thanks,
Saranga and Chris

P.S. Does anyone know how to determine OpenMP version on a Solaris
machine?
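
One compiler-independent check, as a minimal sketch: the OpenMP
specification requires a conforming compiler to define the `_OPENMP`
macro as the yyyymm date of the spec version it supports when OpenMP
is enabled (199810 is C/C++ 1.0, 200203 is C/C++ 2.0, 200505 is 2.5).
The function name below is made up for illustration, and the flag that
turns OpenMP on is compiler-specific, so check the compiler's
documentation.

```c
/* Hypothetical helper: returns the yyyymm date of the OpenMP spec the
 * compiler implements, or 0 if this file was compiled without OpenMP
 * enabled (in which case _OPENMP is not defined). */
long openmp_spec_date(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0L;
#endif
}
```

Printing the returned value from a small test program and matching it
against the release dates above gives the supported version.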

--

Justin W

Nov 11, 2007, 3:58:10 PM

[C.p. moderator: again, may be posting delays while I am in SC'07.]

> and wanted to see what short comings that OpenMP has compared to MPI.

From the standpoint of viability in the software industry... I
generally think the technology with the most responsive support wins.
This support could come from an outside vendor, open source community
or even internal employees at the company. I work for a company that
builds a lot of stuff internally because then we see the benefit of
taking ownership of the performance tweaking, bug-fixing etc. The
open source community is a good way to get "free" support, but you
won't always get the responsiveness or quality (not to start an
argument) that your business may rely on. In terms of vendor
support... they can be wishy washy as well. My company has worked
with some all-star vendors out there... and we've worked with some
duds. I cannot comment specifically on the quality of vendor support
for OpenMP or any implementation of MPI. My personal favorite is
building these components internally. You get the most responsive
support but at the highest cost (employee salary, time lost to ramping
up new hires on existing technology etc). You also have execution
risk... i.e., what if I can't produce something as reliably as a vendor
can. But in the end, you own that software and the architecture it
runs on.

So... now to answer your question. I personally... just from a
support standpoint would tend to prefer MPI. After all, MPI is just an
interface, the guts can be built internally to do whatever. Heck,
even tweak the performance of the guts to work specifically on your
architecture and throw away the overhead you'd never use anyway.

--

Solomon_Man

Nov 13, 2007, 4:55:37 PM

Justin,
Thanks for your response.
I believe one of OpenMP's biggest strengths is its adoption by the
programming community. My guess is that this adoption is based on ease
of use. I have used both MPI and OpenMP in academia but never in a
commercial atmosphere. I have been in the commercial arena for the past
10 yrs, but have never run across a reason to use either.

Thanks again for your response,
Chris

--

David Golden

Nov 14, 2007, 6:21:12 PM

Solomon_Man wrote:
> We would be interested in anything anyone would have to
> say about OpenMP.

Well, do note that it's not an either/or thing. Don't forget it's quite
possible to use both OpenMP (at least non-"Cluster OpenMP") and MPI in
the one program... if you want to. Some people look into it
for "cluster of SMPs" situations. Whether it's worth it is another
matter, but there are papers out there a search engine query away.

--

Solomon_Man

Nov 15, 2007, 4:28:02 PM

David,
Thanks for the response.
Currently we have other students looking into the OpenMP/MPI hybrid idea.
We have run across some studies mentioning this possibility.

Thanks,
Chris

--

Carlie J. Coats

Nov 22, 2007, 9:46:54 AM

[Moderator note: during the holiday weekend [US] there may be some
posting delay as I am off flying.]

I've been dealing with both for quite a long time -- certainly since
OpenMP 1.0 -- in the environmental modeling context. There are a number
of parallel environmental models out there (the MM5 and WRF met models
are both hybrid-parallel; among atmospheric chemistry/transport models,
EPA's CMAQ is MPI and NCSC/BAMS MAQSIP is OpenMP (with a new version I'm
currently developing that's hybrid-parallel); most of the serious ocean
and climate models are parallel, mostly with MPI.)

Frankly, I find OpenMP easier to work with. I took a cleanly-written
partially-parallelized atmospheric chemistry/transport model from Environ
that I had to use for a particular contract (134,000 lines of code),
and in about 3 hours' work I parallelized it so that I was getting 7:1
speedup on 8 processors instead of the original 2.5:1. Admittedly,
I already knew the code pretty well, and Environ had done a very clean
job of coding the original...

In such a context, OpenMP may well offer much more flexibility: for
3-D transport, for example, horizontal transport parallelizes cleanly
on the vertical subscript, and vertical transport on (one of) the
horizontal; a distributed paradigm such as MPI requires a fixed
whole-program parallel decomposition (or else must implement global
data transposes that are fatal to scaling at high processor counts).
Consequently, you must implement the correct "halo" exchanges for
subdomain boundaries by hand (assuming you do the usual decomposition
of the horizontal domain into distributed subdomains, the vertical
subscript living "on the node"). And it is entirely possible to
implement these exchanges badly and get data corruption as a result
(as seems to be happening intermittently with EPA's CMAQ on some platforms).
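
The per-loop flexibility described here can be sketched in C with
OpenMP pragmas. This is a toy illustration only, not code from any of
the models named: the grid sizes, the function names, and the 3-point
averaging used as a stand-in for real transport are all assumptions.
The point is that each loop nest chooses its own parallel index, with
no data redistribution in between.

```c
/* Hypothetical grid dimensions, chosen only for illustration. */
#define NX 16   /* horizontal points */
#define NZ 8    /* vertical layers   */

/* Horizontal step: every layer k is independent of the others, so the
 * loop over the vertical subscript k is the parallel one.
 * Toy 3-point average along x; boundary values are copied unchanged. */
void horizontal_step(double in[NZ][NX], double out[NZ][NX])
{
    #pragma omp parallel for
    for (int k = 0; k < NZ; k++) {
        out[k][0] = in[k][0];
        out[k][NX - 1] = in[k][NX - 1];
        for (int i = 1; i < NX - 1; i++)
            out[k][i] = (in[k][i - 1] + in[k][i] + in[k][i + 1]) / 3.0;
    }
}

/* Vertical step: every column i is independent of the others, so here
 * the loop over the horizontal subscript i is the parallel one.
 * Toy 3-point average along z; top and bottom layers are copied. */
void vertical_step(double in[NZ][NX], double out[NZ][NX])
{
    #pragma omp parallel for
    for (int i = 0; i < NX; i++) {
        out[0][i] = in[0][i];
        out[NZ - 1][i] = in[NZ - 1][i];
        for (int k = 1; k < NZ - 1; k++)
            out[k][i] = (in[k - 1][i] + in[k][i] + in[k + 1][i]) / 3.0;
    }
}
```

Calling `horizontal_step` and then `vertical_step` on the same shared
arrays needs nothing more than the implicit barrier at the end of each
parallel loop. With a fixed horizontal MPI decomposition, the
horizontal step would instead need a halo exchange across subdomain
boundaries.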

A potential bottleneck for OpenMP programs on very large data sets
is the (hardware-system) overhead of (a) TLB flushing and (b) cache
coherency traffic: For (a) the processor's virtual memory system must
map the entirety of large arrays instead of only the this-processor
fraction; if done sloppily (as for MM5) this inhibits performance for
large problems (especially on hardware sensitive to this, like IBM
POWER). For (b), the system must ensure that all processors have the
same view of memory, and (especially if the program is sloppily written,
with interleaved array-element access), this can kill performance. A
rule of thumb seems to be that cache coherency is relatively easy for
small processor counts (up to 4-8), is moderately difficult for moderate
processor counts (up to 32 or so) and is very difficult for large
processor counts. As a result, you won't find many systems capable
of running hundred-processor OpenMP parallel -- but there are lots of
clusters that let you run hundred-processor MPI.
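
To make "interleaved array-element access" concrete, here is a minimal
sketch in C. The function names and the scaling operation are made up
for illustration. Both loops compute the same result and differ only in
how iterations are dealt out to threads. How much the interleaved form
costs depends on the hardware, and no timings are claimed here.

```c
/* Interleaved: schedule(static, 1) hands out iterations round-robin,
 * so neighbouring elements of x are written by different threads and
 * several threads end up writing to the same cache line. */
void scale_interleaved(double *x, int n, double a)
{
    #pragma omp parallel for schedule(static, 1)
    for (int i = 0; i < n; i++)
        x[i] *= a;
}

/* Blocked: schedule(static) with no chunk size gives each thread at
 * most one contiguous chunk of roughly n / nthreads iterations, so
 * threads share cache lines only at chunk boundaries. */
void scale_blocked(double *x, int n, double a)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        x[i] *= a;
}
```

The blocked form keeps each thread's writes in its own region of
memory, which is the access pattern that generates the least coherency
traffic.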

We have all heard the evils of spaghetti programming (with "GO TO"s,
etc.); however, in my non-humble opinion there has been insufficient
condemnation of the way that for complex problems the message-passing
paradigm leads not merely to bowls of spaghetti, but to entire kitchens
full of it. Diagnosing the cause of the EPA CMAQ intermittent data
corruption is an example of this that is quite familiar to me.

Finally, it is worth noting that the Powers_That_Be ® seem to have
decided that distributed parallelism is what they will support, and so
it is the "glamorous" or fashionable activity to pursue; as a result,
you will find that in hybrid models, the support for the OpenMP
implementation is frequently half-hearted and sloppy at best.

fwiw --

Carlie J. Coats, Jr., Ph.D.
Chief Software Architect
Baron Advanced Meteorological Systems, LLC.

--