Supporting OpenMPI on Stampede

Erik Schnetter

Apr 27, 2016, 11:43:07 AM
to Spack
Stampede is a large HPC system at TACC, and likely important to many
researchers. By default, Spack's OpenMPI doesn't find the OFED stack
there, and thus builds without Infiniband support. I find that the two
options

--with-verbs=/opt/ofed
--enable-openib-connectx-xrc=no

make things work on Stampede. (This is now in the branch
<https://github.com/eschnett/spack/tree/eschnett/openmpi-stampede>.)

How do I best incorporate this into Spack's OpenMPI package.py file?
I'd like this to work automatically. I'd find it inconvenient if
people who build Spack's OpenMPI on Stampede have to pass extra flags
since these (or equivalent) settings should clearly be the default on
Stampede.

Some ideas:
(1) check the host name in package.py, and add build-host specific options
(2) add a variant +tacc, similar to the existing +lanl and +llnl
"variants" (see "'+lanl' in spec" in the code)
(3) allow site-specific repositories to provide additional settings
that OpenMPI would then query (essentially the same as (1), but the
settings themselves are kept out of the main Spack repository)
(4) write code that tries to auto-detect things, e.g. looking for
"ofed_info" in PATH, and if so, adding a respective "--with-verbs="
option
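
A rough sketch of what idea (4) might look like. This is only an illustration, not code from Spack: the helper name `detect_verbs_option` is hypothetical, and it assumes that `ofed_info` lives in `<prefix>/bin`, so that the OFED prefix is the parent of the directory containing it.

```python
import os
import shutil


def detect_verbs_option(path=None):
    """Return '--with-verbs=<prefix>' if ofed_info is found, else None.

    Hypothetical helper. Assumes ofed_info is installed as
    <prefix>/bin/ofed_info. `path` is a search path string;
    None means the PATH of the current environment.
    """
    exe = shutil.which("ofed_info", path=path)
    if exe is None:
        return None
    # <prefix>/bin/ofed_info -> <prefix>
    prefix = os.path.dirname(os.path.dirname(os.path.abspath(exe)))
    return "--with-verbs=%s" % prefix
```

If /opt/ofed/bin is in PATH, this would yield "--with-verbs=/opt/ofed"; otherwise the package would fall back to its current behaviour.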

-erik

--
Erik Schnetter <schn...@gmail.com>
http://www.perimeterinstitute.ca/personal/eschnetter/

Elizabeth F

Apr 27, 2016, 11:50:05 AM
to Erik Schnetter, Spack
Erik,

If /opt/ofed is in the appropriate paths, then you should be able to build OpenMPI with just '--with-verbs'.  You then add the IB library as a (possibly optional) dependency for OpenMPI.

Then we need to make a package for the IB library, setting it to buildable=False in packages.yaml

My thoughts on the ideas suggested above:
 1. packages.yaml is already host-specific.  So we can just set up packages.yaml in host-specific ways.
 2. Probably a bad idea.  Can we get rid of +lanl and +llnl too?
 3. packages.yaml is already site-specific.
 4. Auto-detection is best left as part of the configure step, not the Setup step that Spack does.

-- Elizabeth

--
You received this message because you are subscribed to the Google Groups "Spack" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spack+un...@googlegroups.com.
To post to this group, send email to sp...@googlegroups.com.
Visit this group at https://groups.google.com/group/spack.
For more options, visit https://groups.google.com/d/optout.

luigi calori

Apr 27, 2016, 10:05:39 PM
to Spack, schn...@gmail.com
I agree with Elizabeth.

I had a similar problem when customizing --with-tm=<path to installed specific PBS version>, which we use at our site.
I added a variant tm and a corresponding dependency on a "fake" tm package that I had to define with an empty install.
I described it in packages.yaml with my custom path and buildable=False, as previously suggested.

Then I used
'--with-tm=%s' % spec['tm'].prefix if '+tm' in spec else '--without-tm',

See
https://github.com/RemoteConnectionManager/spack/commit/19c5725a70ce430b0ad5cb5fff3d07deaf49a96f


This way my site-specific path stays in a packages.yaml that is site-specific.

I still have to specify my non-default variant, and the fake package must be somewhere in the active repos.

Is it possible to set variant preferences in packages.yaml?
In that case we could keep the variant defaults tuned for the general case and add site-specific variants in packages.yaml.

In order not to change the defaults, we should leave +verbs untouched, so that it configures with --with-verbs,

while we define another variant like

+verbs_custom

that adds the fake package dependency on ib, whose path is set to /opt/ofed, and configures with --with-verbs=/opt/ofed
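
Roughly, the flag selection I have in mind could look like the sketch below. It is deliberately independent of Spack's real API: `variants` is just a set of enabled variant names and `ib_prefix` stands in for the prefix of the fake ib package; both names, and the function name, are made up for illustration.

```python
def verbs_configure_option(variants, ib_prefix=None):
    """Pick the verbs-related configure flag from the enabled variants.

    Illustrative only: `variants` is a plain set of variant names and
    `ib_prefix` is the (site-specific) prefix of the fake ib package.
    """
    if "verbs_custom" in variants:
        if ib_prefix is None:
            raise ValueError("+verbs_custom needs the prefix of the ib package")
        return "--with-verbs=%s" % ib_prefix
    if "verbs" in variants:
        # Default behaviour stays unchanged: let configure find verbs itself.
        return "--with-verbs"
    return "--without-verbs"
```

With the prefix coming from packages.yaml, `verbs_configure_option({"verbs_custom"}, "/opt/ofed")` would give "--with-verbs=/opt/ofed".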

Best
       Luigi

Erik Schnetter

Apr 28, 2016, 10:33:37 AM
to luigi calori, Spack
Putting the Infiniband configuration into its own package makes sense.

My question is now: How should we handle the site-specific
configuration that is required for Stampede? I am not affiliated with
TACC, and I want (need?) to provide this configuration as a service to
others who want to build OpenMPI for Stampede.

I could pass "--with-verbs=..." when building OpenMPI on Stampede, but
I don't want to remember this option every time I build there, and I
don't want to teach others to remember it either. The logic is simple
-- "if hostname=Stampede then use this option", and I wonder how we
can package this option in Spack.
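
To make the "if hostname=Stampede" logic concrete, here is a minimal sketch. The function name is hypothetical, and the domain suffix used to recognize Stampede is an assumption on my part; only the two configure flags are the ones I actually tested.

```python
import socket


def site_configure_options(hostname=None):
    """Return extra OpenMPI configure flags for known hosts (illustrative)."""
    if hostname is None:
        hostname = socket.getfqdn()
    # Assumption: Stampede nodes have host names ending in this domain.
    if hostname.endswith(".stampede.tacc.utexas.edu"):
        return ["--with-verbs=/opt/ofed", "--enable-openib-connectx-xrc=no"]
    # Unknown host: add nothing and keep the package defaults.
    return []
```

The open question is where such a table of host-specific settings should live, not how to write it.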

To make things work on a supercomputer, one often has to add
site-specific configuration options, based e.g. on unique hardware
configurations or on policies. (The maximum number of parallel make
jobs is another such setting.)

These are some ideas I have:
- There is a set of site-specific add-ons distributed with Spack
- I maintain a repository of such add-ons for my collaboration
(einsteintoolkit.org)
- There is a set of site-specific repositories, with names based on
domain names, that one can automatically download (e.g.
"spack/site-specific/tacc.utexas.edu")

-erik

Massimiliano Culpo

Apr 28, 2016, 10:50:07 AM
to Spack, luigi....@gmail.com
Just an idea for a slight variation on your approach: 

As far as I can see openmpi already provides the `+verbs` variant. To deal with your use case one could check the environment for a specific variable when verbs is activated ('VERBS_DIR') and add the location if the variable has been found. Something like:
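```
import os

# If VERBS_DIR is set, pass its value as the verbs prefix;
# otherwise let configure search the default locations.
try:
    verbs_dir = os.environ['VERBS_DIR']
    verbs_option = '--with-verbs=%s' % verbs_dir
except KeyError:
    verbs_option = '--with-verbs'
```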

In your site-specific repository you then just need to keep track of which variables should be set in the environment, and what their values should be.

M.

Todd Gamblin

Apr 28, 2016, 11:46:10 AM
to Erik Schnetter, luigi calori, Spack
Erik:

Pretty soon here we're going to merge the new architecture features, and I
hope that will give you a way to auto-detect things like this per
platform. We've talked a bit about it in the telcons but basically Matt's
and my idea is this:

- The newarch branch gives you three fields: platform, os, and target.
  e.g.:
    - platform: BG/Q, Cray
    - os: rhel6, rhel7, mavericks, yosemite, ubuntu14, etc
    - target: haswell, ivybridge, ppc64, etc.

Spack will auto-detect the above stuff based on criteria in the platform
class. For example, on the BG/Q platform, you could auto detect by
noticing /bgsys. You could do something else for a TACC-specific platform.

Now, at the same time Spack has the `packages.yaml` file. Currently
that's either a spack-wide setting in $spack/etc, or it's overridden in
the ~/.spack/packages.yaml file. Neither of these gives you a good way to
export some sensible defaults *and* allow customization.

We are proposing to add a few levels to the configuration, namely:

${spack}/etc/spack/
    defaults/    # spack default config (in git repo)
    site/        # site-specific customizations (NOT in main spack git, overrides defaults)

~/.spack/        # user settings (overrides above two options)

So basically now we would ship a default packages.yaml that is versioned
with Spack, but sites can override with their own file that won't get
clobbered by spack's changes. Users can still override both.

Within *each* of those config directories you could put configs like so:

$config_dir
    packages.yaml
    <platform>/
        packages.yaml
        ...

Where things in <platform> would override packages.yaml per-platform.

That may seem complicated but the idea would be that we could do something
like this:

$spack/etc/spack/
    defaults/
        stampede/
            packages.yaml    # this would contain preferred variants/parameters/overrides for TACC


Obviously you could make that more/less specific -- depends on how general
/opt/ofed is. I think that is actually used on more than just TACC's
systems so it could probably get hoisted up into some higher level
platform file.
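
To illustrate the lookup order I'm describing, here is a small sketch. None of this exists in Spack yet: the function name is made up, and the exact interleaving (each scope's `<platform>/packages.yaml` taking precedence over that same scope's `packages.yaml`, and later scopes taking precedence over earlier ones) is my assumption about how the proposal would be implemented.

```python
import os


def config_search_order(spack_root, home, platform):
    """List candidate packages.yaml paths, lowest precedence first.

    Hypothetical sketch of the proposed layering:
    defaults < site < user, and within each scope the
    platform-specific file overrides the generic one.
    """
    scopes = [
        os.path.join(spack_root, "etc", "spack", "defaults"),
        os.path.join(spack_root, "etc", "spack", "site"),
        os.path.join(home, ".spack"),
    ]
    paths = []
    for scope in scopes:
        paths.append(os.path.join(scope, "packages.yaml"))
        paths.append(os.path.join(scope, platform, "packages.yaml"))
    return paths
```

Settings would be read in that order, with each later file overriding the earlier ones, so a user's file always wins over what the site or Spack itself ships.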

So now you can add your variant and say where it is to be preferred. Or
you could set some parameter value in there -- we only have boolean
parameters right now (variants) but may have string-valued ones within the
year, so that is another thing to consider.

Finally, I have talked a bit with Massimiliano about adding more support for
parameterizing the architecture by network type; this might help as well.

Anyway, I hope that gives you some idea of where I would like to be. We
don't really have a good way to export "hints" or site knowledge like
this now, but I would *like* to make it so that there are sensible
defaults for different platforms. Managing all the places preferences can
come from is actually pretty tough, so feedback on this idea would be
appreciated.

-Todd




