Spack package hints for CMake


Jean-Paul Pelteret

Apr 20, 2016, 2:57:15 AM
to Spack
Hi all,

I'm a new user trying to make the switch from Homebrew/Linuxbrew. First of all, thanks to the developers for this great package manager. I'm quite excited to use it for all of my scientific software needs.

I'm using CMake to help compile my own software, for which I have a number of external dependencies. Since my library has to compile on a couple of different machines running different OSes, I previously created a variable specifying the Homebrew installation directory like this:
execute_process(COMMAND bash -c "brew --prefix" COMMAND tr -d '\n'
  OUTPUT_VARIABLE HOMEBREW_ROOT
)
and then pass HOMEBREW_ROOT as a hint as to where to find the external dependencies.

I recognise that this approach will not work for Spack since it does not symlink or copy installed libraries and headers into a central location. I was hoping that someone might be able to suggest how I should proceed instead. Is spack find -p <package> my best option here? I currently expect to have only a single variant of each of my dependencies, but what considerations should one make if more than one variant of a package happens to be installed?

Many thanks for the help.
J-P

Denis Davydov

Apr 20, 2016, 4:39:38 AM
to Spack
Hi JP,

I think I found it:

spack location -i dealii@8.4.0

Cheers,
Denis.

Jean-Paul Pelteret

Apr 20, 2016, 4:47:37 AM
to Spack
Hi Denis,

Great, thanks a lot. I think that will probably do the trick!

Cheers,
J-P

Elizabeth F

Apr 20, 2016, 7:08:32 AM
to Jean-Paul Pelteret, Spack
Jean-Paul,

I do not recommend using ``spack location -i`` because: (a) it's slow, and (b) setting it up is as much work as writing a ``package.py`` for your software, but not as useful.  I'll outline here the solution I DO recommend.

There are three parts to this: (1) Setting up the CMake build in your software, (2) Writing the Spack Package, and (3) using it from Spack.

Setting Up the CMake Build
---------------------------------------

You should follow standard CMake conventions in setting up your software; your CMake build should NOT depend on or require Spack to build.

See here for an example:

Note that there's one exception here to the rule I mentioned above.  In ``CMakeLists.txt``, I have the following line:
```
include_directories($ENV{CMAKE_TRANSITIVE_INCLUDE_PATH})
```

This is a hook into Spack, and it ensures that all transitive dependencies are included in the include path.  It's not needed if everything is installed in one tree, but it is (sometimes) needed in the Spack world, where each package lives in its own prefix; when running without Spack, it has no effect.

Note that this "feature" is controversial, could break with future versions of GNU ld, and is probably not the best thing to use.  The best practice is to make sure that anything you #include is listed as a dependency in your CMakeLists.txt.

To be more specific: if you #include something from package A and an installed HEADER FILE in A #includes something from package B, then you should also list B as a dependency in your CMake build.  If you depend on A but header files exported by A do NOT #include things from B, then you do NOT need to list B as a dependency, even if linking to A links in libB.so as well.
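
The rule above amounts to a small graph walk over header-level includes. The following is only an illustrative sketch in Python, not something Spack or CMake provides: the function name, the package names and the `header_includes` mapping are all hypothetical, and the mapping is assumed to record only what a package's installed headers #include, not what it links against.

```python
from collections import deque


def packages_to_list(direct_includes, header_includes):
    """Return every package that should be listed as a dependency.

    direct_includes: packages whose headers your own sources #include.
    header_includes: maps a package to the packages that its *installed
        headers* #include (link-only dependencies are deliberately absent).
    """
    needed = set()
    queue = deque(direct_includes)
    while queue:
        pkg = queue.popleft()
        if pkg in needed:
            continue
        needed.add(pkg)
        # Follow header-level includes only; libraries that are merely
        # linked in (e.g. libB.so behind A) never appear in this mapping.
        queue.extend(header_includes.get(pkg, ()))
    return sorted(needed)


# Hypothetical example: A's installed headers include B's headers,
# while C exposes no other package through its headers.
header_includes = {"A": ["B"], "C": []}
print(packages_to_list(["A", "C"], header_includes))
```

In this hypothetical case B has to be listed because A's headers expose it, while anything that C only links against stays out of the list.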

I also recommend that you set up your CMake build to use RPATHs correctly.  Not only is this good practice in general, it also ensures that your package will build the same way with or without ``spack install``.

Writing the Spack Package
---------------------------------------

Now that you have a CMake build, you want to tell Spack how to configure it.  This is done by writing a Spack package for your software.  See here for example:

You need to subclass ``CMakePackage``, as is done in this example.  This enables advanced features of Spack for helping you in configuring your software (keep reading...).  Instead of an ``install()`` method used when subclassing ``Package``, you write ``configure_args()``.  See here for more info on how this works:


NOTE: if your software is not publicly available, you do not need to set the URL or version.  Or you can set up bogus URLs and versions... whatever causes Spack to not crash.


Using it from Spack
--------------------------------

Now that you have a Spack package, you can get Spack to set up your CMake project for you.  Use the following to set up, configure and build your project:
```
cd myproject
spack spconfig myproject@local
mkdir build; cd build
../spconfig.py ..
make
make install
```

Everything here should look pretty familiar from a CMake perspective, except that ``spack spconfig`` creates the file ``spconfig.py``, which calls CMake with arguments appropriate for your Spack configuration.  Think of it as the equivalent of running a bunch of ``spack location -i`` commands.  You will run ``spconfig.py`` instead of running CMake directly.
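
To make the idea of a generated setup script more concrete, here is a minimal Python sketch of a script that calls CMake with pre-resolved dependency locations. It is not the actual output of ``spack spconfig``; the function name, the install prefixes and the choice to pass everything through ``CMAKE_PREFIX_PATH`` are assumptions made for illustration.

```python
def build_cmake_command(dependency_prefixes, install_prefix, extra_args):
    """Compose a cmake command line from already-resolved install prefixes."""
    return [
        "cmake",
        "-DCMAKE_INSTALL_PREFIX:PATH=" + install_prefix,
        # As a CMake variable, CMAKE_PREFIX_PATH is a semicolon-separated list.
        "-DCMAKE_PREFIX_PATH=" + ";".join(dependency_prefixes),
    ] + list(extra_args)


# Hypothetical prefixes; a generated script would contain the paths that
# were actually chosen for this particular build.
prefixes = ["/opt/spack/eigen-3.2.7", "/opt/spack/netcdf-4.4.0"]
cmd = build_cmake_command(prefixes, "/home/me/opt/myproject", [".."])

# Printed for inspection only. To run CMake for real, one could instead use:
#   import subprocess; subprocess.check_call(cmd)
print(" ".join(cmd))
```

The point of the design is that all site-specific knowledge sits in this one script, so the ``CMakeLists.txt`` itself never needs to know about Spack.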

If your project is publicly available (e.g. on GitHub), then you can ALSO use this setup to "just install" a release version without going through the manual configuration/build step.  Just do:

1. Put tag(s) on the version(s) in your GitHub repo you want to be release versions.

2. Set the ``url`` in your ``package.py`` to download a tarball for the appropriate version.  (GitHub will give you a tarball for any version in the repo, if you tickle it the right way).  For example:
```
```
Set up versions as appropriate in your ``package.py``.  (Manually download the tarball and run ``md5sum`` to determine the appropriate checksum for it).

3. Now you should be able to say ``spack install myproject@version`` and things "just work."

NOTE... in order to use the features outlined in this post, you currently need to use the following branch of Spack:


There is a pull request open on this branch ( https://github.com/LLNL/spack/pull/543 ) and we are working to get it integrated into the main ``develop`` branch.


Activating your Software
-------------------------------------

Once you've built your software, you will want to load it up.  You can use ``spack load mypackage@local`` for that in your ``.bashrc``, but that is slow.  Try stuff like the following instead:

The following command will load the Spack-installed packages needed
for basic Python use of IceBin::
```
    module load `spack module find tcl icebin netcdf cm...@3.5.1`
    module load `spack module find --dependencies tcl py-basemap py-giss`
```

You can speed up shell startup by turning these into ``module load`` commands.

1. Cut-n-paste the script ``make_spackenv``::

    #!/bin/sh
    #
    # Generate commands to load the Spack environment

    SPACKENV=$HOME/spackenv.sh

    spack module find --shell tcl git icebin@local ibmisc netcdf cm...@3.5.1 >$SPACKENV
    spack module find --dependencies --shell tcl py-basemap py-giss >>$SPACKENV

2. Add the following to your ``.bashrc`` file::

    source $HOME/spackenv.sh
    # Preferentially use your checked-out Python source
    export PYTHONPATH=$HOME/icebin/pylib:$PYTHONPATH

3. Run ``sh make_spackenv`` whenever your Spack installation changes (including right now).


Giving Back
-------------------

If your software is publicly available, you should submit the ``package.py``  for it as a pull request to the main Spack GitHub project.  This will ensure that anyone can install your software (almost) painlessly with a simple ``spack install`` command.  See here for how that has turned into detailed instructions that have successfully enabled collaborators to install complex software:


Happy CMaking!
-- Elizabeth

Jean-Paul Pelteret

Apr 20, 2016, 7:56:36 AM
to Spack, jppel...@gmail.com
Dear Elizabeth,

Thank you for the very comprehensive response! It's much appreciated, and I'm certain it will serve as a much-used reference for anyone else wanting to use Spack to build their project and contribute it as a package.

Unfortunately, and in retrospect, I wasn't sufficiently clear as to my use case. My intention is to use Spack to manage my project's dependencies only, and the project itself (a private research code) is built completely externally. This is a legacy of using other package managers, and it's not something that I want to change at this point. However, in the future I think the approach that you have presented here is the one that I would like to use.

Since this is the case, am I correct in presuming that step 1 as you've outlined it, namely using $ENV{CMAKE_TRANSITIVE_INCLUDE_PATH}, is not applicable to my intended use? If spack location -i is really not the way to go, are there any better options available to me?

Many thanks again,
Jean-Paul

Elizabeth F

Apr 20, 2016, 1:20:25 PM
to Jean-Paul Pelteret, Spack
On Wed, Apr 20, 2016 at 7:56 AM, Jean-Paul Pelteret <jppel...@gmail.com> wrote:
> Dear Elizabeth,
>
> Thank you for the very comprehensive response! It's much appreciated, and I'm certain it will serve as a much-used reference for anyone else wanting to use Spack to build their project and contribute it as a package.

OK, I added it to a PR:

 


Ben Boeckel

Apr 20, 2016, 2:09:35 PM
to Elizabeth F, Jean-Paul Pelteret, Spack
On Wed, Apr 20, 2016 at 07:08:31 -0400, Elizabeth F wrote:
> Note that there's one exception here to the rule I mentioned above. In
> ``CMakeLists.txt``, I have the following line:
> ```
> include_directories($ENV{CMAKE_TRANSITIVE_INCLUDE_PATH})
> ```
>
> This is a hook into Spack, and it ensures that all transitive dependencies
> are included in the include path. It's not needed if everything is in one
> tree, but it is (sometimes) in the Spack world; when running without Spack,
> it has no effect.

I think naming this SPACK_ rather than CMAKE_ is better. It is, after
all, a (possible) spack construction, not a CMake thing.

> Note that this "feature" is controversial, could break with future versions
> of GNU ld, and probably not the best to use. The best practice is that you
> make sure that anything you #include is listed as a dependency in your
> CMakeLists.txt.
>
> To be more specific: if you #include something from package A and an
> installed HEADER FILE in A #includes something from package B, then you
> should also list B as a dependency in your CMake build. If you depend on A
> but header files exported by A do NOT #include things from B, then you do
> NOT need to list B as a dependency --- even if linking to A links in
> libB.so as well.

This is supposed to be managed by find_package(A). Things like
pkg-config already do this with Requires: (and Requires.private: for
static builds) directives. If I do find_package(A) and B is needed for
whatever reason (headers, wrapping tools, etc.), the FindA.cmake or
AConfig.cmake file is supposed to bring in B and the required include
paths for me and put them into ${A_INCLUDE_DIRS} (for FindA) or the
include interface (for AConfig.cmake).

--Ben

Elizabeth F

Apr 20, 2016, 2:40:44 PM
to Ben Boeckel, Jean-Paul Pelteret, Spack
 
> I think naming this SPACK_ rather than CMAKE_ is better. It is, after
> all, a (possible) spack construction, not a CMake thing.

I called it CMAKE_ because it's a CMake-format semicolon-separated list.  It can be set by anyone (EasyBuild, for example); it is not Spack-specific.
 
> This is supposed to be managed by find_package(A). Things like
> pkg-config already do this with Requires: (and Requires.private: for
> static builds) directives. If I do find_package(A) and B is needed for
> whatever reason (headers, wrapping tools, etc.), the FindA.cmake or
> AConfig.cmake file is supposed to bring in B and the required include
> paths for me and put them into ${A_INCLUDE_DIRS} (for FindA) or the
> include interface (for AConfig.cmake).

Here is where my eternal confusion sets in:

 a) pkg-config: This is not at all universal, so I don't know how much we can rely on it.

 b) AConfig.cmake: Sounds like a more controllable approach.  Sounds like I should remove the transitive includes, find what breaks, and then fix the required FindXXX.cmake files.  If that works, then I agree, the transitive feature I made goes against the grain of CMake and should be removed.

Thank you,
-- Elizabeth

Ben Boeckel

Apr 20, 2016, 3:02:18 PM
to Elizabeth F, Jean-Paul Pelteret, Spack
On Wed, Apr 20, 2016 at 14:40:42 -0400, Elizabeth F wrote:
> I called it CMAKE_ because it's a CMake-format semicolon-separated list.
> It can be set by anyone (EasyBuild, for example), it is not Spack specific.

CMAKE_ variables are usually specified by CMake itself which is why I
think the CMAKE_ prefix is misleading (think of it as a reserved
namespace). Yes, CMake itself has not been strict with this pattern over
the years, but CMake does have a long history.

> a) pkg-config: This is not at all universal, so I don't know how much we
> can rely on it.

No "find me an external dependency" solution is universal :( .

> b) AConfig.cmake: Sounds like a more controllable approach. Sounds like I
> should remove the transitive includes, find what breaks, and then fix the
> required FindXXX.cmake files. If that works, then I agree, the transitive
> feature I made goes against the grain of CMake and should be removed.

CMake projects should provide these files because they know *exactly*
what options were used, what the dependencies are, where the libraries
and include paths live, etc. Larger projects which have large
CMake-using communities (such as Qt5 and protobuf[1]) generate these
files themselves even though they do not use CMake for their own build.

--Ben

[1]Well, CMake is used for Windows building, but not for the *nix build.

Morgan, Ben

Apr 20, 2016, 3:25:36 PM
to ben.b...@kitware.com, Elizabeth F, Jean-Paul Pelteret, Spack

> On 20 Apr 2016, at 20:02, Ben Boeckel <ben.b...@kitware.com> wrote:
>
> On Wed, Apr 20, 2016 at 14:40:42 -0400, Elizabeth F wrote:
>> I called it CMAKE_ because it's a CMake-format semicolon-separated list.
>> It can be set by anyone (EasyBuild, for example), it is not Spack specific.
>
> CMAKE_ variables are usually specified by CMake itself which is why I
> think the CMAKE_ prefix is misleading (think of it as a reserved
> namespace). Yes, CMake itself has not been strict with this pattern over
> the years, but CMake does have a long history.

I also think this should be named SPACK_ rather than CMAKE_, for the same reasons
outlined above.

>
>> a) pkg-config: This is not at all universal, so I don't know how much we
>> can rely on it.
>
> No "find me an external dependency" solution is universal :( .
>
>> b) AConfig.cmake: Sounds like a more controllable approach. Sounds like I
>> should remove the transitive includes, find what breaks, and then fix the
>> required FindXXX.cmake files. If that works, then I agree, the transitive
>> feature I made goes against the grain of CMake and should be removed.
>
> CMake projects should provide these files because they know *exactly*
> what options were used, what the dependencies are, where the libraries
> and include paths live, etc. Larger projects which have large
> CMake-using communities (such as Qt5 and protobuf[1]) generate these
> files themselves even though they do not use CMake for their own build.
>

Assuming the build system of a package finds and reports its dependencies (via pkg-config or CMake/find_package and related config files) correctly and without hard-coding, does the problem then reduce to spack ensuring that any transitive dependencies are added to CMAKE_PREFIX_PATH and PKG_CONFIG_PATH at the install step? I know there are a few PRs in progress that look like they *might* address this, but I'm not really up to date here… (for example #378?)
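
As a rough illustration of what "adding transitive dependencies to CMAKE_PREFIX_PATH and PKG_CONFIG_PATH" could mean in practice, here is a minimal Python sketch. It is not how Spack implements this; the function name and prefixes are hypothetical, and it assumes each package installs its ``.pc`` files under ``<prefix>/lib/pkgconfig``.

```python
import os


def dependency_environment(prefixes, base_env=None):
    """Return a copy of the environment with search paths for each prefix prepended."""
    env = dict(os.environ if base_env is None else base_env)

    def prepend(name, new_entries):
        old = env.get(name, "")
        parts = list(new_entries) + ([old] if old else [])
        # When read from the environment, both variables use the platform
        # path separator (":" on Linux and OS X).
        env[name] = os.pathsep.join(parts)

    prepend("CMAKE_PREFIX_PATH", prefixes)
    # Assumption: .pc files live under <prefix>/lib/pkgconfig.
    prepend("PKG_CONFIG_PATH", [os.path.join(p, "lib", "pkgconfig") for p in prefixes])
    return env


# Hypothetical prefixes for a package A and its transitive dependency B.
env = dependency_environment(["/opt/spack/A", "/opt/spack/B"], base_env={})
print(env["CMAKE_PREFIX_PATH"])
print(env["PKG_CONFIG_PATH"])
```

With an environment like this, find_package(A) and pkg-config can locate both A and B without any hints in the project's own build files, provided A's config or ``.pc`` file reports its dependency on B correctly.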

Cheers,

Ben.


> --Ben
>
> [1]Well, CMake is used for Windows building, but not for the *nix build.
>

Jean-Paul Pelteret

Apr 21, 2016, 5:32:22 AM
to Spack, jppel...@gmail.com
Dear Elizabeth,

Attached is a minimal working example that touches all the relevant aspects of my original question and parallels what I am doing: an external CMake project that uses CMake modules along with find_package to detect the dependencies. This runs on a fresh Ubuntu installation.

The OSes that I need to run this on are Ubuntu and OS X (my colleague has pointed me to this PR, but I'm unfamiliar with how modules work, so I'm not certain whether using them is a possible solution). There are two places where I need to invoke a call like `spack location -i`: (i) on the command line when I first invoke CMake to set up the build, and (ii) inside the CMakeLists.txt to provide hints as to where the dependency is located. The first point is unavoidable, of course, but is there a better way to achieve the desired result for the second point? I'm not opposed to setting all environment variables manually (i.e. root directories for each package) in my .bash_profile if really necessary, but I thought that there might be some more clever way around it.

Many thanks again for your time and patience,
Jean-Paul
Spack_External_CMake_MWE.tar.gz

Elizabeth F

Apr 21, 2016, 9:18:41 AM
to Jean-Paul Pelteret, Spack
Jean-Paul,

Thank you for sharing the code.  After reviewing it, I am increasingly sure that ``spack spconfig`` will be useful for you.

First... I think a little "philosophy" is in order.  In the old days, we had just Makefiles.  To build your project, you just said "make."  I'll call that a 1-stage build process.  Then we got Autotools and now CMake.  To build your project, you say "configure; make".  The extra configure step, while a bit inconvenient, has proven its worth over the years and been generally accepted.  Past approaches, which conflated the configure and build steps into a single build step, have fallen by the wayside.

The use of Spack actually extends that process further, to a three-stage process:
 1. Run Spack to find your dependencies
 2. Configure/CMake
 3. make

This three-stage process corresponds to three things that need to happen:
 1. Set up the build, making the appropriate dependencies available.
 2. Once dependencies have been made available, determine how they will be used to create a build plan.
 3. Build the software. 

Before I had Spack, I would do step 1 manually.  Generally, I would make a small site-specific script that would call CMake, giving it the right hints to find everything (e.g. by setting EIGEN3_ROOT_DIR). For example, here's how I built a particular piece of software in the pre-Spack era, when I used MacPorts to build my prerequisites.  Note the variables FEXCEPTION_ROOT and NETCDF_INCLUDES, which provide CMake hints on where to find things.

```
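#!/bin/sh
# Site-specific CMake setup for a machine whose prerequisites come from MacPorts.
# $1 is the value for RUNDECK; $2 is the path handed to cmake
# (normally the source directory).
# BOOST_ROOT is not set here, so it must come from the calling environment.

MACPORTS=$HOME/macports

export BOOST_INCLUDEDIR=$BOOST_ROOT/include
export BOOST_LIBRARYDIR=$BOOST_ROOT/lib

cmake \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_C_COMPILER=gcc \
-DCMAKE_Fortran_COMPILER=gfortran \
-DCOMPILE_WITH_TRAPS=YES -DCOMPILE_WITH_DEBUG=YES \
-DCMAKE_INSTALL_PREFIX:PATH="$HOME/opt/modele" \
-DNETCDF_INCLUDES="$MACPORTS/include" \
-DUSE_FEXCEPTION=YES -DFEXCEPTION_ROOT="$HOME/opt/fexception" \
-DRUNDECK="$1" \
"$2"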
```

In this paradigm, Step 1 (the setup stage) was manual.  Step 2 was handled by CMake, and then Step 3 by make.

With Spack, Step 1 becomes a little easier.  My first thought was to write a script similar to the one above, with a bunch of ``spack location -i`` calls in it.  That would (almost) work reliably, and would work across different platforms.  So it would have been a big step up from what I was doing before.

But then I realized that writing a bunch of ``spack location -i`` commands is the same amount of work as writing a Spack package, but less functional.  Less functional because:
  (a) It can't be used to easily distribute my software to others
  (b) Unless you have a ``package.py`` file, other software can't depend on your package and be built by Spack.  This leads to an ever-increasing list of code, built by you, that you have to build manually.
  (c) ``spack location`` can fail if you have more than one version of a package installed.
  (d) You don't get the benefits of the Spack concretization algorithm in the setup step.  For example, Spack ensures that all your prerequisites use consistent versions of all THEIR prerequisites (for example, they're all built with the same compiler and use the same MPI).  If you write a bunch of ``spack location`` calls, you lose all such guarantees.
  (e) You have to manually issue the Spack commands to install all the prerequisites listed in your ``spack location`` calls.  If you re-build your Spack repo (or move to a new compiler), you have to issue all those commands again (rather than just ``spack spconfig myproject``, which would automatically rebuild all your prerequisites).

For all these reasons, I made the ``spack spconfig`` command, which essentially writes one of those setup scripts for you in a Spack-consistent manner.  I'm attaching a sample Spack-generated setup script (``spconfig.py``), which replaces the hand-generated script I shared above.

In any case, I think it's important to understand the setup step as a separate step that happens BEFORE the configure step.  Configure/CMake will find the prerequisites needed for your software, but with caveats: (a) they can only find what you've set up for them, and (b) they don't know what to do if more than one version is available.  Increasingly, modern systems have multiple versions of packages installed, which increases the importance of being intentional in the setup step, to make sure the configure step sees the right versions.  The sample you shared conflates the setup and configure steps, which is a bad idea.

Spack automates the setup step and removes a lot of drudgery.  Rather than poking around your filesystem, you can say "spack spec" and see what Spack intends to install.  You can fiddle with your Spack specification until you like what you see.  Then you say ``spack spconfig`` to complete the setup step --- or ``spack install`` to do the setup, configure and build steps all automatically.

Please, let Spack do its thing.  It's not just for installing your prerequisites, it can help you every step of the way in your software development process.

------------------------------------------------

Here are my comments on the sample you shared:


1. **gmp_eigen.patch**: These are good changes, and need to be submitted back to Spack.  Please fork the main Spack repo, create a new branch in your forked repo, and submit them as pull requests.  There's a small learning curve with this, but I believe it's well worth the effort (and there is plenty of help on-line, or one of us can show you how).

Forking is a good idea because (a) it allows you to submit pull requests, (b) it allows you to make "fixes" or trial fixes to the code, (c) it allows you to maintain a stable version on your own branch, suitable for installing your project.  Every time you pull from the main develop branch, you can expect a lot of things to be rebuilt, just based on new package versions people have submitted.

2. Do not clone Spack into ``~/.spack``, since that directory is already used for Spack configurations.  Once you've set up your own fork of Spack, you won't need this clone-and-patch step anyway (you'll just clone from your own fork on GitHub).

3. If you have commands you want to be available in general, you should use ``spack load`` to load the corresponding environment module (instead of aliasing cmake).  Or (since Spack runs slowly)... use Spack once to figure out which modules to use, and then put ``module load`` commands in your ``.bashrc``.  Ubuntu 14.04 provides environment modules as a system package (do ``apt-get install environment-modules``).  Or you can have Spack install them (see "Installing Environment Modules" at https://github.com/LLNL/spack/blob/develop/lib/spack/docs/basic_usage.rst ).

As with GitHub forks... learning environment modules is well worth the effort.  They are really super, and becoming pretty widespread in the HPC world.  They should work on Mac as well as Linux.

4. I repeat... making your CMake build depend on Spack is a bad idea.  Many reasons for this (in no particular order):
   a) ``spack location`` is slow.
   b) It limits who you can share it with
   c) It brings in an unnecessary entanglement
   d) Nobody else does it.
   e) No one on the CMake mailing list will help you if anything doesn't work with your CMake build, even if the problem is unrelated to Spack.
   f) It conflates the setup and configure steps

And last but certainly not least...
   g) ``spack location -i eigen`` will fail as soon as you have more than one version of eigen installed.  Which will happen as soon as someone updates one of Eigen's dependencies, or you build with an alternate compiler, or a zillion other reasons.
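
If a hand-written setup script is kept for the time being, the failure mode in (g) can at least be made explicit, and the lookup kept out of ``CMakeLists.txt``. The following Python sketch wraps ``spack location -i``; the function name and the injectable ``run`` parameter are hypothetical, and it assumes that Spack exits with a non-zero status when the spec is not installed or matches more than one installed package.

```python
import subprocess


def spack_prefix(spec, run=subprocess.run):
    """Return the install prefix Spack reports for `spec`, or raise if it cannot."""
    result = run(
        ["spack", "location", "-i", spec],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        # Assumption: a non-zero exit status covers both "not installed"
        # and "matches more than one installed package".
        raise RuntimeError(
            "could not resolve %r to a single install prefix: %s"
            % (spec, result.stderr.strip())
        )
    return result.stdout.strip()
```

A setup script could call ``spack_prefix("eigen")`` once and pass the result to CMake as a hint. Pinning the spec more tightly (version, compiler) reduces the chance of ambiguity but does not remove the other drawbacks listed above.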

Here is a CMake project that uses Eigen:

I HIGHLY (10 out of 10) recommend you follow the example in that CMake-based project, or something else equivalent.  Once you've done that, I VERY MUCH (8 out of 10) recommend that you give ``spack spconfig`` a try.  It was designed for exactly your situation.  I also recommend you get familiar with environment modules.

-- Elizabeth


spcofnig.py