Hi everyone,
I recently started experimenting with Spack for my software packaging needs, and so far I like it a lot, but there are some things which I do not understand yet:
- Some packages, such as Eigen and Valgrind, have variants which have no obvious effect on the build besides adding dependencies. What do these variants mean?
- Some packages come with many variants enabled by default. Shouldn't minimal builds be the default, with users enabling extra variants as desired?
- The ability of `spack create` to automagically fill in package.py seems limited in the common case of a CMake-based package downloaded via git. Can this be helped?
- Is there a way to avoid repeating the "git=" parameter over and over again in package.py when all versions of a package are available as commits in a git repository?
- Sometimes, running "spack load" right after "spack install" fails, saying that a module does not exist. Re-loading setup-env.sh fixes the issue. What is going on?
Thanks in advance for your explanations! I would be happy to help fix anything on this list which is more of a bug than a feature.
Cheers,
Hadrien
PS: Oh, and an extra one. Many useful software projects exist as a patch to a more established software project which has not been upstreamed yet. For example, Verrou is a CESTAC floating-point error checker which is packaged as a valgrind patch, and Templight is a C++ template bloat debugging tool which is packaged as a clang patch.
What is the expected way of packaging such projects? Should they be packaged as a variant of the upstream project, or as a no-op package which does nothing but depend on a patched version of the upstream project?
Hi Elisabeth,
Many thanks for your explanations! I will expand on specific
points below.
- The ability of `spack create` to automagically fill the package.py seems limited in the common case of a CMake-based package downloaded via git. Can it be helped?
@adamjstewart has put a lot of work into automagic writing of package.py files. But it is limited, because in many cases the required information is only written in English, and we haven't built AI bots yet that can read lengthy "Installation Instructions" pages and convert them to Python.
I am certainly well aware of this problem, but I do not think it is the heart of the issue here.
Most recent C++ packages can be installed using a variant of the following procedure:
- git clone --branch=<release tag> --depth=1 <package repo URL>
- cd <package dir> && mkdir build && cd build
- cmake <config flags> ..
- make
- sudo make install
Using git repos like this is more convenient than using tarballs in several ways. The first one, which is directly relevant to Spack, is that the "git tag" command is all you need in order to enumerate available project releases. There is no need to investigate a project-specific procedure for enumerating release tarballs (some HTTP mirrors let you easily do that by truncating the URL, others... not so easily).
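For illustration, the tag-enumeration step could be sketched like this. This is not Spack code: the helper names are invented, and the sample output in the usage note is made up; only `git ls-remote --tags` itself is a real command.

```python
import re
import subprocess

# Tags that look like releases: an optional "v" followed by dot-separated numbers.
RELEASE_TAG_RE = re.compile(r'^v?\d+(\.\d+)*$')

def parse_ls_remote(output):
    """Extract (commit, tag) pairs from `git ls-remote --tags` output,
    keeping only release-looking tags and skipping peeled ^{} entries."""
    releases = []
    for line in output.splitlines():
        sha, _, ref = line.partition('\t')
        if not ref.startswith('refs/tags/') or ref.endswith('^{}'):
            continue
        tag = ref[len('refs/tags/'):]
        if RELEASE_TAG_RE.match(tag):
            releases.append((sha, tag))
    return releases

def list_release_tags(repo_url):
    """Enumerate release tags of a remote repo without cloning it."""
    result = subprocess.run(['git', 'ls-remote', '--tags', repo_url],
                            capture_output=True, text=True, check=True)
    return parse_ls_remote(result.stdout)
```

For example, given the (made-up) output lines `abc123<TAB>refs/tags/v1.0.0` and `789aaa<TAB>refs/tags/some-feature`, only the `v1.0.0` tag would be kept.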
Another advantage of using git for downloading software releases is that there is a lower barrier to contribution. If you find something broken in a software release that you downloaded via git, all you need to do is cd into the repo, create a development branch, and start committing changes. Once the patch is mature, you fork the official project repo, push your changes, and submit the patch as an MR. This is to be contrasted with tarballs, whose "read-only" nature creates a barrier to contribution, as there is more work to do before one is able to contribute a bugfix (find the project website, then the git repo, clone in a different directory, find the tag equivalent to the tarball...).
For these reasons, and others which are not relevant to the Spack
workflow (no need to fiddle with tar's infamous CLI flags, easy to
update when a new release comes out), I have generally stopped
using tarballs as a mechanism for downloading software source
code, except in cases where I am forced into it by the upstream
project.
Coming from this perspective, I was expecting "spack create <package repo URL>" to do what I would do myself when packaging such a project:
- Clone the git repository into a temp directory
- Enumerate release tags (those which are nothing but a sequence of numbers with an optional "v" in front of them)
- Create a "version" entry for each of these tags, optionally using commit hashes instead of tags for better reproducibility
- Automagically mark the package as a CMakePackage
However, from the remainder of your reply, I understand that the
reason why this does not happen is that tarballs are currently
considered to be the preferred release distribution medium, and
Spack therefore only has limited support for interacting with git
repositories.
- Is there a way to avoid repeating the "git=" parameter over and over again in package.py when all versions of a package are available as commits in a git repository?
In short, no.
The standard way is to download tarballs, with a provided checksum. The checksum ensures files haven't been tampered with since the author wrote the Spack package. Checksums are not in effect when downloading from a symbolic git branch or tag; however, they ARE when downloading from a git hash.
GitHub and others provide a convenient way to download any Git branch, tag or commit as a tarball (which is checksummable).
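Conceptually, the tamper check that a recorded checksum enables can be sketched like this (an illustration only, not Spack's actual implementation):

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Hex digest of the downloaded archive's bytes."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, expected_sha256: str) -> bool:
    """Accept the download only if its checksum matches the one the
    package author recorded when writing the recipe."""
    return sha256_of(data) == expected_sha256
```

Any change to the archive bytes after the recipe was written, however small, makes the verification fail.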
These tarball-based download methods are dependent on the specific software forge in use, and often clunkier than git clone when it comes to enumerating available releases.
For example, if I take the ACTS project on CERN's Gitlab @ https://gitlab.cern.ch/acts/acts-core , I can indeed easily get a URL-based tarball of any branch or tag using URLs like https://gitlab.cern.ch/acts/acts-core/-/archive/master/acts-core-master.tar.gz . But trying to enumerate the available branches or tags using the intuitive URL https://gitlab.cern.ch/acts/acts-core/-/archive/ just leads me to a 404 page, either because Gitlab does not support this or because a sysadmin somewhere disabled the feature.
Contrast with the git-based approach:
This works like a charm on any git-based project which follows the usual release tag naming conventions (which Spack already has built-in support for as it can enumerate tarballs in HTTP mirrors), and enables extra niceties such as being able to easily cd into the git repo and experiment with a patch using standard Git workflows.
In any case, I understand that for now, the answer is that Spack
only provides first-class support for tarball-based downloads. I
only hope that the above discussion will help you understand where
I am coming from, and why I think that Git-based download could
benefit from better integration and ergonomics. As said before, I
am available to contribute improvements in this direction if
someone feels like helping me get started.
- Sometimes, running "spack load" right after "spack install" fails, saying that a module does not exist. Re-loading setup-env.sh fixes the issue. What is going on?
I don't know. But "spack load" is not reliable, since it depends on what else has happened to be installed. I use Spack Environments to avoid these problems, and reserve "spack load" for casual occasional use.
Can you expand a bit on what you mean by "Spack Environments"?
The spack documentation only directly refers to filesystem views
and environment modules, are you referring to the former here?
Many useful software projects exist as a patch to a more established software project which has not been upstreamed yet. For example, Verrou is a CESTAC floating-point error checker which is packaged as a valgrind patch, and Templight is a C++ template bloat debugging tool which is packaged as a clang patch.
What is the expected way of packaging such projects? Should they be packaged as a variant of the upstream project, or as a no-op package which does nothing but depend on a patched version of the upstream project?
Download from a specific git commit out of the upstream repo, and invent a Spack-only version number for that commit. When this "hack" is no longer needed, remove your unofficial "release" from Spack and replace with the official release.
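A minimal sketch of what this suggestion could look like in a package.py; every specific value below (repo URL, version number, commit placeholder) is invented for illustration:

```python
# Hypothetical sketch; all URLs, version numbers and hashes are placeholders.
class Valgrind(AutotoolsPackage):
    git = "<valgrind repo URL>"

    # Official, checksummed releases stay as-is, e.g.:
    # version('3.13.0', sha256='...')

    # Spack-only "release" pinning a specific upstream commit.
    # Remove it once an official release covers the same state.
    version('3.13.0.patched', commit='<full 40-character commit hash>')
```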
I am not sure if I fully understood you here.
To fuel the discussion, here is my current experiment at integrating Verrou into Spack by modifying the parent valgrind package: https://github.com/HadrienG2/spack/compare/e37554f...e4ff076 . Please do not mind the hacks here and there, this is not meant for merging yet.
Am I correct that the approach which you would suggest is to replace my current variant-based approach with one which adds a "verrou-2.0" release to valgrind, and otherwise generally does the same thing?
Cheers,
Hadrien
You received this message because you are subscribed to the Google Groups "Spack" group.
Elisabeth,
In any case, security is a very real issue that must be dealt with. Checksums aren't a magic guarantee; but they are FAR better than nothing. They ensure the version YOU install is the same as the version seen by the original package author. If that was a long time ago and nobody has complained about that version, then you have at least some confidence that it does not contain malicious code. Without checksums, anyone could insert malicious code into any commonly used open source project and you'd never be the wiser. Without the checksum capability, Spack would not be allowed to be used in many places it is used.
An excellent point! Let us discuss a bit the security of tarballs and Git repos, then. Here is my understanding of what can go wrong when downloading a tarball:
And here is my understanding of what can go wrong when downloading from Git via HTTPS:
Do you think I have missed anything important in this analysis?
Most recent C++ packages can be installed using a variant of the following procedure:
- git clone --branch=<release tag> --depth=1 <package repo URL>
- cd <package dir> && mkdir build && cd build
- cmake <config flags> ..
- make
- sudo make install
Except Spack does two very important things that your manual procedure above does not:
1. Check checksums
2. Ensure proper dependencies are installed BEFORE installing this package.
There is no standardized machine-readable way in CMake to specify dependencies. Most of the work writing a Spack package is in getting the dependencies and variants right, not in downloading the source code.
Thanks for clarifying! Now I understand what you meant in your
first e-mail when discussing manual procedures.
I agree that things like dependency specifications cannot be automated. My goal here would be to reach ergonomics parity with source tarballs, where IIUC Spack gives you a pre-filled template where you get for free...
...and you only need to manually fill in...
Packaging is boring work which we want a lot of people to be
doing, so the easier we can make it, the better. I think Spack is
already pretty good in this area, I would just like to make it
even better for my use cases.
except in cases where I am forced into it by the upstream project.
Coming from this perspective, I was expecting "spack create <package repo URL>" to do what I would do myself when packaging such a project:
- Clone the git repository in a temp directory
- Enumerate release tags (those which are nothing but a sequence of numbers with an optional "v" in front of them)
- Create a "version" entry for each of these tags, optionally using commit hashes instead of tag for better reproducibility.
- Automagically mark the package as a CMakePackage
You can add this automagic to `spack create` if you like. Two recommendations:
1. Run any designs past @adamjstewart before implementing them.
2. If you want to use the git download method, generate git downloads that download by hash, not tag. This is equivalent to checksums, ensuring against malicious alteration. In other words, you should generate something like this, based on inspection of the git repo. I promise to object to any system that encourages circumvention of checksums through the use of automagic.
version('1.2.1',
        git='https://github.com/citibeth/icebin.git',
        commit='fj2yh47r2ifisuhkuhf')
Or just generate the appropriate tarball download URL if you're downloading from GitHub, which is 90% of packages these days.
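The hash-pinned generation suggested above could be sketched as follows; this is a hypothetical helper, not part of `spack create`:

```python
def emit_versions(repo_url, releases):
    """Render version() directives pinned by commit hash, one per
    release tag. `releases` is a list of (tag, commit_hash) pairs,
    e.g. obtained from `git ls-remote --tags`."""
    lines = []
    for tag, commit in releases:
        version_number = tag.lstrip('v')  # 'v1.2.1' -> '1.2.1'
        lines.append("    version('%s',\n"
                     "            git='%s',\n"
                     "            commit='%s')" % (version_number, repo_url, commit))
    return '\n'.join(lines)
```

Pinning by commit hash rather than by tag means the generated recipe keeps working (and keeps its integrity guarantee) even if a tag is later moved.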
Thanks for the pointer. Once I have built up a solid idea of what
I want, I'll make a ticket about it on spack/spack and ping @adamjstewart.
Want me to ping you as well?
As long as SHA-1 is trusted, I agree that commit hashes are the best option, as they save us from duplicating the checksumming which git already does internally. As a fallback path for when security researchers are done breaking poor old SHA-1 (I don't expect Git to switch away from it then, because that would break a lot of repos, and the Git team has repeatedly stated that the use of SHA hashes in Git is not meant as a security mechanism), see my comment above regarding git tags.
Many useful software projects exist as a patch to a more established software project which has not been upstreamed yet. For example, Verrou is a CESTAC floating-point error checker which is packaged as a valgrind patch, and Templight is a C++ template bloat debugging tool which is packaged as a clang patch.
What is the expected way of packaging such projects? Should they be packaged as a variant of the upstream project, or as a no-op package which does nothing but depend on a patched version of the upstream project?
Download from a specific git commit out of the upstream repo, and invent a Spack-only version number for that commit. When this "hack" is no longer needed, remove your unofficial "release" from Spack and replace with the official release.
I am not sure if I fully understood you here.
To fuel the discussion, here is my current experiment at integrating Verrou into spack by modifying the parent valgrind package : https://github.com/HadrienG2/spack/compare/e37554f...e4ff076 . Please do not mind the hacks around, this is not meant for merging yet.
OK, this is an exceptional circumstance. 99.9% of packages require only one download. The general idea is 1 repo = 1 tarball = 1 download = 1 Spack package. You are right to use the resource mechanism when more than one download is needed. In that case, I would recommend that verrou be the main download tarball (since this is the verrou package), and that valgrind be the resource that gets downloaded in addition. You will probably need to do some extra work in the package on setting up the stage.
A MUCH better approach would be if verrou's authors could simply maintain verrou as a branch off of valgrind. Git is made for this kind of thing, and it would make everybody's life SO MUCH EASIER --- verrou's life, users' life, Spack's life. That's what I thought you meant by "Many useful software projects exist as a patch to a more established software project which has not been upstreamed yet."
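Under this suggestion, the package skeleton might look roughly like the sketch below. Every URL, version and checksum is a placeholder, and the real staging logic would need more work, as noted above:

```python
# Sketch only: verrou as the main download, valgrind as an extra resource.
# All URLs, versions and checksums below are invented placeholders.
class Verrou(Package):
    url = "https://example.org/verrou-2.0.0.tar.gz"

    version('2.0.0', sha256='...')

    # Valgrind sources staged alongside the main verrou download,
    # ready to be patched during the build.
    resource(name='valgrind',
             url='https://example.org/valgrind-3.13.0.tar.bz2',
             sha256='...',
             placement='valgrind-src')
```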
Thanks for this clarification and advice. I fully agree with you
that this patch-based workflow is brittle and unpleasant, and that
I would like to use it as infrequently as possible. I also agree
that your proposed workflow based on branching off the upstream
git repo is highly preferable when applicable.
I think the reason why verrou and templight did not use this
method is that valgrind and clang historically used SVN for source
control management. Now that valgrind has switched to git, I can
try to convince the verrou devs to switch to a branching workflow.
For clang, everything is still based on SVN, with autogenerated
git mirrors, so this may be a tougher sell (I'm not sure how safe
building work on top of an automatically generated git repo is).
For both projects, since the end result is a patched build of clang or valgrind, I suspect that I will need to mark the verrou package as conflicting with valgrind, and the templight package as conflicting with clang.
Cheers,
Hadrien
Todd,
Is there a way to avoid repeating the "git=" parameter over and over again in package.py when all versions of a package are available as commits in a git repository?
You *should* be able to just put the git = at the top level, as for url:
class MyPackage(Package):
    git = <git URL>
    version('1.2.3', commit='a1b2c3')
Does that work for you?
I'm afraid not. With the following specification...
class ActsCore(CMakePackage):
    # ...description...
    homepage = "http://acts.web.cern.ch/ACTS/"
    url = "https://gitlab.cern.ch/acts/acts-core.git"
    git = "https://gitlab.cern.ch/acts/acts-core.git"
    version('v0.06.00', commit='7358bc8b274c5c474c98c9e44dd796666de11a9d')
    # ...dependencies, configuration, etc...
...I get the following error message when running "spack install acts-core":
==> Error: Unable to parse extension from https://gitlab.cern.ch/acts/acts-core.git.
    If this URL is for a tarball but does not include the file extension
    in the name, you can explicitly declare it with the following syntax:
        version('1.2.3', 'hash', extension='tar.gz')
    If this URL is for a download like a .jar or .whl that does not need to be
    expanded, or an uncompressed installation script, you can tell Spack not to
    expand it with the following syntax:
        version('1.2.3', 'hash', expand=False)
The same specification works if git= is specified in each version() call instead of globally:
class ActsCore(CMakePackage):
    # ...description...
    homepage = "http://acts.web.cern.ch/ACTS/"
    url = "https://gitlab.cern.ch/acts/acts-core.git"
    version('v0.06.00',
            git="https://gitlab.cern.ch/acts/acts-core.git",
            commit='7358bc8b274c5c474c98c9e44dd796666de11a9d')
    # ...dependencies, configuration, etc...
I was a bit suspicious about Spack's attempt to parse the URL, so I wondered whether the top-level "url=" and "git=" were mutually exclusive, and whether I should remove the top-level "url=" when "git=" is used. But this is not the case: commenting out the "url=" results in an unhappy backtrace asking me to bring it back:
==> Error: Class constructor failed for package 'acts-core'.
Caused by: NoURLError: Package ActsCore has no version with a URL.
  File "/home/hadrien/Software/spack/lib/spack/spack/repo.py", line 833, in get
    return package_class(spec)
  File "/home/hadrien/Software/spack/lib/spack/spack/package.py", line 672, in __init__
    f = fs.for_package_version(self, self.version)
  File "/home/hadrien/Software/spack/lib/spack/spack/fetch_strategy.py", line 996, in for_package_version
    attrs['url'] = pkg.url_for_version(version)
  File "/home/hadrien/Software/spack/lib/spack/spack/package.py", line 784, in url_for_version
    raise NoURLError(cls)
Cheers,
Hadrien
(Adding back forgotten mailing list CC)
Elizabeth,
You are right... security in Spack nods in the right direction, but it is half-baked. Shortcomings include:
1. Widespread use of no-longer-secure checksum algos (although better ones are actually preferred, just not used much so far).
2. Lack of a default "secure mode" that refuses to install things without a checksum, or without a secure-enough checksum. Spack might warn, but that is not the same as a hard check.
3. Reliance on Git for security, even though the docs say not to. What if a rogue Git server is running somewhere?
4. New package PRs being submitted without any checksummable version at all.
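The hard check in point 2 could look something like the sketch below; this is a policy illustration, not existing Spack code, and the notion of which hashes count as "secure" is an assumption:

```python
# Assumption: these digest families are considered strong enough.
SECURE_HASHES = {'sha256', 'sha384', 'sha512'}

def check_version_is_secure(version_kwargs):
    """Refuse to proceed unless the version is pinned by a secure
    checksum or a full git commit hash. This is the hard failure that
    a warn-only mode lacks."""
    if any(h in version_kwargs for h in SECURE_HASHES):
        return
    commit = version_kwargs.get('commit', '')
    if len(commit) == 40:  # full commit hash, not an abbreviation
        return
    raise RuntimeError('refusing to install: no secure checksum for this version')
```

A version pinned only by a branch or tag name would be rejected outright, instead of merely producing a warning.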
Would you be interested in spearheading an effort to move Spack in the right direction on these issues?
I'm not sure if I would be the best person for this, for two broad reasons:
- I'm worried that it could blow the limited time budget which I currently have available for playing with Spack at work, with respect to other areas where I think I could make progress more quickly, like packaging my usual software or improving the user experience of creating packages from Git repos.
Packaging is boring work which we want a lot of people to be doing, so the easier we can make it, the better. I think Spack is already pretty good in this area, I would just like to make it even better for my use cases.
Here is where we disagree. And this might have to do with roles: is Spack to be used by users to build their own software? Or by sysadmins to more easily install software in response to user requests? I am the former, but I believe that most of Spack is used by the latter.
In my experience, I only write Spack packages for stuff I need. And what I need doesn't change very much over time. So once I've gotten the packages written, I don't need to write much more. Because this is an infrequent job, it is not worthwhile for me to learn fancy automagic tools to make the job easier. My approach is generally to copy an existing package and fill in the blanks. In general, I write Spack packages less frequently than I file taxes, but more frequently than I renew my driver's license.
@adamjstewart disagrees with me on this issue. He has put a lot of work into nifty automagic tools that I am too busy to learn how to use. If I were a sysadmin routinely writing packages and installing stuff for others, I would probably find them useful.
Maybe a bit of context could help clarify what I am trying to do here.
I'm working in the computing department of a high energy physics (HEP) lab. As you may have heard, the HEP community has historically used various in-house software packaging solutions (CMT, LCGCMake...), but is currently investigating moving to more "standardized" software packaging solutions. Various solutions are investigated in this respect, including for example Nix, Portage and Spack. An incomplete description of this effort is available @ https://hepsoftwarefoundation.org/activities/packaging.html .
After studying the various alternatives which I know about, I am currently convinced that Spack is the most interesting solution that was proposed so far:
Now, these are my opinions, and there are of course many divergent ones around:
As a junior member of the HEP community, I cannot do much about the political part, but I can help Spack become a more interesting technical choice for us, by...
Spack Environments... see
I'm not sure if I fully understood how this is supposed to work. Could you provide a few sample CLI commands so that I can see what the daily interaction would look like? I am particularly interested in what using a package through a Spack environment would look like.
Cheers,
Hadrien
Todd,
This pull request works if I have a "git=", but not a "url=", at the top-level scope:
# url = "https://gitlab.cern.ch/acts/acts-core.git"
git = "https://gitlab.cern.ch/acts/acts-core.git"

version('develop', branch='master')
version('v0.06.00', commit='7358bc8b274c5c474c98c9e44dd796666de11a9d')
If both a URL and a git repository are specified (by uncommenting the "url=" above), then Spack tries to use the URL method for the git commits, which does not end well:
==> Error: Unable to parse extension from https://gitlab.cern.ch/acts/acts-core.git.
    If this URL is for a tarball but does not include the file extension
    in the name, you can explicitly declare it with the following syntax:
        version('1.2.3', 'hash', extension='tar.gz')
    If this URL is for a download like a .jar or .whl that does not need to be
    expanded, or an uncompressed installation script, you can tell Spack not to
    expand it with the following syntax:
        version('1.2.3', 'hash', expand=False)
I think either of the following logics would be better for ergonomics:
- Accept both "url=" and "git=" at top-level scope; use the Git method when commit/tag/branch are specified in the version() call, and the URL method otherwise.
- Do not accept both "url=" and "git=" at top-level scope; error out if they are specified at the same time.
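One such resolution logic, prefer git when a version pins a commit, tag or branch, and fall back to the archive URL otherwise, could be sketched like this (illustrative only, not Spack's actual fetch-strategy code):

```python
def pick_fetch_method(pkg_attrs, version_kwargs):
    """Decide between git and url download for one version() entry.
    `pkg_attrs` holds the package's top-level attributes ('git', 'url');
    `version_kwargs` holds the keyword arguments of one version() call."""
    wants_git = any(k in version_kwargs for k in ('commit', 'tag', 'branch'))
    if wants_git:
        if 'git' not in pkg_attrs:
            raise ValueError("version() pins a git ref but no git= is set")
        return 'git'
    if 'url' in pkg_attrs:
        return 'url'
    raise ValueError("no download method available for this version")
```

With this logic, declaring both "url=" and "git=" at top level is harmless: each version() call unambiguously selects one of the two.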
Cheers,
Hadrien
Honestly, I keep forgetting we still have a Google Group. Do we still need this thing?
Hi Adam,
Okay, one other comment I wanted to make on the secure download methods discussion.
Before I say anything, I want to be up front and say that I know next to nothing about security! With that said, I personally don't feel particularly confident in the ability of checksums to protect us from bad source code. When I add a new version of a package, I ask Spack to calculate a checksum for me and assume it is correct. I have no other means to verify that the checksum I'm adding is safe. Also, package developers keep re-releasing versions on me, so if I want to fix the broken checksum, I have to verify over email with some stranger that they aren't a hacker and that this change is legit. Basically, checksums are better than nothing, but they aren't great? I don't feel like checksums provide us with significantly more security than Git does, but like I said, I don't know anything about security.
Just to clarify this point, a Spack package typically involves at least three persons/roles:
As long as they are cryptographically secure and not allowed to be overwritten by PRs, checksums provide the following useful guarantees:
- A given package version always maps to the same source code, so anything downloaded in place of a known-legitimate version can be detected.
- A package version cannot be silently changed after the fact, so any review performed on it remains tied to a fixed, identified artifact.
The benefit of the first property to a package manager should be obvious. It means that we get reproducible bug reports, for example. It also provides a partial guarantee against software mirror compromises or software authors going rogue, in the sense that if a rogue package is downloaded in place of a known-legitimate software version, Spack is able to detect and report it to end users.
The second property is more subtle. Given the (as of now unchecked) constraint that the package.py maintainers are independent from the software authors, it allows introducing (as of now nonexistent) procedures for reviewing the quality or safety of software versions before allowing them to be merged into the Spack repo. Basically, it allows requesting package security and QA reviews before making them available to end users. We do not do this today, and it would probably be too expensive to introduce that with the manpower that is currently available, but it's nice to be able to introduce such extra security measures in the future if problems ever arise.
In principle, git and hg commit hashes can provide similar guarantees to tarball checksums, because they are a function of the commit's contents and ancestors. AFAIK, a bit of care is required in practice, because version control systems do not necessarily fully check commit integrity during normal downloads (something like git fsck or hg verify is needed). Another problem with the use of crypto hashing algorithms in version control systems is that it is very difficult for a VCS to switch to a new hashing algorithm when the underlying crypto is getting weak (as in the SHA-1 case), because doing so would invalidate every commit identifier in the distributed world.
To conclude, no checksumming will ever prove that software is safe. Due to the halting problem, and the difficulty of defining "security" in code, devising a fully automated security check is likely to be provably impossible. What checksumming (or explicit commit hashes) can do is to allow confidence in a software version to grow over time as more eyes end up looking at the package. This is not foolproof, e.g. we could imagine malware which waits for a signal from a remote server before going rogue, but it is a nice basic guarantee to have.
Cheers,
Hadrien
The second property is more subtle. Given the (as of now unchecked) constraint that the package.py maintainers are independent from the software authors, it allows introducing (as of now nonexistent) procedures for reviewing the quality or safety of software versions before allowing them to be merged into the Spack repo. Basically, it allows requesting package security and QA reviews before making them available to end users. We do not do this today, and it would probably be too expensive to introduce that with the manpower that is currently available, but it's nice to be able to introduce such extra security measures in the future if problems ever arise.
FWIW, this is something we’re considering as we ramp up the infrastructure for hosting binaries.
Elizabeth,
Spack Environments... see
I'm not sure if I fully understood how this is supposed to work. Could you provide a few sample CLI commands so that I can see what the daily interaction would look like? I am particularly interested in what using a package through a Spack environment would look like.
Basically... a Spack Environment is a list of specs that are installed one by one and then loaded as a whole. This has several advantages over not using environments:
1. When you build a Spack Environment, you also get a script that loads the modules built by that environment. No need to use a dozen slow "spack load" commands.
2. "spack load" is non-deterministic, and therefore fundamentally broken. Consider the following:
$ spack install foo@1.1
$ spack load foo      # works
$ spack install foo@1.2
$ spack load foo      # fails
This becomes a problem on a large shared HEP system where many people are simultaneously using many different versions of the same package. Spack Environments isolate you from whatever might be going on in the system OUTSIDE of your environment.
3. If Spack Environments are used exclusively, then they provide a way to garbage-collect installed packages that aren't used anymore.
4. Spack Environments + Spack Setup provide a way to seamlessly use Spack not just to install your dependencies, but also to develop, build and debug your own (CMake-based) package. The idea is that some packages in an environment (of your choosing) will not be installed. Instead, a setup script that calls CMake is generated for that package. It is then your responsibility to run the setup script (instead of your "cmake" command), then type make and make install. Once you do "make install", your package is fully installed and available for use. Spack Setup essentially saves you the work of implementing a package.py for your package twice --- once for Spack and once for development purposes. I rely heavily on it. Spack Setup can also be extended to work with Autotools or other kinds of build systems, if that is needed.
Here you can see the Spack Environment I rely on; it builds and loads about 100 packages. The packages modele-tests, modele, ibmisc, pism and icebin are installed into that environment in "setup" mode because I am actively working on their source code.
Once the environment is generated, I load it with this file: https://github.com/citibeth/spack/blob/efischer/giss/var/spack/environments/twoway-dev-gibbs/loads-x
That is basically a bit of customization around a Spack-generated loads file, which consists of 100 module load commands.
It would also be possible to "render" environments as:
1. A symlink tree
2. A single module file that loads the entire environment, rather than loading 100 individual package modules.
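The "spack load" failure mode described in point 2 boils down to an ambiguous match once several versions of a package are installed. A toy sketch, not Spack internals:

```python
def resolve(spec, installed):
    """Resolve an unqualified spec like 'foo' against installed specs.
    With two installed versions of foo, the match is ambiguous, which
    is why an unqualified `spack load foo` can stop working right
    after a second version is installed."""
    matches = [s for s in installed
               if s == spec or s.startswith(spec + '@')]
    if len(matches) != 1:
        raise LookupError('ambiguous or missing spec %r: %r' % (spec, matches))
    return matches[0]
```

An environment sidesteps the ambiguity because its spec list pins exactly one version of each package.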
Thanks for the explanation. Looks very promising indeed! I hope that as the project matures, someone with in-depth knowledge of this mechanism will be able to give it equal treatment to modules and views in the Spack end user documentation :)
Hadrien