Hashdist and Spack comparison

Ondřej Čertík

Apr 21, 2015, 3:06:36 PM
to hash...@googlegroups.com, tgam...@llnl.gov
Hi,

Todd Gamblin (the author of Spack:
https://github.com/scalability-llnl/spack) and I just met for a few
hours when Todd visited LANL, and we discussed how Spack and Hashdist
work internally. I wanted to share what we found out (Todd, feel free
to correct me if something is inaccurate).

Unlike Conda, EasyBuild and other systems, Spack and Hashdist are
actually extremely similar, and almost all the features that Todd and I
discussed can be done in both systems, one way or another.

Hashdist is very meticulous about hashes, making sure that if two
packages or profiles have the same hash, then they were built
*exactly* the same way (same dependencies, versions, build scripts,
...); in other words, they are identical. Spack is going in this
direction as well --- it has hashes, though they currently don't seem
to always hash everything (some packages only have a name+version, not
a hash yet). I don't understand all the exceptions exactly, but one
can essentially imagine everything as having hashes as well (for the
most part it behaves like that).
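
To make the "same hash means same build" idea concrete, here is a minimal
sketch in Python. It is not Hashdist's actual implementation: the function
name `build_hash`, the choice of SHA-256, and the JSON serialization are all
assumptions made for illustration.

```python
import hashlib
import json


def build_hash(name, version, dependencies, build_script):
    """Return a hex digest covering everything that influences the build.

    `dependencies` maps a dependency name to that dependency's own hash.
    (Hypothetical helper, not part of Hashdist or Spack.)
    """
    payload = {
        "name": name,
        "version": version,
        "dependencies": dependencies,
        "build_script": build_script,
    }
    # sort_keys makes the serialization independent of dict ordering.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


zlib = build_hash("zlib", "1.2.8", {}, "./configure && make install")
same = build_hash("zlib", "1.2.8", {}, "./configure && make install")
patched = build_hash("zlib", "1.2.8", {}, "./configure --static && make install")
```

Here `zlib` and `same` are equal, while `patched` differs even though the
name and version are unchanged, because the build script is part of the
hashed payload.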

Both Hashdist and Spack have a database where all the packages are
installed, each package into its own directory based on a hash (in
Spack's case, the hash is not always there, but you can imagine it is).
You can have several builds of the same package and version that
differ perhaps in just a configure option or a patch that was
applied.

Hashdist has 'hit' and Spack has 'spack'; each is a tool written in
Python that operates on this database. 'spack' is much better at
giving the user useful information about the various packages
and dependencies. I posted some examples here:
https://gist.github.com/certik/16471771e6b6fc29246e

We need to improve 'hit' to show this info as well. But the
information is there in both Hashdist and Spack.

When you install a package into this database, Hashdist uses a profile
that you have to write by hand; that's currently the only way. Spack's
only way is the command line --- essentially you specify (more or less)
the same info, but on the command line. The two are essentially equivalent
--- the Hashdist profile could (after we implement it) be generated
from a command line, and Spack could (if it is implemented) read
the profile from a file. Both tools install all the packages specified
in the profile into the database, each into its own directory based on
its hash.

Hashdist then creates a "profile", which is a Unix-like directory
structure made of symlinks. Spack currently does not do that (though it
does it for the Python package, by linking Python packages like numpy
and scipy into it).
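
A rough sketch of what building such a symlink profile could look like, in
Python. This is only an illustration under stated assumptions: `link_profile`
and `demo` are hypothetical names, the store layout is invented, and the real
'hit' tool handles many cases (conflicts, relocation) that are ignored here.

```python
import os
import tempfile


def link_profile(profile_dir, package_prefixes):
    """Symlink every file under each package prefix into profile_dir."""
    for prefix in package_prefixes:
        for root, _dirs, files in os.walk(prefix):
            rel = os.path.relpath(root, prefix)
            target_dir = os.path.normpath(os.path.join(profile_dir, rel))
            os.makedirs(target_dir, exist_ok=True)
            for filename in files:
                link = os.path.join(target_dir, filename)
                # On a name clash the first package listed wins.
                if not os.path.lexists(link):
                    os.symlink(os.path.join(root, filename), link)


def demo():
    """Create a throwaway store with two packages and link them into a profile."""
    base = tempfile.mkdtemp()
    prefixes = []
    for pkg, relpath in [("zlib-abc123", "lib/libz.so"), ("python-def456", "bin/python")]:
        path = os.path.join(base, "store", pkg, relpath)
        os.makedirs(os.path.dirname(path))
        with open(path, "w") as handle:
            handle.write(pkg)
        prefixes.append(os.path.join(base, "store", pkg))
    profile = os.path.join(base, "profile")
    link_profile(profile, prefixes)
    return profile


profile = demo()
```

After `demo()` runs, `profile/lib/libz.so` and `profile/bin/python` are
symlinks pointing back into the per-package directories, so the profile can be
deleted without touching the store.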

Spack creates "module" files, so you can load packages using modules,
common on HPC machines. I love this, and we should do that.

Besides modules, I am currently unclear how else to use the packages from Spack.

Compiler support is great in Spack --- you can build some of the
dependencies with one compiler and the rest with another. We need to
implement this in Hashdist. Essentially each package knows which
compiler it was built with. The only way that I know of in Hashdist is
to set the PROLOGUE (CC, FC, CXX), but that only allows one
compiler to be used for everything.

Spack creates compiler wrappers for ld, cc, c++, fc, ... and passes
the RPATH options through them (no need for patchelf). Hashdist uses
CXXFLAGS, FCFLAGS, LDFLAGS, and uses patchelf to patch the few
packages whose build system does not allow setting these flags. The
advantage of compiler wrappers is that you have much better control
over how things are compiled and linked, whereas Hashdist has to rely
on the package's build system to do the right thing (which in
many cases it doesn't).
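
The wrapper idea can be sketched in a few lines of Python. This is a
simplified illustration, not Spack's actual wrapper: `wrap_compiler_command`
and the store paths are hypothetical, and a real wrapper would also
distinguish compile-only from link invocations.

```python
def wrap_compiler_command(real_compiler, args, dependency_prefixes):
    """Return the command line a wrapper would run in place of `cc args...`.

    For each dependency prefix, add its lib directory to the link-time
    search path (-L) and to the runtime search path (-Wl,-rpath,...).
    """
    extra = []
    for prefix in dependency_prefixes:
        lib_dir = prefix.rstrip("/") + "/lib"
        extra.append("-L" + lib_dir)
        extra.append("-Wl,-rpath," + lib_dir)
    return [real_compiler] + extra + list(args)


command = wrap_compiler_command(
    "/usr/bin/gcc",
    ["-o", "app", "app.c", "-lz"],
    ["/opt/store/zlib-abc123"],
)
```

Because the flags are injected by the wrapper, the package's own build system
never has to honor LDFLAGS for the RPATH to end up in the binary.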

Spack doesn't cache source files (tarballs, git), but allows you to create
"mirrors", which essentially do that.

Overall, Spack is more polished from the user/admin perspective.
Hashdist has a lot more packages (~170 in Spack vs ~400 in
Hashstack), works better on Mac and Windows, and I think it's better
from a developer (as opposed to admin) perspective (e.g. Hashdist can
create a profile with symlinks, and can also create a throwaway profile
which you can write into and destroy without breaking the database).
Spack has better compiler support. Spack has 'modules' support.

Todd, did I forget anything?

Ondrej

François Bissey

Apr 21, 2015, 9:26:36 PM
to hash...@googlegroups.com, tgam...@llnl.gov
Toyed with spack. It looked like a good fit for a system like the one we
have here, which includes power7 (mostly linux), x86 and a BG/P, plus we
have a related power6 system running AIX.
When I toyed with it in February, I invested some time hacking in proper
support for the IBM XL compiler in a way that didn't suck.
I wouldn't use spack and hashdist the same way and I am not sure which
would be better. With hashdist I would probably produce a whole prefix
with a given feature included and load that as a module.
spack in that regard is possibly more modular, but the line where one is
more useful than the other is not always obvious.

Wish spack had a mailing list of some kind.

Francois

Ondřej Čertík

Apr 21, 2015, 10:58:10 PM
to François Bissey, hash...@googlegroups.com, Todd Gamblin
Where that line falls is not even obvious to Todd and me (we are actually
still discussing the differences, or the lack thereof, off-list). In the
meantime, I am implementing better UI support in 'hit', using the spack
commands (whenever possible), to bring the user interfaces closer together.

>
> Wish spack had a mailing list of some kind.

They recently created a mailing list:

https://github.com/scalability-llnl/spack#mailing-list

Ondrej


Dag Sverre Seljebotn

Apr 22, 2015, 3:26:51 AM
to hash...@googlegroups.com, tgam...@llnl.gov
Thanks for this write-up, it's very useful.

It really feels like Spack is the twin we didn't know about. Even if there
are some differences now, you are right that Hashdist could add Spack's
features and Spack could add Hashdist's features, and the plan ahead for
developing both could be somewhat similar.

There's the packages and different package spec formats, but in this
case it could be possible to have Spack load Hashdist specs or have
Hashdist read Spack specs, so...

Also Hashdist is BSD and Spack is LGPL.

I need to really dig into Spack's source code to make sense of this and
form an opinion and that will take some time. A priori my thoughts are
that in the ideal scenario we'd be able to gracefully merge the projects
while keeping all features and having existing package specs from both
camps working... I'd be very interested in Todd's take on the future.
Also, did any of the Spack authors know about Hashdist and reject it for
whatever reason, or did you learn about it from Ondrej?

Dag Sverre

Todd Gamblin

Apr 25, 2015, 4:01:39 PM
to Ondřej Čertík, hash...@googlegroups.com
Hi all,

Sorry for the delayed response. I've been trying to catch up on email.
Thanks to Ondrej for writing this up and for talking to me while I was at
LANL.

I'll put comments inline below.

On 4/21/15, 1:06 PM, "Ondřej Čertík" <ondrej...@gmail.com> wrote:

>Hi,
>
>Todd Gamblin (the author of Spack:
>https://github.com/scalability-llnl/spack) and I just met for few
>hours when Todd visited LANL and we discussed how Spack and Hashdist
>works inside, and I wanted to share what we found out (Todd, feel free
>to correct me if something is inaccurate).
>
>Unlike Conda, Easybuild and other systems, Spack and Hashdist are
>actually extremely similar and almost all the features that Todd and I
>discussed can be done in both systems, one way or another.
>
>Hashdist is very meticulous about hashes, making sure that if you have
>two packages or profiles that have the same hash, then they are build
>*exactly* the same way (same dependencies, versions, build scripts,
>...), in other words, that they are identical. Spack is going in this
>direction as well --- it has hashes, though they currently don't seem
>to always hash everything (some packages only have a name+version, not
>a hash yet). I don't exactly understand all the exceptions, but one
>can essentially imagine as everything having hashes as well (for the
>most part it behaves like that).

Spack currently hashes *just* the spec DAG. So if you want to draw an
analog with hashdist, you might call it "dagdist", where instead of a
hash, a particular configuration is more rigidly parameterized. The idea
was originally that once you got down to version, compiler, compiler
version, arch, *named* variants, and all that for all dependencies, you've
got the key parts of the space covered, and an admin, application user, or
a student can easily name a package and the constraints they care about.

It was *not* really designed with developers directly in mind, the way
hashdist was, so it currently doesn't hash the entire build script. It's
supposed to be a package manager, not a build system, and I think that's
where a lot of the differences in focus come from. I think Jimmy drew
this distinction a while ago, and I'm sorry I didn't get a chance to
respond back then.
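
As a sketch of the difference (illustrative Python only, not Spack's real
code; `spec_dag_hash` and the dictionary layout are assumptions), hashing
just the spec DAG means the build script does not enter the hash at all:

```python
import hashlib
import json


def spec_dag_hash(spec):
    """Hash only the parameters of a spec and, recursively, of its dependencies.

    The "build_script" entry is deliberately ignored.
    """
    payload = {
        "name": spec["name"],
        "version": spec["version"],
        "compiler": spec["compiler"],
        "arch": spec["arch"],
        "variants": sorted(spec["variants"]),
        "deps": sorted(spec_dag_hash(dep) for dep in spec["deps"]),
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


libelf = {"name": "libelf", "version": "0.8.13", "compiler": "gcc@4.9.2",
          "arch": "linux-x86_64", "variants": [], "deps": [],
          "build_script": "./configure && make install"}
libdwarf = {"name": "libdwarf", "version": "20130729", "compiler": "gcc@4.9.2",
            "arch": "linux-x86_64", "variants": [], "deps": [libelf],
            "build_script": "./configure && make install"}

before = spec_dag_hash(libdwarf)
# Editing a build script leaves the DAG hash untouched...
libelf["build_script"] = "./configure --enable-shared && make install"
after_script_edit = spec_dag_hash(libdwarf)
# ...while changing a dependency's version changes it.
libelf["version"] = "0.8.12"
after_version_edit = spec_dag_hash(libdwarf)
```

With this scheme `before == after_script_edit`, which is exactly the
trade-off under discussion: fewer spurious rebuilds, but no guarantee that
two installs with the same hash were built by the same script.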

I would like to add hashes, at the very least as a way to maintain build
provenance. There is concern at LLNL that exposing this to users generates
a lot of false positives when you decide whether or not to rebuild. E.g.
What if I just add another version descriptor, or I mess with a
conditional that does not affect the current build? There are a lot of
equivalent builds in that space. I will probably end up adding full
hashing because I think it's useful and would've helped me out a number of
times, but I doubt we'd expose it to end users -- we really want them to
be able to just say the name of a package and have it magically appear for
them.

>
>Both Hashdist and Spack have a database where all the packages are
>installed, each package into its own directory based on a hash (in
>Spack case, the hash is not always there, but you can imagine it is).

Yep the hash currently abridges the spec of dependencies in the DAG. We
don't write the hash for packages w/o deps. I think I will change this in
either the next version (0.9) or 1.0.

>You can have more versions of the same package and version, that
>differ perhaps with just a configure option or a patch that was
>applied.

Spack has named variants that are intended to handle things like configure
issues. E.g. You can have miranda+longdouble or something like that. I
intentionally didn't want to expose compiler options to the naming system,
because the idea is to keep that namespace curated and intuitive.

>Hashdist has 'hit', Spack has 'spack', which is a tool written in
>Python that operates on this database. 'spack' is way better in terms
>of providing the user useful information about the various packages
>and dependencies. I posted some examples here:
>https://gist.github.com/certik/16471771e6b6fc29246e
>We need to improve 'hit' to show this info as well. But the
>information is there in both Hashdist and Spack.

"spack find" also reuses the install spec as a query language. E.g. You
could find all the versions of something built with any gcc and any mpich
with "spack find foo %gcc ^mpich". In general in spack, you can refer to
particular points in the configuration space ambiguously, and Spack
tries to resolve them to either a) a concrete spec that must be built or
b) a match with something installed. So if the user installed libelf, he
can keep referring to it as just libelf until he installs another version,
at which point he needs to supply @version.
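
A toy version of using the spec as a query can be sketched in Python. This
handles only a small subset of the syntax (`name`, `@version`, `%compiler`,
`^dependency`) and the function names and record layout are hypothetical; it
is not how "spack find" is actually implemented.

```python
def parse_query(text):
    """Parse a simplified spec string such as 'foo@1.2 %gcc ^mpich'."""
    query = {"name": None, "version": None, "compiler": None, "deps": []}
    for token in text.split():
        if token.startswith("%"):
            query["compiler"] = token[1:]
        elif token.startswith("^"):
            query["deps"].append(token[1:])
        else:
            name, _, version = token.partition("@")
            query["name"] = name
            query["version"] = version or None
    return query


def matches(installed, query):
    """True if an installed record satisfies every constraint the query sets."""
    if installed["name"] != query["name"]:
        return False
    if query["version"] and installed["version"] != query["version"]:
        return False
    if query["compiler"] and installed["compiler"] != query["compiler"]:
        return False
    return all(dep in installed["deps"] for dep in query["deps"])


installed = [
    {"name": "foo", "version": "1.0", "compiler": "gcc", "deps": ["mpich"]},
    {"name": "foo", "version": "1.1", "compiler": "gcc", "deps": ["openmpi"]},
    {"name": "foo", "version": "1.1", "compiler": "intel", "deps": ["mpich"]},
]
query = parse_query("foo %gcc ^mpich")
found = [pkg for pkg in installed if matches(pkg, query)]
```

Anything the query leaves unset matches everything, which is what lets a bare
"foo" stay unambiguous until a second build of foo is installed.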

>
>When you install a package into this database, Hashdist uses a profile
>that you have to write by hand, that's currently the only way. Spack's
>only way is a command line --- essentially you specify (more or less)
>the same info, but on a command line. But it is essentially equivalent
>--- the Hashdist profile could (after we implement it) be generated
>from a command line, and Spack could (if it is implemented) to read
>the profile from a file. Both tools install all the packages specified
>in the profile into the database, each into its own directory based on
>hash.

The focus in Spack has always been on allowing users to specify far less
of this -- basically as much as they want, up to the point they care.
Hashdist's yaml files are nice for getting control, but you can add much
of that to a Spack package.py. Spack is trying to manage constraints from
a lot of corners: the user or admin who requests the install, the site
policy (e.g.: prefer mvapich at LLNL and openmpi at LANL if a specific mpi
is not specified), package constraints (e.g. "this package is only
compatible with boost versions 1.47.0 and 1.55.0"), and others. All that
is merged when you type "spack install" and the result is a build spec
that tries to satisfy all these constraints.

>
>Hashdist then creates a "profile", which is a unix like structure,
>with symlinks. Spack currently does not do that (though it does it for
>the Python package, by linking the python packages like numpy, scipy
>into it).

Yep. I like the profiles, although I see them as focused mainly on a
single user who is the developer, installer, and user. Spack is designed
to install HPC software for everyone on a system, and to manage it across
a large organization. I don't think we're there yet, but that is the
intent. Might also explain why Spack generates modules and dotkits while
hashdist does not.

>
>Spack creates "module" files, so you can load packages using modules,
>common on HPC machines. I love this, and we should do that.

Yay!

>
>Besides modules, I am currently unclear how else to use the packages from
>Spack.

It's currently the only way. I am leaning towards implementing something
like a profile, because I think I'd use it for development, and I think it
would help people construct a stack. The profile essentially helps with
what Spack calls "concretization". If I install a package, other packages
I install after it that depend on it should try to match its version. I
like that a lot.

There is another use case that I think neither tool currently considers,
and that is the use case of a large HPC machine, with lots of teams and
lots of needs. In this scenario, everything is NOT installed or merged
into a common prefix, and users interact with packages through modules.
Packages are installed into unique prefixes that users know about and use
directly, so you have basically a different way of specifying gc roots.
This is what I want to move towards at LLNL for managing software, and I'd
like a good way to specify it. Some type of generalization of the profile
idea seems like a good way to go to me.

>Compiler support is great in Spack --- you can build some of the
>dependencies with one compiler and the rest with another. We need to
>implement this in Hashdist. Essentially each package knows which
>compiler it is build with. The only way that I know of in Hashdist is
>to set the PROLOGUE (CC, FC, CXX), but that only allows to use one
>compiler for everything.
>
>Spack creates compiler wrappers, for ld, cc, c++, fc, ... and passes
>the RPATH options using them (no need for patchelf). Hashdist is using
>CXXFLAGS, FCFLAGS, LDFLAGS, and uses patchelf to patch the few
>packages whose build system does not allow setting these flags. The
>advantage of compiler wrappers is that you have much better control
>over how things are compiled and linked, which Hashdist needs to rely
>on the package's build system, that it does the right thing (which in
>many cases it doesn't).

Yep -- right now I don't think there's a good generalization in either
system of compilers or architectures, particularly for things like
cross-compiling and arch-specific flags. EasyBuild actually gets this
more right, and another LLNL meta-build system called MixDown had some
nice features for cross-cutting flag settings. I think if you get this
right you can accelerate porting to new, exotic machines, and you can
simplify package files a lot. It's on our roadmap for this year, esp. now
that we have a production code team at LLNL relying on Spack.

>Spack doesn't cache source files (tarballs, git), but allows to create
>"mirrors", which essentially do that.

This could be improved, but the reason there's no central cache right now
is so that we can farm out a build to lots of nodes. Caching is done
separately per build so that we can build the dep DAG in parallel.
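
For illustration, building a dependency DAG in parallel amounts to grouping
packages into "waves" whose dependencies are already satisfied. The Python
below is a generic sketch of that idea, not Spack's scheduler; `build_waves`
is a hypothetical name and the package graph is an example.

```python
def build_waves(deps):
    """Group packages into waves; everything in a wave can build in parallel.

    deps maps each package to the set of packages it depends on. Every
    dependency must also appear as a key.
    """
    remaining = {pkg: set(d) for pkg, d in deps.items()}
    waves = []
    while remaining:
        ready = sorted(pkg for pkg, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        for pkg in ready:
            del remaining[pkg]
        for d in remaining.values():
            d.difference_update(ready)
    return waves


waves = build_waves({
    "libelf": set(),
    "mpich": set(),
    "libdwarf": {"libelf"},
    "dyninst": {"libelf", "libdwarf"},
    "mpileaks": {"mpich", "dyninst"},
})
```

In this example libelf and mpich land in the first wave and could be farmed
out to different nodes at the same time.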

>Overall, Spack is more polished from the user/admin perspective.
>Hashdist has a lot more packages (i.e. ~170 in Spack vs ~400 in
>Hashstack), works better on Mac and Windows, and I think it's better
>from a developer (as opposed to admin) perspective (e.g. Hashdist can
>create a profile with symlinks, and can also do a throw away profile
>which you can write into and destroy, without breaking the database).
>Spack has better compiler support. Spack has 'modules' support.
>
>Todd, did I forget anything?

Seems like a good summary to me. I think one thing that may not come out
above is Spack's "concretization" process and how that relates to
hashdist. The output of your gist shows the results of the process:

https://gist.github.com/certik/16471771e6b6fc29246e


Top is the user's input spec, next is the normalized version, and bottom
is *after* concretization is run on the spec, which substitutes in virtual
deps, resolves constraints according to policies, and repeats that until
the spec is no longer ambiguous. I am not 100% clear on hashdist's analog
to that process.
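
A heavily simplified sketch of such a fixed-point loop, in Python. It is an
illustration only: `concretize`, the flat name-to-version spec, and the
provider and default tables are assumptions, and the real process works on a
full DAG with version ranges, compilers and variants.

```python
def concretize(spec, providers, site_preferences, default_versions):
    """Repeat substitution steps until the spec stops changing.

    spec maps a package name to a version string, or None if unconstrained.
    """
    spec = dict(spec)
    changed = True
    while changed:
        changed = False
        for name in list(spec):
            if name in providers:
                # Virtual dependency: swap in the provider the site prefers.
                choice = site_preferences.get(name, providers[name][0])
                version = spec.pop(name)
                spec.setdefault(choice, version)
                changed = True
            elif spec[name] is None:
                # No version requested: fall back to the default.
                spec[name] = default_versions[name]
                changed = True
    return spec


result = concretize(
    {"mpileaks": None, "mpi": None, "libelf": "0.8.13"},
    providers={"mpi": ["mpich", "mvapich", "openmpi"]},
    site_preferences={"mpi": "mvapich"},
    default_versions={"mpileaks": "1.0", "mvapich": "1.9"},
)
```

The virtual "mpi" entry is replaced by the site's preferred provider, and
every remaining unconstrained entry receives a concrete version, so nothing
ambiguous is left in `result`.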

-Todd

>
>Ondrej

Todd Gamblin

Apr 25, 2015, 4:03:55 PM
to François Bissey, hash...@googlegroups.com
On 4/21/15, 7:26 PM, "François Bissey" <francoi...@canterbury.ac.nz>
wrote:
>>
>Toyed with spack. Looked like a good fit for a system like we have here
>that include power7 (mostly linux), x86 and a BG/P, plus we have a
>related power6 system running AIX.
>Toyed with it in February, I invested some time hacking in proper
>support for IBM XL compiler in a way that didn't suck.
>I wouldn't use spack and hashdist the same way and I am not sure which
>would be better. With hashdist I would probably to produce whole prefix
>with given feature included and load that as a module.
>spack in that regard is possibly more modular but the line where one is
>more useful than the other is not always obvious.
>
>Wish spack had a mailing list of some kind.

Thanks for trying Spack! Can you point me at your XL support efforts, or
let me know how far they got? I would be very interested in this, as
we're working on BG/Q right now and good support is on LLNL's roadmap for
2015.

Feel free to do that on the Spack mailing list if it's off topic here.

-Todd


Todd Gamblin

Apr 25, 2015, 4:38:46 PM
to Dag Sverre Seljebotn, hash...@googlegroups.com
Hi Dag,

On 4/22/15, 1:26 AM, "Dag Sverre Seljebotn" <d.s.se...@astro.uio.no>
wrote:

>Thanks for this write-up, it's very useful.
>
>Really feels like Spack is the twin we didn't know about. Even if there
>are some differences now you are right that Hashdist could add Spack's
>features and Spack add Hashdist's features, and the plan ahead for
>developing both could be somewhat similar.

I can definitely see the merit to keeping an eye on both projects.

>There's the packages and different package spec formats, but in this
>case it could be possible to have Spack load Hashdist specs or have
>Hashdist read Spack specs, so...
>
>Also Hashdist is BSD and Spack is LGPL.

I talked to Ondrej about this. I'm not particularly wed to the license
and would consider changing it. For context, we've actually had an
*easier* time dealing with some companies when using LGPL. With LGPL,
they're required to contribute, so they do not have to ask their lawyers
for permission. So it is easier to collaborate. With BSD, the lawyers
have had to assess whether or not to contribute, which has taken forever.
Generally with BSD we have to get a larger collaborative agreement in
place and establish ownership down to very minute details.

>I need to really dig into Spack's source code to make sense of this and
>form an opinion and that will take some time. A priori my thoughts are
>that in the ideal scenario we'd be able to gracefully merge the projects
>while keeping all features and having existing package specs from both
>camps working... I'd be very interested in Todd's take on the future.

Right now I have a bunch of roadmap items for 2015, including Spack DSL
extensions, better/faster spec handling, better compiler handling, BG/Q
support and Cray support. Oh and lots of bug fixes. We've just had an
LLNL production code team decide to use Spack for their software stack, so
we'll be working closely with them to get things working on a certain
large Xeon Phi system soon to appear near Ondrej. I think it's premature
to talk about a merge but I do think the projects are going in similar
directions and would like to see how things develop.

>
>Also, did any of the Spack authors know about Hashdist and reject it for
>whatever reason, or did you learn about it from Ondrej?

I found out about hashdist and also EasyBuild 9 months to a year after I
started working on Spack. All three projects sprang up within the same
3-year window (I guess EB is older but was not public for a while), and no
project's advertising had really ramped up to google-able levels in early
2013. So, I spent a lot of time searching for HPC package managers but
did not find anything.

By that time, I was already far enough along with the spec syntax and DAG
manipulation and I wanted to keep pursuing those ideas. I looked at the
hashdist docs, and I saw a presentation by Andy Terrell that gave me the
impression that it was primarily for developers. I wanted something for
users -- my motivation has always been to get more people using
performance tools (my research area), which are hard to build and have way
too high a barrier to entry. I think that limits the impact of tools.

I also looked at Nix and really liked it (seems we both started there),
but at that time it was an OS package manager and didn't do much for
people in user space. I still think that Nix expressions need something
like high level specs and database query features to make them usable on
big HPC machines. So, that's where Spack came from.

It seems like there are a healthy number of projects dealing with this
problem now: hashdist, EB, spack, and also smithy at ORNL, and I think
hashdist and spack get the dep model right. There are also tools like
Lmod doing interesting things on the UI side. I'm not sure options are a
bad thing, but I can see the merits of convergence in this space.
Ensuring that all those communities are aware of each other would be the
first step in that direction... I'm trying to do that across large DOE
computing sites (ORNL, ANL, NERSC, LLNL, LANL, Sandia). Maybe you guys
could agitate within the Python, DOD, and NSF (TACC) communities? We're
also organizing a workshop: http://hust15.github.io. It'll be at SC15 in
Austin. Maybe a hashdist paper is in order?

-Todd

Ondřej Čertík

Apr 27, 2015, 1:45:29 PM
to Todd Gamblin, hash...@googlegroups.com
Hi Todd,

You should still consider hashing everything under the hood, and build
the rest on top of it. I think it is possible, while still providing
the nice and easy syntax of Spack.

It's not a perfect analogy, but I think in your case, it would be like
hashing only the Subversion commit numbers (and merge commits, i.e.
the DAG), not the contents of the commits (i.e. the patches)
themselves. Hashdist, like git, hashes everything, so
there is no such thing as just a commit number (e.g. a package
version). The only thing is the hash. However, on top of this, git
does implement an easy way to name commits (tags) or branches (which are
like moving tags, I guess), etc. And for the end user, for the most
part you don't have to worry about any hashes at all.

In git, if you keep the merge history, but change the contents
(=patch/diff) of some commit, the hash changes and all hashes that
depend on it change. The same happens in Hashdist if you change the build
script, perhaps by adding or removing the "--enable-sse2" flag in the
FFTW package configure script. You can (and should) still export it as
some kind of a variable (that the user can set from the command line
or a profile), but obviously you cannot parametrize every such
possible change.

This is not just "for developers". This is a fundamental feature,
ultimately affecting everybody. Without it, you can't really robustly
change flags like "--enable-sse2" or other things in your packages ---
you only rely on the "named variants" that you assign to it, but once
the stack builds, you will never know if you have the same stack, or
not. With Hashdist you can always tell.
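
The git-like propagation can be sketched as follows in Python. This is an
illustration under assumptions (`package_hash`, `stack_hashes` and the
three-package stack are made up), not the actual Hashdist hashing code.

```python
import hashlib


def package_hash(name, build_script, dep_hashes):
    """Hash a package from its build script and the hashes of its dependencies."""
    digest = hashlib.sha256()
    for part in [name, build_script] + sorted(dep_hashes):
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent fields cannot run together
    return digest.hexdigest()


def stack_hashes(fftw_script):
    """Return hashes for a tiny three-package stack; only FFTW's script varies."""
    zlib = package_hash("zlib", "./configure", [])
    fftw = package_hash("fftw", fftw_script, [])
    app = package_hash("app", "make install", [zlib, fftw])
    return {"zlib": zlib, "fftw": fftw, "app": app}


plain = stack_hashes("./configure")
sse2 = stack_hashes("./configure --enable-sse2")
```

Toggling the flag changes the hash of fftw and of app, which depends on it,
while zlib keeps its hash and does not need to be rebuilt.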

>
> I would like to add hashes, at the very least as a way to maintain build
> provenance. There is concern at LLNL that exposing this to users generates
> a lot of false positives when you decide whether or not to rebuild. E.g.
> What if I just add another version descriptor, or I mess with a
> conditional that does not affect the current build?

For example if you add an "if statement" for let's say some Cray
machine, that does not affect your current linux build?
Hashdist always takes the rules (in yaml) and creates a Bash file,
that is executed (and hashed). So any such conditionals (or adding a
new version, that is not used in your current profile) do not affect
the bash file, thus the hash stays the same and no rebuild occurs.
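
A small Python sketch of why unused conditionals do not trigger rebuilds when
the rendered script is what gets hashed. The rule format, `render_script`,
`script_hash` and the `$PREFIX` variable are hypothetical stand-ins, not
Hashdist's real yaml schema or variables.

```python
import hashlib


def render_script(rules, platform):
    """Keep only the commands whose condition applies to this platform."""
    lines = [cmd for condition, cmd in rules if condition in (None, platform)]
    return "\n".join(lines)


def script_hash(rules, platform):
    """Hash the rendered script, not the rules it was generated from."""
    rendered = render_script(rules, platform)
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


rules = [
    (None, "./configure --prefix=$PREFIX"),
    (None, "make install"),
]
# Adding a Cray-only branch changes the rules but not the Linux script.
rules_with_cray = rules[:1] + [("cray", "module load PrgEnv-gnu")] + rules[1:]

linux_before = script_hash(rules, "linux")
linux_after = script_hash(rules_with_cray, "linux")
cray_after = script_hash(rules_with_cray, "cray")
```

Here `linux_before == linux_after`, so the Linux build is left alone, while
the Cray build gets its own distinct hash.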

I think after we implement the following high priority issues,
Hashdist should come much closer to what Spack is doing, and actually
support exactly the same syntax:

https://github.com/hashdist/hashdist/issues/329
https://github.com/hashdist/hashdist/issues/328
https://github.com/hashdist/hashdist/issues/327

>
>>
>>Hashdist then creates a "profile", which is a unix like structure,
>>with symlinks. Spack currently does not do that (though it does it for
>>the Python package, by linking the python packages like numpy, scipy
>>into it).
>
> Yep. I like the profiles, although I see them as focused mainly on a
> single user who is the developer, installer, and user. Spack is designed
> to install HPC software for everyone on a system, and to manage it across
> a large organization. I don't think we're there yet, but that is the
> intent. Might also explain why Spack generates modules and dotkits while
> hashdist does not.

We plan to fix it:

https://github.com/hashdist/hashdist/issues/330

>
>>
>>Spack creates "module" files, so you can load packages using modules,
>>common on HPC machines. I love this, and we should do that.
>
> Yay!
>
>>
>>Besides modules, I am currently unclear how else to use the packages from
>>Spack.
>
> It's currently the only way. I am leaning towards implementing something
> like a profile, because I think I'd use it for development, and I think it
> would help people construct a stack. The profile essentially helps with
> what Spack calls "concretization". If I install a package, other packages
> I install after it that depend on it should try to match its version. I
> like that a lot.
>
> There is another use case that I think neither tool currently considers,
> and that is the use case of a large HPC machine, with lots of teams and
> lots of needs. In this scenario, everything is NOT installed or merged
> into a common prefix, and users interact with packages through modules.
> Packages are installed into unique prefixes that users know about and use
> directly, so you have basically a different way of specifying gc roots.
> This is what I want to move towards at LLNL for managing software, and I'd
> like a good way to specify it. Some type of generalization of the profile
> idea seems like a good way to go to me.

We are trying to fix this as well, so that you can have a system-wide
prefix as well as user-installed ones, with Hashdist able to
use both; see e.g. our progress here:

https://github.com/hashdist/hashdist/pull/314#issuecomment-96474231

I think our profile is the unambiguous concretized spec in Spack, i.e.
our yaml files are templated and require you to set some of the
variables (they all have defaults if you don't set them) -- so in the
profile you specify packages that you want, by name, and if you don't
specify anything else, Hashdist uses the default values of variables
like version (currently only implemented for Python, but we will
extend this for other packages), debug vs release, and other things.

After implementing this:

https://github.com/hashdist/hashdist/issues/329

We can then support the same input spec as Spack does, and concretize
it before writing a profile (the profile doesn't need to be saved to
disk, so it can be all on the fly).

A few more comments on your reply to Dag:

> On 4/22/15, 1:26 AM, "Dag Sverre Seljebotn" <d.s.se...@astro.uio.no>
> wrote:
>
>>Thanks for this write-up, it's very useful.
>>
>>Really feels like Spack is the twin we didn't know about. Even if there
>>are some differences now you are right that Hashdist could add Spack's
>>features and Spack add Hashdist's features, and the plan ahead for
>>developing both could be somewhat similar.
>
> I can definitely see the merit to keeping an eye on both projects.
>
>>There's the packages and different package spec formats, but in this
>>case it could be possible to have Spack load Hashdist specs or have
>>Hashdist read Spack specs, so...
>>
>>Also Hashdist is BSD and Spack is LGPL.
>
> I talked to Ondrej about this. I'm not particularly wed to the license
> and would consider changing it. For context, we've actually had an
> *easier* time dealing with some companies when using LGPL. With LGPL,
> they're required to contribute, so they do not have to ask their lawyers
> for permission. So it is easier to collaborate. With BSD, the lawyers
> have had to assess whether or not to contribute, which has taken forever.
> Generally with BSD we have to get a larger collaborative agreement in
> place and establish ownership down to very minute details.

I am surprised they do not need to ask their lawyers for permission. I
thought by default you always have to ask for permission from the
given corporation to release your patches.
In any case, once lawyers are involved, BSD is a lot simpler license than LGPL.

Hashdist is listed among options for Python packaging:

https://packaging.python.org/en/latest/projects.html#non-pypa-projects

I should introduce it more at LANL and do some tutorials.

I think it is good that there is more than one tool and more people
are trying to fix this. I think the motivations behind Spack and
Hashdist are almost identical, so I think it'd be great if we can
figure out how to join forces. At the very least, we'll try to
support the same syntax. Thanks for discussing it with us, e.g. here:

https://github.com/hashdist/hashdist/pull/325#issuecomment-95680958

> also organizing a workshop: http://hust15.github.io. It'll be at SC15 in
> Austin. Maybe a hashdist paper is in order?

Is the paper published (if accepted) even if we can't make it to the
conference? The deadline is August 7, correct?

>
> -Todd

Ondrej

Todd Gamblin

Apr 27, 2015, 2:08:18 PM
to Ondřej Čertík, hash...@googlegroups.com
Hi Ondrej,

All the below sounds good to me. I pretty much agree with you on hashing,
too.

RE: HUST15:
>>also organizing a workshop: http://hust15.github.io. It'll be at SC15 in
>>Austin. Maybe a hashdist paper is in order?
>
>The paper is published (if accepted) even if we can't make it to the
>conference? The deadline is August 7, correct?


It is, although it would be good if *someone* could come and present it.
That's a requirement. That doesn't mean someone has to come for all of
SC; just the HUST half-day workshop.

-Todd

Dag Sverre Seljebotn

Apr 27, 2015, 2:23:05 PM
to hash...@googlegroups.com
On 04/27/2015 07:45 PM, Ondřej Čertík wrote:
> Hi Todd,
>
>>> There's the packages and different package spec formats, but in this
>>> case it could be possible to have Spack load Hashdist specs or have
>>> Hashdist read Spack specs, so...
>>>
>>> Also Hashdist is BSD and Spack is LGPL.
>>
>> I talked to Ondrej about this. I'm not particularly wed to the license
>> and would consider changing it. For context, we've actually had an
>> *easier* time dealing with some companies when using LGPL. With LGPL,
>> they're required to contribute, so they do not have to ask their lawyers
>> for permission. So it is easier to collaborate. With BSD, the lawyers
>> have had to assess whether or not to contribute, which has taken forever.
>> Generally with BSD we have to get a larger collaborative agreement in
>> place and establish ownership down to very minute details.
>
> I am surprised they do not need to ask they lawyers for permission. I
> thought by default you always have to ask for permission from the
> given corporation to release your patches.
> In any case, once lawyers are involved, BSD is a lot simpler license than LGPL.

IANAL, mind, but my understanding is that the people Todd spoke to
misunderstood (L)GPL anyway. LGPL only comes into effect if they
redistribute a modified Spack to other companies/their customers -- then
they must also redistribute the source code of their modifications. For
internal use they are under no obligation. Right?

Though I guess it's great that those people misunderstood the LGPL :)

I think the main reason Hashdist is BSD is to stay in line with the
scientific Python ecosystem.

Dag Sverre

Ondřej Čertík

Apr 27, 2015, 3:08:26 PM
to Dag Sverre Seljebotn, hash...@googlegroups.com
Right, use inside the company is not distribution:

https://www.gnu.org/licenses/gpl-faq.html#InternalDistribution

"As a consequence, a company or other organization can develop a
modified version and install that version through its own facilities,
without giving the staff permission to release that modified version
to outsiders."

So they are *not* required to contribute. If they did anyway, that's
great, but as you said, I think that's because their lawyers
misunderstood the license. Once they understand it, it is no different
in this respect from BSD.

Ondrej
