[...] the development profile in Hashdist
allows installing custom things into it, and Conda handles it in a similar
way. Neither allows modifying the installed locations of the
individual packages, as far as I know.
--
You received this message because you are subscribed to the Google Groups "hashdist" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hashdist+u...@googlegroups.com.
To post to this group, send email to hash...@googlegroups.com.
Visit this group at https://groups.google.com/group/hashdist.
To view this discussion on the web visit https://groups.google.com/d/msgid/hashdist/8f813370-2c70-4be3-8b72-432d30a5342f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
What drove the development of hashdist was the need for customized, reproducible builds of scientific Python stacks, with many non-Python dependencies and specifically vendor/host dependencies. In particular, supporting those builds in a platform-independent way across machines in a _slightly_ challenging HPC environment. That need still exists and hasn’t been addressed well by other tools that I know of.
The question I have though is how to resolve dependencies.
To get the manpower, we need to become a successful open source project with lots of users and community.
Hi Michael,
On Wed, Mar 1, 2017 at 11:52 AM, Michael Sarahan <msar...@gmail.com> wrote:
> Greetings all from the Conda team. I'm the conda-build guy right now. I
> think it's worth adding that conda-build has just added a (massive) PR that
> brings it much more in line with what HashDist does (I think):
> https://github.com/conda/conda-build/pull/1585
Thanks for the email. Is your job what Aaron Meurer was doing before?
>
> Every conda package will now also have an associated hash. Your version
> number can be a hash itself, but I'd only actually recommend that if your
> package is actually versioned using a hash - let conda-build use its hash
> for uniqueness, and perhaps store HashDist's hash in the recipe metadata for
> future reference (the extra section would be good for this). Dependencies
> can be pinned exactly, or allowed to vary based on version numbering. I
> think to start, hashdist would produce conda recipes that are exactly
> pinned.
That looks good, it's great that Conda got support for using hashes as well.
What do you think should be our first step? I was thinking the following:
* take some simple hashdist profile with few (say 3) packages
* make hashdist generate the directory structure+Conda specs for these
3 packages
* call conda-build to build from it.
How do you make conda-build handle dependencies between source
packages? I read this:
https://conda.io/docs/building/recipe.html
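A rough sketch of the first step proposed above, assuming a made-up profile layout: the dict format, the placeholder hashes, and the `hashdist_hash` key are all hypothetical and are not HashDist's real data model. It only relies on the `package`, `requirements` (`build`/`run`) and `extra` sections of a conda-build `meta.yaml`. Dependencies between packages are expressed by listing the exactly pinned dependency under `requirements`, and the HashDist hash is stored under `extra` as suggested in the quoted message.

```python
# Hypothetical sketch: render exactly pinned conda-build recipes from a
# HashDist-like profile. The profile layout and hashes are illustrative only.

def render_recipe(pkg, resolved):
    """Return meta.yaml text for one package.

    pkg      -- dict with 'name', 'version', 'hash', 'deps' (list of names)
    resolved -- dict mapping every package name to its pkg dict
    """
    # Pin every dependency to the exact version the profile resolved.
    pins = ["    - {} =={}".format(dep, resolved[dep]["version"])
            for dep in pkg["deps"]]
    lines = [
        "package:",
        "  name: {}".format(pkg["name"]),
        '  version: "{}"'.format(pkg["version"]),
    ]
    if pins:
        # The same pins are used at build time and at run time.
        lines += ["requirements:", "  build:"] + pins + ["  run:"] + pins
    # Keep the HashDist hash for future reference; the key name is made up.
    lines += ["extra:", "  hashdist_hash: {}".format(pkg["hash"])]
    return "\n".join(lines) + "\n"


# A three-package profile with placeholder hashes.
profile = {
    "zlib": {"name": "zlib", "version": "1.2.8",
             "hash": "aaaa1111", "deps": []},
    "libpng": {"name": "libpng", "version": "1.6.27",
               "hash": "bbbb2222", "deps": ["zlib"]},
    "freetype": {"name": "freetype", "version": "2.7",
                 "hash": "cccc3333", "deps": ["zlib", "libpng"]},
}

recipes = {name: render_recipe(pkg, profile) for name, pkg in profile.items()}
print(recipes["freetype"])
```

Each rendered string would be written to `<package>/meta.yaml` in its own directory before running conda-build on that directory. Build scripts are left out of this sketch.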
On Mar 1, 2017, at 12:16 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
Hi Todd,
On Wed, Mar 1, 2017 at 12:48 PM, Gamblin, Todd <gamb...@llnl.gov> wrote:
Spack *has* an open source community, and it’s focused on HPC. We have
managed to convince most of the large HPC centers in DOE to get on board,
and many are contributing. We are also getting contributions from many HPC
centers in academia and outside the US (CERN, Fermi, NASA-GISS, EPFL, BSC,
etc.) See slide 23 here for how contributions have grown, and from where:
https://spack.io/slides/Spack-ECP-Exascale-Package-Manager.pdf
I saw your presentation at ECP in Knoxville. Great presentation, and
from talking to people there, I think that you made it as an open
source project: I feel you have got the minimal community and the ball
rolling, as well as enough HPC users and companies using Spack. Sorry I
didn't manage to find you at the conference to talk.
So my question is, what would we need to do or add to get the hashdist folks
on board with Spack? Or are you sold on the hashdist/conda combination?
I’m not opposed to adding things to Spack to suit the needs of other
communities, and we would love to have the many smart folks on this list as
contributors/peers on a shared project. I think we could make a better
package manager that way.
So this is just my own motivation, others might have different ones:
* I want both HPC *and* desktop development and I want to use the same
tool. Conda works on Linux, Mac and Windows, as well as HPC. My
understanding is that Spack doesn't work on Windows, and I don't know
how important it is to support end-user usage of Spack, the way
Conda does.
* I don't like your license; I wish you would switch to BSD or MIT (I
think I mentioned it a couple of times, both in person and on a
mailing list)
I need to test Spack again to see if it works for me. I'll write more
after I play with the latest Spack. It has improved a lot since the
last time I played with it.
Ondrej
I'll answer the rest later. I just want to say thanks for your
willingness to work with us.
I can answer this one quickly --- if I like your Paraview build, and
want to copy how you do it into Hashdist, I would have to change the
license of the Hashdist paraview spec file to LGPL. If, on the other
hand, you used BSD or MIT, all we would have to do is copy Spack's
BSD or MIT license somewhere in the Hashdist documentation, and we could
freely copy any code we want. The same applies if I wanted to use Spack for
some Conda packages, or if I implemented some feature in Spack
and then wanted to use the same code in Hashdist. And so on: there are
endless combinations where I would be forced to switch the given file
to LGPL.
I *believe* for this particular case, you’d be ok, as to put it in
hashdist you’d convert it from Python to yaml/shell anyway, which I
don’t think falls under copyright, as you’ve rewritten what we did in
Spack at that point. It would be hard to say that translating a build
script constitutes copying some part of Spack — remember this is
copyright, not patents. IANAL, IANYL, TINLA, etc.
But I can relate, as I had a very similar discussion with the easybuild
guys about their code being GPL, so we can’t reuse their builds without
having to release Spack as GPL :).
I think a big part of it is simply that with BSD or MIT, you don't have to think. As soon as you are touched by LGPL, you at least have to consider what the effects are, or if you are in a company you need to involve lawyers etc. -- which is a pain regardless of what the legal answer in the end turns out to be. As you just demonstrated by having to say IANAL on a simultaneously critical and trivial question.
On Mar 2, 2017, at 7:06 AM, Ondřej Čertík <ondrej...@gmail.com> wrote:
On Thu, Mar 2, 2017 at 2:28 AM, Gamblin, Todd <gamb...@llnl.gov> wrote:
I *believe* for this particular case, you’d be ok, as to put it in
hashdist you’d convert it from Python to yaml/shell anyway, which I
don’t think falls under copyright, as you’ve rewritten what we did in
Spack at that point. It would be hard to say that translating a build
script constitutes copying some part of Spack — remember this is
copyright, not patents. IANAL, IANYL, TINLA, etc.
Well, that's not what your own license says. ;)
If you look at the text of your license, line 75:
https://github.com/LLNL/spack/blob/88f97c07dea843f2a2c1d87347edccb69c093903/LICENSE#L75
it says very clearly "or *translated* straightforwardly into another language":
So even if we took Spack code and rewrote straightforwardly (i.e. line
by line) from Python to C++, it's a derivative work, and thus must be
LGPL licensed. Now in the case of, say, Paraview, it's even simpler,
because there I would literally take the cmake options and copy and
paste them into Hashdist (perhaps removing some Python syntax), and so then I
am literally copying the code, thus LGPL applies.
But if I did the opposite, and copied some code from Spack to Hashdist
or Conda, I would have to relicense that file. So that's a problem.
I think a big part of it is simply that with BSD or MIT, you don't have to
think. As soon as you are touched by LGPL, you at least have to consider
what the effects are, or if you are in a company you need to involve
lawyers etc. -- which is a pain regardless of what the legal answer in the
end turns out to be. As you just demonstrated by having to say IANAL on a
simultaneously critical and trivial question.
There is some irony here in that one reason we went with LGPL was from prior
experiences with some companies, where they had an *easier* time
contributing to LGPL projects because they were obligated to do so, whereas
Were they redistributing spack to other people? Because if they only
used it within the company, neither GPL nor LGPL requires them to
contribute back, as they are not distributing the work.
with BSD/MIT they had the option to keep things proprietary and therefore
had to go through a major review with lawyers. Either way, I’d be ok with a
I touched on it above --- as an employee of a corporation, it's the
company that owns your work, and by default any contribution back to,
say, Spack requires the legal department to okay it, no matter what
license Spack uses (whether LGPL or BSD). That's where the Contributor
License Agreement (CLA) comes in --- the company, not the person, has
to grant you a copyright license to their work under Spack's LGPL
or BSD. So if they didn't run it through their legal department, they can get in
big trouble.
Most open source projects don't want to bother with a CLA, so the
copyright license grant is implicit by submitting a PR.
How do you all feel about Apache 2? LLNL (and various other projects like
Kubernetes) are starting to prefer it due to the patent indemnification
clause (it is otherwise the same as BSD). It is not as compatible as
BSD/MIT though — in particular it’s not compatible with GPL2. I would be
tempted to push for MIT/BSD because, as you say, I only have so many cycles
to think.
I am fine with Apache. I still think MIT/BSD is better, but Apache is
a big improvement over LGPL.
Ondrej
-Todd
Ondrej:
On Mar 2, 2017, at 7:06 AM, Ondřej Čertík <ondrej...@gmail.com> wrote:
On Thu, Mar 2, 2017 at 2:28 AM, Gamblin, Todd <gamb...@llnl.gov> wrote:
I *believe* for this particular case, you’d be ok, as to put it in
hashdist you’d convert it from Python to yaml/shell anyway, which I
don’t think falls under copyright, as you’ve rewritten what we did in
Spack at that point. It would be hard to say that translating a build
script constitutes copying some part of Spack — remember this is
copyright, not patents. IANAL, IANYL, TINLA, etc.
Well, that's not what your own license says. ;)
If you look at the text of your license, line 75:
https://github.com/LLNL/spack/blob/88f97c07dea843f2a2c1d87347edccb69c093903/LICENSE#L75
it says very clearly "or *translated* straightforwardly into another language":
True enough. :(.
So even if we took Spack code and rewrote straightforwardly (i.e. line
by line) from Python to C++, it's a derivative work, and thus must be
LGPL licensed. Now in the case of, say, Paraview, it's even simpler,
because there I would literally take the cmake options and copy and
paste them into Hashdist (perhaps removing some Python syntax), and so then I
am literally copying the code, thus LGPL applies.
Since we’re getting into the details, I will point out that there is an originality requirement in US copyright law (https://en.wikipedia.org/wiki/Threshold_of_originality#United_States). I don’t actually think any of the build recipes would stand up as more than “mere sweat of the brow” under scrutiny, so I think it would be extremely hard to bring any kind of copyright claim based on a stolen configure line. But, again, this requires you to think about it, and it’s not the position implied by the license, which is admittedly a pain.
had to go through a major review with lawyers. Either way, I’d be ok with a
I touched on it above --- as an employee of a corporation, it's the
company that owns your work, and by default any contribution back to,
say, Spack requires the legal department to okay it, no matter what
license Spack uses (whether LGPL or BSD). That's where the Contributor
License Agreement (CLA) comes in --- the company, not the person, has
to grant you a copyright license to their work under Spack's LGPL
or BSD. So if they didn't run it through their legal department, they can get in
big trouble.
Most open source projects don't want to bother with a CLA, so the
copyright license grant is implicit by submitting a PR.
I’m actually pretty sure this is not true — a PR doesn’t imply copyright transfer, which is why we’d have to get CLAs from people to relicense. I’m willing to do it, though. LLNL hasn’t historically had a clear policy on this. One thing I am fighting for internally is more clarity so that people releasing OSS projects understand their options.
How do you all feel about Apache 2? LLNL (and various other projects like
Kubernetes) are starting to prefer it due to the patent indemnification
clause (it is otherwise the same as BSD). It is not as compatible as
BSD/MIT though — in particular it’s not compatible with GPL2. I would be
tempted to push for MIT/BSD because, as you say, I only have so many cycles
to think.
I am fine with Apache. I still think MIT/BSD is better, but Apache is
a big improvement over LGPL.
Ok, I’ll work on this. I guess the next question is — is that all you need? I’d be interested to hear from other hashdist folks.
Hi all,
I wanted to chime in with a few points, and see if I can be convincing. I’m the creator/lead developer of Spack (https://github.com/LLNL/spack), and we’ve had some prior discussions on this list comparing hashdist to Spack. I know some of you IRL, but I’ve never met Dag or Chris, so hello!
Spack *has* an open source community, and it’s focused on HPC. We have managed to convince most of the large HPC centers in DOE to get on board, and many are contributing. We are also getting contributions from many HPC centers in academia and outside the US (CERN, Fermi, NASA-GISS, EPFL, BSC, etc.) See slide 23 here for how contributions have grown, and from where:
https://spack.io/slides/Spack-ECP-Exascale-Package-Manager.pdf
and here for a measure of the level of contributions to the project:
Chris wrote:
What drove the development of hashdist was the need for customized, reproducible builds of scientific Python stacks, with many non-Python dependencies and specifically vendor/host dependencies. In particular, supporting those builds in a platform-independent way across machines in a _slightly_ challenging HPC environment. That need still exists and hasn’t been addressed well by other tools that I know of.
Spack had similar motivations, perhaps without the focus on Python. I would say we wanted to build large codes in a __very__ challenging (GPU/xeon phi/cray/power8/etc.) HPC environment.
Spack also already provides *many* of the features on Chris’s feature list, as well as many features that neither hashdist nor Conda have that are good for HPC. These have helped us to grow adoption:
1. compiler provenance
2. compiler swapping
3. support for cross-compiled platforms (platforms can have multiple OS/target combinations)
4. generating modules (lmod & tcl)
5. using vendor packages that are only provided through modules (e.g. on Cray)
6. multiple dependency types: build/link/run
   - allows, e.g., front-end build deps to be built for the host and not target on cross-compiled machines
7. virtual dependencies, build variants, and a dependency system that supports them
   - swappable MPI, swappable BLAS/LAPACK/SCALAPACK versions
8. mirroring packages over an air gap
9. A query interface, a syntax, and a dependency model for all of that.
At the dependency level, Spack is not so different from hashdist. We have full control over the dependencies, like hashdist, but to some extent we have more control because we support more parameters (compilers, variants, versions, etc.), and we allow dependencies on builds with specific parameters. Like hashdist, we use hashes to identify builds in a combinatorial build space, and you can refer to builds by hash, but you can also refer to builds and query them based on their build parameters through the UI.
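To make the idea in the paragraph above concrete, here is a minimal sketch of identifying builds by a hash over their parameters while still allowing queries by parameter. This is not Spack's or hashdist's actual hashing scheme or API: the `build_hash` function, the `BuildStore` class, the field names, and the example packages are all invented for illustration.

```python
import hashlib
import json


def build_hash(name, version, compiler, variants, dep_hashes):
    """Hash every parameter that distinguishes one build from another."""
    spec = {
        "name": name,
        "version": version,
        "compiler": compiler,
        "variants": variants,
        "deps": sorted(dep_hashes),  # hashes of the exact dependency builds
    }
    # sort_keys gives a stable serialization, so equal specs hash equally.
    payload = json.dumps(spec, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


class BuildStore:
    """Toy registry of builds, addressable by hash or by parameters."""

    def __init__(self):
        self._builds = {}

    def add(self, name, version, compiler, variants=None, deps=()):
        variants = dict(variants or {})
        h = build_hash(name, version, compiler, variants, list(deps))
        self._builds[h] = {"name": name, "version": version,
                           "compiler": compiler, "variants": variants,
                           "deps": sorted(deps)}
        return h

    def by_hash(self, prefix):
        """Return builds whose hash starts with the given prefix."""
        return [b for h, b in self._builds.items() if h.startswith(prefix)]

    def query(self, variants=None, **params):
        """Return sorted hashes of builds matching the given parameters."""
        variants = variants or {}
        return sorted(
            h for h, b in self._builds.items()
            if all(b[k] == v for k, v in params.items())
            and all(b["variants"].get(k) == v for k, v in variants.items())
        )


store = BuildStore()
mpi = store.add("mpich", "3.2", "gcc@6.3.0")
h5_mpi = store.add("hdf5", "1.10.0", "gcc@6.3.0", {"mpi": True}, deps=[mpi])
h5_serial = store.add("hdf5", "1.10.0", "gcc@6.3.0", {"mpi": False})
print(store.query(name="hdf5", variants={"mpi": True}) == [h5_mpi])
```

Because the dependency hashes feed into the parent's hash, changing the compiler or a variant of a dependency produces a different hash for everything built on top of it, which is what makes the build space combinatorial.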