Hi Todd,
On Mon, Dec 26, 2016 at 10:41 PM, Gamblin, Todd <gamb...@llnl.gov> wrote:
> Hi Ondrej,
The way I envision it, we would keep using Hashdist just as we do now:
you have a profile and a list of packages (Hashstack), and, just like
Spack, Hashdist decides exactly how each package will be built.
Currently Hashdist itself also does the building and then
enables/loads the "environment/profile" so that you can use the
installed packages. I haven't followed Spack development closely, so I
can't speak for Spack, but Conda handles binary packages very nicely
and has a large community around it. Hashdist puts the package hash
(calculated from sources + build script + dependencies, etc.) into
the package as a version.
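
To make the hashing idea concrete, here is a minimal sketch in Python.
It is only an illustration of "hash = sources + build script +
dependencies", not Hashdist's actual algorithm; the function name
package_hash, the JSON layout, the 12-character truncation and the
example inputs are all assumptions made up for this example.

```python
import hashlib
import json


def package_hash(source_digest, build_script, dependency_hashes):
    """Combine everything that influences a build into one short hash.

    Hypothetical helper for illustration only, not a Hashdist API.
    """
    payload = json.dumps(
        {
            "source": source_digest,
            "build_script": build_script,
            # Sort so the order in which dependencies are listed does not matter.
            "dependencies": sorted(dependency_hashes),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Made-up inputs: the same hdf5 sources built with two different options.
zlib = package_hash("sha256:aaaa", "./configure && make install", [])
hdf5_serial = package_hash(
    "sha256:bbbb", "./configure && make install", [zlib]
)
hdf5_mpi = package_hash(
    "sha256:bbbb", "./configure --enable-parallel && make install", [zlib]
)

# The two variants get different hashes, so both can be installed side by side.
print(hdf5_serial != hdf5_mpi)
```

Because the hash of each dependency feeds into the hash of the
package that uses it, a change anywhere in the stack changes the
hashes of everything built on top of it.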
I should stress that I got this idea from Aron Ahmadia, and I think
Chris doesn't mind this direction either.
Now I can answer your questions:
>
> In the plan below, I’m a little unclear on what the hashdist layer would continue to provide. Would hashdist just do profile management? i.e., history and build preferences?
Not just that. It would also keep the way you can easily configure
each package: say you want to enable or disable some compile-time
option of a given package and have two environments/profiles that let
you easily switch between the two. Essentially Hashdist would do
everything that it does now, but would partner with Conda to handle
the things that Conda does well.
> Also, who’s your target audience going forward? Are you targeting HPC or mostly end-user machines?
Just like now, we are targeting both HPC and end-user machines.
>
> I’m under the impression that there are things Conda just can’t (or doesn’t care to) build, like Cray binaries.
Hashdist provides the sources of the package/stack that builds on
Cray. We would only use Conda as the package manager; the packages
themselves would be our own.
> And the dependency model, AFAIK, doesn’t really handle combinatorial versions of things (multi-compiler, multi-mpi, etc.). How would hashdist handle that stuff with Conda? How would you name the binaries?
Hashdist handles the combinatorial explosion of versions by using
hashes, and Conda allows you to put such a hash into the version.
Technically, as a first iteration, I would set each package version to
1.0 and put the hash into a "tag" (I think they call it a "build
string"), which Conda uses as part of the version. So there is no
problem: Conda can already do everything that we need.
View Conda as a binary distribution, and Hashdist as a source distribution.
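
As a rough sketch of the naming scheme (version fixed at 1.0, hash
carried in the build string), assuming Conda's usual
name-version-build.tar.bz2 file naming: the helper name
conda_identifiers, the "h" prefix, the trailing build number and the
example hash are illustrative choices of mine, not something Hashdist
or Conda prescribes.

```python
def conda_identifiers(name, package_hash, version="1.0", build_number=0):
    """Return (build_string, filename, spec) for a hash-tagged package.

    Hypothetical helper: it only formats strings and calls no Conda API.
    """
    # The build string carries the hash; the version stays at a fixed 1.0.
    build_string = "h{}_{}".format(package_hash, build_number)
    filename = "{}-{}-{}.tar.bz2".format(name, version, build_string)
    # name=version=build form, which pins one exact build of a package.
    spec = "{}={}={}".format(name, version, build_string)
    return build_string, filename, spec


build_string, filename, spec = conda_identifiers("hdf5", "3f9c2a7b1d4e")
print(filename)  # hdf5-1.0-h3f9c2a7b1d4e_0.tar.bz2
print(spec)      # hdf5=1.0=h3f9c2a7b1d4e_0
```

Two builds of the same package with different compilers or MPI would
then differ only in the build string, so they can live in the same
channel without clashing.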
>
> I’m curious because I’d like to have some better profile/environment features in Spack, and I think we have a lot of what you might need in the `packages.yaml` configuration we added this year:
>
> https://spack.readthedocs.io/en/latest/build_settings.html#concretization-preferences
>
> You can set your preferred MPI implementation, compiler, variants etc in a single file, per-package or for all of them, in multiple scopes (defaults, spack instance, user):
>
> https://spack.readthedocs.io/en/latest/configuration.html#configuration-scopes
>
> What we don’t have at the moment is something resembling a “per-project” config scope, which is more like what a hashdist profile would do. I think that would be pretty easy to add. Would you guys be interested in contributing some type of profile capability to Spack? I think a lot of the infrastructure you’d need is already there, and you’d get the added control over compilers, the dependency model, and the DAG hashing that Spack offers. Binary packaging is also coming soon, courtesy of CERN:
>
> https://github.com/LLNL/spack/pull/445
We also have an open PR for binary packages:
https://github.com/hashdist/hashdist/pull/314
But while both your binary package PR and ours are good to have, Conda
already does this. In fact, my understanding is that this is precisely
what Conda is for: handling binary packages, mirrors, relocation of
binaries (rpath, cmake, ...), etc. So I would much rather use Conda,
which has years of experience and bug fixing behind its handling of
binaries, than a new PR (in either Hashdist or Spack).
Rather than trying to reproduce everything that Conda does, and there
is a lot of it, I just want to use it. Spack will have to implement
all the binary package handling that Conda already provides, and
that's a lot of work that I don't have time to do currently.
Ondrej
P.S. Another issue with Spack is that it's LGPL licensed. If there is
a choice, I would much rather have a BSD/MIT licensed tool, like
Hashdist or Conda.