Question about versions


Ondřej Čertík

Apr 22, 2015, 4:00:51 PM
to sp...@googlegroups.com
Hi,

I am still trying to understand how package versions work in Spack.

Can I build scipy with lapack 3.4.2? Sure.
Can I build numpy with lapack 3.4.1? Sure.

Can I then load numpy and scipy into the same environment? I think
that is asking for trouble. How does Spack prevent that?

The same question with compilers --- i.e. if I build lapack with gcc
for scipy, with intel for numpy and load both.

Ondrej

Todd Gamblin

Apr 22, 2015, 4:28:50 PM
to Ondřej Čertík, sp...@googlegroups.com
On 4/22/15, 2:00 PM, "Ondřej Čertík" <ondrej...@gmail.com> wrote:

>Hi,
>
>I am still trying to understand how package versions work in Spack.
>
>Can I build scipy with lapack 3.4.2? Sure.
>Can I build numpy with lapack 3.4.1? Sure.
>
>Can I then load numpy and scipy into the same environment? I think
>that is asking for trouble. How does Spack prevent that?

Nothing's preventing you from doing this with the modules right now --
more module support would be nice. I think modules are a good way to do
this because it's familiar to users and you could potentially augment your
numpy installs with modules for hand installs or external system software.
But there's a fair amount of support that would need to be implemented
before the modules prevent you from doing something stupid.

You bring up a good point with lapack. Say you had this:

        numpy@0.15.0 ^numpy@1.6
        scipy@0.15.0 ^numpy@1.7

If you activated the first scipy, it would also activate its numpy in the
same python install. If you then try to activate the second scipy, Spack
will complain because the already-activated numpy conflicts with the one
scipy brings in. You could still force-deactivate the first numpy, with
the -a flag to take out its dependencies too, and then bring in scipy.
But once you start using -f flags you're on your own. The less risky way
to swap is to deactivate -a the first scipy, then activate the second one.
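
In commands, that less risky swap would look something like this (a
sketch --- exact flags and spec details may differ, so check "spack
deactivate --help"):

        spack deactivate -a scipy@0.15.0 ^numpy@1.6   # remove scipy and its activated dependencies
        spack activate scipy@0.15.0 ^numpy@1.7        # bring in the other scipy with its own numpy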

I am leaning more and more towards moving the activate functionality into
something resembling profiles. I think that's a better way to do this,
and it gets you virtualenv-ish semantics for all your packages, not just
python ones.

>The same question with compilers --- i.e. if I build lapack with gcc for
>scipy, with intel for numpy and load both.

I'm not ruling this out at the moment, as I consider it to be nice that
you don't have to build a stack with a common compiler. e.g., we have
tools right now that want to be built with gcc that we would also want to
use with an icc build. What might be useful here would be some
consistency checks, which could happen at multiple levels. You could
check in modules whether it makes sense to load something with the current
environment, AND you could check at build time to ensure that icc's and
gcc's in the same build are actually compatible. One thing spack could do
automatically with compiler wrappers is ensure that your icc link is done
right, and that it provides the right -gcc-version flag to go with the gcc
you're cross-linking with.

There's no such check right now, but the default concretization strategy
does keep the compilers consistent by default. For any module that has no
assigned compilers, spack will try to be consistent with any ancestor in
the DAG that has a compiler set. This is why when you do "spack install
libelf%intel", all the dependencies end up building with intel by default,
too.
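
To illustrate with the spec syntax (libdwarf/libelf is just an example
pair; the second line shows overriding the default consistency):

        spack install libdwarf %intel               # the libelf dependency builds with intel too
        spack install libdwarf %intel ^libelf %gcc  # explicitly mix compilers in one DAG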

-Todd



>
>Ondrej
>


Ondřej Čertík

Apr 22, 2015, 4:57:02 PM
to Todd Gamblin, sp...@googlegroups.com
>I am leaning more and more towards moving the activate functionality into
>something resembling profiles. I think that's a better way to do this,
>and it gets you virtualenv-ish semantics for all your packages, not just
>python ones.

Yes, that's what we do in Hashdist.

>
>>The same question with compilers --- i.e. if I build lapack with gcc for
>>scipy, with intel for numpy and load both.
>
> I'm not ruling this out at the moment, as I consider it to be nice that
> you don't have to build a stack with a common compiler. e.g., we have
> tools right now that want to be built with gcc that we would also want to
> use with an icc build. What might be useful here would be some
> consistency checks, which could happen at multiple levels. You could
> check in modules whether it makes sense to load something with the current
> environment, AND you could check at build time to ensure that icc's and
> gcc's in the same build are actually compatible. One thing spack could do
> automatically with compiler wrappers is ensure that your icc link is done
> right, and that it provides the right -gcc-version flag to go with the gcc
> you're cross-linking with.
>
> There's no such check right now, but the default concretization strategy
> does keep the compilers consistent by default. For any module that has no
> assigned compilers, spack will try to be consistent with any ancestor in
> the DAG that has a compiler set. This is why when you do "spack install
> libelf%intel", all the dependencies end up building with intel by default,
> too.

A specific package in the stack definitely needs to be able to build
with a different compiler than the rest; I've just created an issue
for it in Hashdist:

https://github.com/hashdist/hashdist/issues/328

because we currently only allow the same compiler for the whole stack
--- or rather, it is possible to use different compilers (and I have done
that), but you have to hack the package yaml recipes. I want to make
it work just like in Spack, where you specify this on the command line.

For example, you might want to build Lapack with intel but numpy with
gcc (using your Lapack instead of the internal one). However, what
should not be allowed is to then build Lapack with gcc, link it into
scipy, *and* create a common profile with numpy --- in other words, in
the profile the same lapack should be used with both numpy and scipy,
so you can't build it with intel and gcc at once. I think you agree
with this.
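
In Spack's syntax, the combination that a common profile should reject
would be something like this (treating lapack as a plain package name
just for illustration):

        numpy ^lapack %intel
        scipy ^lapack %gcc    # same profile, two different lapack builds --- should conflict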

In other words, Spack seems to only consider the DAG for a single
package, is that right? Hashdist considers the DAG for the whole
profile, so Lapack is there only once (no matter whether you compiled
it with intel or gcc, or which version you used), and then the same
Lapack is used with numpy and scipy or any other package that needs
Lapack, like petsc.

Ondrej

Todd Gamblin

Apr 22, 2015, 5:06:05 PM
to Ondřej Čertík, sp...@googlegroups.com
On 4/22/15, 2:57 PM, "Ondřej Čertík" <ondrej...@gmail.com> wrote:

>>There's no such check right now, but the default concretization strategy
>> does keep the compilers consistent by default. For any module that has
>>no
>> assigned compilers, spack will try to be consistent with any ancestor in
>> the DAG that has a compiler set. This is why when you do "spack install
>> libelf%intel", all the dependencies end up building with intel by
>>default,
>> too.
>
>A specific package in the stack definitely needs to be able to build
>with a different compiler than the rest, I've just created an issue
>for it in Hashdist:
>
>https://github.com/hashdist/hashdist/issues/328
>
>because we currently only allow the same compiler for the whole stack
>--- resp. it is possible to use different compilers (and I have done
>that), but you need to hack the package yaml recipes. I want to make
>it just like in Spack, where you specify this on a command line.
>
>For example you might want to build Lapack with intel, but numpy with
>gcc (that uses your Lapack, instead of the internal one). However,
>what should not be allowed is to then build Lapack with gcc, link with
>scipy *and* create a common profile with numpy --- in other words, in
>the profile, the same lapack should be used with both numpy and scipy,
>and so you can't build it with intel and gcc at once. I think you
>agree with this.

Yep.

>In other words, Spack seems to only consider the DAG for a single
>package, is that right?

This is a little complicated to answer. Currently Spack DOES guarantee
you that for any build there is only one configuration of any particular
package, and it goes to great pains to do that. The final spec will not
contain two packages with the same name, and if the configuration from the
package files is different from what the user asked for, it'll yell at you.
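
For example, constraining the same package two ways in a single spec
should fail up front (a sketch; the actual error text will differ):

        spack spec scipy ^numpy@1.6 ^numpy@1.7   # rejected: numpy constrained twice in one DAG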

Currently the only thing resembling a "profile", where packages are merged
into the same prefix, is the python package activation, but that *does*
actually recognize conflicts among dependencies. The problem is really
that it only recognizes conflicts among python dependencies, and if they
have other libraries in their DAG, they're not checked as part of the
activation process because they don't get "activated".

>Hashdist considers the DAG for the whole profile, so Lapack is there
>only once (no matter whether you compiled it with intel or gcc, or
>which version you used), and then the same Lapack is used with numpy
>and scipy or any other package that needs Lapack, like petsc.

Yes. If I took the same mechanism that we use for python packages and
made it merge in the full DAG for each activated package, it would
cover that case and complain about any mismatches in the full DAG. We
only really thought as far as python for this, mainly because it's the
only place where we *have* to think about things in a single prefix. The
usage model on most of our machines is modules and scripts with hard-coded
paths. Having custom environments ready-made for users would be really
nice though.

-Todd


Ondřej Čertík

Apr 22, 2015, 5:32:01 PM
to Todd Gamblin, sp...@googlegroups.com
I see. I am beginning to understand the differences. It seems to me
that Hashdist has most of this implemented, though perhaps it hasn't
been exposed to the user as nicely as it is in Spack. I've just exposed
lots of the profile loading and querying today:

https://github.com/hashdist/hashdist/pull/325

As you can see, all the information was already there in the
database; it just needed to be exposed to the user in a nicer way.

I will now concentrate on standardizing the version and compiler
support at the per-package level:

https://github.com/hashdist/hashdist/issues/327
https://github.com/hashdist/hashdist/issues/328

That way one can specify this from the profile file. Finally, I'll try
to convert your nice command line syntax for specifying packages +
dependencies into our profile language:

https://github.com/hashdist/hashdist/issues/329

That should cover most of the differences (we also need to add support
for modules: https://github.com/hashdist/hashdist/issues/330). After
this is in, let's discuss again what the actual differences are,
because all of this is just syntactic sugar on the surface.

Ondrej

Malcolm Cook

Apr 22, 2015, 6:27:51 PM
to sp...@googlegroups.com, ondrej...@gmail.com


On Wednesday, April 22, 2015 at 3:28:50 PM UTC-5, Todd Gamblin wrote:
On 4/22/15, 2:00 PM, "Ondřej Čertík" <ondrej...@gmail.com> wrote:

>Hi,
>
>I am still trying to understand how package versions work in Spack.
>
>Can I build scipy with lapack 3.4.2? Sure.
>Can I build numpy with lapack 3.4.1? Sure.
>
>Can I then load numpy and scipy into the same environment? I think
>that is asking for trouble. How does Spack prevent that?

Nothing's preventing you from doing this with the modules right now --
more module support would be nice.  I think modules are a good way to do
this because it's familiar to users and you could potentially augment your
numpy installs with modules for hand installs or external system software.
But there's a fair amount of support that would need to be implemented
before the modules prevent you from doing something stupid.

You bring up a good point with lapack.  Say you had this:

        numpy@0.15.0 ^numpy@1.6
        scipy@0.15.0 ^numpy@1.7


I'm having trouble appreciating this example.  It seems (almost) nonsensical for numpy version 0.15.0 to depend upon numpy version 1.6.  

Did you perhaps mean to create an example where numpy and scipy depend upon different versions of lapack?

  
If you activated the first scipy, it would also activate its numpy in the
same python install.  If you then try to activate the second scipy, Spack
will complain because the already-activated numpy conflicts with the one
scipy brings in.  You could still force-deactivate the first numpy, with
the -a flag to take out its dependencies too, and then bring in scipy.
But once you start using -f flags you're on your own.  The less risky way
to swap is to deactivate -a the first scipy, then activate the second one.

I am leaning more and more towards moving the activate functionality into
something resembling profiles.  I think that's a better way to do this,
and it gets you virtualenv-ish semantics for all your packages, not just
python ones.


Trying to follow your discussion here.  By 'activate', do you mean `spack load <spec>` or `module load <module>`?

Is there some reference notion of what you mean by "profiles" here?  Is this a user-space registry of currently 'activated' modules/specs?

Todd Gamblin

Apr 22, 2015, 6:36:36 PM
to Malcolm Cook, sp...@googlegroups.com, ondrej...@gmail.com
From: Malcolm Cook <malcol...@gmail.com>
Date: Wednesday, April 22, 2015 at 4:27 PM
To: "sp...@googlegroups.com" <sp...@googlegroups.com>
Cc: "ondrej...@gmail.com" <ondrej...@gmail.com>
Subject: Re: [spack] Question about versions



On Wednesday, April 22, 2015 at 3:28:50 PM UTC-5, Todd Gamblin wrote:
On 4/22/15, 2:00 PM, "Ondřej Čertík" <ondrej...@gmail.com> wrote:

>Hi,
>
>I am still trying to understand how package versions work in Spack.
>
>Can I build scipy with lapack 3.4.2? Sure.
>Can I build numpy with lapack 3.4.1? Sure.
>
>Can I then load numpy and scipy into the same environment? I think
>that is asking for trouble. How does Spack prevent that?

Nothing's preventing you from doing this with the modules right now --
more module support would be nice.  I think modules are a good way to do
this because it's familiar to users and you could potentially augment your
numpy installs with modules for hand installs or external system software.
But there's a fair amount of support that would need to be implemented
before the modules prevent you from doing something stupid.

You bring up a good point with lapack.  Say you had this:

        numpy@0.15.0 ^numpy@1.6
        scipy@0.15.0 ^numpy@1.7


I'm having trouble appreciating this example.  It seems (almost) nonsensical for numpy version 0.15.0 to depend upon numpy version 1.6.  

Did you perhaps mean to create an example where numpy and scipy depend upon different versions of lapack?

Oops.  That should be:

        scipy@0.15.0 ^numpy@1.6
        scipy@0.15.0 ^numpy@1.7

I'm quickly becoming unable to keep up with the list, apparently.


  
If you activated the first scipy, it would also activate its numpy in the
same python install.  If you then try to activate the second scipy, Spack
will complain because the already-activated numpy conflicts with the one
scipy brings in.  You could still force-deactivate the first numpy, with
the -a flag to take out its dependencies too, and then bring in scipy.
But once you start using -f flags you're on your own.  The less risky way
to swap is to deactivate -a the first scipy, then activate the second one.

I am leaning more and more towards moving the activate functionality into
something resembling profiles.  I think that's a better way to do this,
and it gets you virtualenv-ish semantics for all your packages, not just
python ones.


Trying to follow your discussion here.  By 'activate', do you mean `spack load <spec>` or `module load <module>`?

I actually mean "spack activate" as implemented for python support.



Is there some reference notion of what you mean by "profiles" here?  Is this a user-space registry of currently 'activated' modules/specs?

A profile in the Ondrej/hashdist sense (as I understand it) is a single prefix where a number of installed packages are merged, like GNU stow does, and like the Spack python support does.  You can load this profile and it's like you have some subset of your installed packages "live".

The usage model is that you'd create a named profile, you'd install a bunch of packages into it, and Spack (or hashdist) would make sure the things installed in the same profile are consistent.  You could then load another clean-slate profile, install things into it (which is really just symlinking if the thing was already installed somewhere) and construct a new environment.

This is sort of similar to how Lmod allows you to save your environment and load it again later.  
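
None of that is implemented, but the workflow might eventually look
something like this (every command here is hypothetical, invented just
to illustrate the idea):

        spack profile create myenv                             # hypothetical: start an empty merged prefix
        spack profile install myenv scipy@0.15.0 ^numpy@1.7    # hypothetical: symlink scipy and its DAG in
        spack profile load myenv                               # hypothetical: put the profile in your environment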

Todd Gamblin

Apr 25, 2015, 5:11:19 PM
to Ondřej Čertík, sp...@googlegroups.com
Ok, I'll be interested to see how this works in hashdist!