build spkg's in parallel by default?


John H Palmieri

Oct 27, 2011, 1:14:36 PM
to sage-...@googlegroups.com
The option to build spkg's in Sage in parallel has been available for quite a while now, but it has to be enabled by setting the shell variable SAGE_PARALLEL_SPKG_BUILD equal to "yes".  Should we change the default, building in parallel unless this variable is explicitly set to "no" (or "n")?  I propose that we do.  Are there objections to this?

Note any setting of this variable has no effect unless you tell 'make' to build in parallel by doing something like 'export MAKE="make -j4" ' (or as Leif might suggest, something like 'MAKE="make -j4 -l2.5" && export MAKE').  I am not suggesting changing this behavior: you would still have to set MAKE yourself.
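
For reference, a minimal sketch of the two ways of setting MAKE mentioned above. The values -j4 and -l2.5 are just the examples from this post, not recommendations. The two-step form also works in older Bourne shells that reject `export VAR=value`, which I assume is why it gets suggested.

```shell
# Form 1: combined assign-and-export; fine in bash and other POSIX shells.
export MAKE="make -j4"

# Form 2: assign first, then export; also accepted by pre-POSIX Bourne shells.
# -l2.5 tells GNU make not to start additional jobs while the load average
# is 2.5 or higher.
MAKE="make -j4 -l2.5" && export MAKE

# Show what a child process (such as the Sage build) would see.
sh -c 'echo "MAKE is: $MAKE"'
```

After either form, running `make` in SAGE_ROOT picks up the setting from the environment.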

--
John

kcrisman

Oct 27, 2011, 1:29:37 PM
to sage-devel


On Oct 27, 1:14 pm, John H Palmieri <jhpalmier...@gmail.com> wrote:
> The option to build spkg's in Sage in parallel has been available for quite
> a while now, but it has to be enabled by setting the shell variable
> SAGE_PARALLEL_SPKG_BUILD equal to "yes".  Should we change the default,
> building in parallel unless this variable is explicitly set to "no" (or
> "n")?  I propose that we do.  Are there objections to this?
>

This seems like a reasonable idea, given the MAKE still needing to be
set. I *always* have to Google for the exact order of the four
words...

I'm assuming this would also have no effect on single-processor
machines, even if one were to be silly and set MAKE=make -j2 or
something?

- kcrisman

John H Palmieri

Oct 27, 2011, 2:36:21 PM
to sage-...@googlegroups.com


On Thursday, October 27, 2011 10:29:37 AM UTC-7, kcrisman wrote:


Well, it will still try to build in parallel then.  I think it might run two processes, but they will just take turns? It might still go faster than a serial build, if one spkg is doing a lot of i/o and the other is doing processor-heavy things.  I've seen recommendations to set MAKE="make -jN" where N is 1.5 times the number of CPUs in the machine (obviously only for non-shared machines), for what that's worth.
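
A rough sketch of that 1.5x rule of thumb, in case it is useful. The helper name `suggest_jobs` is made up for illustration, and `getconf _NPROCESSORS_ONLN` is an assumption about the platform (it works on Linux and Mac OS X, but is not guaranteed everywhere).

```shell
# suggest_jobs CORES: print ceil(1.5 * CORES) using integer arithmetic only.
# (Hypothetical helper, not part of Sage.)
suggest_jobs() {
    echo $(( ($1 * 3 + 1) / 2 ))
}

# Count online CPUs; fall back to 1 if getconf fails.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

MAKE="make -j$(suggest_jobs "$cores")" && export MAKE
echo "$MAKE"
```

On a shared machine you would presumably want a smaller number, or a load limit via -l as well.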

--
John

Jeroen Demeyer

Oct 27, 2011, 2:43:56 PM
to sage-...@googlegroups.com

Even better solution: make the SAGE_PARALLEL_SPKG_BUILD="yes" behaviour
the default and *remove the environment variable*. There is absolutely
no reason why anybody would need to set SAGE_PARALLEL_SPKG_BUILD="no".
Simply set MAKE="make -j1" and you get back the old behaviour.

John H Palmieri

Oct 27, 2011, 2:52:07 PM
to sage-...@googlegroups.com


On Thursday, October 27, 2011 11:43:56 AM UTC-7, Jeroen Demeyer wrote:
On 2011-10-27 19:14, John H Palmieri wrote:
> The option to build spkg's in Sage in parallel has been available for
> quite a while now, but it has to be enabled by setting the shell
> variable SAGE_PARALLEL_SPKG_BUILD equal to "yes".  Should we change the
> default, building in parallel unless this variable is explicitly set to
> "no" (or "n")?  I propose that we do.  Are there objections to this?
>
> Note any setting of this variable has no effect unless you tell 'make'
> to build in parallel by doing something like 'export MAKE="make -j4" '
> (or as Leif might suggest, something like 'MAKE="make -j4 -l2.5" &&
> export MAKE').  I am not suggesting changing this behavior: you would
> still have to set MAKE yourself.

Even better solution: make the SAGE_PARALLEL_SPKG_BUILD="yes" behaviour
the default and *remove the environment variable*.  There is absolutely
no reason why anybody would need to set SAGE_PARALLEL_SPKG_BUILD="no".
Simply set MAKE="make -j1" and you get back the old behaviour.


Good idea.  You can also "unset MAKE" to get the old behavior.

--
John
 

kcrisman

Oct 27, 2011, 3:28:00 PM
to sage-devel
As it turns out, trying the "wrong" behavior (make -j2 with parallel
spkgs) on one of my one processor machines does lead to unusual build
failures which seem to have to do with waiting for jobs. And then it
finishes the next spkg before actually stopping the Sage build.
Things like errors in zlib... "This may not be fatal", then "read jobs
pipe, Operation not supported, stop".

Not a big deal, because with unset make but parallel spkg build there
is no problem. But just something to watch for, maybe to add to the
devel guide when it's updated for this change (if it is), as before
that env variable this was presumably not possible.

- kcrisman

John H Palmieri

Oct 27, 2011, 4:17:56 PM
to sage-...@googlegroups.com
On Thursday, October 27, 2011 12:28:00 PM UTC-7, kcrisman wrote:

As it turns out, trying the "wrong" behavior (make -j2 with parallel
spkgs) on one of my one processor machines does lead to unusual build
failures which seem to have to do with waiting for jobs.  And then it
finishes the next spkg before actually stopping the Sage build.
Things like errors in zlib... "This may not be fatal", then "read jobs
pipe, Operation not supported, stop".

I'm trying this on one of the skynet machines (cicero), and haven't run into problems yet.  It's gotten past zlib, but it's a slow machine, so if there are problems, it may take a while to reach them.  Anyway, in my preliminary patches, I'm adding a warning about using "make -j2" on machines with one processor.

--
John

leif

Oct 27, 2011, 4:38:00 PM
to sage-devel
On 27 Oct., 20:43, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> Even better solution: make the SAGE_PARALLEL_SPKG_BUILD="yes" behaviour
> the default and *remove the environment variable*.

+1


> There is absolutely
> no reason why anybody would need to set SAGE_PARALLEL_SPKG_BUILD="no".
> Simply set MAKE="make -j1" and you get back the old behaviour.

I never understood why this variable was kept at all; it was only to
some extent meaningful during testing, when building spkgs in parallel
didn't yet fully work.

Also, not using $MAKE unless SAGE_PARALLEL_SPKG_BUILD=yes has been a
bad idea from the beginning, since setting MAKE is not limited to
enabling multiple 'make' jobs. (MAKE is always defined, since it is
either set by a parent 'make' or by sourcing 'sage-env', which sets it
to "make" in case it is unset or empty.)

In spkg/install we should at least use ${MAKE:-make} ... and remove
all hardcoded calls of 'make'; I'm not sure whether there are still
spkgs which invalidate or don't use $MAKE.
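
To illustrate what I mean by ${MAKE:-make}, here is a minimal sketch. The function names `run_make` and `effective_make` are hypothetical and not taken from spkg/install; only the parameter expansion itself is standard shell.

```shell
# Hypothetical helper: run whatever make command the user configured,
# falling back to plain "make".
run_make() {
    # ${MAKE:-make} expands to $MAKE when it is set and non-empty, else "make".
    # It is left unquoted on purpose so that "make -j4" splits into words.
    ${MAKE:-make} "$@"
}

# Print the command that would be used, without running a build.
effective_make() {
    printf '%s\n' "${MAKE:-make}"
}

MAKE=""
effective_make    # empty MAKE falls back to "make"
MAKE="make -j4 -l2.5"
effective_make    # user setting is passed through unchanged
```

The point of the fallback is that a script keeps working when MAKE is unset or empty, while still respecting a MAKE that names a different make program or carries extra options.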


-leif

leif

Oct 27, 2011, 5:00:36 PM
to sage-devel
On 27 Oct., 21:28, kcrisman <kcris...@gmail.com> wrote:
> As it turns out, trying the "wrong" behavior (make -j2 with parallel
> spkgs) on one of my one processor machines does lead to unusual build
> failures which seem to have to do with waiting for jobs.  And then it
> finishes the next spkg before actually stopping the Sage build.
> Things like errors in zlib... "This may not be fatal", then "read jobs
> pipe, Operation not supported, stop".

I guess that's on one of your MacOS boxes.

Except for running out of physical memory (or even swap space), I
never had any issues with parallel builds on single-core machines
(both with and without "hyper-threading") running different Linuces.


> Not a big deal, because with unset make but parallel spkg build there
> is no problem.  But just something to watch for, maybe to add to the
> devel guide when it's updated for this change (if it is), as before
> that env variable this was presumably not possible.

We should update the Sage *Installation* Guide and some READMEs
accordingly, perhaps with some recommendations on the number of jobs
(and btw. also [how to set] NUM_THREADS for parallel testing), and
also mentioning GNU make's -l [N] / --load-average[=N] /
--max-load[=N] option.

Slightly related: We should either stop or document that the Sage
library is built in parallel *by* *default*. On single-core machines
*without* "hyper-threading" this doesn't make a difference (it's then
built with only one thread), but otherwise one has to export
MAKE="make -j1" to disable this behaviour. IIRC the Sage library's
setup.py doesn't honor or respect "-l..." etc. in $MAKE or $MAKEFLAGS
either.


-leif

leif

Oct 27, 2011, 5:15:34 PM
to sage-devel
On 27 Oct., 22:17, John H Palmieri <jhpalmier...@gmail.com> wrote:
> Anyway, in my
> preliminary patches, I'm adding a warning about using "make -j2" on
> machines with one processor.

Until it died :( I used to build Sage on a single-core Pentium4 (very
similar to Cicero, but with only 768MB RAM) with MAKE="make -j3",
which always succeeded. (IIRC 4 jobs, perhaps even more, also worked.
Only parallel doctesting with too many threads tends to time out or
give errors directly or indirectly caused by that.)

On a single-core with "hyper-threading" (a Pentium4 Prescott), I
usually build with 6 jobs; using 8 or 10 also worked IIRC.


2ct,

-leif


P.S.: Incidentally I also posted about parallel make on sage-support
today:

http://groups.google.com/group/sage-support/msg/a17f6e561583e1a5

http://groups.google.com/group/sage-support/msg/f4d0c2e3815f1157

John H Palmieri

Oct 27, 2011, 5:23:47 PM
to sage-...@googlegroups.com


On Thursday, October 27, 2011 2:15:34 PM UTC-7, leif wrote:

P.S.: Incidentally I also posted about parallel make on sage-support
today:

http://groups.google.com/group/sage-support/msg/a17f6e561583e1a5

http://groups.google.com/group/sage-support/msg/f4d0c2e3815f1157

Yes, hence my comment in the original post:


> (or as Leif might suggest, something like 'MAKE="make -j4 -l 2.5" && export MAKE')

--
John

leif

Oct 27, 2011, 5:37:42 PM
to sage-devel
On 27 Oct., 23:15, leif <not.rea...@online.de> wrote:
> On 27 Oct., 22:17, John H Palmieri <jhpalmier...@gmail.com> wrote:
>
> > Anyway, in my
> > preliminary patches, I'm adding a warning about using "make -j2" on
> > machines with one processor.
>
> Until it died :( I used to build Sage on a single-core Pentium4 (very
> similar to Cicero, but with only 768MB RAM) with MAKE="make -j3",
> which always succeeded. (IIRC 4 jobs, perhaps even more, also worked.
> Only parallel doctesting with too many threads tends to time out or
> give errors directly or indirectly caused by that.)

P.P.S.:

When preparing the Sage 4.7.2.alpha3 release, I always tested on
Cicero with either 2 or 3 jobs and never ran into any problems (even
when I forgot to suspend the ECM background jobs, or they didn't
terminate immediately).

Similar for mark, mark2 and cleo, i.e., usually using far more jobs
than cores are available or idling. On a quadcore, I successfully
built Sage with MAKE="make -j32"; with 64 jobs I just ran out of RAM
and hence stopped the build.

The only issue is that it doesn't make much sense to build self-tuning
packages (also depending on how they perform their benchmarks) with
extraordinarily high sysload; ATLAS tends to give up on such, while
other packages apparently don't care (although the tuning results may
be more or less random; the bare CPU time -- if at all available or
accurate -- doesn't take into account e.g. cache thrashing).


-leif

John H Palmieri

Oct 27, 2011, 6:34:59 PM
to sage-...@googlegroups.com
Okay, a patch is up. See

  <http://trac.sagemath.org/sage_trac/ticket/11959>

If people have objections to the whole plan, please keep them here.  If you have issues with the particular implementation, let's discuss that on the ticket.

--
John

Jeroen Demeyer

Nov 2, 2011, 8:33:15 AM
to sage-...@googlegroups.com
Important question: If I set
MAKE="make -j6"
could it be that I get 36 processes in a parallel build? 6 packages in
parallel and 6 jobs per package? Or has this been taken care of?

John H Palmieri

Nov 2, 2011, 10:39:11 AM
to sage-...@googlegroups.com

The GNU make man page says that it takes care of this automatically: if you set MAKE='make -jN', then

> the parent make and all the sub-makes will communicate to ensure that there are only ‘N’ jobs running at the same time between them all.

(See http://www.gnu.org/software/make/manual/make.html#Options_002fRecursion.)

So as long as you have GNU make, it should be fine.

--
John

Jeroen Demeyer

Nov 2, 2011, 10:48:01 AM
to sage-...@googlegroups.com
On 2011-11-02 15:39, John H Palmieri wrote:
>
>
> On Wednesday, November 2, 2011 5:33:15 AM UTC-7, Jeroen Demeyer wrote:
>
> Important question: If I set
> MAKE=make -j6
> could it be that I get 36 processes in a parallel build? 6 packages in
> parallel and 6 jobs per package? Or has this been taken care of?
>
>
> The GNU make man page says that it takes care of this automatically: if
> you set MAKE='make -jN', then
>
>> the parent make and all the sub-makes will communicate to ensure
>> that there are only ‘N’ jobs running at the same time between them all.

I think this is only true if make calls make directly, not if make calls
a shell script (spkg/install) which calls make which calls a shell
script (spkg-install) which calls make.

leif

Nov 2, 2011, 10:52:06 AM
to sage-devel
On 2 Nov., 15:39, John H Palmieri <jhpalmier...@gmail.com> wrote:
> On Wednesday, November 2, 2011 5:33:15 AM UTC-7, Jeroen Demeyer wrote:
>
> > Important question: If I set
> > MAKE=make -j6
> > could it be that I get 36 processes in a parallel build?  6 packages in
> > parallel and 6 jobs per package?  Or has this been taken care of?
>
> The GNU make man page says that it takes care of this automatically: if you
> set MAKE='make -jN', then
>
> > the parent make and all the sub-makes will communicate to ensure that
> > there are only ‘N’ jobs running at the same time between them all.
>
> (See http://www.gnu.org/software/make/manual/make.html#Options_002fRecursion.)
>
> So as long as you have GNU make, it should be fine.

As mentioned elsewhere (ticket / sage-support?), this only works
properly if the communication to the (main) jobserver isn't broken,
which obviously currently isn't the case for all spkgs.

Note that 'make' communicates through inherited file descriptors,
whose numbers are passed in MAKEFLAGS. I'm pretty sure there are
still spkgs which invalidate these.
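
A rough way to check this from inside a script, as a sketch only: the function name `has_jobserver` is made up, and the option spellings are an assumption about the GNU make version in use (older releases put `--jobserver-fds=R,W` into MAKEFLAGS, newer ones use `--jobserver-auth=...`).

```shell
# Hypothetical check: does this process still see a jobserver in MAKEFLAGS?
has_jobserver() {
    case "${MAKEFLAGS:-}" in
        *--jobserver-fds=*|*--jobserver-auth=*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_jobserver; then
    echo "jobserver advertised: sub-makes can share the job limit"
else
    echo "no jobserver in MAKEFLAGS: a nested 'make -jN' counts its own N jobs"
fi
```

Note this only tells you that the flag survived. If an intermediate script closed the inherited file descriptors, the sub-make still cannot reach the jobserver, and in the worst case you end up with the N packages times N jobs that Jeroen asked about.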

It's always best to also use '-l', as documented.


-leif

leif

Nov 2, 2011, 10:55:42 AM
to sage-devel
On 2 Nov., 15:52, leif <not.rea...@online.de> wrote:
> As mentioned elsewhere (ticket / sage-support?), this only works
> properly if the communication to the (main) jobserver isn't broken,
> which obviously currently isn't the case for all spkgs.
>
> Note that 'make' communicates through inherited file descriptors,
> whose numbers are passed in MAKEFLAGS.  I'm pretty sure there are
> still spkgs which invalidate these.

... or use the wrong 'unset MAKE; make ...' instead of
'$MAKE -j1 ...'.
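
As a sketch of the difference, for a package that has to be built serially. The helper name `serial_make_cmd` is hypothetical and only prints the command instead of running it.

```shell
# Hypothetical helper for a package that must be built serially.
serial_make_cmd() {
    # Append -j1 rather than discarding $MAKE: GNU make uses the last -j
    # given, so the user's make program and other options (e.g. -l2.5)
    # are kept.
    printf '%s -j1\n' "${MAKE:-make}"
}

MAKE="make -j4 -l2.5"
serial_make_cmd
```

With 'unset MAKE; make ...' the user's choice of make program and any extra options would be thrown away along with the -j4.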


-leif