You *must* give SAGE_PARALLEL_SPKG_BUILD a try!!

Dr. David Kirkby

Jun 27, 2010, 6:25:35 PM
to sage-...@googlegroups.com
If you have not seen it, it is now possible to build Sage packages in parallel.

See http://trac.sagemath.org/sage_trac/ticket/8306

This is in my mind one of the most impressive improvements I've ever seen to aid
building Sage.

On my Sun Ultra 27, running OpenSolaris, I can build 52 packages in just 8
minutes! (I've not even bothered using ccache, which apparently would speed it
up even more).

You can probably do better with some of the big servers on the sage network.

It bombs out after those 52 packages, as Maxima does not build on OpenSolaris.
It is one of the very few packages which remain to be fixed for a full 64-bit
build on OpenSolaris.

drkirkby@hawk:~/sage-4.5.alpha0$ ls -lrt spkg/installed
total 99
-rw-r--r-- 1 drkirkby staff 0 Jun 27 19:07 dir-0.1
-rw-r--r-- 1 drkirkby staff 146 Jun 27 19:07 fortran-20100626
-rw-r--r-- 1 drkirkby staff 140 Jun 27 19:07 cephes-2.8
-rw-r--r-- 1 drkirkby staff 143 Jun 27 19:07 blas-20070724
-rw-r--r-- 1 drkirkby staff 148 Jun 27 19:07 lapack-20071123.p1
-rw-r--r-- 1 drkirkby staff 0 Jun 27 19:07 prereq-0.7
-rw-r--r-- 1 drkirkby staff 0 Jun 27 19:07 bzip2-1.0.5
-rw-r--r-- 1 drkirkby staff 223 Jun 27 19:07 sage_scripts-4.5.alpha0
-rw-r--r-- 1 drkirkby staff 219 Jun 27 19:07 examples-4.5.alpha0
-rw-r--r-- 1 drkirkby staff 222 Jun 27 19:07 conway_polynomials-0.2
-rw-r--r-- 1 drkirkby staff 220 Jun 27 19:07 boost-cropped-1.34.1
-rw-r--r-- 1 drkirkby staff 218 Jun 27 19:07 graphs-20070722.p1
-rw-r--r-- 1 drkirkby staff 216 Jun 27 19:07 termcap-1.3.1.p1
-rw-r--r-- 1 drkirkby staff 219 Jun 27 19:07 elliptic_curves-0.1
-rw-r--r-- 1 drkirkby staff 221 Jun 27 19:07 polytopes_db-20100210
-rw-r--r-- 1 drkirkby staff 215 Jun 27 19:07 f2c-20070816.p2
-rw-r--r-- 1 drkirkby staff 210 Jun 27 19:07 zlib-1.2.5
-rw-r--r-- 1 drkirkby staff 217 Jun 27 19:07 sympow-1.018.1.p7
-rw-r--r-- 1 drkirkby staff 219 Jun 27 19:08 rubiks-20070912.p11
-rw-r--r-- 1 drkirkby staff 211 Jun 27 19:08 palp-1.1.p3
-rw-r--r-- 1 drkirkby staff 216 Jun 27 19:08 libpng-1.2.35.p2
-rw-r--r-- 1 drkirkby staff 220 Jun 27 19:08 tachyon-0.98beta.p11
-rw-r--r-- 1 drkirkby staff 215 Jun 27 19:08 readline-6.0.p2
-rw-r--r-- 1 drkirkby staff 217 Jun 27 19:08 freetype-2.3.5.p2
-rw-r--r-- 1 drkirkby staff 217 Jun 27 19:08 symmetrica-2.0.p5
-rw-r--r-- 1 drkirkby staff 215 Jun 27 19:08 boehm_gc-7.1.p6
-rw-r--r-- 1 drkirkby staff 216 Jun 27 19:08 libm4ri-20100221
-rw-r--r-- 1 drkirkby staff 215 Jun 27 19:08 iconv-1.13.1.p2
-rw-r--r-- 1 drkirkby staff 219 Jun 27 19:09 libgpg_error-1.6.p3
-rw-r--r-- 1 drkirkby staff 212 Jun 27 19:09 gd-2.0.35.p5
-rw-r--r-- 1 drkirkby staff 218 Jun 27 19:09 libgcrypt-1.4.4.p3
-rw-r--r-- 1 drkirkby staff 213 Jun 27 19:09 sqlite-3.6.22
-rw-r--r-- 1 drkirkby staff 216 Jun 27 19:09 opencdk-0.6.6.p4
-rw-r--r-- 1 drkirkby staff 213 Jun 27 19:09 mpir-1.2.2.p1
-rw-r--r-- 1 drkirkby staff 219 Jun 27 19:09 flintqs-20070817.p5
-rw-r--r-- 1 drkirkby staff 218 Jun 27 19:09 ratpoints-2.1.3.p1
-rw-r--r-- 1 drkirkby staff 213 Jun 27 19:10 pari-2.3.5.p1
-rw-r--r-- 1 drkirkby staff 222 Jun 27 19:10 genus2reduction-0.3.p6
-rw-r--r-- 1 drkirkby staff 214 Jun 27 19:10 cddlib-094f.p7
-rw-r--r-- 1 drkirkby staff 215 Jun 27 19:10 gfan-0.4plus.p1
-rw-r--r-- 1 drkirkby staff 212 Jun 27 19:11 ecm-6.2.1.p2
-rw-r--r-- 1 drkirkby staff 219 Jun 27 19:11 givaro-3.2.13rc2.p2
-rw-r--r-- 1 drkirkby staff 215 Jun 27 19:11 gnutls-2.2.1.p5
-rw-r--r-- 1 drkirkby staff 214 Jun 27 19:11 zn_poly-0.9.p4
-rw-r--r-- 1 drkirkby staff 210 Jun 27 19:12 mpfr-2.4.2
-rw-r--r-- 1 drkirkby staff 225 Jun 27 19:12 mpfi-1.3.4-cvs20071125.p8
-rw-r--r-- 1 drkirkby staff 218 Jun 27 19:12 libfplll-3.0.12.p1
-rw-r--r-- 1 drkirkby staff 222 Jun 27 19:12 lcalc-20100428-1.23.p0
-rw-r--r-- 1 drkirkby staff 210 Jun 27 19:12 ecl-10.4.1
-rw-r--r-- 1 drkirkby staff 215 Jun 27 19:13 python-2.6.4.p9
-rw-r--r-- 1 drkirkby staff 213 Jun 27 19:15 ntl-5.4.2.p12
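
To get a package count without counting lines by eye, something like the following could be used. It is only a sketch: it assumes that each entry in spkg/installed corresponds to one installed package, as the listing above suggests, and the function name count_installed is made up for illustration.

```shell
#!/bin/sh
# Count the marker files in an spkg/installed-style directory.
# Assumption: one entry per installed package, as in the listing above.
# count_installed is a hypothetical helper, not part of Sage.
count_installed() {
    # ls -1 prints one entry per line; tr strips the padding some wc
    # implementations add around the number.
    ls -1 "$1" | wc -l | tr -d ' '
}

# Usage, with the path from the prompt above:
# count_installed "$HOME/sage-4.5.alpha0/spkg/installed"
```

Comparing the timestamps of the first and last entries, as shown by ls -lrt, then gives a rough elapsed time for the build so far.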


I think we need to ensure 'prereq' is built earlier in the cycle, as it seems
silly that lapack, which needs a fortran compiler, is built before we have even
checked if the Fortran compiler works or not.

But pretty impressive. Well done, Mitesh Patel.

Dave

Nils Bruin

Jun 28, 2010, 12:19:22 AM
to sage-devel
I tried building sage 4.4.4 with

export SAGE_PARALLEL_SPKG_BUILD="yes"
export MAKE="make -j4"
make

but got:

"""
- use the `-Wl,--rpath -Wl,LIBDIR' linker flag
- have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
make[5]: Leaving directory `/usr/local/sage/4.4.4/spkg/build/mpir-1.2.2.p1/src'
make[4]: *** [install-am] Error 2
make[4]: Leaving directory `/usr/local/sage/4.4.4/spkg/build/mpir-1.2.2.p1/src'
make[3]: *** [install-recursive] Error 1
make[3]: Leaving directory `/usr/local/sage/4.4.4/spkg/build/mpir-1.2.2.p1/src'
make[2]: *** [install] Error 2
make[2]: Leaving directory `/usr/local/sage/4.4.4/spkg/build/mpir-1.2.2.p1/src'
Error installing MPIR.
"""
export SAGE_PARALLEL_SPKG_BUILD="yes"
make clean
make

seems to be doing OK now.

This is on a:

ella sage/4.4.4$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
ella sage/4.4.4$ uname -a
Linux ella.cecm.sfu.ca 2.6.18-194.3.1.el5 #1 SMP Sun May 2 04:17:42 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

where building sage from source usually doesn't give me trouble.
The building of MPIR finished successfully very quickly after
restarting, so I think MPIR was built almost completely already and
the make clean probably didn't do too much.

Dr. David Kirkby

Jun 28, 2010, 1:33:29 AM
to sage-...@googlegroups.com

'make distclean'

is safer in general.

> seems to be doing OK now.
>
> This is on a:
>
> ella sage/4.4.4$ cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 5.5 (Tikanga)
> ella sage/4.4.4$ uname -a
> Linux ella.cecm.sfu.ca 2.6.18-194.3.1.el5 #1 SMP Sun May 2 04:17:42
> EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> where building sage from source usually doesn't give me trouble.
> The building of MPIR finished successfully very quickly after
> restarting, so I think MPIR was built almost completely already and
> the make clean probably didn't do too much.

There's too little information above for me to see what went wrong. It should be
noted that the output of commands will be mixed too, as different packages are
all being built at the same time.

There is one thing I think could go wrong on the *.math.washington.edu
network, related to how the NFS disks are shared. The ZIL (ZFS intent log) has
been disabled on the server with the disks (called 'disk'), which means that a
file could appear to have been committed to disk when in fact it has not. I
would not be totally surprised if we got hit with problems on NFS file systems
there.

Building on NFS file systems should be OK if the systems are set up correctly,
but problems might occur if something like the ZIL is disabled, or perhaps if
the clocks on the server and client are not accurately synchronised.

What I find a bit odd is that MPIR does not, to my knowledge, rely on any
libraries, but there is too little output above to tell.

The only downside I've personally found has been that my computer gets
noisier, as the CPU is working harder so the fans have to spin faster!

Dave

John H Palmieri

Jun 28, 2010, 2:41:07 AM
to sage-devel
On Jun 27, 6:33 pm, "Dr. David Kirkby" <david.kir...@onetel.net>
wrote:

> There's too little information above for me to see what went wrong. It should be
> noted that the output of commands will be mixed too, as different packages are
> all being built at the same time.

But there is now also a directory spkg/logs/ in which the output for
each package gets written into a separate file. So you can use those
for troubleshooting.
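
As a rough sketch of how those per-package logs might be searched after a failed build, the function below lists the files containing a make failure line of the form seen earlier in this thread ("make[4]: *** [install-am] Error 2"). The function name failed_logs is hypothetical, and it assumes only that every file directly under the log directory is one package's log; the actual file names are not stated here.

```shell
#!/bin/sh
# List the per-package log files that contain a make failure line.
# Assumption: every file directly under the given directory is one log.
# failed_logs is a hypothetical helper, not part of Sage.
failed_logs() {
    logdir=$1
    # -l prints only the names of files with at least one match.
    # The pattern matches lines such as "make[4]: *** [install-am] Error 2".
    grep -l '\*\*\* \[.*] Error [0-9]' "$logdir"/* 2>/dev/null
}

# Usage, run from SAGE_ROOT:
# failed_logs spkg/logs
```

Matching on the make failure line rather than on the bare word "error" avoids false hits from package names such as libgpg_error.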

--
John

Dr. David Kirkby

Jun 28, 2010, 3:23:11 AM
to sage-...@googlegroups.com


Great, I was not aware of that.

Dave

Nils Bruin

Jun 28, 2010, 4:56:05 AM
to sage-devel
On Jun 27, 7:41 pm, John H Palmieri <jhpalmier...@gmail.com> wrote:
> But there is now also a directory spkg/logs/ in which the output for
> each package gets written into a separate file.  So you can use those
> for troubleshooting.

That is not present (does that get cleaned up by make clean?) so
perhaps I didn't do everything that is required for a parallel make.

I've run make ptest with NUM_THREADS=8. It's a 2 processor machine. It
reports:

-------------
The following tests failed:

sage -t local/lib/python2.6/site-packages/sagenb-0.8-py2.6.egg/sagenb/notebook/template.py # File not found
sage -t devel/sage/sage/symbolic/constants.py # 1 doctests failed
sage -t devel/sage/sage/symbolic/assumptions.py # 1 doctests failed
sage -t devel/sage/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx # File not found
sage -t devel/sage/sage/groups/perm_gps/partn_ref/refinement_matrices.pyx # File not found
sage -t devel/sage/sage/rings/number_field/number_field.py # File not found
sage -t devel/sage/sage/rings/finite_rings/finite_field_base.pyx # File not found
sage -t devel/sage/sage/schemes/generic/algebraic_scheme.py # 2 doctests failed
sage -t devel/sage/sage/schemes/hyperelliptic_curves/hyperelliptic_padic_field.py # File not found
sage -t devel/sage/sage/schemes/elliptic_curves/heegner.py # File not found
sage -t devel/sage/sage/schemes/elliptic_curves/sha_tate.py # File not found
sage -t devel/sage/sage/combinat/crystals/kirillov_reshetikhin.py # File not found
sage -t devel/sage/sage/plot/plot.py # File not found
-----------------
It looks like maxima had communication problems (reports unexpected
EOF etc.) and a couple of other tests timed out, so this may have been
too much of a load for the computer and caused the test failures.

Back to a simple MAKE="make -j2" for me.

John H Palmieri

Jun 28, 2010, 5:12:11 AM
to sage-devel
On Jun 27, 9:56 pm, Nils Bruin <nbr...@sfu.ca> wrote:
> On Jun 27, 7:41 pm, John H Palmieri <jhpalmier...@gmail.com> wrote:
>
> > But there is now also a directory spkg/logs/ in which the output for
> > each package gets written into a separate file.  So you can use those
> > for troubleshooting.
>
> That is not present (does that get cleaned up by make clean?) so
> perhaps I didn't do everything that is required for a parallel make.

After looking at the source code, I think that this directory
(SAGE_ROOT/spkg/logs) should be there and contain one log file for
each spkg regardless of whether you do a parallel build. The relevant
code was added in Sage 4.5.alpha0, and it's also possible that it
wouldn't be there if you did a "sage -upgrade", only if you built from
scratch.

Oh, I see looking at your earlier message that you're using 4.4.4.
That explains it. Whatever problems you were having had to do with
MAKE='make -j4', not SAGE_PARALLEL_SPKG_BUILD.

On my Mac, building in parallel (version 4.5.alpha0) with 'make -j2'
cut the build time from 140 minutes to 90 minutes. I haven't tried
with 'make -j4' because the machine only has 2 cores. I should try
that and see what happens.

It's more impressive on t2.math.washington.edu, a Solaris machine:
without the parallel build, it took me 17 hours, but the parallel
build takes under 5 hours.
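
For comparison, the speedups implied by those timings can be worked out with a small helper. This is only an illustration: the name speedup is made up, and the t2 figure is a lower bound because the parallel build took "under 5 hours".

```shell
#!/bin/sh
# Speedup = serial build time / parallel build time, to two decimals.
# speedup is a hypothetical helper for illustration only.
speedup() {
    awk -v serial="$1" -v parallel="$2" \
        'BEGIN { printf "%.2f\n", serial / parallel }'
}

speedup 140 90   # Mac, times in minutes: prints 1.56
speedup 17 5     # t2, times in hours: prints 3.40 (a lower bound)
```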

--
John

Dr. David Kirkby

Jun 28, 2010, 5:43:18 AM
to sage-...@googlegroups.com
On 06/28/10 06:12 AM, John H Palmieri wrote:
> On Jun 27, 9:56 pm, Nils Bruin<nbr...@sfu.ca> wrote:
>> On Jun 27, 7:41 pm, John H Palmieri<jhpalmier...@gmail.com> wrote:
>>
>>> But there is now also a directory spkg/logs/ in which the output for
>>> each package gets written into a separate file. So you can use those
>>> for troubleshooting.
>>
>> That is not present (does that get cleaned up by make clean?) so
>> perhaps I didn't do everything that is required for a parallel make.
>
> After looking at the source code, I think that this directory
> (SAGE_ROOT/spkg/logs) should be there and contain one log file for
> each spkg regardless of whether you do a parallel build. The relevant
> code was added in Sage 4.5.alpha0, and it's also possible that it
> wouldn't be there if you did a "sage -upgrade", only if you built from
> scratch.
>
> Oh, I see looking at your earlier message that you're using 4.4.4.
> That explains it. Whatever problems you were having had to do with
> MAKE='make -j4', not SAGE_PARALLEL_SPKG_BUILD.
>
> On my Mac, building in parallel (version 4.5.alpha0) with 'make -j2'
> cut the build time from 140 minutes to 90 minutes. I haven't tried
> with 'make -j4' because the machine only has 2 cores. I should try
> that and see what happens.

Something I read recently said that about 1.5x the number of CPUs was best. So
I'd try 2, 3 and 4, and see what you get.
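
A minimal sketch of that rule of thumb, rounding 1.5x the CPU count up to a whole number of jobs. The function name jobs_for_cpus is made up, and the 1.5x factor is only the half-remembered figure mentioned above, so it is a starting point for experiments rather than a recommendation.

```shell
#!/bin/sh
# Round 1.5 x the CPU count up to a whole number of make jobs.
# jobs_for_cpus is a hypothetical helper; the 1.5x factor is a rule of
# thumb, not a measured optimum.
jobs_for_cpus() {
    cpus=$1
    # (3 * cpus + 1) / 2 in integer arithmetic is ceil(1.5 * cpus).
    echo $(( (cpus * 3 + 1) / 2 ))
}

# Usage on a 2-core machine:
# export MAKE="make -j$(jobs_for_cpus 2)"    # gives MAKE="make -j3"
```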

> It's more impressive on t2.math.washington.edu, a Solaris machine:
> without the parallel build, it took me 17 hours, but the parallel
> build takes under 5 hours.
>
> --
> John

I assume that does not include rebuilding ATLAS, as I think that takes more
than 5 hours. Clint, the ATLAS developer, said his latest version of ATLAS
builds in under an hour on t2, but I've been unable to get it to build reliably
in parallel, so I think one needs to keep MAKE unset in the atlas package. At
that point, it takes quite a while to build.

dave
