Race condition in parallel make install


Mitesh Patel

Nov 26, 2010, 7:50:04 AM
to mpir-...@googlegroups.com
On Skynet's cleo (ia64-Linux-rhel), there seems to be a race condition
which can cause an error during a parallel make install of MPIR 2.1.3.

I did

cd ~buildbot/tmp
wget http://www.mpir.org/mpir-2.1.3.tar.gz
tar zxvf mpir-2.1.3.tar.gz
mv mpir-2.1.3 mpir-2.1.3-cleo
cd mpir-2.1.3-cleo

./configure --prefix=$HOME/tmp/prefix_cleo --enable-gmpcompat
make -j4

Then I ran this script:

#!/bin/bash

# Repeat the parallel install and keep only the logs of failing runs.
RUNS=100
JOBS=4

for I in $(seq 1 "$RUNS"); do
    rm -rf "$HOME/tmp/prefix_cleo"
    make -j "$JOBS" install > "zinstall.log.$I" 2>&1
    CODE=$?
    echo "Run $I of $RUNS: code $CODE"
    if [ "$CODE" -eq 0 ]; then
        rm -f "zinstall.log.$I"
    fi
done

which yields errors (exit status 2) on runs 8, 12, 57, 64, and 99. I've
attached one of these logs. The only error common to them seems to be

make[4]: Entering directory `/home/buildbot/tmp/mpir-2.1.3-cleo'
(cd /home/buildbot/tmp/prefix_cleo/include && rm -f gmp.h && cp mpir.h gmp.h)
cp: cannot stat `mpir.h': No such file or directory
make[4]: *** [install-data-hook] Error 1
make[4]: Leaving directory `/home/buildbot/tmp/mpir-2.1.3-cleo'
/usr/bin/install -c -m 644 'mpir.h' '/home/buildbot/tmp/prefix_cleo/include/mpir.h'
make[3]: *** [install-data-am] Error 2
make[3]: *** Waiting for unfinished jobs....
/usr/bin/install -c .libs/libmpir.so.8.2.3 /home/buildbot/tmp/prefix_cleo/lib/libmpir.so.8.2.3
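
The log suggests an ordering problem: the hook changes into the install
directory and copies mpir.h from there, so it fails whenever it runs before
mpir.h has been installed. Here is a minimal sketch that replays the two steps
by hand in both orders, using plain shell commands rather than make. The
temporary directories and the function names old_hook and install_header are
hypothetical stand-ins, not part of MPIR's build system.

```shell
#!/bin/bash
# Sketch only: simulate the hook and the header install step by hand.
work=$(mktemp -d)
build="$work/build"            # stand-in for the MPIR build directory
incdir="$work/prefix/include"  # stand-in for the installed include directory
mkdir -p "$build" "$incdir"
echo '/* mpir.h */' > "$build/mpir.h"

# What the hook does according to the log: copy inside the install directory.
old_hook() {
    (cd "$incdir" && rm -f gmp.h && cp mpir.h gmp.h)
}

# What the header install step does: copy mpir.h out of the build directory.
install_header() {
    cp "$build/mpir.h" "$incdir/mpir.h"
}

# Hook first, header second: an order a parallel install can produce.
hook_first=0
old_hook 2>/dev/null || hook_first=$?

# Header first, hook second: the order a serial install normally produces.
install_header
hook_second=0
old_hook || hook_second=$?

echo "hook before header install: exit $hook_first"
echo "hook after header install:  exit $hook_second"
rm -rf "$work"
```

Only the first ordering fails, which would be consistent with the error
showing up in just a few of the 100 runs.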


We first noticed this problem when building Sage with MPIR 1.2.2 on cleo
and iras (ia64-Linux-suse). I don't know if the error is specific to
Itanium.

(We also use

--enable-cxx=yes --enable-shared --disable-static

when configuring MPIR for Sage, but the setup above seems to be enough
to trigger the problem for me.)

Please let me know if you need further information. Thanks!

zinstall.log.8

Mitesh Patel

Nov 26, 2010, 8:00:02 AM
to mpir-...@googlegroups.com
On 11/26/2010 06:50 AM, Mitesh Patel wrote:
> We first noticed this problem when building Sage with MPIR 1.2.2 on cleo
> and iras (ia64-Linux-suse). I don't know if the error is specific to
> Itanium.

Leif Leonhardy has reported the same problem on a non-Itanium machine:

http://trac.sagemath.org/sage_trac/ticket/9343#comment:324

I've also seen it on Skynet's menas (x86_64-Linux-core2-suse):

http://build.sagemath.org/sage/builders/openSUSE%2011.1-64%20%28menas%29/builds/39/steps/shell_1/logs/mpir

Jason

Nov 30, 2010, 10:18:54 AM
to mpir-...@googlegroups.com

Hi

There are actually very few dependencies in MPIR. I always test with parallel
builds because I'm impatient, but the one thing I almost never test is
make install (the next MPIR will be tested with make install as well). MPIR
could well have unspecified dependencies. The other possible cause is that if
you are building in a directory on an NFS drive, timing issues can affect it.
You could try building on a local drive: if you still get the error, then it's
most likely bad dependencies, but if the error goes away, then maybe NFS timing
is the cause.

Thanks
Jason

leif

Nov 30, 2010, 2:40:03 PM
to mpir-devel
On 30 Nov., 16:18, Jason <ja...@njkfrudils.plus.com> wrote:
> There are actually very few dependencies in MPIR. I always test with parallel
> builds because I'm impatient, but the one thing I almost never test is
> make install (the next MPIR will be tested with make install as well). MPIR
> could well have unspecified dependencies. The other possible cause is that if
> you are building in a directory on an NFS drive, timing issues can affect it.
> You could try building on a local drive: if you still get the error, then it's
> most likely bad dependencies, but if the error goes away, then maybe NFS timing
> is the cause.

The race condition in 'make install' definitely appears on local
filesystems as well.

(Just a few days ago I saw it for the first time on an old single-core
Pentium 4, too, with 8 'make' jobs on Ubuntu 9.04.)


Cheers,
-Leif

Jason

Nov 30, 2010, 4:57:24 PM
to mpir-...@googlegroups.com

I'll see if I can track it down

Thanks
Jason

Jason

Nov 30, 2010, 8:09:55 PM
to mpir-...@googlegroups.com

Here's the fix.

We have to change Makefile.am to

install-data-hook:
if WANT_GMPCOMPAT
	(rm -f $(DESTDIR)$(includedir)/gmp.h && cp mpir.h $(DESTDIR)$(includedir)/gmp.h)
if WANT_CXX
	(rm -f $(DESTDIR)$(includedir)/gmpxx.h && cp mpirxx.h $(DESTDIR)$(includedir)/gmpxx.h)
endif
endif

(ignoring the line-ending changes caused by my email client)

and then run autoreconf on boxen.

I can issue an mpir-2.1.4 tomorrow, as I'm testing the new 2.2.0-rc3. This is
not a problem, as the change is trivial and obvious.
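
A minimal sketch of why this change removes the ordering dependency: the new
hook reads mpir.h from the build directory rather than from the install
directory, so it no longer matters whether the header itself has been installed
yet. The directories and the function name new_hook are hypothetical. DESTDIR
and includedir are plain shell variables here, standing in for the make
variables of the same name, and the sketch assumes the install directory
already exists when the hook runs.

```shell
#!/bin/bash
# Sketch only: run the new hook before mpir.h has been installed.
work=$(mktemp -d)
build="$work/build"              # stand-in for the MPIR build directory
DESTDIR="$work/stage"            # stand-in for make's $(DESTDIR)
includedir="/usr/local/include"  # stand-in for make's $(includedir)
mkdir -p "$build" "$DESTDIR$includedir"   # assumption: directory exists
echo '/* mpir.h */' > "$build/mpir.h"

# Same commands as the new hook, run from the build directory as make would.
new_hook() {
    (cd "$build" &&
        rm -f "$DESTDIR$includedir/gmp.h" &&
        cp mpir.h "$DESTDIR$includedir/gmp.h")
}

hook_status=0
new_hook || hook_status=$?

# mpir.h has deliberately not been installed, yet gmp.h is already in place.
if [ -e "$DESTDIR$includedir/mpir.h" ]; then
    header_installed=yes
else
    header_installed=no
fi
gmp_h=$(cat "$DESTDIR$includedir/gmp.h")

echo "hook exit status: $hook_status (mpir.h installed: $header_installed)"
rm -rf "$work"
```

The hook still needs the target directory to exist, so this only addresses the
dependency on the installed copy of mpir.h.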

Jason

leif

Dec 1, 2010, 4:16:14 PM
to mpir-devel
On 1 Dez., 02:09, Jason <ja...@njkfrudils.plus.com> wrote:
> Here's the fix.
>
> We have to change Makefile.am to
>
> install-data-hook:
> if WANT_GMPCOMPAT
>         (rm -f $(DESTDIR)$(includedir)/gmp.h && cp mpir.h $(DESTDIR)$(includedir)/gmp.h)
> if WANT_CXX
>         (rm -f $(DESTDIR)$(includedir)/gmpxx.h && cp mpirxx.h $(DESTDIR)$(includedir)/gmpxx.h)
> endif
> endif
>
> (ignoring the line-ending changes caused by my email client)
>
> and then run autoreconf on boxen.
>
> I can issue an mpir-2.1.4 tomorrow, as I'm testing the new 2.2.0-rc3. This is
> not a problem, as the change is trivial and obvious.

Thanks, this will obviously fix it.

I must admit I was rather thinking of adding an explicit prerequisite
like install-nodist_includeexecHEADERS.


Btw, at least when configuring 2.1.3 with non-empty CFLAGS (which we
do in Sage), we also have to add

* -Wl,-z,noexecstack to clear the (erroneously set) executable stack
attributes causing trouble on Fedora 14 (and other SELinux-enabled
systems),

* -Wa,-force_cpusubtype_ALL on MacOS X [10.5] PowerPC [G4], at least
with Apple's XCode GCC 4.2.1, since otherwise the assembler rejects
some code which makes use of an extended instruction set (AltiVec
extensions I think). (See http://trac.sagemath.org/sage_trac/ticket/8664#comment:47
ff. Interestingly, this apparently wasn't necessary on a MacOS X 10.4
PowerPC G5, with XCode GCC 4.0.1.)

(We also remove a lot of x86 assembly code on 32-bit MacOS X 10.4 and
10.5 Intel due to PIC issues, though I'm not sure if this is really
still necessary.)


-Leif

Jason

Dec 1, 2010, 5:59:15 PM
to mpir-...@googlegroups.com
On Wednesday 01 December 2010 21:16:14 leif wrote:
> On 1 Dez., 02:09, Jason <ja...@njkfrudils.plus.com> wrote:
> > Here's the fix.
> >
> > We have to change Makefile.am to
> >
> > install-data-hook:
> > if WANT_GMPCOMPAT
> >         (rm -f $(DESTDIR)$(includedir)/gmp.h && cp mpir.h $(DESTDIR)$(includedir)/gmp.h)
> > if WANT_CXX
> >         (rm -f $(DESTDIR)$(includedir)/gmpxx.h && cp mpirxx.h $(DESTDIR)$(includedir)/gmpxx.h)
> > endif
> > endif
> >
> > (ignoring the line-ending changes caused by my email client)
> >
> > and then run autoreconf on boxen.
> >
> > I can issue an mpir-2.1.4 tomorrow, as I'm testing the new 2.2.0-rc3. This
> > is not a problem, as the change is trivial and obvious.
>
> Thanks, this will obviously fix it.
>
> I must admit I was rather thinking of adding an explicit prerequisite
> like install-nodist_includeexecHEADERS.
>
>

I was going to do it the proper way, but it's more code than the
install-data-hook :). I've never done a parallel install before; it wasn't
worth doing for MPIR, which is why it was never noticed. This probably explains
why Windows installation programs are so unreliable: they are GUI-based and
therefore multithreaded, and are probably full of race conditions. So many
times I have to install twice to get them to work.

> Btw, at least when configuring 2.1.3 with non-empty CFLAGS (which we
> do in Sage), we also have to add
>
> * -Wl,-z,noexecstack to clear the (erroneously set) executable stack
> attributes causing trouble on Fedora 14 (and other SELinux-enabled
> systems),
>

This is fixed in mpir-2.2.0

> * -Wa,-force_cpusubtype_ALL on MacOS X [10.5] PowerPC [G4], at least
> with Apple's XCode GCC 4.2.1, since otherwise the assembler rejects
> some code which makes use of an extended instruction set (AltiVec
> extensions I think). (See
> http://trac.sagemath.org/sage_trac/ticket/8664#comment:47 ff.
> Interestingly, this apparently wasn't necessary on a MacOS X 10.4 PowerPC
> G5, with XCode GCC 4.0.1.)
>

I'll have to have a look at that one.

> (We also remove a lot of x86 assembly code on 32-bit MacOS X 10.4 and
> 10.5 Intel due to PIC issues, though I'm not sure if this is really
> still necessary.)
>

This was fixed back in about mpir-1.3

>
> -Leif
