Unexpected failure on a test with znpoly

Dr. David Kirkby

Apr 3, 2011, 1:00:50 AM
to sage-devel
I've built Sage tons of times on OpenSolaris as a 32-bit application, and have
rarely had any problems for the last 6 months or so. In fact, I've built
sage-4.7.alpha3 several times without issue.

znpoly is a slightly unusual .spkg in Sage, in that it runs a minimal test suite
irrespective of the setting of SAGE_CHECK. If SAGE_CHECK is set to "yes" then it
runs a more comprehensive set of tests.

But today, with sage-4.7.alpha3, I got a totally unexpected failure.


gcc -g -g -fPIC -O3 -L.
-I/export/home/drkirkby/newdocs/sage-4.7.alpha3/local/include -I./include
-DDEBUG -o test/support-DEBUG.o -c test/support.c
gcc -g -o test/test src/array-DEBUG.o src/invert-DEBUG.o src/ks_support-DEBUG.o
src/mulmid-DEBUG.o src/mulmid_ks-DEBUG.o src/misc-DEBUG.o src/mpn_mulmid-DEBUG.o
src/mul-DEBUG.o src/mul_fft-DEBUG.o src/mul_fft_dft-DEBUG.o src/mul_ks-DEBUG.o
src/nuss-DEBUG.o src/pack-DEBUG.o src/pmf-DEBUG.o src/pmfvec_fft-DEBUG.o
src/tuning-DEBUG.o src/zn_mod-DEBUG.o test/test-DEBUG.o test/ref_mul-DEBUG.o
test/invert-test-DEBUG.o test/pmfvec_fft-test-DEBUG.o
test/mulmid_ks-test-DEBUG.o test/mpn_mulmid-test-DEBUG.o
test/mul_fft-test-DEBUG.o test/mul_ks-test-DEBUG.o test/nuss-test-DEBUG.o
test/pack-test-DEBUG.o test/support-DEBUG.o
-L/export/home/drkirkby/newdocs/sage-4.7.alpha3/local/lib -lgmp -lm
test/test -quick all
mpn_smp_basecase()... ok
mpn_smp_kara()... make[2]: *** [check] Segmentation Fault (core dumped)
make[2]: Leaving directory
`/export/home/drkirkby/newdocs/sage-4.7.alpha3/spkg/build/zn_poly-0.9.p5/src'
Error running zn_poly's quick test suite (make check).

real 5m41.143s
user 1m8.034s
sys 0m5.595s
sage: An error occurred while installing zn_poly-0.9.p5


After I typed "make" again, I see:

Successfully installed zn_poly-0.9.p5

So for some unknown reason, znpoly failed to pass its self-tests, even though
I've probably built it 100 times before and it passed each time.

As usual, I checked the system log and saw nothing to indicate the system had a
problem such as a memory error, disk error, lack of swap space etc.

Dave

Bill Hart

Apr 4, 2011, 8:01:54 AM
to sage-devel
Is the problem reproducible?

If so, a valgrind log would be useful to debug the problem. It should
find the source of any segfault. zn_poly would need to be compiled
with the -g option to GCC. It could of course be a compiler bug.
Possibly compilation with a lower optimisation level would make it
disappear.

Bill.

Dr. David Kirkby

Apr 4, 2011, 8:28:52 PM
to sage-...@googlegroups.com
On 04/ 4/11 01:01 PM, Bill Hart wrote:
> Is the problem reproducible?

I'll have to check how reproducible it is by building znpoly repeatedly. But
it's the first failure I've known in what must be more than 100 complete builds
of Sage on this machine.

znpoly is only taking 39 s to build and run the short test suite, so if I run
it for a day, I can build+test it 2200 times. The more comprehensive test suite
takes several minutes.
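
A minimal sketch of how such a repeat run could be automated, assuming a POSIX system and Python 3; it is not part of Sage or zn_poly. The helper name `run_repeatedly` is hypothetical, and the demo command is a harmless placeholder for the real build-and-test command (for example the `test/test -quick all` invocation shown in the failure log). At 39 s per run, 24 hours allows roughly 86400 / 39 ≈ 2200 runs, which matches the estimate. On POSIX, `subprocess` reports a process killed by a signal as a negative return code, so a segmentation fault would typically show up as -11 (SIGSEGV).

```python
import subprocess
import sys


def run_repeatedly(cmd, runs):
    """Run cmd `runs` times and return (run_index, returncode) for each failure.

    On POSIX a negative returncode -N means the process was killed by signal N.
    """
    failures = []
    for i in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures.append((i, result.returncode))
    return failures


# Placeholder command that always succeeds; substitute the real test command,
# e.g. ["test/test", "-quick", "all"], run from the package's src directory.
demo_cmd = [sys.executable, "-c", "pass"]
failures = run_repeatedly(demo_cmd, runs=3)
print(f"{len(failures)} failure(s) in 3 runs: {failures}")
```

Recording the return code of each failing run, rather than only counting failures, makes it possible to tell a crash (negative code) from an ordinary test failure (positive code).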

But I've built Sage on this machine perhaps 100 times now, and the buildbot uses
it too. So far only one failure.

> If so, a valgrind log would be useful to debug the problem.

Valgrind is not supported on this operating system.

> It should
> find the source of any segfault.

> zn_poly would need to be compiled
> with the -g option to GCC. It could of course be a compiler bug.
> Possibly compilation with a lower optimisation level would make it
> disappear.

It could be. Whatever it is, it is a rare failure, though just how rare I don't
know now. The program is compiled with -O3, so plenty of scope for reducing that.

The package compiles without any warnings at all, though the -Wall option is not
added.

Had parallel builds been enabled in the package, I might have put it down to a
race condition, but they are not, so it can't be that.

Dave

William Stein

Apr 5, 2011, 12:27:05 AM
to sage-...@googlegroups.com, Bill Hart
On Mon, Apr 4, 2011 at 5:01 AM, Bill Hart <goodwi...@googlemail.com> wrote:
> Is the problem reproducible?
>
> If so, a valgrind log would be useful to debug the problem. It should
> find the source of any segfault. zn_poly would need to be compiled
> with the -g option to GCC. It could of course be a compiler bug.
> Possibly compilation with a lower optimisation level would make it
> disappear.
>
> Bill.

Bill,

Do you know if the zn_poly test suite uses any random numbers? If so,
does it use a random time-dependent seed, and if so, does it print out
that seed?

-- William

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

Bill Hart

Apr 5, 2011, 7:21:40 AM
to William Stein, sage-...@googlegroups.com
On 5 April 2011 05:27, William Stein <wst...@gmail.com> wrote:
> On Mon, Apr 4, 2011 at 5:01 AM, Bill Hart <goodwi...@googlemail.com> wrote:
>> Is the problem reproducible?
>>
>> If so, a valgrind log would be useful to debug the problem. It should
>> find the source of any segfault. zn_poly would need to be compiled
>> with the -g option to GCC. It could of course be a compiler bug.
>> Possibly compilation with a lower optimisation level would make it
>> disappear.
>>
>> Bill.
>
> Bill,
>
> Do you know if the zn_poly test suite uses any random numbers?

Pseudorandom. The test in question uses both mpn_random2 and gmp_urandomm_ui.

> If so,
> does it use a random time-dependent seed,

No. Not as far as I can tell. The initialisation in zn_poly 0.9 is
just gmp_randinit_default (randstate). (Note mpn_random2 uses an
internal GMP/MPIR state, not the provided state.)

> and if so, does it print out
> that seed?

I don't see any in the trace below and I didn't find any in the code.
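
The pattern being asked about (pick a seed, print it, and allow it to be replayed) can be sketched as follows. This is an illustration in Python 3, not zn_poly's test code: `make_rng` and `check_modmul` are hypothetical names, and the check itself is a toy stand-in for a real randomized test. In a GMP-based suite the corresponding step would presumably be a call to `gmp_randseed_ui` on the state after `gmp_randinit_default`, with the seed printed first.

```python
import random
import time


def make_rng(seed=None):
    """Return (seed, rng); derive a 32-bit seed from the clock if none is given."""
    if seed is None:
        seed = time.time_ns() & 0xFFFFFFFF
    return seed, random.Random(seed)


def check_modmul(rng, trials=1000):
    """Toy randomized check: reducing operands mod m first gives the same product mod m."""
    for _ in range(trials):
        m = rng.randrange(2, 1 << 32)
        a = rng.randrange(1 << 64)
        b = rng.randrange(1 << 64)
        if (a * b) % m != ((a % m) * (b % m)) % m:
            return False
    return True


seed, rng = make_rng()          # pass a logged seed here to replay a run
print(f"random seed: {seed}")   # printed before testing, so it survives a crash
print("ok" if check_modmul(rng) else "FAILED")
```

Printing the seed before the tests start matters here: if the process dies with a segmentation fault, the seed is already in the log and the same inputs can be regenerated.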

>
>  -- William

Bill.

Bill Hart

Apr 5, 2011, 7:57:00 AM
to sage-devel
It is testing exceptionally simple code, with exceptionally simple
tests.

I doubt this is a bug in zn_poly.

If it turns out to not be reproducible, I would ignore it. It's much
more likely a cosmic ray or transient hardware fault or bug in the OS.

But let's see if it can be reproduced.

Bill.

kcrisman

Dec 7, 2011, 1:25:09 PM
to sage-...@googlegroups.com
It won't let me reply any more...

Just updating this thread to point out this can still happen. This
is on Cygwin; failed twice at the same spot, then no problems.

make[2]: Leaving directory
`/home/Administrator/sage-4.7.2/spkg/build/zn_poly-0.9.p5/src'
zn_poly tuning program
(use -v flag for verbose output)

Calibrating cycle counter... ok (2.37e+09)
mpn smp kara: done
mpn mulmid fallback: done
KS1/2/4 mul: ...............................
KS1/2/4 sqr: ...............................
KS1/2/4 mulmid: ...............................
nuss mul: ...............................
nuss sqr: ...............................
KS/FFT mul: ......................../spkg-install: line 59: 3460
Segmentation fault (core dumped) tune/tune > src/tuning.c
Error running tune program.

real 0m25.990s
user 0m18.081s
sys 0m1.781s


sage: An error occurred while installing zn_poly-0.9.p5

Please email sage-devel http://groups.google.com/group/sage-devel
