I think I found a bug in MPIR 2.1.1 in mpz_divexact() (or I'm doing
something something very stupid). Running the following program gives a
Segmentation Fault:
#include <mpir.h>
int main()
{
mpz_t Z, R;
mpz_init(Z);
mpz_init(R);
mpz_ui_pow_ui(Z, 10, 100000);
mpz_divexact(R, Z, Z);
return 0;
}
gdb backtrace:
Program received signal SIGSEGV, Segmentation fault.
0x00007fa2f2b5245d in mpn_submul_1 () from
/usr/local/src/pari/local/lib/libmpir.so.8
(gdb) bt
#0 0x00007fa2f2b5245d in mpn_submul_1 () from
/usr/local/src/pari/local/lib/libmpir.so.8
#1 0x00007fa2f2b73e72 in __gmpn_sb_divappr_q () from
/usr/local/src/pari/local/lib/libmpir.so.8
#2 0x00007fa2f2b812f2 in __gmpn_inv_divappr_q () from
/usr/local/src/pari/local/lib/libmpir.so.8
#3 0x00007fa2f2b88765 in __gmpn_divexact () from
/usr/local/src/pari/local/lib/libmpir.so.8
#4 0x00007fa2f2b3fb4b in __gmpz_divexact () from
/usr/local/src/pari/local/lib/libmpir.so.8
#5 0x0000000000400741 in main () at mpirtest.c:8
My system is:
$ uname -a
Linux arcanis 2.6.32-gentoo-r7 #5 SMP Thu Jun 10 23:07:26 CEST 2010
x86_64 Intel(R) Core(TM)2 Duo CPU T5870 @ 2.00GHz GenuineIntel GNU/Linux
$ ./config.guess
core2-unknown-linux-gnu
$ gcc --version
gcc-4.4.3 (Gentoo 4.4.3-r2 p1.2) 4.4.3
This is with MPIR 2.1.1 configured with ./configure --enable-gmpcompat
This bug was found thanks to Sage (see
http://trac.sagemath.org/sage_trac/ticket/9837)
Jeroen Demeyer.
This looks like a very live bug. Thanks for the report and the trace.
I think this looks to be in code I added to MPIR when I worked on it,
so I'll try to track this down.
Regarding the internal undocumented functions used by Pari, if there
are no missing symbol errors when you build against MPIR, then it is
likely these are mpn functions that were formerly undocumented, but
now part of the documented interface. I think it would be safe to
leave that flag set (so long as it actually compiles and links).
Bill.
> --
> You received this message because you are subscribed to the Google Groups "mpir-devel" group.
> To post to this group, send email to mpir-...@googlegroups.com.
> To unsubscribe from this group, send email to mpir-devel+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en.
>
>
I've tried a variety of things to replicate this bug, including
valgrinding with symbols on in MPIR and various compiler optimisation
flags, but at least on my machine, I can't replicate it.
If you have time, could you possibly help by taking the following steps:
1) Download a fresh copy of MPIR 2.1.1 from our website: http://www.mpir.org/
2) Run ./configure; make; make check to verify that the library
appears to compile OK on your machine (you could also try building
your program against this clean library again if you haven't already
done this)
3) Run ./configure again until about 30 lines are showing. You'll see
something like
CFLAGS="-O2 -m64 -march=k8 -mtune=k8"
which will be indented (the string on your machine will be different).
Copy the string that shows and alter the -O2 (or whatever you have) to
-g, and pass to ./configure, e.g..:
./configure CFLAGS="-g -m64 -march=k8 -mtune=k8"
4) do
make clean
make
5) build your program against MPIR (to prevent it picking up a stray
copy elsewhere, use -static when linking -- the library being in the
.libs directory, the .h file in the top level directory of the MPIR
source tree -- you'll have to set the -I and -L flags to gcc
accordingly), e.g. I put the program in the top level mpir source
directory and used:
gcc -O2 mpirtest.c -o mpirtest -I. -L../.libs -lmpir -static
6) If you have valgrind installed on your machine, type:
valgrind ./mpirtest
and report to us the output. If not, a gdb backtrace, as you gave, may
be sufficient.
7) If you have access to any other version of gcc on the same machine,
it would also be useful to know if the same bug occurs with that
version of gcc.
This *may* be a compiler bug, but at this stage we have to assume it
is more likely a bug in our code within MPIR.
Bill.
On 3 September 2010 22:10, Jeroen Demeyer <jdem...@cage.ugent.be> wrote:
Sorry, that should read:
gcc -O2 mpirtest.c -o mpirtest -I. -L.libs -lmpir -static
I can reproduce this error , I'm using gcc 4.4.4 , but only if I do a build
for core2 or nehalem , the k8 build is OK , so perhaps it's the CFLAGS=-
march=core2 that causes it?
I'll change the flags and see if that makes a difference
Jason
Jason
Actually thats not true , the values could be good , its just that it follows
a different code path. Building with --enabel-assert we get this assertion
failure when we run the code
inv_divappr_q.c:46: GNU MP assertion failed: nn > dn
Aborted
./configure --enable-assert --build=core2-unknown-linux-gnu
and just run the test case to get an assertion failure , therefore this looks
like a real error , although I could not the failure when I copied the core
param into the K8 directory and changed cflags , so I think this also depends
on which HAVE_NATIVE functions are availible
No , if I put the core2 params in the k8 directory and build on a k8 , I get
the error(must of done something daft before).
The error happens when mpn_divexact (which can accept denom limbs dn <= numer
limbs nn) calls mpn_inv_divapprox_q (which requires just <) . With the k8
thresholds it reduces that case properly , but with core2 params it does not.
It's possible that the k8 params could also produce this error as it just have
to find the right input size.
Unfortunately I'm not at all familiar with the code , I only roughly know the
algorithm.
Bill.
wbhart@selmer:~/mpir-2.1.1/mpn/generic$ diff divexact.c divexact.old
131,141c131,138
< if (qn)
< {
< inv = TMP_ALLOC_LIMBS(dn);
< mpn_invert(inv, dp, dn);
< cy = mpn_inv_divappr_q(qp, n2p, nn, dp, dn, inv);
< if (!extend) qp[qn] = cy;
<
< if ((qp[0] & 1) + q_even != 1) /* quotient is out by 1 */
< mpn_sub_1(qp, qp, qn + 1, 1);
< } else
< qp[0] = 1; // as dp is normalised, exact division means qp = 1
---
> inv = TMP_ALLOC_LIMBS(dn);
> mpn_invert(inv, dp, dn);
> cy = mpn_inv_divappr_q(qp, n2p, nn, dp, dn, inv);
> if (!extend) qp[qn] = cy;
>
> if ((qp[0] & 1) + q_even != 1) /* quotient is out by 1 */
> mpn_sub_1(qp, qp, qn + 1, 1);
>
I don't seem to have an mpir svn setup at the moment. If someone could
commit this and test, I'd be grateful.
I don't see a reason to change the requirement that nn > dn, just that
this requirement is respected by divexact, which it wasn't doing.
Bill.
Jason
Can we do a quick 2.1.1 release for this? I think nothing more is
required than to do a version change, tarball and upload to the
website. It passes make check on my machine with this patch, and given
that it is pure C and fixed the problem, and a well understood patch,
I think a fairly good candidate for a quick release.
Bill.
Jason
Bill.
I already did that, see
http://trac.sagemath.org/sage_trac/ticket/8664#comment:16
PARI doesn't use any undocumented functions, it uses the structure of
the mpz_t type (see MPIR documentation, section 16.1).
It would be useful for PARI (and hence Sage) to have a public interface
for mpn_divexact (currently, there is only mpn_divexact_by3 and
mpn_divexact_by3c).
Jeroen.
Hi
There are a few problems with this , GMP would have to have the same interface
or I certain Pari would not use it. There is a second more subtle problem , do
you want divexact (ie division which is only correct when exact ) or
divrem_hensel (ie division which gives the hensel(2-adic) quotient )
If you use mpn_divexact and are relying on it to be hensel division then when
we change the algorithm later for a faster one (ie bidirectional) then your
code will break , or we will have to fudge it.
For example the old mpn_divexact_by3c was hensel division , but the new one
(which is about twice as fast) is not , and we had to fudge the feed-in and
wind-out code to get the same remainders.
Jason
> There is a second more subtle problem , do
> you want divexact (ie division which is only correct when exact ) or
> divrem_hensel (ie division which gives the hensel(2-adic) quotient )
> If you use mpn_divexact and are relying on it to be hensel division then when
> we change the algorithm later for a faster one (ie bidirectional) then your
> code will break , or we will have to fudge it.
Well, the point of the public function mpz_divexact() is that you *know*
that one number is divisible an other. Nothing in the mpz_divexact()
documentation mentions 2-adic division. So I think the only necessary
criterion is that the quotient is correct when it is known that the
remainder is zero.
Jeroen.
It seems reasonable to me that we could mark the documentation with
"MPIR only" for functions supported by us and not by GMP.
The problem is that suppose MPIR defines mpn_divexact one way, then
GMP come along and define it differently, but with the same name. We
then have to change MPIR to be compatible with GMP, breaking
everyone's code who relied upon the MPIR interface.
This is why Jason is reluctant to document this. If you want to use an
undocumented function, go right ahead, It's:
mpn_divexact (mp_ptr qp,
mp_srcptr np, mp_size_t nn,
mp_srcptr dp, mp_size_t dn)
which does what you think it does.
>
>> There is a second more subtle problem , do
>> you want divexact (ie division which is only correct when exact ) or
>> divrem_hensel (ie division which gives the hensel(2-adic) quotient )
>> If you use mpn_divexact and are relying on it to be hensel division then when
>> we change the algorithm later for a faster one (ie bidirectional) then your
>> code will break , or we will have to fudge it.
> Well, the point of the public function mpz_divexact() is that you *know*
> that one number is divisible an other. Nothing in the mpz_divexact()
> documentation mentions 2-adic division. So I think the only necessary
> criterion is that the quotient is correct when it is known that the
> remainder is zero.
The most useful one for me (in flint) is also the divexact interface,
where you know in advance that it is exactly divisible.
Bill.
>
> Jeroen.
This can be solved (relatively easily ) by library versioning , basically , if
I understood it correctly , is that one library can "export" many incompatible
version of the same function , ie when you link to the library , you have
specify which version of it you want. I read this a year or two ago , I think
in the libtool docs.