20110601


Martin Albrecht

May 31, 2011, 7:36:53 AM
to m4ri-...@googlegroups.com
Hi, I pressed the button, 20110601 is out. Release notes are here:

https://bitbucket.org/malb/m4ri/wiki/M4RI-20110601

Cheers,
Martin


--
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_otr: 47F43D1A 5D68C36F 468BAEBA 640E8856 D7951CCF
_www: http://martinralbrecht.wordpress.com/
_jab: martinr...@jabber.ccc.de

Jerry James

Jun 7, 2011, 1:23:25 PM
to M4RI Development
On May 31, 5:36 am, Martin Albrecht <martinralbre...@googlemail.com>
wrote:
> Hi, I pressed the button, 20110601 is out. Release notes are here:
>
>    https://bitbucket.org/malb/m4ri/wiki/M4RI-20110601
>
> Cheers,
> Martin

Hello all. I help maintain the Fedora Linux package of m4ri. I tried
updating our package to the 20110601 release, but ran into some
problems. I'll start with the easy one. The compiler complains about
passing a const argument to a non-const function parameter. This
patch fixes it:

--- ./src/packedmatrix.h.orig 2011-05-30 09:57:38.000000000 -0600
+++ ./src/packedmatrix.h 2011-06-06 13:54:13.531406107 -0600
@@ -361,7 +361,7 @@
*
*/

-mzd_t *mzd_init_window(mzd_t *M, rci_t const lowr, rci_t const lowc, rci_t const highr, rci_t const highc);
+mzd_t *mzd_init_window(const mzd_t *M, rci_t const lowr, rci_t const lowc, rci_t const highr, rci_t const highc);

/**
* \brief Create a const window/view into a const matrix M.
@@ -371,7 +371,7 @@

static inline mzd_t const *mzd_init_window_const(mzd_t const *M, rci_t const lowr, rci_t const lowc, rci_t const highr, rci_t const highc)
{
- return mzd_init_window((mzd_t*)M, lowr, lowc, highr, highc);
+ return mzd_init_window((const mzd_t*)M, lowr, lowc, highr, highc);
}

/**
--- ./src/packedmatrix.c.orig 2011-05-30 09:57:38.000000000 -0600
+++ ./src/packedmatrix.c 2011-06-06 13:53:42.883406096 -0600
@@ -228,7 +228,7 @@

*/

-mzd_t *mzd_init_window (mzd_t *m, rci_t lowr, rci_t lowc, rci_t highr, rci_t highc) {
+mzd_t *mzd_init_window (const mzd_t *m, rci_t lowr, rci_t lowc, rci_t highr, rci_t highc) {
rci_t nrows, ncols;
mzd_t *window;
window = mzd_t_malloc();
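
To illustrate the const issue in isolation, here is a minimal sketch. It uses a hypothetical `matrix_t` and hypothetical `make_window` functions, not M4RI's real `mzd_t` or `mzd_init_window`. If the function that builds a view only reads its parent, it can take a pointer-to-const, and a const wrapper can then forward its argument without casting const away:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a matrix type; not M4RI's mzd_t. */
typedef struct {
  size_t nrows;
  size_t ncols;
  int const *data;   /* row-major entries, borrowed, never written through */
} matrix_t;

/* Takes the parent as pointer-to-const: creating a view only reads it. */
static matrix_t make_window(matrix_t const *m, size_t lowr, size_t highr) {
  assert(lowr <= highr && highr <= m->nrows);
  matrix_t w;
  w.nrows = highr - lowr;
  w.ncols = m->ncols;
  w.data  = m->data + lowr * m->ncols;
  return w;
}

/* A const wrapper can forward its argument without any cast. */
static matrix_t make_window_const(matrix_t const *m, size_t lowr, size_t highr) {
  return make_window(m, lowr, highr);
}
```

Whether a const parameter is the right contract for a window that can later be written through is a separate design question that this sketch does not settle.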


The remaining problems appear to be OpenMP-related. Compiling with
OpenMP enabled fails, due to 3 distinct kinds of problems. First,
some OpenMP macros refer to variables that aren't declared until the
following line. Second, there is a "return" from the middle of a
critical section in mmc.c. Third, "-fopenmp" isn't passed to the
compiler when building the tests, so they fail to link. This patch
fixes all of those problems:

--- ./src/brilliantrussian.c.orig 2011-05-30 09:57:38.000000000 -0600
+++ ./src/brilliantrussian.c 2011-06-06 13:40:22.118406098 -0600
@@ -362,11 +362,12 @@

int const ka = k / 2;
int const kb = k - k / 2;
+ rci_t r;

#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(static,512) // MAX((__M4RI_CPU_L1_CACHE >> 3) / wide,
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {
rci_t const x0 = L0[ mzd_read_bits_int(M, r, startcol, ka)];
rci_t const x1 = L1[ mzd_read_bits_int(M, r, startcol+ka, kb)];
if((x0 | x1) == 0) // x0 == 0 && x1 == 0
@@ -404,11 +405,12 @@
int const ka = k / 3 + ((rem >= 2) ? 1 : 0);
int const kb = k / 3 + ((rem >= 1) ? 1 : 0);
int const kc = k / 3;
+ rci_t r;

#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(static,512) //if(stoprow-startrow > 128)
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {
rci_t const x0 = L0[ mzd_read_bits_int(M, r, startcol, ka)];
rci_t const x1 = L1[ mzd_read_bits_int(M, r, startcol+ka, kb)];
rci_t const x2 = L2[ mzd_read_bits_int(M, r, startcol+ka+kb, kc)];
@@ -450,11 +452,12 @@
int const kb = k / 4 + ((rem >= 2) ? 1 : 0);
int const kc = k / 4 + ((rem >= 1) ? 1 : 0);
int const kd = k / 4;
+ rci_t r;

#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(static,512) //if(stoprow-startrow > 128)
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {
rci_t const x0 = L0[ mzd_read_bits_int(M, r, startcol, ka)];
rci_t const x1 = L1[ mzd_read_bits_int(M, r, startcol+ka, kb)];
rci_t const x2 = L2[ mzd_read_bits_int(M, r, startcol+ka+kb, kc)];
@@ -499,11 +502,12 @@
int const kc = k / 5 + ((rem >= 2) ? 1 : 0);
int const kd = k / 5 + ((rem >= 1) ? 1 : 0);
int const ke = k / 5;
+ rci_t r;

#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(static,512) //if(stoprow-startrow > 128)
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {

rci_t const x0 = L0[ mzd_read_bits_int(M, r, startcol, ka)];
rci_t const x1 = L1[ mzd_read_bits_int(M, r, startcol+ka, kb)];
@@ -555,11 +559,12 @@
int const kd = k / 6 + ((rem >= 2) ? 1 : 0);
int const ke = k / 6 + ((rem >= 1) ? 1 : 0);;
int const kf = k / 6;
+ rci_t r;

#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(static,512) //if(stoprow-startrow > 128)
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {
rci_t const x0 = L0[ mzd_read_bits_int(M, r, startcol, ka)];
rci_t const x1 = L1[ mzd_read_bits_int(M, r, startcol+ka, kb)];
rci_t const x2 = L2[ mzd_read_bits_int(M, r, startcol+ka+kb, kc)];
--- ./src/mmc.c.orig 2011-05-30 09:57:38.000000000 -0600
+++ ./src/mmc.c 2011-06-06 13:40:22.118406098 -0600
@@ -97,7 +97,7 @@
if(mm[i].size == 0) {
mm[i].size = size;
mm[i].data = condemned;
- return;
+ goto done;
}
}
m4ri_mm_free(mm[j].data);
@@ -107,6 +107,8 @@
} else {
m4ri_mm_free(condemned);
}
+done:
+ ;
#ifdef HAVE_OPENMP
}
#endif
--- ./src/pls_mmpf.c.orig 2011-05-30 09:57:38.000000000 -0600
+++ ./src/pls_mmpf.c 2011-06-06 13:40:22.119406097 -0600
@@ -198,10 +198,11 @@
}

wide -= 2;
+ rci_t r;
#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(dynamic,32) if(stoprow-startrow > 128)
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {
rci_t const x0 = E0[ mzd_read_bits_int(M, r, startcol, ka) ];
word const *t0 = T0->rows[x0] + blocknuma;
word *m0 = M->rows[r+0] + blocknuma;
@@ -245,10 +246,11 @@
}

wide -= 3;
+ rci_t r;
#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(dynamic,32) if(stoprow-startrow > 128)
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {
rci_t const x0 = E0[ mzd_read_bits_int(M, r, startcol, ka) ];
word const *t0 = T0->rows[x0] + blocknuma;
word *m0 = M->rows[r] + blocknuma;
@@ -305,10 +307,11 @@
return;
}
wide -= 4;
+ rci_t r;
#ifdef HAVE_OPENMP
#pragma omp parallel for private(r) shared(startrow, stoprow) schedule(dynamic,32) if(stoprow-startrow > 128)
#endif
- for(rci_t r = startrow; r < stoprow; ++r) {
+ for(r = startrow; r < stoprow; ++r) {
rci_t const x0 = E0[mzd_read_bits_int(M, r, startcol, ka)];
word *t0 = T0->rows[x0] + blocknuma;
word *m0 = M->rows[r] + blocknuma;
--- ./Makefile.am.orig 2011-06-06 13:40:08.159406095 -0600
+++ ./Makefile.am 2011-06-06 13:41:37.195406090 -0600
@@ -19,42 +19,42 @@
check_PROGRAMS=test_multiplication test_elimination test_trsm test_pls test_solve test_kernel test_random test_smallops test_transpose test_colswap
test_multiplication_SOURCES=testsuite/test_multiplication.c
test_multiplication_LDFLAGS=-lm4ri -lm
-test_multiplication_CFLAGS=-I$(srcdir)/src
+test_multiplication_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_elimination_SOURCES=testsuite/test_elimination.c
test_elimination_LDFLAGS=-lm4ri -lm
-test_elimination_CFLAGS=-I$(srcdir)/src
+test_elimination_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_trsm_SOURCES=testsuite/test_trsm.c
test_trsm_LDFLAGS=-lm4ri -lm
-test_trsm_CFLAGS=-I$(srcdir)/src
+test_trsm_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_pls_SOURCES=testsuite/test_pluq.c
test_pls_LDFLAGS=-lm4ri -lm
-test_pls_CFLAGS=-I$(srcdir)/src
+test_pls_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_solve_SOURCES=testsuite/test_solve.c
test_solve_LDFLAGS=-lm4ri -lm
-test_solve_CFLAGS=-I$(srcdir)/src
+test_solve_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_kernel_SOURCES=testsuite/test_kernel.c
test_kernel_LDFLAGS=-lm4ri -lm
-test_kernel_CFLAGS=-I$(srcdir)/src
+test_kernel_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_random_SOURCES=testsuite/test_random.c
test_random_LDFLAGS=-lm4ri -lm
-test_random_CFLAGS=-I$(srcdir)/src
+test_random_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_smallops_SOURCES=testsuite/test_smallops.c testsuite/testing.c testsuite/testing.h
test_smallops_LDFLAGS=-lm4ri -lm
-test_smallops_CFLAGS=-I$(srcdir)/src
+test_smallops_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_transpose_SOURCES=testsuite/test_transpose.c
test_transpose_LDFLAGS=-lm4ri -lm
-test_transpose_CFLAGS=-I$(srcdir)/src
+test_transpose_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

test_colswap_SOURCES=testsuite/test_colswap.c
test_colswap_LDFLAGS=-lm4ri -lm
-test_colswap_CFLAGS=-I$(srcdir)/src
+test_colswap_CFLAGS=-I$(srcdir)/src $(AM_CFLAGS)

TESTS = test_multiplication test_elimination test_trsm test_pls test_solve test_kernel test_random test_smallops test_transpose test_colswap
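
The loop-variable part of the patch above can be shown in isolation. A `private(r)` clause can only name a variable that is already declared when the pragma is seen, so `for(rci_t r = ...)` directly after it fails to compile. The sketch below uses a hypothetical function and toy data, not M4RI code, and tests the standard `_OPENMP` macro rather than the project's `HAVE_OPENMP`:

```c
#include <assert.h>

/* XOR a lookup-table value into each element of rows[start..stop). */
static void add_table_rows(unsigned *rows, unsigned const *table,
                           int start, int stop) {
  assert(start <= stop);
  int r;  /* declared before the pragma so that private(r) can name it */
#ifdef _OPENMP
#pragma omp parallel for private(r) schedule(static, 512)
#endif
  for (r = start; r < stop; ++r) {
    /* Each iteration touches only rows[r]; table is read-only. */
    rows[r] ^= table[rows[r] & 3u];
  }
}
```

An alternative is to keep the declaration inside the `for` statement and drop `private(r)`, since OpenMP makes the iteration variable of a `parallel for` loop private anyway.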


Even with the above patches, m4ri is still failing the tests when
built with OpenMP enabled. I can build without it, but am reluctant
to do so unless there is no other choice. The failing tests,
test_multiplication, test_elimination, and test_kernel, all segfault.
The segfaults seem to occur in different places on each run, which
suggests a race condition somewhere. I don't know how to debug this
further. Any suggestions or help are much appreciated.

One more thing. I'm going to have to rebuild Fedora's polybori
package once I get this sorted out. Did you have to make any
significant changes to polybori when you tested the build, or was it
pretty straightforward?

Regards,
Jerry James

Martin Albrecht

Jun 8, 2011, 7:01:39 AM
to m4ri-...@googlegroups.com
On Tuesday 07 June 2011, Jerry James wrote:
> Hello all. I help maintain the Fedora Linux package of m4ri. I tried
> updating our package to the 20110601 release, but ran into some
> problems. I'll start with the easy one. The compiler complains about
> passing a const argument to a non-const function parameter. This
> patch fixes it:

Cheers, I'll take a look.



> The remaining problems appear to be OpenMP-related. Compiling with
> OpenMP enabled fails, due to 3 distinct kinds of problems. First,
> some OpenMP macros refer to variables that aren't declared until the
> following line. Second, there is a "return" from the middle of a
> critical section in mmc.c. Third, "-fopenmp" isn't passed to the
> compiler when building the tests, so they fail to link. This patch
> fixes all of those problems:

Cheers, that's because I forgot to test OpenMP for this release. Sorry about
that!
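
On the second point quoted above: OpenMP does not allow branching out of a `critical` block, so a `return` inside one is rejected. One way to keep a single exit is to record the outcome in a flag and return after the block. This is a minimal sketch with a hypothetical cache layout (`cache`, `cache_store` and `CACHE_SLOTS` are made-up names), not the code in mmc.c:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 4

/* Hypothetical fixed-size cache of freed blocks; size 0 marks a free slot. */
static struct { size_t size; void *data; } cache[CACHE_SLOTS];

/* Try to park a block in the cache. Returns 1 if stored, 0 if the cache is full. */
static int cache_store(void *block, size_t size) {
  assert(size > 0);
  int stored = 0;
#ifdef _OPENMP
#pragma omp critical
#endif
  {
    for (int i = 0; i < CACHE_SLOTS; ++i) {
      if (cache[i].size == 0) {
        cache[i].size = size;
        cache[i].data = block;
        stored = 1;
        break;          /* leaves the loop, not the critical block */
      }
    }
  }                     /* single exit point of the structured block */
  return stored;
}
```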

> Even with the above patches, m4ri is still failing the tests when
> built with OpenMP enabled. I can build without it, but am reluctant
> to do so unless there is no other choice. The failing tests,
> test_multiplication, test_elimination, and test_kernel, all segfault.
> The segfaults seem to occur in different places on each run, which
> suggests a race condition somewhere. I don't know how to debug this
> further. Any suggestions or help are much appreciated.

Carlo added more caching for matrix structs which could be the cause for this
race condition. I'll also investigate.
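
A generic way to narrow down a race in a shared cache is a small stress loop that hammers it from all threads and then checks an invariant. The sketch below uses a hypothetical slot pool (`in_use`, `pool_get`, `pool_put` and `pool_stress` are made-up names) and makes no claim about how M4RI's struct cache is organised. Both the get and the put side are guarded by the same named critical section:

```c
#include <assert.h>

#define POOL_SIZE 8

/* Hypothetical pool of reusable slots; in_use[i] is 1 while slot i is handed out. */
static int in_use[POOL_SIZE];

/* Hand out a free slot index, or -1 if none is free. */
static int pool_get(void) {
  int slot = -1;
#ifdef _OPENMP
#pragma omp critical(pool)
#endif
  {
    for (int i = 0; i < POOL_SIZE && slot < 0; ++i) {
      if (!in_use[i]) { in_use[i] = 1; slot = i; }
    }
  }
  return slot;
}

/* Return a slot to the pool. */
static void pool_put(int slot) {
  assert(slot >= 0 && slot < POOL_SIZE);
#ifdef _OPENMP
#pragma omp critical(pool)
#endif
  {
    in_use[slot] = 0;
  }
}

/* Stress loop: many get/put pairs from all threads. Returns the number of
   slots still marked in use afterwards, which should be 0. */
static int pool_stress(int iterations) {
  int i;
#ifdef _OPENMP
#pragma omp parallel for private(i)
#endif
  for (i = 0; i < iterations; ++i) {
    int slot = pool_get();
    if (slot >= 0) pool_put(slot);
  }
  int leaked = 0;
  for (int k = 0; k < POOL_SIZE; ++k) leaked += in_use[k];
  return leaked;
}
```

Removing one of the two `critical` guards and rerunning the stress loop under a multi-threaded build is a quick way to see what an unguarded access path looks like.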

> One more thing. I'm going to have to rebuild Fedora's polybori
> package once I get this sorted out. Did you have to make any
> significant changes to polybori when you tested the build, or was it
> pretty straightforward?

The only thing that needs changing IIRC is to replace RADIX by m4ri_radix. The
next version of PolyBoRi (i.e. 0.8) will have that fix (it will also work with
older versions of M4RI btw.)
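
For packagers carrying their own patch in the meantime, one possible way to support both spellings is a small preprocessor shim. This is only a sketch: it assumes the older headers expose the word width as a macro named `RADIX`, and it says nothing about how PolyBoRi 0.8 actually does it. The `#define RADIX 64` line and `words_per_row` are stand-ins so the example compiles on its own.

```c
#include <assert.h>

/* Stand-in for an older M4RI header, which is assumed here to expose the
   word width as a macro named RADIX. Remove this line when using real headers. */
#define RADIX 64

/* Compatibility shim: if only the old name exists, map the new name onto it. */
#if defined(RADIX) && !defined(m4ri_radix)
#define m4ri_radix RADIX
#endif

/* Number of machine words needed to store one row of ncols bits. */
static int words_per_row(int ncols) {
  assert(ncols >= 0);
  return (ncols + m4ri_radix - 1) / m4ri_radix;
}
```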

Cheers,
Martin

PS: I'm at a conference this week with limited internet access, so I might
take longer than usual to reply & look into stuff.

Martin Albrecht

Jun 8, 2011, 7:09:07 AM
to m4ri-...@googlegroups.com
> Hello all. I help maintain the Fedora Linux package of m4ri. I tried
> updating our package to the 20110601 release, but ran into some
> problems. I'll start with the easy one. The compiler complains about
> passing a const argument to a non-const function parameter. This
> patch fixes it:

Hi again,

Which compiler is this? I compile with GCC 4.5.3 using -Wall -pedantic and
nothing shows up. I also compiled under MSVC, which usually seems to be
stricter than my GCC.

Alexander Dreyer

Jun 8, 2011, 9:06:21 AM
to m4ri-...@googlegroups.com
Hi all,

> The only thing that needs changing IIRC is to replace RADIX by m4ri_radix. The
> next version of PolyBoRi (i.e. 0.8) will have that fix (it will also work with
> older versions of M4RI btw.)
This patch will already be in the patched version 0.7.1-p2 for Sage,
which is pending here:
http://trac.sagemath.org/sage_trac/ticket/11261

My best,
Alexander

--
Dr. rer. nat. Dipl.-Math. Alexander Dreyer

Abteilung "Systemanalyse, Prognose und Regelung"
Fraunhofer Institut für Techno- und Wirtschaftsmathematik (ITWM)
Fraunhofer-Platz 1
67663 Kaiserslautern

Telefon +49 (0) 631-31600-4318
Fax +49 (0) 631-31600-1099
E-Mail alexande...@itwm.fraunhofer.de
Internet http://www.itwm.fraunhofer.de/sys/dreyer.html

Martin Albrecht

Jun 8, 2011, 1:17:22 PM
to m4ri-...@googlegroups.com
I just committed a fix; M4RI now builds without issues and runs without
segfaults when --enable-openmp is given.

Note, however, that you'll see little improvement from OpenMP over the
sequential code, at least on Intel CPUs (which tend to have more shared cache
between cores than AMD's). On my 4-core Intel i7 I see an improvement when
using the M4RI algorithm, but the asymptotically fast sequential code is
still faster.

I cut a new release (well, alpha0 of it), available here:

http://m4ri.sagemath.org/downloads/m4ri-20110613.alpha0.tar.gz

Let me know if it works, then I'll release 20110613 proper (which will only
contain this fix).

Cheers,
Martin

PS: I was offline until now so I didn't use your patches in the end but
credited you in mine, I hope that's okay.

Jerry James

Jun 9, 2011, 12:14:05 PM
to M4RI Development
On Jun 8, 11:17 am, Martin Albrecht <martinralbre...@googlemail.com>
wrote:
> I just committed a fix; M4RI now builds without issues and runs without
> segfaults when --enable-openmp is given.
>
> Note, however, that you'll see little improvement from OpenMP over the
> sequential code, at least on Intel CPUs (which tend to have more shared cache
> between cores than AMD's). On my 4-core Intel i7 I see an improvement when
> using the M4RI algorithm, but the asymptotically fast sequential code is
> still faster.

I don't know what the split is between Intel and AMD chips in our user
base. I have to build one package that all of them will download and
use. Would you suggest building with or without OpenMP support? We
also build for a variety of other platforms, such as PowerPC, SuperH,
and Sparc. Do you have a good feel for how we should choose between
OpenMP and non-OpenMP builds for those platforms?

> I cut a new release (well, alpha0 of it), available here:
>
>    http://m4ri.sagemath.org/downloads/m4ri-20110613.alpha0.tar.gz
>
> Let me know if it works, then I'll release 20110613 proper (which will only
> contain this fix).

That works for me. Thank you very much. Sorry to distract you while
at a conference. :-)

> PS: I was offline until now so I didn't use your patches in the end but
> credited you in mine, I hope that's okay.

Oh, absolutely. I don't even care about the credit, really.

You also asked about the compiler I am using. I am running Fedora 15
on my desktop, which includes gcc 4.6.0 and binutils 2.21.51.0.6.

One last thing. I have to disable SSE3/SSSE3 support, because the
machines we build on may have such support, but not all of our users
will. This can result in illegal instruction errors on the users'
machines. I'm currently patching m4/ax_ext.m4 to force those two
checks to fail. It is certainly not urgent, but if you have time in
the future, some officially sanctioned way of forcing those off would
be nice.

Thank you,
Jerry James

Martin Albrecht

Jun 10, 2011, 4:04:42 AM
to m4ri-...@googlegroups.com
> I don't know what the split is between Intel and AMD chips in our user
> base. I have to build one package that all of them will download and
> use. Would you suggest building with or without OpenMP support? We
> also build for a variety of other platforms, such as PowerPC, SuperH,
> and Sparc. Do you have a good feel for how we should choose between
> OpenMP and non-OpenMP builds for those platforms?

I'd say it's currently probably not worth enabling OpenMP by default, at
least on x86(_64). I have no idea about other platforms. But the focus of the
library so far is definitely single-core.



> > I cut a new release (well, alpha0 of it), available here:
> >
> > http://m4ri.sagemath.org/downloads/m4ri-20110613.alpha0.tar.gz
> >
> > Let me know if it works, then I'll release 20110613 proper (which will
> > only contain this fix).
>
> That works for me. Thank you very much.

I guess I will make this the official release then.

> Sorry to distract you while at a conference. :-)

NP :)

> > PS: I was offline until now so I didn't use your patches in the end but
> > credited you in mine, I hope that's okay.
>
> Oh, absolutely. I don't even care about the credit, really.
>
> You also asked about the compiler I am using. I am running Fedora 15
> on my desktop, which includes gcc 4.6.0 and binutils 2.21.51.0.6.
>
> One last thing. I have to disable SSE3/SSSE3 support, because the
> machines we build on may have such support, but not all of our users
> will. This can result in illegal instruction errors on the users'
> machines. I'm currently patching m4/ax_ext.m4 to force those two
> checks to fail. It is certainly not urgent, but if you have time in
> the future, some officially sanctioned way of forcing those off would
> be nice.

Just pass --disable-sse2 to configure :) We only use SSE2 (128-bit wide XORs)
and not SSE3 btw.
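
To make "128-bit wide XORs" concrete, here is a minimal sketch of row addition over GF(2) with SSE2 intrinsics and a scalar fallback. The function name `xor_rows` and the use of unaligned loads are choices made for this example only; it is not M4RI's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* dst[i] ^= src[i] for i in [0, nwords): row addition over GF(2). */
static void xor_rows(uint64_t *dst, uint64_t const *src, size_t nwords) {
  assert(dst != NULL && src != NULL);
  size_t i = 0;
#ifdef __SSE2__
  /* Two 64-bit words per 128-bit XOR; unaligned loads keep the sketch simple. */
  for (; i + 2 <= nwords; i += 2) {
    __m128i a = _mm_loadu_si128((__m128i const *)(dst + i));
    __m128i b = _mm_loadu_si128((__m128i const *)(src + i));
    _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
  }
#endif
  /* Scalar tail, and the whole loop when SSE2 is not enabled. */
  for (; i < nwords; ++i) dst[i] ^= src[i];
}
```

Because the SSE2 path is selected at compile time through `__SSE2__`, the same source builds on targets without SSE2 and simply takes the scalar loop.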

Cheers,
Martin

Jerry James

Jun 10, 2011, 11:17:11 AM
to M4RI Development
On Jun 10, 2:04 am, Martin Albrecht <martinralbre...@googlemail.com>
wrote:
> I'd say it's currently probably not worth enabling OpenMP by default, at
> least on x86(_64). I have no idea about other platforms. But the focus of the
> library so far is definitely single-core.

That's useful information, thanks.

> Just pass --disable-sse2 to configure :) We only use SSE2 (128-bit wide XORs)
> and not SSE3 btw.

No, the SSE2 situation is fine, because there is a configure switch to
control it. The problem is that m4/ax_ext.m4 checks the CPUID of the
*building* machine, and uses that to determine whether to pass -msse3
to the compiler. If the builder happens to have SSE3 support, and gcc
just happens to decide that it has detected a code pattern that can be
compiled down to an SSE3 instruction, then we'll eventually have some
hapless user with a non-SSE3-enabled CPU get an illegal instruction
fault. I realize there's no SSE3 assembly code in m4ri, but the
-msse3 compiler flag is dangerous. I'm asking for a --disable-sse3
configure flag to skip that check.
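
The gap between what the compiler was allowed to emit and what the running CPU supports can be made visible with a small check. This is a sketch, assuming GCC or Clang on x86: `__SSE3__` is defined when SSE3 code generation is enabled (for example by -msse3), and `__builtin_cpu_supports` queries the CPU at run time. The three function names are made up for this example.

```c
/* 1 if the compiler was allowed to emit SSE3 instructions for this file
   (for example via -msse3), 0 otherwise. */
static int built_with_sse3(void) {
#ifdef __SSE3__
  return 1;
#else
  return 0;
#endif
}

/* 1 if the CPU running this code reports SSE3, 0 if not, -1 if unknown. */
static int cpu_has_sse3(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("sse3") ? 1 : 0;
#else
  return -1;
#endif
}

/* The build is only safe here if it needs no more than the CPU provides. */
static int safe_on_this_cpu(void) {
  return !built_with_sse3() || cpu_has_sse3() != 0;
}
```

A check like this only diagnoses the mismatch. For a distribution package the build itself has to avoid -msse3, which is what the requested configure switch would do.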

It's not really a big deal as I can continue the current practice of
patching the configure script to skip the SSE3 check. This is just a
"would be nice" request. :-)

Thanks for the new release.

Regards,
Jerry James

Martin Albrecht

Jul 6, 2011, 10:17:32 AM
to m4ri-...@googlegroups.com

Sorry for coming back to this only now, but I just ran configure with
--disable-sse2 and it worked, i.e. we don't run ax_ext.m4 if --disable-sse2 is
set. Also, there was no -msse3 GCC flag flying by while building. GCC might
detect SSE3 anyway and use it, but that should be controlled by setting CFLAGS
appropriately?

Jerry James

Jul 18, 2011, 1:11:26 PM
to M4RI Development
On Jul 6, 8:17 am, Martin Albrecht <martinralbre...@googlemail.com>
wrote:
> Sorry for coming back to this only now, but I just ran configure with
> --disable-sse2 and it worked, i.e. we don't run ax_ext.m4 if --disable-sse2 is
> set. Also, there was no -msse3 GCC flag flying by while building. GCC might
> detect SSE3 anyway and use it, but that should be controlled by setting CFLAGS
> appropriately?
>
> Cheers,
> Martin

Well, my apologies in return. I haven't been to this forum for a few
weeks, obviously.

The problem isn't the --disable-sse2 case which, as you note, works
correctly. What I'm worried about is the x86_64 builds, where we know
that SSE2 is available, but don't know that SSE3 is available. In
that case, I don't want to pass the --disable-sse2 flag, since we do
want to use the SSE2 code. But since our building machines may be
SSE3 capable, running plain ./configure on those machines does put
-msse3 into SIMD_FLAGS. (I just tried this with the 20110715 release,
by the way.) If a user with a non-SSE3-capable CPU installs the
resulting package, (s)he will get an illegal instruction error at
runtime if GCC actually used the SSE3 instruction set anywhere. I
realize there are a couple of hypotheticals in that scenario, but I'd
like to be sure they can't happen.

Thanks,
Jerry James

Martin Albrecht

Jul 20, 2011, 12:43:52 AM
to m4ri-...@googlegroups.com
> The problem isn't the --disable-sse2 case which, as you note, works
> correctly. What I'm worried about is the x86_64 builds, where we know
> that SSE2 is available, but don't know that SSE3 is available. In
> that case, I don't want to pass the --disable-sse2 flag, since we do
> want to use the SSE2 code. But since our building machines may be
> SSE3 capable, running plain ./configure on those machines does put
> -msse3 into SIMD_FLAGS. (I just tried this with the 20110715 release,
> by the way.) If a user with a non-SSE3-capable CPU installs the
> resulting package, (s)he will get an illegal instruction error at
> runtime if GCC actually used the SSE3 instruction set anywhere. I
> realize there are a couple of hypotheticals in that scenario, but I'd
> like to be sure they can't happen.
>
> Thanks,
> Jerry James

Thanks for explaining, I got it now and opened a ticket at

https://bitbucket.org/malb/m4ri/issue/30/allow-to-disable-sse3

Cheers,
Martin

