SEGV in libatlas.so


Georgi Guninski

Oct 12, 2012, 10:03:49 AM
to sage-s...@googlegroups.com
g=DiGraph('kO??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????W???????')
m=g.adjacency_matrix()
m
m^7

Unhandled SIGSEGV:
local/lib/libatlas.so(+0x87d4d)[0x7fdb41c8ed4d]

Georgi Guninski

Oct 12, 2012, 10:21:20 AM
to sage-s...@googlegroups.com
Simpler testcase:

g=graphs.CycleGraph(44);m=g.adjacency_matrix()
m^7
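
For anyone who wants to know what `m^7` should return without going through a BLAS library, here is a minimal pure-Python sketch. It is not Sage code, and the helper names `cycle_adjacency`, `mat_mul` and `mat_pow` are made up for this illustration. It builds the adjacency matrix of the 44-cycle as nested lists and raises it to the 7th power by repeated squaring. Entry (i, j) of the k-th power of an adjacency matrix counts walks of length k from i to j, and a 7-step walk cannot wrap around a 44-cycle, so row 0 should contain the binomial values 35, 21, 7, 1 at distances 1, 3, 5, 7 and sum to 2^7 = 128.

```python
from math import comb


def cycle_adjacency(n):
    """Adjacency matrix of the cycle graph on n vertices, as nested lists."""
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        a[i][(i + 1) % n] = 1
        a[i][(i - 1) % n] = 1
    return a


def mat_mul(x, y):
    """Plain integer matrix product; no BLAS involved."""
    inner, cols = len(y), len(y[0])
    return [[sum(row[t] * y[t][j] for t in range(inner)) for j in range(cols)]
            for row in x]


def mat_pow(a, e):
    """Compute a**e for e >= 0 by repeated squaring."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    while e:
        if e & 1:
            result = mat_mul(result, a)
        a = mat_mul(a, a)
        e >>= 1
    return result


m7 = mat_pow(cycle_adjacency(44), 7)

# Walks of length 7 from vertex 0 to vertex 1 take four +1 steps and three -1 steps.
print(m7[0][1] == comb(7, 4), sum(m7[0]))
```

If a Sage installation returns a matrix whose first row disagrees with these values, or crashes before returning anything, the fault lies below the graph code.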

P Purkayastha

Oct 12, 2012, 10:29:52 AM
to sage-s...@googlegroups.com
Both the examples work here in sage-5.2 and sage-5.4beta1. What version
of Sage are you using?

Georgi Guninski

Oct 12, 2012, 10:34:09 AM
to sage-s...@googlegroups.com
On Fri, Oct 12, 2012 at 10:29:52PM +0800, P Purkayastha wrote:
> Both the examples work here in sage-5.2 and sage-5.4beta1. What
> version of Sage are you using?
>

5.3 on ubuntu 10.04 x86_64.
5.2 crashes too for me.

Dan Drake

Oct 12, 2012, 10:42:04 AM
to sage-s...@googlegroups.com
On Fri, 12 Oct 2012 at 05:21PM +0300, Georgi Guninski wrote:
> Simpler testcase:
>
> g=graphs.CycleGraph(44);m=g.adjacency_matrix()
> m^7

This works for me. You'll need to give more information before anyone
can help you -- platform, whether you compiled yourself or use a binary,
etc.

Dan

--
--- Dan Drake
----- http://math.pugetsound.edu/~ddrake
-------

Georgi Guninski

Oct 12, 2012, 10:54:57 AM
to sage-s...@googlegroups.com
On Fri, Oct 12, 2012 at 07:42:04AM -0700, Dan Drake wrote:
> This works for me. You'll need to give more information before anyone
> can help you -- platform, whether you compiled yourself or use a binary,
> etc.
>

i don't expect anyone to help me...

sage 5.3 on ubuntu 10.04 x86_64 downloaded from sagemath.

kcrisman

Oct 12, 2012, 1:17:29 PM
to sage-s...@googlegroups.com


> This works for me. You'll need to give more information before anyone
> can help you -- platform, whether you compiled yourself or use a binary,
> etc.
>

> i don't expect anyone to help me...

Well, but presumably by helping you we make Sage better. This could be a serious, subtle bug or something trivial; either way, the details will help.

> sage 5.3 on ubuntu 10.04 x86_64 downloaded from sagemath.

What is the name of the binary (i.e., is it 64 or 32 bit, and so forth)?

Georgi Guninski

Oct 13, 2012, 2:02:52 AM
to sage-s...@googlegroups.com
On Fri, Oct 12, 2012 at 10:17:29AM -0700, kcrisman wrote:
> What is the name of the binary (i.e., is it 64 or 32 bit, and so forth)?
>

The name of the directory is:

sage-5.3-linux-64bit-ubuntu_10.04.4_lts-x86_64-Linux

I tested it on a second box with the same binary and Ubuntu 10.04,
and it crashed again.

I doubt this is a hardware problem.

Sage 4.3 doesn't crash on either machine.

Dima Pasechnik

Oct 13, 2012, 6:47:29 AM
to sage-s...@googlegroups.com, guni...@guninski.com
It is a hardware problem in a way: you are running a downloaded Sage binary on a system whose characteristics differ slightly from those of the build machine. It's also a "bug" in Sage in a way: the binary you download assumes a bit too much about the box.
Build Sage from source...

Georgi Guninski

Oct 13, 2012, 7:26:06 AM
to sage-s...@googlegroups.com
On Fri, Oct 12, 2012 at 10:17:29AM -0700, kcrisman wrote:
> What is the name of the binary (i.e., is it 64 or 32 bit, and so forth)?
>

It is easy to check whether the problem is in my boxen: install Ubuntu 10.04 in
a virtual machine, download the 64-bit binary from sagemath, and
run the testcases.

Dima Pasechnik

Oct 13, 2012, 8:45:29 AM
to sage-s...@googlegroups.com, guni...@guninski.com
Not all x86_64 boxes are equal. Different models of x86_64 processors support different sets of instructions, and VMs are even worse in this respect: we have seen situations where the VM does not expose all of the processor's capabilities, yet to the software it looks as if those capabilities are available...


Georgi Guninski

Oct 13, 2012, 8:58:34 AM
to Dima Pasechnik, sage-s...@googlegroups.com
So you are implying that my two boxen (intel and amd) are both buggy, so I
should compile from source on both?

I should have been warned before downloading the binary; please make
this statement explicit on the binary download page :)))

Dima Pasechnik

Oct 13, 2012, 9:25:16 AM
to sage-s...@googlegroups.com, sage-devel


On Saturday, 13 October 2012 20:58:46 UTC+8, Georgi Guninski wrote:
> On Sat, Oct 13, 2012 at 05:45:29AM -0700, Dima Pasechnik wrote:
> > On Saturday, 13 October 2012 19:26:17 UTC+8, Georgi Guninski wrote:
> > > On Fri, Oct 12, 2012 at 10:17:29AM -0700, kcrisman wrote:
> > > > What is the name of the binary (i.e., is it 64 or 32 bit, and so forth)?
> > >
> > > It is easy to check if the problem is in my boxen - install ubuntu 10.04
> > > in a virtual machine, download the 64 bit binary from sagemath and
> > > run the testcases.
> >
> > Not all x86_64 boxes are equal. Different models of x86_64 processors have
> > different sets of commands, and VMs are even worse in this case, as we saw
> > situations where not all the processor capabilities are allowed by the VM,
> > but for the software it looks as if these capabilities are allowed...
>
> So you are implying my two boxen (intel and amd) are all buggy so i
> should compile from source on both?

Buggy? I never said that. It just so happened that the binary you ran on them was built for
a slightly different architecture.

> I should have been warned before downloading the binary, please make
> this statement explicit on the binary download page :)))

Well, Sage keeps shooting itself in the foot here. A few releases ago there was a similar problem with OSX binaries :-(.

It would help if you posted the specifications of your processors,
that is, the output of
cat /proc/cpuinfo

 

Jeroen Demeyer

Oct 13, 2012, 3:10:30 PM
to sage-s...@googlegroups.com
On 2012-10-13 14:45, Dima Pasechnik wrote:
> Not all x86_64 boxes are equal. Different models of x86_64 processors
> have different sets of commands, and VMs are even worse in this case, as
> we saw situations where not all the processor capabilities are allowed
> by the VM, but for the software it looks as if these capabilities are
> allowed...
None of this should lead to a *Segmentation Fault* though. My guess is
rather that incompatible libraries are causing this.

Dima Pasechnik

Oct 14, 2012, 1:52:48 AM
to sage-s...@googlegroups.com
Do you mean to say it should have been "Illegal Instruction" rather than segfault?
Well, I would not bet my head on this...
 

Georgi Guninski

Oct 14, 2012, 3:38:26 AM
to sage-s...@googlegroups.com
I am pretty sure all modern CPUs have a lot of bugs.

I don't have time to debug this bug.

I checked it on Ubuntu 12.04 with the binary Sage and it didn't crash,
so it is probably a "feature". 12.04 made me install libgfortran3,
which is a different version from the one on 10.04.

P Purkayastha

Oct 14, 2012, 4:16:16 AM
to sage-s...@googlegroups.com
The question is not whether the CPU has a bug. The sage binaries and the
libraries they depend on are (I believe) compiled with some minimal level
of optimization. Maybe the atlas library was built with an optimization
that relies on instructions your CPU does not have. Typically, the atlas
library will run fine as long as you don't call the one particular routine
that uses that optimized path. Then, when you do hit the optimized path,
you get a segv or illegal instruction.

Dima asked for the output of your /proc/cpuinfo to determine which
instructions your processor supports. That will help tone down or omit
the specific optimization that the atlas library is being compiled
with.
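
As a rough illustration of what one can read off that output, here is a small Python sketch that pulls the `flags` line out of `/proc/cpuinfo` text and reports which of a list of features the CPU does not advertise. The function names and the `WANTED` list are hypothetical, chosen only for this example; nothing in this thread says which features the ATLAS build in the Sage binary actually relies on. Note that Linux reports SSE3 under the name `pni`.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, wanted):
    """Return, sorted, the wanted feature names that are absent from flags."""
    return sorted(set(wanted) - set(flags))


# Hypothetical list of features a prebuilt library might have been tuned for.
WANTED = ["sse2", "pni", "ssse3", "sse4_1", "sse4_2", "avx"]

# Shortened, illustrative sample; on a real Linux box, pass in the contents
# of /proc/cpuinfo instead.
SAMPLE = "processor : 0\nflags : fpu sse sse2 pni ssse3 sse4_1\n"

print(missing_features(parse_cpu_flags(SAMPLE), WANTED))
```

An empty result only means the CPU advertises everything on the list; it says nothing about whether a VM actually lets those instructions through.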

Jeroen Demeyer

Oct 14, 2012, 8:12:14 AM
to sage-s...@googlegroups.com
On 2012-10-14 07:52, Dima Pasechnik wrote:
> Do you mean to say it should have been "Illegal Instruction" rather than
> segfault?
Exactly.

Jeroen Demeyer

Oct 14, 2012, 8:16:34 AM
to sage-s...@googlegroups.com
On 2012-10-14 10:16, P Purkayastha wrote:
> It is not whether the CPU has any bug. The sage binaries and the
> libraries it depends on are (I believe) compiled with some minimal level
> of optimization. Maybe some optimization got applied in the atlas
> library which is not present in your cpu.
As I said, I doubt that this can lead to Segmentation Faults. In this
case, one should see an Illegal Instruction (SIGILL).
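
To make the distinction concrete: on Linux an opcode the CPU does not implement is reported as SIGILL, while an invalid memory access or a general-protection fault is reported as SIGSEGV. The sketch below is a hypothetical helper (`describe_returncode` is not part of Sage or of any library) that names the signal which killed a child process, relying on the documented POSIX behaviour of Python's `subprocess` module, which reports death by signal N as return code -N.

```python
import signal


def describe_returncode(returncode):
    """Describe how a child process ended, given subprocess's return code.

    On POSIX, a negative return code -N means the child was killed by signal N.
    """
    if returncode >= 0:
        return "exited with status %d" % returncode
    try:
        return "killed by " + signal.Signals(-returncode).name
    except ValueError:
        # Signal number not known to this platform's signal module.
        return "killed by unknown signal %d" % -returncode


print(describe_returncode(-signal.SIGSEGV))
print(describe_returncode(-signal.SIGILL))
```

Running the failing testcase in a child process and feeding its return code to such a helper would show which of the two signals is actually delivered on a given box.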

Dima Pasechnik

Oct 14, 2012, 9:18:54 AM
to sage-s...@googlegroups.com
It's easy to imagine, say, the same CPU instruction requiring a different memory alignment on an older architecture.
I haven't written a line of assembler since circa 1989 (although plenty before that :-)) and I don't speak x86 assembler, but, you know...

  

Georgi Guninski

Oct 14, 2012, 9:54:28 AM
to sage-s...@googlegroups.com
Please make it clear you don't support generic x86_64, only the CPUs you
like :)))))))

I didn't understand: did someone try it on 10.04, possibly in a VM?

12.04 certainly depends on ubuntu's libgfortran3 and it may be remotely
possible it is the culprit.


P Purkayastha

Oct 14, 2012, 10:11:41 AM
to sage-s...@googlegroups.com
On 10/14/2012 09:54 PM, Georgi Guninski wrote:
> On Sun, Oct 14, 2012 at 06:18:54AM -0700, Dima Pasechnik wrote:
>>
>>
>> On Sunday, 14 October 2012 20:16:40 UTC+8, Jeroen Demeyer wrote:
>>>
>>> On 2012-10-14 10:16, P Purkayastha wrote:
>>>> It is not whether the CPU has any bug. The sage binaries and the
>>>> libraries it depends on are (I believe) compiled with some minimal level
>>>> of optimization. Maybe some optimization got applied in the atlas
>>>> library which is not present in your cpu.
>>> As I said, I doubt that this can lead to Segmentation Faults. In this
>>> case, one should see an Illegal Instruction (SIGILL).
>>>
>>
>> It's easy to imagine, say, the same CPU command requiring a different
>> memory alignment on an older arc.
>> I haven't written a line of assembler since circa 1989 (although plenty
>> before that :-)) and I don't speak x86 assembler, but, you know...
>>
>>
>>
>
> Please make it clear you don't support generic x86_64, only the CPUs you
> like :)))))))

I believe the intention is to support generic x86_64. Otherwise the
reply you would have got is: "you probably don't have a supported CPU". :)))


> I didn't understand did someone try it on 10.04 probably in VM?

I am pretty sure the binaries are all tested in the sense that all
doctests are run to make sure they don't give any errors. Jeroen is very
strict about this. :)

VM is a different ballgame. If you search the sage-devel or maybe this
group (within this year), you will find some threads talking about some
problems people have run into in running precompiled Sage (or maybe just
compiling Sage) on VMs. vbraun here releases Virtualbox images with
Sage. They can be found in the Windows download of Sage. More details
here: http://wiki.sagemath.org/SageAppliance

Georgi Guninski

Oct 14, 2012, 10:18:11 AM
to sage-s...@googlegroups.com
I mean, did anyone besides me test the disputed testcases on any Ubuntu
10.04?



Dima Pasechnik

Oct 14, 2012, 10:58:37 AM
to sage-s...@googlegroups.com, guni...@guninski.com


On Sunday, 14 October 2012 22:18:15 UTC+8, Georgi Guninski wrote:
> i mean did anyone besides me test the disputed testcases on any ubuntu
> 10.04?

I presume whoever built the said binary did test it on his/her machine.

There is no such thing as generic x86_64, AFAIK.
Intel did it one way, AMD another, each supporting (or not) all these SSE-whatever features...

Just as there is no x86: there are i286, i386, i486, i586, i686, and I am probably missing some AMD-only thing here.
Oh yeah, there are also Celerons, etc etc etc...
And I am sure you do not want to run a "generic" x86 binary, because this would probably mean i386, i.e. veeery ooold and veeeeery slooooow....

Jeroen Demeyer

Oct 14, 2012, 11:02:44 AM
to sage-s...@googlegroups.com
> There is no such thing like generic x86_64, AFAIK.
Actually, there is. There is a lowest common denominator (which, I
think, actually includes SSE and SSE2).

The binaries are
1) supposed to be generic, anything else is a (low-priority) bug.
2) certainly tested on the platforms they are built on.

Jeroen Demeyer

Oct 14, 2012, 11:11:54 AM
to sage-s...@googlegroups.com
On 2012-10-14 15:54, Georgi Guninski wrote:
> I didn't understand did someone try it on 10.04 probably in VM?
Yes, it was tested in Ubuntu 10.04, not in a VM.

I just tested that binary again on that machine and both your testcases
work.

> 12.04 certainly depends on ubuntu's libgfortran3 and it may be remotely
> possible it is the culprit.
That problem is very specific to Ubuntu 12.04 (and will anyway be fixed
in Sage-5.4)

Volker Braun

Oct 14, 2012, 11:18:58 AM
to sage-s...@googlegroups.com
On Sunday, October 14, 2012 4:02:58 PM UTC+1, Jeroen Demeyer wrote:
> The binaries are
> 1) supposed to be generic, anything else is a (low-priority) bug.
> 2) certainly tested on the platforms they are built on.

Also, the new ATLAS spkg uses the new "generic" archdefs and should fix this.

Georgi Guninski

Oct 14, 2012, 11:31:35 AM
to sage-s...@googlegroups.com
On Sun, Oct 14, 2012 at 05:11:54PM +0200, Jeroen Demeyer wrote:
> On 2012-10-14 15:54, Georgi Guninski wrote:
> > I didn't understand did someone try it on 10.04 probably in VM?
> Yes, it was tested in Ubuntu 10.04, not in a VM.
>
> I just tested that binary again on that machine and both your testcases
> work.
>

ok, thank you.

the problem might indeed be in my boxen.

Georgi Guninski

Oct 15, 2012, 2:11:10 AM
to sage-s...@googlegroups.com
Here is some debug info.
(If I change 44 to 43, it doesn't crash.)

Searching for the function at the top of the stack turns up this:
http://projects.scipy.org/scipy/ticket/1611
"interp1d gives a Segmentation fault when 1. using kind='cubic' and if the original data-set is greater than 119"


g=graphs.CycleGraph(44);m=g.adjacency_matrix();m^7

Program received signal SIGSEGV, Segmentation fault.
0x00007fffec15dd4d in ATL_dupMBmm0_8_0_b0 ()
from /opt/sage/sage-5.3-linux-64bit-ubuntu_10.04.4_lts-x86_64-Linux/local/lib/libatlas.so
(gdb) x/i $pc
=> 0x7fffec15dd4d <ATL_dupMBmm0_8_0_b0+13>: ldmxcsr -0x4(%rsp)
(gdb) p/x $rsp
$1 = 0x7fffffffb848
(gdb) bt
#0 0x00007fffec15dd4d in ATL_dupMBmm0_8_0_b0 ()
from /opt/sage/sage-5.3-linux-64bit-ubuntu_10.04.4_lts-x86_64-Linux/local/lib/libatlas.so
#1 0x00007fffec1a8dac in ATL_dIBNBmm ()
from /opt/sage/sage-5.3-linux-64bit-ubuntu_10.04.4_lts-x86_64-Linux/local/lib/libatlas.so
#2 0x00007fffec1b2035 in ATL_dmmJIK2 ()
from /opt/sage/sage-5.3-linux-64bit-ubuntu_10.04.4_lts-x86_64-Linux/local/lib/libatlas.so
#3 0x00007fffec1b28ea in ATL_dmmJIK ()
from /opt/sage/sage-5.3-linux-64bit-ubuntu_10.04.4_lts-x86_64-Linux/local/lib/libatlas.so
#4 0x00007fffec1aa1a5 in ATL_dgemm ()


(gdb) info reg
rax 0x7fffec15dd40 140737154243904
rbx 0x2c 44
rcx 0x4fdad00 83733760
rdx 0x24 36
rsi 0x24 36
rdi 0x8 8
rbp 0x4fd4a00 0x4fd4a00
rsp 0x7fffffffb848 0x7fffffffb848
r8 0x24 36
r9 0x4fd4a00 83708416
r10 0x8 8
r11 0x24 36
r12 0x4fdad00 83733760
r13 0x4fd0e80 83693184
r14 0x8 8
r15 0x4fd0e80 83693184
rip 0x7fffec15dd4d 0x7fffec15dd4d <ATL_dupMBmm0_8_0_b0+13>
eflags 0x10206 [ PF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0


$grep flags /proc/cpuinfo
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
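
The dump above does not say why `ldmxcsr -0x4(%rsp)` faults, and nobody in this thread has confirmed a cause. One possibility, offered only as an assumption: `ldmxcsr` loads a 32-bit value from memory into the MXCSR register, and per the Intel manuals it raises a general-protection fault when that value has reserved bits set, which Linux reports as SIGSEGV rather than SIGILL. The Python sketch below just does the arithmetic on the register values shown: it computes the operand address, checks that it lies in the 128-byte red zone below `%rsp`, and shows how to test a candidate value for reserved bits. The helper names are hypothetical, and the mask `0xFFFF` assumes a CPU on which all 16 low MXCSR bits are valid.

```python
# Bytes below %rsp that the x86-64 System V ABI lets leaf functions use.
RED_ZONE = 128

# MXCSR layout: bits 0-15 are defined, bits 16-31 are reserved.
# Assumption: every low bit is valid on this CPU (some older CPUs report 0xFFBF).
MXCSR_DEFINED_BITS = 0xFFFF


def effective_address(base, displacement):
    """Address of a base+displacement memory operand such as -0x4(%rsp)."""
    return base + displacement


def in_red_zone(address, rsp):
    """True if address lies in the red zone immediately below rsp."""
    return rsp - RED_ZONE <= address < rsp


def mxcsr_reserved_bits(value, mask=MXCSR_DEFINED_BITS):
    """Bits of a 32-bit value that fall outside the allowed MXCSR mask."""
    return value & ~mask & 0xFFFFFFFF


rsp = 0x7FFFFFFFB848  # from `p/x $rsp` in the gdb session above
addr = effective_address(rsp, -0x4)

print(hex(addr), addr % 4 == 0, in_red_zone(addr, rsp))
print(hex(mxcsr_reserved_bits(0x1F80)))  # 0x1F80 is the power-on MXCSR default
```

In the same gdb session, `x/wx $rsp-4` would show the 32-bit value the instruction tried to load; if `mxcsr_reserved_bits` of that value is nonzero, the reserved-bits explanation would fit.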

Dima Pasechnik

Oct 15, 2012, 10:27:41 AM
to sage-s...@googlegroups.com, guni...@guninski.com
another issue here is that the Atlas version used in 5.3 is old and obsolete.
The new one is here:

 

Georgi Guninski

Nov 13, 2012, 2:47:15 AM
to sage-s...@googlegroups.com
Today I tried the binary Sage 5.4 on Ubuntu 10.04.

Neither testcase SEGVs on it.

If you fixed it, thanks :)


On Fri, Oct 12, 2012 at 05:21:20PM +0300, Georgi Guninski wrote:
> Simpler testcase:
>
> g=graphs.CycleGraph(44);m=g.adjacency_matrix()
> m^7
>
> On Fri, Oct 12, 2012 at 05:03:49PM +0300, Georgi Guninski wrote:
> > g=DiGraph('kO??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????O??????A???????W???????')
> > m=g.adjacency_matrix()
> > m
> > m^7
> >
> > Unhandled SIGSEGV:
> > local/lib/libatlas.so(+0x87d4d)[0x7fdb41c8ed4d]

P Purkayastha

Nov 13, 2012, 7:32:00 AM
to sage-s...@googlegroups.com, guni...@guninski.com


On Monday, October 15, 2012 10:27:41 PM UTC+8, Dima Pasechnik wrote:
> another issue here is that the Atlas version used in 5.3 is old and obsolete.
> The new one is here:

I hope it gets in soon. 5.10 is so much better. It compiled here in 30 minutes, while the previous one would go on and on for hours and then eventually fail.

Volker Braun

Nov 13, 2012, 12:16:40 PM
to sage-s...@googlegroups.com, guni...@guninski.com
Apparently it won't. Itanium doesn't work correctly and it doesn't look like Clint (=Upstream) will have time anytime soon to look into it. If I were him I'd think twice about prioritizing effort for a dead platform, too. 

Dima Pasechnik

Nov 13, 2012, 10:08:45 PM
to sage-s...@googlegroups.com
On 2012-11-13, Volker Braun <vbrau...@gmail.com> wrote:
> Apparently it won't. Itanium doesn't work correctly and it doesn't look
> like Clint (=Upstream) will have time anytime soon to look into it. If I
> were him I'd think twice about prioritizing effort for a dead platform,
> too.
It does look very unfortunate that we have to be held back by this.
Should we just ship Sage on Itanium with an older atlas?

Dima

Volker Braun

Nov 13, 2012, 10:38:09 PM
to sage-s...@googlegroups.com
I would be in favor of relegating Itanium to a second class platform. As fewer and fewer people have access to it, it's bound to become more troublesome. Of course there is value in having it working eventually, but it doesn't have to be on the first day.

P Purkayastha

Nov 14, 2012, 2:28:10 AM
to sage-s...@googlegroups.com
On 11/14/2012 11:38 AM, Volker Braun wrote:
> I would be in favor of relegating Itanium to a second class platform. As
> fewer and fewer people have access to it, it's bound to become more
> troublesome. Of course there is value in having it working eventually,
> but it doesn't have to be on the first day.

According to wiki, it will be around for a couple more years.

http://en.wikipedia.org/wiki/Itanium#Timeline


Volker Braun

Nov 14, 2012, 5:16:32 AM
to sage-s...@googlegroups.com
On Wednesday, November 14, 2012 2:28:31 AM UTC-5, P Purkayastha wrote:
> According to wiki, it will be around for a couple more years.

Theoretically yes, but in practice I think no Sage developer has even seen the current Itanium 9300 hardware. All we have access to is an old chip from back when.
 