How to proceed to reduce Sage's memory leaking?


Nils Bruin

Nov 3, 2012, 3:58:08 PM
to sage-devel
Presently, Sage has a significant memory leak issue: Uniqueness of
parents is currently guaranteed by keeping them in memory
permanently. This prevents many computational strategies that are
otherwise perfectly legitimate, but require the construction of, for
instance, many finite fields and/or polynomial rings. A lot of
arithmetic geometric constructions fall in that category and current
approaches to tackle noncommutative algebra do too. Every single time
I have tried to use sage for some significant computation, this has
prevented me from completing it.

There has been work on resolving this issue by replacing permanent
storage by weak caching, so that parents can actually get deleted; see
tickets #715 and #11521, for instance. The code on these tickets is by
now one of the most carefully reviewed (future) parts of sage.
However, time and again, issues elsewhere crop up because there is
broken code elsewhere that never got exercised because parents never
got deleted, even though they should.
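
A minimal sketch of the weak-caching idea described above, using only Python's standard `weakref` module. The names `Parent` and `WeakParentCache` are hypothetical stand-ins and are not the code from #715 or #11521; the point is only that uniqueness holds while a parent is referenced, and the cache entry disappears once it is not.

```python
import gc
import weakref


class Parent:
    """Stand-in for a Sage parent (hypothetical, not Sage's class)."""

    def __init__(self, key):
        self.key = key


class WeakParentCache:
    """Return one live object per key without keeping it alive."""

    def __init__(self):
        # Values are held weakly: an entry vanishes once the last
        # outside reference to its parent is dropped.
        self._cache = weakref.WeakValueDictionary()

    def get(self, key):
        parent = self._cache.get(key)
        if parent is None:
            parent = Parent(key)
            self._cache[key] = parent
        return parent

    def __len__(self):
        return len(self._cache)


cache = WeakParentCache()
a = cache.get(("GF", 7))
b = cache.get(("GF", 7))
same_object = a is b          # uniqueness while a reference exists
del a, b
gc.collect()                  # in CPython only needed for reference cycles
entries_after_delete = len(cache)
```

With a permanent (strong) dictionary in place of the `WeakValueDictionary`, the final count would stay at 1 forever, which is the leak discussed here.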

We have been in shape a couple of times now, where all noticeable
issues were resolved. However, the merger of *other* tickets brought
to light yet more issues, resulting in pulling #715 and #11521.

If we ever want sage to be usable on a reasonable scale, parents need
to be deleted every now and again. The basic design allows them to.
It's just that there is a lot of code in sage that breaks when that
actually happens. Apparently, the normal ticket review and merger
process is not suitable for a change that's so fundamental to sage's
infrastructure, because it favours small-scale and superficial patches
(and hence keeps moving the goal posts for infrastructure changes).
Any ideas how to get this done?
For me this is a must-have to consider sage as a viable platform and I
suspect I am not the only one for whom it is.

Cheers,

Nils

Jeroen Demeyer

Nov 3, 2012, 4:12:54 PM
to sage-...@googlegroups.com
Let me add to this that the bugs revealed by these tickets are often
quite complex. These are hard to debug, both for Nils Bruin and Simon
King working on the ticket, and for me as release manager.

For example, I remember in the past two seemingly unrelated tickets
which together caused a bug, but independently did not.

Travis Scrimshaw

Nov 3, 2012, 5:58:10 PM
to sage-...@googlegroups.com
Here are my thoughts on the matter, but I'm not an expert on the inner workings of sage, so please forgive/tell me if this is already done/impossible.

I propose limiting the size of the cache of parents and keeping track of the references to parents. Parents with the fewest references would then be evicted from the cache once we reach the maximum. Additionally, if a parent has no references, we allow the garbage collector to take it. To get around referenced parents being spontaneously deleted, every time we return a parent object, we return a lightweight bridge class which recreates the parent when called if it has been deleted (and which also gets notified when the parent is deleted). Something like this:

class ParentBridge:
    def __init__(self, parent_class, data):
        self._parent_class = parent_class
        self._data = data # arguments passed to the parent
        self._parent = None

    def parent(self):
        if self._parent is None:
            self._create_parent()
        return self._parent

    def _create_parent(self):
        # Create the parent from the stored arguments (a full version
        # would also register it in the cache)
        self._parent = self._parent_class(*self._data)

    def _parent_deleted(self):
        self._parent = None

We also return the same ParentBridge when the parent is stored in the cache. This would basically be a slight modification of a weak reference which recreates the target object if it is invalid. Another variant is we implement some other type of cache replacement algorithm (http://en.wikipedia.org/wiki/Cache_algorithms).

Alternatively, we could just allow parents with no references to be garbage collected. This will likely not break any doctests since checking parent identity is usually in successive lines and the garbage collector usually does not have time to collect anything within a few lines when doctesting. We might also want to add a flag for (very) select instances which says that it can never be collected.

In both of the above, there is at most 1 instance of a given parent at any one time, so I do not foresee any problems (as long as we can reconstruct the parent object and appropriate references if it's deleted). Nevertheless, how we implement this must minimally change the interface, and I suspect the first way I suggested may require substantial change...
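
A runnable sketch of the bridge idea, assuming the "notification" is done with a `weakref.ref` callback. `Ring` is a hypothetical stand-in for a parent class, and the `deletions` counter exists only to make the behaviour visible; none of this is Sage code.

```python
import gc
import weakref


class Ring:
    """Hypothetical parent class used only for this illustration."""

    def __init__(self, characteristic):
        self.characteristic = characteristic


class ParentBridge:
    def __init__(self, parent_class, data):
        self._parent_class = parent_class
        self._data = data            # arguments passed to the parent
        self._ref = None             # weak reference to the live parent
        self.deletions = 0           # how often the parent was collected

    def parent(self):
        parent = self._ref() if self._ref is not None else None
        if parent is None:
            parent = self._parent_class(*self._data)
            # The callback plays the role of _parent_deleted.
            self._ref = weakref.ref(parent, self._parent_deleted)
        return parent

    def _parent_deleted(self, _ref):
        self._ref = None
        self.deletions += 1


bridge = ParentBridge(Ring, (5,))
first = bridge.parent()
same_while_alive = bridge.parent() is first
del first
gc.collect()
second = bridge.parent()             # transparently recreated
```

One caveat of this design: the recreated parent is a new object with a new identity, so anything keyed on the identity of the old one would need separate care.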

Best,
Travis

Volker Braun

Nov 3, 2012, 7:18:11 PM
to sage-...@googlegroups.com
I'd say talk to Jeroen to make collectable parents a priority for one release. For example, let's have 5.5 as the release where we add the collectable parents. Push out a beta1 with these patches, then we'll have a month during Jeroen's holiday where we can check any other tickets. No other tickets get merged if they break the parents stuff.




Jeroen Demeyer

Nov 3, 2012, 7:41:56 PM
to sage-...@googlegroups.com
An extra complication is that the breakage is often non-reproducible and
system-dependent. Together with the weird interaction between seemingly
unrelated patches, even determining whether a patch breaks the parent
stuff is very non-trivial.

Volker Braun

Nov 3, 2012, 8:06:42 PM
to sage-...@googlegroups.com
You make it sound like there is just not enough doctesting coverage. The Sage doctests generally do not generate a lot of parents in one go. Maybe it's just that the coverage of this use case needs to be improved? E.g. create a list of thousands of parents, delete a random subset, garbage collect, repeat?
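
A sketch of such a stress loop, written against a plain `weakref.WeakValueDictionary` rather than Sage's real caches. `Parent` and `stress_weak_cache` are hypothetical names; in a real test the inner loop would construct actual parents such as finite fields or polynomial rings.

```python
import gc
import random
import weakref


class Parent:
    """Hypothetical stand-in for a Sage parent."""

    def __init__(self, key):
        self.key = key


def stress_weak_cache(rounds=5, batch=1000, seed=0):
    rng = random.Random(seed)        # seeded, so failures can be replayed
    cache = weakref.WeakValueDictionary()
    alive = []                       # strong references we control
    next_key = 0
    for _ in range(rounds):
        for _ in range(batch):
            parent = Parent(next_key)
            cache[next_key] = parent
            alive.append(parent)
            next_key += 1
        del parent                   # the loop variable is a reference too
        rng.shuffle(alive)
        del alive[len(alive) // 2:]  # drop a random half
        gc.collect()
        expected = {p.key for p in alive}
        if set(cache.keys()) != expected:
            raise RuntimeError("cache and live parents disagree")
    return len(alive), len(cache)


alive_count, cached_count = stress_weak_cache()
```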

I admit that I haven't followed these patches as much as I would like. It's clear that deleting parents can trigger lots of nasty stuff. We need to understand how to exercise that code.

If we can agree to dedicating a point release to this issue then that just means that beta0 is going to be broken on some systems. I take it this is Nils' original objection: Not every beta has to work perfectly on every system. If you merge a hundred small patches then it's reasonable to kick everything back out that triggers a doctest failure. But if you want to make progress on a big issue then you have to accept that a beta is going to be imperfect and meant to expose a ticket to a much wider audience.

Francois Bissey

Nov 3, 2012, 8:23:57 PM
to sage-...@googlegroups.com
On 04/11/12 13:06, Volker Braun wrote:
> You make it sound like there is just not enough doctesting coverage. The
> Sage doctests generally do not generate a lot of parents in one go.
> Maybe it's just that the coverage of this use case needs to be improved?
> E.g. create a list of thousands of parents, delete random subset,
> garbage collect, repeat?
>
> I admit that I haven't followed these patches as much as I would. It's
> clear that deleting parents can trigger lots of nasty stuff. We need to
> understand how to exercise that code.
>
> If we can agree to dedicating a point release to this issue then that
> just means that beta0 is going to be broken on some systems. I take it
> this is Nils' original objection: Not every beta has to work perfectly
> on every system. If you merge a hundred small patches then it's
> reasonable to kick everything back out that triggers a doctest failure.
> But if you want to make progress on a big issue then you have to accept
> that a beta is going to be imperfect and meant to expose a ticket to a
> much wider audience.
>
>

Actually, because some of the bugs are platform dependent etc... the
audience from a beta may not be big enough.
But nevertheless we have to just bite the bullet, do the best we can
and fix things as they become apparent. We cannot stop moving forward
because we are afraid to break stuff accidentally forever.

Francois

Jeroen Demeyer

Nov 4, 2012, 3:29:16 AM
to sage-...@googlegroups.com
On 2012-11-04 01:06, Volker Braun wrote:
> You make it sound like there is just not enough doctesting coverage. The
> Sage doctests generally do not generate a lot of parents in one go.
> Maybe it's just that the coverage of this use case needs to be improved?
> E.g. create a list of thousands of parents, delete random subset,
> garbage collect, repeat?
It would be absolutely awesome if we would have good doctests for this.
Of all the tickets I have ever seen as release manager, this is
probably the single hardest ticket to debug and find out why stuff
breaks (with #12221 as honorable second).

Jeroen Demeyer

Nov 4, 2012, 3:36:48 AM
to sage-...@googlegroups.com
On 2012-11-04 01:23, Francois Bissey wrote:
> But nevertheless we have to just bite the bullet, do the best we can
> and fix things as they become apparent. We cannot stop moving forward
> because we are afraid to break stuff accidentally forever.
OK, let's go for it!

Do you want also other tickets like #12215 and #12313 or should we do
just #715 + #11521?

Francois Bissey

Nov 4, 2012, 4:14:41 AM
to sage-...@googlegroups.com
It may be best to do only one set of big changes at a time, just to not
confuse issues. But these two sets may be similar enough.
Any other opinions?

Francois

Robert Bradshaw

Nov 5, 2012, 3:12:02 PM
to sage-...@googlegroups.com
+1. I've always been meaning to get back to this for ages, but just
haven't found the time. If we're going to make a big push to get this
in, I'll do what I can to help.

For testing, I would propose we manually insert gc operations
periodically to see if we can reproduce the failures more frequently.
We could then mark some (hopefully a very small number) parents as
"unsafe to garbage collect" and go forward with this patch, holding
hard references to all "unsafe" parents to look into them later (which
isn't a regression).

- Robert

Simon King

Nov 5, 2012, 6:25:20 PM
to sage-...@googlegroups.com
Hi Robert,

On 2012-11-05, Robert Bradshaw <robe...@gmail.com> wrote:
> +1. I've always been meaning to get back to this for ages, but just
> haven't found the time. If we're going to make a big push to get this
> in, I'll do what I can to help.

I'd appreciate your support!

> For testing, I would propose we manually insert gc operations
> periodically to see if we can reproduce the failures more frequently.

How can one insert gc operations? You mean, by inserting gc.collect()
into doctests, or by manipulating the Python call hook?

> We could then mark some (hopefully a very small number) parents as
> "unsafe to garbage collect" and go forward with this patch, holding
> hard references to all "unsafe" parents to look into them later (which
> isn't a regression).

That actually was what we tried: There was some bug that has only
occurred on bsd.math, and could be fixed by keeping a strong cache for
polynomial rings (which is unacceptable for my own project, but which
is at least no regression).

Anyway. I did not look into the new problems yet. If it is (again) about
libsingular polynomial rings, then I think we should really make an
effort to get reference counting for libsingular rings right.

Best regards,
Simon

Robert Bradshaw

Nov 5, 2012, 8:15:07 PM
to sage-...@googlegroups.com
On Mon, Nov 5, 2012 at 3:25 PM, Simon King <simon...@uni-jena.de> wrote:
> Hi Robert,
>
> On 2012-11-05, Robert Bradshaw <robe...@gmail.com> wrote:
>> +1. I've always been meaning to get back to this for ages, but just
>> haven't found the time. If we're going to make a big push to get this
>> in, I'll do what I can to help.
>
> I'd appreciate your support!
>
>> For testing, I would propose we manually insert gc operations
>> periodically to see if we can reproduce the failures more frequently.
>
> How can one insert gc operations? You mean, by inserting gc.collect()
> into doctests, or by manipulating the Python call hook?

I was thinking about inserting it into the doctesting code, e.g. with
a random (known seed) x% chance between any two statements.
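
A sketch of what a seeded, random garbage-collection step between statements could look like. It runs plain Python statements with `exec` rather than hooking into Sage's doctest framework, and `run_with_random_gc` and its parameters are hypothetical names.

```python
import gc
import random


def run_with_random_gc(statements, seed, percent=10.0):
    """Execute statements in order, collecting before some of them.

    Returns the namespace and the indices at which gc.collect() ran.
    """
    rng = random.Random(seed)          # private, seeded generator
    namespace = {}
    collected_before = []
    for index, statement in enumerate(statements):
        if rng.random() * 100.0 < percent:
            gc.collect()
            collected_before.append(index)
        exec(statement, namespace)
    return namespace, collected_before


statements = ["x = [1, 2, 3]", "y = sum(x)", "z = y * 2"] * 20
_, first_run = run_with_random_gc(statements, seed=2012, percent=25.0)
_, second_run = run_with_random_gc(statements, seed=2012, percent=25.0)
namespace, every_time = run_with_random_gc(statements, seed=1, percent=100.0)
```

Because the generator is seeded, a failing run can be replayed with the same collection points, which is the property a known seed is meant to provide.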

>> We could then mark some (hopefully a very small number) parents as
>> "unsafe to garbage collect" and go forward with this patch, holding
>> hard references to all "unsafe" parents to look into them later (which
>> isn't a regression).
>
> That actually was what we tried: There was some bug that has only
> occurred on bsd.math, and could be fixed by keeping a strong cache for
> polynomial rings (which is unacceptable for my own project, but which
> is at least no regression).
>
> Anyway. I did not look into the new problems yet. If it is (again) about
> libsingular polynomial rings, then I think we should really make an
> effort to get reference counting for libsingular rings right.

True, but I'd rather no particular ring hold us back from getting the
general fix in.

- Robert

Jeroen Demeyer

Nov 12, 2012, 4:47:16 PM
to sage-...@googlegroups.com
Bad news again. During a preliminary test of sage-5.5.beta2, I got again
a segmentation fault in
devel/sage/sage/schemes/elliptic_curves/ell_number_field.py
but this time on a different system (arando: Linux i686) and with a
different set of patches as before. And for added fun: this time the
error isn't always reproducible.

Nils Bruin

Nov 12, 2012, 10:16:15 PM
to sage-devel
On Nov 12, 1:47 pm, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> And for added fun: this time the error isn't always reproducible.

That's excellent news! Just keep trying until it's not reproducible
anymore. Then we're fine!

Seriously though, the fact that the bug pops up in the same file as
before indicates that probably the deletion of a similar kind of
object is to blame here. We just need to keep trying until we find a
way to consistently produce the error on a platform with reasonable
debugging tools.

Incidentally: Are PPC-OSX4 (or wherever the problem earlier arose)
and i686 both 32 bit platforms? My bet is singular, since we know
refcounting there (or at least our interfacing with it) is handled
fishily and a previous issue indicated that omalloc is almost
tailor-made to generate different problems on different word lengths.

Michael Welsh

Nov 12, 2012, 10:17:57 PM
to sage-...@googlegroups.com
On 13/11/2012, at 4:16 PM, Nils Bruin <nbr...@sfu.ca> wrote:
>
> Incidentally: Are PPC-OSX4 (or wherever the problem earlier arose)
> and i686 both 32 bit platforms?

Yes.

Jean-Pierre Flori

Nov 13, 2012, 8:13:04 PM
to sage-...@googlegroups.com
I'll try to set up a 32-bit (i686) install of the latest beta this weekend and give this a shot...
If I'm lucky enough, I'll be able to reproduce the problem and get a proper backtrace, hopefully pointing to libsingular.

Jeroen Demeyer

Nov 14, 2012, 11:29:48 AM
to sage-...@googlegroups.com
It happens also on other systems, including 64-bit. It's easy to
reproduce on the Skynet machine "sextus" (Linux x86_64) where it happens
about 71% of the time.

Nils Bruin

Nov 14, 2012, 1:06:53 PM
to sage-devel
On Nov 14, 8:29 am, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> It happens also on other systems, including 64-bit.  It's easy to
> reproduce on the Skynet machine "sextus" (Linux x86_64) where it happens
> about 71% of the time.

That might be workable. What exact version/patches to reproduce the
problem? (I don't think I have a login on "sextus"). I don't promise
that I'll actually have time to build/test/track down this problem,
but I can see. Other people should definitely look at it too.

Jeroen Demeyer

Nov 14, 2012, 2:28:16 PM
to sage-...@googlegroups.com
On 2012-11-14 19:06, Nils Bruin wrote:
> I don't think I have a login on "sextus"
FYI: it's a Fedora 16 system with an Intel(R) Pentium(R) 4 CPU 3.60GHz
processor running Linux 3.3.7-1.fc16.x86_64.

Nils Bruin

Nov 14, 2012, 5:34:27 PM
to sage-devel
On Nov 14, 11:28 am, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> FYI: it's a Fedora 16 system with an Intel(R) Pentium(R) 4 CPU 3.60GHz
> processor running Linux 3.3.7-1.fc16.x86_64.

That sounded convenient because my desktop is similar:

Fedora 16 running 3.6.5-2.fc16.x86_64 #1 SMP on Intel(R) Core(TM)
i7-2600 CPU @ 3.40GHz

No such luck, however:

with

$ ./sage -v
Sage Version 5.5.beta2, Release Date: 2012-11-13

I ran

for i in `seq 100`; do
    echo $i;
    ./sage -t devel/sage/sage/schemes/elliptic_curves/ell_number_field.py || echo FAULT AT i is $i
done

which succeeded all 100 times.

Nils Bruin

Nov 14, 2012, 6:42:23 PM
to sage-devel
However, in an effort to make memory errors during testing a little
more reproducible I made this little edit to local/bin/sagedoctest.py
to ensure the garbage collector is run before every doctested line:

--------------------------------------------------------------------
diff --git a/sagedoctest.py b/sagedoctest.py
--- a/sagedoctest.py
+++ b/sagedoctest.py
@@ -1,7 +1,9 @@
 from __future__ import with_statement

 import ncadoctest
+import gc
 import sage.misc.randstate as randstate
+import sys

 OrigDocTestRunner = ncadoctest.DocTestRunner
 class SageDocTestRunner(OrigDocTestRunner):
@@ -35,6 +37,8 @@ class SageDocTestRunner(OrigDocTestRunne
             except Exception, e:
                 self._timeit_stats[key] = e
         # otherwise, just run the example
+        sys.stderr.write('testing example %s\n'%example)
+        gc.collect()
         OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)

     def save_timeit_stats_to_file_named(self, output_filename):
--------------------------------------------------------------------

(i.e., just add a gc.collect() to run_one_example)
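
For readers without the Sage 5.x doctest scripts at hand, here is a sketch of the same trick using Python's standard `doctest` module instead of `ncadoctest`. It assumes that overriding `DocTestRunner.report_start`, which the standard runner calls before each example under default option flags, is an acceptable place to hook in; `GCDocTestRunner` and `run_with_gc` are hypothetical names.

```python
import doctest
import gc


class GCDocTestRunner(doctest.DocTestRunner):
    """Doctest runner that forces a full collection before every example."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.collections = 0

    def report_start(self, out, test, example):
        gc.collect()
        self.collections += 1
        super().report_start(out, test, example)


def run_with_gc(docstring, name="sample"):
    """Parse a docstring into a doctest and run it under GCDocTestRunner."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(docstring, {}, name, None, 0)
    runner = GCDocTestRunner(verbose=False)
    results = runner.run(test)
    return runner, results


runner, results = run_with_gc(">>> x = [1, 2]\n>>> len(x)\n2\n")
```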

and it causes a reliable failure in crypto/mq/mpolynomialsystem.py:

Trying:
C[Integer(0)].groebner_basis()###line 84:_sage_ sage: C[0].groebner_basis()
Expecting:
Polynomial Sequence with 26 Polynomials in 16 Variables
testing example <ncadoctest.Example instance at 0x69706c8>
ok
Trying:
A,v = mq.MPolynomialSystem(r2).coefficient_matrix()###line 87:_sage_ sage: A,v = mq.MPolynomialSystem(r2).coefficient_matrix()
Expecting nothing
testing example <ncadoctest.Example instance at 0x6970710>
*** glibc detected *** python: double free or corruption (out): 0x00000000075c58c0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x31cfe7da76]
/lib64/libc.so.6[0x31cfe7ed5e]
/usr/local/sage/5.5b2/local/lib/python/site-packages/sage/rings/polynomial/pbori.so(+0x880aa)[0x7fa5eba7e0aa]
/usr/local/sage/5.5b2/local/lib/python/site-packages/sage/rings/polynomial/pbori.so(+0x1d993)[0x7fa5eba13993]
...

Running it under sage -t --gdb gives:

(gdb) bt
#0 0x00000031cfe36285 in raise () from /lib64/libc.so.6
#1 0x00000031cfe37b9b in abort () from /lib64/libc.so.6
#2 0x00000031cfe7774e in __libc_message () from /lib64/libc.so.6
#3 0x00000031cfe7da76 in malloc_printerr () from /lib64/libc.so.6
#4 0x00000031cfe7ed5e in _int_free () from /lib64/libc.so.6
#5 0x00007fffce5cb0aa in Delete<polybori::groebner::ReductionStrategy> (mem=0x547db30) at /usr/local/sage/5.5b2/local/include/csage/ccobject.h:77
#6 __pyx_pf_4sage_5rings_10polynomial_5pbori_17ReductionStrategy_2__dealloc__ (__pyx_v_self=<optimized out>) at sage/rings/polynomial/pbori.cpp:37868
#7 __pyx_pw_4sage_5rings_10polynomial_5pbori_17ReductionStrategy_3__dealloc__ (__pyx_v_self=0x54bf390) at sage/rings/polynomial/pbori.cpp:37834
#8 __pyx_tp_dealloc_4sage_5rings_10polynomial_5pbori_ReductionStrategy (o=0x54bf390) at sage/rings/polynomial/pbori.cpp:52283
#9 0x00007fffce560993 in __pyx_tp_clear_4sage_5rings_10polynomial_5pbori_GroebnerStrategy (o=0x54baeb0) at sage/rings/polynomial/pbori.cpp:52545
#10 0x00007ffff7d4b637 in delete_garbage (old=0x7ffff7fe19e0, collectable=0x7fffffffbb60) at Modules/gcmodule.c:769
#11 collect (generation=2) at Modules/gcmodule.c:930
#12 0x00007ffff7d4bdc9 in gc_collect (self=<optimized out>, args=<optimized out>, kws=<optimized out>) at Modules/gcmodule.c:1067

which should give a pretty good pointer for pbori people to figure out
which memory deallocation is actually botched.

Nils Bruin

Nov 14, 2012, 7:15:34 PM
to sage-devel
<polybori problem>:
This is actually reproducible in plain 5.0. This is now

http://trac.sagemath.org/sage_trac/ticket/13710

Nils Bruin

Nov 14, 2012, 7:22:24 PM
to sage-devel
Other consequences from gc.collect() insertions:

sage -t -force_lib devel/sage/sage/crypto/mq/mpolynomialsystem.py # Killed/crashed
sage -t -force_lib devel/sage/sage/rings/polynomial/multi_polynomial_sequence.py # Killed/crashed

(same problem; reported as above)


**********************************************************************
File "/usr/local/sage/5.5b2/devel/sage/sage/modular/abvar/abvar_ambient_jacobian.py", line 345:
sage: J0(33).decomposition(simple=False)
Expected:
[
Abelian subvariety of dimension 2 of J0(33),
Simple abelian subvariety 33a(None,33) of dimension 1 of J0(33)
]
Got:
[
Abelian subvariety of dimension 2 of J0(33),
Abelian subvariety of dimension 1 of J0(33)
]
**********************************************************************

sage -t -force_lib devel/sage/sage/modular/abvar/abvar_ambient_jacobian.py # 1 doctests failed

(i.e., doctest is relying on a previous copy of 33a remaining in
memory on which additional computations have changed the way it
prints. That's a violation of immutability anyway and the doctest
shouldn't rely on such behaviour)


**********************************************************************
File "/usr/local/sage/5.5b2/devel/sage/sage/modular/abvar/abvar.py", line 2840:
sage: J0(33).is_simple(none_if_not_known=True)
Expected:
False
Got nothing
**********************************************************************
sage -t -force_lib devel/sage/sage/modular/abvar/abvar.py # 1 doctests failed

Same problem! Since J0(33) is freshly constructed, one should not rely
on anything being cached on it and the test explicitly asks to not
compute anything.
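
A toy reproduction of this failure mode, assuming a weak cache and an object that stores a lazily computed answer on itself. `ModularAbelianVariety` and `J0` below are hypothetical stand-ins that only borrow the names from the doctest; the "computation" is a placeholder.

```python
import gc
import weakref


class ModularAbelianVariety:
    """Hypothetical stand-in; only the caching behaviour matters here."""

    def __init__(self, level):
        self.level = level
        self._is_simple = None       # not known until computed

    def is_simple(self, none_if_not_known=False):
        if self._is_simple is None:
            if none_if_not_known:
                return None
            self._is_simple = False  # placeholder for a real computation
        return self._is_simple


_cache = weakref.WeakValueDictionary()


def J0(level):
    obj = _cache.get(level)
    if obj is None:
        obj = ModularAbelianVariety(level)
        _cache[level] = obj
    return obj


J = J0(33)
J.is_simple()                        # stores the answer on J
while_alive = J0(33).is_simple(none_if_not_known=True)
del J
gc.collect()
after_collection = J0(33).is_simple(none_if_not_known=True)
```

While the earlier object is alive the second call sees the stored answer; once it has been collected, a fresh object knows nothing, which corresponds to the "Got nothing" output above.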

Jean-Pierre Flori

Nov 14, 2012, 9:58:45 PM
to sage-...@googlegroups.com
We dealt with something very similar in one of the "memleaks" tickets.
Not sure whether it was 715 or 11521, or maybe 12313 (the ticket numbers here might be wrong...).
So the fix is potentially not included in 5.5.beta2 if it was in the latter.
 

Jean-Pierre Flori

Nov 14, 2012, 10:00:15 PM
to sage-...@googlegroups.com
Ok, I took the time to check and you actually posted in 13710 that the fix is included in 12313, so not in 5.5.beta2 if I'm not wrong (nor 5.0 of course).

Jeroen Demeyer

Nov 16, 2012, 2:59:02 AM
to sage-...@googlegroups.com
On 2012-11-14 23:34, Nils Bruin wrote:
> On Nov 14, 11:28 am, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
>> FYI: it's a Fedora 16 system with an Intel(R) Pentium(R) 4 CPU 3.60GHz
>> processor running Linux 3.3.7-1.fc16.x86_64.
>
> That sounded convenient because my desktop is similar:
>
> Fedora 16 running 3.6.5-2.fc16.x86_64 #1 SMP on Intel(R) Core(TM)
> i7-2600 CPU @ 3.40GHz

Could you try again with sage-5.5.beta1?

Nils Bruin

Nov 16, 2012, 1:35:52 PM
to sage-devel
On Nov 15, 11:59 pm, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> Could you try again with sage-5.5.beta1?

Same behaviour. Was there a reason to expect differently?
I guess something is different on sextus. Bad memory/other hardware
problems?

I was surprised by how few issues arose from inserting garbage
collections between all doctests. That should upset the memory usage
patterns so much that I would expect it to shake out many problems.
Only things like singular's omalloc would be immune, because it hides
alloc/dealloc operations from the OS. You really need to wait for an
actual corruption to see a problem. The guarded malloc experiment on
OSX and similar operations took care of that. See

http://trac.sagemath.org/sage_trac/ticket/13447

for a dirty singular package that switches out omalloc for a system
malloc, which then allows normal OS tools to check memory allocation/
access/deallocation. See also the ticket for notes on how the approach
taken there can be adapted to let Singular use the system malloc under
linux (one singular malloc routine needs to know the size of an
allocated block, which is a non-POSIX malloc feature that both OSX and
linux support in different ways).

Do we have other memory managers in sage that play tricks like
omalloc? Things run a lot slower when you switch back to system malloc
for these, but it does enable conventional memory sanitation tests.

Valgrind produces way too many warnings to be useful. All you want is
a segfault on any access-after-dealloc or double-dealloc (out-of-
bounds access would be nice too). OSX's libgmalloc is perfect for
that. Is there a linux equivalent (or a way to configure valgrind to
do just this)?

I pose it as a challenge that no-one is able to do a comprehensive
testing of memory alloc/dealloc in sage. Even though I outline the
exact approach above that would make it a relatively straightforward
process to go through, no-one has the stamina and heroic hacker skills
to pull it off. Prove me wrong!

Jeroen Demeyer

Nov 17, 2012, 4:01:22 AM
to sage-...@googlegroups.com
On 2012-11-16 19:35, Nils Bruin wrote:
> On Nov 15, 11:59 pm, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
>> Could you try again with sage-5.5.beta1?
>
> Same behaviour. Was there a reason to expect differently?
After adding every single ticket, there is reason to expect differently.
This stuff is *so sensitive* to changes, even changes which look
completely unrelated.

For example, at first sight, the errors are gone again in sage-5.5.beta2.

Nils Bruin

Nov 17, 2012, 2:01:19 PM
to sage-devel
On Nov 17, 1:01 am, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> On 2012-11-16 19:35, Nils Bruin wrote:
> > On Nov 15, 11:59 pm, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> After adding every single ticket, there is reason to expect differently.
> This stuff is *so sensitive* to changes, even changes which look
> completely unrelated.
That's why the effort to do strict checking on memory management
should help (and it was in that light that I interpreted your
request). I think the sensitivity comes from the fact that you have to
wait for the coincidence that a freed-too-early location gets reused
and *then* written in its own role (i.e., actual corruption).

gc.collect() all the time should make deletions a little more
predictable and a very strict malloc/free should detect the problem
sooner. I'm afraid that MALLOC_CHECK_ isn't as good as BSD's gmalloc,
where even an access-after-free is a segfault (and many out-of-bound
accesses too).

Once one gets a little better in writing valgrind suppressions it's
easy to let valgrind produce less irrelevant output, so perhaps
there's a future for that. Or perhaps a tool to query and sort
valgrind reports after the fact (basically filter after the fact).
Perhaps it's time for William to hire someone again who is really good
at this stuff, because mathematically it's utterly uninteresting work
(and it really is finding and cleaning other people's mess)
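
A sketch of the "query and sort after the fact" idea. It assumes the usual Memcheck text layout, where every line starts with `==PID==`, an error begins with an unindented description, and the first stack frame is an indented `at 0x...: function (file:line)` line. The sample log is made up for illustration and `summarize_reports` is a hypothetical helper.

```python
import re
from collections import Counter

PREFIX = re.compile(r"^==\d+== ?")


def summarize_reports(log_text):
    """Count reports by (error description, innermost frame)."""
    counts = Counter()
    kind = None
    for raw in log_text.splitlines():
        match = PREFIX.match(raw)
        if match is None:
            continue                 # program output, not valgrind's
        line = raw[match.end():]
        text = line.strip()
        if not text:
            kind = None              # blank line ends a report
        elif kind is None and not line.startswith(" "):
            kind = text              # first line of a report
        elif kind is not None and text.startswith("at "):
            frame = text[3:].split(": ", 1)[-1]   # drop the address
            counts[(kind, frame)] += 1
            kind = None
    return counts.most_common()


SAMPLE_LOG = """\
==42== Memcheck, a memory error detector
==42==
==42== Invalid read of size 8
==42==    at 0x4005F4: foo (a.c:10)
==42==    by 0x400611: main (main.c:3)
==42==
==42== Invalid read of size 8
==42==    at 0x4005F4: foo (a.c:10)
==42==
==42== Invalid free() / delete / delete[] / realloc()
==42==    at 0x4C2B06D: free (vg_replace_malloc.c:530)
==42==
"""

summary = summarize_reports(SAMPLE_LOG)
```

Sorting by frequency puts the most repeated report first, so a flood of identical warnings collapses into one line.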

Ivan Andrus

Nov 17, 2012, 3:20:01 PM
to sage-...@googlegroups.com
At one point I had the goal of creating a suppressions file so that the doctests passed "cleanly". I'm sure some of the suppressions were actual problems, but it would at least allow you to find new problems. I still have the scripts that I used to collect and remove duplicate suppressions. I would be happy to run them again if people thought it would be useful. Sadly my machine isn't the fastest, so it takes quite a while (running all the doctests under valgrind is _slow_). I never did make it all the way through the test suite. But especially if I knew the likely areas it wouldn't be too hard to run some overnight and see what turns up.
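
These are not Ivan's scripts, but a sketch of how duplicate suppressions might be removed. It assumes the standard valgrind suppression layout: a `{ ... }` block whose first line is a free-form name, followed by the `Tool:Kind` line and the frame lines. Two blocks count as duplicates when everything except the name matches. `dedupe_suppressions` and the sample entries are hypothetical.

```python
def dedupe_suppressions(text):
    """Drop suppression blocks whose body (all but the name line) repeats."""
    seen = set()
    kept = []
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "{":
            block = []
        elif line == "}" and block is not None:
            body = tuple(block[1:])  # skip the free-form name line
            if body not in seen:
                seen.add(body)
                kept.append(block)
            block = None
        elif block is not None and line and not line.startswith("#"):
            block.append(line)
    return "\n".join(
        "{\n" + "\n".join("   " + entry for entry in block) + "\n}"
        for block in kept
    )


SAMPLE = """\
{
   first_name
   Memcheck:Cond
   fun:foo
   fun:bar
}
{
   second_name
   Memcheck:Cond
   fun:foo
   fun:bar
}
{
   third_name
   Memcheck:Leak
   fun:malloc
}
"""

cleaned = dedupe_suppressions(SAMPLE)
```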

-Ivan

Nils Bruin

Nov 17, 2012, 3:56:14 PM
to sage-devel
On Nov 17, 12:20 pm, Ivan Andrus <darthand...@gmail.com> wrote:

> At one point I had the goal of creating a suppressions file so that the doctests passed "cleanly".  I'm sure some of the suppressions were actual problems, but it would at least allow you to find new problems.  I still have the scripts that I used to collect and remove duplicate suppressions.  I would be happy to run them again if people thought it would be useful.  Sadly my machine isn't the fastest, so it takes quite a while (running all the doctests under valgrind is _slow_).  I never did make it all the way through the test suite.  But especially if I knew the likely areas it wouldn't be too hard to run some overnight and see what turns up.

Anything that has to do with libsingular. The problem is that OTHER
tests may well exercise this code much better than libsingular's own
doctests.

However, with an unmodified libsingular it's unlikely you'll find
anything. omalloc allocates pages of system memory and then manages
pieces of it by itself. So as far as valgrind is concerned, there is
relatively little allocation/deallocation activity. I think you can go
further and tell valgrind about the functioning of alternative memory
managers. That would improve diagnostics a little. But if the compact
memory layout of omalloc (the compactness is its purpose) isn't
changed, you still have a good chance that an access-after-free refers
to perfectly valid memory (a block that now has been reallocated for a
different purpose)

This is the issue I'm trying to address with the malloc version of
Singular. Combining it with a malloc implementation that puts blocks on
separate pages, on the edge of the page, unmaps any page upon
deallocation, and tries to avoid reusing or using adjacent logical
pages means that any illegal access is almost sure to segfault. BSD's
gmalloc does that. It seems glibc's malloc with MALLOC_CHECK_=2 or 3
does at least a bit of that.
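
A toy model of the difference, written in Python purely for illustration. It does not model omalloc or gmalloc in any detail: `PoolAllocator` simply recycles freed slots at once, while `GuardedAllocator` never reuses a handle, so a stale access fails loudly. Both classes are hypothetical.

```python
class PoolAllocator:
    """Compact allocator: freed slots are reused immediately."""

    def __init__(self):
        self.slots = []
        self.free_list = []

    def alloc(self, value):
        if self.free_list:
            index = self.free_list.pop()
            self.slots[index] = value
        else:
            index = len(self.slots)
            self.slots.append(value)
        return index

    def free(self, index):
        self.free_list.append(index)

    def read(self, index):
        return self.slots[index]     # a stale handle still "works"


class GuardedAllocator:
    """Strict allocator: handles are never reused after free."""

    def __init__(self):
        self.slots = {}
        self.next_index = 0

    def alloc(self, value):
        index = self.next_index
        self.next_index += 1
        self.slots[index] = value
        return index

    def free(self, index):
        del self.slots[index]        # a double free raises KeyError

    def read(self, index):
        return self.slots[index]     # access after free raises KeyError


def use_after_free(allocator):
    handle = allocator.alloc("ring A")
    allocator.free(handle)
    allocator.alloc("ring B")        # may land in the freed slot
    try:
        return allocator.read(handle)
    except KeyError:
        return "fault"


pool_result = use_after_free(PoolAllocator())
guarded_result = use_after_free(GuardedAllocator())
```

With the pool the stale handle silently reads the other object's data; with the guarded version the same bug is caught at the first illegal access.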

The real problem here is that we (Simon, Volker or I) don't know for
sure what the refcount and deletion protocols are for Singular
objects. It seems to be the kind of thing that is folklore inside the
Singular group but was never properly documented. Singular was not
designed to be a clean library, but it does seem to be a direction
Singular is heading, so perhaps this might sometime get documented
properly. I just think Sage can't wait for the decade or so that this
is probably going to take.

Ivan Andrus

Nov 17, 2012, 4:42:07 PM
to sage-...@googlegroups.com
Thanks for the explanation. That makes sense. It sounds like there's not much valgrind will help with, but I'll give it a go anyway.

-Ivan

Jeroen Demeyer

Dec 19, 2012, 5:16:50 AM
to sage-...@googlegroups.com
Just when I thought the #715 + #11521 issues were fixed in sage-5.5.rc1...

Apparently, sage-5.6.beta0 has uncovered a new problem: with the current
sage-5.6.beta0, I get the following reproducible segfault on hawk
(OpenSolaris i386):

> sage -t --long -force_lib devel/sage/sage/modules/module.pyx
> The doctested process was killed by signal 11
> [24.3 s]

Removing #8992 (the only ticket in sage-5.6.beta0 which seems remotely
related) doesn't help. Reverting #715, #11521, #13746 does fix it. Now I
don't know how to proceed, I am tempted to revert these tickets in the
sage-5.5 release.

Jeroen.

Jean-Pierre Flori

Dec 19, 2012, 9:25:47 AM
to sage-...@googlegroups.com


On Wednesday, December 19, 2012 11:16:50 AM UTC+1, Jeroen Demeyer wrote:
> Just when I thought the #715 + #11521 issues were fixed in sage-5.5.rc1...
>
> Apparently, sage-5.6.beta0 has uncovered a new problem: with the current
> sage-5.6.beta0, I get the following reproducible segfault on hawk
> (OpenSolaris i386):
>
> > sage -t --long -force_lib devel/sage/sage/modules/module.pyx
> > The doctested process was killed by signal 11
> > [24.3 s]

Are more details available somewhere?
A gdb backtrace?

Jeroen Demeyer

unread,
Dec 20, 2012, 12:11:32 PM12/20/12
to sage-...@googlegroups.com
> A gdb backtrace?

buildbot@hawk:~/sage-5.6.beta0$ ./sage -t --long --gdb
"devel/sage/sage/modules/module.pyx"
sage -t --long --gdb "devel/sage/sage/modules/module.pyx"
********************************************************************************
Type r at the (gdb) prompt to run the doctests.
Type bt if there is a crash to see a traceback.
********************************************************************************
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-pc-solaris2.11"...
(gdb) r
Starting program: /export/home/buildbot/sage-5.6.beta0/local/bin/python
/export/home/buildbot/.sage//tmp/module_3741.py
warning: Lowest section in /lib/libdl.so.1 is .dynamic at 00000074
warning: Lowest section in /lib/libintl.so.1 is .dynamic at 00000074
warning: Lowest section in /lib/libpthread.so.1 is .dynamic at 00000074

Program received signal SIGSEGV, Segmentation fault.
PyObject_Malloc (nbytes=127) at Objects/obmalloc.c:788
788 Objects/obmalloc.c: No such file or directory.
in Objects/obmalloc.c
(gdb) bt
#0 PyObject_Malloc (nbytes=127) at Objects/obmalloc.c:788
#1 0xfee8919d in PyString_FromStringAndSize (str=0x0, size=106) at
Objects/stringobject.c:88
#2 0xfeeeeff8 in r_object (p=<value optimized out>) at Python/marshal.c:803
#3 0xfeeeee5b in r_object (p=0x80419f0) at Python/marshal.c:880
#4 0xfeeef0f9 in r_object (p=<value optimized out>) at
Python/marshal.c:1013
#5 0xfeeeee5b in r_object (p=0x80419f0) at Python/marshal.c:880
#6 0xfeeef0f9 in r_object (p=<value optimized out>) at
Python/marshal.c:1013
#7 0xfeeeee5b in r_object (p=0x80419f0) at Python/marshal.c:880
#8 0xfeeef0f9 in r_object (p=<value optimized out>) at
Python/marshal.c:1013
#9 0xfeeefb33 in PyMarshal_ReadObjectFromString (str=0xc11f7a0 "c",
len=46156) at Python/marshal.c:1181
#10 0xfeeefc77 in PyMarshal_ReadLastObjectFromFile (fp=0xfeda6838) at
Python/marshal.c:1142
#11 0xfeeebc90 in load_source_module (name=<value optimized out>,
pathname=<value optimized out>, fp=0xfeda6828) at Python/import.c:773
#12 0xfeeec9e8 in import_submodule (mod=0xb9fcbe4, subname=<value
optimized out>, fullname=0x80424bb "twisted.python.util")
at Python/import.c:2595
#13 0xfeeecf24 in ensure_fromlist (mod=<value optimized out>,
fromlist=<value optimized out>, buf=0x80424bb "twisted.python.util",
buflen=14, recursive=0) at Python/import.c:2506
#14 0xfeeed486 in import_module_level (name=0x0, globals=<value
optimized out>, locals=0xc2d402c, fromlist=0xb6dafcc, level=-1)
at Python/import.c:2174
#15 0xfeeed703 in PyImport_ImportModuleLevel (name=0xba19b24
"twisted.python", globals=0xc2d402c, locals=0xc2d402c, fromlist=0xb6dafcc,
level=-1) at Python/import.c:2188
#16 0xfeed25ea in builtin___import__ (self=0x0, args=0xc2d939c,
kwds=0x0) at Python/bltinmodule.c:49
#17 0xfee7d818 in PyCFunction_Call (func=0x807e5ec, arg=0xc2d939c,
kw=0xbcd6000) at Objects/methodobject.c:85
#18 0xfee3ad58 in PyObject_Call (func=0x807e5ec, arg=0xc2d939c, kw=0x0)
at Objects/abstract.c:2529
#19 0xfeed2b0e in PyEval_CallObjectWithKeywords (func=0x807e5ec,
arg=0xc2d939c, kw=0x0) at Python/ceval.c:3890
#20 0xfeed593c in PyEval_EvalFrameEx (f=0x9f88e54, throwflag=0) at
Python/ceval.c:2333
#21 0xfeed9d03 in PyEval_EvalCodeEx (co=0xc359ba8, globals=0xc2d402c,
locals=0xc2d402c, args=0x0, argcount=0, kws=0x0, kwcount=0,
defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#22 0xfeed9d53 in PyEval_EvalCode (co=0xc359ba8, globals=0xc2d402c,
locals=0xc2d402c) at Python/ceval.c:667
#23 0xfeee95f3 in PyImport_ExecCodeModuleEx (name=0x804357b
"twisted.python.log", co=0xc359ba8,
pathname=0x8042beb
"/export/home/buildbot/sage-5.6.beta0/local/lib/python2.7/site-packages/Twisted-12.1.0-py2.7-solaris-2.11-i86pc.32bit.egg/twisted/python/log.pyc")
at Python/import.c:681
#24 0xfeeebd06 in load_source_module (name=<value optimized out>,
pathname=<value optimized out>, fp=0xfeda6818) at Python/import.c:1018
#25 0xfeeec9e8 in import_submodule (mod=0xb9fcbe4, subname=<value
optimized out>, fullname=0x804357b "twisted.python.log")
at Python/import.c:2595
#26 0xfeeecf24 in ensure_fromlist (mod=<value optimized out>,
fromlist=<value optimized out>, buf=0x804357b "twisted.python.log",
buflen=14, recursive=0) at Python/import.c:2506
#27 0xfeeed486 in import_module_level (name=0x0, globals=<value
optimized out>, locals=0xc071b54, fromlist=0xbfce6cc, level=-1)
at Python/import.c:2174
#28 0xfeeed703 in PyImport_ImportModuleLevel (name=0xba19b24
"twisted.python", globals=0xc071b54, locals=0xc071b54, fromlist=0xbfce6cc,
level=-1) at Python/import.c:2188
#29 0xfeed25ea in builtin___import__ (self=0x0, args=0xb68734c,
kwds=0x0) at Python/bltinmodule.c:49
#30 0xfee7d818 in PyCFunction_Call (func=0x807e5ec, arg=0xb68734c,
kw=0xbcd6000) at Objects/methodobject.c:85
#31 0xfee3ad58 in PyObject_Call (func=0x807e5ec, arg=0xb68734c, kw=0x0)
at Objects/abstract.c:2529
#32 0xfeed2b0e in PyEval_CallObjectWithKeywords (func=0x807e5ec,
arg=0xb68734c, kw=0x0) at Python/ceval.c:3890
#33 0xfeed593c in PyEval_EvalFrameEx (f=0xc3bfa54, throwflag=0) at
Python/ceval.c:2333
#34 0xfeed9d03 in PyEval_EvalCodeEx (co=0xc06d2a8, globals=0xc071b54,
locals=0xc071b54, args=0x0, argcount=0, kws=0x0, kwcount=0,
defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#35 0xfeed9d53 in PyEval_EvalCode (co=0xc06d2a8, globals=0xc071b54,
locals=0xc071b54) at Python/ceval.c:667
#36 0xfeee95f3 in PyImport_ExecCodeModuleEx (name=0x804463b
"twisted.internet.pollreactor", co=0xc06d2a8,
pathname=0x8043cab
"/export/home/buildbot/sage-5.6.beta0/local/lib/python2.7/site-packages/Twisted-12.1.0-py2.7-solaris-2.11-i86pc.32bit.egg/twisted/internet/pollreactor.pyc")
at Python/import.c:681
#37 0xfeeebd06 in load_source_module (name=<value optimized out>,
pathname=<value optimized out>, fp=0xfeda6808) at Python/import.c:1018
#38 0xfeeec9e8 in import_submodule (mod=0xbfd7314, subname=<value
optimized out>, fullname=0x804463b "twisted.internet.pollreactor")
at Python/import.c:2595
#39 0xfeeecc8d in load_next (mod=<value optimized out>, altmod=<value
optimized out>, p_name=0x804462c,
buf=0x804463b "twisted.internet.pollreactor", p_buflen=0x8044a3c) at
Python/import.c:2415
#40 0xfeeed218 in import_module_level (name=0x0, globals=<value
optimized out>, locals=0xfef78608, fromlist=0xbfce7cc, level=-1)
at Python/import.c:2144
#41 0xfeeed703 in PyImport_ImportModuleLevel (name=0xc008494
"twisted.internet.pollreactor", globals=0xc0718ac, locals=0xfef78608,
fromlist=0xbfce7cc, level=-1) at Python/import.c:2188
#42 0xfeed25ea in builtin___import__ (self=0x0, args=0xb677bbc,
kwds=0x0) at Python/bltinmodule.c:49
#43 0xfee7d818 in PyCFunction_Call (func=0x807e5ec, arg=0xb677bbc,
kw=0xbcd6000) at Objects/methodobject.c:85
#44 0xfee3ad58 in PyObject_Call (func=0x807e5ec, arg=0xb677bbc, kw=0x0)
at Objects/abstract.c:2529
#45 0xfeed2b0e in PyEval_CallObjectWithKeywords (func=0x807e5ec,
arg=0xb677bbc, kw=0x0) at Python/ceval.c:3890
#46 0xfeed593c in PyEval_EvalFrameEx (f=0xc0602dc, throwflag=0) at
Python/ceval.c:2333
#47 0xfeed8c15 in PyEval_EvalFrameEx (f=0xc0ba6d4, throwflag=0) at
Python/ceval.c:4107
#48 0xfeed9d03 in PyEval_EvalCodeEx (co=0xbe97770, globals=0xc0718ac,
locals=0xc0718ac, args=0x0, argcount=0, kws=0x0, kwcount=0,
defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#49 0xfeed9d53 in PyEval_EvalCode (co=0xbe97770, globals=0xc0718ac,
locals=0xc0718ac) at Python/ceval.c:667
#50 0xfeee95f3 in PyImport_ExecCodeModuleEx (name=0x80457bb
"twisted.internet.default", co=0xbe97770,
pathname=0x8044e2b
"/export/home/buildbot/sage-5.6.beta0/local/lib/python2.7/site-packages/Twisted-12.1.0-py2.7-solaris-2.11-i86pc.32bit.egg/twisted/internet/default.pyc")
at Python/import.c:681
#51 0xfeeebd06 in load_source_module (name=<value optimized out>,
pathname=<value optimized out>, fp=0xfeda67f8) at Python/import.c:1018
#52 0xfeeec9e8 in import_submodule (mod=0xbfd7314, subname=<value
optimized out>, fullname=0x80457bb "twisted.internet.default")
at Python/import.c:2595
#53 0xfeeecf24 in ensure_fromlist (mod=<value optimized out>,
fromlist=<value optimized out>, buf=0x80457bb "twisted.internet.default",
buflen=16, recursive=0) at Python/import.c:2506
#54 0xfeeed486 in import_module_level (name=0x0, globals=<value
optimized out>, locals=0xc071a44, fromlist=0xbfce56c, level=-1)
at Python/import.c:2174
#55 0xfeeed703 in PyImport_ImportModuleLevel (name=0x8165444
"twisted.internet", globals=0xc071a44, locals=0xc071a44,
fromlist=0xbfce56c, level=-1) at Python/import.c:2188
#56 0xfeed25ea in builtin___import__ (self=0x0, args=0xbfefacc,
kwds=0x0) at Python/bltinmodule.c:49
#57 0xfee7d818 in PyCFunction_Call (func=0x807e5ec, arg=0xbfefacc,
kw=0xbcd6000) at Objects/methodobject.c:85
#58 0xfee3ad58 in PyObject_Call (func=0x807e5ec, arg=0xbfefacc, kw=0x0)
at Objects/abstract.c:2529
#59 0xfeed2b0e in PyEval_CallObjectWithKeywords (func=0x807e5ec,
arg=0xbfefacc, kw=0x0) at Python/ceval.c:3890
#60 0xfeed593c in PyEval_EvalFrameEx (f=0xc3163dc, throwflag=0) at
Python/ceval.c:2333
#61 0xfeed9d03 in PyEval_EvalCodeEx (co=0xc000410, globals=0xc071a44,
locals=0xc071a44, args=0x0, argcount=0, kws=0x0, kwcount=0,
defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#62 0xfeed9d53 in PyEval_EvalCode (co=0xc000410, globals=0xc071a44,
locals=0xc071a44) at Python/ceval.c:667
#63 0xfeee95f3 in PyImport_ExecCodeModuleEx (name=0x804687b
"twisted.internet.reactor", co=0xc000410,
pathname=0x8045eeb
"/export/home/buildbot/sage-5.6.beta0/local/lib/python2.7/site-packages/Twisted-12.1.0-py2.7-solaris-2.11-i86pc.32bit.egg/twisted/internet/reactor.pyc")
at Python/import.c:681
#64 0xfeeebd06 in load_source_module (name=<value optimized out>,
pathname=<value optimized out>, fp=0xfeda67e8) at Python/import.c:1018
#65 0xfeeec9e8 in import_submodule (mod=0xbfd7314, subname=<value
optimized out>, fullname=0x804687b "twisted.internet.reactor")
at Python/import.c:2595
#66 0xfeeecf24 in ensure_fromlist (mod=<value optimized out>,
fromlist=<value optimized out>, buf=0x804687b "twisted.internet.reactor",
buflen=16, recursive=0) at Python/import.c:2506
#67 0xfeeed486 in import_module_level (name=0x0, globals=<value
optimized out>, locals=0xfef78608, fromlist=0x81738ec, level=-1)
at Python/import.c:2174
#68 0xfeeed703 in PyImport_ImportModuleLevel (name=0x8165444
"twisted.internet", globals=0x816668c, locals=0xfef78608,
fromlist=0x81738ec, level=-1) at Python/import.c:2188
#69 0xfeed25ea in builtin___import__ (self=0x0, args=0xc27fbbc,
kwds=0x0) at Python/bltinmodule.c:49
#70 0xfee7d818 in PyCFunction_Call (func=0x807e5ec, arg=0xc27fbbc,
kw=0xbcd6000) at Objects/methodobject.c:85
#71 0xfee3ad58 in PyObject_Call (func=0x807e5ec, arg=0xc27fbbc, kw=0x0)
at Objects/abstract.c:2529
#72 0xfeed2b0e in PyEval_CallObjectWithKeywords (func=0x807e5ec,
arg=0xc27fbbc, kw=0x0) at Python/ceval.c:3890
#73 0xfeed593c in PyEval_EvalFrameEx (f=0xc053f04, throwflag=0) at
Python/ceval.c:2333
#74 0xfeed9d03 in PyEval_EvalCodeEx (co=0x81683c8, globals=0x816668c,
locals=0x0, args=0x81a9358, argcount=0, kws=0x81a9358, kwcount=1,
defs=0xbff1838, defcount=1, closure=0x0) at Python/ceval.c:3253
#75 0xfeed7b25 in PyEval_EvalFrameEx (f=0x81a921c, throwflag=0) at
Python/ceval.c:4117
#76 0xfeed9d03 in PyEval_EvalCodeEx (co=0x81682a8, globals=0x809235c,
locals=0x809235c, args=0x0, argcount=0, kws=0x0, kwcount=0,
defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#77 0xfeed9d53 in PyEval_EvalCode (co=0x81682a8, globals=0x809235c,
locals=0x809235c) at Python/ceval.c:667
#78 0xfeef697e in PyRun_FileExFlags (fp=0xfeda67e8, filename=0x8047466
"/export/home/buildbot/.sage//tmp/module_3741.py", start=257,
globals=0x809235c, locals=0x809235c, closeit=1, flags=0x804721c) at
Python/pythonrun.c:1353
#79 0xfeef6b58 in PyRun_SimpleFileExFlags (fp=0xfeda67e8,
filename=0x8047466 "/export/home/buildbot/.sage//tmp/module_3741.py",
closeit=1, flags=0x804721c) at Python/pythonrun.c:943
#80 0xfeef6f21 in PyRun_AnyFileExFlags (fp=0xfeda67e8,
filename=0x8047466 "/export/home/buildbot/.sage//tmp/module_3741.py",
closeit=1,
flags=0x804721c) at Python/pythonrun.c:747
#81 0xfef0ae5f in Py_Main (argc=2, argv=0x8047294) at Modules/main.c:639
#82 0x08050d10 in main (argc=2, argv=0x8047294) at Modules/python.c:23

David Kirkby

unread,
Dec 20, 2012, 6:31:51 PM12/20/12
to sage-...@googlegroups.com
On 3 November 2012 19:58, Nils Bruin <nbr...@sfu.ca> wrote:
> Presently, Sage has a significant memory leak issue: Uniqueness of
> parents is currently guaranteed by keeping them in memory
> permanently.

I've compiled Sage on Solaris with Sun libraries which replace
malloc/free with versions which check for memory leaks. Sage has
leaked memory before the

sage:

prompt has appeared. But from what I gather, one of the culprits does
its own memory management, which would not be detected by those
libraries.

Simon King

unread,
Dec 21, 2012, 1:09:07 AM12/21/12
to sage-...@googlegroups.com
Hi David,

On 2012-12-20, David Kirkby <david....@onetel.net> wrote:
> I've compiled Sage on Solaris with Sun libraries which replace
> malloc/free with versions which check for memory leaks. Sage has
> leaked memory before the
>
> sage:
>
> prompt has appeared. But from what I gather, one of the culprits does
> it own memory management, which would not be detected by those
> libraries.

That sounds like libsingular.

Can you try to install the singular spkg from #13731, after doing
export SINGULAR_XALLOC=yes
followed by sage -b, please?

This will result in (lib)Singular being built with xalloc (a thin
compatibility layer on top of malloc) replacing Singular's usual memory
manager omalloc.

Moreover, the spkg backports a couple of upstream fixes for out-of-bound
errors detected by Sage's doctests run with
export MALLOC_CHECK_=3
using Singular with xalloc.

Best regards,
Simon


Jeroen Demeyer

unread,
Dec 21, 2012, 2:40:25 AM12/21/12
to sage-...@googlegroups.com
On 2012-12-21 07:09, Simon King wrote:
> Can you try to install the singular spkg from #13731, after doing
> export SINGULAR_XALLOC=yes
> followed by sage -b, please?

Then singular fails to install:

### Singular spkg-install: build_singular ###
make PIPE= install-nolns in omalloc
make[1]: Entering directory
`/export/home/buildbot/sage-5.6.beta0/spkg/build/singular-3-1-5.p2/src/omalloc'
gcc -O2 -g -fPIC -I/export/home/buildbot/sage-5.6.beta0/local/include
-c omFindExec.c
rm -f libomalloc.a
ar cr libomalloc.a omFindExec.o
ranlib libomalloc.a
install omalloc.h /export/home/buildbot/sage-5.6.beta0/local/include/
make[1]: install: Command not found
make[1]: *** [install] Error 127
make[1]: Leaving directory
`/export/home/buildbot/sage-5.6.beta0/spkg/build/singular-3-1-5.p2/src/omalloc'
make: *** [install-nolns] Error 1
Unable to build and install Singular
Error building Singular (error in build_singular).

Dima Pasechnik

unread,
Dec 21, 2012, 2:46:24 AM12/21/12
to sage-...@googlegroups.com
On 2012-12-21, Jeroen Demeyer <jdem...@cage.ugent.be> wrote:
> On 2012-12-21 07:09, Simon King wrote:
>> Can you try to install the singular spkg from #13731, after doing
>> export SINGULAR_XALLOC=yes
>> followed by sage -b, please?
>
> Then singular fails to install:
>
> ### Singular spkg-install: build_singular ###
> make PIPE= install-nolns in omalloc
> make[1]: Entering directory
> `/export/home/buildbot/sage-5.6.beta0/spkg/build/singular-3-1-5.p2/src/omalloc'
> gcc -O2 -g -fPIC -I/export/home/buildbot/sage-5.6.beta0/local/include
> -c omFindExec.c
> rm -f libomalloc.a
> ar cr libomalloc.a omFindExec.o
> ranlib libomalloc.a
> install omalloc.h /export/home/buildbot/sage-5.6.beta0/local/include/
> make[1]: install: Command not found
it is
/usr/sbin/install
on Solaris...
perhaps, no /usr/sbin in the PATH?

Nils Bruin

unread,
Dec 21, 2012, 4:05:50 AM12/21/12
to sage-devel


On Dec 19, 12:16 am, Jeroen Demeyer <jdeme...@cage.ugent.be> wrote:
> Just when I thought the #715 + #11521 issues were fixed in sage-5.5.rc1...
>
> Apparently, sage-5.6.beta0 has uncovered a new problem: with the current
> sage-5.6.beta0, I get the following reproducible segfault on hawk
> (OpenSolaris i386):
>
> > sage -t  --long -force_lib devel/sage/sage/modules/module.pyx
> > The doctested process was killed by signal 11
> >          [24.3 s]
I tried this on linux too:
1) I built 5.6b0 and tried the test: success
2) I set MALLOC_CHECK_=3. BOOM. Similar error as reported
Now comes the odd part:
If I turn off MALLOC_CHECK_ it now ALSO goes BOOM.
If I run with --verbose tests complete without problem
If I run with valgrind it STILL Segfaults (and I do get a mildly
informative report about
python's obmalloc.c:788, i.e.,
if ((pool->freeblock = *(block **)bp) != NULL) {
doing a read from an unallocated address)

Note that the segfault happens somewhere deep inside Python's import
machinery. My guess is something got corrupted (and written to a pyc
file?) and now spoils the fun every time.

Anyway, perhaps someone can replicate that this test fails on linux
with MALLOC_CHECK_=3 as well. Possibly valgrinding finds a useful
report.
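
One way to test the guess about a corrupted .pyc file is to check whether the file still unmarshals into a code object. What follows is only a minimal sketch: it assumes a modern CPython 3.7+ where the .pyc header is 16 bytes (the Python 2.7 used in this thread has an 8-byte header), and the names pyc_is_loadable and demo are made up for illustration.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile
import types

PYC_HEADER_SIZE = 16  # assumption: CPython 3.7+ header layout


def pyc_is_loadable(pyc_path):
    """Return True if the file has the right magic number and unmarshals to code."""
    with open(pyc_path, "rb") as fh:
        data = fh.read()
    if data[:4] != importlib.util.MAGIC_NUMBER:
        return False
    try:
        code = marshal.loads(data[PYC_HEADER_SIZE:])
    except (EOFError, ValueError, TypeError):
        # Truncated or damaged marshal data ends up here.
        return False
    return isinstance(code, types.CodeType)


def demo():
    """Compile a tiny module, then truncate its .pyc and check it again."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "mod.py")
        pyc = os.path.join(tmp, "mod.pyc")
        with open(src, "w") as fh:
            fh.write("x = 1\n")
        py_compile.compile(src, cfile=pyc, doraise=True)
        good = pyc_is_loadable(pyc)
        # Keep only the header plus four bytes to simulate a damaged file.
        with open(pyc, "rb") as fh:
            head = fh.read(PYC_HEADER_SIZE + 4)
        with open(pyc, "wb") as fh:
            fh.write(head)
        truncated = pyc_is_loadable(pyc)
    return good, truncated
```

If a check of this kind passes for the Twisted .pyc files, the corruption is more likely in the running process than on disk.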

Jeroen Demeyer

unread,
Dec 21, 2012, 4:44:07 AM12/21/12
to sage-...@googlegroups.com
On 2012-12-21 10:05, Nils Bruin wrote:
> Anyway, perhaps someone can replicate that this test fails on linux
> with MALLOC_CHECK_=3 as well.
I don't manage to replicate it on sage.math (Ubuntu 8.04, x86_64). The
test always succeeds.

Jeroen Demeyer

unread,
Dec 21, 2012, 6:13:35 AM12/21/12
to sage-...@googlegroups.com
On 2012-12-21 07:09, Simon King wrote:
> Can you try to install the singular spkg from #13731, after doing
> export SINGULAR_XALLOC=yes
> followed by sage -b, please?
That doesn't change anything. I still get the segmentation fault as before.

Jean-Pierre Flori

unread,
Dec 21, 2012, 10:54:58 AM12/21/12
to sage-...@googlegroups.com
I get segfaults on Ubuntu 12.04.1 x86_64 even without MALLOC_CHECK_, but not every time.
The Valgrind output is not that informative: it dies with a SIGILL in visit_decref (gcmodule.c: 320) where it jumps to a very fishy address.

With --verbose I could not reproduce it.

David Kirkby

unread,
Dec 21, 2012, 12:18:29 PM12/21/12
to sage-...@googlegroups.com
On 21 December 2012 06:09, Simon King <simon...@uni-jena.de> wrote:
> Hi David,
>
> On 2012-12-20, David Kirkby <david....@onetel.net> wrote:
>> I've compiled Sage on Solaris with Sun libraries which replace
>> malloc/free with versions which check for memory leaks. Sage has
>> leaked memory before the
>>
>> sage:
>>
>> prompt has appeared. But from what I gather, one of the culprits does
>> it own memory management, which would not be detected by those
>> libraries.
>
> That sounds like libsingular.
>
> Can you try to install the singular spkg from #13731, after doing
> export SINGULAR_XALLOC=yes
> followed by sage -b, please?

This will have to wait until tomorrow. I need to download Sage, then
find out how I link the special libraries. I've done it before, but a
long time ago. Sage leaked like a sieve, but that was 2-3 years ago.

Dave

Volker Braun

unread,
Dec 24, 2012, 8:23:24 AM12/24/12
to sage-...@googlegroups.com
Does anybody have an explanation why this segfaults while importing Twisted? You are doctesting sage/modules/module.pyx, that shouldn't have anything to do with web servers or internet access. 

Jean-Pierre Flori

unread,
Dec 24, 2012, 9:04:55 AM12/24/12
to sage-...@googlegroups.com


On Monday, December 24, 2012 2:23:24 PM UTC+1, Volker Braun wrote:
Does anybody have an explanation why this segfaults while importing Twisted? You are doctesting sage/modules/module.pyx, that shouldn't have anything to do with web servers or internet access. 
I think that is because the segfault occurs while quitting Sage, so this involves all Python parts of Sage which have been used.
(Not sure whether Twisted is loaded at startup when doctesting, but you can find some specific code importing Twisted in quit_sage() in sage/all.py.)
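
To illustrate how an import can show up in a backtrace taken at exit, here is a small sketch in plain Python. It is not the code from sage/all.py. The function quit_hook is hypothetical, and colorsys merely stands in for an optional dependency such as Twisted.

```python
import atexit
import sys


def quit_hook():
    """Hypothetical cleanup hook; the import only happens when it is called."""
    import colorsys  # stand-in for an optional dependency such as Twisted
    return colorsys.__name__


def is_loaded(name):
    """Tell whether a module has already been imported in this process."""
    return name in sys.modules


# Registered at startup, executed while the interpreter shuts down.
atexit.register(quit_hook)
```

Any allocation made by that late import then runs on a heap that may already have been damaged much earlier, so the crash site says little about where the real bug is.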

rjf

unread,
Dec 24, 2012, 10:20:37 AM12/24/12
to sage-...@googlegroups.com
Sometimes the best strategy for fixing a bug is to not look for it, but
to rewrite the program from scratch.

I don't know what exactly you are trying to do, but surely you are not
using ALL of Sage.

If your concern is to complete some computation, can you do so without
trying to debug Sage -- indeed trying to debug Sage on multiple
platforms?

RJF

Jean-Pierre Flori

unread,
Dec 24, 2012, 3:38:44 PM12/24/12
to sage-...@googlegroups.com
Just a hint for continuing to debug this (unless someone wants to rewrite ALL of Sage):
it might be helpful to rebuild Python with --without-pymalloc to get more insightful backtraces and Valgrind output, as suggested by the Python spkg.
You can automagically do this by exporting SAGE_VALGRIND=yes and reinstalling Python.
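
To confirm that the rebuilt interpreter really has the intended settings, something like the following sketch could be run with Sage's Python. The function name build_info is made up here, and the configuration variables are only meaningful on POSIX builds.

```python
import sys
import sysconfig


def build_info():
    """Report allocator and debug settings of the running CPython (POSIX builds)."""
    return {
        # Truthy when the interpreter was configured with pymalloc.
        "pymalloc": bool(sysconfig.get_config_var("WITH_PYMALLOC")),
        # Truthy when the interpreter was configured --with-pydebug.
        "pydebug": bool(sysconfig.get_config_var("Py_DEBUG")),
        # sys.gettotalrefcount is only present in debug builds.
        "has_gettotalrefcount": hasattr(sys, "gettotalrefcount"),
    }


if __name__ == "__main__":
    for key, value in sorted(build_info().items()):
        print(key, value)
```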

Jean-Pierre Flori

unread,
Dec 24, 2012, 4:18:18 PM12/24/12
to sage-...@googlegroups.com
Not sure we got these so clearly before, but using --without-pymalloc and Valgrind (hint: finish and review #13060) I get lots of


==28631== Invalid read of size 8
==28631==    at 0x10429E50: __pyx_tp_dealloc_4sage_9structure_15category_object_CategoryObject (category_object.c:8990)
==28631==    by 0x4ED96C5: subtype_dealloc (typeobject.c:1014)
==28631==    by 0x4EBA106: insertdict (dictobject.c:530)
==28631==    by 0x4EBCB51: PyDict_SetItem (dictobject.c:775)
==28631==    by 0x4EC2517: _PyObject_GenericSetAttrWithDict (object.c:1524)
==28631==    by 0x4EC1F5E: PyObject_SetAttr (object.c:1247)
==28631==    by 0x4F21600: PyEval_EvalFrameEx (ceval.c:2004)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4F1F6A6: PyEval_CallObjectWithKeywords (ceval.c:3890)
==28631==    by 0x4F23D5A: PyEval_EvalFrameEx (ceval.c:1739)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4F266C1: PyEval_EvalCode (ceval.c:667)
==28631==    by 0x4F24388: PyEval_EvalFrameEx (ceval.c:4718)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4E8C46F: instancemethod_call (classobject.c:2578)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4F21828: PyEval_EvalFrameEx (ceval.c:4239)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4F2422C: PyEval_EvalFrameEx (ceval.c:4117)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==  Address 0xbd30390 is 48 bytes inside a block of size 256 free'd
==28631==    at 0x4C28B16: free (vg_replace_malloc.c:446)
==28631==    by 0x4ED96C5: subtype_dealloc (typeobject.c:1014)
==28631==    by 0x4F5F112: collect (gcmodule.c:770)
==28631==    by 0x4F5FB06: _PyObject_GC_Malloc (gcmodule.c:996)
==28631==    by 0x4F5FB3C: _PyObject_GC_New (gcmodule.c:1467)
==28631==    by 0x4E98B97: PyWrapper_New (descrobject.c:1068)
==28631==    by 0x4EC2258: _PyObject_GenericGetAttrWithDict (object.c:1434)
==28631==    by 0x10A6CD28: __pyx_pw_4sage_9structure_11coerce_dict_16TripleDictEraser_3__call__ (coerce_dict.c:1225)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4E7EB2D: PyObject_CallFunctionObjArgs (abstract.c:2760)
==28631==    by 0x4EEA350: PyObject_ClearWeakRefs (weakrefobject.c:881)
==28631==    by 0x10429E4F: __pyx_tp_dealloc_4sage_9structure_15category_object_CategoryObject (category_object.c:8989)
==28631==    by 0x4ED96C5: subtype_dealloc (typeobject.c:1014)
==28631==    by 0x4EBA106: insertdict (dictobject.c:530)
==28631==    by 0x4EBCB51: PyDict_SetItem (dictobject.c:775)
==28631==    by 0x4EC2517: _PyObject_GenericSetAttrWithDict (object.c:1524)
==28631==    by 0x4EC1F5E: PyObject_SetAttr (object.c:1247)
==28631==    by 0x4F21600: PyEval_EvalFrameEx (ceval.c:2004)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4F1F6A6: PyEval_CallObjectWithKeywords (ceval.c:3890)
==28631==    by 0x4F23D5A: PyEval_EvalFrameEx (ceval.c:1739)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4F266C1: PyEval_EvalCode (ceval.c:667)

and

==28631== Invalid read of size 8
==28631==    at 0x4F5FC1E: PyObject_GC_Del (gcmodule.c:210)
==28631==    by 0x4ED96C5: subtype_dealloc (typeobject.c:1014)
==28631==    by 0x4EBA106: insertdict (dictobject.c:530)
==28631==    by 0x4EBCB51: PyDict_SetItem (dictobject.c:775)
==28631==    by 0x4EC2517: _PyObject_GenericSetAttrWithDict (object.c:1524)
==28631==    by 0x4EC1F5E: PyObject_SetAttr (object.c:1247)
==28631==    by 0x4F21600: PyEval_EvalFrameEx (ceval.c:2004)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4F1F6A6: PyEval_CallObjectWithKeywords (ceval.c:3890)
==28631==    by 0x4F23D5A: PyEval_EvalFrameEx (ceval.c:1739)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4F266C1: PyEval_EvalCode (ceval.c:667)
==28631==    by 0x4F24388: PyEval_EvalFrameEx (ceval.c:4718)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4E8C46F: instancemethod_call (classobject.c:2578)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4F21828: PyEval_EvalFrameEx (ceval.c:4239)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4F2422C: PyEval_EvalFrameEx (ceval.c:4117)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==  Address 0xbd30360 is 0 bytes inside a block of size 256 free'd
==28631==    at 0x4C28B16: free (vg_replace_malloc.c:446)
==28631==    by 0x4ED96C5: subtype_dealloc (typeobject.c:1014)
==28631==    by 0x4F5F112: collect (gcmodule.c:770)
==28631==    by 0x4F5FB06: _PyObject_GC_Malloc (gcmodule.c:996)
==28631==    by 0x4F5FB3C: _PyObject_GC_New (gcmodule.c:1467)
==28631==    by 0x4E98B97: PyWrapper_New (descrobject.c:1068)
==28631==    by 0x4EC2258: _PyObject_GenericGetAttrWithDict (object.c:1434)
==28631==    by 0x10A6CD28: __pyx_pw_4sage_9structure_11coerce_dict_16TripleDictEraser_3__call__ (coerce_dict.c:1225)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4E7EB2D: PyObject_CallFunctionObjArgs (abstract.c:2760)
==28631==    by 0x4EEA350: PyObject_ClearWeakRefs (weakrefobject.c:881)
==28631==    by 0x10429E4F: __pyx_tp_dealloc_4sage_9structure_15category_object_CategoryObject (category_object.c:8989)
==28631==    by 0x4ED96C5: subtype_dealloc (typeobject.c:1014)
==28631==    by 0x4EBA106: insertdict (dictobject.c:530)
==28631==    by 0x4EBCB51: PyDict_SetItem (dictobject.c:775)
==28631==    by 0x4EC2517: _PyObject_GenericSetAttrWithDict (object.c:1524)
==28631==    by 0x4EC1F5E: PyObject_SetAttr (object.c:1247)
==28631==    by 0x4F21600: PyEval_EvalFrameEx (ceval.c:2004)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4EA8F65: function_call (funcobject.c:526)
==28631==    by 0x4E7DFED: PyObject_Call (abstract.c:2529)
==28631==    by 0x4F1F6A6: PyEval_CallObjectWithKeywords (ceval.c:3890)
==28631==    by 0x4F23D5A: PyEval_EvalFrameEx (ceval.c:1739)
==28631==    by 0x4F26587: PyEval_EvalCodeEx (ceval.c:3253)
==28631==    by 0x4F266C1: PyEval_EvalCode (ceval.c:667)

Simon King

unread,
Dec 24, 2012, 4:54:48 PM12/24/12
to sage-...@googlegroups.com
Hi Jean-Pierre,

On 2012-12-24, Jean-Pierre Flori <jpf...@gmail.com> wrote:
> Not sure we got these so clearly before, but using --without-pymalloc and
> Valgrind (hint: finish and review #13060) I get lots of
>
>
>==28631== Invalid read of size 8
>==28631== at 0x10429E50:
> __pyx_tp_dealloc_4sage_9structure_15category_object_CategoryObject
> (category_object.c:8990)
> ...
>==28631== by 0x10A6CD28:
> __pyx_pw_4sage_9structure_11coerce_dict_16TripleDictEraser_3__call__
> (coerce_dict.c:1225)

OK, that's good, because it clearly points to the stuff from #715 and
thus gives hope to find a bug in the new code.

I hope I will be able to debug that after Christmas.

Best regards,
Simon


Jean-Pierre Flori

unread,
Dec 24, 2012, 5:18:26 PM12/24/12
to sage-...@googlegroups.com


On Monday, December 24, 2012 10:54:48 PM UTC+1, Simon King wrote:
Hi Jean-Pierre,

On 2012-12-24, Jean-Pierre Flori <jpf...@gmail.com> wrote:
> Not sure we got these so clearly before, but using --without-pymalloc and
> Valgrind (hint: finish and review #13060) I get lots of
>
>
>==28631== Invalid read of size 8
>==28631==    at 0x10429E50:
> __pyx_tp_dealloc_4sage_9structure_15category_object_CategoryObject
> (category_object.c:8990)
> ...
>==28631==    by 0x10A6CD28:
> __pyx_pw_4sage_9structure_11coerce_dict_16TripleDictEraser_3__call__
> (coerce_dict.c:1225)

OK, that's good, because it clearly points to the stuff from #715 and
thus gives hope to find a bug in the new code.
Yup, let's hope so.

Maybe the problem is with endomorphism rings, because we have the domain and codomain pointing to the same parent, that's a nice culprit for a superfluous decref.
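
To make that suspicion concrete, here is a toy sketch in pure Python. It is not the TripleDict code from #715: Parent, Eraser and demo are invented names. It only shows that when the same object appears twice in a key, an eraser callback attached to each weak reference runs twice for a single deletion.

```python
import weakref


class Parent:
    """Stand-in for a Sage parent; not the real class."""


class Eraser:
    """Hypothetical callback that removes one cache entry when a key dies."""

    def __init__(self, cache, key):
        self.cache = cache
        self.key = key
        self.calls = 0

    def __call__(self, ref):
        self.calls += 1
        # pop() with a default tolerates the entry already being gone.
        self.cache.pop(self.key, None)


def demo():
    cache = {}
    parent = Parent()
    key = (id(parent), id(parent))  # domain and codomain are the same object
    eraser = Eraser(cache, key)
    # Weak references that carry a callback are never shared, so this makes two.
    refs = [weakref.ref(parent, eraser), weakref.ref(parent, eraser)]
    cache[key] = "cached homset"
    del parent  # CPython frees the object at once and runs both callbacks
    return eraser.calls, len(cache), len(refs)
```

In Python the second call is harmless as long as the eraser tolerates a missing entry. In C or Cython code that adjusts reference counts by hand, cleaning up the same entry twice is exactly the kind of thing that could produce one decref too many.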

Jean-Pierre Flori

unread,
Dec 24, 2012, 7:44:01 PM12/24/12
to sage-...@googlegroups.com
Any reason for calling directly _refcache.__delitem__ rather than del _refcache ?
Changing this solves the problem, but surely only by hiding the bug...

Jean-Pierre Flori

unread,
Dec 24, 2012, 8:52:37 PM12/24/12
to sage-...@googlegroups.com
Indeed, rebuilding everything with --with-pydebug is just scary.
You get the same failure as above, but everywhere (because asserts are checked and you don't have to pray for a segfault).

Simon King

unread,
Dec 25, 2012, 11:04:12 AM12/25/12
to sage-...@googlegroups.com
Hi Jean-Pierre,

On 2012-12-25, Jean-Pierre Flori <jpf...@gmail.com> wrote:
>> Any reason for calling directly _refcache.__delitem__ rather than del
>> _refcache ?

No.

> Indeed, rebuilding everything with --with-pydebug is just scary.
> You get the smae failure as above, but everywhere (because asserts are
> checked and you don't have to pray for a segfault)

What commands does one need to issue in order to rebuild everything with
--with-pydebug?

Best regards,
Simon

Jean-Pierre Flori

unread,
Dec 25, 2012, 11:13:01 AM12/25/12
to sage-...@googlegroups.com
I think the easiest way is to tweak the spkg-install script so that it passes the option to configure.
Dirty, but it works OK for development; we should add some way to pass the option directly (for example when SAGE_DEBUG is yes, or by letting the user pass flags to put into EXTRAFLAGS).
By the way, from http://docs.python.org/devguide/setup.html#compiling-for-debugging :
You should always develop under a pydebug build of CPython (the only instance of when you shouldn’t is if you are taking performance measurements). Even when working only on pure Python code the pydebug build provides several useful checks that one should not skip.

:)

I then had to rebuild Cython (and to rebuild the Sage library), and gdb does not seem to work anymore.

When I say it fails nearly everywhere with the debug build (hoping that I did not break everything myself), I mean that it mostly fails before doctesting anything. I still see pointers to category objects in the trace I get (without gdb, unfortunately):
python: Modules/gcmodule.c:326: visit_decref: Assertion `gc->gc.gc_refs != 0' failed.
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libcsage.so(print_backtrace+0x31)[0x81edd2d]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libcsage.so(sigdie+0x14)[0x81edd5f]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libcsage.so(sage_signal_handler+0x1da)[0x81ed90b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf310)[0x52ab310]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x5beddd5]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x5bf0efb]
/lib/x86_64-linux-gnu/libc.so.6(+0x2df0e)[0x5be6f0e]
/lib/x86_64-linux-gnu/libc.so.6(+0x2dfb2)[0x5be6fb2]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1a3c11)[0x4fd3c11]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/structure/category_object.so(+0x269be)[0x105de9be]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/structure/parent.so(+0x5781e)[0x1039081e]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/structure/parent_old.so(+0x1baa1)[0x1012daa1]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/structure/parent_base.so(+0x5d03)[0xff0ad03]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0xde402)[0x4f0e402]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1a3c86)[0x4fd3c86]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1a4cd7)[0x4fd4cd7]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1a5008)[0x4fd5008]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(_PyObject_GC_Malloc+0xcc)[0x4fd5ce1]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(_PyObject_GC_New+0x1c)[0x4fd5d19]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyCFunction_NewEx+0x70)[0x4ee55dc]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0xeb6cc)[0x4f1b6cc]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0xf2359)[0x4f22359]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyType_Ready+0x21c)[0x4f19578]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/rings/integer.so(initinteger+0x2121)[0x1563bf4e]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(_PyImport_LoadDynamicModule+0x12e)[0x4faa57e]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1764a5)[0x4fa64a5]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x178776)[0x4fa8776]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x177c9a)[0x4fa7c9a]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176f19)[0x4fa6f19]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ImportModuleLevel+0x3f)[0x4fa7330]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1412e3)[0x4f712e3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyCFunction_Call+0xbc)[0x4ee5884]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x7f)[0x4e81c61]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x51dc9)[0x4e81dc9]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyObject_CallFunction+0xfe)[0x4e81f3d]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_Import+0x243)[0x4fa9076]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/rings/complex_double.so(+0x40ee3)[0x12aa3ee3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/rings/complex_double.so(+0x41021)[0x12aa4021]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/python/site-packages/sage/rings/complex_double.so(initcomplex_double+0x2b89)[0x12a9e0d5]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(_PyImport_LoadDynamicModule+0x12e)[0x4faa57e]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1764a5)[0x4fa64a5]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x178776)[0x4fa8776]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x177c9a)[0x4fa7c9a]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176f19)[0x4fa6f19]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ImportModuleLevel+0x3f)[0x4fa7330]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1412e3)[0x4f712e3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyCFunction_Call+0xbc)[0x4ee5884]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x7f)[0x4e81c61]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x170)[0x4f87abc]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x683b)[0x4f81a68]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x111c)[0x4f858ca]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x5a)[0x4f7b20c]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ExecCodeModuleEx+0x196)[0x4fa3e9b]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x174b3d)[0x4fa4b3d]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176465)[0x4fa6465]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x178776)[0x4fa8776]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x177c9a)[0x4fa7c9a]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176ea3)[0x4fa6ea3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ImportModuleLevel+0x3f)[0x4fa7330]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1412e3)[0x4f712e3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyCFunction_Call+0xbc)[0x4ee5884]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x7f)[0x4e81c61]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x170)[0x4f87abc]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x683b)[0x4f81a68]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x111c)[0x4f858ca]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x5a)[0x4f7b20c]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ExecCodeModuleEx+0x196)[0x4fa3e9b]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x174b3d)[0x4fa4b3d]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176465)[0x4fa6465]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x178776)[0x4fa8776]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x177c9a)[0x4fa7c9a]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176f19)[0x4fa6f19]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ImportModuleLevel+0x3f)[0x4fa7330]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1412e3)[0x4f712e3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyCFunction_Call+0xbc)[0x4ee5884]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x7f)[0x4e81c61]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x170)[0x4f87abc]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x683b)[0x4f81a68]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x111c)[0x4f858ca]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x5a)[0x4f7b20c]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ExecCodeModuleEx+0x196)[0x4fa3e9b]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x174b3d)[0x4fa4b3d]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176465)[0x4fa6465]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x178776)[0x4fa8776]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x177c9a)[0x4fa7c9a]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176f19)[0x4fa6f19]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ImportModuleLevel+0x3f)[0x4fa7330]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1412e3)[0x4f712e3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyCFunction_Call+0xbc)[0x4ee5884]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x7f)[0x4e81c61]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x170)[0x4f87abc]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x683b)[0x4f81a68]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x111c)[0x4f858ca]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x5a)[0x4f7b20c]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ExecCodeModuleEx+0x196)[0x4fa3e9b]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x174b3d)[0x4fa4b3d]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176465)[0x4fa6465]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x178776)[0x4fa8776]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x177c9a)[0x4fa7c9a]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x176f19)[0x4fa6f19]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyImport_ImportModuleLevel+0x3f)[0x4fa7330]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x1412e3)[0x4f712e3]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyCFunction_Call+0xbc)[0x4ee5884]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x7f)[0x4e81c61]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x170)[0x4f87abc]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x683b)[0x4f81a68]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x111c)[0x4f858ca]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x5a)[0x4f7b20c]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(+0x189986)[0x4fb9986]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyRun_FileExFlags+0xbf)[0x4fb990c]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0x2be)[0x4fb8109]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(PyRun_AnyFileExFlags+0x88)[0x4fb7741]
/home/jp/boulot/sage/sage-5.6.beta0/local/lib/libpython2.7.so.1.0(Py_Main+0xd29)[0x4fd377e]
python(main+0x20)[0x40085c]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x5bda6ad]
python[0x400779]

------------------------------------------------------------------------
Unhandled SIGABRT: An abort() occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
Aborted
[25204 refs]
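
The trailing "[25204 refs]" line is the total reference count that a debug build of Python 2 prints on exit. To confirm from within Python that the interpreter in use really was configured --with-pydebug, something like the following should work. This is only a sketch; the helper name `is_pydebug_build` is made up for illustration.

```python
import sys
import sysconfig


def is_pydebug_build():
    """Return True if this interpreter looks like a --with-pydebug build."""
    # sys.gettotalrefcount is only present when reference debugging is
    # compiled in, which --with-pydebug enables.
    has_total_refcount = hasattr(sys, "gettotalrefcount")
    # Py_DEBUG is recorded in the build configuration on Unix-like builds;
    # it may be None elsewhere, which bool() turns into False.
    py_debug_flag = bool(sysconfig.get_config_var("Py_DEBUG"))
    return has_total_refcount or py_debug_flag


if __name__ == "__main__":
    print("pydebug build: %s" % is_pydebug_build())
    if hasattr(sys, "gettotalrefcount"):
        print("total refs: %d" % sys.gettotalrefcount())
```

Running this with the Python that Sage uses would show whether the rebuilt interpreter, and not some leftover non-debug one, is being picked up.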

Jean-Pierre Flori

Dec 25, 2012, 11:36:41 AM12/25/12
to sage-...@googlegroups.com
In fact it seems you already segfault when going through the line
from sage.misc.all import * #takes a while
in sage/all.py,
but not before.

And within that, it is the
from function import (...
that triggers the first failure.

Volker Braun

Dec 25, 2012, 11:42:16 AM12/25/12
to sage-...@googlegroups.com
On Tuesday, December 25, 2012 4:13:01 PM UTC, Jean-Pierre Flori wrote:
I think the easiest way is to tweak the spkg-install script so that it passes the option to configure.
Dirty but working ok for development, we should add some way to pass the option directly (for example when SAGE_DEBUG is yes

+1 for enabling that (and the Singular xalloc wrapper) if SAGE_DEBUG is set. Making a debug build shouldn't involve hunting around for obscure environment variables that one can set.

Jean-Pierre Flori

Dec 25, 2012, 6:01:48 PM12/25/12
to sage-...@googlegroups.com
OK, the (first) problematic line is in sage/interfaces/mathematica.py:
mathematica = Mathematica(script_subdirectory='user')

Jean-Pierre Flori

Dec 26, 2012, 9:07:09 AM12/26/12
to sage-...@googlegroups.com
Bad (?) news: with a debug build of Python, Sage 5.2 (without the memleak patches) fails the same way.
So potentially, all the hard work here may have only dug up horrible old bugs.

Jean-Pierre Flori

Dec 26, 2012, 12:49:52 PM12/26/12
to sage-...@googlegroups.com
The offending object seems to be a weakref which is not refcounted correctly.
Not sure which one yet.
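
For comparison, here is what correct behaviour looks like at the Python level. This is only a sketch: `Parent` is a hypothetical stand-in class, not a Sage class, and the actual failure is in compiled code, so this cannot reproduce it. It just shows that creating a weak reference must not change the strong reference count of its referent, and that the weakref goes dead once the last strong reference is dropped.

```python
import gc
import sys
import weakref


class Parent(object):
    """Hypothetical stand-in for a Sage parent; not a real Sage class."""


def demo():
    events = []
    obj = Parent()

    # Strong reference count before and after creating the weak reference.
    before = sys.getrefcount(obj)
    ref = weakref.ref(obj, lambda r: events.append("collected"))
    after = sys.getrefcount(obj)

    # While obj is alive, calling the weakref returns the referent.
    alive = ref() is obj

    # Drop the only strong reference; CPython frees the object at once.
    del obj
    gc.collect()  # harmless here, needed on non-refcounting interpreters

    return before == after, alive, ref() is None, events


unchanged, alive, dead_afterwards, events = demo()
```

If the reference count of some weakref-related object is off by one in C code, the garbage collector's bookkeeping no longer adds up, which is the kind of inconsistency the visit_decref assertion checks for.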

Jean-Pierre Flori

Dec 26, 2012, 2:08:24 PM12/26/12