I think I have discovered a weird bug in the Python interpreter. Under
Python2.2, when I run this script:
class foo(object):
    def __init__(self, n):
        self.n = n
    __slots__ = ['n']

o = None
for i in xrange(43550):
    o = foo(o)
del o

o = None
print "I'm fine."
for i in xrange(43551):
    o = foo(o)
del o
print "I'm dead."
I get "I'm fine" and then a segmentation fault. This only happens when
__slots__ is defined on a new-style class. Seriously, I'm not making those
numbers up -- reliably, 43551 objects will crash the interpreter; fewer will
not.
I don't know the code here, but I'm willing to bet Python is smashing the stack
because PyType_GenericAlloc() is recursive. I get a stacktrace from gdb that
looks like this:
#0 0x100a3fc4 in PyDict_New ()
#1 0x100244e8 in _PyType_Lookup ()
#2 0x100249d4 in _PyObject_SlotCompare ()
#3 0x1001a3d4 in PyType_GenericAlloc ()
#4 0x1001a4fc in PyType_GenericAlloc ()
#5 0x1001a5b8 in PyType_GenericAlloc ()
#6 0x1001a5b8 in PyType_GenericAlloc ()
#7 0x1001a5b8 in PyType_GenericAlloc ()
#8 0x1001a5b8 in PyType_GenericAlloc ()
#9 0x1001a5b8 in PyType_GenericAlloc ()
#10 0x1001a5b8 in PyType_GenericAlloc ()
#11 0x1001a5b8 in PyType_GenericAlloc ()
#12 0x1001a5b8 in PyType_GenericAlloc ()
#13 0x1001a5b8 in PyType_GenericAlloc ()
#14 0x1001a5b8 in PyType_GenericAlloc ()
... and so on, and so on, for longer than I had the patience to page through.
(I did get to about #400, though.)
Version Information:
% python2.2
Python 2.2.1 (#1, May 4 2002, 11:33:48)
[GCC 2.95.4 20011002 (Debian prerelease)] on linux2
And since those arguments to xrange() above look like some weird magic to me,
this may also be useful to know:
% free
total used free shared buffers cached
Mem: 253968 146192 107776 0 8008 59184
-/+ buffers/cache: 79000 174968
Swap: 262136 3168 258968
% uname -a
Linux zelda 2.4.18 #1 SMP Fri Apr 26 19:18:50 CDT 2002 ppc unknown
Is there any in-development code I should try running?
--
| <`'> | Glyph Lefkowitz: Traveling Sorcerer |
| < _/ > | Lead Developer, the Twisted project |
| < ___/ > | http://www.twistedmatrix.com |
> I get "I'm fine" and then a segmentation fault. This only happens when
> __slots__ is defined on a new-style class. Seriously, I'm not making those
> numbers up -- reliably, 43551 objects will crash the interpreter; fewer will
> not.
Interestingly I get the construction working fine, but the 'del' crashes the
interpreter with (on Mac OS X):
-----
Date/Time: 2002-06-26 13:53:19 +0100
OS Version: 10.1.5 (Build 5S66)
Command: python
PID: 18472
Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_INVALID_ADDRESS (0x0001) at 0xbff7ffe0
Thread 0 Crashed:
#0 0x00037bbc in lookdict_string
#1 0x0003804c in PyDict_GetItem
#2 0x00018994 in _PyType_Lookup
#3 0x00017348 in lookup_maybe
#4 0x00016f38 in call_finalizer
#5 0x00017054 in subtype_dealloc
#6 0x00017118 in subtype_dealloc
#7 0x00017118 in subtype_dealloc
[etc...]
-----
This seems to fail for me at 5430 objects. I couldn't get the allocation to
crash at all (or at least up to a million objects, after which I got bored).
Jonathan
> On 26/6/2002 12:03, in article
> mailman.1025089474...@python.org, "Glyph Lefkowitz"
> <gl...@twistedmatrix.com> wrote:
>
> > I get "I'm fine" and then a segmentation fault. This only happens when
> > __slots__ is defined on a new-style class. Seriously, I'm not making those
> > numbers up -- reliably, 43551 objects will crash the interpreter; fewer will
> > not.
>
> Interestingly I get the construction working fine, but the 'del' crashes the
> interpreter with (on Mac OS X):
Hmm. Sprinkling print statements about shows it dying on the 'del' (this is
how I encountered the bug originally, I was trying to time the [de]allocation
of large numbers of objects under 2.1 vs. 2.2), but the stacktrace is still the
same; is there some difference between the way allocation works on OS X
vs. Linux? (I assume if you can run OS X we're using the same hardware.)
> This seems to fail for me at 5430 objects. I couldn't get the allocation to
> crash at all (or at least up to a million objects, after which I got bored).
Yep, same here. Allocation is no problem.
> Hmm. Sprinkling print statements about shows it dying on the 'del' (this is
> how I encountered the bug originally, I was trying to time the [de]allocation
> of large numbers of objects under 2.1 vs. 2.2), but the stacktrace is still
> the same; is there some difference between the way allocation works on OS X
> vs. Linux? (I assume if you can run OS X we're using the same hardware.)
>
>> This seems to fail for me at 5430 objects. I couldn't get the allocation to
>> crash at all (or at least up to a million objects, after which I got bored).
>
> Yep, same here. Allocation is no problem.
I'm not sure about allocation differences, but if I up the stacksize limit
then I can run the example up to much higher numbers of objects. I think
it's basically a flaw in the deallocation mechanism. I don't understand
enough about the __slots__ stuff to know why it should only affect that.
No-one else seems to have noted it as a known bug, so I'd guess it's fair to
log it to Sourceforge as a new one (besides which there's no harm in
duplication). If you can't get it logged, let me know and I'll have a go
myself.
Jonathan
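[For anyone who wants to reproduce Jonathan's higher-stack-limit run from within Python rather than via the shell's `ulimit -s`, here is a minimal sketch using the standard-library `resource` module (Unix only). The names are all real stdlib API; the comments about equivalence to ulimit are the only interpretation added:]

```python
# Sketch: raise the process's soft stack limit up to its hard limit
# before running the test script.  Unix-only; this is the same knob
# the shell exposes as `ulimit -s`.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("stack limit (soft, hard):", soft, hard)

# Raising the soft limit as far as the hard limit needs no privileges.
resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
```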
Interesting. Using Python 2.2.1 on Mandrake 8.1, I get a segfault
without printing "I'm fine". Go ahead and file a bug.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
Project Vote Smart: http://www.vote-smart.org/
> I get "I'm fine" and then a segmentation fault. This only happens when
> __slots__ is defined on a new-style class. Seriously, I'm not making those
> numbers up -- reliably, 43551 objects will crash the interpreter; fewer will
> not.
Hmm, the script runs fine on my Slackware 8.0 system, even if I crank
the number up to 100000.
--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, US / 37 20 N 121 53 W / ICQ16063900 / &tSftDotIotE
/ \ See the son in your bad day / Smell the flowers in the valley
\__/ Chante Moore
Bosskey.net: Aliens vs. Predator 2 / http://www.bosskey.net/avp2/
A personal guide to Aliens vs. Predator 2.
Glyph> I think I have discovered a weird bug in the Python interpreter.
Glyph> Under Python2.2, when I run this script:
...
Glyph,
This looks like a new bug to me. A quick scan for __slots__ in the SF bug
tracker didn't turn up anything. If you are unable to get ahold of the SF
site, I'll file a bug report for you, just let me know.
--
Skip Montanaro
sk...@pobox.com
consulting: http://manatee.mojam.com/~skip/resume.html
> Interesting. Using Python 2.2.1 on Mandrake 8.1, I get a segfault
> without printing "I'm fine". Go ahead and file a bug.
I finally got Sourceforge to talk to me long enough to submit a bug report
for this:
#574207 Chained __slots__ dealloc segfault
Feel free to followup with any information I've missed.
Jonathan
When run in a thread, it dies on my redhat 7.3 machine. When not run in
a thread, it seems to succeed even for very large values.
Non-main threads are presumably started with a smaller stack.
Jeff
> When run in a thread, it dies on my redhat 7.3 machine. When not run in
> a thread, it seems to succeed even for very large values.
I can reproduce that crash in a thread on Slackware 8.0, as well.
>
>From: Jonathan Hogg <jona...@onegoodidea.com>
>Subject: Re: Python 2.2 __slots__ Bug?
>Date: Wed, 26 Jun 2002 14:00:48 +0100
>
>> On 26/6/2002 12:03, in article
>> mailman.1025089474...@python.org, "Glyph Lefkowitz"
>> <gl...@twistedmatrix.com> wrote:
>>
>> > I get "I'm fine" and then a segmentation fault. This only happens when
>> > __slots__ is defined on a new-style class. Seriously, I'm not making those
>> > numbers up -- reliably, 43551 objects will crash the interpreter; fewer will
>> > not.
>>
>> Interestingly I get the construction working fine, but the 'del' crashes the
>> interpreter with (on Mac OS X):
>
>Hmm. Sprinkling print statements about shows it dying on the 'del' (this is
>how I encountered the bug originally, I was trying to time the [de]allocation
>of large numbers of objects under 2.1 vs. 2.2), but the stacktrace is still the
>same; is there some difference between the way allocation works on OS X
>vs. Linux? (I assume if you can run OS X we're using the same hardware.)
>
>> This seems to fail for me at 5430 objects. I couldn't get the allocation to
>> crash at all (or at least up to a million objects, after which I got bored).
>
>Yep, same here. Allocation is no problem.
>
Wondering if there's any 12*n byte chunks involved, fitting
into some 2**n (-overhead) allocation space? Just noticing
>>> divmod(2**19, 43551)
(12, 1676)
>>> divmod(2**16, 5430)
(12, 376)
Sometimes these things mean something, sometimes they don't ;-)
Maybe it will jog someone's thought. E.g., a last-item index/limit
problem?
Regards,
Bengt Richter
> Wondering if there's any 12*n byte chunks involved, fitting
> into some 2**n (-overhead) allocation space? Just noticing
>
>>>> divmod(2**19, 43551)
> (12, 1676)
>>>> divmod(2**16, 5430)
> (12, 376)
>
> Sometimes these things mean something, sometimes they don't ;-)
> Maybe it will jog someone's thought. E.g., a last-item index/limit
> problem?
It appears that GjR is on the case ;-) but I did some digging into the
source out of curiosity and it's definitely a stack overrun problem. If
anyone is interested, here's my analysis:
At the moment the deallocators for the builtin container types pull a neat
trick to stop the stack getting too deep. As they're recursively
deallocating objects they keep a count of how deep deallocation has gotten
so far. If this gets over a set limit then instead of deallocating the
object it jams it onto a list and returns immediately.
When the stack has unwound back to the top a new piece of magic is invoked
that checks to see if anything was put onto the "delete later" list. If
something was then it grabs the first thing off the list and starts
recursively deleting at that point again.
A hack is used to make this "delete later" list: The object's type is
checked and, since this trick is only used by a few builtin object types, a
numeric type code is stored in the reference count slot of the object (which
is no longer interesting since it must be 0). The type object pointer slot
is then re-used as a pointer to the next item in the "delete later" list.
When the item is pulled off the list and actually deallocated the reverse
trick is pulled to correct the type object pointer and reset the reference
count to 0.
This is all fine except that this trick is not pulled for __slots__
variables as these are not stored in a dict that can be added to the "delete
later" list (the point of slots variables). In the case of Glyph's example,
because the chain is made up entirely of slots pointers, there is never a
dict, tuple, or list - the only* types that can take part in this trick -
that can be added to the "delete later" list allowing the stack to be
unwound, so the deallocation simply recurses until it runs out of stack.
I can't see an easy way to fix this without changing the way that the
"delete later" list is constructed.
Chewy problem.
Jonathan
* OK, I lied slightly there. Actually frames and traceback objects take part
in this crime as well, but they're not interesting here.
> It appears that GjR is on the case ;-)
...and his alter-ego GvR.
oops.
Jonathan
Na, I assigned the bug to GvR just to irritate him <0.7 wink>: he wrote the
__slots__ deallocation code that's blowing up, yet has also called the funky
trashcan mechanism (which you described well later) "evil" more than once.
The irritating part is that the evil trashcan mechanism is the only hope of
avoiding stack overflow in absurd cases like this.
> ...
> I can't see an easy way to fix this without changing the way that the
> "delete later" list is constructed.
Sometimes we just agree to pretend that we never saw the bug report <wink>.
more-than-one-way-to-close-a-bug-ly y'rs - tim
def doStuff(x):
    o = None
    for i in xrange(x):
        o = foo(o)

# x = 43550
x = 3
doStuff(x)
print "I'm fine."
doStuff(x+1)
print "I'm dead."
>> ...
>> I can't see an easy way to fix this without changing the way that the
>> "delete later" list is constructed.
>
> Sometimes we just agree to pretend that we never saw the bug report <wink>.
Actually I think I just figured out an easy way to fix the problem.
I was thinking that if there had been a liberal sprinkling of dicts in the
chain then it wouldn't overflow. But there's no way of doing that from the
Python side of things.
Then it occurred to me that it could be done from the C side of things:
Surround the slots deallocation in 'subtype_dealloc' with Py_TRASHCAN type
macros with the modification that if the recursion limit is hit, we create a
new tuple and set its items to be the contents of the slots, then DECREF it,
set the slots to NULL, and continue. The tuple deallocating code will see
that the recursion limit has been hit and add itself to the delete later
list.
This has the effect of inserting a container into the chain every few items
which can take part in the stack unwinding and restarting magic.
Simple and effective, no?
If I want to have a go at this, should I just upload a patch to Sourceforge,
or should I email it to someone?
Jonathan
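[For what it's worth, the principle behind Jonathan's proposed fix -- bounding the C stack by handing children off to a heap-allocated container -- can be pushed all the way: with an explicit worklist, a chain of any length can be torn down in constant stack depth. A pure-Python illustration of that principle (the names are mine, not CPython's; the real fix has to work inside subtype_dealloc in C):]

```python
# Illustration: freeing an arbitrarily long chain with an explicit
# worklist instead of recursion, so stack depth stays constant.

class Node(object):
    __slots__ = ['child']
    def __init__(self, child=None):
        self.child = child

def free_chain(obj, freed):
    work = [obj]                 # heap-allocated "delete later" list
    while work:
        node = work.pop()
        freed.append(node)
        if node.child is not None:
            work.append(node.child)
            node.child = None    # detach so nothing else keeps it alive
    return freed

chain = None
for _ in range(100000):
    chain = Node(chain)
freed = free_chain(chain, [])
print(len(freed))
```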