
Is Stackless Python DEAD?


Goh S H

Oct 27, 2001, 12:39:46 AM
No update since 2001-05-14 ...

Martin von Loewis

Oct 27, 2001, 10:50:19 AM
pytho...@yahoo.com (Goh S H) writes:

> No update since 2001-05-14 ...

It's free software, so it never dies.

Regards,
Martin

Károly Ladvánszky

Oct 27, 2001, 12:04:39 PM
What is Stackless Python?

Cheers,

Károly



Emile van Sebille

Oct 27, 2001, 12:12:17 PM
http://www.stackless.com

--

Emile van Sebille
em...@fenx.com

---------
"Károly Ladvánszky" <a...@bb.cc> wrote in message
news:3bdad...@corp-goliath.newsgroups.com...

Fredrik Lundh

Oct 27, 2001, 12:46:02 PM
Károly Ladvánszky wrote:
> What is Stackless Python?

http://www.stackless.com/

(also see http://www.google.com)

</F>


Bill Tate

Oct 27, 2001, 3:12:52 PM
pytho...@yahoo.com (Goh S H) wrote in message news:<184fbd02.01102...@posting.google.com>...

> No update since 2001-05-14 ...

Christian Tismer is probably the best person to answer this, but I will
put my two cents in. First off - I pray that Stackless becomes part of
the core distribution of Python. Continuations (the core of Stackless)
are WORTH THE TIME getting to know in any event. While the concept of
continuations is difficult to grasp at first (I struggle with it
myself), I would encourage anyone to look at some of the examples
Gordon McMillan has put up on his site. He has working examples and
code that drive home some of the fundamentals of continuations and
their benefits.

It's also an ideal architecture from which to build scalable
application servers, since continuations are very lightweight compared
to threads and they work extremely well with sockets. There are other
benefits to them as well - again, check Gordon's site.

I would also suggest checking out www.eve-online.com - under the FAQ
section, search for "stackless" to see why they are using Python and
Stackless as their game engine!!!

In every respect, I believe Christian's work advances Python in very
meaningful ways in terms of flexibility and performance. It is stable
and a snap to install.

Bruce Dodson

Oct 27, 2001, 4:35:09 PM
Would Stackless have a better chance of making it into the core if its
initial PEP talked only about increasing performance and removing limits on
recursion? This advantage is not as "sexy" as continuations, but is also
less controversial. It is still something that everyone can understand, and
without any exported facilities for frame-switching, it alone does not
require changes to most existing C extensions. It does still require
additional book-keeping code to be added to the core, much like Stackless.

Microthreads and coroutines can be sold as advantages in another PEP, one
which does treat that decoupled stack as a tree, and does permit arbitrary
frame switching. This PEP would have to talk about the risks and changes
required to existing modules, to make them work safely in a program that
takes advantage of continuations. A "future" import might help. Meanwhile
Stackless remains a separate branch, but hopefully becomes easier to
maintain against a CPython which already has a decoupled stack.

Bruce


Martin von Loewis

Oct 27, 2001, 5:20:53 PM
"Bruce Dodson" <bruce_...@bigfoot.com> writes:

> Would Stackless have a better chance of making it into the core if its
> initial PEP talked only about increasing performance and removing limits on
> recursion?

No. It is not the feature set that prevents its incorporation; it is
the implementation strategy chosen. The patch is too intrusive.

Regards,
Martin

Donn Cave

Oct 29, 2001, 12:17:38 PM
Quoth tat...@aol.com (Bill Tate):
...

| In every respect, I believe Christian's work advances Python in very
| meaningful ways in terms of flexibility and performance. It is stable
| and a snap to install.

I'll say. For myself, I have mixed feelings - very powerful, but
a little scary. I was able to set up a callback-driven system that
uses continuations to span callbacks in a single function - like,

    # callback from UI - selected login option
    def login(self):
        name = self.config.loginid
        password = self.getpassword(name)  # posts password prompt window
        # Function actually returns from callback ...
        # ... New callback arrives with password requested here; continue!
        self.loginwithpassword(name, password)

That's arguably more readable than any alternative I ever thought of,
but then it substantially obscures the "real" flow of control in the
program. Anyway, whether I want to go that way or not (and I don't
know for myself), it clearly is indeed a meaningful advance, something
that makes a substantial difference in what you can do with Python.
Definitely worth looking at, if you have gnarly asynchronous programming
issues.

Donn Cave, do...@u.washington.edu

Paul Rubin

Oct 29, 2001, 12:53:00 PM
Donn Cave <do...@u.washington.edu> writes:
>     # callback from UI - selected login option
>     def login(self):
>         name = self.config.loginid
>         password = self.getpassword(name)  # posts password prompt window
>         # Function actually returns from callback ...
>         # ... New callback arrives with password requested here; continue!
>         self.loginwithpassword(name, password)
>
> That's arguably more readable than any alternative I ever thought of,
> but then it substantially obscures the "real" flow of control in the
> program.

Nice! But really, using callbacks in the first place for things like
this is a kludge designed to get around the lack of
coroutines/continuations/threads whatever. In threaded style you'd
just write getpassword as an ordinary method call that blocks til
the password is entered, then returns.

> Anyway, whether I want to go that way or not (and I don't know for
> myself), it clearly is indeed a meaningful advance, something that
> makes a substantial difference in what you can do with Python.
> Definitely worth looking at, if you have gnarly asynchronous
> programming issues.

I've been toying with the idea of combining a Python parser with a
Scheme implementation, getting continuations and a more serious
compiler/interpreter at the same time. If done in the obvious way
some Python semantics would break, however.

Donn Cave

Oct 29, 2001, 4:04:23 PM
Quoth Paul Rubin <phr-n...@nightsong.com>:

| Donn Cave <do...@u.washington.edu> writes:
|>     # callback from UI - selected login option
|>     def login(self):
|>         name = self.config.loginid
|>         password = self.getpassword(name)  # posts password prompt window
|>         # Function actually returns from callback ...
|>         # ... New callback arrives with password requested here; continue!
|>         self.loginwithpassword(name, password)
|>
|> That's arguably more readable than any alternative I ever thought of,
|> but then it substantially obscures the "real" flow of control in the
|> program.
|
| Nice! But really, using callbacks in the first place for things like
| this is a kludge designed to get around the lack of
| coroutines/continuations/threads whatever. In threaded style you'd
| just write getpassword as an ordinary method call that blocks til
| the password is entered, then returns.

I'm probably missing something here, but note that this system is
built on a multi-threaded context already, and part of the game was
to put the whole system together on a purely I/O dispatched basis.
That message I/O, and the callback dispatching, is a basic given of
the underlying graphics toolkit, and it makes life a lot simpler if
the whole application is on board.

The function is based on something I actually do, and I currently use
a version of it that does happen to block for the password. In this
case, the caller spawns a password prompt interface with a semaphore,
then blocks waiting for the semaphore, then retrieves the password
from a common interlocked data store. That sounds complicated, but
it's easy to make it work.
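For the record, a minimal sketch of that semaphore arrangement (the
names here are mine, not the actual code):

```python
import threading

# Hypothetical reconstruction of the blocking getpassword described
# above: the caller blocks on a semaphore while a prompt thread fills
# an interlocked store, then the caller reads the password out.
class PasswordStore:
    def __init__(self):
        self._lock = threading.Lock()        # guards the shared slot
        self._ready = threading.Semaphore(0)
        self._password = None

    def put(self, password):                 # called by the prompt thread
        with self._lock:
            self._password = password
        self._ready.release()                # wake the waiting caller

    def get(self):                           # called by the blocked caller
        self._ready.acquire()                # block until put() has run
        with self._lock:
            return self._password

def getpassword(prompt):
    store = PasswordStore()
    # The prompt interface runs in its own thread and answers via put()
    threading.Thread(target=lambda: store.put(prompt())).start()
    return store.get()                       # ordinary blocking call
```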

But what happens if the password prompt interface isn't a thread that
you can spawn every time you want one, but rather an element inside
a UI thread, and that UI thread happens to be asking you for
something that you'll need a password to answer? You can't ask it
for the password, because it's asking you a question, and blocking
until you come up with an answer. If all the parties just handle
I/O events and then return without blocking, we don't have that problem,
but in a simple procedural flow of control, the implementation gets
awkward because you have to divide the function into two, the second
one continuing with the password now in hand - and whatever state you
need, which has to be passed on from the first to the second function.
With Stackless' continuations, I could continue right back into the
middle of the function, with the state just where I left off, plus
the password value that came with the present callback.
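A hypothetical sketch of that awkward two-function split (all names
here are invented for illustration, not from the real system):

```python
# Without continuations, the function is divided in two, and whatever
# state the second half needs must be packaged up and handed along
# explicitly with the callback.
class LoginFlow:
    def __init__(self, config, ui):
        self.config = config
        self.ui = ui

    def login(self):
        # first half: runs up to the point where the password is needed
        name = self.config["loginid"]
        state = {"name": name}               # state the second half needs
        self.ui.prompt_password(name, lambda pw: self._login2(state, pw))
        # returns here; the rest happens in a later callback

    def _login2(self, state, password):
        # second half: continues with the password now in hand
        return (state["name"], password)
```

With a continuation, by contrast, login could simply resume in the
middle with its locals intact.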

If this still leaves you thinking that "coroutines/continuations/threads
whatever" would be a more obvious solution, could you elaborate?

Donn Cave, do...@u.washington.edu

Frederic Giacometti

Nov 4, 2001, 9:13:00 PM

"Martin von Loewis" <loe...@informatik.hu-berlin.de> wrote in message
news:j4zo6ci...@informatik.hu-berlin.de...

'Intrusive'?
This word seems to point out that the actual reason is not technical...
It is definitely not a word from the realm of engineering, but more from
someone's sense of his core self being disturbed by an outsider.

Am I right?

FG

A.M. Kuchling

Nov 5, 2001, 8:53:28 AM
On Mon, 05 Nov 2001 02:13:00 GMT,
Frederic Giacometti <frederic....@arakne.com> wrote:
>This word seems to point out that the actual reason is not technical...
>It is definitely not a word from the realm of engineering, but more from
>someone's sense of his core self being disturbed by an outsider.

Not at all. Stackless would have ramifications not just for a few
files in the core, but also for all the extension modules that come
with Python and for all the authors of third-party extension modules.
In terms of number of files affected, "intrusive" isn't a bad word to
describe the Stackless patches.

--amk

Guido Stepken

Nov 5, 2001, 9:18:25 AM
A.M. Kuchling wrote:

IMHO, a lot more work should be invested in Stackless Python, perhaps
for a doctoral thesis or diploma work ....

Why? Python even runs on the Palm Vx and iPaq. Doing multitasking with
threads isn't that complicated, but it is very much easier to write
code with continuations, once one has understood the big advantage ....
Python is becoming a common programming language for everybody, because
it is really easy to learn .... I have learned BASIC, 6502 assembler,
Pascal, Delphi, C, C++, Prolog, Lisp, PHP, Perl and now Python.
Compared to Java, Python is really RAD (Rapid Application Development),
about 3 times faster ..... and you can compile to Java bytecode ....
JPython is a problem - continuations would have to be translated
internally into green threads - a masterpiece for the person who
succeeds .....

guys (Guido v. Rossum, Phil (PyQt) and all others ..), keep up the good
work ....!

regards, Guido Stepken

John S. Yates, Jr.

Nov 5, 2001, 11:01:08 AM
On 5 Nov 2001 13:53:28 GMT, a...@localhost.debian.org (A.M. Kuchling) wrote:

>Not at all. Stackless would have ramifications not just for a few
>files in the core, but also for all the extension modules that come
>with Python and for all the authors of third-party extension modules.

Is this because those extension modules would break? Or because they
would be sub-optimal until they took advantage of continuations?

/john
--
John Yates
40 Pine Street
Needham, MA 02492
781 444-2899

Michael Abbott

Nov 5, 2001, 12:01:07 PM
Martin von Loewis <loe...@informatik.hu-berlin.de> wrote in
news:j4zo6ci...@informatik.hu-berlin.de:

> No. It is not the feature set that prevents its incorporation; it is
> the implementation strategy chosen. The patch is too intrusive.
>

I would have thought that the nature of the beast is likely to require
quite substantial modifications to key parts of Python.

Can you describe what you mean by "intrusive" in this context? I don't
know the basic statistics of the Stackless patch (number of files patched,
lines of code modified, etc), so it would be interesting to understand just
how severe a hurdle Stackless Python really has to jump to be integrated.

I keep wishing I had something like this, so I'm interested...

Mike C. Fletcher

Nov 5, 2001, 1:36:26 PM
I'm not sure I understand this argument. The current version is
drop-in-compatible with the standard Python 2.0 DLL. You don't re-compile
anything in extension modules, they just work the same as with the standard
distribution.

The patch is (apparently - I haven't looked at it) intrusive in its
re-design of the core loop of the interpreter (it is re-writing some
pretty basic mechanisms, after all), but for user-land (as distinct from
interpreter-implementer-land) code, it's pretty much transparent in my
experience.

With that said, there are likely to be systems that don't work well under
(for instance) micro-threading. Modules that need locks and expect
micro-threads to look like real threads will be disappointed (1000s of them
can be running in a single OS-level thread, so regular thread locks don't
block the micro-threads). However, that's a problem only for those users
actually using the Stackless-specific stuff (in essence if there are no
micro-threads running, then the thread-assuming extensions work fine).
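To illustrate the cooperative-scheduling point with plain generators
(this is only an analogy, not Stackless itself): many micro-threads
share one OS-level thread, so a blocking OS-level lock acquired inside
any one of them would stall the scheduler, and with it all the others.

```python
from collections import deque

def scheduler(tasks):
    # round-robin over micro-threads; a blocking call inside any task
    # would freeze this whole loop, hence every other micro-thread too
    ready = deque(tasks)
    trace = []
    while ready:
        task = ready.popleft()
        try:
            trace.append(next(task))   # run the task to its next yield
            ready.append(task)         # requeue: cooperative switch
        except StopIteration:
            pass                       # micro-thread finished
    return trace

def worker(name, steps):
    for i in range(steps):
        yield (name, i)                # yield = voluntary context switch
```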

Enjoy,
Mike



Martin von Loewis

Nov 5, 2001, 3:01:31 PM
Michael Abbott <mic...@rcp.co.uk> writes:

> I would have thought that the nature of the beast is likely to require
> quite substantial modifications to key parts of Python.
>
> Can you describe what you mean by "intrusive" in this context?

The patches change nearly everything in the core interpreter, and it
is not clear whether all these changes are really necessary and for
the better.

It *is* clear that some things just cannot continue to work as they
did. It just isn't clear (to me) that all the complexity that the
stackless patch adds is really necessary.

It's been quite some time since I studied it last, so here's what I
found after a quick glance right now:
- it adds 17 new members to the frame object, doubling the number
of members. Usage of some of these members isn't obvious to me.
- It adds a field nesting_level to the thread state, without
ever checking its value (it just counts the nesting level)
- it adds a number of _nr variants of functions (non-recursive),
e.g. for map and eval. In __builtins__, the _nr versions are
available as "map" and "eval", while the original versions are
preserved as apply_orig and map_orig:
* Are the _nr versions functionally completely backwards-compatible?
If not, why? If yes, why is the original version preserved?
* Just to implement map, 150 lines of builtin_map had to be
rewritten into 350 lines (builtin_map, make_stub_code,
make_map_frame, builtin_map_nr, builtin_map_loop). The author
indicates that the same procedure still needs to be done for
apply and filter. Just what is the "same procedure"? Isn't there
some better way?
- The code adds PREPARE macros into each branch of ceval. Why?
- It adds a long list of explicitly not-supported opcodes into
the ceval switch, instead of using 'default:'. No explanation
for that change is given, other than 'unused opcodes go here'.
Is it necessary to separately maintain them? Why?

It may be that some of these questions can be answered giving a good
reason for the change, but I doubt that this code can be incorporated
as-is, just saying "you need all of this for Stackless Python". I
don't believe you do, but I cannot work it out myself, either.

Regards,
Martin


Donn Cave

Nov 5, 2001, 4:35:13 PM
Quoth "Mike C. Fletcher" <mcfl...@home.com>:

| I'm not sure I understand this argument. The current version is
| drop-in-compatible with the standard Python 2.0 DLL. You don't re-compile
| anything in extension modules, they just work the same as with the standard
| distribution.
|
| The patch is (apparently, haven't looked at it) intrusive in it's re-design
| of the core loop of the interpreter (it is re-writing some pretty basic
| mechanisms, after all), but for user-land (as distinct from
| interpreter-implementer-land) code, it's pretty much transparent in my
| experience.
|
| With that said, there are likely to be systems that don't work well under
| (for instance) micro-threading. Modules that need locks and expect
| micro-threads to look like real threads will be disappointed (1000s of them
| can be running in a single OS-level thread, so regular thread locks don't
| block the micro-threads). However, that's a problem only for those users
| actually using the Stackless-specific stuff (in essence if there are no
| micro-threads running, then the thread-assuming extensions work fine).

And that doesn't apply to continuations, they are compatible with OS
threads. That may be obvious, but FYI. In my experiment, I used lots
of C++ modules, some of which create OS threads that branch into the
interpreter and execute most of the program.

Donn Cave, do...@u.washington.edu

Frederic Giacometti

Nov 5, 2001, 10:52:59 PM
> Frederic Giacometti <frederic....@arakne.com> wrote:
> >This word seems to point out that the actual reason is not technical...
> >It is definitely not a word from the realm of engineering, but more
> >from someone's sense of his core self being disturbed by an outsider.
>
> Not at all. Stackless would have ramifications not just for a few
> files in the core, but also for all the extension modules that come
> with Python and for all the authors of third-party extension modules.
> In terms of number of files affected, "intrusive" isn't a bad word to
> describe the Stackless patches.

Yes, but the word 'intrusive' normally carries a negative connotation;
and the expression 'too intrusive' means what it means...


Michael Abbott

Nov 6, 2001, 3:17:18 AM
Martin von Loewis <loe...@informatik.hu-berlin.de> wrote in
news:j4d72xa...@informatik.hu-berlin.de:

> Michael Abbott <mic...@rcp.co.uk> writes:
>
>> Can you describe what you mean by "intrusive" in this context?
>
> The patches change nearly everything in the core interpreter, and it
> is not clear whether all these changes are really necessary and for
> the better.
>
> It *is* clear that some things just cannot continue to work as they
> did. It just isn't clear (to me) that all the complexity that the
> stackless patch adds is really necessary.
>

...
>
> Regards,
> Martin

Hmm. Sounds tricky. It'd be interesting to find time to take a detailed
look; after all, we're only talking about eight .c files and seven .h
files, I think.

Is there any dialogue with Christian Tismer (the original author) on this
now? He seems to have kept quite a detailed commentary on his ideas in
the header comment of continuationmodule.c, so it may be possible to
reconstruct what's going on from there.

Umm. Well I for one need to go away and think about this, but let me end
with the following question for Python kernel experts:

Is the idea of "Stackless Python" fundamentally sound?

Martin von Loewis

Nov 6, 2001, 3:37:57 AM
"Frederic Giacometti" <frederic....@arakne.com> writes:

> Yes, but the word 'intrusive' normally refers to a negative
> connotation; and the expression 'too intrusive' means what it
> means...

I see. Looking up "intrusive" in Webster's, I find that this is not
what I meant:

in-tru-sive \in-'tr:u-siv, -ziv\ adj (15c)
1 a : characterized by intrusion

in-tru-sion \in-'tr:u-zhen\ n
[ME, fr. MF, fr. ML intrusion-, intrusio, fr. L intrusus, pp. of
intrudere] (15c)
1 : the act of intruding or the state of being intruded; esp: the
act of wrongfully entering upon, seizing, or taking
possession of the property of another

It is not the act of wrongfully taking possession of the property of
another that has prevented Stackless from being integrated. It is the
way in which the property is turned upside down that is bothersome
(now I ask myself whether "bothersome" will be misunderstood as well :-()

Regards,
Martin

Michael Hudson

Nov 6, 2001, 5:11:52 AM
Martin von Loewis <loe...@informatik.hu-berlin.de> writes:

[schnipp]


> It's been quite some time since I studied it last, so here's what I
> found after a quick glance right now:

Did you actually want answers to these questions? It's been some time
since I looked at it, but I think I can remember the answers to some of
them. Take a pinch of salt with most of what follows.

> - it adds 17 new members to the frame object, doubling the number
> of members. Usage of some of these members isn't obvious to me.

See below about map & friends.

> - It adds a field nesting_level to the thread state, without
> ever checking its value (it just counts the nesting level)

I imagine this was just for bragging with :)

> - it adds a number of _nr variants of functions (non-recursive),
> e.g. for map and eval. In __builtins__, the _nr versions are
> available as "map" and "eval", while the original versions are
> preserved as apply_orig and map_orig:
> * Are the _nr versions functionally completely backwards-compatible?
> If not, why? If yes, why is the original version preserved?

I think the originals are just around because the implementation of
the _nr variants was tricky and Chris needed something to test
against. Not sure, though.

> * Just to implement map, 150 lines of builtin_map had to be
> rewritten into 350 lines (builtin_map, make_stub_code,
> make_map_frame, builtin_map_nr, builtin_map_loop). The author
> indicates that the same procedure still needs to be done for
> apply and filter. Just what is the "same procedure"? Isn't there
> some better way?

This is where implementing stackless in C really, really hurts.

To get continuations to work (the way stackless does it anyway, there
are other tricks), you need to allocate your locals on the (C) heap
(it is called stackless, after all). This is no problem for Python
code (as Python has heap-allocated its locals forever).

So when map wants to call the Python function, it needs to stuff all
the local data it cares about into a frame object (see above), push
this frame object onto the (Python) stack, and *return* to the
interpreter in such a way that the mapping function is called next;
then every time(!) that returns, the interpreter calls back into map
again.

So in the map case, I think builtin_map_nr is called first, which
calls make_map_frame specifying that the interpreter should return
into builtin_map_loop, shoves the resulting frame onto the stack, then
returns. The interpreter then calls builtin_map_loop, which, if the
lists being mapped over are exhausted, pops the "fake" frame off the
stack and returns (which means the interpreter carries on with the
code that called map), or otherwise just returns again.

I don't remember what make_stub_code is for.

Is this helping? Oh well...

> - The code adds PREPARE macros into each branch of ceval. Why?
> - It adds a long list of explicitly not-supported opcodes into
> the ceval switch, instead of using 'default:'. No explanation
> for that change is given, other than 'unused opcodes go here'.
> Is it necessary to separately maintain them? Why?

This was an optimization Chris used to try and get back some of the
performance lost during the stackless changes. IIRC, he handles
exceptions and return values as "pseudo-opcodes" rather than using the
WHY_foo constants the current ceval.c uses. I never really understood
this part.

> It may be that some of these questions can be answered giving a good
> reason for the change, but I doubt that this code can be incorporated
> as-is, just saying "you need all of this for Stackless Python". I
> don't believe you do, but I cannot work it out myself, either.

I think integrating stackless into the core is a fairly huge amount of
work. I'd like to think I could do it, given several months of
full-time effort (which isn't going to happen). About the only likely
way I see for it to get in is for it to become important to Zope
Corp. for some reason, and them paying Tim or Guido (or Chris) to do
it.

Cheers,
M.

--
Never meddle in the affairs of NT. It is slow to boot and quick to
crash. -- Stephen Harris
-- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html

Terry Reedy

Nov 6, 2001, 6:16:26 AM
Michael Hudson asks

> Did you actually want answers to these questions? It's been some time
> since I looked at it, but I think I can remember the answers to some
> of them. Take a pinch of salt with most of what follows.

The post you answered hasn't arrived here yet, but I have certainly
been genuinely curious about some of these questions.

> Is this helping?

Yes, for me. Thank you.

Terry J. Reedy

Gordon McMillan

Nov 6, 2001, 8:45:14 AM
John S. Yates, Jr. wrote:

> On 5 Nov 2001 13:53:28 GMT, a...@localhost.debian.org (A.M. Kuchling)
> wrote:
>
>>Not at all. Stackless would have ramifications not just for a few
>>files in the core, but also for all the extension modules that come
>>with Python and for all the authors of third-party extension modules.
>
> Is this because those extension modules would break? Or because they
> would be sub-optimal until they took advantage of continuations?

You can't create a continuation in one execution of the interpreter
and use it in another. Recursions of the interpreter are very common
in Python - basically, anytime you execute Python code from C.

Stackless fixes some of these cases (turning recursion into iteration),
but leaves most of them untouched. In practice, it's not much of a
problem - on the Stackless list, I think I've seen two or three posts
from people trying to create continuations inside an __init__ (which
doesn't work). It turns out to be easy to work around.

But making Python *truly* stackless means getting rid of all recursions,
and that is an enormous task. If you don't do that, you've got a
language feature that doesn't work in some apparently random
set of circumstances.

There is still hope, I think, that something more general than
Generators but less ambitious than Stackless will find its way
into the core.

- Gordon

Frederic Giacometti

Nov 6, 2001, 9:58:09 AM

"Michael Hudson" <m...@python.net> wrote in message
news:uady09...@python.net...

> Martin von Loewis <loe...@informatik.hu-berlin.de> writes:
>
> [schnipp]
> > * Are the _nr versions functionally completely backwards-compatible?
> > If not, why? If yes, why is the original version preserved?
>
> I think the originals are just around because the implementation of
> the _nr variants was tricky and Chris needed something to test
> against. Not sure, though.

From what I understand of the stackless mechanism, the _nr functions
are used in the Python VM, but cannot be used in C extension callbacks.
In effect, in C extensions (or when embedding Python), Python must
return to the C code after executing the callback. So distinct hybrid
functions, different from those used by the VM, are still needed.

It's true that stackless introduces an additional level of complexity,
and that its implementation could be cleaned up. These may be the
actual reasons why it has not been merged...

FG

Frederic Giacometti

Nov 6, 2001, 10:06:40 AM

"Gordon McMillan" <gm...@hypernet.com> wrote in message
news:Xns91515922BE02...@199.171.54.214...

> John S. Yates, Jr. wrote:
>
> > On 5 Nov 2001 13:53:28 GMT, a...@localhost.debian.org (A.M. Kuchling)
> > wrote:
> But making Python *truly* stackless means getting rid of all recursions,
> and that is an enormous task. If you don't do that, you've got a
> language feature that doesn't work in some apparently random
> set of circumstances.

But how do you process callbacks to Python from C code (extensions or
embedded Python)?

One has to return to C after executing the Python code, and then C has
to return to Python after executing the remainder of its code...

I see maintaining hybrid behavior (iterative / recursive calls to the
interpreter, and 'recursion tolerance') as a requirement...

FG

Michael Hudson

Nov 6, 2001, 10:09:23 AM
"Frederic Giacometti" <frederic....@arakne.com> writes:

> "Michael Hudson" <m...@python.net> wrote in message
> news:uady09...@python.net...
> > Martin von Loewis <loe...@informatik.hu-berlin.de> writes:
> >
> > [schnipp]
> > > * Are the _nr versions functionally completely backwards-compatible?
> > > If not, why? If yes, why is the original version preserved?
> >
> > I think the originals are just around because the implementation of
> > the _nr variants was tricky and Chris needed something to test
> > against. Not sure, though.
>
> From what I understand of the stackless mechanism, the _nr functions
> are used in the Python VM, but cannot be used in C extension callbacks.
> In effect, in C extensions (or when embedding Python), Python must
> return to the C code after executing the callback. So distinct hybrid
> functions, different from those used by the VM, are still needed.

Huh? The builtin_foo functions in Python/bltinmodule.c are all
declared static, so you can't call them from C extensions anyway. You
could grub around in __builtins__ and call them using
PyEval_CallObject or whatever, but do people actually do this?

You could still "call" the _nr variants, but any "call" of a C
function that might end up calling Python code (i.e. almost anything
in the Python C API!) must[1] actually be structured as a
stack-push-then-return wotsit like I tried (and probably failed) to
explain in the quoted post.

> It's true that stackless introduces an additional level of complexity,
> and that its implementation could be cleaned up. These may be the
> actual reasons why it has not been merged...

I think Gordon's post hit the nail on the head here.

Cheers,
M.

[1] If continuations are to be allowed to escape from the called
Python code, anyway.
--
Any form of evilness that can be detected without *too* much effort
is worth it... I have no idea what kind of evil we're looking for
here or how to detect is, so I can't answer yes or no.
-- Guido Van Rossum, python-dev

Gordon McMillan

Nov 6, 2001, 2:02:11 PM
Frederic Giacometti wrote:

>
> "Gordon McMillan" <gm...@hypernet.com> wrote in message

[snip]

>> But making Python *truly* stackless means getting rid of all recursions,
>> and that is an enormous task. If you don't do that, you've got a
>> language feature that doesn't work in some apparently random set of
>> circumstances.
>
> But how do you process callbacks to Python from C code (extensions or
> embedded Python)?
>
> One has to return to C after executing the Python code, and then C has
> to return to Python after executing the remainder of its code...

All depends on what you mean by "return" <wink>. Try thinking of "callbacks"
and the "return" from callbacks as events.

In Stackless, the Python stack and the C stack are completely separate.
In the case where C code needs to call Python and then do something with
the result (that is, where tail recursion doesn't apply), Christian used
the trick of manufacturing a Python-style frame object that represents the C
code. As far as (Stackless) Python is concerned, it gets dispatched like
any other frame (Stackless' Python "stack" is really a tree).



> I see maintaining hybrid behavior (iterative / recursive calls to the
> interpreter, and 'recursion tolerance') as a requirement...

True. And Stackless (as a 3rd party product) can get away with saying
"Oops, I can't do that". The bar is higher for a core language feature,
though. Which is a shame, because this turns out to be more of a
theoretical problem than a real one.

- Gordon

Martin von Loewis

Nov 6, 2001, 2:04:23 PM
Michael Hudson <m...@python.net> writes:

> > - it adds 17 new members to the frame object, doubling the number
> > of members. Usage of some of these members isn't obvious to me.
>
> See below about map & friends.

If they are all just for the variables in map etc., a better solution
must be found - I don't think it is desirable to have them in every
frame if they are just used inside map.

However, from inspecting the patch, I doubt this is the case. Some of
them are general-purpose, but I couldn't easily figure out what, for
example, this first-instruction business is good for.

> So when map wants to call the Python function, it needs to stuff all
> the local data it cares about into a frame object (see above), push
> this frame object onto the (Python) stack, *return* to the interpreter
> in such a way that the mapping function is called next and then every
> time(!) that returns have the interpreter call back into map again.

Why couldn't map behave as if it was a plain Python function,
implemented as

def map(fun, args):
    res = []
    for a in args:
        res.append(fun(a))
    return res

[I know that map is implemented in a more-involved way; it should
still be possible to find its Python equivalent, then think how
stackless would execute this equivalent]
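
One way to picture how a stackless interpreter might execute that equivalent (a toy sketch with made-up machinery, not the actual patch): express map as a step function that yields each pending call instead of making it, so a flat driver loop performs the calls with no nested interpreter invocation.

```python
# Toy sketch: a "stackless" map that never calls fun() itself. It yields
# each pending call to a flat driver loop, the way Stackless pushes a frame
# and returns to the interpreter (made-up machinery, not the real patch).

def map_steps(fun, args):
    res = []
    for a in args:
        value = yield (fun, a)     # "push a frame" for fun(a), then suspend
        res.append(value)          # resumed with the call's result
    return res

def run(gen):
    # The driver: performs the suspended calls, no nested interpreter.
    try:
        fun, arg = next(gen)
        while True:
            fun, arg = gen.send(fun(arg))
    except StopIteration as done:
        return done.value

print(run(map_steps(abs, [-1, 2, -3])))  # [1, 2, 3]
```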

For the additions to frame_state, that would mean that map should
reserve a number of localsplus variables, instead of outright adding
to frame_state on the C level.

> > - The code adds PREPARE macros into each branch of ceval. Why?
> > - It adds a long list of explicitly not-supported opcodes into
> > the ceval switch, instead of using 'default:'. No explanation
> > for that change is given, other than 'unused opcodes go here'.
> > Is it necessary to separately maintain them? Why?
>
> This was an optimization Chris used to try and get back some of the
> performance lost during the stackless changes. IIRC, he handles
> exceptions and return values as "pseudo-opcodes" rather than using the
> WHY_foo constants the current ceval.c uses. I never really understood
> this part.

If it is an optimization, I think we should take a step back and first
try to understand what it tries to optimize, and how it does
that. Perhaps we find that it isn't needed at all, and that the
problem could be solved in a different way. If you don't understand
it, it can't go into Python.

> I think integrating stackless into the core is a fairly huge amount of
> work. I'd like to think I could do it, given several months of
> full-time effort (which isn't going to happen). About the only likely
> way I see for it to get in is for it to become important to Zope
> Corp. for some reason, and them paying Tim or Guido (or Chris) to do
> it.

I don't think these are the options. I don't volunteer to support this
code myself, either, but if somebody would step forward and claim that
she understands it all, and is willing to present it to Guido in a way
that he understands it also, and if all the hackish parts of it would
be replaced by understandable code, I think it could go into Python.
It just needs a determined volunteer to work on it.

Regards,
Martin

Donn Cave

Nov 6, 2001, 2:47:42 PM
Quoth Gordon McMillan <gm...@hypernet.com>:
| Frederic Giacometti wrote:
...

|> I see maintaining hybrid behavior (iterative / recursive calls to the
|> interpreter, and 'recursion tolerance') as a requirement...
|
| True. And Stackless (as a 3rd party product) can get away with saying
| "Oops, I can't do that". The bar is higher for a core language feature,
| though. Which is a shame, because this turns out to be more of a
| theoretical problem than a real one.

I had no trouble with Python -> C -> Python, done in the normal
way with PyEval_CallObject() with Stackless 2.0, and even invoked
continuations from previous calls. I don't know if that's inconsistent
with what you're saying, or not - maybe I was just lucky!

Donn Cave, do...@u.washington.edu

Gordon McMillan

Nov 6, 2001, 6:03:36 PM
Donn Cave wrote:

> I had no trouble with Python -> C -> Python, done in the normal
> way with PyEval_CallObject() with Stackless 2.0, and even invoked
> continuations from previous calls. I don't know if that's inconsistent
> with what you're saying, or not - maybe I was just lucky!

That *sounds* consistent. It's doing things like creating a continuation
in Python(2) that you want used by Python(1) that cause trouble. The
only times I've seen anyone bit by this is when they create a
continuation in an __init__ method, and then expect to use it
from some other method. It's not hard to work around, it's hard
to predict when you'll get bit.

- Gordon

Michael Hudson

Nov 7, 2001, 6:09:29 AM
I've refreshed my memory somewhat, but this certainly shouldn't be
taken as definitive.

Martin von Loewis <loe...@informatik.hu-berlin.de> writes:

> Michael Hudson <m...@python.net> writes:
>
> > > - it adds 17 new members to the frame object, doubling the number
> > > of members. Usage of some of these members isn't obvious to me.
> >
> > See below about map & friends.
>
> If they are all just for the variables in map etc., a better solution
> must be found - I don't think it is desirable to have them in every
> frame if they are just used inside map.

I thought this at the time I looked at the code.

> However, from inspecting the patch, I doubt this is the case. Some of
> them are general-purpose, but I couldn't easily figure out what, for
> example, this first-instruction business is good for.

Sorry, my memory had failed me somewhat. I've downloaded the code
now.

I think f_first_instr is there to transfer information between
eval_code2_setup and eval_code2_loop. As such I don't think it's at
all necessary; I think eval_code2_loop could just go

_PyCode_GETCODEPTR(f->f_code, &first_instr);

Another mystery, this time about core Python: why go to so much
trouble to ensure that code_object->co_code is only a readable buffer?

(a) I'm pretty sure it's always a string
(b) I don't see any error checking, so if it's something more
complicated than a string and bf_getreadbuffer fails, the
interpreter is going to go "boom".

Oh well, just another mystery about the buffer interface.

Not sure what f_next_instr is for either; it seems to always satisfy

unsigned char* first;
_PyCode_GETCODEPTR(f->f_code, &first);
f->f_next_instr == first + f->f_lasti

It may be another optimization...

I think f_stackpointer does what f_stacktop does in today's Python (it
was introduced by the generator patch).

f_statusflags records various hairy things about the frame. I
certainly don't understand the details, but I don't think you can go
without this.

f_execute is usually eval_code2_loop, but can also be
builtin_map_loop. This is fundamental.

f_dealloc is used by the continuation module, as are f_node, f_co,
f_coframes. I /really/ don't understand that module.

f_depth seems to be used only by map(). Oh no, continuations too.

f_age really does seem to be only used by map(). To store the length
of the longest sequence being iterated over, it seems. Nice name.

Same for f_reg3.

So there definitely seems to be some room for cleanup here. I'd need
to understand continuationmodule.c to see exactly how much.

> > So when map wants to call the Python function, it needs to stuff all
> > the local data it cares about into a frame object (see above), push
> > this frame object onto the (Python) stack, *return* to the interpreter
> > in such a way that the mapping function is called next and then every
> > time(!) that returns have the interpreter call back into map again.
>
> Why couldn't map behave as if it was a plain Python function,
> implemented as
>
> def map(fun, args):
>     res = []
>     for a in args:
>         res.append(fun(a))
>     return res
>
> [I know that map is implemented in a more-involved way; it should
> still be possible to find its Python equivalent, then think how
> stackless would execute this equivalent]
>
> For the additions to frame_state, that would mean that map should
> reserve a number of localsplus variables, instead of outright adding
> to frame_state on the C level.

I wasn't trying to justify the existing implementation, just explain it.

I agree with you here.

> > > - The code adds PREPARE macros into each branch of ceval. Why?
> > > - It adds a long list of explicitly not-supported opcodes into
> > > the ceval switch, instead of using 'default:'. No explanation
> > > for that change is given, other than 'unused opcodes go here'.
> > > Is it necessary to separately maintain them? Why?
> >
> > This was an optimization Chris used to try and get back some of the
> > performance lost during the stackless changes. IIRC, he handles
> > exceptions and return values as "pseudo-opcodes" rather than using the
> > WHY_foo constants the current ceval.c uses. I never really understood
> > this part.
>
> If it is an optimization, I think we should take a step back and first
> try to understand what it tries to optimize, and how it does
> that. Perhaps we find that it isn't needed at all, and that the
> problem could be solved in a different way. If you don't understand
> it, it can't go into Python.

And here.

> > I think integrating stackless into the core is a fairly huge amount of
> > work. I'd like to think I could do it, given several months of
> > full-time effort (which isn't going to happen). About the only likely
> > way I see for it to get in is for it to become important to Zope
> > Corp. for some reason, and them paying Tim or Guido (or Chris) to do
> > it.
>
> I don't think these are the options. I don't volunteer to support this
> code myself, either, but if somebody would step forward and claim that
> she understands it all, and is willing to present it to Guido in a way
> that he understands it also, and if all the hackish parts of it would
> be replaced by understandable code, I think it could go into Python.
> It just needs a determined volunteer to work on it.

Yeesss, but does anyone have the time for this? I mean, I've just
spent an hour or so I probably shouldn't have writing this article,
and I've barely scratched the surface.

I don't want to discourage anyone, but I think liking a challenge
is a prerequisite.

I might devote an evening or two to it and see where I get.

Homological algebra beckons -- brain relief in this context!

Cheers,
M.

--
LINTILLA: You could take some evening classes.
ARTHUR: What, here?
LINTILLA: Yes, I've got a bottle of them. Little pink ones.
-- The Hitch-Hikers Guide to the Galaxy, Episode 12

Frederic Giacometti

Nov 7, 2001, 10:35:43 AM

"Gordon McMillan" <gm...@hypernet.com> wrote in message
news:Xns91518EE46939...@199.171.54.215...

> Frederic Giacometti wrote:
>
> >
> > "Gordon McMillan" <gm...@hypernet.com> wrote in message
>
> [snip]
>
> >> But making Python *truly* stackless means getting rid of all recursions,
> >> and that is an enormous task. If you don't do that, you've got a
> >> language feature that doesn't work in some apparently random set of
> >> circumstances.
> >
> > But how do you process callbacks to Python from C code (extensions or
> > embedded python)?
> >
> > One has to return to C after executing the Python code, and then C has to
> > return to Python after executing the remainder of its code...
>
> All depends on what you mean by "return" <wink>. Try thinking of "callbacks"
> and the "return" from callbacks as events.
>
> In Stackless, the Python stack and the C stack are completely separate.
> In the case where C code needs to call Python and then do something with
> the result (that is, where tail recursion doesn't apply), Christian used
> the trick of manufacturing a Python-style frame object that represents the C
> code. As far as (Stackless) Python is concerned, it gets dispatched like
> any other frame (Stackless' Python "stack" is really a tree).

That's a very interesting paradigm :))
Thanks,

FG


Frederic Giacometti

Nov 7, 2001, 11:47:48 AM

"Gordon McMillan" <gm...@hypernet.com> wrote in message
news:Xns91518EE46939...@199.171.54.215...
> Frederic Giacometti wrote:
>
> In Stackless, the Python stack and the C stack are completely separate.
> In the case where C code needs to call Python and then do something with
> the result (that is, where tail recursion doesn't apply), Christian used
> the trick of manufacturing a Python-style frame object that represents the C
> code. As far as (Stackless) Python is concerned, it gets dispatched like
> any other frame (Stackless' Python "stack" is really a tree).

It sounds to me like the current Python thread model could also be
profitably reengineered with stackless.

Here is a possible direction:

*** Replacement of the Python lock with a Python dedicated thread ***

Instead of having the Python VM running in multiple threads sequentially, by
means of the Python synchronization lock, the Python VM would run in one
dedicated thread (in a single-threaded build, this would just be the main
thread) - this is just a reversal of the paradigm.

Then:
- stackless would manage the micro-threads within the (unique) Python
thread
- there would be no more need for a global Python lock (and performance
would improve); stackless would manage on its own its continuation list and
the event queues from other threads and C functions.
- C code could run independently in other threads, using standard
inter-thread communication mechanisms to communicate with the Python VM
(just as with the lock, but then only the 'C client' functions would have to
wait, while stackless Python would run uninterrupted, unlike in the
current system).

In other terms:
- the Python VM would run in one single thread (presently, the Python VM,
by means of the Python lock, already runs in a single-threaded fashion)
- the Python thread would act as a server thread to the other threads
requesting calls into Python.
- this would get rid of the Python lock synchronization overhead, and
somewhat simplify the current multithreaded approach
- the concept of a stackless microthread would be reconciled with the
current Python thread objects, which reflect the OS's native threads.

This would impact the current threading/thread modules: a Python thread
object would then be a handle to the OS thread where the C code would run,
while all Python bytecode and reference operations would be performed by
stackless in THE Python VM thread.
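
The proposal above can be sketched in a few lines of plain Python (a hypothetical illustration of the server-thread idea, not an implementation of it): a single VM thread drains a request queue that all other threads post work to.

```python
# Hypothetical sketch of the "one dedicated VM thread" idea: all Python-level
# work runs in a single server thread; other threads post requests to a queue
# instead of contending for a global lock.

import queue
import threading

requests = queue.Queue()

def vm_thread():
    # The only thread that ever runs "interpreter" work.
    while True:
        job, reply = requests.get()
        if job is None:          # shutdown sentinel
            break
        reply.put(job())         # run the Python-side work, send result back

def call_into_vm(job):
    # Used by "C" threads: post work to the VM thread and wait for the answer.
    reply = queue.Queue()
    requests.put((job, reply))
    return reply.get()

vm = threading.Thread(target=vm_thread)
vm.start()
result = call_into_vm(lambda: sum(range(10)))   # runs in the VM thread
requests.put((None, None))
vm.join()
print(result)  # 45
```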

Does this make sense?

Thanks,

FG

Paul Svensson

Nov 7, 2001, 1:43:03 PM
Michael Abbott <mic...@rcp.co.uk> writes:
>
> Is the idea of "Stackless Python" fundamentally sound?
>

Looking at it from the other direction,
seeing an interpreter involving the host stack in target recursion
gives me a severe case of itching-to-fix-it.

/Paul

Paul Rubin

Nov 7, 2001, 2:06:57 PM
Yes, it's a standard idiom to implement coroutines with continuations.

One virtue of using kernel threads is that, in principle, kernel threads
can take advantage of multiple hardware CPUs. Python's global
interpreter lock stops that from happening, but I thought there was
some hope the lock might go away someday.

Erno Kuusela

Nov 7, 2001, 10:44:36 PM
In article <7x8zdit...@ruckus.brouhaha.com>, Paul Rubin
<phr-n...@nightsong.com> writes:

i don't think anyone has come up with a reasonable way of implementing
"free threading" yet, since the last time it was tried it caused
everything to slow down so much. probably for good performance
reference counting would have to be dropped in exchange for a less
predictable garbage collection system, and the python internals
sprinkled with fragile locks.

threads suck, use fork :)

-- erno

Frederic Giacometti

Nov 8, 2001, 12:41:16 AM

"Paul Rubin" <phr-n...@nightsong.com> wrote in message
news:7x8zdit...@ruckus.brouhaha.com...

Here are more details of a possible design of the stackless OS thread
model:

[Preamble: In a single thread build, all this collapses into one thread]

The Python VM thread (the thread that runs Python at <<full speed>>, without
interruption) is associated with two pools of OS threads.

The threads in the first pool run arbitrary reentrant C functions.
The threads in the second pool are each associated with a non-reentrant C
API (one thread per independent set of APIs; to be initialized upon
loading the extension module); e.g. OpenGL, Metaphase...

Each thread schedules its own queue of continuations (micro-threads) for
execution.
The C threads can post new continuations to the Python thread, and
vice-versa.

In total, three main types of continuations are used:
- continuations from the Python thread, scheduled in the Python thread
- continuations from C function threads, scheduled in the Python thread
- continuations from the Python thread, scheduled in the C threads

This model seems generic; it has no interpreter lock, and can simply
distribute C calls across multiple OS threads. Multithreaded C/Python
programming would be simplified too.

[As an example, with this model, in JPE, the Java threads would be mapped to
Python continuations; and this would work regardless of whether the JVM
uses native or green threads.
In contrast, without stackless, JPE requires that the JVM run with native
threads, and has to bear the constant thread-switching/Python-lock overhead
of the present implementation...].

Furthermore, non-blocking versions of some of the Python C functions can be
provided.
For instance, alongside py_decref(), a non-blocking py_decref_non_blocking()
could be provided....

Finally, the concept can be pushed as far as dedicating a third thread to
Python de-referencing and memory de-allocation, which would run in parallel
to the Python thread, thus releasing the latter from the load of
de-referencing.
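
That last idea can be sketched as well (a hypothetical illustration only; the names decref_non_blocking and collector are made up here): the VM posts objects to a queue that a dedicated collector thread drains in parallel.

```python
# Hypothetical sketch of the "dedicated de-allocation thread": instead of
# freeing objects inline, the VM posts them to a queue that a collector
# thread drains in parallel (names are made up for illustration).

import queue
import threading

to_free = queue.Queue()
freed = []

def collector():
    while True:
        obj = to_free.get()
        if obj is None:           # shutdown sentinel
            break
        freed.append(obj)         # stand-in for the actual deallocation

def decref_non_blocking(obj):
    # The VM never blocks on deallocation; it just enqueues the object.
    to_free.put(obj)

t = threading.Thread(target=collector)
t.start()
for name in ("a", "b", "c"):
    decref_non_blocking(name)
to_free.put(None)
t.join()
print(freed)  # ['a', 'b', 'c']
```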

FG

Michael Hudson

Nov 8, 2001, 8:12:55 AM
"Frederic Giacometti" <frederic....@arakne.com> writes:

[schnipp]
> Does this make sense?

But isn't the point of many multithreaded apps allowing code to run
during blocking IO operations? Your approach would knock that on the
head.

Cheers,
M.

--
languages shape the way we think, or don't.
-- Erik Naggum, comp.lang.lisp

Martin von Loewis

Nov 9, 2001, 5:21:33 AM
Michael Hudson <m...@python.net> writes:

> I wasn't trying to justify the existing implementation, just explain it.

Thanks, and that is much appreciated. I'll save your article for
future reference; it has many details that anybody diving into
stackless should know.

> Yeesss, but does anyone have the time for this? I mean, I've just
> spent an hour or so I probably shouldn't have writing this article,
> and I've barely scratched the surface.

Same with me, when I inspected the diffs a few days ago. I'm not
saying that you should be the one cleaning it all up, either. I just
wanted to make it clear that it is not for lack of interest that Stackless
isn't integrated into core Python yet, but that a lot of work needs to
be done on the code before an integration could be attempted.

So answering the question in the subject: Stackless Python is not
dead; it is just a sleeping beauty, waiting for a prince to kiss her
awake.

Regards,
Martin

Frederic Giacometti

Nov 10, 2001, 3:39:24 PM

"Michael Hudson" <m...@python.net> wrote in message
news:u4ro58...@python.net...

> "Frederic Giacometti" <frederic....@arakne.com> writes:
>
> [schnipp]
> > Does this make sense?
>
> But isn't the point of many multithreaded apps allowing code to run
> during blocking IO operations? Your approach would knock that on the
> head.

In the proposed design, a stackless call to another thread never blocks the
calling thread; it inserts a frame into one of the threads' frame schedulers.
Like all C functions external to the Python interpreter core, IO
functions run outside the interpreter thread.

FG

Michael Hudson

Nov 12, 2001, 5:32:00 AM
"Frederic Giacometti" <frederic....@arakne.com> writes:

I must be missing something. Can you sketch how, say, socket.send()
would be implemented? Would you spawn a new OS thread for every C API
call? I'm afraid I don't understand you to this point...

Cheers,
M.

--
BUGS Never use this function. This function modifies its first
argument. The identity of the delimiting character is
lost. This function cannot be used on constant strings.
-- the glibc manpage for strtok(3)

Frederic Giacometti

Nov 13, 2001, 10:42:31 AM

"Michael Hudson" <m...@python.net> wrote in message
news:uu1w0e...@python.net...

> "Frederic Giacometti" <frederic....@arakne.com> writes:
>
> > "Michael Hudson" <m...@python.net> wrote in message
> > news:u4ro58...@python.net...
> > > "Frederic Giacometti" <frederic....@arakne.com> writes:
> > >
> > > [schnipp]
> > > > Does this make sense?
> > >
> > > But isn't the point of many multithreaded apps allowing code to run
> > > during blocking IO operations? Your approach would knock that on the
> > > head.
> >
> > In the proposed design, stackless call to another thread never
> > blocks the thread; it inserts a frame in one of the thread frame
> > schedulers. Just as all C functions external to the Python
> > interpreter core, IO functions run outside the interpreter thread.
>
> I must be missing something. Can you sketch how, say, socket.send()
> would be implemented? Would you spawn a new OS thread for every C API
> call? I'm afraid I don't understand you to this point...

As I mentioned, two thread pools would be maintained.
To the extent that send() is reentrant on the underlying OS, it would
be executed in one of the threads of the reentrant thread pool.
This is the 'thread pool' pattern, where threads are kept up (i.e. active)
from one call to the next. It's a standard pattern in concurrent
programming.

Of course, this requires a non-blocking thread library (i.e. with OS
support), not a 'green thread' library (blocking, no OS support). Currently,
Python is always built on non-blocking threads when they exist anyway.
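
The thread-pool pattern described above can be sketched like this (a minimal illustration, not the proposed C implementation; the Pool class and its names are invented here):

```python
# A minimal version of the thread-pool pattern: worker threads stay alive
# between calls and pick blocking jobs off a shared queue, so no thread is
# spawned per call (invented names, illustration only).

import queue
import threading

class Pool:
    def __init__(self, size):
        self.jobs = queue.Queue()
        self.workers = [threading.Thread(target=self._worker, daemon=True)
                        for _ in range(size)]
        for w in self.workers:
            w.start()

    def _worker(self):
        # Kept up (active) from one call to the next.
        while True:
            func, args, reply = self.jobs.get()
            reply.put(func(*args))        # e.g. a blocking send()/recv()

    def submit(self, func, *args):
        reply = queue.Queue()
        self.jobs.put((func, args, reply))
        return reply                      # caller collects the result later

pool = Pool(2)
futures = [pool.submit(pow, n, 2) for n in range(5)]
print([f.get() for f in futures])  # [0, 1, 4, 9, 16]
```

Each `submit` returns immediately with a one-slot reply queue, which plays the role of the continuation that the interpreter thread would later resume.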

FG

Michael Hudson

Nov 13, 2001, 11:06:39 AM
"Frederic Giacometti" <frederic....@arakne.com> writes:

> "Michael Hudson" <m...@python.net> wrote in message

> news:uu1w0e...@python.net...
[...]


> > I must be missing something. Can you sketch how, say, socket.send()
> > would be implemented? Would you spawn a new OS thread for every C API
> > call? I'm afraid I don't understand you to this point...
>
> As I mentioned, two thread pools would be maintained. To the extent
> that send() is reentrant on the underlying OS, it would be executed
> in one of the threads of the reentrant thread pool. This is the
> 'thread pool' pattern; where threads are kept up (i.e. active) from
> one call to the next. It's a standard pattern and concurrent
> programming algorithm.

So when Python code executes

sock.recv(data)

the interpreter would take a thread from the pool, and in effect say
"here, run this function". Then the interpreter thread would go off
and execute pending Python threads, and when the sock.recv call
returned, it would add the thread that called it back to the set of
pending interpreter threads? (I think this would be easier to discuss
with paper and pencil...)

What does this buy us again? It still makes writing C code that calls
Python code a bit of a pain, doesn't it? Oh, maybe not. More
thinking required...

Cheers,
M.

--
The meaning of "brunch" is as yet undefined.
-- Simon Booth, ucam.chat

Donn Cave

Nov 13, 2001, 12:19:55 PM
Quoth "Frederic Giacometti" <frederic....@arakne.com>:

| "Michael Hudson" <m...@python.net> wrote in message
| news:uu1w0e...@python.net...
...

|> I must be missing something. Can you sketch how, say, socket.send()
|> would be implemented? Would you spawn a new OS thread for every C API
|> call? I'm afraid I don't understand you to this point...
|
| As I mentioned, two thread pools would be maintained.
| To the extent that send() is reentrant on the underlying OS, it would
| beexecuted in one of the threads of the reentrant thread pool.
| This is the 'thread pool' pattern; where threads are kept up (i.e. active)
| from one call to the next. It's a standard pattern and concurrent
| programming algorithm.
|
| Of course, this requires a non-blocking thread library (i.e. with OS
| support); not a 'green thread' library (blocking, no OS support). Currently,
| Python is always build on non-blocking threads when they exist anyway.

But while it requires OS thread support, it collides with support of
OS threads, doesn't it? For example, the threaded applications I write
use one thread per window, threads created by Window.Show(). Those
threads call back into the interpreter on window events, and the
interpreted code calls right back into the C level graphics library;
that library is built around this multi-threaded system and the calling
thread isn't something to select arbitrarily from a pool, it has to be
the thread that supports the Window object that noticed the event.

I can't tell if this proposal is really incompatible with that - maybe
a diagram would work better for me, too!

Donn Cave, do...@u.washington.edu

Frederic Giacometti

Nov 14, 2001, 10:13:06 AM

"Michael Hudson" <m...@python.net> wrote in message
news:u8zdao...@python.net...

A possible implementation would consist of using the method flag to mark C
Python methods/functions that should run outside the interpreter thread. This way,
the entire function runs in a separate thread.

The programmer does not have to explicitly manipulate OS threads at any
point; the continuation/OS thread correspondence would be taken care of
automatically by the threaded stackless implementation.
One would have parallel threads of execution through just continuations.
C extensions would run entirely either in a parallel thread, or in the
interpreter thread. There would be no more interpreter lock; and when a C
thread needs some service by the interpreter, it would just post a
continuation to the interpreter scheduler.

This will simplify programming when using Python callbacks, too; since the
currently active Python continuation would be the continuation active in the
thread.


Frederic Giacometti

Nov 14, 2001, 10:22:26 AM
> But while it requires OS thread support, it collides with support of
> OS threads, doesn't it? For example, the threaded applications I write
> use one thread per window, threads created by Window.Show(). Those
> threads call back into the interpreter on window events, and the
> interpreted code calls right back into the C level graphics library;
> that library is built around this multi-threaded system and the calling
> thread isn't something to select arbitrarily from a pool, it has to be
> the thread that supports the Window object that noticed the event.

The continuation in the Python interpreter has to keep a pointer to the C
thread to which it is to return.
Here, we would be in the case of non-reentrant threads (threads with state
memory, as opposed to stateless threads, to use the analogy with server
processes/threads). The sending C thread would not be assigned another
thread/continuation of control, and would remain idle. Thus, it could
execute the callbacks from its Python continuation.

FG


Michael Hudson

Nov 14, 2001, 10:48:27 AM
"Frederic Giacometti" <frederic....@arakne.com> writes:

> "Michael Hudson" <m...@python.net> wrote in message
> news:u8zdao...@python.net...
> > "Frederic Giacometti" <frederic....@arakne.com> writes:

[...]


> > So when Python code executes
> >
> > sock.recv(data)
> >
> > the interpreter would take a thread from the pool, and in effect say
> > "here, run this function". Then the interpreter thread would go off
> > and execute pending Python threads, and when the sock.recv call
> > returned, it would add the thread that called it back to the set of
> > pending interpreter threads? (I think this would be easier to discuss
> > with paper and pencil...)
> >
> > What does this buy us again? It still makes writing C code that calls
> > Python code a bit of a pain, doesn't it? Oh, maybe not. More
> > thinking required...
>
> A possible implementation would consist in using the method flag to mark C
> python methods/functions to run outside the interpreter thread. This way,
> the entire function runs in a separate thread.

method flag?

> The programmer does not have to explicitly manipulate OS threads at any
> point; the continuation/OS thread correspondence would be taken care of
> automatically by the threaded stackless implementation.
> One would have parallel threads of execution through just continuations.
> C extensions would run entirely either in a parallel thread, or in the
> interpreter thread. There would be no more interpreter lock; and when a C
> thread needs some service by the interpreter, it would just post a
> continuation to the interpreter scheduler.

This is what I thought you meant. Good.

I think you could write code that would use an OS thread for every
other level of recursion (think of printing deeply nested data
structures that have Python __str__ methods, for instance). I'm not
sure that's a good idea. I can also think of other situations in which
you'd end up with a lot of OS threads sitting around. Is having a lot
of blocked OS threads around a problem?

> This will simplify programming when using Python callbacks, too;
> since the currently active Python continuation would be the
> continuation active in the thread.

I think there are too many definite articles in that sentence for me
to make head or tail of it.

Interesting idea. I wonder if anyone has the time to implement it...

Cheers,
M.

--
My hat is lined with tinfoil for protection in the unlikely event
that the droid gets his PowerPoint presentation working.
-- Alan W. Frame, alt.sysadmin.recovery

Frederic Giacometti

Nov 14, 2001, 9:39:56 PM

"Michael Hudson" <m...@python.net> wrote in message
news:uzo5pl...@python.net...

> "Frederic Giacometti" <frederic....@arakne.com> writes:
>
> > "Michael Hudson" <m...@python.net> wrote in message
> > news:u8zdao...@python.net...
> > > "Frederic Giacometti" <frederic....@arakne.com> writes:
> [...]
> > > So when Python code executes
> > >
> > > sock.recv(data)
> > >
> > > the interpreter would take a thread from the pool, and in effect say
> > > "here, run this function". Then the interpreter thread would go off
> > > and execute pending Python threads, and when the sock.recv call
> > > returned, it would add the thread that called it back to the set of
> > > pending interpreter threads? (I think this would be easier to discuss
> > > with paper and pencil...)
> > >
> > > What does this buy us again? It still makes writing C code that calls
> > > Python code a bit of a pain, doesn't it? Oh, maybe not. More
> > > thinking required...
> >
> > A possible implementation would consist in using the method flag to mark C
> > python methods/functions to run outside the interpreter thread. This way,
> > the entire function runs in a separate thread.
>
> method flag?

The 3rd field of the PyMethodDef structure (the one on which METH_VARARGS is
used...)

FG

Christian Tismer

Dec 31, 2001, 8:45:29 AM
Well, it is a little late to answer this, but...

Michael Hudson wrote:

> Martin von Loewis <loe...@informatik.hu-berlin.de> writes:

...

...

>>- It adds a field nesting_level to the thread state, without
>> ever checking its value (it just counts the nesting level)
>>
>
> I imagine this was just for bragging with :)


Yes. I needed a way to track when and why there are still
recursions. At some phase I also had the idea to base
decisions on this about when I'm allowed to switch things, but
later I learned that this is the wrong way.

>>- it adds a number of _nr variants of functions (non-recursive),
>> e.g. for map and eval. In __builtins__, the _nr versions are
>> available as "map" and "eval", while the original versions are
>> preserved as apply_orig and map_orig:


>> * Are the _nr versions functionally completely backwards-compatible?
>> If not, why? If yes, why is the original version preserved?
>>
>
> I think the originals are just around because the implementation of
> the _nr variants was tricky and Chris needed something to test
> against. Not sure, though.


A) they are completely compatible except for the special unwinding
rule. A special token may be returned that tells to unwind the stack.
The versions without _nr are kept for binary compatibility with
existing code which is not aware of the unwind token.
In this case, Stackless behaves like standard Python, it just
does recursions. The _nr versions are for extensions which
make use of the stackless features.
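The unwind rule described in (A) can be modeled in plain Python with a small trampoline. This is only an illustration of the idea: `UNWIND`, `call_flat`, and `factorial_nr` are invented names for this sketch, not Stackless' actual C-level API.

```python
# Toy model of the unwind-token protocol (all names are illustrative).
# Instead of recursing into the interpreter, a "stackless" callable
# returns a marker plus the pending call; a flat dispatch loop runs it.

UNWIND = object()  # stands in for the real unwind token

def call_flat(func, *args):
    """Trampoline: run pending calls without growing the call stack."""
    result = func(*args)
    while isinstance(result, tuple) and result and result[0] is UNWIND:
        _, next_func, next_args = result
        result = next_func(*next_args)
    return result

def factorial_nr(n, acc=1):
    # A _nr-style function: it never calls itself directly; it asks
    # the dispatcher to perform the "recursion" for it.
    if n <= 1:
        return acc
    return (UNWIND, factorial_nr, (n - 1, acc * n))

print(call_flat(factorial_nr, 5))  # 120
```

Code that is unaware of the token (the preserved *_orig versions, in the text's terms) simply never returns it, so both conventions can coexist.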

>> * Just to implement map, 150 lines of builtin_map had to be
>> rewritten into 350 lines (builtin_map, make_stub_code,
>> make_map_frame, builtin_map_nr, builtin_map_loop). The author
>> indicates that the same procedure still needs to be done for
>> apply and filter. Just what is the "same procedure"? Isn't there
>> some better way?
>>
>
> This is where implementing stackless in C really, really hurts.


[great explanation of stackless techniques skipped.]

I agree this is not easy to understand and to implement.
I always was thinking of a framework which makes this
easier, but I didn't come up with something suitable.


...


>>- The code adds PREPARE macros into each branch of ceval. Why?
>>- It adds a long list of explicitly not-supported opcodes into
>> the ceval switch, instead of using 'default:'. No explanation
>> for that change is given, other than 'unused opcodes go here'.
>> Is it necessary to separately maintain them? Why?
>>
>
> This was an optimization Chris used to try and get back some of the
> performance lost during the stackless changes. IIRC, he handles
> exceptions and return values as "pseudo-opcodes" rather than using the
> WHY_foo constants the current ceval.c uses. I never really understood
> this part.


This is really just an optimization.
The PREPARE macros were used to limit code increase, and to
give me some more options to play with.
Finally, the PREPARE macros do an optimized opcode prefetch
which turns out to be a drastic speedup for the interpreter loop.
Standard Python does an increment for every byte code and then
one for the optional argument, and the argument is picked bytewise.
What I do is a single add to the program counter, dependent on the
opcode/argument size which is computed in the PREPARE macro.
Then, on Intel machines, I use a short word access to the argument,
which gives considerable savings. (Although this wouldn't be
necessary if the compilers weren't that dumb.)
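The difference between the two fetch schemes can be sketched in Python. The opcode numbers and the little-endian two-byte argument layout follow old CPython bytecode, but the two fetch functions themselves are invented for illustration, not actual ceval code.

```python
import struct

# Toy bytecode: an opcode byte followed, for opcodes >= HAVE_ARGUMENT,
# by a two-byte little-endian argument (as in CPython of that era).
HAVE_ARGUMENT = 90
code = bytes([100, 3, 0, 9])  # LOAD_CONST 3, then an argument-less opcode

def fetch_bytewise(code, pc):
    # Standard scheme: three increments, argument picked byte by byte.
    op = code[pc]; pc += 1
    arg = None
    if op >= HAVE_ARGUMENT:
        arg = code[pc] | (code[pc + 1] << 8)
        pc += 2
    return op, arg, pc

def fetch_prefetch(code, pc):
    # PREPARE-style scheme: the instruction size is known from the
    # opcode, so the program counter advances once and the argument
    # is read as a single 16-bit word.
    op = code[pc]
    if op >= HAVE_ARGUMENT:
        (arg,) = struct.unpack_from("<H", code, pc + 1)
        return op, arg, pc + 3
    return op, None, pc + 1

assert fetch_bytewise(code, 0) == fetch_prefetch(code, 0)
```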

>>It may be that some of these questions can be answered giving a good
>>reason for the change, but I doubt that this code can be incorporated
>>as-is, just saying "you need all of this for Stackless Python". I
>>don't believe you do, but I cannot work it out myself, either.


>>
>
> I think integrating stackless into the core is a fairly huge amount of
> work. I'd like to think I could do it, given several months of
> full-time effort (which isn't going to happen). About the only likely
> way I see for it to get in is for it to become important to Zope
> Corp. for some reason, and them paying Tim or Guido (or Chris) to do
> it.


I'm at a redesign for Stackless 2.2. I hope to make it simpler,
split apart Stackless and optimization, and continuations are
no longer my primary target, but built-in microthreads.

ciao - chris

--
Christian Tismer :^) <mailto:tis...@tismer.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net/
14163 Berlin : PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com/


Cameron Laird

unread,
Dec 31, 2001, 11:17:16 AM12/31/01
to
In article <mailman.1009806401...@python.org>,

Christian Tismer <tis...@tismer.com> wrote:
>Well, it is a little late to answer this, but...
>
>Michael Hudson wrote:
>
> > Martin von Loewis <loe...@informatik.hu-berlin.de> writes:
>...
>
>...
>
> >>- It adds a field nesting_level to the thread state, without
> >> ever checking its value (it just counts the nesting level)
> >>
> >
> > I imagine this was just for bragging with :)
>
>
>Yes. I needed a way to track when and why there are still
>recursions. At some phase I also had the idea to use it to drive
>decisions on when I'm allowed to switch things, but
>later I learned that this is the wrong way.
Along with bragging rights, one of the not-obvious-but-
easy-to-explain reasons to introspect on recursion depth
is as a defense against denial-of-service (or more un-
intentional) security hazards. If you're executing
something dynamic enough to give a user a wedge into the
execution stack, it's prudent, however crude it might
seem, to impose resource limits. One engineering discovery
is that recursion depth can be a useful resource to limit.
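CPython itself exposes a crude version of this limit through `sys.setrecursionlimit`; a sketch (the cap of 200 is arbitrary, chosen for illustration):

```python
import sys

# A crude resource limit CPython itself provides: cap the interpreter's
# recursion depth before running code that might wedge the stack.
old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(200)  # arbitrary small cap for illustration

def runaway(n):
    return runaway(n + 1)  # unbounded recursion

blocked = False
try:
    runaway(0)
except RecursionError:
    blocked = True  # the limit cut the runaway recursion short
finally:
    sys.setrecursionlimit(old_limit)

print(blocked)  # True
```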
.
.

.
>I'm at a redesign for Stackless 2.2. I hope to make it simpler,
>split apart Stackless and optimization, and continuations are
>no longer my primary target, but built-in microthreads.
Cool!
.
.
.
--

Cameron Laird <Cam...@Lairds.com>
Business: http://www.Phaseit.net
Personal: http://starbase.neosoft.com/~claird/home.html

Michael Hudson

unread,
Jan 1, 2002, 2:13:15 PM1/1/02
to
Christian Tismer <tis...@tismer.com> writes:

> Well, it is a little late to answer this, but...

Hey, I don't care, I'm just glad to see you're still paying attention
:)

> Michael Hudson wrote:
[...]

> >> * Just to implement map, 150 lines of builtin_map had to be
> >> rewritten into 350 lines (builtin_map, make_stub_code,
> >> make_map_frame, builtin_map_nr, builtin_map_loop). The author
> >> indicates that the same procedure still needs to be done for
> >> apply and filter. Just what is the "same procedure"? Isn't there
> >> some better way?
> >>
> >
> > This is where implementing stackless in C really, really hurts.
>
>
> [great explanation of stackless techniques skipped.]
>
> I agree this is not easy to understand and to implement.
> I always was thinking of a framework which makes this
> easier, but I didn't come up with something suitable.

I've had similar thoughts, but likewise fell short. I think you could
probably do things with m4 that took something readable and spat out
stack-neutral C, but it would be a Major Project.

> >>- The code adds PREPARE macros into each branch of ceval. Why?
> >>- It adds a long list of explicitly not-supported opcodes into
> >> the ceval switch, instead of using 'default:'. No explanation
> >> for that change is given, other than 'unused opcodes go here'.
> >> Is it necessary to separately maintain them? Why?
> >>
> >
> > This was an optimization Chris used to try and get back some of the
> > performance lost during the stackless changes. IIRC, he handles
> > exceptions and return values as "pseudo-opcodes" rather than using the
> > WHY_foo constants the current ceval.c uses. I never really understood
> > this part.
>
>
> This is really just an optimization.
> The PREPARE macros were used to limit code increase, and to
> give me some more options to play with.
> Finally, the PREPARE macros do an optimized opcode prefetch
> which turns out to be a drastic speedup for the interpreter loop.
> Standard Python does an increment for every byte code and then
> one for the optional argument, and the argument is picked bytewise.
> What I do is a single add to the program counter, dependent on the
> opcode/argument size which is computed in the PREPARE macro.
> Then, on Intel machines, I use a short word access to the argument,
> which gives considerable savings. (Although this wouldn't be
> necessary if the compilers weren't that dumb.)

This could/should be split off from stackless, right?

> >>It may be that some of these questions can be answered giving a good
> >>reason for the change, but I doubt that this code can be incorporated
> >>as-is, just saying "you need all of this for Stackless Python". I
> >>don't believe you do, but I cannot work it out myself, either.
> >>
> >
> > I think integrating stackless into the core is a fairly huge amount of
> > work. I'd like to think I could do it, given several months of
> > full-time effort (which isn't going to happen). About the only likely
> > way I see for it to get in is for it to become important to Zope
> > Corp. for some reason, and them paying Tim or Guido (or Chris) to do
> > it.
>
>
> I'm at a redesign for Stackless 2.2. I hope to make it simpler,
> split apart Stackless and optimization,

Ah :)

> and continuations are no longer my primary target, but built-in
> microthreads.

Fair enough. Glad to hear you've found some time for your baby!

Cheers,
M.

Christian Tismer

unread,
Jan 1, 2002, 5:21:12 PM1/1/02
to
Michael Hudson wrote:

> Christian Tismer <tis...@tismer.com> writes:
>>Well, it is a little late to answer this, but...
> Hey, I don't care, I'm just glad to see you're still paying attention
> :)


Hmmtja, late but yes.

>>Michael Hudson wrote:
>>
> [...]
>
>> >> * Just to implement map, 150 lines of builtin_map had to be
>> >> rewritten into 350 lines (builtin_map, make_stub_code,
>> >> make_map_frame, builtin_map_nr, builtin_map_loop). The author
>> >> indicates that the same procedure still needs to be done for
>> >> apply and filter. Just what is the "same procedure"? Isn't there
>> >> some better way?
>> >>
>> >
>> > This is where implementing stackless in C really, really hurts.
>>
>>
>>[great explanation of stackless techniques skipped.]
>>

>>I agree this is not easy to understand and to implement.


>>I always was thinking of a framework which makes this
>>easier, but I didn't come up with something suitable.
>>
> I've had similar thoughts, but likewise fell short. I think you could
> probably do things with m4 that took something readable and spat out
> stack-neutral C, but it would be a Major Project.


There must be a simple path. The scheme is always the same.
See the split of functions in stackless map. I hope to find
a macro set that can create this mess from a couple of fragments.

[understanding PREPARE macros]


>>This is really just an optimization.
>>The PREPARE macros were used to limit code increase, and to
>>give me some more options to play with.
>>Finally, the PREPARE macros do an optimized opcode prefetch
>>which turns out to be a drastic speedup for the interpreter loop.
>>Standard Python does an increment for every byte code and then
>>one for the optional argument, and the argument is picked bytewise.
>>What I do is a single add to the program counter, dependent on the
>>opcode/argument size which is computed in the PREPARE macro.
>>Then, on Intel machines, I use a short word access to the argument,
>>which gives considerable savings. (Although this wouldn't be
>>necessary if the compilers weren't that dumb.)
>>
>
> This could/should be split off from stackless, right?


Yes. And if you have a look in the last (dusty) release, you see that
it already is. There is a python script that applies all these
optimizations automagically.

[integration rumor again]


>>I'm at a redesign for Stackless 2.2. I hope to make it simpler,
>>split apart Stackless and optimization,
>>
>
> Ah :)


Jaah :)

>>and continuations are no longer my primary target, but built-in
>>microthreads.
>>
>
> Fair enough. Glad to hear you've found some time for your baby!


Well, thanks. I had a lot of trouble with my living babies, now after
that, the virtual ones get their attention again.

Michael Hudson

unread,
Jan 3, 2002, 6:20:29 AM1/3/02
to
Christian Tismer <tis...@tismer.com> writes:

> >>I agree this is not easy to understand and to implement.
> >>I always was thinking of a framework which makes this
> >>easier, but I didn't come up with something suitable.
> >>
> > I've had similar thoughts, but likewise fell short. I think you could
> > probably do things with m4 that took something readable and spat out
> > stack-neutral C, but it would be a Major Project.
>
>
> There must be a simple path. The scheme is always the same.

Yes.

> See the split of functions in stackless map. I hope to find
> a macro set that can create this mess from a couple of fragments.

But C is ****so**** unexpressive on this level.

#define STACKLESS_CALL(FUNC, ARGTUPLE) \
{ PyFrameObject* f = PyFrame_New(); f->next = FUNC; \
f->nextargs = ARGTUPLE; f->return_to = ???; return; }

You could probably do it in Lisp.

Cheers,
M.

--
... so the notion that it is meaningful to pass pointers to memory
objects into which any random function may write random values
without having a clue where they point, has _not_ been debunked as
the sheer idiocy it really is. -- Erik Naggum, comp.lang.lisp

Christian Tismer

unread,
Jan 3, 2002, 7:16:19 AM1/3/02
to
Michael Hudson wrote:

> Christian Tismer <tis...@tismer.com> writes:
...

>>There must be a simple path. The scheme is always the same.
>>
>
> Yes.
>
>
>>See the split of functions in stackless map. I hope to find
>>a macro set that can create this mess from a couple of fragments.
>>
>
> But C is ****so**** unexpressive on this level.
>
> #define STACKLESS_CALL(FUNC, ARGTUPLE) \
> { PyFrameObject* f = PyFrame_New(); f->next = FUNC; \
> f->nextargs = ARGTUPLE; f->return_to = ???; return; }
>
> You could probably do it in Lisp.


Haah! Got it!
I could probably do it in Python. Why do we have this
wonderful language. I can put some comments into the source
which can be understood by a Python script. This script is
then run over the source and spits out the necessary C code.

Ponder, think -- chris

Christian Tismer

unread,
Jan 15, 2002, 1:48:28 PM1/15/02
to
John S. Yates, Jr. wrote:

> On 5 Nov 2001 13:53:28 GMT, a...@localhost.debian.org (A.M. Kuchling) wrote:
>
>
>>Not at all. Stackless would have ramifications not just for a few
>>files in the core, but also for all the extension modules that come
>>with Python and for all the authors of third-party extension modules.
>>
>
> Is this because those extension modules would break? Or because they
> would be sub-optimal until they took advantage of continuations?


The extension modules do not break. But unless they support the
stackless way of calling back into an interpreter, they will
block things like microthreads.
In order to allow a C function to cooperate with Stackless, it needs
to free the C stack completely while it calls other Python functions.
Everything must be saved to and restored from the frame chain.
There are cases where this is trivial, and other cases where it
is nearly impossible (i.e., rewrite the whole extension).

But there is no problem for C extensions which don't call back
into Python or if they do this only for short time, and you can
live with not switching coroutines/microthreads during this call.

ciao - chris

Christian Tismer

unread,
Jan 15, 2002, 1:59:34 PM1/15/02
to
Frederic Giacometti wrote:

> "Gordon McMillan" <gm...@hypernet.com> wrote in message

> news:Xns91515922BE02...@199.171.54.214...


>
>>John S. Yates, Jr. wrote:
>>
>>
>>>On 5 Nov 2001 13:53:28 GMT, a...@localhost.debian.org (A.M. Kuchling)
>>>wrote:
>>>

>>But making Python *truly* stackless means getting rid of all recursions,
>>and that is an enormous task. If you don't do that, you've got a
>>language feature that doesn't work in some apparently random
>>set of circumstances.
>>
>
> But how do you process callbacks to Python from C code (extensions or
> embeded python)?


Either you do it as before. Then you get the known behavior
of Python. You cannot resume uthreads from the Python code
nested inside your C extension, unless that code starts its own
scheduling and finishes it before returning to the C code.

Or: You rewrite your C extension in a way that it vanishes from
the C stack while it is executing Python code. This is not
trivial. A good example is stackless' map, which does exactly this.


> One has to return to C after executing the Python code, and than C has to
> return to Python after executing the remainer of its code...


Exactly. The C code has to put all its state info into a frame
and play the frame chain game, as Python functions do. If it
adheres to that rule, everything works seamlessly.
In a way, your C code becomes "the interpreter" for this frame.
A frame structure with better support for this has to be
developed. Currently, there are just a few extra fields which
I used for my specific stuff, like stackless map and full
continuation support.
In the long term, I wish to have more general frames which can
be adjusted to the extension's needs.
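The "frame chain game" can be sketched in Python: the C function's locals live in a frame-like record, and the function returns to a dispatcher whenever it needs a sub-call, in the spirit of stackless map. The `Frame` class, the `("call", ...)` convention, and the dispatcher are all invented for this illustration.

```python
# Toy model of a C function playing the frame chain game: it saves its
# loop state in a frame-like record, returns to the dispatcher, and is
# re-entered at the right step. Nothing here is Stackless' real API.

class Frame:
    def __init__(self, execute, **state):
        self.execute = execute   # "the interpreter" for this frame
        self.state = state       # saved locals (loop index, results)
        self.pending = None      # result of a sub-call, filled in later

def map_start(frame):
    frame.state.update(i=0, out=[])
    return map_loop(frame)

def map_loop(frame):
    seq, i, out = frame.state["seq"], frame.state["i"], frame.state["out"]
    if frame.pending is not None:        # collect previous sub-call
        out.append(frame.pending)
        frame.pending = None
    if i == len(seq):
        return out                       # done: a real return value
    frame.state["i"] = i + 1
    frame.execute = map_loop             # re-enter here next time
    # Ask the dispatcher to run func(seq[i]) on our behalf.
    return ("call", frame.state["func"], seq[i])

def run(frame):
    # Flat dispatcher: performs requested sub-calls, never nests.
    while True:
        r = frame.execute(frame)
        if isinstance(r, tuple) and r[0] == "call":
            frame.pending = r[1](r[2])
        else:
            return r

f = Frame(map_start, func=lambda x: x * x, seq=[1, 2, 3])
print(run(f))  # [1, 4, 9]
```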

Paul Rubin

unread,
Jan 15, 2002, 2:15:12 PM1/15/02
to
Christian Tismer <tis...@tismer.com> writes:
> In order to allow a C function to cooperate with Stackless, it needs
> to free the C stack completely while it calls other python functions.
> Everyting must be saved and restored from the frame chain.

Do you think stuff can be added to SWIG to support this?

Christian Tismer

unread,
Jan 15, 2002, 2:12:06 PM1/15/02
to
Martin von Loewis wrote:

> Michael Hudson <m...@python.net> writes:
>
>
>>>- it adds 17 new members to the frame object, doubling the number
>>> of members. Usage of some of these members isn't obvious to me.
>>>
>>See below about map & friends.
>>
>
> If they are all just for the variables in map etc., a better solution
> must be found - don't think it is desirable to have them in every
> frame, if they are just used inside map.


They were also heavily used in the continuation code.
I wanted to add a number of registers which can be used in
whatever manner an extension needs, without doing a redesign
of frames (which breaks compatibility).

> However, from inspecting the patch, I doubt this is the case. Some of
> them are general-purpose, but I couldn't easily figure out what, for
> example, this first-instruction business is good for.


Some fields are not necessary, after all.
This is evolution in flux ... :-)

>>So when map wants to call the Python function, it needs to stuff all
>>the local data it cares about into a frame object (see above), push
>>this frame object onto the (Python) stack, *return* to the interpreter
>>in such a way that the mapping function is called next and then every
>>time(!) that returns have the interpreter call back into map again.
>>
>
> Why couldn't map behave as if it was a plain Python function,
> implemented as
>
> def map(fun, args):
>     res = []
>     for a in args:
>         res.append(fun(a))
>     return res


It does so.

> [I know that map is implemented in a more-involved way; it should
> still be possible to find its Python equivalent, then think how
> stackless would execute this equivalent]
>
> For the additions to frame_state, that would mean that map should
> reserve a number of localsplus variables, instead of outright adding
> to frame_state on the C level.


I wanted map to be as efficient as before, so I needed some
integer registers, not Python objects.
No, I would propose to extend frames to also support memory
which doesn't need to hold Python objects. That makes them
general-purpose.

>>>- The code adds PREPARE macros into each branch of ceval. Why?
>>>- It adds a long list of explicitly not-supported opcodes into
>>> the ceval switch, instead of using 'default:'. No explanation
>>> for that change is given, other than 'unused opcodes go here'.
>>> Is it necessary to separately maintain them? Why?
>>>
>>This was an optimization Chris used to try and get back some of the
>>performance lost during the stackless changes. IIRC, he handles
>>exceptions and return values as "pseudo-opcodes" rather than using the
>>WHY_foo constants the current ceval.c uses. I never really understood
>>this part.
>>
>

> If it is an optimization, I think we should take a step back and first
> try to understand what it tries to optimize, and how it does
> that. Perhaps we find that it isn't needed at all, and that the
> problem could be solved in a different way. If you don't understand
> it, it can't go into Python.


Please, these optimizations should be kept out of Stackless
discussion. As you might have noticed by looking into the
source tree, there is a file named ceval_pre.c which does
not have these optimizations. You can compile it instead of
ceval.c and get the same (slower) functionality.
I always work with this file when I change Stackless. As a part
of the build step, I run the script cevalpatch.py, which scans
the source and applies the optimization patches. So why care.

>>I think integrating stackless into the core is a fairly huge amount of
>>work. I'd like to think I could do it, given several months of
>>full-time effort (which isn't going to happen). About the only likely
>>way I see for it to get in is for it to become important to Zope
>>Corp. for some reason, and them paying Tim or Guido (or Chris) to do
>>it.
>>
>

> I don't think these are the options. I don't volunteer to support this
> code myself, either, but if somebody would step forward and claim that
> she understands it all, and is willing to present it to Guido in a way
> that he understands it also, and if all the hackish parts of it would
> be replaced by understandable code, I think it could go into Python.
> It just needs a determined volunteer to work on it.


Turns out that I will have to try it.

Courageous

unread,
Jan 15, 2002, 5:51:58 PM1/15/02
to

>But there is no problem for C extensions which don't call back
>into Python or if they do this only for short time, and you can
>live with not switching coroutines/microthreads during this call.

In fact, there are whole classes of problems where this isn't
an issue at all. Speaking to the general audience, I'd like to
point out that microthreads are hardly the only use for continuations.
A step-wise cooperative scheduler can make use of these, and since
every python step completes a C extension call, there's no overlap
and therefor no blocking.

While this is certainly obscure, I'll note that it has pragmatic
use in a simulation environment (which is exactly where I use it).

C//

Christian Tismer

unread,
Jan 16, 2002, 10:24:41 AM1/16/02
to
Paul Rubin wrote:


I believe this is a *very* good idea!

Christian Tismer

unread,
Feb 15, 2002, 3:27:03 AM2/15/02
to
Cameron Laird wrote:

> In article <mailman.1009806401...@python.org>,
> Christian Tismer <tis...@tismer.com> wrote:
>
>>Well, it is a little late to answer this, but...
>>
>>Michael Hudson wrote:
>>
>>
>>>Martin von Loewis <loe...@informatik.hu-berlin.de> writes:
>>>
>>...
>>
>>...
>>
>>
>>>>- It adds a field nesting_level to the thread state, without
>>>> ever checking its value (it just counts the nesting level)
>>>>
>>>>
>>>I imagine this was just for bragging with :)
>>>
>>
>>Yes. I needed a way to track when and why there are still
>>recursions. At some phase I also had the idea to use it to drive
>>decisions on when I'm allowed to switch things, but
>>later I learned that this is the wrong way.
>>
> Along with bragging rights, one of the not-obvious-but-
> easy-to-explain reasons to introspect on recursion depth
> is as a defense against denial-of-service (or more un-
> intentional) security hazards. If you're executing
> something dynamic enough to give a user a wedge into the
> execution stack, it's prudent, however crude it might
> seem, to impose resource limits. One engineering discovery
> is that recursion depth can be a useful resource to limit.


The current working copy of Stackless is already running in
a total of 2 KB of C stack on Intel machines.
This limits the area of memory to be monitored for
attacks reasonably.

>>I'm at a redesign for Stackless 2.2. I hope to make it simpler,
>>split apart Stackless and optimization, and continuations are
>>no longer my primary target, but built-in microthreads.
>>
> Cool!


The new solution is cool and uncool at the same time.
All my difficult changes to the core have been abandoned.
All magic about Stackless' implementation is gone (sigh).
It is now orthogonal to the Python engine. The interaction
is down to the bare minimum.
The implementation has become ridiculously small.
Portability has changed substantially.
While Stackless is now much less dependent on
Python, it is now very much platform dependent, since
it involves a couple of assembly instructions.
I think this is an advantage, since Python develops
much more quickly than hardware platforms.

watch out for the soon-to-come release candidate - ciao - chris

Ben Wolfson

unread,
Feb 15, 2002, 11:39:48 AM2/15/02
to
On Fri, 15 Feb 2002 02:27:03 -0600, Christian Tismer wrote:

> The new solution is cool and uncool at the same time.
> All my difficult changes to the core have been abandoned.
> All magic about Stackless' implementation is gone (sigh).
> It is now orthogonal to the Python engine.

Does this mean that the changes that made Stackless faster than CPython
are gone, too?

--
BTR
BEN WOLFSON HAS RUINED ROCK MUSIC FOR A GENERATION
-- Crgre Jvyyneq

Christian Tismer

unread,
Feb 17, 2002, 10:13:16 AM2/17/02
to
Ben Wolfson wrote:

> On Fri, 15 Feb 2002 02:27:03 -0600, Christian Tismer wrote:
>
>
>>The new solution is cool and uncool at the same time.
>>All my difficult changes to the core have been abandoned.
>>All magic about Stackless' implementation is gone (sigh).
>>It is now orthogonal to the Python engine.
>>
>
> Does this mean that the changes that made Stackless faster than CPython
> are gone, too?


Right now, yes. Maybe they will be re-applied later.
Stackless is today running 1-2% slower than CPython.
That's due to a little indirection and overhead
in the function wrapped around eval-frame, plus
some fractal effects that seem to appear whenever
I touch the core at all. Adding code always resulted
in some speed loss, at least on Win32. No idea what
exactly happens; I think there are unfortunate
decisions about code placement by the compiler.

It has to be said that the speed patches of the former
Stackless have already been split apart. Without them,
old Stackless was 5 or more percent slower than CPython.
I don't know if these patches can still be easily applied
to Python 2.2, but there is a good chance that this
will work out.
In that sense, the new Stackless is even faster than
the old one.

I might put some time into this, after Stackless is
ready for production code. These should be considered
as two disjoint projects now.
