[Python-ideas] Register based interpreter

Venkatraman S

unread,

Feb 20, 2009, 1:07:22 PM2/20/09

to python...@python.org

Hi,

Has there been any discussion or effort on moving from a stack based interpreter to a more register based one?
(my limited search in the archives did not yield me any results)

regards,
-V-

Chris Rebert

unread,

Feb 20, 2009, 1:15:52 PM2/20/09

to Venkatraman S, python...@python.org

It's possible PyPy will investigate that, and the Python
implementation on Parrot (which was stagnant last I checked) would
obviously be register-based, but as for CPython, I don't think it's
going to happen (at least not any time soon). The bytecode is
stack-oriented through-and-through; it'd be quite a major change to
become register-based.

Cheers,
Chris

--
Follow the path of the Iguana...
http://rebertia.com
_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas

Leonardo Santagada

unread,

Feb 20, 2009, 2:41:43 PM2/20/09

to Chris Rebert, python...@python.org

On Feb 20, 2009, at 3:15 PM, Chris Rebert wrote:

> On Fri, Feb 20, 2009 at 10:07 AM, Venkatraman S <venk...@gmail.com>
> wrote:
>> Hi,
>>
>> Has there been any discussion or effort on moving from a stack based
>> interpreter to a more register based one?
>> (my limited search in the archives did not yield me any results)
>
> It's possible PyPy will investigate that, and the Python
> implementation on Parrot (which was stagnant last I checked) would
> obviously be register-based, but as for CPython, I don't think it's
> going to happen (at least not any time soon). The bytecode is
> stack-oriented through-and-through; it'd be quite a major change to
> become register-based.

Antonio Cuni made some experiments on PyPy about this, If you ask at
the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
others can explain what happened. If I remember correctly there
weren't any significant improvements in performance as dispatch and
memory copies is not the problem on python, the bytecodes are very
complex.

[]'s
--
Leonardo Santagada
santagada at gmail.com

Brett Cannon

unread,

Feb 20, 2009, 2:46:56 PM2/20/09

to Venkatraman S, python...@python.org

Very limited work done way back in the day by I believe Skip Montanaro, but it didn't get far enough for a complete implementation and it didn't pan out anyway.

-Brett

Antoine Pitrou

unread,

Feb 20, 2009, 2:48:53 PM2/20/09

to python...@python.org

Leonardo Santagada <santagada@...> writes:
>
> Antonio Cuni made some experiments on PyPy about this, If you ask at
> the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
> others can explain what happened. If I remember correctly there
> weren't any significant improvements in performance as dispatch and
> memory copies is not the problem on python, the bytecodes are very
> complex.

If bytecode dispatch were not a problem, I wonder how enabling computed gotos on
the py3k branch could yield up to a 15% overall speedup on pybench :-)

The biggest complication I can think of with a register-based VM is that you
have to decref objects as soon as they aren't used anymore, which means you have
to track the actual lifetime of registers (while it's done automatically with a
stack-based design). I don't know how much it could slow down an implementation
(perhaps not at all, if a clever implementation is devised...).

Regards

Antoine.

Collin Winter

unread,

Feb 20, 2009, 3:46:23 PM2/20/09

to Antoine Pitrou, python...@python.org

On Fri, Feb 20, 2009 at 11:48 AM, Antoine Pitrou <soli...@pitrou.net> wrote:
> Leonardo Santagada <santagada@...> writes:
>>
>> Antonio Cuni made some experiments on PyPy about this, If you ask at
>> the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
>> others can explain what happened. If I remember correctly there
>> weren't any significant improvements in performance as dispatch and
>> memory copies is not the problem on python, the bytecodes are very
>> complex.
>
> If bytecode dispatch were not a problem, I wonder how enabling computed gotos on
> the py3k branch could yield up to a 15% overall speedup on pybench :-)
>
> The biggest complication I can think of with a register-based VM is that you
> have to decref objects as soon as they aren't used anymore, which means you have
> to track the actual lifetime of registers (while it's done automatically with a
> stack-based design). I don't know how much it could slow down an implementation
> (perhaps not at all, if a clever implementation is devised...).

FYI, some relevant reading on converting a stack-based JVM to use
register-based bytecode:
http://www.usenix.org/events/vee05/full_papers/p153-yunhe.pdf

Collin Winter

Terry Reedy

unread,

Feb 20, 2009, 4:00:28 PM2/20/09

to python...@python.org

It was about 10 years ago, possibly before pydev, so google search in
c.l.p might have something.

Greg Ewing

unread,

Feb 20, 2009, 4:47:08 PM2/20/09

to python...@python.org

Chris Rebert wrote:
> The bytecode is
> stack-oriented through-and-through; it'd be quite a major change to
> become register-based.

Also, I have a hard time believing it would make much
difference. Python bytecodes are mostly quite large
units of functionality, so the time taken manipulating
the stack is not likely to be a significant component.

This is different from something like Java where you
have bytecodes that directly manipulate low-level things
such as integers.

--
Greg

Greg Ewing

unread,

Feb 20, 2009, 4:59:50 PM2/20/09

to python...@python.org

Antoine Pitrou wrote:

> If bytecode dispatch were not a problem, I wonder how enabling computed gotos on
> the py3k branch could yield up to a 15% overall speedup on pybench :-)

Perhaps there is room for a hybrid approach where you
take sequences of instructions such as

LOAD_LOCAL x
LOAD_LOCAL y
BINARY_ADD
STORE_GLOBAL z

and turn them into 3-operand macroinstructions

TRINARY_ADD LOCAL(x), LOCAL(y), GLOBAL(z)

This would actually do all the same pushing and popping
as before (at least conceptually), but it would reduce the
number of bytecodes being executed, and therefore the number
of unpredictable branches, from 4 to 1.

So it would be kind of like a register-based instruction
set, except that there aren't any registers, so there
wouldn't be any problem with managing refcounts.

(BTW, calling them "bytecodes" might become something of
a misnomer -- they're going to be more like "longcodes".:-)

--
Greg

Antoine Pitrou

unread,

Feb 20, 2009, 6:01:37 PM2/20/09

to python...@python.org

Greg Ewing <greg.ewing@...> writes:
>
> Perhaps there is room for a hybrid approach where you
> take sequences of instructions such as
>
> LOAD_LOCAL x
> LOAD_LOCAL y
> BINARY_ADD
> STORE_GLOBAL z
>
> and turn them into 3-operand macroinstructions
>
> TRINARY_ADD LOCAL(x), LOCAL(y), GLOBAL(z)

I was thinking of something like that, although in a simpler form of 2-operand
macroinstructions:

BINARY_ADD 5, 8

where 5 and 8 are indices into a "super array" which would be equal to:
[local variables] + [code object constants]
(fast random access to the array would imply copying the constants table each
time a new frame is created for the given code object, but it would hopefully
remain quite cheap)

We can keep the current instruction format and address up to 256 local variables
and constants by splitting the 16-bit operand in two bytes.

The result would be stored on top of stack.
(if we want to access the top of the stack using the macroinstructions, we could
reserve 255 as a special index value for popping the top of the stack...)

Regards

Antoine.

Venkatraman S

unread,

Feb 23, 2009, 5:44:50 AM2/23/09

to python...@python.org

On Sat, Feb 21, 2009 at 1:18 AM, Antoine Pitrou <soli...@pitrou.net> wrote:

>
> Antonio Cuni made some experiments on PyPy about this, If you ask at
> the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
> others can explain what happened.

Looks like there was no attempt on that front.

The biggest complication I can think of with a register-based VM is that you
have to decref objects as soon as they aren't used anymore, which means you have
to track the actual lifetime of registers (while it's done automatically with a
stack-based design).

I hope to take this as 'fun' project and try it out. Not sure how far i will reach.

-V-
http://twitter.com/venkat83

Collin Winter

unread,

Feb 23, 2009, 7:54:56 PM2/23/09

to Venkatraman S, python...@python.org

I think this would be interesting to attempt. Lua switched to a
stack-based VM to a register-based VM for Lua 5.0.
http://www.tecgraf.puc-rio.br/~lhf/ftp/doc/sblp2005.pdf includes some
benchmark numbers.

Collin Winter

Reply all

Reply to author

Forward