It's possible PyPy will investigate that, and the Python
implementation on Parrot (which was stagnant last I checked) would
obviously be register-based, but as for CPython, I don't think it's
going to happen (at least not any time soon). The bytecode is
stack-oriented through-and-through; it'd be quite a major change to
become register-based.
Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com
_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas
> On Fri, Feb 20, 2009 at 10:07 AM, Venkatraman S <venk...@gmail.com>
> wrote:
>> Hi,
>>
>> Has there been any discussion or effort on moving from a stack based
>> interpreter to a more register based one?
>> (my limited search in the archives did not yield me any results)
>
> It's possible PyPy will investigate that, and the Python
> implementation on Parrot (which was stagnant last I checked) would
> obviously be register-based, but as for CPython, I don't think it's
> going to happen (at least not any time soon). The bytecode is
> stack-oriented through-and-through; it'd be quite a major change to
> become register-based.
Antonio Cuni made some experiments on PyPy about this, If you ask at
the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
others can explain what happened. If I remember correctly there
weren't any significant improvements in performance as dispatch and
memory copies is not the problem on python, the bytecodes are very
complex.
[]'s
--
Leonardo Santagada
santagada at gmail.com
If bytecode dispatch were not a problem, I wonder how enabling computed gotos on
the py3k branch could yield up to a 15% overall speedup on pybench :-)
The biggest complication I can think of with a register-based VM is that you
have to decref objects as soon as they aren't used anymore, which means you have
to track the actual lifetime of registers (while it's done automatically with a
stack-based design). I don't know how much it could slow down an implementation
(perhaps not at all, if a clever implementation is devised...).
Regards
Antoine.
FYI, some relevant reading on converting a stack-based JVM to use
register-based bytecode:
http://www.usenix.org/events/vee05/full_papers/p153-yunhe.pdf
Collin Winter
It was about 10 years ago, possibly before pydev, so google search in
c.l.p might have something.
Also, I have a hard time believing it would make much
difference. Python bytecodes are mostly quite large
units of functionality, so the time taken manipulating
the stack is not likely to be a significant component.
This is different from something like Java where you
have bytecodes that directly manipulate low-level things
such as integers.
--
Greg
> If bytecode dispatch were not a problem, I wonder how enabling computed gotos on
> the py3k branch could yield up to a 15% overall speedup on pybench :-)
Perhaps there is room for a hybrid approach where you
take sequences of instructions such as
LOAD_LOCAL x
LOAD_LOCAL y
BINARY_ADD
STORE_GLOBAL z
and turn them into 3-operand macroinstructions
TRINARY_ADD LOCAL(x), LOCAL(y), GLOBAL(z)
This would actually do all the same pushing and popping
as before (at least conceptually), but it would reduce the
number of bytecodes being executed, and therefore the number
of unpredictable branches, from 4 to 1.
So it would be kind of like a register-based instruction
set, except that there aren't any registers, so there
wouldn't be any problem with managing refcounts.
(BTW, calling them "bytecodes" might become something of
a misnomer -- they're going to be more like "longcodes".:-)
--
Greg
I was thinking of something like that, although in a simpler form of 2-operand
macroinstructions:
BINARY_ADD 5, 8
where 5 and 8 are indices into a "super array" which would be equal to:
[local variables] + [code object constants]
(fast random access to the array would imply copying the constants table each
time a new frame is created for the given code object, but it would hopefully
remain quite cheap)
We can keep the current instruction format and address up to 256 local variables
and constants by splitting the 16-bit operand in two bytes.
The result would be stored on top of stack.
(if we want to access the top of the stack using the macroinstructions, we could
reserve 255 as a special index value for popping the top of the stack...)
Regards
Antoine.
>
> Antonio Cuni made some experiments on PyPy about this, If you ask at
> the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
> others can explain what happened.
The biggest complication I can think of with a register-based VM is that you
have to decref objects as soon as they aren't used anymore, which means you have
to track the actual lifetime of registers (while it's done automatically with a
stack-based design).
I think this would be interesting to attempt. Lua switched to a
stack-based VM to a register-based VM for Lua 5.0.
http://www.tecgraf.puc-rio.br/~lhf/ftp/doc/sblp2005.pdf includes some
benchmark numbers.
Collin Winter