language implementation


Davide Del Vento

Apr 7, 2011, 11:37:30 PM
to slipstream-proteus-developers
Folks,
I stumbled on this one, which I think is (or at least could be)
related to what we are doing: http://lwn.net/Articles/436970/
Besides the theory (see the third projection in
http://en.wikipedia.org/wiki/Partial_evaluation#Futamura_projections
or an explanation of it such as
http://blog.sigfpe.com/2009/05/three-projections-of-doctor-futamura.html
), I am really fascinated by the original blog post:
http://morepypy.blogspot.com/2011/04/tutorial-writing-interpreter-with-pypy.html

I know, we already briefly discussed a similar issue privately, but I
think this is more and better than what I told you, so let's discuss
it again in public. Note: I don't want to push you into doing something
you don't like to do, I just want to be sure that we do what is best to
have this project succeed, because I really like it.

At present we have a functional (but incomplete?) C++ implementation.
There are some end-to-end tests, but no unit tests yet. I believe
some refactoring of the code is needed, sooner or later, for a few
reasons. Do you agree? But is it worth it?

So my original question was: why don't we (or even just I)
re-implement the thing in Python, instead of or in addition to the C++
implementation? Python has several advantages compared to C++,
including faster and easier development cycles, testing and
cross-platformity (is it a word?). Of course it also has its own
disadvantages, most notably speed. And there is the developer time
wasted in doing something of little value (at least when compared to
other things that need to be done).

Now, IMHO PyPy changes everything (or, better, it's the PyPy
translator that changes everything, when you use it on any interpreter
written in RPython, i.e. Restricted Python, as described in the
morepypy blog post). In fact, on x86 or x86_64 (and soon on ARM) CPUs,
it's much faster than you would believe (see http://speed.pypy.org/
but be sure not to fall off your chair while you read) because it
automagically creates a JIT compiler for *any* language whose
interpreter is written in RPython. Can you believe that Python
programs run 3 times faster on an interpreter written in RPython (the
PyPy interpreter benchmarked on the speed.pypy.org website) than on an
interpreter written in C (CPython, a different implementation of the
same Python language)?? And all without the usual quirks,
platform-specific tricks, and obfuscations often needed for
optimization? I expect that the same at-least-three-times speedup
could happen for the proteus interpreter.

1) so my long-term proposal is:
a) keep the current C++ implementation, maybe make it C-ish or even
pure ANSI C, and use it on platforms where the RPython implementation
does not work (forget about making the C++ code clean and about the
full-fledged C++ unit test suite)
b) write a clean and easy RPython implementation, which could either
- run in Python (slow, but on any platform that has Python, e.g. IBM's
AIX running on Power), or
- be compiled with the PyPy translator into a JIT-ed native-code
interpreter, almost certainly faster than a), but only on x86 or
x86_64, regardless of the operating system (I expect the same thing
would apply to ARM, but we can't say for sure now)
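
A minimal sketch of the layout that 1b) implies, assuming the
conventions described in the morepypy tutorial: an `entry_point(argv)`
function plus a `target` hook for the translator. The toy postfix
calculator is a hypothetical stand-in, not Proteus, and whether this
exact file translates as-is is untested; under plain Python it runs
unmodified.

```python
def run(source):
    """Evaluate a toy postfix expression such as "2 3 + 4 *".

    Stand-in language for illustration only; it is not Proteus.
    """
    stack = []
    for token in source.split():
        if token == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif token == "*":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            # Anything that is not an operator is treated as an integer literal.
            stack.append(int(token))
    return stack.pop()


def entry_point(argv):
    # argv[1] holds the program text; the return value is an exit code,
    # in the style of a C main().
    if len(argv) < 2:
        print("usage: %s '<expr>'" % argv[0])
        return 1
    print(run(argv[1]))
    return 0


def target(*args):
    # Hook that the translation toolchain looks for (per the tutorial).
    # It is simply unused when the file runs under plain Python.
    return entry_point, None
```

Under plain Python, `entry_point(["calc", "2 3 + 4 *"])` prints 20 and
returns 0; the same `entry_point` is what `target` hands to the
translator.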

2) my short-term proposal is:
a) keep working on the current C++ implementation, but only (or almost
only) fix the bugs that are show stoppers for the demo/prototype or
for important models that we need to go forward on that side.
b) start writing the RPython implementation, not as a "hard effort"
but as "code documentation" of what I've understood so far of the
functionality

So 2b) is not wasted time because:
alpha) in the future, as described in 1b), this implementation will be
actually useful, not only a proof of concept
beta) while studying the language, I need to put in code what I have
understood, for my own reference, and doing so in Python as opposed to
C++ will undoubtedly be faster

What do you think?

Note: this is somewhat biased by my idea that proteus itself will still
need a fair amount of development. If this is not the case (as it
seems from the latest wiki page about dev plans), then I kind-of
retract what I'm saying here, i.e. we can leave it as is until we
need to do major work on it, and just do 2a) plus the creation of
models and similar stuff. But since I wrote this before reading that
wiki page, I'm sending it anyway, instead of saving the draft for
another day, just to have you mull it over.

Davide

Bruce Long

Apr 8, 2011, 2:47:56 AM
to slipstream-proteus-developers
Davide,
I really like the idea of a Pythonic implementation. But I also think that the current version is more robust and closer to done than you think. There are a handful of "situation bugs." By that I mean that major functionality works, but when you toss a { } (i.e., an empty infon) in at the wrong place, or use an empty list as a function argument, or hit some other single-case situation, it doesn't work. But these aren't tied to each other, I know most of them, and each takes from 4 hours to 2 weeks to get out. If three or four of us work on them we can knock them all out in a month.

There is one more 'big' bug. It's the last of the 'big' ones and it isn't the biggest. This is what I call the 'Range' bug. Bugs like this take me from 1 to 4 months to solve by myself. But I think that with a lively conversation about the problem it could fall much faster.

Some formal testing could show that there are no 'situation bugs' left. At that point we have a pretty robust implementation. If a few other people have been working on optimizing things and some others have been working on models, we could have a full, polished consumer product by the end of Summer.

At that point I think we will have begun to see weaknesses in the language. And if people are thinking about the mathematics of infons and how to improve syntax and speed, there will likely be a lot of momentum for creating a new implementation. But as I said before, I think a separate implementation now would cause divergence: your implementation makes it easy to go one way that is hard to do on mine, or vice versa. Users wouldn't know which version to model for, so the project would flounder.

You mentioned that perhaps you see that there isn't as much left to do as you thought. So I think we are on the same page. After the interactive demo let's step back and have a conversation about what to do next.

Bruce
--
Give me immortality or give me death!

Davide Del Vento

Apr 8, 2011, 12:20:26 PM
to slipstream-pro...@googlegroups.com
Bruce,
this is fine, I'm glad we had this discussion, now things are less
murky. So let's move forward with the current implementation and use
it as is (with the necessary bugfixes, but without any major
overhaul). Let's do something practical. We will surely need more
sometime in the future, and at that point we will discuss what to do.
From what I can anticipate now, I believe that we'll do both 1a) and
1b), but there is no reason to commit to a decision on this now.
I'll try to look at slyp this weekend (if the kids are nice and
have enough naps). Let me know if there is anything in particular you
want me to look at.

Bye,
Davide

Davide Del Vento

Apr 16, 2011, 12:06:01 AM
to slipstream-proteus-developers
Just to keep you informed on this issue, not to resume the discussion.
I tried PyPy with a simple Python implementation of brainf***, against
a simple implementation of the same language in C++, and against a
simple but more aggressive implementation of the same language in C.
PyPy wins hands down against C++ and easily against C. See
http://blog.javacorner.net/2011/04/pypy-wonders.html
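
For reference, a simple Python brainf*** interpreter of the kind
compared here could look roughly like the sketch below. This is not
the code from the linked post: the names `build_jump_table` and
`run_bf` are illustrative, and the `,` input command is left out for
brevity.

```python
def build_jump_table(program):
    """Map each '[' to its matching ']' and back, so loops jump directly."""
    jumps = {}
    stack = []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jumps[start] = pos
            jumps[pos] = start
    return jumps


def run_bf(program, tape_size=30000):
    """Run a brainf*** program (no ',' support) and return its output."""
    jumps = build_jump_table(program)
    tape = [0] * tape_size
    ptr = 0   # data pointer into the tape
    pc = 0    # program counter into the source text
    out = []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == "[" and tape[ptr] == 0:
            pc = jumps[pc]   # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jumps[pc]   # jump back to the loop start
        pc += 1
    return "".join(out)
```

For example, `run_bf("++++++++[>++++++++<-]>+.")` returns "A". The
same file can be run unchanged under both CPython and PyPy to compare
timings.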
Now, BF is simpler and thus easier to optimize than Proteus, but PyPy
is still faster than CPython on Python programs, and Python is
comparable to or even more complex than Proteus!
So when the time comes to make optimizations, PyPy will be the way to
go, rather than messing with C/C++ hacks.
I know, we are not there yet, so for now it's irrelevant, but still
good to know!
'night
Dav
PS: I hope eventually this will be my weekend with slyp/slip.

Bruce Long

Apr 16, 2011, 8:45:00 AM
to slipstream-pro...@googlegroups.com
PyPy sounds really cool. And encouraging, because I have thought it should be possible for "script" languages to be faster than C. In some special cases the Haskell compiler is faster than C. When the time comes I propose a race: Proteus Engine written in PyPy vs. Proteus Engine written in Proteus!

Bruce