kind of toy kernel implementation on pypy

159 views
Skip to first unread message

Mariano Guerra

unread,
Oct 15, 2013, 6:50:04 PM10/15/13
to kl...@googlegroups.com
hi!

for a talk I will give at PyCon Argentina I implemented a small scheme/kernel
toy language using the pypy toolchaing.

it's implemented using continuations and it already runs something like this:

https://github.com/marianoguerra/plang/blob/master/examples/callcc.plang

it's in a really early stage but the basis are there, if someone wants to pick
it up from here and push it into kernel land it should be a matter of adding
the basic applicatives/operatives and try to bootstrap the others from there.

I was looking on how to add the jit hints to the language to have it running
with a jit and I found a prolog implementation in pypy that uses CSP and
has a jit:

https://bitbucket.org/cfbolz/pyrolog/

if that was added it would be a kernel implementation with a JIT and incremental garbage collection:

http://morepypy.blogspot.fr/2013/10/incremental-garbage-collector-in-pypy.html

not bad for a toy ;)

Oto Havle

unread,
Oct 18, 2013, 6:00:12 AM10/18/13
to kl...@googlegroups.com
Hi Mariano,

On 16.10.2013 00:50, Mariano Guerra wrote:
> hi!
>
> for a talk I will give at PyCon Argentina I implemented a small scheme/kernel
> toy language using the pypy toolchaing.
>
> it's implemented using continuations and it already runs something like this:
>
> https://github.com/marianoguerra/plang/blob/master/examples/callcc.plang
>
> [...]

I forked plang and added few things, just enough to run the TAK
microbenchmark (ported to the Kernel Language syntax)

https://github.com/havleoto/plang/blob/master/examples/tak.k

The TAK is originally one of the classical Lisp benchmarks of
P. Gabriel. It tests the speed of recursive function calls.

The execution times:

(1) plang (interpreted by python 2.7.3) ... 68.74 seconds
(2) plang (interpreted by pypy 2.1) ........ 5.51 seconds
(3) plang (compiled by RPython and pypy) ... 0.82 seconds

compare with Kernel Language interpreters:

(4) SINK (compiled by chicken 2.732) ..... 111.76 seconds
(5) klisp 0.3 (default branch) ............. 3.34 seconds
(6) Bronze Age Lisp (head).................. 0.17 seconds

Usual microbenchmark caveats apply. Also, plang is not really a kernel language
interpreter (no $vau, no general parameter trees, no guarded continuations, no
cyclic lists, etc.). Also, I'm not very familiar with python and never used pypy
before.

The performance of the compiled plang is quite impressive, considering
that it is implemented in a subset of a dynamic language.

Benchmark details:

All benchmark were run on Intel(R) Pentium(R) 4 CPU 3.06GHz, 0.5GB RAM,
with Debian "Wheezy" linux distribution. All software which is not
mentioned (gcc, libc, ...) comes from Debian.

(1) Run under python 2.7.3 from Debian Wheezy, with rply-0.6.1 parser
generator library from https://pypi.python.org/pypi/rply/0.6.1.

(2) Run under pypy-2.1. I used pypy's precompiled binary for x86 linux
(http://pypy.org/download.html)

(3) Compiled with RPython compiler from pypy-2.1 source distribution.
The compiler is run in the precompiled pypy-2.1.

In this benchmark, plang is NOT compiled with -Ojit. I was not able
to make -Ojit work (Exception: target has no jitpolicy defined).

(4) SINK is a Kernel Interpreter by John Shutt written in Scheme,
http://web.cs.wpi.edu/~jshutt/kernel.html, version 0.1 m 10, 21 September 2009.
I've concatenated all source files into one, replaced () with '() where necessary,
and compiled it with Chicken Scheme 2.732 (http://www.call-cc.org/).

(5) Corresponds to the current head of https://bitbucket.org/AndresNavarro/klisp,
built for "posix" target, USE_LIBFFI=1.

(6) Corresponds to the current head of https://bitbucket.org/havleoto/bronze-age-lisp.

Best wishes,

Oto Havle.

Mariano Guerra

unread,
Oct 18, 2013, 7:18:51 AM10/18/13
to kl...@googlegroups.com
Quoting Oto Havle (2013-10-18 07:00:12)
yep, there's a lot to be done to make it a kernel implementation, but the base
is there to implement it :)

> The performance of the compiled plang is quite impressive, considering
> that it is implemented in a subset of a dynamic language.

yep, and also that I didn't added the jit yet :)

> Benchmark details:
>
> All benchmark were run on Intel(R) Pentium(R) 4 CPU 3.06GHz, 0.5GB RAM,
> with Debian "Wheezy" linux distribution. All software which is not
> mentioned (gcc, libc, ...) comes from Debian.
>
> (1) Run under python 2.7.3 from Debian Wheezy, with rply-0.6.1 parser
> generator library from https://pypi.python.org/pypi/rply/0.6.1.
>
> (2) Run under pypy-2.1. I used pypy's precompiled binary for x86 linux
> (http://pypy.org/download.html)
>
> (3) Compiled with RPython compiler from pypy-2.1 source distribution.
> The compiler is run in the precompiled pypy-2.1.
>
> In this benchmark, plang is NOT compiled with -Ojit. I was not able
> to make -Ojit work (Exception: target has no jitpolicy defined).

yep, that would be the next step, it's my first pypy lang so I don't know how
to do it properly, but I would love to see who it improves the performance.

> (4) SINK is a Kernel Interpreter by John Shutt written in Scheme,
> http://web.cs.wpi.edu/~jshutt/kernel.html, version 0.1 m 10, 21 September 2009.
> I've concatenated all source files into one, replaced () with '() where necessary,
> and compiled it with Chicken Scheme 2.732 (http://www.call-cc.org/).
>
> (5) Corresponds to the current head of https://bitbucket.org/AndresNavarro/klisp,
> built for "posix" target, USE_LIBFFI=1.
>
> (6) Corresponds to the current head of https://bitbucket.org/havleoto/bronze-age-lisp.

thanks for the benchmarks :)

Estevo U. C. Castro

unread,
May 11, 2014, 4:22:58 AM5/11/14
to kl...@googlegroups.com
Thanks for bringing this up and for the benchmarks!

I had heard of pypy/RPython before, but I was scared about having so much fanciness (black magic? :)) going on under my runtime.  Hearing of this and other success stories I decided to give it a try anyway.

So I started a Kernel implementation in RPython.  I started from scratch, and not forking plang or pumice, the two previous Kernels-in-RPython that I knew about, because AFAICT these didn't support tail call optimization (pumice didn't do continuations at all), and that looked like something that might be hard to retrofit later on.  For the basic structure of the interpreter I borrowed heavily from pycket, a RPython implementation of Racket:

https://github.com/samth/pycket
http://samth.github.io/pycket-dyla.pdf

My results so far is in

https://github.com/euccastro/icbink

It is currently in a proof-of-concept stage, so many primitives are missing, but most of the fundamental mechanisms are there: $vau, tail call optimization, guarded continuations, with an error signaling mechanism based on these, first-class environments, encapsulations, promises, keyed dynamic and static variables, integers of arbitrary precision, and a rudimentary (and buggy -- the irony!) debugger.

Most glaringly missing so far are ports, chars, and all optional modules except environment mutation; most notably, mutable pairs and any kind of special handling of cyclic lists.  I figured that without the optional mutable pairs module it's impossible to build a cyclic list anyway(?), so I have little motivation to go the extra mile for these.  Anyway, I haven't given this much thought and I may change my mind on this.

All in all, I think it's an useful data point as for what performance to expect from RPython.

WRT benchmarks, in my machine (a core duo i5... something), tak.k takes the following in these implementations:

Klisp: ~1.7s
Compiled plang: ~0.3s
Compiled icbink (no jit): ~0.47s
Compiled icbink (jit): ~0.25s

So I guess there is significant overhead in the (naive) way I do continuations, but even a jit with no hints whatsoever (takes two lines to enable it; search for 'jit' in primitive.py) makes up for that.  There are more gains to be had in other obvious optimizations, as explained in this series of articles:

http://morepypy.blogspot.com/2011/03/controlling-tracing-of-interpreter-with.html

and other ideas for improvements may be gotten by studying the pycket code.

That said, the stat I like the most so far is that the whole thing takes 1511 (non blank or comment) lines of rather straightforward (R)Python*, although brevity was never much of a goal.  Also, it took ~3 weeks of part-time effort by one programmer with barely any experience implementing languages (I was already fluent in Python, though).

In the long run a compiler/interpreter that is meant for Kernel from the ground up (perhaps Bronze Age or what comes from Andrés's work with abstract interpretation) will probably beat anything that we can get from RPython.  But in the meanwhile, and especially while Kernel doesn't get more popularity and programmer*years, taking advantage of the amazing work of the pypy folks makes perfect economic sense.

My current plan is to go on and fill in the blanks to make this a conformant and robust (but not necessarily complete) Kernel implementation.  Then I'll work on some practicalities like better error messages and debug facilities, and ffi.  Only after that I'll look at adding optimizations (this may involve branching out 'development' and 'release' interpreters, the latter being stripped of debug functionality for performance, although I hope that's not necessary).

If anyone has interest in working on/forking this, I'm happy to reply any questions regarding the code.  I'll try and add some docs, but probably only as I reply actual questions.

One area in which I think kernel implementations can help each other is in building a corpus of conformance tests in pure kernel.  I'll take a look at what's there and perhaps spin this off as an independent project.

Another thing that would be nice to standardize is the tree of error continuations.  I have the following tentative hierarchy so far:

root
    error
        system-error
        user-error
            type-error
                encapsulation-type-error
                operand-mismatch
                arity-mismatch
            symbol-not-found
            unbound-dynamic-key
            unbound-static-key
            add-positive-to-negative-infinity

but it's all ad hoc.  Just looking at it now, maybe symbol-not-found and the unbound-* continuations could use a common ancestor like lookup-error... or maybe, on the contrary, the branch on type-error should be flattened.  Or the names changed, etc.

For the rest of the stuff that is missing in the R-1RK, I like Klisp's policy of using R7RS as a fallback reference.

I hope you find this stuff interesting.

Happy Kerneling!

- Estevo

* plus some 80 lines of kernel library which I just copied from the reference implementations in the R-1RK.

Mariano Guerra

unread,
May 12, 2014, 7:08:58 AM5/12/14
to kl...@googlegroups.com
On Sun, May 11, 2014 at 10:22 AM, Estevo U. C. Castro <eucc...@gmail.com> wrote:
Thanks for bringing this up and for the benchmarks!

I had heard of pypy/RPython before, but I was scared about having so much fanciness (black magic? :)) going on under my runtime.  Hearing of this and other success stories I decided to give it a try anyway.

So I started a Kernel implementation in RPython.  I started from scratch, and not forking plang or pumice, the two previous Kernels-in-RPython that I knew about, because AFAICT these didn't support tail call optimization (pumice didn't do continuations at all), and that looked like something that might be hard to retrofit later on.  For the basic structure of the interpreter I borrowed heavily from pycket, a RPython implementation of Racket:

https://github.com/samth/pycket
http://samth.github.io/pycket-dyla.pdf

My results so far is in

https://github.com/euccastro/icbink


awesome work, congratulations and thanks for sharing!

I hope this can continue and we can have a fast and complete implementation of kernel.

regarding the common set of tests for all implementations, I couldn't agree more.

Oto Havle

unread,
May 18, 2014, 4:21:14 PM5/18/14
to kl...@googlegroups.com
On 05/12/2014 01:08 PM, Mariano Guerra wrote:
> On Sun, May 11, 2014 at 10:22 AM, Estevo U. C. Castro
> <eucc...@gmail.com <mailto:eucc...@gmail.com>> wrote:
>
> Thanks for bringing this up and for the benchmarks!
>
> I had heard of pypy/RPython before, but I was scared about having so
> much fanciness (black magic? :)) going on under my runtime. Hearing
> of this and other success stories I decided to give it a try anyway.
>
> So I started a Kernel implementation in RPython. I started from
> scratch, and not forking plang or pumice, the two previous
> Kernels-in-RPython that I knew about, because AFAICT these didn't
> support tail call optimization (pumice didn't do continuations at
> all), and that looked like something that might be hard to retrofit
> later on. For the basic structure of the interpreter I borrowed
> heavily from pycket, a RPython implementation of Racket:
>
> https://github.com/samth/pycket
> http://samth.github.io/pycket-dyla.pdf
>
> My results so far is in
>
> https://github.com/euccastro/icbink
>
>
> awesome work, congratulations and thanks for sharing!
>

Yes, icbink looks very good.

> I hope this can continue and we can have a fast and complete
> implementation of kernel.
>
> regarding the common set of tests for all implementations, I couldn't
> agree more.
>

I've created a sketch of a test suite:

https://github.com/havleoto/kplts

The first challenge is building some compatibility layer so the code
can run on several different interpreters. I have scripts to run the
test suite in Klisp, Klink, Bronze Age Lisp and Icbink.

The overall management of the test run relies on encapsulation
and error handling features of Kernel. The test suite runs inside
single instance of the interpreter. I've considered having some
external control of the interpreter, but rejected that for sake
of simplicity.

There are very few tests yet. Now, most fails and warnings are caused
by bugs in the test suite.

Regarding Icbink, I have added (load ...) just to see if I can make
it work (in my github clone of icbink repo), but there are other
problems. I'll look into that later.

Best wishes,

Oto Havle.

Estevo

unread,
May 20, 2014, 10:20:06 AM5/20/14
to kl...@googlegroups.com
On Sun, May 18, 2014 at 10:21 PM, Oto Havle <havl...@gmail.com> wrote:
On 05/12/2014 01:08 PM, Mariano Guerra wrote:
On Sun, May 11, 2014 at 10:22 AM, Estevo U. C. Castro
<eucc...@gmail.com <mailto:eucc...@gmail.com>> wrote:

    [...]


    My results so far is in

    https://github.com/euccastro/icbink


awesome work, congratulations and thanks for sharing!


Yes, icbink looks very good.

Thanks for your kind words!
 
 
regarding the common set of tests for all implementations, I couldn't
agree more.


I've created a sketch of a test suite:

  https://github.com/havleoto/kplts

Nice, thanks!  When I said that I expected start that myself.  Glad to have been beaten to it!  Once I have icbink passing those tests, I'll contribute any further tests of R-1RK compliance I come up with as pull requests to your repo.
 
The first challenge is building some compatibility layer so the code
can run on several different interpreters. I have scripts to run the
test suite in Klisp, Klink, Bronze Age Lisp and Icbink.

Thanks, I'm sorry you had to do that for icbink.  I'll make it my first priority that icbink works with no compatibility layer.  I've been busy at work and home (having our first baby in a month or so) but I hope I can carve some time this week for that.

[...]

Regarding Icbink, I have added (load ...) just to see if I can make
it work (in my github clone of icbink repo), but there are other
problems. I'll look into that later.

Nice!  icbink is not expected to be conformant at this point.  It was in a proof-of-concept state.  I'll work through your tests until icbink passes them (hopefully without the need of a compatibility layer) and then I'll move over all the tests in icbink/test.k that only use standard Kernel facilities.

Thanks again for kicking this off!
Reply all
Reply to author
Forward
0 new messages