Haskell on the JVM


Nick Sieger

Jun 25, 2009, 3:07:26 PM
to jvm-la...@googlegroups.com
Cross-linking this post on haskell-cafe, since there's likely to be a
JVM guru lurking here who might be interested in helping to bootstrap
Haskell on the JVM:

http://www.haskell.org/pipermail/haskell-cafe/2009-June/063454.html

Cheers,
/Nick

Attila Szegedi

Jun 25, 2009, 4:53:30 PM
to jvm-la...@googlegroups.com
Good luck with that. All previous attempts have run into very ugly
problems implementing lazy evaluation (and no, tail calls won't solve that).

Don't get me wrong, it'd be awesome to have a good Haskell on the JVM,
but I'm fairly sceptical that it can be done. I'd be delighted to be
proven wrong, though.

Attila.

Jim White

Jun 25, 2009, 11:55:59 PM
to jvm-la...@googlegroups.com
You know about Jaskell? Perhaps those folks would be helpful.

http://jaskell.codehaus.org/

Support for Jaskell is built into IFCX Wings (along with OCaml, Python,
Scheme, Ruby, Groovy, ...)!

<http://ifcx.org/attach/Wings/WingsExample.html#0.4.4.Jaskell|outline>

http://ifcx.org/wiki/Wings.html

Jim

Tom Davies

Jun 29, 2009, 7:40:01 PM
to JVM Languages


On Jun 26, 6:53 am, Attila Szegedi <szege...@gmail.com> wrote:
> Good luck with that. All previous attempts run into very ugly problems  
> in implementing lazy evaluation (and no, tail calls won't solve that).

What are the ugly problems here? I've used CAL (sort of Haskell 98 for
the JVM minus some syntactic sugar) happily, although certainly you
can get big performance improvements by indicating strictness.

Tom

Neil Bartlett

Jun 29, 2009, 7:48:51 PM
to jvm-la...@googlegroups.com
Take a look at the "things that suck" in the following description of
LambdaVM, another effort to compile Haskell to the JVM by Brian
Alliet:

http://wiki.brianweb.net/LambdaVM/Implementation

John Cowan

Jun 29, 2009, 8:00:48 PM
to jvm-la...@googlegroups.com
On Mon, Jun 29, 2009 at 7:48 PM, Neil Bartlett<njbar...@gmail.com> wrote:
>
> Take a look at the "things that suck" in the following description of
> LambdaVM, another effort to compile Haskell to the JVM by Brian
> Alliet:
>
>   http://wiki.brianweb.net/LambdaVM/Implementation

No chance to comment there, so I'll point out here that instead of
checking whether the indirect pointer in a thunk is null, he should
just initialize it to 'this'.
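For concreteness, here is a minimal sketch of that idea in Java (all class
and field names here are invented for illustration, not taken from LambdaVM):

```java
import java.util.function.Supplier;

class Node {
    Node ind = this;  // self-pointer plays the role of "no indirection yet"
}

class Value extends Node {
    final int payload;
    Value(int payload) { this.payload = payload; }
}

class Thunk extends Node {
    private Supplier<Node> code;
    Thunk(Supplier<Node> code) { this.code = code; }

    Node force() {
        if (ind == this) {            // not yet evaluated
            ind = code.get();
            code = null;              // let the GC reclaim the closure
        }
        // Chase indirections to a fixed point: because every Node's
        // ind starts out as `this`, no null check is ever needed.
        Node n = ind;
        while (n.ind != n) n = n.ind;
        return n;
    }
}

class ThunkDemo {
    public static void main(String[] args) {
        Thunk t = new Thunk(() -> new Value(42));
        System.out.println(((Value) t.force()).payload); // prints 42
    }
}
```

With every node's indirection field starting as a self-pointer, code that
chases indirections simply loops until it reaches a fixed point; the
separate null branch disappears.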




--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures

Attila Szegedi

Jun 30, 2009, 5:30:20 AM
to jvm-la...@googlegroups.com
In my experience, there's just a huge mismatch between the execution
models. (As for "experience": in addition to studying the JVM quite a
lot, I also had a one-semester course at the university as part of my
grad studies on implementing a VM for a subset of Haskell. 'twas a
very interesting course, even given that we had to emulate the machine
on paper for our exam... Funny thing was, the same VM with very slight
modifications could then also be used to run Prolog programs. I also
examined the Glasgow Haskell Compiler innards for a while. Anyway, I
don't claim any of this makes me an expert in Haskell implementations,
but I am aware of the hurdles.)

Basically, Haskell's strict laziness is impossible to marry seamlessly
to the JVM's stack-based call-and-return execution model (at least I'm
not smart enough to see how). The necessary tricks with
continuation-like constructs ("interface Code { Code exec() }") all
make the implementations very involved and hard to maintain, and the
proliferation of heap-allocated thunks doesn't help performance either.
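To illustrate what I mean by the continuation-like construct, here is a toy
sketch of that interface driven by a trampoline loop (the names besides
Code/exec are invented):

```java
// Each step performs a bit of work and returns the next step (or null).
interface Code {
    Code exec();
}

class Trampoline {
    // The driver: the JVM stack never grows, because every "call" is
    // expressed as returning the next step instead of invoking it.
    static void run(Code start) {
        for (Code c = start; c != null; c = c.exec()) { }
    }

    public static void main(String[] args) {
        // Count down from a million; a naively recursive version would
        // blow the JVM stack, this one runs in constant stack space.
        final int[] n = { 1_000_000 };
        Code step = new Code() {
            public Code exec() {
                return (--n[0] > 0) ? this : null;
            }
        };
        run(step);
        System.out.println(n[0]); // prints 0
    }
}
```

Every "call" becomes a heap-allocated Code object handed back to the driver
loop, which is exactly where the maintenance and performance pain comes from.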

Writing an interpreter for Haskell would be no big deal, but it would
have the performance you'd expect. Writing an efficient compiler,
either to bytecode or to native machine code, is an entirely different
matter. If you take a look at how the Glasgow Haskell Compiler
implements all of this, you'll see that it emits C code that does
*not*, on its own, behave as required for strict lazy evaluation, runs
it through a C compiler, and then, after the C compilation step,
transforms the compiled object code, mostly putting jumps in place of
function returns, and finally links the doctored object code into an
executable. Now, you can do that with unmanaged machine code. Clearly,
you can't do that in the managed runtime environment of the JVM. (Nor
in the CLR, for that matter. Given that both Erik Meijer and Simon
Peyton-Jones are Microsoft employees, if it were possible, I have no
doubt they would have made it happen by now.) And the MLVM innovations
don't point in these directions either - well, continuations, or at
least generators, could help, I think.

Again, I don't claim these problems are insurmountable; I'm just
saying I don't see how to solve them in a manner that is both
efficient and elegant. I'd be delighted to see someone smarter than me
overcome them and produce an *efficient* Haskell on the JVM,
implemented *elegantly*. That'd really rock.

Attila.

Patrick Wright

Jun 30, 2009, 7:28:56 AM
to jvm-la...@googlegroups.com
> Again, I don't claim these problems are insurmountable, I'm just
> saying I don't see how to solve them in a manner that is both
> efficient and elegant. I'd be delighted to see someone smarter than me
> overcome these problems for having an *efficient* Haskell on the JVM
> implemented *elegantly*. That'd really rock.

How does CAL handle these problems?
http://en.wikipedia.org/wiki/Quark_Framework#The_CAL_Language

See also PDFs linked at bottom of WP entry, in particular "CAL for
Haskell Programmers".

Regards
Patrick

Tom Davies

Jun 30, 2009, 9:10:12 AM
to JVM Languages


On Jun 30, 9:28 pm, Patrick Wright <pdoubl...@gmail.com> wrote:
> See also PDFs linked at bottom of WP entry, in particular "CAL for
> Haskell Programmers".

This document (not linked from the Wikipedia page) is probably worth
looking at: http://resources.businessobjects.com/labs/cal/cal_runtime_internals.pdf

John Cowan

Jun 30, 2009, 4:08:56 PM
to jvm-la...@googlegroups.com
On Tue, Jun 30, 2009 at 5:30 AM, Attila Szegedi<szeg...@gmail.com> wrote:

> I'd be delighted to see someone smarter than me
> overcome these problems for having an *efficient* Haskell on the JVM
> implemented *elegantly*. That'd really rock.

That might be too much to ask, given that GHC is an efficient but
highly inelegant (per your description) implementation of Haskell on
the C VM. :-)

John Cowan

Jun 30, 2009, 4:53:02 PM
to jvm-la...@googlegroups.com
On Tue, Jun 30, 2009 at 7:28 AM, Patrick Wright<pdou...@gmail.com> wrote:

> See also PDFs linked at bottom of WP entry, in particular "CAL for
> Haskell Programmers".

What this boils down to, AFAICT, is that 100% CAL code is pure, but
CAL that calls Java stuff is not.


Tom Davies

Jul 2, 2009, 10:00:01 AM
to JVM Languages


On Jul 1, 6:53 am, John Cowan <johnwco...@gmail.com> wrote:
> On Tue, Jun 30, 2009 at 7:28 AM, Patrick Wright<pdoubl...@gmail.com> wrote:
> > See also PDFs linked at bottom of WP entry, in particular "CAL for
> > Haskell Programmers".
>
> What this boils down to, AFAICT, is that 100% CAL code is pure, but
> CAL that calls Java stuff is not.

Java functions can be pure, but of course there are no guarantees as
there are with CAL code.

I often wrap a monad around Java libraries which rely on side effects
to ensure that calls are correctly sequenced.
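Sketched in Java rather than CAL for concreteness (in CAL the wrapper is a
real monad; every name below is invented for illustration), the idea is that
side-effecting calls become deferred actions whose order is fixed by how
they are composed, not by evaluation order:

```java
import java.util.function.Function;

// A deferred action: nothing happens until run() is called.
interface IO<A> {
    A run();

    // Sequencing: run this action, feed its result to f, run f's action.
    default <B> IO<B> then(Function<A, IO<B>> f) {
        return () -> f.apply(this.run()).run();
    }

    static <A> IO<A> pure(A a) { return () -> a; }
}

class IODemo {
    public static void main(String[] args) {
        // StringBuilder stands in for a side-effecting Java library.
        StringBuilder log = new StringBuilder();
        IO<Void> program =
            IO.pure("a")
              .then(s -> { log.append(s); return IO.pure("b"); })
              .then(s -> { log.append(s); return IO.pure((Void) null); });
        // Building `program` performed no effects; running it does,
        // in exactly the composed order.
        program.run();
        System.out.println(log); // prints ab
    }
}
```

The point of the wrapper is that the composition itself documents and
enforces the call order the Java library silently depends on.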

Tom

Jon Harrop

Jul 3, 2009, 5:49:50 PM
to jvm-la...@googlegroups.com
On Tuesday 30 June 2009 10:30:20 Attila Szegedi wrote:
> Given how both Erik Meijer and Simon Peyton-Jones are Microsoft employees,
> if it were possible, I have no doubts they would have since made it happen.

Microsoft Research put a huge amount of effort into trying to reconcile
Haskell and .NET in their Haskell.NET project. Note that they were not just
trying to implement Haskell on .NET but were actually changing the VM to
suit Haskell, and they were still unable to create something usable. However,
their work culminated in the early adoption of some features that have become
critically important today, most notably tail call elimination.

Don Syme described in an interview with Sadek Drobi some of the lessons he
learned from their work on Haskell.NET:

http://www.infoq.com/interviews/F-Sharp-Don-Syme

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

John Rose

Jul 6, 2009, 9:38:29 PM
to jvm-la...@googlegroups.com
On Jun 30, 2009, at 2:30 AM, Attila Szegedi wrote:

> And the MLVM innovations don't
> point in these directions either - well, continuations or at least
> generators could help, I think.

MLVM has Lukas Stadler's nice prototype of scoped continuations, which
provides one set of useful tricks, generally of the form "expect to
run in strict mode for a while, and then take a hit as you swap part
of the control stack to the heap".

There's also Arnold Schwaighofer's tailcall prototype on MLVM, which
if crossed with invokedynamic would produce the equivalent of
patchable jump sites.

(But, did I mention they are prototypes? Sadly, JDK 7 will not
include either, since it is focusing on invokedynamic and method
handles.)

-- John

John Rose

Jul 7, 2009, 1:06:37 AM
to jvm-la...@googlegroups.com
Thanks for the very interesting references.  I would say, thanks for all the pain points, but that would sound ghoulish.

I have some rambling observations, presented as-is with no warranty, express or implied...

Regarding LambdaVM:

I think running the "mini interpreter" in the JVM makes good sense.  With scoped continuations, you could run a mix of stack-ful and thunk-ful calls.  Basically, run normal stack-ful calls where it is likely to be non-blocking, and when this fails (perhaps due to an arbitrary control stack depth limit), snapshot the continuation, and throw the next "Code" thunk down to the mini interpreter.  Note that the standard case of "return nextCode()" from "Code.exec" is equivalent to snapshotting a null continuation and throwing "nextCode" to the mini interpreter; the "return" can be viewed as a strength-reduced throw through zero intermediate stack frames.
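As a toy sketch of that mix (all names invented; a fixed depth budget stands
in for the "arbitrary control stack depth limit", and a simple countdown
stands in for real reduction work):

```java
interface Code { Code exec(); }

class MixedEval {
    static final int DEPTH_LIMIT = 1_000;   // the stack budget

    // Count n down to zero: ordinary stack-ful JVM calls while within
    // the budget, then hand the rest of the work back as a thunk for
    // the mini interpreter to drive.
    static Code countdown(int n, int depth) {
        if (n == 0) return null;                 // finished
        if (depth < DEPTH_LIMIT)
            return countdown(n - 1, depth + 1);  // stack-ful: plain call
        return () -> countdown(n - 1, 0);        // thunk-ful: defer
    }

    public static void main(String[] args) {
        // The mini interpreter: drain thunks until the work is done.
        int handoffs = 0;
        for (Code c = countdown(5_000_000, 0); c != null; c = c.exec())
            handoffs++;
        System.out.println("done after " + handoffs + " hand-offs");
    }
}
```

The common case runs at full stack-ful speed; only one call in every
thousand or so pays the cost of packaging the continuation as a thunk.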

The LambdaVM paper claims that global variables cannot be registerized by any "JIT on the planet"; HotSpot can do this partially.  Assuming that the Code.exec calls can be inlined, the machine code will probably look like "compute argument value in reg T23, store to global G1, use value in T23, do not load G1".  The store buffer gets full, but there are no dependent loads.

The desire for the GC to tension links through LazyThunk.ind fields is interesting.  I have seen several use cases for lazy data structure normalization (to be carried out by the GC).  Another one is string compaction, something like the Icon GC did.  Currently, JVMs suffer from strings whose underlying char arrays contain unused positions.  Not sure how to design this.

The thunk state change is an interesting case to optimize.  I think it would be optimal, in a JVM-like system, to be able to change the thunk's type and "ind" field at the same time, in a transaction of some sort, especially if the hardware supports two-reference atomic updates.  This sort of thing may be a use case for what I call "split classes" where the object's underlying class field can be overloaded with different (but compatible) values.

Regarding CAL:

CAL manages thunk tensioning by having the mutator do it routinely at use points (accessor code in the using object, see 7.2).  This requires ubiquitous copies of the tensioning code; it's a sort of "field reference" aspect that has to be cut in everywhere.
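Roughly, in Java (all names here are invented; the real accessors are in
section 7.2 of the CAL document):

```java
class GNode {
    GNode ind;               // non-null means "I am only an indirection"
    int value;

    GNode(int v) { value = v; }

    // The "tensioning" step, duplicated at every use point: walk any
    // chain of indirections down to the real node before using it.
    static GNode tension(GNode n) {
        while (n.ind != null) n = n.ind;
        return n;
    }

    static int read(GNode n) {
        return tension(n).value;     // collapse, then use
    }

    public static void main(String[] args) {
        GNode real = new GNode(7);
        GNode i1 = new GNode(0); i1.ind = real;
        GNode i2 = new GNode(0); i2.ind = i1;   // chain: i2 -> i1 -> real
        System.out.println(read(i2)); // prints 7
    }
}
```

Since the mutator rather than the GC does the collapsing, that while loop
has to be cut in at every field access, which is the ubiquity John notes.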

It's interesting to me that CAL works hard to customize methods to take primitive types when possible (section 6).  For better or worse, JVMs support unboxed primitives, and there will probably always be some performance gain to statically removing box/unbox operations in generated bytecode.  It's for this reason that the JSR 292 APIs integrate primitive types at all points.

Regarding Haskell.NET:

They gave it up because the platform is strict and doesn't give much help emulating non-strict nodes.  The CAL document shows what it looks like when you push through anyway (on the JVM, another strict platform).  You can make it work, but you have to invent a large set of shadow types, like the Java wrapper types.  I wonder what it would look like to add the right "hooks" to the Java wrappers.  Probably unworkable, since they are all final (monomorphic implementation).  But perhaps interface injection could be used to introduce the extra "evaluate myself as myself" and a "get my reduced type" method to the standard wrappers.  Maybe there is some value in making a parallel set of wrapper interfaces, retroactively implemented by the standard wrappers.

It's very cool how C# and F# are (apparently to me) doubling down on the original platform investment in generators, supporting LINQ and asynchronous agents.

The thing I like the most in F# is the computation expressions, which let you build asynchronous workflows (which Don Syme talks about).  To me it looks like a start towards doing domain-specific languages on borrowed host-language syntax with static typing.  (As opposed to Ruby/Groovy/Python where you get DSLs with dynamic typing.)  You can build agent machines in them, which is important, but also probably just the beginning of the interesting patterns that can be expressed as DSLs.  Perhaps you could represent a lazy variant of F# (or Scala) as a computation expression, with enough "hooks" in controlling type for re-interpreting each type of subexpression.

-- John

On Jul 3, 2009, at 2:49 PM, Jon Harrop wrote:

Don Syme described in an interview with Sadek Drobi some of the lessons he 
learned from their work on Haskell.NET:

 http://www.infoq.com/interviews/F-Sharp-Don-Syme
