introduction: the JVM language CAL

Bo Ilic

Aug 9, 2007, 3:55:18 PM

to JVM Languages

Hi all,
CAL is a lazy, strongly typed functional programming language for the
JVM available at http://labs.businessobjects.com/cal/.

The basic syntax and semantics of CAL are similar to Haskell and it
is straightforward to port Haskell code to CAL. In addition, CAL
includes features to interoperate with Java. For example, CAL supports
concurrent creation, compilation, and execution of CAL entities,
entirely under the control of Java, e.g. as a kind of functional
language meta-programming.

The Gem Cutter is a visual programming language for graphically
creating CAL functions. It is an example of the sort of application
we wanted to make possible: one that combines a polished UI, reuses
Java libraries, and exposes functional logic in an accessible form.

CAL has been under development at Business Objects since 2000, and
was released on Jan 25, 2007 as open source under the BSD
license. There is also a CAL Eclipse Plug-in released as open source
under the EPL license. CAL is a mature language, used in quite a few
projects at Business Objects. The open-source offering (including the
Gem Cutter) consists of 800,000 lines of code, about 20% written in
CAL, and the rest in Java itself.

Performance has been one of the key things we've focused on. A lazy
language poses quite a few challenges here, as does the JVM, which
doesn't permit overwriting of thunks. A variety of unboxing
transformations, compiling tail calls as loops, etc. were essential
to get decent performance. CAL is benchmarked on the
Computer Language Benchmarks Game site (formerly the Computer
Benchmark Shootouts) at: http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all
Our current performance rating on the default suite is 2.0,
comparable with Java 6 server (1.7) and Haskell GHC (2.1).
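
To make the "tail calls as loops" point concrete, here is a minimal
hand-written sketch in Java. It is only an illustration and not output
of the CAL compiler; the class and method names are invented for the
example. It shows a tail-recursive function translated naively and
then with the self tail call rewritten as a loop over unboxed
arguments.

```java
// Hypothetical illustration only: not generated by the CAL compiler.
final class TailCallSketch {

    // Naive translation of the tail-recursive function
    //   sumTo n acc = if n == 0 then acc else sumTo (n - 1) (acc + n)
    // Every call consumes a JVM stack frame, so a large n can overflow the stack.
    static long sumToRecursive(int n, long acc) {
        if (n == 0) {
            return acc;
        }
        return sumToRecursive(n - 1, acc + n);
    }

    // The same function with the self tail call compiled as a loop:
    // the parameters are reassigned and control returns to the top.
    // Both parameters stay primitive (unboxed) for the whole computation.
    static long sumToLoop(int n, long acc) {
        while (true) {
            if (n == 0) {
                return acc;
            }
            acc = acc + n;
            n = n - 1;
        }
    }

    public static void main(String[] args) {
        // Runs in constant stack space regardless of n.
        System.out.println(sumToLoop(1_000_000, 0L));
    }
}
```

The loop form needs no stack growth, which is why it matters on a VM
without built-in tail-call support.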

The CAL compiler can generate either bytecodes directly (using the
ASM bytecode generator) or reasonably readable Java source files that
are then compiled with javac. Having both these modes has been
important for us, in that the Java source mode helped us discover
optimization possibilities in our code generation, and to fix bugs,
while the direct bytecode generation gave us the speed and concurrent
compilation capabilities we needed for the functional meta-programming
mentioned above.

We have a document on the implementation of the CAL runtime that will
be released with the next release of Open Quark in the next few weeks.
Hopefully it will be of interest to some of the people on this group.

If you have any questions, please feel free to ask on our Google
group discussion forum http://groups.google.com/group/cal_language.

Cheers,
Bo

Charles Oliver Nutter

Aug 9, 2007, 9:05:24 PM

to jvm-la...@googlegroups.com

Bo Ilic wrote:
> We have a document on the implementation of the CAL runtime that will
> be released with the next release of Open Quark in the next few weeks.
> Hopefully it will be of interest to some of the people on this group.

I would be *very* interested in hearing more about CAL. In addition, I
have a question to start things off: why not just Haskell? I know
there's a Jaskell project, but it sounds like you guys might be further
along (feel free to correct me, anyone). But your performance numbers
are excellent, showing that a language like Haskell could be made to
work extremely well on the JVM. I want to know what challenges and
tricks you had to do to get there, and what if anything might be
reusable for other languages.

- Charlie

Bo Ilic

Aug 10, 2007, 2:00:26 AM

to JVM Languages

On Aug 9, 6:05 pm, Charles Oliver Nutter <charles.nut...@sun.com>
wrote:

> I would be *very* interested in hearing more about CAL. In addition, I
> have a question to start things off: why not just Haskell?

The main reason is that we wanted to have first-class support for
Java, given its ubiquity in business computing. For example, CAL's
String type is just java.lang.String and its Int type is the Java
primitive int. (CAL being lazy, a value in CAL of type Int can either
be a primitive Java int or a computation that evaluates to an int).
Another example is the ability to write dual language (Java/CAL)
applications such as the Gem Cutter with relative ease, with each
language doing what it does best.
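
As a rough picture of what "either a primitive Java int or a
computation that evaluates to an int" can look like on the Java side,
here is a minimal sketch. LazyInt and its methods are hypothetical
names made up for this example, not CAL's runtime classes, and the
sketch makes no attempt at thread safety.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch of a lazily evaluated int; not CAL's actual runtime.
final class LazyInt {
    private IntSupplier computation; // non-null while the value is unevaluated
    private int value;               // valid once computation == null

    private LazyInt(IntSupplier computation, int value) {
        this.computation = computation;
        this.value = value;
    }

    // An already evaluated value: just wraps the primitive int.
    static LazyInt ofValue(int value) {
        return new LazyInt(null, value);
    }

    // A suspended computation that will produce an int when demanded.
    static LazyInt ofComputation(IntSupplier computation) {
        return new LazyInt(computation, 0);
    }

    // Forces the value. The computation runs at most once; afterwards the
    // cached primitive is returned and the computation is released.
    int evaluate() {
        if (computation != null) {
            value = computation.getAsInt();
            computation = null;
        }
        return value;
    }

    public static void main(String[] args) {
        LazyInt x = LazyInt.ofComputation(() -> 6 * 7);
        System.out.println(x.evaluate());
    }
}
```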

Longer answers are here:
http://www.haskell.org/pipermail/haskell/2006-September/018571.html
http://resources.businessobjects.com/labs/cal/cal_for_haskell_programmers.pdf

> But your performance numbers
> are excellent, showing that a language like Haskell could be made to
> work extremely well on the JVM. I want to know what challenges and
> tricks you had to do to get there, and what if anything might be
> reusable for other languages.

We implemented 3 different machines: the TIM, the g-machine and
finally our own home-brewed lecc machine. We started with the TIM, but
gave up on it after considerable effort. It was slow, and we were not
able to make it space-correct in certain cases. The g-machine
implementation was next. This is basically the same machine used in
the well-known Haskell implementation Hugs. We still include it today
as an alternative machine for CAL (it can be enabled using VM
arguments), but in the end it was too slow. Our main current machine is
the lecc, which is a modified graph-reducer. Its workings and main
optimizations are the subject of the forthcoming doc.

Some of the implementation of CAL is certainly reusable for other
languages. Here is one example. CAL source, after a series of
transformations, is eventually converted to a Java model. This Java
model can then be converted to Java source or Java bytecodes directly
using ASM. The compilation to bytecodes from this stage is one thing
that is surely reusable. For example, the code that compiles
boolean-valued expressions without pushing intermediate boolean
results is a nice snippet, available here.
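
For readers unfamiliar with the technique, the general idea of
compiling a condition straight to jumps can be sketched as follows.
This is a toy generator written for this illustration only: it emits
textual pseudo-instructions modelled on JVM opcodes, it is not CAL's
Java model and does not use the ASM API, and all class and method
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy code generator; emits pseudo-instructions as strings.
final class BoolJumpSketch {
    interface Expr {}

    static final class LessThan implements Expr {
        final String left, right; // names of int local variables
        LessThan(String left, String right) { this.left = left; this.right = right; }
    }

    static final class And implements Expr {
        final Expr left, right;
        And(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    static final class Or implements Expr {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    final List<String> code = new ArrayList<>();
    private int labelCount = 0;

    private String newLabel() { return "L" + labelCount++; }

    // Emits code that jumps to target if e is false and falls through if true.
    // No boolean (iconst_0 / iconst_1) is ever pushed for sub-expressions.
    void jumpIfFalse(Expr e, String target) {
        if (e instanceof LessThan) {
            LessThan lt = (LessThan) e;
            code.add("iload " + lt.left);
            code.add("iload " + lt.right);
            code.add("if_icmpge " + target);
        } else if (e instanceof And) {
            And and = (And) e;
            jumpIfFalse(and.left, target);   // short-circuit: left false => whole false
            jumpIfFalse(and.right, target);
        } else if (e instanceof Or) {
            Or or = (Or) e;
            String isTrue = newLabel();
            jumpIfTrue(or.left, isTrue);     // short-circuit: left true => skip right
            jumpIfFalse(or.right, target);
            code.add(isTrue + ":");
        } else {
            throw new IllegalArgumentException("unknown expression");
        }
    }

    // Emits code that jumps to target if e is true and falls through if false.
    void jumpIfTrue(Expr e, String target) {
        if (e instanceof LessThan) {
            LessThan lt = (LessThan) e;
            code.add("iload " + lt.left);
            code.add("iload " + lt.right);
            code.add("if_icmplt " + target);
        } else if (e instanceof And) {
            And and = (And) e;
            String isFalse = newLabel();
            jumpIfFalse(and.left, isFalse);
            jumpIfTrue(and.right, target);
            code.add(isFalse + ":");
        } else if (e instanceof Or) {
            Or or = (Or) e;
            jumpIfTrue(or.left, target);
            jumpIfTrue(or.right, target);
        } else {
            throw new IllegalArgumentException("unknown expression");
        }
    }

    public static void main(String[] args) {
        // Condition of: if (a < b && c < d) { ... } else { ... }
        BoolJumpSketch gen = new BoolJumpSketch();
        gen.jumpIfFalse(new And(new LessThan("a", "b"), new LessThan("c", "d")), "Lelse");
        gen.code.forEach(System.out::println);
    }
}
```

The alternative would be to materialise each comparison as 0 or 1 on
the operand stack and then test it again, which costs extra
instructions and branches.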

Cheers,
Bo

parren

Aug 10, 2007, 10:07:53 AM

to JVM Languages

On Aug 9, 9:55 pm, Bo Ilic <bo.i...@businessobjects.com> wrote:
> The CAL compiler can generate either bytecodes directly (using the
> ASM bytecode generator) or reasonably readable Java source files that
> are then compiled with javac. Having both these modes has been
> important for us, in that the Java source mode helped us discover
> optimization possibilities in our code generation, and to fix bugs,
> while the direct bytecode generation gave us the speed and concurrent
> compilation capabilities we needed for the functional meta-programming
> mentioned above.

I had the same plans for AFC. But then I discovered JODE, a byte-code
to Java decompiler. This gave me the same benefits you mention, but
without having to maintain two different backends. So now I only
support ASM and use JODE to get decent source code.

-peo

Bo Ilic

Aug 10, 2007, 5:49:17 PM

to JVM Languages

On Aug 10, 7:07 am, parren <peter.arrenbre...@gmail.com> wrote:
>
> I had the same plans for AFC. But then I discovered JODE, a byte-code
> to Java decompiler. This gave me the same benefits you mention, but
> without having to maintain two different backends. So now I only
> support ASM and use JODE to get decent source code.
>

This is an interesting idea. I left out some details of CAL's
approach, though, that suggest some tradeoffs. These might not be
that important for all language implementations, but are perhaps
worth noting.

1. CAL's implementation includes some extra things in the Java model
to Java source conversion. For example, there are comments relating
the generated code to the CAL source code. As another example, we
declare local variables or parameters to methods final in the Java
model where appropriate, i.e. we enforce a Java language constraint
in our code generation that is not enforceable in bytecode.

2. We periodically need to extend the Java model to handle new
constructs, since it is not a complete model of the Java language.
Deciding how to encode new CAL constructs or optimizations in the
Java model is the hard part, and we like to test and validate the
encoding in Java source mode first. Only then do we go through the
effort of extending the Java model to bytecode conversion, which is
harder than updating the Java model to Java source conversion.

3. When we are tracking down bugs in the CAL compiler implementation,
it can happen that the Java model will compile to bytecodes
successfully, but then produce verification errors on loading at
runtime, whereas compilation to Java source will produce javac
compilation errors in the generated code, which are easier to track
down.

4. We reuse the Java model in a few cases for things other than CAL
code generation. For example, in generating "binding" files for CAL
modules that make certain kinds of meta-programming with CAL more
statically checkable. These are files that users of CAL may actually
read, so they need to be reasonably readable.

5. ASM makes it easy to compare the bytecodes generated by javac
with those generated via ASM. What we do is generate code using both
modes, use ASM to
filter out the stuff generated by javac that we don't want (e.g. debug
information), and then compare the bytecodes of all files in a
differencing tool.

Cheers,
Bo

Bo Ilic

Aug 31, 2007, 12:25:49 AM

to JVM Languages

On Aug 9, 12:55 pm, Bo Ilic <bo.i...@businessobjects.com> wrote:
> We have a document on the implementation of the CAL runtime that will
> be released with the next release of Open Quark in the next few weeks.
> Hopefully it will be of interest to some of the people on this group.
>

Open Quark 1.6.0 has been released and includes 3 new documents on the
implementation of CAL that may be of more general interest to readers
on this forum. The documents are available at
http://labs.businessobjects.com/cal/
either as part of the binary or source distribution or as separate
downloads.

CAL Runtime Internals explains how the CAL runtime is implemented, and
some of the optimizations used. This document basically starts with a
type-checked, desugared representation of our core functional
language, and explains how we translate it into Java.
http://resources.businessobjects.com/labs/cal/cal_runtime_internals.pdf

CAL Global Optimizer has some information about an optimization pass
of the CAL compiler for doing global optimizations such as inlining
and fusion.
http://resources.businessobjects.com/labs/cal/cal_global_optimizer.pdf

CAL Benchmarking is an account of various optimizations, and their
effects on performance, covering a period of approximately four years.
http://resources.businessobjects.com/labs/cal/cal_benchmarking.pdf

Cheers,
Bo
