feature proposal: Standalone CAL library JARs

Bo Ilic

Sep 24, 2007, 5:41:34 PM
to CAL Language Discussion
Hi all,
What follows is a feature proposal for another way to use CAL from
a Java application, this time as a quick-startup library JAR. We are
interested in this possibility here at Business Objects. Comments and
suggestions are welcome!

Cheers,
Bo

CAL Standalone JARs have been implemented for a few months now, and
they have achieved their goal of drastically reducing the start-up
time of a standalone CAL application. Proof of this is in the Computer
Language Benchmarks Game benchmarks, where we could not compete at all
if every cold-start run had to load a workspace.

We'd like access to the fast start-up benefits for libraries written
in CAL. The existing tool is only useful for applications, since it
creates a single entry-point for a CAL function of type [String] ->
(). Of course, these libraries would not be able to use features such
as Java based meta-programming (SourceModel, GemGraph, TypeExpr). They
do have access to some dynamic support though, in the form of use of
the CAL Dynamic and TypeRep types.

Issues:

* Want the ability to create a JAR that has a public API
consisting of potentially several Java classes, each with several Java
static functions. These simply call into their corresponding CAL
implementations.
* At the very least, should be able to export any CAL function
whose type consists purely of types that can be used in foreign
function signatures for a given module (i.e. foreign types with
accessible implementation scope, Prelude.Boolean and Prelude.Unit). We
may want to also allow other kinds of types, where the rule is that
inputs and outputs are all composed with the default input and output
policies. Note that in particular the types CalValue and CalFunction
qualify, and it will be possible to pass opaque CAL values through the
library boundary, including unevaluated CAL values i.e. closures.
* Unlike with the standalone application builder, each Java static
function has an extra argument for the execution context. This is
needed for CAFs, terminating a run, access to properties and
resources, the Dynamic type, etc. We also need to provide a factory
for cooking up an initial execution context.
* Standalone CAL applications make a check that the application
JAR is compatible with the CAL platform runtime support JAR, and if
not, terminate with an error. For example, there is a check that the
version of the LECC generated files that the CAL platform supports is
the same as the version in the application JAR. There are other checks
that certain system property flags are consistent. We do not want to
make these checks with every method call into the library. The
solution is to create a special InitLibrary factory class. This can be
the factory for execution contexts, as well as provide a consistency
checking function.
* One issue that comes up is the process of development and
debugging of stand-alone libraries. The Java application making use of
the standalone library has an intermediate build step to create the
library's public Java API. Tooling support in Eclipse should make this
not too bad.
* There is an issue of how to specify various properties of the
generated Java entities. These include: their names, packages, scopes,
and any associated JavaDoc. One idea here is to define the library's
API in CAL, using CALDoc. This allows for proper CAL compile-time type
checking of the marshaling code. Scopes of functions can just be taken
from the CAL scopes (e.g. CAL's protected scope maps to Java's
protected scope). We already have some code that transforms CALDoc
into reasonably nice JavaDoc. The generated Java classes are always
final, and not instantiable. The main thing not easily specifiable in
a CAL module is the name, scope and package of the generated Java
class (unless we want to infer the Java name and package from the CAL
fully qualified name, which may not be that desirable). But this is
just a small amount of data in a script.
* Just to be explicit, a standalone library can consist of several
Java API classes (and not just one). It can also have "main" methods
i.e. incorporate standalone application entry points. For example,
standalone application entry points might be a useful place to hook up
self-tests for a library.
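
As a rough illustration of the shape these points describe, here is a minimal Java sketch: a final, non-instantiable API class whose static methods take an extra execution-context argument, plus an InitLibrary-style factory that runs the compatibility check once rather than on every call. Everything in it is hypothetical. SketchExecutionContext, SketchInitLibrary, SketchMathLibrary, addOne and the version constants are stand-ins invented for this example, not Open Quark classes, and placing the context as the last parameter is an assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a CAL execution context; not the real Open Quark class. */
final class SketchExecutionContext {
    private final int id;
    SketchExecutionContext(int id) { this.id = id; }
    int id() { return id; }
}

/** Hypothetical factory: runs the compatibility check once, then hands out contexts. */
final class SketchInitLibrary {
    // Assumed version numbers, for illustration only.
    static final int LIBRARY_LECC_VERSION = 7;
    static final int PLATFORM_LECC_VERSION = 7;

    private static final AtomicInteger checksRun = new AtomicInteger();
    private static final AtomicInteger nextId = new AtomicInteger();
    private static boolean checked = false; // guarded by the class lock

    private SketchInitLibrary() {}

    /** Verifies library/runtime consistency; repeated calls do no further work. */
    static synchronized void checkCompatibility() {
        if (checked) {
            return;
        }
        checksRun.incrementAndGet();
        if (LIBRARY_LECC_VERSION != PLATFORM_LECC_VERSION) {
            throw new IllegalStateException("library JAR and runtime JAR are incompatible");
        }
        checked = true;
    }

    /** Creates a fresh context, making sure the one-time check has happened. */
    static SketchExecutionContext makeExecutionContext() {
        checkCompatibility();
        return new SketchExecutionContext(nextId.incrementAndGet());
    }

    /** Number of times the check body actually ran. */
    static int checksRun() { return checksRun.get(); }
}

/** Final, non-instantiable API class; each static method takes the context as an extra argument. */
final class SketchMathLibrary {
    private SketchMathLibrary() {}

    static int addOne(int n, SketchExecutionContext executionContext) {
        if (executionContext == null) {
            throw new NullPointerException("executionContext");
        }
        return n + 1; // a real library would call into the compiled CAL function here
    }

    public static void main(String[] args) {
        SketchExecutionContext ctx = SketchInitLibrary.makeExecutionContext();
        System.out.println(addOne(41, ctx));
    }
}
```

The point of the split is that library methods stay free of per-call checks: the check is tied to obtaining a context, which every caller has to do anyway.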

Another thing that would be useful to do for this feature is to
separate out the CAL_Platform project into CAL_Platform and
CAL_Platform_Runtime. This will help ensure the correctness of
standalone libraries with respect to changes and bug fixes (i.e. they
don't end up accidentally loading classes that shouldn't be part of a
standalone deployment). This also addresses the goal of having a small
disk footprint when redistributing an application or library running
in standalone mode.

Alternatives to the use of default input/output policies:

Input policies have the disadvantage that the input type is always
java.lang.Object. This is because when composing with Prelude.input,
the only thing that you can guarantee about the input foreign type is
that it is java.lang.Object. Indeed, for the case of the Prelude.List
type, the input java.lang.Object can be a java.util.Collection, a
java.util.Iterator, a java.util.Enumeration, or a primitive or
reference Java array! The only Java type that is a supertype of all of
these is java.lang.Object, so it is the only statically valid type to
use. Sometimes there is a Java type that is more specific that would
work, or one can select an arbitrary choice and support a limited API
e.g. arbitrarily choose java.util.List as the input Java type for CAL
lists. But this is an arbitrary ad-hoc choice that is not explicit in
the CAL.
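
To make the typing problem concrete, here is a small Java sketch of a method that accepts every list-like shape mentioned above. It is only an illustration, not the Prelude.input implementation; ListInputSketch and inputList are hypothetical names. Because the accepted shapes share no supertype other than java.lang.Object, the parameter cannot be declared more precisely.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration; this is not the Prelude.input implementation. */
final class ListInputSketch {
    private ListInputSketch() {}

    /** Accepts any of the Java shapes named in the text and copies it into a List. */
    static List<Object> inputList(Object input) {
        List<Object> result = new ArrayList<>();
        if (input instanceof Collection) {
            result.addAll((Collection<?>) input);
        } else if (input instanceof Iterator) {
            Iterator<?> it = (Iterator<?>) input;
            while (it.hasNext()) {
                result.add(it.next());
            }
        } else if (input instanceof Enumeration) {
            Enumeration<?> en = (Enumeration<?>) input;
            while (en.hasMoreElements()) {
                result.add(en.nextElement());
            }
        } else if (input != null && input.getClass().isArray()) {
            // Array.get boxes primitive elements, so int[] and String[] both work.
            int length = Array.getLength(input);
            for (int i = 0; i < length; i++) {
                result.add(Array.get(input, i));
            }
        } else {
            // The compiler could not rule this out: the static type is just Object.
            throw new IllegalArgumentException("cannot treat value as a list: " + input);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(inputList(new int[] {1, 2, 3}));
        System.out.println(inputList(Arrays.asList("a", "b").iterator()));
    }
}
```

The final branch shows the cost: a caller who passes the wrong kind of value finds out only at run time.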

For example, a CAL function with type:

foo :: JList -> Integer -> Boolean;

would create a method in a Java class

static Object foo (Object arg1, Object arg2)

However, there is an exception to this: foreign types whose Inputable
instances are defined via a deriving clause have a precisely defined
Java type. (We can extend this to the algebraic types that are treated
specially in CAL, i.e. Boolean and Unit, which correspond to Java's
boolean and void.) In this case, the precise Java type is available so
we could e.g. create the method:

static boolean foo(java.util.List arg1, java.math.BigInteger arg2)

The main drawback so far is that there is no Java generic type
information. There are a couple of ways to try to address this...

Another issue is that we may want *lazy* versions of these functions.
In that case we provide an overload of foo that uses CalValue:

static boolean foo(CalValue arg1, CalValue arg2);

Then the Java List value and BigInteger value could be passed as
unevaluated computations. We need a factory class for converting Java
types to CalValues in the event that someone has only one unevaluated
CAL expression at hand e.g. the List, but a Java BigInteger, which
then needs to be wrapped in a CalValue to call the foo method.
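
A minimal Java sketch of the strict/lazy overload pair and the wrapping factory might look like the following. It is purely illustrative: LazySketch is a hypothetical stand-in for CalValue built on java.util.function.Supplier, the factory methods defer and of are invented names, and the body of foo is made up (the proposal does not say what foo computes).

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for CalValue: a computation that runs at most once, on demand. */
final class LazySketch<T> {
    private Supplier<? extends T> thunk;
    private T value;

    private LazySketch(Supplier<? extends T> thunk) { this.thunk = thunk; }

    /** Wraps an unevaluated computation. */
    static <T> LazySketch<T> defer(Supplier<? extends T> thunk) {
        return new LazySketch<>(thunk);
    }

    /** Factory for the mixed case: wrap a Java value that is already at hand. */
    static <T> LazySketch<T> of(T alreadyEvaluated) {
        LazySketch<T> lazy = new LazySketch<>(null);
        lazy.value = alreadyEvaluated;
        return lazy;
    }

    /** Evaluates the computation on first use and caches the result. */
    synchronized T force() {
        if (thunk != null) {
            value = thunk.get();
            thunk = null; // drop the closure so it can be collected
        }
        return value;
    }
}

final class FooSketch {
    private FooSketch() {}

    /** Strict version (made-up body): is the list longer than arg2? */
    static boolean foo(List<?> arg1, BigInteger arg2) {
        return BigInteger.valueOf(arg1.size()).compareTo(arg2) > 0;
    }

    /** Lazy overload: arguments are forced only if and when they are needed. */
    static boolean foo(LazySketch<? extends List<?>> arg1, LazySketch<BigInteger> arg2) {
        if (arg2.force().signum() < 0) {
            return true; // arg1 is never forced on this path
        }
        return foo(arg1.force(), arg2.force());
    }

    public static void main(String[] args) {
        LazySketch<List<Integer>> list = LazySketch.defer(() -> Arrays.asList(1, 2, 3));
        // The BigInteger is an ordinary Java value, so it goes through the factory.
        System.out.println(foo(list, LazySketch.of(BigInteger.valueOf(2))));
    }
}
```

The of factory covers the mixed case from the paragraph above: one argument is an unevaluated computation, the other is a plain Java value that has to be wrapped before the lazy overload can be called.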

Tom Davies

Sep 25, 2007, 12:33:00 AM
to CAL Language Discussion
So would the standalone JARs contain Java bytecode, instead of .cmi
files which are converted to bytecode by ASM at workspace load time?
(if I've got that right)

It might be useful to retain a 'reflective' interface which could
specify individual input/output policies, as well as having the
wrapping Java class.

I've written up my interfacing experiments here: http://diversions.nfshost.com/blog/

Regards,
Tom

Bo Ilic

Sep 25, 2007, 2:24:30 PM
to CAL Language Discussion
Hi Tom,

I'm enjoying reading your "my diversions" site. It is interesting to
see the different ways of interfacing CAL with Java!

Currently CAR-JARs are JARs which contain cmi (compiled module info)
files, and possibly lc files. Note that lc files are just class files
with a custom directory structure (lc stands for lecc class). The
custom directory structure is a workaround for bugs in pre-Java 5
versions of Java in handling long path names on Windows. Whether or
not to generate lc files is determined by the
org.openquark.cal.machine.lecc.static_runtime system property. Note
that in dynamic mode, the classes are not generated at workspace load
time, but rather as needed when a CAL function is run. The advantage
of the static runtime is that the first-run time of CAL functions is
faster, since it turns out that loading the class files from disk in
the lc format is faster than generating them in memory from cmi files.
The disadvantage is that the resulting CAR-JARs are much larger. Also,
if you are not using CAR-JARs, but just .cal files in folders on disk,
then the dynamic runtime approach offers faster recompilation when
doing incremental style development since all those lc files do not
need to be generated. Static runtime is primarily useful for
deployment situations.

Standalone JAR libraries and Standalone JAR applications differ from
using a CAL workspace in that they only contain the Java class files
necessary to run the selected CAL functions. They do not contain cmi
files or lc files but just Java class files. They also do not have the
one time start-up cost of loading a CAL workspace (i.e. deserializing
all the cmi files). This can also save on memory use since the
deserialized cmi files take some space.

Cheers,
Bo

Joseph Wong

Oct 26, 2007, 3:27:22 PM
to CAL Language Discussion
Hi everyone,

The next release of Open Quark will include the new feature of
standalone CAL library JARs.

This feature makes it possible to generate a JAR from a CAL library
that can be used directly in Java.

For example, for a CAL function such as:

/**
 * Returns the first {@code n@} elements of the given list in a new list.
 * @arg n the number of elements to take.
 * @arg list the list of integers.
 * @return a list of the requested elements.
 */
take :: Int -> [Int] -> JList;
public take n list = List.outputList (List.take n list);

a corresponding method will be generated in a library class included
in the JAR:

/**
 * Returns the first <code>n</code> elements of the given list in a new list.
 * @param n (CAL type: {@code Int})
 *          the number of elements to take.
 * @param list (CAL type: {@code [Int]})
 *          the list of integers.
 * @return (CAL type: {@code JList})
 *          a list of the requested elements.
 */
public static List take(final int n, final CalValue list,
        final ExecutionContext executionContext) throws CALExecutorException {
    // implementation clipped
}

The CAL_Samples project in the Open Quark source distribution will
include a new end-to-end sample standalone library based on the
directed graph library written in CAL.

The Standalone JAR Tool now also supports the generation of a
companion source zip file containing source files for all generated
classes, which can be used for debugging and documentation purposes.

To learn more about this new feature, the best place to start is the
"Using Quark with Standalone JARs" document, included in the
distribution.

We hope you'll like this new feature of Open Quark!

Cheers,
Joseph
