using javac for mixed-language programming?


Per Bothner

Mar 2, 2014, 3:54:57 PM
to jvm-la...@googlegroups.com
The issue is mixed-language programming, in the sense of module A
written in language L1 references members of class B written in Java,
which in turn references members of A. Potentially, there may be
complicated cycles and additional languages besides L1 and Java.

The "standard" solution I believe is to compile A twice: first with
a "stub compiler" that ignores "module internals". This generates
a skeletal A.class that can be read by javac. After B is compiled
by javac, then we compile A for real to generate the real A.class.
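
A minimal sketch of the stub step described above, assuming the stub is emitted as Java source (a variant of the skeletal A.class idea) so that javac can compile it together with B.java. `MethodSig` and `StubGenerator` are hypothetical names for illustration, not part of any existing compiler.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one method signature exported by the L1 module. */
final class MethodSig {
    final String returnType;
    final String name;
    final String params; // already formatted, e.g. "int x, String y"

    MethodSig(String returnType, String name, String params) {
        this.returnType = returnType;
        this.name = name;
        this.params = params;
    }
}

final class StubGenerator {
    /** Emits a Java source stub: real signatures, placeholder bodies. */
    static String stubFor(String className, String superClass, List<MethodSig> methods) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(className);
        if (superClass != null) {
            out.append(" extends ").append(superClass);
        }
        out.append(" {\n");
        for (MethodSig m : methods) {
            // Throwing is a valid body for any return type, including void.
            out.append("    public ").append(m.returnType).append(' ').append(m.name)
               .append('(').append(m.params).append(") {\n")
               .append("        throw new UnsupportedOperationException(\"stub\");\n")
               .append("    }\n");
        }
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        List<MethodSig> sigs = new ArrayList<>();
        sigs.add(new MethodSig("int", "size", ""));
        System.out.print(stubFor("A", "B", sigs));
    }
}
```

The stub only carries what javac needs to resolve references from B; the real A is produced by the second, full compile.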

This has various problems. Compile-time performance is the obvious one
(having to partially compile A twice). Worse, there are limitations
in the dependencies between A and B. For example, A might define a
class that extends B - but you can't tell until you've read the package
declaration in B.java.

Better if we can do the Enter/MemberEnter phases of javac on B.java
before we start compiling A.L1. And we might want to do only a
first pass on A.L1 before we get back to B.java. Things may need to
be compiled on-demand, as mentioned in:
http://openjdk.java.net/groups/compiler/doc/compilation-overview/
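
A rough sketch of driving javac in steps from Java code, assuming a JDK (not just a JRE) is available. As far as I know the public `com.sun.source.util.JavacTask` API offers `parse()`, `analyze()` and `generate()` rather than Enter/MemberEnter individually, and `analyze()` goes further than those two phases (it also attributes method bodies), so this only approximates the idea. `PhasedJavacSketch` and `StringSource` are made-up names.

```java
import com.sun.source.util.JavacTask;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.lang.model.element.ExecutableElement;
import javax.lang.model.element.TypeElement;
import javax.lang.model.util.ElementFilter;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class PhasedJavacSketch {
    /** An in-memory Java source file. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses and analyzes one class without writing class files, then lists its methods. */
    static List<String> methodNames(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available");
        }
        List<JavaFileObject> units = new ArrayList<>();
        units.add(new StringSource(className, source));
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, units);
        try {
            task.parse();    // source text to syntax trees
            task.analyze();  // enters symbols and attributes; generate() is never called
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        TypeElement type = task.getElements().getTypeElement(className);
        List<String> names = new ArrayList<>();
        for (ExecutableElement m : ElementFilter.methodsIn(type.getEnclosedElements())) {
            names.add(m.getSimpleName().toString());
        }
        return names;
    }
}
```

An L1 compiler could query member information like this for B before compiling A, provided B.java can be analyzed without A, which is exactly what a cycle prevents.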

It should be possible to write a multi-language compiler tool that
extends javac. That's what we did for JavaFX Script, but in that
case the entire JavaFX Script compiler was based on javac. It may
be more difficult to wrap an existing L1 compiler. E.g. the internal
L1 compiler objects have to be mapped to javac Symbol objects.

Anyone thought about this? Experimented with it? I'm wondering
if it might be a good Google Summer of Code project. I wouldn't
expect a student to produce a usable tool in a summer, but it
could be a useful experiment.
--
--Per Bothner
p...@bothner.com http://per.bothner.com/

Jochen Theodorou

Mar 3, 2014, 4:16:44 AM
to jvm-la...@googlegroups.com
On 02.03.2014 21:54, Per Bothner wrote:
> The issue is mixed-language programming, in the sense of module A
> written in language L1 references members of class B written in Java,
> which in turn references members of A. Potentially, there may be
> complicated cycles and additional languages besides L1 and Java.

Oh, this is funny, because just a few days ago we not only got approved
as an organization for GSoC 2014, but also got a student interested in
writing an advanced joint Groovy-Java compiler. Whether that is going to
happen is not sure yet.
The stub solution we currently use has bad limits, such as most of our
transforms not being respected in this process. So this project aims at
bridging the ASTs of both compilers, without us really knowing whether javac
can be controlled enough to make this happen.

> The "standard" solution I believe is to compile A twice: first with
> a "stub compiler" that ignores "module internals". This generates
> a skeletal A.class that can be read by javac. After B is compiled
> by javac, then we compile A for real to generate the real A.class.
>
> This has various problems. Compile-time performance is the obvious one
> (having to partially compile A twice). Worse, there are limitations
> in the dependencies between A and B. For example, A might define a
> class that extends B - but you can't tell until you've read the package
> declaration in B.java.

I think we worked around that by first resolving classes in our Groovy
code, where possible, and leaving the import statements in place. Then we
generate the stubs, let javac compile everything, and then groovyc picks
up the missing classes from there to produce the final Groovy classes. This
works only because Groovy and Java import statements work very similarly.
And we still sometimes get into situations that this logic cannot resolve.

> Better if we can do the Enter/MemberEnter phases of javac on B.java
> before we start compiling A.L1. And we might want to do only a
> first pass on A.L1 before we get back to B.java. Things may need to
> be compiled on-demand, as mentioned in:
> http://openjdk.java.net/groups/compiler/doc/compilation-overview/
>
> It should be possible to write a multi-language compiler tool that
> extends javac. That's what we did for JavaFX Script, but in that
> case the entire JavaFX Script compiler was based on javac. It may
> be more difficult to wrap an existing L1 compiler. E.g. the internal
> L1 compiler objects have to be mapped to javac Symbol objects.

For Eclipse we actually have something like this already. And while
a command-line Eclipse compiler has existed for a long time, we
have the problem that JDT does not expose everything that is needed, so a
modified JDT is required. The goal of our project would be to have
something more lightweight, for example not having to ship a modified JDT.

> Anyone thought about this? Experimented with it? I'm wondering
> if it might be a good Google Summer of Code project. I wouldn't
> expect a student to produce a usable tool in a summer, but it
> could be a useful experiment.

It was already tried several years ago to get some kind of framework to
do that and have, for example, a compiler for code mixing Java, Groovy and
Scala. But there was not much interest apart from "that would be good"
at a JVM Language Summit. It would be good if there is now more interest.

Frankly, if I had the time, I would have written that long ago (or failed
doing so ;) )

bye Jochen

--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org

John Cowan

Mar 3, 2014, 7:44:46 AM
to jvm-la...@googlegroups.com

On Sun, Mar 2, 2014 at 3:54 PM, Per Bothner <p...@bothner.com> wrote:

> The "standard" solution I believe is to compile A twice: first with
> a "stub compiler" that ignores "module internals". This generates
> a skeletal A.class that can be read by javac. After B is compiled
> by javac, then we compile A for real to generate the real A.class.

Another approach is to have a mode of the L1 compiler that generates Java rather than bytecode, and then let javac deal with it all, since it already understands this problem. 




--
GMail doesn't have rotating .sigs, but you can see mine at http://www.ccil.org/~cowan/signatures

Jochen Theodorou

Mar 3, 2014, 10:10:01 AM
to jvm-la...@googlegroups.com
On 03.03.2014 13:44, John Cowan wrote:
>
> On Sun, Mar 2, 2014 at 3:54 PM, Per Bothner <p...@bothner.com> wrote:
>
> The "standard" solution I believe is to compile A twice: first with
> a "stub compiler" that ignores "module internals". This generates
> a skeletal A.class that can be read by javac. After B is compiled
> by javac, then we compile A for real to generate the real A.class.
>
>
> Another approach is to have a mode of the L1 compiler that generates
> Java rather than bytecode, and then let javac deal with it all, since it
> already understands this problem.

But that works with even fewer languages.

Per Bothner

Mar 4, 2014, 7:07:27 PM
to jvm-la...@googlegroups.com
On 03/03/2014 01:16 AM, Jochen Theodorou wrote:

>> Better if we can do the Enter/MemberEnter phases of javac on B.java
>> before we start compiling A.L1. And we might want to do only a
>> first pass on A.L1 before we get back to B.java. Things may need to
>> be compiled on-demand, as mentioned in:
>> http://openjdk.java.net/groups/compiler/doc/compilation-overview/
>>
>> It should be possible to write a multi-language compiler tool that
>> extends javac. That's what we did for JavaFX Script, but in that
>> case the entire JavaFX Script compiler was based on javac. It may
>> be more difficult to wrap an existing L1 compiler. E.g. the internal
>> L1 compiler objects have to be mapped to javac Symbol objects.
>
> For Eclipse we actually have something like this already. And while
> a command-line Eclipse compiler has existed for a long time, we
> have the problem that JDT does not expose everything that is needed, so a
> modified JDT is required. The goal of our project would be to have
> something more lightweight, for example not having to ship a modified JDT.

javac does support a fair amount of pluggability, in the sense that you
can extend and replace the various phases and classes.

For Kawa you could do something like:
(1) Read the "header" of each Kawa source file, to determine the
name of the generated main class.
(2) Enter these class names into the javac tables as "uncompleted"
classes.
(3) Start compiling the Java files. When this requires the members
of the Kawa classes, switch to the Kawa files. From javac,
treat these as pre-compiled .class files. I.e. we treat the Kawa
compiler as a black box that produces Symbols in the same way as
reading class files.
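
The three steps above can be sketched as a class symbol whose members are filled in on first use. This is only an illustration of the on-demand idea: `LazyClassSymbol` is a hypothetical class, not javac's `Symbol`, and the completer callback stands in for "run the Kawa compiler on that file".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical class symbol that is entered by name only and completed lazily. */
class LazyClassSymbol {
    final String name;
    private Consumer<LazyClassSymbol> completer; // null once completion has started
    private final Map<String, String> members = new LinkedHashMap<>();

    LazyClassSymbol(String name, Consumer<LazyClassSymbol> completer) {
        this.name = name;
        this.completer = completer;
    }

    /** Called by the completer (the other language's compiler) to publish a member. */
    void addMember(String memberName, String descriptor) {
        members.put(memberName, descriptor);
    }

    boolean isCompleted() {
        return completer == null;
    }

    /** The first access triggers the completer; later accesses reuse the result. */
    Map<String, String> members() {
        if (completer != null) {
            Consumer<LazyClassSymbol> c = completer;
            completer = null; // cleared first so a cyclic lookup does not re-enter
            c.accept(this);
        }
        return members;
    }

    public static void main(String[] args) {
        LazyClassSymbol sym = new LazyClassSymbol("kawa.Main", s -> s.addMember("run", "()V"));
        System.out.println(sym.isCompleted());  // nothing compiled yet
        System.out.println(sym.members());      // completion happens here
    }
}
```

Clearing the completer before running it matters for cycles: if completing the Kawa class asks javac about B, and B asks about the Kawa class again, the second request sees a partially filled symbol instead of recursing.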

This approach may not immediately provide mixed-language support as
robust as would be ideal, but it is more amenable to incremental improvement
than a standalone stub-generator.

Ingo W.

Mar 5, 2014, 7:13:10 PM
to jvm-la...@googlegroups.com
This is a technical problem, but when you run into it, something is probably wrongly designed.
And maybe one can simply solve it by programming against interfaces.

Per Bothner

Mar 5, 2014, 7:28:18 PM
to jvm-la...@googlegroups.com
On 03/05/2014 04:13 PM, Ingo W. wrote:
> This is a technical problem, but when you run into it, something is
> probably wrongly designed.

Maybe. There are probably few cases where you need tightly-coupled
classes written in two languages. Especially with Kawa-Scheme and Java:
There is little to no performance *or* expressiveness advantage to
writing something in Java rather than Kawa-Scheme. It may be different
for languages that make it harder to write close-to-the-JVM code.

Which I guess is why it hasn't been a big priority.

Still, it would be easier if you didn't have to carefully partition
and order your source files to avoid inter-language cycles. Using
a single command to compile multiple languages is more convenient.

Jochen Theodorou

Mar 6, 2014, 2:09:36 AM
to jvm-la...@googlegroups.com
On 06.03.2014 01:13, Ingo W. wrote:
> This is a technical problem, but when you run into it, something is
> probably wrongly designed.
> And maybe one can simply solve it by programming against interfaces.

programming against interfaces works only if you (a) extract those, (b)
solve the problems with interfaces that are supposed to be solved by
inheritance, (c) forget about an inheritance in style of A-B-C, where B
is written in your alternative JVM language.

The goal is to be able to freely mix languages. You cannot reach that by
requiring people to program against interfaces only. Well, not if you want to
stay somewhat OO.

Programming against interfaces requires you to isolate the code parts
written in the other JVM language.

Ingo W.

Mar 6, 2014, 12:32:12 PM
to jvm-la...@googlegroups.com


On Thursday, March 6, 2014 at 08:09:36 UTC+1, blackdrag wrote:
> On 06.03.2014 01:13, Ingo W. wrote:
>> This is a technical problem, but when you run into it, something is
>> probably wrongly designed.
>> And maybe one can simply solve it by programming against interfaces.
>
> programming against interfaces works only if you (a) extract those, (b)
> solve the problems with interfaces that are supposed to be solved by
> inheritance, (c) forget about an inheritance in style of A-B-C, where B
> is written in your alternative JVM language.

In that case you simply compile A, then B, then C. Where is the problem? (It is good to have tools that find those dependencies and arrange the proper compilation order.)
I understood Per to mean that A needs B and B needs A, which is a code smell, but YMMV. But if and when you have such cases, you better have a language that produces Java source code, as John said.
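
A small sketch of the kind of tool mentioned above, assuming the class-level dependencies are already known. It uses Kahn's topological sort (my choice, not something named in the thread) to produce a compilation order and to report the classes that cannot be ordered because they sit on, or depend on, a cycle. `CompileOrder` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class CompileOrder {
    /**
     * deps maps each class to the classes it depends on.
     * Returns the classes that can be compiled in dependency order;
     * classes blocked by a cycle are added to stuck instead.
     */
    static List<String> order(Map<String, Set<String>> deps, Set<String> stuck) {
        Map<String, Integer> pending = new TreeMap<>();        // unmet dependency count
        Map<String, List<String>> dependents = new HashMap<>(); // reverse edges
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String d : e.getValue()) {
                pending.putIfAbsent(d, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(d, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String c = ready.remove();
            result.add(c);
            for (String user : dependents.getOrDefault(c, Collections.<String>emptyList())) {
                if (pending.merge(user, -1, Integer::sum) == 0) {
                    ready.add(user);
                }
            }
        }
        for (String c : pending.keySet()) {
            if (!result.contains(c)) {
                stuck.add(c); // on a cycle, or downstream of one
            }
        }
        return result;
    }
}
```

For the chain A, B, C this yields a plain order. Once B also refers to C, both B and C end up in `stuck`, and no ordering of separate compiler runs helps.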

Jochen Theodorou

Mar 6, 2014, 1:36:08 PM
to jvm-la...@googlegroups.com
On 06.03.2014 18:32, Ingo W. wrote:
>
>
> On Thursday, March 6, 2014 at 08:09:36 UTC+1, blackdrag wrote:
>
> On 06.03.2014 01:13, Ingo W. wrote:
> > This is a technical problem, but when you run into it, something is
> > probably wrongly designed.
> > And maybe one can simply solve it by programming against interfaces.
>
> programming against interfaces works only if you (a) extract those, (b)
> solve the problems with interfaces that are supposed to be solved by
> inheritance, (c) forget about an inheritance in style of A-B-C, where B
> is written in your alternative JVM language.
>
> In that case you simply compile A, then B, then C. Where is the problem?

First of all, that you have to split your code base into three parts
that are not really three parts. Second, what do you do if for example B
has a field of type C?

> (It is good to have tools that find those dependencies and arrange
> the proper compilation order.)
> I understood Per to mean that A needs B and B needs A, which is a code smell,
> but YMMV.

Circular dependencies are not all that rare. If you have a project with
a hundred or more classes I really doubt that you have no circular
dependencies between any classes at all.

> But if and when you have such cases, you better have a
> language that produces Java source code, as John said.

Maybe that's an option for Frege, but not for Groovy.

John Cowan

Mar 6, 2014, 1:39:01 PM
to jvm-la...@googlegroups.com

On Thu, Mar 6, 2014 at 1:36 PM, Jochen Theodorou <blac...@gmx.org> wrote:
>> But if and when you have such cases, you better have a
>> language that produces Java source code, as John said.
>
> Maybe that's an option for Frege, but not for Groovy.

Is there a technical reason why not?  Goto can always be simulated using while-loops, break, and continue, provided the bytecode isn't complete spaghetti.
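
To illustrate the goto point, here is a sketch of one general way to express arbitrary jumps in Java source: a dispatch loop with a label variable, where each `continue` plays the role of a goto. The three-label "bytecode" in the comments is invented for the example.

```java
class GotoSimulation {
    /*
     * Invented jump-style program being simulated:
     *   L0: if (b == 0) goto L2; else goto L1
     *   L1: t = a % b; a = b; b = t; goto L0
     *   L2: return a
     */
    static int gcd(int a, int b) {
        int label = 0;
        while (true) {
            switch (label) {
                case 0:
                    label = (b == 0) ? 2 : 1;
                    continue; // "goto" the chosen label
                case 1: {
                    int t = a % b;
                    a = b;
                    b = t;
                    label = 0;
                    continue; // "goto L0"
                }
                case 2:
                    return a;
                default:
                    throw new IllegalStateException("unknown label " + label);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18));
    }
}
```

This always works for control flow, but the output is far from idiomatic Java, and it does nothing for features that have no source-level equivalent at all.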

Ingo W.

Mar 6, 2014, 5:11:21 PM
to jvm-la...@googlegroups.com


On Thursday, March 6, 2014 at 19:36:08 UTC+1, blackdrag wrote:
> On 06.03.2014 18:32, Ingo W. wrote:
>
>>> programming against interfaces works only if you (a) extract those, (b)
>>> solve the problems with interfaces that are supposed to be solved by
>>> inheritance, (c) forget about an inheritance in style of A-B-C, where B
>>> is written in your alternative JVM language.
>>
>> In that case you simply compile A, then B, then C. Where is the problem?
>
> First of all, that you have to split your code base into three parts
> that are not really three parts. Second, what do you do if for example B
> has a field of type C?

This, of course, would be the case B <--> C. The A is immaterial.
 


>> (It is good to have tools that find those dependencies and arrange
>> the proper compilation order.)
>> I understood Per to mean that A needs B and B needs A, which is a code smell,
>> but YMMV.
>
> Circular dependencies are not all that rare. If you have a project with
> a hundred or more classes I really doubt that you have no circular
> dependencies between any classes at all.

This is only because Java makes this easy, or, to put it differently, makes it look unnatural to have more than one class in one source file.
Yet it is possible to achieve this by abusing the top-level class as a namespace in which other public static classes live.
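
A short sketch of that namespace idiom, with made-up class names: two mutually dependent classes live as public static members of one top-level class, so the cycle stays inside a single source file and a single compiler.

```java
import java.util.ArrayList;
import java.util.List;

/** Top-level class used purely as a namespace; it is never instantiated. */
class Model {
    private Model() {}

    public static class Parent {
        final List<Child> children = new ArrayList<>();

        /** Parent refers to Child ... */
        Child addChild(String name) {
            Child c = new Child(name, this);
            children.add(c);
            return c;
        }
    }

    public static class Child {
        final String name;
        final Parent parent; // ... and Child refers back to Parent

        Child(String name, Parent parent) {
            this.name = name;
            this.parent = parent;
        }
    }
}
```

Callers then write `Model.Parent` and `Model.Child`. Note that this keeps the cycle within one language; it does not help when Parent and Child are written in different languages.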

Jochen Theodorou

Mar 6, 2014, 6:01:00 PM
to jvm-la...@googlegroups.com
On 06.03.2014 19:39, John Cowan wrote:
[...]
> Is there a technical reason why not? Goto can always be simulated using
> while-loops, break, and continue, provided the bytecode isn't complete
> spaghetti.

What about invokedynamic?