Supporting Java compilation in ninja

Elazar Leibovich

Nov 6, 2011, 5:51:51 AM
to ninja-build
Java's compiler has an interesting feature: it requires all of the Java
source files in a given package in order to compile that package, and it
must compile all those files together.

For example, if I have a.java and b.java, I'll have to write

# javac a.java b.java

# find -name \*.class
a.class
b.class

However, it produces as many class files as source files in this phase.
Is there a way to support such dependency tracking in ninja now? (Why
not use Java's native build tools? Because on a huge C++ project with only
a few small Java interop classes, they add too much overhead.)

The best solution I could think of is to compile everything into a
single jar and never let ninja know there were class files. However, if
you actually need the class files, I see no solution.

Evan Martin

Nov 6, 2011, 1:30:29 PM
to ninja...@googlegroups.com
On Sun, Nov 6, 2011 at 2:51 AM, Elazar Leibovich <ela...@gmail.com> wrote:
> Java's compiler has an interesting feature. It requires all Java's
> source files in a certain package in order to compile it. It must
> compile all those files together.
>
> For example, if I have a.java and b.java, I'll have to write
>
> # javac a.java b.java
>
> # find -name \*.class
> a.class
> b.class
>
> However, it produce as many class files as source files in this phase.
> Is there a way to support such dependency tracking in ninja now?

Can you elaborate on the problem? It seems what you've described is a
build file that has a.java and b.java as inputs to a build line that
produces a.class and b.class.

(I'm sorry for not being too familiar with Java, so you might need to
explain seemingly-obvious details to me.)

Elazar Leibovich

Nov 6, 2011, 2:00:12 PM
to ninja...@googlegroups.com
On Sun, Nov 6, 2011 at 8:30 PM, Evan Martin <mar...@danga.com> wrote:

> Can you elaborate on the problem?  It seems what you've described is a
> build file that has a.java and b.java as inputs to a build line that
> produces a.class and b.class.
>
> (I'm sorry for not being too familiar with Java, so you might need to
> explain seemingly-obvious details to me.)


It might be that my question stems from my ignorance of ninja.

I'll try to elaborate. Generally speaking, given two Java source files, a.java and b.java, we would like to create two "binary" files for them, a.class and b.class.

If those were C++ files, we could do that with one ninja rule and two build statements:

rule javac
  command = javac $in

build a.class: javac a.java
build b.class: javac b.java

This will issue two commands

javac a.java
javac b.java

Alas this is not good enough!

Java must compile all java files in the current directory at once! It cannot compile just a.java, and then just b.java. It must compile both simultaneously.

The basic problem is that ninja assumes each command has a single output file, whereas the javac command has multiple output files: a .class file for each input file.

We could do

build a.class: javac a.java b.java

but then ninja wouldn't know about b.class.

We could do

build a.class: javac a.java b.java
build b.class: javac a.java b.java

but then ninja will issue javac *.java twice, when one invocation could do.

I hope I was clear enough.

Evan Martin

Nov 6, 2011, 2:52:06 PM
to ninja...@googlegroups.com
On Sun, Nov 6, 2011 at 11:00 AM, Elazar Leibovich <ela...@gmail.com> wrote:
> The basic problem is, that ninja assumes each command have a single output
> file,

Ah, here is the problem. This isn't correct. You can list multiple
files before the colon in a build statement.

> however the javac command have multiple ouput files. A .class file for
> each input file.
> We could do
> build a.class: javac a.java b.java
> but then ninja wouldn't know about b.class.
> We could do
> build a.class: javac a.java b.java
> build b.class: javac a.java b.java

build a.class b.class: javac a.java b.java


By the way, the code involving multiple outputs isn't exercised as
frequently, so there may be bugs in it. If you encounter any
surprising semantics with it, please ask whether the behavior is
intended before worrying too much about whether it makes sense.

Evan Jones

Nov 7, 2011, 10:28:11 AM
to ninja...@googlegroups.com
On Nov 6, 2011, at 14:00 , Elazar Leibovich wrote:
> Java must compile all java files in the current directory at once! It cannot compile just a.java, and then just b.java. It must compile both simultaneously.

It's been a while since I've fought with Java compiles, but I don't recall this being true. I just tested it really quickly, and it *does* allow you to just compile one .java file in a directory (or at least it did in my tiny test). Are there some additional options or something that I need in order for javac to have this behaviour?


HOWEVER: javac does (by default) implicitly compile any dependencies. I created a class One, and a class Two. Class Two uses One. If I compile One, it compiles that single class. If I compile Two, it compiles both:

Yamnuska:~ ej$ javac Two.java
Yamnuska:~ ej$ ls *.class
One.class Two.class

This probably makes writing ninja files tricky, as you ideally need to get this dependency information, and looking at javac -help doesn't make it clear that there is any easy way to do that (maybe by parsing the output from -verbose? but -verbose is very verbose, so that seems unfortunate).

Worse: My understanding is that javac performs MUCH better if you give it all the .java files to compile at once, as it then only loads and parses each file once. This would be tricky to do with Ninja, at least without writing a bunch of additional support code.


My conclusion: you can probably use ninja to build java code correctly, but it probably will involve writing some additional tools / scripts to do it well.


Evan

--
http://evanjones.ca/

Evan Martin

Nov 7, 2011, 12:25:29 PM
to ninja...@googlegroups.com
On Mon, Nov 7, 2011 at 7:28 AM, Evan Jones <ev...@csail.mit.edu> wrote:
> My conclusion: you can probably use ninja to build java code correctly, but it probably will involve writing some additional tools / scripts to do it well.

Well, using this:
https://github.com/martine/ninja/blob/master/misc/ninja_syntax.py

It'd be something like this:

import ninja_syntax

n = ninja_syntax.Writer(open('build.ninja', 'w'))
n.rule('javac', command='javac $in', description='JAVAC $in')

def javac(*basenames):
    # One build statement: every .class is an output, every .java an input.
    return n.build([b + '.class' for b in basenames], 'javac',
                   [b + '.java' for b in basenames])

That would then let you write the rest of your build with statements like:
classfiles = javac('Foo', 'Bar', 'Baz')
exe = ...some other function that build executables...(classfiles)
Where Foo/Bar/Baz are the names of your source .java files.

Doesn't seem so bad to me. I guess javac is likely doing some of the
same work as ninja again to determine which files to actually build,
but javac's problem is much simpler and smaller than ninja's and that
info is likely to be in cache anyway, so it shouldn't cost much.

Elazar Leibovich

Nov 7, 2011, 4:02:02 PM
to ninja...@googlegroups.com
On Mon, Nov 7, 2011 at 5:28 PM, Evan Jones <ev...@csail.mit.edu> wrote:
> On Nov 6, 2011, at 14:00 , Elazar Leibovich wrote:
> > Java must compile all java files in the current directory at once! It cannot compile just a.java, and then just b.java. It must compile both simultaneously.
>
> It's been a while since I've fought with Java compiles, but I don't recall this being true. I just tested it really quickly, and it *does* allow you to just compile one .java file in a directory (or at least it did in my tiny test). Are there some additional options or something that I need in order for javac to have this behaviour?

Let me show you my testbed. Maybe it depends on Java version or vendor. And maybe it's my lack of Java expertise. However:

elazar@elazar-laptop:~/dev/java$ cat a.java
class A {}
elazar@elazar-laptop:~/dev/java$ cat b.java
class B {
A a;
}
elazar@elazar-laptop:~/dev/java$ # when compiling only b.java - we get an error
elazar@elazar-laptop:~/dev/java$ # a.java is not automatically fetched!
elazar@elazar-laptop:~/dev/java$ javac b.java
b.java:2: cannot find symbol
symbol  : class A
location: class B
A a;
^
1 error
elazar@elazar-laptop:~/dev/java$ # However when compiling both files - it works
elazar@elazar-laptop:~/dev/java$ javac a.java b.java
elazar@elazar-laptop:~/dev/java$ ls *class
A.class  B.class

Maybe what you did was first compile the standalone class a.java and then compile b.java. This would work, since A.class is already available when compiling b.java. However, what would you do when we have a cyclic dependency?

Can you please show me how you managed to compile a.java, which is "class A{B b;}", and b.java, which is "class B{A a;}", without including both on the javac command line?




> HOWEVER: javac does (by default) implicitly compile any dependencies. I created a class One, and a class Two. Class Two uses One. If I compile One, it compiles that single class. If I compile Two, it compiles both:

As I showed you, it didn't happen in my version, but maybe I'm doing something wrong.

> My conclusion: you can probably use ninja to build java code correctly, but it probably will involve writing some additional tools / scripts to do it well.

The trick is, as Evan showed, to ignore the interdependence between java files and just have all the *.class files in a package depend on all the *.java files in that package. I'll show you how I did that once I'm done, if you're interested.
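
For what it's worth, a rough sketch of that per-package idea in Python: one build statement per directory, with every .class file as an output and every .java file as an input. The helper name `statements_by_package` is made up for illustration, and the sketch assumes one top-level class per source file, one directory per package, and javac run without -d so the .class files land next to the sources. Nested classes (which produce extra Outer$Inner.class files) are not accounted for.

```python
import os
from collections import defaultdict


def statements_by_package(java_paths):
    """Return one ninja build statement per directory of .java files.

    Hypothetical helper: assumes X.java compiles to X.class in the same
    directory, and that a 'javac' rule is defined elsewhere in build.ninja.
    """
    groups = defaultdict(list)
    for path in sorted(java_paths):
        groups[os.path.dirname(path)].append(path)

    lines = []
    for directory in sorted(groups):
        sources = groups[directory]
        outputs = [os.path.splitext(s)[0] + ".class" for s in sources]
        # All outputs go before the colon, so ninja runs javac once per package.
        lines.append("build {}: javac {}".format(" ".join(outputs),
                                                 " ".join(sources)))
    return lines


for line in statements_by_package(["p/B.java", "p/A.java", "q/C.java"]):
    print(line)
```

Paths containing spaces, colons or dollar signs would need ninja's $-escaping, which this sketch leaves out.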
 

Thanks for the input!

Rachel Blum

Nov 7, 2011, 4:16:59 PM
to ninja...@googlegroups.com
That's happening because your test setup has the wrong file names :)

If you have a 'class A' it should be in 'A.java', not 'a.java' - names are case-sensitive. If the file has the right case, javac will pull in the necessary dependencies automatically.
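
A quick way to spot this kind of mismatch before javac does, sketched in Python. The helper names are hypothetical and the regex is deliberately simplistic: it only looks at the first top-level type declaration and ignores comments, annotations and generics.

```python
import os
import re

# Simplified pattern for the first top-level class/interface/enum declaration.
_TYPE_DECL = re.compile(
    r"^\s*(?:(?:public|final|abstract)\s+)*(?:class|interface|enum)\s+(\w+)",
    re.MULTILINE,
)


def expected_filename(source_text):
    """Return 'Name.java' for the first declared type, or None if none is found."""
    match = _TYPE_DECL.search(source_text)
    return match.group(1) + ".java" if match else None


def name_mismatch(path, source_text):
    """Return the file name the source should have, or None if it already matches."""
    expected = expected_filename(source_text)
    if expected is None or expected == os.path.basename(path):
        return None
    return expected


print(name_mismatch("a.java", "class A {}"))  # prints A.java
```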

Rachel

Elazar Leibovich

Nov 7, 2011, 11:38:22 PM
to ninja...@googlegroups.com
On Mon, Nov 7, 2011 at 11:16 PM, Rachel Blum <gr...@chromium.org> wrote:
> That's happening because your test setup has the wrong file names :)

Oops. Indeed, IDEs rot the mind. Being shielded from javac by Eclipse and IntelliJ made me forget this fact.

But I'll tell you why I missed that. When Eclipse sees that the class name doesn't match the filename, it complains and refuses to compile the file (the public class Foo must be defined in its own file). I figured that javac would do the same, so I thought that if I was able to compile those files at all, their names must be fine.

Thanks Rachel, and I apologize for the misinformation I spread.

Evan Jones

Nov 8, 2011, 8:48:22 AM
to ninja...@googlegroups.com
On Nov 7, 2011, at 12:25 , Evan Martin wrote:
> That would then let you write the rest of your build with statements like:
> classfiles = javac('Foo', 'Bar', 'Baz')
> exe = ...some other function that build executables...(classfiles)
> Where Foo/Bar/Baz are the names of your source .java files.


If I understand your proposal correctly here, this would generate a single invocation of javac with all .java files on the command line, right? That's fine for full builds, but not ideal for incremental rebuilds. My understanding is that ideally for incremental Java rebuilds, you want a two-pass system:

a) Traverse the dependency graph to determine which .class files are out of date (identically to what would happen for C++)

b) Issue a single javac command line with the out-of-date .java files.


(optional): Parallelize (b) by splitting the list (intelligently?) into pieces according to the number of available CPUs. However, javac is fast enough that I haven't needed to do this on any of my projects yet.

Ninja's design makes (a) relatively easy, but (b) is still hard.
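
To make (b) and the optional parallel step concrete, here is a minimal Python sketch that turns a list of out-of-date sources into one javac command line per CPU. The names `split_evenly` and `javac_commands` are invented for illustration, and the list of stale sources is assumed to come from step (a). The split is naive rather than intelligent.

```python
import os


def split_evenly(items, parts):
    """Split items into at most `parts` contiguous chunks of near-equal size."""
    parts = max(1, min(parts, len(items)))
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def javac_commands(stale_sources, jobs=None):
    """Return a list of javac argv lists covering every stale source once."""
    if not stale_sources:
        return []
    jobs = jobs or os.cpu_count() or 1
    return [["javac"] + chunk for chunk in split_evenly(stale_sources, jobs)]


print(javac_commands(["A.java", "B.java", "C.java"], jobs=2))
```

One caveat with a naive split like this: since javac implicitly compiles dependencies, two chunks may end up compiling the same class, which is presumably why the split would need to be done intelligently.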


You can also run javac as a server, which would make it more efficient with Ninja's typical "build one file at a time" mode.


I may actually be working with Java again in the next week or two, so I might end up trying to get ninja to play nice with Java.

Evan

--
http://evanjones.ca/

Evan Martin

Nov 8, 2011, 10:47:56 AM
to ninja...@googlegroups.com
On Tue, Nov 8, 2011 at 5:48 AM, Evan Jones <ev...@csail.mit.edu> wrote:
> If I understand your proposal correctly here, this would generate a single invocation of javac with all .java files on the command line, right? That's fine for full builds, but not ideal for incremental rebuilds. My understanding is that ideally for incremental Java rebuilds, you want a two-pass system:
>
> a) Traverse the dependency graph to determine which .class files are out of date (identically to what would happen for C++)
>
> b) Issue a single javac command line with the out-of-date .java files.

Hm, interesting. I suppose we could expose an $in_dirty that is the
subset of $in that we think needs rebuilding...

...oh, but wait, it's the *output* files that are dirty. So we'd need
a way to map from those back to the corresponding input names. I
really don't want to go down the path of a weak programming language.
Heh, something like
javac `echo $in_dirty | sed -e 's/\.class/.java/g'`
might work, though not on Windows.

> (optional): Parallelize (b) by splitting the list (intelligently?) into pieces according to the number of available CPUs. However, javac is fast enough that I haven't needed to do this on any of my projects yet.

This one is definitely harder. It sorta reminds me of the problem of
splitting the list of input files when the command line is too long.
I wonder if there's a way to provide both in some nice simple way.

> You can also run javac as a server, which would make it more efficient with Ninja's typical "build one file at a time" mode.
>
>
> I may actually be working with Java again in the next week or two, so I might end up trying to get ninja to play nice with Java.

How much does it matter, in the end? Does javac not do a "compare
output file time to input file time and only build if necessary"
decision? I wonder if another way around all of this is to use a
javac wrapper that does the above computations for you...

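import os
import subprocess
import sys

# Arguments are the expected .class outputs; collect the sources that are newer.
to_rebuild = []
for dst in sys.argv[1:]:
    src = os.path.splitext(dst)[0] + '.java'
    # A missing .class file also counts as out of date.
    if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
        to_rebuild.append(src)
if to_rebuild:
    subprocess.check_call(['javac'] + to_rebuild)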

However, getting it to parallelize well would require exposing some
Ninja internals. My first guess would be to examine or implement the
make jobserver API [1], but I would hope there was something simpler
available. Like maybe Ninja could support a helper command emitting a
list of command lines that need to be run and then it would enqueue
those into the existing system of running them in parallel.

[1] http://mad-scientist.net/make/jobserver.html
