Incremental Java Compile

Joshua Maurice

May 1, 2010, 7:34:31 AM
I'm back. I've been learning a lot over the last couple of months,
toying around with solutions, and realizing the inadequacies of each
approach I've tried.

From a high level perspective, I realized what I want from a build
system: A developer should be able to do any combination of the
following actions and trigger a small / minimal / incremental build,
and it should be correct without any corner cases of incorrectness.
1- Add, remove, and/or edit a source file, such as a Java file, cpp
file, etc.
2- Add, remove, and/or edit a build system script file to invoke a
standard rule, such as adding a new jar to the build, modifying a
classpath of an existing jar, removing a jar from the build, etc.

Specifically, some actions would require a full clean build:
1- If the logic which tracks incremental dependencies is changed, then
the developer must do a full clean.
2- If any other tool, such as the javac version, changes, then the
developer must do a full clean.
These are akin to modifying the build system itself, which I understand
has very hard "backwards compatibility" issues. That's outside the
scope of what I want. However, the aforementioned activities of
messing with source files, and invoking build system macros / rules to
create new standard binaries should "just work", and it should
"just work" quickly.

So, I've been trying to do this for Java. Man this is actually quite
hard, harder the more I learn. I think I finally "struck gold" when I
found this wonderful paper here:
http://www.jot.fm/issues/issue_2004_12/article4.pdf
For build systems focusing on Java, I rank it as just as important as
Recursive Make Considered Harmful.

However, the paper assumes that the build will "cascade", or recompile
everything downstream. I am trying very hard to avoid this if
possible, to get a much smaller rebuild without writing my own Java
compiler ala Eclipse. I think my current solution in my head will
work, using a combination of
1- Ghost Dependencies
2- Each class file depends on the last not-no-op build of all
previously used class files from the last build.

I finally finished an implementation of part 1. Part 2 is much easier
if I rely on Javac's verbose output, but to do that I need to do a
compile up front, passing all the out of date java files, and then a
separate compile per java file to get useful information from Javac's
output, to know exactly which classes were loaded for which java file.

So, I post here because I feel better prepared to discuss this
subject. I still disagree that "build from clean" is the correct
answer. That would make our product's build still around ~25 minutes
for just the Java compilation of around ~20,000 source files (and
growing). There must / should be something better. Separate
translation units make so much sense. I just wish Java had them.

And yes, we're also working on "componentizing" to some degree, but
when all of the components are under active development, I would still
very much like builds to be as fast as possible to do regular
integration tests on an automated build machine.

So, anyone know a quick and easy way to get the list of class files
loaded during a compile, and know exactly for which subset of java
files in the compile each class file is needed? Invoking a separate
Javac after the fact (using the tools.jar analyze API) almost quadrupled my
overall from-clean build time for a subset of real code in my
company's codebase.
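
One possible way to observe which classes javac pulls in, assuming the
javax.tools compiler API (JDK 6+), is to wrap the compiler's standard
file manager and record every class file it asks for. This is only a
sketch: it gives a per-invocation list, not the per-java-file breakdown
asked about above, and exactly which file-manager hooks javac uses to
open class files should be verified (overriding list() and
inferBinaryName() may be needed as well).

    import javax.tools.*;
    import java.io.IOException;
    import java.util.*;

    public class ClassLoadLoggingCompile {
        public static void main(String[] args) throws IOException {
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            StandardJavaFileManager std =
                    compiler.getStandardFileManager(null, null, null);
            final Set<String> loaded = new TreeSet<String>();

            // Wrap the file manager; record each class file the compiler requests.
            JavaFileManager recording =
                    new ForwardingJavaFileManager<StandardJavaFileManager>(std) {
                @Override
                public JavaFileObject getJavaFileForInput(Location location,
                        String className, JavaFileObject.Kind kind) throws IOException {
                    if (kind == JavaFileObject.Kind.CLASS) {
                        loaded.add(className);
                    }
                    return super.getJavaFileForInput(location, className, kind);
                }
            };

            // args: the .java files to compile in this invocation.
            Iterable<? extends JavaFileObject> units = std.getJavaFileObjects(args);
            compiler.getTask(null, recording, null, null, null, units).call();

            for (String name : loaded) {
                System.out.println("loaded: " + name);
            }
        }
    }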

Lew

May 1, 2010, 10:38:27 AM
Joshua Maurice wrote:
> So, I post here because I feel better prepared to discuss this
> subject. I still disagree that "build from clean" is the correct
> answer. That would make our product's build still around ~25 minutes
> for just the Java compilation of around ~20,000 source files (and
> growing). There must / should be something better. Separate
> translation units make so much sense. I just wish Java had them.

What, you never heard of JAR files?

There's no excuse for "build clean" having to touch all 20K files.

--
Lew

markspace

May 1, 2010, 12:02:09 PM
Joshua Maurice wrote:

> So, I post here because I feel better prepared to discuss this
> subject. I still disagree that "build from clean" is the correct
> answer.


Ant can do most kinds of Java source dependencies for you:

<http://ant.apache.org/manual/OptionalTasks/depend.html>

Mike Schilling

May 1, 2010, 12:51:32 PM

"The most obvious example of these limitations is that the task can't tell
which classes to recompile when a constant primitive data type exported by
other classes is changed. For example, a change in the definition of
something like
public final class Constants {
public final static boolean DEBUG=false;
}

will not be picked up by other classes. "

That is, it's an incremental (no pun intended) improvement on the usual Ant
algorithm of "recompile what onviouslt needs recompilation; if that doesn't
seem to work, do a clean build"
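
To make the quoted limitation concrete with a hypothetical client class
(the constant's value, not a symbolic reference to Constants, ends up in
the client's bytecode):

    public final class Constants {
        public final static boolean DEBUG = false;
    }

    class Logger {
        void log(String msg) {
            // javac folds Constants.DEBUG into this class at compile time:
            // the compiled Logger.class carries the literal value (here the
            // whole branch is even dropped as dead code) and holds no
            // reference to Constants, so "javap -c Logger" shows no getstatic
            // for DEBUG. Changing DEBUG later leaves Logger.class looking
            // perfectly up to date to any timestamp-based tool.
            if (Constants.DEBUG) {
                System.out.println(msg);
            }
        }
    }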


Lew

May 1, 2010, 1:43:57 PM
Mike Schilling wrote:
> "The most obvious example of these limitations is that the task can't tell
> which classes to recompile when a constant primitive data type exported by
> other classes is changed. For example, a change in the definition of
> something like
> public final class Constants {
> public final static boolean DEBUG=false;
> }
>
> will not be picked up by other classes. "
>
> That is, it's an incremental (no pun intended) improvement on the usual Ant
> algorithm of "recompile what onviouslt needs recompilation; if that doesn't
> seem to work, do a clean build"

You can't blame Ant for that one. The class that depends on the compile-time
constant, such as 'DEBUG' in your example, compiles the constant into its
class, not the symbol. Without some external indication of the dependency,
there's not any way for a compiler or build tool to detect that it exists.

With respect to dependencies where the symbol is stored in the class rather
than its value, even 'javac' handles the situation pretty well.

--
Lew

Mike Schilling

May 1, 2010, 1:55:35 PM
Lew wrote:
> Mike Schilling wrote:
>> "The most obvious example of these limitations is that the task
>> can't tell which classes to recompile when a constant primitive data
>> type exported by other classes is changed. For example, a change in
>> the definition of something like
>> public final class Constants {
>> public final static boolean DEBUG=false;
>> }
>>
>> will not be picked up by other classes. "
>>
>> That is, it's an incremental (no pun intended) improvement on the
>> usual Ant algorithm of "recompile what obviously needs
>> recompilation; if that doesn't seem to work, do a clean build"
>
> You can't blame Ant for that one.

True; it's the design of Java that doesn't lend itself to calculating
dependencies with a reasonable amount of effort.

> The class that depends on the
> compile-time constant, such as 'DEBUG' in your example, compiles the
> constant into its class, not the symbol. Without some external
> indication of the dependency, there's not any way for a compiler or
> build tool to detect that it exists.

Other than by noting it when the symbol is used during compilation, and
storing that bit of information somewhere. But the details of that get
messy.

>
> With respect to dependencies where the symbol is stored in the class
> rather than its value, even 'javac' handles the situation pretty well.

There are other difficult cases, like a method being added in class A that
results in a method in B (one of A's descendants) becoming overloaded, such
that a client of B should now choose the new overload. Java really makes
this stuff hard. (C# is no better.)


Lew

May 1, 2010, 2:24:17 PM
Lew wrote:
>> The class that depends on the
>> compile-time constant, such as 'DEBUG' in your example, compiles the
>> constant into its class, not the symbol. Without some external
>> indication of the dependency, there's not any way for a compiler or
>> build tool to detect that it exists.

Mike Schilling wrote:
> Other than by noting it when the symbol is used during compilation, and
> storing that bit of information somewhere. But the details of that get
> messy.

I said that there is no way, not that there couldn't be a way.

Lew wrote:
>> With respect to dependencies where the symbol is stored in the class
>> rather than its value, even 'javac' handles the situation pretty well.

Mike Schilling wrote:
> There are other difficult cases, like a method being added in class A that
> results in a method in B (one of A's descendents) becoming overloaded, such
> that a client of B should now choose the new overload. Java really makes
> this stuff hard. (C# is no better.)

How would any language handle this?

Short of a class being aware of every possible past, present and future
extension of it.

You do present a good argument against overuse of inheritance.

--
Lew

markspace

May 1, 2010, 2:45:43 PM
Lew wrote:
>
> How would any language handle this?

Class A exports a method:

public class A {
public void m( Object o ) {}
}

which B uses:

public class B {
public void b(A a) { a.m( "Hello" ); }
}

Now B has a dependency on A. If A changes, for any reason:

public class A {
public void m( Object o ) {}
public void m( String s ) {}
}

or:

public class A {
public void x( Object o ) {}
}


then B has to be recompiled. That's pretty standard stuff, I think. There
really isn't any need to detect an overloaded method, this simple
dependency graph catches it, and many other cases too.

Mike Schilling

May 1, 2010, 3:08:02 PM

The overload is a subtle case, because the added method isn't actually used
by anyone, so dependencies at the granularity of method usage won't catch
it. (Unless you conflate all overloads as being "the same method", which is
probably a good idea.)


Mike Schilling

May 1, 2010, 3:05:55 PM
Lew wrote:
> Lew wrote:
>>> The class that depends on the
>>> compile-time constant, such as 'DEBUG' in your example, compiles the
>>> constant into its class, not the symbol. Without some external
>>> indication of the dependency, there's not any way for a compiler or
>>> build tool to detect that it exists.
>
> Mike Schilling wrote:
>> Other than by noting it when the symbol is used during compilation,
>> and storing that bit of information somewhere. But the details of
>> that get messy.
>
> I said that there is no way, not that there couldn't be a way.
>
> Lew wrote:
>>> With respect to dependencies where the symbol is stored in the class
>>> rather than its value, even 'javac' handles the situation pretty
>>> well.
>
> Mike Schilling wrote:
>> There are other difficult cases, like a method being added in class
>> A that results in a method in B (one of A's descendents) becoming
>> overloaded, such that a client of B should now choose the new
>> overload. Java really makes this stuff hard. (C# is no better.)
>
> How would any language handle this?

Languages that store the defintion of a class in a different file than its
implementation (e.g. C++) handle it by simple comparisons of file dates.
It's also possible to do this by having the compiler update a repository of
class definitions (I used to develop a system that did just that.)

>
> Short of a class being aware of every possible past, present and
> future extension of it.

You'd need to do it the other way around -- have the client of B ask B if
its definition had changed, and have B in turn ask A.


Andreas Leitgeb

May 2, 2010, 9:17:43 AM
Mike Schilling <mscotts...@hotmail.com> wrote:
>> Now B has a dependency on A. If A changes, for any reason: [...]

>> then B has to be recompiled. That's pretty standard stuff, I think. There
>> really isn't any need to detect an overloaded method, this
>> simple dependency graph catches it, and many other cases too.
> The overload is a subtle case, because the added method isn't actually used
> by anyone, so dependencies at the granularity of method usage won't catch
> it. (Unless you conflate all overloads as being "the same method", which is
> probably a good idea.)

Or, somewhat finer-grained: conflate all overloaded methods that take the
same number of arguments - with some special reasoning about varargs...

Another approach was: Maintain a database of each .class's API,
and if, after an incremental build, any recompiled class has a
changed API, or any class was removed, or added, then do a clean
build. Insert "non-private" wherever you find it appropriate.
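
A rough sketch of that database idea, assuming the freshly compiled
classes can be loaded and inspected with reflection (a real tool would
read the class files directly, and would also have to record constant
values, since inlined constants leave no symbolic reference behind):

    import java.lang.reflect.*;
    import java.util.*;

    public class ApiFingerprint {
        // Collect the non-private member signatures of a class. Store one
        // fingerprint per class after each incremental build; if any
        // recompiled class's fingerprint differs from the stored one, or a
        // class appeared or disappeared, fall back to a clean build.
        public static String of(Class<?> c) {
            SortedSet<String> sigs = new TreeSet<String>();
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isPrivate(f.getModifiers())) {
                    sigs.add("field " + Modifier.toString(f.getModifiers()) + " "
                            + f.getType().getName() + " " + f.getName());
                }
            }
            for (Method m : c.getDeclaredMethods()) {
                if (!Modifier.isPrivate(m.getModifiers())) {
                    sigs.add("method " + Modifier.toString(m.getModifiers()) + " "
                            + m.getReturnType().getName() + " " + m.getName()
                            + Arrays.toString(m.getParameterTypes()));
                }
            }
            for (Constructor<?> k : c.getDeclaredConstructors()) {
                if (!Modifier.isPrivate(k.getModifiers())) {
                    sigs.add("ctor " + Modifier.toString(k.getModifiers())
                            + Arrays.toString(k.getParameterTypes()));
                }
            }
            return sigs.toString();
        }
    }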

Joshua Maurice

May 2, 2010, 5:08:16 PM

And what if all of the code is under active development, aka new
features are being added to each layer on a weekly basis?

And what if a large portion of that Java code is generated from a
model file to facilitate serialization between C++ and Java? Thus a
change to a single file would require recompiling a large amount of
generated "interface" files, which theoretically touches a large
portion of the 20,000 Java files.

And that's still no excuse for not having an incremental compile. Even
if componentized, with a full clean build every time, that could be 5
to 10 minutes of my work lost for every compile, which is just wasted
time.

Joshua Maurice

May 2, 2010, 5:09:53 PM

Please read the paper in my opening post. Then, consider that a build
machine cannot (easily) distinguish between "most" and "all". I want a
fully correct incremental build. A 90% correct incremental build is
useful to nearly no one, as you would do many sanity-check clean
builds.

Joshua Maurice

May 2, 2010, 5:11:41 PM

I'm not blaming anyone in particular. I just want to know how to get a
fully correct, aka 100% incremental build under the actions: adding,
removing, modifying java files, and adding, removing, or modifying
build steps of "take these jars, compile them to class files, then jar
them", aka the standard developer actions.

Joshua Maurice

May 2, 2010, 5:12:35 PM
On May 1, 11:24 am, Lew <no...@lewscanon.com> wrote:

Read the paper in my opening post, Ghost Dependencies.

Joshua Maurice

May 2, 2010, 5:15:29 PM
On May 1, 12:05 pm, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:

Actually no. It's not that. It's that the search path lookup is
handled in a fundamentally different way. In Java, any piece of code
anywhere in the file can result in a file system lookup, and this
lookup is "ambiguous", where I mean subject to change, or depends on
context. See the paper in my opening post Ghost Dependencies.

C++ gets around this by having 2 separate compilation steps, the
preprocessor, and the compiler proper. The preprocessor has a very
simple, well-defined lookup process which is not context dependent or
"ambiguous", unlike Java's lookup process. The preprocessor produces a
single file which the compiler proper takes and produces a single
output file. The entire thing is much less black box and much less
complex than Java's classpath lookup which makes it much easier to
produce a correct incremental build.

Joshua Maurice

May 2, 2010, 5:16:37 PM
On May 2, 6:17 am, Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at>
wrote:

This would work, but it's "overkill", and not very incremental. I was
hoping for a much more "minimal" rebuild.

Lew

May 2, 2010, 6:13:15 PM
Joshua Maurice wrote:
>>> So, I post here because I feel better prepared to discuss this
>>> subject. I still disagree that "build from clean" is the correct
>>> answer. That would make our product's build still around ~25 minutes
>>> for just the Java compilation of around ~20,000 source files (and
>>> growing). There must / should be something better. Separate
>>> translation units make so much sense. I just wish Java had them.

Lew wrote:
>> What, you never heard of JAR files?
>>
>> There's no excuse for "build clean" having to touch all 20K files.

Joshua Maurice wrote:
> And what if all of the code is under active development, aka new
> features are being added to each layer on a weekly basis?

You have not designed your system in a very modular way.

--
Lew

Joshua Maurice

May 2, 2010, 7:09:19 PM

Agreed. Sadly, as I'm a more junior developer, not much I can do about
it for such a large codebase, a fair share of which predates C++98
standardization.

Arne Vajhøj

May 2, 2010, 9:21:17 PM

To me the entire idea is rather pointless.

The tool can not be made to work with binaries.

It should be possible to do it working with source code.

But it would require a huge effort to create a 100% working tool.

You could solve your build problems for much less effort
by working on the structure of the project.

The project does not provide bang for the buck.

Arne

Arne Vajhøj

May 2, 2010, 9:22:18 PM

It would give more benefits for the effort to try and clean
up things.

Arne

Arne Vajhøj

May 2, 2010, 9:27:14 PM
On 02-05-2010 17:08, Joshua Maurice wrote:
> On May 1, 7:38 am, Lew<no...@lewscanon.com> wrote:
>> Joshua Maurice wrote:
>>> So, I post here because I feel better prepared to discuss this
>>> subject. I still disagree that "build from clean" is the correct
>>> answer. That would make our product's build still around ~25 minutes
>>> for just the Java compilation of around ~20,000 source files (and
>>> growing). There must / should be something better. Separate
>>> translation units make so much sense. I just wish Java had them.
>>
>> What, you never heard of JAR files?
>>
>> There's no excuse for "build clean" having to touch all 20K files.
>
> And what if all of the code is under active development, aka new
> features are being added to each layer on a weekly basis?

Still not a good excuse.

Each team/stream/whatever you call it should work on their own
component against a stable binary release of all other components.

> And what if a large portion of that Java code is generated from a
> model file to facilitate serialization between C++ and Java? Thus a
> change to a single file would require recompiling a large amount of
> generated "interface" files, which theoretically touches a large
> portion of the 20,000 Java files.

Structure things better.

With a good OO model a series of related changes should not
require changes in "a large portion of 20000 Java files".

Arne

Joshua Maurice

May 3, 2010, 12:50:10 AM
On May 2, 6:27 pm, Arne Vajhøj <a...@vajhoej.dk> wrote:
> Structure things better.
>
> With a good OO model a series of related changes should not
> require changes in "a large portion of 20000 Java files".

So, how do you suggest doing that when there's a code generator under
active development which generates Java code, and a large portion of
the Java code directly or indirectly works with the output of this
code generator? We model the object domains in a simple modeling
language which is then compiled to C++ and Java code to allow
serializing a description of a unit of work from the Java tools to the
C++ tools and back. Most of the infrastructure and apps work with the
output of this code generator in some form or another.

Unfortunately, one cannot fiat interfaces into being stable.

Mike Schilling

May 3, 2010, 2:54:50 AM

I think it could. At least, reducing the problem to:

* You have a complete correct build of the system
* You have a set of changes to the source since that build was done
* Do a minimal (more or less) amount of recompilation to arrive at a new
complete and correct build

is feasible.

But I don't see the point, really. It's simpler, cheaper, and more reliable
to throw hardware at the problem. Buy a fast machine to do a continuous
build and archive the last N days worth of builds. You can now fetch a
completely built system at any release level with no compilation required at
all.


Joshua Maurice

May 3, 2010, 4:06:58 AM
On May 2, 11:54 pm, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:

My company tried to do that, but I think they missed the important
part of the memo: that it only works when the code is decoupled,
modular, and has relatively stable and well-defined interfaces, instead of
the ~25,000 source file mess we have now. It's made me really hate
Maven (~800 poms and counting), though I accept it may have situations
in which it's a decent build tool.

Lew

May 3, 2010, 7:38:07 AM
On 05/03/2010 12:50 AM, Joshua Maurice wrote:

Arne Vajhøj wrote:
>> Structure things better.
>>
>> With a good OO model a series of related changes should not
>> require changes in "a large portion of 20000 Java files".


> So, how do you suggest doing that when there's a code generator under
> active development which generates Java code, and a large portion of
> the Java code directly or indirectly works with the output of this
> code generator? We model the object domains in a simple modeling

Generate code into packages. Generate different parts of the project into
separate modules.


> language which is then compiled to C++ and Java code to allow
> serializing a description of a unit of work from the Java tools to the
> C++ tools and back. Most of the infrastructure and apps work with the
> output of this code generator in some form or another.

The generator can be forced to follow good practices, rather than have bad
practices use "the generator" as an excuse.

> Unfortunately, one cannot fiat interfaces into being stable.

But one can *design* interfaces to be modular. Try it.

--
Lew

Mike Schilling

May 3, 2010, 9:26:15 AM
Joshua Maurice wrote:
> On May 2, 11:54 pm, "Mike Schilling" <mscottschill...@hotmail.com>
> wrote:

Why doesn't it work, even with the mess you have now?


Roedy Green

May 3, 2010, 12:28:36 PM
On Sat, 1 May 2010 04:34:31 -0700 (PDT), Joshua Maurice
<joshua...@gmail.com> wrote, quoted or indirectly quoted someone
who said :

>system: A developer should be able to do any combination of the
>following actions and trigger a small / minimal / incremental build,

IF you use ANT, you don't need to bother with this. The time in a
traditional compile is mostly loading Javac.exe. With ANT it gets
compiled only once. Further JAVAC looks at dates of *.java and
*.class files and avoids most unnecessary recompilation.

Compiling is almost inconsequential. Building Jars and Zips takes
much more of the time.

See http://mindprod.com/jgloss/ant.html
--
Roedy Green Canadian Mind Products
http://mindprod.com

It's amazing how much structure natural languages have when you consider who speaks them and how they evolved.

Joshua Maurice

May 3, 2010, 2:29:58 PM
On May 3, 4:38 am, Lew <no...@lewscanon.com> wrote:
> On 05/03/2010 12:50 AM, Joshua Maurice wrote:
>
> Arne Vajhøj wrote:
> >> Structure things better.
>
> >> With a good OO model a series of related changes should not
> >> require changes in "a large portion of 20000 Java files".
> > So, how do you suggest doing that when there's a code generator under
> > active development which generates Java code, and a large portion of
> > the Java code directly or indirectly works with the output of this
> > code generator? We model the object domains in a simple modeling
>
> Generate code into packages.  Generate different parts of the project into
> separate modules.
>
> > language which is then compiled to C++ and Java code to allow
> > serializing a description of a unit of work from the Java tools to the
> > C++ tools and back. Most of the infrastructure and apps work with the
> > output of this code generator in some form or another.
>
> The generator can be forced to follow good practices, rather than have bad
> practices use "the generator" as an excuse.

So, I ask again: what if the generator changes, which it does
"somewhat" frequently? I'd like to do a build in that case. The
generator is an example of what ties all of the code together, though
there's a couple more things. What's good practices for the generator,
never change? Well, ideally yes, but that's beyond my control.

> > Unfortunately, one cannot fiat interfaces into being stable.
>
> But one can *design* interfaces to be modular.  Try it.

Again, I do not hold sufficient sway, and we're dealing with a product
with a code level published API which wasn't well designed, so we've
coded ourselves into a corner, so to speak.

Joshua Maurice

May 3, 2010, 2:36:43 PM
On May 3, 9:28 am, Roedy Green <see_webs...@mindprod.com.invalid>
wrote:

> On Sat, 1 May 2010 04:34:31 -0700 (PDT), Joshua Maurice
> <joshuamaur...@gmail.com> wrote, quoted or indirectly quoted someone

> who said :
>
> >system: A developer should be able to do any combination of the
> >following actions and trigger a small / minimal / incremental build,
>
> IF you use ANT, you don't need to bother with this.  The time in a
> traditional compile is mostly loading Javac.exe.  With ANT it gets
> compiled only once.  Further JAVAC looks at dates of *.java and
> *.class files and avoids most unnecessary recompilation.

Did you even read any of my other posts in this thread? Ant's
incremental compile is woefully incorrect, so incorrect as to be near
useless on an automated build machine. As a developer, I would rather
take the extra 10 min - 1.5 hours to do a full clean build to not have
to debug bizarre obscure issues which result from a clean build.
There's nothing quite like debugging a system in which you have
inconsistent dlls / jars for a day straight; it's quite aggravating.

> Compiling is almost inconsequential.  Building Jars and Zips takes
> much more of the time.

Do you actually have timing numbers for any of this? I rewrote an
ant / make system which loads Sun's tools.jar and invokes javac
through the tools.jar Java interface, thus I loaded javac into memory
just once like Ant. The full clean compilation of a small portion of
my product, ~3000 files, took ~3 minutes, whereas a separate build
invocation to produce the jars from no jars took ~15 seconds (5-8 sec
of which is just reading in the build script files aka makefiles,
stat-ing files, checking dependencies, etc.). It seems that the
conventional wisdom is quite wrong here. It seems that making jars is
actually quite quick. Well, it's at least quick if you turn off
compression with the "0" flag to jar, as you should during
development.

Joshua Maurice

May 3, 2010, 2:38:17 PM
On May 3, 11:36 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> As a developer, I would rather
> take the extra 10 min - 1.5 hours to do a full clean build to not have
> to debug bizarre obscure issues which result from a clean build.

Ack. Typo. It should read "[...] which result from a *inconsistent*
build."

Joshua Maurice

May 3, 2010, 2:40:23 PM
On May 3, 6:26 am, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:

> Joshua Maurice wrote:
> > On May 2, 11:54 pm, "Mike Schilling" <mscottschill...@hotmail.com>
> > wrote:

~3-7 hours turnaround time for any change on the build machine. Double
that for the common developer machine. That really hurts
productivity.

The whole thing is a mess, and some degree of componentization is
required, and is being done, but that's still no excuse to have a 30
min compile time for developers for a clean build when they could have
5 seconds + minimal rebuild time. Dittos for the automated build
machine.

Mike Schilling

May 3, 2010, 3:39:49 PM
Joshua Maurice wrote:
> On May 3, 6:26 am, "Mike Schilling" <mscottschill...@hotmail.com>
> wrote:
>> Joshua Maurice wrote:
>>> On May 2, 11:54 pm, "Mike Schilling" <mscottschill...@hotmail.com>
>>> wrote:

Sorry, still confused. Is the recompilation time 30 min. or 3-7 hours?


Joshua Maurice

May 3, 2010, 4:43:07 PM
On May 3, 12:39 pm, "Mike Schilling" <mscottschill...@hotmail.com>

wrote:
> Joshua Maurice wrote:
> > On May 3, 6:26 am, "Mike Schilling" <mscottschill...@hotmail.com>
> > wrote:
> >> Joshua Maurice wrote:
> >>> On May 2, 11:54 pm, "Mike Schilling" <mscottschill...@hotmail.com>
> >>> wrote:

Well, 30 min compile only for the hypothetical situation after a
"realistic" level componentization, and depending on the level of
tests run.

Currently ~145 min compile and package, no tests, on the automated
build machine. 188 min more for the standard regression / acceptance /
integration test suite. Some of the tests are currently distributed
across several automated build machines, with the longest suite at 87
min. Double those times, or thereabouts, for a lower end developer
computer. Any change requires a full clean build as we have no
incrementally correct build, and it has not been componentized into
smaller chunks. For example, the serialization framework
implementation changes slightly frequently, which affects a lot of the
code, such as the file persistence, database persistence, engine, and
GUI "components".

Throwing more hardware at the tests is easy for the automated build
machine(s). Throwing more hardware at the compile for the automated
build is hard. Throwing more hardware at it for the developer is
really hard, and really expensive in cash. (I can't imagine a quick
solution to giving each developer 5-10 computers, and the
maintenance nightmare of trying to have them all maintain their own
build farm.)

Mike Schilling

May 3, 2010, 6:46:06 PM
Joshua Maurice wrote:
> Currently ~145 min compile and package, no tests, on the automated
> build machine. 188 min more for the standard regression / acceptance /
> integration test suite. Some of the tests are currently distributed
> across several automated build machines, with the longest suite at 87
> min. Double those times, or thereabouts, for a lower end developer
> computer. Any change requires a full clean build as we have no
> incrementally correct build, and it has not been componentized into
> smaller chunks. For example, the serialization framework
> implementation changes slightly frequently, which affects a lot of the
> code, such as the file persistence, database persistence, engine, and
> GUI "components".

Does the serialization framework change often? That would be horrific, and
there's probably nothing to be done to improve the build cost when it does.
But I presume that it changes as the result of some feature being added, so
that can be mitigated by not checking the change into source control until
the feature (or better yet, set of features) is complete.

Also, developers are usually good at optimizing their own work. If a
developer is adding new classes or changing implementation rather than
interface, there's no need to recompile the world. Even when changing
interfaces, the developer usually has a good idea of which bits of the
system use those interfaces, and can recompile just those parts.

Anyway, I'd suggest:

1. Invest in a good SCM system, one that handles multiple branches and
shared branches well.
2. Encourage developers to stay isolated, rather than integrating often and
updating other developers' changes often.
3. Do a continuous build that allows developers to grab the most recent
complete, tested code, so they can recompile only the code they have checked
out and the code that depends on it. Throw lots of hardware at this, so
that failures are found early.


Lew

May 3, 2010, 7:59:29 PM
Joshua Maurice wrote:
> Did you even read any of my other posts in this thread? Ant's
> incremental compile is woefully incorrect, so incorrect as to be near
> useless on an automated build machine. As a developer, I would rather

That's a damn snarky tone to take with Roedy, who was just giving you good
advice, especially considering how you keep blaming Ant, Java and everything
else when it's clear from your own admission that it's your own process that's
at fault, as you keep throwing back at us every time someone makes a useful
suggestion.

It's not the tools' fault, it's your'n.

> take the extra 10 min - 1.5 hours to do a full clean build to not have
> to debug bizarre obscure issues which result from a clean build.
> There's nothing quite like debugging a system in which you have
> inconsistent dlls / jars for a day straight; it's quite aggravating.

So fix your system and quit whining about it.

--
Lew

Joshua Maurice

May 3, 2010, 8:18:22 PM
On May 3, 4:59 pm, Lew <no...@lewscanon.com> wrote:
> Joshua Maurice wrote:
> > Did you even read any of my other posts in this thread? Ant's
> > incremental compile is woefully incorrect, so incorrect as to be near
> > useless on an automated build machine. As a developer, I would rather
>
> That's a damn snarky tone to take with Roedy, who was just giving you good
> advice, especially considering how you keep blaming Ant, Java and everything
> else when it's clear from your own admission that it's your own process that's
> at fault, as you keep throwing back at us every time someone makes a useful
> suggestion.

It's snarky because of what I consider to be this near absurd level of
deference given to the tools. If this were any other piece of software,
and there was a product out there which ran 10x to 100x faster, it would
be a no-brainer which to use. Instead, I see far too many
people say "Meh. Just do a clean build. It's not that bad."

> It's not the tools' fault, it's your'n.

No. If you read my posts, you would know that I blame both process and
tool. The best the process could do is divide the build into more
manageable chunks, but as a developer I would still have to spend an
hour or so waiting on a build when most of the work is extraneous.

> > take the extra 10 min - 1.5 hours to do a full clean build to not have
> > to debug bizarre obscure issues which result from a clean build.
> > There's nothing quite like debugging a system in which you have
> > inconsistent dlls / jars for a day straight; it's quite aggravating.
>
> So fix your system and quit whining about it.

I am fixing it. I am not whining. I was asking for help on how to do
it. I have asked for real solutions to the real problems I am facing
writing it, such as how to get a list of class files per compiled java
file as if I called javac once per java file in the dir.

Joshua Maurice

May 3, 2010, 8:29:43 PM
On May 3, 3:46 pm, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:

> Joshua Maurice wrote:
> > Currently ~145 min compile and package, no tests, on the automated
> > build machine. 188 min more for the standard regression / acceptance /
> > integration test suite. Some of the tests are currently distributed
> > across several automated build machines, with the longest suite at 87
> > min. Double those times, or thereabouts, for a lower end developer
> > computer. Any change requires a full clean build as we have no
> > incrementally correct build, and it has not been componentized into
> > smaller chunks. For example, the serialization framework
> > implementation changes slightly frequently, which affects a lot of the
> > code, such as the file persistence, database persistence, engine, and
> > GUI "components".
>
> Does the serialization framework change often?  That would be horrific, and
> there's probably nothing to be done to improve the build cost when it does.
> But I presume that it changes as the result of some feature being added, so
> that can be mitigated by not checking the change into source control until
> the feature (or better yet, set of fesatures) is complete.

I wish I knew. I just got an email today from the serialization team
asking "What's with this error?" I "hacked" the C++ Maven plugin to
report "<> has detected visual studios warning <>, deletion of a
pointer to an incomplete type. This is formally undefined behavior
according to the C++ spec. Fix it." Apparently changes are still
ongoing.

> Also, developers are usually good at optimizing their own work.  If a
> developer is adding new classes or changing implementation rather than
> interface, there's no need to recompile the world.  Even when changing
> interfaces, the developer usually has a good idea of which bits of the
> system use those interfaces, and can recompile just those parts.

As such a developer, perhaps, but when I mess up, I break the mainline
build, and because the build on the automated build machine, or
private perforce branch build machine, can take the better part of a
day, it's sometimes hard to isolate down who broke it, and especially
when ML is broken this leaves people in a bind. Currently we lock ML
on such events. Rollback is possible. Devops is floating that idea
around at the moment.

> Anyway, I'd suggest:
>
> 1. Invest in a good SCM system, one that handles multiple branches and
> shared branches well.

Done. Perforce is so awesome for the record.

> 2. Encourage developers to stay isolated, rather than intergating often and
> updating other developers' changes often.

Sounds like integration hell. We do have separate teams working on
their own little view for weeks or a month or two on end, and each
team has their own private branch in perforce which is integrated
roughly weekly with mainline.

> 3. Do a continuous build that allows developers to grab the most recent
> complete, tested code, so they can recompile only the code they have checked
> out and the code that depends on it.  Throw lots of hardware at this, so
> that failures are found early.

Also done.

The problem is that it's not helping. It's way too much code, way too
many tests, taking way too long to build.

Arne Vajhøj

May 3, 2010, 8:33:32 PM
On 03-05-2010 14:36, Joshua Maurice wrote:
> On May 3, 9:28 am, Roedy Green<see_webs...@mindprod.com.invalid>
> wrote:
>> On Sat, 1 May 2010 04:34:31 -0700 (PDT), Joshua Maurice
>> <joshuamaur...@gmail.com> wrote, quoted or indirectly quoted someone
>> who said :
>>> system: A developer should be able to do any combination of the
>>> following actions and trigger a small / minimal / incremental build,
>>
>> IF you use ANT, you don't need to bother with this. The time in a
>> traditional compile is mostly loading Javac.exe. With ANT it gets
>> compiled only once. Further JAVAC looks at dates of *.java and
>> *.class files and avoids most unnecessary recompilation.
>
> Did you even read any of my other posts in this thread?

Most likely not.

> Ant's
> incremental compile is woefully incorrect, so incorrect as to be near
> useless on an automated build machine.

Ant is very useful for automated builds.

But it is common practice to clean and rebuild.

Your project structure is just not suited for this.

Arne

Arne Vajhøj

May 3, 2010, 8:38:13 PM
On 03-05-2010 00:50, Joshua Maurice wrote:

You are working on fixing the symptoms not the problem.

Something is horribly wrong with the object model if
so many classes change all the time.

If you fix that problem (better requirements or more time
spent designing before coding or whatever is necessary), then
you will be much better off.

Arne

Arne Vajhøj

May 3, 2010, 8:40:26 PM
On 03-05-2010 14:29, Joshua Maurice wrote:
> On May 3, 4:38 am, Lew<no...@lewscanon.com> wrote:
>> The generator can be forced to follow good practices, rather than have bad
>> practices use "the generator" as an excuse.
>
> So, I ask again: what if the generator changes, which it does
> "somewhat" frequently? I'd like to do a build in that case.

But it should not. The team that maintains that generator should
make multiple changes, test them carefully and then release
them to the other teams.

>> But one can *design* interfaces to be modular. Try it.
>
> Again, I do not hold sufficient sway, and we're dealing with a product
> with a code level published API which wasn't well designed, so we've
> coded ourselves into a corner, so to speak.

Being stuck with certain APIs is a very common thing. But that does
not necessarily mean that thousands of classes change all the
time or that the generator tool change all the time.

Arne

markspace

May 3, 2010, 9:49:52 PM
Joshua Maurice wrote:

> I think they missed the important
> part of the memo: that it only works when the code is decoupled,
> modular, and has relatively stable and well-defined interfaces, instead of
> the ~25,000 source file mess we have now.


I've heard that misery loves company, so here you go:

<http://en.wikipedia.org/wiki/Big_ball_of_mud>

"Programmers in control of a big ball of mud project are strongly
encouraged to study it and to understand what it accomplishes, and to
use this as a loose basis for a formal set of requirements for a
well-designed system that could replace it. Technology shifts – such as
client-server to web-based or file-based to database-based – may provide
good reasons to start over from scratch."

Joshua Cranmer

May 3, 2010, 9:51:44 PM
On 05/03/2010 08:18 PM, Joshua Maurice wrote:
> I am fixing it. I am not whining. I was asking for help on how to do
> it. I have asked for real solutions to the real problems I am facing
> writing it, such as how to get a list of class files per compiled java
> file as if I called javac once per java file in the dir.

final static int constants (or other constant types that ldc works on)
are directly hardcoded into the class file. It is therefore impossible
to read a classfile and tell you which Java classes would have to change
for it to need to be recompiled.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

Arne Vajhøj

May 3, 2010, 9:57:00 PM

As I recall it then the conclusion was that correct handling
of constants (static final) required source-

Arne

Arne Vajhøj

May 3, 2010, 9:59:02 PM
On 03-05-2010 12:28, Roedy Green wrote:
> On Sat, 1 May 2010 04:34:31 -0700 (PDT), Joshua Maurice
> <joshua...@gmail.com> wrote, quoted or indirectly quoted someone
> who said :
>> system: A developer should be able to do any combination of the
>> following actions and trigger a small / minimal / incremental build,
>
> IF you use ANT, you don't need to bother with this. The time in a
> traditional compile is mostly loading Javac.exe. With ANT it gets
> compiled only once. Further JAVAC looks at dates of *.java and
> *.class files and avoids most unnecessary recompilation.

Well - the safe way is to clean before building.

Arne

Mike Schilling

May 4, 2010, 12:40:30 AM

You only break the mainline build if you check in code based on doing that
incorrectly. I'm not suggesting that.

>
>> Anyway, I'd suggest:
>>
>> 1. Invest in a good SCM system, one that handles multiple branches
>> and shared branches well.
>
> Done. Perforce is so awesome for the record.

It is. They do a hell of a job (and I don't say that just because I know a
lot of the folks there.)

>
>> 2. Encourage developers to stay isolated, rather than integrating
>> often and updating other developers' changes often.
>
> Sounds like integration hell. We do have separate teams working on
> their own little view for weeks or a month or two on end, and each
> team has their own private branch in perforce which is integrated
> roughly weekly with mainline.

It shouldn't be hell, especially with a good tool like Perforce helping with
any merges that result. Though if there's a lot of churn in the code and
everything uses everything else, yeah, it'll be harder than if things were
stable and well-organized.

Mike Schilling

May 4, 2010, 12:57:12 AM

You mean analysis of the source is required, because no traces of the
constant use are left in the class file? You know, I assumed that such
constants would be listed in the constant pool, but (having tried a simple
test) I see that that's wrong (and you're correct). Bad decision.


Mike Schilling

May 4, 2010, 12:59:39 AM
Joshua Cranmer wrote:
> On 05/03/2010 08:18 PM, Joshua Maurice wrote:
>> I am fixing it. I am not whining. I was asking for help on how to do
>> it. I have asked for real solutions to the real problems I am facing
>> writing it, such as how to get a list of class files per compiled
>> java file as if I called javac once per java file in the dir.
>
> final static int constants (or other constant types that ldc works on)
> are directly hardcoded into the class file.

And no trace of their origin is written to the class file.

> It is therefore impossible
> to read a classfile and tell you which Java classes would have to
> change for it to need to be recompiled.

Because the designers of Java didn't consider that important, or the
necessary information would have been written to the class file.


Joshua Maurice

May 4, 2010, 3:09:24 AM
On May 3, 9:59 pm, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:

> Joshua Cranmer wrote:
> > It is therefore impossible
> > to read a classfile and tell you which Java classes would have to
> > change for it to need to be recompiled.
>
> Because the designers of Java didn't consider that important, or the
> necessary information would have been written to the class file.

And because such information alone is insufficient to do a correct
incremental compile. See the paper in my opening post "Ghost
Dependencies" or some such.

It would go a long way to helping me make it correct though if javac
spat out "loading class X when compiling java file Y" for all such
pairs.

Joshua Maurice

May 4, 2010, 3:12:07 AM
On May 3, 9:40 pm, "Mike Schilling" <mscottschill...@hotmail.com>

wrote:
> Joshua Maurice wrote:
> > On May 3, 3:46 pm, "Mike Schilling" <mscottschill...@hotmail.com>
> >> Also, developers are usually good at optimizing their own work. If a
> >> developer is adding new classes or changing implementation rather
> >> than interface, there's no need to recompile the world. Even when
> >> changing interfaces, the developer usually has a good idea of which
> >> bits of the system use those interfaces, and can recompile just
> >> those parts.
>
> > As such a developer, perhaps, but when I mess up, I break the mainline
> > build, and because the build on the automated build machine, or
> > private perforce branch build machine, can take the better part of a
> > day, it's sometimes hard to isolate down who broke it, and especially
> > when ML is broken this leaves people in a bind. Currently we lock ML
> > on such events. Rollback is possible. Devops is floating that idea
> > around at the moment.
>
> You only break the mainline build if you check in code based on doing that
> incorrectly.  I'm not suggesting that.

Interesting idea. "Don't break the mainline build!" says the managers.
Unfortunately, if they're unable to build to a documented interface,
and the whole build takes hours, if not longer, on their own computer,
then it's hard in practice to not break it.

Just saying.

Mike Schilling

May 4, 2010, 10:35:58 AM

Stay isolated longer. Do the full build and test less often.


Arne Vajhøj

May 4, 2010, 7:44:11 PM

If I remember correctly, then const in C# is the same way.

Arne

Roedy Green

May 5, 2010, 8:03:56 PM
On Mon, 3 May 2010 11:36:43 -0700 (PDT), Joshua Maurice
<joshua...@gmail.com> wrote, quoted or indirectly quoted someone
who said :

>


>Did you even read any of my other posts in this thread? Ant's
>incremental compile is woefully incorrect, so incorrect as to be near
>useless on an automated build machine.

In the early days, the company I worked for had a guy
who did nothing but tweak compile scripts using traditional MAKE-like
logic. They were INCREDIBLY slow compared with ANT. The errors
Javac/ANT makes in deciding what to recompile make little difference
compared to the massive speedup of loading Javac.exe only once. They
are also insignificant compared with jar and zip time.


If you fiddle non-private static finals, remember to do a clean
compile of the universe. Other than that, for all practical purposes,
ANT works.

--
Roedy Green Canadian Mind Products
http://mindprod.com

What is the point of a surveillance camera with insufficient resolution to identify culprits?

Joshua Maurice

May 5, 2010, 11:06:51 PM
On May 5, 5:03 pm, Roedy Green <see_webs...@mindprod.com.invalid>
wrote:

> On Mon, 3 May 2010 11:36:43 -0700 (PDT), Joshua Maurice
> <joshuamaur...@gmail.com> wrote,

> >Did you even read any of my other posts in this thread? Ant's
> >incremental compile is woefully incorrect, so incorrect as to be near
> >useless on an automated build machine.
>
> In the early days the company I worked for a company that  had a guy
> who did nothing but tweak compile scripts using traditional MAKE-like
> logic. They were INCREDIBLY slow compared with ANT. The errors
> Javac/ANT makes in deciding what to recompile make little difference
> compared to the massive speedup of loading Javac.exe only once. They
> are also insignificant compared with jar and zip  time.
>
> If you fiddle non-private static finals, remember to do a clean
> compile of the universe.  Other than that, for all practical purposes,
> ANT works.

No, no, and no. Perhaps you'll listen this time.

First, my solution only loads javac once. It would be silly to do
otherwise. Moreover, calling javac once per java file is quite slow,
and I was asking for a way around that to get fast, incrementally
correct java compiles. However, even with calling javac once per java,
I still outperformed Maven for a full clean build.

Also, do you have any numbers at all to support your proposition that
jar-ing and zip-ing is the time sucker? Any sources, your own or
otherwise? For a sizable portion of my company's product, with my new
system without incremental dependency analysis, I was able to compile
~3,000 java files in ~3 min. The jar-ing of the resultant class files
took ~15 seconds, and ~8 seconds of that was simply my build system
overhead. It seems that jar-ing is \much\ faster than java compilation
for standard hardware. At least, it is with the -0 option to jar, the
"do not compress" option, which should be the standard option during
development.

Finally, no. Ant does not work all of the time for all practical
purposes. I can list off numerous times from the last month where our
"streaming build" broke because it uses such poor dependency analysis
techniques. It required manual intervention by devops to do a clean to
get it to start passing, and I can assure you static finals were not
the dominant cause. (Though, admittedly, Ant probably does a better
job than Maven.)

However, even then, static finals are part of the language, and I want
an automated build which can actually do automated builds. The build
machine has no (easy) way to determine if a static final was changed
or not, or other (potentially obscure) scenarios which Ant would fail
on. When I as a developer update to a new revision / changelist using
perforce, I do not have an (easy) way to check if there was a change
to a static final, so I would have to do a clean build. This is not
acceptable if there's a feasible alternative.

Also, QA now will never take such a dubiously correct build after
having been burned many times by incrementally incorrect builds,
wasting days of stress testing because the build was "incorrectly"
done.

I will not continue if you just repeat these unfounded and inaccurate
assessments, especially if you do not cite any sources at all, even
your own. For example, I have asked for your own timing numbers for
your own builds for java compilation vs jar -0 times, and you have yet
to provide any.

Lew

May 5, 2010, 11:34:55 PM
Joshua Maurice wrote:
> No, no, and no. Perhaps you'll listen this time.

And perhaps you won't be so damn rude next time. What the hell?

You have consistently rejected every piece of good advice, given complete
nonsense excuses for doing so, and thrown mud in the face of people who try to
help you. What a piece of work!

--
Lew

Joshua Maurice

May 6, 2010, 12:48:03 AM

And you have done the same droning, repeating the same untruths which
I have called out, and repeated them thrice in this thread. Such
untruths include:

1- Jar-ing takes longer than java compilation. Correct: No it doesn't,
at least not always, and I would wager not often judging from the
actual numbers before me for my company's code base.

2- Halfway incremental works always in practice. Again, I have lots of
evidence from the automated build in my own company that no, it
doesn't.

3- Everyone immediately assumes that I'm going to implement it
stupidly in make, invoking a separate JVM for each different jar-dir,
possibly per java file. I have said numerous times that I would not do
this, and this is not what I want. When people mention this, it is a
straw man argument. It is a great disservice to me.

I have been so rude because they have been rude to me first, except
they were more insidious about it.

PS: I do agree that we need to componentize. I disagree that
incrementally correct builds are useless after that.

Joshua Maurice

May 6, 2010, 12:50:13 AM
On May 5, 9:48 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On May 5, 8:34 pm, Lew <no...@lewscanon.com> wrote:
>
> > Joshua Maurice wrote:
> > > No, no, and no. Perhaps you'll listen this time.
>
> > And perhaps you won't be so damn rude next time.  What the hell?
>
> > You have consistently rejected every piece of good advice, given complete
> > nonsense excuses for doing so, and thrown mud in the face of people who try to
> > help you.  What a piece of work!
>
> And you have done the same droning, repeating the same untruths which
> I have called out, and repeated them thrice in this thread. Such
> untruths include:

Sorry for the collective "you" there. I recognize that it was not
literally "you" who have said everything. Freudian slip. It should
read "him", or as a more dangerous general "them".

Lew

May 6, 2010, 1:00:15 AM
Joshua Maurice wrote:
> I have been so rude because they have been rude to me first, except
> they were more insidious about it.

Plonk.

--
Lew

Joshua Maurice

May 6, 2010, 1:33:55 AM

Ok sir. As you will. I tried to have a decent and civil conversation
about a technical detail - the ability to get the list of used class
files per java file in java compilation. Instead of discussing my
topic of interest, I was told half-truths which are repeated ad
nauseam, and downright lies and misinformation.

I am sorry sir that we will no longer have an intelligent discourse,
or any discourse, but neither will I take such abuse lying down. I
have asked numerous times for any such evidence that jar-ing indeed
takes longer than java compilation, as I posit it does not and present
evidence, and certain people have suggested the opposite for quite a
while now. It does tend to grate on one's nerves.

Mike Schilling

May 6, 2010, 1:45:22 AM
Joshua Maurice wrote:
> PS: I do agree that we need to componentize. I disagree that
> incrementally correct builds are useless after that.

I've worked in systems roughly as large as yours (tens of thousands of
source files) which were layered, so that each separately compiled subsystem
had at most a few hundred files. At that point, there's no particular
advantage to avoiding clean builds.

During development, a developer works on a small set of subsystems. He
knows when he's changing interfaces rather than implementations, and at that
point can afford the clean build.

The automated build-and-test might spend an hour or so on the clean build,
but that's a small fraction of the time the tests take.


Joshua Maurice

May 6, 2010, 2:22:33 AM
On May 5, 10:45 pm, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:

We recently were handed out this book describing Scrum, a variant of
agile development. I agree with what the author bolds and italicizes,
that interfaces need to be \stable\ (just the word \stable\).

I would love what you describe. However, my fellow employees and
managers understand little and respect little of what decoupled,
relatively well thought out, well defined interfaces can do for them.
It's always about the new feature. No code cleanup ever really gets
done. My only real option to attack that front is to vote with my
feet. (As an example, I remember this one time that an architect at
the company in question said it was perfectly fine to use a finalizer
to manage C standard library heap memory allocated via JNI. I
protested quite vigorously.)

Also, as a potentially incorrect observation, do you think most java
developers use notepad or some other text editor to do their work? I
would suspect that most people use Eclipse nowadays. Eclipse is almost
exactly what I want from a build system, except it's limited to Java.
It's a nearly fully incrementally correct build system, and is a lot
better than I could ever do on my own as a side project. Would you all
be saying the same straw man arguments if you lost your incremental
IDE and had to use notepad / wordpad / emacs without all your cool
java-specific stuff to do your work? I think not.

Mike Schilling

unread,
May 6, 2010, 2:40:11 AM5/6/10
to

If the situation is completely (^&%ed and no one with the power to fix it
will do anything, by all means find a better place.

>
> Also, as a potentially incorrect observation, do you think most java
> developers use notepad or some other text editor to do their work? I
> would suspect that most people use Eclipse nowadays. Eclipse is almost
> exactly what I want from a build system, except it's limited to Java.
> It's a nearly fully incrementally correct build system, and is a lot
> better than I could ever do on my own as a side project. Would you all
> be saying the same straw man arguments if you lost your incremental
> IDE and had to use notepad / wordpad / emacs without all your cool
> java-specific stuff to do your work? I think not.

I don't use Eclipse, nor do I do builds from within the IDE I do use
(IntelliJ), since our build system is complicated enough to require ANT.
(I could probably make IntelliJ call ANT, but I've never bothered.)

Anyway, I don't think people are dismissing you because they have a solution
they won't tell you about. Your company has built a horrific system far
outside the parameters of what Java was intended to handle, and they're
paying the price for that. It's a lot like someone who puts 100 novels in a
single Word document and complains that it's slow. Yes, it is.


Joshua Cranmer

unread,
May 6, 2010, 7:29:06 AM5/6/10
to
On 05/06/2010 02:22 AM, Joshua Maurice wrote:
> Also, as a potentially incorrect observation, do you think most java
> developers use notepad or some other text editor to do their work? I
> would suspect that most people use Eclipse nowadays. Eclipse is almost
> exactly what I want from a build system, except it's limited to Java.

As an aside, I've noticed that not using a fully-featured IDE seems to
increase my productivity. Trying to figure out how to get it to just add
on a single -classpath argument to the build step was an hour of my life
wasted. Not to mention the length of time it takes to start up on my
system, as well as the braindead autocompletion it attempts to do.

> It's a nearly fully incrementally correct build system, and is a lot
> better than I could ever do on my own as a side project. Would you all
> be saying the same straw man arguments if you lost your incremental
> IDE and had to use notepad / wordpad / emacs without all your cool
> java-specific stuff to do your work? I think not.

Actually, I think IDEs tend to fall flat on their face when presented
with humongous heterogeneous heaps of code which defy standard build
system logic and which require large databases of tag information (i.e.,
in the 100s of MB range or higher).

I do work on a project which requires about 2 hours to build on my
laptop whenever I update the source, aggravated by the fact that the
thermal setpoints on my laptop appear to be mis-set, so leaving the
build unattended frequently causes it to shut down due to overheating.

As people have repeatedly stated:
1. You can't abstract dependency data out of class files alone.
2. If your project were properly compartmentalized, this wouldn't be an
issue.
3. If fixing the design is really so big a deal, then you are up a creek
without a paddle.

Tom Anderson

unread,
May 6, 2010, 8:58:42 AM5/6/10
to

He hasn't been given good advice. He's been given advice completely
unrelated to the problem he's explained at great length, and which is not
nonsense. The only mystifying thing is that he's still here, rather than
having long since buggered off in search of people who might engage with
his problem.

tom

--
It is not the nature of a meme to be understood, it is only to be
followed. -- Benny

Tom Anderson

unread,
May 6, 2010, 9:03:14 AM5/6/10
to
On Thu, 6 May 2010, Joshua Cranmer wrote:

> On 05/06/2010 02:22 AM, Joshua Maurice wrote:
>
>> Also, as a potentially incorrect observation, do you think most java
>> developers use notepad or some other text editor to do their work? I
>> would suspect that most people use Eclipse nowadays. Eclipse is almost
>> exactly what I want from a build system, except it's limited to Java.
>
> As an aside, I've noticed that not using a fully-featured IDE seems to
> increase my productivity. Trying to figure out how to get it to just add on a
> single -classpath argument to the build step was an hour of my life wasted.

Could you expand on that? What do you mean by 'build step', and how did
you do it? Was this some specific and unusual need, or did you just not
know about the Configure Build Path dialogue box?

> Not to mention the length of time it takes to start up on my system,

Fair enough!

> as well as the braindead autocompletion it attempts to do.

Could you expand on that too?

> As people have repeated stated:
> 1. You can't abstract dependency data out of class files alone.

He knows that, and has stated so.

> 2. If your project were properly compartmentalized, this wouldn't be an
> issue.

He knows that, and has stated so.

> 3. If fixing the design is really so big a deal, then you are up a creek
> without a paddle.

Which is exactly why he's trying to build a paddle!

tom

--
File under 'directionless space novelty ultimately ruined by poor
self-editing'

Arne Vajhøj

unread,
May 6, 2010, 8:00:49 PM5/6/10
to
On 06-05-2010 08:58, Tom Anderson wrote:
> On Wed, 5 May 2010, Lew wrote:
>> Joshua Maurice wrote:
>>> No, no, and no. Perhaps you'll listen this time.
>>
>> And perhaps you won't be so damn rude next time. What the hell?
>>
>> You have consistently rejected every piece of good advice, given
>> complete nonsense excuses for doing so, and thrown mud in the face of
>> people who try to help you. What a piece of work!
>
> He hasn't been given good advice. He's been given advice completely
> unrelated to the problem he's explained at great length, and which is
> not nonsense. The only mystifying thing is that he's still here, rather
> than having long since buggered off in search of people who might engage
> with his problem.

The advice is not unrelated to his problem (except for Roedy's,
but that is not special for his questions).

He has been giving the correct advice.

They need to fix their way of building software.

If they do that then their build time problems will
disappear. And a lot of other problems.

He has gotten explanation of some of the technical
difficulties that the tool he asks for will face.

He has not gotten such a tool. Because nobody wants
to spend lots of time (thousands of hours) developing
a tool that is only needed in completely fucked up
environments.

Arne

Arne Vajhøj

unread,
May 6, 2010, 8:06:05 PM5/6/10
to
On 06-05-2010 02:22, Joshua Maurice wrote:
> We recently were handed out this book describing Scrum, a variant of
> agile development. I agree with what the author bolds and italicizes,
> that interfaces need to be \stable\ (just the word \stable\).
>
> I would love what you describe. However, my fellow employees and
> managers understand little and respect little of what decoupled,
> relatively well thought out, well defined interfaces can do for them.
> It's always about the new feature. No code cleanup ever really get
> done. My only real option to attack that front is to vote with my
> feet. (As an example, I remember this one time that an architect at
> the company in question said it was perfectly fine to use a finalizer
> to manage C standard library heap memory allocated via JNI. I
> protested quite vigorously.)

You should fix that problem instead of searching for the
magic tool that can compensate for those problems.

> Also, as a potentially incorrect observation, do you think most java
> developers use notepad or some other text editor to do their work? I
> would suspect that most people use Eclipse nowadays. Eclipse is almost
> exactly what I want from a build system, except it's limited to Java.
> It's a nearly fully incrementally correct build system, and is a lot
> better than I could ever do on my own as a side project. Would you all
> be saying the same straw man arguments if you lost your incremental
> IDE and had to use notepad / wordpad / emacs without all your cool
> java-specific stuff to do your work? I think not.

If you think it is useful, then the Eclipse compiler is
open source and you can grab it and hack it to do what you
want.

Arne

Mike Schilling

unread,
May 7, 2010, 8:27:47 PM5/7/10
to

I'm going to quibble. First, it wouldn't take thousands of hours; hundreds
at most. (My latest idea is to modify javac to create dependency files.
That way you wouldn't need to do a separate source analysis to find where,
e.g., constants are used.) Second, it would be useful, though not required,
in all environments with a big source tree. If I pull down the latest
changes from the SCM and see that some are in low-level utility routines, I
probably don't need to recompile all the code that used them, but I don't
know for sure. That's annoying. And if I start to see odd errors, not
knowing whether taking the 20 or 30 minutes to build everything clean is the
fix or a waste of time is annoying as well. If such a tool existed, I'd use
it.

But I'll agree that the reason no such tool exists is that almost all of us
get on well enough without it.


Joshua Maurice

unread,
May 7, 2010, 10:44:51 PM5/7/10
to
If anyone else cares, I managed to inadvertently stumble across a
solution. On impulse, I asked a co-worker at lunch. It seems that
class files do not contain sufficient information with default javac
options. However, when compiled with -g, they contain a listing of all
types used in the compile. When combined with Ghost Dependencies, I
think this can result in a correct incremental build at the file level
which will not cascade endlessly downstream. I'm working on the
finishing touches to my prototype now.

Mike Schilling

unread,
May 8, 2010, 12:08:33 AM5/8/10
to

You realize that you're now going to recompile a class when it refers to
another class to which a comment was added.


Lew

unread,
May 8, 2010, 9:58:53 AM5/8/10
to
Joshua Maurice wrote:
>>>>> No, no, and no.

Unplonk. Whatever.

If I plonked everyone who's rude here I wouldn't be allowed to post either.

--
Lew

Joshua Maurice

unread,
May 8, 2010, 4:18:44 PM5/8/10
to
On May 7, 9:08 pm, "Mike Schilling" <mscottschill...@hotmail.com>
wrote:

Yes. I'm pretty sure that it would be better than doing a full clean
build or a cascading jar-dir-unit incremental build.

Mike Schilling

unread,
May 9, 2010, 4:31:48 AM5/9/10
to

No doubt, but the result isn't the minimal amount of recompilation we were
discussing earlier.


Tom Anderson

unread,
May 9, 2010, 6:23:32 AM5/9/10
to
On Sat, 8 May 2010, Joshua Maurice wrote:

> On May 7, 9:08 pm, "Mike Schilling" <mscottschill...@hotmail.com>
> wrote:
>> Joshua Maurice wrote:
>>> If anyone else cares, I managed to inadvertently stumble across a
>>> solution. On impulse, I asked a co-worker at lunch. It seems that
>>> class files do not contain sufficient information with default javac
>>> options. However, when compiled with -g, it contains a listing of all
>>> types used in the compile.

Are you sure?

$ javac -version
javac 1.6.0_16
$ echo "class Foo {public static final int X=23;}" >Foo.java
$ echo "class Bar {public static final int Y=Foo.X;}" >Bar.java
$ javac -g Foo.java Bar.java
$ grep Foo Bar.class
$

I can see no sign of Bar.class containing any mention of Foo.

>>> When combined with Ghost Dependencies, I think this can result in a
>>> correct incremental build at the file level which will not cascade
>>> endlessly downstream. I'm working on the finishing touches to my
>>> prototype now.
>>
>> You realize that you're now going to recompile a class when it refers to
>> another class to which a comment was added.
>
> Yes. I'm pretty sure that it would be better than doing a full clean
> build or a cascading jar-dir-unit incremental build.

The previous time we discussed this, the idea came up of looking at
changed class files to see if the changes were consequential -
essentially, if the change changed the interface of the class (added a
method, changed a method's signature, changed the value of a constant,
etc). If you did that, you could filter the changes so that only
consequential ones triggered recompilation of dependents. That would avoid
the unnecessary recompilation Mike mentions, wouldn't it?

tom

--
All roads lead unto death row; who knows what's after?

Joshua Maurice

unread,
May 9, 2010, 7:27:49 AM5/9/10
to
On May 9, 3:23 am, Tom Anderson <t...@urchin.earth.li> wrote:
> On Sat, 8 May 2010, Joshua Maurice wrote:
> > On May 7, 9:08 pm, "Mike Schilling" <mscottschill...@hotmail.com>
> > wrote:
> >> Joshua Maurice wrote:
> >>> If anyone else cares, I managed to inadvertently stumble across a
> >>> solution. On impulse, I asked a co-worker at lunch. It seems that
> >>> class files do not contain sufficient information with default javac
> >>> options. However, when compiled with -g, it contains a listing of all
> >>> types used in the compile.
>
> Are you sure?
>
> $ javac -version
> javac 1.6.0_16
> $ echo "class Foo {public static final int X=23;}" >Foo.java
> $ echo "class Bar {public static final int Y=Foo.X;}" >Bar.java
> $ javac -g Foo.java Bar.java
> $ grep Foo Bar.class
> $
>
> I can see no sign of Bar.class containing any mention of Foo.

Apparently I am mistaken. I would suggest looking for "Bar" and not
"Bar.class", but the result is the same. static finals might be an
exception to the debug information, which is sad. I'm wondering how
I'll work around this now. I am still in the process of implementing,
so I haven't really been able to test, or I would have caught this
eventually. Thanks for letting me catch it earlier. I could still
catch this through Ghost Dependency analysis, but it becomes more
tricky. I'll have to think about it. At a minimum, I could detect all
class files which have static final fields, and force all classes
downstream to be out of date. Not very incremental in this case, but
at least it's correct. Hopefully this is the only such corner case. I
need more tests.

Joshua Maurice

unread,
May 9, 2010, 7:33:00 AM5/9/10
to
On May 9, 1:31 am, "Mike Schilling" <mscottschill...@hotmail.com>

I'm not sure what this minimal recompile which we were discussing is.
It is technically impossible to do a true minimal recompile
algorithmically. Let's define it as "Let a build be a set of file
compilations. Let the minimum recompile be the minimum such set for
which the output class files are equivalent to the class files of a
full clean build." First, we'd have to prove such a minimum exists.
That's relatively straightforward. With that out of the way, I think I
could then prove that the problem is equivalent to the Halting
problem. If you define "equivalent" generously, I'm pretty sure this
is the case. If you define it as "same binary file content", then
perhaps not, though still possibly yes.

Either way, this is not my goal. If someone modifies comments to a
Java source file, I'm not going to try and catch that. What I will do
is recompile all files which depend directly on that changed-source
Java file, any files affected by Ghost Dependencies, and continue
cascading this change down until all of the "leaves" of the cascading
recompile are binary equivalent class files to the class files before
the recompile. Perhaps too conservative, but I think that's easy
enough to show that it's correct. Perhaps I'll make it a "tighter fit"
later, though honestly I'm still fumbling around in the dark at the
moment, still learning.

Tom Anderson

unread,
May 9, 2010, 10:49:56 AM5/9/10
to
On Sun, 9 May 2010, Joshua Maurice wrote:

> On May 9, 1:31 am, "Mike Schilling" <mscottschill...@hotmail.com>
> wrote:
>
>> No doubt, but the result isn't the minimal amount of recompilation we were
>> discussing earlier.
>
> I'm not sure what this minimal recompile which we were discussing is. It
> is technically impossible to do a true minimal recompile
> algorithmically. Let's define it as "Let a build be a set of file
> compilations. Let the minimum recompile be the minimum such set for
> which the output class file are equivalent to the class files of a full
> clean build." First, we'd have to prove such a minimum exists. That's
> relatively straightforward.

Extremely so.

> With that out of the way, I think I could then prove that the problem is
> equivalent to the Halting problem.

Certainly not.

> If you define "equivalent" generously, I'm pretty sure this is the case.
> If you define it as "same binary file content", then perhaps not, though
> still possibly yes.

I'm not sure what you mean by 'generously'. Is there a kind of equivalence
less strict than binary equivalence which would actually work?

Anyway, here's a straightforward but slow algorithm to find the minimal
recompile:

1. Copy all your source code somewhere and do a clean build on it; call
the output the reference output
2. Count your source files, and call the total number N
3. Number all your source files, starting at 0 and going up to N - 1
4. Let M be the set of all source files
5. For each integer i between 0 and 2**N - 1:
5a. Let S be the set of source files whose number j is such that the jth
bit of i is set
5b. Do a recompilation of just the files in S
5c. Compare the output to the reference output, and if it is identical,
and the size of S is smaller than the size of M, let M be S
5d. Restore the class files to how they were before recompilation

M now contains the set of source files needed for a minimal recompile. It
doesn't follow from this algorithm that it's the only minimal set,
although i suspect that in practice it will be.

I wouldn't suggest you do this in practice, but it shows that the minimal
set exists, can be found algorithmically, and can be found in O(2**N)
time, with a rather large constant. Your task is thus merely to improve
the speed!
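
Roughly, in code - purely illustrative, with Build and its three methods
as made-up stand-ins for invoking javac on a subset, diffing the output
tree against the reference build, and undoing the recompile (and assuming
fewer than 63 files, which is the least of this algorithm's problems):

import java.io.File;
import java.util.*;

class MinimalRecompileSearch {
    // Hypothetical hooks into the real build; not part of any actual tool.
    interface Build {
        void compileJust(Set<File> files) throws Exception;
        boolean outputMatchesReference() throws Exception;
        void restoreClassFiles() throws Exception;
    }

    // Brute-force search over every subset of the sources, as described
    // above. O(2**N) compiles, so only useful as an existence proof.
    static Set<File> findMinimal(List<File> sources, Build build) throws Exception {
        int n = sources.size();
        Set<File> m = new HashSet<File>(sources);        // start with "recompile everything"
        for (long i = 0; i < (1L << n); i++) {
            Set<File> s = new HashSet<File>();
            for (int j = 0; j < n; j++) {
                if ((i & (1L << j)) != 0) s.add(sources.get(j));
            }
            build.compileJust(s);
            if (build.outputMatchesReference() && s.size() < m.size()) {
                m = s;
            }
            build.restoreClassFiles();
        }
        return m;                                        // a minimal recompile set
    }
}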

> Either way, this is not my goal. If someone modifies comments to a Java
> source file, I'm not going to try and catch that. What I will do is
> recompile all files which depend directly on that changed-source Java
> file, any files affected by Ghost Dependencies, and continue cascading
> this change down until all of the "leaves" of the cascading recompile
> are binary equivalent class files to the class files before the
> recompile. Perhaps too conservative, but I think that's easy enough to
> show that it's correct.

Agreed. If you were a bit more aggressive about the consequentiality of
changes, you could prune off a lot of the leaves of the tree, but it
wouldn't be an asymptotic speedup.

That said, i don't think it would be that hard to work out
consequentiality. The output of javap is almost exactly what you need - i
think the only thing it's missing is those bloody constant values. Adding
them doesn't look hard. This is the relevant bit of javap's source:

https://openjdk.dev.java.net/source/browse/openjdk/jdk/trunk/langtools/src/share/classes/sun/tools/javap/JavapPrinter.java?rev=257&view=markup

You need to change the line that says:

out.println(fields[f].getType()+" " +fields[f].getName()+";");

To say:

int cpx = fields[f].getConstantValueIndex();
if (cpx == 0) out.println(fields[f].getType()+" " +fields[f].getName()+";");
else out.println(fields[f].getType()+" " +fields[f].getName()+" = "+cls.getCpoolEntryobj(cpx)+";");

That adds the value of any compile-time constants to the output. I'm not
sure if it will also add values to instance fields which have initial
values; i think those are handled in the constructors, rather than as
ConstantValue attributes.

You could then compare the output of javap, or hashes of that output, to
determine if the interface of the class had changed. If it hasn't, then
any changes to the class file are inconsequential in terms of
recompilation.
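
For instance, something along these lines - just a sketch, assuming only
that you can run javap from the build and that hashing its output is an
adequate fingerprint (which it isn't yet, because of the constant-value
problem above):

import java.io.InputStream;
import java.security.MessageDigest;

class InterfaceHash {
    // Hash the javap listing of a class's visible interface. If the hash is
    // unchanged after recompiling, dependents don't need recompiling.
    static String hashOf(String classpath, String className) throws Exception {
        Process p = new ProcessBuilder("javap", "-classpath", classpath, className)
                .redirectErrorStream(true)
                .start();
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        InputStream in = p.getInputStream();
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) > 0; ) md.update(buf, 0, n);
        in.close();
        p.waitFor();
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) hex.append(String.format("%02x", b));
        return hex.toString();
    }
}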

Unless i've missed something. Notably absent from javap output is
annotations - can the annotations on a class affect compilation of other
classes which refer to it? @Override only affects the declaring class.
@Deprecated could affect another class, but could only cause a warning to
be generated.

> Perhaps I'll make it a "tighter fit" later, though honestly I'm still
> fumbling around in the dark at the moment, still learning.

PROTIP: that phase never actually ends.

tom

--
coincidences, body modification, hungarian voice sebestyen marta, **

Tom Anderson

unread,
May 9, 2010, 11:26:12 AM5/9/10
to
On Sun, 9 May 2010, Joshua Maurice wrote:

>> On May 9, 3:23 am, Tom Anderson <t...@urchin.earth.li> wrote:
>> On Sat, 8 May 2010, Joshua Maurice wrote:
>>> On May 7, 9:08 pm, "Mike Schilling" <mscottschill...@hotmail.com>
>>> wrote:
>>>> Joshua Maurice wrote:
>>>>> If anyone else cares, I managed to inadvertently stumble across a
>>>>> solution. On impulse, I asked a co-worker at lunch. It seems that
>>>>> class files do not contain sufficient information with default javac
>>>>> options. However, when compiled with -g, it contains a listing of all
>>>>> types used in the compile.
>>
>> Are you sure?
>>
>> $ javac -version
>> javac 1.6.0_16
>> $ echo "class Foo {public static final int X=23;}" >Foo.java
>> $ echo "class Bar {public static final int Y=Foo.X;}" >Bar.java
>> $ javac -g Foo.java Bar.java
>> $ grep Foo Bar.class
>> $
>>
>> I can see no sign of Bar.class containing any mention of Foo.
>
> Apparently I am mistaken. I would suggest looking for "Bar" and not
> "Bar.class", but the result is the same. static finals might be an
> exception to the debug information, which is sad. I'm wondering how
> I'll work around this now.

I wonder how hard it would be to modify javac to add an attribute to
classes to record the origins of any inlined constants. That would let you
pull the information out from the class file later on, just as you can
already do with all the other types of dependency.

A bit of a poke around in the javac source code suggests it already has
code for tracking dependencies, which is there to support JWS in some way.
There are some flags that should switch it on (-xdepend, -Xdepend, -Xjws),
but they don't work on the version i have installed. There's also some
flag you can set on the compilation environment object that will make it
print dependencies, so if you're driving compilation from code, you should
be able to set that. You'd have to parse the compiler output, but that's
not that bad.
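
If it helps, driving javac in-process and capturing that output is
straightforward with the javax.tools API; here's a minimal sketch that just
passes -verbose and collects the text for parsing (attributing each
"[loading ...]" line to the right source file is the part that's still
missing):

import javax.tools.*;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

class VerboseCompile {
    // Compile the given .java files with -verbose and return javac's chatter,
    // which includes "[loading ...]" lines for every class file it reads.
    static String compileVerbose(List<String> javaFiles) throws Exception {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        StandardJavaFileManager fm = javac.getStandardFileManager(null, null, null);
        Iterable<? extends JavaFileObject> units =
                fm.getJavaFileObjectsFromStrings(javaFiles);
        StringWriter out = new StringWriter();
        javac.getTask(out, fm, null, Arrays.asList("-verbose", "-g"), null, units).call();
        fm.close();
        return out.toString();
    }
}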

> I am still in the process of implementing, so I haven't really been able
> to test,

You're doing it wrong. Test first, test incrementally, build things in a
way you can test as you go. Before you start work for real, do a
higher-level test, a 'spike solution', to make sure that all your
assumptions (like this one) are valid.

> Hopefully this is the only such corner case. I need more tests.

Always true!

Since you have this vast and terrifying codebase, you can probably
generate some pretty thorough functional tests from it. Take pairs of
adjacent revisions from source control, do full builds on each, find the
differences in output, then apply your tool and see if it comes up with
the right answers. You should be able to automate the process of turning a
pair of revisions into a test suite, and then you can just leave it to
crank away generating them for a few days.

Mike Schilling

unread,
May 9, 2010, 12:30:36 PM5/9/10
to

Why would you think that? The ways in which a change to A can affect B are
finite and well-defined.


EJP

unread,
May 9, 2010, 7:42:38 PM5/9/10
to
On 10/05/2010 1:26 AM, Tom Anderson wrote:

Doesn't/didn't 'jikes' do all this?

Joshua Maurice

unread,
May 9, 2010, 8:05:49 PM5/9/10
to
On May 9, 9:30 am, "Mike Schilling" <mscottschill...@hotmail.com>

Let's suppose the source changed from "2" to "(1+1)". Using the strict
interpretation, the output class files would probably be equivalent
(ignoring debug information like original source code). Thus a rebuild
is technically not required.

Or, a slightly less trivial case: Suppose we have class A which has 2
public static functions A1 and A2 which have independent
implementations. Class C uses A1. Suppose someone comes along and
changes A2 in some meaningful way. Class C does not need to be
recompiled, but any sort of file level dependency analysis which is
correct would recompile Class C.

If you define it as binary file contents as equivalent output files,
then it's not equivalent to the Halting problem, I think. However, if
you define it as "The output class files display the same visible
behavior across all valid input", then I think the general case \is\
the Halting problem. You would have to prove for all allowed inputs to
the built program that the behavior is the same with the "minimal"
rebuild and with a full clean build, and that there is no smaller
rebuild which would display the same behavior.

If you instead simplify it to "If a file is touched, then all direct
dependencies and ghost dependencies should be recompiled, and repeat
until the leaves of the cascade are unchanged", then I think this is
quite doable. However, someone did mention "But what if you change
just a comment?". I can extend this too "but what if you just changed
a '2' to '1+1'?", which can probably be extended further. Hopefully
you see my point.

Joshua Maurice

unread,
May 9, 2010, 8:06:33 PM5/9/10
to
On May 9, 4:42 pm, EJP <esmond.not.p...@not.bigpond.com> wrote:
> On 10/05/2010 1:26 AM, Tom Anderson wrote:
>
> Doesn't/didn't 'jikes' do all this?

As far as I can tell, Jikes is no longer in development or supported.
Also, I have doubts as to its correctness. Also, my company would
probably feel better if the compiler in use was Sun's javac and not
Jikes.

Joshua Maurice

unread,
May 9, 2010, 8:11:10 PM5/9/10
to
On May 9, 8:26 am, Tom Anderson <t...@urchin.earth.li> wrote:
> On Sun, 9 May 2010, Joshua Maurice wrote:
> > I am still in the process of implementing, so I haven't really been able
> > to test,
>
> You're doing it wrong. Test first, test incrementally, build things in a
> way you can test as you go. Before you start work for real, do a
> higher-level test, a 'spike solution', to make sure that all your
> assumptions (like this one) are valid.

I am. It's just that implementing this class file parser, ghost
dependency, build system, etc., from scratch will take some time. I
almost have it working on a toy example, at which point I can throw
what little tests I have against it.

Mike Schilling

unread,
May 9, 2010, 8:21:34 PM5/9/10
to
EJP wrote:
> Doesn't/didn't 'jikes' do all this?

It did something of the sort (I don't know how thoroughly), but it's very
obsolete, never having supported the language features that were new in JDK
1.5. But having thought about this a bit, I think that the only feasible
way to support anything like minimal recompilation is by having the compiler
calculate and persist dependency information.


Mike Schilling

unread,
May 9, 2010, 8:41:51 PM5/9/10
to
Joshua Maurice wrote:
> On May 9, 9:30 am, "Mike Schilling" <mscottschill...@hotmail.com>
> wrote:
>> Joshua Maurice wrote:
>>> It is technically impossible to do a true minimal recompile
>>> algorithmically. Let's define it as "Let a build be a set of file
>>> compilations. Let the minimum recompile be the minimum such set for
>>> which the output class file are equivalent to the class files of a
>>> full clean build." First, we'd have to prove such a minimum exists.
>>> That's relatively straightforward. With that out of the way, I
>>> think I could then prove that the problem is equivalent to the
>>> Halting problem. If you define "equivalent" generously, I'm pretty
>>> sure this is the case. If you define it as "same binary file
>>> content", then perhaps not, though still possibly yes.
>>
>> Why would you think that? The ways in which a change to A can affect
>> B is finite and well-defined.
>
> Let's suppose the source changed from "2" to "(1+1)". Using the strict
> interpretation, the output class files would probably be equivalent
> (ignoring debug information like original source code). Thus a rebuild
> is technically not required.

If the change is from

public static final int I = 2;

to

public static final int I = 1 + 1;

I'd accept a system that recompiles all users of I as still "minimal". (Not
that it's a difficult optimization to make, since the rules for what's a
compile-time constant are straightforward.) If this change isn't to a
compile-time constant, it would have no effect on anything defined in a
different source file.
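
To spell the rule out with a made-up example (Flags and computeLimit are
just illustration, not anything from this thread): only a final field of
primitive or String type initialized with a constant expression is a
compile-time constant, and only those get inlined into other classes.

public class Flags {
    // Compile-time constant: inlined into every class that reads Flags.MODE,
    // so changing its value means recompiling those classes.
    public static final int MODE = 2;

    // Not a compile-time constant: the initialiser is a method call, so users
    // read it via getstatic at run time and need no recompile when it changes.
    public static final int LIMIT = computeLimit();

    private static int computeLimit() { return 1 + 1; }
}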

>
> Or, a slightly less trivial case: Suppose we have class A which has 2
> public static functions A1 and A2 which have independent
> implementations. Class C uses A1. Suppose someone comes along and
> changes A2 in some meaningful way.

That is, changed its signature. Merely changing its implementation would not
require C to be recompiled.


> Class C does not need to be
> recompiled, but any sort of file level dependency analysis which is
> correct would recompile Class C.

Right, strictly minimal recompilation would need to know not only that C
uses A but what parts of A it uses. Quite straightforward, though perhaps not
a good idea. (It's possible that gathering too much dependency information,
and evaluating it at too granular a level, slows the whole process down,
compared to allowing some recompilations that are strictly speaking
unnecessary.)

>
> If you define it as binary file contents as equivalent output files,
> then it's not equivalent to the Halting problem, I think. However, if
> you define it as "The output class files display the same visible
> behavior across all valid input", then I think the general case \is\
> the Halting problem.

If I understand you, you're distinguishing between:

1. A recompilation would result in the same class file that already
exists, and
2. A recompilation would result in a change to the class file, but the
result of running the two will always be the same

I agree that a system that tries to determine 2 is infeasible. And my point
about adding the comment isn't that the file itself would be recompiled.
That's acceptable, and might even be necessary to get the line numbers seen
by a debugger to be correct. My point is that all files which refer to it
are recompiled, even though nothing they use (the class definitions, the
field and method definitions, and the values of the compile-time constants)
has changed. The same is true when e.g. method implementations change or an
init block is added; I was using "add a comment" as the limiting case of a
change no other file would care about.

Lew

unread,
May 10, 2010, 12:42:34 AM5/10/10
to
Mike Schilling wrote:
> If the change is from
>
> public static final int I = 2;
>
> to
>
> public static final int I = 1 + 1;
>
> I'd accept a system that recompiles all users of I as still "minimal". (Not
> that it's a difficult optimization to make, since the rules for what's a
> compile-time constant are straightforward.) If this change isn't to a
> compile-time constant, it would have no effect on anything defined in a
> different source file.

That source change would produce no change in the bytecode of the class
wherein 'I' is defined.

Ergo there would be no need for users of I to be recompiled.

So a system that recompiles no users of I in that case would be "minimal".

I'm having a hard time wrapping my mind around your last sentence in that
paragraph.

--
Lew

Mike Schilling

unread,
May 10, 2010, 2:18:22 AM5/10/10
to
Lew wrote:
> Mike Schilling wrote:
>> If the change is from
>>
>> public static final int I = 2;
>>
>> to
>>
>> public static final int I = 1 + 1;
>>
>> I'd accept a system that recompiles all users of I as still
>> "minimal". (Not that it's a difficult optimization to make, since
>> the rules for what's a compile-time constant are straightforward.)
>> If this change isn't to a compile-time constant, it would have no
>> effect on anything defined in a different source file.
>
> That source change would produce no change in the bytecode of the
> class wherein 'I' is defined.
>
> Ergo there would be no need for users of I to be recompiled.
>
> So a system that recompiles no users of I in that case would be
> "minimal".

I'm saying that one that says "definition of I changed, we need to recompile
I's users" would be minimal enough for me.

>
> I'm having a hard time wrapping my mind around your last sentence in
> that paragraph.

If the sole change were

void doit()
{
    int i = 1 + 1; //formerly int i = 2
}

it's still true that the class's bytecode doesn't change. It's also true
that nothing about A that could possibly require anything else to recompile
changed, since there are never dependencies on method implementations. Thus
there's no need to recognize that A's bytecode didn't change, because it's
really irrelevant; there'd also be no need to recompile any other files if
the sole change were

void doit()
{
    int i = 1 + 2; //formerly int i = 2
}

Joshua Maurice

unread,
May 12, 2010, 5:35:34 PM5/12/10
to
Bad news. It appears that class files do not contain the necessary
dependency information for my goal of not rebuilding all java files
downstream. Ex:

//AA.java
public class AA { public final int x = 1; }

//BB.java
public class BB { public int x = new AA().x; }

//javap -verbose -classpath . BB
Compiled from "BB.java"
public class BB extends java.lang.Object
SourceFile: "BB.java"
minor version: 0
major version: 50
Constant pool:
const #1 = Method #7.#19; // java/lang/Object."<init>":()V
const #2 = class #20; // AA
const #3 = Method #2.#19; // AA."<init>":()V
const #4 = Method #7.#21; // java/lang/Object.getClass:()Ljava/lang/Class;
const #5 = Field #6.#22; // BB.x:I
const #6 = class #23; // BB
const #7 = class #24; // java/lang/Object
const #8 = Asciz x;
const #9 = Asciz I;
const #10 = Asciz <init>;
const #11 = Asciz ()V;
const #12 = Asciz Code;
const #13 = Asciz LineNumberTable;
const #14 = Asciz LocalVariableTable;
const #15 = Asciz this;
const #16 = Asciz LBB;;
const #17 = Asciz SourceFile;
const #18 = Asciz BB.java;
const #19 = NameAndType #10:#11;// "<init>":()V
const #20 = Asciz AA;
const #21 = NameAndType #25:#26;// getClass:()Ljava/lang/Class;
const #22 = NameAndType #8:#9;// x:I
const #23 = Asciz BB;
const #24 = Asciz java/lang/Object;
const #25 = Asciz getClass;
const #26 = Asciz ()Ljava/lang/Class;;

{
public int x;

public BB();
Code:
Stack=3, Locals=1, Args_size=1
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: aload_0
5: new #2; //class AA
8: dup
9: invokespecial #3; //Method AA."<init>":()V
12: invokevirtual #4; //Method java/lang/Object.getClass:()Ljava/lang/Class;
15: pop
16: iconst_1
17: putfield #5; //Field x:I
20: return
LineNumberTable:
line 1: 0

LocalVariableTable:
Start Length Slot Name Signature
0 21 0 this LBB;


}
/////
/////

Specifically note that the instructions to initialize BB.x involve
"iconst_1", which, as I understand it, puts the constant 1 on the
stack. javac, even with -g, inlined the value of a final not-static
int field. If it can do this, I don't know what else it can do, and I
don't want to try to write tests to catch all possible variations. So,
I'm back to calling javac -verbose once per java file (but still only
one JVM), and parsing the verbose output to get the actual class
compile dependencies. From initial testing, this slows down a clean
build by a factor of 4x to 5x for my code base.

So, I'm back to square one. Anyone know a way to get the dependency
information javac -verbose would supply per java file without calling
javac -verbose (using the tools.jar API) once per java file?

Anyone have an inkling of how hard this would be to just modify javac
itself to output the information I need? I assume the change would be
quite minor, relatively speaking. It already prints out a message when
it loads a class file, and it has to know which java file it's loading
it for; it's just that it only prints the message the first time
loading the class file (presumably caching its contents), and does not
print out the java file for which it's being loaded.

Arne Vajhøj

unread,
May 12, 2010, 5:37:29 PM5/12/10
to
On 12-05-2010 17:35, Joshua Maurice wrote:
> Bad news. It appears that class files do not contain the necessary
> dependency information for my goal of not rebuilding all java files
> downstream. Ex:
>
> //AA.java
> public class AA { public final int x = 1; }
>
> //BB.java
> public class BB { public int x = new AA().x; }

Well - we told you that last week, so that should not
come as a surprise.

Arne

Joshua Maurice

unread,
May 13, 2010, 3:43:49 AM5/13/10
to

I believe no one suggested specifically that "The dependency
information contained in class files compiled with debug information
ala javac -g and ghost dependencies obtained via JavacTask in
tools.jar (which alone is highly underspecified) is insufficient to do
a Java file level incremental build which does not cascade endlessly
downstream."

Nevertheless, I have finished the implementation, and it appears to be
working quite well, passing a small suite of tests I wrote beforehand.
I plan to implement this on ~1/8 of a piece of my product's code base,
~100 Maven poms, and see how well it performs. I've never written
something quite like this in Java before (yes, I'm a C++ guy in case
you couldn't tell), and the design is new as well. I'm cautiously
optimistic.

Arne Vajhøj

unread,
May 13, 2010, 8:07:09 PM5/13/10
to
On 13-05-2010 03:43, Joshua Maurice wrote:

> On May 12, 2:37 pm, Arne Vajhøj<a...@vajhoej.dk> wrote:
>> On 12-05-2010 17:35, Joshua Maurice wrote:
>>> Bad news. It appears that class files do not contain the necessary
>>> dependency information for my goal of not rebuilding all java files
>>> downstream. Ex:
>>
>>> //AA.java
>>> public class AA { public final int x = 1; }
>>
>>> //BB.java
>>> public class BB { public int x = new AA().x; }
>>
>> Well - we told you that last week, so that should not
>> come as a surprise.
>
> I believe no one suggested specifically that "The dependency
> information contained in class files compiled with debug information
> ala javac -g and ghost dependencies obtained via JavacTask in
> tools.jar (which alone is highly underspecified) is insufficient to do
> a Java file level incremental build which does not cascade endlessly
> downstream."

No, but the case of constants not being in the class file
was explicitly discussed.

Arne

Joshua Maurice

unread,
May 13, 2010, 9:14:04 PM5/13/10
to
On May 13, 5:07 pm, Arne Vajhøj <a...@vajhoej.dk> wrote:
> On 13-05-2010 03:43, Joshua Maurice wrote:
>
>
>
> > On May 12, 2:37 pm, Arne Vajhøj<a...@vajhoej.dk>  wrote:

I'm sorry. Perhaps I am mistaken, but I recall only discussion of
"static finals", not "constants". It does appear that "constants" is
the correct conclusion.

Steven Simpson

unread,
May 14, 2010, 7:15:01 AM5/14/10
to
On 12/05/10 22:35, Joshua Maurice wrote:
> Bad news. It appears that class files do not contain the necessary
> dependency information for my goal of not rebuilding all java files
> downstream. Ex:
>
> //AA.java
> public class AA { public final int x = 1; }
>
> //BB.java
> public class BB { public int x = new AA().x; }
>
> //javap -verbose -classpath . BB
> Compiled from "BB.java"
> public class BB extends java.lang.Object
> SourceFile: "BB.java"
> minor version: 0
> major version: 50
> Constant pool:
> const #1 = Method #7.#19; // java/lang/Object."<init>":()V
> const #2 = class #20; // AA
>

Surely, you /do/ have enough information in this case, as BB.class
refers to AA in order to call 'new AA()' in order to get AA#x. You
don't specifically need to know that BB uses AA#x, do you? ...just that
BB uses AA.

Only if AA#x was static would you be able to write AA.x, which would be
inlined with no reference to AA.


--
ss at comp dot lancs dot ac dot uk

Joshua Maurice

unread,
May 14, 2010, 4:11:28 PM5/14/10
to

A simple example of my goal is the following, for the appropriate
definition of "using":
A uses B
B uses C.

I modify C. I need to rebuild C. If C's classfile has the same binary
contents, then no further work needs to be done. Otherwise, I need to
rebuild B. If B's classfile has the same binary contents, then no
further work needs to be done. Otherwise, I need to rebuild A.

Put more simply, I want a system where I do not have to rebuild all
java files downstream. I think that it is unnecessary to do such a
thing, and a lot of time could be saved if you could identify a point
where a change no longer "ripples" down the dependency chain.
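
In rough code, the loop I have in mind looks like this - recompile() and
directAndGhostDependentsOf() are placeholders for machinery not shown here,
and a real version also has to cope with one java file producing several
class files:

import java.io.File;
import java.util.*;

class CascadingRebuild {
    // Placeholders for the real build machinery.
    interface BuildGraph {
        byte[] classFileBytes(File source) throws Exception; // null if not built yet
        void recompile(File source) throws Exception;
        Collection<File> directAndGhostDependentsOf(File source);
    }

    // Recompile a changed file, and only push its dependents onto the queue
    // if its output actually changed byte-for-byte, so the cascade stops as
    // soon as the change no longer ripples.
    static void rebuild(File changedSource, BuildGraph graph) throws Exception {
        Deque<File> work = new ArrayDeque<File>();
        Set<File> done = new HashSet<File>();
        work.add(changedSource);
        while (!work.isEmpty()) {
            File source = work.remove();
            if (!done.add(source)) continue;
            byte[] before = graph.classFileBytes(source);
            graph.recompile(source);
            byte[] after = graph.classFileBytes(source);
            if (!Arrays.equals(before, after)) {
                work.addAll(graph.directAndGhostDependentsOf(source));
            }
        }
    }
}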


Let's take a slight alteration of the example from the previous
post:

//AA.java
public class AA { public final int x = 1; }
//BB.java
public class BB extends AA {}
//CC.java
public class CC { public final int x = new BB().x; }

When javac compiles CC.java, it loads BB.class, looks for a member
named x, finds no such member, then loads AA.class, and finds member
x. javac's verbose output contains this information.

The class file CC.class does not refer to "AA" or "x". It calls BB's
constructor, and it hardcodes "1" through the JVM instruction
iconst_1.

With just the information available in the class files, I don't think
it would be possible to detect when the change cascading down the
dependency graph can have no further effects. To know when the cascade
is done, I need the full compile dependencies, specifically "CC uses
AA". (I also need ghost dependencies, as mentioned else-thread.)


Note that there is no analogy between the first and second examples of
this post. The first example is "A uses B, and B uses C". For the
second example, I would say "BB uses AA, CC uses BB, and CC uses AA."


Maybe one day I'll get even fancier, and instead of "same class file
binary contents", I'll use something like "same super class, same
interface, same methods".

Joshua Maurice

unread,
May 14, 2010, 4:15:44 PM5/14/10
to
On May 14, 1:11 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> Maybe one day I'll get even fancier, and instead of "same class file
> binary contents", I'll use something like "same super class, same
> interface, same methods".

Ack, that should read "same super class, same interfaces*, same
members*".

Tom Anderson

unread,
May 14, 2010, 5:25:16 PM5/14/10
to
On Wed, 12 May 2010, Joshua Maurice wrote:

> //AA.java
> public class AA { public final int x = 1; }
>
> //BB.java
> public class BB { public int x = new AA().x; }
>
> //javap -verbose -classpath . BB

> public BB();
> Code:
> Stack=3, Locals=1, Args_size=1
> 0: aload_0
> 1: invokespecial #1; //Method java/lang/Object."<init>":()V
> 4: aload_0
> 5: new #2; //class AA
> 8: dup
> 9: invokespecial #3; //Method AA."<init>":()V
> 12: invokevirtual #4; //Method java/lang/Object.getClass:()Ljava/lang/Class;
> 15: pop
> 16: iconst_1
> 17: putfield #5; //Field x:I
> 20: return
>

> Specifically note that the instructions to initialize BB.x involve
> "iconst_1", which, as I understand it, puts the constant 1 on the
> stack. javac, even with -g, inlined the value of a final not-static
> int field.

Yeah, this is a bit weird.

Also, what the hell is that getClass call all about? I see that in code i
compile too (javac 1.6.0_16). A bit of googling reveals it's the code
generated to force a null check of a variable, and this is used in
compiling certain contortions involving inner classes. But there's no
inner class here, and there is no way in a month of sundays that the top
of stack can be null at instruction 12 - it's produced by applying dup to
the result of new, and new can never produce a null (right?). So what's it
doing?

Anyway, turning back to the initialisation of x. if you look at the
bytecode of AA, that's also weird. It has a constructor which does
iconst_1 + putfield to initialise x - but x *also* has a ConstantValue
attribute, giving it the value 1. Why both? If you write a verion of AA
where x is static, then there's only a ConstantValue, and no synthetic
clinit or anything touching it. Or instead make it non-final, and of
course it keeps the constructor but loses the ConstantValue.

The good news is that it looks like you can detect 'silently inlinable'
variables by the presence of a ConstantValue attribute. The bad news is
that javac does seem to be violating the VM spec (AIUI) here.

And on the gripping hand, you still have no way to discover the relevance
of AA from CC (the class you mention in a later post).

When i looked into this a while ago, my planned approach was:

1. Keep a table of explicit dependencies between classes (ie CC -> BB, but
not CC -> AA)

2. Keep a tree of direct inheritance relationships, probably including
interface implementation (ie BB -> AA)

3. Define the 'signature' of a class to be the aggregation of its
kind (class or interface), name, list of direct supertypes, the names
and types of its non-private fields, the values of its constant fields,
and the names, parameter types, return types, and exception lists of
its methods. Anything else?

4. When a source file changes, recompile, and compare the signature of the
new class to that of the old class

5. If the signature has changed, walk the inheritance tree, and build
the set of all classes which descend from the class - call this,
including the original class, the family.

6. Use the dependency table to find every class which depends on a member
of the family. Call these the friends.

7. Recompile the family and friends.

8. Repeat the analysis on the newly recompiled files; this is necessary
because changes to constant values can propagate.

If you extend javap to report constant field values, then you can use the
hash of the output of javap as a practical stand-in for a complete
signature. It's a bit oversensitive, because it will change if you add or
remove a static block, or cause the set of secret inner-class backdoor
methods to change, neither of which really change the signature.
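
A crude stand-in for that signature can be knocked up with reflection -
assuming you can load the classes, and accepting that reflection can't see
the constant values javac inlined, which is exactly the part that needs the
javap patch or real class-file parsing:

import java.lang.reflect.*;
import java.util.*;

class ClassSignature {
    // Textual "signature" of a class: kind, name, direct supertypes, and the
    // non-private fields, constructors and methods. Compare (or hash) the old
    // and new signatures to decide whether dependents need recompiling.
    static String signatureOf(Class<?> c) {
        StringBuilder sb = new StringBuilder();
        sb.append(c.isInterface() ? "interface " : "class ").append(c.getName()).append('\n');
        sb.append("extends ").append(c.getSuperclass()).append('\n');
        sb.append("implements ").append(Arrays.toString(c.getInterfaces())).append('\n');
        List<String> members = new ArrayList<String>();
        for (Field f : c.getDeclaredFields())
            if (!Modifier.isPrivate(f.getModifiers())) members.add(f.toString());
        for (Constructor<?> k : c.getDeclaredConstructors())
            if (!Modifier.isPrivate(k.getModifiers())) members.add(k.toString());
        for (Method m : c.getDeclaredMethods())
            if (!Modifier.isPrivate(m.getModifiers())) members.add(m.toString());
        Collections.sort(members);   // make the result order-independent
        for (String s : members) sb.append(s).append('\n');
        return sb.toString();
    }
}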

I didn't know about ghost dependencies, so i didn't deal with those at
all. But on that subject - am i right in thinking that to build the set of
ghost dependencies, you need to know every name used by the class? If so,
doesn't that already cover this situation? CC uses the name BB.x, and
presumably you have to have an inheritance rule like the above that means
that a change to AA.x means a change to BB.x if there is no actual BB.x.

It is really bloody annoying that compile-time constants can be inlined
like this. Would it be legal for a compiler to *not* inline them? If so,
an option to javac to tell it to do that would be incredibly useful in a
situation like this.

tom

--
inspired by forty-rod whiskey

Joshua Maurice

unread,
May 14, 2010, 6:44:03 PM5/14/10
to

See
http://www.jot.fm/issues/issue_2004_12/article4.pdf
for "ghost dependencies".

I don't think that ghost dependencies, as presented in the paper, will
catch this. Again, take the example


//AA.java
public class AA { public final int x = 1; }
//BB.java
public class BB extends AA {}
//CC.java
public class CC { public final int x = new BB().x; }

CC.java has ghost dependencies "CC", "BB", "x", aka all names in the
source file (using the Java technical definition of "name" as a single
identifier, or a list of identifiers separated by dots '.'), then get
all possible interpretations under all imports (including the implicit
import <this-package>.*;), then close over all such prefixes. (Or
something like that. The details are somewhat involved. See the
paper.)

AA.class exports the name "AA", aka the full name of the class.
BB.class exports the name "BB", aka the full name of the class.

I'm not sure offhand if there is a good way to extend ghost
dependencies to catch this case without introducing a lot of false
positives.
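
For what it's worth, my reading of the name-expansion step looks roughly
like this - a sketch of my understanding only, not the paper's actual
algorithm, and candidates() and its arguments are made up:

import java.util.*;

class GhostDeps {
    // Expand each simple name used in a source file into every fully
    // qualified name it could resolve to under the file's own package and
    // its on-demand imports. A "ghost" dependency is recorded on each
    // candidate, so the file is recompiled if such a class appears later.
    static Set<String> candidates(Set<String> namesUsed, String pkg, List<String> starImports) {
        Set<String> out = new TreeSet<String>();
        for (String name : namesUsed) {
            out.add(name);                     // as written (could be fully qualified already)
            out.add(pkg + "." + name);         // implicit import of the file's own package
            out.add("java.lang." + name);      // implicit import java.lang.*
            for (String imp : starImports) {
                out.add(imp + "." + name);     // e.g. import com.foo.* -> com.foo.Name
            }
        }
        return out;
    }
}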

--
I've also given some thought, as you have, to maintaining a list keeping
track of super classes. I'm not sure how it would interact with this
example:

//AAA.java
public class AAA { public static int aaa = 1; }
//BBB.java
public class BBB { public static AAA bbb = null; }
//CCC.java
public class CCC { public static BBB ccc = null; }
//DDD.java
public class DDD { public final int ddd = CCC.ccc.bbb.aaa; }

If we change AAA.aaa to "public static double aaa = 2", then BBB.class
would be a noop recompile, CCC.class would be a noop recompile, but
DDD.class would need a recompile. Again, I think I would need the same
information to make this work without endless cascading; I would need
to know that DDD (directly) uses AAA. I thus think that your / my
scheme of keeping tracking of super classes would not be terribly
effective / productive.

Arne Vajhøj

unread,
May 14, 2010, 6:46:48 PM5/14/10
to
On 13-05-2010 21:14, Joshua Maurice wrote:

> On May 13, 5:07 pm, Arne Vajhøj<a...@vajhoej.dk> wrote:
>> On 13-05-2010 03:43, Joshua Maurice wrote:
>>>> On May 12, 2:37 pm, Arne Vajhøj<a...@vajhoej.dk> wrote:
>>>> On 12-05-2010 17:35, Joshua Maurice wrote:
>>>>> Bad news. It appears that class files do not contain the necessary
>>>>> dependency information for my goal of not rebuilding all java files
>>>>> downstream. Ex:
>>
>>>>> //AA.java
>>>>> public class AA { public final int x = 1; }
>>
>>>>> //BB.java
>>>>> public class BB { public int x = new AA().x; }
>>
>>>> Well - we told you that last week, so that should not
>>>> come as a surprise.
>>
>>> I believe no one suggested specifically that "The dependency
>>> information contained in class files compiled with debug information
>>> ala javac -g and ghost dependencies obtained via JavacTask in
>>> tools.jar (which alone is highly underspecified) is insufficient to do
>>> a Java file level incremental build which does not cascade endlessly
>>> downstream."
>>
>> No but the case of constants not being in the class file
>> where explicitly discussed.
>
> I'm sorry. Perhaps I am mistaken, but I recall only discussion of
> "static finals", not "constants". It does appear that "constants" is
> the correct conclusion.

static final is constant in Java.

Arne

Joshua Maurice

unread,
May 14, 2010, 7:25:19 PM5/14/10
to
On May 14, 3:46 pm, Arne Vajhøj <a...@vajhoej.dk> wrote:
> On 13-05-2010 21:14, Joshua Maurice wrote:
>
>
>
> > On May 13, 5:07 pm, Arne Vajhøj<a...@vajhoej.dk>  wrote:

> >> On 13-05-2010 03:43, Joshua Maurice wrote:
> >>> On May 12, 2:37 pm, Arne Vajhøj<a...@vajhoej.dk>    wrote:

> >>>> On 12-05-2010 17:35, Joshua Maurice wrote:
> >>>>> Bad news. It appears that class files do not contain the necessary
> >>>>> dependency information for my goal of not rebuilding all java files
> >>>>> downstream. Ex:
>
> >>>>> //AA.java
> >>>>> public class AA { public final int x = 1; }
>
> >>>>> //BB.java
> >>>>> public class BB { public int x = new AA().x; }
>
> >>>> Well - we told you that last week, so that should not
> >>>> come as a surprise.
>
> >>> I believe no one suggested specifically that "The dependency
> >>> information contained in class files compiled with debug information
> >>> ala javac -g and ghost dependencies obtained via JavacTask in
> >>> tools.jar (which alone is highly underspecified) is insufficient to do
> >>> a Java file level incremental build which does not cascade endlessly
> >>> downstream."
>
> >> No but the case of constants not being in the class file
> >> where explicitly discussed.
>
> > I'm sorry. Perhaps I am mistaken, but I recall only discussion of
> > "static finals", not "constants". It does appear that "constants" is
> > the correct conclusion.
>
> static final is constant in Java.

Ok. Let me try again. I recall only discussions of static finals. The
problem of expanding inline "things" also happens with non-static
final fields. I do not recall any discussion previous to that post of
mine about non-static final fields being a problem. I'm sorry; I did
not know that "constant" means "static final" in Java.

le...@lewscanon.com

unread,
May 14, 2010, 8:24:54 PM5/14/10
to
Arne Vajhøj wrote:
>> static final is constant in Java.
>

Joshua Maurice <joshuamaur...@gmail.com> wrote:
> Ok. Let me try again. I recall only discussions of static finals. The
> problem of expanding inline "things" also happens with non-static
> final fields. I do not recall any discussion previous to that post of
> mine about non-static final fields being a problem. I'm sorry; I did
> not know that "constant" means "static final" in Java.
>

Strictly speaking, it doesn't. It's a really, really, really good
idea to study the language spec if you actually want to learn Java.
The relevant section is:
<http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#10931>

--
Lew
