Alf P. Steinbach /Usenet

Jul 8, 2010, 5:13:53 PM
Occasionally I fire up Thunderbird and look in [comp.lang.c++] if perhaps there
should be some question that I could answer.

But no.

There are always some new postings, but even if a question has been asked only a
minute ago, it has already been answered!

Argh.

OK, I'll ask a question myself: what is a good way to implement something where
C++ event handlers can be specified in an XML definition of a GUI?


Cheers,

- Alf

--
blog at <url: http://alfps.wordpress.com>

Andrea

Jul 8, 2010, 5:36:24 PM
Alf P. Steinbach /Usenet wrote:
> Occasionally I fire up Thunderbird and look in [comp.lang.c++] if perhaps there
> should be some question that I could answer.
>
> But no.
>
> There's always some new postings, but even if a question has been asked only a
> minute ago, it has already been answered!

there's one about patterns, give it a shot

> Argh.
>
> OK, I'll ask a question myself: what is a good way to implement some thing where
> C++ event handlers can be specified in an XML definition of a GUI?

don't understand your question

Jonathan Lee

Jul 8, 2010, 5:39:06 PM
On Jul 8, 5:13 pm, "Alf P. Steinbach /Usenet" <alf.p.steinbach+use...@gmail.com> wrote:
> OK, I'll ask a question myself: what is a good way to implement some thing where
> C++ event handlers can be specified in an XML definition of a GUI?

I used to use XUL with javascript to accomplish something like this.
I suppose you could define a mapping between XUL widgets and the
host GUI, and between javascript events and C++. I'm sure that's
effectively what happens anyway.

--Jonathan

Alf P. Steinbach /Usenet

Jul 8, 2010, 5:48:30 PM
* Andrea, on 08.07.2010 23:36:

> Alf P. Steinbach /Usenet wrote:
>> Occasionally I fire up Thunderbird and look in [comp.lang.c++] if perhaps there
>> should be some question that I could answer.
>>
>> But no.
>>
>> There's always some new postings, but even if a question has been asked only a
>> minute ago, it has already been answered!
>
> there's one about patterns, give it a shot

Well, OK, done, but really it isn't any fun replying to malformed questions.


>> Argh.
>>
>> OK, I'll ask a question myself: what is a good way to implement some thing where
>> C++ event handlers can be specified in an XML definition of a GUI?
>
> don't understand your question

If you had used XUL you would have. :-)

Dilip

Jul 8, 2010, 8:46:36 PM
On Jul 8, 4:13 pm, "Alf P. Steinbach /Usenet" <alf.p.steinbach+use...@gmail.com> wrote:
> OK, I'll ask a question myself: what is a good way to implement some thing where
> C++ event handlers can be specified in an XML definition of a GUI?
>

That sounds suspiciously like XAML :-) Too bad it's only supported on
the .NET platform.

Or you could take a look at this: http://cristianadam.blogspot.com/2009/01/xaml-in-native-c.html

Joshua Maurice

Jul 8, 2010, 8:59:59 PM
On Jul 8, 2:48 pm, "Alf P. Steinbach /Usenet" <alf.p.steinbach+use...@gmail.com> wrote:

At great risk of being offtopic (though I would argue it's quite
important to using the C++ language, and thus on topic), here's a fun
question for you. (Well, a series of questions.) Why is there no
incrementally correct build system, or easy to use incrementally
correct build framework, for C, C++, and Java? As far as I can tell,
all publicly available solutions fail basic incremental correctness
tests.

Let me further define the problem.

A build is the act of following a set of steps, a plan, a process, of
creating executable "files" from source "files".

Let me try to define incremental. A user has a codebase, a local view
of source control, a bunch of source files on disk. The source
includes the build script files as well, such as the makefiles. The
user does a full clean build. The user then makes some changes to the
source, such as adding, removing, or editing source files (which
include build script files). The user then does another build which
selectively skips (some) portions of the full build that are
unnecessary because they would produce output equivalent to the
already existing files. This partial build is called an incremental
build.

A correct incremental build is an incremental build which produces
output files equivalent to what a full clean build would produce. An
incremental build process, or incremental build system, is
(incrementally) correct if it can only produce correct incremental
builds, that is, if it will always produce output equivalent to a full
clean build.

An incremental build can be done by hand. A build system is just a
build process, a plan to do a build, a set of actionable items to do a
build. The dependency analysis can be done manually. However, such
analysis tends to take longer than just a full clean build, and
mistakes can be made by the human doing the analysis, so it's not really
a correct build system either. Thus any correct incremental build system
must automate the tracking of dependencies.
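To make that concrete, here is a minimal sketch (my own illustration,
not any particular existing tool) of the record a build step would have
to keep in order to decide "skip or redo" safely: if the exact command
line or the content of any declared input differs from what was
recorded after the previous run, the step must rerun. Timestamps do not
appear anywhere; they are neither necessary nor sufficient.

    #include <filesystem>
    #include <fstream>
    #include <functional>
    #include <sstream>
    #include <string>
    #include <vector>

    // Fingerprint of one build step: the exact command line plus the content
    // of every file the command reads. If it differs from the value recorded
    // after the previous run, the step is out of date.
    struct StepFingerprint {
        std::string command;              // includes -D flags, -I paths, ...
        std::vector<std::string> inputs;  // every input file, not just the "main" one

        std::size_t value() const {
            std::ostringstream all;
            all << command << '\n';
            for (const auto& path : inputs) {
                std::ifstream in(path, std::ios::binary);
                all << path << '\n' << in.rdbuf() << '\n';   // content, not mtime
            }
            return std::hash<std::string>{}(all.str());
        }
    };

    // The step may be skipped only if its output exists and nothing that could
    // influence that output has changed since 'recorded' was taken.
    bool up_to_date(const StepFingerprint& step,
                    const std::string& output,
                    std::size_t recorded) {
        return std::filesystem::exists(output) && step.value() == recorded;
    }

The recorded value would live in a state file owned by the build system
itself, not in the project's sources.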

Under that definition, all the build systems and build frameworks
known to me are not incrementally correct, to varying degrees.

The common GNU Make solution described in Recursive Make Considered
Harmful for C and C++ is the closest, but still misses out on corner
cases, including:
1- Removing a C++ source file when using wildcards will not result
in a relink of its library or executable.
2- Adding a new include file which "hides" another include file on an
include path will not result in a recompile of all corresponding
object files (see the sketch below).
3- A change to a makefile itself won't always be caught, such as a
change to a command line preprocessor macro definition.
4- Other various changes to the makefiles which might inadvertently
break incremental correctness.
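As a sketch of what it would take to catch point 2 (hypothetical helper
names, C++17, not any existing tool): record where each include name
resolved on the search path at compile time, and on the next run
re-resolve and rebuild if any answer changed, including the case where
a file earlier on the path now exists.

    #include <filesystem>
    #include <optional>
    #include <string>
    #include <utility>
    #include <vector>

    namespace fs = std::filesystem;

    // Resolve an include name the way the compiler would: first hit on the
    // include path wins.
    std::optional<fs::path> resolve_include(const std::string& name,
                                            const std::vector<fs::path>& include_path) {
        for (const auto& dir : include_path)
            if (fs::exists(dir / name))
                return dir / name;
        return std::nullopt;
    }

    // An object file is stale if any include name recorded at its last compile
    // now resolves to a different file (or no longer resolves at all). This is
    // exactly the case a plain timestamp-based dependency graph misses: a new
    // header earlier on the path "hides" the old one without touching it.
    bool includes_moved(const std::vector<std::pair<std::string, fs::path>>& recorded,
                        const std::vector<fs::path>& include_path) {
        for (const auto& [name, old_path] : recorded)
            if (resolve_include(name, include_path) != old_path)
                return true;
        return false;
    }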

One might argue that 3, and to a larger extent 4, are outside the
scope of a build system. I disagree with that assessment. When I'm
working on my company's large codebase, and I do a sync to changelist,
this includes changing makefiles which I know nothing about. I want
the build system to correctly incrementally build affected files
without requiring a full clean build. However, with the common GNU
Make solution described in Recursive Make Considered Harmful, this
will not happen; the build can be incorrect.

I'll skip the in-depth discussion of the various other publicly available
build systems, but as far as I can tell, they are all incorrect under
this set of criteria.

So, my question is: is there some build system for C, C++, and Java,
extensible to other sane programming languages, which is incrementally
correct and which I somehow glossed over?

Why is there no incrementally correct build system out there? I would
argue that incremental builds represent the most effective approach to
decreasing build times. If your build is taking too long, you can
throw hardware at it, parallelization at it, distributed systems on a
grid at it, throw better faster compilers at it, etc., but all of
these approaches are just taking off a coefficient of the build.
Incremental builds tend to result in build times which are
proportional to the size of the change, not the size of the code base,
which makes them asymptotically much faster than any other possible
change to the build. (pimpl is an interesting exception. pimpl does
improve full clean build times by more than just some coefficient, by
reducing the size of the effective source given to the compiler. pimpl
also plays well with incremental builds, decreasing the number of
dependencies of the build and thereby improving an incremental build.)

Finally, if I manage to get my company's legal department to let me open
source my own little build tool, which I've been writing myself in my spare
time, what should I license it under? (I was leaning GNU GPL.) Where
would I put it up to get people to actually consider using it?

Perhaps more generally, who should I bug about the huge shortcoming in
all modern build technology?

Öö Tiib

Jul 9, 2010, 2:41:32 AM
On 9 July, 00:13, "Alf P. Steinbach /Usenet" <alf.p.steinbach+use...@gmail.com> wrote:
> OK, I'll ask a question myself: what is a good way to implement some thing where
> C++ event handlers can be specified in an XML definition of a GUI?

How is that XML used? If the GUI is generated from it, then the event
handling code can also be generated. If the XML is loaded at run time,
then there should be some sort of common event handling interface that
accepts strings and ints.
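For the run-time-loaded case, a minimal sketch of such a string-keyed
interface (all of the names below are invented for illustration): the
XML names a handler as a string, and the GUI layer dispatches to a
registry of std::function callbacks.

    #include <functional>
    #include <iostream>
    #include <map>
    #include <stdexcept>
    #include <string>

    // Event payload handed to every handler: a string id plus an int,
    // matching the "strings and ints" interface suggested above.
    struct Event {
        std::string widget;   // objectName of the widget that fired
        int value;            // e.g. slider position
    };

    class HandlerRegistry {
    public:
        using Handler = std::function<void(const Event&)>;

        void add(const std::string& name, Handler h) {
            handlers_[name] = std::move(h);
        }

        // Called by the XML-driven GUI layer; the handler name comes straight
        // from an attribute such as <button onClick="showMonth"/>.
        void dispatch(const std::string& name, const Event& e) const {
            auto it = handlers_.find(name);
            if (it == handlers_.end())
                throw std::runtime_error("no handler registered for " + name);
            it->second(e);
        }

    private:
        std::map<std::string, Handler> handlers_;
    };

    int main() {
        HandlerRegistry registry;
        registry.add("showMonth", [](const Event& e) {
            std::cout << e.widget << " asked for showMonth(" << e.value << ")\n";
        });
        // In a real program this call is made by the code that parsed the XML.
        registry.dispatch("showMonth", Event{"horizontalSlider", 7});
    }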

Vladimir Jovic

Jul 9, 2010, 3:49:56 AM
Alf P. Steinbach /Usenet wrote:
> Occasionally I fire up Thunderbird and look in [comp.lang.c++] if
> perhaps there should be some question that I could answer.
>
> But no.
>
> There's always some new postings, but even if a question has been asked
> only a minute ago, it has already been answered!
>
> Argh.

Tough luck. Press that "Get Mail" button faster ;)

>
> OK, I'll ask a question myself: what is a good way to implement some
> thing where C++ event handlers can be specified in an XML definition of
> a GUI?
>

Take a look at this library:
http://code.google.com/p/pococapsule/

Gil

Jul 9, 2010, 1:41:00 PM
<connections>
 <connection>
  <sender>Form</sender>
  <signal>customContextMenuRequested(Point)</signal>
  <receiver>Form</receiver>
  <slot>showContextMenu()</slot>
  <hints>
   <hint type="sourcelabel">
    <x>199</x>
    <y>149</y>
   </hint>
   <hint type="destinationlabel">
    <x>199</x>
    <y>149</y>
   </hint>
  </hints>
 </connection>
 <connection>
  <sender>horizontalSlider</sender>
  <signal>sliderReleased()</signal>
  <receiver>calendarWidget</receiver>
  <slot>showMonth()</slot>
  <hints>
   <hint type="sourcelabel">
    <x>190</x>
    <y>197</y>
   </hint>
   <hint type="destinationlabel">
    <x>141</x>
    <y>95</y>
   </hint>
  </hints>
 </connection>
</connections>
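This looks like an excerpt of a Qt Designer .ui file. For what it's
worth, a hedged sketch of loading such a file at run time, assuming Qt
with the QtUiTools module ("form.ui" is a placeholder name): QUiLoader
applies the <connections> section for you, and
QMetaObject::connectSlotsByName() wires up any slots following the
on_<objectName>_<signalName>() naming convention.

    #include <QtUiTools/QUiLoader>
    #include <QApplication>
    #include <QFile>
    #include <QWidget>

    int main(int argc, char** argv) {
        QApplication app(argc, argv);

        QFile file("form.ui");            // placeholder for the Designer file
        if (!file.open(QFile::ReadOnly))
            return 1;

        QUiLoader loader;
        QWidget* form = loader.load(&file);   // <connections> are applied here
        file.close();
        if (!form)
            return 1;

        // Connects child signals to slots on 'form' that follow the
        // on_<objectName>_<signalName>() convention, if any are declared.
        QMetaObject::connectSlotsByName(form);

        form->show();
        return app.exec();
    }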

cpp4ever

Jul 9, 2010, 2:25:23 PM

You do know Linus Torvalds (the original Linux guy) created his own
version control system (Git) because nothing met his needs. I wish you
the best of luck with your build system, but as folks become ever more
sophisticated, no doubt the wish list will change. Hopefully, the useful
C/C++ code base that has built up over the years will not become
obsolete too soon (although I suspect some businesses like it that way).

Joshua Maurice

Jul 9, 2010, 3:25:52 PM
On Jul 9, 11:25 am, cpp4ever <n2xssvv.g02gfr12...@ntlworld.com> wrote:
>> [Snipping my incremental build rant]

> You do know Linus Torvalds, (the original Linux guy), created his own
> version control system, (Git), because nothing met his needs. I wish you
> the best of luck with your build system, but as folks become ever more
> sophisticated, no doubt the wish list will change. Hopefully, the useful
> C/C++ code base that has built up over the years will not become
> obsolete too soon, (although I suspect some Businesses like it that way).

All I want is what every developer wants: to be able to make an
arbitrary change to "the source" and have an always correct
incremental build. I disagree that there has been this "ever more
sophisticated" trend. This simple requirement existed back in the
first days of make, and it still exists now. It's just that the
original make author either:
- purposefully punted because he decided that makefiles are not source
and instead part of the implementation of the build system (but now,
makefiles effectively are source in a large project as no single
person understands all of the makefiles in a 10,000 source file
project),
- or purposefully punted on some deltas because he considered being
100% correct too hard,
- or as I think more likely, he just didn't realize that a build
system based strictly on a file dependency DAG is insufficient for
incremental correctness.

Most people don't actually realize that all build systems out there
are not 100% incrementally correct. Some think that theirs actually is
correct. I'm wondering why this is the case, and why developers put up
with this sad state of affairs. While I'm not the first to realize
this, judging from various papers it's not commonly known; even
Recursive Make Considered Harmful claims / assumes that a file
dependency DAG is sufficient for incremental correctness (and
it's not).
http://miller.emu.id.au/pmiller/books/rmch/

I have stumbled across a paper which actually does address these
concerns, and notes that all current build systems fail incremental
correctness tests. Oddly, it is in the context of Java, though a lot of
its ideas also apply to C++ builds: Capturing Ghost Dependencies in Java,
http://www.jot.fm/issues/issue_2004_12/article4.pdf

Joshua Maurice

Jul 9, 2010, 3:29:13 PM
On Jul 9, 11:25 am, cpp4ever <n2xssvv.g02gfr12...@ntlworld.com> wrote:
> Hopefully, the useful
> C/C++ code base that has built up over the years will not become
> obsolete too soon, (although I suspect some Businesses like it that way).

Oh yes. I am not advocating changing C++ itself at all. I am merely
advocating abandoning GNU Make and all other incorrect incremental
build systems in favor of correct incremental build systems. Old
projects which use Make can continue to use Make, but new projects
would hopefully have build scripts of an incrementally correct build
system.

cpp4ever

Jul 9, 2010, 4:06:18 PM

You are correct about current build systems not handling all incremental
changes correctly; I've been programming long enough to have come across
that problem. Without more thought on this topic I'm not entirely sure
how to ensure incremental changes are correctly handled, but it sounds
like you'd need to generate some sort of relationship graph. As long as
the overhead in doing this is not too time consuming, it is worthwhile.
Then again if the relationships are becoming that complex, perhaps the
code design is poor and needs to be redesigned. Again I wish you every
success in your endeavours with this.

cpp4ever

Joshua Maurice

Jul 9, 2010, 10:07:21 PM
On Jul 9, 1:06 pm, cpp4ever <n2xssvv.g02gfr12...@ntlworld.com> wrote:
> You are correct about current build systems not handling all incremental
> changes correctly, I've been programming long enough to have come across
> that problem. Without more thought on this topic I'm not entirely sure
> how to ensure incremental changes are correctly handled, but it sounds
> like you'd need to generate some sort of relationship graph. As long as
> the overhead in doing this is not too time consuming, it is worthwhile.
> Then again if the relationships are becoming that complex, perhaps the
> code design is poor and needs to be redesigned.

I'm leaning towards the following design, and I've implemented
something based on this design. My design is a bastard mix of concepts
from Ant, Maven, and GNU Make.

First, my goal, using the terms defined above, is to guarantee
incremental correctness over all possible deltas which can be checked
into source control.

This specifically declares some things to be outside the scope of
incremental correctness, including a correct and bug-free OS, correct
versions of gcc and other compilers, and correctly installed "third party
libraries" which are installed apart from the project in question and
the project's source control system. This also excludes the build
system implementation itself. That is, most makefiles (or their
equivalent) will be checked into source control, so they must be
covered by the incremental correctness guarantees, but the make
implementation itself is not checked into the same source control as
the project in question, so the make implementation is outside the
guarantee of correctness; the incremental correctness guarantee is
contingent on the correctness of the build system implementation.

With that out of the way, it follows pretty quickly that a free-form
language like GNU Make's makefile language is unsuitable for this purpose.
Apart from the design considerations of a file-level dependency graph,
the Turing-complete nature of GNU Make means that we cannot guarantee
incremental correctness when developers can arbitrarily modify
makefiles.

What is needed is a very strict and narrow build script language ala
idiomatic Maven, and to a lesser degree idiomatic Ant. The goal is a
build script language where the developer "instantiates" a template or
a macro from a prebuilt list of macros. A macro would correspond to a
"build type", like "cpp dll" or "java jar". The macro definition
itself would not be in the source control repository of the project in
question, and thus the incremental correctness guarantee would be
contingent on the correctness of the macro implementations. Great care
must be made when changing or adding new macros as it could break
incremental correctness, but this would not be a normal developer
activity.

This system also has the desirable property that it removes a lot of
clutter from the build scripts. In my company's current build scripts,
especially the vcproj files, there is a very low "information
content". There is much duplication of settings. Sometimes the
settings differ. It's difficult, if not near impossible, to find the
variations, and it is near impossible to determine why one project's
build varies from another's.

With these macros, all of the common configurations would be moved to
a common place. This allows for easier auditing of current build
options as only the deltas from the default would be in the build
script files ("delta" used here in a different sense than the deltas for
incremental builds). It also allows for useful and easy extensions and rewrites. A
simple example might be that you want to publish Javadocs. In an Ant
system, you would have to add a new Javadoc task to each build script,
whereas if you used Ant macros, or my macros, you could just add a new
piece to the macro which all the build scripts use. I go one step
further than Ant and require all developers to use the prebuilt,
presumably correct, macros. (Ant's prebuilt tasks are overwhelmingly
not incrementally correct, including its depends task.)

Each build script would give arguments to the macros, things such as
include path, library dependencies, preprocessor defines, etc. The
build system would parse it all, get all of the instantiated macros,
and then create tasks for each macro. Ex: a macro might be "java files
to jar". (Sorry I didn't use C++ as an example, but the C++ case is
actually more complex.) Each macro invocation would create two tasks,
one task to call javac, and one task to call jar on the produced class
files. In the future, this macro might be edited to include another
task, a Javadoc task. The macro implementation contains information to
associate execution time dependencies between these tasks. Once all
the tasks are created, the build engine can then execute these tasks
in DAG order, parallelizing where possible. It's up to each individual
task to guarantee incremental correctness.
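A rough sketch of that engine (invented names, not the actual tool in
question): each task knows the names of the tasks it depends on, and a
task runs only after its dependencies have. Parallel execution is left
out for brevity; a ready queue plus a worker pool is the obvious
extension.

    #include <functional>
    #include <stdexcept>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // One node of the build graph: a named action plus the names of the tasks
    // whose outputs it consumes.
    struct Task {
        std::string name;
        std::vector<std::string> deps;
        std::function<void()> run;     // e.g. "call javac", "call jar"
    };

    class BuildGraph {
    public:
        void add(Task t) { tasks_.emplace(t.name, std::move(t)); }

        // Depth-first post-order: a task runs only after all of its
        // dependencies have run. Cycles are reported rather than looping.
        void build(const std::string& target) {
            auto& state = state_[target];
            if (state == State::Done) return;
            if (state == State::Visiting)
                throw std::runtime_error("dependency cycle through " + target);
            state = State::Visiting;

            Task& task = tasks_.at(target);
            for (const auto& dep : task.deps)
                build(dep);

            task.run();
            state = State::Done;
        }

    private:
        enum class State { Unvisited, Visiting, Done };
        std::unordered_map<std::string, Task> tasks_;
        std::unordered_map<std::string, State> state_;
    };

Each instantiated macro would contribute one or more such Task objects
(compile, jar, javadoc, ...) into the graph.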

GNU Make has all of the incremental logic hidden away in its
implementation, in its dependency graph logic. However, this is
insufficient. In my system, this logic is encoded in the task
implementations. There will be a very small number of very stable task
implementations. It would not be usual practice for a developer to
modify them. Thus there is some hope that they could be kept correct.

Let me also note that I don't see any particular reason that this
should be much slower than GNU Make's approach of keeping all of the
incremental logic isolated to one place. Sure, the implementation code
is a little more complex, but the actual actions, such as filesystem
accesses, will be on the same order. As an example, for a real portion
of code of my company's product, roughly 4300 java files, on a
standard 4 core desktop, this dependency analysis finishes in less
than 5 seconds. An optimization already in place which skips detailed
analysis of each javac unit when none of its dependencies have changed
reduces that to about 2 seconds. For the same codebase, my build
system does a full clean build in about 200 seconds to 340 seconds
depending on the state of the caches of the hard disk (aka hot vs cold
start), and Maven does it in about about 700 seconds. (However, this
comparison is somewhat cheating, as my build system parallelizes the
build whereas Maven does not.) When I picked files which I expected to
have the worst case incremental build time, the worst I managed was
about 60 seconds for a single file modification. (No time given for
Maven, as Maven isn't incrementally correct, so the number is
meaningless.) As noted else-thread, I expect this time to remain
relatively independent of the size of the code base, though
particulars do matter. This is also Java dependency analysis, which
when done correctly, is a lot more complex than the dependency
analysis required for C++ compilation, so I would expect it to be even
faster for the equivalent amount of C++ source files.

A slightly longer explanation of my build system for java: The java
code is broken up into different directories which will end up in
different jars. There's about 100 jars for those java files. The
dependency cascade goes without termination to the boundary of this
javac task, taking the conservative approach, then executes javac. It
begins the analysis anew for the next javac task, in effect allowing a
termination to the dependency cascade. This is required because of the
possible circular references in the java code, because developers
frequently do not specify java to java file dependency information,
and because javac is quite expensive to invoke so the cost is
amortized over lots of java files. This seems to result in a good
degree of parallelization and a good degree of incrementality while still
having a fast build.

C++ has an entirely different model. Each cpp dir has its own macro
invocation, each of which creates a single task. When that task
executes during dependency graph traversal, it analyzes which cpp
files are out of date, then creates new tasks for the out of date cpp
files, and adds those tasks to the dependency graph. To be clear, I am
modifying the graph during graph traversal. This allows cool things
like creating cpp files from a black box process (such as unzipping)
and compiling those in parallel (something which would be quite
difficult in GNU Make while still preserving a global cap on the
number of active build jobs), and aggregating multiple cpp files into
a single compiler invocation. I've read that some compilers, IIRC
notably Microsoft's Visual Studio compiler, have a large startup cost,
so if more than one cpp file is given to the compiler per invocation,
say in batches of 5, this greatly speeds up the build.

While implementing some common tasks (such as compile cpp, link obj
files, compile java files, make jar, make zip, unzip, download a file,
etc.), I have been seeing some commonalities between tasks. I still
don't see a good way to factor out some of these commonalities, but
I'm trying. Still, the logic, the hard implementation stuff, is
centralized to a small set of tasks instead of spread out across all
the build scripts of a project.

Let me emphasize that with this system, when the macros are correctly
implemented, it will be quite difficult for a developer to break
incremental correctness short of outright maliciousness. Examples
include: modifying the state file saved between runs of the build
system, adding malicious code in automated tests, setting timestamps
of source files to be in the past (a rather notable one, as this could
happen from an unzip tool or from a sync from a source control
system), modifying the timestamps of output files at all (though these
are hidden away in a parallel folder structure, so less prone to
accidents).

joe

Jul 10, 2010, 11:23:30 PM

"Alf P. Steinbach /Usenet" <alf.p.stein...@gmail.com> wrote in
message news:i15f2k$5ll$1...@news.eternal-september.org...

> Occasionally I fire up Thunderbird and look in [comp.lang.c++] if
> perhaps there should be some question that I could answer.
>
> But no.
>
> There's always some new postings, but even if a question has been asked
> only a minute ago, it has already been answered!

Did you really mean "answered" or simply "responded to"? I tend to ask
"big picture" "questions", so I like to get multiple answers/responses
(yes, even from the naysayers). If you want to really "answer" some
questions, dive into the on-going and even tangential threads of
discussion and help out those who are going back-n-forth,
round-in-circles, talking-past-each other (if you don't do that already
that is, I'm not in here enough to know what goes on and who is who),
etc. I remember you doing a fine job of summarizing the "to unsigned or
not to unsigned" debate: I actually kept that post for future reference,
and I didn't need the whole thread then. I think I remember that it was
all one long paragraph though, so consider using multiple paragraphs if
that was/is the case with you.

joe

Jul 10, 2010, 11:27:09 PM

"Alf P. Steinbach /Usenet" <alf.p.stein...@gmail.com> wrote in
message news:i15f2k$5ll$1...@news.eternal-september.org...

> Occasionally I fire up Thunderbird and look in [comp.lang.c++]

Wow. In my city, you can't even chat on your cell phone while driving!

(And BTW, Fords suck.)


joe

Jul 10, 2010, 11:31:45 PM

"Alf P. Steinbach /Usenet" <alf.p.stein...@gmail.com> wrote in
message news:i15h3f$n7q$2...@news.eternal-september.org...

Some "IDL-ish" kind of thing was you original question? Can't help ya.
Aside though, how many people here have used Excel to generate C++ code
for them? Think about it, given a dataset and a bit of VBA, the code is
guaranteed correct as dataset changes (if you've coded your VBA
correctly!).


joe

Jul 10, 2010, 11:35:43 PM

Wow, deja vu: this thread is going off into tangential chaos. My last
post in this thread ended in a completely tangential question, and here
you are doing the exact same thing! Uncanny, for sure. I stopped
believing in witches years ago, so I'm sure that isn't it. :)


joe

Jul 10, 2010, 11:50:40 PM

Was he "endeavoring"? I thought he was just noting the sad state of
affairs in building C++ systems (yes, I said "systems"). I know where
this line of discussion goes to: header files vs. modules. Gotta fix the
underlying before finding fresh air.


Jorgen Grahn

Jul 20, 2010, 7:39:59 AM
(If you had used a proper subject line, and started a new thread
instead of Alf's unfortunately named one, I would have commented much
sooner.)

On Fri, 2010-07-09, Joshua Maurice wrote:
...

> All I want is what every developer wants: to be able to make an
> arbitrary change to "the source", and have a always correct
> incremental build. I disagree that there has been this "ever more
> sophisticated" trend. This simple requirement existed back in the
> first days of make, and it still exists now. It's just that the
> original make author either:
> - purposefully punted because he decided that makefiles are not source
> and instead part of the implementation of the build system (but now,
> makefiles effectively are source in a large project as no single
> person understands all of the makefiles in a 10,000 source file
> project),

A good insight. Makefiles are part of the source code.

> - or purposefully punted on some deltas because he considered being
> 100% correct is too hard,
> - or as I think more likely, he just didn't realize that a build
> system based strictly on a file dependency DAG is insufficient for
> incremental correctness.

I understand your arguments, but I still think the best approach
(easiest, with the greatest chance of success) is to:

- Insist on one, complete, Makefile a la "recursive make considered
...". It doesn't have to be split into fragments, by the way -- I
find that even for hundreds of source files in dozens of directories,
one big Makefile at the top is quite clear and readable.

- Accept its shortcomings, and learn (teach others) in which
situations you have to issue a "make clean" to be on the safe side.
Having 95% of all builds be incremental ones is pretty good!

> Most people don't actually realize that all build systems out there
> are not 100% incrementally correct. Some think that theirs actually is
> correct. I'm wondering why this is the case, and why developers put up
> with this sad state of affairs.

In my experience, many programmers put up with having 0% of their
builds being incremental ones, or with the other abnormal situations
that the "recursive make considered harmful" paper describes in such
detail. They'd be much better off with a 95% solution /now/ than with
100% in the future.

> While I'm not the first to realize
> this, judging from various papers it's not something commonly known,
> such as Recursive Make Considered Harmful, which claims / assumes that
> a file dependency DAG is sufficient for incremental correctness (and
> it's not).
> http://miller.emu.id.au/pmiller/books/rmch/

...

The most noticeable problem with make's approach to me is the one I
don't think you mentioned: timestamp changes due to version control
tools. E.g. switching from seeing foo.cc version 5 to version 4.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Keith H Duggar

Jul 20, 2010, 12:36:06 PM
On Jul 20, 7:39 am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> On Fri, 2010-07-09, Joshua Maurice wrote:
> > All I want is what every developer wants: to be able to make an
> > arbitrary change to "the source", and have a always correct
> > incremental build. I disagree that there has been this "ever more
> > sophisticated" trend. This simple requirement existed back in the
> > first days of make, and it still exists now. It's just that the
> > original make author either:
> > - purposefully punted because he decided that makefiles are not source
> > and instead part of the implementation of the build system (but now,
> > makefiles effectively are source in a large project as no single
> > person understands all of the makefiles in a 10,000 source file
> > project),
>
> A good insight. Makefiles are part of the source code.

No, not unless the clients /want/ them to be "part of the
source code". Make is a rather general purpose dependency
analysis tool. The dependencies are specified as files; so
if you want one or more makefiles to be in the prereqs of
one or more targets then put them there! The complaint or
"insight", as you call it, would instead be properly directed
at the dependency generation tool (whether manual or automatic),
not at the dependency analysis tool (Make).

The same point applies to the complaining about environment
variables, changes that affect include search path results,
etc. Many of those can be handled with proper scripting to
touch some prereq files. For example, in a build system that
we maintain there is a script that compares the environment
with a capture file. If the current environment differs from
the capture, a set of changed variables is calculated and all
source files (including makefiles) that reference one of
the changed variables are touched, thus triggering a rebuild
for the dependencies.
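For illustration only, a rough C++ equivalent of the first half of that
script, the part that works out which environment variables changed
relative to a capture file of NAME=VALUE lines; touching the files that
reference them is left out. (environ is POSIX, not standard C++.)

    #include <fstream>
    #include <map>
    #include <string>
    #include <vector>

    extern char** environ;   // POSIX: the current process environment

    // Parse "NAME=VALUE" lines into a map (used for both the capture file and
    // the live environment).
    static std::map<std::string, std::string>
    parse(const std::vector<std::string>& lines) {
        std::map<std::string, std::string> env;
        for (const auto& line : lines) {
            auto eq = line.find('=');
            if (eq != std::string::npos)
                env[line.substr(0, eq)] = line.substr(eq + 1);
        }
        return env;
    }

    // Names of variables that were added, removed, or changed relative to the
    // capture file. The caller would then touch every source file or makefile
    // that references one of these names.
    std::vector<std::string> changed_variables(const std::string& capture_file) {
        std::vector<std::string> lines;
        std::ifstream in(capture_file);
        for (std::string line; std::getline(in, line); )
            lines.push_back(line);
        std::map<std::string, std::string> captured = parse(lines);

        std::vector<std::string> live;
        for (char** e = environ; *e != nullptr; ++e)
            live.emplace_back(*e);
        std::map<std::string, std::string> now = parse(live);

        std::vector<std::string> changed;
        for (const auto& entry : captured)
            if (!now.count(entry.first) || now[entry.first] != entry.second)
                changed.push_back(entry.first);
        for (const auto& entry : now)
            if (!captured.count(entry.first))
                changed.push_back(entry.first);
        return changed;
    }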

In short, make is /one/ tool, a dependency analysis tool,
that is /part/ of a build system (called Unix). Learn to use
the full suite of tools instead of searching for a "One True
Uber Tool" monolith. Remember the Unix paradigm of "many small
tools working together to solve big problems".

Of course, there are some actual problems with make. A few
have been mentioned in other posts. Another is proper handling
of a single command that outputs multiple targets, which is,
well, let's say annoying ;-), with make.

KHD

James Kanze

Jul 20, 2010, 1:38:53 PM
On Jul 9, 9:06 pm, cpp4ever <n2xssvv.g02gfr12...@ntlworld.com> wrote:
> On 07/09/2010 08:25 PM, Joshua Maurice wrote:

[...]


> You are correct about current build systems not handling all
> incremental changes correctly,

What do you mean by "not handling incremental changes
correctly"? Do you mean that some files are not being
recompiled when they should be? Or do you mean that files are
being recompiled when it isn't necessary (because all you did
was modify an inline function in the header, and the file in
question doesn't use that function)?

For the first, make, used correctly and in conjunction with the
compiler, seems to work well, provided you don't go messing
around with the timestamps on the file.

> I've been programming long enough to have come across that
> problem. Without more thought on this topic I'm not entirely
> sure how to ensure incremental changes are correctly handled,
> but it sounds like you'd need to generate some sort of
> relationship graph.

Generating the relationship graph is exactly what make does.

--
James Kanze

Joshua Maurice

Jul 20, 2010, 1:54:33 PM
On Jul 20, 9:36 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Jul 20, 7:39 am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
>
> > On Fri, 2010-07-09, Joshua Maurice wrote:
> > > All I want is what every developer wants: to be able to make an
> > > arbitrary change to "the source", and have a always correct
> > > incremental build. I disagree that there has been this "ever more
> > > sophisticated" trend. This simple requirement existed back in the
> > > first days of make, and it still exists now. It's just that the
> > > original make author either:
> > > - purposefully punted because he decided that makefiles are not source
> > > and instead part of the implementation of the build system (but now,
> > > makefiles effectively are source in a large project as no single
> > > person understands all of the makefiles in a 10,000 source file
> > > project),
>
> > A good insight. Makefiles are part of the source code.
>
> No, not unless the clients /want/ them to be "part of the
> source code". Make is a rather general purpose dependency
> analysis tool. The dependencies are specified as files; so
> if you want one or more makefiles to be in the prereqs of
> one or more targets then put them there! The complaint or
> "insight" as you call would instead be properly directed
> at the dependency generation (whether manual or automatic)
> tool not the dependency analysis tool (Make).

Make's model is to have developers specify rules in a Turing-complete
programming language, aka makefiles. This is a horrible model.

First, as a matter of practicality, very few people in my company, and
I would imagine the industry at large, are anywhere near as knowledgeable
as I am on build systems. I have an architect in my company who has
sworn up and down that the Recursive Make Considered Harmful solution
wasn't known or implementable 6 years ago, but it was. As most of the
users of make do not even understand the basics of make, they will
break it. Moreover, it's somewhat unreasonable to require them to.
They're supposed to be experts at writing code in the product domain,
not in writing build systems.

Second, make's model is fundamentally fubar. You cannot have a fully
correct incremental build written in idiomatic make ala Recursive Make
Considered Harmful. See else-thread, or the end of this post for a
synopsis. Make was good back in the day when a single project did fit
into a single directory and a single developer knew all of the code,
but when a developer does not know all of the code, make's model no
longer works.

Simply put, this is my use case which make will not handle. I'm
working in a company on a project with over 25,000 source files in a
single build. The compile / link portion takes over an hour on a
developer machine, assuming no random errors, which there frequently
are on an incremental build. I work on one of the root components, a
reusable module which is used by several services (also part of the
same build). It is my explicit responsibility to make a decent effort at
not breaking the build from any checkin. As the closest thing my
company has to a build expert, I know that the build is not
incrementally correct. I hacked a large portion of it together. I can
do an incremental build most of the time, and just cross my fingers
and hope that it's correct, but I have no way of knowing it.

On the bright side, I manage to not break it most of the time.
However, with a large number of developers working on it, the last
build on the automated build machine is almost always broken. On an
almost weekly basis checkin freezes are enacted in an attempt to
"stabilize" the build. The problem is that most other developers are
not as thorough in their own testing as I, and the automated build
machine takes hours to do the full clean recompile. The time from a
checkin to a confirmed bug is quite large, and as the number of build
breakages goes up, so does this turnaround time as compile failures
hide compile failures.

Yes, I know the standard solution is to break up the project into
smaller projects. I would like that too. However, I'm not in a
position of power to make that happen, and no one else seems
interested in changing the status quo there.

> The same point applies to the complaining about environment
> variables, changes that affect include search path results,
> etc. Many of those can be handled with proper scripting to
> touch some prereq files. For example, in a build system that
> we maintain there is a script that compares the environment
> with a capture file. If the current environment differs from
> the capture a set of changed variables is calculated and all
> all source files (including makefiles) that reference one of
> the changed variables is touched thus triggering a rebuild
> for the dependencies.

Pretty cool system. I would still argue no, that there is a difference
between what I want and what your system handles. As I mentioned else-
thread, no build system is perfect. The line must be drawn somewhere.
At the very least, the correctness of an incremental build system is
conditional on the correctness of the code of the build system itself.
Moreover, if the developer installs the wrong version of the build
system, then he's also fubar.

However, there is an obvious constrained problem set which make and
other build systems claim to solve. The problem is: "Given a correctly
set up environment, the build system should be able to do a correct
incremental build over any and all possible changes in source control
for the project, and any and all possible changes in a developer's
local view of source control." A developer working on a large project,
who does not know all of the code, is unable to distinguish any finer.
He doesn't know about that other component's code, nor their
makefiles. The entire point of an incremental build system is the
automation of such dependency tracking, and that should include the
build system scripts themselves.

> In short, make is /one/ tool, a dependency analysis tool,
> that is /part/ of a build system (called Unix). Learn to use
> the full suite of tools instead of searching for a "One True
> Uber Tool" monolith. Remember the Unix paradigm of "many small
> tools working together to solve big problems".
>
> Of course, there are some actual problems with make. A few
> have be mentioned in other posts. Another is proper handling
> of a single command that outputs multiple targets which is,
> well let's say annoying ;-), with make.

Interesting. I'm sure there's some logical fallacy in here somewhere,
but I don't know the name(s). In short, you assert that the Unix way
works, and is better than other ways, especially better than "One True
Uber Tool monolith". I'd rather not get into this discussion, as it's
mostly tangential. The discussion at hand is that make is broken. Not
being able to handle multiple outputs from a single step is annoying,
but it's relatively minor. Its major problems are:
1- Its core philosophy of a file dependency graph with cascading
rebuilds without termination, combined with its idiomatic usage, is
inherently limited and broken without further enhancement.
- a- It will not catch new nodes hiding other nodes in search path
results (such as an include path).
- b- It will not catch removed nodes nor edges which should trigger
rebuilds (for example, removing a cpp file will not relink the library).
- c- It will not catch when the rule to build a node has changed which
should trigger a rebuild (such as adding a command line preprocessor
define).
- d- It will not get anywhere close to a good incremental build for
other compilation models, specifically Java. A good incremental Java
build cannot be based on a file dependency graph. In my solution, file
timestamps are involved yes, but the core is not a file dependency
graph with cascading rebuilds without termination conditions.
2- It exposes a full Turing-complete language to common developer
edits, and these common developer edits wind up in source control. The
immediate conclusion is that a build system based on idiomatic make
can never be incrementally correct over all possible changes in source
control. That is, a developer will inevitably make a change which
breaks the idiomatic usage (what little there is) and will result in
incremental incorrectness. False negatives are quite annoying, but
perhaps somewhat acceptable. False positives are the bane of a build
system's existence, but they are possible when the build system itself
is being constantly modified \without any tests whatsoever\ as is
common with idiomatic make.

Ian Collins

Jul 20, 2010, 5:15:31 PM
On 07/21/10 05:54 AM, Joshua Maurice wrote:
> On Jul 20, 9:36 am, Keith H Duggar<dug...@alum.mit.edu> wrote:
>>
>> No, not unless the clients /want/ them to be "part of the
>> source code". Make is a rather general purpose dependency
>> analysis tool. The dependencies are specified as files; so
>> if you want one or more makefiles to be in the prereqs of
>> one or more targets then put them there! The complaint or
>> "insight" as you call would instead be properly directed
>> at the dependency generation (whether manual or automatic)
>> tool not the dependency analysis tool (Make).
>
> Make's model is to have developers specify rules in a turing complete
> programming language, aka makefiles. This is a horrible model.

But it works, and tools can hide them from the nervous developer.

> First, as a matter of practicality, very few people in my company, and
> I would imagine the industry at large, are anywhere near knowledgeable
> as I on build systems. I have an architect in my company who have
> sworn up and down that the Recursive Make Considered Harmful solution
> wasn't known or implementable 6 years ago, but it was. As most of the
> users of make do not even understand the basics of make, they will
> break it. Moreover, it's somewhat unreasonable to require them to.
> They're supposed to be experts at writing code in the product domain,
> not in writing build systems.

Which is why every team I have worked with or managed had one or two
specialists who look after the build system and other supporting tools
(SCM for instance).

> Second, make's model is fundamentally fubar. You cannot have a fully
> correct incremental build written in idiomatic make ala Recursive Make
> Considered Harmful. See else-thread, or the end of this post for a
> synopsis. Make was good back in the day when a single project did fit
> into a single directory and a single developer knew all of the code,
> but when a developer does not know all of the code, make's model no
> longer works.

No matter how deep a project's directories go, you can still use a
single flat makefile. I use a single makefile for all my in-house code.
There is no better way of supporting distributed building. On larger
projects I have run, we used a single makefile for each layer of the
application (typically no more than 4).

> Simply put, this is my use case which make will not handle. I'm
> working in a company on a project with over 25,000 source files in a
> single build. The compile / link portion takes over an hour on a
> developer machine, assuming no random errors, which there frequently
> are on an incremental build. I work on one of the root components, a
> reusable module which is used by several services (also part of the
> same build). It is my explicit responsibility to do a decent effort at
> not breaking the build from any checkin. As the closest thing my
> company has to a build expert, I know that the build is not
> incrementally correct. I hacked a large portion of it together. I can
> do an incremental build most of the time, and just cross my fingers
> and hope that it's correct, but I have no way of knowing it.
>
> On the bright side, I manage to not break it most of the time.
> However, with a large number of developers working on it, the last
> build on the automated build machine is almost always broken. On an
> almost weekly basis checkin freezes are enacted in an attempt to
> "stabilize" the build. The problem is that most other developers are
> not as thorough in their own testing as I, and the automated build
> machine takes hours to do the full clean recompile. The time from a
> checkin to a confirmed bug is quite large, and as the number of build
> breakages goes up, so does this turnaround time as compile failures
> hide compile failures.

It sounds like it is your company's process that's broken, not make.

> Yes, I know the standard solution is to break up the project into
> smaller projects. I would like that too. However, I'm not in a
> position of power to make that happen, and no one else seems
> interested in changing the status quo there.

Ah, so it is!

Your tools have to support your process and your process has to suit
your tools. You can't fix a process problem by using "better" tools.

--
Ian Collins

Joshua Maurice

Jul 20, 2010, 5:48:02 PM

Yes, my company has those too. Unfortunately the build specialists are
only that in name; they have no actual power, and a lot of them have
no actual knowledge of builds. The developers control the build, and
the "build specialists" just manage the automated build machines.

> > Second, make's model is fundamentally fubar. You cannot have a fully
> > correct incremental build written in idiomatic make ala Recursive Make
> > Considered Harmful. See else-thread, or the end of this post for a
> > synopsis. Make was good back in the day when a single project did fit
> > into a single directory and a single developer knew all of the code,
> > but when a developer does not know all of the code, make's model no
> > longer works.
>
> No matter how deep a project's directories go, you can still use a
> single flat makefile.  I use a single makefile for all my in house code.
>   There is no better way of supporting distributed building.  On larger
> projects I have run, we used a single makefile for each layer of the
> application (typically no more then 4).

All of your arguments seem to address only one of my complaints, that
a Turing-complete build script language is hopelessly not
incrementally correct. If the makefile could be made simple enough so
all developers could understand it, and so it's exceedingly hard to
make a bad edit, I could live with that. As an educated guess
on a matter of fact, I do not think this is possible with split
makefiles ala GNU Make include, nor with a single large makefile.

There are still the problems that the idiomatic make usage will not
result in an incrementally correct build system (for the reasons
mentioned else-thread), and it basically will never have a good
incremental build for some other compilation models, like Java.

> > Yes, I know the standard solution is to break up the project into
> > smaller projects. I would like that too. However, I'm not in a
> > position of power to make that happen, and no one else seems
> > interested in changing the status quo there.
>
> Ah, so it is!
>
> Your tools have to support your process and your process has to suit
> your tools.  You can't fix a process problem by using "better" tools.

Somewhat of a red herring and a straw man in that it doesn't have much
to do with Make's shortcomings. Also, I disagree. One can fix a
process, or at least alleviate the problems, by using better tools.
Also, the better tools I suggest would also help in the situation of a
componentized build with well defined, stable inter-component
interfaces. The build time may be shorter, but you would still be
forced to do a clean build to have some known measure of success. An
incrementally correct build system would be useful in all cases.

Pete Becker

Jul 20, 2010, 6:15:17 PM
On 2010-07-20 17:48:02 -0400, Joshua Maurice said:

> One can fix a
> process, or at least alleviate the problems, by using better tools.

A fool with a tool is still a fool.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com)
Author of "The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Ian Collins

Jul 20, 2010, 6:19:46 PM
On 07/21/10 09:48 AM, Joshua Maurice wrote:
> On Jul 20, 2:15 pm, Ian Collins<ian-n...@hotmail.com> wrote:
>>
>> No matter how deep a project's directories go, you can still use a
>> single flat makefile. I use a single makefile for all my in house code.
>> There is no better way of supporting distributed building. On larger
>> projects I have run, we used a single makefile for each layer of the
>> application (typically no more then 4).
>
> All of your arguments seem to address only one of my complaints, that
> a turing complete build script language is hopelessly not
> incrementally correct. If the makefile could be made simple enough so
> all developers could understand it, and so it's exceedingly hard to
> make a mistake bad edit, I could live with that. As an educated guess
> on a matter of fact, I do not think this is possible with split
> makefiles ala GNU Make includ, nor a single large makefile.

I loathe nested makefiles!

But with our single makefiles, all the developer edits were to add or
remove source files.

> There's still the problems that the idiomatic make usage will not
> result in an incrementally correct build system (for the reasons
> mentioned else-thread), and it basically will never have a good
> incremental build for some other compilation models, like Java.

Maybe. If so, that is because make grew up with C and related
languages. Don't most Java projects use Java based build tools like Ant?

>>> Yes, I know the standard solution is to break up the project into
>>> smaller projects. I would like that too. However, I'm not in a
>>> position of power to make that happen, and no one else seems
>>> interested in changing the status quo there.
>>
>> Ah, so it is!
>>
>> Your tools have to support your process and your process has to suit
>> your tools. You can't fix a process problem by using "better" tools.
>
> Somewhat of a red herring and a straw man in that it doesn't have much
> to do about Make's shortcomings. Also, I disagree. One can fix a
> process, or at least alleviate the problems, by using better tools.

One can, to an extent, but that is analogous to prescribing aspirin
instead of finding the cause of the headache.

> Also, the better tools I suggest would also help in the situation of a
> componentized build with well defined, stable inter-component
> interfaces. The build time may be shorter, but you would still be
> forced to do a clean build to have some known measure of success. An
> incrementally correct build system would be useful in all cases.

Oh I agree. I don't claim that make-like tools are perfect for C++
projects. I have made several attempts over the years (decades!) to
build better makes, but time constraints always got the better of me.

One of my large projects did have some obscure dependencies between
generated files that did require a clean build if the moon was in the
wrong phase when they were edited. We were willing to live with that
given the overall simplicity and speed of the distributed make based
build system.

Life and software development is built around compromise!

--
Ian Collins

Joshua Maurice

Jul 20, 2010, 7:51:19 PM
On Jul 20, 3:19 pm, Ian Collins <ian-n...@hotmail.com> wrote:
> On 07/21/10 09:48 AM, Joshua Maurice wrote:
> > On Jul 20, 2:15 pm, Ian Collins<ian-n...@hotmail.com>  wrote:
> >> No matter how deep a project's directories go, you can still use a
> >> single flat makefile.  I use a single makefile for all my in house code.
> >>    There is no better way of supporting distributed building.  On larger
> >> projects I have run, we used a single makefile for each layer of the
> >> application (typically no more then 4).
>
> > All of your arguments seem to address only one of my complaints, that
> > a turing complete build script language is hopelessly not
> > incrementally correct. If the makefile could be made simple enough so
> > all developers could understand it, and so it's exceedingly hard to
> > make a mistake bad edit, I could live with that. As an educated guess
> > on a matter of fact, I do not think this is possible with split
> > makefiles ala GNU Make includ, nor a single large makefile.
>
> I loathe nested makefiles!
>
> But with our single makefiles, all the developer edits were to add or
> remove source files.

Interesting. I think you're saying you had all of your files in a
single directory. At least, I think that's what you're saying.
Otherwise, wouldn't the developer have to modify the giant makefile to
add new source directories? I guess you could have done it so it
automatically scans some root dir for all possible source dirs.
However, even then, you'd have to specify the link dependencies
somewhere. I don't see how you get around developers "commonly"
modifying the giant makefile.

I'm working on a project with C++, Java, code generation from a simple
language to Java and C++ to facilitate serialization, an Eclipse
plugin build thingy, and various other build steps. Depending on your
answer, I'm not sure if that would work for me.

> > There's still the problems that the idiomatic make usage will not
> > result in an incrementally correct build system (for the reasons
> > mentioned else-thread), and it basically will never have a good
> > incremental build for some other compilation models, like Java.
>
> Maybe.  If so, that is because make grew up with C and related
> languages.  Don't most Java projects use Java based build tools like Ant?

Idiomatic GNU Make is much much closer to incremental correctness than
Ant with Java. Ant with Java, even with its depends task, is
hopelessly bad. At least GNU Make covers 95% or more of cases: it
covers cpp file edits, but doesn't cover arbitrary file creation,
deletion, and build script modification. Ant with Java doesn't even
handle basic Java file edits.

> >>> Yes, I know the standard solution is to break up the project into
> >>> smaller projects. I would like that too. However, I'm not in a
> >>> position of power to make that happen, and no one else seems
> >>> interested in changing the status quo there.
>
> >> Ah, so it is!
>
> >> Your tools have to support your process and your process has to suit
> >> your tools.  You can't fix a process problem by using "better" tools.
>
> > Somewhat of a red herring and a straw man in that it doesn't have much
> > to do about Make's shortcomings. Also, I disagree. One can fix a
> > process, or at least alleviate the problems, by using better tools.
>
> Once can to an extent, but that is analogous to prescribing Aspirin
> instead of finding to cause of the headache.

But either way you'll have a headache. We'll still have a build, and
it will still take more time than 0, so an incrementally correct build
will always be useful.

Also, proof by analogy is fraud - Bjarne Stroustrup. (Not that I'm
claiming you're trying to prove by analogy. At least I hope you're
not.)

> > Also, the better tools I suggest would also help in the situation of a
> > componentized build with well defined, stable inter-component
> > interfaces. The build time may be shorter, but you would still be
> > forced to do a clean build to have some known measure of success. An
> > incrementally correct build system would be useful in all cases.
>
> Oh I agree.  I don't claim that make like tools are perfect for C++
> projects.  I have made several attempts over the years (decades!) to
> build better makes, but time constraints always got the better of me.
>
> One of my large projects did have some obscure dependencies between
> generated files that did require a clean build if the moon was in the
> wrong phase when they were edited.  We were willing to live with that
> given the overall simplicity and speed of the distributed make based
> build system.
>
> Life and software development is built around compromise!

I disagree. Some parts are open to compromise, but other parts really
aren't. Take a compiler, for example. There is little to no room in a
compiler for compromising correctness. (This is very true of most
software. If it doesn't give the right behavior, then in most domains
it's effectively worthless. Correctness in software is usually a hard
requirement.) If a compiler randomly spat out bad code, especially for
the common case, you know people would be up in arms, and it would be
fixed post haste.

A build system is very much like a compiler, but for some reason we
tolerate a lower standard, perhaps because we can always do a full
clean build if we want a guaranteed correct build. As such, it does
become a tradeoff of developer time lost by incorrect builds vs
developer time required to write a correct incremental build system.
I'm just kind of surprised no one in their spare time for fun has done
it yet. Writing such a system only needs to be done once.

Ian Collins

unread,
Jul 20, 2010, 10:53:13 PM7/20/10
to

No, the files were anywhere but there was only one makefile for each
layer or module (where a module comprised a number of libraries). New
files were added as required.

> I'm working on a project with C++, Java, code generation from a simple
> language to Java and C++ to facilitate serialization, an Eclipse
> plugin build thingy, and various other build steps. Depending on your
> answer, I'm not sure if that would work for me.

Does the build thingy manage makefiles for you?

>>>>> Yes, I know the standard solution is to break up the project into
>>>>> smaller projects. I would like that too. However, I'm not in a
>>>>> position of power to make that happen, and no one else seems
>>>>> interested in changing the status quo there.
>>
>>>> Ah, so it is!
>>
>>>> Your tools have to support your process and your process has to suit
>>>> your tools. You can't fix a process problem by using "better" tools.
>>
>>> Somewhat of a red herring and a straw man in that it doesn't have much
>>> to do about Make's shortcomings. Also, I disagree. One can fix a
>>> process, or at least alleviate the problems, by using better tools.
>>
>> One can to an extent, but that is analogous to prescribing aspirin
>> instead of finding the cause of the headache.
>
> But either way you'll have a headache. We'll still have a build, and
> it will still take more time than 0, so an incrementally correct build
> will always be useful.
>
> Also, proof by analogy is fraud - Bjarne Stroustrup. (Not that I'm
> claiming you're trying to prove by analogy. At least I hope you're
> not.)

No, I was just saying tools aren't the way to fix a broken process. One
of the most common questions I hear from companies in a mess with their
process is what tool should we use to fix it?

>>> Also, the better tools I suggest would also help in the situation of a
>>> componentized build with well defined, stable inter-component
>>> interfaces. The build time may be shorter, but you would still be
>>> forced to do a clean build to have some known measure of success. An
>>> incrementally correct build system would be useful in all cases.
>>
>> Oh I agree. I don't claim that make like tools are perfect for C++
>> projects. I have made several attempts over the years (decades!) to
>> build better makes, but time constraints always got the better of me.
>>
>> One of my large projects did have some obscure dependencies between
>> generated files that did require a clean build if the moon was in the
>> wrong phase when they were edited. We were willing to live with that
>> given the overall simplicity and speed of the distributed make based
>> build system.
>>
>> Life and software development is built around compromise!
>
> I disagree. Some parts are open to compromise, but other parts really
> aren't. Take a compiler, for example. There is little to no room in a
> compiler for compromising correctness. (This is very true of most
> software. If it doesn't give the right behavior, then in most domains
> it's effectively worthless. Correctness in software is usually a hard
> requirement.) If a compiler randomly spat out bad code, especially for
> the common case, you know people would be up in arms, and it would be
> fixed post haste.

True, but build systems are like compilers in another way: they have
bugs. If you write some unusual code that upsets the compiler, you can
file a bug and get it fixed. Or you can change the code. If you do
something unusual which breaks your build, you can try and fix the tool.
Or you can solve the problem a different way. Compromise!

> A build system is very much like a compiler, but for some reason we
> tolerate a lower standard, perhaps because we can always do a full
> clean build if we want a guaranteed correct build. As such, it does
> become a tradeoff of developer time lost by incorrect builds vs
> developer time required to write a correct incremental build system.

It is and that's the point. We knew we had an issue, probably due to
nested dependencies, that mucked up our build. So we added a process
rule "if you change this file, do a clean build and send out a notice to
the rest of the team". If it had been a file we changed often, we would
have fixed the problem, but as it wasn't, we compromised.

Most projects I've worked on have had multiple targets, so there were
always continuous clean builds running in the background to ensure a
change to one target didn't break another. Where "another" could
include the build process itself!

> I'm just kind of surprised no one in their spare time for fun has done
> it yet. Writing such a system only needs to be done once.

Oh I've tried, but there simply isn't enough spare time!

--
Ian Collins

Keith H Duggar

unread,
Jul 21, 2010, 11:06:19 AM7/21/10
to
On Jul 20, 6:15 pm, Pete Becker <p...@versatilecoding.com> wrote:
> On 2010-07-20 17:48:02 -0400, Joshua Maurice said:
>
> > One can fix a
> > process, or at least alleviate the problems, by using better tools.
>
> A fool with a tool is still a fool.

Never heard that before, great quote!

I was actually going to explain in some detail the scripting
solutions I've created to eliminate ALL of the make problems
Joshua complains about. But the more I ruminated on this:

Joshua Maurice wrote:
> First, as a matter of practicality, very few people in my company, and
> I would imagine the industry at large, are anywhere near knowledgeable
> as I on build systems. I have an architect in my company who has
> sworn up and down that the Recursive Make Considered Harmful solution
> wasn't known or implementable 6 years ago, but it was. As most of the
> users of make do not even understand the basics of make, they will
> break it.

The more I came to think that said scripts are a significant
competitive advantage! After all, if one of the /industry's/
foremost leading build system masters, a prime among firsts,
does not know how to script these problems away, then Wow Wow
Wubbsy, I've stumbled on scripts made of pure gold!

Though half of me thinks in contradistinction that many have
implemented the same or very similar solutions and that the
problem here is a need for less whinging and more winning.

KHD

Joshua Maurice

unread,
Jul 21, 2010, 11:16:46 AM7/21/10
to
On Jul 21, 8:06 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> Joshua Maurice wrote:
> > First, as a matter of practicality, very few people in my company, and
> > I would imagine the industry at large, are anywhere near knowledgeable
> > as I on build systems. I have an architect in my company who has
> > sworn up and down that the Recursive Make Considered Harmful solution
> > wasn't known or implementable 6 years ago, but it was. As most of the
> > users of make do not even understand the basics of make, they will
> > break it.
>
> The more I came to think that said scripts are a significant
> competitive advantage! After all, if one of the /industry's/
> foremost leading build system masters, a prime among firsts,
> does not know how to script these problems away, then Wow Wow
> Wubbsy, I've stumbled on scripts made of pure gold!

I hesitate to go that far. I doubt I am. I can't quite tell if you're
being sarcastic or not. Uhh, thank you if you're serious, but it's too
much. If you are being sarcastic, then you're putting words into my
mouth.

I just seem to be the only one raising a ruckus about the lack of
incrementally correct builds at the moment. GNU Make and other tools
promised me incrementally correct builds, and I want incrementally
correct builds. This is not too much to ask for. As I did mention else-
thread, I'm sure I'm not the first to come up with these ideas or
notice these problems. I did link to that paper for Java, Ghost
Dependencies, which lays out nearly all of my grievances. The Turing
complete scripting grievance is of my own original creation, though
again I doubt I am the first to say so.

Joshua Maurice

unread,
Jul 21, 2010, 11:45:29 AM7/21/10
to

Compilers can have bugs, yes, but if a bug is discovered, especially if
it affects common usage, it will be fixed, or the compiler will be dropped.
There is no compromise. I claim that simple file creation, file
deletion, and adding or removing command line preprocessor defines, is
the common case, and if an incremental build system doesn't handle
every such case all the time, it is woefully more broken than any
commonly used compiler.

However, as I said before, yes, you can compromise with an incremental
build system, because if it breaks, it only wastes some developer
time, and once he figures it out, he can do a full clean build to work
around in the issue. With a broken compiler, your options are much
more limited, depending on the exact bug.

> > A build system is very much like a compiler, but for some reason we
> > tolerate a lower standard, perhaps because we can always do a full
> > clean build if we want a guaranteed correct build. As such, it does
> > become a tradeoff of developer time lost by incorrect builds vs
> > developer time required to write a correct incremental build system.
>
> It is and that's the point.  We knew we had an issue, probably due to
> nested dependencies, that mucked up our build.  So we added a process
> rule "if you change this file, do a clean build and send out a notice to
> the rest of the team".  If it had been a file we changed often, we would
> have fixed the problem, but as it wasn't, we compromised.

An email notice? Really? I can understand that a single company has
little motivation to fix this problem, but I am flabbergasted that the
unix community and the other open source communities for C, C++, Java,
and other languages, continue to put up with this. I can only conclude
the problem is smaller for them because they have a smaller code base
size. The problem still exists, but it's not as acute as in my
company. It's when you scale up to a project of this size, ~25,000
source files and growing, with 100+ developers concurrently working on
it, that you start to see these problems really hurt. Sending out an
email notification to do a full clean build like that would result in
several such emails sent to me a day, at least.

That still also doesn't address my concern that as I'm at the bottom
of the whole hierarchy, aka most of the other guys depend on me, I
have to do a full clean build before checking in because I don't know
what will or what will not break stuff downstream. Perhaps I might
feel a little safer if it was just Recursive Make Considered Harmful
with purely C++ source code to executables, but the build has a wide
variety of stuff, like Java, code generation to Java and C++, etc.,
and it's not incrementally correct at all. This gets back to my
original point that having to maintain a build script in a Turing
complete language is like writing your own build system every time.
Changes almost certainly are untested before deployed. That really
makes me nervous, and it should make you nervous as well. At least Ian
Collins mentions, in a quoted section at the bottom of this post, that he
attempts some sort of testing of the build system, but I suspect the
tests are quite minimal and could easily let bugs through. In his own
words, "compromise".

As much as it pains me to quote Peter Olcott, I must agree with him in
a limited basis in this case.

http://groups.google.com/group/comp.lang.c++.moderated/msg/5f02c79589a3a465

On Jun 17, 11:54 am, Peter Olcott <NoS...@OCR4Screen.com> wrote:
> Conventional wisdom says to get the code working and then profile it and
> then make those parts that are too slow faster.
>
> What if the original goal is to make the code in the ball park of as
> fast as possible?
>
> For example how fast is fast enough for a compiler? In the case of the
> time it takes for a compiler to compile, any reduction in time is a
> reduction of wasted time.

A compiler, or any other developer tool, is used so much that any time
savings will almost always have a return on investment if you use it
for long enough. When amortized over the whole language or programming
community, the investment pays off even sooner. There is no "fast
enough" for a compiler, and there is no "fast enough" for a build
system. Yes, there is a "fast enough" to get it in use or to sell it -
it just has to be comparable to the competition on speed, but if
someone comes along with an equal or better one which is also faster,
we all know which we would use. That's not true of all software today.
Some software today does have a "fast enough", but compilers and builds
are nowhere near "fast enough".
would be compiling the entire Linux kernel and all apps, on an old
computer, in a second. Anything slower and being faster would be a
selling point for a compiler.

Yes, I admit this is a balancing act. It's developer time spent now vs
developer time saved in the future. However, it appears obvious to
me that it's an easy win investment, which is why it's so frustrating
that no one has done it already, and even more frustrating that no one
else is interested \now\.

Then there's the personal aspect. I don't like spending 20% or more of
my time on builds. I like coding.

On Jul 20, 7:53 pm, Ian Collins <ian-n...@hotmail.com> wrote:
> Most projects I've worked on have had multiple targets, so there were
> always continuous clean builds running in the background to ensure a
> change to one target didn't break another.  Where "another" could
> include the build process itself!

I'm curious. How would you test this? A continuous build in the
background which compared full clean build results to an incremental
build? I always figured I would set this up if I was in control and I
managed to write a correct incremental build system, just as a measure
of sanity checking. However, that's all it would be - sanity checking.
It is nowhere near a set of robust acceptance tests. If I ever had a
failure in this sanity test, I would definitely be adding a new test
to my suite of acceptance tests of my build system.
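
To be concrete, the sanity check could be as small as a make target
like the one below (names and paths are hypothetical, recipe lines are
tab-indented, and it assumes bit-for-bit reproducible compiler output,
which often isn't true in practice):

# Build incrementally, checksum everything, then clean-build and
# verify that every artifact comes out identical for this one delta.
check-incremental:
        $(MAKE) all
        find build -type f | sort | xargs sha1sum > incremental.sums
        $(MAKE) clean
        $(MAKE) all
        sha1sum -c incremental.sums

Passing says nothing about any other source code delta, which is why
it is only sanity checking and not an acceptance suite.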

Ian Collins

unread,
Jul 21, 2010, 4:10:01 PM7/21/10
to
On 07/22/10 03:45 AM, Joshua Maurice wrote:
> On Jul 20, 7:53 pm, Ian Collins<ian-n...@hotmail.com> wrote:

>> Most projects I've worked on have had multiple targets, so there were
>> always continuous clean builds running in the background to ensure a
>> change to one target didn't break another. Where "another" could
>> include the build process itself!
>
> I'm curious. How would you test this? A continuous build in the
> background which compared full clean build results to an incremental
> build? I always figured I would set this up if I was in control and I
> managed to write a correct incremental build system, just as a measure
> of sanity checking. However, that's all it would be - sanity checking.
> It is nowhere near a set of robust acceptance tests. If I ever had a
> failure in this sanity test, I would definitely be adding a new test
> to my suite of acceptance tests of my build system.

The first test was did the build succeed? The second was did the built
executable pass its tests?

--
Ian Collins

Joshua Maurice

unread,
Jul 21, 2010, 7:57:30 PM7/21/10
to

As a test of the cpp source code of your product, that's a very good
test - well, at least as good as the tests which are run. However, the
build system is not the product. The build system is a separate
program with different input and output. You confirmed that the build
system produced "acceptable" output for some input. That's like
testing gcc vs a single simple input source program. You just tested
the incremental build system over one (1) input. This is not a robust
testing scheme, yet the changes to the Turing complete program - the
incremental build system - are deployed after this depressingly low
coverage test.

You said it yourself: sometimes there are failures, in which case
your solution used in X situation was to send out an email to ignore
the broken program - the incremental build system - and just do a full
clean build. This is an example of the incremental build system
failing a basic acceptance test, but the solution isn't to fix the
program. Instead, the solution is to simply not use it and send a mass
email to the company that the program is known to be broken for this one
input.

I'm sorry to be so pedantic, but I think this is an important
distinction. I reject your insinuation that your tests for the build
system are anything but \incredibly\ poor. The level of correctness
commonly accepted for incremental build systems is by far the worst of
any commercially or professionally used program ever. Failures and
inadequacies in basic usage are not treated as bug reports but instead
as a reason to send out a company-wide email detailing the bug and a
workaround of "don't use the program (incremental build system)". I am
not trying to draw any sort of conclusions from this particular post.
I'm merely pointing out that AFAIK your so called tests of the build
system are the worst set of tests of any commercially or
professionally used program besides other incremental build systems.

Ian Collins

unread,
Jul 21, 2010, 8:21:03 PM7/21/10
to

No we did not. I clearly stated above that we ran "continuous clean
builds". The unit tests for the product were comprehensive. If they
passed, we could ship it to beta customers. The final executable was
the product, if the build system failed to build it, it would not have
passed its tests.

> You said it yourself: sometimes there are failures, in which case
> your solution used in X situation was to send out an email to ignore
> the broken program - the incremental build system - and just do a full
> clean build. This is an example of the incremental build system
> failing a basic acceptance test, but the solution isn't to fix the
> program. Instead, the solution is to simply not use it and send a mass
> email to the company that the program is known to be broken for this one
> input.

Which as I clearly said was a compromise. Yes we could and maybe should
have fixed it, but the problem was well known and infrequent. The
result of the failure was either a failed (incremental) build, or
failure of the tests added for the change.

> I'm sorry to be so pedantic, but I think this is an important
> distinction. I reject your insinuation that your tests for the build
> system are anything but \incredibly\ poor.

Nonsense. If a pair added a new test or updated existing tests and they
failed when they should have passed, the problem was spotted. There was
no way for a "broken build" to pass through unnoticed. The symptom of
the problem was failure to regenerate a header. This invariably caused
the build to fail (if something had been added), or tests to fail (if
something was changed).

> The level of correctness
> commonly accepted for incremental build systems is by far the worst of
> any commercially or professionally used program ever. Failures and
> inadequacies in basic usage are not treated as bug reports but instead
> as a reason to send out a company-wide email detailing the bug and a
> workaround of "don't use the program (incremental build system)".

You are twisting my words out of context. I will say this again in the
hope it gets through: yes we could and maybe should have fixed it, but
the problem was well known and infrequent. I have reported bugs in the
distributed make tool I use and had them fixed.

> I am
> not trying to draw any sort of conclusions from this particular post.
> I'm merely pointing out that AFAIK your so called tests of the build
> system are the worst set of tests of any commercially or
> professionally used program besides other incremental build systems.

You have no idea what our tests were.  If any tool in the chain had
misfired, the tests would fail. We weren't testing the build process,
we were testing the product it produced.

--
Ian Collins

Joshua Maurice

unread,
Jul 21, 2010, 9:48:57 PM7/21/10
to

Let me be very explicit now. There are two programs, the sellable
product, and the build system. Your sellable product presumably has a
very extensive test suite.

The incremental build system doesn't have quite the same extensive
test suite. It can be tested by comparing the output executables of an
incremental build to the executables of a full clean build. It can
also be indirectly tested by testing the output executables with the
sellable product test suite.

Your overall process is presumably something like:

1- Let developers check out whatever source code they want from source
control. Tell them to highly prefer the last promoted changelist /
revision for their work.

2- Every developer checkin, both to product source code and the
incremental build system source code - aka the giant makefile - will
trigger a set of builds on official automated build machine. After a
successful build, the source code changelist will be promoted, aka
declared good. (Now, you can promote a build after a successful
incremental build with tests, or you can choose to only promote a
build after a successful full clean build with tests. This is
tangential to my main argument that the testing of the incremental
build system is pisspoor.)

Under this system, a developer can make a change to the incremental
build system, aka the giant makefile. After a single build, full clean
or incremental depending on the exact flavor, the change to the
incremental build system, aka the giant makefile, will be deployed to
users, aka developers. However, this single build represents only a
small fraction of the possible input to the incremental build system.
The "input" to an incremental build system is:
- the source code of the incremental build system, aka the giant
makefile, at the time of the previous complete build.
- the source code of the incremental build system, aka the giant
makefile, now.
- the source code of the sellable product at the time of the previous
complete build.
- the source code of the sellable product now.

Incremental correctness is the property of an incremental build system
which will always produce output equivalent to a full clean build on
the source code of now. However, the incremental build takes
additional input, the source code of the previous complete build and
the build system of the previous complete build.

To give an example, you might test a change to the giant makefile with
a full clean build. However, a developer might have missed a
dependency somewhere so that under some set of edits, the incremental
build will not be correct.

That is, you did not test a comprehensive set of possible source code
deltas before deploying a change to the build system. In fact, you did
only one (1) test. Developers will be consuming this change, throwing
many different source code deltas at the incremental build system, all
effectively untested.

Let me apply this reasoning to specific points now:

(Reproducing most of his post. Sorry, I can't think of a better way
offhand to structure this.)

On Jul 21, 5:21 pm, Ian Collins <ian-n...@hotmail.com> wrote:
> On 07/22/10 11:57 AM, Joshua Maurice wrote:
> > This is not a robust
> > testing scheme, yet the changes to the Turing complete program - the
> > incremental build system - are deployed after this depressingly low
> > coverage test.
>
> No we did not. I clearly stated above that we ran "continuous clean
> builds". The unit test for the product were comprehensive. If they
> passed, we could ship it to beta customers. The final executable was
> the product, if the build system failed to build it, it would not have
> passed its tests.

"Continuous clean builds" are irrelevant. A full clean build is not a
test of an incremental build system at all. "Continuous clean builds",
and comparing their output to an incremental build, is, at best, one
very small test of the very large input space of the incremental build
system.

> > I'm sorry to be so pedantic, but I think this is an important
> > distinction. I reject your insinuation that your tests for the build
> > system are anything but \incredibly\ poor.
>
> Nonsense. If a pair added a new test or updated existing tests and they
> failed when they should have passed, the problem was spotted. There was
> no way for a "broken build" to pass through unnoticed. The symptom of
> the problem was failure to regenerate a header. This invariably caused
> the build to fail (if something had been added), or tests to fail (if
> something was changed).

There is a difference between "broken build" and "broken incremental
build system". I agree that it sounds like it would be very hard for a
broken build to pass the tests. However, it sounds \very easy\ for a
bug to be introduced in the incremental build \system\.

> > The level of correctness
> > commonly accepted for incremental build systems is by far the worst of
> > any commercially or professionally used program ever. Failures and
> > inadequacies in basic usage are not treated as bug reports but instead
> > as a reason to send out a company-wide email detailing the bug and a
> > workaround of "don't use the program (incremental build system)".
>
> You are twisting my words out of context. I will say this again in the
> hope it gets through: yes we could and maybe should have fixed it, but
> the problem was well known and infrequent. I have reported bugs in the
> distributed make tool I use and had them fixed.

I don't think I'm twisting them out of context.

Moreover, there is still a fundamental nuance that I'm trying to get
across. If there's a bug in the compiler for named return value
optimization (looking at you MSVC), then it's possible to work around
it. It's not pleasant, and it becomes more unpleasant the more common
it becomes. However, with an incremental build system, you are no
longer in control of the issue. It could be someone else's edit to the
giant makefile, or it could be someone else's addition of a new header
which hides a previously included header. In this case, it requires
you to know everything about the system and its input, all of the source
code, or at least keep up on your emails, to work around the bugs.

Suppose you need to go back in time 3 months in source control. Should
you have to look through all of your old emails to use the incremental
build system? The difference is if I go back in time I don't have to
review emails to work around known compiler bugs. To work around the
MSVC named return value optimization bug, I just need to look at a
single C++ function. To work around a header hiding a header, I need
to track all makefile changes and source control file adds. The
difference is that the compiler and the browser are known quantities
with comprehensive test suites, and they change at a slow rate,
whereas the build system is an unknown quantity, always changing,
always introducing new bugs, without even a basic test suite; bugs are
rarely fixed, and bugs are tracked \via email\.

> > I'm merely pointing out that AFAIK your so called tests of the build
> > system are the worst set of tests of any commercially or
> > professionally used program besides other incremental build systems.
>
> You have no idea what our tests were. If any tool in the chain had
> misfired, the tests would fail. We weren't testing the build process,
> we were testing the product it produced.

I claimed that your tests of the \build system\ are the worst ever.
You reply with a non sequitur, talking about the tests of the \build\.
It doesn't matter if you tested every single possible input to your
sellable product for a given build. The build is not the build system.
The actual testing of the incremental build system before deployment
to users, aka internal developers, is next to nothing. You did not
test your incremental build system - the giant makefile - for
(incremental) correctness any more than you tested gcc for correctness
or emacs for correctness. Yes, you did test it for "full clean build"
correctness reasonably well, but this is not incremental correctness.
It is quite possible, and highly likely, that there are source code
deltas for which the incremental build system - the giant makefile -
will produce incorrect output, and more bugs will creep in all the
time.

And yes, we can talk about if it's reasonable to compromise here. I
argued it's not, but that's another separate point, a normative
statement. Disputing that does nothing to dispute the main point of
this post, a statement of fact that the tests for promotion of your
incremental build \system\ are among the worst test suites of any
commercial or professional software AFAIK except for other incremental
build systems.

Öö Tiib

unread,
Jul 22, 2010, 1:49:43 AM7/22/10
to
On July 22, 04:48, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 21, 5:21 pm, Ian Collins <ian-n...@hotmail.com> wrote:
> > You have no idea what our tests were.  If any tool in the chain had
> > misfired, the tests would fail.  We weren't testing the build process,
> > we were testing the product it produced.
>
> Let me be very explicit now. There are two programs, the sellable
> product, and the build system. Your sellable product presumably has a
> very extensive test suite.

It probably also has several rooms of dedicated human testers. These
presumably test the real product for a month after code freeze, if it does
anything useful and mission critical for anyone at all. Giving them
something that does not install or run is out of the question; developers
lose face if their build system spits out something like that
without noticing it itself. Talking about the importance of building is
therefore like saying that there should also be stable
electricity present, otherwise nothing runs. Sure it is so, but it is
relatively cheap to arrange.

> The incremental build system doesn't have quite the same extensive
> test suite. It can be tested by comparing the output executables of an
> incremental build to the executables of a full clean build. It can
> also be indirectly tested by testing the output executables with the
> sellable product test suite.

Why is it important to have that *incremental* build when you have a
large commercial product? You anyway want each module to carry a
clear mark of what build it belongs to. You make a build farm that
gives you a clean build each time. It is a lot simpler solution. What does
a computer cost? It perhaps depends on the continent, but you can likely buy
several for the cost of a single month of a good worker.

One may want an incremental build when developing something of above-
average open-source-project size on a sole computer at home. Then he
uses incremental builds and still the majority of time goes into building,
not developing nor running tests. That is the market for a good incremental
build system.

> "Continuous clean builds" are irrelevant. A full clean build is not a
> test of an incremental build system at all. "Continuous clean builds",
> and comparing their output to an incremental build, is, at best, one
> very small test of the very large input space of the incremental build
> system.

Why? On the contrary. Incremental builds are irrelevant. No one uses
these anyway.

> There is a difference between "broken build" and "broken incremental
> build system". I agree that it sounds like it would be very hard for a
> broken build to pass the tests. However, it sounds \very easy\ for a
> bug to be introduced in the incremental build \system\.

Yes. So *do* *not* use incremental build systems and the sun is shining
once again.

A test of a clean build is a lot simpler. Did it build everything it was
made to build? Yes? Success! No? Fail! That is it. Tested. The simple part
is over. Now you can run automatic tests (which presumably take a lot more
time than building) to see if all the modules that were built (all
modules of the full product) are good too.

No one notices it because the build system does everything automatically.
It checks out the changed production branch from the repository, runs all
sorts of tools and tests, and also produces a web site about success and the
details and statistics (and changes in such) of the various tools run on
the code and on the freshly built modules. Building is a tiny bit of its
job. It can even blame exactly whose changeset in the repository
likely broke something. It can use some instant messaging system if the
team dislikes e-mails. Of course ... all such features of the build system
have to be tested too. If the team does not sell the build system, then
testing it is less mission critical. Let's say it blamed an
innocent ... then there is a defect and also an interested party (the
wrongly accused innocent) who wants it to be fixed.

Jorgen Grahn

unread,
Jul 22, 2010, 4:11:17 AM7/22/10
to
On Tue, 2010-07-20, Keith H Duggar wrote:
> On Jul 20, 7:39 am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
>> On Fri, 2010-07-09, Joshua Maurice wrote:
>> > All I want is what every developer wants: to be able to make an
>> > arbitrary change to "the source", and have a always correct
>> > incremental build. I disagree that there has been this "ever more
>> > sophisticated" trend. This simple requirement existed back in the
>> > first days of make, and it still exists now. It's just that the
>> > original make author either:
>> > - purposefully punted because he decided that makefiles are not source
>> > and instead part of the implementation of the build system (but now,
>> > makefiles effectively are source in a large project as no single
>> > person understands all of the makefiles in a 10,000 source file
>> > project),
>>
>> A good insight. Makefiles are part of the source code.
>
> No, not unless the clients /want/ them to be "part of the
> source code". Make is a rather general purpose dependency
> analysis tool. The dependencies are specified as files; so
> if you want one or more makefiles to be in the prereqs of
> one or more targets then put them there! The complaint or
> "insight" as you call it would instead be properly directed
> at the dependency generation (whether manual or automatic)
> tool, not the dependency analysis tool (Make).
[snip]

I think you are responding to JM rather than to me.

My claim -- which I now see was weaker than JM's -- was simply
that within a C++ project, the Makefile (if you use one) is just as
important as the C++ code: it too has to be correct, readable, etc.
Not something "that CM guy" handles and no one else has to think about.

Joshua Maurice

unread,
Jul 22, 2010, 4:40:48 AM7/22/10
to
On Jul 21, 10:49 pm, Öö Tiib <oot...@hot.ee> wrote:
> On 22 juuli, 04:48, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > The incremental build system doesn't have quite the same extensive
> > test suite. It can be tested by comparing the output executables of an
> > incremental build to the executables of a full clean build. It can
> > also be indirectly tested by testing the output executables with the
> > sellable product test suite.
>
> Why is it important to have that *incremental* build when you have a
> large commercial product? You anyway want each module to carry a
> clear mark of what build it belongs to. You make a build farm that
> gives you a clean build each time. It is a lot simpler solution. What does
> a computer cost? It perhaps depends on the continent, but you can likely buy
> several for the cost of a single month of a good worker.
>
> One may want an incremental build when developing something of above-
> average open-source-project size on a sole computer at home. Then he
> uses incremental builds and still the majority of time goes into building,
> not developing nor running tests. That is the market for a good incremental
> build system.

First, let me note that the post to which you are replying only argued
that he basically did not test the incremental build system. I was
quite explicit in this. However, else-thread, I made arguments in
favor of incremental. Let me add a couple new replies here.

Some companies don't have those build farms. For starters, they're
expensive. I also disagree that build farms are the "simpler"
solution. Maintaining a build farm is a huge PITA. Just maintaining
automated build machines for the dozen or so platforms my company
supports for emergency bug fixes, hot fixes, and daily ML builds takes
the full time employment of 3+ people. The simpler solution is to have
the developer build everything himself. A lot fewer moving parts. A lot
fewer different versions running around. A lot fewer active Cruise
Control instances. Is building it all in one giant build more
sensible? Perhaps not; it depends on the situation, but a build farm
is definitely not simpler than building everything yourself in a giant
build.

Now, correct incremental. It's relatively clear to me that this is
more complex than a full clean build every time, aka not simpler. Is
it simpler or more complex than a build farm? I don't know. I would
think that a correct incremental build system is actually simpler than
a build farm. A build farm really is complex and takes effort to
maintain, but a correct incremental build system would only have to be
written once, or at least each different kind of task would only have
to be written once, amortized over all companies, whereas a build farm
would probably have a lot more per-company setup cost.

> > "Continuous clean builds" are irrelevant. A full clean build is not a
> > test of an incremental build system at all. "Continuous clean builds",
> > and comparing their output to an incremental build, is, at best, one
> > very small test of the very large input space of the incremental build
> > system.
>
> Why? On the contrary. Incremental builds are irrelevant. No one uses
> these anyway.

The argument made was that he did not test the incremental build
system to any significant degree. He argued that he did test the
incremental build system with "continuous clean builds". He is in
error, and you are arguing a non sequitur.

Also, no one uses incremental builds? Why do we even use Make anymore
then? I suppose that Make has some nifty parallelization features, so
I guess it has some use if we ignore incremental.

I must say this is somewhat surprising to hear in a C++ forum. I
expected most people to still be under the incremental spell.
Incremental really is not that hard to achieve. It's just that no one
has really tried as far as I can tell. (Specifically under my
definition of "incremental correctness" which by definition includes
the associated build scripts as source, though I think even ignoring
build script changes, all common, purported incremental, build systems
are not incrementally correct under idiomatic usage.)

> > There is a difference between "broken build" and "broken incremental
> > build system". I agree that it sounds like it would be very hard for a
> > broken build to pass the tests. However, it sounds \very easy\ for a
> > bug to be introduced in the incremental build \system\.
>
> Yes. So *do* *not* use incremental build systems and sun is shining
> once again.
>
> A test of a clean build is a lot simpler. Did it build everything it was
> made to build? Yes? Success! No? Fail! That is it. Tested. The simple part
> is over. Now you can run automatic tests (which presumably take a lot more
> time than building) to see if all the modules that were built (all
> modules of the full product) are good too.

Again, arguing a non sequitur. I would love to have a discussion of if
we should have clean builds, but you reply as though I was making the
argument in that quote that we should have incremental build systems.
I was not. I was very clear and explicit in that post that I was
arguing solely that he did not test the incremental build system for
correctness in any significant way.

Again, to change topics to your new point, why should we have correct
incremental builds? Because it's faster, and componentizing
might not make sense, and it might be more costly to the company than
a correct incremental build system, especially when the cost of the
incremental build system can be amortized over all companies.

Think about it like this. It's all incremental. Splitting it up into
components is one kind of incremental. It's incremental at the
component level. However, the benefit of this can only go so far.
Eventually there would be too many different components, and we're
right in the situation described in Recursive Make Considered Harmful,
the situation without automated dependency analysis. Yes, we do need
to break it down at the component level at some point. It's not
practical to rebuild all of the linux kernel whenever I compile a
Hello World! app, but nor is it practical to say componentization
solves all problems perfectly without need of other solutions like a
parallel build, a distributed build, build farms, faster compilers,
pImpl, and/or incremental builds.

> No one notices it because the build system does everything automatically.
> It checks out the changed production branch from the repository, runs all
> sorts of tools and tests, and also produces a web site about success and the
> details and statistics (and changes in such) of the various tools run on
> the code and on the freshly built modules. Building is a tiny bit of its
> job. It can even blame exactly whose changeset in the repository
> likely broke something. It can use some instant messaging system if the
> team dislikes e-mails. Of course ... all such features of the build system
> have to be tested too. If the team does not sell the build system, then
> testing it is less mission critical. Let's say it blamed an
> innocent ... then there is a defect and also an interested party (the
> wrongly accused innocent) who wants it to be fixed.

Yes, if you do a full clean build every time, then the build system
handles everything "automatically". Well, except
it's slow. And if there's a lot of dependency components which are
frequently changing, and you have to manually get these extra-project
dependencies, then we're in Recursive Make Considered Harmful. If
instead you use some automated tool like Maven to download
dependencies, and you do a full clean build, aka redownload, of those
every time, then it's really really slow. (I'm in the situation now at
work where we use Maven to handle downloading a bazillion different
kinds of dependencies. As Maven has this nasty habit of automatically
downloading newer "versions" of the same snapshot version, it's quite
easy to get inconsistent versions of other in-house components. It's
quite inconvenient and annoying. I've managed to deal with it, and
work around several bugs, to avoid this unfortunate default. Did I
mention I hate Maven as a build system?)

Also, an automated build machine polling source control for checkins
can only tell you which checkin broke the automated build (and tests)
if your full clean build runs faster than the average checkin
interval. At my company, the build of the ~25,000 source file project
can take 2-3 hours on some supported systems, and that's without any
tests. The basic test suite adds another 5-6 hours. As a rough guess, I
would imagine we have 100s of checkins a day.

Even if we were to break up the stuff by team, with our level of
testing, I don't think we could meet this ideal of "automated build
machine isolates breaking checkin" without a build farm. Even then, as
all of this code is under active development, arguably a change to my
component should trigger tests of every component downstream, and as
inter-component interfaces change relatively often (but thankfully
slowing down in rate), it might even require recompiles of things
downstream. As the occasional recompile is needed of things
downstream, the only automation solution without incremental is to do
full clean rebuilds of the entire shebang.

Yes, I know the canned answer is "fix your build process". That is
still no reason to use the inferior tools (build systems) which would
help even after "we did the right thing" and componentized. Simply
put, full clean rebuilds do not scale to the size of my company's
project, and I argue that incremental correctness would be the
cheapest way to solve all of the problems.

It's unsurprising that I get about as much support in the company as I
do here.

However, I do admit that it might be a bad business decision to do it
fully in-house at this point in time. As I emphasized else-thread, it
is only easily worth it when amortized over all companies, or when
done by someone in GPL in their spare time for fun. However, the only
people who really need it are the large companies, and any single one
of them has little incentive to do it themselves. It's most
unfortunate.

Ian Collins

unread,
Jul 22, 2010, 5:08:46 AM7/22/10
to
On 07/22/10 08:40 PM, Joshua Maurice wrote:

> On Jul 21, 10:49 pm, Öö Tiib<oot...@hot.ee> wrote:
>> On July 22, 04:48, Joshua Maurice<joshuamaur...@gmail.com> wrote:
>>> The incremental build system doesn't have quite the same extensive
>>> test suite. It can be tested by comparing the output executables of an
>>> incremental build to the executables of a full clean build. It can
>>> also be indirectly tested by testing the output executables with the
>>> sellable product test suite.
>>
>> Why is it important to have that *incremental* build when you have a
>> large commercial product? You anyway want each module to carry a
>> clear mark of what build it belongs to. You make a build farm that
>> gives you a clean build each time. It is a lot simpler solution. What does
>> a computer cost? It perhaps depends on the continent, but you can likely buy
>> several for the cost of a single month of a good worker.
>>
>> One may want an incremental build when developing something of above-
>> average open-source-project size on a sole computer at home. Then he
>> uses incremental builds and still the majority of time goes into building,
>> not developing nor running tests. That is the market for a good incremental
>> build system.
>
> First, let me note that the post to which you are replying only argued
> that he basically did not test the incremental build system. I was
> quite explicit in this. However, else-thread, I made arguments in
> favor of incremental. Let me add a couple new replies here.

Well I still maintain that our process did, indirectly, test the
incremental build system. Let me explain. My teams follow a test
driven development process with continuous integration so they are
continuously adding code, building, integrating and testing. If a
developer adds a failing test or the new code to pass it and the make
(which is always "make test") fails, there is a problem with the build.
These are always incremental builds, run hundreds of times a day (every
time a few lines of code change).

> Some companies don't have those build farms. For starters, they're
> expensive.

I have one in my garage. It's only a couple of multi-core boxes, but
it's quick and saves me a lot of billable time.

> Now, correct incremental. It's relatively clear to me that this is
> more complex than a full clean build every time, aka not simpler.

I don't think anyone was advocating a clean build every time, certainly
not me. What goes on on automated background builds is a world apart
from what happens on the developer's desktop.

> Is
> it simpler or more complex than a build farm? I don't know. I would
> think that a correct incremental build system is actually simpler than
> a build farm.

The build farm typically serves several masters (unless you have the
luxury of more than one). It will be running distributed incremental
builds for developers and it will be running continuous clean builds,
often for different platforms or with different compile options.

>> Why? On the contrary. Incremental builds are irrelevant. No one uses
>> these anyway.

I don't agree with that, 99% of my builds are incremental, often just
the one file I'm editing.

> The argument made was that he did not test the incremental build
> system to any significant degree. He argued that he did test the
> incremental build system with "continuous clean builds". He is in
> error, and you are arguing a non sequitur.

I don't think I did although I probably wasn't clear on that point.

--
Ian Collins

Jorgen Grahn

unread,
Jul 22, 2010, 6:00:21 AM7/22/10
to
On Tue, 2010-07-20, Joshua Maurice wrote:
> On Jul 20, 2:15 pm, Ian Collins <ian-n...@hotmail.com> wrote:
>> On 07/21/10 05:54 AM, Joshua Maurice wrote:
...

>> > Make's model is to have developers specify rules in a turing complete
>> > programming language, aka makefiles. This is a horrible model.
>>
>> But it works and tool can hide them from the nervous developer.
>>
>> > First, as a matter of practicality, very few people in my company, and
>> > I would imagine the industry at large, are anywhere near knowledgeable
>> > as I on build systems.

...


>> > Moreover, it's somewhat unreasonable to require them to.
>> > They're supposed to be experts at writing code in the product domain,
>> > not in writing build systems.

Why do you say that? They have a goal to meet, and they have to use
two languages to do that: C++ and Make. You don't think it's
unreasonable to require them to know C++ well enough, so why forgive
them if they write Makefiles without having a clue? (I'm assuming here
of course that they have to use Make, or something equivalent.)

>> Which is why every team I have worked with or managed had one or two
>> specialists who look after the build system and other supporting tools
>> (SCM for instance).

Yes, I guess not /everyone/ in the project has to know /all/ about the
build system. But I also think it's dangerous to delegate build and
SCM to someone, especially someone who's not dedicated 100% to the
project, and who doesn't know the code itself.

You get problems like people not arranging their code to support
incremental builds. For example, if all object files have a dependency
on version.h or a config.h which is always touched, incremental builds
don't exist. (Delegating SCM is even worse IMO, but I won't go into
that.)
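
A common fix (a sketch only; gen-version.sh is a hypothetical script
and the recipe lines are tab-indented) is to regenerate the header into
a temporary file and only replace the real one when its contents have
actually changed, so dependents are not rebuilt on every run:

# version.h is regenerated on every run, but its timestamp only moves
# when the generated contents differ from what is already there.
version.h: FORCE
        ./gen-version.sh > version.h.tmp
        cmp -s version.h.tmp version.h || mv version.h.tmp version.h
        rm -f version.h.tmp

.PHONY: FORCE
FORCE:

The same trick works for an always-touched config.h.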

> Yes, my company has those too. Unfortunately the build specialists are
> only that in name; they have no actual power, and a lot of them have
> no actual knowledge of builds. The developers control the build, and
> the "build specialists" just manage the automated build machines.

That is my experience, too. What happens then (if you're lucky) is
that you get unofficial experts within the team -- at least mildly
interested in the topic, but with no time allocated for such work, and
no official status.

Ian Collins

unread,
Jul 22, 2010, 6:09:17 AM7/22/10
to
On 07/22/10 10:00 PM, Jorgen Grahn wrote:
> On Tue, 2010-07-20, Joshua Maurice wrote:
>> On Jul 20, 2:15 pm, Ian Collins<ian-n...@hotmail.com> wrote:
>
>>> Which is why every team I have worked with or managed had one or two
>>> specialists who look after the build system and other supporting tools
>>> (SCM for instance).
>
> Yes, I guess not /everyone/ in the project has to know /all/ about the
> build system. But I also think it's dangerous to delegate build and
> SCM to someone, especially someone who's not dedicated 100% to the
> project, and who doesn't know the code itself.

I guess you've never worked on a multi-site project using Clear Case!

> You get problems like people not arranging their code to support
> incremental builds. For example, if all object files have a dependency
> on version.h or a config.h which is always touched, incremental builds
> don't exist.

A good slap round the head normally solves that problem.

--
Ian Collins

Jorgen Grahn

unread,
Jul 22, 2010, 7:25:15 AM7/22/10
to
On Tue, 2010-07-20, Joshua Maurice wrote:
> On Jul 20, 9:36 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
...

> Second, make's model is fundamentally fubar. You cannot have a fully
> correct incremental build written in idiomatic make ala Recursive Make
> Considered Harmful. See else-thread, or the end of this post for a
> synopsis. Make was good back in the day when a single project did fit
> into a single directory and a single developer knew all of the code,
> but when a developer does not know all of the code, make's model no
> longer works.
>
> Simply put, this is my use case which make will not handle. I'm
> working in a company on a project with over 25,000 source files in a
> single build. The compile / link portion takes over an hour on a
> developer machine, assuming no random errors, which there frequently
> are on an incremental build.

There's something suspect in that sentence, since incremental builds
vary in the time they take, from a few seconds (to check file
timestamps) and up.

> I work on one of the root components, a
> reusable module which is used by several services (also part of the
> same build). It is my explicit responsibility to do a decent effort at
> not breaking the build from any checkin. As the closest thing my
> company has to a build expert, I know that the build is not
> incrementally correct. I hacked a large portion of it together. I can
> do an incremental build most of the time, and just cross my fingers
> and hope that it's correct, but I have no way of knowing it.

So why not fix the Makefile? You cannot expect Make to do something
sensible when given incorrect instructions. Unless you refer to the
loopholes you list below (see there).

> On the bright side, I manage to not break it most of the time.
> However, with a large number of developers working on it, the last
> build on the automated build machine is almost always broken. On an
> almost weekly basis checkin freezes are enacted in an attempt to
> "stabilize" the build. The problem is that most other developers are
> not as thorough in their own testing as I, and the automated build
> machine takes hours to do the full clean recompile. The time from a
> checkin to a confirmed bug is quite large, and as the number of build
> breakages goes up, so does this turnaround time as compile failures
> hide compile failures.
>
> Yes, I know the standard solution is to break up the project into
> smaller projects. I would like that too. However, I'm not in a
> position of power to make that happen, and no one else seems
> interested in changing the status quo there.

Been there, done that. I think it would be harmful. If you have
informal sub-projects which break each other today, actually splitting
them would just force you to do handle dependencies manually. "Let's
see, if I do this change in Foo, Bar and Baz need to be updated. So
Bar 1.5 now needs Foo 1.4; I must remember to tell everyone ..."

You can e.g. look at what the Debian Linux project does: they spend
much of their time managing such dependencies, and it doesn't look
like a lot of fun.

If the root components were stable (their interface rarely changed) it
would work, but then it would work in your current setup too.

...


> Pretty cool system. I would still argue no, that there is a difference
> between what I want and what your system handles. As I mentioned else-
> thread, no build system is perfect. The line must be drawn somewhere.
> At the very least, the correctness of an incremental build system is
> conditional on the correctness of the code of the build system itself.
> Moreover, if the developer installs the wrong version of the build
> system, then he's also fubar.

Now you're talking sense again!

...


> The discussion at hand is make is broken. Not being
> able to hand multiple output from a single step is annoying to handle,
> but it's relatively minor. Its major problems are:
> 1- Its core philosophy of a file dependency graph with cascading
> rebuilds without termination, combined with its idiomatic usage, is
> inherently limited and broken without further enhancement.
> - a- It will not catch new nodes hiding other nodes in search path
> results (such as an include path).

Solution: don't feed multiple -I (include path root) statements to the
compiler. #include <my_components/foo/bar.h> is better than #include
<bar.h>. Or you can do -Imy_components and #include <foo/bar.h> if you
want to -- just try to make the #include statements unambiguous.

Fixing that in an existing code base can be very time consuming.
I've done it once in a 1000+ file code base, and it took a few days.
Hard to explain to people, too.

Or, issue "make clean" whenever a new include file shows up. Either
you add the file yourself, or you see it show up when you update from
revision control. (If things can change without you knowing it, you
have even bigger problems and need to adjust how you work with
revision control too).

> - b- It will not catch removed nodes nor edges which should trigger
> rebuilds (such as removing a cpp file will not relink the library).

Solution: issue "make clean" when files are removed; detection as above.

> - c- It will not catch when the rule to build a node has changed which
> should trigger a rebuild (such as adding a command line processor
> define).

Solution: issue "make clean" when the Makefile is changed and you
haven't reviewed the change and seen that it's obviously harmless.

> - d- It will not get anywhere close to a good incremental build for
> other compilation models, specifically Java. A good incremental Java

> build cannot be based on a file dependency graph. [...]

Well, I don't do Java ;-) Everything I use fits the make model. And
that's not just C and C++ compilers; I have Make control dozens of
different tools.

Like you said above, "the line has to be drawn somewhere". Or, as the
saying goes: "all software sucks". I claim that while make has
loopholes, (a) you can easily see when an incremental build may be
incorrect and fix it with a 'make clean'; (b) this doesn't happen very
often, and the vast majority of builds can be done incrementally.

Since I don't see any better alternatives (and you seem to think the
existing "make replacements" share its flaws) I am relatively happy.

> 2- It exposes a full turing complete language to common developer
> edits, and these common developer edits wind up in source control.

Just like C++. I commented on that above.

But I think you exaggerate the changes a "common developer" makes.
Those are almost always of the form of adding or removing lines like

libfoo.a: fred.o

and if someone doesn't get that, maybe he shouldn't edit the C++ code
either.

I am assuming here that someone designed the Makefile correctly
originally; set up automatic dependency generation and so on. Lots of
projects fail to do that -- but lots of projects fail to use C++
properly too.
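
By "automatic dependency generation" I mean roughly the usual
compiler-assisted setup. A minimal sketch (GNU make with GCC-style
-MMD/-MP flags; the variable names are mine, adjust for your
toolchain):

  SRCS := foo.cpp bar.cpp
  OBJS := $(SRCS:.cpp=.o)

  # The compiler writes a .d file listing every header each translation
  # unit actually included, so header edits recompile the right objects.
  %.o: %.cpp
          $(CXX) $(CXXFLAGS) -MMD -MP -c $< -o $@

  # Pull the generated dependency files in; missing ones are ignored.
  -include $(OBJS:.o=.d)

Set that up once and nobody has to list header dependencies by hand.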

> The
> immediate conclusion is that a build system based on idiomatic make
> can never be incrementally correct over all possible changes in source
> control. That is, a developer will inevitably make a change which
> breaks the idiomatic usage (what little there is) and will result in
> incremental incorrectness.

So review the changes and reprimand him. Same as with the C++ code,
but *a lot* easier.

> False negatives are quite annoying, but
> perhaps somewhat acceptable. False positives are the bane of a build
> systems existence, but they are possible when the build system itself
> is being constantly modified \without any tests whatsoever\ as is
> common with idiomatic make.

That *is* scary, but I see no way around it.

Keith H Duggar

unread,
Jul 22, 2010, 10:03:20 AM7/22/10
to
On Jul 22, 4:40 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> I must say this is somewhat surprising to hear in a C++ forum. I
> expected most people to still be under the incremental spell.
> Incremental really is not that hard to achieve. It's just that no one
> has really tried as far as I can tell. (Specifically under my
> definition of "incremental correctness" which by definition includes
> the associated build scripts as source, though I think even ignoring
> build script changes, all common, purported incremental, build systems
> are not incrementally correct under idiomatic usage.)

Yes people have tried and have succeeded and it was not hard.
A few simple scripts and conventions to augment make suffice.
I've already told you this (and given you one example that you
could/should have expanded on).

Your problem is not one of intellect rather it is an attitude
problem and a problem of self-delusion. You are operating on
several false assumptions:

1) That you are a master of build /systems/ (make is a tool
not a system by the way).

2) That make is a fundamentally fubar, flawed, horrific,
broken, etc /tool/ that cannot serve as a core component
of an incrementally correct build /system/.

3) That the problems make does have are insurmountable
without major overhaul.

4) That the incompetence and social problems of your
workplace are relevant to the correctness of make.

All of the above are false but you labor with them as truths.
They are holding you back! If you would stop ranting and whining
and start thinking and scripting, your problems would start to
evaporate.

Even the fact that I'm telling you it is /possible/ to solve ALL
the problems you have outlined, and indeed that it can be done with
simple scripts + make (at least with C++ projects; I can't comment on
Java) -- that alone should be a boon to you, if you could only get
past those flawed preconceptions.

KHD

Maxim Yegorushkin

unread,
Jul 22, 2010, 10:26:38 AM7/22/10
to
On 20/07/10 18:54, Joshua Maurice wrote:
> On Jul 20, 9:36 am, Keith H Duggar<dug...@alum.mit.edu> wrote:
>> On Jul 20, 7:39 am, Jorgen Grahn<grahn+n...@snipabacken.se> wrote:

[]

>> In short, make is /one/ tool, a dependency analysis tool,
>> that is /part/ of a build system (called Unix). Learn to use
>> the full suite of tools instead of searching for a "One True
>> Uber Tool" monolith. Remember the Unix paradigm of "many small
>> tools working together to solve big problems".
>>
>> Of course, there are some actual problems with make. A few
>> have be mentioned in other posts. Another is proper handling
>> of a single command that outputs multiple targets which is,
>> well let's say annoying ;-), with make.
>
> Interesting. I'm sure there's some logical fallacy in here somewhere,
> but I don't know the name(s). In short, you assert that the Unix way
> works, and is better than other ways, especially better than "One True
> Uber Tool monolith". I'd rather not get into this discussion, as it's
> mostly tangential. The discussion at hand is make is broken. Not being
> able to hand multiple output from a single step is annoying to handle,
> but it's relatively minor.

I find it little known, but make does handle the case of multiple output
files when using pattern rules. See example 3 on
http://www.gnu.org/software/make/manual/make.html.gz#Pattern-Examples
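
Roughly like this (a sketch in the spirit of that example; bison's -d
switch writes both the parser and its header in one run):

  # Because both targets appear in one pattern rule, make knows a
  # single invocation of the recipe produces both files.
  %.tab.c %.tab.h: %.y
          bison -d $<

One run of bison then satisfies both foo.tab.c and foo.tab.h.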

> Its major problems are:
> 1- It's core philosophy of a file dependency graph with cascading
> rebuilds without termination, combined with its idiomatic usage, is
> inherently limited and broken without further enhancement.

I've read the thread but could not find anything that proves the above
statement.

> - a- It will not catch new nodes hiding other nodes in search path
> results (such as an include path).

True. Not sure if it is a good practise though.

> - b- It will not catch removed nodes nor edges which should trigger
> rebuilds (such as removing a cpp file will not relink the library).

When you remove a source file you end up updating a makefile. In a
robust system, changes to makefiles trigger a rebuild. In an in-house
system I built, the different aspects of building (compiling, linking)
are put in different makefiles, so that changes to one makefile trigger
only a relink, and changes to the others a recompilation.

> - c- It will not catch when the rule to build a node has changed which
> should trigger a rebuild (such as adding a command line processor
> define).

It will if your .o files depend on the makefiles, which they should.
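
Something along these lines (a sketch; $(MAKEFILE_LIST) is GNU make's
list of every makefile read so far):

  # Adding a -D to CXXFLAGS means editing a makefile, which makes every
  # object out of date, so the new define really is picked up.
  %.o: %.cpp $(MAKEFILE_LIST)
          $(CXX) $(CXXFLAGS) -c $< -o $@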

> - d- It will not get anywhere close to a good incremental build for
> other compilation models, specifically Java. A good incremental Java
> build cannot be based on a file dependency graph. In my solution, file
> timestamps are involved yes, but the core is not a file dependency
> graph with cascading rebuilds without termination conditions.

Could you elaborate please?

> 2- It exposes a full turing complete language to common developer
> edits, and these common developer edits wind up in source control. The
> immediate conclusion is that a build system based on idiomatic make
> can never be incrementally correct over all possible changes in source
> control. That is, a developer will inevitably make a change which
> breaks the idiomatic usage (what little there is) and will result in
> incremental incorrectness. False negatives are quite annoying, but
> perhaps somewhat acceptable. False positives are the bane of a build
> systems existence, but they are possible when the build system itself
> is being constantly modified \without any tests whatsoever\ as is
> common with idiomatic make.

One often-used tool in GNU make is the macro call $(eval $(call ...)),
which lets you hide all the complexity from the average developer.
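
For example, something like this (a toy sketch; the macro name and its
contents are invented):

  # Written and vetted once by whoever owns the build:
  define CXX_PROGRAM
  $(1): $(patsubst %.cpp,%.o,$(2))
          $$(CXX) $$(LDFLAGS) -o $$@ $$^
  endef

  # All an average developer ever writes in his makefile:
  $(eval $(call CXX_PROGRAM,myapp,main.cpp util.cpp))

The developer names a program and its sources; the rule text itself
never appears in the makefiles people edit day to day.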

--
Max

Keith H Duggar

unread,
Jul 22, 2010, 11:17:24 AM7/22/10
to
On Jul 22, 10:26 am, Maxim Yegorushkin <maxim.yegorush...@gmail.com>
wrote:

> On 20/07/10 18:54, Joshua Maurice wrote:
> > On Jul 20, 9:36 am, Keith H Duggar<dug...@alum.mit.edu>  wrote:
> >> On Jul 20, 7:39 am, Jorgen Grahn<grahn+n...@snipabacken.se>  wrote:
> >> In short, make is /one/ tool, a dependency analysis tool,
> >> that is /part/ of a build system (called Unix). Learn to use
> >> the full suite of tools instead of searching for a "One True
> >> Uber Tool" monolith. Remember the Unix paradigm of "many small
> >> tools working together to solve big problems".
>
> >> Of course, there are some actual problems with make. A few
> >> have be mentioned in other posts. Another is proper handling
> >> of a single command that outputs multiple targets which is,
> >> well let's say annoying ;-), with make.
>
> > Interesting. I'm sure there's some logical fallacy in here somewhere,
> > but I don't know the name(s). In short, you assert that the Unix way
> > works, and is better than other ways, especially better than "One True
> > Uber Tool monolith". I'd rather not get into this discussion, as it's
> > mostly tangential. The discussion at hand is make is broken. Not being
> > able to hand multiple output from a single step is annoying to handle,
> > but it's relatively minor.
>
> I find it little known, but make does handle the case of multiple output
> files when using pattern rules. See example 3 on
> http://www.gnu.org/software/make/manual/make.html.gz#Pattern-Examples

Yeah, the only problem is that the targets must match a pattern.
Often they don't. There are other "solutions" too, but all of
them have annoying limitations. This is one thing that is tough
(in the general case) to get around in make. (Or at least I have
not figured out a nice way.) The other "problems" the OP brings
up are easier to eliminate in a nice and useful way.

>  > Its major problems are:
>
> > 1- It's core philosophy of a file dependency graph with cascading
> > rebuilds without termination, combined with its idiomatic usage, is
> > inherently limited and broken without further enhancement.
>
> I've read the thread but could not find anything that proves the above
> statement.

Because it is false.

> > - a- It will not catch new nodes hiding other nodes in search path
> > results (such as an include path).
>
> True. Not sure if it is a good practise though.

Handled with a very simple script.
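
The shape of the idea is roughly this (a sketch, not the actual
script):

  INCLUDE_DIRS := src third_party/include   # whatever your -I paths are

  FORCE:

  # Snapshot what the include path currently resolves to.  When a new
  # header appears (possibly shadowing another) or one disappears, the
  # snapshot's timestamp changes and dependents get rebuilt.
  header-list.stamp: FORCE
          find $(INCLUDE_DIRS) -name '*.h' | sort > header-list.tmp
          if ! cmp -s header-list.tmp $@; then mv header-list.tmp $@; \
          else rm -f header-list.tmp; fi

  %.o: %.cpp header-list.stamp
          $(CXX) $(CXXFLAGS) $(addprefix -I,$(INCLUDE_DIRS)) -c $< -o $@

Coarser than strictly necessary (everything rebuilds when the set of
visible headers changes at all), but it errs on the side of correctness.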

> > - b- It will not catch removed nodes nor edges which should trigger
> > rebuilds (such as removing a cpp file will not relink the library).
>
> When you remove a source file you end up updating a makefile. In a

Not necessarily. My make files do not change when a file is
removed (unless those files are involved in special "override"
functionality we have). Furthermore, a simple script (closely
related to the script that solves -a-) triggers the necessary
updates.

> robust system changes to makefiles trigger a rebuild. In an in-house
> system I built different aspects of building (compiling, linking) are
> put in different makefiles, so that changes to one makefile only trigger
> a relink, to others - a recompilation.

A more sophisticated system (still make based!) can be smarter.
In addition, as mentioned, the makefiles of a sophisticated make
system might change very rarely (if ever) anyhow. Simple actions
like adding/removing source files will not change the make files.
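
Roughly because the lists come from the tree rather than being
maintained by hand. A sketch (names invented):

  # Discovered, not hand-edited: adding or removing a .cpp file never
  # touches the makefile itself.
  SRCS := $(wildcard src/*.cpp)
  OBJS := $(SRCS:.cpp=.o)

  FORCE:

  # Snapshot the object list so that *removing* a file still forces a
  # relink; otherwise the stale member would linger in the archive.
  objects.list: FORCE
          echo "$(OBJS)" > objects.tmp
          if ! cmp -s objects.tmp $@; then mv objects.tmp $@; \
          else rm -f objects.tmp; fi

  libfoo.a: $(OBJS) objects.list
          rm -f $@ && ar rcs $@ $(OBJS)

(The rm before ar is so a removed object really drops out of the
archive instead of just being left in place.)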

> > - c- It will not catch when the rule to build a node has changed which
> > should trigger a rebuild (such as adding a command line processor
> > define).
>
> It will if your .o files depend on the makefiles, which they should.

And factoring the makefiles, and/or using environment variables
together with a smart script which examines the environment,
helps to minimize the impact.
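
E.g. put the flags themselves behind a stamp, so a cosmetic makefile
edit does not recompile the world but a real flag change does (a
sketch):

  FORCE:

  # The stamp's timestamp moves only when the compile command changes.
  cxxflags.stamp: FORCE
          echo '$(CXX) $(CXXFLAGS)' > cxxflags.tmp
          if ! cmp -s cxxflags.tmp $@; then mv cxxflags.tmp $@; \
          else rm -f cxxflags.tmp; fi

  %.o: %.cpp cxxflags.stamp
          $(CXX) $(CXXFLAGS) -c $< -o $@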

> > - d- It will not get anywhere close to a good incremental build for
> > other compilation models, specifically Java. A good incremental Java
> > build cannot be based on a file dependency graph.  In my solution, file
> > timestamps are involved yes, but the core is not a file dependency
> > graph with cascading rebuilds without termination conditions.
>
> Could you elaborate please?
>
> > 2- It exposes a full turing complete language to common developer
> > edits, and these common developer edits wind up in source control. The
> > immediate conclusion is that a build system based on idiomatic make
> > can never be incrementally correct over all possible changes in source
> > control. That is, a developer will inevitably make a change which
> > breaks the idiomatic usage (what little there is) and will result in
> > incremental incorrectness. False negatives are quite annoying, but
> > perhaps somewhat acceptable. False positives are the bane of a build
> > systems existence, but they are possible when the build system itself
> > is being constantly modified \without any tests whatsoever\ as is
> > common with idiomatic make.
>
> One often used tool in GNU make is macro calls $(eval $(call ...)),
> using which you hide all the complexity from the average developers.

There are many additional ways to simplify the makefiles that are
(if ever) touched by "common" users. Besides, such changes must
be reviewed by a senior anyhow (unless your shop sucks or is in
too much of a hurry to practice software /engineering/).

KHD

Öö Tiib

unread,
Jul 22, 2010, 1:06:17 PM7/22/10
to
On 22 juuli, 11:40, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 21, 10:49 pm, Öö Tiib <oot...@hot.ee> wrote:
>
> > On 22 juuli, 04:48, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > > The incremental build system doesn't have quite the same extensive
> > > test suite. It can be tested by comparing the output executables of an
> > > incremental build to the executables of a full clean build. It can
> > > also be indirectly tested by testing the output executables with the
> > > sellable product test suite.
>
> > Why it is important to have that *incremental* build when you have
> > large commercial product? You anyway want to have each module with
> > clear mark in it to what build it belongs. You make build farm that
> > makes you clean build each time. It is lot simpler solution. What does
> > a computer cost? Depends perhaps on continent but you likely can buy
> > several for cost of single month of good worker.
>
> > One may want incremental build when he develops something with over
> > average open-source-project size on sole computer at home. Then he
> > uses incremental builds and still majority of time goes into building,
> > not developing nor running tests. That is market for good incremental
> > build system.
>
> First, let me note that the post to which you are replying only argued
> that he basically did not test the incremental build system. I was

Sure. Sorry. He seemingly did use clean builds as well. With clean
builds there may also be issues, but these are less common and false
positives are rare.

> quite explicit in this. However, else-thread, I made arguments in
> favor of incremental. Let me add a couple new replies here.

>
> Some companies don't have those build farms. For starters, they're
> expensive. I also disagree that build farms are the "simpler"
> solution. Maintaining a build farm is a huge PITA. Just maintaining
> automated build machines for the dozen or so platforms my company
> supports for emergency bug fixes, hot fixes, and daily ML builds takes
> the full time employment of 3+ people.

Experiences differ; I have not observed such a PITA. Software for
distributed building is perhaps different, maintainers are different,
sizes are different and so the budgets are different. Bigger problems
need a bigger baseball bat to deal with them. Usually solving bigger
problems also raises the money to buy bigger sticks. Finesse of clever
details is the wrong way to go there. The bigger it is, the more robust
and more oriented to the use of power (not dexterity) it should be.

> The simpler solution is to have
> the developer build everything himself. A lot less moving parts. A lot
> less different versions running around. A lot less active Cruise
> Control instances. Is building it all in one giant build more
> sensible? Perhaps not; it depends on the situation, but a build farm
> is definitely not simpler than building everything yourself in a giant
> build.

Individual developers (or sub-teams) work on a single module. They
can build it separately using a diagnose/build/test/deploy system
dedicated to the module. Only when they move their work results into
some main repository branch does it need to be integrated with the
efforts of other developers/teams. Other teams can write protective
tests against alien modules to give fast feedback when they are unable
to integrate with the things thrown at them.

> Now, correct incremental. It's relatively clear to me that this is
> more complex than a full clean build every time, aka not simpler. Is
> it simpler or more complex than a build farm? I don't know. I would
> think that a correct incremental build system is actually simpler than
> a build farm. A build farm really is complex and takes effort to
> maintain, but a correct incremental build system would only have to be
> written once,  or at least each different kind of task would only have
> to be written once, amortized over all companies, whereas a build farm
> would probably have a lot more per-company setup cost.

I am not even sure why you say it. Building the modules is only a tiny
bit of the computer time that a full build takes. If incremental versus
clean means 5 times quicker, then just that tiny bit gets tinier.

> > > "Continuous clean builds" are irrelevant. A full clean build is not a
> > > test of an incremental build system at all. "Continuous clean builds",
> > > and comparing their output to an incremental build, is, at best, one
> > > very small test of the very large input space of the incremental build
> > > system.
>
> > Why? On the contrary. Incremental builds are irrelevant. No one uses
> > these anyway.
>
> The argument made was that he did not test the incremental build
> system to any significant degree. He argued that he did test the
> incremental build system with "continuous clean builds". He is in
> error, and you are arguing a non sequitur.

Ok. He did test the clean build system, so he did not test the
incremental build system.

> Also, no one uses incremental builds? Why do we even use Make anymore
> then? I suppose that Make has some nifty parallelization features, so
> I guess it has some use if we ignore incremental.

Yes. A makefile is a script in a flexible programming language. It is a
sort of tradition, I suppose, to use that language for scripts that
analyze dependencies and build software.

> I must say this is somewhat surprising to hear in a C++ forum. I
> expected most people to still be under the incremental spell.
> Incremental really is not that hard to achieve. It's just that no one
> has really tried as far as I can tell. (Specifically under my
> definition of "incremental correctness" which by definition includes
> the associated build scripts as source, though I think even ignoring
> build script changes, all common, purported incremental, build systems
> are not incrementally correct under idiomatic usage.)

Sure, I think I explained. Incremental makes sense when the time you
win is something that matters. Then it is good. Sure. That is
usually so when debugging the product of a small project. The small
project may be one for building a tool for a bigger project, but into
the bigger project it can then be integrated as an external dependency
or satellite co-product.

However, now consider such a build: let's say the target of one (of
several) build systems is Win32. Production binaries are compiled with
Intel's compiler, but the build system also compiles the same code with
g++ and MSVC as well, just to collect the diagnostics produced by these
compilers. An incremental build will then defeat the purpose of
compiling with the other two compilers entirely, because you do not get
the full set of diagnostics. You also do not get the full set of
diagnostics from Intel (so part of the reason for building with it is
defeated as well), but from there you at least get modules.

> > > There is a difference between "broken build" and "broken incremental
> > > build system". I agree that it sounds like it would be very hard for a
> > > broken build to pass the tests. However, it sounds \very easy\ for a
> > > bug to be introduced in the incremental build \system\.
>
> > Yes. So *do* *not* use incremental build systems and sun is shining
> > once again.
>
> > Test of a clean build is lot simpler. Did it build everything it was
> > made to build? Yes? Success! No? Fail! That is it. Tested. Simple part
> > is over. Now can run automatic tests (that presumably takes lot more
> > time than building) to see if all the modules that were built (all
> > modules of full product) are good too.
>
> Again, arguing a non sequitur. I would love to have a discussion of if
> we should have clean builds, but you reply as though I was making the
> argument in that quote that we should have incremental build systems.
> I was not. I was very clear and explicit in that post that I was
> arguing solely that he did not test the incremental build system for
> correctness in any significant way.

Sorry there, then.

> Again, to change topics to your new point, why should we have correct
> incremental builds? Because it's faster, and componentizing components
> might not make sense, and it might be more costly to the company than
> a correct incremental build system, especially when the cost of the
> incremental build system can be amortized over all companies.

Componentizing always makes such great sense to me from so many
different angles that it is a somewhat holy thing for me. I believe in
it. Testability, reusability, maintainability, etc. Whether you use
components statically or dynamically in your end product is an entirely
different issue.

> Think about it like this. It's all incremental. Splitting it up into
> components is one kind of incremental. It's incremental at the
> component level. However, the benefit of this can only go so far.
> Eventually there would be too many different components, and we're
> right in the situation described in Recursive Make Considered Harmful,
> the situation without automated dependency analysis. Yes, we do need
> to break it down at the component level at some point. It's not
> practical to rebuild all of the linux kernel whenever I compile a
> Hello World! app, but nor is it practical to say componentization
> solves all problems perfectly without need of other solutions like a
> parallel build, a distributed build, build farms, faster compilers,
> pImpl, and/or incremental builds.

I am not saying components solve everything ... just that they help
greatly and simplify life from numerous angles. As for make ... isn't
it itself fully automated dependency analysis?

> > No one notices it because build system does everything automatically.
> > Checks out changed production branch from repository, runs all sort of
> > tools and tests and also produces web site about success and the
> > details and statistics (and changes in such) of various tools ran on
> > the code and on the freshly built modules. Building is tiny bit of its
> > job. It can even blame whose exactly changeset in repository did
> > likely break something. It can use some instant messaging system if
> > team dislikes e-mails. Of course ... all such features of build system
> > have to be tested too. If team does not sell the build system, then
> > testing it is less mission critical. Lets say it did blame an
> > innocent ... so there is defect and also interested part (wrongly
> > accused innocent) who wants it to be fixed.
>
> Yes, a build system does everything "automatically" if you do a full
> clean build every time, then it is handled automatically. Well, except
> it's slow.

It is slow because it runs all the static tools on the code, runs
class-level unit tests, (X) builds the modules, runs module-level unit
tests and then builds the product (and possibly its installers),
deploys it to places and runs product-level automatic tests there. It
also produces reports about everything. Why say that it is slow because
it is not using an incremental build at spot (X)? Does an incremental
build speed it up so much that it matters?

> And if there's a lot of dependency components which are
> frequently changing, and you have to manually get these extra-project
> dependencies, then we're in Recursive Make Considered Harmful. If
> instead you use some automated tool like Maven to download
> dependencies, and you do a full clean build, aka redownload, of those
> every time, then it's really really slow. (I'm in the situation now at
> work where we use Maven to handle downloading a bazillion different
> kinds of dependencies. As Maven has this nasty habit of automatically
> downloading newer "versions" of the same snapshot version, it's quite
> easy to get inconsistent versions of other in-house components. It's
> quite inconvenient and annoying. I've managed to deal with it, and
> work around several bugs, to avoid this unfortunate default. Did I
> mention I hate Maven as a build system?)

Isn't Maven for Java? Sorry, I have no experience; our Java teams have
their own processes, tools and procedures.

> Also, an automated build machine polling source control for checkins
> can only tell you which checkin broke the automated build (and tests)
> if your full clean build runs faster than the average checkin
> interval. At my company, the build of the ~25,000 source file project
> can take 2-3 hours on some supported systems, and that's without any
> tests. The basic test suite add another 5-6 hours. As a rough guess, I
> would imagine we have 100s of checkins a day.

~25,000 *files* and splitting into components (and even fully
autonomously useful subsystems) does not make sense? We are from
different universes. Apparently. There should be a distributed version
control system, several repositories and so on, I feel. Not to speak
of modules and components. Also, it feels like you should have the
budget to grow such a build farm (if it is building for 2 hours) until
the build takes 10 minutes. One thing I am sure of: the problems of the
incremental build system are the smallest cause of issues there.

Probably I am not qualified enough to discuss such an ill situation
and good ways out of it. Continuously integrating millions of SLOC to
keep it as one big blob? Rewrite it, split it up into components and
drop the integration cycle of the components to 2 weeks as a minimum?

Joshua Maurice

unread,
Jul 22, 2010, 1:35:36 PM7/22/10
to

Can you come slap mine please? We have two parallel build machines,
one doing a full clean build, one running the hacked version of
incremental I set up. Every such incremental build changes version.h
(and a couple of other version files like version.java), updating the
build number, with the result that my incremental build tends to
rebuild something like 40% of all of the code on every streaming
incremental build, because these version
files were changed. This was noted to management, but no time was
allocated to fix this.
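
(The textbook fix, if I ever get the time, would presumably be to keep
the ever-changing build number out of the widely included header --
generate it into one tiny .cpp instead -- and to overwrite version.h
only when its contents really changed. Something like this sketch,
where gen_version.sh is a made-up stand-in for our generator:

  FORCE:
  version.h: FORCE
          ./gen_version.sh > version.h.tmp
          if ! cmp -s version.h.tmp version.h; then \
              mv version.h.tmp version.h; else rm -f version.h.tmp; fi

Then the header's timestamp -- and everything that includes it -- is
left alone once the build number no longer lives there.)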

Joshua Maurice

unread,
Jul 22, 2010, 1:55:35 PM7/22/10
to
On Jul 22, 4:25 am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> On Tue, 2010-07-20, Joshua Maurice wrote:
> > On Jul 20, 9:36 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> ...
> > Second, make's model is fundamentally fubar. You cannot have a fully
> > correct incremental build written in idiomatic make ala Recursive Make
> > Considered Harmful. See else-thread, or the end of this post for a
> > synopsis. Make was good back in the day when a single project did fit
> > into a single directory and a single developer knew all of the code,
> > but when a developer does not know all of the code, make's model no
> > longer works.
>
> > Simply put, this is my use case which make will not handle. I'm
> > working in a company on a project with over 25,000 source files in a
> > single build. The compile / link portion takes over an hour on a
> > developer machine, assuming no random errors, which there frequently
> > are on an incremental build.
>
> There's something suspect in that sentence, since incremental builds
> vary in the time they take, from a few seconds (to check file
> timestamps) and up.

Yes, looking over that, it is quite nonsensical. Sorry. I think I was
trying to say that the full clean build takes forever, and I
frequently hit errors during it because ours is especially a POS.

Yes. I agree. Simply breaking up the components by fiat I think would
be harmful to the process. Instead, clearly defined interfaces need to
be put in place first, preferably with some decent acceptance tests at
every component level. Some people in my company just want to
componentize by fiat as though this will fix anything. Unfortunately,
such well-defined interfaces would be a paradigm shift. People would
actually have to plan ahead, get requirements earlier on, write
generic reusable interfaces instead of the feature driven approach we
currently have, etc.

I might be amenable to this in practice. As mentioned else-thread,
"good" GNU Make usage has a lot of $(eval $(value ...)) stuff in it
(or $(eval $(call ...)) where the call directly contains a
$(value ...)). This was my first prototype at attempting incremental
correctness, but man it was slow. The file IO for all of the build
state files, the cat and echo calls, and all of the interpreted string
manipulation was killing it, especially on Windows. It was spending
over half of its time on an up-to-date build just reading and parsing
the saved state files. (The .d files, one might say.)

> > - d- It will not get anywhere close to a good incremental build for
> > other compilation models, specifically Java. A good incremental Java
> > build cannot be based on a file dependency graph. [...]
>
> Well, I don't do Java ;-) Everything I use fits the make model. And
> that's not just C and C++ compilers; I have Make control dozens of
> different tools.

Unfortunately, my shop is moving towards Java, but we have a lot of
legacy C and C++ code which provides the engine for the Java tools, so
any usable incremental solution must handle Java. I think I hacked
together a solution which did at one point, but again the file IO and
interpreted string manipulation really killed the performance. I've
achieved much better performance just writing all of the logic in
actual compiled code instead of makefile.

> Like you said above, "the line has to be drawn somewhere". Or, as the
> saying goes: "all software sucks". I claim that while make has
> loopholes, (a) you can easily see when an incremental build may be
> incorrect and fix it with a 'make clean'; (b) this doesn't happen very
> often, and the vast majority of builds can be done incrementally.
>
> Since I don't see any better alternatives (and you seem to think the
> existing "make replacements" share its flaws) I am relatively happy.

I'm writing my own. See my first post in this thread, or one of the
earlier ones.

> > 2- It exposes a full turing complete language to common developer
> > edits, and these common developer edits wind up in source control.
>
> Just like C++. I commented on that that above.
>
> But I think you exaggerate the changes "common developer" make.
> Those are almost always of the form of adding or removing lines like
>
>   libfoo.a: fred.o
>
> and if someone doesn't get that, maybe he shouldn't edit the C++ code
> either.
>
> I am assuming here that someone designed the Makefile correctly
> originally; set up automatic dependency generation and so on. Lots of
> projects fail to do that -- but lots of projects fail to use C++
> properly too.

Well, I'm one of those projects which failed to do that. It also uses
recursive make.

However, it's not just a simple C++ project. We have Java, code
generation from a simple Java-like language to Java and C++ to support
serialization between the two, C++, other code generation for message
files, JNI, some Eclipse plugin build thingy, ?AWT?, and at least
half a dozen other kinds of build steps in the build. It also somehow
manages to throw Maven in. It really is a mess.

The unfortunate problem of this thread is that everyone has their own
preferred solution. Some suggest $(eval $(value ...)) to deal with my
corner cases. Some suggest doing a make clean on every such corner
case. I think there was a serious suggestion to just do full clean
builds every time as well, and people are getting confused by my
replies, where I reply to idea A, but they argue I'm wrong when
applied to idea B. Thus I'm trying to be overly pedantic in this
thread.

So, if edits are always like
libfoo.a : fred.o
then we have my corner cases which must be handled by email or some
other word of mouth or manual step, and this is a relatively horrible
state of affairs.

If the makefiles use file system wildcards, then we're looking a
little better. However, another corner case of mine sneaks in, which
again can only be handled by email, word of mouth, or manually
checking for it.

You could go the distance and abandon idiomatic make usage entirely
and use $(eval $(value ...)) to guarantee incremental correctness. All
developer makefiles would consist only of $(eval $(value ...)) and
they would never specify a make rule. This was my initial route. It
could work, but man it was slow with GNU Make 3.81 on Windows. It's
also somewhat ugly, if I may say so, and I fear there would be a great
desire for a developer to just write his own one-off rule if the
situation arose, and anyone with sufficient power over him wouldn't
care. At least, that's how my company operates.

So, I took the approach where I wrote my own replacement system which
is kind of like a bastard mix of Maven and Ant, with implementation
details of Make, to put all of the build logic in a single place, and
developers can only instantiate build macros which have otherwise been
thoroughly reviewed and tested before being deployed to developers.

> > The
> > immediate conclusion is that a build system based on idiomatic make
> > can never be incrementally correct over all possible changes in source
> > control. That is, a developer will inevitably make a change which
> > breaks the idiomatic usage (what little there is) and will result in
> > incremental incorrectness.
>
> So review the changes and reprimand him. Same as with the C++ code,
> but *a lot* easier.

They'll argue that 95% incremental correctness is acceptable, just as
someone else has else-thread. If you allow a build system where the
developer can incorrectly specify a build script, but it works most of
the time, management will not see a need to spend developer time
fixing it. That's why I want it near impossible for a developer to be
able to break incremental correctness short of maliciousness.

> > False negatives are quite annoying, but
> > perhaps somewhat acceptable. False positives are the bane of a build
> > systems existence, but they are possible when the build system itself
> > is being constantly modified \without any tests whatsoever\ as is
> > common with idiomatic make.
>
> That *is* scary, but I see no way around it.

I'm working on it. See else-thread for a description of my build
system.

Joshua Maurice

unread,
Jul 22, 2010, 2:03:23 PM7/22/10
to
On Jul 22, 7:03 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Jul 22, 4:40 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>
> > I must say this is somewhat surprising to hear in a C++ forum. I
> > expected most people to still be under the incremental spell.
> > Incremental really is not that hard to achieve. It's just that no one
> > has really tried as far as I can tell. (Specifically under my
> > definition of "incremental correctness" which by definition includes
> > the associated build scripts as source, though I think even ignoring
> > build script changes, all common, purported incremental, build systems
> > are not incrementally correct under idiomatic usage.)
>
> Yes people have tried and have succeeded and it was not hard.
> A few simple scripts and conventions to augment make suffice.
> I've already told you this (and given you one example that you
> could/should have expanded on).

As I just mentioned in the previous post, conventions are insufficient
IMHO. If the system allows someone to add a new rule to make, such as
for a one-off build step, it's very hard to convince managers that it's
worth developer time to do it the right way if it'll work 99% of the
time. However, those 1% cases "add up", and the end result is my
current build system, which is horribly broken for incremental builds,
and I suspect it's the same for most other build systems of equivalent
size.

Also, could you point me to publicly available implementations?

> Your problem is not one of intellect rather it is an attitude
> problem and a problem of self-delusion. You are operating on
> several false assumptions:
>
>    1) That you are a master of build /systems/ (make is a tool
>       not a system by the way).

I'm not operating on that assumption. If I was, I wouldn't be posting
here asking for feedback and advice. I merely said I was above the
pack, which is quite evident from my company, but less so here. And
yes, make is a build system framework. You can implement many
different kinds of build systems from Make. I was trying to be clear
when I emphasized "idiomatic GNU Make usage ala Recursive Make
Considered Harmful", though I may have let that slip several times to
simply "make".

>    2) That make is a fundamentally fubar, flawed, horrific,
>       broken, etc /tool/ that cannot serve as a core component
>       of an incrementally correct build /system/.
>
> 3) That the problems make does have are insurmountable
> without major overhaul.

Can you elaborate further? Is it the existence of these scripts and
conventions that makes it not broken? I think that's still under
dispute. Moreover, this isn't an assumption of mine. I have spent a
great deal of text-space here in this thread clarifying the problems
and arguing that they are real problems.

> 4) That the incompetence and social problems of your
> workplace are relevant to the correctness of make.

They're not? I live in a world where practicality matters. I live in a
world where politics and social pressures matter. I live in a world
where we like type safety and const correctness because developers are
not perfect, myself included. I live in a world where we use C++
instead of assembly.

I could not write an incrementally correct build on my first try. I
would need several iterations, a large test suite, etc. (which I'm in
the process of doing). If you have an academic solution, but it
doesn't work in practice, then it simply does not work.

> All of the above are false but you labor with them as truths.
> They are holding you back! If you would stop ranting and whining
> and start thinking and scripting your problems would start to
> evaporate.
>
> Even the fact that I'm telling you it is /possible/ to solve ALL
> the problems you have outlined and indeed can be done with simple
> scripts + make (at least with C++ projects, I can't comment on
> Java), that alone should be a boon to you if you could only get
> past those flawed preconceptions.

So yes, you're advocating the $(eval $(value ...)) like approach where
all developers work in terms of the predefined, vetted GNU Make
macros. As I mentioned, this was my initial idea, my prototype, but I
threw it out for the reasons already mentioned.

Also, unfortunately my solution requires Java support.

Joshua Maurice

unread,
Jul 22, 2010, 2:07:24 PM7/22/10
to

As already mentioned, mine is. Mine is in the business of making
money, not perfecting a GNU Make incremental build system. If I go to
a manager and say "Here's this problem. In 1% of incremental builds,
the result will be incorrect, and we'll lose X developer time over
it." Manager will say "Ok, how long will it take to fix? I can't just
send you on this academic exercise. If we do that, we'll lose
developer time overall, and we'll let feature Y miss the release." And
the manager would be right. The problem is that all of these little
problems in the build system pile up, each one not worth fixing, but
in the end the incremental build system becomes quite broken.

That's why I argue it's important to make it right by default, and
make it exceptionally hard to break the incremental build system.

Öö Tiib

unread,
Jul 22, 2010, 2:13:07 PM7/22/10
to
On 22 juuli, 12:08, Ian Collins <ian-n...@hotmail.com> wrote:
>
> >> Why? On the contrary. Incremental builds are irrelevant. No one uses
> >> these anyway.
>
> I don't agree with that, 99% of my builds are incremental, often just
> the one file I'm editing.

Yes, I did mean producing the full fifty-module, thousand-code-file
product after pulling from the repository with changes made by who
knows whom and from where. As for one file or even a module and its
unit tests and tools (what I usually work with), it usually takes a
minute or so for my PC to compile and run them all, so I can barely run
to get coffee during such a build ... no difference if it is
incremental or not. Incrementally building and linking the full product
all day just to see if a tiny change you made in one file did the trick
in the context of the full product feels a bit like voodoo programming
and hacking, and may throw all the productivity out of the window, I
believe.

Keith H Duggar

unread,
Jul 22, 2010, 3:12:55 PM7/22/10
to

I'm not sure I understand what you meant by "mine is". Did you
mean "my (Joshua's) shop sucks"? Or "my (Joshua's) shop is in too
much of a hurry to practice software engineering"?

Anyhow, our goal is making money as well. Luckily we know (to
some extent at least) that haste can make waste and that cutting
corners eventually bites your ass. We learned the hard way.

Also, haven't you heard of working on the weekends and at night?
What does your employer think of all the time you are spending
whinging here in the newsgroup about make? Would they allow you
to spend such whining time on improving the build system instead?

On more than one occasion I've had a conversion like this with
my director:

KHD : By the way, I went ahead and implemented a solution this
last two weekends to problem XYZ that I've been complaining
about for the last month.
DIR : Hehe. Why did you do that?
KHD : I just couldn't stand screwing around with the hacks
any longer. They were breaking and wasting my time.
DIR : Ok. Cool. When will it be online?
KHD : Well, I have to wait until ABC has time to review it.
It's simple so it will only take him an hour or two.
DIR : Ok. Well tell him he can review it after he finishes
project PQR unless something else comes up.
KHD : Thanks. In the meantime I'll just keep hacking shit.

Step up to the plate, man. Be a leader and problem solver. Work
overtime to implement your ideas. Maybe this will help you out:

http://www.youtube.com/watch?v=unkIVvjZc9Y

KHD

Ian Collins

unread,
Jul 22, 2010, 4:54:29 PM7/22/10
to
On 07/23/10 06:03 AM, Joshua Maurice wrote:
>
> So yes, you're advocating the $(eval $(value ...)) like approach where
> all developers work in terms of the predefined, vetted GNU Make
> macros. As I mentioned, this was my initial idea, my prototype, but I
> threw it out for the reasons already mentioned.

There's more to make than GNU make. In some ways the "extensions" in
GNU make simply give the unwary more rope to hang themselves.

--
Ian Collins

Ian Collins

unread,
Jul 22, 2010, 5:02:08 PM7/22/10
to
On 07/23/10 05:06 AM, Öö Tiib wrote:
> On 22 juuli, 11:40, Joshua Maurice<joshuamaur...@gmail.com> wrote:
>>
>> The argument made was that he did not test the incremental build
>> system to any significant degree. He argued that he did test the
>> incremental build system with "continuous clean builds". He is in
>> error, and you are arguing a non sequitur.
>
> Ok. He did test clean build system, so he did not test incremental
> build system.

I thought I had explained clearly that incremental development tests
incremental builds!

You add a test or some code to pass a test, build, test. If you don't
get the expected result, something is broken. That something can either
be the new code, or the build (or on a really bad day, the compiler!).

--
Ian Collins

Joshua Maurice

unread,
Jul 22, 2010, 5:36:49 PM7/22/10
to
On Jul 22, 2:02 pm, Ian Collins <ian-n...@hotmail.com> wrote:

If you restrict developers to only $(eval $(value ...)) of predefined,
vetted macros, then at least you might be able to thoroughly test each
macro for correctness. I presumed you were not using such a scheme. I
apologize if that assumption was incorrect. If my assumption was
correct, and there are a lot of explicit rules commonly modified by
developers, then my claims hold.

To reiterate: My first claim was that you did not thoroughly test the
incremental build system at all before deploying it to developers. The
fact remains that the build system is under constant modification
without even a basic sanity test before being deployed to developers.

Moreover, I have a new claim as well, that even the everyday use by
developers will not test all possible source code deltas, that is
developers using it over the next day will not constitute a
comprehensive test either. In addition to being forced to be the
guinea pig testers for the new incremental build system, they still
are not thoroughly testing it. Their goal isn't to test it and its
corner cases. They're just trying to use it to get another job done. A
single missing dependency could remain uncaught for months of usage,
or never caught.

Joshua Maurice

unread,
Jul 22, 2010, 6:04:21 PM7/22/10
to

Possibly both? If you define "software engineering" appropriately,
then we do not do it. We decide that we cannot spend developer time on
an investment which would cost more than it would return before the
next release. At least, that's how most of the decisions go. If you
want to define a short time horizon as not software engineering, then
yes.

> Also, haven't you heard of working on the weekends and at night?
> What does your employer think of all the time you are spending
> whinging here in the newsgroup about make? Would they allow you
> to spend such whining time on improving the build system instead?

This is my own time. I have rather flexible hours, but I put in more
than my expected time most days. I don't appreciate the insinuations
and personal attacks either.

And no, they're not terribly interested in fixing the build, either
componentizing, or any of my other suggestions involving incremental
correctness. I can't really blame them either for the aforementioned
reasons, such as it might not be a wise investment for the next
release.

> On more than one occasion I've had a conversion like this with
> my director:
>
>    KHD : By the way, I went ahead and implemented a solution this
>       last two weekends to problem XYZ that I've been complaining
>       about for the last month.
>    DIR : Hehe. Why did you do that?
>    KHD : I just couldn't stand screwing around with the hacks
>       any longer. They were breaking and wasting my time.
>    DIR : Ok. Cool. When will it be online?
>    KHD : Well, I have to wait until ABC has time to review it.
>       It's simple so it will only take him an hour or two.
>    DIR : Ok. Well tell him he can review it after he finishes
>       project PQR unless something else comes up.
>    KHD : Thanks. In the meantime I'll just keep hacking shit.
>
> Step up to the plate, man. Be a leader and problem solver. Work
> overtime to implement your ideas.

What do you think this is? Whining? I specifically asked where I
should post up the code in order to get reviews and such, and possibly
widespread public adoption. Since then, I have merely participated in
discussions on my claims, and I have defended my claims where I think
I am right based on evidence and argument. However, this has been very
beneficial to me, as I now know where to further explore my ideas.
This thread has led to several novel claims and ideas reinforcing
my beliefs, and presented several new ones against them, and for that I
thank
you.

My build system, on which I do work weekends and nights off-clock, is
not complete enough for use in my company. I have many macros to
implement before it's in a usable state, and even then there's the
cost to move over from the recursive make system + Maven nonsense to
my system. As my system is not publicly adopted and is an in-house
system, they're hesitant. They also don't believe that the costs of
the current system are unreasonable - I just wish they would
actually do some coding sometime to see how bad it actually is.

Then we also have developers in high level meetings take the weasel
way out, much like this thread, so that's not helping. I was just
recently involved in a team to help "fix" the build. As far as I can
tell, this is basically how it went:
- Developers: Yo managers! The build is like, way slow, and very
fragile.
- Managers: Ok. So, we agree. In a meeting which Joshua was not
involved in, nor anyone else from the "build specialist" team, we
pulled numbers out of a hat for acceptable build times. None of
these targets include anything about its fragileness. How bad is our
current build system? What can we do to meet these targets?
- Developer Teams Representatives (of which I was one): Well, if a
developer chooses to not do a full rebuild but only a rebuild of my
own unofficial subcomponent, and I skipped running 80% of the tests,
then we actually already meet these goals without any changes. (Insert
other weaseling which shows we already meet build time targets.)
- Managers: Excellent! So, issue closed?

I've had discussions similar to yours with my manager, other managers,
and higher level managers, but they don't end the same way. In the
end, they don't understand the technical details, they don't
understand the lost developer time, and they defer to the other senior
developers and/or developer+managers who take the weasel way out
because they have looming deadlines, and to some extent because they
also don't fully understand the technical details. When I'm in such
meetings, I have to explain to the higher level technical managers,
such as the manager of the manager of the build specialist team, what
a pom.xml is ("Oh, like a makefile"), and we've had Maven as our core
build tool for the last 2 years.

At least, this is my impression. I believe a correct incremental build
system is a doable proposition, and I believe that switching to it
would be a very worthwhile investment if such a system already
existed. However, if the system had to be written from scratch, like
what I'm doing, then it may not be worth it if our time horizon is
only the next release. However, as it will never fit in a release, and
the investment return spans several releases, I suspect it'll
never get done in my company. Well, at least not for many more
releases from now.

PS: I feel as though I'm being given the runaround by some people in
here. I have made very clear points, and some people continue to
misinterpret or misrepresent my arguments, and others bring up
tangentially related arguments as though it's a rebuttal to my mostly
unrelated points. Now I'm replying to mostly personal attacks and
other non sequiturs, like "Stop complaining and fix it", despite my
sincere belief that this thread is exactly that, an important and
crucial step towards fixing it (initially I asked where I could post
the source code if I could get it open-sourced), and I have explained
else-thread that I have been working (in my own time) on such a fix.

Ian Collins

unread,
Jul 22, 2010, 6:04:57 PM7/22/10
to
On 07/23/10 09:36 AM, Joshua Maurice wrote:
> On Jul 22, 2:02 pm, Ian Collins<ian-n...@hotmail.com> wrote:
>> On 07/23/10 05:06 AM, Öö Tiib wrote:
>>
>>> On 22 juuli, 11:40, Joshua Maurice<joshuamaur...@gmail.com> wrote:
>>
>>>> The argument made was that he did not test the incremental build
>>>> system to any significant degree. He argued that he did test the
>>>> incremental build system with "continuous clean builds". He is in
>>>> error, and you are arguing a non sequitur.
>>
>>> Ok. He did test clean build system, so he did not test incremental
>>> build system.
>>
>> I thought I had explained clearly that incremental development tests
>> incremental builds!
>>
>> You add a test or some code to pass a test, build, test. If you don't
>> get the expected result, something is broken. That something can either
>> be the new code, or the build (or on a really bad day, the compiler!).
>
> If you restrict developers to only $(eval $(value ...)) of predefined,
> vetted macros, then at least you might be able to thoroughly test each
> macro for correctness.

If I knew what $(eval $(value ...)) did, I'd be able to answer. We
didn't use GNU make.

> I presumed you were not using such a scheme. I
> apologize if that assumption was incorrect. If my assumption was
> correct, and there are a lot of explicit rules commonly modified by
> developers, then my claims hold.

The *only* changes made to makefiles by developers were the addition
and removal of targets (source files).

> To reiterate: My first claim was that you did not thoroughly test the
> incremental build system at all before deploying it to developers. The
> fact remains that the build system is under constant modification
> without even a basic sanity test before being deployed to developers.

I'm sorry, but that's bollocks.

> Moreover, I have a new claim as well, that even the everyday use by
> developers will not test all possible source code deltas, that is
> developers using it over the next day will not constitute a
> comprehensive test either. In addition to being forced to be the
> guinea pig testers for the new incremental build system, they still
> are not thoroughly testing it. Their goal isn't to test it and its
> corner cases. They're just trying to use it to get another job done. A
> single missing dependency could remain uncaught for months of usage,
> or never caught.

The world isn't perfect, get over it.

--
Ian Collins

Joshua Maurice

unread,
Jul 22, 2010, 6:13:51 PM7/22/10
to
On Jul 22, 3:04 pm, Ian Collins <ian-n...@hotmail.com> wrote:
> On 07/23/10 09:36 AM, Joshua Maurice wrote:
> > I presumed you were not using such a scheme. I
> > apologize if that assumption was incorrect. If my assumption was
> > correct, and there are a lot of explicit rules commonly modified by
> > developers, then my claims hold.
>
> The *only* changes made to makefiles by developers were the addition
> and removal of targets (source files).

So, developers never specified new rules? What if the developer added
new source code which was to be in a new library? I'm confused.
Presumably the developers had to define new make rules. I can only
assume that's what you meant by the addition of new targets. In which
case, do you even track header dependencies? If not, your system is
laughably not incremental. If you do track header file dependencies,
and the developer has to add the rules to track header file
dependencies every time he adds a new library, then there's plenty of
room for error. (In addition to all of my other points.) Also typos.

> > To reiterate: My first claim was that you did not thoroughly test the
> > incremental build system at all before deploying it to developers. The
> > fact remains that the build system is under constant modification
> > without even a basic sanity test before being deployed to developers.
>
> I'm sorry, but that's bollocks.

Can you explain? It still sounds to me as if the makefile is a Turing-
complete programming language, and this file is being modified and
deployed without testing all aspects of it. I think your argument is
"It's so simple they can't break it." Fine, I guess, for a
sufficiently small makefile and project working entirely on C++ code,
which never adds nor removes source files, which was written the right
way to start with, and which is maintained by people knowledgeable
about make and your conventions. That doesn't sound reasonable. Also typos.

> > Moreover, I have a new claim as well, that even the everyday use by
> > developers will not test all possible source code deltas, that is
> > developers using it over the next day will not constitute a
> > comprehensive test either. In addition to being forced to be the
> > guinea pig testers for the new incremental build system, they still
> > are not thoroughly testing it. Their goal isn't to test it and its
> > corner cases. They're just trying to use it to get another job done. A
> > single missing dependency could remain uncaught for months of usage,
> > or never caught.
>
> The world isn't perfect, get over it.

That's not a rebuttal. I'm sorry I am pedantic; it's just the way I
am. I originally claimed that incremental build systems are basically
deployed entirely untested, including yours. You contested this claim,
and based on the available evidence and argument, you are wrong, and I
will continue arguing this until presented with some new evidence or
argument. You're welcome to stop contesting it.

Ian Collins

Jul 22, 2010, 6:58:28 PM
On 07/23/10 10:13 AM, Joshua Maurice wrote:
> On Jul 22, 3:04 pm, Ian Collins<ian-n...@hotmail.com> wrote:
>> On 07/23/10 09:36 AM, Joshua Maurice wrote:
>>> I presumed you were not using such a scheme. I
>>> apologize if that assumption was incorrect. If my assumption was
>>> correct, and there are a lot of explicit rules commonly modified by
>>> developers, then my claims hold.
>>
>> The *only* changes made to makefiles by developers where the addition
>> and removal of targets (source files).
>
> So, developers never specified new rules? What if the developer added
> new source code which was to be in a new library?

Adding a new library would have been a team decision; it didn't happen
very often.

> I'm confused.
> Presumably the developers had to define new make rules. I can only
> assume that's what you meant by the addition of new targets. In which
> case, do you even track header dependencies? If not, your system is
> laughably not incremental. If you do track header file dependencies,
> and the developer has to add the rules to track header file
> dependencies every time he adds a new library, then there's plenty of
> room for error. (In addition to all of my other points.) Also typos.

*make* tracks header dependencies; that's why people use it! Google
"make .KEEP_STATE". As I said, there's more to make than GNU make. We
only listed source files, not headers. The source file dependencies are
in the files generated by make. So I/we only write the higher-level
dependencies:

Executable
|
Libraries and Object files
|
Source files

My current personal project space makefile is 1055 lines long, with 180
targets in 11 libraries; the generated dependency file is almost 6000
lines.
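
For readers who have not used it, here is a minimal sketch of the kind of
makefile being described (Sun make / dmake syntax; the names are invented
for illustration). As I understand it, the .KEEP_STATE: special target turns
on the automatic dependency tracking mentioned above, with hidden (header)
dependencies recorded in a .make.state file when the Sun compilers are used:

CXX      = CC
CXXFLAGS = -g -O

OBJECTS  = fileone.o filetwo.o

.KEEP_STATE:

libnewlib.a: $(OBJECTS)
	ar -r libnewlib.a $(OBJECTS)

Only the source/object list is written by hand; the header dependencies
accumulate in .make.state as files are compiled.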

>>> To reiterate: My first claim was that you did not thoroughly test the
>>> incremental build system at all before deploying it to developers. The
>>> fact remains that the build system is under constant modification
>>> without even a basic sanity test before being deployed to developers.
>>
>> I'm sorry, but that's bollocks.
>
> Can you explain? It sounds to me still that the makefile is a Turing
> complete programming language, and this file is being modified and
> deployed without testing all aspects of it. I think your argument is
> "It's so simple they can't break it." Fine, I guess, for a
> sufficiently small makefile and project working entirely on C++ code,
> which never adds nor removes source files, which was written the right
> way to start with, and which is maintained by knowledgeable people of
> make and your conventions. That doesn't sound reasonable. Also typos.

I think you are assuming too much about how we used our makefile. All we
had in makefiles were lists of tools, options and targets. No fancy
stuff, no variable evaluations, just lists. There was very little to
break; if a source was missing, the application wouldn't link.

I guess, with hindsight, we were lucky to start out with makefiles
generated by an IDE. IDEs, being simple beasts, generate simple makefiles!

--
Ian Collins

Joshua Maurice

Jul 22, 2010, 8:27:08 PM

Ok, I give. I misunderstood, or you were unclear, either way my fault,
and I persisted in this without further clarifying exactly what you
have done.

In effect, it sounds like, in your system, there are some rules for
building certain kinds of code, like C++, which have been prevetted
and rarely change, very much like the $(eval $(value ...)) approach.
As I said earlier, I'm much more partial to this approach. In fact, my
initial prototype was very much something like this. However, at least
my implementation on GNU Make was quite slow. Some profiling showed
that it spent a large portion of its time in string manipulation,
process creation for the cat and echo processes, and I/O, on Windows.
It was an order of magnitude or two slower than the compiled-code
solution I have now. My compiled-code solution determines that ~4,000
Java files of an unofficial component of my company's project are up
to date in about 2 seconds. The GNU Make based approach with prevetted
complex rules a la $(eval $(value ...)) took much longer to determine
that the ~4,000 Java files were up to date.
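
To make the approach concrete, here is a minimal sketch of a prevetted GNU
Make macro expanded once per library (the macro name, directory layout, and
library names are invented for illustration; the sketch leans on make's
built-in %.o: %.cpp rule for the actual compilation):

# Defined once, vetted once:
define make-library
$(1)_SRCS := $$(wildcard $(1)/*.cpp)
$(1)_OBJS := $$($(1)_SRCS:.cpp=.o)
lib$(1).a: $$($(1)_OBJS)
	ar rcs $$@ $$^
endef

# A developer adding a library then writes only one line per library:
$(eval $(call make-library,util))
$(eval $(call make-library,core))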

I think my complaints about such a scheme are simply that it's slower
than a compiled code solution, and it will not work for Java, both of
which are sticking points for my company's very large project.
However, it sounds like it actually works for you. I guess I'll have
to retract my points as made in ignorance based on assumption. My
bad.

Still though, I'm curious how exactly how yours is implemented. Is it
portable across operating systems, like HPUX, z linux, win itanium,
and more? Does it support cross compilation which is required for msvc
on win itanium? What exact make does it use? What other tools does it
use? Anything fancy, or just cat, echo, gcc (or whatever c++
compiler), etc.? You said it was created by an IDE initially, which?

Ian Collins

Jul 22, 2010, 8:49:17 PM
On 07/23/10 12:27 PM, Joshua Maurice wrote:
>
> Ok, I give. I misunderstood, or you were unclear, either way my fault,
> and I persisted in this without further clarifying exactly what you
> have done.
>
> In effect, it sounds like in your system that there are some rules for
> building certain kinds of code, like C++, which have been prevetted
> and rarely change, very much like the $(eval $(value ...)) approach.

Yes. I think all "make" solutions have a hierarchical structure with
fixed system-wide rules (how to make a .o from a .c etc.) and
progressively more flexible local conventions.

> As I said earlier, I'm much more partial to this approach. In fact, my
> initial prototype was very much something like this. However, at least
> my implementation on GNU Make was quite slow. Some profiling showed
> that it spent a large portion of its time in string manipulation,
> process creation for the cat and echo processes, and io, on windows.

I can imagine. Trying to get too smart with make rules will do that; it
is, after all, just another interpreted scripting language.

> Still though, I'm curious how exactly how yours is implemented.

We/I just use Sun's dmake.

> Is it
> portable across operating systems, like HPUX, z linux, win itanium,
> and more? Does it support cross compilation which is required for msvc
> on win itanium? What exact make does it use? What other tools does it
> use? Anything fancy, or just cat, echo, gcc (or whatever c++
> compiler), etc.?

None of the above! The tools run on Solaris and Linux flavours only.

> You said it was created by an IDE initially, which?

An old Sun product called Workshop.

--
Ian Collins

Keith H Duggar

Jul 23, 2010, 2:54:35 AM
On Jul 22, 6:13 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 22, 3:04 pm, Ian Collins <ian-n...@hotmail.com> wrote:
>
> > On 07/23/10 09:36 AM, Joshua Maurice wrote:
> > > I presumed you were not using such a scheme. I
> > > apologize if that assumption was incorrect. If my assumption was
> > > correct, and there are a lot of explicit rules commonly modified by
> > > developers, then my claims hold.
>
> > The *only* changes made to makefiles by developers where the addition
> > and removal of targets (source files).
>
> So, developers never specified new rules? What if the developer added
> new source code which was to be in a new library? I'm confused.
> Presumably the developers had to define new make rules. I can only
> assume that's what you meant by the addition of new targets. In which
> case, do you even track header dependencies? If not, your system is
> laughably not incremental. If you do track header file dependencies,
> and the developer has to add the rules to track header file
> dependencies every time he adds a new library, then there's plenty of
> room for error. (In addition to all of my other points.) Also typos.

Let me show you the exact commands one would use to create a new
library in my system assuming that $LIBROOT is the root library
path (ie the dir that contains top level libraries) and newlib is
the new library:

$ cd $LIBROOT
$ mkdir newlib
$ cp $LIBROOT/makes/makefile.lib newlib/makefile
$ cd newlib
$ gencode -ch fileone filetwo
$ gvim fileone.hpp fileone.cpp filetwo.hpp filetwo.cpp ...
... edit/write the new code for fileone.hpp etc ...
... check code into depot if desired ...

that's it; job is done. Now you want to build the new library?
Ok assume $OBJROOT is the directory in which you have previously
built the libraries and now you want to build this new library:

$ cd $OBJROOT
$ gmake -f $LIBROOT/makefile

end of story; the library collection is incrementally built, which
in this case includes a clean build of this new library. Did the
developer specify new rules? No. All they needed to do was create
directories, copy files, and write code.

Now I have a question. Do you know how to implement a build system
that provides the above simple support for adding new libraries?

KHD

Keith H Duggar

Jul 23, 2010, 3:08:44 AM
On Jul 22, 8:49 pm, Ian Collins <ian-n...@hotmail.com> wrote:
> On 07/23/10 12:27 PM, Joshua Maurice wrote:
> > Ok, I give. I misunderstood, or you were unclear, either way my fault,
> > and I persisted in this without further clarifying exactly what you
> > have done.
>
> > In effect, it sounds like in your system that there are some rules for
> > building certain kinds of code, like C++, which have been prevetted
> > and rarely change, very much like the $(eval $(value ...)) approach.
>
> Yes.  I think all "make" solutions have a hierarchical structure with
> fixed system wide rules (how to make a .o form a .c etc.) and
> progressively more flexible local conventions.
>
> > As I said earlier, I'm much more partial to this approach. In fact, my
> > initial prototype was very much something like this. However, at least
> > my implementation on GNU Make was quite slow. Some profiling showed
> > that it spent a large portion of its time in string manipulation,
> > process creation for the cat and echo processes, and io, on windows.
>
> I can imagine.  Trying to get too smart with make rules will do that, it
> is after all just another interpreted scripting language.

Or simply not knowing when to use := instead of =, or when to
let a script/program do the work instead, etc. You know, all the
usual noob mistakes. The kinds of things that a decent book such
as "Managing Projects with GNU Make" by Robert Mecklenburg would
teach you about. I wonder if the op has read that book?
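
(For concreteness, here is the classic example of that particular := mistake,
in GNU Make syntax; the variable names are invented for illustration:)

# '=' defines a recursively expanded variable: the find command runs
# again every time SLOW_SRCS is referenced.
SLOW_SRCS = $(shell find . -name '*.cpp')

# ':=' defines a simply expanded variable: the find command runs once,
# when the makefile is parsed.
FAST_SRCS := $(shell find . -name '*.cpp')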

KHD

Joshua Maurice

Jul 23, 2010, 3:45:21 PM

I'm sorry. I was specifically attacking the idiomatic usage of make,
by which I thought was meant developers specifying rules explicitly, a la
foo.o : foo.cpp
or some use of wildcards, or whatever. Using prebuilt macros does not
appear to be idiomatic usage as described in the manual or half of the
books which I have read.

Your system is pretty good. I presume that it uses file system
wildcards to get the list of cpp files in that directory, sets the
include path to automatically include the relevant dir (or you specify
all includes relative to the base dir of the project), and it uses the
lib-root dir name as the soname of the output library. Overall, pretty
good. I'd still expect that the users will have to modify the makefile
for any real usage though, such as for link dependencies, preprocessor
defines, and possibly include path dirs.

Then there are still situations where you might need to tweak the
compiler options for a particular platform. In my company, there are a
lot of places where this is the case, to work around compiler bugs, to
get acceptable compile times, etc. For one particularly complex
template file, on one platform we had to turn down the optimization
level for regular builds; otherwise the compile time would be many
hours for just that one file.

With your system, you could also handle new file creation hiding
other files on search paths, file deletion (with file system wildcards
triggering appropriate cleans), and changes to compile options, such as
command line preprocessor defines, also triggering cleans.
I'd be curious whether you handle those; if so, your system is very
good.

As I just mentioned, I have no real problems with such a system
besides
1- its very poor speed compared to a solution written in compiled code
rather than an interpreted language,
2- and it can't work for other compilation models such as Java, which
is a sticking point for my use case.
So, if its speed is fast enough and/or the project is small enough, and
you don't need to support other compilation models, I really like it
(conditional on a couple of other minor aspects).

Still, I think we're really abusing GNU Make (or whatever make flavor)
when we do this. This is definitely not what was intended when make
was originally written. Make was written with the intent of users
specifying rules to build stuff, possibly with variables like $(CC)
etc. It wasn't really written with a pure macro aka task system in
mind. Instead, this is much closer in spirit to Maven or Ant. It's
just that the (GNU) Make engine is pretty general purpose, portable,
and reusable, so we reused it to solve the problem in a very
unorthodox approach from the perspective of the original authors of
Make. Also, I would be a little happier if there were a publicly
available *standard* implementation of these makefile scripts instead
of each company and/or team rolling their own, with their own
conventions. My complaint, I guess, is simply that GNU Make is a build
system framework, but I would really like a standardized build system
usable out of the box instead of having to roll my own.

> Now I have a question. Do you know how to implement a build system
> that provides the above simple support for adding new libraries?

Yes.

Hell, I wrote a Make clone myself, to provide built-ins for echo to
file, cat, etc., to see if I could get acceptable performance. This
was before I decided to abandon the Make model entirely. I know some
won't believe me, but on some rather uncontrived, large makefiles, my
own implementation outperformed GNU Make by quite a large factor, in
the area of 25 to 50x faster runtimes for an up to date build on
~25,000 source files on a recent Linux install with 8 cores. (Haven't
got legal to let me open source it yet. Yes, both were compiled with
standard optimization compiler options to gcc 4 something on Linux. No
cheating.) I'll get around to profiling later to figure out how this
is the case. I really don't know why. I just wrote mine in a
simplistic, straightforward C++ approach, whereas the GNU Make
internals are crazy paranoid about string copies. You should see the
hackery they do for the built-in functions and how they pass a large
string buffer around in an attempt to avoid extra string copies.
Perhaps it's because I don't keep information around about the
definition location of variables like GNU Make, but I doubt that could
be a significant performance hit.

Joshua Maurice

Jul 23, 2010, 3:57:57 PM

Partially. I admit I have only read the sections available on
http://oreilly.com/catalog/make3/book/index.csp

Still, I would note that I am not a run-of-the-mill Make noob.

It's just that I really don't like the contortions I had to go through
to get an incrementally correct build for C++ in Make. The biggest
problem with Make is it's dependency graph model. Specifically, if one
node is considered out of date, all nodes downstream are considered
out of date. This makes it exceptionally hard, or I think impossible
(? correct me if I'm wrong please), to have "code" execute on every
go, in parallel, and be able to make a node out of date, but not force
out-of-date-ness.

An example is dealing with when a cpp source file has been removed. In
this case, you want to force a relink of its library, and preferably
delete the "orphaned" object file. This can be handled in pass 1, but
then it's done by a single thread only. You could try to move such
logic to rule evaluation, but rules only run when the node is out of
date, and there's no way AFAIK to conditionally mark another node out
of date in a rule's command. GNU Make, at least, does some interesting
caching of file timestamps, so you can't simply have a rule delete a
file that order-only depends on that rule's target and expect that to
work. I've tried; it doesn't.
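
(For readers unfamiliar with the feature being discussed: in GNU Make,
order-only prerequisites are listed after a pipe. They must be built before
the target's recipe runs, but their timestamps never make the target out of
date. A minimal sketch, with invented names:)

obj/foo.o: src/foo.cpp | obj      # 'obj' is an order-only prerequisite
	$(CXX) -c src/foo.cpp -o obj/foo.o

obj:
	mkdir -p obj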

This is also one of the reasons I wrote my own make clone. In my
clone, for nodes A and B where A depends order-only on B, if B's command
deletes file A, then A will be considered out of date; otherwise
it behaves as usual. The result is that you can conditionally execute
portions of the graph downstream. Again, you can get this same
behavior by writing this logic outside of rule commands so it will be
done in pass 1, but pass 1 is single threaded.

Keith H Duggar

Jul 27, 2010, 2:32:40 PM

I don't understand why you would say that. The make manuals I've
seen discuss macros (or just "variables", as GNU calls them) and
both the make books I've read discuss them at length.

Anyhow, who cares what is "idiomatic" or not? I would've thought
our shared interest is in solving problems, not arguing about what
is "idiomatic". It seems instead that you are trying to limit the
discussion to a narrow practice where your complaints are justified
rather than discussing how to effectively use make.

> Your system is pretty good. I presume that it uses file system
> wildcards to get the list of cpp files in that directory, sets the

Correct. There are also a couple of scripts that find sets of
files in a way slightly more sophisticated than simple wildcards.

> include path to automatically include the relevant dir (or you specify
> all includes relative to the base dir of the project), and it uses the

Correct. We specify all includes relative to the base dir (well,
to a small set of "base" dirs actually).

> lib-root dir name as the soname of the output library. Overall, pretty

That is the default. Of course it can be overridden easily but
so far nobody has seen the need to. And it would need a decent
justification to pass review.

> good. I'd still expect that the users will have to modify the makefile
> for any real usage though, such as for link dependencies, preprocessor
> defines, and possibly include path dirs.

Link dependencies are automatically computed by a script that
examines the app's includes and a database (also automatically
generated) that relates and partially orders inter-library
dependencies.

We specify any app specific defines and if necessary link
dependency overrides in the individual application makefile.

So far we have not had the need to compile internal libraries
with idiosyncratic defines. However, if we did I suspect we
would put those in the library specific make files (similar to
how we handle 3rd party configurations) but I haven't thought
that through yet.

> Then there's still also situations where you might need to tweak the
> compiler options for a particular platform. In my company, there's a
> lot of places where this is the case, to work around compiler bugs, to
> get acceptable compile times, etc. In this one particularly complex
> template file, on one platform we had to turn down the optimization
> level of this one for regular builds, otherwise the compile time would
> be many hours for just this one file.

Yeah, for a while we compiled on 3 platforms (now just 2) so
there were various platform defines. Those are centralized in
a single included makefile and a few header files.

> With your system, you could also handling new file creation hiding
> other files on search paths, file deletion with file system wildcards
> triggering appropriate cleans, and changes to compile options also
> triggering cleans such as changing command line preprocessor defines.
> I'd be curious if you handle those. In which case, your system is very
> good.

Yes, all are handled with a caveat: command line defines are
captured only if one runs make with the forwarding script:

runmake -f $LIBROOT/makefile ...

which also does some other stuff. Common developers are taught
to use (and do use) this command. However, there is sometimes
a use for the "raw" command I gave earlier ie gmake -f ...

A good example is the NDEBUG argument. Specifying NDEBUG would
be captured by runmake and could (depending on which production
library cache you are linking against) trigger a total world
rebuild. Often when you are debugging you might want to define
NDEBUG=0 only for your module while still linking against the
NDEBUG=1 library cache. To do that one can use the raw gmake
command.

This has never been a problem and if it became one we would
just alias gmake -> runmake in everyone's environment and
restrict direct access to gmake.

> As I just mentioned, I have no real problems with such a system
> besides
> 1- its very poor speed compared to a solution written not so much in
> an interpreted language,

Not in our case. The time between invoking gmake and the
first g++ invocation (by which point all the hard work has
been done to synthesize the global make file) is just several
seconds. And that is a project having nearly 80,000 files,
scores of libraries, and hundreds of apps. The bulk of that
time is just file system access, not the "interpretation" you
are worried about.

> 2- and it can't work for other compilation models such as Java, which
> is a sticking point for my use case.

Well, you say that, but I don't know if that is true. There is
Java code compiled/jarred/whatever in this system; it's just that
I wasn't involved in specifying the rules for it. That's why I
cannot speak with any certainty, but so far those guys haven't
complained to me about the core system. What specifically is
your Java-specific difficulty?

> So, if it's speed is fast enough and/or on a small enough project, and
> you don't need to support other compilation models, I really like it
> (conditionally on a couple other minor aspects).

Our project seems to be larger than yours by far, so I guess
the question is whether several seconds is too long for you.

> Still, I think we're really abusing GNU Make (or whatever make flavor)
> when we do this. This is definitely not what was intended when make
> was originally written. Make was written with the intent of users
> specifying rules to build stuff, possibly with variables like $(CC)
> etc. It wasn't really written with a pure macro aka task system in
> mind. Instead, this is much closer in spirit to Maven or Ant. It's
> just that the (GNU) Make engine is pretty general purpose, portable,
> and reusable, so we reused it to solve the problem in a very
> unorthodox approach from the perspective of the original authors of

Oh boy. That is the kind of ranting where we part ways. I
don't personally know what the original authors of make think/
feel about how make should be used. And I don't care. make
is a /tool/ (contrary to your repeated assertions that it
is a "framework" or a "system", which it is not). As a tool
it performs a job and does so well enough that it can form
the basis of a nice build /system/.

> Make. Also, I would be a little happier if there was a publicly
> available \standard\ implementation of these makefile scripts instead
> of each company and/or team rolling their own, with their own

Yes, that would be very nice (if there isn't already such
a thing). But that's up to the community, not make.

> conventions. My complaint I guess is simply that GNU Make is a build
> system framework, but I would really like a standardized build system
> usable out of the box instead of having to roll my own.

I don't think we agree on what the words "framework" and "system"
mean. Maybe that is partly why you are so angry at make because
you are thinking of it as a framework or a system and thus are
expecting too much from it. Try to get this: make is a TOOL.

> > Now I have a question. Do you know how to implement a build system
> > that provides the above simple support for adding new libraries?
>
> Yes.
>
> Hell, I wrote a Make clone myself, to provide built-ins for echo to
> file, cat, etc., to see if I could get acceptable performance. This

The problem I'm having is that I find your claims incongruous
with seemingly naive questions like:

> > > So, developers never specified new rules? What if the developer added
> > > new source code which was to be in a new library? I'm confused.

I mean come on, seriously. Yeah yeah, I know you just said
that /you/ were talking/thinking only about using make in a
certain gimped way. Well, that wasn't clear at all. So such
questions as the above came across either as ignorant or
"playing stupid". Especially when paired with all the many
many paragraphs of ranting/complaining that came before.

KHD

Joshua Maurice

Jul 27, 2010, 4:29:34 PM
On Jul 27, 11:32 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Jul 23, 3:45 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
[snip the discussion of Keith's good build system]

> > As I just mentioned, I have no real problems with such a system
> > besides
> > 1- its very poor speed compared to a solution written not so much in
> > an interpreted language,
>
> Not in our case. The time between invoking the gmake and the
> first g++ invocation (by which point all the hard work has
> been done to synthesize the global make file) is just several
> seconds. And that is a project having nearly 80,000 files,
> scores of libraries, and hundreds of apps. The bulk of that
> time is just file system access not the "interpretation" you
> are worried about.

Odd. GNU Make 3.81? What kind of box are you running this on?

Here's mine.

psflor.informatica.com ~$ uname -a
Linux psflor.informatica.com 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38
EST 2008 x86_64 x86_64 x86_64 GNU/Linux
psflor.informatica.com ~$ free
             total       used       free     shared    buffers     cached
Mem:      16366100   14678028    1688072          0     471320   11997828
-/+ buffers/cache:    2208880   14157220
Swap:     33551744    1830912   31720832
psflor.informatica.com ~$ grep "model name" /proc/cpuinfo
model name : Quad-Core AMD Opteron(tm) Processor 2387
model name : Quad-Core AMD Opteron(tm) Processor 2387
model name : Quad-Core AMD Opteron(tm) Processor 2387
model name : Quad-Core AMD Opteron(tm) Processor 2387
model name : Quad-Core AMD Opteron(tm) Processor 2387
model name : Quad-Core AMD Opteron(tm) Processor 2387
model name : Quad-Core AMD Opteron(tm) Processor 2387
model name : Quad-Core AMD Opteron(tm) Processor 2387

For a very simple makefile containing (?) 20,000 rules in a binary-
tree dependency graph with branch factor 75, it took the regular GNU
Make install (which defaults to -O3 and such IIRC) several minutes to
run. My make clone took seconds.

For my own sanity, let me go rerun this test. I'll get back today
hopefully. I'll put up the source code of the test generator at least
to see if you think it's a somewhat fair test.

> > 2- and it can't work for other compilation models such as Java, which
> > is a sticking point for my use case.
>
> Well, you say that but I don't know if that is true. There is
> Java code compiled/jarred whatever in this system it's just that
> I wasn't involved in specifying the rules for it. That's why I
> cannot speak with any certainty but so far those guys haven't
> complained to me about the core system. What specifically is
> your Java specific difficulty.
>
> > So, if it's speed is fast enough and/or on a small enough project, and
> > you don't need to support other compilation models, I really like it
> > (conditionally on a couple other minor aspects).
>
> Our project seems to be larger than yours by far so I guess
> the question is whether several seconds it too long for you.

Very odd.

I'd imagine that it can't be incremental for the Java, though your
system has surprised me thus far. I would be very impressed if it did
do incremental Java compilation.

The problem is that Java's compilation model is very different from
C++. In C++, each unit can be compiled in isolation from the other
units, and in the end a "minimal" linking step is done. Java's
compilation model is much closer to compiling a C++ source file to a
DLL, a shared library, which is not allowed to have unresolved external
symbols. Each compilation cannot be done independently. When I change a
cpp source file, I only need to recompile that object file and do some
relinking. When I change a Java file, a naive solution is to recompile
that Java file, all directly dependent Java files, and all of their
dependencies, and so on. This is how Make operates: any change to a
node in the graph forces a full recompile of all nodes downstream.
There is no easy termination condition to the rebuild cascade across
the dependency graph.

Simple example:
A.java uses type name B, but does not use type name C.
B.java uses type name C.
C.java is there too.

When a change is made to the internal implementation of a function in
C.java, there is no need to recompile B or A. When there is a change
to the implementation of C.java, you do need to recompile B, but there
is no need to recompile A. Does your system do this automatically?
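
(The cascade is easiest to see if the example is written out as naive
per-file make rules; the file names are just the invented A/B/C from above,
to illustrate the point:)

# A.java uses B; B.java uses C; A does not use C.
A.class: A.java B.class
	javac -cp . A.java
B.class: B.java C.class
	javac -cp . B.java
C.class: C.java
	javac -cp . C.java

# Under make's model, touching C.java rebuilds C.class, which makes B.class
# out of date, which in turn makes A.class out of date, even though (per the
# example above) most of those recompiles are unnecessary.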

Then, does your system handle ghost dependencies?
http://www.jot.fm/issues/issue_2004_12/article4.pdf
I would congratulate you for having one of the few correct Java
incremental build systems in existence if this is the case.

> > Still, I think we're really abusing GNU Make (or whatever make flavor)
> > when we do this. This is definitely not what was intended when make
> > was originally written. Make was written with the intent of users
> > specifying rules to build stuff, possibly with variables like $(CC)
> > etc. It wasn't really written with a pure macro aka task system in
> > mind. Instead, this is much closer in spirit to Maven or Ant. It's
> > just that the (GNU) Make engine is pretty general purpose, portable,
> > and reusable, so we reused it to solve the problem in a very
> > unorthodox approach from the perspective of the original authors of
>
> Oh boy. That is the kind of ranting where we part ways. I
> don't personally know what the original authors of think/
> feel about how make should be used. And I don't care. make
> is a /tool/ (contrary to your repeated assertions that it
> is a "framework" or a "system" which it is not). As a tool
> it performs a job and does so well enough that it can form
> the basis of a nice build /system/.

It's a good tool for what it claims to do. It's fast, portable,
extensible, etc. However, I still claim that there could be a better
tool. (I'm in the process of writing something which I hope will be
that.) Make's overall design of a rebuild cascading down the
dependency tree without termination is not a good one. When I
implemented a correct incremental build system on top of it, I found
myself frequently fighting against it instead. I wasn't able to use
what I thought were idiomatic and standard styles, like (modulo typos,
simplified for explanation)
*.o : *.cpp ; $(CC) $(OPT_FLAGS) -c $^ -o $@
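(For reference, the idiomatic GNU Make spelling of that kind of rule is a
pattern rule; a sketch, with $(CXX) substituted since the sources are C++:)
%.o : %.cpp
	$(CXX) $(OPT_FLAGS) -c $< -o $@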
Sorry that you consider it ranting. I was merely trying to raise
consciousness that it actually is not ideal, and that its foundation
is flawed. You still manage to make quite good use out of it though.

> > Make. Also, I would be a little happier if there was a publicly
> > available \standard\ implementation of these makefile scripts instead
> > of each company and/or team rolling their own, with their own
>
> Yes, that would be very nice (if there isn't already such
> a thing). But, that's up to the community not make.
>
> > conventions. My complaint I guess is simply that GNU Make is a build
> > system framework, but I would really like a standardized build system
> > usable out of the box instead of having to roll my own.
>
> I don't think we agree on what the words "framework" and "system"
> mean. Maybe that is partly why you are so angry at make because
> you are thinking of it as a framework or a system and thus are
> expecting too much from it. Try to get this: make is a TOOL.

Well, all I'm trying to say is that make is not usable out of the box.
Instead, you need heavy customization to get it anywhere near usable.
I think this is a bad state of affairs. I think most developers would
much prefer some sort of standard build tool which is usable out of
the box. It's not to say make is a bad tool. I just wish instead that
developers didn't have to write their own build system for each
company.

> > > Now I have a question. Do you know how to implement a build system
> > > that provides the above simple support for adding new libraries?
>
> > Yes.
>
> > Hell, I wrote a Make clone myself, to provide built-ins for echo to
> > file, cat, etc., to see if I could get acceptable performance. This
>
> The problem I'm having is that I find your claims incongruous
> with seemingly naive questions like:
>
> > > > So, developers never specified new rules? What if the developer added
> > > > new source code which was to be in a new library? I'm confused.
>
> I mean come on, seriously. Yeah yeah, I know you just said
> that /you/ were talking/thinking only about using make in a
> certain gimped way. Well, that wasn't clear at all. So such
> questions as the above came across either as ignorant or
> "playing stupid". Especially when paired with all the many
> many paragraphs of ranting/complaining that came before.

I'm sorry. I haven't really seen this in practice. I don't have that
much corporate experience, and in what little I have, no one in my
company has done anything like that. I also haven't found any public
reference implementations using macros the way you do. I assumed such
usage was rather rare. I apologize for grouping you with them.

Joshua Maurice

Jul 27, 2010, 4:43:44 PM
On Jul 27, 1:29 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> Simple example:
> A.java uses type name B, but does not use type name C.
> B.java uses type name C.
> C.java is there too.
>
> When a change is made to the internal implementation of a function in
> C.java, there is no need to recompile B or A. When there is a change
> to the implementation of C.java, you do need to recompile B, but there
> is no need to recompile A. Does your system do this automatically?

Ack, hit submit accidentally. Small typo. That should read:

> When a change is made to the internal implementation of a function in
> C.java, there is no need to recompile B or A. When there is a change

> to the **interface** of C.java, you do need to recompile B, but there


> is no need to recompile A

> **nor anything downstream which does not directly depend on C**.

Keith H Duggar

Jul 28, 2010, 12:02:42 PM
On Jul 27, 4:29 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 27, 11:32 am, Keith H Duggar <dug...@alum.mit.edu> wrote:> On Jul 23, 3:45 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>
> [snip the discussion of Keith's good build system]
>
> > > As I just mentioned, I have no real problems with such a system
> > > besides
> > > 1- its very poor speed compared to a solution written not so much in
> > > an interpreted language,
>
> > Not in our case. The time between invoking the gmake and the
> > first g++ invocation (by which point all the hard work has
> > been done to synthesize the global make file) is just several
> > seconds. And that is a project having nearly 80,000 files,
> > scores of libraries, and hundreds of apps. The bulk of that
> > time is just file system access not the "interpretation" you
> > are worried about.
>
> Odd. GNU Make 3.81?

Yes.

> What kind of box are you running this on?
>
> Here's mine.
>
> psflor.informatica.com ~$ uname -a
> Linux psflor.informatica.com 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38
> EST 2008 x86_64 x86_64 x86_64 GNU/Linux
> psflor.informatica.com ~$ free
>              total       used       free     shared    buffers
> cached
> Mem:      16366100   14678028    1688072          0     471320
> 11997828
> -/+ buffers/cache:    2208880   14157220
> Swap:     33551744    1830912   31720832
> psflor.informatica.com ~$ grep "model name" /proc/cpuinfo
> model name      : Quad-Core AMD Opteron(tm) Processor 2387
> model name      : Quad-Core AMD Opteron(tm) Processor 2387
> model name      : Quad-Core AMD Opteron(tm) Processor 2387
> model name      : Quad-Core AMD Opteron(tm) Processor 2387
> model name      : Quad-Core AMD Opteron(tm) Processor 2387
> model name      : Quad-Core AMD Opteron(tm) Processor 2387
> model name      : Quad-Core AMD Opteron(tm) Processor 2387
> model name      : Quad-Core AMD Opteron(tm) Processor 2387

$ uname -a
Linux [redacted] 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST
2010 x86_64 x86_64 x86_64 GNU/Linux


$ free
             total       used       free     shared    buffers     cached
Mem:      65875328    4239496   61635832          0     441532    3411480
-/+ buffers/cache:     386484   65488844
Swap:     33551744          0   33551744


$ grep "model name" /proc/cpuinfo

model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU X7460 @ 2.66GHz

> For a very simple makefile containing (?) 20,000 rules in a binary-
> tree dependency graph with branch factor 75, it took the regular GNU
> Make install (which defaults to -O3 and such IIRC) several minutes to
> run. My make clone took seconds.

Please explain, several minutes until the first g++ (or another
compiler call)? Or several minutes to complete an entire --dry-run
incremental or full build?

I'm thinking now that it is somewhat difficult to actually measure the
performance of the build system itself. I've always used time to
first compiler execution because, even though I would prefer to do
something like a --dry-run, a --dry-run will obviously not trigger
all the dependencies that would be triggered in a real run, since
the targets of multiple-output rules are not understood by make
and will not actually be modified.

Furthermore, I wasn't making any effort to sporadically touch
multiple files, which could affect the analysis time. Though I have
not thought that through, and given your experience in writing make
clones, you would be in a much better position to comment on that
possibility.

However, there is still a chance that this could be as simple as
a poor choice of = when := would have done.

> For my own sanity, let me go rerun this test. I'll get back today
> hopefully. I'll put up the source code of the test generator at least
> to see if you think it's a somewhat fair test.

Yeah, the tests are probably not equivalent. You might be running
a much more stressful test than I. I just touched a single library
.cpp file and ran the build. Is that a gimped performance test?

> > > 2- and it can't work for other compilation models such as Java, which
> > > is a sticking point for my use case.
>
> > Well, you say that but I don't know if that is true. There is
> > Java code compiled/jarred whatever in this system it's just that
> > I wasn't involved in specifying the rules for it. That's why I
> > cannot speak with any certainty but so far those guys haven't
> > complained to me about the core system. What specifically is
> > your Java specific difficulty.
>
> > > So, if it's speed is fast enough and/or on a small enough project, and
> > > you don't need to support other compilation models, I really like it
> > > (conditionally on a couple other minor aspects).
>
> > Our project seems to be larger than yours by far so I guess
> > the question is whether several seconds it too long for you.
>
> Very odd.
>
> I'd imagine that it can't be incremental for the Java, though your
> system has surprised me thus far. I would be very impressed if it did
> do incremental java compilation.

It may not be. I truly don't know. I know they often have trouble
with Java deployments but maybe that is just "ordinary" jar-hell.

> The problem is that Java's compilation model is very different than C+
> +. In C++, each unit can be done in isolation to the other units, and
> in the end a "minimal" linking step is done. Java's compilation model
> is much closer to compiling a C++ source file to a dll, a shared
> library, which is not allowed to have unresolved external symbols.
> Each compilation cannot be done independently. When I change a cpp
> source file, I only need to recompile that object file and do some
> relinking. When I change a Java file, a naive solution is to recompile
> that Java file, all direct dependent Java files, and all of their
> dependencies, and so on. This is how Make operates: any change to a
> node in the graph forces a full recompile of all nodes downstream.
> There is no easy termination condition to the rebuild cascade across
> the dependency graph.
>
> Simple example:
> A.java uses type name B, but does not use type name C.
> B.java uses type name C.
> C.java is there too.
>
> When a change is made to the internal implementation of a function in
> C.java, there is no need to recompile B or A. When there is a change

> to the interface of C.java, you do need to recompile B, but there


> is no need to recompile A.

So in Java B.java depends on C.class instead of C.java? As if a
B.cpp depended on C.obj instead of C.hpp? Is this partly because
Java does not separate interface and implementation into separate
files (i.e. .hpp, .cpp), or no?

> Does your system do this automatically?

I don't know. As I said I did not write the Java specific rules.
However, if it is the case that B.java depends on C.class instead
of C.java then it does seem like a pretty nasty situation.

> > > Still, I think we're really abusing GNU Make (or whatever make flavor)
> > > when we do this. This is definitely not what was intended when make
> > > was originally written. Make was written with the intent of users
> > > specifying rules to build stuff, possibly with variables like $(CC)
> > > etc. It wasn't really written with a pure macro aka task system in
> > > mind. Instead, this is much closer in spirit to Maven or Ant. It's
> > > just that the (GNU) Make engine is pretty general purpose, portable,
> > > and reusable, so we reused it to solve the problem in a very
> > > unorthodox approach from the perspective of the original authors of
>
> > Oh boy. That is the kind of ranting where we part ways. I
> > don't personally know what the original authors of think/
> > feel about how make should be used. And I don't care. make
> > is a /tool/ (contrary to your repeated assertions that it
> > is a "framework" or a "system" which it is not). As a tool
> > it performs a job and does so well enough that it can form
> > the basis of a nice build /system/.
>
> It's a good tool for what it claims to do. It's fast, portable,
> extensible, etc. However, I still claim that there could be a better

> ...


> Well, all I'm trying to say is that make is not usable out of the box.
> Instead, you need heavy customization to get it anywhere near usable.
> I think this is a bad state of affairs. I think most developers would
> much prefer some sort of standard build tool which is usable out of
> the box. It's not to say make is a bad tool. I just wish instead that
> developers didn't have to write their own build system for each
> company.

Ok and agreed. However, let's please be honest here; that latest
statement is significantly different from your previous statements
that make is fundamentally fubar, horrific, etc. Agreed? If so I
think we have come far in understanding each other. At least I now
understand what you are saying and I agree that:

1) make is a limited tool and does not provide out-of-the-box
an incrementally correct build /system/. It must be augmented
to provide that.

2) make does have subtle annoyances and may be "slow".

3) it would be very, very nice if there was an open source build
/system/ of script augmentations etc for make.

> tool. (I'm in the process of writing something which I hope will be
> that.) Make's overall design of a rebuild cascading down the
> dependency tree without termination is not a good one. When I

Ok well that is a point of disagreement still. I don't understand
why "cascading down the dependency tree without termination" is a
fundamentally bad design. What mathematical model do you propose
to replace a partially ordered directed acyclic graph analysis
such as the make model?

KHD

Joshua Maurice

Jul 28, 2010, 7:46:58 PM
On Jul 28, 9:02 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Jul 27, 4:29 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > For a very simple makefile containing (?) 20,000 rules in a binary-
> > tree dependency graph with branch factor 75, it took the regular GNU
> > Make install (which defaults to -O3 and such IIRC) several minutes to
> > run. My make clone took seconds.
>
> Please explain, several minutes until the first g++ (or another
> compiler call)? Or several minutes to complete an entire --dry-run
> incremental or full build?
>
> I'm thinking now it is somewhat difficult to actually measure the
> performance of the build system itself. I've always used time to
> first compiler execution because even though I would prefer to do
> something like a --dry-run a --dry-run will obviously not trigger
> all the dependencies that would be triggered in a real run since
> the targets of multiple output rules are not understood by make
> and will not actually be modified.
>
> Furthermore, I wasn't making any effort to sporadically touch
> multiple files which could effect the analysis time. Though I have
> not thought that through and given your experience in writing make
> clones you would be in a much better position to comment on that
> possibility?
>
> However, there is still a chance that this could be as simple as
> a poor choice of = when := would have done.

Tests are running. Finally got some free time. Will get back shortly.

> > The problem is that Java's compilation model is very different than C+
> > +. In C++, each unit can be done in isolation to the other units, and
> > in the end a "minimal" linking step is done. Java's compilation model
> > is much closer to compiling a C++ source file to a dll, a shared
> > library, which is not allowed to have unresolved external symbols.
> > Each compilation cannot be done independently. When I change a cpp
> > source file, I only need to recompile that object file and do some
> > relinking. When I change a Java file, a naive solution is to recompile
> > that Java file, all direct dependent Java files, and all of their
> > dependencies, and so on. This is how Make operates: any change to a
> > node in the graph forces a full recompile of all nodes downstream.
> > There is no easy termination condition to the rebuild cascade across
> > the dependency graph.
>
> > Simple example:
> > A.java uses type name B, but does not use type name C.
> > B.java uses type name C.
> > C.java is there too.
>
> > When a change is made to the internal implementation of a function in
> > C.java, there is no need to recompile B or A. When there is a change
> > to the interface of C.java, you do need to recompile B, but there
> > is no need to recompile A.
>
> So in Java B.java depends on C.class instead of C.java? As if a
> B.cpp depended on C.obj instead of C.hpp?

Well, it's actually much more perverse than that. At least in my
company, for organization reasons, each team has its own Java source
directory. In addition, for packaging reasons, these are further split
into more source directories. Each of these source directories is
built with a single javac invocation. For a clean build of a Java
source dir, a Java file depends on the other Java source files in that
source directory, and it depends on the class files external to that
source directory. For an "incremental build", javac will check files
somewhat, and selectively use the "up to date" (not determined
accurately) class file or recompile the "out of date" (not determined
accurately) Java source file in this Java source directory. It's
really different.

The short version is: for the dependency chain where A depends on B, B
depends on C, and A does not directly use C, a change to C's source
should not trigger a rebuild of A. If you cascade the rebuild without
termination, then a single change to a Java file near the beginning of
the dependency graph could trigger a rebuild of the whole graph under
GNU Make's model, but it should not, because almost all of the
recompiles would be unnecessary.

> In this partly because
> Java does not separate interface and implementation into separate
> files (ie .hpp, .cpp) or no?

Yeah. This is basically because interface and implementation are in
the same file. Luckily, at least, there are no automatic transitive
compile dependencies like you get when headers include headers. That
allows you to terminate the cascading rebuild at the right spot and
still get a good, small/fast, correct incremental build.

> > It's
[Make]


> > a good tool for what it claims to do. It's fast, portable,
> > extensible, etc. However, I still claim that there could be a better
> > ...
> > Well, all I'm trying to say is that make is not usable out of the box.
> > Instead, you need heavy customization to get it anywhere near usable.
> > I think this is a bad state of affairs. I think most developers would
> > much prefer some sort of standard build tool which is usable out of
> > the box. It's not to say make is a bad tool. I just wish instead that
> > developers didn't have to write their own build system for each
> > company.
>
> Ok and agreed. However, let's please be honest here; that latest
> statement is significantly different from your previous statements
> that make is fundamentally fubar, horrific, etc. Agreed? If so I
> think we have come far in understanding each other. At least I now
> understand what you are saying and I agree that:

Perhaps. I did come off too strongly. I just honestly did not expect
people to actually be doing that judging from the experience of the
build team specialists in my company, the architects, etc. I was
surprised. GNU Make can be made to work quite well for a simple C++
build. However, it really can't for other kinds of builds, such as
Java, and "1 to many" or "many to many" C++ code generation. See below
for an example.

>    1) make is a limited tool and does not provide out-of-the-box
>       an incrementally correct build /system/. It must be augmented
>       to provide that.
>
>    2) make does have subtle annoyances and may be "slow".
>
>    3) it would be very, very nice if there was an open source build
>       /system/ of script augmentations etc for make.
>
> > tool. (I'm in the process of writing something which I hope will be
> > that.) Make's overall design of a rebuild cascading down the
> > dependency tree without termination is not a good one. When I
>
> Ok well that is a point of disagreement still. I don't understand
> why "cascading down the dependency tree without termination" is a
> fundamentally bad design. What mathematical model do you propose
> to replace a partially ordered directed acyclic graph analysis
> such as the make model?

Doesn't work in the general case. For example, Java.

For example, there's some makefile code which you want to run on every
invocation. Such code includes checking whether there's a stale object
which no longer has its corresponding source file. If such a stale
object file is found, it should be deleted, along with its
corresponding lib. You can run this code in phase 1 (a parse-time
sketch follows the two limitations below), but this has several
limitations:

1- Not parallelizable.

2- The source code must be known in phase 1. My company, for example,
generates C++ code from a model to facilitate serialization between
Java and C++. Thus, there is some C++ code which does not exist in
phase 1. This plays havoc with the entire Make scheme. You need to
check for stale object files *after* code generation, but *before*
deciding if the shared library is up to date, and this is quite
difficult and masochistic to do in GNU Make.
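
(Here is the parse-time sketch referred to above, under the assumption of a
single source directory and a single library; the directory and library names
are invented for illustration. It runs while the makefile is parsed, hence
single-threaded, as noted in limitation 1.)

# Expected objects, derived from the sources that currently exist:
OBJS  := $(patsubst src/%.cpp,obj/%.o,$(wildcard src/*.cpp))
# Objects on disk that no longer have a corresponding source file:
STALE := $(filter-out $(OBJS),$(wildcard obj/*.o))
ifneq ($(STALE),)
  # Delete the orphans and remove the library so it gets relinked.
  $(shell rm -f $(STALE) lib/libexample.so)
endif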

I thought I could do it with order only dependencies. I tried a phony
dependency, which depends on the C++ code generation, which is an
order only dependency of the C++ compilation. Its command checks to
see if there's a stale object file, and if there is, kill the shared
lib and the stale object file(s). However, GNU Make does fun stuff
with caching file timestamps during phase 2, so any modification to a
file in phase 2 may not be picked up. In fact, I found that such a
scheme broke more often than it worked - GNU Make would not detect
that the shared lib was just deleted by the phony order only
dependency, and it would conclude from the cached file timestamp that
the shared lib was up to date. (I wouldn't rely on undocumented
behavior anyway in a production system. There's a couple threads about
this on some of the GNU Make mailing lists. I recall one where a GNU
Make implementer basically said this is intended behavior.) I think
this is the thing which finally drove me to write my make clone, to
give the guarantee that make will only check a file's timestamp in
phase two after all out of date dependencies have had their commands
run.

PS: I realized just now that you could put the logic to check for
stale object files in the make command of C++ code generation.
However, then the C++ code generation command needs to know about the
C++ compilation object dir. I don't think such an approach would scale
well either; it tightly couples what should be two independent
"macros".

PPS: You could also abandon entirely letting GNU Make track
out-of-date-ness. That's the other way I see to solve this problem in
GNU Make and get parallelization.

Joshua Maurice

Jul 28, 2010, 9:28:11 PM
On Jul 28, 9:02 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Jul 27, 4:29 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > For a very simple makefile containing (?) 20,000 rules in a binary-
> > tree dependency graph with branch factor 75, it took the regular GNU
> > Make install (which defaults to -O3 and such IIRC) several minutes to
> > run. My make clone took seconds.
>
> Please explain, several minutes until the first g++ (or another
> compiler call)? Or several minutes to complete an entire --dry-run
> incremental or full build?
>
> I'm thinking now it is somewhat difficult to actually measure the
> performance of the build system itself. I've always used time to
> first compiler execution because even though I would prefer to do
> something like a --dry-run a --dry-run will obviously not trigger
> all the dependencies that would be triggered in a real run since
> the targets of multiple output rules are not understood by make
> and will not actually be modified.

I've included several metrics, most of the relevant ones. I did not
include any good testing of an actual incremental build, as that would
be more involved than I have time for right now, but I believe that it
should lie somewhere between the times for "all build from all build"
and "all build from clean".

> Furthermore, I wasn't making any effort to sporadically touch
> multiple files which could effect the analysis time. Though I have
> not thought that through and given your experience in writing make
> clones you would be in a much better position to comment on that
> possibility?
>
> However, there is still a chance that this could be as simple as
> a poor choice of = when := would have done.

As I promised earlier, here is some speed testing between my make
clone and GNU Make.

The source code for the test generator, and the script used to run the
tests, is at the end of this post.

The short version of my tests: the dependency graph is a balanced tree
with roughly 25,000 nodes, a single root, and 75 children per node
except for the leaves. I figured that this is somewhat indicative of
the dependency graph of C++ source code. (At least, without header
dependencies. That could get a lot more complex to model. I figure that
this is still a relatively indicative test of real world usage, though
by no means exhaustive or foolproof.) I hope I have no glaring mistakes
or oversights. I'm happy to rerun or redesign the test if you find any
problems.

The command to "build" each node is simply touch. Rules for cleaning
are also present. It's only a skeleton of a correct incremental build
system, but hopefully it is indicative of a fully fleshed out one. I
think it is. This testing also fits with my previous experience
implementing such a fully fleshed out build system and comparing the
times of GNU Make and my make clone. Tests were run on the same machine
listed in my earlier post. I ran the tests twice, and the numbers were
reasonably close.

Results:

my make clone using new built-in no-process-spawn touch and rm
** notarget -- ~1 second
** all from clean -- 55 seconds overall -- ~1 second in phase 1
** all from all -- ~3 seconds
** all from single leaf out of date -- ~2 seconds
** clean from all -- 24 seconds overall -- ~1 second in phase 1
** clean from clean -- 13 seconds overall -- ~1 second in phase 1

my make clone using $(process touch ...) and $(process rm ...)
** notarget -- ~1 second
** all from clean -- 225 seconds overall -- ~1 second in phase 1
** all from all -- ~2 seconds
** all from single leaf out of date -- ~1 second in phase 1
** clean from all -- 136 seconds overall -- ~1 second in phase 1
** clean from clean -- 125 seconds overall -- ~1 second in phase 1

my make clone using $(shell touch ...) and $(shell rm ...)
** notarget -- ~1 second
** all from clean -- 150 seconds overall -- ~1 second in phase 1
** all from all -- ~3 seconds
** all from single leaf out of date -- ~2 seconds
** clean from all -- 134 seconds overall -- ~1 second in phase 1
** clean from clean -- 137 seconds overall -- ~1 second in phase 1

GNU Make 3.81 using $(shell touch ...) and $(shell rm ...)
** notarget -- 32 seconds overall -- 19 seconds spent in phase 1
** all from clean -- 95 seconds overall -- 20 seconds spent in phase 1
** all from all -- 44 seconds overall -- 20 seconds spent in phase 1
** all from single leaf out of date -- 44 seconds overall -- 20 seconds spent in phase 1
** clean from all -- 63 seconds overall -- 20 seconds spent in phase 1
** clean from clean -- 46 seconds overall -- 20 seconds spent in phase 1

Let me mention a couple of things about the tests.

The first set of tests for my make clone does not spawn processes. It
uses the C APIs of POSIX and WIN32 directly. The next set uses my new
built-in $(process ), which spawns the process directly without a shell
in between. The last set of tests of my make clone uses $(shell ),
which uses an intermediary shell to spawn the process, just like Make
is advertised to do. (See gen_tests.cpp in the appendix for a more
complete description.)

One thing to note: I remember that GNU Make is optimized so that if it
sees a single "simple" command, then it bypasses the shell entirely
and directly spawns the process. Similarly, my $(process ) primitive
does not spawn a shell and spawns the process directly. However, it
appears that my underlying process library, used to implement
$(shell ) and $(process ), written by me, is rather inefficient. I need
to look at it. If I were doing this again, I would just use Boost.
(However, I am not pursuing a Make clone any more, for the
aforementioned reasons.)
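
For readers who haven't looked at this before, the difference between
"spawn through a shell" and "spawn directly" boils down to something
like the following POSIX-only sketch. This is just the idea, not my
actual process library:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdlib>
#include <string>
#include <vector>

// $(shell ...) style: hand the whole command line to /bin/sh and let it
// do word splitting, globbing, redirection, and so on.
int runViaShell(std::string const& commandLine)
{ return std::system(commandLine.c_str());
}

// $(process ...) style: fork and exec the program directly, no shell in
// between, so there is one fewer process per command and no shell parsing.
int runDirectly(std::vector<std::string> const& args)
{ std::vector<char*> argv;
  for (size_t i = 0; i < args.size(); ++i)
    argv.push_back(const_cast<char*>(args[i].c_str()));
  argv.push_back(0);

  pid_t pid = fork();
  if (pid == 0)
  { execvp(argv[0], &argv[0]);
    _exit(127);                         // exec failed
  }
  int status = 0;
  waitpid(pid, &status, 0);
  return status;
}

The first form pays for one /bin/sh per command and the second does not,
which adds up over tens of thousands of touch and rm invocations.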

I just now noticed this inefficiency in my process spawning because I
wasn't really planning on using $(shell ) and $(process ) all that
often. I was planning on using built-ins to avoid a lot of process
spawning. These built-ins are one of the major motivating factors
behind me writing my make clone. The performance of using an external
shell for "read from file" and "write to file" was especially bad on
Windows. (That, and I really did not feel like learning the GNU Make C
code in depth. It has some particularly annoying hacks and styles
devoted to avoiding string copies.) I also did it as a learning
exercise. I never expected that I would outperform GNU Make on Linux by
so much.

One can also see that my parsing code and string interpreter code are
in the neighborhood of 20 times faster than GNU Make's. Why? I do not
know. Figuring that out would involve profiling and looking at the GNU
Make source code in depth, and one of my explicit goals in writing
this clone has been to avoid looking at GNU Make's source in depth.
From what I recall of an actual production-ready makefile
implementation for incremental C++, the ratio was about the same.

Similarly, we can compare the "all from all" tests, and we see that my
phase 2 code (including dependency graph walking, where we stat files,
decide if a node's commands need to be run, etc.) is also in the
neighborhood of 10 to 20 times faster. Why? Again, I do not know.

The rest of this post is just an explanation of what my make clone is
and why I think this is a fair comparison (and appendix).

My make clone is in many ways a near drop-in replacement for GNU Make.
It's written in portable C++ and uses only the C++ standard library,
POSIX, and WIN32 headers. It still interprets the makefiles; it doesn't
compile anything at runtime or do anything else fancy. It supports
recursive and simple variables. It supports a large portion of the GNU
Make built-in functions, including:
and, abspath, addprefix, addsuffix, basename, call, delete-files, dir,
error, eval, filter, filter-out, findstring, firstword, foreach, if,
index, info, lastword, notdir, or, patsubst, shell, sort, strip,
subst, suffix, value, warning, wildcard, word, wordlist, words.

I also support the additional new built-in functions:
append-to-file [especially useful to get around command line length
limitations with $(shell echo )], cat, cwd, def-recursive-var, eq, eq-
ci [equals case insensitive ASCII], eval-value [equivalent to eval and
value, but allows use of line numbers in the stack trace from the
original variable definition given to $(eval-value ...)], foreach-
line, index-ci, lowercase [ASCII], makedirs, namespace, neq, neq-ci,
not, print-to-file, process [allows spawning a process directly
without a shell], recursive-wildcard, remove-dirs, sleep, sort, sort-
ci, touch, uppercase [ASCII].

However, I do honestly admit that I do not have all of the
functionality of GNU Make, and the missing pieces may skew the results
in my make's favor. Things which immediately jump out at me are:

1- I don't support vpath or VPATH. (I don't think this is a
significant contributor to the slowness when all the files of the test
are in a single directory, though.)

2- I do not have $(origin ) nor $(flavor ); supporting them may be part
of what makes GNU Make slower than mine. I don't believe this is a
significant contributor to the difference in observed speed, though.

3- My make does not have any implicit rules, pattern rules, or any
other kind of rule which isn't the simple explicit kind. (However, I
don't think that these features impose a significant penalty when they
are not used, as in this test.) (My clone, however, does support rules
with multiple targets made by a single command invocation, which GNU
Make does not.)

#### ####
#### test_driver.sh contents

#! /bin/sh

date | tee foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile1.mk -j=8 -d=phase notarget | tee -a foo.txt ; infamake -f infamakefile1.mk -j=8 -d=phase notarget 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile1.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile1.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile1.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile1.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo '---- echo "" > aasuaa' | tee -a foo.txt ; echo "" > aasuaa

echo ---- infamake -f infamakefile1.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile1.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile1.mk -j=8 -d=phase clean | tee -a foo.txt ; infamake -f infamakefile1.mk -j=8 -d=phase clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile1.mk -j=8 -d=phase clean | tee -a foo.txt ; infamake -f infamakefile1.mk -j=8 -d=phase clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt


echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile2.mk -j=8 -d=phase notarget | tee -a foo.txt ; infamake -f infamakefile2.mk -j=8 -d=phase notarget 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile2.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile2.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile2.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile2.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo '---- echo "" > aasuaa' | tee -a foo.txt ; echo "" > aasuaa

echo ---- infamake -f infamakefile2.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile2.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile2.mk -j=8 -d=phase clean | tee -a foo.txt ; infamake -f infamakefile2.mk -j=8 -d=phase clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile2.mk -j=8 -d=phase clean | tee -a foo.txt ; infamake -f infamakefile2.mk -j=8 -d=phase clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt


echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile3.mk -j=8 -d=phase notarget | tee -a foo.txt ; infamake -f infamakefile3.mk -j=8 -d=phase notarget 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile3.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile3.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile3.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile3.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo '---- echo "" > aasuaa' | tee -a foo.txt ; echo "" > aasuaa

echo ---- infamake -f infamakefile3.mk -j=8 -d=phase all | tee -a foo.txt ; infamake -f infamakefile3.mk -j=8 -d=phase all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile3.mk -j=8 -d=phase clean | tee -a foo.txt ; infamake -f infamakefile3.mk -j=8 -d=phase clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- infamake -f infamakefile3.mk -j=8 -d=phase clean | tee -a foo.txt ; infamake -f infamakefile3.mk -j=8 -d=phase clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt


echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- make -j8 notarget | tee -a foo.txt ; make -j8 notarget 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- make -j8 all | tee -a foo.txt ; make -j8 all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- make -j8 all | tee -a foo.txt ; make -j8 all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo '---- echo "" > aasuaa' | tee -a foo.txt ; echo "" > aasuaa

echo ---- make -j8 all | tee -a foo.txt ; make -j8 all 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- make -j8 clean | tee -a foo.txt ; make -j8 clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

echo ---- make -j8 clean | tee -a foo.txt ; make -j8 clean 2>&1 | tee -a foo.txt ; date | tee -a foo.txt

echo '---- ls | wc' | tee -a foo.txt ; ls | wc | tee -a foo.txt ; date | tee -a foo.txt

#### ####
#### gen_tests.cpp

#include <iostream>
#include <fstream>
#include <memory>
#include <string>
#include <vector>
#include <queue>
using namespace std;

struct Node
{ Node() : parent(0) {}
  ~Node() { for (int i=0; i<children.size(); ++i) delete children[i]; }
  string name;
  Node * parent; //does not own
  vector<Node*> children; //this object owns these
private:
  Node(Node const& ); //not copyable
  Node& operator= (Node const& ); //not copyable
};

string& makeNextName(string& name)
{ while (name.size() < 2)
    name.push_back('a');
  for (int i=2; ; )
  { if (i >= name.size())
    { name.push_back('a');
      return name;
    }
    if (name[i] == 'z')
    { name[i] = 'a';
      ++i;
      continue;
    }
    ++name[i];
    return name;
  }
};

string getChildrenNameList(Node const* n)
{ string x;
  for (int i=0; i<n->children.size(); ++i)
  { x += ' ';
    x += n->children[i]->name;
  }
  return x;
}

ostream& print(ostream& out, Node* n)
{ out << "$(call macro," << n->name << "," << getChildrenNameList(n)
      << ")" "\n";
  for (int i=0; i<n->children.size(); ++i)
    print(out, n->children[i]);
  return out;
}

int main()
{
int const cap = 25 * 1000;
int const branchingFactor = 75;

string name;

Node base;
base.name = makeNextName(name);

queue<Node*> q;
q.push(&base);

int numNodes = 1;
while (numNodes < cap)
{ Node * const next = q.front();
  q.pop();
  for (int i=0; i<branchingFactor; ++i)
  { auto_ptr<Node> child(new Node);
    child->name = makeNextName(name);

    ++numNodes;
    q.push(child.get());

    next->children.push_back(child.get());
    child.release();
  }
}

ofstream gnumakefile("makefile");
ofstream infamakefile1("infamakefile1.mk");
ofstream infamakefile2("infamakefile2.mk");
ofstream infamakefile3("infamakefile3.mk");

gnumakefile <<
"all : ; $(info all) \n"
".PHONY : all " "\n"
"\n"
"clean : ; $(info clean) \n"
".PHONY : clean " "\n"
"\n"
"notarget : ; $(info notarget) \n"
".PHONY : notarget " "\n"
"\n"
"macro = $(eval $(value macro_impl))" "\n"
"define macro_impl" "\n"
" # $1 name of file" "\n"
" # $2 list of dependencies" "\n"
" " "\n"
" $1 : $2 ; @touch $@" "\n"
" " "\n"
" all : $1" "\n"
" " "\n"
" $1.clean : ; @rm -f $(basename $@)" "\n"
" .PHONY : $1.clean " "\n"
" clean : $1.clean " "\n"
"endef" "\n"
"\n"
;
infamakefile1 <<
".PHONY.all : ; $(info all)" "\n"
".PHONY.clean : ; $(info clean)" "\n"
".PHONY.notarget : ; $(info notarget)" "\n"
"\n"
"macro = $(eval $(value macro_impl))" "\n"
"define macro_impl" "\n"
" # $1 name of file" "\n"
" # $2 list of dependencies" "\n"
" " "\n"
" $1 : $2 ; $(touch $@)" "\n"
" " "\n"
" .PHONY.all : $1" "\n"
" " "\n"
" .PHONY.$1.clean : ; $(delete-files $(basename $@))" "\n"
" .PHONY.clean : .PHONY.$1.clean " "\n"
"endef" "\n"
"\n"
;

infamakefile2 <<
".PHONY.all : ; $(info all)" "\n"
".PHONY.clean : ; $(info clean)" "\n"
".PHONY.notarget : ; $(info notarget)" "\n"
"\n"
"macro = $(eval $(value macro_impl))" "\n"
"define macro_impl" "\n"
" # $1 name of file" "\n"
" # $2 list of dependencies" "\n"
" " "\n"
" $1 : $2 ; $(process touch $@)" "\n"
" " "\n"
" .PHONY.all : $1" "\n"
" " "\n"
" .PHONY.$1.clean : ; $(process rm $(basename $@))" "\n"
" .PHONY.clean : .PHONY.$1.clean " "\n"
"endef" "\n"
"\n"
;

infamakefile3 <<
".PHONY.all : ; $(info all)" "\n"
".PHONY.clean : ; $(info clean)" "\n"
".PHONY.notarget : ; $(info notarget)" "\n"
"\n"
"macro = $(eval $(value macro_impl))" "\n"
"define macro_impl" "\n"
" # $1 name of file" "\n"
" # $2 list of dependencies" "\n"
" " "\n"
" $1 : $2 ; $(shell touch $@)" "\n"
" " "\n"
" .PHONY.all : $1" "\n"
" " "\n"
" .PHONY.$1.clean : ; $(shell rm $(basename $@))" "\n"
" .PHONY.clean : .PHONY.$1.clean " "\n"
"endef" "\n"
"\n"
;

print(gnumakefile, &base);
print(infamakefile1, &base);
print(infamakefile2, &base);
print(infamakefile3, &base);

gnumakefile << "$(info $(shell date) -- done with phase 1)" "\n";
infamakefile1 << "$(info $(shell date) -- done with phase 1)" "\n";
infamakefile2 << "$(info $(shell date) -- done with phase 1)" "\n";
infamakefile3 << "$(info $(shell date) -- done with phase 1)" "\n";

gnumakefile.close();
if (!gnumakefile)
{ cerr << "error" << endl;
return 1;
}
infamakefile1.close();
if (!infamakefile1)
{ cerr << "error" << endl;
return 1;
}
infamakefile2.close();
if (!infamakefile2)
{ cerr << "error" << endl;
return 1;
}
infamakefile3.close();
if (!infamakefile3)
{ cerr << "error" << endl;
return 1;
}
}

Joshua Maurice

unread,
Jul 28, 2010, 9:34:06 PM7/28/10
to
On Jul 28, 6:28 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> string& makeNextName(string& name)
> { while (name.size() < 2)
>     name.push_back('a');
>   for (int i=2; ; )
>   { if (i >= name.size())
>     { name.push_back('a');
>       return name;
>     }
>     if (name[i] == 'z')
>     { name[i] = 'a';
>       ++i;
>       continue;
>     }
>     ++name[i];
>     return name;
>   }
> };

Also, yes, I know this works only for ASCII and similar encodings.
It's a throwaway program, and I was lazy. Fix it up if you want to use
it on a non-ASCII system.

Jorgen Grahn

unread,
Jul 30, 2010, 7:05:12 AM7/30/10
to
On Thu, 2010-07-22, Joshua Maurice wrote:
> On Jul 22, 3:09 am, Ian Collins <ian-n...@hotmail.com> wrote:
>> On 07/22/10 10:00 PM, Jorgen Grahn wrote:
>>
>> > On Tue, 2010-07-20, Joshua Maurice wrote:
>> >> On Jul 20, 2:15 pm, Ian Collins<ian-n...@hotmail.com>  wrote:
>>
>> >>> Which is why every team I have worked with or managed had one or two
>> >>> specialists who look after the build system and other supporting tools
>> >>> (SCM for instance).
>>
>> > Yes, I guess not /everyone/ in the project has to know /all/ about the
>> > build system.  But I also think it's dangerous to delegate build and
>> > SCM to someone, especially someone who's not dedicated 100% to the
>> > project, and who doesn't know the code itself.
>>
>> I guess you've never worked on a multi-site project using Clear Case!

I've been close enough to it to see your point ... the extra
limitations that come from "multi-site" are indeed scary.

But I've used ClearCase a lot, and while there *are* things that are
delegated, it's mostly the invisible housekeeping[1] (and the guru/
troubleshooter role). The daily use of the system has been a pure
project thing.

>> > You get problems like people not arranging their code to support
>> > incremental builds.  For example, if all object files have a dependency
>> > on version.h or a config.h which is always touched, incremental builds
>> > don't exist.
>>
>> A good slap round the head normally solves that problem.

Yes, but you need someone (a) close enough to notice, (b) knowing it
doesn't have to be like that, and (c) with enough authority to slap
them without causing even more problems.

> Can you come slap mine please? We have two parallel build machines,
> one full clean, one this hacked version of incremental I set up. Every
> such incremental build changes the version.h (and a couple other
> version files like version.java), updating the build number, which has
> the result that my incremental build tends to rebuild like 40% of all
> of the code on every streaming incremental build because these version
> files were changed. This was noted to management, but no time was
> allocated to fix this.

In those situations I sometimes find an hour or two somewhere
(overtime, or even unpaid[2]) and fix it. You can probably come up
with various fixes, for example (a sketch follows below):
- rename version.h to version.cpp
- make it contain a const char* build_number()
- replace all references to the build number with
      extern const char* build_number();
      foo(..., build_number(), ...);
- Show it to people and offer to merge it into mainline.
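
A minimal sketch of that refactor (the file names and the banner
function are made up for illustration): only the .cpp carrying the
string gets rebuilt when the number is bumped, and everything else
merely relinks.

// version.cpp -- the only file that changes when the build number is bumped
const char* build_number()
{
    return "1.2.3.4567";   // hypothetical value; normally generated
}

// any client .cpp -- no header dependency on the bumped file at all
#include <cstdio>

extern const char* build_number();

void print_banner()
{
    std::printf("myapp build %s\n", build_number());
}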

You need a bit of cred to pull it off, and the right kind of
workplace. And you better not introduce bugs or force people to change
their work habits.

/Jorgen

[1] Not trying to belittle what it takes to keep ClearCase
up and reasonably fast.
[2] Sometimes removing a source of daily frustration is worth
a few hours of your free time.

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Jorgen Grahn

unread,
Jul 30, 2010, 7:25:03 AM7/30/10
to
On Thu, 2010-07-22, Joshua Maurice wrote:
> On Jul 22, 4:25 am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
>> On Tue, 2010-07-20, Joshua Maurice wrote:
...

>> > The
>> > immediate conclusion is that a build system based on idiomatic make
>> > can never be incrementally correct over all possible changes in source
>> > control. That is, a developer will inevitably make a change which
>> > breaks the idiomatic usage (what little there is) and will result in
>> > incremental incorrectness.
>>
>> So review the changes and reprimand him. Same as with the C++ code,
>> but *a lot* easier.
>
> They'll argue that 95% incremental correctness is acceptable, just as
> someone else has else-thread.

I think it was me, but the other 5% was the make "loopholes" we've
been talking about (source files being removed etc), not Makefiles
that are plain incorrect.

> If you allow a build system where the
> developer can incorrectly specify a build script, but it works most of
> the time, management will not see a need to spend developer time
> fixing it. That's why I want it near impossible for a developer to be
> able to break incremental correctness short of maliciousness.

That's a realistic description of many organizations :-/ The costs
(broken builds, long compile-edit cycles, bored developers ...) are
hidden; if a build has always taken 2 hours, it's hard for management
to imagine that it could really take 2 minutes.

I guess I see that as the same fight as for type safety, const
correctness, thread safety ... whatever makes the C++ code safe to
work with, but has an initial cost and doesn't visibly contribute to
productivity.

>> > False negatives are quite annoying, but
>> > perhaps somewhat acceptable. False positives are the bane of a build
>> > systems existence, but they are possible when the build system itself
>> > is being constantly modified \without any tests whatsoever\ as is
>> > common with idiomatic make.
>>

>> That *is* scary, but I see no way around it.
>
> I'm working on it. See else-thread for a description of my build
> system.

I don't see how it's possible (technically, and also how it could
become accepted and so widely spread that it matters to people like
me). But I'm not exactly known for embracing new ideas, so don't listen
to me!

/Jorgen

Joshua Maurice

unread,
Jul 30, 2010, 12:06:38 PM7/30/10
to

Keith H Duggar has done it quite well, I think. He hasn't specified
enough for me to replicate it, but he has specified enough that I'm
convinced it meets all of my criteria for his use case.

Developers do not modify makefiles for their normal activities. It has
prebuilt macros (not make evil-value macros, but whole prebuilt
makefiles). (He said something about a database for handling link
dependencies. Sounds interesting. At the very least, there could just
be a single file describing link dependencies and nothing else, so
presumably it would be quite difficult either way for developers to
break it, and quite easy to fix if they did.)

Once you have everyone using instantiations of verified macros, the
rest is child's play. Keith's macros seem to handle most or all of my
corner cases, or at the very least could be made to handle all of them,
including includes hiding includes and removed cpp files causing a
relink of their output library. However, his approach still has
problems for my situation.

First, it's not very portable. That's a big problem for my company
when we support basically every desktop and mainframe known to man (or
at least a large portion of them).

Second, from all of my testing, using GNU Make would be unacceptably
slow. As I have shown, other make clones can greatly outperform GNU
Make, but even then, they would be outperformed by a solution written
in a compiled language rather than an interpreted one. At least, the
last time I tried using makefiles for a small section of my company's
codebase, the build system overhead was in the minutes just to check
"out of date", aka phase 1.

Also, it's a little light on customization. He basically said that
their developers never need to add custom command line preprocessor
defines, nor change any other build options. My company's makefiles
have the occasional "hack" to overcome a compiler bug, to turn down
optimizations in one particular file on one particular system because
otherwise the compile of that file alone would take hours, and so on.
(Of specific note is the handling of shared libraries for Windows.
AFAIK, the standard solution is for each project to have its own
special preprocessor define, which is defined when compiling that
shared lib only, to allow correct usage of __declspec(dllimport) and
__declspec(dllexport). Then again, I suppose the scripts could auto
define this preprocessor define based on the soname, so actually never
mind on this point.)

Finally, and most importantly, it only works well for C++ or similar
compilation models. As soon as you throw in Java, C++ code generation,
or anything else which doesn't fit the make model, you lose correct,
fast incremental builds.

So, how do you fix this? See my earlier posts. The short version is to
write a domain specific language, a build specific language, where all
the normal developer can do is say "I have some cpp source files here
which I want turned into a shared lib here", "I have some java source
files here, with this classpath, which I want turned into a jar here",
etc. Do not give them a Turing complete programming language. Just
give them a simple declarative language where they can declare
instances of a predefined build type and specify the particular build
options (ex: includes, preprocessor defines, dependencies) to the
instantiation of the macro. Then, just have an engine which you can
plugin these macros, and then write all the macros you need. Sometimes
a developer will have a need not met by the current list of macros, so
he can write his own, or if it comes to that I'll have to provide a
general purpose "exec" macro. (I've been thinking about it, and I've
been thinking if I could expose such a think and have some level of
guarantees with incremental correctness. I don't think I can. As soon
as I expose the shell, they have their Turing complete programming
language, and they will inevitably break incremental correctness
through hackery and stupidity [ignorance], myself included.)

There are two insights. First is that file level dependency graphs
don't work all that well, but a "macro level" dependency graph would
work wonders, something very much like Ant. (Except less stupid. I am
particularly annoyed that you cannot just hit a switch and parallelize
all Ant tasks as much as possible within dependency constraints ala
GNU Make -j[#]. Instead, you have to identify beforehand the
particular sets of tasks which are to be executed in parallel. Far
less powerful, far more cumbersome, and error-prone.) Second, give a
bunch of prebuilt tasks, or macros, for the developer to use, which
have been vetted for incremental correctness when used together.
Again, very much like Ant at first glance, except the large majority
of Ant's actual tasks are not incrementally correct, whereas I'm
suggesting you write actually incrementally correct macros / tasks.
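
To be a bit more concrete about "macro level dependency graph", here is
a toy C++ sketch of the idea, not my actual tool: macro instances
declare which other instances they depend on, and an engine runs every
instance whose dependencies have finished. All names here are
hypothetical, and a real engine would run the ready set in parallel,
ala make -j.

#include <string>
#include <vector>

// One vetted build kind, e.g. "cpp sources -> shared lib" or
// "java sources -> jar". Incremental correctness is implemented here,
// once, instead of in every per-project build script.
struct BuildMacro
{ virtual ~BuildMacro() {}
  virtual std::string name() const = 0;
  // Decides for itself whether any real work is needed, then does it.
  virtual void executeIfOutOfDate() = 0;
};

// A node in the macro-level graph: one instance of a macro plus the
// instances it depends on.
struct MacroInstance
{ BuildMacro* macro;                      // not owned, for the sketch
  std::vector<MacroInstance*> dependsOn;
  bool done;
  MacroInstance() : macro(0), done(false) {}
};

// Trivial single-threaded engine: repeatedly run every instance whose
// dependencies are all done, until nothing more can run.
void runAll(std::vector<MacroInstance*> const& graph)
{ bool progress = true;
  while (progress)
  { progress = false;
    for (size_t i = 0; i < graph.size(); ++i)
    { MacroInstance* m = graph[i];
      if (m->done)
        continue;
      bool ready = true;
      for (size_t d = 0; d < m->dependsOn.size(); ++d)
        if (!m->dependsOn[d]->done)
          ready = false;
      if (!ready)
        continue;
      m->macro->executeIfOutOfDate();
      m->done = true;
      progress = true;
    }
  }
}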

Keith H Duggar

unread,
Jul 30, 2010, 1:09:06 PM7/30/10
to

The database is flat text in a simple format that looks like
this (some names redacted):

% DEPTYPE TGTPATH DEPPATH
- link apps/abc/abc libs/libbase.a
- link apps/abc/abc libs/libmath.a
...
- link libs/libmath.a libs/libbase.a

There are many such text databases ("tdb") housing the majority
of our data (not only build related but business related). They
are accessed through a set of command line tools. For example,
the dependency database can be dumped to stdout with the command:

tdbcat build.deps

That command processes one or more actual files to generate the
requested view. The files include override files that allow for
easy manual correction or augmentation.

The build system essentially runs

tdbcat build.deps | tdbtransclose

to project the transitive closure of dependencies, and processes
that to generate an ordered list of libraries for an app's linker
command, i.e. variables that look like:

LIBDEPS := libs/libprop.a libs/libmath.a libs/libbase.a

Note that we do not allow circular link dependencies, so if the
transitive closure results in any a -> a entries the build is
halted.

KHD

Jorgen Grahn

unread,
Jul 31, 2010, 6:39:35 PM7/31/10
to

And I should say, when you have something and want to share it,
announce it on comp.software.config-mgmt. That's one group where I
know build tools are on topic.

Jorgen Grahn

unread,
Jul 31, 2010, 7:07:01 PM7/31/10
to
On Fri, 2010-07-30, Joshua Maurice wrote:
...

> Second, from all of my testing, using GNU Make would be unacceptably
> slow. As I have shown, other make clones could greatly outperform GNU
> Make,

I have not seen that shown (I have missed a few postings), but is that
really so, and which clones?

> but even then, they would be outperformed by a solution written
> more in a compiled language and not an interpreted language.

I doubt it. I believe any sane make's performance is limited by
stat()ing files and running external programs. Once you have minimized
the first one, there is not much more power to squeeze out of it.

> At least,
> the last time I tried using makefile for a small section of my
> company's codebase, the build system overhead was in the minutes to
> check "out of date" aka phase 1.

Also surprising, unless you're on Windows or have some other (to me)
unusual thing in your environment. I once wrote a sane Makefile
(replaced insane ones) for a fairly big project (maybe 500 translation
units and 1000 header files), and had it run on a *really* slow file
system (ClearCase dynamic view). That overhead was a fraction of a
second -- I expected much more.

> Finally, most importantly, it only works well for C++ or similar
> compilation models. As soon as you throw in Java, C++ code generation,
> or anything else which doesn't fit the make model, then you lose
> correct, fast incremental.

I'm not sure why you say C++ code generation doesn't fit the make model.
Make even has builtin rules for tools like lex and yacc. Is it because
you cannot find that part of the dependency graph without actually
generating the code?

> So, how do you fix this? See my earlier posts. The short version is to
> write a domain specific language, a build specific language, where all
> the normal developer can do is say "I have some cpp source files here
> which I want turned into a shared lib here", "I have some java source
> files here, with this classpath, which I want turned into a jar here",
> etc. Do not give them a Turing complete programming language. Just
> give them a simple declarative language where they can declare
> instances of a predefined built type and specify the particular build
> options (ex: includes, preprocessor defines, dependencies) to the
> instantiation of the macro. Then, just have an engine which you can
> plugin these macros, and then write all the macros you need. Sometimes
> a developer will have a need not met by the current list of macros, so
> he can write his own, or if it comes to that I'll have to provide a
> general purpose "exec" macro. (I've been thinking about it, and I've
> been thinking if I could expose such a think and have some level of
> guarantees with incremental correctness. I don't think I can. As soon
> as I expose the shell, they have their Turing complete programming
> language, and they will inevitably break incremental correctness
> through hackery and stupidity [ignorance], myself included.)

Most people (at least me) expect a build tool to be able to run
programs which the build tool has never heard of. I often have a Perl
script generate some table, a shell script formatting the documentation,
or some tool prepare the distributable stuff (e.g. RPMs for some
Linuxes, or some odd format for loading into PROM).

But perhaps some or all of those steps can be isolated from the
compilation and linking ... into a trivial Makefile.

Joshua Maurice

unread,
Jul 31, 2010, 8:29:39 PM7/31/10
to
On Jul 31, 4:07 pm, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> On Fri, 2010-07-30, Joshua Maurice wrote:
> > Second, from all of my testing, using GNU Make would be unacceptably
> > slow. As I have shown, other make clones could greatly outperform GNU
> > Make,
>
> I have not seen that shown (I have missed a few postings), but is that
> really so, and which clones?  

Sorry. I quoted some tests run by me else-thread. Please see that.
It's not exactly verifiable, because I can't post all of the
source, so I guess you'll just have to trust me. I'm working on
getting my company to open source it, but no guarantees on time.

> > but even then, they would be outperformed by a solution written
> > more in a compiled language and not an interpreted language.
>
> I doubt it. I believe any sane make's performance is limited by
> stat()ing files and running external programs. Once you have minimized
> the first one, there is not much more power to squeeze out of it.

Please see my tests else-thread. Here's the google groups link:
http://groups.google.com/group/comp.lang.c++/msg/843d5f7230cccc00

> > At least,
> > the last time I tried using makefile for a small section of my
> > company's codebase, the build system overhead was in the minutes to
> > check "out of date" aka phase 1.
>
> Also surprising, unless you're on Windows or have some other (to me)
> unusual thing in your environment.  I once wrote a sane Makefile
> (replaced insane ones) for a fairly big project (maybe 500 translation
> units and 1000 header files), and had it run on a *really* slow file
> system (ClearCase dynamic view). That overhead was a fraction of a
> second -- I expected much more.

My company builds on Windows and most other common Unix-like OSs and a
lot of mainframes. The particular product which I work on has about
25,000 source files in its build, and growing.

Let me repeat what I have said else-thread. A "true" correct
incremental build system should do only correct incremental builds. It
should not be capable of doing incorrect incremental builds; that is,
all incremental builds should produce output equivalent to full clean
builds.

This includes, when a developer removes a cpp source file, it should
remove its corresponding stale obj file, and it should relink its
corresponding lib.

I believe that this is rather impractical when everyday developer
actions involve modifying build scripts that are written in a Turing
complete language like GNU Make's. As Keith has shown, you can use GNU
Make to build a good incremental build system, but in so doing,
developers do not generally modify the makefiles themselves.

For the specific question, C++ code generation. I specifically
mentioned else-thread that it does not work well if the output of the
code generation is not known at makefile parse time. For example, my
company does codegen from Rose model files to cpp source files, and to
java source files. This is done to facilitate serialization of object
graphs between the cpp and java. This is a one to many build step: a
single Rose file can have many cpp files as output. Thus, when make is
parsing the makefile, it cannot tell which obj files are out of date
and should be deleted. No one knows which obj files are out of date
until actually running the code gen. The exact order of events needs
to be "generate cpp code", "check for out of date obj files and lib",
and "relink if necessary the lib". I don't see a particularly
intuitive way to do this in GNU Make. As mentioned else-thread, I
suppose I could put the logic for detecting stale object files in the
same make rule command as the Rose code generation, but this doesn't
seem like the best of ideas. Also, suppose that the rose file itself
was removed, but other checked-in cpp source code for that lib was
still there: I don't see offhand how I could check for stale obj files
if the developer just removes the Rose file.

Then we also have the difficulty of checking if a new header file hid
an old header file on some search path. This you really could not just
add to the Rose code generation command. The logic to check for header
file hiding needs to be at the point of compilation, not at the header
file generation.
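
To make "header hiding" concrete: the file an #include resolves to can
change when a new file with the same name appears earlier in the
ordered -I search path, even though the header recorded in the last
build's dependency file is itself untouched. A minimal sketch of the
check (POSIX stat; how the previously resolved path gets recorded is
left out, and would be part of the build system's own bookkeeping):

#include <sys/stat.h>
#include <string>
#include <vector>

// Resolve "name" against the ordered include search path, the way the
// compiler would. Returns the empty string if nothing matches.
std::string resolveInclude(std::vector<std::string> const& includeDirs,
                           std::string const& name)
{ for (size_t i = 0; i < includeDirs.size(); ++i)
  { std::string candidate = includeDirs[i] + "/" + name;
    struct stat st;
    if (stat(candidate.c_str(), &st) == 0)
      return candidate;
  }
  return std::string();
}

// The object file is out of date not only if the previously resolved
// header is newer, but also if the same include name now resolves to a
// *different* file than last time (a new header hid the old one).
bool includeResolutionChanged(std::vector<std::string> const& includeDirs,
                              std::string const& includeName,
                              std::string const& previouslyResolvedPath)
{ return resolveInclude(includeDirs, includeName) != previouslyResolvedPath;
}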

The short version is that you need some logic to run on every build,
after certain incremental build steps but before others, and GNU Make
does not support this at all.

> Most people (at least me) expect a build tool to be able to run
> programs which the build tool has never heard of. I often have a Perl
> script generate some table, a shell script formatting the documentation,
> or some tool prepare the distributable stuff (e.g. RPMs for some
> Linuxes, or some odd format for loading into PROM).

But no build tool actually does this. At best, they just provide a
framework for the new compilation step. GNU Make just provides a
framework. The build tool which I'm writing just provides a framework.
However, for actual incremental correctness, and for applicability to
lots of different kinds of build steps like my company uses, the make
model - file level dependency graph with cascading rebuilds without
termination conditions - does not work well at all, at least for my
company's uses.

Jorgen Grahn

unread,
Aug 4, 2010, 9:30:48 AM8/4/10
to
On Sun, 2010-08-01, Joshua Maurice wrote:
> On Jul 31, 4:07 pm, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
>> On Fri, 2010-07-30, Joshua Maurice wrote:
...

> For the specific question, C++ code generation. I specifically
> mentioned else-thread that it does not work well if the output of the
> code generation is not known at makefile parse time. For example, my
> company does codegen from Rose model files to cpp source files, and to
> java source files. This is done to facilitate serialization of object
> graphs between the cpp and java. This is a one to many build step: a
> single Rose file can have many cpp files as output. Thus, when make is
> parsing the makefile, it cannot tell which obj files are out of date
> and should be deleted. No one knows which obj files are out of date
> until actually running the code gen. The exact order of events needs
> to be "generate cpp code", "check for out of date obj files and lib",
> and "relink if necessary the lib". I don't see a particularly
> intuitive way to do this in GNU Make.

I tend to blame the company developing the tool in cases like these. I
know Rational Rose (not Rose Realtime, which I hear is different and
better), and I have always loathed it for its lack of support for sane
version control, parallel development etc.

It never occurred to me because I have never used it for code
generation, but I suppose that in the same way it lacks support for
building. Perhaps when you generate a lot of source files at random,
you should at the same time generate a Makefile fragment describing
the dependencies between them. Perhaps the code generator should
follow the same policy as a Makefile build and not touch a generated
header file unless it is actually changed.

So I'm defining tools that break Make as bad tools ... which of course
doesn't help people who are stuck with them :-/

Maybe it would help somewhat to wrap the code generator in a shell
script which only touches the regenerated .cpp/.h files which have
actually changed (and removes all of them if the generation fails).

> As mentioned else-thread, I
> suppose I could put the logic for detecting stale object files in the
> same make rule command as the Rose code generation, but this doesn't
> seem like the best of ideas. Also, suppose that the rose file itself
> was removed, but other checked-in cpp source code for that lib was
> still there: I don't see offhand how I could check for stale obj files
> if the developer just removes the Rose file.
>
> Then we also have the difficulty of checking if a new header file hid
> an old header file on some search path. This you really could not just
> add to the Rose code generation command. The logic to check for header
> file hiding needs to be at the point of compilation, not at the header
> file generation.

My assertion early in this thread was that such things (the "make
loopholes") can be avoided (don't use many include search paths)
and/or detected manually when they happen (rebuild from scratch when
files disappear from version control). I don't think I saw you
explaining why that isn't good enough, or did I miss that?

...

[About being able to specify other actions than compiling and linking]

>> Most people (at least me) expect a build tool to be able to run
>> programs which the build tool has never heard of. I often have a Perl
>> script generate some table, a shell script formatting the documentation,
>> or some tool prepare the distributable stuff (e.g. RPMs for some
>> Linuxes, or some odd format for loading into PROM).
>
> But no build tool actually does this. At best, they just provide a
> framework for the new compilation step. GNU Make just provides a
> framework. The build tool which I'm writing just provides a framework.

Yes, but it seemed to me you considered *not* providing that. That's
why I pointed out that it's important to many of us.


It seems to me that you have a pretty narrow focus and don't want to
listen to objections a lot. Actually, I think that's fine. That's what
*I* do when I have an idea and want to summon the energy to do
something about it.

Joshua Maurice

unread,
Aug 4, 2010, 3:29:10 PM8/4/10
to
On Aug 4, 6:30 am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> On Sun, 2010-08-01, Joshua Maurice wrote:
> > For the specific question, C++ code generation. I specifically
> > mentioned else-thread that it does not work well if the output of the
> > code generation is not known at makefile parse time. For example, my
> > company does codegen from Rose model files to cpp source files, and to
> > java source files. This is done to facilitate serialization of object
> > graphs between the cpp and java. This is a one to many build step: a
> > single Rose file can have many cpp files as output. Thus, when make is
> > parsing the makefile, it cannot tell which obj files are out of date
> > and should be deleted. No one knows which obj files are out of date
> > until actually running the code gen. The exact order of events needs
> > to be "generate cpp code", "check for out of date obj files and lib",
> > and "relink if necessary the lib". I don't see a particularly
> > intuitive way to do this in GNU Make.
>
> I tend to blame the company developing the tool in cases like these. I
> know Rational Rose (not Rose Realtime, which I hear is different and
> better), and I have always loathed it for its lack of support for sane
> version control, parallel development etc.
>
> It never occured to me because I have never used it for code
> generation, but I suppose that in the same way it lacks support for
> building.

Well, our problem is that we develop the GUI and some newer components
in Java, but our old engine is in C++. Our engine solves a domain
specific problem with a domain specific language. This language is
represented by an object graph, whose representation is logically
coupled with the GUI of the end users. Our engine takes this graph,
picks it apart into separate tasks assignable to separate threads, and
begins processing. Implicit in this is that we want to be able to take
an object graph from Java, serialize to some XML format or some binary
format, and then deserialize to C++ to give to the engine, and vice
versa for debugging etc. This potentially requires arbitrary code
generation. At one point in time, we used several Rose model files to
describe the object graph of the domain specific language. We had a
custom inhouse tool which converted this to C++ classes and Java
classes with the serialization code in place. I don't really know any
other sane way to handle this use case, the serialization of object
graphs between different languages such as C++ and Java.

> Perhaps when you generate a lot of source files at random,
> you should at the same time generate a Makefile fragment describing
> the dependencies between them. Perhaps the code generator should
> follow the same policy as a Makefile build and not touch a generated
> header file unless it is actually changed.
>
> So I'm defining tools that break Make as bad tools ... which of course
> doesn't help people who are stuck with them :-/
>
> Maybe it would help somewhat to wrap the code generator in a shell
> script which only touches the regenerated .cpp/.h files which have
> actually changed (and removes all of them if the generation fails).

The "conditional touching of files in a command of a rule" won't work,
at least not with GNU Make. The GNU Make mailing list has confirmed
that any file creation, deletion, or modification during phase 2 may
not be picked up. This has been my experience playing with it as well.
GNU Make effectively determines which portions of the graph are out of
date before running any command of any rule.

This is one facet of my major beef with Make: from a command, you
cannot conditionally choose to mark downstream nodes as up to date or
out of date. Depending on the kind of build step, this affects its
incremental "goodness", the ability to skip unnecessary build steps,
to varying degrees.

With the Rose generation, with GNU Make, when the code generation task
is out of date, you can mark all output files out of date. It'll
result in some additional C++ compilation - a better system could skip
more unnecessary work, but at least it's incrementally correct.

With Java compilation, you could make an incrementally correct build
system, but it would be a cascading rebuild without termination,
vastly inferior to the aforementioned system which can terminate the
rebuild early.

> > As mentioned else-thread, I
> > suppose I could put the logic for detecting stale object files in the
> > same make rule command as the Rose code generation, but this doesn't
> > seem like the best of ideas. Also, suppose that the rose file itself
> > was removed, but other checked-in cpp source code for that lib was
> > still there: I don't see offhand how I could check for stale obj files
> > if the developer just removes the Rose file.
>
> > Then we also have the difficulty of checking if a new header file hid
> > an old header file on some search path. This you really could not just
> > add to the Rose code generation command. The logic to check for header
> > file hiding needs to be at the point of compilation, not at the header
> > file generation.
>
> My assertion early in this thread was that such things (the "make
> loopholes") can be avoided (don't use many include search paths)
> and/or detected manually when they happen (rebuild from scratch when
> files disappear from version control). I don't think I saw you
> explaining why that isn't good enough, or did I miss that?

Well, a couple of things.

First, some build steps, like javah, javac, the aforementioned Rose
compilation, unzipping, and others, produce output which is not
predictable without doing the actual build step. With file creation
and deletion, you need to check for:
1- Stale files - this might require some cleaning, and rerunning of
build steps downstream.
2- New files
2a- New files which hide old files on some search path. Relatively
unlikely for C++ depending on naming conventions (a lot more likely in
my company's product due to bad naming conventions and lots of include
path entries), but much more likely for Java.
2b- New files which require new build steps, or new nodes in the
dependency graph. For my Rose to C++ code generation, I do not know
what C++ files will come out of it until I actually do the code
generation. It will produce lots of .cpp files (which will not be
modified by hand). Each of these .cpp files needs to be compiled to
a .o file. I would like for this to be done in parallel, but to do
that I need to define new make rules, aka add nodes to the dependency
graph, which one really cannot do in GNU Make.

I know a little about GNU Make and how it can have makefiles as
targets, and if it detects an out of date makefile, it will rebuild
that makefile (and all prereqs) and restart GNU Make from the
start with the new makefile. Has anyone ever used this? I admit that I
haven't played around with this fully, but my initial impression is
that it's basically unworkable for my problems. Restarting the whole
shebang after every such Rose code generation would result in a lot of
makefile parsing, easily adding minutes (or likely much more) to a
build time. Though, I admit I could be wrong here.

Finally, why punt? We're requiring that the developer be fully aware,
but I think that a lot of these problems, such as "do a full clean
build whenever a file is deleted", are easy to forget or accidentally
miss. I think this is a little different than "don't dereference a
null pointer" or similar arguments you can make. When we're writing
code, we're aware of the pointer and that it could be null. When we're
doing a build, we're busy thinking about code, not about whether the
entire codebase breaks some "build style" rule, or whether some file
has been deleted. It's not practical to check your email for
"incremental build breaking messages". It's inefficient, and error
prone. Moreover, it's fixable. It's quite doable to handle all of
this, and more, and to do it faster than GNU Make. There is no reason
to punt. The investment of time now to make a build system which can
handle it all will save lots of developer time later - for those
developers which:
- work on "the build from hell" (like me)
- or those who forgot to check for a file deletion when they did a
sync
- or those who are working on a mixed code base with Java, C++, etc.
(like me).

Put another way, yes I recognize that the perfectly correct, academic
way is not the way to do things. For example, see my post here:
http://groups.google.com/group/comp.lang.c++.moderated/msg/dacba7e87ded4dd7

However, it seems clear to me that this is a win worth investing in.
The investment needs to be made once, by one person, and everyone in
the C++ and Java (and more) programming world can use it to save time.
Any time savings \at all\ is easily worth it when we can amortize the
cost to one person but claim savings from every developer everywhere.
Now, it's hard to make such an argument to management, mostly because
it's wrong. For management, you correctly need to show that it helps
the company, which is a bit harder to show, but I still think that
this is the case. (My management and peers disagree though.)

> [About being able to specify other actions than compiling and linking]
>
> >> Most people (at least me) expect a build tool to be able to run
> >> programs which the build tool has never heard of. I often have a Perl
> >> script generate some table, a shell script formatting the documentation,
> >> or some tool prepare the distributable stuff (e.g. RPMs for some
> >> Linuxes, or some odd format for loading into PROM).
>
> > But no build tool actually does this. At best, they just provide a
> > framework for the new compilation step. GNU Make just provides a
> > framework. The build tool which I'm writing just provides a framework.
>
> Yes, but it seemed to me you considered *not* providing that. That's
> why I pointed out that it's important to many of us.
>
> It seems to me that you have a pretty narrow focus and don't want to
> listen to objections a lot. Actually, I think that's fine. That's what
> *I* do when I have an idea and want to summon the energy to do
> something about it.

I'm not quite following. One second. I think I need to clarify. Make
does not do C++ compilation out of the box, nor any other random
possible build kind. You need to write some logic in makefile to
handle this new kind of build. My new tool will be effectively the
same: it won't handle any arbitrary build kind out of the box - it
won't be magic, but it will be simple and quick to extend it to handle
X new build kind, just like make. The difference is that I'm strictly
enforcing the separation of build kind logic from the average
developer who just instantiates an already defined macro. If need be,
the average developer can add a new macro, but it will not be in an
interpreted language ala make so it will be much faster, the macro
definition cannot be in arbitrary build script file ala make which
will allow much easier auditing of incremental correctness, and it
will make it harder for a unknowledgeable developer to break the build
because the build system makes it exceptionally hard to do so. I think
an analogy which applies: "private, protected, public, const" are
technically unnecessary for perfect developers, but we recognize their
utility in protecting us from ourselves. (No, this is not a proof by
analogy. I'm just trying to explain my case.)

Joshua Maurice

unread,
Aug 4, 2010, 3:50:11 PM8/4/10
to

Sorry, I just realized a much better way to put this:

This ties back to answer an earlier question: "why isn't imposing a
'build style' restriction, plus requiring clean builds on known
unhandled cases, enough?"

The tools are meant to aid us developers by automating the dependency
analysis. If a system can handle all deltas (aka full incremental
correctness), it's easy to write, it has low overhead, it is as
extensible, and it does better at skipping unnecessary build steps,
then I see absolutely no reason to punt, yet you advocate punting.
Given these facts, the outcome seems clear. Only inertia and ignorance
is keeping us with the obviously inferior tool.

However, I have no problem with Keith and his build system. It's
fully incrementally correct, it's already written, it has low overhead
[supposedly - I'm still somewhat skeptical on that point], it's as
extensible as Keith needs, and it skips most of the unnecessary build
steps. It doesn't work for me and for the general case because it's
not extensible enough for the other kinds of build steps I need while
maintaining full incremental correctness, and it has been unacceptably
slow in my experience compared to the alternatives.
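
For concreteness, the kind of automated dependency analysis the common
tools give you is the usual GNU Make + gcc idiom - a sketch of common
practice with made-up file names, not of my tool or of Keith's:

SRCS := main.cpp util.cpp
OBJS := $(SRCS:.cpp=.o)

app: $(OBJS)
	$(CXX) -o $@ $^

# -MMD makes the compiler write a .d file listing every header each
# .cpp actually included; -MP adds dummy targets so a deleted header
# doesn't abort the next run. Recipe lines are indented with a tab.
%.o: %.cpp
	$(CXX) $(CXXFLAGS) -MMD -MP -c -o $@ $<

-include $(OBJS:.o=.d)

That catches edited sources and headers, but on its own it does not
notice changed compiler flags or a compiler upgrade, and it cannot see
a header that newly shadows another one on the include path - exactly
the sort of deltas I mean by "full incremental correctness".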

Jorgen Grahn

unread,
Aug 6, 2010, 5:33:25 AM8/6/10
to
On Wed, 2010-08-04, Joshua Maurice wrote:
...

>> [About being able to specify other actions than compiling and linking]
>>
>> >> Most people (at least me) expect a build tool to be able to run
>> >> programs which the build tool has never heard of. I often have a Perl
>> >> script generate some table, a shell script formatting the documentation,
>> >> or some tool prepare the distributable stuff (e.g. RPMs for some
>> >> Linuxes, or some odd format for loading into PROM).
>>
>> > But no build tool actually does this. At best, they just provide a
>> > framework for the new compilation step. GNU Make just provides a
>> > framework. The build tool which I'm writing just provides a framework.
>>
>> Yes, but it seemed to me you considered *not* providing that. That's
>> why I pointed out that it's important to many of us.
...

> I'm not quite following. One second - I think I need to clarify. Make
> does not do C++ compilation out of the box, nor any other arbitrary
> kind of build step; you need to write some logic in the makefile to
> handle the new kind of build. My new tool will be effectively the
> same: it won't handle any arbitrary build kind out of the box - it
> won't be magic - but it will be simple and quick to extend it to
> handle a new build kind, just like make.

OK, then I misunderstood. A simple and quick way to handle new
"compilers" and "languages" is all I ask for in a make replacement.

Joshua Maurice

unread,
Sep 24, 2010, 5:43:35 PM9/24/10
to
On Jul 28, 6:28 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
<snip comparison of my make clone and GNU Make 3.81>

So, my company's management decided that we're not in the business of
making and selling make clones (pardon the pun), and some saw that we
might benefit in the future if this situation is improved, so they let
me open source it. I need to work out the details still, but I hope to
publish the source to my make clone in the near future.

I think that management has said it prefers a license with copyleft
and with an explicit linking exception. Anyone have any preferences on
this? Anyone know any good articles I could peruse? My default
position for copyleft and explicit linking exception is GNU LGPL, but
I'm not particularly well educated on this.

Second, where would I post the code? Sourceforge I guess?

Joshua Maurice

unread,
Sep 24, 2010, 7:44:38 PM9/24/10
to

So, sorry to be possibly greatly off topic, but I figure it relates to
programming in C++ however tenuously, and it relates to this thread a
bit more strongly.

So, I just reviewed basic copyright law, and the common licenses in
use, including GNU GPL, GNU LGPL, Boost, BSD, and MIT.

As I see it, my basic options are:

1- Full viral copyleft which tries to assert that linking creates a
protected derivative work, ex: GNU GPL.
2- Full viral copyleft with an explicit linking exception, ex: GNU
LGPL.
3- Full viral on source only, aka an explicit binary exception, ex:
Boost, BSD, MIT licenses.

I wouldn't want someone to develop a plugin for my build framework,
then be allowed to distribute the build framework binaries but without
the source code to that very essential plugin. If I understand it
correctly, an explicit linking exception would allow such a thing,
thus I lean towards full GNU GPL. (I understand that the issue of
whether linking creates a protected derivative work has not been
decided by the courts, and there's lots of reasonable arguments on
both sides.)

However, I dislike the verbosity of the GNU GPL and the GNU LGPL. Is
all of that legally necessary? I would really prefer something shorter
and simpler which basically has the same essence, and causes as little
license conflict as possible.

Any comments?

I think my management is leaning towards GNU LGPL. I am not privy to
why. I just heard a one-off comment from a manager which I think
included "open source", "linking exception", and "not GNU GPL".

Paavo Helde

unread,
Sep 25, 2010, 3:55:29 AM9/25/10
to
Joshua Maurice <joshua...@gmail.com> wrote in news:6a3693d6-5d1d-
41fb-88de-8...@p22g2000pre.googlegroups.com:

> So, I just reviewed basic copyright law, and the common licenses in
> use, including GNU GPL, GNU LGPL, Boost, BSD, and MIT.
>
> As I see it, my basic options are:
>
> 1- Full viral copyleft which tries to assert that linking creates a
> protected derivative work, ex: GNU GPL.
> 2- Full viral copyleft with an explicit linking exception, ex: GNU
> LGPL.
> 3- Full viral on source only, aka an explicit binary exception, ex:
> Boost, BSD, MIT licenses.

In our company (producing commercial software) we are using multiple
external libraries for various peripheral tasks, both closed and open
source. However, if something is licensed under the GPL, it is
automatically excluded for us; we cannot use it, however good it might
be, and are bound to reinvent wheels or use some other library. The
LGPL, on the other hand, is OK.

> I wouldn't want someone to develop a plugin for my build framework,
> then be allowed to distribute the build framework binaries but without
> the source code to that very essential plugin. If I understand it
> correctly, an explicit linking exception would allow such a thing,
> thus I lean towards full GNU GPL. (I understand that the issue of
> whether linking creates a protected derivative work has not been
> decided by the courts, and there's lots of reasonable arguments on
> both sides.)

For us it is the other way around. We are providing a large framework
which has its own extension libraries (effectively plugins). Now if one
plugin were somehow linked to some GPL code and loaded into our process,
it might arguably turn all of our large code base into GPL. I personally
would have nothing against this, but the management thinks otherwise. And
of course it would create a lot of problems with closed source libraries
we are using in other parts of the system.



> However, I dislike the verbosity of the GNU GPL and the GNU LGPL. Is
> all of that legally necessary? I would really prefer something shorter
> and simpler which basically has the same essence, and causes as little
> license conflict as possible.

I have understood that the verbosity is needed, and that MIT-style
licenses are too weak if you ever want to have any say about the code you
have written. But IANAL, so don't rely on my words on that.

Regards
Paavo

Öö Tiib

unread,
Sep 25, 2010, 6:23:18 AM9/25/10
to
On Sep 25, 2:44 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>
> I wouldn't want someone to develop a plugin for my build framework,
> then be allowed to distribute the build framework binaries but without
> the source code to that very essential plugin. If I understand it
> correctly, an explicit linking exception would allow such a thing,
> thus I lean towards full GNU GPL. (I understand that the issue of
> whether linking creates a protected derivative work has not been
> decided by the courts, and there's lots of reasonable arguments on
> both sides.)

Then use the GPL. As I understand it, this is a separate tool for
integrating a variety of other tools into the build process. The
integration happens without linking, and the tool will probably be
distributed separately from the compilers, linkers, and whatever other
tools it integrates, so I do not see why you would need the GNU LGPL.

> However, I dislike the verbosity of the GNU GPL and the GNU LGPL. Is
> all of that legally necessary? I would really prefer something shorter
> and simpler which basically has the same essence, and causes as little
> license conflict as possible.

There are hundreds of countries on our planet. In some places parts
of the GPL text may be unnecessary, and in other places it may not be
sufficient to protect your work from usages that violate the license.
You do not need to copy the whole license into each source code file,
so why does the verbosity worry you? Put a file with the license into
the distribution and refer to it from each source code file. If it is
the GNU GPL of some version, then no one will ever read it anyway;
everybody already knows what it is about.

Joshua Maurice

unread,
Sep 25, 2010, 6:31:57 AM9/25/10
to
On Sep 25, 12:55 am, Paavo Helde <myfirstn...@osa.pri.ee> wrote:
> Joshua Maurice <joshuamaur...@gmail.com> wrote in news:6a3693d6-5d1d-
> 41fb-88de-8945d01a5...@p22g2000pre.googlegroups.com:

As Öö Tiib put it else-thread, I think your complaints do not apply.
Presumably, you use GNU gcc or GNU Make on some platforms, and offhand
both are GNU GPL. However, that doesn't force your product built with
GNU gcc and GNU Make to be GNU GPL.

Ian Collins

unread,
Sep 25, 2010, 6:36:43 AM9/25/10
to
On 09/25/10 10:31 PM, Joshua Maurice wrote:
>
> As Öö Tiib put it else-thread, I think your complaints do not apply.

> Presumably, you use GNU gcc or GNU Make on some platforms, and offhand
> both are GNU GPL. However, that doesn't force your product built with
> GNU gcc and GNU Make to be GNU GPL.

What you build with is irrelevant; it's what you link with that causes
license problems.

--
Ian Collins

Paavo Helde

unread,
Sep 25, 2010, 12:23:53 PM9/25/10
to
Joshua Maurice <joshua...@gmail.com> wrote in
news:1bba3829-022e-4259...@k1g2000prl.googlegroups.com:

Yes, if it's a separate tool, the GPL would be OK. However, the fact
that you even consider the LGPL indicates that your tool can be used
by linking other software with it. If that is useful only for extra
plugins, then indeed the GPL would probably be a good idea.

Regards
Paavo

Francesco S. Carta

unread,
Sep 25, 2010, 12:41:53 PM9/25/10
to

Once we are there, I'd like to ask a somewhat silly question, just for
confirmation.

Assume I have some GPL code and I make an executable out of it. I'm
only expected to provide the license and the source code for that
binary; that should not affect the licensing of any other executable
that "uses" the GPL executable by calling it and using its output,
even if I happen to ship all of it together in a single installer. Is
this interpretation correct?

To put it in other words, take this as the content of an installer:
-----------------
gpl_program.exe
gpl_program.cpp
gpl_program.license
proprietary_program.exe (calls gpl_program.exe and depends on its output)
-----------------

The fact that I ship all the above with a single installer does not
affect the licensing of proprietary_program.exe, which can be left fully
copyrighted and closed source, am I right?

(I know this is not a lawyers' lounge, I'm just asking for some common
sense interpretations)

--
Francesco S. Carta
http://fscode.altervista.org

Öö Tiib

unread,
Sep 25, 2010, 3:51:25 PM9/25/10
to
On Sep 25, 7:41 pm, "Francesco S. Carta" <entul...@gmail.com> wrote:

> Ian Collins <ian-n...@hotmail.com>, on 25/09/2010 22:36:43, wrote:
>
> > On 09/25/10 10:31 PM, Joshua Maurice wrote:
>
> >> As Öö Tiib put it else-thread, I think your complaints do not apply.

IANAL either. AFAIK it is sometimes done the way you describe. There
may be some thin nuances, of course - who knows. Isn't Apple's Xcode
developer suite, for example, bundled with a (modified by Apple) GNU
Compiler Collection?

Bo Persson

unread,
Sep 26, 2010, 7:08:41 AM9/26/10
to
Öö Tiib wrote:
> On Sep 25, 7:41 pm, "Francesco S. Carta" <entul...@gmail.com> wrote:
>> Ian Collins <ian-n...@hotmail.com>, on 25/09/2010 22:36:43, wrote:
>>
>>> On 09/25/10 10:31 PM, Joshua Maurice wrote:
>>
>>>>> As Öö Tiib put it else-thread, I think your complaints do not

Yes, but we also notice that Apple stopped at gcc 4.2, which is
licensed under GPL v2.

This is not simple at all. Not being a lawyer, I would still split the
example above into two separate installers.


Bo Persson


Francesco S. Carta

unread,
Sep 26, 2010, 1:11:12 PM9/26/10
to
Bo Persson <b...@gmb.dk>, on 26/09/2010 13:08:41, wrote:

> Öö Tiib wrote:
>> On Sep 25, 7:41 pm, "Francesco S. Carta"<entul...@gmail.com> wrote:
>>> Ian Collins<ian-n...@hotmail.com>, on 25/09/2010 22:36:43, wrote:
>>>
>>>> On 09/25/10 10:31 PM, Joshua Maurice wrote:
>>>

>>>>> As Öö Tiib put it else-thread, I think your complaints do not

Thank you both for your comments. This is a problem I'm going to face;
I'll speak about it with the people who will distribute my code
(assuming I'll be able to do what they ask me to). I think that having
such a "helper" program licensed under BSD instead of GPL would
simplify my case.

Joshua Maurice

unread,
Sep 29, 2010, 5:52:49 PM9/29/10
to

I was thinking about this more, and the GPL might preclude integrating
my new build system with something such as Eclipse (or anything else).
As such, I'm heavily leaning towards the LGPL.

In other news, my company's lawyers have given an ETA of a couple of
weeks for figuring out the licensing and giving me permission to post
the source code to my Make drop-in replacement, and to the other build
system - my newer one, which isn't a simple drop-in Make replacement.
