tl;dr We're proposing moving away from Makefiles as the sole source of
the build system definition. This will lead to faster build times.
Bikeshedding^wFeedback on the file format is requested.

The existing build system is defined by Makefile.in's scattered around
the source tree (typically one Makefile.in per directory). At configure
time, these Makefile.in's get preprocessed into Makefiles using simple
variable substitution. Then make/pymake is let loose on the result. It
is a very traditional model.

We are attempting to move to a model where the build definition is
generic and data-driven. Treating the build definition as data
(rather than as the glorified shell scripts that Makefiles are) will
allow us to take that data and convert it into formats understood by
other, better/faster build backends, such as non-recursive make files,
Tup, Ninja, or even Visual Studio.
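To make the "build definition as data" idea a bit more concrete, here is a rough Python sketch. Everything in it is an assumption made up for illustration: the dictionary layout, the "dist/" output directory, and the function names to_make and to_ninja. It only shows that one data structure can feed more than one backend; it is not a proposed design.

```python
# Hypothetical build definition expressed as plain data. The keys and
# values are invented for this example.
BUILD_DEFINITION = {
    "dirs": ["base", "widget"],
    "copy_files": {"base": ["config.h"], "widget": ["theme.css"]},
}


def to_make(definition):
    """Render the definition as a single non-recursive make fragment."""
    targets = []
    rules = []
    for directory in definition["dirs"]:
        for name in definition["copy_files"].get(directory, []):
            target = "dist/%s" % name
            targets.append(target)
            rules.append("%s: %s/%s" % (target, directory, name))
            rules.append("\tcp $< $@")
    return "\n".join(["all: " + " ".join(targets)] + rules) + "\n"


def to_ninja(definition):
    """Render the same definition as Ninja build statements."""
    lines = ["rule cp", "  command = cp $in $out"]
    for directory in definition["dirs"]:
        for name in definition["copy_files"].get(directory, []):
            lines.append("build dist/%s: cp %s/%s" % (name, directory, name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_make(BUILD_DEFINITION))
    print(to_ninja(BUILD_DEFINITION))
```

Neither function knows anything about the other; adding another backend would mean adding another function that walks the same data.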
Up until now, the focus has been on making Makefile.in's themselves
generic and data-driven [1]. We would use pymake's API to parse, load,
and extract data from Makefile.in's to construct the build definition.
In the long run, we'd realize that using make files for data definition
was silly (and a large foot gun) and thus we would switch to something else.

After a long IRC conversation, Mike Hommey and I concluded that we want
to begin the transition away from Makefile.in's ASAP.

Essentially, the proposal is to move (not duplicate) some data from
Makefile.in's into new files. Initially, this would include things like
subdirectories to descend into and files to copy/preprocess. Simple
stuff to start with. Eventually, scope would likely increase to cover
the entirety of the build system definition (like compiling), rendering
Makefile.in's obsolete. But, it will take a *long* time before we get there.

In the new world, the source of truth for the build system is jointly
defined by existing Makefile.in's and whatever these new files are that
we create. I'll call these not-yet-existing files "build manifest
files." Somewhere in the build process we read in the build manifest
files and generate output for the build backend of choice.

Our existing recursive make backend should integrate with this
seamlessly. Instead of a dumb variable substitution phase for
configuring the build backend, we'll have some additional logic to write
out new make files derived from the contents of the build manifest
files. This is similar to the approach I've taken in build splendid [2].
The only difference is that the build definition lives somewhere other
than Makefile.in's.
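As a rough illustration of "read the manifests, then write out make files," here is a small Python sketch. The manifests are held in an in-memory dictionary so the example is self-contained; a real implementation would read files from the source tree. The manifest keys ("dirs", "copy"), the output file name backend.mk, and the make variables DIRS and COPY_FILES are all hypothetical names chosen for this example.

```python
# Hypothetical manifests keyed by directory ("" is the top of the tree).
MANIFESTS = {
    "": {"dirs": ["base", "widget"]},
    "base": {"dirs": ["base/src"], "copy": ["config.h"]},
    "base/src": {"copy": ["version.txt"]},
    "widget": {"copy": ["theme.css"]},
}


def walk(manifests, directory=""):
    """Yield (directory, manifest) pairs depth-first from the root."""
    manifest = manifests[directory]
    yield directory, manifest
    for child in manifest.get("dirs", []):
        yield from walk(manifests, child)


def write_make_fragments(manifests):
    """Return {path: contents} for the make files a backend would write."""
    output = {}
    for directory, manifest in walk(manifests):
        lines = ["# Generated from the build manifest; do not edit."]
        files = manifest.get("copy", [])
        if files:
            lines.append("COPY_FILES := " + " ".join(files))
        # Child directories are written relative to the current directory.
        subdirs = [d.rsplit("/", 1)[-1] for d in manifest.get("dirs", [])]
        if subdirs:
            lines.append("DIRS := " + " ".join(subdirs))
        path = (directory + "/" if directory else "") + "backend.mk"
        output[path] = "\n".join(lines) + "\n"
    return output


if __name__ == "__main__":
    for path, contents in sorted(write_make_fragments(MANIFESTS).items()):
        print("==> %s" % path)
        print(contents)
```

The point is only that the generated make files become an output of the build definition rather than the definition itself.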
We don't have details on how exactly the migration will be carried
out. But, it should be seamless. So, unless you touch the build
system, you should be able to continue living in blissful ignorance.

If you have any concerns over this transition, please voice them.

File Format
===========

I hinted at bikeshedding in the tl;dr. We want feedback on the file
format to use for the new build manifest files. The requirements are as
follows (feel free to push back on these):
1. Easy for humans to grok and edit. An existing and well-known format
is preferred. We don't want a steep learning curve here.
2. Simple for computers to parse. We will use Python to load the build
manifest files. Python can do just about anything, so I'm not too
worried here.
3. Efficient for computers to load. As these files need to be consulted
to perform builds, we want to minimize the overhead for reading them
into (Python) data structures.
4. Native support for lists and maps. Make files only support strings.
The hacks this results in are barely tolerable.
5. Ability to handle conditionals. We need to be able to conditionally
define things based on the presence or value of certain "variables."
e.g. "if the current OS is Linux, append this value to this list." I
quote "variables" because there may not be a full-blown variable system
here, just magic values that come from elsewhere and are addressed by
some convention.
6. Ability to perform ancillary functionality, such as basic string
transforms. I'm not sure exactly what would be needed here. Looking at
make's built-in functions might be a good place to start. We may be able
to work around this by shifting functionality to side-effects from
specially named variables, function calls, etc. I really don't know.
7. Evaluation must be free from unknown side-effects. If there are
unknown side-effects from evaluation, this could introduce race
conditions, order dependency, etc. We don't want that. Evaluation must
either be sandboxed to ensure nothing can happen or must be able to be
statically analyzed by computers to ensure it doesn't do anything it
isn't supposed to.
8. Doesn't introduce crazy build dependencies. We /might/ be able to get
away with something well-known. But, new build dependencies are new
build dependencies.
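For requirement 7, one way the "statically analyzed" option could look, if the manifests happened to use Python syntax, is a whitelist check over the parsed syntax tree. This is only a sketch under that assumption (and assuming Python 3.8 or newer); the set of allowed node types and the sample manifests are invented for illustration.

```python
import ast

# Syntax a manifest may contain in this sketch: assignments, list/dict
# literals, "if" blocks, comparisons, and "+" / "+=". Anything else
# (imports, calls, attribute access, loops, ...) is rejected.
ALLOWED_NODES = (
    ast.Module, ast.Assign, ast.AugAssign, ast.If,
    ast.Name, ast.Load, ast.Store, ast.Constant, ast.List, ast.Dict,
    ast.Compare, ast.Eq, ast.NotEq, ast.In, ast.NotIn,
    ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not,
    ast.BinOp, ast.Add,
)

# Two made-up manifests: one that stays within the whitelist, one that
# tries to touch the filesystem.
GOOD = 'DIRS = ["base"]\nif OS == "Linux":\n    DIRS += ["gtk"]\n'
BAD = 'import os\nDIRS = os.listdir(".")\n'


def find_violations(source):
    """Return (line, node type) pairs for every disallowed construct."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ALLOWED_NODES):
            violations.append((getattr(node, "lineno", 0), type(node).__name__))
    return violations


if __name__ == "__main__":
    print(find_violations(GOOD))
    print(find_violations(BAD))
```

Nothing is executed here, so a manifest that fails the check never gets a chance to have side-effects.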
Ideally, the data format is static and doesn't require an interpreter
(something like YAML or JSON). Unfortunately, the need for conditionals
makes that, well, impossible (I think).

We could go the route of GYP and shoehorn conditionals into a static
document (JSON) [3]. Actually, using GYP itself is an option! Although,
some really don't like the data format because of this shoehorning (I
tend to agree).
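To show what shoehorning conditionals into a static document can look like, here is a Python sketch loosely modeled on the "conditions" lists in GYP files. It is not GYP's implementation: the tiny condition grammar (NAME=='value' only), the assumption that every value is a list, and the evaluate function are all made up for this example.

```python
import json
import re

# A static JSON manifest with conditions embedded as strings.
MANIFEST_JSON = """
{
  "sources": ["common.c"],
  "conditions": [
    ["OS=='linux'", {"sources": ["linux.c"]}],
    ["OS=='win'", {"sources": ["win.c"]}]
  ]
}
"""

# Only the form NAME=='value' is understood in this sketch.
CONDITION = re.compile(r"^\s*(\w+)\s*==\s*'([^']*)'\s*$")


def evaluate(manifest, variables):
    """Merge the settings of every matching condition into the base lists."""
    result = {k: list(v) for k, v in manifest.items() if k != "conditions"}
    for expression, settings in manifest.get("conditions", []):
        match = CONDITION.match(expression)
        if match is None:
            raise ValueError("unsupported condition: %r" % expression)
        name, expected = match.groups()
        if variables.get(name) == expected:
            for key, values in settings.items():
                result.setdefault(key, []).extend(values)
    return result


if __name__ == "__main__":
    manifest = json.loads(MANIFEST_JSON)
    print(evaluate(manifest, {"OS": "linux"}))
```

The document stays valid JSON, but the conditions are really a small language hidden inside string values, which is the part some people object to.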
On the other end of the spectrum, we could have the build manifest files
be Python "scripts." This solves a lot of problems around needing
functionality in the manifest files. But, it would be a potential foot
gun. See requirement #7.
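For a feel of the Python "scripts" option, here is a minimal sketch of loading a manifest by executing it in a restricted namespace. The names CONFIG, DIRS, and load_manifest are hypothetical, as is the convention that upper-case names are the manifest's output. Note the caveat in the comments: emptying the builtins blocks the obvious things, but it is not a real sandbox, so this does not by itself satisfy requirement 7.

```python
# A made-up manifest written as Python source.
MANIFEST_SOURCE = """
DIRS = ['base', 'widget']
if CONFIG['OS'] == 'Linux':
    DIRS += ['widget/gtk']
"""


def load_manifest(source, config):
    """Execute manifest source and return the upper-case names it defined."""
    # An empty __builtins__ removes open(), __import__(), and friends from
    # the manifest's namespace. This is NOT a true security boundary;
    # determined code can still escape it. It only guards against
    # accidental side-effects.
    namespace = {"__builtins__": {}, "CONFIG": dict(config)}
    exec(compile(source, "<manifest>", "exec"), namespace)
    return {
        name: value
        for name, value in namespace.items()
        if name.isupper() and name != "CONFIG"
    }


if __name__ == "__main__":
    print(load_manifest(MANIFEST_SOURCE, {"OS": "Linux"}))
    print(load_manifest(MANIFEST_SOURCE, {"OS": "WINNT"}))
```

Conditionals, lists, and string operations all come for free this way, which is the appeal; the cost is that the loader has to work to keep manifests from doing anything else.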
Or, there is something in the middle. Does anyone know of anything that
can satisfy these requirements? I think Lua is perfect for this (it was
invented to be a configuration language after all). But, I'm not sure it
satisfies #1 or #8.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=774049
[2] http://gregoryszorc.com/blog/2012/08/15/build-firefox-faster-with-build-splendid/
[3] https://code.google.com/p/gyp/wiki/GypLanguageSpecification