Moving Away from Makefile's

Moving Away from Makefile's Gregory Szorc 8/21/12 4:36 PM
tl;dr We're proposing moving away from Makefile's as the sole source of
the build system definition. This will lead to faster build times.
Bikeshedding^wFeedback on the file format is requested.

The existing build system is defined by Makefile.in's scattered around
the source tree (typically one Makefile.in per directory). At configure
time, these Makefile.in's get preprocessed into Makefile's using simple
variable substitution. Then make/pymake is let loose on the result. It
is a very traditional model.
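
To make the substitution step concrete, here is a minimal Python sketch of
that kind of preprocessing. It assumes autoconf-style @NAME@ tokens; the
SUBSTS values and the preprocess() helper are made up for illustration and
are not the actual configure code.

```python
import re

# Hypothetical substitution values; a real configure run would produce many more.
SUBSTS = {"srcdir": "/src/mozilla/foo", "DEPTH": "../.."}


def preprocess(template, substs):
    """Replace each @NAME@ token with its configured value.

    Unknown names are left untouched so the gap stays visible in the output.
    """
    def repl(match):
        return substs.get(match.group(1), match.group(0))

    return re.sub(r"@(\w+)@", repl, template)


# A toy Makefile.in and the Makefile it turns into.
makefile_in = "DEPTH = @DEPTH@\nsrcdir = @srcdir@\nVPATH = @srcdir@\n"
makefile = preprocess(makefile_in, SUBSTS)
print(makefile)
```

Nothing here understands the build: the output is still an opaque make file
that only make/pymake can interpret.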

We are attempting to move to a model where the build definition is
generic and data-driven. By treating the build definition as data
(rather than a glorified shell script that is Makefiles), this will
allow us to take that data and convert it into formats understood by
other, better/faster build backends, such as non-recursive make files,
Tup, Ninja, or even Visual Studio.

Up until now, the focus has been on making Makefile.in's themselves
generic and data-driven [1]. We would use pymake's API to parse, load,
and extract data from Makefile.in's to construct the build definition.
In the long run, we'd realize that using make files for data definition
was silly (and a large foot gun) and thus we would switch to something else.

After a long IRC conversation, Mike Hommey and I concluded that we want
to begin the transition away from Makefile.in's ASAP.

Essentially, the proposal is to move (not duplicate) some data from
Makefile.in's into new files. Initially, this would include things like
subdirectories to descend into and files to copy/preprocess. Simple
stuff to start with. Eventually, scope would likely increase to cover
the entirety of the build system definition (like compiling), rendering
Makefile.in's obsolete. But, it will take a *long* time before we get there.

In the new world, the source of truth for the build system is jointly
defined by existing Makefile.in's and whatever these new files are that
we create. I'll call these not-yet-existing files "build manifest
files." Somewhere in the build process we read in the build manifest
files and generate output for the build backend of choice.

Our existing recursive make backend should integrate with this
seamlessly. Instead of a dumb variable substitution phase for
configuring the build backend, we'll have some additional logic to write
out new make files derived from the contents of the build manifest
files. This is similar to the approach I've taken in build splendid [2].
The only difference is the build definition is living in somewhere not
Makefile.in's.

We don't have details on how exactly the migration will be carried
out. But, it should be seamless. So, unless you touch the build
system, you should be able to continue living in blissful ignorance.

If you have any concerns over this transition, please voice them.

File Format
===========

I hinted at bikeshedding in the tl;dr. We want feedback on the file
format to use for the new build manifest files. The requirements are as
follows (feel free to push back on these):

1. Easy for humans to grok and edit. An existing and well-known format
is preferred. We don't want a steep learning curve here.
2. Simple for computers to parse. We will use Python to load the build
manifest files. Python can do just about anything, so I'm not too
worried here.
3. Efficient for computers to load. As these files need to be consulted
to perform builds, we want to minimize the overhead for reading them
into (Python) data structures.
4. Native support for lists and maps. Make files only support strings.
The hacks this results in are barely tolerable.
5. Ability to handle conditionals. We need to be able to conditionally
define things based on the presence or value of certain "variables."
e.g. "if the current OS is Linux, append this value to this list." I
quote "variables" because there may not be a full-blown variable system
here, just magic values that come from elsewhere and are addressed by
some convention.
6. Ability to perform ancillary functionality, such as basic string
transforms. I'm not sure exactly what would be needed here. Looking at
make's built-in functions might be a good place to start. We may be able
to work around this by shifting functionality to side-effects from
specially named variables, function calls, etc. I really don't know.
7. Evaluation must be free from unknown side-effects. If there are
unknown side-effects from evaluation, this could introduce race
conditions, order dependency, etc. We don't want that. Evaluation must
either be sandboxed to ensure nothing can happen or must be able to be
statically analyzed by computers to ensure it doesn't do anything it
isn't supposed to.
8. Doesn't introduce crazy build dependencies. We /might/ be able to get
away with something well-known. But, new build dependencies are new
build dependencies.

Ideally, the data format is static and doesn't require an interpreter
(something like YAML or JSON). Unfortunately, the need for conditionals
makes that, well, impossible (I think).

We could go the route of GYP and shoehorn conditionals into a static
document (JSON) [3]. Actually, using GYP itself is an option! Although,
some really don't like the data format because of this shoehorning (I
tend to agree).
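
For illustration, here is a rough Python sketch of how conditionals can be
layered onto a static JSON document. The "conditions" key is modeled loosely
on GYP, but the resolve() helper and its equality-only matching are
assumptions of this sketch, not GYP's implementation.

```python
import json
import re

# A static document; the conditions are plain strings that something else
# has to interpret.
DOCUMENT = json.loads(r'''
{
  "sources": ["common.c"],
  "conditions": [
    ["OS==\"linux\"", {"sources": ["linux.c"]}],
    ["OS==\"win\"", {"sources": ["win.c"]}]
  ]
}
''')

# Only NAME=="value" is understood in this sketch.
CONDITION = re.compile(r'^(\w+)=="([^"]*)"$')


def resolve(document, variables):
    """Return the document with every matching condition merged in."""
    result = {k: list(v) for k, v in document.items() if k != "conditions"}
    for expression, additions in document.get("conditions", []):
        match = CONDITION.match(expression)
        if match is None:
            raise ValueError("unsupported condition: " + expression)
        name, expected = match.groups()
        if variables.get(name) == expected:
            for key, values in additions.items():
                result.setdefault(key, []).extend(values)
    return result


print(resolve(DOCUMENT, {"OS": "linux"}))
```

The condition strings are a tiny language of their own, which is the
shoehorning being complained about.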

On the other end of the spectrum, we could have the build manifest files
be Python "scripts." This solves a lot of problems around needing
functionality in the manifest files. But, it would be a potential foot
gun. See requirement #7.

Or, there is something in the middle. Does anyone know of anything that
can satisfy these requirements? I think Lua is perfect for this (it was
invented to be a configuration language after all). But, I'm not sure it
satisfies #1 nor #8.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=774049
[2]
http://gregoryszorc.com/blog/2012/08/15/build-firefox-faster-with-build-splendid/
[3] https://code.google.com/p/gyp/wiki/GypLanguageSpecification
Re: Moving Away from Makefile's xunxun 8/21/12 6:12 PM
On 2012/8/22 7:36, Gregory Szorc wrote:
> tl;dr We're proposing moving away from Makefile's as the sole source
> of the build system definition. This will lead to faster build times.
> Bikeshedding^wFeedback on the file format is requested.
>
> The existing build system is defined by Makefile.in's scattered around
> the source tree (typically one Makefile.in per directory). At
> configure time, these Makefile.in's get preprocessed into Makefile's
> using simple variable substitution. Then make/pymake is let loose on
> the result. It is a very traditional model.
>
> We are attempting to move to a model where the build definition is
> generic and data-driven. By treating the build definition as data
> (rather than a glorified shell script that is Makefiles), this will
> allow us to take that data and convert it into formats understood by
> other, better/faster build backends, such as non-recursive make files,
> Tup, Ninja, or even Visual Studio.
I don't like VS projects, and I think we will find that maintaining them
is a lot of work in future development.
>
> Up until now, the focus has been on making Makefile.in's themselves
> generic and data-driven [1]. We would use pymake's API to parse, load,
> and extract data from Makefile.in's to construct the build definition.
> In the long run, we'd realize that using make files for data
> definition was silly (and a large foot gun) and thus we would switch
> to something else.
At present we can also use pymake. When your work is done, will we
abandon pymake?
And will the work become a new development branch?
> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=774049
> [2]
> http://gregoryszorc.com/blog/2012/08/15/build-firefox-faster-with-build-splendid/
> [3] https://code.google.com/p/gyp/wiki/GypLanguageSpecification
> _______________________________________________
> dev-builds mailing list
> dev-b...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-builds


--
Best Regards,
xunxun

Re: Moving Away from Makefile's Gregory Szorc 8/21/12 8:03 PM
On 8/21/2012 6:12 PM, xunxun wrote:
> On 2012/8/22 7:36, Gregory Szorc wrote:
>> Up until now, the focus has been on making Makefile.in's themselves
>> generic and data-driven [1]. We would use pymake's API to parse,
>> load, and extract data from Makefile.in's to construct the build
>> definition. In the long run, we'd realize that using make files for
>> data definition was silly (and a large foot gun) and thus we would
>> switch to something else.
> At present we can also use pymake. When your work is done, will we
> abandon pymake?

As long as we are supporting building with make on Windows, we will
support pymake.

> And will the work become a new development branch?

It's too early to tell how this work will play out. I imagine the goal
will be to merge work into mozilla-central as soon as it is ready to
minimize the potential for bit rot. We don't want two sources of truth
for build system data which can get out of sync.
Re: Moving Away from Makefile's Blair McBride 8/21/12 8:56 PM
On 22/08/2012 11:36 a.m., Gregory Szorc wrote:
> I think Lua is perfect for this (it was invented to be a configuration
> language after all). But, I'm not sure it satisfies #1 nor #8.

+1 for Lua - it seems perfect for this. For #1, I find it far easier to
read (and write) than Gyp, when it comes to things like conditionals.
For #8, we could just ship the entire runtime in the tree for Tier 1
platforms (it's small enough!), then it's only an additional dependency
for the 0.01%.

- Blair
Re: Moving Away from Makefile's Mike Hommey 8/21/12 10:58 PM
But then, you have a chicken and egg kind of problem, where your build
system depends on building something...

My preference would go to simple, preprocessed .ini-like files. Or
somewhere between a .ini and a jar.mn.

Mike
Re: Moving Away from Makefile's Blair McBride 8/22/12 6:07 AM
On 22/08/2012 5:58 p.m., Mike Hommey wrote:
>> +1 for Lua - it seems perfect for this. For #1, I find it far easier
>> to read (and write) than Gyp, when it comes to things like
>> conditionals. For #8, we could just ship the entire runtime in the
>> tree for Tier 1 platforms (it's small enough!), then it's only an
>> additional dependency for the 0.01%.
> But then, you have a chicken and egg kind of problem, where your build
> system depends on building something...

When I say "runtime" I mean the actual pre-built runtime, not the source.

- Blair
Re: Moving Away from Makefile's Benjamin Smedberg 8/22/12 6:50 AM
On 8/21/2012 7:36 PM, Gregory Szorc wrote:
>
>
>
> On the other end of the spectrum, we could have the build manifest
> files be Python "scripts." This solves a lot of problems around
> needing functionality in the manifest files. But, it would be a
> potential foot gun. See requirement #7.
I don't think this would be a big deal. We could just enforce no side
effects at review, or with a small bit of python we could enforce some
basic restraints in code:

* allow imports from only a small whitelist of known-safe modules:
perhaps just 're', or disallow imports altogether (by modifying the
globals in the execution environment) and pre-import the safe modules
* "fix" output variables to ensure that they are the expected
string/list-of-strings/whatever types
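
A minimal sketch of those two restraints, assuming manifests are plain
Python source executed with exec(). The names load_manifest, SAFE_BUILTINS,
EXPECTED_TYPES, CONFIG, DIRS and MODULE are hypothetical. This catches
accidents; it is not a security boundary against hostile code.

```python
import re

# Builtins exposed to manifests. __import__ is deliberately absent, so any
# "import" statement inside a manifest raises ImportError.
SAFE_BUILTINS = {"len": len, "sorted": sorted}

# Output variables and the types they are required to have (hypothetical).
EXPECTED_TYPES = {"DIRS": list, "MODULE": str}


def load_manifest(source, config):
    """Execute manifest source in a restricted namespace and validate outputs."""
    env = {
        "__builtins__": dict(SAFE_BUILTINS),
        "re": re,                 # pre-imported known-safe module
        "CONFIG": dict(config),   # read-only by convention, copied defensively
        "DIRS": [],
    }
    exec(compile(source, "<manifest>", "exec"), env)

    result = {}
    for name, expected in EXPECTED_TYPES.items():
        if name not in env:
            continue
        value = env[name]
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be a {expected.__name__}")
        if expected is list and not all(isinstance(item, str) for item in value):
            raise TypeError(f"{name} must contain only strings")
        result[name] = value
    return result


MANIFEST = '''
DIRS += ["public", "src"]
if CONFIG["OS"] == "Linux":
    DIRS.append("linux")
MODULE = "example"
'''

print(load_manifest(MANIFEST, {"OS": "Linux"}))
```

Here the type check rejects a manifest that falls back to the make habit of
writing DIRS = "public src".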

I really think that python manifests are the best choice here, since
python is already optimized to parse them efficiently, the control
structures are fairly straightforward, and most people (both regular
engineers and build-system hackers) are going to know enough to get
started without confusion.

One thing that wasn't clear to me from the original post is whether we
are planning on automatically transforming some parts of the existing
makefiles (e.g. DIRS, EXPORTS, preprocessor stuff) into the new manifest
format, or whether the plan is to just migrate by hand. Or is that still
TBD?

--BDS

Re: Moving Away from Makefile's Mike Hommey 8/22/12 7:12 AM
Which you then need to ship as a windows binary, an osx binary, and a
linux binary. And that other platforms need to build or install
separately.

Mike
Re: Moving Away from Makefile's Dirkjan Ochtman 8/22/12 11:48 PM
On Wed, Aug 22, 2012 at 1:36 AM, Gregory Szorc <g...@mozilla.com> wrote:
> Up until now, the focus has been on making Makefile.in's themselves generic
> and data-driven [1]. We would use pymake's API to parse, load, and extract
> data from Makefile.in's to construct the build definition. In the long run,
> we'd realize that using make files for data definition was silly (and a
> large foot gun) and thus we would switch to something else.

Can you expand on that? From the discussion so far, JSON is not
expressive enough and Python is too expressive. There are some very
understandable reservations about inventing a new language, as well as
the desire for a "clean slate" (admittedly attractive). Why doesn't it
make sense to use a very-restricted dialect of Makefiles? pymake
already has a parser, the dumbing down of which to disallow arbitrary
shell expressions would supposedly be fairly straightforward. No one
would have to learn a new language, and you can start with the current
files, duplicate the parser inside pymake, then start to dumb it down
as you weed the complexity out of the Makefile.ins.

Cheers,

Dirkjan
Re: Moving Away from Makefile's Ted Mielczarek 8/24/12 7:32 AM
On Fri, Aug 24, 2012 at 9:17 AM, qheaden <qhe...@phaseshiftsoftware.com> wrote:
> Is there any special reason why an existing build system such as SCons couldn't be used as a new build system for Mozilla? I know the Mozilla source has a lot of special build instructions, but SCons does allow you to create your own special builders in Python code.

Build systems like SCons are just a different coat of paint over make.
They wouldn't really solve any of our problems; it'd just be
busy-work. In addition, SCons (among other build systems) tries to
solve more problems than we need, by providing the features of
autoconf as well as make. Finally, for SCons in particular, I have
doubts about its ability to scale to a project of Mozilla's size. KDE
tried to switch to SCons and failed, and wound up using CMake.

In short, most build systems suck at large scale. Almost any will
suffice for a small project, but for a project of Mozilla's size
there's no perfect solution.

-Ted
Re: Moving Away from Makefile's John Hopkins 8/24/12 8:05 AM
Suggestion: identify the ugliest sections of Makefile usage and use
those as a benchmark for evaluating different solutions.  ie. how could
it be implemented in SCons, pymake, etc.  or even, how could it be
reimplemented in Make in a clean fashion.

John

Re: Moving Away from Makefile's Brian Smith 8/24/12 12:42 PM
Gregory Szorc wrote:
> 4. Native support for lists and maps. Make files only support strings.
> The hacks this results in are barely tolerable.
>
> 5. Ability to handle conditionals. We need to be able to
> conditionally define things based on the presence or value of certain
>  "variables."
> e.g. "if the current OS is Linux, append this value to this list." I
> quote "variables" because there may not be a full-blown variable
> system here, just magic values that come from elsewhere and are
> addressed by some convention.
>
> 6. Ability to perform ancillary functionality, such as basic string
> transforms. I'm not sure exactly what would be needed here. Looking
> at make's built-in functions might be a good place to start. We may
> be able to work around this by shifting functionality to side-effects
> from specially named variables, function calls, etc. I really don't
> know.
>
> 7. Evaluation must be free from unknown side-effects. If there are
> unknown side-effects from evaluation, this could introduce race
> conditions, order dependency, etc. We don't want that. Evaluation
> must either be sandboxed to ensure nothing can happen or must be able
> to be statically analyzed by computers to ensure it doesn't do anything
> it isn't supposed to.

...

> On the other end of the spectrum, we could have the build manifest
> files be Python "scripts." This solves a lot of problems around
> needing functionality in the manifest files. But, it would be a
> potential foot gun. See requirement #7.

I do not think it is reasonable to require support for alternate build systems for all of Gecko/Firefox.

But, let's say we were to divide the build into three phases:
1. Generate any generated C/C++ source files.
2. Build all the C/C++ code into libraries and executables
3. Do everything else (build omnijar, etc.)

(I imagine phase 3 could probably run 100% concurrently with the first two phases).
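
A toy Python sketch of that scheduling idea, assuming phase 2 depends on
phase 1 and phase 3 depends on neither. The three functions are stand-ins
with made-up return values, not real build steps.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_sources():
    """Phase 1: produce generated C/C++ sources (placeholder)."""
    return ["generated.cpp"]


def compile_code(sources):
    """Phase 2: turn sources into objects (placeholder)."""
    return [source.replace(".cpp", ".o") for source in sources]


def package_everything_else():
    """Phase 3: everything that does not need compiled code (placeholder)."""
    return "omnijar"


def build():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Phase 3 starts right away, alongside the phase 1 -> phase 2 chain.
        packaging = pool.submit(package_everything_else)
        compiling = pool.submit(lambda: compile_code(generate_sources()))
        return compiling.result(), packaging.result()


print(build())
```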

It would be very nice if phase #2 ONLY could support msbuild (building with Visual Studio project files, basically), because this would allow smart editors'/IDEs' code completion and code navigation features to work very well, at least for the C/C++ source code. I think this would also greatly simplify the deployment of any static analysis tools that we would develop.

In addition, potentially it would allow Visual Studio's "Edit and Continue" feature to work. ("Edit and Continue" is a feature that allows you to make changes to the C++ source code and relink those changes into a running executable while execution is paused at a breakpoint, without restarting the executable.)

I think that if you look at the limitations of gyp, some (all?) of them are at least partially driven by the desire to provide such support. I am sure the advanced features that you list in (4), (5), (6), (7) are helpful, but they may make it difficult to support these secondary use cases.

That said, getting the build system to build as fast as it can is much more important.

Cheers,
Brian
Re: Moving Away from Makefile's Gregory Szorc 8/24/12 1:41 PM
On 8/24/12 12:42 PM, Brian Smith wrote:
> I do not think it is reasonable to require support for alternate build systems for all of Gecko/Firefox.
>
> But, let's say we were to divide the build into three phases:
> 1. Generate any generated C/C++ source files.
> 2. Build all the C/C++ code into libraries and executables
> 3. Do everything else (build omnijar, etc.)
>
> (I imagine phase 3 could probably run 100% concurrently with the first two phases).
>
> It would be very nice if phase #2 ONLY could support msbuild (building with Visual Studio project files, basically), because this would allow smart editors'/IDEs' code completion and code navigation features to work very well, at least for the C/C++ source code. I think this would also greatly simplify the deployment of any static analysis tools that we would develop.
>
> In addition, potentially it would allow Visual Studio's "Edit and Continue" feature to work. ("Edit and Continue" is a feature that allows you to make changes to the C++ source code and relink those changes into a running executable while execution is paused at a breakpoint, without restarting the executable.)
>
> I think that if you look at the limitations of gyp, some (all?) of them are at least partially driven by the desire to provide such support. I am sure the advanced features that you list in (4), (5), (6), (7) are helpful, but they may make it difficult to support these secondary use cases.
>
> That said, getting the build system to build as fast as it can is much more important.

Agreed.

Changing how we define the build config would enable us to do everything
you mentioned and more. With the current "architecture" of our
Makefile's, we effectively have different build phases called tiers. See
[1] for more. As much as I would love to split things up into more
distinct phases/tiers, the overhead for recursive make traversal would
be prohibitive. As far as prioritizing work to enable basic Visual
Studio project generation, I'm all for that: my build-splendid branch
[2] had its roots in VS generation after all (if you go back far enough
in the history you can still see this)!

Once we treat the build system as a giant data structure, we are free to
transform that any way we want. We feed that data structure into a
generator and spit something out the other side. This is very similar to
GYP's model. Essentially what we are proposing is reinventing GYP, but
with a different frontend. It's entirely possible we will implement
things using GYP's APIs so we can reuse GYP's existing generators! Time
will tell.
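
As a rough sketch of "data structure in, backend output out": the same
dictionary is handed to whichever generator is selected. The tree contents,
the GENERATORS registry and both toy backends are assumptions made for
illustration; they are not GYP's API or any existing Mozilla code.

```python
import json


def make_backend(tree):
    """Emit make variable assignments; lists collapse to space-joined strings."""
    return "".join(
        f"{name} := {' '.join(values)}\n" for name, values in sorted(tree.items())
    )


def json_backend(tree):
    """Emit the same data as JSON, e.g. for a tool that wants structured input."""
    return json.dumps(tree, sort_keys=True, indent=2) + "\n"


# Hypothetical registry mapping a backend name to its generator.
GENERATORS = {"make": make_backend, "json": json_backend}


def generate(tree, backend):
    try:
        generator = GENERATORS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend}") from None
    return generator(tree)


# Hypothetical build definition for one directory.
tree = {"DIRS": ["public", "src"], "CPPSRCS": ["a.cpp", "b.cpp"]}
print(generate(tree, "make"))
print(generate(tree, "json"))
```

Adding a Visual Studio or Ninja backend would mean registering one more
function; the build definition itself would not change.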

[1] http://gregoryszorc.com/blog/2012/07/29/mozilla-build-system-overview/
[2] https://github.com/indygreg/mozilla-central/tree/build-splendid
Re: Moving Away from Makefile's Gregory Szorc 8/28/12 2:27 PM
Yes, we could use a subset of make to define things. On the surface, the
functionality is just what you want: simple conditionals, built-in
functions, simple appends, well-understood (more or less). Using make
*is* very tempting. And, we could probably use pymake as-is: we could
validate the parser's "statement list" output for conformance with our
limited make dialect. We should seriously consider using make files,
albeit in a restricted form to make parsing easier.

That being said, there are a few cons:

* Everything is a string. There is no null, false, true, or arrays. "if"
is "ifneq (,$(foo))" or even "ifneq (,$(strip $(foo)))" in case some
extra whitespace snuck in there. Arrays are strings delimited by
whitespace. This results in lots of ugly and hard-to-read code. Maps
don't exist at all. So, you have to normalize everything down to
key-value pairs. This results in weird foo like the library/module or
EXPORTS boilerplate.
* Syntax for complex behavior is hard to read. You inevitably need to
call functions for more advanced behavior. This results in code like
https://hg.mozilla.org/mozilla-central/file/ad7963c93bd8/config/rules.mk#l1623.
That's just as bad as poorly-written Perl. And, unlike Perl, it *has* to
be that way. Fortunately, this complexity *should* be hidden to everyone
but build system people. But, we need to maintain it and that's no fun.
* Performance issues. = in make is deferred assignment. There's a lot of
overhead in resolving values (although bsmedberg proposed a solution for
pymake that may combat this).
* Might require stapling some new features onto pymake. Not a deal breaker.

Individually, they aren't too bad. A lot are superficial. But, when you
combine them, it gets ugly.
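
To illustrate the "everything is a string" point, here is a small Python
sketch that mimics how make-style values have to be interpreted, next to
the same data as native values. The helper names and the sample variables
are made up for this example.

```python
def make_truthy(value):
    """Mimic the emptiness test "ifneq (,$(strip $(foo)))":
    true when anything but whitespace remains."""
    return value.strip() != ""


def make_words(value):
    """Mimic make's view of a list: a string split on whitespace."""
    return value.split()


# Make-style data: stray whitespace is significant until you strip it.
make_style = {"DIRS": " public  src ", "ENABLE_TESTS": " "}

# The same data with real lists and booleans needs no helpers at all.
native = {"DIRS": ["public", "src"], "ENABLE_TESTS": False}

assert make_words(make_style["DIRS"]) == native["DIRS"]
assert make_truthy(make_style["ENABLE_TESTS"]) == native["ENABLE_TESTS"]
```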
Re: Moving Away from Makefile's Gregory Szorc 8/28/12 2:36 PM
On 8/24/12 8:05 AM, John Hopkins wrote:
> Suggestion: identify the ugliest sections of Makefile usage and use
> those as a benchmark for evaluating different solutions.  ie. how could
> it be implemented in SCons, pymake, etc.  or even, how could it be
> reimplemented in Make in a clean fashion.

I've started [1] to compare things. I just plucked a random Makefile.in
out of the source tree that seemed to have a nice mix of things. We may
want to throw some more complexity in there just for completeness.

If someone familiar with some of the empty sections has time to fill
those out, it would be appreciated (I'm too busy!).

[1] https://wiki.mozilla.org/User:Gszorc/Build_frontend_shootout
Re: Moving Away from Makefile's Gregory Szorc 9/2/12 2:15 PM
A decision has been made: we will be using Python files executing in a
sandboxed environment (using the technique that Hanno posted).
Supporting this decision are Ted (build system owner) and Benjamin
Smedberg and myself (build system peers).

The first step is moving all the directory traversal definitions from
existing make files into Python files. This is being tracked in bug
784841 [1]. If all goes according to plan, this should be a transparent
transition: this won't change how you build the tree.
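
A minimal sketch of what directory traversal data could look like once it
lives in data structures. It assumes each manifest exposes a DIRS list of
child directories and that parents are visited before children; the
MANIFESTS contents and traversal_order() are hypothetical, and the real
design is whatever lands in bug 784841.

```python
# Hypothetical already-loaded manifests, keyed by directory relative to the
# top of the source tree ("" is the top-level directory).
MANIFESTS = {
    "": {"DIRS": ["browser", "toolkit"]},
    "browser": {"DIRS": ["base"]},
    "browser/base": {"DIRS": []},
    "toolkit": {"DIRS": []},
}


def traversal_order(manifests, directory=""):
    """Yield directories depth-first, parents before children."""
    yield directory or "."
    for child in manifests[directory].get("DIRS", []):
        path = f"{directory}/{child}" if directory else child
        yield from traversal_order(manifests, path)


print(list(traversal_order(MANIFESTS)))
```

With the whole traversal known up front, a backend could emit one flat list
of directories instead of descending into each one at build time.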

Initially, this transition will seem like a lot of busy work with no
real benefit. The real wins come after we've moved more exciting pieces
such as C/C++ compilation and IDL generation to the new Python frontend
files. When those are in place, we should be able to do things such as
generate non-recursive make files, which should make builds faster.

More information about the new world order will be communicated once
things are closer to landing. If you wish to influence it, please follow
bug 784841.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=784841
