Jumbo build project status and some general information


Daniel Bratell

Jul 18, 2017, 12:31:09 PM
to chromi...@chromium.org
For new readers
----------------
Jumbo building is the Chromium implementation of "unity builds" where
source files are merged for drastically improved compilation times.

For all readers:
----------------
The official test bot is still pending so the feature is still default off
(can be enabled by setting use_jumbo_build = true in gn). There seems to
be some issue with local mac builds ( https://crbug.com/716395 ) but I
think Linux and Windows work fine.

Official Jumbo documentation:
https://chromium.googlesource.com/chromium/src/+/master/docs/jumbo.md

For old readers:
----------------
Since the last mail 10 days ago only core_generated has been added to
jumbo builds on master. I haven't measured but that is a smallish target
so it will only have saved 10 CPU minutes at most (out of 1000-2000 in the
test configuration).

Some major targets are in the pipeline though: blink/modules (saving 60
CPU minutes), blink/core unit_tests (saving 59 CPU minutes), blink/modules
unit_tests (saving 19 CPU minutes) and blink/platform (saving 35 CPU
minutes).

HEADS UP
--------
To avoid having to exclude files and targets from jumbo compilation, the
files in a target have to compile when they are in the same translation
unit. That means that the names of functions, constants and classes must
be unique within the gn target. This includes symbols in anonymous
namespaces and static functions, because the translation unit will no
longer contain just one file.

Just to illustrate:

mycode/A.cpp
-----------------
#define Y Z
namespace { enum X { a, b }; }
static int Y() { return 1; }


mycode/B.cpp
-----------------
#define Y Z
namespace { enum X { a, b }; }
static int Y() { return 2;}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NOT JUMBO COMPATIBLE if A.cpp and B.cpp are in the same target!
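
As a rough sketch of why this fails and how renaming resolves it: once both
files sit in one translation unit, the two unnamed namespaces are the same
namespace and the two static functions share one file scope. The merged unit
below is hand-written for illustration only. The names AMode, BMode, AValue,
BValue and SumOfValues are hypothetical replacements, not names from the
Chromium tree.

```cpp
// Hand-written stand-in for a jumbo translation unit that contains
// the text of both mycode/A.cpp and mycode/B.cpp.

// ---- from mycode/A.cpp (names made unique) ----
namespace {
enum AMode { kAFirst, kASecond };
}  // namespace
static int AValue() { return 1; }

// ---- from mycode/B.cpp (names made unique) ----
namespace {
// This reopens the same unnamed namespace as above, so declaring a
// second enum with the same name here would be a redefinition error.
enum BMode { kBFirst, kBSecond };
}  // namespace
static int BValue() { return 2; }

// Hypothetical helper so the sketch has something observable to check.
int SumOfValues() { return AValue() + BValue(); }
```

With unique names the merged unit compiles, and each file still compiles on
its own when jumbo is turned off.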

Most code uses unique names so this does not matter a lot, but tests have
turned out to be an exception. Thanks for all the people that have
reviewed my renaming and general clean up of unit tests during the last
week!

The attached image illustrates the state of my local work branch as some
kind of indication of what can be achieved (ignore the grey part).

/Daniel

P.S. Did I mention that I'm looking for volunteers, especially someone on
Mac? D.S.

--
/* Opera Software, Linköping, Sweden: CET (UTC+1) */
jumbo-times 20170712.png

Daniel Bratell

Jul 18, 2017, 1:01:57 PM
to chromi...@chromium.org
On Tue, 18 Jul 2017 18:29:02 +0200, Daniel Bratell <bra...@opera.com>
wrote:

> Most code uses unique names so this does not matter a lot, but tests have
> turned out to be an exception. Thanks for all the people that have
> reviewed my renaming and general clean up of unit tests during the last
> week!

Thanks *to* all the people! I'm also thankful for those people existing,
but I really just wanted to thank people that help on this project to make
Chromium non-goma-developer friendly again!

/Daniel

Yutaka Hirano

Jul 18, 2017, 10:32:37 PM
to bra...@opera.com, blink-dev, chromi...@chromium.org
+blink-dev@

It requires authors to give unique names to file scope things, and has some impact on the code style. So I think we need to update the code style guidelines if we support jumbo builds. I'm wondering if it's good to have nested namespaces, like blink::xhr.

Test code is particularly affected because we tend to use shorter names in tests and test helpers. Currently I often enclose everything in an unnamed namespace as below.

// FooTest.cpp
#includes

namespace blink {
namespace {

class Helper {
...
};

TEST(X, Y) {
  Helper h;
  ...
}

}
}

I would love to use shorter names for unittests, so instead of keeping the name uniqueness manually, I would propose to enclose everything in a file-specific namespace, as below.

// FooTest.cpp
#includes

namespace blink {
namespace FooTest { // Filenames are not unique so we may need to include path names.

class Helper {
...
};

TEST(X, Y) {
  Helper h;
  ...
}

}
}

What do you think?
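
A minimal sketch of how that would behave when two such test files land in the same jumbo translation unit. The namespace names foo_test and bar_test, the second file BarTest.cpp, and the UseFooHelper/UseBarHelper functions are hypothetical and only stand in for real test bodies.

```cpp
// Hand-written stand-in for a jumbo translation unit holding two
// test files that both define a class called Helper.

// ---- from FooTest.cpp ----
namespace blink {
namespace foo_test {

class Helper {
 public:
  int Value() const { return 1; }
};

int UseFooHelper() {
  Helper h;  // Resolves to blink::foo_test::Helper.
  return h.Value();
}

}  // namespace foo_test
}  // namespace blink

// ---- from BarTest.cpp (hypothetical second file) ----
namespace blink {
namespace bar_test {

class Helper {
 public:
  int Value() const { return 2; }
};

int UseBarHelper() {
  Helper h;  // Resolves to blink::bar_test::Helper.
  return h.Value();
}

}  // namespace bar_test
}  // namespace blink
```

One trade-off to keep in mind: names in a named namespace have external linkage, unlike names in an unnamed namespace, so the per-file namespace names themselves have to be unique across the whole binary.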




--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe: http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev+unsubscribe@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/op.y3kyqoscrbppqq%40cicero2.linkoping.osa.

Peter Kasting

Jul 18, 2017, 10:49:44 PM
to Yutaka Hirano, Daniel Bratell, blink-dev, chromi...@chromium.org
On Tue, Jul 18, 2017 at 7:31 PM, 'Yutaka Hirano' via blink-dev <blin...@chromium.org> wrote:
> I would love to use shorter names for unittests, so instead of keeping the name uniqueness manually, I would propose to enclose everything in a file-specific namespace, as below.

I think this is slightly less clear (as to the intended purpose, which is a file-scoped helper class) than:

namespace {

class FooTestHelper {  // Or better yet: class DescribeWhatItDoes {
...

To preserve the ability of people to be brief (which you noted you wanted), I would say that people are still allowed to use brief names unless the bot actually fails to compile, at which point they can use longer names.  So I wouldn't mandate the style above by default, only for resolving errors.

PK

Daniel Bratell

Jul 19, 2017, 5:32:56 AM
to Yutaka Hirano, Peter Kasting, blink-dev, chromi...@chromium.org
I second what Peter says. This is not a big issue. I happened to land a lot of rename patches in a few hours which might have given the impression that this is common. It is not. Quick counting says there are 878 classes in Blink unit tests and about 30 of those collided in jumbo builds, 800+ did not.

Normal code is much++ better. Tests are by their nature a bit repetitive and prone to copy/paste solutions.

I don't pretend that the longer names I picked are always the best ones, but I would also not like to get stuck in long processes trying to optimize every class name. The best (for everyone) would be if someone familiar with the code enabled jumbo for it (see the quick guide at https://chromium.googlesource.com/chromium/src/+/master/docs/jumbo.md ), turned on jumbo locally and resolved the collisions if there were any (many targets have no collisions at all, some have several).

Note: IPC headers (without include guards) and X11 headers cause known problems. If you see such things when testing, don't spend time trying to resolve it locally unless you have a great plan that is better than my WIP there.

/Daniel

--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAAHOzFDFHeq6s4QvgQoWrsX5j-W-0_00MD8PmFaOqxcHUbvVAQ%40mail.gmail.com.

Daniel Cheng

Jul 19, 2017, 5:33:19 AM
to Peter Kasting, Yutaka Hirano, Daniel Bratell, blink-dev, chromi...@chromium.org
Very cool to see the build time improvements!

One thing that makes me a bit sad is that we can't rely on unnamed namespaces to avoid naming collisions between unrelated files. How many of Chrome's source files are going to be in these jumbo compilation units? What's the granularity of each compilation unit?

Also, is there any hope of realizing these build time improvements without using the jumbo compilation strategy? From the bug, it seems header parsing is a large component of build times. If we could somehow preparse and save the AST of individual headers (rather than saving all the headers in a giant precompile), would that help in any way?

Daniel



Daniel Bratell

Jul 19, 2017, 9:23:19 AM
to Peter Kasting, Yutaka Hirano, Daniel Cheng, blink-dev, chromi...@chromium.org
Many questions!

Q.  What's the granularity of each compilation unit?

They are grouped by gn target so if a target only lists 2 cc files in |sources|, then that is all you will get in that jumbo unit. In other words, there is no cross-pollination between different targets.

Q. How many of Chrome's source files are going to be in these jumbo compilation units?

After grouping all files in the target, the list is split so that each translation unit isn't too large. "Too large" is still to be determined, but the final limit will probably be somewhere between 50 and 200 files. More files -> fewer electrons consumed, fewer files -> more electrons but more parallelism. Some test data in the documentation.
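
As a rough sketch of that splitting step, assuming a jumbo unit is simply a generated file that #includes each of its source files: the function names SplitIntoJumboUnits and JumboUnitText and the consecutive chunking are hypothetical and written for illustration, not taken from the actual build scripts.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Splits one target's source list into consecutive groups of at most
// max_files_per_unit files. Hypothetical helper for illustration.
std::vector<std::vector<std::string>> SplitIntoJumboUnits(
    const std::vector<std::string>& sources,
    std::size_t max_files_per_unit) {
  std::vector<std::vector<std::string>> units;
  if (max_files_per_unit == 0) return units;  // No sensible split exists.
  for (std::size_t i = 0; i < sources.size(); i += max_files_per_unit) {
    const std::size_t end = std::min(i + max_files_per_unit, sources.size());
    units.emplace_back(sources.begin() + static_cast<std::ptrdiff_t>(i),
                       sources.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return units;
}

// Builds the text of one jumbo unit: one #include line per source file.
std::string JumboUnitText(const std::vector<std::string>& unit) {
  std::string text;
  for (const std::string& file : unit) {
    text += "#include \"" + file + "\"\n";
  }
  return text;
}
```

A larger limit gives fewer, bigger units (less total work); a smaller limit gives more units that can be compiled in parallel.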

Q.  Also, is there any hope of realizing these build time improvements without using the jumbo compilation strategy?

Single computer compilations, most likely not. The closest I can think of is a commercial effort to make a compilation server that can cache results in memory and reuse when compiling other files, but in a race I would still put my money on jumbo.

With jumbo builds the generated data is so much less and it will always be hard to beat doing less, even with caching. This will be anecdotal, but normally compiling Linux debug Blink/core generates 10 GB machine code and debug information. In a jumbo build only 1 GB machine code and debug information is generated.

Q.  If we could somehow preparse and save the AST of individual headers (rather than saving all the headers in a giant precompile), would that help in any way?

Precompiled headers with key headers included are already used, at least for Windows. They save a fair bit of time, but the effect is almost an order of magnitude smaller than this.

Not quite Q.  One thing that makes me a bit sad is that we can't rely on unnamed namespaces to avoid naming collisions between unrelated files.

Yes, I wish this was 100% without side effects, but that is one thing that doesn't change for the better.

I have gone through this process once before on the other side, owning a lot of code, someone else wanted to change it to work with jumbo compile. I was initially suspicious, for the lack of a better word, but the result proved them absolutely right. The side effects there were minuscule compared to the gain. Different project though. Nothing is quite like the Chromium project.

/Daniel

Reid Kleckner

Jul 19, 2017, 3:00:53 PM
to Daniel Cheng, Peter Kasting, Yutaka Hirano, Daniel Bratell, blink-dev, chromi...@chromium.org
On Wed, Jul 19, 2017 at 2:31 AM, Daniel Cheng <dch...@chromium.org> wrote:
> Very cool to see the build time improvements!
>
> One thing that makes me a bit sad is that we can't rely on unnamed namespaces to avoid naming collisions between unrelated files. How many of Chrome's source files are going to be in these jumbo compilation units? What's the granularity of each compilation unit?
>
> Also, is there any hope of realizing these build time improvements without using the jumbo compilation strategy? From the bug, it seems header parsing is a large component of build times. If we could somehow preparse and save the AST of individual headers (rather than saving all the headers in a giant precompile), would that help in any way?

In theory, that's what C++ modules do. People are starting to see similar reductions in the total overall C++ compilation work done when compiling Clang with modules:

Modularizing Chrome will not be a small effort, but it's looking more and more like a real possibility.

Dirk Pranke

Jul 19, 2017, 4:04:53 PM
to Reid Kleckner, Daniel Cheng, Peter Kasting, Yutaka Hirano, Daniel Bratell, blink-dev, chromi...@chromium.org
I am optimistic that modules will be the long-term Right Path, but I think we're a long way away from getting the same level of benefit that jumbo builds look like they will give for a far smaller amount of effort right now. We can and will continue to monitor the impact of the unnamed namespace collisions, but so far they've been pretty small. I think supporting jumbo will give us boosts equal to or better than component builds, and for arguably less work.

-- Dirk


Daniel Bratell

Jul 19, 2017, 4:30:37 PM
to Daniel Cheng, Reid Kleckner, Peter Kasting, Yutaka Hirano, blink-dev, chromi...@chromium.org
Didn't know about C++ modules. They make C++ look like a completely different language. Still, the effect in that post seems smaller, much smaller, than jumbo. They found a 30% compile time improvement; jumbo in Blink gives a 10x compile time improvement[1]. And the best thing, as I see it, is that one will not exclude the other, so together the compile time will be better than with just one of them.

/Daniel

[1]: Compiling blink/core goes from 360 CPU minutes to 29 CPU minutes with jumbo in my measurements, once the unit_test jumbo patch ( https://chromium-review.googlesource.com/c/575055/ ) has landed.

bruce...@chromium.org

Jul 19, 2017, 9:19:27 PM
to Chromium-dev, dch...@chromium.org, r...@google.com, pkas...@chromium.org, yhi...@google.com, blin...@chromium.org
> Q.  Also, is there any hope of realizing these build time improvements without using the jumbo compilation strategy?

I'm going to go a bit further than Daniel and say no. While goma/distributed builds can help with compilation times, they can't help with link times, and on Windows they actually make them worse. Jumbo builds make link times better.

Maybe C++ modules will give us this sort of build-time speedup without the consequences but we can evaluate that when the time is right.

Actually, there is a way to get the jumbo build speedups without changing language semantics, but it would probably be worse. This would be to reduce the number of translation units by actually concatenating source files. This is actually a reasonable strategy for small source files that have lots of includes - their build times are disproportionately large. However this strategy is less palatable for large files. And, it doesn't seem popular for small source files either.

In short, I think it's impossible to have good build times with as many source files as we have, so jumbo builds FTW!

Yutaka Hirano

Jul 19, 2017, 9:32:43 PM
to Daniel Bratell, Daniel Cheng, Reid Kleckner, Peter Kasting, blink-dev, chromi...@chromium.org
The extent of the scope depends on the build efficiency rather than logical code organization. It may change over time, it may be different among platforms or build configurations. I would like to avoid relying on "gn-target-local scope" if possible. 

Dirk Pranke

Jul 19, 2017, 9:57:35 PM
to Yutaka Hirano, Daniel Bratell, Daniel Cheng, Reid Kleckner, Peter Kasting, blink-dev, chromi...@chromium.org
It is always the case on very large C/C++ projects that you have to consider the physical configuration of your build as well as the logical configuration. We have been doing this for a long time via things like the component builds, the split chrome dlls on windows, and other such things.

It would be nice if we didn't have to do this, but Chromium is well past the point where most tools break. That's why we invest in our own tools, toolchains, and infrastructure so much, and why we think things like modules will be important in the future.

In the meantime, though, we need to weigh all of these tradeoffs. If we really can cut our compile times significantly, that is worth an awful lot of inconvenience, because it buys us an awful lot of convenience as well.

Daniel has been careful to try and give us the hooks we need to control the impact of these changes, and so we can look at every tradeoff as they go. We are very open to feedback and looking at cases where we think the tradeoffs are too bad. We know there are some already, and as we work through more of the targets we're going to be getting better at knowing what works and what doesn't (just like we've done with exports for components).

Yes, it's something we'll need to be aware of. So are the EXPORT macros, the entries in the C++ style guide, the rules around build files and checkdeps, and all of the other things we have to keep in mind when we're writing code. 

-- Dirk


Daniel Bratell

Jul 20, 2017, 5:57:16 AM
to Yutaka Hirano, Dirk Pranke, Daniel Cheng, Reid Kleckner, Peter Kasting, blink-dev, chromi...@chromium.org
I began this as an experiment when builds started hitting 4 hour timeouts on our official build machines. It is possible a lot of people do not know what fantastic work goma does for them. And how much work goma does. (Whoever wrote it is probably under-appreciated :-) ).

Still, the official, public, build system has to handle tens of thousands of translation units using the most complex and advanced C++ and with a lot of dependencies, for both compiler and linker. People have tried split_static_libraries, precompiled headers, reducing dependencies, removing header includes, but it has barely made a dent in the steadily increasing compilation demands.

Unity builds (jumbo being the Chromium implementation) are really the big hammer. It is big because you cannot pretend it doesn't exist and it is big because it actually makes a huge difference to compilation times.

To get it working without any |jumbo_excluded_sources|, name conflicts have to be resolved: name conflicts that have built up over 5-10 years, and for names that people have grown attached to. This is the hurdle, or the price to pay. Had that hurdle not been there, I or someone else would have done this long ago. Once those initial name conflicts are gone, I don't think there will be much to think about and it will be very obvious if there is a name conflict.

For anyone without access to goma, this is a very promising and important project, and it might also turn out to be beneficial to those with access to goma, but we will see once we have some bots up.

/Daniel