Bazel: Yet another build tool (from a mighty corporation)


tech...@gmail.com

Mar 30, 2015, 6:22:57 AM
to bazel-...@googlegroups.com
Excuse me for the loaded language. This contains criticism and venting that some nerds may find inappropriate, so approach with caution if you're not prepared to handle heavy content, and correct me where I am wrong.


For a couple of years I have been trying not to reinvent the bicycle and instead just improve the things that already exist. It is not serious, just a toy activity for me, because you can learn a lot from past mistakes and approaches.

I must say that this is a distraction from actual work, so nobody approves of me studying code that is so ancient, and especially writing patches for it. I hate that atmosphere so much that I left my job and don't want to get back to it again. It is just too stressful. So imagine my envy and sadness when I see that people can happily spend their time building such tools. But with more and more of them appearing, the envy changes to... anger? I know that management is likely to approve work on new tools, but is absolutely not interested in improving existing ones, because the challenge of making changes to that codebase is usually too high for the company, even if it provides benefits for everybody else.

Does that proliferation of build tools add anything valuable to the open source ecosystem? Like working for Google with a contract in which there is no payment: is that what a contribution agreement is all about? Erasing people's names and placing a Google label there: is that proper crediting of people's work? That's the non-technical aspect of how Google participates in the open source ecosystem.


Now to the technical side. How is this different from ANY OTHER BUILD TOOL THAT EXISTS in the wild? The mighty Google provides a safe shelter for thousands of the most talented designers, user experience specialists, media and marketing people, and of course, computer science guys (males and females). And this looks so weird: no good comparison, a blurred rationale, no presentation or pictures that convey the main idea and problem domain for this specific tool. "Correct, reproducible, fast builds for everyone.", well, except for Windows users. In 2015, if you're a >billion $$ corporation that is releasing a new tool to the outside world, there should be some research done, especially if you're Google (because what other reasons are there to work for this company?). Research, or just user experience, about the common problems with other tools that this one is going to handle. The world is already using things like CMake, Autotools, premake etc. Give an honest comparison of the advantages and disadvantages of the chosen approach, and list the key challenges in build tools and how the tools handle them. Provide something that people can learn from, some diagrams and pictures that people outside cannot draw themselves in their free time, because there is no such time if you're a coder.

I would like to appreciate the benefits of Bazel and judge whether the approach is beneficial for anything, but I spent a few minutes browsing the site and I really don't see anything exciting. Arguments like "their feature sets are different, so they aren't viable alternatives for us" leave the impression that it is just another NIH product, and I see no good excuse for that. Nothing that would suit Google.

Feel free to prove me wrong.

Ulf Adams

Mar 30, 2015, 8:07:54 AM
to tech...@gmail.com, bazel-...@googlegroups.com
Hi Anatoly,

I'm sorry that our documentation isn't as clear as we'd like it to be.

On Mon, Mar 30, 2015 at 12:22 PM, <tech...@gmail.com> wrote:
Excuse me for the loaded language. This contains critics and hate vent that some nerds find inappropriate, so approach with caution if you're not prepared to handle that stuff of heavy content to correct me where I wrong.


For a couple of years I am trying not to reinvent the bicycle and just improve the things that already exist, it is not serious, it is just a toy activity for me, because you can learn a lot from past mistakes and approaches.

I must say that this is a distraction from actual work, so nobody approves me studying the code that is so ancient, and especially writing patches for it. I so much hate that atmosphere that I left my job and don't want to get back to it again. It is just too stressful. So imagine my envy and sadness when I see that people can happily spend their time for building such tools. But.. with more and more of there to appear, the envy changes to.. like.. anger? I know that management will likely to approve the work on new tools, but absolutely not interested in improving existing ones, because the challenge of making change to that codebase is usually too high for that company even if it provides benefits for everybody else.

Please keep in mind that Google has been working on this build tool for ~9 years. Not having been around back then, I can't comment on the exact thought processes that went into the creation of Bazel (or Blaze, as it is called internally). However, given that many ideas manifested in Blaze have been copied by others in the industry (buck, pants, gyp, gn, to name a few), it appears they are not so bad after all. Of course, Blaze/Bazel wasn't open source at the beginning (and there are plenty of other non-open-source build systems), so this isn't easy to retrace.


Is that proliferation of build tools adds anything valuable to open source ecosystem? Like working for Google, with contract in which there is no payment - is it what a contribution agreement all about? Erasing people names and placing  Google label there - is that the proper crediting for people's work. That's a non technical aspect on how Google participates in open source ecosystem.

We are making Bazel open source with the belief that it provides some features that are unique, and that are useful to others. Google will continue to use Bazel internally for most of its software development and continue to fund it regardless of whether it is or isn't adopted by other software developers. At this point in time, as far as we can tell, Bazel is the only build system that provides the features that we internally require. We've been successfully using it in production for over 7 years.

That said, if there was an alternative system that provides the features (or most of the features) that we need, and is better in some respect, we'd certainly be interested to hear about it.

While we do want to avoid reinventing the wheel, we also have to ship something that works today. During the time I've been writing this email, Bazel / Blaze has been running thousands of builds and tests in order to ship the next version of Gmail, Search, Docs, or any of the other products and services Google provides.

On the non-technical aspects, Bazel is open source, it is free to use and modify. We will make sure to record and credit (in the source code or thereabouts) everyone who has contributed to Bazel and who wants their name recorded. The only restriction is that if you don't sign the CLA, you will not be able to contribute patches to Bazel itself. We are still interested in hearing about your requirements, bugs you find, or problems you see with the documentation.


Now to the technical side. How that is different from ANY OTHER BUILD TOOL THAT EXISTS in the wild? The mighty Google provides safety shelter for a thousands of most talented designers, user experience specialists, media and marketing, and of course, computer science guys (males and females). And this looks like something that is so weird - no any good comparison, blurred rationale, no presentation or pictures that convey the main idea and problem domain for this specific tool. "Correct, reproducible, fast builds for everyone.", well except for Windows users.

We would certainly like to support Windows, but there are limits to how much time we Google employees can spend on that, and what changes we can make to Bazel given that we absolutely have to keep shipping Gmail, Search, Docs, and everything else. We understand the drawbacks of that, but sometimes you need to make a decision even if you don't like any of the options.

We are striving towards making the decision making process more transparent.
 
In 2015 if you're a >billion $$ corporation that is putting a new tool to the outside word, there should be some research done, especially if you're Google (because what are other reasons to work for this company?). Research or just user experience about common problem with other tools that this one is going to handle. The World is already using something like CMake, Autotools, premake etc. The honest comparison of advantages and disadvantages of chosen approach, list the key challenges in build tools and how things are handling them. Provide something that people can learn from, some diagrams and pictures that the people outside can not draw themselves in their free time, because there is no such time for you if you're a coder.

We have investigated a number of other build systems, and we are working on making that data available for others. Again, as far as we know, Bazel is the only build system that fulfills all of our internal requirements.
 

I would like to appreciate the benefits of Bazel and compare if the approach is beneficial for anything, but I spend a few minutes browsing the site and I really don't see anything exciting. The arguments like "their feature sets are different, so they aren't viable alternatives for us" leave the impression that it is just another NIH product, and I see no any good excuse to do that. Not anything that would suit Google.

Feel free to prove me that I am wrong.

To be honest, it doesn't sound like you are open to other opinions, but let me try to enumerate some of the things that we believe make Bazel / Blaze unique:

- Blaze (internally) supports remote execution for everything, including full builds; this, in turn, powers our continuous integration system, static analysis pipeline, code search, and cross referencing systems. This is (close to) zero administration - we can simply use our existing data center machines, with no additional setup or maintenance required. Bazel doesn't support all of this, yet, but it contains much of the necessary infrastructure - it tracks exactly what files are needed for which build steps, it allows checking all required tools into source control, and it can usually determine exactly which build steps can be run independently and which depend on each other (local execution of genrules is very leaky in this respect, but sandboxing helps enforce it).

- This same infrastructure also means that Bazel can take a list of modified files (such as from inotify) and use that to drive a build. If such a list is available, the overhead of Bazel is close to minimal, i.e., almost all of the CPU time spent during a build is spent running tools such as gcc and javac. If no file has changed, Bazel should run in under 10ms (we may not be quite there yet).

- We've made a significant effort to ensure that you never need to run "bazel clean", ever. Bazel tracks all changes to command-line options, as well as to all checked-in tools (including the files that make up gcc and javac) and correctly determines exactly what needs to be re-executed in all cases. (Modulo bugs in Bazel, local genrules, or custom actions.) Internally, whenever someone encounters a situation where a clean build produces different output from an incremental build, we treat that as a high priority bug.

- Bazel does not need to read all of the metadata files in your workspace. If you're following the principles in "Recursive Make Considered Harmful" (http://aegis.sourceforge.net/auug97.pdf), your single Makefile generally grows with your project. For larger projects, just reading that Makefile can end up taking minutes, even if you only want to compile a single .cc file (this has happened at Google back in the day). How much Bazel has to read depends on the structure and content of your BUILD files, but it is generally possible to make them sufficiently fine-grained to avoid excessive overhead in reading the BUILD files themselves.

- This in turn allows you to have multiple, interdependent projects in a single version control archive. This encourages continuous integration (and I mean continuous - every change is immediately visible to everyone). In combination with the right infrastructure, you can run all affected (and _only_ the affected) tests after every change, to pinpoint exactly which change breaks your project in what ways.

In fact, we use Blaze internally in just that way - we have a single version control archive that contains hundreds of millions of lines of code, with a continuous integration system that pinpoints the exact breaking change across any number of intermediate dependencies, often within minutes (and we're aiming to be much faster than that).
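
As a rough illustration of the "run only the affected tests" idea described above, here is a minimal Python sketch. It is not how Blaze or Bazel is implemented; the target names, the DEPS graph, and the convention that test targets end in _test are all made up for the example. The idea is simply to invert the dependency graph and walk it outward from the changed targets.

```python
from collections import defaultdict, deque

# Hypothetical dependency graph: target -> targets it depends on directly.
DEPS = {
    "//lib:base": [],
    "//lib:net": ["//lib:base"],
    "//app:server": ["//lib:net"],
    "//app:server_test": ["//app:server"],
    "//tools:fmt": [],
    "//tools:fmt_test": ["//tools:fmt"],
}


def affected_targets(deps, changed):
    """Return every target that is changed or transitively depends on a changed one."""
    # Invert the edges so we can walk from a dependency to its dependents.
    rdeps = defaultdict(set)
    for target, direct in deps.items():
        for dep in direct:
            rdeps[dep].add(target)

    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in rdeps[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def affected_tests(deps, changed):
    """Filter the affected set down to test targets (named *_test in this sketch)."""
    return sorted(t for t in affected_targets(deps, changed) if t.endswith("_test"))


# A change to //lib:base reaches the server test but not the formatter test.
print(affected_tests(DEPS, ["//lib:base"]))
```

The cost of this walk grows with the number of affected targets rather than with the size of the whole tree, which is the property that matters in a very large repository.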

Personally, I think that's very very cool.

On the flip side, these features have costs, too. Remote execution isn't for free, even if you have the machines available. Perfectly hermetic builds can only be approached asymptotically, but never completely guaranteed. Writing good BUILD files is still partially an art rather than a science, and we have plenty of examples where things go wrong. Bazel can't support all the features that people expect from a build tool - in particular, we need to know all the input and output files ahead of time, which is sometimes difficult to do, for example if you want to name an output file by the checksum of its contents.
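
The requirement to know all inputs ahead of time is also what makes the "never run bazel clean" property plausible: if every input file, tool, and command-line flag is known, an action can be fingerprinted and re-run only when the fingerprint changes. A minimal Python sketch under that assumption follows. The names action_key and ActionCache are hypothetical, and this is not Bazel's actual caching scheme, only an illustration of the principle.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content digest of a single file."""
    return hashlib.sha256(data).hexdigest()


def action_key(command, input_files, tool_files):
    """Fingerprint an action from its command line, inputs, and tools.

    input_files and tool_files map a path to the file's contents (bytes).
    """
    h = hashlib.sha256()
    for arg in command:
        h.update(b"arg\0" + arg.encode() + b"\0")
    for kind, files in (("input", input_files), ("tool", tool_files)):
        for path in sorted(files):
            h.update(f"{kind}\0{path}\0{digest(files[path])}\0".encode())
    return h.hexdigest()


class ActionCache:
    """Re-executes an action only when its fingerprint has not been seen before."""

    def __init__(self):
        self._results = {}
        self.executions = 0

    def run(self, command, input_files, tool_files, execute):
        key = action_key(command, input_files, tool_files)
        if key not in self._results:
            self.executions += 1
            self._results[key] = execute()
        return self._results[key]


cache = ActionCache()
srcs = {"hello.cc": b"int main() { return 0; }"}
tools = {"bin/gcc": b"compiler v1"}  # stand-in for a checked-in compiler binary


def compile_step():
    return "hello.o"


cache.run(["gcc", "-O2", "-c", "hello.cc"], srcs, tools, compile_step)
cache.run(["gcc", "-O2", "-c", "hello.cc"], srcs, tools, compile_step)  # unchanged: cache hit
cache.run(["gcc", "-O0", "-c", "hello.cc"], srcs, tools, compile_step)  # flag changed: re-run
cache.run(["gcc", "-O0", "-c", "hello.cc"], srcs, {"bin/gcc": b"compiler v2"}, compile_step)  # tool changed: re-run
print(cache.executions)  # 3
```

An output file named after the checksum of its own contents does not fit this model, because the name would only be known after the action has run.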

Thanks,

-- Ulf
 

--
You received this message because you are subscribed to the Google Groups "bazel-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-discus...@googlegroups.com.
To post to this group, send email to bazel-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/7900a245-e8b9-4bdd-9bc9-1f3300733c2b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

bocto...@gmail.com

Mar 30, 2015, 8:10:31 PM
to bazel-...@googlegroups.com, tech...@gmail.com
On Monday, March 30, 2015 at 5:07:54 AM UTC-7, Ulf Adams wrote:

> I'm sorry that our documentation isn't as clear as we'd like it to be.

I, for one, look forward to any sort of improvement in this area. Chief amongst my disappointments in Bazel's documentation is the hand-waving dismissal of (admittedly flawed) make, referring only to the necessity of perfect makefile construction and the famous document "Recursive Make Considered Harmful"; yet there isn't really an explanation of what Bazel does to improve upon make, and how that improvement is realized.

At this time, I agree with Anatoly's assertion of WABT (whoopee! another build tool).

> - Blaze (internally) supports remote execution for everything, including full builds; this, in turn, powers our continuous integration system, static analysis pipeline, code search, and cross referencing systems. This is (close to) zero administration - we can simply use our existing data center machines, with no additional setup or maintenance required. Bazel doesn't support all of this, yet, but it contains much of the necessary infrastructure - it tracks exactly what files are needed for which build steps, it allows checking all required tools into source control, and it can usually determine exactly which build steps can be run independently and which depend on each other (local execution of genrules is very leaky in this respect, but sandboxing helps enforce it).

My casual reading of the unsearchable and unindexed online documentation tells me that perfect build file design and construction are NO less important for Bazel than they are for make. The instructions (under http://bazel.io/docs/build-ref.html#packages_targets) explain a super simple way to screw up build configuration without knowing anything is wrong. It sure doesn't sound to me like improvement. It just sounds like Same Problem, Different System.

Time will tell. I'm neither convinced nor even slightly impressed at this point. Many years of doing builds with many different systems just says this is YetAnotherBuild to me.

Thiago Farina

Mar 30, 2015, 8:48:35 PM
to bocto...@gmail.com, bazel-...@googlegroups.com, tech...@gmail.com


On Monday, March 30, 2015, <bocto...@gmail.com> wrote:
On Monday, March 30, 2015 at 5:07:54 AM UTC-7, Ulf Adams wrote:

> I'm sorry that our documentation isn't as clear as we'd like it to be.

I, for one, look forward to any sort of improvement in this area. Chief amongst my disappointments in Bazel's documentation is the hand-waving dismissal of (admittedly flawed) make, referring only to the necessity of perfect makefile construction and the famous document "Recursive Make Considered Harmful"; yet there isn't really an explanation of what Bazel does to improve upon make, and how that improvement is realized.

I think comparing Bazel/Blaze to Make, or vice versa, is not fair. Although they seem to do the same thing (build software), they do it in very different ways.
 
It is said somewhere (look for Mike Bland's posts) that at some point Google was not scaling anymore and engineers weren't being productive because of the compile times that make produced. The same thing happened in Chromium until Evan Martin came up with Ninja.

So for big, large scale code bases, hand-written Makefiles simply do not scale.

But that is just my interpretation, from an outside view.


--
Thiago Farina

Eric Zundel Ayers

Mar 30, 2015, 10:17:32 PM
to Thiago Farina, bocto...@gmail.com, bazel-...@googlegroups.com, tech...@gmail.com
There are a lot of build tools out there. But this family (Bazel/Buck/Pants, which all derive from the ideas in Google Blaze) is meant to solve problems not everyone has. You may not notice the advantages of the approach unless you have a very large project tree where you routinely want to build subsets of it, all from tip of trunk, or you want to switch back and forth between different versions of the source tree. To me, the big difference these tools make is that they allow you to keep the builds running quickly, even when your code base scales up to thousands of individual projects in the same source tree.

I can give you some concrete numbers from our repo.

Time to build a middling-size project composed of about 100 Maven project dependencies, from clean start to finish:
mvn -am: ~4 minutes
Pants: 1 minute

Incremental build time:
mvn -am: ~40 seconds
Pants: 9 seconds

The bigger your project, the more the time savings.  The savings I've mentioned come from not having to recursively evaluate logic to determine if something is already built.  This doesn't even count the scaling you could get with a massive build farm, or the benefits of having a distributed cache to save your incremental results. 

If the difference between 4 minutes and 1 minute doesn't make an impression on you,  I have experience working with a large code base where ant based builds took 20-30 minutes after merging up with master, and then it took another 15-20 minutes to bring up the IDE.  No one wanted to build from master more frequently than once every week or two.  No one wanted to check out a new branch to patch in some code for a review. That is a productivity and quality killer.  When multiplied by hundreds of engineers it is a really big deal.

-Eric.


Markus Kohler

Mar 31, 2015, 3:20:40 AM
to bazel-...@googlegroups.com, tfa...@chromium.org, bocto...@gmail.com, tech...@gmail.com
Hi Eric, 
I agree. 
And as far as I know, Maven's incremental Java build is still not reliable, and neither is Ant's incremental Java build support.

Pants goes further down the road of implementing an incremental and reliable Java build. 

I somewhat agree that for people not familiar with the famous Google Blaze article and with the newer build tools such as buck and pants, it is somewhat difficult to understand how big a deal the availability of bazel as open source is.
A lot of people just cannot imagine how broken their build tools are, and how much better they could be. For example, distributed caching of build results makes a whole lot of sense to me and bazel should enable this (although I haven't seen any documentation), but only buck really supports it as of now (AFAIK).



Regards,
Markus

Ulf Adams

Mar 31, 2015, 4:23:13 AM
to bocto...@gmail.com, bazel-...@googlegroups.com, anatoly techtonik
On Tue, Mar 31, 2015 at 2:10 AM, <bocto...@gmail.com> wrote:
On Monday, March 30, 2015 at 5:07:54 AM UTC-7, Ulf Adams wrote:

> I'm sorry that our documentation isn't as clear as we'd like it to be.

I, for one, look forward to any sort of improvement in this area. Chief amongst my disappointments in Bazel's documentation is the hand-waving dismissal of (admittedly flawed) make, referring only to the necessity of perfect makefile construction and the famous document "Recursive Make Considered Harmful"; yet there isn't really an explanation of what Bazel does to improve upon make, and how that improvement is realized.

At this time, I agree with Anatoly's assertion of WABT (whoopee! another build tool).

> - Blaze (internally) supports remote execution for everything, including full builds; this, in turn, powers our continuous integration system, static analysis pipeline, code search, and cross referencing systems. This is (close to) zero administration - we can simply use our existing data center machines, with no additional setup or maintenance required. Bazel doesn't support all of this, yet, but it contains much of the necessary infrastructure - it tracks exactly what files are needed for which build steps, it allows checking all required tools into source control, and it can usually determine exactly which build steps can be run independently and which depend on each other (local execution of genrules is very leaky in this respect, but sandboxing helps enforce it).

My casual reading of the unsearchable and unindexed online documentation tells me that perfect build file design and construction are NO less

I'm not sure what you mean with unsearchable and unindexed. What can we change to improve that?
 
important for Bazel than they are for make. The instructions (under http://bazel.io/docs/build-ref.html#packages_targets) explain a super simple way to screw up build configuration without knowing anything is wrong. It sure doesn't sound to me like improvement. It just sounds like Same Problem, Different System.

No system is foolproof, and I don't think that's the right criterion. We have hundreds of thousands of BUILD files, and we have only seen a few outliers which were particularly bad. I'm also not sure what possible screw-up you're referring to with that link. From that anchor, the closest 'match' to your description is this:

If, by mistake, you refer to testdepot.zip by the wrong label, such as //my/app:testdata/testdepot.zip or //my:app/testdata/testdepot.zip, you will get an error from the build tool saying that the label "crosses a package boundary". You should correct the label by putting the colon after the directory containing the innermost enclosing BUILD file, i.e., //my/app/testdata:testdepot.zip.

However, Bazel is giving an error message in this case, which doesn't seem to match your statement of 'without knowing anything is wrong'.
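
For illustration, the package-boundary check quoted above can be sketched in a few lines of Python. This is a simplification, not Bazel's label parser: the PACKAGES set (directories assumed to contain a BUILD file) and the function names are hypothetical, and real labels have more forms than //package:target.

```python
import posixpath

# Hypothetical workspace: directories (relative to the root) that contain a BUILD file.
PACKAGES = {"my/app", "my/app/testdata"}


def innermost_package(path, packages):
    """Return the deepest package directory enclosing the file path, or None."""
    directory = posixpath.dirname(path)
    while True:
        if directory in packages:
            return directory
        if not directory:
            return None
        directory = posixpath.dirname(directory)


def check_label(label, packages):
    """Validate a //package:target label against the set of package directories."""
    if not label.startswith("//") or ":" not in label:
        raise ValueError(f"malformed label: {label}")
    package, target = label[2:].split(":", 1)
    full_path = posixpath.join(package, target)
    owner = innermost_package(full_path, packages)
    if owner is None:
        raise ValueError(f"no enclosing package for {label}")
    if owner != package:
        # The colon must follow the innermost directory that has a BUILD file.
        fixed_target = full_path[len(owner) + 1:]
        raise ValueError(
            f"label {label} crosses a package boundary; use //{owner}:{fixed_target}"
        )
    return label


for candidate in (
    "//my/app/testdata:testdepot.zip",
    "//my/app:testdata/testdepot.zip",
    "//my:app/testdata/testdepot.zip",
):
    try:
        print("ok:", check_label(candidate, PACKAGES))
    except ValueError as err:
        print("error:", err)
```

Both wrong spellings are rejected with a message that names the corrected label, so the mistake is reported rather than silently accepted.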
 

Time will tell. I'm neither convinced or even slightly impressed at this point. Many years of doing builds with many different systems just says this is YetAnotherBuild to me.

Br.Bill

Mar 31, 2015, 7:09:49 PM
to bazel-...@googlegroups.com, bocto...@gmail.com, tech...@gmail.com
On Tuesday, March 31, 2015 at 1:23:13 AM UTC-7, Ulf Adams wrote:
On Tue, Mar 31, 2015 at 2:10 AM, <bocto...@gmail.com> wrote:
My casual reading of the unsearchable and unindexed online documentation tells me that perfect build file design and construction are NO less

I'm not sure what you mean with unsearchable and unindexed. What can we change to improve that?

Suggested improvement: Put a search tool on the main doc page, since there are multiple pages, so we don't have to search multiple times with browsers. With this, an index is not required. Lacking this, provide a standard index of clickable terms.
 
 
important for Bazel than they are for make. The instructions (under http://bazel.io/docs/build-ref.html#packages_targets) explain a super simple way to screw up build configuration without knowing anything is wrong. It sure doesn't sound to me like improvement. It just sounds like Same Problem, Different System.

No system is foolproof, and I don't think that's the right criterium. We have hundreds of thousands of BUILD files, and we have only seen a few outliers which were particularly bad. I'm also not sure what possible screw-up you're referring to with that link.

I was referring to this one, specifically used as an example: "The declared dependencies no longer overapproximate the actual dependencies. This may build ok, because the transitive closures of the two graphs are equal, but masks a problem: a has an actual but undeclared dependency on c."

Further down: "The declared dependency graph is now an underapproximation of the actual dependencies, even when transitively closed; the build is likely to fail. The problem could have been averted by ensuring that the actual dependency from a to c introduced in Step 2 was properly declared in the BUILD file."

Again, this sounds like every build system ever introduced. Not a defect in your system, but rather, proof that reinventing the wheel still reveals that flat tires are problematic.

I don't deny Blaze's speed improvements. I see that these systems are heavily dependent on maven and ant, which are known to be problematic with make anyway. Maybe that's why I don't care; I don't work with any teams or projects that depend on them. Would be curious to see if Blaze realized similar speed gains in large builds involving Mono or Xcode. It seems to be a good thing for Google or any company that builds exactly like Google.

Ulf Adams

Apr 1, 2015, 4:54:07 AM
to Br.Bill, bazel-...@googlegroups.com, anatoly techtonik
On Wed, Apr 1, 2015 at 1:09 AM, Br.Bill <bocto...@gmail.com> wrote:
On Tuesday, March 31, 2015 at 1:23:13 AM UTC-7, Ulf Adams wrote:
On Tue, Mar 31, 2015 at 2:10 AM, <bocto...@gmail.com> wrote:
My casual reading of the unsearchable and unindexed online documentation tells me that perfect build file design and construction are NO less

I'm not sure what you mean with unsearchable and unindexed. What can we change to improve that?

Suggested improvment: Put a search tool on the main doc page, since there are multiple pages, so we don't have to search multiple times with browsers. With this, an index is not required. Lacking this, a standard index of clickable terms.
 
 
important for Bazel than they are for make. The instructions (under http://bazel.io/docs/build-ref.html#packages_targets) explain a super simple way to screw up build configuration without knowing anything is wrong. It sure doesn't sound to me like improvement. It just sounds like Same Problem, Different System.

No system is foolproof, and I don't think that's the right criterium. We have hundreds of thousands of BUILD files, and we have only seen a few outliers which were particularly bad. I'm also not sure what possible screw-up you're referring to with that link.

I was referring to this one, specifically used as an example: "The declared dependencies no longer overapproximate the actual dependencies. This may build ok, because the transitive closures of the two graphs are equal, but masks a problem: a has an actual but undeclared dependency on c."

Further down: "The declared dependency graph is now an underapproximation of the actual dependencies, even when transitively closed; the build is likely to fail. The problem could have been averted by ensuring that the actual dependency from a to c introduced in Step 2 was properly declared in the BUILD file."

Again, this sounds like every build system ever introduced. Not a defect in your system, but rather, proof that reinventing the wheel still reveals that flat tires are problematic.

We have tools for Java and C++ to detect such cases and give an error during the build; I'm not sure if they're open source yet. Unfortunately, this does require language-specific tooling; in some cases, such as Python, it's not even possible to do at build time (unless you had some sort of type annotations).
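
The scenario from the documentation can be reproduced with a small Python sketch. It assumes three hypothetical targets a, b and c, where a declares only b, b declares c, and a's sources also use c directly; the ACTUAL map stands in for whatever language-specific analysis discovers real usage. This only illustrates the kind of check involved, not the Java or C++ tooling mentioned above.

```python
# Hypothetical example: what each target declares in its BUILD file...
DECLARED = {"a": {"b"}, "b": {"c"}, "c": set()}
# ...versus what its sources actually reference (e.g. as reported by a compiler).
ACTUAL = {"a": {"b", "c"}, "b": {"c"}, "c": set()}


def undeclared_direct_deps(declared, actual):
    """Map each target to the direct dependencies it uses but never declares."""
    problems = {}
    for target, used in actual.items():
        missing = used - declared.get(target, set())
        if missing:
            problems[target] = sorted(missing)
    return problems


def transitive_closure(graph, start):
    """All targets reachable from start by following declared edges."""
    seen, stack = set(), [start]
    while stack:
        for dep in graph.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# The build still works today, because c is reachable from a through b...
print("c" in transitive_closure(DECLARED, "a"))  # True
# ...but a strict check flags the hidden edge right away.
print(undeclared_direct_deps(DECLARED, ACTUAL))  # {'a': ['c']}

# If b later drops its dependency on c, a can no longer reach c.
DECLARED_LATER = {"a": {"b"}, "b": set(), "c": set()}
print("c" in transitive_closure(DECLARED_LATER, "a"))  # False
```

Reporting the undeclared edge at the moment it is introduced is what separates an error during the build from a breakage that surfaces only after an unrelated change elsewhere.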
 

I don't deny Blaze's speed improvements. I see that these systems are heavily dependent on maven and ant, which are known to be problematic with make anyway. Maybe that's why I don't care; I don't work with any teams or projects that depend on them. Would be curious to see if Blaze realized similar speed gains in large builds involving Mono or Xcode. It seems to be a good thing for Google or any company that builds exactly like Google.
