Hi all,
tl;dr: We'd like to contribute Bazel BUILD files for LLVM and MLIR in a side-directory in the monorepo, similar to the gn build.
Some of us have been working on open-source Bazel BUILD files for the LLVM Project. You may have seen us hanging out in the #build-systems discord channel. As you may know, Google uses Bazel internally and has maintained a Bazel BUILD of LLVM for years. Especially with the introduction of MLIR, we've got more and more OSS projects with a Bazel BUILD depending on LLVM (e.g. IREE and TensorFlow). We're also not the only ones using Bazel: e.g. PlaidML also has a Bazel BUILD of LLVM that they've borrowed from TF. Each of these projects has to jump through some weird hoops to keep their version of the Bazel BUILD files in sync with the code, which requires some fragile combination of scripts and human intervention. Instead, we'd like to move general-purpose Bazel BUILD files into the LLVM Project monorepo. We expect to follow the model of the GN build where these will be maintained by interested contributors rather than expecting the general community to maintain them.
To facilitate and test this we've been developing a standalone repository that just has the Bazel BUILD files. It symlinks together the directory trees on top of a submodule, as we would need in the monorepo to avoid in-tree BUILD files. The configuration is at https://github.com/google/llvm-bazel. We now have those in a good place and think they would be useful upstream.
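For readers unfamiliar with the overlay idea, here is a minimal Python sketch of symlinking a tree of BUILD files on top of a source checkout so that the source tree itself stays free of BUILD files. This is not the mechanism llvm-bazel actually uses; the `overlay` function and every directory and file name below are hypothetical and chosen only for illustration.

```python
import tempfile
from pathlib import Path


def overlay(src_root: Path, build_root: Path, out_root: Path) -> None:
    """Symlink every file from src_root, then from build_root, into out_root.

    Files from build_root win if both trees contain the same relative path.
    """
    for root in (src_root, build_root):
        for path in root.rglob("*"):
            if path.is_file():
                dest = out_root / path.relative_to(root)
                dest.parent.mkdir(parents=True, exist_ok=True)
                if dest.is_symlink() or dest.exists():
                    dest.unlink()
                dest.symlink_to(path.resolve())


# Hypothetical layout: a source checkout and a separate tree of BUILD files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    src, builds, out = tmp / "llvm-project", tmp / "overlay", tmp / "merged"
    (src / "llvm/lib").mkdir(parents=True)
    (src / "llvm/lib/Support.cpp").write_text("// source\n")
    (builds / "llvm").mkdir(parents=True)
    (builds / "llvm/BUILD.bazel").write_text("# build rules\n")

    overlay(src, builds, out)
    merged = sorted(p.relative_to(out).as_posix() for p in out.rglob("*") if p.is_file())
    print(merged)
```

The merged directory is what the build tool would see, while the checkout and the BUILD tree remain separate on disk.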
# Details
## What
Bazel BUILD files for the LLVM, MLIR, and Clang (PR out for review) subprojects, potentially expanding to others, as needed. Basically everything currently at https://github.com/google/llvm-bazel.
## Where
In https://github.com/google/llvm-bazel the BUILD files live in a single directory tree matching the structure of the overall llvm-project directory. For users, @llvm-project is a single Bazel repository that includes both LLVM and MLIR subprojects. To maintain this structure, we would probably want to put a `bazel` directory in the monorepo's utils directory, which currently only contains a directory for arcanist. This is different from gn, which is under the LLVM subproject's utils directory. We could similarly put the Bazel BUILD files under llvm/utils/bazel but have them be for the entire llvm project (the subsets that are supported). This seems like an odd structure to me, but I know that the CMake build for LLVM also builds the other subprojects, so maybe this would be preferable.
Alternatively we could split each subproject into a separate Bazel repository and put the Bazel build files under each subproject. I think this fragments the configuration of the BUILD without much benefit.
## Configurations
We currently have configurations for Linux GCC and Clang, macOS GCC and Clang, and Windows MSVC. Support for other configurations can be added as desired, but supporting all possible LLVM build configurations is not the goal.
## Support
Support would be similar to the gn build. Contributors could optionally update the Bazel BUILD files as part of their patches, but would be under no obligation to do so.
## Preserving History
I don't *think* the history of llvm-bazel is interesting enough to try to merge it into the monorepo and I was planning to submit this as a single patch, but please let me know if you disagree.
## Benefits to the community
Projects that depend on LLVM and use the Bazel build system can avoid duplicating fragile effort. We'll spend more time contributing to LLVM instead :-D
Bazel is stricter than CMake in many ways (e.g. it requires that even header dependencies be declared) and can catch layering issues very easily. There's even an optional layering_check feature we could turn on if its use would benefit the community (though currently the existing problematic layering makes it a burden to maintain on our own). Even without that additional check, as I've been keeping the Bazel build green, I've found and fixed a number of layering issues in the past couple of weeks (e.g. https://reviews.llvm.org/rGb49787df9a and https://reviews.llvm.org/rGc17ae2916c).
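To make the "header dependencies must be declared" point concrete, here is a toy Python model of that kind of check. It is only a sketch of the idea, not how Bazel or layering_check is implemented, and it assumes that a target may include only its own headers and those of its direct deps. The target names, header names, and the `undeclared_includes` helper are all hypothetical.

```python
# Toy model of a strict header-dependency check (not Bazel's implementation).
# All target and header names are hypothetical.

def undeclared_includes(targets, includes):
    """Return {target: [(header, owning_target), ...]} for headers that a
    target includes without declaring a direct dep on the owner."""
    owner = {hdr: name for name, t in targets.items() for hdr in t["hdrs"]}
    problems = {}
    for name, used in includes.items():
        # A target sees its own headers plus those of its direct deps only.
        visible = set(targets[name]["hdrs"])
        for dep in targets[name]["deps"]:
            visible.update(targets[dep]["hdrs"])
        missing = sorted(h for h in used if h not in visible)
        if missing:
            problems[name] = [(h, owner.get(h)) for h in missing]
    return problems


targets = {
    "Support": {"hdrs": ["Support/Casting.h"], "deps": []},
    "IR": {"hdrs": ["IR/Module.h"], "deps": ["Support"]},
    # "Analysis" uses an IR header but forgot to declare the dep on "IR".
    "Analysis": {"hdrs": ["Analysis/Loops.h"], "deps": ["Support"]},
}
includes = {
    "IR": ["Support/Casting.h"],
    "Analysis": ["IR/Module.h", "Support/Casting.h"],
}

result = undeclared_includes(targets, includes)
print(result)
```

In this model the missing edge from "Analysis" to "IR" is reported, which is the sort of layering problem a build that only tracks link-level dependencies could let through unnoticed.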
Can you explain some of the benefits to using Bazel instead of CMake?
I'm a little concerned about having two 'unsupported' buildsystems
living in tree, and I'm not sure what would stop us from continuing to
add more. I would feel better if we had a set of guidelines to define
the criteria for adding a new buildsystem and also criteria for when we
can remove them.
Would you be able to amend this proposal to include some general
guidelines for adding/removing new buildsystems, so that we can discuss
that too?
Thanks,
Tom
>
> # Details
>
> ## What
>
> Bazel BUILD files for the LLVM, MLIR, and Clang (PR out for review
> <https://github.com/google/llvm-bazel/pull/72>) subprojects, potentially
> expanding to others, as needed. Basically everything currently at
> https://github.com/google/llvm-bazel.
>
>
> ## Where
>
> In https://github.com/google/llvm-bazel the BUILD files live in a single
> directory tree matching the structure of the overall llvm-project
> directory. For users, @llvm-project is a single Bazel repository
> <https://docs.bazel.build/versions/master/build-ref.html#repositories> that
> includes both LLVM and MLIR subprojects. To maintain this structure, we
> would probably want to put a `bazel` directory in the monorepo's utils
> directory <https://github.com/llvm/llvm-project/tree/master/utils>,
> which currently only contains a directory for arcanist. This is
> different from gn, which is under the LLVM subproject's utils directory
> <https://github.com/llvm/llvm-project/tree/master/llvm/utils/gn>. We
> could similarly put the Bazel BUILD files under llvm/utils/bazel but
> have them be for the entire llvm project (the subsets that are
> supported). This seems like an odd structure to me, but I know that the
> CMake build for LLVM also builds the other subprojects
> <https://github.com/llvm/llvm-project/blob/529ac33197f6/llvm/tools/CMakeLists.txt#L34-L41>,
> so maybe this would be preferable.
>
> Alternatively we could split each subproject into a separate Bazel
> repository and put the Bazel build files under each subproject. I think
> this fragments the configuration of the BUILD without much benefit.
>
>
> ## Configurations
>
> We currently have configurations for Linux GCC and Clang, MacOS GCC and
> Clang, and Windows MSVC. Support for other configurations can be added
> as-desired, but supporting all possible LLVM build configurations is not
> the goal.
>
>
> ## Support
>
> Support would be similar to the gn build. Contributors could optionally
> update the Bazel BUILD files as part of their patches, but would be
> under no obligation to do so.
>
>
> ## Preserving History
>
> I don't *think* the history of llvm-bazel is interesting enough to try
> to merge it into the monorepo and I was planning to submit this as a
> single patch, but please let me know if you disagree.
>
>
> ## Benefits to the community
>
> *
>
> Projects that depend on LLVM and use the Bazel build system can
> avoid duplicating fragile effort. We'll spend more time contributing
> to LLVM instead :-D
>
> *
>
> Bazel is stricter than CMake in many ways (e.g. it requires that
> even header dependencies be declared) and can catch layering issues
> very easily. There's even an optional layering_check feature we
> could turn on if its use would benefit the community. (though
> currently the existing problematic layering makes it a burden to
> maintain on our own). Even without that additional check, as I've
> been keeping the Bazel build green, I've found and fixed a number of
> layering issues in the past couple weeks (e.g.
> https://reviews.llvm.org/rGb49787df9a
> <https://reviews.llvm.org/rGb49787df9a535f03761c340dca7ec3ec1155133d> and
> https://reviews.llvm.org/rGc17ae2916c
> <https://reviews.llvm.org/rGc17ae2916ccf45a0c1717bd5f11598cc4fff342a>).
>
>
> Here's a patch <https://reviews.llvm.org/D90352> adding the Bazel build
> system. It's basically just `cp -r llvm-bazel/llvm-bazel
> llvm-project/utils/bazel`.
>
> _______________________________________________
> LLVM Developers mailing list
> llvm...@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> > tl;dr: We'd like to contribute Bazel BUILD files for LLVM and MLIR in a
> > side-directory in the monorepo, similar to the gn build.
> Can you explain some of the benefits to using Bazel instead of CMake?
I can, and I will be very brief: None.
--
Stefan Teleman
stefan....@gmail.com
Could not a public repository for the Bazel build system be created, that has a submodule for the llvm monorepo? Users of the Bazel build system could checkout the Bazel build repo and do the submodule init, and this public repo could be used for collaboration.
Thanks,
Christopher Tetreault
I think Renato has articulated quite well some concerns I have about this but was unable to express. I would very much prefer if we just focus on using CMake effectively.
Thanks,
Christopher Tetreault
From: llvm-dev <llvm-dev...@lists.llvm.org>
On Behalf Of Renato Golin via llvm-dev
Sent: Thursday, October 29, 2020 9:06 AM
To: tste...@redhat.com
Cc: Mehdi Amini <ami...@google.com>; LLVM Dev <llvm...@lists.llvm.org>; Stella Laurenzo <laur...@google.com>; Tres Popp <tp...@google.com>;
Geoffrey Martin-Noble <gc...@google.com>; Thomas Joerg <tjo...@google.com>
Subject: [EXT] Re: [llvm-dev] Contributing Bazel BUILD files similar to gn
On Thu, 29 Oct 2020 at 15:23, Tom Stellard via llvm-dev <llvm...@lists.llvm.org> wrote:
> I'm a little concerned about having two 'unsupported' buildsystems
> living in tree, and I'm not sure what would stop us from continuing to
> add more. I would feel better if we had a set of guidelines to define
> the criteria for adding a new buildsystem and also criteria for when we
> can remove them.
I have used Bazel and it doesn't seem to map well to CMake. It seems to be in between CMake and Ninja with a lot of hard-coded dependencies that are cumbersome to keep updating. I'm by no means an expert, and I could very well be wrong, but supporting more than one build system is not trivial (remember the autoconf days?).
For example, implementing the same logic on both will not be trivial. So, whenever we want to add some functionality or improve how we build LLVM with one system, we'll have to do so in multiple build systems that do not easily match each other. If we don't try to match functionality, we'll segregate the community, because people will be able to do X on build system A but not B, and the similar features cluster together and then we have essentially two projects built from the same source code.
Testing this, or worse, trying to fix a buildbot that is built with Bazel (and having to install the Java JDK and all its dependencies) on hardware that you potentially do not have access to, will be a nightmare to debug. The nature of post-commit testing, revert and review of LLVM will not make that simpler. Unless we treat the Bazel build as "not our problem" (which defeats the point of having it?).
To make matters worse, our CMake files are not simple, and do not do all of the things we want them to do in a way we understand completely. There is a lot of kludge that we carry, and it comes in two categories: the things that we hate and would love to fix, and the things that are fixes that we have no idea are there. The former are the reasons why people want to start a new build system, the latter is why they soon realise that was a mistake (insert XKCD joke here).
If the Bazel files can be completely ignored, then it's just more clutter. But if other projects start to use more different build systems and we start packing them all in LLVM, then we'll have a hard time knowing what we build how. I can't really see this scaling.
Two-cents worth.
--renato
I did not see a rationale for the Bazel proposal, outlining its
benefits over CMake.
Speaking with direct experience with Bazel - Tensorflow - I cannot
think of a single reason why it would/should be considered "better"
over the current CMake.
Everyone has their own favorite build system. That is nice, but it is
not enough of a reason to propose adding it.
I would also like to become informed as to what particular
needs/shortcomings/defects are addressed by Bazel, that are lacking in
/ cannot be addressed by CMake.
Thanks.
I /believe/ the idea is that, like gn, there are folks maintaining these build systems out of tree anyway - and having them in tree makes it easier to coordinate that effort, with the express intent of not burdening the general community with their upkeep (like gn currently - the idea is that there's no burden on developers to update gn build files (& consequently bazel build files)).
>
> In the meantime, having those files wouldn't be the end of the world. But I fear that once we add, they'll stay there forever, and will lead to people ignoring CMake and segregating the project.
Yes that is my main concern as well.
Build systems for complex projects are ... messy. I believe that what
we have right now - with CMake - works quite well. And I am perfectly
aware of the insane amount of work that has gotten into making LLVM
quite easy to build.
On Thu, 29 Oct 2020 at 19:16, David Blaikie <dbla...@gmail.com> wrote:
> I /believe/ the idea is that, like gn, there are folks maintaining these build systems out of tree anyway - and having them in tree makes it easier to coordinate that effort, with the express intent of not burdening the general community with their upkeep (like gn currently - the idea is that there's no burden on developers to update gn build files (& consequently bazel build files)).
Perhaps my initial concerns weren't well articulated.
I get that those files would be "additional" and other developers won't need to care much about them.
But what happens when people join the project with experience in Bazel and, instead of building pure LLVM with CMake, they start using Bazel for everything, just because they're used to it?
Bazel is big enough (at least inside Google) that the probability of that happening is not trivial.
What if they create sub-projects that can only build with Bazel? Do we refuse inclusion? But don't we have Bazel files already?
One big example is Android. They used to build LLVM in a very different way, and the inclusion of run-time library files was completely different. So different that it was not possible to merge some changes they had (128 bit maths IIRC) because of the amount of work required.
My point is that adding another build system will not necessarily improve the chances of external people contributing to LLVM if they use those build systems. It may very well *reduce* those chances.
Glad to see that the current required cmake version is being met with
Ubuntu 20.04 and later.
The problem is that once it’s in community LLVM, it becomes the community’s problem. The expectation is that individual contributors do not break anything in upstream.
Why else would you contribute it to the LLVM monorepo? If the goal is just to enable external-to-google orgs to collaborate on it, why not contribute it as a new repo separate from LLVM? You wouldn’t need to ask anybody’s permission to do this.
From: Sterling Augustine <saugu...@google.com>
Sent: Thursday, October 29, 2020 1:14 PM
To: Chris Tetreault <ctet...@quicinc.com>
Cc: Renato Golin <reng...@gmail.com>; tste...@redhat.com; Mehdi Amini <ami...@google.com>; LLVM Dev <llvm...@lists.llvm.org>; Stella Laurenzo <laur...@google.com>; Tres Popp <tp...@google.com>; Geoffrey Martin-Noble <gc...@google.com>; Thomas Joerg <tjo...@google.com>
Subject: [EXT] Re: [llvm-dev] Contributing Bazel BUILD files similar to gn
On Thu, Oct 29, 2020 at 12:29 PM Chris Tetreault via llvm-dev <llvm...@lists.llvm.org> wrote:
> I think Renato has articulated quite well some concerns I have about this but was unable to express. I would very much prefer if we just focus on using CMake effectively.
> ...
> For example, implementing the same logic on both will not be trivial. So, whenever we want to add some functionality or improve how we build LLVM with one system, we'll have to do so in multiple build systems that do not easily match each other.
Google already does all of this work, and has for years. I think it is fair to say that it hasn't been a burden on the community.
> If we don't try to match functionality, we'll segregate the community, because people will be able to do X on build system A but not B, and the similar features cluster together and then we have essentially two projects built from the same source code.
As long as we keep CMake as the canonical system everything will be fine. It works perfectly well today, except that not everyone gets to see or use the bazel files. They exist right now; they work right now; and it hasn't been a burden on anyone but the people who care about bazel.
> Testing this, or worse, trying to fix a buildbot that is built with Bazel (and having to install the Java JDK and all its dependencies) on hardware that you potentially do not have access to, will be a nightmare to debug. The nature of post-commit testing, revert and review of LLVM will not make that simpler. Unless we treat the Bazel build as "not our problem" (which defeats the point of having it?).
Google makes it work like this today, with the rest of the project treating it as "not our problem" because they don't even see that they exist. The build bot issues would be real, but I think surmountable, given that Google already cleans up the bazel files, it just doesn't push them. Perhaps an explicit policy that cmake folks don't have to update the bazel files would be helpful.
> To make matters worse, our CMake files are not simple, and do not do all of the things we want them to do in a way we understand completely. There is a lot of kludge that we carry, and it comes in two categories: the things that we hate and would love to fix, and the things that are fixes that we have no idea are there. The former are the reasons why people want to start a new build system, the latter is why they soon realise that was a mistake (insert XKCD joke here).
It wouldn't be starting a new build system, it would be making a pre-existing, already extremely well functioning one, available to more people.
I can definitely see folks who use cmake not wanting more hassle--that may be a valid reason not to do it. But "it won't work" or "it's hard to keep up" or "it's too complicated" seem well refuted by a multi-year existence proof.
I would propose to have the files in a separate tree from llvm/, mlir/, clang/ ; labelling these clearly as unsupported (either in the path to these files or in the README, or both), and not provide any public documentation on llvm.org that would invite users to work with these. The readme would explain how to use them to include LLVM as a dependency to an existing Bazel project and document the intent as such.
This is a fair concern: can we defend against this with a clear policy?
Also, having no public bot with Bazel or any build system other than CMake should help, right?
My intuition was that by having the file upstream, we would instead encourage such users to track the HEAD of our main branch more closely and so provide them an easier path for upstream work. The fact that they can get upstream working with their build environment may provide an incentive to upstream along the way, even if they have to do the CMake integration first.
What would be the hassle to cmake users?
Honestly, I’m hearing that some people would like the Bazel build system to be in community master, and the argument basically boils down to “It’ll be fine. It’ll just sit there and mind its own business and you don’t have to care about it.”
> So why are we doing it? I mentioned this in another answer: this is mainly to provide a collaboration space for the support of OSS projects using Bazel interested to use LLVM (and some subprojects). …
Which could be handled by having it in an external public repo.
> Having them in-tree means that we can publish every day (or more) a git hash that we validate with Bazel on private bots (like `gn`) and every project can use to clone the LLVM monorepo and integrate in their build flow easily.
You could still publish this info: “Today, the head of llvm-bazel is confirmed to work with LLVM monorepo sha [foo]”. I don’t think two git clones is significantly harder than one. I submit that in a way this is simpler because you can always advertise the head of the bazel repo. If the Bazel build system were in the community repo, then you might have to tell users to use an older version of the bazel build if a fix went into the monorepo in the afternoon, but the next morning’s nightly finds that the most recent sha that passes the tests is prior to that fix.
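The bookkeeping in that scenario can be sketched in a few lines of Python. This is only an illustration of "publish the most recent validated sha" and of how a fix can land after that sha; the commit ids, the `last_known_good` and `includes_commit` helpers, and the idea of modelling bot results as a set are all assumptions made for the example.

```python
def last_known_good(history, passing):
    """history: commit ids ordered oldest to newest.
    passing: ids that the (private) Bazel bots have validated.
    Returns the newest validated id, or None if nothing has passed."""
    for sha in reversed(history):
        if sha in passing:
            return sha
    return None


def includes_commit(history, pinned, wanted):
    """True if `wanted` is at or before `pinned` in a linear history."""
    return history.index(wanted) <= history.index(pinned)


# Hypothetical shas; "c3" stands for the fix that landed in the afternoon.
history = ["a1", "b2", "c3", "d4"]
passing = {"a1", "b2"}  # the nightly only validated up to "b2"

pinned = last_known_good(history, passing)
print(pinned)
print(includes_commit(history, pinned, "c3"))
```

Here the published sha is "b2", which predates the fix "c3", so anyone pinning to the validated sha does not get the fix until a later commit passes.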
I guess my concern is that I’m not really hearing a compelling (to my ear) argument for this inclusion. I guess it would make the lives of google employees easier? Then what’s to stop every large org from committing their internal stuff to master?
From: Mehdi AMINI <joke...@gmail.com>
Sent: Thursday, October 29, 2020 2:00 PM
To: Chris Tetreault <ctet...@quicinc.com>
Cc: Sterling Augustine <saugu...@google.com>; Mehdi Amini <ami...@google.com>; LLVM Dev <llvm...@lists.llvm.org>; Stella Laurenzo <laur...@google.com>; Tres Popp <tp...@google.com>; Geoffrey Martin-Noble <gc...@google.com>; Thomas Joerg <tjo...@google.com>
Subject: [EXT] Re: [llvm-dev] Contributing Bazel BUILD files similar to gn
On Thu, Oct 29, 2020 at 1:24 PM Chris Tetreault via llvm-dev <llvm...@lists.llvm.org> wrote:
> The problem is that once it’s in community LLVM, it becomes the community’s problem. The expectation is that individual contributors do not break anything in upstream.
I would expect that the community by now has concrete experience with `gn` gained over a few years demonstrating that this hasn't been a problem to have this in-tree, without a burden of support on the community.
In particular, I think that a salient point is the guarantee that no public bot would be testing it (I mean here by "no public bot" that no bot would email you when you break it).
> Why else would you contribute it to the LLVM monorepo? If the goal is just to enable external-to-google orgs to collaborate on it, why not contribute it as a new repo separate from LLVM? You wouldn’t need to ask anybody’s permission to do this.
Yes, we could do this, and you are correct that in many cases a motivation to upstream a component is to make sure it is maintained by the community and works out of the box.
In this case it is slightly different: we are OK with people breaking this. We are already maintaining these files out-of-tree for our own purposes, and this has been the case for years as Sterling mentions. I would even suspect that for Google internal build integration, it is actually easier to have these files internal only rather than unsupported upstream.
So why are we doing it? I mentioned this in another answer: this is mainly to provide a collaboration space for the support of OSS projects that use Bazel and are interested in using LLVM (and some subprojects).
Having them in-tree means that we can publish every day (or more) a git hash that we validate with Bazel on private bots (like `gn`) and every project can use to clone the LLVM monorepo and integrate in their build flow easily. Another repo, submodules, etc. are not making this possible / practical.
Honestly, I’m hearing that some people would like the Bazel build system to be in community master, and the argument basically boils down to “It’ll be fine. It’ll just sit there and mind its own business and you don’t have to care about it.”
> So why are we doing it? I mentioned this in another answer: this is mainly to provide a collaboration space for the support of OSS projects using Bazel interested to use LLVM (and some subprojects). …
Which could be handled by having it in an external public repo.
> Having them in-tree means that we can publish every day (or more) a git hash that we validate with Bazel on private bots (like `gn`) and every project can use to clone the LLVM monorepo and integrate in their build flow easily.
You could still publish this info: “Today, the head of llvm-bazel is confirmed to work with LLVM monorepo sha [foo]”. I don’t think two git clones is significantly harder than one.
I submit that in a way this is simpler because you can always advertise the head of the bazel repo. If the Bazel build system were in the community repo, then you might have to tell users to use an older version of the bazel build if a fix went into the monorepo in the afternoon, but the next morning’s nightly finds that the most recent sha that passes the tests is prior to that fix.
I guess my concern is that I’m not really hearing a compelling (to my ear) argument for this inclusion.
I guess it would make the lives of google employees easier?
If their "internal stuff" is highly-coupled to LLVM, has zero-cost maintenance on the community, and is something that multiple other parties can benefit and established members of the community want to maintain and collaborate on, why not?
I mentioned it before, but Bazel is not something internal or specific to Google: it isn't (actually there are many incompatibilities between Bazel and the internal system), 400 people attended the Bazel conference last year. I attended this conference 3 years ago when I was at Tesla trying to deploy Bazel internally. Many other companies are using Bazel, open-source projects as well. Feel free to watch the talks online about SpaceX or Two Sigma and Uber for example
I'm not trying to convince anyone to use Bazel, it has drawbacks, but the point here is to recognize that this is about OpenSource communities that Bazel is serving: these are users, some of us in the LLVM community are trying to provide these users with a reasonably good integration story, and we're ready to pay the cost for everyone.
At the end of the day, I’m just one person stating his opinion. You’ve submitted a proposal to make a change to the community, and I’ve stated my concerns as a member of the community. If broad support for this change is expressed by members of the community, then fine. As it stands, I’m seeing lots of googlers voicing support, and not much support from others. I’m not the arbiter of what gets done or not done. All I can do is state my opinions, which I have done.
If this contribution is accepted, I would request that the promise that the CMake build system will remain the canonical build system, and that no build bot will ever email me that I broke it, be documented.
For the record, I don’t find the argument that “gn already does this, so why not Bazel” to be compelling. If I were around when the gn build was added, I would have complained about it too.
On Thu, 29 Oct 2020 at 19:16, David Blaikie <dbla...@gmail.com> wrote:
> I /believe/ the idea is that, like gn, there are folks maintaining these build systems out of tree anyway - and having them in tree makes it easier to coordinate that effort, with the express intent of not burdening the general community with their upkeep (like gn currently - the idea is that there's no burden on developers to update gn build files (& consequently bazel build files)).
Perhaps the initial assumption about my concerns weren't well articulated.
I get that those files would be "additional" and other developers won't need to care much about them.
But what happens when people join the project with experience in Bazel and, instead of building pure LLVM with CMake, they start using Bazel for everything, just because they're used to it?
> This is a fairly unhelpful email - clearly folks using Bazel derive some benefit/have chosen some tradeoff compared to CMake. Doesn't have to be the thing you want, but it's pretty unhelpful to dismiss/diminish the needs of others like this.
I did not see a rationale for the Bazel proposal, outlining its
benefits over CMake.
Speaking with direct experience with Bazel - Tensorflow - I cannot
think of a single reason why it would/should be considered "better"
over the current CMake.
Everyone has their own favorite build system. That is nice, but it is
not enough of a reason to propose adding it.
I would also like to become informed as to what particular
needs/shortcomings/defects are addressed by Bazel, that are lacking in
/ cannot be addressed by CMake.
On Thu, Oct 29, 2020 at 7:16 PM David Blaikie <dbla...@gmail.com> wrote:
> I expect most of it is probably a statement free of value judgments: Some other projects chose to use it/some folks have to use it for other reasons, clearly there's enough use that it's motivated folks to have/maintain Bazel builds for LLVM for years. Rather than judging their choices as bad/lesser/wrong - might be useful to accept that some folks had their reasons and they're trying to make the most of the situation. I don't think anyone's making an argument that LLVM should switch to Bazel/that that would be better than the CMake we're using, and I think it's helpful to return the favor and not suggest that other projects would be better off switching to CMake over Bazel - they no doubt have their reasons.
Please do not manufacture statements that I did not make. I never
suggested, or stated, anywhere, that some other imaginary project
using Bazel should switch to CMake.
I did state that I do not find Bazel to be a better alternative to
CMake. My statement is based on direct experience with both.
If the intent behind Bazel is not to present it as a better
alternative to CMake, then what is the intent? Instead of maintaining
this impenetrable mystery as to why a Bazel build system should be
included in LLVM, please take the time to advocate for Bazel with
technical facts, rather than "someone at Google really likes it".
Just because someone likes and maintains an alternative build system
for LLVM, somewhere, that does not automatically mean, or imply that
it should be upstreamed.
For all I know, someone might be building their fork of LLVM with
autoconf. I am sure they have their own very good reasons for doing
so. Should we, therefore, bring back autoconf?
Thanks.
--
Stefan Teleman
stefan....@gmail.com
I think this argument is the slippery slope in action. Just because we allowed the gn build system to be added previously, does not mean that we should allow a new build system now. And forbidding this build system now does not mean that we must kick gn out of the repo.
We should accept or reject Bazel on its merits alone, and not based on historical precedent.
From: llvm-dev <llvm-dev...@lists.llvm.org>
On Behalf Of Zachary Turner via llvm-dev
Sent: Thursday, October 29, 2020 4:11 PM
To: Renato Golin <reng...@gmail.com>
Cc: Mehdi Amini <ami...@google.com>; LLVM Dev <llvm...@lists.llvm.org>; Stella Laurenzo <laur...@google.com>; Tres Popp <tp...@google.com>;
Geoffrey Martin-Noble <gc...@google.com>; Thomas Joerg <tjo...@google.com>
Subject: [EXT] Re: [llvm-dev] Contributing Bazel BUILD files similar to gn
On the grounds that it was a bad idea after all.
Any commits going into the LLVM repository should not break any part of it, at least not without a consideration for a fix. There is an exception to it---experimental targets. They can be broken, but they are there with the explicit intent of becoming officially supported.
Same thing applies to the cmake files. If they get broken, they need to be fixed, but the same doesn’t apply to the extraneous build systems. They can be broken and never fixed. There is no commitment from the community as a whole to keep them working. IMO, this isn’t right, and files like that should not be a part of the official repository.
Whether GN or Bazel have superior features is irrelevant. Unless their configuration files are a part of a longer-term transition process, they don’t belong in the repo.
--
Krzysztof Parzyszek kpar...@quicinc.com AI tools development
On Thu, Oct 29, 2020 at 12:49 PM Renato Golin via llvm-dev <llvm...@lists.llvm.org> wrote:
> On Thu, 29 Oct 2020 at 19:16, David Blaikie <dbla...@gmail.com> wrote:
>> I /believe/ the idea is that, like gn, there are folks maintaining these build systems out of tree anyway - and having them in tree makes it easier to coordinate that effort, with the express intent of not burdening the general community with their upkeep (like gn currently - the idea is that there's no burden on developers to update gn build files (& consequently bazel build files)).
> Perhaps the initial assumption about my concerns weren't well articulated.
> I get that those files would be "additional" and other developers won't need to care much about them.
> But what happens when people join the project with experience in Bazel and, instead of building pure LLVM with CMake, they start using Bazel for everything, just because they're used to it?
Didn't the community already go through this exact discussion when gn was added? Let me ask a different question. If gn support was permitted, on what grounds should we refuse a different parallel build system? Either we should allow people to contribute build systems upstream that they wish to maintain, or we should keep every build system other than CMake out of the tree.
_______________________________________________
>> Instead of maintaining
>> this impenetrable mystery as to why a Bazel build system should be
>> included in LLVM, please take the time to advocate for Bazel with
>> technical facts, rather than "someone at Google really likes it".
>
>
> That's the technical facts though: A variety of other projects with LLVM as a dependency use Bazel, for whatever their reasons, and are currently maintaining Bazel build files out of tree and it would be easier for them to coordinate in-tree instead.
I fail to see how any of these are technical facts. Whatever "variety
of other projects with LLVM as a dependency" choose to use for their
build system is their business.
Let's be a bit more precise here: this "variety of other projects with
LLVM as a dependency" aren't just random projects off the Internet.
These are all Google projects. Correct?
So, in final analysis, this has nothing to do with Bazel's technical
merits. It has everything to do with "It's convenient for Google".
Regardless of whether the larger LLVM community agrees with the idea,
or not. Which, so far, it does not seem to me that it has.
Thanks for clarifying.
On 10/28/20 6:18 PM, Geoffrey Martin-Noble via llvm-dev wrote:
> [...]
>
> - Projects that depend on LLVM and use the Bazel build system can avoid
>   duplicating fragile effort. We'll spend more time contributing to LLVM
>   instead :-D
> - Bazel is stricter than CMake in many ways (e.g. it requires that even
>   header dependencies be declared) and can catch layering issues very easily.
>   There's even an optional layering_check feature we could turn on if its use
>   would benefit the community (though currently the existing problematic
>   layering makes it a burden to maintain on our own). Even without that
>   additional check, as I've been keeping the Bazel build green, I've found
>   and fixed a number of layering issues in the past couple weeks (e.g.
>   https://reviews.llvm.org/rGb49787df9a
>   <https://reviews.llvm.org/rGb49787df9a535f03761c340dca7ec3ec1155133d>
>   and https://reviews.llvm.org/rGc17ae2916c
>   <https://reviews.llvm.org/rGc17ae2916ccf45a0c1717bd5f11598cc4fff342a>).
>
> Here's a patch <https://reviews.llvm.org/D90352> adding the Bazel build
> system. It's basically just `cp -r llvm-bazel/llvm-bazel
> llvm-project/utils/bazel`.
Doesn't the last paragraph mean all benefits derived from this can be
described either as:
(1) users do not need to clone the llvm-bazel git repo but get the
files in llvm-project, or
(2) "interested contributors" could send patches to llvm-project
instead of llvm-bazel to update the bazel build?
TBH, I have no interest in using bazel nor anything against it being
merged per se. I just find it curious that we merge another build system
"at no cost" for the community (I think I picked that up in the thread
but I might have imagined the phrasing). I mean, there is always "a
cost"* so it boils down to determining if the benefit is worth it.
~ Johannes
* i.a., people will assume we (=the LLVM community) maintain(s) a bazel
build, which can certainly be a benefit but also a cost, e.g., when
the build is not properly maintained, support is scarce, etc. and
emails come in complaining about it (not thinking of prior examples
here.)
Wouldn't your argument hold for anything that "just lives" in the mono
repo but doesn't impact people? I mean, where is the line for stuff that
some contributors have "strong interest" in and others can't really
"hear a compelling argument for inclusion"? People raise concerns here
and from where I am sitting they are brushed over easily and more
aggressively as the thread progresses (up to the email I respond to).
>
>
>> I guess it would make the lives of google employees easier?
>>
> I explained before that Google internal integration flow is likely better
> without this at the moment, TensorFlow itself is also in a reasonably good
> spot at the moment. But Google is also not a monolithic place, some people
> are working on small independent projects that they are open-sourcing, and
> would like to be able to use LLVM.
>
>> Then what’s to stop every large org from committing their internal stuff
> to master?
>
>
>
> If their "internal stuff" is highly-coupled to LLVM, has zero-cost
> maintenance on the community, and is something that multiple other parties
> can benefit and established members of the community want to maintain and
> collaborate on, why not?
Let's be honest, nothing has "zero-cost". It seems unhelpful to pretend
it does. (FWIW, I explained a simple scenario that would make the bazel
inclusion "costly" in my previous mail.)
>
> I mentioned it before, but Bazel is not something internal or specific to
> Google: it isn't (actually there are many incompatibilities between Bazel
> and the internal system), 400 people attended the Bazel conference last
> year. I attended this conference 3 years ago when I was at Tesla trying to
> deploy Bazel internally. Many other companies are using Bazel, open-source
> projects as well. Feel free to watch the talks online about SpaceX
> <https://www.youtube.com/watch?v=t_3bckhV_YI> or Two Sigma and Uber
> <https://www.youtube.com/watch?v=_bPyEbAyC0s> for example
Let's not conflate "using bazel" and "benefit for LLVM", the former
is not up for debate here. (I mean, a lot of people use autoconf but
we got rid of it anyway).
That said, I think the original question is highly relevant. As I also
mentioned somewhere above, where do we draw the line is the key to this
RFC at the end of the day. A lot of the arguments I hear pro integration
apply to various other things that currently live out-of-tree, some of
which were proposed and not integrated. I think we should not dismiss
this easily, no matter on which side of the argument you are this time.
~ Johannes
I replied only selectively.
Long story short, I did not try to imply you were dishonest.
I'm saying that the sentence "has zero-cost maintenance on the community"
cannot be true in a general sense but only in a narrow one. I believe that
everything has cost. I added, "let's be honest", because the cost is not
obvious and one can easily overlook it. However, I assumed we all know
there has to be one as it would otherwise conflict with some universal
law or something. The way I see it you acknowledge the existence in a few
other places.
It broke, ppl complained, and nobody wanted to fix it. That is the
kind of technical debt (aka. cost) you can accumulate.
>
>> That said, I think the original question is highly relevant. As I also
>> mentioned somewhere above, where do we draw the line is the key to this
>> RFC at the end of the day. A lot of the arguments I hear pro integration
>> apply to various other things that currently live out-of-tree, some of
>> which were proposed and not integrated.
>
> Can you provide more concrete reference to these things that could have
> been integrated in similar "zero cost" fashion?
> I'm all for consistency, and the only point of comparison here is `gn`.
Let's say RV, in a subfolder not built by default. Or any other
project that was proposed for inclusion without being built by
default. (I remember also the discussion if we can/should add
isl to llvm, pre-mono repo.)
Sorry, the region vectorizer [0,1]. Came to mind because it is the last
thing I wished we had upstream so I could use it without forking under a
cmake flag.
[0] https://github.com/cdl-saarland/rv
[1] http://llvm.org/devmtg/2016-11/Slides/Moll-RV.pdf
>
>
>> Or any other
>> project that was proposed for inclusion without being build by
>> default. (I remember also the discussion if we can/should add
>> isl to llvm, pre-mono repo.)
>>
>
> I am not sure I agree that we can compare new "projects" (or something like
> ISL) with "utilities for LLVM users".
> I would expect a more comparable situation to me to be:
> - the gdb scripts in llvm/utils/gdb-scripts/prettyprinters.py
> - IDE visualizer in llvm/utils/LLVMVisualizers
> - The Visual Studio Code syntax highlighting for LLVM IR and TableGen in
> llvm/utils/vscode ; and similar for kate, jedit, vim, textmate, ...
> - the gn files in llvm/utils/gn
>
> The general theme here is that these are not "new projects" in themselves:
> they are highly coupled to LLVM itself and only allow a specific subset of
> users to plug their tool/workflow into LLVM at a given revision.
> Also all of these are "zero cost" in that they may be "broken" and
> maintained with best effort (I don't think we revert someone breaking any
> of the visualizer or syntax highlighter?). And none of these are really
> core to LLVM, and each could be in a separate repo where the interested
> parties could maintain it.
If I want to use isl, RV, project XYZ from an in-tree pass, you cannot
upstream it if the dependences are not upstream or properly hooked up.
Both things have been very hard to get into upstream llvm in the past.
I'm aware this is a build system we are talking about so it's a bit different
but conceptually we should have better guidelines for integration of code not
built by default, especially the code that is not planned to be enabled by
default any time soon.
Eric mentioned in a follow up that he is more inclined to accept such code, at
least that is what I read. I am actually as well, probably always was ;)
I have no problem with gn, bazel, ... but I want us to be similarly open to
other projects that are used by the community and benefit from integration
without burdening everyone.
~ Johannes
I *am* a Googler, though not directly involved with the teams that maintain the internal LLVM build. I happen to be a big fan of Bazel - and mostly build LLVM with the internal Bazel build, rather than the external CMake, because the better caching and remote-build-farm support is such an enormous help. (Also, I find the CMake build & build options kind of impenetrable.) However, I'm writing this particular email on my personal account, with personal resources, well past the close of business; my Google hat is firmly on the shelf, and I'm speaking as just an individual contributor.
When I first started contributing to LLVM, I was confused by the GN build's existence. I didn't understand who was supposed to maintain it, whether I should use it, what the benefits were... you name it.
I agree with some of the first comments on this thread. I'd suggest that we set aside the question of contributing Bazel BUILD files into the LLVM repository for now, and start by proposing a general policy around alternate/unsupported build systems in relation to the main repository. (GN can have an exception if needed.) The fact that the GN build is basically working, and doesn't confuse too many people, is a data point - but going from 1 alternate build system to 2 seems like a good point to pause and set an actual set of constraints and goals. Eventually, someone may want a third, and we should know what the guidelines are so we don't hash out the decision from scratch again!
I don't think I could draft the RFC in question - I don't have enough experience with the community yet to judge what's really needed - but I'd be glad to help out with it. The idea should be to minimize the cost (to nearly zero) for both experienced LLVM contributors and new LLVM contributors. A few requirements I'd suggest, mostly put together from this thread:
- CMake should be able to build (and test!) everything the alternate build system can, at all times.
- There must be a clear group who want to maintain the alternate build system.
- The alternate build system's files should be isolated in a separate directory, with a README explaining that this is an alternate build system for LLVM, maintained by its own smaller community - and is not supported by the community at large.
- The alternate build system must have independent buildbots, which do not email the larger community; people can opt into being emailed about this. (And should, if they're contributing to it!)
- If the buildbots are red for an extended time, we should put out a call for maintainers to fix the issues; if not answered in a reasonable time, we shouldn't be afraid to delete the alternate build system.
I do also see the argument for the git submodule approach. It looks like a .gitmodules file would theoretically let a repository of Bazel BUILD files specify exactly which LLVM commit it currently tracks - and you could fetch the corresponding updates in both with a single command. I think that addresses the main point I noticed brought up on this side of the argument. Any RFC here probably needs to present pros & cons of both approaches. We'll need to hash those out in general discussion before people start looking for consensus, so people understand what they're deciding on.
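For readers unfamiliar with the submodule approach mentioned above, here is a rough sketch of what the `.gitmodules` file in a standalone BUILD-file repository might look like. The submodule name and path are hypothetical, chosen only for illustration. Note that `.gitmodules` itself records only the path and URL; the exact LLVM commit being tracked is stored by git as a pinned entry in the containing repository's tree, and it changes whenever the submodule pointer is committed.

```ini
# Hypothetical .gitmodules for a repository that holds only Bazel BUILD files.
# The name and path "third_party/llvm-project" are assumptions for this sketch.
[submodule "third_party/llvm-project"]
	path = third_party/llvm-project
	url = https://github.com/llvm/llvm-project.git
```

With a layout like this, `git clone --recurse-submodules` on the BUILD-file repository would fetch both repositories at the pinned revision, and `git submodule update --init` would bring the LLVM checkout back in line after pulling new BUILD-file commits.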
On Thu, Oct 29, 2020 at 7:16 PM David Blaikie <dbla...@gmail.com> wrote:
> I expect most of it is probably a statement free of value judgments: Some other projects chose to use it/some folks have to use it for other reasons, clearly there's enough use that it's motivated folks to have/maintain Bazel builds for LLVM for years. Rather than judging their choices as bad/lesser/wrong - might be useful to accept that some folks had their reasons and they're trying to make the most of the situation. I don't think anyone's making an argument that LLVM should switch to Bazel/that that would be better than the CMake we're using, and I think it's helpful to return the favor and not suggest that other projects would be better off switching to CMake over Bazel - they no doubt have their reasons.
Please do not manufacture statements that I did not make. I never
suggested, or stated, anywhere, that some other imaginary project
using Bazel should switch to CMake.
I did state that I do not find Bazel to be a better alternative to
CMake. My statement is based on direct experience with both.
If the intent behind Bazel is not to present it as a better
alternative to CMake, then what is the intent?
Instead of maintaining
this impenetrable mystery as to why a Bazel build system should be
included in LLVM, please take the time to advocate for Bazel with
technical facts, rather than "someone at Google really likes it".
Just because someone likes and maintains an alternative build system
for LLVM, somewhere, that does not automatically mean, or imply that
it should be upstreamed.
For all I know, someone might be building their fork of LLVM with
autoconf. I am sure they have their own very good reasons for doing
so. Should we, therefore, bring back autoconf?
Some good remarks Stella.
What does second tier mean?
There are additional directories in the LLVM download such as flang, compiler-rt, openmp, but these do not seem to be second-tier though there may be a sense in which they are.
Is the idea of second-tier that there will be additional directories or programs embedded in the existing LLVM directories not available for use to those without bazel? If that is the case, then what is the relevance of those contributions?
It seems we are saying that if a contribution is relevant then either it is in the cmake build, making bazel superfluous to obtain a build, or it is in a bazel-only build. A cmake build would be required for the parts we have now and then an additional bazel build for the second-tier parts.
There is talk of gn. I am not seeing gn installed here but am not aware it is required. Is it the case that whatever gn does, cmake does, or is it the case there is a necessary gn build sequence in LLVM somewhere?
Neil Nelson
It’s not a practical experience that makes me think it was a bad idea. It’s the principle that there are files in the project repository that the community is not responsible for. When a new target is added to the project, we all assume some degree of responsibility for it, even if we never use it. This is not the case for the alternative build systems. For those, we are simply renting out storage space in our repo, so to speak. Any tangible consequences may take time to appear, but by then they may be difficult to deal with.
Finally, the question shouldn’t be whether it’s causing ongoing difficulties, but whether we want to make it a part of the project.
As far as prior-art here, don’t we have a bunch of MSVC and GDB debug ‘pretty print scripts’ in the repo as well? Both of those are not particularly well maintained and only serve the handful of people who maintain them.
The question to ask may be: if they happened to be broken, would we, as the project community, accept bug reports about them?
Hi Geoffrey,
I think you've received some good feedback on this thread, but it also
doesn't look like continuing discussion on this thread is going to lead
to a consensus. If you are still interested in making this change, I
think it would be best to escalate this proposal into a "proposal pitch"
described here[1].
There is also a template here[2] you can use.
[1] https://github.com/llvm/llvm-www/blob/master/proposals/LP0001-LLVMDecisionMaking.md
[2] https://github.com/llvm/llvm-www/blob/master/proposals/LP0000-Template.md
-Tom
> I think you've received some good feedback on this thread, but it also
> doesn't look like continuing discussion on this thread is going to lead
> to a consensus.
Michael
On Fri, Oct 30, 2020 at 08:49, Keane, Erich via llvm-dev
<llvm...@lists.llvm.org> wrote:
I agree with Tom. This seems like a natural time to use the decision making process, and I encourage you to do so. Some random points I’ve taken away from the threads:
1) I don’t think this discussion has anything to do with the technical merits of bazel over cmake. There is empirically a community of people who would benefit from this, and there is no proposal to replace cmake with bazel. This is your “We'll spend more time contributing to LLVM instead :-D” point. I would de-emphasize the technical points, because you’re not actually evangelizing the technology here, you’re making a practical pitch.
2) I don’t think the comparison to GN is very important. It is prior art, but doesn’t mean that it is necessarily correlated to this decision. However, it doesn’t seem to me that GN has been a problem in practice for the community, so perhaps there is something to learn from that. For example, the llvm/utils/GN/README.rst file clearly labels GN support as “experimental and best effort”. GN also seems maintained, and even has a “LLVM GN Syncbot” that is doing stuff.
3) I think there is an important question (independent of Bazel) of “how do we foster new things” and “how do we support things that are empirically important to the community”? We don’t want to burden all contributors to help out a few of them, but keeping something “in tree” and marked experimental doesn’t seem like a burden.
4) The big question is: Should this be in the mono-repo, a separate llvm incubator project, or none of these? What should the terms of support be, etc?
If you frame this carefully, I think we can make a decision quickly on this and move on. From my personal perspective, it seems pretty obvious that we should take this, we just need to get the terms clear. For example, if I check in a cmake change and break a bazel builder, I shouldn’t be on the hook to fix it, and arguably shouldn’t even get emailed about it.
-Chris
> For example, if I check in a cmake change and break a bazel builder, I shouldn’t be on the hook to fix it, and arguably shouldn’t even get emailed about it.
Just don't change "should not" to "must not" ("by default must not" is
fine).
I can think of the scenario where a generic Bazel change breaks a corner
case Cmake, and it's reasonable for the Bazel guys to ask for help from
the Cmake crew. That said, that scenario is probably a bug in Cmake ... :-)
Cheers,
Wol
Hi,
Doesn't an email like this maybe point to another problem that, if solved, would make situations like this not a problem at all.
One would have 2 git repos. The main one and an overlay one. Much like overlay filesystems work.
They would be linked in such a way that the developer would not need to manually add dependencies but the git repos would be linked (linked at the commit hash tree level) so that if one did a checkout from one, the correct matching version of the other repo would also be checked out.
Then the main repo would just need a "readme", saying: To build with Bazel add this overlay repo.
But on the topic of having multiple build systems for LLVM.
If I was to wish to upstream a commit to LLVM, I would:
1) Want to know which build system that commit should work with.
2) Not have to make sure my commit worked with both build systems.
I worked on one project that had two build systems, and most of the time, only one or the other build would actually work, because committers only fixed the build system they actually used.