Various JDKs and JVMs in Bazel


Lukács T. Berki

Aug 23, 2018, 3:40:50 AM
to bazel-...@googlegroups.com, Jakob Buchgraber, Irina Iancu, Liam Miller-Cushon, Nicolas Lopez
Hey there,

In the wake of the JDK breakage (#5888, #5744, #5741, #5766) in Bazel 0.16, let's come up with a principled plan of what JDKs [1] to use in which situations.

We have the following JDKs:
  1. @embedded_jdk, the one embedded in Bazel
  2. @local_jdk, the auto-detected one (that is, whatever is installed locally)
  3. Any other JDKs made known to Bazel
And the following use cases:
  1. Running Bazel itself. This does not require much; in particular, the classes comprising javac are not needed
  2. Running whatever tools are run during the build (the host JDK)
  3. Running JavaBuilder. This is a special case of (2) and differs from the general case in that JavaBuilder is coupled to the major version of the JDK. We have an alternative one (VanillaJavaBuilder), but that's a bit less smart (e.g. no Error Prone or strict deps)
  4. Running the tests and binaries built by Bazel (the target JDK)
We should also make sure that this works as well as possible with remote execution. This means relying on @local_jdk as little as possible because that's only guaranteed to be present on the machine Bazel runs on.

My proposal:
  1. Bazel is run under the embedded JDK and the embedded JDK is not used for anything else
  2. The target and host javabases default to @local_jdk. This means that if you want to compile Java, you'll have to have a JDK installed.
  3. We use the regular JavaBuilder if that JDK matches whatever JDK it expects, otherwise, we use VanillaJavaBuilder. If you want to make use of all the nifty features of JavaBuilder, the local JDK has to be of the right major version.
  4. We discourage (either by documentation or by some sort of enforcement) using @local_jdk for anything other than --host_javabase and --javabase.
  5. If you want to use remote execution, you need to override --javabase and --host_javabase with something that's not @local_jdk
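
To make the five points concrete, here is a minimal sketch of the selection rules in Python. It only models the proposal as worded above; the function and enum names are hypothetical and nothing here corresponds to actual Bazel code.

```python
from enum import Enum


class UseCase(Enum):
    BAZEL_SERVER = "running Bazel itself"
    HOST_TOOLS = "tools run during the build (--host_javabase)"
    TARGET = "tests and binaries built by Bazel (--javabase)"


def default_jdk(use_case, remote_execution=False):
    """Return the JDK repository the proposal would pick by default."""
    if use_case is UseCase.BAZEL_SERVER:
        # Point 1: the embedded JDK runs Bazel and nothing else.
        return "@embedded_jdk"
    if remote_execution:
        # Point 5: @local_jdk only exists on the machine Bazel runs on,
        # so the user has to supply an explicit override.
        raise ValueError("override --javabase and --host_javabase for remote execution")
    # Point 2: host and target javabases default to the local JDK.
    return "@local_jdk"


def java_builder_for(host_jdk_major, javabuilder_major):
    """Point 3: the full JavaBuilder is used only when major versions match."""
    if host_jdk_major == javabuilder_major:
        return "JavaBuilder"
    return "VanillaJavaBuilder"
```

For example, `java_builder_for(10, 9)` falls back to `"VanillaJavaBuilder"`, which is the case where Error Prone and strict deps would not be available.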
How does this sound?

1: I'll use the term "JDK" to also mean "JVM" below for simplicity.

--
Lukács T. Berki | Software Engineer | lbe...@google.com | 

Google Germany GmbH | Erika-Mann-Str. 33  | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891

Jakob Buchgraber

Aug 23, 2018, 5:32:35 AM
to Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Liam Miller-Cushon, Nicolas Lopez
I support your proposal, with one exception:

On Thu, Aug 23, 2018 at 9:40 AM Lukács T. Berki <lbe...@google.com> wrote:
  1. We use the regular JavaBuilder if that JDK matches whatever JDK it expects, otherwise, we use VanillaJavaBuilder. If you want to make use of all the nifty features of JavaBuilder, the local JDK has to be of the right major version.
I would have been able to get behind this statement before the OpenJDK project switched to a 6-month
release schedule. I think given this new reality a Bazel release should also fully support (with a proper
JavaBuilder) the Java releases of the previous 12-24 months.

I thus propose that Bazel should fetch the right JavaBuilder from a remote repository depending on the
JDK version of the host_javabase. This would additionally be a very positive contribution to our efforts
of reducing the Bazel binary size, where we are trying to move all the embedded tools to a remote
repository.
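
As a rough illustration of what this would involve, the sketch below computes which JDK major versions fall into the support window and picks a JavaBuilder archive for the host JDK. It assumes an idealized, strict six-month release cadence and counts the window as window/cadence releases; the archive naming scheme is made up for the example and does not refer to any real download.

```python
def supported_jdk_majors(latest_major, window_months, cadence_months=6):
    """Majors covered by the support window, assuming one release every
    `cadence_months` (an idealization of the OpenJDK schedule)."""
    count = max(1, window_months // cadence_months)
    return list(range(latest_major - count + 1, latest_major + 1))


def javabuilder_archive(host_jdk_major, supported):
    """Pick a per-version JavaBuilder archive; the file name is hypothetical."""
    if host_jdk_major not in supported:
        raise LookupError(f"no JavaBuilder published for JDK {host_jdk_major}")
    return f"javabuilder_jdk{host_jdk_major}.zip"
```

Under these assumptions a 24-month window means four JavaBuilder variants per Bazel release, and a 12-month window means two.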

Best,
Jakob

Ian O'Connell

Aug 23, 2018, 8:47:34 AM
to Jakob Buchgraber, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Liam Miller-Cushon, Nicolas Lopez
The proposal sounds good to me; hopefully it makes it easier to understand which JDK will be used when.

One request would be that, as part of removing the embedded JDK from being used in rules, we see if we can have a replacement set of hermetic JDK rules. local_jdk isn't always stable (hash code/files) across machines -- on OS X at least we've run into this with JDK 8 installs. This will break remote caching when it comes up -- https://github.com/bazelbuild/bazel/issues/4769

--
You received this message because you are subscribed to the Google Groups "Bazel/JVM Special Interest Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-sig-jvm+unsubscribe@googlegroups.com.
To post to this group, send email to bazel-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-sig-jvm/CAGQ4vn2oXBL%3DM48%3Dno0xX8r9MnqfdiNjJWpUGWGL%2BowTSN0EYw%40mail.gmail.com.

For more options, visit https://groups.google.com/d/optout.

Liam Miller-Cushon

Aug 23, 2018, 10:59:22 AM
to ia...@stripe.com, Jakob Buchgraber, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Thu, Aug 23, 2018 at 2:32 AM Jakob Buchgraber <buc...@google.com> wrote:
On Thu, Aug 23, 2018 at 9:40 AM Lukács T. Berki <lbe...@google.com> wrote:
  1. We use the regular JavaBuilder if that JDK matches whatever JDK it expects, otherwise, we use VanillaJavaBuilder. If you want to make use of all the nifty features of JavaBuilder, the local JDK has to be of the right major version.

Is this configuration automatic, or would you have to explicitly configure a different --{host_,}java_toolchain?

I do not recommend making it automatic: VJB and regular JavaBuilder have significantly different features. It would be surprising if upgrading the host_javabase version magically enabled e.g. strict deps.
 
I would have been able to get behind this statement before the OpenJDK project switched to a 6-month
release schedule. I think given this new reality a Bazel release should also fully support (with a proper
JavaBuilder) the Java releases of the previous 12-24 months.

Does "proper" mean non-VanillaJavaBuilder, or something else?

On Thu, Aug 23, 2018 at 5:47 AM Ian O'Connell <ia...@stripe.com> wrote:
One request would be that, as part of removing the embedded JDK from being used in rules, we see if we can have a replacement set of hermetic JDK rules. local_jdk isn't always stable (hash code/files) across machines -- on OS X at least we've run into this with JDK 8 installs. This will break remote caching when it comes up -- https://github.com/bazelbuild/bazel/issues/4769

One option would be to fetch the --host_javabase or --javabase from a remote repo, which would make it hermetic and wouldn't increase the Bazel distribution size.

Jakob Buchgraber

Aug 23, 2018, 11:48:55 AM
to Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Thu, Aug 23, 2018 at 4:59 PM Liam Miller-Cushon <cus...@google.com> wrote:
Does "proper" mean non-VanillaJavaBuilder, or something else?

Yes, JavaBuilder. I think ideally we would not have the VanillaJavaBuilder anymore.
 
One option would be to fetch the --host_javabase or --javabase from a remote repo, which would make it hermetic and wouldn't increase the Bazel distribution size.

Agreed that it would be nice, but I don't think that this should be the default. I think users of remote caching / execution
should simply be able to add the JDK of their choice as a remote repository that they host themselves. I don't think Bazel
should be in the business of providing JDKs to users. This should work right now already (correct me if I am wrong Liam)
and might just need some documentation.

Kevin Bierhoff

Aug 23, 2018, 12:03:20 PM
to Jakob Buchgraber, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
Maybe this is obvious to everyone else, but why not use the @embedded_jdk for JavaBuilder by default?  Seems like that would resolve a lot of problems mentioned here.  I think we'd also rather avoid maintaining 4 versions of JavaBuilder, to support "recent" Java versions.

+1 on not silently switching between VanillaJavaBuilder and JavaBuilder.



--
Kevin Bierhoff
Google

Liam Miller-Cushon

Aug 23, 2018, 12:25:23 PM
to Kevin Bierhoff, Jakob Buchgraber, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Thu, Aug 23, 2018 at 8:48 AM Jakob Buchgraber <buc...@google.com> wrote:
On Thu, Aug 23, 2018 at 4:59 PM Liam Miller-Cushon <cus...@google.com> wrote:
Does "proper" mean non-VanillaJavaBuilder, or something else?

Yes, JavaBuilder. I think ideally we would not have the VanillaJavaBuilder anymore.

I agree it would be nice :)

In practice it would require maintaining release branches of non-VanillaJavaBuilder that were compatible with previous JDK versions, so VJB is necessary in the near-term until someone has time to do that.
 
One option would be to fetch the --host_javabase or --javabase from a remote repo, which would make it hermetic and wouldn't increase the Bazel distribution size.

Agreed that it would be nice, but I don't think that this should be the default. I think users of remote caching / execution
should simply be able to add the JDK of their choice as a remote repository that they host themselves. I don't think Bazel
should be in the business of providing JDKs to users. This should work right now already (correct me if I am wrong Liam)
and might just need some documentation.

It sounds like it's already possible, I just wasn't that familiar with remote repos. I agree re: providing JDKs, but it might be helpful to provide the BUILD files and remote repo configuration to use e.g. zulu.

On Thu, Aug 23, 2018 at 9:03 AM Kevin Bierhoff <k...@google.com> wrote:
Maybe this is obvious to everyone else, but why not use the @embedded_jdk for JavaBuilder by default?  Seems like that would resolve a lot of problems mentioned here.  I think we'd also rather avoid maintaining 4 versions of JavaBuilder, to support "recent" Java versions.

If the embedded JDK is used only as a server_javabase and not a host_javabase, then it can be a minimal image that only contains the modules Bazel needs, which will make the distribution size significantly smaller.

Jakob Buchgraber

Aug 23, 2018, 2:04:36 PM
to Liam Miller-Cushon, Kevin Bierhoff, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez

On Thu, Aug 23, 2018 at 6:25 PM Liam Miller-Cushon <cus...@google.com> wrote:

It sounds like it's already possible, I just wasn't that familiar with remote repos. I agree re: providing JDKs, but it might be helpful to provide the BUILD files and remote repo configuration to use e.g. zulu.

Providing a tutorial on how to do that sounds good to me.

On Thu, Aug 23, 2018 at 9:03 AM Kevin Bierhoff <k...@google.com> wrote:
Maybe this is obvious to everyone else, but why not use the @embedded_jdk for JavaBuilder by default?  Seems like that would resolve a lot of problems mentioned here.  I think we'd also rather avoid maintaining 4 versions of JavaBuilder, to support "recent" Java versions.

If the embedded JDK is used only as a server_javabase and not a host_javabase, then it can be a minimal image that only contains the modules Bazel needs, which will make the distribution size significantly smaller.

Correct. I may add that any other JVM-based tools (e.g. scalac) also run on the host javabase, so we can't make any assumptions there.

Kevin Bierhoff

Aug 23, 2018, 2:22:31 PM
to Jakob Buchgraber, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
I still don't think I'm following.  I'm not suggesting embedded_jdk==host_jdk.  I'm suggesting using embedded_jdk for JavaBuilder in particular, by default.  I understand that JavaBuilder may require additional modules, but the set of those modules needed for JavaBuilder in particular would be known and could be included.  Do we know how much larger embedded_jdk would have to be to accommodate JavaBuilder?  Can we count that low?



Jakob Buchgraber

Aug 23, 2018, 2:43:24 PM
to Kevin Bierhoff, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Thu, Aug 23, 2018 at 8:22 PM Kevin Bierhoff <k...@google.com> wrote:
I still don't think I'm following.  I'm not suggesting embedded_jdk==host_jdk.  I'm suggesting using embedded_jdk for JavaBuilder in particular, by default.  I understand that JavaBuilder may require additional modules, but the set of those modules needed for JavaBuilder in particular would be known and could be included.  Do we know how much larger embedded_jdk would have to be to accommodate JavaBuilder?  Can we count that low?

It's about 6 MiB the last time I checked (the jdk.compiler module). It's my understanding that additionally any annotation
processors would also run on the JDK used by JavaBuilder and those processors again could use arbitrary jdk classes...

Nicolas Lopez

Aug 23, 2018, 3:03:26 PM
to Jakob Buchgraber, Kevin Bierhoff, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu
Chiming in late to the conversation.

From the perspective of remote execution all that is said here so far should work (particularly requiring setting host and target javabase to valid paths in remote exec env).

wrt zulu configs: I do think that it's useful to provide example workspace rules for how to get a valid JDK on demand if a user does not want to install one.

wrt remote caching: the local_jdk is a problem for remote caching (not just mac). And I don't think it would be wise to ignore files when computing the cache as is mentioned in https://github.com/bazelbuild/bazel/issues/4769 (i.e., it could lead to cache poisoning if any of the ignored files actually has an impact on build outputs). I think the only solution for proper remote caching is to have fully hermetic builds. For java, this would likely mean the best results (wrt caching) will only be obtained if projects download a jdk in a workspace rule (i.e., the comment above about zulu configs) and use that instead of local_jdk via javabase flags.
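
The cache-poisoning risk can be shown with a small Python sketch. This is not how Bazel computes its cache keys; it is a simplified model in which the key is a hash over all input files, and the file paths and contents are invented for the example.

```python
import hashlib


def action_cache_key(inputs, ignored=()):
    """Hash the (path, content) pairs of an action's inputs into one key.

    `inputs` maps a path to its bytes; paths listed in `ignored` are left
    out of the key, which models the shortcut discussed in the issue.
    """
    digest = hashlib.sha256()
    for path in sorted(inputs):
        if path in ignored:
            continue
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(hashlib.sha256(inputs[path]).digest())
    return digest.hexdigest()


# Two machines whose local JDKs differ in a single file (hypothetical paths).
machine_a = {"jdk/bin/javac": b"javac-build-1", "jdk/lib/tzdb.dat": b"2018d"}
machine_b = {"jdk/bin/javac": b"javac-build-1", "jdk/lib/tzdb.dat": b"2018e"}

# Hashing everything: the keys differ, so the machines never share results.
strict_a = action_cache_key(machine_a)
strict_b = action_cache_key(machine_b)

# Ignoring the differing file: the keys collide, so machine B would be served
# machine A's outputs even if that file influenced them.
loose_a = action_cache_key(machine_a, ignored={"jdk/lib/tzdb.dat"})
loose_b = action_cache_key(machine_b, ignored={"jdk/lib/tzdb.dat"})
```

With a JDK fetched by a workspace rule, both machines would see byte-identical inputs, so the strict keys would match without ignoring anything.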

Jakob Buchgraber

Aug 23, 2018, 3:07:03 PM
to Nicolas Lopez, Kevin Bierhoff, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu
On Thu, Aug 23, 2018 at 9:03 PM Nicolas Lopez <ngir...@google.com> wrote:
wrt remote caching: the local_jdk is a problem for remote caching (not just mac). And I don't think it would be wise to ignore files when computing the cache as is mentioned in https://github.com/bazelbuild/bazel/issues/4769 (i.e., it could lead to cache poisoning if any of the ignored files actually has an impact on build outputs). I think the only solution for proper remote caching is to have fully hermetic builds. For java, this would likely mean the best results (wrt caching) will only be obtained if projects download a jdk in a workspace rule (i.e., the comment above about zulu configs) and use that instead of local_jdk via javabase flags.

Agreed. I think this is again purely a documentation issue though. We need to make it clear in our documentation that for remote caching one needs to check in the JDK and then explain how to do that.

Kevin Bierhoff

Aug 23, 2018, 4:50:34 PM
to Jakob Buchgraber, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
6 MiB seems worth it, no?  I see the point about annotation processors, but I would also expect most common annotation processors not to go outside the JDK modules you'll need to include in @embedded_jdk anyway.  If someone does want to use an annotation processor that needs other JDK modules, then they would have to use @local_jdk or another JDK, but it seems good enough to just document that.  (Annotation processors also have a regular classpath where they can get access to stuff outside the JDK; that's a lot more common, I think, and should work fine.)

So this all doesn't seem to prevent using @embedded_jdk for JavaBuilder by default.  And it seems well worth it to me considering that that way, everyone gets the non-vanilla JavaBuilder by default.  

ittai zeidman

Aug 23, 2018, 5:08:08 PM
to Kevin Bierhoff, Jakob Buchgraber, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
3 thoughts:
1. IIRC the current docs (via bazel-toolchains) actually push users of RBE to use a locally pre-installed Java and not a checked-in one.
2. In your suggestion, do you mean that @local_jdk will be reserved (by intention at least) only as the default for host-javabase and javabase?
3. If the answer to 2 is yes then can you give an example of what rules (for example rules_scala) need to use? I think we use @local_jdk currently (haven’t verified)

Nicolas Lopez

Aug 23, 2018, 7:00:18 PM
to ittai zeidman, Kevin Bierhoff, Jakob Buchgraber, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu
On Thu, Aug 23, 2018 at 5:08 PM ittai zeidman <itt...@gmail.com> wrote:
3 thoughts:
1. IIRC the current docs (via bazel-toolchains) actually pushes users of RBE to use a locally pre installed java and not a checked in one.
 
We suggest using a pre-installed Java, as that avoids having to transfer the JDK as an additional input, but using a checked-in one also works (as long as the checked-in JDK binaries are compatible with the execution environment).

2. In your suggestion do you mean that @local_jdk will be reserved (by intention at least) only as the default to which host-javabase and javabase defaults to?

yes, iiuc
 
3. If the answer to 2 is yes then can you give an example of what rules (for example rules_scala) need to use? I think we use @local_jdk currently (haven’t verified)
 
I'm pretty sure rules_scala has deps on the @local_jdk. I have plans to prepare docs with detailed advice as to how to do this. In the meantime, I have some concrete examples that you can use as reference:
When you do decide to start removing local_jdk deps please loop me in on PRs so I can help if possible (and also compile more examples from the experience!)

Lukács T. Berki

Aug 24, 2018, 3:22:20 AM
to Kevin Bierhoff, Jakob Buchgraber, Liam Miller-Cushon, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
This would mean introducing Yet Another Javabase Concept, bringing the number of knobs that control what sort of JDK to use when to four (server, host, target, JavaBuilder). It would also mean surprises because then we'd be running annotation processors on a weird JDK. It would also make it more difficult to deal with remote execution, because then we'd need to have a way to get the embedded JDK to run on the remote workers, which is especially interesting if they have a different operating system.

I agree that we'd ideally not have VanillaJavaBuilder, but since that requires maintaining multiple JavaBuilder versions, I think VanillaJavaBuilder is a nice stopgap. I hope that maintaining multiple JavaBuilders will not be a lot of work once we figure out who will do it. And there is a pretty easy migration path from "VanillaJavaBuilder is required for all JDK major versions that are not the one JavaBuilder supports" to "Bazel selects the right JavaBuilder".

As for what JDK to use for the host and target javabases, I think the least surprising default choice is @local_jdk, because that's what all other build tools use. It's also conceptually simple to say "if you want to compile Java, you need a JDK, and Bazel tries to find it, but if it cannot, here's how you tell it where it is".

In a way, we *must* be in the business of distributing JDKs because the remote execution workers must have a JDK somehow. If I understand Nick correctly, he's saying that it's currently distributed with the machine images, but it's still under our control. I think eventually we want to have an easy way to use a hermetic JDK downloaded from somewhere, but that's not very urgent, because local builds are covered by @local_jdk and the ability to point Bazel to a local JDK at an arbitrary location and remote builds are covered by the JDK installed on the workers.

What we do need, however, is java_toolchain / java_runtime rules that describe the JDK on the remote workers, but we already have that. Fortuitously, it's a very good place to distribute hermetic JDK repositories if we ever make that leap.



Jakob Buchgraber

Aug 24, 2018, 4:24:45 AM
to Kevin Bierhoff, Liam Miller-Cushon, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Thu, Aug 23, 2018 at 10:50 PM Kevin Bierhoff <k...@google.com> wrote:
6 MiB seems worth it, no?  I see the point about annotation processors, but I would also expect most common annotation processors not to go outside the JDK modules you'll need to include in @embedded_jdk anyway.

As tempting as it is, I think we should not introduce another javabase concept just for the JavaBuilder. We need a solution that works for any JVM based tools
and rules.

Jakob Buchgraber

Aug 24, 2018, 4:34:19 AM
to Lukács T. Berki, Kevin Bierhoff, Liam Miller-Cushon, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Fri, Aug 24, 2018 at 9:22 AM Lukács T. Berki <lbe...@google.com> wrote:
As for what JDK to use for the host and target javabases, I think the least surprising default choice is @local_jdk, because that's what all other build tools use. It's also conceptually simple to say "if you want to compile Java, you need a JDK, and Bazel tries to find it, but if it cannot, here's how you tell it where it is".

+1

In a way, we *must* be in the business of distributing JDKs because the remote execution workers must have a JDK somehow. If I understand Nick correctly, he's saying that it's currently distributed with the machine images, but it's still under our control. I think eventually we want to have an easy way to use a hermetic JDK downloaded from somewhere, but that's not very urgent, because local builds are covered by @local_jdk and the ability to point Bazel to a local JDK at an arbitrary location and remote builds are covered by the JDK installed on the workers.

I'd argue that it should not be on the Bazel team to provide and maintain a JDK for remote execution. Ultimately people will want and should be able to bring
their own JDK for remote execution just as they do for local execution. It should be on the remote execution system to make recommendations on how best
to do that / where to get it from.

Lukács T. Berki

Aug 24, 2018, 4:39:11 AM
to Jakob Buchgraber, Kevin Bierhoff, Liam Miller-Cushon, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
In general, I think it's on Googlers to maintain these things. Which particular set of Googlers, we can argue, but let's first establish that this is a necessary piece of infrastructure. More concretely, that's what https://github.com/bazelbuild/bazel-toolchains should be, right? 

Nicolas Lopez

Aug 24, 2018, 11:16:06 AM
to Lukács T. Berki, Jakob Buchgraber, Kevin Bierhoff, Liam Miller-Cushon, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu
On Fri, Aug 24, 2018 at 4:39 AM Lukács T. Berki <lbe...@google.com> wrote:
On Fri, Aug 24, 2018 at 10:34 AM, Jakob Buchgraber <buc...@google.com> wrote:
On Fri, Aug 24, 2018 at 9:22 AM Lukács T. Berki <lbe...@google.com> wrote:
As for what JDK to use for the host and target javabases, I think the least surprising default choice is @local_jdk, because that's what all other build tools use. It's also conceptually simple to say "if you want to compile Java, you need a JDK, and Bazel tries to find it, but if it cannot, here's how you tell it where it is".

+1

In a way, we *must* be in the business of distributing JDKs because the remote execution workers must have a JDK somehow. If I understand Nick correctly, he's saying that it's currently distributed with the machine images, but it's still under our control. I think eventually we want to have an easy way to use a hermetic JDK downloaded from somewhere, but that's not very urgent, because local builds are covered by @local_jdk and the ability to point Bazel to a local JDK at an arbitrary location and remote builds are covered by the JDK installed on the workers.

I'd argue that it should not be on the Bazel team to provide and maintain a JDK for remote execution. Ultimately people will want and should be able to bring
their own JDK for remote execution just as they do for local execution. It should be on the remote execution system to make recommendations on how best
to do that / where to get it from.
In general, I think it's on Googlers to maintain these things. Which particular set of Googlers, we can argue, but let's first establish that this is a necessary piece of infrastructure. More concretely, that's what https://github.com/bazelbuild/bazel-toolchains should be, right? 

Yes, this is exactly what https://github.com/bazelbuild/bazel-toolchains is for.

We (remote exec toolchains team) have been making sure that for remote execution we have a valid JDK, well supported by Google, bundled in our toolchain containers. So far, we had been using the same OpenJDK as distributed in the cloud marketplace images (http://gcr.io/cloud-marketplace/google/openjdk8). However, we are now migrating to use the exact same JDK in our containers as the one distributed with Bazel (https://mirror.bazel.build/openjdk/index.html). As long as we can keep counting on this mirror existing (and on it having a full JDK that is compatible with Bazel both as host and target javabase), there should be no problem at all for remote execution. It might be worth noting that remote exec users could also provide their own JDK if they want, but verifying and maintaining compatibility for those JDK versions would be on them.


Kevin Bierhoff

Aug 24, 2018, 2:07:34 PM
to Lukács T. Berki, Jakob Buchgraber, Liam Miller-Cushon, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Fri, Aug 24, 2018 at 12:22 AM Lukács T. Berki <lbe...@google.com> wrote:


On Thu, Aug 23, 2018 at 10:50 PM, Kevin Bierhoff <k...@google.com> wrote:
On Thu, Aug 23, 2018 at 11:43 AM Jakob Buchgraber <buc...@google.com> wrote:
On Thu, Aug 23, 2018 at 8:22 PM Kevin Bierhoff <k...@google.com> wrote:
I still don't think I'm following.  I'm not suggesting embedded_jdk==host_jdk.  I'm suggesting using embedded_jdk for JavaBuilder in particular, by default.  I understand that JavaBuilder may require additional modules, but the set of those modules needed for JavaBuilder in particular would be known and could be included.  Do we know how much larger embedded_jdk would have to be to accommodate JavaBuilder?  Can we count that low?

It's about 6 MiB the last time I checked (the jdk.compiler module). It's my understanding that additionally any annotation
processors would also run on the JDK used by JavaBuilder and those processors again could use arbitrary jdk classes...

6 MiB seems worth it, no?  I see the point about annotation processors, but I would also expect most common annotation processors not to go outside the JDK modules you'll need to include in @embedded_jdk anyway.  If someone does want to use an annotation processor that needs other JDK modules, then they would have to use @local_jdk or another JDK, but it seems good enough to just document that.  (Annotation processors also have a regular classpath where they can get access to stuff outside the JDK; that's a lot more common, I think, and should work fine.)

So this all doesn't seem to prevent using @embedded_jdk for JavaBuilder by default.  And it seems well worth it to me considering that that way, everyone gets the non-vanilla JavaBuilder by default.  
This would mean introducing Yet Another Javabase Concept, bringing the number of knobs that control what sort of JDK to use when to four (server, host, target, JavaBuilder). It would also mean surprises because then we'd be running annotation processors on a weird JDK. It would also make it more difficult to deal with remote execution, because then we'd need to have a way to get the embedded JDK to run on the remote workers, which is especially interesting if they have a different operating system.

Remote workers seem to already use flags etc. to find JDKs and various other things in non-standard places, so it seems like that has to be done to run JavaBuilder in remote workers regardless of what JDK they use.  Again, the annotation processor issue is IMHO a red herring, as it will affect few if any users (and those users can always direct Bazel to use @local_jdk or something else).  If you want to distinguish a @javabuilder_jdk that's @embedded_jdk by default, instead of referencing @embedded_jdk directly for JavaBuilder, that's fine with me; again, none of these seem like big issues to me.
 

I agree that we'd ideally not have VanillaJavaBuilder, but since that requires maintaining multiple JavaBuilder versions, I think VanillaJavaBuilder is a nice stopgap. I hope that maintaining multiple JavaBuilders will not be a lot of work once we figure out who will do it. And there is a pretty easy migration path from "VanillaJavaBuilder is required for all JDK major versions that are not the one JavaBuilder supports" to "Bazel selects the right JavaBuilder".

VanillaJavaBuilder is a nice stopgap, but it's far inferior to "proper" JavaBuilder, so I still maintain Bazel should make every effort to use JavaBuilder instead of VanillaJavaBuilder.  Also note that VanillaJavaBuilder to a first approximation allows more code to compile than JavaBuilder, due to the latter's strict_deps and Error Prone enforcement.  That seems to imply that changes down the line that make new versions of Bazel use JavaBuilder where VJB was previously used would run the risk of breaking existing builds.  So it seems to me it's well worth making sure VJB is used in as few cases as possible and, not to sound like a broken record, giving users the benefits of JavaBuilder.
 

As for what JDK to use for the host and target javabases, I think the least surprising default choice is @local_jdk, because that's what all other build tools use. It's also conceptually simple to say "if you want to compile Java, you need a JDK, and Bazel tries to find it, but if it cannot, here's how you tell it where it is".

In a way, we *must* be in the business of distributing JDKs because the remote execution workers must have a JDK somehow. If I understand Nick correctly, he's saying that it's currently distributed with the machine images, but it's still under our control. I think eventually we want to have an easy way to use a hermetic JDK downloaded from somewhere, but that's not very urgent, because local builds are covered by @local_jdk and the ability to point Bazel to a local JDK at an arbitrary location and remote builds are covered by the JDK installed on the workers.

What we do need, however, is java_toolchain / java_runtime rules that describe the JDK on the remote workers, but we already have that. Fortuitously, it's a very good place to distribute hermetic JDK repositories if we ever make that leap.





--
Kevin Bierhoff
Google

Liam Miller-Cushon

Aug 24, 2018, 2:28:50 PM
to Kevin Bierhoff, Lukács T. Berki, Jakob Buchgraber, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Fri, Aug 24, 2018 at 11:07 AM Kevin Bierhoff <k...@google.com> wrote:
Again, the annotation processor issue is IMHO a red herring, as it will affect few if any users (and those users can always direct Bazel to use @local_jdk or something else).

I'm curious what examples we have of host tools that need modules that we don't want to include in the embedded JDK? If I'm understanding, the textbook example of this would be something like an annotation processor that needed corba, which doesn't seem extremely common.

Re: the 6 MB tax from jdk.compiler, we're already distributing that module separately from the host_jdk, so leaving it out of the embedded JDK would not be a problem: https://github.com/bazelbuild/bazel/blob/master/third_party/java/jdk/langtools/jdk_compiler.jar

I think it's worth re-considering having the default host_javabase be a somewhat minimal JDK. There may be a sweet spot where the binary size is still acceptably small and it has 99% of what all host tools need, and the special-cases can set an explicit --host_javabase.
 
VanillaJavaBuilder is a nice stopgap, but it's far inferior to "proper" JavaBuilder, so I still maintain Bazel should make every effort to use JavaBuilder instead of VanillaJavaBuilder. Also note that VanillaJavaBuilder to a first approximation allows more code to compile than JavaBuilder, due to the latter's strict_deps and Error-Prone enforcement. That seems to imply that changes down the line that make new versions of Bazel use JavaBuilder where VJB was previously used would run the risk of breaking existing builds. So it seems to me it's well worth making sure VJB is used in as few cases as possible and, not to sound like a broken record, giving users the benefits of JavaBuilder.

I agree with Kevin that it's going to be very hard to migrate back to the regular JavaBuilder from VJB, and (while biased) I think the features in the non-VJB are valuable.

The lack of strict deps support in particular will very quickly result in builds that rely on transitive deps, and we know from experience those builds will be fragile and harder to maintain, and that the cleanup to re-enable strict deps is a lot of work.
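
To make the strict deps point concrete, here is a minimal Python sketch of the difference. It is a toy model, not how JavaBuilder implements the check: the target labels, the `deps` and `uses` dictionaries, and both function names are made up for illustration.

```python
def transitive_deps(deps, target):
    """All targets reachable from `target` through the deps graph."""
    seen, stack = set(), list(deps.get(target, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps.get(dep, []))
    return seen


def strict_deps_violations(deps, uses, target):
    """Targets whose classes `target` references without listing them in its own deps."""
    return sorted(set(uses.get(target, ())) - set(deps.get(target, ())))


# Hypothetical build graph: //app declares only //lib, but its sources also use //util.
deps = {"//app": ["//lib"], "//lib": ["//util"], "//util": []}
uses = {"//app": {"//lib", "//util"}}

# Without strict deps, the compile succeeds because //util is on the transitive classpath.
print("//util" in transitive_deps(deps, "//app"))
# With strict deps, //app is flagged for using //util without declaring it.
print(strict_deps_violations(deps, uses, "//app"))
```

If //lib later drops its dependency on //util, the non-strict build of //app breaks even though nothing in //app changed, which is the fragility described above.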

Liam Miller-Cushon

Aug 24, 2018, 3:11:12 PM
to Kevin Bierhoff, Jakob Buchgraber, ia...@stripe.com, Lukács T. Berki, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Thu, Aug 23, 2018 at 9:25 AM Liam Miller-Cushon <cus...@google.com> wrote:
One option would be to fetch the --host_javabase or --javabase from a remote repo, which would make it hermetic and wouldn't increase the Bazel distribution size.

Agreed that it would be nice, but I don't think that this should be the default. I think users of remote caching / execution
should simply be able to add the JDK of their choice as a remote repository that they host themselves. I don't think Bazel
should be in the business of providing JDKs to users. This should work right now already (correct me if I am wrong Liam)
and might just need some documentation.

It sounds like it's already possible, I just wasn't that familiar with remote repos. I agree re: providing JDKs, but it might be helpful to provide the BUILD files and remote repo configuration to use e.g. zulu.

This is sort of an aside, but it seems to be straightforward to get the --host_javabase or --javabase from a remote repo.

I'm not sure how that would work if we wanted to make it the default, though, since it depends on a per-project WORKSPACE entry?

$ cat WORKSPACE 
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "openjdk_linux",
    build_file_content = """
java_runtime(
    name = "jdk",
    srcs = glob(["**"]),
    visibility = ["//visibility:public"],
)
""",
    sha256 = "47ef4b708689f1923a2be9325b17c9df8545a33769a7f11b00dc70f0be4f12ca",
    strip_prefix = "zulu10.3+5-jdk10.0.2-linux_x64",
    # The download location (the urls attribute) does not appear in this message;
    # http_archive needs it to fetch the archive.
)
$ bazel build --host_javabase=@openjdk_linux//:jdk :a
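
The sha256 attribute is what makes a remote JDK like this reproducible: the downloaded archive is rejected unless its SHA-256 digest matches the pinned value. A minimal Python sketch of that kind of check, with a hypothetical helper name and an empty byte string standing in for the archive contents:

```python
import hashlib


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of `data` equals the pinned hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


# Stand-in payload: the digest below is the well-known SHA-256 of empty input.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
print(sha256_matches(b"", EMPTY_SHA256))
print(sha256_matches(b"tampered", EMPTY_SHA256))
```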

Jakob Buchgraber

Aug 26, 2018, 4:29:30 PM
to Liam Miller-Cushon, Kevin Bierhoff, Lukács T. Berki, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Fri, Aug 24, 2018 at 8:28 PM Liam Miller-Cushon <cus...@google.com> wrote:
I'm curious what examples we have of host tools that need modules that we don't want to include in the embedded JDK?

A valid answer to this question is that we don't know and don't need to know. The different rule implementations for JVM-based
languages (e.g. rules_kotlin, rules_scala) and any genrule using the $(JAVA*) make variables can run arbitrary code on the host
JDK. We do not want to be in a situation where we have to consider all of these cases when upgrading our embedded JDK.
Additionally, as Martin Buchholz wrote in a different thread, "I have a strong expectation that $(JAVABASE) is a "complete" JDK",
and I absolutely agree.

Lukács T. Berki

Aug 27, 2018, 7:00:20 AM
to Jakob Buchgraber, Liam Miller-Cushon, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
I also agree with Martin's expectation that $(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision of which JVM to use for running Bazel from as many things as possible.

I think the strongest argument for using the built-in JDK for anything else other than Bazel itself is that VanillaJavaBuilder should be used as little as possible. Could that be worked around, if needed, by distributing a javac.jar with Bazel and running that on whatever is --host_javabase? 

Jakob Buchgraber

Aug 27, 2018, 7:17:05 AM
to Lukács T. Berki, Liam Miller-Cushon, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Mon, Aug 27, 2018 at 1:00 PM Lukács T. Berki <lbe...@google.com> wrote:
I think the strongest argument for using the built-in JDK for anything else other than Bazel itself is that VanillaJavaBuilder should be used as little as possible. Could that be worked around, if needed, by distributing a javac.jar with Bazel and running that on whatever is --host_javabase? 

It was my understanding that this thread discusses the mid- and longterm future of the java rules in Bazel,
and I think in that future VanillaJavaBuilder should not exist. We should have a JavaBuilder for every JDK
version that we support.

Lukács T. Berki

Aug 27, 2018, 7:24:18 AM
to Jakob Buchgraber, Liam Miller-Cushon, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
That's also an option, but it's more expensive than packaging the classes that comprise javac. I also think that maintaining multiple JavaBuilders would be a way to go if it's feasible, but I'd like a backup option in case it isn't.

Jakob Buchgraber

Aug 27, 2018, 7:28:35 AM
to Lukács T. Berki, Liam Miller-Cushon, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Mon, Aug 27, 2018 at 1:24 PM Lukács T. Berki <lbe...@google.com> wrote:
That's also an option, but it's more expensive than packaging the classes that comprise javac.

It's my understanding that we already do that. Last time I checked there was a javac.jar in Bazel's
embedded tools.

Lukács T. Berki

Aug 27, 2018, 7:35:58 AM
to Jakob Buchgraber, Liam Miller-Cushon, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
But then why is the host javabase *that* important?  

Liam Miller-Cushon

Aug 27, 2018, 12:36:15 PM
to Lukács T. Berki, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Mon, Aug 27, 2018 at 4:00 AM Lukács T. Berki <lbe...@google.com> wrote:
I also agree with Martin's expectation that $(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision of which JVM to use for running Bazel from as many things as possible.

I agree in principle. In practice, there are a lot of JDK classes that are extremely unlikely to be needed by host tools (corba, etc.). We could get a fair bit of mileage out of keeping the current embedded JDK approach, and removing any classes that aren't needed by any popular host tools (the compilers for the major JVM languages, etc.)

I agree that in the medium term it would be nice to support more host_javabases, possibly by maintaining release branches of all of the host tools, or by moving them to a remote repository that could be versioned independently from Bazel and pinned to a version compatible with a particular JDK version.

It sounds like you want to minimize the embedded JDK ASAP, but can we consider it a blocker that doing so not regress the functionality of the Java toolchain?

On Mon, Aug 27, 2018 at 4:24 AM Lukács T. Berki <lbe...@google.com> wrote:
On Mon, Aug 27, 2018 at 1:16 PM, Jakob Buchgraber <buc...@google.com> wrote:
On Mon, Aug 27, 2018 at 1:00 PM Lukács T. Berki <lbe...@google.com> wrote:
I think the strongest argument for using the built-in JDK for anything else other than Bazel itself is that VanillaJavaBuilder should be used as little as possible. Could that be worked around, if needed, by distributing a javac.jar with Bazel and running that on whatever is --host_javabase? 

It was my understanding that this thread discusses the mid- and longterm future of the java rules in Bazel,
and I think in that future VanillaJavaBuilder should not exist. We should have a JavaBuilder for every JDK
version that we support.
That's also an option, but it's more expensive than packaging the classes that comprise javac. I also think that maintaining multiple JavaBuilders would be a way to go if it's feasible, but I'd like a backup option in case it isn't.

javac is tied to a particular JDK version, and the other tools built on top of javac depend on a large API surface area that tends to change between releases. Versioning javac alone wouldn't be sufficient: given the way things are currently structured, we'd need to version JavaBuilder and probably some of the other tools.
 
On Mon, Aug 27, 2018 at 4:35 AM Lukács T. Berki <lbe...@google.com> wrote:
It's my understanding that we already do that. Last time I checked there was a javac.jar in Bazel's
embedded tools.
But then why is the host javabase *that* important?  

javac is a Java program that needs a JDK to run on, and the javac in JDK version N can be made to work on JDK version N-1, but no earlier than that.

$ javap -v -cp ./third_party/java/jdk/langtools/jdk_compiler.jar com.sun.tools.javac.main.Main | grep 'major version'
  major version: 53
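
For anyone who wants to run the same check without javap at hand, the major version sits in bytes 6 and 7 of every class file, and 53 corresponds to Java 9. A minimal Python sketch, with illustrative function names and a synthetic eight-byte header rather than the real jdk_compiler.jar contents:

```python
import struct


def class_file_major_version(data: bytes) -> int:
    """Return the major version stored in the header of a .class file."""
    # Header layout, big-endian: u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version.
    magic, _minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def java_release(major: int) -> int:
    """Map a class-file major version to the Java release that introduced it."""
    # The numbering advances by one per release: 52 is Java 8, 53 is Java 9.
    return major - 44


header = bytes.fromhex("cafebabe00000035")  # synthetic header, major version 53
print(java_release(class_file_major_version(header)))
```

To inspect a real class, read the first eight bytes of a .class entry extracted from the jar and pass them to class_file_major_version.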

Lukács T. Berki

Aug 27, 2018, 1:29:20 PM
to Liam Miller-Cushon, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Mon, Aug 27, 2018 at 6:36 PM, Liam Miller-Cushon <cus...@google.com> wrote:
On Mon, Aug 27, 2018 at 4:00 AM Lukács T. Berki <lbe...@google.com> wrote:
I also agree with Martin's expectation that $(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision of which JVM to use for running Bazel from as many things as possible.

I agree in principle. In practice, there are a lot of JDK classes that are extremely unlikely to be needed by host tools (corba, etc.). We could get a fair bit of mileage out of keeping the current embedded JDK approach, and removing any classes that aren't needed by any popular host tools (the compilers for the major JVM languages, etc.)
The reason why I'm reluctant to go that route is that this will inevitably introduce coupling between the embedded JDK and the users of Bazel, thus making it harder to update said embedded JDK. I'd much rather have the default be that Bazel uses a complete JDK that has to be provided to it in some way and, if need be, the embedded JDK can be used as the host JDK by explicitly opting in, with the understanding that it's unsupported and if it breaks, you get to keep both pieces.

Your desire below that changes to the embedded JDK not regress the functionality of the Java toolchain is a very good indication of what would happen if we kept things this way -- whenever we wanted to update the JVM of Bazel itself, we'd have to make sure that a *lot* of other things keep working that are not Bazel itself.


I agree that in the medium term it would be nice to support more host_javabases, possibly by maintaining release branches of all of the host tools, or by moving them to a remote repository that could be versioned independently from Bazel and pinned to a version compatible with a particular JDK version.

It sounds like you want to minimize the embedded JDK ASAP, but can we consider it a blocker that doing so not regress the functionality of the Java toolchain? 

On Mon, Aug 27, 2018 at 4:24 AM Lukács T. Berki <lbe...@google.com> wrote:
On Mon, Aug 27, 2018 at 1:16 PM, Jakob Buchgraber <buc...@google.com> wrote:
On Mon, Aug 27, 2018 at 1:00 PM Lukács T. Berki <lbe...@google.com> wrote:
I think the strongest argument for using the built-in JDK for anything else other than Bazel itself is that VanillaJavaBuilder should be used as little as possible. Could that be worked around, if needed, by distributing a javac.jar with Bazel and running that on whatever is --host_javabase? 

It was my understanding that this thread discusses the mid- and longterm future of the java rules in Bazel,
and I think in that future VanillaJavaBuilder should not exist. We should have a JavaBuilder for every JDK
version that we support.
That's also an option, but it's more expensive than packaging the classes that comprise javac. I also think that maintaining multiple JavaBuilders would be a way to go if it's feasible, but I'd like a backup option in case it isn't.

javac is tied to a particular JDK version, and the other tools built on top of javac depend on a large API surface area that tends to change between releases. Versioning javac alone wouldn't be sufficient: given the way things are currently structured, we'd need to version JavaBuilder and probably some of the other tools.
Yeah, it would be nice if JavaBuilder wasn't coupled to javac like that, but there are a lot of other things in the world I'd also like not to be the way they are :)
 
 
On Mon, Aug 27, 2018 at 4:35 AM Lukács T. Berki <lbe...@google.com> wrote:
It's my understanding that we already do that. Last time I checked there was a javac.jar in Bazel's
embedded tools.
But then why is the host javabase *that* important?  

javac is a Java program that needs a JDK to run on, and the javac in JDK version N can be made to work on JDK version N-1, but no earlier than that.
Understood, thanks.
 

$ javap -v -cp ./third_party/java/jdk/langtools/jdk_compiler.jar com.sun.tools.javac.main.Main | grep 'major version'
  major version: 53

Liam Miller-Cushon

Aug 27, 2018, 5:03:58 PM
to Lukács T. Berki, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Mon, Aug 27, 2018 at 10:29 AM Lukács T. Berki <lbe...@google.com> wrote:
On Mon, Aug 27, 2018 at 6:36 PM, Liam Miller-Cushon <cus...@google.com> wrote:
On Mon, Aug 27, 2018 at 4:00 AM Lukács T. Berki <lbe...@google.com> wrote:
I also agree with Martin's expectation that $(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision of which JVM to use for running Bazel from as many things as possible.

I agree in principle. In practice, there are a lot of JDK classes that are extremely unlikely to be needed by host tools (corba, etc.). We could get a fair bit of mileage out of keeping the current embedded JDK approach, and removing any classes that aren't needed by any popular host tools (the compilers for the major JVM languages, etc.)
The reason why I'm reluctant to go that route is that this will inevitably introduce coupling between the embedded JDK and the users of Bazel, thus making it harder to update said embedded JDK. I'd much rather have the default be that Bazel uses a complete JDK that has to be provided to it in some way and, if need be, the embedded JDK can be used as the host JDK by explicitly opting in, with the understanding that it's unsupported and if it breaks, you get to keep both pieces.

Your desire below that changes to the embedded JDK not regress the functionality of the Java toolchain is a very good indication of what would happen if we kept things this way -- whenever we wanted to update the JVM of Bazel itself, we'd have to make sure that a *lot* of other things keep working that are not Bazel itself.

Part of the solution is better testing and validation of release candidates, regardless of whether that coupling with the embedded JDK exists long-term. i.e. if a new release ships with a toolchain that requires a locally installed JDK 10, and some other host tools aren't compatible with JDK 10 yet, that's still a breakage. I don't see a way around that as long as the toolchain is distributed with Bazel, unless we stop making changes to the defaults entirely.

On Mon, Aug 27, 2018 at 10:29 AM Lukács T. Berki <lbe...@google.com> wrote:
It sounds like you want to minimize the embedded JDK ASAP, but can we consider it a blocker that doing so not regress the functionality of the Java toolchain? 

What are your thoughts on this part? I'm not sure what the current proposal is.

Ian O'Connell

Aug 27, 2018, 5:20:01 PM
to Liam Miller-Cushon, Lukács T. Berki, Jakob Buchgraber, Kevin Bierhoff, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
Long term, should the Java rules be embedded in or tied strongly to Bazel? That feels like a question that comes into this. If the answer is no, then I'm not sure if

-> Point at a particular external JDK version by default for the host_javabase which supports the JB
-> People can use other JDKs as they can now, but probably will be stuck with the VJB
-> If Google/others add support for multiple JDKs/JBs this works fine here
-> We can minimize the embedded JDK


Other than downloading/unpacking the other JDK, it doesn't feel like it's a regression from today unless I'm missing something? Since today the JB is tied to a particular JDK anyway. But Bazel's internals will just be isolated going forward.

For simplicity I do like the idea of only full JDKs being exposed, since having to care about which packages code uses/annotation processors feels like an abstraction breakage to me. End users having to look at Bazel's source to figure out which parts of the JDK were embedded isn't ideal. (Even if it occurs rarely, if it happens to you once, you'll now always have to wonder if you have a real/other bug or that part of the JDK is missing.)


Liam Miller-Cushon

Aug 27, 2018, 5:39:59 PM
to ia...@stripe.com, Lukács T. Berki, Jakob Buchgraber, Kevin Bierhoff, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Mon, Aug 27, 2018 at 2:20 PM Ian O'Connell <ia...@stripe.com> wrote:
Long term, should the Java rules be embedded in or tied strongly to Bazel? That feels like a question that comes into this. If the answer is no, then I'm not sure if

-> Point at a particular external JDK version by default for the host_javabase which supports the JB
-> People can use other JDKs as they can now, but probably will be stuck with the VJB
-> If Google/others add support for multiple JDKs/JBs this works fine here
-> We can minimize the embedded JDK

I think that long-term we want the tools and host JDK to be versioned independently from the Bazel binary, and moving them to a remote repo accomplishes that.
 
Other than downloading/unpacking the other JDK, it doesn't feel like it's a regression from today unless I'm missing something?

I don't see any disadvantages either, aside from the additional download.

Lukács T. Berki

Aug 28, 2018, 3:43:03 AM
to Liam Miller-Cushon, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Mon, Aug 27, 2018 at 11:03 PM, Liam Miller-Cushon <cus...@google.com> wrote:
On Mon, Aug 27, 2018 at 10:29 AM Lukács T. Berki <lbe...@google.com> wrote:
On Mon, Aug 27, 2018 at 6:36 PM, Liam Miller-Cushon <cus...@google.com> wrote:
On Mon, Aug 27, 2018 at 4:00 AM Lukács T. Berki <lbe...@google.com> wrote:
I also agree with Martin's expectation that $(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision of which JVM to use for running Bazel from as many things as possible.

I agree in principle. In practice, there are a lot of JDK classes that are extremely unlikely to be needed by host tools (corba, etc.). We could get a fair bit of mileage out of keeping the current embedded JDK approach, and removing any classes that aren't needed by any popular host tools (the compilers for the major JVM languages, etc.)
The reason why I'm reluctant to go that route is that this will inevitably introduce coupling between the embedded JDK and the users of Bazel, thus making it harder to update said embedded JDK. I'd much rather have the default be that Bazel uses a complete JDK that has to be provided to it in some way and, if need be, the embedded JDK can be used as the host JDK by explicitly opting in, with the understanding that it's unsupported and if it breaks, you get to keep both pieces.

Your desire below that changes to the embedded JDK not regress the functionality of the Java toolchain is a very good indication of what would happen if we kept things this way -- whenever we wanted to update the JVM of Bazel itself, we'd have to make sure that a *lot* of other things keep working that are not Bazel itself.

Part of the solution is better testing and validation of release candidates, regardless of whether that coupling with the embedded JDK exists long-term. i.e. if a new release ships with a toolchain that requires a locally installed JDK 10, and some other host tools aren't compatible with JDK 10 yet, that's still a breakage. I don't see a way around that as long as the toolchain is distributed with Bazel, unless we stop making changes to the defaults entirely.
Agreed. Once we figure out what the plan for JDKs is, the next step is to figure out which versions we want to support and then we can add testing for them.
 

On Mon, Aug 27, 2018 at 10:29 AM Lukács T. Berki <lbe...@google.com> wrote:
It sounds like you want to minimize the embedded JDK ASAP, but can we consider it a blocker that doing so not regress the functionality of the Java toolchain? 

What are your thoughts on this part?
Ideally, we'd use the embedded JDK only for running Bazel itself and thus we could minimize it at will.
 
I'm not sure what the current proposal is.
I think there are two proposals on the table:
  1. Embedded JDK is only used for running Bazel itself, and if you want to compile Java code, you need to provide a JDK to be used as the host javabase and the target javabase. If that JDK is not of the blessed version, you also need to explicitly revert to VanillaJavaBuilder.
  2. Embedded JDK is only used for running Bazel itself, and the default host javabase is an external repository that is automatically downloaded when needed.
I could be convinced about both, but (2) means that we have to commit to maintaining the place where the JDK is to be downloaded from for all supported architectures or figure out a way to download the JDK automatically from some official place while making sure that we are abiding by its license. (1) seems to be simpler. One could argue that it would mean shifting burden from us (one group of people) to all users of Bazel who want to build JVM stuff (many groups of people). However, I think that if one wants to build a JVM language, it's a pretty reasonable request to install a JDK, isn't it?
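
The "blessed version" rule in (1) can be sketched as a small function. This is a hypothetical model, not Bazel's actual logic: the function name is made up, and the compatibility window (the bundled javac of JDK N runs on JDK N or N-1) is taken from Liam's note earlier in the thread. Anything outside that window, including a newer host JDK, is assumed to need the fallback.

```python
def pick_java_builder(host_jdk_major: int, javac_major: int) -> str:
    """Hypothetical rule: which builder a given host JDK can run."""
    # Assumption from the thread: the javac of JDK N works on JDK N or N-1, no earlier.
    if host_jdk_major in (javac_major - 1, javac_major):
        return "JavaBuilder"
    # Every other host JDK has to fall back to the less capable builder.
    return "VanillaJavaBuilder"


# With a JavaBuilder built around javac 10:
for host in (8, 9, 10):
    print(host, pick_java_builder(host, javac_major=10))
```

Under (1) the user would have to make this choice explicitly when their JDK falls outside the window; under (2) the downloaded host javabase always falls inside it.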

Liam Miller-Cushon

Aug 28, 2018, 11:49:16 AM
to Lukács T. Berki, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Tue, Aug 28, 2018 at 12:43 AM Lukács T. Berki <lbe...@google.com> wrote:
I think there are two proposals on the table:
  1. Embedded JDK is only used for running Bazel itself, and if you want to compile Java code, you need to provide a JDK to be used as the host javabase and the target javabase. If that JDK is not of the blessed version, you also need to explicitly revert to VanillaJavaBuilder.
  2. Embedded JDK is only used for running Bazel itself, and the default host javabase is an external repository that is automatically downloaded when needed.
I could be convinced about both, but (2) means that we have to commit to maintaining the place where the JDK is to be downloaded from for all supported architectures or figure out a way to download the JDK automatically from some official place while making sure that we are abiding by its license.

My understanding from earlier in the thread was that we want to be able to do this for remote execution regardless, so (1) doesn't eliminate the need to have a mirror of those JDKs somewhere.

(1) seems to be simpler. One could argue that it would mean shifting burden from us (one group of people) to all users of Bazel who want to build JVM stuff (many groups of people). However, I think that if one wants to build a JVM language, it's a pretty reasonable request to install a JDK, isn't it?

If the default toolchain depends on the latest six-monthly JDK release, this is going to be onerous. People may not have (or have access to) a local install of the latest version immediately. And the Java installers typically make the new version the default (and add it to your PATH) which may interfere with other programs that don't support that JDK yet. If we want the non-VanillaJavaBuilder as the default, and for it to be relatively hassle-free, I think (2) will provide a better experience.

Kevin Bierhoff

Aug 28, 2018, 12:59:56 PM
to Liam Miller-Cushon, Lukács T. Berki, Jakob Buchgraber, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
One other thought I had on this topic, not sure how important it is so I wanted to bring it up so y'all can be the judge of that, is Java-based host tools needed for building other languages.  Not sure how common those are, but it seems like it would be nice if those "just worked", at least in most cases, without having to install a JDK.  cushon@'s option (2) would I believe make that happen as well (with a download).  To be clear, I'm not talking about user-written genrules but rather tools used by non-Java build rules.


--
Kevin Bierhoff
Google

Lukács T. Berki

Aug 28, 2018, 1:07:03 PM
to Liam Miller-Cushon, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Tue, Aug 28, 2018 at 5:48 PM, Liam Miller-Cushon <cus...@google.com> wrote:
On Tue, Aug 28, 2018 at 12:43 AM Lukács T. Berki <lbe...@google.com> wrote:
I think there are two proposals on the table:
  1. Embedded JDK is only used for running Bazel itself, and if you want to compile Java code, you need to provide a JDK to be used as the host javabase and the target javabase. If that JDK is not of the blessed version, you also need to explicitly revert to VanillaJavaBuilder.
  2. Embedded JDK is only used for running Bazel itself, and the default host javabase is an external repository that is automatically downloaded when needed.
I could be convinced about both, but (2) means that we have to commit to maintaining the place where the JDK is to be downloaded from for all supported architectures or figure out a way to download the JDK automatically from some official place while making sure that we are abiding by its license.

My understanding from earlier in the thread was that we want to be able to do this for remote execution regardless, so (1) doesn't eliminate the need to have a mirror of those JDKs somewhere.
That's a good point. I'll ask Philipp tomorrow about the mechanics of maintaining them, because if we go with this option, they will become *way* more important.
 

(1) seems to be simpler. One could argue that it would mean shifting burden from us (one group of people) to all users of Bazel who want to build JVM stuff (many groups of people). However, I think that if one wants to build a JVM language, it's a pretty reasonable request to install a JDK, isn't it?

If the default toolchain depends on the latest six-monthly JDK release, this is going to be onerous. People may not have (or have access to) a local install of the latest version immediately. And the Java installers typically make the new version the default (and add it to your PATH) which may interfere with other programs that don't support that JDK yet. If we want the non-VanillaJavaBuilder as the default, and for it to be relatively hassle-free, I think (2) will provide a better experience.
You and Kevin are starting to convince me. I'll also check with Klaus tomorrow what he thinks about making remote repositories essentially mandatory for Bazel. I think that's okay, but better be safe than sorry.

Lukács T. Berki

Aug 29, 2018, 8:35:48 AM
to Liam Miller-Cushon, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
To summarize the pros and cons of approach (1), that is, always download the host JDK from a remote repository and approach (2), that is, to require a pre-installed local JDK:

If we default to host JDK in a remote repository:
  1. People who unknowingly and transitively depend on the JDK can work without installing a JDK (but I expect this to be rare)
  2. The host JDK will always be the right major version (minor version and JDK vendor don't matter, do they?)
  3. We need to maintain our own JDK on mirror.bazel.build (@ngiraldo: do we need to do that for RBE?)
  4. It's a large download (hundreds of megabytes) and people working offline will need to remember to do "bazel fetch" (but that's already the case if you have remote repositories)
  5. Works by default with RBE (@ngiraldo: but is this actually useful as opposed to the JDK being preinstalled on the workers? Then the host javabase will need to be shipped to workers individually)
  6. Downloading a host javabase may not be received well by distributions where we package Bazel (Debian, Homebrew)
If we default to a preinstalled host JDK:
  1. People who unknowingly and transitively depend on the JDK will be surprised by the requirement to install a JDK
  2. Users would need to explicitly revert to VanillaJavaBuilder if the system JDK doesn't work with JavaBuilder (but JavaBuilder works across two major versions and if need be, we can maintain multiple versions)
  3. Users of Bazel get the exact JDK they want
  4. No surprising large download of a JDK
  5. When one wants to start using the RBE, --host_javabase needs to be changed so that it's a JDK that exists on the remote machine (but you already have a lot of flags you need to add in that case)
Is this a fair assessment of the upsides and downsides of each approach?

Liam Miller-Cushon

Aug 29, 2018, 10:58:16 AM
to Lukács T. Berki, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
Is this a fair assessment of the upsides and downsides of each approach?

SGTM. I left a few notes inline:

To summarize the pros and cons of approach (1), that is, always download the host JDK from a remote repository and approach (2), that is, to require a pre-installed local JDK:
If we default to host JDK in a remote repository:

1. People who unknowingly and transitively depend on the JDK can work without installing a JDK (but I expect this to be rare)

Are we sure there aren't any Java host tools that are used by other languages? Nothing for coverage, or windows singlejar?
 
2. The host JDK will always be the right major version (minor version and JDK vendor don't matter, do they?)

The minor version and vendor don't matter in theory. In practice we depend on a few internal APIs, so they could matter. This should be much less of an issue with the six-monthly release cadence since each major version only has two point releases.
 
3. We need to maintain our own JDK on mirror.bazel.build (@ngiraldo: do we need to do that for RBE?)
4. It's a large download (hundreds of megabytes) and people working offline will need to remember to do "bazel fetch" (but that's already the case if you have remote repositories)

`zulu10.2+3-jdk10.0.1-linux_x64-allmodules.tar.gz` is 0.52 hundred MiB :)
 
5. Works by default with RBE (@ngiraldo: but is this actually useful as opposed to the JDK being preinstalled on the workers? Then the host javabase will need to be shipped to workers individually)
6. Downloading a host javabase may not be received well by distributions where we package Bazel (Debian, Homebrew)

If it's poorly received, wouldn't those concerns apply equally to all other uses of remote repos? I thought we wanted to move most of the language-specific support stuff (toolchains, rules, etc.) to remote repos.
 
If we default to a preinstalled host JDK:

1. People who unknowingly and transitively depend on the JDK will be surprised by the requirement to install a JDK
2. Users would need to explicitly revert to VanillaJavaBuilder if the system JDK doesn't work with JavaBuilder (but JavaBuilder works across two major versions and if need be, we can maintain multiple versions)

I just want to re-iterate that this seems like a bad out of the box experience: you get Bazel, do a Java build, it fails mysteriously (or at best reports an error), and then you have to go and install a JDK or pass additional flags we'd prefer most users didn't have to have expertise with.
 
3. Users of Bazel get the exact JDK they want

It depends what JDK they want, which I don't have a good sense of. I'd expect more people to have a preference about target --javabase and language level, and to care much less about the toolchain implementation details as long as it provides the functionality they expect.
 
4. No surprising large download of a JDK
5. When one wants to start using the RBE, --host_javabase needs to be changed so that it's a JDK that exists on the remote machine (but you already have a lot of flags you need to add in that case)

Lukács T. Berki

Aug 29, 2018, 1:22:57 PM8/29/18
to Liam Miller-Cushon, Jakob Buchgraber, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
On Wed, Aug 29, 2018 at 4:58 PM, Liam Miller-Cushon <cus...@google.com> wrote:
Is this a fair assessment of the upsides and downsides of each approach?

SGTM. I left a few notes inline:

To summarize the pros and cons of approach (1), that is, always download the host JDK from a remote repository and approach (2), that is, to require a pre-installed local JDK:
If we default to host JDK in a remote repository:

1. People who unknowingly and transitively depend on the JDK can work without installing a JDK (but I expect this to be rare)

Are we sure there aren't any Java host tools that are used by other languages? Nothing for coverage, or windows singlejar?
These are "accidental" issues, so I'd prefer not to base long-term decisions on what the problem du jour is.

(To address these two concerns specifically, the Windows singlejar is just something that needs to be removed. IIRC the C++ singlejar could be updated to work on Windows pretty easily, it's just a matter of investing the little work that's needed. LcovMerger is run in the test action which means that it uses the target javabase and not the host one)

 
2. The host JDK will always be the right major version (minor version and JDK vendor don't matter, do they?)

The minor version and vendor don't matter in theory. In practice we depend on a few internal APIs, so they could matter. This should be much less of an issue with the six-monthly release cadence since each major version only has two point releases.
 
3. We need to maintain our own JDK on mirror.bazel.build (@ngiraldo: do we need to do that for RBE?)
4. It's a large download (hundreds of megabytes) and people working offline will need to remember to do "bazel fetch" (but that's already the case if you have remote repositories)

`zulu10.2+3-jdk10.0.1-linux_x64-allmodules.tar.gz` is 0.52 hundred MiB :)
Erm. My bad.
 
 
5. Works by default with RBE (@ngiraldo: but is this actually useful as opposed to the JDK being preinstalled on the workers? Then the host javabase will need to be shipped to workers individually)
6. Downloading a host javabase may not be received well by distributions where we package Bazel (Debian, Homebrew)

If it's poorly received, wouldn't those concerns apply equally to all other uses of remote repos? I thought we wanted to move most of the language-specific support stuff (toolchains, rules, etc.) to remote repos.
That's a good point. Maybe we should just cross that bridge now and be done with it.

 
If we default to a preinstalled host JDK:

1. People who unknowingly and transitively depend on the JDK will be surprised by the requirement to install a JDK
2. Users would need to explicitly revert to VanillaJavaBuilder if the system JDK doesn't work with JavaBuilder (but JavaBuilder works across two major versions and if need be, we can maintain multiple versions)

I just want to re-iterate that this seems like a bad out of the box experience: you get Bazel, do a Java build, it fails mysteriously (or at best reports an error), and then you have to go and install a JDK or pass additional flags we'd prefer most users didn't have to have expertise with.
You have to install a JDK to run the JVM software anyway. Are you concerned about cases where there is a JDK of the wrong version or where it's not JVM code that's being built?
 
 
 
3. Users of Bazel get the exact JDK they want

It depends what JDK they want, which I don't have a good sense of. I'd expect more people to have a preference about target --javabase and language level, and to care much less about the toolchain implementation details as long as it provides the functionality they expect.
That's true -- I expect that the host javabase wouldn't matter as much as the target one.

Jakob Buchgraber

Aug 29, 2018, 2:56:28 PM8/29/18
to Liam Miller-Cushon, Lukács T. Berki, Kevin Bierhoff, ia...@stripe.com, bazel-...@googlegroups.com, Irina Iancu, Nicolas Lopez
Liam,

could you please give us all a better sense of what would be required to support / maintain
multiple JavaBuilder versions for several major JDK versions? Do I understand correctly that
to support JDK [N-3, N] we would need two versions of JavaBuilder and javac? For example,
could we support JDK 8,9,10,11 as a host_javabase with a JDK9 JavaBuilder/javac and
a JDK10 JavaBuilder/javac? Once 12 was released, we could support JDK 9,10,11,12 
as a host_javabase with a JDK10 JavaBuilder/javac and a JDK12 JavaBuilder/javac? Correct?

On Wed, Aug 29, 2018 at 4:58 PM Liam Miller-Cushon <cus...@google.com> wrote:
`zulu10.2+3-jdk10.0.1-linux_x64-allmodules.tar.gz` is 0.52 hundred MiB :)

Please note that this is already a "stripped" JDK. The full JDK we get from Azul is ~200MiB.
Besides that, downloading 50MiB is *a lot*. It's really worrisome for CI builds where lots of
CI systems out there have stateless builds in some docker container. 
 
 I just want to re-iterate that this seems like a bad out of the box experience: you get Bazel, do a Java build, it fails mysteriously (or at best reports an error), and then you have to go and install a JDK or pass additional flags we'd prefer most users didn't have to have expertise with.

Agreed. We should define which JDK major versions we support, provide a JavaBuilder for
them and for unsupported JDK versions give an error message that the JDK is too old and
unsupported.

Thanks,
Jakob

Nicolas Lopez

Aug 29, 2018, 3:28:13 PM8/29/18
to Lukács T. Berki, Irina Iancu, Jakob Buchgraber, Kevin Bierhoff, Liam Miller-Cushon, bazel-...@googlegroups.com, ia...@stripe.com


On Wed, Aug 29, 2018, 8:35 AM Lukács T. Berki <lbe...@google.com> wrote:
To summarize the pros and cons of approach (1), that is, always download the host JDK from a remote repository and approach (2), that is, to require a pre-installed local JDK:

If we default to host JDK in a remote repository:
  1. People who unknowingly and transitively depend on the JDK can work without installing a JDK (but I expect this to be rare)
  2. The host JDK will always be the right major version (minor version and JDK vendor don't matter, do they?)
  3. We need to maintain our own JDK on mirror.bazel.build (@ngiraldo: do we need to do that for RBE?)
We do need the JDK on mirror.bazel.build to be maintained, as it is currently being included in the rbe-ubuntu16-04 container (more details about that decision here). Note that we build this container about twice per quarter, at which times the mirror must be available. This happens offline from any RBE builds, and availability issues in the mirror never affect RBE customers directly.
  4. It's a large download (hundreds of megabytes) and people working offline will need to remember to do "bazel fetch" (but that's already the case if you have remote repositories)
  5. Works by default with RBE (@ngiraldo: but is this actually useful as opposed to the JDK being preinstalled on the workers? Then the host javabase will need to be shipped to workers individually)
We currently preinstall the jdk in the workers (in the rbe-ubuntu16-04 toolchain container), but we should also support downloading it on the host via workspace and shipping that to workers. The overhead from the perspective of RBE should not be much, as these files are cached, so files only need to be sent to each worker once. Still, since this requires a large download on bazel fetch, we recommend using the preinstalled jdk on the container.

Note that if we decide to use the mirror to download via WORKSPACE rule, the availability of the mirror will become critical and may affect (both rbe and local) builds.
  6. Downloading a host javabase may not be received well by distributions where we package Bazel (Debian, Homebrew)
If we default to a preinstalled host JDK:
  1. People who unknowingly and transitively depend on the JDK will be surprised by the requirement to install a JDK
  2. Users would need to explicitly revert to VanillaJavaBuilder if the system JDK doesn't work with JavaBuilder (but JavaBuilder works across two major versions and if need be, we can maintain multiple versions)
  3. Users of Bazel get the exact JDK they want
  4. No surprising large download of a JDK
  5. When one wants to start using the RBE, --host_javabase needs to be changed so that it's a JDK that exists on the remote machine (but you already have a lot of flags you need to add in that case)
Is this a fair assessment of the upsides and downsides of each approach?

My 2 cents is that we should provide support for both approaches with clear advice/error messages. This would work as follows:
- Default to a preinstalled jdk if it exists and is "new enough" to use non-VJB
- Error out if one does not exist and user does not provide any explicit guidance wrt the two options they have:
1. Use VJB (+ older local jdk) with explicit flag/env variable (and build produces a clear message showing a full list of downsides of VJB and/or URL to documentation with more detailed information)
2. Use downloaded from mirror jdk + non-VJB with explicit flag/env variable (and build produces a message showing download will be large and running bazel fetch is recommended)
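
The selection logic proposed above could be sketched roughly as follows. This is only an illustration in Python, not Bazel code: the function name, the two opt-in parameters (standing in for the "explicit flag/env variable"), and the minimum-version constant are all hypothetical, and the proposal does not say which option should win if a user opts into both.

```python
from typing import Optional, Tuple

# Assumption: the proposal only says "new enough"; 9 is a placeholder threshold.
MIN_JDK_FOR_JAVABUILDER = 9


def choose_host_setup(local_jdk_major: Optional[int],
                      allow_vjb: bool = False,
                      allow_download: bool = False) -> Tuple[str, str]:
    """Return (jdk_source, builder), or raise with advice for the user.

    local_jdk_major is None when no preinstalled JDK was detected.
    allow_vjb / allow_download are hypothetical stand-ins for the explicit
    flags or environment variables mentioned in the proposal.
    """
    # Default: a preinstalled JDK that is new enough for the regular JavaBuilder.
    if local_jdk_major is not None and local_jdk_major >= MIN_JDK_FOR_JAVABUILDER:
        return ("local", "JavaBuilder")

    # Option 1: explicit opt-in to VanillaJavaBuilder (VJB) with an older local JDK.
    if allow_vjb and local_jdk_major is not None:
        return ("local", "VanillaJavaBuilder")

    # Option 2: explicit opt-in to a JDK downloaded from the mirror.
    if allow_download:
        return ("remote", "JavaBuilder")

    # No usable default and no explicit guidance: fail with clear advice.
    raise RuntimeError(
        "No suitable host JDK found. Either opt in to VanillaJavaBuilder "
        "(fewer features) or opt in to downloading a JDK (large download; "
        "running 'bazel fetch' first is recommended)."
    )
```

In this sketch a user with only JDK 8 installed and no opt-in gets the error, which is exactly the case the rest of the thread worries about.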

Nicolas Lopez

Aug 29, 2018, 3:28:51 PM8/29/18
to Lukács T. Berki, Irina Iancu, Jakob Buchgraber, Kevin Bierhoff, Liam Miller-Cushon, bazel-...@googlegroups.com, ia...@stripe.com
thanks, hit the wrong reply button. Resent to everyone.

Liam Miller-Cushon

Aug 30, 2018, 2:57:16 AM8/30/18
to Nicolas Lopez, Lukács T. Berki, Irina Iancu, Jakob Buchgraber, Kevin Bierhoff, bazel-...@googlegroups.com, ia...@stripe.com
> These are "accidental" issues, so I'd prefer not to base long-term decisions on what the problem du jour is.

I probably could have chosen better examples, unless you're saying that any use of Java for host tools to support non-JVM languages should be discouraged. Is there a preferred alternative?

> > If it's poorly received, wouldn't those concerns apply equally to all other uses of remote repos? I thought we wanted to move most of the language-specific support stuff (toolchains, rules, etc.) to remote repos.
> That's a good point. Maybe we should just cross that bridge now and be done with it.

Let's find out if the concerns you raised are blockers before we try to migrate. Can you follow up on that?

> You have to install a JDK to run the JVM software anyway. Are you concerned about cases where there is a JDK of the wrong version

Yes. This is the point I raised earlier that people may not have a local install of the latest version immediately, and that installing one can be disruptive if other programs on their system don't support that JDK yet.

> what would be required to support / maintain multiple JavaBuilder versions for several major JDK versions?

Staffing, primarily :)

> Do I understand correctly that to support JDK [N-3, N] we would need two versions of JavaBuilder and javac? For example, could we support JDK 8,9,10,11 as a host_javabase with a JDK9 JavaBuilder/javac and a JDK10 JavaBuilder/javac? Once 12 was released, we could support JDK 9,10,11,12 as a host_javabase with a JDK10 JavaBuilder/javac and a JDK12 JavaBuilder/javac? Correct?

I'm not following the math here. JavaBuilder/javac N can be run on JDKs newer than N, but the latest language level they support is N, so that's not a good experience (i.e. the default toolchain doesn't support JDK 11 today, even though you can use a JDK 11 host_javabase). JavaBuilder/javac N can be made to work on JDK N-1, but that takes additional work and is slightly hacky.

So in practice we'd want one javac/JavaBuilder per host JDK version.
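
To make the compatibility rule above concrete, here is a small Python sketch. It only encodes the three statements from this message (runs on the same or newer JDK, can be made to work on N-1 with effort, language level capped at N); the function names and the "ok"/"hacky"/"no" labels are made up for illustration.

```python
from typing import Iterable, Optional


def run_status(toolchain: int, host_jdk: int) -> str:
    """Classify whether javac/JavaBuilder `toolchain` can run on `host_jdk`."""
    if host_jdk >= toolchain:
        return "ok"      # runs on the same or a newer JDK
    if host_jdk == toolchain - 1:
        return "hacky"   # can be made to work, but takes additional effort
    return "no"


def best_language_level(host_jdk: int, toolchains: Iterable[int]) -> Optional[int]:
    """Highest language level available on `host_jdk` without hacks.

    A toolchain of version N supports language levels up to N, so the best
    level is simply the newest toolchain that runs cleanly on the host JDK.
    """
    usable = [t for t in toolchains if run_status(t, host_jdk) == "ok"]
    return max(usable) if usable else None
```

With toolchains 9 and 10 available, `best_language_level(11, [9, 10])` is 10 rather than 11, and `best_language_level(8, [9, 10])` is `None` unless the N-1 hack is used, which is why two toolchain versions do not cover JDK 8 through 11 well.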

I think the shortest path to supporting additional host JDKs is to migrate the entire toolchain to a remote repo so it can be versioned independently from Bazel. That doesn't solve the problem of actively supporting the old versions (including e.g. back-porting fixes to them), but at least it lets people pin to an old version of the toolchain that matches the host JDK they're using, and to upgrade independently from their Bazel version.

> Besides that, downloading 50MiB is *a lot*. It's really worrisome for CI builds

CI systems don't have to use the remote JDK, as Nicolas said.

It's still not clear to me why a 50MiB download is such a concern, or why it's being weighted so heavily against the experience for Java users. Can you help me understand this part?

> My 2 cents is that we should provide support for both approaches with clear advice/error messages. This would work as follows:
> - Default to a preinstalled jdk if it exists and is "new enough" to use non-VJB
> - Error out if one does not exist and user does not provide any explicit guidance wrt the two options they have:

My concern with this approach is that we might end up erroring out for a large fraction of users. It would be helpful to have data on Bazel users' locally installed JDK versions, but I suspect that a minority have JDK 10 installed.

Liam Miller-Cushon

Aug 30, 2018, 3:01:54 AM8/30/18
to Nicolas Lopez, Lukács T. Berki, Irina Iancu, Jakob Buchgraber, Kevin Bierhoff, bazel-...@googlegroups.com, ia...@stripe.com
Taking a step back, what's the deadline for getting this resolved?

The most promising option seems to be moving the entire toolchain to a remote repo, and I understand that's something you already wanted to do in the medium/long term.

However there are some open questions about remote repos, including how stable they are, how they interact with distro packaging requirements, and how acceptable downloading additional MiBs is.

Can we hit pause on this until some of those issues are worked out?

I understand the forcing functions to be:
1) avoiding a repeat of the breakages with 0.16, and in particular preventing uses of the embedded JDK except as the server_javabase
2) minimizing the embedded JDK

I don't think (1) is urgent now that the dust has settled on 0.16. JDK 11 will be released in a month, but it's also going to be a less disruptive change than 8->9->10 was.

(2) is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.

Lukács T. Berki

Aug 30, 2018, 3:23:30 AM8/30/18
to Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, Kevin Bierhoff, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
On Thu, Aug 30, 2018 at 9:01 AM, Liam Miller-Cushon <cus...@google.com> wrote:
Taking a step back, what's the deadline for getting this resolved?
Bazel 1.0, I guess?
 

The most promising option seems to be moving the entire toolchain to a remote repo, and I understand that's something you already wanted to do in the medium/long term.

However there are some open questions about remote repos, including how stable they are,
These existed only in my mind, apparently.
 
how they interact with distro packaging requirements, and how acceptable downloading additional MiBs is.
I think it's not optimal that we'd have to download something, but in the long term, I'd expect remote repositories to be heavily used, so I also think it doesn't make sense to irrationally shy away from them. Again, if surprise downloads become a problem, we should solve that in a way that works not only for the JDK, but for other remote repositories, too.


Can we hit pause on this until some of those issues are worked out?

I understand the forcing functions to be:
1) avoiding a repeat of the breakages with 0.16, and in particular preventing uses of the embedded JDK except as the server_javabase
2) minimizing the embedded JDK

I don't think (1) is urgent now that the dust has settled on 0.16. JDK 11 will be released in a month, but it's also going to be a less disruptive change than 8->9->10 was.

(2) is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.
+1. Let's figure out what to do first, then tweak the embedded JDK when we can do it without harm.


On Wed, Aug 29, 2018 at 11:57 PM Liam Miller-Cushon <cus...@google.com> wrote:
> These are "accidental" issues, so I'd prefer not to base long-term decisions on what the problem du jour is.

I probably could have chosen better examples, unless you're saying that any use of Java for host tools to support non-JVM languages should be discouraged. Is there a preferred alternative?

> > If it's poorly received, wouldn't those concerns apply equally to all other uses of remote repos? I thought we wanted to move most of the language-specific support stuff (toolchains, rules, etc.) to remote repos.
> That's a good point. Maybe we should just cross that bridge now and be done with it.

Let's find out if the concerns you raised are blockers before we try to migrate. Can you follow up on that?
Remote repository support should be stable enough according to Klaus. My opinion is that since remote repositories are here to stay, there isn't much point in shying away from them. If we deem JDK-in-remote-repository the right approach, we should put that behind a flag, roll it out at a convenient Bazel release, then, if bugs surface, be it with remote repository support or not, we'll fix it before flipping the flag by default.
 

> You have to install a JDK to run the JVM software anyway. Are you concerned about cases where there is a JDK of the wrong version

Yes. This is the point I raised earlier that people may not have a local install of the latest version immediately, and that installing one can be disruptive if other programs on their system don't support that JDK yet.

> what would be required to support / maintain multiple JavaBuilder versions for several major JDK versions?

Staffing, primarily :)

> Do I understand correctly that to support JDK [N-3, N] we would need two versions of JavaBuilder and javac? For example, could we support JDK 8,9,10,11 as a host_javabase with a JDK9 JavaBuilder/javac and a JDK10 JavaBuilder/javac? Once 12 was released, we could support JDK 9,10,11,12 as a host_javabase with a JDK10 JavaBuilder/javac and a JDK12 JavaBuilder/javac? Correct?

I'm not following the math here. JavaBuilder/javac N can be run on JDKs newer than N, but the latest language level they support is N, so that's not a good experience (i.e. the default toolchain doesn't support JDK 11 today, even though you can use a JDK 11 host_javabase). JavaBuilder/javac N can be made to work on JDK N-1, but that takes additional work and is slightly hacky.

So in practice we'd want one javac/JavaBuilder per host JDK version.

I think the shortest path to supporting additional host JDKs is to migrate the entire toolchain to a remote repo so it can be versioned independently from Bazel. That doesn't solve the problem of actively supporting the old versions (including e.g. back-porting fixes to them), but at least it lets people pin to an old version of the toolchain that matches the host JDK they're using, and to upgrade independently from their Bazel version.

> Besides that, downloading 50MiB is *a lot*. It's really worrisome for CI builds

CI systems don't have to use the remote JDK, as Nicolas said.

It's still not clear to me why a 50MiB download is such a concern, or why it's being weighted so heavily against the experience for Java users. Can you help me understand this part?

> My 2 cents is that we should provide support for both approaches with clear advice/error messages. This would work as follows:
> - Default to a preinstalled jdk if it exists and is "new enough" to use non-VJB
> - Error out if one does not exist and user does not provide any explicit guidance wrt the two options they have:

My concern with this approach is that we might end up erroring out for a large fraction of users. It would be helpful to have data on Bazel users' locally installed JDK versions, but I suspect that a minority have JDK 10 installed.
This is the ancient question of whether 'tis nobler in the mind to Do-What-I-Mean and possibly end up with the wrong result or bail out at the first sign of trouble and ask for explicit instructions. I would prefer the latter (as has the Bazel team in the past).

Jakob Buchgraber

Aug 30, 2018, 4:01:50 AM8/30/18
to Liam Miller-Cushon, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com, ia...@stripe.com
On Thu, Aug 30, 2018 at 8:57 AM Liam Miller-Cushon <cus...@google.com> wrote:
I'm not following the math here. JavaBuilder/javac N can be run on JDKs newer than N, but the latest language level they support is N, so that's not a good experience (i.e. the default toolchain doesn't support JDK 11 today, even though you can use a JDK 11 host_javabase). JavaBuilder/javac N can be made to work on JDK N-1, but that takes additional work and is slightly hacky.

So in practice we'd want one javac/JavaBuilder per host JDK version.

I might have misunderstood you that javac N can also be run on a JDK N-1. When Bazel was on JDK8 we were using a javac9. Did that just happen to work on
JDK8 by accident? 
 
CI systems don't have to use the remote JDK, as Nicolas said.
As I argue below *defaults matter* and I think we should have a solution that works and has good performance in all scenarios.
 
It's still not clear to me why a 50MiB download is such a concern, or why it's being weighted so heavily against the experience for Java users. Can you help me understand this part?
I'd argue any toolchain download is a bad user experience and thus downloads should be as fast (small) as possible. Excessive downloads make Bazel look bad
and even more so if a user is waiting for a JDK to download even though he has one installed locally. It's a surprising and unexpected thing to do for a build tool.
One can make the argument that a user can overwrite the --host_javabase to use the local JDK if he doesn't want to download one (and one can argue vice versa)
but defaults matter *a lot*. It's an extra >50MiB in downloads for no technical reason that can be avoided, and thus I think we should avoid it. There's this joke among Java
developers that Apache Maven is always downloading half the internet, but that joke has some truth to it. I'd like to avoid such popular opinion forming about Bazel.

My concern with this approach is that we might end up erroring out for a large fraction of users. It would be helpful to have data on Bazel users' locally installed JDK versions, but I suspect that a minority have JDK 10 installed.
I share your concern. We could send out a survey to users to gather some data and announce these changes a long time in advance in the hope of getting feedback.

Ian O'Connell

Aug 30, 2018, 8:38:57 AM8/30/18
to Jakob Buchgraber, Liam Miller-Cushon, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com

It's still not clear to me why a 50MiB download is such a concern, or why it's being weighted so heavily against the experience for Java users. Can you help me understand this part?
I'd argue any toolchain download is a bad user experience and thus downloads should be as fast (small) as possible. Excessive downloads make Bazel look bad
and even more so if a user is waiting for a JDK to download even though he has one installed locally. It's a surprising and unexpected thing to do for a build tool.
One can make the argument that a user can overwrite the --host_javabase to use the local JDK if he doesn't want to download one (and one can argue vice versa)
but defaults matter *a lot*. It's an extra >50MiB in downloads for no technical reason that can be avoided, and thus I think we should avoid it. There's this joke among Java
developers that Apache Maven is always downloading half the internet, but that joke has some truth to it. I'd like to avoid such popular opinion forming about Bazel.


Toolchain downloads/remote dependencies are effectively the norm for most users, I'd imagine (at least outside of Google), no? (Also, the Bazel installers themselves are 160MiB, and new Bazel point releases happen more often than JDK bumps. If the Bazel binary gets smaller and the install cost is shifted to those who need parts of it, that feels like it winds up being better for end users?)

Jakob Buchgraber

Aug 30, 2018, 8:44:32 AM8/30/18
to ia...@stripe.com, Liam Miller-Cushon, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
On Thu, Aug 30, 2018 at 2:38 PM Ian O'Connell <ia...@stripe.com> wrote:
Toolchain downloads/remote dependencies are effectively the norm for most users, I'd imagine (at least outside of Google), no?

The point I am trying to make is that remote dependencies should be kept as small as possible (in binary size) and a 50MiB
download that can be avoided should be avoided.

Liam Miller-Cushon

Aug 30, 2018, 11:29:51 AM8/30/18
to Jakob Buchgraber, ia...@stripe.com, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
> I might have misunderstood you that javac N can also be run on a JDK N-1. When Bazel was on JDK8 we were using a javac9. Did that just happen to work on JDK8 by accident? 

The OpenJDK build uses javac N on JDK N-1 during bootstrapping, and we've got a lot of mileage out of using that combination in production, but it requires additional effort to make it work and is a bit of a hack. It's best if we can avoid needing to do that.

As you point out that work was already done for 8/9, so making a snapshot of that available would be an expedient way to preserve support for those versions.

> remote dependencies should be kept as small as possible (in binary size) and a 50MiB download that can be avoided should be avoided
> I think we should have a solution that works and has good performance in all scenarios.

I don't think anyone is going to argue with this on principle, but I'm not sure what the specific proposal is.

I think you have good reasons for wanting to avoid unnecessary downloads, and that I have good reasons for wanting the default toolchain to be modern and featureful and reliable even if it requires downloading some additional bits, and that we're lacking data that would help prioritize.

Your suggestion of a user survey sounds good; perhaps we could also use it to get a sense of how much of a concern the downloads would be.

Liam Miller-Cushon

Aug 30, 2018, 11:30:08 AM8/30/18
to Lukács T. Berki, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, Kevin Bierhoff, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
On Thu, Aug 30, 2018 at 12:23 AM Lukács T. Berki <lbe...@google.com> wrote:
On Thu, Aug 30, 2018 at 9:01 AM, Liam Miller-Cushon <cus...@google.com> wrote:
Taking a step back, what's the deadline for getting this resolved?
Bazel 1.0, I guess?
 
(2) is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.
+1. Let's figure out what to do first, then tweak the embedded JDK when we can do it without harm.

Ok, great! Thanks for your patience with this thread.

How long do we have until 1.0? :)
 

Kevin Bierhoff

Aug 30, 2018, 7:37:14 PM8/30/18
to Liam Miller-Cushon, Lukács T. Berki, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
So for some reason I'm getting confused again.  To keep our eyes on the prize: IIUC, the main goal here is to avoid breakages when users try to use a newer version of Bazel.  It seems to me the key proposal on the table to make that happen is to change the (not-host, aka target) --javabase default to @local_jdk.  Making that change may in itself break some people, but it makes it so the JDK that tests are run on doesn't change when Bazel is updated going forward.  This change appears uncontroversial based on this thread, and I'd vote to make it sooner rather than later, but we could have a discussion around how to minimize breakages from this change alone*.

Now, the host JDK is where it seems various other requirements come in.  Fortunately--or at least interestingly--the set of tools that run on the host JDK is much smaller and better defined than what the target JDK is used for [see the initial email in this thread for the "host" vs. "target" terminology].  Moreover, a lot of those tools are bundled with Bazel at the moment (though that is subject to change).  However, at least one of these tools, JavaBuilder, has pretty tight requirements on what JDKs it can run on.

Another wrinkle, which I think considerably shifts the discussion here, is the fact that Java releases are becoming much more frequent than they used to be.  This not only means that Bazel users may have a need to compile various versions of Java (including very recent versions), but also that it's becoming harder to assume that a Bazel user has a particular JDK version installed.

Additionally there seem to be concerns around download sizes, both for Bazel users that do and users that don't compile Java.

Now, in principle I think that the remote repo approach for the host JDK is viable (leaving download considerations aside).  However, I'm not sure it gets us any closer to avoiding breakages: the default --host_javabase would point to a download of a particular JDK, IIUC, and we'll occasionally want to update the default to point to a newer JDK.  In fact, we'll have to do so in order to support compiling the latest Java versions, by pointing to a JDK of the latest version.  Meaning, we'll want to update this default JDK on a semi-regular basis.  Every time that default is updated, the next Bazel release will have that new default, resulting in a different host JDK being used than with the previous version of Bazel, opening up the possibility of breakages.

Naively, that appears no better or worse than just making @embedded_jdk the default for --host_javabase.  As a bonus, using @embedded_jdk avoids the extra JDK download (at least for the moment, see below).  And fortunately, as I said, we control the bundled tools that Bazel runs on the host JDK, except for user-written genrules or skylark rules running Java-based tools.  For the tools we bundle, we should be able to make sure that they work on the default host JDK, whatever that may be: again it seems we can test for that equally well whether the default host JDK is @embedded_jdk or some remote repo.  Other tools (used in genrules or skylark rules) can still break with host JDK updates, but I'd argue that that's a far rarer problem than the target JDK changing, and, again, I think I'm missing how using a remote repo mitigates this potential issue.  Am I missing something?

Finally, the size consideration.  It was stated that currently, the @embedded_jdk is still a full JDK, not minimized.  So for the moment, @embedded_jdk seems fully capable of running host tools.  This, again, suggests that for the moment we can just keep using @embedded_jdk for host tools.  I feel (and agree with previous statements to that effect) we can decide later how to accomplish the desired @embedded_jdk minimization: if we do decide on a remote repo approach we can at that point change the --host_javabase default to point to a remote repo containing the same JDK version as @embedded_jdk at that time, effectively as a no-op (except for the extra download).

The takeaway from all this is that I'd think we want to change --javabase's default soon, understanding that doing so may incur one-time breakages*.  I propose holding off on any changes to --host_javabase's default until the --javabase change has baked, with the understanding that we'll have a separate discussion around how we can minimize @embedded_jdk, which may introduce the need for a remote repo.

* If we're worried about breakages from changing the --javabase default, I wonder if we can play some tricks.  For instance, we could have a fallback behavior that downloads a JDK (version 8?) from a remote repo if no locally installed JDK can be found.
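A minimal sketch of the fallback idea in this footnote, written in Python rather than as Bazel code. It assumes a POSIX-style JDK layout (`<home>/bin/javac`), and the label `@remote_jdk8_fallback` is a hypothetical name invented for illustration, not an existing Bazel repository. The lookups are injectable so the logic can be exercised without a real JDK.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Hypothetical label for a downloadable JDK 8; not a real Bazel repository name.
REMOTE_JDK_FALLBACK = "@remote_jdk8_fallback"


def find_local_jdk(
    env: Mapping[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
    is_file: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Return the home directory of a locally installed JDK, or None."""
    # First choice: JAVA_HOME, but only if it actually contains a compiler.
    java_home = env.get("JAVA_HOME")
    if java_home and is_file(os.path.join(java_home, "bin", "javac")):
        return java_home
    # Second choice: a javac found on PATH; strip the trailing bin/javac.
    javac = which("javac")
    if javac:
        return os.path.dirname(os.path.dirname(os.path.realpath(javac)))
    return None


def default_javabase(
    env: Mapping[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
    is_file: Callable[[str], bool] = os.path.isfile,
) -> str:
    """Prefer the local JDK; fall back to a remote one only if none is found."""
    return "@local_jdk" if find_local_jdk(env, which, is_file) else REMOTE_JDK_FALLBACK
```

With this shape, users who already have a JDK installed never trigger the download, which is the point of the footnote: the fallback only softens the one-time breakage for machines without a JDK.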
--
Kevin Bierhoff
Google

Lukács T. Berki

Aug 31, 2018, 4:48:10 AM8/31/18
to Kevin Bierhoff, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
I think there is consensus that --javabase should point to @local_jdk.

For --host_javabase, what I'd like to avoid is coupling the JDK Bazel runs under to the JDK host tools run under, that is, --host_javabase defaulting to @embedded_jdk in any form. This is both to make it possible to minimize the embedded JDK and to be able to update it while being sure that we won't break anything. This leaves us two options: a local JDK and a remote repository.

Based on what I heard, I'm leaning towards the remote repository option -- most of its cost ("mandatory" download, size, need to maintain mirror.bazel.build) will be incurred due to other reasons anyway, and Bazel versions packaged with e.g. Homebrew/Debian can always default to a JDK in another package.

(Another point: it's not true that only our tools run under --host_javabase. It's any tool that runs during the build and is written in Java)

Kevin Bierhoff

Aug 31, 2018, 3:10:21 PM8/31/18
to Lukács T. Berki, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
On Fri, Aug 31, 2018 at 1:48 AM Lukács T. Berki <lbe...@google.com> wrote:
I think there is consensus that --javabase should point to @local_jdk.

For --host_javabase, what I'd like to avoid is coupling the JDK Bazel runs under to the JDK host tools run under, that is, --host_javabase defaulting to @embedded_jdk in any form. This is both to make it possible to minimize the embedded JDK and to be able to update it while being sure that we won't break anything. This leaves us two options: a local JDK and a remote repository.

Based on what I heard, I'm leaning towards the remote repository option -- most of its cost ("mandatory" download, size, need to maintain mirror.bazel.build) will be incurred due to other reasons anyway, and Bazel versions packaged with e.g. Homebrew/Debian can always default to a JDK in another package.

 
Again this is where I'm confused.  Avoiding the embedded JDK seems to just shift the breakage from one change (embedded JDK update) to a different change (updating the default remote repo --host_javabase points to, updating the package Debian points to etc.).  Am I missing something?
 
(Another point: it's not true that only our tools run under --host_javabase. It's any tool that runs during the build and is written in Java)

Apologies, I'm aware of this and discussed this point somewhere in my last email.  It is only because of this issue that breakages due to --host_javabase changes are a concern at all (assuming sufficient testing of bundled tools), IIUC.  I still make the claim that this is far less common or likely to be an issue, especially for users who aren't doing Java compilation.  Are there non-packaged, non Android SDK, Java-based tools you are aware of that are commonly used?  This question was asked before and it didn't seem like there was a large surface here to worry about.  Did I get the wrong impression?  Another question would be whether we think the bigger issue here is user-written genrules that happen to invoke a Java program, or Skylark rules that use Java-based tools?  Just curious what we know about the landscape of host-side Java tools outside Java and Android compilation.


--
Kevin Bierhoff
Google

Lukács T. Berki

Sep 3, 2018, 5:51:11 AM9/3/18
to Kevin Bierhoff, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
On Fri, Aug 31, 2018 at 9:10 PM, Kevin Bierhoff <k...@google.com> wrote:
On Fri, Aug 31, 2018 at 1:48 AM Lukács T. Berki <lbe...@google.com> wrote:
I think there is consensus that --javabase should point to @local_jdk.

For --host_javabase, what I'd like to avoid is coupling the JDK Bazel runs under to the JDK host tools run under, that is, --host_javabase defaulting to @embedded_jdk in any form. This is both to make it possible to minimize the embedded JDK and to be able to update it while being sure that we won't break anything. This leaves us two options: a local JDK and a remote repository.

Based on what I heard, I'm leaning towards the remote repository option -- most of its cost ("mandatory" download, size, need to maintain mirror.bazel.build) will be incurred due to other reasons anyway, and Bazel versions packaged with e.g. Homebrew/Debian can always default to a JDK in another package.

 
Again this is where I'm confused.  Avoiding the embedded JDK seems to just shift the breakage from one change (embedded JDK update) to a different change (updating the default remote repo --host_javabase points to, updating the package Debian points to etc.).  Am I missing something?
Correct, but it makes the embedded JDK and the host JDK independent, which is a win.
 
 
(Another point: it's not true that only our tools run under --host_javabase. It's any tool that runs during the build and is written in Java)

Apologies, I'm aware of this and discussed this point somewhere in my last email.  It is only because of this issue that breakages due to --host_javabase changes are a concern at all (assuming sufficient testing of bundled tools), IIUC.  I still make the claim that this is far less common or likely to be an issue, especially for users who aren't doing Java compilation.  Are there non-packaged, non Android SDK, Java-based tools you are aware of that are commonly used?  This question was asked before and it didn't seem like there was a large surface here to worry about.  Did I get the wrong impression?  Another question would be whether we think the bigger issue here is user-written genrules that happen to invoke a Java program, or Skylark rules that use Java-based tools?  Just curious what we know about the landscape of host-side Java tools outside Java and Android compilation.
I don't know, since we have far less visibility into what people are doing with Bazel than with Blaze. For me, the most convincing arguments for *not* using the embedded JDK as the host javabase are:
  1. Independence of Bazel itself from the tools it runs locally
  2. The fact that we either package a full JDK (thus making Bazel a very heavyweight tool) or we'd eventually get surprises about the parts of the JDK that we deemed not important enough to include, yet people need them.
Since Bazel is a base tool, it should not make surprising choices. And omitting parts of a JDK definitely counts as surprising.
 

Jakob Buchgraber

Sep 3, 2018, 7:40:13 AM9/3/18
to Liam Miller-Cushon, ia...@stripe.com, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
On Thu, Aug 30, 2018 at 5:29 PM Liam Miller-Cushon <cus...@google.com> wrote:
I don't think anyone is going to argue with this on principle, but I'm not sure what the specific proposal is.

The specific proposal for the RHS --host_javabase is:
 * Require the user to install a local JDK or bring his own JDK as a remote repository
 * Bazel defines a list of supported major JDK versions and fails with a decent error message
   if the user provided JDK is not in that list.
 * Bazel automatically downloads a JavaBuilder from a remote repository that works with the
   user provided JDK. We have a working JavaBuilder for all supported JDKs. No more VanillaJavaBuilder.
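The three points above could be sketched roughly as follows. This is a Python illustration of the decision logic only, not Bazel code: the set of supported major versions and the `@javabuilder_jdk<N>` naming scheme are assumptions made up for the example, since the thread does not fix either.

```python
# Assumption: an illustrative list of supported major versions.
SUPPORTED_MAJOR_VERSIONS = (8, 9, 10, 11)


class UnsupportedJdkError(Exception):
    """Raised when the user-provided JDK is not in the supported list."""


def javabuilder_for(major: int) -> str:
    """Return the (hypothetical) JavaBuilder repository for a JDK major version.

    Fails with an actionable message instead of silently switching to a
    less capable builder.
    """
    if major not in SUPPORTED_MAJOR_VERSIONS:
        supported = ", ".join(str(v) for v in SUPPORTED_MAJOR_VERSIONS)
        raise UnsupportedJdkError(
            f"JDK {major} is not supported for --host_javabase; "
            f"supported major versions: {supported}"
        )
    # Hypothetical naming scheme: one JavaBuilder repository per major version.
    return f"@javabuilder_jdk{major}"
```

For example, `javabuilder_for(11)` yields `@javabuilder_jdk11`, while `javabuilder_for(6)` raises an error that names the supported versions.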

... I have good reasons for wanting the default toolchain to be modern and featureful and reliable even if it requires downloading some additional bits, and that we're lacking data that would help prioritize.

So we both want the same things but disagree on how to get there :-). I am simply arguing that we should put in the engineering effort (Note: I am not saying that you should be the one doing this work (unless you want to ofc))
to make your statement true without having to download a separate JDK for the --host_javabase.

Liam Miller-Cushon

Sep 4, 2018, 1:00:32 PM9/4/18
to Jakob Buchgraber, ia...@stripe.com, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
On Mon, Sep 3, 2018 at 7:40 AM Jakob Buchgraber <buc...@google.com> wrote:
The specific proposal for the RHS --host_javabase is:
 * Require the user to install a local JDK or bring his own JDK as a remote repository
 * Bazel defines a list of supported major JDK versions and fails with a decent error message
   if the user provided JDK is not in that list.
 * Bazel automatically downloads a JavaBuilder from a remote repository that works with the
   user provided JDK. We have a working JavaBuilder for all supported JDKs. No more VanillaJavaBuilder.

That suite of JavaBuilders is likely to be substantively different depending on which JDK they need to be compatible with, so this suffers from some of the issues I raised about automatically switching between VJB and non-VJB depending on host JDK version.

Requiring explicitly selecting a toolchain version that matches the local JDK would take some of that magic away. 

Liam Miller-Cushon

Sep 4, 2018, 1:00:38 PM9/4/18
to Lukács T. Berki, Kevin Bierhoff, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
On Mon, Sep 3, 2018 at 5:51 AM Lukács T. Berki <lbe...@google.com> wrote:
Again this is where I'm confused.  Avoiding the embedded JDK seems to just shift the breakage from one change (embedded JDK update) to a different change (updating the default remote repo --host_javabase points to, updating the package Debian points to etc.).  Am I missing something?
Correct, but it makes the embedded JDK and the host JDK independent, which is a win.

That decoupling seems like a big advantage: as long as the interface between Bazel and the toolchain is relatively stable (including e.g. JavaBuilder flags) then this allows upgrading to a newer Bazel release independently from upgrading to the latest Java toolchain. That would make it easier to absorb breaking changes on either side, and should avoid getting stuck on the previous version of Bazel due to a toolchain change.

Jakob Buchgraber

Sep 4, 2018, 1:29:26 PM9/4/18
to Liam Miller-Cushon, ia...@stripe.com, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
On Tue, Sep 4, 2018 at 7:00 PM Liam Miller-Cushon <cus...@google.com> wrote:
Requiring explicitly selecting a toolchain version that matches the local JDK would take some of that magic away. 

I imagine we should be able to auto detect the Java version in the local_jdk?
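One way such auto-detection might look, sketched in Python. It assumes the JDK home contains a `release` file with a `JAVA_VERSION="..."` line, as OpenJDK builds typically do; whether Bazel would detect the version this way is not stated in the thread. The function names are hypothetical.

```python
import re
from typing import Optional


def major_version(version: str) -> int:
    """Map a Java version string to its major version.

    Handles both the legacy scheme ("1.8.0_181" is Java 8) and the
    newer scheme ("11.0.1" is Java 11, "10" is Java 10).
    """
    parts = re.split(r"[._+-]", version.strip())
    first = int(parts[0])
    if first == 1 and len(parts) > 1:
        # Legacy "1.x" numbering: the second component is the major version.
        return int(parts[1])
    return first


def major_version_from_release(release_text: str) -> Optional[int]:
    """Extract the major version from the text of a JDK `release` file."""
    match = re.search(r'^JAVA_VERSION="([^"]+)"', release_text, re.MULTILINE)
    return major_version(match.group(1)) if match else None
```

Detection only answers which JDK is present; it does not address the concern raised in the replies about builders for different JDKs behaving differently.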

Liam Miller-Cushon

Sep 4, 2018, 1:38:04 PM9/4/18
to Jakob Buchgraber, ia...@stripe.com, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
On Tue, Sep 4, 2018 at 1:29 PM Jakob Buchgraber <buc...@google.com> wrote:
On Tue, Sep 4, 2018 at 7:00 PM Liam Miller-Cushon <cus...@google.com> wrote:
That suite of JavaBuilders is likely to be substantively different depending on which JDK they need to be compatible with, so this suffers from some of the issues I raised about automatically switching between VJB and non-VJB depending on host JDK version.  
 
Requiring explicitly selecting a toolchain version that matches the local JDK would take some of that magic away. 

I imagine we should be able to auto detect the Java version in the local_jdk?

It would be difficult to guarantee that a JavaBuilder that runs on, say, JDK 6 is bug and feature-compatible with one that runs on JDK 11.

Automagically switching between the two for the same build when running on different systems with different local JDKs may not provide a good experience.

Kevin Bierhoff

Sep 4, 2018, 4:02:08 PM9/4/18
to Liam Miller-Cushon, Lukács T. Berki, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
Even if embedded JDK and host JDK/Java toolchain are decoupled there'll be some Bazel release that upgrades the Java toolchain, which could break users.  The only difference seems to be that a given Bazel release can upgrade embedded JDK without Java toolchain upgrade (or vice versa), but every Java toolchain upgrade still seems to invite breakages?

I guess my larger point here is that, given the current tool bundling etc., it seems like we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial (e.g., we don't want to upgrade the embedded JDK for some reason but need to upgrade host JDK, e.g., to support a new Java version; or we want to minify the embedded JDK).  Until that time it seems we can avoid extra downloads (and breakages!) by keeping them together, seemingly at no cost.  Put another way I'm not advocating keeping embedded==host forever, but it does seem we can wait on separating them, so let's do that and focus on the target JDK separation first.


Jakob Buchgraber

Sep 4, 2018, 4:22:24 PM9/4/18
to Liam Miller-Cushon, ia...@stripe.com, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
On Tue, Sep 4, 2018 at 7:38 PM Liam Miller-Cushon <cus...@google.com> wrote:
It would be difficult to guarantee that a JavaBuilder that runs on, say, JDK 6 is bug and feature-compatible with one that runs on JDK 11.

I can see your point, but do you think that this will be an issue we can't handle? I imagine a JavaBuilder for the latest JDK N will stay current
for at least six months and any bugs reported in that timeframe will get fixed. Once we support JDK N+1, the JavaBuilder for JDK N should be
rather stable and for critical bugs we can still do updates to it, although I would expect this to be the exception, no?
 
Automagically switching between the two for the same build when running on different systems with different local JDKs may not provide a good experience.

I am not entirely sure what you mean by the "same build" on two different systems with different JDKs.
I'll assume you mean the same codebase and source state: I don't see how this is a problem in that
different systems with different JDKs are not expected to behave the same - even today.


Jakob Buchgraber

Sep 4, 2018, 4:24:26 PM9/4/18
to Kevin Bierhoff, Liam Miller-Cushon, Lukács T. Berki, Nicolas Lopez, Irina Iancu, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
On Tue, Sep 4, 2018 at 10:02 PM Kevin Bierhoff <k...@google.com> wrote:
... we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial ...

This situation exists already, in that for non-Java users this would allow us to significantly reduce the Bazel
binary size by building a minimal JDK using jlink. See https://docs.google.com/document/d/1Igmv-2GfXkoVFWTXvBYPeniQom8nLAwzqzridDlBIS4/edit

Kevin Bierhoff

Sep 4, 2018, 5:13:30 PM9/4/18
to Jakob Buchgraber, Liam Miller-Cushon, Lukács T. Berki, Nicolas Lopez, Irina Iancu, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
Apologies, I didn't realize you were imminently planning/hoping to do the size minimization.  Is there a discussion somewhere where we can talk about how best to achieve that, or is this (and the doc linked above) that discussion?  We extensively discussed requirements for minimization above as well, and none of them to me necessitated an embedded/host split either, so my sense is that I'm missing something there as well.  We can certainly continue that discussion in this email thread or further discuss it in a separate thread, whatever you prefer.  (I was hoping we could defer that part, since it's complex for its own reasons, but I guess that won't do it.)

Liam Miller-Cushon

Sep 4, 2018, 7:27:27 PM9/4/18
to Jakob Buchgraber, ia...@stripe.com, Nicolas Lopez, Lukács T. Berki, Irina Iancu, Kevin Bierhoff, bazel-...@googlegroups.com
On Tue, Sep 4, 2018 at 4:22 PM Jakob Buchgraber <buc...@google.com> wrote:
On Tue, Sep 4, 2018 at 7:38 PM Liam Miller-Cushon <cus...@google.com> wrote:
It would be difficult to guarantee that a JavaBuilder that runs on, say, JDK 6 is bug and feature-compatible with one that runs on JDK 11.

I can see your point, but do you think that this will be an issue we can't handle? I imagine a JavaBuilder for the latest JDK N will stay current
for at least six months and any bugs reported in that timeframe will get fixed. Once we support JDK N+1, the JavaBuilder for JDK N should be
rather stable and for critical bugs we can still do updates to it, although I would expect this to be the exception, no?

They'll probably be stable, but they won't be bug and feature-compatible with the latest version unless we stop fixing bugs or adding new features, or someone spends time back-porting all of those fixes and features. I think that time would be better spent making it easy to use a modern host JDK.
 
Automagically switching between the two for the same build when running on different systems with different local JDKs may not provide a good experience.

I am not entirely sure what you mean by the "same build" on two different systems with different JDKs.
I'll assume you mean the same codebase and source state: I don't see how this is a problem in that
different systems with different JDKs are not expected to behave the same - even today.

Maybe "build of the same version of the same repo". Using whatever the local JDK happens to be makes the build less hermetic and reproducible than using a known version (e.g. from a remote repo or the embedded JDK).

Liam Miller-Cushon

Sep 4, 2018, 7:27:44 PM9/4/18
to Kevin Bierhoff, Lukács T. Berki, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
On Tue, Sep 4, 2018 at 4:02 PM 'Kevin Bierhoff' via Bazel/JVM Special Interest Group <bazel-...@googlegroups.com> wrote:
I guess my larger point here is that, given the current tool bundling etc., it seems like we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial (e.g., we don't want to upgrade the embedded JDK for some reason but need to upgrade host JDK, e.g., to support a new Java version; or we want to minify the embedded JDK).  Until that time it seems we can avoid extra downloads (and breakages!) by keeping them together, seemingly at no cost.  Put another way I'm not advocating keeping embedded==host forever, but it does seem we can wait on separating them, so let's do that and focus on the target JDK separation first.

I think we agreed earlier in the thread to keep using the embedded JDK as the default host_javabase for now, and revisit minification some time before Bazel 1.0:

On Thu, Aug 30, 2018 at 3:23 AM Lukács T. Berki <lbe...@google.com> wrote:
On Thu, Aug 30, 2018 at 9:01 AM, Liam Miller-Cushon <cus...@google.com> wrote:
Taking a step back, what's the deadline for getting this resolved?
Bazel 1.0, I guess?
[minimizing the embedded JDK] is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.

Liam Miller-Cushon

Sep 7, 2018, 2:34:17 PM9/7/18
to Kevin Bierhoff, Lukács T. Berki, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
The highest-priority and least contentious part of this seems to be that defaulting --javabase to the embedded JDK if we can't find a local_jdk is a bad idea.

Lukács T. Berki

Sep 10, 2018, 4:32:27 AM9/10/18
to Liam Miller-Cushon, Kevin Bierhoff, Nicolas Lopez, Irina Iancu, Jakob Buchgraber, Bazel/JVM Special Interest Group, ia...@stripe.com, Klaus Aehlig
Thanks! That indeed seems to be the most accepted part of the proposal.

On Fri, Sep 7, 2018 at 8:34 PM, Liam Miller-Cushon <cus...@google.com> wrote:
The highest-priority and least contentious part of this seems to be that defaulting --javabase to the embedded JDK if we can't find a local_jdk is a bad idea.

On Tue, Sep 4, 2018 at 4:27 PM Liam Miller-Cushon <cus...@google.com> wrote:
On Tue, Sep 4, 2018 at 4:02 PM 'Kevin Bierhoff' via Bazel/JVM Special Interest Group <bazel-sig-jvm@googlegroups.com> wrote:
I guess my larger point here is that, given the current tool bundling etc., it seems like we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial (e.g., we don't want to upgrade the embedded JDK for some reason but need to upgrade host JDK, e.g., to support a new Java version; or we want to minify the embedded JDK).  Until that time it seems we can avoid extra downloads (and breakages!) by keeping them together, seemingly at no cost.  Put another way I'm not advocating keeping embedded==host forever, but it does seem we can wait on separating them, so let's do that and focus on the target JDK separation first.

I think we agreed earlier in the thread to keep using the embedded JDK as the default host_javabase for now, and revisit minification some time before Bazel 1.0:

On Thu, Aug 30, 2018 at 3:23 AM Lukács T. Berki <lbe...@google.com> wrote:
On Thu, Aug 30, 2018 at 9:01 AM, Liam Miller-Cushon <cushon@google.com> wrote:
Taking a step back, what's the deadline for getting this resolved?
Bazel 1.0, I guess?
[minimizing the embedded JDK] is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.
+1. Let's figure out what to do first, then tweak the embedded JDK when we can do it without harm.

Lukács T. Berki

Sep 20, 2018, 8:27:36 AM9/20/18
to Kevin Bierhoff, Jakob Buchgraber, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, Bazel/JVM Special Interest Group, ia...@stripe.com, Klaus Aehlig
Given that the embedded/host JDK separation will need to be done eventually and it makes the minimization of the embedded JDK much easier, I'd prefer to start with it as soon as we can. It will take a while to be rolled out anyway... 

Kevin Bierhoff

Sep 20, 2018, 8:54:18 PM9/20/18
to Lukács T. Berki, Jakob Buchgraber, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
Again, in my opinion we haven't sufficiently established that the separation is necessary in order to minimize the embedded JDK, so let's please not do anything that we can't take back before we know it's needed.  Additionally, if such a separation was needed I think the window between separation and minimization should be as short as possible, ideally within the same Bazel release, to avoid Java users of Bazel downloading a full host JDK in addition to a full embedded JDK and a full local JDK.  Meaning, we should only start on this when we're ready to finish it.

My preference, based on the writeup for the minimization, would be to go forward on the other Bazel minimization opportunities identified in the doc.  IIUC, the diff between using an embedded JDK with all modules vs. an embedded JDK with minimal modules is 30 megs, but the other opportunities identified already give a bunch of savings.  So let's reap those other savings while discussing how best to remove modules from the embedded JDK, as a final step, if such a sequencing is possible.
--
Kevin Bierhoff
Google

Ulf Adams

Sep 21, 2018, 3:22:31 AM9/21/18
to Kevin Bierhoff, Lukacs Berki, Jakob Buchgraber, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, bazel-...@googlegroups.com, Ian O'Connell, Klaus Aehlig
On Fri, Sep 21, 2018 at 2:54 AM 'Kevin Bierhoff' via Bazel/JVM Special Interest Group <bazel-...@googlegroups.com> wrote:
Again, in my opinion we haven't sufficiently established that the separation is necessary in order to minimize the embedded JDK, so let's please not do anything that we can't take back before we know it's needed.  Additionally, if such a separation was needed I think the window between separation and minimization should be as short as possible, ideally within the same Bazel release, to avoid Java users of Bazel downloading a full host JDK in addition to a full embedded JDK and a full local JDK.  Meaning, we should only start on this when we're ready to finish it.

Hi Kevin,

There are multiple reasons for decoupling the Bazel Jdk and the rules Jdk:
1. decouple the Java rules from Bazel, especially:
1.1 to decouple Bazel releases from Java rule Jdk upgrades
1.2 to decouple user's Jdk upgrades from Bazel upgrades
2. minimize surprises for users of the Java rules - Java has a decent backwards compatibility story, but given a sufficiently large user base (which we're hoping for even if we don't have it yet), Jdk changes generally break Java code, and Java code is run as part of the compiler due to annotation processing
3. better support remote execution out of the box
4. make it possible to use different, incompatible setups for Bazel (e.g., Java-to-machine compilation, Jdks other than OpenJdk, or even a different language than Java)

If you have a proposal that allows us to achieve all this without decoupling, we'd be interested to hear it, but ultimately, the decision of whether to make the built-in Jdk available to rules rests with the Bazel team, and it's not something we want to support in the medium term.

My preference, based on the writeup for the minimization, would be to go forward on the other Bazel minimization opportunities identified in the doc.  IIUC, the diff between using an embedded JDK with all modules vs. an embedded JDK with minimal modules is 30 megs, but the other opportunities identified already give a bunch of savings.  So let's reap those other savings while discussing how best to remove modules from the embedded JDK, as a final step, if such a sequencing is possible.

There's no reason not to work on these things in parallel. IIRC, the bundled Android tools are the biggest contributor to binary size, and I believe that's something the Android rules team is already working on. But certainly, minimizing Bazel binary size is not the only reason for not making the built-in Jdk available to rules.

Thanks,

-- Ulf
 

--
You received this message because you are subscribed to the Google Groups "Bazel/JVM Special Interest Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-sig-jv...@googlegroups.com.
To post to this group, send email to bazel-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-sig-jvm/CABdRVUY8mQb6O7jmShcYWqRBOj-skji2jeSaVDH-jAeQuTvA_g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Lukács T. Berki

Sep 21, 2018, 3:59:19 AM9/21/18
to Ulf Adams, Kevin Bierhoff, Jakob Buchgraber, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, Bazel/JVM Special Interest Group, Ian O'Connell, Klaus Aehlig
Some data about the unpacked Bazel binary (altogether 300 MB):
  • JDK: 152MB
  • Server jar: 44MB
  • tools: 51MB
    • tools/jdk: 43MB
In other words: the JDK is half of the Bazel binary, with the code of the server and our Java tooling taking another 15% each.
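As a quick check of the percentages above, the shares can be recomputed from the listed figures. The snippet below is only arithmetic on the numbers in this message; the variable and function names are made up for the example.

```python
# Unpacked sizes in MB, as reported in the message above.
TOTAL_MB = 300
COMPONENTS_MB = {"JDK": 152, "server jar": 44, "tools/jdk": 43}


def share_percent(size_mb: float, total_mb: float = TOTAL_MB) -> float:
    """Return a component's share of the total, in percent, to one decimal."""
    return round(100 * size_mb / total_mb, 1)


shares = {name: share_percent(size) for name, size in COMPONENTS_MB.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

This gives roughly 50.7% for the JDK, 14.7% for the server jar, and 14.3% for tools/jdk, consistent with "half" and "another 15% each".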



Lukács T. Berki

Sep 21, 2018, 8:37:22 AM9/21/18
to Ulf Adams, Kevin Bierhoff, Jakob Buchgraber, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, Bazel/JVM Special Interest Group, Ian O'Connell, Klaus Aehlig
My educated co-worker Jakob just mentioned that these numbers are misleading because these things compress differently. Here are the compressed sizes (the Bazel binary is 173MB at HEAD):
  • JDK: 54 MB
  • tools/jdk: 34 MB
  • server jar: 41 MB
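To make the "compress differently" point concrete, the compressed figures can be set against the unpacked figures from the previous message. This is plain arithmetic on the numbers quoted in the thread; the names are illustrative.

```python
# Sizes in MB. Unpacked figures come from the previous message in the thread,
# compressed figures from this one.
BINARY_MB = 173
UNPACKED_MB = {"JDK": 152, "tools/jdk": 43, "server jar": 44}
COMPRESSED_MB = {"JDK": 54, "tools/jdk": 34, "server jar": 41}


def compression_ratio(name: str) -> float:
    """Compressed size divided by unpacked size (lower means it compresses better)."""
    return round(COMPRESSED_MB[name] / UNPACKED_MB[name], 2)


def compressed_share(name: str) -> float:
    """Share of the compressed Bazel binary, in percent, to one decimal."""
    return round(100 * COMPRESSED_MB[name] / BINARY_MB, 1)


for name in COMPRESSED_MB:
    print(f"{name}: ratio {compression_ratio(name)}, share {compressed_share(name)}%")
```

The JDK shrinks to about 0.36 of its unpacked size, while tools/jdk (0.79) and the server jar (0.93) barely shrink, so the JDK's share of the compressed binary is about 31% rather than half.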


Tobias Werth

Sep 27, 2018, 10:33:45 AM9/27/18
to Lukács T. Berki, Ulf Adams, Kevin Bierhoff, Jakob Buchgraber, Liam Miller-Cushon, Nicolas Lopez, Irina Iancu, bazel-...@googlegroups.com, ia...@stripe.com, Klaus Aehlig
Please have a look at my doc https://docs.google.com/document/d/11OGJ2QTjnZgxbkgVFrKa3eD2DQ68qnXws7UQ1czc-qQ/edit?ts=5bab8e2e# where I propose to hide the embedded JDK.




--
Lukács T. Berki | Software Engineer | lbe...@google.com | 

Google Germany GmbH | Erika-Mann-Str. 33  | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891


