- We use the regular JavaBuilder if that JDK matches whatever JDK it expects, otherwise, we use VanillaJavaBuilder. If you want to make use of all the nifty features of JavaBuilder, the local JDK has to be of the right major version.
--
You received this message because you are subscribed to the Google Groups "Bazel/JVM Special Interest Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-sig-jvm+unsubscribe@googlegroups.com.
To post to this group, send email to bazel-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-sig-jvm/CAGQ4vn2oXBL%3DM48%3Dno0xX8r9MnqfdiNjJWpUGWGL%2BowTSN0EYw%40mail.gmail.com.
On Thu, Aug 23, 2018 at 9:40 AM Lukács T. Berki <lbe...@google.com> wrote:
- We use the regular JavaBuilder if that JDK matches whatever JDK it expects, otherwise, we use VanillaJavaBuilder. If you want to make use of all the nifty features of JavaBuilder, the local JDK has to be of the right major version.
I would have been able to get behind this statement before the OpenJDK project switched to a 6-month release schedule. Given this new reality, I think a Bazel release should also fully support (with a proper JavaBuilder) the Java releases of the previous 12-24 months.
One request: as part of removing the embedded JDK from use in rules, could we see whether we can have a replacement set of hermetic JDK rules? local_jdk isn't always stable (hash code/files) across machines; on OS X at least we've run into this with JDK 8. This will break remote caching when it comes up: https://github.com/bazelbuild/bazel/issues/4769
Does "proper" mean non-VanillaJavaBuilder, or something else?
One option would be to fetch the --host_javabase or --javabase from a remote repo, which would make it hermetic and wouldn't increase the Bazel distribution size.
On Thu, Aug 23, 2018 at 4:59 PM Liam Miller-Cushon <cus...@google.com> wrote:
Does "proper" mean non-VanillaJavaBuilder, or something else?
Yes, JavaBuilder. I think ideally we would not have the VanillaJavaBuilder anymore.
One option would be to fetch the --host_javabase or --javabase from a remote repo, which would make it hermetic and wouldn't increase the Bazel distribution size.
Agreed that it would be nice, but I don't think that this should be the default. I think users of remote caching / execution should simply be able to add the JDK of their choice as a remote repository that they host themselves. I don't think Bazel should be in the business of providing JDKs to users. This should work right now already (correct me if I am wrong, Liam) and might just need some documentation.
Maybe this is obvious to everyone else, but why not use the @embedded_jdk for JavaBuilder by default? Seems like that would resolve a lot of problems mentioned here. I think we'd also rather avoid maintaining 4 versions of JavaBuilder, to support "recent" Java versions.
On Thu, Aug 23, 2018 at 6:25 PM Liam Miller-Cushon <cus...@google.com> wrote:
It sounds like it's already possible, I just wasn't that familiar with remote repos. I agree re: providing JDKs, but it might be helpful to provide the BUILD files and remote repo configuration to use e.g. zulu.
On Thu, Aug 23, 2018 at 9:03 AM Kevin Bierhoff <k...@google.com> wrote:
Maybe this is obvious to everyone else, but why not use the @embedded_jdk for JavaBuilder by default? Seems like that would resolve a lot of problems mentioned here. I think we'd also rather avoid maintaining 4 versions of JavaBuilder, to support "recent" Java versions.
If the embedded JDK is used only as a server_javabase and not a host_javabase, then it can be a minimal image that only contains the modules Bazel needs, which will make the distribution size significantly smaller.
I still don't think I'm following. I'm not suggesting embedded_jdk==host_jdk. I'm suggesting using embedded_jdk for JavaBuilder in particular, by default. I understand that JavaBuilder may require additional modules, but the set of those modules needed for JavaBuilder in particular would be known and could be included. Do we know how much larger embedded_jdk would have to be to accommodate JavaBuilder? Can we count that low?
wrt remote caching: the local_jdk is a problem for remote caching (not just mac). And I don't think it would be wise to ignore files when computing the cache as is mentioned in https://github.com/bazelbuild/bazel/issues/4769 (i.e., it could lead to cache poisoning if any of the ignored files actually has an impact on build outputs). I think the only solution for proper remote caching is to have fully hermetic builds. For java, this would likely mean the best results (wrt caching) will only be obtained if projects download a jdk in a workspace rule (i.e., the comment above about zulu configs) and use that instead of local_jdk via javabase flags.
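To make the cache-poisoning concern concrete, here is a minimal sketch in Python. It is not how Bazel computes action keys; `action_key`, the file names, and the file contents are all made up for illustration. It only shows that once a file is left out of the digest, two machines whose JDKs differ in that file share a cache key.

```python
import hashlib
from typing import AbstractSet, Mapping


def action_key(jdk_files: Mapping[str, bytes],
               ignored: AbstractSet[str] = frozenset()) -> str:
    """Hypothetical cache key: a digest over every (path, content) pair of the JDK."""
    h = hashlib.sha256()
    for path in sorted(jdk_files):
        if path in ignored:
            continue  # an ignored file no longer influences the key
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(hashlib.sha256(jdk_files[path]).digest())
    return h.hexdigest()


# Two made-up local JDK installs that differ in a single file.
machine_a = {"bin/javac": b"javac-build-1", "lib/tzdb.dat": b"tzdata-2018c"}
machine_b = {"bin/javac": b"javac-build-1", "lib/tzdb.dat": b"tzdata-2018e"}

# With every file hashed, the machines never share cache entries.
print(action_key(machine_a) == action_key(machine_b))

# With the differing file ignored, they do share entries, which is only
# safe if that file can never affect a build output.
ignored = frozenset({"lib/tzdb.dat"})
print(action_key(machine_a, ignored) == action_key(machine_b, ignored))
```

A JDK fetched by a workspace rule avoids the trade-off altogether, because every machine hashes the same bytes.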
3 thoughts:
1. IIRC the current docs (via bazel-toolchains) actually push users of RBE to use a locally preinstalled Java and not a checked-in one.
2. In your suggestion, do you mean that @local_jdk will be reserved (by intention at least) only as the default for host_javabase and javabase?
3. If the answer to 2 is yes, can you give an example of what rules (for example rules_scala) need to use? I think we currently use @local_jdk (haven't verified).
6 MiB seems worth it, no? I see the point about annotation processors, but I would also expect most common annotation processors not to go outside the JDK modules you'll need to include in @embedded_jdk anyway.
As for what JDK to use for the host and target javabases, I think the least surprising default choice is @local_jdk, because that's what all other build tools use. It's also conceptually simple to say "if you want to compile Java, you need a JDK, and Bazel tries to find it, but if it cannot, here's how you tell it where it is".
In a way, we *must* be in the business of distributing JDKs because the remote execution workers must have a JDK somehow. If I understand Nick correctly, he's saying that it's currently distributed with the machine images, but it's still under our control. I think eventually we want to have an easy way to use a hermetic JDK downloaded from somewhere, but that's not very urgent, because local builds are covered by @local_jdk and the ability to point Bazel to a local JDK at an arbitrary location and remote builds are covered by the JDK installed on the workers.
On Fri, Aug 24, 2018 at 10:34 AM, Jakob Buchgraber <buc...@google.com> wrote:
On Fri, Aug 24, 2018 at 9:22 AM Lukács T. Berki <lbe...@google.com> wrote:
As for what JDK to use for the host and target javabases, I think the least surprising default choice is @local_jdk, because that's what all other build tools use. It's also conceptually simple to say "if you want to compile Java, you need a JDK, and Bazel tries to find it, but if it cannot, here's how you tell it where it is".
+1
In a way, we *must* be in the business of distributing JDKs because the remote execution workers must have a JDK somehow. If I understand Nick correctly, he's saying that it's currently distributed with the machine images, but it's still under our control. I think eventually we want to have an easy way to use a hermetic JDK downloaded from somewhere, but that's not very urgent, because local builds are covered by @local_jdk and the ability to point Bazel to a local JDK at an arbitrary location and remote builds are covered by the JDK installed on the workers.
I'd argue that it should not be on the Bazel team to provide and maintain a JDK for remote execution. Ultimately people will want and should be able to bring their own JDK for remote execution just as they do for local execution. It should be on the remote execution system to make recommendations on how best to do that / where to get it from.
In general, I think it's on Googlers to maintain these things. Which particular set of Googlers, we can argue, but let's first establish that this is a necessary piece of infrastructure. More concretely, that's what https://github.com/bazelbuild/bazel-toolchains should be, right?
On Thu, Aug 23, 2018 at 10:50 PM, Kevin Bierhoff <k...@google.com> wrote:
On Thu, Aug 23, 2018 at 11:43 AM Jakob Buchgraber <buc...@google.com> wrote:
On Thu, Aug 23, 2018 at 8:22 PM Kevin Bierhoff <k...@google.com> wrote:
I still don't think I'm following. I'm not suggesting embedded_jdk==host_jdk. I'm suggesting using embedded_jdk for JavaBuilder in particular, by default. I understand that JavaBuilder may require additional modules, but the set of those modules needed for JavaBuilder in particular would be known and could be included. Do we know how much larger embedded_jdk would have to be to accommodate JavaBuilder? Can we count that low?
It's about 6 MiB the last time I checked (the jdk.compiler module). It's my understanding that additionally any annotation processors would also run on the JDK used by JavaBuilder, and those processors again could use arbitrary JDK classes...
6 MiB seems worth it, no? I see the point about annotation processors, but I would also expect most common annotation processors not to go outside the JDK modules you'll need to include in @embedded_jdk anyway. If someone does want to use an annotation processor that needs other JDK modules, then they would have to use @local_jdk or another JDK, but it seems good enough to just document that. (Annotation processors also have a regular classpath where they can get access to stuff outside the JDK; that's a lot more common, I think, and should work fine.)
So this all doesn't seem to prevent using @embedded_jdk for JavaBuilder by default. And it seems well worth it to me considering that that way, everyone gets the non-vanilla JavaBuilder by default.
This would mean introducing Yet Another Javabase Concept, bringing the number of knobs that control what sort of JDK to use when to four (server, host, target, JavaBuilder). It would also mean surprises because then we'd be running annotation processors on a weird JDK.
It would also make it more difficult to deal with remote execution, because then we'd need to have a way to get the embedded JDK to run on the remote workers, which is especially interesting if they have a different operating system.
I agree that we'd ideally not have VanillaJavaBuilder, but since that requires maintaining multiple JavaBuilder versions, I think VanillaJavaBuilder is a nice stopgap. I hope that maintaining multiple JavaBuilders will not be a lot of work once we figure out who will do it. And there is a pretty easy migration path from "VanillaJavaBuilder is required for all JDK major versions that are not the one JavaBuilder supports" to "Bazel selects the right JavaBuilder".
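The migration path described above can be sketched roughly as follows. This is only an illustration in Python, not Bazel's actual selection logic: the function `select_java_builder`, the builder labels, and the version numbers in both tables are assumptions made up for the example.

```python
from typing import Mapping


def select_java_builder(host_jdk_major: int, builders: Mapping[int, str]) -> str:
    """Pick the JavaBuilder registered for the host JDK's major version.

    Falls back to VanillaJavaBuilder when no matching JavaBuilder exists.
    """
    return builders.get(host_jdk_major, "VanillaJavaBuilder")


# Stage 1 (hypothetical): a single JavaBuilder tied to one JDK major version;
# every other host JDK gets VanillaJavaBuilder.
single_version = {10: "JavaBuilder"}

# Stage 2 (hypothetical): one JavaBuilder per supported JDK major version,
# so Bazel can select the right one and the fallback is rarely hit.
per_version = {9: "JavaBuilder_jdk9", 10: "JavaBuilder_jdk10", 11: "JavaBuilder_jdk11"}

for major in (8, 10, 11):
    print(major, select_java_builder(major, single_version),
          select_java_builder(major, per_version))
```

Moving from the first table to the second does not change the lookup, only the number of entries, which is why the migration looks easy once someone maintains the extra builders.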
As for what JDK to use for the host and target javabases, I think the least surprising default choice is @local_jdk, because that's what all other build tools use. It's also conceptually simple to say "if you want to compile Java, you need a JDK, and Bazel tries to find it, but if it cannot, here's how you tell it where it is".
In a way, we *must* be in the business of distributing JDKs because the remote execution workers must have a JDK somehow. If I understand Nick correctly, he's saying that it's currently distributed with the machine images, but it's still under our control. I think eventually we want to have an easy way to use a hermetic JDK downloaded from somewhere, but that's not very urgent, because local builds are covered by @local_jdk and the ability to point Bazel to a local JDK at an arbitrary location and remote builds are covered by the JDK installed on the workers.
What we do need, however, is java_toolchain / java_runtime rules that describe the JDK on the remote workers, but we already have that. Fortuitously, it's a very good place to distribute hermetic JDK repositories if we ever make that leap.
Lukács T. Berki | Software Engineer | lbe...@google.com | Google Germany GmbH | Erika-Mann-Str. 33 | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891
Again, the annotation processor issue is IMHO a red herring as it will affect few if any users (and those users can always direct Bazel to use @local_jdk or something else).
VanillaJavaBuilder is a nice stopgap, but it's far inferior to "proper" JavaBuilder, so I still maintain Bazel should make every effort to use JavaBuilder instead of VanillaJavaBuilder. Also note that VanillaJavaBuilder to a first approximation allows more code to compile than JavaBuilder, due to the latter's strict_deps and Error-Prone enforcement. That seems to imply that changes down the line that make new versions of Bazel use JavaBuilder where VJB was previously used would run the risk of breaking existing builds. So it seems to me it's well worth making sure VJB is used in as few cases as possible and, not to sound like a broken record, giving users the benefits of JavaBuilder.
I'm curious what examples we have of host tools that need modules that we don't want to include in the embedded JDK?
I think the strongest argument for using the built-in JDK for anything else other than Bazel itself is that VanillaJavaBuilder should be used as little as possible. Could that be worked around, if needed, by distributing a javac.jar with Bazel and running that on whatever is --host_javabase?
That's also an option, but it's more expensive than packaging the classes that comprise javac.
I also agree with Martin's expectation that @(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision which JVM to use for running Bazel from as many things as possible.
On Mon, Aug 27, 2018 at 1:16 PM, Jakob Buchgraber <buc...@google.com> wrote:
On Mon, Aug 27, 2018 at 1:00 PM Lukács T. Berki <lbe...@google.com> wrote:
I think the strongest argument for using the built-in JDK for anything else other than Bazel itself is that VanillaJavaBuilder should be used as little as possible. Could that be worked around, if needed, by distributing a javac.jar with Bazel and running that on whatever is --host_javabase?
It was my understanding that this thread discusses the mid- and long-term future of the Java rules in Bazel, and I think in that future VanillaJavaBuilder should not exist. We should have a JavaBuilder for every JDK version that we support.
That's also an option, but it's more expensive than packaging the classes that comprise javac. I also think that maintaining multiple JavaBuilders would be a way to go if it's feasible, but I'd like a backup option in case it isn't.
It's my understanding that we already do that. Last time I checked there was a javac.jar in Bazel's embedded tools.
But then why is the host javabase *that* important?
On Mon, Aug 27, 2018 at 4:00 AM Lukács T. Berki <lbe...@google.com> wrote:
I also agree with Martin's expectation that @(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision which JVM to use for running Bazel from as many things as possible.
I agree in principle. In practice, there are a lot of JDK classes that are extremely unlikely to be needed by host tools (corba, etc.). We could get a fair bit of mileage out of keeping the current embedded JDK approach, and removing any classes that aren't needed by any popular host tools (the compilers for the major JVM languages, etc.)
I agree that in the medium term it would be nice to support more host_javabases, possibly by maintaining release branches of all of the host tools, or by moving them to a remote repository that could be versioned independently from Bazel and pinned to a version compatible with a particular JDK version.
It sounds like you want to minimize the embedded JDK ASAP, but can we consider it a blocker that doing so not regress the functionality of the Java toolchain?
On Mon, Aug 27, 2018 at 4:24 AM Lukács T. Berki <lbe...@google.com> wrote:
On Mon, Aug 27, 2018 at 1:16 PM, Jakob Buchgraber <buc...@google.com> wrote:
On Mon, Aug 27, 2018 at 1:00 PM Lukács T. Berki <lbe...@google.com> wrote:
I think the strongest argument for using the built-in JDK for anything else other than Bazel itself is that VanillaJavaBuilder should be used as little as possible. Could that be worked around, if needed, by distributing a javac.jar with Bazel and running that on whatever is --host_javabase?
It was my understanding that this thread discusses the mid- and long-term future of the Java rules in Bazel, and I think in that future VanillaJavaBuilder should not exist. We should have a JavaBuilder for every JDK version that we support.
That's also an option, but it's more expensive than packaging the classes that comprise javac. I also think that maintaining multiple JavaBuilders would be a way to go if it's feasible, but I'd like a backup option in case it isn't.
javac is tied to a particular JDK version, and the other tools built on top of javac depend on a large API surface area that tends to change between releases. Versioning javac alone wouldn't be sufficient; given the way things are currently structured, we'd need to version JavaBuilder and probably some of the other tools.
On Mon, Aug 27, 2018 at 4:35 AM Lukács T. Berki <lbe...@google.com> wrote:
It's my understanding that we already do that. Last time I checked there was a javac.jar in Bazel's embedded tools.
But then why is the host javabase *that* important?
javac is a Java program that needs a JDK to run on, and the javac in JDK version N can be made to work on JDK version N-1, but no earlier than that.
$ javap -v -cp ./third_party/java/jdk/langtools/jdk_compiler.jar com.sun.tools.javac.main.Main | grep 'major version'
major version: 53
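For anyone who wants to read that number without javap: the class-file major version sits in bytes 6-7 of every .class file, and major version 53 corresponds to Java 9 (52 is Java 8, 54 is Java 10). A small Python sketch follows; the helper names are mine, and the header below is synthetic rather than read from the real jdk_compiler.jar.

```python
import struct


def class_file_major_version(data: bytes) -> int:
    """Return the major version stored in the first 8 bytes of a .class file."""
    # Header layout, big-endian: u4 magic (0xCAFEBABE), u2 minor, u2 major.
    magic, _minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def java_release(major: int) -> int:
    """Map a class-file major version to the Java release that introduced it."""
    # Java 8 writes major version 52, Java 9 writes 53, and so on.
    return major - 44


# Synthetic header standing in for com/sun/tools/javac/main/Main.class.
header = struct.pack(">IHH", 0xCAFEBABE, 0, 53)
print(java_release(class_file_major_version(header)))  # prints 9
```

A JVM older than that release refuses to load the class, which is the constraint on the host javabase.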
On Mon, Aug 27, 2018 at 6:36 PM, Liam Miller-Cushon <cus...@google.com> wrote:
On Mon, Aug 27, 2018 at 4:00 AM Lukács T. Berki <lbe...@google.com> wrote:
I also agree with Martin's expectation that @(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision which JVM to use for running Bazel from as many things as possible.
I agree in principle. In practice, there are a lot of JDK classes that are extremely unlikely to be needed by host tools (corba, etc.). We could get a fair bit of mileage out of keeping the current embedded JDK approach, and removing any classes that aren't needed by any popular host tools (the compilers for the major JVM languages, etc.)
The reason why I'm reluctant to go that route is that this will inevitably introduce coupling between the embedded JDK and the users of Bazel, thus making it harder to update said embedded JDK. I'd much rather have the default be that Bazel uses a complete JDK that has to be provided to it in some way and, if need be, the embedded JDK can be used as the host JDK by explicitly opting in with the understanding that it's unsupported and if it breaks, you get to keep both pieces.
Your desire below that changes to the embedded JDK not regress the functionality of the Java toolchain is a very good indication what would happen if we kept things this way -- whenever we wanted to update the JVM of Bazel itself, we'd have to make sure that a *lot* of other things keep working that are not Bazel itself.
It sounds like you want to minimize the embedded JDK ASAP, but can we consider it a blocker that doing so not regress the functionality of the Java toolchain?
Long term, should the Java rules be embedded in or tied strongly to Bazel? That sort of feels like it comes into this as a question. If the answer is no, then I'm not sure if
-> Point at a particular external JDK version by default for the host_javabase which supports the JB
-> People can use other JDKs as they can now, but probably will be stuck with the VJB
-> If Google/others add support for multiple JDKs/JBs, this works fine here
-> We can minimize the embedded JDK
Other than downloading/unpacking the other JDK, it doesn't feel like it's a regression from today, unless I'm missing something?
On Mon, Aug 27, 2018 at 10:29 AM Lukács T. Berki <lbe...@google.com> wrote:
On Mon, Aug 27, 2018 at 6:36 PM, Liam Miller-Cushon <cus...@google.com> wrote:
On Mon, Aug 27, 2018 at 4:00 AM Lukács T. Berki <lbe...@google.com> wrote:
I also agree with Martin's expectation that @(JAVABASE) is a complete JDK and I also agree that it's best to decouple the decision which JVM to use for running Bazel from as many things as possible.
I agree in principle. In practice, there are a lot of JDK classes that are extremely unlikely to be needed by host tools (corba, etc.). We could get a fair bit of mileage out of keeping the current embedded JDK approach, and removing any classes that aren't needed by any popular host tools (the compilers for the major JVM languages, etc.)
The reason why I'm reluctant to go that route is that this will inevitably introduce coupling between the embedded JDK and the users of Bazel, thus making it harder to update said embedded JDK. I'd much rather have the default be that Bazel uses a complete JDK that has to be provided to it in some way and, if need be, the embedded JDK can be used as the host JDK by explicitly opting in with the understanding that it's unsupported and if it breaks, you get to keep both pieces.
Your desire below that changes to the embedded JDK not regress the functionality of the Java toolchain is a very good indication what would happen if we kept things this way -- whenever we wanted to update the JVM of Bazel itself, we'd have to make sure that a *lot* of other things keep working that are not Bazel itself.
Part of the solution is better testing and validation of release candidates, regardless of whether that coupling with the embedded JDK exists long-term. I.e., if a new release ships with a toolchain that requires a locally installed JDK 10, and some other host tools aren't compatible with JDK 10 yet, that's still a breakage.
I don't see a way around that as long as the toolchain is distributed with Bazel, unless we stop making changes to the defaults entirely.
On Mon, Aug 27, 2018 at 10:29 AM Lukács T. Berki <lbe...@google.com> wrote:
It sounds like you want to minimize the embedded JDK ASAP, but can we consider it a blocker that doing so not regress the functionality of the Java toolchain?
What are your thoughts on this part?
I'm not sure what the current proposal is.
I think there are two proposals on the table:
1. Embedded JDK is only used for running Bazel itself, and if you want to compile Java code, you need to provide a JDK to be used as the host javabase and the target javabase. If that JDK is not of the blessed version, you also need to explicitly revert to VanillaJavaBuilder.
2. Embedded JDK is only used for running Bazel itself, and the default host javabase is an external repository that is automatically downloaded when needed.
I could be convinced about both, but (2) means that we have to commit to maintaining the place where the JDK is to be downloaded from for all supported architectures or figure out a way to download the JDK automatically from some official place while making sure that we are abiding by its license.
(1) seems to be simpler. One could argue that it would mean shifting burden from us (one group of people) to all users of Bazel who want to build JVM stuff (many groups of people). However, I think that if one wants to build a JVM language, it's a pretty reasonable request to install a JDK, isn't it?
On Tue, Aug 28, 2018 at 12:43 AM Lukács T. Berki <lbe...@google.com> wrote:
I think there are two proposals on the table:
1. Embedded JDK is only used for running Bazel itself, and if you want to compile Java code, you need to provide a JDK to be used as the host javabase and the target javabase. If that JDK is not of the blessed version, you also need to explicitly revert to VanillaJavaBuilder.
2. Embedded JDK is only used for running Bazel itself, and the default host javabase is an external repository that is automatically downloaded when needed.
I could be convinced about both, but (2) means that we have to commit to maintaining the place where the JDK is to be downloaded from for all supported architectures or figure out a way to download the JDK automatically from some official place while making sure that we are abiding by its license.
My understanding from earlier in the thread was that we want to be able to do this for remote execution regardless, so (1) doesn't eliminate the need to have a mirror of those JDKs somewhere.
(1) seems to be simpler. One could argue that it would mean shifting burden from us (one group of people) to all users of Bazel who want to build JVM stuff (many groups of people). However, I think that if one wants to build a JVM language, it's a pretty reasonable request to install a JDK, isn't it?
If the default toolchain depends on the latest six-monthly JDK release, this is going to be onerous. People may not have (or have access to) a local install of the latest version immediately. And the Java installers typically make the new version the default (and add it to your PATH) which may interfere with other programs that don't support that JDK yet. If we want the non-VanillaJavaBuilder as the default, and for it to be relatively hassle-free, I think (2) will provide a better experience.
Is this a fair assessment of the upsides and downsides of each approach?
To summarize the pros and cons of approach (1), that is, always download the host JDK from a remote repository and approach (2), that is, to require a pre-installed local JDK:
If we default to host JDK in a remote repository:
1. People who unknowingly and transitively depend on the JDK can work without installing a JDK (but I expect this to be rare)
2. The host JDK will always be the right major version (minor version and JDK vendor don't matter, do they?)
3. We need to maintain our own JDK on mirror.bazel.build (@ngiraldo: do we need to do that for RBE?)
4. It's a large download (hundreds of megabytes) and people working offline will need to remember to do "bazel fetch" (but that's already the case if you have remote repositories)
5. Works by default with RBE (@ngiraldo: but is this actually useful as opposed to the JDK being preinstalled on the workers? Then the host javabase will need to be shipped to workers individually)
6. Downloading a host javabase may not be received well by distributions where we package Bazel (Debian, Homebrew)
If we default to a preinstalled host JDK:
1. People who unknowingly and transitively depend on the JDK will be surprised by the requirement to install a JDK
2. Users would need to explicitly revert to VanillaJavaBuilder if the system JDK doesn't work with JavaBuilder (but JavaBuilder works across two major versions and if need be, we can maintain multiple versions)
3. Users of Bazel get the exact JDK they want
4. No surprising large download of a JDK
5. When one wants to start using the RBE, --host_javabase needs to be changed so that it's a JDK that exists on the remote machine (but you already have a lot of flags you need to add in that case)
Is this a fair assessment of the upsides and downsides of each approach?
SGTM. I left a few notes inline:
To summarize the pros and cons of approach (1), that is, always download the host JDK from a remote repository and approach (2), that is, to require a pre-installed local JDK:
If we default to host JDK in a remote repository:
1. People who unknowingly and transitively depend on the JDK can work without installing a JDK (but I expect this to be rare)
Are we sure there aren't any Java host tools that are used by other languages? Nothing for coverage, or Windows singlejar?
2. The host JDK will always be the right major version (minor version and JDK vendor don't matter, do they?)
The minor version and vendor don't matter in theory. In practice we depend on a few internal APIs, so they could matter. This should be much less of an issue with the six-monthly release cadence since each major version only has two point releases.
3. We need to maintain our own JDK on mirror.bazel.build (@ngiraldo: do we need to do that for RBE?)
4. It's a large download (hundreds of megabytes) and people working offline will need to remember to do "bazel fetch" (but that's already the case if you have remote repositories)
`zulu10.2+3-jdk10.0.1-linux_x64-allmodules.tar.gz` is 0.52 hundred MiB :)
5. Works by default with RBE (@ngiraldo: but is this actually useful as opposed to the JDK being preinstalled on the workers? Then the host javabase will need to be shipped to workers individually)
6. Downloading a host javabase may not be received well by distributions where we package Bazel (Debian, Homebrew)
If it's poorly received, wouldn't those concerns apply equally to all other uses of remote repos? I thought we wanted to move most of the language-specific support stuff (toolchains, rules, etc.) to remote repos.
If we default to a preinstalled host JDK:
1. People who unknowingly and transitively depend on the JDK will be surprised by the requirement to install a JDK
2. Users would need to explicitly revert to VanillaJavaBuilder if the system JDK doesn't work with JavaBuilder (but JavaBuilder works across two major versions and if need be, we can maintain multiple versions)
I just want to reiterate that this seems like a bad out-of-the-box experience: you get Bazel, do a Java build, it fails mysteriously (or at best reports an error), and then you have to go and install a JDK or pass additional flags we'd prefer most users didn't have to have expertise with.
3. Users of Bazel get the exact JDK they want
It depends what JDK they want, which I don't have a good sense of. I'd expect more people to have a preference about target --javabase and language level, and to care much less about the toolchain implementation details as long as it provides the functionality they expect.
`zulu10.2+3-jdk10.0.1-linux_x64-allmodules.tar.gz` is 0.52 hundred MiB :)
I just want to re-iterate that this seems like a bad out of the box experience: you get Bazel, do a Java build, it fails mysteriously (or at best reports an error), and then you have to go and install a JDK or pass additional flags we'd prefer most users didn't have to have expertise with.
To summarize the pros and cons of approach (1), that is, always download the host JDK from a remote repository and approach (2), that is, to require a pre-installed local JDK:If we default to host JDK in a remote repository:
- People who unknowingly and transitively depend on the JDK can work without installing a JDK (but I expect this to be rare)
- The host JDK will always be the right major version (minor version and JDK vendor don't matter, do they?)
- We need to maintain our own JDK on mirror.bazel.build (@ngiraldo: do we need to do that for RBE?)
- It's a large download (hundreds of megabytes) and people working offline will need to remember to do "bazel fetch" (but that's already the case if you have remote repositories)
- Works by default with RBE (@ngiraldo: but is this actually useful as opposed to the JDK being preinstalled on the workers? Then the host javabase will need to be shipped to workers individually)
- Downloading a host javabase may not be received well by distributions where we package Bazel (Debian, Homebrew)
If we default to a preinstalled host JDK:
- People who unknowingly and transitively depend on the JDK will be surprised by the requirement to install a JDK
- Users would need to explicitly revert to VanillaJavaBuilder if the system JDK doesn't work with JavaBuilder (but JavaBuilder works across two major versions and if need be, we can maintain multiple versions)
- Users of Bazel get the exact JDK they want
- No surprising large download of a JDK
- When one wants to start using the RBE, --host_javabase needs to be changed so that it's a JDK that exists on the remote machine (but you already have a lot of flags you need to add in that case)
Is this a fair assessment of the upsides and downsides of each approach?
Taking a step back, what's the deadline for getting this resolved?
The most promising option seems to be moving the entire toolchain to a remote repo, and I understand that's something you already wanted to do in the medium/long term.
However, there are some open questions about remote repos, including how stable they are, how they interact with distro packaging requirements, and how acceptable downloading additional MiBs is.
Can we hit pause on this until some of those issues are worked out?
I understand the forcing functions to be:
1) avoiding a repeat of the breakages with 0.16, and in particular preventing uses of the embedded JDK except as the server_javabase
2) minimizing the embedded JDK
I don't think (1) is urgent now that the dust has settled on 0.16. JDK 11 will be released in a month, but it's also going to be a less disruptive change than 8->9->10 was.
(2) is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.
On Wed, Aug 29, 2018 at 11:57 PM Liam Miller-Cushon <cus...@google.com> wrote:
> These are "accidental" issues, so I'd prefer not basing long-term decisions based on what the problem du jour is.
I probably could have chosen better examples, unless you're saying that any use of Java for host tools to support non-JVM languages should be discouraged. Is there a preferred alternative?
> > If it's poorly received, wouldn't those concerns apply equally to all other uses of remote repos? I thought we wanted to move most of the language-specific support stuff (toolchains, rules, etc.) to remote repos.
> That's a good point. Maybe we should just cross that bridge now and be done with it.
Let's find out if the concerns you raised are blockers before we try to migrate. Can you follow up on that?
> You have to install a JDK to run the JVM software anyway. Are you concerned about cases where there is a JDK of the wrong version
Yes. This is the point I raised earlier that people may not have a local install of the latest version immediately, and that installing one can be disruptive if other programs on their system don't support that JDK yet.
> what would be required to support / maintain multiple JavaBuilder versions for several major JDKs versions?
Staffing, primarily :)
> Do I understand correctly that to support JDK [N-3, N] we would need two versions of JavaBuilder and javac? For example, could we support JDK 8,9,10,11 as a host_javabase with a JDK9 JavaBuilder/javac and a JDK10 JavaBuilder/javac? Once 12 was released, we could support JDK 9,10,11,12 as a host_javabase with a JDK10 JavaBuilder/javac and a JDK12 JavaBuilder/javac? Correct?
I'm not following the math here. JavaBuilder/javac N can be run on JDKs newer than N, but the latest language level they support is N, so that's not a good experience (i.e. the default toolchain doesn't support JDK 11 today, even though you can use a JDK 11 host_javabase). JavaBuilder/javac N can be made to work on JDK N-1, but that takes additional work and is slightly hacky.
So in practice we'd want one javac/JavaBuilder per host JDK version.
I think the shortest path to supporting additional host JDKs is to migrate the entire toolchain to a remote repo so it can be versioned independently from Bazel. That doesn't solve the problem of actively supporting the old versions (including e.g. back-porting fixes to them), but at least it lets people pin to an old version of the toolchain that matches the host JDK they're using, and to upgrade independently from their Bazel version.
> Besides that downloading 50MiB is *a lot*. It's really worrisome for CI builds
CI systems don't have to use the remote JDK, as Nicholas said.
It's still not clear to me why a 50MiB download is such a concern, or why it's being weighted so heavily against the experience for Java users. Can you help me understand this part?
> My 2 cents is that we should provide support for both approaches with clear advice/error messages. This would work as follows:
> - Default to a preinstalled jdk if it exists and is "new enough" to use non-VJB
> - Error out if one does not exist and user does not provide any explicit guidance wrt the two options they have:
My concern with this approach is that we might end up erroring out for a large fraction of users. It would be helpful to have data on Bazel users' locally installed JDK versions, but I suspect that a minority have JDK 10 installed.
> It's still not clear to me why a 50MiB download is such a concern, or why it's being weighted so heavily against the experience for Java users. Can you help me understand this part?
I'd argue any toolchain download is a bad user experience and thus downloads should be as fast (small) as possible. Excessive downloads make Bazel look bad, and even more so if a user is waiting for a JDK to download even though he has one installed locally. It's a surprising and unexpected thing to do for a build tool.
One can make the argument that a user can override the --host_javabase to use the local JDK if he doesn't want to download one (and one can argue vice versa), but defaults matter *a lot*. It's an extra >50MiB download, for no technical reason, that can be avoided, and thus I think we should avoid it. There's this joke among Java developers that Apache Maven is always downloading half the internet, but that joke has a truth to it. I'd like to avoid such popular opinion forming about Bazel.
Toolchain downloads/remote dependencies are effectively the norm for most users, I'd imagine (at least outside of Google), no?
On Thu, Aug 30, 2018 at 9:01 AM, Liam Miller-Cushon <cus...@google.com> wrote:
> Taking a step back, what's the deadline for getting this resolved?
Bazel 1.0, I guess?
> (2) is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.
+1. Let's figure out what to do first, then tweak the embedded JDK when we can do it without harm.
I think there is consensus that --javabase should point to @local_jdk.
For --host_javabase, what I'd like to avoid is coupling the JDK Bazel runs under to the JDK host tools run under, that is, --host_javabase defaulting to @embedded_jdk in any form. This is both to make it possible to minimize the embedded JDK and to be able to update it while being sure that we won't break anything. This leaves us two options: a local JDK and a remote repository.
Based on what I heard, I'm leaning towards the remote repository option -- most of its cost ("mandatory" download, size, need to maintain mirror.bazel.build) will be incurred due to other reasons anyway, and Bazel versions packaged with e.g. Homebrew/Debian can always default to a JDK in another package.
(Another point: it's not true that only our tools run under --host_javabase. It's any tool that runs during the build and is written in Java)
On Fri, Aug 31, 2018 at 1:48 AM Lukács T. Berki <lbe...@google.com> wrote:
> I think there is consensus that --javabase should point to @local_jdk.
> For --host_javabase, what I'd like to avoid is coupling the JDK Bazel runs under to the JDK host tools run under, that is, --host_javabase defaulting to @embedded_jdk in any form. This is both to make it possible to minimize the embedded JDK and to be able to update it while being sure that we won't break anything. This leaves us two options: a local JDK and a remote repository.
> Based on what I heard, I'm leaning towards the remote repository option -- most of its cost ("mandatory" download, size, need to maintain mirror.bazel.build) will be incurred due to other reasons anyway, and Bazel versions packaged with e.g. Homebrew/Debian can always default to a JDK in another package.
Again this is where I'm confused. Avoiding the embedded JDK seems to just shift the breakage from one change (embedded JDK update) to a different change (updating the default remote repo --host_javabase points to, updating the package Debian points to etc.). Am I missing something?
> (Another point: it's not true that only our tools run under --host_javabase. It's any tool that runs during the build and is written in Java)
Apologies, I'm aware of this and discussed this point somewhere in my last email. Because of this issue, breakages due to --host_javabase changes are a concern at all (assuming sufficient testing of bundled tools), IIUC. I still make the claim that this is far less common or likely to be an issue, especially for users who aren't doing Java compilation. Are there non-packaged, non Android SDK, Java-based tools you are aware of that are commonly used? This question was asked before and it didn't seem like there was a large surface here to worry about. Did I get the wrong impression?
Another question would be whether we think the bigger issue here is user-written genrules that happen to invoke a Java program, or Skylark rules that use Java-based tools? Just curious what we know about the landscape of host-side Java tools outside Java and Android compilation.
I don't think anyone is going to argue with this on principle, but I'm not sure what the specific proposal is.
... I have good reasons for wanting the default toolchain to be modern and featureful and reliable even if it requires downloading some additional bits, and that we're lacking data that would help prioritize.
The specific proposal for the RHS --host_javabase is:
* Require the user to install a local JDK or bring his own JDK as a remote repository
* Bazel defines a list of supported major JDK versions and fails with a decent error message if the user provided JDK is not in that list.
* Bazel automatically downloads a JavaBuilder from a remote repository that works with the user provided JDK. We have a working JavaBuilder for all supported JDKs. No more VanillaJavaBuilder.
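To make the proposal above concrete, here is a minimal Python sketch of the selection step: check the host JDK's major version against a supported list, fail with a readable error otherwise, and pick a matching JavaBuilder. This is not Bazel code. The supported versions, the `UnsupportedJdkError` name, and the `@hypothetical_java_tools_jdkN` repository label are all assumptions made up for illustration.

```python
# Illustrative only: the version list and the repository label are hypothetical.
SUPPORTED_MAJORS = (9, 10)


class UnsupportedJdkError(Exception):
    """Raised when the user-provided host JDK is not in the supported list."""


def select_java_builder(host_jdk_major, supported=SUPPORTED_MAJORS):
    """Return a (hypothetical) JavaBuilder label for the given host JDK major version."""
    if host_jdk_major not in supported:
        wanted = ", ".join(str(v) for v in supported)
        raise UnsupportedJdkError(
            f"host JDK {host_jdk_major} is not supported; "
            f"install or configure one of: {wanted}"
        )
    # One JavaBuilder per host JDK major version, fetched from a remote repository.
    return f"@hypothetical_java_tools_jdk{host_jdk_major}//:JavaBuilder"


if __name__ == "__main__":
    print(select_java_builder(10))
```

Failing early with the list of accepted versions is what the "decent error message" bullet asks for; whether the list should be fixed per Bazel release is left open in the thread.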
> Again this is where I'm confused. Avoiding the embedded JDK seems to just shift the breakage from one change (embedded JDK update) to a different change (updating the default remote repo --host_javabase points to, updating the package Debian points to etc.). Am I missing something?
Correct, but it makes the embedded JDK and the host JDK independent, which is a win.
On Tue, Sep 4, 2018 at 7:00 PM Liam Miller-Cushon <cus...@google.com> wrote:
That suite of JavaBuilders is likely to be substantively different depending on which JDK they need to be compatible with, so this suffers from some of the issues I raised about automatically switching between VJB and non-VJB depending on host JDK version.
Requiring explicitly selecting a toolchain version that matches the local JDK would take some of that magic away.
I imagine we should be able to auto detect the Java version in the local_jdk?
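As a rough illustration of what auto-detection could involve, the Python sketch below extracts the major version from the version string a JDK reports. It assumes the banner printed by `java -version` contains a quoted version such as "1.8.0_181" or "10.0.1"; the function names are hypothetical, and this is not how Bazel's local_jdk repository actually inspects the JDK.

```python
import re


def parse_java_major(version):
    """Return the major version for strings like '1.8.0_181', '10.0.1' or '11-ea'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Java version string: {version!r}")
    first = int(match.group(1))
    second = match.group(2)
    # Before JDK 9 the scheme was 1.<major>.<update>, so 1.8.0_181 means Java 8.
    if first == 1 and second is not None:
        return int(second)
    return first


def major_from_banner(banner):
    """Pull the quoted version out of a `java -version` banner and parse it."""
    match = re.search(r'version "([^"]+)"', banner)
    if match is None:
        raise ValueError("no quoted version found in banner")
    return parse_java_major(match.group(1))


if __name__ == "__main__":
    print(major_from_banner('openjdk version "10.0.1" 2018-04-17'))
```

Detecting the version is the easy part; the open question in the thread is whether switching toolchains automatically based on the result is desirable.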
It would be difficult to guarantee that a JavaBuilder that runs on, say, JDK 6 is bug and feature-compatible with one that runs on JDK 11.
Automagically switching between the two for the same build when running on different systems with different local JDKs may not provide a good experience.
... we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial ...
On Tue, Sep 4, 2018 at 7:38 PM Liam Miller-Cushon <cus...@google.com> wrote:
> It would be difficult to guarantee that a JavaBuilder that runs on, say, JDK 6 is bug and feature-compatible with one that runs on JDK 11.
I can see your point, but do you think that this will be an issue we can't handle? I imagine a JavaBuilder for the latest JDK N will stay current for at least six months and any bugs reported in that timeframe will get fixed. Once we support JDK N+1, the JavaBuilder for JDK N should be rather stable, and for critical bugs we can still do updates to it, although I would expect this to be the exception, no?
> Automagically switching between the two for the same build when running on different systems with different local JDKs may not provide a good experience.
I am not entirely sure what you mean by the "same build" on two different systems with different JDKs.
I'll assume you mean the same codebase and source state: I don't see how this is a problem, in that different systems with different JDKs are not expected to behave the same, even today.
I guess my larger point here is that, given the current tool bundling etc., it seems like we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial (e.g., we don't want to upgrade the embedded JDK for some reason but need to upgrade host JDK, e.g., to support a new Java version; or we want to minify the embedded JDK). Until that time it seems we can avoid extra downloads (and breakages!) by keeping them together, seemingly at no cost. Put another way I'm not advocating keeping embedded==host forever, but it does seem we can wait on separating them, so let's do that and focus on the target JDK separation first.
The highest-priority and least contentious part of this seems to be that defaulting --javabase to the embedded JDK if we can't find a local_jdk is a bad idea.
I filed https://github.com/bazelbuild/bazel/issues/6105 to track fixing that.
On Tue, Sep 4, 2018 at 4:27 PM Liam Miller-Cushon <cus...@google.com> wrote:
On Tue, Sep 4, 2018 at 4:02 PM 'Kevin Bierhoff' via Bazel/JVM Special Interest Group <bazel-sig-jvm@googlegroups.com> wrote:
> I guess my larger point here is that, given the current tool bundling etc., it seems like we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial (e.g., we don't want to upgrade the embedded JDK for some reason but need to upgrade host JDK, e.g., to support a new Java version; or we want to minify the embedded JDK). Until that time it seems we can avoid extra downloads (and breakages!) by keeping them together, seemingly at no cost. Put another way I'm not advocating keeping embedded==host forever, but it does seem we can wait on separating them, so let's do that and focus on the target JDK separation first.
I think we agreed earlier in the thread to keep using the embedded JDK as the default host_javabase for now, and revisit minification some time before Bazel 1.0:
On Thu, Aug 30, 2018 at 9:01 AM, Liam Miller-Cushon <cushon@google.com> wrote:
> Taking a step back, what's the deadline for getting this resolved?
Bazel 1.0, I guess?
> [minimizing the embedded JDK] is appealing, but it doesn't seem worth rushing into at the risk of destabilizing Java support again.
+1. Let's figure out what to do first, then tweak the embedded JDK when we can do it without harm.
Again, in my opinion we haven't sufficiently established that the separation is necessary in order to minimize the embedded JDK, so let's please not do anything that we can't take back before we know it's needed. Additionally, if such a separation was needed I think the window between separation and minimization should be as short as possible, ideally within the same Bazel release, to avoid Java users of Bazel downloading a full host JDK in addition to a full embedded JDK and a full local JDK. Meaning, we should only start on this when we're ready to finish it.
My preference, based on the writeup for the minimization, would be to go forward on the other Bazel minimization opportunities identified in the doc. IIUC, the diff between using an embedded JDK with all modules vs. an embedded JDK with minimal modules is 30 megs, but the other opportunities identified already give a bunch of savings. So let's reap those other savings while discussing how best to remove modules from the embedded JDK, as a final step, if such a sequencing is possible.
On Thu, Sep 20, 2018 at 5:27 AM Lukács T. Berki <lbe...@google.com> wrote:
On Tue, Sep 4, 2018 at 11:13 PM, Kevin Bierhoff <k...@google.com> wrote:
On Tue, Sep 4, 2018 at 1:24 PM Jakob Buchgraber <buc...@google.com> wrote:
On Tue, Sep 4, 2018 at 10:02 PM Kevin Bierhoff <k...@google.com> wrote:
> ... we can wait to separate embedded JDK from host JDK until a situation arises that makes that beneficial ...
This situation exists already, in that for non-Java users this would allow us to significantly reduce the Bazel binary size by building a minimal JDK using jlink. See https://docs.google.com/document/d/1Igmv-2GfXkoVFWTXvBYPeniQom8nLAwzqzridDlBIS4/edit
Apologies, I didn't realize you were imminently planning/hoping to do the size minimization. Is there a discussion somewhere where we can talk about how best to achieve that, or is this (and the doc linked above) that discussion? We extensively discussed requirements for minimization above as well, and none of them to me necessitated an embedded/host split either, so my sense is that I'm missing something there as well. We can certainly continue that discussion in this email thread or further discuss it in a separate thread, whatever you prefer. (I was hoping we could defer that part, since it's complex for its own reasons, but I guess that won't do it.)
Given that the embedded/host JDK separation will need to be done eventually and it makes the minimization of the embedded JDK much easier, I'd prefer to start with it as soon as we can. It will take a while to be rolled out anyway...
--
Lukács T. Berki | Software Engineer | lbe...@google.com |
Google Germany GmbH | Erika-Mann-Str. 33 | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891
--
Kevin Bierhoff
Some data about the unpacked Bazel binary (altogether 300 MB):
- JDK: 152MB
- Server jar: 44MB
- tools: 51MB
- tools/jdk: 43MB
In other words: the JDK is half of the Bazel binary, with the code of the server and our Java tooling taking another 15% each.
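The proportions quoted above can be checked with a few lines of Python. The sizes and the 300 MB total are taken from the message; the dictionary layout and function names are just for illustration. The entries are not summed, since the message does not say whether tools/jdk is counted inside tools.

```python
# Sizes in MB as reported in the message; the total is the stated "altogether 300 MB".
TOTAL_MB = 300
COMPONENTS_MB = {"JDK": 152, "Server jar": 44, "tools": 51, "tools/jdk": 43}


def share_percent(size_mb, total_mb=TOTAL_MB):
    """Return a component's share of the unpacked binary, in percent."""
    return 100.0 * size_mb / total_mb


def shares(components=COMPONENTS_MB, total_mb=TOTAL_MB):
    """Return each component's share rounded to a whole percent."""
    return {name: round(share_percent(size, total_mb)) for name, size in components.items()}


if __name__ == "__main__":
    for name, pct in shares().items():
        print(f"{name}: {pct}%")
```

This gives roughly 51% for the JDK and 14% to 17% for the other entries, consistent with "half" and "another 15% each".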
On Fri, Sep 21, 2018 at 9:22 AM, Ulf Adams <ulf...@google.com> wrote:
--Lukács T. Berki | Software Engineer | lbe...@google.com |Google Germany GmbH | Erika-Mann-Str. 33 | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891