--
You received this message because you are subscribed to the Google Groups "bazel-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-dev+...@googlegroups.com.
To post to this group, send email to baze...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-dev/CAFRCsYSQVZ0ZGFoUorEe-kZzLpgmYgCJ4PGv746Mt_OdPdUveQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Feels like we're on the path to reinvent Skyframe for action execution. The goal is a similar evaluation model: gradual input (= dependency) discovery where earlier-requested (= built) inputs determine later-requested inputs. Does this mapping of the problem to Skyframe make sense?
On a more concrete level, consider Windows (and Unixes without Bash) when talking about "shell commands" -- i.e. don't assume Bash, or any Sh-compatible shell, is always available.
I feel like I haven't quite understood what this feature is trying to accomplish. I see the value in removing the special cases, but not why they exist in the first place. A few probing questions to try to scope out the problem here:
- How is this better than simply using all possible inputs to the compiler and letting it sort itself out?
- Is this expected to improve caching? It's not clear to me whether there's a way to avoid having the entire superset of inputs in the cache key.
- How can this be sandboxed without creating the same "action with a large list of inputs" problem? Don't we always have to deal with that?
> Feels like we're on the path to reinvent Skyframe for action execution. The goal is a similar evaluation model: gradual input (= dependency) discovery where earlier-requested (= built) inputs determine later-requested inputs. Does this mapping of the problem to Skyframe make sense?
That one seems to be a much more difficult problem, so I think it's better left out of scope for now. We were discussing an (eventual) approach where Bazel would parse input files in some way to generate dependency edges and maybe even targets, but it's a much more intrusive thing, since it would require the dynamic generation of actions (and maybe even targets) and not only the set of inputs for an action.
Yay, some discussion about input discovery!
I authored the "Post-analysis actions in Skylark" proposal, based on my own findings, Bazel documentation, Oscar's "subgraph" idea which I cited at the bottom, and my understanding of Skyframe.

> Feels like we're on the path to reinvent Skyframe for action execution. The goal is a similar evaluation model: gradual input (= dependency) discovery where earlier-requested (= built) inputs determine later-requested inputs. Does this mapping of the problem to Skyframe make sense?

Yes, absolutely!

> That one seems to be a much more difficult problem, so I think it's better left out of scope for now. We were discussing an (eventual) approach where Bazel would parse input files in some way to generate dependency edges and maybe even targets, but it's a much more intrusive thing, since it would require the dynamic generation of actions (and maybe even targets) and not only the set of inputs for an action.

I achieve dependency discovery today: (1) first using an action to generate an archive of selected inputs, (2) giving that archive to the next step, (3) relying on Bazel short-circuiting the second step if the archive is the same. I'd be thrilled for that process to use less disk space and be less awkward.

More generally, IMO there are three types of need for more dynamism in actions (also given in my proposal earlier):
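The archive workaround can be sketched roughly like this (a toy model, not the actual rule implementation; all names are made up). Step 1 digests only the selected inputs; step 2 reruns only when that digest changes:

```python
import hashlib

def archive_key(selected_inputs):
    """Step 1: an action packs only the selected inputs; the key is a
    digest over their names and contents (dict of name -> content here)."""
    h = hashlib.sha256()
    for name in sorted(selected_inputs):
        h.update(name.encode())
        h.update(selected_inputs[name].encode())
    return h.hexdigest()

class CompileStep:
    """Step 2: rerun only when the archive from step 1 changed."""
    def __init__(self):
        self.last_key = None
        self.runs = 0

    def run(self, key):
        if key == self.last_key:
            return "cached"
        self.last_key = key
        self.runs += 1
        return "rebuilt"
```

Changing a file that step 1 does not select leaves the archive key unchanged, so the second step short-circuits; the awkward part is materializing the intermediate archive at all.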
- Pruning inputs to achieve better cache hits. C++ rules do this. This would be for anything with cheap static dependency analysis: Go, TypeScript, JavaScript, Sass.
- Using a fork-join pattern to parallelize work. The equivalent of Bazel native's "template actions". This would be useful for anything "shardable": compressed archives, tests, Android dexing.
- Planning efficient build graphs of arbitrary shape. This is the generalization of the first two. For example, given a source tree of 500 TypeScript files, examine the dependencies and generate ~50 interdependent build actions with an intelligent level of granularity. (Ditto for Java or anything else with a similar compilation model.)
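The fork-join pattern in the second bullet can be sketched minimally (illustrative only; real template actions would create one Bazel action per shard rather than running them in-process):

```python
def fork_join(items, shard_count, work, join):
    """Split items into round-robin shards, run `work` on each shard
    (sequentially here for simplicity), and `join` the partial results."""
    shards = [items[i::shard_count] for i in range(shard_count)]
    return join(work(shard) for shard in shards)
```

For example, sharding test inputs or dexing inputs would call `work` once per shard and merge the per-shard outputs in `join`.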
> - Using a fork-join pattern to parallelize work. The equivalent of Bazel native's "template actions". This would be useful for anything "shardable": compressed archives, tests, Android dexing.
Yep. However, I'm not very happy with template actions in particular because it looks like a somewhat clumsy abstraction, so I'm somewhat reluctant to set them in stone by exposing them to Starlark.
> bazel can stop watching for any change in those files, as they cannot affect the outcome of the action.

This means that any bug in the action can lead to incremental incorrectness in Bazel?
> bazel can stop watching for any change in those files, as they cannot affect the outcome of the action.

This means that any bug in the action can lead to incremental incorrectness in Bazel?
How does an action know which files are unused?
How do you know which files are available then?
On Tue, Apr 9, 2019 at 1:18 PM Ulf Adams <ulf...@google.com> wrote:

> How do you know which files are available then?

The set of available files doesn't change -- all the inputs are provided to every run. The only thing that changes is that the action is rerun less often. Effectively, the action promises that it will produce the same output if only the files it declared unused have changed. Bazel trusts it and does not rerun for such cases.
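The promise can be modeled like this (a simplification for illustration, not Bazel's actual implementation):

```python
def needs_rerun(changed_files, declared_inputs, unused_report):
    """`unused_report` is the set of declared inputs the previous run
    reported as unused. The action is rerun only if a changed file is
    one it actually used -- the build system trusts the report."""
    used = set(declared_inputs) - set(unused_report)
    return bool(set(changed_files) & used)
```

A buggy report -- listing a file as unused that actually mattered -- makes `needs_rerun` return False when it should not, which is exactly the incremental-incorrectness risk raised above.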
> The set of available files doesn't change--all the inputs are provided to every run.
It won't catch every bug, but building a reasonably large code base for a couple of weeks will pretty quickly tell you if it broke.
> It won't catch every bug, but building a reasonably large code base for a couple of weeks will pretty quickly tell you if it broke.

And in the meantime, you'll have poisoned your cache.

Why can't this be done in two steps? One for `cc -M`, the other for `cc`. The `cc -M` step is run when any of the input files change. The result is a list of inputs which are provided for the second step. This requires that `cc -M` be sufficiently fast, but to my knowledge that is at least true for C/C++.
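The two-step idea can be sketched with a toy include scanner standing in for `cc -M` (real preprocessing handles include paths, system headers, conditionals, and so on):

```python
import re

def scan_deps(source_text):
    """Toy stand-in for `cc -M`: collect headers from #include "..." lines."""
    return sorted(set(re.findall(r'#include\s*"([^"]+)"', source_text)))

def plan_inputs(sources):
    """Step 1 runs whenever a source changes; its output is exactly the
    input list handed to the `cc` step, so no trust in an unused-inputs
    report is needed."""
    return {name: scan_deps(text) for name, text in sources.items()}
```

The cache key of the second step then covers only the discovered inputs, at the cost of an extra scanning action per source.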
On Tue, Apr 9, 2019 at 4:30 PM Paul Draper <pa...@rivethealth.com> wrote:

> It won't catch every bug, but building a reasonably large code base for a couple of weeks will pretty quickly tell you if it broke.
>
> And in the meantime, you'll have poisoned your cache.

Yeah, that's the tradeoff, and that's the reason for my reluctance. The counterargument is that we already do that for C++ and it seems to work well enough. In the end, I think we must come up with something like this so that the C++ rules can be fully Starlarkified.

I kinda like both of these ideas for verifying whether an action worked well, but both of them have caveats: Austin's is an ex-post-facto one that needs to run in a continuous build and, as Paul said, opens up the possibility of cache poisoning, and Paul's doesn't work on Windows and is leaky in any case (what about subprocesses doing creative things with the environment?)
We will never eliminate them.
Determinism != Correctness. I'd define correctness roughly like this:
A build system is correct if the output of an incremental build is consistent with the output of a clean build from the same source, where consistency means that you could come up with a hypothetical sequence / timing of running the actions pertaining to the build to obtain that output.
The idea of that definition is that we can define correctness of a build system independently of the correctness or determinism of the individual tools. This is useful for two reasons: (1) there are scenarios in which build systems fail to produce correct output even if the tools are all fully correct and deterministic, and (2) otherwise you depend on all tools being correct and deterministic, which is obviously not the case. (Note that this precludes concurrent modifications to the file system, though we're also trying to ensure that Bazel always picks up changes made while the build is running in the next build at the latest.)
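One way to make the definition concrete (a toy model where nondeterminism comes only from action ordering):

```python
from itertools import permutations

def consistent(observed_output, actions, run):
    """An incremental build's output is consistent if SOME hypothetical
    ordering of the clean-build actions reproduces it."""
    return any(run(list(order)) == observed_output
               for order in permutations(actions))

# Toy example: two actions race to write the same file; last writer wins.
actions = [("write_a", "A"), ("write_b", "B")]

def last_writer_wins(order):
    content = None
    for _name, value in order:
        content = value
    return content
```

Under this definition, observing either "A" or "B" after an incremental build is correct, because some scheduling of the clean build produces each; observing anything else is not.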
With this change we're expecting to regularly see build machine use reductions of 100x for incremental Dart builds.
non-stateful worker
> With this change we're expecting to regularly see build machine use reductions of 100x for incremental Dart builds.

Where are the Dart rules?
-- Laurent
Thanks.

So that means that using unused files heavily will decrease our cache hit rate, IIUC. This is because rules will accumulate many unused deps, which are used in cache key computation but are actually not useful. That seems like it will be a regression relative to erroring on unused dependencies. Erroring is a bad user experience, but a cache hit reduction is probably too high a price to pay for slow compilers.
If it doesn't affect the remote cache, I don't understand something. Imagine the following case: you have two rules, A and B. B depends on A, but the dependency is unused.

When developer 1 goes to build A and B, they see that B's srcs have changed, so they need to rebuild. They write to the action cache as though B depends on A, since writing to the remote cache is not aware of pruning.

Developer 2 goes to build B. In fact, their srcs for B are the same as developer 1's, but their A is different from developer 1's. Since the remote cache entry wrongly depends on both A and B, developer 2 finds no entry in the cache and now rebuilds B, even though the output from developer 1's build would have been sufficient.

What have I misunderstood in this scenario? If the above is accurate, building up unused dependencies lowers your cache efficiency.
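The scenario can be checked mechanically (a hypothetical key scheme for illustration: a digest over all declared inputs, with pruning not reflected in the remote key):

```python
import hashlib

def cache_key(declared_inputs):
    """Remote cache key over ALL declared inputs (name -> version);
    pruning of unused deps is not reflected, matching the scenario."""
    h = hashlib.sha256()
    for name in sorted(declared_inputs):
        h.update(f"{name}={declared_inputs[name]};".encode())
    return h.hexdigest()

# Developer 1: B's srcs at v2, unused dep A at v1.
dev1_key = cache_key({"B.srcs": "v2", "A": "v1"})
# Developer 2: identical B.srcs, but a different (still unused) A.
dev2_key = cache_key({"B.srcs": "v2", "A": "v7"})
# The keys differ, so developer 2 misses the cache even though
# developer 1's output would have been identical.
```

If the remote key were instead computed over only the used inputs (`{"B.srcs": "v2"}` for both developers), the two keys would coincide and developer 2 would hit the cache.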