--
You received this message because you are subscribed to the Google Groups "bazel-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/CAMNsu3nZ7_Bxa%3DSdg4iv7%3D729rLcmXvJtJz-dUci1fJ0k-%2BWrg%40mail.gmail.com.
--
Lars Clausen
Software Engineer
Google Germany GmbH
One workaround would be to tag the (presumably) few very large tests with `exclusive` (https://docs.bazel.build/versions/main/test-encyclopedia.html#tag-conventions). They are then run alone, which should help with the memory issues, but is likely to make the CI take longer.
There doesn't seem to be an equivalent to the `cpu:N` tag for memory. Since fairly recently (https://github.com/bazelbuild/bazel/commit/d7f0724b6b91b6c57039a1634ff00ccebd872714), there has been support for specifying expected resource usage for a Starlark-defined rule, but that handle is not surfaced in the test rules.
I take it you've already spent some time trying to reduce the size of the tests themselves.
If you use workers, consider reducing `--worker_max_instances` from the default 4, if you see workers sitting idle holding on to a lot of memory. We're looking into better management of worker memory usage.
On Tue, Mar 29, 2022 at 4:58 PM Lars Clausen <lar...@google.com> wrote:
> One workaround would be to tag the (presumably) few very large tests with `exclusive` (https://docs.bazel.build/versions/main/test-encyclopedia.html#tag-conventions). They are then run alone, which should help with the memory issues, but is likely to make the CI take longer.
> There doesn't seem to be an equivalent to the `cpu:N` tag for memory. Since fairly recently (https://github.com/bazelbuild/bazel/commit/d7f0724b6b91b6c57039a1634ff00ccebd872714), there has been support for specifying expected resource usage for a Starlark-defined rule, but that handle is not surfaced in the test rules.
> I take it you've already spent some time trying to reduce the size of the tests themselves.
Probably not enough, but the JVM footprint is pretty high, and most tests would qualify as enormous. I guess we could use `--local_ram_resources` to reduce the total memory consumption (equivalently, scaling up the logical per-test memory), but we haven't tried that yet.
> If you use workers, consider reducing `--worker_max_instances` from the default 4, if you see workers sitting idle holding on to a lot of memory. We're looking into better management of worker memory usage.
We're using multiplex workers, so my understanding is that we should already have only one instance per mnemonic.
The alternatives we're looking at:
- reduce parallelism globally (`--jobs`). Cons: conservative and slower, but most likely to solve the issue. Requires tweaking.
- reduce memory consumption (`--local_ram_resources`). Cons: needs tweaking, and we're unsure how the scheduler works.
- reduce the OOM score for the Bazel server process. This may still kill individual tests, which is somewhat better but not a solution (the CI run would still fail, but it would partially populate the cache, so the next run would be more likely to succeed).
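For the OOM-score option, a minimal Python sketch of the mechanics on Linux is below. It assumes the standard `/proc/<pid>/oom_score_adj` interface (range -1000 to 1000, lower means less likely to be killed). The helper name `set_oom_score_adj` and the `proc_root` parameter are made up for illustration; the parameter exists only so the helper can be tried against a fake directory. Getting the server PID, for example from `bazel info server_pid`, is left out.

```python
import os
import tempfile

# Range accepted by the Linux oom_score_adj file.
OOM_ADJ_MIN, OOM_ADJ_MAX = -1000, 1000


def set_oom_score_adj(pid: int, value: int, proc_root: str = "/proc") -> str:
    """Write `value` to <proc_root>/<pid>/oom_score_adj and return the path.

    Lower values make the kernel OOM killer less likely to pick the process.
    Lowering the value on a real process normally needs elevated privileges.
    """
    if not OOM_ADJ_MIN <= value <= OOM_ADJ_MAX:
        raise ValueError(
            f"oom_score_adj must be in [{OOM_ADJ_MIN}, {OOM_ADJ_MAX}]"
        )
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(str(value))
    return path


def demo() -> str:
    """Exercise the helper against a fake proc tree; no real process is touched."""
    with tempfile.TemporaryDirectory() as fake_proc:
        os.makedirs(os.path.join(fake_proc, "1234"))  # hypothetical PID
        path = set_oom_score_adj(1234, -500, proc_root=fake_proc)
        with open(path) as f:
            return f.read()
```

As noted in the list, this only shifts which process gets killed: the tests themselves remain candidates for the OOM killer.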
> The alternatives we're looking at:
> - reduce parallelism globally (`--jobs`). Cons: conservative and slower, but most likely to solve the issue. Requires tweaking.
That's a very brutal solution indeed. Since it's only tests causing this problem, `--local_test_jobs` would be more suitable.
> - reduce memory consumption (`--local_ram_resources`). Cons: needs tweaking, and we're unsure how the scheduler works.
This is probably your best handle. The scheduler estimates the memory usage of each action and doesn't schedule an action if there isn't enough memory available (except that it always schedules at least one, to make progress). The actions may end up using more memory than estimated, but lowering this value is a good start for sure. It defaults to 2/3 of the host memory, so with your giant tests actual usage is almost guaranteed to end up much higher. Try aggressively reducing it until you start seeing the CI runs take longer.
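The admission rule described above can be illustrated with a toy model in Python. This is a simplified sketch, not Bazel's actual scheduler: it groups actions into sequential "waves" rather than starting new actions as running ones finish, and the action names and RAM estimates are invented for the example.

```python
from collections import deque
from typing import Deque, List, Tuple

# (name, estimated RAM in MB); both are hypothetical example values.
Action = Tuple[str, int]


def schedule_waves(actions: List[Action], budget_mb: int) -> List[List[str]]:
    """Group actions into waves of concurrently running actions.

    An action joins the current wave only if the summed estimates stay
    within budget_mb. The first action of a wave is always admitted, even
    if it exceeds the budget on its own, so that progress is made.
    """
    pending: Deque[Action] = deque(actions)
    waves: List[List[str]] = []
    while pending:
        wave: List[str] = []
        used_mb = 0
        # Admit actions in order until the next one no longer fits.
        while pending:
            name, estimate_mb = pending[0]
            if wave and used_mb + estimate_mb > budget_mb:
                break
            pending.popleft()
            wave.append(name)
            used_mb += estimate_mb
        waves.append(wave)
    return waves
```

With a budget of 1000 MB and estimates of 400, 400, 400 and 5000 MB for actions `a`, `b`, `c` and `huge`, the model yields `[["a", "b"], ["c"], ["huge"]]`: the oversized action still runs, but alone. Lowering the budget only reduces how many actions run side by side; it does not cap what each action actually uses.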
Just to make sure I understand: setting `--local_ram_resources=HOST_RAM*0.66` would be a no-op, so we need to start below that? Or is HOST_RAM itself already set to 2/3 of total host memory?
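To put numbers on the question, here is a small arithmetic sketch in Python. It takes the 2/3 default mentioned earlier in the thread at face value and assumes that HOST_RAM stands for the machine's total memory. That second assumption is exactly what the question asks about, so treat it as unconfirmed here. The 64 GiB host size and the function name are hypothetical.

```python
def ram_budget_mb(host_ram_mb: int, factor: float) -> int:
    """Budget in MB for a setting of the form HOST_RAM*<factor>.

    Assumes HOST_RAM is the total host memory in MB (unconfirmed in the thread).
    """
    return int(host_ram_mb * factor)


HOST_RAM_MB = 64 * 1024  # hypothetical 64 GiB CI machine

# Roughly the default described in the thread (2/3 of host memory).
default_budget = ram_budget_mb(HOST_RAM_MB, 0.67)
# The value from the question: only slightly below the default.
near_default = ram_budget_mb(HOST_RAM_MB, 0.66)
# A more aggressive starting point for experiments.
half_budget = ram_budget_mb(HOST_RAM_MB, 0.5)
```

Under those assumptions, a factor of 0.66 lowers the budget by well under 2% compared with the default, so a noticeable effect would need a clearly smaller factor.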