Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

Side effects of clang plugin forcing out-of-line constructors

129 views
Skip to first unread message

Devon Loehr

unread,
Sep 16, 2024, 12:25:17 PM9/16/24
to cxx
One of our clang plugin checks forces constructors/destructors to be declared out-of-line for complex classes (documentation), which reduces binary size. Unfortunately, this has several negative consequences.

One of them is that it disqualifies classes which otherwise would qualify as aggregates, prevents them from being trivially copyable, etc (crbug.com/355003174). This can lead to annoying hacks to work around it, and prevents some handy patterns like designated initializers.

Less directly, a recent bug brought up the fact that we have many classes which are "accidentally" not moveable, because they have a copy constructor but no move constructor. One cause of this is that our style plugin will require users to define a copy constructor, but won't force them to make a move constructor along with it.

Drawing from the linked discussions, it seems there are a number of things we could do to try to address this (not necessarily mutually exclusive):
  1. Update the plugin to require explicitly defining move constructors per the rule of 5 (even if just default/delete).
  2. Update just the plugin's message to recommend adding a move constructor whenever it requires adding a copy constructor.
  3. Add an easier way to opt-out of this check.
  4. Remove this check from the plugin entirely.
  5. Alongside (3) or (4), see if we can tweak LLVM inlining heuristics to reduce the binary size impact.
  6. Enable the performance-move-const-arg flag in clang-tidy, which (among other things) highlights some of the issues with classes that lack move constructors  (crbug/364788123).
Personally I'd lean towards (1), (3), and eventually (6). (1) helps adhere to C++ guidelines, and (6) seems to be providing useful information. (3) lets us choose the rule of 0 when necessary, allowing us to benefit from things like aggregates which have the right behavior only if we don't interfere. It's frustrating that we need to make this choice, but I don't know of any way to avoid it.

I'm looking for thoughts about how we should address this. What's the best way to resolve this? Are there other options I didn't outline above? Is there a way to "have it all" by avoiding mass-inlining without explicitly declaring constructors?

--Devon Loehr

danakj

unread,
Sep 16, 2024, 12:32:44 PM9/16/24
to Devon Loehr, cxx, Daniel Cheng
Daniel (cc'd) currently owns getting rid of this check entirely. If that's not possible, some of these other options may be needed as a fallback.

It also causes issues with constexpr: https://crbug.com/40264722

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/384738ca-2c74-44a5-8256-98893af30fc5n%40chromium.org.

Daniel Cheng

unread,
Sep 25, 2024, 2:09:42 AM9/25/24
to danakj, Devon Loehr, cxx
We've had several forms of these discussions, and I'd really like to avoid adding an escape hatch.

I think we should be investing in (5) in parallel with (4)

From my side, I've done two things:
I made a rewriting tool that automatically reverses the out-of-lined special member functions: crrev.com/c/5784484
Unfortunately, Chrome /really/ loves forward declares, which means the current rewriter doesn't work—many types don't have a complete definition in the header, so the result just doesn't compile. I'm working on limiting the scope of the rewrites using some different criteria, but I've been a bit distracted by other things that came up in the last week. The goal here is to try to get an idea of how much binary size will be impacted—and if we feel it's useful, we can use it for whatever rewrites we deem appropriate.

The second thing is to abuse Mojo code generation as a proxy for measuring the size impact. Since the plugin does not warn on generated code, we can use it to conduct some quick experiments. Some numbers measuring the effect when changing how the destructor is defined [1]:

                        | arm32         | arm64          | CL
------------------------+---------------+----------------+----------------------------
= default               | +3,171 bytes  | +72,256 bytes  | https://crrev.com/c/5876826
{}                      | +3,895 bytes  | +73,912 bytes  | https://crrev.com/c/5873712
ALWAYS_INLINE {}        | -525 bytes    | +86,088 bytes  | https://crrev.com/c/5873713
NOINLINE {}             | +30,887 bytes | -168,212 bytes | https://crrev.com/c/5873714
ALWAYS_INLINE = default | -1,265 bytes  | +84,060 bytes  | https://crrev.com/c/5889327
NOINLINE = default      | +11,603 bytes | -263,324 bytes | https://crrev.com/c/5889326

From these results:
- It's not exactly clear how clang decides to inline things when there's no ALWAYS_INLINE or NOINLINE attribute. The results for both `= default` and a destructor with an explicitly empty body (meaning that the destructor would never be trivial) both lie somewhere in between the ALWAYS_INLINE and NOINLINE experiments. It would be really nice if someone could help shed light on this.
- It's interesting that the most straightforward options for defaulting destructors have minimal effects on arm32. It would be interesting to understand what arm32 build flags are affecting that and how.
- NOINLINE + `= default` looks somewhat viable, but the problem is NOINLINE overrides PGO inlining decisions, and that is probably not desirable. If we could fix it... maybe this could be an option?
- Can we somehow default to whatever codegen behavior arm32 is using for special member functions that are defaulted in the header?

We should also get some updated desktop numbers. Previously, the impact on Linux desktop was much higher (+600KB for a stripped binary with a similar Mojo change). That seems quite large, if those numbers still hold up :)

I haven't checked how many instances/instantiations were affected. Knowing that would be useful for getting a better estimate of the actual effect if this restriction were lifted. It is also important to note that this is only the effect of *destructors*; we do not measure the effect of using = default for constructors or copy operations or move operations.

Being able to understand and (ideally) improve codegen would be nice.

There are a couple other concerns that would have to be addressed/mitigated:
- more types need to be completely defined in the headers, which means more includes.
- ignoring the increase in textual includes, the compiler also needs to do more redundant work to instantiate the special member functions in each TU that includes a header with defaulted special operations.

These would both affect build times negatively. I would suggest that in most cases, we still want to default special members out-of-line in the .cc, and we would only allow explicit = default in headers (or omitting it altogether) for structs.

Finally, if we decide that we can tolerate the size/build time increase without any further work, it would be nice to have tooling to measure how this is affecting the binary size/build times so we could determine if follow up toolchain work is needed.

Daniel

[1] We do not measure other special members here, because it's harder and the underlying Mojo struct itself may not actually be intrinsically copyable or movable.

Andrew Grieve

unread,
Sep 25, 2024, 10:15:48 AM9/25/24
to Daniel Cheng, chrome-pgo, danakj, Devon Loehr, cxx
Some insights on size:

android-binary-size trybot currently enables AFDO profiles for arm32, but disables PGO profiles for arm64 (as of sept 4).

The reason the profiles are still applied on arm32, is that they are collected from ChromeOS devices (so are >weeks out of date and don't cover android-specific branches) and arm32 enables a flag to have code that doesn't match the profile optimized for size. The arm32 configuration has generally worked well, as it doesn't suffer from CLs that deviate from an accurate profile (because the profile isn't accurate to start with), and because it errors on the side of smaller code.

Anyways, the differing configurations and stale profiles make the numbers fairly untrustworthy :(
  • It could very well be that with a PGO profile applied, the arm64 numbers would look very different.
  • While AFDO profiles are generally not accurate, they probably *are* fairly accurate for mojo-generated code since it's the same on desktop, so changed hot symbols here will be moved from -O2 -> -Os
If size is the best metric to judge this on, I think getting accurate PGO numbers for arm64 is the best bet. I don't think there's even a semi-automated way to do this yet though. +chrome-pgo - please correct me if I'm wrong.




Peter Wen

unread,
Sep 25, 2024, 12:51:46 PM9/25/24
to Andrew Grieve, Zhaoyang Li, Daniel Cheng, chrome-pgo, danakj, Devon Loehr, cxx
Theoretically we have an android-arm64-pgo try bot: https://ci.chromium.org/ui/p/chrome/builders/try/android-arm64-pgo

This can be made to work (perhaps removing the skip_profile_upload property as well as make it always build and test even if analyze doesn't think a change is needed). Then the uploaded sha1 can be used in a CL to use a new PGO profile generated at that CL. @Andrew Grieve mentioned there'd be a need to update the GN args when re-running the binary size bot and to be able to effectively skip the generator (so that a fresh baseline is generated instead of one with the old GN args).

There's a good amount of work that would need to be done (including making the try bot actually reliable, something that @Zhaoyang Li has been working on for the mirrored CI bot) to hook everything up and document the steps, but it should be possible to have the bot generate a new PGO profile at a particular CL.

Peter

You received this message because you are subscribed to the Google Groups "chrome-pgo" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chrome-pgo+...@google.com.
To view this discussion on the web visit https://groups.google.com/a/google.com/d/msgid/chrome-pgo/CABiQX1VxHnjCo5NZscNpk9LC_rLMYGjUfH%3DcwzTcmk%3D4atar_Q%40mail.gmail.com.
Reply all
Reply to author
Forward
0 new messages