Automatically adding defaulted move constructors

566 views
Skip to first unread message

Martin Bidlingmaier

unread,
Nov 22, 2022, 6:51:36 PM11/22/22
to c...@chromium.org
I've been tinkering a bit with clang-tidy to automatically insert explicitly
defaulted move constructors for classes that do not have implicit move
constructors. I'm writing because I wanted to see whether something like that
is a) something we want at all, and, if so, to b) possibly get some pointers
or help with this project.

## The problem

C++ is very cautious about generating implicit move constructors. The rules
can be found at [1]. In particular, implicit move constructors are not
generated for classes that have one of the following user-declared methods:
- A copy constructor.
- A copy assignment operator.
- A destructor.

Throughout the  chromium codebase, copy constructors and destructors must be
explicitly declared and defined out-of-line in the .cc file for all
non-trivial classes, even if those methods are defined as `= default`. As a
result, most classes do not have move constructors. If a class does not have a
move constructor, the copy constructor is invoked in places where a move would
be sufficient. Copying complex objects usually takes longer than moving them.
Thus, we're probably missing out on some easy performance gains by not having
move constructors.

## Proposed solution

I propose to automatically insert code that default-defines move constructors
to classes that have defaulted copy constructors but no move constructor.
Such automated changes are possible using custom clang tooling, for example a
clang-tidy check that emits fixit notes. To get an approximate idea of what
would be done, a CL that adds automatically generated move constructors in
//chrome/browser/notifications/scheduler/public can be found at [2].

The linked CL simply fixes all instances where a move constructor is missing,
but we could consider being more selective:
- We could exclude testing code.
- Do we fix classes that have an *implicit* copy constructor but not a move
  constructor? This can happen if a class has an explicitly declared
  deconstructor, which prevents generation of the implicit move constructor
  but not the implicit copy constructors. Note that chromium doesn't allow
  calling implicit copy constructors on non-trivial classes, and there's a
  custom clang-plugin that checks this for us. In those cases, we'd add an
  explicitly declared move constructor even though it is never called, at
  least for now. This check is only active for classes that are defined in a
  header files, however, and doesn't apply to assignment operators. Classes
  that are local to some .cc file are not required to have an explicitly
  defined copy constructor, and for those an explicitly declared move
  constructor could still be beneficial.
- Only fix instances that matter, in the sense that we can measure some
  performance gains. This would be possible if there are only a few
  significant cases. It might well be that we're facing a "fat tail"
  distribution though, in the sense that many missing moving constructors have
  a small but relevant effect on performance.

Personally, I think we should simply fix all cases that we find, since the
code change isn't very invasive. This would also allow us to turn on the
cppcoreguidelines-special-member-functions clang-tidy check [6] so that new
code is more likely to define move constructors. It would also allow us to
apply more move-related automated fixes, e.g. modernize-pass-by-value [3]
mentioned below.

## Discussion

The benefit would be improved performance. Here are some scenarios where we
should see some gains:
+ Move-aware containers such as std::vector<T> will grow more efficiently if T
  has a move constructor.
+ If T is a member of S, where S is moved instead of copied in some places.
+ Some edge cases where local variabes are returned. If an implicit conversion
  needs to occur here, the variable is treated as an rvalue, which means that
  potentially a move constructor is invoked.
+ It makes possible future automated code changes that insert std::move where
  possible more useful. See for example the clang-tidy check
  modernize-pass-by-value [3].

Some downsides:
- Unlikely but possible introduction of bugs. One particular scenario I can
  see is a use-after-move that currently doesn't cause problems because of a
  missing move constructor, so that the copy constructor is invoked instead.
  However, the bugprone-use-after-move clang-tidy check [4] is usually run on
  chromium CLs, so there's hopefully very few instances of such code in
  chromium.
- Class definitions will contain more boilerplate.
- Binary size might increase because of the new move constructors.
- Compile times might increase.
- git blame becomes slightly less readable. However, the change would almost
  entirely only add new lines instead of changing existing lines, so this
  shouldn't be a big concern.

## Implementation details

I started by writing a clang-tidy check that emits fixit notes to insert
defaulted move constructors (and assignment operators). However, emitting just
move constructors is often not enough, because the presence of an explicit move
constructors or move assignment operator means that some other implicit methods
are not generated:
- The implicit default (nullary/zero-argument) constructor.
- The implicit copy constructor.
- The implicit copy assignment operator.

If a class has one of these implicit methods, then we must also explicitly
define them as `= default` so as to not break code relying on these methods.
For some classes, however, the defaulted copy constructor would be deleted, so
we must also detect such cases.

C++ allows compilers to defer instantiation of implicit methods to when they
are used. For example,

  std::is_copy_constructible<std::vector<std::unique_ptr<int>>>::value

is true because std::vector declares a copy constructor. Actually invoking the
copy constructor will force the compiler to instantiate it, leading to a
compiler error.

Clang will thus often not compute whether or not implicit copy constructors
are valid or not if they are not instantiated in the same compilation unit. I
couldn't figure out how to make clang compute information about whether or not
implicit methods are valid eagerly without aborting the compilation. This
might still be easy though, I'm just not very familiar with clang. I tried
invoking the Sema method [5] eagerly, but this didn't work.

As a hacky solution, my clang-tidy check speculatively emits defaulted
versions of default constructors, copy constructors and copy assignments when
it can't decide whether or not they would be deleted. I've then written a
small script that applies these fixit notes and checks whether chromium still
compiles. If not, the script discards the fixit note.


[1] https://en.cppreference.com/w/cpp/language/move_constructor#Implicitly-declared_move_constructor
[2] https://chromium-review.googlesource.com/c/chromium/src/+/4047193
[3] https://clang.llvm.org/extra/clang-tidy/checks/modernize/pass-by-value.html
[4] https://releases.llvm.org/11.1.0/tools/clang/tools/extra/docs/clang-tidy/checks/bugprone-use-after-move.html
[5] https://clang.llvm.org/doxygen/classclang_1_1Sema.html#a5ca0c4cb811521246b3d0d3d3883e9c7
[6] https://clang.llvm.org/extra/clang-tidy/checks/cppcoreguidelines/special-member-functions.html
--

Martin Bidlingmaier

Software Engineer

mb...@google.com


Google Germany GmbH

Erika-Mann-Straße 33

80636 München


Geschäftsführer: Paul Manicle, Liana Sebastian

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Peter Kasting

unread,
Nov 22, 2022, 9:26:21 PM11/22/22
to cxx, mb...@google.com
I think the idea is plausible; the probable-next-step is to take some kinda-working script, run it over some subdirectory where we think it really ought to have a perf impact, and then check the resulting binary to see if it actually has a perf impact.  The other thing we'd probably want is to check what (if any) binary size impact this has.

PK

dan...@chromium.org

unread,
Nov 23, 2022, 9:59:33 AM11/23/22
to Martin Bidlingmaier, c...@chromium.org
FWIW we did see a bug of this nature when preparing for C++20. A caller was passing a T with std::move() but it did not support moving, so it was copied. It then relied on the moved-from T not actually being moved-from. So things broke when it was allowed to move.

I wonder if it would be possible to flag types that are move()'d but would be having a move constructor added by the tool, so those move-sites could be examined by humans?
 
--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAMOjFMiZQkmqvpqM7iORzEESm%3D1%2BmzDW_kTbwessNKdqaKR5Tg%40mail.gmail.com.

Martin Bidlingmaier

unread,
Nov 24, 2022, 10:30:29 AM11/24/22
to dan...@chromium.org, c...@chromium.org
> the probable-next-step is to take some kinda-working script, run it over some subdirectory where we think it really ought to have a perf impact, and then check the resulting binary to see if it actually has a perf impact.
Do you have some suggestions on which directories would be a good fit here? I was thinking about //chrome, //third_party/blink and //v8.

> FWIW we did see a bug of this nature when preparing for C++20. A caller was passing a T with std::move() but it did not support moving, so it was copied. It then relied on the moved-from T not actually being moved-from.
Do you have a link to the commit? I'd be interested in whether bugprone-use-after-move fired there.

> I wonder if it would be possible to flag types that are move()'d but would be having a move constructor added by the tool, so those move-sites could be examined by humans?
Yes, this should be possible. Perhaps it'd make sense to patch the bugprone-use-after-move clang-tidy check to output just those locations where the type is one we've added a move constructor to.

Joe Mason

unread,
Nov 24, 2022, 2:30:10 PM11/24/22
to Martin Bidlingmaier, dan...@chromium.org, c...@chromium.org
//base is usually a good choice, since it's not too big, but it's used everywhere so any perf impact would be magnified. On the other hand the code there is more carefully scrutinized so there might not be as many missing move constructors to add.

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
Reply all
Reply to author
Forward
0 new messages