TL;DR
On Linux+clang-O2+official_build std::vector's push_back() seems slower (up to 2x for vector<int>) and quite binary inefficient than resize()+data() / operator[]. This has nothing to do with memory / allocator / expansion policies.
Bonus: data() seems to give better alignment hints vs operator[]. For vector<int> looping over .data() seems ~9% faster than operator[] due to the use of double-quadword registers.
In the context of trying to reproduce go/minimize-pointer-following (nothing secret but unfortunately the link is internal only [1]) with our toolchains, I ended up with the following:
That vector-pointer-deref-is-not-hoisted issue doesn't seem to repro for me on Linux with clang -O2. In both cases (caching the data() ptr vs using operator[] on the pointer) the vector ptr dereferencing is hoisted outside of the loop (good). There is only something funny going on with alignment: the former is 9% faster as the compiler realizes it can use double-quadword instructions.
The thing that jumped to my attention though is the 3rd experiment using push_back(). That:
- For the specific case of inserting into a vector<int> push_back() seems 2X slower than the two other alternatives (resize() + data() and operator[])
Probably it is just me, but until now I would have never expected push_back to be _that_ slow and binary-inefficient. I know that push_back() generally yields to safer and more readable code but, if I didn't do anything silly, this seems to suggest that .data() is faster for hot loops.
Talking about which, I am not sure how much of them we have. An
codesearch attempt [4] suggests that we have ~3600 instances of push_back() within a loop.
[1] The TL;DR of that doc is that passing a pointer to std::vector<int>* to a function which reads/writes via .data()/operator[] can cause the vector ptr derefencing operation to _not_ be hoisted outside of the loop due to aliasing. Which doesn't seem to happen here.