To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/30b84748-d2ab-4a32-be37-18406bcaa829n%40chromium.org.
Fair point. I'll try to finish cleaning up the docs around this to allow using it (but won't update the default recommendations for maps yet).... to start the discussion around default recommendations, my belief is that we should default to using absl::flat_hash_map everywhere.Pros:- decent default behaviour in most cases.- an empty collection is about the same size as a std::unordered_map.- insert/removal are O(1) instead of O(n) like base::flat_map (which is the current default recommendation; this has caused perf issues in a number of places)- special cases for small object sizes (1, 2, 3, 5, 7)- at small sizes and "reasonably" sized objects(i.e. std::string -> std::string), competitive or better than the STL containers- has built-in protections against dereferencing end (which has been responsible for some security bugs). That being said, I think libc++ is slowly gaining these protections as well.- codegen is comparable to STL typesCons:- base::flat_map is strictly better from a size perspective- an empty flat_map is 3 words; an empty flat_hash_map is 5 words.- the max load factor for flat_hash_map is 87.5%, so it's utilization will always be somewhat worse than base::flat_map- Large key/value types are more of a drag on absl::flat_hash_map. That being said, base::flat_map suffers from this problem too—and the solution to this is to use absl::node_hash_map or wrap the value in std::unique_ptr.Curious what people think about these points (or who would need to sign-off/approve these sorts of changes).
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAF3XrKrk_tvcHgujcSLWXrtxi3o9b_xebMHYpWRvm-XhQfaTJQ%40mail.gmail.com.
On Mon, Apr 29, 2024 at 3:26 PM Daniel Cheng <dch...@chromium.org> wrote:Fair point. I'll try to finish cleaning up the docs around this to allow using it (but won't update the default recommendations for maps yet).... to start the discussion around default recommendations, my belief is that we should default to using absl::flat_hash_map everywhere.Pros:- decent default behaviour in most cases.- an empty collection is about the same size as a std::unordered_map.- insert/removal are O(1) instead of O(n) like base::flat_map (which is the current default recommendation; this has caused perf issues in a number of places)- special cases for small object sizes (1, 2, 3, 5, 7)- at small sizes and "reasonably" sized objects(i.e. std::string -> std::string), competitive or better than the STL containers- has built-in protections against dereferencing end (which has been responsible for some security bugs). That being said, I think libc++ is slowly gaining these protections as well.- codegen is comparable to STL typesCons:- base::flat_map is strictly better from a size perspective- an empty flat_map is 3 words; an empty flat_hash_map is 5 words.- the max load factor for flat_hash_map is 87.5%, so it's utilization will always be somewhat worse than base::flat_map- Large key/value types are more of a drag on absl::flat_hash_map. That being said, base::flat_map suffers from this problem too—and the solution to this is to use absl::node_hash_map or wrap the value in std::unique_ptr.Curious what people think about these points (or who would need to sign-off/approve these sorts of changes).I think your conclusions sound very reasonable.It's hard to tell if it is going off the rails by changing default guidance though. Without actually rewriting everything in prod, I wonder if it's possible to do some hacky rewrite of all our base::flat* to absl types and look at the APK size and run speedometer locally (without PGO) just to make sure it's not super regrettable at scale?
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAAAE6AT0P9SvRn%3DYVKX-2Yvd7y_waLM2444rh_cemTWYC%2BXiDg%40mail.gmail.com.
Fair point. I'll try to finish cleaning up the docs around this to allow using it (but won't update the default recommendations for maps yet).
... to start the discussion around default recommendations, my belief is that we should default to using absl::flat_hash_map everywhere.Pros:- decent default behaviour in most cases.- an empty collection is about the same size as a std::unordered_map.- insert/removal are O(1) instead of O(n) like base::flat_map (which is the current default recommendation; this has caused perf issues in a number of places)- special cases for small object sizes (1, 2, 3, 5, 7)- at small sizes and "reasonably" sized objects(i.e. std::string -> std::string), competitive or better than the STL containers- has built-in protections against dereferencing end (which has been responsible for some security bugs). That being said, I think libc++ is slowly gaining these protections as well.- codegen is comparable to STL typesCons:- base::flat_map is strictly better from a size perspective- an empty flat_map is 3 words; an empty flat_hash_map is 5 words.- the max load factor for flat_hash_map is 87.5%, so it's utilization will always be somewhat worse than base::flat_map- Large key/value types are more of a drag on absl::flat_hash_map. That being said, base::flat_map suffers from this problem too—and the solution to this is to use absl::node_hash_map or wrap the value in std::unique_ptr.Curious what people think about these points (or who would need to sign-off/approve these sorts of changes).
There is another problem too, which is that Mojo has settled on base::flat_map as its default container. While it's possible to use typemapping to use an alternate map type instead, this is fairly unwieldy.
In fact, Mojo maps have another problem: the Mojo `map` type is sorted in regular variants... and unsorted in the Blink variants. This isn't great, and I kind of feel like we should probably migrate explicitly to sorted_map (which would be uniformly sorted in both variants) and unsorted_map (which would be uniformly unsorted in both variants)...