Add new class to avoid creating unnecessary AtomicStrings. [chromium/src : main]

0 views
Skip to first unread message

Luis Fernando Pardo Sixtos (Gerrit)

unread,
Oct 26, 2021, 1:34:49 PM10/26/21
to Kevin Babbitt, blink-re...@chromium.org, blink-...@chromium.org, kinuko...@chromium.org

Attention is currently required from: Kevin Babbitt.

Luis Fernando Pardo Sixtos would like Kevin Babbitt to review this change.

View Change

Add new class to avoid creating unnecessary AtomicStrings.

Some instances of AtomicStrings are only use for lookups in hash maps
that could be performed with only the string hash and the StringImpl
pointer in a non dereferenceable form. (e.g. getElementById)

This CL adds the MaybeAtomicString class containing this information.
The new class allows safe conversion back to AtomicString if the string
was already in the AtomicStringTable. This keeps externalization for
such strings, however strings that were not in the AtomicStringTable
can't be externalized.


Bug: 1083392
Change-Id: I4f50306876f0210bc5c2cc05a4a9f5acfd485c91
---
M third_party/blink/renderer/platform/wtf/text/atomic_string_table.h
A third_party/blink/renderer/platform/wtf/text/maybe_atomic_string.h
A third_party/blink/renderer/platform/wtf/text/maybe_atomic_string.cc
M third_party/blink/renderer/platform/wtf/text/atomic_string_table.cc
M third_party/blink/renderer/platform/wtf/BUILD.gn
M third_party/blink/renderer/platform/wtf/text/atomic_string.cc
M third_party/blink/renderer/platform/wtf/text/atomic_string.h
7 files changed, 260 insertions(+), 15 deletions(-)


To view, visit change 3238152. To unsubscribe, or for help writing mail filters, visit settings.

Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: I4f50306876f0210bc5c2cc05a4a9f5acfd485c91
Gerrit-Change-Number: 3238152
Gerrit-PatchSet: 9
Gerrit-Owner: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Reviewer: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-Reviewer: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Attention: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-MessageType: newchange

Luis Fernando Pardo Sixtos (Gerrit)

unread,
Oct 26, 2021, 1:34:57 PM10/26/21
to blink-re...@chromium.org, blink-...@chromium.org, kinuko...@chromium.org, Kevin Babbitt, Tricium, Chromium LUCI CQ, chromium...@chromium.org

Attention is currently required from: Kevin Babbitt.

View Change

2 comments:

  • Patchset:

    • Patch Set #9:

      Hi Kevin, PTAL.

      I split the change in 3 CLs.
      1. Adds the new class and modify AtomicStringTable code.
      2. Connects the class with the bindings layer.
      3. Replaces AtomicStrings on getElement by Id.

      Current perf tests show around 4% improvement in get-element-by-id and more than 100% improvement in undefined-get-element-by-id. For our objective it should be enough to not have regressions before making the refcount changes atomic.

      As I mentioned in the code comment, I think that we could replace the WeakResult class with MaybeAtomicString, please let me know your thoughts on it.

  • File third_party/blink/renderer/platform/wtf/text/atomic_string_table.h:

    • Patch Set #7, Line 53:

      class WeakResult {
      public:
      WeakResult() = default;
      explicit WeakResult(StringImpl* str)
      : ptr_value_(reinterpret_cast<uintptr_t>(str)) {
      CHECK(!str || str->IsAtomic() || str == StringImpl::empty_);
      }

      bool IsNull() const { return ptr_value_ == 0; }

      private:
      friend bool operator==(const WeakResult& lhs, const WeakResult& rhs);
      friend bool operator==(const StringImpl* lhs, const WeakResult& rhs);

      // Contains the pointer a string in a non-deferenceable form. Do NOT cast
      // back to a StringImpl and dereference. The object may no longer be alive.
      uintptr_t ptr_value_ = 0;
      };

      I used this class as the intermediary API between AtomicStringTable and MaybeAtomicString to keep consistency on how the WeakFind APIs interact with the table. I would prefer to delete this class and replace its instances with MaybeAtomicString.

To view, visit change 3238152. To unsubscribe, or for help writing mail filters, visit settings.

Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: I4f50306876f0210bc5c2cc05a4a9f5acfd485c91
Gerrit-Change-Number: 3238152
Gerrit-PatchSet: 9
Gerrit-Owner: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Reviewer: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-Reviewer: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Attention: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 17:34:46 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Gerrit-MessageType: comment

Kevin Babbitt (Gerrit)

unread,
Oct 28, 2021, 12:48:22 AM10/28/21
to Luis Fernando Pardo Sixtos, blink-re...@chromium.org, blink-...@chromium.org, kinuko...@chromium.org, Tricium, Chromium LUCI CQ, chromium...@chromium.org

Attention is currently required from: Luis Fernando Pardo Sixtos.

View Change

3 comments:

  • File third_party/blink/renderer/platform/wtf/text/maybe_atomic_string.h:

    • Patch Set #10, Line 52:

          static bool Equal(const AtomicString& str, const MaybeAtomicString& key) {
      return reinterpret_cast<uintptr_t>(str.Impl()) == key.hint_;
      }

      I did some digging and found commentary explaining why the hints are treated as opaque. https://chromium-review.googlesource.com/c/chromium/src/+/2263652/14..30/third_party/blink/renderer/core/dom/element.cc#b980

      In a nutshell: Dereferencing these results is only safe in the immediate term, since AtomicStringTable does not hold reference on the strings it stores. One example where things can go bad is if we have collection C with AtomicStrings S1, S2, S3; we WeakFind and get a hit on S4 which is held by some other structure. If that other structure goes away while we still have the WeakResult on S4, the pointer we have is no longer valid, and worse the underlying memory might be reallocated to point to some other string.

      So I think this comparison is safe from a data lifetime perspective, but I'm not convinced it will always produce the expected result. Continuing the example above, suppose after S4 goes away, we construct AtomicString S5 which gets the memory left behind by S4. Now our MaybeAtomicString for S4 matches the content for S5. We'd still be okay in terms of using this as a lookup in C because S4 == S5 != {S1, S2, S3}. But calling ToAtomicString() would now give us S5 when before it would have given us S4.

      With all that in mind, I think if we want to be able to convert a MaybeAtomicString to an AtomicString, we have to take the hit of AddRef'ing the underlying StringImpl. Which would not be any worse than without MaybeAtomicString; in cases where the string isn't in the table we can still avoid adding it.

      (Aside: It would be awesome if, as part of introducing MaybeAtomicString, we brought some of the details from that discussion thread into code comments.)

    • Patch Set #10, Line 55:

          static bool Equal(const StringImpl* str, const MaybeAtomicString& key) {
      return reinterpret_cast<uintptr_t>(str) == key.hint_;
      }

      This has the same issue I pointed out with operator==(const MaybeAtomicString&, const StringImpl*) below.

    • Patch Set #10, Line 70:

      inline bool operator==(const MaybeAtomicString& string,
      const StringImpl* impl) {
      return string.Hint() == reinterpret_cast<uintptr_t>(impl);
      }

      inline bool operator==(const StringImpl* impl,
      const MaybeAtomicString& string) {
      return string == impl;
      }

      These comparisons are valid if `impl` happens to be held by an AtomicString, but it can return a false negative if `impl` is held by a String that has the same character data as `string` but is not the same instance, e.g.
      // adds "hello" to table
      AtomicString atomic("hello");
      // finds "hello" in the table and captures a hint to it
      MaybeAtomicString maybe_atomic("hello");
      // allocates a new StringImpl with "hello"
      String non_atomic("hello");
      // actually returns false
      EXPECT_TRUE(maybe_atomic == non_atomic.Impl());

To view, visit change 3238152. To unsubscribe, or for help writing mail filters, visit settings.

Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: I4f50306876f0210bc5c2cc05a4a9f5acfd485c91
Gerrit-Change-Number: 3238152
Gerrit-PatchSet: 10
Gerrit-Owner: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Reviewer: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-Reviewer: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Attention: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Comment-Date: Thu, 28 Oct 2021 04:48:08 +0000

Luis Fernando Pardo Sixtos (Gerrit)

unread,
Oct 28, 2021, 3:57:25 AM10/28/21
to blink-re...@chromium.org, blink-...@chromium.org, kinuko...@chromium.org, Kevin Babbitt, Tricium, Chromium LUCI CQ, chromium...@chromium.org

Attention is currently required from: Kevin Babbitt.

View Change

4 comments:

  • Patchset:

  • File third_party/blink/renderer/platform/wtf/text/maybe_atomic_string.h:

    • Patch Set #10, Line 52:

          static bool Equal(const AtomicString& str, const MaybeAtomicString& key) {
      return reinterpret_cast<uintptr_t>(str.Impl()) == key.hint_;
      }

    • I did some digging and found commentary explaining why the hints are treated as opaque. […]

      Yes, this class faces similar restrictions as WeakResult. I can see 2 troublesome scenarios that can happen between the creation of the MaybeAtomicString and the moment we want to compare/convert to AtomicString.
      1. The original AtomicString was deleted.
      2. The original AtomicString was replaced by a new one.

      Case 1:
      Compare will return false and converting will return null. Actually, the only reason we want to be able to convert to AtomicString is to enable externalization, and there are no Release calls in that path, hence it should be safe to just dereference the pointer; I coded it as a lookup to be extra safe, but AFAIK it should not be necessary.

      Case 2:
      It could happen and it would yield wrong results. As with WeakResult, this lookups should only be performed if there have not been new AtomicStrings added to the AtomicStringTable.

      To summarize. Similar to WeakResult, usage of this class can be problematic if not done correctly. I could try to mitigate the risk by adding thorough code comments, and some machinery for DCHECK_IS_ON builds to crash if we ever hit these cases. Please let me know if you think this could minimize the risk enough to keep moving forward with the change.

    • Patch Set #10, Line 55:

          static bool Equal(const StringImpl* str, const MaybeAtomicString& key) {
      return reinterpret_cast<uintptr_t>(str) == key.hint_;
      }

    • This has the same issue I pointed out with operator==(const MaybeAtomicString&, const StringImpl*) b […]

      This translator class is meant to be used only for comparisons within hash tables involving AtomicStrings, perhaps a comment about it or placing the Translator closer to the AtomicStringTable would be enough to address the issue.

    • Patch Set #10, Line 70:

      inline bool operator==(const MaybeAtomicString& string,
      const StringImpl* impl) {
      return string.Hint() == reinterpret_cast<uintptr_t>(impl);
      }

      inline bool operator==(const StringImpl* impl,
      const MaybeAtomicString& string) {
      return string == impl;
      }

    • These comparisons are valid if `impl` happens to be held by an AtomicString, but it can return a fal […]

      You are right, and we actually don't need this operator.

To view, visit change 3238152. To unsubscribe, or for help writing mail filters, visit settings.

Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: I4f50306876f0210bc5c2cc05a4a9f5acfd485c91
Gerrit-Change-Number: 3238152
Gerrit-PatchSet: 10
Gerrit-Owner: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Reviewer: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-Reviewer: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Attention: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-Comment-Date: Thu, 28 Oct 2021 07:57:14 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-MessageType: comment

Kevin Babbitt (Gerrit)

unread,
Oct 28, 2021, 4:38:04 PM10/28/21
to Luis Fernando Pardo Sixtos, blink-re...@chromium.org, blink-...@chromium.org, kinuko...@chromium.org, Tricium, Chromium LUCI CQ, chromium...@chromium.org

Attention is currently required from: Luis Fernando Pardo Sixtos.

View Change

2 comments:

  • File third_party/blink/renderer/platform/wtf/text/maybe_atomic_string.h:

    • Patch Set #10, Line 52:

          static bool Equal(const AtomicString& str, const MaybeAtomicString& key) {
      return reinterpret_cast<uintptr_t>(str.Impl()) == key.hint_;
      }

    • As with WeakResult, this lookups should only be performed if there have not been new AtomicStrings added to the AtomicStringTable.

      My concern is that this is going to be hard to predict/maintain, especially on codepaths that have gnarly legacy behaviors. For example, setAttribute synchronously fires DOMAttrModified in the middle of its work, at which point Javascript can do anything including re-entering setAttribute. There's no way we can guarantee that the AST goes unmodified in that situation, which means if we want to convert a MaybeAtomicString we have to be careful to do it before calling out to script. We can fix that case since we know about it, but it makes me wonder about the cases we don't know of. It seems brittle, and if a mistake does get made with it, the problem will not be obvious, nor will it happen deterministically.

    • Patch Set #10, Line 55:

          static bool Equal(const StringImpl* str, const MaybeAtomicString& key) {
      return reinterpret_cast<uintptr_t>(str) == key.hint_;
      }

    • This translator class is meant to be used only for comparisons within hash tables involving AtomicSt […]

      How about DCHECK(str->IsAtomic())?

To view, visit change 3238152. To unsubscribe, or for help writing mail filters, visit settings.

Gerrit-Project: chromium/src
Gerrit-Branch: main
Gerrit-Change-Id: I4f50306876f0210bc5c2cc05a4a9f5acfd485c91
Gerrit-Change-Number: 3238152
Gerrit-PatchSet: 10
Gerrit-Owner: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Reviewer: Kevin Babbitt <kbab...@microsoft.com>
Gerrit-Reviewer: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Attention: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-Comment-Date: Thu, 28 Oct 2021 20:37:54 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Kevin Babbitt <kbab...@microsoft.com>
Comment-In-Reply-To: Luis Fernando Pardo Sixtos <lpardo...@microsoft.com>
Gerrit-MessageType: comment
Reply all
Reply to author
Forward
0 new messages