should internalizing a string maintain external data when possible?

22 views
Skip to first unread message

Bruce MacNaughton

unread,
Apr 17, 2023, 10:12:19 AM4/17/23
to v8-users
This is my first relatively deep dive into v8 strings; if I misinterpreted what's going on, please correct me.

When a string is used as a key:

const x = 'xyzzy';
// add some external data to x
const obj = { [x]: 1234 };

the external data is lost because StringTable:::LookupKey() finds an entry (data->FindEntry()) and returns a Handle<String> to it. That String is *not* internalized, so
String::MakeThin() is called on the original string (as 'this') with the Handle<String> returned by FindEntry(). MakeThin() finds this->IsExternalString() true so calls MigrateExternalString(), internalized is *not* an External string, so FinalizeExternalString() is called, discarding the external data.

whew.

It seems like, to propagate the external data that, in LookupString(), if string->IsExternal() and !result->IsExternal() then result should be reconstructed as an External string.

It may be that the way it works is intended, but it's not clear because MigrateExternalString() in string.cc checks to see if 'internalized' is an external string (either one or two byte) because defaulting to throw away the external data.

But maybe there are considerations with the shared heap that I have yet to understand.

Basic question: is this is a bug or intentional design?


Bruce MacNaughton

unread,
Apr 17, 2023, 10:30:31 AM4/17/23
to v8-users
"(either one or two byte) because defaulting to throw away the external data" should be "(either one or two byte) BEFORE defaulting to throw away the external data"

Patrick Thier

unread,
Apr 17, 2023, 11:40:23 AM4/17/23
to v8-u...@googlegroups.com
The purpose of internalized strings is to have a canonical, unique representation of a string. So yes if strings are internalized that are duplicates of already internalized values, we "throw away"/discard that duplicate and point them to the unique string instead (i.e. the string becomes a ThinString).
If you want to take ownership of the buffer of an internalized string, you can by externalizing the internalized string (as opposed to internalizing an external string).

Maybe you can tell us what your intent is? Because you mentioned "add some external data to x" I am not sure if external strings are really what you are looking for. The purpose of external strings is only to allow embedders to have ownership of the string buffer, not to attach arbitrary data to strings.

--
--
v8-users mailing list
v8-u...@googlegroups.com
http://groups.google.com/group/v8-users
---
You received this message because you are subscribed to the Google Groups "v8-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-users+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/v8-users/e88c1553-4a78-435f-bc75-cc42f0b3f147n%40googlegroups.com.

Bruce MacNaughton

unread,
Apr 17, 2023, 12:02:50 PM4/17/23
to v8-users
The goal (what we've used them for now) is to have a unique copy of a string so we can track the source of a string. We attach Persistent<v8::Object> properties to the string so that a string from one source can be differentiated from a string from another source. Specifically, we track strings that are user-input.

I was a little fast and loose with the comment about "add some external data to x". More precisely, I create a subclass that extends v8::String::ExternalStringResource and the subclass has a Persistent<v8::Object> member.

Is there another way to differentiate between two strings of identical form/content/representation?

Bruce MacNaughton

unread,
Apr 17, 2023, 7:08:22 PM4/17/23
to v8-users
I think I've had a bit of a misconception; I was jumbling together "external" and our Persistent extension. Externalizing the internalized string wouldn't help because the real issue (for us) is to have the internalized string built on our subclass with the Persistent property. And that doesn't seem possible.
Reply all
Reply to author
Forward
0 new messages