Planning to experiment with inline storage for DOMString / webidl binding codegen

Showing 1-11 of 11 messages
Planning to experiment with inline storage for DOMString / webidl binding codegen Nick Fitzgerald 12/1/15 5:39 PM
One of the largest performance bottlenecks in Element.getAttribute (and I
suspect a bunch of other webidl bindings as well) is copying (often very
small) SpiderMonkey strings into heap allocated DOMStrings. It doesn't seem
practical to always use JS strings as they don't use UTF-8 and it would
require a lot more rooting and generally seems like it would require
rewriting a large amount of code. But, it was pointed out that these (often
very small) strings are always heap allocated.

See #6906 and specifically these comments:

https://github.com/servo/servo/issues/6906#issuecomment-159372789
https://github.com/servo/servo/issues/6906#issuecomment-159379936
https://github.com/servo/servo/issues/6906#issuecomment-159383632

I've created a crate[0] that provides a mostly drop-in replacement of
std::string::String but with inline storage for small strings to avoid heap
allocation. It also provides a StringExt trait to abstract string
operations over both InlinableString and std::string::String. I'm planning
on implementing this trait for DOMString as well, making the codegen emit
bindings code that uses InlinableString for stack allocated strings, and
make the webidl interface methods use `StringExt` instead of `DOMString`
directy. My hope is that this will make a good chunk of that performance
bottleneck go away.

[0] https://github.com/fitzgen/inlinable_string

If I bug you on irc with requests for help, please have patience with me :)

If you see something architecturally wrong with this plan, please let me
know sooner rather than later!

Cheers,

Nick
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Jack Moffitt 12/1/15 5:48 PM
> One of the largest performance bottlenecks in Element.getAttribute (and I
> suspect a bunch of other webidl bindings as well) is copying (often very
> small) SpiderMonkey strings into heap allocated DOMStrings.

Alan Jeffrey has been investigating issues around string copying as
well. Have you chatted with him to compare notes?

jack.
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Nick Fitzgerald 12/1/15 6:00 PM
​I was not aware of that. I'm interested in Alan's initial findings​ and
approach as well! (Pung for more info here:
https://github.com/servo/servo/issues/6906#issuecomment-161153975)
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Alan Jeffrey 12/1/15 6:01 PM
Yes, I've done some experiments with this. Inlining small strings didn't
help as much as I'd hoped, I suspect because they're being converted to
Strings or JSStrings a lot. I'm still in the middle of experimenting, I'm
hoping to write up soon. The code is at
https://github.com/asajeffrey/servo/tree/string_representation_experiments
although it's a bit of a mess.

A.



On Tue, Dec 1, 2015 at 7:48 PM, Jack Moffitt <ja...@metajack.im> wrote:

> > One of the largest performance bottlenecks in Element.getAttribute (and I
> > suspect a bunch of other webidl bindings as well) is copying (often very
> > small) SpiderMonkey strings into heap allocated DOMStrings.
>
> Alan Jeffrey has been investigating issues around string copying as
> well. Have you chatted with him to compare notes?
>
> jack.
> _______________________________________________
> dev-servo mailing list
> dev-...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-servo
>
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Simon Sapin 12/1/15 6:04 PM
On 02/12/15 02:39, Nick Fitzgerald wrote:
> I've created a crate[0] that provides a mostly drop-in replacement of
> std::string::String but with inline storage for small strings to avoid heap
> allocation.
>
> [0]https://github.com/fitzgen/inlinable_string

There’s also [1], already in used in Servo in some places, which behaves
like Vec<T> but stores items inline up to a certain length. I’ve also
been playing with [2] to be like String but with a generic backing
storage. (I’ve used it in [3] to return a single code point in UTF-8 as
StringWrapper<[u8; 4]>.)

Would it make sense to converge with some of these?

[1] https://github.com/servo/rust-smallvec
[2]
https://github.com/SimonSapin/rust-std-candidates/blob/master/string-wrapper/lib.rs
[3] https://github.com/SimonSapin/rust-utf8


--
Simon Sapin
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Nick Fitzgerald 12/2/15 8:50 AM
And there is also ArrayString[0] from the arrayvec crate[1]. I was
considering reusing this and getting rid of the fixed size `InlineString` I
rolled myself as ArrayString is almost identical.

[0] http://bluss.github.io/arrayvec/doc/arrayvec/struct.ArrayString.html
[1] https://crates.io/crates/arrayvec
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Alan Jeffrey 12/2/15 1:06 PM
Well, I tried eliminating the buffering, but I still didn't get any perf
benefit, if anything there was a degrade. Sigh.

https://github.com/asajeffrey/servo/commit/1533238eb8ffff22688572a04d5bd6eb8ac88297

On Wed, Dec 2, 2015 at 10:49 AM, Nick Fitzgerald <nfitz...@mozilla.com>
wrote:
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Andrew McCreight 12/2/15 1:23 PM
For what it is worth, Gecko uses a single-entry per-zone cache for
converting the result of JS strings to DOM strings in getAttribute (I
think). bz added this in bug 773520, and it has had a few refinements since
then. Maybe some kind of caching would help Servo, too.

Andrew
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Alan Jeffrey 12/2/15 1:41 PM
Yes, this is on my list of things to think about.

On Wed, Dec 2, 2015 at 3:23 PM, Andrew McCreight <amccr...@mozilla.com>
wrote:
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Boris Zbarsky 12/2/15 2:04 PM
On 12/2/15 4:23 PM, Andrew McCreight wrote:
> For what it is worth, Gecko uses a single-entry per-zone cache for
> converting the result of JS strings to DOM strings in getAttribute (I
> think)

No, that's the other direction: DOM strings to JS strings.

-Boris
Re: Planning to experiment with inline storage for DOMString / webidl binding codegen Alan Jeffrey 12/5/15 9:05 PM
Another shot at an improved DOMString representation. This one uses one of
3 representations: either a String, an inlined string (for < 16 bytes) or
an interned string:

https://github.com/asajeffrey/servo/tree/domstring-use-atom-or-inlined-string

I did lots of hoop-jumping to get it to fit in the same 3 words as a String.

The performance is still worse than just using a String. Even more
baffling, it uses more memory if you point it at en.wikipedia.org. I can
understand a time penalty caused by straight-line code becoming branching,
but I don't understand a memory penalty.