Moving to a Rust-based XML parser is part of our strategy to address security issues and the difficult security track record of libxml2 and libxslt.
Unfortunately, for XSLT support, libxml2 and libxslt are tightly entangled. However, we can already ship the Rust-based XML parser for scenarios without XSLT.
In this intent to experiment, I propose to roll out the Rust-based XML parser for scenarios where no XSLT processing is required.
For details of "Likely", see Risks below.
XML Parsing and Serialization
In implementing the Rust-based XML parser, we tested against ~400 WPT and internal web tests and brought the failures down to close to 0. For DOMParser and XMLHttpRequest tests, the new parser is already permanently running on bots.
Technically, niche issues remain in serialization: with the new Rust parser, we may occasionally insert an extra xmlns: attribute on a root element due to API restrictions of the XML parser. This does not affect document semantics.
From WPT tests we do not see other compatibility issues, and we believe we can progress to real-world testing for the non-XSLT scenarios.
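To illustrate why an extra namespace declaration on the root is harmless, here is a minimal sketch using Python's standard xml.etree.ElementTree (not Chromium's parser or serializer). The two sample documents and the canonical() helper are hypothetical; the comparison is done on namespace-expanded names, attributes, text, and children, which is the level at which the two documents are expected to be equal.

```python
import xml.etree.ElementTree as ET


def canonical(elem):
    """Return a nested tuple of expanded tag name, attributes, text, and children."""
    return (
        elem.tag,                          # '{namespace-uri}local-name'
        sorted(elem.attrib.items()),       # namespace declarations are not listed here
        (elem.text or "").strip(),
        [canonical(child) for child in elem],
    )


# Hypothetical input, and the same input with an extra, unused xmlns: declaration.
original = '<svg xmlns="http://www.w3.org/2000/svg"><title>demo</title></svg>'
with_extra = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink"><title>demo</title></svg>'
)

same = canonical(ET.fromstring(original)) == canonical(ET.fromstring(with_extra))
print(same)
```

Note that code which inspects raw attribute lists through the DOM could still observe the additional declaration; the sketch only shows that element and attribute names resolve identically.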
Inline XSLT in SVG
There is a minor theoretical risk regarding standalone SVG images (scenario 3 above) that use inline XSLT to transform raw XML data into SVG.
Data: UseCounters (XSLPIInSVGImage and XSLPIInSVGImageStandalone) currently show 0% usage in Canary and Dev.
Current State: This works in Chrome for standalone docs but not for externally referenced images (matching Firefox). Safari supports both.
Signal: Mozilla and Apple both have expressed support for deprecating XSLT in general, not only in this context (WHATWG Issue #11523).
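For readers unfamiliar with the pattern, the following is a rough sketch of what such a document can look like and how the presence of an XSLT processing instruction could be detected. It uses Python's standard xml.dom.minidom and is not how the Chromium UseCounters are implemented; the sample document, the has_xsl_stylesheet_pi() helper, and the restriction to type="text/xsl" are assumptions made for illustration.

```python
import re
from xml.dom import Node, minidom

# Hypothetical document: raw data plus an embedded stylesheet that emits SVG.
INLINE_XSLT_SVG = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="#style"?>
<data>
  <xsl:stylesheet id="style" version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.w3.org/2000/svg">
    <xsl:template match="/">
      <svg width="10" height="10"><circle r="5"/></svg>
    </xsl:template>
  </xsl:stylesheet>
</data>"""

# Assumption: only the common type="text/xsl" pseudo-attribute is checked.
XSL_TYPE = re.compile(r"""type\s*=\s*["']text/xsl["']""")


def has_xsl_stylesheet_pi(source: str) -> bool:
    """Return True if a top-level xml-stylesheet PI declares an XSLT stylesheet."""
    doc = minidom.parseString(source)
    for node in doc.childNodes:
        if (
            node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
            and node.target == "xml-stylesheet"
            and XSL_TYPE.search(node.data)
        ):
            return True
    return False


print(has_xsl_stylesheet_pi(INLINE_XSLT_SVG))
```

A plain SVG document without the processing instruction would make the helper return False, which corresponds to the non-XSLT scenarios covered by this experiment.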
Changing to Rust-based XML parsing eliminates a class (and a long historic chain) of memory corruption bugs in XML parsing. libxml2 has had unstable maintainership and delays in responding to security issues.
Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?
No, except for the likely non-existent usage of XSLT in SVG.
I propose rolling out to 50% of Dev, Canary, and Beta, then progressing to 1% on Stable after observing the new use counter for XSLT usage in SVG and monitoring the perf histogram Blink.XMLParsing.NonXsltXmlParsingTime.Combined.
The new Rust parser is not yet on par with the performance of the libxml2-based parser. It shows a 50% regression in the blink_perf.parser microbenchmark, which measures throughput while parsing a heavy 3MB XML document. This is tracked in https://crbug.com/470367156.
Rolling out at a small percentage to stable helps us gather the required metrics to decide whether this leads to real-world performance implications in the metrics we care about: mainly LCP, and monitoring for major shifts in the UMA histogram for XML parser timing.
Performance. The microbenchmark finding shows that the Rust parser does not yet match the performance of the C parser. Finding out whether this matters in practice is one goal of this experiment.
No issues. Both libxml2 and the Rust parser parse into internal DOM structures, which remain accessible in source view as before.
Experimental rollout to 1% for M147.