Intent to Ship: Intl.Segmenter

Frank Tang

Aug 17, 2020, 7:10:56 PM
to v8-users, blink-dev, v8-...@googlegroups.com

For m87


Contact emails

ft...@chromium.org, sf...@chromium.org

Explainer


https://github.com/tc39/proposal-intl-segmenter

Specification

https://tc39.github.io/proposal-intl-segmenter/

Design docs


https://docs.google.com/document/d/1xugLpLmgRFnNXK8ztariTAbD2IXueDw1T3VNuuZCz8k/edit#heading=h.xgjl2srtytjt
https://docs.google.com/presentation/d/1X2zBU3bZ4ergVMWfubCsdnHFzeaDgqiTRJVgvNGjQBs/edit#slide=id.p

TAG review

Reviewed by ECMA402 and TC39.

Summary

Intl.Segmenter implements methods for finding the location of boundaries in text, including grapheme, word, and sentence boundary analysis.
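
As a rough illustration of the summary above, the following sketch segments a string at word and grapheme boundaries. It assumes the API shape from the linked proposal (a `segment()` method returning an iterable of segment records); the locale and the sample strings are made up for the example.

```javascript
// Sketch only: split a string at word boundaries with Intl.Segmenter.
// The locale "en" and the sample text are illustrative assumptions.
const wordSegmenter = new Intl.Segmenter("en", { granularity: "word" });
const sampleText = "Hello, world!";

// segment() yields records with the segment text, its start index,
// and (for word granularity) an isWordLike flag.
const wordLikeSegments = [...wordSegmenter.segment(sampleText)]
  .filter((part) => part.isWordLike)
  .map((part) => ({ text: part.segment, start: part.index }));

console.log(wordLikeSegments);

// Grapheme granularity keeps user-perceived characters together,
// e.g. "e" followed by a combining acute accent stays one segment.
const graphemeSegmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemes = [...graphemeSegmenter.segment("e\u0301a")].map((g) => g.segment);

console.log(graphemes.length);
```

Punctuation and whitespace still come back as segments of their own; filtering on `isWordLike` is one way to keep only the words.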

Link to “Intent to Prototype” blink-dev discussion

https://groups.google.com/a/chromium.org/g/blink-dev/c/muRQBwyzzPw/m/MXnlnDEdBgAJ

Risks



Interoperability and Compatibility

The specification moved to Stage 3 at the July 2020 TC39 meeting, with support from ECMA402.

Gecko: In development (https://bugzilla.mozilla.org/show_bug.cgi?id=1423593)

WebKit: No signal

Web developers: No signals

Ergonomics

Engineers from Apple believe we should not add line break support to Intl.Segmenter, because developers may abuse the API to perform text layout themselves instead of relying on CSS. The line break feature was therefore removed from the specification in its current form.


Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes

Is this feature fully tested by web-platform-tests?

Yes, via test262: https://github.com/tc39/test262/tree/master/test/intl402/Segmenter

Tracking bug

https://bugs.chromium.org/p/v8/issues/detail?id=6891

Link to entry on the Chrome Platform Status

https://www.chromestatus.com/feature/6099397733515264

This intent message was generated by Chrome Platform Status.

Yoav Weiss

Aug 18, 2020, 4:33:14 AM
to v8-users, blink-dev, v8-...@googlegroups.com
On Tue, Aug 18, 2020 at 1:10 AM Frank Tang <ft...@chromium.org> wrote:

> Gecko: In development (https://bugzilla.mozilla.org/show_bug.cgi?id=1423593)

That issue seems stalled...

> WebKit: No signal

Could you ask for official signals from both?

> Web developers: No signals

Who's asking for this? Why are we implementing? Do we believe it's something developers will use?
 



Frank Tang

Aug 21, 2020, 3:05:48 AM
to v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, Ross Kirsling, v8-users, blink-dev
On Tue, Aug 18, 2020 at 1:33 AM Yoav Weiss <yo...@yoav.ws> wrote:


> > Gecko: In development (https://bugzilla.mozilla.org/show_bug.cgi?id=1423593)
>
> That issue seems stalled...

Zibi (ECMA402 member from Mozilla), could you comment on how likely Gecko is to support Intl.Segmenter?
 

> > WebKit: No signal
>
> Could you ask for official signals from both?

Mathias, could you help?
Ross / rkir...@gmail.com (TC39 member from Apple), could you comment on how likely Safari is to support Intl.Segmenter?
 

> > Web developers: No signals
>
> Who's asking for this? Why are we implementing? Do we believe it's something developers will use?

This is really needed to replace the non-standard Intl.v8BreakIterator. We somehow shipped the non-standard Intl.v8BreakIterator, and ECMA402 and TC39 believe it needs to be retired and deprecated, but we need to ship a standard API first so we can tell developers how to adopt it. According to https://www.chromestatus.com/metrics/feature/timeline/popularity/556, 0.4% of all Chrome page loads currently use Intl.v8BreakIterator. These are the first targets: we would like them to move their code away from Intl.v8BreakIterator to Intl.Segmenter. Even if only Chrome launches it, that will be better than having them stay on the Chrome-only Intl.v8BreakIterator as they do today.
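
To make the migration path above concrete, here is a minimal sketch that collects word-boundary offsets with the standard API, next to a loop over the legacy iterator. The helper names and sample text are hypothetical, and the legacy part assumes the usual Intl.v8BreakIterator methods (`adoptText`, `first`, `next`, with `next()` returning -1 at the end), so it is guarded and only runs where that non-standard constructor exists.

```javascript
// Sketch only: word-boundary offsets via the standard Intl.Segmenter.
// Helper names and the sample text are hypothetical.
function boundariesWithSegmenter(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const offsets = [];
  for (const { segment, index } of segmenter.segment(text)) {
    offsets.push(index + segment.length); // end offset of each segment
  }
  return offsets;
}

// Legacy counterpart using the non-standard, V8-only iterator.
// Assumption: next() returns -1 once the end of the text is reached.
function boundariesWithV8BreakIterator(text, locale) {
  const iterator = new Intl.v8BreakIterator(locale, { type: "word" });
  iterator.adoptText(text);
  const offsets = [];
  iterator.first(); // rewind to offset 0
  for (let pos = iterator.next(); pos !== -1; pos = iterator.next()) {
    offsets.push(pos);
  }
  return offsets;
}

const migrationSample = "The quick fox.";
console.log(boundariesWithSegmenter(migrationSample, "en"));

// Only call the legacy path where the engine still exposes it.
if (typeof Intl.v8BreakIterator === "function") {
  console.log(boundariesWithV8BreakIterator(migrationSample, "en"));
}
```

Code that depends on the legacy `breakType()` values would need a closer look, since Intl.Segmenter exposes only an `isWordLike` flag for word granularity.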

 



Ross Kirsling

Aug 21, 2020, 4:16:07 AM
to Frank Tang, v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, v8-users, blink-dev
Note that I work for Sony, not Apple, but I do work on JSC and I can say that we have a finished implementation expected to land in the near future:
https://bugs.webkit.org/show_bug.cgi?id=213638

Joshua Bell

Aug 21, 2020, 12:43:06 PM
to Ross Kirsling, Frank Tang, v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, v8-users, blink-dev
I've heard this feature request from partners for two reasons:
  • It enables building full-text indexes e.g. using Indexed DB. WebSQL supported full text search (FTS) using its own engine, but browsers have removed that. FTS requires segmentation (this API) and optional stemming. Exposing ICU-equivalent segmentation saves developers from having to include that logic in their apps, or falling back to e.g. English-only segmentation and giving a poor experience in other locales.
  • More generally, we have requests from partners who implement custom text layout and rendering to canvas, e.g. as part of creative applications. Today, they are forced to ship ICU (or the equivalent) to support segmentation, e.g. using ICU built with WASM. Where possible, exposing web standard APIs that can be used instead will reduce the download cost users face.
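
The full-text indexing case in the first bullet could look roughly like the sketch below: locale-aware tokenization feeding a small inverted index. The function names and sample documents are hypothetical, an in-memory Map stands in for an Indexed DB object store, and stemming is left out.

```javascript
// Sketch only: locale-aware tokenization for a tiny inverted index.
// Function names and sample documents are hypothetical.
function tokenize(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const tokens = [];
  for (const part of segmenter.segment(text)) {
    // Skip whitespace and punctuation segments.
    if (part.isWordLike) tokens.push(part.segment.toLowerCase());
  }
  return tokens;
}

function buildIndex(documents, locale) {
  const index = new Map(); // token -> Set of document ids
  for (const [id, body] of Object.entries(documents)) {
    for (const token of tokenize(body, locale)) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(id);
    }
  }
  return index;
}

const docs = { a: "Segment text by words.", b: "Words, words, words!" };
const invertedIndex = buildIndex(docs, "en");

// Look up which documents contain a given token.
console.log([...invertedIndex.get("words")]);
```

The point of using the segmenter here is that the same code can be handed a different locale, for example one whose text is written without spaces between words, instead of falling back to splitting on whitespace.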
Non-API-OWNER opinion: The 0.4% number for use of Intl.v8BreakIterator seems high! An HTTP Archive analysis of how it's being used and the prospects for migrating that use to the standard API would be interesting (e.g. is it via a small number of actively maintained libraries?). But I have sufficient confidence in the utility of the segmenter API in general that I wouldn't block on such an analysis.
 

Frank Tang

Aug 21, 2020, 3:13:29 PM
to Ross Kirsling, v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, v8-users, blink-dev
Sorry, my bad. 

Yoav Weiss

Aug 21, 2020, 3:24:09 PM
to Joshua Bell, Ross Kirsling, Frank Tang, v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, v8-users, blink-dev
LGTM1

Thanks both! That's helpful context.

> Non-API-OWNER opinion: The 0.4% number for use of Intl.v8BreakIterator seems high! An HTTP Archive analysis of how it's being used and the prospects for migrating that use to the standard API would be interesting (e.g. is it via a small number of actively maintained libraries?). But I have sufficient confidence in the utility of the segmenter API in general that I wouldn't block on such an analysis.

Yeah. No need to block shipping on this analysis.
 

Chris Harrelson

Aug 21, 2020, 4:12:21 PM
to Yoav Weiss, Joshua Bell, Ross Kirsling, Frank Tang, v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, v8-users, blink-dev

Frank Tang

Aug 26, 2020, 4:05:54 PM
to Chris Harrelson, Yoav Weiss, Joshua Bell, Ross Kirsling, v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, v8-users, blink-dev
Ping, still needs the third approval.

Daniel Bratell

Aug 27, 2020, 10:04:35 AM
to Frank Tang, Chris Harrelson, Yoav Weiss, Joshua Bell, Ross Kirsling, v8-...@googlegroups.com, Daniel Ehrenberg, Mathias Bynens, Shane Carr, zbran...@mozilla.com, v8-users, blink-dev

LGTM3

Thanks for improving the situation here!

/Daniel
