Re: [blink-dev] Intent to Implement and ship: Render Unicode control characters


Rick Byers

Sep 10, 2015, 8:32:52 PM
to Emil A Eklund, blink-dev
Updating subject to include "and ship" as requested in the template.

Correct spec anchor link is
https://drafts.csswg.org/css-text/#white-space-processing


Have you done any httparchive search or anything to get an idea for how common this is?  I'm just wondering what level of outreach might be necessary here.  Any specific example of a site that's broken by this change?

Rick

On Thu, Sep 10, 2015 at 6:34 PM, 'Emil A Eklund' via blink-dev <blin...@chromium.org> wrote:
Contact emails
e...@chromium.org

Spec
https://drafts.csswg.org/css-text/#white-space-fprocessing

Summary
Currently Chrome and other browsers do not render Unicode control
characters. This violates the Unicode spec and differs from the handling
in other software. With this change non-white-space control characters
will be rendered.

Motivation
A few months ago, the CSSWG realized that all the browsers were
violating Unicode, by not rendering non-WS control characters
<https://lists.w3.org/Archives/Public/www-style/2014Mar/0475.html>.

It was decided that it made sense to match Unicode (and, likely,
other software) and display them instead. As such the spec has been
updated to address this.

Compatibility Risk
In order to minimize the impact of the change for web developers, the
major browser vendors have all agreed to flip the switch around the same
time. We're targeting Chrome 47, which fits the agreed "around
November" timeline.

Ongoing technical constraints
None

Will this feature be supported on all six Blink platforms (Windows,
Mac, Linux, Chrome OS, Android, and Android WebView)?
Yes

OWP launch tracking bug
crbug.com/530348

Link to entry on the feature dashboard
https://www.chromestatus.com/features/6232200047493120

Requesting approval to ship?
Yes

Emil A Eklund

Sep 10, 2015, 8:47:29 PM
to Rick Byers, blink-dev
On Thu, Sep 10, 2015 at 5:32 PM, Rick Byers <rby...@chromium.org> wrote:
> Updating subject to include "and ship" as requested in the template.
>
> Correct spec anchor link is
> https://drafts.csswg.org/css-text/#white-space-processing

Thank you Rick!

> Have you done any httparchive search or anything to get an idea for how
> common this is? I'm just wondering what level of outreach might be
> necessary here. Any specific example of a site that's broken by this
> change?

I tried making the change earlier this year and noticed a handful of
mostly smaller sites where control characters ended up being rendered.
Didn't see any visible changes on news sites, wikipedia, or social
media sites. We've since made quite a few changes to how we do text
shaping, so let me rerun the tests and give an update on this thread.

Rick Byers

Sep 10, 2015, 9:14:24 PM
to Emil A Eklund, blink-dev
Thanks!  In general this sounds like it's probably a good thing to do, but given that no other browser does this yet I think it's worth some care to try to understand the impact on web devs / users.  If we're likely to break real sites then we should do some outreach (blog post or whatever).  I assume the breakage is typically pretty minor - just a small visual artifact unlikely to break actual usage, right?

Rick

Emil A Eklund

Sep 10, 2015, 9:18:17 PM
to Rick Byers, blink-dev
On Thu, Sep 10, 2015 at 6:14 PM, Rick Byers <rby...@chromium.org> wrote:
> Thanks! In general this sounds like it's probably a good thing to do, but
> given that no other browser does this yet I think it's worth some care to
> try to understand the impact on web devs / users. If we're likely to break
> real sites then we should do some outreach (blog post or whatever). I
> assume the breakage is typically pretty minor - just a small visual artifact
> unlikely to break actual usage, right?

Correct.

The current plan is for all browsers to make the change around the
same time to minimize the impact for web developers. I do understand
(and share) your concerns. Hopefully I'll be able to address them with
data :)

Paul Irish

Sep 10, 2015, 9:48:39 PM
to Emil A Eklund, blink-dev
Are the two bullet points listed here what's being proposed in this intent?

> 1. Render control characters U+0080-U+009F normally (ie show boxes if there is no available glyph).
> 2. Treat U+000C (form feed), in addition to U+0009, U+000A and U+000D, as whitespace.

Emil A Eklund

Sep 11, 2015, 12:20:26 PM
to Paul Irish, blink-dev
On Thu, Sep 10, 2015 at 6:48 PM, Paul Irish <paul...@google.com> wrote:
> Are the two bullet points listed here what's being proposed in this intent?
>
>> 1. Render control characters U+0080-U+009F normally (ie show boxes if
>> there is no available glyph).
>> 2. Treat U+000C (form feed), in addition to U+0009, U+000A and U+000D, as
>> whitespace.

Correct, as per spec: "Control characters (Unicode category Cc) other
than tab (U+0009), line feed (U+000A), and carriage return (U+000D)
must be rendered as a visible glyph..."
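
A minimal sketch of the quoted rule, in Python and purely for illustration. It uses the standard `unicodedata` module to test for category Cc and exempts the three characters the spec text names. The names `must_render_visibly` and `EXEMPT` are hypothetical and do not come from Blink or the spec. Form feed (U+000C) is also category Cc, so modelling the second bullet quoted above would mean adding it to `EXEMPT`.

```python
import unicodedata

# Whitespace controls that the quoted spec sentence exempts (assumption:
# only these three; add "\x0c" to model form feed as whitespace too).
EXEMPT = {"\t", "\n", "\r"}


def must_render_visibly(ch: str) -> bool:
    """Return True if `ch` is a category-Cc control character that the
    quoted rule says must be rendered as a visible glyph."""
    return unicodedata.category(ch) == "Cc" and ch not in EXEMPT


if __name__ == "__main__":
    # Tab, vertical tab, form feed, a C1 control (U+0085), and a letter.
    for ch in ["\t", "\x0b", "\x0c", "\x85", "A"]:
        print(f"U+{ord(ch):04X}", must_render_visibly(ch))
```

Checking the general category, rather than hard-coding U+0080-U+009F, also covers the C0 range and U+007F, which is what "category Cc" in the quoted text implies.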

Greg Whitworth

Sep 11, 2015, 5:23:07 PM
to blink-dev, e...@google.com
Hey Rick,

Do you plan to put this behind a flag with it off by default, or ship it with it on by default? As stated by Emil, there are sites that have control characters on them, which is why this needs to be a unified breaking change by all UAs (Firefox made this change and had to revert because users thought it was a bug). We're trying to work towards a coordinated release with coordinated PR to let authors know before shipping the change in 2016 (assuming all browsers follow through on putting it behind a flag).

Thanks,
Greg

Greg Whitworth

Sep 11, 2015, 5:26:30 PM
to blink-dev, e...@google.com
Sorry, disregard; I see the reference to flipping the switch at the same time. But yes, there are some sites out there, though it's not very common based on basic investigation. One of the best ways to determine this is to ship it and allow for testing with it on. We plan to do this and will report back to www-style if the breaking change is just too large.

Simon Pieters

Sep 15, 2015, 11:45:02 AM
to blink-dev, Greg Whitworth, e...@google.com
Non-owner LGTM, with the following reservations:

* There should be a flag so it is trivial to revert if other browsers are
lagging behind the timeplan or if it turns out to break too much of the
Web.
* It would also be good to have bug links or equivalent for the other
browsers available to track their progress.

As for the timeplan, the Intent says "around November", but Greg says
below "in 2016". I trust you to coordinate the shipping, but it's not
clear to me that everyone has the same plan. :-)

On Fri, 11 Sep 2015 23:23:07 +0200, Greg Whitworth <gw...@microsoft.com>
wrote:

> Hey Rick,
>
> Do you plan to put this behind a flag with it off by default, or ship it
> with it on by default? As stated by Emil there are sites that have Control
> Characters on them and thus why this needs to be unified breaking change by
> all UAs (Firefox made this change and had to revert due to the users
> thinking it was a bug). We're trying to work towards a coordinated release
> with coordinated PR to let authors know before shipping the change in 2016
> (assuming all browsers follow through putting it behind a flag).
>
> Thanks,
> Greg
>
> On Thursday, September 10, 2015 at 5:32:52 PM UTC-7, Rick Byers wrote:
>
>> Updating subject to include "and ship" as requested in the template.
>>
>> Correct spec anchor link is
>> https://drafts.csswg.org/css-text/#white-space-processing
>>
>> Have you done any httparchive search or anything to get an idea for how
>> common this is? I'm just wondering what level of outreach might be
>> necessary here. Any specific example of a site that's broken by this
>> change?
>>
>> Rick
>>
>> On Thu, Sep 10, 2015 at 6:34 PM, 'Emil A Eklund' via blink-dev
>> <blin...@chromium.org> wrote:
>>
>>> Contact emails
>>> e...@chromium.org


--
Simon Pieters
Opera Software

Philip Jägenstedt

Sep 16, 2015, 5:20:25 AM
to Simon Pieters, blink-dev, Greg Whitworth, Emil A Eklund
It sounds like most browsers are already on board with making this change, but I wonder about the motivation. If this were not a violation of the Unicode spec, would it still be an improvement for users, web developers or some other group to make the change?

Some estimate of impact using httparchive data or similar would be useful.

Philip
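
One rough way such an impact estimate could be approached, sketched in Python under loud assumptions: the page bodies are already decoded to text and handed in as a plain mapping, and the names `CONTROL_RE`, `control_char_histogram`, and `affected_share` are hypothetical. This is not an httparchive query and says nothing about how a browser's HTML parser would treat the same bytes or character references.

```python
import re
from collections import Counter
from typing import Mapping

# Category Cc is U+0000-U+001F, U+007F, and U+0080-U+009F. Tab, line feed,
# and carriage return are left out because the spec exempts them.
CONTROL_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")


def control_char_histogram(text: str) -> Counter:
    """Count each affected control character found in `text`."""
    return Counter(f"U+{ord(m.group()):04X}" for m in CONTROL_RE.finditer(text))


def affected_share(pages: Mapping[str, str]) -> float:
    """Fraction of pages whose text has at least one affected character."""
    if not pages:
        return 0.0
    hits = sum(1 for body in pages.values() if CONTROL_RE.search(body))
    return hits / len(pages)


if __name__ == "__main__":
    # Made-up sample data, only to show the call shape.
    sample = {
        "a.example": "plain text\n",
        "b.example": "bad\x0bdata\x85",
    }
    print(affected_share(sample))
    print(control_char_histogram(sample["b.example"]))
```

A per-character histogram alongside the share helps separate widespread breakage from a single stray character that one authoring tool emits.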

Greg Whitworth

Sep 16, 2015, 11:09:37 PM
to Simon Pieters, blink-dev, e...@google.com
> Non-owner LGTM, with the following reservations:
>
> * There should be a flag so it is trivial to revert if other browsers are lagging
> behind the timeplan or if it turns out to break too much of the Web.
> * It would also be good to have bug links or equivalent for the other
> browsers available to track their progress.
>
> As for the timeplan, the Intent says "around November", but Greg says
> below "in 2016". I trust you to coordinate the shipping, but it's not clear to me
> that everyone has the same plan. :-)

The minutes from the Sydney face-to-face are spotty on this, but the "around November" timeline was for having it behind a flag. Coincidentally, a rough plan to ship around November was suggested as well, since Apple normally ships around then. That said, I don't want to put the cart before the horse and discuss shipping dates until we can ascertain the actual impact of this change. Additionally, this is our first (that I'm aware of) coordinated release of a breaking change, so I want to ensure that we blast the PR trumpets so that as many web devs as possible are aware of it. Even though we plan to test it in various ways, it would be good to get web developer feedback as well. So basically a rough timeline looks like this:

TPAC 2015: All UAs have code in their browsers behind flag (off by default)
TPAC 2015 - Summer or Fall 2016: PR from all UAs devrel, tooling, etc regarding breaking change
Now - Early 2016: UAs do internal testing, testing via dev channels (if available), testing with third parties and report back any compat issues found to www-style thread
Summer or Fall 2016: Find a shipping date that overlaps as many UAs as possible, so that no single UA has to carry the burden of "bugs"

Greg

Greg Whitworth

Sep 17, 2015, 1:04:17 PM
to Philip Jägenstedt, Simon Pieters, blink-dev, Emil A Eklund
>> From: Philip Jägenstedt [mailto:phi...@opera.com]
>> Sent: Wednesday, September 16, 2015 2:20 AM
>> To: Simon Pieters <sim...@opera.com>
>> Cc: blink-dev <blin...@chromium.org>; Greg Whitworth <gw...@microsoft.com>; Emil A Eklund <e...@google.com>
>> Subject: Re: [blink-dev] Intent to Implement and ship: Render Unicode control characters

>> It sounds like most browsers are already on board with making this change, but I wonder about the motivation.

I’ve said this on www-style but the largest motivation for us is to show to ourselves along with the web community that we can cooperate to make breaking changes together to improve the platform. There is usage of these (somehow, not sure how they’re getting in there) in web sites and if any one browser makes the change it results in bugs being filed against that UA (both Mozilla and Microsoft have hit this). If all of us make the change around the same time then the web developer/client will know it is an issue with their site rather than an issue with the browser. This is a relatively harmless change (doesn’t affect layout or cause any end user functionality issues in the examples I've seen) but a good test so that in the future we can make corrective changes to the web platform following this same model.

We have some other ideas of breaking changes that we’ve begun discussing with people at the CSSWG depending on if this model works effectively.

Regarding the reasoning behind the specification change, see this thread: http://lists.w3.org/Archives/Public/www-style/2014Oct/0391.html

Greg

Philip Jägenstedt

Sep 18, 2015, 5:11:59 AM
to Greg Whitworth, Simon Pieters, blink-dev, Emil A Eklund
Thanks for that background, Greg. From the thread, this is what I can
see in terms of a problem to fix:

> The presence of such characters within the text degrades functionality
> by interfering with operations such as search, indexing, copy/paste to
> other environments, etc. Their presence is typically the result of
> broken authoring tools/workflows, but as long as browsers ignore them
> for rendering, authors generally remain unaware that their data is bad,
> and readers will usually be unaware that their searches, etc., may be
> missing content they would have expected to match.

Honestly, this seems like a problem best solved in the browser itself,
by ignoring control characters, just like how you would ignore the
empty element in A<b></b>C and thus find a match for "AC". Compare
that to A&#xB;C, where I cannot match "AC", at least not in Opera or
Chrome.

Some estimate of the expected impact is really needed here. Making
some proportion of the web look worse is a very tangible downside,
while "follow the Unicode spec" and "test synced release of breaking
changes" are relatively weak upsides, IMHO.

(I think synced release of breaking changes has great potential,
perhaps for restricting Geolocation to secure origins, and there's
lots of other changes that would benefit from such coordination.)

Philip
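
The alternative raised in this message, having search ignore control characters instead of rendering them, can be sketched as follows. This is a toy Python model under stated assumptions, not how find-in-page works in any browser, and the names `strip_controls` and `find_ignoring_controls` are hypothetical.

```python
import unicodedata

# Whitespace controls that stay in the text (assumption: tab, LF, CR only).
KEEP = {"\t", "\n", "\r"}


def strip_controls(text: str) -> str:
    """Drop category-Cc characters other than tab, line feed, carriage return."""
    return "".join(
        ch for ch in text if ch in KEEP or unicodedata.category(ch) != "Cc"
    )


def find_ignoring_controls(haystack: str, needle: str) -> bool:
    """Substring match performed after stripping control characters."""
    return needle in strip_controls(haystack)


if __name__ == "__main__":
    text = "A\x0bC"  # same content as A&#xB;C once the reference is decoded
    print("AC" in text)                        # plain substring search
    print(find_ignoring_controls(text, "AC"))  # search on the stripped text
```

Stripping only for matching leaves the stored text unchanged, so copy/paste and indexing would still see the stray character. That is the part of the www-style concern this approach does not address.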

Philip Jägenstedt

Feb 2, 2016, 1:56:31 PM
to Greg Whitworth, Simon Pieters, blink-dev, Emil A Eklund
Any update on this intent? This was a bit of a trial run for making
breaking changes to multiple browsers, so did the other browsers go
ahead?

Philip

Boris Zbarsky

Feb 2, 2016, 2:50:31 PM
to Philip Jägenstedt, blink-dev
On 2/2/16 1:56 PM, Philip Jägenstedt wrote:
> Any update on this intent? This was a bit of a trial run for making
> breaking changes to multiple browsers, so did the other browsers go
> ahead?

I believe Firefox has been doing this for a while now.

-Boris

Emil A Eklund

Feb 2, 2016, 5:02:33 PM
to Boris Zbarsky, Philip Jägenstedt, blink-dev
On Wed, Feb 3, 2016 at 6:50 AM, Boris Zbarsky <bzba...@mit.edu> wrote:
> On 2/2/16 1:56 PM, Philip Jägenstedt wrote:
>>
>> Any update on this intent? This was a bit of a trial run for making
>> breaking changes to multiple browsers, so did the other browsers go
>> ahead?

We made some of the changes but not all. Let me follow up and report back.