Intent to Implement: Navigation Error Logging


Siddharth Vijayakrishnan

Mar 20, 2015, 5:30:06 PM
to blin...@chromium.org, Thomas Tuttle, Ilya Grigorik

Intent to Implement: Navigation Error Logging


Contact emails

igri...@google.com (spec author)

tttu...@google.com (doing the work to implement this in Chrome)


Spec

Link to spec.


Summary

Defines a mechanism that enables developers to declare a network error reporting policy for a web application. A user agent can use this policy to report network errors it encountered that prevented it from successfully fetching requested resources.


Motivation

Developers can monitor site performance through the navigation timing API, but lack insight into more catastrophic failures such as socket timeouts, redirect loops and DNS issues that may be causing their site to become unavailable to sections of users or across different geographies. This spec gives the ability to monitor outages and transient failures that would go unnoticed otherwise.


Compatibility Risk

Minimal. Firefox and IE have both expressed interest. Iterating on spec feedback [1]. Yandex and others have expressed interest in adopting this feature server side.


Ongoing technical constraints

None


Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes


OWP launch tracking bug?

https://code.google.com/p/chromium/issues/detail?id=338852


Link to entry on the feature dashboard

https://www.chromestatus.com/feature/5391249376804864


Requesting approval to ship?

Not yet.


[1] : https://lists.w3.org/Archives/Public/public-web-perf/2015Mar/0030.html


Ilya Grigorik

Mar 20, 2015, 6:03:55 PM
to Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
Note: the spec was previously named "Navigation Error Logging", but has been renamed to "Network Error Logging". 

This intent is for the latter ("network")... ignore the "navigation" in the title - woops! :)

Rick Byers

Mar 23, 2015, 1:54:20 PM
to Ilya Grigorik, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
This sounds really exciting.  I'm not familiar with the details here, but in general error reporting techniques have proven to be critical to raising application quality elsewhere.

To what extent can these use cases be addressed by a service worker today? Can we conceptually imagine this API being implemented as a built-in service worker? E.g., if a web developer likes most of what this does, but needs some additional application-specific context or changes, can they get that with a service worker?

Rick


Chris Harrelson

Mar 23, 2015, 3:07:43 PM
to Rick Byers, Ilya Grigorik, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
LGTM


Philip Jägenstedt

Mar 24, 2015, 4:41:38 AM
to Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle, Ilya Grigorik
LGTM2, although none required.

Ilya Grigorik

Mar 24, 2015, 5:50:52 PM
to Rick Byers, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
On Mon, Mar 23, 2015 at 10:53 AM, Rick Byers <rby...@chromium.org> wrote:
> This sounds really exciting.  I'm not familiar with the details here, but in general error reporting techniques have proven to be critical to raising application quality elsewhere.
>
> To what extent can these use cases be addressed by a service worker today?  Can we conceptually imagine this API being implemented as a built-in service worker?  Eg. If a web developer likes most of what this does, but needs some additional application-specific context/changes, can they do so with a service worker?

It's partially implementable via SW... 

For navigation requests you can intercept the fetch, detect if/when that fails, log it and/or dispatch a report indicating that the fetch failed. However, you might not get the same fidelity of why the fetch failed. I guess, in theory, you can check Resource Timing timestamps to see what succeeded and infer if there was a DNS/TCP and/or other error, but, at least in Chrome, we currently don't surface this - see [1].

AFAIK, you wouldn't be able to implement the embed case with SW: https://w3c.github.io/network-error-logging/#reporting-of-third-party-subresource-fetch-failures - the SW controlling the page that embeds the third-party resource would intercept the request, and the third-party origin can't instrument those (without NEL) to fire an error report on failed fetches.
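
As a rough illustration of the service worker approach described above, here is a minimal TypeScript sketch. The names `fetchWithFailureReport`, `FetchLike`, `FailureReport`, and `ReportSink` are hypothetical and not part of any spec; the fetch and report functions are injected so the sketch does not depend on service worker typings. Note that the report can only say "network error", since script is not told why the fetch failed.

```typescript
// Minimal shapes so the sketch type-checks without DOM or service worker typings.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

interface FailureReport {
  url: string;
  // Script only learns that the fetch failed, not the underlying cause.
  type: "network-error";
  message: string;
}

type ReportSink = (report: FailureReport) => void;

// Hypothetical helper: run the fetch, report a rejection, then rethrow so the
// caller still observes the original failure.
async function fetchWithFailureReport(
  url: string,
  doFetch: FetchLike,
  sendReport: ReportSink,
): Promise<{ ok: boolean; status: number }> {
  try {
    return await doFetch(url);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    sendReport({ url, type: "network-error", message });
    throw err;
  }
}
```

In a real service worker this would presumably be called from a `fetch` event listener via `event.respondWith(...)`, with `doFetch` wrapping the global `fetch`. It still would not cover the third-party embed case.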

ig

Anne van Kesteren

Mar 25, 2015, 10:03:51 AM
to Ilya Grigorik, Rick Byers, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
On Tue, Mar 24, 2015 at 10:50 PM, 'Ilya Grigorik' via blink-dev
<blin...@chromium.org> wrote:
> For navigation requests you can intercept the fetch, detect if/when that
> fails, log it and/or dispatch an report indicating that the fetch failed.
> However, you might not get the same fidelity of why the fetch failed..

Yeah, that you don't get that information is kind of the point. The
decision to not expose network errors in more detail than that
("network error") was quite purposeful.


--
https://annevankesteren.nl/

Ilya Grigorik

Mar 25, 2015, 7:24:14 PM
to Anne van Kesteren, Rick Byers, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
I bootstrapped the current list of errors from Domain Reliability [1]; we can (read: should) definitely revisit their granularity, etc., as we iterate on the implementation. If you have any particular suggestions or insights, let me know! Let's open appropriate issues on GH and iterate there?

Anne van Kesteren

Mar 26, 2015, 2:16:49 AM
to Ilya Grigorik, Rick Byers, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
On Thu, Mar 26, 2015 at 12:23 AM, Ilya Grigorik <igri...@google.com> wrote:
> [W]e can (read, should) definitely revisit their granularity, etc., as we iterate
> on the implementation. If you have any particular suggestions or insights,
> let me know!

Well, as I tried to say quite explicitly in the previous email (and I
also told the Web Performance WG in person at one point), we haven't
gone further than the granularity of "network error" and I believe
that has been quite intentional. What changed?


--
https://annevankesteren.nl/

Ilya Grigorik

Mar 26, 2015, 5:31:42 PM
to Anne van Kesteren, Rick Byers, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
On Wed, Mar 25, 2015 at 11:16 PM, Anne van Kesteren <ann...@annevk.nl> wrote:
> Well, as I tried to say quite explicitly in the previous email (and I
> also told the Web Performance WG in person at one point), we haven't
> gone further than the granularity of "network error" and I believe
> that has been quite intentional. What changed?

The high-level goals of NEL are to enable developers to (a) detect network failures that prevent users from using their service, and (b) to provide useful context about the cause such that these failures can be addressed. Just knowing that a "network error" has occurred is a start, but to be actionable we also need to provide some context on why the failure occurred. 

With that in mind, yes, to address the NEL use cases we are exposing new data that was previously not available: https://w3c.github.io/network-error-logging/#privacy-considerations ... In addition to what's mentioned in that section, note that the latest NEL draft does not provide a JS interface for retrieving the errors (previous iterations of the spec did); the error logging is a "SHOULD", not a "MUST", for the UA; and the types and granularity of errors are something I expect we'll have to iterate on as we implement the spec, to find the right balance between enabling (b) and the corresponding privacy/security considerations.

ig

Elliott Sprehn

Mar 26, 2015, 6:17:33 PM
to Ilya Grigorik, Anne van Kesteren, Rick Byers, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle
Sending to the server is equivalent to having a JS interface in terms of what information is exposed to the page. I must admit that having a header that makes us send JSON blobs to a specified URI is kind of strange. Is there a reason not to expose this through script and let authors use libraries for it? That saves us from dictating the message format and what information is included, so as the web evolves, so can the format and the data. Once we start sending blobs to the server we'll probably be stuck with that format for a very long time.

- E

Ilya Grigorik

Mar 26, 2015, 6:33:51 PM
to Elliott Sprehn, Anne van Kesteren, Rick Byers, Siddharth Vijayakrishnan, blink-dev, Thomas Tuttle

On Thu, Mar 26, 2015 at 3:16 PM, Elliott Sprehn <esp...@chromium.org> wrote:
> Sending to the server is equivalent to having a JS interface in terms of what information is exposed to the page. I must admit having a header that makes us send JSON blobs to a specified uri is kind of strange, is there a reason not to expose this through script and let authors use libraries for it? That saves us from dictating the message format and what information is included so as the web evolves so can the format and the data. Once we start sending blobs to the server we'll probably be stuck with that format for a very long time.

That's not true. Exposing it via JS makes the error data available to any script running on your page, whereas restricting to server delivery only makes it available to collectors declared by the NEL origin. Further, a JS interface is not sufficient to address the use cases we're targeting with NEL: if the page failed to load then there is no script to execute and report the error (can't do real-time reporting); with JS you can only report errors after the fact and assuming the user has eventually and successfully reached your site (that's a big if); there is no way to get reports for embedded resources. 
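
To make "collectors declared by the NEL origin" concrete, here is a hedged TypeScript sketch of what an origin might send. The field names (`Report-To`, `group`, `max_age`, `endpoints`, `report_to`) follow later drafts of the Reporting and NEL specs as I understand them, and may not match the draft discussed in this thread; the function name, group name, and collector URL are hypothetical.

```typescript
// Header names and JSON fields are assumptions based on later spec drafts,
// not necessarily the syntax proposed at the time of this thread.
interface NelPolicyHeaders {
  "Report-To": string;
  NEL: string;
}

// Hypothetical helper: build the response headers that declare a collector
// and opt the origin in to network error reports for maxAgeSeconds.
function buildNelPolicyHeaders(
  group: string,
  collectorUrl: string,
  maxAgeSeconds: number,
): NelPolicyHeaders {
  const reportTo = {
    group,
    max_age: maxAgeSeconds,
    endpoints: [{ url: collectorUrl }],
  };
  // The NEL policy refers to the collector by group name only.
  const nel = { report_to: group, max_age: maxAgeSeconds };
  return { "Report-To": JSON.stringify(reportTo), NEL: JSON.stringify(nel) };
}

// Example values; the collector URL is made up for illustration.
const examplePolicy = buildNelPolicyHeaders(
  "network-errors",
  "https://collector.example/reports",
  86400,
);
console.log(examplePolicy);
```

Because the user agent delivers reports straight to the declared endpoint, script running on the page never sees them, which is the distinction drawn above.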

Some additional context on use cases: http://w3c.github.io/network-error-logging/#use-cases

Also, the header mechanism here is not new. CSP uses the same approach to report security violations: 

ig