Web-Facing Change PSA: URLPattern: Inherit left, wildcard right


Jeremy Roman

Nov 2, 2023, 4:04:16 PM
to blink-dev, Shunya Shishido, Domenic Denicola

See https://github.com/whatwg/urlpattern/pull/198; tracked in https://chromestatus.com/feature/6076647526891520 and https://bugs.chromium.org/p/chromium/issues/detail?id=1468446.

The following changes apply to patterns which are constructed using a base URL, the constructor string syntax, or both -- but not any pattern which explicitly specifies components separately without a base URL.

  • Components are not inherited from a base URL if an "earlier" component is explicitly specified.
  • In the string format, unspecified "later" components are implicitly wildcarded, rather than required to be empty (with the exception of the port, which is always taken to be specified when the hostname is).
  • Username and password are never implicitly specified or inherited.
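
As a rough illustration of these three rules, the sketch below inspects the components of two patterns. It assumes a runtime that exposes URLPattern as a global and implements the updated behavior; the helper name inspectComponents and the base URL with credentials are made up for illustration and are not part of the change itself.

```javascript
// Hypothetical helper: returns selected components of two patterns, or null
// when the runtime has no global URLPattern.
function inspectComponents() {
  if (typeof URLPattern === "undefined") {
    return null;
  }

  // String format, no base URL: the later components (search, hash) and the
  // username/password are left as wildcards. The port is taken to be
  // specified (as empty) because the hostname is specified.
  const absolute = new URLPattern("https://example.com/*");

  // Relative string with a base URL: components earlier than the pathname
  // are inherited, but search, hash, and the credentials are not.
  const relative = new URLPattern(
    "/foo",
    "https://user:pw@example.com:8080/bar?q=1#h"
  );

  // Collect only the components the three rules talk about.
  const pick = (p) => ({
    username: p.username,
    password: p.password,
    hostname: p.hostname,
    port: p.port,
    search: p.search,
    hash: p.hash,
  });

  return { absolute: pick(absolute), relative: pick(relative) };
}

console.log(inspectComponents());
```

Under the updated behavior, both patterns should report "*" for username, password, search, and hash, with port "" for the first pattern and "8080" for the second.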

For example:

  1. "https://example.com/*" also matches with any username, password, search, and hash. Previously this would be written "https://*:*@example.com/*\\?*#*".
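  2. new URLPattern({ pathname: "/login", baseURL: "https://example.com/?q=hello" }) accepts any query string and hash, not only "?q=hello" and "". (With a dictionary input, the base URL must be given as the baseURL member; passing it as a second constructor argument throws a TypeError.)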
  3. "https://*:*" or {protocol: "https"} means "any HTTPS URL, on any port", and "https://*" means "any HTTPS URL on the default port (443)". These have the same meaning whether or not a base URL is provided, since specifying the protocol prohibits inheritance of other components.
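
The three examples can be exercised with a sketch like the following. It assumes a global URLPattern that implements the updated behavior; the helper name checkExamples and the specific test URLs are illustrative choices, not taken from the change. The base URL in the second example is passed as the baseURL member of the dictionary.

```javascript
// Hypothetical helper: runs the three examples and returns the match
// results, or null when the runtime has no global URLPattern.
function checkExamples() {
  if (typeof URLPattern === "undefined") {
    return null;
  }

  // 1. Username, password, search, and hash are implicitly wildcarded;
  //    the port is still the default one.
  const anyPath = new URLPattern("https://example.com/*");

  // 2. The pathname is specified, so search and hash are not inherited
  //    from the base URL and end up as wildcards.
  const login = new URLPattern({
    pathname: "/login",
    baseURL: "https://example.com/?q=hello",
  });

  // 3. Any port versus the default port only.
  const anyPort = new URLPattern("https://*:*");
  const protocolOnly = new URLPattern({ protocol: "https" });
  const defaultPort = new URLPattern("https://*");

  return {
    // Expected true under the updated behavior.
    example1WithCredentials: anyPath.test("https://user:pw@example.com/a?b=c#d"),
    // Expected false: a non-default port is not matched.
    example1OtherPort: anyPath.test("https://example.com:8443/a"),
    // Expected true for both: any query string and hash are accepted.
    example2OtherQuery: login.test("https://example.com/login?next=home#top"),
    example2NoQuery: login.test("https://example.com/login"),
    // Expected true for both: any HTTPS URL on any port.
    example3AnyPort: anyPort.test("https://example.com:8443/x?y=1#z"),
    example3ProtocolOnly: protocolOnly.test("https://example.com:8443/x?y=1#z"),
    // Expected false, then true: only the default port matches.
    example3NonDefaultPort: defaultPort.test("https://example.com:8443/x"),
    example3DefaultPort: defaultPort.test("https://example.com/x"),
  };
}

console.log(checkExamples());
```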

This makes patterns more expansive than before, in cases where wildcards are likely to be desirable.

The logic for inheriting components from the baseURL member of a dictionary input is changed in a similar way. As a result, a pattern may no longer match an input that it matched before, but the behavior is now more consistent with the above and with how relative URL strings are resolved. For example, new URLPattern("https://example.com/foo?q=1#hello").test({pathname: "/foo", hash: "hello", baseURL: "https://example.com/bar?q=1"}) previously returned true but will now return false, since the search component is not inherited when the pathname is specified. This is analogous to how new URL("/foo#hello", "https://example.com/bar?q=1") works. The reverse is also possible; in both cases this is quite niche.
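
To make the analogy concrete, here is a small sketch that puts the URL resolution next to the dictionary input from the example. The URL part relies only on the standard URL constructor. The URLPattern part assumes a global URLPattern with the updated behavior, and the helper name testDictionaryInputs and the hash-only contrast case are illustrative additions.

```javascript
// Resolving a relative reference that specifies a path drops the query of
// the base URL; only the components earlier than the path are kept.
const resolved = new URL("/foo#hello", "https://example.com/bar?q=1");
console.log(resolved.href); // "https://example.com/foo#hello"
console.log(resolved.search); // ""

// Hypothetical helper: tests two dictionary inputs against the pattern from
// the example, or returns null when URLPattern is unavailable.
function testDictionaryInputs() {
  if (typeof URLPattern === "undefined") {
    return null;
  }

  const pattern = new URLPattern("https://example.com/foo?q=1#hello");

  return {
    // The pathname is specified, so "q=1" is not inherited from baseURL and
    // the search component of the input is empty. Expected false.
    pathnameSpecified: pattern.test({
      pathname: "/foo",
      hash: "hello",
      baseURL: "https://example.com/bar?q=1",
    }),
    // Only the hash is specified, so the pathname and search are inherited
    // from baseURL. Expected true.
    hashOnly: pattern.test({
      hash: "hello",
      baseURL: "https://example.com/foo?q=1",
    }),
  };
}

console.log(testDictionaryInputs());
```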


Though this change is significant for the kinds of patterns it enables, I believe it does not significantly affect the patterns in use today. We have added UseCounters that record URLPattern matches (i.e., the instances where a URLPattern is tested against a URL) whose output would differ because of the string format change or the base URL inheritance change, and both counts are extremely small. I expect this is because URLPattern is currently used mostly on a single component (the path component) and without a base URL; behavior in those cases is not affected.
