Proposal: Expression-based Inspect implementation for URI

Wojtek Mach

May 31, 2024, 6:13:54 AM
to elixir-lang-core
The current inspect implementation is pretty verbose:

```elixir
iex> URI.new!("https://elixir-lang.org")
%URI{
  scheme: "https",
  userinfo: nil,
  host: "elixir-lang.org",
  port: 443,
  path: nil,
  query: nil,
  fragment: nil
}

```

I'd like to propose a more concise one along the lines of "Expression-based inspection" from Elixir v1.14:

```elixir
iex> URI.new!("https://elixir-lang.org")
URI.new!("https://elixir-lang.org")

```

There is a subtle difference between `URI.new!/1` and `URI.parse/1`: the former sets the deprecated `authority` field to `nil`. This proposal takes that into account and prints a struct produced by `URI.parse/1` "as is", as a `URI.parse` call:

```elixir
iex> URI.parse("https://elixir-lang.org")
URI.parse("https://elixir-lang.org")

```
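To make the difference concrete, here is a minimal sketch that builds the same URI both ways and compares the results. It only uses `URI.new!/1` and `URI.parse/1`; the variable names are illustrative.

```elixir
# Build the same URI with both constructors.
built = URI.new!("https://elixir-lang.org")
parsed = URI.parse("https://elixir-lang.org")

# URI.new!/1 leaves the deprecated :authority field as nil,
# while URI.parse/1 still populates it.
IO.inspect(built.authority, label: "new! authority")
IO.inspect(parsed.authority, label: "parse authority")

# The two structs therefore differ, and only compare equal
# once :authority is cleared on the parsed one.
IO.inspect(built == parsed, label: "equal as-is")
IO.inspect(built == %{parsed | authority: nil}, label: "equal without authority")
```

This is why a single `URI.new!(...)` representation cannot round-trip every struct that `URI.parse/1` returns.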

`URI.new!/1` is stricter than `URI.parse/1`. In particular, it does not allow non-escaped characters, notably `"`, `(` and `)`, which would conflict with the inspect representation, so for these I propose to return `%URI{}`:

```elixir
iex> URI.parse("https://elixir-lang.org/\"")
%URI{
  scheme: "https",
  authority: "elixir-lang.org",
  userinfo: nil,
  host: "elixir-lang.org",
  port: 443,
  path: "/\"",
  query: nil,
  fragment: nil
}

```
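The selection between the three representations could look roughly like the sketch below. This is an assumption about the shape of the logic, not the code from the pull request: `URIInspectSketch` and `represent/1` are hypothetical names, and the function returns a string rather than implementing the `Inspect` protocol.

```elixir
defmodule URIInspectSketch do
  # Hypothetical helper, not part of Elixir and not the pull request's code.
  # Returns the string that an expression-based inspect might print.
  def represent(%URI{} = uri) do
    string = URI.to_string(uri)

    case URI.new(string) do
      # The string validates and round-trips to the exact same struct.
      {:ok, ^uri} ->
        "URI.new!(" <> inspect(string) <> ")"

      # The string validates, but the struct differs (for example, the
      # deprecated :authority field is set), so try URI.parse/1 instead.
      {:ok, _other} ->
        if URI.parse(string) == uri do
          "URI.parse(" <> inspect(string) <> ")"
        else
          inspect(uri)
        end

      # The string does not validate: fall back to the struct form.
      {:error, _part} ->
        inspect(uri)
    end
  end
end

IO.puts(URIInspectSketch.represent(URI.new!("https://elixir-lang.org")))
IO.puts(URIInspectSketch.represent(URI.parse("https://elixir-lang.org")))
IO.puts(URIInspectSketch.represent(URI.parse("https://elixir-lang.org/\"")))
```

The fallback branches keep the output honest: an expression is only printed when evaluating it would give back the same struct.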

I vaguely remember previous discussions about this and I believe the biggest concern was "hiding" the internal structure. For example, what are the `:scheme`, `:host` and `:path` in the intentionally mistyped URL `http:/foo`? We would not get the answer from Inspect:

```elixir
iex> URI.new!("http:/foo")
URI.new!("http:/foo")

```

but I'd like to propose to additionally implement `IEx.Info`:

```
iex> i URI.new!("http:/foo")
Term
  URI.new!("http:/foo")
Data type
  URI
Description
  This is a struct representing a URI.
Raw representation
  %URI{scheme: "http", authority: nil, userinfo: nil, host: nil, port: 80, path: "/foo", query: nil, fragment: nil}
Reference modules
  URI
Implemented protocols
  IEx.Info, Inspect, String.Chars

```
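Independently of how the struct is printed, the fields stay reachable through ordinary map functions and `inspect/2` with `structs: false`. A small sketch for the mistyped URL above (the variable names are illustrative):

```elixir
uri = URI.new!("http:/foo")

# Pick out the fields the question above asks about.
fields = Map.take(uri, [:scheme, :host, :path])
IO.inspect(fields, label: "fields")

# structs: false bypasses the Inspect implementation and prints
# the struct as a plain map, including the :__struct__ key.
raw = inspect(uri, structs: false)
IO.puts(raw)
```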

José Valim

May 31, 2024, 6:20:54 AM
to elixir-l...@googlegroups.com
There are some issues with this proposal:

1. `URI.parse` does not validate, it only parses the components, which means inspecting with `URI.new!` after `URI.parse` does not guarantee it will be parseable and return the same result back.

2. `URI.new!` (and similar) hide the underlying fields of the URI, making them harder to discover. I believe `URI.new!` (and similar) are almost always justified when the struct's fields are private, which is not the case here, so there is more space for debate.

If the main goal is to reduce verbosity, then maybe we should consider tagging all fields as optional, so we have this instead:

```elixir
iex> URI.new!("https://elixir-lang.org")
%URI{
  scheme: "https",
  host: "elixir-lang.org",
  port: 443
}
```

It may be an acceptable trade-off between hiding fields and keeping it more compact.
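As a rough illustration of what tagging fields as optional means, the sketch below uses the `:optional` option of `@derive Inspect` (available since Elixir v1.14) on a stand-in struct. `CompactURISketch` is a hypothetical module that only mirrors some of the `URI` fields; it is not how `URI` itself is defined.

```elixir
defmodule CompactURISketch do
  # Hypothetical stand-in for URI. Fields listed under :optional are
  # omitted from the inspected output while they hold their default (nil).
  @derive {Inspect,
           optional: [:scheme, :userinfo, :host, :port, :path, :query, :fragment]}
  defstruct [:scheme, :userinfo, :host, :port, :path, :query, :fragment]
end

compact =
  struct!(CompactURISketch, scheme: "https", host: "elixir-lang.org", port: 443)

# Only the fields that differ from their defaults are printed.
printed = inspect(compact)
IO.puts(printed)
```

All fields remain visible as soon as they are set, so nothing is hidden behind a constructor call.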

José Valim

May 31, 2024, 6:32:24 AM
to elixir-l...@googlegroups.com
I have reviewed the pull request and I noticed that it does validate the URL before generating URI.new!, but it now has the unfortunate side-effect that you may have 2 (or 3) distinct representations when inspecting, which is another downside of the proposal. I don't think the cons (multiple representations, hiding of structures) justify the pro of a more succinct representation.

Wojtek Mach

May 31, 2024, 6:58:22 AM
to elixir-l...@googlegroups.com
On 31 May 2024, at 12:32, José Valim <jose....@dashbit.co> wrote:

I have reviewed the pull request and I noticed that it does validate the URL before generating URI.new!, but it now has the unfortunate side-effect that you may have 2 (or 3) distinct representations when inspecting, which is another downside of the proposal. I don't think the cons (multiple representations, hiding of structures) justify the pro of a more succinct representation.

I agree, 3 representations is probably too much, I think having just 2, URI.new! and %URI{}, is more palatable. This reminds me that we used to print `%{~D[2024-01-01] | calendar: Calendar.Holocene}` as `%Date{year: 2024, month: 1, day: 1, calendar: Calendar.Holocene}` so FWIW there was some precedent for this. (It has been improved since, though, and nowadays it would have printed: `~D[2024-01-01 Calendar.Holocene]`.)

If discoverability is more important than verbosity then that’s that. I personally don’t think it matters that much: we have good error messages when mistyping field names, and we have the `|> inspect(structs: false)` and `IEx.Info` “fallbacks”. Here’s a somewhat contrived example, so take it for what it is, but the difference is pretty stark:

```
iex> Enum.map(1..5, &URI.new!("https://example.com/foo?x=#{&1}"))
```

```
iex> Enum.map(1..5, &URI.new!("https://example.com/foo?x=#{&1}"))
[
  %URI{
    scheme: "https",
    userinfo: nil,
    host: "example.com",
    port: 443,
    path: "/foo",
    query: "x=1",
    fragment: nil
  },
  %URI{
    scheme: "https",
    userinfo: nil,
    host: "example.com",
    port: 443,
    path: "/foo",
    query: "x=2",
    fragment: nil
  },
  %URI{
    scheme: "https",
    userinfo: nil,
    host: "example.com",
    port: 443,
    path: "/foo",
    query: "x=3",
    fragment: nil
  },
  %URI{
    scheme: "https",
    userinfo: nil,
    host: "example.com",
    port: 443,
    path: "/foo",
    query: "x=4",
    fragment: nil
  },
  %URI{
    scheme: "https",
    userinfo: nil,
    host: "example.com",
    port: 443,
    path: "/foo",
    query: "x=5",
    fragment: nil
  }
]
```

Can’t say I run into stuff like this very often, but on the rare occasion I do, I wouldn’t mind something way more compact so I could focus on what matters (i.e. not the internal structure).

Anyway, thanks for considering this!

Zach Allaun

Jun 1, 2024, 10:31:41 AM
to elixir-lang-core
Wojtek previously suggested a ~URI sigil, and while there was some initial excitement around it, the thread somewhat died without explicit approval or dismissal.

I bring it up because, if that proposal were to be accepted, I think that using ~URI"https://example.com" as the inspect representation would be my preferred approach. If the URI is valid, it would inspect using the sigil (which itself is compile-time validated when used); if it is invalid, it inspects using the struct.

To be clear, I don't want to derail this thread, and if there's interest in resurrecting the ~URI discussion, I think that should be done on the other thread, but I wanted to bring it up here because it would be suboptimal to change the inspect representation twice, were ~URI to be implemented.
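For readers unfamiliar with the idea, a compile-time validated sigil could be sketched as below. This is only an assumption about what such a sigil might look like, not the earlier proposal's code: `URISigilSketch` is a hypothetical module, and multi-letter sigil names such as `~URI` require Elixir v1.15 or later.

```elixir
defmodule URISigilSketch do
  # Hypothetical sigil. Uppercase sigils do not interpolate, so the
  # macro receives the literal string and can validate it while the
  # calling code is being compiled.
  defmacro sigil_URI({:<<>>, _meta, [string]}, []) when is_binary(string) do
    # URI.new!/1 raises on invalid input, which surfaces as a compile error.
    uri = URI.new!(string)
    Macro.escape(uri)
  end
end

import URISigilSketch

from_sigil = ~URI"https://example.com"
IO.inspect(from_sigil == URI.new!("https://example.com"), label: "same struct")
```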