[Proposal] is_regex/1 guard

Christopher Keele

Aug 16, 2020, 8:08:26 PM
to elixir-lang-core
With is_struct/2 coming in 1.11, I think it makes sense to add a guard for regexes at the same time.

It is rarely appropriate for Elixir developers to destructure a Regex struct: its implementation is semi-opaque (partially evidenced by the fact that it carries a version number with it). The main exception is trying to pattern match on function parameters to determine if input is a regex versus something else.

MapSet is another stdlib struct whose implementation is versioned, and it might make more sense to offer a guard for it than to expect developers to pattern match on it for type checks. However, since it is fully opaque, implementing a guard for it raises other problems, as discussed here already: https://groups.google.com/g/elixir-lang-core/c/2KnRcKTZvuE/m/229Nfw0oCwAJ

The fact that a Regex is a struct is much more of an implementation detail than it is for raw-data stdlib structs like Range, Date, Version, and URI. An is_regex/1 guard would help make that clearer, and would prevent code that has to choose between not using guards and reaching into the struct's implementation details.
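
To make that trade-off concrete, here is a rough sketch of the two options as they stand without a dedicated guard. The module and function names are made up for illustration, and the guard version assumes Elixir 1.10+ for is_map_key/2:

```elixir
defmodule GuardChoice do
  # Option 1: no guard. The regex check has to live in the function body.
  def describe_without_guard(pattern) do
    cond do
      Regex.regex?(pattern) -> :regex
      is_binary(pattern) -> :string
      true -> :other
    end
  end

  # Option 2: a guard, but it has to know that a regex is a map
  # whose :__struct__ key holds the Regex module.
  def describe_with_guard(pattern)
      when is_map(pattern) and is_map_key(pattern, :__struct__) and
             :erlang.map_get(:__struct__, pattern) == Regex,
      do: :regex

  def describe_with_guard(pattern) when is_binary(pattern), do: :string
  def describe_with_guard(_other), do: :other
end
```

Both functions classify ~r/fo+/ as :regex and "foo" as :string; the difference is only in where the knowledge about the struct lives.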

Considerations:
  • We'd want to re-implement Regex.regex?/1 in terms of it, and possibly put that function on a deprecation path.
  • It could be implemented in either Regex or Kernel. Adding to Kernel sucks, but I think it would be a stronger commitment to the Regex struct implementation being an opaque detail.
    • It would make regexes feel a lot more like first-class Elixir data types, which they already do just by virtue of inspecting as their sigil constructors.
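
As a minimal sketch of the first consideration, assuming Elixir 1.11+ for is_struct/2: the guard could be a one-line defguard, with a regex?/1-style predicate written in terms of it. RegexGuardSketch and RegexPredicateSketch are hypothetical names, not anything in Elixir itself:

```elixir
defmodule RegexGuardSketch do
  # Hypothetical home for the proposed guard; the real thing would
  # live in Regex or Kernel.
  defguard is_regex(term) when is_struct(term, Regex)
end

defmodule RegexPredicateSketch do
  import RegexGuardSketch

  # A regex?/1-style predicate re-implemented on top of the guard.
  def regex?(term) when is_regex(term), do: true
  def regex?(_term), do: false
end
```

If it lived outside Kernel, callers would need the same import (or require) shown in the second module before using the guard.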

José Valim

Aug 17, 2020, 3:41:29 AM
to elixir-l...@googlegroups.com
Hi Chris,

To me the conclusion is exactly the opposite: since we have is_struct/2, there is no need to add a bunch of guards to Kernel. Furthermore, adding such guards to Kernel will only feel natural for Elixir built-in types. For everyone else, the usage is more bureaucratic (i.e. import/require a module and then use it). So is_struct/2 is more consistent.
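
A small sketch of that consistency point, assuming Elixir 1.11+: is_struct/2 reads the same for a stdlib struct and for a user-defined one, with no import or require needed. MyApp.Pattern and MyApp.Kind are hypothetical modules used only for illustration:

```elixir
defmodule MyApp.Pattern do
  # Hypothetical user-defined struct.
  defstruct [:source]
end

defmodule MyApp.Kind do
  # The same guard shape works for stdlib and user-defined structs.
  def kind(term) when is_struct(term, Regex), do: :regex
  def kind(term) when is_struct(term, MyApp.Pattern), do: :custom_pattern
  def kind(term) when is_binary(term), do: :string
  def kind(_other), do: :other
end
```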

The argument that the Regex struct is an implementation detail does not hold because, given that Elixir has limited built-in types, it has to be a struct by definition. For example, anyone who has ever implemented a protocol for Regex is relying on the fact that it is a struct, as there is literally no other option.
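
For illustration, this is roughly what implementing a protocol for Regex looks like. Describable is a hypothetical protocol; the point is only that `for: Regex` dispatches on the struct name:

```elixir
defprotocol Describable do
  # Hypothetical protocol, defined only for this example.
  def describe(term)
end

defimpl Describable, for: Regex do
  # Dispatch reaches this clause because a regex is a %Regex{} struct.
  def describe(regex), do: "regex matching " <> Regex.source(regex)
end

defimpl Describable, for: BitString do
  def describe(string), do: "literal string " <> string
end
```

Calling Describable.describe(~r/fo+/) returns "regex matching fo+", while Describable.describe("foo") returns "literal string foo".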

The MapSet problem is really an issue with Dialyzer not having a mechanism for us to express the constructs we have in Elixir. As mentioned in the linked issue, it manifests on other occasions too, and I would rather fix Dialyzer. Furthermore, an "is_mapset" guard would produce the same warnings as is_struct/2, if any, so it wouldn't really address this problem.

In any case, Regex.regex? does send mixed signals now that we have is_struct/2, so I will schedule it for deprecation in the long run. Thanks!


Christopher Keele

Aug 17, 2020, 1:17:48 PM
to elixir-lang-core
That makes sense to me! I had forgotten that Protocol implies developers should be comfortable knowing these are structs. In fact, I never thought about implementing a protocol for Regex at all, which is a really cool ability.

I also noticed the commit deprecates exception?/1, which was my only follow-up note if we went the other way from what I proposed. :) Thanks for talking me through your reasoning!