[proposal] Use patterns to filter data (good for pipes)


Matt Farabaugh

Dec 13, 2022, 3:50:37 PM
to elixir-lang-core
All,

I wrote a macro which allows for using a pattern to essentially extract a value from a data structure:


@doc """
Filter a value out of data using a pattern with one variable. Returns a default value
if no match. Raises if pattern contains no variables or more than one variable.

## Examples

iex> "$3.45" |> PatternHelpers.pattern_filter("$" <> money) |> String.to_float()
3.45

iex> {1,2,3} |> PatternHelpers.pattern_filter({1,_,a})
3

iex> %{a: 1, b: 2} |> PatternHelpers.pattern_filter(%{a: 9, b: b})
nil

iex> %{a: 1, b: 2} |> PatternHelpers.pattern_filter(%{a: 9, b: b}, "???")
"???"
"""

And an unsafe version:

@doc """
See `pattern_filter/3`. Raises if no match.

## Examples

iex> {1,2,3} |> PatternHelpers.pattern_filter!({9,_,b})
** (MatchError) no match of right hand side value: {1, 2, 3}
"""

This is my first proposal. Please let me know if this idea is worth some attention, and how I might better do my part to present it. I have code obviously but I'm not sure this is the place for it.

Thanks,
Matt F

Sabiwara Yukichi

Dec 13, 2022, 6:55:25 PM
to elixir-l...@googlegroups.com
This is an interesting idea, but maybe `then/2` (or `case/2` if you're fine piping into it) could already cover these cases quite well (equivalent to your `pattern_filter!` function):

"$3.45" |> then(fn "$" <> money -> money end) |> String.to_float()
"$3.45" |> case do "$" <> money -> money end |> String.to_float()

The non-raising alternative might be slightly more verbose because you need a second clause. But most of the time (like your money example), the next step of the pipe might fail on nil anyway. Or maybe a with/1 pipeline might work better than the pipe operator if you want a succession of happy-path matches?
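To make the two alternatives mentioned above concrete, here is a small sketch of the non-raising `then/2` form with a second clause and of a `with` pipeline. The module and function names are made up for illustration.

```elixir
defmodule MoneySketch do
  # Non-raising variant with then/2: a second clause supplies the fallback.
  def strip_dollar(input) do
    input
    |> then(fn
      "$" <> money -> money
      _ -> nil
    end)
  end

  # Happy-path matching with `with`: each step must match to continue.
  def parse(input) do
    with "$" <> money <- input,
         {amount, ""} <- Float.parse(money) do
      {:ok, amount}
    else
      _ -> :error
    end
  end
end
```

With these definitions, `MoneySketch.parse("$3.45")` returns `{:ok, 3.45}`, while an input without the leading `$` returns `:error` instead of passing `nil` further down the pipe.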

If we move forward, it might be better to explicitly declare the returned expression rather than infer it from dangling variables: it isn't obvious what `{a, b}` or `{a, b, a}` should return. Something like:

{1, 2, 1} |> pattern_filter(b <- {a, b, a})
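As a rough illustration of that explicit form, a macro could destructure the `result <- pattern` argument itself. This is only a sketch under assumptions: the module name `ExplicitReturnSketch` is hypothetical, and nothing in the thread confirms that the proposed macro works this way.

```elixir
defmodule ExplicitReturnSketch do
  # Hypothetical sketch: the second argument must be written as
  # `result <- pattern`, so the returned expression is always explicit.
  defmacro pattern_filter(value, {:<-, _meta, [result, pattern]}, default \\ nil) do
    quote do
      case unquote(value) do
        unquote(pattern) -> unquote(result)
        _ -> unquote(default)
      end
    end
  end
end
```

Because the result is an arbitrary expression, the same macro could return one variable (`b <- {a, b, a}`) or several (`{a, b} <- {a, b, a}`) without any rule about how many variables the pattern may contain.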

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/4b65083e-6c36-4519-a0ed-fa2da8b61a9bn%40googlegroups.com.

Matt Farabaugh

Dec 13, 2022, 8:36:48 PM
to elixir-l...@googlegroups.com

Thank you for the useful feedback! Let me try addressing your excellent points in turn:

 

  1. While `then/2` and `case/2` are somewhat suitable, they are verbose, especially in the case of the non-raising alternative. The additional typing involved for the anonymous function distracts the reader from the pattern matching, which is the crux of the matter.
  2. With my implementation, the macro returns an empty string: assert "" == "$" |> pattern_filter("$" <> money)
  3. The `with/1` pipeline is verbose and lacks the visual flow of |>
  4. The way I wrote the macro stipulates that there is only one non-underscored variable per pattern. Changing the examples you provided to `{a, _b}` and `{_a, b, _a}` would work just as well and remove ambiguity as to the return value.
    1. So {1,2,1} |> pattern_filter({_a, b, _a}) is more concise and just as clear, imo
    2. This does raise a warning, though I’m not sure why the warning is issued for a pattern like this. It just kind of (politely) tells me to remove the underscores, though the code works as intended.
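On the warning in point 4: presumably it appears because a repeated `_a` is actually used (both occurrences must hold the same value), whereas a leading underscore signals that the value is ignored. The sketch below shows the difference between `_` and a repeated named variable with ordinary function clauses; the module and function names are hypothetical.

```elixir
defmodule UnderscoreSketch do
  # `_` never binds, so this clause matches any 3-tuple.
  def middle({_, b, _}), do: b

  # A variable repeated inside one pattern must match the same value in
  # every position, so this clause only matches when the ends are equal.
  def middle_if_ends_equal({a, b, a}), do: b
  def middle_if_ends_equal(_other), do: nil
end
```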

From: <elixir-l...@googlegroups.com> on behalf of Sabiwara Yukichi <sabi...@gmail.com>
Reply-To: "elixir-l...@googlegroups.com" <elixir-l...@googlegroups.com>
Date: Tuesday, December 13, 2022 at 6:55 PM
To: "elixir-l...@googlegroups.com" <elixir-l...@googlegroups.com>
Subject: [EXTERNAL] Re: [elixir-core:11213] [proposal] Use patterns to filter data (good for pipes)

 

This email came from a source outside of CoverMyMeds. Use caution when clicking on links or replying with confidential information.


This electronic transmission is confidential and intended solely for the addressee(s). If you are not an intended addressee, do not disclose, copy or take any other action in reliance on this transmission. If you have received this transmission in error, please delete it from your system and notify CoverMyMeds LLC at pri...@covermymeds.com. Thank you.

Jay Rogov

Dec 14, 2022, 11:09:02 AM
to elixir-lang-core
A small idea

It might be interesting to look at this as an extension of the existing Kernel.destructure/2: https://hexdocs.pm/elixir/1.12/Kernel.html#destructure/2

Right now it works like this:

destructure([x, y, z], [1, 2, 3, 4, 5]) # in other words, destructure(form_with_bindings, data)

But if we provide another function, say destructure(data, form, new_form, default \\ nil), with `new_form` specifying which data it should return based on the bindings from `form`, we can easily use it in pipes like this:

data
|> ...
|> destructure({:ok, result}, result)
|> ...
|> destructure([a, b, c], %{a => 42 + b * c})
|> ...
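A minimal sketch of what such a four-argument form might look like follows. It is written as a separate macro named `restructure` in a hypothetical module, to avoid suggesting that `Kernel.destructure/2` has this arity today; it would have to be a macro rather than a function, since `form` is a pattern.

```elixir
defmodule RestructureSketch do
  # Hypothetical sketch: match `data` against `form`, then build `new_form`
  # from the variables that `form` bound. Falls back to `default` otherwise.
  defmacro restructure(data, form, new_form, default \\ nil) do
    quote do
      case unquote(data) do
        unquote(form) -> unquote(new_form)
        _ -> unquote(default)
      end
    end
  end
end
```

With this definition, `{:ok, 7} |> restructure({:ok, result}, result)` gives `7`, and `[1, 2, 3] |> restructure([a, b, c], %{a => 42 + b * c})` gives `%{1 => 48}`.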

Matt Farabaugh

Dec 14, 2022, 11:59:07 AM
to elixir-l...@googlegroups.com

Cool yeah – I think extending the contexts in which we can use pattern matching is a good idea generally. So your `destructure/4` would let us pull out any number of data points from a structure and do some work on them and return them. Very cool! Hmmm maybe it shouldn’t be “destructure” anymore but “restructure”.

 

I think it’d be somewhat verbose for the use cases I had in mind for `pattern_filter`, but would make for a nice new way to pattern match!

Matt Farabaugh

Dec 15, 2022, 3:48:39 PM
to elixir-l...@googlegroups.com

Hi again. I’m new to this so apologies if I’m coming off as clueless or pushy. I’m wondering what, if any, next steps there are to advocate for this. Should I share the code? Given what I mentioned in the responses about using one variable or using a new `destructure`, is there still disagreement?

 

Thanks!

Matt F

 


Ben Wilson

Dec 15, 2022, 4:12:37 PM
to elixir-lang-core
Hi Matt,

I am not on the core team nor do I speak for them. From what I have generally seen, "alternative APIs" that can be implemented as libraries generally should just stay libraries, unless there develops a strong following that makes it clear that such an API should be moved into the core.

Tangentially, is this code available anywhere?

As for my take on this proposal, I don't really agree with your responses to `then`. 

> The additional typing involved for the anonymous function distracts the reader from the pattern matching, which is the crux of the matter.

50% of the additional typing is related to specifying the return shape, and in this respect that typing is well worth it, because there's no guesswork. Code is read far more often than written, and the implicit return structure of `pattern_filter` leaves me guessing about how it works in cases like this:

```
{1, 2, 3} |> pattern_filter({1, a, b})
#=> ?
```

`then` is mildly more verbose, but it composes clearly with functions without requiring macro magic:

```
# assertive success
iex(2)> {1, 2, 3} |> then(fn {1, a, b} -> {a, b} end)
{2, 3}

# fall back to nil or some other value
iex(4)> 1 |> then(fn {1, a, b} -> {a, b}; _ -> nil end)
nil

# assertive failure
iex(5)> 1 |> then(fn {1, a, b} -> {a, b} end)
** (FunctionClauseError)
```

`then` doesn't require that you learn any new API, you just do normal function stuff and everything works as expected. I'm not seeing a significant improvement with `pattern_filter` over this.

- Ben

Matt Farabaugh

Dec 15, 2022, 4:22:52 PM
to elixir-l...@googlegroups.com

Hi Ben,

 

Thank you for the feedback!

 

The pattern_filter requires a pattern with exactly one variable and therefore always returns just the value bound to that variable.

```
{1, 2, 3} |> pattern_filter({1, a, b})
```

would result in an ArgumentError, but

```
{1, 2, 3} |> pattern_filter({1, _a, b})
# 3
# or
{1, 2, 3} |> pattern_filter({1, a, _})
# 2
# would both work
```

The shape of the return is stereotyped, so it is simpler than `then`.

 

I will put the code up on github and share!

 

Thanks,

Matt F

 

 


Ben Wilson

Dec 15, 2022, 4:23:43 PM
to elixir-lang-core
Apologies, I missed that you addressed how `{1, 2, 3} |> pattern_filter({1, a, b})` would work in your earlier reply, in that you only allow a single variable to be bound. This further reduces its general applicability.

Matt Farabaugh

Dec 15, 2022, 4:31:25 PM
to elixir-l...@googlegroups.com

Hi Ben,

 

I agree that it reduces its applicability, but I see that as a virtue. Filter and Map are useful despite being less applicable than Reduce, since they are simpler.

 

Matt F

 


José Valim

Dec 15, 2022, 4:58:02 PM
to elixir-l...@googlegroups.com
Hi Matt, thanks for the proposal.

My concern with the proposal is that it introduces yet another way of pattern matching, compared to case, =, and match?. Imagine I wrote this code:

case res do
  {:ok, value} -> value
  :error -> :default
end

And if someone in a code review asked me to rewrite this code, I am not sure how pleased I would be. I see the following issues:

1. By not declaring the other cases, we are implicitly discarding them. A case would at least require an explicit `_ ->`
2. By defaulting to `nil` for unmatched cases, we introduce more scenarios where nil can creep into the code. For example, your first example would raise an error on nil if the string does not match
3. If for some reason I need to extract two variables from the same pattern, I need to rewrite the code or extract it twice (duplication, inefficient)
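Points 1 and 2 can be illustrated with a short sketch. The module and function names are hypothetical: `extract/1` mimics a nil default, while `parse_explicit/1` spells out the non-matching case so the caller has to deal with it.

```elixir
defmodule NilCreepSketch do
  # Mimics a nil default: the non-matching case is silently turned into nil.
  def extract(input) do
    case input do
      "$" <> money -> money
      _ -> nil
    end
  end

  # Declares the other case explicitly and returns a tagged result instead.
  def parse_explicit(input) do
    case input do
      "$" <> money -> {:ok, String.to_float(money)}
      other -> {:error, {:unexpected_format, other}}
    end
  end
end
```

Piping `NilCreepSketch.extract("3.45")` into `String.to_float/1` raises, because the function receives `nil` rather than a string; the explicit version returns `{:error, {:unexpected_format, "3.45"}}` for the same input.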

Plus, pattern_filter seems to be designed for piping, which makes sense, but in Elixir the pattern is (almost?) always on the left side of the value.

PS: I didn't reply exactly because I was waiting to see if the proposal developed in different ways. Typically speaking, if there wasn't an explicit "go ahead", then it won't be accepted unless it is further developed. :)




Zach Daniel

Dec 15, 2022, 5:11:32 PM
to elixir-l...@googlegroups.com
Maybe a ridiculous idea, but could we support the actual core pattern matching syntax in some way to support this kind of thing? If so, it could make pattern matching even more powerful than it already is.

This is *not* a good suggestion for the syntax; I think it's very, very ugly and not good. I'm just trying to make up some random operators that don't exist yet. I also haven't thought this through enough to say whether or not it's even technically sound, but to me enhancing pattern matching sounds better than introducing new ways to match patterns.
```
case records do
  [{a, _}--->]  ->
    a # is a list of all `a` that matched the pattern, requiring that all elements match the pattern

  [{a, _}-=->]  ->
    a # is a list of all `a` that matched the pattern, requiring at least one match

  [{a, _}-->]  ->
    a # is a list of all `a` that matched the pattern, allowing no matches

  ...
end
```

Then it plays well with case, won't affect old code because it requires new syntax, that kind of thing. It would also support functions where I use a guard at first and then realize I have to switch it from a guard, because I now want to take a list and only operate when all elements of a list match the pattern.
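For comparison, the three behaviours in that made-up syntax can be approximated today with `for` comprehensions, which skip elements that do not match the generator pattern. The sketch below is only an illustration; the module name, function names, and the `{:ok, list}` / `:error` return shapes are assumptions.

```elixir
defmodule ListMatchSketch do
  # Every element must match `{a, _}`, otherwise :error.
  def all_firsts(records) do
    firsts = for {a, _} <- records, do: a
    if length(firsts) == length(records), do: {:ok, firsts}, else: :error
  end

  # At least one element must match.
  def some_firsts(records) do
    case for({a, _} <- records, do: a) do
      [] -> :error
      firsts -> {:ok, firsts}
    end
  end

  # Non-matching elements are simply skipped; an empty result is allowed.
  def any_firsts(records), do: for({a, _} <- records, do: a)
end
```

These are plain functions, so they cannot be used directly as a `case` clause or in a function head, which is the gap the suggested syntax is aiming at.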


Anyway, just an off the wall idea :)




Matt Farabaugh

Dec 15, 2022, 5:31:42 PM
to elixir-l...@googlegroups.com

Thanks for the reply, José. I was thinking that more ways to pattern match is a good thing. I’m still new to the language so I might not have its philosophy internalized yet.

 

I probably should have included at least the heads of the macros:

 

defmacro pattern_filter(value, pattern, default \\ nil)

defmacro pattern_filter!(value, pattern)

 

So the fallback can be specified as not nil. One of my examples returns “???” when not matching. I was just trying to make it like `Map.get` in that sense. So your example would become this:

 

case res do
  {:ok, value} -> value
  :error -> :default
end

|> pattern_filter({:ok, value}, :default)

 

I agree that it is less flexible than a case – maybe more akin to a ternary with a pattern.

 

I also think that if you needed to extract two variables, then pattern_filter wouldn’t be the right tool.

 

In any case, I really appreciate how much thoughtful feedback I gained from you and the others who replied! I’m glad to be working in Elixir.
