[Proposal] Map.filter/2 based on :maps.filter

273 views
Skip to first unread message

Roman Smirnov

unread,
Aug 15, 2019, 3:29:02 AM8/15/19
to elixir-lang-core
Since OTP 18 there is a pretty convenient function :maps.filter.

I think it would be nice to have Map.filter(map, predicate) in Elixir as well instead of doing


map
|> Enum.filter(predicate)
|> Enum.into(%{})

or 

for {key, value} <- map, some_filter(key, value), into: %{}, do: {key, value}

The first one alternative is slower, b/c of 2-step transformation, and the second one consume more memory, could not be piped and has a lack of expressiveness (too imperative way to do a simple filtration).

There were benchmarks and a small discussion in PR: https://github.com/elixir-lang/elixir/pull/9292, but the discussion should be moved to this mailing list.


Alexei Sholik

unread,
Aug 15, 2019, 5:12:31 AM8/15/19
to elixir-lang-core
The simplest alternative is to use :maps.filter(). It's not as easy to pipe into, but that's a minor concern for me.

As another alternative, this should be faster and have lower memory footprint than using Enum, although you'd need to benchmark it as well to be sure:

map
|> Stream.filter(predicate)
|> Map.new()

Personally, I would love to see an extension for Map.new() that would allow filtering the first argument in addition to the currently supported transformation. So instead of the current

Map.new(enumerable, fn thing -> {key, val} end)

we could have

Map.new(enumerable, map: fn thing -> {key, val} end, filter: fn {_key, val} -> predicate(val) end)

Such an extension is unlikely to be added though because it would be another way of doing what is already possible with Enum, Stream, and Map.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/5d403c4a-91f8-4c4d-8dc0-297185a2aed8%40googlegroups.com.

Johanna Larsson

unread,
Aug 15, 2019, 6:26:02 AM8/15/19
to elixir-l...@googlegroups.com
The stream version is by far the slowest, which isn't surprising since streams bring some overhead. Didn't measure memory usage though https://gist.github.com/joladev/2b51260107078a1fc7f9d4504916fe21

Honestly, I tend to agree it would be nice to add it, for feature completeness, even if it doesn't improve performance hugely. I understand the desire to avoid unnecessary clutter in the core language, but this is a core erlang function that we're explicitly not wrapping, when most of the other ones are wrapped. And most likely just because it wasn't around from the start. It's quite a bit more ergonomic to use `Map.filter` rather than `for` or `map|>Enum.filter(pred)|>Enum.into({})`. Much cleaner.

Not too hard to implement your own version of `Map.filter` though, so I don't feel too strongly about it.

Andrea Leopardi

unread,
Aug 15, 2019, 6:37:07 AM8/15/19
to elixir-l...@googlegroups.com
We should also address the fact that if we add Map.filter/2, we also likely have to add Map.drop/2 (to mimic Enum and Stream), but that name is already taken.

--

Andrea Leopardi

Roman Smirnov

unread,
Aug 17, 2019, 5:40:46 PM8/17/19
to elixir-lang-core
> The simplest alternative is to use :maps.filter(). It's not as easy to pipe into, but that's a minor concern for me.

It's not an alternative, it's a workaround that you have to use, b/c there is no wrapper on this function. And this suggestion is all about adding this wrapper to Elixir.

  

четверг, 15 августа 2019 г., 12:12:31 UTC+3 пользователь alco написал:
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-l...@googlegroups.com.

Roman Smirnov

unread,
Aug 17, 2019, 5:42:34 PM8/17/19
to elixir-lang-core
Why? How Enum.drop/2 is related to Enum.filter/2? They are doing completely different things. 

четверг, 15 августа 2019 г., 13:37:07 UTC+3 пользователь Andrea Leopardi написал:
We should also address the fact that if we add Map.filter/2, we also likely have to add Map.drop/2 (to mimic Enum and Stream), but that name is already taken.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-l...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-l...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-l...@googlegroups.com.
--

Andrea Leopardi

Andrea Leopardi

unread,
Aug 18, 2019, 5:35:41 PM8/18/19
to elixir-lang-core
> Why? How Enum.drop/2 is related to Enum.filter/2? They are doing completely different things. 

My bad, I was referring to Enum.reject/2!

Andrea Leopardi


To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/c819a8cc-c3b6-41cc-a056-76a294b79fb6%40googlegroups.com.

Roman Smirnov

unread,
Aug 18, 2019, 6:16:51 PM8/18/19
to elixir-lang-core
> My bad, I was referring to Enum.reject/2!

Well, Map.reject/2 does not exist yet. So, we could add it as well.

понедельник, 19 августа 2019 г., 0:35:41 UTC+3 пользователь Andrea Leopardi написал:

Chris

unread,
Aug 27, 2019, 1:52:33 PM8/27/19
to elixir-l...@googlegroups.com
It would be nice to have this.

To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/13ab9f68-3895-47b1-bd7f-dee62f1a2e66%40googlegroups.com.

Keith Salisbury

unread,
Mar 11, 2021, 4:18:06 PM3/11/21
to elixir-lang-core
Was there a reason this didn't fly?
 

Adam Millerchip

unread,
Sep 9, 2021, 8:09:37 AM9/9/21
to elixir-lang-core
Bump. Is there any opposition to this? I might (eventually) take a look if nobody beats me to it, as long as there isn't any opposition.

José Valim

unread,
Sep 9, 2021, 8:32:09 AM9/9/21
to elixir-l...@googlegroups.com
Last time we had this discussion, the main concern was that :maps.filter emits key, value as arguments instead of being in a tuple, and that was incongruent to the rest of Elixir’s API. We could wrap it in a tuple, but that added some considerable overhead. So we need to make a choice.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

Wiebe-Marten Wijnja

unread,
Sep 9, 2021, 2:13:08 PM9/9/21
to elixir-l...@googlegroups.com, José Valim
Where can the results of the benchmark/profiling be found? I wonder if there are tricks to reduce the overhead. Also, maybe the impact of this might possibly be less in newer OTP versions, which is something that might be worth checking.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Wiebe-Marten Wijnja

unread,
Sep 9, 2021, 2:22:04 PM9/9/21
to elixir-l...@googlegroups.com, José Valim
Sorry for double-posting, but what also might be relevant is that other than I thought `:maps.filter` is not a BIF (c. f. its implementation at https://github.com/erlang/otp/blob/master/lib/stdlib/src/maps.erl#L300)

We might be able to write our own alternative that wraps the interface one level of abstraction lower, i. e. building on top of the map iterators and from_list functions directly.

José Valim

unread,
Sep 9, 2021, 3:44:23 PM9/9/21
to elixir-l...@googlegroups.com
Exactly. You can build profiling on top of that and let us know how the different approaches go!

Wiebe-Marten Wijnja

unread,
Sep 10, 2021, 1:21:04 PM9/10/21
to elixir-l...@googlegroups.com
I've done some benchmarking


Implementations:
  • :maps.filter(pred, map_or_iter): The baseline we benchmark against. Expects a two-parameter function, which goes against Elixir idioms, where a two-element tuple would be used instead.
  • MapsFilterProf.wrapped_filter(map_or_iter, fun): Wraps :maps.filter/2, to adapt it to a two-element tuple input.
  • MapsFilterProf.direct_filter(map_or_iter, fun): Re-implements :maps.filter/2 directly, with the assumption that this might be more performant than wrapping a function with an extra intermediate anonymous function.
  • MapsFilterProf.direct_filter_inlined(map_or_iter, fun): Similar to direct_filter/2, but inlines the other fnctions the other implementation would call from the :maps module as well (most importantly: next/1 which is called once per map element). This is done since local function calls are usually faster (and more often optimized by the compiler) than inter-module function calls.


log-log graph comparing the four filter-implementations,
        plotting map-size against time.
Benchmarking was done on Elixir 1.12.0 / Erlang 24.0.1, on an x86-64 Linux machine with 8GB RAM.

It can clearly be seen from this graph that the difference in implementations is very small.
Especially the difference between direct_filter_inlined/2 vs. :maps.filter/2 is neglegible. wrapped_filter/2 is,
as was expected, a little bit (averaging at ~20-40%) slower than the former two.

On average, the direct_filter_inlined/2 seems to be slightly faster than the non-inlined version.
This difference is small, but significant (i.e. reproducible across benchmark re-runs).



The particular benchmark I wrote filters all odd values from a map that has the shape %{0 => 0, 1 => 1, 2 => 2, ... n => n}.
This seems like an appropriate test for filtering maps, but maybe there are even better ideas?


~Marten/Qqwy

OpenPGP_signature

Allen Madsen

unread,
Sep 10, 2021, 5:08:19 PM9/10/21
to elixir-l...@googlegroups.com
Perhaps also benchmark the combination of Enum.filter/2 and Enum.into/2, since this is the current way with the elixir standard library.

José Valim

unread,
Sep 11, 2021, 4:27:21 AM9/11/21
to elixir-l...@googlegroups.com
Thanks Marten, so I think we can proceed with a PR for Map.filter and Map.map, both passing tuples as argument. Note we will need the same for Keyword. Could you please submit it?

Wiebe-Marten Wijnja

unread,
Sep 11, 2021, 9:26:45 AM9/11/21
to elixir-l...@googlegroups.com

@Allen Madsen:

I have added `Enum.filter` to the benchmark.

Also, by telling the compiler to inline the calls to `next` and `iterator`, removing redundant guard-clause checks and not supporting map-iterators (as the rest of the `Map`-module doesn't expose them either),
`Map.filter` is able to be slightly faster than `:maps.filter` 🥳.


For posterity, the related elixir-lang PR can be found here.

~Marten/Qqwy
OpenPGP_signature
Reply all
Reply to author
Forward
0 new messages