Proposal: Map.split

Chris Miller

unread,

Dec 13, 2021, 1:42:45 PM12/13/21

to elixir-lang-core

Is there any interest in adding a `Map.split_with/2` that would take a function of `{key :: any(), value :: any()} -> boolean` and returns `{map_where_true :: map(), map_where_false :: map()}`?

I know this functionality can be created easily with the functionality thats already exposed, but it seems like it might be a nice addition and would add greater parity between Enum and Map - it could also be added to Keyword, even thought the distinction between Keyword.split_with and Enum.split_with would be nominal.

José Valim

unread,

Dec 13, 2021, 1:51:30 PM12/13/21

to elixir-lang-core

Hi Chris,

Thanks for the proposal.

I would like to first see benchmarks that show a Map implementation can be considerably more efficient. Otherwise, if it is about saving a couple Map.new calls, then I would rather not add it, as it will move to copying many more functions from Enum to Map.

On Mon, Dec 13, 2021 at 7:42 PM Chris Miller <camil...@gmail.com> wrote:

Is there any interest in adding a `Map.split_with/2` that would take a function of `{key :: any(), value :: any()} -> boolean` and returns `{map_where_true :: map(), map_where_false :: map()}`?

I know this functionality can be created easily with the functionality thats already exposed, but it seems like it might be a nice addition and would add greater parity between Enum and Map - it could also be added to Keyword, even thought the distinction between Keyword.split_with and Enum.split_with would be nominal.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/0a7b4be9-ccb9-4c6a-b482-96379a6a9a18n%40googlegroups.com.

Chris Miller

unread,

Dec 13, 2021, 2:38:16 PM12/13/21

to elixir-lang-core

Of course some numbers would be great to look at here - here is a benchmark for an implementation of Map.split_with/2 comparing the results agains a few implementations using the current standard library

Code

```elixir

defmodule SplitWith do
def test do
map = Map.new(0..1000, fn n -> {n, n} end)
predicate = fn {x, _} -> rem(x, 2) == 0 end

Benchee.run(%{
"split_with" => fn -> split_with(map, predicate) end,
"enum_split_with" => fn -> enum_split_with(map, predicate) end,
"enum_reduce" => fn -> enum_reduce(map, predicate) end,
"map_filter_reject" => fn -> map_filter_reject(map, predicate) end
})
end

def split_with(map, fun) when is_map(map) and is_function(fun, 1) do
iter = :maps.iterator(map)
next = :maps.next(iter)
{while_true, while_false} = do_split_with({[], []}, next, fun)
{:maps.from_list(while_true), :maps.from_list(while_false)}
end

defp do_split_with(acc, :none, _fun), do: acc

defp do_split_with({while_true, while_false}, {key, value, iter}, fun) do
if fun.({key, value}) do
do_split_with({[{key, value} | while_true], while_false}, :maps.next(iter), fun)
else
do_split_with({while_true, [{key, value} | while_false]}, :maps.next(iter), fun)
end
end

def map_filter_reject(map, predicate) do
{Map.filter(map, &predicate.(&1)), Map.filter(map, fn pair -> not predicate.(pair) end)}
end

def enum_reduce(map, predicate) do
Enum.reduce(map, {%{}, %{}}, fn {key, value} = pair, {while_true, while_false} ->
if predicate.(pair) do
{Map.put(while_true, key, value), while_false}
else
{while_true, Map.put(while_false, key, value)}
end
end)
end

def enum_split_with(map, predicate) do
{while_true, while_false} = Enum.split_with(map, &predicate.(&1))
{Map.new(while_true), Map.new(while_false)}
end
end
```

Benchee results
```
iex(1)> SplitWith.test
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.0
Erlang 24.0.5

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 28 s

Benchmarking enum_reduce...
Benchmarking enum_split_with...
Benchmarking map_filter_reject...
Benchmarking split_with...

Name ips average deviation median 99th %
split_with 5.73 K 174.41 μs ±10.32% 170 μs 243 μs
enum_split_with 5.06 K 197.47 μs ±12.64% 191 μs 291 μs
enum_reduce 4.52 K 221.10 μs ±22.96% 207 μs 435 μs
map_filter_reject 4.52 K 221.15 μs ±11.35% 215 μs 313 μs

Comparison:
split_with 5.73 K
enum_split_with 5.06 K - 1.13x slower +23.06 μs
enum_reduce 4.52 K - 1.27x slower +46.69 μs
map_filter_reject 4.52 K - 1.27x slower +46.73 μs

```

So we do get some modest gains with the new implementation. Let me know what your thoughts are on moving forward!

And as always - really appreciate your work!

José Valim

unread,

Dec 14, 2021, 4:02:37 AM12/14/21

to elixir-lang-core

TL;DR - pull request is welcome.

My initial take was that the performance benefits are not significant but then, after carefully checking the Enum API, I realized that my slippery-slope argument does not hold: there aren't any other functions, as far as I can tell, that would make sense to be ported to Map. Therefore, I would say the small performance improvement and the API completeness arguments are strong enough to move forward, so please do send a pull request for Map.split_with/2 and Keyword.split_with/2.

To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/bc95dc57-2df1-4b32-b673-535e5c499493n%40googlegroups.com.

Zach Daniel

unread,

Dec 14, 2021, 12:54:21 PM12/14/21

to elixir-l...@googlegroups.com

For some reason, this seems off to me. It seems to me like something like a.) allowing enumerable implementors to override the method of splitting, and then b.) allowing `split_with` to specify an option that says "I expect the original type back", might be better? My perspective here is based on making the structures that we use regularly more powerful/smarter as opposed to splitting that responsibility up into the other modules. I'll have to forever remember that `Map.split_with` performs better for maps than `Enum.split_with` and to use that instead if I'm using a map. Whereas `Enum.split_with(map, func, into: %{})` (just using that option name as an example, I don't really think it is good).

Sent via Superhuman

On Tue, Dec 14, 2021 at 4:02 AM, José Valim <jose....@dashbit.co> wrote:

TL;DR - pull request is welcome.

My initial take was that the performance benefits are not significant but then, after carefully checking the Enum API, I realized that my slippery-slope argument does not hold: there aren't any other functions, as far as I can tell, that would make sense to be ported to Map. Therefore, I would say the small performance improvement and the API completeness arguments are strong enough to move forward, so please do send a pull request for Map.split_with/2 and Keyword.split_with/2.

To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/bc95dc57-2df1-4b32-b673-535e5c499493n%40googlegroups.com.

--

You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.

To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4K5_PDeWk%3Dis26c%3Dc0OwFDrfy8LfjwQivriccxq2vKWZg%40mail.gmail.com.

Reply all

Reply to author

Forward

Proposal: Map.split_with/2

Chris Miller

José Valim

Chris Miller

José Valim

Zach Daniel