Proposal: Map.split_with/2

54 views
Skip to first unread message

Chris Miller

unread,
Dec 13, 2021, 1:42:45 PM12/13/21
to elixir-lang-core
Is there any interest in adding a `Map.split_with/2` that would take a function of `{key :: any(), value :: any()} -> boolean` and returns `{map_where_true :: map(), map_where_false :: map()}`? 

I know this functionality can be created easily with the functionality thats already exposed, but it seems like it might be a nice addition and would add greater parity between Enum and Map - it could also be added to Keyword, even thought the distinction between Keyword.split_with and Enum.split_with would be nominal.


José Valim

unread,
Dec 13, 2021, 1:51:30 PM12/13/21
to elixir-lang-core
Hi Chris,

Thanks for the proposal.

I would like to first see benchmarks that show a Map implementation can be considerably more efficient. Otherwise, if it is about saving a couple Map.new calls, then I would rather not add it, as it will move to copying many more functions from Enum to Map.

On Mon, Dec 13, 2021 at 7:42 PM Chris Miller <camil...@gmail.com> wrote:
Is there any interest in adding a `Map.split_with/2` that would take a function of `{key :: any(), value :: any()} -> boolean` and returns `{map_where_true :: map(), map_where_false :: map()}`? 

I know this functionality can be created easily with the functionality thats already exposed, but it seems like it might be a nice addition and would add greater parity between Enum and Map - it could also be added to Keyword, even thought the distinction between Keyword.split_with and Enum.split_with would be nominal.


--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/0a7b4be9-ccb9-4c6a-b482-96379a6a9a18n%40googlegroups.com.

Chris Miller

unread,
Dec 13, 2021, 2:38:16 PM12/13/21
to elixir-lang-core
Of course some numbers would be great to look at here - here is a benchmark for an implementation of Map.split_with/2 comparing the results agains a few implementations using the current standard library

Code
```elixir
defmodule SplitWith do
  def test do
    map = Map.new(0..1000, fn n -> {n, n} end)
    predicate = fn {x, _} -> rem(x, 2) == 0 end

    Benchee.run(%{
      "split_with" => fn -> split_with(map, predicate) end,
      "enum_split_with" => fn -> enum_split_with(map, predicate) end,
      "enum_reduce" => fn -> enum_reduce(map, predicate) end,
      "map_filter_reject" => fn -> map_filter_reject(map, predicate) end
    })
  end

  def split_with(map, fun) when is_map(map) and is_function(fun, 1) do
    iter = :maps.iterator(map)
    next = :maps.next(iter)
    {while_true, while_false} = do_split_with({[], []}, next, fun)
    {:maps.from_list(while_true), :maps.from_list(while_false)}
  end

  defp do_split_with(acc, :none, _fun), do: acc

  defp do_split_with({while_true, while_false}, {key, value, iter}, fun) do
    if fun.({key, value}) do
      do_split_with({[{key, value} | while_true], while_false}, :maps.next(iter), fun)
    else
      do_split_with({while_true, [{key, value} | while_false]}, :maps.next(iter), fun)
    end
  end

  def map_filter_reject(map, predicate) do
    {Map.filter(map, &predicate.(&1)), Map.filter(map, fn pair -> not predicate.(pair) end)}
  end

  def enum_reduce(map, predicate) do
    Enum.reduce(map, {%{}, %{}}, fn {key, value} = pair, {while_true, while_false} ->
      if predicate.(pair) do
        {Map.put(while_true, key, value), while_false}
      else
        {while_true, Map.put(while_false, key, value)}
      end
    end)
  end

  def enum_split_with(map, predicate) do
    {while_true, while_false} = Enum.split_with(map, &predicate.(&1))
    {Map.new(while_true), Map.new(while_false)}
  end
end
```

Benchee results
```
iex(1)> SplitWith.test
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.0
Erlang 24.0.5

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 28 s

Benchmarking enum_reduce...
Benchmarking enum_split_with...
Benchmarking map_filter_reject...
Benchmarking split_with...

Name                        ips        average  deviation         median         99th %
split_with               5.73 K      174.41 μs    ±10.32%         170 μs         243 μs
enum_split_with          5.06 K      197.47 μs    ±12.64%         191 μs         291 μs
enum_reduce              4.52 K      221.10 μs    ±22.96%         207 μs         435 μs
map_filter_reject        4.52 K      221.15 μs    ±11.35%         215 μs         313 μs

Comparison:
split_with               5.73 K
enum_split_with          5.06 K - 1.13x slower +23.06 μs
enum_reduce              4.52 K - 1.27x slower +46.69 μs
map_filter_reject        4.52 K - 1.27x slower +46.73 μs
```

So we do get some modest gains with the new implementation.  Let me know what your thoughts are on moving forward!

And as always - really appreciate your work!

José Valim

unread,
Dec 14, 2021, 4:02:37 AM12/14/21
to elixir-lang-core
TL;DR - pull request is welcome.

My initial take was that the performance benefits are not significant but then, after carefully checking the Enum API, I realized that my slippery-slope argument does not hold: there aren't any other functions, as far as I can tell, that would make sense to be ported to Map. Therefore, I would say the small performance improvement and the API completeness arguments are strong enough to move forward, so please do send a pull request for Map.split_with/2 and Keyword.split_with/2.

Zach Daniel

unread,
Dec 14, 2021, 12:54:21 PM12/14/21
to elixir-l...@googlegroups.com
For some reason, this seems off to me. It seems to me like something like a.) allowing enumerable implementors to override the method of splitting, and then b.) allowing `split_with` to specify an option that says "I expect the original type back", might be better? My perspective here is based on making the structures that we use regularly more powerful/smarter as opposed to splitting that responsibility up into the other modules. I'll have to forever remember that `Map.split_with` performs better for maps than `Enum.split_with` and to use that instead if I'm using a map. Whereas `Enum.split_with(map, func, into: %{})` (just using that option name as an example, I don't really think it is good).

Sent via Superhuman


On Tue, Dec 14, 2021 at 4:02 AM, José Valim <jose....@dashbit.co> wrote:
TL;DR - pull request is welcome.

My initial take was that the performance benefits are not significant but then, after carefully checking the Enum API, I realized that my slippery-slope argument does not hold: there aren't any other functions, as far as I can tell, that would make sense to be ported to Map. Therefore, I would say the small performance improvement and the API completeness arguments are strong enough to move forward, so please do send a pull request for Map.split_with/2 and Keyword.split_with/2.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4K5_PDeWk%3Dis26c%3Dc0OwFDrfy8LfjwQivriccxq2vKWZg%40mail.gmail.com.

Reply all
Reply to author
Forward
0 new messages