Map.new/2 is quite useful and I use it frequently. Whenever I use it, I expect that my transform function returns unique keys, so that Enum.count(input) == map_size(output) holds true. Unfortunately, this hasn’t always held true for the data I’m working with, and it results in an output map that’s essentially “lost” some of the input data. Here’s an example:
iex(2)> input = [%{name: "Bob", age: 37}, %{name: "Jill", age: 24}, %{name: "Bob", age: 0}]
[%{age: 37, name: "Bob"}, %{age: 24, name: "Jill"}, %{age: 0, name: "Bob"}]
iex(3)> Map.new(input, &{&1.name, &1.age})
%{"Bob" => 0, "Jill" => 24}
Here the %{name: "Bob", age: 37} data has been lost because it was overwritten by %{name: "Bob", age: 0} since maps obviously can’t have duplicates. In this situation, it would be really nice if Map.new/2 did not silently ignore the fact that I gave it duplicate keys. I’d like a strict from of Map.new/2 that raises an error when your transform function returns duplicate keys, to notify you that you aren’t using it properly.
If we add this, I’m not sure if it can just be a change to Map.new/2 (would that be a breaking change for some users?) or if we’d need to add an alternate function like Map.new!/2 or Map.strict_new/2 or something.
Thoughts?
Myron
for n <- input, into: %{}, do: {n.name, n.age}
--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/5e15315d-b6c3-4fa7-b31f-a2d7c5576941%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Beautiful is
better than ugly,
Explicit is better than implicit,
Simple is better than complex,
Complex is better than complicated.
The Zen of Python, by Tim Peters
Overall I like the idea of having a separate function that throws an error in case there are duplicated keys. Just don't think it's the place to be done on Map.new/2 It works that way on Erlang maps:from_list and also with forfor n <- input, into: %{}, do: {n.name, n.age}
2017-10-03 23:23 GMT-04:00 Myron Marston <myron....@gmail.com>:
Map.new/2is quite useful and I use it frequently. Whenever I use it, I expect that my transform function returns unique keys, so thatEnum.count(input) == map_size(output)holds true. Unfortunately, this hasn’t always held true for the data I’m working with, and it results in an output map that’s essentially “lost” some of the input data. Here’s an example:iex(2)> input = [%{name: "Bob", age: 37}, %{name: "Jill", age: 24}, %{name: "Bob", age: 0}] [%{age: 37, name: "Bob"}, %{age: 24, name: "Jill"}, %{age: 0, name: "Bob"}] iex(3)> Map.new(input, &{&1.name, &1.age}) %{"Bob" => 0, "Jill" => 24}Here the
%{name: "Bob", age: 37}data has been lost because it was overwritten by%{name: "Bob", age: 0}since maps obviously can’t have duplicates. In this situation, it would be really nice ifMap.new/2did not silently ignore the fact that I gave it duplicate keys. I’d like a strict from ofMap.new/2that raises an error when your transform function returns duplicate keys, to notify you that you aren’t using it properly.If we add this, I’m not sure if it can just be a change to
Map.new/2(would that be a breaking change for some users?) or if we’d need to add an alternate function likeMap.new!/2orMap.strict_new/2or something.Thoughts?
Myron
--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/5e15315d-b6c3-4fa7-b31f-a2d7c5576941%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
We definitely shouldn’t change Map.new as that would be backwards incompatible.
I had this concern as well, although, I’m curious: can anyone come up with a valid use case for using Map.new/2 with duplicate keys and wanting it to discard the earlier duplicate? I can’t think of a use case for that, but it’s probably just lack of imagination on my part.
Why not just fold over your list and
Map.merge/3over them with a function that handles the duplicates (and throws or whatever you wish)?
If I realized up front that my input data violated my expectations and would result in duplicate keys, then sure, I’d do something like this (or I’d pre-filter out the excess entires). The problem that’s bitten me on multiple occasions is that according to my understanding of my input data, I expect Map.new/2 to work just fine, and for there not to be any duplicate keys. As an example, consider the common use case of working with legacy SQL DBs, where you’ve got a table with a column that is conceptually unique, but that lacks a unique index on to enforce it. I’ve wound up with bugs in my application as a result. Obviously, this isn’t Elixir’s fault, but I do think a warning or an error in this situation would help surface the problem much sooner.
José, would you be open to the addition of a Map.new!/2 function that raises an error when there are duplicate keys? Although, maybe that implies that Map.new/2 returns {:ok, map} (which is obviously doesn’t). Maybe there’s a better name we can come up with for this function if there’s agreement to add it.
I am quite happy with the Map API as is, it covers the majority of cases, so that should probably be written on your own application
This issue has bit me in multiple code bases, and it’s really not ideal to have to re-implement it each time. Given that under normal use of Map.new, you’d expect your input to have unique keys, doesn’t it seem worthwhile to have a version of Map.new/2 when this is violated, rather than silently discarding data?
I am not aware of any optimization we could do that would be specific for Elixir.
What kind of optimization are you referring to? I’d assume that adding logic for this would make the function a bit slower than the existing Map.new/2 but I was hoping we could at least detect this situation (but not necessarily detect what the duplicate keys are) in constant time by using map_size/1 on the resulting map, and comparing it to the input size.
Thanks,
Myron
--
You received this message because you are subscribed to a topic in the Google Groups "elixir-lang-core" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elixir-lang-core/EessBQaaBsE/unsubscribe.
To unsubscribe from this group and all its topics, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/dd49f8a6-43f3-4ce9-bc75-e89dba65b887%40googlegroups.com.
José, would you be open to the addition of a
Map.new!/2function that raises an error when there are duplicate keys? Although, maybe that implies thatMap.new/2returns{:ok, map}(which is obviously doesn’t).
I was hoping we could at least detect this situation (but not necessarily detect what the duplicate keys are) in constant time by using
map_size/1on the resulting map, and comparing it to the input size.
The input size is an enumerable and it is not guaranteed we can retrieve its size in constant time. The most common input is a list and it would require at least traversing the list one more time.
Right, for Map.new/1, you’d have to traverse it one more time. But for Map.new/2, isn’t the enumerable reduced into a list before generating the map? I was thinking that the reduction could count the list at the same time, requiring no extra traversals.
Regardless, thanks for explaining your reasoning!
Myron
--
You received this message because you are subscribed to a topic in the Google Groups "elixir-lang-core" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elixir-lang-core/EessBQaaBsE/unsubscribe.
To unsubscribe from this group and all its topics, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4%2B6__DcCdEpArNQPMyVri3HYKe%2BTw%2BT3kUxuY5sRrDSSg%40mail.gmail.com.