[Proposal] Add invert() method to Map module

223 views
Skip to first unread message

Justin Gamble

unread,
Feb 6, 2019, 1:35:49 AM2/6/19
to elixir-lang-core
Hi,

I think an invert() method on Map would be convenient to use.  It is only a handful of lines of code to add to a custom program, but why should users need to do that?  In my opinion this basic functionality should be implemented by the Map module. 

The invert() method would operate the same way it does for Ruby:

Recently I wanted to invert a map in one of my Elixir programs (details below), and I googled and found the below solution by Chris McCord in https://stackoverflow.com/questions/26614682/how-to-change-all-the-values-in-an-elixir-map:

def invert(map) when is_map(map) do
   
for {k, v} <- map, into: %{}, do: {v, k}
 
end


I would basically like the above function to be added to the Elixir.Map module.

Here is the scenario for which I wanted to use this the invert function:

My Elixir program is synchronizing Robot Framework source code files with TestRail.  Robot Framework defines automated test cases.  TestRail is a web-based Test Case Management system.   The goal is to synchronize TestRail to contain all the tests that are defined in the Robot Framework files, and in the same directory structure.

Originally the Elixir program was specified to only create test cases in TestRail.  When you are creating a test case, the TestRail API requires you to specify the ID of the directory (aka "section") where the case is written.  For this I use an Elixir Map to map paths to section IDs.  When I want to create a file in directory "/path/to/test.robot", then I Lookup the section ID from my map & then call the "add_case" TestRail API function.

That version of the program was a success.  My boss was impressed and next asked for enhancements to the program.  The new functionality being requested is to have existing TestRail Test Cases get updated with the latest test steps defined in Robot Framework.  Part of that update is to verify that the path to the test case is the same in the Robot Framework source code and in the TestRail.  In this case we first call the TestRail API to lookup a test case, at which point we know the section ID that it belongs to.  But we need to translate that to a directory path and confirm that this path is the same location as where the test file exists in the Robot Framework source code.  

For updating, it makes sense to also be able to use an Elixir Map to map from the Section ID back to the Path -- the inverse of the original map.  

It is true that I could iterate through the original map until I found the section ID in the value section, and then return the map index.  But that seems awfully inefficient when you consider that there are hundreds of test files and thousands of tests.  So I decided an inverse map would provide faster lookup, at the cost of more memory being used.  Also note that in my scenario both the indexes and the values are unique; it is a 1-to-1 mapping.

This is just one example scenario.  I imagine other examples exist, or else the invert() function would never have been created in the Ruby community.

Amos King - Binary Noggin

unread,
Feb 6, 2019, 11:57:51 AM2/6/19
to elixir-l...@googlegroups.com
I find that small piece of code would often have other data processing connected to it. I'd hate to cycle through the map multiple times if I was going to swap the key and value, and then do more processing on the value. I think such a function would encourage the practice of chaining along without thinking of the consequences of performance. I personally don't find this to be useful as part of the core of the language, but maybe in a package that you reuse in many of your own projects. If you find that it is pulled into all of your projects it could then be open-sourced so that others could take advantage of it.

Amos King
CEO
Binary Noggin
=======================================================
I welcome VSRE emails. Learn more at http://vsre.info/
=======================================================


--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/b2c7e58a-ad99-4458-a45e-9f7d5f7fce15%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Glauber Campinho

unread,
Feb 6, 2019, 12:59:25 PM2/6/19
to elixir-l...@googlegroups.com
Hi,

I would also raise the concern that map has no ordering, so inverting a map can have different results depending on the map. Said that, I believe that inverting should be done case by case and should not be in the core, and if it is, it should solve the problem of sorting or throw an error.

Glauber Campinho


Justin Gamble

unread,
Feb 6, 2019, 1:27:29 PM2/6/19
to elixir-lang-core
Thanks for your feedback Amos.  I hope you can appreciate that not all applications are focused primarily around performance.

If performance is key to your application, then you are responsible for learning tricks to optimize the code.  That might include iterating over your map a single time, and doing multiple transformations at once - including inverting the map.

For applications where performance is a consideration, but not the prime focus, things are different. Here maintainabiility tends to be the key focus.  I find it elegant to write code where functions do not have side-effects.  (If you are doing multiple transformations while iterating over a map to invert it, then yes you are introducing side-effects.  Perhaps the side-effects are justified, but that is a design decision)

Perhaps you'll agree that the proposed implementation of Map.invert() looks clear.  But two questions:
  1. Would someone new to Elixir write it out so simply?
  2. What happens if a map has multiple keys that map to the same value - which value will become the key in the inverted map?  This requires additional testing and documentation.

Elixir is a productive language.  It should not require each user to go through the tedium of implementing, testing and documenting common code like this.  That is the purpose of having a re-useable library.

I think Map.invert() is a tool that should be offered for users that want it.  As an analogy, consider the filter() function that is offered by both Enum and Stream modules.  Some users will select Stream.filter/2 for performance, but does that mean Enum.filter/2 should be deprecated?  I don't think so.  Users need to be smart enough to know which functions work for them.

In the example scenario I described, I simply wanted an inverted map.  And I questioned why this was not already provided by the library? 

On the topic of performance, I have one other tidbit to mention.  In my application, I achieved performance by launching a separate thread for each Robot Framework file that was being processed. The resulting execution time is so blindingly fast, that I am more than pleased.  I certainly have no need to further
optimize the performance, especially if that would complicate the code.

Thanks,
Justin

Ben Wilson

unread,
Feb 6, 2019, 5:36:17 PM2/6/19
to elixir-lang-core
Map.invert is at least largely harmless, unlike some of the other proposals that happen. It's still a little odd to me to have a function that boils down to just:

Map.new(map, fn {k, v} -> {v, k} end)

Honestly that is almost clearer if you've never heard the word "invert" with respect to maps.

Andrea Leopardi

unread,
Feb 7, 2019, 7:45:49 AM2/7/19
to elixir-l...@googlegroups.com
A big problem is also that semantically keys and values are very different in maps, in that keys are unique and values aren't. Inverting a map could cause problems if you don't take that into account. Of course, if you use the Map.new/2 solution the problem is there anyways, but at least you're being explicit in what happens.

Only one use case has been presented and I think I would need at the very least a few more before considering whether this belongs in core. When a function is this small and the use cases are rare and specific, I think implementing it in your own code usually works best.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.
--

Andrea Leopardi

Valter Sundström

unread,
Feb 7, 2019, 11:40:42 AM2/7/19
to elixir-lang-core
I have to strongly agree with Andrea on this, and it's the first thing I reacted to when reading this suggestion. The fact that it fits semantically in one use case, does not mean it's general.
And as previously stated, in my opinion the explicit `Map.new` is much clearer to me than an additional function, and barely any code, so generalizing a term for this and putting it in the standard library does not seem warranted.

Dmitry Belyaev

unread,
Feb 8, 2019, 8:51:02 AM2/8/19
to Justin Gamble, elixir-lang-core
First of all, I really think that 'invert' is very unfortunate name for such an operation. I was very surprised with the proposed semantic, which for me looks more like 'transpose' (remember matrix transpose).

But even with that name the operation is possible to lose data if values are not unique, so two such consecutive operations may result in different result than the original input.

I doubt that the rules which value takes priority are general enough to go into standard library, and should be specific to the place of invocation.
--
Kind regards,
Dmitry Belyaev

Xavier Noria

unread,
Feb 8, 2019, 10:07:07 AM2/8/19
to elixir-l...@googlegroups.com, Justin Gamble
One discussion is about inclusion in core (settled), and a different discussion is about semantics.

The point of invert is: when you as a programmer know the mapping is a bijection, and you need boths directions, you know invert is well-defined and can use it (present in some languages). If the mapping is not a bijection, you as a programmer know invert probably doesn’t make sense, it is undefined behaviour, or whatever. The former has uses cases (I have used it), the latter probably not or rare.
--
Sent from Gmail Mobile

boris kotov

unread,
Feb 8, 2019, 10:44:41 AM2/8/19
to elixir-lang-core
Just to throw some cents in her.. I think, it belongs into a library, like in other languages.
The elixir core, is mostly used to build the elixir core itself and make it extendable.

Just imagine this example

Map.invert(%{foo: %{ what: :jo }}) 
%{%{jo: :what} => :foo}) ?

Or 

%{%{what: :jo} => :foo}) ?

Or maybe put an option flag there?

Map.invert(map, deep: true)

you see.. is it worth it? The core team should focus on important stuff :p

In other languages, there are libraries like lodash / ramda. There you can find such things a lot. (and mostly yagni)
If I would need it, I would build my own and put it in MyApp.Helpers.invert (you know.. naming is hard..)

Justin Gamble

unread,
Feb 10, 2019, 3:41:47 AM2/10/19
to elixir-lang-core

Hi Andrea,


I think there are two questions:


First question: what are use cases?


Basically it is whenever the keys and values in a map have equal importance as being the reference.  Inverting a map allows you use a value to look up a key.


The following are use cases for inverting a map, where it makes sense to do a lookup on the key & also to do a lookup on the value:


  • Phone Book.  You may have a name and want to lookup the phone number, or vice versa.  (idea taken from here)

  • Converting between languages.  To map words/expressions between 2 languages.

  • Encrypting/decrypting data

  • Capital city & State.  ("given this capital city, what is its state?", "given this state, what is its capital city?")

  • Sports team & school

  • Sports player name & player number

  • Countries & their final ranked position  (i.e. in Soccer World Cup)

  • Finally, to recap the motivation in the original proposal: Converting between English titles and record IDs for talking to APIs.  The same argument could be applied for talking to a database instead of an API.

    • For example, for a school application you many need to convert between classroom_number & classroom_ID , or student_name & student_ID


Here are examples from other programming languages showing that developers are commonly asking about this functionality.


Ruby:


Python:


Perl:



Java:



One reason to implement Map.invert() now is to avoid further people asking about it in the future. ;)


Second question: is the function too small to add to the library?


I now think the initial code I proposed is inadequate.  The Map.new/2 solution is also inadequate. A Map library implementation would need to consider what happens when the map contains duplicate values.  


I'm thinking we should offer 2 variants of the invert() method:   Map.invert/1 and Map.invert!/1.


The Map.invert!/1 method would expect a 1-to-1 map, and raise an exception if the map contains non-unique values  

(i.e. the following would raise an error:

Map.invert!(%{a: 1, b: 1})
)

In contrast, the Map.invert/1 method would allow for non-unique values. It would return a map where the values are a list of keys in the original map.

(i.e.

Map.invert(%{"banana" => "yellow", "strawberry" => "red", "cherry" => "red"})

would return

%{"yellow" => ["banana"], "red" => ["strawberry", "cherry"]}.  
).

This idea is from solutions found in other languages (i.e. "Perl Cookbook", "Ruby Cookbook 2nd ed", or see here or here for Python).  This way no data is lost through the inversion.  Ideally, if possible, we can use a property-based test to validate that inverting the map 2 times will always returns the initial map.


What are your thoughts?


Thanks,

Justin


Ben Wilson

unread,
Feb 10, 2019, 9:28:03 AM2/10/19
to elixir-lang-core
I like that your updated proposal adds functionality beyond simple Map.new. Personally, I'm still not convinced that this isn't better handled by existing functionality where the program author can choose functions that make the result of what's happening clearer.

For example in the `Map.invert/1` case you're essentially doing `Enum.group_by(fn {_, value} -> value end, fn {key, _} -> key end)`. This to me is much clearer than remembering that doing ! or not on the function completely changes the value data type.

Your proposed property test is actually quite problematic. It seems like you're saying that:

```
map == map |> Map.invert |> Map.invert
```

If this is true, and `invert` without a ! is really a grouping, then this has interesting implications for how `map |> Map.invert` works if your values _already happen to be lists_. Concretely what if the map is:

%{body: 'hello world'}

Does this treat every value in that list as if it's a key? This would be incredibly surprising behavior I should think.

Glauber Campinho

unread,
Feb 10, 2019, 12:24:42 PM2/10/19
to elixir-l...@googlegroups.com
If the invert returns a list of values for duplicate values, the invert of invert would not be the original value. That is a strange behavior in my opinion.

Glauber Campinho

unread,
Feb 10, 2019, 12:30:27 PM2/10/19
to elixir-l...@googlegroups.com
Sorry, to be more clear, if you have a map and invert the keys, but have an exception for arrays in the keys, it would possible the inverse of the inverse would be Identity. Then the strange thing is that sometimes you invert the keys, sometimes you split the keys.

All of this seems very specific to the case that you are dealing with, people will want different behaviors for different cases.

Martin Svalin

unread,
Feb 11, 2019, 3:37:15 PM2/11/19
to elixir-l...@googlegroups.com
Imho, given that a Map allows duplicate values, `inverse` is a poor fit for the data structure as such. Map is also not the only key-value data structure common in Elixir. Should there be a corresponding Keyword.inverse/1? It would add the interesting twist that neither key or value need to be unique.

A safely invertible map is a separate data structure, and can be provided in hex packages. If a developer needs guarantees around uniqueness of values, then a plain map is something that needs to be treated carefully. In other cases, the content of the map is known (literal, perhaps), and a straight up Map.new(for {k,v} <- map, do: {v,k}) is just fine.

I agree there are use cases for inverting a map, but I believe the use cases are different enough that a Map.invert/1 risks having a confusing API for duplicates. I think the proposal shows this. Better then to leave it to the dev or to library maintainers.

I think functions in the data structure modules like Map, List, String etc should work with _any_ value of that kind, not come with warnings to not use them unless your data has these special properties.

my 2 cents. /Martin

Reply all
Reply to author
Forward
0 new messages