[PROPOSAL] ELL 2: Implement Enum.zip_map


Quentin Crain

Apr 12, 2016, 11:59:50 AM
to elixir-lang-talk
Hi!

As previously warned in this thread, I have a proposal. Two in fact! The thread you are reading is actually proposal 2 but comes 1st; confusing no?!?

(And yes I fancy myself hysterical! Sorry.)

=======================================================================================

META

Number: ELL 2

Title: Implement Enum.zip_map

Contributors: Quentin Crain, bjfish, elixir-lang slackers


TL;DR

Abstract:


Zip the elements of two enumerables into tuples, padding the shorter one with :nil (the default), and map your function over the tuples; i.e., zip and map in one pass. We have filter_map, so why not this?! :)


Implementation:


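def zip_map(list1, list2, func, padding \\ :nil) do
  l1 = length(list1)
  l2 = length(list2)

  # Pad whichever list is shorter so that both have the same length.
  # The padded lists are returned from the `if`, because a variable
  # rebound inside a branch is not visible after the `end`.
  {list1, list2} =
    if l1 < l2 do
      {list1 ++ List.duplicate(padding, l2 - l1), list2}
    else
      {list1, list2 ++ List.duplicate(padding, l1 - l2)}
    end

  # Pair the elements up, then apply func to each {x, y} tuple.
  list1
  |> Enum.zip(list2)
  |> Enum.map(func)
end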



Examples:


# Concat two lists of lists
a_thru_m = (for e <- 97..109, do: [e])
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm']
n_thru_z = (for e <- 110..122, do: [e])
['n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

zip_map(a_thru_m, n_thru_z, fn ({x,y}) -> x++y end)
['an', 'bo', 'cp', 'dq', 'er', 'fs', 'gt', 'hu', 'iv', 'jw', 'kx', 'ly', 'mz']


# Sum the elements of two lists
one_thru_ten = 1..10 |> Enum.to_list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
one_thru_five = 1..5 |> Enum.to_list
[1, 2, 3, 4, 5]

zip_map(one_thru_ten, one_thru_five, fn ({x,y}) -> x+y end, 0)
[2, 4, 6, 8, 10, 6, 7, 8, 9, 10]


STORY


I’m old. I’m a GenX-er. I press -+ at least twice on webpages nowadays. I have a daughter in college, damn it!


Oh great, another geezer story. Geezers love narratives, especially their own!


So, I figured it was time to get off my ass; I knew I needed to get out of my cubicle, at least virtually. I had been working in Python for 15 years and wasn’t growing. I had just designed a test framework as microservices, but wasn’t happy with a number of things. So, I looked around and of course found Erlang/OTP. Then found Elixir. That was all of 3 months ago.


Dropped Python cold. Didn’t have to, but no better way to learn eh? Like “real” languages.


So, how to learn? Best way for me: read the group’s interactive communication channels. This meant elixir-lang-talk, elixir-lang-core and elixir-lang on slack.


Ok, ok, hurrying the story along: bjfish on slack wanted to split a list into N number of lists, like this:


distribute([1, 2, 3, 4, 5], 3)

[[1, 4], [2, 5], [3]] (or [[1, 2], [3, 4], [5]], he didn’t really care)


He implemented it by grouping on the div of each element’s index (assuming I understand his solution correctly!) Most of the replies, though, wanted to use Enum.chunk and so did I. So I spent 3 days on it.
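For readers who want to see the index-grouping idea concretely, here is a minimal sketch. It is not bjfish's actual code: the module name DistributeByIndex is hypothetical, and it groups on the remainder of each index (rather than the div), which yields the first of the two outputs shown above.

```elixir
defmodule DistributeByIndex do
  # Hypothetical helper, not from the thread: deal the elements of `list`
  # round-robin into at most `n` lists by grouping on rem(index, n).
  def distribute(list, n) when is_list(list) and is_integer(n) and n > 0 do
    list
    |> Enum.with_index()
    |> Enum.group_by(fn {_elem, i} -> rem(i, n) end, fn {elem, _i} -> elem end)
    # Sort by bucket number so the result order does not rely on map ordering.
    |> Enum.sort_by(fn {bucket, _elems} -> bucket end)
    |> Enum.map(fn {_bucket, elems} -> elems end)
  end
end
```

Note that this variant returns fewer than n lists when the input has fewer than n elements.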


Obsessive and stupid, I know!


What seems like the right solution is to be able to Enum.chunk as many elements as possible and then distribute the remaining elements over the chunks. All very easy:


{chunkable, remaining} = Enum.split(SOME-LIST, SOME-MATH-TO-PARTITION)

chunkable = Enum.chunk(chunkable, CHUNK-SIZE)


Of course at this point I just wanted to zip the lists and then map them by appending/++ each element of remaining to a chunk in chunkable. But they are of different lengths so good ‘ol zip wasn’t going to help me. I did it with reduce but thought it ugly.


What I wanted -- nay needed! -- was zip_map! So here it is hopefully making the solution to bjfish’s need very clear:


zip_map(chunkable, remaining,
  fn
    ({element_from_chunkable, :nil}) ->
      element_from_chunkable
    ({element_from_chunkable, element_from_remaining}) ->
      element_from_chunkable ++ [element_from_remaining]
  end
)
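Putting the split, chunk, and zip_map steps together might look like the sketch below. Everything here is an assumption for illustration: the module name ChunkDistribute, the private zip_map helper (a small recursive stand-in for the proposed function), the use of Enum.chunk_every/2 in place of Enum.chunk/2, and the arithmetic that stands in for the SOME-MATH-TO-PARTITION and CHUNK-SIZE placeholders.

```elixir
defmodule ChunkDistribute do
  # Hypothetical sketch: split `list` into `n` lists by chunking evenly,
  # then appending one leftover element to each of the first chunks.
  def distribute(list, n) when is_list(list) and is_integer(n) and n > 0 do
    size = div(length(list), n)
    {chunkable, remaining} = Enum.split(list, size * n)

    chunks =
      if size == 0 do
        # Fewer elements than requested lists: start from n empty chunks.
        List.duplicate([], n)
      else
        Enum.chunk_every(chunkable, size)
      end

    merge = fn
      {chunk, nil} -> chunk
      {chunk, extra} -> chunk ++ [extra]
    end

    zip_map(chunks, remaining, merge, nil)
  end

  # Padded zip-and-map over two lists, standing in for the proposed zip_map.
  defp zip_map([], [], _fun, _pad), do: []
  defp zip_map([x | xs], [], fun, pad), do: [fun.({x, pad}) | zip_map(xs, [], fun, pad)]
  defp zip_map([], [y | ys], fun, pad), do: [fun.({pad, y}) | zip_map([], ys, fun, pad)]
  defp zip_map([x | xs], [y | ys], fun, pad), do: [fun.({x, y}) | zip_map(xs, ys, fun, pad)]
end
```

Because nil doubles as the padding marker, a list that itself contains nil elements would be handled incorrectly; a unique marker would be safer in real code.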


That’s why I’m proposing zip_map. Thoughts?


Booker Bense

Apr 12, 2016, 12:51:12 PM
to elixir-lang-talk
First, Welcome to Elixir.
 
Your function is not one pass, so I don't see a lot of value over simply using Enum.zip and Enum.map.

FWIW, you might consider contributing to the Crutches project. That project is meant to gather
a lot of useful methods that don't quite meet the standard for the Core Elixir library. In general
getting anything into Enum at this point requires a strong use case. There is a focus on keeping
the core as lean as possible to maximize core developers' time. It's very hard to take out something
once it is in core. Functions that are largely composition of existing Enum functions are a hard
sell. This is just my opinion as an interested observer, but if you want to put something in Enum, it
should be a single pass reduction.
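To illustrate what a single-pass version could look like, here is a hedged sketch for plain lists only. The module name ZipMapSketch is hypothetical and this is not a proposed core implementation; it simply walks both lists once, without calling length/1 or building an intermediate zipped list.

```elixir
defmodule ZipMapSketch do
  # Walk both lists in lockstep, applying `fun` to each pair. Once one list
  # runs out, `padding` stands in for its missing elements.
  def zip_map(list1, list2, fun, padding \\ nil)

  def zip_map([], [], _fun, _padding), do: []

  def zip_map([x | xs], [], fun, padding),
    do: [fun.({x, padding}) | zip_map(xs, [], fun, padding)]

  def zip_map([], [y | ys], fun, padding),
    do: [fun.({padding, y}) | zip_map([], ys, fun, padding)]

  def zip_map([x | xs], [y | ys], fun, padding),
    do: [fun.({x, y}) | zip_map(xs, ys, fun, padding)]
end
```

A version that accepts arbitrary enumerables rather than lists would need a different approach.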

https://github.com/mykewould/crutches


- Booker C. Bense

Peter Hamilton

Apr 12, 2016, 12:53:13 PM
to elixir-lang-talk

There are two proposals within this one.

First, a form of zip that pads rather than "finishes as soon as any enumerable completes."

Second, a convenient zip_map function.

For a padding zip, I think Enum.pad/2 and Stream.pad/2 are possibilities, though like your implementation it wouldn't work for indeterminate or infinite streams. The real question isn't "How do I make ListA have length N?" It's "How do I make ListA have the same length as ListB?"

That's not really something easily solved with the standard Enum/Stream funs, as there's a coupling between the two Lists. One solution that would work with indefinite streams:

ref = make_ref()
padding = Stream.cycle([ref])
Stream.zip(Stream.concat(list1, padding), Stream.concat(list2, padding))
|> Stream.take_while(&({ref,ref} != &1))

Or something like that. That's still too complex to call it an "idiomatic solution" (plus perf issues). So I think padded zip still has a place (it would need a cleaner implementation though).
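The make_ref idea above can be wrapped into a small lazy function. This is only a sketch under assumptions: the module name PadZipSketch and the function pad_zip/3 are hypothetical, and the final Stream.map step (swapping the internal marker for a caller-chosen padding) is an addition not present in the snippet above.

```elixir
defmodule PadZipSketch do
  # Lazily zip two enumerables, padding whichever ends first with `padding`.
  # A unique reference marks "this side is exhausted" so that real elements
  # can never be mistaken for the marker.
  def pad_zip(left, right, padding \\ nil) do
    ref = make_ref()
    filler = Stream.cycle([ref])

    left
    |> Stream.concat(filler)
    |> Stream.zip(Stream.concat(right, filler))
    # Stop as soon as both sides are producing only the marker.
    |> Stream.take_while(fn pair -> pair != {ref, ref} end)
    |> Stream.map(fn {l, r} -> {swap(l, ref, padding), swap(r, ref, padding)} end)
  end

  # Replace the internal marker with the user-visible padding value.
  defp swap(ref, ref, padding), do: padding
  defp swap(value, _ref, _padding), do: value
end
```

Since the result is a stream, one side may be infinite as long as the caller takes a finite number of elements, e.g. with Enum.take/2.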

As for zip_map, I'm not a huge fan when Stream.zip |> Enum.map would accomplish the same thing.

Also, Welcome!

- Peter

--
You received this message because you are subscribed to the Google Groups "elixir-lang-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-talk/e1565486-c76f-4582-837c-ba4f734c3ae7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

José Valim

Apr 12, 2016, 12:58:02 PM
to elixir-l...@googlegroups.com
Thank you Quentin!

Today I consider Enum.filter_map/3 mostly a mistake. The question is: what can we do with Enum.zip_map/3 that we won't be able to do today with Enum.zip/2 and Enum.map/2? Aside from the fact that the former would traverse the list just once, of course.



José Valim
Skype: jv.ptec
Founder and Director of R&D

Quentin Crain

Apr 12, 2016, 1:30:28 PM
to elixir-lang-talk
Thanks for welcomes!

@Peter: Yes, it is sorta a twofer request: It would be nice if zip didn't terminate on the shortest list. In fact, my experience is usually I want to continue walking the longer list with info that I'm done with the shorter list, hence the default of :nil.

@Booker & @José: It is true, this isn't much more than packaging up zip & map. I totally get keeping things functionally (ha!) short-n-sweet for composability. I didn't hold much hope, but I had to introduce myself somehow; plus, it was a way to make my other proposal! ;)

Should we consider the ELL withdrawn or rejected? :)

<< q


José Valim

Apr 12, 2016, 1:43:49 PM
to elixir-l...@googlegroups.com
> @Peter: Yes, it is sorta a twofer request: It would be nice if zip didn't terminate on the shortest list. In fact, my experience is usually I want to continue walking the longer list with info that I'm done with the shorter list, hence the default of :nil.

Maybe we should be talking about a zip_pad or something similar that receives two lists and a padding and it takes elements from the pad in order to complete the shorter list?




José Valim
Skype: jv.ptec
Founder and Director of R&D


Peter Hamilton

Apr 12, 2016, 1:45:18 PM
to elixir-lang-talk
I don't like the name zip_pad. Could we consider zip/3, following the example of chunk/4?

José Valim

Apr 12, 2016, 1:49:36 PM
to elixir-l...@googlegroups.com
I thought about chunk/4 as well but do we want the same semantics? In chunk/4, the padding is used as is. If the pad is too short, the final chunk is too short. That's ok for chunk/4 because the chunk size is known beforehand. That's not the case for zip and I think we would rather prefer to cycle over the pad (for example, if I give [1, 2, 3], it will complete it with [1, 2, 3, 1, 2, 3, ...]). I am not settled on the name, let's figure out the semantics and we can move to the name next. Thoughts?



José Valim
Skype: jv.ptec
Founder and Director of R&D

Peter Hamilton

Apr 12, 2016, 2:11:18 PM
to elixir-l...@googlegroups.com
From a usability standpoint, I don't find it too cumbersome to do:

Enum.chunk(list, 15, 15, Stream.cycle([nil]))

There may be a performance hit though.

If we make zip/3 cycle by default, then there isn't a possibility of doing the normal pad (whereas the opposite would work).

I'm also intrigued (not sure if I like it or dislike it) by the idea of padding with a Stream full of side effects, which wouldn't work if we only allowed padding with a cycle.
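The difference between a pad that is "used as is" and a cycled pad can be seen in a short sketch. It uses Enum.chunk_every/4, the current replacement for Enum.chunk/4, whose leftover argument behaves the same way; the variable names are only illustrative.

```elixir
list = Enum.to_list(1..7)

# A finite pad is used as is: when it runs out, the last chunk stays short.
short_pad = Enum.chunk_every(list, 3, 3, [:pad])
# short_pad is [[1, 2, 3], [4, 5, 6], [7, :pad]]

# An infinite pad built with Stream.cycle/1 always fills the last chunk.
cycled_pad = Enum.chunk_every(list, 3, 3, Stream.cycle([:pad]))
# cycled_pad is [[1, 2, 3], [4, 5, 6], [7, :pad, :pad]]
```

This is why "pad as is" is the more general primitive: the caller can opt into cycling by passing a cycled stream, but not the other way around.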

José Valim

Apr 12, 2016, 3:19:23 PM
to elixir-l...@googlegroups.com
Maybe we should support options like :pad and :cycle? Not sure I like it though.


Quentin Crain

Apr 12, 2016, 3:54:56 PM
to elixir-lang-talk, jose....@plataformatec.com.br
Seems like zip mimicking chunk would be consistent and nice.

Like this? (Yes, cheating with the range, just lazy! :))

zip(1..5, 1..8)
[{1, 1}, {2, 2}, {3, 3}, {4, 4}, {5, 5}]

zip(1..5, 1..8, ["Marcia"])
[{1, 1}, {2, 2}, {3, 3}, {4, 4}, {5, 5}, {"Marcia", 6}]

zip(1..5, 1..8, Stream.cycle(["Marcia"]))
[{1, 1}, {2, 2}, {3, 3}, {4, 4}, {5, 5}, {"Marcia", 6}, {"Marcia", 7}, {"Marcia", 8}]

No? (Assuming I'm understanding.)

<< q


Peter Hamilton

Apr 12, 2016, 5:09:06 PM
to elixir-lang-talk, jose....@plataformatec.com.br
Yes. With the case:

zip(1..8, 1..5, Stream.cycle(["Marcia"]))
[{1, 1}, {2, 2}, {3, 3}, {4, 4}, {5, 5}, {6, "Marcia"}, {7, "Marcia"}, {8, "Marcia"}]

also being true (important that the pad goes to whichever stream ends first)
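The zip/3 semantics discussed here (the pad is used as is and goes to whichever side ends first) can be sketched as follows. The module ZipPadSketch is hypothetical and is not part of Enum; it materializes both inputs as lists, so it only works when both are finite, although the pad itself may be an infinite stream.

```elixir
defmodule ZipPadSketch do
  # Zip two finite enumerables. The shorter side is extended with elements
  # taken from `pad`; if `pad` runs out, the result is simply shorter.
  def zip(left, right, pad) do
    left = Enum.to_list(left)
    right = Enum.to_list(right)
    diff = length(left) - length(right)

    cond do
      diff < 0 -> Enum.zip(left ++ Enum.take(pad, -diff), right)
      diff > 0 -> Enum.zip(left, right ++ Enum.take(pad, diff))
      true -> Enum.zip(left, right)
    end
  end
end
```

With a one-element pad the result stops one tuple after the shorter input ends; with Stream.cycle(["Marcia"]) it runs to the end of the longer input, matching the examples above.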


Quentin Crain

Apr 12, 2016, 5:28:10 PM
to elixir-l...@googlegroups.com
Yes, lovely!


Peter Hamilton

Apr 12, 2016, 5:31:58 PM
to elixir-l...@googlegroups.com
Jose: Any notion of pad vs cycle should apply to chunk as well. I think so long as we have chunk/4 in its current state, zip/3 should have the same behavior. I don't think we want to change chunk at this point in time, so I think we're stuck with those semantics.


José Valim

Apr 12, 2016, 5:43:28 PM
to elixir-l...@googlegroups.com
Should we just introduce zip/3 or should we go with a more explicit name? I am aware mirroring chunk/4 is beneficial but I always felt chunk/4 could have been more intuitive as two separate functions.

Peter Hamilton

Apr 12, 2016, 7:20:03 PM
to elixir-l...@googlegroups.com
Here are the options I see:
1. zip/3 now and then when we split chunk/4 (breaking change) we can also split zip/3.
2. zip_pad (or whatever we call it) now, leave chunk/4 until a breaking change release and do chunk_pad.
3. do zip_pad and chunk_pad now and get them in the next breaking change release.

The long term goal (which I think we all agree on) should be mirroring. There's a possible concession of a short term mismatch. My preference is for this functionality to be available sooner rather than later, so I wouldn't want to tie it to the release of chunk_pad unless there was a breaking release coming. That leaves #1 and #2. We've talked about #1 plenty, but it's probably fair to say we shouldn't rule out #2 until we get a feel for what the new semantics would be like.

I'm having a hard time envisioning the split. What did you have in mind?

eksperimental

Apr 12, 2016, 9:46:20 PM
to elixir-l...@googlegroups.com, Tallak Tveide
I asked a while ago whether there was a zip-padding function:
https://groups.google.com/d/msg/elixir-lang-talk/pil9caRXQnM/-newc1hUAAAJ
There was not, and José suggested that it would be a desirable
function to have in Enum, so I wrote one. It's been sitting there for a
while, and I was actually thinking about picking it back up this week, only
to see that there is a similar proposal.

There is already a working version:
https://github.com/eksperimental/experimental_elixir/blob/master/lib/enum_pad.ex
https://github.com/eksperimental/experimental_elixir/blob/master/test/enum_pad_test.exs

what I ended up doing was creating two sets of functions:
Enum.pad/3-4
Enum.pad_zip/3-4

and Enum.enumerable?/1 to determine if we are dealing with an enumerable
(there are probably better ways to deal with this).

I never submitted it because I never implemented the Stream versions of
them, but whatever is under Enum is ready for review.

Quentin Crain

Apr 12, 2016, 11:20:42 PM
to elixir-lang-talk, tal...@gmail.com
Sweet! Much better than my little posts .. awesome eksperimental!

<< q

José Valim

Apr 13, 2016, 3:29:50 AM
to elixir-l...@googlegroups.com
Peter, we can always introduce a new function named chunk* and keep the existing chunk/4, to be deprecated in the long term.

It is also worth mentioning that we are going to introduce String.pad_leading/3 and String.pad_trailing/3 that will use the cycling semantics (because I don't think the non-cycling semantics makes sense for String):

    String.pad_leading("hello", 10, " ") #=> "     hello"
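To make the cycling semantics concrete, here is a small sketch using String.pad_leading/3 and String.pad_trailing/3 as they exist in current Elixir; the variable names are only illustrative. When the padding is longer than one grapheme, it is repeated and cut off once the target length is reached.

```elixir
# A single-character pad simply repeats.
spaces = String.pad_leading("hello", 10, " ")
# spaces is "     hello"

# A multi-character pad is cycled: "1", "2", "1", and then the string.
leading = String.pad_leading("abc", 6, "12")
# leading is "121abc"

trailing = String.pad_trailing("abc", 6, "12")
# trailing is "abc121"
```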




José Valim
Skype: jv.ptec
Founder and Director of R&D


Rich Morin

Apr 13, 2016, 12:10:09 PM
to elixir-l...@googlegroups.com
This discussion reminds me of a notion I had a while back. Might it
be useful to create a table that catalogs some of the general-purpose
functions in various functional languages? This might help to reduce
proliferation of terminology (and the resulting confusion).

-r

--
http://www.cfcl.com/rdm Rich Morin r...@cfcl.com
http://www.cfcl.com/rdm/resume San Bruno, CA, USA +1 650-873-7841

Software system design, development, and documentation

