[Proposal] `Enum.operation_by/3`


Luca Campobasso

Jan 12, 2021, 7:32:31 AM
to elixir-lang-core
There are already some functions that combine a filter function with the operation, like `Enum.count/1` and its filtered version, `Enum.count/2`. Some other functions, like `Enum.max` or `Enum.frequencies`, have a `_by` version that takes an additional argument.
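As a minimal illustration of the two arities of `Enum.count` mentioned above (the list and the `even?` predicate are made up for this sketch):

```elixir
numbers = [1, 2, 3, 4, 5, 6]

# Hypothetical predicate used only for this illustration.
even? = fn n -> rem(n, 2) == 0 end

# count/1 counts every element.
total = Enum.count(numbers)

# count/2 counts only the elements for which the function returns a truthy value.
evens = Enum.count(numbers, even?)

# The same result written as two explicit steps.
evens_two_steps = numbers |> Enum.filter(even?) |> Enum.count()
```

Here `evens` and `evens_two_steps` hold the same value; the two-step form builds an intermediate list first.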

It would be cool and much more succinct, also for the library, if we could generalise this behavior to all functions, and have a function that takes a function name as an atom, its "operating" function, and a filter function, i.e. something like this:
```
Enum.operation_by(:map, &String.downcase/1, &filter_fun/1)
```
which I think would considerably reduce the size of the API. People often filter before doing something else with Enum. This doesn't bring any new functionality, but rather generalises a behaviour and reduces the amount of stuff one would need to remember or know about the Enum API (which, by the way, is a work of art as it is, TBH).

 

Austin Ziegler

Jan 12, 2021, 11:10:43 AM
to elixir-l...@googlegroups.com
I don’t like the name, but the idea here would be for `Enum.operation_by/4`, wouldn’t it, or would it be a function that returns a function?

Something like:

```elixir
# Variant 1: returns a function that can be applied to an enumerable later.
def operation_by(operation, transform_fun, filter_fun) do
  fn enumerable ->
    filtered = Enum.filter(enumerable, filter_fun)
    apply(Enum, operation, [filtered, transform_fun])
  end
end

# Variant 2: takes the enumerable directly (arity 4).
def operation_by(enumerable, operation, transform_fun, filter_fun) do
  filtered = Enum.filter(enumerable, filter_fun)
  apply(Enum, operation, [filtered, transform_fun])
end
```

So this would visibly replace `enum |> filter(filter_fun) |> map(transform_fun)` as `enum |> operation_by(:map, transform_fun, filter_fun)`.

I’m not sure that this offers _enough_ visual benefit (and operation_by isn’t exactly a catchy name) that it would be worth adding to core. If it could _also_ provide a _performance_ benefit, I think it’d be an easier win. Especially as a filter-and-transform can be done pretty easily with `Enum.reduce`:

```elixir
list
|> Enum.reduce([], fn value, acc ->
  # Keep only multiples of 3, squaring them as they are collected.
  if Integer.mod(value, 3) == 0 do
    [value * value | acc]
  else
    acc
  end
end)
|> Enum.reverse()
```

-a

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/d6ae0579-04d7-4a2c-944a-f7279c789004n%40googlegroups.com.


--

Luca Campobasso

Jan 12, 2021, 11:36:05 AM
to elixir-lang-core
Well, the name could be anything else; I'm not too attached to it, nor to the argument order. Yes, I meant it to have arity 4.

My main argument for adding something like this is that there are already several functions like it, such as count/2 (filter and count) and the others ending in "_by", which add a lot of stuff to the core library and could be implemented along the lines of the example you gave, e.g. count/2:

```
list
|> Enum.reduce(0, fn value, acc ->
  if fun_filter(value) do
    acc + 1
  else
    acc
  end
end)
```

So, this addition would actually reduce the size of the core Enum library.

Wiebe-Marten Wijnja

Jan 12, 2021, 1:08:06 PM
to elixir-l...@googlegroups.com

It seems to me that we already have something very similar to the hypothetical `Enum.operation_by`.
Namely: for-comprehensions.
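A minimal sketch of the filter-and-transform case from the proposal written as a comprehension (the names and the `filter_fun` predicate here are hypothetical stand-ins):

```elixir
names = ["Alice", "Bob", "Carol"]

# Hypothetical filter, standing in for filter_fun/1 from the proposal.
filter_fun = fn name -> String.length(name) > 3 end

# The filter and the transformation happen in a single pass over names.
downcased = for name <- names, filter_fun.(name), do: String.downcase(name)
```

Names rejected by the filter never reach `String.downcase/1`.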

~Qqwy/W-M


Luca Campobasso

Jan 12, 2021, 4:27:14 PM
to elixir-lang-core
Yes, but we can also do pretty much everything with recursion and if/else statements, so there would be no need for the for-comprehension either.

What I'm proposing is to get rid of all the functions in the Enum API that add an arity like the ones mentioned above.
Here I see that you refuse all kinds of functions, saying that "you can just do A + B", but meanwhile the same API is littered with functions that are "A + B" too.

So what I see here, with your answers, is that you draw the line on which functions to add and which not simply based on the mood you're in today.

José Valim

Jan 12, 2021, 4:49:04 PM
to elixir-l...@googlegroups.com
Hi folks, let’s please keep the tone of the discussion constructive.

Luca, can you please outline which functions you believe would be replaced by operation_by/4 and how? You mentioned count/2 and others ending with _by, but many of the latter cannot be replaced by your proposed function. Functions like max_by, frequencies_by and similar are not doing filtering.
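To illustrate the distinction, here is a small sketch with made-up inputs: in each `_by` function the extra function computes a key used for comparing or grouping, and no element is dropped because of a predicate.

```elixir
# max_by: the function computes the key to compare on.
longest = Enum.max_by(["a", "ccc", "bb"], &String.length/1)

# frequencies_by: the function normalizes each element before counting.
freqs = Enum.frequencies_by(~w(aa aA bb cc), &String.downcase/1)

# dedup_by: only consecutive elements with the same key are collapsed.
deduped = Enum.dedup_by([{1, :a}, {2, :b}, {2, :c}, {1, :a}], fn {x, _} -> x end)

# chunk_by: a new chunk starts whenever the function's result changes.
chunks = Enum.chunk_by([1, 2, 2, 3, 4, 4, 6, 7, 7], &(rem(&1, 2) == 1))
```

In all four calls the function argument plays the role of a key, which is why a generic "filter, then operate" helper does not obviously cover them.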

Also, the implementation you propose would be less efficient, as it by definition does a separate filter pass on the data before calling another Enum function.

So if you can provide more information, that would help reframe the discussion.




Austin Ziegler

Jan 12, 2021, 4:53:30 PM
to elixir-l...@googlegroups.com
I’m not involved in the decision; I’m only discussing it as a possibly interested user.

My concern is that `operation_by` hides a potentially useless `Enum.filter/2`, resulting in 2n loops over the enumerable list. `for` comprehensions and `Enum.reduce/2,3` make the filtering part of the list, and the use of Stream operations does something _similar_ while leaving the expressiveness that `operation_by` may hide.

If we could figure out a way to implement something like `operation_by` without the filter pass so that it operates more efficiently than `list |> Enum.filter(filter) |> <enum-operation>(operation_fn)`, I think that it becomes much easier to argue.
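One possible direction, offered only as a sketch: swap the eager `Enum.filter/2` for the lazy `Stream.filter/2`, so that the final Enum call pulls elements through the filter as it goes and no intermediate list is built. The module name `OperationBySketch` is hypothetical, and whether this is actually faster would have to be measured, since streams carry their own overhead.

```elixir
defmodule OperationBySketch do
  # Hypothetical helper; this is not part of Enum.
  def operation_by(enumerable, operation, transform_fun, filter_fun) do
    # Stream.filter/2 is lazy: nothing is traversed until Enum consumes it.
    lazy = Stream.filter(enumerable, filter_fun)
    apply(Enum, operation, [lazy, transform_fun])
  end
end

# Squares of the multiples of 3 between 1 and 10.
result = OperationBySketch.operation_by(1..10, :map, &(&1 * &1), &(rem(&1, 3) == 0))
```

This only works for Enum functions that take the enumerable first and a single function second, which is the same restriction the eager version has.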

Austin Ziegler

Jan 12, 2021, 4:54:23 PM
to elixir-l...@googlegroups.com
That was my implementation; I don’t want Luca’s proposal to be tainted by the inefficient implementation I put in to try to understand the proposal.

-a

José Valim

Jan 12, 2021, 5:02:26 PM
to elixir-l...@googlegroups.com
> That was my implementation; I don’t want Luca’s proposal to be tainted by the inefficient implementation I put in to try to understand the proposal.

That's precisely the point. :) It will be beneficial to see a concrete proposal because it will help move the discussion forward. One of Luca's proposals also had a filter step, which I still don't see how it applies to chunk_by, dedup_by, max_by, etc.

Greg Vaughn

Jan 12, 2021, 5:10:38 PM
to elixir-l...@googlegroups.com
My 2 biggest concerns about the proposal (as I understand it, but would welcome more details):

1) Removing things from Enum should not be taken lightly. It must wait until Elixir 2.x for semantic versioning reasons and even then it is asking the community to update a lot of existing code.

2) It's hostile to newcomers to Elixir. Often Elixir is also their first functional language. The proposal requires a solid knowledge of higher order functions in order to do some very basic things a beginner would see in an early tutorial.

-Greg Vaughn

Luca Campobasso

Jan 12, 2021, 5:11:27 PM
to elixir-lang-core
For now I'm not trying to put forward any 'concrete' implementation. It's just an idea for a generalised function. 

While I understand that there's no `Enum.filter` written explicitly in the code, there's still this step of sort of filtering out, written one way or the other, which at high level looks like filtering e.g. the simplest example is this:
```
def count(enumerable, fun) do
  reduce(enumerable, 0, fn entry, acc ->
    if(fun.(entry), do: acc + 1, else: acc)
  end)
end
```
The implementation wouldn't require new code in itself; I was just thinking of combining functions that already exist.

Luca Campobasso

Jan 12, 2021, 5:13:03 PM
to elixir-lang-core
I didn't mean to just remove functions from one minor version to the next, e.g. 11.3 to 11.4; I thought that went without saying.

José Valim

Jan 12, 2021, 5:14:40 PM
to elixir-l...@googlegroups.com
 
> While I understand that there's no `Enum.filter` written explicitly in the code, there's still this step of sort of filtering out, written one way or the other, which at high level looks like filtering e.g. the simplest example is this:
> ```
> def count(enumerable, fun) do
>   reduce(enumerable, 0, fn entry, acc ->
>     if(fun.(entry), do: acc + 1, else: acc)
>   end)
> end
> ```

Can you please tell me which functions you believe this pattern could be applied to? You mentioned count and filter and other _by functions, but I am still struggling to see how it applies to max_by, dedup_by, chunk_by, etc.

Otherwise, it is best to revisit it once there is a more concrete proposal (even if not efficient at first).

Luca Campobasso

Jan 12, 2021, 5:20:52 PM
to elixir-lang-core
Well, I didn't cook up a proposal; I just had an idea and thought I could write about it. At this moment I have nothing written in code as a concrete proposal.
Looking through the API, I thought it could be applied to, for example, map, zip, reduce, sum, take, into, shuffle, scan, basically anything that goes through a list. 

w...@resilia.nl

Jan 13, 2021, 8:30:40 AM
to elixir-lang-core
I think my last message may have been too short, not explaining my reasoning well enough.

The main concern I have is that if you have a very flexible function whose set of internal operations depends on its arguments,
it is no longer immediately obvious (i.e. obvious from only looking at the code where it is used) what the particular order of these operations will be.

For instance:

```
Enum.operation_by(names, :count, &String.downcase/1, &filter_fun/1)
```

When is `String.downcase` run here, and what impact will it have on the result?

If one were to write a longer pipeline with more, simpler, steps, then it becomes easier to read and comprehend.

For instance, we might have either

```
names
|> Enum.filter(&filter_fun/1)
|> Enum.map(&String.downcase/1)
|> Enum.count()
```

or

```
names
|> Enum.map(&String.downcase/1)
|> Enum.filter(&filter_fun/1)
|> Enum.count()
```

in each case resolving the ambiguity.
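A small sketch of why the order matters, using a made-up list and a hypothetical `filter_fun` that keeps only names that are already lowercase:

```elixir
names = ["Alice", "bob", "CAROL"]

# Hypothetical filter: true only when the name is already all lowercase.
filter_fun = fn name -> name == String.downcase(name) end

# Filter first: only "bob" passes, then it is downcased and counted.
filter_first =
  names
  |> Enum.filter(filter_fun)
  |> Enum.map(&String.downcase/1)
  |> Enum.count()

# Map first: every name is downcased, so every name passes the filter.
map_first =
  names
  |> Enum.map(&String.downcase/1)
  |> Enum.filter(filter_fun)
  |> Enum.count()
```

The two pipelines give different counts for the same input, and a single `operation_by` call would not show which of the two is meant.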

Of course, we're leaving some runtime efficiency on the table by having multiple smaller steps.

When this really becomes a problem, we can replace a particular pipeline with `for`:

`for` is a much higher-level abstraction than doing manual recursion, because:
  • It has its own generator syntax to filter on a pattern match.
  • It can also filter by arbitrary functions.
  • It allows you to pass the extra options `:reduce`, `:into` and/or `:uniq` to abstract away common transformations of the result.
Such a for-loop might be slightly harder to read than a pipeline of simple steps,
but it will be more performant because no intermediate enumerables are constructed any more.
And it will still be clear in which order the steps are performed.

Example:

```
for name <- names,
    filter_fun.(String.downcase(name)),
    reduce: 0 do
  acc -> acc + 1
end
```
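The pattern-match filter and the `:uniq` and `:into` options listed above can be sketched in the same way (the `results` list is made up for illustration):

```elixir
results = [{:ok, 1}, {:error, :boom}, {:ok, 3}, {:ok, 1}]

# Pattern-match filter: tuples that do not match {:ok, value} are skipped.
oks = for {:ok, value} <- results, do: value

# uniq: true drops duplicate results.
unique_oks = for {:ok, value} <- results, uniq: true, do: value

# into: collects the results into another collectable, here a map.
squares = for {:ok, value} <- results, into: %{}, do: {value, value * value}
```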

So this is why I am not currently convinced that this proposal is a good idea.

To summarize: In most cases the clarity of having more simpler steps in a pipeline is preferable,
and in the cases where performance really does become critical, we already have `for`.

I hope that contextualizes my earlier reply a little :-)

~Qqwy/Marten

Luca Campobasso

Jan 15, 2021, 3:47:13 AM
to elixir-lang-core
That's much better, Qqwy, thanks. This is a much more explanatory answer :)

I cannot see however this: "It allows you to pass the extra options `:reduce`, `:into` and/or `:uniq` to abstract away common transformations of the result."
Is it true? I cannot see it in the docs. There is only the "into" option.

José Valim

Jan 15, 2021, 3:54:42 AM
to elixir-l...@googlegroups.com
Maybe you are reading an outdated version of the docs?



Luca Campobasso

Jan 15, 2021, 4:32:48 AM
to elixir-lang-core
If you type "for comprehension" into Google, this is the first result:
which doesn't report those options

Usually I check the API docs too, but I also assume that the information content is more or less equivalent to the Getting Started section

José Valim

Jan 15, 2021, 4:39:29 AM
to elixir-l...@googlegroups.com
The getting started guide is not in-depth, as it is mostly meant as an introduction, but we should at least link to the docs for further reference. I will update it, thanks!
