Add Enum.histogram and Enum.histogram_by

Pedro Carvalho

Jul 19, 2022, 10:21:02 PM
to elixir-lang-core
Histograms are so common in data science. Why not add them to Enum?
And it's just some code like this:

{min, max} = Enum.min_max(enumerable)

interval_size = max - min

# Assumes `opts` is a keyword list holding the number of buckets
buckets = opts[:buckets_number]
bucket_size = interval_size / buckets

enumerable
# Clamp the index so the maximum element lands in the last bucket
|> Enum.frequencies_by(fn element -> min(trunc((element - min) / bucket_size), buckets - 1) end)
|> Enum.map(fn {key, frequency} ->
  bucket_min = key * bucket_size + min
  bucket_max = min(max, bucket_min + bucket_size)
  {{bucket_min, bucket_max}, frequency}
end)
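
For illustration, the snippet above could be wrapped into the two proposed functions roughly as follows. This is only a sketch under assumptions: `HistogramSketch` is a hypothetical module name, not part of Elixir's `Enum`, and the argument order, the clamping of the maximum into the last bucket, and the handling of the all-equal case are my own choices rather than an agreed design. An empty enumerable raises, because `Enum.min_max/1` does.

```elixir
defmodule HistogramSketch do
  # Hypothetical module for illustration; not part of Elixir's Enum.

  # Counts the elements of `enumerable` in `buckets` equal-width buckets.
  def histogram(enumerable, buckets), do: histogram_by(enumerable, buckets, & &1)

  # Same as histogram/2, but buckets on the number returned by `fun`.
  def histogram_by(enumerable, buckets, fun)
      when is_integer(buckets) and buckets > 0 and is_function(fun, 1) do
    values = Enum.map(enumerable, fun)
    {lo, hi} = Enum.min_max(values)

    # Avoid dividing by zero when all values are equal.
    width = if hi == lo, do: 1.0, else: (hi - lo) / buckets

    values
    |> Enum.frequencies_by(fn value ->
      # Clamp so that the maximum falls in the last bucket.
      min(trunc((value - lo) / width), buckets - 1)
    end)
    # Sort by bucket index so the result reads from lowest to highest bucket.
    |> Enum.sort()
    |> Enum.map(fn {index, count} ->
      bucket_lo = lo + index * width
      {{bucket_lo, min(hi, bucket_lo + width)}, count}
    end)
  end
end
```

With this sketch, `HistogramSketch.histogram([0, 1, 2, 3, 4], 2)` counts two elements in the first bucket and three in the second, and buckets that receive no elements are simply absent from the result.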

an.le...@gmail.com

Jul 20, 2022, 3:39:00 AM
to elixir-l...@googlegroups.com
Hi Pedro,

Thanks for the proposal.

My counter-argument here is this: while this is useful for data science, how common is it to do data-science work with just Elixir's standard library? I'd imagine that most of the time you'd be using a data-science-focused library anyway. In that case, maybe this function belongs there. A similar argument is that this could live in a library instead of the standard library.

Are there any particular reasons why this would belong in the standard library?

Andrea

To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/176bfe73-effd-47d0-be85-2961c897a67en%40googlegroups.com.
