[ruby-core:45805] [ruby-trunk - Feature #6636][Open] Enumerable#size

56 views
Skip to first unread message

marcandre (Marc-Andre Lafortune)

unread,
Jun 23, 2012, 1:49:38 PM6/23/12
to ruby...@ruby-lang.org

Issue #6636 has been reported by marcandre (Marc-Andre Lafortune).

----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636

Author: marcandre (Marc-Andre Lafortune)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: 2.0.0


Now that it has been made clear that `Enumerable#count` never calls `#size` and that we have `Enumerable#lazy`, let me propose again an API for a lazy way to get the size of an Enumerable: `Enumerable#size`.

* call-seq:
* enum.size # => nil, Integer or Float::INFINITY
*
* Returns the number of elements that will be yielded, without going through
* the iteration (i.e. lazy), or +nil+ if it can't be calculated lazily.
*
* perm = (1..100).to_a.permutation(4)
* perm.size # => 94109400
* perm.each_cons(2).size # => 94109399
* loop.size # => Float::INFINITY
* [42].drop_while.size # => nil

About 66 core methods returning enumerators would have a lazy `size`, like `each_slice`, `permutation` or `lazy.take`.

A few would have `size` return `nil`:
Array#{r}index, {take|drop}_while
Enumerable#find{_index}, {take|drop}_while
IO: all methods

Sized enumerators can also be created naturally by providing a block to `to_enum`/`enum_for` or a lambda to `Enumerator.new`.

Example for `to_enum`:

class Integer
def composition
return to_enum(:composition){ 1 << (self - 1) } unless block_given?
yield [] if zero?
downto(1) do |i|
(self - i).composition do |comp|
yield [i, *comp]
end
end
end
end

4.composition.to_a
# => [[4], [3, 1], [2, 2], [2, 1, 1], [1, 3], [1, 2, 1], [1, 1, 2], [1, 1, 1, 1]]
42.composition.size # => 2199023255552

Example for `Enumerator.new`:

def lazy_product(*enums)
sizer = ->{
enums.inject(1) do |product, e|
break if (size = e.size).nil?
product * size
end
}
Enumerator.new(sizer) do |yielder|
# ... generate combinations
end
end

lazy_product(1..4, (1..3).each_cons(2)).size # => 8
lazy_product(1..4, (1..3).cycle).size # => Float::INFINITY



--
http://bugs.ruby-lang.org/

marcandre (Marc-Andre Lafortune)

unread,
Jun 30, 2012, 4:26:48 PM6/30/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by marcandre (Marc-Andre Lafortune).

File enumsize.pdf added

Attaching one-minute slide
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-27618

mame (Yusuke Endoh)

unread,
Jul 1, 2012, 1:09:39 PM7/1/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by mame (Yusuke Endoh).

Status changed from Open to Assigned

Received. Thank you!

--
Yusuke Endoh <ma...@tsg.ne.jp>
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-27674

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned

naruse (Yui NARUSE)

unread,
Jul 21, 2012, 2:43:52 AM7/21/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by naruse (Yui NARUSE).


How about adding Enumerator#receiver and define yourself with it.

diff --git a/enumerator.c b/enumerator.c
index f01ddd5..8e3ae9a 100644
--- a/enumerator.c
+++ b/enumerator.c
@@ -942,6 +942,32 @@ enumerator_inspect(VALUE obj)
}

/*
+ * call-seq:
+ * e.receiver -> object
+ *
+ * Returns the receiver of this enumerator.
+ */
+
+static VALUE
+enumerator_receiver(VALUE obj)
+{
+ struct enumerator *e;
+ VALUE eobj;
+
+ TypedData_Get_Struct(obj, struct enumerator, &enumerator_data_type, e);
+ if (!e || e->obj == Qundef) {
+ return Qnil;
+ }
+
+ eobj = rb_attr_get(obj, id_receiver);
+ if (NIL_P(eobj)) {
+ eobj = e->obj;
+ }
+
+ return eobj;
+}
+
+/*
* Yielder
*/
static void
@@ -1748,6 +1774,7 @@ InitVM_Enumerator(void)
rb_define_method(rb_cEnumerator, "feed", enumerator_feed, 1);
rb_define_method(rb_cEnumerator, "rewind", enumerator_rewind, 0);
rb_define_method(rb_cEnumerator, "inspect", enumerator_inspect, 0);
+ rb_define_method(rb_cEnumerator, "receiver", enumerator_receiver, 0);

/* Lazy */
rb_cLazy = rb_define_class_under(rb_cEnumerator, "Lazy", rb_cEnumerator);

irb(main):007:0> e="abcde".enum_for(:each_byte)
=> #<Enumerator: "abcde":each_byte>
irb(main):009:0> def e.size; receiver.bytesize; end
=> nil
irb(main):010:0> e.size
=> 5
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-28247

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned

marcandre (Marc-Andre Lafortune)

unread,
Jul 21, 2012, 5:39:00 PM7/21/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by marcandre (Marc-Andre Lafortune).


Hi,

On Sat, Jul 21, 2012 at 2:43 AM, naruse (Yui NARUSE) <nar...@airemix.jp> wrote:
> How about adding Enumerator#receiver and define yourself with it.

I agree `receiver` could be helpful. One would also need the `method` and the `arguments`, as in my request #3714.

Still, it doesn't really address the issue.

If someone wants to write a library to output the progression, for example, it is still not possible for a general enumerable/enumerator.

The proposal is so that:
- it is standard so anyone can depend on it and also create their own enumerables/enumerators
- it can help in some calculations
- it can help to have generic progression reports
- etc.

Thanks
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-28263

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned

mame (Yusuke Endoh)

unread,
Jul 23, 2012, 10:55:37 AM7/23/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by mame (Yusuke Endoh).

Status changed from Assigned to Feedback

Marc-Andre Lafortune,

We discussed your slide at the developer meeting (7/21).

Matz was positive to the spec of return value: Integer, Float::
INFINITY, and nil.
However, we couldn't understand what API is proposed for creating
an Enumeartor with size.

So, please revise and elaborate your API according to these two
points:


* Enumerator.new(size) is not acceptable because of compatibility:

p Enumerator.new([1,2,3]).take(2) #=> [1, 2]

* We cannot determine the size of enumerator when creating it:

a = [1]
e = a.permutation
a << 2
p e.to_a #=> [[1, 2], [2, 1]]

So, the API may need to receive a code fragment that calculates
size, such as a Proc.

--
Yusuke Endoh <ma...@tsg.ne.jp>
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-28331

Author: marcandre (Marc-Andre Lafortune)
Status: Feedback

marcandre (Marc-Andre Lafortune)

unread,
Jul 23, 2012, 11:13:43 AM7/23/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by marcandre (Marc-Andre Lafortune).

Status changed from Feedback to Open

Hi,

mame (Yusuke Endoh) wrote:
> Matz was positive to the spec of return value: Integer, Float::
> INFINITY, and nil.

:-)

> However, we couldn't understand what API is proposed for creating
> an Enumeartor with size.
>
> So, please revise and elaborate your API according to these two
> points:
>
> * Enumerator.new(size) is not acceptable because of compatibility:
>
> p Enumerator.new([1,2,3]).take(2) #=> [1, 2]

Agreed.
I am proposing `Enumerator.new(size_lambda){ block }`, i.e. only if a block is given, then the first argument can be a lambda/proc that can lazily compute the size.

The old syntax of `Enumerator.new` without a block does not change meaning.

> * We cannot determine the size of enumerator when creating it:
>
> a = [1]
> e = a.permutation
> a << 2
> p e.to_a #=> [[1, 2], [2, 1]]
>
> So, the API may need to receive a code fragment that calculates
> size, such as a Proc.

Agreed.
This is why I propose that `to_enum` accepts a block that can calculate the size, and Enumerator.new with a block can accept a lambda/proc for the same.

Does this address the concerns?

I will be glad to propose a set of patches so we can experiment with this.
--
Marc-André
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-28336

Yusuke Endoh

unread,
Jul 24, 2012, 10:29:30 AM7/24/12
to ruby...@ruby-lang.org
Hello Marc-Andre

2012/7/24, marcandre (Marc-Andre Lafortune) <ruby...@marc-andre.ca>:
>> * Enumerator.new(size) is not acceptable because of compatibility:
>>
>> p Enumerator.new([1,2,3]).take(2) #=> [1, 2]
>
> Agreed.
> I am proposing `Enumerator.new(size_lambda){ block }`, i.e. only if a block
> is given, then the first argument can be a lambda/proc that can lazily
> compute the size.

This is just my guess, but matz will not like such a method whose
meaning of its argument varies depending on whether block is given
or not.


> The old syntax of `Enumerator.new` without a block does not change meaning.

Is it okay that there is no way to specify size in this case?


>> * We cannot determine the size of enumerator when creating it:
>>
>> a = [1]
>> e = a.permutation
>> a << 2
>> p e.to_a #=> [[1, 2], [2, 1]]
>>
>> So, the API may need to receive a code fragment that calculates
>> size, such as a Proc.
>
> Agreed.
> This is why I propose that `to_enum` accepts a block that can calculate the
> size, and Enumerator.new with a block can accept a lambda/proc for the
> same.

What argument(s) will the lambda/proc receive?

--
Yusuke Endoh <ma...@tsg.ne.jp>

marcandre (Marc-Andre Lafortune)

unread,
Jul 24, 2012, 11:26:53 AM7/24/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by marcandre (Marc-Andre Lafortune).


Hi,

mame (Yusuke Endoh) wrote:
> > I am proposing `Enumerator.new(size_lambda){ block }`, i.e. only if a block
> > is given, then the first argument can be a lambda/proc that can lazily
> > compute the size.
>
> This is just my guess, but matz will not like such a method whose
> meaning of its argument varies depending on whether block is given
> or not.

I understand the concern.

It could still be acceptable here because the other form is already documented as 'discouraged'. Maybe we should deprecate it?

Other possibility would be to add a different creator, e.g. `Enumerator.sized(size_lambda){|yielder| ... }`.

> > The old syntax of `Enumerator.new` without a block does not change meaning.
>
> Is it okay that there is no way to specify size in this case?

This old syntax is already discouraged and `to_enum`/`enum_for` should be used instead.

> > This is why I propose that `to_enum` accepts a block that can calculate the
> > size, and Enumerator.new with a block can accept a lambda/proc for the
> > same.
>
> What argument(s) will the lambda/proc receive?

We could consider passing the receiver and/or any arguments passed to `to_enum`, but I would propose to keep it simple and pass no arguments.

This is because enumerators are immutable and all information held in Enumerators should be accessible from the block/lambda anyways.
--
Marc-André
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-28400

ko1 (Koichi Sasada)

unread,
Oct 26, 2012, 6:12:09 PM10/26/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by ko1 (Koichi Sasada).

Assignee changed from matz (Yukihiro Matsumoto) to mame (Yusuke Endoh)


----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-31691

Author: marcandre (Marc-Andre Lafortune)
Status: Open
Priority: Normal
Assignee: mame (Yusuke Endoh)

mame (Yusuke Endoh)

unread,
Oct 26, 2012, 10:51:15 PM10/26/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by mame (Yusuke Endoh).

Status changed from Open to Assigned
Assignee changed from mame (Yusuke Endoh) to matz (Yukihiro Matsumoto)
Target version changed from 2.0.0 to next minor


----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-31763

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: next minor

marcandre (Marc-Andre Lafortune)

unread,
Oct 29, 2012, 1:48:44 AM10/29/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by marcandre (Marc-Andre Lafortune).


Hi.

My understanding was this new feature would make it into Ruby 2.0. Did I misunderstand?

The implementation can be seen here: https://github.com/marcandre/ruby/compare/marcandre:trunk...marcandre:enum_size

Although the combined diff (https://github.com/marcandre/ruby/compare/marcandre:trunk...marcandre:enum_size.diff ) is pretty big, there are really just two interesting commits. The second adds #size and extends constructor. The third extends `to_enum` to accept a block. See https://github.com/marcandre/ruby/commit/add_enumerator_size and https://github.com/marcandre/ruby/commit/sized_to_enum

The first commit only warns on using the deprecated form with no block `Enumerator.new(obj, *args)`, but compatibility is maintained.

The remaining commits add support for the different ways of creating enumerators that can evaluate lazily their size. A few remain to be implemented, in particular the lazy ones.

----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-31908

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: next minor

marcandre (Marc-Andre Lafortune)

unread,
Oct 31, 2012, 12:20:11 AM10/31/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by marcandre (Marc-Andre Lafortune).


I added support for lazy enumerators.

Yusuke, Matz, did I address your questions/concerns?
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-32052

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: next minor

matz (Yukihiro Matsumoto)

unread,
Nov 4, 2012, 11:30:45 PM11/4/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by matz (Yukihiro Matsumoto).


After skimming your modifies, I feel they are decent.
Sorry for being late to check.

Matz.
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-32364

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: next minor

marcandre (Marc-Andre Lafortune)

unread,
Nov 5, 2012, 12:16:26 PM11/5/12
to ruby...@ruby-lang.org

Issue #6636 has been updated by marcandre (Marc-Andre Lafortune).

Assignee changed from matz (Yukihiro Matsumoto) to marcandre (Marc-Andre Lafortune)
Target version changed from next minor to 2.0.0

Hi,

matz (Yukihiro Matsumoto) wrote:
> After skimming your modifies, I feel they are decent.

Thanks for looking at them. Very happy to read this :-)

I'll then commit these and finish implementing size for ranges of strings.
--
Marc-André
----------------------------------------
Feature #6636: Enumerable#size
https://bugs.ruby-lang.org/issues/6636#change-32449

Author: marcandre (Marc-Andre Lafortune)
Status: Assigned
Priority: Normal
Assignee: marcandre (Marc-Andre Lafortune)
Reply all
Reply to author
Forward
0 new messages