
join not in Enumerable


Logan Capaldo

May 21, 2005, 6:34:32 PM
Just a few minutes ago I was playing with irb as I am wont to do, and
typed this:

('a'..'z').join(' ')

Lo and behold it protested at me with a NoMethodError. I said to my
self, self there is no reason that has to be Array only functionality.
Why isn't it in Enumerable? So I said:

module Enumerable
  def join(sep = '')
    inject do |a, b|
      "#{a}#{sep}#{b}"
    end
  end
end

And then I said ('a'..'z').join(' ') and got:
=> "a b c d e f g h i j k l m n o p q r s t u v w x y z"

#inject has to be the most dangerously effective method ever. But I digress:

Why is join, and perhaps even pack in Array and not in Enumerable?
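
For reference, the inject-based definition above can be tried without reopening Enumerable. The sketch below wraps the same block in a plain method (inject_join is a made-up name, not part of Ruby). It also shows two edge cases: with no initial value, inject returns the lone element unchanged for a one-element collection and nil for an empty one.

```ruby
# Hypothetical helper: same logic as the Enumerable#join above,
# written as a plain method so no core module is modified.
def inject_join(enum, sep = '')
  enum.inject { |a, b| "#{a}#{sep}#{b}" }
end

letters = inject_join('a'..'e', ' ')  # => "a b c d e"
single  = inject_join([42])           # => 42 (still an Integer, not a String)
empty   = inject_join([])             # => nil
```

Array#join returns a String in both of those edge cases, so a drop-in replacement would need an initial value and an explicit to_s.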


Ara.T.Howard

May 21, 2005, 7:56:08 PM

the only reason i can think of is that just because something is countable
(Enumerable) doesn't mean each sub-thing is singular. take a hash for
example. this is no stumbling block (pun intended) for ruby however:

harp:~ > cat a.rb
module Enumerable
  def join(sep = '', &b)
    inject(nil){|s,x| "#{ s }#{ s && sep }#{ b ? b[ x ] : x }"}
  end
end
class Array; def join(*a, &b); super; end; end

r = 'a' .. 'z'
p(r.join(' '))

h = {:k => :v, :K => :V}
p(h.join(';'){|kv| kv.join '=>'})

a = [ [0, 1], [2, 3] ]
p(a.join(','){|kv| kv.join ':'})


harp:~ > ruby a.rb


"a b c d e f g h i j k l m n o p q r s t u v w x y z"

"k=>v;K=>V"
"0:1,2:3"

this allows 'nesting' of join calls for arbitrarily deep enumerable structures.

a3 = [
  [ [:a, :b], [:c, :d] ],
  [ [:e, :f], [:g, :h] ],
]

p( a3.join('___'){|a2| a2.join('__'){|a1| a1.join '_'}} )

#=> "a_b__c_d___e_f__g_h"
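
The same nesting can be sketched without touching Enumerable or Array. This is only an illustration: block_join is a hypothetical helper that takes the collection as an argument and otherwise follows the inject(nil) approach shown above.

```ruby
# Hypothetical helper: join with an optional per-element block.
# s is nil on the first step, so "#{s && sep}" adds no leading separator.
def block_join(enum, sep = '')
  enum.inject(nil) do |s, x|
    piece = block_given? ? yield(x) : x
    "#{s}#{s && sep}#{piece}"
  end
end

pairs = block_join({ :k => :v, :K => :V }, ';') { |kv| block_join(kv, '=>') }

a3 = [
  [ [:a, :b], [:c, :d] ],
  [ [:e, :f], [:g, :h] ],
]
nested = block_join(a3, '___') { |a2| block_join(a2, '__') { |a1| block_join(a1, '_') } }
```

Each level of nesting gets its own separator because each block decides how its element is turned into a string.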

it's a nice idea you have there!

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================

David A. Black

May 21, 2005, 10:05:10 PM
Hi --

On Sun, 22 May 2005, Logan Capaldo wrote:

> Just a few minutes ago I was playing with irb as I am wont to do, and
> typed this:
>
> ('a'..'z').join(' ')
>
> Lo and behold it protested at me with a NoMethodError. I said to my
> self, self there is no reason that has to be Array only functionality.
> Why isn't it in Enumerable? So I said:
>
> module Enumerable
> def join(sep = '')
> inject do |a, b|
> "#{a}#{sep}#{b}"
> end
> end
> end
>
> And then I said ('a'..'z').join(' ') and got:
> => "a b c d e f g h i j k l m n o p q r s t u v w x y z"
>
> #inject has to be the most dangerously effective method ever. But I digress:

You can speed it up a lot if you do this:

module Enumerable
  def join(sep = '')
    to_a.join(sep)
  end
end

Benchmarking 10 calls to each version, for a dummy class where each
just iterates from 1 to 1000:

            user     system      total        real
inject  2.720000   0.030000   2.750000 (  2.759071)
to_a    0.300000   0.000000   0.300000 (  0.298650)
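
The benchmark script itself is not shown in the thread. A minimal sketch of such a comparison, assuming a dummy class whose each iterates from 1 to 1000, might look like this. Counter, join_via_inject, join_via_to_a and time_it are made-up names, and the sketch uses a monotonic clock rather than the Benchmark library, so the numbers will not match the figures above.

```ruby
# Dummy enumerable: each just iterates from 1 to 1000.
class Counter
  include Enumerable

  def each
    1.upto(1000) { |i| yield i }
  end
end

# The two strategies under comparison, as plain methods.
def join_via_inject(enum, sep = '')
  enum.inject { |a, b| "#{a}#{sep}#{b}" }
end

def join_via_to_a(enum, sep = '')
  enum.to_a.join(sep)
end

# Run the block `runs` times and return the elapsed wall-clock seconds.
def time_it(runs)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

c = Counter.new
inject_secs = time_it(10) { join_via_inject(c, ',') }
to_a_secs   = time_it(10) { join_via_to_a(c, ',') }
printf("inject %.6f\nto_a   %.6f\n", inject_secs, to_a_secs)
```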

> Why is join, and perhaps even pack in Array and not in Enumerable?

I guess to_a makes the conversion pretty easy, and Array tends to
serve as the "normalized" version of Enumerable in a lot of contexts.
I don't know if there's any other reason.


David

--
David A. Black
dbl...@wobblini.net


Jim Weirich

May 22, 2005, 12:37:59 AM
On Saturday 21 May 2005 10:05 pm, David A. Black wrote:
> Hi --
>
> On Sun, 22 May 2005, Logan Capaldo wrote:
> > Just a few minutes ago I was playing with irb as I am wont to do, and
> > typed this:
> >
> > ('a'..'z').join(' ')
> >
> > Lo and behold it protested at me with a NoMethodError. I said to my
> > self, self there is no reason that has to be Array only functionality.
> > Why isn't it in Enumerable? So I said:
> >
> > module Enumerable
> > def join(sep = '')
> > inject do |a, b|
> > "#{a}#{sep}#{b}"
> > end
> > end
> > end
> >
> > And then I said ('a'..'z').join(' ') and got:
> > => "a b c d e f g h i j k l m n o p q r s t u v w x y z"
> >
> > #inject has to be the most dangerously effective method ever. But I
> > digress:
>
> You can speed it up a lot if you do this:

[... elided version using to_a ...]

The reason the non-to_a version is slow is because it creates a series of
increasingly larger strings. A faster version (without resorting to to_a)
would build up a single string gradually. Here is another version:

def join(sep='')
  inject(nil) { |a, b|
    a ? (a << sep << b.to_s) : "#{b}"
  }
end

Here are the timings I got ...

                  user     system      total        real
to_a:         0.580000   0.000000   0.580000 (  0.583975)
inject slow: 10.520000   0.210000  10.730000 ( 11.998484)
inject fast:  0.590000   0.020000   0.610000 (  0.651972)
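
To make the "increasingly larger strings" point concrete, here is a rough sketch that counts how many characters each strategy writes. It is an estimate, not a measurement: buffer regrowth inside String#<< is ignored, and the variable names are made up for the example.

```ruby
items = (1..1000).map(&:to_s)

# Rebuilding: every step interpolates the whole accumulated string again,
# so the characters written add up to the sum of all intermediate lengths.
acc = nil
written_rebuild = 0
items.each do |s|
  acc = acc ? "#{acc},#{s}" : s.dup
  written_rebuild += acc.length
end

# Appending: one buffer grows in place, so each character is written once.
buf = +''
items.each_with_index do |s, i|
  buf << ',' unless i.zero?
  buf << s
end
written_append = buf.length

puts "rebuild: #{written_rebuild} chars, append: #{written_append} chars"
```

The rebuilt total grows roughly with the square of the element count, while the appended total grows linearly.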

--
-- Jim Weirich j...@weirichhouse.org http://onestepback.org
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)


Christoph

May 22, 2005, 4:21:45 AM
Jim Weirich schrieb:

>
> def join(sep='')
> inject(nil) { |a, b|
> a ? (a << sep << b.to_s) : "#{b}"
> }
> end
>
>

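Note that

p([].join) # => ""

so an empty collection should give an empty string rather than nil. The
version below starts from '' for that reason, converts elements with to_s
as Array#join does, and only puts the separator between elements:

def join(sep = "")
  first = true
  inject('') { |a, b|
    a << sep unless first   # no separator before the first element
    first = false
    a << b.to_s
  }
end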

/Christoph
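
One way to check a candidate against Array#join is to compare the two on a few edge cases. The sketch below assumes a standalone helper (safe_join is a hypothetical name) rather than a method on Enumerable, and it deliberately avoids nested arrays, which Array#join flattens recursively.

```ruby
# Hypothetical helper: append into one buffer, with the separator
# only between elements and "" for an empty collection.
def safe_join(enum, sep = '')
  out = +''
  first = true
  enum.each do |x|
    out << sep unless first
    first = false
    out << x.to_s
  end
  out
end

cases = [[], ['a'], ['', 'x'], [1, nil, :b], (1..3).to_a]
mismatches = cases.reject { |c| safe_join(c, ',') == c.join(',') }
```

A flag is used instead of testing whether the buffer is empty, because a leading empty-string element would otherwise swallow the first separator.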


nobu....@softhome.net

May 22, 2005, 5:37:02 AM
Hi,

At Sun, 22 May 2005 08:56:08 +0900,
Ara.T.Howard wrote in [ruby-talk:143311]:


> the only reason i can think of is that just because something is countable
> (Enumerable) doesn't mean each sub-thing is singular. take a hash for
> example. this is no stumbling block (pun intended) for ruby however:

Feels interesting.


Index: enum.c
===================================================================
RCS file: /cvs/ruby/src/ruby/enum.c,v
retrieving revision 1.54
diff -U2 -p -r1.54 enum.c
--- enum.c 30 Oct 2004 06:56:17 -0000 1.54
+++ enum.c 22 May 2005 09:36:21 -0000
@@ -967,4 +967,52 @@ enum_zip(argc, argv, obj)
}

+static VALUE
+enum_join_s(obj, arg, recur)
+ VALUE obj, *arg;
+ int recur;
+{
+ if (recur) {
+ static const char recursed[] = "[...]";
+ if (!NIL_P(arg[1]) && RSTRING(arg[0])->len != 0) {
+ rb_str_append(arg[0], arg[1]);
+ }
+ rb_str_cat(arg[0], recursed, sizeof(recursed) - 1);
+ }
+ else {
+ if (rb_block_given_p()) {
+ obj = rb_yield(obj);
+ }
+ if (TYPE(obj) != T_STRING) {
+ obj = rb_obj_as_string(obj);
+ }
+ if (!NIL_P(arg[1]) && RSTRING(arg[0])->len != 0) {
+ rb_str_append(arg[0], arg[1]);
+ }
+ rb_str_append(arg[0], obj);
+ }
+ return arg[0];
+}
+
+static VALUE
+enum_join_i(el, arg)
+ VALUE el, arg;
+{
+ return rb_exec_recursive(enum_join_s, el, arg);
+}
+
+static VALUE
+enum_join(argc, argv, obj)
+ int argc;
+ VALUE *argv;
+ VALUE obj;
+{
+ VALUE arg[2];
+
+ rb_scan_args(argc, argv, "01", &arg[1]);
+ arg[0] = rb_str_new(0, 0);
+ rb_iterate(rb_each, obj, enum_join_i, (VALUE)arg);
+ return arg[0];
+}
+
/*
* The <code>Enumerable</code> mixin provides collection classes with
@@ -998,4 +1046,5 @@ Init_Enumerable()
rb_define_method(rb_mEnumerable,"inject", enum_inject, -1);
rb_define_method(rb_mEnumerable,"partition", enum_partition, 0);
+ rb_define_method(rb_mEnumerable,"classify", enum_classify, 0);
rb_define_method(rb_mEnumerable,"all?", enum_all, 0);
rb_define_method(rb_mEnumerable,"any?", enum_any, 0);
@@ -1008,4 +1057,5 @@ Init_Enumerable()
rb_define_method(rb_mEnumerable,"each_with_index", enum_each_with_index, 0);
rb_define_method(rb_mEnumerable, "zip", enum_zip, -1);
+ rb_define_method(rb_mEnumerable, "join", enum_join, -1);

id_eqq = rb_intern("===");

--
Nobu Nakada
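
The patch guards against self-referencing collections by emitting "[...]" when recursion is detected. A rough Ruby analogue of that idea is sketched below. It is not a translation of the C code: guarded_join is a hypothetical helper that tracks the collections currently being joined in an explicit list and descends into nested arrays itself, whereas the patch relies on rb_exec_recursive and the caller's block.

```ruby
# Hypothetical helper: join that prints "[...]" instead of looping
# forever when a collection (directly or indirectly) contains itself.
def guarded_join(enum, sep = nil, in_progress = [])
  in_progress.push(enum)
  out = +''
  enum.each do |el|
    # Like the patch, only add the separator once something has been written.
    out << sep if sep && !out.empty?
    piece =
      if in_progress.any? { |o| o.equal?(el) }
        '[...]'
      elsif el.is_a?(Array)
        guarded_join(el, sep, in_progress)
      else
        el.to_s
      end
    out << piece
  end
  out
ensure
  in_progress.pop
end

a = [1, 2]
a << a                                       # a now contains itself
looped = guarded_join(a, ',')
nested = guarded_join([1, [2, [3]]], '-')
```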


Kristof Bastiaensen

May 22, 2005, 5:53:42 AM
On Sun, 22 May 2005 11:05:10 +0900, David A. Black wrote:

>> Why is join, and perhaps even pack in Array and not in Enumerable?
>
> I guess to_a makes the conversion pretty easy, and Array tends to
> serve as the "normalized" version of Enumerable in a lot of contexts.
> I don't know if there's any other reason.

I believe because join requires an ordered collection, and enumerables
aren't guaranteed to be ordered. For example the order of traversing a
Hash may differ for a different hash with the same elements. For this
reason the output of join for an enumerable is undefined.

Regards,
KB
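
A small sketch of the point about traversal order. It assumes Ruby 1.9 or later, where a Hash iterates in insertion order; on the 1.8 interpreters discussed in this thread the order was unspecified, so the exact strings could differ there.

```ruby
h1 = { 'a' => 1, 'b' => 2 }
h2 = { 'b' => 2, 'a' => 1 }

same_contents = (h1 == h2)         # equality ignores order
joined1 = h1.to_a.join(',')        # Array#join flattens the [key, value] pairs
joined2 = h2.to_a.join(',')
```

Two hashes that compare equal can therefore still produce different joined strings.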

Robert Klemme

May 22, 2005, 6:32:03 AM

"Kristof Bastiaensen" <kri...@vleeuwen.org> schrieb im Newsbeitrag
news:pan.2005.05.22....@vleeuwen.org...

> On Sun, 22 May 2005 11:05:10 +0900, David A. Black wrote:
>
>>> Why is join, and perhaps even pack in Array and not in Enumerable?
>>
>> I guess to_a makes the conversion pretty easy, and Array tends to
>> serve as the "normalized" version of Enumerable in a lot of contexts.
>> I don't know if there's any other reason.
>
> I believe because join requires an ordered collection, and enumerables
> aren't guaranteed to be ordered.

That would be my answer, too.

> For example the order of traversing a
> Hash may differ for a different hash with the same elements. For this
> reason the output of join for an enumerable is undefined.

At least it is unpredictable. Even more so: order may change completely
with each insertion:

>> h=(0..5).inject({}){|h,i| h[i.to_s]=i;h}
=> {"0"=>0, "1"=>1, "2"=>2, "3"=>3, "4"=>4, "5"=>5}
>> h.to_a
=> [["0", 0], ["1", 1], ["2", 2], ["3", 3], ["4", 4], ["5", 5]]
>> h["6"]=6
=> 6
>> h.to_a
=> [["6", 6], ["0", 0], ["1", 1], ["2", 2], ["3", 3], ["4", 4], ["5", 5]]

Kind regards

robert

David A. Black

May 22, 2005, 7:19:21 AM
Hi --

I don't think the unorderedness would matter; consider, for example,
Hash#to_s.

David A. Black

May 22, 2005, 7:26:22 AM
Hi --

On Sun, 22 May 2005, Robert Klemme wrote:

>
> "Kristof Bastiaensen" <kri...@vleeuwen.org> schrieb im Newsbeitrag
> news:pan.2005.05.22....@vleeuwen.org...
>> On Sun, 22 May 2005 11:05:10 +0900, David A. Black wrote:
>>
>>>> Why is join, and perhaps even pack in Array and not in Enumerable?
>>>
>>> I guess to_a makes the conversion pretty easy, and Array tends to
>>> serve as the "normalized" version of Enumerable in a lot of contexts.
>>> I don't know if there's any other reason.
>>
>> I believe because join requires an ordered collection, and enumerables
>> aren't guaranteed to be ordered.
>
> That would be my answer, too.

As per my previous post, I don't think that matters for join, which is
just a "dumb" string representation facility and won't care about
order.

Another related thought: Enumerables have this underlying numerical
index, as reflected in Enumerable#each_with_index. Even hashes are,
in that sense, "ordered": their elements are "indexed" from 0 up.

I have to say, though, that I think #each_with_index should be removed
from Enumerable and pushed down to the classes that mix it in
(similarly to #each_index). But I suppose as long as they are called
"enumerable" they are in some sense associated with a numerical index.

That's probably only tangentially related to #join, though. Mainly I
think that #join is just a fancy #to_s, and orderedness isn't an
issue.

Eric Mahurin

May 22, 2005, 9:59:20 AM
--- "David A. Black" <dbl...@wobblini.net> wrote:
> I have to say, though, that I think #each_with_index should be removed
> from Enumerable and pushed down to the classes that mix it in
> (similarly to #each_index). But I suppose as long as they are called
> "enumerable" they are in some sense associated with a numerical index.

Unfortunately for Hash, this can cause confusion as to what an
"index" is. If it weren't for an already existing Hash#index
(which gets a key), I would suggest it be brought over from
Array to Enumerable.

I do tend to think that many of the Array methods should be
brought over to Enumerable. You could bring over just about
any one that is non-modifying and operates sequentially forward
on the array, but you may also restrict the ones related to an
"index":

*, +, <=>, ==, assoc, compact, concat, empty?, eql?, first,
flatten, hash, join, last, length, nitems, pack, rassoc, size,
to_s, uniq

And if you don't care about the "index" confusion in hash, you
could get these:

[], at, fetch, index, slice, values_at




David A. Black

May 22, 2005, 10:18:27 AM
Hi --

On Sun, 22 May 2005, Eric Mahurin wrote:

> --- "David A. Black" <dbl...@wobblini.net> wrote:
>> I have to say, though, that I think #each_with_index should be removed
>> from Enumerable and pushed down to the classes that mix it in
>> (similarly to #each_index). But I suppose as long as they are called
>> "enumerable" they are in some sense associated with a numerical index.
>
> Unfortunately for Hash, this can cause confusion as to what an
> "index" is. If it weren't for an already existing Hash#index
> (which gets a key), I would suggest it be brought over from
> Array to Enumerable.

I see it the other way. "Index" means different things to different
enumerables. I don't like the idea of having Enumerable define index
as consecutive integers slapped onto the elements. I'd rather defer
that to the classes -- as, indeed, it is, with the strange exception
of each_with_index.

> I do tend to think that many of the Array methods should be
> brought over to Enumerable. You could bring over just about
> any one that is non-modifying and operates sequentially forward
> on the array, but you may also restrict the ones related to an
> "index":
>
> *, +, <=>, ==, assoc, compact, concat, empty?, eql?, first,
> flatten, hash, join, last, length, nitems, pack, rassoc, size,
> to_s, uniq

Some of these would fare better than others. #flatten has no general
meaning for an enumerable, since not all of them are recursive
container objects. I don't think you can #pack an arbitrary
enumerable either. #size also doesn't work for enumerables in
general, partly because some of them have no particular size and
partly because even for those that do, taking the size might cause
side-effects (e.g., an I/O-based enumerable).

In general, I think the design is a good one: Enumerable contains the
most common methods (plus each_with_index :-) and specialized behavior
is left to the classes. Arrays provide a kind of normalized
representation through which some of those behaviors can be achieved
-- i.e., with #to_a you can hook into a lot of them. Finer
granularity would certainly be possible; there's been discussion, for
example, of separating Iterable (or something like that) out of
Enumerable.

Ara.T.Howard

May 22, 2005, 10:51:20 AM
On Sun, 22 May 2005, David A. Black wrote:

> I see it the other way. "Index" means different things to different
> enumerables. I don't like the idea of having Enumerable define index as
> consecutive integers slapped onto the elements. I'd rather defer that to
> the classes -- as, indeed, it is, with the strange exception of
> each_with_index.

hmm. i don't really see any possible implication with orderedness or index
confusion within enumerable methods - i guess it's because i think of
enumerable as, and only as, countable. because of that

- no orderedness is implied : eg. in what order would you count a dozen
eggs? doesn't much matter really, so long as you start at 0 and end at
11. [ ;-) ]

- which brings me to the second point. the notion of index is, and always
is, with enumerable methods, the notion of my index in the current
count. again, with the eggs example, if i'm counting a dozen eggs then,
at any point, i can tell you which egg i'm on : the 3rd, the 8th, etc.
this is obviously my index into the count.

> In general, I think the design is a good one: Enumerable contains the most
> common methods (plus each_with_index :-) and specialized behavior is left to
> the classes. Arrays provide a kind of normalized representation through
> which some of those behaviors can be achieved -- i.e., with #to_a you can
> hook into a lot of them. Finer granularity would certainly be possible;
> there's been discussion, for example, of separating Iterable (or something
> like that) out of Enumerable.

so i guess we are on the same page here except i'd call Iterable Countable and
then the each_with_index method fits right back in ;-) i can understand
someone seeing it as not entirely orthogonal (why not simply track your own
'index'?), yet the notion seems so natural when one wants to do something like

def sample collection
  ret = []
  collection.each_with_index{|x,i| break if i >= 42; ret << x}
  ret
end
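
A runnable sketch of that idea, written out with a do/end block. The limit parameter is an addition for the example (the original hard-codes 42), and the comparison with Enumerable#first assumes a Ruby new enough to provide first(n) on Enumerable.

```ruby
# Collect at most `limit` elements from anything that has each.
def sample(collection, limit = 42)
  ret = []
  collection.each_with_index do |x, i|
    break if i >= limit   # stop iterating once enough elements were seen
    ret << x
  end
  ret
end

from_range = sample(1..100)
from_hash  = sample({ :a => 1, :b => 2 }, 1)   # hash elements arrive as [key, value]
```

On such Rubies, collection.first(limit) gives the same result without an explicit index.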

kind regards.

David A. Black

May 22, 2005, 11:11:45 AM
Hi --

On Sun, 22 May 2005, Ara.T.Howard wrote:

> On Sun, 22 May 2005, David A. Black wrote:
>
>> I see it the other way. "Index" means different things to different
>> enumerables. I don't like the idea of having Enumerable define index as
>> consecutive integers slapped onto the elements. I'd rather defer that to
>> the classes -- as, indeed, it is, with the strange exception of
>> each_with_index.
>
> hmm. i don't really see any possible implication with orderedness or index
> confusion within enumerable methods - i guess it's because i think of
> enumerable as, and only as, countable. because of that
>
> - no orderedness is implied : eg. in what order would you count a dozen
> eggs? doesn't much matter really, so long as you start at 0 and end at
> 11. [ ;-) ]

It may not imply orderedness, but it imposes an order. I think part of
the problem surrounding this has always been that hashes having
numerical indices, aside from imposing an order that means nothing,
collides head-on with the idea that a hash key is to the hash what an
array index is to the array.

In other words, one can say:

Hashes have keys, values, and a thing called the "index" which is
a 0-originating count for the key/value pairs.

or one can say:

Where arrays have numerical indices, hashes have keys that don't
have to be numbers.

but it doesn't make sense to say both.

Moreover, this also makes no sense:

h = {1=>2, 3=>4, 5=>6, 7=>8}
h.each_with_index {|pair,i|
  puts "Found #{pair.join("=>")} at index 2" if i == 2
}
puts "h.index(2) is #{h.index(2)}"

There is a direct conflict of terminology here: "index" means two
completely different things in the case of a hash, depending which
method you're calling.

You could argue (and I think people have) that the "index" in
"each_with_index" is the index of an array resulting from an implicit
#to_a operation. And, sure enough, these two things are equivalent:

h.each_with_index
h.to_a.each_with_index

But to me that's just another indication that something is askew.
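
The equivalence can be checked directly. This sketch assumes a Ruby where each_with_index without a block returns an enumerator, and it uses Hash#key, which is how newer Rubies spell the value-to-key lookup that this thread calls Hash#index.

```ruby
h = { 1 => 2, 3 => 4, 5 => 6, 7 => 8 }

via_hash  = h.each_with_index.to_a        # [[key, value], position] entries
via_array = h.to_a.each_with_index.to_a

third       = via_hash[2]                 # the pair counted at position 2
key_for_two = h.key(2)                    # the key whose value is 2
```

Position 2 in the count and the key found for the value 2 are unrelated, which is the terminology clash described above.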

> - which brings me to the second point. the notion of index is, and always
> is, with enumerable methods, the notion of my index in the current
> count. again, with the eggs example, if i'm counting a dozen eggs then,
> at any point, i can tell you which egg i'm on : the 3rd, the 8th, etc.
> this is obviously my index into the count.

I've been asking for 4.5 years for someone to show me a case where
it's useful to have Hash#each_with_index, and I've never been shown
one :-)

>> In general, I think the design is a good one: Enumerable contains the most
>> common methods (plus each_with_index :-) and specialized behavior is left
>> to the classes. Arrays provide a kind of normalized representation through
>> which some of those behaviors can be achieved -- i.e., with #to_a you can
>> hook into a lot of them. Finer granularity would certainly be possible;
>> there's been discussion, for example, of separating Iterable (or something
>> like that) out of Enumerable.
>
> so i guess we are on the same page here except i'd call Iterable Countable
> and then the each_with_index method fits right back in ;-) i can understand
> someone seeing it as not entirely orthogonal (why not simply track your own
> 'index'?), yet the notion seems so natural when one wants to do something like
>
> def sample collection
>   ret = []
>   collection.each_with_index{|x,i| break if i >= 42; ret << x}
>   ret
> end

I can't think of an unordered collection where that would be likely to
be useful. It looks like something you'd use on an array or a
filehandle, but not on a hash.

Ara.T.Howard

May 22, 2005, 11:13:01 AM

only you would crank that out in C nobu ;-)


looks good for enumerable:

harp:~/build/ruby > ./ruby -e' p( {:k => :v, :K => :V }.join(","){|kv| kv.join "=>"} ) '
"k=>v,K=>V"

but doesn't override Array's current behaviour:

harp:~/build/ruby > ./ruby -e'a3 = [ [ [4], [2] ], [ ["forty"], ["two"] ] ]; p a3.join("___"){|a2| a2.join("__"){|a1| a1.join "_"}}'
"4___2___forty___two"


i'm not sure how to do this in C:

module Enumerable
  def join(sep = '', &b)
    inject(nil){|s,x| "#{ s }#{ s && sep }#{ b ? b[ x ] : x }"}
  end
end
class Array
  def join(*a, &b); super; end
end

so Array's join is clobbered...

kind regards.

Daniel Berger

May 22, 2005, 12:05:17 PM

Because there is nothing explicitly iterative about join. Also, every
class except Array would have to have a custom definition of join,
since there's no reasonable default behavior for any class outside of
Array. And if every class would have to implement its own version of a
method, that method doesn't belong in a module. Modules are not
interfaces.

I can see that Ara has already found an excuse to give join a block -
lovely. The slide continues....

Regards,

Dan

Eric Mahurin

May 22, 2005, 5:27:10 PM

--- "David A. Black" <dbl...@wobblini.net> wrote:
> Hi --
>
> On Sun, 22 May 2005, Eric Mahurin wrote:
>
> > --- "David A. Black" <dbl...@wobblini.net> wrote:
> >> I have to say, though, that I think #each_with_index should be removed
> >> from Enumerable and pushed down to the classes that mix it in
> >> (similarly to #each_index). But I suppose as long as they are called
> >> "enumerable" they are in some sense associated with a numerical index.
> >
> > Unfortunately for Hash, this can cause confusion as to what an
> > "index" is. If it weren't for an already existing Hash#index
> > (which gets a key), I would suggest it be brought over from
> > Array to Enumerable.
>
> I see it the other way. "Index" means different things to different
> enumerables. I don't like the idea of having Enumerable define index
> as consecutive integers slapped onto the elements. I'd rather defer
> that to the classes -- as, indeed, it is, with the strange exception
> of each_with_index.

I was agreeing with you - each_with_index confuses what an
"index" is for Hash (or Hash#index does depending on how you
look at it).

> > I do tend to think that many of the Array methods should be
> > brought over to Enumerable. You could bring over just about
> > any one that is non-modifying and operates sequentially forward
> > on the array, but you may also restrict the ones related to an
> > "index":
> >
> > *, +, <=>, ==, assoc, compact, concat, empty?, eql?, first,
> > flatten, hash, join, last, length, nitems, pack, rassoc, size,
> > to_s, uniq
>
> Some of these would fare better than others.

agreed. I just listed all of them that were read-only and
operate sequentially forward across the array. I think all
of these could be easily implemented in Enumerable, but not all
necessarily make sense.

> #flatten has no general meaning for an enumerable, since not all of
> them are recursive container objects.

Array#flatten only descends into Array elements, and an
Enumerable#flatten might only descend into Enumerable (or
Array) elements.
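
As a sketch of what such an Enumerable-aware flatten might do: flatten_enum below is a hypothetical helper, not an existing method. It assumes a Ruby where String is not Enumerable (1.9 and later); on 1.8 strings would need to be excluded explicitly.

```ruby
# Hypothetical helper: descend into any Enumerable element,
# collecting everything else into a flat array.
def flatten_enum(enum, out = [])
  enum.each do |x|
    if x.is_a?(Enumerable)
      flatten_enum(x, out)
    else
      out << x
    end
  end
  out
end

mixed     = [1, (2..3), [4, [5]]]
deep      = flatten_enum(mixed)   # ranges are expanded too
array_way = mixed.flatten         # Array#flatten leaves the range alone
```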

> I don't think you can #pack an arbitrary
> enumerable either. #size also doesn't work for enumerables in
> general, partly because some of them have no particular size and
> partly because even for those that do, taking the size might cause
> side-effects (e.g., an I/O-based enumerable).

Using any enumerable method with IO has the same issue. You'll
need to seek back between calls to any of the enumerable
methods on an IO.

Of course there is an easy (but not quite as efficient) way to
do any of the array methods on an enumerable: just call to_a
first (i.e. enum.to_a.join(" ") does a join for any
enumerable).
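
A few examples of the to_a route, using only standard-library classes. The Set result assumes insertion-ordered iteration, and the StringIO case stands in for any line-oriented IO object.

```ruby
require 'set'
require 'stringio'

range_joined = ('a'..'e').to_a.join(' ')
set_joined   = Set[3, 1, 2].to_a.join(',')

# An IO-like object enumerates lines, newline included.
io   = StringIO.new("line1\nline2\nline3")
html = io.to_a.join('<br>')
```

Reading the same IO again would need a rewind first, which is the seek-back issue mentioned above.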




Ara.T.Howard

May 22, 2005, 12:45:42 PM
On Mon, 23 May 2005, Daniel Berger wrote:

> Logan Capaldo wrote:
>> Just a few minutes ago I was playing with irb as I am wont to do, and
>> typed this:
>>
>> ('a'..'z').join(' ')
>>
>> Lo and behold it protested at me with a NoMethodError. I said to my
>> self, self there is no reason that has to be Array only functionality.
>> Why isn't it in Enumerable? So I said:
>>
>> module Enumerable
>> def join(sep = '')
>> inject do |a, b|
>> "#{a}#{sep}#{b}"
>> end
>> end
>> end
>>
>> And then I said ('a'..'z').join(' ') and got:
>> => "a b c d e f g h i j k l m n o p q r s t u v w x y z"
>>
>> #inject has to be the most dangerously effective method ever. But I digress:
>>
>> Why is join, and perhaps even pack in Array and not in Enumerable?
>

> Because there is nothing explicitly iterative about join.

Enumerable#join(sep): concatenate each (Enumerable#each) thing onto a string
followed by sep, unless it is the last (implying iteration) thing.

isn't this definition reasonable and iterative?

> Also, every class except Array would have to have a custom definition of
> join, since there's no reasonable default behavior for any class outside of
> Array.

really?

set = Set::new
set.join ','

ll = LinkedList::new
ll.join '->'

dll = DoublyLinkedList::new
dll.join '<->'

v = BitVector::new
v.join '|'

path = graph.shortest_path from, to
path.join '=>'

string = String::new
string.join "<br>"

stack = Stack::new
stack.join '-'

rope = Rope::new
rope.join '_'

priority_queue = PriorityQueue::new
priority_queue.join(','){|priority_and_obj| priority_and_obj.join ':'}

come to mind ;-)
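
As one concrete case, here is a sketch of a linked list that only defines each and picks up join from a shared module. Both Joinable and this LinkedList are made up for the example; neither is a real library class.

```ruby
# Hypothetical mixin: one join definition written purely in terms of each.
module Joinable
  def join(sep = '')
    out = +''
    each_with_index do |x, i|
      out << sep unless i.zero?
      out << x.to_s
    end
    out
  end
end

# Minimal singly linked list: it supplies each, nothing join-specific.
class LinkedList
  include Enumerable
  include Joinable

  Node = Struct.new(:value, :next_node)

  def initialize(*values)
    @head = nil
    values.reverse_each { |v| @head = Node.new(v, @head) }
  end

  def each
    node = @head
    while node
      yield node.value
      node = node.next_node
    end
  end
end

ll = LinkedList.new(1, 2, 3)
arrows = ll.join('->')
```

The list class never mentions join, which is the sense in which a default built on each can be shared.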


> And if every class would have to implement its own version of a method, that
> method doesn't belong in a module. Modules are not interfaces.

why would every class have to implement its own? with the definition we've
been throwing around we already have things like

harp:~/build/ruby > ./ruby -e'html = "line1\nline2\nline3".join "<br>"; p html'
"line1\n<br>line2\n<br>line3"

which is kinda handy and makes good sense no?

> I can see that Ara has already found an excuse to give join a block -
> lovely. The slide continues....

weee. ;-)

cheers.

David A. Black

May 22, 2005, 9:37:36 PM
Hi --

On Mon, 23 May 2005, Eric Mahurin wrote:

>> #flatten has no general
>> meaning for an enumerable, since not all of them are
>> recursive container objects.
>
> Array#flatten only descends into Array elements, and an
> Enumerable#flatten might only descend into Enumerable (or
> Array) elements.

It just seems like a bad fit for, say, iterating through lines of a
file. I don't think being enumerable implies being flattenable,
because it doesn't imply being nested (whereas being an array implies
that you might be nested). Therefore I wouldn't put flatten in
Enumerable.

>> I don't think you can #pack an arbitrary
>> enumerable either. #size also doesn't work for enumerables in
>> general, partly because some of them have no particular size and
>> partly because even for those that do, taking the size might cause
>> side-effects (e.g., an I/O-based enumerable).
>
> Using any enumerable method with IO has the same issue. You'll
> need to seek back between calls to any of the enumerable
> methods on an IO.

Size has the other problem too: being enumerable does not mean being
measurable. For example:

class C
  include Enumerable
  def each
    loop { yield rand(100) }
  end
end

It's meaningless to talk about the size of a C object -- but it's a
perfectly legitimate enumerable.
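
A deterministic variant of the same idea shows which operations can still finish on an endless enumerable. Naturals is a made-up class that counts upward instead of yielding random numbers, and the lazy call assumes Ruby 2.0 or later.

```ruby
# Endless enumerable: each never returns on its own.
class Naturals
  include Enumerable

  def each
    n = 0
    loop do
      yield n
      n += 1
    end
  end
end

nat = Naturals.new
first_five = nat.first(5)                          # stops after five elements
found      = nat.find { |x| x > 3 }                # stops at the first match
squares    = nat.lazy.map { |x| x * x }.first(3)   # lazy chain, then stop
```

Anything that has to see every element, such as to_a or a size computed by counting, would never return here.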

> Of course there is an easy (but not quite as efficient) way to
> do any of the array methods on an enumerable: just call to_a
> first (i.e. enum.to_a.join(" ") does a join for any
> enumerable).

I'm not sure that it's less efficient than some of these. For
example, see the exchange earlier in the thread about to_a vs. inject
in join; the efficient inject seemed to benchmark around the same as
to_a (though presumably this would vary from one case to another). Of
course, I'm not sure one of my C objects, above, would respond too
nicely to to_a.... :-)

Daniel Berger

May 22, 2005, 10:25:02 PM

Fine, replace "Array" with "most lists" and my point still stands.

> > And if every class would have to implement its own version of a method,
> > that method doesn't belong in a module. Modules are not interfaces.
>
> why would every class have to implement its own? with the definition we've
> been throwing around we already have things like
>
> harp:~/build/ruby > ./ruby -e'html = "line1\nline2\nline3".join "<br>"; p html'
> "line1\n<br>line2\n<br>line3"
>
> which is kinda handy and makes good sense no?

No, it doesn't make sense.

Regards,

Dan

Eric Mahurin

May 22, 2005, 10:56:09 PM
--- "David A. Black" <dbl...@wobblini.net> wrote:
> Size has the other problem too: being enumerable does not mean being
> measurable. For example:
>
> class C
>   include Enumerable
>   def each
>     loop { yield rand(100) }
>   end
> end
>
> It's meaningless to talk about the size of a C object -- but it's a
> perfectly legitimate enumerable.

I don't think an infinite collection is a legitimate
enumerable. Since each is an infinite loop, none of the
Enumerable methods will necessarily escape that loop.
Enumerable#find (and friends) may return if it finds a match
since it probably breaks from the loop, but I don't think you
should depend on the implementation breaking from the loop (it
could simply record the first match).

I still think that any array method that is read-only and
operates in a single forward pass over an array would be good
to consider to go into Enumerable (Enumerable#sort* doesn't
meet those qualifications and I don't think it should have been
in there in the first place). But, even without doing this
enum.to_a.<array_method> should work just fine.
