[BUG] string range membership

Warren Brown

Nov 23, 2005, 1:41:32 AM
All,

I ran across some code that was trying to validate that an integer
was in a given range, however the integer and the range were Strings.
The problem boils down to this:

>ruby -e "p ('1'..'10').member?('2')"
false

Given that...

>ruby -v -e "('1'..'10').each {|s| p s}"
ruby 1.8.2 (2004-12-25) [i386-mswin32]
"1"
"2"
"3"
"4"
"5"
"6"
"7"
"8"
"9"
"10"

...it seems like ('1'..'10').member?('2') should return true. The
problem lies in range.c, in the range_each_func() method. This method
starts with the first value, then calls succ() to get the next value,
breaking out of the loop when the value is no longer less than or equal
to the ending value (or strictly less than the ending value on an
exclusive range). Unfortunately, for the given string range this
happens immediately, since '2' > '10'.
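
The mismatch can be reproduced without relying on how any particular Ruby version implements Range#member?. The following is only a sketch: `interval_member?` is a hypothetical helper (not part of Ruby) that performs the bounds comparison by hand, and the set check simply expands the range with to_a.

```ruby
# Hypothetical helper: a plain bounds comparison, written out by hand
# so that it does not depend on Range#member? itself.
def interval_member?(first, last, value)
  first <= value && value <= last
end

# String comparison works character by character, so '2' sorts after '10'.
puts '2' > '10'                          # true

bounds_answer = interval_member?('1', '10', '2')
set_answer    = ('1'..'10').to_a.include?('2')

puts bounds_answer                       # false
puts set_answer                          # true
```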

I suppose that it could be argued that this is not a bug, but that
would be a difficult argument to win. Also, I need to make sure that
this is still a bug in the latest version of Ruby. Unfortunately, I'm
too sleepy to investigate further or create a patch for this tonight,
but I'll try to work on it some more tomorrow night (assuming nobody
else fixes it first).

- Warren Brown

Brian Schröder

Nov 23, 2005, 2:53:37 AM

I'd argue that it is not a bug, as there is no unique isomorphism from
strings to integers. Some well-known functions would be hex, octal and
decimal encoding. I.e. '2' ... '10' could be understood as 2 ... 8, or
2 ... 16 or 2 ... 10 or error ... 2 depending on the base.

Only you can know what the string means, so convert it to an integer
and do the range test on integers.
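
As a rough illustration of the base ambiguity and of the suggested fix, the sketch below parses the strings explicitly before testing. `decimal_in_range?` is a made-up helper name; String#to_i and Kernel#Integer with a base argument are standard Ruby. Note that Integer('2', 2) raises ArgumentError, which corresponds to the "error" case above.

```ruby
# The same digits denote different integers depending on the base.
puts '10'.to_i(2)     # 2
puts '10'.to_i(8)     # 8
puts '10'.to_i(10)    # 10
puts '10'.to_i(16)    # 16

# Hypothetical helper: decide on decimal explicitly, then do the
# range test on Integers instead of Strings.
def decimal_in_range?(text, range)
  range.include?(Integer(text, 10))
end

puts decimal_in_range?('2', 1..10)     # true
puts decimal_in_range?('11', 1..10)    # false
```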

So I think you should save your time on creating the patch and
preferably fix the application code.

Brian

--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/


Joel VanderWerf

Nov 23, 2005, 3:18:10 AM
Brian Schröder wrote:
> I'd argue that it is not a bug, as there is no unique isomorphism from
> strings to integers. Some well-known functions would be hex, octal and
> decimal encoding. I.e. '2' ... '10' could be understood as 2 ... 8, or
> 2 ... 16 or 2 ... 10 or error ... 2 depending on the base.
>
> Only you can know what the string means, so convert it to an integer
> and do the range test on integers.

I was going to make that argument, but I realized that #each (by way of
#succ) *does* have some extra knowledge (or assumptions) about strings,
and it's pretty smart:

irb(main):003:0> ('1.1'..'10.1').each {|s| p s}
"1.1"
"1.2"
"1.3"
"1.4"
..
..
"9.6"
"9.7"
"9.8"
"9.9"
"10.0"
"10.1"
=> "1.1".."10.1"
irb(main):004:0> ('1.6.4'..'1.8.3').each {|s| p s}
"1.6.4"
"1.6.5"
"1.6.6"
"1.6.7"
"1.6.8"
"1.6.9"
"1.7.0"
"1.7.1"
"1.7.2"
"1.7.3"
"1.7.4"
"1.7.5"
"1.7.6"
"1.7.7"
"1.7.8"
"1.7.9"
"1.8.0"
"1.8.1"
"1.8.2"
"1.8.3"
=> "1.6.4".."1.8.3"

I guess it is just impractical for Range#member? to test using #succ.

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407


Yukihiro Matsumoto

Nov 23, 2005, 3:19:55 AM
Hi,

In message "Re: [BUG] string range membership"


on Wed, 23 Nov 2005 15:41:32 +0900, "Warren Brown" <warre...@aquire.com> writes:
| I ran across some code that was trying to validate that an integer
|was in a given range, however the integer and the range were Strings.

include? and member? compare using beg <= val <= end, which is
dictionary order for strings. Unfortunately, the strings generated by
succ are not in dictionary order. I'm not sure how to solve
this.
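
A small sketch of that point, assuming nothing beyond standard String and Array methods: sorting the generated strings with String#<=> gives a different order from the one in which succ produced them. `dictionary_ordered?` is a hypothetical helper name.

```ruby
# Hypothetical helper: true when a list is already in String#<=> order.
def dictionary_ordered?(strings)
  strings == strings.sort
end

generated = ('1'..'10').to_a      # order produced by repeated succ
p generated.sort.first(3)         # ["1", "10", "2"]

puts dictionary_ordered?(generated)          # false
puts dictionary_ordered?(('a'..'z').to_a)    # true
```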

matz.

Paul

Nov 23, 2005, 3:23:58 AM
Just playing around with it, but...


irb(main):057:0> ('1'..'10').each { |i| puts i }
1
2
3
4
5
6
7
8
9
10
=> "1".."10"
irb(main):058:0> ('1'..'10').each { |i| case when i == '2' ; puts "two"
else puts i end}
1
two
3
4
5
6
7
8
9
10
=> "1".."10"
irb(main):059:0>

Trans

Nov 23, 2005, 8:00:45 AM
> I'm not sure how to solve this.

This was discussed some time ago. The solution (mostly arrived at by
Peter Vanbroekhoven) is to use a different comparison method. In
Facets you'll find the #cmp method, which is part of the base methods,
and which is used by the Interval class --a true Interval as opposed to
what Range is.

def cmp(other)
  return -1 if length < other.length
  return 1 if length > other.length
  self <=> other
end

Of course this won't be of use to tuple forms like "1.18.12", but in
such cases a Tuple object is in order anyway.
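
To see what the length-first comparison buys, here is a sketch that uses it as a standalone function rather than as a String method. The names `succ_cmp` and `succ_between?` are invented for this example and are not the Facets API. Note that it still accepts '01', which succ never generates from '1'.

```ruby
# Hypothetical standalone version of the length-first comparison:
# shorter strings come first, equal lengths fall back to String#<=>.
def succ_cmp(a, b)
  by_length = a.length <=> b.length
  by_length.zero? ? (a <=> b) : by_length
end

# Interval test built on succ_cmp instead of String#<=>.
def succ_between?(first, last, value)
  succ_cmp(first, value) <= 0 && succ_cmp(value, last) <= 0
end

puts succ_between?('1', '10', '2')     # true
puts succ_between?('a', 'aa', 'z')     # true
puts succ_between?('1', '10', '11')    # false
puts succ_between?('1', '10', '01')    # true, although succ never yields '01'
```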

T.

Warren Brown

Nov 23, 2005, 9:57:41 AM
Brian,

> I'd argue that it is not a bug, as there is no unique
> isomorphism from strings to integers.

I have to disagree.

>ruby -v -e "p(('1'..'10').to_a)"
ruby 1.8.2 (2004-12-25) [i386-mswin32]
["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]

This shows a clear and unique mapping of the range '1'..'10' into a
set of strings.

- Warren Brown


Warren Brown

Nov 23, 2005, 10:01:48 AM
T,

> This was discussed some time ago. The solution (mostly
> arrived at by Peter Vanbroekhoven) is to use a
> different comparison method.

I can't find this discussion in the archive. Can you give me a link
or a message number?

- Warren Brown

Ara.T.Howard

Nov 23, 2005, 10:05:32 AM

but where do '01', '001', and '0001' go? they too, are in the set of strings.

regards.

-a
--
===============================================================================
| ara [dot] t [dot] howard [at] noaa [dot] gov
| all happiness comes from the desire for others to be happy. all misery
| comes from the desire for oneself to be happy.
| -- bodhicaryavatara
===============================================================================

Yukihiro Matsumoto

Nov 23, 2005, 10:14:56 AM
Hi,

In message "Re: [BUG] string range membership"

on Wed, 23 Nov 2005 23:57:41 +0900, "Warren Brown" <warre...@aquire.com> writes:

|> I'd argue that it is not a bug, as there is no unique
|> isomorphism from strings to integers.
|
| I have to disagree.

For your information, member? used to iterate over items to check
membership. But because of confusion between include? and member?, they
were merged. The point is that Ranges are used both for ranges and
intervals. Sometimes users want it to behave like a range surrounded
by begin/end values. Sometimes they want it to behave like a set of
values that #each produces.

I'd like to take care of this issue, but I don't yet know the right way
to solve it. Perhaps we should provide both membership methods, with the
right names for each. Any ideas?

matz.


Warren Brown

Nov 23, 2005, 10:20:45 AM
Ara,

>> ruby -v -e "p(('1'..'10').to_a)"
>> ruby 1.8.2 (2004-12-25) [i386-mswin32]
>> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
>>
>> This shows a clear and unique mapping of the range
>> '1'..'10' into a set of strings.
>
> but where do '01', '001', and '0001' go? they too,
> are in the set of strings.

You completely lost me there. '01' doesn't *go* anywhere. That
string is not in the range '1'..'10', in the same way that 'x' is not in
the range 'a'..'n'.

Don't let the fact that my example used strings that look like
numbers confuse the issue. The issue is that a range of strings that
can be converted into a finite set, has a method to test for membership
in that range, that doesn't match values that are in the set. Wow, that
sentence is even hard for *me* to follow.

OK, let's take a different example to avoid all discussion of
integers and various string representations of them.

>ruby -v -e "p(('a'..'aa').to_a)"
ruby 1.8.2 (2004-12-25) [i386-mswin32]
["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "aa"]

Here we have a string range that has 27 "members". Now:

>ruby -e "p(('a'..'aa').member?('a'))"
true
>ruby -e "p(('a'..'aa').member?('b'))"
false
..
>ruby -e "p(('a'..'aa').member?('z'))"
false
>ruby -e "p(('a'..'aa').member?('aa'))"
true

Can this really be called correct behavior of the member?() method?
I can't see any tenable argument to say that it is.

- Warren Brown

Brian Schröder

Nov 23, 2005, 10:26:50 AM
On 23/11/05, Warren Brown <warre...@aquire.com> wrote:

It is not unique as

ruby -e "p %w(1 10 11 100 101).map { | b | b.to_i(2) }"
[1, 2, 3, 4, 5]

is another mapping of strings to integers that is equally valid (and
would not contain '5').

Warren Brown

Nov 23, 2005, 11:03:19 AM
Matz,

> For your information, member? used to iterate over
> items to check membership. But because of confusion
> between include? and member?, they were merged. The
> point is that Ranges are used both for ranges and
> intervals. Sometimes users want it to behave like a
> range surrounded by begin/end values. Sometimes they
> want it to behave like a set of values that #each
> produces.
>
> I'd like to take care of this issue, but I don't yet
> know the right way to solve it. Perhaps we should
> provide both membership methods, with the right names
> for each. Any ideas?

Ah, I see. So really, the root problem here is the assumption by
Range that (value < value.succ). And in String, this assumption does
not always hold true:

irb(main):001:0> s = 'z'
=> "z"
irb(main):002:0> s < s.succ
=> false

Because of that, there is a huge distinction between
str_range.to_a.member?(x) (is x a member of the set of the range's
values) and (str_range.first <= x <= str_range.last) (is x in the
range's interval).

So, given that (at least in the case of ranges of strings) there is a
clear distinction between a value being included in the interval and a
value being included in the set, it appears that we have a real need for
two different methods. The methods Range#include? (in interval) and
Range#member? (of set) seem to be perfect candidates for these two
different functionalities. Before these two methods were merged, did
they take on these two functionalities, or were they different in some
other way?
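
The broken assumption is easy to probe directly. In this sketch, `succ_increases?` is a hypothetical helper that checks whether a single succ step moves forward in String#<=> order.

```ruby
# Hypothetical helper: does one succ step move forward under String#<=>?
def succ_increases?(str)
  str < str.succ
end

puts succ_increases?('a')     # true:  'a'.succ is 'b'
puts succ_increases?('z')     # false: 'z'.succ is 'aa'
puts succ_increases?('9')     # false: '9'.succ is '10'
puts succ_increases?('az')    # true:  'az'.succ is 'ba'
```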

Are there other cases where "membership" changes depending on
whether the range is viewed as a set or an interval? If not, perhaps it
would be better to address the fact that str.succ violates the (str <
str.succ) assumption. Perhaps the functionality currently in
String#succ could be moved to another method (String#increment
perhaps?), and String#succ could take on a new functionality that does
not violate (str < str.succ).

Anyway, please let me know if there is anything I can do to help
settle this issue.

- Warren Brown

Ara.T.Howard

Nov 23, 2005, 11:40:21 AM
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

Trans

Nov 23, 2005, 2:11:45 PM
> I can't find this discussion in the archive. Can you give me a link
> or a message number?

Largely from Ruby-talk 115120, although the solution really came about
on the old suby-muse mailing list.

I'm not sure everyone understood me though. The problem is that
String's #<=> and #succ methods are not compatible. Therefore
Range#member? and Range#include?, which use #<=>, cannot provide proper
results for String-based Ranges. The solution is to have Range use a
different comparison method, namely #cmp. In most classes #cmp will of
course just be an alias for #<=>, but it String is would differ to be
compatible with #succ, and then #include? and #member? would be correct.
Q.E.D.

T.

Trans

Nov 23, 2005, 3:07:09 PM
s/it String is/in String it/

Daniel Sheppard

Nov 23, 2005, 5:50:52 PM
ruby -e "p ('1'..'10').find {|x| x == '2'}"
"2"

It won't be as fast as Range#include?, but you can't put the equivalent
into Range without giving it knowledge about how String#succ works.


Yukihiro Matsumoto

Nov 23, 2005, 7:21:28 PM
Hi,

In message "Re: [BUG] string range membership"

on Thu, 24 Nov 2005 01:03:19 +0900, "Warren Brown" <warre...@aquire.com> writes:

|So, given that (at least in the case of ranges of strings) there is a
|clear distinction between a value being included in the interval and a
|value being included in the set, it appears that we have a real need for
|two different methods. The methods Range#include? (in interval) and
|Range#member? (of set) seem to be perfect candidates for these two
|different functionalities. Before these two methods were merged, did
|they take on these two functionalities, or were they different in some
|other way?

#include? was used for the range check; #member? was for set membership.
But since they have the same functionality in Enumerable, some claimed
that having different behaviors in Range was confusing. I agreed.

| Anyway, please let me know if there is anything I can do to help
|settle this issue.

All we need is making up good names for each functionality.

matz.


Ara.T.Howard

Nov 23, 2005, 7:38:11 PM

Range#contains?

??

Yukihiro Matsumoto

Nov 23, 2005, 8:18:31 PM
Hi,

In message "Re: [BUG] string range membership"

on Thu, 24 Nov 2005 09:38:11 +0900, "Ara.T.Howard" <ara.t....@noaa.gov> writes:

|> All we need is making up good names for each functionality.
|
| Range#contains?
|
|??

For which functionality?

matz.


Trans

Nov 23, 2005, 9:02:27 PM
> All we need is making up good names for each functionality.

That is NOT all you need! This does not solve the complete problem, but
only provides a little-bitty patch for query on a Range member, and a
very inefficient one at that --which I thought was part of the reason
you changed #include and #member to be the same in the first place.

The overarching issue is that sortable and comparable are using the
same method #<=>, but they do not necessarily want the same meaning.
You should provide a separate method for comparable --like I said, in
most cases they will be equivalent, but not so in String. And
dictionary order comparison is needed anyway. I studied this issue
exhaustively over a year ago when I wrote a true Interval class.

T.

Ara.T.Howard

Nov 23, 2005, 9:04:59 PM
On Thu, 24 Nov 2005, Yukihiro Matsumoto wrote:

well, i would think of #member? as most natural for set membership - so
#contains? would/should be most like #include? - in my mind.

harp:~ > cat a.rb
module Enumerable
  def contains? value
    map.include? value
  end
end

r = "a" .. "aa"
p r.contains?("z")

harp:~ > ruby a.rb
true

so, if each would 'hit' it - it's contained.

kind regards.

Yukihiro Matsumoto

Nov 23, 2005, 10:40:57 PM
Hi,

In message "Re: string range membership"


on Thu, 24 Nov 2005 11:07:26 +0900, "Trans" <tran...@gmail.com> writes:

|> All we need is making up good names for each functionality.
|
|That is NOT all you need! This does not solve the complete problem, but
|only provides a little-bitty patch for query on a Range member, and a
|very inefficient one at that --which I thought was part of the reason
|you changed #include and #member to be the same in the first place.

Depends on how you define problem.

|The overarching issue is that sortable and comparable are using the
|same method #<=>, but they do not necessarily want the same meaning.
|You should provide a separate method for comparable --like I said, in
|most cases they will be equivalent, but not so in String. And
|dictionary order comparison is needed anyway. I studied this issue
|exhaustively over a year ago when I wrote a true Interval class.

I'm not sure what you meant here. Range has no relation with
sorting. Can you elaborate?

matz.


Trans

Nov 24, 2005, 1:15:16 AM
> I'm not sure what you meant here. Range has no relation with
> sorting. Can you elaborate?

#succ defines a sort order of sorts (pun intended ;-). But #<=> defines
a sort order too along with comparability. In most classes there's no
problem, but in String the two come into conflict --the orders are not
the same.

Then consider that Range is not a true interval because it uses #succ.
This is why I created a true Interval class that uses #+ instead.
Likewise Range shouldn't use #<=> either, but another method, let's call
it #cmp. This would fix the problem.

In general:

module Comparable
  def cmp(o)
    self <=> o
  end
end

That is to say, for anything comparable #cmp is the same as #<=>,
unless otherwise defined. (Alternately you could define #cmp as an
alias of #<=> directly in the classes it is needed --that would
probably be better.) Then in String define #cmp specially to conform to
the successive order as defined by #succ.

Thus having Range use #cmp instead of #<=> the issue is solved.

In summary, an object would then be "Rangeable" if it supports #succ,
but only fully so if it also supports #cmp (instead of #<=>).

Does it make sense now? (Sorry if I'm not explaining well, it's a tad
subtle and it's been awhile since I worked on it too, so I have been
trying to recall it all myself too).

T.

daz

Nov 24, 2005, 2:26:49 PM

Ara.T.Howard wrote:
> On Thu, 24 Nov 2005, Yukihiro Matsumoto wrote:
>
> >
> > All we need is making up good names for each functionality.
>
> Range#contains?
>
> ??
>


I'm sure there was an earlier post with an excellent
synopsis on ranges which stated that a Range /doesn't/
"contain"? Yeh, here it is ... from you ;))

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/167200

"not quite - we have a string range that __produces__
27 elements. it does not 'have' or 'contain' them.
it merely suggests this set as it's current thought"

(SCNR ;)


Would something like Range#covers? be more apt?
(meaning within the bounds).
Oh, pooh, that's got an 's' on the end as well :(
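
For readers on later Ruby versions: Ruby 1.9 and up ship a Range#cover? that performs this kind of bounds-only check. A minimal illustration, contrasted with the expanded set (how Range#include? itself treats String ranges has varied between versions, so it is deliberately left out here):

```ruby
r = 'a'..'aa'

# Bounds check only: is the value between the endpoints under <=>?
p r.cover?('a')           # true
p r.cover?('b')           # false, because 'b' > 'aa' in dictionary order

# Set membership: is the value among the strings that each produces?
p r.to_a.include?('b')    # true
```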


daz

Ara.T.Howard

Nov 24, 2005, 3:14:48 PM

lol. i realized that actually - i thought that the confusion with "include?"
being used to test "containment" might be resolved by having a method actually
named "contains?". ;-)

too confusing?

Jim Weirich

Nov 25, 2005, 8:21:48 AM
ara.t.howard wrote:
> lol. i realized that actually - i thought that the confusion with
> "include?"
> being used to test "containment" might be resolved by having a method
> actually
> named "contains?". ;-)

How about "within?" for a value within a given range?

-- Jim Weirich


--
Posted via http://www.ruby-forum.com/.


daz

Nov 25, 2005, 9:46:41 AM

Jim Weirich wrote

>
> How about "within?" for a value within a given range?
>

(0..5).within?(3)

reads backwards, IMO compared to:

(0..5).include?(3)


daz

Florian Frank

Nov 25, 2005, 5:47:30 PM
daz wrote:

Yes, I agree. I think, just finding a new word for a method that does
something, that used to have different names in the past, won't help.
This has been tried, but it didn't work too well.

Ruby's ranges have (at least) a dual nature:
1. as an interval (a, b) of values,
2. as a shortcut for a set of values { a, a.succ, a.succ.succ, ..., b }.

I think "include?" is a good name for the 1. And 2 is very similar to 1,
so people will easily confuse those names.

What about using a bit of double dispatch here, like that:

class Object
  def element?(r)
    r.find { |x| x == self } ? true : false
  end
end

"bb".element?("a".."zz") # => true

This doesn't read backwards, and the name conveys the meaning of set
membership, as required by 2.

Perhaps using another method than "find" for searching (that only
defaults to "find") would make it possible, to provide an alternative
implementation for datastructures, that can compute membership faster
than O(n).
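
One way this could look, as a sketch only: a free function that uses a collection's own lookup when it advertises one and otherwise scans linearly. The `fast_member?` hook, the `element_of?` function and the `HashedBag` class are all invented for this example; none of them exist in Ruby.

```ruby
# Hypothetical dispatch: prefer a collection's own fast lookup
# (a made-up #fast_member? hook), else fall back to a linear scan.
def element_of?(value, collection)
  if collection.respond_to?(:fast_member?)
    collection.fast_member?(value)
  else
    collection.any? { |x| x == value }
  end
end

# Example collection with hash-based lookup instead of an O(n) scan.
class HashedBag
  include Enumerable

  def initialize(items)
    @index = {}
    items.each { |item| @index[item] = true }
  end

  def each(&block)
    @index.each_key(&block)
  end

  def fast_member?(value)
    @index.key?(value)
  end
end

puts element_of?('bb', 'a'..'zz')                   # true, linear scan
puts element_of?('bb', HashedBag.new(%w[aa bb]))    # true, hash lookup
puts element_of?('cc', HashedBag.new(%w[aa bb]))    # false
```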

--
Florian Frank

Logan Capaldo

Nov 26, 2005, 4:49:50 PM
I was going to suggest r.has_element?(x) for the equivalent of #member.
Maybe r.surrounds?(x) for #include. That one is not as good.

Trans

Nov 26, 2005, 5:20:33 PM
Florian, your double dispatch is interesting. I still have no
idea if anyone has understood the #cmp solution I've proposed, since no
one has commented on it. A fully general solution of #cmp looks
something like this:

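def cmp( other )
  return 0 if self == other
  after_self, after_other = self, other
  loop do
    after_self, after_other = after_self.succ, after_other.succ
    return -1 if after_self == other   # other follows self
    return 1 if after_other == self    # self follows other
  end
end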

Of course no one would ever use this because 'other' may not be an
actual member and thus never hit on ==. So the only way to ensure
member comparison in a fully general way is to have the Range on hand
--hence your double dispatch. A generalized solution would then be:

def cmp( other, range )
  return 0 if self == other
  arr = range.to_a
  arr.index( self ) <=> arr.index( other )
end

But this is silly since Range can do this itself, no need to double
dispatch --if #cmp is not defined on the object, Range can always
expand into an array and compare indexes itself. But there's still the
risk of infinite expansions.

Likewise I think the double dispatching within an #element? method is
in the same league. If a #cmp can't be defined and used to determine
membership neither will an #element? method be able to, so Range then
must resort to 'to_a.include?'

Range is better off depending on a comparison method just for it to
ensure compatibility with #succ --which also ensures determination of
membership with the methods we already have, #member? and #include?
--Then they would do exactly what the documentation says they're
supposed to do, which they actually DO NOT do at the moment.

T.

Warren Brown

Nov 28, 2005, 10:16:49 AM
Matz,

> #include? was used for the range check; #member? was for
> set membership. But since they have the same functionality
> in Enumerable, some claimed that having different
> behaviors in Range was confusing. I agreed.
>

> All we need is making up good names for each
> functionality.

OK, I think I see why they were changed to be the same, but I really
don't understand the choice of functionality that was kept. For
everything except Ranges, #include? and #member? checks for set
membership. In Ranges, #include? and #member? don't check for set
membership, they check for interval coverage instead. This seems worse
than the original situation where at least #member? meant the same thing
everywhere.

One other side note on the current names: "include" and "member" are
really opposite ideas. A range includes a value, but a value is a
member of a range. Having them mean exactly the same thing might also
be confusing.

Anyway, could Range#include? and Range#member? be changed back to a
membership check and a new method be added to Range for interval
coverage, or would that break too much backwards compatibility? Several
names come to mind for the new method: #between? (my personal favorite),
#betwixt? (kind of silly, but could be fun), #cover?, #surround?,
#bound?, #inside?, #within?, #in_range?, #in_interval?, #in?

If the current behavior of the Range methods can't be changed, names
for membership checks (not including #member? - yuck!) could be:
#among?, #amid?, #amidst?, #component?, #constituent?, #part?, #has?,
#in?

What do you think?

- Warren Brown

Yukihiro Matsumoto

Nov 28, 2005, 10:39:24 AM
Hi,

In message "Re: [BUG] string range membership"

on Tue, 29 Nov 2005 00:16:49 +0900, "Warren Brown" <warre...@aquire.com> writes:

| OK, I think I see why they were changed to be the same, but I really
|don't understand the choice of functionality that was kept. For
|everything except Ranges, #include? and #member? checks for set
|membership. In Ranges, #include? and #member? don't check for set
|membership, they check for interval coverage instead. This seems worse
|than the original situation where at least #member? meant the same thing
|everywhere.

I don't remember exactly but it's for the sake of performance. I've
been thinking about this issue for the last few days, and it could be made
better by treating numbers specially, just like we did for min and max
in Range.

| Anyway, could Range#include? and Range#member? be changed back to a
|membership check and a new method be added to Range for interval
|coverage, or would that break too much backwards compatibility? Several
|names come to mind for the new method: #between? (my personal favorite),
|#betwixt? (kind of silly, but could be fun), #cover?, #surround?,
|#bound?, #inside?, #within?, #in_range?, #in_interval?, #in?
|
| If the current behavior of the Range methods can't be changed, names
|for membership checks (not including #member? - yuck!) could be:
|#among?, #amid?, #amidst?, #component?, #constituent?, #part?, #has?,
|#in?
|
| What do you think?

Thank you for the candidates. I'd like to hear opinion from others
(especially from English speakers).

matz.

men...@rydia.net

Nov 28, 2005, 10:53:12 AM
Quoting Yukihiro Matsumoto <ma...@ruby-lang.org>:

> In message "Re: [BUG] string range membership"
> on Tue, 29 Nov 2005 00:16:49 +0900, "Warren Brown"
> <warre...@aquire.com> writes:
>
> |Several names come to mind for the new method: #between?
> |(my personal favorite), #betwixt? (kind of silly, but could
> |be fun), #cover?, #surround?, #bound?, #inside?, #within?,
> |#in_range?, #in_interval?, #in?
>

> Thank you for the candidates. I'd like to hear opinion from
> others (especially from English speakers).

#within? seems best to me.

-mental


Joel VanderWerf

Nov 28, 2005, 11:05:36 AM

I like #bound?, as in

(lower..upper).bound? x

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407


David A. Black

Nov 28, 2005, 11:07:05 AM
Hi --

On Tue, 29 Nov 2005, Yukihiro Matsumoto wrote:

> Hi,
>
> In message "Re: [BUG] string range membership"
> on Tue, 29 Nov 2005 00:16:49 +0900, "Warren Brown" <warre...@aquire.com> writes:
>
> | OK, I think I see why they were changed to be the same, but I really
> |don't understand the choice of functionality that was kept. For
> |everything except Ranges, #include? and #member? checks for set
> |membership. In Ranges, #include? and #member? don't check for set
> |membership, they check for interval coverage instead. This seems worse
> |than the original situation where at least #member? meant the same thing
> |everywhere.
>
> I don't remember exactly but it's for the sake of performance. I've
> been thinking about this issue for the last few days, and it could be made
> better by treating numbers specially, just like we did for min and max
> in Range.

I think that as long as ranges have all of this array/set behavior --
as long as range and range.to_a share so much functionality -- ranges
will always feel like two different objects. The whole idea of
"membership" in a range has always seemed a little strange to me. I
guess I think of ranges as very different from arrays and sets.

> | Anyway, could Range#include? and Range#member? be changed back to a
> |membership check and a new method be added to Range for interval
> |coverage, or would that break too much backwards compatibility? Several
> |names come to mind for the new method: #between? (my personal favorite),
> |#betwixt? (kind of silly, but could be fun), #cover?, #surround?,
> |#bound?, #inside?, #within?, #in_range?, #in_interval?, #in?
> |
> | If the current behavior of the Range methods can't be changed, names
> |for membership checks (not including #member? - yuck!) could be:
> |#among?, #amid?, #amidst?, #component?, #constituent?, #part?, #has?,
> |#in?
> |
> | What do you think?
>
> Thank you for the candidates. I'd like to hear opinion from others
> (especially from English speakers).

(0..5).to_a.include?(n) :-)

But seriously.... If it's a method of Range, then it has to be from
the range perspective, not the perspective of the argument.
#encompass? comes to mind. There was an interesting discussion on IRC
about how to check for complete inclusion of one range in another.
#encompass? could, ummm, encompass that:

(0..5).encompass?(4) # true
(0..5).encompass?(5.1) # false
(0..5).encompass?(1..2) # true
(1..2).encompass?(0..5) # false

etc.
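
A rough sketch of how such a check might be written, as a free function rather than a Range method. The name `encompass?` is taken from the suggestion above, the implementation is only an assumption, and it treats both ranges as inclusive (a..b); exclusive ends are not handled.

```ruby
# Hypothetical standalone version of the proposed #encompass?.
# Accepts either a single value or another (inclusive) range.
def encompass?(outer, inner)
  if inner.is_a?(Range)
    outer.begin <= inner.begin && inner.end <= outer.end
  else
    outer.begin <= inner && inner <= outer.end
  end
end

p encompass?(0..5, 4)       # true
p encompass?(0..5, 5.1)     # false
p encompass?(0..5, 1..2)    # true
p encompass?(1..2, 0..5)    # false
```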


David

--
David A. Black
dbl...@wobblini.net


Bob Showalter

Nov 28, 2005, 11:15:11 AM
Yukihiro Matsumoto wrote:
> #include? was used for the range check; #member? was for set membership.
> But since they have the same functionality in Enumerable, some claimed
> that having different behaviors in Range was confusing. I agreed.
>
> | Anyway, please let me know if there is anything I can do to help
> |settle this issue.
>
> All we need is making up good names for each functionality.

How about something like Enumerable#produces?, or Enumerable#yields?

Then perhaps start deprecating Enumerable#member?

So for a range, one could use r.include?(obj) to test for obj between
the endpoints, and r.yields?(obj) to test whether r.succ ever yields obj.

Enumerable#=== becomes an issue (case statement), right?


men...@rydia.net

Nov 28, 2005, 11:24:54 AM
Quoting Joel VanderWerf <vj...@path.berkeley.edu>:

> I like #bound?, as in
>
> (lower..upper).bound? x

Hmm, I don't know. That seems like it would suggest the existence
of a Range#bind ... #bounds? possibly?

-mental


Warren Brown

Nov 28, 2005, 11:33:05 AM
Matz,

> Anyway, could Range#include? and Range#member? be
> changed back to a membership check and a new method
> be added to Range for interval coverage, or would
> that break too much backwards compatibility?

Without an answer to this question, people will not know which of
the two functionalities to choose a name for. If the answer is "yes",
then we are looking for a name for a membership check. If the answer is
"no", then we are looking for a name for an interval inclusion test.

Did I miss something somewhere?

- Warren Brown

Yukihiro Matsumoto

Nov 28, 2005, 11:49:51 AM
Hi,

In message "Re: [BUG] string range membership"

I'm vaguely thinking of changing #member? back to membership check,
with performance optimization for numbers.
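
What such a membership check might look like, purely as a sketch: iterate for general ranges, but take a bounds-check shortcut when both endpoints and the value are Integers, where the two answers coincide. `range_member?` is a hypothetical free function, not the eventual Ruby implementation.

```ruby
# Hypothetical sketch: membership in the set that each produces, with a
# shortcut for Integer ranges where comparing bounds gives the same answer.
def range_member?(range, value)
  if [range.begin, range.end, value].all? { |x| x.is_a?(Integer) }
    upper_ok = range.exclude_end? ? value < range.end : value <= range.end
    range.begin <= value && upper_ok
  else
    # General case: walk the range and compare each generated element.
    range.each { |x| return true if x == value }
    false
  end
end

puts range_member?(1..1_000_000, 999_999)    # true, no iteration needed
puts range_member?(0...42, 42)               # false
puts range_member?('1'..'10', '2')           # true
puts range_member?('a'..'aa', 'z')           # true
puts range_member?('1'..'10', '11')          # false
```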

matz.


Warren Brown

Nov 28, 2005, 11:54:26 AM
Matz,

> I'm vaguely thinking of changing #member? back to
> membership check, with performance optimization for
> numbers.

Yay! What about Range#include?? Would it still be an alias for
Range#member?, or would it retain the current interval check?

-Warren Brown

Martin DeMello

Nov 28, 2005, 12:14:50 PM

That reads better as obj.within?(range) than as range.within?(obj). I
like #contain? personally, though in programming terms a "container" is
more a set than a range. Actually, I'm all for #include? to mean
bounding inclusion, and something a lot more expensive-sounding than
member? for #to_a set inclusion. Something like
("aaa"..."zzz").generates?("bbb") would at least indicate that it was
doing an O(n) stepthrough of the range.

martin

Yukihiro Matsumoto

Nov 28, 2005, 12:31:59 PM
Hi,

In message "Re: [BUG] string range membership"

on Tue, 29 Nov 2005 01:54:26 +0900, "Warren Brown" <warre...@aquire.com> writes:

| Yay! What about Range#include?? Would it still be an alias for
|Range#member?, or would it retain the current interval check?

Not decided yet. Feel free to give your opinion.

matz.


Warren Brown

Nov 28, 2005, 12:49:29 PM
Matz,

>> What about Range#include?? Would it still be an
>> alias for Range#member?, or would it retain the
>> current interval check?
>
> Not decided yet. Feel free to give your opinion.

Well, I tend to agree with your previous decision on this. Having
Enumerable#include? be an alias for Enumerable#member? (or vice-versa),
but having Range#include? behave differently from Range#member? would be
confusing. I think it would be better to leave these two methods as
aliases and add a new method to Range for the interval check. My
leading candidate for this method is now David A. Black's suggestion of
#encompass?. This is a great name and I really like his idea of
extending it to accept Ranges as parameters so that
(1..10).encompass?(2..9) == true. Other synonyms could also work:
#enclose?, #surround?, and even #contain?.

- Warren Brown

Trans

unread,
Nov 28, 2005, 12:50:57 PM11/28/05
to

Bob Showalter wrote:

> How about something like Enumerable#produces?, or Enumerable#yields?
>
> Then perhaps start deprecating Enumerable#member?
>
> So for a range, one could use r.include?(obj) to test for obj between
> the endpoints, and r.yields?(obj) to test whether r.succ ever yields obj.
>
> Enumerable#=== becomes an issue (case statement), right?

A nice alternative bit of thinking, Bob. At least you are making some
sense.

As for the rest of the gibberish being posted here, which btw has been
the same flap for years, forget it. It's hopeless. You all will be
right back to where you were two years ago, two years from now.

Adios,
T.

ara.t....@noaa.gov

unread,
Nov 28, 2005, 1:20:09 PM11/28/05
to

reading this just gave me a new idea: first of all, i think this method
should be a verb so that it implies a loop and test, rather than a simple
test. this is important because the method is, potentially, extremely
expensive. shortening your suggestion then, how about

(0 ... 42).pass? 42 #=> false
(0 .. 42).pass? 42 #=> false

for example:

harp:~ > cat a.rb
module Enumerable

  #
  # Enumerable#pass? member
  # true if the each method will yield member
  #
  def pass? member
    each{|element| return(element ? element : true) if member === element}
    return nil
  end
end

p (0 ... 42).pass?(42)
p (0 .. 42).pass?(42)
p [0,1,2,nil].pass?(nil)

open(".bashrc") do |f|
  if ((line = f.pass? %r/screen/))
    puts line
  end
end


harp:~ > ruby a.rb
nil
42
true
alias ss='screen -S '

thoughts?

Joel VanderWerf

unread,
Nov 28, 2005, 2:10:28 PM11/28/05
to
ara.t....@noaa.gov wrote:
> (0 ... 42).pass? 42 #=> false
> (0 .. 42).pass? 42 #=> false

(0..42).pass? 40.5

true or false?

David A. Black

unread,
Nov 28, 2005, 2:17:39 PM11/28/05
to
Hi --

On Tue, 29 Nov 2005, ara.t....@noaa.gov wrote:

> On Tue, 29 Nov 2005, Warren Brown wrote:
>
>> Matz,
>>
>>>> What about Range#include?? Would it still be an
>>>> alias for Range#member?, or would it retain the
>>>> current interval check?
>>>
>>> Not decided yet. Feel free to say your opinion.
>>
>> Well, I tend to agree with your previous decision on this. Having
>> Enumerable#include? be an alias for Enumerable#member? (or vice-versa),
>> but having Range#include? behave differently from Range#member? would be
>> confusing. I think it would be better to leave these two methods as
>> aliases and add a new method to Range for the interval check. My
>> leading candidate for this method is now David A. Black's suggestion of
>> #encompass?. This is a great name and I really like his idea of
>> extending it to accept Ranges as parameters so that
>> (1..10).encompass?(2..9) == true. Other synonyms could also work:
>> #enclose?, #surround?, and even #contain?.
>
> reading this just gave me a new idea: first of all, i think this method
> should be a verb so that it implies a loop and test, rather than a simple
> test. this is important because the method is, potentially, extremely
> expensive.

"encompass" is a verb :-)


> shortening your suggestion then, how about
>
> (0 ... 42).pass? 42 #=> false
> (0 .. 42).pass? 42 #=> false

I don't get how "pass" relates to ranges, or enumerables generally.
Do you mean because it will be "passed" to the block? That seems like
focusing on the mechanics rather than the semantics of the object.
(But maybe I'm misunderstanding.)

Martin DeMello

unread,
Nov 29, 2005, 8:43:49 AM11/29/05
to
ara.t....@noaa.gov wrote:
>
> reading this just gave me a new idea: first of all, i think this method
> should be a verb so that it implies a loop and test, rather than a simple
> test. this is important because the method is, potentially, extremely
> expensive. shortening your suggestion then, how about

I suggested #generates? elsewhere in the thread, for the same reason - I
think it important that the method sounds expensive. Are there any
methods currently in the core/stdlib that sound innocuous but have a
performance hit under the covers? (#last perhaps?).

martin

Dave Howell

unread,
Nov 30, 2005, 7:02:03 PM11/30/05
to
English speaker.

Relatively new Ruby user.

My thoughts: forget it! Stop! Ahhhh! Kitchen sink!

Let's see if I've got this straight. Somebody complained because

('1'..'10').member?('2')
=> false


Good! The fact that Ruby will get incredibly clever with strings and
fabricate arbitrary sequences with them is a charming trick, but they
are arbitrary, and it is a trick.

The fact that '1', '2', ... '9','10' is obvious doesn't make it any
less arbitrary.

'1'..'100'

Is that supposed to be 1, 2, 3, ... 99, 100 or 1, 10, 11, 100? Ruby
arbitrarily decided to interpret those strings as base 10 integers.

'a.1'..'c.3'

Quite honestly, I have absolutely no idea how Ruby would count that.
Will I get 'a.1', 'a.2', 'a.3', 'b.1' ... or is it going to go all the
way to 'a.9' and then start over with 'b.1'?

Ruby does NOT need more almost-but-not-quite-the-same methods! People
sophisticated enough to require access to the subtle differences are
sophisticated enough to fix the problem themselves, by modifying the
necessary code, or by finding somebody else's recommended modification
and using that.

Please. The tremendous ease with which Ruby can be extended is all the
more reason to keep the core set tight, small, clean, and thus more
comprehensible and more accessible to beginners. I can't even count how
many different ways there are to open a file! I'd so much rather have
one way, with one thorough explanation, and notes on its shortcomings,
than seven, or whatever, each with just a sketchy description.

In closing:
('1'..'10').to_a.member?('2')
=> true

Is that really such a big deal?

Hal Fulton

unread,
Nov 30, 2005, 8:26:41 PM11/30/05
to
Dave Howell wrote:
>
> Good! The fact that Ruby will get incredibly clever with strings and
> fabricate arbitrary sequences with them is a charming trick, but they
> are arbitrary, and it is a trick.
>

That sums it up well. I couldn't have said it better. Well, maybe if I
really tried...


Hal

ara.t....@noaa.gov

unread,
Nov 30, 2005, 9:59:12 PM11/30/05
to

otoh - __everything__ computers do can be summarized as extremely clever
tricks with strings. maybe better said as infinitely long tapes, but strings
nonetheless.

regards.

Trans

unread,
Nov 30, 2005, 11:24:38 PM11/30/05
to
> In closing:
> ('1'..'10').to_a.member?('2')
> => true
>
> Is that really such a big deal?

This is but one common instance. The issue extends beyond this. Range
is constrained by exceptional behaviors such that it can't be used more
creatively in an assured functioning manner. For instance, you'd expect
#member? to do what the documentation says it does. If it is not going to
do that, the documentation ought to be changed. Why hasn't it? Because
Range has a great deal of exceptional behavior for the sake of
efficiency. Documenting the actual behavior would be too complicated.

Given the current implementation of Range and how it is used (or more
precisely, how it can't be used), and since there is clearly no intent
to do otherwise, it just doesn't make sense to push it as some sort of
generalized component. Might as well give Range all the knowledge it
needs to do its basic job and forget this "mixin nature" altogether
(i.e. #succ and #<=> of the sentinels). Finish coding Range to know
what a numeric sequence is and what a string sequence is and forget
about it (if you've ever looked at the code you know it's halfway
there anyway). At least it would be even more efficient then.

T.

Warren Brown

unread,
Dec 1, 2005, 1:59:56 AM12/1/05
to
Dave,

> Relatively new Ruby user.

Welcome to Ruby!

> Let's see if I've got this straight. Somebody
> complained because
>
> ('1'..'10').member?('2')
> => false

That is the tip of the iceberg, yes.

> Good! The fact that Ruby will get incredibly clever
> with strings and fabricate arbitrary sequences with
> them is a charming trick, but they are arbitrary, and
> it is a trick.

Why is this good? The Range '1'..'10' is a member of Enumerable.
As such, it has a finite number of elements and those elements can be
enumerated (for lack of a better word) one at a time using #each. In
particular, this Range is enumerated as '1', '2', '3', '4', '5', '6',
'7', '8', '9', '10'. The method Enumerable#member? will return true if
one of the enumerated elements is equal to the parameter. However, for
Ranges, the behavior of #member? is different. So different in fact
that for this Range, #member?('2') returns false. Many people see this
as a bad thing, not a good thing.

> The fact that '1', '2', ... '9','10' is obvious
> doesn't make it any less arbitrary.

However that sequence isn't arbitrary, all others are. A Range can
be defined on any object that supports #succ and #<=>. The #succ method
defines the *one and only* sequence that a Range cares about, in
relation to Enumerable. For strings, '1', '2', '3', '4', '5', '6', '7',
'8', '9', '10' is the *one and only* sequence that #succ generates, and
that is the sequence that Enumerable#member? would use if Range didn't
override the #member? method.

> '1'..'100'
>
> Is that supposed to be 1, 2, 3, ... 99, 100 or 1, 10,
> 11, 100? Ruby arbitrarily decided to interpret those
> strings as base 10 integers.

No, it's supposed to be '1', '2', '3', ... '99', '100'. There is
nothing arbitrary about the sequence, and it's not a trick. It is the
sequence defined by String#succ. You are welcome to write your own
version of String#succ, but it won't change anything. Range#member?
will still ignore it.

> 'a.1'..'c.3'
>
> Quite honestly, I have absolutely no idea how Ruby
> would count that. Will I get 'a.1', 'a.2', 'a.3',
> 'b.1' ... or is it going to go all the way to 'a.9'
> and then start over with 'b.1'?

Well, let's see:

>ruby -e "p ('a.1'..'c.3').to_a"
["a.1", "a.2", "a.3", "a.4", "a.5", "a.6", "a.7", "a.8", "a.9", "b.0",
"b.1", "b.2", "b.3", "b.4", "b.5", "b.6", "b.7", "b.8", "b.9", "c.0",
"c.1", "c.2", "c.3"]

(The rest of the message was less relevant, so I snipped it along with
my irrelevant smart-ass replies :o) Instead, I present the current
state of affairs on this issue, as there seems to be a lot of confusion
about it:


There are two core issues involved in this problem. The first is
the dual nature of Ranges. Since Ranges implement the #each method,
they can be viewed as a set of elements, which is how Enumerable views
them. Therefore (1..10).to_a works, along with all of the other
wonderful methods that Enumerable provides.

Ranges can also be viewed as intervals. The best example here is
(1.0..10.0). This Range is *not* Enumerable, since Float does not (and
cannot) implement the #succ method. However, it is still useful to ask
if a number falls within the boundaries of a Range. Therefore, the <=>
operator is used to test for Range.begin <= value <= Range.end. This is
the functionality that is currently implemented by Range#member?, and
its alias, Range#include?. This was mainly done as an optimization,
since checking 1 <= x <= 1000000 is a whole lot faster than Enumerating
all 1000000 elements. It also allowed Float Ranges to work as well.
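
A small sketch of the interval view for a Float Range. It assumes a later
Ruby version in which the interval test is available under the name
Range#cover?; that method did not exist when this thread was written, so
it stands in here for the 2005 behavior of Range#member?.

```ruby
floats = (1.0..10.0)

# Interval test: only the two end points are compared with <=>.
inside  = floats.cover?(2.5)
outside = floats.cover?(10.5)

# Enumeration is not possible, because Float does not define #succ.
enumerable = begin
  floats.each { |f| break }
  true
rescue TypeError
  false
end

p [inside, outside, enumerable]
```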

The other core issue is that the method String#succ is implemented
in such a way that it is possible for (x > x.succ) to be true (e.g. 'z'
> 'z'.succ). This is what makes the view of a Range as a set and the
view of a Range as an interval incompatible, and why
('1'..'10').include?('2') can be viewed as either right or wrong
depending on how you are looking at the Range. Certainly, '2' is in the
set ('1', '2', '3', ... , '10'), but '1' <= '2' <= '10' is *not* true
since strings are compared, well, as strings.
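
The two views can be put side by side in a short sketch. It deliberately
avoids Range#member? and Range#include?, whose behavior is the subject of
this thread, and spells out each test by hand; the variable names are
illustrative only.

```ruby
range = ('1'..'10')

# Set view: walk the sequence that #each produces and look for the value.
in_set = range.to_a.include?('2')

# Interval view: compare against the two end points only.
in_interval = ('1' <= '2') && ('2' <= '10')

# The root cause: #succ can produce a value that sorts before its
# predecessor, so the sequence is not monotonic under <=>.
succ_goes_backwards = 'z' > 'z'.succ

p in_set: in_set, in_interval: in_interval, succ_goes_backwards: succ_goes_backwards
```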

So, we are currently in a situation where Enumerable#member? (and
its alias Enumerable#include?) test for set membership by enumerating
the set through the #each method, but Range#member? and Range#include?
test for interval coverage and *not* set membership. This is the main
inconsistency that we are trying to get rid of.

Matz is currently considering changing the functionality of
Range#member? from an interval coverage test back to the set membership
test, which interestingly enough, is actually how it started life (it
was later changed to be the same as #include?). Range would still
override the method and optimize the test for Integer Ranges, but
non-Integer ranges (including String Ranges) would revert back to the
Enumerable#member? method (or at least that method's functionality).
Matz hasn't decided whether he would change the Range#include? method to
be a test for set membership too, or to leave it as an interval coverage
test. My guess is that it will remain an alias for #member?, since the
two are aliases in Enumerable.
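
A rough sketch of what such a membership test could look like. The helper
name member_by_enumeration? is hypothetical and is written as a plain
method rather than a change to Range; it is not the actual implementation
in range.c, only an illustration of "set membership with an Integer fast
path".

```ruby
# Hypothetical helper: set membership, with a shortcut for Integers.
def member_by_enumeration?(range, value)
  first, last = range.begin, range.end

  # Fast path: for Integers the sequence and the interval agree, so two
  # comparisons are enough.
  if [first, last, value].all? { |x| x.is_a?(Integer) }
    upper_ok = range.exclude_end? ? value < last : value <= last
    return first <= value && upper_ok
  end

  # General path: enumerate with #each, as Enumerable#member? does.
  range.each { |element| return true if element == value }
  false
end

p member_by_enumeration?(1..1_000_000, 999_999)
p member_by_enumeration?('1'..'10', '2')
```

The general path is O(n) in the length of the sequence, which is the cost
the interval check was introduced to avoid.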

However, since Range#member? would no longer be an interval coverage
test, Matz would want to add a new method to Range to take its place, so
he is currently trying to find a good name for that method. Current
suggestions for the name include (no pun intended):

#between?
#betwixt?
#bound?
#cover?
#enclose?
#encompass?
#in?
#in_interval?
#in_range?
#inside?
#surround?
#within?

Matz is also seeking comments from other people on these suggested
names along with any other names that might be appropriate.

David A. Black also suggested (along with the wonderfully apt name
#encompass?) that this new function could also accept a Range as the
parameter and test for interval over interval coverage as well. This
sounds like a great suggestion and would make the new function even more
useful.
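
A sketch of how the interval test could be extended to accept a Range.
The function name encompass? follows the suggested method name, but it is
written here as a hypothetical standalone helper taking both ranges as
arguments; it assumes both end points are present and comparable with <=>.

```ruby
# Hypothetical helper: does the interval `outer` cover `inner`, where
# `inner` is either a single value or another Range?
def encompass?(outer, inner)
  unless inner.is_a?(Range)
    below_end = outer.exclude_end? ? inner < outer.end : inner <= outer.end
    return outer.begin <= inner && below_end
  end

  return false unless outer.begin <= inner.begin

  case inner.end <=> outer.end
  when -1 then true
  when 0  then !outer.exclude_end? || inner.exclude_end?
  else false
  end
end

p encompass?(1..10, 2..9)
p encompass?(1...10, 1..10)
p encompass?('1'..'10', '2')
```

Because only end points are compared, this is a pure interval test: the
last call is false for the same reason that '1' <= '2' <= '10' is false.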

So, that's where we are. I hope this clears up a lot of the
misconceptions that seem to have plagued this discussion.

- Warren Brown

Bob Showalter

unread,
Dec 1, 2005, 8:31:02 AM12/1/05
to
Dave Howell wrote:
> English speaker.
>
> Relatively new Ruby user.
>
> My thoughts: forget it! Stop! Ahhhh! Kitchen sink!
>
> Let's see if I've got this straight. Somebody complained because
>
> ('1'..'10').member?('2')
> => false
>
>
> Good!

No, not good.

Range mixes Enumerable, but Range#member? does not behave like
Enumerable#member?, hence the confusion.

>The fact that Ruby will get incredibly clever with strings and
> fabricate arbitrary sequences with them is a charming trick, but they
> are arbitrary, and it is a trick.
>
> The fact that '1', '2', ... '9','10' is obvious doesn't make it any
> less arbitrary.

Yes, it is arbitrary, but nevertheless, the range '1'..'10' will produce
the value '2' (unless String#succ has been overridden), so '2' is by any
ordinary definition of the word member, a member of this range.

It sounds more like your beef is with String#succ

> I'd so much rather have
> one way, with one thorough explanation, and notes on its shortcomings,
> than seven, or whatever, each with just a sketchy description.

I would too, namely make Enumerable#member? work the same way for Ranges
that it does for any other Enumerable. (That seems to be where matz is
leaning).

> ...


> In closing:
> ('1'..'10').to_a.member?('2')
> => true
>
> Is that really such a big deal?
>

No, that's fine. More efficient would be

!('1'..'10').find({|x| x == '2'}).nil?


David A. Black

unread,
Dec 1, 2005, 8:40:44 AM12/1/05
to
Hi --

On Thu, 1 Dec 2005, Bob Showalter wrote:

> Dave Howell wrote:
>> English speaker.
>>
>> Relatively new Ruby user.
>>
>> My thoughts: forget it! Stop! Ahhhh! Kitchen sink!
>>
>> Let's see if I've got this straight. Somebody complained because
>>
>> ('1'..'10').member?('2')
>> => false
>>
>>
>> Good!
>
> No, not good.
>
> Range mixes Enumerable, but Range#member? does not behave like
> Enumerable#member?, hence the confusion.
>
>> The fact that Ruby will get incredibly clever with strings and
>> fabricate arbitrary sequences with them is a charming trick, but they are
>> arbitrary, and it is a trick.
>>
>> The fact that '1', '2', ... '9','10' is obvious doesn't make it any less
>> arbitrary.
>
> Yes, it is arbitrary, but nevertheless, the range '1'..'10' will produce the
> value '2' (unless String#succ has been overridden), so '2' is by any
> ordinary definition of the word member, a member of this range.

I think it's more a question of the definition of the word "range".
I've come to believe that ranges should be strictly interval-like in
their behavior. Basically, a range is a kind of filter: (0...10) is
not ten numbers, but rather an expression of "the fact of
0-through-10-ness", or something like that. At least, that's how I'd
like to see ranges work. I think they're trying to be too many
things at once.

This is also why I think the idea of a mutable range is a
contradiction in terms. You can't change what the fact of starting at
0 and ending at 10 means.


David

>> ('1'..'10').to_a.member?('2')
>> => true
>>
>> Is that really such a big deal?
>>
>
> No, that's fine. More efficient would be
>
> !('1'..'10').find({|x| x == '2'}).nil?

I don't think find takes a hash argument :-) Also, why not lose the !
and the .nil? ?

Bob Showalter

unread,
Dec 1, 2005, 8:55:44 AM12/1/05
to
David A. Black wrote:
>> No, that's fine. More efficient would be
>>
>> !('1'..'10').find({|x| x == '2'}).nil?
>
>
> I don't think find takes a hash argument :-) Also, why not lose the !
> and the .nil? ?

a) The fingers get ahead of the brain :)
b) To get a true/false. But not needed, of course.


Trans

unread,
Dec 1, 2005, 9:04:03 AM12/1/05
to
Nice summary Warren.

There's still a little bit more to it though. If one searches ruby-talk
one finds that there are also other less obvious peculiarities about
Range -- not that they are all that significant, but they are there.

The problem I see is that if #member? goes back to being essentially
equivalent to #to_a.include?, we're right back to the original problem,
exactly as you point out:

> This was mainly done as an optimization,
> since checking 1 <= x <= 1000000 is a whole lot faster than Enumerating
> all 1000000 elements. It also allowed Float Ranges to work as well.

How could one optimize a _custom_ membership for a Range then? You
can't, so our choices for #member? trap us between inconsistent
functionality or significant inefficiency. And it still does not address
the underlying causes: #succ and #<=> are incompatible in the String
class, and might also be so for other classes.

I've offered the best solution generally possible for this issue: It
corrects the underlying cause, fixes the inconsistent functionality and
maintains efficiency. What more can one ask? Nonetheless no one seems
interested in it. I tend to think the reason is because it introduces a
new method (#cmp), but since no one has even touched on it, how do I
know? I'm at a loss. Do people just not get it? Did I not explain it
well enough? Did I miss something? Or is it that people just prefer to
stew around in their own preconceptions?

T.

David A. Black

unread,
Dec 1, 2005, 9:23:09 AM12/1/05
to
Hi --

On Thu, 1 Dec 2005, Bob Showalter wrote:

You can do:

enum.any? {|e| ... }

to get a true/false result.

Bob Showalter

unread,
Dec 1, 2005, 10:47:29 AM12/1/05
to
David A. Black wrote:
> You can do:
>
> enum.any? {|e| ... }
>
> to get a true/false result.

Excellent! I assume that would stop iterating as soon as a true was found?


David A. Black

unread,
Dec 1, 2005, 10:49:51 AM12/1/05
to
Hi --

Yes:

irb(main):004:0> [1,2,3,4].any? {|e| puts e; e > 1 }
1
2
=> true
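
Applied to the String Range from the start of the thread, the same idea
can be sketched as follows. The visited array is only there to record
which elements the block saw.

```ruby
range = ('1'..'10')
visited = []

# any? stops at the first element for which the block is true.
found = range.any? { |s| visited << s; s == '2' }

p found
p visited
```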

Adam Shelly

unread,
Dec 2, 2005, 1:57:11 AM12/2/05
to
On 12/1/05, Trans <tran...@gmail.com> wrote:
> Nice summary Warren.

>
>
> I've offered the best solution generally possible for this issue: It
> corrects the underlying cause, fixes the inconsistent functionality and
> maintains efficiency. What more can one ask? Nonetheless no one seems
> interested in it. I tend to think the reason is because it introduces a
> new method (#cmp), but since no one has even touched on it, how do I
> know? I'm at a loss. Do people just not get it? Did I not explain it
> well enough? Did I miss something? Or is it that people just prefer to
> stew around in their own preconceptions?
>
I just saw this quote in pickaxe, which helped clarify my thinking:

"Ranges can be constructed using objects of any type, as long as the
objects can be compared using their <=> operator and they support the
succ method to return the next object in sequence. "

But (a <=> a.succ) == -1 does not hold for all a, especially if a is a String.
This causes r.find{|a| !r.member?(a)} to return non-nil for some
ranges, which is unexpected, or possibly just plain wrong.
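
A short sketch of that observation. It assumes a later Ruby version and
uses Range#cover? as a stand-in for the interval-style Range#member?
discussed in this thread, since cover? performs the same end-point
comparison.

```ruby
r = ('1'..'10')

# Find the first element the range yields that the interval test rejects.
stray = r.find { |a| !r.cover?(a) }

p stray
```

For a range whose #succ and #<=> agree, such as 1..10, the same
expression returns nil.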

So I think I like your suggestion, which I would boil down to:
Change the requirement as follows: "Ranges can be constructed from
objects of any class which supports #succ and #cmp, where
a.cmp(a.succ)==-1 for all a."

For numeric classes #cmp is an alias for #<=>. For strings #cmp is a
custom function, matching the string succession generator.
And for your own classes you can write your own #cmp, which does not
have to match #<=>. For instance:
President.new("Kennedy").cmp President.new("Nixon") #=> -1 (Nixon came later)
President.new("Kennedy") <=> President.new("Nixon") #=> 1 (but Kennedy is greater)


The issue I see is that I don't know if it is possible to write a
valid #cmp for all cases.
What is the result of 'a'.cmp('0') ? You can't ever get a '0' with
'a'.succ. So is '0' before or after 'a'?

I suppose that you could require that a.cmp b returns nil when
a=a.succ will never produce b. Then Range#member? becomes a test for
set membership, just like enum#member?:
class Range
  def member? v
    # parentheses are needed so that f and l are assigned before they are compared
    (f = first.cmp(v)) && f <= 0 && (l = last.cmp(v)) && l >= 0
  end
end

So yes, I like this suggestion.
And I just realized it may not be orthogonal to the other:
The suggested change leaves no way to test for Interval inclusion in
those cases where the interval and the sequence are different (like
'a'..'bb'). So perhaps there still should be a method of testing
Range inclusion using <=>. (my suggestion is #spans?)
('a'..'bb').member? 'z' #=> true
('a'..'bb').spans? 'z' #=> false
('a'..'bb').member? 'aardvark' #=> false
('a'..'bb').spans? 'aardvark' #=> true
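
A minimal sketch of the idea, under a strong simplifying assumption: it
only handles strings made of lowercase a-z, for which the #succ order is
"shorter strings first, then alphabetical". The names succ_cmp,
succ_member? and spans? are hypothetical helpers written as plain methods,
not the proposed String#cmp or Range methods themselves, and exclusive
ranges are ignored.

```ruby
# Hypothetical succession-order comparison for lowercase a-z strings.
# Returns nil when either string is outside that alphabet, i.e. when
# repeated #succ could never lead from one to the other.
def succ_cmp(a, b)
  return nil unless [a, b].all? { |s| s.match?(/\A[a-z]+\z/) }
  [a.length, a] <=> [b.length, b]
end

# Set membership decided with two comparisons in succession order.
def succ_member?(range, value)
  low  = succ_cmp(range.begin, value)
  high = succ_cmp(range.end, value)
  !low.nil? && !high.nil? && low <= 0 && high >= 0
end

# Interval inclusion decided with the ordinary <=> ordering.
def spans?(range, value)
  (range.begin <=> value) <= 0 && (range.end <=> value) >= 0
end

r = ('a'..'bb')
p succ_member?(r, 'z'), spans?(r, 'z')
p succ_member?(r, 'aardvark'), spans?(r, 'aardvark')
```

A general #cmp for arbitrary strings would have to mirror every carry rule
of String#succ, which is the difficulty raised above with 'a'.cmp('0').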

-Adam


Trans

unread,
Dec 3, 2005, 7:58:57 AM12/3/05
to

Adam Shelly wrote:
> I just saw this quote in pickaxe, which helped clarify my thinking:
>
> "Ranges can be constructed using objects of any type, as long as the
> objects can be compared using their <=> operator and they support the
> succ method to return the next object in sequence. "
>
> But (a <=> a.succ) == -1 does not hold for all a, especially if a is a String.
> This causes r.find{|a| !r.member?(a)} to return non-nil for some
> ranges, which is unexpected, or possibly just plain wrong.
>
> So I think I like your suggestion, which I would boil down to:
> Change the requirement as follows: "Ranges can be constructed from
> objects of any class which supports #succ and #cmp, where
> a.cmp(a.succ)==-1 for all a."

Very nicely put. I wish I were as gifted at explaining things. Thanks
Adam.

> For numeric classes #cmp is an alias for #<=>. For strings #cmp is a
> custom function, matching the string succession generator.
> And for your own classes you can write your own #cmp, which does not
> have to match #<=>. For instance:
> President.new("Kennedy").cmp President.new("Nixon") #=> -1 (Nixon came later)
> President.new("Kennedy")<=> President.new("Nixon") #=> 1 (but
> Kennedy is greater)

:-)

> The issue I see is that I don't know if it is possible to write a
> valid #cmp for all cases.
> What is the result of 'a'.cmp('0') ? You can't ever get a '0' with
> 'a'.succ. So is '0' before or after 'a'?
>
> I suppose that you could require that a.cmp b returns nil when
> a=a.succ will never produce b. Then Range#member? becomes a test for
> set membership, just like enum#member?:
> class Range
>   def member? v
>     (f = first.cmp(v)) && f <= 0 && (l = last.cmp(v)) && l >= 0
>   end
> end

Yes, that's exactly it. The nice thing about having #cmp separate from
#<=> is you can have optimizations for determining this, and it can be
used by the #member? method.

> So yes, I like this suggestion.
> And I just realized it may not be orthogonal to the other:
> The suggested change leaves no way to test for Interval inclusion in
> those cases where the interval and the sequence are different (like
> 'a'..'bb'). So perhaps there still should be a method of testing
> Range inclusion using <=>. (my suggestion is #spans?)
> ('a'..'bb').member? 'z' #=> true
> ('a'..'bb').spans? 'z' #=> false
> ('a'..'bb').member? 'aardvark' #=> false
> ('a'..'bb').spans? 'aardvark' #=> true

Good point. Boundary checks with #<=>, regardless of membership, would
still be useful.

Thanks Adam.

T.
