Array#insert


Aleksi Niemelä

Oct 17, 2000, 4:18:00 PM
I was looking up how Array#insert should be called. Uh, there's none! Maybe
there's some other suitable method hiding among the others, but I didn't
spot it quickly.

Of course there's ary[index,0] = stuff. But it's a very annoying format, when
you could be more explicit. Besides, there's always the surprise of

ruby -e'a=[1,2,3]; a[1,0]=[4,5]; p a;'
[1, 4, 5, 2, 3]
ruby -e'a=[1,2,3]; a[1,0]=[[4,5]]; p a;'
[1, [4, 5], 2, 3]

lurking, and I'd expect the first example to output what the second does.
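
The surprise above can be reproduced with a short sketch (the variable names are only illustrative). Assigning an array to a zero-length slice splices its elements in one by one, so the array has to be wrapped once more to be inserted as a single element:

```ruby
# Zero-length slice assignment: ary[index, 0] = stuff inserts before index.
a = [1, 2, 3]
a[1, 0] = [4, 5]      # the right-hand array is spliced in element by element
p a                   # [1, 4, 5, 2, 3]

b = [1, 2, 3]
b[1, 0] = [[4, 5]]    # wrap once more to insert the array itself
p b                   # [1, [4, 5], 2, 3]
```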

If there should be an Array#insert, we should probably make up our minds whether
it should be implemented as (not real code, just to show a few possible
points for debate)

class Array
  def insert(index, *ary)
    self[index<+1>, 0] = <*>ary
  end
end

where <+1> is about having the semantics of "insert-before/at" or "insert-after"
index, and <*> is about whether there should be some sort of autoflattening, or
some other form. I personally don't have an opinion on whether the semantics
should be before or after, but I'd expect Array#insert not to autoflatten in
any way.

- Aleksi

GOTO Kentaro

Oct 17, 2000, 4:49:03 PM
In message "[ruby-talk:5652] Array#insert"
on 00/10/18, Aleksi Niemelä <aleksi....@cinnober.com> writes:
>I was looking up how Array#insert should be called. Uh, there's none! Maybe
>there's some other suitable method somewhere in the middle of the other, but
>I didn't spot it fast.
>
>Of course ary[index,0] = stuff. But it's very annoying format, when you
>could be more explicit too. Besides there's always the surprise of
>
> ruby -e'a=[1,2,3]; a[1,0]=[4,5]; p a;'
> [1, 4, 5, 2, 3]
> ruby -e'a=[1,2,3]; a[1,0]=[[4,5]]; p a;'
> [1, [4, 5], 2, 3]
>
>lurking, and I'd expect the first example output what the second does.

How about this?

class Array
  def insert(pos, *obj)
    self[pos,0] = obj
    self
  end
end

a = [1,2,3] ; a.insert(1, [4, 5]) ; p a #=> [1, [4, 5], 2, 3]
a = [1,2,3] ; a.insert(1, 4, 5 ) ; p a #=> [1, 4, 5, 2, 3]

-- gotoken

Aleksi Niemelä

Oct 19, 2000, 1:02:39 PM
Gotoken proposed code for Array#insert:

> How about this?
>
> class Array
>   def insert(pos, *obj)
>     self[pos,0] = obj
>     self
>   end
> end
>
> a = [1,2,3] ; a.insert(1, [4, 5]) ; p a #=> [1, [4, 5], 2, 3]
> a = [1,2,3] ; a.insert(1, 4, 5 ) ; p a #=> [1, 4, 5, 2, 3]
>

I like it. So here's an (equivalent?) patch which produces

a=[1,2,3]; a.insert(1, [4,5]); p a;
[1, [4, 5], 2, 3]

a=[1,2,3]; a.insert(1, 4,5); p a;
[1, 4, 5, 2, 3]

a=[1,2,3]; a.insert(-1, [4,5]); p a;
[1, 2, [4, 5], 3]

a=[1,2,3]; a.insert(4, [4,5]); p a;
[1, 2, 3, nil, [4, 5]]

a=[1,2,3]; a.insert(-4, [4,5]); p a;
-e:1:in `insert': index -4 out of array (IndexError)
from -e:1

a=[1,2,3]; a.insert(1); p a;
-e:1:in `insert': wrong # of arguments (1 when 2 required) (ArgumentError)
from -e:1

a=[1,2,3]; a.insert([4,5], 1); p a;
-e:1:in `insert': Argument position is not a fixnum (ArgumentError)
from -e:1

Approved?

- Aleksi

[niemela@mercury ruby]$ cvs diff -u array.c
Index: array.c
===================================================================
RCS file: /home/cvs/ruby/array.c,v
retrieving revision 1.30
diff -u -r1.30 array.c
--- array.c 2000/10/10 07:03:15 1.30
+++ array.c 2000/10/19 16:52:05
@@ -617,6 +617,30 @@
     return argv[1];
 }

+static VALUE
+rb_ary_insert(argc, argv, ary)
+    int argc;
+    VALUE *argv;
+    VALUE ary;
+{
+    VALUE objects;
+    long pos;
+
+    if (argc < 2) {
+        rb_raise(rb_eArgError, "wrong # of arguments (%d when 2 required)", argc);
+    }
+
+    if (TYPE(argv[0]) != T_FIXNUM) {
+        rb_raise(rb_eArgError, "Argument position is not a fixnum");
+    }
+
+    pos = FIX2INT(argv[0]);
+    objects = rb_ary_new4(argc-1, argv+1);
+    rb_ary_replace(ary, pos, 0, objects);
+
+    return ary;
+}
+
 VALUE
 rb_ary_each(ary)
     VALUE ary;
@@ -1615,6 +1639,7 @@

     rb_define_method(rb_cArray, "[]", rb_ary_aref, -1);
     rb_define_method(rb_cArray, "[]=", rb_ary_aset, -1);
+    rb_define_method(rb_cArray, "insert", rb_ary_insert, -1);
     rb_define_method(rb_cArray, "at", rb_ary_at, 1);
     rb_define_method(rb_cArray, "first", rb_ary_first, 0);
     rb_define_method(rb_cArray, "last", rb_ary_last, 0);


Guy N. Hurst

Oct 19, 2000, 6:01:19 PM

Aleksi Niemelä wrote:
> ...


> a=[1,2,3]; a.insert(-1, [4,5]); p a;
> [1, 2, [4, 5], 3]
>

I don't like this because it implies there is no way
to use the 'insert' method to put something at the end.

I would prefer that using -1 would result in:

[1, 2, 3, [4, 5]]

Guy N. Hurst

--
HurstLinks Web Development http://www.hurstlinks.com/
Norfolk, VA - (757)623-9688
PHP/MySQL - Ruby/Perl - HTML/Javascript

Mark Slagell

Oct 19, 2000, 9:29:29 PM
"Guy N. Hurst" wrote:
>
> Aleksi Niemelä wrote:
> > ...
> > a=[1,2,3]; a.insert(-1, [4,5]); p a;
> > [1, 2, [4, 5], 3]
> >
>
> I don't like this because it implies there is no way
> to use the 'insert' method to put something at the end.

Right.



> I would prefer that using -1 would result in:
>
> [1, 2, 3, [4, 5]]

But it muddies the semantics, right? "insert before" for non-negative
indices, "insert after" for negative. I don't know that there's a good
solution, because it would seem that some kinds of asymmetry are
unavoidable when 0 can refer to the first element of something and -1 to
the last.

Guy N. Hurst

Oct 19, 2000, 11:58:23 PM

Mark Slagell wrote:
>
> "Guy N. Hurst" wrote:
> >
> > Aleksi Niemelä wrote:
> > > ...
> > > a=[1,2,3]; a.insert(-1, [4,5]); p a;
> > > [1, 2, [4, 5], 3]
> > >
> >
> > I don't like this because it implies there is no way
> > to use the 'insert' method to put something at the end.
>
> Right.
>
> > I would prefer that using -1 would result in:
> >
> > [1, 2, 3, [4, 5]]
>
> But it muddies the semantics, right? "insert before" for non-negative
> indices, "insert after" for negative.

I don't think it muddies the semantics at all.

It is always "insert before" - it's just a matter of which direction
you are traversing the elements. And using the negative sign indicates
you are moving right-to-left, so you would still be following the
"insert-before" semantic, albeit the mirror image of it.


[*** Note: the rest of this long email simply belabors this point,
and can be disregarded if you get the drift already ]

> I don't know that there's a good
> solution, because it would seem that some kinds of asymmetry are
> unavoidable when 0 can refer to the first element of something and -1 to
> the last.

What has muddied it, perhaps, is using zero as the starting index. ;-)

Otherwise, we would have:
1,2,3,... for the left-to-right movement, which allows you to specify
the first element
and
-1,-2,-3,.... for the right-to-left movement, which allows you to specify
the last element without knowing how many total elements.

But with zero, who should get it? It can't be in both places because then
there would be an ambiguity since +0 == -0.
For that you would have to have two different commands to distinguish
from which direction you wish to start.

But the case I am interested in is using a single command, with the
negative sign to indicate the direction of traversal.

So, since it is so common to start indexing with zero and move
left-to-right, that ONLY LEAVES us with the option of starting
the right-to-left movement at index -1.

It is very logical - more so than our local-var/dyna-var semantic.

If this functionality is not built-in, then it will probably be an
add-on module implemented using a bunch of pop's etc.


[*** Note: this email merely continues to belabor the point - no need to read
further unless you have the time and inclination]

Let me repeat...

I don't focus on 'insert before' or 'insert after', I pay attention to
the direction I am traversing it in.

  0  1  2  3
[ a, b, c, d, ...]

     -3 -2 -1
[..., x, y, z]


-1 means first element when starting from the right. You can't use
zero because that is already taken. NOTHING ELSE makes sense.
(otherwise please dispel my ignorance :-)

Likewise, if you need to identify insertion points, zero is already
taken:

 0  1  2  3 ...
[ a, b, c, d ...]

So to be able to use the same command and start from the opposite end
you must use the following scheme:

   -4 -3 -2 -1
[... x, y, z   ]

Regardless of what might be used elsewhere, no one can possibly be
confused (for long) with this approach, because there is no other
workable solution.

In fact, I think if zero were ever put at the end, I *would* be confused.
There is no need to do that, even if regular indexing started at +1 instead
of zero.
...

In PHP, the last element is identified by -1, the second-last as -2, etc.
The first element is 0, second is 1, etc.
That is really the only way it can work without causing confusion.
What else could the last element be? 0? 9999999999? _LAST_?
(ok, using _LAST_ is reasonable)


[*** Warning! It isn't getting any shorter - stop now unless you still have time ;-) ]

As a matter of fact, let me lengthen this email to demonstrate just what
PHP does to make handy use of such semantics with the substr() function:


substr ("string", start [, length])
http://www.php.net/manual/function.substr.php
>>
Substr returns the portion of string specified by the
start and length parameters.

If start is positive, the returned string will start at
the start'th position in string, counting from zero.

For instance, in the string 'abcdef', the character at position 0 is 'a',
the character at position 2 is 'c', and so forth.

Examples:

$rest = substr ("abcdef", 1); // returns "bcdef"
$rest = substr ("abcdef", 1, 3); // returns "bcd"

If start is negative, the returned string will start at
the start'th character from the end of string.

Examples:

$rest = substr ("abcdef", -1); // returns "f"
$rest = substr ("abcdef", -2); // returns "ef"
$rest = substr ("abcdef", -3, 1); // returns "d"

If length is given and is positive, the string returned will
end length characters from start. If this would result in a string with
negative length (because the start is past the end of the string),
then the returned string will contain the single character at start.

If length is given and is negative, the string returned will
end length characters from the end of string. If this would result in a
string with negative length, then the returned string will contain
the single character at start.

Examples:

$rest = substr ("abcdef", 1, -1); // returns "bcde"
<<

(note: PHP isn't perfect, and other functions aren't as exemplary
as substr(), nor is there an array_insert() function, but still...)

I don't think semantics needs to be a concern when there is no other
workable way to think about something. Likewise it shouldn't be an
issue that the last element should start with -1 and move to the left.

Semantics is definable, and should be geared towards how the language
is most likely to be used, I think.


Guy N. Hurst

[ Ok, it's over. :-) ]

jwei...@one.net

Oct 20, 2000, 1:15:14 AM
>>>>> "Mark" == Mark Slagell <m...@iastate.edu> writes:

>> I would prefer that using -1 would result in:
>>
>> [1, 2, 3, [4, 5]]

Mark> But it muddies the semantics, right? "insert before" for
Mark> non-negative indices, "insert after" for negative.

Don't think of it as "insert before" or "insert after". Think of it
as "insert at". After the insertion, the inserted value will be *at*
the given index.

E.g.

After ...
a.insert (index, value)
then ...
a[index] == value

whether index is positive or negative.
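
As a rough check of this "insert at" reading, here is a minimal sketch. It assumes a Ruby that ships a built-in Array#insert (current versions do, and that method follows this reading); the loop bounds and the :new marker are just illustrative choices.

```ruby
# Check the "insert at" reading: after a.insert(index, value), a[index] == value.
# Assumes a Ruby with a built-in Array#insert.
(-4..3).each do |index|
  a = [1, 2, 3]
  a.insert(index, :new)
  raise "invariant broken at #{index}" unless a[index] == :new
end
puts "a[index] == value held for every index from -4 to 3"
```

An index of -5 or lower is left out on purpose: on a three-element array it is out of range and raises IndexError.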

BTW, shouldn't the method be named "insert!" ?

--
-- Jim Weirich jwei...@one.net http://w3.one.net/~jweirich
---------------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)

Aleksi Niemelä

Oct 20, 2000, 2:59:16 PM
> From: jwei...@one.net [mailto:jwei...@one.net]

> >>>>> "Mark" == Mark Slagell <m...@iastate.edu> writes:
>
> >> I would prefer that using -1 would result in:
> >>
> >> [1, 2, 3, [4, 5]]
>
> Mark> But it muddies the semantics, right? "insert before" for
> Mark> non-negative indices, "insert after" for negative.
>
> Don't think of it as "insert before" or "insert after". Think of it
> as "insert at".

Very good discussion. Guy's right about better semantics, Mark about real
meaning of semantics, and Jim about proposing the way to understand how
negative indexes work.

While Jim's following discussion

> After ...
> a.insert (index, value)
> then ...
> a[index] == value
>

applies when value is just one parameter, it has to be extended when
Array#insert is called with multiple values. The idea is the same anyway:

After ...
a.insert (index, val1, val2, val3)
then
a[index] == val1
a[index+1] == val2
a[index+2] == val3

> BTW, shouldn't the method be named "insert!" ?

I think I'm following matz's ideas in naming it "insert" instead of
"insert!". The reasoning is that currently methods are not named
with an exclamation mark (!) if

1) the name clearly indicates the receiving object is going to change and/or
2) there's no corresponding method with implicit dup.

In this case the name indicates the receiver is going to be changed, and
there's no !-version (just like Array#delete_at doesn't have an exclamation
mark).

My understanding is that a method which changes the receiver can be named
without !. But a method which does not change the receiver can't be named
with !. Also, a method which has a non-receiver-mutating version too has to
be named with !.
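
The rule as described can be seen in a few methods from Ruby's core library. This is only a sketch of the convention, not a statement of official policy; the sample values are made up.

```ruby
line = "hello\n".dup          # dup so the example also runs with frozen string literals

chomped = line.chomp          # no "!": returns a new string, receiver untouched
p line                        # "hello\n"
p chomped                     # "hello"

line.chomp!                   # "!": same operation, but it modifies the receiver
p line                        # "hello"

ary = [1, 2, 3]
ary.delete_at(1)              # no "!" and no non-mutating twin, yet it modifies ary
p ary                         # [1, 3]
```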

I'm sure matz will correct me here, if I'm mistaken or the "rules" are
incorrect. I'm also willing to create insert *and* insert! if it's thought
to be necessary or the current way confusing. I don't feel so (yet :).

Maybe the name should be Array#insert_at, like delete_at. What do you think?

Anyway, here's an updated patch, which passes the following RubyUnit tests:

Index: array.c
===================================================================
RCS file: /home/cvs/ruby/array.c,v
retrieving revision 1.30
diff -u -r1.30 array.c
--- array.c 2000/10/10 07:03:15 1.30
+++ array.c 2000/10/20 18:54:38
@@ -617,6 +617,42 @@
     return argv[1];
 }

+static VALUE
+rb_ary_insert(argc, argv, ary)
+    int argc;
+    VALUE *argv;
+    VALUE ary;
+{
+    VALUE objects;
+    long pos;
+
+    if (argc < 2) {
+        rb_raise(rb_eArgError, "wrong # of arguments (%d when 2 required)", argc);
+    }
+
+    if (TYPE(argv[0]) != T_FIXNUM) {
+        rb_raise(rb_eArgError, "Argument position is not a fixnum");
+    }
+
+    objects = rb_ary_new4(argc-1, argv+1);
+
+    pos = FIX2INT(argv[0]);
+    /* tweak negative positions to insert in correct place so that
+       following array is indexed properly
+
+       [12, 34, 45, 56]
+         0   1   2   3   positive indexes
+        -4  -3  -2  -1   negative indexes
+    */
+    if (pos < 0)
+        if (++pos == 0)
+            pos = RARRAY(ary)->len;
+
+    rb_ary_replace(ary, pos, 0, objects);
+
+    return ary;
+}
+
 VALUE
 rb_ary_each(ary)
     VALUE ary;
@@ -1615,6 +1651,7 @@

     rb_define_method(rb_cArray, "[]", rb_ary_aref, -1);
     rb_define_method(rb_cArray, "[]=", rb_ary_aset, -1);
+    rb_define_method(rb_cArray, "insert", rb_ary_insert, -1);
     rb_define_method(rb_cArray, "at", rb_ary_at, 1);
     rb_define_method(rb_cArray, "first", rb_ary_first, 0);
     rb_define_method(rb_cArray, "last", rb_ary_last, 0);

def test_insert
  # Jim Weirich describes a way to understand
  # insert's semantics at [ruby-talk:5700]
  a = [1,2,3]
  assert_equal([[4, 5], 1, 2, 3], a.dup.insert(0, [4,5]) )
  assert_equal([1, [4, 5], 2, 3], a.dup.insert(1, [4,5]) )
  assert_equal([1, 2, [4, 5], 3], a.dup.insert(2, [4,5]) )
  assert_equal([1, 2, 3, [4, 5]], a.dup.insert(3, [4,5]) )
  assert_equal([1, 2, 3, nil, [4, 5]], a.dup.insert(4, [4,5]) )

  assert_equal([1, 4, 5, 2, 3], a.dup.insert(1, 4, 5) )

  assert_equal([1, 2, 3, [4, 5]], a.dup.insert(-1, [4,5]) )
  assert_equal([1, 2, [4, 5], 3], a.dup.insert(-2, [4,5]) )
  assert_equal([1, [4, 5], 2, 3], a.dup.insert(-3, [4,5]) )
  assert_equal([[4, 5], 1, 2, 3], a.dup.insert(-4, [4,5]) )

  # I'm a little bit unsure if this should be an error
  assert_exception(IndexError) { a.dup.insert(-5, [4,5]) }

  assert_exception(ArgumentError) { a.dup.insert(1) }
  assert_exception(ArgumentError) { a.dup.insert([4,5], 1) }

  # Jim Weirich test extended
  b = a.dup
  b.insert(1, 6)
  assert_equal([1, 6, 2, 3], b)
  b.insert(1, 5)
  assert_equal([1, 5, 6, 2, 3], b)
  b.insert(1, 4)
  assert_equal([1, 4, 5, 6, 2, 3], b)
  assert_equal(b, a.dup.insert(1, 4, 5, 6))

  b = a.dup.insert(1, 4, 5, 6)
  assert_equal(4, b[1] )
  assert_equal(5, b[1+1])
  assert_equal(6, b[1+2])
end
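
For readers who would rather not read the C, the position tweak in the updated patch can be sketched in plain Ruby. The helper name insert_sketch is hypothetical and not part of the patch; it takes the array as an explicit argument so that it does not touch the Array class, and it leans on zero-length slice assignment in place of rb_ary_replace.

```ruby
# Hypothetical helper (not part of the patch): a Ruby rendering of the
# position tweak in the updated rb_ary_insert.
def insert_sketch(ary, pos, *objects)
  raise ArgumentError, "wrong # of arguments" if objects.empty?
  raise ArgumentError, "Argument position is not a fixnum" unless pos.is_a?(Integer)

  if pos < 0
    pos += 1                        # shift so the new elements end up *at* the negative index
    pos = ary.length if pos.zero?   # -1 therefore means "append"
  end
  ary[pos, 0] = objects             # zero-length splice; raises IndexError if pos is too negative
  ary
end

p insert_sketch([1, 2, 3], -1, [4, 5])   # [1, 2, 3, [4, 5]]
p insert_sketch([1, 2, 3], -2, [4, 5])   # [1, 2, [4, 5], 3]
p insert_sketch([1, 2, 3], -4, [4, 5])   # [[4, 5], 1, 2, 3]
p insert_sketch([1, 2, 3], 4, [4, 5])    # [1, 2, 3, nil, [4, 5]]
```

Without the tweak, a position of -1 would insert before the last element, which is the behaviour Guy objected to in the first patch.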

Guy N. Hurst

Oct 20, 2000, 3:23:26 PM

Aleksi Niemelä wrote:
> ...


> Maybe the name should Array#insert_at, like delete_at. What do you think?
>

I think that is an excellent idea.

> applies when value is just one parameter, it has to be extended when
> Array#insert is called with multiple values. The idea is the same anyway:
>
> After ...
> a.insert (index, val1, val2, val3)
> then
> a[index] == val1
> a[index+1] == val2
> a[index+2] == val3


This would have to be shown differently for negative indexes:

Using
a.insert (-index, val1, val2, val3)

results in

a[-index-2] == val1
a[-index-1] == val2
a[-index] == val3
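
A quick sketch of this negative-index layout, assuming a Ruby with a built-in Array#insert that follows the updated patch's semantics (current versions do). The symbols :val1 to :val3 and the choice of index = 2 are only illustrative.

```ruby
index = 2
a = [1, 2, 3]
a.insert(-index, :val1, :val2, :val3)

p a                    # [1, 2, :val1, :val2, :val3, 3]
p a[-index - 2]        # :val1
p a[-index - 1]        # :val2
p a[-index]            # :val3
```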


Guy N. Hurst

Aleksi Niemelä

Oct 20, 2000, 3:37:03 PM

Guy points out:

> > After ...
> > a.insert (index, val1, val2, val3)
> > then
> > a[index] == val1
> > a[index+1] == val2
> > a[index+2] == val3
>
>
> This would have to be shown differently for negative indexes:
>
> Using
> a.insert (-index, val1, val2, val3)
>
> results in
>
> a[-index-2] == val1
> a[-index-1] == val2
> a[-index] == val3

Good point, I should have included it for the sake of completeness.

- Aleksi

Mark Slagell

Oct 20, 2000, 5:54:35 PM

Aleksi Niemelä wrote:
>
> . . .

> as "insert before" or "insert after". Think of it
> > as "insert at".
>
> Very good discussion. Guy's right about better semantics, Mark about real
> meaning of semantics, and Jim about proposing the way to understand how
> negative indexes work.

Being entirely satisfied with Jim's "insert at" explanation, I withdraw
my objection. :-)

> ...


>
> > BTW, shouldn't the method be named "insert!" ?
>
> I think I'm following matz ideas when I named it to be "insert" instead of
> "insert!". And the reasoning is that currently the methods are not named
> with exclamation mark (!) if
>
> 1) the name clearly indicates the receiving object is going to change and/or
> 2) there's no corresponding method with implicit dup.

> ...

This is in fact one of the first things that startled me when learning
about ruby. I'll give three reasons for wishing we were consistent about
!/? naming:

A. Least Surprise is not well served by the inconsistency, IMO. I'd love
to be able to reliably know, when seeing the name of an unfamiliar
method, whether it can be trusted not to change the receiver.

B. Getting used to having garbage collection has made me more
comfortable with programming in a functional style, as it's easier to
reason about the behavior of non-destructive methods. So it is a nice
convenience for me when destructive methods routinely turn out to have
non-destructive equivalents. I guess what I'm saying is that "clearly
indicates" is anything but clear much of the time, and is open to honest
disagreement based on differing styles; for instance, you might not see
a use for a non-destructive array insert whereas to me it might sound
just as natural and sensible as non-destructive string concatenation.

C. (related to B, but this applies even if you don't consistently prefer
functional style) Naming a destructive method without the ! ties our
hands - what can we do if we later change our mind about the need
for a non-destructive equivalent?

-- Mark

Yukihiro Matsumoto

Oct 21, 2000, 11:32:38 AM
Hi,

In message "[ruby-talk:5716] Re: Array#insert"
on 00/10/21, Aleksi Niemelä <aleksi....@cinnober.com> writes:

|I think I'm following matz ideas when I named it to be "insert" instead of
|"insert!". And the reasoning is that currently the methods are not named
|with exclamation mark (!) if
|
|1) the name clearly indicates the receiving object is going to change and/or
|2) there's no corresponding method with implicit dup.
|
|In this case the name indicates the receiver is going to be changed, and
|there's no !-version (just like Array#delete_at doesn't have exclamation
|mark).
|
|My understanding is that a method which changes receiver can be named
|without !. But a method which does not change receiver can't be named with
|!.
|Also a method which has a non-receiver-mutating version too has to be named
|with !.
|
|I'm sure matz will correct me here, if I'm mistaken or the "rules" are
|incorrect. I'm also willing to create insert *and* insert! if it's thought
|to be necessary or the current way confusing. I don't feel so (yet :).

You're right. Did you scan my brain lately?

And:

In message "[ruby-talk:5721] Re: Array#insert"
on 00/10/21, Mark Slagell <m...@iastate.edu> writes:

|This is in fact one of the first things that startled me when learning
|about ruby. I'll give three reasons for wishing we were consistent about
|!/? naming:
|
|A. Least Surprise is not well served by the inconsistency, IMO. I'd love
|to be able to reliably know, when seeing the name of an unfamiliar
|method, whether it can be trusted not to change the receiver.
|
|B. Getting used to having garbage collection has made me more
|comfortable with programming in a functional style, as it's easier to
|reason about the behavior of non-destructive methods. So it is a nice
|convenience for me when destructive methods routinely turn out to have
|non-destructive equivalents. I guess what I'm saying is that "clearly
|indicates" is anything but clear much of the time, and is open to honest
|disagreement based on differing styles; for instance, you might not see
|a use for a non-destructive array insert whereas to me it might sound
|just as natural and sensible as non-destructive string concatenation.
|
|C. (related to B, but this applies even if you don't consistently prefer
|functional style) Naming a destructive method without the ! ties our
|hands - what can we do if we later change our mind about the need
|for a non-destructive equivalent?

Well, your resolution is consistent and simpler, but it makes Ruby
programs full of bang signs, which makes programmers (at least me)
unhappy. Although the current rule is more complex than you expect,
most human brains can handle it easily, I'm sure.

matz.

Mark Slagell

Oct 22, 2000, 3:00:00 AM
Yukihiro Matsumoto wrote:
...

>
> Well, your resolution is consistent and simpler, but it makes Ruby
> programs full of bang signs ...

Only for the programmers that use the destructive methods. :-)


Yukihiro Matsumoto

Oct 22, 2000, 3:00:00 AM
Hi,

In message "[ruby-talk:5743] Re: Array#insert"
on 00/10/22, Mark Slagell <msla...@iastate.edu> writes:

|> Well, your resolution is consistent and simpler, but it makes Ruby
|> programs full of bang signs ...
|
|Only for the programmers that use the destructive methods. :-)

Yes, but destructiveness is not bad in general, especially in
object-oriented programming, it's part of the nature.

matz.


Mark Slagell

Oct 22, 2000, 2:25:33 PM

>
> Yes, but destructiveness is not bad in general,

Sure, that I agree with. Otherwise I'd be a Lisp devotee I suppose, and
not hanging around here bothering people. :-)

> especially in
> object-oriented programming, it's part of the nature.

But that puzzles me. I don't view functional style and the OO paradigm
in any kind of conflict, unless one takes the former to its
bondage-and-discipline extremes; the flexibility of OO is such that it
supports a functional style at least as well as the parentheses-laden
languages we love to hate. To say "line=f.readline.chomp" instead of
"line=f.readline; line.chomp!" means to already be thinking
functionally, but it's all a matter of degrees.
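
The two styles contrasted here can be tried side by side in a small sketch. StringIO stands in for the file handle f from the message; the sample text and the names line2 and unchanged are made up for illustration.

```ruby
require "stringio"

# StringIO stands in for the file handle f from the message.
f = StringIO.new("first line\nsecond line\n")

line = f.readline.chomp        # functional style: chain on the returned value
p line                         # "first line"

line2 = f.readline             # imperative style: fetch, then modify in place
line2.chomp!
p line2                        # "second line"

unchanged = "no newline".dup.chomp!
p unchanged                    # nil
```

The last call shows one practical reason to prefer the first style when chaining: chomp! returns nil when it has nothing to remove, so its result cannot safely be fed to another method.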

I recognize that choices made about a language's syntax and libraries
both reflect and reinforce a particular programming style, or range of
styles -- it starts and ends with the author, with small influences from
the community along the way, and that's as it should be. A wish to be
able to stay a good distance away from Perl underlies much of my
squeaking here, speaking in the definite minority on most of these
issues. Ruby still seems like the closest thing out there to what I
want, so I don't mean to be pushy about it.

Mark

Aleksi Niemelä

Oct 22, 2000, 5:04:19 PM
matz:

> You're right. Did you scan my brain lately?

Well, I tried at least :). First the power my poor home-made machinery
consumes overloaded the national network, and caused a black moment during
the early morning hours. They managed to fix the outage, and to connect a few
big power plants from Germany too, but then it was the poor Wintoys which
crashed. Apparently you can't give it even a small task like measuring the
room temperature. You see, the scanner - the long-distance scanner
particularly - requires quite a stable environment.

After that there were amazing storms on the sun, and the results of those were
very visible. At Hokkaido, I mean. Very lively and bright Northern Lights,
Aurora Borealis. And of course, as you can guess, they effectively halted the
operation of my scanner, and we're just warming it up.

So, long story short: I had to make up the rules :).

Mark Slagell:


> |A. Least Surprise is not well served by the inconsistency,

First, I have to confess, Mark, that I fought with the issue too. Nowadays
I'm calmer, as after a few surprises one gets used to it :).

However, while you're right with your claim, you have to note the
inconsistency can be detected only against some rule set. And in this case
the rule set is quite simple, compact, and works well:

1) a plain method name doesn't say anything about its state-destructive
or state-preserving nature
2) '!' in the end of the method name wakes you up. Now we're really
modifying receiving object instead of the usual thing, where
we return a new modified object.

Thus when you read about SomeClass#act, you have to think if the name
indicates the state of the object is going to change. If the case is vague,
consult documentation.
When you see there's SomeClass#act!, you can be pretty sure there's plain
named, non-mutating version of the method, and usually it works like
someClassInstance.dup.act!.

So if your rule set was something different, like mine was, you probably
find yourself wondering why every state-modifying method isn't named with !.
The reason is that most methods modify the receiver's state, thus the code would
be full of exclamation marks. Matz saw this, and started to use another,
quite natural and little-surprising, rule set. The one in my previous mail,
and above.

Here Least Surprise was dismissed for the Friendly Source code. But it
didn't manage to go far away, before there was again room for it.

> |a use for a non-destructive array insert whereas to me it might sound
> |just as natural and sensible as non-destructive string concatenation.

Humm..first, it's better to use state-modifying or something else than
destructiveness to describe or call 'methods!'. Second, what are
non-destructive string concatenation or array insert? Is non-destructive
string concatenation an operation which doesn't change the state of the
string object, thus the only possible catenation would be aString + "" and
everything else rejected with an exception.

By the way did you notice how natural the previous aString.+("") looked.
Would you like to say aString.+!("")?

> |Naming a destructive method without the ! ties our hands

This is very good point!

Would you like to try what it's like to live in properly named world by
creating a somewhat complete aliasing module, changing current
state-modifying methods to have an exclamation mark etc.?

If there's some real code to see, it would be fun to evaluate if there's any
ground for the claim "the code would be polluted by '!'".

- Aleksi

Mark Slagell

Oct 22, 2000, 9:55:45 PM

Aleksi Niemelä wrote:

>
> First, I have to confess, Mark, that I fought with the issue too. Nowadays
> I'm calmer, as after few surprises one gets used to that :).
>
> However, while you're right with your claim, you have to note the
> inconsistency could be detected only against some rule set. And in this case
> the rule set is quite simple, compact, and well working:
>
> 1) plain method name doesn't say anything about state destructive
> or preserving nature
> 2) '!' in the end of the method name wakes you up. Now we're really
> modifying receiving object instead of the usual thing, where
> we return a new modified object.

This rule can be augmented in a way that makes everybody happy, even me,
and steps on nobody's toes. Skip down to the end of this message to see
my proposal, but first I'll respond to some of these good points...



> So if your rule-set was something different, like mine was, you probably
> find yourself thinking why every state-modifying method isn't named with !.
> The reason is that most methods modify receiver's state, thus the code would
> be full of exclamation marks. Matz did see this, and started to use another,
> quite natural and little surprising, rule set. The one in my previous mail,
> and above.
>
> Here Least Surprise was dismissed for the Friendly Source code. But it
> didn't manage to go far away, before there was again room for it.

Right. The reason most methods modify receiver's state is that Ruby
isn't being used by most of us as a fundamentally functional language.
Thing is, there's nothing in the essence of the language to _keep_ it
from also being used that way; ruby is really quite well suited for it,
and it comes down to a matter of what methods are supplied, as
functional-style code tends to rely more on constructors, and "indirect
constructors" (sorry, is there a technical name for that? I just mean
methods that return an object of the same type as the receiver, like
"succ" and many others). My problem with the naming rule as it stands
is that it indicates a limiting assumption about the kinds of code
people will write.

> > |a use for a non-destructive array insert whereas to me it might sound
> > |just as natural and sensible as non-destructive string concatenation.
>
> Humm..first, it's better to use state-modifying or something else than
> destructiveness to describe or call 'methods!'. Second, what are
> non-destructive string concatenation or array insert? Is non-destructive
> string concatenation an operation which doesn't change the state of the
> string object, thus the only possible catenation would be aString + "" and
> everything else rejected with an exception.
>
> By the way did you notice how natural the previous aString.+("") looked.
> Would you like to say aString.+!("")?

Maybe I'm missing something here. All I was trying to say is that
everybody recognizes the naturalness, and usefulness, of string
concatenation that doesn't modify the receiver, as in the "+" operator,
which returns a different string. It isn't so obvious to everyone that
it might be nice to have an array insert method that returns a shallow
copy of the array reflecting the results of the specified insertion. If
it were available, I would use it all the time in preference to the
other; maybe that's just me, but if not, then surely modification of the
receiver isn't so clearly indicated by the method name. I wouldn't
suggest there must only be non-destructive methods, that's not the
point.
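
What such a copy-returning insert might look like is sketched below. The helper name inserted is hypothetical, the position is assumed to be non-negative, and the nested arrays are only there to make the "shallow" part visible.

```ruby
# Hypothetical helper: a non-mutating insert built from dup plus a splice.
# Only non-negative positions are considered in this sketch.
def inserted(ary, pos, *objects)
  copy = ary.dup            # shallow copy: a new array holding the same element objects
  copy[pos, 0] = objects
  copy
end

original = [[1], [2], [3]]
result = inserted(original, 1, [9])
p original                 # [[1], [2], [3]]
p result                   # [[1], [9], [2], [3]]

result[0] << :shared       # the elements themselves are shared between both arrays
p original[0]              # [1, :shared]
```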



> > |Naming a destructive method without the ! ties our hands
>
> This is very good point!
>
> Would you like to try what it's like to live in properly named world by
> creating a somewhat complete aliasing module, changing current
> state-modifying methods to have an exclamation mark etc.?

No, I guess not. I mean, when I first looked at the usage of '!' I was
just convinced it was a mistake. You've convinced me that it is not -- I
don't agree with it, but I can also see it's not crazy. And of course
there's the problem of messing up existing code.

So, instead I suggest standardizing a way to name a non-state-modifying
method when the unbanged name is already in use, such as a trailing
underscore. Then we have:

1. ends in '!': receiver's state changes.
2. ends in '_': receiver's state guaranteed not to change.
3. Otherwise consult (cough, cough) common sense, or documentation.

If we don't like '_', we can make it something else. RFC?
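
To make rule 2 concrete, here is a minimal sketch. It assumes an Array#insert that modifies the receiver and returns it (as the built-in in later Ruby versions does). The name insert_ is hypothetical, taken from the proposal above, and is not part of Ruby. Freezing the array is used only to show which call tries to change the receiver.

```ruby
# Hypothetical: a trailing-underscore variant that never modifies the receiver.
class Array
  def insert_(index, *objs)
    dup.insert(index, *objs)   # work on a shallow copy, leave self alone
  end
end

original = [1, 2, 3].freeze          # freezing makes any modification raise
copy     = original.insert_(1, :x)   # fine: only the unfrozen copy is changed

modified_in_place =
  begin
    original.insert(1, :x)           # plain insert tries to change the receiver
    true
  rescue RuntimeError                # FrozenError is a RuntimeError subclass
    false
  end
```

Because dup returns an unfrozen shallow copy, insert_ works even on a frozen receiver, which is one way to check the "guaranteed not to change" promise.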

-- Mark

Aleksi Niemelä
Oct 22, 2000, 10:31:20 PM
> > By the way, did you notice how natural the previous
> > aString.+("") looked?
> > Would you like to say aString.+!("")?
>
> Maybe I'm missing something here. All I was trying to say is that
> everybody recognizes the naturalness, and usefulness, of string
> concatenation that doesn't modify the receiver, as in the "+"
> operator, which returns a different string.

Nope, it was me who misunderstood. Yup, what you say makes much sense.

> It isn't so obvious to everyone that it might be nice to have an
> array insert method that returns a shallow
> copy of the array reflecting the results of the specified
> insertion.

While it might not be obvious, as you say, there's one catch here. Assume
Array#insert would have been named Array#insert!:

class Array
  def insert(index, *objs)
    dup.insert!(index, *objs)
  end
end

But if you have only the non-state-modifying version, plain Array#insert,
there's no way to go in the other direction.
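
The derivation described here can be tried without touching the built-in Array. The following is only a sketch: MyList is a hypothetical subclass, and insert! is simply an alias for the modifying Array#insert that later Ruby versions provide.

```ruby
# Hypothetical subclass, so the built-in Array is left untouched.
class MyList < Array
  alias_method :insert!, :insert    # keep the modifying version under a banged name

  # Non-modifying version, derived from the modifying one.
  def insert(index, *objs)
    dup.insert!(index, *objs)
  end
end

list = MyList[1, 2, 3]
copy = list.insert(1, :x)   # list is unchanged, copy holds the result
list.insert!(0, :y)         # list itself now changes
```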

Now, if we're not going to implement two methods, it's clear we should have
the state-modifying version, as we can always create a non-state-modifying
version from it. The next thing is naming. As we are not going to implement
two methods, the current Ruby naming framework suggests plain insert.

After all the discussion so far, it all comes down to whether a
state-modifying insert is badly named when there's no sign of its
nature.

def update(anArray)
  @theArray.insert(@index, anArray)
end

For me it's obvious (or, I should say, I expect) that insert in the above code
changes @theArray. A side effect that created a new array, changed it,
and finally returned it would be dangerous.

If that side effect is needed, the code @theArray.dup.insert(@index, anArray)
communicates the programmer's intention explicitly.

Anyway, this discussion has broadened my view of Ruby naming issues.
Previously I considered that naming had to conform to the OO-centered world.
That world suggests plain names for state-modifying methods when the effect
is obvious.

> So, instead I suggest standardizing a way to name a

> 2. ends in '_': receiver's state guaranteed not to change.

I think the idea is quite good, as it isn't pervasive. Still, the difference
between

myInteresting.object.code.dup.partly_destroy.to_s and
myInteresting.object.code.partly_destroy_.to_s

isn't many characters, and the psychological distance shrinks further toward
the end of the line, so the coder has to be very careful.

- Aleksi

Mark Slagell
Oct 23, 2000, 12:48:33 AM
Aleksi Niemelä wrote:
>
> ...

> While it might not be obvious, as you say, there's one catch here. Assume
> Array#insert would have been named Array#insert!:
>
> class Array
>   def insert(index, *objs)
>     dup.insert!(index, *objs)
>   end
> end
>
> But if you have only the non-state-modifying version, plain Array#insert,
> there's no way to go in the other direction.

Yes, but I guess I don't think it has much to do with the method name
issue. (Unless, again, you hear me saying something that I don't mean
to be saying.) It does suggest an implementation guideline.

> def update(anArray)
>   @theArray.insert(@index, anArray)
> end
>
> For me it's obvious (or, I should say, I expect) that insert in the above code
> changes @theArray.

Context makes it clear, yes? But that's the readability of existing,
working code. If, without the benefit of an example like the above in
front of me, I write

bar = foo.insert(idx, obj)

... I've unwittingly created context that will frustrate me later.
(Okay, maybe you'd never write that line of code, but bear with me.)
The method name is recognized, the arguments are acceptable, the
interpreter doesn't complain, and, well, it seems like good code to me.
Context doesn't help a bit; if anything, staring at it just reinforces
the original mistake in my mind. I have to eventually trace my
program's misbehavior to a line of code that passed those routine
intuitive tests.
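
A short sketch of that pitfall, assuming an Array#insert that modifies the receiver and returns it (which is how the built-in in later Ruby versions behaves). The variable names follow the line above; the values are made up for illustration.

```ruby
foo = [1, 2, 3]
bar = foo.insert(1, :x)        # bar is foo itself, not a new array

same_object = bar.equal?(foo)  # identity check, not just equal contents
bar << 4                       # so this change also shows up in foo

independent = foo.dup.insert(0, :y)   # explicit copy: foo is not touched
```

Nothing here raises or warns, which is the point being made: the assignment to bar looks like it captured a new array, yet both names refer to one object.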

At the risk of belaboring the point -- hmm, maybe I'm about ready to
abandon this one too -- a convention saying that methods ending in '_'
are non-state-modifying means I'm likely to first try

bar = foo.insert_(idx, obj)

and if there is an undefined method error, I can either go straight to
the documentation to see what's what, or just try it again without the
'_'; but in the latter case my suspicions for this line of code are
already raised, and if I'm wrong, the error will be much more quickly
diagnosed as a result. Don't we all do essentially the same thing with
the '!' now and then? (or, again, is this just me?)

> Anyway, this discussion has broadened my view of Ruby naming issues.
> Previously I considered that naming had to conform to the OO-centered world.
> That world suggests plain names for state-modifying methods when the effect
> is obvious.
>
> > So, instead I suggest standardizing a way to name a
> > 2. ends in '_': receiver's state guaranteed not to change.
>
> I think the idea is quite good, as it isn't pervasive. Still, the difference
> between
>
> myInteresting.object.code.dup.partly_destroy.to_s and
> myInteresting.object.code.partly_destroy_.to_s
>
> isn't many characters, and the psychological distance shrinks further toward
> the end of the line, so the coder has to be very careful.
>
> - Aleksi

Sigh. I hadn't thought of how an underscore blends in with the
surroundings. It certainly doesn't grab the eye the way ! and ? do. So
the dup'ed version, as you've written it, is easier to read.

-- Mark

Mathieu Bouchard
Oct 23, 2000, 1:29:26 AM
> def update(anArray)
>   @theArray.insert(@index, anArray)
> end
> For me it's obvious (or, I should say, I expect) that insert in the above code
> changes @theArray. A side effect that created a new array, changed it,
> and finally returned it would be dangerous.

Ok, maybe I'm missing something, but why don't you write it:

@theArray[@index,0] = anArray

or is []= considered "unclear"? Then why is :[]= there in the
first place?

and if .insert(i,*e) is to be added, then why not .delete(i,n) as well?...
and .splice(i,n,*e) ???
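
For comparison, here is a sketch of how these operations look side by side. It assumes the Array#insert and Array#slice! that later Ruby versions provide; there is no built-in splice, so the last case uses []= with a non-zero length instead.

```ruby
# []= with length 0 inserts, but an array on the right-hand side is spliced in.
a = [1, 2, 3]
a[1, 0] = [4, 5]

# insert takes the objects as separate arguments, so an array stays nested.
b = [1, 2, 3]
b.insert(1, [4, 5])

# Removing n elements starting at index i: slice! returns what was removed.
c = [1, 2, 3, 4, 5]
removed = c.slice!(1, 2)

# "Splice": replace n elements at index i with any number of new ones.
d = [1, 2, 3, 4]
d[1, 2] = [:a, :b, :c]
```

The first two cases show the difference the thread started with: []= flattens one level of the right-hand side, while insert does not.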

matju


Conrad Schneiker/Austin/Contr/IBM
Oct 23, 2000, 3:00:00 AM
Mark Slagell wrote:

# Sigh. I hadn't thought of how an underscore blends in with the
# surroundings. It certainly doesn't grab the eye the way ! and ? do. So
# the dup'ed version, as you've written it, is easier to read.

Well, if it's any consolation, I brought up the ! issue a while back too.

I am still unhappy with it, because it seems like it "should" follow your
previously proposed scheme. It seems somewhat Perlish to me in terms of
not having a good general idea of what you are looking at until you look
up the pieces.

I guess I would now most prefer the dup solution, and retroactively I
would have preferred that the ! had originally indicated the
((supposedly) more exceptional!) nondestructive version. (I think someone
mentioned that the ! idea came from Scheme, and IIRC it was also mentioned
that, unlike Ruby, Scheme only/always uses ! for destructive methods.)

If it weren't for backward compatibility, something a little more compact
and a little more visually distinguished such as .!. that could be used
instead of .dup. might be nice.

Conrad Schneiker
(This note is unofficial and subject to improvement without notice.)


Yukihiro Matsumoto
Oct 23, 2000, 3:00:00 AM
Hi,

In message "[ruby-talk:5756] Re: Array#insert"
on 00/10/23, Mark Slagell <msla...@iastate.edu> writes:

|> Yes, but destructiveness is not bad in general,
|
|Sure, that I agree with. Otherwise I'd be a Lisp devotee I suppose, and
|not hanging around here bothering people. :-)

I don't call that language with an incredible amount of parentheses
functional. See rplaca, for instance.

|> especially in object-oriented programming, it's part of the nature.
|
|But that puzzles me.

Sometimes. English puzzles me far more often, though ;-)

matz.


Mark Slagell
Oct 23, 2000, 3:00:00 AM

Hmmm....

~/dl/ruby-1.6.1:>diff parse.c~ parse.c
6894c6894
< if ((c == '!' || c == '?') && is_identchar(tok()[0]) && !peek('=')) {
---
> if ((c == '!' || c == '?' || c == '~') && is_identchar(tok()[0]) && !peek('=')) {

