[ruby-core:18008] Improving the metaprogramming facilities of Ruby.

Daniel Pittman

Jul 27, 2008, 8:35:41 PM

to ruby...@ruby-lang.org

G'day. I hope this is the right place to introduce this discussion,
as the ruby-lang.org website is not entirely clear on where language
design and specification discussion takes place.

If this is the wrong place then, please, be forgiving and direct me to
the correct venue to address this.

Also, if this has been previously discussed then please direct me to the
appropriate threads. I searched the mailing list but turned up nothing
obvious.[1]

Finally, I am sorry for the length of this message. I wanted to be as
clear and concise as possible in presenting this, to try and avoid
confusion. Clarity led to more text than I expected, though.

I have recently been working on a Ruby / ERB template issue where,
essentially, we wanted to have the ERB templates run in an environment
that was almost, but not exactly, the same as Ruby.

Specifically, in the problem domain undefined variables are forbidden,
and we want to expose that rule to our ERB templates.

This was solved, previously, using 'method_missing', and pretending that
variables were actually local methods; that was a losing proposition as
soon as a variable named 'raise' came along. ;)
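
A minimal sketch of that approach and of how it breaks. The class name
StrictScope and its evaluate helper are hypothetical, made up for this
illustration; they are not the original application code.

```ruby
# Hypothetical stand-in for a template scope that forbids undefined names.
class StrictScope
  def initialize(vars)
    @vars = vars
  end

  # Run a block with self set to this scope, so bare names become method calls.
  def evaluate(&block)
    instance_eval(&block)
  end

  # Treat a bare name as a variable lookup; unknown names fall through to
  # the default behaviour, which raises NameError.
  def method_missing(name, *args)
    return @vars[name] if args.empty? && @vars.key?(name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @vars.key?(name) || super
  end
end

scope = StrictScope.new({ title: "hello", raise: "oops" })

scope.evaluate { title }   # resolved through method_missing

begin
  scope.evaluate { undefined_name }
rescue NameError => e
  puts "rejected: #{e.name}"
end

# 'raise' never reaches method_missing, because Kernel#raise already exists.
begin
  scope.evaluate { raise }
rescue RuntimeError
  puts "Kernel#raise ran instead of the variable lookup"
end
```

Any variable whose name matches an existing method (raise, puts, p, and
so on) is shadowed in the same way.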

It could also be solved with an 'instance_variable_missing' method,
analogous to the 'method_missing' method, but thinking about that
brought me to thinking about metaprogramming in Ruby overall.

At the moment Ruby has two distinct and parallel methods for managing
instance variables[2], being:

@foo
instance_variable_get("@foo")

I had, initially, expected that I could simply define a method,
'instance_variable_get', on my object which would allow me to override
the standard ivar lookup process for our own.

What I found, to my surprise, was that @foo acts directly on the ivar
table, accessing or setting the instance variable directly. The
instance_variable_get method does the same, and there is no actual
relationship between the two.
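
The independence of the two paths can be seen with a small sketch; the
class name Probe is hypothetical and exists only for this illustration.

```ruby
# Hypothetical class: overrides instance_variable_get but also reads @foo directly.
class Probe
  def initialize
    @foo = 42
  end

  # Overrides only the symbolic accessor.
  def instance_variable_get(name)
    :overridden
  end

  # Reads the ivar with the @ syntax, which does not consult the method above.
  def direct
    @foo
  end
end

probe = Probe.new
puts probe.direct.inspect                        # value stored in the ivar table
puts probe.instance_variable_get(:@foo).inspect  # value returned by the override
```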

So, after thinking about this and how I would have expected Ruby to
handle the situation, I have two questions:

First, is there a reason why these two operations are distinct?

In consulting the Ruby 1.8 and 1.9 code I see that rb_obj_ivar_get in
object.c and the handling of NODE_IVAR in eval.c both call directly into
the rb_ivar_get function.

Other than differences in error checking[3] both seem to be more or less
identical wrappers over the rb_ivar_get function.

To validate this I experimentally patched eval.c to call the
instance_variable_get (and set) methods rather than directly applying
the call; the results of that (and my tiny test code) are attached.

This seemed to support my theory: both methods seem identical, in
effect, and Ruby continues to pass the packages test suite with the
changes in place.

Second, would it be acceptable to change Ruby (1.9, and ideally 1.8) so
that these two paths /were/ uniform, at least in principle?

I am, obviously, happy to invest the time to write the needed code and
testing, and I am willing to work to see this happen. (eg, to seek
agreement on the specification of the changes, etc, and implement to
something that the various Ruby implementers all considered valid.)

To this effect I have included a rough draft of the specification I
think would be appropriate for this small area of Ruby. This is
intended to strike an appropriate balance between the users and the
implementers in terms of performance and power.

One of the specific goals I have kept in mind in the proposed
modification is that the implementation should be free to optimize the
behaviour of direct (@foo) and indirect (instance_variable_get) access
to ivar storage.

I believe that a conforming implementation could ensure that these
methods directly accessed the ivar table rather than making a method
call without too much trouble, while still supporting the user in
overriding the methods; a combination of caching and a sufficiently
smart compiler should be able to ensure safety.
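
As a rough illustration of that idea, written in Ruby rather than inside
an interpreter: check whether the receiver still uses the built-in
method, and only dispatch when it does not. The helper read_ivar and the
classes Plain and Hooked are hypothetical; a real implementation would
do this check in its own internals and cache the result.

```ruby
# The built-in method, captured once; this plays the role of the cached fast path.
BUILTIN_IVAR_GET = Object.instance_method(:instance_variable_get)

# Hypothetical helper modelling what '@name' could do under the proposal.
def read_ivar(obj, name)
  if obj.method(:instance_variable_get).owner == BUILTIN_IVAR_GET.owner
    # No override: stands in for a direct ivar-table read.
    BUILTIN_IVAR_GET.bind(obj).call(name)
  else
    # Override present: make a full method call.
    obj.instance_variable_get(name)
  end
end

class Plain
  def initialize; @x = 1; end
end

class Hooked
  def initialize; @x = 1; end
  def instance_variable_get(name); [:hooked, name]; end
end

puts read_ivar(Plain.new, :@x).inspect
puts read_ivar(Hooked.new, :@x).inspect
```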

In the text I have used the following terms:

"the results ... are undefined":

Ruby code may do this, but the behaviour is not guaranteed by Ruby
in any fashion. The code may behave as its author expects, or it may cause
an exception, cause Ruby to malfunction, or any other result.

Implementers are encouraged to document the behaviour and limits of
their implementation, as well as to provide sensible results when
undefined actions are taken, such as raising an appropriate
exception.

"conforming code":

Ruby code that is written within the defined behaviour of the core
system, rather than code relying on the behaviour of any particular
implementation of Ruby.

The documentation below covers the range of Object instance_* methods
associated with ivars; the quoted text is the Ruby 1.9 documentation,
and my proposed additions are included.

Editorial comments -- the "why" I added the text are in square brackets,
such that:

> This is Ruby 1.9 documentation.

This is my modified text. [This is my editorial comment on my text.]

The modified method specifications:

> obj.instance_variables => array
>
> Returns an array of instance variable names for the receiver. Note
> that simply defining an accessor does not create the corresponding
> instance variable.

The array returned from this method may not be modified by the caller;
the results of doing so are undefined.

[This text was added so that anyone overriding this method knows that
they can safely return an internal array of names without risk of it
being modified -- for performance reasons, mostly.]

> obj.remove_instance_variable(symbol) => obj
>
> Removes the named instance variable from obj, returning that
> variable's value.

This method may or may not be called during object destruction;
conforming code should not rely on this to perform variable
finalization.

[Added so that implementers are guaranteed to be free of the burden of
calling this during object destruction, and to ensure users know where
they stand.]

> obj.instance_variable_defined?(symbol) => true or false
>
> Returns <code>true</code> if the given instance variable is defined in
> obj.

If instance_variable_get or instance_variable_set are defined then it is
the responsibility of the user to ensure that this method also
returns accurate results. The results are undefined if this rule is not
respected.

[Added so that Ruby libraries can assume that this method and the
get/set methods are symmetric, and to make it clear to the user that
they are expected to preserve this behaviour.]

> obj.instance_variable_get(symbol) => obj
>
> Returns the value of the given instance variable, or nil if the
> instance variable is not set.

This method is called by Ruby when you write <code>@variable</code>
within the scope of your object methods. You can override this method
to change the rules for instance variable lookup, as long as you abide
by the specified rules.

[Added to make it clear that the user /is/ permitted to override this
method, and that this will result in *all* ivar references calling
their method.]

> The <code>@</code> part of the variable name should be included for
> regular instance variables. Throws a <code>NameError</code> exception
> if the supplied symbol is not valid as an instance variable name.

The default implementation of this method returns <code>nil</code> if
the instance variable has not been set. Conforming code is free to
override this in user defined classes, but the results of overriding
this method in system classes are not defined.

[This is added to make it clear to the user that, for example, returning
false rather than nil from Object.instance_variable_get would be a bad
idea, and to allow the implementer and third party library authors to
assume the standard behaviour in core classes.]

Care should be taken to avoid recursion in user supplied implementations
of this method, as the results of directly referencing an instance
variable inside the method are undefined.

[This is added so that the user, and the implementer, know that they can
just infinitely recurse in the face of the pathological code:

class BadlyBroken; def instance_variable_get name; @recurse; end; end ]

> obj.instance_variable_set(symbol, obj) => obj
>
> Sets the instance variable names by symbol to object,
> thereby frustrating the efforts of the class's author to attempt to
> provide proper encapsulation.

[This is the original text, and is probably not terribly appropriate
after the changes here, but I preserve it for the moment.]

> The variable did not have to exist prior to this call.

[The preceding should be replaced with:]

This method is called to set the initial, and all subsequent, values of
the instance variable. The variable will not exist prior to setting the
initial value.

Ruby will call this method whenever an instance variable is set,
including for <code>@variable = value</code> expressions within the
scope of your object.

[Added to make it clear this is always called by Ruby to set a
variable.]

Conforming code may override this method, provided the specified
behaviour is implemented. However, the results of overriding this
method on core Ruby classes are undefined.

[Added to make it clear what the user can, and can't, do.]

Care should be taken to avoid recursion in this method; the results of
directly setting an instance variable within this method are undefined,
and you should call your superclass instance_variable_set method
instead.

[Added to avoid forcing implementers to support pathological code:

class BadlyBroken
  def instance_variable_set name, value; @recurse = true; end
end

Implementers are free to support that, of course, if they wish, but
conforming code cannot assume that it will do anything remotely
sensible.]

Regards,
Daniel

Footnotes:

[1] Since I have not previously contributed to the Ruby core I am
somewhat nervous about making such a large proposal for the
language, and I would really like to avoid offending anyone.

[2] This is also true of pretty much every other type, such as
constants, methods, and so forth. Assuming that the general
principles of this message are accepted I intend to also address
those areas for consistency.

[3] Which, in eval.c, is handled at a higher level, so both perform
identical checks and have an identical environment, as far as I can
tell.

0000-sample-ivar-hack.patch

meta.rb

Daniel Pittman

Jul 28, 2008, 12:08:57 AM

to ruby...@ruby-lang.org

"David A. Black" <dbl...@rubypal.com> writes:
> On Mon, 28 Jul 2008, Daniel Pittman wrote:

Thank you for your response.

>> At the moment Ruby has two distinct and parallel methods for managing
>> instance variables[2], being:
>>
>> @foo
>> instance_variable_get("@foo")
>>
>> I had, initially, expected that I could simply define a method,
>> 'instance_variable_get', on my object which would allow me to override
>> the standard ivar lookup process for our own.
>>
>> What I found, to my surprise, was that @foo acts directly on the ivar
>> table, accessing or setting the instance variable directly. The
>> instance_variable_get method does the same, and there is no actual
>> relationship between the two.
>

> I remember having a similar expectation with regard to Array#[] and
> #[]= -- namely, thinking that if I overrode them, I could hook into
> #push, #pop, and so forth. That doesn't happen mainly for optimization
> reasons, in that particular case.

I suspect that the same is true here: method calls are not free, and a
large part of what I was trying to do here was strike a balance between
user freedom -- typically by overriding methods -- and implementer
freedoms, including avoiding method lookup / call for performance
reasons.
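
The Array case mentioned above can be checked with a short sketch. The
subclass name LoggedArray is hypothetical; the behaviour shown is that
of MRI, where #push writes to the array without going through #[]=.

```ruby
# Hypothetical subclass that records every index written through []=.
class LoggedArray < Array
  def []=(index, value)
    (@writes ||= []) << index
    super
  end

  def writes
    @writes || []
  end
end

a = LoggedArray.new
a[0] = :x     # goes through the override
a.push(:y)    # does not call []= in MRI

puts a.writes.inspect
```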

>> So, after thinking about this and how I would have expected Ruby to
>> handle the situation, I have two questions:
>>
>> First, is there a reason why these two operations are distinct?
>

> I don't know the exact rationale but I've always assumed that
> instance_variable_[gs]et exists simply to allow the operation to take
> place outside of the object, and/or symbolically. I've never thought
> of it as a hook on the basic variable assignment.
>
> In fact, one question that occurs to me is: if you want to use this as
> a hook, but also need the current functionality, what would you do?

super; the trivial example of a "tracked variable" class would be:

$ivarlog = []
class TracedIvar < Object
  def instance_variable_set(name, value)
    $ivarlog << "set #{name} to #{value} in #{self}"
    return super(name, value)
  end
end

Calling your superclass method allows you to wrap functionality around
it to extend the default behaviour, but still use the default
implementation.

(In CLOS, from which I draw some inspiration, I would use a :before
method, so that I didn't have to explicitly call super, but this
provides the same functionality.)
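
For readers on later Ruby versions, something close to a :before method
can be approximated with Module#prepend, which was added in Ruby 2.0 and
so postdates this message. This is only a sketch: the names TraceBefore
and Account are hypothetical, and the hook fires only for explicit
instance_variable_set calls, since current Ruby does not route
'@x = value' through the method.

```ruby
# Hypothetical module acting as a ":before" hook around the built-in setter.
module TraceBefore
  LOG = []

  def instance_variable_set(name, value)
    LOG << "set #{name} to #{value.inspect}"  # the "before" part
    super                                     # then the default implementation
  end
end

class Account
  prepend TraceBefore
end

acct = Account.new
acct.instance_variable_set(:@balance, 10)

puts TraceBefore::LOG.inspect
puts acct.instance_variable_get(:@balance).inspect
```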

> I can imagine the usefulness of the hook, possibly, but I wouldn't
> want to lose the simple symbolic assignment.

*nod*

>> Second, would it be acceptable to change Ruby (1.9, and ideally 1.8) so
>> that these two paths /were/ uniform, at least in principle?
>

> I would hope not 1.8. 1.8.7 is already further from 1.8.6 than any
> other one right-hand digit change I can remember. I think it would be
> better not to widen that gap, but to concentrate on the 1.9 stuff.

Fair enough. As far as I can see this would be a backward compatible
change, but it is reasonably radical, and the followup extending this
elsewhere in the code would be more radical.

I don't have a very strong opinion here because I don't understand the
Ruby release policy well -- I just know it would be convenient to have
this functionality in the application I am working on. ;)

[...]

>>> obj.instance_variables => array
>>>
>>> Returns an array of instance variable names for the receiver. Note
>>> that simply defining an accessor does not create the corresponding
>>> instance variable.
>>
>> The array returned from this method may not be modified by the caller;
>> the results of doing so are undefined.
>>
>> [This text was added so that anyone overriding this method knows that
>> they can safely return an internal array of names without risk of it
>> being modified -- for performance reasons, mostly.]
>

> There's a difference, though, between its being truly immutable, and
> being mutable with no defined effect (which could mean disastrous
> effect).

Yes, I agree.

> I'm not sure but I don't think anyone is returning anything but new
> array objects for introspective methods like
> #instance_variables, #methods, and so forth.

Nor do I. On the other hand, I have never written a Ruby
interpreter, so I don't actually /know/ what optimization this might
make possible by permitting the internal value to be returned.

I have no strong opinion on this remaining, or stipulating that it
should be a new array -- but I think it is important that the end user
know if they can mutate this object destructively or not.

(I hit this issue with Hash.each -- it isn't specified what effect
deleting a key while inside the each block will have, so I had to
assume that some implementation, somewhere, might have nasty
side-effects if I did that.)
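
One way to turn "may not be modified" into something the caller can
observe is to return a frozen array, so that mutation fails loudly
instead of having an undefined effect. A sketch, with the hypothetical
class name Catalog; this is an illustration, not part of the proposed
text.

```ruby
# Hypothetical class that hands out its internal name list without copying it.
class Catalog
  NAMES = [:@title, :@author].freeze

  def instance_variables
    NAMES
  end
end

names = Catalog.new.instance_variables

begin
  names << :@extra
rescue RuntimeError => e
  # FrozenError on Ruby 2.5 and later, which is a RuntimeError subclass.
  puts "mutation refused: #{e.class}"
end
```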

>>> obj.remove_instance_variable(symbol) => obj
>>>
>>> Removes the named instance variable from obj, returning that
>>> variable's value.
>>
>> This method may or may not be called during object destruction;
>> conforming code should not rely on this to perform variable
>> finalization.
>>
>> [Added so that implementers are guaranteed to be free of the burden of
>> calling this during object destruction, and to ensure users know where
>> they stand.]
>

> Is this something that users would even have to know or worry about?

In writing this specification I tried to think, "if I were a great fool,
what assumptions might I make about this code?"

One assumption was that the destruction sequence of an object might call
this to get rid of each variable, and that I could use that to destroy
my external ivar storage.

That is (in my opinion) an unfair burden to force on the implementer by
default, and a user desiring this can implement it themselves using a
finalizer.

I think it helps give the implementer more freedom, and the user more
safety, but I would not object too strongly to the text being dropped
from the specification.

I should, perhaps, also note that I would be happy to see the help for
the function drop this text, and it to migrate into an appropriate
language / standard class specification.
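
A sketch of the finalizer route mentioned above, using
ObjectSpace.define_finalizer. The class ExternalRecord and its STORE
hash are hypothetical stand-ins for real external ivar storage. When the
finalizer actually runs is up to the garbage collector, so the example
invokes the cleanup proc by hand to show its effect.

```ruby
class ExternalRecord
  STORE = {}   # hypothetical stand-in for external ivar storage

  # Build the cleanup proc in a class method so that it does not capture
  # the instance; a finalizer that references its own object keeps it alive.
  def self.cleanup_for(key)
    proc { |_object_id| STORE.delete(key) }
  end

  def initialize(key)
    @key = key
    STORE[key] = {}
    ObjectSpace.define_finalizer(self, self.class.cleanup_for(key))
  end
end

record = ExternalRecord.new(:session_1)
puts ExternalRecord::STORE.key?(:session_1)

# Simulate what the finalizer will do once the object is collected.
ExternalRecord.cleanup_for(:session_1).call(0)
puts ExternalRecord::STORE.key?(:session_1)
```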

>>> obj.instance_variable_defined?(symbol) => true or false
>>>
>>> Returns <code>true</code> if the given instance variable is defined in
>>> obj.
>>
>> If instance_variable_get or instance_variable_set are defined then it is
>> the responsibility of the user to ensure that this method also
>> returns accurate results. The results are undefined if this rule is not
>> respected.
>>
>> [Added so that Ruby libraries can assume that this method and the
>> get/set methods are symmetric, and to make it clear to the user that
>> they are expected to preserve this behaviour.]
>

> Is this likely to be an issue?

On the greater fool theory, yes. In the wild, foolish days of my youth
I occasionally implemented the equivalent of a Ruby class where two
objects would claim identity, but would return a different hash value.

(eg: A == B, but A.hash != B.hash)
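
For comparison, a sketch of the consistent version of that contract,
with a hypothetical Point class: objects that compare equal also report
the same hash value, so they behave as the same Hash key.

```ruby
# Hypothetical value class keeping ==, eql? and hash in agreement.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x, @y = x, y
  end

  def ==(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias eql? ==

  # Derived from the same fields that == compares.
  def hash
    [x, y].hash
  end
end

table = { Point.new(1, 2) => :found }
puts table[Point.new(1, 2)].inspect
```

Dropping the hash method while keeping == reproduces the mistake
described above: the lookup would typically miss, because Hash checks
the hash value before it ever calls eql?.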

On the same basis I can see someone forgetting to implement this if they
change _get and _set, and then getting surprising results.

> Presumably the instance_variable_set(sym,value) method would, whatever
> else it did, set the instance variable to the value, and that the
> instance variable would then be defined. What would be an example of
> how to violate this expectation?

A class that stored ivar values in an external database, such as a
shared memory block or an SQL datastore, rather than in core.

This would be valuable for cross-process object sharing, and could be
implemented as some approximation of:

class ExternalStore < Object
  def instance_variable_get(name); $db.get(name); end
  def instance_variable_set(name,value); $db.set(name, value); end
  def instance_variable_defined?(name); $db.defined?(name); end
  # the obvious implementations of the list and delete operations
end

In this instance the methods would work as expected; you could obtain
the list of instance variables, access them, modify them, and so forth
without ever needing in-core storage of those values.

If you neglected to implement the _defined? method, or the _variables
method, then you would end up in a situation where the values accessed
by _set and _get were not the same as the values accessed via _defined?
and _variables.

This is, I think, user error, but providing some guidance to the user
about what additional methods they might need to implement is
beneficial.
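
A runnable sketch of the complete, mutually consistent set of methods,
with an in-memory Hash standing in for the external database (an
assumption made purely so that the example runs). Under current Ruby
these overrides are reached only by explicit instance_variable_* calls,
not by '@name' syntax.

```ruby
class ExternalStore
  def initialize(db = {})
    @db = db   # ordinary ivar holding the (stand-in) external store
  end

  def instance_variable_get(name)
    @db[name.to_sym]
  end

  def instance_variable_set(name, value)
    @db[name.to_sym] = value
  end

  def instance_variable_defined?(name)
    @db.key?(name.to_sym)
  end

  # The list and delete operations, kept in step with get and set.
  def instance_variables
    @db.keys
  end

  def remove_instance_variable(name)
    @db.delete(name.to_sym)
  end
end

store = ExternalStore.new
store.instance_variable_set(:@colour, "red")
puts store.instance_variable_defined?(:@colour)
puts store.instance_variables.inspect
puts store.remove_instance_variable(:@colour).inspect
puts store.instance_variable_defined?(:@colour)
```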

>>> obj.instance_variable_get(symbol) => obj
>>>
>>> Returns the value of the given instance variable, or nil if the
>>> instance variable is not set.
>>
>> This method is called by Ruby when you write <code>@variable</code>
>> within the scope of your object methods. You can override this method
>> to change the rules for instance variable lookup, as long as you abide
>> by the specified rules.
>>
>> [Added to make it clear that the user /is/ permitted to override this
>> method, and that this will result in *all* ivar references calling
>> their method.]
>

> I know this is the heart of your proposal.

Yes: most of the rest of this is to support this core facility.

> I still have the issue with it that if it does change the rules for
> instance variable lookup, and I still want a way that uses the old
> rules but does so symbolically, I would have a problem.

I don't entirely agree with you here.

This change permits the end user to override these rules for a class
tree, but is explicit that this implementation must abide by the
specification for the function.

Like any inheritance this could be used to surprise others (eg:
class.each does not accept a block), but it can also be used to make
sensible extensions to Ruby (eg: DatabaseResult.each acts on each row in
the returned data.)

As far as your use is concerned the actual storage location of the ivar
is a black box: you call the method, and it obtains the value from
somewhere.

Having a Ruby programmer override this in Ruby would make no more
difference than the MRI implementation changing to use a different
storage method and adapting the internal implementation of ivar access.

Both are black boxes that behave according to the specification; your
code can safely depend on the external interface but not on the internal
implementation.

(Well, you can even depend on the implementation, but that is going to
cause you grief if your code is, for example, ever run on another Ruby
implementation.)

>>> The <code>@</code> part of the variable name should be included for
>>> regular instance variables. Throws a <code>NameError</code> exception
>>> if the supplied symbol is not valid as an instance variable name.
>>
>> The default implementation of this method returns <code>nil</code> if
>> the instance variable has not been set. Conforming code is free to
>> override this in user defined classes, but the results of overriding
>> this method in system classes are not defined.
>>
>> [This is added to make it clear to the user that, for example, returning
>> false rather than nil from Object.instance_variable_get would be a bad
>> idea, and to allow the implementer and third party library authors to
>> assume the standard behaviour in core classes.]
>

> I can see some really serious problems here. Mainly, it would no
> longer be possible to assume anything about instance_variable_[gs]et,
> so basically it would not be usable.

I strongly disagree here: you can depend on the stated interface, which
covers how these methods relate to each other.

This transforms ivar storage into an OO interface, like many others,
where the user can change the implementation.

While they can probably also change the semantics, that is a mistake,
and not one we can protect the user from -- unless we prevent users from
overriding core methods.

(Heck, I can override __id__ if I want, which warns, but allows me to
mutate this very, very core method -- even Object#__id__!)

> I know that it's sort of "impolite" to use those methods from outside
> anyway, but it's always been possible and it's always meant a certain
> thing.

I disagree here, also: I can implement instance_variable_get in my class
today, and any call to that will hit my method, not the core method.

So, without reading the source (or my statement) you cannot guarantee
that these two statements are equivalent:

NastyObject.instance_variable_get(:@value)
NastyObject.instance_eval("@value")

My implementation could be, today, this:

class NastyObject
  @value = "you can't touch this!"
  def self.instance_variable_get(name); "ha-ha!"; end
end

So, any call to that method is /already/ a risky call, and assumes
details of the object implementation that may not be true.
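
For completeness, current Ruby does give an outside caller a way to
reach the built-in lookup even when a class shadows it: take the
original method from Object as an UnboundMethod and bind it to the
receiver. A sketch, with the hypothetical class name Shadowed:

```ruby
# Hypothetical class that shadows the symbolic accessor.
class Shadowed
  def initialize
    @value = "real"
  end

  def instance_variable_get(name)
    "ha-ha!"
  end
end

obj = Shadowed.new

# The built-in implementation, fetched without going through obj's class.
builtin_get = Object.instance_method(:instance_variable_get)

puts obj.instance_variable_get(:@value)       # the shadowing method
puts builtin_get.bind(obj).call(:@value)      # the built-in lookup
```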

> If it can now mean anything, then you'd literally never want to do it.

As stated, I don't believe this changes the available semantics. If
anything it actually makes it /safer/ to do, because now you *are*
guaranteed that @value and instance_variable_get(:@value) end up using
the same method to access their storage.

> That comes back to the issue with not wanting to lose the old
> functionality in the interest of the new.

I tend to agree, at the level of an individual class: you should be able
to safely continue to use these instance_variable_* methods externally
without any change to your code, no matter what the end user does.

(Likewise, much of the wording here is to ensure that the implementer
can do the same, by telling the user that they will cause grief by
breaking certain permitted assumptions about core behaviour.)

I disagree that this would change the rules of the game in general,
though, since I don't believe it actually makes your method calls any
more risky.

I would also disagree with the assertion that someone *outside* an
object should be able to access the superclass defined ivar storage,
rather than the class defined ivar storage.

(eg: in ExternalStore above, I do not agree that anyone outside the
class should be able to bypass the method implementations that access
the database, for the standard OO encapsulation reasons.)

> What you're describing sounds like methods that would be good
> candidates for being protected -- and, indeed, I think they should be
> if they can be redefined. But that leaves a gap in the language.

I absolutely do not agree here. This change makes these a *more*
uniform and reliable interface, not less. This makes it safer for
external code to use them.

>>> obj.instance_variable_set(symbol, obj) => obj

[...]

> Again, I think this all points to the appropriateness of making these
> methods protected if they do what you're envisioning.

I don't believe there is any fundamental difference in argument between
this and the previous method, so will leave addressing anything here for
the moment.

>> [Added to make it clear what the user can, and can't, do.]
>>
>> Care should be taken to avoid recursion in this method; the results of
>> directly setting an instance variable within this method are undefined,
>> and you should call your superclass instance_variable_set method
>> instead.
>

> Does that mean that the setting of the instance variable would not
> automatically happen?

Correct: one of the standard uses of this would be to support external
storage for ivar content, not internal storage.

> That seems odd to me. Also, if you have to explicitly call 'super',
> then you're not really doing an override; at most, you're chaining
> some behavior in. (Which is often a good approach, but in this case it
> would limit you.)

Perhaps my wording is unclear there; you are not obliged to call super,
but you cannot assume that '@ivar = value' will do *anything* but call
the same method within the implementation of instance_variable_set.

If you choose to store the value somewhere external then, by all means,
do not call super in your implementation.

(Some uses of this functionality are best suited for traditional
multimethod implementations, where you can use a :before method rather
than the :around methods that Ruby offers.

Part of the trade-off there is that Ruby implementers who want :before
have to implement :around and manually call the main implementation.
On the other hand, Ruby method calls are more predictable. Such is life.)

> In sum, my concerns include:

In sum, also, my answers:

> 1. losing the current, very useful behavior of simple get/set
> operations performed symbolically;

I believe this *increases*, not decreases, the safety of using the
instance_variable_* methods outside the class implementation, by making
them consistent with internal '@value' style ivar access.

> 2. the effectively "undefined" behavior of doing these overrides
> for any class (not just core classes) whose internals you don't
> control (since the class's implementer may be depending on a
> different behavior);

I can understand that; it is both true and fair to say that overriding
fundamental assumptions of another class is risky.

However, Ruby already allows this, and I don't believe we are going to
substantially increase the user or implementer visible risk with this
change.

> 3. the issue of whether or not instance_variable_set(sym,value)
> might actually not even set the instance variable, which raises
> questions of how the overrides relate to their 'super' versions,
> and therefore how much one would really be able to do with them.

I hope I have clarified this as being bad wording on my part:

Any implementation of instance_variable_set that returns successfully,
and which does not set the value as far as instance_variable_get, et al,
are concerned, causes undefined behaviour.

Thank you very much for taking the time to address this suggestion
seriously, and I very much appreciate your comments.

Regards,
Daniel

David A. Black

Jul 27, 2008, 10:46:18 PM

to ruby...@ruby-lang.org

Hi --

On Mon, 28 Jul 2008, Daniel Pittman wrote:

> At the moment Ruby has two distinct and parallel methods for managing
> instance variables[2], being:
>
> @foo
> instance_variable_get("@foo")
>
> I had, initially, expected that I could simply define a method,
> 'instance_variable_get', on my object which would allow me to override
> the standard ivar lookup process for our own.
>
> What I found, to my surprise, was that @foo acts directly on the ivar
> table, accessing or setting the instance variable directly. The
> instance_variable_get method does the same, and there is no actual
> relationship between the two.

I remember having a similar expectation with regard to Array#[] and
#[]= -- namely, thinking that if I overrode them, I could hook into
#push, #pop, and so forth. That doesn't happen mainly for optimization
reasons, in that particular case.

> So, after thinking about this and how I would have expected Ruby to
> handle the situation, I have two questions:
>
> First, is there a reason why these two operations are distinct?

I don't know the exact rationale but I've always assumed that
instance_variable_[gs]et exists simply to allow the operation to take
place outside of the object, and/or symbolically. I've never thought
of it as a hook on the basic variable assignment.

In fact, one question that occurs to me is: if you want to use this as
a hook, but also need the current functionality, what would you do? I
can imagine the usefulness of the hook, possibly, but I wouldn't want
to lose the simple symbolic assignment.

> Second, would it be acceptable to change Ruby (1.9, and ideally 1.8) so
> that these two paths /were/ uniform, at least in principle?

I would hope not 1.8. 1.8.7 is already further from 1.8.6 than any
other one right-hand digit change I can remember. I think it would be
better not to widen that gap, but to concentrate on the 1.9 stuff.

(Then again, I don't have any standing to say anything about that,
other than to give an opinion in passing, if that.)

There's a difference, though, between its being truly immutable, and
being mutable with no defined effect (which could mean disastrous
effect). I'm not sure but I don't think anyone is returning anything
but new array objects for introspective methods like
#instance_variables, #methods, and so forth.

>> obj.remove_instance_variable(symbol) => obj
>>
>> Removes the named instance variable from obj, returning that
>> variable's value.
>
> This method may or may not be called during object destruction;
> conforming code should not rely on this to perform variable
> finalization.
>
> [Added so that implementers are guaranteed to be free of the burden of
> calling this during object destruction, and to ensure users know where
> they stand.]

Is this something that users would even have to know or worry about?

>> obj.instance_variable_defined?(symbol) => true or false
>>
>> Returns <code>true</code> if the given instance variable is defined in
>> obj.
>
> If instance_variable_get or instance_variable_set are defined then it is
> the responsibility of the user to ensure that this method also
> returns accurate results. The results are undefined if this rule is not
> respected.
>
> [Added so that Ruby libraries can assume that this method and the
> get/set methods are symmetric, and to make it clear to the user that
> they are expected to preserve this behaviour.]

Is this likely to be an issue? Presumably the
instance_variable_set(sym,value) method would, whatever else it did,
set the instance variable to the value, and that the instance variable
would then be defined. What would be an example of how to violate this
expectation?

>> obj.instance_variable_get(symbol) => obj
>>
>> Returns the value of the given instance variable, or nil if the
>> instance variable is not set.
>
> This method is called by Ruby when you write <code>@variable</code>
> within the scope of your object methods. You can override this method
> to change the rules for instance variable lookup, as long as you abide
> by the specified rules.
>
> [Added to make it clear that the user /is/ permitted to override this
> method, and that this will result in *all* ivar references calling
> their method.]

I know this is the heart of your proposal. I still have the issue with
it that if it does change the rules for instance variable lookup, and
I still want a way that uses the old rules but does so symbolically, I
would have a problem.
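To make that concrete, here is a rough sketch of how things stand today, where @foo does not go through instance_variable_get at all. The class Account and its balance method are made up for the example. Reaching the default lookup by binding the original method is one way to get the "old rules" symbolically in current Ruby; I am not claiming it would keep working under the proposal:

```ruby
# Hypothetical class that overrides the symbolic getter.
class Account
  def initialize
    @balance = 100
  end

  # A user-defined override of the symbolic lookup.
  def instance_variable_get(name)
    :overridden
  end

  # Plain @-syntax access; in current Ruby this bypasses the override.
  def balance
    @balance
  end
end

acct = Account.new

direct   = acct.balance                            # uses @balance directly
symbolic = acct.instance_variable_get(:@balance)   # hits the override

# Bind the default implementation to get the old symbolic behaviour.
default_get = Object.instance_method(:instance_variable_get)
raw = default_get.bind(acct).call(:@balance)
```

With the current rules, direct and raw both come out as 100 while symbolic is :overridden, which is exactly the distinction I would not want to lose.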

>> The <code>@</code> part of the variable name should be included for
>> regular instance variables. Throws a <code>NameError</code> exception
>> if the supplied symbol is not valid as an instance variable name.
>
> The default implementation of this method returns <code>nil</code> if
> the instance variable has not been set. Conforming code is free to
> override this in user defined classes, but the results of overriding
> this method in system classes are not defined.
>
> [This is added to make it clear to the user that, for example, returning
> false rather than nil from Object.instance_variable_get would be a bad
> idea, and to allow the implementer and third party library authors to
> assume the standard behaviour in core classes.]

I can see some really serious problems here. Mainly, it would no
longer be possible to assume anything about instance_variable_[gs]et,
so basically it would not be usable. I know that it's sort of
"impolite" to use those methods from outside anyway, but it's always
been possible and it's always meant a certain thing. If it can now
mean anything, then you'd literally never want to do it.

That comes back to the issue with not wanting to lose the old
functionality in the interest of the new. What you're describing
sounds like methods that would be good candidates for being protected
-- and, indeed, I think they should be if they can be redefined. But
that leaves a gap in the language.
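As a sketch of what "protected" would amount to, here it is applied by hand to a single class in current Ruby. Widget and same_size? are invented names, and in the scenario I'm describing the visibility would presumably be set globally rather than class by class:

```ruby
# Hypothetical class that hides the symbolic accessors from outsiders.
class Widget
  def initialize(size)
    @size = size
  end

  # Change the visibility of the inherited methods for this class only.
  protected :instance_variable_get, :instance_variable_set

  # Other Widget instances may still use the symbolic form.
  def same_size?(other)
    other.instance_variable_get(:@size) == @size
  end
end

a = Widget.new(3)
b = Widget.new(3)
peer_ok = a.same_size?(b)   # allowed: the caller is itself a Widget

outside_ok =
  begin
    a.instance_variable_get(:@size)
    true
  rescue NoMethodError
    false                   # protected methods reject outside callers
  end
```

The call from inside another Widget succeeds, while the call from top level is rejected, and that rejected call is the gap: there is no longer any way to do the symbolic get from outside.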

>> obj.instance_variable_set(symbol, obj) => obj
>>
>> Sets the instance variable named by symbol to object,
>> thereby frustrating the efforts of the class's author to attempt to
>> provide proper encapsulation.
>
> [This is the original text, and is probably not terribly appropriate
> after the changes here, but I preserve it for the moment.]
>
>> The variable did not have to exist prior to this call.
>
> [The preceding should be replaced with:]
>
> This method is called to set the initial, and all subsequent, values of
> the instance variable. The variable will not exist prior to setting the
> initial value.
>
> Ruby will call this method whenever an instance variable is set,
> including for <code>@variable = value</code> expressions within the
> scope of your object.
>
> [Added to make it clear this is always called by Ruby to set a
> variable.]
>
> Conforming code may override this method, provided the specified
> behaviour is implemented. However, the results of overriding this
> method on core Ruby classes are undefined.

The problem, I think, is that it's really undefined, in exactly that
sense, in any class. The reason it's undefined in the core classes is
that if you override, say, Range#instance_variable_set, you don't
necessarily know what the implementer has done with instance variables
of ranges, so you don't know what you might be breaking. But the same
is true of any class where someone from outside the class overrides
the method for the class or one of its instances: the implementer of
that class may have been depending on the original behavior, or some
prior override.

Again, I think this all points to the appropriateness of making these
methods protected if they do what you're envisioning.

> [Added to make it clear what the user can, and can't, do.]
>
> Care should be taken to avoid recursion in this method; the results of
> directly setting an instance variable within this method are undefined,
> and you should call your superclass instance_variable_set method
> instead.

Does that mean that the setting of the instance variable would not
automatically happen? That seems odd to me. Also, if you have to
explicitly call 'super', then you're not really doing an override; at
most, you're chaining some behavior in. (Which is often a good
approach, but in this case it would limit you.)
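For illustration, this is roughly what the chaining style looks like with today's semantics. Tracked and writes are hypothetical names, and the direct use of @writes inside the override is only safe because current Ruby does not route @-syntax through instance_variable_set:

```ruby
# Hypothetical class that chains extra behaviour in front of the default set.
class Tracked
  attr_reader :writes

  def initialize
    @writes = []
  end

  def instance_variable_set(name, value)
    @writes << name.to_sym   # extra behaviour chained in
    super                    # the default implementation does the store
  end
end

t = Tracked.new
t.instance_variable_set(:@colour, "red")

recorded = t.writes
stored   = t.instance_variable_get(:@colour)
```

The override adds bookkeeping, but the actual store is still whatever 'super' does; it never gets to decide how or where the value is kept.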

In sum, my concerns include:

1. losing the current, very useful behavior of simple get/set
operations performed symbolically;

2. the effectively "undefined" behavior of doing these overrides
for any class (not just core classes) whose internals you don't
control (since the class's implementer may be depending on a
different behavior);

3. the issue of whether or not instance_variable_set(sym,value)
might actually not even set the instance variable, which raises
questions of how the overrides relate to their 'super' versions,
and therefore how much one would really be able to do with them.
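A rough sketch of the third concern, again under current Ruby. Shadow and @store are invented for the example; the point is only that an override which never calls 'super' leaves instance_variable_defined? out of step with the get/set pair:

```ruby
# Hypothetical class whose overrides keep values in a private Hash
# instead of the real instance variable table.
class Shadow
  def instance_variable_set(name, value)
    (@store ||= {})[name.to_sym] = value   # no call to super
    value
  end

  def instance_variable_get(name)
    (@store ||= {})[name.to_sym]
  end
end

s = Shadow.new
s.instance_variable_set(:@x, 1)

fetched = s.instance_variable_get(:@x)        # comes from the Hash
defined = s.instance_variable_defined?(:@x)   # asks the real ivar table
```

Here fetched is 1 but defined is false, so the set "worked" as far as the override is concerned without the instance variable ever existing.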

David

--
Rails training from David A. Black and Ruby Power and Light:
* Advancing With Rails August 18-21 Edison, NJ
* Co-taught by D.A. Black and Erik Kastner
See http://www.rubypal.com for details and updates!
