Problem accessing association object

MarkMT

May 31, 2009, 9:27:31 AM
to DataMapper
Hi - I've spent the past day going slightly crazy over a problem I've
had accessing objects through what seems like a perfectly
straightforward association. And although I've found a work-around of
sorts, I really don't understand why it works and wonder if my problem
could possibly reflect a bug in DM.

Here's the situation - I have a couple of DM models in a merb
application...

file regatta.rb
--------------------
class Regatta
include DataMapper::Resource
Extlib::Inflection.plural_word 'regatta', 'regattas'
property :id, Serial
has 1, :recap
end
-------------------


file recap.rb
-------------------
class Recap
include DataMapper::Resource
property :id, Serial
belongs_to :regatta
end
------------------

(A bunch of properties and other associations omitted for clarity)

Now if I invoke 'bin/merb -i' and do something like the following
(assuming the db objects have already been created and associated):

r = Recap.get(1)
=> #<Recap id=1 regatta_id=21>
r.regatta

I get a stack overflow error -

SystemStackError: stack level too deep
from /Code/merb/craw/gems/gems/dm-core-0.9.11/lib/dm-core/adapters/
data_objects_adapter.rb:309:in `order_statement'
from /Code/merb/craw/gems/gems/dm-core-0.9.11/lib/dm-core/adapters/
data_objects_adapter.rb:237:in `read_statement'

... + much more...

I've poked around the DM associations code to try to track down what's
happening, but without much success. It seems that there is some
recursive invocation of the #{name}_association method going on in
module DataMapper::Associations::ManyToOne
(dm-core-0.9.11/lib/dm-core/associations/many_to_one.rb), but I
haven't been able to definitively determine where the re-entry is
happening, except that it *seems* to happen as the instance variable
@#{name}_association is being returned, though I'm not sure that makes
sense. Anyhow...

Something else that is curious is that the SQL that gets executed (in
an infinite loop prior to the stack error above being reported) is
this -

SELECT `id` FROM `regattas` ORDER BY `id`

(again I've left out most of the model attributes for clarity)...

The reason this seems odd is that there is no 'WHERE' clause which
would be required to pull out the specific regatta associated with the
recap object.

In the process of trying to figure this out, I tried creating two
additional models with different names, 'Event' and 'Report', but
identical semantics, i.e. associated in the same way as 'Regatta' and
'Recap' and with all the same properties. With these new models I was
*not* able to reproduce the problem above. While investigating a
little further, I stumbled across a workaround for the original
problem... I removed the file event.rb that contained the Event model
I had created temporarily for testing, and then simply renamed the
file regatta.rb to event.rb, leaving its content unchanged. With that
change the problem goes away, i.e. the model 'Regatta' is now defined
in file event.rb. What is also interesting is that the SQL executed
when I try to access 'r.regatta' using irb is this:

SELECT `id` FROM `regattas` WHERE (`id` IN (21)) ORDER BY `id`

i.e. the WHERE clause is now present as you would expect.

Just to muddy the water a little further... I tried creating a
completely separate skeleton merb application and defined the same two
models 'Regatta' and 'Recap' (without associations to some other
models present in the original application). However I was *not* able
to reproduce the original problem.

So I'm puzzled. I have a work-around for the original problem, but I'm
uncomfortable not really knowing why it works, and I'm unsure why I
can't reproduce the problem in a different application.

I'm open to any suggestions/insights/advice. The problem may well be
in my code, but even if it is, if I can figure out exactly what it is
there may be an opportunity to modify DM to detect the error and raise
a more informative exception than the stack overflow error above.

Mark.

MarkMT

Jun 3, 2009, 10:58:17 PM
to DataMapper
I've been looking a little closer at this problem, and indications so
far are that it may well be a datamapper issue...

I removed from my app all models except Regatta, Recap and User and
all attributes and extraneous associations apart from id's for each
remaining model. What's left is this -

----------------------
class User
include DataMapper::Resource
property :id, Serial
end

class Recap
include DataMapper::Resource
property :id, Serial
belongs_to :regatta
belongs_to :author, :class_name => User, :child_key => [:author_id]

end

class Regatta
include DataMapper::Resource
include Attachable
Extlib::Inflection.plural_word 'regatta', 'regattas'
property :id, Serial
has 1, :recap
end
------------------------

In addition, just for testing purposes, I removed the content of
module Attachable in attachable.rb.

As before, I get a system stack error if I load a Recap, r, from the
db and try to access r.regatta. What's curious is that there seems to
be a combination of apparently unrelated factors that lead to the
problem. I can prevent the error by doing any one of the following
three things -
- remove 'include Attachable' from the definition of class Regatta
- change the association between Recap and User to not use a different
association name, i.e. -- 'belongs_to :user, :child_key =>
[:author_id]'
- change the name of the file containing the Regatta model to
something other than regatta.rb, such as event.rb.

Seems weird. Of course I may just be missing something completely
unrelated. I'd be interested to know if anyone else can reproduce this
problem.

Mark.

MarkMT

Jun 12, 2009, 12:35:30 PM
to DataMapper
Well it's taken a couple of weeks to figure this out (not my day
job!), but after further investigation, I think I've convinced myself
that this is indeed a DataMapper bug. I'd appreciate it if someone who
is familiar with the internals could take a look at my reasoning here
and check whether I'm thinking straight about this...

The problem seems to occur when four conditions coincide -

(i) Given three models A, B and C, many-to-one relationships exist
between A and C and between B and C
(ii) The 'belongs_to :b' statement in the definition of the child
model C specifies the :class_name for the parent model B.
(iii) The 'belongs_to :a' statement in C *precedes* the one for B.
(iv) The parent model A is loaded *after* the child model C.

So in my case I had the following, in which 'Regatta', 'User' and
'Recap' correspond to A, B and C above respectively and the models are
loaded in the order listed below -


In file 'user.rb' -
-----------------------------
class User
include DataMapper::Resource
property :id, Serial
has n, :recaps
end
-----------------------------


In file 'recap.rb' -
-----------------------------
class Recap
include DataMapper::Resource
property :id, Serial
belongs_to :regatta
belongs_to :author, :class_name => User, :child_key => [:author_id]
end
-----------------------------


In file 'regatta.rb' -
-----------------------------
class Regatta
include DataMapper::Resource
include Attachable
Extlib::Inflection.plural_word 'regatta', 'regattas'
property :id, Serial
has 1, :recap
end
-----------------------------

The reason this scenario is a problem is as follows - each
'belongs_to' in the definition of Recap causes a Relationship object
to be created with a @child_model instance variable set to Recap. Each
of these relationships also has an instance variable @child_key
representing the foreign key for the parent model.
Relationship#initialize may cause a value to be assigned to this
variable by invoking Relationship#child_key *if* both @child_model and
@parent_model are of class 'Model'.

In the case of 'belongs_to :regatta', @parent_model is not of class
Model, but is instead a string "Regatta" derived from the
symbol :regatta passed to 'belongs_to'. So in this case @child_key
does not get assigned during the creation of the Relationship object.
However in the case of 'belongs_to :author, :class_name =>
User, :child_key => [:author_id]', the relationship's child_key method
does indeed get invoked.

This is significant because the method Relationship#child_key includes
a call to child_model.properties, which as well as returning the
properties of the child model Recap, also checks whether Recap is the
child of any other many-to-one relationships. If it is, it will invoke
child_key on each of those relationships. In the scenario I've
described, Recap does indeed have another many-to-one relationship -
it's the one created by the *preceding* invocation of
'belongs_to :regatta'.

So Relationship#child_key is called on the relationship between the
child model Recap and the parent model Regatta. At one point in this
method the following statement appears -

child_key = parent_key.zip(@child_properties || []).map do |parent_property, property_name|

Of particular interest is the reference to 'parent_key' on the right
hand side. This causes Relationship#parent_key to be invoked for the
same relationship. Within that method the statement 'parent_model.key'
gets executed, where 'parent_model' represents 'Regatta', and the
method 'key' in module 'Model' is defined by one line,
'properties(repository_name).key'.

At this point, since the Regatta model has yet to be loaded (condition
(iv) above), the properties of Regatta, @properties[repository_name],
are initially nil, but prior to the end of the Model module's
'properties' method this instance variable gets set to
PropertySet.new, which is an empty set, #<PropertySet:{}>.

This empty parent key gets passed back to the child_key method, where
the expression noted above for the local variable child_key evaluates
to [ ], causing the child_key method's return value,
PropertySet.new(child_key), to evaluate to the empty set
#<PropertySet:{}>.

So we end up with @child_key for the :regatta relationship on the
Recap model being the empty set.
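
To illustrate the empty-key step in isolation, here is a minimal
plain-Ruby sketch. Only the zip/map expression is taken from the
dm-core line quoted above; the helper name derive_child_key, the use of
plain symbols for keys, and the fallback naming rule inside the block
are assumptions made for illustration, not DataMapper's actual code.

```ruby
# Hypothetical helper: derives child (foreign) key names from a parent key.
# Keys are modelled as plain arrays of symbols rather than PropertySets.
def derive_child_key(relationship_name, parent_key, child_properties = nil)
  parent_key.zip(child_properties || []).map do |parent_property, property_name|
    # Assumed rule: use the explicit name if one was given, otherwise
    # build "<relationship>_<parent property>".
    property_name || :"#{relationship_name}_#{parent_property}"
  end
end

loaded_key   = [:id]  # parent model already loaded: key is known
unloaded_key = []     # parent model not loaded yet: key is still empty

p derive_child_key(:regatta, loaded_key)               # => [:regatta_id]
p derive_child_key(:author,  loaded_key, [:author_id]) # => [:author_id]
p derive_child_key(:regatta, unloaded_key)             # => []
```

Because zip iterates over the parent key, an empty parent key can only
ever yield an empty child key, no matter what the child model declares.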

Later on in the course of executing my merb application's controller,
if I have a recap, r, and call r.regatta, what happens? The method
Recap#regatta that was created dynamically when the Recap model was
loaded is defined as -

regatta_association.nil? ? nil : regatta_association

The method 'regatta_association' returns a ManyToOne::Proxy object,
but the method 'nil?' is not defined on the Proxy class because the
class definition explicitly undefines all but a specific set of
methods. As a result, 'method_missing' gets called on the Proxy
object. 'method_missing' sends the missing method, in this case 'nil?',
to the association's parent object. That parent is obtained by calling
Proxy#parent, which is defined by one line -

@parent ||= @relationship.get_parent(@child)

At this point the instance variable @parent has not been assigned, so
Relationship#get_parent gets called. 'get_parent' ends with this -

----------
children.each do |c|
  c.send(association_accessor).instance_variable_set(:@parent, collection.get(*child_key.get(c)))
end
child.send(association_accessor).instance_variable_get(:@parent)
----------

Inside the block, 'c' is the Recap object and
'c.send(association_accessor)' returns the regatta association's proxy
object. 'collection' is an array of all objects of the parent class,
i.e. all Regattas. However, because 'child_key' is an empty property
set, 'collection.get(*child_key.get(c))' returns nil, which is what
'instance_variable_set' assigns to @parent.

And now the problem becomes evident... Proxy#instance_variable_get,
which is called next, is defined as -

super || parent.instance_variable_get(variable)

'super' is nil. The problem is 'parent'... we've been here before.
@parent has still not been assigned, so calling Proxy#parent invokes
get_parent again, which ends in the same instance_variable_get call,
which calls Proxy#parent again, which... The loop never terminates,
and hence we get a stack overflow (SystemStackError).
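
The loop can be reproduced outside DataMapper with a small plain-Ruby
sketch. ToyProxy and its found_parent argument are hypothetical
stand-ins; only the two one-line method bodies ('@parent ||= ...' and
'super || parent.instance_variable_get(variable)') follow the code
quoted above, and get_parent is reduced to the set-then-get step that
matters here.

```ruby
# Toy stand-in for the association proxy; not DataMapper's real class.
class ToyProxy
  # found_parent: what the key lookup would return (nil for an empty key).
  def initialize(found_parent)
    @found_parent = found_parent
  end

  def parent
    @parent ||= get_parent
  end

  # Falls back to the parent object when the proxy's own variable is nil.
  def instance_variable_get(variable)
    super || parent.instance_variable_get(variable)
  end

  private

  # Simplified get_parent: store the lookup result, then read it back
  # through the overridden instance_variable_get.
  def get_parent
    instance_variable_set(:@parent, @found_parent)
    instance_variable_get(:@parent)
  end
end

# A successful lookup terminates normally.
regatta = Object.new
puts ToyProxy.new(regatta).parent.equal?(regatta)

# A nil lookup re-enters #parent until the stack is exhausted.
begin
  ToyProxy.new(nil).parent
rescue SystemStackError => e
  puts e.class
end
```

The memoization in 'parent' only helps once a non-nil value has been
stored, so a nil lookup result leaves nothing to stop the re-entry.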

Although the problem manifests itself when the 'nil?' method gets
called on the association proxy, causing method_missing to be called,
the undefined 'nil?' is not itself the problem. If you remove the
undef of 'nil?', calling the association method, e.g. r.regatta, will
successfully return the association proxy. But if you try to do
anything with it, e.g. r.regatta.<some_attribute>, 'method_missing'
will again be called and the infinite Proxy#parent loop will again be
entered.

I'm not sure what the right way is to fix the underlying problem, but
as a user of datamapper, it seems that simply swapping the order of
the two 'belongs_to' declarations in the definition of Recap solves
the problem.
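
Why the declaration order matters can be shown with a toy plain-Ruby
model. Everything here (ToyParent, ToyRelationship, regatta_child_key)
is hypothetical and greatly simplified; it only mimics the behaviour
described above, where a 'belongs_to' that names its parent class
forces the relationships declared before it to compute and cache
their child keys.

```ruby
# Toy parent model: its key is empty until the model is "loaded".
class ToyParent
  attr_reader :key

  def initialize
    @key = []
  end

  def load!
    @key = [:id]
  end
end

# Toy relationship: computes its child key once and caches the result,
# even when that result is an empty array.
class ToyRelationship
  def initialize(name, parent)
    @name = name
    @parent = parent
  end

  def child_key
    @child_key ||= @parent.key.map { |prop| :"#{@name}_#{prop}" }
  end
end

# Declares the two relationships in the given order and returns the
# child key that the :regatta relationship ends up with.
def regatta_child_key(declaration_order)
  regatta = ToyParent.new        # Regatta is loaded after Recap
  user = ToyParent.new
  user.load!                     # User is loaded before Recap
  relationships = {}

  declaration_order.each do |name|
    if name == :author
      # Parent class given explicitly: earlier relationships are
      # resolved now, while Regatta's key is still empty.
      relationships.each_value(&:child_key)
      relationships[:author] = ToyRelationship.new(:author, user)
    else
      relationships[:regatta] = ToyRelationship.new(:regatta, regatta)
    end
  end

  regatta.load!
  relationships[:regatta].child_key
end

p regatta_child_key([:regatta, :author])  # => []
p regatta_child_key([:author, :regatta])  # => [:regatta_id]
```

In the real model this corresponds to placing the 'belongs_to :author'
line before 'belongs_to :regatta' in Recap, so that no relationship to
the not-yet-loaded Regatta exists when the :author relationship is set
up.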

Mark.