Efficiency question on stacked relationships

Michael Lackhoff

Oct 31, 2009, 12:24:11 PM
to Rose::DB::Object
Hello,

While debugging a new feature in my app, I was surprised by
database hits that look unnecessary to me.
Here it goes:
    my $r = ParentClass->new(id => 999)->load();
    foreach my $child ($r->children) {
        dosomething_with($child->grandparent->id);
    }

in the Children-class:
sub grandparent {shift->parent->grandparent}

My expectation was to have these DB hits:
- initial get $r
- get $r->children
- get $r->grandparent

Instead I get these additional hits:
for every $child->grandparent:
- get $child->parent (the same as $r)
- get parent->grandparent (again and again)

Perhaps it can be boiled down to the question why a call to
$parent->child->parent has to get parent from the database again.

Of course there are ways to do my own caching but perhaps I am just
missing the obvious and it is all there already.

-Michael

John Siracusa

Oct 31, 2009, 1:53:34 PM
to rose-db...@googlegroups.com
On Sat, Oct 31, 2009 at 12:24 PM, Michael Lackhoff <mic...@lackhoff.de> wrote:
> Perhaps it can be boiled down to the question why a call to
> $parent->child->parent has to get parent from the database again.

RDBO doesn't keep any sort of global object cache. Relationship
methods cache the objects they fetch within the object that the
relationship method was called on. When you call parent() on the
child object, that method had not previously been called on that
object. The child object therefore queries the database to get the
parent object. The child object has no idea that some other code had
previously loaded the same object it's about to fetch.
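
A minimal sketch of that per-object caching, written in plain Perl
rather than against RDBO itself. The Child class, fetch_row() and the
%fetch_count counter are hypothetical stand-ins for a relationship
method and the database; they are not part of the RDBO API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the database: counts fetches per table.
my %fetch_count;

sub fetch_row {
    my ($table, $id) = @_;
    $fetch_count{$table}++;
    return { table => $table, id => $id };
}

package Child;

sub new { my ($class, %args) = @_; return bless {%args}, $class }

# Caches the fetched parent inside *this* child object only.
sub parent {
    my $self = shift;
    $self->{parent} //= main::fetch_row('parent', $self->{parent_id});
    return $self->{parent};
}

package main;

my @children = map { Child->new(id => $_, parent_id => 999) } 1 .. 3;

$_->parent for @children;    # first call on each child: one fetch each
$_->parent for @children;    # repeat calls hit each child's own cache

print "parent fetches: $fetch_count{parent}\n";    # prints "parent fetches: 3"
```

Three children that share one parent still cause three fetches,
because each cache lives in a different object.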

-John

Mark Frost

Oct 31, 2009, 1:56:51 PM
to Rose::DB::Object
I noticed this before, and almost made a topic about it, but in
thinking it through further it made sense.

There's no way in Rose::DB::Object to define that two relationships
are the same "path". For example, you're suggesting that if I call
$table->cells, and then from one of the cells called $cell->table, I
should get the same table back, right? I thought the same thing.

But then it occurred to me that the names of our relationships are
arbitrary, and the paths aren't unique. For example, I could have my
$cell relationship that points back at its table be called
"get_sheep", for some reason. Then I could have another relationship
called table, which still points at the table class, but using
join_args to restrict it down further.

Without John setting up a way to say "These two relationships are part
of the same family", which could be overarchitecting Rose::DB::Object,
I don't think there's a consistent way for him to know which of your
relationships are the same "path" programmatically.

In one of my examples, I have "Patient" and "PatientMerge" classes.
Patient can call ->merges, and a PatientMerge can call ->patient.
Obviously, I'd like it if these two traded off. But my ->merges
relationship has join_args and manager args, so it isn't my *true*
relationship back to the Patient class; that's another relationship
called ->all_merges.

The only way I've been able to think of for John to do this (maybe he
has his own, better ideas) would be to add the ability to specify one
"parent" and one "child" for relationships between two tables. But
then, of course, that's one more thing you'd have to remember when
setting up your classes, so...

Michael Lackhoff

Oct 31, 2009, 3:09:30 PM
to rose-db...@googlegroups.com
Thanks, Mark and John for your responses!

Since this can be quite expensive, is there a recommended way to deal
with it?
Perhaps, instead of:


sub grandparent {shift->parent->grandparent}

some way of accessor-like caching:
    sub grandparent {
        my $self = shift;
        if (@_) {
            $self->{grandparent} = $_[0]; # ???
        }
        elsif (not $self->{grandparent}) {
            $self->{grandparent} = $self->parent->grandparent;
        }
        return $self->{grandparent};
    }

This way I could pass in the previously fetched object wherever
performance requires it.
But how about messing with the object hash? It looks ugly, but it isn't
that unusual inside the class file itself (I wouldn't do it from
elsewhere, like $child->{grandparent} = $cached).
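
To make the intended usage concrete, here is a self-contained sketch
with simulated classes. The Parent and Child packages and the $lookups
counter are hypothetical and stand in for the real RDBO classes and
the database.

```perl
use strict;
use warnings;

my $lookups = 0;    # counts simulated database lookups

package Parent;

sub new { my ($class, %args) = @_; return bless {%args}, $class }

sub grandparent {
    my $self = shift;
    $lookups++;    # pretend this hits the database
    return { id => $self->{grandparent_id} };
}

package Child;

sub new { my ($class, %args) = @_; return bless {%args}, $class }

sub parent {
    my $self = shift;
    $lookups++;    # pretend this hits the database
    return Parent->new(id => $self->{parent_id}, grandparent_id => 1);
}

# Accessor-like caching: accept a pre-fetched object, else look it up once.
sub grandparent {
    my $self = shift;
    $self->{grandparent} = shift if @_;
    $self->{grandparent} //= $self->parent->grandparent;
    return $self->{grandparent};
}

package main;

my $r  = Parent->new(id => 999, grandparent_id => 1);
my $gp = $r->grandparent;    # the only lookup

my @ids;
for my $child (map { Child->new(id => $_, parent_id => 999) } 1 .. 3) {
    $child->grandparent($gp);    # seed the cache with the shared object
    push @ids, $child->grandparent->{id};
}

print "lookups: $lookups\n";    # prints "lookups: 1"
```

Without the seeding call, each child would trigger two lookups of its
own (parent, then grandparent).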

Or are there better ways?

-Michael

Mark Frost

Oct 31, 2009, 3:25:28 PM
to Rose::DB::Object
Speaking from personal experience, do NOOOOTTTT DO THIS.

Well, at least if you're in a mod_perl environment, which I would
assume if you're using RDBO.

Perl has an issue where if you have a circular reference amongst
objects, the garbage collection process doesn't know how to clean them
up (an object has to have NO active references on it for Perl to
garbage collect it. When you have an object reference circle, *all* of
the references remain active ones). This can lead to dramatic memory
leaks that will be very unhealthy for your webserver.
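
The effect can be reproduced in a few lines of plain Perl. This is
only an illustration: the Node class and the $destroyed counter are
made up for the demo and have nothing to do with RDBO.

```perl
use strict;
use warnings;

my $destroyed = 0;    # incremented whenever a Node is freed

package Node;

sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub DESTROY { $destroyed++ }

package main;

{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child');

    $parent->{child} = $child;     # parent -> child
    $child->{parent} = $parent;    # child -> parent closes the circle
}

# Both lexicals are gone, but each object still holds a reference to
# the other, so neither reference count ever drops to zero.
print "destroyed after scope exit: $destroyed\n";    # prints 0
```

In a long-running process such as mod_perl, objects left behind this
way accumulate for the life of the process.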

The only way I've found to get around this problem is to treat the
Rose relationships as a tree, rather than as individual objects, and
always pass the full *tree* that I need around. For example, if I'm in
a template that's using a Patient object, and I want to call another
template that knows how to draw the PatientAccount, but needs *one*
piece of information about the Patient itself... then I have to either
A. Pass the Patient and access the account that way, or B. Pass the
PatientAccount AND a reference to the Patient object as well (this
doesn't create a circle since I haven't stuffed it into the Patient
object).

There's an article about these memory leaks in Perl that I'll try to
dig up. It baffled us for a long time, because we thought it was an
RDBO thing, but it turned out it happens with circular references in
any Perl objects.

Michael Lackhoff

Oct 31, 2009, 5:29:44 PM
to rose-db...@googlegroups.com
On 31.10.2009 20:25 Mark Frost wrote:

> Well, at least if you're in a mod_perl environment, which I would
> assume if you're using RDBO.

Yep.

> Perl has an issue where if you have a circular reference amongst
> objects, the garbage collection process doesn't know how to clean them
> up (an object has to have NO active references on it for Perl to
> garbage collect it. When you have an object reference circle, *all* of
> the references remain active ones). This can lead to dramatic memory
> leaks that will be very unhealthy for your webserver.

Isn't it possible to weaken the reference? There must be some code for
this in RDBO already since it is almost natural to build circles with
all the relationships.
But you might be right, a first glance shows that it is very easy to get
it wrong and then the memory leaks become a real problem.
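
For reference, a sketch of what weakening looks like with weaken()
from the core Scalar::Util module. The Node class and the $destroyed
counter are hypothetical demo code; this says nothing about what RDBO
does internally.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # incremented whenever a Node is freed

package Node;

sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub DESTROY { $destroyed++ }

package main;

{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child');

    $parent->{child} = $child;     # strong reference: parent owns child
    $child->{parent} = $parent;    # back-reference closes the circle
    weaken($child->{parent});      # ...but no longer counts as a reference
}

# With the back-reference weakened, both objects are freed at scope exit.
print "destroyed after scope exit: $destroyed\n";    # prints 2
```

The catch is that a weak reference becomes undef as soon as the last
strong reference goes away, so whoever holds only the child must not
assume $child->{parent} is still there.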

-Michael

Mark Frost

Oct 31, 2009, 8:24:48 PM
to Rose::DB::Object
Personally, I was never able to solve this problem, so I had to change
the way I was using RDBO slightly. I thought it was going to prove to
be a real inconvenience, but it hasn't been bad. Just something
important to keep in mind.