Deep Eager load avoiding N+1 and cartesian

311 views
Skip to first unread message

Brad Laney

unread,
Jan 30, 2012, 12:58:04 PM1/30/12
to nhusers
Hello.

I am trying to avoid N+1 and cartesian product when I need to do deep
eager loads. I want to do this for performance. It is an EAV type of a
data structure.

First situation:
A -> B -> C. Three tables. Regular primary keys off ints and a list in
the parent that goes to the child.

I have an inherited Id column, of int, on each type.
My mappings are fine, everything can be selected using a
QueryOver.Fetch for eager loading, but this causes a cartesian
product.

I tried using futures. This apparently does nothing like it's
advertised on peoples blogs, etc. It's supposed to allow you to
preload records to avoid N+1 and cartesian, but NH apparently just
ignores it completely.

var r = s.QueryOver<NHMapTest1>()
.Where(x => x.Id == id)
.Future();

s.QueryOver<NHMapTest2>()
.Where(x => x.Parent.Id == id)
.Future();
Running r.Single(); returns the entity, but if you do .Items it'll
query the DB, even tho they were selected in the 2nd future. Both
queries run correctly and return valid results.


public class NHMapTest1 : Entity
{
public virtual string Title { get; set; }
private IList<NHMapTest2> _items;
public virtual IList<NHMapTest2> Items { get { return
_items; } }
}

public class NHMapTest2 : Entity
{
public virtual string Title { get; set; }
private IList<NHMapTest3> _items;
public virtual IList<NHMapTest3> Items { get { return
_items; } }
public virtual NHMapTest1 Parent { get; set; }
}

public class NHMapTest3 : Entity
{
public virtual string Title { get; set; }
public virtual NHMapTest2 Parent { get; set; }
}


2nd Situation:
A -> B -> C -> D when C is a composite key reference to B and D.


public class NHMapTest1 : Entity
{
public virtual string Title { get; set; }
private IList<NHMapTest2> _items;
public virtual IList<NHMapTest2> Items { get { return
_items; } }
}

public class NHMapTest2 : Entity
{
public virtual string Title { get; set; }
private IList<NHMapTest3> _items;
public virtual IList<NHMapTest3> Items { get { return
_items; } }
public virtual NHMapTest1 Parent { get; set; }
}

public class NHMapTestComposite2 : Entity
{
public virtual NHMapTest2 NHMapTest2 { get; set; }
public virtual NHMapTest3 NHMapTest3 { get; set; }
public virtual string Title { get; set; }
}

public class NHMapTest3 : Entity
{
public virtual string Title { get; set; }
private IList<NHMapTest4> _items;
public virtual IList<NHMapTest4> Items { get { return
_items; } }
public virtual NHMapTest2 Parent { get; set; }
}

Oskar Berggren

unread,
Jan 30, 2012, 2:44:50 PM1/30/12
to nhu...@googlegroups.com
2012/1/30 Brad Laney <brad.j...@gmail.com>:

> Hello.
>
> I am trying to avoid N+1 and cartesian product when I need to do deep
> eager loads. I want to do this for performance. It is an EAV type of a
> data structure.
>

Is the problem too many queries, or that you absolutely cannot have
more than one roundtrip to the database?

[...]

>
>                var r = s.QueryOver<NHMapTest1>()
>                    .Where(x => x.Id == id)
>                    .Future();
>
>                s.QueryOver<NHMapTest2>()
>                    .Where(x => x.Parent.Id == id)
>                    .Future();
> Running r.Single(); returns the entity, but if you do .Items it'll
> query the DB, even tho they were selected in the 2nd future. Both
> queries run correctly and return valid results.

When you access the collection, even though a number of objects of the
collection's member type has been loaded, NH doesn't know they are
part of the collection, that's why it needs to query again. Looking at
the child's Parent reference of loaded objects wouldn't be enough,
since it doesn't know if those are _all_ the children.

Have you set proper batch size on your collections and classes? And
possibly specify subselect fetching for the collections. With that you
should be able to rely on lazy fetch, but with a limited number of
round trips.


/Oskar

Brad Laney

unread,
Jan 30, 2012, 3:54:07 PM1/30/12
to nhusers
This relationship goes 4 deep. When it does lazy loading it ends up
running like 500+ queries.
This job can easily be done in 4 queries, and then the content
aggregated.
I have not touched batch size, because its already set, and the number
of records coming back is not the issue.
I have tried fetch but then you end up returning large amounts of
unnecessary data, which is why I want to use futures.
This is supposed work. It's documented as how it works. I just don't
know how to get it to do join type select with QueryOver.
I'm going to try it with the criteria API.

.SetFetchMode("Items", FetchMode.Select) <-- this is pretty much what
I want.

What is subselect fetching?

Is there a way to fetch an entire list of data, instead of one at a
time?
x.Items.Items <-- I want to fill every single item inside that
collection in one sql call instead of it lazy loading each one that I
want.

What I want it to run is:

select A.* from A where A.id = ?
select B.* from B where B.aid = ?
select C.* from C left join B where B.aid = ?
select D.* from D left join C left join B where b.aid = ?

I don't mind joining up to A instead of B. But yeah. Also notice I
don't want every single column, because that's pointless.

On Jan 30, 1:44 pm, Oskar Berggren <oskar.bergg...@gmail.com> wrote:
> 2012/1/30 Brad Laney <brad.j.la...@gmail.com>:

Boris Drajer

unread,
Feb 2, 2012, 7:15:58 AM2/2/12
to nhusers
Hi,

Sorry if I misunderstand, but it seems to me that the crucial point
you're missing is that it's not enough to load the contents of the
collections, NHibernate needs to load the collection as a part of its
owner. If you retrieve individual objects, they will all be present in
the session but the collections that contain them will remain
uninitialized. When you access such collections, NHibernate executes a
query and then for each row retrieved discovers that it's already
cached in the session, so it only adds it to the collection. What you
need to do is to load them *together*. This is best illustrated with
HQL:

from A a left join fetch a.Bs where [something]

This loads the A's along with their collections of B's, not just A's
and B's separately.

With QueryOver, I think it would look something like this:

var q = session.QueryOver<A>().Fetch(a =>
a.Bs).Eager.Where([something]);

You can execute multiple different queries of this sort to load
different portions of the object graph.

HTH!

David Schmitt

unread,
Feb 2, 2012, 8:06:34 AM2/2/12
to nhu...@googlegroups.com
On 02.02.2012 13:15, Boris Drajer wrote:
> Hi,
>
> Sorry if I misunderstand, but it seems to me that the crucial point
> you're missing is that it's not enough to load the contents of the
> collections, NHibernate needs to load the collection as a part of its
> owner. If you retrieve individual objects, they will all be present in
> the session but the collections that contain them will remain
> uninitialized. When you access such collections, NHibernate executes a
> query and then for each row retrieved discovers that it's already
> cached in the session, so it only adds it to the collection. What you
> need to do is to load them *together*. This is best illustrated with
> HQL:
>
> from A a left join fetch a.Bs where [something]
>
> This loads the A's along with their collections of B's, not just A's
> and B's separately.
>
> With QueryOver, I think it would look something like this:
>
> var q = session.QueryOver<A>().Fetch(a =>
> a.Bs).Eager.Where([something]);
>
> You can execute multiple different queries of this sort to load
> different portions of the object graph.

Interesting! I was wondering the same myself recently, and this seems to
be exactly what I need. Is there any information available how this can
be done with the LINQ-provider? I guess, it's "just" a matter of putting
the right nodes into the expression tree. I was looking at the reference
on NHforge, but didn't see anything pertinent.

Best Regards, David

Brad Laney

unread,
Feb 2, 2012, 11:32:40 AM2/2/12
to nhusers
Hey,

I understand that is how NH works, but you need to understand I don't
want to do a Cartesian join for this.

I have A -> B -> C -> D

If I want D populated then i will be selecting the entire contents of
A, like, 40 times.

I understand it needs to know the link. And I understand I can use
futures to efficiently populate one level deep.
What I want is an efficient way to populate 4 levels deep, just like I
could with TSQL then join them together at the end.

I do not care the method, or technique to doing this. I just need an
object so that my DBA does not tell me I am bringing down the server
by having a high visited page run 500 queries every time it runs.

Doing this is not an acceptable solution, specially since it requires
left outer join:

from A left outer join fetch A.B left outer join fetch B.C left outer
join fetch C.D

Darren Kopp

unread,
Feb 2, 2012, 12:22:46 PM2/2/12
to nhu...@googlegroups.com
Why not just use raw sql? I use nhibernate everywhere in my system until I need to do something that I can't do with nhibernate or would be prohibitive performance wise to do so with nhibernate. Then I drop back to raw sql.

Brad Laney

unread,
Feb 2, 2012, 3:08:29 PM2/2/12
to nhusers
So... do it in raw sql, build the entities, and merge them into
session?

That's going to take so freaking long, this is required in almost
every single select query we do.

Oskar Berggren

unread,
Feb 2, 2012, 3:32:35 PM2/2/12
to nhu...@googlegroups.com
I'm not sure exactly why this explodes to 500 queries. Maybe I'm
missing something but it seems to me the batchsize settings should be
able to keep that down to fairly few queries, though more than four.
Which batchsize have you set, and to what values?

Another point to consider is that this sounds like some sort of report
page, which isn't exactly what NHibernate is optimized for. If this is
true, it may be useful to just use SQL to project the data to a
dataset or some simple view model and forget about the entities for
this case. There might also be the option of caching the "compiled"
view model in memory or serialized in e.g. a nosql database.

How many objects are we talking about in each level here? If the
numbers grow beyond perhaps 10000 or so I would expect you to
experience sluggishness from NHibernate having to create all those
objects, even if data is fetched efficiently from the database.

/Oskar


2012/2/2 Brad Laney <brad.j...@gmail.com>:

> --
> You received this message because you are subscribed to the Google Groups "nhusers" group.
> To post to this group, send email to nhu...@googlegroups.com.
> To unsubscribe from this group, send email to nhusers+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/nhusers?hl=en.
>

Brad Laney

unread,
Feb 6, 2012, 5:41:04 PM2/6/12
to nhusers
This is not a report. It just a simple output of an object. It just
happens that every time we output the object we need to output these
properties as well.

It is a listing on our site, which can contain many items inside it,
like car, truck etc, and also every item can contain properties that
describe the item. Like width, height, etc.

When we show a list of listings, then we need all the listings, all
the items for each listing, and all the properties for each item.
This is why it causes so many. There are up to 20 properties per
listing item.

Almost everything we use in our business is complicated, we do not
have simple "get one entity" needs. I haven't really been in a
business where things are that simple. We have to show complex data to
the user no matter what we are showing. And all the data is related.

We do have some rules, like, when we get a listing, we do not always
want the items, but when we want the items, we always want the item
properties. So when we do want the items, there are lots of objects
that need to be created.

The lowest amount of queries we could do, I guess, is for every
listing, get all the children.
This would just be like 30 or 50 queries, but our DBA will still throw
a fit.

The issue I have with "just do it raw". Then why not ALWAYS do it raw?
If you have to do it raw once, then you are going to have a mapper,
and it's what, one more line of code to make it do it for a list of
items? Doesn't seem to really save you anything. If you have to do it
once, then you can always do it. Which just means you write the code
to do the manual ORM and then have the "overhead" for NH on simple
stuff, when the simple stuff is the most common.

Is the answer, then, "sorry NH can't handle that yet"?

On Feb 2, 2:32 pm, Oskar Berggren <oskar.bergg...@gmail.com> wrote:
> I'm not sure exactly why this explodes to 500 queries. Maybe I'm
> missing something but it seems to me the batchsize settings should be
> able to keep that down to fairly few queries, though more than four.
> Which batchsize have you set, and to what values?
>
> Another point to consider is that this sounds like some sort of report
> page, which isn't exactly what NHibernate is optimized for. If this is
> true, it may be useful to just use SQL to project the data to a
> dataset or some simple view model and forget about the entities for
> this case. There might also be the option of caching the "compiled"
> view model in memory or serialized in e.g. a nosql database.
>
> How many objects are we talking about in each level here? If the
> numbers grow beyond perhaps 10000 or so I would expect you to
> experience sluggishness from NHibernate having to create all those
> objects, even if data is fetched efficiently from the database.
>
> /Oskar
>
> 2012/2/2 Brad Laney <brad.j.la...@gmail.com>:

Oskar Berggren

unread,
Feb 6, 2012, 6:32:31 PM2/6/12
to nhu...@googlegroups.com
Hi Brad,

Obviously you know your system much better than me, but here are some
additional thoughts:


On batch size:
It's late evening for me so forgive me if I have a mental accident
here, but it seems to me that with proper batch size on the
collections you should be able to get something similar to the
following scenario:

Listing { Set<Item> Items }
Item { Set<Property> Properties }
Property { Name, Value }

Query 1: session.Query<Listing>().BlaBla.ToList().
Query 2: access loadedListings[0].Items.First(), will load the Items
collection for ALL loaded listings
Query 3: access loadedListings[0].Items.First().Properties.First(),
will load the Properties collection for ALL loaded Items
Done.


Tree structures:
Have you considered other methods of efficiently loading tree
structures from SQL, such as "nested set" and "materialized path"? On
the other hand, since you have a known and limited depth, those
methods may not get you much compared to regular joins.


On report:
To me, the concept of reading and displaying a large amount of data to
the user is a kind of report, in that it reports the contents for the
system. This is a factor behind the concept of
command/query-separation and using separate view models (where the
data may be stored denormalized to increase read performance). Btw, it
sounds like you are putting a lot of data in front of the user in a
single view. Is the user really able to cope with all that data at
once?


Domain logic:
When you display all this information to the user, do you really need
(significant parts of) the domain logic in your domain model? If not,
then it may be that NHibernate isn't the right tool for this use case.
If you don't have a rich domain model, perhaps NHibernate isn't the
correct tool for this project. If you do have a rich domain model,
there may still be areas of the application which isn't optimal.
Everything is a compromise. (In some projects of mine I have many use
cases where I need to load less than 10 objects, usually less than
three levels deep.)

If you don't really need anything in the entities for this usecase,
perhaps loading the data using raw SQL and NOT building the entities
is useful.


On the other hand, if someone came up with a nice solution to
efficiently load such structures, it seems like an interesting new
feature for NHibernate. :)


/Oskar

2012/2/6 Brad Laney <brad.j...@gmail.com>:

Brad Laney

unread,
Feb 6, 2012, 8:07:48 PM2/6/12
to nhusers
Heyo,

"
Query 1: session.Query<Listing>().BlaBla.ToList().
Query 2: access loadedListings[0].Items.First(), will load the Items
collection for ALL loaded listings
Query 3: access loadedListings[0].Items.First().Properties.First(),
will load the Properties collection for ALL loaded Items
Done.
"

Query 3 is wrong. That will only load the properties for the FIRST
item in the list.
Then that becomes a query for EVERY item in the list.
Then when you want to access the value of properties, it only does
ONE.
So while looping you end up doing an N+1.
> 2012/2/6 Brad Laney <brad.j.la...@gmail.com>:

Brad Laney

unread,
Feb 6, 2012, 8:17:15 PM2/6/12
to nhusers
I agree with you on a lot of levels.
Accept for the fact that the web is built off information now.
I deal with ecommerce. And in ecommerce, business men like to throw
data at the screen.

So when we want to show content to the user, its a lot of content,
because that is what they want.
I do not really have a say in the matter.

I do not know if this is allowed here but, xpost to my stackoverflow:

http://stackoverflow.com/questions/9042078/how-can-you-avoid-nhibernate-n1-with-composite-key

Darren Kopp

unread,
Feb 7, 2012, 7:12:52 PM2/7/12
to nhu...@googlegroups.com
SQL is not the be-all-end-all system for storing data. If you can, drop down to raw sql for the query and parse into DTO's. You don't need to put all that into session because you aren't saving it are you? Or perhaps look at a document database for storing materialized version of the DTO and figure out a way to keep that in sync.

Boris Drajer

unread,
Feb 8, 2012, 7:13:16 AM2/8/12
to nhusers
Hi Brad,

(Sorry for the late answer)

You don't have to do a full cartesian product and you don't really
have to use left join fetch... You can do something like

from A inner join fetch A.B where A = something
from B inner join fetch B.C where B.A = something
from C inner join fetch C.D where C.B.A = something

If you execute each of the three queries separately, you'l retrieve B
+ C + D number of rows and still have all collections loaded.

The difference between left join fetch and inner join fetch is that
empty collections don't get initialized with inner, while they do with
left. For example, with

from A inner join fetch A.B where A = something

A's with no B's don't appear in query results at all and their A.B
collections remain uninitialized, which produces the lazy
initialization exception when they are accessed. If this isn't a
problem for you, then inner join is indeed a better solution.

Best regards!

Brad Laney

unread,
Feb 8, 2012, 11:29:47 AM2/8/12
to nhusers
@boris

I have tried what you said and it doesn't work, at all.
1) When I inner join the collections arn't loaded into the lists, it
just ignores the data
2) When I run those 3 queries in a futures, it also does nothing. It
populates no lists except for B inside A.

Boris Drajer

unread,
Feb 9, 2012, 12:47:53 PM2/9/12
to nhusers
Hi,

There could be something else, then, that causes the problem... The
method works, I've been using it for years. Here's a couple of links,
this is where I got it from, maybe you'll find something I forgot to
mention:

http://www.ayende.com/Blog/archive/2007/06/20/Efficently-loading-deep-object-graphs.aspx
http://osdir.com/ml/java.springframework.user/2004-08/msg00157.html

(I can't reach Ayende's blog currently, but this page is available on
the wayback machine...)

BTW, in the latest NHibernate (3.2.0) I've been having trouble with
MultiQuery and Future (although they threw exceptions instead of just
not doing anything), it may be good to try to get things working
without them and go back only after everything else is performing
correctly.

Best regards!

Brad Laney

unread,
Feb 15, 2012, 7:03:25 PM2/15/12
to nhusers
I'll zip and upload a full working project to show you that it doesn't
work.

On Feb 9, 11:47 am, Boris Drajer <bdra...@8bit.rs> wrote:
> Hi,
>
> There could be something else, then, that causes the problem... The
> method works, I've been using it for years. Here's a couple of links,
> this is where I got it from, maybe you'll find something I forgot to
> mention:
>
> http://www.ayende.com/Blog/archive/2007/06/20/Efficently-loading-deep...http://osdir.com/ml/java.springframework.user/2004-08/msg00157.html

Brad Laney

unread,
Feb 15, 2012, 7:04:05 PM2/15/12
to nhusers
http://stackoverflow.com/questions/9042078/how-can-you-avoid-nhibernate-n1-with-composite-key

On Feb 15, 6:03 pm, Brad Laney <brad.j.la...@gmail.com> wrote:
> I'll zip and upload a full working project to show you that it doesn't
> work.
>
> On Feb 9, 11:47 am, Boris Drajer <bdra...@8bit.rs> wrote:
>
>
>
>
>
>
>
> > Hi,
>
> > There could be something else, then, that causes the problem... The
> > method works, I've been using it for years. Here's a couple of links,
> > this is where I got it from, maybe you'll find something I forgot to
> > mention:
>
> >http://www.ayende.com/Blog/archive/2007/06/20/Efficently-loading-deep...
>
Reply all
Reply to author
Forward
0 new messages