Question about document relationships

Jason Dentler

Apr 25, 2012, 4:22:08 PM
to rav...@googlegroups.com
First, let me apologize for the newb question. I've been working with RavenDB for a day or two. 

In my system, I have several view models like the following:

markets/418

{
  "Name": "Austin",
  "Url": "xxxxxxx",
  "StateAbbreviation": "TX",
  "EmailAddresses": {
    "Warranty": "xxxxxxxx"
  },
  "PhoneNumber": "(999)-999-9999",
  "MapCenter": {
    "Latitude": 30.350426,
    "Longitude": -97.713808
  },
  "MapZoom": 10,
  "MapIcons": []
}

cities/902

{
  "Name": "Buda",
  "Url": "xxxxx",
  "StateAbbreviation": "TX",
  "Market": {
    "Id": "markets/418",
    "Name": "Austin",
    "Url": "xxxxx"
  }
}

Notice that the City contains a few fields about its parent Market so that I can avoid querying several documents to display the city details. I think this is the right way to accomplish this, but I could be wrong.

I'm using NServiceBus. When my system handles a MarketNameChanged event, how would I reliably update the Market's Name property in my Market document and all the City documents that reference that Market?

I've seen this in the FAQ, but I could end up inconsistent since this is outside my unit of work.

Thanks,
Jason

Stijn Volders (ONE75)

Apr 25, 2012, 5:34:33 PM
to rav...@googlegroups.com
Yes, and this is normal. The document itself (Market) is a consistency boundary. If you have your data denormalized, you are using eventual consistency.

RavenDB supports transactions, so you could do the "Update Market" and "Update index" operations in one transaction, but I'm not sure how this works with NServiceBus if you want to update the index in the "MarketNameChanged" event.

Jason Dentler

Apr 25, 2012, 7:26:52 PM
to rav...@googlegroups.com
Thanks. I'm glad I'm on the right track with the modeling at least. 

Ignore NServiceBus for a moment. There are about 6 document types that contain the market name. Each market could have around 1000 related documents with the name field. What's the best way to update the market name? 

I don't mind being eventually consistent, but eventually, it has to either work entirely or throw an exception and roll back entirely. I don't know if this is possible.

Ryan Heath

Apr 26, 2012, 3:45:56 AM
to rav...@googlegroups.com
You could issue a patch command. 

Do you know .Include? You can query related docs in one go, saving you from problems like this. 

That said, a question: when would one use inclusion of properties of related docs into the aggregate doc instead of querying with .Include?

// Ryan

Oren Eini (Ayende Rahien)

Apr 26, 2012, 3:46:03 AM
to rav...@googlegroups.com
Jason,
I would actually say that you can NOT denormalize things, and keep the values in the Market document, then Include it.

Stijn Volders (ONE75)

Apr 26, 2012, 5:21:28 AM
to rav...@googlegroups.com
Oren,

Can you elaborate a little bit?

Reading this FAQ,  denormalization is a best practice.
Reading this document,  Includes are favored over denormalization

From what I understand, it's an "it depends" case: if you have frequent updates to the document, don't denormalize it. If it's almost static, denormalize the data.

I didn't read anything about the frequency of changes in the Market document, so I didn't question the design Jason used.

Why are you saying he cannot denormalize things? 

Thanks

Stijn

Itamar Syn-Hershko

Apr 26, 2012, 6:23:52 AM
to rav...@googlegroups.com
The FAQ pages are the docs from the old site, which we are slowly deprecating. That was a best practice back then, not today.

Now denormalization is only encouraged when you want to persist a point-in-time view of your data, like customer details in an order. Even if customer details change at a later time, the order doesn't change, hence that denormalization. Or rather, copy data over to your aggregate root when necessary.
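
As a rough illustration of the order example above, here is a minimal, self-contained C# sketch. The Customer, CustomerSnapshot, and Order types are hypothetical and are not part of RavenDB; the only point is that the order keeps its own copy of the customer details as they were when the order was placed.

```csharp
using System;

// Hypothetical types for illustration; none of this is RavenDB API.
var customer = new Customer("customers/1", "Ada", "1 Old Street");
var order = Order.Place("orders/1", customer);

// The customer moves after the order was placed.
customer = customer with { Address = "2 New Avenue" };

// The order still carries the address it was placed with.
Console.WriteLine($"order ships to: {order.ShipTo.Address}");
Console.WriteLine($"customer now at: {customer.Address}");

record Customer(string Id, string Name, string Address);

// Frozen copy of the customer fields the order cares about.
record CustomerSnapshot(string CustomerId, string Name, string Address);

record Order(string Id, CustomerSnapshot ShipTo)
{
    // Copies the customer details at the moment the order is created.
    public static Order Place(string id, Customer c) =>
        new(id, new CustomerSnapshot(c.Id, c.Name, c.Address));
}
```

Because the snapshot is meant to stay frozen, nothing ever has to go back and update it, which is why this kind of denormalization does not create the update problem discussed in this thread.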

Another valid scenario for doing denormalization is when sharding, in places you cannot shard related collections together.

In all other places, use includes or multi-maps.

Jason Dentler

Apr 26, 2012, 9:18:44 AM
to rav...@googlegroups.com
Includes sound dangerously like the .Fetch and .ThenFetch NHibernate hell I'm trying to escape.

I've looked up the multi-map, but I'm still thoroughly confused by how this could work to merge bits of one document into another. Can you give a more concrete example?

Thanks,
Jason

Ryan Heath

Apr 26, 2012, 9:31:55 AM
to rav...@googlegroups.com
I am not familiar with the .Fetch and .ThenFetch NHibernate hell.
Could you explain what hell you are referring to?
Perhaps that hell does not exist in the RavenDB realm ...

// Ryan

Matt Warren

Apr 26, 2012, 9:50:06 AM
to rav...@googlegroups.com
Multi-Map doesn't allow you to merge bits of one doc into another. It allows you to index across docs that are in different collections. Or more precisely, docs that have a different "shape", but some fields in common.
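
To picture the idea, here is a minimal sketch in plain LINQ over in-memory objects. It is an analogy only, not a RavenDB index definition, and the Market, City, and Entry types are hypothetical. Each collection gets its own "map" that projects into the same entry shape, and a single query then runs over the combined entries.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins; this is plain LINQ, not a RavenDB index.
var markets = new[] { new Market("markets/418", "Austin") };
var cities = new[] { new City("cities/902", "Buda", "markets/418") };

// One "map" per collection, each projecting into the same entry shape.
var entries = markets.Select(m => new Entry(m.Id, m.Name, "Market"))
    .Concat(cities.Select(c => new Entry(c.Id, c.Name, "City")))
    .ToList();

// A single query now spans both collections via the shared Name field.
var hits = entries.Where(e => e.Name.Contains("u")).ToList();
foreach (var hit in hits)
    Console.WriteLine($"{hit.Kind}: {hit.Name} ({hit.Id})");

record Market(string Id, string Name);
record City(string Id, string Name, string MarketId);
record Entry(string Id, string Name, string Kind);
```

Note that nothing from the market ends up inside the city entry; the two kinds of documents simply become searchable side by side.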

Jason Dentler

Apr 26, 2012, 10:50:41 AM
to rav...@googlegroups.com
Let's back up a bit. Maybe multimap isn't the best solution, or maybe I don't understand how it fits my use-case. If my city documents only have the market id, and I am given a city id, how can I get a result in this shape:

{
  "Id": "cities/902",

Ryan Heath

Apr 26, 2012, 10:55:54 AM
to rav...@googlegroups.com
var city = Session.Include<City>(c => c.marketid).Load(cityid);
var market = Session.Load<Market>(city.marketid);

The second Load does not go over the wire. Market has been loaded with
the first call.
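
Why the second Load is free can be shown with a toy model. This is a minimal sketch, assuming a session that caches every document it receives; FakeServer and ToySession are hypothetical classes written for this example and are not RavenDB types.

```csharp
using System;
using System.Collections.Generic;
using Doc = System.Collections.Generic.Dictionary<string, string>;

// Toy model only: FakeServer and ToySession are hypothetical, not RavenDB types.
var server = new FakeServer(new Dictionary<string, Doc>
{
    ["cities/902"] = new Doc { ["Name"] = "Buda", ["MarketId"] = "markets/418" },
    ["markets/418"] = new Doc { ["Name"] = "Austin" },
});
var session = new ToySession(server);

var city = session.Load("cities/902", includePath: "MarketId"); // one round trip
var market = session.Load(city["MarketId"]);                    // answered from the cache

string cityName = city["Name"];
string marketName = market["Name"];
Console.WriteLine($"{cityName} is in {marketName}; requests sent: {server.Requests}");

class FakeServer
{
    private readonly Dictionary<string, Doc> _docs;
    public int Requests { get; private set; }

    public FakeServer(Dictionary<string, Doc> docs) => _docs = docs;

    // One request returns the document plus, optionally, the document
    // whose id is stored in the property named by includePath.
    public Dictionary<string, Doc> Get(string id, string includePath = "")
    {
        Requests++;
        var result = new Dictionary<string, Doc> { [id] = _docs[id] };
        if (includePath.Length > 0 && _docs[id].TryGetValue(includePath, out var referencedId))
            result[referencedId] = _docs[referencedId];
        return result;
    }
}

class ToySession
{
    private readonly FakeServer _server;
    private readonly Dictionary<string, Doc> _cache = new();

    public ToySession(FakeServer server) => _server = server;

    // Only goes to the server when the document is not already cached.
    public Doc Load(string id, string includePath = "")
    {
        if (!_cache.ContainsKey(id))
            foreach (var pair in _server.Get(id, includePath))
                _cache[pair.Key] = pair.Value;
        return _cache[id];
    }
}
```

The request counter stays at one because the market arrived together with the city and was already in the session cache when it was asked for.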

// Ryan

Chris Marisic

Apr 27, 2012, 8:54:01 AM
to rav...@googlegroups.com
Yes, but in domain-driven design your documents will be aggregate roots and will hold few to zero references to anything outside themselves.

A user document is by far the most common to have with an Include.

If your design requires you to include the User, and then you need to try to include something on the User (this isn't possible), that shows you have a document modeling issue. Past that, you will likely have very few includes, or if you see that growing fast, you're modeling for an RDBMS, not a document database.

Jason Dentler

Apr 27, 2012, 9:27:11 AM
to rav...@googlegroups.com
I'm only using RavenDB for a read model. Aggregate boundaries don't apply here. 

Ideally, I'd like to store serialized view models, which implies some data duplication. I thought this was the whole idea of NoSQL document databases like Raven.

- J

Chris Marisic

Apr 27, 2012, 12:45:50 PM
to rav...@googlegroups.com
The whole purpose of document databases is to solve the shortcomings RDBMSs face. RDBMSs generally are not the best choice for a backend datastore for most applications. They are used because they have become an accepted standard.

Using RavenDB as a view store is a good way to start weaning yourself off the RDBMS addiction.

Jason Dentler

Apr 28, 2012, 6:51:35 AM
to rav...@googlegroups.com
Right. Everyone says view store is a great use. Oren drops in to say don't denormalize references, despite what it says in the site's documentation and most blog posts I've found on the subject. According to my other responses, that leaves me with Include and multimap.

Include means I fetch multiple documents and automap them together onto my view model, which IMO defeats the purpose of having a read model. I'm still heavily processing the result when it's being read, not when it's being written, and I'm fetching, for example, an entire market document instead of just the market name and url that I need.

Can you explain how multimap helps me return an entire city document (with market id) and add the market name and market url from the referenced market document? I haven't had that ah-ha! moment.

- Jason

Ryan Heath

Apr 28, 2012, 7:24:10 AM
to rav...@googlegroups.com
You could use TransformResults of an index as seen here:
http://ayende.com/blog/4661/ravendb-live-projections-or-how-to-do-joins-in-a-non-relational-database

Once you have your index, you'll see it is not really more than
an include (server side).
You should think about what you are willing to pay for:
- server-side index processing
- or fetching/including the market while querying

I don't believe it really makes a big difference in this particular case.
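
Either way, the work being done is the same small projection. Here is a minimal in-memory sketch of that step, not the TransformResults syntax itself; Market, City, MarketRef, and CityView are hypothetical types, and the dictionary stands in for whatever loads the referenced market.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-ins; not RavenDB API.
var markets = new Dictionary<string, Market>
{
    ["markets/418"] = new Market("markets/418", "Austin", "xxxxx"),
};
var city = new City("cities/902", "Buda", "TX", "markets/418");

// Projection step: look up the referenced market and copy only the
// fields the view needs, instead of returning the whole market document.
var market = markets[city.MarketId];
var view = new CityView(city.Id, city.Name, city.StateAbbreviation,
    new MarketRef(market.Id, market.Name, market.Url));

Console.WriteLine($"{view.Name} -> {view.Market.Name} ({view.Market.Id})");

record Market(string Id, string Name, string Url);
record City(string Id, string Name, string StateAbbreviation, string MarketId);
record MarketRef(string Id, string Name, string Url);
record CityView(string Id, string Name, string StateAbbreviation, MarketRef Market);
```

The only real choice is where this lookup runs: on the server as part of the index, or on the client after an include.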

// Ryan

Itamar Syn-Hershko

Apr 28, 2012, 1:46:55 PM
to rav...@googlegroups.com
Jason,

Look at the dates of Oren's blog posts. Best practices with RavenDB have changed as new functionality came in. Denormalization means more friction and more administration work for keeping data "true", and you want to avoid that. Unless of course that data actually belongs there, like in places where we need to freeze data in a certain point in time, as I mentioned before.

Includes are used to fetch referenced documents in one go - that would mean larger data transfer, but it is more efficient than making multiple requests. Think about it as a better way of doing Lazy loading of stuff you know you are going to need.

Map functions in index definitions map properties from your objects and drop them into an index; a multi-map is just a way of doing this with multiple types of objects. An optional reduce function allows you to join them, or to do aggregation operations before dropping those values into the index.

You can either use a TransformResults function to filter out data you don't need on every query, or read data directly from the indexes (and possibly map amended fields to the index). See this for example  http://daniellang.net/using-an-index-as-a-materialized-view-in-ravendb/ 

Also, if this is a standard query and you can't find an easy way to make it in a way that feels right, you may want to question your model. It might be good, but there may be ways to change it so the common use cases work better.

Oren Eini (Ayende Rahien)

Apr 28, 2012, 4:47:04 PM
to rav...@googlegroups.com
Jason,
There are distinct differences between using RavenDB as a persistent view model store and as the transactional store.
Those differences include how you model, what you include, and how you handle things.

My default advice is that if you run into trouble with updating denormalized references, back off completely and use Include. 
If you don't have an issue with updating denormalized references (either because of the way you structured the way the data is pushed to RavenDB or because it is point-in-time data), go ahead and denormalize.

Wallace Turner

May 3, 2012, 9:39:14 AM
to rav...@googlegroups.com
Using 888.
I'm trying to find all DataResult documents that have State = null. My query is returning 0 results, but there are clearly documents in my database that have State = null.
*Why am I doing this?* Because this field was added.

See this screenshot: I'm querying a document that I know has State = null.
The very next line should return all docs that have a null State?? Adding WaitForNonStaleResultsAsOfLastWrite doesn't fix it.

Wallace Turner

May 3, 2012, 9:49:18 AM
to ravendb
Using 888.
I'm trying to find all DataResult documents that have State = null. My query is returning 0 results, but there are clearly documents in my database that have State = null.
*Why am I doing this?* Because this field was added.

See this screenshot:

http://d.pr/i/zmJw

Oren Eini (Ayende Rahien)

May 3, 2012, 9:50:57 AM
to rav...@googlegroups.com
Can you try producing a failing test?

Wallace Turner

May 3, 2012, 7:10:42 PM
to rav...@googlegroups.com
Hi Oren, I'm not sure how my new post ended up on this old thread.

It's happening in a database with about 500k records (not production). I have a small bootstrapper app that recreates the issue; I could send you guys this and the database zipped?


Oren Eini (Ayende Rahien)

May 4, 2012, 9:48:29 AM
to rav...@googlegroups.com
Yes, that would be great

Wallace Turner

May 5, 2012, 8:13:08 AM
to rav...@googlegroups.com
Hi Oren, I've got a spike project with a database included; it's a console app with just 10 lines to demonstrate the problem.

As noted before, the important lines are:

    var result = session.Query<DataResult>().First(r => r.SiteId == "t108137341");
    Console.WriteLine("result " + result + " state " + result.State); // has a result where result.State is null
    var query = session.Query<DataResult>()
        .Where(r => r.State == null)
        .Customize(o => o.WaitForNonStaleResultsAsOfLastWrite())
        .ToList();
    Console.WriteLine("query count " + query.Count); // no results where State == null?

There are roughly 400k records and about 20k have State=null but NONE are returned when running the query above.



Cheers,

Wal


Oren Eini (Ayende Rahien)

May 6, 2012, 8:25:53 AM
to rav...@googlegroups.com
Wallace, the issue is the structure of your documents.
Take a look at this, the document that you are loading:

{
  "Address": "21 Weelara Road, City Beach, WA 6015",
  "Price": "$3,600,000",
  "Url": "/property-house-wa-city+beach-108137341",
  "SiteId": "t108137341",
  "Source": "rea",
  "CreatedOn": "2012-04-05T13:52:43.0000000Z",
  "LastUpdated": "2012-04-11T11:03:57.0000000Z",
  "LastModified": "2012-04-05T13:52:43.0000000Z",
  "PriceHistories": [
    {
      "ChangeDate": "2012-04-05T13:52:43.0000000Z",
      "Value": "$3,600,000",
      "Notes": ""
    }
  ],
  "Suburb": "city beach"
}


Note that it doesn't HAVE a State property. Because it doesn't have a state property (vs. having it and it being null), you can't query on that.
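
The difference is easy to see at the JSON level. The following is a minimal sketch using System.Text.Json rather than RavenDB's own serializer (an assumption made so the example is self-contained), and the DataResult class here is a cut-down hypothetical version of the one in the spike project. Both documents deserialize to an object whose State is null, but only one of them actually contains a State property.

```csharp
using System;
using System.Text.Json;

string withNull = "{\"SiteId\":\"a\",\"State\":null}";
string missing = "{\"SiteId\":\"b\"}";

foreach (var json in new[] { withNull, missing })
{
    // Raw JSON view: is the property there at all, and is it an explicit null?
    using var doc = JsonDocument.Parse(json);
    bool hasProperty = doc.RootElement.TryGetProperty("State", out var state);
    bool isJsonNull = hasProperty && state.ValueKind == JsonValueKind.Null;

    // Typed view: both cases end up as a null State on the C# object.
    var typed = JsonSerializer.Deserialize<DataResult>(json);
    bool stateIsNull = typed!.State is null;

    Console.WriteLine($"has State: {hasProperty}, JSON null: {isJsonNull}, C# null: {stateIsNull}");
}

// Cut-down, hypothetical version of the class discussed in the thread.
class DataResult
{
    public string SiteId { get; set; } = "";
    public string? State { get; set; }
}
```

This is why the loaded object shows State as null in C# even though the stored document has nothing for a State == null query to match.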

Wallace Turner

May 6, 2012, 7:49:25 PM
to rav...@googlegroups.com
>Note that it doesn't HAVE a State property. Because it doesn't have a state property (vs. having it and it being null), you can't query on that.

Right, I re-generated the index. Does the index simply exclude the document because it doesn't have a State property? I'm curious how it would distinguish between a document that has a NULL state and one that doesn't have a State property at all (keeping in mind that from C# both these documents appear the same).

What is the recommended approach to resolving issues like this, i.e. data migration where the domain model changes? As a brute-force approach it seems I have to fetch all the objects and then Save() them back simply to populate the null field.

Oren Eini (Ayende Rahien)

May 7, 2012, 12:15:35 AM
to rav...@googlegroups.com
Wallace,
Regenerating the index won't do much; the data in the database isn't the same.
And the shape of the C# class is meaningless on the server.

You can either use UpdateByIndex or the brute force way.
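
What either route has to achieve can be sketched without touching the database. This is a minimal in-memory sketch, assuming the documents are available as JSON objects; it uses System.Text.Json.Nodes and makes no RavenDB calls, so it shows the effect of the migration rather than how to run it against a server.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

// In-memory stand-ins for stored documents; no RavenDB calls are made here.
var documents = new List<JsonObject>
{
    JsonNode.Parse("{\"SiteId\":\"t1\",\"Suburb\":\"city beach\"}")!.AsObject(),
    JsonNode.Parse("{\"SiteId\":\"t2\",\"State\":\"WA\"}")!.AsObject(),
};

int patched = 0;
foreach (var doc in documents)
{
    if (doc.ContainsKey("State")) continue; // property already present, leave it alone
    doc["State"] = null;                     // write an explicit null so the property exists
    patched++;
}

Console.WriteLine($"patched {patched} of {documents.Count}");
Console.WriteLine(documents[0].ToJsonString());
```

After the pass, every document carries a State property, so the missing and explicit-null cases no longer differ.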