introductions. and: is RavenDB right for me?

mindplay

Apr 11, 2012, 3:26:12 PM
to rav...@googlegroups.com
Hello List,

I've been looking at RavenDB for a while now, and recently decided to install and play around with it.

Just to provide a bit of background, I'm currently working on a very large business application, which we decided to build using NHibernate and Fluent. This has been about two years in the making at this point.

NH breaks standard C# language features such as GetType() and the "is" keyword, which gets really problematic when combined with the ASP.NET/MVC framework, which makes extensive use of reflection. 

About a year into the project, we hit so many walls we eventually gave up using the criteria API, which appears to be an incomplete abstraction. We also had to give up Fluent, which also appears to be incomplete. We dropped down to generating HQL queries - so much for strong typing. Feels like 1998 all over again. In some cases we even hit limitations with HQL, and had to drop down to raw SQL and views. And here I was thinking we had come further than that.

In the past years, I built several apps using PHP and the Yii framework - it has a nice, simple implementation of AR that doesn't make things overly complicated or get in the way of doing simple things the simple way. No, it's not a full, clean abstraction of SQL databases by any means - but it doesn't attempt to be either, which makes it thinner, easier to understand, and easier to work with.

Even after two years of intense day-to-day work with NH, it still seems to be full of surprises. In my honest opinion, I believe it's the Black Monolith of O/RM, and I can never truly know all of its secrets ;-)

After years of looking with envy at various new graph and object DBMS, I have become increasingly convinced that even the best efforts of the smartest people (with all due respect to the authors of NHibernate) cannot make RDBMS really truly work well for complex object graphs. I like to say, and only half-jokingly, that the one thing a relational database can't handle, is relations. ;-)

Anyway, so much for introductions. As said, I've decided to take a look at RavenDB and see for myself first-hand what it's about.

I'm going to ask some questions, and please don't take these the wrong way - I'm not trying to criticize or provoke or thwart anybody's efforts, I'm just simply trying to understand the value proposition from my own point of view.

Taking one thing at a time, the first major surprise to me, was the fact that there is no real support for traversal of a model as such. And what I mean by that is...

public class Product
{
    public string Name { get; set; }
    public float Price { get; set; }
    public IList<string> Categories { get; set; }
}

This Product-class has a list of categories, probably strings like "categories/123", etc.

That's not how you would model an entity in a typical "business"-model - you'd have something like IList<Category> containing references to the actual Category objects, or simulating the presence of that collection using a proxy and lazy-loading pattern.

So now there's an aspect of persistence to this entity, which suggests to me that the intention is to write dedicated DTO's rather than "business"-entities, and persist those?

Or maybe the intent is actually for me to write "business"-entities and just accept the fact that persistence aspects are going to get mixed in? The NHibernate community (and the software itself) has gone to great lengths to teach me that this is "wrong" and "bad" for various reasons. And my hope/dream after seeing the early RavenDB videos was that it would "just work", as you keep saying - but it seems like you still need to quite carefully design with persistence in mind? And that there is no clean/direct way to separate these concerns?

I understand that you have advanced indexing and querying features, allowing you to prefetch related entities in advance and so forth - but my concern here is not really performance, but transparency.

In an ideal world, I would just write completely persistence-ignorant models, optimizing for the problem-domain of the software itself, without regard for persistence, perhaps other than specifying which properties are persistent or transient.

I realize this is not an ideal world, and perhaps the philosophy of RavenDB is to just accept that fact and deal with it?

But storing full keys like "categories/123" almost seems worse to me than just storing "123" in a category-id-column in a relational database - at least then you have things like cascading updates/deletes without having to deal with that housekeeping aspect of persistence. How is this better?

How are you using RavenDB, or what is the intended use? Do you write dedicated DTOs alongside your business-model, or do you just write business-entities, mix in your storage concerns, and live with that?

As said, please don't take these questions the wrong way - I'm not trying to provoke or attack your efforts, your work, your ideas or your ideals. I'm just trying to understand whether or how I can adopt your ideas/values and work productively that way.

Bottom line, my concern is scalability in terms of complexity - not in terms of performance. I don't build Twitter or Facebook, and for the most part, RDBMS perform acceptably for the applications I need to build.

Thanks!

-- Rasmus Schultz <http://mindplay.dk>

Chris Marisic

Apr 11, 2012, 4:03:16 PM
to rav...@googlegroups.com
Your questions here could literally lead to hours' worth of discussion. The short answer: RavenDB is designed specifically to solve ALL of the problems you mentioned.

That being said, you have new problems to overcome: dealing with transaction boundaries, and deciding where and when to denormalize data.

Generally with RavenDB you want to specifically avoid designs that require "cascade"-type behavior; needing it implies incorrect usage of a NoSQL DB. In some scenarios these operations can't be avoided and just need to be limited as much as possible, especially since a cascade of that sort could theoretically require mutating every document in your database (or at least every document in a collection).

mindplay

Apr 11, 2012, 5:02:54 PM
to rav...@googlegroups.com
I realize this is no small topic, but what I'm fishing for here is the intended use of RavenDB specifically, more so than a general debate about right and wrong - I guess I'm looking for the "happy path" that will lead to maximum joy and fewest possible headaches with RavenDB specifically.

It sounds like denormalization is one of the first facts of life with RavenDB that one needs to accept? I'm going to have a hard time with that - I generally have been taught to avoid denormalization, particularly for performance, and over the years have gotten good at avoiding this with SQL-server/MySQL without sacrificing performance.

It sounds like one is going to have to make other sacrifices with RavenDB in terms of denormalization, for other reasons, but unavoidably, and by design?

Itamar Syn-Hershko

Apr 11, 2012, 5:31:15 PM
to rav...@googlegroups.com
No, basically you only denormalize when it makes sense, and that is usually in one of two scenarios - sharding under some circumstances, and persisting a point-in-time view of data. In those 2 scenarios it actually makes sense to denormalize, hence it's not a sacrifice (for example, you WANT the product price to be denormalized into the order object so future price changes won't affect past orders).

We have multi-maps and includes to handle 99% of all other cases
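
The order/price case above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Order`, `OrderLine` and the `Add`/`Total` helpers are hypothetical names, and the classes are plain C# with no RavenDB client calls.

```csharp
using System;
using System.Collections.Generic;

public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical order line: it keeps a reference to the product document
// plus a copy of the values that must not change once the order is placed.
public class OrderLine
{
    public string ProductId { get; set; }
    public string ProductName { get; set; }
    public decimal UnitPrice { get; set; }
    public int Quantity { get; set; }
}

public class Order
{
    public string Id { get; set; }
    public List<OrderLine> Lines { get; } = new List<OrderLine>();

    // Copies name and price from the product at the time of ordering.
    public void Add(Product product, int quantity)
    {
        Lines.Add(new OrderLine
        {
            ProductId = product.Id,
            ProductName = product.Name,
            UnitPrice = product.Price,
            Quantity = quantity
        });
    }

    public decimal Total()
    {
        decimal total = 0m;
        foreach (var line in Lines) total += line.UnitPrice * line.Quantity;
        return total;
    }
}

public static class Program
{
    public static void Main()
    {
        var product = new Product { Id = "products/1", Name = "Widget", Price = 10m };
        var order = new Order { Id = "orders/1" };
        order.Add(product, 2);

        product.Price = 12m; // a later price change

        // The order still reflects the price at the time it was placed.
        Console.WriteLine(order.Total()); // 20
    }
}
```

The copied `UnitPrice` is not a cache of the product's price; it records a different fact (what was charged), which is why it is allowed to diverge.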

Itamar Syn-Hershko

Apr 11, 2012, 5:33:30 PM
to rav...@googlegroups.com
Re " I generally have been taught to avoid denormalization " - the best practices with Raven and any other non-relational DB are COMPLETELY different than those of RDBMSes. Specifically, we don't try to persist normal forms, hence normalization is not a cardinal sin.

Itamar Syn-Hershko

Apr 11, 2012, 5:34:48 PM
to rav...@googlegroups.com
Let me rephrase:  hence denormalization is not a cardinal sin

mindplay

Apr 11, 2012, 6:23:46 PM
to rav...@googlegroups.com
I'm not sure that technically is denormalization?

You may be storing the same piece of data, but you're actually storing two different pieces of information. For example, "the customer's current address" is not the same information as "the customer's address at the time he placed the order" - even if the data you're persisting is identical in those two cases, the information it conveys is different when you're preserving historical information.

My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.

Although I suppose there's no reason you'd be forced to store keys in such a long form? You could use an IList<int> for category-ids, for example, if you wanted to, and still make use of multi-maps and includes, as far as I can tell, correct?

I saw a custom key example somewhere, where the user's e-mail address was being used as the primary key, and references to that user would be stored as "users/j...@doe.com", which seems clever at a glance - but what happens if the user changes their e-mail address? Seems like an extremely bad idea; I don't know why such an example would even be cited...



Itamar Syn-Hershko

Apr 11, 2012, 6:46:02 PM
to rav...@googlegroups.com
inline

On Thu, Apr 12, 2012 at 1:23 AM, mindplay <ras...@mindplay.dk> wrote:
> I'm not sure that technically is denormalization?

So what do you call a "denormalization"?

> You may be storing the same piece of data, but you're actually storing two different pieces of information. For example, "the customer's current address" is not the same information as "the customer's address at the time he placed the order" - even if the data you're persisting is identical in those two cases, the information it conveys is different when you're preserving historical information.
>
> My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.

Why is that? "categories/123" is a document ID. With RavenDB, document IDs are strings. And I'm not sure what the 2 pieces of redundant information are?

We don't have the notion of foreign keys - RavenDB is NOT relational. "123" is not a document in a table of "categories"; "categories/123" is a document, and internally we group similar documents by their Entity-Name under a logical unit called a Collection.


Coming from a relational background, it's important to remember 2 things: the complete object graph is persisted, and we don't care about repeating ourselves where it makes sense to do so ("denormalization"). The question we ask when modeling is "what is an object" and "where does it make sense to repeat ourselves". Both are answered by Domain-Driven-Design concepts like an aggregate root (using the transactional boundaries to define discrete objects) and by expected usage patterns.
 

> Although I suppose there's no reason you'd be forced to store keys in such a long form? You could use an IList<int> for category-ids, for example, if you wanted to, and still make use of multi-maps and includes, as far as I can tell, correct?

No, you need the full ID there. As I said, the ID is a string that by convention holds the collection name as well.
 

> I saw a custom key example somewhere, where the user's e-mail address was being used as the primary key, and references to that user would be stored as "users/j...@doe.com", which seems clever at a glance - but what happens if the user changes their e-mail address? seems like an extremely bad idea, I don't know why such an example would even be cited...

You should take that into account when designing your system. On the RavenDB website we did the same, since we assume the email address won't change, or at least we don't care if it does. Doing this makes it very easy to enforce a unique constraint over the email address. If you have a website where you want to support changes to email addresses, you'd probably want to go another route.

As you can see, using RavenDB is all about considering your business model closely. It may be confusing at first.

Itamar.

Beyers

Apr 11, 2012, 7:06:03 PM
to rav...@googlegroups.com
Rob Ashton has a post I found helpful that explains the whole transaction boundary concept and how it translates to good document design:  http://codeofrob.com/entries/ravendb---document-design-with-collections.html 
In some cases it is perfectly fine to store child documents as part of the parent document, in other cases storing references is better. The above post helped me a lot to clarify which design to use when, maybe it will help you as well.

Justin A

Apr 12, 2012, 2:59:22 AM
to rav...@googlegroups.com
Hi Mindplay - welcome to the secret ninja DoJo.

i'm one of the more challenged individuals around here - and i'm slowly getting the hang of it. It == NoSql and RavenDb's implementation of it.

The hardest thing I struggled with initially was rewiring my brain to -stop- thinking like an RDBMS / SQL language and now thinking more about documents / my domain model.

What i've found is this : Using RavenDb i don't really have to worry about the database any more. Meaning, here's my solution/product .. now how do I need to stick this crap into a Sql Server? PITA.

Once i broke free of that personal constraint (ie. modelling everything for a fricking database instead of modelling for my solution/domain) .. things started working better for me.

And then all these scenarios (like why is an -identity- a string? OMG! etc..) just became: question => reason and solution.

And development life is much easier and quicker :)

Quote Itamar: As you can see, using RavenDB is all about considering your business model closely. It may be confusing at first.

Drink the cool-aid... u won't regret it :)

Oren Eini (Ayende Rahien)

Apr 12, 2012, 4:23:50 AM
to rav...@googlegroups.com
> That's not how you would model an entity in a typical "business"-model - you'd have something like IList<Category> containing references to the actual Category objects, or simulating that presence of that collection using a proxy and lazy-loading pattern.


No, actually, that isn't how you would model this.
This is how you are _used_ to modeling this, because you are thinking about how NHibernate does this.
You have to understand that this has been an explicit design choice with RavenDB. I have seen the problems that you get into when you try to go the magic route.
Since you noted the issue with `is` and `GetType()`, and you probably know about the SELECT N+1 issues, you are probably familiar with those issues.

Instead of trying to imagine a world where everything is in memory (the abstraction that NHibernate is trying to create), RavenDB follows the Aggregate model, where there are clear boundaries between different entities. That matches well to the way things actually work, because you can rely on being able to cheaply access anything inside the aggregate, and there is an explicit step that you have to take to access anything that isn't in the aggregate.

Note that RavenDB contains a lot of features, like `Include()` and `Live Projections`, that allow you to easily get the related data, but again, we do that as an explicit step because you _have_ to respect the boundary.
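
To make the "explicit step" concrete, here is a small sketch of crossing an aggregate boundary by document ID. `InMemoryDocuments` is a hypothetical in-memory stand-in written for this illustration; it is not the RavenDB client API and does not attempt to reproduce what `Include()` does on the server.

```csharp
using System;
using System.Collections.Generic;

public class Category
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
    // Holds document IDs such as "categories/1", not Category objects.
    public IList<string> Categories { get; set; } = new List<string>();
}

// Hypothetical stand-in for a document store (assumption, not RavenDB's API).
public class InMemoryDocuments
{
    private readonly Dictionary<string, object> docs = new Dictionary<string, object>();

    public void Store(string id, object doc)
    {
        docs[id] = doc;
    }

    // Loading is always an explicit call by document ID.
    public T Load<T>(string id) where T : class
    {
        object doc;
        return docs.TryGetValue(id, out doc) ? doc as T : null;
    }
}

public static class Program
{
    public static void Main()
    {
        var db = new InMemoryDocuments();
        db.Store("categories/1", new Category { Id = "categories/1", Name = "Tools" });
        db.Store("categories/2", new Category { Id = "categories/2", Name = "Garden" });

        var spade = new Product { Id = "products/1", Name = "Spade" };
        spade.Categories.Add("categories/1");
        spade.Categories.Add("categories/2");
        db.Store(spade.Id, spade);

        // Everything inside the product document is available after one load.
        var product = db.Load<Product>("products/1");

        // Anything outside the aggregate takes a visible, explicit load.
        foreach (var categoryId in product.Categories)
        {
            var category = db.Load<Category>(categoryId);
            Console.WriteLine(category.Name);
        }
    }
}
```

The point of the sketch is only that the cost of leaving the document is visible in the calling code instead of being hidden behind a lazy-loading proxy.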

> So now there's an aspect of persistence to this entity, which suggests to me that the intention is to write dedicated DTO's rather than "business"-entities, and persist those?

Nope, it is just that you model your entities in a different way than you would using a relational database.

> but my concern here is not really performance, but transparency.

So was mine when designing this. But instead of pretending that "oh, it doesn't matter, let us deal with this in the OR/M layer", I decided that we need to be transparent about the actual implications of what you are doing. The end result is a much better application, because you don't have hidden snares waiting for you.

> In an ideal world, I would just write completely persistence-ignorant models, optimizing for the problem-domain of the software itself, without regard for persistence, perhaps other than specifying which properties are persistent or transient.

No, you won't. Because even if we assume that your entire model is in memory, that is _still_ a bad way to design things. You need to think about things like concurrency, you need to think about transaction boundaries, you need to think about how to actually _deal_ with things. What you are saying is valid if you had only one user, only one time. But it falls apart once you start to consider what is actually going on.

Oren Eini (Ayende Rahien)

Apr 12, 2012, 4:27:41 AM
to rav...@googlegroups.com
inline

On Thu, Apr 12, 2012 at 1:23 AM, mindplay <ras...@mindplay.dk> wrote:

> My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.

You are breaking apart the key in your head. That isn't how it works in RavenDB.
This is a _single value_. It is structured this way for one major reason: readability. Having it in this form makes it easier to work with in your code.

> Although I suppose there's no reason you'd be forced to store keys in such a long form? You could use an IList<int> for category-ids, for example, if you wanted to, and still make use of multi-maps and includes, as far as I can tell, correct?

That will work, yes. It isn't recommended. Same readability argument.

> I saw a custom key example somewhere, where the user's e-mail address was being used as the primary key, and references to that user would be stored as "users/j...@doe.com", which seems clever at a glance - but what happens if the user changes their e-mail address? seems like an extremely bad idea, I don't know why such an example would even be cited...

You don't do that, then, if that is an option.
The document key cannot be changed, that is a cardinal rule in RavenDB. You cannot "rename" a document.
If you have the option of changing emails, you don't use the email as the document ID.

Chris Marisic

Apr 12, 2012, 9:03:35 AM
to rav...@googlegroups.com


On Thursday, April 12, 2012 4:27:41 AM UTC-4, Oren Eini wrote:
> inline
>
> On Thu, Apr 12, 2012 at 1:23 AM, mindplay <ras...@mindplay.dk> wrote:
>
>> My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.
>
> You are breaking apart the key in your head. That isn't how it works in RavenDB.
> This is a _single value_. It is structured this way for one major reason: readability. Having it in this form makes it easier to work with in your code.

We use RavenDB with almost no HiLo keys. We've been able to create natural keys for our documents, so documents related to a customer/# get IDs like customer/#/someresource/someidentifier. This results in natural RESTful URL structures that you can carry over to your app. It also allows full-text search, using multi-maps, across the ID in addition to the content fields; boosting the ID then gives you very relevant search results, even across multiple collections.
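
A rough sketch of the key scheme described above, assuming plain string IDs. The `Keys` helper and its method names are hypothetical and written for this illustration; the prefix filter runs over an in-memory list and only mimics the idea of looking documents up by ID prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Keys
{
    // Hypothetical helper: builds customer/{number}/{resource}/{identifier}.
    public static string ForCustomerResource(int customerNumber, string resource, string identifier)
    {
        return "customer/" + customerNumber + "/" + resource + "/" + identifier;
    }

    // Returns the IDs that start with the given prefix (ordinal comparison).
    public static IEnumerable<string> WithPrefix(IEnumerable<string> ids, string prefix)
    {
        return ids.Where(id => id.StartsWith(prefix, StringComparison.Ordinal));
    }
}

public static class Program
{
    public static void Main()
    {
        var ids = new List<string>
        {
            "customer/7",
            Keys.ForCustomerResource(7, "invoices", "2012-04"),
            Keys.ForCustomerResource(7, "contacts", "billing"),
            Keys.ForCustomerResource(70, "invoices", "2012-04")
        };

        // The trailing slash keeps "customer/70/..." out of the results.
        foreach (var id in Keys.WithPrefix(ids, "customer/7/"))
        {
            Console.WriteLine(id);
        }
    }
}
```

Because the ID doubles as a path, the same string can serve as a URL segment and as the handle for finding all documents that belong to one customer.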


>> I saw a custom key example somewhere, where the user's e-mail address was being used as the primary key, and references to that user would be stored as "users/j...@doe.com", which seems clever at a glance - but what happens if the user changes their e-mail address? seems like an extremely bad idea, I don't know why such an example would even be cited...
>
> You don't do that, then, if that is an option.
> The document key cannot be changed, that is a cardinal rule in RavenDB. You cannot "rename" a document.
> If you have the option of changing emails, you don't use the email as the document ID.

[Well, I guess we break that in one scenario, where we actually let staff change a natural identifier in the rare cases it needs to change. It was fairly easy to do that using straight RavenDB CommandData and DatabaseCommands.StartsWith, and the benefits of having the natural keys in our scenario are definitely worth it. Related documents are only small groupings.]

mindplay

Apr 12, 2012, 9:43:01 AM
to rav...@googlegroups.com
On Wednesday, April 11, 2012 6:46:02 PM UTC-4, Itamar Syn-Hershko wrote:

> So what do you call a "denormalization"?

the introduction of any "accidental" state - when you're storing copies of data to increase performance, for example. Ideally, you should never have to do that.

the only time you should have to make copies of data, is when copying the data is part of an operation that isn't "accidental" - that is, it satisfies a real requirement, not just something you have to do because the underlying systems suffer from technical limitations that make them unable to handle normalized models with sufficient performance.
 
>> My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.
>
> Why is that? "categories/123" is a document ID. With RavenDB, document IDs are strings. And I'm not sure what the 2 pieces of redundant information are?

the word "categories" is redundant - storing the ID as a string, for that matter, is redundant, if you know every key is going to be a number.
 
> We don't have the notion of foreign keys - RavenDB is NOT relational. "123" is not a document in a table of "categories"; "categories/123" is a document, and internally we group similar documents by their Entity-Name under a logical unit called a Collection.

But the model you're storing contains relations - so the data you're storing is relational in nature, and "categories/123" is a foreign key to a specific document. 

I understand that RavenDB itself is not "relational" in the traditional sense, but clearly a lot of work went into providing means of dealing with relational data. I can think of very, very few applications that would not need extensive relation management. Certainly every example I've seen so far makes use of relations and foreign keys, or "document ids", if that's what you prefer to call them.

If you're going to store a list of category-document ids in a property, storing strings like "categories/123" is denormalization in some sense - you could just as well store the integer "123", since you know this list is going to contain strictly "category" document ids.
 
> Coming from a relational background, it's important to remember 2 things: the complete object graph is persisted, and we don't care about repeating ourselves where it makes sense to do so ("denormalization"). The question we ask when modeling is "what is an object" and "where does it make sense to repeat ourselves". Both are answered by Domain-Driven-Design concepts like an aggregate root (using the transactional boundaries to define discrete objects) and by expected usage patterns.

Well, yes, a complete object-graph is persisted in one shot, but the complete model-graph is not automatically persisted - and you have to design around this when designing your model.

Bear with me:

Where I would normally have IList<Category> you have IList<string> instead - so now the model itself is not directly traversable.

In a sense, this is an indirect means of defining the boundaries of self-contained documents within your model, as a means of explaining to the data-mapper (Raven) which relations cross boundaries between documents, to prevent it from traversing outside the scope of your document-object-graph.

This adds complexity, because you have to go back to the database and fetch another piece of the model-graph when needed - and this complexity is accidental, because this is not a true expression of your model-graph.

This is what I was getting at early on when we connected on Twitter
 
>> Although I suppose there's no reason you'd be forced to store keys in such a long form? You could use an IList<int> for category-ids, for example, if you wanted to, and still make use of multi-maps and includes, as far as I can tell, correct?
>
> No, you need the full ID there. As I said, the ID is a string that by convention holds the collection name as well.

I understand that you can't perform a query without providing the full document ID - but if you know you're storing only categories in a specific collection, that ID is easily reconstructed by adding "categories/" in front of the number itself, so there isn't technically any reason why you would need to store the whole string, other than convention, is that correct?
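
The reconstruction described here could look something like the sketch below. `CategoryIds` and its methods are hypothetical names, and the "categories/" prefix is taken from the examples in this thread; elsewhere in the thread, storing the full string ID is described as the recommended convention, so this only illustrates that the mapping is mechanical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class CategoryIds
{
    private const string Prefix = "categories/";

    // 123 -> "categories/123"
    public static string ToDocumentId(int number)
    {
        return Prefix + number.ToString(CultureInfo.InvariantCulture);
    }

    // "categories/123" -> 123; rejects IDs from other collections.
    public static int ToNumber(string documentId)
    {
        if (!documentId.StartsWith(Prefix, StringComparison.Ordinal))
        {
            throw new ArgumentException("Not a category document ID: " + documentId);
        }
        return int.Parse(documentId.Substring(Prefix.Length), CultureInfo.InvariantCulture);
    }
}

public static class Program
{
    public static void Main()
    {
        // A product that stores bare numbers instead of full document IDs.
        var categoryNumbers = new List<int> { 123, 456 };

        // Full IDs have to be rebuilt before anything can be loaded by ID.
        var documentIds = categoryNumbers.Select(CategoryIds.ToDocumentId).ToList();

        foreach (var id in documentIds)
        {
            Console.WriteLine(id);
        }
    }
}
```

The trade-off is that every reader of the stored data now needs to know which collection a bare number belongs to, which is the readability cost the replies point at.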
 
> As you can see, using RavenDB is all about considering your business model closely. It may be confusing at first.

It still seems to me that a lot of your considerations are not business-related, but technical.

And it may be that this actually works better in practice - that it's more realistic and honest to deal with storage-mechanisms as what they really are.

For me personally, as mentioned, working with the AR implementation in Yii was definitely a lot more transparent and productive than NHibernate, which attempts to fully abstract and hide almost every aspect of persistence.

It may be that attempting to run and hide from any aspects of persistence is somewhat delusional.

I am definitely still very interested in RavenDB on account of its simplicity. I'm just trying to gauge whether that simplicity will scale well to a model and requirements as complex as the ones in the application I'm building.

I appreciate your willingness to discuss this! a lot!

Thanks :-)

Oren Eini (Ayende Rahien)

Apr 12, 2012, 10:00:28 AM
to rav...@googlegroups.com
inline

On Thu, Apr 12, 2012 at 4:43 PM, mindplay <ras...@mindplay.dk> wrote:
> On Wednesday, April 11, 2012 6:46:02 PM UTC-4, Itamar Syn-Hershko wrote:
>
>> So what do you call a "denormalization"?
>
> the introduction of any "accidental" state - when you're storing copies of data to increase performance, for example. Ideally, you should never have to do that.
>
> the only time you should have to make copies of data, is when copying the data is part of an operation that isn't "accidental" - that is, it satisfies a real requirement, not just something you have to do because the underlying systems suffer from technical limitations that make them unable to handle normalized models with sufficient performance.
 

Then don't do it when you don't have to. There is nothing that requires it.
 
>>> My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.
>
>> Why is that? "categories/123" is a document ID. With RavenDB, document IDs are strings. And I'm not sure what the 2 pieces of redundant information are?
>
> the word "categories" is redundant - storing the ID as a string, for that matter, is redundant, if you know every key is going to be a number.
 

You _can_ make it just a number, you are aware, right?
And the actual representation is meaningless, the _only_ reason the key is there is to make it human readable.
 
>> We don't have the notion of foreign keys - RavenDB is NOT relational. "123" is not a document in a table of "categories"; "categories/123" is a document, and internally we group similar documents by their Entity-Name under a logical unit called a Collection.
>
> But the model you're storing contains relations - so the data you're storing is relational in nature, and "categories/123" is a foreign key to a specific document.


No, it isn't a relation. It is a property holding the value of another document key, that is quite different. It isn't relation, there aren't FK, etc.
  

> If you're going to store a list of category-document ids in a property, storing strings like "categories/123" is denormalization in some sense - you could just as well store the integer "123", since you know this list is going to contain strictly "category" document ids.

You can do that if you want; it is just easier if you store the whole ID.
 
 
>> Coming from a relational background, it's important to remember 2 things: the complete object graph is persisted, and we don't care about repeating ourselves where it makes sense to do so ("denormalization"). The question we ask when modeling is "what is an object" and "where does it make sense to repeat ourselves". Both are answered by Domain-Driven-Design concepts like an aggregate root (using the transactional boundaries to define discrete objects) and by expected usage patterns.
>
> Well, yes, a complete object-graph is persisted in one shot, but the complete model-graph is not automatically persisted - and you have to design around this when designing your model.
>
> Bear with me:
>
> Where I would normally have IList<Category> you have IList<string> instead - so now the model itself is not directly traversable.

Yes, intentionally so. 
Let me try something else: imagine a User <<-- -->> Group scenario. If you try to do strong references this way, you end up having to access all the users if you load just one, and all their groups as well.
We put explicit boundaries in place for a reason.

If you want that, feel free to check out an OODB instead of a document DB. They have much of the behavior that you seem to want, but RavenDB is not an OODB, and it behaves differently.
 

> In a sense, this is an indirect means of defining the boundaries of self-contained documents within your model, as a means of explaining to the data-mapper (Raven) which relations cross boundaries between documents, to prevent it from traversing outside the scope of your document-object-graph.

You confuse the data mapper with the actual physical structure of the documents. 
You aren't trying to map data, you are defining the actual physical structure of the documents. 
 

> This adds complexity, because you have to go back to the database and fetch another piece of the model-graph when needed - and this complexity is accidental, because this is not a true expression of your model-graph.


I would disagree that this is accidental. It is quite intentional. 

 
This is what I was getting at early on when we connected on Twitter
 
 Although I suppose there's no reason you'd be forced to store keys in such a long form? You could use an IList<int> for category-ids, for example, if you wanted to, and still make use of multi-maps and includes, as far as I can tell, correct?

No, you need the full ID there. As I said, the ID is a string that by convention holds the collection name as well.

I understand that you can't perform a query without providing the full document ID - but if you know you're storing only categories in a specific collection, that ID is easily reconstructed by adding "categories/" in front of the number itself, so there isn't technically any reason why you would need to store the whole string, other than convention, is that correct?
 
As you can see, using RavenDB is all about considering your business model closely. It may be confusing at first.

It still seems to me that a lot of your considerations are not business-related, but technical.

Try building an actual application using this, and I think that you'll come up with a different conclusion. 
 

And it may be that this actually works better in practice - that it's more realistic and honest to deal with storage-mechanisms as what they really are.

For me personally, as mentioned, working with the AR implementation in Yii was definitely a lot more transparent and productive than NHibernate, which attempts to fully abstract and hide almost every aspect of persistence.


You are actually feeling this way mostly because you have been writing SQL apps for a very long time. 

Chris Marisic

Apr 12, 2012, 10:12:12 AM4/12/12
to rav...@googlegroups.com


On Thursday, April 12, 2012 9:43:01 AM UTC-4, mindplay wrote:
On Wednesday, April 11, 2012 6:46:02 PM UTC-4, Itamar Syn-Hershko wrote:
My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.

Why is that? "categories/123" is a document ID. With RavenDB, document IDs are strings. And I'm not sure what the 2 pieces of redundant information are?

the word "categories" is redundant - storing the ID as a string, for that matter, is redundant, if you know every key is going to be a number.

No it is not redundant. The database has no notion WHATSOEVER of "123" in isolation.
 
If you're going to store a list of category-document ids in a property, storing strings like "categories/123" is denormalization in some sense - you could just as well store the integer "123", since you know this list is going to contain strictly "category" document ids.

This is inaccurate. You can say these are "category" document ids, but what will tell the server that? Nothing can. That's why these full IDs matter.
 


Well, yes, a complete object-graph is persisted in one shot, but the complete model-graph is not automatically persisted - and you have to design around this when designing your model.

Bear with me:

Where I would normally have IList<Category> you have IList<string> instead - so now the model itself is not directly traversable.

In a sense, this is an indirect means of defining the boundaries of self-contained documents within your model, as a means of explaining to the data-mapper (Raven) which relations cross boundaries between documents, to prevent it from traversing outside the scope of your document-object-graph.

This is inaccurate. IList<Category> and IList<string> are 2 incredibly different things. If you use IList<Category>, you are saying you want to store all of this data as part of the same document. IList<string> is no different, except that here you're specifically referring to an IList of CategoryIds.


This adds complexity, because you have to go back to the database and fetch another piece of the model-graph when needed - and this complexity is accidental, because this is not a true expression of your model-graph.

Dealing with relationships adds complexity. RavenDB is not adding complexity here; this is inherent complexity, and most ORMs just let people forget about things that actually do matter. This is not accidental: RavenDB was specifically built to make 1-to-many relationship traversal explicit rather than implicit as ORMs do it, which automatically eliminates the most common data-access problem with ORMs, N+1. RavenDB specifically provides methods that allow you to get all of the data with a single request to the server when this is truly needed.

Our applications have almost no usages of List<string> foreignKeys.
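
To make the round-trip argument concrete, here is a minimal sketch. `CountingStore` is a hypothetical in-memory stand-in that only counts requests; it is not the RavenDB client API, and the names `Load` and `LoadMany` are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a remote document store.
// It only counts round trips; it is NOT the RavenDB client API.
public class CountingStore
{
    private readonly Dictionary<string, string> _docs = new Dictionary<string, string>();

    public int RoundTrips { get; private set; }

    public void Put(string id, string body) => _docs[id] = body;

    // One request per call: calling this in a loop is the "N" in N+1.
    public string Load(string id)
    {
        RoundTrips++;
        return _docs[id];
    }

    // One request for the whole batch of ids.
    public List<string> LoadMany(IEnumerable<string> ids)
    {
        RoundTrips++;
        return ids.Select(id => _docs[id]).ToList();
    }
}

public static class NPlusOneDemo
{
    // Returns the number of round trips used by each approach.
    public static (int naive, int batched) Compare()
    {
        var ids = new[] { "categories/1", "categories/2", "categories/3" };

        var naiveStore = Seed(ids);
        foreach (var id in ids) naiveStore.Load(id); // one request per id

        var batchedStore = Seed(ids);
        batchedStore.LoadMany(ids);                  // a single request

        return (naiveStore.RoundTrips, batchedStore.RoundTrips);
    }

    private static CountingStore Seed(IEnumerable<string> ids)
    {
        var store = new CountingStore();
        foreach (var id in ids) store.Put(id, "{}");
        return store;
    }
}
```

With three referenced documents, the loop costs three requests and the batch costs one; the gap grows linearly with the number of references.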
 

This is what I was getting at early on when we connected on Twitter
 
 Although I suppose there's no reason you'd be forced to store keys in such a long form? You could use an IList<int> for category-ids, for example, if you wanted to, and still make use of multi-maps and includes, as far as I can tell, correct?

No, you need the full ID there. As I said, the ID is a string that by convention holds the collection name as well.

I understand that you can't perform a query without providing the full document ID - but if you know you're storing only categories in a specific collection, that ID is easily reconstructed by adding "categories/" in front of the number itself, so there isn't technically any reason why you would need to store the whole string, other than convention, is that correct?

No, this is also incorrect. Collections do not exist, they are synthetic. When you talk to RavenDB in any fashion you are not querying collections, you are querying indexes, or using IDs to talk directly to the store. There is no way for the ID to be "easily reconstructed by adding "categories/" in front of the number itself"; this can only be done in a limited fashion in the client API. This is why, given a Category { Id = "categories/123" } document, RavenDB supports Session.Load<Category>((int)123) (cast for extra clarity, not needed). HERE, RavenDB can guess "categories/123".
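
A rough sketch of why that guess is only possible on the client: the mapping from a .NET type to an ID prefix lives in client code, while the server only ever sees the finished string. The `IdConvention` helper and its prefix table are hypothetical, written for illustration; they are not RavenDB's actual convention code.

```csharp
using System;
using System.Collections.Generic;

public class Category { public string Id { get; set; } }
public class City { public string Id { get; set; } }

// Hypothetical client-side convention: type -> ID prefix.
// The server has no such table; it only receives the full string ID.
public static class IdConvention
{
    private static readonly Dictionary<Type, string> Prefixes = new Dictionary<Type, string>
    {
        { typeof(Category), "categories" },
        { typeof(City), "cities" },
    };

    // Builds the full document ID from the entity type and a numeric part.
    public static string FullId<T>(int number) => Prefixes[typeof(T)] + "/" + number;
}
```

Given the type, `IdConvention.FullId<Category>(123)` yields "categories/123". Given only the bare integer from an `IList<int>`, nothing says which prefix applies unless every call site supplies the type again.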
 
 
As you can see, using RavenDB is all about considering your business model closely. It may be confusing at first.

It still seems to me that a lot of your considerations are not business-related, but technical

I don't agree with this. You're experiencing tunnel vision: your expectations of how you would solve problems when targeting an RDBMS are not being separated from the question "how would I solve this problem?". I have yet to ever specifically change my domain design to accommodate Raven; there have been many times I've had to drastically alter domain design because of SQL Server.

Itamar Syn-Hershko

Apr 12, 2012, 10:15:07 AM4/12/12
to rav...@googlegroups.com
inline

On Thu, Apr 12, 2012 at 4:43 PM, mindplay <ras...@mindplay.dk> wrote:
On Wednesday, April 11, 2012 6:46:02 PM UTC-4, Itamar Syn-Hershko wrote:

So what do you call a "denormalization"?

the introduction of any "accidental" state - when you're storing copies of data to increase performance, for example. Ideally, you should never have to do that.

the only time you should have to make copies of data, is when copying the data is part of an operation that isn't "accidental" - that is, it satisfies a real requirement, not just something you have to do because the underlying systems suffer from technical limitations that make them unable to handle normalized models with sufficient performance.

Great, then by your definition of "denormalization", you should NEVER denormalize with RavenDB. Correctly using Includes, MultiMaps and the TransformResults function, you will be able to both respect transaction boundaries AND have all the data you need in a VERY efficient manner, without any cost similar to an RDBMS JOIN...
 
 
My concern is that storing a string like "categories/123" seems to be a general practice - and as mentioned, in some sense, this is worse than just storing the foreign key "123", as you would typically do with an RDBMS, since you're now storing two pieces of redundant information.

Why is that? "categories/123" is a document ID. With RavenDB, document IDs are strings. And I'm not sure what the 2 pieces of redundant information are?

the word "categories" is redundant - storing the ID as a string, for that matter, is redundant, if you know every key is going to be a number.

It is not, it is part of the ID, by convention. If you had stored a category with an ID "123" and then a product under the same ID "123", the product will overwrite the category. The reason for this is IDs are unique per-database, hence the convention which also makes it very readable.
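
The overwrite behaviour can be sketched with a plain dictionary, assuming the database behaves like one flat map from ID to document. This models only the uniqueness rule described above; the document bodies are made up and nothing here touches RavenDB itself.

```csharp
using System.Collections.Generic;

public static class IdCollisionDemo
{
    // Returns what is left under the bare ID, and how many documents survive with prefixed IDs.
    public static (string bare, int prefixedCount) Run()
    {
        // Assumption: one database == one flat ID -> document map.
        var bareIds = new Dictionary<string, string>();
        bareIds["123"] = "category: Books";
        bareIds["123"] = "product: Kettle";              // same key, the category is replaced

        var prefixedIds = new Dictionary<string, string>();
        prefixedIds["categories/123"] = "category: Books";
        prefixedIds["products/123"] = "product: Kettle"; // distinct keys, both survive

        return (bareIds["123"], prefixedIds.Count);
    }
}
```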

And you don't always know every key is going to be a number. Consider IDs like "tags/ravendb" in a blog for a Tag entity for example, or the users example you brought up. With RavenDB IDs are always strings, the client API makes it transparent for you in some occasions, handling HiLo and other stuff f
 
 
We don't have the notion of foreign keys - RavenDB is NOT relational. "123" is not a document in a table of "categories"; "categories/123" is a document, and internally we group similar documents by their Entity-Name under a logical unit called a Collection.

But the model you're storing contains relations - so the data you're storing is relational in nature, and "categories/123" is a foreign key to a specific document. 

I understand that RavenDB itself is not "relational" in the traditional sense, but clearly a lot of work went into providing means of dealing with relational data. I can think of very, very few applications that would not need extensive relation management. Certainly every example I've seen so far makes use of relations and foreign keys, or "document ids", if that's what you prefer to call them.

A foreign-key is a term coming from the relational world, meaning there is an index built on it. That is not the case with RavenDB, and there's a big difference. You can store a reference to another document and load it efficiently using Includes, or "join" them while indexing using multi-maps, but that's a completely different thing.

Since we use DDD concepts, and relations between aggregate roots are allowed - and that is also the case in real-world scenarios - we ought to support that, hence all the work. This is not relational-data per se, think of it more as relations between aggregate roots, which are - as mentioned before - VERY different than what your objects look like in an RDBMS ER diagram.
 

If you're going to store a list of category-document ids in a property, storing strings like "categories/123" is denormalization in some sense - you could just as well store the integer "123", since you know this list is going to contain strictly "category" document ids.

Yes, you could do that, but I wouldn't call it denormalization in that case. For full readability of my object graph I'd probably still go with categories/123. Don't fear the "added cost" of reading a few more bytes, it's completely negligible.
 
 
Coming from a relational background, it's important to remember 2 things: the complete object graph is persisted, and we don't care about repeating ourselves where it makes sense to do so ("denormalization"). The question we ask when modeling is "what is an object" and "where does it make sense to repeat ourselves". Both are answered by Domain-Driven-Design concepts like an aggregate root (using the transactional boundaries to define discrete objects) and by expected usage patterns.

Well, yes, a complete object-graph is persisted in one shot, but the complete model-graph is not automatically persisted - and you have to design around this when designing your model.

Bear with me:

Where I would normally have IList<Category> you have IList<string> instead - so now the model itself is not directly traversable.

In a sense, this is an indirect means of defining the boundaries of self-contained documents within your model, as a means of explaining to the data-mapper (Raven) which relations cross boundaries between documents, to prevent it from traversing outside the scope of your document-object-graph.

This adds complexity, because you have to go back to the database and fetch another piece of the model-graph when needed - and this complexity is accidental, because this is not a true expression of your model-graph.

This is what I was getting at early on when we connected on Twitter

Again, a question of modeling and use cases

You will have a List<Category> if the Category object you persist has no meaning outside the scope of your object. Like List<OrderLine> within an Order object. Since your category is probably going to be referenced from somewhere else, and contain some more data unique to it, it should be stored on its own and given its own document ID, which you later reference.
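
As a small sketch of that rule, the hypothetical `Order` below embeds its lines (they have no meaning outside the order) and only references categories by document ID. System.Text.Json is used here purely as an assumption to show what ends up inside one document; it is not a statement about the serializer RavenDB uses.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public class OrderLine
{
    public string Product { get; set; }
    public int Quantity { get; set; }
}

public class Order
{
    public string Id { get; set; }

    // Embedded: lines live and die with the order, so they are part of its document.
    public List<OrderLine> Lines { get; set; } = new List<OrderLine>();

    // Referenced: categories are documents of their own, so only their IDs are kept.
    public List<string> CategoryIds { get; set; } = new List<string>();
}

public static class ModelingDemo
{
    // Serializes the whole aggregate into a single JSON document.
    public static string ToDocument(Order order) => JsonSerializer.Serialize(order);

    // Reads the aggregate back; the categories themselves are not part of it.
    public static Order FromDocument(string json) => JsonSerializer.Deserialize<Order>(json);
}
```

Round-tripping an order brings back its lines in full, while the categories come back only as ID strings that must be loaded separately.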

It is going to be traversable quite easily by using Includes. And note that using includes will not generate extra network traffic. Or you can use multi-maps and transform results to search on partial data and project custom objects, even view models, directly from the index.
 
 
 Although I suppose there's no reason you'd be forced to store keys in such a long form? You could use an IList<int> for category-ids, for example, if you wanted to, and still make use of multi-maps and includes, as far as I can tell, correct?

No, you need the full ID there. As I said, the ID is a string that by convention holds the collection name as well.

I understand that you can't perform a query without providing the full document ID - but if you know you're storing only categories in a specific collection, that ID is easily reconstructed by adding "categories/" in front of the number itself, so there isn't technically any reason why you would need to store the whole string, other than convention, is that correct?

See my comment above.
 
 
As you can see, using RavenDB is all about considering your business model closely. It may be confusing at first.

It still seems to me that a lot of your considerations are not business-related, but technical.

And it may be that this actually works better in practice - that it's more realistic and honest to deal with storage-mechanisms as what they really are.

For me personally, as mentioned, working with the AR implementation in Yii was definitely a lot more transparent and productive than NHibernate, which attempts to fully abstract and hide almost every aspect of persistence.

Exactly what RavenDB doesn't do. It doesn't abstract anything, it just works with your model. And we are convinced that's a better route to go in. That being said, considerations are not all too technical, they involve a LOT of business logic and expected use cases. Try modeling something with the DDD approaches and you'll see.
 

It may be that attempting to run and hide from any aspects of persistence is somewhat delusional.

I am definitely still very interested in RavenDB on account of its simplicity. I'm just trying to gauge whether that simplicity will scale well to a model and requirements as complex as the ones in the application I'm building.

Much more than any RDBMS will, that I can assure you. Try it, we are here to assist.

Itamar Syn-Hershko

Apr 12, 2012, 10:15:26 AM4/12/12
to rav...@googlegroups.com
Oren and I in a race condition. Oren wins.

Oren Eini (Ayende Rahien)

Apr 12, 2012, 10:19:34 AM4/12/12
to rav...@googlegroups.com
Isn't that last write wins?

Itamar Syn-Hershko

Apr 12, 2012, 10:22:16 AM4/12/12
to rav...@googlegroups.com
LOL

mindplay

Apr 12, 2012, 10:39:27 AM4/12/12
to rav...@googlegroups.com
Thank you for taking the time to elaborate on this.

I guess I don't see the practical reason why aspects like persistence and transaction-boundaries should affect the OOP design-patterns you choose?

For example, the following models Stores that are closed on particular days:

public class Closing
{
    public DateTime ClosedFrom { get; set; }
    public DateTime ClosedTo { get; set; } 
}

public class Store
{
    public IList<Closing> Closings { get; set; }
}

This would be harder to model with a relational database, where this would actually persist in two tables, say, "stores" and "closings". That's a lot of complexity just to store something that is really truly composite - and in a sense, it's "wrong", because a closing-date is unique to a Store, it's not an independent thing that has any meaning outside the context of the Store it belongs to. Yet, with an RDBMS, we're forced to store them as independent units.

Much simpler with RavenDB, where this is stored as one unit, a document. "It just works." :-)

Now let's say that stores are listed in a number of cities.

public class Store
{
    public IList<Closing> Closings { get; set; }
    public IList<City> Cities { get; set; }
}

Now a City does not belong to a Store, and a Store does not belong to a City - they're related of course, but neither has ownership of the other. They are independent units.

Let's be clear about the fact that I didn't choose this design-pattern because I'm thinking about persistence - this is basic, traditional, persistence-ignorant OO.

And now you want me to define the transaction-boundary by changing the model:

public class Store
{
    public IList<Closing> Closings { get; set; }
    public IList<string> Cities { get; set; }
}

My problem with this approach is, you did a lot more than just defining a transaction-boundary for persistence, and it has far-reaching consequences.

Why can't you just declare the document-boundary instead? For example:

public class Store
{
    public IList<Closing> Closings { get; set; }
    
    [Documents]
    public IList<City> Cities { get; set; }
}

And then let the persistence layer do the work?

There are at least a few common patterns that cover probably 90-95% of common relations in any given model.

Why can't we model those cases using declarations instead of code?

Since we're no longer dealing with all the cases like the list of store closing-dates above, that should eliminate a lot of the more complex scenarios that were so difficult to handle in NH - I bet you would need only a few declarations to make it all the way to full persistence-abstraction with RavenDB...

Or maybe a lot, maybe I'm delusional ;-)


On Thursday, April 12, 2012 4:23:50 AM UTC-4, Oren Eini wrote:

No, actually, that isn't how you would model this.
This is how you are _used_ to modeling this, because you are thinking about how NHibernate does this.
You have to understand that this has been an explicit design choice with RavenDB. I have seen the problems that you get into when you try to go the magic route.
Since you noted the issue with `is` and `GetType()`, and you probably know about the SELECT N+1 issues, you are probably familiar with those issues.

Instead of trying to imagine a world where everything is in memory (the abstraction that NHibernate is trying to create), RavenDB follows the Aggregate model, where there are clear boundaries between different entities. That matches well to the way things actually work, because you can rely on being able to cheaply access anything inside the aggregate, and there is an explicit step that you have to take to access anything that isn't in the aggregate.

Note that RavenDB contains a lot of features, like `Include()` and `Live Projections`, that allow you to easily get the related data, but again, we do that as an explicit step because you _have_ to respect the boundary.

> So now there's an aspect of persistence to this entity, which suggests to me that the intention is to write dedicated DTO's rather than "business"-entities, and persist those?

Nope, it is just that you model your entities in a different way than you would using a relational database.

> but my concern here is not really performance, but transparency.

So was mine when designing this. But instead of pretending that "oh, it doesn't matter, let us deal with this in the OR/M layer", I decided that we need to be transparent about the actual implications of what you are doing. The end result is a much better application, because you don't have hidden snares waiting for you.

> In an ideal world, I would just write completely persistence-ignorant models, optimizing for the problem-domain of the software itself, without regard for persistence, perhaps other than specifying which properties are persistent or transient.

No, you won't. Because even if we assume that your entire model is in memory, that is _still_ a bad way to design things. You need to think about things like concurrency, you need to think about transaction boundaries, you need to think about how to actually _deal_ with things. What you are saying is valid if you had only one user, only one time. But it falls apart once you start to consider what is actually going on.


Chris Marisic

Apr 12, 2012, 11:11:27 AM4/12/12
to rav...@googlegroups.com
You think you want those features, but they really wouldn't be beneficial. They would just allow developers to make badly designed non-relational databases behave exactly the same as relational databases, and manifest all of the common problems RDBMSs create, with even more problems on top because you're trying to make a non-relational system behave relationally.

Itamar Syn-Hershko

Apr 12, 2012, 11:12:48 AM4/12/12
to rav...@googlegroups.com
inline

On Thu, Apr 12, 2012 at 5:39 PM, mindplay <ras...@mindplay.dk> wrote:
Thank you for taking the time to elaborate on this.

I guess I don't see the practical reason why aspects like persistence and transaction-boundaries should affect the OOP design-patterns you choose?

They don't. Transactional boundaries are completely business related, and come from DDD, which is OOP-next-gen if you wish. And persistence is never being considered as a factor. As I said - as far as we are concerned, you just drop your objects into Raven.
 

For example, the following models Stores that are closed on particular days:

public class Closing
{
    public DateTime ClosedFrom { get; set; }
    public DateTime ClosedTo { get; set; } 
}

public class Store
{
    public IList<Closing> Closings { get; set; }
}

This would be harder to model with a relational database, where this would actually persist in two tables, say, "stores" and "closings". That's a lot of complexity just to store something that is really truly composite - and in a sense, it's "wrong", because a closing-date is unique to a Store, it's not an independent thing that has any meaning outside the context of the Store it belongs to. Yet, with an RDBMS, we're forced to store them as independent units.

Much simpler with RavenDB, where this is stored as one unit, a document. "It just works." :-)

Now let's say that stores are listed in a number of cities.

public class Store
{
    public IList<Closing> Closings { get; set; }
    public IList<City> Cities { get; set; }
}

Now a City does not belong to a Store, and a Store does not belong to a City - they're related of course, but neither has ownership of the other. They are independent units.

Let's be clear about the fact that I didn't choose this design-pattern because I'm thinking about persistence - this is basic, traditional, persistence-ignorant OO.

And now you want me to define the transaction-boundary by changing the model:

public class Store
{
    public IList<Closing> Closings { get; set; }
    public IList<string> Cities { get; set; }
}

My problem with this approach is, you did a lot more than just defining a transaction-boundary for persistence, and it has far-reaching consequences.

True, since here you've made a decision to have a City as its own object (probably rightly so). In memory you can keep references to the same object, but we can't do that with JSON, hence the List<string>.

With about 3 lines of code (Include + a foreach loop) you can overcome this, so I don't really see the problem here.
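
Roughly, the pattern looks like the sketch below. `FakeSession` and `LoadStoreIncludingCities` are hypothetical in-memory stand-ins that mimic the idea of an include (referenced documents arrive with the first request and later loads are served locally); they are not the real RavenDB session API, and the city names are made up.

```csharp
using System.Collections.Generic;

public class City
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Store
{
    public string Id { get; set; }
    public IList<string> Cities { get; set; } = new List<string>();
}

// Hypothetical stand-in for a session; NOT the RavenDB client API.
public class FakeSession
{
    private readonly Dictionary<string, object> _server = new Dictionary<string, object>();
    private readonly Dictionary<string, object> _cache = new Dictionary<string, object>();

    public int Requests { get; private set; }

    public void Put(string id, object doc) => _server[id] = doc;

    // One request: returns the store and caches every city it references.
    public Store LoadStoreIncludingCities(string storeId)
    {
        Requests++;
        var store = (Store)_server[storeId];
        foreach (var cityId in store.Cities) _cache[cityId] = _server[cityId];
        return store;
    }

    // Served from the local cache when possible; otherwise costs a request.
    public T Load<T>(string id)
    {
        if (!_cache.TryGetValue(id, out var doc))
        {
            Requests++;
            doc = _server[id];
        }
        return (T)doc;
    }
}

public static class IncludeDemo
{
    public static (List<string> names, int requests) Run()
    {
        var session = new FakeSession();
        session.Put("cities/1", new City { Id = "cities/1", Name = "Aarhus" });
        session.Put("cities/2", new City { Id = "cities/2", Name = "Odense" });
        session.Put("stores/1", new Store { Id = "stores/1", Cities = { "cities/1", "cities/2" } });

        // The "3 lines": one load with the include, then a foreach over the IDs.
        var store = session.LoadStoreIncludingCities("stores/1");
        var names = new List<string>();
        foreach (var cityId in store.Cities) names.Add(session.Load<City>(cityId).Name);

        return (names, session.Requests);
    }
}
```

In this sketch the foreach never goes back to the server: both cities are resolved after a single request.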

But I see a different problem here: how can a store be in several places at once? Perhaps you're looking for something along those lines instead (StoreCompany and ActualStore)? http://ayende.com/blog/84993/document-based-modeling-auctions
 
Why can't you just declare the document-boundary instead? For example:

public class Store
{
    public IList<Closing> Closings { get; set; }
    
    [Documents]
    public IList<City> Cities { get; set; }
}

And then let the persistence layer do the work?

How would that work?
 

There are at least a few common patterns that cover probably 90-95% of common relations in any given model.

Why can't we model those cases using declarations instead of code?

The way I see it, you want to start writing a lot of complex stuff that will most probably be prone to a lot of bugs just to solve what you consider to be a problem, which is gone in 3 lines of code. I think before we move on with this conversation, you need to explain WHY this actually bugs you, and why you REALLY need that?...

mindplay

Apr 12, 2012, 12:00:17 PM4/12/12
to rav...@googlegroups.com
On Thursday, April 12, 2012 11:12:48 AM UTC-4, Itamar Syn-Hershko wrote:
Why can't we model those cases using declarations instead of code?

The way I see it, you want to start writing a lot of complex stuff that will most probably be prone to a lot of bugs just to solve what you consider to be a problem, which is gone in 3 lines of code. I think before we move on with this conversation, you need to explain WHY this actually bugs you, and why you REALLY need that?...

3 lines of code here and 3 lines of code there. And every time, code that most likely has nothing to do with the task it's performing.

That's mainly what bugs me - when I'm writing code that works with the model, I don't want to have to think about persistence, I want to focus on the task I'm trying to perform.

I don't want somebody else reading the code getting distracted by it either - having 3 lines of persistence-related code in the middle of a business-procedure can be confusing or misleading.

There's also a maintenance issue - suppose your model changes, and something that was previously a component is now an independent document, so you change the collection from IList<Category> to IList<string> and this breaks all of your existing business-procedures. Suppose you have 100s of business-procedures that require a list of categories.

In light of this last consideration alone, I'm tempted to consistently add methods for every collection to every model object, essentially duplicating all of my collections:

class Store
{
    public IList<string> Categories { get; set; }

    public IEnumerable<Category> GetCategories()
    {
        // fetch and return Categories...
    }
}

I don't know, maybe this is truer to the actual data-model, but it seems like a lot of boiler-plate. Maybe in some sense it's actually better than simply IList<Category> though, since this enables you to choose in each case: do you need just the category document IDs, or do you want to fetch the actual objects? Not something that is easily possible using the other approach.

Thoughts?

Chris Marisic

Apr 12, 2012, 12:09:10 PM4/12/12
to rav...@googlegroups.com
I would never build a model that behaved as such. I do not ever want my persistence mechanism coupled to my business objects.  Even with ORMs it was tough enough for me to accept the virtual keyword everywhere for dynamic proxy implementations, I certainly wouldn't tolerate this in my model.

Oren Eini (Ayende Rahien)

Apr 12, 2012, 12:44:20 PM4/12/12
to rav...@googlegroups.com
inline

On Thu, Apr 12, 2012 at 7:00 PM, mindplay <ras...@mindplay.dk> wrote:
On Thursday, April 12, 2012 11:12:48 AM UTC-4, Itamar Syn-Hershko wrote:
Why can't we model those cases using declarations instead of code?

The way I see it, you want to start writing a lot of complex stuff that will most probably be prone to a lot of bugs just to solve what you consider to be a problem, which is gone in 3 lines of code. I think before we move on with this conversation, you need to explain WHY this actually bugs you, and why you REALLY need that?...

3 lines of code here and 3 lines of code there. And every time, code that most likely has nothing to do with the task it's performing.

That's mainly what bugs me - when I'm writing code that works with the model, I don't want to have to think about persistence, I want to focus on the task I'm trying to perform.


Tough luck, persistence IS part of the problem that you are trying to solve, and you have to take it into account
 
There's also a maintenance issue - suppose your model changes, and something that was previously a component is now an independent document, so you change the collection from IList<Category> to IList<string> and this breaks all of your existing business-procedures. Suppose you have 100s of business-procedures that require a list of categories.


This is NOT just a minor change. This has a lot of implications. You WANT the code to break.
 
In light of this last consideration alone, I'm tempted to consistently add methods for every collection to every model object, essentially duplicating all of my collections:

class Store
{
    public IList<string> Categories { get; set; }

    public IEnumerable<Category> GetCategories()
    {
        // fetch and return Categories...
    }
}

Absolutely horrible. You will end up with a lot of pain for absolutely no gain.
May I suggest, go and write an idiomatic RavenDB application, then come back and tell us what the experience was like.
Right now, you don't have valid reasons, you have gut feeling based on experience in completely different technology and methodology. 

mindplay

Apr 12, 2012, 3:06:59 PM4/12/12
to rav...@googlegroups.com
On Thursday, April 12, 2012 12:44:20 PM UTC-4, Oren Eini wrote:
That's mainly what bugs me - when I'm writing code that works with the model, I don't want to have to think about persistence, I want to focus on the task I'm trying to perform.

Tough luck, persistence IS part of the problem that you are trying to solve, and you have to take it into account

of course, but I still feel like it should be handled in isolation and not mixed into business-procedures.

May I suggest, go and write an idiomatic RavenDB application, then come back and tell us what the experience was like.
Right now, you don't have valid reasons, you have gut feeling based on experience in completely different technology and methodology. 

that's going to be a hard sell, since I don't understand your methodology - probably the only way I would get to do that, is if I had the time to do it on my own dime. I can't really go to management or to my client and try to sell them on an idea I don't understand.

if it works as well as you claim, hopefully it will become popular enough to warrant a book - not just on the software, but on the methodology. 

The examples and tutorials available at the moment are all very trivial and show individual features working well in isolation, but I don't feel like there's enough depth to provide a big picture.

My main concern is that this won't scale in terms of complexity. If you could show me a complex app that leverages the features and applies the methodology in practice, perhaps this would be more accessible.

As you probably know, it's much harder to un-learn than it is to learn - and it sounds like these ideas go against much of the established, mainstream software theory I was taught. I'm afraid you'd have to take out my brain and reset it - at the moment it's stuck screaming "NO NO NO" to some of these ideas.

I dipped my toes, and the water doesn't feel too cold, but I still fear there may be sharks ;-)

Maybe I can make it to one of your courses later this year. Do you teach the methodology or do these courses mainly focus on programming?

Oren Eini (Ayende Rahien)

Apr 12, 2012, 3:16:05 PM4/12/12
to rav...@googlegroups.com
inline

On Thu, Apr 12, 2012 at 10:06 PM, mindplay <ras...@mindplay.dk> wrote:
On Thursday, April 12, 2012 12:44:20 PM UTC-4, Oren Eini wrote:
That's mainly what bugs me - when I'm writing code that works with the model, I don't want to have to think about persistence, I want to focus on the task I'm trying to perform.

Tough luck, persistence IS part of the problem that you are trying to solve, and you have to take it into account

of course, but I still feel like it should be handled in isolation and not mixed into business-procedures.


40 years of RDBMS experience says that it isn't a good way to go.
 
May I suggest, go and write an idiomatic RavenDB application, then come back and tell us what the experience was like.
Right now, you don't have valid reasons, you have gut feeling based on experience in completely different technology and methodology. 

That's going to be a hard sell, since I don't understand your methodology. Probably the only way I would get to do that is if I had the time to do it on my own dime. I can't really go to management or to my client and try to sell them on an idea I don't understand.

if it works as well as you claim, hopefully it will become popular enough to warrant a book - not just on the software, but on the methodology. 

The examples and tutorials available at the moment are all very trivial and show individual features working well in isolation, but I don't feel like there's enough depth to provide a big picture.

There is a book in the works, yes :-)
 

My main concern is that this won't scale in terms of complexity. If you could show me a complex app that leverages the features and applies the methodology in practice, perhaps this would be more accessible.


RavenDB runs msnbc.com and Pluralsight, among others.
We have several big sample apps; RaccoonBlog is a good example.

Oren Eini (Ayende Rahien)

Apr 12, 2012, 3:16:49 PM
to rav...@googlegroups.com
Oh, and in the courses, I am not focusing very much on the API; that is why we have IntelliSense. We focus much more on the actual semantics and the zen: how to think about building applications using RavenDB.

On Thu, Apr 12, 2012 at 10:06 PM, mindplay <ras...@mindplay.dk> wrote:

Troy

Apr 12, 2012, 3:39:36 PM
to rav...@googlegroups.com
What other big sample apps are there, and where are they located?
 
RavenDB runs msnbc.com and Pluralsight, among others.
We have several big sample apps; RaccoonBlog is a good example.
 
 
Also, when is the book in progress going to be released? Any ETA? 

mindplay

Apr 12, 2012, 4:17:08 PM
to rav...@googlegroups.com
We have several big sample apps; RaccoonBlog is a good example.

the application I'm building has 100+ entities with thousands of properties and several hundred relationships, factory-classes with hundreds of query-parameters, and will be twice as big when it's done.

I'm still trying to imagine what that would look like if, every time I had to traverse one of these relationships, I had to go back to the database and retrieve them manually... and in some cases, having to traverse across five or ten document-boundaries...

Chris Marisic

Apr 12, 2012, 4:38:33 PM