RavenGallery - Peer Review

Rob Ashton

Sep 27, 2010, 8:23:43 PM
to ravendb
I've started a blog series where I aim to go from start to finish on a
simple RavenDB project and show how things are done from basics and
entry-level usage (dynamic indexes) up to proper usage (map/reduce
queries with manually set analyzers)

I am going to paste the content of my entries here before they go live
and allow scrutiny, I'm going to get enough of it anyway because
that's what normally happens when people post entire projects on their
blog, so getting it over and done with from the onset seems like a
good idea.

Rob Ashton

Sep 27, 2010, 8:24:52 PM
to ravendb
(Code is here: http://github.com/robashton/RavenGallery/ by the way )

BLOG ENTRY - SETTING UP THE PROJECT

Where do I get RavenDB?

RavenDB can be downloaded from a number of places: you can download
the stable binaries or the unstable binaries, or go direct to GitHub
and use whatever fork you find there.

I personally run off my own fork which is updated very frequently from
Ayende’s code – as it means I can rapidly add any missing features and
push them back or fix bugs when I find them.

For this project I’ll be doing just that, and the binaries found in
the Image Gallery project will be from my fork found at
http://github.com/robashton/Ravendb

This is because I’ll be utilising functionality that is on the
bleeding edge of RavenDB. Of course, by the time most of you read this
those features will be in the released binaries, so we’ll look at
those now.

Released builds (Unstable)

Next up on the stability list are the recent builds, which can be
found on the Hibernating Rhinos site:
http://builds.hibernatingrhinos.com/builds/RavenDB-Unstable
These are probably safer than constantly pulling from Ayende’s fork,
and represent the latest changes that have made it through code review.

Released builds (Commercial)

At time of writing, these are about 20 builds behind the Unstable
branch, and as such miss out on some of the functionality that can be
enjoyed in the Unstable and code versions of RavenDB. Unless you are
planning on releasing software in the next week or two, I don’t really
advocate using this branch for development.

Why unstable?

RavenDB is changing constantly, breaking changes still happen, API
changes still happen, functionality is being added constantly – I have
a suite of tests for all my projects that utilise RavenDB and I test
all of my interaction with RavenDB.

By updating regularly I ensure that the amount of work needed to fix
any breaking changes is kept to a minimum, as opposed to waiting 30
builds and finding out that half of my entire test suite fails.

That’s not to say that Raven isn’t ready for production because I
believe that the stable branch is indeed stable, but because I’m in
development and I haven’t got imminent release to look forward to I’m
happy to put up with a few bugs in order to get the latest and
greatest functionality.

The Project itself

- I’ve made an MVC2 project in VS2010 and removed all the default
  garbage that gets provided with it, apart from the JavaScript files,
  the CSS and the master page. I’m starting from a blank slate and
  removing all the code because I know how much that offends people.
- I’ve downloaded the following libraries: Moq, NUnit, StructureMap,
  and shoved them in a folder called _Libs along with the RavenDB
  binaries.
- Oh yes, I’ve got a basic StructureMapControllerFactory that I’ll be
  using to create controllers with dependencies injected (in case
  anybody asks later on).
- I’ve got two additional projects (class libraries) which are named
  RavenGallery.Core and RavenGallery.Core.Tests.
- I’ve gone through the project settings and ensured their .NET profile
  is set to .NET Framework 4.

Here is my solution:

[screenshot of the solution]

And this is what I meant by setting the .NET profile:

[screenshot of the project settings]

Which RavenDB Binaries to use?

Once you’ve selected which binary drop to use, you have to make a
decision as to which Raven.Client to use, and this is where I tell you
why I made sure my Target Framework was set to 4

Raven.Client.Lightweight.dll
- This is for when you are using the .NET Framework 4 Client Profile
- This doesn’t require any of the other binaries (Lucene/etc)
- This doesn’t allow you to host RavenDB within your application
- This is therefore a pure client

Raven.Client.3.5.dll
- As the name suggests, this is for when you are still on the .NET 2.0
  runtime (using the .NET 3.5 framework)
- This doesn’t require any of the other binaries (Lucene/etc)
- This doesn’t allow you to host RavenDB within your application
- This is therefore a pure client for older versions of the .NET
  framework

Raven.Client.dll
- This is the full, heavyweight client for .NET 4.0 (not client
  profile!)
- This requires all of the other binaries (Lucene.NET, Esent, etc)
- This allows you to host RavenDB within your application
- This is a mixed client + server

As my choice of target framework will tell you, I have chosen to host
RavenDB internally as part of the web application, and I will take the
responsibility for starting up and shutting it down as part of the
application lifecycle.

The .NET Framework 4 profile is important, as it is a common gotcha
for people to link the wrong binaries and wonder why they are still
getting reference errors.

That’s set-up covered. In the next entry I shall cover how we’ll be
hosting RavenDB in the application and managing our sessions with it.

Rob Ashton

Sep 27, 2010, 8:25:39 PM
to ravendb
BLOG ENTRY - THE APPLICATION LIFECYCLE


As discussed in the previous entry, I have decided to host RavenDB
within the application itself. This decision lends itself very easily
to the early stages of a development project against RavenDB, as there
is no need to remember to run RavenDB before running the application.
Providing you write your code properly, it is also a decision that can
be changed later on in development, when it becomes useful to have
RavenDB running on another server or just as a separate process.

Before writing any code, we have to understand the basic components at
play when communicating with RavenDB.

IDocumentStore
- This is the main port of call for communicating with a RavenDB server
- This is also the main port of call for starting up a RavenDB server if
  running locally (DocumentStore)
- One of these should exist per application, i.e. create on start-up
  and persist

IDocumentSession
- This is created via IDocumentStore and provides interfaces for
  querying the document store
- This controls the unit of work, tracks loaded documents, keeps a cache
  etc
- This should be created per unit of work (typically one per HTTP
  request, or one per transaction)
- NB: Whilst RavenDB supports transactions across requests, typically we
  avoid this and try to commit all changes once per unit of work

On start-up

So, on start-up we need to create a document store and make that
available for creating document sessions when necessary (and maybe
other purposes).

This is what I’ve come up with:

Bootstrapper.cs



    public static class Bootstrapper
    {
        public static void Startup()
        {
            // Host RavenDB inside the application, storing data under App_Data
            var documentStore = new DocumentStore
            {
                Configuration = new RavenConfiguration
                {
                    DataDirectory = "App_Data\\RavenDB",
                }
            };
            documentStore.Initialize();

            // Hand the single store instance to the StructureMap container
            ObjectFactory.Initialize(config =>
            {
                config.AddRegistry(new CoreRegistry(documentStore));
            });
        }
    }

This is invoked from Global.asax.cs via Application_Start – we create
the DocumentStore and pass it into a StructureMap registry which then
sets up the container. If I wanted to switch to a different mechanism
for dealing with RavenDB I could change it here and the rest of my
application wouldn’t be any the wiser. I could even load the settings
from a file here, but I’m not going to – so there.

Inside CoreRegistry.cs we have

    public class CoreRegistry : Registry
    {
        public CoreRegistry(IDocumentStore documentStore)
        {
            // Always resolve IDocumentStore to the instance created at start-up
            For<IDocumentStore>().Use(documentStore);
        }
    }

There we go, anything that requests IDocumentStore will be given the
DocumentStore instance we created on start-up.

Per request

Until we find a reason not to, we will create RavenDB sessions per
request and secretly store them in the HttpContext.Items collection so
they can be used through-out the rest of the request, of course we’ll
use StructureMap to manage that for us.

CoreRegistry.cs now looks like this:

    public CoreRegistry(IDocumentStore documentStore)
    {
        For<IDocumentStore>().Use(documentStore);

        // One session per HTTP request, opened from the shared store
        For<IDocumentSession>()
            .HttpContextScoped()
            .Use(x =>
            {
                var store = x.GetInstance<IDocumentStore>();
                return store.OpenSession();
            });
    }

Of course this is leaking sessions because StructureMap won’t dispose
of any of our created sessions unless we tell it to, so into
Global.asax.cs I go once more and add the following line:

    protected void Application_EndRequest()
    {
        // Disposes the request's IDocumentSession (and anything else HTTP-scoped)
        ObjectFactory.ReleaseAndDisposeAllHttpScopedObjects();
    }

This will do for now, I could have done this using a HttpModule or
whatever I preferred, but that gets the concepts across.

Note: I am leaving the decision of calling SaveChanges and flushing
the transaction to the application for now, I could do that when I
dispose of the session, but that would involve writing a bit more code
than needed at this stage.

In the next entry, I’ll set up the basic structure of my documents,
and cover how at least in the immediate future I’ll be passing them to
and fro with RavenDB

fschwiet

Sep 27, 2010, 10:30:46 PM
to ravendb
I think there are other DLLs that are needed which you should mention.
I'm not sure under what conditions they're needed, but my tests do not
pass unless I also reference:

Raven.Database.dll
Raven.Storage.Esent.dll

I've had trouble getting Raven.Storage.Esent.dll to include/copy
properly when RavenDB is referenced from a shared library. I think it's
because the reference is run-time only, so some tools do not recognize
that the Esent DLL still needs to be copied. Ayende's recent
suggestion, to reference something in Raven.Storage.Esent.dll directly,
solved the problem.

In my case, I added "class RavenDllAnchor { TransactionalStorage
_ignored; }" to the shared library and that fixed the problems using
Raven.Storage.Esent.dll.

Rob Ashton

Sep 27, 2010, 10:36:21 PM
to ravendb
Okay, I have actually included those in the project, I didn't think
that people might forget them but I guess it depends where you get the
code drop from, because I build my own I just have the binaries to
hand.

I'll amend the blog post accordingly and make special mention of the
above and make it clear that I'm referencing the entire folder of
assemblies, and not just Raven.Client

fschwiet

Sep 27, 2010, 11:40:56 PM
to ravendb
I suppose since the Raven.Database.dll reference warns at compile
time it's not worth mentioning. Raven.Storage.Esent.dll reference
problems only show up at run-time, and only under certain conditions.
I've run into this in the past and solved it by adding references to
the DLL from different application projects (instead of the shared
library). This felt wrong. It wasn't obvious to me that I could instead
maintain a single reference to the DLL by adding an explicit reference
(though I suppose "not obvious to me" isn't saying much :P).

Ayende Rahien

Sep 28, 2010, 1:40:12 AM
to rav...@googlegroups.com
How about showing the ConnectionStringName feature?
How about showing embedded+http hosting?

Rob Ashton

Sep 28, 2010, 4:50:50 AM
to ravendb
I will be demonstrating those (they're on the list!), my general aim
is to put something together as quickly as possible and then the
series can focus on how to do_it_better.

I'll be using dynamic indexes for the first few features (using the
Linq provider), before demonstrating that pre-defined indexes are more
sensible

Rob Ashton

Sep 28, 2010, 6:40:09 AM
to ravendb
Written so far and in the queue:

I - Introduction - 28-09-2010
II - Setting up the project - 29-09-2010 - 14:00
III - Application/Request Lifecycle - 30-09-2010 - 13:00
IV - Tracking Documents - 01-10-2010 - 13:00

- Weekend

V - The Application Structure - 03-10-2010 - 18:00
VI - Repositories, Entities and Commands - 04-10-2010 - 13:00
VII - User Registration - 05-10-2010 - 13:00
VIII - Sign-in/Sign-out - 06-10-2010 - 13:00

I think rather than paste them here as there are images and rich text
etc, I'm going to start pushing them into Google documents, I'll sort
that out this afternoon

Rob Ashton

Sep 28, 2010, 8:58:24 AM
to ravendb
Application Lifecycle:
https://docs.google.com/document/edit?id=1ehCQOSgrk4otz2WpnR_OFMpdGhhLbXYxyzeAhfGZR5Y&hl=en&authkey=CPzskOgE

Tracking Documents:
https://docs.google.com/document/edit?id=1ADeGF2qmjoNnlnQvTHlsUxlWLpdBLaoBjw5zASqhXVU&hl=en&authkey=CNHQj8EL

The Application Structure:
https://docs.google.com/document/edit?id=1VysAysI3nYKSCnV1e5ZUA0A9LM1w-kls6GmwhdZC-M4&hl=en&authkey=COXFtZUD

Repositories, Entities and Commands:
https://docs.google.com/document/edit?id=1diNbj9ANPxSzqcBDVZJVczgleVEsVWSa2YbX7ObOJP8&hl=en&authkey=CJ3vs7gF

User Registration:
https://docs.google.com/document/edit?id=1THtH_VxS4HxJ_ZL9CL0HDwsNpTh_7Kx90kFMWLRcBJI&hl=en&authkey=CJTCoboH

It's mostly RavenDB lite so far, but I aim to pack an awful lot into
this series and need to make sure the groundwork is set before I go
into lots of depth, I've deliberately not done some things perfectly
because they give a good opportunity for refactor later

Rob Ashton

Sep 28, 2010, 11:34:20 AM
to ravendb
Signing in

https://docs.google.com/document/edit?id=1Sdo3rzR1Cj5GoOSH6M3sZYgeEwT_4-3L5B8l5WJXE7g&hl=en&authkey=CIiiyZQG


WHEW. Finally past the boilerplate nonsense and time to think about
the actual gallery functionality!

Ayende Rahien

Oct 1, 2010, 7:52:34 AM
to rav...@googlegroups.com
You made an enemy :-)
Entities the way you describe them are very awkward to use. Why not simply use the documents as the entities? Especially since you have to go through all the contortions to keep using the same document instance.

Chris Marisic

Oct 1, 2010, 8:34:03 AM
to ravendb
I have to agree, rarely if ever should domain entities contain real
methods. Just about the only time I've fallen back to putting methods
in my entities is when it's doing some kind of complex interrogation of
the object that would be boilerplate code if left to its own devices,
such as: user.CanAssignRole("admin")

Rob Ashton

Oct 1, 2010, 10:00:40 AM
to ravendb
I knew I'd make an enemy :)

Better to have the discussion than to not post any blogs though.

I don't really see any contortions being made - I've certainly not
found it difficult to wrap them up, because I'm not having to
duplicate anything across to the entities themselves from the
document. If I request the same entity from a repository, the inner
document will still be the same document and I haven't lost anything
there.

One thing that is made easy by RavenDB is that I *could* just make all
the fields private and rely on custom serialization to persist the
state - but that to me feels like even more of a contortion -
especially when we come to storing fields that don't really belong on
the logical entity by flattening other entities' data to it (if that
makes sense)

In this series, I'm going to end up denormalising quite a lot of data
onto the Image document for querying purposes, and I don't want that
on my logical entity, this way of working allows me to avoid the
mismatch there without having to resort to storing multiple documents
for the Image (one for entity, one for views) - and allows me to just
carry on using the indexes against the Image document.

I hate to over-engineer and I must admit I do feel a little bit as if
I am doing here and realistically I could just plonk the entity out
there, but I want to make it dead clear that we don't use them for
anything other than modifying state, the moment I make it easy to
access their whole data structures, they stop being entities and start
just being flat data structures and being used directly to create view
models, I don't really like that.

In a simple app like this, you're right, entities aint gonna have much
behaviour, and one reason I'm doing my best *not* to mention DDD/etc/
etc/etc in this series is that this is an unsuitable project for it
because there is very little complexity.

Thoughts? I have no qualms about undoing earlier decisions in later
blog entries once the initial design unravels, this is my first real
project against RavenDB where I haven't just used RavenDB directly all
over the show and there are always going to be changes/refactors as
the project goes on. It is a cake walk to perform those changes/
refactors with the tests that I've got in place and with the way I've
written the code.

Rob Ashton

Oct 1, 2010, 11:24:21 AM
to ravendb
I think with the direction I'm going in, there shouldn't be a reason
to expose state via the entities - perhaps I just explain it badly in
my blog?

Rob Ashton

Oct 1, 2010, 11:28:40 AM
to ravendb
Also to Chris Marisic: You say domain entities, I've stayed away from
the term "domain", because this is a very simple "domain" and doesn't
need too much modelling - but as you brought it up.

Entities shouldn't have methods? Then what should they have? If you
have a domain with only nouns and no verbs, then it's not a domain,
it's a data structure :)

Rob Ashton

Oct 1, 2010, 11:34:20 AM
to ravendb
Note: I'm still not saying what I am doing is right, I am just saying
that you are wrong :)

Rob Ashton

Oct 1, 2010, 11:58:28 AM
to ravendb
So Ayende, as you know what I am talking about
-------------------------------------------------------------------------

My goal: To have the entities to expose behaviour only to prove it can
be done if we *were* doing DDD, and have indexes expose the data we
need for the views and therefore the data required to create commands
to go modify those entities (shouldn't need to access state on those
entities at all for this purpose?)

NHibernate:
-----------------
A pattern which we can use in NHibernate is to store all the state as
private fields, and that still allows SQL queries against the data to
create arbitrary projections of that data into the view models, this
is a PITA because it's a lot of work coming up with those SQL queries
and it's also *slow*, because we're constantly querying the store and
having to come up with the answers there and then.

We can use read-only properties if we want to perform nice type safe
queries on our domain to create those SQL queries (for those who don't
like HQL), but that still suffers from the same problem that reading
is an expensive operation

Recently people have decided that event sourcing is the way forward,
and that calls to behaviour methods on our entities won't change state
directly, but instead will register events which can be used by
various systems to allow views/reporting/whatever you want. This is
heavy weight and architecturally challenging to understand/implement.
This does however mean you get to pre-construct views and what have
you, and have multiple data stores for different purposes.

RavenDB:
--------------

Private members: Could do this, but then we can't access the data from
our documents unless we *always* do projections to get that data, in
which case how do we change the data unless we add setters around
those properties or... methods (okay, so we've got the write methods
no matter what) - but for the purposes of easily accessing data it
becomes a bit of a PITA

readonly properties: Could do this, we can now access the data from
our documents easily, creating test data becomes difficult though, we
can add methods on the documents/entities to manipulate their state of
course but it's a f-ugly solution

leave them all public and add methods to the entity to facilitate safe
modification, they're just for state: We can add methods on our
documents if we want to, but controlling state becomes difficult, you
want to make sure that your entities are logically intact and leaving
them open for any old code to manipulate seems dangerous. When you add
flattened data to your documents then you have to expose them as part
of your entity, even though that data has nothing to do with your
entity.

Leave them all public, but the behaviour layer doesn't expose the
documents: This is what I've gone for, as it means I can still easily
write tests by shoving raw documents in, in whatever state I so choose
- it also means that my view layer can just grab the data it needs in
whatever way it needs, but never are the actual documents exposed to
anything they shouldn't be, which means I can't accidentally modify
them.

You don't *have* to resort to event sourcing or multiple stores to get
the performance because you're able to create indexes instead of
direct queries to get projections of the data out (okay, we can't do
joins, but we can *under the hood* add additional flattened data to
the documents and leave our entity layer clean.)
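
For what it's worth, the fourth option above (public documents, with a behaviour layer that never exposes them) might be sketched roughly like this. The names ImageDocument, ImageEntity and Rename are invented for illustration and are not taken from the RavenGallery code; the sketch has no RavenDB dependency at all.

```csharp
using System;

// Hypothetical document: plain public state, easy to build in tests and
// free to carry flattened data that exists only for querying/views.
public class ImageDocument
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string OwnerUserId { get; set; }
    public string OwnerUsername { get; set; } // flattened from the user document
}

// Hypothetical entity: wraps the document and exposes behaviour only.
// Nothing outside this class (or the code that constructs it) can reach
// the inner document through the entity.
public class ImageEntity
{
    private readonly ImageDocument document;

    public ImageEntity(ImageDocument document)
    {
        if (document == null) throw new ArgumentNullException("document");
        this.document = document;
    }

    // State changes only happen through methods like this one.
    public void Rename(string newTitle)
    {
        if (string.IsNullOrWhiteSpace(newTitle))
            throw new ArgumentException("A title is required.", "newTitle");
        document.Title = newTitle.Trim();
    }
}
```

A test can create an ImageDocument in whatever state it likes, wrap it in an ImageEntity, call Rename, and then inspect the document directly, while application code only ever sees the entity.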

The essence of this is I want to be able to have a logically clean
domain layer and a fast view layer and all done in a way that the
common mortal would understand?

Thoughts?

Rob Ashton

Oct 1, 2010, 1:28:54 PM
to ravendb
Note: I've had a vote for

"readonly properties, despite non-entity-related-data-being-present",
it's not an awful compromise although it does feel muddy

Rob Ashton

Oct 1, 2010, 1:46:23 PM
to ravendb
I'm now leaning towards that too having hurt my head thinking about it
more, is this the general consensus? The properties are available for
linq queries, but modification is only allowed through the appropriate
behavioural methods. The documents are still readable so we *can*
create views by looking directly at them and do projections.
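
As a rough sketch of that compromise (readable properties, modification only through behaviour methods), assuming a hypothetical User class that is not taken from the project. Whether a serializer can repopulate the private setters depends on how it is configured, which this sketch does not cover.

```csharp
using System;

// Hypothetical entity/document: state is readable (so it can be used in
// LINQ queries and projections) but can only be changed via methods.
public class User
{
    public string Username { get; private set; }
    public string PasswordHash { get; private set; }

    public User(string username, string passwordHash)
    {
        if (string.IsNullOrWhiteSpace(username))
            throw new ArgumentException("A username is required.", "username");
        Username = username;
        PasswordHash = passwordHash;
    }

    // The only way to change the password hash from outside the class.
    public void ChangePassword(string newPasswordHash)
    {
        if (string.IsNullOrEmpty(newPasswordHash))
            throw new ArgumentException("A password hash is required.", "newPasswordHash");
        PasswordHash = newPasswordHash;
    }
}
```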

Easy enough to change if so, if I do change, the question is do I
modify my blog post or do I post a correction and link to it :) (I'm
thinking the latter, but I don't want to leave stale data on the
internets)

Ayende Rahien

Oct 1, 2010, 2:39:51 PM
to rav...@googlegroups.com
I actually completely disagree with you here.
Most of the functionality in the app should be in the entities.
Take a look at the MVC Music Store example in the Samples, look at the ShoppingCart class as a good example for that.

Ayende Rahien

Oct 1, 2010, 2:48:02 PM
to rav...@googlegroups.com
Basically, my point is that I don't see any benefit for you doing that.
I really like your entities/views separation, but for me it is that views are the output of indexes, and entities are documents.

Ayende Rahien

Oct 1, 2010, 2:48:38 PM
to rav...@googlegroups.com
I think that it would be better to explain WHY you are going that route.

Ayende Rahien

Oct 1, 2010, 2:58:36 PM
to rav...@googlegroups.com
inline

On Fri, Oct 1, 2010 at 5:58 PM, Rob Ashton <roba...@codeofrob.com> wrote:
> So Ayende, as you know what I am talking about
> -------------------------------------------------------------------------
>
> My goal: To have the entities to expose behaviour only to prove it can
> be done if we *were* doing DDD, and have indexes expose the data we
> need for the views and therefore the data required to create commands
> to go modify those entities (shouldn't need to access state on those
> entities at all for this purpose?)

Okay, that makes sense. You basically want to expose no state directly off the entities.

> NHibernate:
> -----------------
> A pattern which we can use in NHibernate is to store all the state as
> private fields, and that still allows SQL queries against the data to
> create arbitrary projections of that data into the view models, this
> is a PITA because it's a lot of work coming up with those SQL queries
> and it's also *slow*, because we're constantly querying the store and
> having to come up with the answers there and then.
>
> We can use read-only properties if we want to perform nice type safe
> queries on our domain to create those SQL queries (for those who don't
> like HQL), but that still suffers from the same problem that reading
> is an expensive operation

Which is how I usually work if I worry about someone else changing my state.

> Recently people have decided that event sourcing is the way forward,
> and that calls to behaviour methods on our entities won't change state
> directly, but instead will register events which can be used by
> various systems to allow views/reporting/whatever you want.

That is really aimed at systems with:
a) very high degree of complexity
b) high change rate

The idea is that you can have something like this:

> This is
> heavy weight and architecturally challenging to understand/implement.
> This does however mean you get to pre-construct views and what have
> you, and have multiple data stores for different purposes.
>
> RavenDB:
> --------------
>
> Private members: Could do this, but then we can't access the data from
> our documents unless we *always* do projections to get that data, in
> which case how do we change the data unless we add setters around
> those properties or... methods (okay, so we've got the write methods
> no matter what) - but for the purposes of easily accessing data it
> becomes a bit of a PITA
>
> readonly properties: Could do this, we can now access the data from
> our documents easily, creating test data becomes difficult though, we
> can add methods on the documents/entities to manipulate their state of
> course but it's a f-ugly solution
>
> leave them all public and add methods to the entity to facilitate safe
> modification, they're just for state: We can add methods on our
> documents if we want to, but controlling state becomes difficult, you
> want to make sure that your entities are logically intact and leaving
> them open for any old code to manipulate seems dangerous.

Your approach isn't really different. Your state is in another class, which is just as easy to modify.
My approach is public props with methods to modify state. It makes setting the state for tests easy, but the actual logic is handled in the methods.
Those are usually called from message consumers, and are very easy to follow.

For that matter, most of the time, you are going to be reading projections directly off of indexes, anyway. And I don't bind directly to entities, so it doesn't really matter.

Ayende Rahien

Oct 1, 2010, 3:00:02 PM
to rav...@googlegroups.com
There is no need to go to event sourcing for something as simple as this.
I would have:

blogPost.CorrectTitle(...); // may modify slug, or create an additional slug
blogPost.CorrectContent(..); // etc.
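
Fleshed out a little, that style (public properties, with the logic in methods) could look something like the sketch below. The BlogPost class, its properties and the slug rule are assumptions made for illustration; only the CorrectTitle name comes from the snippet above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical document/entity: public properties keep test set-up easy,
// while the actual logic lives in the methods.
public class BlogPost
{
    public string Title { get; set; }
    public string Slug { get; set; }

    // Correcting the title also regenerates the slug in this sketch.
    public void CorrectTitle(string newTitle)
    {
        if (string.IsNullOrWhiteSpace(newTitle))
            throw new ArgumentException("A title is required.", "newTitle");
        Title = newTitle.Trim();
        Slug = ToSlug(Title);
    }

    // Lower-cases the text and collapses every run of characters other
    // than a-z and 0-9 into a single hyphen, trimming hyphens at the ends.
    private static string ToSlug(string text)
    {
        var lowered = text.ToLowerInvariant();
        var hyphenated = Regex.Replace(lowered, "[^a-z0-9]+", "-");
        return hyphenated.Trim('-');
    }
}
```

A test can still set Title and Slug directly to arrange any starting state, and callers such as message consumers go through CorrectTitle.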

Rob Ashton

Oct 1, 2010, 3:10:50 PM
to ravendb
Okay, so you agree with the essence of what I'm desiring, but I need
to explain it far better and/or opt for a more simple and far less
anal model?

Thanks for taking the time to go through my mountains of text, I
honestly think it better that discussions like these are aired and
fully talked through - I might be capable of coming up with ways of
doing things and justifications for them, but a lot of people need
guidance or they just get stuck in a rut with no place to go.

I'll keep it as it is for the continuation of the blog series (because
I *am* anal), BUT armed with this discussion I'll write a blog entry
to publish after the current queue (currently two weeks long) with
the various opinions and considerations aired and the pros and cons of
each approach

(And I agree entirely about event sourcing being aimed at complex
systems with high change rates, and the current kerfuffle with people
falling head over heels to implement it in every system they write is
quite frankly staggering - what I aim to show with this is that you
can get a really good read experience with RavenDB without having to
do any manual work in constantly updating pre-computed views. For that
matter, pure domain driven design has a similar issue which is why I
am doing my best not to actually make any claims that that's what I am
doing with this modest project.)

Oh, and I hadn't seen the aspect domain driven design, I actually
rather like that approach - I wish I'd thought of it before starting
one of our work projects (we opted for transaction script style
interaction with NH because it's easier to override with a plug-in if
a customer wants something different) *guilty*

Rob Ashton

Oct 1, 2010, 3:18:27 PM
to ravendb
And yeah, you're right that I am still exposing all those properties
*somewhere* for something to modify if they really want to, but
they're a private implementation detail not accessible by anything
outside of the repository/entity itself and it makes it very clear
what the rules are with regards to enacting change on the underlying
state :)

I've ended up like this after working with teams who if they have a
way to do something, they'll do it, if I had public properties on my
entities with get/sets on them, then I'd be endlessly code reviewing
to find people having written their behaviour directly in the command
to just update the properties on the object without going through some
proper behaviour code.

Downsides to what I've done here, accessing child entities will be a
PITA if we ever want to enact change via anything other than the
aggregate - not that I should need to though if I am putting my
behaviour in the aggregate and allowing it to keep its unit intact.

Ajai Shankar

Oct 1, 2010, 3:32:49 PM
to rav...@googlegroups.com
Dammit - Rob types at speed of light! :)

@Ayende


"I actually completely disagree with you here.
Most of the functionality in the app should be in the entities.
Take a look at the MVC Music Store example in the Samples, look at the ShoppingCart class as a good example for that."

Could you explain more - what behavior do you put in entities?

From whatever I have seen in real life projects, the only thing you want to put is stuff such as adding an item to an order (especially to, say, get the NH back ref correct), or getting the order total from order items.

For example you do not exactly want to inject an email service into an Order and have it send out a notification on order.Submit().

In fact even hitting a tax service to calculate tax from an order would make us think twice...

I find it funny when they (not you Ayende) keep talking it's all about the domain, but never manage to give us a good idea of what behavior to put in an entity...

Ajai

Rob Ashton

unread,
Oct 1, 2010, 3:40:57 PM10/1/10
to ravendb
One day I hope to be able to write and multi-task as well as
Ayende ;-)

I started writing a response to that, but I got to a paragraph and
realised I wasn't going to be able to distil the rather complex
problem of domain modelling and how that works within a codebase to a
simple answer in an e-mail thread about RavenDB - you might consider
hitting up the DDD mailing list/reading the appropriate sites on the
subject instead! http://domaindrivendesign.org/

(Um, about to paste more entries from the upcoming blog)

Rob Ashton

unread,
Oct 1, 2010, 4:31:33 PM10/1/10
to ravendb

Rob Ashton

unread,
Oct 1, 2010, 7:53:45 PM10/1/10
to ravendb

Ayende Rahien

unread,
Oct 3, 2010, 10:30:36 AM10/3/10
to rav...@googlegroups.com
inline

On Fri, Oct 1, 2010 at 9:10 PM, Rob Ashton <roba...@codeofrob.com> wrote:
Okay, so you agree with the essence of what I'm desiring, but I need
to explain it far better and/or opt for a more simple and far less
anal model?


Yes.
 
Thanks for taking the time to go through my mountains of text, I
honestly think it better that discussions like these are aired and
fully talked through - I might be capable of coming up with ways of
doing things and justifications for them, but a lot of people need
guidance or they just get stuck in a rut with no place to go.


I would also suggest linking to the discussion, might be interesting to see how it goes.

Ayende Rahien

unread,
Oct 3, 2010, 10:36:05 AM10/3/10
to rav...@googlegroups.com
inline

On Fri, Oct 1, 2010 at 9:18 PM, Rob Ashton <roba...@codeofrob.com> wrote: 
I've ended up like this after working with teams who if they have a
way to do something, they'll do it, if I had public properties on my
entities with get/sets on them, then I'd be endlessly code reviewing
to find people having written their behaviour directly in the command
to just update the properties on the object without going through some
proper behaviour code.

I actually find that it is pretty easy to do code reviews on code like this.
Using Udi's law of "you can call a single method on a single entity" makes it very easy to scan a large number of command handlers and very quickly figure out whether they violated the rules or not.
 
Downsides to what I've done here, accessing child entities will be a
PITA if we ever want to enact change via anything other than the
aggregate - not that I should need to though if I am putting my
behaviour in the aggregate and allowing it to keep its unit intact.

Um, why would you want to enact change except through the aggregate?

Ayende Rahien

unread,
Oct 3, 2010, 10:43:59 AM10/3/10
to rav...@googlegroups.com

Rob Ashton

unread,
Oct 3, 2010, 10:50:48 AM10/3/10
to ravendb
"Um, why would you want to enact change except through the aggregate?"

I wouldn't, I was just thinking out loud :)

Ayende Rahien

unread,
Oct 3, 2010, 10:52:21 AM10/3/10
to rav...@googlegroups.com
Why do you need ImageTagDocument? Simple string would work, I think.

Another problem with public ImageBrowseView Load(ImageBrowseInputModel input) is that you are loading ALL documents; if you have non-Image docs in the DB, it will crash.

Ayende Rahien

unread,
Oct 3, 2010, 10:56:14 AM10/3/10
to rav...@googlegroups.com
"Eventually, if used enough these will be promoted into permanent indexes, but a restart of the server will mean these temporary indexes are cleared and the process has to start again."

This seems to indicate that even permanent indexes are removed on restart.

"This is sub-optimal and while they are great for rapid development and the sketching out of ideas (and a few other purposes), if you are using a query as a common part of your codebase then the indexes should all be created as part of a deploy/build process and RavenDB should be left to index documents in its own time."

Why is this sub-optimal? In the long run, it will run just as efficiently as the normal indexes. I would actually say we want to encourage that for the simple queries, and recommend standard indexes for the complex situations only.

Rob Ashton

unread,
Oct 3, 2010, 10:57:31 AM10/3/10
to ravendb
And yeah, I'll link to the discussion when I write the blog entry, I
really think that at the end of the day unless you're just plain
misguided and don't think behaviour should be in the domain - the real
question here is just "how anal are you about separating these
things"

I'll probably phrase it somewhere along those lines too.

Rob Ashton

unread,
Oct 3, 2010, 11:07:07 AM10/3/10
to ravendb
Okay, is that your feeling over dynamic indexes then? I much prefer
pre-defining my indexes for queries that I know about, but I can have
a go at re-wording that to make that opinion clear whilst also making
the underlying functionality more obvious - I just don't like that an
app restart can potentially end up with stale data (I guess making
temp indexes permanent does negate that).

I have ImageTagDocument because I wanted my first map/reduce example
to be obvious that I was looking at the collection and its properties.
I'm also not sure if the dynamic Any() works with just plain arrays of
strings (it probably should, and not checking is just laziness - my
bad).

RE: ImageDocument - yeah, also my bad - I actually caught that an
hour after writing, but I haven't updated either my published and
queued blog entry or the Google Document - I will rectify that.
( http://bit.ly/bBDvL0 )

Ayende Rahien

unread,
Oct 3, 2010, 11:15:39 AM10/3/10
to rav...@googlegroups.com
Dynamic Queries seems to fill the gap, and there really isn't a perf problem with them for most real world scenarios.
I agree that standard indexes feel better, but I have to wonder if it isn't just that this is what we are used to.
Can you try to formulate why you like the standard over the dynamic?

Ayende Rahien

unread,
Oct 3, 2010, 11:15:56 AM10/3/10
to rav...@googlegroups.com
And how can an app restart end up with stale data?

Rob Ashton

unread,
Oct 3, 2010, 11:18:19 AM10/3/10
to ravendb
I mean *more* stale data when RavenDB restarts

If you have permanent indexes, then if the documents are indexed, then
that's it - the documents are indexed
If you have temporary indexes, when you first load a page that uses
that index, you might get incredibly old data and have to keep
refreshing until you see the data you want (unless it's a common page
in which case they will have become permanent indexes).

Okay, perhaps I need to explain it that way - "What is your use case
for these indexes", "how do you expect them to be used"

Rob Ashton

unread,
Oct 3, 2010, 11:19:23 AM10/3/10
to ravendb
This is especially true if you are using Embedded Raven

Ayende Rahien

unread,
Oct 3, 2010, 11:48:05 AM10/3/10
to rav...@googlegroups.com
I am sorry, but I don't see the code to support that assumption.

Temporary indexes are just like normal indexes, they are handled in exactly the same manner.

For that matter, let us say that we create a temp index, then restarted, and queried the index again.
Since on startup, that index is purged, it will be re-created.
There isn't any chance for stale data.

Rob Ashton

unread,
Oct 3, 2010, 11:58:46 AM10/3/10
to ravendb
Okay, perhaps I'm not explaining myself fully still or I'm just plain
wrong (Perhaps my definition of stale data, stale query results are
probably more correct a term?)

If you have 10,000 documents in the store, and you have a temporary
index defined which isn't called enough to get promoted into a
permanent index

On first call to the index, you ask for page 0 and 100 items - the
temporary index creation code will wait until it has *any* 100 items
and then return that data, it doesn't wait for non stale results.

It will just return the first 100 items that happen to be indexed and
match the query, and it doesn't matter if more relevant data is
present in the store.

Of course the documents themselves are valid, and the search results
themselves are valid... just out of date with what would be returned
if the index was created on start-up and all the documents already
indexed.

This is with the assumption that indexed data itself is deleted when
an index is destroyed.

Rob Ashton

unread,
Oct 3, 2010, 12:01:34 PM10/3/10
to ravendb
Of course if you're performing an order by date and you're asking for
the most recent items first, you'll get very old items appearing

I think perhaps I care about this too much, as the data *itself* is
correct then it's not like the users are likely to be able to do bad
things with it - they might just wonder where their data got to

Ayende Rahien

unread,
Oct 3, 2010, 12:08:57 PM10/3/10
to rav...@googlegroups.com
Not really.
What happens is that the index runs and you return the first 100 results.
But the index keeps running until it is removed.
What it means is that on the next call, it is probably already non stale.
More than that, because for the first time, you are literally querying the data store at a point in time, the data can't be stale.

Ayende Rahien

unread,
Oct 3, 2010, 12:09:26 PM10/3/10
to rav...@googlegroups.com
Oh, that is a factor, yes.
Doing ordering on a dynamic index is something that we should discourage.
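
To make the ordering concern concrete, here is a small Python simulation (not RavenDB code). It assumes, as described earlier in the thread, that a freshly created temporary index has only processed part of the store when the first query arrives, and that the query returns whatever has been indexed so far. The store layout, the 500-document cut-off and all names are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical store: 10,000 documents, one per day, oldest first.
start = date(2000, 1, 1)
store = [{"id": i, "created": start + timedelta(days=i)} for i in range(10_000)]


def query_newest(indexed_docs, page_size=100):
    """Return the newest `page_size` docs among those indexed so far."""
    return sorted(indexed_docs, key=lambda d: d["created"], reverse=True)[:page_size]


# Assumption: indexing walks the store in insertion order, and the first
# query arrives after only 500 documents have been processed.
partial_index = store[:500]
full_index = store

first_page_partial = query_newest(partial_index)
first_page_full = query_newest(full_index)

print(first_page_partial[0]["id"])  # newest doc the partial index knows about
print(first_page_full[0]["id"])     # newest doc actually in the store
```

Every document on the partial page is a real, correct document; it is just nowhere near the newest one in the store, which is the "where did my data go" effect when sorting on a brand new index.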

Rob Ashton

unread,
Oct 3, 2010, 12:34:48 PM10/3/10
to ravendb
Okay, I'll see what I can do with regards to a proper explanation of the
above discussion -

Summary:

Dynamic indexes can be used if
- Sorting isn't being used on the return result
- It's a commonly run query

Proper indexes should be used if
- Sorting is being used on the return result
- A large number of members are being referenced to (the only
*real* downside is the index name)
- The query is rarely executed but up to date data is largely
desired

Dynamic queries are also useful for short-lived queries when testing
out new ideas and analyzing the data store, but
WaitForNonStaleResultsAsOfNow should probably be used to get the full
effect.

I think you're right that we're used to "proper" indexes, I foresee
people using dynamic queries more than we might necessarily ourselves.

Ayende Rahien

unread,
Oct 3, 2010, 1:44:11 PM10/3/10
to rav...@googlegroups.com
Yes, that seems to be pretty accurate. And the sorting is only a valid problem for the first few queries.

Ayende Rahien

unread,
Oct 3, 2010, 1:45:18 PM10/3/10
to rav...@googlegroups.com
Another disadvantage for the dynamic query is that if the query is really rarely executed, we pay some price to maintain the index for the next 10 min.
It might be better to use a linear query instead.

Rob Ashton

unread,
Oct 3, 2010, 1:48:02 PM10/3/10
to ravendb
Good point, I'll make sure to mention that too

Rob Ashton

unread,
Oct 4, 2010, 5:45:20 AM10/4/10
to ravendb

Rob Ashton

unread,
Oct 4, 2010, 5:56:36 AM10/4/10
to ravendb
I'm now writing an entry on full-text search, (searching multiple
fields)

The way I see it is

You can do

from doc in docs
select new
{
    SearchText = doc
}

And index the entire document, I can't work out how RavenDB does this,
I assume it serializes to JSON or something similar?
The downside is that it's entirely non-selective, you're not choosing
what fields you want to search or how you want to search them,
unexpected consequences abound (and the index is going to get quite
large, but that's not an issue is it?)

We can also do

from doc in docs
from tag in doc.Tags
select new
{
    SearchText = doc.Title + doc.Description + tag.Name
}

My thoughts on this are that it's quite good because we're being picky
about the fields we're searching for, but say we had two tags, "Cats"
and "Dogs" (and they weren't mentioned in title or description)

If I want to do a search for all documents which have tags of BOTH
"Cats" and "Dogs"

is a search like that going to work with this method? It is my perhaps
incorrect understanding that two entries in the index would be created
with the above query, one with "Cats" and one with "Dogs" (because of
the select many)

We can also do something like this

from doc in docs
from tag in doc.Tags
select new
{
    doc.Title,
    doc.Description,
    tag.Name
}

This suffers from the same problems as the previous, but has the
advantage that we can specifically only search for the fields we're
interested in, this is my preference, but assuming I'm correct about
"Cats" and "Dogs", I'd like to be able to do something like this

from doc in docs
select new
{
    Title = doc.Title,
    Description = doc.Description,
    Tags = doc.Tags.Select(x => x.Name).ToArray()
}

Or something equivalent to that anyhow

Rob Ashton

unread,
Oct 4, 2010, 8:34:56 AM10/4/10
to ravendb

Rob Ashton

unread,
Oct 4, 2010, 9:41:33 AM10/4/10
to ravendb
And I've summarised that discussion, please let me know if I've missed
anything out - I tried to be thorough

Rob Ashton

unread,
Oct 4, 2010, 12:15:38 PM10/4/10
to ravendb
Improving search by using Projections
https://docs.google.com/document/edit?id=1FeyEh4o0iFzAWMbCTCI7ErdE8qonaf0-qRLzXV91QjY&hl=en&authkey=CKv-vucL

These two entries are sure candidates for corrections, I've gone on my
best understanding of what is going on, but I'm 20% sure there are
some errors in there.

Matt Warren

unread,
Oct 4, 2010, 12:46:02 PM10/4/10
to ravendb
Rob,

If you use Luke (from http://code.google.com/p/luke/), you can look at
what's stored in the Lucene index that RavenDB uses. The GUI is a bit
ugly, but it does the job.

Now that the default is NotAnalyzed the field will be stored as-is, so
1 piece of text.

Rob Ashton

unread,
Oct 4, 2010, 12:50:45 PM10/4/10
to ravendb
Ah, thanks - that looks like just the ticket

Matt Warren

unread,
Oct 4, 2010, 12:51:20 PM10/4/10
to ravendb
Also if you take a look at the docs here http://ravendb.net/documentation/how-indexes-work-updated,
I added a section that shows how the built-in Lucene analysers work,
in case the default one doesn't do what you need.

Rob Ashton

unread,
Oct 4, 2010, 12:59:44 PM10/4/10
to ravendb
Yeah, I've read that - it was quite informative thanks

Ayende Rahien

unread,
Oct 4, 2010, 3:26:44 PM10/4/10
to ravendb
inline

On Mon, Oct 4, 2010 at 11:45 AM, Rob Ashton <roba...@codeofrob.com> wrote:
Implementing a real-time tag search:
https://docs.google.com/document/edit?id=1g1mTb9kEyE16gL7s_WcamZwQ_E_SG6w1PtmkLX1Lf-M&hl=en&authkey=CKe31oUC


I like it.
But, can't you just do StartsWith in the linq query and it would give you better search results?
Won't give you auto complete, though.

You need to change the map/reduce index so they would both output the same value.
Right now we are not actually checking for this, but it is important when we implement reduce groups

Ayende Rahien

unread,
Oct 4, 2010, 3:32:50 PM10/4/10
to ravendb
                                SearchText = doc.Title + tag.Name

Note that for this to work you HAVE to add whitespace between them.
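
A quick way to see why the whitespace matters, sketched in Python rather than RavenDB code. It assumes the field is analyzed by something that splits on whitespace and punctuation; the `tokenize` helper is a rough, hypothetical stand-in for such an analyzer, not the real Lucene one.

```python
import re


def tokenize(text):
    """Rough stand-in for an analyzer: lowercase, split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


title, tag = "Sleepy kitten", "Cats"

fused = tokenize(title + tag)          # the text indexed is "Sleepy kittenCats"
spaced = tokenize(title + " " + tag)   # the text indexed is "Sleepy kitten Cats"

print(fused)   # ['sleepy', 'kittencats']
print(spaced)  # ['sleepy', 'kitten', 'cats']
```

Without the separator the last word of the title and the tag collapse into a single term, so a search for "cats" or "kitten" would find nothing in that field.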

Ayende Rahien

unread,
Oct 4, 2010, 3:29:53 PM10/4/10
to ravendb
inline

On Mon, Oct 4, 2010 at 11:56 AM, Rob Ashton <roba...@codeofrob.com> wrote:
I'm now writing an entry on full-text search, (searching multiple
fields)

The way I see it is

You can do

from doc in docs
select new
{
   SearchText = doc
}

And index the entire document, I can't work out how RavenDB does this,
I assume it serializes to JSON or something similar?

Yes, it is serialized to json, and if you set it to analyzed, you can probably get pretty good results
 
The downside is that it's entirely non-selective, you're not choosing
what fields you want to search or how you want to search them,
unexpected consequences abound (and the index is going to get quite
large, but that's not an issue is it?)

We can also do

from doc in docs
from tag in doc.Tags
select new
{
    SearchText = doc.Title + doc.Description + tag.Name
}


Argh, ugly. This would be better:

from doc in docs
from tag in doc.Tags
select new
{
    SearchText = new string[]{doc.Title, doc.Description, tag.Name}
}
 
My thoughts on this are that it's quite good because we're being picky
about the fields we're searching for, but say we had two tags, "Cats"
and "Dogs" (and they weren't mentioned in title or description)

If I want to do a search for all documents which have tags of BOTH
"Cats" and "Dogs"

from doc in docs
select new { doc.Tags }

Tags:Cat AND Tags:Dog
 
is a search like that going to work with this method? It is my perhaps
incorrect understanding that two entries in the index would be created
with the above query, one with "Cats" and one with "Dogs" (because of
the select many)

Using your index, there is no way to ask for something that matches both, no.
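
The difference between the two index shapes can be modelled in a few lines of Python (an in-memory sketch, not RavenDB or Lucene code). It assumes an AND query has to be satisfied by a single index entry; the documents and helper names are hypothetical.

```python
docs = [
    {"id": "images/1", "title": "Pets", "tags": ["Cats", "Dogs"]},
    {"id": "images/2", "title": "Kitten", "tags": ["Cats"]},
]


def fan_out_entries(docs):
    """One index entry per (document, tag) pair, like 'from tag in doc.Tags'."""
    return [{"id": d["id"], "tags": [t]} for d in docs for t in d["tags"]]


def per_doc_entries(docs):
    """One index entry per document carrying all of its tags, like 'select new { doc.Tags }'."""
    return [{"id": d["id"], "tags": list(d["tags"])} for d in docs]


def match_all(entries, wanted):
    """Ids of entries whose tag field contains every wanted tag (an AND query)."""
    return sorted({e["id"] for e in entries if all(w in e["tags"] for w in wanted)})


print(match_all(fan_out_entries(docs), ["Cats", "Dogs"]))  # []
print(match_all(per_doc_entries(docs), ["Cats", "Dogs"]))  # ['images/1']
```

With the fan-out shape each entry only ever sees one tag, so no single entry can satisfy both terms; with one entry per document the same query finds the image tagged with both.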

Ayende Rahien

unread,
Oct 4, 2010, 3:37:56 PM10/4/10
to ravendb
This is with the original LuceneQuery, to get access to this functionality in a LINQ query we’d have to use Customize to get hold of the LuceneQuery.

from image in session.Query<Image>("index")
select new { image.Title, image.Filename};

Should also do projection

Ayende Rahien

unread,
Oct 4, 2010, 3:35:10 PM10/4/10
to ravendb
Matt,
Is there a reason why this isn't linked to the main documentation page?

Rob Ashton

unread,
Oct 4, 2010, 3:55:26 PM10/4/10
to ravendb
Projections with LINQ: when I tried that it didn't work - that was
into a concrete type though - I'll update that.

Spaces in index: That's a gotcha, I like the arrays - I'll use that

Yes, I can do StartsWith, the original aim was to do it this way and
then show how we set the Analyzed attribute (this is before we
realised StartsWith works with NotAnalyzed!)
Trying not to cover too many topics in a single post

RE Cats + Dogs, let's move that discussion to the "Advice" thread, as
that's a part of that =)

Feedback noted, I'll update the entries with valid information.

Matt Warren

unread,
Oct 4, 2010, 5:52:37 PM10/4/10
to ravendb
I think I messed up the link when I posted the update to that page,
but it's fixed now. The page is at http://ravendb.net/documentation/how-indexes-work
which is linked to from http://ravendb.net/documentation.

Matt Warren

unread,
Oct 4, 2010, 5:57:57 PM10/4/10
to ravendb
You should be able to do projection like this if you want to specify
the type:
var projectionResult = s.Query<OrderItem>("ByLineCost")
    .Where(x => x.Cost > 1)
    .Select(x => new SomeDataProjection { Cost = x.Cost });

but the anonymous type way is probably cleaner.

Matt Warren

unread,
Oct 4, 2010, 7:24:50 PM10/4/10
to ravendb
Rob/Ayende

Going back to the dynamic queries discussion, taking into account the
summary below

> Summary:
>
> Dynamic indexes can be used if
> - Sorting isn't being used on the return result
> - It's a commonly run query
>
> Proper indexes should be used if
> - Sorting is being used on the return result
> - A large number of members are being referenced to (the only *real* downside is the index name)
> - The query is rarely executed but up to date data is largely desired

Is it now fair to say that you'd recommend people use dynamic queries
first and then a "proper" index if dynamic doesn't meet their needs?

Also how do dynamic queries cope with analysed/not-analysed issues and
full/partial text searches?

Ayende Rahien

unread,
Oct 5, 2010, 12:09:05 AM10/5/10
to ravendb
Yes, I suggest starting with auto indexing first.
I think that for full text searches (analyzed fields) we can require them to create an index for that.

I think that after the current round of improvements for auto indexes, we might want to figure out if the query requires full text indexing and auto recognize that, but that is about it.