Concerns about consistency in RavenDB


David Cleave

Apr 7, 2014, 1:48:53 PM
to rav...@googlegroups.com
I understand that Raven puts availability over consistency. I understand it talks about *eventual consistency*. But I have tests that demonstrate inconsistency that I am concerned about.

In a nutshell, they do the following. All steps are performed via calls to a web API which in turn talks to Raven. All software is running on my dev box.

1) Insert documents into an empty collection (max three documents).
2) Make modifications to collection properties within the documents. I have tried this both by a) using patching and b) simply loading documents up, modifying the property and saving them.
3) Load the documents up again and run assertions on the contents of the collection properties.

I am finding that in many cases, some documents are being left in an inconsistent state, as if the modification has not touched them. I assume this is because step 2 finds the documents to modify using indexes which are stale (i.e. the indexes are still being built after the inserts done in step 1).

Putting a short Thread.Sleep (around 500 ms) between steps 1 and 2 fixes this problem.

I am very concerned about this. In production, Raven will actually be running in two separate boxes using replication, so I would imagine index rebuilding will be slower yet.

What steps can I take to ensure these modifications result in a consistent state? Or is Raven just not designed for this?

Michael Yarichuk

Apr 7, 2014, 1:56:34 PM
to rav...@googlegroups.com
Hi,
Can you share what you are doing in the form of code?
Also, have you checked for stale indexes before step 3?

You can also look at the WaitForIndexing() implementation (used for unit testing).





--
Best regards,
Michael Yarichuk
RavenDB Core Team
Tel: 972-4-6227811
Fax: 972-153-4-6227811
Email: michael....@hibernatingrhinos.com

Jahmai Lay

Apr 8, 2014, 1:00:25 AM
to rav...@googlegroups.com

Some code would help give a more meaningful answer, but in essence Raven expects you to design your system around eventual consistency if you are using indexes. If you need read-your-writes consistency for a particular use case, you should avoid indexes and use the document id(s) directly (Store, Load), or construct your keys such that they can be loaded using the StartingWith method. I recommend against waiting for indexes to catch up (WaitForStaleIndexes, etc) except in special cases.
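
A minimal sketch of the key-based approach, assuming the RavenDB .NET client of that era (Raven.Client). The Widget class and the "widgets/{group}/{name}" key scheme are hypothetical and only there for illustration. The idea is that Load and LoadStartingWith address documents by key rather than through an index, so they do not depend on indexing having caught up.

```csharp
using System.Collections.Generic;
using Raven.Client;

// Hypothetical entity, for illustration only.
public class Widget
{
    public string Id { get; set; }
    public string GroupName { get; set; }
    public string Name { get; set; }
}

public static class KeyBasedAccess
{
    // Hypothetical key scheme: "widgets/{group}/{name}".
    public static string GroupPrefix(string groupName)
    {
        return "widgets/" + groupName + "/";
    }

    public static string KeyFor(string groupName, string name)
    {
        return GroupPrefix(groupName) + name;
    }

    public static void Insert(IDocumentSession session, Widget widget)
    {
        // Store under an explicit id so the document can be found later without a query.
        session.Store(widget, KeyFor(widget.GroupName, widget.Name));
        session.SaveChanges();
    }

    public static Widget LoadOne(IDocumentSession session, string groupName, string name)
    {
        // Load by id: no index is involved.
        return session.Load<Widget>(KeyFor(groupName, name));
    }

    public static List<Widget> LoadGroup(IDocumentSession session, string groupName)
    {
        // Prefix load over the key space. The page size is passed explicitly
        // because the default page size is limited.
        var docs = session.Advanced.LoadStartingWith<Widget>(GroupPrefix(groupName), pageSize: 128);
        return new List<Widget>(docs);
    }
}
```

This only helps when the lookup criterion is stable enough to live in the key; data that changes often is a poor fit for a key component.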

David Cleave

Apr 8, 2014, 7:12:48 AM
to rav...@googlegroups.com
@Michael: there's quite a lot of code. This is an integration test, and we're actually accessing Raven indirectly through a web API layer. Therefore, the unit test helpers such as WaitForIndexing() are of no help here. We do check for stale indexes before step 3, but the problem is with stale indexes between steps 1 and 2.

@Jahmai: thanks for your input. The data I'm modifying is a child collection of the actual objects stored in the collection, and subject to frequent change. Therefore it's not a suitable candidate to form part of the document key.

I have had some success by retrieving the documents using WaitForNonStaleResultsAsOfNow() in the modification step (step 2). This appears to allow the indexes to have completed updating after step 1 (inserting the documents), so all the expected documents are modified. I understand that this exposes us to bugs around clock synchronisation, which we'll have to look out for. WaitForNonStaleResultsAsOfLastWrite() is of no use to us since we are using Raven across two machines.

Oren Eini (Ayende Rahien)

Apr 8, 2014, 7:55:10 AM
to ravendb
We really can't help if you don't show us what you are doing.



Oren Eini
CEO
Mobile: + 972-52-548-6969
Office: + 972-4-674-7811
Fax: + 972-153-4622-7811

David Cleave

Apr 8, 2014, 12:50:49 PM
to rav...@googlegroups.com
Okay, here is some code to look at.

Note that I have replaced the actual entity type I'm working with. So in this example, we're using good old Foo (shown below). The actual code for working with Raven is unchanged.

public class Foo
{
  public string Name {get; set;}
  public string GroupName {get; set;}
  public IEnumerable<Bar> Bars {get; set;}
}

public class Bar
{
  public BarType Type {get; set;}
  public string Name {get; set;}
}

Also note that the IDocumentStore is injected and has a per-web-request lifestyle. I believe that this prevents me from using WaitForNonStaleResultsAsOfLastWrite(), but I think that's of no use to me anyway because the application is actually hosted on two load-balanced machines, with the Raven server on each replicating to the other.

Step 1) (inserting)
// Each insert is the result of a call to a web API. The call results in this code to work with Raven:
using (var session = _documentStore.OpenSession())
{
  session.Store(fooToInsert);
  session.SaveChanges();
}

Step 2) (modifying)
// This modification is the result of a single call to the web API
using (var session = _documentStore.OpenSession())
{
  // Remove all Bars of a particular type and name from all Foos in a certain group
  var matchingRecords = session.Query<Foo>()
    .Where(x => x.GroupName == groupName && x.Bars.Any(y => y.Type == barType && y.Name == name))
    .Customize(x => x.WaitForNonStaleResultsAsOfNow())
    .ToList();

  foreach (var foo in matchingRecords)
  {
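    // Drop only the Bars that match both the type and the name, mirroring the query filter
    foo.Bars = foo.Bars
      .Where(x => !(x.Type == barType && x.Name == name)).ToList();
    session.Store(foo);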
  }

  session.SaveChanges();
}

I'm actually finding that the WaitForNonStaleResultsAsOfNow() call ensures that all the Foos get updated, even in the load-balanced environment, which is great. But I'm vulnerable to bugs around discrepancies in system clocks, and would like to know if there's a better approach.

Oren Eini (Ayende Rahien)

Apr 8, 2014, 1:23:04 PM
to ravendb
You should NOT have a per-request lifestyle for a document store.
And you are trying to do a _query_ on items that you have just inserted.

What load-balancing approach?




Mircea Chirea

Apr 8, 2014, 1:35:17 PM
to rav...@googlegroups.com
The document store should be global and shared; it's a singleton. You open as many sessions as you need (one per request).
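
A minimal sketch of that arrangement, assuming the RavenDB 2.x/3.x .NET client. The URL, the database name, and the DocumentStoreHolder and FooWriter names are assumptions for illustration, not taken from the code in this thread.

```csharp
using System;
using Raven.Client;
using Raven.Client.Document;

public static class DocumentStoreHolder
{
    // Created once per process, on first use, and then shared by all requests.
    private static readonly Lazy<IDocumentStore> LazyStore = new Lazy<IDocumentStore>(() =>
    {
        var store = new DocumentStore
        {
            Url = "http://localhost:8080", // assumption: local dev server
            DefaultDatabase = "FooDb"      // assumption: hypothetical database name
        };
        store.Initialize();
        return store;
    });

    public static IDocumentStore Store
    {
        get { return LazyStore.Value; }
    }
}

public static class FooWriter
{
    // Sessions are cheap: open one per request (unit of work) from the shared store.
    public static void Insert(object entity)
    {
        using (var session = DocumentStoreHolder.Store.OpenSession())
        {
            session.Store(entity);
            session.SaveChanges();
        }
    }
}
```

With a DI container the same effect comes from registering the store with a singleton lifestyle and the session with a per-request lifestyle.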

David Cleave

Apr 9, 2014, 4:28:52 AM
to rav...@googlegroups.com
The load balance approach is round robin - so requests go to each server alternately. This could be changed though.

"And you are trying to do a _query_ on items that you have just inserted."

Yes - what's wrong with that? The code I'm describing is part of an integration test, but in theory, we could certainly receive a request to insert some new data and then receive another request shortly afterwards to modify data that would include the data just inserted.

Chris Marisic

Apr 9, 2014, 8:49:49 AM
to rav...@googlegroups.com




If you use Load, there will never, ever be any issues with availability of a document (unless you're doing replication and read from a node other than the master that doesn't have the doc yet). If you use a query, you must fully embrace eventual consistency: "and then receive another request shortly afterwards to modify data" may ultimately be impossible to do. The workaround is to add retries and timeouts to your operation, expecting that the data will be there if you give it some more time. The other option is to rework this code so it runs off loads.

Ultimately, your application should rarely, if ever, base decisions on information from queries. Queries are for searching and reports; they are not meant to be decision sources. Decisions should be made around transactional boundaries that eliminate all maybe scenarios and leave you with only an atomic accept/reject. The basic premise is to start from the view that everything is stale, and not to try to address that as a concern anywhere except at document modification. If you can never come to terms with that fact about eventual consistency, then if your apps ever have real load thrown at them, expect to spend orders of magnitude more money on database hardware for massive vertical scaling. Rigid consistency model === vertical scaling.
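
The retry-and-timeout workaround mentioned above can be sketched as a small helper. This is plain C# with no RavenDB dependency; the Retry class and its TryUntil method are hypothetical names, and the caller decides what counts as a good enough result.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Retry
{
    // Calls attempt() until isGoodEnough accepts the result or the timeout would
    // be exceeded. Returns true on success. On false, result holds the last
    // attempt and the caller has to handle the "not there yet" case explicitly.
    public static bool TryUntil<T>(Func<T> attempt, Func<T, bool> isGoodEnough,
                                   TimeSpan timeout, TimeSpan delay, out T result)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            result = attempt();
            if (isGoodEnough(result)) return true;
            if (clock.Elapsed + delay > timeout) return false;
            Thread.Sleep(delay);
        }
    }
}
```

In the scenario from this thread, attempt would run the query and isGoodEnough would check that the expected documents came back. That only works when the caller knows what to expect, which is one reason the load-based design is usually the better option.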

Oren Eini (Ayende Rahien)

Apr 9, 2014, 1:36:40 PM
to ravendb
Who does the load balancing?





Federico Lois

Apr 10, 2014, 12:11:50 AM
to rav...@googlegroups.com
From experience, you MUST NOT EVER EVER EVER EVER use WaitForNonStaleResultsAsOfNow() or any variant of it (not even in tests; you have WaitForIndexing there). Example: https://groups.google.com/forum/#!topic/ravendb/mCOSyxdC6Ws

Part of the rearchitecture I mentioned was in the login system. I cannot show you the old one (I scrapped that from my memory :P), but you can see what I mean here: https://github.com/Corvalius/Membership-ravendb
In the code you will find that only a single query was needed for that. The rest, which would have required a WaitForNonStaleResultsAsOfNow(), is solved through references. I found that type of design to be both performant and, once you internalize it, very simple to follow. Intention is explicit from the get-go.

Any question about it, just shoot :)

Kind regards,
Federico

Justin A

Apr 11, 2014, 4:30:52 AM
to rav...@googlegroups.com
thread-steal: @Federico, there's also: https://github.com/SimpleAuthentication/SimpleAuthentication as another nice** alternative. I've used it heaps of times against RavenDB. (Would be awesome if I could even finish a project to get it off my fricking dev machine :(  )

** I sorta started that project. Sorta. No bias at all ..nooooo... :)

Federico Lois

Apr 11, 2014, 8:42:55 AM
to rav...@googlegroups.com
thread-steal: @Justin looking good and comprehensive. We didn't find it when we were shopping for a solution back in early 2013 :D ... We were looking for a provider that had social, local account and roles. The Microsoft approach was what we needed but didn't support plugging in storage providers, so we wrote one ourselves. We tried to preserve the source specs as much as we could, looking forward to them eventually providing a providers interface :) . We had been talking with the team about open sourcing it for the latter part of 2013 (after stabilization on our production environments), but just recently we dropped the code there. It is still missing some changes in namespaces, the NuGet account (we use a private one on our infrastructure for internal development) and the like. That I will probably do over the weekend if I find the time to do so and create a proper thread here :).



David Cleave

Apr 11, 2014, 9:05:00 AM
to rav...@googlegroups.com
It's hosted in Azure. Azure does the load balancing. Why do you ask?

Oren Eini (Ayende Rahien)

Apr 11, 2014, 9:06:42 AM
to ravendb
Load balancing of _what_ ? RavenDB instances?

David Cleave

Apr 11, 2014, 9:10:41 AM
to rav...@googlegroups.com
Sorry, specifically what was the problem you encountered with WaitForNonStaleResultsAsOfNow()? I can't tell from the link.

I'm aware of the potential for server clock mismatch problems, but not anything else.

Federico Lois

Apr 11, 2014, 8:01:41 PM
to rav...@googlegroups.com
Basically, you cannot predict whether you are going to be behind, or for how long. Let's say that an index gets reset (or needs to be reset because of a new definition) and you have 10 GB of data; then that index may be stale for a loooooong time. Long enough for your WaitForNonStaleResultsAsOfNow() call to fail with a timeout. You may tolerate that in a batch background process that you can reinitiate later without trouble; you cannot when the one waiting for the response is the user.

If you are using WaitForXXX calls then you are subject to __unknown wait times and also timeouts__ and you DO have to deal with them in your normal codepath. In the simple membership codebase you will find that entities named XXXRef are used to leave tombstones to find the objects with a secondary key, and on close inspection you will find that most of those are in places where you would have issued a query in a SQL database (where you are not subject to eventual consistency).
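
A rough sketch of what such a reference document can look like, assuming the RavenDB .NET client. This is not the code from the Membership repository: the Account and AccountByUsernameRef classes and the "accounts/by-username/" key scheme are hypothetical. The secondary key is encoded in the reference document's id, so the lookup becomes two loads instead of a query.

```csharp
using Raven.Client;

// Hypothetical entity, for illustration only.
public class Account
{
    public string Id { get; set; }
    public string Username { get; set; }
}

// Hypothetical reference document: its id encodes the secondary key and
// its body points at the real document.
public class AccountByUsernameRef
{
    public string Id { get; set; }
    public string AccountId { get; set; }
}

public static class AccountLookup
{
    public static string RefIdFor(string username)
    {
        // Normalise so that " Alice " and "alice" map to the same reference.
        return "accounts/by-username/" + username.Trim().ToLowerInvariant();
    }

    public static void Create(IDocumentSession session, Account account)
    {
        // Assumption: default id generation, which assigns account.Id on Store.
        session.Store(account);
        session.Store(new AccountByUsernameRef { AccountId = account.Id },
                      RefIdFor(account.Username));
        // Both documents go to the server in a single SaveChanges batch.
        session.SaveChanges();
    }

    public static Account FindByUsername(IDocumentSession session, string username)
    {
        // Two loads by id, no index involved, so nothing to wait for.
        var reference = session.Load<AccountByUsernameRef>(RefIdFor(username));
        return reference == null ? null : session.Load<Account>(reference.AccountId);
    }
}
```

The cost is that the reference has to be maintained whenever the secondary key changes, for example by deleting the old reference and storing a new one in the same SaveChanges call.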
 

Vlad Kosarev

Apr 14, 2014, 12:00:13 PM
to rav...@googlegroups.com


On Wednesday, April 9, 2014 8:49:49 AM UTC-4, Chris Marisic wrote:


if you use Load, there will never, ever be any issues with availability of a document (unless you're doing replication and read from a node other than master and it doesn't have the doc yet).

Or unless you are using Transactions with AllowNonAuthoritativeInformation = true;