WaitFor...AsOfLastWrite vs. WaitFor...AsOfNow

Herbi

Aug 14, 2013, 8:57:44 AM
to rav...@googlegroups.com
Hi,
I am running some tests to get familiar with RavenDB. I have two processes accessing the same collection of data.
The first process updates the data continuously (in a loop), while the second process reads/queries the same data (also in a loop).

 - a) When I use "WaitFor...AsOfNow" for the query in the second process, everything works as expected: RavenDB returns non-stale results for each read (while the first process continuously updates the data).
 - b) But when I use "WaitFor...AsOfLastWrite", the second process is blocked until the first process is done with its updates.
 - c) If I use neither, but instead set "DefaultQueryingConsistency = ConsistencyOptions.QueryYourWrites", RavenDB behaves as in a).

As both processes and RavenDB are running on the same machine, I expected "WaitFor...AsOfNow" and "WaitFor...AsOfLastWrite" to behave the same, because there are no clock sync issues.
Can somebody explain what is going on here?

Thanks,
Herbi

Oren Eini (Ayende Rahien)

Aug 14, 2013, 9:04:05 AM
to ravendb
WaitForLastWrite will wait until the last write _from this process_ will be indexed.
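
A toy model may help pin down what the two cutoffs mean as described here. This is not RavenDB client code: `ToyServer`, `ToyClient` and `must_wait` are hypothetical names, and numbering writes with plain integers (instead of etags or timestamps) is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyClient:
    """A client process that remembers the number of its own last write."""
    last_own_write: int = 0


@dataclass
class ToyServer:
    """A server whose index lags behind the documents it has stored."""
    last_write: int = 0       # newest write accepted from any client
    indexed_up_to: int = 0    # newest write the index has processed

    def write(self, client: ToyClient) -> None:
        self.last_write += 1
        client.last_own_write = self.last_write


def must_wait(server: ToyServer, cutoff: int) -> bool:
    """A query with this cutoff blocks while the index has not reached it."""
    return server.indexed_up_to < cutoff


server = ToyServer()
writer, reader = ToyClient(), ToyClient()
for _ in range(3):
    server.write(writer)                # three writes, none indexed yet

as_of_now_cutoff = server.last_write    # fixed when the query is issued
reader_cutoff = reader.last_own_write   # the reader never wrote anything
writer_cutoff = writer.last_own_write

print(must_wait(server, as_of_now_cutoff))  # True
print(must_wait(server, reader_cutoff))     # False
print(must_wait(server, writer_cutoff))     # True
```

Under this reading, a process that only reads has no write of its own to wait for, so an "as of last write" query from it would return immediately.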


Herbi

Aug 14, 2013, 9:44:00 AM
to rav...@googlegroups.com

"WaitForLastWrite will wait until the last write _from this process_ will be indexed."

So that means the second process, which only reads the data, should only wait until its _own_ writes are indexed, and therefore shouldn't be influenced by the first process.
But that's not the case: the _second_ process waits until the updates of the _first_ process are indexed (using WaitForLastWrite).
That's what I don't understand.

Chris Marisic

Aug 14, 2013, 9:52:18 AM
to rav...@googlegroups.com
Simply put, you really shouldn't be using either of these. These features are borderline poisonous and act as crutches for model design faults. There are very few legitimate usage scenarios for them.

Mauro Servienti

Aug 14, 2013, 10:06:31 AM
to rav...@googlegroups.com

I totally agree with Chris on this topic, and not only this one :-)

The only thing I'd like to add to the conversation, besides the fact that it should work as expected, is that the really difficult part is the mindset switch. For years we have been used to a consistent world that does not mimic the real world at all. In real life we are used to staleness, and I'll add that we can live perfectly well without transactions :-) we compensate every single moment of the day.

I tried to explain to my mug that the act of drinking is transactional, so that I wouldn't have to change my t-shirt when something goes wrong… :-P

.m

Herbi

Aug 14, 2013, 10:27:07 AM
to rav...@googlegroups.com
Come on guys, that's exactly what I'm doing: exploring those borderline scenarios.
I just wanted to know what the difference between those three configurations is (WaitForLastWrite, WaitForNow, QueryYourWrites).
And the answer is "don't use it, because you don't need it"?

Chris Marisic

Aug 14, 2013, 1:08:19 PM
to rav...@googlegroups.com
Yes, don't use it.

If you're going to ignore us, use WaitForLastWrite.

Marco

Aug 14, 2013, 4:52:58 PM
to rav...@googlegroups.com
But suppose you have a list of records based on an index: the user edits one record, saves it, and refreshes the grid/list. Isn't this the situation where you must use WaitForLastWrite?

Chris Marisic

Aug 14, 2013, 5:12:56 PM
to rav...@googlegroups.com
It is, but not when reading the list. You want to write the record, then in the same request query that index with WaitForLastWrite, and only redirect the user to the list after the index has caught up. This keeps the common operation, reads, from always waiting, and makes the uncommon operation, writes, do the waiting.
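
The write-then-wait-then-redirect pattern can be sketched with a small in-memory model. Everything here is hypothetical (`ToyStore`, `handle_edit`, `handle_list`, the `redirect:/list` string); it is not the RavenDB API, and in this sketch the waiting call drives the index forward itself, whereas a real index is updated in the background.

```python
from collections import deque


class ToyStore:
    """In-memory stand-in for a document store with a lagging index."""

    def __init__(self):
        self.docs = {}             # doc_id -> current value
        self.index = {}            # doc_id -> value as last indexed
        self._pending = deque()    # writes the index has not processed yet
        self._seq = 0              # number of the newest write
        self.indexed_up_to = 0     # newest write the index has processed

    def save(self, doc_id, value):
        """Store a document and return the number of this write."""
        self._seq += 1
        self.docs[doc_id] = value
        self._pending.append((self._seq, doc_id, value))
        return self._seq

    def index_step(self):
        """Index one pending write, if there is one."""
        if self._pending:
            seq, doc_id, value = self._pending.popleft()
            self.index[doc_id] = value
            self.indexed_up_to = seq

    def wait_until_indexed(self, seq):
        """Block until the index has processed the write numbered seq."""
        while self.indexed_up_to < seq:
            self.index_step()

    def query_list(self):
        """Query the index without waiting; the result may be stale."""
        return dict(self.index)


def handle_edit(store, doc_id, value):
    """Write request: save, wait for our own write, then redirect."""
    seq = store.save(doc_id, value)
    store.wait_until_indexed(seq)
    return "redirect:/list"


def handle_list(store):
    """Read request: never waits."""
    return store.query_list()


store = ToyStore()
store.save("orders/1", "draft")
stale_view = handle_list(store)      # index has not caught up yet
target = handle_edit(store, "orders/1", "paid")
fresh_view = handle_list(store)      # reflects the edit without waiting
print(stale_view, target, fresh_view)
```

Only `handle_edit` pays for the wait; `handle_list` stays as cheap as an ordinary query.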

devm...@hotmail.com

Aug 15, 2013, 2:09:31 AM
to rav...@googlegroups.com
Chris, this is eye-opening for me. Could you please confirm my understanding?

What you mean is: edit the record, call SaveChanges(), then immediately query using WaitForLastWrite, and then redirect the user to the list page?

Some things I am wondering about:
- What exactly do you mean by making the uncommon operations, writes, wait?
- Is this a normal solution, or just for extreme situations?
- The thing I still don't understand: when you add or update a record, does the whole index recompute, or just the modified records? And if it is just the modified records, why are people so worried? I can't imagine it taking more than seconds to update the index.


thanks for your input

Mircea Chirea

Aug 15, 2013, 4:04:55 AM
to rav...@googlegroups.com
Yes, you wait for the index to process your changes on the write pages. That way your read pages are guaranteed to have the most up-to-date information.

  1. Reads usually happen much more often than writes, so it makes sense to make reads as fast as possible and move the expensive code into write operations.
  2. Yes, it is perfectly normal. It just depends on which operations can be made slower in your case: reads or writes. Heck, with SQL you'd nearly ALWAYS move the burden onto writes, as all queries return non-stale data unless you explicitly use NOLOCK or similar options.
  3. When a document is added, changed or removed, all indexes are notified and marked stale. Each index, in parallel, decides whether it needs to handle the document (based on metadata or whatever). When an index is done, regardless of whether it had to process the document or not, it is marked as non-stale.

Note that the stale status is per index, so under constant writes an index might always appear stale, even though data written a few seconds ago has probably been indexed; only the data written just now will be unavailable. Since indexing is done in parallel, you can get much better read performance if you do not care about stale or missing data for (maybe) a second or two after a write. In fact, this is how most of the world operates, as you can almost never have data accurate to the millisecond :)

If you really want to, you can always wait, on the read or on the write, for the data you need to be indexed and ready to go. An example:
  • The user wants to create a new account.
  • You query for all usernames, checking if the one the user wants is taken. In doing so you use WaitForNonStaleResultsAsOfNow.
  • You proceed creating the user account and saving it to the database.
  • Option 1: wait for the index to process the newly created account.
  • Log the user in and redirect them.
  • Option 2: when the page loads query for the user's account by username, using WaitForNonStaleResultsAsOfNow.
In this case it would be really bad to redirect the user only to give him an error on the next page that the user account cannot be found, because it hasn't been indexed yet. Thus we need to wait for the index to be finished, but only for the account we just created, we don't care about future accounts. We can wait when trying to log in (option 2), but we'd add an unnecessary wait for all log in operations, as we only need to wait the first time after the account is created; thus we wait after creating the account (option 1). The user probably won't mind an extra second of waiting when doing a one-time-only operation, like creating an account, but will almost certainly mind waiting before being able to log in :)

Note that we are not using WaitForNonStaleResultsAsOfLastWrite. Unless you know what you are doing, I do not recommend using it in web apps. The reason is that web apps are commonly load balanced across multiple processes, and the load balancer might not route the request after a redirect to the same process. Since AsOfLastWrite only applies to writes from this process, it would be useless and downright dangerous in this scenario.

If we were to use the simple WaitForNonStaleResults we could wait indefinitely because if writes were to happen continuously the index would always be marked as stale. This is only intended to be used while debugging, during unit tests, or similar tasks, NEVER IN PRODUCTION.
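
A small simulation illustrates why a moving target never gets reached while a fixed cutoff does. It is only a sketch under simplifying assumptions: the function name and parameters are made up, and writes and indexing are assumed to proceed in lockstep, one of each per round.

```python
def rounds_until_wait_ends(max_rounds, backlog=5, cutoff=None):
    """Return the round in which the wait ends, or None if it never does.

    Every round one new write arrives and the index processes one write.
    cutoff=None models "wait until the index is not stale at all";
    an integer cutoff models "wait until write number cutoff is indexed".
    """
    last_write = backlog   # writes stored before the query was issued
    indexed_up_to = 0      # writes the index has processed so far
    for round_no in range(1, max_rounds + 1):
        last_write += 1
        indexed_up_to += 1
        target = last_write if cutoff is None else cutoff
        if indexed_up_to >= target:
            return round_no
    return None


print(rounds_until_wait_ends(1000))            # None
print(rounds_until_wait_ends(1000, cutoff=5))  # 5
```

With continuous writes the plain wait chases a target that moves as fast as the index does, while the fixed cutoff is reached as soon as the backlog that existed at query time has been indexed.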


P.S.: There are better ways to accomplish the above (use the username as document ID, such as accounts/john.smith, which guarantees uniqueness and non-stale data for load/store), but this was the simplest way to demonstrate my point.
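
The idea in this P.S. can be sketched with a toy key-value store. The names `ToyKeyValueStore`, `DuplicateKeyError` and `register` are hypothetical, and how a real document database rejects a duplicate ID depends on its concurrency settings, which this sketch does not model.

```python
class DuplicateKeyError(Exception):
    """Raised when a document with the same ID already exists."""


class ToyKeyValueStore:
    """Documents are looked up by ID, so no index (and no staleness) is involved."""

    def __init__(self):
        self._docs = {}

    def store_new(self, doc_id, doc):
        if doc_id in self._docs:
            raise DuplicateKeyError(doc_id)
        self._docs[doc_id] = doc

    def load(self, doc_id):
        return self._docs.get(doc_id)


def register(store, username):
    """Create the account under an ID derived from the username."""
    doc_id = "accounts/" + username
    try:
        store.store_new(doc_id, {"username": username})
    except DuplicateKeyError:
        return False   # the username is already taken
    return True


store = ToyKeyValueStore()
first = register(store, "john.smith")     # True
second = register(store, "john.smith")    # False
account = store.load("accounts/john.smith")
print(first, second, account)
```

The uniqueness check and the later lookup both go through the document ID, so neither has to wait for an index.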

Mauro Servienti

Aug 15, 2013, 4:05:27 AM
to rav...@googlegroups.com

:-) The reality is that I do not know what they mean; I only use WaitFor* in tests, to block the test engine until the data is where the test expects to find it :-)

Mauro Servienti

Aug 15, 2013, 5:06:05 AM
to rav...@googlegroups.com

This is only part of the problem, and it cannot be applied in most situations where you really need to scale out. If you are living in an eventually consistent world you need to fully embrace that world, period, and live with it :-)

Simplifying things for the sake of the topic: because of the consistency behavior of relational databases, we have been used to this flow:

1. Handle the user request;
2. Connect to the db;
3. Do something interesting with the data;
4. Read the result from the db;
5. Since by default transactions are “serialized”, we are sure that the data we read is consistent with what we wrote.

The above is the typical single-threaded situation where the caller is blocked waiting for the operation to complete. If you introduce another thread at step 3, you are in a sort of eventually consistent world, where the caller is freed before the operation ends, so a caller that tries to read the data can find something unexpected :-)

For the sake of the example, let us say that the indexing behavior of RavenDB is like a thread at step 3. If you need to read the data you have just written, you need a way to wait, and the various WaitFor options, as in Chris's example, are there for this. I do not see this as a solution (see below), but as a compromise if the situation allows it. So:

1. Handle the user request;
2. Connect to the db;
3. Do something interesting with the data;
4. Wait for the index to have your own data indexed;
5. Read the result from the index;
6. Return to the user.

Now the question is: what happens if we introduce the “consistency” problem, or a sort of one, earlier? For example at step 1? I'm working on an application where the flow can be summarized like this:

1. A SPA (single-page HTML application) issues a request to the WebAPI backend;
2. WebAPI does nothing except transform the incoming request into something that can be sent over a bus (Azure Service Bus);
3. Return to the user;
4. Another process, a worker role, handles the request:
   a. Connects to the db;
   b. Does something interesting with the data;
   c. Sends an event on the bus again to notify that something interesting has completed;
5. The WebAPI backend waits for the event and transforms it into something deliverable via SignalR to the originating SPA;
6. The SPA can now read its own written data.

In the above situation having an eventually consistent db or not is the same, it does not change the problem at all, because the caller is freed at step 2 where nothing has happened yet.
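
The flow above can be mimicked in a few lines to show where the caller is released. This is a single-threaded sketch with made-up names: plain in-memory queues stand in for Azure Service Bus, a list stands in for SignalR notifications, and a dict stands in for the database.

```python
from collections import deque

command_bus = deque()   # stands in for the bus carrying requests
event_bus = deque()     # stands in for the bus carrying events
db = {}                 # stands in for the database
pushed_to_spa = []      # stands in for notifications pushed to the browser


def web_api_handle(request):
    """Steps 2-3: enqueue the request and return before any work is done."""
    command_bus.append(request)
    return "accepted"


def worker_run_once():
    """Step 4: do the work, then announce that it has completed."""
    command = command_bus.popleft()
    db[command["id"]] = command["value"]
    event_bus.append({"type": "completed", "id": command["id"]})


def web_api_forward_events():
    """Step 5: forward completion events to the originating client."""
    while event_bus:
        pushed_to_spa.append(event_bus.popleft())


status = web_api_handle({"id": "orders/1", "value": "paid"})
visible_before = "orders/1" in db     # the caller is free, nothing stored yet
worker_run_once()
web_api_forward_events()
visible_after = db.get("orders/1")    # step 6: the data can now be read
print(status, visible_before, visible_after, pushed_to_spa)
```

The caller gets its answer while the database is still untouched, so it has to rely on the completion event whether or not the database itself is eventually consistent.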

 

My experience is starting to tell me that embracing eventual consistency leads you to embrace event sourcing, and that embracing event sourcing brilliantly solves the whole problem. Exactly as in real life, there are situations where you need to wait for something (block the UI and wait for the event), situations where you are only interested in knowing that something has happened (simply wait for the event without blocking the UI), and even situations where you are not interested in the event at all (fire and forget).

When you are in line at the bank, waiting for your operation to complete, you are not lying on the cashier's knees continuously asking him “have you finished?”. You are simply waiting for him to tell you (event) the result of the operation, but in the meantime you can do something else; you are blocking the line behind you.

And last: embrace a world with no transactions but with compensations, everything is much easier :-)

.m

Kijana Woodard

Aug 15, 2013, 9:07:31 AM
to rav...@googlegroups.com

+1 to changing your POV. Most of these problems disappear if you 'bend the spoon'. As Mircea pointed out in his postscript about unique user names, just have docs that use the names as keys and store them in a transaction with your main doc(s). No waiting. Most situations can be repositioned to simply avoid the decisions you're facing.

Believe me, it's not easy when you're starting out, because we're so used to relational dbs. In that world, you start with 3NF or whatever and then figure out how the heck to get what you need out. In document modeling, you figure out what you need and build that. In practice, that means that all the pain and frustration you normally feel at the end of a relational db project is experienced up front in a document db project. IMO, this is a good thing, but it definitely challenges our 'training'.

devm...@hotmail.com

Aug 15, 2013, 12:44:55 PM
to rav...@googlegroups.com
awesome explanation man, thanks a lot really :)