[RavenDB] LoadStartingWith performance consistent?


Ryan Heath

Nov 25, 2015, 12:52:37 PM
to rav...@googlegroups.com
I have two questions about LoadStartingWith. 

The default pageSize for LoadStartingWith is much smaller than the size for Query, 25 vs 1024. 
What is the rationale behind this low value? This could potentially result in many small requests when paging through them. 

The performance of LoadStartingWith is good enough overall, but occasionally we see timings from seconds up to minutes before it returns. Our database is not really busy at all and does not have many documents in it.
What could LoadStartingWith be waiting for before it can return an answer?

Concurrently, we ran tests that loaded the documents by id. Their response times were always consistent, even while long-running LoadStartingWith requests were outstanding.

We saw this behavior with 2.5.2952 and with 3.0.3800.

// Ryan

Kijana Woodard

Nov 25, 2015, 12:55:23 PM
to rav...@googlegroups.com
Side note: the default result size for Query is 128.



Michael Yarichuk

Nov 25, 2015, 3:13:13 PM
to RavenDB - 2nd generation document database
Off the top of my head: the occasional delay in LoadStartingWith returning might be because the database was unloaded due to inactivity. When the LoadStartingWith request then arrives, the database first has to be loaded from disk.
--
Best regards,
Michael Yarichuk
RavenDB Core Team
Tel: 972-4-6227811
Fax: 972-153-4-6227811
Email: michael....@hibernatingrhinos.com

RavenDB paving the way to "Data Made Simple" http://ravendb.net/

Ryan Heath

Nov 26, 2015, 1:22:44 AM
to rav...@googlegroups.com
Yeah, you are right, I mixed it up with the page size on the server.

// Ryan

Ryan Heath

Nov 26, 2015, 1:26:44 AM
to rav...@googlegroups.com
Well, we are calling the db every minute. That does not sound inactive :)
Also, the simple Load calls were back immediately, while the LoadStartingWith calls (made at the very same time) took (much) longer.

Is this reloading visible in Raven Studio?

// Ryan

Oren Eini (Ayende Rahien)

Nov 26, 2015, 2:24:23 AM
to ravendb
No, it isn't usually visible.
And once every minute should keep us alive.

Can you capture a Fiddler trace of these and post the SAZ file here?

Hibernating Rhinos Ltd  

Oren Eini l CEO Mobile: + 972-52-548-6969

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

 

Ryan Heath

Nov 26, 2015, 3:09:40 AM
to rav...@googlegroups.com
I cannot decrypt the HTTPS calls with Fiddler. Something is refusing Fiddler's certificate.
I'll try to get hold of the IIS logs; would those be an option to look at?

// Ryan

Oren Eini (Ayende Rahien)

Nov 26, 2015, 3:20:25 AM
to ravendb
Go to the Studio, then to Manage Your Server, then to Traffic Watch, and you can watch it there.

Ryan Heath

Nov 26, 2015, 4:26:12 AM
to rav...@googlegroups.com
I do not see those items: Manage Your Server, Traffic Watch.
We are on RavenHQ; could that be the reason why I do not see them?

// Ryan

Oren Eini (Ayende Rahien)

Nov 26, 2015, 4:43:41 AM
to ravendb
Yes, you wouldn't see it there.
You need to figure out the Fiddler issue then, I'm not sure what can be done otherwise.

Chris Marisic

Nov 30, 2015, 12:20:49 PM
to RavenDB - 2nd generation document database


On Wednesday, November 25, 2015 at 11:52:37 AM UTC-6, Ryan Heath wrote:

> The performance of LoadStartingWith is good enough overall, but occasionally we see timings from seconds up to minutes before it returns. Our database is not really busy at all and does not have many documents in it.
> What could LoadStartingWith be waiting for before it can return an answer?

Can you show some example searches you ran with it?

Ryan Heath

Nov 30, 2015, 6:06:25 PM
to rav...@googlegroups.com
Hi Chris,

We typically do two kinds of searches:

startsWith=tv_accounts%2F673%2F&matches=dayschedules%2F201510%2A%2F%2A&exclude=&start=0&pageSize=25
startsWith=tv_accounts%2F673%2F&matches=dayschedules%2F201510%2A%2F%2A&exclude=&start=25&pageSize=25
startsWith=tv_accounts%2F673%2F&matches=dayschedules%2F201510%2A%2F%2A&exclude=&start=50&pageSize=25
These are equivalent to tv_accounts/673/dayschedules/201510*/*, which searches for all schedules of October 2015 for account 673.

and

startsWith=tv_accounts%2F&matches=%2A%2Fdayschedules%2F20151125%2F%2A&exclude=&start=0&pageSize=25
startsWith=tv_accounts%2F&matches=%2A%2Fdayschedules%2F20151125%2F%2A&exclude=&start=4&pageSize=25
These are equivalent to tv_accounts/*/dayschedules/20151125/*, which searches for all schedules of 25 November 2015 for all accounts.
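
For reference, the query strings above can be reproduced from the prefix and the pattern. This is only a minimal Python sketch using the standard library; the helper name `starting_with_query` is made up for illustration, and it builds the query string without calling RavenDB.

```python
from urllib.parse import urlencode


def starting_with_query(prefix, matches="", exclude="", start=0, page_size=25):
    """Build the query string seen in the requests above (hypothetical helper)."""
    # urlencode percent-encodes "/" as %2F and "*" as %2A, as in the logged requests.
    return urlencode({
        "startsWith": prefix,
        "matches": matches,
        "exclude": exclude,
        "start": start,
        "pageSize": page_size,
    })


# First kind: one account, all day schedules of October 2015.
per_account = starting_with_query("tv_accounts/673/", "dayschedules/201510*/*")

# Second kind: all accounts, day schedules of 25 November 2015.
all_accounts = starting_with_query("tv_accounts/", "*/dayschedules/20151125/*")

print(per_account)
print(all_accounts)
```

The prefix narrows the key range and the pattern is applied to the remainder of each key, so the second kind has to consider every document under tv_accounts/.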

The second kind of search is expensive, as expected, and is not a problem for the app. Its timings range from double-digit to five-digit milliseconds (multiple seconds) for no obvious reason.
The first kind of search is fast and is visible to our users. Occasionally those times go up as well, for no obvious reason.

We are planning to upgrade to the 3.0 client, which has better support for paging through those searches.
Currently (v2.5) we need to perform at least two requests to know if there are more records available.

// Ryan

Federico Lois

Nov 30, 2015, 8:45:50 PM
to rav...@googlegroups.com
I see why the timing can be unpredictable. Suppose you have two accounts, A and B, where A has 30 schedules and B has 3000. To find the last hit for A you need to check 30 items; to get the last hit for B you have to check 3000. You get the idea. I don't know about 2.5, but if I remember correctly, because loading by id is an immediate (ACID) operation, Lucene cannot be used to serve the query, so you end up scanning the set again for each of the n pages.

So yes, the time you get is bound to be unpredictable.
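
To put rough numbers on that, here is a small Python simulation. It is a sketch under a stated assumption, namely that every request walks the prefix range from its beginning, applies the pattern, skips `start` hits and then collects one page. The function names and the key layout are made up, and `fnmatchcase` only stands in for the server-side wildcard matching; this is not RavenDB's actual implementation.

```python
from fnmatch import fnmatchcase


def scan_page(sorted_keys, prefix, matches, start, page_size):
    """Return (page, examined) for one simulated request."""
    page, examined, hits = [], 0, 0
    for key in sorted_keys:
        if not key.startswith(prefix):
            continue  # outside the prefix range, not counted as examined
        examined += 1
        if not fnmatchcase(key[len(prefix):], matches):
            continue  # examined, but rejected by the pattern
        hits += 1
        if hits > start:
            page.append(key)
            if len(page) == page_size:
                break
    return page, examined


def examined_for_all_pages(sorted_keys, prefix, matches, page_size=25):
    """Total keys examined while paging until a short page comes back."""
    start, total = 0, 0
    while True:
        page, examined = scan_page(sorted_keys, prefix, matches, start, page_size)
        total += examined
        if len(page) < page_size:
            return total
        start += page_size


# Hypothetical data: account A has 30 schedules, account B has 3000.
keys = sorted(
    [f"tv_accounts/A/dayschedules/20151125/{i:04d}" for i in range(30)]
    + [f"tv_accounts/B/dayschedules/20151125/{i:04d}" for i in range(3000)]
)

for account in ("A", "B"):
    total = examined_for_all_pages(
        keys, f"tv_accounts/{account}/", "dayschedules/20151125/*"
    )
    print(account, total)
```

Under this model, paging through A examines 55 keys in total while B examines 184,500, because each later page rescans everything before it. That would explain cost that grows with the page number, but not the same request being fast one time and slow the next.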


Ryan Heath

Dec 1, 2015, 1:06:40 AM
to rav...@googlegroups.com
But we see these different times with the same account. That's why I call it inconsistent. The schedules for a particular account were not changing, but sometimes its time went through the roof, so to speak.

Yes, it scans n times; that's why I want to upgrade to the 3.0 client, so we can at least skip the extra call that the 2.5 client needs.
But we also saw that retrieving, say, page 2 took a lot more time than pages 1 and 3, for no obvious reason. Those times are expected to be the same, no?

Is it possible that there are locks on the scanning of documents? Does it have to wait for something to complete before it can begin, continue or finish its operation?

// Ryan

Oren Eini (Ayende Rahien)

Dec 1, 2015, 2:36:13 AM
to ravendb
There are no locks in this process.

Why not just query with startsWith = tv_accounts/673/dayschedules/201510? That is going to be faster than doing a match.
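
A quick way to see the difference is to count how many keys each approach has to look at. The following Python sketch uses made-up keys and a binary search over a sorted list as a stand-in for an ordered key lookup; `prefix_range` is a hypothetical helper, and `fnmatchcase` only approximates the server-side pattern matching.

```python
from bisect import bisect_left
from fnmatch import fnmatchcase


def prefix_range(sorted_keys, prefix):
    """Slice of sorted_keys that starts with prefix, found by binary search."""
    lo = bisect_left(sorted_keys, prefix)
    hi = bisect_left(sorted_keys, prefix + "\uffff")  # just past the prefix range
    return sorted_keys[lo:hi]


# Hypothetical data: 12 months x 28 days x 3 schedules, plus two other documents.
keys = sorted(
    [
        f"tv_accounts/673/dayschedules/2015{month:02d}{day:02d}/{n}"
        for month in range(1, 13)
        for day in range(1, 29)
        for n in range(3)
    ]
    + ["tv_accounts/673/profile", "tv_accounts/674/profile"]
)

# Broad prefix plus pattern: every key of the account is examined, then filtered.
broad_prefix = "tv_accounts/673/"
broad = prefix_range(keys, broad_prefix)
filtered = [
    key for key in broad
    if fnmatchcase(key[len(broad_prefix):], "dayschedules/201510*/*")
]

# Narrow prefix only: the range itself is already the answer.
narrow = prefix_range(keys, "tv_accounts/673/dayschedules/201510")

print(len(broad), len(filtered), len(narrow))
```

For this data both approaches select the same 84 keys, but the broad prefix examines 1009 keys to find them. In general the narrow prefix could also return keys that the pattern would have rejected, so the two are only interchangeable when the key layout guarantees it.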


Ryan Heath

Dec 1, 2015, 3:15:39 AM
to rav...@googlegroups.com
Sure, the match is not needed (in the first case). I will try that and report back.

Thanks

// Ryan

Chris Marisic

Dec 1, 2015, 11:18:10 AM
to RavenDB - 2nd generation document database


On Tuesday, December 1, 2015 at 12:06:40 AM UTC-6, Ryan Heath wrote:

> Yes, it scans n times; that's why I want to upgrade to the 3.0 client, so we can at least skip the extra call that the 2.5 client needs.
> But we also saw that retrieving, say, page 2 took a lot more time than pages 1 and 3, for no obvious reason. Those times are expected to be the same, no?

I'm pretty sure the expectation is that each successive page takes slightly longer to access.

Page 3 taking noticeably less time than page 2 is odd. Are you sure that occurred for the exact same LoadStartingWith parameters?

Ryan Heath

Dec 2, 2015, 6:25:31 PM
to rav...@googlegroups.com
For the first case, we have stripped the match and we are now using RavenPagingInformation to determine whether we have seen all the pages.
We have also bumped the page size up from 25 to 100, since our data is likely to return more than 25 entries.
It is performing a whole lot better and more consistently now.
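
To illustrate the page size part, here is a minimal Python sketch of a paging loop. It does not model RavenPagingInformation; `fetch_page` is a hypothetical stand-in for the client call, treating a short page as the end signal is an assumption, and the figure of 60 documents is made up.

```python
def read_all(fetch_page, page_size):
    """Page until a short page is returned; returns (documents, requests)."""
    docs, requests, start = [], 0, 0
    while True:
        page = fetch_page(start, page_size)
        requests += 1
        docs.extend(page)
        if len(page) < page_size:
            return docs, requests
        start += page_size


# Hypothetical result set of 60 documents, served from a list.
data = [f"doc/{i}" for i in range(60)]


def fake_fetch(start, page_size):
    return data[start:start + page_size]


for size in (25, 100):
    docs, requests = read_all(fake_fetch, size)
    print(size, len(docs), requests)
```

With 60 documents this takes three requests at a page size of 25 and one request at 100. Whenever a page comes back exactly full, the loop needs one more request just to learn that nothing follows.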

Thanks!

// Ryan