All indexes going stale and staying stale

Kenneth Truyers

Oct 2, 2014, 10:55:12 AM
to rav...@googlegroups.com
After I added two indexes (which worked correctly on a local database) all indexes are going stale and they are staying stale.

I'm not sure how to troubleshoot this. This is what I've done and what happened:

  • Added the new indexes => showed 2 indexes stale out of 32 (normal)
  • After a few minutes => 32 indexes stale (no new documents were added in the meantime)
  • After half an hour we restarted IIS and it started again with 2 stale indexes. A few minutes later, all 32 indexes were stale again, and now, several hours later, they are still stale
Trying this on the local db succeeded without a problem. The only difference is the number of documents in the database (roughly 100.000 versus 1.000.000). These indexes could be quite slow due to a heavy map/reduce.

Any idea on how to troubleshoot this problem, where to look, ...?



Vlad Kosarev

Oct 2, 2014, 11:48:11 AM
to rav...@googlegroups.com
Check your admin endpoint - http://dbserver/databases/dbname/stats and see what's happening in there to prefetches and indexing. That should give you an idea.
For us, slowness is due to extremely slow prefetches. Even though an index might only touch 1 document, prefetch has to load every document in the database (in your case 1 mil), and that can take a while. Once the documents are loaded, indexing is extremely fast, but that's irrelevant to the total time your db is 'down'.
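
A minimal sketch of how the stats output could be summarized once it has been fetched and parsed as JSON. The field names `StaleIndexes`, `Indexes`, `Name` and `LastIndexedTimestamp` are assumptions about the shape of the stats document and should be checked against what your server actually returns. The index names and timestamps in the sample are made up.

```python
import json


def summarize_stats(stats):
    """Return (stale_names, last_indexed_by_name) from a parsed stats document.

    Assumed (unverified) keys: 'StaleIndexes' is a list of index names and
    'Indexes' is a list of objects with 'Name' and 'LastIndexedTimestamp'.
    """
    stale = sorted(stats.get("StaleIndexes", []))
    last_indexed = {
        idx["Name"]: idx.get("LastIndexedTimestamp")
        for idx in stats.get("Indexes", [])
    }
    return stale, last_indexed


# Made-up sample standing in for the JSON returned by the stats endpoint.
sample = json.loads("""
{
  "StaleIndexes": ["Performers/ByName"],
  "Indexes": [
    {"Name": "Performers/ByName", "LastIndexedTimestamp": "2014-10-02T08:00:00"},
    {"Name": "Venues/ByCity", "LastIndexedTimestamp": "2014-10-02T10:00:00"}
  ]
}
""")

stale, last_indexed = summarize_stats(sample)
for name in stale:
    print(name, "last indexed at", last_indexed.get(name))
```

Listing only the stale indexes next to their last-indexed time makes it easier to see whether an index is merely behind or has stopped moving altogether.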

Kenneth Truyers

Oct 2, 2014, 12:08:49 PM
to rav...@googlegroups.com
Thanks.

I've been looking into this, but I'm not sure what to gather from this data.

Looking at the prefetches, there are about 15 items, each one of them lasting anywhere from a few ms to 15 seconds. The difference between the first and last entry is about a minute. The last prefetch was about 4 hours ago, so I don't think that's the issue. 

Is there anything specific on the indexes I should have a look at? The two added indexes have a LastIndexingTime of 2 hours ago.

Vlad Kosarev

Oct 2, 2014, 1:04:42 PM
to rav...@googlegroups.com
ok, do this - http://ravendb.net/docs/2.0/server/deployment/docs-debug-logging
and see what you can find in there, might give you exact error

Oren Eini (Ayende Rahien)

Oct 2, 2014, 1:13:27 PM
to ravendb
It would really help if you posted the stats here, so we can look at them.

Kenneth Truyers

Oct 2, 2014, 6:14:58 PM
to rav...@googlegroups.com
Hi, 

I have pasted the output from the stats here: http://pastebin.com/4n3sHmWh (will expire on 10/10/2014)

I'll have a look whether I can enable the debugging, once I have it, I'll also post it here. 

thanks.

Oren Eini (Ayende Rahien)

Oct 2, 2014, 11:38:54 PM
to ravendb
This shows a single stale index.

Shayne van Asperen

Oct 20, 2014, 12:27:56 PM
to rav...@googlegroups.com
Kenneth: Did you find a resolution for this? Oren: Why was the studio showing that all 32 indexes were stale when the debug output only shows 1 index stale?

We are having the same problem again now. Deployed 1 changed index and 2 new indexes, and immediately the studio reported 3 stale indexes (as expected). However, after about an hour or so I see that it shows all 31 indexes are stale.

Stats: http://pastebin.com/7Q0suGEx

Kenneth Truyers

Oct 20, 2014, 12:35:24 PM
to rav...@googlegroups.com
Hi,

No, I didn't find a resolution for this. It's possible that when I took the stats, the indexes just went back to unstale, so that would explain the stats.

However, the issue remains: whenever you add an index, it makes all the other ones stale for a very long time.

Just to be clear: Shayne and I are working on the same project (same database).

Oren Eini (Ayende Rahien)

Oct 21, 2014, 10:07:50 AM
to ravendb
In 2.5, we are biased toward handling new indexes. That means that they get more time to run than the other indexes.
In 3.0, we have a different behavior. 

Adding a new index on a large db in 2.5 can cause the existing indexes to take longer to process.

Kenneth Truyers

Oct 21, 2014, 12:21:21 PM
to rav...@googlegroups.com
Hi Oren,

I understand the behavior might have changed in 3.0, but at the moment we have found several issues with 3.0 that prevent us from upgrading, so we have to continue with the current version we have installed for the moment.

I just did a test and after importing some data, 27 indexes went stale and it took about 7 hours before they became unstale again.

I hope this is not expected behavior. If it isn't, I'm not sure what to look at on our side, though. The indexes cover a lot of documents (our main DB is about 1.000.000 documents), but they are not extremely complex, nor do they produce cartesian products or flatten structures (our biggest index has fewer than 1.000.000 documents). Could you tell me exactly what information you need?
You mentioned a few times the log file, but could you indicate which log file, where to get it (or where the documentation to get it is) and at what point (before the change, when the indexes are stale, after they become unstale, ...)?
Keep in mind that this process takes 7 hours to complete, so I can't just redo it every time a new piece of information is needed.

I hope you do understand that this seriously limits our ability to use RavenDB. It's not a case of support, SLA, ... It's a case of not having the proper documentation to know how to troubleshoot, not having proper design guidelines anywhere and having this forum as the only source of community support. I understand this is a young product, but if we are not able to build a moderate size DB with a reasonable response time on the indexes, we will have to conclude that Raven is not adequate for our needs (and this should be documented as well). So, if this is expected behavior, I hope you can be direct about it, so we can make an informed decision whether to continue with RavenDB or not (along with the other issues we posted on the forum here).

Chris Marisic

Oct 21, 2014, 1:02:35 PM
to rav...@googlegroups.com
You have to keep in mind that indexes being stale doesn't necessarily mean anything, unless the staleness was actually halting your application (an issue I've seen in the past).

Why does it matter if the indexes are stale? Did you actually test how stale the other indexes were? As in, did you add a new FooItem and measure the time elapsed until FooIndex returns that FooItem?

You never mentioned your hosting setup. What types of hard drive(s)? Is it cloud or physical? 1M documents is not many documents; how large are the documents in KB/MB on average? How many gigabytes is your total Raven server? Are you sharing resources with multiple databases? Do you share disk IO with other disk-intensive applications (such as a SQL database)?

Federico Lois

Oct 21, 2014, 2:03:28 PM
to rav...@googlegroups.com
I would add: are you running on Azure?

Kenneth Truyers

Oct 21, 2014, 7:05:40 PM
to rav...@googlegroups.com
Hi Chris, Federico,

Thanks for your reply.

The staleness of the indexes actually does matter, I'll explain briefly what the application does so you can get an idea.

We aggregate data from different sources and deduplicate them to create a type of super database. When data comes in from a source, we need to query the database to check for duplicates and then merge them (according to a set of rules and preferences from one data source to another). That means that as soon as we import items from a certain data source, we need to wait until the index we're querying at that moment becomes unstale before doing the next import (otherwise we might miss some duplicates). For the consumer of the database (the read side) it doesn't really matter that the indexes are stale, but for the write side it does. So we don't test the staleness of the indexes; we just wait for nonstale results, which can take a very long time.
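
The wait-before-next-import step can be sketched generically as a bounded polling loop. This is only an illustration of the pattern, not the RavenDB client API: `is_stale` is a hypothetical callback (for example, one that checks whether the index appears as stale in the stats output), and the timeout and poll interval are arbitrary.

```python
import time


def wait_until_fresh(is_stale, timeout_s, poll_interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_stale() until it returns False or timeout_s has elapsed.

    Returns True if the index became non-stale in time, False on timeout.
    is_stale is a hypothetical callback supplied by the caller.
    """
    deadline = clock() + timeout_s
    while is_stale():
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
    return True


# Fake staleness check: stale on the first two polls, fresh on the third.
answers = iter([True, True, False])
fresh = wait_until_fresh(lambda: next(answers), timeout_s=5, poll_interval_s=0)
print("safe to import" if fresh else "still stale, postponing import")
```

Putting a deadline on the wait lets the importer log and postpone a batch instead of blocking for hours when indexing falls behind.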

As for the environment, we're currently running on Azure on a 4-core 7GB RAM machine. However, the test I did before was on a local workstation (4-core 3GHz i7, 16GB RAM and an SSD-drive).
The average document size is anywhere between 5 and 30KB and the total database size is about 2.5GB. We're not sharing any resources. 

Oren Eini (Ayende Rahien)

Oct 22, 2014, 5:09:37 AM
to ravendb
What is the size of the database you have? (GB, and # of docs)
What license do you use? Do you use suggestions? 
Are you doing constant writes?

Here is the documentation link for enabling the debug log: http://ravendb.net/docs/2.5/server/deployment/docs-debug-logging

The most important place to look for information about indexes is the /stats endpoint.




Oren Eini (Ayende Rahien)

Oct 22, 2014, 5:11:58 AM
to ravendb
Note that Azure I/O is notoriously slow.

As for your issue, if you need to do dedup, indexes aren't the way to do that.
Use document ids, instead.
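
One way to read this suggestion, as a sketch only: derive the document id from a normalized form of the natural key, so that two records for the same entity map to the same id and no index query is needed to detect the collision. The `artists/` prefix, the function name and the normalization rules below are all hypothetical. This only catches duplicates that become identical after normalization, not fuzzy matches.

```python
import re
import unicodedata


def artist_key(name):
    """Build a deterministic id from a name (hypothetical normalization rules)."""
    # Strip accents, then lowercase and split into alphanumeric tokens.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    tokens = re.findall(r"[a-z0-9]+", ascii_name.lower())
    # Sorting the tokens makes word order irrelevant.
    return "artists/" + "-".join(sorted(tokens))


key_a = artist_key("Johny Cash")
key_b = artist_key("Cash, Johny")
print(key_a, key_b)
```

Because both spellings produce the same key, the second import would hit the existing document by id instead of relying on an index being up to date.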

Kenneth Truyers

Oct 22, 2014, 5:31:10 AM
to rav...@googlegroups.com
Hi Oren,

As I said in my previous answer: we have about 1.000.000 documents with a DB size of 2.5GB.
On Azure we use a commercial license, locally we use a dev-license.

The writes are not constant, but in batches. For deduplication, we can't use IDs since we have more complex matching logic (e.g. fuzzy matches on the name, or location within a certain radius, ...)

As for Azure I/O being slow, I also tested this on a local machine (as mentioned in the previous post, a 4-core 3GHz i7 with 16GB ram and an SSD-drive) and the result is the same: 7 hours before the indexes go nonstale.

Oren Eini (Ayende Rahien)

Oct 22, 2014, 5:34:57 AM
to ravendb
7 hours doing what? Just indexing? Also processing writes?

Remember that stale / non stale isn't something that we care about. We care about latencies.
Also, remember that with soft dedup like that it is very easy to run into issues.
You have two transactions that try to save the same thing; both check first that it isn't there, and then both save it.



Kenneth Truyers

Oct 22, 2014, 5:47:56 AM
to rav...@googlegroups.com
In those 7 hours no writes were made (we don't do any writes as long as the indexes are stale). This is the main issue: as long as the index we use to match new data against existing data is stale, we can't do any new imports.

For the soft dedup, what other solution would you suggest? To give you a more concrete example:

We have the following data from two different data sources:

Source 1
-------------
Artist:
{
    Name: 'Johny Cash',
    Website: ''
}

Source 2
-------------
Artist:
{
    Name: 'Cash, Johny',
    Website: 'johnycash.com'
}


We save the first one to the database. Now when we import the second one, we query the database on an index like this:
    from a in artists
    select new { a.Name }

Using the Lucene capabilities, we can now fuzzy match their names. When we find a match, instead of inserting the new performer, we update the current performer and fill in the website so that we have the following document in the database:

{
    Name: 'Johny Cash',
    Website: 'johnycash.com'
}

I don't see how we could write to the database while relying on stale indexes. If we allowed this particular index to be stale when we write, it's possible that the first document is not yet indexed, so we would create a duplicate in the database.
How would you recommend solving this?

On the other hand, we knew that we'd have to wait for stale indexes, but we never expected it to take that long. Imports are done maybe 5 to 10 times a day, so it's OK if it takes even up to an hour; 7 hours just seems excessive.

Chris Marisic

Oct 22, 2014, 1:33:56 PM
to rav...@googlegroups.com
Something seems very wonky: for a database as small as yours, this should take minutes, not hours.

Chris Marisic

Oct 22, 2014, 1:34:46 PM
to rav...@googlegroups.com
Are you using LoadDocument extensively in your indexes?

Oren Eini (Ayende Rahien)

Oct 22, 2014, 1:34:48 PM
to ravendb

Please post your index definitions

Kenneth Truyers

Oct 23, 2014, 9:31:45 AM
to rav...@googlegroups.com
We're only using LoadDocument in one or two indexes. I have attached the index definitions.

FYI: this is the approximate document count for each collection:

CommandLogs: 4901
Events: 383.266
Merges: 869
Performers: 945.088
Splits: 6
Venues: 20.003
IndexDefinitions.zip

Chris Marisic

Oct 23, 2014, 10:28:24 AM
to rav...@googlegroups.com
You seem to have way too many indexes. In general you should have exactly 1 map index per collection type.

Kenneth Truyers

Oct 23, 2014, 10:57:50 AM
to rav...@googlegroups.com
We had about 30 indexes before, and we refactored things to have fewer indexes. I think there are probably two or three more indexes we could merge, but after that I can't really see what more can be done, because:
  • The indexes have different analyzers
  • They have different reduce components
  • Some of them use LoadDocument
  • Some of them have MultiMaps
Could you have a look at some of the indexes and see what other merge possibilities you see?
Either way, I don't think 22 indexes on 1.000.000 documents should be such a bottleneck for the system. I still feel like something else is wrong.

Thank you very much for your help!

Oren Eini (Ayende Rahien)

Oct 26, 2014, 4:19:42 AM
to ravendb
Can you show the stats output from when it is indexing? 
Preferably 3 - 5 times, taken 3 - 5 minutes apart?
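
A rough sketch of how such snapshots could be compared once they have been saved and parsed as JSON, to see which indexes are actually advancing between samples. The keys `Indexes`, `Name` and `LastIndexedEtag` are assumptions about the stats format, and the snapshots below are made up.

```python
def indexing_progress(earlier, later, marker="LastIndexedEtag"):
    """Map index name -> True if its marker changed between two snapshots.

    The key names are assumptions about the stats document; adjust them to
    whatever your server actually returns.
    """
    before = {i["Name"]: i.get(marker) for i in earlier.get("Indexes", [])}
    after = {i["Name"]: i.get(marker) for i in later.get("Indexes", [])}
    return {name: value != before.get(name) for name, value in after.items()}


# Made-up snapshots taken a few minutes apart.
snap_1 = {"Indexes": [{"Name": "A", "LastIndexedEtag": "e1"},
                      {"Name": "B", "LastIndexedEtag": "e5"}]}
snap_2 = {"Indexes": [{"Name": "A", "LastIndexedEtag": "e3"},
                      {"Name": "B", "LastIndexedEtag": "e5"}]}

progress = indexing_progress(snap_1, snap_2)
for name, moved in sorted(progress.items()):
    print(name, "advancing" if moved else "no change")
```

An index that is stale and shows no change across several samples is a stronger hint of a stuck or starved index than a single stale flag.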