All indexes going stale and staying stale

Kenneth Truyers

Oct 2, 2014, 10:55:12 AM

to rav...@googlegroups.com

After I added two indexes (which worked correctly on a local database) all indexes are going stale and they are staying stale.

I'm not sure how to troubleshoot this. This is what I've done and what happened:

Added the new indexes => showed 2 indexes stale out of 32 (normal)
After a few minutes => 32 indexes stale (no new documents were added in the meantime)
After half an hour we restarted IIS, and it started again with 2 stale indexes. A few minutes later, all 32 indexes were stale again, and now, several hours later, they are still stale.

Trying this on the local db succeeded without a problem. The only difference is the number of documents in the database (roughly 100.000 versus 1.000.000). These indexes could be quite slow due to a heavy map/reduce.

Any idea on how to troubleshoot this problem, where to look, ...?

Vlad Kosarev

Oct 2, 2014, 11:48:11 AM

to rav...@googlegroups.com

Check your admin endpoint - http://dbserver/databases/dbname/stats and see what's happening in there to prefetches and indexing. That should give you an idea.

For us, the slowness is due to extremely slow prefetches. Even though an index might only touch 1 document, the prefetch has to load every document in the database (in your case 1 mil), and that can take a while. Once the documents are loaded, indexing is extremely fast, but that's irrelevant to the total time your db is 'down'.
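As a rough illustration of reading the stats document mentioned above, here is a minimal Python sketch that downloads it and reports which indexes are stale. The URL shape comes from the post; the field names `StaleIndexes`, `Indexes` and `Name` are assumptions about the JSON payload and may differ between RavenDB versions, so check them against your own output.

```python
import json
from urllib.request import urlopen


def fetch_stats(base_url, database):
    """Download the stats document for one database (URL shape as in the post above)."""
    with urlopen(f"{base_url}/databases/{database}/stats") as response:
        return json.load(response)


def summarize_stats(stats):
    """Return (stale_count, total_count, stale_names) from a parsed stats payload.

    Assumption: the payload carries 'StaleIndexes' (a list of index names)
    and 'Indexes' (a list of per-index objects). Adjust if yours differs.
    """
    stale = sorted(stats.get("StaleIndexes", []))
    total = len(stats.get("Indexes", []))
    return len(stale), total, stale
```

For example, `summarize_stats(fetch_stats("http://dbserver", "dbname"))` would give a one-line view of how many indexes are stale at that moment.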

Kenneth Truyers

Oct 2, 2014, 12:08:49 PM

to rav...@googlegroups.com

Thanks.

I've been looking into this, but I'm not sure what to gather from this data.

Looking at the prefetches, there are about 15 items, each one of them lasting anywhere from a few ms to 15 seconds. The difference between the first and last entry is about a minute. The last prefetch was about 4 hours ago, so I don't think that's the issue.

Is there anything specific on the indexes I should have a look at? The two added indexes have a LastIndexingTime of 2 hours ago.

Vlad Kosarev

Oct 2, 2014, 1:04:42 PM

to rav...@googlegroups.com

OK, do this: http://ravendb.net/docs/2.0/server/deployment/docs-debug-logging

and see what you can find in there; it might give you the exact error.

Oren Eini (Ayende Rahien)

Oct 2, 2014, 1:13:27 PM

to ravendb

It would really help if you posted the stats here, so we can look at them.

Kenneth Truyers

Oct 2, 2014, 6:14:58 PM

to rav...@googlegroups.com

Hi,

I have pasted the output from the stats here: http://pastebin.com/4n3sHmWh (will expire on 10/10/2014)

I'll see whether I can enable debug logging; once I have the log, I'll post it here as well.

thanks.

Oren Eini (Ayende Rahien)

Oct 2, 2014, 11:38:54 PM

to ravendb

This shows a single stale index.

Shayne van Asperen

Oct 20, 2014, 12:27:56 PM

to rav...@googlegroups.com

Kenneth: Did you find a resolution for this? Oren: Why was the studio showing that all 32 indexes were stale when the debug output only shows 1 index stale?

We are having the same problem again now. Deployed 1 changed index and 2 new indexes, and immediately the studio reported 3 stale indexes (as expected). However, after about an hour or so I see that it shows all 31 indexes are stale.

Stats: http://pastebin.com/7Q0suGEx

Kenneth Truyers

Oct 20, 2014, 12:35:24 PM

to rav...@googlegroups.com

Hi,

No, I didn't find a resolution for this. It's possible that when I took the stats, the indexes just went back to unstale, so that would explain the stats.

However, the issue remains: whenever you add an index, it makes all the other ones stale for a very long time.

Just to be clear: Shayne and I are working on the same project (same database).

Oren Eini (Ayende Rahien)

Oct 21, 2014, 10:07:50 AM

to ravendb

In 2.5, we are biased toward handling new indexes. That means that they get more time to run than the other indexes.

In 3.0, we have a different behavior.

Adding a new index on a large db in 2.5 can cause the existing indexes to take longer to process.

Kenneth Truyers

Oct 21, 2014, 12:21:21 PM

to rav...@googlegroups.com

Hi Oren,

I understand the behavior might have changed in 3.0, but at the moment we have found several issues with 3.0 that prevent us from upgrading, so we have to continue with the current version we have installed for the moment.

I just did a test and after importing some data, 27 indexes went stale and it took about 7 hours before they became unstale again.

I hope this is not expected behavior. If it isn't, though, I'm not sure what to look at on our side. The indexes cover a lot of documents (our main DB size is about 1.000.000 documents), but they are not extremely complex, nor do they involve Cartesian products or flatten structures (our biggest index has fewer than 1.000.000 documents). Could you tell me exactly what information you need?

You mentioned the log file a few times, but could you indicate which log file, where to get it (or where the documentation to get it is) and at what point (before the change, when the indexes are stale, after they become unstale, ...)?

Keep in mind that this process takes 7 hours to complete, so I can't just redo it every time a new piece of information is needed.

I hope you do understand that this seriously limits our ability to use RavenDB. It's not a case of support, SLA, ... It's a case of not having the proper documentation to know how to troubleshoot, not having proper design guidelines anywhere and having this forum as the only source of community support. I understand this is a young product, but if we are not able to build a moderate size DB with a reasonable response time on the indexes, we will have to conclude that Raven is not adequate for our needs (and this should be documented as well). So, if this is expected behavior, I hope you can be direct about it, so we can make an informed decision whether to continue with RavenDB or not (along with the other issues we posted on the forum here).

Chris Marisic

Oct 21, 2014, 1:02:35 PM

to rav...@googlegroups.com

You have to keep in mind that just because they're stale doesn't necessarily mean anything, unless the index staleness was actually halting your application (an issue I've seen in the past).

Why does it matter if the indexes are stale? Did you actually test how stale the other indexes were? As in, did you add a new FooItem and measure the time elapsed until FooIndex returns FooItem?
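The measurement suggested here (write one item, then time how long it takes for a query on the index to return it) could be sketched roughly as follows. This is a generic Python sketch, not RavenDB client code: `write_item` and `is_visible` are hypothetical callables you would wire to your own store and query, and the timeout and polling interval are arbitrary.

```python
import time


def measure_index_latency(write_item, is_visible, timeout=60.0, poll_interval=0.1,
                          clock=time.monotonic, sleep=time.sleep):
    """Write one item, then poll until a query sees it.

    Returns the elapsed seconds, or None if the item is still missing
    once `timeout` seconds have passed.
    """
    start = clock()
    write_item()
    while True:
        if is_visible():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(poll_interval)
```

The number this returns is the indexing latency for that one index, which says more than a stale/non-stale flag does.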

You never mentioned your hosting setup. What types of hard drive(s), is it cloud or physical? 1M documents is not many documents; how large are the documents in KB/MB on average? How many gigabytes is your total Raven server? Are you sharing resources with multiple databases? Do you share disk IO with other disk intensive applications (such as a sql database)?

Federico Lois

Oct 21, 2014, 2:03:28 PM

to rav...@googlegroups.com

I would add: are you running on Azure?

Kenneth Truyers

Oct 21, 2014, 7:05:40 PM

to rav...@googlegroups.com

Hi Chris, Federico,

Thanks for your reply.

The staleness of the indexes actually does matter, I'll explain briefly what the application does so you can get an idea.

We aggregate data from different sources and deduplicate them to create a type of super database. When data comes in from a source, we need to query the database to check for duplicates and then merge them (according to a set of rules and preferences from one data source to another). That means that as soon as we import items from a certain data source, we need to wait until the index we're querying at that moment becomes unstale to do the next import (otherwise we might miss some duplicates). For the consumer of the database (the read-side) it doesn't really matter that the indexes are stale, but for the write side it does. So, we don't test the staleness of the indexes, we just wait for nonstale results, which can take a very long time.

As for the environment, we're currently running on Azure on a 4-core 7GB RAM machine. However, the test I did before was on a local workstation (4-core 3GHz i7, 16GB RAM and an SSD-drive).

The average document size is anywhere between 5 and 30KB and the total database size is about 2.5GB. We're not sharing any resources.

Oren Eini (Ayende Rahien)

Oct 22, 2014, 5:09:37 AM

to ravendb

What is the size of the database you have? (GB, and # of docs)

What license do you use? Do you use suggestions?

Are you doing constant writes?

Here is the documentation on how to enable the debug log: http://ravendb.net/docs/2.5/server/deployment/docs-debug-logging

The most important source of information about indexes is the /stats endpoint.

Oren Eini (Ayende Rahien)

Oct 22, 2014, 5:11:58 AM

to ravendb

Note that Azure I/O is notoriously slow.

As for your issue, if you need to do dedup, indexes aren't the way to do that.

Use document ids, instead.
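To make the document-id suggestion concrete, here is a minimal Python sketch that derives a deterministic id from a normalized name, so that two imports of the same entity map to the same id and the second one collides instead of creating a duplicate. The normalization rules, the `artists/` prefix and the function names are all hypothetical, and this only catches names that become identical after normalization; it does not cover fuzzy or radius-based matching.

```python
import re
import unicodedata


def artist_key(name):
    """Normalize a name into a stable key: 'Last, First' order, case, accents, punctuation."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    # Strip accents by decomposing and dropping non-ASCII code points.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return "-".join(re.findall(r"[a-z0-9]+", ascii_name.lower()))


def artist_document_id(name):
    """Build a document id from the normalized key (prefix is an illustrative choice)."""
    return "artists/" + artist_key(name)
```

With this scheme, 'Johny Cash' and 'Cash, Johny' both map to `artists/johny-cash`, so a duplicate can be detected by loading that id directly rather than by querying an index.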

Kenneth Truyers

Oct 22, 2014, 5:31:10 AM

to rav...@googlegroups.com

Hi Oren,

As I said in my previous answer: we have about 1.000.000 documents with a DB size of 2.5GB.

On Azure we use a commercial license, locally we use a dev-license.

The writes are not constant, but in batches. For deduplication, we can't use IDs since we have more complex matching logic (e.g. fuzzy matches on the name, or location within a certain radius, ...)

As for Azure I/O being slow, I also tested this on a local machine (as mentioned in the previous post, a 4-core 3GHz i7 with 16GB ram and an SSD-drive) and the result is the same: 7 hours before the indexes go nonstale.

Oren Eini (Ayende Rahien)

Oct 22, 2014, 5:34:57 AM

to ravendb

7 hours doing what? Just indexing? Also processing writes?

Remember that stale / non stale isn't something that we care about. We care about latencies.

Also, remember that soft dedup like that can very easily run into issues.

You can have two transactions that try to save the same thing: both first check that it isn't there, and then both save.
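The race described here can be reproduced with a small in-memory Python sketch. It has nothing RavenDB-specific in it: the "database" is a plain list, and the two writers are generators that pause between the check and the save so the interleaving can be forced deterministically. All names are illustrative.

```python
def check_then_save(documents, name):
    """One import split into its two steps; yields between the check and the save."""
    already_there = any(doc["Name"] == name for doc in documents)
    yield  # another writer may run here
    if not already_there:
        documents.append({"Name": name})


def run_interleaved(documents, name):
    """Two writers both perform the check before either one saves."""
    a = check_then_save(documents, name)
    b = check_then_save(documents, name)
    next(a)  # writer A checks: not found
    next(b)  # writer B checks: not found
    for writer in (a, b):
        next(writer, None)  # each saves, unaware of the other
    return documents
```

Running `run_interleaved([], "Johny Cash")` leaves two copies in the list, even though each writer checked first; waiting for a non-stale index does not close this window on its own.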

Kenneth Truyers

Oct 22, 2014, 5:47:56 AM

to rav...@googlegroups.com

In those 7 hours no writes were made (we don't do any writes as long as the indexes are stale). This is the main issue: as long as the index we need for matching new data against existing data is stale, we can't do any new imports.

For the soft dedup, what other solution would you suggest? To give you a more concrete example:

We have the following data from two different data sources:

Source 1
-------------
Artist:
{
    Name: 'Johny Cash',
    Website: ''
}

Source 2
-------------
Artist:
{
    Name: 'Cash, Johny',
    Website: 'johnycash.com'
}

We save the first one to the database. Now when we import the second one, we query the database on an index like this:

from a in artists
select new { a.Name }

Using the Lucene capabilities, we can now fuzzy match their names. When we find a match, instead of inserting the new performer, we update the current performer and fill in the website so that we have the following document in the database:

{
    Name: 'Johny Cash',
    Website: 'johnycash.com'
}
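The match-then-merge step described above could look roughly like this in Python. Note the substitutions: `difflib.SequenceMatcher` from the standard library stands in for Lucene's fuzzy matching, the 0.85 threshold is arbitrary, and the merge rule (existing values win, empty fields get filled in) is only one assumed preference rule, not the project's actual rule set.

```python
from difflib import SequenceMatcher


def name_tokens(name):
    """Lowercase the name and sort its words so 'Cash, Johny' equals 'Johny Cash'."""
    return " ".join(sorted(name.lower().replace(",", " ").split()))


def is_probable_match(a, b, threshold=0.85):
    """Similarity check on the normalized names (stand-in for a Lucene fuzzy query)."""
    return SequenceMatcher(None, name_tokens(a), name_tokens(b)).ratio() >= threshold


def merge_artist(existing, incoming):
    """Keep existing values; fill only fields that are empty in the existing document."""
    merged = dict(existing)
    for field, value in incoming.items():
        if not merged.get(field):
            merged[field] = value
    return merged
```

Applied to the two source records above, `is_probable_match` accepts the pair and `merge_artist` yields the merged document with the name from Source 1 and the website from Source 2.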

I don't see how we could write to the database while relying on stale indexes. If we allowed this particular index to be stale when we write, the first document might not be indexed yet, so we would create a duplicate in the database.

How would you recommend solving this?

On the other hand, we knew that we'd have to wait for stale indexes, but we never expected it to take that long. Imports are done maybe 5 to 10 times a day, so it's OK if it takes even up to an hour, but 7 hours seems excessive.

Chris Marisic

Oct 22, 2014, 1:33:56 PM

to rav...@googlegroups.com

Something seems very wonky. For a database as small as yours, this should take minutes, not hours.

Chris Marisic

Oct 22, 2014, 1:34:46 PM

to rav...@googlegroups.com

Are you using LoadDocument extensively in your indexes?

Oren Eini (Ayende Rahien)

Oct 22, 2014, 1:34:48 PM

to ravendb

Please post your index definitions

Kenneth Truyers

Oct 23, 2014, 9:31:45 AM

to rav...@googlegroups.com

We're only using LoadDocument in one or two indexes. I have attached the index definitions.

FYI: this is the approximate document count for each collection:

CommandLogs: 4901
Events: 383.266
Merges: 869
Performers: 945.088
Splits: 6
Venues: 20.003

IndexDefinitions.zip

Chris Marisic

Oct 23, 2014, 10:28:24 AM

to rav...@googlegroups.com

You seem to have way too many indexes. In general you should have exactly 1 map index per collection type.

Kenneth Truyers

Oct 23, 2014, 10:57:50 AM

to rav...@googlegroups.com

We had about 30 indexes before, and we refactored things to have fewer indexes. I think there are probably two or three more indexes we could merge, but after that I can't really see what more can be done, because:

The indexes have different analyzers
They have different reduce components
Some of them use LoadDocument
Some of them have MultiMaps

Could you have a look at some of the indexes and see what other merge possibilities you see?

Either way, I don't think 22 indexes on 1.000.000 documents should be such a bottleneck for the system. I still feel like something else is wrong.

Thank you very much for your help!

Oren Eini (Ayende Rahien)

Oct 26, 2014, 4:19:42 AM

to ravendb

Can you show the stats output from when it is indexing?

Preferably 3 - 5 times, taken 3 - 5 minutes apart?
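One way to gather what is being asked for here is to capture several stats snapshots at a fixed interval and compare them. The Python sketch below assumes a `fetch` callable that returns the parsed stats document, and it assumes each entry under `Indexes` has a `Name` and a numeric counter such as `IndexingAttempts`; those field names are guesses about the payload and should be checked against real output.

```python
import time


def collect_snapshots(fetch, count=3, interval_seconds=180, sleep=time.sleep):
    """Call fetch() `count` times, pausing `interval_seconds` between calls."""
    snapshots = []
    for i in range(count):
        snapshots.append(fetch())
        if i < count - 1:
            sleep(interval_seconds)
    return snapshots


def indexing_progress(snapshots, field="IndexingAttempts"):
    """Map index name -> change in `field` between the first and last snapshot."""
    def by_name(stats):
        return {ix["Name"]: ix.get(field, 0) for ix in stats.get("Indexes", [])}

    first, last = by_name(snapshots[0]), by_name(snapshots[-1])
    return {name: last[name] - first.get(name, 0) for name in last}
```

An index whose counter does not move between snapshots is not being worked on during that window, which helps separate "slow" from "starved".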
