I have a use case where a map/reduce index may legitimately have no reduced values for a given database. The 3.0 server incorrectly treats such an index as corrupt and resets it every time the database is loaded. From the log on startup:
2015-04-08 23:45:16.7452,Raven.Database.Indexing.IndexStorage.Startup,Warn,,Could not open index DocCounts/ByTag. Trying to recover index,"System.InvalidOperationException: Map-Reduce index corruption detected.
at Raven.Database.Indexing.IndexStorage.CheckMapReduceIndexState(IDictionary`2 commitData, Boolean resetTried) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 251
at Raven.Database.Indexing.IndexStorage.OpenIndexOnStartup(String indexName) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 142"
2015-04-08 23:45:16.7612,Raven.Database.Indexing.IndexStorage.Startup,Warn,,"Could not open index DocCounts/ByTag. Recovery operation failed, forcibly resetting index","System.InvalidOperationException: Map-Reduce index corruption detected.
at Raven.Database.Indexing.IndexStorage.CheckMapReduceIndexState(IDictionary`2 commitData, Boolean resetTried) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 251
at Raven.Database.Indexing.IndexStorage.OpenIndexOnStartup(String indexName) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 142"
As an illustration, consider counting tags on documents with the index definition below. If a database has many documents and none of them have any tags, Raven 3.0 resets this map/reduce index on every startup. It is fairly common for a few of my map/reduce indexes to be in this state, depending on the user's dataset (we reduce on counts of tags, custom key/value pairs, etc.). The problem is that the reset is costly when the database holds a large number of documents, none of which happen to have the properties that feed my map/reduce indexes, because each reset must scan the entire database on every start.
Additionally, for map-only indexes, the server resets the index when the database has no entries for that document type, because the last commit etag is then Etag.Empty:
2015-04-08 23:45:16.7612,Raven.Database.Indexing.IndexStorage,Info,,Resetting index 'ProductionDocs/PrimaryIndex (17)'. Last stored etag: 01000000-0000-0001-0000-000000000337. Last commit etag: 00000000-0000-0000-0000-000000000000.,
using System.Collections.Generic;
using System.Linq;
using Raven.Abstractions.Indexing; // SortOptions
using Raven.Client.Indexes;        // AbstractIndexCreationTask

public class Doc
{
    public ICollection<Tag> Tags;
}

public class Tag
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class DocCounts_ByTag : AbstractIndexCreationTask<Doc, DocCounts_ByTag.TagCount>
{
    public class TagCount
    {
        public string Id { get; set; }
        public string Name { get; set; }
        public long DocCount { get; set; }
    }

    public DocCounts_ByTag()
    {
        // One map entry per tag on each document; a document without tags emits nothing.
        Map = docs => from doc in docs
                      from tag in doc.Tags
                      select new
                      {
                          tag.Id,
                          tag.Name,
                          DocCount = 1,
                      };

        // Sum the per-tag entries into a document count per tag Id.
        Reduce = results => from tag in results
                            group tag by tag.Id
                            into g
                            select new
                            {
                                Id = g.Key,
                                g.First().Name,
                                DocCount = g.Sum(x => x.DocCount),
                            };

        Sort(result => result.DocCount, SortOptions.Long);
    }
}
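To make the empty case concrete, here is a minimal sketch that runs the same map shape with plain LINQ to Objects. It is not RavenDB code and does not reproduce the reset itself. SketchDoc, SketchTag and MapSketch are hypothetical stand-ins I made up so the snippet compiles on its own, and it assumes Tags is an empty collection rather than null. It only shows that a database full of untagged documents yields zero map entries, so the reduce has nothing to store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for Tag and Doc, local to this sketch.
public class SketchTag
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class SketchDoc
{
    // Assumption: untagged documents carry an empty collection, not null.
    public ICollection<SketchTag> Tags = new List<SketchTag>();
}

public static class MapSketch
{
    // Same shape as the Map in DocCounts_ByTag, evaluated in memory.
    public static int CountMapEntries(IEnumerable<SketchDoc> docs)
    {
        var mapped = from doc in docs
                     from tag in doc.Tags
                     select new { tag.Id, tag.Name, DocCount = 1L };
        return mapped.Count();
    }

    public static void Main()
    {
        // 1000 documents, none of them tagged.
        var docs = Enumerable.Range(0, 1000).Select(i => new SketchDoc()).ToList();
        Console.WriteLine(CountMapEntries(docs)); // prints 0
    }
}
```

Every one of those documents still has to be read to arrive at that empty result, which is why a reset on each startup hurts on a large database.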