Raven 3.0 server considers indexes without any entries as corrupt on startup


Barry Hagan

Apr 9, 2015, 1:32:06 AM
to rav...@googlegroups.com
I have a use case where a map/reduce index may have no reduced values for a given database. In that case the 3.0 server incorrectly considers the index corrupt and resets it every time the database is loaded.

2015-04-08 23:45:16.7452,Raven.Database.Indexing.IndexStorage.Startup,Warn,,Could not open index DocCounts/ByTag. Trying to recover index,"System.InvalidOperationException: Map-Reduce index corruption detected.
   at Raven.Database.Indexing.IndexStorage.CheckMapReduceIndexState(IDictionary`2 commitData, Boolean resetTried) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 251
   at Raven.Database.Indexing.IndexStorage.OpenIndexOnStartup(String indexName) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 142"

2015-04-08 23:45:16.7612,Raven.Database.Indexing.IndexStorage.Startup,Warn,,"Could not open index DocCounts/ByTag. Recovery operation failed, forcibly resetting index","System.InvalidOperationException: Map-Reduce index corruption detected.
   at Raven.Database.Indexing.IndexStorage.CheckMapReduceIndexState(IDictionary`2 commitData, Boolean resetTried) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 251
   at Raven.Database.Indexing.IndexStorage.OpenIndexOnStartup(String indexName) in c:\Builds\RavenDB-Stable-3.0\Raven.Database\Indexing\IndexStorage.cs:line 142"


As an illustration, consider counting tags on documents with the index definition below. If a database has many documents and none of them has any tags, Raven 3.0 resets this map/reduce index on every startup. It is fairly common for a few of my map/reduce indexes to be in this state, depending on the user's dataset (we reduce on counts of tags, custom key/value pairs, etc.). Resetting these indexes is costly when the database holds a large number of documents, none of which happen to have the properties that contribute to my map/reduce indexes, because the reset must scan the entire database on each start.

Additionally, for map-only indexes, I see the server reset the index if the database has no entries for that document type, since the last commit etag will be Etag.Empty:

2015-04-08 23:45:16.7612,Raven.Database.Indexing.IndexStorage,Info,,Resetting index 'ProductionDocs/PrimaryIndex (17)'. Last stored etag: 01000000-0000-0001-0000-000000000337. Last commit etag: 00000000-0000-0000-0000-000000000000.,



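    using System.Collections.Generic;
    using System.Linq;
    using Raven.Abstractions.Indexing; // SortOptions
    using Raven.Client.Indexes;        // AbstractIndexCreationTask

    public class Doc
    {
        public ICollection<Tag> Tags;
    }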

    public class Tag
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    public class DocCounts_ByTag : AbstractIndexCreationTask<Doc, DocCounts_ByTag.TagCount>
    {
        public class TagCount
        {
            public string Id { get; set; }
            public string Name { get; set; }
            public long DocCount { get; set; }
        }

        public DocCounts_ByTag()
        {
            Map = docs => from doc in docs
                          from tag in doc.Tags
                          select new
                          {
                              tag.Id,
                              tag.Name,
                              DocCount = 1,
                          };

            Reduce = results => from tag in results
                                group tag by tag.Id
                                    into g
                                    select new
                                    {
                                        Id = g.Key,
                                        g.First().Name,
                                        DocCount = g.Sum(x => x.DocCount),
                                    };

            Sort(result => result.DocCount, SortOptions.Long);
        }
    }

Oren Eini (Ayende Rahien)

Apr 9, 2015, 4:48:07 AM
to ravendb

Hibernating Rhinos Ltd  

Oren Eini l CEO Mobile: + 972-52-548-6969

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

 



Barry Hagan

Apr 10, 2015, 6:29:38 PM
to rav...@googlegroups.com
Please see my comment added on RavenDB-3373.

Chris Marisic

Apr 10, 2015, 6:39:11 PM
to rav...@googlegroups.com
Barry, as an interim fix, can you just add a dummy document so that the index has one document, even if you need to add logic in your app to ignore it?
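
The dummy-document idea can be sketched without a server. The snippet below is only an in-memory illustration, not RavenDB code: it runs the same Map and Reduce shapes from DocCounts_ByTag over a plain list with LINQ to Objects, shows that an untagged dataset produces zero reduce results, and shows that a single placeholder document produces exactly one result that the application then filters out. The names PlaceholderSketch, PlaceholderTagId, CreatePlaceholderDoc and WithoutPlaceholder are hypothetical and do not come from the thread or from the RavenDB API. Whether a non-empty index actually avoids the reset on startup is an assumption based on the suggestion above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Tag
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Doc
{
    public ICollection<Tag> Tags = new List<Tag>();
}

public class TagCount
{
    public string Id { get; set; }
    public string Name { get; set; }
    public long DocCount { get; set; }
}

public static class PlaceholderSketch
{
    // Hypothetical reserved tag id used only to keep the index non-empty.
    public const string PlaceholderTagId = "tags/placeholder";

    // In-memory equivalent of the Map and Reduce in DocCounts_ByTag.
    public static List<TagCount> MapReduce(IEnumerable<Doc> docs)
    {
        var mapped = from doc in docs
                     from tag in doc.Tags
                     select new TagCount { Id = tag.Id, Name = tag.Name, DocCount = 1 };

        return (from entry in mapped
                group entry by entry.Id into g
                select new TagCount
                {
                    Id = g.Key,
                    Name = g.First().Name,
                    DocCount = g.Sum(x => x.DocCount),
                }).ToList();
    }

    // The dummy document: one Doc carrying only the placeholder tag.
    public static Doc CreatePlaceholderDoc()
    {
        var doc = new Doc();
        doc.Tags.Add(new Tag { Id = PlaceholderTagId, Name = "placeholder" });
        return doc;
    }

    // Application-side logic that hides the placeholder entry from callers.
    public static List<TagCount> WithoutPlaceholder(IEnumerable<TagCount> results)
    {
        return results.Where(r => r.Id != PlaceholderTagId).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Two documents with no tags: the reduce has nothing to output.
        var docs = new List<Doc> { new Doc(), new Doc() };
        Console.WriteLine(PlaceholderSketch.MapReduce(docs).Count); // 0

        // With the placeholder document the index has exactly one entry.
        docs.Add(PlaceholderSketch.CreatePlaceholderDoc());
        var reduced = PlaceholderSketch.MapReduce(docs);
        Console.WriteLine(reduced.Count); // 1

        // The application sees the same (empty) result as before.
        Console.WriteLine(PlaceholderSketch.WithoutPlaceholder(reduced).Count); // 0
    }
}
```

Against a real database the placeholder would be stored once per database, and every query on the index would need the same filter, which is the extra application logic mentioned above.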

Barry Hagan

Apr 10, 2015, 9:14:26 PM
to rav...@googlegroups.com
I'm not in production with 3.0 yet - this is just my upgrade testing and discovery.  So I'll wait for a proper fix.