DataSubscription suddenly stops working [RavenDB version 30155 Voron Engine]

msharp...@gmail.com

Jan 11, 2017, 5:02:16 PM
to RavenDB - 2nd generation document database

We have a serious problem running a data subscription on one of our databases.

We are running RavenDB build 30155. The database uses the Voron engine, has a total size of 165 GB, nearly 50,000,000 documents, and only one index (Raven/DocumentsByEntityName). In addition, we have a subscription on a collection of 11,000,000 documents. The subscription suddenly stopped working after processing 3.5 million documents, without any visible reason or error. So we tried different things:


{
  "Url": "/databases/ ourdatabase /docs?etag=01000000-0000-0023-0000-00000007F25D",
  "Error": "System.IO.InvalidDataException: Failed to de-serialize a document: ourDoc/8852259/eventstream ---> System.Data.DataException: Index points to a non leaf page\r\n   at Voron.Trees.Tree.SearchForPage(MemorySlice key, Lazy`1& cursor, NodeHeader*& node)\r\n   at Voron.Trees.Tree.Read(Slice key)\r\n   at Voron.Impl.SnapshotReader.Read(String treeName, Slice key, WriteBatch writeBatch)\r\n   at Raven.Database.Storage.Voron.StorageActions.DocumentsStorageActions.ReadDocumentData(String normalizedKey, Slice sliceKey, Etag existingEtag, RavenJObject metadata, Int32& size)\r\n   --- End of inner exception stack trace ---\r\n   at Raven.Database.Storage.Voron.StorageActions.DocumentsStorageActions.ReadDocumentData(String normalizedKey, Slice sliceKey, Etag existingEtag, RavenJObject metadata, Int32& size)\r\n   at Raven.Database.Storage.Voron.StorageActions.DocumentsStorageActions.DocumentByKey(String key)\r\n   at Raven.Database.Storage.Voron.StorageActions.DocumentsStorageActions.<GetDocumentsAfterWithIdStartingWith>d__8.MoveNext()\r\n   at Raven.Database.Actions.DocumentActions.<>c__DisplayClass2c.<GetDocuments>b__2a(IStorageActionsAccessor actions)\r\n   at Raven.Storage.Voron.TransactionalStorage.ExecuteBatch(Action`1 action)\r\n   at Raven.Storage.Voron.TransactionalStorage.Batch(Action`1 action)\r\n   at Raven.Database.Actions.DocumentActions.GetDocuments(Int32 start, Int32 pageSize, Etag etag, CancellationToken token, Func`2 addDocument, Nullable`1 maxSize, Nullable`1 timeout)\r\n   at Raven.Database.Actions.DocumentActions.GetDocumentsAsJson(Int32 start, Int32 pageSize, Etag etag, CancellationToken token, Nullable`1 maxSize, Nullable`1 timeout)\r\n   at Raven.Database.Server.Controllers.DocumentsController.DocsGet()\r\n   at lambda_method(Closure , Object , Object[] )\r\n   at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass10.<GetExecutor>b__9(Object instance, Object[] methodParameters)\r\n   at 
System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync(HttpControllerContext controllerContext, IDictionary`2 arguments, CancellationToken cancellationToken)\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n   at System.Web.Http.Controllers.ApiControllerActionInvoker.<InvokeActionAsyncCore>d__0.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n   at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__2.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n   at System.Web.Http.Controllers.ExceptionFilterResult.<ExecuteAsync>d__0.MoveNext()"
}



·         If we try to create a new index, we get the following error after a while:

Unexpected exception happened during execution of indexing batch...this is not supposed to happen. Reason: System.IO.InvalidDataException: Data corruption - the key = 'ourDoc/730149' was found in the documents index, but matching document was not found
   at Raven.Database.Storage.Voron.StorageActions.DocumentsStorageActions.<GetDocumentsAfterWithIdStartingWith>d__8.MoveNext()
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Raven.Database.Prefetching.PrefetchingBehavior.<>c__DisplayClass20.<GetJsonDocsFromDisk>b__1c(IStorageActionsAccessor actions)
   at Raven.Storage.Voron.TransactionalStorage.ExecuteBatch(Action`1 action)
   at Raven.Storage.Voron.TransactionalStorage.Batch(Action`1 action)
   at Raven.Database.Prefetching.PrefetchingBehavior.GetJsonDocsFromDisk(CancellationToken cancellationToken, Etag etag, Etag untilEtag, Reference`1 earlyExit)
   at Raven.Database.Prefetching.PrefetchingBehavior.LoadDocumentsFromDisk(Etag etag, Etag untilEtag)
   at Raven.Database.Prefetching.PrefetchingBehavior.GetDocsFromBatchWithPossibleDuplicates(Etag etag, Nullable`1 take)
   at Raven.Database.Prefetching.PrefetchingBehavior.GetDocumentsBatchFrom(Etag etag, Nullable`1 take)
   at Raven.Database.Prefetching.PrefetchingBehavior.DocumentBatchFrom(Etag etag, List`1& documents)
   at Raven.Database.Indexing.IndexingExecuter.<ExecuteIndexingWork>b__17(IndexingGroup indexingGroup, Int64 i)
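
Both errors above name the affected document key in the message text. As a rough sketch (not part of RavenDB), the keys could be collected from saved error output like this. The function names and the two regular expressions are hypothetical and are derived only from the two messages quoted in this post, so other builds may word the errors differently.

```python
import json
import re

# Hypothetical patterns, based only on the two error messages quoted above.
_PATTERNS = [
    re.compile(r"Failed to de-serialize a document: (\S+)"),
    re.compile(r"the key = '([^']+)' was found in the documents index"),
]


def corrupted_keys(text):
    """Return the document keys named in the error text, by position, without duplicates."""
    hits = []
    for pattern in _PATTERNS:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(1)))
    keys = []
    for _, key in sorted(hits):
        if key not in keys:
            keys.append(key)
    return keys


def keys_from_error_response(body):
    """Extract keys from a JSON error body that has an "Error" field, like the response above."""
    return corrupted_keys(json.loads(body).get("Error", ""))
```

Running `corrupted_keys` over the server log, or `keys_from_error_response` over each failed HTTP response, would give a first list of affected documents to check.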



So it looks like we have three problems:

  1. The index points to “non-existent documents”.
  2. The “non-existent documents” are lost, even though we have no code that deletes such documents.
  3. The subscription doesn’t start because the index is defective.

We are currently trying to find out how many documents are lost; if there are not too many, there may be a chance to repair them.

So our main problem is the defective index, which is why the subscription doesn’t work. We have reset the index, but that did not help.


Does anyone have an idea how we can solve this problem?

Many thanks in advance!

 

 

ar...@ayende.com

Jan 12, 2017, 5:05:43 AM
to RavenDB - 2nd generation document database
Hi,

In the RavenDB-Build-30155.zip file, under Diag/StorageExporter, you will find the Raven.StorageExporter.exe command-line tool, which attempts to export a database. Run the tool without any arguments to see its usage and available options.

The database needs to be offline at the time of the export. When the tool hits a corrupted document, you will get an exception and the document will be skipped (it is not included in the .ravendump file). This way you will be able to determine how many docs are broken.

To import, use the Raven.Smuggler tool: https://ravendb.net/docs/article-page/3.5/Csharp/server/administration/exporting-and-importing-data. Please import into the 3.0.30160-Hotfix version of RavenDB.

You are currently running build 30155. Which versions of RavenDB did you use in the past? The corruption you are experiencing could have been caused by a bug in an earlier version of RavenDB.
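
One way to turn the export into a list of lost documents is to compare the ids in the dump against the ids you expect to exist. The sketch below is only an illustration under stated assumptions: it assumes the .ravendump file is gzip-compressed JSON with a top-level "Docs" array whose entries carry their id under "@metadata" / "@id", and the helper names are hypothetical. It also loads the whole file into memory, so it suits a small test dump; a dump of a 165 GB database would need a streaming JSON parser instead.

```python
import gzip
import json


def exported_ids(dump_path):
    """Return the set of document ids found in a dump file.

    ASSUMPTION: the dump is gzip-compressed JSON with a top-level "Docs"
    array, and each document stores its id under "@metadata" -> "@id".
    """
    with gzip.open(dump_path, "rt", encoding="utf-8") as fh:
        dump = json.load(fh)  # reads everything into memory: small dumps only
    return {doc["@metadata"]["@id"] for doc in dump.get("Docs", [])}


def missing_ids(expected_ids, dump_path):
    """Return the expected ids that are absent from the dump, sorted."""
    return sorted(set(expected_ids) - exported_ids(dump_path))
```

The list of expected ids has to come from somewhere else, for example from the application that wrote the documents.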

Regards,
Arek

msharp...@gmail.com

Jan 12, 2017, 6:52:58 AM
to RavenDB - 2nd generation document database
Hi Arek,
Thank you for your reply. We are trying out your advice.
We set up a new database server with build 30155 as its first version and then created the database on it, so no previous versions were involved.
Kind regards,
m#

msharp...@gmail.com

Jan 13, 2017, 3:11:46 PM
to RavenDB - 2nd generation document database
Hi Arek,
The advice you gave us worked well. We were able to export our database with the tool you mentioned and import it into a new server running build 3.0.30160-Hotfix.
After importing the database, we found that the document etags are different and that, as a result, the subscription now reads the documents in a different order.
Is there another option or command-line switch that we can use to bring the documents back into the right order?
Many thanks in advance! 

Kind regards
m#

Arkadiusz Palinski

Jan 13, 2017, 4:17:08 PM
to rav...@googlegroups.com
There is no way to recover the old etags. However, you can try specifying -DocumentsStartEtag and passing the non-empty etag 00000000-0000-0000-0000-000000000001. The documents will then be written to the export file in the desired order. Under the hood this mechanism works differently: exporting docs from a specified etag uses the 'key_by_etag' index, which, judging by the exceptions in your first post, I think may be corrupted. So you might not be able to export all docs successfully (you will then see errors and know which docs were skipped).

Implementation details of docs export starting from etag can be found here: https://github.com/ravendb/ravendb/blob/v3.0/Raven.StorageExporter/StorageExporter.cs#L177
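
To illustrate why new etags change the order in which a subscription sees the documents, here is a small sketch that sorts documents by etag. It assumes that etags order like the 128-bit number you get by reading their hex digits from left to right; the helper names are hypothetical and this is not RavenDB code.

```python
def etag_sort_key(etag):
    """Map an etag string such as 01000000-0000-0023-0000-00000007F25D to an integer.

    ASSUMPTION: etags compare like the 128-bit number formed by their
    hex digits, read left to right.
    """
    digits = etag.replace("-", "")
    if len(digits) != 32:
        raise ValueError(f"unexpected etag format: {etag!r}")
    return int(digits, 16)


def in_etag_order(docs):
    """Sort (key, etag) pairs by etag (hypothetical helper)."""
    return sorted(docs, key=lambda pair: etag_sort_key(pair[1]))
```

An import assigns fresh etags in the order the documents are written, so the order of the documents inside the export file decides the order the subscription will see after the import.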
