Rebuilding broken ElasticSearch indices with BerkeleyDB

Laverne Schrock

Aug 18, 2016, 1:08:14 AM
to Aurelius
Hi all,

We are experiencing an issue where our indexes become incorrect.

We are running Titan 1.0.0 and interact with it entirely via gremlin.sh. A sample properties file is attached. The indexes are created like so:

mgmt = graph.openManagement()
generation = mgmt.makePropertyKey("generation").dataType(Integer.class).make()
total_error = mgmt.makePropertyKey("total_error").dataType(Float.class).make()
generationTotalError = mgmt.buildIndex('generationTotalError', Vertex.class).addKey(generation).addKey(total_error).buildMixedIndex("search")
mgmt.commit()

Then the data is loaded. After that, I can do queries like `g.V().has('total_error', 0).count()` and it will very quickly locate the handful of nodes that fit this query. Occasionally, however, something happens to the database so that the query no longer finds any nodes. Following the guide here, I've taken these steps:

  1. Close gremlin.sh
  2. `mv demo0_searchindex demo0_searchindex.bak`
  3. Launch gremlin.sh
  4. `import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem`
  5. `graph = TitanFactory.open('demo0.properties')`
  6. `m = graph.openManagement()`
  7. `i = m.getGraphIndex('generationTotalError')`
  8. `m.updateIndex(i, SchemaAction.REINDEX)`
  9. `m.commit()`
  10. `ManagementSystem.awaitGraphIndexStatus(graph, 'generationTotalError').status(SchemaStatus.ENABLED).call()`

None of these commands block, but after command #8 I can see the Java process slowly use more and more CPU until it is at nearly 100% on all eight cores. Eventually, I get this output:

Exception in thread "Thread-8" java.lang.NullPointerException
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.scan.StandardScannerExecutor.cleanup(StandardScannerExecutor.java:200)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.scan.StandardScannerExecutor.cleanupSilent(StandardScannerExecutor.java:211)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.scan.StandardScannerExecutor.run(StandardScannerExecutor.java:110)
        at java.lang.Thread.run(Thread.java:745)

Running `graph.tx().commit()` at this point gives:

com.thinkaurelius.titan.core.TitanException: Could not start new transaction
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.newTransaction(StandardTitanGraph.java:324)
        at com.thinkaurelius.titan.graphdb.transaction.StandardTransactionBuilder.start(StandardTransactionBuilder.java:227)
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.newThreadBoundTransaction(StandardTitanGraph.java:313)
        at com.thinkaurelius.titan.graphdb.tinkerpop.TitanBlueprintsGraph.startNewTx(TitanBlueprintsGraph.java:68)
        at com.thinkaurelius.titan.graphdb.tinkerpop.TitanBlueprintsGraph.access$000(TitanBlueprintsGraph.java:33)
        at com.thinkaurelius.titan.graphdb.tinkerpop.TitanBlueprintsGraph$GraphTransaction.doOpen(TitanBlueprintsGraph.java:258)
        at org.apache.tinkerpop.gremlin.structure.util.AbstractTransaction.open(AbstractTransaction.java:87)
        at org.apache.tinkerpop.gremlin.structure.Transaction$READ_WRITE_BEHAVIOR$1.accept(Transaction.java:209)
        at org.apache.tinkerpop.gremlin.structure.Transaction$READ_WRITE_BEHAVIOR$1.accept(Transaction.java:206)
        at org.apache.tinkerpop.gremlin.structure.util.AbstractTransaction.commit(AbstractTransaction.java:92)
        at org.apache.tinkerpop.gremlin.structure.Transaction$commit.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:110)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:114)
        at groovysh_evaluate.run(groovysh_evaluate:3)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
        at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:69)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:185)
        at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:119)
        at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:94)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1207)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:130)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:150)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:123)
        at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:58)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1207)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:130)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:150)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:82)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
        at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:144)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
        at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:303)
Caused by: com.thinkaurelius.titan.diskstorage.PermanentBackendException: Could not start BerkeleyJE transaction
        at com.thinkaurelius.titan.diskstorage.berkeleyje.BerkeleyJEStoreManager.beginTransaction(BerkeleyJEStoreManager.java:147)
        at com.thinkaurelius.titan.diskstorage.berkeleyje.BerkeleyJEStoreManager.beginTransaction(BerkeleyJEStoreManager.java:34)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.keyvalue.OrderedKeyValueStoreManagerAdapter.beginTransaction(OrderedKeyValueStoreManagerAdapter.java:54)
        at com.thinkaurelius.titan.diskstorage.Backend.beginTransaction(Backend.java:525)
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.openBackendTransaction(StandardTitanGraph.java:330)
        at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.newTransaction(StandardTitanGraph.java:320)
        ... 46 more
Caused by: com.sleepycat.je.EnvironmentFailureException: (JE 5.0.73) JAVA_ERROR: Java Error occurred, recovery may not be possible.
        at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1507)
        at com.sleepycat.je.Environment.checkEnv(Environment.java:2185)
        at com.sleepycat.je.Environment.beginTransactionInternal(Environment.java:1313)
        at com.sleepycat.je.Environment.beginTransaction(Environment.java:1284)
        at com.thinkaurelius.titan.diskstorage.berkeleyje.BerkeleyJEStoreManager.beginTransaction(BerkeleyJEStoreManager.java:137)
        at com.thinkaurelius.titan.diskstorage.berkeleyje.BerkeleyJEStoreManager.beginTransaction(BerkeleyJEStoreManager.java:34)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.keyvalue.OrderedKeyValueStoreManagerAdapter.beginTransaction(OrderedKeyValueStoreManagerAdapter.java:54)
        at com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog.openTx(KCVSLog.java:306)
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:131)
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation$1.call(BackendOperation.java:147)
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:56)
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:42)
        at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:144)
        at com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller.run(KCVSLog.java:703)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at sun.reflect.GeneratedConstructorAccessor20.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at com.sleepycat.je.log.entry.BaseEntry.newInstanceOfType(BaseEntry.java:74)
        at com.sleepycat.je.log.entry.BaseEntry.newInstanceOfType(BaseEntry.java:69)
        at com.sleepycat.je.log.entry.LNLogEntry.newLNInstance(LNLogEntry.java:228)
        at com.sleepycat.je.log.entry.LNLogEntry.readBaseLNEntry(LNLogEntry.java:195)
        at com.sleepycat.je.log.entry.LNLogEntry.readEntry(LNLogEntry.java:130)
        at com.sleepycat.je.log.LogManager.getLogEntryFromLogSource(LogManager.java:1008)
        at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:848)
        at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:809)
        at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1412)
        at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)
        at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2261)
        at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1466)
        at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1593)
        at com.sleepycat.je.Cursor.retrieveNextAllowPhantoms(Cursor.java:2924)
        at com.sleepycat.je.Cursor.retrieveNextNoDups(Cursor.java:2801)
        at com.sleepycat.je.Cursor.retrieveNext(Cursor.java:2775)
        at com.sleepycat.je.Cursor.getNext(Cursor.java:1128)
        at com.thinkaurelius.titan.diskstorage.berkeleyje.BerkeleyJEKeyValueStore.getSlice(BerkeleyJEKeyValueStore.java:138)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.keyvalue.OrderedKeyValueStoreAdapter.getKeys(OrderedKeyValueStoreAdapter.java:107)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.KCVSUtil.getKeys(KCVSUtil.java:57)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.scan.StandardScannerExecutor.addDataPuller(StandardScannerExecutor.java:79)
        at com.thinkaurelius.titan.diskstorage.keycolumnvalue.scan.StandardScannerExecutor.run(StandardScannerExecutor.java:106)
        ... 1 more
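
For reference, here is a sketch of steps 4 through 10 as a single console session. It makes two changes to the steps as listed, both of which are assumptions rather than a confirmed fix. First, it prints the index status for the `generation` key before reindexing. Second, it calls `.get()` on the result of `updateIndex`, which, as far as I understand the Titan reindexing documentation, waits for the reindex job to finish instead of returning immediately. It does nothing about the `OutOfMemoryError` at the bottom of the trace.

```groovy
import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem

graph = TitanFactory.open('demo0.properties')

// Inspect the current status of the index for one of its keys before changing anything.
m = graph.openManagement()
i = m.getGraphIndex('generationTotalError')
println i.getIndexStatus(m.getPropertyKey('generation'))

// Assumption: get() blocks until the reindex job has completed (or failed).
m.updateIndex(i, SchemaAction.REINDEX).get()
m.commit()

// Same wait as in step 10.
ManagementSystem.awaitGraphIndexStatus(graph, 'generationTotalError').status(SchemaStatus.ENABLED).call()
```

If the `.get()` call does block as assumed, any failure in the scan job should surface in the console at that line rather than later in a background thread.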

Is there any way to get my indices back without rebuilding the whole database? I have a few other indices, but this is the most important one at the moment.

This database has ~67 million vertices and roughly as many edges, so I would love not to have to rebuild it. I would also love to be free of the threat of ending up in this situation again.
