Because we had daily outages of 15 minutes caused by the DB backup,
we followed Luca's advice and set up a distributed environment
with two nodes.
Here is the config:
default-distributed-db-config.json
{
  "replication": true,
  "autoDeploy": true,
  "hotAlignment": true,
  "resyncEvery": 15,
  "clusters": {
    "internal": {
      "replication": false
    },
    "index": {
      "replication": false
    },
    "ODistributedConflict": {
      "replication": false
    },
    "*": {
      "replication": true,
      "readQuorum": 1,
      "writeQuorum": 1,
      "failureAvailableNodesLessQuorum": false,
      "readYourWrites": true,
      "partitioning": {
        "strategy": "round-robin",
        "default": 0,
        "partitions": [
          [ "<NEW_NODE>" ]
        ]
      }
    }
  }
}
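To make the per-cluster settings in the config above easier to compare, here is a minimal Python sketch (not part of our setup) that looks up which settings would apply to a given cluster name. It assumes that a cluster with its own entry uses only that entry and that every other cluster falls back to the "*" entry; that is only our reading of the file, not something we have confirmed against OrientDB. The function name effective_settings and the cluster name "user" are made up for illustration.

```python
import json

# Copy of the config shown above.
CONFIG_TEXT = """
{
  "replication": true,
  "autoDeploy": true,
  "hotAlignment": true,
  "resyncEvery": 15,
  "clusters": {
    "internal": {"replication": false},
    "index": {"replication": false},
    "ODistributedConflict": {"replication": false},
    "*": {
      "replication": true,
      "readQuorum": 1,
      "writeQuorum": 1,
      "failureAvailableNodesLessQuorum": false,
      "readYourWrites": true,
      "partitioning": {
        "strategy": "round-robin",
        "default": 0,
        "partitions": [["<NEW_NODE>"]]
      }
    }
  }
}
"""


def effective_settings(config, cluster_name):
    """Return the entry assumed to apply to cluster_name.

    Assumption (not verified against OrientDB): a named entry wins,
    otherwise the "*" entry is used.
    """
    clusters = config.get("clusters", {})
    return clusters.get(cluster_name, clusters.get("*", {}))


config = json.loads(CONFIG_TEXT)
# "user" is a hypothetical cluster name with no entry of its own.
for name in ("internal", "index", "ODistributedConflict", "user"):
    settings = effective_settings(config, name)
    print(name,
          "replication =", settings.get("replication"),
          "writeQuorum =", settings.get("writeQuorum"))
```

Under that assumption, only clusters without an explicit entry (such as the hypothetical "user") pick up readQuorum and writeQuorum of 1.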
Here is what happened and finally led to a corrupted DB:
- Stopped the application server
- Set up and configured replication with the settings above
- Two nodes: node01 and node02
- node02 had no existing database, so node01 exported and zipped the db
and sent it to node02
- node02 extracted the db successfully, log message: INFO [node02] installed database [OHazelcastPlugin]
- We started the application server
- After some time the following exception (sometimes) appeared in orient-server.log:
Cannot route TX operation against distributed node
Error on committing distributed transaction
-> com.orientechnologies.orient.server.distributed.ODistributedStorage.commit(ODistributedStorage.java:502)
-> com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:109)
-> com.orientechnologies.orient.core.db.record.ODatabaseRecordTx.commit(ODatabaseRecordTx.java:146)
-> com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:440)
-> com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:435)
-> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.commit(ONetworkProtocolBinary.java:1253)
-> com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:325)
-> com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.execute(OBinaryNetworkProtocolAbstract.java:126)
-> com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:45)
- After a while the following exception appeared in our application with increasing frequency:
Caused by:
com.orientechnologies.orient.core.exception.OTransactionException: Cannot insert item in mvrb-tree because the transactional item was not found.
at com.orientechnologies.orient.core.type.tree.OMVRBTreeRID.internalPut(OMVRBTreeRID.java:156)
at com.orientechnologies.orient.core.type.tree.OMVRBTreeRID.internalPut(OMVRBTreeRID.java:57)
at com.orientechnologies.orient.core.type.tree.OMVRBTreePersistent.put(OMVRBTreePersistent.java:468)
at com.orientechnologies.orient.core.type.tree.provider.OMVRBTreeRIDProvider.lazyUnmarshall(OMVRBTreeRIDProvider.java:227)
at com.orientechnologies.orient.core.type.tree.OMVRBTreeRID.getTreeSize(OMVRBTreeRID.java:332)
at com.orientechnologies.orient.core.type.tree.OMVRBTreeRID.size(OMVRBTreeRID.java:318)
at com.orientechnologies.orient.core.type.tree.OMVRBTreeRIDSet.size(OMVRBTreeRIDSet.java:91)
at com.orientechnologies.common.collection.OMultiValue.getSize(OMultiValue.java:82)
at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerSchemaAware2CSV.toString(ORecordSerializerSchemaAware2CSV.java:165)
at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerStringAbstract.toStream(ORecordSerializerStringAbstract.java:92)
at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerSchemaAware2CSV.toStream(ORecordSerializerSchemaAware2CSV.java:518)
at com.orientechnologies.orient.core.record.ORecordSchemaAwareAbstract.toStream(ORecordSchemaAwareAbstract.java:127)
at com.orientechnologies.orient.core.record.ORecordSchemaAwareAbstract.toStream(ORecordSchemaAwareAbstract.java:122)
at com.orientechnologies.orient.core.record.impl.ODocument.toStream(ODocument.java:391)
at com.orientechnologies.orient.client.remote.OStorageRemote.commitEntry(OStorageRemote.java:1919)
... 75 more
- Then we looked at a record affected by those exceptions via the OrientDB console, e.g.:
select from #12:155580
Error:
com.orientechnologies.orient.core.exception.OTransactionException:
Cannot insert item in mvrb-tree because the transactional item was not found.
- Simple properties could be selected without problems, e.g.:
select email from #12:155580
----+-----+---------
#   |@RID |email
----+-----+---------
0   |#-2:1|<removed>
----+-----+---------
- Selecting linked edges sometimes resulted in errors, e.g.:
select out_Friend from #12:155580
Error:
com.orientechnologies.orient.core.exception.OTransactionException:
Cannot insert item in mvrb-tree because the transactional item was not found.
- Whereas others worked:
select in_Friend from #12:155580
----+-----+---------
#   |@RID |in_Friend
----+-----+---------
0   |#-2:1|[63]
----+-----+---------
1 item(s) found. Query executed in 0.004 sec(s).
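The field-by-field checks above can be repeated for other records. Here is a minimal Python sketch that only builds the console query strings, one per field, so that a failing field can be isolated; it does not connect to the database. The function name probe_queries is hypothetical, and the field names are just the three we queried above.

```python
def probe_queries(rid, fields):
    """Build one console query per field for the given record ID."""
    if not rid.startswith("#"):
        raise ValueError("expected a record ID such as #12:155580")
    return ["select %s from %s" % (field, rid) for field in fields]


# Print the queries so they can be pasted into the console one by one.
for query in probe_queries("#12:155580", ["email", "out_Friend", "in_Friend"]):
    print(query)
```

Running each generated query separately shows which properties of a record can still be read and which ones raise the OTransactionException.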
Any idea or hint for us?
Thanks a million
Daniel