Important: improving relationships, Super-Node problem and roadmap


Luca Garulli

Nov 18, 2011, 7:43:04 AM
to orient-database
Hi all,
in the last few days I've received many messages about the bottleneck in OrientDB (which other GraphDBs seem to share) when many edges are added to a single vertex. This problem has the same cause as non-unique indexes being slow when many items share the same key.

Now this problem seems more critical than when we discovered it. This is why I'd like to fix it in 1.0, postponing the release date to the end of January 2012. This would also be compatible with the next release round of TinkerPop Blueprints, planned for the beginning of December. So we could provide a 1.0rc7 release before the official 1.0.

The roadmap would be:
  • 1.0rc7 planned for December 2nd, 2011
  • 1.0 planned for end of January 2012
WDYT? Comments? Votes?

Technical proposal of the improvement (quite technical, sorry)

The cause of the bottleneck is in the management of the list/set of RIDs. Today it's embedded into the parent record and streamed like:

{
  "name" : "Luca",
  "friends" : [#10:3,#10:232,#10:3443]
}

This means that every time I add a new item I have to unmarshall the entire list, add the item, re-marshall it, convert it to binary and then store it. Wow! This is why it's not so fast.
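To make the cost concrete, here is a minimal sketch (hypothetical names, not OrientDB's real serializer) of what each append does today:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of today's embedded-list update: every append
// re-marshalls the whole collection, so building a collection of
// N edges costs O(N^2) work overall.
class EmbeddedRidList {
    private final List<String> rids = new ArrayList<>();

    byte[] add(String rid) {
        rids.add(rid);                                    // 1. mutate the unmarshalled list
        String json = "[" + String.join(",", rids) + "]"; // 2. re-marshall the ENTIRE list
        return json.getBytes();                           // 3. convert to binary and store
    }
}
```

The point is step 2: its cost grows with the number of edges already present, not with the size of the change.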

I have a good solution in mind; sorry to go so deep into details:
  • split the collection of RIDs into several linked nodes: like a common linked list, but persistent
  • node size will be configurable, 512 items by default
  • internal data will be serialized as binary, avoiding the from-string/to-string cost
  • each node will be serialized with a link to the previous node (10 bytes as RID), to the next one (10 bytes as RID) and the number of items inside it (4 bytes as integer)
  • the list of RIDs will be of fixed size: 10 bytes (RID streamed size) * 512 (node size) = 5120 bytes per node. This avoids defragmentation on update because the size doesn't change
  • so each node will have a size of: 10 + 10 + 4 + (10 * 512) = 5144 bytes
  • to save memory, RIDs will not be unmarshalled but lazily created on the fly from the binary data
  • a new iterator could traverse the entire list at very low cost without consuming memory
  • the first node will continue to be stored embedded in the parent, avoiding penalizing tiny collections
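The sizing above can be sketched as follows (class and method names are illustrative, not the final API):

```java
// Sketch of the proposed fixed-size node layout; names are illustrative,
// sizes come from the proposal above.
class RidNodeLayout {
    static final int RID_SIZE   = 10;  // streamed RID: cluster id + position
    static final int NODE_ITEMS = 512; // configurable, 512 by default

    // fixed-size RID payload: never changes on update, so no defragmentation
    static int payloadSize() { return RID_SIZE * NODE_ITEMS; }

    // prev link (RID) + next link (RID) + item count (int) + payload
    static int nodeSize() { return RID_SIZE + RID_SIZE + 4 + payloadSize(); }
}
```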

This could be ready in 1 week, or in 2 weeks with all the test cases.

WDYT? Any comments?

Lvc@

Alessandro Nadalin

Nov 18, 2011, 7:46:15 AM
to orient-...@googlegroups.com

Sounds reasonable.

Postponing the stable release in order to fix critical issues is the only way to go if we don't want people ranting about an "incomplete" stable release.

--
Alessandro Nadalin
www.odino.org | twitter.com/_odino_
sent from a mobile device

Sylvain Spinelli

Nov 18, 2011, 8:23:57 AM
to OrientDB
Hi Luca,

Perhaps the first embedded node should be different:
* have a dynamic size in order not to use 5144 bytes for one or two links.
* contain a ref to the last node for efficient appending of links.

But... the performance will not be so good for removing links: you need to iterate over the entire list.

The other way would be to use your MVRBTree as a Set (ordering by RID). (That is what I plan for my own implementation.)

But for this I think we need to rework OMVRBTreePersistent with a delegation pattern to a persistence layer (for the tree properties and for each entry).
That way, I think we could optimize the MVRBTree performance: for example, when values or keys are of fixed size, the persistence layer could be more efficient than the actual implementation.
The persistence layer could also "fake the values" in order to implement an efficient Set interface: store only the keys and return the same object for both key and value.

If you're interested, I could help.

Sylvain

Michael Widmann

Nov 18, 2011, 9:14:52 AM
to orient-...@googlegroups.com
Hmmm

If this also fixes the problems we have with "exists" on the index (OLogManager), and the optimization in multithreaded environments with...

17:15:53,997 ERROR [remoting.storage.StorageServiceServerMain] [Thread:main] Error upon startup
com.orientechnologies.orient.core.memory.OLowMemoryException: Optimization level: 1
    at com.orientechnologies.orient.core.type.tree.OMVRBTreePersistent.checkForOptimization(OMVRBTreePersistent.java:643)
    at com.orientechnologies.orient.core.type.tree.OMVRBTreePersistent.loadEntry(OMVRBTreePersistent.java:123)
    at com.orientechnologies.orient.core.type.tree.OMVRBTreeEntryPersistent.getParent(OMVRBTreeEntryPersistent.java:411)
    at com.orientechnologies.common.collection.OMVRBTree.successor(OMVRBTree.java:2148)
    at com.orientechnologies.common.collection.OMVRBTree.getEntry(OMVRBTree.java:437)
    at com.orientechnologies.common.collection.OMVRBTree.getEntry(OMVRBTree.java:362)
    at com.orientechnologies.common.collection.OMVRBTree.containsKey(OMVRBTree.java:228)
    at com.orientechnologies.orient.core.index.OIndexMVRBTreeAbstract.contains(OIndexMVRBTreeAbstract.java:221)
    at com.orientechnologies.orient.core.index.OIndexAbstractDelegate.contains(OIndexAbstractDelegate.java:65)
    at com.bayoda.ods.store.orientdb.OrientDBStore.exists(OrientDBStore.java:200)
    at com.bayoda.ods.store.orientdb.OrientDBStore.put(OrientDBStore.java:362)
    at com.bayoda.ods.store.orientdb.OrientDBStore.put(OrientDBStore.java:1)
    at com.bayoda.ods.store.LoggingStore.put(LoggingStore.java:84)
    at com.bayoda.ods.store.LoggingStore.put(LoggingStore.java:1)
    at com.bayoda.ods.remoting.impl.StorageServiceServer.put(StorageServiceServer.java:122)
    at com.bayoda.remoting.storage.StorageServiceServerMain.put(StorageServiceServerMain.java:95)
    at com.bayoda.remoting.storage.StorageServiceServerMain.bulkPerformanceAllStoresTest(StorageServiceServerMain.java:163)
    at com.bayoda.remoting.storage.StorageServiceServerMain.startSpringified(StorageServiceServerMain.java:78)
    at com.bayoda.remoting.storage.StorageServiceServerMain.main(StorageServiceServerMain.java:86)

Everything would be fine :-) The problem is that speed isn't everything; stability is much more the thing to consider. Orient rules - if stable ;-)

Michael




--
bayoda.com - Professional Online Backup Solutions for Small and Medium Sized Companies

Luca Garulli

Nov 18, 2011, 9:57:14 AM
to orient-...@googlegroups.com
Hi,
this has been fixed in SVN r4171.

Lvc@

connie dobbs

Nov 18, 2011, 9:58:13 AM
to orient-...@googlegroups.com
Hi Luca,

I'm very excited that this problem is getting priority treatment: other aspects of OrientDB suit my problem space perfectly; this was the only hurdle.

The proposed solution appears to ease issues with modifying membership in the edge collection. Even deleting, in my mind, could be handled efficiently if a "null RID" (all zeros) could be used to overwrite any deleted edge in the node. Any operations/iterators could easily ignore the nulled RID, effectively making deletions just as cheap as additions (even cheaper, in edge cases). If the unused space (i.e. nulled slots) becomes a concern, the nodes could be repacked without the nulls in an offline operation. Obviously, this requires the all-zero/nulled RID to be otherwise unused/invalid, which I'm not certain of in the current implementation.
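A minimal sketch of the tombstone idea (the null-RID value and the class names are hypothetical, not current OrientDB code):

```java
import java.util.Arrays;

// Sketch of the tombstone idea from the paragraph above; the null-RID
// value and names are hypothetical, not part of any real implementation.
class TombstoneNode {
    static final String NULL_RID = "#0:0"; // "all zeros"; assumed otherwise unused/invalid
    private final String[] slots;

    TombstoneNode(String... rids) { slots = rids.clone(); }

    void remove(int i) { slots[i] = NULL_RID; }  // O(1): overwrite in place, no shifting

    long liveCount() {                           // iterators simply skip tombstones
        return Arrays.stream(slots).filter(r -> !NULL_RID.equals(r)).count();
    }
}
```

An offline repack would then rewrite the node keeping only the non-null slots.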

I'm curious how the proposed solution would perform during edge traversal: would the entire edge collection need to be iterated to determine membership, or am I missing something in the solution?
When thinking about optimizing traversal, the nodes start to look like hash buckets to me. Instead of simply treating them as a linked list, I would think it could be possible to manage them the same way most hash implementations do. That way, for any given edge, finding which node/bucket it would probably be contained in would be a very cheap hash operation, and then finding the edge would be limited to matching against at most the 512 records in that node. Any thoughts?
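A sketch of the bucket selection (the helper and its signature are hypothetical, purely to illustrate the idea):

```java
// Sketch of the hash-bucket idea above: pick the candidate node for a
// RID with a cheap hash, so a membership check touches at most one
// 512-slot node instead of walking the whole linked list.
class RidBucketing {
    static int bucketFor(String rid, int nodeCount) {
        // mask the sign bit so the modulo result is always a valid index
        return (rid.hashCode() & 0x7fffffff) % nodeCount;
    }
}
```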

thanks



Luca Garulli

Nov 18, 2011, 10:11:25 AM
to orient-...@googlegroups.com
On 18 November 2011 14:23, Sylvain Spinelli <sylvain....@gmail.com> wrote:
> Hi Luca,
>
> Perhaps the first embedded node should be different:
> * have a dynamic size in order not to use 5144 bytes for one or two links.

Hi Sylvain,
the first node, the head, will continue to stay inside the parent record as embedded. The max size of the first node could be set to 16 or 32 items. In this way, records with few links will be treated like they are now.
 
> * contain a ref to the last node for efficient appending of links.

Since nodes are not ordered, the add could also go in the first node.

About ordering there are several solutions:
  1. keep all the items always ordered: faster search, but costly inserts + mini-rebalancing
  2. use a Bloom filter on top of each node. Since the node size is fixed, the Bloom filter would be very efficient, with 99.9% good answers
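A minimal per-node Bloom filter sketch (bit count and hash functions are illustrative, not the real design):

```java
import java.util.BitSet;

// Minimal per-node Bloom filter sketch: lets a lookup skip a whole
// 512-item node when the filter answers "definitely not here".
// Bit count and hash functions are illustrative only.
class NodeBloomFilter {
    private static final int BITS = 4096;
    private final BitSet bits = new BitSet(BITS);

    private int[] hashes(String rid) {
        int h = rid.hashCode();
        // mask the sign bit so both indexes are non-negative
        return new int[] { (h & 0x7fffffff) % BITS, ((h * 31) & 0x7fffffff) % BITS };
    }

    void add(String rid) {
        for (int h : hashes(rid)) bits.set(h);
    }

    // false => certainly absent; true => probably present (rare false positives)
    boolean mightContain(String rid) {
        for (int h : hashes(rid)) if (!bits.get(h)) return false;
        return true;
    }
}
```

A filter like this never gives false negatives, which is what makes it safe for skipping nodes during lookups.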
 

> But... the performance will not be so good for removing links: you need to iterate over the entire list.


Yes, although using Bloom filters would reduce the number of checks.
 
> The other way would be to use your MVRBTree as a Set (ordering by RID). (That is what I plan for my own implementation.)
>
> But for this I think we need to rework OMVRBTreePersistent with a delegation pattern to a persistence layer (for the tree properties and for each entry).
> That way, I think we could optimize the MVRBTree performance: for example, when values or keys are of fixed size, the persistence layer could be more efficient than the actual implementation.
> The persistence layer could also "fake the values" in order to implement an efficient Set interface: store only the keys and return the same object for both key and value.

Having an OMVRBTree also for sets solves many problems, but as you say, it could be optimized further for the RID-list case. Furthermore, it could be heavier to keep in memory.
 
> If you're interested, I could help.

It would be great! We could mix both approaches: the root node could be embedded into the parent record to reduce the number of records for collections with few items.

Can you estimate how much time this would require you?
 
> Sylvain

Lvc@

A Richardson

Nov 18, 2011, 10:14:33 AM
to orient-...@googlegroups.com
We would welcome a delay in the stable release to address these issues. Poor performance of non-unique indexes is our most significant issue right now. It's great to hear this is being addressed sooner rather than later.

Thanks,

Alex

Sylvain Spinelli

Nov 18, 2011, 2:50:53 PM
to OrientDB
> Can you estimate how much time this would require you?

I will investigate this weekend and tell you.

Rafael Fernandes

Nov 19, 2011, 5:39:24 AM
to OrientDB
Hi guys, I'm also interested in that solution, thanks for that... but what about the cases where we need to get the latest inserted objects/vertices of a root node? I believe this is quite a common scenario, and still we'd have to iterate through the entire graph to get them out... If we have several linked nodes, how would that be possible?
Thinking about social media streaming, where the latest status updates should be provided to a client as well as their history, always displaying from newest to oldest: how would that work out? Does this solution work in this case?
Thanks in advance... rafa