The relationship store

Niels Hoogeveen

unread,

Mar 23, 2013, 9:44:01 AM3/23/13

to ne...@googlegroups.com

As promised, my three-monthly reminder on the issue of densly connected nodes. This time around, however, I'd like to start the discussion again on the method chosen to solve the problem.

The current relationship store has at least the following three issues:

nodes having many relationships of one type, have slow access to relationships of other types
fetching the relationships between two nodes, requires a scan of all relationships of one of the nodes
relations get fragmented over the store

issue 1:

Since all relationships of a node are organized as one linked list, there is no partitioning of relationships by relationship type. This means that all relationships attached to a node need to be scanned to fetch the relationships of a certain type. This is especially troubling when there are many relationships of one type attached to the node, while trying to fetch relationships of another type. It is acceptable that fetching many relationships takes time, it is not that acceptable that the presence of many relationships of one type, causes fetching of relationships of other types to slow down.

issue 2:

Since the id of the other node attached to the relationship is stored in the relationship record, and relationships are organized as two non-partitioned linked lists, it requires a full scan of the relationships of one of the nodes to fetch the relationships between two nodes. This can be troubling when both nodes have many relationships of the same type.

issue 3:

New relationships are added to the store, either by appending a relationship record to the store, or by reusing the space of a deleted relationship. This leads to fragmentation of relationships. Ideally we would like to have similar relationships stored together, so they can be fetched with one disk operation.

About a year-and-a-half ago, I proposed a solution to solve these three issues. At the same time, a new node store layout was prepared to solve issue 1. This solution is still awaiting QA and may end up in some future release of Neo4j.

At the time, I had a discussion with Mattias about the merits of both solutions. Unfortunately that discussion is no longer available, since the issue on GitHub no longer exists.

Serving from memory: Mattias and I agreed that the solution I proposed solved more problems than his, but was the more involved solution, since it also had more impact on the caching mechanism as well as the relationship store. I agreed that a quicker fix was desirable, given the nuisance of issue 1.

Now that we are more than a year further down the road, I no longer feel that argument applies. The quick fix was never applied and it is still uncertain when it will make it into a release. Therefore, I would like to repropose my original solution.

As I said before, the current relationship store, contains two linked lists of unpartitioned relationships. The record layout looks as follows:

private final long firstNode;
private final long secondNode;
private final int type;
private long firstPrevRel
private long firstNextRel
private long secondPrevRel
private long secondNextRel

The field firstNode, contains the id of the node on the outgoing side of the relationship, the field secondNode contains the id of the node on the incoming side of the relationship. For the incoming side the firstPrevRel and firstNextRel fields span up the list of relationships attached to that node, while the fields secondPrevRel and secondNextRel do the same for the node on the outgoing side of the relationship.

Having two linked lists combined in one record, makes it impossible to sort relationships for both the incoming and the outgoing side, so the first step we need to take is break up the relationships in two parts: the list on the incoming side and the list on the outgoing side.

Now that we have the two lists separated, we can sort the relationship lists. To do so, I proprose a Btree structure, storing the following record:

private final long node;

private final bool direction;

private final int type;

private final long otherNode;

private final long relationship;

Let's look the possible scenarios:

Fetch all relationships of one node, irrespective of direction, type, or the other node attached to the relationships. All those relationships are grouped together the Btree, requiring fewer read operations than the current relationship store layout. (solves issue 3)
Fetch all relationships of one node in one direction. All those relationships are grouped together and require fewer reads of the store. (solves issue 3)
Fetch all relationships of one node for a specific relationship type in both directions. Here we need to do two look ups in the relationship store, one for the incoming side and one for the outgoing side. Other than that, all relationships are further grouped together. (solves issue 1 and issue 3).
Fetch all relationships of one node for in one direction for a specific relationship type. All relationships are grouped together according to this criterion, requiring few reads of the store. (solves issue 1 and issue 3).
Fetch all relationships between two nodes, irrespective of direction and type. In this situation we need to do two lookups in the store for either side and additional lookups for each relationship type. The number of relationships types used with one particular node is relatively small, so we still have acceptable lookup times. (solves issue 2)
Fetch all relationship between two node with a given direction. For each relationship type known in that direction for that node, we need to do a lookup in the store. Again the number of relationship types used is likely to be small, so we still have accaptable lookup time. (solves issue 2)
Fetch all relationships between two nodes with a given relationship type for both directions. We need two lookups in the store, for either side. (solves issue 1, issue 2 and issue 3)
Fetch all relationships between two nodes with a given relationship type and a given direction. Requires one lookup in the store. (solves issue 1, issue 2 and issue 3)

What remains of the relationship record is the bare minimum, just two ids to both nodes contained in the relationship, and an implicit id (position in the file) for lookup of properties.

Altogether, it feel that this solution is preferable to the one now awaiting QA. The argument that this solution would take more time to make it to a release, in my opinion, no longer applies, because the quick solution already takes more than a year-and-a-half anyway. Given the choice between waiting a long time for a suboptimal solution and waiting long for a more optimal solution, I am personally in favor of the latter.

Kind regard,

Niels Hoogeveen

Wes Freeman

unread,

Mar 23, 2013, 12:53:35 PM3/23/13

to ne...@googlegroups.com

Cool ideas. I think this is clearly superior for fetching (especially from disk), but what about updating? What happens when you add more relationships/types to a node? Do you leave padding for adding new items to the btree, and when that gets filled do you move them to the end of the file (a la mongodb)--or how would that work?

I always thought it would be even better if a select property could be organized in a btree per node, sort of like a clustered index in relational database terms. This way, you could efficiently query on your selected property (in addition to relationship type). Optimizing the case where you have a billion relationships coming out of a node, but you only care about the outbound relationships of a certain type, where the timestamp is in a particular 24 hour range. Currently, there's no good way to do that query--you can build in graph indexes and such to reduce the number of nodes from that node, but it sure would be nice to be able to query it simply.

Wes

--
You received this message because you are subscribed to the Google Groups "Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Niels Hoogeveen

unread,

Mar 23, 2013, 3:19:31 PM3/23/13

to ne...@googlegroups.com

Hi Wes,

When adding a relationship to the store, the normal Btree operations would take place. So the index block the relationship will have to fit in is located. If the block is not full it will simply be added. If the block is full, a new block will be created and the normal btree rebalancing procedure will be followed. That way, the relationships will always remain in sorted order, grouped in blocks of similar relationships. New blocks will either be appended to the file, or deleted block will be reused. Since one block contains many relationships (with 2k blocks, 128 relationships can be stored in one block), fragmentation of relationships becomes less of an issue.

Wes Freeman

unread,

Mar 23, 2013, 10:27:26 PM3/23/13

to ne...@googlegroups.com

Ah, so you're saying it should be a btree for the whole graph?--I misunderstood, and thought it was per node or something. I guess if that were the case you wouldn't need to store the first node id every time. Interesting stuff...

More concerns (hopefully you can clear them up):

- why do you need to store direction? is it no longer implied? or is that just part of your data structure and not actually stored?

- doesn't this optimize for relationships in one direction as a btree scan is required for the other direction? or am I misunderstanding how your btree is organized? (should we consider a better multidimensional index that doesn't favor any particular query plan?)

Wes

Wes Freeman

unread,

Mar 23, 2013, 10:43:27 PM3/23/13

to ne...@googlegroups.com

Btw, here is your old github issue, which I'm reading through now: https://github.com/neo4j/neo4j/issues/198

Niels Hoogeveen

unread,

Mar 24, 2013, 8:58:53 AM3/24/13

to ne...@googlegroups.com

I see I wasn't completely clear in my proposal. For each relationship, two entries are stored in the Btree, one for the incoming side and one for the outgoing side. This makes relationships are treated equally irrespective of direction, To make a distinction between the incoming side and the outgoing side, the direction needs to be part of the key.

Michael Hunger

unread,

Mar 24, 2013, 11:15:24 AM3/24/13

to ne...@googlegroups.com

We started a document that covers some of that.

I can share it with you

Sent from mobile device

Niels Hoogeveen

unread,

Mar 24, 2013, 2:45:18 PM3/24/13

to ne...@googlegroups.com

Please do.

Niels Hoogeveen

unread,

Mar 26, 2013, 9:13:12 AM3/26/13

to ne...@googlegroups.com

Hi Michael,

You said you had a document you wanted to share with me.

Niels

On Sunday, March 24, 2013 4:15:24 PM UTC+1, Michael Hunger wrote:

Mattias Persson

unread,

Mar 27, 2013, 4:43:15 PM3/27/13

to Neo4j Development

One objection I have is around a b-tree that is for the whole db. What I would like to investigate more is something like a index/b-tree/block like structure per dense node to not introduce any scalability concerns related to a global index.

2013/3/24 Niels Hoogeveen <nielsh...@gmail.com>

--
Mattias Persson, [mat...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com

Wes Freeman

unread,

Mar 27, 2013, 5:35:47 PM3/27/13

to ne...@googlegroups.com

+1 for Mattias's comment. Using a b-tree for the whole graph's relationships might introduce a performance requirement of fitting all relationships in RAM. I like the idea of the property index per node, but that also introduces some concerns, like how to store the mini-trees together on disk; maybe there's a good way of chunking them?

Wes

Jim Webber

unread,

Mar 28, 2013, 5:32:41 AM3/28/13

to ne...@googlegroups.com

I'd add to this that dense nodes become dense because they're written to a lot.

Any solution for dense nodes should make removing (or minimising) write contention a priority too.

Jim

Mattias Persson

unread,

Mar 28, 2013, 6:30:49 AM3/28/13

to Neo4j Development

2013/3/28 Jim Webber <j...@neotechnology.com>

I'd add to this that dense nodes become dense because they're written to a lot.

Any solution for dense nodes should make removing (or minimising) write contention a priority too.

That is true, good point. However that contention doesn't get worse the denser the nodes gets, it's quite constant. The loading/reading part gets worse and worse, hence the higher priority and focus there imho.

Ian Robinson

unread,

Mar 28, 2013, 7:30:16 AM3/28/13

to ne...@googlegroups.com

I think it's more that the number of writes per second may be higher for a dense node. If you're going to get from 0 to 100,000 relationships on a node in, say, 2 hrs, the contention will be higher than on a node that only attracts 100 new relationships in 2 hrs. Dense nodes have to become dense somehow, and in some scenarios, that can happen in a bursty manner.

ian

UK: +44 7974 144424
Skype: iansrobinson

Niels Hoogeveen

unread,

Mar 28, 2013, 8:37:36 AM3/28/13

to ne...@googlegroups.com

Good point Mattias.

It is of course possible to eliminate the first entry of the key and store the absolute file position of the first index block within the node. That way index blocks can be maintained on a per-node basis.Of course this has to happen for both the incoming direction and the outgoing direction, leading to two file-pointers stored in the node and two root index blocks per node in the relationship-index file.

Niels

Niels Hoogeveen

unread,

Mar 28, 2013, 8:44:42 AM3/28/13

to ne...@googlegroups.com

While we're at it, why not eleminate the direction entry of the key too and have two separate files, one for the incoming sides of the relationship, one for the outgoing side of the relationship, leading to an index key:

private final int type;

private final long otherNode;

private final long relationship;

Mattias Persson

unread,

Mar 31, 2013, 7:02:36 PM3/31/13

to Neo4j Development

Would you mind sketching this up in a nice picture?

Perhaps here? https://docs.google.com/a/neopersistence.com/drawings/d/10cjeetZ26k6hU2fTrAzzmARkUeWOB61YdSszJEuI84c/edit?usp=sharing

2013/3/28 Niels Hoogeveen <nielsh...@gmail.com>

Niels Hoogeveen

unread,

Apr 1, 2013, 8:51:26 PM4/1/13

to ne...@googlegroups.com

Hi Mattias,

I am not the greatest artist in the world, but I tried to sketch the proposed solution in the document you created. Feel free to improve on it.

In words:

Each node record contains a record-id pointing into a store (file) containing index blocks for outgoing relationships, and it has another record-id doing the same for the incoming relationships.

There are two stores of index blocks one for the outgoing relationships, one for the incoming relationships.

The record id's stored in the node record, point to the root index block of the Btree for that node. A store contains multiple Btrees (one for each node having relationships in the associated direction).

The key of the index is:

relationshiptype-id
node-id (the other node of the relationship)
relationship-id

Niels

Mattias Persson

unread,

May 2, 2013, 3:30:49 AM5/2/13

to Neo4j Development

Hi,

thanks for the drawing.

I don't get the whole picture though... does each index entry in the b-tree point to a relationship record in the existing relationship store, or do you envision a new type of relationship store where relationships live together?

2013/4/2 Niels Hoogeveen <nielsh...@gmail.com>

Niels Hoogeveen

unread,

May 7, 2013, 3:33:08 PM5/7/13

to ne...@googlegroups.com

Hi Mattias,

The current relationship record has four fields that become obsolete in this proposal. These fields are: firstPrevRel, firstNextRel, secondPrevRel, secondNextRel. It seems to me that it would be better to start working with a slimmed down relationship file, since we can save 32 byte per relationship entry, a 60% reduction.

Over the last couple of weeks, I have been pondering how to efficiently deal with higher-order relationships and have come to realize that the solution I proposed can be used for this purpose.

Instead of creating a node that models the higher-arity relationship, it is possible to create a path that has the same function. Now the problem with ordinary paths is that we never know that two relationships that are positioned after one another actually belong together. This can be achieved with the solution I described. A path can be stored by using the same relationship-id for all relationships in the path. The relationship record itself can store the first and last node of the path. Instead of storing the direction in the btree records, it is possible to store a byte, which denotes the position in the path. This way binary relations can be used just like they are used now (except being stored in a btree), using 1 for outgoing and -1 for incoming relationships. Higher arity-relationships can be denoted by using numbers other than 1 or -1.

Niels

Niels Hoogeveen

unread,

May 13, 2013, 9:03:02 AM5/13/13

to ne...@googlegroups.com

Hi Mattias,

I hope this answered your question. As you can see, the lookup of the relationship record itself is actually only needed if you want to do a lookup by relationship-id. For all other queries a lookup within the index would suffice, since all information of the relationship record is already contained within the index. Only in the case where the relationship-id is all you know about the relation, would it be necessary to access the relationship store itself.

Niels

Mattias Persson

unread,

May 15, 2013, 4:21:36 PM5/15/13

to Neo4j Development

Hi Niels,

sorry I haven't put many thoughts into this recently, but I've read your replies.

So we'd have each dense node having its own little index in the form of a b-(or+)tree for example. And precisely, the relationship records can be stored directly in the index (it'd have to include the pointer to the first property record as well. I like that very much and thought about perhaps trying out MapDB (the lower-level Engine, which works on record level) and see how it scales. Building such an index from scratch would be great, but may be quite time-consuming.

So it becomes a matter of composing the right index key... and actually each dense node can have different key layout to optimise for how it's used... or even several indexes for different use cases.

2013/5/13 Niels Hoogeveen <nielsh...@gmail.com>

Marko Rodriguez

unread,

May 15, 2013, 4:29:00 PM5/15/13

to ne...@googlegroups.com

Hi,

The model you guys are talking about is called vertex-centric indices and you can learn more about it here:

http://thinkaurelius.com/2012/10/25/a-solution-to-the-supernode-problem/

Blueprints supports this via the VertexQuery interface:

https://github.com/tinkerpop/blueprints/blob/master/blueprints-core/src/main/java/com/tinkerpop/blueprints/VertexQuery.java

Currently, Titan and OrientDB provide native support for it. Be great to see it in Neo4j.

HTH,

Marko.

http://markorodriguez.com

Niels Hoogeveen

unread,

May 18, 2013, 4:19:06 PM5/18/13

to ne...@googlegroups.com

I am looking forward to seeing your experiments, Mattias. Indeed having good index functionality per node opens many new possibilities.

MapDb looks like an interesting product, although seeing that it's a one man's hobby project makes it a bit risky to use, unless of course, Neotech forks the project and helps to grow it into a maintained product.

Reply all

Reply to author

Forward