Best design for edges in this problem


Kirell

Apr 24, 2014, 12:20:03 PM
to spar...@googlegroups.com
Hello, I have been using Dex/Sparksee for quite some time now, but I need advice on the best representation of data for my use case.

Basically, I want to store and compress data on nodes and edges which evolve over some fixed STEPS, like years (but not necessarily time).
I think I have a solution for the nodes: put the data on the edge between the node and its corresponding STEP node. With this technique, I can retrieve all the data over the steps in one request (one explode from the node).

Unfortunately, I cannot use the same technique for edges, since I can't link an edge to another edge.

So I had 3 ideas in mind:
  • Keep only one edge type, put a step identifier on each edge, and create as many edges as needed. The obvious pitfall is that I need to do a selection each time I want to retrieve an edge. Imagine that I have 1000 steps: I will have to sort through 1000 edges times the degree of the node to retrieve my actual neighborhood.
  • Create a new edge type for each STEP. But the number of steps can be huge (1000+), and I am not sure that Sparksee retrieves data efficiently this way. Moreover, if I want to retrieve all the data on an edge for a number of steps, I would have to iterate over the types and combine the results with Objects::Union.
  • Create an EdgeNode between every pair of connected nodes in my graph, so that I create 2 small edges and 1 node instead of just 1 edge. The scheme looks like N -- EN -- N instead of N -- N. Now that I have a node to represent an edge, I can use the same technique as before: link each EdgeNode to the STEP node and store the data on that edge. However, I have to reimplement the Neighbors and Explode primitives to retrieve the target node or edge. Even if an edge does not exist, I still have to do 2 lookups to check its existence (get the EdgeNode, then check whether a link exists with the corresponding STEP).
I think the 3rd solution is superior even if I have to create more objects. Which solution seems best to you?

Thank you very much.




c3po.ac

Apr 25, 2014, 4:23:14 AM
to spar...@googlegroups.com
Hi,

Unfortunately, I can't give you a perfect option. Each solution has its advantages and disadvantages:

1- It's a simple solution, but you have to use Select. To get the neighbors at a certain step, you could do something like this:

// Get the edges from all the steps
Objects totalExplode = graph.explode(theNode, theEdgeType, EdgesDirection.Any);

// Restrict the select to the explode result
Objects stepExplode = graph.select(theEdgeStepAttribute, Condition.Equal, theStepValue, totalExplode);
totalExplode.close();

// Get the other nodes from the edges
Objects stepNeighbors = graph.heads(stepExplode); // Or graph.tails if the edges are incoming
stepExplode.close();

It's not the most efficient, but it's easy to implement and you don't have to do the sort yourself.

2- This may be very efficient for some operations, but not so good for others. And I cannot recommend creating a huge number of edge types that will grow even more in the future. Expanding the schema requires lots of data structures to keep the information available, and some of them must be cached at all times. That being said, it should work if you have enough memory.
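
To make the trade-off of option 2 concrete, here is a minimal sketch in plain Java collections. It is not the Sparksee API: the class PerStepTypeSketch and all of its methods are hypothetical names, and each map entry simply stands in for one per-step edge type. It shows why a single-step lookup is cheap, while a query over a range of steps has to visit every type and union the results.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of "one edge type per STEP" (not Sparksee code).
class PerStepTypeSketch {
    // step -> adjacency of the edge type dedicated to that step
    private final TreeMap<Integer, Map<Long, Set<Long>>> adjacencyByStep = new TreeMap<>();

    // Adds an undirected edge of the type that belongs to the given step.
    void addEdge(int step, long a, long b) {
        Map<Long, Set<Long>> adjacency =
                adjacencyByStep.computeIfAbsent(step, s -> new HashMap<>());
        adjacency.computeIfAbsent(a, k -> new TreeSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new TreeSet<>()).add(a);
    }

    // Neighborhood at one step: a single lookup in a single type.
    Set<Long> neighborsAt(int step, long node) {
        Map<Long, Set<Long>> adjacency = adjacencyByStep.get(step);
        if (adjacency == null) {
            return Collections.emptySet();
        }
        return adjacency.getOrDefault(node, Collections.emptySet());
    }

    // Neighborhood over a range of steps: one lookup per type, then a union.
    Set<Long> neighborsOverSteps(long node, int fromStep, int toStep) {
        Set<Long> union = new TreeSet<>();
        for (Map<Long, Set<Long>> adjacency
                : adjacencyByStep.subMap(fromStep, true, toStep, true).values()) {
            union.addAll(adjacency.getOrDefault(node, Collections.emptySet()));
        }
        return union;
    }

    // The "schema" grows by one type for every step that has at least one edge.
    int edgeTypeCount() {
        return adjacencyByStep.size();
    }
}
```

In this model, edgeTypeCount() grows with the number of steps, which mirrors the schema growth that option 2 would cause.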

3- It's more complex, but I think that's the best solution too.
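
As a rough illustration of the N -- EN -- N layout, here is a sketch that models it with plain Java collections rather than the Sparksee API. The class EdgeNodeSketch, its methods, the EdgeNode id range, and the String payload are all assumptions made for the example. It shows the reimplemented neighbor lookup (two hops plus a step check) and the two lookups needed to test whether an edge exists at a given step.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of the EdgeNode design (not Sparksee code).
class EdgeNodeSketch {
    // node -> EdgeNodes attached to it (the two "small" edges N -- EN)
    private final Map<Long, Set<Long>> edgeNodesOf = new HashMap<>();
    // EdgeNode -> the two nodes it connects
    private final Map<Long, long[]> endpoints = new HashMap<>();
    // EdgeNode -> (step -> data stored on the EN -- STEP edge)
    private final Map<Long, Map<Integer, String>> stepData = new HashMap<>();
    private long nextEdgeNodeId = 1_000_000L; // arbitrary id range for EdgeNodes

    // Creates the EdgeNode that stands for the logical edge a -- b.
    long connect(long a, long b) {
        long en = nextEdgeNodeId++;
        endpoints.put(en, new long[] {a, b});
        edgeNodesOf.computeIfAbsent(a, k -> new TreeSet<>()).add(en);
        edgeNodesOf.computeIfAbsent(b, k -> new TreeSet<>()).add(en);
        stepData.put(en, new HashMap<>());
        return en;
    }

    // Stores data on the link between an EdgeNode and a step.
    void putData(long edgeNode, int step, String data) {
        stepData.get(edgeNode).put(step, data);
    }

    // Reimplemented "Neighbors": hop to each EdgeNode, keep it only if it is
    // linked to the requested step, then hop to the node on the other side.
    Set<Long> neighbors(long node, int step) {
        Set<Long> result = new TreeSet<>();
        for (long en : edgeNodesOf.getOrDefault(node, Collections.emptySet())) {
            if (!stepData.get(en).containsKey(step)) {
                continue;
            }
            long[] ends = endpoints.get(en);
            result.add(ends[0] == node ? ends[1] : ends[0]);
        }
        return result;
    }

    // Existence check in two lookups: find the EdgeNode joining a and b,
    // then check whether it carries data for the step.
    Optional<String> dataAt(long a, long b, int step) {
        for (long en : edgeNodesOf.getOrDefault(a, Collections.emptySet())) {
            long[] ends = endpoints.get(en);
            boolean joins = (ends[0] == a && ends[1] == b)
                    || (ends[0] == b && ends[1] == a);
            if (joins) {
                return Optional.ofNullable(stepData.get(en).get(step));
            }
        }
        return Optional.empty();
    }
}
```

The extra EdgeNode doubles the number of hops for every traversal, which is the cost paid for being able to attach per-step data to what is logically an edge.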

Best regards.

On Thursday, April 24, 2014 at 18:20:03 UTC+2, Kirell wrote:

Kirell

Apr 25, 2014, 4:29:33 AM
to spar...@googlegroups.com
All right, I'll stick with the third solution for now.

Thank you,

Best