Re: [TinkerPop] Calculate path score - help with query

386 views
Skip to first unread message

Marko Rodriguez

unread,
Oct 10, 2012, 10:13:24 AM10/10/12
to gremli...@googlegroups.com
Hi Barbara,

I'm trying to optimize my code, but I'm newbie in Gremlin and I hope you can help me.

Your code below is very clean. You are using all the right techniques.

g.v('#6:0').as('startPoint').outE('knows').inV.loop('startPoint'){it.loops<=3 && !(it.object.name in ['A','B'])}.has('name', 'B').path

This query gives me a list of paths (I'm using gremlin in Java with orientDB) but I just need the path score, which is given by the sum of edges' weights. Right now I'm doing this by iterating all the paths and sum the edges' weights, but this is too heavy because I have a lot of paths, nodes, etc. So I'm trying to calculate the weights as I traverse the graph using gremlin, but I can't figure it out. Can you help me with the query?

Will this work for you?

g.v('#6:0').as('startPoint').outE('knows').inV.loop('startPoint'){it.loops<=3 && !(it.object.name in ['A','B'])}.has('name', 'B').path.transform{
it.findAll{ it instanceof Edge}.collect{ it.weight }.sum()
}

Using the toy TinkerGraph:

gremlin> g.v(1).outE.inV.outE.inV.path.transform{it.findAll{it instanceof Edge}} ==>[e[8][1-knows->4], e[10][4-created->5]] ==>[e[8][1-knows->4], e[11][4-created->3]]

gremlin> g.v(1).outE.inV.outE.inV.path.transform{it.findAll{it instanceof Edge}.collect{it.weight}} ==>[1.0, 1.0] ==>[1.0, 0.4]

gremlin> g.v(1).outE.inV.outE.inV.path.transform{it.findAll{it instanceof Edge}.collect{it.weight}.sum()} ==>2.0 ==>1.4000000059604645


...if you want the path and weight together:

gremlin> g.v(1).outE.inV.outE.inV.path.transform{[it,it.findAll{it instanceof Edge}.collect{it.weight}.sum()]} 
==>[[v[1], e[8][1-knows->4], v[4], e[10][4-created->5], v[5]], 2.0] 
==>[[v[1], e[8][1-knows->4], v[4], e[11][4-created->3], v[3]], 1.4000000059604645]


HTH,
Marko.




Bárbara Furtado

unread,
Oct 10, 2012, 1:19:34 PM10/10/12
to gremli...@googlegroups.com
Hi Marko,
Thank you so much for your quick response. It works, but I'm still with the performance problem. It takes too much time running all the queries, because I want to find the paths between too many nodes. I'm wondering if it is possible to have a list of end nodes and find all the paths in one query.
In my query, I'm filtering the end node with .has('name', B). Can I filter more end nodes in the same query?

I'm executing the gremlin query this way in Java:

db.command(new OCommandGremlin(query)).execute()

Is this efficient? Or is it better to use Pipes in Java?

Thanks again.

Marko Rodriguez

unread,
Oct 10, 2012, 1:50:30 PM10/10/12
to gremli...@googlegroups.com
Hi,

Does, db.command() return an iterable? The best you can do is iterate. :/

Next, in terms of .has('name',B). 

You can do this:

.filter{it.name in [A,B,C,D]} (for ORing)

There is a ticket in Blueprints to make has() do this ORing.


Hope that helps,
Marko.


--
 
 

Bárbara Furtado

unread,
Oct 16, 2012, 6:34:26 AM10/16/12
to gremli...@googlegroups.com
Hi again Marko,

Sorry if I am asking something too basic, but I really need your help on this.

As I said before, I'm using OrientDB to store a graph and need to find all the paths that exist between a starting node and a set of ending nodes, within a maximum distance of 3 relations.

I need to perform these traversals very efficiently, because I have to do a lot of them in a short period of time. So I would like to know if it is possible to build a Gremlin/Pipes (I'm using GremlinPipeline now, in Java) query that receives the starting node and the set of ending nodes and returns the paths found.

To exemplify my problem, consider the following undirected sub-graph: https://dl.dropbox.com/u/999549/graph-example.png . If my starting node was A and the ending nodes were B, C and D, then the traversal should return the following paths:

A > 1 > 2 > B
A > C
A > 3 > D
A > C > 4 > D

I think that with something like this, I can store the current path in a list when it passes in some of the ending nodes, but I can't figure it out.
List<Double> x = new ArrayList<Double>();
new GremlinPipeline(g.getVertex(1)).outE().store(x, new PipeFunction<Edge,Double>() {
  public Double compute(Edge e) {
    return (Double) e.getProperty("weight");
  }
}

Thank you in advance.

Best regards,
Bárbara

On Wednesday, October 10, 2012 3:13:36 PM UTC+1, Marko A. Rodriguez wrote:

Marko Rodriguez

unread,
Oct 17, 2012, 2:58:38 PM10/17/12
to gremli...@googlegroups.com
Hi,

I need to perform these traversals very efficiently, because I have to do a lot of them in a short period of time. So I would like to know if it is possible to build a Gremlin/Pipes (I'm using GremlinPipeline now, in Java) query that receives the starting node and the set of ending nodes and returns the paths found.

Yes, you can, but you are iterating results.

For example:

for(List path : pipe) {
...
}

If you want to have the top weighted, you will have to aggregate all and sort. You can't sort without knowing the entire set.

I'm not too familiar with all the OrientDB specifics, but in Gremlin/Pipes, you simply write your expression and iterate results. This is very fast.

Hope that helps,
Marko.

Reply all
Reply to author
Forward
0 new messages