paths between nodes

344 views
Skip to first unread message

Boris Kizelshteyn

unread,
Aug 18, 2011, 10:54:46 PM8/18/11
to gremli...@googlegroups.com
This seems to come up from time to time both here and on the neo4j list, but I actually still can't figure it out. How do I determine undirected paths between two nodes? I have seen the example on the wiki and that is fine if the relationships flow is one directional, but what if I need to get an unknown path between node 1 and node 3 where node one "likes" node 2 and node 3 "isnear" node 2?

So for example, I do this: 

g.v(1).as('x').outE.inV.inE.outV.loop('x'){it.loops < 4 & it.object.id !=3}[[id:3]].paths

but it is very slow and hangs up when there is alot of data to traverse, is there something I should be doing differently?

Thanks in advance!

Peter Neubauer

unread,
Aug 19, 2011, 2:33:00 AM8/19/11
to gremli...@googlegroups.com
Boris,
the basic technique we use in the Neo4j graph algos fro this is to go
out from both starting points and interleave the traversals in order
to determine if there is a match between the current endpoints of the
paths from both directions. In Gremlin, you start from a single source
and thus the paths are getting longer.

I don't know if you can use the pipes approach for this kind of algos
with best performance.

Also, you could have your query shorter and use "both" as direction
maybe? g.v(1).out.in sounds not like an undirected traversal.
Something like

g.v(1).as('x').both.loop('x'){it.loops < 4 & it.object.id
!=3}[[id:3]].paths maybe?

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

Pierre De Wilde

unread,
Aug 19, 2011, 4:44:10 AM8/19/11
to gremli...@googlegroups.com
Hey,

As already suggested, the fastest way to find paths between 2 nodes is by using the native Neo4j GraphAlgoFactory.

Unfortunately, other graph databases may not provide such algorithms natively. 
Although Tinkerpop doesn't provide such algorithms yet, you may always tinker by yourself, as you have done.

Assuming A is the id of the starting node, B the id of the ending node and N the maximum number of loops:

To find all paths in a directed graph:

g.v(A).out.loop(1){it.loops<=N && !(it.object.id in [A,B])}[[id:B]].paths

To find all paths in an undirected graph:

g.v(A).both.loop(1){it.loops<=N && !(it.object.id in [A,B])}[[id:B]].paths

If [[id:B]] is too cryptic, replace it with .filter{it.id==B}


HTH,
Pierre

Boris Kizelshteyn

unread,
Aug 19, 2011, 8:34:39 AM8/19/11
to gremli...@googlegroups.com
Thanks so much all! I have been working with the algos via rest, but have been hitting 2 problems:

1. Dijkstar alogo crashes (it becomes unresponsive and I have to restart) my db and it only as about 16k nodes

2. shotest and all paths works well, except that I then have to go back and look up all the info for each relationship and node which makes it really intensive just to query the paths between 2 nodes.

So I was thinking gremlin would be best because I can return the data structure I want from the initial query.

Thanks!

Peter Neubauer

unread,
Aug 19, 2011, 8:46:08 AM8/19/11
to gremli...@googlegroups.com
Boris,
16k nodes should not be a problem for a Dijkstra. Can you replicate
the problem in an isolated test in some form?

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

Boris Kizelshteyn

unread,
Aug 19, 2011, 10:14:05 AM8/19/11
to gremli...@googlegroups.com
this is what I'm doing:

http> POST /db/data/node/6098/paths {"to":"/node/18","relationships":{"type":"FB_FRIEND"},"algorithm":"dijkstra","cost_property":"cost","default_cost":1}

It just turns and turns and eats up all the resources and I have to kill neo4j. 

Peter Neubauer

unread,
Aug 19, 2011, 10:58:34 AM8/19/11
to gremli...@googlegroups.com

Boris,
Will check it out next week, if that is ok?

/peter

Sent from my phone.

Boris Kizelshteyn

unread,
Aug 19, 2011, 10:59:44 AM8/19/11
to gremli...@googlegroups.com
Whenever you can help! Thanks a million

Boris Kizelshteyn

unread,
Aug 19, 2011, 12:25:48 PM8/19/11
to gremli...@googlegroups.com
Hey Pierre,

I'm not sure if I am doing something wrong or if my installation is screwed up in some way

1. I have two nodes 6098 and 18, they are connected by one hop:

  • gremlin> g.v(6098).outE().inV().filter{it.id==18}.paths()
  • ==> [v[6098], e[20626][6098-FRIEND->18], v[18]]

2. When I try the directed path finding algo with a 4 degree loop:

g.v(6098).out().loop(1).{it.loops<=2 && !(it.object.id in [6098,18]).filter{it.id==18}.paths

it returns no data and my gremlin session hangs up (from neo4j console)

3. if I increase loop to 5 or more neo4j becomes unresponsive and need sto be restarted.

Any idea what's happening?

Thanks!

Peter Neubauer

unread,
Aug 19, 2011, 12:33:29 PM8/19/11
to gremli...@googlegroups.com
Boris,
I think this is a question for the neo4j group rather than Tinkerpop ...

/peter


On Friday, August 19, 2011, Boris Kizelshteyn <boris.ki...@popcha.com> wrote:
> Hey Pierre,
> I'm not sure if I am doing something wrong or if my installation is screwed up in some way
> 1. I have two nodes 6098 and 18, they are connected by one hop:
>
> gremlin> g.v(6098).outE().inV().filter{it.id==18}.paths()
> ==> [v[6098], e[20626][6098-FRIEND->18], v[18]]
>
> 2. When I try the directed path finding algo with a 4 degree loop:
> g.v(6098).out().loop(1).{it.loops<=2 && !(it.object.id in [6098,18]).filter{it.id==18}.paths
> it returns no data and my gremlin session hangs up (from neo4j console)
>
> 3. if I increase loop to 5 or more neo4j becomes unresponsive and need sto be restarted.
> Any idea what's happening?
> Thanks!
> On Fri, Aug 19, 2011 at 4:44 AM, Pierre De Wilde <pierre...@gmail.com> wrote:
>
> Hey,
> As already suggested, the fastest way to find paths between 2 nodes is by using the native Neo4j GraphAlgoFactory <http://api.neo4j.org/current/org/neo4j/graphalgo/GraphAlgoFactory.html>.
--

Pierre De Wilde

unread,
Aug 19, 2011, 12:51:08 PM8/19/11
to gremli...@googlegroups.com
Hey Boris,

Double-check your statement:

g.v(6098).out().loop(1).{it.loops<=2 && !(it.object.id in [6098,18]).filter{it.id==18}.paths

g.v(6098).out().loop(1).{it.loops<=2 && !(it.object.id in [6098,18]).filter{it.id==18}.paths

                    ---^                                       ----^ missing }

The correct syntax is:

g.v(6098).out().loop(1){it.loops<=2 && !(it.object.id in [6098,18])}.filter{it.id==18}.paths

Also note that () is not necessary for in, out, both, inE, inV, ...

g.v(6098).out.loop(1){it.loops<=2 && !(it.object.id in [6098,18])}.filter{it.id==18}.paths

HTH,
Pierre

Boris Kizelshteyn

unread,
Aug 19, 2011, 12:59:51 PM8/19/11
to gremli...@googlegroups.com
Thanks a million! And thanks everyone for being patient as I learn!

FTIW: I saw a post from Marko that said that adding () reduces reflection time, so I have been habituating to it ;)

Boris Kizelshteyn

unread,
Aug 19, 2011, 12:42:11 PM8/19/11
to gremli...@googlegroups.com
Agreed ... I got my lists confused ;) appologies

Marko Rodriguez

unread,
Aug 23, 2011, 10:57:51 AM8/23/11
to gremli...@googlegroups.com
Hi,

FTIW: I saw a post from Marko that said that adding () reduces reflection time, so I have been habituating to it ;)

This is true.

Good luck,
Marko.


Reply all
Reply to author
Forward
0 new messages