Select intermediate nodes between two nodes

Davide

Jan 9, 2015, 11:43:32 AM
to gremli...@googlegroups.com
Hi,
I'm struggling with the following problem:

1. I have a node A that is connected to some nodes B1, B2, B3, and then I have some other nodes C, D, E that are connected to some of the nodes Bx (x = 1, 2, 3, ...).
So A can be connected to each of the other nodes C, D, E through a subset of B1, B2, B3. I'd like to identify, for each node connected to A in this way, the subset of B nodes. Something like: C={B1,B3}, D={B2}, E={B1,B2}.

I spent a couple of days on this but I haven't been able to obtain anything close to the expected result. Any help or suggestion would be much appreciated.


In addition to this I'm trying to solve the following point:

2. I'd like to filter the above result, keeping only the nodes connected to A through a specific subset of B nodes (e.g. only C={B1,B3} and E={B1,B2}, and not D, as it's connected only through B2).

I'm trying to solve this in Java, but any help in Groovy would also be much appreciated!


Thanks

Davide

Daniel Kuppitz

Jan 9, 2015, 1:42:53 PM
to gremli...@googlegroups.com
Hi,

to answer your first question, this is your sample graph:

g = TinkerGraph.open()
A = g.addVertex(id, "A")
B1 = g.addVertex(id, "B1")
B2 = g.addVertex(id, "B2")
B3 = g.addVertex(id, "B3")
C = g.addVertex(id, "C")
D = g.addVertex(id, "D")
E = g.addVertex(id, "E")
A.addEdge("link", B1)
A.addEdge("link", B2)
A.addEdge("link", B3)
B1.addEdge("link", C)
B3.addEdge("link", C)
B2.addEdge("link", D)
B1.addEdge("link", E)
B2.addEdge("link", E)

And here's the query you're looking for:

gremlin> A.out("link").as("a").out("link").as("b").select().group().by {it.get("b")}.by {it.get("a")}
==>[v[C]:[v[B1], v[B3]], v[D]:[v[B2]], v[E]:[v[B1], v[B2]]]
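
Since the question mentions Java, here is a minimal plain-Java sketch of the same grouping. It assumes the graph is held as an adjacency map of outgoing "link" edges instead of a TinkerPop graph, and the names `IntermediateNodes`, `groupByTarget` and `sampleGraph` are made up for illustration; none of them is TinkerPop API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of TinkerPop.
final class IntermediateNodes {

    // For every node two hops away from `start`, collect the
    // intermediate nodes through which it is reached.
    static Map<String, List<String>> groupByTarget(Map<String, List<String>> out, String start) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String middle : out.getOrDefault(start, List.of())) {
            for (String target : out.getOrDefault(middle, List.of())) {
                result.computeIfAbsent(target, k -> new ArrayList<>()).add(middle);
            }
        }
        return result;
    }

    // The sample graph above as an adjacency map (assumed representation).
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> out = new LinkedHashMap<>();
        out.put("A", List.of("B1", "B2", "B3"));
        out.put("B1", List.of("C", "E"));
        out.put("B2", List.of("D", "E"));
        out.put("B3", List.of("C"));
        return out;
    }

    public static void main(String[] args) {
        // prints {C=[B1, B3], E=[B1, B2], D=[B2]}
        System.out.println(groupByTarget(sampleGraph(), "A"));
    }
}
```

The two nested loops play the role of the two out("link") steps, and the map plays the role of the group step keyed by the third-level node.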

Regarding the 2nd question: Simply add an except (or retain) step after the first out step.

Cheers,
Daniel




Davide

Jan 12, 2015, 9:11:46 AM
to gremli...@googlegroups.com
Hi, 

thanks for your answer.

Could you please clarify how to use the except/retain step in this scenario? It's not clear to me how to apply them to filter the result of the groupBy so that it keeps only the keys that are mapped to a specific subset of values.

Many thanks,
Davide

Daniel Kuppitz

Jan 12, 2015, 9:19:11 AM
to gremli...@googlegroups.com
It depends on the subset - you either want to keep B1 and B3 or filter out B2:

gremlin> A.out("link").retain([B1,B3]).as("a").out("link").as("b").select().group().by {it.get("b")}.by {it.get("a")}
==>[v[C]:[v[B1], v[B3]], v[E]:[v[B1]]]
gremlin> A.out("link").except(B2).as("a").out("link").as("b").select().group().by {it.get("b")}.by {it.get("a")}
==>[v[C]:[v[B1], v[B3]], v[E]:[v[B1]]]
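
For comparison, a rough plain-Java sketch of what the retain step does here, again assuming an adjacency map instead of a TinkerPop graph. `RetainSketch` and `groupVia` are hypothetical names. The filter is applied to the second-level nodes before the grouping, which is why E ends up with B1 only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of TinkerPop.
final class RetainSketch {

    // Group third-level nodes by the second-level nodes leading to them,
    // but only walk through second-level nodes contained in `keep`.
    static Map<String, List<String>> groupVia(Map<String, List<String>> out,
                                              String start, Set<String> keep) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String middle : out.getOrDefault(start, List.of())) {
            if (!keep.contains(middle)) {
                continue; // analogous to retain([...]); except is the complement
            }
            for (String target : out.getOrDefault(middle, List.of())) {
                result.computeIfAbsent(target, k -> new ArrayList<>()).add(middle);
            }
        }
        return result;
    }

    // The sample graph from this thread as an adjacency map (assumed representation).
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> out = new LinkedHashMap<>();
        out.put("A", List.of("B1", "B2", "B3"));
        out.put("B1", List.of("C", "E"));
        out.put("B2", List.of("D", "E"));
        out.put("B3", List.of("C"));
        return out;
    }

    public static void main(String[] args) {
        // prints {C=[B1, B3], E=[B1]}
        System.out.println(groupVia(sampleGraph(), "A", Set.of("B1", "B3")));
    }
}
```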

Cheers,
Daniel


Davide

Jan 12, 2015, 10:55:26 AM
to gremli...@googlegroups.com
Thanks for your answer. Unfortunately my scenario is a bit more complex:

I don't have a specific set of nodes that I want to keep or exclude. I just want to keep a node at the 3rd level (C, D, ...) only if it is reached through at least two 2nd-level nodes from the set (B1,B2,B3). So (B1,B2) is OK, as is (B1,B3,B4), but (B1,B4) is not.

The main problem is that retain/except seem to work as a filter on the 2nd-level nodes connected to the 1st-level node. What I'm trying to do instead is filter on the 2nd-level nodes that the 1st-level and 3rd-level nodes have in common.

Is it possible to use and/or conditions with the retain/except functions? Something like: retain([B1,B2] or [B1,B3] or [B2,B3])?

Unfortunately it's not even easy to explain in plain words :)

Thanks,
Davide

Daniel Kuppitz

Jan 12, 2015, 12:21:08 PM
to gremli...@googlegroups.com
> only if at the 2nd level I have at least two nodes in the set

Then this might help:

gremlin> A.out("link").as("a").out("link").as("b").select().group("x").by {it.get("b")}.by {it.get("a")}.cap().sideEffect {
gremlin>   def x = it.sideEffects("x")
gremlin>   it.sideEffects("x", x.grep { it.value.size() >= 2 }.collectEntries())
gremlin> }.cap("x")
==>[v[C]:[v[B1], v[B3]], v[E]:[v[B1], v[B2]]]

Or, if you no longer need a traversal, simply do this:

gremlin> A.out("link").as("a").out("link").as("b").select().group().by {it.get("b")}.by {it.get("a")}.next().grep { it.value.size() >= 2 }
==>v[C]={v[B1]=1, v[B3]=1}
==>v[E]={v[B1]=1, v[B2]=1}
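
The same post-filter can be sketched in plain Java, assuming the grouped result is already available as a map from third-level node to its list of second-level nodes. `SizeFilterSketch` and `keepAtLeast` are hypothetical names used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of TinkerPop.
final class SizeFilterSketch {

    // Keep only the entries whose value list has at least `min` elements.
    static Map<String, List<String>> keepAtLeast(Map<String, List<String>> grouped, int min) {
        Map<String, List<String>> kept = new LinkedHashMap<>();
        grouped.forEach((target, middles) -> {
            if (middles.size() >= min) {
                kept.put(target, middles);
            }
        });
        return kept;
    }

    public static void main(String[] args) {
        // Grouped result for the sample graph, written out by hand.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        grouped.put("C", List.of("B1", "B3"));
        grouped.put("D", List.of("B2"));
        grouped.put("E", List.of("B1", "B2"));

        // prints {C=[B1, B3], E=[B1, B2]}
        System.out.println(keepAtLeast(grouped, 2));
    }
}
```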

Cheers,
Daniel


Davide

Jan 13, 2015, 12:16:25 PM
to gremli...@googlegroups.com
Thanks Daniel for the tip.

Unfortunately I haven't been able to make it work with the grep function (maybe because I'm using TinkerPop2), and the logic to filter the nodes at the 2nd level is more complex than this.

Anyway, I solved the problem by applying a transformation to the result of the groupBy, where I can apply the logic to discard the results I don't need. I've written it in Java as it's easier for me, and now it's working properly. I'm just a bit concerned about the performance of this solution, especially the memory usage (I have thousands of nodes with just 1 shared node, but only a few that share 2 or more), but I don't see any better solution at the moment. Any other tips for improvement are very welcome :)
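
One option that might be worth measuring, sketched in plain Java under assumptions: count the shared nodes per 3rd-level node in a first pass, and build the lists in a second pass only for the nodes that reach the threshold. The sketch assumes an adjacency map instead of a TinkerPop graph and at most one edge between any pair of nodes (parallel edges would inflate the counts). `TwoPassSketch` and `sharedAtLeast` are hypothetical names, and whether this actually saves memory in a real setup is untested.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of TinkerPop.
final class TwoPassSketch {

    static Map<String, List<String>> sharedAtLeast(Map<String, List<String>> out,
                                                   String start, int min) {
        List<String> middles = out.getOrDefault(start, List.of());

        // Pass 1: keep only one counter per 3rd-level node.
        Map<String, Integer> counts = new HashMap<>();
        for (String middle : middles) {
            for (String target : out.getOrDefault(middle, List.of())) {
                counts.merge(target, 1, Integer::sum);
            }
        }

        // Pass 2: build lists only for nodes that reached the threshold.
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String middle : middles) {
            for (String target : out.getOrDefault(middle, List.of())) {
                if (counts.get(target) >= min) {
                    result.computeIfAbsent(target, k -> new ArrayList<>()).add(middle);
                }
            }
        }
        return result;
    }

    // The sample graph from this thread as an adjacency map (assumed representation).
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> out = new LinkedHashMap<>();
        out.put("A", List.of("B1", "B2", "B3"));
        out.put("B1", List.of("C", "E"));
        out.put("B2", List.of("D", "E"));
        out.put("B3", List.of("C"));
        return out;
    }

    public static void main(String[] args) {
        // prints {C=[B1, B3], E=[B1, B2]}
        System.out.println(sharedAtLeast(sampleGraph(), "A", 2));
    }
}
```

The trade-off is a second walk over the same edges in exchange for never materialising lists for the many nodes that share only one node.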

Cheers,
Davide