Repeating a traversal on multiple time based edges to find specific vertices

26 views
Skip to first unread message

Kris Leonard

unread,
Feb 22, 2024, 9:36:41 PMFeb 22
to Gremlin-users
In my graph I have vertices joined by edges with a timestamp property and over time there may be replications of those edges with different timestamp property values. Like for example ("t" being the timestamp property on the edges) : 

g.addV('a').property(T.id, 'a1') \ .addV('a').property(T.id, 'a2') \ .addV('a').property(T.id, 'a3') \ .addV('b').property(T.id, 'b2') \ .addV('b').property(T.id, 'b3') \ .addV('c').property(T.id, 'c3') \ .addV('d').property(T.id, 'd1') \ .addE('ad').from_(__.V('a1')).to(__.V('d1')).property('t', 1).property('a', 'y') \ .addE('ad').from_(__.V('a1')).to(__.V('d1')).property('t', 2).property('a', 'y') \ .addE('ab').from_(__.V('a2')).to(__.V('b2')).property('t', 1).property('a', 'y') \ .addE('bd').from_(__.V('b2')).to(__.V('d1')).property('t', 1).property('a', 'y') \ .addE('ab').from_(__.V('a3')).to(__.V('b3')).property('t', 1).property('a', 'y') \ .addE('bc').from_(__.V('b3')).to(__.V('c3')).property('t', 1).property('a', 'y') \ .addE('cd').from_(__.V('c3')).to(__.V('d1')).property('t', 1).property('a', 'y') \ .iterate() 

In this example we end up with this a1 -> d1 (at "t" 1), a1 -> d1 (at "t" 2), a2 -> b2 -> d1, a3 -> b3 -> c3 -> d1. For the a1 -> d1 relationship there are two edges: one with a "t" property of 1 and the other with a "t" property of 2. 

If I was starting at node d1 what would the query to get all of the "a" nodes where the edges have a max "t" value between the edges for two vertices at or before a certain time? So for example I want to traverse from d1 to get all of the associated "a" nodes where the "t" value on the edges is less than or equal to 3 and I only want to traverse those edges between those vertices where the edge has the maximum "t" value that is less than or equal to 3.

I have tried a few things with a repeat().until() but I am not sure how in the repeat() to limit the list of traversed edges to be just one edge per associated vertex. Limiting the in bound edges in this example to be just one and sorted by the edge "t" property would just results in only one edge being traversed (a1 -> d1 at "t" 2). Instead I would like to continuously traverse to all of the vertexes on the latest "t" edge for the associated vertex until all of the "a" vertexes are found. So something like on the first loop a1 -> d1 (at "t" 2), b2 -> d1, c3 -> d1, second loop a2 -> b2, b3 -> c3, and final loop a3 -> b3. 

Please let me know if that doesn't make sense and I can provide more clarity. Thanks for any help and information!

Cole Greer

unread,
Feb 23, 2024, 5:25:47 PMFeb 23
to Gremlin-users
Hi Kris,

I believe the traversal you need would be something similar to

gremlin> g.V("d1").inE().has("t", P.lte(3)).outV().hasLabel("a")
==>v[a1]
==>v[a1]

Note in this case it returns a1 twice as it is accessible from d1 via 2 different edges. You could add a dedup() at the end if you only want unique results.

Here is a quick breakdown of each of these steps:

g.V("d1")    //start at vertex "d1"
.inE()    //traverse to all incoming edges
.has("t", P.lte(3))    //filter out edges all edges which do not have a property "t" which is <= 3
.outV()    traverse to source vertex for each edge
.hasLabel("a")    //filter out all vertices which do not have the label "a"

Let me know if that works for you or if you have any further questions.

Regards,

Cole Greer

prashant upadhyay

unread,
Feb 23, 2024, 5:33:14 PMFeb 23
to Gremlin-users
To get all the paths from "d" to "a" wherin the "t" property of edge was lesser than given value. You can use below query. 

gremlin> g.V().
......1>   hasLabel('d').as('d').
......2>   repeat(__.inE().has('t', P.lt(3)).outV()).
......3>     until(__.has(T.label, "a")).
......4>     as('a').
......5>   path()

==>[v[d1],e[12][a1-ad->d1],v[a1]]
==>[v[d1],e[13][a1-ad->d1],v[a1]]
==>[v[d1],e[15][b2-bd->d1],v[b2],e[14][a2-ab->b2],v[a2]]
==>[v[d1],e[18][c3-cd->d1],v[c3],e[17][b3-bc->c3],v[b3],e[16][a3-ab->b3],v[a3]]

This should help with your question in:

> Instead I would like to continuously traverse to all of the vertexes on the latest "t" edge for the associated vertex until all of the "a" vertexes are found

However, now you want to limit the output "a" vertex which has maximum incoming t. This part is not very clear to me. Do you want the "t" property of the last incoming edge to the "a" vertex or in general for every edge in the path from d to this "a" vertex?

If the question is limited to last incoming vertex to "a" then you can use: 

gremlin> g.V().
......1>   hasLabel('d').as('d').
......2>   repeat(__.inE().has('t', P.lt(3)).outV()).until(__.has(T.label, "a")).
......3>   group().
......4>     by().
......5>     by(
......6>       outE().
......7>       order().by('t', desc).
......8>       limit(1).
......9>       project('edge', 'tValue').by().by('t')).
.....10>   unfold()

==>v[a1]={edge=e[13][a1-ad->d1], tValue=2}
==>v[a2]={edge=e[14][a2-ab->b2], tValue=1}
==>v[a3]={edge=e[16][a3-ab->b3], tValue=1}
Reply all
Reply to author
Forward
0 new messages