dijkstra and A* like algorithm implementation with gremlin

272 views
Skip to first unread message

Yueming Duan

unread,
Oct 30, 2019, 6:49:50 PM10/30/19
to Gremlin-users
Hi, Gremlin Folks,

I started my gremlin adventure most recently, and pretty new in this area.... tried a lot to find implementation related to bfs/dfs/shortest path/dijkstra/A*. following are some references, sorting by relevance to my problem:


More specifically with my problem, i have a (directed) properties graph, where each vertex and edge has a weight/score in the range of [0, 1.0).

1 Start from a set of vertices S1 in the graph, with initial relevance score of 1.0,
2 expanding the vertices following the out edges from S1, each visited vertex will have a relevance score which is calculated in some way along the path, based on the vertex and edge properties along the path/s, in case of multiple path, taking the maximum in this case.
3 collecting all the visited vertices as long as the accumulated score of the vertex is above some threshold (e.g., 0.7), and continue traversal until no more vertex meet the threshold (with the calculated score).
4 return all the valid vertices and corresponding scores, ideally sorted by the score desc..

With step 2, assume the calculation (math expression) score will be monotonically decreasing along the path, and to make step 2 simple, assume the calculation is memory-less, the accumulated score along the path only depends on previous calculated score (with vertex) and current properties in edge and vertex's property.

 I believe this is a more or less advanced use cases of gremlin, i feel it need sack, sideEffect, transformation, map, filtering, priority queue, etc., but it is still not straightforward for me yet. At the same time, i believe this problem is generic and practical enough, which is a classic application pattern to be implemented as a show case and present the power of gremlin in the community. I have been struggling with this for a few days. Really looking for some gremlin experts from the community could help for this implementation and demonstration the ability of gremlin. Appreciate your time for reading this so far, and thanks a lot in advance!

Best,
Yueming

Yueming Duan

unread,
Oct 30, 2019, 7:30:05 PM10/30/19
to Gremlin-users
Sorry that i forget to mention one more complexity fact above, the score of a vertex could be potentially updated due to a new path and corresponding score updating.

Sample Data:
graph = EmptyGraph.instance()
cluster = Cluster.open()
g = graph.traversal().withRemote(DriverRemoteConnection.using(cluster, "g"))


  va = g.addV("model_a").property(id, "1").property("tag", "a").property("weight", 1).next()
  vb = g.addV("model_a").property(id, "2").property("tag", "b").property("weight", 2).next()
  vc = g.addV("model_a").property(id, "3").property("tag", "c").property("weight", 3).next()
  vd = g.addV("model_a").property(id, "4").property("tag", "d").property("weight", 4).next()
  ve = g.addV("model_a").property(id, "5").property("tag", "e").property("weight", 5).next()
  vf = g.addV("model_a").property(id, "6").property("tag", "f").property("weight", 6).next()
  vg = g.addV("model_a").property(id, "7").property("tag", "g").property("weight", 7).next()
  vh = g.addV("model_a").property(id, "8").property("tag", "h").property("weight", 8).next()
  vi = g.addV("model_a").property(id, "9").property("tag", "i").property("weight", 9).next()

    g.addE("model_a").from(va).to(vb).property(T.id, 12).property("dis", 0.50d)
    g.addE("model_a").from(va).to(vc).property(T.id, 13).property("dis", 0.95d)
    g.addE("model_a").from(vb).to(vd).property(T.id, 24).property("dis", 0.10d)
    g.addE("model_a").from(vb).to(vi).property(T.id, 29).property("dis", 0.99d)
    g.addE("model_a").from(vc).to(vb).property(T.id, 32).property("dis", 0.10d)
    g.addE("model_a").from(vc).to(vd).property(T.id, 34).property("dis", 0.95d)
    g.addE("model_a").from(vc).to(ve).property(T.id, 35).property("dis", 0.75d)
    g.addE("model_a").from(vc).to(vf).property(T.id, 36).property("dis", 0.10d)
    g.addE("model_a").from(vc).to(vh).property(T.id, 38).property("dis", 0.95d)
    g.addE("model_a").from(vd).to(vb).property(T.id, 42).property("dis", 0.95d)
    g.addE("model_a").from(vd).to(ve).property(T.id, 45).property("dis", 0.95d)
    g.addE("model_a").from(vd).to(vg).property(T.id, 47).property("dis", 0.65d)
    g.addE("model_a").from(ve).to(vg).property(T.id, 57).property("dis", 0.85d)
    g.addE("model_a").from(vh).to(vf).property(T.id, 86).property("dis", 0.90d)

Assume the score of vertex is the Setting threshold of 0.8, and score of a vertex is the max of the product of dis properties along path from any Vertex in S1 (a and c in the example) to the target vertex, i would like to a gremlin query/travasal returns following list/map.
h, 0.95
d, 0.95
b, 0.9025
e, 0.9025
i, 0.89347

Yueming Duan

unread,
Oct 30, 2019, 7:30:48 PM10/30/19
to Gremlin-users
Adding a graph for this example.
Screen Shot 2019-10-30 at 4.15.09 PM.png

Yueming Duan

unread,
Oct 30, 2019, 7:37:19 PM10/30/19
to Gremlin-users
one more update for above example to make it consistent with the attached graph representation of the sample graph.
g.E(86).values("dis");
==>0.90
g.E(86).property("dis", 0.60d).iterate()
g.E(86).values("dis");
==>0.6

Daniel Kuppitz

unread,
Nov 2, 2019, 12:33:19 PM11/2/19
to gremli...@googlegroups.com
I think g should be part of your expected result as there is a valid path:

c -[0.95]-> d -[0.9]-> e -[0.85]-> g

The traversal I came up with:

gremlin> S1 = ["a","c"] ; initialRel = 1.0 ; threshold = 0.7
gremlin> g.withSack(initialRel).V().has("tag", within(S1)).
           repeat(outE("model_a").sack(mult).by("dis"). 
                    filter(sack().is(gt(threshold))).
                    inV().has("tag", without(S1)).
                  group("m").
                    by("tag").
                    by(sack().max())).
           cap("m").
           order(local).by(values, desc).next()
==>d=0.950
==>h=0.950
==>b=0.90250
==>e=0.90250
==>i=0.8934750
==>g=0.7671250

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/06b325d9-9370-4b40-805b-7f3c3ed5daad%40googlegroups.com.

Yueming Duan

unread,
Nov 2, 2019, 1:42:14 PM11/2/19
to Gremlin-users
Hi, Daniel,

Really appreciate your help of coming up with such an elegant solution, although i still need more time to fully understand, i am pretty sure it is correct based on the example and my initial understanding of your ONE line solution!  And you are totally right, g should be part of results if threshold is 0.7, also i realized that my example input and graph is not fully consistent, sorry for misleading you and people, i will update them later due to some other commitment.

On Saturday, November 2, 2019 at 9:33:19 AM UTC-7, Daniel Kuppitz wrote:
I think g should be part of your expected result as there is a valid path:

c -[0.95]-> d -[0.9]-> e -[0.85]-> g

The traversal I came up with:

gremlin> S1 = ["a","c"] ; initialRel = 1.0 ; threshold = 0.7
gremlin> g.withSack(initialRel).V().has("tag", within(S1)).
           repeat(outE("model_a").sack(mult).by("dis"). 
                    filter(sack().is(gt(threshold))).
                    inV().has("tag", without(S1)).
                  group("m").
                    by("tag").
                    by(sack().max())).
           cap("m").
           order(local).by(values, desc).next()
==>d=0.950
==>h=0.950
==>b=0.90250
==>e=0.90250
==>i=0.8934750
==>g=0.7671250

Cheers,
Daniel


On Wed, Oct 30, 2019 at 4:37 PM Yueming Duan <yuemin...@gmail.com> wrote:
one more update for above example to make it consistent with the attached graph representation of the sample graph.
g.E(86).values("dis");
==>0.90
g.E(86).property("dis", 0.60d).iterate()
g.E(86).values("dis");
==>0.6

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremli...@googlegroups.com.

Yueming Duan

unread,
Nov 3, 2019, 1:07:31 AM11/3/19
to Gremlin-users
As a followup, to make the sample input and graph being consistent.

    g.addE("model_a").from(vh).to(vf).property(T.id, 86).property("dis", 0.60d)

Assume the score of vertex is the Setting threshold of 0.7, and score of a vertex is the max of the product of dis properties along path from any Vertex in S1 (a and c in the example) to the target vertex, i would like to a gremlin query/traversal returns following list/map.
h: 0.95
d: 0.95
b: 0.9025
e: 0.9025
i: 0.89347
g: 0.767


Screen Shot 2019-11-02 at 9.59.33 PM.png
Reply all
Reply to author
Forward
0 new messages