How to fill in the edges between a list of vertex ids


Stark Arya

Aug 11, 2019, 11:20:45 PM
to Gremlin-users
Sometimes we need to get one path with a fixed number of hops from startId to endId, where the path includes both edge and vertex details, for example:
g.V(startId).repeat(outE().inV()).times(4).hasId(endId).limit(1).path().by(__.valueMap(true))

As we know, the larger the number of hops, the higher the cost, and it can become unacceptable. So we could get the list of vertex ids first and then add the edges between them.

1.  Get the vertex id list first, like:
gremlin>  g.V("startId").repeat(out().simplePath()).times(4).hasId("endId").limit(1).path()
==>[v[startId],v[secondId],v[thirdId],v[endId]]

2.  Query the edge between startId and endId:
g.V("startId").outE().as("E").otherV().hasId("endId").select("E").by(valueMap(true))
==>[id:5d3b354af8d6c90653a1c64d, label:drugs_disease_treat, confidence:0.6]

So how can 1 and 2 be combined into one Gremlin query?
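
One possible way to chain the two steps, sketched for the Gremlin Console. It keeps the `out()` traversal from step 1 to find a single vertex path, then runs the lookup from step 2 once for each consecutive pair of vertices on that path. The variable names `vertexPath`, `ids`, and `edges` are made up for illustration. This is not a single traversal (it issues one small extra query per hop), and whether it is actually faster depends on the graph database.

```groovy
// Step 1: find one 4-hop vertex path; only vertices are tracked in the path.
vertexPath = g.V("startId").repeat(out().simplePath()).times(4).hasId("endId").limit(1).path().next()

// Collect the vertex ids along the path, in traversal order.
ids = vertexPath.objects().collect { it.id() }

// Step 2: for each consecutive pair (ids[i], ids[i + 1]), fetch one connecting edge
// together with its id, label, and properties.
edges = (0..<ids.size() - 1).collect { i ->
    g.V(ids[i]).outE().where(inV().hasId(ids[i + 1])).limit(1).valueMap(true).next()
}
```

If several parallel edges connect the same pair of vertices, `limit(1)` returns only one of them.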


Stephen Mallette

Aug 13, 2019, 8:39:16 AM
to gremli...@googlegroups.com
I'm not sure I understand why you would think it faster to gather vertices first and then edges. In your first traversal you end up traversing the same edges that you intend to collect in your second traversal. Why look to traverse those paths all over again?

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/a4e5631b-551b-470e-862d-343b567504a6%40googlegroups.com.

Stark Arya

Aug 13, 2019, 10:06:00 AM
to Gremlin-users
Because a vertex stores its adjacent edges and vertices. With g.V("startId").repeat(out().simplePath()).times(4).hasId("endId").limit(1).path(), the traversal goes from a vertex to the adjacent vertex directly. With g.V(startId).repeat(outE().inV()).times(4).hasId(endId).limit(1).path().by(__.valueMap(true)), it goes from a vertex to the adjacent edge and then on to that same adjacent vertex, which costs one more storage lookup.


Stephen Mallette

Aug 13, 2019, 10:41:03 AM
to gremli...@googlegroups.com
> will go from a vertex to adjacent vertex directly

Unless that is a feature/optimization of the underlying graph database you are using, I don't think that's the case. The graph will still need to access/read those edges to determine the adjacent vertex. I don't think there is anything to gain by taking this approach. The only gain would be to ignore the return of edges in the path() which is why you might write the traversal that way. By avoiding the edge in your gremlin that's one less object to track in memory in the path. But, you want the edge in the result, so throwing it away to later traverse all those edges again doesn't really gain you anything.


Stark Arya

Aug 13, 2019, 11:17:58 AM
to Gremlin-users

Yes, it is an optimization in our Alibaba GraphDB. You can see from the picture below how much time the edge lookups cost. If we remove EdgeVertex, we could save 31s.


[Attached image: 图片.png]




Stephen Mallette

Aug 13, 2019, 11:35:31 AM
to gremli...@googlegroups.com
Interesting... could you show the profile() with out() only instead of outE().inV()?


Stark Arya

Aug 13, 2019, 9:22:28 PM
to Gremlin-users
OK. I think that, for now, it is necessary to get the vertex list first and then fill in the edges between them.

gremlin> g.V("start").repeat(out().simplePath()).times(4).hasId("end").limit(1).path().by(__.valueMap(true)).profile()
==>Traversal Metrics
Step                                                   Count  Traversers     Time (ms)    % Dur
===============================================================================================
GraphDbGraphStep(vertex,[start])                           1           1         0.113     0.00
BatchPrefetchVertexStep(OUT,vertex)                     2501        2501        15.884     0.15
PathFilterStep(simple)                                  2500        2500         1.686     0.02
NoOpBarrierStep(2500)                                      2           2         1.629     0.02
BatchPrefetchVertexStep(OUT,vertex)                     2501        2501         8.998     0.09
PathFilterStep(simple)                                  2500        2500         1.867     0.02
NoOpBarrierStep(2500)                                      2           2         1.620     0.02
BatchPrefetchVertexStep(OUT,vertex)                     5001        5001        16.566     0.16
PathFilterStep(simple)                                  5000        5000         3.960     0.04
NoOpBarrierStep(2500)                                   3637        3637         8.453     0.08
BatchPrefetchVertexStep(OUT,vertex)                  1640249     1640249      6824.154    66.13
PathFilterStep(simple)                               1640000     1640000      1379.412    13.37
NoOpBarrierStep(2500)                                1637502     1637502      1684.508    16.32
HasStep([~id.eq(end)])                                     2           2       369.500     3.58
RangeGlobalStep(0,1)                                       1           1         0.024     0.00
PathStep([[PropertyMapStep(value), ProfileStep]])          1           1         0.462     0.00
  PropertyMapStep(value)                                   5           5         0.385
                                            >TOTAL         -           -     10318.842        -
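
Reading the profile output, nearly all of the time goes to the fourth hop (the last BatchPrefetchVertexStep, PathFilterStep, and NoOpBarrierStep rows). A quick check of that share, written as plain Groovy arithmetic for the Gremlin Console; the variable names are made up for illustration and the numbers are copied from the Time (ms) column:

```groovy
// Time (ms) of the three steps that make up the fourth hop.
fourthHopMs = [6824.154, 1379.412, 1684.508].sum()

// Total traversal time (ms) from the >TOTAL row.
totalMs = 10318.842

// Percentage of the total time spent in the fourth hop.
fourthHopShare = fourthHopMs / totalMs * 100
```

This comes to about 9888 ms, or roughly 96% of the total, in line with the sum of the % Dur column for those rows (66.13 + 13.37 + 16.32 = 95.82).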
