counting the number of edges between two nodes and storing that as a property in a new (summary) edge

Olav Laudy

Jan 17, 2018, 1:16:05 PM
to Gremlin-users
Hi,


I have nodes with multiple edges between them. I can count the edges, but since count() is a reducing barrier, I can't use the count when creating a new edge between the two nodes.


g.addV('object1')
g.addV('object2')
g.addV('object3')

g.V().hasLabel('object1').as('A')
  .V().hasLabel('object2').addE('obs1').from('A')

g.V().hasLabel('object1').as('A')
  .V().hasLabel('object2').addE('obs2').from('A')

g.V().hasLabel('object2').as('A')
  .V().hasLabel('object3').addE('obs1').from('A')


I would like to create two edges, roughly like this (pseudo-code):

g.addE('summary').from('object1').to('object2').property('count',2)
g.addE('summary').from('object2').to('object3').property('count',1)

I can count:

g.V().hasLabel('object1').as('A').outE().as('B').count()

but after the count() I can no longer get hold of 'A' and 'B', because count() is a reducing barrier.



Any suggestion is highly welcome!

Kelvin Lawrence

Jan 17, 2018, 6:23:04 PM
to Gremlin-users
Others I am sure can come up with better ways but does something like this work for you?

gremlin> g.V(3).as('a').project('edges').by(outE().count()).addV('newnode').property('count',select('edges'))
==>v[54922]
gremlin> g.V(54922).valueMap()
==>[count:[61]]

Olav Laudy

Jan 17, 2018, 6:30:34 PM
to Gremlin-users
Hi Kelvin,

Thank you so much for your suggestion. Your query creates a vertex, while I'm looking to create an edge. When I try to turn your query into an edge query, I run into the same issue as before: once the count is done, I can't 'reach' the vertex anymore.

Any ideas?

Kelvin Lawrence

Jan 17, 2018, 6:53:02 PM
to Gremlin-users
Something like this then perhaps

gremlin> g.V(3).as('a').V('4').as('b').addE('summary').from('a').to('b').property('count',select('a').by(outE().count()))
==>e[54930][3-summary->4]
gremlin> g.E(54930).valueMap()
==>[count:67]

Olav Laudy

Jan 17, 2018, 10:09:45 PM
to Gremlin-users
Hi,

Unfortunately, I can't get this to work: in your example you select the nodes by id (g.V(3)), but I need this to work for all the nodes I iterate over. If I do g.V() twice, I get a Cartesian join, which is huge in my real case, and the counts do not relate correctly to the pair involved.

Maybe this example is clearer:

g.addV('object').property('name',1)
g.addV('object').property('name',2)
g.addV('object').property('name',3)

g.V().has('object','name',1).as('A').V().has('object','name',2).addE('obs').from('A')
g.V().has('object','name',1).as('A').V().has('object','name',2).addE('obs').from('A')
g.V().has('object','name',2).as('A').V().has('object','name',3).addE('obs').from('A')

The outcome should be:

Link from name=1 to name=2, count=2
Link from name=2 to name=3, count=1
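
As a sanity check on the expected outcome, here is a small sketch in plain Python (not Gremlin, and not using any graph library). It models the three 'obs' edges above as (from, to) name pairs and counts them per pair. The names `obs_edges` and `summary` are made up for illustration.

```python
from collections import Counter

# The three 'obs' edges from the example, as (from_name, to_name) pairs.
obs_edges = [(1, 2), (1, 2), (2, 3)]

# One summary entry per distinct (from, to) pair, holding the number of parallel edges.
summary = Counter(obs_edges)

for (src, dst), count in sorted(summary.items()):
    print(f"Link from name={src} to name={dst}, count={count}")
```

This prints the two lines listed above, which is the result any correct traversal has to reproduce as 'summary' edges.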

Robert Dale

Jan 18, 2018, 9:26:45 AM
to gremli...@googlegroups.com
g.V().
project('a','b').by().by(__.out('obs').groupCount()).as('group').
select('b').unfold().as('out').select(keys).
addE('summary').
from(select('group').select('a')).
property('summary', select('out').select(values))


Robert Dale

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/1364f7bf-6c07-4e91-b0d3-dd9237485e07%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Olav Laudy

Jan 18, 2018, 9:28:47 AM
to gremli...@googlegroups.com

Very impressive!!! Where do I learn that? How do I get a feel for that logic?



Kelvin Lawrence

Jan 18, 2018, 9:53:37 AM
to Gremlin-users
Hi Olav, glad to see Robert got you going. I missed your last post and did not properly understand what you wanted to achieve. As to how you get to "feel" Gremlin - I think we would all (at least most of us anyway!) agree that it takes time and lots of experiments before it starts to become intuitive, and even then there is often a better way that someone else will come up with!

Hang in there - I know I learn something new about Gremlin just about every single day!

Kelvin

Robert Dale

Jan 18, 2018, 10:09:23 AM
to gremli...@googlegroups.com
Read everything Mr. Kuppitz has ever written.  ;-)

This is one of those cases where you have to put everything up front, because you can't reference it later, or it's not optimal to.
For example, g.V().as('a').groupCount().by(out()).addE().from('a') <-- groupCount() is a reducing barrier, so you can't get back to 'a'.
Or where later you try .addE().property('summary', outE().count()) <-- even if you filter on label 'obs', you'd still get ALL 'obs' edges to all vertices (assuming you could have more than one), and even then you'd need an additional filter.
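
To make the "reducing barrier" point a bit more concrete, here is a rough model in plain Python. It is only an analogy, not how TinkerPop is implemented: each traverser is modeled as a dict carrying its labelled history, and `reduce_count` and `count_per_start` are hypothetical helper names.

```python
from collections import Counter

# Each traverser carries its labelled history: 'A' = start vertex, 'B' = target of one out edge.
traversers = [
    {"A": 1, "B": 2},
    {"A": 1, "B": 2},
    {"A": 2, "B": 3},
]

def reduce_count(stream):
    """Like count(): consumes every traverser and emits a single number.
    The labels 'A' and 'B' are gone afterwards, so there is nothing left to select()."""
    return sum(1 for _ in stream)

def count_per_start(stream):
    """Like computing the counts up front, keyed by start vertex,
    so the start vertex is still known when the edge gets created."""
    grouped = {}
    for t in stream:
        grouped.setdefault(t["A"], Counter())[t["B"]] += 1
    return grouped

total = reduce_count(traversers)
per_start = count_per_start(traversers)
print(total)  # 3
print(per_start)
```

The single number 3 cannot be mapped back to any pair of vertices, whereas the grouped structure keeps the start vertex, the target vertex and the count together.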

So, let's try to put everything required for the edge up-front. The groupCount() is the same as group().by().by(count()). This will group each out vertex with its count.

gremlin> g.V().
......1> project('a','b').by().by(__.out('obs').groupCount()).as('group')
==>[a:v[0],b:[v[2]:2]]
==>[a:v[2],b:[v[4]:1]]
==>[a:v[4],b:[]]

This is our 'group' that we can reference later. It has the start vertex under 'a', and under 'b' a map with the out vertex as key and the count as value. Everything we need.

I'm assuming you could have more than one out vertex. addE() does not work on multiple targets, so you can't start from the starting vertex. Let's get to the 'out' vertices, as they will need to drive the addE().
 
gremlin> g.V().
......1> project('a','b').by().by(__.out('obs').groupCount()).as('group').
......2> select('b').unfold().as('out').select(keys)
==>v[2]
==>v[4]

Let's start adding that edge to the current vertex... 

addE('summary').

Here we need to get back to the original, starting vertex.  It's in the projected map 'group' as key 'a'.

from(select('group').select('a')).

Then add the property.  We reference back to the 'out' map and select the value which contains the count().

property('summary', select('out').select(values))
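
The walk-through above can be mimicked in plain Python as a sketch. This is a model of the data flow only, not Gremlin or TinkerPop code. The vertex ids 0, 2 and 4 are taken from the console output above, while `out_obs` and `summary_edges` are made-up names.

```python
from collections import Counter

# Out-neighbours reached via 'obs' edges, one list entry per edge.
out_obs = {0: [2, 2], 2: [4], 4: []}

summary_edges = []
for a, targets in out_obs.items():
    # project('a','b').by().by(out('obs').groupCount()).as('group')
    group = {"a": a, "b": Counter(targets)}
    # select('b').unfold().as('out'): one map entry per distinct out vertex
    for out_vertex, count in group["b"].items():
        # select(keys): out_vertex becomes the current traverser, i.e. the edge target
        # addE('summary').from(select('group').select('a'))
        #   .property('summary', select('out').select(values))
        summary_edges.append((group["a"], out_vertex, {"summary": count}))

print(summary_edges)
```

Vertex 4 has no outgoing 'obs' edges, so its group is empty, the unfold produces nothing, and no edge is created for it.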

And as I saw in Kelvin's reply, this did take some experimentation to get right. I had to relearn some of the rules as I went along ;-)

HTH



Robert Dale


Kelvin Lawrence

Jan 18, 2018, 10:32:40 AM
to Gremlin-users
A big plus one on reading everything from Mr. Kuppitz :-)

I have had several people suggest to me recently that the whole area of collections that a traversal builds up, and also barrier steps, is an area that people find confusing. Based on that feedback and this conversation as well, I will move coverage of that topic to the top of my todo list for things to add next to the book.


Stephen Mallette

Jan 18, 2018, 10:45:16 AM
to Gremlin-users
I'd advocate for not only reading Kuppitz but really digging into any Gremlin you don't understand at a glance and trying those traversals for yourself in the Gremlin Console. Make sure you understand what each step of the traversal is accomplishing and how it is transforming the data in traversers, mutating side-effects, etc. Change the steps, the parameters, the sample data and see what happens. These points are some of what I expect to get into with my "Gremlin's Anatomy" talk. 

As Kelvin mentioned I've noticed and heard that there isn't a lot of information available on collections. To that end, I'm working on a "Collections" recipe now that goes into this topic with some really nice examples. We'll have that as part of 3.2.8 and 3.3.2.


Olav Laudy

Jan 18, 2018, 11:24:22 AM
to Gremlin-users
Yes sir! That's exactly what I'm doing with each of the pearls I've been given on this mailing list.


Thank you and I really appreciate your responses!

Olav Laudy

Jan 18, 2018, 12:10:22 PM
to Gremlin-users
Uh-oh, trouble in paradise:

Locally I'm running Gremlin 3.3.1 and the query works beautifully.

However, the real thing is supposed to run on AWS Neptune and (with the exact nodes/vertices and query as above) I receive:


"startup failed:\nScript32.groovy: 1: [Static type checking] - 
Cannot call org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal <org.apache.tinkerpop.gremlin.structure.Vertex, org.apache.tinkerpop.gremlin.structure.Edge>#from(org.apache.tinkerpop.gremlin.process.traversal.Traversal <org.apache.tinkerpop.gremlin.structure.Edge, org.apache.tinkerpop.gremlin.structure.Vertex>) with arguments [org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal <A extends java.lang.Object, E2 extends java.lang.Object>] \n @ line 1, column 1.\n   

g.V().hasLabel('object1').project('a','b')          
     .by()          
     .by(__.out('obs1').groupCount()).as('group')              
     .select('b').unfold().as('out').select(keys)              
     .addE('summary')                
     .from(select('group').select('a'))                
       .property('summary', select('out').select(values))

\n   ^\n\n1 error\n

Note I changed object->object1 and obs->obs1 to not interfere with existing labels. The part up to .addE works fine. The moment I use the addE part, even without the property added, I receive this error.

Olav Laudy

Jan 18, 2018, 2:41:26 PM
to Gremlin-users
This is a comparison of the features enabled by Gremlin 3.3.1 and AWS Neptune:

Is there any feature that needs to be enabled for the above query to work?

Feature Neptune Gremlin Difference

Graph Feature
Transactions TRUE FALSE FALSE
ThreadedTransactions FALSE FALSE TRUE
Computer FALSE TRUE FALSE
Persistence TRUE TRUE TRUE
ConcurrentAccess TRUE FALSE FALSE
Variable Feature
     
Variables FALSE TRUE FALSE
SerializableValues FALSE TRUE FALSE
UniformListValues FALSE TRUE FALSE
BooleanArrayValues FALSE TRUE FALSE
DoubleArrayValues FALSE TRUE FALSE
IntegerArrayValues FALSE TRUE FALSE
StringArrayValues FALSE TRUE FALSE
BooleanValues FALSE TRUE FALSE
ByteValues FALSE TRUE FALSE
DoubleValues FALSE TRUE FALSE
FloatValues FALSE TRUE FALSE
IntegerValues FALSE TRUE FALSE
LongValues FALSE TRUE FALSE
MapValues FALSE TRUE FALSE
MixedListValues FALSE TRUE FALSE
StringValues FALSE TRUE FALSE
ByteArrayValues FALSE TRUE FALSE
FloatArrayValues FALSE TRUE FALSE
LongArrayValues FALSE TRUE FALSE
Vertex Feature
     
MetaProperties FALSE TRUE FALSE
DuplicateMultiProperties FALSE TRUE FALSE
AddVertices TRUE TRUE TRUE
RemoveVertices TRUE TRUE TRUE
MultiProperties TRUE TRUE TRUE
UserSuppliedIds TRUE TRUE TRUE
AddProperty TRUE TRUE TRUE
RemoveProperty TRUE TRUE TRUE
NumericIds FALSE TRUE FALSE
StringIds TRUE TRUE TRUE
UuidIds FALSE TRUE FALSE
CustomIds FALSE FALSE TRUE
AnyIds FALSE TRUE FALSE
Vertex Property Feature
     
UserSuppliedIds FALSE TRUE FALSE
AddProperty TRUE TRUE TRUE
RemoveProperty TRUE TRUE TRUE
NumericIds TRUE TRUE TRUE
StringIds TRUE TRUE TRUE
UuidIds FALSE TRUE FALSE
CustomIds FALSE FALSE TRUE
AnyIds FALSE TRUE FALSE
Properties TRUE TRUE TRUE
SerializableValues FALSE TRUE FALSE
UniformListValues FALSE TRUE FALSE
BooleanArrayValues FALSE TRUE FALSE
DoubleArrayValues FALSE TRUE FALSE
IntegerArrayValues FALSE TRUE FALSE
StringArrayValues FALSE TRUE FALSE
BooleanValues TRUE TRUE TRUE
ByteValues TRUE TRUE TRUE
DoubleValues TRUE TRUE TRUE
FloatValues TRUE TRUE TRUE
IntegerValues TRUE TRUE TRUE
LongValues TRUE TRUE TRUE
MapValues FALSE TRUE FALSE
MixedListValues FALSE TRUE FALSE
StringValues TRUE TRUE TRUE
ByteArrayValues FALSE TRUE FALSE
FloatArrayValues FALSE TRUE FALSE
LongArrayValues FALSE TRUE FALSE
Edge Feature
     
AddEdges TRUE TRUE TRUE
RemoveEdges TRUE TRUE TRUE
UserSuppliedIds TRUE TRUE TRUE
AddProperty TRUE TRUE TRUE
RemoveProperty TRUE TRUE TRUE
NumericIds FALSE TRUE FALSE
StringIds TRUE TRUE TRUE
UuidIds FALSE TRUE FALSE
CustomIds FALSE FALSE TRUE
AnyIds FALSE TRUE FALSE
Edge Property Feature
     
Properties TRUE TRUE TRUE
SerializableValues FALSE TRUE FALSE
UniformListValues FALSE TRUE FALSE
BooleanArrayValues FALSE TRUE FALSE
DoubleArrayValues FALSE TRUE FALSE
IntegerArrayValues FALSE TRUE FALSE
StringArrayValues FALSE TRUE FALSE
BooleanValues TRUE TRUE TRUE
ByteValues TRUE TRUE TRUE
DoubleValues TRUE TRUE TRUE
FloatValues TRUE TRUE TRUE
IntegerValues TRUE TRUE TRUE
LongValues TRUE TRUE TRUE
MapValues FALSE TRUE FALSE
MixedListValues FALSE TRUE FALSE
StringValues TRUE TRUE TRUE
ByteArrayValues FALSE TRUE FALSE
FloatArrayValues FALSE TRUE FALSE
LongArrayValues FALSE TRUE FALSE

Robert Dale

Jan 19, 2018, 8:34:38 AM
to gremli...@googlegroups.com

Try casting it.

.from((Vertex) select('group').select('a'))  

The method signature changed in 3.3.1 so you don't have to do that.


Robert Dale


Jean-Baptiste Musso

Jan 19, 2018, 10:07:39 AM
to gremli...@googlegroups.com
Thanks for sharing this Robert, very nice explanation!
That recipe is very useful, since precomputing edges between vertices or properties on vertices is often required for performance, especially when traversing big graphs. Will mentally bookmark this.

Jean-Baptiste

Olav Laudy

Jan 19, 2018, 11:44:45 AM
to Gremlin-users
Hi,

I'm sorry I'm so blind here:

This is my (simplified) query: 

g.V().hasLabel('object1').project('a','b')          
     .by()          
     .by(__.out('obs1').groupCount()).as('group')              
     .select('b').unfold().as('out').select(keys)              
     .addE('summary')                
     .from(select('group').select('a'))         

which I change to:

g.V().hasLabel('object1').project('a','b')          
     .by()          
     .by(__.out('obs1').groupCount()).as('group')              
     .select('b').unfold().as('out').select(keys)              
     .addE('summary')                
     .from( (Vertex) select('group').select('a'))     

This now gives me:

Error encountered evaluating script

Daniel Kuppitz

Jan 21, 2018, 2:07:48 PM
to gremli...@googlegroups.com
Casting to Vertex is invalid. It should either be cast to Traversal<Edge, Vertex> (which will probably fail, too) or to the non-generic version Traversal. The latter would be my best bet.

g.V().hasLabel('object1').
  project('a','b').
    by().
    by(__.out('obs1').groupCount()).as('group').
  select('b').unfold().as('out').
  select(keys).
  addE('summary').                
    from((Traversal) select('group').select('a'))

However, I don't see the need for groupCount() and project() in your query. Why don't you just use a much simpler query?

g.V().hasLabel('object1').as('a').
  local(out('obs1').dedup()).
  addE('summary').from('a')

Maybe I missed something (and yes, I'm too lazy to reread the whole thread :)). Just in case you actually want to use the group-counts:

g.V().hasLabel('object1').as('a').
  map(out('obs1').groupCount()).
  unfold().as('x').select(keys).
  addE('summary').
    from('a').
  property('count', select('x').select(values))

This query will create the summary edges and store the groupCount values as a property on the respective edge.
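
The difference between the two simpler queries above can be sketched in plain Python. Again, this is a model of the data flow only, not Gremlin; the vertex names "v1" to "v3" and the variable names are hypothetical.

```python
from collections import Counter

# Out-neighbours reached via 'obs1' edges, one list entry per edge.
out_obs1 = {"v1": ["v2", "v2"], "v2": ["v3"], "v3": []}

# local(out('obs1').dedup()): one summary edge per distinct neighbour, without a count.
dedup_edges = [
    (a, b) for a, targets in out_obs1.items() for b in dict.fromkeys(targets)
]

# map(out('obs1').groupCount()).unfold(): the same pairs, plus the count per neighbour.
counted_edges = [
    (a, b, n) for a, targets in out_obs1.items() for b, n in Counter(targets).items()
]

print(dedup_edges)
print(counted_edges)
```

In both cases the start vertex stays available as 'a', because the deduplication or counting happens per start vertex rather than across the whole traversal.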

Cheers,
Daniel



Stephen Mallette

Jan 22, 2018, 8:36:03 AM
to Gremlin-users
> Casting to Vertex is invalid. It should either be cast to Traversal<Edge, Vertex> (which will probably fail, too)

The API simply doesn't support addV(Traversal) prior to 3.3.1. I'm pretty sure Neptune is 3.3.0 - you have to be sure the versions align. Perhaps in TinkerPop 4.x we can do a better job of dealing with situations like this. Better messaging at the least - smarter handling of incompatibilities at the best.

Olav Laudy

Jan 22, 2018, 11:39:45 AM
to Gremlin-users
Brilliant! This did the trick:

g.V().hasLabel('object1').as('a').
  map(out('obs1').groupCount()).
  unfold().as('x').select(keys).
  addE('summary').
    from('a').
  property('count', select('x').select(values))


For completeness and searchability:

How to create a new edge that stores the number of edges between two nodes. It's basically a pre-computation, or a summary edge.

Start with:

g.addV('object1').property('name',1)
g.addV('object1').property('name',2)
g.addV('object1').property('name',3)

g.V().has('object1','name',1).as('A').V().has('object1','name',2).addE('obs1').from('A')
g.V().has('object1','name',1).as('A').V().has('object1','name',2).addE('obs1').from('A')
g.V().has('object1','name',2).as('A').V().has('object1','name',3).addE('obs1').from('A')


Apply the query above and receive:

['Edge:',
 {'id': 'd4b08f52-530e-628a-3955-76a4ef8bdd99',
  'inV': '1eb08f52-45e1-b633-af8d-a32dc812bb10',
  'inVLabel': 'object1',
  'label': 'summary',
  'outV': '72b08f52-456f-aa41-4a10-e55c5782b998',
  'outVLabel': 'object1',
  'properties': {'count': 2}},
 'Edge:',
 {'id': '52b08f52-5312-afa8-e7a2-f103893efdc8',
  'inV': '6ab08f52-4650-3975-45e5-629ab72a3266',
  'inVLabel': 'object1',
  'label': 'summary',
  'outV': '1eb08f52-45e1-b633-af8d-a32dc812bb10',
  'outVLabel': 'object1',
  'properties': {'count': 1}}]
