list/num followers query in name value pairs format

Eser Kandogan

Nov 30, 2016, 7:59:01 AM
to Gremlin-users
Let's say I have a graph of nodes representing people with follows edges between them. Each person node has an id property.

1. How can I compute, for each person, the number and the list of people they are following? I want the output to be:
[ "PERSON1": [ "numFollowers": 3, "followers": ["PERSON2", "PERSON3", "PERSON4"] ], "PERSON2": [ "numFollowers": 1, "followers": ["PERSON4" ] ], ... ]
saying:
PERSON1 has 3 followers and they are PERSON2, PERSON3, and PERSON4
and 
PERSON2 has 1 follower and it is PERSON4

2. How can I compute, for each pair of people, the number and the list of people they co-follow? I want the output to be:
[ "PERSON1": [ "PERSON2": [ "numSharedFollowers": 1, "sharedFollowers": ["PERSON4"]], ... ], "PERSON2": [ "PERSON1": ["numSharedFollowers": 1, "sharedFollowers": ["PERSON4"]], ... ], ... ]

Thank you so much.



Daniel Kuppitz

Nov 30, 2016, 8:27:00 AM
to gremli...@googlegroups.com
1. How can I compute for each person number and list of people they are following.

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]

gremlin> g.V().hasLabel("person").as("p").map(out("knows").values("name").fold()).
           group().by(select("p").by("name")).
                   by(project("numFollowers","followers").by(count(local)).by()).next()

==>peter={numFollowers=0, followers=[]}
==>vadas={numFollowers=0, followers=[]}
==>josh={numFollowers=0, followers=[]}
==>marko={numFollowers=2, followers=[vadas, josh]}
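
For anyone who wants to check the shape of this result without a Gremlin console, here is a minimal plain-Python sketch of what the traversal computes. It is a model of the logic, not Gremlin: the `knows` dictionary and the `followers_report` helper are assumptions made for illustration, with the names and edges copied from the toy graph above.

```python
# Toy "modern" graph as plain data: person -> names reached via out("knows").
# The dict layout is an assumption; the edges mirror the toy graph.
knows = {
    "marko": ["vadas", "josh"],
    "vadas": [],
    "josh": [],
    "peter": [],
}

def followers_report(out_edges):
    """Model of group().by(name).by(project("numFollowers", "followers"))."""
    return {
        person: {"numFollowers": len(targets), "followers": list(targets)}
        for person, targets in out_edges.items()
    }

report = followers_report(knows)
print(report["marko"])  # {'numFollowers': 2, 'followers': ['vadas', 'josh']}
```

As in the traversal, people with no outgoing edge still appear, with a count of 0 and an empty list.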

2. How can I compute the number and list of people they co-follow.

Let's use co-creators over the toy graph:

gremlin> g.V().hasLabel("person").as("p").out("created").as("s").map(__.in("created").where(neq("p")).values("name").fold()).
           group().by(select("p").by("name")).
                   by(group().by(select("s").by("name")).
                              by(project("numCoworkers","coworkers").by(count(local)).by())).next()
==>peter={lop={numCoworkers=2, coworkers=[marko, josh]}}
==>josh={ripple={numCoworkers=0, coworkers=[]}, lop={numCoworkers=2, coworkers=[marko, peter]}}
==>marko={lop={numCoworkers=2, coworkers=[josh, peter]}}
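
The nested grouping can be easier to follow as ordinary loops. The sketch below is a plain-Python model of the co-creator query, assuming the toy graph's "created" edges are held in a dictionary; `created` and `coworkers_report` are hypothetical names, not part of TinkerPop.

```python
# person -> software they created (edges copied from the toy graph)
created = {
    "marko": ["lop"],
    "vadas": [],
    "josh": ["ripple", "lop"],
    "peter": ["lop"],
}

def coworkers_report(created):
    # Invert the edges: software -> people who created it (the in("created") hop).
    creators = {}
    for person, works in created.items():
        for software in works:
            creators.setdefault(software, []).append(person)

    # Outer group by person, inner group by software.
    report = {}
    for person, works in created.items():
        for software in works:
            others = [p for p in creators[software] if p != person]  # where(neq("p"))
            report.setdefault(person, {})[software] = {
                "numCoworkers": len(others),
                "coworkers": others,
            }
    return report

report = coworkers_report(created)
print(report["josh"])
```

People who created nothing (vadas here) never reach the grouping step, which is why they are absent from the result above.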

Cheers,
Daniel



--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/4c2c80f9-a38a-459b-910b-eb7a86dd3cb0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Eser Kandogan

Nov 30, 2016, 3:15:16 PM
to Gremlin-users
Thank you so much. I think the part where I was struggling was the project() step, which is missing in 3.0.1. Is there a substitute I can use that will work in 3.0.1?


Daniel Kuppitz

Nov 30, 2016, 3:26:18 PM
to gremli...@googlegroups.com
Yep.

.project("a","b").by(bla).by(blub)

is the same as:

.as("a","b").select("a","b").by(bla).by(blub)
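
One way to see why the two forms agree: both hand the same current object to each by() modulator and collect the results under the given keys. The following is only a plain-Python model of that idea, with hypothetical helper names `project` and `as_then_select`; it says nothing about how TinkerPop implements either step.

```python
def project(keys, *by):
    """Model of project(k1, k2).by(f1).by(f2): apply each function to the input."""
    def step(x):
        return {key: fn(x) for key, fn in zip(keys, by)}
    return step

def as_then_select(keys, *by):
    """Model of as(k1, k2).select(k1, k2).by(f1).by(f2)."""
    def step(x):
        labelled = {key: x for key in keys}  # the same object stored under every label
        return {key: fn(labelled[key]) for key, fn in zip(keys, by)}
    return step

names = ["vadas", "josh"]
keys = ("numFollowers", "followers")
a = project(keys, len, list)(names)
b = as_then_select(keys, len, list)(names)
print(a == b)  # True
```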

Cheers,
Daniel



Eser Kandogan

Nov 30, 2016, 3:37:38 PM
to Gremlin-users
Thank you so much!

Eser Kandogan

Nov 30, 2016, 11:57:45 PM
to Gremlin-users
OK, I am still not getting it right. For every pair of people, I would like the list and count of the assets they co-created, i.e.:
 
==>peter={marko={numCoCreated=1, coCreated=[lop]},josh={numCoCreated=1, coCreated=[lop]}}
==>josh={marko={numCoCreated=1, coCreated=[lop]},peter={numCoCreated=1, coCreated=[lop]}}
==>marko={peter={numCoCreated=1, coCreated=[lop]},josh={numCoCreated=1, coCreated=[lop]}}

I am trying:

g.V().hasLabel("person").as('p1').out('created').aggregate('c').group().by(select('p1').by('name')).by(g.V().hasLabel('person').as('p2').group().by(select('p2').by('name')).by(out('created').where(within('c')).values('name').fold()))

The idea is an outer loop over all people that aggregates everything they created into c, and then an inner loop that builds the co-created list by comparing against c.

(probably loop is the wrong way to think about this?)
 
 

Eser Kandogan

Dec 1, 2016, 12:20:59 AM
to Gremlin-users
If two people have co-created nothing, I still want the pair listed, with an empty set for coCreated.


Daniel Kuppitz

Dec 1, 2016, 8:00:34 AM
to gremli...@googlegroups.com
Ok, let's go through it step by step.

1. Get all persons

gremlin> g.V().hasLabel("person").as("p1")
==>v[1]
==>v[2]
==>v[4]
==>v[6]

2. Create person combinations

gremlin> g.V().hasLabel("person").as("p1").
           V().hasLabel("person").where(neq("p1")).path()

==>[v[1],v[2]]
==>[v[1],v[4]]
==>[v[1],v[6]]
==>[v[2],v[1]]
==>[v[2],v[4]]
==>[v[2],v[6]]
==>[v[4],v[1]]
==>[v[4],v[2]]
==>[v[4],v[6]]
==>[v[6],v[1]]
==>[v[6],v[2]]
==>[v[6],v[4]]
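
In ordinary code, this step is the set of ordered pairs of distinct persons. A minimal Python sketch, assuming the four person vertex ids from the toy graph are held in a list:

```python
from itertools import permutations

# Person vertex ids from the toy graph.
persons = [1, 2, 4, 6]

# Ordered pairs of distinct persons: the mid-traversal V() plus where(neq("p1")).
pairs = list(permutations(persons, 2))

print(len(pairs))  # 12
print(pairs[:3])   # [(1, 2), (1, 4), (1, 6)]
```

Both (1, 2) and (2, 1) appear, which matches the path output above and is what the nested per-person map needs later.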

3. Get software that was co-created by these 2 persons

gremlin> g.V().hasLabel("person").as("p1").
           V().hasLabel("person").where(neq("p1")).as("p2").
           map(out("created").where(__.in("created").as("p1")).fold()).path()

==>[v[1],v[2],[]]
==>[v[1],v[4],[v[3]]]
==>[v[1],v[6],[v[3]]]
==>[v[2],v[1],[]]
==>[v[2],v[4],[]]
==>[v[2],v[6],[]]
==>[v[4],v[1],[v[3]]]
==>[v[4],v[2],[]]
==>[v[4],v[6],[v[3]]]
==>[v[6],v[1],[v[3]]]
==>[v[6],v[2],[]]
==>[v[6],v[4],[v[3]]]
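
The filter in this step starts at the second person, walks to what they created, and keeps only the software that the first person also created. A plain-Python model of that check, with the toy graph's "created" edges in a dictionary (`created` and `co_created` are assumed names for illustration):

```python
from itertools import permutations

# person id -> ids of software they created (toy graph edges)
created = {1: [3], 2: [], 4: [5, 3], 6: [3]}

def co_created(p1, p2):
    """Software created by p2 that p1 also created; empty list if none."""
    return [s for s in created[p2] if s in created[p1]]

rows = [(p1, p2, co_created(p1, p2)) for p1, p2 in permutations(created, 2)]
print(rows[1])  # (1, 4, [3])
```

Because the list is built per pair (the fold() inside map()), pairs with nothing in common still produce a row with an empty list instead of being filtered out.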

4. Add a projection to get the counts and the software itself

gremlin> g.V().hasLabel("person").as("p1").
           V().hasLabel("person").where(neq("p1")).as("p2").
           map(out("created").where(__.in("created").as("p1")).fold()).
           project("numCoCreated","coCreated").by(count(local)).by().path()

==>[v[1],v[2],[],[numCoCreated:0,coCreated:[]]]
==>[v[1],v[4],[v[3]],[numCoCreated:1,coCreated:[v[3]]]]
==>[v[1],v[6],[v[3]],[numCoCreated:1,coCreated:[v[3]]]]
==>[v[2],v[1],[],[numCoCreated:0,coCreated:[]]]
==>[v[2],v[4],[],[numCoCreated:0,coCreated:[]]]
==>[v[2],v[6],[],[numCoCreated:0,coCreated:[]]]
==>[v[4],v[1],[v[3]],[numCoCreated:1,coCreated:[v[3]]]]
==>[v[4],v[2],[],[numCoCreated:0,coCreated:[]]]
==>[v[4],v[6],[v[3]],[numCoCreated:1,coCreated:[v[3]]]]
==>[v[6],v[1],[v[3]],[numCoCreated:1,coCreated:[v[3]]]]
==>[v[6],v[2],[],[numCoCreated:0,coCreated:[]]]
==>[v[6],v[4],[v[3]],[numCoCreated:1,coCreated:[v[3]]]]

5. Squash it all together to get it in the preferred format

gremlin> g.V().hasLabel("person").as("p1").
           V().hasLabel("person").where(neq("p1")).as("p2").
           map(out("created").where(__.in("created").as("p1")).fold()).
           project("numCoCreated","coCreated").by(count(local)).by().
           group().by(select("p1")).by(group().by(select("p2"))).next()

==>v[1]={v[2]=[{numCoCreated=0, coCreated=[]}], v[4]=[{numCoCreated=1, coCreated=[v[3]]}], v[6]=[{numCoCreated=1, coCreated=[v[3]]}]}
==>v[2]={v[1]=[{numCoCreated=0, coCreated=[]}], v[4]=[{numCoCreated=0, coCreated=[]}], v[6]=[{numCoCreated=0, coCreated=[]}]}
==>v[4]={v[1]=[{numCoCreated=1, coCreated=[v[3]]}], v[2]=[{numCoCreated=0, coCreated=[]}], v[6]=[{numCoCreated=1, coCreated=[v[3]]}]}
==>v[6]={v[1]=[{numCoCreated=1, coCreated=[v[3]]}], v[2]=[{numCoCreated=0, coCreated=[]}], v[4]=[{numCoCreated=1, coCreated=[v[3]]}]}

6. Make it all pretty and use names instead of vertex identifiers

gremlin> g.V().hasLabel("person").as("p1").
           V().hasLabel("person").where(neq("p1")).as("p2").
           map(out("created").where(__.in("created").as("p1")).values("name").fold()).
           project("numCoCreated","coCreated").by(count(local)).by().
           group().by(select("p1").by("name")).by(group().by(select("p2").by("name"))).next()
==>peter={vadas=[{numCoCreated=0, coCreated=[]}], josh=[{numCoCreated=1, coCreated=[lop]}], marko=[{numCoCreated=1, coCreated=[lop]}]}
==>vadas={peter=[{numCoCreated=0, coCreated=[]}], josh=[{numCoCreated=0, coCreated=[]}], marko=[{numCoCreated=0, coCreated=[]}]}
==>josh={peter=[{numCoCreated=1, coCreated=[lop]}], vadas=[{numCoCreated=0, coCreated=[]}], marko=[{numCoCreated=1, coCreated=[lop]}]}
==>marko={peter=[{numCoCreated=1, coCreated=[lop]}], vadas=[{numCoCreated=0, coCreated=[]}], josh=[{numCoCreated=1, coCreated=[lop]}]}
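
Note that each inner value in the result is wrapped in a one-element list, because the inner group() has no value modulator and therefore collects the incoming maps into a list. The sketch below models the whole pipeline in plain Python and then unwraps those lists to get the format originally asked for; `pair_report` and `flat` are hypothetical names, and the unwrapping is shown in Python only, as an assumption about what the final shape should be.

```python
from itertools import permutations

# person -> software they created (toy graph edges)
created = {
    "marko": ["lop"],
    "vadas": [],
    "josh": ["ripple", "lop"],
    "peter": ["lop"],
}

def pair_report(created):
    result = {}
    for p1, p2 in permutations(created, 2):
        shared = [s for s in created[p2] if s in created[p1]]
        entry = {"numCoCreated": len(shared), "coCreated": shared}
        # group().by(p1).by(group().by(p2)): values are collected into lists
        result.setdefault(p1, {}).setdefault(p2, []).append(entry)
    return result

report = pair_report(created)

# Each list holds exactly one entry per (p1, p2), so it can be unwrapped.
flat = {p1: {p2: entries[0] for p2, entries in inner.items()}
        for p1, inner in report.items()}

print(flat["peter"]["josh"])  # {'numCoCreated': 1, 'coCreated': ['lop']}
```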


That's it. HTH.

Cheers,
Daniel




Eser Kandogan

Dec 1, 2016, 12:46:19 PM
to Gremlin-users
Thanks. I am using 3.0.1, and it seems that it doesn't like the second V() in the query.

On g.V().hasLabel("person").as("p1").V()

I get:
No signature of method: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal.V() is applicable for argument types: () values: []
Possible solutions: is(java.lang.Object), by(groovy.lang.Closure), sum(), max(), any(), min()


Daniel Kuppitz

Dec 1, 2016, 3:48:26 PM
to gremli...@googlegroups.com
This query has the same result (requires much more memory though):

g.V().hasLabel("person").aggregate("p").as("p1").select("p").unfold()
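
A rough plain-Python picture of why this variant costs more memory: every person is collected into one list first, and each person is then paired against that stored list. This is only a model, with the hypothetical helper `pairs_via_aggregate`; the `p2 != p1` check stands in for the where(neq("p1")) filter from the earlier steps, which still has to follow the line above.

```python
persons = ["marko", "vadas", "josh", "peter"]

def pairs_via_aggregate(source):
    # aggregate("p"): a barrier that holds every person in memory before continuing
    collected = list(source)
    for p1 in collected:          # as("p1")
        for p2 in collected:      # select("p").unfold()
            if p2 != p1:          # where(neq("p1")), applied afterwards
                yield p1, p2

pairs = list(pairs_via_aggregate(iter(persons)))
print(len(pairs))  # 12
```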

Cheers,
Daniel



Eser Kandogan

Dec 2, 2016, 11:36:19 PM
to Gremlin-users
Thanks. How can I do something similar across different labels, i.e. loop over people and software so that I get every combination (p1, s1), (p1, s2), ... whether or not there is a created link between them? The trouble is that I need to start another set of traversals but can't.

I tried: 
g.V().hasLabel('person').as('p').g.V().hasLabel('software').as('s') 

g.V().hasLabel('person').as('p').group().by('name').by{g.V().hasLabel('software').as('s').}     // Here I cannot access 'p' anymore. I need to check something on each combination of p and s, for all p and all s.

(I need to do this in 3.0.1)
 

Daniel Kuppitz

Dec 3, 2016, 12:08:19 AM
to gremli...@googlegroups.com
If I were you, I would use multiple traversals (given that you're stuck in 3.0.1). It's really not worth the effort to squash it all into a single traversal (and 3.0.x was not even able to process these patterns efficiently).
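
As a sketch of what "multiple traversals" could look like on the client side: run one query per vertex label, one for the edges, and combine the results in ordinary code. The three `fetch_*` functions below are hypothetical stand-ins that return hard-coded toy-graph data; in a real application each would wrap its own traversal.

```python
from itertools import product

def fetch_people():
    # stand-in for a separate traversal over 'person' vertices
    return ["marko", "vadas", "josh", "peter"]

def fetch_software():
    # stand-in for a separate traversal over 'software' vertices
    return ["lop", "ripple"]

def fetch_created_edges():
    # stand-in for a separate traversal over 'created' edges
    return {("marko", "lop"), ("josh", "ripple"), ("josh", "lop"), ("peter", "lop")}

edges = fetch_created_edges()

# Every (person, software) combination, whether or not an edge exists.
combos = {(p, s): (p, s) in edges
          for p, s in product(fetch_people(), fetch_software())}

print(len(combos))                  # 8
print(combos[("vadas", "ripple")])  # False
```

The per-combination check here is just edge membership; any other test on p and s would go in its place.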

Cheers,
Daniel



Eser Kandogan

Dec 5, 2016, 2:53:40 PM
to Gremlin-users
Thanks.

I figured out how to do this. Copying it here in case somebody else needs it.

g.V().has('TYPE','PERSON').as('p').
  map(__.outE('PERSON_FOLLOWS').inV().values('UID').fold()).
  group().
    by(select('p').values('UID')).
    by{ c = it;
        g.V().has('TYPE','PERSON').as('p2').
          group().
            by(select('p2').by('UID')).
            by(__.out('PERSON_FOLLOWS').values('UID').is(within(c)).fold().
               as('stat','samples').select('stat','samples').
                 by(count(local)).
                 by(__.unfold().limit(3).fold())) }

Note that the nodes and relationships may differ from those in the graph Daniel is using, but it should give you an idea.
I am also limiting the shared list to 3 to save space.
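
For readers who find the one-liner hard to parse, here is a plain-Python model of what it appears to compute: for each person, the UIDs they follow are captured as c, and every person is then compared against c, keeping a count and at most three samples. The UIDs and the `shared_follow_report` helper are hypothetical, and this is a reading of the query's intent, not a tested port of it.

```python
# UID -> UIDs that person follows (made-up data for illustration)
follows = {
    "A": ["C", "D", "E", "F"],
    "B": ["D", "E", "F", "C"],
    "C": [],
    "D": [],
    "E": [],
    "F": [],
}

def shared_follow_report(follows, sample_size=3):
    report = {}
    for p, targets in follows.items():
        c = set(targets)                      # the folded list captured as c
        report[p] = {}
        for p2, targets2 in follows.items():  # inner pass over every person
            shared = [uid for uid in targets2 if uid in c]    # is(within(c))
            report[p][p2] = {
                "stat": len(shared),                 # by(count(local))
                "samples": shared[:sample_size],     # unfold().limit(3).fold()
            }
    return report

report = shared_follow_report(follows)
print(report["A"]["B"])  # {'stat': 4, 'samples': ['D', 'E', 'F']}
```

As in the query, a person is also compared with themselves, since the inner pass has no neq filter.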

Thank you all.

