Gremlin - To group(), count(), mean()

Balaji R

Mar 22, 2018, 12:04:49 PM

to Gremlin-users

Hi,

How can I group vertices by label and, at the same time, count the incoming and outgoing vertices and compute the mean of a property for each group?

I tried the query below, but it is not working.

g.V().hasLabel('Topics').
    has('topicType','Dead Battery').
    as('topicType', 'avg_sentiment', 'resp_count', 'msg_count').
    in().out().hasLabel('Messages').in().hasLabel('ChatSessions').
    as('sessionId', 'date', 'time', 'status').
    select('topicType', 'avg_sentiment', 'resp_count', 'msg_count', 'sessionId', 'date', 'time', 'status').
      by('topicType').
      by(__.in().values('sentiment').mean()).
      by(__.in().values('response').count()).
      by(__.in().out().hasLabel('Messages').values('message').count()).
      by('sessionId').
      by('date').
      by('time').
      by('status')

Thanks,

Balaji R

Daniel Kuppitz

Mar 22, 2018, 1:17:47 PM

to gremli...@googlegroups.com

Your query doesn't group anything, so I'm not quite sure if my answer is going to be what you're looking for, but I'll just rely on the initial question.

> How to group the vertices by label at the same time count out and in vertices and mean of the property for each group?

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().
    group(). /* group by label */
    by(label).
    unfold().as('kv'). /* unfold and ... */
    select(values).
    project("i","o","m"). /* ... compute statistics for each label */
    by(unfold().inE().count()).
    by(unfold().outE().count()).
    by(unfold().bothE().values('weight').mean()).
    group(). /* regroup by label */
    by(select('kv').select(keys)).
    by(fold().unfold())
==>[software:[i:4,o:0,m:0.5],person:[i:2,o:6,m:0.625]]
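
The three numbers per label can be cross-checked by hand. The sketch below is plain Python rather than Gremlin, and it hard-codes the six vertices and six weighted edges of TinkerPop's "modern" toy graph instead of reading them from a graph, so treat the data layout and the helper name `stats_by_label` as illustrative assumptions. It counts incoming edges (`i`), outgoing edges (`o`), and the mean `weight` over all incident edges (`m`) for each vertex label.

```python
from statistics import mean

# Vertex id -> label for the "modern" toy graph (hard-coded for this sketch).
labels = {1: "person", 2: "person", 3: "software",
          4: "person", 5: "software", 6: "person"}

# (out-vertex, in-vertex, weight) for each of the six edges.
edges = [
    (1, 2, 0.5),  # marko knows vadas
    (1, 4, 1.0),  # marko knows josh
    (1, 3, 0.4),  # marko created lop
    (4, 5, 1.0),  # josh created ripple
    (4, 3, 0.4),  # josh created lop
    (6, 3, 0.2),  # peter created lop
]

def stats_by_label(labels, edges):
    """Per label: incoming edge count, outgoing edge count, mean incident weight."""
    stats = {label: {"i": 0, "o": 0, "weights": []} for label in set(labels.values())}
    for src, dst, weight in edges:
        # The edge is an outgoing edge for the label of its source vertex ...
        stats[labels[src]]["o"] += 1
        stats[labels[src]]["weights"].append(weight)
        # ... and an incoming edge for the label of its target vertex.
        stats[labels[dst]]["i"] += 1
        stats[labels[dst]]["weights"].append(weight)
    return {
        label: {"i": s["i"], "o": s["o"],
                "m": mean(s["weights"]) if s["weights"] else None}
        for label, s in stats.items()
    }

result = stats_by_label(labels, edges)
print(result)
```

An edge whose two endpoints share a label (for example the two `knows` edges) contributes its weight twice to that label's mean, once per endpoint, which is also what `unfold().bothE()` does in the traversal.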

Cheers,

Daniel

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/45b53d46-b1b7-4dd7-bb40-654fff5c9c89%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Balaji R

Mar 23, 2018, 2:08:55 AM

to Gremlin-users

Thanks a lot, Daniel.

Using your query, I was able to get the data for all users. What I need is the group of users under one topic. I expect something like this:

sessionId  topicType  productType           avg_sentiment  resp_count  msg_count  date       time         status
CS0001     Topic 1    Product 1             1.5            4           4          3/22/2018  12:06:56 PM  ACTIVE
CS0002     Topic 2    Product 1, Product 2  3              3           3          3/22/2018  12:06:56 PM  CLOSED

Please help me with this. I have attached my graph model.

Thanks,

Balaji

ChatBot_AI.jpg

Daniel Kuppitz

Mar 23, 2018, 11:13:42 AM

to gremli...@googlegroups.com

First, what's your current query? And next, I don't see users in your model. Please clarify that, provide a small sample dataset and an expected result.

Thanks.

Cheers,

Daniel

Balaji R

Mar 26, 2018, 1:54:50 AM

to Gremlin-users

Sorry Daniel,

I have attached a sample graph and the expected results.

Thanks for your support.

Regards,

Balaji

Expected_Result.JPG

Sample_Graph.jpg

Daniel Kuppitz

Mar 26, 2018, 1:09:19 PM

to gremli...@googlegroups.com

For future questions, it would be nice if you could provide a script that creates your sample graph, like this:

g = TinkerGraph.open().traversal()
g.addV('User').property(id,'USR01').property('status','Active').as('u1').
addV('User').property(id,'USR02').property('status','Closed').as('u2').
addV('User').property(id,'USR03').property('status','Active').as('u3').
addV('Message').property(id,'Message 1').as('m1').
addV('Message').property(id,'Message 2').as('m2').
addV('Message').property(id,'Message 3').as('m3').
addV('Message').property(id,'Message 4').as('m4').
addV('Response').property(id,'Response 1').property('rating',4).as('r1').
addV('Response').property(id,'Response 2').property('rating',3).as('r2').
addV('Response').property(id,'Response 3').property('rating',4).as('r3').
addV('Response').property(id,'Response 4').property('rating',4).as('r4').
addV('Response').property(id,'Response 5').property('rating',4).as('r5').
addV('Topic').property(id,'Banking').as('t1').
addV('Topic').property(id,'Insurance').as('t2').
addV('Topic').property(id,'Investments').as('t3').
addV('Product').property(id,'Prod 1').as('p1').
addV('Product').property(id,'Prod 2').as('p2').
addE('posted').from('u1').to('m1').
addE('posted').from('u1').to('m2').
addE('posted').from('u2').to('m3').
addE('posted').from('u3').to('m4').
addE('responded').from('r1').to('m1').
addE('responded').from('r2').to('m2').
addE('responded').from('r3').to('m2').
addE('responded').from('r4').to('m3').
addE('responded').from('r5').to('m4').
addE('belongs').from('r1').to('t1').
addE('belongs').from('r2').to('t2').
addE('belongs').from('r3').to('t2').
addE('belongs').from('r4').to('t2').
addE('belongs').from('r5').to('t3').
addE('about').from('r1').to('p1').
addE('about').from('r2').to('p1').
addE('about').from('r3').to('p2').
addE('about').from('r4').to('p2').
addE('about').from('r5').to('p2').
iterate()

The query you're looking for could look similar to this one:

gremlin> g.V('Insurance').as('t').
    in('belongs').as('r').
    out('responded').as('m').
    in('posted').
    group('ratings').
      by().
      by(select('r').fold()).
    group('messages').
      by().
      by(select('m').fold()).
    group('products').
      by().
      by(select('r').out('about').fold()).
    barrier().
    dedup().as('u').
    project('userId','topicType','rating','resp_count','msg_count','productType','status').
      by(id).
      by(select('t').by(id)).
      by(select('ratings').unfold().
           where(select(keys).as('u')).
           select(values).unfold().
           values('rating').mean()).
      by(select('ratings').unfold().
           where(select(keys).as('u')).
           select(values).count(local)).
      by(select('messages').unfold().
           where(select(keys).as('u')).
           select(values).unfold().
           dedup().count()).
      by(select('products').unfold().
           where(select(keys).as('u')).
           select(values).unfold().
           dedup().id().fold()).
      by('status')
==>[userId:USR01,topicType:Insurance,rating:3.5,resp_count:2,msg_count:1,productType:[Prod 1,Prod 2],status:Active]
==>[userId:USR02,topicType:Insurance,rating:4.0,resp_count:1,msg_count:1,productType:[Prod 2],status:Closed]
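
The two result rows can be checked against the sample data without a graph engine. The following is a plain-Python sketch, not a Gremlin equivalent: it copies the vertices and edges from the creation script above into dictionaries, which works only because in this sample every response has exactly one `responded`, one `belongs`, and one `about` edge. The function name `summarize` and the dictionary layout are assumptions made for the illustration.

```python
from statistics import mean

# Sample data copied from the graph-creation script in this thread.
rating = {"Response 1": 4, "Response 2": 3, "Response 3": 4,
          "Response 4": 4, "Response 5": 4}
status = {"USR01": "Active", "USR02": "Closed", "USR03": "Active"}
# message -> user who posted it
posted = {"Message 1": "USR01", "Message 2": "USR01",
          "Message 3": "USR02", "Message 4": "USR03"}
# response -> message it responds to
responded = {"Response 1": "Message 1", "Response 2": "Message 2",
             "Response 3": "Message 2", "Response 4": "Message 3",
             "Response 5": "Message 4"}
# response -> topic it belongs to
belongs = {"Response 1": "Banking", "Response 2": "Insurance",
           "Response 3": "Insurance", "Response 4": "Insurance",
           "Response 5": "Investments"}
# response -> product it is about
about = {"Response 1": "Prod 1", "Response 2": "Prod 1",
         "Response 3": "Prod 2", "Response 4": "Prod 2",
         "Response 5": "Prod 2"}

def summarize(topic):
    """One row per user whose messages received responses on `topic`."""
    per_user = {}
    for response, response_topic in belongs.items():
        if response_topic != topic:
            continue
        message = responded[response]
        user = posted[message]
        entry = per_user.setdefault(
            user, {"ratings": [], "messages": set(), "products": set()})
        entry["ratings"].append(rating[response])   # one rating per response
        entry["messages"].add(message)              # distinct messages
        entry["products"].add(about[response])      # distinct products
    return [
        {
            "userId": user,
            "topicType": topic,
            "rating": float(mean(entry["ratings"])),
            "resp_count": len(entry["ratings"]),
            "msg_count": len(entry["messages"]),
            "productType": sorted(entry["products"]),
            "status": status[user],
        }
        for user, entry in sorted(per_user.items())
    ]

rows = summarize("Insurance")
for row in rows:
    print(row)
```

USR03 does not appear because its only response belongs to the Investments topic.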

There's a performance issue, though, which can currently only be solved using lambdas:

gremlin> g.V('Insurance').as('t').
    in('belongs').as('r').
    out('responded').as('m').
    in('posted').
    group('ratings').
      by().
      by(select('r').fold()).
    group('messages').
      by().
      by(select('m').fold()).
    group('products').
      by().
      by(select('r').out('about').fold()).
    barrier().
    dedup().as('u').
    project('userId','topicType','rating','resp_count','msg_count','productType','status').
      by(id).
      by(select('t').by(id)).
      by(flatMap {it.getSideEffects().get('ratings').get(it.get()).iterator()}.
           values('rating').mean()).
      by(map {it.getSideEffects().get('ratings').get(it.get()).size()}).
      by(map {it.getSideEffects().get('messages').get(it.get()).toSet().size()}).
      by(map {it.getSideEffects().get('products').get(it.get()).toSet()*.id()}).
      by('status')
==>[userId:USR01,topicType:Insurance,rating:3.5,resp_count:2,msg_count:1,productType:[Prod 1,Prod 2],status:Active]
==>[userId:USR02,topicType:Insurance,rating:4.0,resp_count:1,msg_count:1,productType:[Prod 2],status:Closed]
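
Judging only from the difference between the two queries, the lambda-free version finds each user's group with `select('ratings').unfold().where(select(keys).as('u'))`, which walks every entry of the side-effect map, while the lambda version fetches the entry directly with `get(it.get())`. Whether that is the whole of the performance issue is an assumption. The contrast can be sketched in plain Python with made-up data; the function names and the 1000-user map are hypothetical.

```python
def find_by_scan(groups, key):
    """Like unfold() plus where(): look at every entry, keep the matching one."""
    inspected = 0
    match = None
    for k, v in groups.items():
        inspected += 1
        if k == key:
            match = v
    return match, inspected

def find_by_lookup(groups, key):
    """Like get(it.get()) on the side-effect map: a single hash lookup."""
    return groups.get(key)

# Hypothetical side-effect map: user id -> ratings collected for that user.
ratings = {f"USR{n:04d}": [n % 5] for n in range(1, 1001)}

match, inspected = find_by_scan(ratings, "USR0042")
direct = find_by_lookup(ratings, "USR0042")
print(match, inspected, direct)
```

Both calls return the same group, but the scan touches all 1000 entries to find it. Repeated once per user and once per `by()` modulator, the scan grows quadratically with the number of users, whereas the lookup stays linear.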

Cheers,

Daniel

Balaji R

Mar 27, 2018, 3:38:03 AM

to Gremlin-users

Thanks a lot Daniel. :)