Groovy Monkey Patching, Trees, and Gremlin

479 views
Skip to first unread message

James Thornton

unread,
Aug 18, 2011, 7:56:26 PM8/18/11
to gremli...@googlegroups.com
My search for a way to handle trees in Gremlin has ironically led me to monkey patching.

Here is what I have so far -- it's a tree closure (http://paste.pocoo.org/show/460538/), and at the moment I'm calling it via:

gremlin> results = []
gremlin> g.v(1).sideEffect{ tree(it) }

...or like this...

gremlin> results = []
gremlin> tree(g.v(1))

In these examples, because of how the tree closure is written, after you call them results will contain the tree of outgoing vertices starting at vertex one. 

However, what if I wanted the incoming vertices instead of the outgoing ones? 

One way to do it would be to pass in the edge direction as a param to the closure, but a more elegant way would be to call tree as a closure on TinkerVertex like this:

gremlin> g.v(1).in('knows').tree{}

But this won't right now work because com.tinkerpop.blueprints.pgm.impls.tg.TinkerVertex doesn't know anything about tree. However, you should be able to dynamically add tree{} (and any custom closure you desire) to TinkerVertex with a little Groovy monkey patching. 

gremlin> com.tinkerpop.blueprints.pgm.impls.tg.TinkerVertex.metaClass.tree = tree

Here are some docs on monkey patching in Groovy:


I am still experimenting with this idea and still looking into the details on how to do make it work. Having it at the Pipe level would be ideal, but unless you are a Pipes wizard, it may not be practical to create a Pipe for every closure you desire.

What do you think of this approach?

- James








James Thornton

unread,
Aug 18, 2011, 8:58:28 PM8/18/11
to gremli...@googlegroups.com
BTW what do you think of paring this with a "stored-procedure" library in Rexster? (https://github.com/tinkerpop/rexster/issues/86)

- James

James Thornton

unread,
Aug 19, 2011, 1:35:59 AM8/19/11
to gremli...@googlegroups.com
Actually, the right way to do this is probably to use the Gremlin defineStep method:

 Gremlin.defineStep("tree", [Vertex,Pipe], {_().sideEffect{ tree(it) }} );

...for some reason I keep forgetting that's there :)

- James

James Thornton

unread,
Aug 19, 2011, 6:26:45 AM8/19/11
to gremli...@googlegroups.com
Ok, here's what I have so far (http://paste.pocoo.org/show/460694/).

The "tree" step takes two required parameters and one optional one: 

param 0: The results list.
param 1: The edge direction (either "in" or "out").
param 2: The edge label (optional).

Here are some examples...

gremlin> load treeTest.groovy      

gremlin> results = []
gremlin> g.v(1).tree(results,"out")
==>v[1]
gremlin> results
==>v[1]
==>[v[2], v[3], v[4]]
==>[v[5], v[3]]

gremlin> results = []              
gremlin> g.v(1).tree(results,"out","knows")
==>v[1]
gremlin> results                           
==>v[1]
==>[v[2], v[4]]
 
My Groovy/Gremlin fu isn't that strong yet, and this is the first time I have tried to create a named step so I'm sure there are ways to improve it -- suggestions welcome.

- James

James Thornton

unread,
Aug 19, 2011, 8:54:32 AM8/19/11
to gremli...@googlegroups.com
By using the Gremlin transform step instead of sideEffect, I was able to get rid of the "results" param. 

And by breaking tree() into into two steps, inTree() and outTree(), I was able to get rid of the "direction" param. Now the only param is the optional "label" param:

http://paste.pocoo.org/show/460754/


gremlin> load treeTest.groovy   

gremlin> g.v(1).outTree()       
==>[v[1], [v[2], v[3], v[4]], [v[5], v[3]]]

gremlin> g.v(1).outTree("knows")
==>[v[1], [v[2], v[4]]]

- James
 

Pierre De Wilde

unread,
Aug 19, 2011, 9:33:20 AM8/19/11
to gremli...@googlegroups.com
Hi James,

Nice user-defined step demo usage. When finalized, please add it in wiki (Gremlin Cookbook).

First line _tree = {} is not really necessary.

gremlin> g.v(1).outTree()
==>[v[1], [v[2], v[3], v[4]], [v[5], v[3]]]

How can I identify the relationship between v[4] and [v[5], v[3]] ?

v[1]
v[2]
v[3]
v[4]
v[5]
v[3]

Another example:

gremlin> g.v(3).inTree()
==>[v[3], [v[1], v[4], v[6]], [v[1]]]

v[3]
v[1]
v[4]
v[1]
v[6]


Thanks,
Pierre

Marko Rodriguez

unread,
Aug 24, 2011, 4:42:36 PM8/24/11
to gremli...@googlegroups.com
Hi,

This is nice. A couple of things.

tree = { vertices ->
  results << vertices
  
  vertices.each() {
    children = it.out().toList()
    if (children) 
      tree(children)
  }
  results
}

This could be a Pipe, but I don't see a nice way in which this TreePipe would flow its results to yet another pipe. In other words, its more of an "end method" than a pipe?... Perhaps a it could be a SideEffectPipe. E.g.

t = []
g.v(1).out.tree(t).name
==>josh
==>vadas
==>lop
t
==>[[][][]]]] // the embedded list tree data structure. I didn't feel like writing it all out.

In this way, TreePipe is like TablePipe.

Thoughts?

Also, how to generalize so out() is the determinant path description? .. What about any arbitrary path description? E.g.

g.v(1).out.tree(t, _().out('knows')).name

where the "knows tree" is created.

See ya,
Marko.

James Thornton

unread,
Aug 25, 2011, 1:10:35 AM8/25/11
to gremli...@googlegroups.com
Hey Marko -

As Pierre pointed out, this isn't quite right...

tree = { vertices ->
  results << vertices
  
  vertices.each() {
    children = it.out().toList()
    if (children) 
      tree(children)
  }
  results
}


It outputs this...

gremlin> g.v(1).outTree()       
==>[v[1], [v[2], v[3], v[4]], [v[5], v[3]]]


Here's what the output should be (ignore the missing Vs, just look at the structure)...

[1, [2, 3, 4, [5, 3]]]


The problem is the recursive algo above isn't what I want to use -- it's just my closest attempt at tree recursion for Groovy closures.

Here's how I want to right it...

https://gist.github.com/1170000


But there's something about Groovy scope/recursion that I'm missing. Why are those "this Collection"s about? 

Will you summon your Groovy skills and tell me why this doesn't work?


> This could be a Pipe, but I don't see a nice way in which this TreePipe would flow its 
> results to yet another pipe. In other words, its more of an "end method" than a pipe?

I see it as more of an end method too. 

> g.v(1).out.tree(t, _().out('knows')).name

Yeah, that might work -- let me play around with it.

- James




James Thornton

unread,
Aug 25, 2011, 1:12:35 AM8/25/11
to gremli...@googlegroups.com
> Here's how I want to right it...

*write it (but I guess in this case "right it" works too :-)

- James

infojunkie

unread,
Sep 5, 2011, 10:47:16 PM9/5/11
to Gremlin-users
> As Pierre pointed out, this isn't quite right...
[..]
> But there's something about Groovy scope/recursion that I'm missing. Why are
> those "this Collection"s about?
>
> Will you summon your Groovy skills and tell me why this doesn't work?
>

I needed the very same thing, so I debugged your code. Here's
something that works better for me: https://gist.github.com/1196430

The trick was to use
def results = []
inside the function.

N00b question (this is my 1st day of writing Groovy and 1st week of
using Gremlin) : how can I iterate over the results?

Pierre De Wilde

unread,
Sep 6, 2011, 1:58:15 AM9/6/11
to gremli...@googlegroups.com
Hey,

Excellent. Using your code, we get expected results:

gremlin> g.v(1).outTree()
==>[v[1], [v[2], v[3], v[4], [v[5], v[3]]]]

gremlin> g.v(3).inTree() 
==>[v[3], [v[1], v[4], [v[1]], v[6]]]

> how can I iterate over the results?

add .toList() 

gremlin> x = g.v(3).inTree() 
==>[v[3], [v[1], v[4], [v[1]], v[6]]]
gremlin> x
gremlin> x = g.v(3).inTree().toList()
==>[v[3], [v[1], v[4], [v[1]], v[6]]]
gremlin> x
==>[v[3], [v[1], v[4], [v[1]], v[6]]]

HTH,
Pierre

James Thornton

unread,
Sep 6, 2011, 4:20:58 AM9/6/11
to gremli...@googlegroups.com
> The trick was to use 
> def results = [] 
> inside the function. 

def -- three little letters -- that's all it was??? :)

I've been trying to get to the bottom of this for weeks -- on StackOverflow (http://stackoverflow.com/questions/7130318/how-does-groovy-handle-closure-scope-and-recursion), here, and elsewhere. 

This little bug even inspired me to write Mogwai (https://github.com/espeed/mogwai) so I could write the closure in Python/Jython instead.

Good work. Thank you :)

- James




Tobias Bradtke

unread,
Sep 6, 2011, 5:27:55 AM9/6/11
to gremli...@googlegroups.com
This is too crazy!

Yesteday I had a huge problem with a self defined Gremlin-Step that just gave wrong results. It seemed it just canceled processing somewhere without giving an error and I could not figure out why. I ended with an ugly workaround.

After reading this I threw some "def"s to it and it just works fine as expected. =:D


Thanks a lot for that hint!!
I repeat: Just too crazy ;)

webwurst

James Thornton

unread,
Sep 6, 2011, 5:47:38 AM9/6/11
to gremli...@googlegroups.com

Marko Rodriguez

unread,
Sep 6, 2011, 10:25:21 AM9/6/11
to gremli...@googlegroups.com
Hey,

Excellent. Using your code, we get expected results:

gremlin> g.v(1).outTree()
==>[v[1], [v[2], v[3], v[4], [v[5], v[3]]]]

gremlin> g.v(3).inTree() 
==>[v[3], [v[1], v[4], [v[1]], v[6]]]

> how can I iterate over the results?

** WARNING: I may be saying what you guys are trying to do ***

It might be a good idea to have outETree, inETree, outTree, inTree, and ?outVTree?/?inVTree? pipes. Moreover, these are not "end pipes", but instead, use throughout a query to build a tree up and thus, will not require recrussion in the classic sense.

For example:

g.v(1).outTree.outTree

the first outTree will do
==>[v[1], [v[2], v[3], v[4]]]

The second outTree will do:
==>[v[1], [v[2], v[3], [v[4],[v[5], v[3]]]]]

In essence, code wise, it finds the deepest sublist of the incoming list and then calculates the respect out vertices and appends them to that list and emits them.

?? .. . ? ...  but what about then idTree, labelTree, ... :| .... . . . . . . . . . .  . .. . . . .  
can there be a generic pipe like fold? g.v(1).out.fold.out.fold <=> g.v(1).outTree.outTree ?
Marko.

Pierre De Wilde

unread,
Sep 6, 2011, 10:44:26 AM9/6/11
to gremli...@googlegroups.com
Hey,

You're right.

+1 for generic pipe

Preference for .tree vs .fold:

g.v(1).out.tree.out.tree <=> g.v(1).outTree.outTree
g.v(1).out.fold.out.fold <=> g.v(1).outTree.outTree

Pierre

Marko Rodriguez

unread,
Sep 6, 2011, 10:45:42 AM9/6/11
to gremli...@googlegroups.com
You're right.

+1 for generic pipe

Preference for .tree vs .fold:

g.v(1).out.tree.out.tree <=> g.v(1).outTree.outTree
g.v(1).out.fold.out.fold <=> g.v(1).outTree.outTree

Who knows how to implement that? :) ... I have to be honest, thinking in terms of embedded lists (a tree) makes my go loopy.

Marko.

Redouane Lakrache

unread,
Sep 23, 2011, 3:29:48 PM9/23/11
to gremli...@googlegroups.com
We are making a small php api that represents data as a tree extracted from a bigger graph.
It would be great to have this supported by gremlin

Thank you.

James Thornton

unread,
Sep 23, 2011, 5:55:50 PM9/23/11
to gremli...@googlegroups.com
Yeah, a tree pipe would be cool. But you can use this Gremlin tree script right now (https://gist.github.com/1197179).  

Marko Rodriguez

unread,
Sep 23, 2011, 5:58:21 PM9/23/11
to gremli...@googlegroups.com

I agree we should do a tree pipe. There is a ticket. If some wants to pull request and its good, I will merge.

On cell so don't have ticket number off hand.

Marko.

Redouane Lakrache

unread,
Sep 24, 2011, 7:43:11 PM9/24/11
to gremli...@googlegroups.com
Hello everybody,
I've been able to run the tree step pointed by James and it works great.
Don't know if what i'm going to say is revelant or not, but i see this step more of a sideEffect than a final step.
It would be great to have something like that:

tree = []
g.v(1).out.toTree(tree).out.out('knows').toTree(tree).out().toTree()

so we can not only have a tree from the graph but a "shortcutted" one

v[1]
|_v[2]*
|  |_v[4]
|  |  |_v[8]*
|  |  |  |_v[11]*
|  |  |_v[9]*
|  |_v[5]
|_v[3]*
   |_v[6]
   |_v[7]
      |_v[10]*

Only vertices marked by * would be on the subtree
_v[2]
|  |_v[8]
|  |  |_v[11]
|  |_v[9]
|_v[3]
   |_v[10]

I don't have enough gremlin skills yet, but i'll continue digging into that.
Thaks

Redouane Lakrache

unread,
Sep 28, 2011, 1:30:58 AM9/28/11
to gremli...@googlegroups.com
Hello everybody,
Here's an attempt to make a gremlin (side effect) defined step for building trees https://gist.github.com/1247049.
Sorry if the code is a bit ugly (i never dealt with java code), so feel free to refine and test it.

One question:
How can i pass the tree as an argument to the defined step so i can name my tree variable whatever i want?

Thank you
Reply all
Reply to author
Forward
0 new messages