> I've been playing with the new Groovy-based Gremlin and I like it a
> lot.
Nice. Yeah, the old Gremlin seems ages ago now. Glad you made the switch.
> One thing I've learned is that it is very important to know the
> class of the object you are acting upon.
Super important. In general, _() is a way to turn any object into the start of a Pipeline.
> // Who are the artists
> g.V[['type':'artist']].name
Cool. Though, if you have an index, you can do:
g.idx(T.v)[[type:'artist']].name // much faster than a linear run through all vertices
Moreover, you don't need [['type':...]]; you can write [[type:...]]. Groovy treats an unquoted map key as a string.
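A quick plain-Groovy check of that key rule, with no Gremlin involved. The variable names are only for illustration:

```groovy
// Map literal keys are taken as strings unless wrapped in parentheses.
def quoted = ['type': 'artist']
def unquoted = [type: 'artist']

assert quoted == unquoted
assert unquoted.keySet().first() instanceof String

// To use a variable's value as the key, parenthesize it.
def keyName = 'kind'
def byVariable = [(keyName): 'artist']
assert byVariable == [kind: 'artist']
```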
> // Who sang the most songs
> m=[:]
> g.V[['type':'song']].out('sung_by').name.groupCount(m)
> m.sort{a,b -> b.value <=> a.value}
Cool. Again, g.idx() would be faster.
> // Filter out one-hit wonders - note the sort returns a map, not a
> pipe
> m.sort{a,b -> b.value <=> a.value}.findAll{it.value > 1} // The Groovy
> way, which returns another map
That works....
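To make the "returns a map, not a pipe" point concrete, here is a small plain-Groovy sketch. The counts are made-up toy data standing in for whatever groupCount(m) would fill in:

```groovy
// Toy counts standing in for the map that groupCount(m) fills.
def m = ['Artist A': 3, 'Artist B': 1, 'Artist C': 5]

// Sorting a map with a two-argument closure returns a new, ordered map.
def sorted = m.sort { a, b -> b.value <=> a.value }
assert sorted instanceof Map
assert sorted.collect { it.key } == ['Artist C', 'Artist A', 'Artist B']

// findAll on a map also returns a map, so the result is still not a pipe.
def repeatArtists = sorted.findAll { it.value > 1 }
assert repeatArtists instanceof Map
assert repeatArtists == ['Artist C': 5, 'Artist A': 3]
```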
> // I discovered that I could accomplish the same thing with a filter
> if I used the '_' step
> m.sort{a,b -> b.value <=> a.value}._{it.value > 1} // The Gremlin way,
> which returns a pipe
You could do that, or you could do it the "real" Gremlin way:
g.idx(T.v)[[type:'song']].out('sung_by'){it.in('sung_by').count() > 1}.name.groupCount(m)
In short, don't insert into the map anyone with fewer than two songs. However, this is probably less efficient, given that 'sung_by' is traversed twice. You can be verbose/explicit with:
g.idx(T.v)[[type:'song']].out('sung_by').filter{it.in('sung_by').count() > 1}.name.groupCount(m)
> // While I am at it, how about a histogram of the values in map m
> i=[:]
> // If I have a pipe, it is easy
> m.sort{a,b -> b.value <=> a.value}._{it.value >
> 2}.transform{it.value}.groupCount(i)
> // But if I do not perform the filter using the Gremlin method, then I
> need a way to transform my List to a pipe
> m.sort{a,b -> b.value <=> a.value}.findAll{it.value >
> 2}.collect{it.value}.groupCount(i) // throws exception
> // And since I want the full histogram, I don't want the findAll
> m.sort{a,b -> b.value <=> a.value}.collect{it.value}.groupCount(i) //
> throws exception
> // The simplest way I could find to do this was:
> m.sort{a,b -> b.value <=>
> a.value}.collect{it.value}._{true}.groupCount(i)
Simply do _(). If it's not a pipe (some other object), then _() is required. If it is a pipe, then a plain _ can be used.
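For reference, the histogram being accumulated into i can be sketched in plain Groovy. This is a hand-rolled illustration of the counting idea on made-up data, not the Gremlin groupCount step itself:

```groovy
// Toy song counts per artist (made-up data).
def m = ['Artist A': 3, 'Artist B': 1, 'Artist C': 3, 'Artist D': 2]

// Histogram: how many artists share each song count.
def i = [:]
m.values().each { count ->
    i[count] = (i[count] ?: 0) + 1   // start at 0 the first time a count is seen
}

assert i == [3: 2, 1: 1, 2: 1]
```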
> // A closure that returns the 'it' works as well
> m.sort{a,b -> b.value <=>
> a.value}.collect{it.value}._{it}.groupCount(i)
It's not passing 'it' through as a value. The filter evaluates the closure's result as a boolean, and a non-null, non-zero count is true, so it behaves the same as {true}. This is known as "Groovy Truth".
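A small plain-Groovy sketch of Groovy Truth, using findAll in place of the Gremlin filter step (that substitution is only for illustration):

```groovy
// With positive counts, {it} and {true} keep exactly the same elements.
assert [3, 1, 2].findAll { it } == [3, 1, 2]
assert [3, 1, 2].findAll { true } == [3, 1, 2]

// Groovy Truth treats null, zero, and the empty string as false,
// so {it} would silently drop such values.
assert [3, 0, null, '', 2].findAll { it } == [3, 2]
```

So {it} only matches {true} here because every count in the map is at least 1.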
> // However, both of these solutions insert an extra filter pipe into
> the flow:
> println m.sort{a,b -> b.value <=>
> a.value}.collect{it.value}._{it}.groupCount(i)
> [IdentityPipe, ClosureFilterPipe, GroupCountClosurePipe]
Again, just do _().
> // I do not understand why I must add the filter
> m.sort{a,b -> b.value <=>
> a.value}.collect{it.value}._.groupCount(i) // Why does this throw an
> exception
> Exception evaluating property '_' for java.util.ArrayList, Reason:
> groovy.lang.MissingPropertyException: No such property
>
> // I found the answer; the identity pipe step ('_'), if not followed
> by a pair of braces must be followed by a pair of parentheses.
> println m.sort{a,b -> b.value <=>
> a.value}.collect{it.value}._().groupCount(i)
> [IdentityPipe, GroupCountClosurePipe]
>
> // FWIW, you can optionally perform the identity earlier and then use
> the transform step to retrieve the map values - maybe this could be
> faster if pipes were executed in parallel?
> m.sort{a,b -> b.value <=>
> a.value}._().transform{it.value}.groupCount(i)
There you go. _() basically does this:
IdentityPipe.setStarts(previousObject.iterator()) // Object.iterator() is a metaMethod provided by Groovy to coerce any object into an iterator.
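The iterator() coercion can be tried on its own in plain Groovy. This sketch only shows the Groovy side, not the pipe wiring:

```groovy
// A list iterates over its elements.
def fromList = [1, 2, 3].iterator()
assert fromList instanceof Iterator
assert fromList.toList() == [1, 2, 3]

// A map iterates over its entries.
def fromMap = [a: 1, b: 2].iterator()
assert fromMap.next().key == 'a'

// A single non-collection object is treated as a collection of one.
def fromSingle = 42.iterator()
assert fromSingle.next() == 42
assert !fromSingle.hasNext()
```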
> Some final questions: Would it be possible to teach gremlin that a
> lone _ is really a _() (perhaps through the dynamic programming
> mechanism in Groovy that allows methods to look like properties)?
Adding methods/properties to classes can have terrible ripple effects on other Groovy libraries the user might be using. As such, I've only overloaded java.lang.Object with one metaMethod -- Object._(). It's the way of turning any object into a pipeline.
> Also, is there a better way to do what I was doing? Care to take a
> stab at the artist-sings-two-songs-in-a-row problem?
g.idx(T.v)[[type:'song']].out('sung_by').sideEffect{x = it}.back(2).out('followed_by').out('sung_by'){x == it}.name.groupCount(m)
That says:
1. For every song vertex.
2. Determine who sang it.
3. Save that person to variable x.
4. Jump back to the song (as we were at the person).
5. Find the songs that follow that song.
6. Get who sang those songs and filter out anyone who is not x.
7. Count the person's name in m.
Thus, each count in m is the number of times a person sang two songs in a row.
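The steps above can be sketched in plain Groovy over a toy in-memory structure. The maps sungBy and followedBy and the song/artist names are hypothetical stand-ins for the graph edges, not part of any Gremlin API:

```groovy
// Hypothetical toy data: song -> singer, and song -> songs that follow it.
def sungBy = [s1: 'Artist A', s2: 'Artist A', s3: 'Artist B',
              s4: 'Artist B', s5: 'Artist A']
def followedBy = [s1: ['s2'], s2: ['s3'], s3: ['s4'], s4: ['s5']]

def m = [:]
sungBy.each { song, x ->                          // steps 1-3: song, its singer x
    (followedBy[song] ?: []).each { following ->  // steps 4-5: back to song, next songs
        if (sungBy[following] == x) {             // step 6: keep only the same singer
            m[x] = (m[x] ?: 0) + 1                // step 7: count the singer in m
        }
    }
}

assert m == ['Artist A': 1, 'Artist B': 1]
```

Here s1 and s2 give Artist A one back-to-back pair, and s3 and s4 give Artist B one.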
Here is a variation of the same computation:
Assuming Gremlin 1.1:
g.idx(T.v)[[type:'song']].as('song').out('sung_by').sideEffect{x = it}.back('song').out('followed_by').out('sung_by'){x == it}.name.groupCount(m)
Assuming Gremlin 1.2-SNAPSHOT:
g.idx(T.v)[[type:'song']].as('song').out('sung_by').sideEffect{x = it}.back('song').out('followed_by').out('sung_by'){x == it}.groupCount(m){it.name}
Assuming Gremlin 1.2-SNAPSHOT and being "filter explicit":
g.idx(T.v)[[type:'song']].as('song').out('sung_by').sideEffect{x = it}.back('song').out('followed_by').out('sung_by').filter{x == it}.groupCount(m){it.name}
Hope that helps. That was fun. Thanks for your mind dump.
Marko.
>
> Thanks,
> Paul Jackson