A humble migration guide from Gremlin 2 to Gremlin 3


Damien Seguy

Mar 22, 2016, 1:45:14 PM
to Gremlin-users

A few notes before the list: I am sharing it in the hope that it helps others find their bearings when moving from Gremlin 2 to 3. I have made Gremlin work for me, but I still have a lot of experience to gain. Hopefully, anyone with better knowledge will correct this humble article.


Gremlin 3 has been around for quite some time, and Gremlin 2 is dead. Time to move code. Since this is a major version change, the migration is rather steep. It requires a lot of modifications, not just a few renamed API calls. I scanned the Gremlin 2 queries in my code, and here are the first directions that I found.



My context: I use Gremlin via the HTTP API. The underlying database is Neo4j, and Gremlin runs as a plug-in. The guide focuses on query building.


  • g.idx('index')
    g.idx('index')[['key': 'value']] is gone. Indexes have moved from Gremlin to the underlying graph engine, and any index has to be created with the engine's own API or language. For TitanDB, that is the Titan ManagementSystem API, and for Neo4j, it is Cypher.
    Once the index is built, Gremlin attempts to use the right indices if the query starts with
    g.V().has(). Besides that, there is nothing special to do, nor to note.
    An interesting alternative to indexing is the subgraph() step: it creates a partition of the graph that may be used later as a basis for another query. That may be a good way to select nodes with a coarse sieve first, then run more complex queries on a smaller set of nodes, which is always a good strategy. Now, if subgraphs have to be used several times, where can they be stored? Graph variables() or sack()?

  • back() doesn't exist
    The as() step is available, and may be coupled with select() to access previously named steps. Given some weird behaviors of back(), this looks interesting. It is also useful to collect several nodes during the query and finally return them together as an array.

  • Lambda steps sideEffect() and filter() are to be avoided
    They are still available, though, which makes for an easy migration. At first glance, filter() may be replaced by the where() step. sideEffect() will have to be replaced by a mix of other steps. This will require some work, and may eventually make the queries a lot cleaner: nothing decays as quickly as a big string of code inside a step.

  • The loop() step is now called repeat()
    It comes with the modulators emit(), times() and until(). That will help readability, and there is no more need for it.object or it.loops inside closures.

  • The groupBy() step has been renamed group()
    It now has a modulator called by(). This is the same evolution as for loop().

  • The transform() step is gone.
    My queries make some use of this step when they need to apply a transformation during the traversal. It may be to follow an optional link ( transform{ if (it.out('METHOD').any()) { it.out('METHOD').next(); } else { it; }} ) or to replace the current element with an arbitrary element stored in an index ( transform{ g.idx("classes")[["path": it.fullnspath]].next(); } ). Those steps may need the most rethinking, short of replacing transform() with the lambda map() step for the time being.

  • setProperty() and removeProperty() are gone, and have been replaced by property(). The other CRUD operations are less explicit: replacing a property is just setting it again with property(), and removing one is done with properties('key').drop() rather than by setting it to NULL. Getting the list of properties uses the valueMap() step.

  • map() is now valueMap(), with the possibility to get all the properties or a sub-selection of them.

  • More goodness
    While reading the docs, match() and timeLimit() look very interesting: match() checks that several sub-patterns hold at the same time. When the search pattern looks like a star, it is going to be great.
    timeLimit() is a step that limits the amount of time spent on a query, and reports any data found during that time. This looks very useful to keep a system responsive, as long as the user can do something with partial results.
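
To make the list above more concrete, here is a rough sketch of what several of these replacements can look like in the Gremlin Console. It is only an illustration under assumptions: it uses the TinkerGraph "modern" toy graph (TinkerFactory.createModern()) rather than my Neo4j setup, the labels and property names ('person', 'created', 'knows', 'name', 'age') come from that toy graph, and the exact behavior may vary between TinkerPop 3 versions.

```groovy
// Assumption: run inside the TinkerPop 3 Gremlin Console, where __ and P are statically imported.
graph = TinkerFactory.createModern()   // toy graph shipped with TinkerGraph
g = graph.traversal()

// g.idx('index')[['key':'value']]  ->  start with V().has() so the engine may use its own index
g.V().has('name', 'marko')

// back('x')  ->  as() + select(), which can also return several named elements at once
g.V().has('name', 'marko').as('a').out('knows').as('b').select('a', 'b').by('name')

// filter{ it.age > 30 }  ->  has() with a predicate, or where() against a labelled step
g.V().has('age', gt(30)).values('name')
g.V().as('a').out('created').in('created').where(neq('a')).values('name')

// loop()  ->  repeat() with times(), until() or emit()
g.V().has('name', 'marko').repeat(out()).times(2).values('name')
g.V().has('name', 'marko').repeat(out()).until(has('name', 'ripple')).path().by('name')
g.V().has('name', 'marko').repeat(out()).emit().values('name')

// groupBy{key}{value}  ->  group().by(key).by(value)
g.V().hasLabel('person').group().by('age').by('name')

// transform{ optional link }  ->  coalesce(): follow the edge if it exists, else keep the element.
// Unlike the Gremlin 2 closure, this emits every matching neighbour, not only the first one.
g.V().hasLabel('person').coalesce(out('created'), identity()).values('name')

// map()  ->  valueMap(), for all properties or a sub-selection
g.V().has('name', 'marko').valueMap()
g.V().has('name', 'marko').valueMap('name', 'age')

// setProperty() / removeProperty()  ->  property() and properties().drop()
g.V().has('name', 'marko').property('age', 30).iterate()
g.V().has('name', 'marko').properties('age').drop().iterate()

// match(): several patterns that must hold together (a star-shaped search)
g.V().match(
    __.as('a').out('created').as('b'),
    __.as('b').has('name', 'lop'),
    __.as('b').in('created').as('c'),
    __.as('c').has('age', 29)).
  select('a', 'c').by('name')

// timeLimit(): stop letting results through after the given number of milliseconds
g.V().repeat(both()).times(10).timeLimit(100).count()
```

None of these lines use a lambda, which is the main point: each Gremlin 2 closure becomes one or more declarative steps that the engine is free to optimise.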


Thanks for reading: do not hesitate to correct any misconception.
I'll start experimenting with this new version of Gremlin soon and gain experience.

Marko Rodriguez

Mar 22, 2016, 1:47:11 PM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
That is a really nice rundown. 

Thanks for doing that.

Marko.