Creating a host language embedded Gremlin language variant.

Marko Rodriguez

Apr 12, 2016, 10:37:41 AM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Hello everyone,

Please see the section entitled "Host Language Embedding" here:

When I was writing up this section, I noticed that most of the language drivers that are advertised on our homepage (http://tinkerpop.incubator.apache.org/#graph-libraries) know how to talk to Gremlin Server via web sockets, REST, etc., but rely on the user to create a String of their graph traversal and submit it. For instance, here is a snippet from the Gremlin-PHP documentation:

$db = new Connection([
    'host' => 'localhost',
    'graph' => 'graph',
    'username' => 'pomme',
    'password' => 'hardToCrack'
]);
//you can set $db->timeout = 0.5; if you wish
$db->open();
$db->send('g.V(2)');
//do something with result
$db->close();

$db->send(String) is great, but it would be better if the user didn't have to leave PHP.

Please see this ticket:

I think for non-JVM languages, it would be nice if these drivers (PHP, JavaScript, Python, etc.) didn't require the user to explicitly create Gremlin-XXX Strings, but instead used either JINI or model-3 in the ticket above. Let's look at model-3, as I think it's the easiest and most general.

For instance, they would have a class in their native language that would mirror the GraphTraversal API. *** I don't know any other languages well enough, so I'm just going to do this in Groovy :), hopefully you get the generalized point. ***

public class Test {

  String s;

  public Test(final String source) {
    s = source;
  }

  public Test() {
    // "__" is the anonymous traversal prefix, so nested traversals render as __.outE(...)
    s = "__";
  }

  public Test V() {
    s = s + ".V()";
    return this;
  }

  public Test outE(final String label) {
    s = s + ".outE(\"${label}\")";
    return this;
  }

  public Test repeat(final Test test) {
    s = s + ".repeat(${test.toString()})";
    return this;
  }

  public String toString() {
    return s;
  }
}

Then, via fluency (function composition) and nesting, you could generate a Gremlin-Groovy (or whichever ScriptEngine language) traversal String in the backend.

gremlin> g = new Test("g");
==>g
gremlin> g.V().outE("knows")
==>g.V().outE("knows")
gremlin>
gremlin> g = new Test("g");
==>g
gremlin> g.V().repeat(new Test().outE("knows"))
==>g.V().repeat(__.outE("knows"))
gremlin>

From there, that String is then submitted as you normally do with your driver. For instance, with Gremlin-PHP, via $db->send(String). 
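
For readers outside the JVM, the same idea can be sketched in Python. This is only an illustration: the class name `Traversal`, the `script` attribute, and the choice of "__" as the default prefix for nested traversals are assumptions of the sketch, not part of any existing driver.

```python
class Traversal:
    """Builds a Gremlin-Groovy string; mirrors a tiny slice of GraphTraversal."""

    def __init__(self, source="__"):
        # Assumption: a traversal created without a source is a nested
        # (anonymous) one, so it starts with Gremlin's "__" prefix.
        self.script = source

    def V(self):
        self.script += ".V()"
        return self

    def outE(self, label):
        self.script += '.outE("%s")' % label
        return self

    def repeat(self, inner):
        # The nested traversal is rendered through its own __str__.
        self.script += ".repeat(%s)" % inner
        return self

    def __str__(self):
        return self.script


g = Traversal("g")
print(g.V().repeat(Traversal().outE("knows")))  # g.V().repeat(__.outE("knows"))
```

The resulting string would then be handed to whatever submit call the driver already offers.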

Of course, if your driver is already on a JVM language, there is no reason to do this (e.g. Gremlin-Scala), but if you are not on the JVM, this gives the user host language embedding and a more natural "look and feel." Moreover, if your language doesn't use "dot notation," you would use the natural idioms of your language. 

$g->V->outE("knows")

If anyone is interested in updating their non-JVM language driver to use this model, I would like to write a blog post about it. Or perhaps, a tutorial for language designers.

Thoughts?,
Marko.

David Brown

Apr 12, 2016, 11:37:05 AM
to d...@tinkerpop.incubator.apache.org, gremli...@googlegroups.com
Thanks for this Marko.

The original Python OGM mogwai included a limited subset of this sort
of functionality that allowed the user to create traversals (always
with a start node) using an interface that mirrored the Blueprints
API. In my initial port to TP3 (Goblin) [1], I changed this to class
`V` [2] and did a minimum refactor to get everything working with the
new stack. Currently, this part of the library is under review (both
the API and the internals); maybe we will move towards model-3 as you
suggest. However, I'm not sure if this will be a top priority, as
there are other bits of functionality we are working on currently.

Anyway, I thought you may like to know. I also think that any
resources you provide for the community are definitely welcome! Thanks
to the whole TinkerPop team!

1. http://goblin.readthedocs.org/en/latest/index.html
2. http://goblin.readthedocs.org/en/latest/usage.html#the-v-ertex-centric-query-api

Btw, a real release of Goblin is coming this week. I'll announce on the lists...

Best,
Dave
--
David M. Brown
R.A. CulturePlex Lab, Western University

Dylan Millikin

Apr 12, 2016, 2:22:19 PM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Hey, 

This might be a bit long but should explain a few of the pitfalls of making a gremlin language variant outside of the JVM.

The biggest challenges fall around the following categories. I'll elaborate on these further down :
  1. Method overloading is not always native or it is implemented differently. (some languages have very limited typing as well)
  2. Even if the overloading is handled correctly conflicts can arise from lack of typing in some languages, using bindings and/or server side variables.
  3. For a functional library the APIs for Graph, traversal, Elements(node/vertex), and native Java API need to be made available. 

There are also further issues related to gremlin versions, performance, and functionality that I'll skim at the end of this post.

1. Method overloading :

abstract class Query {
   public function has(PropertyKey $key); //1
   public function has(PropertyKey $key, Object $value); //2
   public function has(Label $label, String $value); //3
   public function has(VertexId $id, Long $value); //4
   public function has(VertexId $id, Int $value); //5
   public function has(VertexId $id, Predicate $p); //6
}

The above is illegal in languages like PHP (or javascript?). Instead we're stuck with :

abstract class Query {
   public function has(Array $args);
}

We're then left to figure out what is what in the array and sort out how we need to stringify the output. 

If the user does $g->V()->has("label", "user") do we add quotes to the first argument or is it a label/id? What about the second argument, is it a predicate? etc.  This gets complex very quickly.
And what if I had $g->V()->has("id", 36) . PHP only supports Int so one of the two signatures (4 or 5) needs to give as we have a major conflict. This example is fictional for has() but I've run into this on a couple of other methods, just can't remember which. 

Another example would be  g.V().has(id, neq(m)) . We could imagine the following PHP equivalent $g->V()->has(new Id(), Predicate::neq("m")) where Id() is a class that helps us recognize this type, and neq() a static method of Predicate. However "m" has to be passed as string and we have no clue what m is... is this a string or a binding or a server side variable? More on this in point 2.

To close things off here there's also the case of signatures like out(String... edgeLabels) that need their own logic.

Conclusion: There's a lot of manual work that needs to go into separating the logic between signatures and handling special cases. Part of this can be automated if your language supports magic getters and setters by parsing the javadocs for example. But not only is that an if, the rest will still be manual. This step is maintenance heavy.

2. Conflicts

Because we're manipulating strings it's really hard to tell a few items apart (binding vs server variable vs string; there's a reason why I separate binding and variable). 

For instance in the example above of gremlin : g.V().has(id, neq(m)) vs PHP: $g->V()->has(new Id(), Predicate::neq("m")) we don't know what to make of m. Is this a binding or a string or even a variable that was previously set in the session? There is no clean way of working around this.

Firstly because bindings tend to be handled on a different layer than the query builder.
Secondly because methods that will help in avoiding the conflicts will also lose typing data.
For example : $g->V()->has(new Id(), Predicate::neq(Query::variable("m"))) could generate the proper query by outputting m without quotes but we don't know what type m is so in some cases it might be tricky to select the proper signature.

Conclusion: there are a number of ways around this point. We use prefixes B_m or V_m and a hack to ignore signatures altogether when in this scenario. It's not that these aren't solvable; they just aren't trivial.
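
One way to make the string / binding / server-variable distinction explicit is to have the user wrap the argument in a small marker type. The sketch below is a guess at how that could look in Python; `Binding`, `Variable`, `render`, and the use of a `B_` prefix only for bindings are all assumptions made for the example, not the actual builder described here.

```python
class Binding:
    """A value shipped alongside the script; rendered as a bare name."""

    def __init__(self, name, value):
        self.name, self.value = name, value


class Variable:
    """A name assumed to already exist in the server-side session."""

    def __init__(self, name):
        self.name = name


def render(arg, bindings):
    """Render one argument and record any binding in the given dict."""
    if isinstance(arg, Binding):
        key = "B_" + arg.name          # prefix keeps bindings recognizable
        bindings[key] = arg.value
        return key
    if isinstance(arg, Variable):
        return arg.name                # emitted without quotes
    if isinstance(arg, str):
        escaped = arg.replace("\\", "\\\\").replace('"', '\\"')
        return '"%s"' % escaped        # plain strings are always quoted
    return str(arg)


bindings = {}
script = "g.V().has(id, neq(%s))" % render(Binding("m", 36), bindings)
print(script, bindings)  # g.V().has(id, neq(B_m)) {'B_m': 36}
```

This removes the quoting ambiguity, but as noted above it does not recover the type of `m`, so picking the right overloaded signature is still an open problem.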

3. API

Why we would need traversal, graph, vertex and edge APIs is fairly self-explanatory for everyday work with Gremlin. I'm just going to explain why we would also require some Java classes. 

Because JSON is lossy by nature we often have to cast variables to certain types. For example by submitting this kind of script: g.V(1).property("date", new Date(B_m)); with B_m = timestamp. This is just another case that is difficult to cover. 

This adds onto the other points in making a gremlin language variant non-trivial.

All of the above can be worked around by using an injection method that just appends a string to the query : $g->customStep("V().has(id, neq(m))") but that's beside the point.

Final Conclusion: It's not a trivial task. Of course the examples above are very verbose and achieving something closer to gremlin in style is possible but there are always going to be "gotchas" users will need to keep in mind.  A while back in TP2 I released a php library for this (the one we currently use in our projects). I decided to remove it as it was too much maintenance to get it to work across use cases, so I decided to concentrate on our own one (some choices made in 2. wouldn't have worked for other cases).
I'm convinced there's got to be a way of reconciling everything and getting this to work flawlessly but it's going to require a lot of thought/work.


PS: I mentioned some other points like managing multiple versions of gremlin (for two lines of releases) which is a real headache. 
For performance it may be good to allow the builder to handle multiple lines, which comes with its load of complications as well. 
And then there's the ability to "block" queries and either inject them into each other or merge them together, which simplifies unit testing and extends functionality :

$query = $g->V()->out("likes")->flag("flagname")->has("age", 20);
// Some logic here accesses new information and realizes the query needs altering
$query->getFlag("flagname")->out("hates", true); // true for merge
$query->toString(); // g.V().out('likes', 'hates').has('age', 20)

But this point alone could warrant its own email as it is relatively complex. Though TP3 has simplified some cases thanks to union() and some other steps.
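
To show why flagging and merging needs the builder to keep steps as objects rather than as one growing string, here is a minimal Python sketch. The names `Step`, `Query`, `flag`, and `merge` are hypothetical and the API is deliberately simpler than the PHP calls above; it only covers merging extra arguments into a flagged step.

```python
class Step:
    """One traversal step kept as data so it can still be altered later."""

    def __init__(self, name, args):
        self.name, self.args = name, list(args)

    def __str__(self):
        rendered = ", ".join(
            "'%s'" % a if isinstance(a, str) else str(a) for a in self.args
        )
        return "%s(%s)" % (self.name, rendered)


class Query:
    def __init__(self, source="g"):
        self.source = source
        self.steps = []
        self.flags = {}

    def step(self, name, *args):
        self.steps.append(Step(name, args))
        return self

    def flag(self, name):
        # Remember the most recently added step under this flag name.
        self.flags[name] = self.steps[-1]
        return self

    def merge(self, flag_name, *args):
        # Append arguments to the flagged step instead of adding a new step.
        self.flags[flag_name].args.extend(args)
        return self

    def __str__(self):
        return ".".join([self.source] + [str(s) for s in self.steps])


query = Query().step("V").step("out", "likes").flag("flagname").step("has", "age", 20)
query.merge("flagname", "hates")
print(query)  # g.V().out('likes', 'hates').has('age', 20)
```

Because the string is only produced in `__str__`, the query can be altered at any point before it is sent.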

Our builder supports all of the above so if you have any questions feel free to ask me.

Phew that was long. I'll add this to the ticket in a bit.

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/47A92EFF-CB36-41EA-B252-6823A42F4D7B%40gmail.com.
For more options, visit https://groups.google.com/d/optout.

Marko Rodriguez

Apr 12, 2016, 3:42:22 PM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Hi Dylan,

Your email is excellent. Thank you for breaking things down for me. Here are some responses.

> 1. Method overloading :
>
> abstract class Query {
>    public function has(PropertyKey $key); //1
>    public function has(PropertyKey $key, Object $value); //2
>    public function has(Label $label, String $value); //3
>    public function has(VertexId $id, Long $value); //4
>    public function has(VertexId $id, Int $value); //5
>    public function has(VertexId $id, Predicate $p); //6
> }
>
> The above is illegal in languages like PHP (or javascript?). Instead we're stuck with :
>
> abstract class Query {
>    public function has(Array $args);
> }
>
> We're then left to figure out what is what in the array and sort out how we need to stringify the output.

I was thinking, why would you need to introspect into the array? Just toString() each element in the array with a comma (,) in between. For instance:

* has("age",32) ==> has(["age",32]) ==> has("age",32) // all String array elements need " " wrappers.
* has("age") ==> has(["age"]) ==> has("age")
* has("person","name","marko") ==> has(["person","name","marko"]) ==> has("person","name","marko")

Thus, Gremlin-PHP would have one has()-method, and that method just iterates over the arguments and toString()'s things accordingly with comma delimiters.

> If the user does $g->V()->has("label", "user") do we add quotes to the first argument or is it a label/id? What about the second argument, is it a predicate? etc.  This gets complex very quickly.

The universal rule --- if it's a String, add quotes. If it's not, don't. 

$id -> "~id"
$label -> "~label"

$g->V()->has($label,"user")

> And what if I had $g->V()->has("id", 36) . PHP only supports Int so one of the two signatures (4 or 5) needs to give as we have a major conflict. This example is fictional for has() but I've run into this on a couple of other methods, just can't remember which.

Yea, that sucks. Well, you could do this:

$g->V()->has($id,Number::long(36))  ==> g.V().has("id",36l)

This would, of course, bind you to Gremlin-Groovy as the ultimate ScriptEngine.

> Another example would be  g.V().has(id, neq(m)) . We could imagine the following PHP equivalent $g->V()->has(new Id(), Predicate::neq("m")) where Id() is a class that helps us recognize this type, and neq() a static method of Predicate. However "m" has to be passed as string and we have no clue what m is... is this a string or a binding or a server side variable? More on this in point 2.

Well, this is the same problem in Gremlin-Java. where() is ALWAYS bindings and has() is ALWAYS objects. Thus:

$g->V()->where("a",Predicate::neq("m")) ==> g.V().where("a",neq("m")) // again strings always get " "-wrappers.

> To close things off here there's also the case of signatures like out(String... edgeLabels) that need their own logic.

Again, just toString() each object in the array and insert commas between.

$g->V()->out(["created","knows"]) ==> g.V().out("created","knows")


> Conclusion: There's a lot of manual work that needs to go into separating the logic between signatures and handling special cases. Part of this can be automated if your language supports magic getters and setters by parsing the javadocs for example. But not only is that an if, the rest will still be manual. This step is maintenance heavy.

I see the biggest pains being:

1. Having to implement each method.
2. Having to have helper classes for P, T, Order, Column, etc.

This is simply a matter of fat fingering stuff in and not anything implementation-wise that is problematic.

> 2. Conflicts
>
> Because we're manipulating strings it's really hard to tell a few items apart (binding vs server variable vs string; there's a reason why I separate binding and variable).
>
> For instance in the example above of gremlin : g.V().has(id, neq(m)) vs PHP: $g->V()->has(new Id(), Predicate::neq("m")) we don't know what to make of m. Is this a binding or a string or even a variable that was previously set in the session? There is no clean way of working around this.
>
> Firstly because bindings tend to be handled on a different layer than the query builder.
> Secondly because methods that will help in avoiding the conflicts will also lose typing data.
> For example : $g->V()->has(new Id(), Predicate::neq(Query::variable("m"))) could generate the proper query by outputting m without quotes but we don't know what type m is so in some cases it might be tricky to select the proper signature.
>
> Conclusion: there are a number of ways around this point. We use prefixes B_m or V_m and a hack to ignore signatures altogether when in this scenario. It's not that these aren't solvable; they just aren't trivial.

Hm. Yea, I'm not too smart about server variables. Out of my butt, you could create a "crazy String" for those and then do replaceAll-style updates.

g.V().out("%%x")

replaceAll("%%x",x)

?


> 3. API
>
> Why we would need traversal, graph, vertex and edge APIs is fairly self-explanatory for everyday work with Gremlin. I'm just going to explain why we would also require some Java classes.
>
> Because JSON is lossy by nature we often have to cast variables to certain types. For example by submitting this kind of script: g.V(1).property("date", new Date(B_m)); with B_m = timestamp. This is just another case that is difficult to cover.
>
> This adds onto the other points in making a gremlin language variant non-trivial.
>
> All of the above can be worked around by using an injection method that just appends a string to the query : $g->customStep("V().has(id, neq(m))") but that's beside the point.


Ah. Classy. Note that in ?3.2.1? we might support script()-step.

g.V().script("out().map{ it.name }")

…to enable lambdas in remote'd traversals (Server or OLAP).

For your Date example, you would have to have a special "toString()" for PHP dates to Java dates (or whichever backend ScriptEngine is being used).

$g->V()->property("date", $phpDate)

Your Array-string-ifier would not just call toString() blindly on the objects of the array arguments, but would do stuff like:

if(object instanceof String)
  return "\"" + object.toString() + "\"";   // strings get " "-wrappers
else if(object instanceof Date)
  return "new Date(…)";                     // rebuilt as a constructor call for the backend ScriptEngine
else
  return object.toString();
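
As a rough illustration of such a type-aware stringifier, here is a Python sketch that targets Gremlin-Groovy. The helper names `to_groovy`, `step`, and the `Long` marker class are made up for the example, and emitting dates as `new Date(<millis>L)` is one possible convention, not a prescribed one.

```python
from datetime import datetime, timezone


class Long(int):
    """Marks an integer that should be emitted as a Groovy long literal."""


def to_groovy(value):
    """Render a single host-language value as Gremlin-Groovy source text."""
    if isinstance(value, str):
        # Escape backslash, double quote, and "$" (GString interpolation).
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("$", "\\$"))
        return '"%s"' % escaped
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, Long):
        return "%dL" % value
    if isinstance(value, datetime):
        millis = round(value.timestamp() * 1000)
        return "new Date(%dL)" % millis
    return str(value)


def step(name, *args):
    """Render one step call with comma-separated, converted arguments."""
    return "%s(%s)" % (name, ",".join(to_groovy(a) for a in args))


print(step("has", "age", 32))    # has("age",32)
print(step("has", "id", Long(36)))  # has("id",36L)
```

Everything that is specific to the backend ScriptEngine lives in `to_groovy`, so switching the target language would mean swapping that one function.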


> Final Conclusion: It's not a trivial task. Of course the examples above are very verbose and achieving something closer to gremlin in style is possible but there are always going to be "gotchas" users will need to keep in mind.  A while back in TP2 I released a php library for this (the one we currently use in our projects). I decided to remove it as it was too much maintenance to get it to work across use cases, so I decided to concentrate on our own one (some choices made in 2. wouldn't have worked for other cases).
> I'm convinced there's got to be a way of reconciling everything and getting this to work flawlessly but it's going to require a lot of thought/work.
>
> PS: I mentioned some other points like managing multiple versions of gremlin (for two lines of releases) which is a real headache.
> For performance it may be good to allow the builder to handle multiple lines, which comes with its load of complications as well.
> And then there's the ability to "block" queries and either inject them into each other or merge them together, which simplifies unit testing and extends functionality :
>
> $query = $g->V()->out("likes")->flag("flagname")->has("age", 20);
> // Some logic here accesses new information and realizes the query needs altering
> $query->getFlag("flagname")->out("hates", true); // true for merge
> $query->toString(); // g.V().out('likes', 'hates').has('age', 20)
>
> But this point alone could warrant its own email as it is relatively complex. Though TP3 has simplified some cases thanks to union() and some other steps.
>
> Our builder supports all of the above so if you have any questions feel free to ask me.
>
> Phew that was long. I'll add this to the ticket in a bit.


Yes, maintenance seems the biggest pain. Every new method to Gremlin-Java requires updates to Gremlin-PHP ---- perhaps there is a programmatic way to introspect the Java source file (or JavaDoc) and generate the code automagically?

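public GraphTraversal out(final String... edgeLabels) 
==auto-write==>
public function out(...$edgeLabels) {
  // append the step to the query string built so far, then stay fluent
  $this->string = $this->string . ".out(" . StringHelper::toString($edgeLabels) . ")";
  return $this;
}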


If you could do that, then the only code you actually have to write/maintain (besides the introspector above) is StringHelper which does all the fancy String conversion of arguments. 
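
A small Python sketch of that "auto-write" idea follows. It assumes the step names have already been extracted from the Java source or JavaDoc into a plain list; `STEP_NAMES`, `make_step`, and `quote` are hypothetical names, and the extraction itself is not shown.

```python
# Assumption: this list would be produced by an introspector that reads
# GraphTraversal's method names; here it is written by hand.
STEP_NAMES = ["V", "out", "outE", "has", "values"]


def quote(arg):
    """Stand-in for the StringHelper: strings get quotes, the rest do not."""
    return '"%s"' % arg if isinstance(arg, str) else str(arg)


class GraphTraversal:
    def __init__(self, source="g"):
        self.script = source

    def __str__(self):
        return self.script


def make_step(name):
    """Create one fluent method that appends `.name(args)` to the script."""
    def step(self, *args):
        self.script += ".%s(%s)" % (name, ",".join(quote(a) for a in args))
        return self
    step.__name__ = name
    return step


# Attach one generated method per known step name.
for step_name in STEP_NAMES:
    setattr(GraphTraversal, step_name, make_step(step_name))

print(GraphTraversal().V().out("created", "knows"))  # g.V().out("created","knows")
```

Only `quote` and the name list need maintenance; a step missing from the list simply does not exist on the class, which surfaces typos early.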


Thanks Dylan for your time,
Marko.


Dylan Millikin

Apr 12, 2016, 10:54:08 PM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Yes a lot of the points you bring up are valid. 

One of the main problems with stringifying everything is that it does not allow for some of the stuff I mentioned in my PS, namely "smart merges". This query-building behavior that makes use of scopes is unfortunately the standard for frameworks in the industry. 
This is mostly due to the SQL heritage and its declarative nature; ordering of "steps" doesn't matter, so it allows for easy "after the fact" client-side filtering. It's not uncommon to have a base query that gets altered by some filtering data. In some cases it's a simple has() that needs to be injected somewhere, in other cases it's a repeat() that needs to be completely altered.
Use cases can get a little complicated here, but in its simplest form imagine having to add/remove entries to/from a match(). Of course that scenario works well with a toString approach but for other steps, not so well. Our experience has been that the builder needs to be aware of the step's signatures to resolve merges.

So sure this is another problem entirely, in the end users can't really do this with string queries either. But for widespread adoption it would be best if the query builder could handle these scenarios. 

Also to bounce off of some of your comments : 

> $id -> "~id"
> $label -> "~label"
> g.V().out("%%x")
> $g->V()->has($id,Number::long(36))  ==> g.V().has("id",36l)

All of the above are absolutely possible. But it's a lot to keep in mind for users that are already trying to figure out how Gremlin works. Now they also need to translate gremlin-groovy into gremlin-php. 
One of the advantages of going the hard route and keeping track of all step signatures instead of a toString approach is that you can significantly reduce the above cases. The builder can resolve quite a few of these automatically, and when conflicts arise it can do its best to resolve them and throw/log a warning telling the user how to make the query explicit.

>For your Date example, you would have to have a special "toString()" for PHP dates to Java dates (or whichever backend ScriptEngine is being used).

There are no PHP Dates [insert desperate crying emoji here]. PHP sucks with typing. It's got its good points but this kind of stuff is not one of them. Basically PHP Dates come in various forms, from Integer timestamps to String, and only the user really knows what he wants. We can provide this functionality like you did with long() but it's another thing to keep in mind.

One point we haven't gone over is lambdas. We can't really toString these. I guess this is where customStep() or script() come into play.

To wrap it up, a toString query builder is absolutely an option and could cover a lot of the API. In fact in PHP we could magically make any API method available, $g->something("~label", "lolo") would stringify to g.something(label, "lolo") regardless of whether or not the step exists. But this involves quite a few language specific alterations and doesn't provide much (if any) functional benefit. 
It would be so much easier for people to just write a gremlin-groovy string as it's well documented and doesn't need any extra knowledge.
If on the other hand the query builder has features like mentioned in the PS or earlier in this post, it's well worth the effort. I believe most people who build their own query builders do so to support some form of extra feature they wouldn't have by using gremlin-groovy string queries. 
But such a query builder enters the realm of non-trivial (although not unachievable). A first step in helping people make these builders would be to provide an easily parseable list of signatures for the most desirable classes. Maybe something along the lines of a yaml file.
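
For comparison, the "magically make any API method available" approach can be sketched in Python with `__getattr__`, which Python calls only when normal attribute lookup fails. `Magic`, `Token`, and `render` are hypothetical names; the sketch accepts any step name, whether or not the step exists in Gremlin.

```python
class Token(str):
    """A bare Gremlin token (e.g. label, id) that is emitted without quotes."""


def render(arg):
    if isinstance(arg, Token):
        return str(arg)
    if isinstance(arg, str):
        return '"%s"' % arg
    return str(arg)


class Magic:
    def __init__(self, script="g"):
        self._script = script

    def __getattr__(self, name):
        # Refuse private/dunder names so copying and introspection still work.
        if name.startswith("_"):
            raise AttributeError(name)

        def step(*args):
            call = "%s(%s)" % (name, ", ".join(render(a) for a in args))
            return Magic(self._script + "." + call)

        return step

    def __str__(self):
        return self._script


print(Magic().something(Token("label"), "lolo"))  # g.something(label, "lolo")
```

This is cheap to write and maintain, but it knows nothing about signatures, so it cannot support the merging or conflict resolution discussed above.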

Anyways I'm just thinking out loud at this point.





Mark Henderson

Apr 14, 2016, 9:06:23 AM
to Gremlin-users, d...@tinkerpop.incubator.apache.org
I've written "native object to Gremlin" libs in both PHP and Python and it isn't too bad/not too far from Groovy. The biggest issues were around indices [..] (when it had that format) and closures "{x -> ...}", but otherwise both langs allowed for easy query building. 

It basically looked like this in PHP:

$g = new Gremlin();
$g->V()->has('"name"', 'mark');
echo (string) $g;  // g.V().has("name",SOME_BOUND_VAR_1)

Works pretty much the same with the Python lib that I've been building (https://github.com/emehrkay/gremlinpy). 

If we wanted to actually execute the query on every step, that wouldn't be too difficult to implement with Gremlinpy. Gremlinpy is a simple linked list, it looks at g.V().has('"name"', 'mark') as three token objects with a shared pool of bound parameters. It creates the string query and parameters dictionary when you cast the list to a string. The only change needed would be to bind in a library like Gremlinclient (https://github.com/davebshow/gremlinclient), build the query with every step, and send it to the server.

res = g.V() # sends request
res2 = g.V().has('"name"', 'mark') # second request
...

The remaining difficulty would be deciding what gets bound. Maybe you can pass in a key-value pair for what you want bound:

res = g.V().has('"name"',{'NAME':'mark'})  # g.V().has("name",NAME)
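
The token-list-plus-parameter-pool idea can be sketched in a few lines of Python. This is not Gremlinpy's actual code or API: the class name `Gremlin`, the `step` and `build` methods, and the `BOUND_n` naming are assumptions chosen to mirror the examples above.

```python
class Gremlin:
    def __init__(self, source="g"):
        self.source = source
        self.tokens = []   # (step name, argument tuple), in call order
        self.params = {}   # shared pool of bound parameters

    def step(self, name, *args):
        self.tokens.append((name, args))
        return self

    def _render(self, arg):
        if isinstance(arg, dict):
            # Explicit binding: {'NAME': 'mark'} binds NAME to 'mark'.
            (key, value), = arg.items()
            self.params[key] = value
            return key
        if isinstance(arg, str) and len(arg) >= 2 and arg[0] == arg[-1] == '"':
            return arg     # caller already quoted it: keep it as a literal
        key = "BOUND_%d" % (len(self.params) + 1)
        self.params[key] = arg
        return key

    def build(self):
        """Return the script string and the parameter dictionary."""
        self.params = {}
        parts = [self.source]
        for name, args in self.tokens:
            rendered = ",".join(self._render(a) for a in args)
            parts.append("%s(%s)" % (name, rendered))
        return ".".join(parts), self.params


script, params = Gremlin().step("V").step("has", '"name"', "mark").build()
print(script, params)  # g.V().has("name",BOUND_1) {'BOUND_1': 'mark'}
```

Nothing is sent anywhere in this sketch; the returned pair is what a driver would submit as script and bindings.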

Marko Rodriguez

Apr 14, 2016, 9:34:40 AM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Hi Mark,

Exactly. I never saw Gremlin-Py until now and just noticed it on the Apache TinkerPop homepage. That is good stuff. Moreover, as you say, there is a distinction between:

1. Writing Gremlin in a host language.
2. Communicating to a GremlinServer-compliant server in a host language.

The (1) is about query syntax and the (2) is about protocol stuffs.

Lots of the libraries either confound the two or just do (2) with (1) simply being a Groovy String (cheesy).

I would like to see a lot more of (1) in the community libraries, as I think this is one of the big selling points of Gremlin -- write in your native language.

BTW, I added Gremlin-Py to the description in the "host language embedding" section here: http://www.planettinkerpop.org/#gremlin (2 scrolls down).

Thanks for your thoughts,
Marko.

Mark Henderson

Apr 14, 2016, 10:28:16 AM
to Gremlin-users, d...@tinkerpop.incubator.apache.org
I think writing "Gremlin/Groovy" in a host language is pretty awesome as long as it isn't too far off from writing actual Gremlin. I can revive my PHP project if it would be helpful to the community. A JavaScript version would probably be the one that gets the most attention from developers today, but JS, even with ES6, doesn't have the object flexibility (except maybe with Proxies) that would let you avoid writing a full-on 1-to-1 API equivalent of Gremlin (let alone mimicking Groovy). A Ruby version seems doable by implementing `method_missing`.

Thanks for adding Gremlinpy to the new site (I need to clean up the code a bit *shame*)
...

Marko Rodriguez

Apr 14, 2016, 10:47:14 AM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Hi Mark,

I think that any host language embedding should use its native idioms while, at the same time, staying as true as possible to Gremlin-Java (not Gremlin-Groovy -- though they are nearly identical). I would argue that Gremlin-Java is the "true representation" of the language. So what do I mean by native idioms?

in_V vs inV // if camel case isn't a thing in the native language
$g vs. g // of course if that's how variables are referenced
…huh, can't think of anything else :). But I hope you get the point. 
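
As a small illustration of mapping a host-language idiom onto Gremlin-Java names, here is a Python sketch that converts snake_case names to the camelCase step names. The function name `to_gremlin_name` and the trailing-underscore convention for reserved words are assumptions of the sketch.

```python
def to_gremlin_name(host_name):
    """Translate a snake_case method name into a Gremlin-Java step name."""
    # A trailing underscore lets callers spell reserved words: in_ -> in.
    name = host_name[:-1] if host_name.endswith("_") else host_name
    head, *rest = name.split("_")
    # Upper-case the first letter of every later part: out_e -> outE.
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


print(to_gremlin_name("in_V"))       # inV
print(to_gremlin_name("has_label"))  # hasLabel
```

A builder could apply this translation when it renders each step, so users write the idiomatic name while the generated string stays true to Gremlin-Java.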

I notice in Gremlin-Py you do g.v(2) vs g.V(2). Why is that? 

*** Would you be interested in working on a tutorial (with me?) about the 3 ways to create a Gremlin language variant. Given your expertise in Python and the existence of Gremlin-Py, I think we can both (1) make a good tutorial to teach others down the line and (2) spruce up Gremlin-Py's documentation and appearance (e.g. you need a Gremlin logo! -- Gremlin with a Snake around his neck? -- want me to make you one?). ***

Thanks Mark,
Marko.

Mark Henderson

Apr 14, 2016, 10:59:41 AM
to Gremlin-users, d...@tinkerpop.incubator.apache.org
A logo would be awesome! Thanks. 

Gremlinpy started off supporting Gremlin 2.x, and still can, so some of the examples still reference that syntax. Things like indices [..] and lower v/e are still outlined in the examples. At the core it takes whatever you throw at it and turns it into a Gremlin script.

I'd love to help with the tutorial. I think it will not only help the Gremlin community, but the library will get a lot better as a result. Just let me know where to start and what you'd like to see. 
...

Marko Rodriguez

Apr 14, 2016, 12:10:04 PM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Hi Mark,

> A logo would be awesome! Thanks.

Please see attached.

> I'd love to help with the tutorial. I think it will not only help the Gremlin community, but the library will get a lot better as a result. Just let me know where to start and what you'd like to see.

We have a way of easily creating/publishing tutorials in TinkerPop3. 


I don't know how to do it, but Stephen does. How about you do this:

1. You fork Apache TinkerPop tp31/.
2. You give Stephen and me rights to your forked repository.
3. Stephen will create the tutorial stub. (this will help me learn when I see his commit).
- @stephen: call it gremlin-language-variants
4. You and I then go to town on creating the tutorial.

Please read over the ticket and comment as appropriate so we jive and are on the same page going into this:

Thank you Mark,
Mark…………….o.

Mark Henderson

Apr 14, 2016, 1:24:50 PM
to Gremlin-users, d...@tinkerpop.incubator.apache.org
Ha that logo is hilariously awesome! I will add it to the repo later tonight. 

I will let you know once I have everything setup regarding the repo.

Thanks,
...

Cody Lee

Apr 14, 2016, 7:10:36 PM
to Gremlin-users, d...@tinkerpop.incubator.apache.org
I will say that Mogwai did support the use of gremlinpy while still giving the OGM output feel. All that was needed was passing the output (string + parameter map binding) to the execute_query method of mogwai and then passing the result back through Element.deserialize and poof, back to your OGM models or, if it wasn't of that type, a graceful fallback to Python types.

There was also support for embedding groovy/gremlin as static parameterized scripts alongside your code, thus giving the developer the ability to just call a method, classmethod or property and have it do all the parameterized querying behind the scenes.  Goblin (Mogwai's grown-up self and TP3 compatible) should be a great addition for the Python community.

Cody
...

Stephen Mallette

Apr 15, 2016, 6:49:51 AM
to Gremlin-users, d...@tinkerpop.incubator.apache.org
David Brown mentioned Goblin to me.  I feel like there seems to be a fair bit of fragmentation in the TinkerPop+Python land. Maybe it's just because I don't know Python, but there are a ton of libraries out there and I'm not sure I understand how they fit together. Maybe Python folks understand it all immediately but to me, it feels like some consolidation is in order to give users a more clear choice in what to use.  Am I talking crazy?


David Brown

Apr 15, 2016, 9:27:12 AM
to d...@tinkerpop.incubator.apache.org, Gremlin-users
I'll see if I can clarify a bit.

The fragmentation is at least partly my fault. As I discussed in a
previous post, I wrote three Python drivers for the Gremlin Server:
aiogremlin, gremlinrestclient, and gremlinclient. The first,
aiogremlin, was to test out the new Gremlin Server because my lab was
considering using Titan for some internal projects. The second,
gremlinrestclient, was to provide Gremlin Server integration for a
Django app (SylvaDB) using HTTP requests. However, because Python
users are kind of split between Python 2 and Python 3 right now,
aiogremlin didn't really seem to suit the needs of the community, as
it is Python 3 only. I was approached about providing a
prototype/reference implementation for a Python 2/3 compatible
websocket client that could be dropped into mogwai, the original
Python OGM for Titan. Therefore, I started work on gremlinclient,
which is by far the most flexible of the Python drivers that I know
of, because it allows for pluggable websocket client implementations.
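
To illustrate what "pluggable" means here, a rough Python sketch follows. Driver, Transport, echo_transport, and the message layout are all assumptions made up for illustration; they are not gremlinclient's actual API and not the Gremlin Server wire format. The point is only that the driver delegates all I/O to whatever transport it is given, so the websocket implementation can be swapped without touching the rest.

```python
import json
from typing import Any, Callable, Dict, Optional

# Hypothetical: a transport is any callable that sends one serialized
# request and returns the serialized response.
Transport = Callable[[str], str]


class Driver:
    """Hypothetical driver that delegates all I/O to an injected transport."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def submit(self, script: str,
               bindings: Optional[Dict[str, Any]] = None) -> Any:
        # Illustrative message layout only, not the real wire format.
        request = json.dumps({"script": script, "bindings": bindings or {}})
        return json.loads(self._transport(request))


def echo_transport(request: str) -> str:
    """Stand-in transport for tests: returns the request as the response."""
    return request
```

A real transport would wrap a websocket library of the user's choice behind the same callable signature.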

Later, I started working with the ZEROFAIL team on a port of mogwai
that would allow us to use the library with the new TinkerPop stack.
As this was quite an evolution, we renamed the project Goblin.
Regardless, Goblin is still very similar to mogwai both in terms of
API and internals. And as Cody mentioned, integrating a Python
embedded Gremlin implementation like gremlinpy is easy, and it can be
done the same way he indicated. This is true of any of the drivers as
well. I will include examples that integrate gremlinpy in future
versions of the docs for both Goblin and gremlinclient.

gremlin-python uses Jython to provide a Gremlin language variant, and
it is the library I know least about.

In terms of activity, gremlinclient and Goblin are the libraries that
I am currently developing and maintaining, and Goblin has started to
receive some community input from a couple of dedicated users. I've had 4
or 5 PRs over the last couple of days. aiogremlin is being maintained, as
in I will respond to issues, patches, etc., but it is not my priority,
as gremlinclient can do everything aiogremlin does and more.

So, as far as I know, there are currently 4 important Python libraries
that each serve a different function:

Jython Gremlin - gremlin-python
Native Object to Gremlin - gremlinpy
OGM - Goblin
Websocket Driver (soon to support REST as well) - gremlinclient

In terms of cohesiveness/integration, Goblin fundamentally integrates
gremlinclient, and gremlinpy should play nicely with either.
Therefore, in my opinion, these three libraries form the core of the
TinkerPop+Python landscape.

Hope this helps. Sorry if I missed anything...

Best,

Dave
>>> *1. Method overloading :*
>>> variable? More on this in point *2.*
>>>
>>>
>>> Well, this is the same problem in Gremlin-Java. where() is ALWAYS
>>> bindings and has() is ALWAYS objects. Thus:
>>>
>>> $g->V()->where("a",Predicate::neq("m")) ==> g.V().where("a",neq("m")) //
>>> again strings always get " "-wrappers.
>>>
>>> To close things off here there's also the case of signatures like out(String...
>>> edgeLabels) that need their own logic.
>>>
>>>
>>> Again, just toString() each object in the array and insert commas between.
>>>
>>> $g->V()->out(["created","knows"]) ==> g.V().out("created","knows")
>>>
>>>
>>> *Conclusion*: There's a lot of manual work that needs to go into
>>> separating the logic between signatures and handling special cases. Part of
>>> this can be automated if your language supports magic getters and setters
>>> by parsing the javadocs for example. But not only is that an if, the rest
>>> will still be manual. This step is maintenance heavy.
>>>
>>>
>>> I see the biggest pains being:
>>>
>>> 1. Having to implement each method.
>>> 2. Having to have helper classes for P, T, Order, Column, etc.
>>>
>>> This is simply a matter of fat fingering stuff in and not anything
>>> implementation-wise that is problematic…
>>>
>>> *2. Conflicts*
>>>
>>> Because we're manipulating strings it's really hard to tell a few items
>>> apart (binding vs server variable vs string; there's a reason why I
>>> separate binding and variable).
>>>
>>> For instance in the example above of *gremlin :* g.V().has(id, neq(m))
>>> vs *PHP:* $g->V()->has(new Id(), Predicate::neq("m")) we don't know what
>>> to make of m. Is this a binding or a string or even a variable that was
>>> previously set in the session? There is no clean way of working around this.
>>>
>>> Firstly because bindings tend to be handled on a different layer than the
>>> query builder.
>>> Secondly because methods that will help in avoiding the conflicts will
>>> also lose typing data.
>>> For example : $g->V()->has(new Id(),
>>> Predicate::neq(Query::variable("m"))) could generate the proper query by
>>> outputting m without quotes but we don't know what type m is so in some
>>> cases it might be tricky to select the proper signature.
>>>
>>> *Conclusion*: there are a number of ways around this point. We use
>>> prefixes B_m or V_m and a hack to ignore signatures altogether when in this
>>> scenario. It's not that these aren't solvable, they just aren't trivial.
>>>
>>>
>>> Hm. Yea, I'm not too smart about server variables. Out of my butt you could
>>> create a "crazy String" for those and then do replaceAll-style updates.
>>>
>>> g.V().out("%%x")
>>>
>>> replaceAll("%%x",x)
>>>
>>> ?
>>>
>>>
>>> *3. API*
>>> *Final Conclusion:* It's not a trivial task. Of course the examples
>>> above are very verbose and achieving something closer to gremlin in style
>>> is possible but there are always going to be "gotchas" users will need to
>>> keep in mind. A while back in TP2 I released a php library for this (the
>>> one we currently use in our projects). I decided to remove it as it was too
>>> much maintenance to get it to work across use cases so I decided to
>>> concentrate on our own one (some choices made in *2.* wouldn't have
>>
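
For the Python readers, the string-building rules in the quoted discussion (strings always get " "-wrappers, varargs are rendered one by one with commas in between, and a server-side variable must come out unquoted) can be sketched roughly as below. Traversal, Raw, and quote are hypothetical names invented for this illustration and do not belong to any of the libraries mentioned in this thread. Only strings, numbers, and the Raw marker are handled.

```python
class Raw:
    """Hypothetical marker: text emitted without quotes (e.g. a variable)."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


def quote(value):
    """Render a Python value as Gremlin-Groovy source text."""
    if isinstance(value, str):
        # Plain strings always get double-quote wrappers, with escaping.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    # Numbers and Raw markers are emitted as-is.
    return str(value)


class Traversal:
    """Hypothetical builder that concatenates steps into one string."""

    def __init__(self, source="g"):
        self._parts = [source]

    def _step(self, name, *args):
        # Varargs: render each argument and insert commas between them.
        rendered = ",".join(quote(a) for a in args)
        self._parts.append("{}({})".format(name, rendered))
        return self

    def V(self, *ids):
        return self._step("V", *ids)

    def out(self, *labels):
        return self._step("out", *labels)

    def __str__(self):
        return ".".join(self._parts)
```

With this sketch, Traversal().V().out("created", "knows") renders as g.V().out("created","knows"), while out(Raw("x")) renders as out(x). Wrapping the value in an explicit marker type avoids guessing whether a bare string is a literal or a variable, although, as the quoted text notes, the type of the variable is still unknown to the builder.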



buddylee48

Apr 15, 2016, 9:28:12 AM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org

Depends on perception; there don't seem to be that many from my perspective. To clarify, Goblin is Mogwai's successor and it has ZeroFail behind it, which makes it the only one with sponsored adoption that I'm aware of (unless bulbs has it too). There are a couple of incarnations of the rexpro interface (one of which I currently maintain for Titan <0.9.x), gremlinpy, a very nifty syntax mirror, and a couple of HTTP clients like bulbs. I really only see 2 major full-featured OGM libraries (Goblin & bulbs), and a handful of communication/utility libraries.

A couple of choices is a good thing, as one library's opinionated operation may not fit your problem, but another's might. Some people really may just want a communication + syntax mirror and are perfectly happy with raw Python type responses, others may prefer something like an OGM that is linked to their Object Models, and some may want a mix of the two. Not to mention the different programming paradigms.


Mark Henderson

Apr 15, 2016, 9:49:16 AM
to Gremlin-users, d...@tinkerpop.incubator.apache.org
Great summary and thanks for all of the work you've done. I use Gremlinclient in my Python OGM Gizmo (https://github.com/emehrkay/gizmo). Gizmo has a slightly different approach to data management because it grew out of wanting to add simple objects around Gremlinpy. It tries to be a data mapper, with the main focus being that you can have certain entity types managed (CRUD) in unique ways without muddying the codebase too much (I am actually beginning to mix other data sources into my Mappers, it is pretty nice). I need to get on your level with documentation and examples though.

I agree with Cody: a few choices for graph management shouldn't hurt, and should help adoption in any community. One person may want a more Active Record-like approach, someone else may like my Data Mapper-like implementation.

Check out Gizmo and let me know what you think.

Thanks

buddylee48

Apr 15, 2016, 10:14:08 AM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org

Ah I totally spaced on gizmo! Sorry Mark!
