[TinkerPop3] RFC Rexster to Become Gremlin Server

Stephen Mallette

Jan 1, 2014, 8:52:23 AM
to gremli...@googlegroups.com
One of the key design goals for TinkerPop3 is to simplify the stack, reduce design choices, and focus TinkerPop development efforts on the features meant to expose best practices when developing graph-based applications.  Rexster has been a key part of the infrastructure supporting that application development and over the years it has evolved many features in attempts to make it more flexible for development (e.g. gremlin extension, kibbles, custom extensions), more efficient (e.g. rexpro), or simply more accessible for new users (e.g. Dog House).  

In the spirit of TinkerPop3 "simplification", we believe that Rexster - a Graph Server - as a separate project no longer makes sense.  Therefore, the intention is to fold the "Graph Server" functionality into the Gremlin project as its own module.  We intend to call this Gremlin Server.  It will be built on Netty (http://netty.io/) which seems to be more accepted than Grizzly these days.   Netty is also Apache 2 licensed which fits nicely with the Apache 2 story for the TinkerPop3 stack.

Gremlin Server will be based on websockets and will do one thing: process remotely submitted Gremlin scripts and return results. That's it.  No REST. No Dog House. No RexPro (though I suppose RexPro most closely aligns with websockets).

This RFC is meant to introduce Gremlin Server and get everyone thinking Netty, websockets, remote Gremlin, etc.  Expect additional RFCs to be forthcoming around Gremlin Server, especially with regard to how we plan to allow interaction with it.

Thoughts and questions are welcome...Thanks.

Stephen




David Gonzalez

Jan 20, 2014, 10:35:30 AM
to gremli...@googlegroups.com
Stephen, This is fantastic news. Let us know where you'd like help. Happy to contribute.

Stephen Mallette

Jan 22, 2014, 7:03:25 AM
to gremli...@googlegroups.com
David, thanks for the feedback.  At this point, I think the most important thing Gremlin Server needs is language bindings to make it easy to develop against.  Obviously, I'd appeal to the developers of the various Rexster bindings to bring the same support to Gremlin Server that they brought to Rexster, but if others in the community would like to contribute that certainly wouldn't be discouraged in any way.  Building language bindings is helpful for two reasons:

1. TinkerPop becomes useful outside of java/groovy (that's an obvious one)
2. Writing language bindings helps find bugs...it happened that way during the development of RexPro (that's the less obvious one)

It would be really nice to see us have at least the same bindings we had with Rexster ready to go on the release day of TinkerPop3.  That would mean: Python, C#, Ruby, Go.  JavaScript is a glaring omission (not that there were no JavaScript ways to work with Rexster...just no RexPro bindings I'm aware of).

There's no documentation yet for how to build a client, though Sridhar of the Bitsy Graph has been busy working on a reference implementation which we hope to have an RFC for in coming weeks.  

Outside of Gremlin Server, TinkerPop3 contribution ideas would include:

+ A javascript implementation of Gremlin with ScriptEngine implementation over jsr-223.  Frank Panetta did some exploratory work on this with Nashorn and had some success with it (https://github.com/entrendipity/gremlin-js).  We did get gremlin-js evaluating in Rexster at one point.  Frank has moved on from working with graphs on a daily basis and has transferred his grex (https://github.com/gulthor/grex) and gremlin-node (https://github.com/inolen/gremlin-node) repos to others who are now managing them.  I still like the thought of a pure gremlin-js syntax that could be evaluated in Gremlin Server...would be nice to see that work.
+ A third-party managed blueprints-io-contrib repo to take over the IO packages we will no longer support (GML and GraphSON).  Recall the RFC on this topic: https://groups.google.com/forum/#!searchin/gremlin-users/rfc$20io/gremlin-users/9LQ1-BSGLYE/2wBAVFoJ2vgJ  This project could even introduce support for other graph interchange formats (gephi, CSV, etc.).  Note that with "plugins", it would be possible for this library to extend the Gremlin language so that users could still do g.saveCsv('/tmp/g.txt').

Hope that gives you some ideas where you might be able to contribute. 

Thanks,

Stephen



--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Erick Tryzelaar

Mar 12, 2014, 8:37:41 PM
to gremli...@googlegroups.com
Hello Stephen,

Have you made any progress on the Gremlin Server protocol? I was just about to start working on a RexPro/c++ library, and after seeing this RFC it's made me a little worried to write something against RexPro if it's about to be deprecated. Do you have any suggestions on how to work around the differences? For example, are sessions definitely gone from TinkerPop3?

Thanks for any help!
Erick

Stephen Mallette

Mar 13, 2014, 8:35:14 AM
to gremli...@googlegroups.com
Erick,

Progress has been made, but I'd not say it is final by any means and is subject to change.  You can see the basics of the "test client" written in java here in these two classes:

https://github.com/tinkerpop/tinkerpop3/blob/master/gremlin/gremlin-server/src/test/java/com/tinkerpop/gremlin/server/WebSocketClient.java

I would say that these classes are somewhat analogous to what we have now with the RexsterClient functionality.  Send a string of Gremlin and get back some results.  I think that this style of interaction with Gremlin Server (i.e. send a fat string of Gremlin to Gremlin Server and get back results) is only the very root of what Gremlin Server will do (or at least that's the hope).  What has yet to be solved is how to define a higher level of interaction where users don't have to hardcode Gremlin strings, nor deal so directly with object serialization, etc.

So....as far as future compatibility goes, I think that there will always be this low-level API that is the same as RexPro.  In that sense, I think the change to Gremlin Server would introduce two breaks:

1. You would need to change your application code to the new "Gremlin Server Client" (which would still send strings of Gremlin to be processed on the server).
2. You would likely need to change how you process results a bit.  In RexPro (and the REST Gremlin Extension), Rexster would serialize an entire result from a request into memory and send that back.  On the client side that meant that you would work with the entire result set in memory once that came back down.  With Gremlin Server, results are serialized individually and streamed back to the client.  So to work on an entire result set you would need to iterate all results from the server and then perform your computation.  There are a number of advantages to this approach, but one that stands out is that it treats remote execution very much like a Pipeline (or Stream in Java 8), which means it's very Gremlin-like from the client's perspective.  Bringing remote Gremlin execution in line with local execution makes it feel like there's a way to join those two modes, though it's not fully clear just how we will do that.
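
The difference between the two result-handling styles in point 2 can be sketched in Python. This is only an illustration of the idea, not Gremlin Server or RexPro client code: `fetch_all` and `stream_results` are hypothetical stand-ins, and the "server" is just an in-memory list of sample names.

```python
from typing import Iterable, Iterator, List

# Hypothetical sample data standing in for results produced on the server.
SERVER_RESULTS = ["marko", "vadas", "josh", "peter"]


def fetch_all(results: Iterable[str]) -> List[str]:
    """Whole-result style: the full result set is materialized before returning."""
    return list(results)


def stream_results(results: Iterable[str]) -> Iterator[str]:
    """Streaming style: each result is handed over individually."""
    for item in results:
        yield item


# Whole-result style: everything is in client memory at once.
everything = fetch_all(SERVER_RESULTS)

# Streaming style: the client can stop early without pulling the rest.
first_two = []
for name in stream_results(SERVER_RESULTS):
    first_two.append(name)
    if len(first_two) == 2:
        break

# To compute over the entire result set, iterate the stream to the end.
total_length = sum(len(name) for name in stream_results(SERVER_RESULTS))
```

The streaming version behaves like a lazy pipeline: the consumer decides how much of the result set to pull.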

If you intend to build a new client language binding (which would be great as it would help us understand how well Gremlin Server will work for getting Gremlin out of the JVM), I would focus on a very simple implementation (like the test client I referenced above).  It would be great to hear how difficult or easy it is to write a client against netty/websockets and how well that client performs (we had great disparity in client implementation with grizzly/rexpro in that regard).


Thanks,

Stephen




David Ash

Mar 21, 2014, 6:50:07 AM
to gremli...@googlegroups.com
Overall I appreciate the effort to simplify the stack for TinkerPop3.  I think the idea of leaving a generic REST API behind in favor of remotely executing Gremlin makes sense, particularly for GETs -- only the most trivial applications would ever want to retrieve data via the REST API anyway.  I'm less confident that the same is true for PUTs, POSTs, and DELETEs, but the same notion applies: complex data is typically encoded as multiple vertices and edges, and it would take numerous REST requests to add, edit, or delete a single complex data structure.

I am less enthusiastic about completely abandoning standard HTTP requests in favor of websockets, but I'm on the fence.  There is a big benefit to establishing a single connection and handling requests/responses in an event-based system.  However, this comes at the cost of added work for migrating existing software -- changing from HTTP-based requests to websocket events will require another level of redesign that wouldn't be required if HTTP support continued.  It would have been nice to see it phased out gradually, but considering the additional effort for TinkerPop development to continue this support, phasing out probably doesn't make sense.

Most importantly, I am very hopeful that stored procedures will continue to be supported in TinkerPop3.  In fact, there is some room to improve the experience of using stored procedures (although the current implementation is quite functional, it feels a tad bit hacky and lacks certain features).  The use of stored procedures is surely a best practice as it centralizes core functionality and encourages distributed application suites to be DRY.  Stored procedures decouple applications and the data itself.  Stored procedures also minimize request sizes, and can add security benefits (possibility of disabling ad-hoc queries, limiting who can call what procedure, etc).  In fact, seeing a full security model surrounding sproc execution permissions would be ideal, and I'd love to see these kinds of features come as a result of TinkerPop3's simplification (theoretically freeing up time in the future for adding new features).



On Wednesday, January 1, 2014 6:52:23 AM UTC-7, Stephen Mallette wrote:

Stephen Mallette

Mar 21, 2014, 8:35:26 AM
to gremli...@googlegroups.com
David, thanks for your feedback.  I have been hesitant about pure websockets myself, but I think that you outline the exact realities that make me feel it's the right direction.

As far as "stored procedures" go, the idea remains that writing Gremlin, compiling it to a jar and exposing it through Gremlin Server is the encouraged pattern, though we would also like to see ways to write client-side Gremlin that will remotely execute efficiently.  I think that provides maximum flexibility while constraining available approaches to interacting with the server.  We're not quite sure what that all looks like yet, so it remains to be seen how/if that vision will be realized.

I'm not sure that it's in my mind to build out stored procedures in the full manner you describe (e.g. "full security model"); however, I would offer that Gremlin Server is very much open to extension.  For example, let's say that you really want to take stored procedures to the next level (beyond just treating them as server-side function calls...in other words, make them first-class citizens).  You could add a custom websocket request "processor" that would maybe have the following requirements:

1. doesn't process arbitrary scripts passed to the server
2. accepts a message that simply contains a stored procedure name and its arguments
3. validates that the stored procedure is configured on the server
4. offers some kind of authorization scheme to check permissions prior to execution
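
The four requirements above can be sketched as a small request handler. This is a minimal illustration in Python under stated assumptions, not the Gremlin Server extension API: the message shape, the `PROCEDURES` registry, the `PERMISSIONS` table, and the `process` function are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Set

# Hypothetical registry of stored procedures configured on the server.
PROCEDURES: Dict[str, Callable[..., Any]] = {
    "count_friends": lambda user_id: {"user": user_id, "friends": 3},
}

# Hypothetical permission table: caller -> procedures the caller may run.
PERMISSIONS: Dict[str, Set[str]] = {
    "alice": {"count_friends"},
    "bob": set(),
}


def process(caller: str, message: Dict[str, Any]) -> Dict[str, Any]:
    """Handle one request message following the four requirements."""
    # 1. Arbitrary scripts are rejected outright.
    if "script" in message:
        return {"status": "rejected", "reason": "scripts not accepted"}
    # 2. The message carries only a procedure name and its arguments.
    name = message.get("procedure")
    args: List[Any] = message.get("args", [])
    # 3. The procedure must be configured on the server.
    if name not in PROCEDURES:
        return {"status": "rejected", "reason": "unknown procedure"}
    # 4. The caller must be authorized before execution.
    if name not in PERMISSIONS.get(caller, set()):
        return {"status": "rejected", "reason": "not authorized"}
    return {"status": "ok", "result": PROCEDURES[name](*args)}
```

For example, `process("alice", {"procedure": "count_friends", "args": [1]})` runs the procedure, while the same message from `"bob"` is rejected at the authorization step.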

Offhand, I think the current extension model for Gremlin Server would support all that, though when I revisit development of Gremlin Server later I will consider these things to make sure that it does.

Best regards,

Stephen


David Ash

Mar 21, 2014, 10:09:31 AM
to gremli...@googlegroups.com
That sounds perfectly reasonable.  Thanks!

David Bruant

Mar 23, 2014, 7:13:41 PM
to gremli...@googlegroups.com
Hi,

First time on the list, so I guess I should introduce myself.
I'm David Bruant. As far as development goes, I'm mostly a web developer. I speak web and JavaScript fluently. I speak Java a bit (enough to play).
On the open source side of things, I'm a Mozilla contributor (mostly on MDN (documentation), currently interested in devtools/addons) and contribute regularly to various web standards mailing-lists.
I'm overall extremely enthusiastic at the thought of graph databases, although I haven't really had the occasion to use one yet; I am looking forward to it.


On Wednesday, January 1, 2014 at 14:52:23 UTC+1, Stephen Mallette wrote:
> Gremlin Server will be based on websockets and does one thing: process remotely submitted Gremlin scripts and return results. That's it.  No REST. No Dog House. No RexPro (though I suppose RexPro most closely aligns with websockets).
I'm on the fence especially regarding the decision of removing the HTTP (REST) interface. I understand wanting to simplify the whole TinkerPop stack, but a REST API is a good way to get people started with a graph database without requiring them to learn a new programming language (be it Java or Groovy). For people who don't program in these languages, the alternative to an HTTP API would be to maintain bindings for lots of languages, which wouldn't be realistic to expect from the TinkerPop team or the community.
Regarding the Dog House, I would go the other way and try to make it more useful: hook up a simple d3.js ("force layout") script to display the graph (requires batch operations to be first-class in Rexster), and allow adding vertices and edges with a few clicks. Among other things, it'd allow a more direct relationship to the graph for users, including when it comes to debugging. It'd be a sort of simple equivalent of phpMyAdmin, but... more glamorous with the graph visualization (as long as the graph is a small one)... and it would work with any Gremlin-based database (and not a single vendor, like MySQL for phpMyAdmin).

One point that hasn't been raised on this thread is regarding the choice of the websocket protocol. Why not just go at the TCP level? (From what I understand, MySQL protocol packets are TCP packets)
The websocket protocol contains all sorts of things that were required for practical reasons on the web (passing through proxies, handshake with HTTP CONNECT, ports 80 and 443) but it's less clear why they're needed at all as part of a binary protocol to talk to a graph database. If I understand correctly, each websocket frame also has a little overhead which has no good reason to exist in your context.
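
The per-message overhead in question can be estimated from the framing rules in RFC 6455. The Python helper below is an illustrative sketch (the function name is made up for this example); it counts only the frame header of a single unfragmented frame, not the opening handshake or any TCP/IP overhead.

```python
def websocket_header_size(payload_len: int, masked: bool) -> int:
    """Return the RFC 6455 frame header size in bytes for one frame."""
    if payload_len < 0:
        raise ValueError("payload length must be non-negative")
    size = 2  # FIN/opcode byte plus mask bit/7-bit length byte
    if payload_len > 65535:
        size += 8  # 64-bit extended payload length
    elif payload_len > 125:
        size += 2  # 16-bit extended payload length
    if masked:
        size += 4  # masking key, required on client-to-server frames
    return size


# A 100-byte script sent by a client, and a 70,000-byte result frame
# sent back by the server (server-to-client frames are not masked).
request_overhead = websocket_header_size(100, masked=True)
response_overhead = websocket_header_size(70000, masked=False)
```

Under these rules the header stays between 2 and 14 bytes per frame, so the relative cost is largest for very small messages.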



"A javascript implementation of Gremlin with ScriptEngine implementation over jsr-223.  Frank Panetta did some exploratory work on this with Nashorn and had some success with it (https://github.com/entrendipity/gremlin-js).  We did get gremlin-js evaluating in Rexster at one point.  Frank has moved on from working with graphs on a daily basis and has transferred his grex (https://github.com/gulthor/grex) and gremlin-node (https://github.com/inolen/gremlin-node) repos to others who are now managing them.  I still like the thought of a pure gremlin-js syntax that could be evaluated in Gremlin Server...would be nice to see that work."
=> I'm not sure I understand this part, but I'm interested. Could you describe what you would want in more detail?

Thanks,

David

David Ash

Mar 23, 2014, 9:02:46 PM
to gremli...@googlegroups.com
David, thanks for introducing yourself and joining the conversation.  My post to this topic was actually my first post to this group as well.  I respect your opinions and probably would have shared your views when I first started working with Graph Databases.  However, my opinions have changed over the past couple of years...
 
> I'm on the fence especially regarding the decision of removing the HTTP (REST) interface. I understand wanting to simplify the whole TinkerPop stack, but a REST API is a good way to get people started with a graph database without requiring them to learn a new programming language (be it Java or Groovy). For people who don't program in these languages, the alternative to an HTTP API would be to maintain bindings for lots of languages, which wouldn't be realistic to expect from the TinkerPop team or the community.

I agree that a REST API sounds like a good way to get people started, but in practice I assure you that it's not.  Instead of teaching people how to properly interface with a graph database, it teaches them how to do things incorrectly and encourages bad habits that are not scalable (i.e. making numerous API requests in order to achieve a single complex interaction).

Consider the hypothetical situation where a REST API is present for a relational database.  Instead of learning SQL, you just send a REST API request to pull records from one table.  And then process the result, and then make another request to pull records from another table in order to do the equivalent of a table join.  So for each table join, you need processing, and another REST API request.  And these requests and responses potentially grow exponentially in terms of size.  And a ton of logic is moved to the database client (imagine you want to filter this data down to some narrow overlap of conditions -- all of the records have to be pulled and compared by your application).  All of this work is done instead of just doing a single request via SQL.  And doing it via REST API cannot take advantage of any query or internal optimization.  Does it work?  It's a lot of work, but sure, it works on a small scale.  But it doesn't scale up.  So you now *think* you know how to do it, but you really don't because you don't know SQL.  The same is true for Graph Databases.
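
The round-trip argument can be made concrete with a toy simulation in Python. Everything here is hypothetical: the `EDGES` data, the `rest_get_out` function standing in for one REST call, and the `server_side_query` function standing in for a single query evaluated on the server. No real HTTP or Gremlin is involved.

```python
from typing import Dict, List, Tuple

# Hypothetical toy graph: (vertex, edge label) -> adjacent vertices.
EDGES: Dict[Tuple[str, str], List[str]] = {
    ("a", "knows"): ["b", "c"],
    ("b", "knows"): ["d"],
    ("c", "knows"): ["d", "e"],
}

request_count = 0


def rest_get_out(vertex: str, label: str) -> List[str]:
    """Simulates one REST request returning a vertex's neighbours."""
    global request_count
    request_count += 1
    return EDGES.get((vertex, label), [])


def server_side_query(start: str, label: str, hops: int) -> List[str]:
    """Simulates one request whose traversal runs entirely on the server."""
    global request_count
    request_count += 1
    frontier = [start]
    for _ in range(hops):
        frontier = [n for v in frontier for n in EDGES.get((v, label), [])]
    return frontier


# Client-side "join": one request per vertex visited.
request_count = 0
friends = rest_get_out("a", "knows")
friends_of_friends = [n for f in friends for n in rest_get_out(f, "knows")]
rest_requests = request_count

# Single query: the same two-hop traversal in one request.
request_count = 0
same_result = server_side_query("a", "knows", 2)
query_requests = request_count
```

Even on this five-vertex graph the client-side approach needs three requests for a two-hop traversal, and the count grows with the number of vertices reached at each hop, while the server-side query stays at one.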

In the academic world, the biggest cost is that universities teach the wrong skills because they're easier for students to learn, and end up pushing bad habits onto the students who end up thinking they have knowledge that they don't really have.

In the enterprise world, however, the cost can be staggering.  Consider this: As an architect I can understand that a graph database is the right tool for solving a specific use-case.  But I may have very little technical understanding of how to implement a graph database solution.  So I do a proof-of-concept and choose the REST API method because it seems easier.  My proof of concept goes smoothly, and based on this I estimate the amount of work that needs to be performed.  I get approved to have a certain amount of money based on this estimate.  But the REST API approach doesn't scale, I have to switch over to a language-based approach (Gremlin), and I end up going over budget by a couple hundred thousand dollars and 3 months of time.

A REST API is not a good way to interact with a graph database, and having the option to do things incorrectly is costly.  Stephen has had the wisdom to see that, and doesn't see the point in putting in work to maintain a methodology that is not a best practice.

Notably, I'm speaking as a guy who builds (and estimates) projects that have involved graph databases for a Fortune 10 company.  And being misled by the presented design options really is a significant issue.  Just present the good options and it makes my job easy.  It also makes the student's life easier because he doesn't have to learn things wrong first only to have to break those habits and learn things right later, and it makes the teacher's job easier when he chooses what to teach, and it makes the TinkerPop development team's job easier because it simplifies the stack.  I think dropping the REST API is a winning proposition on all sides.

> Regarding the Dog House, I would go the other way and try to make it more useful: hook up a simple d3.js ("force layout") script to display the graph (requires batch operations to be first-class in Rexster), and allow adding vertices and edges with a few clicks. Among other things, it'd allow a more direct relationship to the graph for users, including when it comes to debugging. It'd be a sort of simple equivalent of phpMyAdmin, but... more glamorous with the graph visualization (as long as the graph is a small one)... and it would work with any Gremlin-based database (and not a single vendor, like MySQL for phpMyAdmin).

I pretty much agree, but really I don't think the user interface needs to be in the TinkerPop stack.  Third-party user interfaces built on top of TinkerPop would be fine, and would improve TinkerPop development by increasing development focus on what is important.  In fact, David you have some great ideas regarding that interface and the experience necessary to develop it.  I'm open to contribute.  Maybe you and I should start a project on Github?

> One point that hasn't been raised on this thread is regarding the choice of the websocket protocol. Why not just go at the TCP level? (From what I understand, MySQL protocol packets are TCP packets)
> The websocket protocol contains all sorts of things that were required for practical reasons on the web (passing through proxies, handshake with HTTP CONNECT, ports 80 and 443) but it's less clear why they're needed at all as part of a binary protocol to talk to a graph database. If I understand correctly, each websocket frame also has a little overhead which has no good reason to exist in your context.

Websocket rather than TCP would make it possible for client-side applications to talk directly to the database.




Thanks!
David



David Bruant

Mar 24, 2014, 6:24:31 AM
to gremli...@googlegroups.com
On 24/03/2014 at 02:02, David Ash wrote:
> David, thanks for introducing yourself and joining the conversation.  My post to this topic was actually my first post to this group as well.  I respect your opinions and probably would have shared your views when I first started working with Graph Databases.  However, my opinions have changed over the past couple of years...

> > I'm on the fence especially regarding the decision of removing the HTTP (REST) interface. I understand wanting to simplify the whole TinkerPop stack, but a REST API is a good way to get people started with a graph database without requiring them to learn a new programming language (be it Java or Groovy). For people who don't program in these languages, the alternative to an HTTP API would be to maintain bindings for lots of languages, which wouldn't be realistic to expect from the TinkerPop team or the community.

> I agree that a REST API sounds like a good way to get people started, but in practice I assure you that it's not.  Instead of teaching people how to properly interface with a graph database, it teaches them how to do things incorrectly and encourages bad habits that are not scalable (i.e. making numerous API requests in order to achieve a single complex interaction).
I feel that it can only be because the REST API is not well designed.
So far, every single Gremlin query I've come across was a one-liner (like "g.V('name','hercules').out('father').out('father').name"). This maps quite naturally to URLs without too much work. As a 30sec draft, it could look like:

http://localhost:8182/graphs/graphOfTheGods/query/V:name,hercules/out:father/out:father/name
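
As a rough sketch of how such a URL could map back to the Gremlin one-liner, the Python function below translates the path portion after `/query/` into a Gremlin string. The function name and the parsing rules are assumptions made for this illustration; this is not a Rexster extension or an existing API, and it does no escaping or validation of the arguments.

```python
from typing import List


def path_to_gremlin(path: str) -> str:
    """Translate 'step:arg1,arg2/...' segments into a Gremlin expression."""
    parts: List[str] = ["g"]
    for segment in path.strip("/").split("/"):
        if ":" in segment:
            # A step with arguments, e.g. 'out:father' -> out('father').
            step, raw_args = segment.split(":", 1)
            args = ",".join("'{}'".format(a) for a in raw_args.split(","))
            parts.append("{}({})".format(step, args))
        else:
            # A bare segment is treated as a property access, e.g. 'name'.
            parts.append(segment)
    return ".".join(parts)


query = path_to_gremlin("V:name,hercules/out:father/out:father/name")
# query == "g.V('name','hercules').out('father').out('father').name"
```

This only covers linear, path-style traversals; anything involving closures, branching, or side effects has no obvious place in a URL path.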

I can only agree that the current /vertices and /edges approach is not graph-like and feels like a SQL-Table-way of modeling the world that is inappropriate with graphs. It doesn't mean a REST API is a bad idea.
(by the way, is it possible to implement what I described above as a Rexster extension? I haven't looked at that too much yet)

Regarding, "numerous API requests in order to achieve a single complex interaction", I agree with you that batch operations should be part of the API by default. I hope my above example is convincing enough that it's possible and shouldn't require too much work from Gremlin.
Hopefully, my example also shows that the REST API can teach how to properly interact with graphs (pick some nodes to begin with and traverse the graph given some rules).


> Consider the hypothetical situation where a REST API is present for a relational database.  Instead of learning SQL, you just send a REST API request to pull records from one table.
I would answer that the REST API is misdesigned. A generic REST API for SQL would have a way to express JOINs.

I don't think there is a fundamental problem with REST APIs, but I can see how a misdesigned one can be harmful. The question would then be: would a well-designed API benefit the TinkerPop project?
My take on this is that it would. At the very least until bindings for all the main languages exist and are mature.
And while it's easy to read a REST API sending JSON (or a well-supported serialization format) back, I'm afraid a binary protocol would scare people away because it's a lot of work to get right.


> A REST API is not a good way to interact with a graph database
Can you provide insights on why you think that is?
Lots of REST APIs have been designed after the Table-thinking, but it does not have to be that way. For instance, I'm quite pissed off at Twitter for their API, which doesn't offer a graph-like interface, and I'm quite tired of the hoops it forces me to jump through to get graph data. But it could be designed differently; they just haven't chosen to (yet?).


Note that this discussion relates to the discussion about blueprints-io and the choice of serialization format.
If for the REST API, a new JSON format to return the subgraph is invented, it's more work. However, if it builds on top of the couple of remaining io formats, it can be much less work. Maybe at the cost of some overhead, but the purpose of the API would be experimentation; needs for performance would have to go the binary or Groovy road.

By the way, will the "TinkerPop format" relate to the binary format?


> > Regarding the Dog House, I would go the other way and try to make it more useful: hook up a simple d3.js ("force layout") script to display the graph (requires batch operations to be first-class in Rexster), and allow adding vertices and edges with a few clicks. Among other things, it'd allow a more direct relationship to the graph for users, including when it comes to debugging. It'd be a sort of simple equivalent of phpMyAdmin, but... more glamorous with the graph visualization (as long as the graph is a small one)... and it would work with any Gremlin-based database (and not a single vendor, like MySQL for phpMyAdmin).

> I pretty much agree, but really I don't think the user interface needs to be in the TinkerPop stack.  Third-party user interfaces built on top of TinkerPop would be fine, and would improve TinkerPop development by increasing development focus on what is important.
As a newcomer not speaking Java or Groovy, if there is no binding for my language, then I have to either make one (à la node-java or for the binary format) or wait for one, or learn Java/Groovy (with all the tooling like Maven, etc.). All these solutions are quite a step for a newcomer. And even if a binding exists, its completeness and stability are often unknown.
This doesn't really help adoption and newcomers.


> In fact, David you have some great ideas regarding that interface and the experience necessary to develop it.  I'm open to contribute.  Maybe you and I should start a project on Github?
I already have a lot on my open source plate. I'm tempted to say that "I'm open to contribute" ;-). More seriously, I won't be able to lead such a project for some time, or until I do get to work on a project involving a graph database.
Let's take this offlist.



> > One point that hasn't been raised on this thread is regarding the choice of the websocket protocol. Why not just go at the TCP level? (From what I understand, MySQL protocol packets are TCP packets)
> > The websocket protocol contains all sorts of things that were required for practical reasons on the web (passing through proxies, handshake with HTTP CONNECT, ports 80 and 443) but it's less clear why they're needed at all as part of a binary protocol to talk to a graph database. If I understand correctly, each websocket frame also has a little overhead which has no good reason to exist in your context.

> Websocket rather than TCP would make it possible for client-side applications to talk directly to the database.
Ok, so you have two requirements: one is a binary, space-efficient protocol (like the MySQL one), the other is speaking with a web browser. And websockets could fit the bill for both. But it comes at the cost of some inefficiency for the binary format.
The binary protocol is needed for cases where the application code is on a different machine than the database (which is probably a more common use case than web browser interaction). It might make sense to try to make it as efficient as possible.

Thanks,

David

Stephen Mallette

Mar 24, 2014, 7:02:07 AM
to gremli...@googlegroups.com
David Ash,

This part is very well said:

> it teaches them how to do things incorrectly and encourages bad habits that are not scalable

TinkerPop has had many of those kinds of things in it.  Lots of options (built in the good name of flexibility and a rich feature set) that take folks down the wrong path with their application design.  The mailing list is littered with that kind of discussion.  It makes me wonder how many people started down the wrong way, couldn't figure it out, didn't bother to ask for help on the list and just gave up on TinkerPop altogether.  Thanks for helping clarify the position.

I'm now off to read the rest of this discussion, but I hit that sentence you wrote and I wanted to make sure I highlighted it a bit.



Stephen Mallette

Mar 24, 2014, 8:31:05 AM
to gremli...@googlegroups.com
David Bruant,

Welcome to the conversation and to TinkerPop.  Definitely appreciate the feedback.

> I feel that it can only be because the REST API is not well designed.
> So far, every single Gremlin query I've come across was a one-liner (like "g.V('name','hercules').out('father').out('father').name"). This maps quite naturally to URLs without too much work. As a 30sec draft, it could look like:
>
> http://localhost:8182/graphs/graphOfTheGods/query/V:name,hercules/out:father/out:father/name
>
> I can only agree that the current /vertices and /edges approach is not graph-like and feels like a SQL-Table-way of modeling the world that is inappropriate with graphs. It doesn't mean a REST API is a bad idea.
> (by the way, is it possible to implement what I described above as a Rexster extension? I haven't looked at that too much yet)

You've only scratched the surface of Gremlin.  Path-based Gremlin does map quite well to a URL (and we've considered that in the past), but you won't get very far with that for most real use cases.  See a few more complex examples in the Gremlin wiki or on Stack Overflow:


It would be interesting to see how you would express some of that in a REST API, but I don't think you are necessarily suggesting that.  Some of those examples show that not all Gremlin queries are neat one-liners.  In fact, in most cases you will find that real-world Gremlin tends to be somewhat distant from that.
 
Regarding, "numerous API requests in order to achieve a single complex interaction", I agree with you that batch operations should be part of the API by default. I hope my above example is convincing enough that it's possible and shouldn't require too much work from Gremlin.
Hopefully, my example also shows that the REST API can teach how to properly interact with graphs (pick some nodes to begin with and traverse the graph given some rules).

I agree that having a REST API that compiles to a path-based Gremlin expression helps folks unfamiliar with Gremlin get started and even start thinking in the "Gremlin way".   However, I think that API will get increasingly complex as you add in other elements of "getting started", such that people will have to learn a "REST language".  If you have to learn a language, why not just learn Gremlin and gain full expressiveness from the start, which you will have to do anyway?
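
To make the "compiles to a path-based Gremlin expression" idea concrete, here is a minimal sketch in Python of how the draft URL above could be turned into the one-liner it stands for. The function names `path_to_gremlin` and `quote_arg` are hypothetical and nothing like this ships with Rexster; the `step:arg,arg` segment convention is simply the one proposed in the draft URL.

```python
from urllib.parse import unquote


def quote_arg(arg: str) -> str:
    """Render a URL argument as a single-quoted Gremlin/Groovy string literal."""
    return "'" + arg.replace("\\", "\\\\").replace("'", "\\'") + "'"


def path_to_gremlin(path: str) -> str:
    """Translate 'V:name,hercules/out:father/name' into a Gremlin one-liner.

    Each path segment is 'step' or 'step:arg1,arg2'. Only plain step names
    are accepted, so closures, loops and branching cannot be expressed.
    """
    steps = []
    for segment in path.strip("/").split("/"):
        name, _, raw_args = unquote(segment).partition(":")
        if not name.isidentifier():
            raise ValueError(f"unsupported step name: {name!r}")
        if raw_args:
            args = ",".join(quote_arg(a) for a in raw_args.split(","))
            steps.append(f"{name}({args})")
        else:
            steps.append(name)  # property access such as '.name'
    return "g." + ".".join(steps)


print(path_to_gremlin("V:name,hercules/out:father/out:father/name"))
```

The sketch only covers linear traversals. Anything that needs a closure, a loop or a side effect has no obvious URL form, which is where the "REST language" would start to grow.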
 
By the way, will the "TinkerPop format" relate to the binary format?

I'll be hoping to update the somewhat out-of-date RFC on IO sometime soon, but to answer your question, the "TinkerPop format" will be a binary format.

Regarding the Dog House, I would go the other way and try to make it more useful; hook up a simple d3.js ("force layout") script to display the graph (requires batch operations to be first-class in Rexster), allow adding vertices and edges with a few clicks. Among other things, it'd allow a more direct relationship to the graph for users including when it comes to debugging. It'd be a sort of simple equivalent of phpMyAdmin, but... more glamorous with the graph visualization (as long as the graph is a small one)... and that would work with any gremlin-based database (and not a single vendor like MySQL for phpMyAdmin).

I pretty much agree, but really I don't think the user interface needs to be in the TinkerPop stack.  Third-party user interfaces built on top of TinkerPop would be fine, and would improve TinkerPop development by increasing development focus on what is important.
As a newcomer not speaking Java or Groovy, if there is no binding for my language, then I have to either make one (à la node-java or for the binary format) or wait for one, or learn Java/Groovy (with all the tooling like Maven, etc.). All these solutions are quite a step for a newcomer. And even if a binding exists, its completeness and stability are often unknown.
This doesn't really help adoption and newcomers.

Ah...Dog House...good ol' Dog House.  The primary contributors to TinkerPop are not UI developers and while we are capable of such work, it's not really enjoyed and creates a body of code that comes with a high cost in terms of maintenance and testing.  It would be great to see a UI developed around the TinkerPop stack, but I don't think TinkerPop itself will be able to carry that burden.  It will have to come out of the community.

As a newcomer who doesn't know the JVM ecosystem super-well, you are providing good insight into what people might think when they first encounter TinkerPop3.  That's my biggest takeaway from this discussion.  Based on your feedback, I have to question if we are doing enough to help get people started.  It's a bit early to say just what we will have in that regard but that's the reason for these RFCs (this one in particular).  

I will say that for javascript guys, we might have a more cohesive story than we have had in the past (i.e. easier to get started, better integration, etc.).  In recent weeks, I've had it in my mind that we need to make javascript a first class citizen in TinkerPop3, given the Java 8 commitment to Nashorn.  I'm not sure who can say what the implications of Nashorn will be any more than we can predict the speed of Java 8 adoption, but I can say that I find the idea interesting and wonder if we shouldn't try to be on the forefront of that for TinkerPop3.  It would be great to hear more from the javascript community on their thoughts in this area if they are so inclined.

Thanks again for your feedback.  It was quite helpful.

Best regards,

Stephen


 



David Bruant

Mar 24, 2014, 11:16:40 AM
to gremli...@googlegroups.com
On 24/03/2014 13:31, Stephen Mallette wrote:
David Bruant,

Welcome to the conversation and to TinkerPop.  Definitely appreciate the feedback.
Thanks. And thanks to you and David Ash for the constructive conversation :-)



I feel that it can only be because the REST API is not well designed.
So far, every single Gremlin query I've come across was a one-liner (like "g.V('name','hercules').out('father').out('father').name"). This maps quite naturally to URLs without too much work. As a 30sec draft, it could look like:

http://localhost:8182/graphs/graphOfTheGods/query/V:name,hercules/out:father/out:father/name

I can only agree that the current /vertices and /edges approach is not graph-like and feels like a SQL-Table-way of modeling the world that is inappropriate with graphs. It doesn't mean a REST API is bad idea.
(by the way, is it possible to implement what I described above as a Rexster extension? I haven't looked at that too much yet)

You've only scratched the surface of Gremlin.
Those were my thoughts too.


Path-based Gremlin does map quite well to a URL (and we've considered that in the past), but you won't get very far with that for most real use cases.
For the sake of experimenting with graph databases (build a prototype, learn to model data as a graph), maybe it can be okay to tell people to make several requests.



It would be interesting to see how you would express some of that in a REST API, but I don't think you are necessarily suggesting that.
Some cases here would be better solved with a gremlin-server extension indeed. Some others could work with a couple round-trips which is probably okay.


Some of those examples show that not all Gremlin scripts are neat one-liners.  In fact, in most cases you will find that real-world Gremlin tends to be somewhat distant from that.
There are also lots of simple use cases that work with a couple of one-liners and maybe a bit of "client-side" processing.


 
Regarding, "numerous API requests in order to achieve a single complex interaction", I agree with you that batch operations should be part of the API by default. I hope my above example is convincing enough that it's possible and shouldn't require too much work from Gremlin.
Hopefully, my example also shows that the REST API can teach how to properly interact with graphs (pick some nodes to begin with and traverse the graph given some rules).

I agree that having a REST API that compiles to a path-based Gremlin expression helps folks unfamiliar with Gremlin get started and even start thinking in the "Gremlin way".   However, I think that API will get increasingly complex as you add in other elements of "getting started", such that people will have to learn a "REST language".  If you have to learn a language, why not just learn Gremlin and gain full expressiveness from the start, which you will have to do anyway?
It's not about gremlin, it's about Java/Groovy and learning a whole new toolchain just to do a simple thing.
Right now, my use case is to make a graph of Twitter users. Because of rate limits to get data, I'd better store the partial graph. I can do it in a JSON file, in a SQL database, but why not a graph database...
Since I know Node.js and I know how to do the Twitter OAuth, HTTP requests, JSON parsing and all sorts of data processing in this "platform", I'd love to do my project in Node.js (instead of having to learn to do that in a different language). My most complex query so far is "starting with some root nodes, get me the subgraph of all reachable vertices of degree >= 2 and the edges among these vertices" (and I want to store the data in Blueprints to leverage existing graph algorithms/query language instead of reinventing the wheel).
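
For readers who want to pin down what that "most complex query" means, here is a small in-memory sketch in plain Python. It is not Gremlin and not how Blueprints would evaluate it. It rests on two assumptions that the message does not state: reachability follows edge direction, and "degree" counts incoming plus outgoing edges in the full graph. The function name `reachable_subgraph` is hypothetical.

```python
from collections import deque


def reachable_subgraph(edges, roots, min_degree=2):
    """Return (vertices, edges) reachable from roots with degree >= min_degree.

    edges is a list of (source, target) pairs. Degree is in + out degree
    measured on the whole graph (an assumption, see the text above).
    """
    out_neighbors = {}
    degree = {}
    for src, dst in edges:
        out_neighbors.setdefault(src, set()).add(dst)
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1

    # Breadth-first search from all roots at once.
    seen = set(roots)
    queue = deque(roots)
    while queue:
        vertex = queue.popleft()
        for neighbor in out_neighbors.get(vertex, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

    kept = {v for v in seen if degree.get(v, 0) >= min_degree}
    kept_edges = [(s, d) for s, d in edges if s in kept and d in kept]
    return kept, kept_edges


follows = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("x", "y")]
vertices, induced = reachable_subgraph(follows, ["a"])
print(sorted(vertices), induced)
```

Doing this client-side means pulling every reachable vertex over the wire first, which is the inefficiency the message alludes to; a server-side traversal would filter before returning results.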

In Blueprints 2, I can interact with the database via Rexster (even if inefficiently sometimes. At least for a limited subset of operations, but there are some good extensions I can pull off). If you remove Rexster, my options are to go binary, wait/hope for a mature node.js implementation (and mature takes time) or do everything in Groovy/Java.

I guess it's okay if complex graph operations are required to be made in Groovy/Gremlin, but I'd love to make the rest of my program in the programming language/environment of my choice.


Regarding the Dog House, I would go the other way and try to make it more useful; hook up a simple d3.js ("force layout") script to display the graph (requires batch operations to be first-class in Rexster), allow adding vertices and edges with a few clicks. Among other things, it'd allow a more direct relationship to the graph for users including when it comes to debugging. It'd be a sort of simple equivalent of phpMyAdmin, but... more glamorous with the graph visualization (as long as the graph is a small one)... and that would work with any gremlin-based database (and not a single vendor like MySQL for phpMyAdmin).

I pretty much agree, but really I don't think the user interface needs to be in the TinkerPop stack.  Third-party user interfaces built on top of TinkerPop would be fine, and would improve TinkerPop development by increasing development focus on what is important.
As a newcomer not speaking Java or Groovy, if there is no binding for my language, then I have to either make one (à la node-java or for the binary format) or wait for one, or learn Java/Groovy (with all the tooling like Maven, etc.). All these solutions are quite a step for a newcomer. And even if a binding exists, its completeness and stability are often unknown.
This doesn't really help adoption and newcomers.

Ah...Dog House...good ol' Dog House.  The primary contributors to TinkerPop are not UI developers and while we are capable of such work, it's not really enjoyed and creates a body of code that comes with a high cost in terms of maintenance and testing.  It would be great to see a UI developed around the TinkerPop stack, but I don't think TinkerPop itself will be able to carry that burden.  It will have to come out of the community.

As a newcomer who doesn't know the JVM ecosystem super-well, you are providing good insight into what people might think when they first encounter TinkerPop3.  That's my biggest takeaway from this discussion.  Based on your feedback, I have to question if we are doing enough to help get people started.
Thanks :-)
I think my last paragraph above captures my feeling. I'm okay writing Rexster extensions in Gremlin and setting up whatever environment is needed to do so, but I want to write all the non-graph code (interaction with the Twitter API in my most recent use case) in the language/environment of my choice.


It's a bit early to say just what we will have in that regard but that's the reason for these RFCs (this one in particular).  

I will say that for javascript guys, we might have a more cohesive story than we have had in the past (i.e. easier to get started, better integration, etc.).  In recent weeks, I've had it in my mind that we need to make javascript a first class citizen in TinkerPop3, given the Java 8 commitment to Nashorn.  I'm not sure who can say what the implications of Nashorn will be any more than we can predict the speed of Java 8 adoption, but I can say that I find the idea interesting and wonder if we shouldn't try to be on the forefront of that for TinkerPop3.  It would be great to hear more from the javascript community on their thoughts in this area if they are so inclined.
I clearly can't talk for the whole JavaScript community (though I can already tell you that there is not such a thing, the JavaScript ecosystem is now composed of several communities), but I can share my opinion.

I don't know too much about Nashorn, but I don't think anyone cares about it, at least in the Node.js and web dev communities. From what I understand, Nashorn is not really aimed at interoperability with Node.js or the browser. It's rather an environment to script the JVM in JavaScript.
Also, I think Nashorn is a good ES5 implementation, but its status regarding ES6 is unclear. ES6 will be a big piece of work. V8 is already implementing pieces and some Node.js modules will take advantage of them as soon as they're officially released (they're currently all behind flags last time I checked).

If you want to connect with the Node.js community, Nashorn doesn't seem like the way to go (I've read about a Node.jar project, but I'm not sure where it's at and the Node.js ecosystem moves fast enough that I'm not sure it will be practical for a long time).
An interesting thing to notice is that Mozilla has a way to make simple addons called Jetpack. By following CommonJS, they hoped to create a community of JavaScript modules. It never really happened. But now that they noticed how flourishing the Node.js community is, they've decided to move to Node.js compat [1].
What this shows is that Node.js has enough strength and traction so that other projects are aiming at interoperability with its package ecosystem (npm).

If I want to build an application with a Blueprints database in JS, I won't use Nashorn as my runtime. I'll use Node.js.
That said, using Nashorn and JS to make gremlin-server extensions would be a nice addition. One benefit would be to just drop JS files in some directory (no need to compile anything) to make extensions. I'd happily use that (is it already possible to make Rexster extensions in JS?)


Thanks again for your feedback.  It was quite helpful.
My pleasure :-)

Thanks for your insightful response,

David

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=934696

Stephen Mallette

Mar 24, 2014, 5:55:16 PM
to gremli...@googlegroups.com
> One benefit would be to just drop JS files in some directory (no need to compile anything) to make extensions. I'd happily use that (is it already possible to make Rexster extensions in JS?)

I had some success working with Frank Panetta (original author of grex and gremlin-node, who I think has moved on from graphs a bit) getting a "gremlin-js" jsr-223 compliant script engine going in Rexster.  Frank had it working pretty nicely though there was some verbosity in the javascript that made it a bit less than appealing compared to groovy.  It didn't really get all too polished, but that's about as close as we came to getting JS into Rexster.





David Ash

Mar 24, 2014, 6:22:04 PM
to gremli...@googlegroups.com

Using js in place of Groovy would be a nice add-on. Although the closure syntax is more verbose than Groovy, it would be a great add because of the number of Javascript developers out there (it's a must-know second language for all web developers regardless of primary language, as well as being a primary language itself via node.js).

My current project is also a node.js project, although we are using Titan with the TinkerPop stack on top, and access the database via "stored procedures" within Rexster's Gremlin extension.

You received this message because you are subscribed to a topic in the Google Groups "Gremlin-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/gremlin-users/jpVPCzs3-T8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to gremlin-user...@googlegroups.com.

Dmill

Apr 13, 2014, 9:06:47 PM
to gremli...@googlegroups.com
Hey guys.

I would like to add my 5 cents to this. Coming from a non-Java background and having written and used the rexpro PHP client, I would like to share some feedback on my current sticking points (maybe there's a more recent discussion somewhere; if so, feel free to send me in the right direction).

Anyways, 
With the current implementation of rexpro, scripts either have to be written, or dynamically built on the client side before being run against the server. And as was addressed in this discussion, the full result set has to be passed back to the client. Other than what has already been mentioned here, there has been one major sticking point IMHO that has proven to be quite a hurdle for us. It comes in the form of query criteria (or the lack thereof).

Let me explain:
Coming from the web, and the use of PHP frameworks + RDBMS, we've gotten used to using query criteria in some form or another to perform filters. With relational databases such as MySQL, this can be as simple as using an "AND".
As a very simple example, we could imagine a table of all users with search fields for every column (or property). My original query retrieves all users from the database, but if I enter "marko" in the "name" field and "25" in the "age" field, it will apply these two criteria to my already existing original query. In this instance I would add 'AND name = "marko" AND age = 25', thus preventing me from having to write a query for every possible combination and keeping my code clean and maintainable.

Now this example is simple but it can become increasingly complex if various parts of your software modify the "scope" of the original query at different moments of the program execution.
For one, where it was simply appending an AND to a query with an RDBMS, in Gremlin you have more complex string manipulation to do.
Second of all, given the nature of graph databases, your filtering isn't only done on a property level but also on a traversal level, implying that your criteria have to be able to backtrack over any script and modify your traversal rules.
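
As an illustration of the property-level half of this problem, here is a minimal client-side sketch in Python (standing in for the PHP client). It appends one `has(key, value)` step per non-empty search field and keeps the values in a separate bindings map rather than splicing them into the script. The class `TraversalBuilder`, the function `filtered_users` and the `p0`, `p1` parameter names are all hypothetical, and the `g.V.has(...)` form assumes Gremlin 2.x Groovy syntax.

```python
class TraversalBuilder:
    """Accumulates Gremlin steps plus the bindings they refer to."""

    def __init__(self, start="g.V"):
        self._steps = [start]
        self._bindings = {}

    def has(self, key, value):
        """Append a has() filter whose value is passed as a bound parameter."""
        if not key.isidentifier():
            raise ValueError(f"unsupported property key: {key!r}")
        param = f"p{len(self._bindings)}"
        self._bindings[param] = value
        self._steps.append(f"has('{key}', {param})")
        return self

    def build(self):
        """Return the script text and a copy of the bindings."""
        return ".".join(self._steps), dict(self._bindings)


def filtered_users(search_fields):
    """Apply one criterion per filled-in search field, like appending ANDs."""
    builder = TraversalBuilder()
    for key, value in search_fields.items():
        if value not in (None, ""):
            builder.has(key, value)
    return builder.build()


script, bindings = filtered_users({"name": "marko", "age": 25, "city": ""})
print(script, bindings)
```

This only handles filters on the starting vertices. Criteria that change which edges are followed partway through a traversal are the harder, traversal-level case described above, and this sketch does not attempt them.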

I can't speak on behalf of everyone, but I don't think I'm taking too much of a risk by saying that this would set back a few people from outside the Java world, especially if their use of graph databases implies complex structures and rights.

Obviously a lot of this boils down to imperative vs declarative languages and it'll be a sticking point, but a good chunk of it can be handled client side, and I've attempted to do so. My major problems have been:
- In order to build and merge steps together (i.e. .outV( someName ) + .outV( someOtherName ) = .outV( someName, someOtherName) ), our software needs a good understanding of Gremlin method signatures, meaning we have to manually replicate all possibilities in our code.
- Binds become a nightmare when detecting method signatures because they are applied server side. (we set our binds before defining the scripts using them so they can be referenced)
- Closures are often very verbose. (detail)
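
The first point can be sketched as follows, again in Python rather than PHP. The table `VARARG_LABEL_STEPS` is exactly the kind of hand-maintained knowledge of Gremlin signatures the list complains about. It is an assumption of this sketch that `out`, `in`, `both` and their edge variants accept several labels and that a call with no labels means "any label"; the example uses `out` for that reason. `Step`, `combine` and `render` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class Step(NamedTuple):
    """One traversal step; args are binding names, rendered unquoted."""
    name: str
    args: Tuple[str, ...] = ()


# Steps assumed to take a variable number of edge labels (hand-maintained).
VARARG_LABEL_STEPS = {"out", "in", "both", "outE", "inE", "bothE"}


def combine(a: Step, b: Step) -> Step:
    """Merge two alternative criteria on the same step into one step."""
    if a.name != b.name:
        raise ValueError(f"cannot merge {a.name} with {b.name}")
    if a.name not in VARARG_LABEL_STEPS:
        raise ValueError(f"{a.name} does not take multiple labels")
    if not a.args or not b.args:
        return Step(a.name)  # no labels already means every label
    merged = tuple(dict.fromkeys(a.args + b.args))  # de-duplicate, keep order
    return Step(a.name, merged)


def render(step: Step) -> str:
    """Render a step as script text, e.g. '.out(someName, someOtherName)'."""
    return f".{step.name}({', '.join(step.args)})"


print(render(combine(Step("out", ("someName",)), Step("out", ("someOtherName",)))))
```

Because the arguments are binding names resolved on the server, the client cannot inspect their values when deciding how to merge, which is the second point in the list.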

The worst point of all is obviously the first one. The possibility of passing a simple stored procedure and its arguments suddenly opens the door to new possibilities. Maybe some way of altering/combining existing steps would be a great addition.

Stephen Mallette

Apr 14, 2014, 6:37:43 AM
to gremli...@googlegroups.com
Your concerns seem to be less with Rexster (and related approaches for non-JVM languages to connect) and more with how non-JVM languages can interact with graphs given Gremlin (a JVM language) as the means of exchange on Rexster.  We're thinking about non-JVM language options now, but don't yet have anything concrete to provide.

Your final paragraph confused me a bit:

The worst point of all is obviously the first one. The possibility of passing a simple stored procedure and its arguments suddenly opens the door to new possibilities. Maybe some way of altering/combining existing steps would be a great addition.

Are you saying you want "stored procedures"?  You can do that now with the Gremlin Extension and the load parameter or with RexPro and script engine configuration (init-scripts, imports, etc).
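
As a rough sketch of the first option, the Python below builds a request URL for the Gremlin Extension that asks the server to load a server-side script before running a short call into it. Treat the details as assumptions to check against the Rexster documentation: the `/tp/gremlin` path, the `script` and `load` query parameters and the bracketed list form are written from memory, while the script name `social` and the function `friendsOf` are purely hypothetical.

```python
from urllib.parse import urlencode


def gremlin_extension_url(base, graph, script, load=()):
    """Build a GET URL for Rexster's Gremlin Extension (assumed layout).

    load names server-side scripts to run first, so the client only sends
    a short call such as a "stored procedure" invocation.
    """
    query = {"script": script}
    if load:
        query["load"] = "[" + ",".join(load) + "]"
    return f"{base}/graphs/{graph}/tp/gremlin?{urlencode(query)}"


url = gremlin_extension_url(
    "http://localhost:8182",
    "graphOfTheGods",
    "friendsOf(g, 'marko')",  # hypothetical function defined in social.gremlin
    load=["social"],
)
print(url)
```

The point of the pattern is that the traversal logic lives on the server, so a non-JVM client sends only a function name and its arguments.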

Also, are you saying "altering/combining" existing steps would help?  You can do that now with user defined steps (https://github.com/tinkerpop/gremlin/wiki/User-Defined-Steps) or by dropping to a lower level and using Groovy metaprogramming.  Consider looking at this blog post if you haven't already: http://thinkaurelius.com/2013/07/25/developing-a-domain-specific-language-in-gremlin/

Or, for either of these items, did you mean something else?


Dylan Millikin

Apr 25, 2014, 10:15:48 AM
to gremli...@googlegroups.com
Thanks a lot for the reply. 

You are absolutely correct. My original concern indeed has more to do with interacting with Gremlin. If you ever have an open thread discussing this, I would be more than happy to participate. I'm afraid there isn't much to be done there, but there are definitely some things that could potentially make life easier for some of us.

My final paragraph was confusing because I misunderstood the original statement, so I didn't make any sense (hey, it was late ;) ).

By altering/combining, I mean that I don't know ahead of time what combination of steps will occur. It can potentially change on each page call, so user defined steps aren't much help, unfortunately.



Ming-Ju Valentine Lin

Jun 8, 2015, 5:03:40 PM
to gremli...@googlegroups.com
Hi, 

Is there a migration guide for Rexster users to port to Gremlin Server? I am using RexPro now and cannot find a way to enable SSL. Does Gremlin Server support SSL? I guess one way is to use the REST API and https. But I'd like to know if there is any secure communication over RexPro or WebSocket provided by Gremlin Server.

Sincerely,
Val

Stephen Mallette

Jun 8, 2015, 5:22:22 PM
to gremli...@googlegroups.com
Hi Val,
 
Is there a migration guide for Rexster user to port to Gremlin Server?

There is no such guide that specifically discusses that.  I'd recommend you read through the latest Gremlin Server docs:

 
Does Gremlin Server support SSL?

Yes - Gremlin Server directly supports SSL and is straightforward to set up given the latest work that has been done in the past week or so (Gremlin Server supported SSL before these changes, as of the stable M9 release, but the configuration was sorta klunky).  If you are feeling adventurous and would like to play with Gremlin Server to see the SSL stuff in action, I'd suggest you try to build the latest SNAPSHOT.  Under SNAPSHOT, turning on SSL (with a self-signed certificate) is as simple as setting this value to "true":


Obviously, that's not great for production so, you can also provide some other settings once you enable SSL (still working on the docs for this stuff):


Note that with Rexster, there is this hanging PR which I haven't tested:


I'm not sure if it's worth trying or not, but if you must use Rexster for some reason, perhaps it is worth looking at.



 



Ming-Ju Valentine Lin

Jun 9, 2015, 5:41:57 PM
to gremli...@googlegroups.com
Hi Stephen,

Thank you for your response. 

Would you suggest using the Rexster REST API instead and putting a load balancer that handles HTTPS in front of the Rexster server?

/Val



Stephen Mallette

Jun 9, 2015, 7:46:45 PM
to gremli...@googlegroups.com
Basically, just put something in front of Rexster to handle the SSL and proxy the calls through to it.
