Created module for connecting to Tinkerpop-compatible Graph Databases: mod-tinkerpop-persistor

Arnold Schrijver

Oct 9, 2013, 6:53:37 PM
to ve...@googlegroups.com
Last weekend I finally got some time to create a simple Vert.x BusMod to connect to Tinkerpop-compatible graph databases. I tested the code against Neo4J and OrientDB. Though I created it for my own private experimentation to learn both about Vert.x and graph databases in general, I took some time to create a bit of documentation and just now added the project to Github.


Like I said, this is just experimental and is by no means production-ready. I think there is a lot of room for improvement around Gremlin query support. I'll be curious to hear if any of you find it useful or have ideas for further improvement or additional features. Personally I'd like to experiment further with creating ACL graphs, and eventually maybe create an Apache Shiro module (as discussed in previous threads) that uses these graphs to create a security Realm for authentication, authorization and session management (but that is way off given how little spare time I have for this :)

Tim Fox

Oct 11, 2013, 5:46:50 AM
to ve...@googlegroups.com
Looks great

kuujo

Oct 11, 2013, 6:05:17 AM
to ve...@googlegroups.com
Definitely not my area, but looks awesome! Well done.

Amit Kumar

Oct 11, 2013, 3:33:04 PM
to ve...@googlegroups.com

In TinkerpopPersister, the following lines seem confusing:

        // Need to return the resulting Graph, if Id's have been generated.
        if (graph.getFeatures().ignoresSuppliedIds) {

Shouldn't the reply always include the 'graph' object, regardless of whether the IDs were generated or not?

Arnold Schrijver

Oct 11, 2013, 6:52:46 PM
to ve...@googlegroups.com
Hi Amit,

Maybe that would be more consistent. The reason I chose to implement it this way is that in cases where the graphdb does not ignore supplied IDs, the reply message would return the exact same GraphSON text that you just provided in the call to addGraph(). That seemed too wasteful to me, as it could consist of a large amount of text. So in those cases only a { "status": "ok" } is replied, meaning the graph you supplied has been persisted to the db (and you already know the IDs).

I could improve the code and wrap the GraphSON part of the reply in a 'graph' object, so you could more easily check if that key is present:

{
   "graph": {
       // Graph with generated ID's here
   },
   "status": "ok"
}
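
The reply rule described in this message can be sketched roughly as follows. This is only an illustration under assumptions: plain Java maps stand in for the Vert.x JSON reply, and the class name `AddGraphReplySketch` and method `buildAddGraphReply` are hypothetical names, not the module's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the real module replies with Vert.x JSON objects.
final class AddGraphReplySketch {

    /**
     * Builds the reply for an addGraph call.
     *
     * @param ignoresSuppliedIds true when the graph database generates its own IDs
     * @param persistedGraph     the graph as stored, including any generated IDs
     */
    static Map<String, Object> buildAddGraphReply(boolean ignoresSuppliedIds,
                                                  Map<String, Object> persistedGraph) {
        Map<String, Object> reply = new LinkedHashMap<>();
        if (ignoresSuppliedIds) {
            // The caller cannot know the generated IDs, so send the graph back.
            reply.put("graph", persistedGraph);
        }
        // Otherwise the caller already has the IDs and only the status is returned.
        reply.put("status", "ok");
        return reply;
    }

    public static void main(String[] args) {
        Map<String, Object> graph = new LinkedHashMap<>();
        graph.put("mode", "NORMAL");
        System.out.println(buildAddGraphReply(true, graph).containsKey("graph"));  // true
        System.out.println(buildAddGraphReply(false, graph).containsKey("graph")); // false
    }
}
```

A client can then test for the presence of the "graph" key instead of guessing whether the reply body is GraphSON or a bare status.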

Regards,

Arnold.

Arnold Schrijver

Oct 12, 2013, 6:35:56 AM
to ve...@googlegroups.com
Amit, 

I made the changes that I mentioned, wrapping the results in a 'graph' object on addGraph(), getVertices(), getVertex(), getEdges() and getEdge().

Amit Kumar

Oct 24, 2013, 11:03:04 PM
to ve...@googlegroups.com
Thanks a lot Arnold. I have follow-up questions though.

1. Due to the asynchronous communication nature of this module, can I use a TransactionalGraph and still assume that transactions are taken care of? E.g. I have 2 vertices and 1 edge to be added to an existing graph within a given transaction. Is this still possible?

2. Not sure how to attach a subgraph to an existing Graph using this.

Amit Kumar

Oct 24, 2013, 11:05:11 PM
to ve...@googlegroups.com

Arnold Schrijver

Oct 26, 2013, 3:28:22 AM
to ve...@googlegroups.com
Hi Amit,

Thanks, I also very much appreciate your feedback. Regarding your follow-up questions:

1. Unfortunately the transaction boundary currently only spans a single action as stated in the remarks. The transaction is committed when the action ends. I agree it would be very valuable if transactions could span multiple actions. It would be interesting to see how this would work with Vert.x. At the moment there is not much information to be found on this apart from some discussion in this thread and a feature request to integrate JBoss Transaction Manager with Vert.x created in March.

2. Given my still limited knowledge of graph databases, I am not sure either ;) Given what I read about subgraphing with Gremlin on this page, I suppose you could create the subgraph presented there if the query action was refactored to use the GremlinGroovyScriptEngine (thanks for pointing me in that direction!). But then you would end up with a serialized GraphSON version of the subgraph. AFAIK Tinkerpop Blueprints does not support detached elements / subgraphs (at least that is what this Tinkerpop Frames issue states).

I created enhancement issues on Github for both these questions. See issues #7 and #8.

Arnold Schrijver

Oct 26, 2013, 3:30:01 AM
to ve...@googlegroups.com
Thanks, useful information! I created an enhancement issue on Github for this as well: Issue #9.

Arnold Schrijver

Oct 28, 2013, 2:32:45 AM
to ve...@googlegroups.com
Regarding transactions: I looked at the implementation in Tim Yates' mod-jdbc-persistor and I think a similar mechanism can be easily implemented, but with Tinkerpop TransactionalGraph instead of Connection objects.
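
One way such a mechanism might look is a registry of open transactions keyed by an ID that the client sends along with each follow-up action. The sketch below is hypothetical and is not taken from either module: `OpenTransactionsSketch` and its methods are made-up names, and the small `Tx` interface only stands in for the commit/rollback part of a Blueprints TransactionalGraph.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: keeps graph transactions open across several actions.
final class OpenTransactionsSketch {

    /** Stand-in for the commit/rollback part of a transactional graph. */
    interface Tx {
        void commit();
        void rollback();
    }

    private final Map<String, Tx> open = new ConcurrentHashMap<>();

    /** Registers a started transaction and returns the ID clients must send along. */
    String begin(Tx tx) {
        String id = UUID.randomUUID().toString();
        open.put(id, tx);
        return id;
    }

    /** Looks up the transaction for a follow-up action, or null if the ID is unknown. */
    Tx lookup(String id) {
        return open.get(id);
    }

    /** Commits and forgets the transaction; returns false for an unknown ID. */
    boolean commit(String id) {
        Tx tx = open.remove(id);
        if (tx == null) {
            return false;
        }
        tx.commit();
        return true;
    }

    /** Rolls back and forgets the transaction, e.g. on an error or a timeout. */
    boolean rollback(String id) {
        Tx tx = open.remove(id);
        if (tx == null) {
            return false;
        }
        tx.rollback();
        return true;
    }
}
```

A real implementation would also need a timeout that rolls back abandoned transactions, and, as far as I know, some Blueprints implementations bind a transaction to the calling thread, which would have to be taken into account as well.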

Tim Yates

Oct 28, 2013, 6:12:16 AM
to ve...@googlegroups.com
Glad it was useful :-)



Arnold Schrijver

Oct 28, 2013, 6:40:53 AM
to ve...@googlegroups.com
It was, thanks :)

Seb Heymann

Oct 29, 2013, 2:08:22 PM
to ve...@googlegroups.com
Nice job!

Can we configure it with a URL to a Neo4j database server instead of a directory?

How about extending the support of Cypher queries, which are very powerful when working with Neo4j?

cheers,
Sébastien

Arnold Schrijver

Oct 30, 2013, 2:38:13 AM
to ve...@googlegroups.com
Thanks! 
Good question on the URL config, I was already wondering that myself. The Neo4j Tinkerpop implementation on Github only mentions configuration of a URL. I created an issue for this to investigate further (https://github.com/aschrijver/mod-tinkerpop-persistor/issues/10).

On the use of Cypher: Cypher, though indeed very powerful, is proprietary technology of Neo4J (AFAIK) and would require access to the underlying vendor-specific graph classes instead of using standard Tinkerpop code. Gremlin is what Tinkerpop uses instead; it is also quite powerful and extensible, and continues to be improved. So unless there is a straightforward way to support Cypher in a more 'standards-conform' manner, I'd rather stick to just Gremlin support.

Seb Heymann

Oct 30, 2013, 4:12:19 AM
to ve...@googlegroups.com
I need to access a remote Neo4j server through the REST API, so I'll propose a patch to Blueprints. Rexster and OrientDB work this way, so I guess it's doable.

Neo4j is GNU GPLv3 so Cypher is open source as well :) As far as I understand, this is more of a Gremlin vs Cypher issue between Aurelius and Neo Tech. Parts of the Cypher specs are Neo4j-related, but that is also the case with Titan/Gremlin. So I guess we should not put Cypher into Blueprints at the moment, and find a workaround...

Btw I'm interested in contributing to your module. What do you think needs to be done to reach a production-quality level?

cheers,
seb

Arnold Schrijver

Oct 30, 2013, 3:58:38 PM
to ve...@googlegroups.com
Hi Seb,

I found the following discussion on the Blueprints implementation on the Neo4J google group: https://groups.google.com/forum/#!searchin/neo4j/blueprints/neo4j/drebEP5rCJk/7x2lgqahoHQJ and also there is Mike Bryant on the Gremlin-Users google group that may have some spare time soon to make improvements to this Blueprints implementation, so I pointed to your suggested improvement on this thread :)
Do you really require the remote server URL config, or is the current implementation that uses a Neo4J EmbeddedGraphDatabase also usable?

On the production-readiness of the module...I think people that are more graphdb experts than me should be the judge on that :) 
But seriously, I think at least the module should be extended with transaction support (Issue #7) and probably also batching (best keep that consistent to the way this is done in mod-mongo-persistor and mod-jdbc-persistor).
Then the most important functionality to be improved would be the Gremlin query stuff. Since queries need to be compiled the first time they are used, this may not be performant enough for use cases that can't benefit from caching. Another candidate for improvement is the way query results are returned to the client. The serialization of query results is now quite simple, and may not match the results of more complex Gremlin queries.
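
To illustrate the caching point, here is a minimal sketch of a cache that compiles each distinct query text only once. It makes no assumptions about the module's actual code: `CompiledQueryCacheSketch` is a hypothetical name, and the `compiler` function merely stands in for whatever expensive step turns Gremlin text into something executable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: caches compiled queries by their source text.
final class CompiledQueryCacheSketch<C> {

    private final Map<String, C> cache = new ConcurrentHashMap<>();
    private final Function<String, C> compiler;

    CompiledQueryCacheSketch(Function<String, C> compiler) {
        this.compiler = compiler;
    }

    /** Returns the compiled form, invoking the compiler only on the first request. */
    C get(String queryText) {
        return cache.computeIfAbsent(queryText, compiler);
    }

    public static void main(String[] args) {
        int[] compilations = {0};
        CompiledQueryCacheSketch<String> cache = new CompiledQueryCacheSketch<>(text -> {
            compilations[0]++;          // stands in for the expensive compile step
            return "compiled:" + text;
        });
        cache.get("g.V.count()");
        cache.get("g.V.count()");
        System.out.println(compilations[0]); // prints 1
    }
}
```

Such a cache only pays off when the same query text recurs, for example when variable values are passed separately as bindings rather than concatenated into the query string; one-off queries still pay the full compile cost.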

Would be great if you could help!

Arnold Schrijver

Oct 30, 2013, 5:27:35 PM
to ve...@googlegroups.com
BTW issue #9 would also be a great enhancement as it would allow passing Groovy scripts to enhance Gremlin query processing. And also the integration-test coverage can be extended and performance tests added.

Seb Heymann

Oct 30, 2013, 6:22:36 PM
to ve...@googlegroups.com
Thanks for the discussion Arnold!

Using the remote Neo4j server is mandatory in my case, because I'd like to run the app on a different machine and connect to existing Neo4j servers in production. I've started working on my own on a Cypher+REST API, and we'll see in the end if it's good enough.

Background: I'm evaluating the migration of the Linkurious backend (full Node.js) to Vert.x. We're focused on Neo4j right now but would like to support more graphdb systems.

Amit Kumar

Oct 30, 2013, 10:40:43 PM
to ve...@googlegroups.com
Going the Cypher way makes it specific to Neo Technology. I thought that the original idea of implementing the Blueprints API was to enable working with any graph database (that implements TinkerPop Blueprints). When you have a Blueprints-enabled graph, Gremlin is a general query language you can use to talk to that database. If a dedicated query language specific to Neo needs to be supported, that could be part of another project. Just my 2 cents.

Arnold Schrijver

Oct 31, 2013, 2:45:28 AM
to ve...@googlegroups.com
Hi Amit,

That was (and still is) my original idea for the Tinkerpop Persistor module. So unless Tinkerpop includes direct Cypher support or some kind of query selection mechanism, Gremlin is the way to go.
(BTW, looking at the roadmap for Tinkerpop I see a lot of exciting new stuff being planned, but no Cypher.)

I agree Cypher query support should be a separate project. With Vert.x this _could_ then be a module that pulls in the Tinkerpop Persistor module (via mod.json inclusion) to delegate all the 'regular' graphdb actions to, and only processes Cypher queries separately.
Maybe to facilitate this use case the Tinkerpop Persistor could include an action to expose some of the configuration/connection details of the underlying graph database engine(s) it targets, so client modules have an easier time getting at the raw vendor-specific graph implementation.

Seb Heymann

Oct 31, 2013, 3:30:31 AM
to ve...@googlegroups.com
Okay, I'll start developing a pure Cypher module. It won't have a dependency on Tinkerpop Persistor, as that would add few features compared to the cost of adding the dependency, but I'll stick to the same actions and message format received through the event bus. That is the great strength of Vert.x, so I hope it will facilitate switching from one graph persistor to another.

Arnold Schrijver

Oct 31, 2013, 3:54:07 AM
to ve...@googlegroups.com
Alright, pity we can't combine, but great to keep compatibility between modules!
BTW, got some more response on the remote Neo4J Tinkerpop issue (and then I stop the cross-referencing galore ;-)

shishya

Nov 8, 2013, 11:59:11 AM
to ve...@googlegroups.com
Awesome! This was the springboard I was looking for. Thanks

bin wang

Nov 15, 2013, 12:24:12 PM
to ve...@googlegroups.com

I tried to integrate it with wwilson's module: https://github.com/wwilson/blueprints-foundationdb-graph , but I am not sure how to pass in the id for the addGraph action.

In the addGraph input message, you only accept the "action" and "graph" elements.

This issue might be related only to wwilson's module, in that when you call graph = GraphFactory.open(tinkerpopConfig); in the handler, the FoundationDB module will simply expect a graph name (or ID) in the config:
    public FoundationDBGraph(Configuration config) {
        this(config.getString("blueprints.foundationdb.name", UUID.randomUUID().toString()));
    }
Since there is no "blueprints.foundationdb.name" in the config, it will always generate a new ID.

Also, in the reply I can see only "status": "ok" with no id, because the module's ignoresSuppliedIds is false, so I don't even have a handle to the new graph created in the DB.

I feel you should always return the id in the reply, no matter whether the id is generated by the system or supplied by the user; it could serve as a little confirmation for the client.

Arnold Schrijver

Nov 18, 2013, 2:38:24 AM
to ve...@googlegroups.com
Hi,

You can specify the vendor-specific config you need in your tinkerpopConfig section yourself. If you include a 'blueprints.foundationdb.name' there, then it should be passed to the FoundationDBGraph.
Then, when calling addGraph with ignoresSuppliedIds = false, I decided to only return status='Ok' because you just supplied the IDs, so you already know them. Otherwise the reply would only contain an exact copy of the graph that was just passed in.
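
The effect of leaving that key out can be illustrated with a small sketch. It is an assumption-laden stand-in, not the FoundationDB module's code: `java.util.Properties` replaces the Configuration object, and `GraphNameConfigSketch` and `resolveGraphName` are hypothetical names that only mirror the fallback in the constructor quoted earlier in this thread.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical sketch; java.util.Properties stands in for the Configuration object.
final class GraphNameConfigSketch {

    static final String NAME_KEY = "blueprints.foundationdb.name";

    /** Mirrors the quoted fallback: the configured name, else a random UUID. */
    static String resolveGraphName(Properties config) {
        return config.getProperty(NAME_KEY, UUID.randomUUID().toString());
    }

    public static void main(String[] args) {
        Properties unnamed = new Properties();
        // No name configured: every call yields a different, random graph name.
        System.out.println(resolveGraphName(unnamed).equals(resolveGraphName(unnamed))); // false

        Properties named = new Properties();
        named.setProperty(NAME_KEY, "my-graph");
        // Name configured: the same graph is addressed every time.
        System.out.println(resolveGraphName(named)); // my-graph
    }
}
```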

Walison Moreira

Dec 4, 2013, 6:52:36 PM
to ve...@googlegroups.com
I am creating a Java server with "Vert.x" and OrientDB. This module will help a lot.

Thank you.

Norman Maurer

Dec 5, 2013, 12:22:47 AM
to ve...@googlegroups.com, Walison Moreira
Use English here please :)

-- 
Norman Maurer

On 5 December 2013 at 00:51:56, Walison Moreira (walison...@gmail.com) wrote:

I am creating a Java server with "Vert.x" and OrientDB. This module will help a lot.
Thank you.