Fragmentation and worries about the current graph databases landscape


Xander Uiterlinden

Aug 11, 2016, 3:09:34 AM
to Gremlin-users
Hi,

In our current project we've been using OrientDB for about the last two years. Although it mostly runs fine, it seems to lack stability in a clustered setup. Another issue I'm running into is that you need the enterprise version to get (automatic) incremental backups.
While implementing OrientDB for our project I chose not to use the OrientDB APIs but to rely on the TinkerPop Blueprints API (2.x at the time) instead. I perceived this as the 'JDBC' for graph databases, giving me freedom of choice regarding the implementation and the option to swap the implementation whenever I was disappointed with the current choice.

While looking into upgrading to TinkerPop 3 it appeared to me that there actually isn't that much choice if you're looking for that substitutability, especially when running the graph DB for an 'enterprise' application with high availability requirements, preferably open source.

In the open source area there seems to be only one graph database that supports a clustered storage setup: Titan. Titan is now part of DataStax, who are pushing their own commercial graph DB, which will probably lead to a slow death of Titan. OrientDB provides clustering, but is nowhere near completing TinkerPop 3 support. Besides that, the word on OrientDB seems to get worse every day (see http://orientdbleaks.blogspot.com/). Neo4j provides high availability, but only in the commercial version, which comes with a hefty price tag. Besides that, Neo seems to be pushing Cypher as the main query language rather than Gremlin. Also, accessing the open source Neo4j database with the remote Gremlin Java API just isn't possible.

So here I am, still using OrientDB 2 with the TinkerPop 2 APIs, looking ahead and really wondering what the best way to move forward is, preferably without vendor lock-in and with high availability requirements (mainly OLTP). Am I missing parts of the landscape, or is my 'JDBC' for graph databases interpretation too romantic today? Any thoughts on this?

--Xander

Luca Garulli

Aug 11, 2016, 3:39:30 AM
to gremlin-users
Hi Xander,

Are you using a recent version of OrientDB 2.2.x? Release 2.1.x contained many limitations and bugs that we fixed in 2.2.x. We have reached a very good level of "rock solidity", even with multiple data centers. By the way, v2.2.7 will be out in a few hours.

This doesn't mean you have to use OrientDB, but it looks like there aren't too many Open Source alternatives in terms of Distributed Graph Databases out there...

About that infamous web site, I should be impressed, really: you can find tons of blog posts against MySQL and MongoDB, but an entire site against a technology that is FREE for any usage is funny and sad at the same time.

In my career as a developer I have used hundreds of libraries, frameworks, tools and projects, and when something doesn't fit my needs I simply throw it away and try the next one on the list. I would never spend so many hours of my (precious) time venting my anger against an Open Source project while hiding as an "anonymous" user...

The OrientDB company is not even a big corporation to hate; we're the hackers behind the first Multi-Model database with a Graph Engine, trying to build the unbreakable database everybody is waiting for :-)

I hope I have provided an answer, but if you want to stay with OrientDB, please use the latest release and report any issue you encounter on StackOverflow or the Community Group.

Thanks,
Luca


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/1e1cfa0e-ac18-4dbe-80ba-4ec9475e921d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Stephen Mallette

Aug 11, 2016, 7:03:48 AM
to Gremlin-users
> While looking into upgrading to TinkerPop 3 it appeared to me that there actually isn't that much choice if you're looking for that substitutability, especially when running the graph DB for an 'enterprise' application with high availability requirements, preferably open source.

Are there choices in TinkerPop 2.x that you would switch to that aren't supported in TinkerPop 3.x? 

> So here I am, still using OrientDB 2 with the TinkerPop 2 APIs, looking ahead and really wondering what the best way to move forward is, preferably without vendor lock-in and with high availability requirements (mainly OLTP). Am I missing parts of the landscape, or is my 'JDBC' for graph databases interpretation too romantic today?

I mean, TinkerPop still provides you with a vendor-agnostic approach in TinkerPop 3.x, with largely the same major implementation choices as TinkerPop 2.x (unless you're thinking of one that I'm not). Perhaps I'm missing the point of your question, but the notion of "JDBC for graph databases" still applies to TinkerPop. It seems that your problem is more about choosing the "right" TinkerPop-enabled graph database. If that is what you're asking, then I'm not aware of any other TinkerPop 3.x implementations beyond those listed on TinkerPop's home page (http://tinkerpop.apache.org/).
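
As a rough illustration of the "JDBC for graph databases" idea discussed here, the sketch below writes a query once against a small interface and runs it unchanged on two interchangeable backends. This is not TinkerPop's API (which is Java); `GraphStore`, `AdjacencyListStore`, `EdgeSetStore`, `friends_of_friends` and `load_sample` are all hypothetical names made up for this example, and the two in-memory stores merely stand in for different vendors.

```python
from typing import Dict, List, Protocol, Set, Tuple


class GraphStore(Protocol):
    """Hypothetical vendor-neutral interface (NOT the TinkerPop API)."""

    def add_edge(self, src: str, dst: str) -> None: ...

    def neighbors(self, vertex: str) -> List[str]: ...


class AdjacencyListStore:
    """Stand-in for 'vendor A': keeps outgoing edges per vertex."""

    def __init__(self) -> None:
        self._out: Dict[str, List[str]] = {}

    def add_edge(self, src: str, dst: str) -> None:
        self._out.setdefault(src, []).append(dst)
        self._out.setdefault(dst, [])

    def neighbors(self, vertex: str) -> List[str]:
        return sorted(self._out.get(vertex, []))


class EdgeSetStore:
    """Stand-in for 'vendor B': keeps a flat set of (src, dst) pairs."""

    def __init__(self) -> None:
        self._edges: Set[Tuple[str, str]] = set()

    def add_edge(self, src: str, dst: str) -> None:
        self._edges.add((src, dst))

    def neighbors(self, vertex: str) -> List[str]:
        return sorted(d for s, d in self._edges if s == vertex)


def friends_of_friends(store: GraphStore, start: str) -> List[str]:
    """Two-hop query written only against the interface."""
    found = set()
    for friend in store.neighbors(start):
        for fof in store.neighbors(friend):
            if fof != start:
                found.add(fof)
    return sorted(found)


def load_sample(store: GraphStore) -> None:
    """Load the same tiny sample graph into whichever backend is given."""
    for src, dst in [("a", "b"), ("b", "c"), ("b", "d"), ("c", "a")]:
        store.add_edge(src, dst)


if __name__ == "__main__":
    for backend in (AdjacencyListStore(), EdgeSetStore()):
        load_sample(backend)
        print(type(backend).__name__, friends_of_friends(backend, "a"))
```

The application code (`friends_of_friends`) never names a concrete backend, so swapping one store for the other only changes the line that constructs it. Whether a real migration is that painless depends on how much of a vendor's non-standard behaviour the application relies on.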

You might want to follow and/or look at throwing some support behind this issue on Apache S2Graph:


That might be a direction to go for you at some point.


Robert Dale

Aug 11, 2016, 11:04:12 AM
to Gremlin-users
You can use Neo4j HA without a commercial license in some use cases. You only have to abide by the AGPL license. So in the most basic case, using the Gremlin Server distribution with a remote database without modification, HA or not, you do not need a commercial license.

Pieter Martin

Aug 11, 2016, 12:46:48 PM
to gremli...@googlegroups.com
There are many providers listed on the TinkerPop site. What is missing is their
voice with regard to the evolution of Gremlin as the "JDBC" of graphs.

The mailing list is mostly filled with Titan and a bit of Neo4j experiences, so
it's hard to know how much those other providers are being accessed via
TinkerPop.

Probably it is early days and most of them have their own competing graph
language. I had hoped the move to Apache would have encouraged them to join in,
but so far it has not really happened.

The part I am romantic about is TinkerPop as the Java API "JDBC" for the graph
landscape. If you need to do any abstraction on top of graphs then TinkerPop is
the way to go. Be it Neo4j, Titan, OrientDB or any other, I'd use TinkerPop for
the Java API. Non-developers writing reports or whatever can choose whichever
query language suits them best. Of course, if it's not Gremlin you will be
locked in.

Cheers
Pieter


Marko Rodriguez

Aug 11, 2016, 12:55:35 PM
to gremli...@googlegroups.com
Hi,

To add to Pieter’s point, even if they want a different query language (e.g. SQL, SPARQL, Cypher, etc.), the new Bytecode work in TINKERPOP-1278 is going to make it really easy for compiler designers to translate their language to Gremlin bytecode. Even more recently, I’m hearing talk of a development team wanting to translate Gremlin bytecode to Spark’s DataFrames API.

Marko.

http://markorodriguez.com