Titan - Cassandra cluster setup

Hiren Dutta

Aug 1, 2013, 1:56:05 AM
to aureliu...@googlegroups.com
Hi,
I am new to graph DB setup. I have gone through the concepts but need help with the steps and configuration for the following:

1. Setting up a multi-node Titan cluster (not in the cloud)
2. Setting up a Cassandra cluster (not in the cloud)
3. How can the Titan cluster use the Cassandra cluster as its storage backend? Configuration details, please.
4. Connecting an application to the Titan cluster.

Any links or documents covering development and configuration would be really helpful.

Regards,
Hiren

subhankar

Aug 1, 2013, 8:58:27 AM

Hiren Dutta

Aug 7, 2013, 1:28:43 AM
Thanks for the document link!
I have a few questions based on it.

1. A Cassandra cluster backing Titan provides high availability, right? How do I scale Titan for load balancing?
2. Is Titan Server with the embeddedcassandra-Rexster cluster configuration the best from a performance perspective? What architecture do you suggest for production in terms of HA and load balancing (performance)?
3. One basic question: if my application wants to connect to a Titan-Cassandra-Rexster embedded cluster, which machine IP should I provide in the cassandra.properties and rexster.xml files? Is providing the IP of any one of the clustered machines enough, or do I have to provide all machine IPs, comma separated?
4. Is a Faunus-Hadoop cluster architecture a must for large distributed graph processing?

Matthias Broecheler

Aug 12, 2013, 11:08:45 PM
Hey Hiren,

1-2) If you are primarily concerned with HA, I would recommend that you don't run titan-cassandra embedded, but maintain separate Titan and Cassandra clusters, with the Titan machines running Rexster as well (just vanilla Rexster). Then you get independent fail-over capacity in both clusters: Cassandra fails over as normal, and the Titan instances can be scaled up and down as needed since they don't maintain state independently of the Cassandra cluster.
3) You can provide one or all. With the astyanax adapter it should auto-discover the cluster. For thrift you need to provide all IPs.
4) If you mean large OLAP-style processing (batch analytics, reporting, etc.), then yes.
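
As a rough illustration of point 3, the sketch below builds the two storage settings involved, once with a single seed node and once with every node listed. The backend names `astyanax` and `cassandrathrift`, the comma-separated form of `storage.hostname`, the class and method names, and the IP addresses are all assumptions for illustration rather than details from this thread, so check them against the Titan version in use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Sketch only: builds the storage settings discussed in point 3. */
public class TitanStorageConfig {

    /**
     * @param backend   assumed Titan backend name, e.g. "astyanax" or "cassandrathrift"
     * @param hostnames Cassandra node addresses (hypothetical IPs in main below)
     */
    public static Properties forCassandra(String backend, List<String> hostnames) {
        if (hostnames.isEmpty()) {
            throw new IllegalArgumentException("at least one Cassandra host is required");
        }
        Properties props = new Properties();
        props.setProperty("storage.backend", backend);
        // Assumption: storage.hostname takes a comma-separated host list.
        props.setProperty("storage.hostname", String.join(",", hostnames));
        return props;
    }

    public static void main(String[] args) {
        // Astyanax: one seed node, the rest of the ring should be discovered.
        Properties astyanax = forCassandra("astyanax", Arrays.asList("10.0.0.1"));
        // Thrift: every node is listed explicitly.
        Properties thrift = forCassandra("cassandrathrift",
                Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        System.out.println(astyanax);
        System.out.println(thrift);
    }
}
```

The resulting properties would then be written to the Titan properties file, or handed to Titan as its configuration, on each Titan machine.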

HTH,
Matthias


--
Matthias Broecheler
http://www.matthiasb.com

Rob McFeely

Mar 27, 2014, 8:45:05 AM

Matthias

Keeping separate Cassandra and Titan+Rexster clusters means that there are two HTTP bridges before our front-end application can reach the data:

APP <---http (or RexPro)---> REXSTER+TITAN <---http----> CASSANDRA

I fear that the extra HTTP hop will add too much overhead, so what I'm proposing for our setup is to remove the TITAN+REXSTER server and give each front-end APP a Titan layer that goes straight to the Cassandra cluster via Blueprints API calls and TitanFactory in Java code.

Do you see any problems with this approach? Is there any benefit of Rexster and RexPro that I would be losing out on if I went down this path?


Regards

Rob

Daniel Kuppitz

Mar 31, 2014, 7:55:12 AM
Hey Rob,

You won't need Rexster if your application is developed in Java. The simple remote server mode should work for you.
The remote server mode with Rexster is great if you also want to support clients written in other languages.

Cheers,
Daniel

Matthias Broecheler

Mar 31, 2014, 12:45:41 PM
Hi Rob,

If you use Rexster as Titan's server, we recommend running Rexster alongside the storage backend (C* or HBase), i.e. on the same machine, and connecting via localhost. That way the second link does not go over the network, which gives very low latency.

Cheers,
Matthias



Rob McFeely

Apr 2, 2014, 10:58:04 AM
Thanks Matthias

Can I set up a number of Rexster-local-http->Cassandra server instances and have the Cassandra nodes set up as a cluster (thereby providing redundancy and replication), but list only the local Cassandra instance in each respective Rexster config? Listing only the local instance in each Rexster is important so that a write is very fast, as it goes only to the local machine. Cassandra then replicates the write on the back end, independently, to the rest of the cluster in an eventually consistent manner. In this model each Rexster only knows about its local Cassandra, and we let the Cassandra cluster handle replication. Naturally, it requires load balancing on the front end of the Rexster cluster to ensure that no Rexster instance gets overloaded.
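
A minimal sketch of the front-end load balancing described above, assuming a plain round-robin over the Rexster nodes. The class name, endpoint URLs and port are hypothetical, and a dedicated load balancer in front of the Rexster cluster could play the same role. Each Rexster node's own Titan config would still point only at its local Cassandra (for example a localhost address).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: client-side round-robin over Rexster endpoints. */
public class RexsterRoundRobin {
    private final List<String> endpoints;
    private final AtomicInteger counter = new AtomicInteger();

    public RexsterRoundRobin(List<String> endpoints) {
        if (endpoints.isEmpty()) {
            throw new IllegalArgumentException("at least one Rexster endpoint is required");
        }
        this.endpoints = new ArrayList<>(endpoints); // defensive copy
    }

    /** Returns the next endpoint in rotation; safe to call from several threads. */
    public String next() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), endpoints.size());
        return endpoints.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical addresses: one Rexster per Cassandra node.
        RexsterRoundRobin balancer = new RexsterRoundRobin(Arrays.asList(
                "http://10.0.0.1:8182", "http://10.0.0.2:8182", "http://10.0.0.3:8182"));
        for (int i = 0; i < 4; i++) {
            System.out.println(balancer.next());
        }
    }
}
```

This sketch does no health checking, so a failed Rexster node would still receive its share of requests until it is removed from the list.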

I'm assuming this matches the model discussed at the end of this article, but I want to make sure: http://thinkaurelius.com/2013/03/30/titan-server-from-a-single-server-to-a-highly-available-cluster/

To be sure of what I mean....

Matthias Broecheler

Apr 2, 2014, 1:48:58 PM
Beautiful picture, Rob. And the answer is "yes" ;-)