Titan Server / Cassandra Cluster - High Availability


Effy

Oct 10, 2015, 9:13:38 AM
to Aurelius
Hi,

I'm trying to plan our company's new server architecture, using Titan/Cassandra.
My scenario (sample):
  • 2 frontend servers (web servers), each of which needs to access the Titan cluster.
  • 3 DB servers
I'm considering two approaches for this setup:
  1. Install Titan Server on each web server, and install Cassandra on each DB server.
    Each frontend server accesses its local Titan instance.
    Titan Server accesses the Cassandra cluster.
  2. Install Titan Server and Cassandra on each DB server.
    Each frontend server accesses all DB servers (using some custom WebSockets code, which would have to be implemented in the TP3 WebSockets driver).
The first setup seems much easier, as cluster management is handled completely by Cassandra. However, in this approach Titan Server isn't installed on the same machine as Cassandra.
Will that raise a performance issue?

The second setup gives a cleaner separation between our web servers and the DB servers, and allows for easier installation of new web servers.
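
To make the "custom code" in setup 2 a bit more concrete, here is a minimal sketch of the client-side host rotation it would need, assuming the frontend simply cycles through the DB servers and skips any it has marked as down. The class name `RoundRobinHosts` and the host addresses are hypothetical, and this is plain Python for illustration, not part of the TP3 driver or any Titan API.

```python
import itertools


class RoundRobinHosts:
    """Cycle through server addresses, skipping ones marked as down."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = list(hosts)
        self._down = set()
        self._cycle = itertools.cycle(self._hosts)

    def mark_down(self, host):
        # Called by the application when a connection to this host fails.
        self._down.add(host)

    def mark_up(self, host):
        # Called once a health check against this host succeeds again.
        self._down.discard(host)

    def next_host(self):
        # Try each host at most once per call so we never loop forever.
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            if host not in self._down:
                return host
        raise RuntimeError("no server is currently marked as up")


# Hypothetical addresses for the three DB servers in setup 2.
pool = RoundRobinHosts(["db1:8182", "db2:8182", "db3:8182"])
```

In setup 1 none of this is needed on the web tier, since each frontend only talks to its local Titan instance and Cassandra handles node failures itself.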

I'd appreciate any insights regarding this decision.

Thank you,
Effy

Jason Plurad

Oct 10, 2015, 2:32:03 PM
to aureliu...@googlegroups.com
The options are generally discussed here: http://s3.thinkaurelius.com/docs/titan/current/cassandra.html

I would go with a Cassandra cluster separate from the rest of your infrastructure (see 15.3. Remote Server Mode with Rexster). It will make operational management more straightforward and predictable. Note that 15.4 Titan Embedded Mode states: "Note, that running Titan with Cassandra embedded requires GC tuning. While embedded Cassandra can provide lower latency query answering, its GC behavior under load is less predictable."

--
You received this message because you are subscribed to the Google Groups "Aurelius" group.
To unsubscribe from this group and stop receiving emails from it, send an email to aureliusgraph...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/aureliusgraphs/a01f4962-5607-4507-8c41-7ffb7447159c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Have a good one,
Jason

Effy

Oct 11, 2015, 4:41:04 AM
to Aurelius
Thanks Jason,
Do you think it's important to host Titan Server and Cassandra on the same machines? Or should that not have a significant effect on performance?

Effy

Jason Plurad

Oct 12, 2015, 11:27:49 AM
to aureliu...@googlegroups.com
Obviously, your mileage may vary, and you should test/benchmark your Cassandra cluster with a production workload... but no, I wouldn't get hung up on the performance difference between running Cassandra on a dedicated cluster and having Titan Server co-located with Cassandra.
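
For the benchmarking itself, a minimal timing harness might look like the sketch below. It assumes you can wrap one representative query in a zero-argument Python callable (`run_query` is a hypothetical placeholder, not a Titan or TinkerPop function) and run it against each deployment in turn.

```python
import math
import statistics
import time


def measure_latency(run_query, iterations=100):
    """Call run_query repeatedly and return latency statistics in milliseconds."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # 95th percentile using the nearest-rank method.
    p95_index = math.ceil(0.95 * len(samples)) - 1
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }
```

Comparing the median and the 95th percentile between the dedicated and co-located setups, under the same workload, gives a rough idea of whether the difference matters for your case.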



Ted Wilmes

Oct 12, 2015, 12:34:46 PM
to Aurelius
Building on Jason's good advice, I'd definitely suggest running at least a decent proxy for your expected production workload in both co-located and non-co-located modes.

Titan 1.0's query.batch helps alleviate a decent chunk of the slowdown you get from introducing a network hop into your setup, but how much depends on the kinds of queries you're running. For example, in your other posting on the query.batch issue, if you're doing a (potentially) large number of hops (your repeat step), you'll take an increased latency penalty for each of those hops when you go off-instance rather than co-located. Not nearly as much as without query batching, but it'll still be there.

Also, if you're deploying to the "cloud", you can see a fair amount of latency variability between instances, even within the same availability zone, which can bite you. If you're on AWS, you could look at placement groups to get more control over this.

Bottom line, though: all of this is highly dependent on what your user load, reads, writes, etc. actually look like, so any amount of testing you can do beforehand will be very valuable.
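
As a back-of-the-envelope illustration of the per-hop penalty, here is a rough sketch. It rests on a simplifying assumption that is not taken from the Titan docs: without batching, each expanded vertex costs one round trip to the storage backend, and with batching, each hop costs one round trip. The function name and all the numbers are hypothetical; real behavior depends on your queries and cluster.

```python
def network_overhead_ms(vertices_per_hop, rtt_ms, batched):
    """Estimate network time added by a remote storage backend.

    vertices_per_hop: number of vertices expanded at each hop of a traversal.
    Assumption: unbatched costs one round trip per vertex,
    batched costs one round trip per hop.
    """
    if batched:
        round_trips = len(vertices_per_hop)
    else:
        round_trips = sum(vertices_per_hop)
    return round_trips * rtt_ms


# Hypothetical traversal: 3 hops expanding 1, 10, then 100 vertices.
frontier = [1, 10, 100]
for rtt in (0.1, 1.0):  # assumed round-trip times in ms, co-located vs. remote
    unbatched = network_overhead_ms(frontier, rtt, batched=False)
    batched = network_overhead_ms(frontier, rtt, batched=True)
    print(f"rtt={rtt} ms  unbatched={unbatched:.1f} ms  batched={batched:.1f} ms")
```

Under this model the batched overhead grows with the number of hops rather than the number of vertices, which is why it shrinks a lot but never reaches zero when the backend is remote.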

--Ted