Cassandra with Titan

AJ

Jul 19, 2012, 2:14:18 PM

to gremli...@googlegroups.com

Hello

Had a question on using Cassandra with Titan:

Is the expectation still that one Cassandra node stores the full graph database? In other words, using Cassandra partitioners (random or ordered) to shard the Titan column families is not recommended, right?

The sharding issue described for Neo4j (http://jim.webber.name/2011/02/16/3b8f4b3d-c884-4fba-ae6b-7b75a191fa22.aspx) is still very much an issue with Titan as well, irrespective of the storage layer, right?

Thanks !

Matthias Broecheler

Jul 19, 2012, 3:21:05 PM

to gremli...@googlegroups.com

No, Titan distributes the graph across the nodes of the Cassandra cluster.

In our Titan benchmark we used 6 Cassandra nodes, each storing a fragment of the Twitter graph:

http://thinkaurelius.github.com/titan/doc/titan-stress-poster.pdf

So the more Cassandra nodes you have in your cluster, the more graph data you can store and the more cache is available to keep the "hot" graph data in memory.

--
Matthias Broecheler, PhD
http://www.matthiasb.com
E-Mail: m...@matthiasb.com

AJ

Jul 19, 2012, 4:09:13 PM

to gremli...@googlegroups.com

Thanks Matthias. In that case, can you expand on the following:

1. Which partitioner is used: Random or BOP? I doubt it is RP, since traversals would then have high latencies, but I will wait to hear from you.

2. How are hotspots avoided, i.e. one node holding much more of the graph data than the other Cassandra nodes? Are the nodes constantly rebalanced by changing the BOP token ranges (assuming it is BOP)?

3. How are the two contradictory requirements of balancing nodes and avoiding inter-node traversals managed?

4. It seems that the Titan CFs hold data in a binary format (not plain ASCII), some proprietary Titan encoding that normal users can't read or write directly, only through Titan. If the CFs are partitioned across several Cassandra nodes, do I risk corrupting the full graph database if at some point I can't reach the latest copy of a partition and have to fall back to a replica that is slightly behind?

It is one thing to lose some records (as happens in edge cases of eventually consistent distributed databases like Cassandra) and another for the full database to get corrupted. Because the data stored by Titan is in a native format, I am not sure which of the two I would end up dealing with during a disaster.

Thanks for your inputs.

AJ

Jul 20, 2012, 1:50:00 PM

to gremli...@googlegroups.com

Forgot to expand: BOP is Cassandra's ByteOrderedPartitioner and RP is the RandomPartitioner.

Thanks.

Matthias Broecheler

Jul 25, 2012, 3:48:23 AM

to gremli...@googlegroups.com

Hey AJ,

1) It's both, actually, depending on your configuration. Random partitioning is the "safe" option and the one we support in the alpha release of Titan. "Safe" because it leads to very well balanced partitions, is easy to use, requires no token ring configuration, etc. Titan also supports a vertex-id prefixing mode in which vertices are slotted into partition buckets by virtue of sharing an id prefix. Together with the byte ordered partitioner, this gives vertex locality in the Cassandra cluster, which can lead to better performance for traversals.

However, those performance differences are a lot smaller than one might think. One reason is that by distributing the vertices across machines (and RP does that very well) you get the benefit of being able to multi-thread your traversals across multiple Cassandra machines, which actually gives you very low latencies. Titan supports the ThreadedTransactionalGraph interface in Blueprints to enable such multi-threaded traversals.

Then again, some domains have an obvious vertex partitioning that leads to significant performance gains. For those cases we will make vertex-id-prefix partitioning with BOP easier to use in a future version.

2) RP leads to much less trouble with hotspots. When using vertex-id-prefix partitioning with BOP, you need to ensure that you don't achieve too much locality, or hotspots will appear. Constantly rebalancing would be too expensive.

3) Two options based on the above:

a) Multi-threaded traversals + RP + topology-aware Cassandra setup (e.g. Amazon cluster groups) = high performance and easy

b) Vertex id prefix and partition buckets for local subgraphs + BOP = very high performance but requires more configuration

==> we are working on a future version of Titan to make this easy
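
A toy Python sketch of the trade-off between options a) and b), for illustration only. It is not Titan or Cassandra code: the key layout, the community sizes, the split points and the helper names (`rp_node`, `bop_node`, `vertex_key`, `evaluate`) are all assumptions, and real Cassandra maps an MD5 token onto ring ranges rather than taking a modulo.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

NUM_NODES = 4
# Hypothetical community sizes; the first one is deliberately oversized.
COMMUNITY_SIZES = [700, 100, 100, 100]
# Hypothetical BOP ring: node i owns keys whose first byte is in [64*i, 64*i + 63].
SPLITS = [bytes([64]), bytes([128]), bytes([192])]


def rp_node(key: bytes) -> int:
    """Random-partitioner style: hash the key, then map the hash to a node."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % NUM_NODES


def bop_node(key: bytes) -> int:
    """Byte-ordered style: each node owns a contiguous range of raw keys."""
    return bisect_right(SPLITS, key)


def vertex_key(community: int, local_id: int, prefixed: bool) -> bytes:
    """Build a row key; with prefixed=True the community becomes a one-byte prefix."""
    body = (community * 1000 + local_id).to_bytes(8, "big")
    return bytes([community * 64]) + body if prefixed else body


def evaluate(place, prefixed: bool):
    """Return (fraction of edges crossing nodes, vertices per node).

    Each community is a ring: vertex i has one edge to vertex i + 1.
    """
    load = Counter()
    edges = crossing = 0
    for c, size in enumerate(COMMUNITY_SIZES):
        for i in range(size):
            u = place(vertex_key(c, i, prefixed))
            v = place(vertex_key(c, (i + 1) % size, prefixed))
            load[u] += 1
            edges += 1
            crossing += u != v
    return crossing / edges, [load[n] for n in range(NUM_NODES)]


rp_cross, rp_load = evaluate(rp_node, prefixed=False)
bop_cross, bop_load = evaluate(bop_node, prefixed=True)
print(f"RP : {rp_cross:.0%} of edges cross nodes, vertices per node {rp_load}")
print(f"BOP: {bop_cross:.0%} of edges cross nodes, vertices per node {bop_load}")
```

In this toy setup, prefixing plus byte ordering keeps every edge on a single node but places 700 of the 1,000 vertices on node 0 (the hotspot from point 2), while hashing spreads the vertices roughly evenly at the cost of most edges crossing nodes (about three quarters in expectation with four nodes).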

4) Yes, Titan uses its own format for encoding edges and properties in ByteBuffers, to make use of graph-centric compression schemes and to store data compactly for quick retrieval. However, there is no danger of data corruption, because all key-column-value triples are stored independently of each other. If you were to lose one for whatever reason, you would lose an edge in the graph. In that sense, Titan preserves the data model of the underlying storage backend.
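
To make the independence argument concrete, here is a minimal Python sketch of a key-column-value store in which each edge is one self-contained triple. The byte layout and the names (`encode_edge`, `decode_edge`, `add_edge`, `edges`) are hypothetical and much simpler than Titan's actual encoding; the point is only that a missing triple removes one edge and leaves every other triple decodable.

```python
import struct

# Toy store: (row key, column) -> value.
# Row key = source vertex id, column = encoded edge, value = edge properties.
store = {}


def encode_edge(label_id: int, target: int) -> bytes:
    """Pack an edge into a column name (hypothetical layout: 2-byte label, 8-byte target)."""
    return struct.pack(">HQ", label_id, target)


def decode_edge(column: bytes):
    """Unpack a column name back into (label_id, target)."""
    return struct.unpack(">HQ", column)


def add_edge(src: int, label_id: int, dst: int, value: bytes = b"") -> None:
    store[(struct.pack(">Q", src), encode_edge(label_id, dst))] = value


def edges():
    """Decode every triple on its own; no triple refers to another one."""
    return sorted(
        (struct.unpack(">Q", key)[0], *decode_edge(column))
        for key, column in store
    )


add_edge(1, 7, 2)
add_edge(1, 7, 3)
add_edge(2, 7, 3)

# Simulate reading from a replica that never received the write for edge 1 -> 3.
del store[(struct.pack(">Q", 1), encode_edge(7, 3))]

print(edges())  # [(1, 7, 2), (2, 7, 3)]
```

The remaining two edges still decode because nothing in one triple points at another: there is no shared index or checksum block that a stale replica could invalidate.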

AJ

Jul 25, 2012, 3:38:36 PM

to gremli...@googlegroups.com

Thanks Matthias. This is very good insight!

When I get some time, I will run tests with RP to see how the latency looks.

Marko Rodriguez

Jul 25, 2012, 3:42:37 PM

to gremli...@googlegroups.com

Hi AJ,

You might be interested in this Titan benchmark using RP that we did for GraphLab 2012. The stress test is at the bottom of the poster:

http://thinkaurelius.github.com/titan/doc/titan-stress-poster.pdf

Enjoy,

Marko.

http://markorodriguez.com

Matthias Broecheler

Jul 25, 2012, 3:42:38 PM

to gremli...@googlegroups.com

If latency is a concern, make sure you place all Cassandra nodes in one placement group (if you are using EC2, or whatever the equivalent is for your data center).
