I'm looking into JanusGraph (Janus) insert performance.
Using the cassandra-stress tool, I obtained the write rates below. They seem good enough to me, even though it's a very basic setup (3 nodes, replication factor 3, on AWS across 3 AZs). We could go to 12 nodes to improve speed further, but I'm not sure that is necessary yet. That is basically my question.
Op rate : 68,618 op/s [WRITE: 68,618 op/s]
Partition rate : 68,618 pk/s [WRITE: 68,618 pk/s]
Row rate : 68,618 row/s [WRITE: 68,618 row/s]
Latency mean : 2.9 ms [WRITE: 2.9 ms]
Latency median : 1.9 ms [WRITE: 1.9 ms]
Latency 95th percentile : 7.2 ms [WRITE: 7.2 ms]
Latency 99th percentile : 16.0 ms [WRITE: 16.0 ms]
Latency 99.9th percentile : 46.5 ms [WRITE: 46.5 ms]
Latency max : 325.8 ms [WRITE: 325.8 ms]
Now, I have Janus embedded inside a Flink job, and we are inserting vertices at a rate of around 200/sec (sometimes 400/sec, but the rate stays constant once the test has started). Janus uses its default settings (QUORUM for reads and writes).
We commit every 1,000 vertices. Changing the commit size does not bring any improvement.
This part of the application needs to be single-threaded (we will be able to multi-thread other processes), but I still think that 200 vertex inserts per second for one thread is below normal performance, especially considering that Cassandra can write about 70,000 rows/sec.
What sort of ratio should we expect between write rates in Janus and in Cassandra? And should I even try to improve Cassandra performance, or does the issue lie in Janus, or somewhere between Janus and Cassandra?
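A back-of-the-envelope check on that comparison, sketched in Groovy to match the scripts in this thread. It uses only the cassandra-stress figures above plus Little's law (requests in flight = throughput × mean latency). The "strictly sequential writer" is an assumption for illustration, not a description of how Janus actually talks to Cassandra, and the 2.9 ms mean was measured under load, so treat the results as rough orders of magnitude.

```groovy
// Numbers taken from the cassandra-stress output above.
double stressOpsPerSec = 68618
double meanLatencySec  = 2.9 / 1000   // 2.9 ms mean write latency

// Little's law: average requests in flight = throughput * mean latency.
double inFlight = stressOpsPerSec * meanLatencySec

// Assumption: a writer that issues one request at a time and waits for each
// reply cannot exceed 1 / latency operations per second.
double sequentialOpsPerSec = 1 / meanLatencySec

// Time budget per vertex at the observed 200 vertices/sec.
double msPerVertex = 1000 / 200

println "in-flight requests during cassandra-stress: ${Math.round(inFlight)}"
println "sequential ceiling at 2.9 ms per write: ${Math.round(sequentialOpsPerSec)} op/s"
println "time per vertex at 200/sec: ${msPerVertex} ms"
```

In other words, the 68,618 op/s figure corresponds to roughly 199 requests in flight at once, while a single caller waiting on each 2.9 ms write would manage only about 345 op/s. The stress-test rate and a single-threaded insert rate are therefore not directly comparable, and 5 ms per vertex is in the range of one or two sequential round trips.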
I looked at the read/write latencies on both Tables and ColumnFamily in Cassandra, and the only noticeable latency is on the graphindex table. That latency seems within the norm (if I am reading the values correctly). The other tables don't seem to be used much, apart from edgestore, and even that far less.
If my performance issue is caused by Janus configuration, which metrics should I look at?
I've thrown as many CPU cores and as much memory at it as I could, with no major improvement. I've also gradually increased CPU, memory, and heap on Cassandra (up to 64 GB memory and 16 cores); that did improve the cassandra-stress results, but not really the insert rate into Janus.
Where should I look for the bottleneck or config issue? Which Janus/Java or Cassandra metrics can help me, considering the results above?
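One low-tech way to narrow this down before reaching for metrics is to time the mutation calls and the commits separately. The sketch below is a minimal illustration, not a JanusGraph facility: `timed` and `totals` are hypothetical helper names, and the workload is a stand-in so the snippet runs without a graph.

```groovy
// Accumulated wall-clock time per phase, in nanoseconds.
totals = [:].withDefault { 0L }

// Runs the closure, adds its elapsed time to the named phase, returns its result.
timed = { String phase, Closure work ->
    long start = System.nanoTime()
    try {
        return work()
    } finally {
        totals[phase] += System.nanoTime() - start
    }
}

// Stand-in workload (an assumption for illustration): replace the closure
// bodies with the real graph calls, as described in the note below.
(1..1000).each { i ->
    timed('mutate') { Math.sqrt(i) }
    if (i % 100 == 0) timed('commit') { Thread.sleep(1) }
}

totals.each { phase, nanos -> println "${phase}: ${nanos / 1000000} ms" }
```

With the scripts from this thread, the 'mutate' closure would wrap `g.addV('node').property('name', 'n_'+it).iterate()` and the 'commit' closure would wrap `graph.tx().commit()`. If most of the time ends up under 'commit', the storage write path is the first suspect; if it ends up under 'mutate', the cost is more likely in the traversal itself, for example index lookups.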
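// Test 1: plain vertex inserts with no index, committing every txn vertices; the last expression is the elapsed time in ms
graph = JanusGraphFactory.open('conf/janusgraph-cassandra.properties')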
g = graph.traversal()
txn = 10000
t0 = System.currentTimeMillis(); (1..1000000).each{ g.addV('node').property('name', 'n_'+it).iterate(); if (it % txn == 0) graph.tx().commit(); }; t1 = System.currentTimeMillis(); t1-t0
// Test 2: composite index on 'name', then get-or-create each vertex through an index lookup
graph = JanusGraphFactory.open('conf/janusgraph-cassandra.properties')
mgmt = graph.openManagement()
node = mgmt.makeVertexLabel('node').make()
name = mgmt.makePropertyKey('name').dataType(String.class).make()
nameIndex = mgmt.buildIndex('nameIndex', Vertex.class).addKey(name).buildCompositeIndex()
mgmt.commit()
g = graph.traversal()
txn = 10000
t0 = System.currentTimeMillis(); (1..1000000).each{ g.V().has('name', 'n_'+it).fold().coalesce(unfold(), addV('node').property('name', 'n_'+it)).iterate(); if (it % txn == 0) graph.tx().commit(); }; t1 = System.currentTimeMillis(); t1-t0
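// Test 3: unique composite index with LOCK consistency; prints the elapsed ms for every 10,000 vertices
graph = JanusGraphFactory.open('conf/janusgraph-cassandra.properties')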
mgmt = graph.openManagement()
node = mgmt.makeVertexLabel('node').make()
name = mgmt.makePropertyKey('name').dataType(String.class).make()
nameIndex = mgmt.buildIndex('nameIndex', Vertex.class).addKey(name).unique().buildCompositeIndex()
mgmt.setConsistency(name, ConsistencyModifier.LOCK)
mgmt.setConsistency(nameIndex, ConsistencyModifier.LOCK)
mgmt.commit()
g = graph.traversal()
txn = 10000
t0 = t2 = System.currentTimeMillis(); (1..200000).each{ g.addV('node').property('name', 'n_'+it).iterate(); if (it % txn == 0) graph.tx().commit(); if (it % 10000 == 0) { t3 = System.currentTimeMillis(); println ''+it+' '+(t3-t2); t2 = t3; } }; t1 = System.currentTimeMillis(); t1-t0
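Each script above ends with the elapsed time in milliseconds, so a small conversion helps when comparing against the 200 vertices/sec seen in the Flink job. The sketch below is only arithmetic; `rate` and `msAt200PerSec` are hypothetical names, and the only numbers used are the 1,000,000-vertex loop size and the 200/sec figure from this thread.

```groovy
// Converts a vertex count and elapsed milliseconds into vertices per second.
rate = { long vertices, long elapsedMs -> vertices * 1000.0 / elapsedMs }

// Elapsed time the 1,000,000-vertex scripts would report if they ran at the
// 200 vertices/sec observed in the Flink job.
long msAt200PerSec = 1000000.intdiv(200) * 1000

println "1,000,000 vertices at 200/sec: ${msAt200PerSec} ms (${msAt200PerSec / 60000} min)"
println "check: ${rate(1000000, msAt200PerSec)} vertices/sec"
```

Calling `rate(1000000, t1 - t0)` after a run gives a figure that can be compared directly with the Flink insert rate; a result well above 200 would suggest that the console scripts and the Flink job differ in more than just the storage backend.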