slow writes on a production Cassandra Cluster


Kieran Sherlock

Jun 10, 2016, 1:11:01 AM
to Aurelius

We are deploying Titan 1.0 on a Cassandra cluster (DSE 4.8.6) that is currently idle, and we are seeing vertex and edge addition times of up to a second, with a significant proportion in the 200 ms range. We had just completed a large test data load, during which we were seeing decent performance. We then dropped (via cqlsh) and rebuilt the keyspace and schema. Performance has been horrendous since.

The cluster is specced as follows:

Cassandra Cluster:  2 x DC, 3 node / DC, SSD, 64GB
cassandra service: 16g
gremlin-server: 6g co-hosted on each node

The following configs are being used for the gremlin-server processes

gremlin-server.yaml
host: 0.0.0.0
port: 8182
threadPoolWorker: 16
gremlinPool: 64
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/titan.properties}
plugins:
  - aurelius.titan
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  nashorn: {
      imports: [java.lang.Math],
      staticImports: [java.lang.Math.PI]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph, bufferSize: 8192000 }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true, bufferSize: 81920 }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 20000
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}

with the following titan properties

gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=cassandra
storage.cassandra.keyspace=my_keyspace
storage.hostname=10.1.2.3
storage.username=user
storage.password=password
storage.cassandra.astyanax.local-datacenter=DC2
storage.cassandra.read-consistency-level=LOCAL_QUORUM
storage.cassandra.write-consistency-level=LOCAL_QUORUM
ids.block-size=200000
storage.buffer-size=102400
query.fast-property=true
cache.db-cache=true


The keyspace is defined as 

CREATE KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3'}  AND durable_writes = true;


At the gremlin console, local to the gremlin-server, the following actions each take a noticeable amount of time:

gremlin> :> graph.addVertex('a','b')
==>v[12472]
gremlin> :> graph.addVertex('a','c')
==>v[8232]
gremlin> :> g.V(12472).next().addEdge('e',g.V(8232).next())
==>e[2s7-9mg-27th-6co][12472-e->8232]
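For what it's worth, one way to put a number on "noticeable" is the Gremlin Console's built-in `clock` helper, which runs a closure several times and reports the average wall-clock time. A sketch, run locally in the console rather than via `:>` (the property value is made unique with `nanoTime` on the assumption that the key carries a unique index, as described below):

```groovy
// clock(n) { ... } executes the closure n times and returns the average time in ms
clock(10) { graph.addVertex('a', 'v' + System.nanoTime()) }
graph.tx().rollback()  // discard the vertices created during timing
```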

As you can see from the ids this is on an empty graph.

We are using only composite indexes, one unique, the others not. The indexes are built like this:

...
if (mgmt.getPropertyKey("a") == null) {
    name = mgmt.makePropertyKey("a").dataType(String.class).make();
    namei = mgmt.buildIndex("a", Vertex.class).addKey(name).unique().buildCompositeIndex();
    mgmt.setConsistency(namei, ConsistencyModifier.DEFAULT);
}
...
mgmt.commit();
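As an aside, only the `unique()` index sends Titan through the ConsistentKeyLocker path on writes; non-unique composite indexes do not acquire locks. A hedged sketch, assuming uniqueness is not actually required for a given key (property key "b" and index name "byB" are made up for illustration):

```groovy
// non-unique composite index: no unique(), so no lock writes on insert
b = mgmt.makePropertyKey("b").dataType(String.class).make();
mgmt.buildIndex("byB", Vertex.class).addKey(b).buildCompositeIndex();
```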

While building the schema we did see the following warning

7353 [main] WARN  com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLocker  - Lock write succeeded but took too long: duration PT0.13S exceeded limit PT0.1S
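The PT0.1S limit in that warning is Titan's default `storage.lock.wait-time` of 100 ms. Raising it in titan.properties quiets the warning (a sketch; 300 ms is an arbitrary value above the observed 130 ms lock write), though it only affects lock acquisition on unique-index writes, not ordinary writes:

```
# sketch: lock wait time in ms, default 100; raised above the observed 130 ms
storage.lock.wait-time=300
```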

Breaking the replication to the remote DC didn't make a difference, so it seems to be a local issue. Yet we see the same behavior in each DC.

Any help, or pointers to how to debug the issue would be greatly appreciated.

Kieran.

Kieran Sherlock

Jun 11, 2016, 7:50:28 PM
to Aurelius

Some more data from our Titan cluster.

cfhistograms on the Titan tables look pretty reasonable, with, for example, the edgestore looking like this:

my_keyspace/edgestore histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             1.00             24.00             86.00               535                10
75%             1.00             29.00            103.00               770                12
95%             1.00             60.00            149.00               770                14
98%             1.00             72.00            310.00               924                14
99%             1.00             86.00            372.00               924                14
Min             0.00              0.00              9.00                61                 2
Max             4.00          61214.00         263210.00            315852              3311

The other tables also look reasonable, with the worst 99th percentile being titan_ids:

my_keyspace/titan_ids histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             1.00             10.00            258.00               310                 4
75%             1.00             14.00            310.00               446                 6
95%             2.00            372.00            446.00               642                 8
98%             2.00            535.00            924.00               770                10
99%             2.00            535.00           1109.00               770                10
Min             0.00              4.00             21.00                87                 0
Max             2.00            535.00           1331.00               770                10

Running a test pushing 100,000 vertices into Titan using 30 threads, we are seeing a little under 1700/second with an average latency of 17 ms (std dev 8 ms) on the client side (reading the ResultSet synchronously).
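If it helps to compare, the same load can be batched server-side so that several vertex additions share one LOCAL_QUORUM commit round-trip. A minimal single-threaded sketch (the batch size of 100 is an assumption; the values of `a` are kept unique because of the unique index):

```groovy
// batch commits: N vertices per transaction instead of one commit per vertex
batch = 100
(0..<100000).collate(batch).each { chunk ->
    chunk.each { i -> graph.addVertex('a', "v${i}") }
    graph.tx().commit()
}
```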

Does this look like what we should expect for this class of hardware?

Thanks again for any help or guidance you can provide.

Kieran.

Samik R

Jun 13, 2016, 2:15:50 AM
to Aurelius
Just a note that when you are building a schema over a network, the message "Lock write succeeded but took too long" often shows up. We have seen schema building actually fail in a few cases. There are essentially two ways to mitigate this: (1) build the schema on the same box where the Titan server is running, or (2) increase the time limit, which can only be done from the gremlin shell.
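For (2), assuming the limit in question is the `storage.lock.wait-time` option, one way to change it from the gremlin shell is through the management system (the 300 ms value is illustrative):

```groovy
// raise the lock wait limit via the management API (value in ms)
mgmt = graph.openManagement()
mgmt.set('storage.lock.wait-time', 300)
mgmt.commit()
```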

Not sure if that is affecting your overall speed, though.