slow writes on a production Cassandra Cluster


Kieran Sherlock

Jun 10, 2016, 1:11:01 AM
to Aurelius

We are deploying Titan 1.0 on a Cassandra cluster (DSE 4.8.6) that is currently idle, and we are seeing vertex and edge addition times of up to a second, with a significant proportion in the 200 ms range. We had just completed a large test data load, during which we were seeing decent performance. We then dropped (via cqlsh) and rebuilt the keyspace and schema. Performance has been horrendous since.

The cluster is specced as follows:

Cassandra Cluster:  2 x DC, 3 node / DC, SSD, 64GB
cassandra service: 16g
gremlin-server: 6g co-hosted on each node

The following configs are being used for the gremlin-server processes

gremlin-server.yaml
host: 0.0.0.0
port: 8182
threadPoolWorker: 16
gremlinPool: 64
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/titan.properties}
plugins:
  - aurelius.titan
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  nashorn: {
      imports: [java.lang.Math],
      staticImports: [java.lang.Math.PI]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph, bufferSize: 8192000 }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true, bufferSize: 81920 }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 20000
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}

with the following titan properties

gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=cassandra
storage.cassandra.keyspace=my_keyspace
storage.hostname=10.1.2.3
storage.username=user
storage.password=password
storage.cassandra.astyanax.local-datacenter=DC2
storage.cassandra.read-consistency-level=LOCAL_QUORUM
storage.cassandra.write-consistency-level=LOCAL_QUORUM
ids.block-size=200000
storage.buffer-size=102400
query.fast-property=true
cache.db-cache=true


The keyspace is defined as 

CREATE KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3'}  AND durable_writes = true;


At the gremlin console, local to the gremlin-server, the following actions each take a noticeable amount of time:

gremlin> :> graph.addVertex('a','b')
==>v[12472]
gremlin> :> graph.addVertex('a','c')
==>v[8232]
gremlin> :> g.V(12472).next().addEdge('e',g.V(8232).next())
==>e[2s7-9mg-27th-6co][12472-e->8232]
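For what it's worth, one way to put a number on "noticeable" is the Gremlin Console's built-in `clock` helper, which runs a closure several times and reports the average wall-clock time. A sketch, run locally in the console rather than via `:>` (the property value is made unique with `nanoTime` on the assumption that the key carries a unique index, as described below):

```groovy
// clock(n) { ... } executes the closure n times and returns the average time in ms
clock(10) { graph.addVertex('a', 'v' + System.nanoTime()) }
graph.tx().rollback()  // discard the vertices created during timing
```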

As you can see from the ids this is on an empty graph.

We are using only composite indexes, one unique, the others not. The indexes are built like this:

...
if (mgmt.getPropertyKey("a") == null) {
    name = mgmt.makePropertyKey("a").dataType(String.class).make();
    namei = mgmt.buildIndex("a", Vertex.class).addKey(name).unique().buildCompositeIndex();
    mgmt.setConsistency(namei, ConsistencyModifier.DEFAULT);
}
...
mgmt.commit();
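As an aside, only the `unique()` index sends Titan through the ConsistentKeyLocker path on writes; non-unique composite indexes do not acquire locks. A hedged sketch, assuming uniqueness is not actually required for a given key (property key "b" and index name "byB" are made up for illustration):

```groovy
// non-unique composite index: no unique(), so no lock writes on insert
b = mgmt.makePropertyKey("b").dataType(String.class).make();
mgmt.buildIndex("byB", Vertex.class).addKey(b).buildCompositeIndex();
```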

While building the schema we did see the following warning

7353 [main] WARN  com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLocker  - Lock write succeeded but took too long: duration PT0.13S exceeded limit PT0.1S
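The PT0.1S limit in that warning is Titan's default `storage.lock.wait-time` of 100 ms. Raising it in titan.properties quiets the warning (a sketch; 300 ms is an arbitrary value above the observed 130 ms lock write), though it only affects lock acquisition on unique-index writes, not ordinary writes:

```
# sketch: lock wait time in ms, default 100; raised above the observed 130 ms
storage.lock.wait-time=300
```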

Breaking the replication to the remote DC didn't make a difference, so it seems to be a local issue. Yet we see the same behavior in each DC.

Any help, or pointers to how to debug the issue would be greatly appreciated.

Kieran.

Kieran Sherlock

Jun 11, 2016, 7:50:28 PM
to Aurelius

Some more data from our Titan cluster.

cfhistograms on the Titan tables look pretty reasonable, with, for example, the edgestore looking like this:

my_keyspace/edgestore histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             1.00             24.00             86.00               535                10
75%             1.00             29.00            103.00               770                12
95%             1.00             60.00            149.00               770                14
98%             1.00             72.00            310.00               924                14
99%             1.00             86.00            372.00               924                14
Min             0.00              0.00              9.00                61                 2
Max             4.00          61214.00         263210.00            315852              3311

The other tables also look reasonable, with the worst 99th percentile being titan_ids:

my_keyspace/titan_ids histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             1.00             10.00            258.00               310                 4
75%             1.00             14.00            310.00               446                 6
95%             2.00            372.00            446.00               642                 8
98%             2.00            535.00            924.00               770                10
99%             2.00            535.00           1109.00               770                10
Min             0.00              4.00             21.00                87                 0
Max             2.00            535.00           1331.00               770                10

Running a test pushing 100,000 vertices into Titan using 30 threads, we are seeing a little under 1700/second with an average latency of 17 ms (std dev 8 ms) on the client side (reading the ResultSet synchronously).
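If it helps to compare, the same load can be batched server-side so that several vertex additions share one LOCAL_QUORUM commit round-trip. A minimal single-threaded sketch (the batch size of 100 is an assumption; the values of `a` are kept unique because of the unique index):

```groovy
// batch commits: N vertices per transaction instead of one commit per vertex
batch = 100
(0..<100000).collate(batch).each { chunk ->
    chunk.each { i -> graph.addVertex('a', "v${i}") }
    graph.tx().commit()
}
```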

Does this look like what we should expect for this class of hardware?

Thanks again for any help or guidance you can provide.

Kieran.

Samik R

Jun 13, 2016, 2:15:50 AM
to Aurelius
Just a note that when you are building a schema over a network, the message "Lock write succeeded but took too long" often shows up. We have seen schema building actually fail in a few cases. There are essentially two ways to mitigate this: (1) build the schema on the same box where the Titan server is running, or (2) increase the time limit, which can only be done from the gremlin shell.
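For (2), assuming the limit in question is the `storage.lock.wait-time` option, one way to change it from the gremlin shell is through the management system (the 300 ms value is illustrative):

```groovy
// raise the lock wait limit via the management API (value in ms)
mgmt = graph.openManagement()
mgmt.set('storage.lock.wait-time', 300)
mgmt.commit()
```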

Not sure if that is affecting your overall speed, though.