I'm trying to load 1 billion records into Cassandra. Each record has a different partition key, so I think there is no need for batching. Here are the parameters I'm trying to tune:
"spark.cassandra.output.batch.size.rows": 1,
"spark.cassandra.output.concurrent.writes":500,
"spark.cassandra.output.throughput_mb_per_sec": 1
Are there any other parameters I need to tune in this kind of scenario so that I don't overwhelm the Cassandra cluster?
Thanks
Giri,
Have you seen this presentation? It's a gold mine.
Jim
From: Giri
Sent: Tuesday, April 18, 3:30 PM
Subject: spark cassandra write performance
To: DataStax Spark Connector for Apache Cassandra