Metrics not reporting


Clayton Stout

Oct 14, 2015, 2:15:39 PM
to DataStax Spark Connector for Apache Cassandra
I'm trying to inspect Spark metrics for a running job. I have updated the metrics.properties file with a CSV sink for all instances and included the two lines to get metrics from the CassandraConnectorSource. I am seeing files show up (executor files on the workers and driver files on the master), but I'm not sure that the connector-specific metrics are present. Currently I only see:

app-20151014175609-0015.1.executor.filesystem.file.largeRead_ops.csv
app-20151014175609-0015.1.executor.filesystem.file.read_bytes.csv
app-20151014175609-0015.1.executor.filesystem.file.read_ops.csv
app-20151014175609-0015.1.executor.filesystem.file.write_bytes.csv
app-20151014175609-0015.1.executor.filesystem.file.write_ops.csv
app-20151014175609-0015.1.executor.filesystem.hdfs.largeRead_ops.csv
app-20151014175609-0015.1.executor.filesystem.hdfs.read_bytes.csv
app-20151014175609-0015.1.executor.filesystem.hdfs.read_ops.csv
app-20151014175609-0015.1.executor.filesystem.hdfs.write_bytes.csv
app-20151014175609-0015.1.executor.filesystem.hdfs.write_ops.csv
app-20151014175609-0015.1.executor.threadpool.activeTasks.csv
app-20151014175609-0015.1.executor.threadpool.completeTasks.csv
app-20151014175609-0015.1.executor.threadpool.currentPool_size.csv
app-20151014175609-0015.1.executor.threadpool.maxPool_size.csv

listed on the workers.

My metrics file looks like this:

executor.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource
driver.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource

*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink

# Polling period for CsvSink
*.sink.csv.period=1

*.sink.csv.unit=minutes

# Polling directory for CsvSink
*.sink.csv.directory=/tmp/

I'm on spark-cassandra-connector 1.3.0. Any thoughts on what is going wrong?

Max Pimm

Dec 2, 2015, 4:57:49 PM
to DataStax Spark Connector for Apache Cassandra

I have the same issue using 1.5.0-M2; however, this commit might have fixed it in M3. I'll have to try it.

https://github.com/datastax/spark-cassandra-connector/commit/a9d7eec74da958601bc8865c40c69b288f1b3cd0

Joseph Lust

Jan 27, 2016, 7:42:46 PM
to DataStax Spark Connector for Apache Cassandra, max...@gmail.com
I've run into the same problem. I'm curious: is this working for anyone in 1.6?

Thanks,
Joe

Sa Xiao

Feb 23, 2016, 8:01:03 PM
to DataStax Spark Connector for Apache Cassandra, max...@gmail.com
Has anyone had any luck getting the metrics to work?

I ran into the same problem. I'm using spark-cassandra-connector_2.10:1.5.0 and Spark 1.5.1. I am trying to send the metrics to StatsD. My metrics.properties is as follows:

*.sink.statsd.class=org.apache.spark.metrics.sink.StatsDSink
*.sink.statsd.host=localhost
*.sink.statsd.port=18125

executor.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource
driver.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource

I'm able to see other metrics, e.g. from the DAGScheduler, but not any from the CassandraConnectorSource. For example, I searched for "write-byte-meter" but didn't find it. I didn't see the metrics in the Spark UI either, and I didn't find any relevant error or info in the logs indicating that the CassandraConnectorSource is actually registered with the Spark metrics system. Any pointers would be very much appreciated!

Thanks,
Sa

Sa Xiao

Feb 24, 2016, 7:25:00 PM
to spark-conn...@lists.datastax.com, max...@gmail.com
OK, I got the metrics working. I'd like to report back here in case anyone is interested.

I'm using spark-cassandra-connector_2.10:1.5.0-M2, Spark 1.5.1, and running with Mesos. I tried spark-cassandra-connector_2.10:1.5.0, and it worked too. Here are the things I did to make it work:

1. Add the following to the metrics.properties for both the driver and all the Mesos slaves:
executor.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource
driver.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource

2. Add the following to spark-defaults.conf for the driver:
spark.executor.extraClassPath=/opt/spark/app/lib/spark-cassandra-connector-assembly-1.5.0-M2-SNAPSHOT.jar

3. Add the spark-cassandra-connector-assembly-1.5.0-M2-SNAPSHOT.jar to the directory you set above. Instructions for building the jar are here: https://github.com/datastax/spark-cassandra-connector/blob/master/doc/13_spark_shell.md
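For reference, steps 2 and 3 boil down to something like the following. The checkout tag and the assembly output path are my assumptions about the repo layout, so adjust them to your checkout:

```shell
# Build the connector assembly jar (per the linked doc) and copy it into
# the directory that spark.executor.extraClassPath points at.
# The tag and the target path are assumptions; adjust to your checkout.
git clone https://github.com/datastax/spark-cassandra-connector.git
cd spark-cassandra-connector
git checkout v1.5.0-M2
sbt assembly
cp spark-cassandra-connector-assembly/target/scala-2.10/spark-cassandra-connector-assembly-*.jar \
  /opt/spark/app/lib/
```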


Jayant Gope

Jun 8, 2024, 3:55:10 AM
to DataStax Spark Connector for Apache Cassandra, Sa Xiao, max...@gmail.com
Hey, could you please explain the purpose of the jar file spark-cassandra-connector-assembly-1.5.0-M2-SNAPSHOT.jar? What does it include?