Is WriteTime and TTL available from Dataset with an option ?

John Engstrom

unread,

May 4, 2018, 11:46:24 AM5/4/18

to DataStax Spark Connector for Apache Cassandra

To create Datasets with the Cassandra Spark Connection I'm doing the following:

val procData3 = spark.read.format("org.apache.spark.sql.cassandra").options(Map( "table" -> "process_data3", "keyspace" -> "processkeyspace")).load.cache()

Is there an option that can be passed to include WriteTime and TTL for the columns? Or do I have to do:
sc.cassandraTable[(String, Int, Double, Long, Option[Long], String)]("processkeyspace", "process_data3").select("host", "running_processes", "system_cpu_usage", WriteTime("running_processes"),TTL("running_processes"),"collection_time") and then convert the CassandraTableScanRDD to a Dataset?

If it's the latter then any tips for converting a CassandraTableScanRDD to a Dataset would be appreciated.

Thanks,
John Engstrom

Collin Sauve

unread,

May 4, 2018, 11:59:58 AM5/4/18

to spark-conn...@lists.datastax.com

I’ve asked that same question before and AFAIK it is not possible to retrieve these with the Dataset API. I filed this issue in the Jira back in February: https://datastax-oss.atlassian.net/projects/SPARKC/issues/SPARKC-528

There are a couple of ways to convert an RDD to a Dataset. I use a case class like this:

case class MyRow(field1: String, writeTime: java.sql.Date)

val rdd = sc.cassandraTable(keyspaceName, tableName)
    .select("field1", WriteTime("field1", Some("field1_writetime")))
    .map(row => MyRow(
        row.get[String]("field1"), 
        new Date(TimeUnit.MILLISECONDS.convert(row.get[Long]("field1_writetime"), TimeUnit.MICROSECONDS))
    )
val ds = spark.sqlContext.createDataFrame(rdd)

--
You received this message because you are subscribed to the Google Groups "DataStax Spark Connector for Apache Cassandra" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spark-connector-...@lists.datastax.com.

John Engstrom

unread,

May 4, 2018, 12:10:49 PM5/4/18

to DataStax Spark Connector for Apache Cassandra

Thanks Collin - I appreciate the comment and the example on converting to a Dataset.

I'll add my support to wanting that Jira ticket worked. :)

Reply all

Reply to author

Forward