The sample multimap has Double keys and Float values, as follows:
{1.50E8=[10, 20], 1.51E8=[-10, -13, -14, -15], 1.52E8=[-10, -11]}
I need to use Spark to get the min, max, and average for each key. For example, for the first key, 1.50E8, the result would be min 10, max 20, avg 15.
I already have code that works once the data is in a DataFrame:
queryMV.groupBy(col("channel"))
    .agg(min("power"), max("power"), avg("power"))
    .write().mode(SaveMode.Append)
    .option("table", "results")
    .option("keyspace", "model")
    .format("org.apache.spark.sql.cassandra").save();
I just need help converting this multimap into a DataFrame.
// Needed to convert the Java collections returned by the multimap to Scala
import scala.collection.JavaConverters._
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Define the schema for the DataFrame: a Double key and an array of Double values.
// Note the key column is DoubleType (not StringType), since the multimap keys are Doubles.
val schema = StructType(Array(
  StructField("key", DoubleType, false),
  StructField("values", ArrayType(DoubleType, true), true)
))

// Map each multimap entry to a Row object
val buf = scala.collection.mutable.ListBuffer.empty[Row]
for (key <- multimap.keySet().asScala) {
  // multimap.get returns a java.util.Collection; convert it and widen the
  // Float values to Double to match the schema
  val values = multimap.get(key).asScala.map(_.doubleValue).toArray
  buf += Row(key, values)
}

// Create the DataFrame
val df = sqlContext.createDataFrame(sc.parallelize(buf), schema)
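To feed this DataFrame into your existing aggregation, the array column still needs to be flattened into one row per value. A minimal sketch (assuming the `df` built above, and using Spark SQL's `explode` function; the renames to "channel"/"power" are just to match your query's column names):

```scala
import org.apache.spark.sql.functions._

// One row per (key, value) pair, with columns renamed so the
// existing groupBy/agg query can run unchanged
val flattened = df.select(
  col("key").as("channel"),
  explode(col("values")).as("power")
)

flattened.groupBy(col("channel"))
  .agg(min("power"), max("power"), avg("power"))
  .show()
```

For the sample data, the row for channel 1.5E8 would show min 10.0, max 20.0, avg 15.0.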
--
You received this message because you are subscribed to the Google Groups "DataStax Spark Connector for Apache Cassandra" group.