Can you create an RDD from a multimap?


Neutronis Consulting

Feb 19, 2016, 9:25:28 AM
to DataStax Spark Connector for Apache Cassandra
I'm trying to get a multimap into a TempTable that I can then use as a DataFrame to perform operations on.

The sample multimap has the types Double and Float in it as follows:
{1.50E8=[10, 20], 1.51E8=[-10, -13, -14, -15], 1.52E8=[-10, -11]}

I need to use Spark to get the min, max, and average of each of these. For example, the first key, 1.50E8, would give min 10, max 20, avg 15.
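
To make the expected numbers concrete, here is a minimal plain-Java sketch (no Spark) that computes the same min, max, and average per key for the sample data. It assumes the multimap can be viewed as a `Map<Double, List<Float>>`; the names `MultimapStats`, `summarize`, and `sample` are hypothetical and used only for illustration.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the multimap: channel -> list of power readings.
// MultimapStats, summarize and sample are hypothetical names for this sketch.
class MultimapStats {

    // Computes min, max and average of the readings for every channel.
    static Map<Double, DoubleSummaryStatistics> summarize(Map<Double, List<Float>> multimap) {
        Map<Double, DoubleSummaryStatistics> result = new LinkedHashMap<>();
        for (Map.Entry<Double, List<Float>> entry : multimap.entrySet()) {
            DoubleSummaryStatistics stats = entry.getValue().stream()
                    .mapToDouble(Float::doubleValue)
                    .summaryStatistics();
            result.put(entry.getKey(), stats);
        }
        return result;
    }

    // The sample data from the post.
    static Map<Double, List<Float>> sample() {
        Map<Double, List<Float>> m = new LinkedHashMap<>();
        m.put(1.50E8, Arrays.asList(10f, 20f));
        m.put(1.51E8, Arrays.asList(-10f, -13f, -14f, -15f));
        m.put(1.52E8, Arrays.asList(-10f, -11f));
        return m;
    }

    public static void main(String[] args) {
        // Print one line of statistics per channel.
        summarize(sample()).forEach((channel, s) ->
                System.out.println(channel + " min=" + s.getMin()
                        + " max=" + s.getMax() + " avg=" + s.getAverage()));
    }
}
```

These are the values the Spark aggregation should reproduce once the multimap is in a DataFrame.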


I already have the code that I use once I have it in a dataframe which works:

queryMV.groupBy(col("channel"))
    .agg(min("power"), max("power"), avg("power"))
    .write().mode(SaveMode.Append)
    .option("table", "results")
    .option("keyspace", "model")
    .format("org.apache.spark.sql.cassandra").save();

I just need help converting this multimap into a DataFrame.


Shashwat Rastogi

Feb 21, 2016, 4:27:51 AM
to spark-conn...@lists.datastax.com
Hi,

Try the following:
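// Imports needed by the snippet below
import scala.collection.JavaConverters._
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Define the schema for the DataFrame.
// Assumes the multimap holds Double keys and Float values, as in the question.
val schema = StructType(Array(
  StructField("key", DoubleType, false),
  StructField("values", ArrayType(FloatType, true), true)
))

// Read the multimap and map each key to a Row object
val buf = scala.collection.mutable.ListBuffer.empty[Row]
val iter = multimap.keySet().iterator()
while (iter.hasNext) {
  val key = iter.next()
  // multimap.get(key) returns a Java collection; convert it to a Scala Seq
  val values = multimap.get(key).asScala.toSeq
  buf += Row(key, values)
}

// Create a DataFrame
val df = sqlContext.createDataFrame(sc.parallelize(buf), schema)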


--
Shashwat Rastogi
Software Engineer
Amadeus Software Labs
Bangalore
+91-7022-25-0604

Jason Walkowicz

Feb 21, 2016, 11:42:53 AM
to spark-conn...@lists.datastax.com
Thank you. Unfortunately, I have to use Java instead of Scala.
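
For a Java version, one possible first step is to flatten the multimap into one record per (key, value) pair, so the data lines up with the `channel` and `power` columns used by the `groupBy` query above. The sketch below uses only the Java standard library and assumes the multimap can be viewed as a `Map<Double, List<Float>>`; `Reading`, `MultimapFlattener`, and `flatten` are hypothetical names, not part of Spark or the connector.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical JavaBean with one field per DataFrame column.
class Reading implements Serializable {
    private double channel;
    private float power;

    public Reading() { }

    public Reading(double channel, float power) {
        this.channel = channel;
        this.power = power;
    }

    public double getChannel() { return channel; }
    public void setChannel(double channel) { this.channel = channel; }
    public float getPower() { return power; }
    public void setPower(float power) { this.power = power; }
}

// Hypothetical helper that turns the multimap into a flat list of readings.
class MultimapFlattener {

    // Emits one Reading per (key, value) pair, preserving iteration order.
    static List<Reading> flatten(Map<Double, List<Float>> multimap) {
        List<Reading> rows = new ArrayList<>();
        for (Map.Entry<Double, List<Float>> entry : multimap.entrySet()) {
            for (Float power : entry.getValue()) {
                rows.add(new Reading(entry.getKey(), power));
            }
        }
        return rows;
    }
}
```

With a Guava `Multimap`, iterating over `multimap.entries()` yields the same pairs directly. Assuming Spark 1.6, the resulting list can likely be passed to `sqlContext.createDataFrame(rows, Reading.class)`, which derives the `channel` and `power` columns from the bean getters; this part is untested here and should be checked against the Spark version in use.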