The sample multimap has Double keys and Float values, as follows:
{1.50E8=[10, 20], 1.51E8=[-10, -13, -14, -15], 1.52E8=[-10, -11]}
I need to use Spark to get the min, max, and average for each key. For example, for the first key, 1.50E8, the result would be min 10, max 20, avg 15.
I already have code that works once the data is in a DataFrame:
queryMV.groupBy(col("channel"))
    .agg(min("power"), max("power"), avg("power"))
    .write().mode(SaveMode.Append)
    .option("table", "results")
    .option("keyspace", "model")
    .format("org.apache.spark.sql.cassandra").save();
I just need help converting this multimap into a DataFrame.
// Needed to convert the Java collections returned by the multimap to Scala
import scala.collection.JavaConverters._
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Define the schema for the DataFrame: a Double key and an array of Double values.
// Note the key column is DoubleType (not StringType), since the multimap keys are Doubles.
val schema = StructType(Array(
  StructField("key", DoubleType, false),
  StructField("values", ArrayType(DoubleType, true), true)
))

// Map each multimap entry to a Row object
val buf = scala.collection.mutable.ListBuffer.empty[Row]
for (key <- multimap.keySet().asScala) {
  // multimap.get returns a java.util.Collection; convert it and widen the
  // Float values to Double to match the schema
  val values = multimap.get(key).asScala.map(_.doubleValue).toArray
  buf += Row(key, values)
}

// Create the DataFrame
val df = sqlContext.createDataFrame(sc.parallelize(buf), schema)
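To feed this DataFrame into your existing aggregation, the array column still needs to be flattened into one row per value. A minimal sketch (assuming the `df` built above, and using Spark SQL's `explode` function; the renames to "channel"/"power" are just to match your query's column names):

```scala
import org.apache.spark.sql.functions._

// One row per (key, value) pair, with columns renamed so the
// existing groupBy/agg query can run unchanged
val flattened = df.select(
  col("key").as("channel"),
  explode(col("values")).as("power")
)

flattened.groupBy(col("channel"))
  .agg(min("power"), max("power"), avg("power"))
  .show()
```

For the sample data, the row for channel 1.5E8 would show min 10.0, max 20.0, avg 15.0.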
--
You received this message because you are subscribed to the Google Groups "DataStax Spark Connector for Apache Cassandra" group.