Getting 'Task not serializable' error while using Session object inside the foreach() method of JavaRDD


JEENA VINOD

Oct 11, 2015, 4:54:14 PM
to DataStax Spark Connector for Apache Cassandra
I am using a Session object to execute CQL statements that insert data into Cassandra.
I need to process each element of a JavaRDD to build the data to be inserted, so I am using the Session inside foreach().
However, I get 'java.io.NotSerializableException: com.datastax.spark.connector.cql.SessionProxy'.
Please help me resolve this.

Code snippet:
try (Session session = connector.openSession()) {
    processedObj.foreach(data -> {
        String cql = "";
        //Frame the insert cql from data
        .
        .
        session.execute(cql);
    });
}

Exception StackTrace:
Exception in thread "main" org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2030)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:889)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:888)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.foreach(RDD.scala:888)
at org.apache.spark.api.java.JavaRDDLike$class.foreach(JavaRDDLike.scala:330)
at org.apache.spark.api.java.AbstractJavaRDDLike.foreach(JavaRDDLike.scala:47)
Caused by: java.io.NotSerializableException: com.datastax.spark.connector.cql.SessionProxy
Serialization stack:
- object not serializable (class: com.datastax.spark.connector.cql.SessionProxy, value: com.datastax.spark.connector.cql.SessionProxy@3f702946)
- field (class: java.lang.reflect.Proxy, name: h, type: interface java.lang.reflect.InvocationHandler)
- object (class com.sun.proxy.$Proxy8, com.datastax.driver.core.SessionManager@13275d8)
- element of array (index: 3)
- array (class [Ljava.lang.Object;, size 4)
- field (class: java.lang.invoke.SerializedLambda, name: capturedArgs, type: class [Ljava.lang.Object;)

Denis Makarskiy

Oct 12, 2015, 3:13:07 AM
to spark-conn...@lists.datastax.com
I could not find the answer to this same question when it was asked in the group before, so I will repeat it here.

It is true that Session is not serializable. That is why you need to open the session inside the lambda passed to foreach(): that block of code is what gets shipped to the executors, together with everything it captures.
It is also better to run the CQL query as a prepared statement, for performance reasons.
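
The difference between capturing a session and opening one inside the lambda can be reproduced without a cluster. The following is only a minimal sketch using plain Java serialization. FakeSession, FakeConnector and SerializableTask are hypothetical stand-ins invented for illustration, not connector or Spark classes: FakeSession plays the role of the non-serializable Session, FakeConnector the role of a serializable connector, and SerializableTask mirrors the shape of a serializable Spark function interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ClosureCaptureDemo {

    // Hypothetical stand-in for a driver session: it would hold live
    // connections, so it is deliberately NOT Serializable.
    static class FakeSession implements AutoCloseable {
        void execute(String cql) { /* would send the statement */ }
        @Override public void close() { /* would release connections */ }
    }

    // Hypothetical stand-in for a connector: it carries only configuration,
    // so it can be Serializable and travel inside a closure.
    static class FakeConnector implements Serializable {
        FakeSession openSession() { return new FakeSession(); }
    }

    // Same shape as a Spark function interface: functional and Serializable.
    interface SerializableTask extends Serializable {
        void call(String row) throws Exception;
    }

    // Tries to serialize the task, as a cluster framework must do before
    // shipping it; returns false if a captured value is not serializable.
    static boolean canSerialize(SerializableTask task) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(task);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Problem pattern: the lambda captures a session created outside of it.
    static SerializableTask capturesSession(FakeSession session) {
        return row -> session.execute(row);
    }

    // Suggested pattern: capture only the connector, open the session inside.
    static SerializableTask opensSessionInside(FakeConnector connector) {
        return row -> {
            try (FakeSession session = connector.openSession()) {
                session.execute(row);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(capturesSession(new FakeSession())));     // prints "false"
        System.out.println(canSerialize(opensSessionInside(new FakeConnector()))); // prints "true"
    }
}
```

In a real job the same idea would mean calling connector.openSession() inside the function passed to the RDD. Opening a session per row as in this sketch is wasteful, so a per-partition variant (for example via foreachPartition, preparing the statement once per partition) is probably the more practical form; treat that as a suggestion to verify against your connector version rather than a tested recipe.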

I hope it helps.
