spark-cassandra-connector-unshaded with Guava 27 support


Han Liu

Sep 21, 2020, 3:26:37 PM
to DataStax Spark Connector for Apache Cassandra
Hi, 
Currently I am trying to use spark-cassandra-connector-unshaded with Guava 27 (I also need the Java Driver directly, so I picked the unshaded version).

spark-cassandra-connector-unshaded-2.4.3 still calls the deprecated Guava API
com.google.common.util.concurrent.Futures.addCallback(Lcom/google/common/util/concurrent/ListenableFuture;Lcom/google/common/util/concurrent/FutureCallback;)V, and since that two-argument overload no longer exists in Guava 27, it throws a runtime exception. It seems the shaded version already works around this problem as of 2.5.1, so when can we expect an unshaded version with newer Guava support?
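For reference, this is the shape of the API change that causes the failure. A minimal, self-contained Scala sketch (the object and values are illustrative, not connector code):

import com.google.common.util.concurrent.{FutureCallback, Futures, MoreExecutors, SettableFuture}

object GuavaCallbackSketch extends App {
  val future = SettableFuture.create[String]()

  val callback = new FutureCallback[String] {
    override def onSuccess(result: String): Unit = println(s"completed: $result")
    override def onFailure(t: Throwable): Unit = t.printStackTrace()
  }

  // Older Guava also offered a two-argument overload, which 2.4.3 was compiled against:
  //   Futures.addCallback(future, callback)
  // Guava 27 only has the three-argument form, so an Executor must be passed explicitly:
  Futures.addCallback(future, callback, MoreExecutors.directExecutor())

  future.set("ok")
}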

Also, I tried to build branch b2.4 locally, but compilation fails because joda-time is somehow missing. I just followed the instructions: run ./sbt/sbt and then test.
Error log:
[error] /spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/GettableData.scala:11: object DateTimeZone is not a member of package org.joda.time
[error] import org.joda.time.DateTimeZone.UTC
[error]                      ^
[error] /spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/GettableData.scala:83: type LocalDate is not a member of package org.joda.time
[error]         new org.joda.time.LocalDate(localDate.getYear, localDate.getMonth, localDate.getDay)
[error]                           ^

Could someone please help with this or offer some advice?

Thanks,
Han

Russell Spitzer

Sep 21, 2020, 3:43:25 PM
to DataStax Spark Connector for Apache Cassandra
You should use the shaded version if you want a newer Guava; the unshaded version is there in case you want to manually reshade things or manage the dependencies yourself.


Han Liu

Sep 22, 2020, 1:54:05 PM
to spark-conn...@lists.datastax.com
Hi Russell, thanks for the quick reply!

I have to use the Cassandra Java driver at the same time, so I chose the unshaded version. After pinning the Cassandra Java driver to 3.6.2, I tried several shaded versions but couldn't find a compatible one.

When running with spark-cassandra-connector_2.11:2.5.1, it throws an exception:
Caused by: com.datastax.spark.connector.types.TypeConversionException: Cannot convert object 2017-02-01T08:00:00Z of type class java.time.Instant to java.sql.Date.
even though in my source code I declared the field as a java.sql.Date: private val buildDate = Date.valueOf("2017-02-01")
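A conversion like the following would sidestep the converter, though I am only guessing that this is the intended mapping in 2.5.x (the row class and names below are illustrative):

import java.sql.Date
import java.time.Instant

object InstantDateWorkaround extends App {
  // Illustrative: accept the timestamp column as an Instant instead of
  // letting the connector convert it to java.sql.Date...
  case class AccessPointRow(buildStartTime: Instant)

  // ...and convert manually wherever a java.sql.Date is actually required.
  def instantToSqlDate(i: Instant): Date = new Date(i.toEpochMilli)

  println(instantToSqlDate(Instant.parse("2017-02-01T08:00:00Z")))
}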

unshaded 2.4.3 is the only version I could use so far, but as I said, I couldn't make it work with the newer Guava. I also tried shaded 2.4.3, which throws
java.lang.NoSuchMethodError: com.datastax.driver.core.TypeCodec.<init>(Lcom/datastax/driver/core/DataType;Lcom/google/common/reflect/TypeToken;)V
presumably because the shaded jar and my external driver 3.6.2 disagree on whether TypeCodec's constructor takes the original or the relocated Guava TypeToken.

Do you have any advice on this? Thanks a lot!

Error log:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 11 in stage 0.0 failed 1 times, most recent failure: Lost task 11.0 in stage 0.0 (TID 11, localhost, executor driver): com.datastax.spark.connector.types.TypeConversionException: Failed to convert column build_start_time of geocache.access_points_v3 to java.sql.Date: 2017-02-01T08:00:00Z
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter.tryConvert(GettableDataToMappedTypeConverter.scala:134)
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter.convertedColumnValue(GettableDataToMappedTypeConverter.scala:160)
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter.com$datastax$spark$connector$mapper$GettableDataToMappedTypeConverter$$ctorParamValue(GettableDataToMappedTypeConverter.scala:191)
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter$$anonfun$com$datastax$spark$connector$mapper$GettableDataToMappedTypeConverter$$fillBuffer$1.apply$mcVI$sp(GettableDataToMappedTypeConverter.scala:223)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter.com$datastax$spark$connector$mapper$GettableDataToMappedTypeConverter$$fillBuffer(GettableDataToMappedTypeConverter.scala:222)
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter$$anonfun$convertPF$1.applyOrElse(GettableDataToMappedTypeConverter.scala:260)
at com.datastax.spark.connector.types.TypeConverter$class.convert(TypeConverter.scala:44)
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter.convert(GettableDataToMappedTypeConverter.scala:19)
at com.datastax.spark.connector.rdd.reader.ClassBasedRowReader.read(ClassBasedRowReader.scala:33)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$15.apply(CassandraTableScanRDD.scala:347)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$15.apply(CassandraTableScanRDD.scala:347)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
at scala.collection.Iterator$$anon$12.next(Iterator.scala:445)
at com.datastax.spark.connector.util.CountingIterator.next(CountingIterator.scala:16)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.datastax.spark.connector.types.TypeConversionException: Cannot convert object 2017-02-01T08:00:00Z of type class java.time.Instant to java.sql.Date.
at com.datastax.spark.connector.types.TypeConverter$$anonfun$convert$1.apply(TypeConverter.scala:46)
at com.datastax.spark.connector.types.TypeConverter$SqlDateConverter$$anonfun$convertPF$14.applyOrElse(TypeConverter.scala:323)
at com.datastax.spark.connector.types.TypeConverter$class.convert(TypeConverter.scala:44)
at com.datastax.spark.connector.types.TypeConverter$SqlDateConverter$.com$datastax$spark$connector$types$NullableTypeConverter$$super$convert(TypeConverter.scala:320)
at com.datastax.spark.connector.types.NullableTypeConverter$class.convert(TypeConverter.scala:57)
at com.datastax.spark.connector.types.TypeConverter$SqlDateConverter$.convert(TypeConverter.scala:320)
at com.datastax.spark.connector.mapper.GettableDataToMappedTypeConverter.tryConvert(GettableDataToMappedTypeConverter.scala:131)
... 26 more

Russell Spitzer <russell...@gmail.com> wrote on Mon, Sep 21, 2020 at 12:43 PM:

Russell Spitzer

Sep 22, 2020, 3:24:07 PM
to DataStax Spark Connector for Apache Cassandra
You can still use the Java driver even with the shaded version; you'll just need to include it as an additional dependency. It is very difficult to upgrade core libraries like Guava without running into numerous problems like the ones you are seeing. I would recommend using the shaded version and including whatever version of the Java driver you need alongside it. The whole reason the shaded version exists is so that you can use your own versions of Guava or other shaded libraries without conflicting with the SCC.
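Roughly, the dependency setup would look like this in sbt (the version numbers are placeholders, not a combination I have tested):

// build.sbt sketch -- adjust versions to your environment
libraryDependencies ++= Seq(
  // the shaded connector: its embedded Guava is relocated, so it cannot clash
  "com.datastax.spark" %% "spark-cassandra-connector" % "2.4.3",
  // your own, independently pinned Java driver for direct driver calls
  "com.datastax.cassandra" % "cassandra-driver-core" % "3.6.2",
  // whatever Guava your application code needs
  "com.google.guava" % "guava" % "27.0.1-jre"
)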

Han Liu

Sep 23, 2020, 5:02:51 AM
to spark-conn...@lists.datastax.com
Gotcha, thanks!
One more question: I failed to build spark-cassandra-connector on branch b2.4. It seems the build can't find joda-time.
> cd spark-cassandra-connector
> git checkout b2.4
> ./sbt/sbt
> package

Error log:
[error] /spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/GettableData.scala:11: object DateTimeZone is not a member of package org.joda.time
[error] import org.joda.time.DateTimeZone.UTC
[error]                      ^
[error] /spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/GettableData.scala:83: type LocalDate is not a member of package org.joda.time
[error]         new org.joda.time.LocalDate(localDate.getYear, localDate.getMonth, localDate.getDay)
[error]                           ^
........
[warn] three warnings found
[error] 259 errors found
[error] (spark-cassandra-connector-shaded/compile:compileIncremental) Compilation failed
[error] Total time: 34 s, completed Sep 23, 2020 1:54:48 AM
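Since the reported errors are all unresolved org.joda.time symbols, the workaround I am considering (purely a guess, not a proper fix) is to add joda-time to the build explicitly:

// added locally to the sbt build -- the version is my own guess
libraryDependencies += "joda-time" % "joda-time" % "2.10.5"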

Do you have any advice on this?


Thanks,
Han


Russell Spitzer <russell...@gmail.com> wrote on Tue, Sep 22, 2020 at 12:24 PM: