PySpark with Spark 3.0: FileNotFoundException when using packages to load the Cassandra driver


Ganesh Krishnan

Jun 21, 2020, 5:24:18 PM
to DataStax Spark Connector for Apache Cassandra


Stack trace:
Traceback (most recent call last):
  File "src/main/python/user/skuSalesForecast.py", line 34, in <module>
    spark = SparkSession.builder.appName("Sku Sales Forecast").config("spark.cassandra.connection.host", "cassandraHost") \
  File "/home/aihello.com/.local/lib/python3.8/site-packages/pyspark/sql/session.py", line 186, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/home/aihello.com/.local/lib/python3.8/site-packages/pyspark/context.py", line 371, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/home/aihello.com/.local/lib/python3.8/site-packages/pyspark/context.py", line 130, in __init__
    self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
  File "/home/aihello.com/.local/lib/python3.8/site-packages/pyspark/context.py", line 193, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/home/aihello.com/.local/lib/python3.8/site-packages/pyspark/context.py", line 310, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/home/aihello.com/.local/lib/python3.8/site-packages/py4j/java_gateway.py", line 1568, in __call__
    return_value = get_return_value(
  File "/home/aihello.com/.local/lib/python3.8/site-packages/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.io.FileNotFoundException: File file:/home/aihello.com/.ivy2/jars/com.github.jnr_jffi-1.2.19.jar does not exist



The same Cassandra driver + Spark version works with Scala.
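For reference, a minimal sketch of the kind of setup that hits this; the connector coordinates and host name below are assumptions, not copied from the original script:

from pyspark.sql import SparkSession

# Sketch only: coordinates and host are assumed, not taken from the original job.
# With spark.jars.packages, Spark resolves the connector and its transitive jars
# (including com.github.jnr:jffi) through Ivy into ~/.ivy2/jars before the JVM
# context starts, which is where the FileNotFoundException above is thrown.
spark = (
    SparkSession.builder
    .appName("Sku Sales Forecast")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:2.5.0")
    .config("spark.cassandra.connection.host", "cassandraHost")
    .getOrCreate()
)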

Alex Ott

Jun 22, 2020, 2:15:39 AM
to DataStax Spark Connector for Apache Cassandra
This is a known issue: https://datastax-oss.atlassian.net/browse/SPARKC-599 - for some reason, when Spark downloads the dependencies it renames that jar, so it can't be found afterwards.
If you want, you can use shaded packages that I prepared last week:
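The link to the shaded builds is not preserved in this archive. As a sketch only, a shaded or assembly artifact would be passed the same way as any other package; the coordinates below are hypothetical placeholders, not Alex's actual artifact:

from pyspark.sql import SparkSession

# Sketch only: hypothetical shaded/assembly coordinates -- substitute the artifact
# Alex refers to. A shaded jar bundles the jnr/jffi classes, so Ivy never has to
# fetch the jar that gets renamed.
spark = (
    SparkSession.builder
    .appName("Sku Sales Forecast")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector-assembly_2.12:<version>")  # placeholder
    .config("spark.cassandra.connection.host", "cassandraHost")
    .getOrCreate()
)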







--
With best wishes,                    Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)

Ganesh Krishnan

Jun 23, 2020, 3:41:18 PM
to DataStax Spark Connector for Apache Cassandra
Do I copy this jar to my project and add it as a dependency?
Weirdly, this only happens with Python, not with Scala.

Russell Spitzer

Jun 24, 2020, 12:12:44 AM
to DataStax Spark Connector for Apache Cassandra
The 2.5 release isn't compatible with Spark 3.0 yet, so you would need a fresh build of the b3.0 branch or master of the SCC (Spark Cassandra Connector).
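A sketch of what using such a local build might look like, assuming an assembly jar has been built from the b3.0 branch (the repository builds with sbt); the path below is illustrative:

from pyspark.sql import SparkSession

# Sketch only: hand a locally built connector jar to Spark directly instead of
# resolving it through spark.jars.packages. The jar path is a placeholder.
spark = (
    SparkSession.builder
    .appName("Sku Sales Forecast")
    .config("spark.jars", "/path/to/spark-cassandra-connector-assembly.jar")
    .config("spark.cassandra.connection.host", "cassandraHost")
    .getOrCreate()
)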

On Tue, Jun 23, 2020, 10:35 PM Ganesh Krishnan <gane...@gmail.com> wrote:
Got this error now:
java.io.IOException: Failed to open native connection to Cassandra at {cassandraHost:9042} :: com/typesafe/config/ConfigMergeable

I tried to build a fat jar too.
Google doesn't turn up many results for this issue either.


