I am using a Dataproc cluster (image 2.0, Ubuntu) with Hadoop 3.2 and Spark 3.1. I have Python code that reads Avro files from GCS, so I added 'spark-avro_2.12-3.1.0.jar', but it fails with a method-not-found error (java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.DataSourceUtils$.creteDateRebaseFuncInRead). How do I decide which version of the library is compatible?
I am reading the files with:
df = spark.read.format("avro").load(["file1.avro", "file2.avro"])
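
For context, a NoSuchMethodError like this usually points to a spark-avro jar built against a different Spark release than the one running on the cluster. Below is a minimal sketch of how I would expect the matching artifact to be derived, assuming spark-avro should use the same version string that spark.version reports and the same Scala binary version as the cluster. The helper name spark_avro_coordinate is hypothetical, and the version values in the comments are only examples, not what my cluster necessarily reports.

```python
def spark_avro_coordinate(spark_version: str, scala_binary_version: str = "2.12") -> str:
    """Build the Maven coordinate for the spark-avro artifact.

    Assumption: spark-avro is published with the same version number as
    Spark itself, so the coordinate follows from the cluster's versions.
    This helper is hypothetical and only formats a string.
    """
    spark_version = spark_version.strip()
    scala_binary_version = scala_binary_version.strip()
    if not spark_version or not scala_binary_version:
        raise ValueError("both versions must be non-empty")
    return f"org.apache.spark:spark-avro_{scala_binary_version}:{spark_version}"


# On the cluster, the inputs could be read from the running session, e.g.:
#   spark.version                     -> full Spark version string
#   spark.sparkContext._jvm.scala.util.Properties.versionNumberString()
#                                     -> full Scala version (relies on the
#                                        private _jvm attribute, so treat it
#                                        as an assumption); keep "major.minor"
print(spark_avro_coordinate("3.1.3", "2.12"))
```

The resulting coordinate could then be passed through the spark.jars.packages property (or --packages with spark-submit) instead of a hand-picked jar file, so that the Avro module and the Spark runtime come from the same release.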