Hi All,

The Delta delete operation works on EMR 7.0.0 (Spark 3.5.0, Delta 3.0.0) but fails on EMR 7.8.0 (Spark 3.5.4, Delta 3.3.0). Any ideas what is causing this?

from delta.tables import DeltaTable

t_delta = DeltaTable.forPath(spark, table_path)
t_delta.delete("id = 'asssaaa'")
Traceback (most recent call last):
  File "/tmp/python1629415551282998462/zeppelin_python.py", line 167, in <module>
    exec(code, _zcUserQueryNameSpace)
  File "<stdin>", line 4, in <module>
  File "/usr/local/lib/python3.9/site-packages/delta/tables.py", line 106, in delete
    self._jdt.delete(DeltaTable._condition_to_jcolumn(condition))
  File "/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
    return_value = get_return_value(
  File "/usr/lib/spark/python/pyspark/errors/exceptions/captured.py", line 185, in deco
    raise converted from None
pyspark.errors.exceptions.captured.AnalysisException: [DELTA_UNSUPPORTED_SOURCE] DELETE destination only supports Delta sources.
Some(DeleteFromTable (user_offers_id#33 = F42F6B1A-9C83-4EFD-8A69-8348974C9517)
+- Project [staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, user_offers_id#30, 36, true, false, true) AS user_offers_id#33, point_transaction_id#31, etl_timestamp#32]
   +- Relation [user_offers_id#30,point_transaction_id#31,etl_timestamp#32] parquet
)
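The plan above resolves the target as a plain parquet Relation rather than a Delta source, so the path and session setup may be relevant. For reference, a quick way to check both; this is a sketch, assuming table_path is the same variable as in the snippet above:

from delta.tables import DeltaTable

# True only if the location contains a valid _delta_log directory.
print(DeltaTable.isDeltaTable(spark, table_path))

# Delta DML (DELETE/UPDATE/MERGE) requires both of these on the session:
# spark.sql.extensions should include io.delta.sql.DeltaSparkSessionExtension,
# and spark.sql.catalog.spark_catalog should be
# org.apache.spark.sql.delta.catalog.DeltaCatalog.
print(spark.conf.get("spark.sql.extensions", "<not set>"))
print(spark.conf.get("spark.sql.catalog.spark_catalog", "<not set>"))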
Two things do work against the same table:
- Reading the path as Delta: spark.read.format("delta").load(table_path)
- Running the delete through SQL: spark.sql(delete_sql_query) (a sketch of such a query follows below)
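For illustration, a minimal sketch of what that SQL path could look like; the table location is hypothetical and the column name is borrowed from the plan above, so this is not the actual delete_sql_query:

delete_sql_query = """
DELETE FROM delta.`s3://my-bucket/path/to/table`
WHERE user_offers_id = 'F42F6B1A-9C83-4EFD-8A69-8348974C9517'
"""
spark.sql(delete_sql_query)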
Thanks,
Rameshkumar S