Hi All,
The Delta log files are not deleted automatically, even after multiple checkpoints.
I have set the following configuration:
df.repartition(repartition_number).write.mode("overwrite") \
    .option("delta.deletedFileRetentionDuration", "interval 2 hours") \
    .option("delta.logRetentionDuration", "interval 2 hours") \
    .option("overwriteSchema", "true") \
    .partitionBy(partition_column) \
    .format("delta") \
    .save(path)
My table is a streaming table; the statement above runs only the first time, when the table does not yet exist in S3.
I would like to reduce the history retention from the default of 30 days to 1 day or 1 hour.
Please suggest the right approach.
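For reference, here is a minimal sketch of one alternative: setting the two retention settings as table properties with ALTER TABLE ... SET TBLPROPERTIES on the existing table, instead of passing them as writer options. The helper names retention_sql and apply_retention are hypothetical, and the 1-day intervals are only an illustration. It assumes an active SparkSession with Delta Lake configured and the same path as in the write above. Whether this changes when old log files are actually removed is not confirmed here.

```python
def retention_sql(path,
                  log_retention="interval 1 days",
                  deleted_file_retention="interval 1 days"):
    """Build an ALTER TABLE statement that sets both Delta retention properties.

    `path` is the storage location of an existing Delta table.
    The default intervals are illustrative assumptions, not recommendations.
    """
    return (
        f"ALTER TABLE delta.`{path}` SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = '{log_retention}', "
        f"'delta.deletedFileRetentionDuration' = '{deleted_file_retention}')"
    )


def apply_retention(spark, path, **intervals):
    """Run the statement on a SparkSession (hypothetical helper)."""
    spark.sql(retention_sql(path, **intervals))


# Usage, assuming `spark` and `path` already exist as in the write above:
# apply_retention(spark, path)
# spark.sql(f"SHOW TBLPROPERTIES delta.`{path}`").show(truncate=False)
```

The SHOW TBLPROPERTIES call in the usage comment is there only to check which values the table actually ended up with.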
Thanks,
Rameshkumar S