Log files are not deleted automatically after multiple checkpoints

rameshkumar....@gmail.com

Apr 3, 2022, 7:02:51 AM
to Delta Lake Users and Developers
Hi All,

Log files are not deleted automatically after multiple checkpoints.
I have set the following config.
df.repartition(repartition_number).write.mode("overwrite") \
    .format("delta") \
    .option("delta.deletedFileRetentionDuration", "interval 2 hours") \
    .option("delta.logRetentionDuration", "interval 2 hours") \
    .option("overwriteSchema", "true") \
    .partitionBy(partition_column) \
    .save(path)

My table is a streaming table; the statement above runs only the first time, when the table does not yet exist in S3.
I would like to reduce the history retention from the default of 30 days to 1 day or even 1 hour.
Please suggest the right approach.

Thanks,
Rameshkumar S

Denny Lee

Apr 3, 2022, 10:34:12 PM
to rameshkumar....@gmail.com, Delta Lake Users and Developers
Could you set these properties through the table's TBLPROPERTIES and see if that helps?
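
For example, here is a minimal sketch of setting them on an existing table from PySpark. The helper name `retention_tblproperties_sql` and the S3 path are hypothetical placeholders, and the 1 day intervals are only illustrative. It assumes the table already exists at that path and that `spark` is a SparkSession configured for Delta Lake:

```python
def retention_tblproperties_sql(path, log_retention, deleted_file_retention):
    """Build an ALTER TABLE statement that sets Delta retention properties.

    `path` is the table location; the two retention arguments are interval
    strings such as "interval 1 day".
    """
    return (
        f"ALTER TABLE delta.`{path}` SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = '{log_retention}', "
        f"'delta.deletedFileRetentionDuration' = '{deleted_file_retention}')"
    )


# Hypothetical table location; replace with the real path.
sql = retention_tblproperties_sql(
    "s3://my-bucket/my-table", "interval 1 day", "interval 1 day"
)
print(sql)

# With an active SparkSession configured for Delta Lake, run:
# spark.sql(sql)
```

As far as I know, Delta removes expired log entries only when it writes a new checkpoint, so old log files may not disappear right after the properties are changed.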
