We are excited to announce the release of Delta Lake 2.0.0 on Apache Spark 3.2.
Support Change Data Feed on Delta tables. Change Data Feed represents the row level changes between different versions of the table. When enabled, additional information is recorded regarding row level changes for every write operation on the table. See the documentation for more details.
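As a rough illustration of how the change data feed might be enabled and read from PySpark, here is a minimal sketch. It assumes an existing SparkSession configured with Delta Lake is passed in as `spark`; the helper names `enable_cdf_sql` and `read_changes` and the table name `events` are hypothetical and exist only for this example.

```python
def enable_cdf_sql(table_name: str) -> str:
    """Build the ALTER TABLE statement that turns on the change data feed."""
    return (
        f"ALTER TABLE {table_name} "
        "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
    )


def read_changes(spark, table_name: str, start_version: int, end_version: int):
    """Return the row-level changes recorded between two table versions.

    `spark` is assumed to be a SparkSession with Delta Lake configured.
    The result carries the table's columns plus change metadata columns
    such as `_change_type` and `_commit_version`.
    """
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", start_version)
        .option("endingVersion", end_version)
        .table(table_name)
    )


# Example usage (requires a real SparkSession, so it is left commented out):
# spark.sql(enable_cdf_sql("events"))
# read_changes(spark, "events", 1, 3).show()
```

Changes are only recorded for writes made after the property is set, so versions committed before enabling the feed cannot be read this way.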
Support Z-Order clustering of data to reduce the amount of data read. Z-Ordering is a technique to colocate related information in the same set of files. This data clustering allows column stats (released in Delta 1.2) to be more effective in skipping data based on filters in a query. See the documentation for more details.
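To make the Z-Order feature concrete, the sketch below builds an `OPTIMIZE ... ZORDER BY` statement and runs it through a caller-supplied session. The helper names `zorder_sql` and `run_zorder`, and the table and column names in the usage comment, are hypothetical; the optional predicate is assumed to reference partition columns only.

```python
def zorder_sql(table_name, columns, predicate=None):
    """Build an OPTIMIZE statement that Z-Orders the table by `columns`.

    `predicate` optionally restricts the operation to a subset of
    partitions (assumed to be a filter on partition columns).
    """
    if not columns:
        raise ValueError("at least one Z-Order column is required")
    sql = f"OPTIMIZE {table_name}"
    if predicate:
        sql += f" WHERE {predicate}"
    sql += f" ZORDER BY ({', '.join(columns)})"
    return sql


def run_zorder(spark, table_name, columns, predicate=None):
    """Execute the statement on a SparkSession configured with Delta Lake."""
    return spark.sql(zorder_sql(table_name, columns, predicate))


# Example usage (requires a real SparkSession, so it is left commented out):
# run_zorder(spark, "events", ["eventType"], "date >= '2022-07-01'")
```

Columns that appear often in query filters and have many distinct values are usually the most useful Z-Order candidates, since file-level statistics can then exclude more files.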
Support for idempotent writes to Delta tables to enable fault-tolerant retry of Delta table writing jobs without writing the data multiple times to the table. See the documentation for more details.
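A minimal sketch of an idempotent append follows, assuming the `txnAppId` and `txnVersion` DataFrameWriter options are how the writer identifies itself. The helper name `idempotent_append` and the application ID in the usage comment are hypothetical.

```python
def idempotent_append(df, path, app_id, version):
    """Append `df` to the Delta table at `path` at most once per version.

    `app_id` identifies the writing application and `version` must grow
    monotonically for that application. A retried write that reuses an
    already committed (app_id, version) pair is expected to be skipped
    instead of appending the same data a second time.
    """
    (
        df.write.format("delta")
        .mode("append")
        .option("txnAppId", app_id)
        .option("txnVersion", version)
        .save(path)
    )


# Example usage inside Structured Streaming's foreachBatch, where the batch
# ID is a natural monotonically increasing version (left commented out):
# def write_batch(batch_df, batch_id):
#     idempotent_append(batch_df, "/tmp/delta/events", "nightly-loader", batch_id)
# stream_df.writeStream.foreachBatch(write_batch).start()
```

The application ID needs to stay stable across restarts of the same job; if it changes, earlier versions are no longer recognized and retries would write again.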
Support for dropping columns in a Delta table as a metadata change operation. This command removes the column from the table metadata only; the column data in the underlying files is left in place. See the documentation for more details.
Support for dynamic partition overwrite, which overwrites only the partitions that receive data at runtime. See the documentation for more details.
Experimental support for multi-part checkpoints, which split the Delta Lake checkpoint into multiple parts to speed up both writing and reading checkpoints. See the documentation for more details.
Python and Scala API support for OPTIMIZE file compaction and Z-Order by.
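The sketch below shows one way the statements for dropping a column could be assembled. It assumes the table must first have column mapping enabled in `name` mode, which upgrades the table protocol; the helper name `drop_column_statements` and the table and column names in the usage comment are hypothetical.

```python
def drop_column_statements(table_name, column_name):
    """Return the SQL statements used to drop a column as a metadata change.

    The first statement enables column mapping by name (assumed to be a
    prerequisite, and one that upgrades the table protocol). The second
    statement removes the column from the table metadata.
    """
    enable_mapping = (
        f"ALTER TABLE {table_name} SET TBLPROPERTIES ("
        "'delta.columnMapping.mode' = 'name', "
        "'delta.minReaderVersion' = '2', "
        "'delta.minWriterVersion' = '5')"
    )
    drop = f"ALTER TABLE {table_name} DROP COLUMN {column_name}"
    return [enable_mapping, drop]


# Example usage (requires a real SparkSession, so it is left commented out):
# for statement in drop_column_statements("events", "legacy_flag"):
#     spark.sql(statement)
```

Because the protocol upgrade affects which clients can read and write the table, it is worth checking reader and writer compatibility before running the first statement.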
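As an illustration of dynamic partition overwrite, the following sketch writes a DataFrame in overwrite mode with the `partitionOverwriteMode` writer option set to `dynamic`. The helper name `overwrite_touched_partitions` and the table name in the usage comment are hypothetical, and the target is assumed to be an existing partitioned Delta table.

```python
def overwrite_touched_partitions(df, table_name):
    """Overwrite only the partitions of `table_name` that `df` has rows for.

    Partitions with no rows in `df` are expected to be left untouched,
    unlike a plain overwrite, which replaces the whole table.
    """
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("partitionOverwriteMode", "dynamic")
        .saveAsTable(table_name)
    )


# Example usage (requires a real DataFrame, so it is left commented out):
# overwrite_touched_partitions(daily_df, "default.events")
```

The same behavior can also be requested for a whole session by setting the Spark configuration `spark.sql.sources.partitionOverwriteMode` to `dynamic`.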
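A small sketch of turning on multi-part checkpoints for a session is shown below. It assumes the feature is controlled by the SQL configuration `spark.databricks.delta.checkpoint.partSize`, interpreted as the number of actions per checkpoint part; that key and its meaning should be confirmed against the documentation for the installed version. The helper name `enable_multipart_checkpoints` is hypothetical.

```python
# Assumed configuration key; verify against the Delta Lake documentation.
CHECKPOINT_PART_SIZE_KEY = "spark.databricks.delta.checkpoint.partSize"


def enable_multipart_checkpoints(spark, part_size):
    """Ask Delta Lake to split checkpoints once they exceed `part_size` actions.

    `spark` is assumed to be a SparkSession with Delta Lake configured.
    """
    if part_size <= 0:
        raise ValueError("part_size must be a positive integer")
    spark.conf.set(CHECKPOINT_PART_SIZE_KEY, str(part_size))


# Example usage (requires a real SparkSession, so it is left commented out):
# enable_multipart_checkpoints(spark, 100000)
```

Since the feature is experimental, it is probably safest to try it on a non-critical table first and confirm that every reader of the table can handle multi-part checkpoints.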
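The programmatic API can be sketched in Python as follows. The helpers take a `DeltaTable` object (for example one returned by `DeltaTable.forName(spark, "events")` from the `delta.tables` module) and chain the `optimize()` builder; the helper names `compact_partition` and `zorder_table`, and the table and column names in the usage comment, are hypothetical.

```python
def compact_partition(delta_table, partition_filter):
    """Compact small files in the partitions matching `partition_filter`.

    `delta_table` is assumed to be a delta.tables.DeltaTable instance.
    Returns whatever the API returns (a DataFrame of operation metrics).
    """
    return delta_table.optimize().where(partition_filter).executeCompaction()


def zorder_table(delta_table, *columns):
    """Z-Order the whole table by the given column names."""
    if not columns:
        raise ValueError("at least one Z-Order column is required")
    return delta_table.optimize().executeZOrderBy(*columns)


# Example usage (requires Delta Lake and a SparkSession, so commented out):
# from delta.tables import DeltaTable
# events = DeltaTable.forName(spark, "events")
# compact_partition(events, "date = '2022-07-01'")
# zorder_table(events, "eventType")
```

Keeping the filter on partition columns lets a job compact only recently written partitions instead of rewriting the whole table on every run.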
Other notable changes
[...] SimpleAWSCredentialsProvider or TemporaryAWSCredentialsProvider in S3 multi-cluster write supported LogStore.
[...] DataFrame to be written even if the column was nullable.
Independent of this release, we have improved the framework for writing large-scale performance benchmarks (initial version added in version 1.2.0) and added support for running benchmarks on Google Cloud Platform using Google Dataproc (in addition to the existing support for EMR on AWS).
Adam Binford, Alkis Evlogimenos, Allison Portis, Ankur Dave, Bingkun Pan, Burak Yilmaz, Chang Yong Lik, Chen Qingzhi, Denny Lee, Eric Chang, Felipe Pessoto, Fred Liu, Fu Chen, Gaurav Rupnar, Grzegorz Kołakowski, Hussein Nagree, Jacek Laskowski, Jackie Zhang, Jiaan Geng, Jintao Shen, Jintian Liang, John O'Dwyer, Junyong Lee, Kam Cheung Ting, Karen Feng, Koert Kuipers, Lars Kroll, Liwen Sun, Lukas Rupprecht, Max Gekk, Michael Mengarelli, Min Yang, Naga Raju Bhanoori, Nick Grigoriev, Nick Karpov, Ole Sasse, Patrick Grandjean, Peng Zhong, Prakhar Jain, Rahul Shivu Mahadev, Rajesh Parangi, Ruslan Dautkhanov, Sabir Akhadov, Scott Sandre, Serge Rielau, Shixiong Zhu, Shoumik Palkar, Tathagata Das, Terry Kim, Tyson Condie, Venki Korukanti, Vini Jaiswal, Wenchen Fan, Xinyi, Yijia Cui, Yousry Mohamed
Thanks
Venki