Table version with timeStamp returns table modification time


Tomás Trindade

Apr 16, 2024, 8:30:37 PM
to Delta Lake Users and Developers
Hello, my colleague noticed that when requesting a table version by timestamp (version?startingTimestamp=), the returned version wasn't what we expected. This happens when using the Delta Sharing Server. After debugging, we found that in /delta/connectors/standalone/src/main/scala/io/delta/standalone/internal/DeltaHistoryManager.scala the function getCommits returns the file's modification timestamp instead of the creation time of the table. Is this correct? We were expecting the table's information to come from the Delta logs, such as the table's creation timestamp in commitInfo, but that isn't the case.

Thank you

Burak Yavuz

Apr 16, 2024, 8:51:00 PM
to Tomás Trindade, Delta Lake Users and Developers
Due to clock skew, we trust the storage system's timestamp as the commit timestamp. It would also be pretty expensive to open up all commits to find what the commitInfo timestamp is going to be when doing time travel.

Best,

Burak Yavuz

Software Engineer

Databricks Inc.

bu...@databricks.com

databricks.com
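
The behavior described in this reply can be illustrated with a minimal sketch. It is written in Python rather than the Scala of the actual implementation, and it is not the code in DeltaHistoryManager. It takes each commit's timestamp from the modification time of its file in `_delta_log` and never opens the file. The function names are hypothetical, and the rule "earliest version at or after the timestamp" is an assumption about how startingTimestamp is resolved.

```python
import os
import re

# Delta commit files are named with a zero-padded 20-digit version number.
COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def commits_by_mtime(log_dir):
    """Return [(version, timestamp_ms)] sorted by version.

    The timestamp is the storage system's modification time of the commit
    file; the file contents (including commitInfo) are never read.
    """
    commits = []
    for name in os.listdir(log_dir):
        match = COMMIT_FILE.match(name)
        if match:
            path = os.path.join(log_dir, name)
            mtime_ms = os.stat(path).st_mtime_ns // 1_000_000
            commits.append((int(match.group(1)), mtime_ms))
    return sorted(commits)


def version_at_or_after(commits, ts_ms):
    """Return the earliest version whose timestamp is >= ts_ms, or None.

    Assumed semantics for startingTimestamp; hypothetical helper.
    """
    for version, commit_ts in commits:
        if commit_ts >= ts_ms:
            return version
    return None
```

Because only file metadata is read, the lookup is cheap. The trade-off is that a table copied to a new storage system gets the copy time as the timestamp of every version.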




Tomás Trindade

Apr 23, 2024, 1:07:57 PM
to Delta Lake Users and Developers
Hello, everyone. I have talked to Burak and, as he said, "Currently migrating a Delta table to a new storage system and maintaining commit timestamps is not supported".
We currently manage our own object store solution and provide our users with our own Delta Sharing server, which builds on the open-source reference implementation provided by Databricks: https://github.com/delta-io/delta-sharing. Users want to issue Delta Sharing requests with parameters such as startingTimestamp and endingTimestamp, which rely on the timestamp of each Delta table version. However, a common use case for our users is to upload existing Delta tables to our object store and then use Delta Sharing. Naturally, the object store is not aware of the original timestamps of the Delta files, so these parameters currently return wrong results. We are studying the best approach to solve this issue. If anyone has thoughts on how we could handle this, they would be much appreciated. Please feel free to contact me with any questions or ideas.

Thank you in advance,
Tomás
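
One possible direction, sketched under assumptions rather than offered as a supported feature: build a version-to-timestamp index once from the `commitInfo.timestamp` field (milliseconds since the epoch) recorded in each commit file, and resolve startingTimestamp against that index instead of the file modification times. All function names below are hypothetical. Since `commitInfo` is optional in a commit, the sketch falls back to the file modification time when it is missing. It does not handle non-monotonic timestamps across versions.

```python
import json
import os
import re

# Delta commit files are named with a zero-padded 20-digit version number.
COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def commit_info_timestamp(path):
    """Return commitInfo.timestamp (ms) from a commit file, or None if absent."""
    with open(path, encoding="utf-8") as f:
        for line in f:  # a commit file holds one JSON action per line
            line = line.strip()
            if not line:
                continue
            action = json.loads(line)
            if "commitInfo" in action:
                return action["commitInfo"].get("timestamp")
    return None


def build_timestamp_index(log_dir):
    """Map each version to its commitInfo timestamp, falling back to mtime."""
    index = {}
    for name in sorted(os.listdir(log_dir)):
        match = COMMIT_FILE.match(name)
        if not match:
            continue
        path = os.path.join(log_dir, name)
        ts = commit_info_timestamp(path)
        if ts is None:
            # Assumption: with no commitInfo, storage mtime is the best guess.
            ts = os.stat(path).st_mtime_ns // 1_000_000
        index[int(match.group(1))] = ts
    return index


def earliest_version_at_or_after(index, ts_ms):
    """Return the earliest version whose timestamp is >= ts_ms, or None."""
    for version in sorted(index):
        if index[version] >= ts_ms:
            return version
    return None
```

This opens every commit file, which is the cost Burak pointed out, so the index would need to be computed once (for example at upload time) and cached rather than rebuilt per request.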
