Generated Columns vs Metadata Columns


Jacek Laskowski

Jun 2, 2021, 9:37:08 AM
to Delta Lake Users and Developers
Hi,

I'm curious what metadata columns are missing that led Delta 1.0.0 to introduce a new concept of generated columns. Why not use SupportsMetadataColumns?

I'm just exploring these concepts in Delta and Spark SQL and they seem so similar at first glance.

Shixiong(Ryan) Zhu

Jun 2, 2021, 7:50:50 PM
to Jacek Laskowski, Delta Lake Users and Developers
Hey Jacek,

A generated column is a derived column whose values are computed automatically from an expression that the user defines. It's similar to a normal column, except that Delta generates the values for you. Many databases support generated columns as well, such as PostgreSQL and MySQL.

A metadata column is more like a system column, such as input file name, which is not a part of the data in the table.

They are two different concepts.
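
To make the distinction concrete, here is a toy model in plain Python. It is only a sketch of the idea and not Delta's or Spark's API: `ToyTable`, `DataFile`, the `_file_name` column and the file name `part-0000.parquet` are all made up for illustration. The generated column is computed from a user-defined expression at write time and stored with the data, while the metadata column is attached at read time from information about the file and never stored.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class DataFile:
    """One stored file: a name plus the rows physically written to it."""
    name: str
    rows: List[Row]


class ToyTable:
    """Hypothetical in-memory table, used only to contrast the two concepts."""

    def __init__(self, generated: Dict[str, Callable[[Row], Any]]):
        # Maps a generated column name to the expression that derives it.
        self.generated = generated
        self.files: List[DataFile] = []

    def write(self, file_name: str, rows: List[Row]) -> None:
        stored = []
        for row in rows:
            full = dict(row)
            # Generated columns are evaluated on write and become table data.
            for column, expression in self.generated.items():
                full[column] = expression(row)
            stored.append(full)
        self.files.append(DataFile(file_name, stored))

    def read(self, with_file_name: bool = False) -> List[Row]:
        result = []
        for data_file in self.files:
            for row in data_file.rows:
                out = dict(row)
                if with_file_name:
                    # Metadata column: describes where the row came from,
                    # exposed on request and never written to the file.
                    out["_file_name"] = data_file.name
                result.append(out)
        return result


table = ToyTable(generated={"event_date": lambda r: r["event_time"].date()})
table.write(
    "part-0000.parquet",  # made-up file name
    [{"id": 1, "event_time": datetime(2021, 6, 2, 9, 37, 8)}],
)

stored_row = table.files[0].rows[0]
read_row = table.read(with_file_name=True)[0]
```

In this sketch `stored_row` contains `event_date` but has no `_file_name` key, whereas `read_row` has both, which is the difference described above in miniature.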

Best Regards,

Ryan



Jacek Laskowski

Jun 4, 2021, 2:22:06 PM
to Shixiong(Ryan) Zhu, Delta Lake Users and Developers
Hi Ryan,

Thanks for your reply! Makes sense now.

> A metadata column is more like a system column, such as input file name, which is not a part of the data in the table.

Is there a place for metadata columns in Delta Lake, e.g. for the input file name? How would that be different from the standard input_file_name() function? I'm curious whether there would be any difference, since Delta does all the heavy lifting itself (and an end user is not really interested in this metadata).