Merge all columns when match except one.

1,093 views
Skip to first unread message

Muhammad Suhaib

unread,
May 21, 2021, 1:03:50 AM5/21/21
to Delta Lake Users and Developers
Hi, 

I am trying to update my delta table using data frame which contains updated data from OLTP system using id of OLTP to check if its already in the delta table or not. 

My delta table also have DW id which i do not want to update in case its matched as its reference to many fact table other than DW id i want to update all columns. 

Is there is any way using merge statement, i can update all columns except one column.
 

Thanks,
Regards,
Muhammad. 

Mich Talebzadeh

unread,
May 21, 2021, 6:19:01 AM5/21/21
to Muhammad Suhaib, Delta Lake Users and Developers
Well this is a generic question that applies to any DW table. STAR schema?

What you are doing is trying to append to your delta table and seek if that record with the OLTP.id already exists in the table.

Why do you want to update the record with the new changed column assuming you are doing CDC here. You can just add a new row to it.

Then you can find out the last added row with

WITH ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY <record_key>
      ORDER BY <event_timestamp> DESC
    ) AS rank
  FROM <table>
)
SELECT *
FROM ranked
WHERE rank = 1

HTH




   view my Linkedin profile

 

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

 



--
You received this message because you are subscribed to the Google Groups "Delta Lake Users and Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to delta-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/delta-users/acbbc118-125e-4bc5-b87a-389d41746475n%40googlegroups.com.

Jacek Laskowski

unread,
May 21, 2021, 8:15:49 AM5/21/21
to Muhammad Suhaib, Delta Lake Users and Developers
Hi Muhammad,

Would the following MERGE query be what you're looking for?

val updates = Map(
  "updated" -> lit(true))

target.as("to")
  .merge(
    source = sourceDF.as("from"),
    condition = $"to.id" === $"from.num")
  .whenMatched.update(updates)
  .execute()

The updates map is where you define the columns of the target delta table and the new values (and that's where you skip the one or more columns you don't want to update on match).

--
Reply all
Reply to author
Forward
0 new messages