Hello Aayush,
There are two viable approaches, depending on the scale of your data:
- If the scale is large, then it's best to use a distributed engine like Spark. In that case, you can run the merge either through the DeltaTable API or directly in SQL with `spark.sql("MERGE INTO ...")`.
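
To make the Spark option concrete, here is a rough sketch of both routes in PySpark. It assumes the `delta-spark` package is installed and the session is configured for Delta. The table names (`events`, `updates`), the key column (`id`), and the upsert-style clauses are placeholders I picked for illustration, not something from your setup, so adjust them to your schema.

```python
def build_merge_sql(target: str, source: str, key: str) -> str:
    """Build an upsert-style MERGE statement (all names are hypothetical)."""
    return (
        f"MERGE INTO {target} AS t "
        f"USING {source} AS s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


def merge_with_sql(spark, target: str, source: str, key: str) -> None:
    """Route 1: run the MERGE directly as SQL on an existing SparkSession."""
    spark.sql(build_merge_sql(target, source, key))


def merge_with_delta_table_api(spark, target: str, updates_df, key: str) -> None:
    """Route 2: the same upsert through the DeltaTable API."""
    # Imported lazily so the SQL helper above works without delta-spark.
    from delta.tables import DeltaTable

    (
        DeltaTable.forName(spark, target)
        .alias("t")
        .merge(updates_df.alias("s"), f"t.{key} = s.{key}")
        .whenMatchedUpdateAll()      # update rows whose key already exists
        .whenNotMatchedInsertAll()   # insert rows with a new key
        .execute()
    )
```

With a session in hand, you would call something like `merge_with_sql(spark, "events", "updates", "id")`, or pass a DataFrame of changes to `merge_with_delta_table_api`. Both should produce the same result; which one to use is mostly a matter of taste.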
Delta Standalone was designed to provide basic read and write support for Delta, so that connectors could be written for any engine. Advanced operations like deletes and merges were out of scope; implementing them efficiently was left to the engines.
Side note: Our new project, Delta Kernel, is the successor to Delta Standalone. But the scope is still the same: no advanced operations.
Hope this helps.