After some investigation, and if our assumptions are correct, we think it's likely preferable to continue diffing the catalogs in PDB. One reason is that PDB CPU is currently expected to be a notably less constrained resource than postgres, since we have the option of running multiple command processors, so trading PDB CPU for a decreased postgres write load is potentially useful. In addition, at least with plain SQL, while we could handle the new/changed rows via upsert, we'd still have to arrange for all the obsolete rows to be deleted. Furthermore, with the straightforward "on conflict do update" approach, even unchanged rows would still generate dead tuples (iirc the only upsert variant that doesn't is "do nothing"). Regarding the original problem, we noticed that the resource queries that were causing concern were likely running more slowly because of VM snapshot IO contention on the host. In any case, whatever we decide in the end, we've taken this opportunity to review some of the other storage code, and identified a number of places where we could handle things more efficiently (via upsert, decreasing round trips, etc.); cf. PDB-5128 |
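
To make the trade-off concrete, here is a rough sketch of what an application-side diff buys us. It is not PDB's actual storage code: the function name `diff_rows`, the `CatalogDiff` type, and the sample resource keys are all hypothetical, and it assumes each row can be identified by a hashable key and compared by value. The point is only that the diff spends CPU in the application to find the obsolete rows (which an upsert alone would not delete) and to skip unchanged rows entirely, so they never reach postgres as writes.

```python
from typing import Dict, Hashable, NamedTuple, Set


class CatalogDiff(NamedTuple):
    """Hypothetical result type: the three kinds of write the database needs."""
    to_insert: Dict[Hashable, dict]
    to_update: Dict[Hashable, dict]
    to_delete: Set[Hashable]


def diff_rows(stored: Dict[Hashable, dict],
              incoming: Dict[Hashable, dict]) -> CatalogDiff:
    """Compare stored rows with incoming rows, both keyed by a row identifier."""
    stored_keys = stored.keys()
    incoming_keys = incoming.keys()

    # Rows that only exist in the incoming data must be inserted.
    to_insert = {k: incoming[k] for k in incoming_keys - stored_keys}

    # Rows that only exist in the stored data are obsolete and must be deleted.
    to_delete = set(stored_keys - incoming_keys)

    # Rows present on both sides are written only if their contents differ;
    # unchanged rows are dropped here and generate no write at all.
    to_update = {
        k: incoming[k]
        for k in incoming_keys & stored_keys
        if incoming[k] != stored[k]
    }
    return CatalogDiff(to_insert, to_update, to_delete)


# Illustrative data only; the keys and values are made up.
stored = {
    "File[/etc/motd]": {"mode": "0644"},
    "Service[ntp]": {"ensure": "running"},
    "Package[vim]": {"ensure": "present"},
}
incoming = {
    "File[/etc/motd]": {"mode": "0644"},      # unchanged: no write
    "Service[ntp]": {"ensure": "stopped"},    # changed: update
    "User[deploy]": {"ensure": "present"},    # new: insert
}

diff = diff_rows(stored, incoming)
```

In this example only three of the four distinct rows result in a write: one insert, one update, and one delete. The unchanged `File[/etc/motd]` row is never sent to the database, whereas a blanket "on conflict do update" would rewrite it.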