Hello everyone,
We would like some input on a proposed feature for gpbackup on 6X when the jobs flag is used.
Problem Statement
If you specify a jobs value higher than 1, the database must be in a quiescent state at the very beginning while the utility creates the individual connections, initializes their transaction snapshots, and acquires a lock on the tables that are being backed up. If concurrent database operations are being performed on tables that are being backed up during the transaction snapshot initialization and table locking step, consistency between tables that are backed up in different parallel workers cannot be guaranteed.
To avoid a COPY deadlock scenario, parallel workers must also acquire an ACCESS SHARE lock on each table before attempting the COPY TO command. These locks are currently not released until the end of the backup. If a parallel worker is unable to get a lock on a table, the worker no longer has a valid distributed snapshot and is terminated.
Proposal
Guarantee user data consistency by exporting a distributed snapshot that all parallel workers will use to synchronize their views of the database.
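To make the mechanics concrete, here is a minimal sketch of the statement sequences each connection would issue, using standard PostgreSQL snapshot-export syntax (which Greenplum 6 inherits). The function names and table identifiers are illustrative, not part of the proposal; Greenplum's distributed-snapshot plumbing may differ in detail.

```python
def coordinator_statements():
    # connection 0: open the long-lived transaction and export its snapshot.
    # The returned snapshot id is handed to every worker.
    return [
        "BEGIN ISOLATION LEVEL REPEATABLE READ",
        "SELECT pg_export_snapshot()",
    ]

def worker_statements(snapshot_id, table):
    # Per-table statement sequence for connections 1 .. n.  Importing a
    # snapshot requires REPEATABLE READ (or stricter), and NOWAIT makes a
    # lock failure immediate, so the table can be handled specially
    # instead of blocking the worker.
    return [
        "BEGIN ISOLATION LEVEL REPEATABLE READ",
        "SET TRANSACTION SNAPSHOT '%s'" % snapshot_id,
        "LOCK TABLE %s IN ACCESS SHARE MODE NOWAIT" % table,
        "COPY %s TO STDOUT" % table,
        "COMMIT",
    ]
```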
Goals
- Consistency between tables that are backed up is guaranteed when running gpbackup with default parameters.
- Reuse parallel worker connections if an error is encountered backing up a table.
- Reduce the maximum number of concurrent locks held during gpbackup operations.
Solution 1. When the jobs flag is specified, establish jobs + 1 total connections. Use connection 0 to export the snapshot. Parallel workers will begin and commit a transaction for each table.
connection 0:
- Sets and exports transaction snapshot
- Runs catalog queries required for setup.
- Acquires ACCESS SHARE locks on tables in the backup set
connections 1 .. n:
- Worker gets a table from the set, begins a new transaction, imports the snapshot, and attempts a table lock.
- If the lock succeeds, copy the table out and commit the transaction.
- If the lock fails, place the table into a deferred queue and roll back the transaction.
- Repeat step 1.
connection 0 scans the deferred queue; it already holds the locks, so it copies those tables out.
- The connection that exported the snapshot must keep the transaction alive for the duration of the backup. This is no change from the current implementation.
- If a copy error occurs on connection 0, its snapshot is no longer valid and the backup will fail.
- gpbackup implicitly establishes jobs + 1 connections to the database.
- Connection 0 must still acquire ACCESS SHARE locks for all tables in a single transaction.
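The per-table loop and deferred queue above can be sketched as a pure-Python simulation (no database work is done; `busy` stands in for tables currently locked by concurrent sessions, and the table names are made up):

```python
from collections import deque

def run_solution1(tables, busy):
    # Simulate Solution 1 scheduling.  Workers take tables one at a time;
    # a failed ACCESS SHARE lock sends the table to a deferred queue that
    # connection 0 drains at the end.  Connection 0 already holds locks on
    # every table from its setup transaction, so its copies cannot fail on
    # a lock conflict.
    pending = deque(tables)
    deferred = deque()
    copied_by_workers, copied_by_conn0 = [], []

    while pending:
        t = pending.popleft()
        # BEGIN; SET TRANSACTION SNAPSHOT ...; LOCK TABLE t ... NOWAIT
        if t in busy:
            deferred.append(t)           # ROLLBACK; retry on connection 0
        else:
            copied_by_workers.append(t)  # COPY t TO ...; COMMIT

    while deferred:
        copied_by_conn0.append(deferred.popleft())
    return copied_by_workers, copied_by_conn0
```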
Solution 2. Implement an optional flag that does not acquire locks up front.
Several users have run into the error [CRITICAL]:-ERROR: out of shared memory (SQLSTATE 53200) when attempting to acquire the ACCESS SHARE locks.
Provide users who can guarantee there will be no MVCC-unsafe statements (e.g. TRUNCATE TABLE or several ALTER TABLE variations) the ability to run a backup that does not acquire locks up front and instead ensures consistent data visibility on a best-effort basis.
A user who is aware of the downsides should have the ability to back up an arbitrarily large database without concern for running out of shared memory due to locking, or having to restart the cluster to enable a workaround such as increasing the GUC max_locks_per_transaction.
connection 0:
- Sets and exports transaction snapshot
- Runs catalog queries required for setup
connections 1 .. n:
- Worker gets a table from the set, begins a new transaction, imports the snapshot, and attempts a table lock.
- If the lock succeeds, copy the table out and commit the transaction.
- If the lock fails, log a warning and add the table to an errored-tables list.
- Repeat step 1.
cleanup:
- Output list of errored tables, if any
- Query pg_catalog.pg_stat_last_operation to determine if MVCC-unsafe operations were run on tables in the backup set during the backup timeframe. If yes, output a warning with the list of affected tables.
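The lockless flow plus the post-backup check can be sketched the same way (again a simulation: `last_operation` stands in for pg_stat_last_operation data, and UNSAFE_OPS is an illustrative subset of MVCC-unsafe operations, not an exhaustive list):

```python
UNSAFE_OPS = {"TRUNCATE", "ALTER"}  # illustrative subset only

def run_solution2(tables, busy, last_operation):
    # Simulate Solution 2: no up-front locks.  A failed per-table lock is
    # logged and the table is skipped; afterwards, copied tables touched
    # by an MVCC-unsafe operation during the backup window are flagged.
    copied, errored = [], []
    for t in tables:
        if t in busy:
            errored.append(t)   # log a warning; the backup continues
        else:
            copied.append(t)    # BEGIN; import snapshot; lock; COPY; COMMIT
    suspect = [t for t in copied if last_operation.get(t) in UNSAFE_OPS]
    return copied, errored, suspect
```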
Risks/Issues
- The connection that exported the snapshot must keep the transaction alive for the duration of the backup.
- A backup could become inconsistent, or table DDL could be modified at any point, if an MVCC-unsafe operation is run.
- Additional documentation and log output required.
One or both of these solutions could be implemented.
Thanks,
Brent