- Heikki
Let's talk about autovacuum.
On Mon, Feb 12, 2018 at 9:17 AM, Heikki Linnakangas <hlinna...@pivotal.io> wrote:
On 12/02/18 19:09, Ivan Novick wrote:
On Mon, Feb 12, 2018 at 3:31 AM, Heikki Linnakangas <hlinna...@pivotal.io> wrote:
launch it for anti-wraparound purposes, if you forget to vacuum a table manually. Once we get some experience with that, we can consider enabling it for catalog tables, or allowing it to be turned on in general, if the administrator so chooses.
Baby steps sound good.
With your proposal, can the user choose to enable it for catalog tables? Can they enable it for user tables? Or will they not be available even with a GUC?
To be determined. In the very first baby step, no. But once that's done, it would be easy to allow it to be turned on, if the DBA wants to.
OK, any progress in this area is good, thank you so much!
Ivan, to be clear this is a feature for master only. We are investigating fixing the wraparound issue differently in 5.x.
Proposal
--------
I propose that we re-enable autovacuum in GPDB.
To what extent? I'm not sure yet. But at the very least, I think we should let autovacuum handle anti-wraparound vacuuming of 'template0', like it does in the upstream.
What about enabling it in general? From what I've heard, the reason autovacuum was disabled in GPDB was that it caused erratic performance [https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/fAN5gLJWlw4/IEMbZPHsCQAJ]. Autovacuum might kick in just when you're running a big, important query, and compete for I/O. That problem is not unique to GPDB. It is often required to tune autovacuum in PostgreSQL, too, or to set up manual VACUUM of some tables, to avoid that. The unique thing in GPDB is that it's an MPP system, and the time it takes for a query to finish is determined by the slowest node in the cluster. If autovacuum kicks in at different times on different segments, that's bad.
So as a first baby step, I propose that we keep autovacuum=off, but still launch it for anti-wraparound purposes, if you forget to vacuum a table manually. Once we get some experience with that, we can consider enabling it for catalog tables, or allowing it to be turned on in general, if the administrator so chooses.
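For reference, the anti-wraparound pressure mentioned above can be watched with a plain catalog query. This is a sketch using stock PostgreSQL machinery (age(datfrozenxid) and the autovacuum_freeze_max_age GUC), not anything GPDB-specific:

    -- How close each database is to the emergency anti-wraparound threshold.
    -- Autovacuum force-launches a vacuum once age(datfrozenxid) exceeds
    -- autovacuum_freeze_max_age (200 million by default).
    SELECT datname,
           age(datfrozenxid) AS xid_age,
           current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
    FROM pg_database
    ORDER BY age(datfrozenxid) DESC;

In an MPP cluster the same check would have to be run on every segment, since each segment tracks its own XID counter.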
Thanks Aism, but do we have any further plan for this important feature? I mean, without autovacuum there is a lot of work left for the DBA, like dealing with table bloat and missing stats for the optimizer, since many tables undergo massive update/insert/delete.
=========================================================================
How and what do we enable for 7?
Split into 2 use cases:
Prevent XID wrap around
compaction/garbage collection
Issues when tried before:
was enabled locally on the segment
caused locking problems and performance issues
lots of deadlock issues
Action items
investigate the issues we hit when we did the work before
Investigate the distributed deadlocking issues
make sure we test against old issues
Other issues/thoughts pointed out that we need to consider:
Vacuum of AO tables is very different from HEAP
Vacuum with bitmap indexes causes a reindex
we want to keep the segments independent as much as possible
Postgres vacuum improvements mean that there shouldn't be locking anymore and there should be no expectation of performance degradation
for AO tables we should treat the aux tables as if they are catalog tables
we need to consider the thresholds we set (see the threshold sketch after these notes)
Staging of the work:
make autovacuum work for all catalog tables -- bloat & XID wraparound
Enable it for all HEAP tables for XID wrap around
Figure out what we want to do about bloat on HEAP
figure out what we want to do about bloat for AO
Separate work:
create a vacuum catalog option
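On the thresholds point above, a sketch of the knobs involved, using the stock PostgreSQL storage parameters (the table name is made up for illustration). Autovacuum considers a table for VACUUM once dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples, and for ANALYZE once changed tuples exceed autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * reltuples:

    -- Defaults are 50 + 20% of reltuples for VACUUM and 50 + 10% for ANALYZE;
    -- large fact tables usually want a much smaller scale factor.
    ALTER TABLE big_fact_table SET (
        autovacuum_vacuum_scale_factor  = 0.02,
        autovacuum_vacuum_threshold     = 1000,
        autovacuum_analyze_scale_factor = 0.01
    );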
Hi all,
It might be interesting to check https://github.com/reorg/pg_repack
They've done some smart hacks to avoid locking during vacuum.
-D.
On Thu, Aug 1, 2019 at 12:08 AM Jet wrote:
We have submitted PR10515 to enable distributed autovacuum in Greenplum; could you guys help review it?
Hi guys,
Since the PR (https://github.com/greenplum-db/gpdb/pull/10515) brings the focus back to autovacuum, I have organized the autovacuum-related information to help us get an overall view, so that we can discuss the further steps to enable it. Here's the Google document which contains the contents below: https://docs.google.com/document/d/1rnO0FUadIPNC7SLY9kPntz6ce0l7XOtDflTbsgmZ1m8/edit?usp=sharing. I'll keep it updated based on our discussion.
Currently, GPDB enables the autovacuum daemon to perform VACUUM operations only on the Greenplum Database template database template0. Autovacuum is enabled for template0 because connections are not allowed to template0. The autovacuum daemon performs VACUUM operations on template0 to manage transaction IDs (XIDs) and help avoid transaction ID wraparound issues in template0.
Autoanalyze is disabled too, but GPDB brings in the autostats feature to collect statistics after executing CREATE TABLE AS SELECT, UPDATE, DELETE, INSERT, and COPY. Accurate statistics help the planner choose the most appropriate query plan, and thereby improve the speed of query processing.
Autovacuum is highly recommended in Postgres; its purpose is to automate the execution of VACUUM and ANALYZE commands. When enabled, autovacuum checks for tables that have had a large number of inserted, updated or deleted tuples.
The main jobs are:
Currently, we require users to execute VACUUM periodically to avoid serious issues. Enabling autovacuum will take this burden off users.
Also for ANALYZE, we rely on auto stats now. But we encounter several issues during the autostats phase.
It gets involved in CREATE TABLE AS SELECT, UPDATE, DELETE, INSERT, and COPY in the same transaction, which adds extra overhead to these queries and slows down execution.
We may face a deadlock issue, from the mailing list https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/nPW-mbO-oAw/m/G9q5Wyz7AAAJ. The issue is:
T1: copy child_prt_1 on segment, triggering ANALYZE on the root partition table and holding ShareUpdateExclusiveLock on the root table. T1 QEs: execute select * from pg_catalog.gp_acquire_sample_rows(relid, 10000, 't') on the root table, which will hold AccessShareLock for each child table. Please refer to acquire_inherited_sample_rows.
T2: copy child_prt_3 on segment, triggering ANALYZE on the root partition table and waiting for ShareUpdateExclusiveLock on the root table. T2 QEs hold RowExclusiveLock after executing the COPY on segments, since T2 is not committed.
So the wait graph is: T2 QD --> T1 QD --> T1 QE --> T2 QE --> T2 QD
The reason is that auto_stats may expand ANALYZE to the root table in the same transaction, and it will acquire locks for lots of child tables. Tuning some GUCs will help to prevent the deadlock (a sketch follows below).
And the reason we expand ANALYZE to the root is that we have no good time to analyze the root partition table, since autovacuum is not enabled.
Running ANALYZE separately could also avoid the issue. So if autoanalyze is enabled, ANALYZE should run in separate transactions; then we can remove the auto_stats logic entirely and avoid the above issues.
We also have a similar Jira where users execute ANALYZE in the same transaction as a truncate/insert on a leaf table, causing a deadlock. Enabling autovacuum would also avoid users calling ANALYZE manually in the same transaction.
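For the GUC tuning mentioned above, a sketch using the stock GPDB autostats GUCs; whether these particular knobs are sufficient for a given workload is an assumption:

    -- Limit when auto_stats fires, reducing the chance of hitting the
    -- in-transaction ANALYZE-on-root path described above.
    SET gp_autostats_mode = 'on_no_stats';           -- only analyze tables that have no stats yet
    SET gp_autostats_on_change_threshold = 1000000;  -- only consulted when the mode is 'on_change'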
I checked the related topics from JIRA and mailing list.
https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/fAN5gLJWlw4/m/IEMbZPHsCQAJ https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/gqordopb6Gg/m/-GHXSE4qBwAJ
Then I collected the issues as below:
For issue 1, Cost-based Vacuum Delay could help to reduce the impact (https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST).
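As a sketch of what that throttling could look like with the standard PostgreSQL GUCs (the values here are illustrative, not recommendations):

    -- Autovacuum workers sleep autovacuum_vacuum_cost_delay milliseconds each
    -- time they accumulate autovacuum_vacuum_cost_limit cost points, capping
    -- their I/O rate.
    ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 10;
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
    SELECT pg_reload_conf();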
For issue 3, since PG 9.4, autovacuum workers generally don't block other commands. If a process attempts to acquire a lock that conflicts with the SHARE UPDATE EXCLUSIVE lock held by autovacuum, lock acquisition will interrupt the autovacuum. Related code: GetBlockingAutoVacuumPgproc; DeadLockCheck detects whether the current query is blocked by autovacuum (https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM).
On the master, when executing ANALYZE under an autovacuum worker, it should have an MPP view to acquire sample rows from segments if needed. This requires collecting pgstat into the master. On each segment, it should be fine to execute autoanalyze separately, if we can limit the performance issue mentioned above. But right now, checking the code, it seems neither ANALYZE nor auto_stats calculates stats on segments, which is reasonable.
For AO tables, VACUUM still uses distributed transactions for the updates on pg_aoseg. Dropping old AWAITING_DROP segments that are no longer needed doesn't require a distributed transaction, so it can be done on each segment separately (this strategy works because deletes can't be performed on AO tables inside serializable transactions). But other vacuum operations, which update pg_aoseg_*, should be issued from the master (see problem 3 in the discussion https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/lUwtNxaT3f0/m/PzGAqW1EAwAJ).
Based on the above topics, we have some basic thoughts on how to enable it.
The autovacuum worker on the master should have the MPP view to execute ANALYZE, acquiring sample rows from segments if needed. This should be the same as the ANALYZE command executed by the user, except that the autovacuum daemon should have the stats for the whole cluster to decide when to execute ANALYZE. (It seems we don't need statistics on segments, right?)
Intuitively, there are two solutions:
Each segment takes care of its own autovacuum.
The master controls the whole cluster's vacuum, just like a VACUUM command executed by the user, so we don't need an autovacuum daemon on each segment.
For solution 1: we may still face issue 1 mentioned above. Could cost-based vacuum delay, and the fact that lock acquisition interrupts autovacuum, help with this?
Vacuum on AO tables still needs to be fired from the master, since it requires distributed transactions.
For solution 2: for a given relation, maybe most of the segments have no need to execute vacuum.
For AO tables, as in the discussion linked above, vacuuming independently could help to remove AWAITING_DROP segfiles.
Currently, we have a PR to enable auto-analyze in Greenplum master/7. Here I’ll briefly introduce the implementation.
The basic idea is to collect pgstat info into the master when executing queries.
Start the master’s autovacuum launcher process. Fire an autovacuum worker process for a database on the master when the naptime is reached. Then the autovacuum worker will iterate through all tables/materialized views in the current database and execute ANALYZE for tables which have reached the analyze threshold. Note that the ANALYZE statement issued in the autovacuum worker on the master is the same as executing it through a query on the QD, i.e. the auto-analyze is coordinated by the master, and segments do not start their own autovacuum launcher and autovacuum workers.
=================================================
Since the auto-analyze is fired from the master, we need to collect pgstat info into the master. GPDB used to call `auto_stats` for queries that may process tuples for different command types. The PR brings in another function, `gp_pgstat_report_tabstat`, which should be called when tuples are processed; it reports pgstat for the command on the master.
So we wrap `auto_stats` and `gp_pgstat_report_tabstat` in a function called `collect_tabstat`, and put it where `auto_stats` used to be.
We used to disallow modifying autovacuum-related GUCs. Within the PR, they are enabled.
Now, we support XID wraparound VACUUM through autovacuum for template0 only. This is done by checking the database’s `datallowconn` field: if it’s false, users are not allowed to connect to the database directly. That’s also the reason why we enable XID wraparound VACUUM for template0.
Then we reuse `datallowconn` for auto-analyze. If the current database’s `datallowconn` is set to true, we only perform ANALYZE in autovacuum workers. A new field, `wi_for_analyze`, is added to WorkerInfoData to pass this into the autovacuum worker.
XID wraparound VACUUM for other tables is ignored.
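A quick way to see which databases fall on each side of that `datallowconn` check (a sketch; this is just the standard pg_database catalog):

    -- template0 has datallowconn = false, so the autovacuum worker only does
    -- anti-wraparound VACUUM for it; databases with datallowconn = true only
    -- get auto-analyze under this PR.
    SELECT datname, datallowconn
    FROM pg_database
    ORDER BY datname;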
Since the auto-analyze is coordinated by the master and segments do not start their own autovacuum launcher and workers, we need the master’s autovacuum workers that do auto-analyze on a database to have `Gp_role = GP_ROLE_DISPATCH`, so that the ANALYZE on the master calculates statistics with the right behavior.
But for VACUUM on template0, we still keep it running as `GP_ROLE_UTILITY`.
In postgres, the autovacuum workers generally don't block other commands. If a process attempts to acquire a lock that conflicts with the SHARE UPDATE EXCLUSIVE lock held by autovacuum, lock acquisition will interrupt the autovacuum.
This feature should also work for the auto-analyze’s autovacuum workers. A test is added to ensure this.
A simple example is:
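A minimal sketch of such a session, assuming a table t that an autovacuum (analyze) worker is currently processing:

    -- Session 1: an autovacuum worker is analyzing table t and holds
    -- SHARE UPDATE EXCLUSIVE on it.

    -- Session 2: request a conflicting lock. Instead of queueing behind the
    -- worker indefinitely, the worker gets signalled to cancel itself and the
    -- server log shows something like "ERROR: canceling autovacuum task".
    BEGIN;
    LOCK TABLE t IN ACCESS EXCLUSIVE MODE;  -- granted shortly after the worker is cancelled
    COMMIT;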
==================================================
Partition table support
The current implementation does not collect pgstat for partition tables, so auto-analyze will not execute for them.
The reason is that we cannot collect accurate pgstat for partition tables.
For example:
Inserting into a child table updates the pgstat for the child table, but the root table's pgstat does not get updated on the master (although we could simply add it to the root table as well).
Inserting into the root table updates the pgstat for the root table on the master, but the underlying child tables' pgstat does not get updated, since GPDB only gets the processed tuple count on the master at the end of each command execution, where `collect_tabstat` gets called. We would need to collect the tuple count for each child table on the segments into the master.
Although there are solutions, the PG 12 merge is coming soon, so we decided to implement this later.
Note that Postgres itself only collects pgstat for child tables, which makes auto-analyze run only for child tables; the user still needs to run ANALYZE on the parent table manually.
Cost-based delay
In Postgres, the cost-based delay feature is used to limit the I/O cost of VACUUM and ANALYZE. Since the I/O cost is not collected into the master, the feature does not work cluster-wide, but we can still use it to limit the I/O cost on each segment.
We plan to support a disk I/O limit in resource groups. Once that is ready, we will see if the resource group can help here.
VACUUM under autovacuum
Right now, GPDB only supports XID wraparound VACUUM on template0 through autovacuum. Since we still have different opinions on the auto-vacuum implementation, the current PR does not include the vacuum part.
==================================================
For testing, the PR already has a pgstat collection test and a simple auto-analyze test to make sure auto-analyze works as expected.
Furthermore, I’m thinking of doing more:
Enable auto-analyze by default and run all pipeline test cases to find potential bugs. Some of the test cases will fail because table statistics are not up to date.
Run TPCB/TPC-DS with auto-analyze enabled to test the workload impact.
==================================================
Once we merge the PR, there are some decisions we need to make and discuss.
Shall we enable auto-analyze by default (if the workload impact is small) in GPDB 7?
Are there any concerns?
Hello gpdb hackers! I have read the whole thread, but didn't get one thing: do we need to vacuum AO/AOCS relations to prevent XID wraparound, or just for compaction (removing deleted tuples)? As far as I know, AO/AOCS tables contain only plain tuple data, without any xmin/xmax, so there should be no XID wraparound with them.
If so, maybe we can consider enabling autovacuum (in any implementation) for HEAP tables only? I think it should not impact performance much. The thing is, users usually don't create large HEAP tables in Greenplum; they mostly use AO/AOCS for `big data` purposes, so vacuum will handle them fast.