Jira (PDB-5589) Investigate pdb upgrade times with migration 81/82

12 views
Skip to first unread message

Joshua Partlow (Jira)

unread,
Feb 3, 2023, 7:00:05 PM2/3/23
to puppe...@googlegroups.com
Joshua Partlow created an issue
 
PuppetDB / Task PDB-5589
Investigate pdb upgrade times with migration 81/82
Issue Type: Task Task
Assignee: Unassigned
Components: PuppetDB, PuppetDB Enterprise
Created: 2023/02/03 3:59 PM
Priority: Normal Normal
Reporter: Joshua Partlow

Double check how much time the index regeneration from pdb migrations 81/82 takes (these migrations rejigger the reports/resource_events partitions to use declarative partitioning).

  • Time to upgrade
  • In versions of PG 11.17 the reports_noop_idx indexes generated on the child partitions is named differently than in 11.18 (for example, reports_20230130z_noop_idx instead of reports_noop_idx_20230130z, see: https://github.com/puppetlabs/puppetdb/pull/3748/commits/133c3771e84d8b389bb4eca74907f7b03a96e389). It would be good to validate what the indexes look like after upgrading on <PG 11.18, and then also upgrading from their to a PG >=11.18 (possibly easiest to test this with PE)
Add Comment Add Comment
 
This message was sent by Atlassian Jira (v8.20.11#820011-sha1:0629dd8)
Atlassian logo

Austin Blatt (Jira)

unread,
Mar 15, 2023, 11:36:01 AM3/15/23
to puppe...@googlegroups.com
Austin Blatt commented on Task PDB-5589
 
Re: Investigate pdb upgrade times with migration 81/82

I'm not too worried about time to upgrade PE (famous last words...). I think all our operations are "move" operations, the alter table commands all look to be commands that are not table rewriting.

But checking indexes and query performance after the upgrade would be good. I do not know if we still have the data around for Locust, but running all those queries would be a good start.

Cas Donoghue (Jira)

unread,
Mar 15, 2023, 2:25:01 PM3/15/23
to puppe...@googlegroups.com
Cas Donoghue updated an issue
 
Change By: Cas Donoghue
Sprint: Skeletor 03/29/2023

Austin Blatt (Jira)

unread,
Mar 15, 2023, 4:58:04 PM3/15/23
to puppe...@googlegroups.com

Austin Blatt (Jira)

unread,
Mar 22, 2023, 2:55:02 PM3/22/23
to puppe...@googlegroups.com
Austin Blatt commented on Task PDB-5589
 
Re: Investigate pdb upgrade times with migration 81/82

Noticed a 2 index difference between the new and old version. Partitions have a primary key, which is expected, but there also appears to be a duplicate index being created (at least by the migration). This was run against an brand new postgres 15.2, where the bug was supposed to have been fixed.

"reports_20230318z_encode_producer_timestamp_idx" UNIQUE, btree (encode(hash, 'hex'::text), producer_timestamp)
"reports_hash_expr_idx_20230318z" UNIQUE, btree (encode(hash, 'hex'::text))

the id index is also not removed in the migration, but does not continue into a newly created partition.

"idx_reports_id_20230318z" UNIQUE, btree (id)

These issues go away for new partitions, so I don't know how big of an issue they are.

Austin Blatt (Jira)

unread,
Mar 22, 2023, 4:20:04 PM3/22/23
to puppe...@googlegroups.com
Austin Blatt commented on Task PDB-5589

Please take these results very loosely, they were produced on a laptop.

Migration results running migrations 81 and 82

0 reports - 81ms
6,720 reports - 135ms
35,500 reports - 185ms
67,200 reports - 188ms

There is some impact in migration time when more data is added, but it may even tail off as you get more data in a partition. Even if we ignore the final run with 67k nodes, extrapolating from the first three data points linearly puts 67.2 million reports (100k nodes with a run interval of 30 minutes and a history of 14 days) at 172 seconds — or just under 3 minutes.

Joshua Partlow (Jira)

unread,
Mar 28, 2023, 6:06:02 PM3/28/23
to puppe...@googlegroups.com

Regarding the index differences between 7.12.1 and 7.x with migrations 81 and 82, it took a little bit for me to get up to speed here, but I think I'm now seeing the same thing you are, Austin Blatt.

On 7.12.1 we have the following report partition indexes:

    "reports_cached_catalog_status_on_fail_20230320z" btree (cached_catalog_status) WHERE cached_catalog_status = 'on_failure'::text
    "reports_catalog_uuid_idx_20230320z" btree (catalog_uuid)
    "idx_reports_certname_end_time_20230320z" btree (certname, end_time)
    "idx_reports_producer_timestamp_by_hour_certname_20230320z" btree (date_trunc('hour'::text, timezone('UTC'::text, producer_timestamp)), producer_timestamp, certname)
    "reports_end_time_idx_20230320z" btree (end_time)
    "reports_environment_id_idx_20230320z" btree (environment_id)
    "reports_job_id_idx_20230320z" btree (job_id) WHERE job_id IS NOT NULL
    "reports_noop_idx_20230320z" btree (noop) WHERE noop = true
    "idx_reports_noop_pending_20230320z" btree (noop_pending) WHERE noop_pending = true
    "idx_reports_prod_20230320z" btree (producer_id)
    "idx_reports_compound_id_20230320z" btree (producer_timestamp, certname, hash) WHERE start_time IS NOT NULL
    "idx_reports_producer_timestamp_20230320z" btree (producer_timestamp)
    "reports_status_id_idx_20230320z" btree (status_id)
    "reports_tx_uuid_expr_idx_20230320z" btree ((transaction_uuid::text))
    "idx_reports_id_20230320z" UNIQUE, btree (id)
    "reports_hash_expr_idx_20230320z" UNIQUE, btree (encode(hash, 'hex'::text))

On 7.x (I was testing 7.12.1-66-g40b8c9950, this would also apply to main (pdb 8)), for a migrated reports partition you have:

# New indexes which correspond to:
# https://github.com/puppetlabs/puppetdb/blob/40b8c9950d9b05301fda9027b4676aa922f61ce0/src/puppetlabs/puppetdb/scf/migrate.clj#L2279
# https://github.com/puppetlabs/puppetdb/blob/40b8c9950d9b05301fda9027b4676aa922f61ce0/src/puppetlabs/puppetdb/scf/migrate.clj#L2293-L2296
    "reports_20230320z_pkey" PRIMARY KEY, btree (id, producer_timestamp)
    "reports_20230320z_encode_producer_timestamp_idx" UNIQUE, btree (encode(hash, 'hex'::text), producer_timestamp)
 
# Migrated indexes identical to 7.12.1 because they get moved
# as part of ALTER TABLE ATTACH PARTITION, and they keep their existing idexes. vvv 
    "reports_cached_catalog_status_on_fail_20230320z" btree (cached_catalog_status) WHERE cached_catalog_status = 'on_failure'::text
    "reports_catalog_uuid_idx_20230320z" btree (catalog_uuid)
    "idx_reports_certname_end_time_20230320z" btree (certname, end_time)
    "idx_reports_producer_timestamp_by_hour_certname_20230320z" btree (date_trunc('hour'::text, timezone('UTC'::text, producer_timestamp)), producer_timestamp, certname)
    "reports_end_time_idx_20230320z" btree (end_time)
    "reports_environment_id_idx_20230320z" btree (environment_id)
    "reports_job_id_idx_20230320z" btree (job_id) WHERE job_id IS NOT NULL
    "reports_noop_idx_20230320z" btree (noop) WHERE noop = true
    "idx_reports_noop_pending_20230320z" btree (noop_pending) WHERE noop_pending = true
    "idx_reports_prod_20230320z" btree (producer_id)
    "idx_reports_compound_id_20230320z" btree (producer_timestamp, certname, hash) WHERE start_time IS NOT NULL
    "idx_reports_producer_timestamp_20230320z" btree (producer_timestamp)
    "reports_status_id_idx_20230320z" btree (status_id)
    "reports_tx_uuid_expr_idx_20230320z" btree ((transaction_uuid::text))
    "idx_reports_id_20230320z" UNIQUE, btree (id)
    "reports_hash_expr_idx_20230320z" UNIQUE, btree (encode(hash, 'hex'::text))

But a new 7.x reports partition is created with these indexes:

# These two are the same vv
    "reports_20230304z_pkey" PRIMARY KEY, btree (id, producer_timestamp)
    "reports_20230304z_encode_producer_timestamp_idx" UNIQUE, btree (encode(hash, 'hex'::text), producer_timestamp)
 
# These 14 indexes match the type and columns of the 7.12.1 and of migrated 7.12.1
# partitions, but are named by postgres since they get created by pg on the fly
# when it creates the new partition.
    "reports_20230304z_cached_catalog_status_idx" btree (cached_catalog_status) WHERE cached_catalog_status = 'on_failure'::text
    "reports_20230304z_catalog_uuid_idx" btree (catalog_uuid)
    "reports_20230304z_certname_end_time_idx" btree (certname, end_time)
    "reports_20230304z_date_trunc_producer_timestamp_certname_idx" btree (date_trunc('hour'::text, timezone('UTC'::text, producer_timestamp)), producer_timestamp, certname)
    "reports_20230304z_end_time_idx" btree (end_time)
    "reports_20230304z_environment_id_idx" btree (environment_id)
    "reports_20230304z_job_id_idx" btree (job_id) WHERE job_id IS NOT NULL
    "reports_20230304z_noop_idx" btree (noop) WHERE noop = true
    "reports_20230304z_noop_pending_idx" btree (noop_pending) WHERE noop_pending = true
    "reports_20230304z_producer_id_idx" btree (producer_id)
    "reports_20230304z_producer_timestamp_certname_hash_idx" btree (producer_timestamp, certname, hash) WHERE start_time IS NOT NULL
    "reports_20230304z_producer_timestamp_idx" btree (producer_timestamp)
    "reports_20230304z_status_id_idx" btree (status_id)
    "reports_20230304z_transaction_uuid_idx" btree ((transaction_uuid::text))
 
# And lastly the 7.12.1 idx_reports_id and reports_hash_expr_idx don't exist because there
# is no matching idx on the main reports table since those were replaced by the
# pkey and encode_producer_timestamp_idx variations that include
# producer_timestamp as required for unique declarative partition indexes.

The differences between the migrated partitions and newly created partitions in a 7.x db is a difference in index names, and the absence of the two leftover 7.12.1 indexes idx_reports_id and reports_hash_expr_idx in the migrated partitions. This is because the migrated partitions just undergo an alter table to make them partitions of the new reports table and they bring their indexes with them. The new pkey and encode_producer_timestamp_idx indexes are propogated by postgres based on the existence of these indexes in the reports table. Everything else is the same except for names.

So, yes, the presence of the two leftover

    "idx_reports_id_20230320z" UNIQUE, btree (id)
    "reports_hash_expr_idx_20230320z" UNIQUE, btree (encode(hash, 'hex'::text))

indexes in migrated partitions shouldn't matter, since they are redundant given pkey and encode_producer_timestamp_idx, and will disappear in a week or two as new partitions are created and old ones are dropped. But it would be cleaner if they were dropped during migration, so I'll put up a pr for that.

I tested PG 14.7 and 15.2, but didn't see any differences between the two postgresql versions.

Joshua Partlow (Jira)

unread,
Mar 28, 2023, 7:54:01 PM3/28/23
to puppe...@googlegroups.com

Same issue for resource_events, except it's just the old primary key 'resource_events_hash_*' that is left over in the migrated partitions. Effectively replaced by the new pkey on event_hash, timestamp.

Reply all
Reply to author
Forward
0 new messages