Jira (PDB-5450) Slow fact queries after PuppetDB 7.9.0 upgrade

37 views
Skip to first unread message

Austin Blatt (Jira)

unread,
Feb 3, 2022, 1:23:01 PM2/3/22
to puppe...@googlegroups.com
Austin Blatt updated an issue
 
PuppetDB / Task PDB-5450
Slow fact queries after PuppetDB 7.9.0 upgrade
Change By: Austin Blatt
Summary: Slow fact queries after PuppetDB 7.9.0 really slow upgrade
Add Comment Add Comment
 
This message was sent by Atlassian Jira (v8.20.2#820002-sha1:829506d)
Atlassian logo

David McTavish (Jira)

unread,
Feb 3, 2022, 2:16:03 PM2/3/22
to puppe...@googlegroups.com
David McTavish updated an issue
Change By: David McTavish
Priority: Normal Major

David McTavish (Jira)

unread,
Feb 4, 2022, 8:44:02 AM2/4/22
to puppe...@googlegroups.com

David McTavish (Jira)

unread,
Feb 4, 2022, 8:44:02 AM2/4/22
to puppe...@googlegroups.com

David McTavish (Jira)

unread,
Feb 4, 2022, 8:44:03 AM2/4/22
to puppe...@googlegroups.com
David McTavish updated an issue
Change By: David McTavish
Sprint: HA 2022-03-02 HAHA/Grooming

David McTavish (Jira)

unread,
Feb 4, 2022, 8:44:05 AM2/4/22
to puppe...@googlegroups.com
David McTavish updated an issue
Change By: David McTavish
Sprint: HAHA/Grooming HA 2022-02-16

Elof Ofel (Jira)

unread,
Feb 4, 2022, 12:37:02 PM2/4/22
to puppe...@googlegroups.com
Elof Ofel commented on Task PDB-5450
 
Re: Slow fact queries after PuppetDB 7.9.0 upgrade

Ok, so I exec:ed into the container and added 'log_min_duration_statement = 20000' to postgresql.conf and restarted.
Now the postgres-container logs slow queries.
Here is a spot sample.
This query is generated by a PuppetExplorer dashboard cell showing how many Debian 11 nodes I have, i.e. the query:
operatingsystemrelease~"^11." and operatingsystem="Debian"

This generates this log while the loading of the dashboard page timeout:

postgres_1 | 2022-02-04 16:40:58.161 CET [75] LOG: duration: 23246.062 ms execute <unnamed>/C_2:
SELECT COUNTstar_yellow.png AS result_count FROM (WITH inactive_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NOT NULL AND deactivated > '2022-01-21T15:40:33.691Z')
OR (expired IS NOT NULL and expired > '2022-01-21T15:40:33.691Z')), not_active_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NOT NULL OR expired IS NOT NULL)) SELECT reports.corrective_change AS "latest_report_corrective_change", certnames.deactivated AS "deactivated", certnames.expired AS "expired", reports_environment.environment AS "report_environment", certnames.certname AS "certname", fs.timestamp AS "facts_timestamp", facts_environment.environment AS "facts_environment",
reports.cached_catalog_status AS "cached_catalog_status", report_statuses.status AS "latest_report_status", encode(reports.hash::bytea, 'hex') AS "latest_report_hash", catalogs.timestamp AS "catalog_timestamp", reports.noop_pending AS "latest_report_noop_pending", reports.end_time AS "report_timestamp", reports.noop AS "latest_report_noop", catalog_environment.environment AS "catalog_environment", reports.job_id AS "latest_report_job_id" FROM certnames
LEFT JOIN catalogs ON catalogs.certname = certnames.certname LEFT JOIN factsets fs ON certnames.certname = fs.certname LEFT JOIN reports ON (certnames.certname = reports.certname AND certnames.latest_report_id = reports.id) LEFT JOIN environments catalog_environment ON catalog_environment.id = catalogs.environment_id
LEFT JOIN report_statuses ON reports.status_id = report_statuses.id LEFT JOIN environments facts_environment ON facts_environment.id = fs.environment_id LEFT JOIN environments reports_environment ON reports_environment.id = reports.environment_id
WHERE ((((certnames.certname) in (SELECT certname FROM ( (SELECT fc.certname AS "certname" FROM (select certname, flattened.* from factsets fs left join lateral ( with recursive flattened_one (parent_path, parent_types, key, value, type) as
( select array[]::text[], '', (jsonb_each(fs.stable||fs.volatile)).*, 's' union all select parent_path || flattened_one.key, parent_types || flattened_one.type, sub_paths.key, sub_paths.value, sub_paths.type from flattened_one inner join lateral
( select (jsonb_each(value)).*, 's' as type where jsonb_typeof(value) = 'object' union all select generate_series::text as key, value->generate_series as value, 'i' as type from generate_series(0, jsonb_array_length(value) - 1)
where jsonb_typeof(value) = 'array' ) as sub_paths on true ) select environment_id, parent_path || key as path, parent_types || type as types, coalesce(parent_path[1], key) as name, value from flattened_one where not jsonb_typeof(value) = any('

{"array", "object"}') ) as flattened on true) fc LEFT JOIN environments env ON fc.environment_id = env.id
WHERE ((fc.path = (array['operatingsystemrelease']) and fc.types = 's') AND (value#>>'{}')::text ~ $1)) ) sub)) AND ((certnames.certname) in (SELECT certname FROM ( (SELECT fc.certname AS "certname" FROM (select certname, flattened.* from factsets fs left join lateral ( with recursive flattened_one (parent_path, parent_types, key, value, type) as
( select array[]::text[], '', (jsonb_each(fs.stable||fs.volatile)).*, 's' union all select parent_path || flattened_one.key, parent_types || flattened_one.type, sub_paths.key, sub_paths.value, sub_paths.type from flattened_one inner join lateral
( select (jsonb_each(value)).*, 's' as type where jsonb_typeof(value) = 'object' union all select generate_series::text as key, value->generate_series as value, 'i' as type from generate_series(0, jsonb_array_length(value) - 1)
where jsonb_typeof(value) = 'array' ) as sub_paths on true ) select environment_id, parent_path || key as path, parent_types || type as types, coalesce(parent_path[1], key) as name, value from flattened_one where not jsonb_typeof(value) = any('{"array", "object"}

') ) as flattened on true) fc
LEFT JOIN environments env ON fc.environment_id = env.id WHERE ((fc.path = (array['operatingsystem']) and fc.types = 's') AND (jsonb_typeof("value") = $2 AND (value = $3)))) ) sub))) AND NOT ((certnames.certname) in (SELECT certname FROM ( SELECT not_active_nodes.certname AS "certname" FROM not_active_nodes ) sub)))) results_to_count

postgres_1 | 2022-02-04 16:40:58.161 CET [75] DETAIL: parameters: $1 = '^11.', $2 = 'string', $3 = '"Debian"'

I extract the SQL syntax and replace $1, $2 and $3 with their respective values and store this to /tmp/q.txt :

SELECT COUNTstar_yellow.png AS result_count FROM (WITH inactive_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NOT NULL AND deactivated > '2022-01-21T15:40:33.691Z')
OR (expired IS NOT NULL and expired > '2022-01-21T15:40:33.691Z')), not_active_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NOT NULL OR expired IS NOT NULL)) SELECT reports.corrective_change AS "latest_report_corrective_change", certnames.deactivated AS "deactivated", certnames.expired AS "expired", reports_environment.environment AS "report_environment", certnames.certname AS "certname", fs.timestamp AS "facts_timestamp", facts_environment.environment AS "facts_environment",
reports.cached_catalog_status AS "cached_catalog_status", report_statuses.status AS "latest_report_status", encode(reports.hash::bytea, 'hex') AS "latest_report_hash", catalogs.timestamp AS "catalog_timestamp", reports.noop_pending AS "latest_report_noop_pending", reports.end_time AS "report_timestamp", reports.noop AS "latest_report_noop", catalog_environment.environment AS "catalog_environment", reports.job_id AS "latest_report_job_id" FROM certnames
LEFT JOIN catalogs ON catalogs.certname = certnames.certname LEFT JOIN factsets fs ON certnames.certname = fs.certname LEFT JOIN reports ON (certnames.certname = reports.certname AND certnames.latest_report_id = reports.id) LEFT JOIN environments catalog_environment ON catalog_environment.id = catalogs.environment_id
LEFT JOIN report_statuses ON reports.status_id = report_statuses.id LEFT JOIN environments facts_environment ON facts_environment.id = fs.environment_id LEFT JOIN environments reports_environment ON reports_environment.id = reports.environment_id
WHERE ((((certnames.certname) in (SELECT certname FROM ( (SELECT fc.certname AS "certname" FROM (select certname, flattened.* from factsets fs left join lateral ( with recursive flattened_one (parent_path, parent_types, key, value, type) as
( select array[]::text[], '', (jsonb_each(fs.stable||fs.volatile)).*, 's' union all select parent_path || flattened_one.key, parent_types || flattened_one.type, sub_paths.key, sub_paths.value, sub_paths.type from flattened_one inner join lateral
( select (jsonb_each(value)).*, 's' as type where jsonb_typeof(value) = 'object' union all select generate_series::text as key, value->generate_series as value, 'i' as type from generate_series(0, jsonb_array_length(value) - 1)
where jsonb_typeof(value) = 'array' ) as sub_paths on true ) select environment_id, parent_path || key as path, parent_types || type as types, coalesce(parent_path[1], key) as name, value from flattened_one where not jsonb_typeof(value) = any('

{"array", "object"}') ) as flattened on true) fc LEFT JOIN environments env ON fc.environment_id = env.id
WHERE ((fc.path = (array['operatingsystemrelease']) and fc.types = 's') AND (value#>>'{}')::text ~ '^11.')) ) sub)) AND ((certnames.certname) in (SELECT certname FROM ( (SELECT fc.certname AS "certname" FROM (select certname, flattened.* from factsets fs left join lateral ( with recursive flattened_one (parent_path, parent_types, key, value, type) as
( select array[]::text[], '', (jsonb_each(fs.stable||fs.volatile)).*, 's' union all select parent_path || flattened_one.key, parent_types || flattened_one.type, sub_paths.key, sub_paths.value, sub_paths.type from flattened_one inner join lateral
( select (jsonb_each(value)).*, 's' as type where jsonb_typeof(value) = 'object' union all select generate_series::text as key, value->generate_series as value, 'i' as type from generate_series(0, jsonb_array_length(value) - 1)
where jsonb_typeof(value) = 'array' ) as sub_paths on true ) select environment_id, parent_path || key as path, parent_types || type as types, coalesce(parent_path[1], key) as name, value from flattened_one where not jsonb_typeof(value) = any('{"array", "object"}

') ) as flattened on true) fc
LEFT JOIN environments env ON fc.environment_id = env.id WHERE ((fc.path = (array['operatingsystem']) and fc.types = 's') AND (jsonb_typeof("value") = 'string' AND (value = '"Debian"')))) ) sub))) AND NOT ((certnames.certname) in (SELECT certname FROM ( SELECT not_active_nodes.certname AS "certname" FROM not_active_nodes ) sub)))) results_to_count

  1. time psql -f /tmp/q.txt puppetdb puppetdb
    result_count
    --------------
    63
    (1 row)

real 0m1.023s
user 0m0.025s
sys 0m0.012s

Ok, the query works. I have 63 Debian 11 machines.

Elof Ofel (Jira)

unread,
Feb 4, 2022, 12:38:01 PM2/4/22
to puppe...@googlegroups.com
Elof Ofel commented on Task PDB-5450

I now add "EXPLAIN (ANALYZE, BUFFERS)" to the query.

time psql -f /tmp/q2.txt puppetdb puppetdb > /tmp/idle.txt
real 0m1.036s
user 0m0.035s
sys 0m0.004s

Works fine.

I now generate some background load by reloading the dashboard and immediately re-run the command:
time psql -f /tmp/q2.txt puppetdb puppetdb > /tmp/load.txt
real 0m18.966s
user 0m0.049s
sys 0m0.013s

Still works fine, only this query takes 18 seconds.
Meanwhile the dashboard webpage keeps loading for ca 20 seconds more, and then display values in all cells.

If I diff idle.txt and load.txt, both show the same 128 rows, just different timings.

So my guess is that it is not the query(s) itself that need debugging but why a PuppetDB 7.9.0 under load
start having the 40 second problems when none of the previous versions had this problem.

# cat /tmp/idle.txt
                                                                                                                                QUERY PLAN
 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------
 Aggregate  (cost=76757873.32..76757873.33 rows=1 width=8) (actual time=924.199..924.215 rows=1 loops=1)
   Buffers: shared hit=3037
   ->  Nested Loop Semi Join  (cost=1493156.08..76757872.92 rows=32 width=299) (actual time=881.713..924.189 rows=63 loops=1)
         Join Filter: (certnames.certname = fs.certname)
         Buffers: shared hit=3037
         ->  Nested Loop Semi Join  (cost=747601.42..38819974.20 rows=32 width=50) (actual time=881.197..900.147 rows=63 loops=1)
               Buffers: shared hit=2478
               ->  Merge Left Join  (cost=2046.76..2047.56 rows=32 width=25) (actual time=880.636..880.761 rows=63 loops=1)
                     Merge Cond: (certnames.certname = catalogs.certname)
                     Buffers: shared hit=1895
                     ->  Sort  (cost=2042.20..2042.28 rows=32 width=25) (actual time=880.538..880.558 rows=63 loops=1)
                           Sort Key: certnames.certname
                           Sort Method: quicksort  Memory: 29kB
                           Buffers: shared hit=1893
                           ->  Hash Right Join  (cost=3.92..2041.40 rows=32 width=25) (actual time=880.329..880.423 rows=63 loops=1)
                                 Hash Cond: ((reports.certname = certnames.certname) AND (reports.id = certnames.latest_report_id))
                                 Buffers: shared hit=1888
                                 ->  Append  (cost=0.00..1997.42 rows=7629 width=49) (actual time=0.020..4.520 rows=7596 loops=1)
                                       Buffers: shared hit=1883
                                       ->  Seq Scan on reports  (cost=0.00..0.00 rows=1 width=56) (actual time=0.010..0.010 rows=0 loops=1)
                                       ->  Seq Scan on reports_20220121z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.008..0.292 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220122z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.010..0.264 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220123z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.004..0.278 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220124z  (cost=0.00..129.08 rows=508 width=49) (actual time=0.005..0.262 rows=504 loops=1)
                                             Buffers: shared hit=124
                                       ->  Seq Scan on reports_20220125z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.006..0.257 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220126z  (cost=0.00..127.09 rows=509 width=49) (actual time=0.006..0.261 rows=507 loops=1)
                                             Buffers: shared hit=122
                                       ->  Seq Scan on reports_20220127z  (cost=0.00..122.15 rows=515 width=49) (actual time=0.006..0.251 rows=511 loops=1)
                                             Buffers: shared hit=117
                                       ->  Seq Scan on reports_20220128z  (cost=0.00..133.12 rows=512 width=49) (actual time=0.004..0.279 rows=509 loops=1)
                                             Buffers: shared hit=128
                                       ->  Seq Scan on reports_20220129z  (cost=0.00..132.06 rows=506 width=49) (actual time=0.005..0.281 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220130z  (cost=0.00..132.05 rows=505 width=49) (actual time=0.009..0.263 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220131z  (cost=0.00..130.13 rows=513 width=49) (actual time=0.004..0.259 rows=510 loops=1)
                                             Buffers: shared hit=125
                                       ->  Seq Scan on reports_20220201z  (cost=0.00..133.13 rows=513 width=49) (actual time=0.004..0.258 rows=512 loops=1)
                                             Buffers: shared hit=128
                                       ->  Seq Scan on reports_20220202z  (cost=0.00..164.50 rows=650 width=49) (actual time=0.004..0.324 rows=648 loops=1)
                                             Buffers: shared hit=158
                                       ->  Seq Scan on reports_20220203z  (cost=0.00..133.09 rows=509 width=49) (actual time=0.007..0.262 rows=508 loops=1)
                                             Buffers: shared hit=128
                                       ->  Seq Scan on reports_20220204z  (cost=0.00..94.60 rows=360 width=49) (actual time=0.004..0.223 rows=363 loops=1)
                                             Buffers: shared hit=91
                                 ->  Hash  (cost=3.44..3.44 rows=32 width=33) (actual time=875.058..875.059 rows=63 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 13kB
                                       Buffers: shared hit=2
                                       ->  Seq Scan on certnames  (cost=1.64..3.44 rows=32 width=33) (actual time=875.027..875.040 rows=63 loops=1)
                                             Filter: (NOT (hashed SubPlan 1))
                                             Rows Removed by Filter: 1
                                             Buffers: shared hit=2
                                             SubPlan 1
                                               ->  Seq Scan on certnames certnames_1  (cost=0.00..1.64 rows=1 width=25) (actual time=37.966..37.968 rows=1 loops=1)
                                                     Filter: ((deactivated IS NOT NULL) OR (expired IS NOT NULL))
                                                     Rows Removed by Filter: 63
                                                     Buffers: shared hit=1
                     ->  Sort  (cost=4.56..4.72 rows=64 width=33) (actual time=0.092..0.105 rows=64 loops=1)
                           Sort Key: catalogs.certname
                           Sort Method: quicksort  Memory: 30kB
                           Buffers: shared hit=2
                           ->  Seq Scan on catalogs  (cost=0.00..2.64 rows=64 width=33) (actual time=0.007..0.015 rows=64 loops=1)
                                 Buffers: shared hit=2
               ->  Nested Loop  (cost=745554.66..1213060.20 rows=1 width=25) (actual time=0.307..0.307 rows=1 loops=63)
                     Buffers: shared hit=583
                     ->  Index Scan using factsets_certname_idx on factsets fs_1  (cost=0.14..1.41 rows=1 width=140) (actual time=0.003..0.003 rows=1 loops=63)
                           Index Cond: (certname = certnames.certname)
                           Buffers: shared hit=126
                     ->  CTE Scan on flattened_one  (cost=745554.52..1213058.77 rows=1 width=136) (actual time=0.302..0.302 rows=1 loops=63)
                           Filter: ((value = '"Debian"'::jsonb) AND (jsonb_typeof(value) <> ALL ('{array,object}'::text[])) AND ((parent_types || type) = 's'::text) AND ((parent_path || key) = '{operatingsyste
m}'::text[]) AND (jsonb_typeof(value) = 'string'::text))
                           Rows Removed by Filter: 123
                           Buffers: shared hit=457
                           CTE flattened_one
                             ->  Recursive Union  (cost=0.00..745554.52 rows=11000100 width=160) (actual time=0.187..0.240 rows=124 loops=63)
                                   Buffers: shared hit=457
                                   ->  Result  (cost=0.00..2.52 rows=100 width=160) (actual time=0.186..0.216 rows=124 loops=63)
                                         Buffers: shared hit=457
                                         ->  ProjectSet  (cost=0.00..0.52 rows=100 width=32) (actual time=0.185..0.200 rows=124 loops=63)
                                               Buffers: shared hit=457
                                               ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=63)
                                   ->  Nested Loop  (cost=0.01..52555.00 rows=1100000 width=160) (never executed)
                                         ->  WorkTable Scan on flattened_one flattened_one_3  (cost=0.00..20.00 rows=1000 width=160) (never executed)
                                         ->  Append  (cost=0.01..36.03 rows=1100 width=96) (never executed)
                                               ->  Result  (cost=0.01..2.02 rows=100 width=96) (never executed)
                                                     ->  ProjectSet  (cost=0.01..0.52 rows=100 width=32) (never executed)
                                                           ->  Result  (cost=0.01..0.01 rows=1 width=0) (never executed)
                                                                 One-Time Filter: (jsonb_typeof(flattened_one_3.value) = 'object'::text)
                                               ->  Result  (cost=0.01..17.51 rows=1000 width=96) (never executed)
                                                     One-Time Filter: (jsonb_typeof(flattened_one_3.value) = 'array'::text)
                                                     ->  Function Scan on generate_series generate_series_1  (cost=0.01..10.01 rows=1000 width=4) (never executed)
         ->  Nested Loop  (cost=745554.66..1185559.32 rows=1 width=25) (actual time=0.381..0.381 rows=1 loops=63)
               Buffers: shared hit=559
               ->  Index Scan using factsets_certname_idx on factsets fs  (cost=0.14..0.78 rows=1 width=140) (actual time=0.003..0.003 rows=1 loops=63)
                     Index Cond: (certname = fs_1.certname)
                     Buffers: shared hit=126
               ->  CTE Scan on flattened_one flattened_one_1  (cost=745554.52..1185558.52 rows=1 width=136) (actual time=0.376..0.376 rows=1 loops=63)
                     Filter: ((jsonb_typeof(value) <> ALL ('{array,object}'::text[])) AND ((value #>> '{}'::text[]) ~ '^11.'::text) AND ((parent_path || key) = '{operatingsystemrelease}'::text[]) AND ((parent_
types || type) = 's'::text))
                     Rows Removed by Filter: 154
                     Buffers: shared hit=433
                     CTE flattened_one
                       ->  Recursive Union  (cost=0.00..745554.52 rows=11000100 width=160) (actual time=0.179..0.247 rows=155 loops=63)
                             Buffers: shared hit=433
                             ->  Result  (cost=0.00..2.52 rows=100 width=160) (actual time=0.178..0.217 rows=155 loops=63)
                                   Buffers: shared hit=433
                                   ->  ProjectSet  (cost=0.00..0.52 rows=100 width=32) (actual time=0.177..0.197 rows=155 loops=63)
                                         Buffers: shared hit=433
                                         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=63)
                             ->  Nested Loop  (cost=0.01..52555.00 rows=1100000 width=160) (never executed)
                                   ->  WorkTable Scan on flattened_one flattened_one_2  (cost=0.00..20.00 rows=1000 width=160) (never executed)
                                   ->  Append  (cost=0.01..36.03 rows=1100 width=96) (never executed)
                                         ->  Result  (cost=0.01..2.02 rows=100 width=96) (never executed)
                                               ->  ProjectSet  (cost=0.01..0.52 rows=100 width=32) (never executed)
                                                     ->  Result  (cost=0.01..0.01 rows=1 width=0) (never executed)
                                                           One-Time Filter: (jsonb_typeof(flattened_one_2.value) = 'object'::text)
                                         ->  Result  (cost=0.01..17.51 rows=1000 width=96) (never executed)
                                               One-Time Filter: (jsonb_typeof(flattened_one_2.value) = 'array'::text)
                                               ->  Function Scan on generate_series  (cost=0.01..10.01 rows=1000 width=4) (never executed)
 Planning Time: 10.464 ms
 JIT:
   Functions: 116
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 8.902 ms, Inlining 61.649 ms, Optimization 485.272 ms, Emission 325.991 ms, Total 881.814 ms
 Execution Time: 982.817 ms
(128 rows)

Elof Ofel (Jira)

unread,
Feb 4, 2022, 12:38:02 PM2/4/22
to puppe...@googlegroups.com
Elof Ofel commented on Task PDB-5450

# cat /tmp/load.txt
                                                                                                                                QUERY PLAN
 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------
 Aggregate  (cost=76757874.40..76757874.41 rows=1 width=8) (actual time=17384.424..17384.445 rows=1 loops=1)
   Buffers: shared hit=3038
   ->  Nested Loop Semi Join  (cost=1493157.16..76757874.00 rows=32 width=299) (actual time=16595.373..17384.369 rows=63 loops=1)
         Join Filter: (certnames.certname = fs.certname)
         Buffers: shared hit=3038
         ->  Nested Loop Semi Join  (cost=747602.50..38819975.28 rows=32 width=50) (actual time=16594.563..16840.309 rows=63 loops=1)
               Buffers: shared hit=2479
               ->  Merge Left Join  (cost=2047.84..2048.64 rows=32 width=25) (actual time=16593.751..16594.123 rows=63 loops=1)
                     Merge Cond: (certnames.certname = catalogs.certname)
                     Buffers: shared hit=1896
                     ->  Sort  (cost=2043.28..2043.36 rows=32 width=25) (actual time=16593.568..16593.603 rows=63 loops=1)
                           Sort Key: certnames.certname
                           Sort Method: quicksort  Memory: 29kB
                           Buffers: shared hit=1894
                           ->  Hash Right Join  (cost=3.92..2042.48 rows=32 width=25) (actual time=16593.228..16593.362 rows=63 loops=1)
                                 Hash Cond: ((reports.certname = certnames.certname) AND (reports.id = certnames.latest_report_id))
                                 Buffers: shared hit=1889
                                 ->  Append  (cost=0.00..1998.48 rows=7633 width=49) (actual time=0.021..85.061 rows=7597 loops=1)
                                       Buffers: shared hit=1884
                                       ->  Seq Scan on reports  (cost=0.00..0.00 rows=1 width=56) (actual time=0.009..0.010 rows=0 loops=1)
                                       ->  Seq Scan on reports_20220121z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.010..0.356 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220122z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.016..0.345 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220123z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.007..0.335 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220124z  (cost=0.00..129.08 rows=508 width=49) (actual time=0.006..0.332 rows=504 loops=1)
                                             Buffers: shared hit=124
                                       ->  Seq Scan on reports_20220125z  (cost=0.00..132.07 rows=507 width=49) (actual time=0.010..36.395 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220126z  (cost=0.00..127.09 rows=509 width=49) (actual time=0.033..0.358 rows=507 loops=1)
                                             Buffers: shared hit=122
                                       ->  Seq Scan on reports_20220127z  (cost=0.00..122.15 rows=515 width=49) (actual time=0.016..11.312 rows=511 loops=1)
                                             Buffers: shared hit=117
                                       ->  Seq Scan on reports_20220128z  (cost=0.00..133.12 rows=512 width=49) (actual time=0.021..0.354 rows=509 loops=1)
                                             Buffers: shared hit=128
                                       ->  Seq Scan on reports_20220129z  (cost=0.00..132.06 rows=506 width=49) (actual time=0.009..0.344 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220130z  (cost=0.00..132.05 rows=505 width=49) (actual time=0.018..0.348 rows=504 loops=1)
                                             Buffers: shared hit=127
                                       ->  Seq Scan on reports_20220131z  (cost=0.00..130.13 rows=513 width=49) (actual time=0.008..0.341 rows=510 loops=1)
                                             Buffers: shared hit=125
                                       ->  Seq Scan on reports_20220201z  (cost=0.00..133.13 rows=513 width=49) (actual time=0.011..0.343 rows=512 loops=1)
                                             Buffers: shared hit=128
                                       ->  Seq Scan on reports_20220202z  (cost=0.00..164.50 rows=650 width=49) (actual time=0.007..0.419 rows=648 loops=1)
                                             Buffers: shared hit=158
                                       ->  Seq Scan on reports_20220203z  (cost=0.00..133.09 rows=509 width=49) (actual time=0.016..0.349 rows=508 loops=1)
                                             Buffers: shared hit=128
                                       ->  Seq Scan on reports_20220204z  (cost=0.00..95.64 rows=364 width=49) (actual time=0.006..32.344 rows=364 loops=1)
                                             Buffers: shared hit=92
                                 ->  Hash  (cost=3.44..3.44 rows=32 width=33) (actual time=16506.741..16506.743 rows=63 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 13kB
                                       Buffers: shared hit=2
                                       ->  Seq Scan on certnames  (cost=1.64..3.44 rows=32 width=33) (actual time=16506.694..16506.715 rows=63 loops=1)
                                             Filter: (NOT (hashed SubPlan 1))
                                             Rows Removed by Filter: 1
                                             Buffers: shared hit=2
                                             SubPlan 1
                                               ->  Seq Scan on certnames certnames_1  (cost=0.00..1.64 rows=1 width=25) (actual time=680.858..680.861 rows=1 loops=1)
                                                     Filter: ((deactivated IS NOT NULL) OR (expired IS NOT NULL))
                                                     Rows Removed by Filter: 63
                                                     Buffers: shared hit=1
                     ->  Sort  (cost=4.56..4.72 rows=64 width=33) (actual time=0.175..0.216 rows=64 loops=1)
                           Sort Key: catalogs.certname
                           Sort Method: quicksort  Memory: 30kB
                           Buffers: shared hit=2
                           ->  Seq Scan on catalogs  (cost=0.00..2.64 rows=64 width=33) (actual time=0.015..0.029 rows=64 loops=1)
                                 Buffers: shared hit=2
               ->  Nested Loop  (cost=745554.66..1213060.20 rows=1 width=25) (actual time=3.906..3.906 rows=1 loops=63)
                     Buffers: shared hit=583
                     ->  Index Scan using factsets_certname_idx on factsets fs_1  (cost=0.14..1.41 rows=1 width=140) (actual time=0.009..0.009 rows=1 loops=63)
                           Index Cond: (certname = certnames.certname)
                           Buffers: shared hit=126
                     ->  CTE Scan on flattened_one  (cost=745554.52..1213058.77 rows=1 width=136) (actual time=3.887..3.887 rows=1 loops=63)
                           Filter: ((value = '"Debian"'::jsonb) AND (jsonb_typeof(value) <> ALL ('{array,object}'::text[])) AND ((parent_types || type) = 's'::text) AND ((parent_path || key) = '{operatingsyste
m}'::text[]) AND (jsonb_typeof(value) = 'string'::text))
                           Rows Removed by Filter: 123
                           Buffers: shared hit=457
                           CTE flattened_one
                             ->  Recursive Union  (cost=0.00..745554.52 rows=11000100 width=160) (actual time=1.466..3.721 rows=124 loops=63)
                                   Buffers: shared hit=457
                                   ->  Result  (cost=0.00..2.52 rows=100 width=160) (actual time=1.464..2.600 rows=124 loops=63)
                                         Buffers: shared hit=457
                                         ->  ProjectSet  (cost=0.00..0.52 rows=100 width=32) (actual time=1.458..2.059 rows=124 loops=63)
                                               Buffers: shared hit=457
                                               ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=63)
                                   ->  Nested Loop  (cost=0.01..52555.00 rows=1100000 width=160) (never executed)
                                         ->  WorkTable Scan on flattened_one flattened_one_3  (cost=0.00..20.00 rows=1000 width=160) (never executed)
                                         ->  Append  (cost=0.01..36.03 rows=1100 width=96) (never executed)
                                               ->  Result  (cost=0.01..2.02 rows=100 width=96) (never executed)
                                                     ->  ProjectSet  (cost=0.01..0.52 rows=100 width=32) (never executed)
                                                           ->  Result  (cost=0.01..0.01 rows=1 width=0) (never executed)
                                                                 One-Time Filter: (jsonb_typeof(flattened_one_3.value) = 'object'::text)
                                               ->  Result  (cost=0.01..17.51 rows=1000 width=96) (never executed)
                                                     One-Time Filter: (jsonb_typeof(flattened_one_3.value) = 'array'::text)
                                                     ->  Function Scan on generate_series generate_series_1  (cost=0.01..10.01 rows=1000 width=4) (never executed)
         ->  Nested Loop  (cost=745554.66..1185559.32 rows=1 width=25) (actual time=8.633..8.633 rows=1 loops=63)
               Buffers: shared hit=559
               ->  Index Scan using factsets_certname_idx on factsets fs  (cost=0.14..0.78 rows=1 width=140) (actual time=0.007..0.007 rows=1 loops=63)
                     Index Cond: (certname = fs_1.certname)
                     Buffers: shared hit=126
               ->  CTE Scan on flattened_one flattened_one_1  (cost=745554.52..1185558.52 rows=1 width=136) (actual time=8.617..8.617 rows=1 loops=63)
                     Filter: ((jsonb_typeof(value) <> ALL ('{array,object}'::text[])) AND ((value #>> '{}'::text[]) ~ '^11.'::text) AND ((parent_path || key) = '{operatingsystemrelease}'::text[]) AND ((parent_
types || type) = 's'::text))
                     Rows Removed by Filter: 154
                     Buffers: shared hit=433
                     CTE flattened_one
                       ->  Recursive Union  (cost=0.00..745554.52 rows=11000100 width=160) (actual time=3.825..5.213 rows=155 loops=63)
                             Buffers: shared hit=433
                             ->  Result  (cost=0.00..2.52 rows=100 width=160) (actual time=3.824..5.163 rows=155 loops=63)
                                   Buffers: shared hit=433
                                   ->  ProjectSet  (cost=0.00..0.52 rows=100 width=32) (actual time=3.818..4.490 rows=155 loops=63)
                                         Buffers: shared hit=433
                                         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=63)
                             ->  Nested Loop  (cost=0.01..52555.00 rows=1100000 width=160) (never executed)
                                   ->  WorkTable Scan on flattened_one flattened_one_2  (cost=0.00..20.00 rows=1000 width=160) (never executed)
                                   ->  Append  (cost=0.01..36.03 rows=1100 width=96) (never executed)
                                         ->  Result  (cost=0.01..2.02 rows=100 width=96) (never executed)
                                               ->  ProjectSet  (cost=0.01..0.52 rows=100 width=32) (never executed)
                                                     ->  Result  (cost=0.01..0.01 rows=1 width=0) (never executed)
                                                           One-Time Filter: (jsonb_typeof(flattened_one_2.value) = 'object'::text)
                                         ->  Result  (cost=0.01..17.51 rows=1000 width=96) (never executed)
                                               One-Time Filter: (jsonb_typeof(flattened_one_2.value) = 'array'::text)
                                               ->  Function Scan on generate_series  (cost=0.01..10.01 rows=1000 width=4) (never executed)
 Planning Time: 209.435 ms
 JIT:
   Functions: 116
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 204.960 ms, Inlining 1129.279 ms, Optimization 9167.271 ms, Emission 6154.773 ms, Total 16656.283 ms
 Execution Time: 18063.038 ms
(128 rows)

Let me know if I can do anything else to debug this.

David McTavish (Jira)

unread,
Feb 4, 2022, 1:48:01 PM2/4/22
to puppe...@googlegroups.com
David McTavish updated an issue
Change By: David McTavish
Method Found: Needs Assessment
Issue Type: Task Bug

Austin Blatt (Jira)

unread,
Feb 4, 2022, 8:19:03 PM2/4/22
to puppe...@googlegroups.com
Austin Blatt commented on Bug PDB-5450
 
Re: Slow fact queries after PuppetDB 7.9.0 upgrade

I think the facts queries are creating the significant load due to the JIT optimization that looks to be enabled on your Postgres, and that load is slowing the Postgres down and creating the poor query performance.

I spun up a totally empty database and ran a similar query.

With JIT disabled, it takes less than a millisecond.

 Planning Time: 1.312 ms
 Execution Time: 0.278 ms

and then with JIT enabled it takes over 700ms

 Planning Time: 1.224 ms
 JIT:
   Functions: 56
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 12.209 ms, Inlining 81.108 ms, Optimization 268.355 ms, Emission 160.117 ms, Total 521.789 ms
 Execution Time: 735.923 ms

It appears that the Postgres query planner is generating a huge cost estimate {{cost=745615.49..90016708.23 }} and then Postgres attempts as much JIT optimization as possible against the query because it expects it to be slow (even though it is not). This is in line with the explain analyzes that you provides, where most of the time for both queries seems to be spent in attempting the various JIT optimizations rather than the actual query. We can look into why Postgres's planner is getting such a wrong estimate, but in the mean time I would recommend trying to load the UI after disabling JIT.

If that is not possible, it would also be safe to downgrade your PuppetDB to 7.8.0 because we did not have any database migrations between the two versions.

Elof Ofel (Jira)

unread,
Feb 5, 2022, 4:49:03 AM2/5/22
to puppe...@googlegroups.com
Elof Ofel commented on Bug PDB-5450

Hi!
Thanks (again) for quick response!

I will try this next week when back at work.

Can I ask for the configuration line to add/remove?

PS: I'm using the standard postgres container with default settings, as installed by your Pupperware.

Austin Blatt (Jira)

unread,
Feb 5, 2022, 11:12:01 AM2/5/22
to puppe...@googlegroups.com
Austin Blatt commented on Bug PDB-5450

Yep, definitely, the configration option is jit in postgresql.conf

jit = off

It looks like JIT was turned from off by default in PG 11 to on by default in PG 12+. I am not terribly familiar with our pupperware stack, but I assume that means its pulling in a newer version of postgres that 11. Thanks for letting me know, I'll look into that as well.

Elof Ofel (Jira)

unread,
Feb 5, 2022, 12:14:02 PM2/5/22
to puppe...@googlegroups.com
Elof Ofel commented on Bug PDB-5450

Thanks.
Will test next week.

Meanwhile, I can inform you that the postgres container has not changed.
During the last year or two, Pupperware has been using the container 'postgres:12.6'.

So "unfortunately" your explaination above is not the reason for the performance degregation after upgrading PuppetDB from 7.8.0 to 7.9.0.

Elof Ofel (Jira)

unread,
Feb 7, 2022, 9:40:03 AM2/7/22
to puppe...@googlegroups.com
Elof Ofel commented on Bug PDB-5450

Yay!

Good catch! It is the JIT compilation that is the issue.

echo 'jit = off' >> /var/lib/postgresql/data/pgdata/postgresql.conf

After disabling it, all cells in my dashboard load in less than 3 seconds. That's ok.

So if this is the solution, then your Pupperware team should disable jit in the postgres container.
...or can you see why Postgres's planner is getting such a wrong estimate?

Perhaps it can be solved in pupperware with something like this:
pupperware: gem/lib/pupperware/compose-services/postgres-custom/10-postgresql.conf_disable_jit.sh :

#!/bin/bash
/bin/grep -q '^jit = off$' /var/lib/postgresql/data/pgdata/postgresql.conf || /bin/echo -e '\n# Added by 10-postgresql.conf_disable_jit.sh:\njit = off' >> /var/lib/postgresql/data/pgdata/postgresql.conf

Unfortunately I can't test the script, 'cause the official postgres docker image don't support advanced customization of the config. :-/
The script that allow customization only do this once, if there are no database. If a database already exists, all custom initialization scripts are skipped. Since I already have a database, my script (/docker-entrypoint-initdb.d/10-postgresql.conf_disable_jit.sh inside the container) is not executed.
I guess that the above script will execute on a brand new pupperware install though, disabling JIT.

In my case, I manually edit the postgresql.conf file and add 'jit = off'. Fortunately, this conf file is stored in a persistent docker volume, so you only need to edit it once. smile.png

Austin Blatt (Jira)

unread,
Feb 7, 2022, 11:17:02 AM2/7/22
to puppe...@googlegroups.com
Austin Blatt commented on Bug PDB-5450

Awesome, glad it's working better!

And yes, I should have been more clear above, the ticket you originally linked to (PDB-5259) was the change to the fact-contents endpoint in 7.9.0 that is causing JIT to perform poorly. So in 7.8.0 and earlier, the fact-contents endpoint would have performed normally with JIT enabled.

The performance improvement on PDB-5259 was substantial, so unless we can solve the Postgres planner issue, we will likely be changing our query code to disable JIT for all its fact-contents queries so that other users do not have to do it manually. In the meantime, I will add a known issue to our docs with the workaround.

Thanks again for the report and help tracking down the issue.

Austin Blatt (Jira)

unread,
Feb 7, 2022, 11:51:04 AM2/7/22
to puppe...@googlegroups.com
Austin Blatt commented on Bug PDB-5450

Actually,

in less than 3 seconds

is that still notably slower for you than before the upgrade?

David McTavish (Jira)

unread,
Feb 7, 2022, 1:40:01 PM2/7/22
to puppe...@googlegroups.com
Reply all
Reply to author
Forward
0 new messages