Jira (PDB-5647) Slow fact queries after PuppetDB 7 upgrade (comparing with PuppetDB 5) with GCP CloudSQL PostgreSQL

11 views
Skip to first unread message

Krzysztof Krzyżaniak

unread,
Jun 2, 2023, 8:03:01 AM6/2/23
to puppe...@googlegroups.com
Krzysztof Krzyżaniak created an issue
 
PuppetDB / Bug PDB-5647
Slow fact queries after PuppetDB 7 upgrade (comparing with PuppetDB 5) with GCP CloudSQL PostgreSQL
Issue Type: Bug Bug
Affects Versions: PDB 7.13.0
Assignee: Unassigned
Components: PuppetDB
Created: 2023/06/02 5:02 AM
Priority: Normal Normal
Reporter: Krzysztof Krzyżaniak

Note

This could be related to PDB-5450 but there are some differences

Who found the bug?

Krzysztof Krzyżaniak (eloy) <kkrzy...@egnyte.com>

Explain what's going on

We are in the progress of migration to puppet7 with upgraded Puppetserver and Puppetdb. We noticed that our Puppetdb queries are much slower on puppet7 than on puppet5. We are aware of PDB-5450 but we use GCP CloudSQL which does not have JIT enabled by design.

Steps to reproduce

Our example query:

facts[value] {
        name = "consul_services"
        and facts { name = "egnyte_role" and value = "roles-consul" }
        and facts { name = "virtual_dc" and value = "egn" }
        and facts { name = "side" and value = "gcp" }
}

Querying puppetdb 5

$ time ( echo '{"query": "facts[value] {  name = \"consul_services\" and facts { name = \"egnyte_role\" and value = \"roles-consul\" } and facts { name = \"virtual_dc\" and value = \"egn\" }  and facts { name = \"side\" and value = \"gcp\" }  }"}' | xh POST :8080/pdb/query/v4 > /dev/null )
 
real	0m4.737s
user	0m0.001s
sys	0m0.010s

Querying puppetdb 7

time ( echo '{"query": "facts[value] {  name = \"consul_services\" and facts { name = \"egnyte_role\" and value = \"roles-consul\" } and facts { name = \"virtual_dc\" and value = \"egn\" }  and facts { name = \"side\" and value = \"gcp\" }  }"}' | xh POST :8080/pdb/query/v4 > /dev/null )
 
 
real	0m37.431s
user	0m0.002s
sys	0m0.004s

Generated query on puppetdb7

WITH inactive_nodes AS
(
SELECT  certname
FROM certnames
WHERE (deactivated IS NOT NULL AND deactivated > '2023-06-02T07:36:50.397Z') OR (expired IS NOT NULL AND expired > '2023-06-02T07:36:50.397Z')
), not_active_nodes AS
(
SELECT  certname
FROM certnames
WHERE (deactivated IS NOT NULL) OR (expired IS NOT NULL)
)
SELECT  fs.value AS "value"
FROM
(
SELECT  certname
       ,environment_id
       ,'consul_services'::text                 AS key
       ,(stable||volatile)-> 'consul_services' AS value
FROM factsets
WHERE (stable||volatile) ? 'consul_services'
) AS fs
LEFT JOIN environments AS env
ON fs.environment_id = env.id
WHERE (((fs.key = 'consul_services')) AND (((fs.certname) IN ( SELECT certname FROM  ( SELECT fs.certname AS "certname" FROM  ( SELECT certname, environment_id, (jsonb_each((stable||volatile))).* FROM factsets  ) AS fs LEFT JOIN environments AS env ON fs.environment_id = env.id WHERE ((fs.key = 'egnyte_role')) AND (jsonb_typeof("value") = 'string' AND ((fs.value = '"roles-consul"')))  ) AS sub)) AND ((fs.certname) IN ( SELECT certname FROM  ( SELECT fs.certname AS "certname" FROM  ( SELECT certname, environment_id, (jsonb_each((stable||volatile))).* FROM factsets  ) AS fs LEFT JOIN environments AS env ON fs.environment_id = env.id WHERE ((fs.key = 'virtual_dc')) AND (jsonb_typeof("value") = 'string' AND ((fs.value = '"egn"')))  ) AS sub))) AND ((fs.certname) IN ( SELECT certname FROM  ( SELECT fs.certname AS "certname" FROM  ( SELECT certname, environment_id, (jsonb_each((stable||volatile))).* FROM factsets  ) AS fs LEFT JOIN environments AS env ON fs.environment_id = env.id WHERE ((fs.key = 'side')) AND (jsonb_typeof("value") = 'string' AND ((fs.value = '"gcp"')))  ) AS sub)))
AND NOT ((fs.certname) IN ( SELECT certname FROM  ( SELECT not_active_nodes.certname AS "certname" FROM not_active_nodes  ) AS sub))

Explain analyze:

                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=12639.18..37778.23 rows=1 width=32) (actual time=17044.441..39996.822 rows=3 loops=1)
   Join Filter: (factsets.certname = fs_2.certname)
   Rows Removed by Join Filter: 1391
   Buffers: shared hit=956656
   ->  Nested Loop Semi Join  (cost=12639.18..25213.29 rows=1 width=257) (actual time=16967.932..39199.368 rows=3 loops=1)
         Join Filter: (factsets.certname = fs_1.certname)
         Rows Removed by Join Filter: 4337
         Buffers: shared hit=937084
         ->  Nested Loop  (cost=12639.18..12648.36 rows=1 width=202) (actual time=1560.239..1571.784 rows=27 loops=1)
               Buffers: shared hit=37131
               ->  HashAggregate  (cost=12564.92..12564.93 rows=1 width=63) (actual time=1559.406..1559.481 rows=27 loops=1)
                     Group Key: fs.certname
                     Batches: 1  Memory Usage: 24kB
                     Buffers: shared hit=36684
                     ->  Subquery Scan on fs  (cost=0.00..12564.92 rows=1 width=63) (actual time=78.887..1559.235 rows=27 loops=1)
                           Filter: ((fs.key = 'egnyte_role'::text) AND (fs.value = '"roles-consul"'::jsonb) AND (jsonb_typeof(fs.value) = 'string'::text))
                           Rows Removed by Filter: 638477
                           Buffers: shared hit=36684
                           ->  Result  (cost=0.00..7172.92 rows=269600 width=127) (actual time=0.626..1503.369 rows=638504 loops=1)
                                 Buffers: shared hit=36684
                                 ->  ProjectSet  (cost=0.00..1780.92 rows=269600 width=95) (actual time=0.624..1392.324 rows=638504 loops=1)
                                       Buffers: shared hit=36684
                                       ->  Seq Scan on factsets factsets_1  (cost=0.00..405.96 rows=2696 width=155) (actual time=0.007..1.593 rows=2696 loops=1)
                                             Buffers: shared hit=379
               ->  Index Scan using factsets_certname_idx on factsets  (cost=74.25..82.28 rows=1 width=155) (actual time=0.445..0.445 rows=1 loops=27)
                     Index Cond: (certname = fs.certname)
                     Filter: ((NOT (hashed SubPlan 1)) AND ((stable || volatile) ? 'consul_services'::text))
                     Buffers: shared hit=447
                     SubPlan 1
                       ->  Seq Scan on certnames  (cost=0.00..73.97 rows=1 width=55) (actual time=0.016..0.424 rows=2 loops=1)
                             Filter: ((deactivated IS NOT NULL) OR (expired IS NOT NULL))
                             Rows Removed by Filter: 2696
                             Buffers: shared hit=47
         ->  Subquery Scan on fs_1  (cost=0.00..12564.92 rows=1 width=63) (actual time=6.207..1393.472 rows=161 loops=27)
               Filter: ((fs_1.key = 'virtual_dc'::text) AND (fs_1.value = '"egn"'::jsonb) AND (jsonb_typeof(fs_1.value) = 'string'::text))
               Rows Removed by Filter: 579473
               Buffers: shared hit=899953
               ->  Result  (cost=0.00..7172.92 rows=269600 width=127) (actual time=0.509..1343.199 rows=579634 loops=27)
                     Buffers: shared hit=899953
                     ->  ProjectSet  (cost=0.00..1780.92 rows=269600 width=95) (actual time=0.508..1244.236 rows=579634 loops=27)
                           Buffers: shared hit=899953
                           ->  Seq Scan on factsets factsets_2  (cost=0.00..405.96 rows=2696 width=155) (actual time=0.004..1.411 rows=2448 loops=27)
                                 Buffers: shared hit=9186
   ->  Subquery Scan on fs_2  (cost=0.00..12564.92 rows=1 width=63) (actual time=0.552..265.234 rows=465 loops=3)
         Filter: ((fs_2.key = 'side'::text) AND (fs_2.value = '"gcp"'::jsonb) AND (jsonb_typeof(fs_2.value) = 'string'::text))
         Rows Removed by Filter: 108145
         Buffers: shared hit=19537
         ->  Result  (cost=0.00..7172.92 rows=269600 width=127) (actual time=0.540..255.527 rows=108610 loops=3)
               Buffers: shared hit=19537
               ->  ProjectSet  (cost=0.00..1780.92 rows=269600 width=95) (actual time=0.539..236.956 rows=108610 loops=3)
                     Buffers: shared hit=19537
                     ->  Seq Scan on factsets factsets_3  (cost=0.00..405.96 rows=2696 width=155) (actual time=0.007..0.227 rows=465 loops=3)
                           Buffers: shared hit=90
 Planning:
   Buffers: shared hit=24
 Planning Time: 1.000 ms
 Execution Time: 39996.958 ms
(57 rows)

JIT settings on CloudSQL

puppetdb=> SELECT name, setting FROM pg_settings where name like 'jit%';
          name           | setting 
-------------------------+---------
 jit                     | on
 jit_above_cost          | 100000
 jit_debugging_support   | off
 jit_dump_bitcode        | off
 jit_expressions         | on
 jit_inline_above_cost   | 500000
 jit_optimize_above_cost | 500000
 jit_profiling_support   | off
 jit_provider            | llvmjit
 jit_tuple_deforming     | on
(10 rows)

... but

puppetdb=> SELECT pg_jit_available();
 pg_jit_available 
------------------
 f
(1 row)

... so it looks like JIT is not used (also it doesn't exist in the explained body)

Unfortunately, we can't set jit off because this setting is not available as an adjustable option in CloudSQL.

Additional Info

  • PostgreSQL version:

    puppetdb=> SELECT version();
                                             version                                         
    -----------------------------------------------------------------------------------------
     PostgreSQL 13.8 on x86_64-pc-linux-gnu, compiled by Debian clang version 12.0.1, 64-bit
    

  • PuppetDB version: puppetdb-7.13.0-1.el7.noarch
Add Comment Add Comment
 
This message was sent by Atlassian Jira (v8.20.21#820021-sha1:38274c8)
Atlassian logo

Krzysztof Krzyżaniak

unread,
Jun 5, 2023, 4:09:01 AM6/5/23
to puppe...@googlegroups.com
Krzysztof Krzyżaniak commented on Bug PDB-5647
 
Re: Slow fact queries after PuppetDB 7 upgrade (comparing with PuppetDB 5) with GCP CloudSQL PostgreSQL

We can confirm that moving from CloudSQL to PostgreSQL on VM improves query time but still not good enough for our needs.

                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=12495.76..37341.86 rows=1 width=32) (actual time=10829.491..25654.661 rows=3 loops=1)
   Join Filter: (factsets.certname = fs_2.certname)
   Rows Removed by Join Filter: 1307
   Buffers: shared hit=620993 read=24568 dirtied=27 written=2007
   ->  Nested Loop Semi Join  (cost=12495.76..24923.36 rows=1 width=258) (actual time=10783.305..25178.060 rows=3 loops=1)
         Join Filter: (factsets.certname = fs_1.certname)
         Rows Removed by Join Filter: 4326
         Buffers: shared hit=608386 read=24568 dirtied=27 written=2007
         ->  Nested Loop  (cost=12495.76..12504.87 rows=1 width=203) (actual time=986.376..993.239 rows=27 loops=1)
               Buffers: shared hit=14055 read=11105 dirtied=27 written=1564
               ->  HashAggregate  (cost=12418.48..12418.49 rows=1 width=63) (actual time=985.937..985.982 rows=27 loops=1)
                     Group Key: fs.certname
                     Batches: 1  Memory Usage: 24kB
                     Buffers: shared hit=13708 read=11092 dirtied=27 written=1564
                     ->  Subquery Scan on fs  (cost=0.00..12418.48 rows=1 width=63) (actual time=46.545..985.901 rows=27 loops=1)
                           Filter: ((fs.key = 'egnyte_role'::text) AND (fs.value = '"roles-consul"'::jsonb) AND (jsonb_typeof(fs.value) = 'string'::text))
                           Rows Removed by Filter: 639154
                           Buffers: shared hit=13708 read=11092 dirtied=27 written=1564
                           ->  Result  (cost=0.00..6970.48 rows=272400 width=127) (actual time=0.394..951.928 rows=639181 loops=1)
                                 Buffers: shared hit=13708 read=11092 dirtied=27 written=1564
                                 ->  ProjectSet  (cost=0.00..1522.48 rows=272400 width=95) (actual time=0.392..884.611 rows=639181 loops=1)
                                       Buffers: shared hit=13708 read=11092 dirtied=27 written=1564
                                       ->  Seq Scan on factsets factsets_1  (cost=0.00..133.24 rows=2724 width=156) (actual time=0.011..1.002 rows=2699 loops=1)
                                             Buffers: shared hit=39 read=67 dirtied=6 written=10
               ->  Index Scan using factsets_certname_idx on factsets  (cost=77.28..85.31 rows=1 width=156) (actual time=0.265..0.265 rows=1 loops=27)
                     Index Cond: (certname = fs.certname)
                     Filter: ((NOT (hashed SubPlan 1)) AND ((stable || volatile) ? 'consul_services'::text))
                     Buffers: shared hit=345 read=13
                     SubPlan 1
                       ->  Seq Scan on certnames  (cost=0.00..77.00 rows=1 width=55) (actual time=0.190..0.191 rows=2 loops=1)
                             Filter: ((deactivated IS NOT NULL) OR (expired IS NOT NULL))
                             Rows Removed by Filter: 2699
                             Buffers: shared hit=50
         ->  Subquery Scan on fs_1  (cost=0.00..12418.48 rows=1 width=63) (actual time=1.066..895.696 rows=160 loops=27)
               Filter: ((fs_1.key = 'virtual_dc'::text) AND (fs_1.value = '"egn"'::jsonb) AND (jsonb_typeof(fs_1.value) = 'string'::text))
               Rows Removed by Filter: 579366
               Buffers: shared hit=594331 read=13463 written=443
               ->  Result  (cost=0.00..6970.48 rows=272400 width=127) (actual time=0.305..865.161 rows=579526 loops=27)
                     Buffers: shared hit=594331 read=13463 written=443
                     ->  ProjectSet  (cost=0.00..1522.48 rows=272400 width=95) (actual time=0.304..803.625 rows=579526 loops=27)
                           Buffers: shared hit=594331 read=13463 written=443
                           ->  Seq Scan on factsets factsets_2  (cost=0.00..133.24 rows=2724 width=156) (actual time=0.003..0.483 rows=2448 loops=27)
                                 Buffers: shared hit=2468 read=129 written=2
   ->  Subquery Scan on fs_2  (cost=0.00..12418.48 rows=1 width=63) (actual time=0.350..158.505 rows=437 loops=3)
         Filter: ((fs_2.key = 'side'::text) AND (fs_2.value = '"gcp"'::jsonb) AND (jsonb_typeof(fs_2.value) = 'string'::text))
         Rows Removed by Filter: 101793
         Buffers: shared hit=12580
         ->  Result  (cost=0.00..6970.48 rows=272400 width=127) (actual time=0.344..153.210 rows=102230 loops=3)
               Buffers: shared hit=12580
               ->  ProjectSet  (cost=0.00..1522.48 rows=272400 width=95) (actual time=0.343..142.446 rows=102230 loops=3)
                     Buffers: shared hit=12580
                     ->  Seq Scan on factsets factsets_3  (cost=0.00..133.24 rows=2724 width=156) (actual time=0.003..0.077 rows=437 loops=3)
                           Buffers: shared hit=53
 Planning:
   Buffers: shared hit=21 read=3
 Planning Time: 0.653 ms
 Execution Time: 25654.807 ms
(57 rows)

Krzysztof Krzyżaniak

unread,
Jun 12, 2023, 9:15:02 AM6/12/23
to puppe...@googlegroups.com

It looks like it's entirely connected with the PostgreSQL version, all versions higher than 9 have problems with that query. These are times we gathered for various combination of PuppetDB and PostgreSQL (in any case JIT is switched off):

puppetdb5 & PostgreSQL13
real 0m24.315s
user 0m0.006s
puppetdb5 & PostgreSQL9.6 
real 0m7.194s
user 0m0.004s
puppetdb 7.12.0 & PostgreSQL9.6
real 0m5.689s
user 0m0.005s
puppetdb 7.13.0 & PostgreSQL13 - combination not possible, doesn't work.

Steve Axthelm (Jira)

unread,
Jun 13, 2023, 1:48:04 PM6/13/23
to puppe...@googlegroups.com
Reply all
Reply to author
Forward
0 new messages