This is a pretty vague report, I know, but I'm filing it anyhow. I'm running PuppetDB (plus puppetserver and puppetexplorer) as Docker containers, managed via pupperware. I just upgraded the PuppetDB container from v7.8.0 to v7.9.0. Since then, queries from PuppetExplorer are really slow, and sometimes they even time out. All the cells in the PuppetExplorer dashboard used to update quickly (in less than 1 second) throughout the last year, across various PuppetDB 7.x.x versions. If I reload the dashboard now, a few cells immediately show a value while the others show the spinning animation, waiting for a response. After about 40 seconds they finally show a value (and sometimes the request times out completely). The PuppetDB container logs no warnings. The PuppetExplorer container logs no warnings either, as long as a response arrives (after ~40 s); otherwise it logs:
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
puppetexplorer_1 | 03/Feb/2022:12:57:48 +0000 [ERROR 502 /api/pdb/query/v4/nodes] context canceled
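To rule out PuppetExplorer (and its reverse proxy) as the source of the delay, it may help to time the same query directly against PuppetDB. Here is a minimal sketch in Python, assuming PuppetDB's cleartext port is reachable on localhost:8080 (a placeholder; adjust the URL to match your pupperware setup):

```python
#!/usr/bin/env python3
# Time a PuppetDB query directly, bypassing PuppetExplorer, to check
# whether the ~40 s delay originates in PuppetDB itself.
# Assumption: PuppetDB's cleartext HTTP listener is on localhost:8080;
# adjust host/port to your compose file.
import time
import requests

URL = "http://localhost:8080/pdb/query/v4/nodes"

start = time.monotonic()
resp = requests.get(URL, timeout=120)
elapsed = time.monotonic() - start

resp.raise_for_status()
print(f"{len(resp.json())} nodes in {elapsed:.1f} s")
```

If this call alone takes ~40 seconds, the regression is in PuppetDB (or its database queries) rather than in the proxying.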
While watching 'top' on the Docker host, I reload the PuppetExplorer dashboard (which sends a bunch of queries to PuppetDB). In 'top' I can see roughly 20 postgres processes appear, each using about 10% CPU. The processes stay around for ~40 seconds and then disappear; at the same moment my browser finally shows values in all dashboard cells. So my guess is that something in those postgres processes takes 40 seconds before it finally returns.

Meanwhile, the puppet agents talk to the puppetserver as usual: no increase in the time to apply the catalog, and the server replaces facts and stores reports in PuppetDB without problems.

In https://puppet.com/docs/puppetdb/7/release_notes.html#puppetdb-790 I see: "Improved performance of the fact-contents endpoint. Testing against a database of 10,000 mocked nodes, there was an observed 84% decrease in time taken to complete a difficult query. https://tickets.puppetlabs.com/browse/PDB-5259" Could it be this change that degraded the performance for me? Possibly a bug in some library that distributes multiple queries over a pool of postgres connections?
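To see what those postgres processes are actually doing during the 40 seconds, one could sample pg_stat_activity while the dashboard is loading. A minimal sketch, assuming the pupperware postgres container is reachable with the connection parameters below (all placeholders; adjust the DSN to your compose file):

```python
#!/usr/bin/env python3
# Snapshot pg_stat_activity while the dashboard is loading, to capture
# the SQL statements behind the ~20 long-running postgres processes.
# Assumptions: psycopg2 is installed, and the DSN below matches the
# pupperware postgres container (placeholder credentials).
import psycopg2

DSN = "host=localhost port=5432 dbname=puppetdb user=puppetdb password=puppetdb"

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        # List non-idle backends, longest-running first.
        cur.execute("""
            SELECT pid,
                   now() - query_start AS runtime,
                   state,
                   left(query, 120) AS query
            FROM pg_stat_activity
            WHERE state <> 'idle'
            ORDER BY runtime DESC
        """)
        for pid, runtime, state, query in cur.fetchall():
            print(pid, runtime, state, query)
```

Running EXPLAIN ANALYZE on whichever statement shows up with a ~40 s runtime should reveal whether the PDB-5259 query changes picked a worse plan on my database.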