So after a bit of analysis we realise querying for a fact by value_string was super slow . It's gin index
puppetdb> explain analyze select * from facts where value_string LIKE '
nachodev6.cern.ch';
+------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|------------------------------------------------------------------------------------------------------------------------------------------|
| Bitmap Heap Scan on facts (cost=1018.68..15873.81 rows=3958 width=129) (actual time=124.456..124.469 rows=8 loops=1) |
| Heap Blocks: exact=5 |
| -> Bitmap Index Scan on facts_value_string_trgm (cost=0.00..1017.69 rows=3958 width=0) (actual time=124.435..124.435 rows=8 loops=1) |
| Planning time: 0.170 ms |
| Execution time: 124.627 ms |
+------------------------------------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.145s
puppetdb> explain analyze select * from facts where value_string = '
nachodev6.cern.ch';
+---------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|---------------------------------------------------------------------------------------------------------------|
| Seq Scan on facts (cost=0.00..935197.30 rows=3958 width=129) (actual time=5466.492..6886.163 rows=8 loops=1) |
| Rows Removed by Filter: 27237747 |
| Planning time: 0.129 ms |
| Execution time: 6886.208 ms |
+---------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 6.904s (6 seconds)
puppetdb>
query_facts( 'hostgroup_0="dark" and hostgroup_1="web"', ['ipaddress','ipaddress6'])
and we could not figure a way to make them use LIKE.
So we added an index which made a hugely massive difference:
CREATE INDEX concurrently facts_value_string_idx ON facts(value_string);
and now we no longer doing full table scans for the above.
Attached is our CPU drop on the database as well as the latency drop on the facts endpoint of puppetdb 4.
Our puppetdb is now very happy rather than very sad.
We do now have a limit of 8191 on fact value but we can live with that.
Steve.