Jira (PDB-5252) Determine size impact of trigram indexes on facts

Andrei Filipovici (Jira)

Aug 26, 2021, 7:27:02 AM
to puppe...@googlegroups.com
Andrei Filipovici updated an issue
 
PuppetDB / Task PDB-5252
Determine size impact of trigram indexes on facts
Change By: Andrei Filipovici
Summary: Determine size impact of trigram indexes on facts (previously "trigram index")
This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)

Andrei Filipovici (Jira)

Aug 26, 2021, 7:59:03 AM
to puppe...@googlegroups.com
Andrei Filipovici commented on Task PDB-5252
 
Re: Determine size impact of trigram indexes on facts

After I started PDB the sandbox shrank to 22729 MB / 23832150982 bytes.

After adding the remaining 9 indexes the sandbox size was 22766 MB / 23871586932 bytes.
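
The growth caused by the nine additional indexes follows from the two byte counts above. A quick check in Python, assuming the megabyte figures are units of 1024 * 1024 bytes (which roughly matches the byte counts quoted); the per-index figure is only an average, not a measurement:

```python
MIB = 1024 * 1024  # assumption: the megabyte figures are 1024-based

before_bytes = 23832150982  # sandbox size after starting PDB
after_bytes = 23871586932   # sandbox size after adding the remaining 9 indexes

# Growth of the sandbox on disk attributable to the 9 new indexes.
growth_bytes = after_bytes - before_bytes
growth_mib = growth_bytes / MIB

# Average only; the individual indexes differ in size.
per_index_mib = growth_mib / 9

print(f"growth: {growth_bytes} bytes = {growth_mib:.1f} MiB")
print(f"average per index: {per_index_mib:.1f} MiB")
```

This gives a growth of about 37.6 MiB, or roughly 4.2 MiB per added index on average.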

Total index size on table factsets was 151 MB / 158687232 bytes.
Individual relation sizes (the list also includes the factsets table itself and its id sequence) were:

                 relation                  |    size
-------------------------------------------+------------
 public.idx_factsets_jsonb_merged          | 90 MB
 public.factsets                           | 37 MB
 public.factsets_hash_expr_idx             | 8848 kB
 public.factsets_operating_system_idx      | 7312 kB
 public.factsets_fqdn_idx                  | 5112 kB
 public.factsets_os_family_idx             | 4840 kB
 public.factsets_system_uptime_idx         | 4824 kB
 public.factsets_virtual_idx               | 4712 kB
 public.factsets_puppetversion_idx         | 4680 kB
 public.factsets_kernel_idx                | 4648 kB
 public.factsets_trusted_authenticated_idx | 4648 kB
 public.factsets_timezone_idx              | 4424 kB
 public.factsets_certname_idx              | 3104 kB
 public.idx_factsets_prod                  | 2504 kB
 public.factsets_pkey                      | 2504 kB
 public.factsets_processors_count_idx      | 616 kB
 public.factsets_id_seq                    | 8192 bytes
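
As a cross-check, the index sizes in the table can be added up and compared with the reported total. The sketch below copies the values from the table and assumes they are in the 1024-based units that PostgreSQL's pg_size_pretty prints; the helper name to_kb is only illustrative:

```python
# Sizes copied from the table above.
SIZES = {
    "idx_factsets_jsonb_merged": "90 MB",
    "factsets": "37 MB",  # the table itself, not an index
    "factsets_hash_expr_idx": "8848 kB",
    "factsets_operating_system_idx": "7312 kB",
    "factsets_fqdn_idx": "5112 kB",
    "factsets_os_family_idx": "4840 kB",
    "factsets_system_uptime_idx": "4824 kB",
    "factsets_virtual_idx": "4712 kB",
    "factsets_puppetversion_idx": "4680 kB",
    "factsets_kernel_idx": "4648 kB",
    "factsets_trusted_authenticated_idx": "4648 kB",
    "factsets_timezone_idx": "4424 kB",
    "factsets_certname_idx": "3104 kB",
    "idx_factsets_prod": "2504 kB",
    "factsets_pkey": "2504 kB",
    "factsets_processors_count_idx": "616 kB",
    "factsets_id_seq": "8192 bytes",  # a sequence, not an index
}

NOT_INDEXES = {"factsets", "factsets_id_seq"}
KB_PER_UNIT = {"bytes": 1 / 1024, "kB": 1, "MB": 1024}


def to_kb(pretty):
    """Convert a string such as '8848 kB' to kilobytes (1 kB = 1024 bytes)."""
    value, unit = pretty.split()
    return int(value) * KB_PER_UNIT[unit]


# Sum only the indexes, skipping the table and the sequence.
index_kb = sum(to_kb(size) for name, size in SIZES.items() if name not in NOT_INDEXES)
index_mb = index_kb / 1024

print(f"total index size: {index_mb:.1f} MB")
```

The sum comes to about 151.3 MB, which agrees with the reported total of 151 MB once the rounding of the 90 MB entry is taken into account.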

The command ingestion benchmark results were:

Andrei Filipovici (Jira)

Aug 26, 2021, 10:29:03 AM
to puppe...@googlegroups.com

After that I deleted all the GIN indexes, ran a VACUUM FULL and recreated all the indexes as GIST.

Andrei Filipovici (Jira)

Aug 27, 2021, 7:17:03 AM
to puppe...@googlegroups.com

In summary:

  • how much space does an index for a fact take (sandbox size on disk and index size obtained with an SQL query)?

A GIN index uses between 3 and 9 MB, and a GIST index between 3 and 10 MB.

  • how much space do 10 indexes take?

The 10 GIN indexes used 56 MB and the 10 GIST indexes used 77 MB.

  • is there any performance decrease on ingestion?

When we first ran the ingestion benchmark we got a time of 263 seconds, and after we added a GIN index we got a time of 49 seconds. The time remained the same with 10 GIN indexes. With 10 GIST indexes the time increased to 51 seconds. I think the impact of 10 indexes is negligible.
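
To put the GIN and GIST figures side by side, the relative differences can be computed from the numbers in the summary. This is only arithmetic on the reported values; the helper name pct_increase is illustrative:

```python
# Figures from the summary above, for 10 indexes of each type.
gin_size_mb, gist_size_mb = 56, 77  # total size of the 10 indexes
gin_time_s, gist_time_s = 49, 51    # ingestion benchmark time


def pct_increase(base, other):
    """Relative increase of `other` over `base`, in percent."""
    return (other - base) / base * 100


size_pct = pct_increase(gin_size_mb, gist_size_mb)
time_pct = pct_increase(gin_time_s, gist_time_s)

print(f"GIST vs GIN size: +{size_pct:.1f}%")
print(f"GIST vs GIN ingestion time: +{time_pct:.1f}%")
```

On these numbers the GIST indexes take about 37.5% more space than the GIN ones, while the ingestion time is about 4% longer.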
