Jira (PDB-5252) Determine trigram index on facts size impact

Bogdan Irimie (Jira)

Aug 25, 2021, 3:53:03 AM
to puppe...@googlegroups.com
Bogdan Irimie created an issue
 
PuppetDB / Task PDB-5252
Determine trigram index on facts size impact
Issue Type: Task
Assignee: Unassigned
Created: 2021/08/25 12:52 AM
Priority: Normal
Reporter: Bogdan Irimie

We should answer these questions:

  • how much space does an index for a fact take?
  • how much space do 10 indexes take?
  • is there any performance decrease on ingestion?
  • can we add indexes on nested facts?
 
This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)

Bogdan Irimie (Jira)

Aug 25, 2021, 3:56:03 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
We should answer these questions:

* how much space does an index for a fact take (sandbox size on disk and index size obtained with an SQL query)?
* how much space do 10 indexes take?
* is there any performance decrease on ingestion?
* can we add indexes on nested facts?
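
On the last question: PostgreSQL expression indexes can target a path inside a jsonb value, so a trigram index on a nested fact is possible in principle. The statements below are only a sketch of what that could look like. They assume the pg_trgm extension is available and that factsets stores facts in jsonb columns named stable and volatile; the index name and the os.family path are hypothetical and are not taken from this ticket.

```sql
-- Assumption: pg_trgm is installed or can be installed in this database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical trigram index on the nested fact os.family.
-- Assumption: facts live in the jsonb columns "stable" and "volatile".
CREATE INDEX factsets_os_family_trgm_idx
    ON factsets
 USING gin (((stable || volatile) #>> '{os,family}') gin_trgm_ops);

-- A query can only use the index if it repeats the indexed expression exactly.
SELECT certname
  FROM factsets
 WHERE (stable || volatile) #>> '{os,family}' ILIKE '%hat%';
```

Whether the planner actually picks such an index, and what it costs on ingestion, would still have to be measured on the sandbox.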

Bogdan Irimie (Jira)

Aug 25, 2021, 4:05:03 AM
to puppe...@googlegroups.com

Bogdan Irimie (Jira)

Aug 25, 2021, 4:06:03 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
We should answer these questions:

* how much space does an index for a fact take (sandbox size on disk and index size obtained with an SQL query)?
* how much space do 10 indexes take?
* is there any performance decrease on ingestion?
* can we add indexes on nested facts?

Bogdan Irimie (Jira)

Aug 25, 2021, 4:07:03 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Sprint: ghost-08.09.2021

Andrei Filipovici (Jira)

Aug 25, 2021, 5:53:03 AM
to puppe...@googlegroups.com

Andrei Filipovici (Jira)

Aug 25, 2021, 8:17:02 AM
to puppe...@googlegroups.com
Andrei Filipovici commented on Task PDB-5252
 
Re: Determine trigram index on facts size impact

Ran the tests on the n1 server against a sandbox with 100k nodes.
The initial sandbox size was 23 GB / 22804 MB / 23911510356 bytes.

The total index size on the factsets table was 95 MB / 99500032 bytes. We got it by running this SQL query:
SELECT pg_size_pretty (pg_indexes_size('factsets'));
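
To get the exact byte count next to the rounded figure, and to see how the index total compares with the table itself, the same family of size functions can be combined in one statement. This is a minimal sketch; the column aliases are arbitrary.

```sql
SELECT pg_indexes_size('factsets')                        AS index_bytes, -- all indexes on the table, in bytes
       pg_size_pretty(pg_indexes_size('factsets'))        AS index_size,  -- same value, human readable
       pg_size_pretty(pg_table_size('factsets'))          AS table_size,  -- table data including TOAST, no indexes
       pg_size_pretty(pg_total_relation_size('factsets')) AS total_size;  -- table plus indexes
```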

Individual index sizes were:

 relation                         | size
----------------------------------+------------
 public.idx_factsets_jsonb_merged | 81 MB
 public.factsets                  | 17 MB (don't know what that is)
 public.factsets_hash_expr_idx    | 6648 kB
 public.factsets_certname_idx     | 3104 kB
 public.factsets_pkey             | 2208 kB
 public.idx_factsets_prod         | 2208 kB
 public.factsets_id_seq           | 8192 bytes (don't know what that is)

We got this by running:
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND relname LIKE '%' || 'factsets' || '%'
ORDER BY pg_relation_size(C.oid) DESC;
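
A note on the two unexplained rows: pg_class lists every relation, not only indexes, so a name filter on 'factsets' also matches the table itself (public.factsets) and the sequence public.factsets_id_seq. A variant of the query that prints pg_class.relkind makes this visible. This is a sketch; the "kind" column and its labels are additions for illustration.

```sql
SELECT N.nspname || '.' || C.relname AS relation,
       CASE C.relkind
            WHEN 'r' THEN 'table'
            WHEN 'i' THEN 'index'
            WHEN 'S' THEN 'sequence'
            ELSE C.relkind::text      -- any other relation kind, shown as its raw code
       END AS kind,
       pg_size_pretty(pg_relation_size(C.oid)) AS size
FROM pg_class C
JOIN pg_namespace N ON N.oid = C.relnamespace
WHERE N.nspname NOT IN ('pg_catalog', 'information_schema')
  AND C.relname LIKE '%factsets%'
ORDER BY pg_relation_size(C.oid) DESC;
```

Adding AND C.relkind = 'i' to the WHERE clause would restrict the output to indexes only.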

Andrei Filipovici (Jira)

Aug 25, 2021, 10:23:06 AM
to puppe...@googlegroups.com

After running the command ingestion benchmark, the sandbox grew to 25 GB. I ran a VACUUM FULL and the size went down to 22869 MB / 23978720946 bytes.

After creating the index on the operating system fact, the sandbox size went up to 22873 MB / 23983677106 bytes. That is an increase of 4956160 bytes (about 4.7 MB; 4 MB going by the rounded figures).
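
The size difference can be checked in psql from the two byte counts above. This is plain arithmetic on the reported numbers and needs no access to the sandbox.

```sql
-- Sandbox size after creating the index minus size after VACUUM FULL.
SELECT 23983677106 - 23978720946                 AS delta_bytes,   -- 4956160
       pg_size_pretty(23983677106 - 23978720946) AS delta_pretty;  -- 4840 kB
```

The result, 4840 kB, equals the size reported for public.factsets_operating_system_idx in the listing in this comment, so the growth of the sandbox is accounted for by the new index.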

The total index size on the factsets table was 100 MB / 104816640 bytes.

Individual index sizes were:

 relation                             | size
--------------------------------------+------------
 public.idx_factsets_jsonb_merged     | 81 MB
 public.factsets                      | 30 MB
 public.factsets_hash_expr_idx        | 6648 kB
 public.factsets_operating_system_idx | 4840 kB
 public.factsets_certname_idx         | 3104 kB
 public.idx_factsets_prod             | 2208 kB
 public.factsets_pkey                 | 2208 kB
 public.factsets_id_seq               | 8192 bytes

The command ingestion results:

Processing-seconds: {
"OneMinuteRate":309.6056417527956,
"MeanRate":221.83511475603126,
"FifteenMinuteRate":32.04613388354146,
"Max":63.091493,
"50thPercentile":8.552764999999999,
"Mean":10.05392052990784,
"DurationUnit":"milliseconds",
"95thPercentile":36.686657,
"99thPercentile":52.893364,
"98thPercentile":47.307950999999996,
"Min":1.315396,
"999thPercentile":61.004227,
"RateUnit":"events/second",
"75thPercentile":10.14032,
"Count":30000,
"StdDev":11.080445464493687,
"FiveMinuteRate":89.20183541650876}
Elapsed-seconds: 49
