Jira (PDB-5377) Try fact-contents improvements on fact-path gc


Rob Browning (Jira)

Nov 17, 2021, 11:53:02 AM
to puppe...@googlegroups.com
Rob Browning created an issue
 
PuppetDB / Improvement PDB-5377
Try fact-contents improvements on fact-path gc
Issue Type: Improvement
Assignee: Unassigned
Components: PuppetDB
Created: 2021/11/17 8:52 AM
Priority: Normal
Reporter: Rob Browning

See if we can improve the performance of the fact-paths gc by reworking its query along the lines of the changes we're making to fact-contents (PDB-5259). For example, try reworking the traversal to avoid duplicating the factset subtree at every level of the descent.
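
As a rough illustration of the idea only (this is not PuppetDB's SQL, and every name here is hypothetical), the Python sketch below walks nested fact data while keeping references to the original nodes on an explicit stack, so no subtree is copied during the descent. It then treats any stored path that no factset reaches as a gc candidate. It assumes facts are nested dicts and lists with scalar leaves, and that a path is a tuple of keys and array indexes.

```python
from typing import Any, Iterable, Iterator, Set, Tuple

Path = Tuple[Any, ...]


def leaf_paths(facts: Any) -> Iterator[Path]:
    """Yield the path to every scalar leaf in a nested fact structure."""
    # Each stack entry pairs a path with a reference to the node it names.
    # Only references are pushed, so no subtree is copied per level.
    stack = [((), facts)]
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            for key, child in node.items():
                stack.append((path + (key,), child))
        elif isinstance(node, list):
            for index, child in enumerate(node):
                stack.append((path + (index,), child))
        else:
            yield path


def unreferenced_paths(stored: Iterable[Path], factsets: Iterable[Any]) -> Set[Path]:
    """Return stored paths that no current factset reaches (gc candidates)."""
    live: Set[Path] = set()
    for facts in factsets:
        live.update(leaf_paths(facts))
    return set(stored) - live


# Hypothetical sample data, for illustration only.
factsets = [
    {"os": {"family": "Debian", "release": {"major": "11"}},
     "mounts": ["/", "/home"]},
]
stored = {
    ("os", "family"),
    ("os", "release", "major"),
    ("mounts", 0),
    ("mounts", 1),
    ("os", "arch"),  # no factset contains this path any more
}
stale = unreferenced_paths(stored, factsets)
```

In this sketch empty maps and arrays contribute no paths. Whether the real query should treat them that way is a separate question that the issue does not address.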

This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)

Rob Browning (Jira)

Nov 17, 2021, 11:53:02 AM
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Sprint: HAHA/Grooming

Austin Blatt (Jira)

Jan 5, 2022, 2:33:01 PM
to puppe...@googlegroups.com
Austin Blatt updated an issue
Change By: Austin Blatt
Sprint: HAHA/Grooming → HA 2022-01-19

Austin Blatt (Jira)

Jan 5, 2022, 2:33:01 PM
to puppe...@googlegroups.com

Rob Browning (Jira)

Jan 11, 2022, 12:49:01 PM
to puppe...@googlegroups.com

David McTavish (Jira)

Jan 12, 2022, 4:59:02 PM
to puppe...@googlegroups.com

Rob Browning (Jira)

Jan 19, 2022, 1:39:02 PM
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Fix Version/s: PDB n/a

Rob Browning (Jira)

Jan 19, 2022, 1:39:02 PM
to puppe...@googlegroups.com
Rob Browning commented on Improvement PDB-5377
 
Re: Try fact-contents improvements on fact-path gc

The gc query was already similar to, though not the same as, the newer fact-contents query. In testing on a fast machine, the existing gc query was roughly twice as fast as a version of the fact-contents query that had been adjusted to track only the relevant data.

For context, though, that test ran on a very fast NVMe drive. It ran against 100k nodes, but they were benchmark-generated and not rewritten, so that table was presumably very compact, etc.

Since we've seen that gc query take 20+ minutes at client sites with fewer nodes, we'd recommend working with CS as a next step to gather relevant information from a collection of notable sites: both data for which we'd provide the collection tools, and the coincident support script (i.e. sar) data.

Rob Browning (Jira)

Jan 19, 2022, 1:39:03 PM
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Release Notes: Not Needed