Jira (PDB-5532) Enterprise Grade - Puppetdb

Issue Type:	Epic
Assignee:	Unassigned
Created:	2022/08/29 12:50 PM
Due Date:	2023/01/01
Priority:	Major
Reporter:	Micah Wilson

Increase the scale of nodes in puppet storage to match large, enterprise customer needs.

Good = 75k nodes

Better = 150k nodes

Best = 250k nodes

HOW?

STORE LESS STUFF:

Remove Unchanged Resources
Only generate and store log/events on failed runs

MODIFY PUPPETDB TO INCLUDE AN IN-MEMORY QUERY CACHE:

Transparent by default but can be modified/disabled if/when necessary.

IMPROVE FACT PATHS

Do it less often
Improve query performance

Reference Doc: https://docs.google.com/document/d/1B4LKhV3GWcMSxPrKqGU6wo0qKi-EguCVSdeq_sXzcMI/edit#

This message was sent by Atlassian Jira (v8.20.11#820011-sha1:0629dd8)

Micah Wilson (Jira)

Aug 31, 2022, 2:02:03 PM

to puppe...@googlegroups.com

Micah Wilson updated an issue

PuppetDB /

Change By:	Micah Wilson

Increase the scale of nodes in puppet storage to match large, enterprise customer needs.

Good = 75k nodes

Better = 150k nodes

Best = 250k nodes

HOW?

*STORE LESS STUFF:*

1. Remove Unchanged Resources - *TSHIRT SIZE (S)*

_Per Rob, this is already available in open source. In PE, the default setting collects both changed and unchanged resources (redundant data); we can change that config setting to collect only changed resources. We will also need to write new tests and adjust existing ones, and make changes in puppetdb and extensions. (About 1-2 sprints.)_

2. Only generate and store log/events on failed runs (Agent work?)

*MODIFY PUPPETDB TO INCLUDE AN IN-MEMORY QUERY CACHE:*

1. Transparent by default but can be modified/disabled if/when necessary.
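
As a rough illustration of what "transparent by default, but can be modified/disabled" could mean, the sketch below wraps a query function in a small time-limited in-memory cache. Everything in it is hypothetical: `QueryCache`, `ttl_seconds`, and `enabled` are illustrative names, not PuppetDB settings or APIs, and the real design may differ substantially.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class QueryCache:
    """Hypothetical in-memory cache placed in front of a query function."""

    def __init__(
        self,
        run_query: Callable[[Hashable], Any],
        ttl_seconds: float = 30.0,   # assumed tunable: how long a result stays fresh
        enabled: bool = True,        # assumed switch: caching is on unless disabled
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self.enabled = enabled
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def query(self, key: Hashable) -> Any:
        # When disabled, every call goes straight to the backing store.
        if not self.enabled:
            return self._run_query(key)

        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: serve from memory

        # Missing or expired: run the query and remember the result.
        result = self._run_query(key)
        self._entries[key] = (now, result)
        return result
```

Callers use `cache.query(...)` exactly as they would the underlying query function, which is the sense in which such a cache is transparent; setting `enabled = False` bypasses it entirely.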

*IMPROVE FACT PATHS - TSHIRT SIZE (M)*

1. Do it less often

2. Improve query performance

_Proposed solution: Make fact path garbage collection configurable independently of everything else, and change the default from once every hour to once every 24 hours. This involves untangling/refactoring, plus writing and updating tests. (2-3 sprints)_
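
The effect of the proposed default change can be sketched with a little arithmetic. The example below assumes each garbage collection task carries its own interval; `GcTask` and `runs_per_day` are hypothetical names used only for illustration, not part of PuppetDB.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class GcTask:
    """Hypothetical GC task with its own, independently configurable interval."""
    name: str
    interval: timedelta
    last_run: datetime

    def due(self, now: datetime) -> bool:
        # A task is due once a full interval has elapsed since its last run.
        return now - self.last_run >= self.interval


def runs_per_day(interval: timedelta) -> float:
    """How many times a task with this interval runs in 24 hours."""
    return timedelta(days=1) / interval


start = datetime(2022, 8, 31, 0, 0)
hourly = GcTask("fact-paths (current default)", timedelta(hours=1), start)
daily = GcTask("fact-paths (proposed default)", timedelta(hours=24), start)

# Moving from hourly to daily cuts fact path GC runs from 24 to 1 per day.
reduction = runs_per_day(hourly.interval) / runs_per_day(daily.interval)
```

Because each task holds its own interval, the fact path task can move to a 24-hour schedule without changing how often any other cleanup runs.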

Reference Doc: [ https://docs.google.com/document/d/1B4LKhV3GWcMSxPrKqGU6wo0qKi-EguCVSdeq_sXzcMI/edit# ]

productboard (Jira)

Aug 31, 2022, 4:55:01 PM

to puppe...@googlegroups.com

productboard updated an issue

PuppetDB /

Change By:	productboard
productboard URL:	https://puppet.productboard.com/feature-board/planning/features/15146803

productboard (Jira)

Aug 31, 2022, 4:55:02 PM

to puppe...@googlegroups.com

productboard updated an issue

PuppetDB /

Change By:	productboard

Increase the scale of nodes in puppet storage to match large, enterprise customer needs.

Good = 75k nodes

Better = 150k nodes

Best = 250k nodes

HOW?

*STORE LESS STUFF*

1. Remove Unchanged Resources - *TSHIRT SIZE (S)*

_Per Rob, this is already available in open source. In PE, the default setting collects both changed and unchanged resources (redundant data); we can change that config setting to collect only changed resources. We will also need to write new tests and adjust existing ones, and make changes in puppetdb and extensions. (About 1-2 sprints.)_

2. Only generate and store log/events on failed runs (Agent work?)

*MODIFY PUPPETDB TO INCLUDE AN IN-MEMORY QUERY CACHE*

1. Transparent by default but can be modified/disabled if/when necessary.

*IMPROVE FACT PATHS - TSHIRT SIZE (M)*

1. Do it less often

2. Improve query performance

_Proposed solution: Make fact path garbage collection configurable independently of everything else, and change the default from once every hour to once every 24 hours. This involves untangling/refactoring, plus writing and updating tests. (2-3 sprints)_

Reference Doc: [https://docs.google.com/document/d/1B4LKhV3GWcMSxPrKqGU6wo0qKi-EguCVSdeq_sXzcMI/edit#]

productboard (Jira)

Aug 31, 2022, 5:55:02 PM

to puppe...@googlegroups.com

productboard updated an issue

PuppetDB /