Jira (PDB-5593) Generating report data

Issue Type:	New Feature
Assignee:	Unassigned
Created:	2023/02/13 9:25 AM
Priority:	Normal
Reporter:	Austin Blatt

Events per report

This message was sent by Atlassian Jira (v8.20.11#820011-sha1:0629dd8)

Joshua Partlow (Jira)

Mar 1, 2023, 2:17:02 PM

to puppe...@googlegroups.com

Joshua Partlow updated an issue

PuppetDB /

Change By:	Joshua Partlow
Story Points:	1

Cas Donoghue (Jira)

Mar 9, 2023, 2:17:02 PM

to puppe...@googlegroups.com

Cas Donoghue commented on

Re: Generating report data

Joshua Partlow will work with Austin Blatt to get this filled out with more details.

Joshua Partlow (Jira)

Apr 26, 2023, 1:55:01 PM

to puppe...@googlegroups.com

Joshua Partlow updated an issue

PuppetDB /

Change By:	Joshua Partlow

* logs (stored as a blob)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** containment_path (corresponds to depth of resource in catalog graph)
** events array (stored as resource_events)
*** most variable
**** old_value
**** new_value
**** message
*** ever more than 1 event?

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Joshua Partlow (Jira)

May 10, 2023, 1:53:01 PM

to puppe...@googlegroups.com

Joshua Partlow updated an issue

PuppetDB /

Change By:	Joshua Partlow

* logs (stored as a blob)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** containment_path (corresponds to depth of resource in catalog graph)
** events array (stored as resource_events)
*** most variable
**** old_value
**** new_value
**** message
*** ever more than 1 event?

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Cas Donoghue (Jira)

May 10, 2023, 2:34:01 PM

to puppe...@googlegroups.com

Cas Donoghue updated an issue

PuppetDB /

Change By:	Cas Donoghue
Sprint:	Skeletor 05/24/2023

Joshua Partlow (Jira)

May 10, 2023, 4:21:03 PM

to puppe...@googlegroups.com

Joshua Partlow assigned an issue to Joshua Partlow

PuppetDB /

Change By:	Joshua Partlow
Assignee:	Joshua Partlow

Joshua Partlow (Jira)

May 11, 2023, 4:29:02 PM

to puppe...@googlegroups.com

Joshua Partlow updated an issue

PuppetDB /

Change By:	Joshua Partlow

* logs (stored as a blob)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** containment_path (corresponds to depth of resource in catalog graph)
** events array (stored as resource_events)
*** most variable
**** old_value
**** new_value
**** message
*** one event per resource property change (file content, owner, and mode, for instance, would be three events), but typically just one

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Joshua Partlow (Jira)

May 11, 2023, 6:46:04 PM

to puppe...@googlegroups.com

Joshua Partlow updated an issue

PuppetDB /

Change By:	Joshua Partlow

* logs (stored as a blob in reports.logs)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob in reports.metrics)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** stored as a blob (reports.resources)
** but changed resource properties stored in resource_events
*** so each resource events array has one event per resource property change (file content, owner, and mode, for instance, would be three events), but typically just one
**** containment_path (corresponds to depth of resource in catalog graph)
**** most variable
***** old_value
***** new_value
***** message

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Joshua Partlow (Jira)

May 12, 2023, 1:16:03 PM

to puppe...@googlegroups.com

Joshua Partlow updated an issue

PuppetDB /

Change By:	Joshua Partlow

* logs (stored as a blob in reports.logs)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
*** a PE primary report with --debug, for example, adds ~11000 log entries and 5 MB to the report
* metrics (stored as a blob in reports.metrics)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** stored as a blob (reports.resources)
** but changed resource properties stored in resource_events
*** so each resource events array has one event per resource property change (file content, owner, and mode, for instance, would be three events), but typically just one
**** containment_path (corresponds to depth of resource in catalog graph)
**** most variable
***** old_value
***** new_value
***** message
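
For reference, the outline above could be modeled roughly as follows. This is only a sketch in Python for illustration: the class names and the `resource_type`, `resource_title`, and `prop` fields are assumptions that do not come from the notes, and the sample values are made up.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, List


@dataclass
class Event:
    """One changed property on a resource (most variable part of a report)."""
    prop: str          # hypothetical name for the changed property
    old_value: Any
    new_value: Any
    message: str


@dataclass
class Resource:
    resource_type: str            # hypothetical field name
    resource_title: str           # hypothetical field name
    containment_path: List[str]   # length tracks depth in the catalog graph
    events: List[Event] = field(default_factory=list)


@dataclass
class LogEntry:
    tags: List[str]
    message: str


@dataclass
class Metric:
    name: str
    value: float
    category: str


@dataclass
class Report:
    logs: List[LogEntry]
    metrics: List[Metric]
    resources: List[Resource]


# Made-up sample: one file resource with a single content change.
report = Report(
    logs=[LogEntry(tags=["notice"], message="Applied catalog")],
    metrics=[Metric(name="total", value=1, category="resources")],
    resources=[
        Resource(
            resource_type="File",
            resource_title="/etc/motd",
            containment_path=["Stage[main]", "Motd", "File[/etc/motd]"],
            events=[Event("content", "{md5}aaa", "{md5}bbb", "content changed")],
        )
    ],
)

# Plain nested dicts/lists, e.g. for JSON serialization.
report_dict = asdict(report)
```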

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.
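
A rough sketch of deriving several report variants from one generated catalog, assuming a catalog is available as a list of resources with a type and title. The function name, the `change_ratio` and `seed` parameters, the placeholder containment path, and the event values are all hypothetical; this is not how generate or benchmark currently work.

```python
import random

# Property names borrowed from the file example in the notes above.
PROPS = ["content", "owner", "mode"]


def report_resources_from_catalog(catalog_resources, change_ratio=0.1, seed=0):
    """Build report resources from catalog resources (hypothetical helper).

    Each resource is marked changed with probability change_ratio. A changed
    resource typically gets one event, occasionally two or three.
    """
    rng = random.Random(seed)  # seeded so each variant is reproducible
    resources = []
    for res in catalog_resources:
        ref = f"{res['type']}[{res['title']}]"
        events = []
        if rng.random() < change_ratio:
            count = rng.choice([1, 1, 1, 2, 3])  # weighted toward one event
            for prop in PROPS[:count]:
                events.append({
                    "property": prop,
                    "old_value": f"old-{prop}",
                    "new_value": f"new-{prop}",
                    "message": f"{prop} changed 'old-{prop}' to 'new-{prop}'",
                })
        resources.append({
            "type": res["type"],
            "title": res["title"],
            # Placeholder path; a real one would follow the catalog graph.
            "containment_path": ["Stage[main]", ref],
            "events": events,
        })
    return resources


# Three variants of the same made-up catalog, differing only by seed.
catalog = [{"type": "File", "title": f"/tmp/file{i}"} for i in range(50)]
variants = [
    report_resources_from_catalog(catalog, change_ratio=0.2, seed=s)
    for s in range(3)
]
```

Varying only the seed keeps the resource set identical across variants while changing which resources carry events, which matches the idea of a report as a snapshot of change in time.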

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?
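
If the benchmark side ends up handling this, one option is to generate reports with all resources and drop the unchanged ones at submission time. A minimal sketch, assuming reports are plain dicts whose resources each carry an `events` list; the function and the `include_unchanged` parameter are hypothetical names, not an existing flag.

```python
def apply_unchanged_setting(report, include_unchanged):
    """Return the report as-is, or a copy without event-less resources."""
    if include_unchanged:
        return report
    trimmed = dict(report)  # shallow copy; the input report is not modified
    trimmed["resources"] = [r for r in report["resources"] if r["events"]]
    return trimmed


# Made-up report with one changed and one unchanged resource.
sample_report = {
    "resources": [
        {"title": "/etc/motd", "events": [{"property": "content"}]},
        {"title": "/etc/hosts", "events": []},
    ],
}

with_unchanged = apply_unchanged_setting(sample_report, include_unchanged=True)
without_unchanged = apply_unchanged_setting(sample_report, include_unchanged=False)
```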

Joshua Partlow (Jira)

May 12, 2023, 6:28:03 PM

to puppe...@googlegroups.com

Joshua Partlow updated an issue

PuppetDB /