Jira (PDB-5593) Generating report data


Austin Blatt (Jira)

Feb 13, 2023, 12:26:02 PM
to puppe...@googlegroups.com
Austin Blatt created an issue
 
PuppetDB / New Feature PDB-5593
Generating report data
Issue Type: New Feature
Assignee: Unassigned
Created: 2023/02/13 9:25 AM
Priority: Normal
Reporter: Austin Blatt
  • Events per report
 
This message was sent by Atlassian Jira (v8.20.11#820011-sha1:0629dd8)

Joshua Partlow (Jira)

Mar 1, 2023, 2:17:02 PM
to puppe...@googlegroups.com
Joshua Partlow updated an issue
Change By: Joshua Partlow
Story Points: 1

Cas Donoghue (Jira)

Mar 9, 2023, 2:17:02 PM
to puppe...@googlegroups.com
Cas Donoghue commented on New Feature PDB-5593
 
Re: Generating report data

Joshua Partlow will work with Austin Blatt to get this filled out with more details. 

Joshua Partlow (Jira)

Apr 26, 2023, 1:55:01 PM
to puppe...@googlegroups.com
Joshua Partlow updated an issue
* logs (stored as a blob)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** containment_path (corresponds to depth of resource in catalog graph)
** events array (stored as resource_events)
*** most variable
**** old_value
**** new_value
**** message
*** ever more than 1 event?

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. So does generate deal with that, or should benchmark mutate reports with that flag on or off?

Joshua Partlow (Jira)

May 10, 2023, 1:53:01 PM
to puppe...@googlegroups.com
Joshua Partlow updated an issue
* logs (stored as a blob)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** containment_path (corresponds to depth of resource in catalog graph)
** events array (stored as resource_events)
*** most variable
**** old_value
**** new_value
**** message
*** ever more than 1 event?

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Cas Donoghue (Jira)

May 10, 2023, 2:34:01 PM
to puppe...@googlegroups.com
Cas Donoghue updated an issue
Change By: Cas Donoghue
Sprint: Skeletor 05/24/2023

Joshua Partlow (Jira)

May 10, 2023, 4:21:03 PM
to puppe...@googlegroups.com
Joshua Partlow assigned an issue to Joshua Partlow
Change By: Joshua Partlow
Assignee: Joshua Partlow

Joshua Partlow (Jira)

May 11, 2023, 4:29:02 PM
to puppe...@googlegroups.com
Joshua Partlow updated an issue
* logs (stored as a blob)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** containment_path (corresponds to depth of resource in catalog graph)
** events array (stored as resource_events)
*** most variable
**** old_value
**** new_value
**** message
*** one event per resource property change (file content, owner, and mode, for instance, would be three events), but typically just one

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Joshua Partlow (Jira)

May 11, 2023, 6:46:04 PM
to puppe...@googlegroups.com
Joshua Partlow updated an issue
* logs (stored as a blob in reports.logs)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
* metrics (stored as a blob in reports.metrics)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** stored as a blob (reports.resources)
** but changed resource properties stored in resource_events
*** so each resource events array has one event per resource property change (file content, owner, and mode, for instance, would be three events), but typically just one
**** containment_path (corresponds to depth of resource in catalog graph)
**** most variable
***** old_value
***** new_value
***** message

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Joshua Partlow (Jira)

May 12, 2023, 1:16:03 PM
to puppe...@googlegroups.com
Joshua Partlow updated an issue
* logs (stored as a blob in reports.logs)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
*** a PE primary report with --debug, for example, adds ~11000 log entries and 5MB to the report.
* metrics (stored as a blob in reports.metrics)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** stored as a blob (reports.resources)
** but changed resource properties stored in resource_events
*** so each resource events array has one event per resource property change (file content, owner, and mode, for instance, would be three events), but typically just one
**** containment_path (corresponds to depth of resource in catalog graph)
**** most variable
***** old_value
***** new_value
***** message

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?

Joshua Partlow (Jira)

May 12, 2023, 6:28:03 PM
to puppe...@googlegroups.com
Joshua Partlow updated an issue
* logs (stored as a blob in reports.logs)
** variable
*** tags array
*** message
** plugin lines
** could have debug lines
*** a PE primary report with --debug, for example, adds ~11000 log entries and 5MB to the report.
* metrics (stored as a blob in reports.metrics)
** elements are simple structure (hash of name, value, category)
** count of metrics somewhat variable based on type?
* resources
** stored as a blob (reports.resources)
** but changed resource properties stored in resource_events
*** so each resource events array has one event per resource property change (file content, owner, and mode, for instance, would be three events), but typically just one
**** containment_path (corresponds to depth of resource in catalog graph)
**** most variable
***** old_value
***** new_value
***** message
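
To make the outline above concrete, here is a rough Python sketch of one report shaped the way the list describes it. Only the names tags, message, name, value, category, containment_path, events, old_value and new_value come from the notes; the "property", "type" and "title" keys, the helper name and all sample values are assumptions made for illustration and are not taken from the PuppetDB wire format.

```python
import json


def property_events(changes):
    """Return one event per changed property (hypothetical helper).

    `changes` maps a property name to an (old_value, new_value) pair.
    """
    return [
        {
            "property": prop,  # assumed key, not named in the notes
            "old_value": old,
            "new_value": new,
            "message": f"{prop} changed '{old}' to '{new}'",
        }
        for prop, (old, new) in changes.items()
    ]


report = {
    # logs: variable-length entries, each with a tags array and a message
    "logs": [
        {"tags": ["notice", "file"], "message": "content changed"},
    ],
    # metrics: simple hashes of name, value, category
    "metrics": [
        {"name": "changed", "value": 1, "category": "resources"},
    ],
    # resources: each carries a containment_path and an events array
    "resources": [
        {
            "type": "File",            # assumed key
            "title": "/etc/motd",      # assumed key
            "containment_path": ["Stage[main]", "Motd", "File[/etc/motd]"],
            "events": property_events({
                "content": ("{md5}aaa", "{md5}bbb"),
                "owner": ("nobody", "root"),
                "mode": ("0600", "0644"),
            }),
        },
    ],
}

events = report["resources"][0]["events"]
print(len(events), "events for one resource")
print(len(json.dumps(report)), "bytes when serialized")
```

A file resource whose content, owner and mode all change yields three events here, matching the case called out in the list; most resources would carry zero or one.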

Probably simplest to generate reports from the generated catalogs since that provides us with resources and structure already. Also makes the sample dataset more cohesive.

How many variations to build per catalog? Since a report is a snapshot of change in time, could build several variants. Need to check and see how benchmark varies reports. Benchmark varies timestamps, but not events.

Also need to think about unchanged resources. Does generate deal with that, or should benchmark mutate reports with that flag on or off?
Will add a flag to generate for now.
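
A minimal sketch of the idea, assuming a catalog is available as a plain list of resource dicts. The function name, its parameters (changed_fraction, include_unchanged, seed) and the dict keys other than containment_path, events, old_value, new_value and message are hypothetical and do not describe the actual generate command; this only illustrates deriving report variants from a catalog and toggling unchanged resources with a flag.

```python
import random


def generate_report(catalog_resources, changed_fraction=0.2,
                    include_unchanged=True, seed=0):
    """Build one synthetic report from a catalog's resource list.

    Each resource is marked changed with probability `changed_fraction`.
    Changed resources get a single event (the typical case); unchanged
    resources are kept only when `include_unchanged` is true.
    """
    rng = random.Random(seed)  # a different seed gives a different variant
    resources = []
    for res in catalog_resources:
        changed = rng.random() < changed_fraction
        events = []
        if changed:
            events.append({
                "old_value": "absent",
                "new_value": "present",
                "message": f"{res['type']}[{res['title']}] changed "
                           "'absent' to 'present'",
            })
        if changed or include_unchanged:
            resources.append({
                "type": res["type"],
                "title": res["title"],
                "containment_path": res["containment_path"],
                "events": events,
            })
    return {"resources": resources}


# A toy catalog standing in for a generated one.
catalog = [
    {
        "type": "File",
        "title": f"/tmp/f{i}",
        "containment_path": ["Stage[main]", "Example", f"File[/tmp/f{i}]"],
    }
    for i in range(10)
]

full = generate_report(catalog, include_unchanged=True, seed=1)
trimmed = generate_report(catalog, include_unchanged=False, seed=1)
print(len(full["resources"]), len(trimmed["resources"]))
```

Because the random draw happens once per resource regardless of the flag, the same seed marks the same resources as changed in both calls, so the flag only controls whether the unchanged ones are kept.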

Cas Donoghue (Jira)

May 24, 2023, 2:14:01 PM
to puppe...@googlegroups.com
Cas Donoghue updated an issue
Change By: Cas Donoghue
Sprint: Skeletor 05/24/2023, Skeletor 06/07/2023

Joshua Partlow (Jira)

Jun 6, 2023, 1:16:01 PM
to puppe...@googlegroups.com
Joshua Partlow commented on New Feature PDB-5593
 
Re: Generating report data

Pez had failed, I think due to the openssl issue breaking cast for postgres. Rekicked now that that is resolved, to get this into PE main.

Rob Browning (Jira)

Jun 12, 2023, 6:56:02 PM
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Fix Version/s: PDB 8.0.1

Claudia Petty (Jira)

Jun 21, 2023, 10:56:08 AM
to puppe...@googlegroups.com
Claudia Petty updated an issue
Change By: Claudia Petty
Labels: new-feature