Jira (PDB-5590) Generating scale testing data


Austin Blatt (Jira)

Feb 13, 2023, 12:23:01 PM
to puppe...@googlegroups.com
Austin Blatt created an issue
 
PuppetDB / Epic PDB-5590
Generating scale testing data
Issue Type: Epic
Assignee: Unassigned
Created: 2023/02/13 9:22 AM
Priority: Normal
Reporter: Austin Blatt

For an actual scale test we can use the existing benchmark tool to submit arbitrary numbers of commands to a PDB, but it operates by reading a set of "example" commands and making minor modifications to them over time. The existing data was generated on an employee's laptop years ago and is not representative of "real" data, and we don't know what is.

In order to test PuppetDB at scale, we need better data. The current example
data is very sparse: it was either generated on an employee's laptop many years
ago with very little total data, or it came from a customer but is very old.

Why do we need to write code to generate the data rather than just create it
once? We don't know what data structure is "representative" of customers' use
cases, and it likely isn't one single structure. I think we have about 1,000
customers, so we probably have 1,001 different usage profiles. By writing code
that generates the data from a set of parameters, we can change our data set as
we move forward in time. Out of scope, but we could even instrument the same
metrics in PE and potentially use them to replicate issues a customer is seeing
without having to ask them to run things on our behalf to diagnose issues.
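To illustrate the parameterized approach, here is a minimal sketch of a generator that produces synthetic factset-style payloads from a handful of tunable parameters (node count, facts per node, random seed). All function and field names here are hypothetical, not part of the existing benchmark tool; the real generator would need to match PuppetDB's actual command wire formats.

```python
import json
import random
import string

def random_word(rng, length=8):
    """Produce a random lowercase word as a stand-in fact name or value."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def generate_factset(rng, certname, num_facts):
    """Build one synthetic factset payload for a single node."""
    facts = {random_word(rng): random_word(rng, 16) for _ in range(num_facts)}
    return {"certname": certname, "values": facts}

def generate_fleet(num_nodes=10, facts_per_node=50, seed=42):
    """Generate a whole fleet from parameters; varying the parameters
    approximates different customer usage profiles, and the fixed seed
    makes a given profile reproducible across runs."""
    rng = random.Random(seed)
    return [
        generate_factset(rng, f"node-{i}.example.com", facts_per_node)
        for i in range(num_nodes)
    ]

if __name__ == "__main__":
    fleet = generate_fleet(num_nodes=3, facts_per_node=5)
    print(json.dumps(fleet[0], indent=2))
```

The point of the parameterization is that a "profile" becomes a small tuple of numbers (and eventually distributions) rather than a frozen data dump, so the data set can be regenerated or reshaped as our understanding of customer workloads changes.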

This message was sent by Atlassian Jira (v8.20.11#820011-sha1:0629dd8)

David Piekny (Jira)

Feb 15, 2023, 5:46:03 PM
to puppe...@googlegroups.com
David Piekny updated an issue
Change By: David Piekny
Labels: 23Q1 enterprise-scalability phase_1