This should help replica provisioning.

To limit startup time while providing enough consistency for PuppetDB replicas, we need to limit the amount of data that is transferred.
Currently we care about syncing:
* Catalogs
* Factsets
* Reports
* Node deactivation
* Catalog Inputs (for cd4pe)
By far the two largest sets of data are reports and catalog inputs, so I think that initial sync should be limited to the following to ensure a reasonably fast startup time:
* Catalogs
* Factsets
* Node deactivation

The most commonly suggested modification to this list is to also sync _only_ the latest reports. I haven't heard a compelling reason to spend startup time syncing the latest report for a replica PuppetDB, because the entire time PDB spends in startup sync it is drifting out of alignment. If someone has a good reason that a replica PuppetDB should have the latest reports when it starts up, I would be happy to add that to the list of things to sync.
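To make the proposal concrete, here is a minimal sketch of how the entity selection could be expressed. It is written in Python purely for illustration; the names `ALL_SYNC_ENTITIES`, `INITIAL_SYNC_ENTITIES`, and `entities_for_sync` are hypothetical and do not correspond to PuppetDB's actual sync code or configuration.

```python
# Hypothetical sketch: these names are illustrative only and are not
# part of PuppetDB's real sync implementation.

# Everything we currently care about syncing, in a stable order.
ALL_SYNC_ENTITIES = (
    "catalogs",
    "factsets",
    "reports",
    "node_deactivation",
    "catalog_inputs",
)

# Subset proposed for the initial (startup) sync. Reports and catalog
# inputs are left out because they are by far the largest data sets.
INITIAL_SYNC_ENTITIES = frozenset({"catalogs", "factsets", "node_deactivation"})


def entities_for_sync(initial: bool) -> list[str]:
    """Return the entity types to transfer for this sync pass."""
    if initial:
        return [e for e in ALL_SYNC_ENTITIES if e in INITIAL_SYNC_ENTITIES]
    return list(ALL_SYNC_ENTITIES)
```

If a good reason for syncing the latest reports at startup does come up, the change in this sketch would amount to adding one more entry to the initial set.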