Jira (PDB-4444) PuppetDB never finishes migrating resource_events


Luke Bigum (JIRA)

Jun 26, 2019, 5:59:03 AM
to puppe...@googlegroups.com
Luke Bigum created an issue
 
PuppetDB / Bug PDB-4444
PuppetDB never finishes migrating resource_events
Issue Type: Bug
Assignee: Unassigned
Components: PuppetDB
Created: 2019/06/26 2:58 AM
Priority: Normal
Reporter: Luke Bigum

In our environment, a PuppetDB upgrade never completes: any schema migration that touches `resource_events` takes too long (over 12 hours). The PuppetDB JVM either crashes with an OutOfMemoryError, or I give up, kill it, truncate `resource_events`, and start it again.

This is the migration query that is running:

INSERT INTO resource_events_transform
  ( new_value, corrective_change, property, file, report_id, event_hash,
    old_value, containing_class, certname_id, line, resource_type, status,
    resource_title, timestamp, containment_path, message )
VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16 )

I'm not sure there's any way to solve this... `resource_events` is by far our largest table, usually around 3-5 million rows. I've already disabled report processing on our Dev infrastructure to limit the number of reports stored.

Any suggestions, or should I make it standard practice to truncate this table before every package upgrade?
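
For what it's worth, the workaround I've been falling back on is a manual truncate with the PuppetDB service stopped, roughly like this; it's a blunt instrument, and it throws away the per-resource event detail for every stored report:

-- run with puppetdb stopped, before the package upgrade; report rows
-- are kept, but all of their resource event detail is lost
TRUNCATE TABLE resource_events;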


Charlie Sharpsteen (JIRA)

Jun 26, 2019, 10:59:03 AM
to puppe...@googlegroups.com
Charlie Sharpsteen commented on Bug PDB-4444
 
Re: PuppetDB never finishes migrating resource_events

Which version of PuppetDB are you starting with, and which version are you upgrading to? That will let us know which migrations are being run on the resource_events table.

Luke Bigum (JIRA)

Jun 28, 2019, 5:11:03 AM
to puppe...@googlegroups.com
Luke Bigum commented on Bug PDB-4444

In March, puppetdb-5.2.2-1.el6 -> puppetdb-6.3.0-1.el6, and then a few days ago puppetdb-6.3.0-1.el6 -> puppetdb-6.3.3-1.el6.

Robert Roland (JIRA)

Aug 13, 2019, 2:58:03 PM
to puppe...@googlegroups.com
Robert Roland commented on Bug PDB-4444

Luke Bigum - can we get some details on this instance of a migration issue?

How long did you let it run? How many rows are in your table? Do you have custom JVM GC settings for PuppetDB? How much RAM and how many CPU cores does your PuppetDB instance have? What sort of bandwidth do you have between the PostgreSQL server and PuppetDB?

Our testing of this migration included an instance with approximately 5 million rows in resource_events and the migration took 45 minutes.
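
If it helps, the row count and on-disk size of the table can be pulled straight from psql with something along these lines (pg_size_pretty and pg_total_relation_size are standard PostgreSQL functions):

-- total rows and total on-disk size (including indexes and TOAST)
select count(*) from resource_events;
select pg_size_pretty(pg_total_relation_size('resource_events'));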

Luke Bigum (JIRA)

Aug 14, 2019, 4:51:03 AM
to puppe...@googlegroups.com
Luke Bigum commented on Bug PDB-4444

For the JVM, we only change the Max Heap:

JAVA_ARGS="-Xmx6g"

The current row count (which is higher than I was expecting):

puppetdb=> select count(*) from resource_events;
 count 
----------
 42973699
(1 row)

The machine itself is a 12-core KVM instance with 40 GB of RAM in total. The PostgreSQL instance is co-located on the same VM, which also runs a Puppet Server instance (but we don't have automatic Puppet runs enabled, so that Puppet Server is mostly idle).

Some other relevant config:

 

# How often (in minutes) to compact the database
# gc-interval = 60
gc-interval = 60
# Number of seconds before any SQL query is considered 'slow'; offending
# queries will not be interrupted, but will be logged at the WARN log level.
log-slow-statements = 10
syntax_pgs = true
node-ttl = 0s
node-purge-ttl = 1d
report-ttl = 14d
conn-max-age = 60
conn-keep-alive = 45
conn-lifetime = 0

 

We don't purge any nodes based on last check-in time, but I do try to purge reports. We can't turn on node-ttl because our Puppet runs happen on a fixed schedule; if we turned on automatic purging, we might expire an active node simply because it hasn't had its scheduled run yet, and that would affect our monitoring, which is derived from PuppetDB. When a machine is decommissioned it is deactivated in PuppetDB manually.

I thought purging reports would be enough to keep `resource_events` small, but it may not be doing what I think it's doing.
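
One quick check, sketched below, would be to see whether anything older than our 14 day report-ttl is still sitting in the table; a non-zero count would mean the report purge isn't cleaning up event rows the way I assumed (the timestamp column here is the event timestamp from the migration above):

-- events older than the configured report-ttl of 14d
select count(*) from resource_events where "timestamp" < now() - interval '14 days';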
