Jira (PDB-4938) Serialize startup sync and garbage collection


Austin Blatt (Jira)

Oct 21, 2020, 4:36:02 PM
to puppe...@googlegroups.com
Austin Blatt created an issue
 
PuppetDB / Bug PDB-4938
Serialize startup sync and garbage collection
Issue Type: Bug
Assignee: Austin Blatt
Created: 2020/10/21 1:35 PM
Priority: Normal
Reporter: Austin Blatt

As observed at customer sites, the initial sync can still find itself in a deadlock with garbage collection. When this happens, PuppetDB remains in maintenance mode for 4 hours until restarted by systemd. The periodic sync is successfully cancelled by garbage collection, so we should use the same logic for initial sync to prevent the deadlock. But because we really do want the initial sync to run and complete, we also need to serialize the initial garbage collection with the initial sync.

If startup sync does not complete before the next run of gc (after gc-interval), it could still be cancelled, but this will be a "best effort" until we can rework the summary query.
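
The ordering described above could be sketched roughly as follows. This is a minimal illustration in Python, not PuppetDB's actual code: the function names (`run_startup`, `periodic_gc`, `example_sync`) are hypothetical, and running the initial sync before the initial gc is an assumption, since the issue only says the two are serialized.

```python
import threading


def run_startup(sync, gc, cancel_sync):
    """Run the initial sync, then the initial gc, one after the other.

    Hypothetical sketch: `sync` takes a threading.Event and returns True
    if it completed, False if it stopped early because the event was set.
    Assumption: sync goes first; the issue does not state the order.
    """
    log = []
    completed = sync(cancel_sync)
    log.append("sync-completed" if completed else "sync-cancelled")
    # The initial gc starts only after the sync has returned, so the two
    # never run concurrently during startup.
    gc()
    log.append("gc-completed")
    return log


def periodic_gc(gc, cancel_sync):
    """A later gc run: ask any in-flight sync to stop, then collect."""
    cancel_sync.set()
    gc()


def example_sync(cancel_sync, batches=3):
    """Stand-in sync that checks for cancellation between batches."""
    for _ in range(batches):
        if cancel_sync.is_set():
            return False
        # A real sync would transfer one batch of records here.
    return True
```

In this sketch the initial sync is only "best effort" in the same sense as the issue: if a later gc run calls `periodic_gc` while the sync is still in flight, the sync sees the cancellation event and returns early.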

This message was sent by Atlassian Jira (v8.5.2#805002-sha1:a66f935)

Austin Blatt (Jira)

Oct 26, 2020, 4:09:03 PM
to puppe...@googlegroups.com
Austin Blatt updated an issue
Change By: Austin Blatt
Fix Version/s: PDB 6.13.1

Austin Blatt (Jira)

Oct 26, 2020, 5:40:03 PM
to puppe...@googlegroups.com
Austin Blatt updated an issue
As observed at customer sites, the initial sync can still find itself in a deadlock with garbage collection. When this happens, PuppetDB remains in maintenance mode for 4 hours until restarted by systemd.
The periodic sync is successfully cancelled by garbage collection, so we should use the same logic for initial sync to prevent the deadlock. But because we really do want the initial sync to run and complete, we also need to serialize the initial garbage collection with the initial sync.

If startup sync does not complete before the next run of gc (after {{gc-interval}}), it could still be cancelled, but this will be a "best effort" until we can rework the summary query.

Austin Blatt (Jira)

Oct 26, 2020, 5:40:04 PM
to puppe...@googlegroups.com
Austin Blatt updated an issue
Change By: Austin Blatt
Release Notes Summary: PE only: the initial garbage collection and the initial sync are now serialized. If the initial sync takes longer than your gc-interval, it could still be cancelled.

Austin Blatt (Jira)

Oct 26, 2020, 5:41:03 PM
to puppe...@googlegroups.com
Austin Blatt updated an issue
As observed at customer sites, the initial sync can still find itself in a deadlock with garbage collection. When this happens, PuppetDB remains in maintenance mode for 4 hours until restarted by systemd.

Until we can rework the summary query, we should serialize garbage collection and sync on startup.