Jira (PDB-5141) Fix issue with lock_timeout format during partition drop


Zachary Kent (Jira)

May 28, 2021, 1:57:28 PM
to puppe...@googlegroups.com
Zachary Kent created an issue
 
PuppetDB / Bug PDB-5141
Fix issue with lock_timeout format during partition drop
Issue Type: Bug
Assignee: Unassigned
Created: 2021/05/26 4:49 PM
Priority: Normal
Reporter: Zachary Kent

tbd

This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)

Rob Browning (Jira)

May 28, 2021, 2:02:29 PM
to puppe...@googlegroups.com

Rob Browning (Jira)

May 28, 2021, 2:02:33 PM
to puppe...@googlegroups.com

Rob Browning (Jira)

May 28, 2021, 2:07:27 PM
to puppe...@googlegroups.com

Zachary Kent (Jira)

May 28, 2021, 2:22:28 PM
to puppe...@googlegroups.com
Zachary Kent updated an issue
Change By: Zachary Kent
The `show lock_timeout` query we run [here|https://github.com/puppetlabs/puppetdb/blob/6.x/src/puppetlabs/puppetdb/scf/storage.clj#L1634-L1635] to read any existing system lock_timeout returns a value with a unit suffix (for example "300ms"), which Long/parseLong cannot parse. We need to fix this query so that it doesn't throw when a system lock_timeout is already set.

To reproduce with a local PDB:
{code:java}in psql: alter role pdb_test set lock_timeout=300;
run: lein test :only puppetlabs.puppetdb.cli.services-test/regular-gc-drops-oldest-partitions-incrementally
{code}
This will cause the test to fail with:
{code:java}21349 [pool-3-thread-3] ERROR puppetlabs.puppetdb.cli.services - Error while sweeping reports and resource events
java.lang.NumberFormatException: For input string: "300ms"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.base/java.lang.Long.parseLong(Long.java:692)
at java.base/java.lang.Long.parseLong(Long.java:817)
at puppetlabs.puppetdb.scf.storage$prune_daily_partitions.invokeStatic(storage.clj:1571)
        ...
{code}
If a lock_timeout is set for postgres or for the puppetdb/pe-puppetdb role, partition GC fails and partitions build up until PDB is restarted or the lock_timeout is reset. If no lock_timeout is set, the query returns 0, which Long/parseLong handles without error.
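
To illustrate the failure mode, here is a minimal Java sketch of a parser that accepts both the bare "0" and unit-suffixed values such as "300ms". It is not the PuppetDB fix: the class `LockTimeoutParser` and the method `parseMillis` are hypothetical names, and the set of accepted units (ms, s, min, h, d) is an assumption based on how PostgreSQL's `SHOW` prints time settings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not PuppetDB code: converts `show lock_timeout` output to milliseconds. */
final class LockTimeoutParser {
    // An integer followed by an optional time unit, e.g. "0", "300ms", "5s", "1min".
    private static final Pattern VALUE =
        Pattern.compile("^\\s*(\\d+)\\s*(ms|s|min|h|d)?\\s*$");

    static long parseMillis(String shown) {
        Matcher m = VALUE.matcher(shown);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized lock_timeout value: " + shown);
        }
        long n = Long.parseLong(m.group(1));
        // Assumption: a value without a unit is in milliseconds, lock_timeout's base unit.
        String unit = m.group(2) == null ? "ms" : m.group(2);
        switch (unit) {
            case "ms":  return n;
            case "s":   return n * 1_000L;
            case "min": return n * 60_000L;
            case "h":   return n * 3_600_000L;
            case "d":   return n * 86_400_000L;
            default:    throw new IllegalArgumentException("Unknown unit: " + unit);
        }
    }

    public static void main(String[] args) {
        try {
            Long.parseLong("300ms"); // the call that fails in the stack trace above
        } catch (NumberFormatException e) {
            System.out.println("Long.parseLong rejects it: " + e.getMessage());
        }
        System.out.println(parseMillis("300ms"));
        System.out.println(parseMillis("0"));
    }
}
```

Matching the whole string against a fixed list of units makes an unexpected format fail with a clear message rather than being silently misread.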

As a workaround, resetting the lock_timeout should allow partition GC to succeed. For example:
{code:java} alter role "pe-puppetdb" reset lock_timeout;
{code}
This resets the lock_timeout on the pe-puppetdb role and should resolve the error seen above. Partition drops are still protected by a lock_timeout (5 minutes by default), which is set via an env var [here|https://github.com/puppetlabs/puppetdb/blob/6.x/src/puppetlabs/puppetdb/scf/storage.clj#L1584-L1586].
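
The default-with-override behaviour described above could look roughly like the following Java sketch. It is an illustration only, not the linked Clojure code: the class `DropLockTimeout` and the method `resolveMillis` are hypothetical, and it assumes the env var is the experimental PDB_GC_DAILY_PARTITION_DROP_LOCK_TIMEOUT_MS named in this issue's release notes. How PuppetDB handles an invalid value is not stated in this thread; here it simply throws.

```java
import java.util.Map;

/** Hypothetical sketch, not PuppetDB code: resolves the partition-drop lock timeout. */
final class DropLockTimeout {
    // Assumption: the env var is the experimental one named in the release notes.
    static final String ENV_VAR = "PDB_GC_DAILY_PARTITION_DROP_LOCK_TIMEOUT_MS";
    static final long DEFAULT_MS = 5 * 60 * 1000L; // 5 minutes

    /** Returns the configured timeout in milliseconds, or the default when unset or blank. */
    static long resolveMillis(Map<String, String> env) {
        String raw = env.get(ENV_VAR);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MS;
        }
        // Assumption: the value is a plain integer number of milliseconds.
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println(resolveMillis(System.getenv()));
    }
}
```

Passing the environment in as a map keeps the lookup easy to exercise without changing the real process environment.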

Eric Thompson (Jira)

Jun 2, 2021, 2:18:02 PM
to puppe...@googlegroups.com

Eric Thompson (Jira)

Jun 2, 2021, 2:20:02 PM
to puppe...@googlegroups.com
Eric Thompson updated an issue
Change By: Eric Thompson
Sprint: HA 2021-06-02, HA 2021-06-16

Eric Thompson (Jira)

Jun 2, 2021, 2:27:02 PM
to puppe...@googlegroups.com

Rob Browning (Jira)

Jun 15, 2021, 12:27:02 PM
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Fix Version/s: PDB 7.4.2
Fix Version/s: PDB 6.9.2

Rob Browning (Jira)

Jun 15, 2021, 12:31:02 PM
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Fix Version/s: PDB 6.9.2
Fix Version/s: PDB 6.16.2

Rob Browning (Jira)

Jun 15, 2021, 12:40:01 PM
to puppe...@googlegroups.com
Rob Browning updated an issue
Change By: Rob Browning
Release Notes: Bug Fix
Release Notes Summary: Lock timeouts should be parsed correctly now.  Previously, if a lock timeout had been set either via the experimental [PDB_GC_DAILY_PARTITION_DROP_LOCK_TIMEOUT_MS](https://puppet.com/docs/puppetdb/latest/configure.html#experimental-environment-variables) variable, or other means, PuppetDB might fail to interpret the value correctly, and as a result, fail to prune older data correctly. [(PDB-5141)](https://tickets.puppetlabs.com/browse/PDB-5141)