Jira (PDB-5135) PuppetDB should warn about resource titles that exceed Postgres index sizes


Charlie Sharpsteen (Jira)

May 20, 2021, 11:51:04 AM
to puppe...@googlegroups.com
Charlie Sharpsteen created an issue
 
PuppetDB / Improvement PDB-5135
PuppetDB should warn about resource titles that exceed Postgres index sizes
Issue Type: Improvement
Assignee: Unassigned
Created: 2021/05/20 8:50 AM
Priority: Normal
Reporter: Charlie Sharpsteen

When PuppetDB inserts data into Postgres, the constraints of the database can cause errors to be raised.

A typical example is that Postgres disallows the use of the null byte, "\0", in strings while UTF-8 generally tolerates it. Another constraint comes into play when large data values are inserted into columns that have database indexes:

Any data type that can be sorted into a well-defined linear order can be indexed by a btree index. The only limitation is that an index entry cannot exceed approximately one-third of a page (after TOAST compression, if applicable).

https://www.postgresql.org/docs/11/btree-intro.html
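As a rough illustration of where the limit comes from, the arithmetic below assumes the default Postgres page size of 8192 bytes (an assumption; the page size is fixed when Postgres is built). The 2712-byte maximum is the value reported in the error message quoted in this issue. It falls slightly under a plain third of a page, presumably because some per-page overhead is subtracted first.

```python
# Assumption: the default 8 kB Postgres page (block) size.
PAGE_SIZE = 8192

# "Approximately one-third of a page", ignoring any per-page overhead.
approx_limit = PAGE_SIZE // 3

# Maximum index row size reported in the error message in this issue (bytes).
REPORTED_LIMIT = 2712


def shortfall(limit, page_size=PAGE_SIZE):
    """Bytes by which a reported limit falls short of a plain third of a page."""
    return page_size // 3 - limit


print(approx_limit, shortfall(REPORTED_LIMIT))
```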

This constraint is typically encountered with resource title values in catalogs or reports and results in an error similar to the following being raised from the storage attempt:

2021-05-11T23:59:47.225Z ERROR [p.p.command] [14,654,263] [store report] Retrying after attempt 0 for node.hostname.example, due to: org.postgresql.util.PSQLException: ERROR: index row size 2720 exceeds maximum 2712 for index "resource_events_resource_timestamp_20210511z"

This error is somewhat useful in that it indicates which node tripped the condition. But the error does not help identify which data value needs to be corrected.
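To show what can and cannot be recovered from the log line alone, here is a minimal Python sketch that parses it. The regular expression and the `parse_index_error` helper are hypothetical and only match the message format quoted above; other PuppetDB versions may word the message differently.

```python
import re

# Hypothetical pattern, written against the single log line quoted in this issue.
LOG_PATTERN = re.compile(
    r'\[(?P<command>[^\]]+)\] Retrying after attempt (?P<attempt>\d+) '
    r'for (?P<certname>\S+), due to: .*'
    r'index row size (?P<size>\d+) exceeds maximum (?P<maximum>\d+) '
    r'for index "(?P<index>[^"]+)"'
)

SAMPLE = (
    '2021-05-11T23:59:47.225Z ERROR [p.p.command] [14,654,263] [store report] '
    'Retrying after attempt 0 for node.hostname.example, due to: '
    'org.postgresql.util.PSQLException: ERROR: index row size 2720 exceeds '
    'maximum 2712 for index "resource_events_resource_timestamp_20210511z"'
)


def parse_index_error(line):
    """Return the fields the log line exposes, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("attempt", "size", "maximum"):
        info[key] = int(info[key])
    return info


print(parse_index_error(SAMPLE))
```

The parsed fields cover the command, the certname, the sizes, and the index name. Nothing in the line identifies the resource itself.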

For resource titles, PuppetDB should check input sizes against the 2712-byte maximum and emit a warning or error that includes:

  • The certname of the node that produced the data
  • The type of the resource
  • The manifest file and line number where the resource was defined
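The requested check could look roughly like the Python sketch below. This is not PuppetDB code: the `Resource` class, its field names, and `oversized_title_warning` are hypothetical, and comparing only the UTF-8 size of the title against the limit is a simplifying assumption, since the real index row also holds other columns and may be compressed.

```python
from dataclasses import dataclass
from typing import Optional

# Limit taken from the error message in this issue (bytes).
MAX_INDEX_ROW_BYTES = 2712


@dataclass
class Resource:
    """Hypothetical stand-in for a catalog resource or report resource event."""
    type: str
    title: str
    file: Optional[str] = None
    line: Optional[int] = None


def oversized_title_warning(certname, resource, limit=MAX_INDEX_ROW_BYTES):
    """Return a warning string if the title's UTF-8 size exceeds the limit."""
    size = len(resource.title.encode("utf-8"))  # bytes, not characters
    if size <= limit:
        return None
    return (
        f"Resource title too large for node {certname}: "
        f"type {resource.type}, {size} bytes (limit {limit}), "
        f"defined at {resource.file}:{resource.line}"
    )


res = Resource("Exec", "x" * 3000, "/etc/puppetlabs/code/site.pp", 42)
print(oversized_title_warning("node.hostname.example", res))
```

Measuring encoded bytes matters because a title made of multi-byte characters can exceed the limit with far fewer than 2712 characters.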
 
This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)

Charlie Sharpsteen (Jira)

May 20, 2021, 11:53:02 AM
to puppe...@googlegroups.com

Charlie Sharpsteen (Jira)

May 20, 2021, 11:53:03 AM
to puppe...@googlegroups.com
Charlie Sharpsteen commented on Improvement PDB-5135
 
Re: PuppetDB should warn about resource titles that exceed Postgres index sizes

The PuppetDB termini that run inside Puppet Server might be a good spot for this check, as they already iterate over catalog resources and report resource events when preparing the data for submission to PuppetDB.

Austin Blatt (Jira)

Jun 30, 2021, 12:45:02 PM
to puppe...@googlegroups.com

Bogdan Irimie (Jira)

Jul 1, 2021, 3:21:02 AM
to puppe...@googlegroups.com

Oana Tanasoiu (Jira)

Jul 14, 2021, 3:18:03 AM
to puppe...@googlegroups.com

Oana Tanasoiu (Jira)

Jul 14, 2021, 3:45:02 AM
to puppe...@googlegroups.com
Oana Tanasoiu updated an issue
Change By: Oana Tanasoiu
Sprint: ready for triage 2 ghost-28.07.2021

Sebastian Miclea (Jira)

Jul 20, 2021, 7:24:04 AM
to puppe...@googlegroups.com

Sebastian Miclea (Jira)

Jul 27, 2021, 9:52:03 AM
to puppe...@googlegroups.com
Sebastian Miclea commented on Improvement PDB-5135
 
Re: PuppetDB should warn about resource titles that exceed Postgres index sizes

Charlie Sharpsteen I've set up a PR for the PuppetDB termini that logs a message when the index limit is reached. At Rob Browning's suggestion, I've added a second log message that warns when the size is near the limit. I'm currently checking whether the message can be emitted from PDB instead, because I find it a bit confusing to have one error in PDB and another in Puppet Server.

Bogdan Irimie (Jira)

Jul 28, 2021, 3:32:02 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Sprint: ghost-28.07.2021 , ghost-11.08.2021

Sebastian Miclea (Jira)

Aug 11, 2021, 6:29:02 AM
to puppe...@googlegroups.com
Sebastian Miclea updated an issue
Change By: Sebastian Miclea
Release Notes: Not Needed Enhancement
Release Notes Summary: The resource_events_resource_*z partition has a multicolumn resource_events_resource_timestamp_xxxxxz index (timestamp, title, and type) whose rows are limited to 2712 bytes on Postgres versions up to 11. Starting with Postgres 12, the limit was reduced by 8 bytes, to 2704. Resource events that exceed this limit cause PDB to fail to insert the row without giving much information about which resource caused the error or where it is defined. This PR adds extra log messages with details to make debugging easier. Two messages are printed: one when the index row size is close to the limit (between 2500 and 2704) and one when the limit is exceeded (over 2704).
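The two thresholds described in the release notes summary can be sketched in Python as follows. This is an illustration only, not the code from the PR: the `classify_index_size` name is hypothetical, and treating both ends of the 2500 to 2704 range as inclusive is an assumption.

```python
# Thresholds taken from the release notes summary (bytes).
NEAR_LIMIT_BYTES = 2500
MAX_INDEX_BYTES = 2704


def classify_index_size(size):
    """Classify an index row size as 'ok', 'near' the limit, or 'exceeded'."""
    if size > MAX_INDEX_BYTES:
        return "exceeded"
    if size >= NEAR_LIMIT_BYTES:  # assumption: lower bound is inclusive
        return "near"
    return "ok"


for size in (100, 2600, 2720):
    print(size, classify_index_size(size))
```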

Austin Blatt (Jira)

Sep 9, 2021, 11:00:02 AM
to puppe...@googlegroups.com

Sebastian Miclea (Jira)

Oct 5, 2021, 9:32:02 AM
to puppe...@googlegroups.com