Jira (PDB-3211) Collect/expose metrics on vacuum stats

Ryan Senior (JIRA)

Dec 6, 2016, 10:08:03 AM
to puppe...@googlegroups.com
Ryan Senior created an issue
 
PuppetDB / New Feature PDB-3211
Collect/expose metrics on vacuum stats
Issue Type: New Feature
Assignee: Unassigned
Created: 2016/12/06 7:07 AM
Priority: Normal
Reporter: Ryan Senior

PuppetDB can generate excessive garbage for PostgreSQL, depending on usage patterns. We don't currently collect or monitor this metric, which is a gap in our performance and scalability monitoring. This ticket will likely need to be more of a design/investigation sort of ticket, to figure out how we want to monitor this over time (and what historic information is available in PostgreSQL on vacuuming). A second ticket can then be created and estimated for the implementation.

The end result should be the ability to monitor and assess the impact of PuppetDB usage on PostgreSQL's vacuuming. This should allow us to make measurable improvements over time.

This message was sent by Atlassian JIRA (v6.4.14#64029-sha1:ae256fe)

Karen Van der Veer (JIRA)

Dec 12, 2016, 6:31:03 PM
to puppe...@googlegroups.com
Karen Van der Veer updated an issue
Change By: Karen Van der Veer
Sprint: Hopper SE 2016-01-11

Ryan Senior (JIRA)

Dec 14, 2016, 10:51:02 AM
to puppe...@googlegroups.com
Ryan Senior updated an issue
Change By: Ryan Senior
Sprint: SE 2016-01-11 Hopper

Ryan Senior (JIRA)

Dec 14, 2016, 11:04:02 AM
to puppe...@googlegroups.com
Ryan Senior updated an issue
Change By: Ryan Senior
Sprint: Hopper

Zachary Kent (Jira)

Feb 22, 2021, 2:41:01 PM
to puppe...@googlegroups.com
Zachary Kent commented on New Feature PDB-3211
 
Re: Collect/expose metrics on vacuum stats

The main focus of this ticket should be learning about the Postgres autovacuum process and identifying what we would want to track about autovacuum in PDB. If we can identify a few core metrics we care about, we could add a periodic task that checks them and logs warnings when PDB detects that they look unhealthy.

For example, one thing we'll want to check is overall table bloat. The dead and live tuple counts that indicate it can be read with the following query:

select relname, n_dead_tup, n_live_tup, last_autovacuum
  from pg_stat_user_tables
 order by n_dead_tup desc;
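
The periodic check described above could be sketched roughly as follows. This is only an illustration in Python, not PDB code: it assumes each row from the query also carries the table name (`relname`, a column of `pg_stat_user_tables`), and the `TableStats` type, the `vacuum_warning` function, and the 20% threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical threshold, chosen only for illustration.
MAX_DEAD_RATIO = 0.2


@dataclass
class TableStats:
    """One row of pg_stat_user_tables, reduced to the columns used here."""
    relname: str
    n_dead_tup: int
    n_live_tup: int
    last_autovacuum: Optional[datetime]  # None when the column is NULL


def dead_ratio(stats: TableStats) -> float:
    """Fraction of the table's tuples that are dead; 0.0 for an empty table."""
    total = stats.n_dead_tup + stats.n_live_tup
    return stats.n_dead_tup / total if total else 0.0


def vacuum_warning(stats: TableStats, now: datetime) -> Optional[str]:
    """Return a warning message when the dead-tuple ratio exceeds the threshold.

    `now` and `last_autovacuum` must both be naive or both be timezone-aware.
    """
    ratio = dead_ratio(stats)
    if ratio <= MAX_DEAD_RATIO:
        return None
    if stats.last_autovacuum is None:
        when = "never"
    else:
        hours = (now - stats.last_autovacuum).total_seconds() / 3600
        when = f"{hours:.1f}h ago"
    return f"{stats.relname}: {ratio:.0%} dead tuples, last autovacuum {when}"
```

A scheduled task would run the query, build one `TableStats` per row, and log every non-`None` result. The tuple counts in `pg_stat_user_tables` are estimates, so a threshold like this signals a trend rather than measuring bloat exactly.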

Some Postgres metrics are already collected by the puppetlabs-puppet_metrics_collector module. Changes were recently made to this module in PR 71 to gather more Postgres metrics. The changes to the psql_metrics file in that PR could be a useful starting point when looking for the kinds of things we might want to gather in PDB periodically.

This message was sent by Atlassian Jira (v8.5.2#805002-sha1:a66f935)

Zachary Kent (Jira)

Feb 22, 2021, 2:46:03 PM
to puppe...@googlegroups.com
Zachary Kent updated an issue
 
Change By: Zachary Kent
Acceptance Criteria:
* A candidate list of autovacuum-related metrics we would want to track in PDB, with a brief explanation of why each is valuable
* A list of queries that can be used to gather information for each metric
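
One possible shape for such a deliverable is sketched below in Python. The metric names and the selection are hypothetical and are not taken from the ticket or its attachments. The views and columns referenced (`pg_stat_user_tables`, `pg_settings`, `pg_database.datfrozenxid`) are standard PostgreSQL catalog objects.

```python
from typing import Dict, Tuple

# Hypothetical candidate list: metric name -> (why it is valuable, query).
CANDIDATE_METRICS: Dict[str, Tuple[str, str]] = {
    "dead_tuples": (
        "Dead tuples relative to live tuples hint at bloat not yet reclaimed.",
        "SELECT relname, n_dead_tup, n_live_tup FROM pg_stat_user_tables",
    ),
    "autovacuum_recency": (
        "Shows when, and how often, autovacuum has processed each table.",
        "SELECT relname, last_autovacuum, autovacuum_count "
        "FROM pg_stat_user_tables",
    ),
    "autovacuum_settings": (
        "Records the configuration that the other numbers are judged against.",
        "SELECT name, setting, unit FROM pg_settings "
        "WHERE name LIKE 'autovacuum%'",
    ),
    "xid_age": (
        "A growing transaction ID age suggests vacuum is not freezing rows.",
        "SELECT datname, age(datfrozenxid) AS xid_age FROM pg_database",
    ),
}


def render_report(metrics: Dict[str, Tuple[str, str]]) -> str:
    """Format each metric as a 'name: reason' line followed by its query."""
    lines = []
    for name, (why, query) in metrics.items():
        lines.append(f"{name}: {why}")
        lines.append(f"  {query}")
    return "\n".join(lines)
```

Keeping the rationale next to each query covers both acceptance criteria in one structure, and `render_report(CANDIDATE_METRICS)` gives a plain-text list suitable for pasting into a ticket comment.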

Zachary Kent (Jira)

Feb 22, 2021, 2:46:03 PM
to puppe...@googlegroups.com
Zachary Kent updated an issue
Change By: Zachary Kent
Story Points: 2

Zachary Kent (Jira)

Feb 22, 2021, 5:31:03 PM
to puppe...@googlegroups.com
Zachary Kent updated an issue
Change By: Zachary Kent
Labels: tsr-pdb-backlog

Zachary Kent (Jira)

Feb 22, 2021, 5:54:03 PM
to puppe...@googlegroups.com
Zachary Kent commented on New Feature PDB-3211
 
Re: Collect/expose metrics on vacuum stats

See PDB-4941 and related links in the comments there for more background on times we've seen autovacuum issues with customers and how they were resolved. The information there isn't necessary to complete this ticket, but is useful to gain some context around past issues. 

Bogdan Irimie (Jira)

Feb 23, 2021, 8:08:02 AM
to puppe...@googlegroups.com

Bogdan Irimie (Jira)

Mar 24, 2021, 10:35:03 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Sprint: ready for triage 3

Bogdan Irimie (Jira)

Mar 24, 2021, 10:35:03 AM
to puppe...@googlegroups.com

Bogdan Irimie (Jira)

Mar 30, 2021, 3:55:04 AM
to puppe...@googlegroups.com
Bogdan Irimie assigned an issue to Bogdan Irimie
Change By: Bogdan Irimie
Assignee: Bogdan Irimie

Bogdan Irimie (Jira)

Apr 2, 2021, 11:09:02 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Attachment: Screenshot 2021-04-02 at 18.06.03.png
This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)

Bogdan Irimie (Jira)

Apr 7, 2021, 9:04:03 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Sprint: ghost-7.04.2021, HAHA/Grooming 2

Bogdan Irimie (Jira)

Apr 7, 2021, 2:40:03 PM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Attachment: puppet-metrics-collector-20210401T142001Z.json

Bogdan Irimie (Jira)

Apr 7, 2021, 2:42:03 PM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Attachment: vacuum-metrics

Bogdan Irimie (Jira)

Apr 7, 2021, 3:03:03 PM
to puppe...@googlegroups.com
Bogdan Irimie commented on New Feature PDB-3211
 
Re: Collect/expose metrics on vacuum stats

`vacuum-metrics` contains SQL queries that we can use for monitoring bloat and some of its possible causes. The puppetlabs-puppet_metrics_collector module collects bloat and vacuum data from `pg_stat_all_tables`. Attached is an example of a Grafana dashboard, created with puppet_metrics_dashboard, that shows the evolution of dead tuples over time.

One of the main advantages of the puppetlabs-puppet_metrics_collector module is that it saves data every 5 minutes and can push it to third-party monitoring systems and databases such as Splunk, InfluxDB, and Graphite. The module will be included in the next versions of PE.

Bogdan Irimie (Jira)

Apr 7, 2021, 3:12:03 PM
to puppe...@googlegroups.com
Bogdan Irimie commented on New Feature PDB-3211

We should look over the autovacuum throttling parameters, as they can play a bigger role in large deployments, where the default values might be too small and autovacuum will not be able to keep up with the bloat.
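
To get a feel for how throttling limits autovacuum, a back-of-envelope estimate can help. The Python sketch below assumes the usual cost-based delay model: a worker accumulates a cost per page and sleeps for `autovacuum_vacuum_cost_delay` each time the accumulated cost reaches the cost limit. The function names are hypothetical and the input values are illustrative. The real settings should be read from `pg_settings`, since defaults differ between PostgreSQL versions. The result is an upper bound that ignores the time spent doing the actual work.

```python
def max_pages_per_second(cost_limit: int, cost_delay_ms: float,
                         cost_per_page: int) -> float:
    """Upper bound on pages a throttled vacuum worker can process per second.

    cost_limit    -- accumulated cost that triggers a sleep
    cost_delay_ms -- length of each sleep in milliseconds
    cost_per_page -- cost charged per page (for example, the dirty-page cost)
    """
    if cost_delay_ms <= 0:
        return float("inf")  # cost-based delay is disabled
    pages_per_round = cost_limit / cost_per_page
    rounds_per_second = 1000.0 / cost_delay_ms
    return pages_per_round * rounds_per_second


def max_mib_per_second(cost_limit: int, cost_delay_ms: float,
                       cost_per_page: int, block_size: int = 8192) -> float:
    """Same bound expressed in MiB/s, assuming `block_size` bytes per page."""
    pages = max_pages_per_second(cost_limit, cost_delay_ms, cost_per_page)
    return pages * block_size / (1024 * 1024)
```

For instance, with a cost limit of 200, a per-page cost of 20, and a 20 ms delay, the bound is 500 pages per second, or roughly 3.9 MiB/s with 8 KiB pages. A table that accumulates dead tuples faster than that cannot be kept clean without raising the limit or lowering the delay.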

Bogdan Irimie (Jira)

Apr 12, 2021, 9:56:01 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Attachment: vacuum-metrics

Bogdan Irimie (Jira)

Apr 12, 2021, 9:56:02 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Attachment: vacuum-metrics-1

Bogdan Irimie (Jira)

Apr 12, 2021, 9:56:02 AM
to puppe...@googlegroups.com

Bogdan Irimie (Jira)

Apr 12, 2021, 9:56:02 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
 
PuppetDB / New Feature PDB-3211
Collect/expose metrics on vacuum stats
Change By: Bogdan Irimie
Attachment: vacuum-metrics

Claudia Petty (Jira)

Jun 21, 2023, 11:03:47 AM
to puppe...@googlegroups.com
Claudia Petty updated an issue
Change By: Claudia Petty
Labels: new-feature tsr-pdb-backlog
This message was sent by Atlassian Jira (v8.20.21#820021-sha1:38274c8)