I was given the task of tuning their resource queues (RQs) to avoid performance degradation, query waiting, out of memory events, etc.
Now, to do that properly and to make decisions based on reality, I will need to collect historic data on query execution. I am familiar with most data dictionary views reporting activity on the system (pg_stat_activity and the statistics collector as well as perfmon APIs that give a pretty accurate picture of what’s going on NOW) but to know resource demand over time, I will need to collect all these data over a meaningful period (such as 3 months) on some kind of ad hoc schema.
Do you know of any function or set of functions in GP to do this query execution activity data capture/collection? I mean, capture data into some history tables without relying on GP logs exclusively, like a stream pipeline taking snapshots of the system query activity every given interval.
Any suggestion on that would be awesome. And also, any best practices tweaking RQs. We are upgrading production July 28th from 4.3.12 to 5.9 (big jump) and also we will start using GPCC 4 with workload manager (WLM) that we didn’t use before. But still, resource queues have to be setup properly.
Daniel Barale
--
You received this message because you are subscribed to the Google Groups "Greenplum Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gpdb-users+unsubscribe@greenplum.org.
To post to this group, send email to gpdb-...@greenplum.org.
Visit this group at https://groups.google.com/a/greenplum.org/group/gpdb-users/.
For more options, visit https://groups.google.com/a/greenplum.org/d/optout.
Have you tried using the data in the gpperfmon database?
On Thu, Jul 12, 2018 at 8:09 AM, Daniel Barale <dba...@gmail.com> wrote:
I was given the task of tuning their resource queues (RQs) to avoid performance degradation, query waiting, out of memory events, etc.
Now, to do that properly and to make decisions based on reality, I will need to collect historic data on query execution. I am familiar with most data dictionary views reporting activity on the system (pg_stat_activity and the statistics collector as well as perfmon APIs that give a pretty accurate picture of what’s going on NOW) but to know resource demand over time, I will need to collect all these data over a meaningful period (such as 3 months) on some kind of ad hoc schema.
Do you know of any function or set of functions in GP to do this query execution activity data capture/collection? I mean, capture data into some history tables without relying on GP logs exclusively, like a stream pipeline taking snapshots of the system query activity every given interval.
Any suggestion on that would be awesome. And also, any best practices tweaking RQs. We are upgrading production July 28th from 4.3.12 to 5.9 (big jump) and also we will start using GPCC 4 with workload manager (WLM) that we didn’t use before. But still, resource queues have to be setup properly.
Daniel Barale
--
You received this message because you are subscribed to the Google Groups "Greenplum Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gpdb-users+...@greenplum.org.
To post to this group, send email to gpdb-...@greenplum.org.
Visit this group at https://groups.google.com/a/greenplum.org/group/gpdb-users/.
For more options, visit https://groups.google.com/a/greenplum.org/d/optout.
--