Hi,

I'm trying to compute aggregate system statistics (e.g. CPU utilization, measured by node exporters) while the system is subject to load tests. I want to be able to distinguish how different test conditions affect system performance.

To do that, I've created a text file read by the node exporter textfile collector, in which I put the run ID, e.g.:

    run{run="126"} 1

126 is the test ID, and this metric is set to 1 during the test run. When the next test is run (e.g. test 127), I overwrite the file and change the run label to run="127".

It seems to work fine. However, if I chart the expression run[10m] as a stacked graph in Prometheus, I see the different labels overlapping.

Why does this occur? As I always update the file atomically with a new value, no overlap should ever occur: I should always get something equal to 1. It looks like Prometheus takes a while to notice that the value changed.

The scraping works correctly: if I view the query results in the Console, I can see that there is no overlap in the timestamps.

What am I doing wrong here?

Many thanks for your help and for making this great tool!
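For reference, the atomic update described above could look roughly like the following. This is only a sketch of the write-then-rename pattern, not the script actually used in this setup: the function name `write_run_metric` and the file name in the usage note are hypothetical. It assumes the temporary file lives in the same directory as the target, so that the rename stays on one filesystem.

```python
import os
import tempfile


def write_run_metric(path: str, run_id: int) -> None:
    """Atomically replace the textfile collector file with the current run label."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file is created next to the target so os.replace() is a
    # same-filesystem rename; the ".tmp" suffix keeps it from matching the
    # "*.prom" pattern that the textfile collector reads.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(f'run{{run="{run_id}"}} 1\n')
        # mkstemp creates the file with mode 0600; widen it so an exporter
        # running as another user can still read it (assumption about the setup).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling `write_run_metric("run.prom", 127)` at the start of test 127 would swap the old label for the new one in a single rename, so a scrape never sees a half-written file.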
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/438dc53b-088a-4c1b-9b02-6384f8e10d8d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi Brian,

Thanks for the quick and useful reply!

I managed to rework my setup around your suggestion. I now have a metric called runid that holds the ID of the currently running test, and 0 when no run is in progress. I can then gather the performance metrics I need during a specific run as follows:

    avg by (instance, job) (sum by (instance, cpu, job) (irate(node_cpu{mode=~"user|system|nice"}[1m]))) * 100 and runid == $runid

Now, I have two remaining questions:

1) How to compute aggregate stats over a specific run

In order to understand how a test run performed, I would like to have a single aggregate value, like a benchmark score: for example, the average CPU utilization during a test run. Is there a way to accomplish this using PromQL?

I have found the <aggr>_over_time functions, but I'm not able to come up with a working query. They work in a simple case like:

    avg_over_time(go_goroutines[1m])

However, if I need the average of a rate, this doesn't work, I guess because rate() returns an instant vector:

    avg_over_time(rate(node_cpu[1m]))

Is it possible to transform it into a range vector?
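One possible workaround, sketched here rather than offered as a PromQL answer, is to compute the per-run average on the client side. The idea is to evaluate the filtered expression above through the HTTP API's range query endpoint and average the returned samples per series. Because the `and runid == $runid` clause already drops samples outside the run, a plain mean over what comes back is the per-run average. The function name `average_per_series` is hypothetical; the input is assumed to be the decoded JSON of a range query response with `resultType` "matrix".

```python
from statistics import mean


def average_per_series(response: dict) -> dict:
    """Return the mean sample value of each series in a range-query response.

    Keys are the series' label sets as sorted (name, value) tuples.
    """
    averages = {}
    for series in response["data"]["result"]:
        # Each sample is a [timestamp, "value"] pair; values arrive as strings.
        values = [float(value) for _, value in series["values"]]
        if values:
            key = tuple(sorted(series["metric"].items()))
            averages[key] = mean(values)
    return averages
```

With one series per (instance, job), the result holds a single number per host for the selected run, which can serve as the benchmark score.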
2) Distinct values of a metric

I'm using Grafana on top of Prometheus to analyze the results. I have set up a template variable which holds the distinct values of runid, so that I can filter and visualize the relevant tests. Now, how can I get the distinct values of a metric, so that the template variable will show them?
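Outside Grafana, the distinct values can at least be collected from a range query over runid. The sketch below is not a Grafana template query; it only illustrates the extraction. It assumes the decoded JSON of a "matrix" range-query response, and the function name `distinct_run_ids` is hypothetical.

```python
def distinct_run_ids(response: dict) -> list:
    """Collect the distinct non-zero runid values seen in a range-query response."""
    run_ids = set()
    for series in response["data"]["result"]:
        for _, value in series["values"]:
            run_id = int(float(value))
            # 0 means "no run in progress" in this setup, so it is skipped.
            if run_id != 0:
                run_ids.add(run_id)
    return sorted(run_ids)
```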
I also wanted to thank you all again for this wonderful project. I'm now able to implement complex derived metrics and ideas in a matter of clicks!
On Saturday, 24 December 2016 16:09:40 UTC+1, stefan...@gmail.com wrote: