Decreasing value

Nils Krabshuis

Aug 25, 2021, 1:25:40 AM
to Prometheus Users
In your typical graph with its ups and downs, how would one alert on a drop in the value? 

For example, pg_database_size would normally either stay stable or, more likely, grow. How could one monitor and alert on a decreasing value? I know I can do metric < metric offset 1w, but that only compares two single points in time. I would like to be able to alert on a declining rate (a negative rate, if you will).




Brian Candler

Aug 28, 2021, 3:14:04 PM
to Prometheus Users
See https://prometheus.io/docs/prometheus/latest/querying/functions/

delta() does more or less what you describe in a cleaner way: it takes the difference between the first and last values in a time window (range vector), and extrapolates the result to cover the whole window.  It will ignore any up-and-down bumps in between.
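
As a rough sketch for the database-size case, a rule along these lines would fire when the size has shrunk over the window.  The metric name is taken from the question (your exporter may expose it under a different name), and the group name, alert name, 6h window and "for" duration are placeholders I made up, not recommended values:

```yaml
- name: DatabaseSize
  rules:
  # Fire if the database is (roughly) smaller now than it was 6 hours ago.
  # Metric name, window and durations are assumptions; adjust to your setup.
  - alert: DatabaseShrinking
    expr: |
      delta(pg_database_size[6h]) < 0
    for: 30m
    labels:
      severity: warning
    annotations:
      summary: 'Database size has decreased over the last 6h'
```

To tolerate small dips, compare against a negative threshold instead of 0, for example "< -100e6" to require a drop of about 100 MB.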

If the value must never, ever go down, not even by a tiny amount, then you could use resets().  This counts the number of times in a time window that the value has decreased.  But really it's intended for use with counters, so it may be too sensitive for what you're doing: for example, a bit of vacuuming might reduce the database size slightly, and that alone would be counted.
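
For illustration only, assuming the same metric name as in the question, an arbitrary 1d window and a made-up alert name, that would look something like:

```yaml
# Fire if the value decreased at least once between consecutive samples
# in the last day. Metric name and window are assumptions.
- alert: DatabaseSizeDecreased
  expr: resets(pg_database_size[1d]) > 0
```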

A more sophisticated approach is to use predict_linear() to apply a best-fit line over all the data points and work out the expected value at some future point in time.  This can be used to warn when disks are filling up:

- name: DiskRate3h
  interval: 10m
  rules:
  # Warn if rate of growth over last 3 hours means filesystem will fill in 2 days
  - alert: DiskFilling
    expr: |
      predict_linear(node_filesystem_avail_bytes{fstype!~"fuse.*|nfs.*"}[3h], 2*86400) < 0
    for: 6h
    labels:
      severity: warning
    annotations:
      summary: 'Filesystem will be full in less than 2d at current 3h growth rate'

Comparing the predict_linear() value with the current value could be a good way to get an indication as to whether the value is "increasing" or "decreasing" overall, particularly if the usage is bumpy up and down.
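
A minimal sketch of that comparison, again using the metric name from the question; the 6h window, the one-day lookahead and the alert name are placeholder choices:

```yaml
# Fire if the best-fit line through the last 6h of samples projects a value
# one day ahead that is lower than the current value (overall downward trend).
- alert: DatabaseSizeTrendingDown
  expr: |
    predict_linear(pg_database_size[6h], 86400) < pg_database_size
  for: 1h
```

Multiplying the right-hand side by a factor such as 0.95 would require the projected drop to exceed 5% before the rule fires, which may help if the value is noisy.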