Alerts for HTTP errors


vickyrat...@gmail.com

Apr 4, 2022, 4:21:06 AM
to Prometheus Users
Does anyone have a dashboard/alerts for HTTP errors? I am having trouble creating one. For example 502 error there won't be any logs when I am not able to access an endpoint.

Brian Candler

Apr 4, 2022, 10:34:26 AM
to Prometheus Users
Before you think about dashboards or alerts, the first thing you need to decide is where this data is coming from and how you are collecting it. For example, if you are parsing web server logs, then you need to increment counters for the different status codes; grok_exporter or mtail can help you with that. If you want to make active tests (probes) of a remote web server, then something like blackbox_exporter can help you: it returns the HTTP status code as one of its metrics (probe_http_status_code).
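To make the log-parsing option concrete, here is a minimal Python sketch of the idea behind tools like grok_exporter or mtail: read log lines and keep one counter per status code. It is not how either tool is implemented. The log layout (Common/Combined Log Format), the regular expression, and the name count_status_codes are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: Common/Combined Log Format, where the status code follows the
# quoted request line, e.g.  ... "GET / HTTP/1.1" 502 157
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_status_codes(lines):
    """Return a Counter mapping HTTP status code (as a string) to line count."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:  # lines that do not look like access-log entries are skipped
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample data, not taken from a real server.
sample_log = [
    '10.0.0.1 - - [04/Apr/2022:10:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.2 - - [04/Apr/2022:10:00:02 +0000] "GET /api HTTP/1.1" 502 157',
    '10.0.0.3 - - [04/Apr/2022:10:00:03 +0000] "GET /api HTTP/1.1" 502 157',
]
counts = count_status_codes(sample_log)
print(counts["200"], counts["502"])
```

A real exporter would keep these counts running for the lifetime of the process and expose them over HTTP for Prometheus to scrape, rather than printing them once.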
 
> For example 502 error there won't be any logs when I am not able to access an endpoint

It's unclear what problem you are having.

If the webserver itself returns a 502 error, then it will log the fact that it has done so, just like any other response.  Equally, if blackbox_exporter gets a 502 response from a webserver, then it will report it as a 502.  Both cases are suitable for reporting and alerting.
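As a rough illustration of the probing case, the following Python sketch separates "the server answered with an error status such as 502" from "no HTTP response arrived at all". It only imitates the idea of an active probe and is not blackbox_exporter's implementation; the function name probe, the convention of returning 0 when nothing answers, and the decision to bypass proxies are all assumptions.

```python
import urllib.error
import urllib.request

# Assumption: the probe should talk to the target directly, so any proxy
# configured in the environment is ignored. Drop this if probes must use one.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """Return (reachable, status_code); status_code is 0 if nothing answered."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return True, response.status
    except urllib.error.HTTPError as err:
        # The server did answer, but with an error status such as 502.
        return True, err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout: no HTTP response at all.
        return False, 0
```

Calling probe on an endpoint that sits behind a failing upstream would be expected to give (True, 502), while an endpoint that cannot be reached at all gives (False, 0). Both outcomes can be recorded as metrics and alerted on, even when the server itself logs nothing.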

vickyrat...@gmail.com

Apr 5, 2022, 8:06:14 AM
to Prometheus Users
Thanks for the help. Right now I have only 2 data sources: Prometheus and Loki. Can I use either of the two for the 502 error alert?

Brian Candler

Apr 5, 2022, 10:04:08 AM
to Prometheus Users
Neither Prometheus nor Loki is a data source. Both are places where you store and query data. You can store metrics (numbers) in Prometheus, and logs or events (which can be text, JSON, etc.) in Loki.

Your data source could be web server logs, or it could be counters that your application exposes, or something else.

Once you've decided where you're collecting these HTTP responses from, you can then choose where and how to store them:

- in Prometheus you won't be able to store individual HTTP logs, but you can create counters (e.g. number of 200 responses, number of 502 responses, etc.) and store those as metrics. By looking at how those counters change over time, you can answer questions like "how many requests per minute am I handling?" and "what proportion of those requests gave 502 responses?" You can also record latency information in histograms, which are basically counters grouped into buckets (e.g. number of requests which took 0-1ms, 1-2ms, 2-5ms, etc.)

- in Loki you can store the individual HTTP requests as separate log entries, so you can drill down to details of individual requests. You can also do aggregate queries using LogQL, but they will be slower, as much more data has to be processed for a given time range compared to what Prometheus would need to query.
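The counter arithmetic described for Prometheus can be sketched in a few lines of Python. This is a simplified illustration, not PromQL: the real rate() and increase() functions also extrapolate to the edges of the time window. The sample values, the one-minute scrape interval, and the function names are hypothetical.

```python
def increase(samples):
    """Total increase of a counter across samples, tolerating resets to zero."""
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # A drop means the process restarted and the counter began again at 0.
            total += current
    return total


def error_ratio(error_samples, total_samples):
    """Fraction of requests in the window that were errors (0.0 if no traffic)."""
    total = increase(total_samples)
    return increase(error_samples) / total if total else 0.0


# Hypothetical counter values, scraped once per minute.
requests_total = [1000, 1120, 1240, 1360]
responses_502 = [5, 5, 11, 23]

per_minute = increase(requests_total) / (len(requests_total) - 1)
ratio = error_ratio(responses_502, requests_total)
print(per_minute, ratio)
```

An alert would then be a threshold on that ratio held for some duration, for example firing when more than a chosen percentage of requests return 502 for several minutes in a row.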
