Servers randomly return http error 403

g...@recongate.com

Aug 9, 2017, 10:07:46 AM
to Prometheus Users
Hi everyone,

I'm using Prometheus to monitor servers in different locations around the world and on different cloud platforms.

In one location (where there are a few servers) I sometimes get the error "server returned HTTP status 403 Forbidden". The error occurs randomly: sometimes Prometheus scrapes the server successfully, and on the next scrape the error appears again.

The network connection to the servers is not the best, but it should be good enough to monitor them (when I ping them I get an answer back within around 90-100ms).

I tried changing the scrape_interval and the scrape_timeout, but it did not change anything (even when I set them to 10 minutes).

Any ideas on how I can monitor these servers?

Ben Kochie

Aug 11, 2017, 1:19:36 PM
to g...@recongate.com, Prometheus Users
The recommended thing to do is to run the Prometheus server closer to the targets. This makes sure the network in the middle is not a problem. Otherwise you are effectively monitoring the network in addition to the target services.

You can forward alert messages over the long links to a remote Alertmanager.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/286d9508-e768-4f12-bd5c-5f5d5f10642c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

g...@recongate.com

Aug 13, 2017, 3:12:54 AM
to Prometheus Users, g...@recongate.com
Thanks Ben,

Unfortunately it is impossible for us to run a Prometheus server near the target servers. Is there maybe a way to calculate an average, or to use some sort of CLI to get some data from these servers?

Ben Kochie

Aug 13, 2017, 4:28:11 AM
to g...@recongate.com, Prometheus Users
It should be no problem to monitor with a few lost scrapes.  This is one of the advantages of polling monitoring and PromQL.

With a typical scrape_interval of 30s, and 50% scrape loss, you can still easily use functions like rate(), but you will want a longer range vector, say 3-5 minutes, in order to deal with the lost scrapes.

You will also probably want to adjust the alert rules to have a longer FOR time in order to avoid alerting noise.

You may want to consider faster scrape_interval polling, say 10-15s, in order to increase the number of samples you collect.
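
The trade-off between range length, scrape interval, and scrape loss can be made concrete with a little arithmetic. The sketch below is not part of the original reply: it assumes each scrape fails independently with a fixed probability (real network loss is often bursty), and the function name is hypothetical. It relies only on the fact that rate() needs at least two samples inside the range to return a value.

```python
def p_at_least_two_samples(window_s, interval_s, loss):
    """Probability that a range window contains at least two successful
    scrapes, assuming every scrape fails independently with probability
    `loss` (an assumption; real loss may come in bursts)."""
    n = int(window_s // interval_s)      # scrapes attempted inside the window
    ok = 1.0 - loss
    p_zero = loss ** n                   # every scrape in the window was lost
    p_one = n * ok * loss ** (n - 1) if n >= 1 else 0.0  # exactly one survived
    return 1.0 - p_zero - p_one


# Compare a few range lengths and scrape intervals at 50% scrape loss.
for window_s, interval_s in [(60, 30), (180, 30), (300, 30), (180, 15)]:
    p = p_at_least_two_samples(window_s, interval_s, loss=0.5)
    print(f"range={window_s}s interval={interval_s}s -> P(rate has data)={p:.3f}")
```

Under these assumptions a 1 minute range over 30s scrapes has data only a quarter of the time, while a 5 minute range has data about 99% of the time, which is why both a longer range and a faster scrape_interval help.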

Ben Kochie

Aug 15, 2017, 10:05:09 AM
to Gil Goldberger, Prometheus Users
The only way for a 403 error to happen is if there is some kind of proxy server in between your Prometheus server and the targets.

On Tue, Aug 15, 2017 at 3:31 PM, Gil Goldberger <g...@recongate.com> wrote:
I tried reaching the node exporter via curl (from the Prometheus server) and managed to see the metrics every time (I tried at least ten times).

If monitoring should still work with 50% scrape loss, I should be able to monitor these servers; however, I still get this error from them.

Any thoughts on why curl works fine while the Prometheus server gets an error?
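
One way to narrow down a "curl works, Prometheus fails" difference is to repeat the request many times with each client's headers and count the status codes, since something in the path might treat the two requests differently. This is a minimal sketch, not something from the thread: the header values are assumed approximations of what curl and a Prometheus 1.x scraper send and should be checked against a real capture, and the function names and the target URL in the usage comment are hypothetical.

```python
import urllib.error
import urllib.request
from collections import Counter

# Assumed approximations of each client's request headers (not taken from
# the thread); verify them against a packet capture before relying on them.
CURL_LIKE = {"User-Agent": "curl/7.50.0", "Accept": "*/*"}
PROMETHEUS_LIKE = {
    "User-Agent": "Prometheus/1.7.1",
    "Accept": "text/plain;version=0.0.4;q=0.3,*/*;q=0.1",
    "Accept-Encoding": "gzip",
}


def fetch_status(url, headers, timeout=10.0):
    """Return the HTTP status code for one GET request to `url`.
    Connection-level failures (urllib.error.URLError) are not caught."""
    req = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses are raised as HTTPError


def tally_statuses(fetch, attempts):
    """Call `fetch()` `attempts` times and count the status codes returned."""
    return Counter(fetch() for _ in range(attempts))


# Usage sketch (hypothetical target address):
# url = "http://target.example:9100/metrics"
# print(tally_statuses(lambda: fetch_status(url, CURL_LIKE), 50))
# print(tally_statuses(lambda: fetch_status(url, PROMETHEUS_LIKE), 50))
```

If only the Prometheus-like headers ever produce a 403, that points at something inspecting the request rather than at the exporter itself. Ten curl attempts may also simply be too few to hit an intermittent failure.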

Ben Kochie

Aug 15, 2017, 10:14:56 AM
to Gil Goldberger, Prometheus Users
I would recommend tcpdump on both sides in order to see where the 403 is coming from.

On Tue, Aug 15, 2017 at 4:11 PM, Gil Goldberger <g...@recongate.com> wrote:
There is no proxy between the two servers, and curl works fine.

Are there any tests I can run on the target server to see what the problem is?
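
Once captures exist from both sides, the 403 responses can be located by scanning the capture text for HTTP status lines and noting the headers that follow. The sketch below assumes the capture was written as ASCII text (for example with tcpdump's -A option); the thread does not say which flags were used, and the function name is hypothetical. A 403 that shows up in the Prometheus-side capture but not in the target-side capture would suggest it was generated somewhere in between.

```python
import re

# A status line may be preceded by non-HTTP bytes on the same line in an
# ASCII dump, so search anywhere in the line rather than anchoring at column 0.
STATUS_RE = re.compile(r"HTTP/1\.[01] (\d{3})[^\r\n]*")
HEADER_RE = re.compile(r"^(Server|Via|Date|Content-Type):\s*(.*)$", re.IGNORECASE)


def find_responses(capture_text, status="403"):
    """Return one dict per HTTP response with the given status code found in
    an ASCII packet dump, including a few identifying response headers."""
    hits = []
    current = None
    for line in capture_text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            current = None
            if match.group(1) == status:
                current = {"status_line": match.group(0).strip(), "headers": {}}
                hits.append(current)
            continue
        if current is None:
            continue
        if not line.strip():
            current = None  # a blank line ends the response header block
            continue
        header = HEADER_RE.match(line)
        if header:
            current["headers"][header.group(1).title()] = header.group(2).strip()
    return hits
```

A Server or Via header on the 403 that does not match the exporter's usual responses is a hint that another component produced it.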

Ben Kochie

Aug 15, 2017, 11:00:23 AM
to Gil Goldberger, Prometheus Users
Attempt 2, with gzip'd attachments.

On Tue, Aug 15, 2017 at 4:49 PM, Gil Goldberger <g...@recongate.com> wrote:
I have sent the tcpdump output from both the server and the target.

In addition, I have sent a video of the Prometheus web UI showing what happens when I refresh the browser.

Maybe you can help me understand what the problem is.
prometheus.mov.gz
server_tcpdump.log.gz
target_tcpdump.log.gz

Gil Goldberger

Aug 16, 2017, 4:52:55 AM
to promethe...@googlegroups.com, Ben Kochie

---------- Forwarded message ----------
From: Gil Goldberger <g...@recongate.com>
Date: Tue, Aug 15, 2017 at 6:11 PM
Subject: Re: [prometheus-users] Servers randomly return http error 403
To: Ben Kochie <sup...@gmail.com>
server_tcpdump.log.gz
target_tcpdump.log.gz