Prometheus is scraping metrics from an instance which has no exporter running on it

Yashar Nesabian

Jun 23, 2020, 2:32:02 AM
to Prometheus Users
Hi,
A few days ago I realized that the IPMI exporter is not running on one of our bare-metal servers, but we didn't get any alert from Prometheus. Although I cannot fetch the metrics via curl from the Prometheus server, Prometheus is still scraping metrics successfully from this server!
Here is the Prometheus page showing that Prometheus can scrape metrics successfully:

prom.png


But when I SSH to the server, nothing is listening on port 9290:



I've also checked the DNS records, and they are correct (when I ping the hostname, it resolves to the correct address). Here is the curl result from the Prometheus server for one-08:
curl http://one-08.compute.x.y.z:9290
curl: (7) Failed to connect to one-08.compute.x.y.z port 9290: Connection refused

The weird thing is I can see one-08 metrics on the Prometheus server (for the moment):

prom1.png



I also added this job to another Prometheus server, and that one reports the error "context deadline exceeded", which is the behavior I would expect.

Christian Hoffmann

Jun 30, 2020, 2:30:28 AM
to Yashar Nesabian, Prometheus Users
Hi,


On 6/23/20 8:32 AM, Yashar Nesabian wrote:
> Hi,
> A few days ago I realized IPMI exporter is not running on one of our
> bare metals but we didn't get any alert from our Prometheus. Although I
> cannot get the metrics via curl on the Prometheus server, our Prometheus
> is scraping metrics successfully from this server!
> here is the Prometheus page indicating the Prometheus can scrap metrics
> successfully :
>
> prom.png
>
>
> But when I SSH to the server, no one is listening on port 9290:
>
>
>
> And I've checked the DNS records, they are correct (when I ping the
> address, it returns the correct address. Here is the curl result from
> the Prometheus server for one-08:
> |
> curl http://one-08.compute.x.y.z:9290
> curl: (7) Failed to connect to one-08.compute.x.y.z port 9290:
> Connection refused
> |
>
> The weird thing is I can see one-08 metrics on the Prometheus server
> (for the moment):
>
> prom1.png
>
>
>
> I tried to put this job on another Prometheus server but I get an error
> on the second one claiming context deadline exceeded which is correct.

Could a DNS cache be involved?

Try comparing
getent hosts one-08
vs.
dig one-08
on the Prometheus machine.

You can also try tcpdump to analyze where Prometheus is actually
connecting to.
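
The comparison above can also be scripted. What follows is a minimal Python sketch, not a Prometheus tool: it resolves the name through the system resolver and then tries a TCP connection to every address it got back. The helper names resolve_system, can_connect and check are made up for illustration; the host name and port in the usage note are the ones from this thread.

```python
import socket


def resolve_system(host, port):
    """Return the sorted, de-duplicated addresses the system resolver gives for host.

    socket.getaddrinfo calls the C library resolver, so it honours /etc/hosts
    and nsswitch.conf, much like getent does.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(address, port, timeout=3.0):
    """Return True if a TCP connection to (address, port) succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


def check(host, port):
    """Return (address, reachable) pairs for every address host resolves to."""
    return [(address, can_connect(address, port))
            for address in resolve_system(host, port)]
```

Running print(check("one-08.compute.x.y.z", 9290)) on the Prometheus machine lists each resolved address and whether the port accepts connections. This is only an approximation of what Prometheus sees: Prometheus is written in Go and may use Go's own resolver rather than the C library, so tcpdump remains the more direct check.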

Kind regards,
Christian

nesa...@gmail.com

Jul 8, 2020, 4:02:38 AM
to Prometheus Users
Hi
I tried what you suggested, but both dig and getent return the correct address.
I think Prometheus itself is caching the address, because we see the same thing on another target: we had a DNS record "kafka-a.x.y.z" pointing to an IP address. We have since deleted that DNS record, and dig can no longer resolve the name "kafka-a.x.y.z", yet Prometheus is still reading metrics from it.
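
One way to see what Prometheus itself reports for these targets is its HTTP API endpoint /api/v1/targets. Below is a minimal Python sketch under some assumptions: the function names summarize_targets and fetch_targets are hypothetical, and the base URL would be wherever the Prometheus server listens (http://localhost:9090 is the default, but that is an assumption about this setup).

```python
import json
from urllib.request import urlopen


def summarize_targets(payload):
    """Extract (job, scrape URL, health, last error) from a /api/v1/targets response."""
    rows = []
    for target in payload.get("data", {}).get("activeTargets", []):
        rows.append((
            target.get("labels", {}).get("job", ""),
            target.get("scrapeUrl", ""),
            target.get("health", ""),
            target.get("lastError", ""),
        ))
    return rows


def fetch_targets(base_url):
    """Fetch the target list from a Prometheus server's HTTP API."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=10) as response:
        return json.load(response)
```

Calling summarize_targets(fetch_targets("http://localhost:9090")) on each Prometheus server would show whether both servers use the same scrape URL for one-08 and what error, if any, each records. The API reports the configured host name, not the IP address actually contacted, so it cannot confirm or rule out caching on its own.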