EC2 Discovery fails on Prometheus 2.0


johnath...@gmail.com

Nov 15, 2017, 10:35:30 AM
to Prometheus Users
After updating the Docker container to 2.0 and updating the rules, Prometheus complains that it cannot perform EC2 discovery because no role ARN was provided. Any help would be appreciated. I can confirm the server can reach the internet: I ran curl against https://prometheus.io/docs/prometheus/latest/configuration/configuration/ and it retrieved the page successfully.


level=error ts=2017-11-15T14:54:04.524776878Z caller=ec2.go:127 component="target manager" discovery=ec2 msg="Refresh failed" err="could not describe instances: EC2RoleRequestError: no EC2 instance role found caused by:
 EC2MetadataError: failed to make EC2Metadata request caused by:
 <?xml version="1.0" encoding="iso-8859-1"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
   <title>404 - Not Found</title>
  </head>
  <body>
   <h1>404 - Not Found</h1>
  </body>
 </html>

yogiaws

Nov 15, 2017, 4:38:14 PM
to Prometheus Users
Hi,

The EC2 instance where Prometheus is running needs an IAM role that has, at bare minimum, EC2 read access, so that it can call ec2:DescribeInstances. Alternatively, you can provide an access_key/secret_key pair that has similar permissions.
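
As a rough illustration of the second option, a minimal sketch of an ec2_sd_configs entry with the keys set explicitly might look like the following. The job name and key values are placeholders, not taken from this thread. According to the docs quoted further down in this thread, leaving both keys blank makes Prometheus use the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables instead.

```yaml
scrape_configs:
  - job_name: ec2-nodes            # placeholder job name
    ec2_sd_configs:
      - region: us-east-1
        # Placeholder credentials for an IAM user that is allowed to call
        # ec2:DescribeInstances (for example via AmazonEC2ReadOnlyAccess).
        access_key: AKIAXXXXXXXXXXXXXXXX
        secret_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        port: 9100
```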

johnath...@gmail.com

Nov 15, 2017, 6:19:58 PM
to Prometheus Users
I am using the access key/secret key and it was working before the upgrade to 2.0.

johnathan falk

Nov 16, 2017, 12:45:50 PM
to Prometheus Users
Currently the system has the access/secret key. The user currently has these permissions:
AmazonEC2ReadOnlyAccess

--
You received this message because you are subscribed to a topic in the Google Groups "Prometheus Users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/prometheus-users/0gzPE8CL0gQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to prometheus-use...@googlegroups.com.
To post to this group, send email to promethe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/74f32e9d-9d63-4d12-9488-f90da1ca7957%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Thank You,
Johnathan Falk

Conor Broderick

Nov 20, 2017, 9:56:02 AM
to johnathan falk, Prometheus Users
I think there may have been a regression introduced in #3343 which is causing your issue here.

Can you post your ec2_sd_config here so we can take a look? 


johnathan falk

Nov 20, 2017, 1:19:52 PM
to Conor Broderick, Prometheus Users
# Monitor basic AWS information
  - job_name: 'AWS BBS Node Info'
    ec2_sd_configs:
      - region: us-east-1
        port: 9100
      - region: us-east-2
        port: 9100
      - region: us-west-1
        port: 9100
      - region: us-west-2
        port: 9100
    relabel_configs:
      # Only monitor instances with infra:monitoring:prometheus:node = "true"
      - source_labels: [__meta_ec2_tag_infra_monitoring_prometheus_node]
        regex: true
        action: keep
      # Use the instance ID as the instance label
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
      - source_labels: [__meta_ec2_tag_Name]
        target_label: name

I store the access_key / secret_key in environment variables. I have also tried putting them directly in the config, but it still doesn't work.


johnathan falk

Nov 20, 2017, 1:51:34 PM
to Conor Broderick, Prometheus Users
I think I have it working now, but I had to change my AWS configuration. Previously the user was part of a group that granted EC2 read-only access. I created a role allowing that access and assigned it to the Prometheus server nodes, and that seems to have fixed the problem.

ghatti.s...@gmail.com

Nov 25, 2017, 4:38:56 PM
to Prometheus Users
Hi,

I am also having a similar issue.
Prometheus is able to scrape with an AWS access_key and secret_key but fails when using the EC2 instance role. I am able to describe the EC2 instances from the Prometheus instance directly.

The EC2 config:

-   job_name: ec2-scrape-configs
    ec2_sd_configs:
      - region: ap-southeast-1
        role_arn: arn:aws:iam::xxxxxxxxxxxx:role/aws-opsworks-ec2-role
        port: 9126
    relabel_configs:
        # Only monitor instances with a tag "legos"
      - source_labels: [__meta_ec2_tag_Name]
        regex: "(legos -*)"
        action: keep 

Am I missing something?
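
Separately from the credentials question, Prometheus relabel regexes are fully anchored, so `(legos -*)` only matches the literal text `legos ` followed by zero or more hyphens. If the intent is to keep every instance whose Name tag starts with `legos` (an assumption about the intent, not something stated above), a sketch of the rule could be:

```yaml
relabel_configs:
  # Keep only targets whose Name tag begins with "legos". The trailing ".*"
  # is needed because the regex has to match the whole tag value.
  - source_labels: [__meta_ec2_tag_Name]
    regex: "legos.*"
    action: keep
```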

rob.m...@cbsinteractive.com

Dec 4, 2017, 5:56:25 PM
to Prometheus Users
Did you ever figure this out? I am having the same issue. 

ghatti.s...@gmail.com

Dec 5, 2017, 12:47:56 PM
to Prometheus Users
Nope. I feel I am missing something very basic.
This is what I tried:

The EC2 instance is launched with a role that has read access to all of AWS EC2. I validated this by running an EC2 describe call from the instance, which returns the details of the instances. (I did not configure any AWS access keys on the box.)
But when running Prometheus with the role_arn (as detailed in my previous message), I see access denied messages in the Prometheus logs.
I am running Prometheus with access_key/secret_key for now.
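
One possible explanation, offered as an assumption rather than a confirmed diagnosis: `role_arn` appears to name a role that Prometheus assumes through STS on top of whatever credentials it already has, so pointing it at the instance's own role can be denied unless that role's trust policy allows it. A minimal sketch that relies on the instance role directly would leave out `role_arn`, `access_key`, and `secret_key` altogether:

```yaml
- job_name: ec2-scrape-configs
  ec2_sd_configs:
    - region: ap-southeast-1
      # No access_key, secret_key, profile, or role_arn here. The assumption
      # is that the AWS SDK then falls back to the role attached to the
      # instance Prometheus runs on.
      port: 9126
```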

nizam...@gmail.com

Jan 14, 2019, 5:52:07 AM
to Prometheus Users

I also faced the same issue, but after some trial and error I found that "role_arn" is the culprit; use "profile" instead. It worked for me.

ec2_sd_configs:
  - region: 'us-west-2'
    profile: 'arn:aws:iam::XXXXX:instance-profile/my-ec2-role'
    filters:
      - name: tag:Service
        values:
          - abc

ghatti.s...@gmail.com

Feb 6, 2019, 4:31:46 AM
to Prometheus Users
From the docs, "profile" is the name of the AWS credentials profile:
# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[ access_key: <string> ]
[ secret_key: <secret> ]
# Named AWS profile used to connect to the API.
[ profile: <string> ]

Does the instance where Prometheus is running have an IAM role with read access to EC2?
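
To illustrate the distinction the docs draw, a sketch of `profile` used as a profile name rather than an ARN is shown below. The profile name `monitoring` is hypothetical and would have to exist in the AWS shared credentials file of the user running Prometheus.

```yaml
ec2_sd_configs:
  - region: us-west-2
    # Hypothetical profile name. It refers to a [monitoring] section in the
    # shared credentials file (typically ~/.aws/credentials), not to a role
    # or instance-profile ARN.
    profile: monitoring
    port: 9100
```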

Asim Ayub

Nov 18, 2020, 7:33:16 AM
to Prometheus Users
Hi,
I know this is an old issue, but I am having the same problem. I am using the kube-prometheus-stack-10.3.2 helm chart to run Prometheus. I also have 'kube2iam' running in the cluster. For all my other services I have attached an IAM role to the pod as an annotation <iam.amazonaws.com/role: "arn:aws:iam::123456789:role/k8s-role-name"> and have been able to access the relevant AWS service.
I have followed the same procedure in this instance: I created an IAM role with the correct trust policy and EC2 read-only permissions. I pass the role as an annotation to the Prometheus pod and define it in the config as follows:
- job_name: ec2
  ec2_sd_configs:
    - region: eu-west-1
      profile: "arn:aws:iam::123456789:role/<role-name>"
      port: 9100
However I keep getting the error:
component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
I have replaced 'profile' with 'role_arn' but no luck.
However, if I replace 'profile' or 'role_arn' with access_key and secret_key and remove the pod annotation, everything works fine. For security reasons, I would rather not use access_key and secret_key.
I have seen this GitHub issue, but it does not reach a conclusion.

Any help will be highly appreciated.
Thanks

