Using service account IAM roles in EKS


Louis Bougeard

Jan 15, 2020, 5:36:12 AM
to Prometheus Users
I'm trying to use prometheus-cloudwatch-exporter in EKS with an IAM role bound to the pod's serviceAccount via OIDC, rather than with Kube2IAM. This is the new-ish official Amazon way of doing things.

I've fixed the Helm chart for this and will be raising a PR in the coming days to allow annotations on serviceAccounts and to add them to the deployments, so that the pods get AWS_WEB_IDENTITY_TOKEN_FILE issued.
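
Whether the annotation took effect can be checked from inside the pod. The following is a minimal Python sketch, not part of the chart or the exporter: it assumes that EKS injects both AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE once the serviceAccount is annotated, and that a Python interpreter is available in (or alongside) the container. The function name missing_irsa_env is hypothetical.

```python
import os

# Variables EKS is expected to inject for an annotated serviceAccount.
# Assumption based on the AWS documentation, not on the chart itself.
EXPECTED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_env(env=None):
    """Return the expected variables that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_irsa_env()
    if missing:
        print("not injected:", ", ".join(missing))
    else:
        print("role:", os.environ["AWS_ROLE_ARN"])
        print("token file:", os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"])
```

If either variable is reported as not injected, the SDK has nothing to exchange and will fall through to the next provider in its chain.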

Where I'm now stuck is that the serviceAccount mounts a JWT in /var/run/secrets/eks.amazonaws.com/serviceaccount in a file called token by default.

This is owned by root, and I can't see how to change that in the official documentation (https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/). The issue is that the user in the container doesn't appear to have permission to read the JWT and exchange it for an AWS secret and token. I don't want to run as root in the container or have to roll my own.
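
To confirm the permission problem rather than infer it, the ownership and mode of the token file can be compared with the user the process runs as. A minimal Python sketch follows, assuming a Python interpreter is available in the container; the helper name describe_token_file is hypothetical, and the default path is the one quoted above.

```python
import os
import stat

# Default mount path quoted in this thread; AWS_WEB_IDENTITY_TOKEN_FILE
# overrides it when set.
DEFAULT_TOKEN_PATH = "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"


def describe_token_file(path):
    """Report owner, group, mode and readability of path for this process."""
    info = os.stat(path)  # os.stat follows symlinks to the real file
    return {
        "uid": info.st_uid,
        "gid": info.st_gid,
        "mode": oct(stat.S_IMODE(info.st_mode)),
        "readable": os.access(path, os.R_OK),
        "running_as": os.getuid(),
    }


if __name__ == "__main__":
    token_path = os.environ.get("AWS_WEB_IDENTITY_TOKEN_FILE", DEFAULT_TOKEN_PATH)
    try:
        print(describe_token_file(token_path))
    except OSError as err:
        print("cannot stat", token_path, "-", err)
```

A result with uid 0, a mode that grants nothing to group or others, and readable set to False would match the symptom described here.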

The logs show the following:

WARNING: CloudWatch scrape failed
com.amazonaws.services.cloudwatch.model.AmazonCloudWatchException: User: arn:aws:sts::123456789:assumed-role/aaa202001010000000000000000/i-abcdefghijklmnop is not authorized to perform: cloudwatch:ListMetrics (Service: AmazonCloudWatch; Status Code: 403; Error Code: AccessDenied; Request ID: XXXX

I've checked the POM and the version of the AWS SDK used (1.11.658) is modern enough to pick up the token file credentials provider in the default credentials chain.
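
The ARN in the log above gives a hint about which credentials the chain actually picked. The Python sketch below pulls the role and session name out of such a message. It rests on one assumption: credentials that come from an EC2 instance profile carry the instance ID as the session name, so a session starting with "i-" suggests the node's role was used rather than the serviceAccount's role. The names ASSUMED_ROLE_ARN and describe_principal are hypothetical.

```python
import re

# Matches the assumed-role principal in messages like the one logged above.
ASSUMED_ROLE_ARN = re.compile(
    r"arn:aws:sts::(?P<account>\d+):assumed-role/"
    r"(?P<role>[^/\s]+)/(?P<session>[^\s]+)"
)


def describe_principal(message):
    """Extract role and session name; flag sessions named like an EC2 instance."""
    match = ASSUMED_ROLE_ARN.search(message)
    if match is None:
        return None
    session = match.group("session")
    return {
        "role": match.group("role"),
        "session": session,
        # Assumption: instance-profile credentials use the instance ID as the
        # session name, so an "i-" prefix suggests the node role was used.
        "looks_like_node_role": session.startswith("i-"),
    }


if __name__ == "__main__":
    line = (
        "User: arn:aws:sts::123456789:assumed-role/"
        "aaa202001010000000000000000/i-abcdefghijklmnop "
        "is not authorized to perform: cloudwatch:ListMetrics"
    )
    print(describe_principal(line))
```

If the flag is set, the AccessDenied error is about the node role's policy, and the token file was probably never used.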

Has anyone worked out a way to change the owner of the mount by default, without having to mount the secret manually, etc.?

Louis Bougeard

Jan 15, 2020, 5:44:29 AM
to Prometheus Users
Some context I missed off...

It seems to be recommended to run as nobody, not root:
https://github.com/helm/charts/blob/master/stable/prometheus-cloudwatch-exporter/values.yaml#L164

A similar issue is referenced here:
https://github.com/aws/containers-roadmap/issues/23#issuecomment-535176333

Louis Bougeard

Jan 15, 2020, 8:45:10 AM
to Prometheus Users
Hopefully this is useful to someone:

I've raised a PR with the chart changes here: https://github.com/helm/charts/pull/20162.

I was using version 0.6.0 of cloudwatch exporter (prom/cloudwatch-exporter:cloudwatch_exporter-0.6.0); however, this doesn't contain a sufficiently up-to-date version of the AWS SDK. Version 0.7.0 does contain the correct version of the SDK and thus can grab the credentials. There is one big caveat, however: I need to set securityContext.runAsUser to run as root to be able to access the token file. This obviously isn't ideal, but it does at least work, for now...

If anyone has any thoughts on this, or better solutions, that would be much appreciated.

Chelo Montilla

May 11, 2020, 3:59:44 AM
to Prometheus Users
I'm trying to get it working with the latest version 0.8.0, but I'm getting this error:
May 08, 2020 12:45:40 PM io.prometheus.cloudwatch.CloudWatchCollector collect
WARNING: CloudWatch scrape failed
com.amazonaws.services.resourcegroupstaggingapi.model.AWSResourceGroupsTaggingAPIException: User: arn:aws:sts::0123456789:assumed-role/staging/i-0b79d679574316228 is not authorized to perform: tag:GetResources (Service: AWSResourceGroupsTaggingAPI; Status Code: 400; Error Code: AccessDeniedException; Request ID: c7a6c5c8-3c1f-451e-b548-ff38bc84c9ee)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1742)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1371)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1347)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1127)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:784)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:752)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
	at com.amazonaws.services.resourcegroupstaggingapi.AWSResourceGroupsTaggingAPIClient.doInvoke(AWSResourceGroupsTaggingAPIClient.java:1631)
	at com.amazonaws.services.resourcegroupstaggingapi.AWSResourceGroupsTaggingAPIClient.invoke(AWSResourceGroupsTaggingAPIClient.java:1598)
	at com.amazonaws.services.resourcegroupstaggingapi.AWSResourceGroupsTaggingAPIClient.invoke(AWSResourceGroupsTaggingAPIClient.java:1587)
	at com.amazonaws.services.resourcegroupstaggingapi.AWSResourceGroupsTaggingAPIClient.executeGetResources(AWSResourceGroupsTaggingAPIClient.java:1021)
	at com.amazonaws.services.resourcegroupstaggingapi.AWSResourceGroupsTaggingAPIClient.getResources(AWSResourceGroupsTaggingAPIClient.java:992)
	at io.prometheus.cloudwatch.CloudWatchCollector.getResourceTagMappings(CloudWatchCollector.java:292)
	at io.prometheus.cloudwatch.CloudWatchCollector.scrape(CloudWatchCollector.java:548)
	at io.prometheus.cloudwatch.CloudWatchCollector.collect(CloudWatchCollector.java:664)
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:190)
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:223)
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:144)
	at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22)
	at io.prometheus.client.exporter.MetricsServlet.doGet(MetricsServlet.java:48)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:873)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:542)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.Server.handle(Server.java:502)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
	at java.base/java.lang.Thread.run(Unknown Source)


The same permissions work using access keys. I set the securityContext to root as you said, and I can see the file is accessible in the pod. How did you manage to get it to work?
Have you found any solution to not have to run as root?
Thanks in advance

Sally Lehman

May 16, 2020, 12:54:24 AM
to Prometheus Users
User: arn:aws:sts::0123456789:assumed-role/staging/i-0b79d679574316228 is not authorized to perform: tag:GetResources - have you verified that the user you are trying to access these resources with has the same permissions as the user for your access keys?

Chelo Montilla

May 16, 2020, 2:21:15 AM
to Prometheus Users
Yes, the same permissions work fine using access keys.