Cannot access S3 from H2O Docker container on EC2 when using IAM role authorization

Burbach, Matthias

Jan 19, 2022, 5:48:57 AM
to h2os...@googlegroups.com
Hello,
I am trying out H2O on our EC2-based Kubernetes cluster but fail to access S3 bucket resources when relying on IAM role authorization. More precisely:

I am using h2oai/h2o-open-source-k8s:3.36.0.1 as the Docker image.

If I log into the running container, install the AWS CLI and access the bucket using aws s3 ls s3://my-bucket on the command line, it works fine. So the container does have sufficient privileges to access the bucket, but they don't seem to propagate to the Java processes of H2O.

If I pass the AWS S3 credentials to H2O through the Python API function set_s3_credentials(), it works as well. But I do not want to have this extra step.
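
For readers following along, the extra step mentioned above might look roughly like this. It is a minimal sketch, assuming the key pair is available in the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables on the client side and that the cluster is reachable at a URL of your choosing; the helper names are hypothetical and not part of H2O.

```python
import os


def s3_credentials_from_env(env=None):
    """Return (access_key_id, secret_access_key) from the standard AWS variables."""
    env = os.environ if env is None else env
    try:
        return env["AWS_ACCESS_KEY_ID"], env["AWS_SECRET_ACCESS_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable: {missing}") from None


def push_credentials_to_h2o(url, env=None):
    """Connect to a running H2O cluster and hand it the S3 credentials."""
    # Imported lazily so the helper above can be used without h2o installed.
    import h2o
    from h2o.persist import set_s3_credentials

    key_id, secret = s3_credentials_from_env(env)
    h2o.connect(url=url)
    set_s3_credentials(key_id, secret)
```

Calling push_credentials_to_h2o() with the cluster URL would have to be repeated whenever the cluster is restarted, which is the inconvenience described above.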

The documentation at https://docs.h2o.ai/h2o/latest-stable/h2o-docs/cloud-integration/ec2-and-s3.html
says "If you are running H2O using an IAM role, it is not necessary to distribute the AWS credentials to all the nodes in the cluster. The latest version of H2O can access the temporary access key."

Is there anything I am missing?

Thanks for your help,
Matthias


Burbach, Matthias

Jan 20, 2022, 2:18:56 PM
to h2os...@googlegroups.com
Hello,
I tried various approaches to make H2O aware of my temporary AWS credentials for S3 access. Inside the Docker container running H2O (on an AWS EC2 instance managed by Kubernetes), these credentials can be retrieved from http://169.254.169.254/latest/meta-data/iam/security-credentials/. None of the approaches worked.
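
For reference, a minimal sketch of how the temporary credentials behind that URL can be read and inspected. It assumes the instance answers plain GET requests on the metadata endpoint (no IMDSv2 session token header) and that the first listed role is the one of interest; the function names are hypothetical. It only shows what the credentials look like and does not by itself resolve the problem described here.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen

METADATA_URL = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def parse_role_credentials(document):
    """Extract the fields of interest from the JSON document returned for a role."""
    data = json.loads(document)
    expires = datetime.strptime(data["Expiration"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "access_key_id": data["AccessKeyId"],
        "secret_access_key": data["SecretAccessKey"],
        "session_token": data["Token"],  # temporary credentials always carry one
        "expires": expires.replace(tzinfo=timezone.utc),
    }


def fetch_role_credentials(timeout=2):
    """Query the metadata endpoint; only works from inside an EC2 instance."""
    with urlopen(METADATA_URL, timeout=timeout) as response:
        role = response.read().decode().splitlines()[0]  # first listed role name
    with urlopen(METADATA_URL + role, timeout=timeout) as response:
        return parse_role_credentials(response.read().decode())
```

The session token and the expiry time are what distinguish these credentials from a long-term key pair: any consumer has to pass the token along and refresh the credentials before they expire.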
It only works when I use long-term credentials (no session token handling required), which I have to inject into the container myself, e.g. by mounting them as a Kubernetes secret file.
This is not a very good solution in my context, but one I can accept as a workaround for now.
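
The workaround described above could be sketched as follows. This assumes the secret is mounted as a file in the AWS shared credentials format with a [default] profile; the mount path and the function name are hypothetical.

```python
import configparser

# Hypothetical mount point of the Kubernetes secret; adjust to the actual volume.
CREDENTIALS_FILE = "/etc/aws-secret/credentials"


def read_long_term_credentials(path=CREDENTIALS_FILE, profile="default"):
    """Read a long-term key pair from an AWS-style credentials file."""
    parser = configparser.ConfigParser()
    if not parser.read(path):  # read() returns the list of files it could parse
        raise FileNotFoundError(path)
    section = parser[profile]
    return section["aws_access_key_id"], section["aws_secret_access_key"]
```

The returned pair can then be handed to H2O, for example through set_s3_credentials() from the Python client.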

If someone managed to make it work even with temporary AWS credentials, please let me know.
Regards,
Matthias


From: Burbach, Matthias
Sent: Wednesday, January 19, 2022 11:48
To: h2os...@googlegroups.com <h2os...@googlegroups.com>
Subject: Cannot access S3 from H2O Docker container on EC2 when using IAM role authorization
 

Michal Kurka

Jan 27, 2022, 11:12:02 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Hello Matthias,

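> The documentation at https://docs.h2o.ai/h2o/latest-stable/h2o-docs/cloud-integration/ec2-and-s3.html says "If you are running H2O using an IAM role, it is not necessary to distribute the AWS credentials to all the nodes in the cluster. The latest version of H2O can access the temporary access key."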

The documentation refers to running directly on EC2 instances and might not be correct with respect to Kubernetes. Are you using EKS (https://aws.amazon.com/eks/), or what does your Kubernetes deployment look like?

In your deployment, H2O should pick up the credentials using InstanceProfileCredentialsProvider (option 6 on https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html).

> If I log into the running container, install the AWS CLI and access the bucket using aws s3 ls s3://my-bucket on the command line, it works fine.

This indicates we might need to make a change in H2O.

It would help us if you gave us details on how to reproduce the issue.

Thank you,
MK

Michal Kurka

Jan 27, 2022, 11:24:42 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Matthias,

one quick idea - if you are willing to experiment:

We might get around the issue if we bypass H2O's credential provider chain and use the AWS default one. To do that, you can define a system property when launching H2O:

java -Dsys.ai.h2o.persist.s3.customCredentialsProviderClass=com.amazonaws.auth.DefaultAWSCredentialsProviderChain -jar h2o.jar
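
To illustrate what a credential provider chain does in general, here is a toy model in Python. It is not H2O's or the AWS SDK's implementation; all names are hypothetical. The idea is that providers are tried in order and the first one that yields credentials wins, so swapping the chain changes which sources are consulted.

```python
class NoCredentials(Exception):
    """Raised by a provider that has nothing to offer."""


def resolve(providers):
    """Return the credentials of the first provider that succeeds."""
    errors = []
    for provider in providers:
        try:
            return provider()
        except NoCredentials as exc:
            # Remember why this provider failed, then fall through to the next one.
            errors.append(f"{provider.__name__}: {exc}")
    raise NoCredentials("; ".join(errors))


def explicit_credentials():
    """Toy provider: pretends no credentials were set explicitly."""
    raise NoCredentials("nothing configured")


def instance_profile():
    """Toy provider: pretends the instance role returned temporary credentials."""
    return ("ASIAEXAMPLE", "secret", "session-token")
```

With this model, resolve([explicit_credentials, instance_profile]) falls through to the second provider, whereas a chain that lacks a suitable provider fails even though credentials are available on the machine.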

Please let me know if this helped,

MK


Michal Kurka

Feb 11, 2022, 12:17:51 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Matthias,

This issue might be relevant to your problem: https://h2oai.atlassian.net/browse/PUBDEV-8567. Please take a look; it should be fixed in the next H2O release.

MK
