Alluxio workers on YARN can't authenticate with Kerberos


kong....@gmail.com

Aug 20, 2018, 4:56:26 AM
to Alluxio Users

Hi, community,

I have an Alluxio cluster with three workers running on YARN, while the master runs on my own infrastructure.

All the workers exited with an exception:

java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]



As suggested, I copied krb5.conf and the keytab into {alluxio_home}/conf, since Alluxio ships everything in that folder into the container.

In alluxio-site.properties, I specified the keytab and principal:

# Kerberos
alluxio.master.keytab.file=${alluxio.home}/conf/alluxio.keytab
alluxio.master.principal=allux...@XXXXX.HADOOP.XXXXXX.COM
alluxio.worker.keytab.file=${alluxio.home}/conf/alluxio.keytab
alluxio.worker.principal=allux...@XXXXX.HADOOP.XXXXXX.COM

And in alluxio-env.sh, I appended to ALLUXIO_JAVA_OPTS:

ALLUXIO_JAVA_OPTS+=" -Djava.security.krb5.conf=${ALLUXIO_HOME}/conf/krb5.conf"


I also put all the Hadoop-related config files under {alluxio_home}/conf:

alluxio-env.sh
alluxio-env.sh.template
alluxio-site.properties
alluxio-site.properties.template
core-site.xml
core-site.xml.template
hdfs-site.xml
krb5.conf
log4j.properties
mapred-site.xml
masters
metrics.properties.template
alluxio.keytab
workers
yarn-site.xml


Do I need some other configuration to make Kerberos work for workers on YARN? Or is there any way to debug this issue further?

Thanks in advance!

Gene Pang

Aug 20, 2018, 2:12:52 PM
to Alluxio Users
Hi,

Is there more of a stack trace for the AccessControlException you see on the Alluxio workers?
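In the meantime, one way to surface more Kerberos detail on the workers — a sketch, using standard JVM debug flags; remove them once the problem is diagnosed — is to enable krb5 tracing in alluxio-env.sh:

```shell
# alluxio-env.sh -- turn on JVM-level Kerberos tracing (diagnostic only)
ALLUXIO_JAVA_OPTS+=" -Dsun.security.krb5.debug=true"
# Also useful: trace how the JAAS login configuration is resolved
ALLUXIO_JAVA_OPTS+=" -Djava.security.debug=logincontext,configfile"
```

The resulting output in the worker logs usually shows which krb5.conf and keytab the JVM actually picked up inside the container.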

Thanks,
Gene

kong....@gmail.com

Aug 20, 2018, 8:21:52 PM
to Alluxio Users
Hi,

Thanks for your reply.
Here is what I can get from the YARN worker page:

ERROR ApplicationMaster - Error running Application Master
java.lang.RuntimeException: Cannot find resource
        at alluxio.yarn.ApplicationMaster.setupLocalResources(ApplicationMaster.java:445)
        at alluxio.yarn.ApplicationMaster.launchWorkerContainer(ApplicationMaster.java:392)
        at alluxio.yarn.ApplicationMaster.requestAndLaunchContainers(ApplicationMaster.java:337)
        at alluxio.yarn.ApplicationMaster.runApplicationMaster(ApplicationMaster.java:232)
        at alluxio.yarn.ApplicationMaster.access$000(ApplicationMaster.java:75)
        at alluxio.yarn.ApplicationMaster$2.run(ApplicationMaster.java:206)
        at alluxio.yarn.ApplicationMaster$2.run(ApplicationMaster.java:203)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
        at alluxio.yarn.ApplicationMaster.main(ApplicationMaster.java:203)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "XXXXXX.XXXX.XXXX.XXX.XXXXX/DDDD.DD.DD.DDD"; destination host is: "XXXXXX.XXXX.XXXX.XXX.XXXXX":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:823)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1438)
        at alluxio.yarn.YarnUtils.createLocalResourceOfFile(YarnUtils.java:88)
        at alluxio.yarn.ApplicationMaster.setupLocalResources(ApplicationMaster.java:440)
        ... 10 more
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:720)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:683)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:770)
        at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1620)
        at org.apache.hadoop.ipc.Client.call(Client.java:1451)
        ... 29 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:595)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:397)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:762)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:758)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:757)
        ... 32 more


Thanks,
Mu

kong....@gmail.com

Aug 20, 2018, 8:35:12 PM
to Alluxio Users
Sorry, the code block seems unable to take so much text.


kong....@gmail.com

Aug 20, 2018, 8:44:21 PM
to Alluxio Users
Another thing I found while trying to figure this out: if I don't put a hostname in the under-FS path on the command line:

/integration/yarn/bin/alluxio-yarn.sh 1 hdfs:///user/alluxio_user/alluxio/test/storage master.hostname.com

the workers launch successfully.
But when I then run

${ALLUXIO_HOME}/bin/alluxio runTests

the worker complains that there is no host in the path.
So it seems that when I leave the hostname out of the path, Alluxio somehow passes authentication.
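For reference, these are the two URI forms involved (hostnames here are placeholders):

```shell
# Explicit namenode host in the under-FS URI -- the client must authenticate to it:
hdfs://namenode.example.com:8020/user/alluxio_user/alluxio/test/storage
# No host -- the HDFS client falls back to fs.defaultFS from core-site.xml:
hdfs:///user/alluxio_user/alluxio/test/storage
```

The host-less form only works if the Hadoop config files are actually readable by the worker, which turned out to be the real issue.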

kong....@gmail.com

Aug 21, 2018, 5:00:44 AM
to Alluxio Users
I solved this issue.

The problem was that I had hard-coded an absolute path for alluxio.underfs.hdfs.configuration in alluxio-site.properties.

After changing it to use ${alluxio.home}, the workers on YARN work fine.
Also, since hdfs-site.xml already specifies the default hostname for HDFS, I no longer need to put a hostname in the under-FS path.
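In case it helps others, the change was along these lines (a sketch; the absolute path shown here is made up):

```shell
# alluxio-site.properties
# Before: hard-coded absolute path, which doesn't exist inside the YARN container
# alluxio.underfs.hdfs.configuration=/opt/alluxio/conf/hdfs-site.xml
# After: resolved relative to the Alluxio home inside the container
alluxio.underfs.hdfs.configuration=${alluxio.home}/conf/hdfs-site.xml
```

With the portable path, the ApplicationMaster can find the HDFS client config in the container and Kerberos authentication succeeds.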

Gene Pang

Aug 21, 2018, 11:13:30 AM
to Alluxio Users
I'm glad it is solved! Thanks for sharing your experience and details on how you resolved the issue.

Thanks,
Gene