alluxio kerberos error


Chen Song

Nov 1, 2016, 11:40:01 AM
to Alluxio Users
I am testing Alluxio in a Kerberized environment, using 1.4.0-SNAPSHOT.

  • I followed the instructions at http://www.alluxio.org/docs/1.3/en/Configuring-Alluxio-with-secure-HDFS.html and set the following configs in alluxio-site.properties.
    • alluxio.master.keytab.file=/etc/krb5.keytab.hdfs
      alluxio.master.principal=hdfs/<_HOST>@<REALM>
      alluxio.worker.keytab.file=/etc/krb5.keytab.hdfs
      alluxio.worker.principal=hdfs/<_HOST>@<REALM>
  • I started the Alluxio worker as the hdfs user so it can access the keytab and log in.
  • In the worker application logs, I can see that the keytab login succeeded after the worker started.
    • 2016-11-01 15:28:50,664 INFO  security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab) - Login successful for user hdfs/<_HOST>@<REALM> using keytab file /etc/krb5.keytab.hdfs
  • However, when I tried to load a file from HDFS into Alluxio, it failed with the errors shown below. They look like typical Kerberos errors.
    • 2016-11-01 15:28:50,713 WARN  security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      2016-11-01 15:28:50,714 WARN  ipc.Client (Client.java:run) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      2016-11-01 15:28:50,714 WARN  security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      2016-11-01 15:28:50,726 WARN  security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      2016-11-01 15:28:50,727 WARN  ipc.Client (Client.java:run) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      2016-11-01 15:28:50,727 WARN  security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      2016-11-01 15:28:50,730 INFO  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke) - Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over <namenode>/<ip>:8020 after 1 fail over attempts. Trying to fail over immediately.
      java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "<worker node>/<ip>"; destination host is: "<namenode>":8020;
              at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
              at org.apache.hadoop.ipc.Client.call(Client.java:1472)
              at org.apache.hadoop.ipc.Client.call(Client.java:1399)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
              at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
              at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1980)
              at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
              at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
              at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
              at alluxio.underfs.hdfs.HdfsUnderFileSystem.exists(HdfsUnderFileSystem.java:178)
              at alluxio.worker.file.UnderFileSystemManager$InputStreamAgent.<init>(UnderFileSystemManager.java:153)
              at alluxio.worker.file.UnderFileSystemManager$InputStreamAgent.<init>(UnderFileSystemManager.java:120)
              at alluxio.worker.file.UnderFileSystemManager.openFile(UnderFileSystemManager.java:514)
              at alluxio.worker.file.DefaultFileSystemWorker.openUfsFile(DefaultFileSystemWorker.java:163)
              at alluxio.worker.file.FileSystemWorkerClientServiceHandler$5.call(FileSystemWorkerClientServiceHandler.java:169)
              at alluxio.worker.file.FileSystemWorkerClientServiceHandler$5.call(FileSystemWorkerClientServiceHandler.java:166)
              at alluxio.RpcUtils.call(RpcUtils.java:62)
              at alluxio.worker.file.FileSystemWorkerClientServiceHandler.openUfsFile(FileSystemWorkerClientServiceHandler.java:166)
              at alluxio.thrift.FileSystemWorkerClientService$Processor$openUfsFile.getResult(FileSystemWorkerClientService.java:709)
              at alluxio.thrift.FileSystemWorkerClientService$Processor$openUfsFile.getResult(FileSystemWorkerClientService.java:693)
              at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
              at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
              at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
              at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
              at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
              at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
              at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
              at org.apache.hadoop.ipc.Client.call(Client.java:1438)
              ... 35 more
      Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
              at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
              at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
              at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
The only way I can get this working is to explicitly kinit as hdfs before starting the worker. But with that approach, the ticket expires after a few days.
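To confirm whether the worker's user currently holds a usable ticket, something like the following could be run as that user. This is only a rough sketch: it assumes the MIT Kerberos `klist` tool is on the PATH and that the worker reads the same default credential cache. The function name `ticket_cache_is_valid` is made up for illustration.

```python
import subprocess


def ticket_cache_is_valid(run=subprocess.run):
    """Return True when `klist -s` reports a readable, unexpired credential cache.

    `run` is injectable so the check can be exercised without Kerberos installed.
    """
    # `klist -s` prints nothing and reports the result through its exit status:
    # 0 if the cache can be read and is not expired, non-zero otherwise.
    result = run(["klist", "-s"])
    return result.returncode == 0
```

Calling `ticket_cache_is_valid()` before and after the manual kinit should show the difference. If `klist` is not installed, `subprocess.run` raises `FileNotFoundError`.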

Any thoughts on this?

Chen

Haoyuan Li

Nov 1, 2016, 12:12:23 PM
to Chen Song, Alluxio Users
Unless you intend to develop Alluxio itself, we suggest using a released version, e.g. version 1.3.

Best,

Haoyuan

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alluxio-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Chen Song

Nov 1, 2016, 3:02:46 PM
to Alluxio Users, chen.s...@gmail.com
Thanks Haoyuan.

I tried the 1.3.0 release and got the same error.

Chen

Chen Song

Nov 1, 2016, 5:19:41 PM
to Alluxio Users, chen.s...@gmail.com
From my testing, it seems that the following two properties are only used to log in to the KDC initially. After that, when it is time to access data in HDFS, the worker calls doAs() as the user who runs the application, not the user specified in the principal. Even if the running user is the same as the user in the Kerberos principal, the ticket doesn't appear to be passed properly.

alluxio.worker.keytab.file=/etc/krb5.keytab.hdfs
alluxio.worker.principal=hdfs/<_HOST>@<REALM>


Chaomin Yu

Nov 2, 2016, 8:44:34 PM
to Chen Song, Alluxio Users
Hi Chen,

Thanks for reporting this issue. This is indeed a known limitation in the current Alluxio release.
Can you please try to work around it by renewing the Kerberos TGT periodically, for example by calling "kinit" every few days before the TGT expires?

You can also refer to this thread for a similar discussion.
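A periodic renewal along those lines might look like the sketch below. It assumes MIT Kerberos `kinit` is on the PATH and that the script runs as the same user as the worker, so both share one credential cache. The function names, the `max_rounds` parameter, and the interval are illustrative only. The right interval depends on the ticket lifetime configured in your KDC.

```python
import subprocess
import time


def kinit_command(keytab, principal):
    """Build the argv for a non-interactive kinit from a keytab."""
    # -k: take the key from a keytab; -t: path of that keytab file.
    return ["kinit", "-kt", keytab, principal]


def renew_periodically(keytab, principal, interval_seconds,
                       run=subprocess.run, sleep=time.sleep, max_rounds=None):
    """Obtain a fresh TGT, wait, and repeat.

    `run` and `sleep` are injectable for testing. `max_rounds=None` loops
    forever, which is the intended mode for a long-lived helper process.
    Returns the number of kinit invocations performed.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        # check=True raises CalledProcessError if kinit exits non-zero,
        # so a failed renewal is not silently ignored.
        run(kinit_command(keytab, principal), check=True)
        rounds += 1
        sleep(interval_seconds)
    return rounds
```

For example, `renew_periodically("/etc/krb5.keytab.hdfs", "hdfs/worker1.example.com@EXAMPLE.COM", 8 * 3600)` would refresh the ticket every eight hours. The principal here is a placeholder, so substitute the real host and realm. A cron entry that runs the same `kinit -kt` command achieves the same effect without a long-running process.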

--
Cheers,
Chaomin

Chen Song

Nov 9, 2016, 2:07:06 PM
to Alluxio Users, chen.s...@gmail.com
Thanks for confirming this.
Let me give it a try.

Chen

Chaomin Yu

Nov 18, 2016, 1:11:43 PM
to Chen Song, Alluxio Users
Hi Chen,

Were you able to work around it?

Best,
Chaomin