2016-10-06 18:42:00,006 ERROR logger.type (MetricsConfig.java:loadConfigFile) - Error loading metrics configuration file.2016-10-06 18:42:01,074 WARN util.NativeCodeLoader (NativeCodeLoader.java:<clinit>) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable2016-10-06 18:42:01,297 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab) - Login successful for user hdfs/ip-172-31-40-120.us-w...@PHDATA.IO using keytab file /tmp/alluxio-1.1.0/conf/hdfs.keytab2016-10-06 18:42:01,300 INFO logger.type (AbstractMaster.java:start) - BlockMaster: Starting leader master.2016-10-06 18:42:01,302 INFO logger.type (JournalWriter.java:completeAllLogs) - Marking all logs as complete.2016-10-06 18:42:01,303 INFO logger.type (JournalWriter.java:completeCurrentLog) - Completed current log: /tmp/alluxio-1.1.0/journal/BlockMaster/log.out to completed log: /tmp/alluxio-1.1.0/journal/BlockMaster/completed/log.000000000000000000012016-10-06 18:42:01,304 INFO logger.type (AbstractMaster.java:start) - BlockMaster: process entire journal before becoming leader master.2016-10-06 18:42:01,304 INFO logger.type (JournalTailer.java:processJournalCheckpoint) - BlockMaster: Loading checkpoint file: /tmp/alluxio-1.1.0/journal/BlockMaster/checkpoint.data2016-10-06 18:42:01,305 INFO logger.type (JournalReader.java:getCheckpointInputStream) - Opening journal checkpoint file: /tmp/alluxio-1.1.0/journal/BlockMaster/checkpoint.data2016-10-06 18:42:01,318 INFO logger.type (JournalReader.java:getNextInputStream) - Opening journal log file: /tmp/alluxio-1.1.0/journal/BlockMaster/completed/log.000000000000000000012016-10-06 18:42:01,318 INFO logger.type (JournalTailer.java:processNextJournalLogFiles) - BlockMaster: Processing a completed log file.2016-10-06 18:42:01,318 INFO logger.type (JournalTailer.java:processNextJournalLogFiles) - BlockMaster: Finished processing the log file.2016-10-06 18:42:01,318 INFO logger.type (JournalWriter.java:getCheckpointOutputStream) - Creating tmp checkpoint file: /tmp/alluxio-1.1.0/journal/BlockMaster/checkpoint.data.tmp2016-10-06 18:42:01,318 INFO logger.type (JournalWriter.java:getCheckpointOutputStream) - Latest journal sequence number: 22 Next journal sequence number: 232016-10-06 18:42:01,342 INFO logger.type (JournalWriter.java:close) - Successfully created tmp checkpoint file: /tmp/alluxio-1.1.0/journal/BlockMaster/checkpoint.data.tmp2016-10-06 18:42:01,343 INFO logger.type (JournalWriter.java:close) - Renamed checkpoint file /tmp/alluxio-1.1.0/journal/BlockMaster/checkpoint.data.tmp to /tmp/alluxio-1.1.0/journal/BlockMaster/checkpoint.data2016-10-06 18:42:01,344 INFO logger.type (JournalWriter.java:deleteCompletedLogs) - Deleting all completed log files...2016-10-06 18:42:01,344 INFO logger.type (JournalWriter.java:deleteCompletedLogs) - Deleting completed log: /tmp/alluxio-1.1.0/journal/BlockMaster/completed/log.000000000000000000012016-10-06 18:42:01,345 INFO logger.type (JournalWriter.java:deleteCompletedLogs) - Finished deleting all completed log files.2016-10-06 18:42:01,353 INFO logger.type (JournalWriter.java:openCurrentLog) - Opened current log file: /tmp/alluxio-1.1.0/journal/BlockMaster/log.out2016-10-06 18:42:01,356 INFO logger.type (MountTable.java:add) - Mounting hdfs://ip-172-31-37-123.us-west-2.compute.internal:8020 at /2016-10-06 18:42:01,357 INFO logger.type (AbstractMaster.java:start) - FileSystemMaster: Starting leader master.2016-10-06 18:42:01,357 INFO logger.type (JournalWriter.java:completeAllLogs) - Marking all logs as complete.2016-10-06 18:42:01,357 INFO logger.type (JournalWriter.java:completeCurrentLog) - Completed current log: /tmp/alluxio-1.1.0/journal/FileSystemMaster/log.out to completed log: /tmp/alluxio-1.1.0/journal/FileSystemMaster/completed/log.000000000000000000012016-10-06 18:42:01,358 INFO logger.type (AbstractMaster.java:start) - FileSystemMaster: process entire journal before becoming leader master.2016-10-06 18:42:01,358 INFO logger.type (JournalTailer.java:processJournalCheckpoint) - FileSystemMaster: Loading checkpoint file: /tmp/alluxio-1.1.0/journal/FileSystemMaster/checkpoint.data2016-10-06 18:42:01,358 INFO logger.type (JournalReader.java:getCheckpointInputStream) - Opening journal checkpoint file: /tmp/alluxio-1.1.0/journal/FileSystemMaster/checkpoint.data2016-10-06 18:42:01,364 INFO logger.type (JournalReader.java:getNextInputStream) - Opening journal log file: /tmp/alluxio-1.1.0/journal/FileSystemMaster/completed/log.000000000000000000012016-10-06 18:42:01,364 INFO logger.type (JournalTailer.java:processNextJournalLogFiles) - FileSystemMaster: Processing a completed log file.2016-10-06 18:42:01,374 INFO logger.type (JournalTailer.java:processNextJournalLogFiles) - FileSystemMaster: Finished processing the log file.2016-10-06 18:42:01,374 INFO logger.type (JournalWriter.java:getCheckpointOutputStream) - Creating tmp checkpoint file: /tmp/alluxio-1.1.0/journal/FileSystemMaster/checkpoint.data.tmp2016-10-06 18:42:01,374 INFO logger.type (JournalWriter.java:getCheckpointOutputStream) - Latest journal sequence number: 421 Next journal sequence number: 4222016-10-06 18:42:01,380 INFO logger.type (JournalWriter.java:close) - Successfully created tmp checkpoint file: /tmp/alluxio-1.1.0/journal/FileSystemMaster/checkpoint.data.tmp2016-10-06 18:42:01,380 INFO logger.type (JournalWriter.java:close) - Renamed checkpoint file /tmp/alluxio-1.1.0/journal/FileSystemMaster/checkpoint.data.tmp to /tmp/alluxio-1.1.0/journal/FileSystemMaster/checkpoint.data2016-10-06 18:42:01,380 INFO logger.type (JournalWriter.java:deleteCompletedLogs) - Deleting all completed log files...2016-10-06 18:42:01,380 INFO logger.type (JournalWriter.java:deleteCompletedLogs) - Deleting completed log: /tmp/alluxio-1.1.0/journal/FileSystemMaster/completed/log.000000000000000000012016-10-06 18:42:01,381 INFO logger.type (JournalWriter.java:deleteCompletedLogs) - Finished deleting all completed log files.2016-10-06 18:42:01,575 INFO server.Server (Server.java:doStart) - jetty-7.x.y-SNAPSHOT2016-10-06 18:42:01,596 INFO handler.ContextHandler (ContextHandler.java:startContext) - started o.e.j.s.ServletContextHandler{/metrics/json,null}2016-10-06 18:42:01,608 WARN webapp.WebInfConfiguration (WebInfConfiguration.java:makeTempDirectory) - Can't reuse /tmp/jetty-0.0.0.0-19999-webapp-_-any-, using /tmp/jetty-0.0.0.0-19999-webapp-_-any-_52209745782363821422016-10-06 18:42:01,679 INFO handler.ContextHandler (ContextHandler.java:startContext) - started o.e.j.w.WebAppContext{/,file:/tmp/alluxio-1.1.0/core/server/src/main/webapp/},/tmp/alluxio-1.1.0/core/server/src/main/webapp2016-10-06 18:42:10,819 INFO server.AbstractConnector (AbstractConnector.java:doStart) - Started SelectChann...@0.0.0.0:199992016-10-06 18:42:10,820 INFO logger.type (UIWebServer.java:startWebServer) - Alluxio Master Web service started @ 0.0.0.0/0.0.0.0:199992016-10-06 18:42:10,822 INFO logger.type (AlluxioMaster.java:startServing) - Alluxio master version 1.1.0 started @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:199982016-10-06 18:42:39,308 INFO logger.type (BlockMaster.java:getWorkerId) - getWorkerId(): WorkerNetAddress: WorkerNetAddress{host=ip-172-31-40-120.us-west-2.compute.internal, rpcPort=29998, dataPort=29999, webPort=30000} id: 12016-10-06 18:42:39,321 INFO logger.type (BlockMaster.java:workerRegister) - registerWorker(): MasterWorkerInfo{id=1, workerAddress=WorkerNetAddress{host=ip-172-31-40-120.us-west-2.compute.internal, rpcPort=29998, dataPort=29999, webPort=30000}, capacityBytes=134217728, usedBytes=0, lastUpdatedTimeMs=1475779359320, blocks=[]}2016-10-06 18:42:53,103 INFO logger.type (JournalWriter.java:openCurrentLog) - Opened current log file: /tmp/alluxio-1.1.0/journal/FileSystemMaster/log.out
2016-10-06 18:42:53,215 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 18:42:53,216 WARN ipc.Client (Client.java:run) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 18:42:53,216 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 18:42:53,221 ERROR logger.type (HdfsUnderFileSystem.java:exists) - 1 try to check if hdfs://ip-172-31-37-123.us-west-2.compute.internal:8020/default_tests_files/BasicNonByteBuffer_CACHE_PROMOTE_MUST_CACHE exists : Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120"; destination host is: "ip-172-31-37-123.us-west-2.compute.internal":8020;
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120"; destination host is: "ip-172-31-37-123.us-west-2.compute.internal":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
at alluxio.underfs.hdfs.HdfsUnderFileSystem.exists(HdfsUnderFileSystem.java:207)
at alluxio.master.file.FileSystemMaster.loadMetadataAndJournal(FileSystemMaster.java:1792)
at alluxio.master.file.FileSystemMaster.loadMetadataIfNotExistAndJournal(FileSystemMaster.java:1932)
at alluxio.master.file.FileSystemMaster.getFileInfo(FileSystemMaster.java:447)
at alluxio.master.file.FileSystemMasterClientServiceHandler$7.call(FileSystemMasterClientServiceHandler.java:155)
at alluxio.master.file.FileSystemMasterClientServiceHandler$7.call(FileSystemMasterClientServiceHandler.java:152)
at alluxio.RpcUtils.call(RpcUtils.java:40)
at alluxio.master.file.FileSystemMasterClientServiceHandler.getStatus(FileSystemMasterClientServiceHandler.java:152)
at alluxio.thrift.FileSystemMasterClientService$Processor$getStatus.getResult(FileSystemMasterClientService.java:1460)
at alluxio.thrift.FileSystemMasterClientService$Processor$getStatus.getResult(FileSystemMasterClientService.java:1444)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 34 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
... 37 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 46 more
2016-10-06 18:42:53,229 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Hi Chaomin,Yes everything works fine if I kinit with the keytab first I am able to successfully runTests and read/write files through alluxio to hdfs.[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ kinit -k -t hdfs.keytab hdfs/ip-172-31-40-120.us-west-2.compute...@PHDATA.IO
[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ klistTicket cache: FILE:/tmp/krb5cc_992
Default principal: hdfs/ip-172-31-40-120.us-west-2.compute...@PHDATA.IO
Valid starting Expires Service principal
10/06/2016 19:51:07 10/06/2016 20:01:07 krbtgt/PHDA...@PHDATA.IOrenew until 10/06/2016 20:21:07[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ ./bin/alluxio-start.sh masterStarting master @ ip-172-31-40-120.us-west-2.compute.internal. Logging to /tmp/alluxio-1.1.0/logs
[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ ./bin/alluxio-start.sh worker SudoMount
ALLUXIO_WORKER_MEMORY_SIZE was not set. Using the default one: 128MBALLUXIO_RAM_FOLDER was not set. Using the default one: /mnt/ramdiskFormatting RamFS: /mnt/ramdisk (128mb)
[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ ./bin/alluxio runTests
2016-10-06 19:56:18,866 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemMasterClient master @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:199982016-10-06 19:56:18,876 INFO type (AbstractClient.java:connect) - Client registered with FileSystemMasterClient master @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:19998runTest Basic CACHE_PROMOTE MUST_CACHE2016-10-06 19:56:18,979 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with BlockMasterClient master @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:199982016-10-06 19:56:18,983 INFO type (AbstractClient.java:connect) - Client registered with BlockMasterClient master @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:199982016-10-06 19:56:19,009 INFO type (BlockWorkerClient.java:connectOperation) - Connecting to local worker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,040 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_PROMOTE_MUST_CACHE took 76 ms.2016-10-06 19:56:19,151 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_PROMOTE_MUST_CACHE took 111 ms.Passed the test!runTest BasicNonByteBuffer CACHE_PROMOTE MUST_CACHEPassed the test!runTest Basic CACHE_PROMOTE CACHE_THROUGH2016-10-06 19:56:19,250 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,250 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,505 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_PROMOTE_CACHE_THROUGH took 266 ms.2016-10-06 19:56:19,583 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_PROMOTE_CACHE_THROUGH took 78 ms.Passed the test!runTest BasicNonByteBuffer CACHE_PROMOTE CACHE_THROUGH2016-10-06 19:56:19,588 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,588 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:29998Passed the test!runTest Basic CACHE_PROMOTE THROUGH2016-10-06 19:56:19,745 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,746 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,829 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_PROMOTE_THROUGH took 85 ms.2016-10-06 19:56:19,904 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,904 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,989 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_PROMOTE_THROUGH took 159 ms.Passed the test!runTest BasicNonByteBuffer CACHE_PROMOTE THROUGH2016-10-06 19:56:19,993 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:19,994 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:20,147 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:20,148 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:29998Passed the test!runTest Basic CACHE_PROMOTE ASYNC_THROUGH2016-10-06 19:56:20,225 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_PROMOTE_ASYNC_THROUGH took 10 ms.2016-10-06 19:56:20,306 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_PROMOTE_ASYNC_THROUGH took 81 ms.Passed the test!runTest BasicNonByteBuffer CACHE_PROMOTE ASYNC_THROUGHPassed the test!runTest Basic CACHE MUST_CACHE2016-10-06 19:56:20,387 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_MUST_CACHE took 5 ms.2016-10-06 19:56:20,454 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_MUST_CACHE took 67 ms.Passed the test!runTest BasicNonByteBuffer CACHE MUST_CACHEPassed the test!runTest Basic CACHE CACHE_THROUGH2016-10-06 19:56:20,529 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:20,530 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:20,625 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_CACHE_THROUGH took 97 ms.2016-10-06 19:56:20,691 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_CACHE_THROUGH took 66 ms.Passed the test!runTest BasicNonByteBuffer CACHE CACHE_THROUGH2016-10-06 19:56:20,695 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:20,695 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:29998Passed the test!runTest Basic CACHE THROUGH2016-10-06 19:56:20,864 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:20,865 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:20,948 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_THROUGH took 86 ms.2016-10-06 19:56:21,015 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,015 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,090 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_THROUGH took 142 ms.Passed the test!runTest BasicNonByteBuffer CACHE THROUGH2016-10-06 19:56:21,093 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,096 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,247 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,248 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:29998Passed the test!runTest Basic CACHE ASYNC_THROUGH2016-10-06 19:56:21,336 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_CACHE_ASYNC_THROUGH took 6 ms.2016-10-06 19:56:21,402 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_CACHE_ASYNC_THROUGH took 66 ms.Passed the test!runTest BasicNonByteBuffer CACHE ASYNC_THROUGHPassed the test!runTest Basic NO_CACHE MUST_CACHE2016-10-06 19:56:21,481 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_NO_CACHE_MUST_CACHE took 5 ms.2016-10-06 19:56:21,573 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_NO_CACHE_MUST_CACHE took 92 ms.Passed the test!runTest BasicNonByteBuffer NO_CACHE MUST_CACHEPassed the test!runTest Basic NO_CACHE CACHE_THROUGH2016-10-06 19:56:21,651 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,651 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,725 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_NO_CACHE_CACHE_THROUGH took 75 ms.2016-10-06 19:56:21,789 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_NO_CACHE_CACHE_THROUGH took 63 ms.Passed the test!runTest BasicNonByteBuffer NO_CACHE CACHE_THROUGH2016-10-06 19:56:21,793 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:21,793 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:29998Passed the test!runTest Basic NO_CACHE THROUGH2016-10-06 19:56:22,167 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:22,167 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:22,253 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_NO_CACHE_THROUGH took 87 ms.2016-10-06 19:56:22,322 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:22,322 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:22,380 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_NO_CACHE_THROUGH took 127 ms.Passed the test!runTest BasicNonByteBuffer NO_CACHE THROUGH2016-10-06 19:56:22,384 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:22,384 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:22,522 INFO type (AbstractClient.java:connect) - Alluxio client (version 1.1.0) is trying to connect with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:299982016-10-06 19:56:22,522 INFO type (AbstractClient.java:connect) - Client registered with FileSystemWorkerClient FileSystemWorker @ ip-172-31-40-120.us-west-2.compute.internal/172.31.40.120:29998Passed the test!runTest Basic NO_CACHE ASYNC_THROUGH2016-10-06 19:56:22,587 INFO type (BasicOperations.java:writeFile) - writeFile to file /default_tests_files/Basic_NO_CACHE_ASYNC_THROUGH took 6 ms.2016-10-06 19:56:22,653 INFO type (BasicOperations.java:readFile) - readFile file /default_tests_files/Basic_NO_CACHE_ASYNC_THROUGH took 66 ms.Passed the test!runTest BasicNonByteBuffer NO_CACHE ASYNC_THROUGHPassed the test!On Oct 6, 2016, at 2:40 PM, Chaomin Yu <cha...@alluxio.com> wrote:Hi Andy,Thank you for trying out Alluxio!Can you please try "kinit -k -t /tmp/alluxio-1.1.0/conf/hdfs.keytab hdfs/_HOST@PHDATA.IO" before starting Alluxio servers, to see whether it help mitigate this error?Hope this helps,Chaomin
2016-10-06 18:42:01,297 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab) - Login successful for user hdfs/ip-172-31-40-120.us-west-2.compute...@PHDATA.IO using keytab file /tmp/alluxio-1.1.0/conf/hdfs.keytab
2016-10-06 18:42:10,819 INFO server.AbstractConnector (AbstractConnector.java:doStart) - Started SelectChannelConnector@0.0.0.0:19999
--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alluxio-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.--
That only works until my kerberos ticket hits its expiration.
016-10-06 21:00:46,967 ERROR logger.type (MetricsConfig.java:loadConfigFile) - Error loading metrics configuration file.2016-10-06 21:00:48,257 WARN util.NativeCodeLoader (NativeCodeLoader.java:<clinit>) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-10-06 21:00:48,471 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab) - Login successful for user hdfs/ip-172-31-40-120.us-west-2.compute...@PHDATA.IO using keytab file /tmp/alluxio-1.1.0/conf/hdfs.keytab
2016-10-06 22:23:40,332 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs/ip-172-31-40-120.us-west-2.compute.internal@PHDATA.IO (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 22:23:40,333 WARN ipc.Client (Client.java:run) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 22:23:40,334 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs/ip-172-31-40-120.us-west-2.compute.internal@PHDATA.IO (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]2016-10-06 22:23:42,678 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs/ip-172-31-40-120.us-west-2.compute.internal@PHDATA.IO (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 22:23:42,679 WARN ipc.Client (Client.java:run) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 22:23:42,679 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs/ip-172-31-40-120.us-west-2.compute.internal@PHDATA.IO (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]2016-10-06 22:29:31,340 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs/ip-172-31-40-120.us-west-2.compute.internal@PHDATA.IO (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 22:29:31,341 WARN ipc.Client (Client.java:run) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2016-10-06 22:29:31,341 WARN security.UserGroupInformation (UserGroupInformation.java:doAs) - PriviledgedActionException as:hdfs/ip-172-31-40-120.us-west-2.compute.internal@PHDATA.IO (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)](END)
Hi Andy,
Can you please try "kinit -k -t /tmp/alluxio-1.1.0/conf/hdfs.keytab hdfs/_HO...@PHDATA.IO" before starting Alluxio servers, to see whether it help mitigate this error?Hope this helps,Chaomin
I made it shorter (1 hour) for testing but Kerberos default is 7days so currently I would have to restart the alluxio processes every 7days.
According to the log you provided, seems like your ticket is non-renewable after 10/06/2016 20:21:07 ?
Valid starting Expires Service principal
So below is a standard length of a Ticket Granting Ticket(TGT) is 7days and the standard lifetime of Service Ticket is 1day. If I do a renewal (kinit -R) you can see it renews the start and expires dates of the service ticket ahead but the renew until time for the Ticket Granting Ticket(TGT) is still 7days. Essentially the way it works is the TGT is used to obtain the Service Ticket from the Ticket Granting Server(TGS), the service ticket is what is actually use to communicate with the hdfs service. So before 7days end we essentially need to do a new kinit to get a new TGT, there is utility code provided as part of hadoop to do this which I think is our issue and is better explained here http://stackoverflow.com/questions/34616676/should-i-call-ugi-checktgtandreloginfromkeytab-before-every-action-on-hadoop .
[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ kinit -k -t hdfs.keytab hdfs/ip-172-31-40-120.us-west-2.compute...@PHDATA.IO[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ klistTicket cache: FILE:/tmp/krb5cc_992Default principal: hdfs/ip-172-31-40-120.us-west-2.compute...@PHDATA.IOValid starting Expires Service principal
10/07/2016 00:30:42 10/08/2016 00:30:42 krbtgt/PHDA...@PHDATA.IOrenew until 10/14/2016 00:30:42[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ kinit -R
[hdfs@ip-172-31-40-120 alluxio-1.1.0]$ klistTicket cache: FILE:/tmp/krb5cc_992