Running MapReduce job on yarn failed with "java.net.SocketTimeoutException: Read timed out"

Kaiming Wan

Nov 7, 2016, 5:23:46 AM
to Alluxio Users
I am running an MR job on my Alluxio and HDFS cluster.

The basic machine and configuration spec can be found in another post: https://groups.google.com/forum/#!topic/alluxio-users/wbaV31EEAvE

By the way, alluxio.security.authentication.socket.timeout.ms is set to 3000000, which should be a large enough value.
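For reference, this is how the setting looks in my conf/alluxio-site.properties (a sketch; only the single property named above is taken from my setup, the comment is just my arithmetic):

```properties
# conf/alluxio-site.properties
# 3000000 ms = 50 minutes, well beyond the ~15 minute gap between the
# log timestamps where the SocketTimeoutException appears below.
alluxio.security.authentication.socket.timeout.ms=3000000
```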



alluxio master.log:
2016-11-07 17:23:17,094 INFO  logger.type (HdfsUnderFileSystem.java:setMode) - Changing file 'hdfs://ns/alluxio/data/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043' permissions from: rwxr-xr-x to 448
2016-11-07 17:23:17,336 INFO  logger.type (HdfsUnderFileSystem.java:setMode) - Changing file 'hdfs://ns/alluxio/data/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.jar' permissions from: rw-r--r-- to 420
2016-11-07 17:37:59,014 ERROR logger.type (AbstractThriftClient.java:retryRPC) - java.net.SocketTimeoutException: Read timed out
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376)
	at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453)
	at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435)
	at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
	at org.apache.thrift.protocol.TProtocolDecorator.readMessageBegin(TProtocolDecorator.java:135)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
	at alluxio.thrift.BlockWorkerClientService$Client.recv_lockBlock(BlockWorkerClientService.java:277)
	at alluxio.thrift.BlockWorkerClientService$Client.lockBlock(BlockWorkerClientService.java:263)
	at alluxio.client.block.RetryHandlingBlockWorkerClient$6.call(RetryHandlingBlockWorkerClient.java:199)
	at alluxio.client.block.RetryHandlingBlockWorkerClient$6.call(RetryHandlingBlockWorkerClient.java:195)
	at alluxio.AbstractThriftClient.retryRPC(AbstractThriftClient.java:140)
	at alluxio.client.block.RetryHandlingBlockWorkerClient.lockBlock(RetryHandlingBlockWorkerClient.java:193)
	at alluxio.client.block.LocalBlockInStream.<init>(LocalBlockInStream.java:62)
	at alluxio.client.block.AlluxioBlockStore.getInStream(AlluxioBlockStore.java:148)
	at alluxio.client.file.FileInStream.updateBlockInStream(FileInStream.java:519)
	at alluxio.client.file.FileInStream.updateStreams(FileInStream.java:426)
	at alluxio.client.file.FileInStream.read(FileInStream.java:202)
	at alluxio.web.WebInterfaceBrowseServlet.displayFile(WebInterfaceBrowseServlet.java:99)
	at alluxio.web.WebInterfaceBrowseServlet.doGet(WebInterfaceBrowseServlet.java:186)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
	at org.eclipse.jetty.server.Server.handle(Server.java:499)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:170)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
	... 46 more
2016-11-07 17:37:59,015 INFO  logger.type (DynamicResourcePool.java:checkHealthyAndRetry) - Clearing unhealthy resource alluxio.thrift.BlockWorkerClientService$Client@aef2a0f.
2016-11-07 17:37:59,016 INFO  logger.type (ThriftClientPool.java:createNewResource) - Created a new thrift client alluxio.thrift.BlockWorkerClientService$Client@36d92ae6
2016-11-07 17:44:08,059 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:08,060 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:10,104 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:10,105 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:10,209 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:10,220 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:17,649 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:17,650 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:17,721 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:17,721 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:22,005 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:22,006 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:22,096 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:22,097 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:23,629 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:23,630 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:23,709 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:23,710 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:25,509 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:25,509 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:25,583 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:25,584 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:27,123 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:27,123 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:27,199 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:27,200 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:30,227 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:30,228 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:44:30,309 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:44:30,310 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:58:56,548 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:58:56,549 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 17:58:58,384 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - Master addresses: [10.8.12.16:19998, 10.8.12.17:19998]
2016-11-07 17:58:58,385 INFO  logger.type (LeaderInquireClient.java:getMasterAddress) - The leader master: 10.8.12.16:19998
2016-11-07 18:00:32,995 ERROR logger.type (AbstractThriftClient.java:retryRPC) - java.net.SocketTimeoutException: Read timed out
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376)
	at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453)
	at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435)
	at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
	at org.apache.thrift.protocol.TProtocolDecorator.readMessageBegin(TProtocolDecorator.java:135)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
	at alluxio.thrift.BlockWorkerClientService$Client.recv_lockBlock(BlockWorkerClientService.java:277)
	at alluxio.thrift.BlockWorkerClientService$Client.lockBlock(BlockWorkerClientService.java:263)
	at alluxio.client.block.RetryHandlingBlockWorkerClient$6.call(RetryHandlingBlockWorkerClient.java:199)
	at alluxio.client.block.RetryHandlingBlockWorkerClient$6.call(RetryHandlingBlockWorkerClient.java:195)
	at alluxio.AbstractThriftClient.retryRPC(AbstractThriftClient.java:140)
	at alluxio.client.block.RetryHandlingBlockWorkerClient.lockBlock(RetryHandlingBlockWorkerClient.java:193)
	at alluxio.client.block.LocalBlockInStream.<init>(LocalBlockInStream.java:62)
	at alluxio.client.block.AlluxioBlockStore.getInStream(AlluxioBlockStore.java:148)
	at alluxio.client.file.FileInStream.updateBlockInStream(FileInStream.java:519)
	at alluxio.client.file.FileInStream.updateStreams(FileInStream.java:426)
	at alluxio.client.file.FileInStream.read(FileInStream.java:202)
	at alluxio.web.WebInterfaceBrowseServlet.displayFile(WebInterfaceBrowseServlet.java:99)
	at alluxio.web.WebInterfaceBrowseServlet.doGet(WebInterfaceBrowseServlet.java:186)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
	at org.eclipse.jetty.server.Server.handle(Server.java:499)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:170)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
	... 46 more
2016-11-07 18:00:32,996 INFO  logger.type (DynamicResourcePool.java:checkHealthyAndRetry) - Clearing unhealthy resource alluxio.thrift.BlockWorkerClientService$Client@3cd1d18b.
2016-11-07 18:00:32,997 INFO  logger.type (ThriftClientPool.java:createNewResource) - Created a new thrift client alluxio.thrift.BlockWorkerClientService$Client@7f64bdb7

alluxio worker.log:
2016-11-07 17:23:17,321 INFO  logger.type (HdfsUnderFileSystem.java:setOwner) - Changing file 'hdfs://ns/alluxio/data/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.jar' user from: appadmin to appadmin, group from: root to appadmin
2016-11-07 17:23:17,403 INFO  logger.type (FileUtils.java:createStorageDirPath) - Folder /home/appadmin/ramdisk/alluxioworker/.tmp_blocks/865 was created!
2016-11-07 17:23:17,426 INFO  logger.type (HdfsUnderFileSystem.java:setOwner) - Changing file 'hdfs://ns/alluxio/data/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.split' user from: appadmin to appadmin, group from: root to appadmin
2016-11-07 17:23:17,450 INFO  logger.type (HdfsUnderFileSystem.java:setOwner) - Changing file 'hdfs://ns/alluxio/data/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.splitmetainfo' user from: appadmin to appadmin, group from: root to appadmin
2016-11-07 17:23:17,548 INFO  logger.type (HdfsUnderFileSystem.java:setOwner) - Changing file 'hdfs://ns/alluxio/data/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.xml' user from: appadmin to appadmin, group from: root to appadmin




yarn resourcemanager log:
2016-11-07 17:23:17,011 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 43
2016-11-07 17:23:17,667 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 43 submitted by user appadmin
2016-11-07 17:23:17,667 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1477295699236_0043
2016-11-07 17:23:17,667 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=appadmin IP=10.8.12.16 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1477295699236_0043
2016-11-07 17:23:17,667 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1477295699236_0043 State change from NEW to NEW_SAVING
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1477295699236_0043
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1477295699236_0043 State change from NEW_SAVING to SUBMITTED
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application added - appId: application_1477295699236_0043 user: appadmin leaf-queue of parent: root #applications: 1
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Accepted application application_1477295699236_0043 from user: appadmin, in queue: default
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1477295699236_0043 State change from SUBMITTED to ACCEPTED
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1477295699236_0043_000001
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000001 State change from NEW to SUBMITTED
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1477295699236_0043 from user: appadmin activated in queue: default
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1477295699236_0043 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@2a08a91b, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2016-11-07 17:23:17,668 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1477295699236_0043_000001 to scheduler from user appadmin in queue default
2016-11-07 17:23:17,678 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000001 State change from SUBMITTED to SCHEDULED
2016-11-07 17:23:17,922 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_01_000001 Container Transitioned from NEW to ALLOCATED
2016-11-07 17:23:17,922 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=appadmin OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_01_000001
2016-11-07 17:23:17,922 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1477295699236_0043_01_000001 of capacity <memory:27136, vCores:1> on host sq-hbase1.800best.com:12727, which has 1 containers, <memory:27136, vCores:1> used and <memory:54272, vCores:7> available after allocation
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: assignedContainer application attempt=appattempt_1477295699236_0043_000001 container=Container: [ContainerId: container_1477295699236_0043_01_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 clusterResource=<memory:244224, vCores:24>
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting assigned queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:27136, vCores:1>, usedCapacity=0.11111111, absoluteUsedCapacity=0.11111111, numApps=1, numContainers=1
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.11111111 absoluteUsedCapacity=0.11111111 used=<memory:27136, vCores:1> cluster=<memory:244224, vCores:24>
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : sq-hbase1.800best.com:12727 for container : container_1477295699236_0043_01_000001
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1477295699236_0043_000001
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1477295699236_0043 AttemptId: appattempt_1477295699236_0043_000001 MasterContainer: Container: [ContainerId: container_1477295699236_0043_01_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ]
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000001 State change from SCHEDULED to ALLOCATED_SAVING
2016-11-07 17:23:17,923 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000001 State change from ALLOCATED_SAVING to ALLOCATED
2016-11-07 17:23:17,939 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1477295699236_0043_000001
2016-11-07 17:23:17,940 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1477295699236_0043_01_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ] for AM appattempt_1477295699236_0043_000001
2016-11-07 17:23:17,940 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1477295699236_0043_01_000001 : $JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog  -Xmx21708m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
2016-11-07 17:23:17,940 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1477295699236_0043_000001
2016-11-07 17:23:17,940 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1477295699236_0043_000001
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1477295699236_0043_01_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ] for AM appattempt_1477295699236_0043_000001
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000001 State change from ALLOCATED to LAUNCHED
2016-11-07 17:23:18,922 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_01_000001 Container Transitioned from ACQUIRED to RUNNING
2016-11-07 17:35:00,057 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:appattempt_1477295699236_0043_000001 Timed out after 600 secs
2016-11-07 17:35:00,057 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1477295699236_0043_000001 with final state: FAILED, and exit status: -1000
2016-11-07 17:35:00,057 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000001 State change from LAUNCHED to FINAL_SAVING
2016-11-07 17:35:00,057 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1477295699236_0043_000001
2016-11-07 17:35:00,057 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1477295699236_0043_000001
2016-11-07 17:35:00,057 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000001 State change from FINAL_SAVING to FAILED
2016-11-07 17:35:00,057 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 1. The max attempts is 2
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1477295699236_0043_000001 is done. finalState=FAILED
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1477295699236_0043_000002
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000002 State change from NEW to SUBMITTED
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_01_000001 Container Transitioned from RUNNING to KILLED
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1477295699236_0043_01_000001 in state: KILLED event:KILL
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=appadmin OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_01_000001
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1477295699236_0043_01_000001 of capacity <memory:27136, vCores:1> on host sq-hbase1.800best.com:12727, which currently has 0 containers, <memory:0, vCores:0> used and <memory:81408, vCores:8> available, release resources=true
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: default used=<memory:0, vCores:0> numContainers=0 user=appadmin user-resources=<memory:0, vCores:0>
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1477295699236_0043_01_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:244224, vCores:24>
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:244224, vCores:24>
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1477295699236_0043_000001 released container container_1477295699236_0043_01_000001 on node: host: sq-hbase1.800best.com:12727 #containers=0 available=<memory:81408, vCores:8> used=<memory:0, vCores:0> with event: KILL
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1477295699236_0043 requests cleared
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1477295699236_0043 user: appadmin queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1477295699236_0043_000001
2016-11-07 17:35:00,058 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1477295699236_0043 from user: appadmin activated in queue: default
2016-11-07 17:35:00,059 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1477295699236_0043 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@3d9289d2, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2016-11-07 17:35:00,059 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1477295699236_0043_000002 to scheduler from user appadmin in queue default
2016-11-07 17:35:00,060 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000002 State change from SUBMITTED to SCHEDULED
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_02_000001 Container Transitioned from NEW to ALLOCATED
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=appadmin OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_02_000001
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1477295699236_0043_02_000001 of capacity <memory:27136, vCores:1> on host sq-hbase1.800best.com:12727, which has 1 containers, <memory:27136, vCores:1> used and <memory:54272, vCores:7> available after allocation
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: assignedContainer application attempt=appattempt_1477295699236_0043_000002 container=Container: [ContainerId: container_1477295699236_0043_02_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 clusterResource=<memory:244224, vCores:24>
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting assigned queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:27136, vCores:1>, usedCapacity=0.11111111, absoluteUsedCapacity=0.11111111, numApps=1, numContainers=1
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.11111111 absoluteUsedCapacity=0.11111111 used=<memory:27136, vCores:1> cluster=<memory:244224, vCores:24>
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : sq-hbase1.800best.com:12727 for container : container_1477295699236_0043_02_000001
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_02_000001 Container Transitioned from ALLOCATED to ACQUIRED
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1477295699236_0043_000002
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1477295699236_0043 AttemptId: appattempt_1477295699236_0043_000002 MasterContainer: Container: [ContainerId: container_1477295699236_0043_02_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ]
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000002 State change from SCHEDULED to ALLOCATED_SAVING
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000002 State change from ALLOCATED_SAVING to ALLOCATED
2016-11-07 17:35:00,108 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1477295699236_0043_000002
2016-11-07 17:35:00,109 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1477295699236_0043_02_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ] for AM appattempt_1477295699236_0043_000002
2016-11-07 17:35:00,109 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1477295699236_0043_02_000001 : $JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog  -Xmx21708m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
2016-11-07 17:35:00,109 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1477295699236_0043_000002
2016-11-07 17:35:00,109 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1477295699236_0043_000002
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1477295699236_0043_02_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ] for AM appattempt_1477295699236_0043_000002
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000002 State change from ALLOCATED to LAUNCHED
2016-11-07 17:35:01,064 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_02_000001 Container Transitioned from ACQUIRED to RUNNING
2016-11-07 17:35:01,064 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:appattempt_1477295699236_0043_000002 Timed out after 600 secs
2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1477295699236_0043_000002 with final state: FAILED, and exit status: -1000
2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000002 State change from LAUNCHED to FINAL_SAVING
2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1477295699236_0043_000002
2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1477295699236_0043_000002
2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1477295699236_0043_000002 State change from FINAL_SAVING to FAILED
2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1477295699236_0043 with final state: FAILED
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1477295699236_0043 State change from ACCEPTED to FINAL_SAVING
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1477295699236_0043_000002 is done. finalState=FAILED
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1477295699236_0043
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1477295699236_0043_02_000001 Container Transitioned from RUNNING to KILLED
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1477295699236_0043_02_000001 in state: KILLED event:KILL
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=appadmin OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_02_000001
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1477295699236_0043 failed 2 times due to ApplicationMaster for attempt appattempt_1477295699236_0043_000002 timed out. Failing the application.
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1477295699236_0043_02_000001 of capacity <memory:27136, vCores:1> on host sq-hbase1.800best.com:12727, which currently has 0 containers, <memory:0, vCores:0> used and <memory:81408, vCores:8> available, release resources=true
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: default used=<memory:0, vCores:0> numContainers=0 user=appadmin user-resources=<memory:0, vCores:0>
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1477295699236_0043 State change from FINAL_SAVING to FAILED
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1477295699236_0043_02_000001, NodeId: sq-hbase1.800best.com:12727, NodeHttpAddress: sq-hbase1.800best.com:8042, Resource: <memory:27136, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.8.12.16:12727 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:244224, vCores:24>
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:244224, vCores:24>
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1477295699236_0043_000002 released container container_1477295699236_0043_02_000001 on node: host: sq-hbase1.800best.com:12727 #containers=0 available=<memory:81408, vCores:8> used=<memory:0, vCores:0> with event: KILL
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1477295699236_0043 requests cleared
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1477295699236_0043 user: appadmin queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2016-11-07 17:48:20,195 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1477295699236_0043 user: appadmin leaf-queue of parent: root #applications: 0
2016-11-07 17:48:20,202 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=appadmin OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1477295699236_0043 failed 2 times due to ApplicationMaster for attempt appattempt_1477295699236_0043_000002 timed out. Failing the application. APPID=application_1477295699236_0043
2016-11-07 17:48:20,202 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1477295699236_0043,name=job_NewCityToCityEfficiency.GetBillInfoJob,user=appadmin,queue=default,state=FAILED,trackingUrl=http://sq-hbase1.800best.com:8088/cluster/app/application_1477295699236_0043,appMasterHost=N/A,startTime=1478510597667,finishTime=1478512100195,finalStatus=FAILED,memorySeconds=40765516,vcoreSeconds=1502,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=MAPREDUCE
2016-11-07 17:48:20,221 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1477295699236_0043_000002
2016-11-07 17:48:21,231 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2016-11-07 17:48:21,231 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
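For reference, the "Timed out after 600 secs" expiry above is YARN's AM liveness monitor (`yarn.am.liveness-monitor.expiry-interval-ms`, default 600000 ms): the RM killed the attempt because the AM container never registered while localization was stuck. Raising it in `yarn-site.xml` would only buy the AM more time, not fix the underlying Alluxio read timeout — a sketch:

```xml
<!-- yarn-site.xml on the ResourceManager: how long the RM waits for an
     ApplicationMaster to stay live before expiring the attempt
     (default 600000 ms). A workaround for slow AM startup, not a fix
     for the blocked Alluxio read during localization. -->
<property>
  <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
  <value>1200000</value>
</property>
```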



YARN NodeManager log:

2016-11-07 17:23:17,941 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1477295699236_0043_000001 (auth:SIMPLE)
2016-11-07 17:23:17,951 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1477295699236_0043_01_000001 by user appadmin
2016-11-07 17:23:17,951 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1477295699236_0043
2016-11-07 17:23:17,951 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=appadmin IP=10.8.12.16 OPERATION=Start Container Request TARGET=ContainerManageImpl  RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_01_000001
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1477295699236_0043 transitioned from NEW to INITING
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Adding container_1477295699236_0043_01_000001 to application application_1477295699236_0043
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1477295699236_0043 transitioned from INITING to RUNNING
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1477295699236_0043_01_000001 transitioned from NEW to LOCALIZING
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1477295699236_0043
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.splitmetainfo transitioned from INIT to DOWNLOADING
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.jar transitioned from INIT to DOWNLOADING
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.split transitioned from INIT to DOWNLOADING
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.xml transitioned from INIT to DOWNLOADING
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ alluxio://10.8.12.16:19998/user/hdcf/sqoop/import-data/Q9/GE_SYS_SITE/importdata-20161107, 1478492434773, FILE, null }
2016-11-07 17:23:17,952 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1477295699236_0043_01_000001
2016-11-07 17:23:17,957 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/nmPrivate/container_1477295699236_0043_01_000001.tokens. Credentials list:
2016-11-07 17:23:17,965 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user appadmin
2016-11-07 17:23:17,973 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/nmPrivate/container_1477295699236_0043_01_000001.tokens to /home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043/container_1477295699236_0043_01_000001.tokens
2016-11-07 17:23:17,973 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043 = file:/home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043
2016-11-07 17:23:18,008 INFO alluxio.logger.type: initialize(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.splitmetainfo, Configuration: core-default.xml, core-site.xml, yarn-default.xml, yarn-site.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml). Connecting to Alluxio: alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.splitmetainfo
2016-11-07 17:23:18,008 INFO alluxio.logger.type: alluxio://10.8.12.16:19998 alluxio://10.8.12.16:19998
2016-11-07 17:23:18,008 INFO alluxio.logger.type: getFileStatus(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.splitmetainfo)
2016-11-07 17:23:18,009 INFO alluxio.logger.type: open(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.splitmetainfo, 4096)
2016-11-07 17:23:18,015 INFO alluxio.logger.type: Created a new thrift client alluxio.thrift.BlockWorkerClientService$Client@623930f1
2016-11-07 17:23:18,044 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.splitmetainfo(->/home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043/filecache/10/job.splitmetainfo) transitioned from DOWNLOADING to LOCALIZED
2016-11-07 17:23:18,052 INFO alluxio.logger.type: getFileStatus(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.jar)
2016-11-07 17:23:18,052 INFO alluxio.logger.type: open(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.jar, 4096)
2016-11-07 17:23:18,100 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.jar(->/home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043/filecache/11/job.jar) transitioned from DOWNLOADING to LOCALIZED
2016-11-07 17:23:18,107 INFO alluxio.logger.type: getFileStatus(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.split)
2016-11-07 17:23:18,107 INFO alluxio.logger.type: open(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.split, 4096)
2016-11-07 17:23:18,152 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.split(->/home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043/filecache/12/job.split) transitioned from DOWNLOADING to LOCALIZED
2016-11-07 17:23:18,159 INFO alluxio.logger.type: getFileStatus(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.xml)
2016-11-07 17:23:18,160 INFO alluxio.logger.type: open(alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.xml, 4096)
2016-11-07 17:23:18,194 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource alluxio://10.8.12.16:19998/tmp/hadoop-yarn/staging/appadmin/.staging/job_1477295699236_0043/job.xml(->/home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043/filecache/13/job.xml) transitioned from DOWNLOADING to LOCALIZED
2016-11-07 17:24:36,084 INFO alluxio.logger.type: Resource alluxio.thrift.BlockWorkerClientService$Client@73bffc8 is garbage collected.
2016-11-07 17:28:35,762 INFO alluxio.logger.type: Resource alluxio.thrift.BlockWorkerClientService$Client@623930f1 is garbage collected.
2016-11-07 17:35:00,061 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1477295699236_0043_000001 (auth:SIMPLE)
2016-11-07 17:35:00,063 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1477295699236_0043_01_000001
2016-11-07 17:35:00,063 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=appadmin IP=10.8.12.16 OPERATION=Stop Container Request TARGET=ContainerManageImpl  RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_01_000001
2016-11-07 17:35:00,063 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1477295699236_0043_01_000001 transitioned from LOCALIZING to KILLING
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=appadmin OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_01_000001
2016-11-07 17:35:00,064 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1477295699236_0043_01_000001 transitioned from KILLING to DONE
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_1477295699236_0043_01_000001 from application application_1477295699236_0043
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory monitoring is needed. Not running the monitor-thread
2016-11-07 17:35:00,065 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1477295699236_0043
2016-11-07 17:35:00,116 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1477295699236_0043_000002 (auth:SIMPLE)
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1477295699236_0043_02_000001 by user appadmin
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=appadmin IP=10.8.12.16 OPERATION=Start Container Request TARGET=ContainerManageImpl  RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_02_000001
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Adding container_1477295699236_0043_02_000001 to application application_1477295699236_0043
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1477295699236_0043_02_000001 transitioned from NEW to LOCALIZING
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1477295699236_0043
2016-11-07 17:35:00,118 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ alluxio://10.8.12.16:19998/user/hdcf/sqoop/import-data/Q9/GE_SYS_SITE/importdata-20161107, 1478492434773, FILE, null }
2016-11-07 17:45:13,679 ERROR alluxio.logger.type: java.net.SocketTimeoutException: Read timed out
alluxio.org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
 at alluxio.org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
 at alluxio.org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
 at alluxio.org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376)
 at alluxio.org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453)
 at alluxio.org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435)
 at alluxio.org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
 at alluxio.org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
 at alluxio.org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
 at alluxio.org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
 at alluxio.org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
 at alluxio.org.apache.thrift.protocol.TProtocolDecorator.readMessageBegin(TProtocolDecorator.java:135)
 at alluxio.org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
 at alluxio.thrift.BlockWorkerClientService$Client.recv_promoteBlock(BlockWorkerClientService.java:303)
 at alluxio.thrift.BlockWorkerClientService$Client.promoteBlock(BlockWorkerClientService.java:290)
 at alluxio.client.block.RetryHandlingBlockWorkerClient$7.call(RetryHandlingBlockWorkerClient.java:218)
 at alluxio.client.block.RetryHandlingBlockWorkerClient$7.call(RetryHandlingBlockWorkerClient.java:214)
 at alluxio.AbstractThriftClient.retryRPC(AbstractThriftClient.java:140)
 at alluxio.client.block.RetryHandlingBlockWorkerClient.promoteBlock(RetryHandlingBlockWorkerClient.java:213)
 at alluxio.client.block.AlluxioBlockStore.promote(AlluxioBlockStore.java:250)
 at alluxio.client.file.FileInStream.updateBlockInStream(FileInStream.java:513)
 at alluxio.client.file.FileInStream.updateStreams(FileInStream.java:426)
 at alluxio.client.file.FileInStream.read(FileInStream.java:202)
 at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:186)
 at java.io.DataInputStream.read(DataInputStream.java:100)
 at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
 at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
 at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
 at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
 at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:267)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
 at java.net.SocketInputStream.read(SocketInputStream.java:170)
 at java.net.SocketInputStream.read(SocketInputStream.java:141)
 at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
 at alluxio.org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
 ... 36 more
2016-11-07 17:45:13,680 INFO alluxio.logger.type: Clearing unhealthy resource alluxio.thrift.BlockWorkerClientService$Client@3f36e769.
2016-11-07 17:45:13,680 INFO alluxio.logger.type: Created a new thrift client alluxio.thrift.BlockWorkerClientService$Client@310fa211
2016-11-07 17:48:20,228 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1477295699236_0043_000002 (auth:SIMPLE)
2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1477295699236_0043_02_000001
2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=appadmin IP=10.8.12.16 OPERATION=Stop Container Request TARGET=ContainerManageImpl  RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_02_000001
2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1477295699236_0043_02_000001 transitioned from LOCALIZING to KILLING
2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=appadmin OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1477295699236_0043 CONTAINERID=container_1477295699236_0043_02_000001
2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1477295699236_0043_02_000001 transitioned from KILLING to DONE
2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_1477295699236_0043_02_000001 from application application_1477295699236_0043
2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory monitoring is needed. Not running the monitor-thread
2016-11-07 17:48:20,231 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1477295699236_0043
2016-11-07 17:48:20,231 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1477295699236_0043 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2016-11-07 17:48:20,231 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1477295699236_0043
2016-11-07 17:48:20,231 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043
2016-11-07 17:48:20,231 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1477295699236_0043 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2016-11-07 17:48:20,231 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler: Scheduling Log Deletion for application: application_1477295699236_0043, with delay of 10800 seconds
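One thing worth double-checking: the stack trace shows the thrift read blocking for roughly ten minutes (localization started at 17:35:00, the timeout fired at 17:45:13) even though alluxio.security.authentication.socket.timeout.ms is reportedly set to 3000000. That property only helps if the NodeManager's embedded Alluxio client actually loads it, so it may be worth confirming the setting is visible to every NodeManager JVM, not just the Alluxio master/worker processes — a sketch, assuming a standard alluxio-site.properties deployment:

```properties
# alluxio-site.properties — must be on the classpath of every JVM that
# embeds the Alluxio client (here: each YARN NodeManager), or passed as a
# -D system property in that JVM's options; setting it only on the Alluxio
# master/worker nodes is not enough.
alluxio.security.authentication.socket.timeout.ms=3000000
```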



Bin Fan

Nov 7, 2016, 12:36:06 PM
to Kaiming Wan, Alluxio Users
Hi Kaiming,

will this happen deterministically? e.g., happens every run for your MR?

Bin
> ... 46 more
> 2016-11-07 18:00:32,996 INFO logger.type (DynamicResourcePool.java:checkHealthyAndRetry) - Clearing unhealthy resource alluxio.thrift.BlockWorkerClientService$Client@3cd1d18b.
> 2016-11-07 18:00:32,997 INFO logger.type (ThriftClientPool.java:createNewResource) - Created a new thrift client alluxio.thrift.BlockWorkerClientService$Client@7f64bdb7
> containermanager.localizer.ResourceLocalizationService: Writing
> credentials to the nmPrivate file /home/appadmin/hadoop-2.7.2/tmp/nm-local
> -dir/nmPrivate/container_1477295699236_0043_01_000001.tokens. Credentials
> list:
> 2016-11-07 17:23:17,965 INFO org.apache.hadoop.yarn.server.nodemanager.
> DefaultContainerExecutor: Initializing user appadmin
> 2016-11-07 17:23:17,973 INFO org.apache.hadoop.yarn.server.nodemanager.
> DefaultContainerExecutor: Copying from /home/appadmin/hadoop-2.7.2/tmp/nm-
> local-dir/nmPrivate/container_1477295699236_0043_01_000001.tokens to /home
> /appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/appcache/
> application_1477295699236_0043/container_1477295699236_0043_01_000001.t
> okens
> 2016-11-07 17:23:17,973 INFO org.apache.hadoop.yarn.server.nodemanager.
> DefaultContainerExecutor: Localizer CWD set to /home/appadmin/hadoop-2.7.2
> /tmp/nm-local-dir/usercache/appadmin/appcache/application_1477295699236_0043
> = file:/home/appadmin/hadoop-2.7.2/tmp/nm-local-dir/usercache/appadmin/
> appcache/application_1477295699236_0043
> containermanager.localizer.ResourceLocalizationService: Downloading public
> rsrc:{ alluxio://10.8.12.16:19998/user/hdcf/sqoop/import-data/
> NMAuditLogger: USER=appadmin OPERATION=Container Finished - Killed TARGET=
> ContainerImpl RESULT=SUCCESS APPID=application_1477295699236_0043
> CONTAINERID=container_1477295699236_0043_02_000001
> 2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.
> containermanager.container.ContainerImpl: Container
> container_1477295699236_0043_02_000001 transitioned from KILLING to DONE
> 2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.
> containermanager.application.ApplicationImpl: Removing
> container_1477295699236_0043_02_000001 from application
> application_1477295699236_0043
> 2016-11-07 17:48:20,230 INFO org.apache.hadoop.yarn.server.nodemanager.
> containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory
> nor physical-memory monitoring is needed. Not running the monitor-thread

Kaiming Wan

unread,
Nov 8, 2016, 2:18:24 AM
to Alluxio Users, wan...@gmail.com
Hi Bin Fan,

I have solved the problem by changing the permissions on some directories, reformatting the Alluxio cluster, and reloading the data into Alluxio.
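Roughly, the steps were as follows. The directories, permission bits, and paths below are illustrative rather than the exact ones I used, so adjust them for your own cluster (Alluxio 1.x / Hadoop 2.7 CLIs):

```shell
# 1. Stop Alluxio before reformatting.
./bin/alluxio-stop.sh all

# 2. Fix the permissions on the affected under-storage directories in HDFS,
#    e.g. the YARN staging directory the job was failing on (example path).
hdfs dfs -chmod -R 777 /alluxio/data/tmp/hadoop-yarn/staging

# 3. Reformat the Alluxio cluster.
#    NOTE: this wipes Alluxio metadata and worker storage.
./bin/alluxio format

# 4. Restart Alluxio.
./bin/alluxio-start.sh all

# 5. Reload the working data into Alluxio from the under store (example path).
./bin/alluxio fs load /user/hdcf/sqoop/import-data
```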

On Tuesday, November 8, 2016 at 1:36:06 AM UTC+8, Bin Fan wrote:
> 2016-11-07 17:48:20,058 INFO org.apache.hadoop.yarn.s...

Gene Pang

unread,
Nov 23, 2016, 9:10:44 AM
to Alluxio Users, wan...@gmail.com
Thanks for confirming and providing a solution!

-Gene