FileAlreadyExistsException: rename destination /tmp/hadoop-gobblin/mapred/local/...

Brian Orwig

Mar 21, 2017, 2:45:43 PM
to gobblin-users
All, 

We are getting a lot of the following errors when running Gobblin (v0.8.0) jobs in parallel via Azkaban: 

16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO - 2017-03-16 13:46:59 UTC ERROR [main] gobblin.runtime.AbstractJobLauncher  321 - Failed to launch and run job job_hilton-ari-ari_hilton_raw_1489672009593: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.fs.FileAlreadyExistsException: rename destination /tmp/hadoop-gobblin/mapred/local/1489672018396 already exists.
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO - java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.fs.FileAlreadyExistsException: rename destination /tmp/hadoop-gobblin/mapred/local/1489672018396 already exists.
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:149)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.security.AccessController.doPrivileged(Native Method)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at javax.security.auth.Subject.doAs(Subject.java:422)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.mapreduce.MRJobLauncher.runWorkUnits(MRJobLauncher.java:203)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:296)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.mapreduce.CliMRJobLauncher.launchJob(CliMRJobLauncher.java:84)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:61)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:106)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.lang.reflect.Method.invoke(Method.java:497)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO - Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.fs.FileAlreadyExistsException: rename destination /tmp/hadoop-gobblin/mapred/local/1489672018396 already exists.
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:145)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        ... 21 more
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO - Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: rename destination /tmp/hadoop-gobblin/mapred/local/1489672018396 already exists.
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:1291)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.fs.DelegateToFileSystem.renameInternal(DelegateToFileSystem.java:172)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:727)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:218)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:657)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.fs.FileContext.rename(FileContext.java:905)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:289)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
16-03-2017 13:46:59 UTC hilton-ari-ari_hilton_raw INFO -        at java.lang.Thread.run(Thread.java:745)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO - Exception in thread "main" gobblin.runtime.JobException: Job job_hilton-ari-ari_hilton_raw_1489672009593 failed
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:363)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.mapreduce.CliMRJobLauncher.launchJob(CliMRJobLauncher.java:84)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:61)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:106)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at java.lang.reflect.Method.invoke(Method.java:497)
16-03-2017 13:47:01 UTC hilton-ari-ari_hilton_raw INFO -        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

I saw a few other posts about similar FileAlreadyExistsExceptions, and they all recommended using the TimePartitionedDataPublisher, which we are already using: 

data.publisher.final.dir=${env:DATA_DIR}
data.publisher.permissions=775
data.publisher.replace.final.dir=false
data.publisher.type=gobblin.publisher.TimePartitionedDataPublisher

If I look in the /tmp/hadoop-gobblin/mapred/local/ directory, there are many empty directories whose names look like sequentially incremented timestamps: 

drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316953
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316952
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316951
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316950
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316949
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316959
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316958
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316957
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316956
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316963
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316962
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316961
drwxr-xr-x 2 gobblin gobblin 4096 Mar 21 18:35 1490121316960

So my questions are:

  1. How do I fix this issue?
  2. Why are these directories being written?
  3. Which setting controls where they are written, so that they don't go to /tmp?
Thanks!

Brian Orwig

Mar 21, 2017, 3:25:08 PM
to gobblin-users
I found answers to #2 and #3.

The directory comes from the default hadoop.tmp.dir, which is configured in core-default.xml: 

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-${user.name}</value>
  </property>

This can be overridden in core-site.xml, for example: 

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/persistent/hadoop</value>
  </property>

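Hadoop uses this as the base for its other temporary directories, including the local directory used by MapReduce jobs (by default ${hadoop.tmp.dir}/mapred/local), and the path can be on the local file system or in HDFS. 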
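
To make the path derivation concrete, here is a minimal Java sketch of how the directory in the stack trace follows from the defaults. It does not use the Hadoop API; the class and method names (TmpDirResolution, resolve, defaults) are made up for illustration, and it only mimics the ${...} substitution that Hadoop's configuration performs. The two default values are the ones shipped in core-default.xml and mapred-default.xml.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TmpDirResolution {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Expand ${name} references from props, repeating until nothing changes.
    // Unknown names are left as-is. The depth is bounded to avoid endless loops.
    public static String resolve(String value, Map<String, String> props) {
        String current = value;
        for (int depth = 0; depth < 20; depth++) {
            Matcher m = VAR.matcher(current);
            StringBuffer sb = new StringBuffer();
            boolean changed = false;
            while (m.find()) {
                String replacement = props.get(m.group(1));
                if (replacement == null) {
                    replacement = m.group(0); // keep the unresolved reference
                } else {
                    changed = true;
                }
                m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
            }
            m.appendTail(sb);
            current = sb.toString();
            if (!changed) {
                break;
            }
        }
        return current;
    }

    // Default values as shipped by Hadoop, plus the OS user running the job.
    public static Map<String, String> defaults(String userName) {
        Map<String, String> props = new HashMap<>();
        props.put("user.name", userName);
        props.put("hadoop.tmp.dir", "/tmp/hadoop-${user.name}");
        props.put("mapreduce.cluster.local.dir", "${hadoop.tmp.dir}/mapred/local");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = defaults("gobblin");
        System.out.println(resolve("${mapreduce.cluster.local.dir}", props));

        // Same lookup after the core-site.xml override shown above.
        props.put("hadoop.tmp.dir", "/data/persistent/hadoop");
        System.out.println(resolve("${mapreduce.cluster.local.dir}", props));
    }
}
```

With the user gobblin, the first lookup resolves to /tmp/hadoop-gobblin/mapred/local, which matches the path in the exception. After the override, the same property resolves under /data/persistent/hadoop instead.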

So for #1 I still need a workaround, unless I just schedule the Azkaban jobs to run at different times instead of at the same time. 
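
One possible workaround, sketched here under assumptions rather than confirmed: the directory names look like millisecond timestamps that are incremented per localized resource, so two launchers that start in the same millisecond and share one local directory could pick the same name. Giving each launch its own hadoop.tmp.dir would avoid sharing that directory. The class below only builds such a path and formats it as a -D option; the names PerLaunchTmpDir, perLaunchTmpDir and asGenericOption are hypothetical, and whether the Gobblin launcher passes -D options through to the Hadoop configuration needs to be verified for the version in use.

```java
import java.util.UUID;

public class PerLaunchTmpDir {
    // Build a tmp dir path that differs for every launch, even when several
    // launches of the same job start at the same instant.
    public static String perLaunchTmpDir(String baseDir, String jobName) {
        // Keep the directory name filesystem-friendly.
        String safeName = jobName.replaceAll("[^A-Za-z0-9._-]", "_");
        return baseDir + "/" + safeName + "-" + UUID.randomUUID();
    }

    // Format the path as a Hadoop-style -D option (assumption: the launcher
    // honors generic options; check this before relying on it).
    public static String asGenericOption(String tmpDir) {
        return "-Dhadoop.tmp.dir=" + tmpDir;
    }

    public static void main(String[] args) {
        String jobName = args.length > 0 ? args[0] : "hilton-ari-ari_hilton_raw";
        String tmpDir = perLaunchTmpDir("/data/persistent/hadoop", jobName);
        System.out.println(asGenericOption(tmpDir));
    }
}
```

Per-launch directories are never reused, so they would need to be cleaned up after each run. Separately, the LocalJobRunner frames in the stack trace show that MapReduce is running in local mode; if the jobs are meant to run on a cluster, it is worth checking that mapreduce.framework.name is set to yarn.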