Access Denied when running wordcount on hadoop / server

Kevin

Mar 18, 2009, 2:05:34 PM
to cascading-user
We have Hadoop up and running in a small-ish cluster, we're using it
for other things. I am trying to learn Cascading to see if it will
make my coding life easier. I've been trying to use the logparser.jar
and the wordcount.jar built as suggested in the examples. In both
cases when I try to run the programs I get an error along the lines of

> 09/03/18 13:58:52 WARN cascade.Cascade: [import pages+url pipe+...] flow failed: import pages
> cascading.flow.FlowException: unhandled exception
> at cascading.flow.Flow.complete(Flow.java:613)
> at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:419)
> at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:369)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: org.apache.hadoop.security.AccessControlException:
> org.apache.hadoop.security.AccessControlException: Permission denied: user=kdorff, access=WRITE,
> inode="hadoop":hadoop:supergroup:rwxr-xr-x
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
> at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
> at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:831)
> at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:257)
> at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1118)
> at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:283)
> at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:609)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788)
> at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:450)
> at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:389)
> ... 5 more

The account I am running the code under (kdorff) can successfully
write to the directory I have specified for output. The output
directory doesn't exist before I run the code. I'm not sure how to
debug this. This account submits other Hadoop jobs without problems. I
am running Hadoop 0.19.1.

Any hints on this?

Kevin
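
For anyone reading the trace above: the AccessControlException line names the user, the access requested, and the owner, group, and mode of the inode that refused it. Below is a minimal Python sketch for pulling those fields apart, assuming the message format shown in the trace. The helper `explain_denial` and its `user_groups` argument are hypothetical names for illustration, not part of Hadoop or Cascading, and the owner/group/other selection follows the usual POSIX-style rule.

```python
import re

# Matches the "Permission denied" text as it appears in the trace above.
# Assumption: other Hadoop versions may format this message differently.
DENIAL = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+),\s*'
    r'inode="(?P<inode>[^"]*)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>[rwx-]{9})'
)

# Permission letter required for each simple access type.
NEEDED = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}


def explain_denial(message, user_groups=()):
    """Hypothetical helper: decode a permission-denied message.

    Returns a dict of the parsed fields plus the permission class that
    applies to the user, or None if the message does not match.
    """
    match = DENIAL.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    mode = fields["mode"]
    # Pick the permission triplet: owner first, then group, then other.
    if fields["user"] == fields["owner"]:
        fields["class"], fields["bits"] = "owner", mode[0:3]
    elif fields["group"] in user_groups:
        fields["class"], fields["bits"] = "group", mode[3:6]
    else:
        fields["class"], fields["bits"] = "other", mode[6:9]
    # Combined access types such as READ_WRITE are split on "_";
    # anything unrecognised leaves "allowed" as None.
    letters = [NEEDED.get(part) for part in fields["access"].split("_")]
    if None in letters:
        fields["allowed"] = None
    else:
        fields["allowed"] = all(letter in fields["bits"] for letter in letters)
    return fields


if __name__ == "__main__":
    msg = ('Permission denied: user=kdorff, access=WRITE, '
           'inode="hadoop":hadoop:supergroup:rwxr-xr-x')
    print(explain_denial(msg))
```

With the message from the trace and no group membership supplied, this reports that kdorff falls in the "other" class, whose bits are r-x, so WRITE is refused. It does not say which full path the inode belongs to; that still has to be worked out from the command line and the configuration.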


Kevin

Mar 18, 2009, 3:59:27 PM
to cascading-user
Just in case it is relevant, my command is

hadoop jar wordcount.jar url+page.200.txt /users/kdorff/wordcoutput-output wordcount-local-output

where url+page.200.txt is in the current directory in the local
filesystem. As I said before, my username is 'kdorff' and I CAN write
to /user/kdorff without a problem, but the dir
/users/kdorff/wordcoutput-output does not yet exist (per hadoop procedure). The
local directory "wordcount-local-output" does not yet exist. I am
using the stock wordcount.jar that I built from sources (downloaded
from example). Using Cascading 1.0.5, Hadoop 0.19.1.

Kevin

Kevin

Mar 18, 2009, 4:00:49 PM
to cascading-user
Also, I CAN run this code locally (i.e., on my PC with a stock Hadoop
configuration and HADOOP_HOME pointing to that stock Hadoop
configuration). In that case it runs fine, it just doesn't use the
cluster at all.

Kevin

Chris K Wensel

Mar 19, 2009, 1:26:52 AM
to cascadi...@googlegroups.com
for now, turn off hdfs permissions in hadoop, until I can reproduce
this.

it could likely be the temp file creation, this can be changed through
the "cascading.tmp.dir" property.

ckw
--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/

Kevin

Mar 19, 2009, 10:19:46 AM
to cascading-user
Do I set cascading.tmp.dir in the hadoop-site.xml file, or somewhere else?

Thanks,
Kevin

Kevin

Mar 19, 2009, 10:43:10 AM
to cascading-user
I looked at the Cascading source, and when cascading.tmp.dir isn't
defined it seems to fall back to hadoop.tmp.dir which, I believe, is a
directory on the local filesystem (not the Hadoop filesystem), unless I
am misunderstanding the intent of hadoop.tmp.dir.

My problem is related to trying to write a file on HDFS. So when I
define cascading.tmp.dir, should it be a directory on HDFS that I have
write permission on? Should I do something like make a /tmp directory
in HDFS with 777 permissions? Did I miss this in the documentation, or
are Cascading people all just running with permissions off?

Thanks,
Kevin

Kevin

Mar 19, 2009, 11:06:45 AM
to cascading-user
For what it's worth, I made an HDFS directory /tmp and made it "777"
but my problem persists. I CAN make and remove directories in there as
the user kdorff but that doesn't help with this problem.

Kevin

Kevin

Mar 19, 2009, 11:23:51 AM
to cascading-user
Ok, sorry, this is COMPLETELY my fault. I was running

hadoop jar wordcount.jar url+page.200.txt /users/kdorff/wordcoutput-output wordcount-local-output

but that specifies the output dir under /users instead of /user.
Since permissions were on, it wouldn't let me create a /users directory,
which makes some sense. When I turned permissions off it was happy to
make a /users directory. I've removed the stray /users directory,
fixed my command, and re-enabled permissions, and my job now runs.

Sorry for the false alarm.
Kevin
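
A /users versus /user slip like this is easy to miss by eye. Here is a minimal Python sketch of a pre-flight check, assuming the common HDFS convention that a user's home directory is /user/<username>. The helper `under_home` and the `home_root` parameter are hypothetical names for illustration and are not part of Hadoop or Cascading.

```python
import posixpath


def under_home(path, user, home_root="/user"):
    """Hypothetical helper: True if `path` lies inside <home_root>/<user>.

    Assumption: home directories live under /user, the usual HDFS layout.
    Pass a different home_root if the cluster is configured otherwise.
    """
    home = posixpath.join(home_root, user)
    norm = posixpath.normpath(path)
    # Compare against "home/" so that /users/... does not match /user/...
    return norm == home or norm.startswith(home + "/")


if __name__ == "__main__":
    for candidate in ("/users/kdorff/wordcoutput-output",
                      "/user/kdorff/wordcoutput-output"):
        print(candidate, under_home(candidate, "kdorff"))
```

Run against the two paths from this thread, the check rejects the /users path and accepts the /user one. It only compares strings and does not contact the cluster, so it says nothing about the actual permissions on the directory.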

Chris K Wensel

Mar 19, 2009, 11:34:14 AM
to cascadi...@googlegroups.com
hey Kevin, great news you got it working.

sorry for not getting replies out sooner.

cheers