Tap only works with Lfs?


jackycs

May 24, 2010, 5:51:07 PM
to cascading-user
Hadoop 0.20.2 + Cascading 1.1.1

I have a program that writes data to a text file. The program works
fine when a local file path is used, e.g. Tap sink = new Lfs(new
TextLine(), "some/path", SinkMode.REPLACE), but the very same program
fails when a Dfs is specified, e.g. Tap sink = new Dfs(new
TextLine(), "hdfs://localhost/some/path", SinkMode.REPLACE). I
verified that "hdfs://localhost/some/path" was created after running
the program, but it was an empty folder, and the Hadoop job failed
with the following error:

10/05/24 14:45:40 WARN flow.FlowStep: [read] completion events count: 6
10/05/24 14:45:40 WARN flow.FlowStep: [read] event = Task Id : attempt_201005241244_0004_m_000002_0, Status : SUCCEEDED
10/05/24 14:45:40 WARN flow.FlowStep: [read] event = Task Id : attempt_201005241244_0004_m_000000_0, Status : FAILED
10/05/24 14:45:40 WARN flow.FlowStep: [read] event = Task Id : attempt_201005241244_0004_m_000000_1, Status : FAILED
10/05/24 14:45:40 WARN flow.FlowStep: [read] event = Task Id : attempt_201005241244_0004_m_000000_2, Status : FAILED
10/05/24 14:45:40 WARN flow.FlowStep: [read] event = Task Id : attempt_201005241244_0004_m_000000_3, Status : TIPFAILED
10/05/24 14:45:40 WARN flow.FlowStep: [read] event = Task Id : attempt_201005241244_0004_m_000001_0, Status : SUCCEEDED
10/05/24 14:45:40 WARN flow.Flow: stopping jobs
10/05/24 14:45:40 INFO flow.FlowStep: [read] stopping: (1/1) ...TEAM']]"]["hdfs://localhost/some/path"]"]
10/05/24 14:45:40 WARN flow.Flow: stopped jobs
10/05/24 14:45:40 WARN flow.Flow: shutting down job executor
10/05/24 14:45:40 WARN flow.Flow: shutdown complete
10/05/24 14:45:40 INFO hadoop.Hadoop18TapUtil: deleting temp path hdfs://localhost/some/path/_temporary
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:56)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: cascading.flow.FlowException: step failed: (1/1) ...TEAM']]"]["hdfs://localhost/some/path"]"]
    at cascading.flow.FlowStepJob.blockOnJob(FlowStepJob.java:173)
    at cascading.flow.FlowStepJob.start(FlowStepJob.java:138)
    at cascading.flow.FlowStepJob.call(FlowStepJob.java:127)
    at cascading.flow.FlowStepJob.call(FlowStepJob.java:39)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

any ideas?

Thanks,
Jack

--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To post to this group, send email to cascadi...@googlegroups.com.
To unsubscribe from this group, send email to cascading-use...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/cascading-user?hl=en.

Chris K Wensel

May 24, 2010, 5:58:05 PM
to cascadi...@googlegroups.com
what happens if you just use 'some/path' with Dfs?
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com

jackycs

May 24, 2010, 6:07:55 PM
to cascading-user
Chris, thanks for your reply.

I removed the "hdfs://localhost" prefix and got the same error.
However, I took a look at the log file under some/path/_log and found
this IOException:

ERROR="java\.io\.IOException: Split class cascading\.tap\.hadoop\.MultiInputSplit not found
    at org\.apache\.hadoop\.mapred\.MapTask\.runOldMapper(MapTask\.java:326)
    at org\.apache\.hadoop\.mapred\.MapTask\.run(MapTask\.java:307)
    at org\.apache\.hadoop\.mapred\.Child\.main(Child\.java:170)

what did I miss here?

Thanks,
Jack

Chris K Wensel

May 24, 2010, 6:10:10 PM
to cascadi...@googlegroups.com
Your classpath is likely borked.

Make sure the Cascading core jar (and its dependencies) is in your job jar under the lib folder.
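
One way to check a job jar against this advice is to list its entries (for example with `jar tf`) and look for nested jars that sit outside lib/. The sketch below is only an illustration: the class name JobJarLayout and the entry names are hypothetical, and it assumes the usual Hadoop convention that only nested jars under lib/ of the job jar are added to the task classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class JobJarLayout {
    /** Returns nested .jar entries that are not under lib/ and so would likely be missed by task JVMs. */
    static List<String> misplacedJars(List<String> entryNames) {
        List<String> misplaced = new ArrayList<String>();
        for (String name : entryNames) {
            boolean isNestedJar = name.endsWith(".jar");
            boolean underLib = name.startsWith("lib/");
            if (isNestedJar && !underLib) {
                misplaced.add(name);
            }
        }
        return misplaced;
    }

    public static void main(String[] args) {
        // Hypothetical listing of a job jar with one dependency at the root
        // and one under lib/.
        List<String> entries = Arrays.asList(
            "META-INF/MANIFEST.MF",
            "com/example/Main.class",
            "cascading-core-1.1.1.jar",
            "lib/some-dependency.jar");
        System.out.println(misplacedJars(entries)); // prints [cascading-core-1.1.1.jar]
    }
}
```

Any jar reported here would need to move under lib/ before the job is submitted to a cluster.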

ckw
--
Chris K Wensel
ch...@concurrentinc.com

jackycs

May 24, 2010, 6:56:41 PM
to cascading-user
Cool. I was using Eclipse's export feature, which put all the jars
under / (the root of the jar) instead of under lib.

Just curious, if classpath is the root cause, then how come the same
program worked with Lfs?

Ken Krugler

May 24, 2010, 7:28:33 PM
to cascadi...@googlegroups.com
Well, with Lfs it has to run in local (not distributed) mode, so I
don't think the job jar actually gets distributed/used to run the job.
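
To make the local versus distributed distinction concrete, here is a minimal sketch of the check involved. It uses java.util.Properties as a stand-in for Hadoop's JobConf so that it runs without Hadoop on the classpath; the class name RunMode and the job tracker address are hypothetical. It assumes the Hadoop 0.20 convention that mapred.job.tracker set to "local" (also the default when unset) means tasks run inside the submitting JVM. That Lfs forces this setting is my understanding of Cascading 1.x, not something confirmed in this thread.

```java
import java.util.Properties;

final class RunMode {
    /** True when the job tracker setting is "local", i.e. tasks run in the client JVM. */
    static boolean isLocalMode(Properties conf) {
        // Hadoop 0.20 treats an unset mapred.job.tracker as "local".
        return "local".equalsIgnoreCase(conf.getProperty("mapred.job.tracker", "local"));
    }

    public static void main(String[] args) {
        Properties localConf = new Properties();
        localConf.setProperty("mapred.job.tracker", "local");

        Properties clusterConf = new Properties();
        clusterConf.setProperty("mapred.job.tracker", "localhost:9001"); // hypothetical address

        System.out.println(isLocalMode(localConf));   // prints true
        System.out.println(isLocalMode(clusterConf)); // prints false
    }
}
```

In local mode the classes are already loaded by the client JVM, so a badly laid out job jar goes unnoticed. In distributed mode each task JVM has to find cascading.tap.hadoop.MultiInputSplit through the job jar.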

-- Ken

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c w e b m i n i n g