Eitetsu Jo
unread,Jul 20, 2010, 11:05:14 PM7/20/10Sign in to reply to author
Sign in to forward
You do not have permission to delete messages in this group
Sign in to report message
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to cascading-user
Hi, I posted for the first time.
I want to do the processing as the following.
input source1 : A, B, C, ...
input source2 : x, y, z, ...
Making new tuple stream from two above sources like this : { A x } ,
{ A y }, { A z }, ... , { B x } , { B y } , { B z }, ... , { C x } ,
{ C y } , { C z } , ...
To do this, I try allocating pseudo ID,
{ A 1 } { A 2 } { A 3 } ... { B 1 } { B 2 } { B 3 } ...
{ 1 x } { 2 y } { 3 z } ...
and joining them with CoGroup by pseudo IDs.
Does this approach agree with Cascading?
The result shows { B 1 } { B 2 } { B 3 }... ( not including { A 1 }
{ A 2 } .. { C 1 } { C 2 } ... )
with FlowException.
I don't have any idea to solve this.
Would you let me know any tips about this?
10/07/21 11:58:33 INFO util.Util: resolving application jar from found
main method on: jo.som.Test
10/07/21 11:58:33 INFO flow.MultiMapReducePlanner: using application
jar: /home/hadoop/LeukemiaSOM.jar
10/07/21 11:58:34 INFO cascade.Cascade: Concurrent, Inc - Cascading
1.1.1 [hadoop-0.19.2+]
10/07/21 11:58:34 INFO flow.Flow: [SOM] starting
10/07/21 11:58:34 INFO flow.Flow: [SOM] source:
Hfs["TextDelimited[['pmid', 'vector']]"]["/user/hadoop/
tarticle.txt"]"]
10/07/21 11:58:34 INFO flow.Flow: [SOM] sink:
Hfs["TextLine[['offset', 'line']->[ALL]]"]["/user/hadoop/tarticle1"]"]
10/07/21 11:58:34 INFO flow.Flow: [SOM] parallel execution is
enabled: true
10/07/21 11:58:34 INFO flow.Flow: [SOM] starting jobs: 1
10/07/21 11:58:34 INFO flow.Flow: [SOM] allocating threads: 1
10/07/21 11:58:34 INFO flow.FlowStep: [SOM] starting step: (1/1)
Hfs["TextLine[['offset', 'line']->[ALL]]"]["/user/hadoop/tarticle1"]"]
10/07/21 11:58:39 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the
same.
10/07/21 11:58:40 INFO mapred.FileInputFormat: Total input paths to
process : 1
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] completion events count: 7
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] event = Task Id :
attempt_201007211157_0001_m_000003_0, Status : SUCCEEDED
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] event = Task Id :
attempt_201007211157_0001_m_000000_0, Status : SUCCEEDED
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] event = Task Id :
attempt_201007211157_0001_m_000001_0, Status : FAILED
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] event = Task Id :
attempt_201007211157_0001_m_000001_1, Status : FAILED
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] event = Task Id :
attempt_201007211157_0001_m_000001_2, Status : FAILED
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] event = Task Id :
attempt_201007211157_0001_m_000001_3, Status : TIPFAILED
10/07/21 11:59:15 WARN flow.FlowStep: [SOM] event = Task Id :
attempt_201007211157_0001_m_000002_0, Status : SUCCEEDED
10/07/21 11:59:15 WARN flow.Flow: stopping jobs
10/07/21 11:59:15 INFO flow.FlowStep: [SOM] stopping: (1/1)
Hfs["TextLine[['offset', 'line']->[ALL]]"]["/user/hadoop/tarticle1"]"]
10/07/21 11:59:15 WARN flow.Flow: stopped jobs
10/07/21 11:59:15 WARN flow.Flow: shutting down job executor
10/07/21 11:59:15 WARN flow.Flow: shutdown complete
10/07/21 11:59:15 INFO hadoop.Hadoop18TapUtil: deleting temp path /
user/hadoop/tarticle1/_temporary
Exception in thread "main" cascading.flow.FlowException: step failed:
(1/1) Hfs["TextLine[['offset', 'line']->[ALL]]"]["/user/hadoop/
tarticle1"]"]
at cascading.flow.FlowStepJob.blockOnJob(FlowStepJob.java:173)
at cascading.flow.FlowStepJob.start(FlowStepJob.java:138)
at cascading.flow.FlowStepJob.call(FlowStepJob.java:127)
at cascading.flow.FlowStepJob.call(FlowStepJob.java:39)
at java.util.concurrent.FutureTask
$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)