Fixes for TemplateTap and failure traps

Chris K Wensel

Apr 5, 2009, 3:08:35 PM
to cascadi...@googlegroups.com
Hey all

Just a quick heads up: there are a number of stability fixes for how
TemplateTaps and failure trap Taps are handled in wip-1.0.8.

http://github.com/cwensel/cascading/tree/wip-1.0.8

If you are seeing problems using those features, please give the
branch a spin and let me know how it fares. All regression tests pass,
but these features push Hadoop to its limits and are hard to unit test.

This branch only works with Hadoop 0.19. Backporting to Hadoop 0.18
will be a major feat of engineering, I fear. I hope to take a stab at
it this week.

cheers,
chris

--
Chris K Wensel
ch...@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/

Chris Curtin

Apr 9, 2009, 2:39:33 PM
to cascading-user
Hi Chris,

I tried the 1.0.8 branch and none of my code works any longer. There are
no useful errors in the output. I have been using 1.0.4 on EC2 with
Hadoop 0.19.0.

Output from a non-Template tap application:

09/04/09 13:47:10 INFO flow.MultiMapReducePlanner: using application
jar: /mnt/hadoop/cascading/responsecurve.jar
09/04/09 13:47:10 INFO cascade.Cascade: [reformat_7] starting
09/04/09 13:47:10 INFO cascade.Cascade: [reformat_7] starting flows:
1
09/04/09 13:47:10 INFO cascade.Cascade: [reformat_7] allocating
threads: 1
09/04/09 13:47:10 INFO cascade.Cascade: [reformat_7] starting flow:
reformat_7
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] atleast one sink is
marked for delete
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] sink oldest modified
date: Wed Dec 31 18:59:59 EST 1969
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] starting
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] source: MultiTap[[Hfs
["TextLine[['offset', 'line']->[ALL]]"]["/source/
7/456912_296204_sent.csv"]"], Hfs["TextLine[['offset', 'line']->
[ALL]]"]["/source/7/456912_296204_metrics_1.csv"]"]]]
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] sink: Hfs["SequenceFile
[['Recipient Id', 'Mailing Id', 'Report Id', 'Campaign Id', 'Recipient
Type', 'Email', 'DOMAIN', 'LMR_ID', 'Suppression Reason', 'SENT',
'DAYS_IN_LIST', 'DAYS_IN_LIST_BUCKET', 'OPENS_TOTAL', 'OPENS_HTML',
'OPENS_AOL', 'OPENS_TEXT', 'OPENS_WEB', 'OPENS_FIRST', 'OPENS_LAST',
'CLICK_TOTAL_ANY', 'CLICK_TOTAL_HTML', 'CLICK_TOTAL_AOL',
'CLICK_TOTAL_TEXT', 'CLICK_TOTAL_WEB', 'CLICK_ANY_FIRST',
'CLICK_ANY_LAST', 'FTF_TOTAL', 'FTF_FIRST', 'FTF_LAST',
'CONVERSIONS_TOTAL', 'CONVERSIONS_AMOUNT', 'CONVERSION_FIRST',
'CONVERSION_LAST', 'HARD_BOUNCE', 'SOFT_BOUNCE', 'BOUNCE_DATE',
'OPTED_OUT_FROM_MAILING', 'OPTED_OUT_FROM_MAILING_DATE',
'OPTED_OUT_VIA_ABUSE', 'OPTED_OUT_VIA_ABUSE_DATE', 'REPLY_MAIL_BLOCK',
'REPLY_MAIL_BLOCK_DATE', 'REPLY_MAIL_RESTRICTION',
'REPLY_MAIL_RESTRICTION_DATE', 'REPLY_COUNT', 'REPLY_FIRST',
'REPLY_LAST', 'OPT_OUT', 'RECIPIENT_TYPE', 'OPTED_IN', 'OPTED_OUT',
'OPT_IN_DETAILS', 'OPT_OUT_DETAILS', 'UNDELIVERABLE_DETAILS',
'open_dt', 'click_dt', 'sent_dt', 'send_hour', 'cheezburgerz_total',
'cheezburgerz_html', 'cheezburgerz_aol', 'cheezburgerz_text',
'cheezburgerz_web', 'cheezburgerz_first', 'cheezburgerz_last']]"]["/
output/7_flat"]"]
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] parallel execution is
enabled: true
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] starting jobs: 1
09/04/09 13:47:10 INFO flow.Flow: [reformat_7] allocating threads: 1
09/04/09 13:47:10 INFO flow.FlowStep: [reformat_7] starting step:
(1/1) ', 'cheezburgerz_html', 'cheezburgerz_aol', 'cheezburgerz_text',
'cheezburgerz_web', 'cheezburgerz_first', 'cheezburgerz_last']]"]["/
output/7_flat"]"]
09/04/09 13:47:10 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the
same.
09/04/09 13:47:10 INFO mapred.FileInputFormat: Total input paths to
process : 2
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] completion events
count: 10
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000017_0, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_r_000002_0, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000017_1, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000017_2, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000017_3, Status : TIPFAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000016_0, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_r_000001_0, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000016_1, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000016_2, Status : FAILED
09/04/09 13:48:15 WARN flow.FlowStep: [reformat_7] event = Task Id :
attempt_200904091327_0007_m_000016_3, Status : TIPFAILED
09/04/09 13:48:15 WARN flow.Flow: stopping jobs
09/04/09 13:48:15 INFO flow.FlowStep: [reformat_7] stopping: (1/1) ',
'cheezburgerz_html', 'cheezburgerz_aol', 'cheezburgerz_text',
'cheezburgerz_web', 'cheezburgerz_first', 'cheezburgerz_last']]"]["/
output/7_flat"]"]
09/04/09 13:48:15 WARN flow.Flow: stopped jobs
09/04/09 13:48:15 WARN flow.Flow: shutting down job executor
09/04/09 13:48:15 WARN flow.Flow: shutdown complete
09/04/09 13:48:15 WARN cascade.Cascade: [reformat_7] flow failed:
reformat_7
cascading.flow.FlowException: step failed: (1/1) ',
'cheezburgerz_html', 'cheezburgerz_aol', 'cheezburgerz_text',
'cheezburgerz_web', 'cheezburgerz_first', 'cheezburgerz_last']]"]["/
output/7_flat"]"]
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:466)
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:397)
at java.util.concurrent.FutureTask$Sync.innerRun
(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
09/04/09 13:48:15 WARN cascade.Cascade: [reformat_7] stopping flows
09/04/09 13:48:15 INFO cascade.Cascade: [reformat_7] stopping flow:
reformat_7
09/04/09 13:48:15 WARN cascade.Cascade: [reformat_7] stopped flows
09/04/09 13:48:15 WARN cascade.Cascade: [reformat_7] shutting down
flow executor
09/04/09 13:48:15 WARN cascade.Cascade: [reformat_7] shutdown complete
cascading.cascade.CascadeException: flow failed: reformat_7
at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:428)
at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:369)
at java.util.concurrent.FutureTask$Sync.innerRun
(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: cascading.flow.FlowException: step failed: (1/1) ',
'cheezburgerz_html', 'cheezburgerz_aol', 'cheezburgerz_text',
'cheezburgerz_web', 'cheezburgerz_first', 'cheezburgerz_last']]"]["/
output/7_flat"]"]
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:466)
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:397)
... 5 more

Output from a Template Tap application:

09/04/09 13:43:15 INFO flow.MultiMapReducePlanner: using application
jar: /mnt/hadoop/cascading/responsecurve.jar
09/04/09 13:43:15 INFO cascade.Cascade: [offlinetest] starting
09/04/09 13:43:15 INFO cascade.Cascade: [offlinetest] starting flows:
1
09/04/09 13:43:15 INFO cascade.Cascade: [offlinetest] allocating
threads: 1
09/04/09 13:43:15 INFO cascade.Cascade: [offlinetest] starting flow:
offlinetest
09/04/09 13:43:15 INFO flow.Flow: [offlinetest] atleast one sink does
not exist
09/04/09 13:43:15 INFO flow.Flow: [offlinetest] starting
09/04/09 13:43:15 INFO flow.Flow: [offlinetest] source: Hfs["TextLine
[['offset', 'line']->[ALL]]"]["/source/offlinetest"]"]
09/04/09 13:43:15 INFO flow.Flow: [offlinetest] sink:
com.silverpop.cmc.reports.mailingreport.SPTemplateTap@430c0325
09/04/09 13:43:15 INFO flow.Flow: [offlinetest] parallel execution is
enabled: true
09/04/09 13:43:15 INFO flow.Flow: [offlinetest] starting jobs: 1
09/04/09 13:43:15 INFO flow.Flow: [offlinetest] allocating threads: 1
09/04/09 13:43:15 INFO flow.FlowStep: [offlinetest] starting step:
(1/1) com.silverpop.cmc.reports.mailingreport.SPTemplateTap@430c0325
09/04/09 13:43:15 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the
same.
09/04/09 13:43:15 INFO mapred.FileInputFormat: Total input paths to
process : 1
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] completion events
count: 10
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000003_0, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_r_000002_0, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000003_1, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000003_2, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000003_3, Status : TIPFAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000002_0, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_r_000001_0, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000002_1, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000002_2, Status : FAILED
09/04/09 13:44:16 WARN flow.FlowStep: [offlinetest] event = Task Id :
attempt_200904091327_0006_m_000002_3, Status : TIPFAILED
09/04/09 13:44:16 WARN flow.Flow: stopping jobs
09/04/09 13:44:16 INFO flow.FlowStep: [offlinetest] stopping: (1/1)
com.silverpop.cmc.reports.mailingreport.SPTemplateTap@430c0325
09/04/09 13:44:16 WARN flow.Flow: stopped jobs
09/04/09 13:44:16 WARN flow.Flow: shutting down job executor
09/04/09 13:44:16 WARN flow.Flow: shutdown complete
09/04/09 13:44:16 WARN cascade.Cascade: [offlinetest] flow failed:
offlinetest
cascading.flow.FlowException: step failed: (1/1)
com.silverpop.cmc.reports.mailingreport.SPTemplateTap@430c0325
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:466)
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:397)
at java.util.concurrent.FutureTask$Sync.innerRun
(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
09/04/09 13:44:16 WARN cascade.Cascade: [offlinetest] stopping flows
09/04/09 13:44:16 INFO cascade.Cascade: [offlinetest] stopping flow:
offlinetest
09/04/09 13:44:16 WARN cascade.Cascade: [offlinetest] stopped flows
09/04/09 13:44:16 WARN cascade.Cascade: [offlinetest] shutting down
flow executor
09/04/09 13:44:16 WARN cascade.Cascade: [offlinetest] shutdown
complete
cascading.cascade.CascadeException: flow failed: offlinetest
at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:428)
at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:369)
at java.util.concurrent.FutureTask$Sync.innerRun
(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: cascading.flow.FlowException: step failed: (1/1)
com.silverpop.cmc.reports.mailingreport.SPTemplateTap@430c0325
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:466)
at cascading.flow.FlowStep$FlowStepJob.call(FlowStep.java:397)
... 5 more

Chris K Wensel

Apr 9, 2009, 2:55:20 PM
to cascadi...@googlegroups.com
Thanks for testing this.

I'll need to see the task output (stdout and syslog). You can send
them directly to me.
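
For anyone collecting those task logs: the client output above only lists attempt IDs with a FAILED or TIPFAILED status, so a first step is pulling those IDs out of the console log. Below is a minimal sketch of that, not part of Cascading or Hadoop. The class name `FailedAttempts` and the sample text in `main` are made up for illustration, and the regular expression is only inferred from the log lines pasted in this thread. Where the per-attempt stdout and syslog files live depends on the cluster setup, so that part is left out.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the IDs of failed task attempts from client log text.
class FailedAttempts {
    // Pattern inferred from lines such as
    // "attempt_200904091327_0007_m_000017_0, Status : FAILED" (assumption, not a documented format).
    private static final Pattern FAILED = Pattern.compile(
        "(attempt_\\d+_\\d+_[mr]_\\d+_\\d+),\\s*Status\\s*:\\s*(?:TIP)?FAILED");

    // Returns the distinct failed attempt IDs in the order they first appear.
    static Set<String> failedAttemptIds(String log) {
        Set<String> ids = new LinkedHashSet<>();
        Matcher m = FAILED.matcher(log);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample text shaped like the wrapped log lines in this thread.
        String log = "WARN flow.FlowStep: [reformat_7] event = Task Id :\n"
            + "attempt_200904091327_0007_m_000017_0, Status : FAILED\n"
            + "WARN flow.FlowStep: [reformat_7] event = Task Id :\n"
            + "attempt_200904091327_0007_m_000017_3, Status : TIPFAILED\n";
        for (String id : failedAttemptIds(log)) {
            System.out.println(id);
        }
    }
}
```

Each printed ID names one task attempt whose stdout and syslog are worth sending along.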

ckw

Chris K Wensel

Apr 10, 2009, 10:56:46 AM
to cascadi...@googlegroups.com
Looks like this is an issue with 0.19.0 and not with 0.19.1.

I guess 1.0.8 will require at least Hadoop 0.19.1.
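
A minimum-version requirement like this can be checked up front, before a flow is submitted. The sketch below is only an illustration of comparing dotted version strings. The class `HadoopVersionCheck` and its methods are hypothetical and not part of Cascading. The running version is passed in as a plain string, and how it is obtained from the cluster is left as an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compares dotted version strings such as "0.19.0" and "0.19.1".
class HadoopVersionCheck {
    private static final Pattern VERSION = Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    // Parses the leading numeric components; a missing third component counts as 0.
    static int[] parse(String version) {
        Matcher m = VERSION.matcher(version);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized version: " + version);
        }
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)), patch };
    }

    // True when 'actual' is the same as or newer than 'required'.
    static boolean isAtLeast(String actual, String required) {
        int[] a = parse(actual);
        int[] r = parse(required);
        for (int i = 0; i < 3; i++) {
            if (a[i] != r[i]) {
                return a[i] > r[i];
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The version to test is an input here; "0.19.0" is the release discussed in this thread.
        String running = args.length > 0 ? args[0] : "0.19.0";
        System.out.println(running + " meets 0.19.1 minimum: " + isAtLeast(running, "0.19.1"));
    }
}
```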

ckw

Chris K Wensel

Apr 10, 2009, 11:15:16 AM
to cascadi...@googlegroups.com
Found the Hadoop bug. It is resolved in 0.19.1:
https://issues.apache.org/jira/browse/HADOOP-4847

ckw

Chris K Wensel

Apr 10, 2009, 3:47:01 PM
to cascadi...@googlegroups.com
Hey all

So that 1.0.8 doesn't have any special 0.19.x requirements, I'm going to
revert support for the Hadoop OutputCommitter in Cascading (as it is
broken in 0.19.0).

This will delay the 1.0.8 maintenance release until probably early next week.

In the meantime, I do have regression-tested versions of wip-1.0.8 on
GitHub: one version for Hadoop 0.19.1+ and another for Hadoop 0.18.3+.

http://github.com/cwensel/cascading/tree/master

cheers,
ckw

Chris K Wensel

Apr 11, 2009, 5:29:52 PM
to cascadi...@googlegroups.com
FYI

I just pushed support for Hadoop 0.18.3+ and Hadoop 0.19.0+ up to
GitHub.

http://github.com/cwensel/cascading/tree/master

Please give it a test.

If it wasn't for one package renaming, we wouldn't need two
distributions. Oh well, heh.

cheers,
chris