Exception in thread "main" cascading.cascade.CascadeException: flow failed: catchUnmatched+catchUndated+page
    at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:714)
    at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:653)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: cascading.flow.FlowException: unhandled exception
    at cascading.flow.Flow.complete(Flow.java:821)
    at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:705)
    ... 6 more
Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://10.4.217.249:9000/user/hadoop/output/flavored_access_logs/good
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:197)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
    at cascading.tap.hadoop.MultiInputFormat.getSplits(MultiInputFormat.java:240)
    at cascading.tap.hadoop.MultiInputFormat.getSplits(MultiInputFormat.java:180)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1036)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1028)
    at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:172)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:944)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:897)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:871)
    at cascading.flow.FlowStepJob.blockOnJob(FlowStepJob.java:164)
    at cascading.flow.FlowStepJob.start(FlowStepJob.java:140)
    at cascading.flow.FlowStepJob.call(FlowStepJob.java:129)
    at cascading.flow.FlowStepJob.call(FlowStepJob.java:39)
    ... 5 more

I had a peek at the contents of HDFS, and by the time the job had failed that path did exist, which is weird. This is Cascading 1.2. Is this possibly a known EMR issue, like with S3? Or do I need to change some setting in my Hadoop site config?