12/06/29 11:28:06 INFO util.HadoopUtil: resolving application jar from found main method on: Main
12/06/29 11:28:06 INFO planner.HadoopPlanner: using application jar: /home/guthriec/cascadingManual/CascadingJob.jar
12/06/29 11:28:06 INFO property.AppProps: using app.id: 58992DBF194B209F568EA9053FD38B5A
12/06/29 11:28:06 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
12/06/29 11:28:07 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:07 INFO hadoop.Hfs: forcing job to local mode, via source: Lfs["TextLine[['offset', 'line']->[ALL]]"]["data/url+page.200.txt"]"]
12/06/29 11:28:07 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:07 INFO planner.HadoopPlanner: using application jar: /home/guthriec/cascadingManual/CascadingJob.jar
12/06/29 11:28:07 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:07 INFO planner.HadoopPlanner: using application jar: /home/guthriec/cascadingManual/CascadingJob.jar
12/06/29 11:28:07 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:07 WARN conf.Configuration: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
12/06/29 11:28:07 INFO hadoop.Hfs: forcing job to local mode, via sink: Lfs["TextLine[['offset', 'line']->['url', 'word', 'count']]"]["local/urls"]"]
12/06/29 11:28:07 INFO planner.HadoopPlanner: using application jar: /home/guthriec/cascadingManual/CascadingJob.jar
12/06/29 11:28:07 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:07 INFO hadoop.Hfs: forcing job to local mode, via sink: Lfs["TextLine[['offset', 'line']->['word', 'count']]"]["local/words"]"]
12/06/29 11:28:07 INFO util.Version: Concurrent, Inc - Cascading 2.0.1
12/06/29 11:28:07 INFO cascade.Cascade: [import pages+url pipe+...] starting
12/06/29 11:28:07 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:07 INFO cascade.Cascade: [import pages+url pipe+...] parallel execution is enabled: true
12/06/29 11:28:07 INFO cascade.Cascade: [import pages+url pipe+...] starting flows: 4
12/06/29 11:28:07 INFO cascade.Cascade: [import pages+url pipe+...] allocating threads: 4
12/06/29 11:28:07 INFO cascade.Cascade: [import pages+url pipe+...] starting flow: import pages
12/06/29 11:28:07 INFO flow.Flow: [import pages] at least one sink does not exist
12/06/29 11:28:07 INFO flow.Flow: [import pages] starting
12/06/29 11:28:07 INFO flow.Flow: [import pages] source: Lfs["TextLine[['offset', 'line']->[ALL]]"]["data/url+page.200.txt"]"]
12/06/29 11:28:07 INFO flow.Flow: [import pages] sink: Hfs["SequenceFile[['url', 'page']]"]["output/pages"]"]
12/06/29 11:28:07 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:07 INFO flow.Flow: [import pages] parallel execution is enabled: true
12/06/29 11:28:07 INFO flow.Flow: [import pages] starting jobs: 1
12/06/29 11:28:07 INFO flow.Flow: [import pages] allocating threads: 1
12/06/29 11:28:07 INFO flow.FlowStep: [import pages] starting step: (1/1) output/pages
12/06/29 11:28:08 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:08 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
12/06/29 11:28:08 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/06/29 11:28:08 WARN snappy.LoadSnappy: Snappy native library is available
12/06/29 11:28:08 INFO snappy.LoadSnappy: Snappy native library loaded
12/06/29 11:28:08 INFO mapred.FileInputFormat: Total input paths to process : 1
12/06/29 11:28:08 INFO mapreduce.JobSubmitter: number of splits:2
12/06/29 11:28:08 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
12/06/29 11:28:08 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
12/06/29 11:28:08 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
12/06/29 11:28:08 WARN conf.Configuration: mapred.output.key.comparator.class is deprecated. Instead, use mapreduce.job.output.key.comparator.class
12/06/29 11:28:08 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
12/06/29 11:28:08 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
12/06/29 11:28:08 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
12/06/29 11:28:08 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
12/06/29 11:28:08 INFO mapred.ResourceMgrDelegate: Submitted application application_1340992436467_0015 to ResourceManager at /0.0.0.0:8032
12/06/29 11:28:08 INFO flow.FlowStep: [import pages] submitted hadoop job: job_1340992436467_0015
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] task completion events identify failed tasks
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] task completion events count: 7
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] event = Task Id : attempt_1340992436467_0015_m_000000_0, Status : FAILED
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] event = Task Id : attempt_1340992436467_0015_m_000001_0, Status : FAILED
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] event = Task Id : attempt_1340992436467_0015_m_000001_1, Status : FAILED
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] event = Task Id : attempt_1340992436467_0015_m_000000_1, Status : FAILED
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] event = Task Id : attempt_1340992436467_0015_m_000001_2, Status : FAILED
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] event = Task Id : attempt_1340992436467_0015_m_000000_2, Status : FAILED
12/06/29 11:28:34 WARN flow.FlowStep: [import pages] event = Task Id : attempt_1340992436467_0015_m_000001_3, Status : TIPFAILED
12/06/29 11:28:34 INFO flow.Flow: [import pages] stopping all jobs
12/06/29 11:28:34 INFO flow.FlowStep: [import pages] stopping: (1/1) output/pages
12/06/29 11:28:34 INFO mapred.ResourceMgrDelegate: Killing application application_1340992436467_0015
12/06/29 11:28:34 INFO flow.Flow: [import pages] stopped all jobs
12/06/29 11:28:34 INFO util.Hadoop18TapUtil: deleting temp path output/pages/_temporary
12/06/29 11:28:34 WARN cascade.Cascade: [import pages+url pipe+...] flow failed: import pages
cascading.flow.FlowException: local step failed
    at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:191)
    at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:137)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:122)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
12/06/29 11:28:34 INFO cascade.Cascade: [import pages+url pipe+...] starting flow: export word
12/06/29 11:28:34 INFO cascade.Cascade: [import pages+url pipe+...] starting flow: export url
12/06/29 11:28:34 INFO flow.Flow: [export word] at least one sink does not exist
12/06/29 11:28:34 INFO flow.Flow: [export url] at least one sink does not exist
12/06/29 11:28:34 INFO flow.Flow: [export word] starting
12/06/29 11:28:34 INFO flow.Flow: [export word] source: Hfs["SequenceFile[['word', 'count']]"]["output/words"]"]
12/06/29 11:28:34 INFO flow.Flow: [export word] sink: Lfs["TextLine[['offset', 'line']->['word', 'count']]"]["local/words"]"]
12/06/29 11:28:34 INFO flow.Flow: [export word] parallel execution is enabled: true
12/06/29 11:28:34 INFO flow.Flow: [export word] starting jobs: 1
12/06/29 11:28:34 INFO flow.Flow: [export word] allocating threads: 1
12/06/29 11:28:34 INFO flow.FlowStep: [export word] starting step: (1/1) local/words
12/06/29 11:28:34 INFO flow.Flow: [export url] starting
12/06/29 11:28:34 INFO flow.Flow: [export url] source: Hfs["SequenceFile[['url', 'word', 'count']]"]["output/urls"]"]
12/06/29 11:28:34 INFO flow.Flow: [export url] sink: Lfs["TextLine[['offset', 'line']->['url', 'word', 'count']]"]["local/urls"]"]
12/06/29 11:28:34 INFO flow.Flow: [export url] parallel execution is enabled: true
12/06/29 11:28:34 INFO flow.Flow: [export url] starting jobs: 1
12/06/29 11:28:34 INFO flow.Flow: [export url] allocating threads: 1
12/06/29 11:28:34 INFO flow.FlowStep: [export url] starting step: (1/1) local/urls
12/06/29 11:28:34 INFO mapred.FileInputFormat: Total input paths to process : 0
12/06/29 11:28:34 INFO mapred.FileInputFormat: Total input paths to process : 0
12/06/29 11:28:34 INFO mapreduce.JobSubmitter: number of splits:0
12/06/29 11:28:34 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
12/06/29 11:28:34 INFO mapreduce.JobSubmitter: number of splits:0
12/06/29 11:28:34 INFO mapred.ResourceMgrDelegate: Submitted application application_1340992436467_0016 to ResourceManager at /0.0.0.0:8032
12/06/29 11:28:34 INFO flow.FlowStep: [export word] submitted hadoop job: job_1340992436467_0016
12/06/29 11:28:34 INFO mapred.ResourceMgrDelegate: Submitted application application_1340992436467_0017 to ResourceManager at /0.0.0.0:8032
12/06/29 11:28:34 INFO flow.FlowStep: [export url] submitted hadoop job: job_1340992436467_0017
12/06/29 11:28:45 WARN flow.FlowStep: [export word] task completion events identify failed tasks
12/06/29 11:28:45 WARN flow.FlowStep: [export word] task completion events count: 0
12/06/29 11:28:45 INFO flow.Flow: [export word] stopping all jobs
12/06/29 11:28:45 INFO flow.FlowStep: [export word] stopping: (1/1) local/words
12/06/29 11:28:45 INFO mapred.ResourceMgrDelegate: Killing application application_1340992436467_0016
12/06/29 11:28:45 INFO flow.Flow: [export word] stopped all jobs
12/06/29 11:28:45 WARN cascade.Cascade: [import pages+url pipe+...] flow failed: export word
cascading.flow.FlowException: local step failed
    at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:191)
    at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:137)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:122)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
12/06/29 11:28:45 WARN flow.FlowStep: [export url] task completion events identify failed tasks
12/06/29 11:28:45 WARN flow.FlowStep: [export url] task completion events count: 0
12/06/29 11:28:45 INFO flow.Flow: [export url] stopping all jobs
12/06/29 11:28:45 INFO flow.FlowStep: [export url] stopping: (1/1) local/urls
12/06/29 11:28:45 INFO mapred.ResourceMgrDelegate: Killing application application_1340992436467_0017
12/06/29 11:28:45 INFO flow.Flow: [export url] stopped all jobs
12/06/29 11:28:45 WARN cascade.Cascade: [import pages+url pipe+...] flow failed: export url
cascading.flow.FlowException: local step failed
    at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:191)
    at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:137)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:122)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
12/06/29 11:28:45 INFO cascade.Cascade: [import pages+url pipe+...] stopping all flows
12/06/29 11:28:45 INFO cascade.Cascade: [import pages+url pipe+...] stopping flow: export url
12/06/29 11:28:45 INFO flow.Flow: [export url] stopping all jobs
12/06/29 11:28:45 INFO flow.FlowStep: [export url] stopping: (1/1) local/urls
12/06/29 11:28:45 INFO mapred.ResourceMgrDelegate: Killing application application_1340992436467_0017
12/06/29 11:28:45 INFO flow.Flow: [export url] stopped all jobs
12/06/29 11:28:45 INFO cascade.Cascade: [import pages+url pipe+...] stopping flow: export word
12/06/29 11:28:45 INFO flow.Flow: [export word] stopping all jobs
12/06/29 11:28:45 INFO flow.FlowStep: [export word] stopping: (1/1) local/words
12/06/29 11:28:45 WARN ipc.Client: Unexpected error reading responses on connection Thread[IPC Client (799490076) connection to guthriec-ThinkPad-T420s/127.0.1.1:53224 from guthriec,5,main]
java.lang.NullPointerException
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
12/06/29 11:28:45 INFO mapred.ResourceMgrDelegate: Killing application application_1340992436467_0016
12/06/29 11:28:45 INFO flow.Flow: [export word] stopped all jobs
12/06/29 11:28:45 INFO cascade.Cascade: [import pages+url pipe+...] stopping flow: url pipe+word pipe
12/06/29 11:28:45 INFO flow.Flow: [url pipe+word pipe] stopping all jobs
12/06/29 11:28:45 INFO flow.FlowStep: [url pipe+word pipe] stopping: (2/2) output/words
12/06/29 11:28:45 INFO flow.FlowStep: [url pipe+word pipe] stopping: (1/2) output/urls
12/06/29 11:28:45 INFO flow.Flow: [url pipe+word pipe] stopped all jobs
12/06/29 11:28:45 INFO cascade.Cascade: [import pages+url pipe+...] stopping flow: import pages
12/06/29 11:28:45 INFO flow.Flow: [import pages] stopping all jobs
12/06/29 11:28:45 INFO flow.FlowStep: [import pages] stopping: (1/1) output/pages
12/06/29 11:28:46 INFO ipc.Client: Retrying connect to server: guthriec-ThinkPad-T420s/127.0.1.1:35123. Already tried 0 time(s).
12/06/29 11:28:46 INFO mapred.ResourceMgrDelegate: Killing application application_1340992436467_0015
12/06/29 11:28:46 INFO flow.Flow: [import pages] stopped all jobs
12/06/29 11:28:46 INFO cascade.Cascade: [import pages+url pipe+...] stopped all flows
Exception in thread "main" cascading.cascade.CascadeException: flow failed: import pages
    at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:771)
    at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:710)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: cascading.flow.FlowException: local step failed
    at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:191)
    at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:137)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:122)
    at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42)
    ... 5 more