Hello,
I have a TupleEntry with the keys Name, Age, and Address. Name and Age are plain strings; Address is itself a TupleEntry whose sole key is City. Each operations work fine: I can use Identity, and I can pass the entries to a Function.
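For clarity, here is roughly the shape of the data, sketched with plain java.util.Map instead of the actual TupleEntry types (the values are made up; only the key structure matches what I described):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NestedEntrySketch {
    // Builds a map mirroring the nested structure: Name, Age, and a nested Address entry.
    static Map<String, Object> rootEntry() {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("City", "Springfield"); // hypothetical value

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("Name", "JPD");      // plain string
        root.put("Age", "42");        // plain string
        root.put("Address", address); // nested entry
        return root;
    }

    public static void main(String[] args) {
        System.out.println(rootEntry());
    }
}
```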
When I pass the root TupleEntry to GroupBy, I get the stacktrace below.
Here's the code.
Pipe in = new Pipe("FromArango");
// Group on the Name field (Fields are case-sensitive, so this must match the key exactly)
Pipe grouping = new GroupBy(in, new Fields("Name"));
// BuffOp is my Buffer implementation
Pipe buf = new Every(grouping, new BuffOp());
Pipe cause = new Each(buf, new Identity());
Pipe out = new Each(cause, new ComplexRead.Flattener());
If I get rid of the GroupBy and Every, it works fine.
Is it possible to nest TupleEntries inside TupleEntries? If so, what configuration do I need to set?
cascading.flow.FlowException: local step failed
at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:230)
at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:150)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:124)
at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:43)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: cascading.CascadingException: unable to load serializer for: cascading.tuple.TupleEntry from: org.apache.hadoop.io.serializer.SerializationFactory
at cascading.tuple.hadoop.TupleSerialization.getNewSerializer(TupleSerialization.java:453)
at cascading.tuple.hadoop.TupleSerialization$SerializationElementWriter.write(TupleSerialization.java:743)
at cascading.tuple.io.TupleOutputStream.writeElement(TupleOutputStream.java:114)
at cascading.tuple.io.TupleOutputStream.write(TupleOutputStream.java:89)
at cascading.tuple.io.TupleOutputStream.writeTuple(TupleOutputStream.java:64)
at cascading.tuple.hadoop.io.TupleSerializer.serialize(TupleSerializer.java:37)
at cascading.tuple.hadoop.io.TupleSerializer.serialize(TupleSerializer.java:28)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1074)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:591)
at cascading.tap.hadoop.util.MeasuredOutputCollector.collect(MeasuredOutputCollector.java:69)
at cascading.flow.hadoop.stream.HadoopGroupByGate.receive(HadoopGroupByGate.java:68)
at cascading.flow.hadoop.stream.HadoopGroupByGate.receive(HadoopGroupByGate.java:37)
at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
... 4 more
Thanks,
JPD