cascading.tuple.TupleException: unable to read from input identifier: hdfs://nameservice1/adelbertc/ds=2013-07-25/2013-07-26_1374860564-m-00001
at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:127)
at cascading.flow.stream.SourceStage.map(SourceStage.java:76)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:127)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: cascading.tap.TapException: did not parse correct number of values from input data, expected: 37, got: 1:SEQ !org.apache.hadoop.io.NullWritable org.apache.hadoop.io.Text[binary data]
I'm guessing this is because my files are Snappy compressed and I didn't tell the job that. If it helps, the output I saw on the machine where I ran the hadoop command before it failed is here: http://pastie.org/private/dznuqca6xmyv2nxtvo4wa
--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
For more options, visit https://groups.google.com/groups/opt_out.
We usually use LZO-compressed data, and to read it we use the code in scalding-commons that has Lzo in the trait and class names (sorry, on mobile now).
To read Snappy data, I think you are going to have to write a Cascading SnappyScheme to handle the decompression. Maybe someone has already written one for Cascading?
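One thing that may be worth checking before writing a scheme: the `got:` value in the error begins with `SEQ` followed by a key class name and a value class name, which is how a Hadoop SequenceFile header starts. Below is a minimal Java sketch for confirming that on one part file and seeing which codec the header names. The class name `SequenceFileHeaderSniffer` and its methods are hypothetical, not part of Hadoop, Cascading, or Scalding. The sketch assumes header version 6 and class names shorter than 128 bytes (single-byte length prefix), and it throws on anything else rather than guessing.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: reads the leading fields of a Hadoop SequenceFile header. */
public class SequenceFileHeaderSniffer {

    /** Fields recovered from the header; codecClass is null when the file is uncompressed. */
    public static final class Header {
        public final String keyClass;
        public final String valueClass;
        public final boolean compressed;
        public final boolean blockCompressed;
        public final String codecClass;

        Header(String keyClass, String valueClass, boolean compressed,
               boolean blockCompressed, String codecClass) {
            this.keyClass = keyClass;
            this.valueClass = valueClass;
            this.compressed = compressed;
            this.blockCompressed = blockCompressed;
            this.codecClass = codecClass;
        }
    }

    /** Returns null when the stream does not start with the "SEQ" magic bytes. */
    public static Header read(DataInputStream in) throws IOException {
        byte[] magic = new byte[3];
        in.readFully(magic); // throws EOFException on inputs shorter than 3 bytes
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            return null;
        }
        int version = in.readByte();
        if (version != 6) {
            // Assumption: only the version 6 layout is handled here.
            throw new IOException("this sketch only handles header version 6, got " + version);
        }
        String keyClass = readShortString(in);
        String valueClass = readShortString(in);
        boolean compressed = in.readBoolean();
        boolean blockCompressed = in.readBoolean();
        // The codec class name is only present when the compressed flag is set.
        String codecClass = compressed ? readShortString(in) : null;
        return new Header(keyClass, valueClass, compressed, blockCompressed, codecClass);
    }

    /** Convenience wrapper for in-memory bytes; wraps IOException as unchecked. */
    public static Header parse(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a string with a single-byte length prefix (lengths 0..127 only). */
    private static String readShortString(DataInputStream in) throws IOException {
        int length = in.readByte();
        if (length < 0) {
            throw new IOException("multi-byte length prefix not handled in this sketch");
        }
        byte[] buf = new byte[length];
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is a local copy of one part file.
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            Header h = read(in);
            if (h == null) {
                System.out.println("not a SequenceFile (no SEQ magic)");
            } else {
                System.out.println("key=" + h.keyClass + " value=" + h.valueClass
                        + " compressed=" + h.compressed
                        + " blockCompressed=" + h.blockCompressed
                        + " codec=" + h.codecClass);
            }
        }
    }
}
```

To try it, copy one part file to local disk (for example with `hadoop fs -get`) and pass the local path as the only argument. If the header does name a codec, the input is a compressed sequence file rather than compressed delimited text, which would explain why a delimited-text source sees one value instead of 37.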
On Friday, July 26, 2013, wrote:
How can I tell Scalding to read a CSV file from an HDFS file that was compressed by Snappy?
This is my Scalding code: http://pastie.org/private/qezw1rtkrhrocyjydeona
Command: hadoop jar target/scalding-jobs-0.0.1.jar com.adelbertc.scalding.job.DummyJob --hdfs --input /adelbertc/ds+2013-07-25 --output /adelbertc/dummy
Error:
cascading.tuple.TupleException: unable to read from input identifier: hdfs://nameservice1/adelbertc/ds=2013-07-25/2013-07-26_1374860564-m-00001
at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:127)
at cascading.flow.stream.SourceStage.map(SourceStage.java:76)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:127)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: cascading.tap.TapException: did not parse correct number of values from input data, expected: 37, got: 1:SEQ !org.apache.hadoop.io.NullWritable org.apache.hadoop.io.Text[binary data]
I'm guessing this is because my files are Snappy compressed and I didn't tell the job that. If it helps, the output I saw on the machine where I ran the hadoop command before it failed is here: http://pastie.org/private/dznuqca6xmyv2nxtvo4wa
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/a39417cf-6855-49b8-b57b-c1624df6c85d%40googlegroups.com.