nested tuples


Derrick Burns

Jul 11, 2012, 4:17:31 PM
to cascadi...@googlegroups.com
I'd like to serialize and deserialize nested tuples. In Cascading local mode (with TextLine or TextDelimited), nested tuples are effectively flattened. I could encode nested tuples manually, but I'd prefer not to. Is there a Scheme/Tap that I can use in local mode to read and write nested tuples?

Second, I assume that nested tuples are properly serialized and deserialized when I use the Hadoop SequenceFile and WritableSequenceFile schemes. Is this correct?

Finally, is anyone aware of a Cascading AVRO serializer/deserializer that properly handles nested tuples?
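
For readers wondering what the manual encoding mentioned above might involve, here is a minimal sketch in plain Java. It does not use any Cascading API: the class name `NestedTupleCodec` and its `encode`/`decode` methods are hypothetical, nested tuples are modeled as `java.util.List` values holding strings or further lists, and the bracketed text format is an assumption chosen only for illustration. It also assumes values contain no line breaks, so each encoded tuple fits on one line of a text file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a hypothetical helper, not part of Cascading. */
class NestedTupleCodec {

    /** Encodes a String or a (possibly nested) List into one line of text. */
    static String encode(Object value) {
        StringBuilder out = new StringBuilder();
        write(value, out);
        return out.toString();
    }

    private static void write(Object value, StringBuilder out) {
        if (value instanceof List) {
            // A nested tuple becomes a bracketed, comma-separated group.
            out.append('[');
            boolean first = true;
            for (Object element : (List<?>) value) {
                if (!first) out.append(',');
                write(element, out);
                first = false;
            }
            out.append(']');
        } else {
            // Any other value is written as a quoted string;
            // quotes and backslashes are escaped with a backslash.
            out.append('"');
            for (char c : String.valueOf(value).toCharArray()) {
                if (c == '"' || c == '\\') out.append('\\');
                out.append(c);
            }
            out.append('"');
        }
    }

    /** Decodes text produced by encode back into Strings and Lists. */
    static Object decode(String text) {
        int[] pos = {0};
        Object result = read(text, pos);
        if (pos[0] != text.length()) {
            throw new IllegalArgumentException("trailing input at " + pos[0]);
        }
        return result;
    }

    private static Object read(String text, int[] pos) {
        char c = text.charAt(pos[0]++);
        if (c == '[') {
            List<Object> list = new ArrayList<>();
            if (text.charAt(pos[0]) == ']') {
                pos[0]++;
                return list; // empty nested tuple
            }
            while (true) {
                list.add(read(text, pos));
                char sep = text.charAt(pos[0]++);
                if (sep == ']') return list;
                if (sep != ',') {
                    throw new IllegalArgumentException("expected ',' or ']' at " + (pos[0] - 1));
                }
            }
        }
        if (c != '"') {
            throw new IllegalArgumentException("expected '[' or '\"' at " + (pos[0] - 1));
        }
        StringBuilder s = new StringBuilder();
        while ((c = text.charAt(pos[0]++)) != '"') {
            if (c == '\\') c = text.charAt(pos[0]++); // take the escaped character literally
            s.append(c);
        }
        return s.toString();
    }

    public static void main(String[] args) {
        List<Object> nested = Arrays.asList("a", Arrays.asList("b", Arrays.asList("c")), "d");
        String line = encode(nested);
        System.out.println(line);                        // ["a",["b",["c"]],"d"]
        System.out.println(decode(line).equals(nested)); // true
    }
}
```

Every value comes back as a string, so type information would have to be carried separately, and truncated input surfaces as an index exception rather than a friendly error. A Scheme that preserves nesting natively avoids all of this, which is the point of the question.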

Sam Ritchie

Jul 11, 2012, 5:12:42 PM
to cascadi...@googlegroups.com
If you use cascading.kryo, you should be able to arbitrarily nest data structures inside of tuple fields with no problem:


--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/cascading-user/-/e3yvlaPclFwJ.
To post to this group, send email to cascadi...@googlegroups.com.
To unsubscribe from this group, send email to cascading-use...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/cascading-user?hl=en.



--
Sam Ritchie, Twitter Inc
@sritchie09

(Too brief? Here's why! http://emailcharter.org)

Oscar Boykin

Jul 11, 2012, 5:16:54 PM
to cascadi...@googlegroups.com
I think a sequence file would work fine for your use case.


Chris K Wensel

Jul 11, 2012, 5:17:21 PM
to cascadi...@googlegroups.com
Yes, nested Tuples are serialized when using any sort of binary Scheme, such as SequenceFile, that uses native Hadoop serialization.

If you want text, take a look at the JSON taps on the site. They may do that.


ckw


Christopher Severs

Jul 12, 2012, 4:30:45 PM
to cascadi...@googlegroups.com


On Wednesday, July 11, 2012 1:17:31 PM UTC-7, Derrick Burns wrote:


Finally, is anyone aware of a Cascading AVRO serializer/deserializer that properly handles nested tuples?

Hi Derrick,

There is a cascading.avro project, which is in a bit of flux right now (mainly due to lack of time). I've been thinking about how to implement reading nested structures, though, and I think I have a reasonable solution. I'll try to get it added to the project soon. Or, if you already have something worked up, go ahead and make a pull request on the 2.0 branch of the project.

-----
Chris
