Skip header line (Scalding)


amitmor

May 22, 2012, 6:53:43 AM
to cascadi...@googlegroups.com
This question is about Scalding:

I couldn't find out how to tell Scalding's TextLine("filename")#read to skip the header line in an input pipe. Any ideas?

Thanks,
Amit

Oscar Boykin

May 22, 2012, 12:33:53 PM
to cascadi...@googlegroups.com
We don't use header lines at Twitter, so we haven't tested that.

You could look up the cascading docs and implement a source that does this (see TextLine source in scalding) or point me to how to do it in cascading and I might be able to get it into the next jar we publish (today or tomorrow).

--
Oscar Boykin
@posco http://twitter.com/posco

--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/cascading-user/-/7VELeBvEunoJ.
To post to this group, send email to cascadi...@googlegroups.com.
To unsubscribe from this group, send email to cascading-use...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/cascading-user?hl=en.

Sam Ritchie

May 22, 2012, 1:49:32 PM
to cascadi...@googlegroups.com
The TextDelimited scheme implements this in Cascading 2, if you want some reference:

--
Sam Ritchie, Twitter Inc
@sritchie09

(Too brief? Here's why! http://emailcharter.org)

amitmor

May 24, 2012, 9:53:15 AM
to cascadi...@googlegroups.com
Thank you, Oscar and Sam!

I have written a little thing: https://gist.github.com/2781595
Do with it whatever you wish; never mind if you don't!
Apparently, only TextDelimited can accept the header flags, so that's what I have implemented.
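
For readers who cannot open the gist, here is a rough sketch of what a skip-header flag on delimited input amounts to. It is plain Scala with no Scalding or Cascading dependency, and it is not the code from the gist. The names `SkipHeaderSketch`, `parseDelimited`, and `skipHeader` are hypothetical and chosen only for illustration.

```scala
// Hypothetical helper, not part of Scalding or Cascading: it only models
// the effect of a "skip header" flag on delimited text input.
object SkipHeaderSketch {
  def parseDelimited(lines: Iterator[String],
                     skipHeader: Boolean,
                     delimiter: String = "\t"): Iterator[Array[String]] = {
    // Drop the first line only when the caller says it is a header.
    val body = if (skipHeader) lines.drop(1) else lines
    // Quote the delimiter so it is matched literally; the -1 limit keeps
    // trailing empty fields instead of silently discarding them.
    body.map(_.split(java.util.regex.Pattern.quote(delimiter), -1))
  }

  def main(args: Array[String]): Unit = {
    val input = Iterator("id\tname", "1\talice", "2\tbob")
    // Prints the two data rows; the "id\tname" header row is skipped.
    parseDelimited(input, skipHeader = true)
      .foreach(row => println(row.mkString(",")))
  }
}
```

This only covers a single in-memory input. In a real job the skipping would have to happen in the source or scheme, which is where the thread places it.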

Amit


Oscar Boykin

May 24, 2012, 1:51:58 PM
to cascadi...@googlegroups.com
Thanks Amit (and Sam).

If you make a pull request for this against scalding/develop (a change to DelimitedScheme and Tsv), I would pull it.

Don't you want to be a contributor?  :)


amit.m...@gmail.com

May 25, 2012, 12:25:32 AM
to cascadi...@googlegroups.com


Sent from my HTClu.o