Cascading counter


Velkumar Neel

Dec 29, 2014, 12:41:45 AM
to cascadi...@googlegroups.com
Hi
Suppose I have a Cascading pipe like this:

Pipe P1 = new UniqueCount(P1, new Fields(Field1, Field2, Field3), new Fields(COUNT));

I need to assign the COUNT from this pipe to a local variable so I can display statistics on how many records are present.

How should I assign this value?

(OR)

What is the other way to get the count value?

Thanks
Vel

Ken Krugler

Dec 29, 2014, 10:30:06 AM
to cascadi...@googlegroups.com
You'll have N values for COUNT, one for each unique combination of values for Field1, Field2, and Field3, right?

So it's not a single value, and it doesn't represent how many records were present, it represents how many groups you get from a key composed of those three fields.
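
To make the distinction concrete, here is a minimal plain-Java sketch (not Cascading code) of what a per-key count produces. The class and method names are hypothetical, and it assumes each group's value is simply the number of rows sharing that three-field key. The result is one value per distinct combination; a single "how many records" figure only appears once those values are summed.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: grouping rows by a three-field key.
final class GroupCountSketch {
    // Counts rows per distinct (field1, field2, field3) key, in first-seen order.
    static Map<List<String>, Long> countPerGroup(List<String[]> rows) {
        Map<List<String>, Long> counts = new LinkedHashMap<>();
        for (String[] row : rows) {
            List<String> key = Arrays.asList(row[0], row[1], row[2]);
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // The total number of records is the sum of the per-group counts.
    static long totalRecords(Map<List<String>, Long> counts) {
        long total = 0;
        for (long c : counts.values()) {
            total += c;
        }
        return total;
    }
}
```

With three input rows of which two share a key, `countPerGroup` returns two entries, while `totalRecords` returns three.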

In any case, some options...

1. Attach P1 to a text file sink Tap, then read that when the Flow has completed.

2. Attach P1 to a database sink Tap, and read those records when the Flow has completed.

3. If the number of unique combinations of fields is going to be "small" (say less than 100) then you could use a custom function that synthesizes a counter name from the field values and uses the count as the increment for that counter.
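
The naming-and-increment logic of option 3 can be sketched in plain Java as below. This is only an illustration under assumptions: the class name, the "|" separator, and the in-memory map are all hypothetical, with the map standing in for the job's counter store. In a real Cascading function the increment would presumably go through the flow process's counter API (in Cascading 2.x, something like `flowProcess.increment(group, name, count)`) instead of a map.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: one counter per unique field combination.
final class SyntheticCounterSketch {
    // Stand-in for the job's counter store: counter name -> accumulated value.
    private final Map<String, Long> counters = new TreeMap<>();

    // Builds a counter name from the grouping field values, e.g. "a|x|1".
    static String counterName(String field1, String field2, String field3) {
        return String.join("|", field1, field2, field3);
    }

    // Uses the group's count as the increment for its synthesized counter.
    void record(String field1, String field2, String field3, long count) {
        counters.merge(counterName(field1, field2, field3), count, Long::sum);
    }

    // Returns the accumulated value, or 0 if the counter was never incremented.
    long value(String name) {
        return counters.getOrDefault(name, 0L);
    }
}
```

Because every distinct combination creates a new counter, this only stays manageable when the number of combinations is small, which is the caveat attached to option 3.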

-- Ken

--------------------------
Ken Krugler
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr





Velkumar Neel

Dec 29, 2014, 3:06:57 PM
to cascadi...@googlegroups.com
Hi Ken
Thanks. But I cannot attach that pipe to a tail sink, because I have intermediate pipes whose counts are also needed. What should I do in this case?

Thanks
Vel
--
You received this message because you are subscribed to a topic in the Google Groups "cascading-user" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/cascading-user/JtiSFDPNrpA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/8F70BB3B-0A4C-470B-9DEA-758E520B3A8B%40transpac.com.
For more options, visit https://groups.google.com/d/optout.

Ken Krugler

Dec 30, 2014, 10:48:00 AM
to cascadi...@googlegroups.com
Split the pipe if you need it both as output and as input to subsequent Flow steps.

Pipe outputCountersPipe = new Pipe("output counters", P1);

Now continue using P1 in the rest of your Flow, and connect the outputCountersPipe to the sink.

-- Ken

Velkumar Neel

Dec 30, 2014, 6:10:24 PM
to cascadi...@googlegroups.com
Hi Ken
Thanks. If we need to count how many records are present in a pipe, how do we do that?


Ken Krugler

Dec 30, 2014, 6:22:43 PM
to cascadi...@googlegroups.com

Velkumar Neel

Jan 6, 2015, 4:01:43 PM
to cascadi...@googlegroups.com
Hi Ken
I have declared the counter as below 
Public enum counters
 public static Enum [] getCounter
{
return new Enum[]
{
Tot_no_of_records,
no_of_records_selected
}

In my Cascading flow I have declared a counter as below
int j = 1;
Pipe = new Each(Pipe, new Counter(counters.getCounter, int J)

I am getting a counter value of zero. Am I doing anything wrong?

Thanks
Vel

Ken Krugler

Jan 6, 2015, 4:29:28 PM
to cascadi...@googlegroups.com
Hi Vel,

No idea what you're doing with the getCounter call.

Why not

public enum MyCounters {
    Tot_no_of_records,
    no_of_records_selected
}

And then

pipe = new Each(pipe, new Counter(MyCounters.Tot_no_of_records));

-- Ken


Velkumar Neel

Jan 7, 2015, 1:05:16 PM
to cascadi...@googlegroups.com
Hi Ken
Thanks. I tried to display the counter value when running Cascading locally, but it didn't give the count value:
Sysprint(MyCounters.Tot_no_of_records).
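
One likely reason a print like this shows no count is that the enum constant is only the counter's name; the running value is held by the framework, not by the enum. The plain-Java sketch below illustrates that separation with a hypothetical class in which an `EnumMap` stands in for the framework's counter store. In Cascading itself, the value would normally be read from the flow's statistics after the flow completes (as far as I recall, via `flow.getFlowStats().getCounterValue(MyCounters.Tot_no_of_records)`); treat that call as an assumption to check against the version in use.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration: the enum names the counter, a separate store holds its value.
final class CounterValueSketch {
    enum MyCounters { Tot_no_of_records, no_of_records_selected }

    // Stand-in for the framework's counter store; the enum constant is only the key.
    private final Map<MyCounters, Long> values = new EnumMap<>(MyCounters.class);

    // Adds the given amount to the named counter.
    void increment(MyCounters counter, long amount) {
        values.merge(counter, amount, Long::sum);
    }

    // Looks up the accumulated value; counters never incremented read as 0.
    long getCounterValue(MyCounters counter) {
        return values.getOrDefault(counter, 0L);
    }
}
```

Printing `MyCounters.Tot_no_of_records` directly yields the text `Tot_no_of_records`, whereas `getCounterValue` returns the number accumulated under that key.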

Velkumar Neel

Feb 8, 2021, 6:46:59 PM
to cascadi...@googlegroups.com
Hi Ken
I am trying to print the pipe's field values, but they are not showing in the debug logs. I last worked on this product 5 years ago. Please help me with this.

I am using:

pipe = new Each(pipe, new Fields("name"), new Debug("print", true));

Is there any other setting I need?

Please note the pipe I am debugging is an intermediate pipe.

Thanks in advance 

Ken Krugler

Feb 8, 2021, 7:50:39 PM
to cascadi...@googlegroups.com
When you run Cascading locally, I don’t think Hadoop counters work, because there is no real Job Manager.

Though it’s been a while (like almost 5 years) since I had to deal with that.

— Ken

--------------------------
Ken Krugler
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr

Velkumar Neel

Feb 8, 2021, 8:12:50 PM
to cascadi...@googlegroups.com
Hi Ken
Thanks for the prompt reply, as always.
I am running on a cluster.

Chris K Wensel

Feb 8, 2021, 8:30:23 PM
to cascadi...@googlegroups.com
The Debug filter will print to stdout wherever it’s run (it doesn’t rely on counters, fwiw).

You might double-check that stdout isn’t redirected somewhere non-obvious on your cluster.

Debug is also a PlannedOperation, and can be removed by the planner during planning. See what DebugLevel you set on the FlowDef. By default it is left in.

ckw

Velkumar Neel

Feb 8, 2021, 9:11:48 PM
to cascadi...@googlegroups.com
Does it need to be flowdef.verbose? It is currently set to the default.

Chris K Wensel

Feb 8, 2021, 9:34:35 PM
to cascadi...@googlegroups.com
The default is to leave in the Debug. 

Try running the flow locally to confirm it’s printing to stdout.

ckw

Velkumar Neel

Feb 9, 2021, 5:56:18 PM
to cascadi...@googlegroups.com
Thank you both for the prompt reply.

It worked.

One more question, please.

I have a source pipe which may or may not have records. How do I handle the FlowDef when the source is empty?

Chris K Wensel

Feb 10, 2021, 9:52:34 AM
to cascadi...@googlegroups.com
I’m unsure what outcome you expect?

You can always test the sources for size before you plan the flow. Taps can be used independently.
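
As a rough sketch of such a pre-check, the plain-Java helper below tests whether a local text file contains at least one line before any flow is planned. It is an assumption-laden stand-in: the class and method names are hypothetical, and it reads a local file with `java.nio` instead of opening a Cascading Tap, so a source on HDFS or one made of several part files would need a different check.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration: decide whether a source is worth planning a flow for.
final class SourceCheckSketch {
    // Returns true if the source file exists and contains at least one line.
    static boolean hasRecords(Path source) {
        if (!Files.isRegularFile(source)) {
            return false;
        }
        try (BufferedReader reader = Files.newBufferedReader(source)) {
            return reader.readLine() != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller could skip building and running the flow when `hasRecords` returns false, or substitute whatever empty-input behavior is expected.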

Cascading 4 (under dev) https://github.com/cwensel/cascading has added new APIs to the Tap interface providing the Java stream API for reading data locally, for example.

ckw
