Hi,
In my project, I have SubAssemblies that process the data, and I need to find the number of tuples processed by each assembly.
I use a Counter in an Each pipe to count the tuples processed by an assembly.
I retrieve the count by reading the counter group from flowStepStats. For example:
long recordCount = 0L;
// Read the value of every counter registered under COUNTER_GROUP
for (String counter : flowStepStats.getCountersFor(COUNTER_GROUP)) {
    recordCount = flowStepStats.getCounterValue(COUNTER_GROUP, counter);
}
If I call getCountersFor() while the flow step is still running, I get an empty list.
The counters become available only after the flow step completes.
The above applies to Hadoop local mode, where counters cannot be read at the flow node level.
In remote mode, I can retrieve counters at the flow node level by reading the counter group from flowNodeStats. For example:
long recordCount = 0L;
for (String counter : flowNodeStats.getCountersFor(COUNTER_GROUP)) {
    recordCount = flowNodeStats.getCounterValue(COUNTER_GROUP, counter);
}
In remote mode, if I try to retrieve the counters while any flow node is still running in the MR job, I get an empty list.
The counters can only be retrieved after a flow node completes.
So in both Hadoop local mode and remote mode, I am not able to get the counters progressively.
I get them only after a flow node (remote) or a flow step (local) completes.
Is there any way to fetch the counters progressively (as the SubAssembly progresses) in Hadoop remote and local mode?
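To make the use case concrete, here is a minimal sketch of the polling I would like to be able to do. CounterPoller, CounterSource, snapshot and poll are hypothetical names of my own, not Cascading APIs; CounterSource only mirrors the two stats methods used above so that the sketch runs without a cluster. Against the real stats objects, every sample currently comes back empty until the flow step or flow node finishes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CounterPoller {

    // Hypothetical stand-in (not a Cascading type) mirroring the two calls used above.
    interface CounterSource {
        Collection<String> getCountersFor(String group);
        long getCounterValue(String group, String counter);
    }

    // Reads every counter currently visible in the group; returns an empty map if none are visible yet.
    static Map<String, Long> snapshot(CounterSource source, String group) {
        Map<String, Long> values = new LinkedHashMap<>();
        for (String counter : source.getCountersFor(group)) {
            values.put(counter, source.getCounterValue(group, counter));
        }
        return values;
    }

    // Takes up to `samples` snapshots, sleeping `intervalMillis` between consecutive ones.
    static List<Map<String, Long>> poll(CounterSource source, String group,
                                        int samples, long intervalMillis) {
        List<Map<String, Long>> history = new ArrayList<>();
        for (int i = 0; i < samples; i++) {
            history.add(snapshot(source, group));
            if (i < samples - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop sampling early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return history;
    }
}
```

With the real objects, a thin adapter would forward these two calls to flowStepStats (local) or flowNodeStats (remote), and poll would run on a separate thread while the flow executes.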
Can someone please suggest a solution for this use case?
Cascading version: 3.1.0
Thanks!!