IllegalStateException: too many captured primary elements -- Exception in thread "main" cascading.flow.planner.PlannerException: registry: MapReduceHadoopRuleRegistry, phase: PostPipelines, failed on rule: RemoveMalformedHashJoinPipelineTransformer


Shishir Adhikari

Jul 26, 2016, 1:19:20 AM
to cascading-user
Hello there,

I am part of a team responsible for upgrading an existing, working Cascading 2.6.0 application to Cascading 3.1.0. As part of the upgrade I also changed the flow connector from HadoopFlowConnector to Hadoop2MR1FlowConnector (we are on Hadoop 2). The job runs fine on the older Cascading version with HadoopFlowConnector, but it fails on the newer version and I am having difficulty figuring out why.
Here is the log:
16/07/26 10:24:43 INFO property.AppProps: using app.id: F27E30B5EA624039B70E482B50755CB7
16/07/26 10:24:44 INFO flow.Flow: [MyJobName] executed rule registry: MapReduceHadoopRuleRegistry, completed as: ILLEGAL, in: 00:00.733
16/07/26 10:24:44 INFO flow.Flow: [MyJobName] rule registry: MapReduceHadoopRuleRegistry, found assembly to be malformed
Exception in thread "main" cascading.flow.planner.PlannerException: registry: MapReduceHadoopRuleRegistry, phase: PostPipelines, failed on rule: RemoveMalformedHashJoinPipelineTransformer, see attached source element-graph
at cascading.flow.planner.rule.RuleExec.performTransform(RuleExec.java:418)
at cascading.flow.planner.rule.RuleExec.performMutation(RuleExec.java:224)
at cascading.flow.planner.rule.RuleExec.executeRulePhase(RuleExec.java:176)
at cascading.flow.planner.rule.RuleExec.planPhases(RuleExec.java:123)
at cascading.flow.planner.rule.RuleExec.exec(RuleExec.java:84)
at cascading.flow.planner.rule.RuleSetExec.execPlannerFor(RuleSetExec.java:153)
at cascading.flow.planner.rule.RuleSetExec$3.call(RuleSetExec.java:336)
at cascading.flow.planner.rule.RuleSetExec$3.call(RuleSetExec.java:328)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: too many captured primary elements
at cascading.flow.planner.iso.transformer.RemoveBranchGraphTransformer.transformGraphInPlaceUsing(RemoveBranchGraphTransformer.java:58)
at cascading.flow.planner.iso.transformer.RecursiveGraphTransformer.transform(RecursiveGraphTransformer.java:118)
at cascading.flow.planner.iso.transformer.RecursiveGraphTransformer.transform(RecursiveGraphTransformer.java:72)
at cascading.flow.planner.rule.RuleTransformer.transform(RuleTransformer.java:85)
at cascading.flow.planner.rule.RuleExec.performTransform(RuleExec.java:414)
... 13 more

One thing that might be helpful: because of a business requirement, I have to iterate multiple times within an assembly to apply a number of buffers. The result of each buffer is merged and returned. Even with the new version of Cascading, there is no error if the number of iterations is at most 2. Here is a snippet of the code:

Pipe[] myPipe1Collection = new Pipe[myCount];
for (int i = 0; i < myCount; i++)
{
    // param1 and param2 vary on each iteration, which determines the internal logic
    Buffer myFirstBuffer = new MyFirstBuffer(param1, param2,..., paramN);
    Pipe myPipe1 = new Pipe("myPipeName" + i, previousPipe);
    // group and sort this branch, then apply the buffer to every group
    myPipe1Collection[i] = new GroupBy(myPipe1, groupFields, sortFields, true);
    myPipe1Collection[i] = new Every(myPipe1Collection[i], myFirstBuffer, Fields.RESULTS);

    // similar for other buffers
}
Pipe myTail1Pipe = new Merge(myPipe1Collection);
// merging other buffers as well

I don't want to change my working code base. Please help me figure out the problem. What does "too many captured primary elements" actually mean?

Regards,
Shishir Adhikari

Chris K Wensel

Jul 26, 2016, 12:53:40 PM
to cascadi...@googlegroups.com
If you can provide a test case as a pull request, I can work with the planner to resolve it.


As a workaround, you can provide a custom rule registry that leaves out the RemoveMalformedHashJoinPipelineTransformer rule, since you don’t seem to have any HashJoins in the assembly.

ckw

--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at https://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/e06883fe-8b83-43bb-9c8c-145b91ca55ce%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Shishir Adhikari

Jul 28, 2016, 12:31:07 AM
to cascading-user
Hi Chris,

Actually, I did have a HashJoin in my assembly. I replaced the HashJoin with a CoGroup and everything is working now.

I will try to replicate the issue in a test case and get back to you.

Regards, 

jaydee...@gwynniebee.com

Jan 22, 2018, 4:38:23 AM
to cascading-user
Hey,
I'm facing the same issue. Did you find any solution other than replacing HashJoin with CoGroup?

Shishir Adhikari

Jan 22, 2018, 5:46:04 AM
to cascading-user
I couldn't explore it further because I was busy with other priorities.

Chris K Wensel

Jan 23, 2018, 9:51:09 AM
to cascadi...@googlegroups.com
This is a generic error. So I really need a test case in order to debug it.

3.3 is in active development, so a PR with a test case from there would be great.


ckw

