Vertex failed, vertexName=F94065FA50A2402B85BADC54084D324C, vertexId=vertex_1434403130327_0002_1_02, diagnostics=[Vertex vertex_1434403130327_0002_1_02 [F94065FA50A2402B85BADC54084D324C] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: E6E0F344EDC34730A44DA6FD2D212B43 initializer failed, vertex=vertex_1434403130327_0002_1_02 [F94065FA50A2402B85BADC54084D324C], org.apache.tez.dag.api.TezUncheckedException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.avro.mapred.AvroInputFormat not found
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:426)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295)
    at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:122)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.avro.mapred.AvroInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2106)
    at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:689)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:424)
    ... 13 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.avro.mapred.AvroInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2098)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.avro.mapred.AvroInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
    ... 16 more]
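When chasing a ClassNotFoundException like the one in this trace, a quick probe can at least tell you whether a class is visible to the JVM you run it in. The sketch below uses only the JDK; the class name ClasspathCheck and the method isLoadable are made up for illustration. Note the assumption: it only reports on the classpath of the process that runs it, so a positive result on the client says nothing about the Tez application master, which is where the trace above originates.

```java
public class ClasspathCheck {

    // Hypothetical helper: returns true if the named class can be located by
    // the class loader that loaded ClasspathCheck, without initializing it.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class file was found but could not be linked (for example,
            // one of its own dependencies is missing).
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace above.
        String name = "org.apache.avro.mapred.AvroInputFormat";
        System.out.println(name + " loadable here: " + isLoadable(name));
    }
}
```

Running this with the same classpath as the failing process is the useful case; running it anywhere else only confirms what that other JVM can see.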
--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/a2e84675-e632-4128-a1ef-c30d765fbb0b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Log Type: stderr
Log Upload Time: 16-Jun-2015 00:28:50
Log Length: 77
Error: Could not find or load main class org.apache.tez.dag.app.DAGAppMaster
Map<Object, Object> properties = new HashMap<>();
AppProps.setApplicationJarClass(properties, Main.class);
AppProps.setApplicationName(properties, "Data Platform ETL");

properties = FlowRuntimeProps.flowRuntimeProps()
    // level of parallelization during the gather stage. FIXME: don't hardcode
    .setGatherPartitions(4)
    .buildProperties( properties );

properties.put("io.serializations", "cascading.kryo.KryoSerialization");
properties.put(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, "true");
properties.put("tez.lib.uris", "${fs.default.name}/apps/tez-0.6.1-minimal-hadoop26.tar.gz,${fs.default.name}/apps/loan-applications-etl-0.2-SNAPSHOT/lib/avro-mapred-1.7.7-hadoop2.jar");
properties.put("tez.use.cluster.hadoop-libs", "true");
properties.put("yarn.timeline-service.hostname", "master.local"); // FIXME: don't hardcode
properties.put("io.compression.codecs", "org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec");
properties.put("mapred.output.committer.class", "org.apache.hadoop.mapred.FileOutputCommitter");

/* NOT WORKING
HfsProps.setUseCombinedInput(properties, true);
HfsProps.setUseCombinedInputSafeMode(properties, true);
HfsProps.setCombinedInputMaxSize(properties, 134_217_728L);
*/
properties.put("mapred.min.split.size", "33554432");
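The tez.lib.uris value above is one long comma-separated string, which is easy to mistype as more jars get added. As a rough sketch (not part of the original setup), the value could be assembled from a list instead. The class TezLibUris and its join method are hypothetical names; the two paths and the ${fs.default.name} placeholder are copied from the snippet above, and only JDK classes are used.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TezLibUris {

    // Hypothetical helper: prefixes every path with the same base and joins
    // the results with commas, the shape used for tez.lib.uris above.
    static String join(String base, List<String> paths) {
        StringBuilder sb = new StringBuilder();
        for (String path : paths) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(base).append(path);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Object, Object> properties = new HashMap<Object, Object>();

        // Paths taken from the configuration in this thread; the placeholder
        // is passed through untouched, as in the original string.
        List<String> paths = Arrays.asList(
            "/apps/tez-0.6.1-minimal-hadoop26.tar.gz",
            "/apps/loan-applications-etl-0.2-SNAPSHOT/lib/avro-mapred-1.7.7-hadoop2.jar");

        properties.put("tez.lib.uris", join("${fs.default.name}", paths));
        System.out.println(properties.get("tez.lib.uris"));
    }
}
```

This only changes how the string is built; whether listing the extra jar in tez.lib.uris is the right way to get it onto the Tez classpath is a separate question.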
Thanks. I’ll see if I can give it a shot on Monday.
I don’t have any particular attachment to that one packaging model. Our application uses it simply because that’s what the Cascading 2.x documentation told us to do. The official examples have Gradle setups that package the apps this way, and we just copied those. It took hardly any time or effort.
It’s also worth stressing the incredible convenience of it, especially when coupled with EMR. So far all we’ve needed to do to run our apps is to package everything in the one jar and tell EMR to run it. We don’t even need to log into the cluster to do that; submitting the jar through the UI or the AWS SDK works perfectly. We haven’t needed to script the installation of anything into the EMR environment at all, since the lib-directory-in-jar packaging so far has been sufficient to bundle everything we need.
But as far as I’m concerned I’d be just as happy with any alternative that's just as easy, well documented and convenient, so deprecate away if that’s the way the wind blows.
I do recognize, however, that unless Amazon starts putting recent versions of Tez into their default EMR images, we’ll be out of this utopia in the near future. (The only Tez-on-EMR automation I’ve seen so far is the experimental one linked from this thread in the AWS Forums, which seems like an excellent starting point but will take some work to get going.)
Vertex failed, vertexName=CF4F45EB699F41559ABDDE93CD813E6B, vertexId=vertex_1434997756408_0006_1_03, diagnostics=[Task failed, taskId=task_1434997756408_0006_1_03_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.ClassCastException: org.apache.tez.dag.api.TezConfiguration cannot be cast to org.apache.hadoop.mapred.JobConf
    at cascading.avro.AvroScheme.sinkConfInit(AvroScheme.java:53)
    at cascading.tap.Tap.sinkConfInit(Tap.java:206)
    at cascading.tap.hadoop.Hfs.sinkConfInit(Hfs.java:414)
    at cascading.tap.hadoop.Hfs.sinkConfInit(Hfs.java:108)
    at cascading.tap.hadoop.io.TapOutputCollector.initialize(TapOutputCollector.java:96)
    at cascading.tap.hadoop.io.TapOutputCollector.<init>(TapOutputCollector.java:91)
    at cascading.tap.hadoop.io.TapOutputCollector.<init>(TapOutputCollector.java:79)
    at cascading.tap.hadoop.io.TapOutputCollector.<init>(TapOutputCollector.java:74)
    at cascading.tap.hadoop.io.HadoopTupleEntrySchemeCollector.makeCollector(HadoopTupleEntrySchemeCollector.java:57)
    at cascading.tap.hadoop.io.HadoopTupleEntrySchemeCollector.<init>(HadoopTupleEntrySchemeCollector.java:49)
    at cascading.tap.hadoop.Hfs.openForWrite(Hfs.java:447)
    at cascading.tap.hadoop.Hfs.openForWrite(Hfs.java:108)
    at cascading.tap.MultiSinkTap$MultiSinkCollector.<init>(MultiSinkTap.java:82)
    at cascading.tap.MultiSinkTap.openForWrite(MultiSinkTap.java:162)
    at cascading.flow.stream.element.SinkStage.prepare(SinkStage.java:68)
    at cascading.flow.tez.stream.element.TezSinkStage.prepare(TezSinkStage.java:63)
    at cascading.flow.stream.graph.StreamGraph.prepare(StreamGraph.java:181)
    at cascading.flow.tez.FlowProcessor.run(FlowProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:326)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
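The ClassCastException above says that the configuration object handed to the scheme under Tez is a TezConfiguration, which is not a JobConf, so an unconditional downcast fails. The sketch below reproduces that shape with stand-in classes so it runs on a bare JDK. Everything in it (CastSketch, BaseConf, JobConfLike, TezConfLike, both method names and the property key) is hypothetical; it mirrors only the inheritance relationship implied by the error message and is not the actual cascading.avro code or its fix.

```java
import java.util.HashMap;
import java.util.Map;

public class CastSketch {

    // Stand-in for a common configuration base type (hypothetical, simplified).
    static class BaseConf {
        final Map<String, String> values = new HashMap<String, String>();
    }

    // Stand-ins for two unrelated subclasses of that base type.
    static class JobConfLike extends BaseConf {}
    static class TezConfLike extends BaseConf {}

    // Mirrors the failing pattern: assumes the caller always passes the
    // MapReduce-style subclass and downcasts without checking.
    static void sinkConfInitCasting(BaseConf conf) {
        JobConfLike jobConf = (JobConfLike) conf; // throws for TezConfLike
        jobConf.values.put("example.output.schema", "{}");
    }

    // Alternative: use only what the base type offers, so any subclass works
    // and the setting lands on the object the caller actually holds.
    static void sinkConfInitBaseType(BaseConf conf) {
        conf.values.put("example.output.schema", "{}");
    }

    public static void main(String[] args) {
        BaseConf tezConf = new TezConfLike();
        try {
            sinkConfInitCasting(tezConf);
        } catch (ClassCastException e) {
            System.out.println("downcast failed, as in the trace above");
        }
        sinkConfInitBaseType(tezConf);
        System.out.println(tezConf.values.containsKey("example.output.schema"));
    }
}
```

Copying the settings into a fresh object of the expected subclass would also avoid the exception, but values written to the copy would not reach the caller's configuration, which is why the sketch writes through the base type instead.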
From: Luis Casillas
Sent: August 21, 2015 3:57:29pm PDT
To: Ken Krugler; cascadi...@googlegroups.com