We have a jar with various UDFs. We load it using the "add jar"
directive whenever we need to use the UDFs, which is in most
sessions (*). I can use the UDFs just fine in simple queries in
beeline. I'm getting a cryptic error, however, when I use them in an
only slightly more complex query which reads data from one table
and writes the UDF results to another table. Here is the error:
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask. java.io.IOException: Previous writer likely failed to write file:/opt/mr3-run/work-dir/hive/hive/_mr3_session_dir/f82d2277/intzsta-1.0-SNAPSHOT.jar. Failing because I am unlikely to write too.
at org.apache.hadoop.hive.ql.exec.mr3.DAGUtils.localizeResource(DAGUtils.java:1371)
at org.apache.hadoop.hive.ql.exec.mr3.DAGUtils.addTempResources(DAGUtils.java:1260)
at org.apache.hadoop.hive.ql.exec.mr3.DAGUtils.localizeTempFilesFromConf(DAGUtils.java:1171)
at org.apache.hadoop.hive.ql.exec.mr3.MR3Task.setupSubmit(MR3Task.java:241)
at org.apache.hadoop.hive.ql.exec.mr3.MR3Task.execute(MR3Task.java:143)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.executeMr3(TezTask.java:148)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2681)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2352)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2029)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1729)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1723)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:229)
at org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:326)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:344)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
intzsta-1.0-SNAPSHOT.jar is the jar with the UDFs.
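
For reference, the session follows this general pattern (the function,
class, and table names below are illustrative, not our real ones):

```sql
-- Load the UDF jar into the session (path is illustrative)
ADD JAR /path/to/intzsta-1.0-SNAPSHOT.jar;

-- Register one of the UDFs (class name is illustrative)
CREATE TEMPORARY FUNCTION my_udf AS 'com.example.udf.MyUdf';

-- This works fine in beeline:
SELECT my_udf(col) FROM src_table LIMIT 10;

-- This fails with the IOException above:
INSERT INTO dst_table
SELECT my_udf(col) FROM src_table;
```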
David
(*) We still use "add jar" mainly for historical reasons dating back
to when we occasionally updated the jar with new or fixed UDFs. We
almost never need to do that anymore. Does the MR3 implementation of
Hive have a directory from which jars are automatically loaded? I
think the directory or setting was named auxlib or similar in the old
Hadoop version.
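
For what it's worth, I believe our old setup looked something like
this in hive-site.xml (the path is illustrative, and I'm not sure
whether MR3 honors this property the same way):

```xml
<!-- Jars listed here are put on the classpath at startup, so no
     per-session "add jar" is needed (path is illustrative) -->
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///opt/hive/auxlib/intzsta-1.0-SNAPSHOT.jar</value>
</property>
```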
--
David Engel
da...@istwok.net