H2O in Hadoop/Hive


anixa...@gmail.com

Jun 3, 2016, 6:56:06 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi All,

I have built a classification model in R/H2O and exported the binary model. Now I want to deploy this model/binary file in a Hadoop production environment.

I have kept the binary model in HDFS, and my production data resides in Hive. I want to execute an R script from Hive, load/connect to H2O, load the H2O model, and score the production table (in Hive).

As a setup, I have installed R and H2O on all machines in the Hadoop cluster. My present status:

i) Able to execute R from Hive
ii) Able to load/connect to H2O

But I am not able to load the H2O model from HDFS. I used the following command:

model <- h2o.loadModel("hdfs://model_folder/DRF_model_ID")

DRF_model_ID is the model folder where all the model binaries are stored.
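For completeness, the fuller form of what I am running is roughly this (the H2O host, port, and namenode address below are placeholders for our actual values):

library(h2o)

# Connect to the running H2O cluster (host/port are placeholders)
h2o.init(ip = "localhost", port = 54321)

# Load the saved model from HDFS using a fully qualified URI
# (namenode host and port are placeholders)
model <- h2o.loadModel("hdfs://namenode:8020/model_folder/DRF_model_ID")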

Any suggestions/advice would be helpful.


Regards,
Anirban Ghosh

Tom Kraljevic

Jun 3, 2016, 12:24:17 PM
to anixa...@gmail.com, H2O Open Source Scalable Machine Learning - h2ostream

Hi,

Sorry, without the actual output of the failure there isn't much guidance we can give.

But here are a couple of tests that exercise saving and loading models, which you may want to take a look at:

https://github.com/h2oai/h2o-3/blob/master/h2o-r/tests/testdir_jira/runit_hex_1775_save_load.R
https://github.com/h2oai/h2o-3/blob/master/h2o-r/tests/testdir_hdfs/runit_INTERNAL_HDFS_model_export.R
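One thing worth double-checking: in an hdfs:// URI, the component right after hdfs:// is parsed as the namenode authority, so a fully qualified path looks like hdfs://namenode:8020/model_folder/DRF_model_ID rather than hdfs://model_folder/…. The round trip those tests exercise looks roughly like this (paths and data here are illustrative, not your cluster's values):

library(h2o)
h2o.init()

# Train a small model on an illustrative frame
iris_hex <- as.h2o(iris)
m <- h2o.randomForest(x = 1:4, y = 5, training_frame = iris_hex)

# Save to a directory (local or HDFS); h2o.saveModel returns the full model path
path <- h2o.saveModel(m, path = "hdfs://namenode:8020/models")

# Load it back using the returned path
m2 <- h2o.loadModel(path)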

Tom

anirban ghosh

Jun 6, 2016, 5:54:46 AM
to Tom Kraljevic, H2O Open Source Scalable Machine Learning - h2ostream
Hi All,

This is the error message I am getting:

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"X1":99.10917,"X2":0.0,"X3":11.0,"X4":100.0,"X5":100.0,"X6":80.20368,"X7":0.0,"X8":0.0,"X9":0.0,"X10":0.66543907,"X11":100.0,"X12":47.20982,"X13":12.0,"X14":0.047311783,"X15":0.0,"X16":0.0,"X17":100.0,"X18":100.0,"X19":100.0,"X20":5.8290815,"X21":96.07143,"X22":0.0,"X23":7.0,"X24":50.0,"X25":20.85,"X26":18.0,"X27":null,"X28":"Phone","X29":"-1","X30":"MR","X31":"2G","X32":"2G","X33":null,"X34":null,"X35":null}
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"X1":99.10917,"X2":0.0,"X3":11.0,"X4":100.0,"X5":100.0,"X6":80.20368,"X7":0.0,"X8":0.0,"X9":0.0,"X10":0.66543907,"X11":100.0,"X12":47.20982,"X13":12.0,"X14":0.047311783,"X15":0.0,"X16":0.0,"X17":100.0,"X18":100.0,"X19":100.0,"X20":5.8290815,"X21":96.07143,"X22":0.0,"X23":7.0,"X24":50.0,"X25":20.85,"X26":18.0,"X27":null,"X28":"Phone","X29":"-1","X30":"MR","X31":"2G","X32":"2G","X33":null,"X34":null,"X35":null}
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
        ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: An error occurred while reading or writing to your custom script. It may have crashed with an error.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:410)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
        ... 9 more
Caused by: java.io.IOException: Broken pipe
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:345)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:95)
        at java.io.DataOutputStream.write(DataOutputStream.java:88)
        at org.apache.hadoop.hive.ql.exec.TextRecordWriter.write(TextRecordWriter.java:54)
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:378)
        ... 15 more


FAILED: Execution Error, return code 20001 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred while reading or writing to your custom script. It may have crashed with an error.
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
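For context, the custom script that Hive invokes via TRANSFORM is an R script of roughly this shape: it reads tab-separated rows from stdin, scores them with the loaded model, and writes predictions to stdout (simplified here; if the script crashes or exits early, Hive reports the "Broken pipe" above):

#!/usr/bin/env Rscript
library(h2o)

# Connect and load the model (connection details and paths are placeholders)
h2o.init(ip = "localhost", port = 54321)
model <- h2o.loadModel("hdfs://namenode:8020/model_folder/DRF_model_ID")

# Read the rows Hive streams in on stdin (tab-separated, no header)
rows <- read.table(file("stdin"), sep = "\t", header = FALSE)

# Score and write predictions back to stdout for Hive to collect
preds <- as.data.frame(h2o.predict(model, as.h2o(rows)))
write.table(preds, stdout(), sep = "\t", row.names = FALSE, col.names = FALSE)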


Tom Kraljevic

Jun 6, 2016, 10:27:57 AM
to anirban ghosh, H2O Open Source Scalable Machine Learning - h2ostream

Hi,

This question should be asked in the Hive community rather than the H2O community.

Thanks,
Tom
