I'm building a target encoder model using the Python example from the H2O documentation, and I'm trying to compute the target encodings from Java using the MOJO of this model. However, the MOJO prediction fails on categories that are present only in the test data and not in the training data, with the following error:
Exception in thread "main" java.lang.NullPointerException
at hex.genmodel.algos.targetencoder.TargetEncoderMojoModel.computeEncodings(TargetEncoderMojoModel.java:87)
at hex.genmodel.algos.targetencoder.TargetEncoderMojoModel.score0(TargetEncoderMojoModel.java:72)
at hex.genmodel.easy.EasyPredictModelWrapper.predict(EasyPredictModelWrapper.java:889)
at hex.genmodel.easy.EasyPredictModelWrapper.transformWithTargetEncoding(EasyPredictModelWrapper.java:618)
at main.main(main.java:26)
After digging into the target encoder MOJO, I found that categories present only in the test data are listed in `domains.txt`, so the target encoder doesn't treat them as missing categories. However, the target encodings for these categories are missing from `encoding_map.ini`, so the model throws a `NullPointerException` when it tries to look up their encodings in `encoding_map.ini`.
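To illustrate what I think is happening, here is a minimal, self-contained sketch of the lookup pattern. It is not H2O's actual code: the class `UnseenCategorySketch`, the `DOMAIN` list, the `ENCODING_MAP` contents, and the `PRIOR_MEAN` fallback are all hypothetical stand-ins I made up for `domains.txt` and `encoding_map.ini`. It only shows how a level that is known to the domain but has no encoding entry can lead to a `NullPointerException`, and how a null check with a fallback would avoid it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UnseenCategorySketch {
    // Hypothetical stand-in for domains.txt: levels known for one column.
    // "c" is listed here even though it never appeared in training.
    static final List<String> DOMAIN = Arrays.asList("a", "b", "c");

    // Hypothetical stand-in for encoding_map.ini:
    // level index -> {sum of target, row count} seen in training.
    static final Map<Integer, double[]> ENCODING_MAP = new HashMap<>();
    static {
        ENCODING_MAP.put(0, new double[] {3.0, 10.0});
        ENCODING_MAP.put(1, new double[] {6.0, 8.0});
        // No entry for index 2 ("c"): it was only seen in test data.
    }

    // Assumed fallback value (for example, the overall target mean).
    static final double PRIOR_MEAN = 0.5;

    // Mirrors the failure: the level resolves to a valid index,
    // but the encoding lookup returns null and is dereferenced.
    static double encodeUnguarded(String level) {
        int idx = DOMAIN.indexOf(level);
        double[] numDen = ENCODING_MAP.get(idx);
        return numDen[0] / numDen[1]; // throws NullPointerException for "c"
    }

    // Guarded variant: fall back to the prior mean when the level is
    // unknown to the domain or has no encoding entry.
    static double encodeGuarded(String level) {
        int idx = DOMAIN.indexOf(level);
        double[] numDen = ENCODING_MAP.get(idx);
        if (idx < 0 || numDen == null) {
            return PRIOR_MEAN;
        }
        return numDen[0] / numDen[1];
    }

    public static void main(String[] args) {
        System.out.println("a -> " + encodeGuarded("a"));
        System.out.println("c -> " + encodeGuarded("c"));
        try {
            encodeUnguarded("c");
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup of c threw NullPointerException");
        }
    }
}
```

In this sketch the guarded lookup treats "in the domain but without an encoding" the same way as a completely unknown level, which is the behaviour I expected from the MOJO.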
The code to train the model and predict the encodings can be found here: https://stackoverflow.com/questions/63850930/h2o-target-encoder-mojo-model-throwing-null-pointer-exception
I'm using H2O version 3.30.0.7 on macOS. Am I doing something wrong, or is there a bug in the H2O target encoder MOJO?