Hi Pratyush,
This looks like another low-level technical issue, which should have
gone straight to the GitHub issue tracker!
>
> I downloaded the source for jpmml-evaluator-spark-1.3.0
> and did a clean install on my machine and then copied the
> shaded jar, jpmml-evaluator-spark-runtime-1.3.0_2.11.jar to my cluster.
>
Why did you decide to try shading?
According to my notes, the classpath conflict that must be resolved
via shading only affects Apache Spark versions 2.0, 2.1 and 2.2:
https://github.com/jpmml/jpmml-sparkml/blob/1.5.14/README.md#installation
Here, the fix version is stated as 2.3.0:
https://issues.apache.org/jira/browse/SPARK-15526
Your Apache Spark version 2.4.8 should be safe in all regards, so no
shading is necessary.
> sudo spark-shell --jars jpmml-evaluator-spark-runtime-1.3.0_2.11.jar
>
Did you get your application running with my pre-packaged
JPMML-Evaluator-Spark 1.3.0 version?
$ spark-shell --packages org.jpmml:jpmml-evaluator-spark:1.3.0 --jars your-app.jar
I wonder if this technical error persists with the default (non-shaded) library.
>
> val pmmlTransformerBuilder = new TransformerBuilder(evaluator).withLabelCol("label").exploded(true)
> val pmmlTransformer = pmmlTransformerBuilder.build()
>
> Everytime, I seem to hit the following runtime exception now:
>
> scala> val pmmlTransformer = pmmlTransformerBuilder.build()
> java.lang.NullPointerException
> at scala.runtime.ScalaRunTime$.replStringOf(ScalaRunTime.scala:346)
>
This exception is raised by the Spark/Scala REPL environment, and
doesn't seem to refer to any org.jpmml.* namespace classes.
I'm not a Scala expert, so I can't advise in that area.
However, for debugging purposes, you could try the following:
1) Print out the value of the 'pmmlTransformerBuilder' variable. Does
it exist (i.e. is it non-null)? If it does, then it means that the issue
happens inside the TransformerBuilder#build() method.
2) Reading the source code of the TransformerBuilder#build() method,
I can see two execution pathways in there - one for the
exploded=true option (more complex), and another one for the
exploded=false option (simpler). Right now you're using exploded=true.
Does your code complete when you use exploded=false?
3) If this issue is about exploded=true, then it seems to me that you
need to configure TransformerBuilder#withOutputCols() and/or
TransformerBuilder#withProbabilityCols() options.
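The second and third checks above could be scripted in spark-shell
roughly as follows. This is only a sketch: the helper name checkBuild
and its parameters are made up for illustration, and it assumes that you
pass in your existing 'evaluator' value. The helper deliberately returns
Unit and prints only the class name of the result, because
ScalaRunTime.replStringOf is the routine that the REPL uses for
rendering a result value. If build() completes here, then the exception
would appear to come from displaying the transformer, not from building
it.

```scala
import org.jpmml.evaluator.Evaluator
import org.jpmml.evaluator.spark.TransformerBuilder
import scala.util.{Failure, Success, Try}

// Hypothetical helper, not part of JPMML-Evaluator-Spark.
// Builds a transformer with the given options and reports the outcome
// without asking the REPL to render the transformer object itself.
def checkBuild(evaluator: Evaluator, exploded: Boolean, outputCols: Boolean = false): Unit = {
  val base = new TransformerBuilder(evaluator)
    .withLabelCol("label")
    .exploded(exploded)
  // Check 3: optionally add the output columns mentioned above
  val builder = if (outputCols) base.withOutputCols() else base
  val options = s"exploded=$exploded, outputCols=$outputCols"
  Try(builder.build()) match {
    case Success(transformer) =>
      // Print the class name only, never the transformer's own string form
      val className = Option(transformer).map(_.getClass.getName).getOrElse("null")
      println(s"$options: build() returned $className")
    case Failure(error) =>
      println(s"$options: build() failed")
      error.printStackTrace()
  }
}

// Assumed usage, with 'evaluator' taken from your existing session:
//   checkBuild(evaluator, exploded = false)                     // check 2
//   checkBuild(evaluator, exploded = true)                      // your current setup
//   checkBuild(evaluator, exploded = true, outputCols = true)   // check 3
```

If all three calls report a class name, the next thing to look at would
be what happens when the transformer is printed.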
VR