Hi Ron,
> Are there any performance benchmarks that have been
> done with respect to Tensorflow Serving vs Tensorflow-converted
> PMML models?
>
I'm only testing the correctness of converted models:
https://github.com/jpmml/jpmml-tensorflow/tree/master/src/test/java/org/jpmml/tensorflow
The correctness check asserts that TensorFlow predictions and PMML
predictions are equivalent. The goal is to reach absolute equivalence,
which means that TensorFlow and PMML predictions will not differ by
more than 1 ULP.
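For illustration, a 1 ULP tolerance check can be sketched in plain
Java using Math.ulp. The class and method names below are my own
invention for this example, not part of the JPMML test suite:

```java
public class UlpCheck {

    // True if the two values differ by at most 1 unit in the last place
    static boolean withinOneUlp(double expected, double actual) {
        return Math.abs(expected - actual) <= Math.ulp(expected);
    }

    public static void main(String[] args) {
        double expected = 0.1 + 0.2; // 0.30000000000000004
        // The double literal 0.3 sits exactly 1 ULP below (0.1 + 0.2)
        System.out.println(withinOneUlp(expected, 0.3));       // true
        System.out.println(withinOneUlp(expected, 0.3000001)); // false
    }
}
```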
I always prioritize correctness over (initial) performance. A correct
thing can be made faster easily. The opposite - making a
fast-performing thing more correct - is much more difficult.
The evaluation of neural network models was heavily refactored in
JPMML-Evaluator version 1.3.7:
https://github.com/jpmml/jpmml-evaluator/commit/7eba189b1e5b8c3b7f3ebd4c9129ea8876af19a5
This refactoring gives JPMML-Evaluator the ability to evaluate models
using different math contexts (see
http://mantis.dmg.org/view.php?id=179). For example, R, Scikit-Learn
and Apache Spark ML export NN models that assume 64-bit FP math
context, whereas TensorFlow exports NN models that assume 32-bit FP
math context.
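As a quick illustration of why the math context matters, here is a
self-contained sketch (my own example, not JPMML code) that evaluates
the same tiny dot product - the core operation of a NN layer - in
32-bit and in 64-bit FP, and shows that the results diverge:

```java
public class MathContextDemo {

    public static void main(String[] args) {
        // Weights as exported by a 32-bit FP framework such as TensorFlow
        float[] weights = {0.1f, 0.2f, 0.3f};
        float input = 0.7f;

        // 32-bit math context: every multiply and add is rounded to float
        float sum32 = 0f;
        // 64-bit math context: same starting values, but double arithmetic
        double sum64 = 0d;
        for (float w : weights) {
            sum32 += w * input;
            sum64 += (double) w * (double) input;
        }

        System.out.println(sum32);
        System.out.println(sum64);
        // Rounding errors accumulate differently, so the results differ
        System.out.println((double) sum32 == sum64);
    }
}
```

This is why a PMML consumer must honor the math context that the
producer assumed; re-evaluating TensorFlow weights in 64-bit FP would
break the 1 ULP equivalence goal.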
To answer your question, I converted the DNNRegressionAuto model
(https://github.com/jpmml/jpmml-tensorflow/tree/master/src/test/resources/savedmodel/DNNRegressionAuto)
to the PMML data format, and profiled it using the
org.jpmml.evaluator.EvaluationExample command-line example
application:
$ cd jpmml-tensorflow
$ java -jar target/converter-executable-1.0-SNAPSHOT.jar \
  --tf-savedmodel-input src/test/resources/savedmodel/DNNRegressionAuto/ \
  --pmml-output Auto.pmml
$ java -cp ../jpmml-evaluator/pmml-evaluator-example/target/example-1.3-SNAPSHOT.jar \
  org.jpmml.evaluator.EvaluationExample \
  --input src/test/resources/csv/Auto.csv --model Auto.pmml \
  --output /dev/null --loop 10000
This is the profiling summary on my ancient laptop:
<stdout>
main
count = 10000
mean rate = 343,65 calls/second
1-minute rate = 292,76 calls/second
5-minute rate = 271,78 calls/second
15-minute rate = 267,71 calls/second
min = 2,56 milliseconds
max = 124,31 milliseconds
mean = 2,91 milliseconds
stddev = 1,79 milliseconds
median = 2,63 milliseconds
75% <= 2,71 milliseconds
95% <= 3,72 milliseconds
98% <= 4,88 milliseconds
99% <= 8,26 milliseconds
99.9% <= 21,34 milliseconds
</stdout>
It shows that on average a single call takes 2.91 milliseconds to
score all 392 data records in Auto.csv, which translates to ~135'000
data records per second. By moving over to more modern desktop
hardware, and running four threads instead of one, one could easily
get to 1 million scores per second.
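The arithmetic behind those figures, as a back-of-the-envelope sketch
(the thread count and the per-core speedup are assumptions, not
measurements):

```java
public class ThroughputEstimate {

    public static void main(String[] args) {
        int recordsPerCall = 392;     // rows in Auto.csv
        double meanCallMillis = 2.91; // mean call time from the profile

        double recordsPerSecond = recordsPerCall / (meanCallMillis / 1000.0);
        System.out.printf("single thread: %.0f records/second%n", recordsPerSecond);

        // Assumed: 4 threads, and roughly 2x faster per-core desktop hardware
        double estimated = recordsPerSecond * 4 * 2;
        System.out.printf("estimated:     %.0f records/second%n", estimated);
    }
}
```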
The selling point of PMML has never been raw performance. The selling
point is having a standalone 1 MB JAR file, which can score pretty
much any predictive model out there.
VR