OpenScoring single scoring time

Mehdi L'kotbi

Nov 28, 2019, 7:53:32 AM
to Java PMML API
I have a question about evaluation time for single predictions. On the project's GitHub page, it is mentioned that response time is less than 1ms.

I am using the Python API to communicate with the Openscoring server, with an Isolation Forest PMML model (1000 trees), and the response time is 4-5 ms for single predictions. I need the scoring time to be at most 1 ms (I'm working on an event-driven application, where the scoring task must complete within 1 ms).

My question is: does the 1ms benchmark mentioned on the GitHub page apply to single predictions? If so, do you have any suggestions on how I can achieve a 1ms scoring time?

Thank you.

L'kotbi Mehdi
Data scientist

Villu Ruusmann

Nov 28, 2019, 9:52:58 AM
to Java PMML API
Hi Mehdi,

>
> it is mentioned that response time is less than 1ms.
>

This is a rough estimate that should help you assess whether the
Openscoring REST web service is a good candidate for your application
scenario or not.

For example, most application scenarios will be happy with 10-30 ms
latency. Yours appears to be somewhat more demanding, but should
still fit within Openscoring's domain of capabilities.

The "response time" has the following components:
1) Raw model execution time - this is determined by the performance of
the JPMML-Evaluator library.
2) HTTP layer overhead - this includes JSON message parse/format, and
HTTP web server overhead.
3) Network transport time.
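Component (2) can be measured in isolation. For example, a quick stdlib-only timing of JSON message encode/decode (the field names below are made-up stand-ins for a real model schema):

```python
# Isolating component (2): how much time goes into JSON message
# parse/format alone. The field names are invented for illustration;
# substitute your model's actual input schema.
import json
import time

arguments = {"x%d" % i: float(i) for i in range(25)}  # ~25 input fields
request = {"id": "request-1", "arguments": arguments}

start = time.perf_counter()
for _ in range(1000):
    decoded = json.loads(json.dumps(request))
elapsed_us = (time.perf_counter() - start) / 1000 * 1e6
print("JSON round-trip: %.1f microseconds per message" % elapsed_us)
```

For message sizes typical of a 20-30 field schema, this component is normally a small fraction of a millisecond.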

The statement "response time is less than 1 ms" should hold true if:
1) The overall complexity of a model is not extreme. Say, if the size
of your PMML file does not exceed 2-3 MB.
2) The model schema is not extreme. Say, it has no more than 20-30
input fields (this affects the parsing/formatting of
EvaluationRequest and EvaluationResponse objects).
3) The Openscoring REST web service is deployed "locally". If you host
your application and Openscoring REST web service in different
instances, then the network transport time between the two may easily
exceed 20-30 ms.
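Points (1) and (2) can be sanity-checked with a small script. The inline PMML document below is a toy stand-in; point `pmml_bytes` at the contents of your real file:

```python
# Quick sanity check of PMML file size (point 1) and number of data
# fields (point 2). The inline document is a minimal toy example, not
# a real Isolation Forest model.
import xml.etree.ElementTree as ET

pmml_bytes = b"""<?xml version="1.0" encoding="UTF-8"?>
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary numberOfFields="3">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
    <DataField name="anomalyScore" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>"""

ns = {"pmml": "http://www.dmg.org/PMML-4_3"}
root = ET.fromstring(pmml_bytes)
data_fields = root.findall("./pmml:DataDictionary/pmml:DataField", ns)

size_mb = len(pmml_bytes) / (1024 * 1024)
print("size: %.3f MB, data fields: %d" % (size_mb, len(data_fields)))
```

Note that the DataDictionary also lists target/output fields, so the count is an upper bound on the number of input fields.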

> I am using the python API to communicate with the
> OpenScoring server, with an Isolation Forest PMML
> model (with 1000 trees),

The complexity of your model is definitely above average. This is
roughly equivalent to making 500-1000 evaluations using a decision
tree model.

Things you might try (I'm assuming you're using Scikit-Learn for
training this Isolation Forest model):
1) Tweak model hyperparameters (n_estimators, max_depth) so that the
model would become smaller.
2) Use the best Sklearn-to-PMML conversion tool. The SkLearn2PMML
package applies decision tree compaction and flattening by default,
which are essential here. If you're using some fake conversion tool,
then your PMML files may easily be two-three times bigger (and this
brings performance down two-three times).
3) Make sure that you're using the latest Openscoring version. You
should probably build one from the HEAD revision of the 2.0
development branch manually (there hasn't been a 2.0.0 release yet).
It's important to use the latest JPMML-Evaluator library version,
because it performs much better (significant advances in
the area of pre-parsing class model objects).
4) Transpile your model from dummy XML to smart Java bytecode
representation (see https://github.com/jpmml/jpmml-transpiler-service).

Transpilation (suggestion #4) generally improves Scikit-Learn model
performance five times or more (see
https://github.com/jpmml/jpmml-transpiler-service#benchmarking).
However, this is a very brute force approach, and you should try out
more intelligent approaches (suggestions #1 -- 3) first.
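To see why suggestion #1 matters even with a fixed HTTP overhead, here is a rough linear model of the response time. The millisecond split between raw execution and overhead is an assumption made up for illustration, not a measurement:

```python
# Back-of-the-envelope estimate of how tree count affects response time.
# The 3.0 ms / 1.5 ms split between raw model execution and HTTP+JSON
# overhead is an assumption -- measure your own deployment before
# trusting these numbers.
def estimated_response_ms(n_trees, exec_ms_per_1000_trees=3.0, overhead_ms=1.5):
    # Assumes raw execution time scales linearly with the number of trees,
    # while the HTTP layer overhead stays constant.
    return overhead_ms + exec_ms_per_1000_trees * (n_trees / 1000.0)

for n_trees in (1000, 500, 250, 100):
    print("%4d trees -> ~%.2f ms" % (n_trees, estimated_response_ms(n_trees)))
```

The takeaway: shrinking the forest attacks only the model-execution component, so past a certain point the fixed overhead dominates and transpilation or a local deployment become the more effective levers.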


VR

Mehdi L'kotbi

Nov 28, 2019, 10:00:16 AM
to Java PMML API
OK, thanks for your suggestions and your quick response.

Julien Rouar

Dec 4, 2019, 9:09:07 AM
to Java PMML API
Hello Villu,

I work with Mehdi on the same problem, and we managed to do better than your nice Python API by using the urllib library (2.3 ms => 1.6 ms).

I don't know if this is a particular case, or if urllib could be better in general than your internal use of the requests library.

Thanks a lot for all your help and clues.
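For anyone wanting to reproduce the comparison, here is a minimal stdlib-only latency harness. The echo server stands in for a real Openscoring instance, and the endpoint path and payload fields are made up; point `url` at an actual server to get meaningful numbers:

```python
# Minimal HTTP round-trip latency harness using only the standard
# library (urllib, as in Julien's measurement). The local echo server
# is a stand-in for Openscoring; the payload fields are invented.
import json
import statistics
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)  # echo the request body back

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/model/iforest" % server.server_port
payload = json.dumps({"arguments": {"x1": 1.0, "x2": 2.0}}).encode("utf-8")

timings = []
for _ in range(50):
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    timings.append((time.perf_counter() - start) * 1000.0)

server.shutdown()
print("median round-trip: %.2f ms" % statistics.median(timings))
```

Swapping urllib for requests in the timed loop makes the per-call client overhead of the two libraries directly comparable.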