Hi Donato,
>
> 1. the PMML namespace version, I'd like to get this:
>
The PMML namespace URI identifies the PMML schema version that the
conversion tool (in this case, the SkLearn2PMML/JPMML-SkLearn software
stack) adheres to.
The JPMML-SkLearn library (the 1.6.X development branch) is currently
adhering to PMML 4.4. If you forcibly change the PMML namespace URI
from 4.4 to 4.2 (or some other 4.X or 3.X PMML namespace URI), then
you risk invalidating your PMML document.
To change the PMML version, you'd need to use a proper PMML translation
tool, one that updates the PMML markup properly, or informs you if that
can't be done.
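If you just want to check which schema version a given PMML document
declares, a minimal sketch along these lines should do. It uses only the
Python standard library; the helper name pmml_namespace is made up for
this example and is not part of SkLearn2PMML.

```python
import xml.etree.ElementTree as ET

def pmml_namespace(pmml_text):
    """Return the namespace URI declared on the root element, or None."""
    root = ET.fromstring(pmml_text)
    # ElementTree stores qualified tags as "{namespace-uri}localname"
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None

doc = '<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4"><Header/></PMML>'
print(pmml_namespace(doc))
```

This only reads the declaration; it does not validate the document
against that schema version.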
> 2. the generated model tag lacks of the modelName attribute:
>
This is an optional attribute. It cannot be populated automatically
with a sensible value, so setting it manually after the conversion
seems like the right thing to do.
However, it would be nice if the PMMLPipeline class provided a means of
setting it in Python. I've just opened a new GitHub issue about it
here:
https://github.com/jpmml/sklearn2pmml/issues/234
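Until then, setting the attribute manually can be sketched with the
Python standard library as follows. The function name set_model_name is
hypothetical, and the sketch assumes that the first child of the PMML
root element that is not one of the non-model elements (Header,
MiningBuildTask, DataDictionary, TransformationDictionary, Extension)
is the model element.

```python
import xml.etree.ElementTree as ET

# Top-level children of <PMML> that are not model elements
NON_MODEL_ELEMENTS = {"Header", "MiningBuildTask", "DataDictionary",
                      "TransformationDictionary", "Extension"}

def set_model_name(pmml_text, model_name):
    """Set the modelName attribute on the first model element (assumed layout)."""
    root = ET.fromstring(pmml_text)
    if root.tag.startswith("{"):
        # Keep the PMML namespace as the default namespace on output.
        # Note: register_namespace changes module-wide state.
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    for child in root:
        local_name = child.tag.split("}", 1)[-1]
        if local_name not in NON_MODEL_ELEMENTS:
            child.set("modelName", model_name)
            break
    return ET.tostring(root, encoding="unicode")

doc = ('<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">'
       '<Header/><DataDictionary/>'
       '<RegressionModel functionName="classification"/></PMML>')
updated = set_model_name(doc, "MyModel")
```

The namespace URI is read from the document itself, so the markup is
otherwise left at whatever PMML version the converter produced.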
> 3. the output field name contains parenthesis that breaks my client (DMN model):
> <OutputField name="probability(false)" optype="continuous" dataType="double" feature="probability" value="false"/>
>
Your client (DMN model) is breaking for no good reason. According to
the PMML specification, PMML field names may contain any character,
including punctuation characters such as parentheses.
The JPMML family of conversion tools uses a convention where field
names are formatted similarly to Java/Python/R function invocations. For
example, the field name "probability(false)" should be interpreted as
"invoke the probability function with the 'false' argument". This
convention makes it easy to generate field names of arbitrary
complexity, which can be parsed/reformatted later on.
Some other conversion tools (e.g. the legacy 'pmml' R package) would
format this field as "probability_false". This style doesn't scale at
all; compare "final_business_decision(probability(false))" vs
"final_business_decision_probability_false" for
readability/parseability.
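To illustrate the "parsed/reformatted later on" part, here is a rough
sketch of a parser for such names. It is not the JPMML implementation;
it assumes a simple grammar of names, parentheses and comma-separated
arguments, and the function names parse_field_name and flatten are made
up for this example.

```python
def parse_field_name(name):
    """Parse 'f(g(x))' into ('f', [('g', ['x'])]); plain names stay strings."""
    pos = 0

    def parse_expr():
        nonlocal pos
        start = pos
        # Read up to the next structural character
        while pos < len(name) and name[pos] not in "(),":
            pos += 1
        ident = name[start:pos].strip()
        if pos == len(name) or name[pos] != "(":
            return ident  # plain name or literal argument
        pos += 1  # consume "("
        args = []
        while True:
            args.append(parse_expr())
            if pos == len(name):
                raise ValueError("unbalanced parentheses in %r" % name)
            if name[pos] == ",":
                pos += 1  # consume "," and read the next argument
                continue
            if name[pos] != ")":
                raise ValueError("unexpected character in %r" % name)
            pos += 1  # consume ")"
            return (ident, args)

    result = parse_expr()
    if pos != len(name):
        raise ValueError("unexpected trailing text in %r" % name)
    return result

def flatten(parsed):
    """Reformat a parsed name into the underscore style."""
    if isinstance(parsed, str):
        return parsed
    func, args = parsed
    return "_".join([func] + [flatten(arg) for arg in args])

parsed = parse_field_name("final_business_decision(probability(false))")
print(parsed)           # ('final_business_decision', [('probability', ['false'])])
print(flatten(parsed))  # final_business_decision_probability_false
```

Going in this direction is mechanical; recovering the nesting from the
underscore style is not, because the underscores are ambiguous.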
TLDR: You'd need to update your PMML client application to support
more recent PMML schema versions (PMML 4.2 is 6+ years old) and field
naming conventions. I won't be downgrading/dumbing down the JPMML
software stack.
Villu