Hi Daria,
> I am working on a combination of two PMMLs
> (the first one is a classifier and the second is a regressor).
>
Are you doing the combination work manually, or have you developed
some kind of utility for that?
The SkLearn2PMML package provides the
sklearn2pmml.ensemble.EstimatorChain meta-estimator, which should be
able to automate much of it.
Assuming you have your pre-fitted classifier and regressor objects available:
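```python
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.ensemble import EstimatorChain

# Pre-fitted Scikit-Learn estimators
classifier = ...
regressor = ...

# Each step is a (name, estimator, predicate) tuple; str(True) is an always-true predicate
ensemble = EstimatorChain([
    ("first", classifier, str(True)),
    ("second", regressor, str(True))
], multioutput = True)

sklearn2pmml(ensemble, "EstimatorChain.pmml")
```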
It would be awesome to have some kind of "PMML document wrapper"
meta-estimator available, which would allow the reuse of existing PMML
files (in place of live Scikit-Learn estimator objects).
> I've created a "frame" with DataDictionary, MiningSchema,
> Output, and Segmentation, where the segments are both PMMLs.
>
If you managed to load your combined PMML document using
JPMML-Evaluator(-Python), and no InvalidMarkupException or
MissingMarkupException was raised, then the document structure should
be fine so far.
> In the output, I've copied the outputs of the classifier
> as it is the first PMML, but while I'm getting the results
> of the regressor (the second PMML), the results of
> the classifier are returning as null.
Just to double-check, have you tried exchanging the places of your
elementary models (now: classifier first and regressor second; then:
regressor first and classifier second)?
Does the results pattern swap as well, so that the regressor results
disappear and the classifier results show up?
> This is the Output section of the main frame (the copied output).
>
The PMML specification is rather vague about the canonical placement
of target and output field elements in (multi-layer) ensemble models.
If you want to use the OutputField@segmentId mechanism for making a
model reference, then I would place such OutputField elements at the
topmost level (the Output element of the outermost MiningModel). The
idea is to ensure that all the referenced models are in "scope".
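For illustration, here is a minimal sketch of what such a top-level Output element could look like, generated with Python's standard xml.etree.ElementTree module. The field names ("classifier_label", "regressor_value") and the segment ids ("first", "second") are assumptions made up for this example; the segmentId values must match the Segment@id values in your own document, and the feature, optype and dataType attributes may need adjusting to your models.

```python
import xml.etree.ElementTree as ET

# Hypothetical segment ids; they must match the Segment@id values in your document
CLASSIFIER_SEGMENT_ID = "first"
REGRESSOR_SEGMENT_ID = "second"

def make_top_level_output():
    """Build an Output element whose fields reference segments by id."""
    output = ET.Element("Output")
    # Hypothetical field exposing the classifier's predicted label
    ET.SubElement(output, "OutputField", {
        "name": "classifier_label",
        "optype": "categorical",
        "dataType": "string",
        "feature": "predictedValue",
        "segmentId": CLASSIFIER_SEGMENT_ID,
    })
    # Hypothetical field exposing the regressor's predicted value
    ET.SubElement(output, "OutputField", {
        "name": "regressor_value",
        "optype": "continuous",
        "dataType": "double",
        "feature": "predictedValue",
        "segmentId": REGRESSOR_SEGMENT_ID,
    })
    return output

output_xml = ET.tostring(make_top_level_output(), encoding="unicode")
print(output_xml)
```

The printed fragment is not namespace-qualified; it is meant to be pasted into the top-level MiningModel by hand, next to the existing MiningSchema and Segmentation elements.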
> I'm wondering what I am doing wrong.
>
You should check the declared type of your model ensemble (i.e. the
value of the /PMML/MiningModel/Segmentation@multipleModelMethod
attribute).
Is it "modelChain" currently? This type represents "evaluate segments,
and return the results of the LAST SEGMENT whose associated predicate
was True". By definition, you are restricted to the results of one
model here.
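If you would rather check this programmatically than by eye, the following sketch lists the multipleModelMethod value of every Segmentation element in a PMML document, using only the Python standard library. The helper names (local_name, segmentation_methods) and the EXAMPLE stub are made up for this illustration; the stub is not a complete PMML document. Elements are matched by local name, so the PMML schema version in the namespace URI should not matter.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]

def segmentation_methods(pmml_text):
    """Return the multipleModelMethod value of each Segmentation element, in document order."""
    root = ET.fromstring(pmml_text)
    return [
        elem.get("multipleModelMethod")
        for elem in root.iter()
        if local_name(elem.tag) == "Segmentation"
    ]

# Hypothetical stub standing in for your combined document
EXAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <MiningModel functionName="regression">
    <Segmentation multipleModelMethod="modelChain"/>
  </MiningModel>
</PMML>"""

methods = segmentation_methods(EXAMPLE)
print(methods)
```

For a real file, read its contents into a string first and pass that string to segmentation_methods.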
If you want to return multiple results, then you should change this
value from "modelChain" to "x-multiModelChain" (please note the "x-"
prefix!). This type represents "evaluate segments, and return the
results of EVERY SEGMENT whose associated predicate was True".
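The attribute can of course be changed in a text editor. If you prefer to script it, here is a rough sketch using only the Python standard library. It assumes a PMML 4.4 namespace and touches only the Segmentation element directly under the top-level MiningModel; the function name use_multi_model_chain and the EXAMPLE stub are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the PMML 4.4 namespace; adjust to your schema version
PMML_NS = "http://www.dmg.org/PMML-4_4"

def use_multi_model_chain(pmml_text):
    """Switch the top-level Segmentation from "modelChain" to "x-multiModelChain"."""
    # Keep PMML as the default namespace when serializing (no "ns0:" prefixes)
    ET.register_namespace("", PMML_NS)
    root = ET.fromstring(pmml_text)
    segmentation = root.find(f"{{{PMML_NS}}}MiningModel/{{{PMML_NS}}}Segmentation")
    if segmentation is None:
        raise ValueError("No /PMML/MiningModel/Segmentation element found")
    if segmentation.get("multipleModelMethod") == "modelChain":
        segmentation.set("multipleModelMethod", "x-multiModelChain")
    return ET.tostring(root, encoding="unicode")

# Hypothetical stub standing in for your combined document
EXAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <MiningModel functionName="regression">
    <Segmentation multipleModelMethod="modelChain"/>
  </MiningModel>
</PMML>"""

updated = use_multi_model_chain(EXAMPLE)
print(updated)
```

Any nested Segmentation elements are left as they are, so inner ensembles keep their original behaviour.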
VR