I'm trying to use a custom function inside a PMMLPipeline via scikit-learn's FunctionTransformer preprocessor, as follows:
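import numpy as np

def equality_column(X):
    # 1 where the first two columns are equal, 0 otherwise
    equality_col = X[:, 0] == X[:, 1]
    equality_col = equality_col.astype(int)
    # Append the indicator as a new trailing column
    X = np.append(X, equality_col[:, np.newaxis], axis=1)
    return X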
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import FunctionTransformer
from sklearn_pandas import DataFrameMapper
from sklearn2pmml import PMMLPipeline
from sklearn2pmml import sklearn2pmml

iris_pipeline = PMMLPipeline([
    ("mapper", DataFrameMapper([
        (['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width'], FunctionTransformer(equality_column)),
    ])),
    ("classifier", LogisticRegression())
])
iris_pipeline.fit(iris_df[iris_df.columns.difference(["Species"])], iris_df["Species"])
sklearn2pmml(iris_pipeline, "LogisticRegressionIris.pmml", with_repr=True)
The transformer should take in a dataset, check whether the first and second columns are equal, and append a new column that is 1 where they are equal and 0 otherwise. The sklearn2pmml call fails, although everything up to that point seems to work, and I can even run iris_pipeline.predict(iris_df). I have seen this issue on GitHub: https://github.com/jpmml/sklearn2pmml/issues/11. Is it still true that sklearn2pmml only supports a limited list of ufuncs? If so, do you have any plans to expand that in the future?
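To make the intended behaviour concrete, here is a minimal sketch of the function applied on its own to a small NumPy array. The sample values are made up for illustration and are not taken from the iris data; the variable names `sample` and `result` are also just for this example.

```python
import numpy as np

def equality_column(X):
    # Compare the first two columns element-wise and cast the booleans to 0/1.
    equality_col = (X[:, 0] == X[:, 1]).astype(int)
    # Append the indicator as a new trailing column.
    return np.append(X, equality_col[:, np.newaxis], axis=1)

# Made-up sample: the first row has equal values in columns 0 and 1, the second does not.
sample = np.array([[5.1, 5.1, 1.4],
                   [4.9, 3.0, 1.4]])
result = equality_column(sample)
print(result)
```

The result has one more column than the input, and the last column holds the equality flag for each row.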
Thanks,
Andrew