(Just found your Repo - and it is AWESOME, thank you all!)
Question is this:
Can you use a multi-class classifier on one or more features as a preprocessing/feature-engineering step that feeds into a stacked classifier?
Use-case:
You have 10 features to use in a multi-class classification problem. One of those features is text; the others are categorical, numerical, and time.
The 9 non-text features get put through typical pipeline steps, similar to the sklearn "Column Transformer with Mixed Types" example:
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])
categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])
The text feature is "preprocessed" (engineered) by passing it through a text pipeline that contains a domain-specific trained vector model. Its output, a 100-dimensional vector/array, is passed into a multi-class classifier that outputs the class probabilities ("predict_proba").
These probabilities would then be combined with the features from the above preprocessor before being passed into a classifier/stacking classifier.
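from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

text_features = ['domain text']
text_transformer = Pipeline(steps=[
    # TextVectorizer is my own custom class (domain-specific vector model)
    ('text_vectors', TextVectorizer()),
    # the step in question: a classifier used as an intermediate step
    # (params='awesome' is just a placeholder for real hyperparameters)
    ('predict_prob', DecisionTreeClassifier(params='awesome'))])
preprocessor = ColumnTransformer(
    transformers=[
        ('text', text_transformer, text_features),
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])
clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression(solver='lbfgs'))])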
The real question is:
Can you fit and then predict as part of a transformation on a subset of your data in a stacked ensemble?
In several different ways, I have used sklearn mixins (BaseEstimator, TransformerMixin, ClassifierMixin) to create custom classes to do this but I have failed miserably.
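To make the idea concrete, here is a minimal sketch of what such a wrapper could look like. It is only an illustration under assumptions: `ProbaTransformer` is a hypothetical name (not an sklearn class), the random numeric columns stand in for the text vectors that my vectorizer would produce, and the tree settings are arbitrary. The wrapper fits a classifier in `fit` and returns its `predict_proba` output from `transform`, so it can sit inside a `ColumnTransformer` like any other transformer.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


class ProbaTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical wrapper: exposes a classifier's predict_proba as transform."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        # Fit a fresh copy so the estimator passed in is left untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        # One output column per class seen during fit.
        return self.estimator_.predict_proba(X)


# Synthetic stand-in data (assumption): columns 0-2 play the role of the
# text vectors, columns 3-4 play the role of the numeric features.
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 5))
y = np.arange(60) % 3  # three classes, 20 samples each

preprocessor = ColumnTransformer(transformers=[
    ('text', ProbaTransformer(DecisionTreeClassifier(max_depth=3, random_state=0)),
     [0, 1, 2]),
    ('num', StandardScaler(), [3, 4])])

# 3 probability columns followed by 2 scaled numeric columns.
Xt = preprocessor.fit_transform(X, y)

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression(solver='lbfgs'))])
clf.fit(X, y)
proba = clf.predict_proba(X[:5])
```

One caveat with this sketch: during `fit`, the probabilities passed to the final classifier come from the same rows the inner tree was trained on, so they can look better than they would on unseen data. `sklearn.ensemble.StackingClassifier` avoids this by feeding its final estimator cross-validated predictions from the base estimators, so it may be the better fit if that matters.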
Will this actually work? Has anyone ever seen anything like this or am I just dreaming up crazy things?
Any insight or thoughts would be appreciated.
Thanks!
Jonathan