Can I use a multiple-input model with sklearn (KerasClassifier)?

JinHwan Hwang

Jul 15, 2016, 8:39:18 AM

to Keras-users

I want to train a multiple-input Keras model using sklearn's cross-validation utilities, but it seems that sklearn doesn't support multiple inputs, so I would like to know whether there is a way around this. I found that cross-validation can be done manually, but cross-validation is not the only sklearn feature I want to use. Is it possible to use multiple-input Keras models with sklearn?

Thank you for your time and help.

These are the inputs:

for sheet in tower_dataset:
    print sheet.shape

(126L, 3L)
(126L, 45L)
(126L, 148L)
(126L, 148L)
(126L, 148L)
(126L, 148L)
(126L, 148L)
(126L, 148L)
(126L, 148L)
(126L, 148L)
(126L, 100L)
(126L, 296L)
(126L, 296L)
(126L, 176L)
(126L, 31L)
(126L, 5L)

What I tried to do is train the Keras model with sklearn cross-validation:

epochs = 100
n_folds = 10
model = KerasClassifier(build_fn=create_model, nb_epoch=150, batch_size=10)
skf = StratifiedKFold(y=label, n_folds=n_folds, shuffle=True, random_state=rand_seed)


kfold = StratifiedKFold(y=label, n_folds=n_folds, shuffle=True, random_state=rand_seed)


results = cross_val_score(model, tower_dataset, label, cv=kfold)
print(results.mean())

ValueError                                Traceback (most recent call last)
<ipython-input-219-924ce2fda183> in <module>()
      6 kfold = StratifiedKFold(y=label, n_folds=n_folds, shuffle=True, random_state=rand_seed)
      7 
----> 8 results = cross_val_score(model, tower_dataset, label, cv=kfold)
      9 print(results.mean())
     10 

C:\Users\user\Anaconda2\lib\site-packages\sklearn\cross_validation.pyc in cross_val_score(estimator, X, y, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch)
   1420         Array of scores of the estimator for each run of the cross validation.
   1421     """
-> 1422     X, y = indexable(X, y)
   1423 
   1424     cv = check_cv(cv, X, y, classifier=is_classifier(estimator))

C:\Users\user\Anaconda2\lib\site-packages\sklearn\utils\validation.pyc in indexable(*iterables)
    199         else:
    200             result.append(np.array(X))
--> 201     check_consistent_length(*result)
    202     return result
    203 

C:\Users\user\Anaconda2\lib\site-packages\sklearn\utils\validation.pyc in check_consistent_length(*arrays)
    174     if len(uniques) > 1:
    175         raise ValueError("Found arrays with inconsistent numbers of samples: "
--> 176                          "%s" % str(uniques))
    177 
    178 

ValueError: Found arrays with inconsistent numbers of samples: [ 16 126]

eric....@ensam.eu

Feb 9, 2017, 10:16:54 AM

to Keras-users

You can try defining a class like this:

from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import check_array, FLOAT_DTYPES


class MultiInputSplitter(BaseEstimator, TransformerMixin):
    """Split a feature matrix column-wise for multi-input models.

    Generates a list of n sub-matrices from a single feature matrix.

    Parameters
    ----------
    sizes : tuple
        The number of features (columns) in each sub-matrix, in order.

    Examples
    --------
    >>> X = np.arange(6).reshape(3, 2)
    >>> X
    array([[0, 1],
           [2, 3],
           [4, 5]])
    >>> splitter = MultiInputSplitter(sizes=(1, 1))
    >>> [part.tolist() for part in splitter.fit_transform(X)]
    [[[0.0], [2.0], [4.0]], [[1.0], [3.0], [5.0]]]

    Notes
    -----
    The sizes must sum to the total number of features in the initial matrix.
    """
    def __init__(self, sizes=(1, 1)):
        self.sizes = sizes

    def fit(self, X, y=None):
        """Check that the sub-matrix sizes add up to the number of features in X."""
        n_samples, n_features = check_array(X).shape

        self.n_input_features_ = n_features
        self.n_input_samples_ = n_samples
        self.n_sub_matrix = len(self.sizes)
        if sum(self.sizes) != n_features:
            raise ValueError("X shape does not match sub matrix sizes")
        return self

    def transform(self, X, y=None):
        """Split the initial matrix into sub-matrices.

        Parameters
        ----------
        X : array-like, shape [n_samples, n_features]
            The data to split into consecutive blocks of columns.

        Returns
        -------
        XP : list of np.ndarray
            One array of shape [n_samples, size] for each entry of ``sizes``.
        """
        X = check_array(X, dtype=FLOAT_DTYPES)
        n_samples, n_features = X.shape

        if n_features != self.n_input_features_:
            raise ValueError("X shape does not match training shape")

        # Slice consecutive column blocks, one per entry in sizes.
        XP = []
        init = 0
        for val in self.sizes:
            XP.append(X[:, init:init + val])
            init += val

        return XP

and use it with a pipeline:

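from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline

# multi_input_model, verboseout, seed, X and Y are assumed to be defined elsewhere.
estimators = []
estimators.append(('splitter', MultiInputSplitter(sizes=(3,3))))
estimators.append(('mim', KerasRegressor(build_fn=multi_input_model, nb_epoch=50, batch_size=5, verbose=verboseout)))
pipeline = Pipeline(estimators)
# random_state only takes effect when shuffle=True.
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, Y, cv=kfold)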
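
For the data in the question, the inputs would first have to be merged into one matrix. The traceback shows sklearn wrapping the list of 16 arrays with np.array and then comparing its length (16) with the 126 labels, which is where "[ 16 126]" comes from. Here is a minimal sketch of the stacking step, not tested against the original data. tower_dataset below is a random stand-in that only reproduces the shapes printed in the question; the rest is plain NumPy.

```python
import numpy as np

# Hypothetical stand-in for tower_dataset: 16 arrays with 126 rows each,
# using the column counts printed in the question.
widths = [3, 45] + [148] * 8 + [100, 296, 296, 176, 31, 5]
rng = np.random.RandomState(0)
tower_dataset = [rng.rand(126, w) for w in widths]

# Stack the inputs side by side so sklearn sees one (n_samples, n_features) matrix.
X = np.hstack(tower_dataset)

# Column counts per input, in order; this is the tuple the splitter's sizes expects.
sizes = tuple(sheet.shape[1] for sheet in tower_dataset)

# Sanity check: cutting X at the cumulative offsets recovers the original inputs.
offsets = np.cumsum(sizes)[:-1]
parts = np.split(X, offsets, axis=1)

print(X.shape)     # (126, 2136)
print(len(parts))  # 16
```

X and MultiInputSplitter(sizes=sizes) would then take the place of X and sizes=(3,3) in the pipeline, assuming the model's build function accepts 16 inputs in the same order.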
