SFS with CV


Arun

Oct 29, 2020, 9:40:01 AM
to mlxtend

Hi Sebastian,

I had a question regarding the implementation of the SFS() method under the cross-validation setting in mlxtend. 

The choice of features, and the order in which they are added, may differ from one training fold to another (based on the Introduction to Statistical Learning book).

Say there are 4 features (a, b, c, d) and 3 CV folds; then, using Forward Subset Selection:

Fold 1 (as CV), Folds 2, 3 (as train) may choose features: a, ab, abc, abcd

Fold 2 (as CV), Folds 1, 3 (as train) may choose features: b, bc, bcd, bcda

Fold 3 (as CV), Folds 1, 2 (as train) may choose features: c, ca, cab, cabd

However, the final result from the SFS() method reports just a single chosen sequence of features and the corresponding CV scores.

For example: b, bc, bcd, bcda, with their corresponding CV scores for each fold, as below:

b: CV scores for {fold1, fold2, fold3}

bc: CV scores for {fold1, fold2, fold3}

bcd: CV scores for {fold1, fold2, fold3}

bcda: CV scores for {fold1, fold2, fold3}

Could you please comment more on the CV-based implementation of SFS(): how is a single sequence of features chosen, and how is the CV score calculated for each feature subset?

Thanks,

Arun

Sebastian Raschka

Oct 29, 2020, 9:06:53 PM
to Arun, mlxtend
Hi Arun,

Does Introduction to Statistical Learning discuss sequential feature selection?

So, the SFS implementation is based on the original paper. While the performance in each round for each feature subset is based on testing the classifier on a holdout (validation) dataset, the SFS in mlxtend offers the option to use k-fold cross validation for that.

Regarding your example, if there are a total of 4 features in the dataset (a, b, c, d), and you are in the 3rd round of SFS with the features (a, c), it will

1) do k-fold cv with the classifier and feature set a, b, c
2) do k-fold cv with the classifier and feature set a, c, d

and then compare the k-fold average performances of 1) and 2) to decide which feature to pick as the third feature.
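In rough Python terms, that round could be sketched like this (a toy sketch with scikit-learn, not the actual mlxtend code; the dataset and classifier are placeholders):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# toy data: 4 features standing in for a, b, c, d (column indices 0..3)
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
clf = LogisticRegression(max_iter=1000)

current = [0, 2]     # features (a, c) already selected in earlier rounds
candidates = [1, 3]  # b and d are still available

avg_scores = {}
for f in candidates:
    subset = current + [f]
    # k-fold CV of the classifier restricted to this candidate subset
    scores = cross_val_score(clf, X[:, subset], y, cv=3)
    avg_scores[f] = scores.mean()

# the candidate with the best average CV score becomes the third feature
best = max(avg_scores, key=avg_scores.get)
```

Note that the same candidate subset is evaluated across all folds, so there is a single feature sequence; only the scores vary per fold.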

The method you describe

> Say if there are 4 features(a,b,c,d) and 3 cv folds, and on using Forward Subset Selection

sounds different. My guess is that "Forward Subset Selection" is just a different method from "Sequential Forward Selection".

This is actually interesting: "Forward Subset Selection" (I haven't heard of it before) might be worth implementing for MLxtend.

Best,
Sebastian

Arun

Nov 2, 2020, 11:49:43 AM
to mlxtend
Hi Sebastian,
Thank you for the quick response and the answers.
Now I’m able to understand the SFS() implementation in mlxtend clearly.

My answers to the details requested:
> does Introduction to Statistical Learning discuss sequential feature selection?
Statistical Learning discusses two more methods alongside Best Subset Selection, namely Forward Stepwise Selection and Backward Stepwise Selection. These two also choose features in a stepwise manner; I'm not sure whether they could also be termed sequential feature selection methods?

Sorry for the term I used in my earlier email; I had incorrectly written "Forward Stepwise Selection" as "Forward Subset Selection".

The implementation suggested in Statistical Learning is:
a. Among the competing models of the same subset size, choose the model that gives the best accuracy on the training dataset.
b. Finally, among the per-size best models (chosen by training error), pick the best subset size based on the validation error.

The algorithm snippet from the ISLR textbook for Forward Stepwise Selection is attached below:
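For concreteness, here is a rough sketch of that two-stage procedure in Python (my own toy code with made-up data and helper names, not from ISLR or mlxtend):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# toy data, split once into a training and a validation set
X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000)

selected, best_per_size = [], {}
remaining = list(range(X.shape[1]))
for size in range(1, X.shape[1] + 1):
    # a. pick the next feature by TRAINING accuracy only
    train_scores = {f: clf.fit(X_tr[:, selected + [f]], y_tr)
                          .score(X_tr[:, selected + [f]], y_tr)
                    for f in remaining}
    f_best = max(train_scores, key=train_scores.get)
    selected.append(f_best)
    remaining.remove(f_best)
    # b. record the VALIDATION score of the best model of this size
    clf.fit(X_tr[:, selected], y_tr)
    best_per_size[size] = (tuple(selected),
                           clf.score(X_val[:, selected], y_val))

# finally, choose the subset size with the best validation score
best_size = max(best_per_size, key=lambda s: best_per_size[s][1])
```

So the held-out data only enters in step b, when comparing across subset sizes.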


However, the SFS() implementation in mlxtend would choose both the best feature subset and the best subset size using CV (i.e., validation error).
Which of these two approaches (SFS or Forward Stepwise Selection) would you recommend for feature selection?

Thanks,
Arun

Arun

Nov 2, 2020, 11:56:24 AM
to mlxtend

Adding the Forward Stepwise Selection algorithm snippet.

ForwardStepwiseSelection.png 