Worse SVM results when standardizing data

aure

Apr 18, 2012, 7:20:37 AM

to MLcomp

Dear all,

I have been testing an SVM for binary classification on an EEG data
set. We use feature vectors with 240 components, all of which already
lie in the interval [0, 1]. With them I obtain an AUC of approximately
80% (with a reasonable-looking ROC curve).

When I standardize the data (mean 0 and variance 1), the performance
drops to an AUC of approximately 50% (with a ROC curve more or less
equal to that of random classification). I find this effect of
standardization very strange, since all the texts advise standardizing
the data to improve performance.

Any clues? Has anybody observed similar effects?

Thanks for your help.

Kind regards
Aureli

Percy Liang

Apr 21, 2012, 10:27:35 AM

to mlc...@googlegroups.com

Standardizing features is only a rough rule of thumb, appropriate when
you believe each feature should play a roughly equal role in the
classification. When your features have an intrinsic scale that
suggests otherwise, uniform standardization can hurt because that
information is lost. If the dataset is uploaded to MLcomp, please send
a link to it and maybe someone can have a look.

Cheers,

-Percy
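
The point about intrinsic scale can be illustrated with a minimal
sketch on synthetic data. This is not the EEG data from this thread;
the feature counts, ranges, and names below are all made up for
illustration. One feature separates the classes and spreads widely
inside [0, 1], while the others are narrow-band noise. Before
standardization the informative feature carries almost all of the
variance (and therefore almost all of the expected squared Euclidean
distance between samples, which is what a distance-based kernel such
as RBF sees). After standardization every feature has variance 1, so
the noise features are given the same weight as the informative one.

```python
import random
import statistics

random.seed(0)

N_SAMPLES = 200
N_NOISE = 20  # hypothetical number of low-variance noise features


def make_row(label):
    # One informative feature with a wide spread inside [0, 1] ...
    informative = 0.25 + 0.5 * label + random.uniform(-0.2, 0.2)
    # ... plus noise features confined to a narrow band around 0.5.
    noise = [0.5 + random.uniform(-0.01, 0.01) for _ in range(N_NOISE)]
    return [informative] + noise


labels = [i % 2 for i in range(N_SAMPLES)]
X = [make_row(y) for y in labels]


def standardize(rows):
    # Per-feature z-scoring: subtract the mean, divide by the std. dev.
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


def informative_share(rows):
    # Fraction of the total variance carried by the first feature.
    # The expected squared distance between two random samples is
    # twice the summed per-feature variance, so this is also that
    # feature's share of the expected squared Euclidean distance.
    variances = [statistics.pvariance(c) for c in zip(*rows)]
    return variances[0] / sum(variances)


share_raw = informative_share(X)
share_std = informative_share(standardize(X))
print(f"raw: {share_raw:.3f}, standardized: {share_std:.3f}")
```

Whether this is what happens with the EEG features is only a guess.
Comparing the per-feature variances before and after scaling on the
real data would be a quick way to check.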
