
perfcurve with multiclass classification problems


Ali Arslan

Feb 8, 2011, 2:02:04 PM
I am trying to assess classifier performance on a multiclass problem with perfcurve.

Let's say I have 20 predicted labels drawn from 4 classes: 'a', 'b', 'c', 'd'.
I noticed that perfcurve allows the predicted labels to be a cell array of strings, but it demands that the test labels be a numeric vector (so I enumerate them as 'a' = 1, 'b' = 2, and so on). Otherwise it throws an error.

What I don't understand is how perfcurve knows which number corresponds to the positive class whose classification performance we want to measure.

For example:
perfcurve(predicted, testSetLabels, 'a' )
How does perfcurve know that 'a' corresponds to the number 1 in the array testSetLabels?

I actually replaced testSetLabels with the array [1:20] to see if it would give me an error, and it didn't?! I even got an accuracy higher than the actual performance, which was extremely weird.

Can someone tell me the correct way to use perfcurve for multiclass problems?

Ilya_Narsky

Feb 8, 2011, 3:37:33 PM

"Ali Arslan" <ali_a...@brown.edu> wrote in message
news:iis3vc$mvc$1...@fred.mathworks.com...

Quoting from perfcurve help:

[X,Y] = perfcurve(LABELS,SCORES,POSCLASS) computes a ROC curve for a
vector of classifier predictions SCORES given true class labels,
LABELS. The labels can be a numeric vector, logical vector, character
matrix, cell array of strings or categorical vector (see help for
groupingvariable). SCORES is a numeric vector of scores returned by a
classifier for some data.

It sounds like:

- You are confusing predicted labels with the true labels for your test data.

- You are passing discrete labels instead of the continuous scores predicted by your
classifier.
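To make the argument order concrete, here is a minimal sketch of a call that matches the help text quoted above: true labels first, then one numeric score per observation for the chosen positive class. The labels and score values are made up for illustration, and the variable names (labels, scoresA) are hypothetical.

```matlab
% True class labels for 8 test observations (made-up data).
labels = {'a';'a';'b';'c';'a';'d';'b';'a'};

% Continuous score the classifier assigned to class 'a' for each
% observation (made-up values; in practice these would come from the
% classifier, e.g. a posterior probability for class 'a').
scoresA = [0.9; 0.8; 0.3; 0.2; 0.6; 0.1; 0.7; 0.4];

% First argument: true labels. Second: scores. Third: positive class.
% Because the labels are strings, the positive class is given as the
% string 'a'; no mapping to numbers is needed.
[X, Y, T, AUC] = perfcurve(labels, scoresA, 'a');

% Plot the ROC curve for class 'a'.
plot(X, Y);
xlabel('False positive rate');
ylabel('True positive rate');
```

Here every observation whose true label is not 'a' is treated as negative, so the curve describes how well the scores separate 'a' from the other three classes.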

Take a look at the glmfit example at the bottom of perfcurve help.

perfcurve always separates observations into two groups, positive and
negative. You must choose one class as the positive class and provide the SCORES
predicted for that class. You can either group the rest of the classes into
the negative class or choose one of them to be the negative class.
Take a look at the 'NegClass' option.
-Ilya
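For a problem with four classes, the advice above could be applied along these lines. This is only a sketch: the score matrix S is made up, and it assumes the classifier returns one column of scores per class in the order given by classNames (both names are hypothetical).

```matlab
% True labels and a made-up score matrix: one row per observation,
% one column per class, in the order listed in classNames.
labels     = {'a';'a';'b';'c';'a';'d';'b';'a'};
classNames = {'a','b','c','d'};
S = [0.7 0.1 0.1 0.1
     0.6 0.2 0.1 0.1
     0.2 0.5 0.2 0.1
     0.1 0.2 0.6 0.1
     0.4 0.3 0.2 0.1
     0.1 0.1 0.2 0.6
     0.5 0.3 0.1 0.1
     0.3 0.4 0.2 0.1];

% One-vs-rest: each class in turn is the positive class, and all other
% classes are grouped into the negative class (the default behavior).
aucRest = zeros(1, numel(classNames));
for k = 1:numel(classNames)
    [~, ~, ~, aucRest(k)] = perfcurve(labels, S(:,k), classNames{k});
    fprintf('AUC for %s vs. rest: %.3f\n', classNames{k}, aucRest(k));
end

% One-vs-one: 'a' is positive and only 'b' is negative. Observations
% labeled 'c' or 'd' are left out of this curve.
[~, ~, ~, aucAB] = perfcurve(labels, S(:,1), 'a', 'NegClass', {'b'});
fprintf('AUC for a vs. b: %.3f\n', aucAB);
```

Which variant is appropriate depends on the question being asked: one-vs-rest gives a per-class picture over the whole test set, while 'NegClass' restricts the comparison to a specific pair of classes.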
