Let's say I have 20 predicted labels drawn from 4 classes: 'a', 'b', 'c', 'd'.
I noticed that perfcurve allows the predicted labels to be a cell array of strings, but it requires the test labels to be a numeric vector (so I enumerate them as 'a' = 1, 'b' = 2, and so on); otherwise it throws an error.
What I don't understand is how perfcurve knows which number corresponds to the positive class whose classification performance we want to measure.
For example:
perfcurve(predicted, testSetLabels, 'a' )
How does perfcurve know that 'a' corresponds to the number 1 in the array testSetLabels?
I actually replaced testSetLabels with the array 1:20 to see whether it would give me an error, and it didn't! I even got an accuracy higher than the actual performance, which was extremely weird.
Can someone tell me the correct way to use perfcurve for multiclass problems?
"Ali Arslan" <ali_a...@brown.edu> wrote in message
news:iis3vc$mvc$1...@fred.mathworks.com...
Quoting from perfcurve help:
[X,Y] = perfcurve(LABELS,SCORES,POSCLASS) computes a ROC curve for a
vector of classifier predictions SCORES given true class labels,
LABELS. The labels can be a numeric vector, logical vector, character
matrix, cell array of strings or categorical vector (see help for
groupingvariable). SCORES is a numeric vector of scores returned by a
classifier for some data.
It sounds like:
- You are confusing predicted labels with the true labels of your test data.
- You are passing discrete labels instead of the continuous scores predicted
by your classifier.
Take a look at the glmfit example at the bottom of perfcurve help.
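To make the argument order concrete, here is a minimal sketch with invented data. The names trueLabels and scoresA and all the numbers are assumptions for illustration only; in practice the scores would come from your classifier (for example, posterior probabilities for class 'a').

```matlab
% Hypothetical test set: 8 observations from 4 classes (values invented).
trueLabels = {'a';'a';'b';'c';'d';'a';'b';'c'};          % TRUE classes, not predictions
scoresA    = [0.9; 0.8; 0.3; 0.2; 0.1; 0.7; 0.4; 0.6];   % continuous score for class 'a'

% LABELS first, SCORES second, positive class third.
% Because the labels are strings, perfcurve can match 'a' directly;
% no numeric encoding of the classes is needed.
[fpr, tpr, thr, auc] = perfcurve(trueLabels, scoresA, 'a');

plot(fpr, tpr);
xlabel('False positive rate');
ylabel('True positive rate');
```

With this call, every observation whose true label is not 'a' is treated as negative.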
perfcurve always separates observations into two groups, positive and
negative. You must choose one class as the positive class and provide the
SCORES predicted for that class. You can either group the rest of the classes
into the negative class or choose one of them to be the negative class.
Take a look at the 'NegClass' option.

-Ilya
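For a multiclass problem, the advice above amounts to computing one curve per class. The following is only a sketch under stated assumptions: scores is a hypothetical N-by-4 matrix whose k-th column holds the classifier's score for classes{k}, and all values are invented.

```matlab
% Hypothetical data (invented): true labels and per-class scores.
classes    = {'a','b','c','d'};
trueLabels = {'a';'a';'b';'c';'d';'a';'b';'c'};
scores = [0.7 0.1 0.1 0.1;
          0.6 0.2 0.1 0.1;
          0.2 0.5 0.2 0.1;
          0.1 0.2 0.6 0.1;
          0.1 0.1 0.2 0.6;
          0.5 0.3 0.1 0.1;
          0.3 0.4 0.2 0.1;
          0.3 0.1 0.4 0.2];

% One-vs-rest: each class in turn is positive, all others negative
% (this is the default, 'NegClass' = 'all').
aucRest = zeros(1, numel(classes));
for k = 1:numel(classes)
    [~, ~, ~, aucRest(k)] = perfcurve(trueLabels, scores(:,k), classes{k});
end

% One-vs-one: 'a' against 'b' only; observations from 'c' and 'd'
% are discarded before the curve is computed.
[~, ~, ~, aucAB] = perfcurve(trueLabels, scores(:,1), 'a', 'NegClass', {'b'});
```

Whether one-vs-rest or one-vs-one is more appropriate depends on what you want to measure; the loop simply repeats the two-group analysis once per class.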