accuracy, sensitivity and specificity


fmcp...@gmail.com

Jul 7, 2016, 6:12:09 AM
to Cardinal MSI Help
Hi Kyle,
I don't understand what is happening in my classification workflow. Briefly:
1. I have a combined MSImageSet with ten samples (6 classA and 4 classB). This combined set includes exactly the pixels needed to fill the diagnosis data frame. The data set has also been reduced with the resample method.
2. I fill the diagnosis data frame with the command >diagnosis[coord(combinedDataset)$sample=="name of the sample"] <- "classA or classB". Repeating this step for every sample gives the completed diagnosis.
3. Then I run >diagnosis <- as.factor(diagnosis), >pData(combinedDataset)$diagnosis <- diagnosis and >summary(combinedDataset$diagnosis) to check that all pixels are assigned to classA or classB.
4. After that, I run the OPLS cross-validation: >combinedDataset.cv.opls <- cvApply(combinedDataset, .y = combinedDataset$diagnosis, .fun = "OPLS", ncomp = 1:10, keep.Xnew = FALSE)
5. But when I plot the result I get an accuracy of almost 90% and no sensitivity or specificity values. What does NaN mean?

$`ncomp = 1`
               classA    classB
Accuracy    0.8964384 0.8964384
Sensitivity       NaN       NaN
Specificity       NaN       NaN
FDR         0.6000000 0.4000000

6. However, when I fit the classifier directly with myClassifier <- OPLS(combinedDataset, y=combinedDataset$diagnosis, ncomp=1), it shows
> summary(myClassifier)
$`ncomp = 1`
                classA     classB
Accuracy    0.94571437 0.94571437
Sensitivity 0.92411549 0.95852223
Specificity 0.95852223 0.92411549
FDR         0.07036492 0.04484068

What am I doing wrong?
Thanks in advance,
Paco

kbemis

Jul 10, 2016, 9:44:43 PM
to Cardinal MSI Help
Hi Paco,

If I understand correctly, your experiment has one class per sample. By default, each sample is a fold for cross-validation. That means when predicting for that fold, all of the pixels are either class A or B, with none of the other class. This results in dividing by 0 when calculating the sensitivity and specificity for that fold, which is why they are NaN (Not a Number).
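
As a rough illustration of that division by zero, here is a minimal base R sketch for a single held-out fold that contains only classA pixels. The five pixels and their predictions are made up for the example, and the formulas are the textbook definitions rather than Cardinal's internal code.

```r
# Hypothetical held-out fold: every pixel truly belongs to classA.
truth <- factor(rep("classA", 5), levels = c("classA", "classB"))
pred  <- factor(c("classA", "classA", "classB", "classA", "classA"),
                levels = c("classA", "classB"))

# Accuracy only needs matching labels, so it is still defined.
accuracy <- mean(pred == truth)

# Sensitivity for classB = TP / (TP + FN), counted over true classB pixels.
tp <- sum(pred == "classB" & truth == "classB")  # no true classB pixels in this fold
fn <- sum(pred == "classA" & truth == "classB")  # likewise zero
sensitivity_B <- tp / (tp + fn)                  # 0 / 0

print(accuracy)
print(sensitivity_B)
```

The same thing happens to the specificity for classA in this fold, since it is also computed over the true classB pixels.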

You'll need to manually specify the folds so that all classes are represented in each fold. The simplest way for you to do this would be to use 2 folds with 3 class A samples and 2 class B samples each. Use the 'sample' variable to create a new 'fold' variable specifying which pixels belong to which fold, and pass this variable to the cvApply ".fold" parameter.
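
A minimal sketch of building such a fold variable in base R is below. The sample names (A1..A6, B1..B4), the three pixels per sample, and the particular assignment of samples to folds are all placeholders; in the real data the 'sample' and 'diagnosis' vectors would come from the pData of the combined data set. The cvApply call is left commented out because it needs Cardinal and the actual data, and the way ".fold" is looked up there is an assumption to check against the cvApply help page.

```r
# Hypothetical stand-ins for pData(combinedDataset)$sample and $diagnosis.
sample_names <- c(paste0("A", 1:6), paste0("B", 1:4))
sample_class <- c(rep("classA", 6), rep("classB", 4))
pixels_per_sample <- 3
sample    <- factor(rep(sample_names, each = pixels_per_sample),
                    levels = sample_names)
diagnosis <- factor(rep(sample_class, each = pixels_per_sample))

# Assign whole samples to folds: 3 classA and 2 classB samples in each fold.
fold_of_sample <- c(A1 = 1, A2 = 1, A3 = 1, A4 = 2, A5 = 2, A6 = 2,
                    B1 = 1, B2 = 1, B3 = 2, B4 = 2)

# Look up the fold of every pixel through its sample name.
fold <- factor(unname(fold_of_sample[as.character(sample)]))

# Check that both classes appear in every fold before cross-validating.
fold_table <- table(fold, diagnosis)
print(fold_table)

# With Cardinal loaded and the real data, the call might then look like:
# pData(combinedDataset)$fold <- fold
# combinedDataset.cv.opls <- cvApply(combinedDataset,
#     .y = combinedDataset$diagnosis, .fun = "OPLS", .fold = fold,
#     ncomp = 1:10, keep.Xnew = FALSE)
```

Keeping each sample entirely inside one fold means pixels from the same sample never end up in both the training and the test set.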

-Kylie

fm...@usal.es

Jul 15, 2016, 9:46:08 AM
to Cardinal MSI Help
Thank you Kyle. That was the problem. I have been reading a little about cross-validation methods and I now understand the concept.
Thank you again and best of luck with your thesis defence.
Paco