Confidence Interval


Isaac Llorente

Jun 19, 2020, 11:26:55 AM
to TADPOLE
Hi,

I have a question related to the concept of Confidence Interval. Is it:

1) Referring to the "stability" of the model prediction? Which perhaps could be estimated with the distribution of predictions from the different folds in a cross-validation fashion (?) (and assuming they are sampled from a normal distribution)

2) Referring to the actual chance of the real value being inside the interval? In which case I would look at my model predictions' errors in the validation set to estimate the error over my model's predictions.

Isaac

Razvan Marinescu

Jun 19, 2020, 11:52:56 AM
to TADPOLE
Hi Isaac,

That is a very interesting question! I should start by saying that the confidence interval is used for computing some of the performance measures (WES, CPA). That is, the confidence interval should be constructed such that 50% of the true values in the (D4) test set lie in that region. Whether you estimate that from cross-validation folds on the training set or from a validation set is up to you, and is an assumption your method would make. Note that you might not have to assume Gaussianity to compute the intervals (e.g. Ventricle volumes are always positive, so a Gaussian noise model is not ideal there, as it assigns probability to negative values).
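As a rough illustration of the validation-set route without a Gaussian assumption, here is a minimal sketch that takes empirical quantiles of the validation residuals and adds them to new predictions. The function names (`residual_interval`, `empirical_coverage`) and the additive-residual assumption are mine for illustration only; they are not part of the TADPOLE evaluation code.

```python
import numpy as np

def residual_interval(val_true, val_pred, new_pred, coverage=0.5):
    """Interval around new predictions from empirical validation residuals.

    Assumption (hypothetical setup): errors on new data behave like the
    validation residuals and can simply be added to the point prediction.
    """
    residuals = np.asarray(val_true, dtype=float) - np.asarray(val_pred, dtype=float)
    alpha = (1.0 - coverage) / 2.0
    # For coverage=0.5 these are the 25th and 75th percentiles of the residuals.
    lo_q, hi_q = np.quantile(residuals, [alpha, 1.0 - alpha])
    new_pred = np.asarray(new_pred, dtype=float)
    return new_pred + lo_q, new_pred + hi_q

def empirical_coverage(y_true, lower, upper):
    """Fraction of true values that fall inside [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))
```

Because the quantiles come straight from the residuals, a skewed error distribution gives an asymmetric interval. For a strictly positive quantity such as Ventricle volume, one option would be to apply the same idea on a log scale, or to clip the lower bound at zero.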

How exactly you "estimate" the confidence interval is a matter of statistical science. What you mentioned are two potentially good estimators of a "true" unobserved confidence interval. Some estimators have high variance and low bias, others low variance and high bias, and it is up to you to choose the right one based on your model or just intuition. You can also use more complex models for estimating the confidence interval: for example, you could have an a-priori belief that in the D4 test set there is a distribution shift as compared to D1-D2 -- maybe scans in ADNI3 are higher-resolution and that would give you more precise Ventricle Volume measurements, so you can lower the width of the confidence interval. You can encode this into your model if you want, or whatever else you think might be important.
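One possible way to combine the two ideas, sketched under my own assumptions: take the spread of predictions across cross-validation fold models as a per-subject half-width, rescale it so the interval covers 50% of a validation set, and optionally multiply by a prior factor for an expected distribution shift. All names here (`fold_half_width`, `calibrate_scale`, `scaled_interval`, `shift_factor`) are hypothetical, and the symmetric interval is a simplification.

```python
import numpy as np

def fold_half_width(fold_preds):
    """Per-subject spread of predictions across fold models.

    fold_preds: array of shape (n_folds, n_subjects). Assumes the spread is
    non-zero for every subject.
    """
    return np.asarray(fold_preds, dtype=float).std(axis=0, ddof=1)

def calibrate_scale(val_true, val_pred, half_width, coverage=0.5):
    """Scale s such that pred +/- s * half_width covers `coverage` of validation."""
    ratio = np.abs(np.asarray(val_true, dtype=float) - val_pred) / half_width
    return float(np.quantile(ratio, coverage))

def scaled_interval(pred, half_width, scale, shift_factor=1.0):
    """Symmetric interval; shift_factor < 1 narrows it, > 1 widens it.

    shift_factor encodes a prior belief about the test set (for example,
    more precise measurements); its value is a modelling choice.
    """
    w = np.asarray(half_width, dtype=float) * scale * shift_factor
    pred = np.asarray(pred, dtype=float)
    return pred - w, pred + w
```

The fold spread on its own mostly reflects how stable the model is, not how far the truth is from the prediction, which is why the sketch rescales it against validation errors before use.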

I hope this helps.

Raz


Isaac Llorente

Jun 19, 2020, 12:24:25 PM
to TADPOLE
Thank you for the clarification!

Isaac