
ROC curve - perfcurve?


Ingrid

Aug 27, 2013, 4:07:05 AM
Hello,

I would like to generate a ROC curve to be able to assess the goodness of fit of a logistic regression model I just fitted using glmfit.

I have "yfit", a column vector of length n giving the counts of items that the model predicts to be in the 'positive' state for each of the n values of a continuous explanatory variable X.
I also have "yobs", the actual counts of buildings, a vector with the same characteristics as above.

However, perfcurve requires individual cases as input. If I understand correctly, for each value Xi of my continuous variable (1<=i<=n) I would need to create a matrix of individual cases, with 'labels' being the true class labels (positive or negative), the scores or probabilities of being positive, and the posclass (i.e. positive).
This seems very long and impractical given that I already have counts.

Is there any way to work directly with the total counts, both actual and predicted, for each Xi to generate the ROC curve, with this function or another one, instead of having to decompose them into individual instances?

PS. I already tried to generate the ROC curve "by hand" using rates computed from my counts, but without success.

Torsten

Aug 27, 2013, 5:48:05 AM
"Ingrid " <ing...@tsunami2.civil.tohoku.ac.jp> wrote in message <kvhmn9$pjr$1...@newscl01ah.mathworks.com>...
Did you look at the example under
http://www.mathworks.de/de/help/stats/perfcurve.html
?

Best wishes
Torsten.

Ilya Narsky

Aug 27, 2013, 9:14:19 AM
"Ingrid " <ing...@tsunami2.civil.tohoku.ac.jp> wrote in message
news:kvhmn9$pjr$1...@newscl01ah.mathworks.com...
You do need to flatten out your arrays of values and counts into a matrix of
individual cases. Unless your dataset is really big, this shouldn't be a
problem. Something like this would work:

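%% Grouped data: x = covariate, n = trials per group, y = positives per group
x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';
% Probit fit on the grouped counts
b = glmfit(x,[y n],'binomial','link','probit');

%% Expand each group into n(m) individual 0/1 cases
M = size(x,1);
xflat = [];
yflat = [];

for m=1:M
    xflat = [xflat; repmat(x(m,:),n(m),1)];
    yflat = [yflat; zeros(n(m)-y(m),1); ones(y(m),1)];
end

% bflat should be equal to b (up to numerical tolerance)
bflat = glmfit(xflat,yflat,'binomial','link','probit');

%% Score every case with the fitted model and plot the ROC curve
p = glmval(b,xflat,'probit');
[fpr,tpr] = perfcurve(yflat,p,1);  % 1 is the positive class label
plot(fpr,tpr);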
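
If flattening is impractical for a very large dataset, the same curve can in principle be built straight from the grouped counts. The following is only a minimal sketch under stated assumptions, not something taken from the perfcurve documentation: it assumes every group has a distinct fitted probability (tied groups would have to be merged first), and the names pg, idx, pos, neg, tprg, fprg and auc are introduced here purely for illustration.

```matlab
% Grouped data as in the example above: x = covariate, n = trials, y = positives.
x = [2100 2300 2500 2700 2900 3100 3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

b  = glmfit(x,[y n],'binomial','link','probit');
pg = glmval(b,x,'probit');          % one fitted probability per group

% Assumption: all values in pg are distinct, so each group is one threshold step.
[~,idx] = sort(pg,'descend');       % sweep the threshold from high to low scores
pos = y(idx);                       % observed positives in each group
neg = n(idx) - y(idx);              % observed negatives in each group

tprg = [0; cumsum(pos)/sum(pos)];   % true positive rate after including each group
fprg = [0; cumsum(neg)/sum(neg)];   % false positive rate after including each group

auc = trapz(fprg,tprg);             % area under the curve by the trapezoid rule
plot(fprg,tprg);
```

Each group contributes one step to the curve, so the cost depends on the number of groups rather than the number of individual cases. Comparing fprg and tprg against the fpr and tpr returned by perfcurve on the flattened data is a reasonable sanity check.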

Ilya

Ingrid

Aug 28, 2013, 2:24:08 AM
"Ilya Narsky" <ina...@mathworks.com> wrote in message <kvi8nb$er$1...@newscl01ah.mathworks.com>...
This is great, thank you!