> My work is as follows. In healthy me, I will examine the serum
> concentration of some marker (let's call this variable "MarkerX"), and
> also some other marker (a predictor of somatic disease, let's call it
> "MarkerY"). For known reasons, I can't tell all the details :) But
> both variables, MarkerX and MarkerY, are continuous. So, all the men
> will be classified according to their MarkerY values as "Positive" (the
> value of MarkerY is above the normal range, big risk for disease) or
> "Negative" (the value of MarkerY is within the normal range, i.e.,
> small risk for disease). Comparisons between these two groups will be
> made (mean MarkerX concentrations, adjusting for confounders and so on)
> and logistic regression models will be created.
>
> The main aim of this study is to try to find the "cut-off" value of
> MarkerX, i.e., to find out where the threshold is at which the risk of
> having disease (which is shown by an increasing MarkerY value)
> rises significantly if the patient has a decreased MarkerX value.
> The analysis is going to be done using ROC curves; I am familiar with
> this :)
>
> But how do I calculate the sample size for this study? I found that
> special complex formulas should be used, dealing with ROC curves or
> something similar, but I don't know exactly which ones.
> Some info is at
> http://www.compass.fhcrc.org/edrnnci/files/pdf/PepeSSCalc.pdf,
> but I guess it isn't what I need.
You will get a lot of comments along the lines of "you shouldn't analyze
it this way because..." and I'd encourage you to look at those comments
carefully and with an open mind.
But to directly answer your question without the distraction of changing
your research direction 180 degrees, I would suggest this:
If the sample size formulas associated with ROC curves are too messy or
too unclear to implement, then why not establish that sensitivity and
specificity both have reasonably narrow confidence intervals? You won't
know the sensitivity and specificity until you do the research, of
course, but set up a plausible scenario (or if you're really ambitious a
range of plausible scenarios) and then calculate the widths of the
confidence intervals. If your intervals are too wide, increase your
sample size until you are happy. In the unlikely event that your
intervals are too narrow, cut back on your sample size.
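A rough sketch of that planning exercise in Python, in case it helps. Everything numeric here is an assumption made up for illustration (the plausible sensitivity and specificity, the target interval width), and the Wilson score interval is just one reasonable choice of interval for a proportion, not the only one. The helper names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p, n, conf=0.95):
    """Wilson score interval for an assumed proportion p in a group of size n."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def ci_width(p, n, conf=0.95):
    """Total width (upper minus lower limit) of the interval."""
    lo, hi = wilson_interval(p, n, conf)
    return hi - lo


def smallest_n(p, max_width, conf=0.95, n_max=100_000):
    """Smallest group size whose interval is no wider than max_width."""
    for n in range(2, n_max + 1):
        if ci_width(p, n, conf) <= max_width:
            return n
    raise ValueError("no n up to n_max gives an interval that narrow")


# Hypothetical planning scenario: these are assumptions, not data.
assumed_sensitivity = 0.80  # estimated among the "Positive" men
assumed_specificity = 0.70  # estimated among the "Negative" men
target_width = 0.20         # roughly plus or minus 0.10

# See how the interval widths shrink as each group grows.
for n in (25, 50, 100, 200):
    print(n,
          round(ci_width(assumed_sensitivity, n), 3),
          round(ci_width(assumed_specificity, n), 3))

n_pos = smallest_n(assumed_sensitivity, target_width)
n_neg = smallest_n(assumed_specificity, target_width)
print("positives needed:", n_pos, "negatives needed:", n_neg)
```

Note that sensitivity is estimated only from the "Positive" group and specificity only from the "Negative" group, so the total number of men to recruit also depends on what fraction you expect to fall in each group. Rerun the calculation for a few different plausible scenarios to see how sensitive the answer is to your guesses.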
By the way, you need data on men with disease in order to compute
sensitivity OR to establish a reasonable cutoff OR to compute an ROC
curve. Do you really plan to study ONLY healthy men? (I assume that
"healthy me" is a typo and not an indication of self-experimentation.)
If so, then you need to radically revise your statistical approach.
Steve Simon, n...@pmean.com, Standard Disclaimer.
Sign up for the Monthly Mean, the newsletter that
dares to call itself average at www.pmean.com/news
Thanks in advance.
This reference may be useful...
@article{bland2009,
author = {Bland, J. M.},
citeulike-article-id = {6128105},
citeulike-linkout-0 = {http://dx.doi.org/10.1136/bmj.b3985},
day = {6},
doi = {10.1136/bmj.b3985},
issn = {1468-5833},
journal = {BMJ},
keywords = {methodology, power, samplesize, statistics},
month = oct,
number = {oct06 3},
pages = {b3985},
posted-at = {2010-03-30 13:23:21},
priority = {0},
title = {{The tyranny of power: is there a better way to calculate
sample size?}},
url = {http://dx.doi.org/10.1136/bmj.b3985},
volume = {339},
year = {2009}
}
A version is available from Martin Bland's home page at
http://www-users.york.ac.uk/~mb55/talks/tyrpowertalk.pdf if you do not
have access to the BMJ
(http://www.bmj.com/content/339/bmj.b3985.extract).
Neil
--
“Truth in science can be defined as the working hypothesis best suited
to open the way to the next better one.” - Konrad Lorenz
Email - nshe...@gmail.com
Website - http://kimura.no-ip.org/
Photos - http://www.flickr.com/photos/slackline/