Beginner's Question re: stats and spss....

nemo121

Dec 6, 2009, 10:04:33 AM

Hi, I'm pretty clueless with stats so I'd appreciate any advice....

I want to check the P value of a change in the number of injuries
occurring over a period of years. I have the number of bed days per
year, the number of incidents, and the number of injuries. From this
I've generated the number of injuries per bed day and have shown that it
increased from roughly 1 injury per 1200 bed days in 2005 to 1 injury per
1800 bed days in 2007.

BUT how can I find out if this increase in number of bed days per injury
could be purely due to chance?

I'll list the full figures here just in case that's helpful
Year: 2005
Patient Bed Days = 96605
Number of Injuries = 78
Injuries per pt bed day = 1238.53

Year 2007
Patient Bed Days = 93977
Number of Injuries = 55
Injuries per pt bed day = 1708.67

I think that I need a non-parametric test of independent groups which seems
to mean something like the Chi Square Test or the Mann Whitney U test but
when I try that I think it mustn't be right as it generates nothing usable
( although, again, it may be that I just don't understand how to do it ).

Rich Ulrich

Dec 6, 2009, 5:56:27 PM

On Sun, 6 Dec 2009 15:04:33 -0000, "nemo121" <drne...@gmail.com>
wrote:

>Hi, I'm pretty clueless with stats so I'd appreciate any advice....
>
>I want to check the P value of a change in the number of incidents of
>injuries occurring over a period of years. I have the number of bed days per
>year and the number of incidents and have the number of injuries. From this
>I've generated the number of injuries per bed day and have shown that it
>increased from roughly 1 injury per 1200 bed days in 2005 to 1 injury per
>1800 bed days in 2007.

That would be a *decrease* in injuries.

>
>BUT how can I find out if this increase in number of bed days per injury
>could be purely due to chance?
>
>I'll list the full figures here just in case that's helpful
>Year: 2005
>Patient Bed Days = 96605
>Number of Injuries = 78
>Injuries per pt bed day = 1238.53

That label should say, "Bed days per injury."
Ditto, for Year 2007. More injuries in
2005 (78) correspond to smaller number for "days"
since base lines are not too different.
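
As a quick check of the arithmetic, here is a minimal Python sketch that computes the rate both ways from the posted figures. The function and variable names are illustrative only, not taken from SPSS or any package.

```python
# Figures as posted in the thread.
data = {
    2005: {"bed_days": 96605, "injuries": 78},
    2007: {"bed_days": 93977, "injuries": 55},
}

def bed_days_per_injury(bed_days, injuries):
    """Exposure per event: a larger value means injuries are rarer."""
    return bed_days / injuries

def injuries_per_1000_bed_days(bed_days, injuries):
    """Event rate: a smaller value means injuries are rarer."""
    return 1000 * injuries / bed_days

for year, d in data.items():
    print(year,
          round(bed_days_per_injury(**d), 2),
          round(injuries_per_1000_bed_days(**d), 3))
```

Bed days per injury goes up (1238.53 to 1708.67) while injuries per 1000 bed days goes down (0.807 to 0.585), which is the same decrease seen from two directions.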

>
>
>Year 2007
>Patient Bed Days = 93977
>Number of Injuries = 55
>Injuries per pt bed day = 1708.67
>
>I think that I need a non-parametric test of independent groups which seems
>to mean something like the Chi Square Test or the Mann Whitney U test but
>when I try that I think it mustn't be right as it generates nothing usable
>( although, again, it may be that I just don't understand how to do it ).

You are in the vicinity of a significant difference at the 5% level,
but not a whole lot better than that.

The power that you have for comparisons is conveyed
by the number of injuries, 78 vs. 55. Any statistical
comparison based on those numbers alone will contain an
assumption of "independence" which might not be met,
namely, that different people suffered the injuries - within
each year, and (less damaging as an assumption) between
years.

The best style of testing has to start with individuals who
have incidents/injuries. If you can give verbal assurances
that weird things have not happened with the population,
you can probably present data based on these numbers.

For instance, you *do* want to state that
no individual accounted for more than 2 (say) injuries.
- if you can't say that, you might want to give another
count that compares "individuals injured" instead of "injuries."

The "Bed Days" is similarly problematic in being a compounded
number of People times "Days per person" - if those are highly
variable, it is especially something to worry about.

What would probably be intelligible and most likely to be
acceptable for publication from these numbers is to
compare 55 to 78 as being allocated in the proportion to
the number of days exposed -- 49.3% vs. 50.7%.
That is practically the same test as "Are these two equal?"

I hope that SPSS gives the chance to define exact proportions
for a test in "non-par tests". If not, you can probably
find such a test online.
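
For anyone without such a test at hand, the allocation idea above can be sketched in plain Python. This is only an illustration under the independence assumption already discussed. The helper name binom_two_sided_p is made up here, and the two-sided rule it uses (sum all outcomes no more likely than the observed one) is one common convention among several.

```python
from math import comb

def binom_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))

days_2005, days_2007 = 96605, 93977
inj_2005, inj_2007 = 78, 55

# Share of total exposure that falls in 2005 (about 50.7%).
share_2005 = days_2005 / (days_2005 + days_2007)

# Under "no change in rate", each of the 133 injuries lands in 2005
# with probability share_2005.
p_value = binom_two_sided_p(inj_2005, inj_2005 + inj_2007, share_2005)
print(f"expected share 2005: {share_2005:.3f}, p = {p_value:.3f}")
```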

--
Rich Ulrich

nemo121

Dec 6, 2009, 6:54:49 PM

You are correct as regards the terminology, thanks, I was so frustrated with
the stats that I wasn't precise with my terminology.

At present some individuals did account for more than one injury but as
those injuries were sustained in separate incidents on widely different days
I felt that it was appropriate to include them in the total listing of
incidents. Later on I would intend to go into the number of individuals who
had multiple incidents incurring injury and what that means for risk
management etc.

What I'm still not clear on though is just how I could generate a P value
for the figures I do have. Surely there must be some way to do this? Or am I
wrong?

" compare 55 to 78 as being allocated in the proportion to the number of
days exposed "

I'm a little unclear on what you mean here. What I think you are saying is
that I should adjust the decrease in injuries for the number of "Patient Bed
Days"

E.g. There is a 2.73% decrease in the number of patient bed days in 2007
compared to 2005.

There is a 29.5% decrease in the number of incidents in the same two years
but adjusting for the decrease in patient bed days this represents a "true"
decrease of only 26.77%.

I've already done that but my concern is that the first thing someone will
say when reading that is, "Well that could be due to chance". Then they'll
look for a P value, see none and think that I must have calculated a P value
but found it to be greater than 0.05 and decided to conceal it. I wish I
knew enough stats to have gone that far ;-)
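
A note on the adjustment itself: subtracting one percentage from the other is only an approximation. A minimal sketch of the rate-based version, using the figures posted earlier (variable names are illustrative), would be:

```python
days_2005, inj_2005 = 96605, 78
days_2007, inj_2007 = 93977, 55

# Injuries per bed day in each year.
rate_2005 = inj_2005 / days_2005
rate_2007 = inj_2007 / days_2007

drop_in_days = 1 - days_2007 / days_2005      # change in exposure
drop_in_injuries = 1 - inj_2007 / inj_2005    # change in raw count
drop_in_rate = 1 - rate_2007 / rate_2005      # change in rate

print(f"{drop_in_days:.1%} {drop_in_injuries:.1%} {drop_in_rate:.1%}")
```

On these figures the injury rate falls by about 27.5%, a little more than the 26.77% obtained by subtraction.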

Bruce Weaver

Dec 7, 2009, 1:41:48 PM

I wrote to a former colleague who works with this kind of data to ask
his advice. Here it is:

<advice>
You could advise this person to look at Appendix B1.1 in the document
at: http://www.iwh.on.ca/evaluating-safety-programs
Just click on the pdf download.

I would also recommend that the person look at the number of injuries
per bed days, not the other way round. It's clearer that there's been
a decrease in the rate that way.
</advice>

--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."

Rich Ulrich

Dec 8, 2009, 12:28:34 AM

On Sun, 6 Dec 2009 23:54:49 -0000, "nemo121" <drne...@gmail.com>
wrote:

>You are correct as regards the terminology, thanks, I was so frustrated with
>the stats that I wasn't precise with my terminology.
>
>At present some individuals did account for more than one injury but as
>those injuries were sustained in separate incidents on widely different days
>I felt that it was appropriate to include them in the total listing of
>incidents. Later on I would intend to go into the number of individuals who
>had multiple incidents incurring injury and what that means for risk
>management etc.

And if the whole difference exists because one person
had 20 injuries in the first example? ... you apologize for
jumping to conclusions?

I downloaded the document that Bruce recommended, and
it looks pretty decent when I browse it hastily. It seems to
cover a broad amount of the topic of simple injury assessment.
I don't know whether it covers this issue, but I hope it does.

>
>What I'm still not clear on though is just how I could generate a P value
>for the figures I do have. Surely there must be some way to do this? Or am I
>wrong?
>

The simple way to test is to use the chi-squared test
where one row is "days" and the other is "injuries".
Some computer programs might have trouble with the
size of the numbers for "days" but you will get very
nearly the right thing if you divide by 100 to get
numbers near 10 000.
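
For readers who want to see the computation, here is a rough sketch of that 2x2 chi-squared test in plain Python, with no continuity correction. The function name chi2_2x2 is invented for illustration, and the p-value relies on the fact that with 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p-value) with 1 df."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail is erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))

# Rows: bed days, injuries; columns: 2005, 2007.
stat, p = chi2_2x2(96605, 93977, 78, 55)
print(f"chi2 = {stat:.2f}, p = {p:.3f}")
```

With the full (unscaled) day counts this gives a statistic of roughly 3.4 and a p-value a little above 0.05.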

What I suggested before was a binomial test, using
exact probabilities based on the days.

If you can't manage either of these with some ease,
then it might be better that you don't try to produce
them and thus put yourself in the position of having
to describe them or defend them.

>
>" compare 55 to 78 as being allocated in the proportion to the number of
>days exposed "
>I'm a little unclear on what you mean here. What I think you are saying is
>that I should adjust the decrease in injuries for the number of "Patient Bed
>Days"
>
>E.g. There is a 2.73% decrease in the number of patient bed days in 2007
>compared to 2005.
>
>There is a 29.5% decrease in the number of incidents in the same two years
>but adjusting for the decrease in patient bed days this represents a "true"
>decrease of only 26.77%.
>
>I've already done that but my concern is that the first thing someone will
>say when reading that is, "Well that could be due to chance". Then they'll
>look for a P value, see none and think that I must have calculated a P value
>but found it to be greater than 0.05 and decided to conceal it. I wish I
>knew enough stats to have gone that far ;-)

I lost track of the exact numbers... using what I remembered,
I get results that don't quite reject at the 5% level.

--
Richard Ulrich

nemo121

Dec 8, 2009, 3:35:37 PM

> I downloaded the document that Bruce recommended, and
> it looks pretty decent when I browse it hastily. It seems to
> cover a broad amount of the topic of simple injury assessment.
> I don't know whether it covers this issue, but I hope it does.

Yes, it was most helpful.

> The simple way to test is to use the chi-squared test
> where one row is "days" and the other is "injuries".
> Some computer programs might have trouble with the
> size of the numbers for "days" but you will get very
> nearly the right thing if you divide by 100 to get
> numbers near 10 000.
>
> What I suggested before was a binomial test, using
> exact probabilities based on the days.
>
> If you can't manage either of these with some ease,
> then it might be better that you don't try to produce
> them and thus put yourself in the position of having
> to describe them or defend them.

Yes, I can see your point. But one of the reasons behind trying to find the
answers here is that I wish to use this project as a base from which to
become more proficient in statistics.

I managed to run a Chi Squared test yesterday and got a significance of
0.066.

>>I've already done that but my concern is that the first thing someone will
>>say when reading that is, "Well that could be due to chance". Then they'll
>>look for a P value, see none and think that I must have calculated a P value
>>but found it to be greater than 0.05 and decided to conceal it. I wish I
>>knew enough stats to have gone that far ;-)
>
> I lost track of the exact numbers... using what I remembered,
> I get results that don't quite reject at the 5% level.

Many thanks.

nemo121

Dec 8, 2009, 3:39:02 PM

Many thanks for the link. I've read it and it has proven extremely useful.

I also managed to get results from a Chi Square test ( still won't work in
SPSS for some reason ) so I think my problem is solved.

Thanks for the help.

Bruce Weaver

Dec 8, 2009, 4:03:28 PM

If you post your chi-square computation here, maybe someone can tell
you how to get SPSS to do it.