
twopop™ E-mail and Data Mapping Technologies by http://meami.org


Geordie La Forge @ http://MeAmI.org

Oct 7, 2009, 6:48:37 PM

twopop™ by http://meami.org write{s}:

On Oct 7, 2:30 pm, "http://meami.org" <scri...@aol.com> wrote:


Inference (hypothesis tests and estimation)
Differences between means, proportions in two populations

The idea: We have two populations and a variable of interest. We may be interested in the difference in mean values (on this variable) in the two populations [difference of means] or in the difference in the proportions of the populations that have a particular value on the variable [difference of proportions].


In dealing with means, we want to either

1. Estimate the difference between the mean values (this involves a confidence interval), or
2. Decide whether we have evidence of a difference between the mean values (this involves a test).

In dealing with proportions, we want to either

1. Estimate the difference between the proportions (in the two populations) that have a certain value on the variable (confidence interval), or
2. Decide whether there is a difference between the proportions (in the two populations) that have a certain value on the variable (test).

The methods parallel the methods for estimation and for tests on the mean of one population, but the calculations are different because we have different (and more complicated) distributions.


There is a special situation [the "matched samples" case] which is usually discussed with (and often confused with) inference on two populations but is really a special case of inference on one population [of differences].


Difference of means (independent samples)


The basic important fact is that our best estimator of $\mu_1 - \mu_2$ (the difference between the population means; order of subtraction matters) is the difference between sample means $\bar{x}_1 - \bar{x}_2$. The mean of the difference in sample means (as long as we keep sample sizes the same) is exactly the difference in the population means (in the same order):

$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$$

and, because the samples are independent, the variance of $\bar{x}_1 - \bar{x}_2$ is the sum of the variances of $\bar{x}_1$ and $\bar{x}_2$, so

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

In addition, if $X_1$ and $X_2$ are approximately normally distributed or if the sample sizes are large enough, then the distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal, which means

$$\frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

is a Z.


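Thus, if we happen to know $\sigma_1$ and $\sigma_2$, and $n_1, n_2$ are large enough or $X_1, X_2$ are approximately normal, our $1 - \alpha$ confidence interval for $\mu_1 - \mu_2$ is given by

$$\bar{x}_1 - \bar{x}_2 \pm E \quad\text{with}\quad E = Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$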
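As a rough illustration of the known-sigma interval, here is a minimal Python sketch. The function name `z_interval_diff_means` and all of the numbers are made up for this example and are not part of the original handout; `statistics.NormalDist` from the standard library supplies $Z_{\alpha/2}$.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_diff_means(xbar1, xbar2, sigma1, sigma2, n1, n2, conf=0.95):
    """Interval for mu1 - mu2 when sigma1 and sigma2 are known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    margin = z * sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # E
    diff = xbar1 - xbar2  # order of subtraction matters
    return diff - margin, diff + margin


# Hypothetical summary values, for illustration only.
low, high = z_interval_diff_means(52.0, 50.0, 4.0, 3.0, 64, 36, conf=0.95)
print(round(low, 3), round(high, 3))
```

With these inputs both variance terms equal 0.25, so the standard error is the square root of 0.5 and the interval lies roughly between 0.61 and 3.39.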


In the usual situation, we do not know $\sigma_1, \sigma_2$. The difference of sample means, standardized using $s_1, s_2$, involves four values that vary from case to case, and is not even really a t. It is closely approximated by a t (if $X_1, X_2$ are normal or $n_1, n_2$ are large), but to make the approximation work we have to use a strange value of degrees of freedom, so our interval for confidence $1 - \alpha$ is given by

$$\bar{x}_1 - \bar{x}_2 \pm E \quad\text{with}\quad E = t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, \qquad df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$


[This is the fractional degrees of freedom value that will be reported by your calculator or by Minitab if you use either of these for the calculation.]
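The fractional degrees of freedom and the resulting interval can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the summary statistics are invented, and the critical value `t_crit` is passed in by hand (from a t table or a calculator) because the Python standard library has no t distribution.

```python
from math import sqrt


def welch_df(s1, s2, n1, n2):
    """Fractional degrees of freedom for the two-sample t approximation."""
    v1 = s1 ** 2 / n1  # s1^2 / n1
    v2 = s2 ** 2 / n2  # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def t_interval_diff_means(xbar1, xbar2, s1, s2, n1, n2, t_crit):
    """Interval (xbar1 - xbar2) +/- E with E = t_crit * sqrt(s1^2/n1 + s2^2/n2)."""
    margin = t_crit * sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Hypothetical summary statistics, for illustration only.
df = welch_df(s1=4.0, s2=6.0, n1=16, n2=9)
# 2.179 is the usual table value of t_{0.025} for 12 degrees of freedom
# (df rounded down); software would use the fractional df directly.
low, high = t_interval_diff_means(30.0, 25.0, 4.0, 6.0, 16, 9, t_crit=2.179)
print(round(df, 2), round(low, 3), round(high, 3))
```

Here the df comes out a little above 12, which is why a table lookup at 12 degrees of freedom is a slightly conservative stand-in for the calculator value.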


Testing follows the same six-step procedure as testing on one mean, but slightly different numbers appear. There are the same three forms for the alternative; the order in which the two populations are identified will matter for one-sided tests.


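"Greater":
    $H_0: \mu_1 = \mu_2$  or  $H_0: \mu_1 - \mu_2 = 0$
    $H_a: \mu_1 > \mu_2$  or  $H_a: \mu_1 - \mu_2 > 0$
    Reject $H_0$ if sample $t > t_\alpha$

"Less":
    $H_0: \mu_1 = \mu_2$  or  $H_0: \mu_1 - \mu_2 = 0$
    $H_a: \mu_1 < \mu_2$  or  $H_a: \mu_1 - \mu_2 < 0$
    Reject $H_0$ if sample $t < -t_\alpha$

"Not equal":
    $H_0: \mu_1 = \mu_2$  or  $H_0: \mu_1 - \mu_2 = 0$
    $H_a: \mu_1 \neq \mu_2$  or  $H_a: \mu_1 - \mu_2 \neq 0$
    Reject $H_0$ if sample $t < -t_{\alpha/2}$ or sample $t > t_{\alpha/2}$

$$\text{sample } t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \quad\text{with}\quad df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$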
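The three rejection rules and the sample t can be sketched in Python as below. The helper names `welch_t` and `reject_h0` and the data are hypothetical, and the critical values are typed in from a t table rather than computed, so treat this as an illustration of the decision rule and not as a full testing routine.

```python
from math import sqrt


def welch_t(xbar1, xbar2, s1, s2, n1, n2, null_diff=0.0):
    """Return (sample t, df) for H0: mu1 - mu2 = null_diff."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    t = (xbar1 - xbar2 - null_diff) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def reject_h0(t, t_crit, alternative):
    """t_crit is t_alpha for one-sided tests, t_{alpha/2} for 'not equal'."""
    if alternative == "greater":
        return t > t_crit
    if alternative == "less":
        return t < -t_crit
    if alternative == "not equal":
        return t < -t_crit or t > t_crit
    raise ValueError("alternative must be 'greater', 'less' or 'not equal'")


# Hypothetical summary statistics, for illustration only.
t, df = welch_t(30.0, 25.0, 4.0, 6.0, 16, 9)
# Table values for 12 degrees of freedom: t_{0.05} = 1.782, t_{0.025} = 2.179.
greater = reject_h0(t, 1.782, "greater")
less = reject_h0(t, 1.782, "less")
two_sided = reject_h0(t, 2.179, "not equal")
print(round(t, 3), greater, less, two_sided)
```

Swapping the order of the two samples flips the sign of the sample t, which is exactly why the order in which the populations are identified matters for the one-sided cases.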




Difference of proportions


The basic important fact is that our best estimator of $p_1 - p_2$ (the difference between the population proportions; order of subtraction matters) is the difference between the sample proportions $\hat{p}_1 - \hat{p}_2$. The mean of the difference in sample proportions (as long as we do not change sample sizes) is exactly the difference in the population proportions (in the same order):

$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$$

and the variance of the difference of $\hat{p}_1$ and $\hat{p}_2$ is the sum of the variances, so

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}.$$

If the sample sizes are large enough for the proportions (that is, if $n_1 p_1$, $n_1 - n_1 p_1$, $n_2 p_2$, $n_2 - n_2 p_2$ are all at least 5), then the difference $\hat{p}_1 - \hat{p}_2$ will be approximately normally distributed, which means that

$$\frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}}$$

is a Z.


In working with proportions, we don't have an independent calculation of standard deviation (it depends on the proportion), so we do not get involved with t. For estimation, we have the problem that we don't know $p_1, p_2$ to put into the formula, so we make do with $\hat{p}_1, \hat{p}_2$. If our sample sizes are large enough for our proportions, our $1 - \alpha$ confidence interval for $p_1 - p_2$ is given by

$$\hat{p}_1 - \hat{p}_2 \pm E \quad\text{with}\quad E = Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
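A minimal Python sketch of this interval follows. The function name, the counts, and the choice to apply the "at least 5" check to the observed successes and failures (since the true $p_1, p_2$ are unknown) are assumptions made for this illustration, not something the handout specifies.

```python
from math import sqrt
from statistics import NormalDist


def prop_diff_interval(x1, n1, x2, n2, conf=0.95):
    """Interval for p1 - p2 from success counts x1, x2 out of n1, n2 trials."""
    # Assumption: check the rule of thumb on observed counts, not on n*p.
    if min(x1, n1 - x1, x2, n2 - x2) < 5:
        raise ValueError("sample sizes too small for the normal approximation")
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # Z_{alpha/2}
    margin = z * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Hypothetical counts, for illustration only: 60 of 100 versus 50 of 100.
low, high = prop_diff_interval(60, 100, 50, 100, conf=0.95)
print(round(low, 4), round(high, 4))
```

For these counts the standard error is 0.07, and the interval runs from slightly below 0 to about 0.24, so it does not rule out a difference of zero.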


Testing follows the same six-step procedure as testing on one mean, but slightly different numbers appear. There are the same three forms for the alternative; the order in which the two populations are identified will matter for one-sided tests. Since our null hypothesis is always "difference between $p_1$ and $p_2$ is 0" ($p_1 = p_2$), we calculate the standard error of the difference using

$$\bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$

(= total number of successes / total number of trials) in place of both $p_1$ and $p_2$. [This is referred to as the "pooled estimate of the proportion".]


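"Greater":
    $H_0: p_1 = p_2$  or  $H_0: p_1 - p_2 = 0$
    $H_a: p_1 > p_2$  or  $H_a: p_1 - p_2 > 0$
    Reject $H_0$ if sample $Z > Z_\alpha$

"Less":
    $H_0: p_1 = p_2$  or  $H_0: p_1 - p_2 = 0$
    $H_a: p_1 < p_2$  or  $H_a: p_1 - p_2 < 0$
    Reject $H_0$ if sample $Z < -Z_\alpha$

"Not equal":
    $H_0: p_1 = p_2$  or  $H_0: p_1 - p_2 = 0$
    $H_a: p_1 \neq p_2$  or  $H_a: p_1 - p_2 \neq 0$
    Reject $H_0$ if sample $Z < -Z_{\alpha/2}$ or sample $Z > Z_{\alpha/2}$

$$\text{sample } Z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\frac{\bar{p}(1 - \bar{p})}{n_1} + \frac{\bar{p}(1 - \bar{p})}{n_2}}} = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$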
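The pooled test can be sketched in Python as below, using only the standard library. The function name `two_prop_z_test`, the counts, and the significance levels are made up for illustration; the null difference is taken to be 0, as in the hypotheses above.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="not equal", alpha=0.05):
    """Return (sample Z, reject H0?) for H0: p1 - p2 = 0, using the pooled estimate."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # total successes / total trials
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    if alternative == "greater":
        return z, z > NormalDist().inv_cdf(1 - alpha)  # Z > Z_alpha
    if alternative == "less":
        return z, z < -NormalDist().inv_cdf(1 - alpha)  # Z < -Z_alpha
    if alternative == "not equal":
        return z, abs(z) > NormalDist().inv_cdf(1 - alpha / 2)  # |Z| > Z_{alpha/2}
    raise ValueError("alternative must be 'greater', 'less' or 'not equal'")


# Hypothetical counts, for illustration only: 60 of 100 versus 50 of 100.
z, reject_two_sided = two_prop_z_test(60, 100, 50, 100, "not equal", alpha=0.05)
_, reject_greater = two_prop_z_test(60, 100, 50, 100, "greater", alpha=0.10)
print(round(z, 3), reject_two_sided, reject_greater)
```

In this made-up case the sample Z is about 1.42: not enough to reject in a two-sided test at the 0.05 level, but enough for the one-sided "greater" test at the 0.10 level.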


Matched samples (paired data): means


This is the situation in which we have two sets of values (for the same variable) but each value in one set is related to a corresponding value in the other, so it makes sense to talk about the differences [individual differences, not just difference of the means]. We are interested in the mean of the differences. Examples:

Measurements of resting heart rates of people before and after an exercise program (pair is before and after on the same person)
Selling price of a selection of standard items at Wal-Mart and at Target (pair is prices at two stores on the same item)


In this situation we work directly with the differences: $d = x_1 - x_2$. [It is usually necessary to keep track of the order of subtracting: "before" minus "after" gives a different sign from "after" minus "before", and this will matter, especially for one-sided tests.]


If the variable we observe is normally distributed, or if the sample size (note $n$ = number of pairs) is large enough, then sample means of the differences will be approximately normally distributed (that is, $\frac{\bar{d} - \mu_d}{\sigma_d/\sqrt{n}}$ will be a Z and $\frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$ will be a t).

To estimate the mean difference we use:

$$\bar{d} \pm E \quad\text{with}\quad E = t_{\alpha/2}\,\frac{s_d}{\sqrt{n}}$$
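The paired interval can be sketched in Python as follows. The heart-rate numbers are invented, the function name `paired_interval` is hypothetical, and the critical value is supplied by hand from a t table with $n - 1$ degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_interval(first, second, t_crit):
    """Interval d_bar +/- t_crit * s_d / sqrt(n), with d = first - second."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of values in each set")
    d = [a - b for a, b in zip(first, second)]  # individual differences
    n = len(d)  # n = number of pairs
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical resting heart rates, for illustration only.
before = [72, 75, 68, 80, 77]
after = [70, 72, 69, 75, 74]
# 2.776 is the usual table value of t_{0.025} for n - 1 = 4 degrees of freedom.
low, high = paired_interval(before, after, t_crit=2.776)
print(round(low, 3), round(high, 3))
```

The differences here are 2, 3, -1, 5, 3 ("before" minus "after"), giving a mean difference of 2.4; reversing the order of subtraction would flip the sign of both endpoints.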


Our tests follow the same six steps and we have the same three cases:

$$\text{sample } t = \frac{\bar{d} - 0}{s_d/\sqrt{n}}, \qquad df = n - 1$$

"Greater":
    $H_0: \mu_d = 0$
    $H_a: \mu_d > 0$
    Reject $H_0$ if sample $t > t_\alpha$

"Less":
    $H_0: \mu_d = 0$
    $H_a: \mu_d < 0$
    Reject $H_0$ if sample $t < -t_\alpha$

"Not equal":
    $H_0: \mu_d = 0$
    $H_a: \mu_d \neq 0$
    Reject $H_0$ if sample $t < -t_{\alpha/2}$ or sample $t > t_{\alpha/2}$
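As a rough sketch of the paired test, the Python below computes the sample t and its degrees of freedom and compares the result with table values. The data and the helper name `paired_t` are invented for illustration, and the critical values are read from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (sample t, df) for H0: mu_d = 0, with d = first - second."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of values in each set")
    d = [a - b for a, b in zip(first, second)]
    n = len(d)  # number of pairs
    t = (mean(d) - 0) / (stdev(d) / sqrt(n))
    return t, n - 1


# Hypothetical resting heart rates, for illustration only.
before = [72, 75, 68, 80, 77]
after = [70, 72, 69, 75, 74]
t, df = paired_t(before, after)

# Table values for 4 degrees of freedom: t_{0.05} = 2.132, t_{0.025} = 2.776.
reject_greater = t > 2.132                   # Ha: mu_d > 0
reject_not_equal = t < -2.776 or t > 2.776   # Ha: mu_d != 0
print(round(t, 3), df, reject_greater, reject_not_equal)
```

With these made-up numbers the sample t is about 2.45, so the one-sided "greater" test rejects at the 0.05 level while the two-sided test does not, which shows why the form of the alternative has to be fixed before looking at the data.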




Quod Erat Demonstrandum.
P VERSUS NP. Resolved.
http://meami.org
'Support a cure for childhood cancers.'
http://alexslemonade.org
©2009 MeAmI
'Search for the People!'
All Rights Reserved in Perpetuity and apply to all intellectual
property and applications including derivatives. Commercial use of
the above algorithm is strictly prohibited without a confirmation code
and written expressed consent from the author/owner{s} of the work.
