
Bootstrapping


Stanley

Jun 26, 2007, 10:32:47 AM
Hi,

I want to be able to do bootstrapping, and I have numerous questions.

First, am I correct in saying that there is bootstrapping to find the
probability distribution of data, and also bootstrapping in regression?

Now, to do bootstrapping regression, SPSS comes with a syntax file
called oms_bootstrapping.sps.

This file works, but it contains an awful lot of syntax with which I
would rather not have to deal. Is it possible just to use menus and do
bootstrapping with regression? Also, is it possible to use menus and do
bootstrapping to find the probability distribution of data?

Finally, this fall I will be needing to do weighted bootstrapping. I am
still not entirely sure what that means, but is there a way that I can
do this?

Thank you.

Stanley

Mike

Jun 28, 2007, 12:14:50 PM
Certainly you can use menus only. For example, in Nonlinear regression, you
can use bootstrap to estimate standard errors of the parameters as well as
confidence intervals.

"Stanley" <sm...@usm.maine.edu> wrote in message
news:4681234a$0$31199$4c36...@roadrunner.com...

Stanley

Jul 2, 2007, 11:28:14 PM
Hi Mike,

Thank you for your response.

In a few days, I am going to examine your comments and see if I can
apply them. I did not mean to seem uninterested, but I suddenly became
involved in another project keeping me temporarily away from SPSS.

Thanks again.

Stanley

Stanley

Jul 5, 2007, 12:27:39 PM
Hi Mike,

Then I have these questions about bootstrapping.

Can I do bootstrapping with linear regression? Can I simply use
bootstrapping without doing regression at all -- in other words, can I
use bootstrapping to find a sample mean, for example, and its confidence
interval?

Is there some way that you could provide a small sample file (actually,
just a few data points that I would type in), and then tell me how to
proceed?

Thank you very much.

Stanley

Mike

Jul 6, 2007, 1:35:57 AM
Hi Stanley,

Please check these out:

- Bootstrap 1
- Bootstrap 2

"Stanley" <sm...@usm.maine.edu> wrote in message

news:468d1bad$0$4706$4c36...@roadrunner.com...

Mike

Jul 6, 2007, 1:36:43 AM
I meant these:
    - Bootstrap 1
    - Bootstrap 2
 
 

Mike

Jul 6, 2007, 1:37:37 AM

Brian

Jul 6, 2007, 11:20:02 AM

Stanley,

I've attached a macro below that uses the matrix method to compute the
bias corrected and accelerated mean and 95% confidence intervals for a
variable. The macro call line at the end allows the user to determine
how many resamplings to use. It's set for 1000 in the attachment, but
could be set higher. Typically, especially with the bias corrected
and accelerated mean, the lowest number of reps recommended is 1000.
This macro was written by Andrew Hayes at OSU. I changed about three
lines so the number of reps could be in the macro call line rather
than in a syntax line, so the credit clearly goes to him. There are
other examples out there.

Brian

define bootmean (vars=!charend('/')/reps=!charend('/')).
set mxloops = 10000.
count ms__=!vars (missing).
select if ms__=0.
matrix.
get dd /var=!vars.
compute n = nrow(dd).
compute mnb = make(!reps,1,0).
compute dat=dd.
compute mni=dd.

/* Here we generate the statistic of interest, MN */.
compute mn=csum(dat)/n.

/* HERE WE GENERATE THE OMITTED CASE ESTIMATES, MNI */.
compute tmp=make(n-1,1,0).
loop #n = 1 to n.
compute b = 0.
loop #m = 1 to n.
do if (#m <> #n).
compute b=b+1.
compute tmp(b,1)=dd(#m,1).
end if.
end loop.
/* Here we generate the statistic of interest */.
compute mni(#n,1)=csum(tmp)/(n-1).
end loop .


/* HERE WE GENERATE THE INFLUENCE STATISTICS, U */.
compute u=(mni-mn)*((n-1)/n).
/* HERE WE GENERATE THE ACCELERATION ESTIMATE, A */.
compute a = csum((u/n)&**3)/(6*((csum((u/n)&**2))&**(3/2))).
/* HERE WE GENERATE THE BOOTSTRAP RESAMPLE ESTIMATES, MNB */.
loop #n = 1 to !reps.
loop #m = 1 to n.
compute v=trunc(uniform(1,1)*n)+1.
compute dat(#m,1)=dd(v,1).
end loop.
compute mnb(#n,1)=csum(dat)/n.
end loop.
/* This is the place where the actual statistic of interest is computed */.
compute bmn = csum(mnb)/!reps.
/* NOW WE GENERATE THE NORMAL CORRECTIONS */.
compute #pv = mnb < mn.
compute #pv = csum(#pv)/!reps.
compute p = #pv.
compute p0=-.322232431088.
compute p1 = -1.
compute p2 = -.342242088547.
compute p3 = -.0204231210245.
compute p4 = -.0000453642210148.
compute q0 = .0993484626060.
compute q1 = .588581570495.
compute q2 = .531103462366.
compute q3 = .103537752850.
compute q4 = .0038560700634.
do if (#pv > .5).
compute p = 1-#pv.
end if.
compute y=sqrt(-2*ln(p)).
compute xp=y+((((y*p4+p3)*y+p2)*y+p1)*y+p0)/((((y*q4+q3)*y+q2)*y+q1)*y+q0).
do if (#pv <= .5).
compute xp = -xp.
end if.
compute zlo = xp-((1.96-xp))/(1+(a*(1.96-xp))).
compute zhi = xp+((xp-(-1.96)))/(1-(a*(xp-(-1.96)))).
compute plo = cdfnorm(zlo).
compute phi = cdfnorm(zhi).
compute lo = trunc(plo*(!reps+1)).
compute hi = (!reps+1)-trunc((1-phi)*(!reps+1)).
do if (lo < 1).
compute lo = 1.
end if.
do if (hi > !reps).
compute hi = !reps.
end if.
/* NOW WE SORT THE BOOTSTRAPPED ESTIMATE VECTOR */.
compute mnb = {-999;mnb}.
loop #i = 2 to !reps+1.
compute ix = mnb(#i,1).
loop #k= #i to 2 by -1.
compute k = #k.
do if (mnb(#k-1,1) > ix).
compute mnb(#k,1)=mnb(#k-1,1).
else if (mnb(#k-1,1) <= ix).
BREAK.
end if.
end loop.
compute mnb(k,1)=ix.
end loop.
compute mnb = mnb(1:!reps+1,1).
/* NOW WE FIND THE UPPER AND LOWER LIMITS */.
compute loci = mnb(lo,1).
compute hici = mnb(hi,1).
compute bw={mn, bmn, loci, hici, n}.
print/title = "BIAS CORRECTED AND ACCELERATED BOOTSTRAP MEAN ESTIMATES, 1000 RESAMPLES".
print bw/title = " "/clabels = "Sample" "Bootstrp" "Lo95%CI "
"Hi95%CI" "n"/format f9.4.

end matrix.
!END DEFINE.
bootmean vars=x/reps=1000 .
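
For anyone who finds the matrix code hard to follow, the same overall idea (jackknife for the acceleration, bootstrap means for the bias correction, then adjusted percentiles) can be sketched more compactly. This is only a sketch, and it rests on several assumptions: it is written in Python rather than SPSS syntax, it follows the standard textbook form of the BCa interval rather than translating the macro line by line, and the function name `bca_mean_interval`, the seed, and the data values are all made up for illustration.

```python
import random
import statistics
from statistics import NormalDist


def bca_mean_interval(data, reps=1000, level=0.95, seed=0):
    """Sketch of a BCa interval for the mean (hypothetical helper, textbook formulas)."""
    rng = random.Random(seed)
    n = len(data)
    mean = statistics.fmean(data)
    total = sum(data)

    # Leave-one-out (jackknife) means feed the acceleration estimate.
    jack = [(total - x) / (n - 1) for x in data]
    jack_mean = statistics.fmean(jack)
    diffs = [jack_mean - j for j in jack]
    denom = 6.0 * sum(d * d for d in diffs) ** 1.5
    accel = sum(d ** 3 for d in diffs) / denom if denom else 0.0

    # Bootstrap means: resample n cases with replacement, reps times, then sort.
    boot = sorted(statistics.fmean(rng.choices(data, k=n)) for _ in range(reps))

    # Bias correction: normal quantile of the share of bootstrap means below
    # the sample mean (clamped so the quantile stays finite).
    below = sum(b < mean for b in boot)
    prop = min(max(below / reps, 1 / (reps + 1)), reps / (reps + 1))
    z0 = NormalDist().inv_cdf(prop)

    def adjusted_value(z):
        # Shift the nominal normal quantile, then read off the sorted bootstrap means.
        shifted = z0 + (z0 + z) / (1.0 - accel * (z0 + z))
        p = NormalDist().cdf(shifted)
        idx = min(max(int(p * reps), 0), reps - 1)
        return boot[idx]

    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return mean, adjusted_value(-z), adjusted_value(z)


# Hypothetical right-skewed data, typed in by hand.
sample = [3.1, 4.7, 2.2, 8.9, 5.4, 3.8, 12.6, 4.1, 6.3, 2.9]
mean, lo, hi = bca_mean_interval(sample)
print(f"mean = {mean:.3f}, 95% BCa interval = ({lo:.3f}, {hi:.3f})")
```

Raising `reps` plays the same role as the reps value in the macro call line above.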


mcap

Jul 6, 2007, 12:30:43 PM
When will SPSS catch up to Stata and just offer a range of parameter
estimates (bootstrap, jackknife, etc.) through its dialog boxes or a
two-step syntax routine added to any other procedure?

Marc


Stanley

Jul 7, 2007, 6:38:55 PM
Hi Brian, Mike, and Marc,

Thank you all for your responses.

Marc, I certainly agree. I find SPSS, in general, a great program, but
it would be better if it offered menu options for bootstrapping. I am
surprised it doesn't.

Brian, thank you for the macro. It is so long that, since I am not a
good programmer, I am having trouble deciphering it. That is my
limitation, of course. I will try to figure it out, but it is a bit
over my head.

Mike, wow! what a fine presentation. So I get a sense of how to do
bootstrapping with nonlinear regression. But then I still have these
two questions:

1. Why do I have to do nonlinear regression to do bootstrapping?

2. Can't I just do bootstrapping to get a distribution without doing any
regression? I think that this is what Brian is showing me, but I am not
sure.

Stanley

Richard Ulrich

Jul 7, 2007, 9:57:46 PM
On Thu, 05 Jul 2007 12:27:39 -0400, Stanley <sm...@usm.maine.edu>
wrote:

> Hi Mike,
>
> Then I have these questions about bootstrapping.
>
> Can I do bootstrapping with linear regression? Can I simply use
> bootstrapping without doing regression at all -- in other words, can I
> use bootstrapping to find a sample mean, for example, and its confidence
> interval?

You need a better introduction to bootstrapping.

The "sample mean" is already efficient for estimating the population mean.
You would not want to bootstrap it; that is a waste of effort.
Bootstrapping is most often a method for finding robust estimates
of problematic variances. - I am open to other opinions on that.
So far, I have not seen a need for bootstrapping for my own data,
but I could be missing something.

Similarly, the sample rank-order statistics are far more efficient for
the confidence interval of the mean, compared to bootstrapping.

IIRC, when bootstrapping is used for linear regression, it is the
*residuals* that are randomized, and not the dependent values.
In any case, I think it is probably true that bootstrapping is
neither as simplistic nor as broadly useful as you imagine it to be.
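
To make the residual idea concrete, here is a minimal sketch of it for a straight-line fit. It is an illustration under assumptions, not a recipe from any SPSS procedure: it is written in Python, the helper names `fit_line` and `residual_bootstrap_slopes` are invented for this example, and the x and y values are made up. The x values stay fixed, and each bootstrap response is a fitted value plus a residual drawn with replacement.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residual_bootstrap_slopes(x, y, reps=1000, seed=0):
    """Refit the line `reps` times on responses rebuilt from resampled residuals."""
    rng = random.Random(seed)
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(reps):
        # x is held fixed; only the residuals are randomized.
        drawn = rng.choices(residuals, k=len(x))
        y_star = [fi + e for fi, e in zip(fitted, drawn)]
        slopes.append(fit_line(x, y_star)[1])
    return slopes


# Hypothetical data: eight (x, y) pairs.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.3, 2.9, 4.1, 4.8, 6.2, 6.6, 8.1, 8.7]
slope_hat = fit_line(x, y)[1]
slopes = residual_bootstrap_slopes(x, y)
se = statistics.stdev(slopes)
print(f"slope = {slope_hat:.3f}, bootstrap standard error = {se:.3f}")
```

The spread of the refitted slopes is what serves as the bootstrap estimate of the slope's standard error.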

>
> Is there some way that you could provide a small sample file (actually,
> just a few data points that I would type in), and then tell me how to
> proceed?

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Stanley

Jul 7, 2007, 11:57:33 PM
Hi Rich,

Thank you very much for your reply.

You know vastly more about this than I do, and I am sure that I do need
to understand more about bootstrapping.

Then here is my confusion. I am just looking at a source by David Moore
and George McCabe. I will quote one sentence from them:

"The bootstrap distribution of a statistic, based on many resamples,
represents the sampling distribution of the statistic, based on many
samples."

So, as I understand it, here is why we use bootstrapping. Let's say
that we have only a small sample but that we want to get an idea of the
sampling distribution of a parameter of the population from which the
sample is drawn. In that case, we can use bootstrapping, right?
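
The resampling idea in that quote can be sketched in a few lines. This is only an illustration under assumptions: it is written in Python rather than SPSS syntax, the helper names `bootstrap_means` and `percentile_interval` are invented here, the data are made up, and it uses the simple percentile interval rather than any corrected version.

```python
import random
import statistics


def bootstrap_means(data, reps=2000, seed=0):
    """Return `reps` means, each from a resample of len(data) cases drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(reps)]


def percentile_interval(values, level=0.95):
    """Simple percentile interval: drop (1 - level) / 2 of the sorted values from each tail."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    lo = ordered[round(tail * len(ordered))]
    hi = ordered[round((1.0 - tail) * len(ordered)) - 1]
    return lo, hi


# Hypothetical data: eight measurements typed in by hand.
sample = [12.1, 9.8, 11.4, 15.2, 10.7, 13.3, 9.1, 14.6]
boot = bootstrap_means(sample)
low, high = percentile_interval(boot)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% percentile interval = ({low:.2f}, {high:.2f})")
```

The list `boot` is the bootstrap distribution of the mean; a histogram of it stands in for the sampling distribution that the quote describes.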

But there is no easy way to do this in SPSS, also right?

Am I completely on the wrong track here? Maybe I am.

Stanley

Brian

Jul 9, 2007, 10:29:26 AM

Stanley,

I agree with Rich that you probably need to get a better idea of
bootstrapping. To that end, here's a link to Chapter 14 of Moore and
McCabe's text, the bootstrap chapter that Tim Hesterberg co-authored.
It contains some basic points about bootstrapping that are important
in making choices:
http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf
You are right that in general SPSS does not have a "bootstrapping"
option. There are some built-in bootstrap options for some of their
point-and-click stats. Also, whether in SPSS, SAS, S, S+, R, etc.
most users build some custom macros for situations they encounter
frequently, so some study of macros and matrices for SPSS will really
be to your advantage in the intermediate and long run.

Brian

Stanley

Jul 9, 2007, 12:06:34 PM
Hi Brian,

Thank you very much for the link. I will study it, and macros, carefully.

I appreciate your help.

Stanley
