{MEDSTATS} Help with multiple regression
From: Peter Flom <peterflomconsult...@mindspring.com>
Date: Thu, 2 Jul 2009 16:26:20 -0400 (EDT)
Local: Thurs, Jul 2 2009 4:26 pm
Subject: Re: {MEDSTATS} Help with multiple regression
jabs <jabenavi...@gmail.com> wrote

>Hello folks
>I am a physician who works in Mexico. I would like to predict the
>weight of a fetus before birth through ultrasound measurements. There
>are many studies which have published an equation or a formula in
>order to estimate fetal weight, and the equation has been obtained
>from independent variables (parameters of ultrasound). Unfortunately,
>none of these studies has been done in a Mexican population.
>I have collected the birth weight (dependent variable) of almost 500
>newborns (NB). I have also collected 13 ultrasound measurements
>(independent variables) per fetus in the 48 hours prior to birth
>(prenatal stage). My goal is to find an equation or formula to predict
>the weight of the baby using ultrasound variables (independent
>variables). I have read about this and I think I have to run a linear
>regression in which the dependent variable would be the birth weight,
>and ultrasound variables would be included as independent variables.

So far so good ....

>According to what I have read, I have to choose a backward variable
>selection method, by which I will obtain a linear model.

Not good at all.  Backwards methods (and other automatic variable selection methods)
are not good.  They are commonly used, but they are wrong.

>The problem is that I have no experience on how to perform this. Even
>so, I have tried to do it using SPSS software and after running
>the regression, at the results window I get a series of data such as
>tables (descriptive statistics, correlation, included/deleted
>variables, a summary model, ANOVA, analysis of colinearity, excluded
>variables), and Graphics. What is the right way to run the multiple
>regression? How can I get the model from these data? Which data must
>be included in the equation? Thanks in advance for your help.

You might try asking on an SPSS list, for details of how to do things in SPSS,
but which variables you should use is not dependent on software.  If you
are trying to replicate previous results, you should use the same variables.

With 500 newborns, you could use all 13 variables - unless there are collinearity problems.

Or you might want to use something like principal component regression, or partial least squares;
you might be concerned with possible nonlinear effects; there are other possibilities as well.

Peter

Peter L. Flom, PhD
Statistical Consultant
www DOT peterflomconsulting DOT com


Discussion subject changed to "{MEDSTATS} Re: Help with multiple regression" by Christian Lerch
From: Christian Lerch <t....@gmx.net>
Date: Thu, 02 Jul 2009 22:49:08 +0200
Local: Thurs, Jul 2 2009 4:49 pm
Subject: Re: {MEDSTATS} Re: Help with multiple regression
snip-----------------
 > With 500 newborns, you could use all 13 variables - unless there are
 > collinearity problems.
snip-----------------

Collinearity is very likely.

Start with a correlation matrix of all 13 measurements [Statistics =>
Correlate => Bivariate...]. Correlation coefficients above, say, 0.80
usually show that the inclusion of both variables is not necessary or is
even counterproductive.

Regards,
Christian

Peter Flom wrote:


From: Bruce Weaver <bwea...@lakeheadu.ca>
Date: Thu, 2 Jul 2009 14:43:03 -0700 (PDT)
Local: Thurs, Jul 2 2009 5:43 pm
Subject: Re: {MEDSTATS} Re: Help with multiple regression
On Jul 2, 4:49 pm, Christian Lerch <t....@gmx.net> wrote:

> snip-----------------
>  > With 500 newborns, you could use all 13 variables - unless there are
> collinearity problems.
> snip-----------------

> Collinearity is very likely.

> Start with a correlation matrix of all 13 measurements [Statistics =>
> Correlate => Bivariate...]. Correlation coefficients above, say, 0.80
> usually show that the inclusion of both variables is not necessary or is
> even counterproductive.

> Regards,
> Christian

Using bivariate correlations to try to assess multicollinearity is not
a very good idea, IMO.  First, you can have complete linear dependence
in the absence of any alarming looking bivariate correlations.  To
illustrate, try this example that Jerry Dallal posted in sci.stat.math
a couple years ago:

  X1  X2  X3  Y
  18  88 106  13
  72  45 117  43
  36  63  99  50
  75  26 101  77
  22  83 105  23
  99  71 170  68
  69  53 122   6
   6  49  55  51
  86  99 185  37
  85  64 149  10
  87   7  94  32
  93  32 125  69
  44  88 132   4
  34  34  68  13
  84  28 112  18

Check out all of the simple correlations.
Regress Y on X1,X2,X3.
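
For anyone without SPSS at hand, the first half of that exercise can be sketched in a few lines of Python using only the standard library. The sketch computes the pairwise correlations among X1, X2 and X3 from the table above, then checks row by row whether X3 equals X1 + X2. The helper name `pearson` is just an illustrative choice, and the code is an illustration rather than anything posted in the original thread.

```python
from itertools import combinations
from math import sqrt

# Predictor columns copied from the table above.
x1 = [18, 72, 36, 75, 22, 99, 69, 6, 86, 85, 87, 93, 44, 34, 84]
x2 = [88, 45, 63, 26, 83, 71, 53, 49, 99, 64, 7, 32, 88, 34, 28]
x3 = [106, 117, 99, 101, 105, 170, 122, 55, 185, 149, 94, 125, 132, 68, 112]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / sqrt(s_aa * s_bb)


predictors = {"X1": x1, "X2": x2, "X3": x3}
correlations = {
    (a, b): pearson(predictors[a], predictors[b])
    for a, b in combinations(predictors, 2)
}
for (a, b), r in correlations.items():
    print(f"r({a}, {b}) = {r:+.3f}")

# Exact linear dependence: is X3 simply X1 + X2 in every row?
exact_dependence = all(a + b == c for a, b, c in zip(x1, x2, x3))
print("X3 == X1 + X2 in every row:", exact_dependence)
```

Every pairwise correlation stays below the 0.80 rule of thumb, yet X3 is an exact linear combination of X1 and X2, so a regression of Y on all three cannot estimate separate coefficients for them.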

Second, in models that include products or polynomial terms (e.g., a
model with both X and X-squared as predictors), there can be very high
correlations between variables, but no problematic
multicollinearity.
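
One way to see the polynomial case, offered as an illustration and not as the argument made in the post: with an intercept in the model, the columns (1, X, X^2) and (1, Xc, Xc^2), where Xc is X minus its mean, span the same space and give the same fitted values, yet the correlation between the linear and squared terms can differ enormously. The sketch below uses made-up values X = 1, ..., 10 and a hypothetical helper `pearson`.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / sqrt(s_aa * s_bb)


# Made-up predictor values, purely for illustration.
x = [float(v) for v in range(1, 11)]
x_squared = [v ** 2 for v in x]

# The same predictor after subtracting its mean.
mean_x = sum(x) / len(x)
x_centered = [v - mean_x for v in x]
x_centered_squared = [v ** 2 for v in x_centered]

r_raw = pearson(x, x_squared)
r_centered = pearson(x_centered, x_centered_squared)
print(f"corr(X, X^2) before centering: {r_raw:.4f}")
print(f"corr(X, X^2) after centering:  {r_centered:.4f}")
```

The raw correlation is very high, while after centering it drops to zero for these symmetric values, even though both versions describe the same quadratic model.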

Tolerance and Variance Inflation Factor (which are available in the
SPSS Regression procedure) are better measures of problematic
multicollinearity, I think.
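
As a rough sketch of what those two statistics measure: the tolerance of a predictor is 1 - R^2 from regressing it on the other predictors, and the VIF is the reciprocal of the tolerance. The Python below (standard library only) applies that definition to the X1, X2, X3 columns from the example above. The function names `tolerance`, `vif` and `solve` are illustrative choices, and this is not how SPSS computes its output internally.

```python
x1 = [18, 72, 36, 75, 22, 99, 69, 6, 86, 85, 87, 93, 44, 34, 84]
x2 = [88, 45, 63, 26, 83, 71, 53, 49, 99, 64, 7, 32, 88, 34, 28]
x3 = [106, 117, 99, 101, 105, 170, 122, 55, 185, 149, 94, 125, 132, 68, 112]


def centered(col):
    mean = sum(col) / len(col)
    return [v - mean for v in col]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def tolerance(target, others):
    """1 - R^2 from regressing `target` on `others`; centering stands in for the intercept."""
    y = centered(target)
    xs = [centered(col) for col in others]
    gram = [[dot(a, b) for b in xs] for a in xs]
    beta = solve(gram, [dot(a, y) for a in xs])
    residuals = [yi - sum(b * x[i] for b, x in zip(beta, xs)) for i, yi in enumerate(y)]
    return max(0.0, dot(residuals, residuals) / dot(y, y))


def vif(tol):
    return float("inf") if tol == 0.0 else 1.0 / tol


predictors = {"X1": x1, "X2": x2, "X3": x3}
tolerances = {}
for name, col in predictors.items():
    others = [c for other, c in predictors.items() if other != name]
    tolerances[name] = tolerance(col, others)
    print(f"{name}: tolerance = {tolerances[name]:.3g}, VIF = {vif(tolerances[name]):.3g}")

# For comparison: X1 against X2 alone, with X3 left out of the model.
tolerance_without_x3 = tolerance(x1, [x2])
print(f"X1 without X3: tolerance = {tolerance_without_x3:.3f}, VIF = {vif(tolerance_without_x3):.3f}")
```

With all three predictors in the model every tolerance is essentially zero, which flags the dependence that the pairwise correlations miss. Dropping X3 brings the tolerance of X1 back close to 1.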

For more info, see the Multicollinearity link here:

  http://faculty.chass.ncsu.edu/garson/PA765/regress.htm

Regarding Peter's comments on stepwise selection, here is a good
summary of the problems:

   http://www.cmh.edu/stats/faq/faq12.asp

--
Bruce Weaver
bwea...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."


From: SR Millis <srmil...@yahoo.com>
Date: Thu, 2 Jul 2009 15:10:13 -0700 (PDT)
Local: Thurs, Jul 2 2009 6:10 pm
Subject: Re: {MEDSTATS} Re: Help with multiple regression

Examining zero order correlations will not necessarily help in detecting high collinearity.  The absence of high correlations can't be viewed as  evidence of no problem.  It's possible for 3 or more variables to be collinear while no 2 of the variables taken alone are highly correlated.

You need to request collinearity diagnostics in linear regression.  Then, examine the condition indexes. Identify any that are large, i.e., >30 (or even 20). Then, examine the associated variance-decomposition proportions (VDPs) for those large condition indexes. Large VDPs (>.50) will identify those variables that are involved in the near dependency.
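
The procedure just described can be sketched in plain Python. The assumptions are mine and are not taken from the thread: the sketch follows the usual Belsley-style recipe (intercept column included, every column scaled to unit length, no centering), it reuses the X1, X2, X3 columns from the small example posted earlier in this thread, and it gets eigenvalues from a hand-written Jacobi routine. Eigenvalues are floored at a tiny positive value, an arbitrary choice, so that an exact dependence shows up as a very large finite index. Output from SPSS may differ in detail.

```python
from math import hypot, sqrt

x1 = [18, 72, 36, 75, 22, 99, 69, 6, 86, 85, 87, 93, 44, 34, 84]
x2 = [88, 45, 63, 26, 83, 71, 53, 49, 99, 64, 7, 32, 88, 34, 28]
x3 = [106, 117, 99, 101, 105, 170, 122, 55, 185, 149, 94, 125, 132, 68, 112]
names = ["const", "X1", "X2", "X3"]
columns = [[1.0] * len(x1), x1, x2, x3]


def unit_length(col):
    norm = sqrt(sum(v * v for v in col))
    return [v / norm for v in col]


def jacobi_eigen(sym, max_sweeps=60):
    """Eigenvalues and eigenvectors (as columns) of a small symmetric matrix."""
    n = len(sym)
    a = [row[:] for row in sym]
    vecs = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        if max(abs(a[p][q]) for p in range(n) for q in range(p + 1, n)) < 1e-15:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-18:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = (1.0 if theta >= 0 else -1.0) / (abs(theta) + hypot(theta, 1.0))
                c = 1.0 / hypot(t, 1.0)
                s = t * c
                for m in (a, vecs):  # rotate columns p and q
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # rotate rows p and q of a
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    return [a[i][i] for i in range(n)], vecs


n = len(names)
scaled = [unit_length(col) for col in columns]
gram = [[sum(u * v for u, v in zip(a, b)) for b in scaled] for a in scaled]
eigenvalues, vectors = jacobi_eigen(gram)

# Floor tiny or slightly negative eigenvalues (arbitrary choice for this sketch).
floor = 1e-12 * max(eigenvalues)
eigenvalues = [max(lam, floor) for lam in eigenvalues]

# Condition index per dimension, and variance-decomposition proportions
# vdp[k][j]: share of the variance of coefficient k tied to dimension j.
cond_index = [sqrt(max(eigenvalues) / lam) for lam in eigenvalues]
phi = [[vectors[k][j] ** 2 / eigenvalues[j] for j in range(n)] for k in range(n)]
vdp = [[p / sum(row) for p in row] for row in phi]

worst = max(range(n), key=lambda j: cond_index[j])
print(f"largest condition index: {cond_index[worst]:.3g}")
for k, name in enumerate(names):
    print(f"  VDP for {name}: {vdp[k][worst]:.3f}")
```

On this data the largest condition index is far above 30, and the proportions on that dimension exceed .50 for X1, X2 and X3 but not for the constant, which points at the three predictors as the source of the dependency.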

Scott R Millis, PhD, ABPP (CN,CL,RP), CStat, CSci
Professor & Director of Research
Dept of Physical Medicine & Rehabilitation
Dept of Emergency Medicine
Wayne State University School of Medicine
261 Mack Blvd
Detroit, MI 48201
Email:  smil...@med.wayne.edu
Tel: 313-993-8085
Fax: 313-966-7682

--- On Thu, 7/2/09, Christian Lerch <t....@gmx.net> wrote:


From: Peter Flom <peterflomconsult...@mindspring.com>
Date: Thu, 2 Jul 2009 18:17:27 -0400 (GMT-04:00)
Local: Thurs, Jul 2 2009 6:17 pm
Subject: Re: {MEDSTATS} Re: Help with multiple regression
I wrote

> > With 500 newborns, you could use all 13 variables - unless there are
>collinearity problems.

Christian Lerch <t....@gmx.net> replied

>Collinearity is very likely.

>Start with a correlation matrix of all 13 measurements [Statistics =>
>Correlate => Bivariate...]. Correlation coefficients above, say, 0.80
>usually show that the inclusion of both variables is not necessary or is
>even counterproductive.

Actually, correlations are neither necessary nor sufficient for collinearity.

Much better to use condition indexes.

Peter

Peter L. Flom, PhD
Statistical Consultant
www DOT peterflomconsulting DOT com

