Interpretation of lrm output

Alexander Kautzsch

Jun 29, 2012, 1:40:44 PM
to statforli...@googlegroups.com
Dear all,

I have some questions concerning the interpretation of lrm output (hopefully not too trivial).

As posted elsewhere, I'm doing a study on language use in Namibia on the basis of a questionnaire with 5-point Likert scales.
I want to test whether the usage of one of the 4 languages spoken there in particular situations is influenced by the following factors:
 $ ETHNICITY : 4 levels "boer","ger","misc", "nat_afr"
 $ GENDER    : 2 levels "f","m"
 $ AGE 

E.g., for the usage of English (ENG) in a certain situation, I first identify significant factors using
QUEST.LRM = lrm(ENG ~ GENDER + AGE + ETHNICITY, data = QUEST, x = TRUE, y = TRUE)
anova(QUEST.LRM)

Then I simplify the model (eliminating GENDER, which is not significant):
QUEST.LRMA = lrm(ENG ~ AGE + ETHNICITY, data = QUEST, x = TRUE, y = TRUE)

then 
anova(QUEST.LRMA) 
yields:

 Factor     Chi-Square d.f. P
 AGE        12.23      1    0.0005
 ETHNICITY  12.15      3    0.0069
 TOTAL      16.82      4    0.0021

QUEST.LRMA
looks as follows:

                  Coef    S.E.   Wald Z Pr(>|Z|)
y>=2               2.2462 0.9452  2.38  0.0175  
y>=3               1.1175 0.9241  1.21  0.2266  
y>=4               0.6153 0.9239  0.67  0.5054  
y>=5              -0.5646 0.9327 -0.61  0.5450  
AGE               -0.0773 0.0221 -3.50  0.0005  
ETHNICITY=ger      3.3409 1.0494  3.18  0.0015  
ETHNICITY=misc     3.1383 1.1810  2.66  0.0079  
ETHNICITY=nat_afr  1.1425 0.6576  1.74  0.0823  


And here are my questions concerning what exactly is significant about ETHNICITY and AGE:

- Can I interpret the results as follows: ETHNICITY=ger and ETHNICITY=misc are significant (p<.05), and their positive coefficients mean that the response (ENG) is favoured (a negative coefficient would mean disfavoured)?

- Why are only three of my four ETHNICITY levels (ger, misc, nat_afr) displayed?

- For the interpretation of how age is significant, I need summary(QUEST.LRMA), don't I?
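
A note on the second question, offered as a sketch rather than a statement about this particular data set: with R's default treatment coding for an unordered factor, the first level (here, alphabetically, "boer") serves as the reference level and gets no coefficient of its own, so a four-level factor yields only three coefficients, each a contrast with that reference. The printed output above is consistent with this, though the thread does not confirm how ETHNICITY was coded in QUEST. The factor below is a hypothetical stand-in that only mimics the four level names.

```r
# Hypothetical stand-in for QUEST$ETHNICITY: same four level names, no real data
ETHNICITY <- factor(c("boer", "ger", "misc", "nat_afr"))

# Levels are sorted alphabetically by default, so "boer" comes first
levels(ETHNICITY)

# Default treatment contrasts: the reference level gets a row of zeros,
# the other three levels get one dummy column each
contrasts(ETHNICITY)

# To compare against a different baseline, relevel before fitting
ETHNICITY2 <- relevel(ETHNICITY, ref = "nat_afr")
levels(ETHNICITY2)
```

Read this way, the coefficient 3.3409 for ETHNICITY=ger is the log-odds difference between ger and boer, not an absolute preference for English.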

summary(QUEST.LRMA) looks as follows:
 
Effects              Response : ENG 

 Factor                   Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
 AGE                      27  47   20    -1.55  0.44 -2.41      -0.68     
  Odds Ratio              27  47   20     0.21    NA  0.09       0.51     
 ETHNICITY - boer:nat_afr  4   1   NA    -1.14  0.66 -2.43       0.15     
  Odds Ratio               4   1   NA     0.32    NA  0.09       1.16     
 ETHNICITY - ger:nat_afr   4   2   NA     2.20  0.91  0.41       3.99     
  Odds Ratio               4   2   NA     9.01    NA  1.51      53.86     
 ETHNICITY - misc:nat_afr  4   3   NA     2.00  1.06 -0.08       4.07     
  Odds Ratio               4   3   NA     7.36    NA  0.93      58.50  


--> AGE: is the crucial line "Odds Ratio"? Meaning: When going from 27 to 47 the likelihood of ENG to occur is multiplied by 0.21, i.e. reduced to one fifth?

Do the other odds ratios also tell me something?
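
The arithmetic behind these rows can be checked by hand. In a proportional-odds model a coefficient is a change in log odds, so the 0.21 multiplies the odds of a higher ENG rating (not the probability), and it refers to the 20-year difference between the Low and High values 27 and 47 shown in the table. The ETHNICITY rows compare each group with nat_afr rather than with boer. A minimal sketch using only the coefficients printed above; the variable names are made up for illustration:

```r
# Coefficients copied from the printed QUEST.LRMA output
b_age <- -0.0773
b_eth <- c(boer = 0, ger = 3.3409, misc = 3.1383, nat_afr = 1.1425)  # boer = reference

# AGE: effect of moving from 27 to 47 on the log-odds scale, then as an odds ratio
age_effect <- b_age * (47 - 27)
or_age <- exp(age_effect)
round(c(effect = age_effect, odds_ratio = or_age), 2)

# ETHNICITY: each level contrasted with nat_afr, as in the summary() rows
eth_effect <- b_eth[c("boer", "ger", "misc")] - b_eth["nat_afr"]
or_eth <- exp(eth_effect)
round(rbind(effect = eth_effect, odds_ratio = or_eth), 2)
```

These values match the Effect and Odds Ratio columns of the summary() output to two decimals. Note that the confidence intervals for ger and misc versus nat_afr are very wide, and that the misc interval (0.93 to 58.50) includes 1.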

Thanks in advance for any suggestions.
Best regards,
Alex

Stefan Th. Gries

Jun 29, 2012, 1:45:04 PM
to statforli...@googlegroups.com
Just checking

- your dependent variable has more than two levels, right? It's a
multinomial regression you're doing? (which is not covered in the
first edition of SFLWR, but will be in the second)
- you really do not want to include interactions?

STG
--
Stefan Th. Gries
-----------------------------------------------
University of California, Santa Barbara
http://www.linguistics.ucsb.edu/faculty/stgries
-----------------------------------------------

Pep Vallbé

Jun 30, 2012, 2:54:57 AM
to statforli...@googlegroups.com
I think Stefan is right: I would look for interactions, and I wouldn't remove the gender variable from the model just because it is non-significant. Moreover, if English usage is the only information you're looking for, then instead of a multinomial regression you might want to recode your dependent variable into a binary variable (1=ENG, 0=otherwise) and run a logistic regression model (with the glm function) with this new variable as the response.

pep


--
You received this message because you are subscribed to the Google Groups "StatForLing with R" group.
To post to this group, send email to statforli...@googlegroups.com.
To unsubscribe from this group, send email to statforling-wit...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/statforling-with-r?hl=en.




--
Pep Vallbé


Alexander Kautzsch

Jun 30, 2012, 2:59:08 AM
to statforli...@googlegroups.com
yes, the dependent variable has 5 levels.
yes, what I'm doing follows Baayen's chapter 6.3.2 on ordinal logistic regression.
how could I include interactions?
bw,
alex

Pep Vallbé

Jun 30, 2012, 3:07:46 AM
to statforli...@googlegroups.com
But as I understand it, your dependent variable is not an ordinal variable but a nominal one. If that is right, then, as Baayen tells you in 6.3.2, you shouldn't run an ordinal logistic regression unless your dependent variable is a factor with ordered levels. If it is a factor with UNORDERED




Pep Vallbé

Jun 30, 2012, 3:11:06 AM
to statforli...@googlegroups.com
Sorry, the message was sent before I finished. If it is a factor with unordered levels, you can do either a multinomial regression model (as Stefan suggested) or a logistic regression with a binary variable, as I suggested. Before trying interactions (i.e., including two variables multiplied in the model, e.g., AGE*ETHNICITY), I suggest clarifying this first.

p
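
As a small illustration of the formula notation mentioned here (not of any fitted model): in R's formula language, AGE*ETHNICITY is shorthand for both main effects plus their interaction, while AGE:ETHNICITY denotes the interaction term alone. The sketch below only inspects the formulas with base R's terms(); it fits nothing and needs no data.

```r
# A*B expands to A + B + A:B
star_terms <- attr(terms(ENG ~ AGE * ETHNICITY), "term.labels")
star_terms

# A:B on its own is only the interaction term
colon_terms <- attr(terms(ENG ~ AGE:ETHNICITY), "term.labels")
colon_terms
```

With lrm, the call would then presumably be lrm(ENG ~ AGE * ETHNICITY, data = QUEST, x = TRUE, y = TRUE), followed by anova() to check whether the interaction term earns its place.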


Alexander Kautzsch

Jun 30, 2012, 3:29:45 AM
to statforli...@googlegroups.com
It is a factor with ordered levels from 1 to 5 (Use of ENG in a certain situation: 1 never-2 rarely-3 sometimes-4 mostly-5 always). But I also have the data available in binary form (1,2,3--> "no", 4,5--> "yes").

alex
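
Both codings described here can be derived from the same 1-to-5 ratings. The sketch below uses simulated ratings (the data frame toy and all its values are invented, not the QUEST data) to show an ordered factor with the five labels given above, the binary split 1,2,3 --> "no" and 4,5 --> "yes", and the binary logistic regression with glm that Pep suggested.

```r
set.seed(1)  # simulated data only; nothing here comes from the questionnaire
n <- 200
toy <- data.frame(
  AGE       = sample(18:70, n, replace = TRUE),
  ETHNICITY = factor(sample(c("boer", "ger", "misc", "nat_afr"), n, replace = TRUE)),
  ENG       = sample(1:5, n, replace = TRUE)
)

# Ordered factor, as appropriate for an ordinal model
toy$ENG_ORD <- factor(toy$ENG, levels = 1:5,
                      labels = c("never", "rarely", "sometimes", "mostly", "always"),
                      ordered = TRUE)

# Binary version: ratings 1-3 become "no", ratings 4-5 become "yes"
toy$ENG_BIN <- factor(ifelse(toy$ENG >= 4, "yes", "no"), levels = c("no", "yes"))

# Cross-check the recoding
table(toy$ENG_ORD, toy$ENG_BIN)

# Binary logistic regression on the recoded response
fit_bin <- glm(ENG_BIN ~ AGE + ETHNICITY, data = toy, family = binomial)
coef(summary(fit_bin))
```

Collapsing to two categories discards the distinctions within 1-3 and within 4-5, so the ordinal model uses more of the information in the ratings.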




Stefan Th. Gries

Jun 30, 2012, 11:21:52 AM
to statforli...@googlegroups.com
You would include interactions as usual, i.e. either as A:B or A*B. As
for the interpretation of the model, I would do that via the predicted
probabilities. Generate all combinations of variable levels/values in
your predictor space with expand.grid, compute predicted probabilities
of the levels of the dependent variable, and plot them for the
significant predictors. That usually makes even more complex results
fairly comprehensible.
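
The procedure described here can be sketched without refitting the model, assuming the usual proportional-odds parameterisation that the y>=j intercept rows in the printed output suggest: P(ENG >= j) = plogis(intercept_j + linear predictor). With the fitted object at hand, predict(QUEST.LRMA, newdata, type = "fitted.ind") would presumably give the per-category probabilities directly; the version below uses base R and the printed coefficients only, and the AGE values 27 and 47 are simply the two values from the summary() table.

```r
# Coefficients copied from the printed QUEST.LRMA output
alpha <- c(2.2462, 1.1175, 0.6153, -0.5646)   # intercepts for y>=2, ..., y>=5
b_age <- -0.0773
b_eth <- c(boer = 0, ger = 3.3409, misc = 3.1383, nat_afr = 1.1425)  # boer = reference

# All combinations of the predictor values of interest
grid <- expand.grid(AGE = c(27, 47), ETHNICITY = names(b_eth),
                    stringsAsFactors = FALSE)

# Linear predictor (without intercepts) for every row of the grid
lp <- b_age * grid$AGE + unname(b_eth[grid$ETHNICITY])

# Cumulative probabilities P(ENG >= j), one column per threshold j = 2..5
cum <- plogis(outer(lp, alpha, "+"))

# Category probabilities P(ENG = j), j = 1..5, by differencing adjacent columns
probs <- cbind(1, cum) - cbind(cum, 0)
colnames(probs) <- paste0("p_ENG", 1:5)

pred <- cbind(grid, round(probs, 3))
print(pred)
```

Each row of probabilities sums to 1. Plotting the columns against AGE, with one line or panel per ETHNICITY level, gives the kind of picture suggested above.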

Cheers,
STG
--
Stefan Th. Gries