Interactions in GLM


Johan Sigfrids

May 6, 2014, 4:13:36 PM
to julia...@googlegroups.com
How do you specify interactions and non-linear transforms using the formula for GLM? Something like y~x1*x2 + x2^2

Kevin Squire

May 12, 2014, 4:43:15 PM
to julia...@googlegroups.com
I don't think that's supported directly right now. But you can always create additional columns with those values and use them. 
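
A minimal sketch of that workaround in plain Julia, assuming made-up data: the names x1, x2, x1_x2, and x2_sq are placeholders rather than anything from this thread, and no DataFrames or GLM call is used because the exact column-assignment syntax depends on the package version.

```julia
# Hypothetical predictors; the values are invented for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [0.5, 1.5, 2.5, 3.5]

# Interaction term: element-wise product of the two predictors.
x1_x2 = x1 .* x2

# Non-linear transform: element-wise square of x2.
x2_sq = x2 .^ 2

# Design matrix with an intercept column followed by the main effects
# and the two derived columns.
X = hcat(ones(length(x1)), x1, x2, x1_x2, x2_sq)
```

Once stored as ordinary columns, the derived vectors can be named in the formula like any other variable.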



John Myles White

May 12, 2014, 4:46:08 PM
to julia...@googlegroups.com
We might already support interactions, but I'm not sure if it works at the moment.

Non-linear transforms are a trickier case: their use in R depends upon R's willingness to eval() things in arbitrary scopes, which Julia doesn't really do. We can hardcode a few special cases like log, sqrt, etc., but I'm inclined to think it's better to insist that transformations be placed in DataFrames rather than created as temporary columns.

 -- John

Johan Sigfrids

May 12, 2014, 5:06:10 PM
to julia...@googlegroups.com
I did futz around with the formula but I couldn't get interactions working. I suppose I will just have to start making lots of columns.

Johan Sigfrids

May 12, 2014, 5:27:27 PM
to julia...@googlegroups.com
Follow-up question: if you aren't going to support interactions or non-linear terms in the formula, could you support the R idiom for including everything except one variable, something like y~.-x7?

Douglas Bates

May 13, 2014, 3:34:22 PM
to julia...@googlegroups.com
Interaction terms are written as A&B, not A:B, because the colon is already so heavily overloaded.

See the expansion of A*B:

julia> using RDatasets

julia> inst = dataset("lme4","InstEval");

julia> ModelFrame(Y ~ Dept*Service, inst).terms
Terms({:Dept,:Service,:(Dept & Service)},{:Y,:Dept,:Service},3x4 Array{Int8,2}:
 1  0  0  0
 0  1  0  1
 0  0  1  1,[1,1,1,2],true,true)

julia> ModelFrame(Y ~ Dept+Service+Dept&Service, inst).terms
Terms({:Dept,:Service,:(Dept & Service)},{:Y,:Dept,:Service},3x4 Array{Int8,2}:
 1  0  0  0
 0  1  0  1
 0  0  1  1,[1,1,1,2],true,true)

Johan Sigfrids

May 13, 2014, 8:52:28 PM
to julia...@googlegroups.com
I've tried all three. Both & and * result in the same error.

using RDatasets, GLM
boston = dataset("MASS", "Boston")

lm(MedV ~ LStat+Age+LStat&Age, boston)


no method getindex(DataFrame, Expr)
 in coefnames at /Users/johansigfrids/.julia/v0.3/DataFrames/src/formula/formula.jl:249
 in coeftable at /Users/johansigfrids/.julia/v0.3/GLM/src/lm.jl:67
 in show at /Users/johansigfrids/.julia/v0.3/GLM/src/linpred.jl:77
 in anonymous at show.jl:1040
 in with_output_limit at show.jl:1017
 in showlimited at show.jl:1039
 in writemime at replutil.jl:2
 in writemime at multimedia.jl:41
 in sprint at io.jl:460
 in display_dict at /Users/johansigfrids/.julia/v0.3/IJulia/src/execute_request.jl:24

Taylor Maxwell

May 14, 2014, 1:44:38 AM
to julia...@googlegroups.com
It works fine for me, although I am using the latest master version of GLM, so I need to use fit(LmMod, formula, df).

using RDatasets, GLM
boston = dataset("MASS", "Boston")

julia> cc=fit(LmMod,MedV ~ LStat+Age+LStat&Age, boston)
DataFrameRegressionModel{LmMod{DensePredQR{Float64}},Float64}:

Coefficients:
                  Estimate Std.Error    t value Pr(>|t|)
(Intercept)        36.0885   1.46984    24.5528  < eps()
LStat             -1.39212  0.167456   -8.31335  8.8e-16
Age            -0.00072086 0.0198792 -0.0362621   0.9711
LStat & LStat   0.00415595 0.0018518    2.24428   0.0252

Douglas Bates

May 14, 2014, 4:04:00 PM
to julia...@googlegroups.com
I think the problem is with third- and higher-order interactions.

Min-Woong Sohn

Sep 29, 2016, 4:47:26 PM
to julia-stats
As a follow-up, I realized that GLM puts the interaction variable at the end of the estimates in the coeftable output. Is there any way I can preserve the order so that the variables are shown in the order given (e.g., LStat, Age, and the LStat and Age interaction, in that order)?

Milan Bouchet-Valat

Sep 29, 2016, 6:46:06 PM
to julia...@googlegroups.com
On Thursday, September 29, 2016 at 09:47 -0700, Min-Woong Sohn wrote:
> As a follow-up, I realized that GLM puts the interaction variable at
> the end of the estimates in the coeftable output. Is there any way I
> can preserve the order so that the variables are shown in the order
> given (e.g., LStat, Age, and the LStat and Age interaction, in that order)?
What's the problem exactly with the output shown in the message below?
Don't you get the same result?


Regards


> > I think the problem is with third and higher order interactions.
> >
> > > It works fine for me although I am using the latest master
> > > version of GLM so I need to use fit(LmMod, formula,df)
> > >
> > > using RDatasets, GLM
> > > boston = dataset("MASS", "Boston")
> > >
> > > julia> cc=fit(LmMod,MedV ~ LStat+Age+LStat&Age, boston)
> > > DataFrameRegressionModel{LmMod{DensePredQR{Float64}},Float64}:
> > >
> > > Coefficients:
> > >                   Estimate Std.Error    t value Pr(>|t|)
> > > (Intercept)        36.0885   1.46984    24.5528  < eps()
> > > LStat             -1.39212  0.167456   -8.31335  8.8e-16
> > > Age            -0.00072086 0.0198792 -0.0362621   0.9711
> > > LStat & LStat   0.00415595 0.0018518    2.24428   0.0252
> > >
> > >
> > > > > How do you specify interactions and non-linear transforms using
> > > > > the formula for GLM? Something like y~x1*x2 + x2^2
> > > >
> > >
> >