Consolidation of packages for fitting models based on linear predictors


Douglas Bates

Aug 1, 2013, 11:04:15 AM
to julia...@googlegroups.com
The GLM and LM packages from JuliaStats and my MixedModels package all fit statistical models based on a linear predictor expressed as a formula.  The formula/ModelFrame/ModelMatrix code currently sits in the DataFrames package.

I propose consolidating these packages and the formula/ModelFrame/ModelMatrix code into a single package.  If others agree that this is a good idea, then we should choose the package name carefully.  StatModels may be too general, since the Distributions package provides MLE and MAP fits, and MCMC also fits models, in a sense.  A name related to linear predictors may be too obscure for users to recognize the purpose of the package.
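
For readers new to the term, here is a minimal sketch of what "a linear predictor expressed as a formula" amounts to, written in plain Julia with no packages. The data, the coefficient values, and the variable names are all made up for illustration. This is not the ModelFrame/ModelMatrix code in DataFrames, which builds the matrix from a formula and a data frame.

```julia
# Illustrative data: one numeric covariate observed on three units (made up).
x = [1.0, 2.0, 3.0]

# Model matrix for a formula like y ~ 1 + x:
# a column of ones (the intercept) next to the covariate.
X = hcat(ones(length(x)), x)

# Hypothetical coefficient vector: intercept 0.5, slope 2.0.
beta = [0.5, 2.0]

# The linear predictor is the matrix-vector product X * beta.
eta = X * beta

println(eta)  # prints [2.5, 4.5, 6.5]
```

The packages discussed in this thread differ in how the response is related to `eta`, but each starts from a model matrix like `X` built from the formula, which is why the shared code is a candidate for consolidation.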

John Myles White

Aug 1, 2013, 12:23:37 PM
to julia...@googlegroups.com
I'd love to see some consolidation. I think we should just delete the LM package, since it's totally deprecated as is.

Although it's a little odd, I'd propose putting everything into GLM with the understanding that the name is being slightly abused. No one will be surprised that GLM supports OLS. I would guess that people are more likely to be pleasantly surprised to find that GLM supports GLMM's than they are to be upset that GLMM's aren't segregated away from GLM's. (If I'm wrong in thinking that MixedModels supports GLMM's, please correct me.)

LinearPredictors is a tempting name, but it's not clear to me why that package wouldn't include things like linear SVM's. It also won't be as easy to find for people who search for Julia GLM.

FWIW, I'd prefer that the ModelMatrix code stay in DataFrames since it's useful to anyone who wants to translate DataFrames into Julia matrices.

 -- John

--
You received this message because you are subscribed to the Google Groups "julia-stats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to julia-stats...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Douglas Bates

Aug 1, 2013, 2:01:37 PM
to julia...@googlegroups.com
On Thursday, August 1, 2013 11:23:37 AM UTC-5, John Myles White wrote:
> I'd love to see some consolidation. I think we should just delete the LM package, since it's totally deprecated as is.
>
> Although it's a little odd, I'd propose putting everything into GLM with the understanding that the name is being slightly abused. No one will be surprised that GLM supports OLS. I would guess that people are more likely to be pleasantly surprised to find that GLM supports GLMM's than they are to be upset that GLMM's aren't segregated away from GLM's. (If I'm wrong in thinking that MixedModels supports GLMM's, please correct me.)

It doesn't support them as yet but building such support is part of the plan.

I think it would be okay to stay with the name GLM and build the mixed-effects models into it.  It will also be straightforward to implement.

Viral B. Shah

Aug 2, 2013, 12:06:52 AM
to julia...@googlegroups.com
This consolidation is an excellent idea. It would also be nice to delete LM. I have been dabbling a bit with the code in JuliaStats recently, and enjoying it. I am already able to do stuff with DataFrames and linear models that I would have otherwise needed to use R for. Some function reference and docs along with the proposed consolidation would make all this much more usable.

-viral

Douglas Bates

Aug 2, 2013, 2:17:25 PM
to julia...@googlegroups.com
I'm in the process of performing the consolidation.  The holdup is that an instance of the LMMGeneral type loses its type information on return from the constructor.

One side-effect, John, is that the GLM package will depend on NLopt, which may make installing NLopt on your computer more of a priority.

Douglas Bates

Aug 2, 2013, 3:07:51 PM
to julia...@googlegroups.com
The master branch on the GLM.jl repository now has the code from MixedModels.jl incorporated.  I changed the name of the function to fit linear mixed-effects models from 'lmer' to 'lmm'.  The 'r' at the end of the original name referred to the implementation in R.

Has the process for deprecating a package been formalized yet?

Viral Shah

Aug 2, 2013, 3:09:10 PM
to julia...@googlegroups.com
There is no formal process for deprecating packages, but Stefan had mentioned that one can print a warning or error when the package is loaded, and eventually remove it.

-viral
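
A minimal sketch of the warn-on-load approach mentioned above, assuming a recent Julia version in which the `@warn` macro and the module-level `__init__` hook are available. The module name `LM` is borrowed from this thread for illustration, and the constant name and message wording are hypothetical, not taken from any actual package.

```julia
# Hypothetical sketch of a deprecated package's top-level module.
module LM

# Message shown to users; the wording is an assumption, not actual package text.
const DEPRECATION_MESSAGE =
    "The LM package is deprecated; linear models are now fit by the GLM package."

# Julia calls a module's __init__ function once, right after the module is loaded.
function __init__()
    @warn DEPRECATION_MESSAGE
end

end # module
```

With something like this in place, loading the package would still succeed but emit the warning. Replacing `@warn` with `error(DEPRECATION_MESSAGE)` would be the stricter variant, to be used shortly before the package is removed.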




John Myles White

Aug 2, 2013, 3:48:03 PM
to julia...@googlegroups.com
We should note that including NLopt makes GLM a GPL package.

 -- John

Viral Shah

Aug 2, 2013, 3:57:39 PM
to julia...@googlegroups.com
This is a bit unfortunate. Not that the result is GPL, but it was nice to have a pure Julia GLM package from an ease-of-installation perspective.

-viral

Douglas Bates

Aug 3, 2013, 10:46:36 AM
to julia...@googlegroups.com
On Friday, August 2, 2013 2:57:39 PM UTC-5, Viral B. Shah wrote:
> This is a bit unfortunate. Not that the result is GPL, but it was nice to have a pure Julia GLM package from an ease-of-installation perspective.

I hadn't thought of that consequence.  I want to continue to use the NLopt package for fitting mixed-effects models, so I will undo the consolidation and move the MixedModels repository to the JuliaStats group.