Linear Mixed Effect Models - Predictor matrix rank deficiency

Andrew

Oct 22, 2013, 8:27:10 AM

Hi all,

I'm having a problem fitting a linear mixed-effects model using fitlme.m. My model has three fixed effects: two (X1, X2) are dichotomous and one (X3) is nominal, although at present I'm only considering two categories for the nominal variable.

When I run fitlme.m using a dataset where the two dichotomous fixed effects are stored as logicals (and the nominal fixed effect is stored as a nominal), I get the error:

"Fixed Effects design matrix X must be of full column rank"

I don't understand this, as I have observations corresponding to all eight possible combinations of the fixed effects. Since I'm only considering two categories of the nominal variable, I can store it in the dataset as a logical, in which case fitlme.m runs without error (although the fixed and mixed effects in the output have an _1 after them, as though the formula has been misinterpreted).

If I make all fixed effects numeric (doubles), fitlme.m runs without error and gives the output I expect.

Can anyone shed light on this, or confirm whether it's OK to treat all of my fixed effects as numeric variables when using fitlme.m?

Thanks,
Andrew

Gautam Pendse

Oct 22, 2013, 4:58:07 PM

Hi Andrew,

I tried the following:

ds = dataset();
% some random data.
ds.y = rand(8,1);
ds.A = [true;true;true;true;false;false;false;false];
ds.B = [true;true;false;false;true;true;false;false];
ds.C = nominal([true;false;true;false;true;false;true;false]);
lme = fitlme(ds,'y ~ A + B + C')

This seems to work for me. If you are still having problems, can you share your data?

Categorical variables with K levels in the formula string are represented using (K-1) dummy variables; have a look at the 'DummyVarCoding' parameter in the documentation. So A_1 would be the coefficient associated with the dummy variable corresponding to the "true" level of categorical predictor A.
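
As a rough illustration of the (K-1) dummy-variable idea, the sketch below builds a fixed-effects design matrix by hand for the toy predictors above. It assumes reference-style coding with "false" as the baseline level; the column construction is hand-written for illustration and is not taken from fitlme's internals.

```matlab
% Hand-built design matrix for y ~ A + B + C, each predictor having 2 levels.
A = [true;true;true;true;false;false;false;false];
B = [true;true;false;false;true;true;false;false];
C = [true;false;true;false;true;false;true;false];

% Each two-level predictor contributes K-1 = 1 dummy column,
% and the intercept contributes one column of ones.
X = [ones(8,1), double(A), double(B), double(C)];

nCols = size(X, 2);   % intercept + 3 dummy columns
r     = rank(X);      % equals nCols here, so X has full column rank
fprintf('columns = %d, rank = %d\n', nCols, r);
```

With all eight combinations present and one dummy column per two-level predictor, the rank matches the number of columns, which is the condition the error message refers to.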

-- Gautam


Andrew

Oct 23, 2013, 3:58:08 AM

Hi Gautam,

Thanks for your reply. I took your test case, extended it to the number of observations I have, and started substituting in actual data for each of the variables. I found that using actual data for the nominal variable is what "broke" it.

Where I'd gone wrong was in doing:
Xd = 1:3; % raw data was in double format
X = nominal(Xd); % reflect that it's actually nominal data
X = X(Xd ~= 3); % don't fit the lme model using data from category 3

So I think X in this case is still being represented internally by 2 dummy variables, one of which is never true once I remove the X == 3 cases, and that is what causes the rank deficiency in the predictor matrix.

As a side note, using the X from above, isequal(X, nominal(1:2)) returns true, despite them representing different things.
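
A small sketch of the retained level, for anyone hitting the same thing. The data and the hand-built dummy matrix D are made up for illustration, and it assumes the Statistics Toolbox nominal class with its getlabels and droplevels functions; D only mimics the idea of one dummy column per non-reference level and is not fitlme's own design matrix.

```matlab
Xd = repmat((1:3)', 4, 1);      % hypothetical raw data with categories 1, 2, 3
X  = nominal(Xd);
X  = X(Xd ~= 3);                % removes the observations, not the level

labelsBefore = getlabels(X);    % still lists all three levels

% One dummy column per non-reference level (illustration only).
D = [ones(numel(X),1), double(X == '2'), double(X == '3')];
rankBefore = rank(D);           % the '3' column is all zeros: rank < 3 columns

X = droplevels(X);              % remove levels that have no observations
labelsAfter = getlabels(X);     % only the two levels still present
```

Converting after filtering, e.g. nominal(Xd(Xd ~= 3)), should avoid the empty level in the first place.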

So once again, thanks for your help. I'm happy to call this "solved".
Andrew

