nlinfit for Multivariate Nonlinear Regression?

Rene

Dec 15, 2011, 7:54:07 PM
Hello!

I'd like to perform a quadratic regression on multivariate data - more precisely from my experiments I have four "input" variables (regressors) and one response value.
I understand the four terms X,y,fun,beta0. But to define fun and X I need to know what they are supposed to look like.

At the moment my matrix X of regressors, with n values each (from n experiment runs), looks like the following:
x11, x21, x31, x41
x12, x22, x32, x42
...
x1n, x2n, x3n, x4n

Would it be right to extend the matrix by the following columns (calculated for each row, of course)?
x11, x21, x31, x41, x1*x1, x1*x2, x1*x3, x1*x4, x2*x2, x2*x3, x2*x4, x3*x3, x3*x4, x4*x4

And then have a function like:
fun = @(b,X) (b(1) + b(2)*X(1:n,1) + b(3)*X(1:n,2) + ....
with a vector b of 14 rows (as the matrix has 14 columns)?


I'd be very glad for some advice!

René

Greg Heath

Dec 15, 2011, 10:55:19 PM
1. Since you only have one output variable, it is Multiple (NOT
Multivariate) Regression.
2. Since your model is linear in the unknown parameters, the solution can
be obtained via backslash.
3. The Multivariate model can also be solved via backslash.

Hope this helps.

Greg

Rene

Dec 15, 2011, 11:59:09 PM
You're right, sorry... of course it's multiple and not multivariate. Must have mixed up the terms.

But I'm not sure if backslash (you're referring to matrix division, aren't you) would help me as I don't just want to do the Matrix division but get a best fit estimate for the whole sample. And I assume the relation between some inputs and the output is somewhat quadratic, that's why the regular linear regression didn't lead to a good prediction.



Rene

Dec 16, 2011, 3:28:08 PM
I tried to fix it, but still get errors...

I used the function

function [out] = nlr(b,X)
for n = 1:25
out(n) = b(1).*Inputs(1,n)+b(2).*Inputs(2,n)+b(3).*Inputs(3,n)+b(4).*Inputs(4,n)+b(5).*Inputs(5,n)+b(6).*Inputs(6,n)+b(7).*Inputs(7,n)+b(8).*Inputs(8,n)+b(9).*Inputs(9,n)+b(10).*Inputs(10,n)+b(11).*Inputs(11,n)+b(12).*Inputs(12,n)+b(13).*Inputs(13,n)+b(14).*Inputs(14,n)+b(15).*Inputs(15,n);
end

and

beta0 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];

Called it with

nlin_write95 = nlinfit(Inputs, WriteLatency95th, nlr, beta0);

And get the error:
Error using nlr (line 6)
Not enough input arguments.


Does anybody have a suggestion how to solve this problem?
Thanks!
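
A possible cause, assuming nlr is saved as a function file: writing nlr without the @ in the nlinfit call makes MATLAB call the function with no arguments, which would produce exactly this error. The function body also reads Inputs, which is not visible inside the function; it should use its own argument X. A minimal sketch with made-up data (X, y and the three-coefficient model are illustrative assumptions, not the data from this thread; nlinfit needs the Statistics Toolbox):

```matlab
% Illustrative data only: 25 runs, 3 regressors (rows are observations).
k = (1:25)';
X = [k/25, cos(k), sin(k)];
y = 2*X(:,1) - 1*X(:,2) + 0.5*X(:,3) + 0.01*sin(7*k);

% The model reads its own arguments b and X, not a workspace variable.
% It is vectorized, so no loop over the runs is needed.
model = @(b, X) X * b(:);

beta0 = ones(3, 1);

% Pass a function handle. For a function file nlr.m this would be @nlr.
beta = nlinfit(X, y, model, beta0);
```

Since this model is linear in b, X\y should give essentially the same estimate without an iterative fit.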



Tom Lane

Dec 16, 2011, 3:35:16 PM
> But I'm not sure if backslash (you're referring to matrix division, aren't
> you) would help me as I don't just want to do the Matrix division but get
> a best fit estimate for the whole sample. And I assume the relation
> between some inputs and the output is somewhat quadratic, that's why the
> regular linear regression didn't lead to a good prediction.

René, Greg's advice is correct, because backslash does least squares when
the matrix that precedes it is not square.

Consider using the x2fx function with backslash, or the regstats function:

>> x1 = rand(10,1); x2 = randn(10,1);
>> y = 10 + 9*x1 + 8*x2 + 7*x1.^2 + 6*x2.^2 + 5*x1.*x2 + randn(size(x1))/100;
>> x = [x1 x2];
>> x2fx(x,'quadratic')\y
ans =
10.0210
8.8776
7.9925
5.0227
7.1465
5.9956
>> s = regstats(y,x,'quadratic');
>> s.beta
ans =
10.0210
8.8776
7.9925
5.0227
7.1465
5.9956

You can see I roughly reproduced the known coefficients, except their order
is different from the way I entered them.
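
The order can also be read off by building the quadratic design matrix by hand. A minimal sketch, assuming the column order constant, linear terms, interaction, squares (which is what the output above suggests), with noise-free illustrative data so that the coefficients come back essentially exactly:

```matlab
% Illustrative, noise-free data (not the random data used above).
k  = (1:10)';
x1 = k/10;
x2 = cos(k);
y  = 10 + 9*x1 + 8*x2 + 7*x1.^2 + 6*x2.^2 + 5*x1.*x2;

% Columns: constant, x1, x2, x1*x2, x1^2, x2^2
D = [ones(size(x1)), x1, x2, x1.*x2, x1.^2, x2.^2];

% D is 10-by-6 (not square), so backslash returns the least-squares solution.
b = D \ y;
```

Here b lines up with the columns of D, so the interaction coefficient (5) sits in position 4, ahead of the two squared terms (7 and 6).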

-- Tom

Rene

Dec 16, 2011, 4:56:08 PM
Thanks a lot to both of you! regstats indeed seems to be exactly what I had been looking for... I'm pretty new to Matlab, so I didn't know about this capability of backslash.

That just leaves me with one last question: Do you have an idea in which order the elements of beta are listed?
For my four input variables I get a beta of size 15, which is correct as there is one term corresponding to beta_0 (the very first one, that was easy to guess due to my figures), four for the single variables, four for the squared versions and six for the interactions. But I have no clue whether the single variables come first, then the squares and then the interactions, or whether it's in some other order.


Greg Heath

Dec 17, 2011, 2:04:08 AM
On Dec 16, 4:56 pm, "Rene " <rene_wein...@hotmail.com> wrote:
---SNIP>
> That just leaves me with one last question: Do you have an idea in
> which order the elements of beta are listed?

Compare

y = 10 + 9*x1 + 8*x2 + 7*x1.^2 + 6*x2.^2 + 5*x1.*x2 + randn(size(x1))/100;

with

ans =
   10.0210
    8.8776
    7.9925
    5.0227
    7.1465
    5.9956

Hope this helps.

Greg

Rene

Dec 28, 2011, 3:39:08 PM
Thanks Greg, I appreciate your help a lot... so I expected the order for four variables x1-4 to be:
beta0, x1, x2, x3, x4, x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4, x1^2, x2^2, x3^2, x4^2

Unfortunately, if I check the results with some of the input data and the calculated residuals, they don't match... which other order could have been used? I can't think of any.



Greg Heath

Dec 29, 2011, 7:03:26 AM

%%%%%CORRECTED FOR THE HEINOUS SIN OF TOP-POSTING%%%%

On Dec 28, 3:39 pm, "Rene " <rene_wein...@hotmail.com> wrote:
> Greg Heath <he...@alumni.brown.edu> wrote in message <b10c5ac3-4610-40bc-9872-7a13cec0f...@p20g2000vbm.googlegroups.com>...
> > On Dec 16, 4:56 pm, "Rene " <rene_wein...@hotmail.com> wrote:
> > ---SNIP>
> > > That just leaves me with one last question: Do you have an idea in
> > > which order the elements of beta are listed?
>
> > Compare
>
> > y = 10 + 9*x1 + 8*x2 + 7*x1.^2 + 6*x2.^2 + 5*x1.*x2 + randn(size(x1))/100;
>
> > with
>
> > ans =
> > 10.0210
> > 8.8776
> > 7.9925
> > 5.0227
> > 7.1465
> > 5.9956
>
> Thanks Greg, I appreciate your help a lot... so I expected the order for four variables x1-4 to be:
> beta0, x1, x2, x3, x4, x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4, x1^2, x2^2, x3^2, x4^2
>
> Unfortunately, if I check the results with some of the input data and the calculated residuals, they don't match... which other order could have been used? I can't think of any.

Why don't you extend the model to 3 variables and post the results.
Then post the results for 4 variables.

Hope this helps.

Greg
Do the same for

Rene

Dec 29, 2011, 12:19:08 PM
Greg Heath <he...@alumni.brown.edu> wrote in message <eddb55b2-e582-4ca7...@z25g2000vbs.googlegroups.com>...
>
> Why don't you extend the model to 3 variables and post the results.
> Then post the results for 4 variables.
>
> Hope this helps.
>
> Greg
> Do the same for

I tried this and got the following result which was a bit ambiguous:

> x1 = randn(15,1); x2 = randn(15,1); x3 = randn(15,1); x4 = randn(15,1);
> y = 15 + 14*x1 + 13*x2 + 12*x3 + 11*x4 + 10*x1.^2 + 9*x2.^2 + 8*x3.^2 + 7*x4.^2 + 6*x1.*x2 + 5*x1.*x3 + 4*x1.*x4 + 3*x2.*x3 + 2*x2.*x4 + 1*x3.*x4 + randn(size(x1))/100;
> x = [x1 x2 x3 x4];
> x2fx(x,'quadratic')\y

ans =

14.5224
13.2438
11.9884
12.5123
12.8083
4.6481
5.5425
8.0585
3.9399
3.6584
0.0052
9.9893
9.1804
7.8304
5.4853

But then I figured that the problem was the random term, so a second try gave the desired result:

> x1 = randn(15,1); x2 = randn(15,1); x3 = randn(15,1); x4 = randn(15,1);
> y = 15 + 14*x1 + 13*x2 + 12*x3 + 11*x4 + 10*x1.^2 + 9*x2.^2 + 8*x3.^2 + 7*x4.^2 + 6*x1.*x2 + 5*x1.*x3 + 4*x1.*x4 + 3*x2.*x3 + 2*x2.*x4 + 1*x3.*x4;
> x = [x1 x2 x3 x4];
> x2fx(x,'quadratic')\y

ans =

15.0000
14.0000
13.0000
12.0000
11.0000
6.0000
5.0000
4.0000
3.0000
2.0000
1.0000
10.0000
9.0000
8.0000
7.0000


So the order is indeed beta0, x1, x2, x3, x4, x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4, x1^2, x2^2, x3^2, x4^2.

Then I'll have to figure out what else could be the reason for the residuals not matching.
Thanks for your help and for being patient :-)
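
One possible explanation for the first, noisy result: a full quadratic model in four variables has 15 coefficients, and the test used only 15 runs, so the design matrix is square and there are no spare observations to average out the noise. A sketch with more runs, building the design matrix by hand in the order found above (n = 200 is an arbitrary choice, and the data are random as in the test above):

```matlab
n = 200;                         % many more runs than the 15 coefficients
x = randn(n, 4);

% Columns: constant, x1..x4, the six interactions, then the four squares.
D = [ones(n,1), x, ...
     x(:,1).*x(:,2), x(:,1).*x(:,3), x(:,1).*x(:,4), ...
     x(:,2).*x(:,3), x(:,2).*x(:,4), x(:,3).*x(:,4), ...
     x.^2];

% Same coefficients as in the test above, listed in the column order of D.
btrue = [15 14 13 12 11 6 5 4 3 2 1 10 9 8 7]';
y = D*btrue + randn(n,1)/100;    % same noise level as the first try

bhat = D \ y;                    % least-squares estimate
maxerr = max(abs(bhat - btrue)); % largest coefficient error
```

With this many runs maxerr should come out small even though the noise term is still present.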

Greg Heath

Dec 29, 2011, 3:12:51 PM
On Dec 29, 12:19 pm, "Rene " <rene_wein...@hotmail.com> wrote:
> Greg Heath <he...@alumni.brown.edu> wrote in message <eddb55b2-e582-4ca7-a509-554851e9c...@z25g2000vbs.googlegroups.com>...
On Dec 28, 3:39 pm, "Rene " <rene_wein...@hotmail.com> wrote:
> Thanks Greg, I appreciate your help a lot... so I expected the order for four variables x1-4 to be:
> beta0, x1, x2, x3, x4, x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4, x1^2, x2^2, x3^2, x4^2


That is correct.

> Unfortunately, if I check the results with some of the input data and the calculated residuals, they don't match... which other order could have been used? I can't think of any.

If the order of your model is higher than 2, you have to define your
model by using a power matrix in the second input of x2fx. Although
the command "help x2fx" (w/o quotes) doesn't help much, see the
documentation via "doc x2fx". Using that example, consider

close all, clear all, clc
x = [1 2 3 ; 4 5 6 ]' % transpose
model1 = 'quadratic'
D1 = x2fx(x,model1)

model2 = [0 1 0 1 2 0; 0 0 1 1 0 2]' % transpose; each row holds the powers of x1 and x2 for one term
D2 = x2fx(x,model2)

Now you can do your 4th order model.
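
As a rough illustration of how a power matrix is read: each row of the model matrix holds the exponents for one column of the design matrix. The sketch below rebuilds the 'quadratic' design by hand with core MATLAB only; the loop is an illustration of the convention, not the toolbox's implementation, and the variable names are arbitrary.

```matlab
x = [1 2 3; 4 5 6]';             % 3 observations, 2 variables

% Rows: constant, x1, x2, x1*x2, x1^2, x2^2
model = [0 0; 1 0; 0 1; 1 1; 2 0; 0 2];

D = zeros(size(x,1), size(model,1));
for k = 1:size(model,1)
    % Raise each column of x to the power in row k, then multiply across.
    D(:,k) = prod(bsxfun(@power, x, model(k,:)), 2);
end
```

A higher-order term is then just another row, for example [3 1] for x1^3*x2.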

Hope this helps.

Greg

Steven_Lord

Dec 30, 2011, 11:58:55 PM


"Rene " <rene_w...@hotmail.com> wrote in message
news:jdi7ec$29f$1...@newscl01ah.mathworks.com...
> Greg Heath <he...@alumni.brown.edu> wrote in message
> <eddb55b2-e582-4ca7...@z25g2000vbs.googlegroups.com>...
>>
>> Why don't you extend the model to 3 variables and post the results.
>> Then post the results for 4 variables.
>>
>> Hope this helps.
>>
>> Greg
>> Do the same for
>
> I tried this and got the following result which was a bit ambiguous:
>
>> x1 = randn(15,1); x2 = randn(15,1); x3 = randn(15,1); x4 = randn(15,1);
>> y = 15 + 14*x1 + 13*x2 + 12*x3 + 11*x4 + 10*x1.^2 + 9*x2.^2 + 8*x3.^2 + 7*x4.^2 + 6*x1.*x2 + 5*x1.*x3 + 4*x1.*x4 + 3*x2.*x3 + 2*x2.*x4 + 1*x3.*x4 + randn(size(x1))/100;
>> x = [x1 x2 x3 x4];
>> x2fx(x,'quadratic')\y

*snip*

The order of the columns returned by X2FX is given in the documentation:

http://www.mathworks.com/help/toolbox/stats/x2fx.html

--
Steve Lord
sl...@mathworks.com
To contact Technical Support use the Contact Us link on
http://www.mathworks.com
