lambda + (1 - lambda)*probit (-x^2)
I haven't been able to figure out how to fit this with glmfit. Am I missing something? Ideas?
Daniel, your description is a little vague to me. It sounds like you're using the GLMFIT function in the Statistics Toolbox with a probit link and a binomial distribution. It sounds like what you want is to model the probability of "success" (whatever that means in your case) as
p = lambda + (1-lambda)*Phi(eta)
where Phi is the standard normal CDF, and eta is the linear predictor X*beta. Is that right?
You can do this by defining your own link function as described in the help for GLMFIT. I believe the usual thing described in the literature is to fix lambda, fit the GLM and compute the deviance, and then vary lambda to find the value that minimizes the deviance.
The one catch in doing this with GLMFIT is that you'll probably find that the standard GLM initialization doesn't work well with that kind of link function -- depending on your data and the value of lambda, it may try to initialize with probabilities that are less than lambda. That's easy to fix if you can modify GLMFIT in your MATLAB installation.
If you can't, there was a thread a few weeks ago with the subject "glmfit binomial-probit with natural mortality" that describes a way to call GLMFIT to fit this model.
Hope this helps.
- Peter Perkins
The MathWorks, Inc.
In fact, this works very well for much of my data. However, humans sometimes "miss" on trials when the probability says that they should almost never miss and this can severely bias the desired parameter fits from the model. (An example of a lapse is that I should hear that tone except that I fell asleep while you played the tone for me.) This is often handled in the psychophysical literature by adding one additional parameter to the model to represent the fact that subjects might sometimes miss on trials that they should have detected. (See below for an equation showing how this is implemented.)
To avoid ambiguity, I'll provide some portions of an m-file that will explicitly outline my model.
mu=-3; %This is the location of the peak of the probability distribution function.
sigma=3; %This is the standard deviation.
lambda=0.01; % This is my lapse rate.
x=-20:.01:20;
If I understand the m-file you provided, the following represent the three functions that are required.
deta=(1-2*lambda)*normpdf(x,mu,sigma); %This is the derivative of the link function
eta=norminv((p_lapse-lambda)/(1-2*lambda),mu,sigma); %This is a form of the link function
mu=lambda + (1-2*lambda)*normcdf(x,mu,sigma); %This is a form of the inverse link function
However, when I try to use the m-file that you provided as a template, I start to run into problems.
I can almost figure out how to implement the derivative function, since I can write this equation in a simple closed form solution:
deta=(1-2*L)*exp(-((mu)).^2)/sqrt(2*pi);
(But this is missing the normalization by sigma which I can't see how to write in the desired format for the GLM, but I probably am just missing something.)
However, the inverse link function cannot be written in a closed form solution. In fact it includes the normal cdf, which is a from of the error function and I just can't see how to write this in a closed form that is appropriate for the glmfit. Seems like the same is true for the link function.
Am I missing something?
Thanks in advance for any advice. Sincerely,
Dan
> If I understand the m-file you provided, the following represent the three functions that are required.
> deta=(1-2*lambda)*normpdf(x,mu,sigma); %This is the derivative of the link function
> eta=norminv((p_lapse-lambda)/(1-2*lambda),mu,sigma); %This is a form of the link function
> mu=lambda + (1-2*lambda)*normcdf(x,mu,sigma); %This is a form of the inverse link function
>
> However, when I try to use the m-file that you provided as a template, I start to run into problems.
>
> I can almost figure out how to implement the derivative function, since I can write this equation in a simple closed form solution:
> deta=(1-2*L)*exp(-((mu)).^2)/sqrt(2*pi);
> (But this is missing the normalization by sigma which I can't see how to write in the desired format for the GLM, but I probably am just missing something.)
>
> However, the inverse link function cannot be written in a closed form solution. In fact it includes the normal cdf, which is a from of the error function and I just can't see how to write this in a closed form that is appropriate for the glmfit. Seems like the same is true for the link function.
I'm not sure where those 2's are coming from.
These are the probit link, link derivative, and inverse link functions that GLMFIT uses (edit private/stattestlink, called by glmfit):
link = @(mu) norminv(mu);
dlink = @(mu) 1 ./ normpdf(norminv(mu));
% keep mu = ilink(eta) in [eps, (1-eps)];
lowerBnd = norminv(eps(dataClass)); upperBnd = -lowerBnd;
ilink = @(eta) normcdf(min(max(eta,lowerBnd),upperBnd));
Given lambda, I believe you want
ilink = @(eta) lambda + ...
(1-lambda).*normcdf(min(max(eta,lowerBnd),upperBnd));
and therefore
link = @(mu) norminv((mu-lambda)./(1-lambda));
dlink = @(mu) 1 ./ ((1-lambda).*normpdf(norminv((mu-lambda)./(1-lambda))));
but you'll want to check my work. If your lambda is really .01, then you may not run into the problem with initial values that I alluded to, unless youhave more than 100 observations.
Hope this helps.
FYI, the 2 comes from the fact that people can lapse on both ends, so we want the cdf to range from lambda on the low end to (1-lambda) on the high end. For example, if lambda equals 0.01, then the lowest probability for very small values of the independent variable would be 0.01 and the highest probability for very large values of the independent variable would be 0.99.
Hope that makes sense. Will let you know if this works.
> FYI, the 2 comes from the fact that people can lapse on both ends, so we want the cdf to range from lambda on the low end to (1-lambda) on the high end. For example, if lambda equals 0.01, then the lowest probability for very small values of the independent variable would be 0.01 and the highest probability for very large values of the independent variable would be 0.99.
>
> Hope that makes sense. Will let you know if this works.
It does, but my versions of those three functions are not what you want. But I think the changes needed are obvious.
myilink = @(eta) L + (1-2*L).*normcdf(min(max(eta,lowerBnd),upperBnd));
mylink = @(mu) norminv((mu-L)./(1-2*L));
mydlink = @(mu) 1 ./ ((1-2*L).*normpdf(norminv((mu-L)./(1-L))));
If I fit with the standard probit model:
b = glmfit(motion,y,'binomial','link','probit');
everything works fine, so I know that my input vectors and general approach are correct.
However, when I fit using the following call (as I believe you had suggested):
b = glmfit(motion,y,'binomial','link',{mylink mydlink myilink});
I get the following error message:
??? Input argument "L" is undefined.
Error in ==> direction_detect>@(mu,L)norminv((mu-L)./(1-2*L)) at 64
mylink = @(mu,L) norminv((mu-L)./(1-2*L));
Error in ==> glmfit at 277
eta = linkFun(mu);
Error in ==> direction_detect at 77
b = glmfit(motion,y,'binomial','link',{mylink mydlink myilink});
First, I should probably let you know that L is defined and is a scalar. It is defined via the following three lines of m-file code:
lambda=0:.01:.06;
for j=1:length(lambda)
L = lambda(j);
so I believe that it both exists and is a scalar.
Now, if were to be calling my function(s) directly myself, I know that I could simply add my variable L in the function definition. For example:
myilink = @(eta, L) L + (1-2*L).*normcdf(min(max(eta,lowerBnd),upperBnd));
However, since I do not control the call function within glmfit, I do not see how to pass the L variable to my function(s).
Suggestions?
Thanks in advance,
Dan
Dan, you should set L before you define the functions above. Its value is
then incorporated into the function definitions.
It sounds like you're setting L in a loop. Then re-define these functions
each time through the loop, so they pick up the new value of L. I don't
think there will be a lot of overhead in doing that.
-- Tom
I think you want 1-2*L in the denom of the last one.
> However, when I fit using the following call (as I believe you had suggested):
> b = glmfit(motion,y,'binomial','link',{mylink mydlink myilink});
> I get the following error message:
>
> ??? Input argument "L" is undefined.
>
> Error in ==> direction_detect>@(mu,L)norminv((mu-L)./(1-2*L)) at 64
> mylink = @(mu,L) norminv((mu-L)./(1-2*L));
Where did that link function come from? It isn't the one you cited above, and it has L as an argument. L should not be an input argument. Let the anonomous function "captures" L by creating a new mylink at each iteration inside this loop:
You are correct. My mistake.
> > mylink = @(mu,L) norminv((mu-L)./(1-2*L));
>
> Where did that link function come from? It isn't the one you cited above, and it has L as an argument. L should not be an input argument. Let the anonomous function "captures" L by creating a new mylink at each iteration inside this loop:
>
> > lambda=0:.01:.06;
> > for j=1:length(lambda)
> > L = lambda(j);
You are correct. I tried using the function above after I had tried:
mylink = @(mu) norminv((mu-L)./(1-2*L));
Both gave the same error message.
I think that it is working now. When I am certain, would you like the "final" code. I simply modified the example code that you had provided earlier with the updated model and model fit, etc.
> I think that it is working now. When I am certain, would you like the "final" code. I simply modified the example code that you had provided earlier with the updated model and model fit, etc.
Post it on the MATLAB Central File Exchange!