The gradient jumps from 5.48908e+28 to 0 (exactly). That should happen
more gradually.
I suspect something is wrong with your model. See suggestions in
"3.8 Building a random effects model that works" in the Re-manual.
Also, I would start with someting really simple.
Hans
> _______________________________________________
> Users mailing list
> Us...@admb-project.org
> http://lists.admb-project.org/mailman/listinfo/users
>
>
_______________________________________________
Users mailing list
Us...@admb-project.org
http://lists.admb-project.org/mailman/listinfo/users
Dave
g = -nroute*log_sigma_mu - 0.5*norm2(mu_r)/mfexp(2*log_sigma_mu);
g -= -nobs*log_sigma_y - 0.5*norm2(y - Ey)/mfexp(2*log_sigma_y);
causes the inner maxg = 0 infinite loop, whereas
g -= -nroute*log_sigma_mu - 0.5*norm2(mu_r)/mfexp(2*log_sigma_mu);
g -= -nobs*log_sigma_y - 0.5*norm2(y - Ey)/mfexp(2*log_sigma_y);
appears to work (notice g is only decremented, not initialized). Some
examples initialize the objective function (logistic in admb-re
examples) and others do not (e.g., buscycle in admb examples). Which
is correct?
Dave, I've posted my tpl file is below as you generously requested.
Be warned, it's ugly right now -- I'm working on replacing the hideous
indexing with ragged 3d arrays.
Thanks,
Ian
DATA_SECTION
init_int nobs // Number of data points
init_int nroute // Number of routes
init_int nrouteyr // Number of unique
route*year combinations
init_int ncovs // Number of predictors
for each parameters
init_int nparams // Number of parameters
in the seasonal model
init_vector y(1,nobs) // Response vector
init_vector jd(1,nobs) // Julian day
init_vector route(1,nobs) // Vector of routes
init_vector routeyr(1,nobs)
init_vector routeyr_route(1,nrouteyr)
init_matrix X(1,nrouteyr,1,ncovs) // Design matrix of covariates
PARAMETER_SECTION
init_bounded_vector mu_pars(1,ncovs,-10,10,1)
init_bounded_vector beta_pars(1,ncovs,-10,10,1)
init_bounded_vector phi_pars(1,ncovs,-10,10,1)
init_bounded_number log_sigma_y(-5,5,1)
init_bounded_number log_sigma_mu(-5,5,2)
random_effects_vector mu_r(1,nroute,2)
vector Emu_siteyr(1,nrouteyr)
vector Ebeta_siteyr(1,nrouteyr)
vector Ephi_siteyr(1,nrouteyr)
vector mu(1,nrouteyr)
vector beta(1,nrouteyr)
vector phi(1,nrouteyr)
vector Ey(1,nobs)
objective_function_value g
GLOBALS_SECTION
#include <df1b2fun.h>
PROCEDURE_SECTION
Emu_siteyr = X * mu_pars;
Ebeta_siteyr = X * beta_pars;
Ephi_siteyr = X * phi_pars;
// Create route-yr level seasonal curve parameters
for (int i = 1; i <= nrouteyr; i++) {
mu(i) = Emu_siteyr(i) + mu_r(routeyr_route(i));
beta(i) = mfexp(Ebeta_siteyr(i));
phi(i) = mfexp(Ephi_siteyr(i));
}
// Compute expected value for an observation
for (int i = 1; i <= nobs; i++) {
dvariable z = (jd(i) - mu(routeyr(i)))/beta(routeyr(i));
Ey(i) = phi(routeyr(i)) * mfexp(-z + 1) * mfexp(-mfexp(-z));
}
g = -nroute*log_sigma_mu - 0.5*norm2(mu_r)/mfexp(2*log_sigma_mu);
g -= -nobs*log_sigma_y - 0.5*norm2(y - Ey)/mfexp(2*log_sigma_y);
REPORT_SECTION
report << mu_pars << endl;
report << beta_pars << endl;
report << phi_pars << endl;
TOP_OF_MAIN_SECTION
arrmblsize = 40000000L;
gradient_structure::set_GRADSTACK_BUFFER_SIZE(3000000);
gradient_structure::set_CMPDIF_BUFFER_SIZE(200000);
gradient_structure::set_MAX_NVAR_OFFSET(60000);
RUNTIME_SECTION
maximum_function_evaluations 500
convergence_criteria 1.e-3
--
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com
g = nroute*log_sigma_mu + 0.5*norm2(mu_r)/mfexp(2*log_sigma_mu);
g += nobs*log_sigma_y+0.5*norm2(y - Ey)/mfexp(2*log_sigma_y);
Also it is better to handle the std dev parameterization differently.
--
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com
but anyway I think the sign is wrong. I think it should be
g = nroute*log_sigma_mu + 0.5*norm2(mu_r)/mfexp(2*log_sigma_mu);
g += nobs*log_sigma_y+0.5*norm2(y - Ey)/mfexp(2*log_sigma_y);
Also it is better to handle the std dev parameterization differently.
I think Dave probably means his trick where you assume that the random effect is N(0,1) and when you use it you multiply it by the sd
Pred=mu+y*sd
g=0.5*norm2(y)
>I think Dave probably means his trick where you assume that th>e random
>effect is N(0,1) and when you use it you multiply it by the sd
>Pred=mu+y*sd
>g=0.5*norm2(y)
It is not really a trick. Just a different parameterization of the
problem. The point is that when sd-->0 it acts nicely. So you cant start
the model off with a small value of sd such as .05 or .01 which is
important for some more numerically difficult problems. Also you should
use the concentrated likelihood as in
init_bounded_number u(-0.5,1.5)
g = 0.5*norm2(mu_r);
dvariable r2=norm2(y - Ey);
dvariable vhat=r2*u/nobs;
g += 0.5*nobs*log(vhat) + 0.5*r2/vhat;
This replaces the numerically more difficult log_sigma
with the dimensionless quantity u whose value is known to be 1.
Also for comparison purposes you should add the log(sqrt(2*pi)) terms
to the -log-likelihood.
Finally don't tell us how good ADMB is for some problems. Tell the
ordinary users on the R list. We already know and they will never hear
it from the "experts" there.
> Finally don't tell us how good ADMB is for some problems. Tell the ordinary users on the R list.
> We already know and they will never hear it from the "experts" there.
It seems like more people from the R crowd are getting ready to take
the leap towards trying to harness ADMB's power -- there's a proposal
for Google's Summer of Code 2010 for establishing better interfaces
between either R and either ADMB or CppAD. The summary is at
http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:adinr.
It looks like they are already connected to the cppAD group -- is
anyone here in touch with them about ADMB?
Ian