Error std::bad_alloc [origin: bad_alloc] [origin: bad_alloc]

901 views
Skip to first unread message

Fabian Crespo

unread,
Sep 9, 2016, 9:49:02 AM9/9/16
to Stan users mailing list
Hello, I am trying to fit a model in  rstan and I get the error std::bad_alloc [origin: bad_alloc] [origin: bad_alloc]. Am using Windows 7, Rstudio 0.99.903 , Rstan 2.11.1 and gcc-4.6.3 

Not all chains fail, but some chains fail with this error .
Here is my code 


functions{
  real loglikelihood(int N,
                     real mu,
                     real k,
                     real p,
                     real c, 
                     real q, 
                     real d, 
                     real alpha, 
                     real gamma, 
                     real eta,
                     vector t, 
                     vector magnitudes, 
                     vector dif_tiempos,
                     vector dif_profundidades,
                     vector latitudes, 
                     vector longitudes, 
                     vector profundidades,
                     vector factor_cuadraticoAni,
                     vector factor_cuadraticoIso,
                     real tmax,
                     real magnitud0,
                     real lat_min,
                     real lat_max,
                     real long_min,
                     real long_max,
                     real radio,
                     real profundidad_capa
                     ){
    
    real tasa_sismicidad[N];
    real integral_tasa[N];
    
    real integral_mu;
    real log_verosimilitud;
    tasa_sismicidad[N]=log(mu);
    integral_tasa[N]=0;
    for(j in 1:(N-1)){
      vector[N-j] y;
      vector[N-j] z;
      vector[N-j] x;
      vector[N-j] w;
      vector[N-j] v;
     
      real temp;
      real temp1;
      real temp2;
      real temp3;
      int inicio;
      int fin;
      inicio =N*(j-1)-(j*(j-1))/2 + 1;
      fin =j*N-(j*(j+1))/2;
      
      
      y=dif_tiempos[inicio:fin];
      z=exp(-q*log(factor_cuadraticoAni[inicio:fin]./(exp(gamma*(magnitudes[(j+1):]-rep_vector(magnitud0,N-j))))+rep_vector(d,N-j)));
      y=(k*alpha*(p-1)*c^(p-1)*(q-1)*d^(q-1)*(1/pi()))*exp((alpha-gamma)*(magnitudes[(j+1):]-rep_vector(magnitud0,N-j))).*exp(-p*log(y+rep_vector(c,N-j)));
      x=y .* z ;
      z=(1/profundidad_capa) * dif_profundidades[inicio:fin];
      y=profundidades[(j+1):N];
      for(i in 1:(N-j)){
           
         w[i]=(1/(profundidad_capa * exp(lbeta(eta * ((1/profundidad_capa) * y[i]) + 1,eta - eta * ((1/profundidad_capa) * y[i])+ 1)))) * (z[i] ^ (eta * ((1/profundidad_capa) * y[i]))) * ((1 - z[i]) ^ (eta-eta * ((1/profundidad_capa) * y[i])));

          v[i]=(beta_cdf(1-((1/profundidad_capa) * y[i]), eta * ((1/profundidad_capa) * y[i]) + 1,eta-eta * ((1/profundidad_capa) * y[i])+1)) / (exp(lbeta(eta * ((1/profundidad_capa) * y[i]) + 1,eta - eta * ((1/profundidad_capa) * y[i])+ 1))) ;
      }
     
     
      tasa_sismicidad[j]=log(mu+sum(x .* w));
  
      temp=exp(alpha*(magnitudes[j+1]-magnitud0));
      temp1=exp(gamma*(magnitudes[j+1]-magnitud0));
      temp2=max(factor_cuadraticoAni[inicio:fin]);
      
      
      integral_tasa[j]=k*alpha*temp*(1-c^(p-1)/((dif_tiempos[j]+c)^(p-1)))*(1-d^(q-1)/((temp2/(temp1)+d)^(q-1))) * sum(v);
      
      
    }
   
    integral_mu=mu*tmax*pi()*profundidad_capa*radio^2;
    log_verosimilitud=sum(tasa_sismicidad)-integral_mu-sum(integral_tasa);
    return(log_verosimilitud);
  }
data{
  int<lower=0> N;
  vector[N] tiempo;
  real<lower=0> tiempo_max;
  vector[N] magnitudes;
  vector[N*(N-1)/2] factor_cuadraticoAni;
  vector[N*(N-1)/2] factor_cuadraticoIso;
  vector[N*(N-1)/2] dif_tiempos;
  vector[N*(N-1)/2] dif_profundidades;

  real<lower=0> magnitud_corte;
  vector[N] latitudes;
  vector[N] longitudes;
  vector[N] profundidades;
  real lat_min;
  real lat_max;
  real long_min;
  real long_max;
  real<lower=0> radio;
  real<lower=0> profundidad_capa;
}
parameters{
  real<lower=0> mu;
  real<lower=0> k;
  real<lower=1.000005> p;
  real<lower=0> c;
  real<lower=0> d;
  real<lower=1.00005> q;
  real<lower=0> alpha;
  real<lower=0> gamma;
  real<lower=0> eta;
  
}
model{

  mu~exponential(2);
  k~exponential(2);
  p~exponential(2);
  c~exponential(2);
 d~exponential(2);
  q~exponential(2);
  eta~exponential(2);
  gamma~exponential(2);
  alpha-gamma~exponential(5);
  
  increment_log_prob(loglikelihood(N,mu,k,p,c,q,d,alpha,gamma,eta,tiempo,magnitudes,dif_tiempos,dif_profundidades,latitudes,longitudes,profundidades,factor_cuadraticoAni,factor_cuadraticoIso,tiempo_max,magnitud_corte,lat_min,lat_max,long_min,long_max,radio,profundidad_capa));
}


And this ir the error output


SAMPLING FOR MODEL 'spETASespaciotemporalAniProfundidad' NOW (CHAIN 2).
[1] "Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                        
[2] "Exception thrown at line 131: Exception thrown at line 65: std::bad_alloc [origin: bad_alloc] [origin: bad_alloc]"                     
[3] "If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"
[4] "but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."                              
[5] "Rejecting initial value:"                                                                                                              
[6] "  Error evaluating the log probability at the initial value."                                                                          
[7] "Error : "                                                                                                                              
[1] "error occurred during calling the sampler; sampling not done"

SAMPLING FOR MODEL 'spETASespaciotemporalAniProfundidad' NOW (CHAIN 3).
[1] "Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                        
[2] "Exception thrown at line 131: Exception thrown at line 63: std::bad_alloc [origin: bad_alloc] [origin: bad_alloc]"                     
[3] "If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"
[4] "but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."                              
[5] "Rejecting initial value:"                                                                                                              
[6] "  Error evaluating the log probability at the initial value."                                                                          
[7] "Error : "                                                                                                                              
[1] "error occurred during calling the sampler; sampling not done"
spETASespaciotemporalAniProfundidad.stan

Daniel Lee

unread,
Sep 9, 2016, 10:00:52 AM9/9/16
to stan-users mailing list
Hi Fabian,

That's indicating there's a memory problem. Questions for you:
  • are you running this in parallel from R? If so, can you try running one chain at a time?
  • when you're running it, can you open up Task Manager and see if you're running out of RAM?
  • what's N? Is N ever 1?
  • what's j? Is j ever larger than N? In v2.11, there was an issue with negative integers being cast to an unsigned type which tried to allocate really large arrays. The next version will have that fixed.

Daniel


--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+unsubscribe@googlegroups.com.
To post to this group, send email to stan-...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Fabian Crespo

unread,
Sep 9, 2016, 10:19:19 AM9/9/16
to stan-...@googlegroups.com
Hi Daniel 
Thanks for answering my question.

Am not running the code in parallel. I am using :

fit = stan(file="C:\\Users\\GOETHE\\spETASespaciotemporalAniProfundidad.stan", data=dat, iter=20, init=initf, chains=4,refresh=1,control = list(adapt_delta = 0.9999))

During the running the physical memory use gets to 85% and my computer has 8 Gb of RAM


N is the number of earthquakes and is 765 and is never 1.So and looping over the earthquakes(index j) and then over the earthquakes before in time(index i) that the current eartquake j . I have the data ordered by time in decreasing order so earthquake j has preceding earthquakes in j+1:N.

Should I install another version of Rstan??

Thanks

Daniel Lee

unread,
Sep 9, 2016, 10:27:43 AM9/9/16
to stan-users mailing list
The next version of RStan isn't out yet, but that won't help you.

If I remember correctly, although you might have a 64-bit machine, I believe some part of the RTools toolchain is still 32-bit. I think it might be the linker. I'm not absolutely sure.

If you're willing to experiment, I'd try upgrading R and RTools. The latest version of RTools has g++-4.9.3. I'm hoping that's 64-bit, but I haven't tested.


Also... you can do things to make your Stan program more memory efficient. That might help. I didn't have much time to look, but you don't need to do this: log(y+rep_vector(c,N-j)
You can do this instead: log(y + c)

That was just something I saw while scanning through your code. I really haven't looked deeper into it.


Daniel




Fabian Crespo

unread,
Sep 9, 2016, 10:50:52 AM9/9/16
to stan-...@googlegroups.com
Hi Daniel 

Thanks, great tip!! . 

My R vesion is 3.3.1  and my version of Rtools is 3.3.

Regarding the code, I am using a doble loop(O(N^2)) because the functions related the beta distribution  beta_cdf and beta_lpdf receive reals(we can pass vectors) but return the sum ( a real) and not a vector(and I need  a vector that is elementwise  the pdf of the Beta distribution in some parameters vector). 

Is there any way to use funcional programming inside Rstan , that is apply beta_cdf to a vector of parameters(shape) and return a vector??

By the way, I am still getting the error.

Thanks in advance for the help.

Bob Carpenter

unread,
Sep 9, 2016, 1:25:24 PM9/9/16
to stan-...@googlegroups.com

> On Sep 9, 2016, at 10:50 AM, Fabian Crespo <crespofa...@gmail.com> wrote:
>
> Hi Daniel
>
> Thanks, great tip!! .
>
> My R vesion is 3.3.1 and my version of Rtools is 3.3.
>
> Regarding the code, I am using a doble loop(O(N^2)) because the functions related the beta distribution beta_cdf and beta_lpdf receive reals(we can pass vectors) but return the sum ( a real) and not a vector(and I need a vector that is elementwise the pdf of the Beta distribution in some parameters vector).
>
> Is there any way to use funcional programming inside Rstan , that is apply beta_cdf to a vector of parameters(shape) and return a vector??

Afraid not.

> By the way, I am still getting the error.

What OS are you running? Do you get the error if you simulate
a smaller data set?

- Bob

Fabian Crespo

unread,
Sep 9, 2016, 6:52:42 PM9/9/16
to stan-...@googlegroups.com
Hello Bob

I am using Windows 7 64 bits with 8GB RAM,  R version 3.3.1  and my version of Rtools is 3.3,Rstudio 0.99.903 , Rstan 2.11.1 and gcc-4.6.3 


I fit a simpler model(not considering the depths) to the same earthquake data and the results were perfect. But in the case of the hypocentral model(taking the depths into account) I get the error std::bad_alloc [origin: bad_alloc] [origin: bad_alloc],

The error occurs inside the inner loop, and we I try to fill the vectors w y v. 

How to trace the inner loop in order to detect bugs???


Thanks in advance 
 

Bob Carpenter

unread,
Sep 9, 2016, 7:54:26 PM9/9/16
to stan-...@googlegroups.com
The only real way to debug is with print statements,
perhaps embedded in conditionals.

If you're running out of memory, you can always cut down on
the number of operations. But if the model won't fit even
on a small set of simulated data, then there may be
something else going on.

Just this expression recalculates 1/profundidad_cap several
times per loop iteration:

w[i]=(1/(profundidad_capa * exp(lbeta(eta * ((1/profundidad_capa) * y[i]) + 1,eta - eta * ((1/profundidad_capa) * y[i])+ 1)))) * (z[i] ^ (eta * ((1/profundidad_capa) * y[i]))) * ((1 - z[i]) ^ (eta-eta * ((1/profundidad_capa) * y[i])));

Better to calculate it on the outside of the loop and reuse it. Same
with eta * (1 / profundidad_capa).

- Bob
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages