config=TRUE run time in latest version of R?


whee...@gmail.com

Mar 6, 2024, 9:54:40 AM
to R-inla discussion group
Good morning,

I recently updated R to version 4.3.2 and re-installed INLA. I typically use the control.compute = list(config = TRUE) option, and I'm wondering whether new settings in the posterior calculation make it run longer in the current version than it did, say, last year. The same models now run for quite a bit longer than they did before I updated R.

I can remove this setting when I don't need to sample from the posterior, and I'm wondering whether that will help speed things up. Or is there another setting I should look into?

Thanks!
Katie
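
A minimal sketch of the idea in this question: switch `config` on only for runs whose result will later be passed to `inla.posterior.sample()`, which needs `config = TRUE`. The helper name `make_control_compute` and its `need_samples` argument are hypothetical and not part of INLA, and whether dropping `config` changes run time will depend on the model and the memory available.

```r
# Hypothetical helper (not part of INLA): build the control.compute list,
# enabling config only when posterior samples will be drawn afterwards.
make_control_compute <- function(need_samples = FALSE, dic = TRUE, waic = TRUE) {
  stopifnot(is.logical(need_samples), length(need_samples) == 1)
  list(config = need_samples, dic = dic, waic = waic)
}

cc_fast   <- make_control_compute(need_samples = FALSE)  # summaries only
cc_sample <- make_control_compute(need_samples = TRUE)   # sampling later

# Usage, assuming a formula `form` and a data frame `mydata` already exist:
# fit   <- inla(form, data = mydata, control.compute = cc_sample)
# draws <- inla.posterior.sample(100, fit)
```

Keeping the switch in one place makes it easy to rerun the same model with and without `config` and compare timings.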

INLA help

Mar 6, 2024, 12:01:58 PM
to R-inla discussion group, whee...@gmail.com
I would guess the change comes from the stricter convergence criteria and from how the optimization is done. This would happen for a very flat posterior. Otherwise I would expect it to run better…

I can rerun it here to check if you like.

Haavard Rue

whee...@gmail.com

Mar 6, 2024, 12:50:27 PM
to R-inla discussion group
Thank you! The data are confidential, so I cannot share them, but I'll try removing config=TRUE since I don't need to sample from the posterior in this case.

whee...@gmail.com

Mar 6, 2024, 12:53:24 PM
to R-inla discussion group
Update: that improved speed a great deal. The model converged and I got results in under 5 min :)

Helpdesk (Haavard Rue)

Mar 6, 2024, 1:21:29 PM
to whee...@gmail.com, R-inla discussion group
config=TRUE will not add CPU time, only storage, so if the run
takes much longer with it, then maybe you're hitting the RAM ceiling...

--
Håvard Rue
he...@r-inla.org
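
One rough way to check the RAM explanation from the R side is to look at how large the fitted object is and how much memory the R session holds. This is only a sketch using base R; the helper names `object_size_mb` and `r_memory_used_mb` are hypothetical. It sees only memory held by R itself, so memory used by the external inla program while it runs would have to be watched with system tools instead.

```r
# Size of an R object in megabytes (utils::object.size reports bytes).
object_size_mb <- function(x) {
  as.numeric(utils::object.size(x)) / 1024^2
}

# Memory currently used by the R session in MB: the sum of the "used (Mb)"
# column of gc() over its Ncells and Vcells rows. Calling gc() also
# triggers a garbage collection.
r_memory_used_mb <- function() {
  g <- gc(verbose = FALSE)
  sum(g[, 2])
}

x <- numeric(1e6)      # stand-in for a fitted model object
object_size_mb(x)      # about 7.6 MB for one million doubles
r_memory_used_mb()
```

Comparing `object_size_mb(fit)` for fits made with and without `config = TRUE` would show how much extra storage the option adds for a given model.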

Chris Schmidt

Mar 25, 2024, 11:57:59 AM
to R-inla discussion group

Hello, I am experiencing something similar to what the original poster described, in that models that used to take around 4–7 hours to finish, when we ran them about a year ago, are now stuck endlessly at the final integration stage (72 hours and still not finished). This is happening even with the identical data sets, model specifications, and INLA builds, and with what should be plenty of RAM (based on previous experience with these models). Does this suggest that the culprit is something that has changed with our institution's cluster configuration?

However, I'm curious about the comment above about stricter convergence criteria and changes to how optimization is done. Prof. Rue, can you say anything more about which INLA builds you were referring to with that comment? Do you have any general suggestions for troubleshooting this problem?

Our model specification is as follows:

form <- y ~ -1 +
  f(b0, model = 'linear', prec.linear = 0.1) +
  f(mcnty, model = 'bym2', graph = mat, values = unique(mcnty)) +
  f(ID.year, model = 'ar1', replicate = as.integer(as.factor(county_edu))) +
  f(ID.year2, model = 'ar1', replicate = as.integer(mcnty)) +
  f(edu, model = 'iid')

The INLA call is:

model <- inla(form, family = "binomial",
              data = inla.stack.data(stk.y), Ntrials = Ntrials,
              control.predictor = list(A = inla.stack.A(stk.y), link = 1),
              control.inla = list(strategy = "gaussian", int.strategy = "eb", h = 1e-3),
              control.compute = list(config = TRUE, waic = FALSE, smtp = "taucs"),
              verbose = TRUE, debug = FALSE)

Thank you very much,

Chris

Havard Rue

Mar 25, 2024, 12:06:50 PM
to R-inla discussion group, Chris Schmidt
That is weird. Any chance I can rerun it here?

Håvard Rue
Professor of Statistics
Statistics Program, CEMSE Division
King Abdullah University of Science and Technology
Thuwal 23955-6900, Saudi Arabia

Email: haava...@kaust.edu.sa
Office: +966 (0)12 808 0640
Mobile: +966 (0)54 470 0421
Research group: bayescomp.kaust.edu.sa
R-INLA project: www.r-inla.org
Zoom: kaust.zoom.us/my/haavard.rue

whee...@gmail.com

Apr 4, 2024, 1:44:22 PM
to R-inla discussion group

One other thing I've noticed recently is that INLA often hangs when I run a model several times in a loop, passing different parameters on each iteration. The same code used to work in the older version of INLA (using R version 4.2.2; I’m now using R version 4.3.2) and would give me results for upwards of 10 model iterations within an hour or two. In the latest version it usually gets stuck in the loop, typically around the second iteration and especially for more complex models with n>5000, but it never actually crashes. I have to stop it after many hours of hanging. If I run each iteration manually, though, they often run quickly, within a matter of minutes for each model. Sometimes one of the iterations will “hiccup” and get stuck; I stop it, do a gc(), try again, and it’s ok. Are these also likely RAM issues?

Thanks!

Here is an example of my code (again, sorry I cannot share the data):

#### LOOP OD OUTCOMES ####

library(INLA)   # inla()
library(dplyr)  # bind_rows()

# filter data on race and specify data frame and graph

race <- "All"
mydata <- my_data_all_counties
mygraph <- Cnty2020_all_counties.adj

#create dataframe to hold results
all_output1 <- data.frame(name = character(), model=character(), outcome=character(), variable=character(),  est=numeric(), lci=numeric(), uci=numeric(),  n_rmse=numeric())

set.seed(12345)

# run outer loop for each outcome
for (outcome in c("OD1", "OD2", "OD3", "OD4", "OD5")) {

  outcome_count <- mydata[[paste0(outcome, "Count")]]
  outcome_rate <- mydata[[paste0(outcome, "Rate100k")]]
  outcome_rate_lag <- mydata[[paste0(outcome, "Rate100k_Lag")]]
  # small constant avoids log(0) when the lagged rate is zero
  log_outcome_rate_lag <- log(outcome_rate_lag + 0.001)

  y <- outcome_count
  # expected counts if every row had the overall rate
  E <- (sum(outcome_count) / sum(mydata$Population)) * mydata$Population
 
 
  inla_main2a <- data.frame(model=character(), variable=character(), est=numeric(), lci=numeric(), uci=numeric(), n_rmse=numeric(), dic=numeric(), waic=numeric())
 
  # run inner loop for versions 1-4 of the key variables
  for (i in 1:4) {

    # build each column name, then pull that column from mydata
    rx_var <- paste0("group", i, "_rx_chg_h")
    rx_domain_chg_h <- mydata[[rx_var]]

    tx_var <- paste0("group", i, "_tx_chg_h")
    tx_domain_chg_h <- mydata[[tx_var]]

    hr_var <- paste0("group", i, "_hr_chg_h")
    hr_domain_chg_h <- mydata[[hr_var]]
   
    inla_form2a <-   y ~ 1 +
        f(countynum, model='bym', graph=mygraph)+
        f(YearNum10,model="rw1", constr=TRUE) +
        f(YearNum10b,model="iid", constr=TRUE) +
        f(ID.county.year,model="iid", constr=TRUE) +
        offset(log_outcome_rate_lag)+
        factor(Year)+
        factor(State)+
        Deathp1000 +
        pAge0to17 +
        pAge18to24 +
        pAge25to44 +
        pAge45to64 +
        pMale +
        pWhite +
        pBlack +
        pHispanic +
        pFamInPov +
        MHI_10K +
        pUnemployed +
        PopDens1000pSqMi_n +
        RatioTotalFentToTopOpioidAll100+
        rx_domain_chg_h +
        tx_domain_chg_h +
        hr_domain_chg_h
     
   
   
    inla_mod2a <- inla(inla_form2a, family = "nbinomial",
                       data = mydata, E = E,
                       verbose = TRUE,
                       control.compute = list(dic = TRUE, waic = TRUE, cpo = TRUE, config = FALSE))
   
   
    # fixed effects: posterior mean and 2.5% / 97.5% quantiles
    inla_res2a <- inla_mod2a$summary.fixed[, c("mean", "0.025quant", "0.975quant")]

    # RMSE of the fitted counts, normalized by the SD of the observed counts
    inla_rmse2a <- sqrt(mean((outcome_count - inla_mod2a$summary.fitted.values[, "mean"] * E)^2)) / sd(outcome_count)

    inla_dic2a <- inla_mod2a$dic$dic
    inla_waic2a <- inla_mod2a$waic$waic

    # append this model's exponentiated estimates and fit statistics
    inla_main2a <- rbind(
      inla_main2a,
      cbind(model = paste0("Model", i), var = rownames(inla_res2a), exp(inla_res2a),
            n_rmse = inla_rmse2a, dic = inla_dic2a, waic = inla_waic2a)
    )
  }
  # end inner loop
 
  colnames(inla_main2a)<-c("model", "variable", "est", "lci", "uci", "n_rmse", "dic", "waic")
 
  all_output1 <- bind_rows(all_output1, cbind(name="inla_main_adj", outcome=paste0(outcome,"_Count"),inla_main2a))
 
}

#end outer loop


#print and export results

all_output1

write.csv(all_output1, paste0(mypath, race,Sys.Date(),".csv"))
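
For a loop like the one above, one possible way to see which iteration stalls, and to stop a failed fit from ending the whole run, is to wrap each fit so that it logs elapsed time, catches errors, keeps only a small summary, and calls gc() before the next iteration. The sketch below uses base R only; `run_one`, `fit_fun`, and `extract` are hypothetical names, and the toy functions stand in for real inla() calls. It does not impose a time limit, so a fit that truly hangs would still block the loop.

```r
# Run one fit, log elapsed time, and capture errors instead of stopping.
# `fit_fun` is a hypothetical zero-argument function that would call inla();
# `extract` reduces the (possibly large) fit to the pieces worth keeping.
run_one <- function(label, fit_fun, extract = identity) {
  started <- Sys.time()
  out <- tryCatch(extract(fit_fun()), error = function(e) e)
  elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
  ok <- !inherits(out, "error")
  message(sprintf("%s: %s after %.1f s", label,
                  if (ok) "done" else "failed", elapsed))
  gc(verbose = FALSE)  # release memory from the discarded fit object
  list(label = label, ok = ok, elapsed = elapsed,
       value = if (ok) out else NULL,
       error = if (ok) NA_character_ else conditionMessage(out))
}

# Demo with toy "fits"; the second one fails on purpose.
toy_fits <- list(
  Model1 = function() mean(c(1, 2, 3)),
  Model2 = function() stop("did not converge"),
  Model3 = function() mean(c(4, 5, 6))
)
results <- lapply(names(toy_fits), function(nm) run_one(nm, toy_fits[[nm]]))
ok <- vapply(results, function(r) r$ok, logical(1))
```

In real use, `extract` could return something like `fit$summary.fixed` so that the full model object is dropped before the next iteration starts.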

Helpdesk (Haavard Rue)

Apr 4, 2024, 4:54:59 PM
to whee...@gmail.com, R-inla discussion group

That sounds weird. Are you sure you are using a recent version of R-INLA? If
problems still occur, please share with he...@r-inla.org so we can rerun.

H
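
When answering the version question, it can help to paste the exact versions in use. The sketch below collects them with base R; `report_versions` is a hypothetical helper name, and INLA is queried only if it is installed, so the code also runs on a machine without it.

```r
# Collect version details worth including in a report to the list.
report_versions <- function() {
  inla_version <- if (requireNamespace("INLA", quietly = TRUE)) {
    as.character(utils::packageVersion("INLA"))
  } else {
    NA_character_  # INLA not installed on this machine
  }
  list(
    r        = R.version.string,
    platform = R.version$platform,
    inla     = inla_version
  )
}

versions <- report_versions()
str(versions)
```

Running this under both the old and the new setup gives a side-by-side record of what changed between them.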
