Benchmarking RTMB and TMB: ways to improve RTMB::MakeADFun speed?

257 views
Skip to first unread message

ada...@uw.edu

unread,
Jan 7, 2025, 2:41:48 PMJan 7
to TMB Users
Hi All,

I am considering moving a multi-species stock assessment modelling package from TMB to RTMB. However, I am finding the MakeADFun in RTMB is prohibitively slower than in TMB (> 1 hour in RTMB vs 2 seconds). I am guessing it is due to my use of multi-dimensional arrays. TMB is great, but for future development, the benefits of developing using in RTMB are enticing and future of TMB seems to be an open ended question? However, I am interested in doing various simulation experiments, so reducing run time in MakeADFun is a big drive.

Apart from vectorizing the R code, are there any suggestions for speeding up MakeADFun in RTMB? 

For simulation experiments, I can't use "DataEval" because the parameter/data dimensions change. Hacking it by doing MakeADFun on a full dimension model with an indicator of whether to include data in the likelihood and penalizing parameters to 0 by setting "nll += param^2" breaks the AD when building (see RTMB DataEval Test.R). Would there be any other DataEval workarounds?

Benchmarking the vectorized "linreg" examples in TMB and RTMB also suggests that RTMB is just slower no matter what:
TMB vs RTMB benchmark.png

RTMB DataEval Test.R
RTMB vs TMB linreg benchmark.R

Kasper Kristensen

unread,
Jan 8, 2025, 7:18:22 AMJan 8
to TMB Users
The big performance penalty you observe is likely due to elementwise array sub-assignment, which is very slow in RTMB. To demonstrate, here are two versions of the same function. The first is more than 1000 times slower to tape than the second:

n <- 1e5
f1 <- function(x) {
    a <- AD(array(0, n))
    for (i in 1:n) {
        a[i] <- x
    }
    sum(a)
}
f2 <- function(x) {
    a <- AD(array(0, n))
    a[1:n] <- x
    sum(a)
}
> system.time(RTMB::MakeADFun(f1, 1))
   user  system elapsed
 44.741   3.976  48.722
> system.time(RTMB::MakeADFun(f2, 1))
   user  system elapsed
  0.024   0.004   0.028

Why the big difference? Because R re-allocates the entire array on each sub-assignment (which can be verified by .Internal(inspect(a)) inside the loop). As far as I know, this is the only serious performance concern with RTMB. If you avoid it, the performance difference vs normal TMB should be insignificant.

Regarding the 'linreg' benchmark it is a bit misleading when comparing RTMB vs TMB. First, it is probably the only example in the TMB test suite for which the conclusion wouldn't be the opposite! (try instead 'spde', 'hmm', 'spatial' etc). Second, contrary to what one might think, this benchmark demonstrates differences between two AD frameworks (TMBad used by RTMB vs CppAD used by TMB) and not 'R vs C++'.
FWIW, the 'linreg' benchmark can become much faster in RTMB by enabling vectorized operations on the tape rather than using the tape optimizer (which is then redundant):

RTMB::TapeConfig(vectorize="enable") ## Use vectorized operations on the tape
TMB::config(optimize.instantly=0,DLL="RTMB") ## Disable the tape optimizer

However, these setting are not generally recommended!

Ben Bolker

unread,
Jan 8, 2025, 9:29:24 AMJan 8
to tmb-...@googlegroups.com
Would it be possible/does it make sense to define an RTMB-specific
function setval(x, indices, value) that does in-place setting? (It would
violate R copy-on-modify semantics, but users would be responsible for
knowing that ...)

Related project: https://github.com/coolbutuseless/insitu
> TMB vs RTMB benchmark.png
>
> --
> To post to this group, send email to us...@tmb-project.org. Before
> posting, please check the wiki and issuetracker at https://github.com/
> kaskr/adcomp/ <https://github.com/kaskr/adcomp/>. Please try to create a
> simple repeatable example to go with your question (e.g issues 154, 134,
> 51). Use the issuetracker to report bugs.
> ---
> You received this message because you are subscribed to the Google
> Groups "TMB Users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to tmb-users+...@googlegroups.com <mailto:tmb-
> users+un...@googlegroups.com>.
> To view this discussion visit https://groups.google.com/d/msgid/tmb-
> users/dfca8115-5209-4110-abb7-365134d33686n%40googlegroups.com <https://
> groups.google.com/d/msgid/tmb-users/dfca8115-5209-4110-
> abb7-365134d33686n%40googlegroups.com?utm_medium=email&utm_source=footer>.

--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.

ada...@uw.edu

unread,
Jan 8, 2025, 8:16:29 PMJan 8
to TMB Users
  Thanks Kasper! Benchmarking the SAM example using TMBad framework shows the same differences for me:

Model Function TMB RTMB 1 SAM MakeADFun 0.0128 0.0941 2 SAM Fit 0.6694 0.6694 3 SAM Sdreport 0.1222 0.2138 4 SAM fn 0.0026 0.0023 5 SAM gr 0.0034 0.0033
 
Same thing with the arrays I am guessing? I think population dynamics models (particularly multi-species like the predation loop below) will be very difficult or impossible to vectorize sub-assignment, so base TMB is probably the move for speed.

for(ksp in 1:nspp) { # Prey lop
  for(k_sex in 1:nsex[ksp]) { # Prey sex loop
    for(k_age in 1:nages[ksp]) { # Prey age loop
      for(yr in 1:nyrs) {
      # Type 2 MSVPA predation mortality
      # - Sum average N * ration * suitability / available food across all predators for each prey
      M2[ksp, k_sex, k_age, yr] <-
        sum(avgN_at_age[, , , yr] *
              ration[, , , yr] *
              suit_main[, , , ksp, k_sex, k_age, yr] /
              avail_food[, , , yr])
    }
  }
}

Unless Ben's suggestion would avoid the errors from using DataEval to update indices that turn on/off data/parameters?

ada...@uw.edu

unread,
Jan 8, 2025, 8:18:26 PMJan 8
to TMB Users
Sorry, that SAM benchmark table did not format well after send...

Capture.PNG

Kasper Kristensen

unread,
Jan 9, 2025, 9:18:07 AMJan 9
to TMB Users
My conclusion for this example is very different: The RTMB SAM implementation was made without any effort to vectorize at all, and yet you get a total performance penalty of only 10-20%. That's not bad! A more R-like implementation would vectorize the data loop and probably get you closer to ca 1% performance penalty.

Ben's suggestion to add a 'setvar' is definitely possible. I'm open to adding it if someone posts a github issue.

Another option could be to construct arrays from a function, something like:

fun2array <- function(FUN, ...) {
    stopifnot(identical(names(list(...)),
                        names(formals(FUN))))
    d <- lengths(list(...))
    grid <- as.list(expand.grid(...))
    ans <- .mapply(FUN, grid, NULL)
    cl <- class(ans[[1]])
    ans <- unlist(ans)
    class(ans) <- cl
    dim(ans) <- d
    ans
}

and then transform the loop into:

M2 <- fun2array(
    function(ksp, k_sex, k_age, yr)

        sum(avgN_at_age[, , , yr] *
            ration[, , , yr] *
            suit_main[, , , ksp, k_sex, k_age, yr] /
            avail_food[, , , yr]),
    ksp = 1:nspp,
    k_sex = 1:nsex,
    k_age = 1:nages,
    yr = 1:nyrs)

Kasper Kristensen

unread,
Feb 5, 2025, 4:36:20 AMFeb 5
to TMB Users
For future reference, it turns out there's a simple workaround to speed up MakeADFun in cases where one must build an array one element at a time. The trick is to use double bracket sub-assignment x[[i]]<-y rather than single bracket. Example:

f <- function(x) {
    n <- 100
    a <- AD(array(0, c(n,n,n)))

    for (i in 1:n)
        for (j in 1:n)
            for (k in 1:n)
                a[[i,j,k]] <- x
    sum(x)
}
system.time(F <- MakeTape(f, 0)) ## Reasonably fast

The above example takes forever to run when using single bracket sub-assignment 'a[i,j,k] <- x'.

William Stockhausen - NOAA Federal

unread,
Feb 5, 2025, 9:05:42 AMFeb 5
to Kasper Kristensen, TMB Users
Hi Kasper,

Thanks for the great tip! 
Can it also be used where `x` is a vector of length `n` and you skip the `k` loop so you're doing
a[[i,j,]]<-x ?

Buck

William T. Stockhausen

Research Fishery Biologist, Alaska Fisheries Science Center

NOAA Fisheries | U.S. Department of Commerce

Office: (206) 526-4241

www.fisheries.noaa.gov




To post to this group, send email to us...@tmb-project.org. Before posting, please check the wiki and issuetracker at https://github.com/kaskr/adcomp/. Please try to create a simple repeatable example to go with your question (e.g issues 154, 134, 51). Use the issuetracker to report bugs.

---
You received this message because you are subscribed to the Google Groups "TMB Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tmb-users+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/tmb-users/9931c43c-bba3-410c-8709-18e4411b27cbn%40googlegroups.com.

pthomp...@gmail.com

unread,
Feb 5, 2025, 12:31:30 PMFeb 5
to TMB Users
I'm interested in this as well. I tried the code above and it gave an error; I think "[[" only works when all subscript dimensions are filled in with a single value in R. I quickly checked it out and it seems like in that situation, using the three-dimensional for-loop with [[i,j,k]] and individually assigning each element of x to a is faster than "a[i, j, ] = x" with single brackets. See my comparison below:

require(RTMB)
require(microbenchmark)

f1 <- function(x) {
  n <- 50

  a <- AD(array(0, c(n,n,n)))
 
  for (i in 1:n)
    for (j in 1:n)
      for (k in 1:n)
        a[[i,j,k]] <- x[[k]]
  sum(x)
}

f2 <- function(x) {
  n <- 50

  a <- AD(array(0, c(n,n,n)))
 
  for (i in 1:n)
    for (j in 1:n)
        a[i,j, ] <- x
  sum(x)
}

mb = microbenchmark(MakeTape(f1, numeric(50)),
                    MakeTape(f2, numeric(50)),
                    times = 5)
Picture1.png

I imagine there might be a better way to do this that somehow leverages the power of "[[" without needing to use the extra loop, but I'm not sure.

Peter

Kasper Kristensen

unread,
Feb 6, 2025, 5:07:53 AMFeb 6
to TMB Users
Peter,

You can combine inner loop vectorization and outer loop inplace insertion by using an array of lists. Then convert to normal array after construction:

library(RTMB)
n <- 100
f <- function(x) {
    a <- array( list() , c(n, n) )

    for (i in 1:n)
        for (j in 1:n)
            a[[i,j]] <- x
    ## convert to normal array
    a <- do.call("c", a)
    dim(a) <- c(n, n, n)
    sum(a)
}
system.time(F <- MakeTape(f, numeric(n)))

pthomp...@gmail.com

unread,
Feb 6, 2025, 11:48:57 AMFeb 6
to TMB Users
Indeed, this works great in comparison to both my versions of the function. Excellent solution Kasper, thanks!! :)

Peter

Reply all
Reply to author
Forward
0 new messages