Using Nimble functions (distributions) in a model


Gregory

Dec 30, 2021, 9:14:38 AM
to nimble-users
Hello Nimble users,

I've created a custom distribution, including a nimbleFunction for each of the d, r, p, and q functions. Those functions all compile successfully using compileNimble, and I'm able to register the distribution using registerDistributions. However, when it's used in a model, the MCMC is much slower than when using one of Nimble's provided distributions. I'd like to speed it up, if possible, so I'm trying to identify why it's so slow. I see two likely culprits.

1) The model is using the un-compiled distribution functions. This seems possible, because the calls in the model code use the un-compiled versions. I did this because when I try to register the compiled distribution or use it in a model, I get an error that the function is not available and "must be a nimbleFunction (with no setup code)." But since I'm using a compiled model and mcmc, I'm guessing that the Nimble functions for my distribution get compiled at the same time; I'm not sure about that, though. So, could someone tell me whether that is the case, and if it is not, how do I use compiled versions of my functions in the model?

2) If the model is using compiled versions of my distribution functions, then the slowness is caused by how I've implemented the function. If that's the case, I'll need to think about how that might be improved. I have a good guess where the problem would be, but whether that can be eliminated might be a topic for a separate post here.

Thanks in advance,
Gregory

Daniel Turek

Dec 30, 2021, 9:41:30 AM
to Gregory, nimble-users
Short answer to (1) is that yes, when you compile the model, it also compiles any user-defined distributions or functions which are in the model, and the compiled model object will be using the compiled versions of your custom distributions.

I hope you can make some headway with (2).  Yes, depending on how you've written things, there can be non-trivial performance gains or losses.  You might also think about any memory allocation (of vectors, arrays, or higher-dimensional objects) that might be invoked on *every* call to your distribution, which is one possible pitfall.  For example, if you create a new array variable inside your distribution, then this memory allocation will happen, repeatedly, on every call of your distribution.

Keep at it,
Daniel



Gregory

Dec 31, 2021, 10:32:15 AM
to nimble-users
Daniel,

Thanks for the insight; I think it suggests something to try. My distribution is a discretized version of a normal distribution. The implementation of the d function is essentially this:

1) Define a finite range of support for the distribution.
2) Create a vector of normal densities for all integers within the range.
3) Standardize the vector created in (2) by dividing each element by the sum of all the elements.
4) Look up the vector element corresponding to a particular integer and return it.
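In plain C++ terms (not NIMBLE DSL code; the function and parameter names here are illustrative), those four steps amount to something like:

```cpp
#include <cmath>
#include <vector>

// Sketch of steps (1)-(4): discretized normal density on a fixed
// integer support [lo, hi]. Names and signature are illustrative.
double ddiscnorm(int x, double mu, double sigma, int lo, int hi) {
    // (2) unnormalized normal density at every integer in the support
    std::vector<double> dens(hi - lo + 1);
    double sum = 0.0;
    for (int i = lo; i <= hi; ++i) {
        double z = (i - mu) / sigma;
        dens[i - lo] = std::exp(-0.5 * z * z);
        sum += dens[i - lo];
    }
    // (3) standardize so the probabilities sum to 1 over the support
    for (double& d : dens) d /= sum;
    // (4) look up the element corresponding to x
    return dens[x - lo];
}
```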

Step (2) has to be done each time the density function is called, and if the distribution has a large range of support, then that's a lot of work to repeat. An obvious solution in C++ is to create something like a std::map where key={integer in range}, value={corresponding element from the standardized vector in (3)} and store that map for future use. I'm not sure how to do that with Nimble, though.
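As a sketch of that caching idea in plain C++ (again, not NIMBLE DSL code; names are made up), one variant keys the standardized vector on the (mu, sigma) pair, so the expensive step (2) runs only once per distinct parameter value rather than once per call:

```cpp
#include <cmath>
#include <map>
#include <utility>
#include <vector>

// Cache of standardized density vectors, keyed by (mu, sigma).
static std::map<std::pair<double, double>, std::vector<double>> cache;

double ddiscnorm_cached(int x, double mu, double sigma, int lo, int hi) {
    auto key = std::make_pair(mu, sigma);
    auto it = cache.find(key);
    if (it == cache.end()) {
        // Cache miss: compute and standardize the full vector once.
        std::vector<double> dens(hi - lo + 1);
        double sum = 0.0;
        for (int i = lo; i <= hi; ++i) {
            double z = (i - mu) / sigma;
            dens[i - lo] = std::exp(-0.5 * z * z);
            sum += dens[i - lo];
        }
        for (double& d : dens) d /= sum;
        it = cache.emplace(key, std::move(dens)).first;
    }
    // Cache hit (or freshly inserted entry): just look up x.
    return it->second[x - lo];
}
```

Whether this helps in an MCMC depends on how often the samplers revisit the same (mu, sigma) values; if the parameters change on nearly every call, the cache never gets a hit.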

Daniel Turek

Jan 4, 2022, 12:29:36 PM
to Gregory, nimble-users
Yes, unless you get into using the nimbleExternalCall function to call arbitrary C++ code from your density function, I think you'll be restricted to the standard numerical operations supported by the NIMBLE DSL.  That said, doing as much of the pre-computation as possible *once*, in advance, will definitely be helpful.

So, is it true that the finite interval of support defined in (1) is constant?  And therefore that the set of integers within this range is also unchanging?

I'm guessing, then, that the reason (2) must be repeated on each call to the distribution is because the mean and variance defining the normal distribution are changing?  In that case, I can't think of any obvious way around evaluating the normal density at each integer and returning the correct normalized value.
