Hi everyone,
I made the recent nimble 1.1.0 update adding the nimIntegrate function very useful for what I wanted to do.
Basically, the whole code has the purpose of degrading a normal distribution with a mean and a standard deviation to a histogram.
More precisely, I want a probability mass function values and midpoint of intervals for a specified number of classes within a given range based on the normal distribution parameters mu and sigma.
Now I will talk in depth about the part of estimating midpoint of intervals.
At first I used a simple (breaks[i+1] - breaks[i])/2 to make the estimation of these midpoint.
But now I want a set of midpoint based on the given data distribution (normal) because it's more accurate (especially for the tails of the distribution).
To make this I compute the weighted contribution of each data point to the average :
f <- function(x, mean, sd) {x * dnorm(x, mean, sd)}
Which is turned into a nimbleFunction (see here the used of theta, which is a vector with mean and sd in it) :
f <- nimbleFunction(
run = function(x = double(1), theta = double(1)) {
numerat <- x * dnorm(x, theta[1], theta[2]) # Calculate the numerator for weighted average (mean and sd)
returnType(double(1))
return(numerat)
}
)
I also compute the PDF of the normal to normalize my weighted contribution :
f2 <- function(x, mean, sd) {dnorm(x, mean, sd)}
Which is a simple dnorm turned into a nimbleFunction (here again we use theta as a vector of mean and sd) :
f2 <- nimbleFunction(
run = function(x = double(1), theta = double(1)) {
denomin <- dnorm(x, theta[1], theta[2]) # Calculate the denominator for normalization (mean and sd)
returnType(double(1))
return(denomin)
}
)
I now have my final nimbleFunction which estimate these midpoints :
f.midbreaks.taille2 <- nimbleFunction(
run = function(theta = double(1), nb_class = double(0)) {
# Definition of min/max bounds
borne_min <- 0
borne_max <- 10
# Here I define a min and max bounds with a large interval
# Calculate breakpoints based on the value range
breaks_base <- seq(0.2, 3, length.out = nb_class + 1)
# After I calculate breakpoints with a range more precise around the values we have and (nb_class+1) breaks
breaks_names <- c(borne_min, breaks_base, borne_max)
# I now add my tails bounds to capture the interval < first breaks and the interval > last breaks
# Initialization of vectors (explicit)
int_num <- numeric(length = (nb_class + 2))
int_denom <- numeric(length = (nb_class + 2))
# Loop to calculate integrals over each interval
for (i in 1:(nb_class + 2)) {
# Calculate the weighted average for the interval
int_num[i] <- nimIntegrate(f, lower = breaks_names[i], upper = breaks_names[i + 1], param = theta)[1] # Index 1 to retrieve only the value
# Calculate the normalization for the interval
int_denom[i] <- nimIntegrate(f2, lower = breaks_names[i], upper = breaks_names[i + 1], param = theta)[1] # Index 1 to retrieve only the value
}
# Finally, calculate mid_breaks as the ratio of weighted average and normalization
mid_breaks2 <- int_num / int_denom
returnType(double(1))
return(mid_breaks2)
}
)
This function works perfectly uncompiled or compiled.
BUT (here we are), when I add this function in an nimbleCode as follow :
model.nimble <- nimbleCode({
#------------------------------------------
# Prior
mu <- 1 # Prior for mean "mu" of a normal (fixed)
sigma <- 0.2 # Prior for standard deviation "sigma" of a normal (fixed)
theta <- nimC(mu, sigma) # Vector used by nimIntegrate
midbins[1:(nb_class+2)] <- f.midbreaks.taille2(theta = theta, nb_class = nb_class) # Mean value for each bins of my builted histrogram
# Likelihood
# end model
})
With nb_class as a constant :
nb_class <- 40 # Number of bins that I want to create in my histogram
const <- list(nb_class = nb_class) # Etablishment of the constant variable
And as I create the nimbleModel I get this :
> model.nimble <- nimbleModel(code = model.nimble, name = 'model.nimble', constants = const)
Defining model
Building model
Running calculate on model
[Note] Any error reports that follow may simply reflect missing values in model variables.
Error in integrate(f, lower, upper, param, subdivisions = subdivisions, :
non-finite function value
Checking model sizes and dimensions
[Note] This model is not fully initialized. This is not an error.
To see which variables are not initialized, use model$initializeInfo().
For more information on model initialization, see help(modelInitialization).
Message d'avis :
Dans model$theta[1] <<- nimC(model$mu[1], model$sigma[1]) :
le nombre d'objets à remplacer n'est pas multiple de la taille du remplacement (the number of objects to replace is not a multiple of the size of the replacement)