Standardized continuous covariates - do I need to reverse for outputs


Suzanne Beck

Feb 19, 2025, 7:54:22 AM
to R-inla discussion group
I have a zero-inflated Poisson (ZIP) model in R-INLA. I standardized my continuous covariates because running the model with unstandardized variables causes R to crash.

I have a table of betas and hyperparameters (means and credible intervals) as well as some output graphs (following the methods from Zuur). My question is whether I need to back-standardize the means and credible intervals for these standardized variables, and/or the predicted values in my output graphs.

Many thanks in advance

Finn Lindgren

Feb 19, 2025, 8:06:39 AM
to Suzanne Beck, R-inla discussion group
Yes, if you want to be able to interpret model parameters on the non-standardised scale, you have to do the corresponding transformation of the parameter estimates.

Likewise, if you don’t do that parameter postprocessing, then to predict for new covariate values you need to transform those values with the same transformation that was used to preprocess the covariate. For this reason, black-box standardisation methods such as “scale()” aren’t very useful; manual transformation is more practical, as you can then store the standardisation information and reuse it on new data.

Finn


Suzanne Beck

Feb 19, 2025, 8:25:53 AM
to R-inla discussion group
Hi Finn, 

Thanks for getting back to me. I used the standardisation (x - mean(x)) / sd(x) for all my continuous variables, so should I be able to just apply the back-standardisation x_std * sd(x) + mean(x) directly to the parameter estimates and to the predicted values for my output graphs?

All the best
Suzanne

Thierry Onkelinx

Feb 19, 2025, 9:09:41 AM
to Suzanne Beck, R-inla discussion group
Dear Suzanne,

Maybe consider centering to a specific value instead of centering to the mean, and likewise scaling by a specific value instead of by the standard deviation.

Consider a variable with mean = 1234 and standard deviation = 34. I would center it to e.g. 1000 and scale it by 10 or 100. Undoing the scaling then only changes the order of magnitude. You need covariates with a mean reasonably close to zero and a standard deviation reasonably close to 1. They don't need to be exactly 0 and 1.
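
As a rough sketch of this suggestion in R: the data below are simulated, the constants 1000 and 100 are simply the ones from the example above, and the variable names are made up for illustration. Nothing here is specific to INLA.

```r
# Simulated covariate with mean near 1234 and sd near 34 (illustrative only).
set.seed(1)
x <- rnorm(200, mean = 1234, sd = 34)

# Fixed, round constants chosen by hand rather than estimated from the data.
x_center <- 1000
x_scale  <- 100

x_scaled <- (x - x_center) / x_scale

# Inspect the mean and sd of the rescaled covariate.
round(c(mean = mean(x_scaled), sd = sd(x_scaled)), 2)

# Undoing the transformation uses the same two constants.
x_back <- x_scaled * x_scale + x_center
stopifnot(isTRUE(all.equal(x_back, x)))
```

With round constants like these, a slope estimated for x_scaled converts to the original scale by dividing by 100, which only moves the decimal point.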

Best regards,

Thierry


Finn Lindgren

Feb 19, 2025, 10:09:34 AM
to R-inla discussion group
If the model you want is

eta = beta_0 + beta_x * x

but you pre-standardise to x_std = (x - mean(x)) / sd(x), that means
x = x_std * sd(x) + mean(x), and therefore

eta = beta_0 + beta_x * (x_std * sd(x) + mean(x))
    = beta_0 + beta_x * mean(x) + beta_x * sd(x) * x_std

This shows that the standardisation changes the interpretation of the
parameters _as seen by inla_, which fits the model

eta = beta_0_std + beta_x_std * x_std

with beta_0_std = beta_0 + beta_x * mean(x) and beta_x_std = beta_x * sd(x).

So to recover beta_x, you need
beta_x_std / sd(x),
and to recover beta_0, you need
beta_0_std - beta_x * mean(x) = beta_0_std - beta_x_std / sd(x) * mean(x)
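
A quick numerical check of these two formulas, as a sketch only: the data are simulated, the true coefficient values are arbitrary, and lm() stands in for the INLA fit purely to keep the example self-contained in base R.

```r
set.seed(42)
x <- rnorm(100, mean = 50, sd = 8)
y <- 2 + 0.3 * x + rnorm(100, sd = 0.5)

# Store the standardisation constants computed from the original data.
x_mean <- mean(x)
x_sd   <- sd(x)
x_std  <- (x - x_mean) / x_sd

# Fit on the standardised covariate (stand-in for the INLA model).
fit_std    <- lm(y ~ x_std)
beta_0_std <- unname(coef(fit_std)[1])
beta_x_std <- unname(coef(fit_std)[2])

# Back-transform to the original covariate scale.
beta_x <- beta_x_std / x_sd
beta_0 <- beta_0_std - beta_x * x_mean

# Compare with a fit on the unstandardised covariate.
fit_raw <- lm(y ~ x)
stopifnot(isTRUE(all.equal(c(beta_0, beta_x), unname(coef(fit_raw)))))
```

Because sd(x) is a positive constant, the posterior mean and each credible-interval limit of beta_x_std can be divided by sd(x) to give those of beta_x. The intercept is different: beta_0 combines two parameters, so its mean follows from the formula, but its credible interval cannot be built from the two separate intervals and needs the joint posterior, for example through posterior samples.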

If all you want is prediction based on new data x_new, you can alternatively do

x_new_std = (x_new - mean(x))/sd(x)

and then compute the linear predictor as

beta_0_std + beta_x_std * x_new_std

instead of converting beta_x_std to beta_x.
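
The two routes to a prediction can be compared directly. This is a minimal sketch: the coefficient values are made-up numbers rather than output from a fitted model, and all variable names are illustrative.

```r
# Training covariate and the constants stored when it was standardised.
x      <- c(12, 15, 19, 22, 30)
x_mean <- mean(x)
x_sd   <- sd(x)

# Hypothetical estimates on the standardised scale (made-up numbers).
beta_0_std <- 1.2
beta_x_std <- 0.4

x_new <- c(10, 25)

# Route 1: standardise the new data with the ORIGINAL mean and sd.
x_new_std <- (x_new - x_mean) / x_sd
eta_1 <- beta_0_std + beta_x_std * x_new_std

# Route 2: convert the coefficients and use the new data as they are.
beta_x <- beta_x_std / x_sd
beta_0 <- beta_0_std - beta_x * x_mean
eta_2  <- beta_0 + beta_x * x_new

# Both routes give the same linear predictor.
stopifnot(isTRUE(all.equal(eta_1, eta_2)))
```

Assuming only the covariates were standardised and not the response, the predicted values themselves need no back-transformation. For a plot, it is the covariate axis that should be shown on the original scale, that is, predictions plotted against x_new rather than x_new_std.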

The trap you need to avoid is the temptation to use x_std = scale(x),
and then x_new_std = scale(x_new). That is completely different from
what I detailed above, as it would instead use mean(x_new) and
sd(x_new) in the conversion. This breaks the definition of the model,
which is based on standardising with the mean and sd of the
_original_ data.
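
A small illustration of the trap, using made-up numbers. It also shows, as an aside, that scale() records the centre and scale it used as attributes on its result, so the constants can be recovered if scale() has already been applied to the original data.

```r
x     <- c(12, 15, 19, 22, 30)   # original data used to fit the model
x_new <- c(10, 25, 40)           # new data for prediction

# Correct: reuse the mean and sd of the original data.
x_new_ok <- (x_new - mean(x)) / sd(x)

# Trap: scale() re-estimates the centre and spread from x_new itself.
x_new_bad <- as.numeric(scale(x_new))

# scale() stores what it used as attributes on the returned matrix.
x_scaled    <- scale(x)
stored_mean <- attr(x_scaled, "scaled:center")
stored_sd   <- attr(x_scaled, "scaled:scale")
x_new_ok2   <- (x_new - stored_mean) / stored_sd

stopifnot(
  !isTRUE(all.equal(x_new_ok, x_new_bad)),   # the two conversions disagree
  isTRUE(all.equal(x_new_ok, x_new_ok2))     # stored constants match manual ones
)
```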

Finn