
Question about Interpretation of Validation Metrics from inla.group.cv


Guido Fioravanti

May 12, 2025, 10:18:55 AM
to R-inla discussion group

Dear INLA group,

I have a question regarding the interpretation of three validation metrics obtained using the inla.group.cv function. My goal is to compare a calibration model (a spatio-temporal regression model) against a data fusion model (Bayesian melding), where the latter is a joint model with two likelihoods.

For each model, I computed the following monthly metrics:

  • Negative logarithmic score (LS)

  • Dawid-Sebastiani score (DS)

  • Kullback-Leibler divergence (KLD)
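For reference, the first two scores can be computed from the output of inla.group.cv roughly as follows. This is a minimal sketch: `fit` stands in for an inla() result, `y` for the observation vector, and the field names (`cv`, `mean`, `sd`) follow the leave-group-out CV paper and may vary by INLA version.

```r
library(INLA)

## Leave-group-out cross-validation for a fitted model ('fit' is a
## placeholder for an inla() result).
gcv <- inla.group.cv(result = fit, num.level.sets = 3)

## gcv$cv[i] is the leave-group-out predictive density at observation i;
## gcv$mean and gcv$sd are the corresponding predictive mean and sd.

## Negative logarithmic score (smaller is better)
LS <- -mean(log(gcv$cv), na.rm = TRUE)

## Dawid-Sebastiani score, using the Gaussian approximation to the
## predictive distribution ('y' is the observation vector)
DS <- mean(((y - gcv$mean) / gcv$sd)^2 + 2 * log(gcv$sd), na.rm = TRUE)
```

Monthly values are then obtained by averaging the per-observation terms within each month rather than over the whole vector.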

The results are provided in the attached file. My question concerns an apparent inconsistency:

  • The KLD suggests that the data fusion model performs better across all months.

  • However, the LS and DS metrics indicate either similar performance between the two models or a slightly better performance of the calibration model in certain months.

I would have expected these metrics to show more coherent behavior. Additionally, other metrics I computed (RMSE and MAE, not shown here) align more closely with LS and DS, suggesting comparable or slightly better performance for the calibration model.

How can I reconcile the KLD results with the other metrics? Is there an interpretation or methodological consideration I might be missing?

I would greatly appreciate any insights or suggestions you might have.

Thanks for your help,

Guido



metrics.png

Håvard Rue

May 12, 2025, 5:14:54 PM
to Guido Fioravanti, R-inla discussion group

In general, there is no answer to which one is the 'best', as this is defined by
what you choose to use.

I would also guess that the KLD is the most "non-robust" of the three, as shown in
the plot, so I would likely avoid that one; then it is less of an issue which
one you choose.

Best
H




--
Håvard Rue
hr...@r-inla.org