Dear all,
Thanks, everybody, for the interesting discussion and for sharing the paper.
Personally, I am not a big fan of reporting only single scores such as generalisation, specificity, etc. I think they should be complemented by other means of verification. One simple possibility would be to measure some standard anatomical distances on the model, calculate how these distances vary across random samples from the model, and then compare that with values reported in the medical literature. I also think that we should always show visualizations of the samples, as they might convey more information than a specificity score. To some degree, these things were done in the paper you cited about sexual dimorphism. However, I think one could/should go even further.
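The distance check described above could be sketched roughly as follows. This is only a minimal illustration in plain Python, assuming a toy point-based shape model with a mean shape and a couple of modes of variation. All names (MEAN_SHAPE, MODES, sample_shape, distance_statistics) and all numbers are made-up placeholders, not real anatomy and not taken from any library or study. In practice the samples would come from the actual shape model and the reference values from the medical literature.

```python
import math
import random
import statistics

# Toy shape model. All values are hypothetical placeholders.
# Two landmarks in 3D (units could be read as mm).
MEAN_SHAPE = [(0.0, 0.0, 0.0), (40.0, 0.0, 0.0)]

# Each mode: (standard deviation, per-landmark displacement for a unit coefficient).
MODES = [
    (2.0, [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)]),
    (1.0, [(0.0, 0.5, 0.0), (0.0, -0.5, 0.0)]),
]


def sample_shape(rng):
    """Draw one random shape: mean + sum_i alpha_i * sigma_i * mode_i, alpha_i ~ N(0, 1)."""
    shape = [list(p) for p in MEAN_SHAPE]
    for sigma, displacements in MODES:
        alpha = rng.gauss(0.0, 1.0)
        for point, d in zip(shape, displacements):
            for k in range(3):
                point[k] += alpha * sigma * d[k]
    return shape


def landmark_distance(shape, i, j):
    """Euclidean distance between landmarks i and j of one shape."""
    return math.dist(shape[i], shape[j])


def distance_statistics(n_samples, i, j, seed=0):
    """Mean and standard deviation of one distance across random model samples."""
    rng = random.Random(seed)
    distances = [
        landmark_distance(sample_shape(rng), i, j) for _ in range(n_samples)
    ]
    return statistics.mean(distances), statistics.stdev(distances)


mean_d, std_d = distance_statistics(1000, 0, 1)
print(f"model: mean = {mean_d:.1f}, sd = {std_d:.1f}")
```

The printed mean and standard deviation can then be put side by side with the values reported for the same measurement in the literature. A large mismatch in either would suggest that the model does not capture the anatomical variability well, even if its specificity score looks fine.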
In this context, I quite like the paper "Bayesian Workflow" by Andrew Gelman et al. (https://arxiv.org/abs/2011.01808). It speaks more generally about how to build and validate statistical models, but I think we can apply a lot of its thinking to shape modelling.
Best regards,
Marcel