Model Selection with DIC & PPC, and plots of posterior predictives

gabri....@gmail.com

Jan 29, 2021, 3:35:13 PM

to hddm-users

Hello Everyone,

I just wanted to check whether what I am planning to do is correct.

I have fitted 3 models and have the DIC for all of them. I also want to run posterior predictive checks (PPC) to make sure I select the best model. My understanding is that I simulate data from each of the three models, compare each model's simulated data with the actual data, and see which model has the lowest MSE. The code would be something like this:

import hddm

# Simulate datasets from each fitted model (M1, M2, M3 are already-sampled HDDM models)
ppc_data_M1 = hddm.utils.post_pred_gen(M1)
ppc_data_M2 = hddm.utils.post_pred_gen(M2)
ppc_data_M3 = hddm.utils.post_pred_gen(M3)

# Compare summary statistics of each simulated dataset with the observed data
ppc_compare_M1 = hddm.utils.post_pred_stats(data, ppc_data_M1)
ppc_compare_M2 = hddm.utils.post_pred_stats(data, ppc_data_M2)
ppc_compare_M3 = hddm.utils.post_pred_stats(data, ppc_data_M3)

Then I would look at the MSE column of each ppc_compare table and select the model with the lowest values?
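As a rough illustration of what an MSE of this kind measures, here is a minimal sketch. It assumes the MSE is the mean squared difference between one observed summary statistic and the same statistic computed on each simulated dataset. The function name `summary_mse` and all numbers are made up for illustration; this is not HDDM's own code.

```python
from statistics import mean

def summary_mse(observed_stat, simulated_stats):
    """Mean squared difference between an observed statistic and simulated ones."""
    return mean((s - observed_stat) ** 2 for s in simulated_stats)

# Hypothetical values: observed mean RT (seconds) and the mean RT
# of four datasets simulated from a fitted model.
observed_mean_rt = 0.80
simulated_mean_rts = [0.78, 0.83, 0.79, 0.84]

mse = summary_mse(observed_mean_rt, simulated_mean_rts)
print(f"MSE for mean RT: {mse:.5f}")
```

A score like this says how far the simulated statistic tends to fall from the observed one, but it carries no penalty for the number of parameters.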

Also, regarding my posterior predictive plots: I am using the random dot motion task, and for the 3 hardest conditions the plots don't look like the model predicts the data well at all. My understanding is that the participant is guessing the response, so the model is basically guessing too and can't make great predictions?

Thanks in advance.

Gabz xxxxxxxxx

Mads Lund Pedersen

Feb 2, 2021, 1:30:44 PM

to hddm-...@googlegroups.com

Your approach for comparing the models' predictive ability looks correct, but I would not use the MSE (or something similar) for model selection, as it does not penalize model complexity. It can still be a good tool for getting a grasp of what the models can't capture. For selecting the best-fitting model, I would recommend using DIC.
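To make the complexity penalty concrete, here is a minimal sketch of the standard DIC definition: the deviance is -2 times the log-likelihood, pD (the effective number of parameters) is the posterior mean deviance minus the deviance at the posterior mean, and DIC is the mean deviance plus pD. The toy Gaussian model, the data, and the hand-written "posterior samples" are assumptions for illustration only and do not come from a real sampler or from HDDM.

```python
import math
from statistics import mean

def deviance(mu, data, sd=1.0):
    """-2 * log-likelihood of the data under Normal(mu, sd)."""
    log_lik = sum(
        -0.5 * math.log(2 * math.pi * sd ** 2) - (y - mu) ** 2 / (2 * sd ** 2)
        for y in data
    )
    return -2.0 * log_lik

def dic(posterior_mus, data):
    """Return (DIC, pD) computed from posterior samples of mu."""
    mean_dev = mean(deviance(mu, data) for mu in posterior_mus)
    dev_at_mean = deviance(mean(posterior_mus), data)
    p_d = mean_dev - dev_at_mean      # complexity penalty
    return mean_dev + p_d, p_d        # DIC = fit term + penalty

# Hypothetical data and posterior samples, for illustration only
data = [0.1, -0.3, 0.4, 0.2]
posterior_mus = [0.0, 0.1, 0.2]

dic_value, p_d = dic(posterior_mus, data)
print(f"DIC = {dic_value:.3f}, pD = {p_d:.3f}")
```

The pD term is what a plain MSE lacks: a more flexible model has to improve the mean deviance by more than its penalty to come out ahead.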

For the last question, it's hard to say whether there is an issue with the model without knowing exactly what you mean by misprediction. In general, though, the model should also be able to predict difficult conditions, as the drift rate should be closer to 0 when accuracy is close to 0.5.
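The link between drift rate and accuracy can be sketched with the closed-form choice probability of a simple diffusion model. This assumes an unbiased starting point (halfway between the boundaries), no across-trial variability, and within-trial noise scaled to 1; under those assumptions the probability of reaching the upper (correct) boundary is 1 / (1 + exp(-a * v)). The function name `p_upper` and the parameter values are illustrative.

```python
import math

def p_upper(v, a):
    """P(hit upper boundary) for drift v and boundary separation a,
    assuming an unbiased start and noise scaled to 1."""
    return 1.0 / (1.0 + math.exp(-a * v))

# As drift shrinks towards 0, predicted accuracy approaches chance (0.5)
for v in (0.0, 0.25, 1.0, 2.0):
    print(f"v = {v:.2f}  predicted accuracy = {p_upper(v, a=2.0):.3f}")
```

So a near-chance condition is not outside what the model can describe; it corresponds to a drift estimate near zero.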

--
You received this message because you are subscribed to the Google Groups "hddm-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hddm-users+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hddm-users/5f0842b8-80b3-4442-a026-7e94f4219efcn%40googlegroups.com.

--

Best,

Mads

gabrielle sheehan

Feb 4, 2021, 4:53:18 AM

to hddm-...@googlegroups.com

Dear Mads,

Thank you so much for your reply.

For the last question, please see the graphs below. This condition is the 2nd most difficult in the experiment and, as you can see, the model doesn't seem to have captured the responses very well. Is there a way to improve this?

Best best wishes,

Gabrielle

Mads Lund Pedersen

Feb 4, 2021, 7:52:36 AM

to hddm-...@googlegroups.com

This apparent misprediction reflects the fact that there aren't many trials per condition and subject, so there is a lot of noise in the observed RT histograms. In this case I would recommend comparing observed and simulated data at the group level, which would give you a better indication of whether the model can capture the behavior. At the subject level you could compare summary scores.
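A group-level comparison of this kind could be sketched as follows: pool trials across subjects, then compare a few summary scores (accuracy and RT quantiles) between observed and simulated data. The data layout, the function names `summarize` and `pool`, and all values are hypothetical; with real data the trials would come from the observed dataset and from the posterior predictive simulations.

```python
from statistics import mean, quantiles

def summarize(trials):
    """trials: list of (rt_seconds, correct) tuples pooled over subjects."""
    rts = [rt for rt, _ in trials]
    cuts = quantiles(rts, n=10)  # 9 cut points: 10%, 20%, ..., 90%
    return {
        "accuracy": mean(c for _, c in trials),
        "rt_q10": cuts[0],
        "rt_median": cuts[4],
        "rt_q90": cuts[8],
    }

def pool(by_subject):
    """Flatten a {subject: trials} mapping into one list of trials."""
    return [trial for trials in by_subject.values() for trial in trials]

# Hypothetical trials for one condition, for illustration only
observed_by_subject = {
    "s1": [(0.61, 1), (0.95, 0), (0.72, 1), (1.30, 0)],
    "s2": [(0.58, 1), (0.88, 1), (1.10, 0), (0.79, 1)],
}
simulated_by_subject = {
    "s1": [(0.65, 1), (0.90, 0), (0.70, 1), (1.20, 1)],
    "s2": [(0.60, 0), (0.85, 1), (1.15, 0), (0.80, 1)],
}

obs = summarize(pool(observed_by_subject))
sim = summarize(pool(simulated_by_subject))
for name in obs:
    print(f"{name}: observed = {obs[name]:.3f}, simulated = {sim[name]:.3f}")
```

Pooling averages out much of the trial-level noise that makes single-subject histograms look badly predicted; the same `summarize` function can be applied to one subject's trials to get subject-level summary scores.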

--

Best,

Mads

Sabina Nowakowska

May 7, 2021, 8:57:04 AM

to hddm-users

Hello, I have a related question regarding model selection. The DIC values I am getting from my models are negative and large in magnitude. As far as I understand, the lower the DIC value, the better the model fit, so in my case I would select the model with the most negative DIC value, right? For example, the DIC is -50,000 for model 1, -51,000 for model 2, and -54,000 for model 3, so I would select model 3. But I am not sure this is the right interpretation, because the PPC plots show a visibly far worse fit for model 3 even though its DIC is the lowest. The models have the same number of parameters; I only change which one varies with condition.
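On the sign of the values alone, a minimal sketch may help. It assumes the deviance behind DIC is -2 times the log-likelihood. For continuous data such as RTs the likelihood is a probability density, which can exceed 1, so the log-likelihood can be positive and the deviance negative. A normal density stands in for the actual model likelihood here, and the data are made up.

```python
import math

def normal_logpdf(x, mu, sd):
    """Log density of Normal(mu, sd) at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mu) ** 2 / (2 * sd ** 2)

# Hypothetical tightly clustered data, for illustration only
data = [0.50, 0.52, 0.49, 0.51]

# Narrow density: values above 1 at the data, so the deviance is negative
deviance_narrow = -2 * sum(normal_logpdf(x, 0.505, 0.02) for x in data)
# Wide density: values below 1 at the data, so the deviance is positive
deviance_wide = -2 * sum(normal_logpdf(x, 0.505, 2.0) for x in data)

print(f"narrow: {deviance_narrow:.2f}, wide: {deviance_wide:.2f}")
```

In this sketch the narrower, better-matching density gives the lower (more negative) deviance, which is consistent with reading "lower is better" regardless of sign.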
