Within and between subjects model

Tim Sandhu

Sep 14, 2020, 10:18:05 AM

to hddm-users

Dear HDDM experts,

I am trying to use HDDM to look at changes in parameters between two groups in a task with three conditions. I think I need a within-subjects model, because including ‘depends_on={'v': ['group', 'condition']}’, for example, treats group and condition as independent. Additionally, performance in one condition is likely to be related to performance in the other two.

However, I am unsure how to specify the model so that it includes the within-subjects effect of condition and the between-subjects effect of group. I have read the patsy documentation and the tutorial on within-subjects effects but I’m still not sure…

In the simplest case, assuming I am only interested in drift rate, is this the correct syntax? (In the code below, condition is categorical and takes one of the values {one, two, three}.)

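import hddm
from patsy import dmatrix

# Inspect the design matrix patsy builds for condition, with 'one' as the reference level
dmatrix("C(condition, Treatment('one'))", data)

# Regress drift rate on group, condition and their interaction
m_within = hddm.HDDMRegressor(data, "v ~ 1 + group * C(condition, Treatment('one'))")
m_within.sample(5000, burn=250)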

I’d be really grateful for any help with this!

Thanks in advance!

Tim
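
A note on what the formula above asks the model to estimate. With treatment coding and 'one' as the reference level, "v ~ 1 + group*C(condition, Treatment('one'))" expands to an intercept, a group term, one offset per non-reference condition, and one group-by-condition interaction per non-reference condition. The sketch below builds those columns by hand in plain Python to make the coding visible. It is an illustration, not patsy's implementation: it assumes group is coded numerically as 0/1, and the column labels and the helper functions are made up for this example.

```python
# Illustrative only: hand-rolled treatment coding, not patsy's implementation.
# Assumes group is numeric (0 or 1) and 'one' is the reference condition.
REFERENCE = "one"
OTHER_LEVELS = ["two", "three"]  # non-reference levels of condition


def column_names():
    """Hypothetical labels for the columns of the design matrix."""
    names = ["Intercept", "group"]
    names += ["condition[%s]" % lvl for lvl in OTHER_LEVELS]
    names += ["group:condition[%s]" % lvl for lvl in OTHER_LEVELS]
    return names


def design_row(group, condition):
    """Return one design-matrix row for a trial."""
    if condition != REFERENCE and condition not in OTHER_LEVELS:
        raise ValueError("unknown condition: %r" % (condition,))
    # One 0/1 dummy per non-reference level; all zeros for the reference level.
    dummies = [1.0 if condition == lvl else 0.0 for lvl in OTHER_LEVELS]
    # Interaction columns are the group value multiplied by each dummy.
    interactions = [float(group) * d for d in dummies]
    return [1.0, float(group)] + dummies + interactions


def predicted_drift(coefficients, group, condition):
    """Drift rate implied by a coefficient vector for one cell of the design."""
    row = design_row(group, condition)
    return sum(b * x for b, x in zip(coefficients, row))


# Show which columns are switched on for group 1 in condition 'three'.
for name, x in zip(column_names(), design_row(1, "three")):
    print("%-24s %.0f" % (name, x))
```

Read this way, the intercept is the drift rate for group 0 in condition 'one', and every other cell of the design is the intercept plus the coefficients whose columns are non-zero for that cell.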

Tim Sandhu

Sep 23, 2020, 5:41:47 AM

to hddm-users

Hi again!

Just wondering if anyone can provide any help with this problem?

Any help would be greatly appreciated!

Thanks again,

Best,

Tim

Mads Lund Pedersen

Sep 23, 2020, 5:53:24 AM

to hddm-...@googlegroups.com

Hi Tim,

For this type of design I have run separate models for each group to capture within-subject effects, i.e. one model with the within-subject effect for group A and one for group B. However, Thomas recently upgraded HDDM to allow mixed designs, so in principle it should work the way you've set it up. I haven't checked out the upgrade, though, so I can't say for sure.

--

Best,

Mads

Tim Sandhu

Sep 29, 2020, 4:29:39 AM

to hddm-users

Hi Mads,

Thanks for your suggestions, and apologies for not getting back to you sooner.

I've been running a series of models since last week as you suggested, but I seem to be getting unexpected errors, so I thought I'd try to understand those before reporting back. Once I've overcome these, I'll check whether the results of the two separate within-subject models agree with those of the single mixed-design model.

Thanks again,

Best,

Tim

Tim Sandhu

Nov 25, 2020, 7:23:59 AM

to hddm-users

Hi again Mads,

Sorry for my delay - I had numerous technical problems and a few other things crop up.

When I split the groups to run separate within-subject models, I found that the model fits were problematic with respect to autocorrelation and the chain plots. The first thing I thought I would check was whether this was driven by sample size (N=14; please correct me if I'm wrong in thinking this would contribute to model fit quality).

However, when I ran a within-subjects model with double the number of participants (N=28) and compared it with a model treating the same effect as between-subjects, I noticed a few things:

- The 'within' fits weren't as bad as with the lower sample size, but were still of worse quality than the 'between' fits, with 15000 samples. Is this something we would expect, particularly for the bias term?
- The 'within' fits took a lot longer, on the order of 9 hours on a standard machine. Again, is this something I should expect, or is it a fault with my system?
- When fitting the model, I was unable to retrieve the individual parameters for each level of the effect. I was able to get individual parameters for the 'intercept' level of the effect (L1 below), but only a group-level difference between the two levels. Is it possible to get these individual parameters for each level out of the within-subjects model?

The code I used is pasted below:
dmatrix("C(effect, Treatment('L1'))", data)
within_m = hddm.HDDMRegressor(data,
    ["v ~ 1 + C(effect, Treatment('L1'))", "a ~ 1 + C(effect, Treatment('L1'))",
     "t ~ 1 + C(effect, Treatment('L1'))", "z ~ 1 + C(effect, Treatment('L1'))"],
    include=['z', 't'])
within_m.sample(15000, burn=1500, dbname='within_m.db', db='pickle')
within_m.save('within_m')

Many thanks for your help,

Best,

Tim
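
On the autocorrelation mentioned above: one way to put a number on what the autocorrelation plots show is the sample autocorrelation of a trace at a given lag. The plain-Python sketch below is an illustration with made-up chains, not part of HDDM; "autocorrelation" and "thin" are hypothetical helper names. Values near 1 at small lags mean that successive samples carry little new information, and keeping only every k-th sample (thinning) tends to lower the lag-1 value at the cost of fewer samples.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag (0 <= lag < len(chain))."""
    n = len(chain)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(chain) - 1")
    mean = sum(chain) / float(n)
    total_var = sum((x - mean) ** 2 for x in chain)
    if total_var == 0:
        raise ValueError("autocorrelation is undefined for a constant chain")
    # Covariance between the chain and a copy of itself shifted by `lag`.
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag))
    return cov / total_var


def thin(chain, step):
    """Keep every `step`-th sample of the chain."""
    return chain[::step]


# Made-up chains for illustration only.
sticky = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]  # moves slowly
print("lag-1, full chain:    %.3f" % autocorrelation(sticky, 1))
print("lag-1, thinned by 2:  %.3f" % autocorrelation(thin(sticky, 2), 1))
```

High autocorrelation does not by itself show that a chain has failed to converge, but it does mean that more samples are needed for the same precision.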

Mads Lund Pedersen

Nov 27, 2020, 9:58:40 AM

to hddm-...@googlegroups.com

Hi Tim,

By 'within' and 'between' models, do you mean HDDMRegressor and HDDM? HDDM is generally quite a bit faster than HDDMRegressor, so that makes sense.

To estimate individual parameters for all predictors in HDDMRegressor you can set group_only_regressors=False.

It's difficult to say why the HDDMRegressor model looks worse. Have you checked for convergence?

To be clear, you set up two HDDMRegressor models, one for each group? And for the normal HDDM you include group in depends_on and run one model on the entire data?

--

Best,

Mads
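
On reading the regression output: with treatment coding, the model reports an intercept for the reference level (L1 here) and an offset for each other level, so the value at a non-reference level is the intercept plus that level's offset. Adding the traces sample by sample, rather than adding the two posterior means, keeps the uncertainty of the sum intact. The sketch below is a minimal plain-Python illustration with made-up numbers standing in for traces taken from a fitted model; the function names are hypothetical and nothing here calls HDDM. With subject-level regressors, the same sum should apply to each subject's own intercept and offset.

```python
def level_trace(intercept_trace, offset_trace):
    """Posterior samples for a non-reference level: intercept + offset, per sample."""
    if len(intercept_trace) != len(offset_trace):
        raise ValueError("traces must have the same number of samples")
    return [b0 + b1 for b0, b1 in zip(intercept_trace, offset_trace)]


def mean(samples):
    """Arithmetic mean of a list of samples."""
    return sum(samples) / float(len(samples))


def prob_greater_than_zero(samples):
    """Share of posterior samples above zero."""
    return sum(1 for s in samples if s > 0) / float(len(samples))


# Made-up traces for illustration; real ones would come from the fitted model.
v_intercept = [0.50, 0.55, 0.45, 0.52]   # drift rate at the reference level L1
v_offset_l2 = [0.20, 0.15, 0.25, -0.02]  # hypothetical L2 minus L1 difference

v_l2 = level_trace(v_intercept, v_offset_l2)
print("mean v at L2: %.3f" % mean(v_l2))
print("P(L2 > L1):   %.2f" % prob_greater_than_zero(v_offset_l2))
```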

Tim Sandhu

Dec 3, 2020, 3:45:08 AM

to hddm-users

Hi Mads,

Thanks for getting back.

Yes, sorry, I meant HDDMRegressor = within, HDDM = between. Good to know that HDDM is faster, and thanks for the group_only_regressors tip - that has solved that problem.

Apologies for my lack of clarity. What I'm about to discuss is the output from a dataset with just one group; I just included a between-subjects effect. There are 25-30 trials per person for each of the 3 levels of the effect, with 28 participants. I have run an HDDMRegressor model with group_only_regressors=False. This took 20 hours on my PC, which seems like a long time.

Unfortunately I have had repeated issues with the Geweke statistic calculation which I can't seem to get round, but the chains from this model look pretty nasty so I suspect issues with convergence, and the autocorrelation looks very high. Is this a problem with lack of data that might crop up with HDDMRegressor?

I have also managed to get my hands on another computer, which I will try to run this code on to see if this is to do with my system.

Geweke error attached below.

Thanks so much,

Tim

***** Geweke error ********

I have run into the following error when trying to calculate the Geweke statistic for my model. I have seen that others have reported this problem, but I couldn’t find any solutions other than checking for NaN/extreme values (there weren’t any in my data).

As I understand it, if the chain has not converged, print check_geweke should return false. Instead I get the below errors.

In case this is useful - gr takes the value of either 0 or 1 (control/case) and stim takes the value of {bb,bd,tu} (different conditions).

AssertionError                            Traceback (most recent call last)
<ipython-input-6-4b1ae3ba7cfa> in <module>()
      1 #b_3k.print_stats()
      2 from kabuki.analyze import check_geweke
----> 3 print check_geweke(b_3k)
      4
      5 #b_3.gen_stats()

C:\Users\ts772\AppData\Local\Continuum\anaconda2\lib\site-packages\kabuki\analyze.pyc in check_geweke(model, assert_)
    161             msg = "Chain of %s not properly converged" % param
    162             if assert_:
--> 163                 raise AssertionError(msg)
    164             else:
    165                 print(msg)

AssertionError: Chain of knode_name            a
stochastic         True
observed          False
subj              False
node            a(0.TU)
tag             (0, TU)
depends      [gr, stim]
hidden            False
subj_idx            NaN
stim                 TU
rt                  NaN
response            NaN
trialInfo           NaN
keypress            NaN
gaze                NaN
gr                    0
correct             NaN
mean            1.08191
std           0.0490583
2.5q           0.985729
25q             1.04933
50q             1.08188
75q             1.11475
97.5q           1.17916
mc err      0.000562783
map             1.37006
Name: a(0.TU), dtype: object not properly converged
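
A note on the traceback above. It shows check_geweke(model, assert_) raising AssertionError when assert_ is truthy and printing the same message otherwise, so the "error" appears to be the convergence check reporting a failure for a(0.TU) rather than a fault in the calculation. On that reading, calling it with assert_=False should print the message instead of raising. The sketch below mirrors that control flow around a simplified Geweke-style z-score that compares the mean of the first 10% of a chain with the mean of the last 50%. It is an illustration under stated assumptions: it uses plain sample variances rather than spectral estimates, the 2.0 threshold and the function names are choices made for this example, and it is not kabuki's implementation.

```python
import math


def _mean(xs):
    return sum(xs) / float(len(xs))


def _var(xs):
    """Unbiased sample variance."""
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / float(len(xs) - 1)


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified z-score: early-segment mean versus late-segment mean."""
    n = len(chain)
    head = chain[:int(first * n)]
    tail = chain[n - int(last * n):]
    if len(head) < 2 or len(tail) < 2:
        raise ValueError("chain too short for the requested segments")
    diff = _mean(head) - _mean(tail)
    se = math.sqrt(_var(head) / len(head) + _var(tail) / len(tail))
    if se == 0.0:
        # Both segments are constant: equal means give 0, otherwise +/- infinity.
        return 0.0 if diff == 0.0 else math.copysign(float("inf"), diff)
    return diff / se


def check_chain(name, chain, threshold=2.0, assert_=True):
    """Return True if |z| <= threshold; otherwise raise or print, depending on assert_."""
    if abs(geweke_z(chain)) <= threshold:
        return True
    msg = "Chain of %s not properly converged" % name
    if assert_:
        raise AssertionError(msg)
    print(msg)
    return False


# Made-up chains: one stationary, one that keeps drifting upwards.
stable = [0.0, 1.0] * 50
drifting = [float(i) for i in range(100)]

print(check_chain("stable", stable))
print(check_chain("drifting", drifting, assert_=False))
```

A failed check of this kind says that the early and late parts of the chain disagree, which fits with the chain plots described above; longer runs, a longer burn-in, or comparing several independently started chains are the usual next steps.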
