Problems with within-subject HDDM regression

李浣嘉

Nov 21, 2022, 2:52:45 AM

to hddm-users

Hello there! I am new to HDDM and I have been trying to use my own data to replicate a study on value-based decision-making.

In the original paper, the drift rate is defined as v ∼ vG · G − vL · L, where both G and L are continuous within-subject variables.

Q1: Is my code correct if I want to define v as vG · G − vL · L ?

My code:

import hddm

# L was already transformed to negative values before fitting
model = hddm.HDDMRegressor(data, "v ~ G + L")
model.find_starting_values()
model.sample(10000, burn=2000, dbname='model_traces.db', db='pickle')

Q2: After I ran the above code, I got the following results. How should I interpret the intercept at the subject level? And why do v_G and v_L only have group-level parameters?

Q3: I noticed that if I write hddm.HDDMRegressor(data, "v ~ 0 + G + L"), the intercept disappears from the results. Why does this happen? The results are also different if I write hddm.HDDMRegressor(data, "v ~ 1 + G + L"). I find all of this difficult to understand...

Thanks a lot for your help!

Best,

Christine

Nadja Ging-Jehli

Nov 23, 2022, 9:03:56 PM

to hddm-users

Hi Christine,

To your questions:

Q1: I suspect that vG and vL refer to your coefficients for the variables G and L, respectively.

If you want to define the drift rate as v ∼ vG · G − vL · L, then your equation should be "v ~ -1 + G + L" (note that this is equivalent to "v ~ 0 + G + L").

Both of these versions will suppress the addition of an intercept.
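To make the sign convention concrete, here is a minimal sketch using ordinary least squares in NumPy rather than HDDM itself. The covariate ranges and the "true" weights (0.8 and 0.5) are made up for illustration. It shows that, with no intercept and L negated beforehand, the coefficient on the negated L corresponds to vL in v = vG · G − vL · L.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 200

# Hypothetical trial-wise covariates (positive magnitudes).
G = rng.uniform(0.0, 2.0, n_trials)
L = rng.uniform(0.0, 2.0, n_trials)

# Assumed weights, chosen only for this illustration.
vG_true, vL_true = 0.8, 0.5
v = vG_true * G - vL_true * L  # noise-free drift rate per trial

# No intercept column; L is negated before fitting, as in the original post.
X_negated = np.column_stack([G, -L])
coef_negated, *_ = np.linalg.lstsq(X_negated, v, rcond=None)
vG_hat, vL_hat = coef_negated

# If L is left positive, the fitted coefficient carries the minus sign instead.
X_positive = np.column_stack([G, L])
coef_positive, *_ = np.linalg.lstsq(X_positive, v, rcond=None)

print("negated L :", vG_hat, vL_hat)
print("positive L:", coef_positive)
```

Either coding describes the same model; the choice only changes the sign of the coefficient you read off for L.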

However, I would carefully check whether this is indeed what you want to do.

In general, the interpretation of coefficients changes in regression models without an intercept (I explain this a bit more when answering Q3 below).

Q2: I think that this question will be resolved if you define the equation correctly as outlined in Q1.
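A separate point that may bear on Q2: to my knowledge, HDDMRegressor accepts a group_only_regressors argument that defaults to True, in which case the regression coefficients are estimated only at the group level while the intercept still gets subject-level nodes. The sketch below assumes that behavior; the function name and its subject_level_slopes flag are hypothetical helpers, and it is worth checking the HDDM documentation for your installed version before relying on it.

```python
FORMULA = "v ~ 0 + G + L"  # intercept suppressed, as suggested for Q1


def fit_drift_regression(data, subject_level_slopes=False,
                         n_samples=10000, burn=2000):
    """Fit the drift-rate regression (hypothetical helper, not part of HDDM).

    With subject_level_slopes=True, the coefficients for G and L are
    requested per subject instead of at the group level only.
    """
    import hddm  # imported lazily so the sketch can be read without HDDM

    model = hddm.HDDMRegressor(
        data,
        FORMULA,
        # Assumption: True (the default) keeps regressors group-level only.
        group_only_regressors=not subject_level_slopes,
    )
    model.find_starting_values()
    model.sample(n_samples, burn=burn)
    return model
```

Calling fit_drift_regression(data, subject_level_slopes=True) would then be one way to look for per-subject v_G and v_L nodes in the output, at the cost of a larger model that may need more samples to converge.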

Q3: "v ~ 0 + G + L" or "v ~ -1 + G + L" means that you suppress the intercept in your equation.
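One way to see what each formula does is to inspect the design matrix it produces. As far as I know, HDDM passes the right-hand side of the formula to the patsy package, so the sketch below calls patsy directly; the covariate values are made up, and only the resulting column names matter.

```python
from patsy import dmatrix

# Tiny made-up covariate values; only the column names matter here.
data = {"G": [1.0, 2.0, 3.0, 4.0], "L": [-0.5, -1.0, -1.5, -2.0]}


def columns(rhs):
    """Return the design-matrix column names patsy builds for a formula."""
    return dmatrix(rhs, data).design_info.column_names


with_default = columns("G + L")        # intercept added implicitly
with_explicit = columns("1 + G + L")   # intercept requested explicitly
without_zero = columns("0 + G + L")    # intercept suppressed
without_minus = columns("-1 + G + L")  # intercept suppressed

print(with_default, with_explicit)
print(without_zero, without_minus)
```

The first two formulas yield an Intercept column plus G and L, while the last two yield only G and L, which is why the intercept vanishes from the output.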

There are multiple reasons why you might want to do so (the reasons might also depend on whether your covariates are continuous or discrete).

In general, if you include an intercept in your equation (i.e., v ~ 1 + G + L, which is the same as v ~ G + L), then the intercept gives you the value of v when G and L both equal zero (this might or might not be meaningful). That's why people often standardize their variables (e.g., by mean-centering) when including an intercept. Two examples worth considering when deciding whether to exclude the intercept:

For continuous covariates: if you mean-center your variables G and L, then the intercept reflects the value of v when G and L both equal their respective means. Note that you should think about whether to mean-center at the subject level or at the group level (or both). It won't make a difference statistically, but it will change the source of variability that your coefficients capture.
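The effect of mean-centering on the intercept can be sketched with ordinary least squares in NumPy (not HDDM). All numbers here, including the three hypothetical subjects, are simulated for illustration: the slopes stay the same after centering, while the intercept moves from "v at G = L = 0" to "v at the mean of G and L".

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
subj = np.repeat(np.arange(3), n // 3)  # three hypothetical subjects

G = rng.uniform(2.0, 6.0, n)
L = rng.uniform(1.0, 4.0, n)
v = 0.3 + 0.8 * G - 0.5 * L + rng.normal(0.0, 0.1, n)


def ols(y, *covariates):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones_like(y), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


raw = ols(v, G, L)
centered = ols(v, G - G.mean(), L - L.mean())  # group-level centering

# Subject-level centering: subtract each subject's own mean instead.
subj_means = np.array([G[subj == s].mean() for s in range(3)])
G_within = G - subj_means[subj]

print("raw      :", raw)
print("centered :", centered, "mean of v:", v.mean())
```

With group-level centering the intercept equals the overall mean of v; with subject-level centering (G_within) the covariate only carries within-subject variation, which is the distinction about sources of variability made above.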

For discrete measures: excluding the intercept would mean that vG becomes the intercept. In this case, vL is the estimated difference from vG.

This allows you to immediately read off whether vL is significantly different from vG (i.e., if the 95% credible interval of vL does not include zero, then you have reason to believe that vL is systematically different from vG). Note that the interpretation of these quantities changes slightly depending on whether you are in a Bayesian or a frequentist context.
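The credible-interval check can be sketched as follows. The posterior samples here are synthetic stand-ins drawn from a normal distribution, not output from a real model; in HDDM the trace for a coefficient is typically obtained with something like model.nodes_db.node['v_L'].trace(), where the node name is an assumption that depends on your formula.

```python
import numpy as np


def credible_interval(samples, mass=0.95):
    """Equal-tailed interval containing `mass` of the posterior samples."""
    tail = (1.0 - mass) / 2.0
    lower, upper = np.quantile(samples, [tail, 1.0 - tail])
    return lower, upper


# Synthetic stand-in for a posterior trace of one coefficient.
rng = np.random.default_rng(2)
fake_trace = rng.normal(loc=-0.5, scale=0.1, size=8000)

lower, upper = credible_interval(fake_trace)
excludes_zero = not (lower <= 0.0 <= upper)

# A complementary Bayesian summary: posterior probability of a negative effect.
prob_below_zero = np.mean(fake_trace < 0.0)

print(lower, upper, excludes_zero, prob_below_zero)
```

This uses an equal-tailed interval; a highest-density interval is another common choice and can give slightly different bounds for skewed posteriors.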

In general, the syntax for defining equations in regression models in HDDM follows pretty much the same notation as the lme4 package in R.

The lme4 package is used for fitting linear mixed models in R. Check out Table 2 of this documentation: Fitting Linear Mixed-Effects Models using lme4 (r-project.org)
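Since HDDM, as far as I know, builds its design matrices with the patsy package, it can help to check directly how a formula is coded, especially in the discrete-covariate case discussed above. The sketch below uses a made-up two-level factor called cond (a hypothetical name) and compares patsy's default treatment coding with and without an intercept.

```python
from patsy import dmatrix

# Made-up two-level factor; "gain" sorts first and becomes the reference.
data = {"cond": ["gain", "gain", "loss", "loss"]}

with_intercept = dmatrix("C(cond)", data)
without_intercept = dmatrix("0 + C(cond)", data)

names_with = with_intercept.design_info.column_names
names_without = without_intercept.design_info.column_names

print(names_with)     # intercept (reference level) plus a difference term
print(names_without)  # one column per level, no difference term
```

With an intercept, the second column codes the difference of "loss" from the reference level "gain"; without an intercept, each level gets its own column, so the coefficients are level-specific estimates rather than differences.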

In general, this book is super helpful if you want to learn more about linear mixed modeling/hierarchical modeling: Home page for the book, "Data Analysis Using Regression and Multilevel/Hierarchical Models" (columbia.edu). Note that many of the topics discussed in this book also apply to modeling data with sequential sampling models within a Bayesian hierarchical framework (i.e., through HDDM).

Hope that helps!

Best,

Nadja

Nadja R. Ging-Jehli, PhD

Postdoctoral Research Associate in Computational Psychiatry & Cognitive Neuroscience

Brown University

Department of Cognitive, Linguistic & Psychological Sciences

190 Thayer St, Providence, RI 02912

Lab website: https://www.lnccbrown.com/home/

(614) 736-7755 | na...@gingjehli.com | www.gingjehli.com
