Compare a biased-v model to a biased-z model


Lior Lebo

Aug 25, 2018, 4:05:59 PM
to hddm-users
Hi,

I have recently started using HDDM to compare two rival models of subject-specific choice biases in my behavioral data.
For simplicity, assume that all trials in my data are impossible (i.e. there is no correct response) and identical to one another, and that I want to test which of the following models better describes the subject-specific choice biases:
A. biased v (idiosyncratic choice biases are better described by biased drift rates)
B. biased z (idiosyncratic choice biases are better described by biased starting points)
Since there is only one condition (and that it is symmetrical), I figured that I can run model A by:
model_v = hddm.HDDM(data, p_outlier=0.05)
My problem, however, is with model B. I couldn't figure out how to exclude 'v' (treat it as zero) and I am not sure about the meaning of:
model_z = hddm.HDDMRegressor(data, "v ~ 0", include=('z'), p_outlier=.05)

I am also not sure about the best way to capture global biases (as opposed to idiosyncratic ones) in such data. Any chance I can use the "group_only_nodes" statement for that purpose?

Thanks in advance for your kind help!

Best,
Lior

Anne Urai

Aug 27, 2018, 6:15:43 PM
to hddm-...@googlegroups.com
Hi Lior,

to code for biases towards one vs. the other option (e.g. 'apple' vs. 'orange', not correct vs. error), you'll need to use the HDDMStimCoding module. For that, you'll need to organize your data in a slightly different way; see the HDDM documentation (http://ski.clps.brown.edu/hddm_docs/howto.html#code-subject-responses). You'll need to indicate, for each trial, which response is the correct one (i.e. what the true identity of the stimulus is), so I'm not sure what you mean by "assume that all trials in my data are impossible (i.e. no correct response)"?
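For reference, here is a minimal sketch of the trial-by-trial layout that stimulus coding expects. HDDM itself takes a pandas DataFrame; plain dicts, and the particular column values and the stimulus-to-response mapping, are assumptions shown just for illustration:

```python
# Sketch of stimulus-coded data: 'response' records the actual choice
# (0/1), NOT accuracy, while 'stimulus' records the true identity of the
# stimulus on each trial. (HDDM expects a pandas DataFrame with these
# columns; plain dicts are used here for clarity.)
trials = [
    {"subj_idx": 0, "rt": 0.81, "response": 1, "stimulus": 1},  # chose 1, stim was 1
    {"subj_idx": 0, "rt": 1.20, "response": 0, "stimulus": 1},  # chose 0, stim was 1
    {"subj_idx": 0, "rt": 0.95, "response": 1, "stimulus": 2},  # chose 1, stim was 2
]

def accuracy(trial):
    """Accuracy can always be recovered from stimulus-coded data.

    The mapping below (stimulus 1 -> response 1, stimulus 2 -> response 0)
    is an assumed convention; use whatever matches your task.
    """
    correct_response = 1 if trial["stimulus"] == 1 else 0
    return int(trial["response"] == correct_response)

print([accuracy(t) for t in trials])  # → [1, 0, 0]
```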

For example, for A.
   m = hddm.HDDMStimCoding(mydata, stim_col='stimulus', split_param='v',
                drift_criterion=True, bias=False)
and for B. 
   m = hddm.HDDMStimCoding(mydata, stim_col='stimulus', split_param='v',
                drift_criterion=False, bias=True)

the 'drift criterion' (see Ratcliff & McKoon) will then capture a bias in the overall drift rate per person, and the 'bias' will capture each individual's starting point.
Hope that helps! 

Anne E. Urai, MSc
PhD student | Institut für Neurophysiologie und Pathophysiologie
Universitätsklinikum Hamburg-Eppendorf | Martinistrasse 52, 20246 | Hamburg, Germany
www.anneurai.net / @AnneEUrai



Lior Lebo

Aug 29, 2018, 6:20:47 AM
to hddm-users

Hi Anne,

Thanks for your response.
I was aware of the stimulus coding module, but my main problem is that the trials of particular interest are actually completely symmetrical (participants compared the lengths of 2 segments that are of the exact same size) so that the stimulus has no binary identity and a correct response is therefore ill-defined.
I thought that if I am considering only these trials, then I can pretend that one response is correct and the other one isn't and then use accuracy coding. 
Perhaps a more elegant way would be to use stimulus coding with 2 rather than 4 possible sets for the 'stim' and 'response' columns. That is, {'stim'=1, 'response'=1} when one response is chosen and {'stim'=0, 'response'=0} when the other response is chosen. Do you think this makes more sense? My fear is that I am just adding an additional free parameter for the by-subject drift rate, but maybe this can be resolved by using "group_only"?
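In code, the proposed two-set coding might look like the sketch below; the choice labels, column names, and helper function are assumptions for illustration only:

```python
# Sketch of the proposed coding for the symmetrical trials: since there is
# no true stimulus identity, 'stim' is simply set equal to the response,
# so {stim=1, response=1} when one option is chosen and
# {stim=0, response=0} when the other is chosen.
def code_symmetric_trial(chose_option_a, rt, subj_idx):
    response = 1 if chose_option_a else 0
    return {"subj_idx": subj_idx, "rt": rt,
            "stim": response, "response": response}

print(code_symmetric_trial(True, 0.9, 0))
# → {'subj_idx': 0, 'rt': 0.9, 'stim': 1, 'response': 1}
```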

Thanks,
Lior

Anne Urai

Aug 31, 2018, 1:50:47 PM
to hddm-...@googlegroups.com
Hi Lior,

hm, that's an interesting case. I think the easiest solution would be to randomly assign stimulus identities to half of the stimuli and then estimate both an overall drift rate and a drift bias. You should then recover parameter estimates where the drift rate is around 0, and choices are dominated by either bias terms or noise.
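A minimal sketch of this random-assignment step (the column names, labels, and helper function here are assumptions for illustration, not part of HDDM):

```python
import random

# Randomly assign a nominal stimulus identity to each symmetrical trial,
# so half the trials are labelled one way and half the other. With zero
# true drift, the recovered drift rate should then sit near 0 while bias
# terms (or noise) absorb any choice asymmetry.
def assign_random_identities(trials, seed=0):
    rng = random.Random(seed)
    labels = [1, 2] * (len(trials) // 2) + [1] * (len(trials) % 2)
    rng.shuffle(labels)  # random but exactly balanced assignment
    return [dict(t, stimulus=lab) for t, lab in zip(trials, labels)]

trials = [{"rt": 0.8, "response": 1} for _ in range(100)]
coded = assign_random_identities(trials)
print(sum(t["stimulus"] == 1 for t in coded))  # → 50 (exactly half labelled 1)
```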

Some things to consider:
- do you have other trials where there is a correct stimulus? Fitting these together would probably help constrain the other parameters. You could then do something like depends_on={'v': ['difficulty']} and check that, for your impossible trials, the drift rate is zero.
- looking at the supplementary materials of the Wiecki paper, there is non-zero prior mass at v=0, so you should be fine. If you have issues with the expected parameter estimate falling outside the prior range, you could use the setting to turn informative priors off.

Anne E. Urai, MSc
PhD student | Institut für Neurophysiologie und Pathophysiologie
Universitätsklinikum Hamburg-Eppendorf | Martinistrasse 52, 20246 | Hamburg, Germany
www.anneurai.net / @AnneEUrai


Lior Lebo

Sep 19, 2018, 7:06:10 AM
to hddm-users
Hi Anne,

Thanks so much for your reply. My apologies for only noticing it now.
We actually do have asymmetrical trials, but so few per difficulty level (fewer than 10 per correct response) that using them with our current data would be almost hopeless.

As you noted regarding the informative priors, the chain indeed converges to a sample-average drift close to zero. That is, for the model with the biased-drift assumption (unbiased z) and using accuracy coding on the impossible trials only (as if one of the 2 responses were correct).
For the model with the biased-z assumption (v=0), I managed to make minor modifications to the code that allow a zero-drift model, though I am facing more of a conceptual problem. Please see my next reply in this thread; I would be happy to discuss this matter with you.

Best, 
Lior

Lior Lebo

Sep 19, 2018, 7:16:20 AM
to hddm-users
Hello again, and thank you for the useful HDDM toolbox that I just recently discovered.

My current problem is more conceptual, and it relates to the assumption on the distribution of z_j (the inverse logit of a Normal), which may be problematic (even with non-informative priors) when comparing biased-drift and biased-starting-point models, as I do.
This is because, even under the non-informative prior on z_j, σ_z cannot take values much larger than 0.7. So if, for example, μ_z = 0, then, assuming for simplicity that all subjects have the same threshold, z_j will always have a unimodal distribution (see attached figure, z=θ/IC).

[Attachment: inlogitNormal_dist_vs_close_Normal.jpg]
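As a quick numerical illustration of this point, here is a small simulation (my own sketch, with assumed values μ_z = 0 and σ_z = 0.7) showing that invlogit(N(μ_z, σ_z)) stays unimodal and concentrated around 0.5 rather than piling up at the extremes:

```python
import math
import random

# With mu_z = 0 and sigma_z effectively capped around 0.7, samples of
# z_j = invlogit(N(0, 0.7)) concentrate near 0.5: the transform cannot
# produce a bimodal (U-shaped) z_j distribution at this scale.
def invlogit(x):
    return 1.0 / (1.0 + math.exp(-x))

rng = random.Random(42)
z = [invlogit(rng.gauss(0.0, 0.7)) for _ in range(100000)]

mean_z = sum(z) / len(z)
frac_central = sum(0.25 < zi < 0.75 for zi in z) / len(z)
print(round(mean_z, 2))        # ≈ 0.5 by symmetry
print(round(frac_central, 2))  # most of the mass lies well inside (0.25, 0.75)
```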


To see why this is problematic, consider the distribution of p_bias over the sample (the number of trials in which alternative A was chosen divided by the total number of trials, for each participant) for the simple case of equal thresholds and no trial-to-trial variability. One can show analytically that, for the biased-z model, p_bias is distributed the same as z_j. For equal thresholds and no trial-to-trial variability, this is:

[Attachment: pDist_given_biasedZ.png]
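As a sanity check on the claim that p_bias equals z_j when v = 0, here is a small simulation (my own sketch, not from the thread) approximating a driftless diffusion by a symmetric random walk between absorbing bounds; by the gambler's-ruin result, the probability of hitting the upper bound should match the relative starting point:

```python
import random

# Driftless diffusion approximated by an unbiased random walk between
# absorbing bounds 0 and a. Starting at relative position z_rel, the
# probability of absorption at the upper bound is z_rel, so the choice
# fraction p_bias directly mirrors the starting point.
def p_upper(z_rel, a=20, n_trials=10000, seed=1):
    rng = random.Random(seed)
    start = int(round(z_rel * a))
    hits = 0
    for _ in range(n_trials):
        x = start
        while 0 < x < a:
            x += 1 if rng.random() < 0.5 else -1
        hits += x == a
    return hits / n_trials

print(round(p_upper(0.5), 2))  # ≈ 0.50
print(round(p_upper(0.7), 2))  # ≈ 0.70
```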


So if z_j is expected to have a unimodal distribution (due to the priors), then p_bias is also expected to have a unimodal distribution. Then, if the observed p_bias distribution is uniform or bimodal, neither informative nor non-informative priors can account well for the data.
This is not the case for the biased-v model, as the distribution of p_bias can be either unimodal or bimodal, with a Normal distribution of v_j and σ_v within the range of both informative and non-informative priors (see the figure below, where "observed" is the original data and "expected" is given by PPC data, based on each of the two models).

[Attachment: p_bias_observedData_expectedPPC.png]


Theoretically, I could just modify the code so that σ_z can take larger values, allowing the expected p_bias distribution to become bimodal. But I am not sure whether the historical reason for using invlogit(Normal(μ_z, σ_z)) is mainly the 0 < z < 1 transform (so that limiting σ_z merely reflects having no reason to assume z_j is extremely wide or bimodal), or whether it carries some additional assumption that would preclude using a bimodal z_j distribution.

Any thoughts on this matter would be highly appreciated and I will be extremely happy discussing that further.

Best,
Lior

Lior Lebo

Sep 19, 2018, 1:46:17 PM
to hddm-users
A small correction to the possible values of σ_z: above, it should have been 1.3 (E[σ_z] + 3·SD[σ_z]), though the problem remains, as this still does not result in a bimodal distribution.

Thomas Wiecki

Sep 20, 2018, 4:08:55 AM
to hddm-...@googlegroups.com
I'm not sure I follow your theoretical point about the biased v model. Where does your "observed" come from? Are you using the StimCoding model?

Yes: as the parameter is constrained to lie between 0 and 1, an inverse logit was chosen, also to match what was found in the literature (the Matzke paper). You can, however, create your own model and choose a different prior.


Lior Lebo

Sep 20, 2018, 10:37:56 AM
to hddm-users
Hi Thomas, 

Thanks for your reply and obviously, for this useful toolbox.

I am using HDDM to fit behavioral data of a perceptual discrimination task, where the only twist is that the trials of interest have no binary identity, as the stimulus is completely symmetrical (no correct response, difficulty=0). 
It therefore seemed reasonable, at least for the biased-v model, to use "accuracy coding" (with one of the two responses coded as the correct one). To the best of my understanding, the obtained posteriors of v then correspond to biases in drift rates.
For the biased-z model, I first had to make a minor modification to the code so that it would allow a zero-drift assumption. But then I noticed that the biased-z model is sensitive to the magnitude of the preferences that participants exhibit. That is, it may be an appropriate competing model (to the biased-v one) when participants' preferences are moderate, but less so when the preferences have a wide distribution. Surely this can be solved by modifying the range of σ_z, but it is important to do so, otherwise one might mistakenly favor a biased-v model over a biased-z one.
I guess I was mostly interested in knowing whether you considered the case in which z_j is bimodal, which is possible when σ_z of invlogit(N(μ_z, σ_z)) is sufficiently large.

Thanks again,

Lior