Hi,
I was wondering about the correct way to calculate posterior distributions of contrasts in a hierarchical model. Here are the two ways I can think of:
Let's assume I have 2 conditions, A1 and A2, and want to get the posterior distribution for A1-A2. Then I could
a) Take the nodes for the group means of A1 and A2 and subtract them. The posterior distribution is then the distribution of these differences, calculated for all MCMC samples.
b) Subtract A1-A2 for each individual at the level of the participant nodes, and calculate the average of these differences. The posterior distribution is then the distribution of this average of differences, calculated for all MCMC samples.
I tried both, and b) typically shows better results in that the variance of the posterior distribution is smaller. However, I am unsure if this is actually a permissible approach.
The reason why I think that b) is actually the correct way is that I have a design where condition is manipulated within participants. If I just use the group nodes to estimate contrasts, the variance I see comes from between- and within-participant differences. Put differently, I assume that the variation in the group node for a parameter also depends on between-participant variation in that parameter. However, for a contrast in a within-participants design, I am not interested in the between-participant variance of a particular parameter, but only in the between-participant variance of the difference between the 2 parameters.
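To make the two options concrete, here is a small sketch using synthetic stand-in traces (the array names, shapes, and values are made up for illustration; real traces would come from the fitted model):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_subj = 1000, 20

# Hypothetical MCMC traces of the group-mean nodes for A1 and A2 (option a) ...
mu_A1 = rng.normal(1.0, 0.3, n_samples)
mu_A2 = rng.normal(0.7, 0.3, n_samples)
# ... and of the participant-level nodes, shape (n_samples, n_subj) (option b).
subj_A1 = mu_A1[:, None] + rng.normal(0, 0.1, (n_samples, n_subj))
subj_A2 = mu_A2[:, None] + rng.normal(0, 0.1, (n_samples, n_subj))

# a) contrast of the group-mean nodes, one value per MCMC sample
contrast_a = mu_A1 - mu_A2

# b) contrast per participant, then averaged, one value per MCMC sample
contrast_b = (subj_A1 - subj_A2).mean(axis=1)
```

Either way one ends up with one contrast value per MCMC sample, and the empirical distribution of those values is the posterior of the contrast.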
What do you think about this issue?
Cheers – Guido
--
You received this message because you are subscribed to the Google Groups "hddm-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hddm-users+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
On Wed, Mar 6, 2013 at 8:40 AM, Guido Biele <g.p....@psykologi.uio.no> wrote:
Guido Biele | Email: g.p....@psykologi.uio.no | Phone: +47 228 45172
Visiting address: Psykologisk Institutt, Forskningsveien 3 A, 0373 Oslo
Mailing address: Psykologisk Institutt, Postboks 1094 Blindern, 0317 Oslo
# Define the regression descriptor
reg_descr = {'func': "theta*C(stim)",
             'outcome': 'a'}
m_reg = hddm.HDDMRegressor(data,
                           reg_descr,
                           depends_on={'v': 'stim'})
# if you do not want subj dists for the regressors, you can set group_only_regressors=True

Adding these covariates:
['Intercept', 'C(stim)[T.WL]', 'C(stim)[T.WW]', 'theta', 'theta:C(stim)[T.WL]', 'theta:C(stim)[T.WW]']
Hi Thomas,
This looks quite complete, and I got some basic models to work, I hope also with 2 regressors, e.g.:
reg_descr = [{'func': '1 + drug + block', 'outcome': 'a'}, {'func': '1 + drug + block', 'outcome': 'v'}]
I am not sure though that this worked as desired, as I get very weird parameters (e.g. an intercept for v of -15, whereas I get much more reasonable estimates of v if I use only one regression function for v). I'll try again with a longer chain tomorrow.
Here are 2 things I couldn’t figure out:
(1) include=’z’
When I tried to run it on the parameter z (after using "include='z'" in the model setup), this did not work. Instead, I get the following error message:
(2) stim-coding
When I want to run the model with “stim_coding” over 2 stimulus-types
for the parameter v, I would like to multiply the result of func with a 1 for
one stimulus type and with -1 for the other. Were the 1 and -1 supposed to be
coded in the theta you were mentioning in an earlier email?
( reg_descr = {'func': "theta*C(stim)",'outcome': 'a'} )
or would I need to approach this differently, for instance using the I(…)
syntax of patsy to get arithmetic transformations?
For me the following seemed to work, where drug and block are categorical variables and stim codes one stimulus as 1 and the other as -1:
reg_descr = [{'func': '1 + I(drug*stim) + I(block*stim)', 'outcome': 'v'}]
Cheers - Guido
Hi,
I made a few more checks, which led me to revise my earlier email.
First, what remains is that HDDMregressor does not seem to work for z. I tried it with and without the “include = (‘z’)” specification, but neither worked.
What I wrote about stim-coding was, however, not so helpful, I think.
Here is what I am thinking now about the stim-coding issue:
(As a reminder: I suggest to implement stim-coding for e.g.
v by (a) multiplying all rows of the design matrix for one stimulus with a 1
and for the other stimulus with a -1, and (b) specifying a prior for the
intercept to be larger than 0)
I think this cannot easily be solved within the patsy formulas. Rather, one could let patsy make a design matrix without stim-coding info, and then add the stim-coding to the patsy-generated design matrix.
As far as I can tell, this could be implemented in lines 72+
of hddm_regression.py.
Priors for regression-parameters for stim-coding should
probably be adjusted in line 196+ of hddm_regression.py.
I would suggest a gamma distribution as the prior for the intercept of v and a normal distribution for the other v-regressors.
Unfortunately the case is more complicated for z, because z is constrained to [0 1], and it is a product of multiple regressors. To give an example, if the intercept for z has the value .6, then the parameter for the effect of one level of a condition should be constrained to [-.6 .4]. I’ll try to find out if pymc has a function to implement such a linear constraint.
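For illustration, one generic way such an interval constraint is often avoided (a sketch of the general technique, not necessarily how pymc or HDDM will handle it) is to run the linear model on the logit scale and map back through the inverse logit, which keeps z in (0, 1) for any real-valued coefficients:

```python
import numpy as np

def invlogit(eta):
    # Maps any real number to the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical linear predictor on the logit scale:
# an intercept of 0.4 plus a condition effect of -2.0.
eta = np.array([0.4, 0.4 - 2.0])
z = invlogit(eta)
# No matter how large the coefficients get, z never leaves (0, 1),
# so no explicit interval constraint like [-.6, .4] is needed.
```

The price is that the regression coefficients then live on the logit scale rather than directly on the z scale.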
Cheers - Guido
import numpy as np
from patsy import ContrastMatrix

class StimCoding(object):
    """Custom patsy contrast that codes one stimulus as +1 and the other as -1."""
    def __init__(self, reference=0):
        self.reference = reference

    def code_with_intercept(self, levels):
        assert len(levels) == 2, "Only 2 levels are supported for stimulus coding."
        return ContrastMatrix(np.array([[1, 0], [0, -1]]),
                              ["[My.%s]" % (level,) for level in levels])

    def code_without_intercept(self, levels):
        raise NotImplementedError("StimCoding can only be used in a formula "
                                  "without an intercept; add '0 + C(...)'.")

from patsy import dmatrix
data = ['stim1', 'stim1', 'stim2', 'stim2']
dmatrix("0 + C(stims, StimCoding)", {'stims': data})
Thomas
On Monday, March 11, 2013 11:15:53 PM UTC+1, Guido Biele wrote:
PS: It would seem to me that it would be useful to use somewhat different priors for stim-coding of v (and z, if this works). However, I also see that the code can get quite complex when this is also implemented...
On Monday, March 11, 2013 6:49:54 PM UTC+1, Thomas wrote:
I did some minor tweaks like adding support for multiple regressors and updated the documentation. I think it's pretty close to final so you might want to update and let me know if you agree.
On Mon, Mar 11, 2013 at 8:10 AM, Guido Biele <g.p....@psykologi.uio.no> wrote:
Hi Thomas,
thanks, it worked now.
cheers - guido
On Monday, March 11, 2013 1:03:03 PM UTC+1, Thomas wrote:
Hi Guido,
Indeed. I added the files. Let me know if it still fails.
Thomas
On Mon, Mar 11, 2013 at 5:47 AM, Guido Biele <g.p....@psykologi.uio.no> wrote:
Hi Thomas,
I tried to install hddm patsy, but the cdfdif_wrapper is causing a problem, because the C source code seems to be missing.
Here is the relevant section of the log:
"building 'cdfdif_wrapper' extension
gcc -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -mavx -fPIC -I/cluster/software/VERSIONS/python2-packages-2.7_2/lib/python2.7/site-packages/numpy/core/include -I/cluster/software/VERSIONS/python2-2.7.3/include/python2.7 -c src/cdfdif_wrapper.c -o build/temp.linux-x86_64-2.7/src/cdfdif_wrapper.o
gcc: src/cdfdif_wrapper.c: No such file or directory
gcc: no input files
error: command 'gcc' failed with exit status 1"
I'm not sure what the problem is. The cdfdif_wrapper is also in the master branch, but it seems that it is not compiled when installing hddm, or I am missing something ...
cheers - Guido
On Saturday, March 9, 2013 11:15:09 PM UTC+1, Thomas wrote:
OK, I took a stab at this. In our github repo there is a new branch called 'patsy' which requires the kabuki develop branch. This greatly simplifies regression model specification while making it pretty easy to specify very complex models.
You can now specify a model with, for example, a trial-by-trial regression on theta, within-subject categorical effects for stimulus (WW, LL, WL) and interaction effects as follows:

# Define the regression descriptor
reg_descr = {'func': "theta*C(stim)", 'outcome': 'a'}
m_reg = hddm.HDDMRegressor(data, reg_descr, depends_on={'v': 'stim'})
# if you do not want subj dists for the regressors, you can set group_only_regressors=True

which tells you which covariates are being created:

Adding these covariates: ['Intercept', 'C(stim)[T.WL]', 'C(stim)[T.WW]', 'theta', 'theta:C(stim)[T.WL]', 'theta:C(stim)[T.WW]']

Intercept will be the threshold during the 'LL' condition. C(stim)[T.WL] will be the offset during the 'WL' condition. theta is a trial-by-trial measure for which the regression coefficient will be estimated. The others are interaction terms.
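To make the treatment coding concrete, here is a small sketch with made-up coefficient values showing how condition-wise thresholds would be recovered from these covariates:

```python
# Hypothetical posterior means; 'LL' is the reference level under
# patsy's default treatment coding, so the dummies are offsets from it.
coef = {'Intercept': 1.8,         # threshold a in the 'LL' condition
        'C(stim)[T.WL]': 0.3,     # offset of 'WL' relative to 'LL'
        'C(stim)[T.WW]': -0.2}    # offset of 'WW' relative to 'LL'

a_LL = coef['Intercept']
a_WL = coef['Intercept'] + coef['C(stim)[T.WL]']
a_WW = coef['Intercept'] + coef['C(stim)[T.WW]']
```

The same additions apply per MCMC sample if one works with the full traces instead of posterior means.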
For more information on how the syntax works I refer to the patsy docs:
And these slides:
Let me know about any experiences you have with this.
Thomas
On Thu, Mar 7, 2013 at 6:57 PM, Thomas Wiecki <thomas...@gmail.com> wrote:
I think this is a very interesting direction to take. If this allows for within-subject effects and other tricks like stimulus coding, many problems that would otherwise require a hand-tuned version would be solved.
I had an idea that we could include patsy to specify the design matrix: http://patsy.readthedocs.org/en/latest/quickstart.html. This would then allow for better syntax along the lines of R in specifying a GLM, e.g. "y ~ x1 + x2".
Cheers – Guido
Hi Thomas,
nice that the patsy method works!! Just to make sure that we are on the same page, here is an example of a design matrix with stimulus coding for a design with 2 stimulus types and an orthogonal condition with 3 levels:

data = {"s": ["s1", "s1", "s1", "s2", "s2", "s2"],
        "c": ["c1", "c2", "c3", "c1", "c2", "c3"]}
patsy.dmatrix("0 + C(s, [[1], [-1]]) + C(s, [[1], [-1]]):C(c)", data)

which makes the design matrix

array([[ 1.,  0.,  0.],
       [ 1.,  1.,  0.],
       [ 1.,  0.,  1.],
       [-1., -0., -0.],
       [-1., -1., -0.],
       [-1., -0., -1.]])

where column one gives us the effect for condition c1, column two gives us the difference between c1 and c2, and column three the difference between c1 and c3. The nice thing is that this works with the current version of HDDMRegressor (i.e. the modification you described in your first email today is not required).
I think I still don't understand how the inverse logit as a prior helps to implement a linear constraint. If we take the design matrix above to model z, then one could do this by defining z as

z = .5 + X*beta

(in matrix notation, where beta is a column vector with parameters and X is the design matrix). The constraint would then be -.5 <= X*beta <= .5. What I specifically do not understand is how using inverse logit priors for the three parameters in beta helps to implement the constraint. But it might well be that I am just not aware of the relevant literature.
...
...
Hi Thomas,
I agree that the patsy approach is very versatile. I would still not dismiss the depends_on version, because it seems to be much faster when I implement the same design (depends_on was 4.2 times faster in my quick test).
About inclusion of z and bias: I looked at your unit test (I actually didn't know if/how I can run it with a simple command...). The difference between your test and what I was trying to do is that I tried to have the regression model z, whereas you included z but used the regression to model v. I can also do that, but as far as I can tell the regression can at the moment not be used to model z.
I now see how you use the inverse logit to transform the regression output into a parameter. Sorry for not picking this up earlier; I think this makes sense, even if it makes the interpretation of regression parameters for z a bit more difficult.
prior_descr = {'z_bottom': {'transform': lambda x: invlogit(x)},
               'z_Intercept': {'prior_func': pm.Normal, 'prior_params': {'mu': .5, 'tau': 1}}}
PS: I can write a short documentation for the within-subjects stim-coding issue for v and z if you like.
HDDMRegressor(data, {'func': 'C(cond)', 'outcome': 'v'})
HDDMRegressor(data, 'v ~ C(cond)')
...
I think the new way to specify a model is nice and quicker to parse for the human reader ;-)!
When reading through the patsy documentation to prepare a how-to, I found that they have a number of nice "built in" functions to prepare design matrices for particular contrasts of interest. I think this is very attractive, as it provides a fast way to set up design matrices in which the regression coefficients can directly be read as contrasts. For example, let's assume you have a design with a condition drug with 3 ordered levels (e.g. antagonist, placebo, agonist). Then

hddm.HDDMRegressor(data, 'v ~ C(drug, Diff)')

would set up a design matrix where the regressors model (1) the reference level (e.g. drug1_antagonist) as the intercept, (2) a regressor capturing the contrast drug2_placebo > drug1_antagonist, and (3) a regressor capturing the contrast drug3_agonist > drug2_placebo.
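To see what this Diff (backward difference) coding produces, one can inspect the design matrix patsy generates directly (a sketch; the drug levels here are made up for illustration):

```python
import numpy as np
from patsy import dmatrix
from patsy.contrasts import Diff

# Hypothetical ordered factor with 3 levels, one trial per level.
levels = ['antagonist', 'placebo', 'agonist']
data = {'drug': ['antagonist', 'placebo', 'agonist']}

# Diff coding: each non-intercept coefficient estimates the difference
# between a level and the preceding level.
X = np.asarray(dmatrix("C(drug, Diff, levels=levels)", data))
```

Fitting a saturated model with this matrix yields coefficients that read directly as successive-difference contrasts (placebo - antagonist, agonist - placebo).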
The problem is that the stimulus-coding scheme we discussed earlier is not easily combined with these built-in designs (i.e. one can do either one or the other).
However, one could have both if one does not let the user implement the stim-coding through patsy, but rather adds an additional input variable (e.g. "stim_coding=False" as default, which the user can change to "stim_coding=True"). When stim_coding is set to True, these few lines would add stim-coding to any existing design matrix X that was previously defined with patsy (assuming the stimulus column is named stim):

def stimcode(X, data):
    ncols = X.shape[1]
    stimdata = {'s': data['stim']}
    stim = np.asarray(dmatrix('0 + C(s, [[1], [-1]])', stimdata))
    mstim = np.kron(stim, np.ones((ncols, 1))).reshape(-1, ncols)
    return X * mstim
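A quick self-contained check of this stimcode idea on a toy design matrix (the helper is restated here with its imports so the snippet runs on its own):

```python
import numpy as np
from patsy import dmatrix

def stimcode(X, data):
    # Multiply every design-matrix row by +1 for one stimulus
    # and by -1 for the other (stim coding).
    ncols = X.shape[1]
    stim = np.asarray(dmatrix('0 + C(s, [[1], [-1]])', {'s': data['stim']}))
    mstim = np.kron(stim, np.ones((ncols, 1))).reshape(-1, ncols)
    return X * mstim

data = {'stim': ['s1', 's2', 's1', 's2']}
X = np.ones((4, 2))   # toy design matrix: two columns of ones
Xs = stimcode(X, data)
# Rows belonging to 's2' trials are now sign-flipped.
```

The np.kron/reshape step just repeats each trial's +1/-1 code across all columns so the elementwise product flips whole rows at once.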
About the regression for z: the new tests for include = 'z' together with the regression for z do not work for me.
...
link_func = lambda x: 1 / (1 + np.exp(-x))
m = hddm.HDDMRegressor(data, {'model': 'z ~ cov', 'link_func': link_func}, include='z')
...
def link_func(x, data=my_data_frame):
    stims = data.stim.ix[x.index]
    return .5 + 2 / (1 + np.exp(-(x * stims)))
...
Hi Thomas,
I have implemented the method you suggested and it works. For the benefit of others who want to test similar things, here is what I did:
I created test data (with hddm.generate) of an experiment with 2 stimulus types and 3 conditions, in which each condition influences drift rate and bias. Because I want interpretable v and z parameters, I set the hddm analysis up with accuracy-coded data. z and v were modeled with an intercept regressor (modeling condition 1) + 2 dummy regressors for conditions 2 and 3, respectively. In addition I used a link function for the parameter z, which ensures that the effective z is "z" for stimulus type A and "1-z" for stimulus type B.
I implemented the model in the development branch. The one thing I had to adjust is line 77 in hddm_regression.py. The original code is

predictor = link_func((design_matrix * params).sum(axis=1))

which I changed to

predictor = link_func(pd.DataFrame((design_matrix * params).sum(axis=1), index=data.index))

because when implementing the stimulus coding via a link function the predictor has to be a pandas data frame.
For the experiment described above, the link function for stimulus-coding of z is

def z_link_func(x, data=mydata):
    stim = np.asarray(dmatrix('0 + C(s, [[1], [-1]])',
                              {'s': data.stim.ix[x.index]}))
    return 1 / (1 + np.exp(-(x * stim)))

the regression models for z and v are

z_reg = {'model': 'z ~ 1 + C(condition)', 'link_func': z_link_func}
v_reg = {'model': 'v ~ 1 + C(condition)', 'link_func': lambda x: x}

which one can further use to set up the model

reg_descr = [z_reg, v_reg]
m_reg = hddm.HDDMRegressor(mydata, reg_descr, include='z')

When looking at the results (after only 15000 samples for a data set of 9000 trials), the mean z values were closer to the true values than the mean v values. [Keep link functions and regression model in mind when calculating v or z from the regression results ...]
One thing I was wondering about when setting this up is whether the way the link function is set up now means that dmatrix will be called for each MCMC sample. If so, I think a better way would be to use dmatrix only once outside the link function and to submit the array with the stim coding directly to the link function (if this is possible, I'll try it later).
def func(args, design_matrix=dmatrix(reg['model'], data=data), link_func=reg['link_func']):
    ...
Thanks again to Thomas for making this possible. I'll write up a short documentation on using HDDMRegressor with patsy (with and without stim-coding ...) in the next days.