p-values for Tukeys HSD

178 views

Skip to first unread message

Thomas Haslwanter

unread,

Aug 5, 2015, 12:35:27 PM8/5/15

to pystatsmodels

Currently statsmodels does not provide p-values for Tukey's HSD test.
In his blog

http://jpktd.blogspot.co.at/2013/03/multiple-comparison-and-tukey-hsd-or_25.html

Josef calculates them by hand, using

>>> res2 = pairwise_tukeyhsd(dta2['StressReduction'], dta2['Treatment'])

>>> t_stat = res2[1][2] / res2[1][3] / np.sqrt(2)
>>> print t_stat
[ 3.15579093  2.10386062 -1.05193031]

>>> my_pvalues = stats.t.sf(np.abs(t_stat), 27) * 2   #two-sided
>>> my_pvalues
array([ 0.00390847,  0.04484301,  0.30215552])

However, the since he wrote this blog, the return values of "pairwise-tukeyhsd" have changed, and
I cannot reproduce his results.

I have also tried

    t_stat = (studentized_mean / studentized_variance) / np.sqrt(2)
    dof = len(data) - len(multiComp.groupsunique)
    my_pvalues = stats.t.sf(np.abs(t_stat), dof) * 2  # two-sided
    print(my_pvalues)

but end up with a different result. I have tried to work my way through the corresponding statsmodels
documentation/code, but was not all that successful :(

Can anyone tell me how to adapt Josef's code to obain p-values for Tukey's HSD-test?

Thanks, thomas

josef...@gmail.com

unread,

Aug 5, 2015, 7:23:18 PM8/5/15

to pystatsmodels

AFAIR, this https://github.com/statsmodels/statsmodels/issues/2346 is the latest discussion

tukeyhsd takes multiple testing or comparison into account, the p-value is not a simple t.sf.

Josef


Thanks, thomas

Reply all

Reply to author

Forward

0 new messages