tukey HSD - a not so pretty graph

josef...@gmail.com

unread,

Dec 29, 2010, 8:03:06 PM12/29/10

to pystat...@googlegroups.com

I finally have the first passing test for tukey hsd multiple comparison.

First plot is from R, second plot is mine.

My graph doesn't look as nice, pretty much brute force from the
closest gallery example I saw.

>>> m_r = [94.4, 102.5, 91.1, 118.2, 99.2]
>>> result = tukeyhsd(m_r, 6., 111, alpha=0.05, df=4) #example multicomp in R p 83

>>> npairs = len(result[1])

>>> plt.axis([-50,50,-1,11])
[-50, 50, -1, 11]
>>> for pair in range(len(result[1])):
l = plt.axhline(y=npairs-pair, xmin=0.5+result[4][pair,0]/100.,
xmax=0.5+result[4][pair,1]/100., linewidth=2, color='b')

>>> plt.axvline(x=0.5)
<matplotlib.lines.Line2D object at 0x03238290>
>>> plt.title('95 % confidence intervals for pairwise differences - Tukey HSD')

the start and endpoints are just a 2d array of confidence intervals,
the tick labels for the y axis are pairs of means that are compared
either string or in my current version, 2-tuple of indices.

>>> result[4]
array([[-10.04391794, 26.34391794],
[-21.45225794, 14.93557794],
[ 5.61441206, 42.00224794],
[-13.40225794, 22.98557794],
[-29.60225794, 6.78557794],
[ -2.53558794, 33.85224794],
[-21.55225794, 14.83557794],
[ 8.87275206, 45.26058794],
[-10.14391794, 26.24391794],
[-37.21058794, -0.82275206]])

Does anyone have a premade recipe with matplotlib, that look closer to
the plot in R?

For another graph, I need to have horizontal lines indicating subsets
of points on the x axis that are accepted as having the same mean. I
didn't look for a reference graph yet.
(REGWQ step-down procedure in SAS or SPSS)

Josef

r_tukey.png

mytukeyhsd.png

Skipper Seabold

unread,

Dec 30, 2010, 12:31:04 PM12/30/10

to pystat...@googlegroups.com

Some scratch work to make it replicable.

import numpy as np
import matplotlib.pyplot as plt

results = np.array([[-10.04391794, 26.34391794],

[-21.45225794, 14.93557794],
[ 5.61441206, 42.00224794],
[-13.40225794, 22.98557794],
[-29.60225794, 6.78557794],
[ -2.53558794, 33.85224794],
[-21.55225794, 14.83557794],
[ 8.87275206, 45.26058794],
[-10.14391794, 26.24391794],
[-37.21058794, -0.82275206]])

npairs = len(results)

Nothing premade, but there are two ways to do this. If you want to
use axhline to span the axes then you have to plot two overlapping
hlines

for pair in range(npairs):
data = .5+results[pair]/100.
# plot whole line with markers at endpoints
l = plt.axhline(y=npairs-pair, xmin=data[0], xmax=data[1], linewidth=.5,
color='b', marker="|", markevery=1)
# plot half way line with markers at endpoints
l = plt.axhline(y=npairs-pair, xmin=data[0], xmax=data.mean(), linewidth=.5,
color='b', marker="|", markevery=1)

Or you can just use plot and plot them all in one go with markers at
each of the data points

for pair in range(npairs):
data = results[pair]
data = np.r_[data[0],data.mean(),data[1]]
l = plt.plot(data, [npairs-pair]*len(data), color='b',
linewidth=.5, marker="|", markevery=1)

Skipper

Skipper Seabold

unread,

Dec 30, 2010, 12:33:03 PM12/30/10

to pystat...@googlegroups.com

PS

plt.axvline(x=.1, linestyle="--")

For the dashed vline.

John Hunter

unread,

Dec 30, 2010, 12:48:16 PM12/30/10

to pystat...@googlegroups.com

On Thu, Dec 30, 2010 at 11:33 AM, Skipper Seabold <jsse...@gmail.com> wrote:

>> Does anyone have a premade recipe with matplotlib, that look closer to
>> the plot in R?
>>
>
> Some scratch work to make it replicable.
>
> import numpy as np
> import matplotlib.pyplot as plt
>
> results = np.array([[-10.04391794, 26.34391794],
> [-21.45225794, 14.93557794],
> [ 5.61441206, 42.00224794],
> [-13.40225794, 22.98557794],
> [-29.60225794, 6.78557794],
> [ -2.53558794, 33.85224794],
> [-21.55225794, 14.83557794],
> [ 8.87275206, 45.26058794],
> [-10.14391794, 26.24391794],
> [-37.21058794, -0.82275206]])
>
> npairs = len(results)
>
> Nothing premade, but there are two ways to do this. If you want to
> use axhline to span the axes then you have to plot two overlapping
> hlines

matplotlib has some linestyles for making the vertical marker lines. Unfortunately we do not have a tick center, but you can get one by making two plots with tick up and tickdown:

In [60]: t = np.arange(0, 2, 0.1)

In [61]: s = np.sin(2*pi*t)

In [62]: plot(t, s, '-')
Out[62]: [<matplotlib.lines.Line2D object at 0x9866fcc>]

In [63]: plot(t, s, ls='None', marker=lines.TICKDOWN, markersize=10, lw=2, markeredgecolor='black')
Out[63]: [<matplotlib.lines.Line2D object at 0x982c90c>]

In [64]: plot(t, s, ls='None', marker=lines.TICKUP, markersize=10, lw=2, markeredgecolor='black')
Out[64]: [<matplotlib.lines.Line2D object at 0x986794c>]

John Hunter

unread,

Dec 30, 2010, 12:48:58 PM12/30/10

to pystat...@googlegroups.com

On Thu, Dec 30, 2010 at 11:48 AM, John Hunter <jdh...@gmail.com> wrote:

In [64]: plot(t, s, ls='None', marker=lines.TICKUP, markersize=10, lw=2, markeredgecolor='black')
Out[64]: [<matplotlib.lines.Line2D object at 0x986794c>]

I forgot to mention, lines is matplotlib.lines

import matplotlib.lines as lines

Skipper Seabold

unread,

Dec 30, 2010, 12:58:20 PM12/30/10

to pystat...@googlegroups.com

There appear to be some kind of discontinuity effects here?

Skipper

markerstyles.png

John Hunter

unread,

Dec 30, 2010, 1:02:56 PM12/30/10

to pystat...@googlegroups.com

On Thu, Dec 30, 2010 at 11:58 AM, Skipper Seabold <jsse...@gmail.com> wrote:

There appear to be some kind of discontinuity effects here?

Skipper

I can't decide if that is an optical illusion or not! You only see it when the underlying line is not flat, suggesting it may be an optical illusion. What do you think?

JDH

John Hunter

unread,

Dec 30, 2010, 1:04:25 PM12/30/10

to pystat...@googlegroups.com

Take a look at some of these: http://www.jimloy.com/puzz/illusion.htm

John Hunter

unread,

Dec 30, 2010, 1:10:33 PM12/30/10

to pystat...@googlegroups.com

On Thu, Dec 30, 2010 at 11:58 AM, Skipper Seabold <jsse...@gmail.com> wrote:

There appear to be some kind of discontinuity effects here?

Skipper

Try this -- use the 't' key to toggle the sine wave on and off. If you focus on the ticks, they do not appear to move when the sine is toggled, suggesting it is an optical illusion

import numpy as np
import matplotlib.lines as lines
import matplotlib.pyplot as plt

t = np.arange(0, 2, 0.1)

s = np.sin(2*np.pi*t)

ax = plt.subplot(111)
ax.plot(t, s, ls='None', marker=lines.TICKUP, markersize=10, lw=2, markeredgecolor='black')
ax.plot(t, s, ls='None', marker=lines.TICKDOWN, markersize=10, lw=2, markeredgecolor='black')

line, = ax.plot(t, s, '-')

def toggle(event):
    if event.key=='t':
        v = line.get_visible()
        line.set_visible(not v)
        ax.figure.canvas.draw()

ax.figure.canvas.mpl_connect('key_press_event', toggle)

plt.show()

Skipper Seabold

unread,

Dec 30, 2010, 1:32:23 PM12/30/10

to pystat...@googlegroups.com

Ha, that's what I thought at first then I convinced myself that it
wasn't an optical illusion. It sure seems to be. That's a neat trick
with the toggle!

Thanks,

Skipper

Skipper Seabold

unread,

Dec 30, 2010, 3:03:23 PM12/30/10

to pystat...@googlegroups.com

For searching later, script with better formatting attached and
inlined. Maybe not best practice all around but gets pretty close to
R

import numpy as np
import matplotlib.pyplot as plt

import matplotlib.ticker as mticker
import matplotlib.lines as lines

results = np.array([[-10.04391794, 26.34391794],
[-21.45225794, 14.93557794],
[ 5.61441206, 42.00224794],
[-13.40225794, 22.98557794],
[-29.60225794, 6.78557794],
[ -2.53558794, 33.85224794],
[-21.55225794, 14.83557794],
[ 8.87275206, 45.26058794],
[-10.14391794, 26.24391794],
[-37.21058794, -0.82275206]])

npairs = len(results)

fig = plt.figure()
fsp = fig.add_subplot(111)
fsp.axis([-50,50,0.5,10.5])
fsp.set_title('95 % family-wise confidence level')
fsp.title.set_y(1.025)
fsp.set_yticks(np.arange(1,11))
fsp.set_yticklabels(['V-T','V-S','T-S','V-P','T-P','S-P','V-M',
'T-M','S-M','P-M'])
#fsp.yaxis.set_major_locator(mticker.MaxNLocator(npairs))
fsp.yaxis.grid(True, linestyle='-', color='gray')
fsp.set_xlabel('Differences in mean levels of Var', labelpad=8)
fsp.xaxis.tick_bottom()
fsp.yaxis.tick_left()

xticklines = fsp.get_xticklines()
for xtickline in xticklines:
xtickline.set_marker(lines.TICKDOWN)
xtickline.set_markersize(10)

xlabels = fsp.get_xticklabels()
for xlabel in xlabels:
xlabel.set_y(-.04)

yticklines = fsp.get_yticklines()
for ytickline in yticklines:
ytickline.set_marker(lines.TICKLEFT)
ytickline.set_markersize(10)

ylabels = fsp.get_yticklabels()
for ylabel in ylabels:
ylabel.set_x(-.04)

for pair in range(npairs):
data = .5+results[pair]/100.

fsp.axhline(y=npairs-pair, xmin=data[0], xmax=data[1], linewidth=.5,
color='black', marker="|", markevery=1)
fsp.axhline(y=npairs-pair, xmin=data[0], xmax=data.mean(), linewidth=.5,
color='black', marker="|", markevery=1)

#for pair in range(npairs):
# data = .5+results[pair]/100.
# data = results[pair]
# data = np.r_[data[0],data.mean(),data[1]]
# l = plt.plot(data, [npairs-pair]*len(data), color='black',
# linewidth=.5, marker="|", markevery=1)

fsp.axvline(x=0, linestyle="--", color='black')

fig.subplots_adjust(bottom=.125)
plt.show()

tukeyplot.png

tukeyplot.py

josef...@gmail.com

unread,

Dec 30, 2010, 4:33:52 PM12/30/10

to pystat...@googlegroups.com

Thanks Skipper, John,

The last graphs looks really close to R.

I'm getting bored of multiple comparison, fighting most of the day
with recursive descend into subsets. The module has now 1700 lines
with examples and duplicate trial versions and my logfile for the IDLE
session is at 4000 lines, and I'm still not done.

I will commit tonight and finally merge again with and into devel, and
try to have some fun with something different.

Skipper, do you have uncommitted changes for devel?

Josef

Skipper Seabold

unread,

Dec 30, 2010, 4:49:06 PM12/30/10

to pystat...@googlegroups.com

I hear you. Finishing cleaning up arma tests now. Ugh. Plus side,
I'm almost done and everything looks pretty good.

>
> I will commit tonight and finally merge again with and into devel, and
> try to have some fun with something different.
>
> Skipper, do you have uncommitted changes for devel?
>

Nothing in devel. All in my branch. I refactored all of our tests to
use class methods for setup, made cython an option at the build stage,
and will finish the arma tests shortly. Details later.

Skipper

Reply all

Reply to author

Forward