To get a sense of speed, here's a comparison with some easier problems
like mean and max:
>> import numpy as np
>> import bottleneck as bn
>> a = np.random.rand(1000,1000)
>> timeit bn.move_mean(a, 100)
100 loops, best of 3: 7.45 ms per loop
>> timeit bn.move_max(a, 100)
100 loops, best of 3: 15.9 ms per loop
>> timeit bn.move_median(a, 100)
10 loops, best of 3: 74.3 ms per loop
>>
>> a = np.random.rand(1000000)
>> timeit bn.move_mean(a, 10000)
100 loops, best of 3: 8.17 ms per loop
>> timeit bn.move_max(a, 10000)
100 loops, best of 3: 16.9 ms per loop
>> timeit bn.move_median(a, 10000)
10 loops, best of 3: 124 ms per loop
There's one failing unit test for when the input is float32 and
window=1. In that case the output is float64 but should be float32.
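For anyone who wants to reproduce that dtype check outside the test suite, here is a rough sketch. Both `slow_move_median` and `dtype_preserved` are hypothetical helpers written for this post, not part of bottleneck; the slow reference assumes the usual convention of NaN until the window fills.

```python
import numpy as np

def slow_move_median(a, window):
    """Slow reference moving median along the last axis (hypothetical helper).

    Output has the same dtype as the input; positions before the first
    full window are NaN, so the input is assumed to be a float array.
    """
    a = np.asarray(a)
    out = np.full(a.shape, np.nan, dtype=a.dtype)
    for i in range(window - 1, a.shape[-1]):
        # Median of the trailing `window` elements ending at index i.
        out[..., i] = np.median(a[..., i - window + 1:i + 1], axis=-1)
    return out

def dtype_preserved(func, dtype, window):
    """Return True if func(a, window) keeps the dtype of its input."""
    a = np.arange(10, dtype=dtype)
    return func(a, window).dtype == a.dtype

print(dtype_preserved(slow_move_median, np.float32, 1))
```

To check a bottleneck build instead of the reference, pass `bn.move_median` as `func`, e.g. `dtype_preserved(bn.move_median, np.float32, 1)`.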
If you want to give it a try:
$ cd bottleneck/bottleneck/src
$ make all
It takes a while to build. Report success/failure.
I don't understand why I didn't have to do any work for int dtypes. I
used the exact same code as for floats. I guess ints are automatically
cast to float? No idea and I didn't investigate.
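A quick way to test the cast-to-float guess is to look at the output dtype for an int input. The sketch below uses a hypothetical helper, `output_dtype`, and demonstrates it on plain NumPy's `np.median` (which does return float64 for int input); it says nothing by itself about what bottleneck does.

```python
import numpy as np

def output_dtype(func, dtype, window=3):
    """Return the dtype func produces for a small input of the given dtype."""
    a = np.arange(10).astype(dtype)
    return np.asarray(func(a, window)).dtype

def numpy_last_window_median(a, window):
    """Stand-in for comparison only: NumPy median of the last `window` items."""
    return np.median(a[-window:])

print(output_dtype(numpy_last_window_median, np.int64))
```

Passing `bn.move_median` as `func` would show whether the int input really comes back as float.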