If we want the moving median of, say, a 2d array along the columns,
then each column is a 1d moving median problem. Currently I create a
double heap structure at the top of each column and free it at the
end. That is not efficient. So I tried creating the double heap
structure once, resetting it at the end of each column, and then
freeing it when all columns are done. That is faster:
>> import numpy as np
>> import bottleneck as bn
>> a = np.random.rand(20,1000)
>> timeit bn.move.move_median_2d_float64_axis0(a, 3)
1000 loops, best of 3: 768 us per loop
>> timeit bn.move.move_median_2d_float64_axis0_2(a, 3)
1000 loops, best of 3: 662 us per loop # <-- reuse heap structure
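To make the allocate-once-and-reset pattern concrete, here is a minimal pure-Python sketch. It is not bottleneck's implementation: it swaps the double heap for a plain sorted list, and the names SortedWindow and move_median_axis0 are made up for the illustration. It also assumes the input is a list of rows with no NaNs. The point is only that the structure is created once and reset at the start of each column:

```python
from bisect import bisect_left, insort
from statistics import median


class SortedWindow:
    """Hypothetical stand-in for the double heap: a sorted list of the window."""

    def __init__(self, window):
        self.window = window
        self.items = []  # sorted values currently inside the window

    def reset(self):
        # Forget the previous column; the object itself is kept and reused.
        self.items.clear()

    def update(self, new, old=None):
        # Drop the value leaving the window (if any), then add the new one.
        if old is not None:
            del self.items[bisect_left(self.items, old)]
        insort(self.items, new)
        return median(self.items)


def move_median_axis0(a, window):
    """Moving median down each column of `a`, given as a list of rows."""
    nrows, ncols = len(a), len(a[0])
    out = [[float("nan")] * ncols for _ in range(nrows)]
    mm = SortedWindow(window)  # created once, not once per column
    for j in range(ncols):
        mm.reset()  # cheap reset instead of free + allocate
        for i in range(nrows):
            old = a[i - window][j] if i >= window else None
            med = mm.update(a[i][j], old)
            if i >= window - 1:  # window is full from this row on
                out[i][j] = med
    return out
```

The first window - 1 rows of the output are left as NaN, since the window is not yet full there.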
I reset the heap struct on the Cython side with:
mm.n_s = 0
mm.n_l = 0
That's not very clean since it reaches inside the structure of the
heap. So I tried doing this on the C side with:
inline void mm_reset(mm_handle *mm) {
    mm->n_s = 0;
    mm->n_l = 0;
}
Even though it is declared inline, it is slower than the Cython-side reset:
>> timeit bn.move.move_median_2d_float64_axis0(a, 3)
1000 loops, best of 3: 675 us per loop
It's a poor design choice, but since bottleneck is about speed, I'll
reset on the Cython side.