Faster
* sum, mean, std, var, min, max, median, ranking
Moving window
* fast (Bottleneck): move_sum, move_mean, move_std, move_min, move_max
* slow (Python): move_ranking, move_median, move_func
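For readers unfamiliar with the moving window functions listed above, here is a minimal NumPy sketch of what a trailing moving mean computes. It is not Bottleneck's or la's implementation; the name moving_mean and its NaN-padding convention are assumptions made for illustration, and it handles 1-D input without NaNs only.

```python
import numpy as np

def moving_mean(values, window):
    """Trailing moving mean of a 1-D sequence (hypothetical helper).

    The first window - 1 outputs are NaN because the window is not full yet.
    """
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, np.nan)
    if window < 1 or window > values.size:
        return out
    # Prefix sums with a leading zero: csum[i] is the sum of values[:i].
    csum = np.cumsum(np.insert(values, 0, 0.0))
    # Sum of each full window is a difference of two prefix sums.
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

result = moving_mean([1.0, 2.0, 3.0, 4.0, 5.0], 2)
```

The prefix-sum trick keeps the cost linear in the input length, independent of the window size.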
Here's the motivation for using Bottleneck in the la package:
Import:
>> import la
>> import numpy as np
>> import scipy.stats
Make data:
>> lar = la.rand(1000,1000)
>> lar[lar > 0.5] = la.nan
>> arr = lar.A
Check:
>> lar.median()
0.24999568789356486
>> np.median(arr)
nan
>> scipy.stats.nanmedian(arr, axis=None)
array(0.24999568789356486)
Time it:
>> timeit lar.median()
1000 loops, best of 3: 1.69 ms per loop
>> timeit np.median(arr)
10 loops, best of 3: 63.3 ms per loop
>> timeit scipy.stats.nanmedian(arr, axis=None)
10 loops, best of 3: 82.9 ms per loop
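The check above can be reproduced with NumPy alone. This is only a sketch: np.nanmedian is used as a stand-in for the NaN-aware median (it is not part of the original session and needs a NumPy version that provides it), and the random data will not match the numbers shown above exactly.

```python
import numpy as np

# Same shape of experiment: uniform data with everything above 0.5 set to NaN.
rng = np.random.default_rng(0)
arr = rng.random((1000, 1000))
arr[arr > 0.5] = np.nan

# The plain median propagates NaN.
plain = np.median(arr)

# A NaN-aware median ignores the missing values.
nan_aware = np.nanmedian(arr)

# Cross-check: drop the NaNs by hand, then take the ordinary median.
by_hand = np.median(arr[~np.isnan(arr)])
```

The surviving values are uniform on [0, 0.5], so the NaN-aware median should land near 0.25.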
The development version of la can be downloaded from
https://github.com/kwgoodman/la. Please report any issues (good or
bad).
The timing above wasn't fair: lar.median() operated on the input array
in place. In Bottleneck 0.4.3dev it now works on a copy:
>> timeit lar.median()
100 loops, best of 3: 13.8 ms per loop
>> timeit np.median(arr)
10 loops, best of 3: 75.5 ms per loop
>> timeit scipy.stats.nanmedian(arr, axis=None)
10 loops, best of 3: 90.9 ms per loop
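The copy versus in-place distinction can be illustrated with NumPy's own median, which exposes it through the overwrite_input argument. This is a sketch of the general idea only and makes no claim about how Bottleneck or la implement it.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random(1001)
original = data.copy()

# Default behaviour: the median is computed on a copy, so the input survives.
m_copy = np.median(data)
unchanged = np.array_equal(data, original)

# With overwrite_input=True NumPy may reorder the array it is given, which
# saves the copy but leaves the input in an unspecified order.
scratch = data.copy()
m_inplace = np.median(scratch, overwrite_input=True)
```

Both calls return the same median; only the treatment of the input differs, which is why a benchmark that skips the copy looks faster than it should.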