# NumericExtensions extended


### Dahua

Feb 16, 2014, 2:46:45 PM2/16/14
NumericExtensions has recently been significantly updated.

The major changes:
- Reinstate/introduce `reduce`, `mapreduce`, `reducedim`, and `mapreducedim` methods using functors
- All `map`, `reducedim`, and `mapreducedim` functions, and the functions derived from them, now support both contiguous arrays and strided views (those from ArrayViews.jl) with very efficient implementations.
- The PDMat part has been separated into a new package, PDMats.jl

The internal implementation strategy is recursion (computations over N-D arrays are decomposed into computations over (N-1)-D slices) combined with highly optimized 1D/2D kernels.
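The recursion strategy can be sketched as follows. This is a minimal illustration, not NumericExtensions' actual code; `recursive_sum` and `sum_1d_kernel` are hypothetical names, and the package's real 1D/2D kernels use SIMD, BLAS, etc. rather than a plain loop:

```julia
# A stand-in for an optimized 1-D kernel (the real kernels use SIMD/BLAS).
function sum_1d_kernel(a::AbstractVector)
    s = zero(eltype(a))
    @inbounds for i in eachindex(a)
        s += a[i]
    end
    return s
end

# An N-D reduction is decomposed into reductions over (N-1)-D slices
# along the last dimension, bottoming out in the 1-D kernel.
function recursive_sum(a::AbstractArray)
    ndims(a) == 1 && return sum_1d_kernel(vec(a))
    s = zero(eltype(a))
    for i in 1:size(a, ndims(a))
        s += recursive_sum(selectdim(a, ndims(a), i))
    end
    return s
end
```

Because the leaf computation is always a 1-D (or 2-D) slice, one can dispatch it to whatever kernel is fastest for that slice's memory layout.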

Here are some benchmarks obtained by running computations over 1000x1000 matrices:

```
a_sub  = sub(a, 1:999, :)
a_view = view(a, 1:999, :)   # using ArrayViews

for sum:
  dim = 1:  sum(a_sub, dim) => 0.1314s   sum(a_view, dim) => 0.0374s   |  gain = 3.5168x
  dim = 2:  sum(a_sub, dim) => 0.1740s   sum(a_view, dim) => 0.0574s   |  gain = 3.0286x

for sumabs:
  dim = 1:  sum(abs(a_sub), dim) => 0.6341s   sumabs(a_view, dim) => 0.0286s   |  gain = 22.1343x
  dim = 2:  sum(abs(a_sub), dim) => 0.6331s   sumabs(a_view, dim) => 0.0639s   |  gain = 9.9013x
```

Note: the ArrayViews package now provides an ``ellipview`` function:

```
ellipview(a, i)   # equivalent to view(a, ..., i), e.g. view(a, :, :, i) when ndims(a) == 3
```

Some thoughts:

Tim's Cartesian machinery is very nice and is now in Julia Base. It can express generic algorithms in a small number of lines of code while producing reasonable run-time performance. However, for writing really high-performance implementations, recursion plus highly optimized kernels is still faster at run time, and this strategy allows one to plug in SIMD, BLAS, or other external functions for optimal performance (through multiple dispatch).

Consider the following code:
```
sumabs(view(a, 1:500, 1:2:5, :), (1, 2))
```
The strategy implemented in NumericExtensions will invoke ``BLAS.asum`` when performing the computation along each column. That's why you see the over-20x speed gain above.
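As a small illustration of why dispatching a column to ``BLAS.asum`` is valid (written against today's `LinearAlgebra.BLAS` API rather than 2014's `Base.LinAlg`, and not taken from the package's internals): `asum` computes the sum of absolute values over a strided vector, which is exactly the per-column leaf computation of `sumabs`.

```julia
using LinearAlgebra

a = rand(1000, 1000)
col = view(a, :, 1)   # a contiguous column: a strided vector, no copy made

# BLAS.asum(x) returns the sum of absolute values of x,
# matching the generic reduction over the same slice.
@assert BLAS.asum(col) ≈ sum(abs, col)
```

The same dispatch works for any slice with a regular stride, which is why strided views from ArrayViews can benefit from it.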

- Dahua