The efficiency of different operations in core.matrix depends on the implementation.
Your approach should normally be pretty efficient, however: things are generally designed so that operating over slices is fast. It will certainly be very fast in Vectorz, where slices are just lightweight offsets into the same underlying data array.
Note that there is a `normalise` operation, so you can simplify the inner expression to:
(map m/normalise (m/slices mtrx))
Mapping over slices and reassembling these to make a new array actually seems to be a sufficiently common operation that I'm wondering whether we should have a built-in function to do this.
Maybe a "map-slices" function? Or "slicemap"?
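In the meantime, something like this could be written in user code. This is only a rough sketch: `map-slices` is a hypothetical name, not an existing core.matrix function, and I'm assuming that `m/array` is an acceptable way to reassemble the mapped slices for your implementation.

```clojure
(require '[clojure.core.matrix :as m])

;; Hypothetical helper, NOT part of core.matrix:
;; applies f to each slice along the first dimension of a,
;; then reassembles the results into a new array.
(defn map-slices
  [f a]
  (m/array (mapv f (m/slices a))))

;; Example usage: normalise every row of a small matrix,
;; so that each row of the result has unit length.
(def mtrx [[3.0 4.0]
           [1.0 0.0]])

(def normalised (map-slices m/normalise mtrx))
```

Using `mapv` rather than `map` forces the slices to be processed eagerly, which avoids handing a lazy sequence to the array constructor.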