sub does not behave as ref; possibility of subarray reshape

Carlo Baldassi

Apr 14, 2012, 11:56:16 AM
to juli...@googlegroups.com
This is yet another chapter in the never-ending issue of trailing singleton dimensions. I don't think this particular case has been brought up before, nor (I hope to be wrong here) that it currently has an easy solution.

See the following:

julia> a = rand(3, 4)
3x4 Float64 Array:
 0.121852   0.564228  0.637376  0.189798
 0.41312    0.412484  0.198252  0.262738
 0.0486016  0.101546  0.529687  0.931378

julia> s = sub(a, 1:3, 1)
3x1 SubArray of 3x4 Float64 Array:
 0.121852
 0.41312 
 0.0486016

julia> isa(s,AbstractVector)
false

This is different from what ref does:

julia> r = ref(a, 1:3, 1)
3-element Float64 Array:
 0.121852
 0.41312 
 0.0486016

julia> isa(r,AbstractVector)
true


You can see an example of the effect of this in statistics.jl:

julia> cov(a)
no method _jl_cov_pearson1(SubArray{Float64,2,Array{Float64,2},(Range1{Int64},Range1{Int64})},SubArray{Float64,2,Array{Float64,2},(Range1{Int64},Range1{Int64})},Float64,Float64)
 in method_missing at base.jl:60
 in _jl_cov_pearson at statistics.jl:157
 in cov_pearson at statistics.jl:170

This happens because _jl_cov_pearson1 requires AbstractVectors as inputs, while the 3x1 SubArrays it receives are two-dimensional. In this particular case a specific fix could be easy (since that is just an internal function whose signature could be changed), but I believe a more general fix is in order.
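
The dispatch failure can be reproduced in isolation. The sketch below assumes a recent Julia (1.x), where `view` plays the role of `sub`; `pearson1` is a hypothetical stand-in for the internal `_jl_cov_pearson1`, not the actual function from statistics.jl, and the range index `1:1` is used only to obtain the 3x1 shape that `sub(a, 1:3, 1)` produced at the time.

```julia
# Hypothetical stand-in for an internal helper that only accepts vectors.
function pearson1(x::AbstractVector, y::AbstractVector)
    n = length(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((x .- mx) .* (y .- my)) / (n - 1)
end

a = rand(3, 4)

col_matrix = view(a, 1:3, 1:1)   # 3x1 view: an AbstractMatrix, not a vector
col_vector = view(a, 1:3, 1)     # 3-element view: an AbstractVector

isa(col_matrix, AbstractVector)  # false
isa(col_vector, AbstractVector)  # true

pearson1(col_vector, col_vector)      # dispatches fine
# pearson1(col_matrix, col_matrix)    # would throw a MethodError (no matching method)
```

Only the dimensionality of the argument type differs between the two calls; the underlying data is the same column of `a`.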

Also, in general, I haven't found a way to change the shape of a SubArray without copying it (and thus losing the advantage of using SubArrays in the first place). This would be useful for calling functions that expect a specific shape (e.g. if I have a 3D array and want to call a function which deals with matrices).

So there are 2 issues here actually:

1) make sub act the same as ref, i.e. dropping trailing singleton dimensions
2) provide a way to reshape or at least squeeze subarrays
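
For comparison, both points can be sketched in a recent Julia (1.x). This assumes the modern names `view` (in place of `sub`) and plain indexing via `getindex` (in place of `ref`), which are not the API discussed in this thread. In that setting, `dropdims` and `reshape` applied to a view return wrappers that share memory with the parent array rather than copies; the variable names are illustrative only.

```julia
a = zeros(3, 4, 2)

# 1) A scalar index drops its dimension, for copies and views alike.
r = a[1:3, 1, 1]          # copy: a 3-element Vector
s = view(a, 1:3, 1, 1)    # view: a 3-element SubArray
size(r) == size(s)        # both have size (3,)

# 2) Squeezing or reshaping a view without copying the data.
block = view(a, :, :, 1:1)          # 3x4x1 view with a trailing singleton
m = dropdims(block, dims=3)         # 3x4 wrapper over the same memory
v = reshape(view(a, :, :, 2), 12)   # 12-element wrapper over the second slice

m[1, 1] = 1.0   # writes through to a[1, 1, 1]
v[12] = 2.0     # writes through to a[3, 4, 2]
```

Because `m` and `v` are wrappers rather than copies, the two assignments at the end modify `a` itself.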

Stefan Karpinski

Apr 14, 2012, 12:16:45 PM
to juli...@googlegroups.com
Here's an interesting issue with sub: how do you handle a sub where all the indices are scalar, e.g. sub(a,1,2)? There are two ways I can see to go with this: return a 0-d SubArray object, or maybe sub should actually preserve the full dimensionality of the original object, meaning that it works differently than ref.

Carlo Baldassi

Apr 14, 2012, 12:51:27 PM
to juli...@googlegroups.com
Personally, I'd find it less surprising if sub always acted as closely as
possible to ref, meaning that sub(a, 1, 2) would give a 0-d subarray
(which would somehow be like a pointer to a scalar). I understand this
would still cause confusion with dispatch, since ref would return a
scalar and sub would still return an AbstractArray. However, when one
uses sub it should be clear that one is opening "views" on an array, and
thus always dealing with arrays, no matter what. Also, sub with
all-scalar indices should only come up as a special case, when the
indices are not known in advance and are generated within a function
(otherwise they're pointless, I think), meaning they would appear within
code which expects an array as output anyway; therefore, dispatch of 0-d
subarrays shouldn't be a real issue.

Of course I'd be perfectly fine with leaving everything like it is
now, except that I think there's a real need for the possibility of
efficiently reshaping subarrays somehow (possibly with additional
constraints w.r.t. those of array reshapes).

Stefan Karpinski

Apr 14, 2012, 12:59:30 PM
to juli...@googlegroups.com
Ok, I think that's a good argument. Let's make sub behave like ref with the exception that sub with all scalar indices returns a 0-d SubArray instead of a scalar value.
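
The behaviour agreed on here can be illustrated with a short sketch. It assumes a recent Julia (1.x), where `view` plays the role of `sub` and plain indexing plays the role of `ref`; it shows the analogous modern behaviour, not the 2012 implementation.

```julia
a = [1.0 2.0; 3.0 4.0]

x = a[1, 2]            # plain indexing returns the scalar 2.0
p = view(a, 1, 2)      # all-scalar view: a 0-dimensional SubArray

ndims(p)               # 0
isa(p, AbstractArray)  # true, so array-only code still dispatches on it
p[]                    # reads the single element through the view

p[] = 10.0             # writing through the view updates the parent
a[1, 2]                # now 10.0
```

The 0-d view behaves much like the "pointer to a scalar" described earlier in the thread: it carries no dimensions but still refers to a location in the parent array.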