When does colon indexing get evaluated / converted?

315 views
Skip to first unread message

Greg Plowman

unread,
Sep 20, 2015, 9:34:24 PM9/20/15
to julia-users
Hi,

I'm trying to define a custom Array type that can be indexed using arbitrary ranges.

e.g. A = MyArray(Int, 3:8) would define a 6-element vector with indexes ranging from 3 to 8, rather than the default 1 to 6.

I've made some progress, but am now stuck on how to handle colon indexing.

A[4:6] works by defining appropriate getindex and setindex!

e.g.  setindex!{T,S<:Real}(A::MyArray{T,1}, value, I::AbstractVector{S}) = ...

but A[:] = 0 seems to get translated to A[1:6] before dispatch on setindex!, so I can't hijack the call.

From subarray.jl, the code below suggests I can specialise on the Colon type, but this doesn't seem to work for me. Colon appears to be converted to UnitRange before calling setindex!

sub(A::AbstractArray, I::Union(RangeIndex, Colon)...) = sub(A, ntuple(length(I), i-> isa(I[i], Colon) ? (1:size(A,i)) : I[i])...)


Is there a way around this?
Should I be able to specialise on the colon argument?

-- Greg

Spencer Russell

unread,
Sep 20, 2015, 11:15:42 PM9/20/15
to julia...@googlegroups.com
Hi Greg,

This doesn’t answer your question directly, but I recommend you check out the Interfaces chapter of the manual for some good info on creating your own array type. To get indexing working the most important parts are:

1. subtype AbstractArray
2. implement the Base.linearindexing(Type) method
3. implement either getindex(A, i::Int) or getindex(A, i1::Int, ..., iN::Int), (and the setindex! versions)  depending on step 2.

Then the various indexing behaviors (ranges, etc.) should work for your type, as under-the-hood they boil down to the more basic indexing method that you define in step 3. One thing to note that’s not spelled out super clearly in the manual yet is that if you want the results of range indexing to be wrapped in your type, you should also implement `similar`. There’s some more details on that in this pr.

OT: something about the way your email client is configured is causing you to show up as julia...@googlegroups.com where for other users I see their names (at least in my email client), so it’s a bit hard to see who you are in lists of thread participant names.

-s

Greg Plowman

unread,
Sep 21, 2015, 12:04:23 AM9/21/15
to julia-users
Hi Spencer,

Thanks for your reply.

Actually I'm trying to follow the Interfaces chapter.

I'm using Julia v0.3, so that might be a problem.

Also I think my problem is mainly for vectors (1-dimensional array), because linear/subscript index has same syntax.

For multi-dimensional arrays, I can map subscript indexes to the default 1:length linear indexes.

A = MyArray(Int, -2:8, -2:6) # defines 11x9 Matrix
A[:] gets translated to A[1:99], which can be handled by linear indexing

B = MyArray(Int, -2:8) # defines 11-element Vector
B[:] gets translated to B[1:8], I need it to be translated to B[-2:8]

So I'm trying to find out where this conversion happens, so I can redefine it.

-- Greg

PS Not really sure what to do about email name. Name on julia-users shows:
me
 (Greg Plowman change)  
 

Greg Plowman

unread,
Sep 21, 2015, 12:38:03 AM9/21/15
to julia-users
To further clarify, I thought I could specialise getindex / setindex! on colon type argument.

see below, getindex2 is being called, but getindex is not being called.
A[:] calls getindex(A::AbstractArray{T,N},I::AbstractArray{T,N}) at abstractarray.jl:380
presumably after [:] has been converted to [1:8]



julia> getindex(A::FArray, ::Colon) = A[start(A) : endof(A)]
getindex (generic function with 177 methods)

julia> getindex2(A::FArray, ::Colon) = A[start(A) : endof(A)]
getindex2 (generic function with 1 method)

julia> A = FZeros(Int, -2:8)
11-element FArray{Int64,1} (-2:8,)

julia> A[:]
8-element Array{Int64,1}:
 0
 0
 0
 0
 0
 0
 0
 0

julia> @which A[:]
getindex(A::AbstractArray{T,N},I::AbstractArray{T,N}) at abstractarray.jl:380

julia> getindex2(A, :)
11-element Array{Int64,1}:
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0

julia>


Spencer Russell

unread,
Sep 21, 2015, 12:54:35 AM9/21/15
to julia...@googlegroups.com
Can you post the code (or link to the repo) where you define your FArray type?

-s

Mauro

unread,
Sep 21, 2015, 3:54:38 AM9/21/15
to julia...@googlegroups.com
I seem to recall that this was discussed recently, probably by Matt, but
cannot find it. I did however find this gist:
https://gist.github.com/alsam/8283205, maybe of help.

Tomas Lycken

unread,
Sep 21, 2015, 4:08:34 AM9/21/15
to julia-users

I recently implemented a similar transformation for array-like objects, in which I decided to create functions bounds(A, dim) = (lbound(A, dim), ubound(A, dim)) which return the end points of the indexing axes; without those, it was really hard to deduce what indices things should map to.

It might be that these should actually be included in Base and used by the colon indexing implementation for AbstractArrays (bounds(A, dim) = (1, size(A, dim)) for all Arrays, so could probably be inlined at no runtime cost). If so, you could use them too and have your FArray type return e.g. (-2, 8), which would allow the default colon indexing implementation to Just Work™.

// T

Matt Bauman

unread,
Sep 21, 2015, 9:20:02 AM9/21/15
to julia-users
Here's a quick rundown on the status quo:

* Colon lowering changed in 0.4, as did the AbstractArray's interface.  It's *much* easier to make your own arrays in 0.4.  I suggest updating to the latest RC.

* There is still a very strong assumption in base Julia that the valid indices in an AbstractArray `A` for dimension `d` are in `1:size(A, d)`.
* This means that once a base method has ensured that all the indices it visits are within `1:size(A,d)`, it might wrap its algorithm with `@inbounds`.
* This means that if you break that assumption, using a base function could result in the wrong answer and/or memory corruption and/or segfaults.
* It also means that the `end` keyword won't lower correctly for your array

Now, it is possible to break this assumption and still have things work, but for right now, it's undocumented magic.  There's several of us very interested in doing this, so it'll get better.  Also new in 0.4 is the eachindex method, which helps here, too.  The key is to aggressively check bounds and explicitly error on any indexing method that you know might be ambiguous in its meaning (e.g., does the caller know that your array has funny indexing semantics or not?).

Tim Holy

unread,
Sep 21, 2015, 9:53:50 AM9/21/15
to julia...@googlegroups.com
In julia 0.5 I plan to add eachindex(A, d) which can return the appropriate
range for your object along axis d.

--Tim
> > to UnitRange *before* calling setindex!

Greg Plowman

unread,
Sep 22, 2015, 1:07:01 AM9/22/15
to julia-users
Hi All,
Thanks all for your replies.

OK I can see this will be much easier in v0.4.
I will revisit when v0.4 released.

I'm still curious about colon and end
Colon lowering changed in 0.4,

Matt, could you expand on this? How/when is this done in v0.3 vs v0.4?

Does this mean v0.3 code attempting to dispatch on Colon type cannot work? (for example the code from subarray.jl quoted below)

sub(A::AbstractArray, I::Union(RangeIndex, Colon)...) = sub(A, ntuple(length(I), i-> isa(I[i], Colon) ? (1:size(A,i)) : I[i])...)

I noticed that  OffsetArrays (https://github.com/alsam/OffsetArrays.jl jA) defines const (..) = Colon(), presumably to use .. in place of :

Will it work in v0.4?
How and when is end converted?

Thanks again,
Greg



Matt Bauman

unread,
Sep 22, 2015, 9:09:41 AM9/22/15
to julia-users
A colon by itself is simply a synonym for `Colon()`, both on 0.3 and 0.4:

julia> :
Colon()

You can use colons in normal function calls and dispatch in both 0.3 and 0.4:

julia> f(x::Colon) = x
       f
(:)
Colon()

That's what's happening with that sub definition.  `sub` acts like indexing, but it's just a normal function call.  And so the Colon() passes through unchanged in both 0.3 and 0.4.

Indexing expressions are different.  There are some special lowering steps to handle colon (just in 0.3) and end (in both 0.3 and 0.4).  "Lowering" is an intermediate step between parsing and compilation, and you can see what happens to an expression during lowering by calling `expand`.  The expanded expressions have some funny things in them (`top(endof)(A)` is a special internal pseudo-syntax that is kinda-sorta like saying `Base.endof(A)`), but you can see how the `end` keyword lowers to getindex calls in both 0.3 and 0.4:

julia> expand(:(A[end]))
:(getindex(A,(top(endof))(A)))

julia
> expand(:(A[1, end]))
:(getindex(A,1,(top(trailingsize))(A,2)))

julia
> expand(:(A[1, end, 3]))
:(getindex(A,1,(top(size))(A,2),3))

So, now to the difference between 0.3 and 0.4: the special lowering step for colons in indexing expressions was removed.

# Julia 0.3:
julia
> expand(:(A[:]))
:(getindex(A,colon(1,(top(endof))(A))))

# Julia 0.4:
julia
> expand(:(A[:]))
:(getindex(A,:))

You can see that in 0.3, it effectively expanded to `1:end`.  But 0.4 just left the `:` symbol as it was, and it just gets evaluated normally as an argument to getindex.  It's no longer special. So in 0.4, you can specialize dispatch on Colon indices.

Make sense?

Greg Plowman

unread,
Sep 22, 2015, 9:27:21 PM9/22/15
to julia-users
Matt,
Thankyou so much for your great reply.
And yes it does make sense now after such a helpful and well-targeted explanation.
You explained "colon-lowering" in a "explanation-lowered" way that a non-expert can understand.
This is refreshingly thoughtful and helpful.
Thanks again.
-- Greg
Reply all
Reply to author
Forward
0 new messages