using ':' (colon) in a variable for indexing


Spencer Russell

Dec 19, 2013, 11:27:19 AM
to juli...@googlegroups.com
Is there a way to use colons that are captured in variables for indexing? For instance:

julia> s = :
Colon()

julia> x = reshape(1:20, (4, 5))
4x5 Array{Int64,2}:
 1  5   9  13  17
 2  6  10  14  18
 3  7  11  15  19
 4  8  12  16  20

julia> x[2, s]
ERROR: no method getindex(Array{Int64,2},Int64,Colon)

julia> x[2, :]
1x5 Array{Int64,2}:
 2  6  10  14  18

My use case is that I'm trying to build an indexing tuple programmatically, and while I can build a range, e.g. 1:size(x, d), it would be convenient if I could just put the colon in there. It seems that currently the colon used in indexing isn't a full object but just a shorthand the parser uses, or something like that?
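A minimal sketch of the workaround mentioned above, where the index tuple is built from explicit ranges instead of a colon. The helper name `fixed_dim_index` is hypothetical and not part of Base; the fixed dimension uses the one-element range `i:i`, so every entry of the tuple is a range.

```julia
# Hypothetical helper (not in Base): build an index tuple that pins
# dimension `dim` to index `i` and spans every other dimension with
# the explicit range 1:size(x, d).
function fixed_dim_index(x::AbstractArray, dim::Integer, i::Integer)
    return ntuple(d -> d == dim ? (i:i) : (1:size(x, d)), ndims(x))
end

x = reshape(1:20, (4, 5))
idx = fixed_dim_index(x, 1, 2)   # a tuple of ranges, one per dimension
row = x[idx...]                  # selects the same elements as x[2, :]
```

Splatting the tuple with `idx...` passes one index per dimension, so the same helper should carry over to arrays of any dimensionality.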

-s

Stefan Karpinski

Dec 19, 2013, 11:46:11 AM
to Julia Dev
You have to write methods that handle the Colon type. That's about all there is to it. When used in indexing syntax, yes, : is desugared by the parser into the appropriate range, rather than actually passing a Colon object.
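As a rough illustration of "methods that handle the Colon type", here is a sketch for a user-defined wrapper. The `Table` type and its field are made up for this example, and the `struct` syntax assumes a recent Julia (1.x) rather than the version discussed in this thread.

```julia
# Hypothetical wrapper type, used only to show Colon-aware getindex methods.
struct Table
    data::Matrix{Int}
end

# A Colon in one position is translated to the full range of that dimension.
Base.getindex(t::Table, i::Int, ::Colon) = t.data[i, 1:size(t.data, 2)]
Base.getindex(t::Table, ::Colon, j::Int) = t.data[1:size(t.data, 1), j]

t = Table(collect(reshape(1:20, (4, 5))))
s = :                # a Colon object held in a variable
second_row = t[2, s]
third_col  = t[s, 3]
```

Each extra dimension that may or may not receive a colon needs its own method in this style, which is why a generic fallback may be preferable for higher dimensions.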

Spencer Russell

Dec 19, 2013, 12:41:47 PM
to juli...@googlegroups.com
Thanks, Stefan.

Would getindex methods that take a Colon object be something that would go into Base eventually, or is something you're thinking would be done by libraries and applications on top?

-s

Stefan Karpinski

Dec 19, 2013, 12:53:36 PM
to Julia Dev
No, it could go into Base. It's a bit of a schlep right now because the array indexing code is, imo, kind of a mess, but it certainly could go in there.

Tim Holy

Dec 19, 2013, 12:56:58 PM
to juli...@googlegroups.com
If memory serves, it's not part of Base because making it _efficient_ involves a
combinatorial explosion of methods, of order 2^d for dimensionality d.
However, if the utmost in efficiency isn't crucial, you could follow the
Colon-handling model in subarray.jl, perhaps in combination with special
short-circuits for 1, 2, and perhaps 3 dimensions.
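One way to picture the non-specialized route: rather than writing a method for each of the 2^d colon/non-colon combinations, a single generic helper can replace every Colon with the full range for its dimension before indexing. This is only a sketch under that assumption; `expand_colons` is a hypothetical name, it is not the code in subarray.jl, and it assumes a recent Julia.

```julia
# Hypothetical helper: swap each Colon in `I` for the full range of the
# corresponding dimension of `A`, leaving all other indices untouched.
expand_colons(A::AbstractArray, I...) =
    ntuple(d -> I[d] isa Colon ? (1:size(A, d)) : I[d], length(I))

x = collect(reshape(1:20, (4, 5)))
s = :
idx = expand_colons(x, 2, s)   # the Colon becomes 1:size(x, 2)
row = x[idx...]
```

The trade-off is the one raised above: the check happens per index at run time instead of being resolved by dispatch, so measuring it against specialized methods would be the sensible first step.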

Of course, now that tuples are a lot better, perhaps the inefficient version may
not be so horrible. It might even be worth revisiting whether this is
something that should be in base. Some performance measurements would probably
be the first thing to try.

Best,
--Tim

Mauro

Dec 19, 2013, 9:58:17 PM
to juli...@googlegroups.com
(I've been musing about similar things recently when writing the
getindex function for a ragged array datatype, where 'end' means a
different number depending on the column. However, that is a slightly
different but related matter:
https://groups.google.com/d/msg/julia-users/a5T9CoHLqeA/qfKg2jZje9UJ)

More generally, I think it would be nice to have some kind of slice
object like Python does. It could be just a range in simple cases but
something more complicated when an 'end' is involved. Then one could build
up slices without an array and pass them around.

sli = colon(start, step, end)

and it should be possible to specify the 'end' in terms of just a number
or in terms of end minus number. E.g.:

sli = colon(5, 7, end-5)
a[sli] # equivalent to a[5:7:end-5]
b[sli] # equivalent to b[5:7:end-5]

Maybe 'end' could be a type, and a 'colon' call featuring an 'end' would
produce a special range (otherwise just a normal range). Then getindex
could figure out what 'end' is when called for a specific indexable
collection.
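To make the idea concrete, here is a small sketch of a slice whose end is resolved only once the collection is known. Everything in it (`FromEnd`, `LazySlice`, `resolve`, `slice`) is hypothetical and does not exist in Base; it uses a plain function instead of a new getindex method, and assumes a recent Julia.

```julia
# Hypothetical marker for "end minus offset".
struct FromEnd
    offset::Int
end

# Hypothetical slice whose endpoints may be plain integers or FromEnd markers.
struct LazySlice{S,E}
    start::S
    step::Int
    stop::E
end

# Turn an endpoint into a concrete index, given the collection length n.
resolve(i::Int, n::Int) = i
resolve(e::FromEnd, n::Int) = n - e.offset
# Turn the whole slice into an ordinary range.
resolve(s::LazySlice, n::Int) = resolve(s.start, n):s.step:resolve(s.stop, n)

# Apply the slice to a specific vector.
slice(a::AbstractVector, s::LazySlice) = a[resolve(s, length(a))]

sli = LazySlice(5, 7, FromEnd(5))   # plays the role of 5:7:end-5
a = collect(1:30)
b = collect(101:150)
sa = slice(a, sli)                  # same as a[5:7:end-5]
sb = slice(b, sli)                  # same as b[5:7:end-5]
```

The same `sli` object yields different concrete ranges for `a` and `b`, which is the behaviour being asked for; whether this can be made as fast as the parser-level handling of 'end' is a separate question.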

(Then in Spencer's example it would have to be slightly more verbosely
s = 1:end)

In fact, the de-sugaring of 'end' in indices to size(a,n) seems a bit
opaque. Could something like the above maybe rectify this?
(But it's probably a lot of work and not urgent.)

Stefan Karpinski

Dec 20, 2013, 1:53:05 PM
to Julia Dev
For what it's worth, we started out with a whole menagerie of range types that were open-ended at the start, end or both, and ditched that whole business in favor of what we have now. It would be possible to revisit, but I would warn that it's pretty complex and the current state of array indexing code is a pretty nasty tangle that's extremely hard to modify. You may want to do some experiments first to make sure that the open-ended range object approach can be done without harming performance.
