Hilde
If you need that kind of thing a lot, look at Numeric (or its
replacement, numarray), or perhaps the standard library's array
module.
John
>>> zip(*l)[0]
(1, 2, 3)
--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science
If efficiency is not an issue and/or you need
[item[index] for item in theList] for more than one index at a time, you can
do:
>>> s = [[1,2],[3,4]]
>>> t = zip(*s)
>>> t
[(1, 3), (2, 4)]
>>> t[1]
(2, 4)
>>>
This creates a transposed (?) copy of the "matrix". The side effect of
creating tuples instead of inner lists should do no harm if you only need
read access to the entries.
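For anyone trying this on Python 3, here is a minimal sketch of the same idea. It assumes Python 3, where zip returns an iterator and has to be wrapped in list() before it can be indexed; the variable names are only illustrative.

```python
# Sketch for Python 3: zip() returns an iterator there, so wrap it in list().
s = [[1, 2], [3, 4]]

# One column at a time, without copying the whole table.
first_column = [row[0] for row in s]

# All columns at once: transpose, then index the result as often as needed.
t = list(zip(*s))

print(first_column)  # [1, 3]
print(t)             # [(1, 3), (2, 4)]
print(t[1])          # (2, 4)
```

The comprehension is probably the better fit when only one column is needed; the transposed copy pays off when several columns are read.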
Peter
l[;0] is illegal right now, but does anyone know of any other bit
of syntax it might conflict with if proposed as an extension?
Hilde
I think that in alist[from:to:step] the step argument is already overkill.
If there is sufficient demand for column extraction, I would rather make it
a method of list, as alist[:columnIndex] can easily be confused with
alist[;toIndex] (or was it the other way round :-). Would you allow
slicing, too, or make slicing and column extraction mutually exclusive?
Here's how to extract rows 2,4,6 and then columns 4 to 5:
m = n[2:7:2;4:6] # not valid python
Also, ";" is already used (though seldom found in real code) as an alternate
way to delimit statements.
So your suggestion might further complicate the compiler without compelling
benefits over the method approach.
Peter
This is not about "column extraction" but about operating on different
dimensions of the array, which there doesn't seem to be any handy way
of doing right now. So yes I would allow both: within each ";" separated
index group, the current syntax (whatever it is) would apply.
> Also, ";" is already used (though seldom found in real code) as an
> alternate way to delimit statements.
I doubt it can be found within square brackets, so it should not be
difficult to disambiguate.
While we are at it, I also don't understand why sequences can't be
used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special
slice concept? To me, it's not just the step argument in the slice
that is overkill...
Hilde
>> Would you allow slicing, too, or make slicing and column extraction
>> mutually exclusive?
>
> This is not about "column extraction" but about operating on different
> dimensions of the array, which there doesn't seem to be any handy way
Like it or not, there are no "different dimensions", just lists of lists of
lists... so the N dimension case would resolve to (N-1) 2 dimension
operations.
> While we are at it, I also don't understand why sequences can't be
> used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special
> slice concept? To me, it's not just the step argument in the slice
> that is overkill...
(a) Why not alist[[2, -3, 7]]?
OK with me.
[alist[2], alist[-3], alist[7]]
and
[alist[i] for i in [2, -3, 7]]
are not particularly cumbersome, though.
(b) Why a special slice concept?
It covers the most common cases of list item extraction with a concise
syntax.
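As a sketch of another spelling: later Python versions have operator.itemgetter, which accepts several indices at once. This assumes a modern Python 3 interpreter; it was probably not available when this thread was written, and the names alist and pick are only illustrative.

```python
from operator import itemgetter

alist = list(range(0, 100, 10))  # [0, 10, 20, ..., 90]

# With several arguments, itemgetter returns a tuple of the looked-up items.
pick = itemgetter(2, -3, 7)
picked = pick(alist)
print(picked)  # (20, 70, 70)

# An existing index sequence can be unpacked into it.
indices = [2, -3, 7]
print(itemgetter(*indices)(alist) == picked)  # True
```

Note that with a single index itemgetter returns the bare item rather than a one-element tuple, so the comprehension is more uniform when the number of indices varies.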
Peter
> While we are at it, I also don't understand why sequences can't be
> used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special
> slice concept? To me, it's not just the step argument in the slice
> that is overkill...
1. It will be more typing and harder to visually parse
l[:3] would be l[(0, 3)]
l[3:] would be l[(3,-1)]
2. Slicing a two-dimensional object will not be possible, as the notation
you proposed is already used for just that (e.g. l[1,2], which is
equivalent to l[(1,2)]; see below), and Numeric and numarray use it. See
what happens with an object which, on indexing, just returns the index:
>>> class C:
...     def __getitem__(self, i): return i
...
>>> c = C()
>>> c[3]
3
>>> c[:3]
slice(0, 3, None)
>>> # multi dimensional indexing
>>> c[1,3]
(1, 3)
>>> c[1:3,3:5]
(slice(1, 3, None), slice(3, 5, None))
--
=*= Lukasz Pankowski =*=
Ain't going to happen. If you want that kind of thing without forking
your own version of Python, Numeric/numarray is the closest you'll get
(no special syntax, but lots of useful functions and 'ufuncs').
[...]
> While we are at it, I also don't understand why sequences can't be
> used as indices. Why not, say, l[[2,3]] or l[(2, 3)]? Why a special
[...]
I'd guess you can subclass numarray's arrays and get this behaviour.
Or simply write your own sequence object and override __getitem__.
I'd guess it's highly unlikely ever to be part of the standard
sequence protocol, though.
John
> Thanks for the suggestion but zip is not nice for large lists
> and as for array/numpy, although I chose a numeric example in
> the posting, I don't see why only numeric arrays should enjoy
> the benefit of such a notation.
Numeric (and presumably numarray) can handle arbitrary Python objects.
> l[;0] is illegal right now but does anyone of any other bit
> of syntax it might conflict with if proposed as an extension?
BTW, just remembered that Numeric/numarray *does* use commas for
multi-dimensional indexing. Apparently (glancing at the language ref)
that's not actually indexing with a tuple, but part of Python's
syntax. Same goes for that other obscure bit of Python syntax, the
ellipsis, as used by Numeric: foo[a, ..., b].
John
Oops, wrong. See my other post.
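A quick check, sketched here for a Python 3 interpreter (an assumption; the thread predates it), suggests that the comma form does hand __getitem__ an ordinary tuple, with the ellipsis showing up as the Ellipsis object. The class name Echo is made up for the illustration.

```python
class Echo:
    """Hypothetical helper: return whatever object indexing passes in."""

    def __getitem__(self, index):
        return index


e = Echo()
print(e[1, 3] == e[(1, 3)])  # True: the commas just build a tuple
print(e[1, ..., 2])          # (1, Ellipsis, 2)
print(e[1:3, 3:5])           # (slice(1, 3, None), slice(3, 5, None))
print(e[:3])                 # slice(None, 3, None) on Python 3
```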
John
You are being too literal. A list of lists is like a 2D array from an
indexing point of view, a list of lists of lists like a 3D array etc.
E.g., (((1,10),(2,20),(3,30)),((-1,'A'),(-2,'B'),(-3,'C'))) is a
2 x 3 x 2 rectangular data structure and has 3 dimensions. Hence,
e.g., l[0;2;1] ~ l[0][2][1] = 30
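Without new syntax, the chained lookup can be sketched as a small helper. The function name nested_get is hypothetical, and the code assumes Python 3 (where reduce lives in functools).

```python
from functools import reduce
from operator import getitem


def nested_get(data, *indices):
    """Hypothetical helper: nested_get(l, 0, 2, 1) is l[0][2][1]."""
    return reduce(getitem, indices, data)


l = (((1, 10), (2, 20), (3, 30)), ((-1, 'A'), (-2, 'B'), (-3, 'C')))
print(nested_get(l, 0, 2, 1))  # 30
print(nested_get(l, 1, 0))     # (-1, 'A')
```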
> [alist[2], alist[-3], alist[7]] and [alist[i] for i in [2, -3, 7]]
I agree that comprehensions alleviate the problem to an extent.
However the first notation is definitely cumbersome for all but the
shortest index lists.
> It covers the most common cases of list item extraction with a concise
> syntax.
Maybe but
1/ it is more or less redundant: the (x)range syntax could have been
extended with the same effect
2/ it lacks generality since it can only generate arithmetic progressions
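To make the redundancy concrete, here is a sketch, assuming Python 3 (range instead of xrange), showing that a slice selects exactly the items whose indices a range would generate, while an arbitrary index sequence needs the comprehension form. The variable names are illustrative.

```python
l = list(range(0, 100, 10))  # [0, 10, 20, ..., 90]
s = slice(1, 8, 3)

# slice.indices(length) resolves a slice to concrete (start, stop, step).
by_slice = l[s]
by_range = [l[i] for i in range(*s.indices(len(l)))]
print(by_slice)  # [10, 40, 70]
print(by_range)  # [10, 40, 70]

# A non-arithmetic index sequence cannot be written as a slice.
by_sequence = [l[i] for i in (1, 2, 4, 8)]
print(by_sequence)  # [10, 20, 40, 80]
```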
Hilde
No because in many cases the reason why you want to use the syntax
l[seq] is that you already have seq, so you would refer to it by name
and "l[s]" is definitely not hard to parse.
> l[:3] would be l[(0, 3)]
> l[3:] would be l[(3,-1)]
Not at all. I was suggesting to use a semi-colon not a colon. Thus if
l is (10,20,30,40), l[:3] -> (10,20,30) whereas l[(0,3)] -> (10, 30),
i.e., same as in your class-based example, minus the parentheses, which
I now realize are superfluous (odd that python allows you to omit the
parentheses but not the commas).
> 2. Slicing two dimensional object will not be possible as the notion
> you proposed is used just for that (ex. l[1,2] which is equivalent
> to l[(1,2)] see below), and Numeric and numarray use it.
Same misunderstanding as above, I believe. Let, e.g., l be
((1,10,100),(2,20,200),(3,30,300),(4,40,400)). Then l[2:; 1:] ->
[(30, 300), (40, 400)]. This is equivalent to [i[1:] for i in l[2:]]
but, at least to me, it is the index notation that is easier to parse.
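For comparison, the proposed notation can be emulated today with a small function taking two slice objects. This is only a sketch; the name slice2d is hypothetical and not part of any library.

```python
def slice2d(rows, row_slice, col_slice):
    """Hypothetical helper emulating the proposed l[row_slice; col_slice]."""
    return [row[col_slice] for row in rows[row_slice]]


l = ((1, 10, 100), (2, 20, 200), (3, 30, 300), (4, 40, 400))
result = slice2d(l, slice(2, None), slice(1, None))
print(result)  # [(30, 300), (40, 400)]
```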
Incidentally, it strikes me that there are a lot of superfluous
commas. Why not just (1 10 100) or even 1 10 100 instead of (1,10,100)?
The commas do make the expression harder to parse visually.
> >>> # multi dimensional indexing
> >>> c[1,3]
I disagree that this is "multidimensional". You are passing a list
and getting back a list, so this is still flat. I think you are
confusing dimensionality and cardinality.
> Numeric and numarray use it
This is good if it is true (I have yet to look at these two because
my work is not primarily numerical) but, again, the restriction of this
syntax to arrays, and arrays of numeric values at that, strikes me
as completely arbitrary.
The bottom line is that python claims to be simple but has a syntax
which, in this little corner at least, is neither simple nor regular:
xranges and slices are both sequence abstractions but are used in
different contexts and have different syntaxes; C-style arrays are
treated differently than lists of lists of ..., although they are
conceptually equivalent; numeric structures are treated differently
than non-numeric ones etc etc
Oh well. Maybe a future PEP will streamline all that.
Hilde
>>Like it or not, there are no "different dimensions", just lists of lists
>>of lists...
>
>
> You are being too litteral. A list of list is like a 2D array from an
> indexing point of view, a list of lists of lists like a 3D array etc.
> E.g., (((1,10),(2,20),(3,30)),((-1,'A'),(-2,'B'),(-3,'C'))) is a
> 2 x 3 x 2 rectangular data structure and has 3 dimensions. Hence,
> e.g., l[0;2;1] ~ l[0][2][1] = 30
Only if all the sublists are of the same length, which is guaranteed for
a multi-dimensional array, but not for a list of lists.
What do you expect a[;1] to return if a = [[], [1, 2, 3], [4], 5]?
That's why Numeric has a specific type for multi-dimensional arrays.
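A sketch of what happens if a[;1] is read as "item 1 of every row". The helper name column is hypothetical, and the reading of the proposed syntax is an assumption.

```python
def column(rows, j):
    """Hypothetical helper: one possible meaning of a[;j] for nested lists."""
    return [row[j] for row in rows]


a = [[], [1, 2, 3], [4], 5]
try:
    column(a, 1)
except (IndexError, TypeError) as exc:
    # The empty first row fails before the non-list item 5 is ever reached.
    print(type(exc).__name__)  # IndexError

# On a rectangular list of lists the same helper works fine.
print(column([[1, 2, 3], [4, 5, 6]], 1))  # [2, 5]
```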
David
Oh. I will look at them pronto, then. As for:
> I'd guess it's highly unlikely ever to be part of the standard
> sequence protocol, though.
Passing a slice is passing an abstraction of a sequence (since indexing
a list with a slice returns a list). Given that, it seems hard to justify
not accepting the sequence itself... Passing the abstraction rather than
the thing itself should be an optimization, as in xrange vs. range, left
to the discretion of the programmer.
-- O.L.
> No because in many cases the reason why you want to use the syntax
> l[seq] is that you already have seq, so you would refer to it by name
> and "l[s]" is definitely not hard to parse.
So you want l[seq] to be a shorter spelling of the current
[l[i] for i in seq]
which is pythonic because it is explicit, and easier to read if the code
was not written by you yesterday, although in some situations your
interpretation of l[seq] might be guessable from the context.
> Not at all. I was suggesting to use a semi-colon not a colon. Thus if
> l is (10,20,30,40), l[:3] -> (10,20,30) whereas l[(0,3)] -> (10, 30),
> i.e., same as in your class-based example, minus the parentheses, which
> I now realize are superfluous (odd that python allows you to omit the
> parentheses but not the commas).
Sorry for my misunderstanding. Yes, it would be nice to have the
possibility to index a sequence with a list of indices. Here is a pure
Python (>= 2.2) implementation of the idea:
class List(list):
    def __getitem__(self, index):
        if isinstance(index, (tuple, list)):
            return [list.__getitem__(self, i) for i in index]
        else:
            return list.__getitem__(self, index)
>>> l = List(range(0, 100, 10))
>>> l[0,2,3]
[0, 20, 30]
but in this simple version, using both commas and slices will not work
as expected:
>>> l[0,1,7:]
[0, 10, [70, 80, 90]]
>
>> 2. Slicing two dimensional object will not be possible as the notion
>> you proposed is used just for that (ex. l[1,2] which is equivalent
>> to l[(1,2)] see below), and Numeric and numarray use it.
>
> Same misunderstanding as above, I believe. Let, e.g., l be
> ((1,10,100),(2,20,200),(3,30,300),(4,40,400)). Then l[2:; 1:] ->
> [(30, 300), (40, 400)]. This is equivalent to [i[1:] for i in l[2:]]
> but, at least to me, it is the index notation that is easier to parse.
>
> Incidentally, it strikes me that there are a lot of superfluous
> commas. Why not just (1 10 100) or even 1 10 100 instead of (1,10,100)?
> The commas do make the expression harder to parse visually.
>
This will work only as long as there are no expressions in the sequence.
If there are, it is harder to read (and may be more error-prone):
(1 + 3 4 + 2)
>> >>> # multi dimensional indexing
>> >>> c[1,3]
>
> I disagree that this is "multidimensional". You are passing a list
> and getting back a list, so this is still flat. I think you are
> confusing dimensionality and cardinality.
>
That is the notion of multidimensional indexing in Python.
>> Numeric and numarray use it
>
> This is good if it is true (I have yet to look at these two because
> my work is not primarily numerical) but, again, the restriction of this
> syntax to arrays, and arrays of numeric values at that, strikes me
> as completely arbitrary.
Here is an example of a two-dimensional Numeric array and its indexing:
>>> from Numeric import *
>>> reshape(arange(9), (3,3))
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> a = reshape(arange(9), (3,3))
>>> a[1][0]
3
>>> a[1,0]
3
>>> a[(1,0)]
3
So currently indexing with a sequence has a settled meaning of
multidimensional indexing. If lists and tuples allowed indexing by
sequence, then either:
1. it might be confused with multidimensional indexing of numeric
types (the same notation for two different things), or
2. it would require a rework of multidimensional indexing, maybe with
your semicolon notation, but that would introduce incompatibilities
(backward and forward).
>
> The bottom line is that python claims to be simple but has a syntax
> which, in this little corner at least, is neither simple nor regular:
> xranges and slices are both sequence abstractions but are used in
> different contexts and have different syntaxes; C-style arrays are
> treated differently than lists of lists of ..., although they are
> conceptually equivalent; numeric structures are treated differently
> than non-numeric ones etc etc
>
With both multidimensional arrays and lists of lists you may write
a[i][j], so it is consistent; with arrays you may also write a[i,j] if
you know that you have a 2-dim array in your hand.
> Sorry for my misunderstanding, yes it would be nice to have a
> posibility to index a sequence with a list of indices, here is a pure
> Python (>= 2.2) implementation of the idea:
>
> class List(list):
>
>     def __getitem__(self, index):
>         if isinstance(index, (tuple, list)):
>             return [list.__getitem__(self, i) for i in index]
>         else:
>             return list.__getitem__(self, index)
>
> >>> l = List(range(0, 100, 10))
> >>> l[0,2,3]
> [0, 20, 30]
This is nice :-)
> but in this simple using both commas and slices will not work as
> expected
>
> >>> l[0,1,7:]
> [0, 10, [70, 80, 90]]
Your implementation can be extended to handle slices and still remains
simple:
class List3(list):
    def __getitem__(self, index):
        if hasattr(index, "__getitem__"): # is index list-like?
            result = []
            for i in index:
                if hasattr(i, "start"): # is i slice-like?
                    result.extend(list.__getitem__(self, i))
                else:
                    result.append(list.__getitem__(self, i))
            return result
        else:
            return list.__getitem__(self, index)
I have used hasattr() instead of isinstance() tests because I think that the
"we take everything that behaves like X" approach is more pythonic than
"must be an X instance".
While __getitem__() is fairly basic for lists, I am not sure if start can be
considered mandatory for slices, though.
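If relying on the start attribute feels too fragile, one alternative is to test for real slice objects instead. The sketch below assumes Python 3 and trades the duck-typing approach for isinstance() checks; the class name List4 is made up for the illustration.

```python
class List4(list):
    """Hypothetical variant of List3 that tests for real slice objects."""

    def __getitem__(self, index):
        if isinstance(index, (tuple, list)):
            result = []
            for i in index:
                if isinstance(i, slice):
                    # A slice contributes all the items it selects.
                    result.extend(list.__getitem__(self, i))
                else:
                    result.append(list.__getitem__(self, i))
            return result
        # Plain integers and plain slices behave as for an ordinary list.
        return list.__getitem__(self, index)


l = List4(range(0, 100, 10))
print(l[0, 1, 7:])  # [0, 10, 70, 80, 90]
print(l[[2, -3]])   # [20, 70]
print(l[3])         # 30
```

The isinstance() test rejects slice-like objects that are not real slices, which is exactly the flexibility the hasattr() version was meant to keep, so which of the two is preferable depends on how much duck typing is wanted.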
Peter
This is a red herring.
> What do you expect a[;1] to return if a = [[], [1, 2, 3], [4], 5]?
Whatever error python returns if you ask, e.g., for (1 2 3)[4].
Hilde
But consistency is pythonic, too ;-) and, as I pointed out in another
message, if you accept l[sequence_abstraction], why not l[actual_sequence]??
Hilde
>>Only if all the sublists are of the same length, which is guaranteed for
>>a multi-dimensional array, but not for a list of lists.
>
>
> This is a red herring.
No, it's not. It is a hint that, despite the similarity of notation
between Matrix[i][j] and NestedList[i][j], there is something
fundamentally different between the two. See below for another example.
>
>
>>What do you expect a[;1] to return if a = [[], [1, 2, 3], [4], 5]?
>
>
> Whatever error python returns if you ask, e.g., for (1 2 3)[4].
>
> Hilde
Okay, here's a better example which I thought of just after I posted my
previous reply:
Given
x = [[1, 2], [3, 4]]
the statement
x[0] = [5, 6]
will result in a nested list x = [ [5, 6], [3, 4] ]. If you think of x
as a multi-dimensional array represented by a nested list, then I've
just replaced a row of the original array
1 2
3 4
with a new row, yielding
5 6
3 4
If we had an x[;0] notation, then you would expect to be able to do the
same thing to replace a column:
x[;0] = [7, 8]
Unfortunately, there is no pre-existing list representing the first
column of x, so x[;0] has to return a new list [1, 3], and assigning to
that new list has no effect on x.
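The difference can be sketched in plain Python as it stands today (Python 3 assumed; the variable names are illustrative): the extracted column is an independent list, so replacing a column means updating every row.

```python
x = [[1, 2], [3, 4]]

# Extracting a column builds a new list; changing it leaves x alone.
col = [row[0] for row in x]
col[:] = [7, 8]
print(x)  # [[1, 2], [3, 4]]

# To replace the column, each row has to be updated in turn.
for row, value in zip(x, [7, 8]):
    row[0] = value
print(x)  # [[7, 2], [8, 4]]
```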
Again, my point is that nested lists are a fundamentally different
structure than multi-dimensional arrays. For simple things like
x[i][j], you can use a nested list to represent a multi-dimensional
array, but if you actually want to manipulate a two-dimensional array
like a matrix, you are better off using a class designed for that
purpose, like the ones defined in Numeric Python.
David