negative indices for sequence types

dan

Sep 7, 2003, 2:26:28 PM
I was recently surprised, and quite shocked in fact, to find that
Python treats negative indices into sequence types as if they were
mod(length-of-sequence), at least up to -len(seq).

This fact is *deeply* buried in the docs, and is not at all intuitive.
One of the big advantages of a high-level language such as Python is
the ability to provide run-time bounds checking on array-type
constructs. To achieve this I will now have to subclass my objects
and add it myself, which seems silly and will add significant
overhead. If you want this behavior, how hard is it to say a = b[x %
len(b)] ??
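
For illustration, the kind of subclass described above might look like the following. This is only a sketch in modern Python 3 syntax; the name `StrictList` is hypothetical, and only item reads are covered (assignment and deletion would need the same treatment).

```python
class StrictList(list):
    """Hypothetical list subclass that rejects negative integer indices."""

    def __getitem__(self, index):
        # Slices are passed through untouched; only plain integers are checked.
        if isinstance(index, int) and index < 0:
            raise IndexError("negative index not allowed: %d" % index)
        return super().__getitem__(index)


b = StrictList("abc")
print(b[0])   # first item, as usual
print(b[1:])  # slicing still works and returns a plain list
try:
    b[-1]
except IndexError as exc:
    print("IndexError:", exc)
```

The extra Python-level call on every read is the overhead the post is worried about.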

Can anyone explain why this anomaly exists, and why it should continue
to exist?

Martin v. Löwis

Sep 7, 2003, 3:06:51 PM
danb...@yahoo.com (dan) writes:

> This fact is *deeply* buried in the docs, and is not at all intuitive.

I find it highly intuitive and very convenient.

> If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??

*This* I would call un-intuitive. It is also much slower.

To get the last element, you currently write b[-1]. If that was not
available, you would have to write b[len(b)-1], which is still
significantly slower. Also, you might not have a variable name, so try
rewriting foo()[-1].

Regards,
Martin

Peter Otten

Sep 7, 2003, 3:02:55 PM
dan wrote:

After you have recovered from the shock, you probably will admit that
(1) the most common "out of bounds" case is caught:

>>> l = list("abc")
>>> l[3]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list index out of range

and
(2) that accessing elements from the end of the list is something you will
soon appreciate:
>>> l[-1]
'c'
>>>

>>> l[-2:]
['b', 'c']
>>>

I think that more code enjoys the beauty of accessing the end of a list than
suffers from uncaught <0 index errors. See the possibilities rather than
the danger :-)

Peter

Bengt Richter

Sep 7, 2003, 3:18:52 PM
On 7 Sep 2003 11:26:28 -0700, danb...@yahoo.com (dan) wrote:

>I was recently surprised, and quite shocked in fact, to find that
>Python treats negative indices into sequence types as if they were
>mod(length-of-sequence), at least up to -len(seq).
>
>This fact is *deeply* buried in the docs, and is not at all intuitive.
> One of the big advantages of a high-level language such as Python is
>the ability to provide run-time bounds checking on array-type
>constructs. To achieve this I will now have to subclass my objects
>and add it myself, which seems silly and will add significant
>overhead. If you want this behavior, how hard is it to say a = b[x %
>len(b)] ??

That isn't really the exact behavior. E.g.,

>>> range(5)
[0, 1, 2, 3, 4]
>>> range(5)[-4]
1
>>> range(5)[-5]
0
>>> range(5)[-6]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list index out of range

>>> range(5)[4]
4
>>> range(5)[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list index out of range

>Can anyone explain why this anomaly exists, and why it should continue
>to exist?
It has apparently proven more useful to have it so than not, though I sympathize
with your frustration for your use case.

Perhaps a .no_negative_indexing attribute or something could be added to the C implementation,
so that you could specify your desired checking without a performance hit.

Meanwhile, maybe an assert i>=0 in the index-supplier side of the contract might work too?
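
A minimal sketch of that supplier-side check, assuming Python 3; `require_nonnegative` and `find_offset` are hypothetical names used only for illustration. `str.find` is a convenient example because it returns -1 on failure, which would otherwise silently select the last character.

```python
def require_nonnegative(i):
    """Return i unchanged, or raise if it is negative (hypothetical helper)."""
    if i < 0:
        raise IndexError("negative index produced: %d" % i)
    return i


def find_offset(text, needle):
    # str.find returns -1 when needle is absent, which would otherwise
    # silently index the last character.
    return require_nonnegative(text.find(needle))


data = "hello"
print(data[find_offset(data, "e")])
try:
    data[find_offset(data, "z")]
except IndexError as exc:
    print("IndexError:", exc)
```

A bare `assert i >= 0` does the same job with less code, but assertions are skipped when Python runs with `-O`.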

Regards,
Bengt Richter

Terry Reedy

Sep 7, 2003, 3:31:34 PM

"dan" <danb...@yahoo.com> wrote in message
news:fbf8d8f2.03090...@posting.google.com...

> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).

No, it adds len(seq). Changing + to % would be slower and more
obscure.

> This fact is *deeply* buried in the docs,

No more so than everything else in chapter subsections. From the Ref
Man table of contents I went directly to the most obvious place, 5.3.2
Subscriptions, and found
'''
If the primary is a sequence, the expression (list) must evaluate to a
plain integer. If this value is negative, the length of the sequence
is added to it (so that, e.g., x[-1] selects the last item of x.) The
resulting value must be a nonnegative integer less than the number of
items in the sequence, and the subscription selects the item whose
index is that value (counting from zero).
'''
Translated to Python, letting idex be the result of the index expression:

if not isinstance(idex, (int,long)): raise TypeError()
if idex < 0: idex += seqlen
if idex < 0 or idex >= seqlen: raise IndexError()
<get seq[idex]>
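
The same rule can be written as a runnable function. This is an illustrative sketch in modern Python 3 (where `long` no longer exists), not the interpreter's actual implementation; the name `get_item` is hypothetical.

```python
def get_item(seq, idex):
    """Sketch of the Reference Manual's subscription rule for sequences."""
    # Note: bool is a subclass of int, so True/False are accepted here too.
    if not isinstance(idex, int):
        raise TypeError("sequence index must be an integer")
    seqlen = len(seq)
    if idex < 0:
        idex += seqlen  # a negative index has the length added exactly once
    if idex < 0 or idex >= seqlen:
        raise IndexError("index out of range")
    return seq[idex]  # idex is now within 0..seqlen-1


letters = list("abc")
print(get_item(letters, -1))  # last item
print(get_item(letters, -3))  # first item
try:
    get_item(letters, -4)
except IndexError as exc:
    print("IndexError:", exc)
```

Because the length is added only once, an index below -len(seq) stays negative after the adjustment and is rejected.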

> and is not at all intuitive.

Phrases like 'third from the end' are idiomatic English ;-)

> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs. To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead. If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??

Again, your innovation of using '%' obscures rather than clarifies.

> Can anyone explain why this anomaly exists, and why it should continue
> to exist?

Being able to abbreviate seq[len(seq)-1] as seq[-1] is quite handy and
faster executing, especially if seq is calculated from an
expression. Same for -2, etc. (And, of course, a change now would
break a noticeable fraction of existing programs.)

Terry J. Reedy


Erik Max Francis

Sep 7, 2003, 4:01:11 PM
dan wrote:

> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).

That is not the behavior of negative indices. Negative indices mean
index from the end of the sequence. So -1 means the _last_ element in
the list, -2 means the second to last element in the list, and so on.
-n (for n = len(seq)) is the first element in the list.

> This fact is *deeply* buried in the docs, and is not at all intuitive.

It's mentioned prominently (and early) in all the tutorials and books on
Python I've read, and it's a very common and convenient convention, so
I'm not sure how far you could have gotten through learning Python and
never been exposed to it.

> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs. To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead. If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??

That's simply not true. Negative indices have similar bounds
requirements. If you have a sequence of length n, then indices 0
through (n - 1) map to the elements of the sequence in order from left
to right, and indices -1 through -n map to the elements in order from
right to left. Indices greater than (n - 1) or less than -n generate
IndexErrors. Bounds checking is always done, whether on positive or
negative indices.
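
A short sketch of those bounds, assuming Python 3; `valid_indices` is a hypothetical helper name. It lists every index a sequence accepts and probes the first index past each end.

```python
def valid_indices(seq):
    """Every integer index that seq accepts: -n through n - 1."""
    n = len(seq)
    return range(-n, n)


seq = list("abcd")

# Every index in the valid range succeeds.
for i in valid_indices(seq):
    print(i, seq[i])

# The first index past either end is rejected.
rejected = []
for bad in (len(seq), -len(seq) - 1):
    try:
        seq[bad]
    except IndexError:
        rejected.append(bad)
print("rejected:", rejected)
```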

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ Then you give me that Judas Kiss / Could you hurt me more than this
\__/ Lamya

Istvan Albert

Sep 7, 2003, 9:00:41 PM
dan wrote:

> This fact is *deeply* buried in the docs, and is not at all intuitive.
> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs.

Bounds checking means that the size is tracked for you and
an exception is thrown if you are trying to access an
element *beyond* that size. That's the natural way
of thinking about it, and not "checking whether there
is an index like this in the list".

The Python way of using negative numbers in indices is
extremely handy, as many have pointed out. It would be
silly to forgo all that expressiveness just to save an
if test in some rare cases.

> To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead.

I would guess that instead of paying for this every time,
as you want to (subclassing), you could just as simply
check the index at the time when you generate it and verify
that it is correct. This way, using the same list in different
contexts will not make it less efficient.

Istvan.

Michael Peuser

Sep 8, 2003, 2:01:16 AM

"dan" <danb...@yahoo.com>

> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).
>
> This fact is *deeply* buried in the docs, and is not at all intuitive.

I think it is addressed even in most tutorials because it is quite handy, as
others already pointed out. There is the same feature in Perl.

Kindly
Michael P

Fernando Perez

Sep 8, 2003, 2:55:06 AM
dan wrote:

> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).
>
> This fact is *deeply* buried in the docs, and is not at all intuitive.

Very deeply indeed: section 3.1.4 of the beginner's tutorial:

http://www.python.org/doc/current/tut/node5.html#SECTION005140000000000000000

Of all places, this is the section on lists:

>>> a = ['spam', 'eggs', 100, 1234]

[... snip ...]

>>> a[-2]
100
>>> a[1:-1]
['eggs', 100]

> Can anyone explain why this anomaly exists, and why it should continue
> to exist?

Because this 'anomaly' is incredibly useful in many contexts, as many others
have already pointed out. Rest assured that it will continue to exist,
probably for as long as the language is around. Better get to like it :)

Cheers,

f.

dan

Sep 8, 2003, 1:11:26 PM
As is often the case, I think this comes down to documentation. While
the behavior is mentioned early in the tutorial, I found it difficult
to find it in the reference -- but whatever, we can chalk this up to
RTFM on my part.

My explanation of the behavior is correct however. list[a] always
equals list[a % len(list)]. A negative number mod N = its absolute
value subtracted from N:

a % n == n - abs(a) # where -n <= a < 0

However, if I want to count from the end of the list, I would of course
write list[len(list)-a]. I wasn't really considering that the purpose of
this feature was to count from the end of a list, which I admit could
come in handy.

Thanks for the responses.

Fernando Perez <fper...@yahoo.com> wrote in message news:<bjh94c$j9p$1...@peabody.colorado.edu>...

Chad Netzer

Sep 8, 2003, 1:52:46 PM
On Mon, 2003-09-08 at 10:11, dan wrote:

> My explanation of the behavior is correct however. list[a] always
> equals list[a % len(list)].

Many people pointed out to you that this is NOT true. In particular,
your version gives you NO bounds checking at all; every 'a' is a valid
index (for len(list) > 0). The Python behavior DOES give IndexError for
an out of bound a, and that difference is very significant, IMO.
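
The difference can be checked directly. The sketch below assumes Python 3, and `mod_index` and `raises_index_error` are hypothetical helper names: the two forms agree for indices from -len(seq) through len(seq) - 1, but outside that range the modulo form wraps around where plain indexing raises.

```python
def mod_index(seq, i):
    """Index with the proposed seq[i % len(seq)] form."""
    return seq[i % len(seq)]


def raises_index_error(seq, i):
    """True if plain seq[i] raises IndexError."""
    try:
        seq[i]
    except IndexError:
        return True
    return False


s = "0123"

# Inside -len(s) .. len(s) - 1 the two forms select the same item.
print(all(s[i] == mod_index(s, i) for i in range(-len(s), len(s))))

# Outside that range, the modulo form still returns something.
for i in (-20, -5, 4, 20):
    print(i, repr(mod_index(s, i)), raises_index_error(s, i))
```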

--
Chad Netzer


bigdog

Sep 12, 2003, 5:29:21 PM
danb...@yahoo.com (dan) wrote in message news:<fbf8d8f2.03090...@posting.google.com>...

Heck, I like it simply because I can read lines from files and easily
chop off the newline.

myStr = f.readline()[0:-1]

That alone is worth its weight in gold to me, never mind all the other
things it makes easy.

Lukasz Pankowski

Sep 12, 2003, 8:04:38 PM
msu...@comshare.com (bigdog) writes:

> myStr = f.readline()[0:-1]

This may eat the last character in the file (if the last line does not end
with a newline, which happens), but this will not:

myStr = f.readline().rstrip('\n')

but is 6 characters longer :)
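
A small sketch of the difference, assuming Python 3 and using `io.StringIO` as a stand-in for a real file whose last line has no trailing newline:

```python
import io

text = "first\nlast"  # the final line has no trailing newline

# Slicing always drops one character, newline or not.
with io.StringIO(text) as f:
    sliced = [f.readline()[0:-1] for _ in range(2)]

# rstrip('\n') only removes newlines that are actually present.
with io.StringIO(text) as f:
    stripped = [f.readline().rstrip("\n") for _ in range(2)]

print(sliced)
print(stripped)
```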

--

=*= Lukasz Pankowski =*=

Jacek Generowicz

Sep 17, 2003, 4:43:13 AM
danb...@yahoo.com (dan) hypothesizes:

> My explanation of the behavior is correct however. list[a] always
> equals list[a % len(list)]. A negative number mod N = its absolute
> value subtracted from N:

Proof by counterexample:

Python 2.2.2 (#1, Feb 8 2003, 12:11:31)
[GCC 3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = '0123'
>>> s[-20 % len(s)]
'0'
>>> s[-20]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: string index out of range


Your explanation of the behaviour is incorrect.

QED.
