"Common sense" is wrong. There are many compelling advantages to
numbering from zero instead of one:
http://lambda-the-ultimate.org/node/1950
> 2) In Python 3, why is print a function only, so that: print "Hello,
> World" is not okay, but it must be print("Hello, World") instead?
> (Yeah, I know: picky, picky . . . )
The real question is, why was print so special in Python 2 that it can
be called without parentheses? The answer was "no reason" and it was
fixed in Python 3 to be consistent with the rest of the language.
> 3) In Python 3, why does 2.0 / 3.0 display as 0.6666666666666666, but
> 8 * 3.57 displays as 28.56 (rounded off to 2 decimal places)? And
> yet, in Python 2.6, 8 * 3.57 displays as 28.559999999999999?
Because the code for displaying floats was improved in Python 3. You
can follow the fascinating discussion on issue 7117:
http://bugs.python.org/issue7117
I can't defend the rounding issues of floating point numbers in general
- it's just "one of those things" that you have to deal with. But show
me a language where floats don't have this problem.
> And we wonder why kids don't want to learn to program.
Yeah, obscure language warts, that must be the reason.
Note to self: DNFTT...
Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
ry...@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details
> From "the emperor's new clothes" department:
>
> 1) Why do Python lists start with element [0], instead of element [1]?
> "Common sense" would seem to suggest that lists should start with [1].
http://userweb.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html
> 2) In Python 3, why is print a function only, so that: print "Hello,
> World"
> is not okay, but it must be print("Hello, World") instead? (Yeah, I know:
> picky, picky . . . )
"There should be one-- and preferably only one --obvious way to do it."
> 3) In Python 3, why does 2.0 / 3.0 display as 0.6666666666666666, but 8 *
> 3.57 displays as 28.56 (rounded off to 2 decimal places)? And yet, in
> Python 2.6, 8 * 3.57 displays as 28.559999999999999?
http://mail.python.org/pipermail/python-dev/2009-October/092958.html
and replies
--
By ZeD
(In addition to the other good answers already given)
Well, "tradition" (originating from C) suggests otherwise. *Very* few
languages use 1-based indexing:
http://en.wikipedia.org/wiki/Comparison_of_programming_languages_(array)#Array_system_cross-reference_list
> 2) In Python 3, why is print a function only, so that: print "Hello, World"
> is not okay, but it must be print("Hello, World") instead? (Yeah, I know:
> picky, picky . . . )
One less special case to learn; makes the language more regular and
easier to learn. It also lets one write:
f = lambda x: print(x)
Which is not possible if print is a statement.
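A minimal sketch of what "print is a function" buys you, assuming Python 3. The names show, buffer and captured are only illustrative and do not come from the thread:

```python
import io

# print is an ordinary function in Python 3, so it can be wrapped in a lambda.
show = lambda x: print(x)
show("Hello, World")

# It can also be passed to other functions like any other callable.
for _ in map(print, ["spam", "eggs"]):
    pass

# Keyword arguments take the place of special statement syntax.
buffer = io.StringIO()
print("Hello", "World", sep=", ", file=buffer)
captured = buffer.getvalue()
```

Here captured ends up holding the text that would otherwise have gone to the screen, which a print statement could only do through its own special-case syntax.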
Cheers,
Chris
--
http://blog.rebertia.com
I think the reason why is just historical; C uses zero-based indices.
In C, an array index is an offset with respect to the pointer that the
array variable actually is, so 0 makes sense (my_array[0] == *my_array).
I'm not convinced (yet) by Dijkstra's reasoning. *Maybe* if you want
to describe a range with two </<='s, it makes sense. But neither Python
(nor C, nor ...) uses that notation. I agree with the OP that the first
item in a list would most naturally be called item 1, and therefore
have index 1. (This doesn't mean I'm in favor of 1-based indices)
One of the reasons I like python so much, is that you (almost) never
have to use indices. Normally you just iterate over the elements. If I
ever need indices, it's a strong indication that I actually want a
dictionary.
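A small sketch of the "you (almost) never need indices" style, assuming a made-up list called names. When a human-facing position is wanted, enumerate can start counting at 1 without touching the underlying indexing:

```python
names = ["alice", "bob", "carol"]

# The usual case: iterate over the elements, no index in sight.
upper = [name.upper() for name in names]

# When a position is needed for display, enumerate can count from 1.
numbered = ["%d. %s" % (position, name)
            for position, name in enumerate(names, start=1)]

for line in numbered:
    print(line)
```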
Cheers, Roald
> "Common sense" is wrong. There are many compelling advantages to
> numbering from zero instead of one:
>
> http://lambda-the-ultimate.org/node/1950
It makes sense in assembly language and even in many byte code languages.
It makes sense if you look at the internal representation of unsigned
numbers (which might become an index)
For a complete beginner common sense dictates differently and there
might be confusion why the second element in a list has index 1.
However I seriously doubt, that this is a real problem.
You learn things like this on the first day of learning a programming
language.
>
>> 2) In Python 3, why is print a function only, so that: print "Hello,
>> World" is not okay, but it must be print("Hello, World") instead?
>> (Yeah, I know: picky, picky . . . )
>
> The real question is, why was print so special in Python 2 that it can
> be called without parentheses? The answer was "no reason" and it was
> fixed in Python 3 to be consistent with the rest of the language.
>
>> 3) In Python 3, why does 2.0 / 3.0 display as 0.6666666666666666, but
>> 8 * 3.57 displays as 28.56 (rounded off to 2 decimal places)? And
>> yet, in Python 2.6, 8 * 3.57 displays as 28.559999999999999?
>
> Because the code for displaying floats was improved in python 3. You
> can follow the fascinating discussion on issue 7117:
>
> http://bugs.python.org/issue7117
>
> I can't defend the rounding issues of floating point numbers in general
> - it's just "one of those things" that you have to deal with. But show
> me a language where floats don't have this problem.
>
>> And we wonder why kids don't want to learn to program.
>
I did not see the original post, but this statement sounds rather
trollish to me.
It might just be, that you can do so many things on a computer without
having to program. Watching youtube or browsing the web, chatting about
favourite PC games, the amount of SW that you can download for almost
every task makes it much less attractive to write your own programs.
When my parents had their first computer there were very few games
and PCs weren't connected to the net.
So if I wanted to play with the computer I had mostly the choice between
the games called:
- basic
- pascal
- word star
- super calc
Syntax details are hardly enough to frighten children.
Children very often start by typing in small programs without
understanding them, looking at the results and changing what they believe
they understand.
Non-native English speakers can write programs before they even know
what the English words 'if' 'else' 'while' 'list' mean. They don't care.
They learn that if 'starts' a condition and that 'else' is the beginning
of the section to be executed if the condition is not true.
As others have pointed out, there is a nice argument to be made for
zero-based indices. However, the killer reason is: "it's what everybody
else does." As it stands, the only perceived problem with zero-based
indices is that it's one of the many tiny confusions that new
programmers face. On the other hand, it's the way nearly every other
popular programming language does it, and therefore, it's the way almost
every programmer likes to think about sequences.
Also, it has the nice property that, for an infinite sequence, every
integer makes sense as an index (in Python).
>
> 2) In Python 3, why is print a function only, so that: print "Hello,
> World" is not okay, but it must be print("Hello, World") instead?
> (Yeah, I know: picky, picky . . . )
>
> 3) In Python 3, why does 2.0 / 3.0 display as 0.6666666666666666, but 8
> * 3.57 displays as 28.56 (rounded off to 2 decimal places)? And yet, in
> Python 2.6, 8 * 3.57 displays as 28.559999999999999?
0:pts/3:~% python3.1
Python 3.1.2 (release31-maint, Jul 8 2010, 09:18:08)
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 28.56
28.56
>>>
0:pts/3:~% python2.6
Python 2.6.6rc1+ (r266rc1:83691, Aug 5 2010, 17:07:04)
[GCC 4.4.5 20100728 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 28.56
28.559999999999999
>>>
0:pts/3:~%
same number - why use more digits if you can avoid it? Python 3 is smart
enough to avoid it.
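The "same number, fewer digits" point can be checked directly. This is a sketch assuming Python 3.1 or later, where repr() picks the shortest string that converts back to the same float; the variable names are illustrative:

```python
x = 8 * 3.57

short = repr(x)             # shortest string that round-trips to x
long_ = format(x, ".17g")   # 17 significant digits, the older style of display

# Both strings denote exactly the same floating point value.
same = float(short) == float(long_) == x

print(short)
print(long_)
print(same)
```

Neither display is more accurate than the other. The stored binary value is identical, and only the choice of decimal string differs.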
>
> And we wonder why kids don't want to learn to program.
Don't kids want to learn to program? Many don't, a fair bunch do. It's
the same for any other art. Also, the only people that realize this kind
of "issue" are those that have already learned programming.
Would said beginner also be surprised that a newborn baby is zero years
old or would it be more natural to call them a one year old? Zero
based counting is perfectly natural.
--
D'Arcy J.M. Cain <da...@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
A new born baby is in his/her first year. It's year 1 of his/her life.
For this reason, also "the year 0" doesn't exist. From the fact that a
baby can be half a year old, you derive that arrays should have floats
as indices?
> On 08/07/2010 05:05 AM, Default User wrote:
>> From "the emperor's new clothes" department:
>>
>> 1) Why do Python lists start with element [0], instead of element [1]?
>> "Common sense" would seem to suggest that lists should start with [1].
>
> As others have pointed out, there is a nice argument to be made for
> zero-based indices. However, the killer reason is: "it's what everybody
> else does."
I'll have you know that there are still some Pascal programmers in the
world, thank you.
> As it stands, the only perceived problem with zero-based
> indices is that it's one of the many tiny confusions that new
> programmers face. On the other hand, it's the way nearly every other
> popular programming language does it, and therefore, it's the way almost
> every programmer likes to think about sequences.
It didn't take me long to get used to thinking in zero-based indexes, but
years later, I still find it hard to *talk* in zero-based indexes. It's
bad enough saying that the first element in a list is the zeroth element,
but that the second element is the first makes my head explode...
> Also, it has the nice property that, for an infinite sequence, every
> integer makes sense as an index (in Python).
Er, what's the -1th element of an infinite sequence?
--
Steven
No. You are giving me math and logic but the subject was common
sense. Common usage counts ages as years with the second year called
"one year old" so zero based counting is common. We don't tell Aunt
Martha that little Jimmy is in his third year. We say that he is two
years old and Aunt Martha, a non-programmer, understands exactly what
we mean. Using one-based counting (first year, second year, etc.)
would be the unnatural thing, would confuse Aunt Martha and make her
spoil her apple pie and no one wants that.
My point is that "0" in "Jimmy is 0" doesn't play the same role as in
"item 0 of a sequence".
> On Sat, 07 Aug 2010 13:48:32 +0200
> News123 <news...@free.fr> wrote:
>> It makes sense in assembly language and even in many byte code
>> languages. It makes sense if you look at the internal representation of
>> unsigned numbers (which might become an index)
>>
>> For a complete beginner common sense dictates differently and there
>> might be confusion why the second element in a list has index 1.
>
> Would said beginner also be surprised that a newborn baby is zero years
> old or would it be more natural to call them a one year old? Zero based
> counting is perfectly natural.
There's nothing natural about saying that a baby is zero years old. A
newborn baby is "a newborn baby", then it's "one day old", "two days
old", ... "one month old", "two months old", ... "one year old".
In any case, we're discussing *ordinals*, not cardinal numbers. The
ordinals in common English are first, second, third, ... but in computing
frequently zeroth, first, second, third, ...
There is a reason why mathematicians describe the integers 1, 2, 3, ...
as the "Natural Numbers". It took thousands of years of human
civilization before people accepted that zero was a number. In fact, for
the ancient Greeks, one wasn't a number either. We still reserve the term
"a number of X" to refer to more than one X, and would feel cheated if
somebody offered us a number of gifts and then gave us only a single one.
"Number" refers to a plurality, and one is singular.
According to Euclid, one was the monad, the indivisible unit from which
the numbers were formed. As late as 1537, the German mathematician Jacob
Kobel wrote "1 is no number, but it is a generatrix, beginning and
foundation for all other numbers".
In short, there's nothing natural about counting numbers at all, let
alone whether we should start at 0 or 1.
P.S. I don't know if I should be gratified or disappointed that nobody
has yet quoted Stan Kelly-Bootle:
"Should array indices start at 0 or 1? My compromise of 0.5 was rejected
without, I thought, proper consideration."
--
Steven
zeroth
oneth
twoth
;-)
(element no. one is a better way of pronouncing it.)
>
>
>> Also, it has the nice property that, for an infinite sequence, every
>> integer makes sense as an index (in Python).
>
> Er, what's the -1th element of an infinite sequence?
well, it's the first from the other end. The infinite bit is in between,
thank you very much. ;-)
>> A new born baby is in his/her first year. It's year 1 of his/her life.
>> For this reason, also "the year 0" doesn't exist. From the fact that a
>> baby can be half a year old, you derive that arrays should have floats
>> as indices?
>
> No. You are giving me math and logic but the subject was common
> sense. Common usage counts ages as years with the second year called
> "one year old" so zero based counting is common. We don't tell Aunt
> Martha that little Jimmy is in his third year.
Apparently, the Japanese used to (before they started adopting western
conventions). I.e. ages were given as "in his tenth year" (meaning nine
years old).
>> "Common sense" is wrong. There are many compelling advantages to
>> numbering from zero instead of one:
>>
>> http://lambda-the-ultimate.org/node/1950
>
> It makes sense in assembly language and even in many byte code languages.
> It makes sense if you look at the internal representation of unsigned
> numbers (which might become an index)
It also makes sense mathematically. E.g. for an MxN array stored as a
1-dimensional array, the element a[j][i] is at index
j * N + i
with zero-based indices but:
(j-1) * N + (i-1) + 1
= j * N + i - N
with one-based indices.
IOW, if a language uses one-based indices, it will inevitably end up
converting to and from zero-based indices under the hood, and may end up
forcing the user to do likewise if they need to do their own array
manipulation.
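The two formulas above can be compared side by side. A sketch, assuming row-major storage of an M x N array in a flat list; the function names are hypothetical:

```python
def flat_index_zero_based(j, i, n_cols):
    """Zero-based row j and column i -> zero-based position in flat storage."""
    return j * n_cols + i


def flat_index_one_based(j, i, n_cols):
    """One-based row j and column i -> one-based position in flat storage."""
    return (j - 1) * n_cols + (i - 1) + 1


M, N = 3, 4  # 3 rows, 4 columns

zero_based = [flat_index_zero_based(j, i, N)
              for j in range(M) for i in range(N)]
one_based = [flat_index_one_based(j, i, N)
             for j in range(1, M + 1) for i in range(1, N + 1)]

print(zero_based)
print(one_based)
```

Both versions walk the storage in order, but the one-based version has to subtract and re-add 1 on every access, which is the hidden conversion described above.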
Nice example!
FORTRAN, MATLAB, and Octave all use 1-based subscripts.
The languages which have real multidimensional arrays, rather
than arrays of arrays, tend to use 1-based subscripts. That
reflects standard practice in mathematics.
John Nagle
And it's "I", not "i". :-)
> When any object is "born" (whether it be a life form, or a planet, or
> even a class instance) "it" will be zero years old until 1 year of
> time has passed. If you want to properly describe age you
> could say a baby who was born five minutes ago is...
>
> - 0 millenniums
millennia
> - 0 centuries
> - 0 decades
> - 0 years
> - 0 months
> - 0 days
> - 0 hours
> - 5 minutes
> - 60*5 seconds
> - (60*5)*1000 millisecond
> - crikey i'm tired!
>
> Just because Aunt Martha is too lazy to list out the details, that
> has no effect on reality. YES a newborn is zero years old. YES, a
> newborn is zero months old, ...and so on.
> Well not if you are referring to how people "say" things. But what
> people "say" and the facts of reality are some times two different
> things. Heck we even have a few folks in this group who overuse the
> expression "used to" quite frequently in place of the more correct
> term "previously" -- i won't give names.
<RANT>
Rick, do you know *ANY* other language other than English? Not
everybody understands English on an Oxford level (and I doubt you
do). You're just a loud-mouthing (probably incorrect English) idiot.
(Spare your comments about personal attacks, you're not so stupid to
not know why you regularly receive such comments)
Lurking for long enough to know your style. Looking at your Unicode
rant, combined with some other comments and your general "I am right
and you are wrong because you disagree with me." style, I came to
the conclusion, that you are either a fascist or the perfect role
model for an imperialistic, foreign culture destroying,
self-praising, arrogant, ignorant moron.
> When any object is "born" (whether it be a life form, or a planet, or
> even a class instance) "it" will be zero years old until 1 year of
> time has passed.
A year is not the smallest index. A year is 365 days is 8760 hours
is 525600 minutes is 31536000 seconds, and so on and so on and so on...
That's *totally* different from array[0], array[1], etc. There is
*NO* array[0.5].
Going down to the smallest possible time scale (quantum physics level),
what is the correct index for "my baby is *just* born, hence
it's X quantum leaps old", where X might be 0 or 1?
Yes, I know that this is nonsense (since such a discrete quantum
leap doesn't even exist) but so is the whole discussion.
Something starts at 0, others at 1. You only have to add/sub 1
depending on the situation. Mind-boggling, I must say!
And really, even a brand-new programmer will face *A LOT* harder
problems than this. Programming is not stamp collecting! Programming
requires a half working brain, and those who don't have that,
should sort their stamps while watching some "Next top model" stuff
on TV.
</RANT>
"Make everything as simple as possible, but not simpler."
- Einstein
Rick? Is that you?
Geremy Condra
> It didn't take me long to get used to thinking in zero-based indexes,
> but years later, I still find it hard to *talk* in zero-based indexes.
> It's bad enough saying that the first element in a list in the zeroth
> element, but that the second element is the first makes my head
> explode...
Don't say those things, then. In addition to making your head explode,
they're not true.
There is no “zeroth element” in a sequence.
The first element in a sequence is addressed by index 0.
The second element in a sequence is addressed by index 1.
The last element in a sequence is addressed by index -1.
In other words, it's never true (in Python) that index N addresses the
Nth element of the sequence; so that's not a useful equivalence to
maintain.
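The three statements above, written out as a quick check. The list seq is just an illustrative example:

```python
seq = ["first", "second", "third"]

first = seq[0]    # the first element is addressed by index 0
second = seq[1]   # the second element is addressed by index 1
last = seq[-1]    # the last element is addressed by index -1

# A negative index -k addresses the same element as index len(seq) - k.
same_element = all(seq[-k] is seq[len(seq) - k]
                   for k in range(1, len(seq) + 1))

print(first, second, last, same_element)
```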
Hope that helps.
--
\ “The trouble with eating Italian food is that five or six days |
`\ later you're hungry again.” —George Miller |
_o__) |
Ben Finney
> No. You are giving me math and logic but the subject was common
> sense.
Common sense is often unhelpful, and in such cases the best way to teach
something is to plainly contradict that common sense.
Common sense, for example, would have the Earth as a flat surface with
the Sun and Moon as roughly-identically-sized objects orbiting the
Earth. Is it better to pander to that common sense, or to vigorously
reject it in order to teach something more useful?
--
\ “Courage is not the absence of fear, but the decision that |
`\ something else is more important than fear.” —Ambrose Redmoon |
_o__) |
Ben Finney
Koreans still do this. The day a child is born it is "one". Even
odder to me, the next birthday is not on the next anniversary of the
birth, but on the following New Year's Day. So a kid who is born on
Dec 26th, will be "two" as of New Year's Day the week following his/
her birth. (They also are aware of the "western" version of their
ages if needed).
I propose that this has less to do with the fact that those languages
have "real" multidimensional arrays, and more to do with the fact that
those languages are designed for doing mathematics. C, Oberon, and
others also have "real" multidimensional arrays, and use 0-based
subscripts. Standard practice in mathematics is not necessarily best
practice in programming.
Cheers,
Ian
> > Apparently, the Japanese used to (before they started adopting western
> > conventions). I.e. ages were given as "in his tenth year" (meaning nine
> > years old).
With apologies to Paul Simon...
One man's ceil() is another man's floor()
Python is also about being readable and consistent. It's going to get
really confusing if half the files use 1-based lists and the other
half use 0-based. Is it really that hard to get used to indices
running from 0 to length-1? Also, good luck getting through life
without running into C, C++, C#, Visual Basic, F#, Java, Ruby, Perl,
Lisp, or OCaml code. Along with all the languages that use 0-based
arrays.
It's all part of learning a programming language. Some have 0-based
indexing, others have 1-based indexing; some have mutable strings,
others have immutable strings, still others don't have 'proper' strings.
Just learn to adapt.
Because zero is the neutral element of addition. And indexes
(and all addresses in computing) involve addition much more than
multiplication! That seems clear to me, and it would be silly to use one
as the first index of arrays/lists in a programming language!
> Lurking for long enough to know your style. Looking at your Unicode
> rant, combined with some other comments and your general "I am right
> and you are wrong because you disagree with me." style, I came to
> the conclusion, that you are either a fascist or the perfect role
> model for an imperialistic, foreign culture destroying,
> self-praising, arrogant, ignorant moron.
IOW, the "Ugly American".
I assure you that we here in the US are aware of the problem and do
try to keep them from annoying the rest of the world by providing them
with an abundance of junk food, bad television, worse movies, and a
variety of sporting events tailored to their tastes.
Unfortunately the advent of cheap air-travel in the latter half of the
20th century has allowed small groups of them to escape and wander
about various parts of the world (mainly Europe) in search of KFCs
while complaining about the toilets, the food, the money, the weather,
the roads, the cars, and so on. They are easily identified by their
clothing and their belief that everybody understands English if you
speak it loudly and slowly while accompanying it with exaggerated,
mostly-random hand gestures.
Just avoid them if possible. Don't worry, the less contact they have
with "furriners" the happier they are (which makes one wonder why they
leave their home territories -- research on that topic is ongoing).
Unfortunately, if you work in the hospitality industries there's not
much you can do other than grit your teeth, cross your fingers, and
hope you'll end up with a batch that over-tips.
--
Grant Edwards grant.b.edwards Yow! Youth of today!
at Join me in a mass rally
gmail.com for traditional mental
attitudes!
> IOW, the "Ugly American".
[snip hate rant]
Stereotypically bashing "Americans" is as ugly and obnoxious as bashing
any other ethnic group. I have traveled the world and Americans are no
worse, but are pretty much the same mix of good and bad. It is certainly
off-topic and inappropriate for this group.
--
Terry Jan Reedy
I wasn't bashing "Americans". I was making light of a certain type of
American tourist commonly denoted by the phrase "ugly american".
> is as ugly and obnoxious as bashing any other ethnic group. I have
> traveled the world and Americans are no worse, but are pretty much
> the same mix of good and bad.
I've travelled the world as well, and I think that Americans do indeed
make worse "tourists" than most others. I've seen a lot of European
and Asian tourists in the US, and I've never seen from them the types
of behavior for which the "Ugly American" tourist is famous.
I've never been confronted here in the US by a Japanese tourist who
thought that if he spoke Japanese to a store clerk loudly and slowly
the clerk would understand. I've never seen European tourists trying
to avoid eating "American" food or complaining about the electrical
outlets.
> It is certainly off-topic and inappropriate for this group.
To that I'll confess.
--
Grant Edwards grant.b.edwards Yow! I feel ... JUGULAR ...
at
gmail.com
Depends whether you are counting (discrete) things, or measuring them (over
a continuous range).
You would start counting at 1, but start measuring from 0.
--
Bartc
> = j * N + i - N
>
> with one-based indices.
In other words, an extra offset to be added, in an expression already using
a multiply and add, and which likely also needs an extra multiply and add to
get the byte address of the element.
(And often, the a[j][i] expression will be in a loop, which can be compiled
to a pointer that just steps from one element to the next using a single
add.)
The indices i and j might anyway be user data which happens to be 1-based.
And if the context is Python, I doubt whether the choice of 0-based over a
1-based makes that much difference in execution speed.
(I've implemented languages that allow both 0 and 1-based indexing (and
N-based for that matter). Both are useful. But my interpreted languages tend
to use 1-based default indexing as it seems more natural and 'obvious')
>
> IOW, if a language uses one-based indices, it will inevitably end up
> converting to and from zero-based indices under the hood,
Sometimes. At the very low level (static, fixed array), the cost is absorbed
into the address calculation. At a higher level, the cost is less
significant, or there might be tricks to avoid the extra addition.
> and may end up
> forcing the user to do likewise if they need to do their own array
> manipulation.
Lots of things require this sort of calculation, eg. how many pages are
needed to print 267 lines of text at 60 lines per page? These counts are
1-based so it's (L-1)/P+1 (integer divide), or 5 pages.
If we switch to 0-based counting, it's just L/P ('266' lines require '4'
pages), but who's going to explain that to the user?
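The page count above can be written with Python's integer division. A sketch; pages_needed is a hypothetical helper and assumes at least one line to print:

```python
def pages_needed(lines, per_page):
    """Pages needed to print `lines` lines (lines >= 1) at `per_page` per page."""
    return (lines - 1) // per_page + 1


print(pages_needed(267, 60))   # the example from the post: 5 pages
print(pages_needed(60, 60))    # an exactly full page still needs only 1
print(pages_needed(61, 60))    # one line over spills onto a second page
```

The "- 1 ... + 1" pair is the same one-based to zero-based round trip discussed earlier: shift to a zero-based count, divide, shift back.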
--
Bartc
JM
"Ignorance is the mother of all traditions" (V. Hugo)
Pardon the response to the response. I missed Ben's message.
> Ben Finney wrote:
> > "D'Arcy J.M. Cain" <da...@druid.net> writes:
> >> No. You are giving me math and logic but the subject was common
> >> sense.
> >
> > Common sense is often unhelpful, and in such cases the best way to teach
> > something is to plainly contradict that common sense.
I even agree with you. However, the OP was claiming that zero based
counting contradicted common sense and that was what I was responding
to. I would never use "common sense" to prove anything.
--
D'Arcy J.M. Cain <da...@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
Just for the record:
I sincerely apologize for my rant. I usually don't lose control so
heavily, but this "Rick" person makes me mad (killfile'd now)
>> IOW, the "Ugly American".
No! That's not what I said. I'm myself one of those "bad germans"
and I *entirely* agree with Mr. Reedy's comment:
> Stereotypically bashing "Americans" is as ugly and obnoxious as bashing
> any other ethnic group. I have traveled the world and Americans are no
> worse, but are pretty much the same mix of good and bad. It is certainly
If this were really true, lists would be 1-based. I go back to
WATFOR; and Fortran (and I believe Cobol and PL/I, though I'm not
positive about them) were 1-based. (Now that I think about it, PL/I,
knowing IBM, could probably be set to use either) Back then, everyone
else was doing 1-based lists.
In my opinion, the reason lists are 0-based started with a lazy
programmer who decided that his own comfort (using 0-based addressing
at the machine level and not having to translate the high-level 1-
based language index into a low-level 0-based index) was paramount
over teaching the language and having it make sense in the real
world. After all, not even Brian Kernighan thinks books start on page
0. I'm not singling out C in this case because it is a relatively low-
level language for low-level programmers and 0-based lists make
perfect sense in that context. But then every compiler/interpreter
programmer after that stopped caring about it.
I smile every time I see the nonsensical sentence "The first
thing, therefore, is in thing[0]" in a programming language learning
book or tutorial. I laugh every time I hear someone defend that as
common sense. Every three year old watching Sesame Street knows
counting things starts with '1', not '0'. When you were three and you
counted your blocks, you started with '1', not '0'. The whole rest of
the world understands that implicitly, even if their counting starts
'1', '2', 'many'. 0-based lists are NOT common sense. They only make
sense to the programmers of computer languages, and their fanbois.
There may be loads of reasons for it, but don't throw common sense
around as one of them.
Den
It's a good thing then that I didn't:
>> ... However, the killer reason is: "it's what everybody
>> else does.
>>
>
"Where it all started" is that 0-based indexing gives languages like C a
very nice property: a[i] and *(a+i) are equivalent in C. From a language
design viewpoint, I think that's quite a strong argument. Languages
based directly on C (C++, Objective C, ...) can't break with this for
obvious reasons, and other language designers/implementers imitated this
behaviour without any good reason to do so, or not to do so. In
higher-level languages, it doesn't really matter. 1-based indexing might
seem more intuitive, but in the end, it's just another thing you have to
learn when learning a language, like "commas make tuples", and somebody
studying a programming language learns it, and gets used to it if they
aren't used to it already.
I think the main reason zero-based indexing is chosen in higher
level languages is the following useful property:
x[n:m] + x[m:len(x)] == x[n:len(x)]
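The splitting property of half-open slices can be checked exhaustively on a small example. A sketch with an illustrative list; in general, adjacent slices satisfy x[n:m] + x[m:k] == x[n:k] for 0 <= n <= m <= k <= len(x):

```python
x = list("charlie")

# Cutting at any position m and gluing the halves back gives the original.
splits_ok = all(x[:m] + x[m:] == x for m in range(len(x) + 1))

# More generally, adjacent half-open slices concatenate without overlap or gap.
adjacent_ok = all(
    x[n:m] + x[m:k] == x[n:k]
    for n in range(len(x) + 1)
    for m in range(n, len(x) + 1)
    for k in range(m, len(x) + 1)
)

print(splits_ok, adjacent_ok)
```

No "+ 1" or "- 1" appears anywhere, because the end of one slice is exactly the start of the next.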
--
Neil Cerutti
> "Where it all started" is that 0-based indexing gives languages like C a
> very nice property: a[i] and *(a+i) are equivalent in C. From a language
> design viewpoint, I think that's quite a strong argument. Languages
> based directly on C (C++, Objective C, ...) can't break with this for
> obvious reasons, and other language designers/implementers imitated this
> behaviour without any good reason to do so, or not to do so. In
> higher-level languages, it doesn't really matter. 1-based indexing might
> seem more intuitive.
>
In a higher level language 1-based indexing is just as limiting as 0-
based indexing. What you really want is the ability to declare the index
range to suit the problem: in Algol 60 it is very useful to be able to
declare something like:
real sample[-500:750];
and Algol 68 went even further:
flex [1:0] int count
where the array bounds change dynamically with each assignment to
'count'. Iteration is supported by the lwb and upb operators which return
the bounds of an array, so you can write:
for i from lwb count to upb count do....
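Python has no built-in equivalent of declared index bounds, but the idea can be approximated. The following is only a sketch: BoundedArray is a hypothetical class, not a standard library type, and it models a fixed range like the Algol 60 declaration rather than Algol 68's flexible arrays:

```python
class BoundedArray:
    """Fixed-size array indexed over a caller-chosen inclusive range [lwb, upb]."""

    def __init__(self, lwb, upb, fill=0.0):
        if upb < lwb - 1:
            raise ValueError("upb must be at least lwb - 1")
        self.lwb = lwb
        self.upb = upb
        self._items = [fill] * (upb - lwb + 1)

    def _offset(self, index):
        # Translate a user-facing index into a zero-based list position.
        if not self.lwb <= index <= self.upb:
            raise IndexError("index %r outside [%d:%d]" % (index, self.lwb, self.upb))
        return index - self.lwb

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[self._offset(index)]

    def __setitem__(self, index, value):
        self._items[self._offset(index)] = value


# Roughly "real sample[-500:750];"
sample = BoundedArray(-500, 750)
sample[-500] = 1.5
sample[750] = 2.5

# Roughly "for i from lwb sample to upb sample do ..."
total = sum(sample[i] for i in range(sample.lwb, sample.upb + 1))
print(len(sample), total)
```

Note that the wrapper still converts to a zero-based offset internally, in the _offset method.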
--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |
And I doubt anyone cares about execution speed when deciding whether to
use 1-based or 0-based array. The reason why you want to choose the
alternative that use less conversion to the other system is to simplify
the source code.
Many common mathematical/physics/economics formulas are expressed much
more simply if we use 0-based counting:
* arithmetic series:
- 1-based: s(n) = a + (n - 1) * d
- 0-based: s(n) = a + n * d
* geometric series:
- 1-based: g(n) = a * r**(n - 1)
- 0-based: g(n) = a * r**n
* equation of motion:
- 1-based: x(t) = x0 + 1/2 * a * (t - 1)**2
- 0-based: x(t) = x0 + 1/2 * a * t**2
* exponential growth/decay:
- 1-based: d(t) = a * e**(k * (t - 1))
- 0-based: d(t) = a * e**(k*t)
In fact, most textbooks already use the 0-based form for some of
these formulas. Most physics and economics textbooks show
the 0-based variation of the formula and refer to t=0 as the point in
time where the "event" started.
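As a concrete instance of the list above, here are the two forms of the arithmetic series term. A sketch; the function names and the sample values of a and d are only illustrative:

```python
def term_zero_based(a, d, n):
    """n-th term counting from n = 0: s(n) = a + n * d."""
    return a + n * d


def term_one_based(a, d, n):
    """n-th term counting from n = 1: s(n) = a + (n - 1) * d."""
    return a + (n - 1) * d


a, d = 10, 3

from_zero = [term_zero_based(a, d, n) for n in range(5)]
from_one = [term_one_based(a, d, n) for n in range(1, 6)]

print(from_zero)
print(from_one)
```

Both produce the same sequence; the one-based form simply carries an extra "- 1" in every term.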
I often used this model of thinking for 0-based array indices (and
negative indices):
 -7  -6  -5  -4  -3  -2  -1
  +---+---+---+---+---+---+---+
  | c | h | a | r | l | i | e |
  +---+---+---+---+---+---+---+
  0   1   2   3   4   5   6  (7)
instead of:
In short, the choice of 0-based array is of practical purpose, rather
than historical purpose.
so to repeat, I often use this model of thinking:
-7  -6  -5  -4  -3  -2  -1
 +---+---+---+---+---+---+---+
 | c | h | a | r | l | i | e |
 +---+---+---+---+---+---+---+
 0   1   2   3   4   5   6  (7)
instead of:
  -7  -6  -5  -4  -3  -2  -1
 +---+---+---+---+---+---+---+
 | c | h | a | r | l | i | e |
 +---+---+---+---+---+---+---+
   0   1   2   3   4   5   6
that is, the indices refer to the "gaps" between the array entries. The
"gap index" model highlights the naturalness of using 0-based arrays,
negative indices, array slicing, and half-open intervals.
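The gap model can be checked against Python's own slicing. This is only a small sketch using the same "charlie" string; nothing here goes beyond standard slice behaviour.

```python
s = "charlie"

# Slices cut at gaps: gap 0 is before 'c', gap 7 is after 'e'.
assert s[0:4] == "char"
assert s[4:7] == "lie"

# Negative gap numbers count back from the right-hand end.
assert s[-3:] == "lie"
assert s[-7:-3] == "char"

# Cutting at the same gap twice gives an empty slice, and two
# slices that share a gap reassemble the original with no overlap.
assert s[3:3] == ""
assert s[:4] + s[4:] == s

# Half-open intervals: the length is simply stop - start.
assert len(s[2:5]) == 5 - 2
```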
>
> I smile every time I see the nonsensical sentence "The first
> thing, therefore, is in thing[0]" in a programming language learning
> book or tutorial. I laugh every time I hear someone defend that as
> common sense.
If one thinks in terms of slicing at gap positions, the 'proper' indexes
would range from 0.5 (average of 0 and 1) to n-0.5. For convenience, we
round down or up. To put it another way, seq[n:n+1] is abbreviated as
either seq[n] or seq[n+1]. Put this way, the first choice is at least as
sensible as the second.
Given that Python allows indexing from both ends, I prefer 0,1,2,... and
-1,-2,-3,... to 1,2,3... and 0,-1,-2,... or 1,2,3,... and -1,-2,-3.
As someone else pointed out, discretizing a continuous variable starting
at 0 gives 0,1,2,... so having indexes that match is handy.
If a problem is formulated in terms of 1,2,3, one can simply leave the
first cell blank rather than reformulate. If a problem is formulated in
terms of 0,1,2,... and indexes are 1 based, then one must reformulate.
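As a sketch of the "leave the first cell blank" approach, take a problem that is naturally numbered from 1. The month table below is an invented example, not something from the thread.

```python
# Problem stated in 1-based terms: month numbers 1..12.
# Cell 0 is a deliberately unused placeholder.
days_in_month = [None, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

january = days_in_month[1]     # no "- 1" needed when indexing
february = days_in_month[2]    # non-leap year
december = days_in_month[12]

# The price of the blank cell: the length is one more than the count.
month_count = len(days_in_month) - 1
```

The cost is one wasted cell and a length that is off by one, which is usually cheaper than rewriting every formula in the problem.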
> Every three year old watching Sesame Street knows
> counting things starts with '1', not '0'.
And that is the same mistake that most societies make, the mistake that
put a lid on Greek math, science, and finance. All fresh counts begin
with 0. Counting by people usually begins with a silent 0, just as fresh
tallies begin with a blank stick or paper.
But not always. For instance, let's count the people who have, up to now,
become billionaires with Python. We start with an overt 0. Now we can
discuss whether the founders of Google should increase that to 2.
Mechanical counting requires an overt 0. A car odometer starts at 0, not
1 and not . Have you never written a counting program? Starting with n
= 1 instead of n = 0 before counting the first item would be a bad bug.
> There may be loads of reasons for it, but don't throw common sense
> around as one of them.
I won't. Only a few (about 3 or 4) societies included a proper 0 in
their number systems.
--
Terry Jan Reedy
Ugh, no. The ability to change the minimum index is evil. I don't
much care whether a high-level language uses 0-based or 1-based
indexing, but I do care that it is consistent. On the occasions when
I am forced to use Visual Basic, the single biggest wart that drives
me up a wall is constantly having to figure out whether the particular
thing that I am currently indexing is 0-based or 1-based.
Cheers,
Ian
> The languages which have real multidimensional arrays, rather
> than arrays of arrays, tend to use 1-based subscripts. That
> reflects standard practice in mathematics.
Actually I’d go one better, and say that the languages that have real
multidimensional arrays allow you to explicitly specify both the lower and
upper bounds of each dimension. E.g. Ada, ALGOL 68. Heck, even Pascal
allowed you to do that.
> "Where it all started" is that 0-based indexing gives languages like C a
> very nice property: a[i] and *(a+i) are equivalent in C. From a language
> design viewpoint, I think that's quite a strong argument.
It would be if pointers and arrays were the same thing in C. Only they’re
not, quite. Which somewhat defeats the point of trying to make them look the
same, don’t you think?
> The ability to change the minimum index is evil.
Pascal allowed you to do that. And nobody ever characterized Pascal as
“evil”. Not for that reason, anyway...
How are they not the same?
The code snippet (in C/C++) below is valid, so arrays are just
pointers. The only difference is that the notation x[4] reserves space
for 4 (consecutive) ints, and the notation *y doesn't.
int x[4];
int *y = x;
Moreover, the following is valid (though unsafe) C/C++:
int *x;
int y = x[4];
Cheers, Roald
True, but that something is "standard mathematical notation" doesn't
mean it's preferable. For example, I have never seen keyword arguments
in mathematical notation, and it's definitely not standard practice,
but nobody would drop them in favor of standard mathematical notation.
In fact, I think, regularly mathematical notation can be improved by
standard programming notation.
Moreover, I don't see what's so nice about 'real' multidimensional
arrays; the way to construct multidimensional arrays from one-
dimensional ones is more orthogonal. And you never *have* to think
about them as being one-dimensional, it's just a bonus you can
(sometimes) profit from.
Just to demonstrate that they are different, the following code
compiles cleanly:
int main() {
    int *pointer;
    pointer++;
    return 0;
}
While this does not:
int main() {
    int array[0];
    array++;
    return 0;
}
Geremy Condra
Interesting! Thanks for the lesson ;-).
Cheers, Roald
One that does matter, sometimes drastically, is sizeof(array) vs.
sizeof(pointer).
One interesting other effect of the compiler (nearly always) treating
pointers and arrays the same is expressions like:
int hexdigit = ...something...
char ch = "0123456789abcdef"[hexdigit];
char ch2 = hexdigit["0123456789abcdef"];
both are valid assignments, and equivalent to
char ch3 = *(hexdigit + "0123456789abcdef");
In retrospect, C's "pointer=array" concept was a terrible mistake.
It's a historical artifact; early C (pre K&R, as shipped with
"research UNIX" V6 and PWB) had no notion of typing; all "struct"
pointers were interchangeable and elements of a "struct" were just
offsets of a pointer. That was fixed, but arrays weren't.
The fundamental problem with "array=pointer" is that it results
in lying to the language. Consider the declaration of "read":
int read(int fd, char* buf, size_t n);
This is a lie. "buf" is not a pointer to a character. It's an
array. And, the bad part, the compiler doesn't know how big it is.
The syntax should have been something like
int read(int fd, &char[n] buf, size_t n);
This says the type of the argument is an array of char of
length n, and it's being passed by reference. "read"
then knows how big "buf" is.
This design error in C is the cause of most buffer overflow
crashes and security holes.
John Nagle
>> real sample[-500:750];
> Ugh, no. The ability to change the minimum index is evil.
Not always; it can have its uses, particularly when you're
using the array as a mapping rather than a collection.
Pascal had a nice feature where you could use any ordinal
type as an array index, and sometimes it was handy to
have things like an array indexed by the characters
'A' to 'Z', or the values (Red, Green, Blue). Typically
you didn't need to do arithmetic on the indices in those
cases, though.
Python addresses this by having separate types for
sequences and mappings.
--
Greg
One way to see that they're not *exactly* the same is
the fact that
sizeof("python rocks")
is 13, not sizeof(char *). Arrays exist as a distinct
concept in the type system.
What is true is that when you use the name of an array in
an expression, it evaluates to a pointer to its first
element. And array indexing is defined in terms of what
happens to the resulting pointer, rather than being done
directly on the array itself.
--
Greg
Not always -- mathematicians use whatever starting index is
most convenient for the problem at hand. For example, the
constant term of a polynomial is usually called term 0,
not term 1. So to a mathematician, an array being used to
hold polynomial coefficients would most naturally be indexed
from 0.
I suspect that part of the reason Fortran uses 1-based
indexing is that people hadn't had enough experience with
high-level languages back then to realise the awkwardness
it often leads to, and they were relying on "common sense".
--
Greg
> Not always -- mathematicians use whatever starting index is
> most convenient for the problem at hand.
Which may be 0, 1, or something else. There are plenty of situations,
for example, where you might want to use both positive and negative
indices. Which work with Python lists, but not the way a mathematician
would expect :-)
In general, I've found, "Because that's the way they do it in math", to
be an unreliable guide to, "How should we do it in a programming
language?" It's often a reasonable place to start, but there are other
considerations.
> For example, the constant term of a polynomial is usually called term 0,
> not term 1.
That is not some kind of ordinal numbering of the terms, that is the power
of the variable involved.
And polynomials can have negative powers, too.
Why do you refer to Pascal in the past tense? I use it most days (Delphi & Free Pascal).
Not so. Polynomials, by definition, are limited to non-negative integer
powers. You're thinking of a "polynomial quotient", otherwise known as a
"rational function".
http://mathworld.wolfram.com/Polynomial.html
--
Steven
> Ian Kelly wrote:
>> On Fri, Aug 13, 2010 at 11:53 AM, Martin Gregorie
>> <mar...@address-in-sig.invalid> wrote:
>
>>> real sample[-500:750];
>
>> Ugh, no. The ability to change the minimum index is evil.
>
> Not always; it can have its uses, particularly when you're using the
> array as a mapping rather than a collection.
>
Say you have intensity data captured from an X-ray goniometer from 160
degrees to 30 degrees at 0.01 degree resolution. Which is most evil of
the following?
1) real intensity[16000:3000]
   for i from lwb intensity to upb intensity
      plot(i/100, intensity[i]);
2) double angle[13000];
   double intensity[13000];
   for (int i = 0; i < 13000; i++)
      plot(angle[i], intensity[i]);
3) typedef struct
   {
      double angle;
      double intensity;
   } measurement;
   measurement m[13000];
   for (int i = 0; i < 13000; i++)
      plot(m[i].angle, m[i].intensity);
4) double intensity[13000];
   for (int i = 0; i < 13000; i++)
      plot((16000 - i)/100.0, intensity[i]);
To my mind (1) is much clearer to read and far less error-prone to write,
while zero-based indexing forces you to use code like (2), (3) or (4),
all of which obscure rather than clarify what the program is doing.
Yes, there are many engineering fields where index starts at 0. Partly
for the reason you have stated concerning polynomials, especially
since this extend to series, which are pervasive in numerical
computing. In linear algebra, though, I remember to have always noted
matrices indexes in the [1,n] range, not [0,n-1]. In general, I
suspect this is much more a tradition than any kind of very reasoned
thinking (and conversely, I find most arguments for 0-indexing and
against one-indexing rather ... unconvincing. What's awkward is when
you have to constantly change from one to the other).
cheers,
David
C arrays are not pointers.
--
Neil Cerutti
> Say you have intensity data captured from an X-ray goniometer from 160
> degrees to 30 degrees at 0.01 degree resolution. Which is most evil of
> the following?
>
> 1) real intensity[16000:3000]
> for i from lwb intensity to upb intensity
> plot(i/100, intensity[i]);
How about (totally inventing syntax as I go along):
5) real intensity[160.0 : 30.0 : 0.01]
for index, value in intensity:
plot(index, value)
> 5) real intensity[160.0 : 30.0 : 0.01]
How many elements in that array?
a) 2999
b) 3000
c) neither of the above
c) neither of the above. More specifically, 13,001 (if I counted
correctly).
13000, actually. Floating point is a bitch.
[~/Movies]
|1> import numpy
[~/Movies]
|2> len(numpy.r_[160.0:30.0:-0.01])
13000
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
> On 8/16/10 9:29 PM, Roy Smith wrote:
>> In article<i4cqg0$olf$3...@lust.ihug.co.nz>,
>> Lawrence D'Oliveiro<l...@geek-central.gen.new_zealand> wrote:
>>
>>> In message<roy-EE1B7F.2...@news.panix.com>, Roy Smith wrote:
>>>
>>>> 5) real intensity[160.0 : 30.0 : 0.01]
>>>
>>> How many elements in that array?
>>>
>>> a) 2999
>>> b) 3000
>>> c) neither of the above
>>
>> c) neither of the above. More specifically, 13,001 (if I counted
>> correctly).
>
> 13000, actually. Floating point is a bitch.
>
> [~/Movies]
> |1> import numpy
>
> [~/Movies]
> |2> len(numpy.r_[160.0:30.0:-0.01])
> 13000
Actually, the answer is 0, not 13000, because the step size is given as
0.01, not -0.01.
>>> import numpy
>>> len(numpy.r_[160.0:30.0:-0.01])
13000
>>> len(numpy.r_[160.0:30.0:0.01])
0
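The element counts argued over above can be reproduced without floating point by working in integer hundredths of a degree. This is only a sketch; the scaling by 100 is an assumption made here to avoid rounding issues.

```python
# Angles in integer hundredths of a degree: 160.00 -> 16000, 30.00 -> 3000.
start, stop = 16000, 3000

half_open = range(start, stop, -1)        # 160.00 down to 30.01
inclusive = range(start, stop - 1, -1)    # 160.00 down to 30.00
wrong_sign = range(start, stop, 1)        # positive step, start > stop

print(len(half_open), len(inclusive), len(wrong_sign))
# prints: 13000 13001 0
```

So 13000 is the half-open count, 13001 is the count with both end points included, and 0 is what a positive step gives when the start is above the stop.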
--
Steven
I'm sure some would prefer to denote it as [0, n)
Roy wasn't using numpy/Python semantics but made-up semantics (following Martin
Gregorie's made-up semantics to which he was replying) which treat the step size
as a true size, not a size and direction. The direction is determined from the
start and stop parameters. It's an almost-reasonable design.
That wasn't a made-up example: AFAICR and ignoring a missing semi-colon
it was an Algol 68 snippet. The semantics of the for statement and the
use of lwb and upb operators to extract the bounds from a 1-dimensional
array are correct A68, but OTOH it's a very long time since I last
programmed in that language. I used that rather than Python because Algol
68 supports the non-zero lower bound and treats the array limits as
attributes of the array.
> > Roy wasn't using numpy/Python semantics but made-up semantics (following
> > Martin Gregorie's made-up semantics to which he was replying) which
> > treat the step size as a true size, not a size and direction. The
> > direction is determined from the start and stop parameters. It's an
> > almost-reasonable design.
>
> That wasn't a made-up example: AFAICR and ignoring a missing semi-colon
> it was an Algol 68 snippet.
It was a made up example. Any similarity to a real programming
language, living or dead, was purely a coincidence.
I suspect I've probably also written a viable code snippet in Whitespace
as well (http://compsoc.dur.ac.uk/whitespace/). That, too, is a
coincidence.
Only if there's an emacs mode which can do the parenthesis matching
correctly ;-)
Count me in on that, that'd be great.
Geremy Condra
> Would said beginner also be surprised that a newborn baby is zero years
> old or would it be more natural to call them a one year old? Zero
> based counting is perfectly natural.
You're confusing continuous and discrete variables. Time is a
continuous variable, but a list index is discrete.
Take a look at any numbered list, such as the top ten football teams
or the top ten software companies. Have you ever seen such a list
start with zero? If so, where? I sure haven't.
When I studied linear algebra way back, vector and matrix indices also
always started with one, and I assume they still do.
The convention of starting with zero may have had some slight
performance advantage in the early days of computing, but the huge
potential for error that it introduced made it a poor choice in the
long run, at least for high-level languages.
I have to agree: there are innumerable examples where the sequential
number of an item in a series is counted starting with one. Second loaf
of bread; third day of vacation, first cup of tea today, first gray
hair, 50th anniversary, 2nd century AD, and approximately a gazillion
other examples.
Contrast this with _one_ example that was repeated in this thread of
there being ground floor, 1st floor, 2nd, and so on. However! Consider
that ground floor is kind of different from the other floors. It's the
floor that's not built up over ground, but is already there -- in case
of the most primitive dwelling, you can put some sawdust over the
ground, put a few boards overhead and it's a "home", although probably
not a "house". But does it really have what can be officially called a
"floor"?
On a more practical angle, ground floors usually have the lobby,
receptionists, storefronts and stores, etc; while 1st floor and up are
business/residential.
I think different numbering from pretty much all other things out there
gives you a hint that the ground floor is a different animal.
-andrei
Besides that, the way things are now, it's almost an Abbott & Costello
routine:
- How many folders are there?
- 5
- Ok, give me the fourth one.
- Here.
- No, that's the last one!
- That's what you said!
- No, I said, fourth one!
- That's what I did!
- How many are there in all?
- I already said, five!
- You gave me the last one!!
- Just like you said - fourth!!!!
Yes, it's confusing. Which element of a list is the "first" element?
Wait, "first" is sometimes abbreviated as "1st". So is the 1st element
the 0 element or the 1 element? I honestly don't know.
Is the top team in the league the number 1 team -- or the number 0
team? I have yet to hear anyone call the best team the number 0 team!
Unfortunately, we're stuck with this goofy numbering system in many
languages. Fortunately, the trend is away from explicit indexing and
toward "for" loops when possible.
Bring back Coral 66, all is forgiven.
http://www.xgc.com/manuals/xgc-c66-rm/x357.html
Cheers.
Mark Lawrence.
> Contrast this with _one_ example that was repeated in this thread of
> there being ground floor, 1st floor, 2nd, and so on. However! Consider
> that ground floor is kind of different from the other floors. It's the
> floor that's not built up over ground, but is already there -- in case
> of the most primitive dwelling, you can put some sawdust over the
> ground, put a few boards overhead and it's a "home", although probably
> not a "house". But does it really have what can be officially called a
> "floor"?
That's the perfect example, although perhaps for an [apparently]
unintended reason <g>: I think that the notion of a qualitatively
different "ground floor" is European, or at least that's the way I
remember it from my high school French class way back in the late 1970s.
In the U.S., when you walk into a building (even a very tall commercial
building), that's the first floor, and when you go up a level, that's the
second floor, and all the room/suite/office numbers are two hundred and
something. I also seem to recall that some European buildings have a
mezzanine floor between the ground floor and the floor whose reference
number is 1, but again, high school was a long time ago.
Dan
> Is the top team in the league the number 1 team -- or the number 0 team?
> I have yet to hear anyone call the best team the number 0 team!
Why is the top team the one with the lowest number?
> Unfortunately, we're stuck with this goofy numbering system in many
> languages. Fortunately, the trend is away from explicit indexing and
> toward "for" loops when possible.
Agreed on the second sentence there, but not on the first. There's
nothing "goofy" about indexing items from 0. Yes, it does lead to slightly
more difficulty when discussing which item you want in *human* languages,
but not in *programming* languages. The nth item is always the nth item.
The only difference is whether n starts at 0 or 1, and frankly, if you
(generic you, not you personally) can't learn which to use, you have no
business pretending to be a programmer.
--
Steven
How could it be otherwise? What is the highest number?
Here's a couple of things I'd like to see just once before I die:
1. The winner of the championship game chanting, "We're number zero!
We're number zero!"
2. The loser of the championship game chanting, "We're number one!
We're number one!"
>
> > Unfortunately, we're stuck with this goofy numbering system in many
> > languages. Fortunately, the trend is away from explicit indexing and
> > toward "for" loops when possible.
>
> Agreed on the second sentence there, but not on the first. There's
> nothing "goofy" about indexing items from 0. Yes, it does lead to slightly
> more difficulty when discussing which item you want in *human* languages,
> but not in *programming* languages. The nth item is always the nth item.
> The only difference is whether n starts at 0 or 1, and frankly, if you
> (generic you, not you personally) can't learn which to use, you have no
> business pretending to be a programmer.
Maybe "goofy" was too derogatory, but I think you are rationalizing a
bad decision, at least for high-level languages. I don't think
programming languages should always mimic human languages, but this is
one case where there is no advantage to doing otherwise.
Why do you think "off by one" errors are so common? Because the darn
indexing convention is off by one!
And I'd still like to know if the "1st" element of aList is aList[0]
or aList[1].
> On Aug 18, 7:58 pm, Steven D'Aprano <steve-REMOVE-
> T...@cybersource.com.au> wrote:
>> On Wed, 18 Aug 2010 14:47:08 -0700, Russ P. wrote:
>> > Is the top team in the league the number 1 team -- or the number 0
>> > team? I have yet to hear anyone call the best team the number 0 team!
>>
>> Why is the top team the one with the lowest number?
>
> How could it be otherwise? What is the highest number?
If there are N teams, then the highest number is obviously N (if counting
from 1) or N-1 (if from 0).
In other words... why do we rank sporting teams Best to Worst rather than
the other way around?
[...]
> Maybe "goofy" was too derogatory, but I think you are rationalizing a
> bad decision, at least for high-level languages. I don't think
> programming languages should always mimic human languages, but this is
> one case where there is no advantage to doing otherwise.
>
> Why do you think "off by one" errors are so common? Because the darn
> indexing convention is off by one!
But you have that exactly backwards. Counting from 0 leads to fewer off
by one errors for many tasks.
(Of course, avoiding indexing in favour of iteration leads to even fewer
off by one errors.)
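A short sketch of that last point, with made-up team names. The index-based loop has a bound and a "+ 1" to get right; the iteration version has neither, and enumerate() can start the human-facing rank at whatever number is wanted.

```python
teams = ["Lions", "Tigers", "Bears"]  # made-up data

# Index-based loop: the bounds and the "+ 1" are all places to slip.
by_index = []
for i in range(len(teams)):
    by_index.append((i + 1, teams[i]))

# Iteration: no bounds to get wrong; enumerate() supplies the rank.
by_iteration = list(enumerate(teams, start=1))
```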
Anyway, in a feeble attempt to move this discussion somewhere --
anywhere! -- else:
http://c2.com/cgi/wiki?FencePostError
http://c2.com/cgi/wiki?WhyNumberingShouldStartAtZero
http://c2.com/cgi/wiki?WhyNumberingShouldStartAtOne
http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/zero
and of course:
--
Steven
> And I'd still like to know if the "1st" element of aList is aList[0]
> or aList[1].
aList[0]
--
Grant Edwards grant.b.edwards Yow! I'm definitely not
at in Omaha!
gmail.com
When I wrote my own C++ 2-D matrix class, I wrote a member function
which did exactly this - allow you to specify the initial index value.
Then users of my class (mainly my research lab coworkers) could
specify whichever behavior they wanted.
In terms of providing readable code and removing beginning programmer
confusion, why not extend python lists by using something similar to
(but not necessarily identical to) the C++ STL containers:
C++
int myX[ ] = { 1,2,3,4,5 };
std::vector<int> vectorX( myX, myX + sizeof( myX ) / sizeof( myX[0] ) );
std::cout << vectorX.front() << std::endl;
std::cout << vectorX.back() << std::endl;
Python
x = [ 1 , 2 , 3 , 4 , 5 ]
print x.first()
print x.last() ,
where the first and last behavior of python is to return a deep copy
of the object, and not a pointer.
It seems that this would avoid complaints about the 0/1 issue.
Of course, the problem is the behavior of:
myList = [ myObject1, myObject2, myObject3, ... , myObjectLast ]
print myList.first() + 5 ,
in which one might conceptually want to get the 6th item in a
list, though if first() is defined to return the object, then we get
the returned object plus 5, if such behavior is defined to exist.
I completely acknowledge that the behavior is not well defined, and
that is why I am not proposing this as a final implementation, but
rather as a concept and motivation.
For those who don't like Python's 0-based indexing, why not just build
a wrapper type which features an item() method that handles the
internal conversion from 1 to 0 as the starting index?
Better yet, include a method which sets/specifies the 0/1 behavior,
and have item() reference the 0/1 setting to obtain the proper offset.
Just a thought.
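A minimal sketch of what such a wrapper might look like. The class name OffsetList and the base parameter are hypothetical, chosen here for illustration; only the item(), first() and last() names come from the description above.

```python
class OffsetList:
    """Hypothetical wrapper: a list whose first item has index `base`."""

    def __init__(self, items, base=1):
        self._items = list(items)
        self.base = base          # the 0/1 (or any other) starting index

    def item(self, index):
        # Convert the caller's index to Python's 0-based index.
        offset = index - self.base
        # Reject out-of-range values explicitly; otherwise a negative
        # offset would silently wrap around to the end of the list.
        if not 0 <= offset < len(self._items):
            raise IndexError("index %r out of range" % (index,))
        return self._items[offset]

    def first(self):
        return self._items[0]

    def last(self):
        return self._items[-1]


x = OffsetList([10, 20, 30, 40, 50])           # 1-based by default
y = OffsetList([10, 20, 30, 40, 50], base=0)   # same data, 0-based
```

With this sketch, x.item(1) and y.item(0) both return the first element, while x.item(0) raises IndexError instead of returning the last one.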
J.B. Brown
Kyoto University
I did something similar in a library that needs to read
"positions" from the specification for a fixed-length fields
plain text database.
The printed specs for these document types often start counting
character positions at 1, but not universally.
> For those who don't like Python's 0-based indexing, why not
> just build a wrapper type which features an item() method that
> handles the internal conversion from 1 to 0 as the starting
> index? Better yet, include a method which sets/specifies the
> 0/1 behavior, and have item() reference the 0/1 setting to
> obtain the proper offset.
Because they know deep down they wouldn't win anything.
--
Neil Cerutti
Many years ago I wrote a fairly comprehensive vector/matrix class in
C++. (It was an exercise to learn the intricacies of C++, and I also
really needed it.) I used a system of offset pointers to realize
indexing that starts with 1.
Years later, after I had quit using it, I decided that I should have
just bit the bullet and let the indexing start with zero for
simplicity. I believe that zero-based indexing is a mistake, but
attempts to fix it require additional boilerplate that is prone to
error and inefficiency. Short of a major language overhaul, we are
essentially stuck with zero-based indexing.
But that doesn't mean we should pretend that it was the right choice.
For those who insist that zero-based indexing is a good idea, why do you
suppose mathematical vector/matrix notation has never used that
convention? I have studied and used linear algebra extensively, and I
have yet to see a single case of vector or matrix notation with zero-
based indexing in a textbook or a technical paper. Also, mathematical
summation is traditionally shown as "1 to N", not "0 to N-1". Are
mathematicians just too simple-minded and unsophisticated to
understand the value of zero-based indexing? Or have you perhaps been
misled?
> The convention of starting with zero may have had some slight
> performance advantage in the early days of computing, but the huge
> potential for error that it introduced made it a poor choice in the long
> run, at least for high-level languages.
People keep saying this, but it's actually the opposite. Fencepost errors
and off-by-one errors are more common in languages that count from one.
A simple example: Using zero-based indexing, suppose you want to indent
the string "spam" so it starts at column 4. How many spaces do you
prepend?
0123456789
    spam
Answer: 4. Nice and easy and almost impossible to get wrong. To indent to
position n, prepend n spaces.
Now consider one-based indexing, where the string starts at column 5:
1234567890
    spam
Answer: 5-1 = 4. People are remarkably bad at remembering to subtract the
1, hence the off-by-one errors.
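The two cases can be put side by side in a few lines of Python. The helper name indent_to_column and its base parameter are invented for this sketch.

```python
def indent_to_column(text, column, base=0):
    """Prepend spaces so `text` starts at `column`, counting from `base`."""
    return " " * (column - base) + text


zero_based = indent_to_column("spam", 4)           # columns 0, 1, 2, ...
one_based = indent_to_column("spam", 5, base=1)    # columns 1, 2, 3, ...
```

Both calls produce the same four-space indent; with base=0 the subtraction disappears, which is the step people tend to forget.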
Zero-based counting doesn't entirely eliminate off-by-one errors, but the
combination of that plus half-open on the right intervals reduces them as
much as possible.
The intuitive one-based closed interval notation used in many natural
languages is terrible for encouraging off-by-one errors. Quick: how many
days are there between Friday 20th September and Friday 27th September
inclusive? If you said seven, you fail.
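The date question can be checked with the standard datetime module. The year 2013 is an assumption made for this sketch: it is one year in which 20 September falls on a Friday.

```python
from datetime import date

# 2013 is an assumption: a year in which 20 September is a Friday.
start = date(2013, 9, 20)
end = date(2013, 9, 27)

difference = (end - start).days     # 7: the number of steps between them
inclusive_count = difference + 1    # 8: the number of days, both ends counted
```

Subtracting gives the half-open answer (seven); the closed, inclusive interval needs the extra "+ 1" to reach eight.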
One-based counting is the product of human intuition. Zero-based counting
is the product of human reason.
--
Steven
The error mode you refer to is much less common than the typical off-
by-one error mode. In the far more common error mode, zero-based
indexing is far more error prone.
> One-based counting is the product of human intuition. Zero-based counting
> is the product of human reason.
I suggest you take that up with mathematicians, who have used one-
based indexing all along. That's why it was used in Fortran and
Matlab, among other more mathematically and numerically oriented
languages.
> For those who insist that zero-based indexing is a good idea, why you
> suppose mathematical vector/matrix notation has never used that
> convention? I have studied and used linear algebra extensively, and I
> have yet to see a single case of vector or matrix notation with zero-
> based indexing in a textbook or a technical paper. Also, mathematical
> summation is traditionally shown as "1 to N", not "0 to N-1".
In my experience, it's more likely to be "0 to N" than either of the
above, thus combining the worst of both notations.
> Are
> mathematicians just too simple-minded and unsophisticated to understand
> the value of zero-based indexing?
No, mathematicians are addicted to tradition.
Unlike computer scientists, who create new languages with radically
different notation and syntax at the drop of a hat, mathematicians almost
never change existing notation. Sometimes they *add* new notation, but
more often they just re-use old notation in a new context. E.g. if you
see (5, 8), does that mean a coordinate pair, a two-tuple, an open
interval, or something else?
Additionally, mathematical notation isn't chosen for its ability to
encourage or discourage errors. It seems often to be chosen arbitrarily,
or for convenience, but mostly out of tradition and convention. Why do we
use "x" for "unknown"? Why do we use i for an integer value, but not r
for a real or c for a complex value?
Mathematicians are awfully lazy -- laziness is one of the cardinal
virtues of the mathematician, as it is of programmers -- but they value
brevity and conciseness over notation that improves readability and
robustness. That's why they (e.g.) use implicit multiplication, a
plethora of "line noise" symbols that would boggle even Perl programmers,
and two-dimensional syntax where the meaning of tokens depends on where
they are written relative to some other token. (E.g. subscript,
superscript, and related forms.)
There is one slightly mainstream language that uses mathematical
notation: APL. The result isn't pretty.
--
Steven
> I just checked, and Mathematica uses one-based indexing. Apparently they
> want their notation to look mathematical.
Well duh. It's called MATHematica, not PROGematica.
--
Steven
That is probably true. But computer languages are addicted to more
than tradition. They're addicted to compatibility and familiarity. I
don't know where zero-based indexing started, but I know that C used
it very early, probably for some minuscule performance advantage. When
C++ came along, it tried to be somewhat compatible with C, so it
continued using zero-based indexing. Then Java was loosely modeled
after C++, so the convention continued. Python was written in C, so
zero-based indexing was "natural." So the whole thing is based on a
decision by some guy who was writing a language for operating systems,
not mathematics or application programming.
> Unlike computer scientists, who create new languages with radically
> different notation and syntax at the drop of a hat, mathematicians almost
> never change existing notation. Sometimes they *add* new notation, but
> more often they just re-use old notation in a new context. E.g. if you
> see (5, 8), does that mean a coordinate pair, a two-tuple, an open
> interval, or something else?
>
> Additionally, mathematical notation isn't chosen for its ability to
> encourage or discourage errors. It seems often to be chosen arbitrarily,
> or for convenience, but mostly out of tradition and convention. Why do we
> use "x" for "unknown"? Why do we use i for an integer value, but not r
> for a real or c for a complex value?
>
> Mathematicians are awfully lazy -- laziness is one of the cardinal
> virtues of the mathematician, as it is of programmers -- but they value
> brevity and conciseness over notation that improves readability and
> robustness. That's why they (e.g.) they use implicit multiplication, a
> plethora of "line noise" symbols that would boggle even Perl programmers,
> and two-dimensional syntax where the meaning of tokens depends on where
> they are written relative to some other token. (E.g. subscript,
> superscript, and related forms.)
>
> There is one slightly mainstream language that uses mathematical
> notation: APL. The result isn't pretty.
As I wrote above, the use of zero-based indexing in C++, Java, and
Python is simply based on the fact that C used it.
I wouldn't have guessed that APL is a "mainstream" language. As I
wrote in recent posts, Fortran, Matlab, and Mathematica all used one-
based indexing. Maybe Basic too, but I haven't checked.