"Why do arrays start at 0?"

Lynn McGuire

Aug 26, 2022, 4:50:10 PM
"Why do arrays start at 0?"
https://buttondown.email/hillelwayne/archive/why-do-arrays-start-at-0/

"It's not the reason you think. No, it's not that reason either."

My Fortran starts at one. My C++ starts at zero. This has made my life
hell.

Lynn

Gary Scott

Aug 26, 2022, 5:57:05 PM
Fortran was there first...and got it right...you can change Fortran to
start at other index values for many cases.

Louis Krupp

Aug 26, 2022, 6:08:37 PM
Here's a crude outline of a solution:

If you want a C++ array index to start at 1, make the array one element
bigger than necessary, and start indexing at 1, just like in Fortran.
Element 0 will be unused.

If you want a C++ array to map exactly to a Fortran array, but with an
index starting at 1, wrap the C++ array in a class and overload the
subscripting operator(s) to subtract 1 from the index.

Neither of these approaches seems elegant, but one of them should be
less hellish than the other.
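
A minimal sketch of the second approach, assuming a std::vector as the underlying storage. The class name OneBasedArray and its members are hypothetical, chosen only for illustration; a real wrapper would probably also need iterators and multi-dimensional support.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: elements are stored contiguously from offset 0,
// but callers use Fortran-style indices 1..size().
template <typename T>
class OneBasedArray {
public:
    explicit OneBasedArray(std::size_t n) : data_(n) {}

    // Subtract 1 so that index 1 maps to the first stored element.
    T& operator[](std::size_t i) {
        assert(i >= 1 && i <= data_.size());
        return data_[i - 1];
    }
    const T& operator[](std::size_t i) const {
        assert(i >= 1 && i <= data_.size());
        return data_[i - 1];
    }

    std::size_t size() const { return data_.size(); }

    // Pointer to the zero-based storage, e.g. for passing to a Fortran routine.
    T* data() { return data_.data(); }

private:
    std::vector<T> data_;
};
```

With this, `OneBasedArray<double> a(3); a[1] = 1.5;` writes the first stored element, and no unused element 0 is allocated.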

Louis

Gawr Gura

Aug 26, 2022, 6:18:58 PM
I've never personally had trouble with array indices but row-major vs
column-major order on matrices always gets me (especially going back and
forth between C++ and GLSL). Luckily, C++ gives you plenty of tools to
make your own arrays and matrices with the interface you prefer.
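
As a rough illustration of the difference, the two layouts only disagree on how a (row, column) pair is turned into a linear offset. The function names below are hypothetical, and the indices are zero-based.

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (row, col) when each row is stored contiguously,
// as in built-in C and C++ two-dimensional arrays.
constexpr std::size_t row_major_offset(std::size_t row, std::size_t col,
                                       std::size_t /*rows*/, std::size_t cols) {
    return row * cols + col;
}

// Offset of element (row, col) when each column is stored contiguously,
// as in Fortran arrays and GLSL matrices.
constexpr std::size_t column_major_offset(std::size_t row, std::size_t col,
                                          std::size_t rows, std::size_t /*cols*/) {
    return col * rows + row;
}
```

Hiding one of these behind an `operator()(row, col)` is one way to keep the layout decision in a single place.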

Mr Flibble

Aug 26, 2022, 7:45:03 PM
Fortran is retarded.

/Flibble

Mr Flibble

Aug 26, 2022, 7:46:18 PM
Being first doesn't make you correct. Fortran is retarded.

/Flibble

d thiebaud

Aug 26, 2022, 10:30:19 PM
A reasonable language would let you specify the lower index. Fortran
and C are both retarded.

Keith Thompson

Aug 27, 2022, 1:05:36 AM
"Mr Flibble" is a troll, and I'm sure he's aware that "retarded" is an
offensive word. Are you?

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Thomas Koenig

Aug 27, 2022, 1:48:00 AM
Lynn McGuire <lynnmc...@gmail.com> wrote:
If you want to declare your Fortran arrays to start at zero, just
declare them with a lower bound of zero, like

real, dimension(0:n-1) :: a

Mr Flibble

Aug 27, 2022, 7:06:29 AM
On Fri, 26 Aug 2022 22:05:17 -0700
Keith Thompson <Keith.S.T...@gmail.com> wrote:

> d thiebaud <thieba...@aol.com> writes:
> > On 8/26/22 19:46, Mr Flibble wrote:
> >> On Fri, 26 Aug 2022 16:56:47 -0500
> >> Gary Scott <garyl...@sbcglobal.net> wrote:
> >>
> >>> Fortran was there first...and got it right...you can change
> >>> Fortran to start at other index values for many cases.
> >> Being first doesn't make you correct. Fortran is retarded.
> >> /Flibble
> >>
> > A reasonable language would let you specify the lower index. Fortran
> > and C are both retarded.
>
> "Mr Flibble" is a troll, and I'm sure he's aware that "retarded" is an
> offensive word. Are you?

Retarded isn't an offensive word but I could replace it with fucktarded
if that helps?

/Flibble

Scott Lurndal

Aug 27, 2022, 10:29:09 AM
There are around twenty languages that start array indices at one. None of
them are low-level, hardware-oriented languages like C (Burroughs ALGOL excepted). In
hardware, indices start at zero by definition.

d thiebaud

Aug 27, 2022, 3:19:38 PM
Can you declare other lower bounds the same way?

Fred. Zwarts

Aug 27, 2022, 3:26:39 PM
On 26 Aug 2022 at 22:49, Lynn McGuire wrote:
I assumed that it was done because in C x[i] is equivalent to *(x+i).
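
A small sketch of that equivalence for a built-in array; the helper name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when subscripting and explicit pointer arithmetic
// name the same object (and therefore the same value) for index i.
inline bool subscript_matches_pointer_arithmetic(const int* x, std::size_t i) {
    return &x[i] == x + i && x[i] == *(x + i);
}
```

Because x[0] is *(x+0), the first element sits at the base address itself, which is why index 0 falls out of the definition.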

Lynn McGuire

Aug 27, 2022, 4:01:47 PM
Yup. So Fortran x(i) is equivalent to *(x+i-1). Or 1 is
subtracted from x first: x--; *(x+i);.

Lynn

Thomas Koenig

Aug 27, 2022, 6:22:43 PM
d thiebaud <thieba...@aol.com> wrote:
Yes.

You can, since 1991, when Fortran 90 was released, do

real, dimension(from:to) :: a

where from and to are arbitrary integer expressions. (If to <
from, you get a zero-sized array, which is perfectly valid; it
simply has no elements.)

In Fortran, you cannot declare variables in the middle of
executable code. If you want to do that, Fortran 2008 introduced
the BLOCK construct for declaring variables, much like
C's or C++'s { and }, so you can do

read (*,*) from, to
block
  real, dimension(from:to) :: a
  ! Use a here
end block

Thomas Koenig

Aug 27, 2022, 6:24:19 PM
Lynn McGuire <lynnmc...@gmail.com> wrote:
> On 8/27/2022 2:26 PM, Fred. Zwarts wrote:
>> I assumed that it was done because in C x[i] is equivalent to *(x+i).
>
> Yup. So Fortran x(i) is equivalent to *(x+i-1).

To be more precise, x(i) is equivalent to *(x+i-lbound(x,1))
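
For C++ readers, here is a minimal sketch of what that lbound-relative addressing could look like. The class BoundedArray and its members are hypothetical, and float merely stands in for Fortran's default real; none of this is how a Fortran compiler actually lays things out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical analogue of Fortran's "real, dimension(from:to) :: a".
class BoundedArray {
public:
    // An upper bound below the lower bound yields a zero-sized array.
    BoundedArray(long from, long to)
        : lbound_(from),
          data_(to < from ? 0 : static_cast<std::size_t>(to - from + 1)) {}

    // a(i) maps to storage offset i - lbound, mirroring *(x + i - lbound(x,1)).
    float& operator()(long i) {
        assert(i >= lbound_ &&
               static_cast<std::size_t>(i - lbound_) < data_.size());
        return data_[static_cast<std::size_t>(i - lbound_)];
    }

    long lbound() const { return lbound_; }
    std::size_t size() const { return data_.size(); }

private:
    long lbound_;
    std::vector<float> data_;
};
```

The pointer into the storage is never moved; only the index is adjusted, so no pointer before the start of the array is ever formed.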

> Or 1 is subtracted from x first: x--; *(x+i);.

That's an f2c idiom, which is not valid C, AFAIK, because
the pointer would point before the actual array.

Lynn McGuire

Aug 27, 2022, 11:40:30 PM
While the second pointer is a bad pointer, it is not an illegal pointer.
Variables can point to anywhere, they are not illegal until referenced.
Otherwise, code would be crashing all over the place. If I malloc'd
some space, then free'd the space, and did not NULL the pointer, that
dangling pointer would crash if ever referenced (hopefully) but not if
it just hangs around.

I wish that C/C++ would provide some sort of pointer validation but it
does not. I keep track of my pointers for that reason and validate them
before use when I have some error-prone code. In 850,000 lines of F77
and 30,000 lines of C/C++ code, I have suspicious pointers in several
areas due to programmers suballocating malloc'd space and forgetting to
nullify those suballocated pointers after freeing the allocated space.
Old code, you gotta love it.
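
Raw pointers cannot be validated in standard C++, but in new C++ code one possible approach is to let std::shared_ptr own the storage and hand out std::weak_ptr observers, which can report that the storage is gone. The sketch below assumes that design; Buffer and try_read are hypothetical names, and this does not help with existing malloc'd suballocations.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Buffer = std::vector<double>;

// Hypothetical helper: copies element i into `out` and returns true only
// when the buffer still exists and i is in range. It never dereferences
// a dangling pointer.
inline bool try_read(const std::weak_ptr<Buffer>& observer, std::size_t i,
                     double& out) {
    if (auto buffer = observer.lock()) {  // empty once the owner released it
        if (i < buffer->size()) {
            out = (*buffer)[i];
            return true;
        }
    }
    return false;
}
```

After the owning std::shared_ptr is reset, every observer's lock() returns an empty pointer, so nothing has to remember to nullify anything by hand.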

Lynn

Keith Thompson

Aug 28, 2022, 1:07:48 AM
Lynn McGuire <lynnmc...@gmail.com> writes:
> On 8/27/2022 5:24 PM, Thomas Koenig wrote:
>> Lynn McGuire <lynnmc...@gmail.com> wrote:
>>> On 8/27/2022 2:26 PM, Fred. Zwarts wrote:
>>>> I assumed that it was done because in C x[i] is equivalent to *(x+i).
>>>
>>> Yup. So Fortran x(i) is equivalent to *(x+i-1).
>> To be more precise, x(i) is equivalent to *(x+i-lbound(x,1))
>>
>>> Or 1 is subtracted from x first: x--; *(x+i);.
>> That's an f2c idiom, which is not valid C, AFAIK, because
>> the pointer would point before the actual array.
>
> While the second pointer is a bad pointer, it is not an illegal
> pointer. Variables can point to anywhere, they are not illegal until
> referenced. Otherwise, code would be crashing all over the place. If
> I malloc'd some space, then free'd the space, and did not NULL the
> pointer, that dangling pointer would crash if ever referenced
> (hopefully) but not if it just hangs around.

Computing a pointer before the beginning of an array causes undefined
behavior in C and C++, even if you never dereference it.

[...]

Thomas Koenig

Aug 28, 2022, 3:50:29 AM
Lynn McGuire <lynnmc...@gmail.com> wrote:
> On 8/27/2022 5:24 PM, Thomas Koenig wrote:
>> Lynn McGuire <lynnmc...@gmail.com> wrote:
>>> On 8/27/2022 2:26 PM, Fred. Zwarts wrote:
>>>> I assumed that it was done because in C x[i] is equivalent to *(x+i).
>>>
>>> Yup. So Fortran x(i) is equivalent to *(x+i-1).
>>
>> To be more precise, x(i) is equivalent to *(x+i-lbound(x,1))
>>
>>> Or 1 is subtracted from x first: x--; *(x+i);.
>>
>> That's an f2c idiom, which is not valid C, AFAIK, because
>> the pointer would point before the actual array.
>
> While the second pointer is a bad pointer, it is not an illegal pointer.
> Variables can point to anywhere, they are not illegal until referenced.

Unfortunately not.

Looking at n2596.pdf, one finds under J.2, "Undefined behavior",

— Addition or subtraction of a pointer into, or just beyond,
an array object and an integer type produces a result that does
not point into, or just beyond, the same array object (6.5.6).

> Otherwise, code would be crashing all over the place.

Undefined behavior does not mean that the code will reliably crash.
It just says that the C standard gives no guarantee about what will
happen, and, even if it works right now, such code is at the mercy
of future compiler revisions, cosmic rays, and other foreseen and
unforeseen circumstances.

Fred. Zwarts

Aug 28, 2022, 4:01:41 AM
On 27 Aug 2022 at 22:01, Lynn McGuire wrote:
I don't understand the idea of x--. Why modify x? What happens if x
is indexed again later?

Thomas Koenig

Aug 28, 2022, 4:14:57 AM
Fred. Zwarts <F.Zw...@KVI.nl> wrote:
The idea is to use this modified pointer for one-based array
accesses, so that it would be possible to translate Fortran's A(1)
into a[1] on the C side, or A(N) into a[n].

The correct way to do this according to the C standard would be to
translate A(N) into a[n-1] on the C side. There are several reasons
why this might not have been done: readability of the generated
code (although f2c code is already hard to read), slower code with
the compilers of the day, and probably the fact that the trick
"just worked" with those compilers.
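
A sketch of that conforming translation, keeping Fortran-style loop bounds but subtracting 1 at each access. The function name is hypothetical and the code is not taken from f2c.

```cpp
#include <cassert>
#include <cstddef>

// Sums a(1)..a(n) using one-based loop bounds while only ever forming
// pointers inside the array: a(i) is translated to a[i - 1].
inline double sum_one_based(const double* a, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 1; i <= n; ++i) {
        total += a[i - 1];
    }
    return total;
}
```

Unlike the x-- idiom, the pointer a is never moved before the first element, so there is no undefined behavior, and optimizing compilers can typically fold the constant offset into the addressing.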

Bonita Montero

Aug 28, 2022, 5:07:29 AM
At the CPU level, you need the fewest calculations to
determine the address of an indexed entity if the index starts
at zero.


David Brown

Aug 28, 2022, 5:52:47 AM
Indeed.

C was designed to have as few restrictions on the hardware as it could,
while still having a minimum feature set. If you have a segmented
memory architecture (such as 16-bit x86), then addresses might have the form
"segment:offset". If an array starts at offset 0, what does it mean to
have an address one before that? It might make no sense, or have
different meanings in different contexts, or require inefficient extra
instructions to get right. Some architectures pre-load information
about memory pages or segments when a pointer register is loaded,
causing trouble if it does not actually point to valid memory. Leaving
this all undefined is much simpler for everyone.

(The other end, one past the end of the array, is too useful in common C
idioms to leave undefined - even if it might mean that an implementation
can't use the last address in memory.)
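
The idiom in question is the half-open range [first, last), sketched below with a hypothetical helper. The one-past-the-end pointer is only compared, never dereferenced.

```cpp
#include <cassert>
#include <cstddef>

// Counts the elements equal to `value` by walking [first, last).
// `last` may point one past the final element of the array.
inline std::size_t count_equal(const int* first, const int* last, int value) {
    std::size_t n = 0;
    for (const int* p = first; p != last; ++p) {
        if (*p == value) {
            ++n;
        }
    }
    return n;
}
```

For an array `int a[5]`, the call `count_equal(a, a + 5, 1)` is valid even though `a + 5` does not point at an element.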

Paavo Helde

Aug 28, 2022, 12:43:28 PM
On 28.08.2022 at 06:40, Lynn McGuire wrote:
> On 8/27/2022 5:24 PM, Thomas Koenig wrote:
>> Lynn McGuire <lynnmc...@gmail.com> wrote:
>>> On 8/27/2022 2:26 PM, Fred. Zwarts wrote:
>>>> I assumed that it was done because in C x[i] is equivalent to *(x+i).
>>>
>>> Yup.  So Fortran x(i) is equivalent to *(x+i-1).
>>
>> To be more precise, x(i) is equivalent to *(x+i-lbound(x,1))
>>
>>> Or 1 is subtracted from x first: x--; *(x+i);.
>>
>> That's an f2c idiom, which is not valid C, AFAIK, because
>> the pointer would point before the actual array.
>
> While the second pointer is a bad pointer, it is not an illegal pointer.

The C++ standard (n4861) calls this "an invalid pointer value".

For pointers which continue to point to freed objects, it says "A
pointer value becomes invalid when the storage it denotes reaches the
end of its storage duration".

About invalid pointers it says:
"Indirection through an invalid pointer value and passing an invalid
pointer value to a deallocation function have undefined behavior. Any
other use of an invalid pointer value has implementation-defined behavior."

There is also a footnote:
"Some implementations might define that copying an invalid pointer value
causes a system-generated runtime fault."

So, while technically one might get away with the "x--" trick most of
the time with the linear memory addressing used by mainstream
implementations nowadays, still various diagnostic tools would mark
these as invalid pointers, causing an avalanche of errors whenever you
want to solve your actual memory access problems.

I guess it might also subvert automatic garbage collection which is
sometimes used with C++. There is a special case of "safely-derived"
pointer values which I suspect is made exactly for making GC possible,
and changing the pointer value to x-1 would apparently ruin this.

And with segmented memory, like with 16-bit x86, it might cause all kinds
of surprises.

Juha Nieminen

Aug 29, 2022, 7:16:14 AM
In comp.lang.c++ Lynn McGuire <lynnmc...@gmail.com> wrote:
> "Why do arrays start at 0?"
> https://buttondown.email/hillelwayne/archive/why-do-arrays-start-at-0/
>
> "It's not the reason you think. No, it's not that reason either."
>
> My Fortran starts at one. My C++ starts at zero. This has made my life
> hell.

I don't know if it's the *original* reason, but I would assume that at least
in C one of the main reasons is the principle of maximum efficiency.

In many processor architectures the concept of "array" exists, at least
when it comes to values of the register sizes (i.e. usually 1-byte,
2-byte, 4-byte and 8-byte elements, the last one at least on 64-bit
architectures). Prominently the concept of an indexable array exists
in the x86 architecture. (I don't remember now if it also exists in
the ARM architecture, but I would guess so.)

Generally when a processor architecture supports the concept of an "array",
it does so by having instructions that take (at least) two registers as
the input or the output parameter: A base address, and an offset. The
memory location of the element is calculated by adding those two. (The
number of bytes that an offset of 1 jumps depends on the instruction,
and thus multi-byte elements are supported.)

Thus zero-indexing is extraordinarily natural in processor architectures:
The "index" is actually an offset. It's a value you add to the base
address in order to get to the location you want. Thus, the first element
is at index/offset 0.
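
To make the base-plus-offset picture concrete, here is a small sketch that measures how far element i lies from the base address in bytes. The helper name byte_offset is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte distance between the base address and element i.
// For a contiguous array this is i * sizeof(T), so element 0 is at the base.
template <typename T>
std::size_t byte_offset(const T* base, std::size_t i) {
    const auto* start = reinterpret_cast<const unsigned char*>(base);
    const auto* element = reinterpret_cast<const unsigned char*>(base + i);
    return static_cast<std::size_t>(element - start);
}
```

The scaling by sizeof(T) corresponds to the element-size scaling that such indexed addressing instructions perform.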

Since that's the case, the most efficient way to handle low-level arrays is
to have 0-based indexing in the programming language as well. That way
you don't need to be subtracting 1 from the index every time an array is
accessed (or you don't need an extraneous unused element at the beginning
of the array, consuming memory for no reason).

Also, since C supports pointer arithmetic, many operations become simpler,
such as getting the index of an element when what you have is a pointer to
it (and a pointer to the start of the array).
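
A sketch of that last operation, with a hypothetical helper name. With zero-based indexing the pointer difference is the index itself; a one-based scheme would need an extra +1.

```cpp
#include <cassert>
#include <cstddef>

// Index of the element `p` points to, given a pointer to the first element.
// Both pointers must refer to the same array.
inline std::ptrdiff_t index_of(const int* base, const int* p) {
    return p - base;
}
```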