
Zero-based or one-based indexing


James Harris

Nov 30, 2021, 3:07:34 AM
From another thread, discussion between David and Bart:

D> But if you have just one starting point, 0 is the sensible one.
D> You might not like the way C handles arrays (and I'm not going to
D> argue about it - it certainly has its cons as well as its pros),
D> but even you would have to agree that defining "A[i]" to be the
D> element at "address of A + i * the size of the elements" is neater
D> and clearer than one-based indexing.

B> That's a crude way of defining arrays. A[i] is simply the i'th
B> element of N slots, you don't need to bring offsets into it.

Why call it 'i'th? I know people do but wouldn't it be easier to call it
'element n' where n is its index? Then that would work with any basing.


B> With 0-based, there's a disconnect between the ordinal number of
B> the element you want, and the index that needs to be used. So A[2]
B> for the 3rd element.

Why not call A[2] element 2?

BTW, Bart, do you consider the first ten numbers as 1 to 10 rather than
0 to 9? If so, presumably you count the hundreds as starting at 111.
That's not the most logical viewpoint.

Similarly, on the day a child is born do you say that he is one year old?


--
James Harris

Dmitry A. Kazakov

Nov 30, 2021, 4:18:32 AM
On 2021-11-30 09:07, James Harris wrote:
> From another thread, discussion between David and Bart:
>
> D> But if you have just one starting point, 0 is the sensible one.
> D> You might not like the way C handles arrays (and I'm not going to
> D> argue about it - it certainly has its cons as well as its pros),
> D> but even you would have to agree that defining "A[i]" to be the
> D> element at "address of A + i * the size of the elements" is neater
> D> and clearer than one-based indexing.
>
> B> That's a crude way of defining arrays. A[i] is simply the i'th
> B> element of N slots, you don't need to bring offsets into it.
>
> Why call it 'i'th? I know people do but wouldn't it be easier to call it
> 'element n' where n is its index? Then that would work with any basing.

You are confusing position with index. Index can be of any ordered type.
Position is an ordinal number: first, second, third element from the
array beginning.

> B> With 0-based, there's a disconnect between the ordinal number of
> B> the element you want, and the index that needs to be used. So A[2]
> B> for the 3rd element.
>
> Why not call A[2] element 2?

Because it would be wrong. In most languages A[2] means the array
element corresponding to the index 2.

Remember, array is a mapping:

array : index -> element

In well-designed languages it is also spelt as a mapping:

A(2)

> BTW, Bart, do you consider the first ten numbers as 1 to 10 rather than
> 0 to 9? If so, presumably you count the hundreds as starting at 111.
> That's not the most logical viewpoint.
>
> Similarly, on the day a child is born do you say that he is one year old?

Similar confusion here. There is date and duration. Date is absolute
like index. Duration is relative like position.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Bart

Nov 30, 2021, 5:28:04 AM
On 30/11/2021 08:07, James Harris wrote:
> From another thread, discussion between David and Bart:
>
> D> But if you have just one starting point, 0 is the sensible one.
> D> You might not like the way C handles arrays (and I'm not going to
> D> argue about it - it certainly has its cons as well as its pros),
> D> but even you would have to agree that defining "A[i]" to be the
> D> element at "address of A + i * the size of the elements" is neater
> D> and clearer than one-based indexing.
>
> B> That's a crude way of defining arrays. A[i] is simply the i'th
> B> element of N slots, you don't need to bring offsets into it.
>
> Why call it 'i'th? I know people do but wouldn't it be easier to call it
> 'element n' where n is its index? Then that would work with any basing.

The most common base I use is 1 (about 2/3 of the time). You have a
3-element array, the 1st is numbered 1, the last is 3, the 3rd is 3 too.
All very intuitive and user-friendly.

But this is that 3-element array as 3 adjoining cells:

   mmmmmmmmmmmmmmmmmmmmmmmmm
   m       m       m       m
   m   1   m   2   m   3   m     Normal indexing
   m  +0   m  +1   m  +2   m     Offsets
   m       m       m       m
   mmmmmmmmmmmmmmmmmmmmmmmmm

   0       1       2       3     Distance from start point


The numbering is 1, 2, 3 as I prefer when /counting/. Or you can choose
to use offsets from the first element as C does, shown as +0, +1, +2.

There is also /measuring/, which applies more when each cell has some
physical dimension, such as 3 adjoining square pixels. Or maybe these
are three fence panels, and the vertical columns are the posts.

Here, offsets are again used, but notionally considered to be measured
from the first 'post'.

In this case, an 'index' of 2.4 is meaningful, being 2.4 units from the
left, and 40% along that 3rd cell.

Measurement can also apply when the cells represent other units, like
time as DAK touched on: how many days from Monday to Wednesday? That is
not that meaningful when a day is considered an indivisible unit like an
array cell.

You can say the difference is +2 days. In real life, it depends on what
time Monday, and what time Wednesday, so it can vary from 24 to 72 hours
(24:00 Mon to 00:00 Wed, or 00:00 Mon to 24:00 Wed).


>
> B> With 0-based, there's a disconnect between the ordinal number of
> B> the element you want, and the index that needs to be used. So A[2]
> B> for the 3rd element.
>
> Why not call A[2] element 2?

See N-based below.

>
> BTW, Bart, do you consider the first ten numbers as 1 to 10 rather than
> 0 to 9? If so, presumably you count the hundreds as starting at 111.
> That's not the most logical viewpoint.

It's not always logical; I celebrated the millennium on 1-1-2000 like
everyone else. It's a big deal when the '19' year prefix, in use for 100
years, suddenly changes to '20'.

> Similarly, on the day a child is born do you say that he is one year old?


This is 'measurement'; see above. However my dad always liked to round
his age up to the next whole year; most people round down! So the child
would be 0 years, but in its first year.

However there is not enough resolution using years to accurately measure
ages of very young children, so people also use days, weeks and months.

So, when do I use 0-based:

(a) When porting zero-based algorithms from elsewhere. This works more
reliably than porting one-based code to C.

    [N]int A       # 1-based (also [1:N] or [1..N])
    [0:N]int A     # 0-based (also [0..N-1])

(b) When I have a regular array normally indexed from 1, but where the
index can have 0 as an escape value, meaning not set or not valid:

    global tabledata() [0:]ichar opndnames =
        (no_opnd=0,     $),
        (mem_opnd,      $),
        (memaddr_opnd,  $),
        ....

(c) When the value used as index naturally includes zero.

When do I use N-based: this is much less common. An example might be:

['A'..'Z']int counts

Here, it becomes less meaningful to use the ordinal position index: the
first element has index 65! So this kind of array has more in common
with a hash or dict type, where the index is a key that can be anything.

But for the special case of the keys being consecutive integers over a
small range, then a regular, fixed-size array indexed by that range is
far more efficient.

However, the slice counts['D'..'F'] will have elements indexed from 1..3,
not 'D'..'F'. There are some pros and cons, but overall the way I've
done it is simpler (slices have a lower bound known at compile-time, not
runtime).

Andy Walker

Nov 30, 2021, 7:50:17 PM
On 30/11/2021 08:07, James Harris wrote:
> BTW, Bart, do you consider the first ten numbers as 1 to 10 rather
> than 0 to 9?

Until quite recently, Bart and almost everyone else would
certainly have done exactly that. Zero, as a number, was invented
in modern times [FSVO "modern"!]. "You have ten sheep and you sell
ten of them. How many sheep do you now have?" "??? I don't have
/any/ sheep left." Or, worse, "You have ten sheep and you sell
eleven of them. How many sheep do you now have?" "??? You can't
do that, it would be fraud." Or the Peano axioms for the natural
numbers: 1 is a natural number; for every n in the set, there
is a successor n' in the set; every n in the set /except/ 1 is
the successor of a unique member; .... Or look at any book;
only in a handful of rather weird books trying to make a point
is there a page 0. When you first learned to count, you almost
certainly started with a picture of a ball [or whatever] and
the caption "1 ball", then "2 cats", "3 trees", "4 cakes", ...
up to "12 candles"; not with an otherwise blank page showing
"0 children". [Note that 0 as a number in its own right is
different from the symbol 0 as a placeholder in the middle or at
the end of a number in Arabic numerals.]

Maths, inc numbers, counting, and science generally, got
along quite happily with only positive numbers from antiquity up
to around 1700, when the usefulness of the rest of the number
line became apparent, at least in maths and science if not to
the general public.

/Now/ the influence of computing has made zero-based
indexing more relevant. So have constructive arguments more
generally; eg, the surreal numbers -- a surreal number is two
sets of surreal numbers [with some conditions], so that the
natural starting point is where the two sets are empty, giving
the "empty" number, naturally identified with zero. So it was
only around 1970 that people started taking seriously the idea
of counting from zero. Of course, once you do that, then you
can contemplate not counting "from" anywhere at all; eg the
idea [which I first saw espoused by vd Meulen] that arrays
could be thought of as embedded in [-infinity..+infinity],
treated therefore always in practice as "sparse" arrays, with
almost all elements being "virtual".

> If so, presumably you count the hundreds as starting at
> 111. That's not the most logical viewpoint.

Note that we normally read that number as "one hundred
/and/ eleven", suggesting that it's eleven into the second
hundred. It's not illogical to suggest that the "hundreds"
start immediately after 100, nor to suggest that they start
/at/ 100. Dates are a special case, as there was [of course]
no year zero, so centuries "definitely" end on the "hundred"
years, not start on them. But, as Bart pointed out, there is
still an interest in the number clicking over from 1999 to
2000, and therefore the chance to get two parties.

> Similarly, on the day a child is born do you say that he is one year
> old?

Is this child your first-born? Would you call your
eighth-born child "Septimus"?

Slightly more seriously, there are of course legal
questions surrounding this [esp when they concern the age
of majority and such-like], and they are resolved by the
law and by conventions [which may well differ around the
world] rather than by maths and logic.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bendel

David Brown

Dec 1, 2021, 3:43:15 AM
On 01/12/2021 01:50, Andy Walker wrote:
> On 30/11/2021 08:07, James Harris wrote:
>> BTW, Bart, do you consider the first ten numbers as 1 to 10 rather
>> than 0 to 9?
>
>     Until quite recently, Bart and almost everyone else would
> certainly have done exactly that.

Remember Bart, and some others, think it is "natural" to count from
32767 on to -32767 (or larger type equivalents - 16-bit numbers are
easier to write) in the context of programming. Clearly they would not
think that way when counting sheep. So why apply sheep counting to
other aspects of programming? Personally I prefer to think you can't
add 1 to 32767 (or larger type equivalents), which is of course almost
equally silly in terms of sheep.

>  Zero, as a number, was invented
> in modern times [FSVO "modern"!].

(Historical note:

It reached Europe around 1200, but had been around in India, amongst
other countries, for a good while before that. The Mayans also had a
number zero earlier on. It is difficult to be precise about times,
however, because "zero" is used for many different purposes and ideas
changed and evolved over time.)

>  "You have ten sheep and you sell
> ten of them.  How many sheep do you now have?"  "??? I don't have
> /any/ sheep left."  Or, worse, "You have ten sheep and you sell
> eleven of them.  How many sheep do you now have?" "??? You can't
> do that, it would be fraud."  Or the Peano axioms for the natural
> numbers:  1 is a natural number;  for every n in the set, there
> is a successor n' in the set;  every n in the set /except/ 1 is
> the successor of a unique member; ....  Or look at any book;
> only in a handful of rather weird books trying to make a point
> is there a page 0.  When you first learned to count, you almost
> certainly started with a picture of a ball [or whatever] and
> the caption "1 ball", then "2 cats", "3 trees", "4 cakes", ...
> up to "12 candles";  not with an otherwise blank page showing
> "0 children".  [Note that 0 as a number in its own right is
> different from the symbol 0 as a placeholder in the middle or at
> the end of a number in Arabic numerals.]
>

The first Peano axiom is "0 is a natural number". They start counting
at zero, not at one.

There is no mathematical consensus as to whether the set of natural
numbers ℕ starts with 0 or 1. But there is no doubt that the numbers
generated by the Peano axioms start at 0.

Other than that, we can simply say that different types of number are
useful for different purposes.


>     Maths, inc numbers, counting, and science generally, got
> along quite happily with only positive numbers from antiquity up
> to around 1700, when the usefulness of the rest of the number
> line became apparent, at least in maths and science if not to
> the general public.
>

Negative numbers long pre-date the general acceptance of 0 as a
"number". They were used in accountancy, as well as by a few
mathematicians. But their general use, especially in Europe, came a lot
later.

>     /Now/ the influence of computing has made zero-based
> indexing more relevant.  So have constructive arguments more
> generally;  eg, the surreal numbers -- a surreal number is two
> sets of surreal numbers [with some conditions], so that the
> natural starting point is where the two sets are empty, giving
> the "empty" number, naturally identified with zero.  So it was
> only around 1970 that people started taking seriously the idea
> of counting from zero.  Of course, once you do that, then you
> can contemplate not counting "from" anywhere at all;  eg the
> idea [which I first saw espoused by vd Meulen] that arrays
> could be thought of as embedded in [-infinity..+infinity],
> treated therefore always in practice as "sparse" arrays, with
> almost all elements being "virtual".
>

I am quite confident that the idea of starting array indexes from 0 had
nothing to do with surreals. Surreal numbers are rather esoteric, and
very far from useful in array indexing in programming (which always
boils down to some kind of finite integer).

Bart

Dec 1, 2021, 6:13:29 AM
On 01/12/2021 08:43, David Brown wrote:
> On 01/12/2021 01:50, Andy Walker wrote:
>> On 30/11/2021 08:07, James Harris wrote:
>>> BTW, Bart, do you consider the first ten numbers as 1 to 10 rather
>>> than 0 to 9?
>>
>>     Until quite recently, Bart and almost everyone else would
>> certainly have done exactly that.
>
> Remember Bart, and some others, think it is "natural" to count from
> 32767 on to -32767 (or larger type equivalents - 16-bit numbers are
> easier to write) in the context of programming.

Remember David thinks it's natural to count from 65535 on to 0.

I simply acknowledge that that is how most hardware works. Otherwise how
do you explain that the upper limit of some value is (to ordinary
people) the arbitrary figure of 32,767 or 65,535 instead of 99,999?



> Clearly they would not
> think that way when counting sheep. So why apply sheep counting to
> other aspects of programming? Personally I prefer to think you can't
> add 1 to 32767 (or larger type equivalents), which is of course almost
> equally silly in terms of sheep.

It might be silly, but you'd still be stuck if you had 33,000 sheep to
count; what are you going to do?



> I am quite confident that the idea of starting array indexes from 0 had
> nothing to do with surreals.

More to do with conflating them with offsets.

> Surreal numbers are rather esoteric, and
> very far from useful in array indexing in programming (which always
> boils down to some kind of finite integer).
>

    a := [:]

    a{infinity}  := 100
    a{-infinity} := 200

    println a               # [Infinity:100, -Infinity:200]


David Brown

Dec 1, 2021, 7:25:04 AM
On 01/12/2021 12:13, Bart wrote:
> On 01/12/2021 08:43, David Brown wrote:
>> On 01/12/2021 01:50, Andy Walker wrote:
>>> On 30/11/2021 08:07, James Harris wrote:
>>>> BTW, Bart, do you consider the first ten numbers as 1 to 10 rather
>>>> than 0 to 9?
>>>
>>>      Until quite recently, Bart and almost everyone else would
>>> certainly have done exactly that.
>>
>> Remember Bart, and some others, think it is "natural" to count from
>> 32767 on to -32767 (or larger type equivalents - 16-bit numbers are
>> easier to write) in the context of programming.
>
> Remember David thinks it's natural to count from 65535 on to 0.

No, I don't - as you would know if you read my posts.

>
> I simply acknowledge that that is how most hardware works. Otherwise how
> do you explain that the upper limit of some value is (to ordinary
> people) the arbitrary figure of 32,767 or 65,535 instead of 99,999?
>

You say the limit is 32767, or whatever - explaining it in terms of the
hardware if you like. People can understand that perfectly well.
Limits are quite natural in counting and measuring - wrapping is much
rarer (though it does occur, such as with times and angles).

>
>
>> Clearly they would not
>> think that way when counting sheep.  So why apply sheep counting to
>> other aspects of programming?  Personally I prefer to think you can't
>> add 1 to 32767 (or larger type equivalents), which is of course almost
>> equally silly in terms of sheep.
>
> It might be silly, but you'd still be stuck if you had 33,000 sheep to
> count; what are you going to do?
>

Buy a bigger pen to put them in.

It is perfectly reasonable to say that you are counting sheep by putting
them in a pen, and if the pen only holds 20 sheep then you can't count
beyond 20.

>
>
>> I am quite confident that the idea of starting array indexes from 0 had
>> nothing to do with surreals.
>
> More to do with conflating them with offsets.

Having indexes of low-level arrays correlate to offsets is simple,
clear, obvious and efficient. (And again, I like having higher-level
array handling where index types can be more flexible - such as integer
subranges or enumeration types.)

>
>>  Surreal numbers are rather esoteric, and
>> very far from useful in array indexing in programming (which always
>> boils down to some kind of finite integer).
>>
>
>     a:=[:]
>
>     a{infinity}  := 100
>     a{-infinity} := 200
>
>     println a               # [Infinity:100, -Infinity:200]
>
>

General hashmaps or dictionaries are a different concept from contiguous
arrays (though some languages combine them). They are suitable (and
very useful) in higher level languages, but should not be part of the
core language for low-level languages. Libraries can then offer a range
of different variations on the theme, letting programmers pick the
version that fits their needs.

(Oh, and there is no such surreal as "infinity" - most surreals are
non-finite. But that's really getting off-topic!)

Andy Walker

Dec 1, 2021, 7:12:00 PM
On 01/12/2021 08:43, David Brown wrote:
> [I wrote:]
>>   Zero, as a number, was invented
>> in modern times [FSVO "modern"!].
> (Historical note:
> It reached Europe around 1200, but had been around in India, amongst
> other countries, for a good while before that.

Yes, but that's nearly always zero as a placeholder, not
as a number in its own right. [I'm not convinced by many of the
claimed exceptions, which often smack of flag-waving.]

[...]
> The first Peano axiom is "0 is a natural number". They start counting
> at zero, not at one.
> There is no mathematical consensus as to whether the set of natural
> numbers ℕ starts with 0 or 1. But there is no doubt that the numbers
> generated by the Peano axioms start at 0.

When Peano first wrote his axioms, he started at 1. Later
he wrote a version starting at 0. The foundational maths books on
my shelves, even modern ones, are split; it really matters very
little.

[...]
> Negative numbers long pre-date the general acceptance of 0 as a
> "number". They were used in accountancy, as well as by a few
>> mathematicians.  But their general use, especially in Europe, came a
> lot later.

My impression is that accountants used red ink rather than
negative numbers. As late as the 1970s, hand/electric calculators
still used red numerals rather than a minus sign.

>>     /Now/ the influence of computing has made zero-based
>> indexing more relevant.  So have constructive arguments more
>> generally;  eg, the surreal numbers [...].
> I am quite confident that the idea of starting array indexes from 0 had
> nothing to do with surreals. [...]

Surreal numbers were an example; they are part of the
explanation for mathematics also tending to become zero-based.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Ketterer

David Brown

Dec 2, 2021, 2:37:33 AM
On 02/12/2021 01:11, Andy Walker wrote:
> On 01/12/2021 08:43, David Brown wrote:
>> [I wrote:]
>>>    Zero, as a number, was invented
>>> in modern times [FSVO "modern"!].
>> (Historical note:
>> It reached Europe around 1200, but had been around in India, amongst
>> other countries, for a good while before that.
>
>     Yes, but that's nearly always zero as a placeholder, not
> as a number in its own right.  [I'm not convinced by many of the
> claimed exceptions, which often smack of flag-waving.]
>

Certainly zero as a placeholder was much more common. As a number -
well, since there was not even a consensus as to what a "number" is
until more rigorous mathematics of the past few centuries, it is very
difficult to tell. And of course we don't exactly have complete records
of all mathematics in all cultures for the last few millennia. So
there is definitely room for interpretation, opinions and hypotheses in
the history here, with no good way to judge the accuracy.

> [...]
>> The first Peano axiom is "0 is a natural number".  They start counting
>> at zero, not at one.
>> There is no mathematical consensus as to whether the set of natural
>> numbers ℕ starts with 0 or 1.  But there is no doubt that the numbers
>> generated by the Peano axioms start at 0.
>
>     When Peano first wrote his axioms, he started at 1.  Later
> he wrote a version starting at 0.  The foundational maths books on
> my shelves, even modern ones, are split;  it really matters very
> little.

It matters a lot once you get into the arithmetic - 0 is the additive
identity. I suppose you /could/ define addition with the starting point
"a + 1 = succ(a)" rather than "a + 0 = a", but it is all much easier and
neater when you start with 0. That is certainly how I learned it at
university, and how I have seen it a few other places - but while I
think I have a couple of books covering them, they are buried in the
attic somewhere.

>
> [...]
>> Negative numbers long pre-date the general acceptance of 0 as a
>> "number".  They were used in accountancy, as well as by a few
>> mathematicians.  But their general use, especially in Europe, came a
>> lot later.
>
>     My impression is that accountants used red ink rather than
> negative numbers.  As late as the 1970s, hand/electric calculators
> still used red numerals rather than a minus sign.
>

Many conventions have been used, in different countries, times, and
cultures. "Red ink" is certainly a well-known phrase in modern
English-speaking countries. But brackets, minus signs, and other
methods are used. Go far enough back and people didn't write with ink
at all.

But again, it is difficult to decide when something was considered "a
negative number" rather than "a number to be subtracted rather than added".

>>>      /Now/ the influence of computing has made zero-based
>>> indexing more relevant.  So have constructive arguments more
>>> generally;  eg, the surreal numbers [...].
>> I am quite confident that the idea of starting array indexes from 0 had
>> nothing to do with surreals. [...]
>
>     Surreal numbers were an example;  they are part of the
> explanation for mathematics also tending to become zero-based.
>

Really? Again, I would suggest that they are far too esoteric for the
purpose. Constructions of surreal numbers will normally start with 0 -
but so will constructions of other more familiar types, such as
integers, reals, ordinals, cardinals, and almost any other numbers.
Maybe it is just that with surreals, few people ever have much idea of
what they are, or get beyond reading how they are constructed! (Some
day I must get the book on them - it was Conway that developed them, and
Knuth that wrote the book, right?)

Bart

Dec 2, 2021, 9:56:51 AM
On 01/12/2021 12:25, David Brown wrote:
> On 01/12/2021 12:13, Bart wrote:
>> On 01/12/2021 08:43, David Brown wrote:

>>> Remember Bart, and some others, think it is "natural" to count from
>>> 32767 on to -32767 (or larger type equivalents - 16-bit numbers are
>>> easier to write) in the context of programming.
>>
>> Remember David thinks it's natural to count from 65535 on to 0.
>
> No, I don't - as you would know if you read my posts.
>
>>
>> I simply acknowledge that that is how most hardware works. Otherwise how
>> do you explain that the upper limit of some value is (to ordinary
>> people) the arbitrary figure of 32,767 or 65,535 instead of 99,999?
>>
>
> You say the limit is 32767, or whatever - explaining it in terms of the
> hardware if you like. People can understand that perfectly well.
> Limits are quite natural in counting and measuring - wrapping is much
> rarer (though it does occur, such as with times and angles).

Yes, exactly. You travel east but when you hit 180E, it suddenly turns
into 180W, and the next degree along will be 179W not 181E.

The integer values represented by N bits can be thought of as being
arranged in a circle, here shown for N=3 as either unsigned, two's
complement or signed magnitude:

       u3           i3           s3

   000   0      000   0      000  +0     Origin
   001  +1      001  +1      001  +1
   010  +2      010  +2      010  +2
   011  +3      011  +3      011  +3
   100  +4      100  -4      100  -0
   101  +5      101  -3      101  -1
   110  +6      110  -2      110  -2
   111  +7      111  -1      111  -3
   000   0      000   0      000  +0     Origin
   001  +1      001  +1      001  +1
   ...

Degrees of longitude, if they were whole numbers rather than
continuous, would correspond most closely with the middle column (but
there would be 179E then 180W; no 180E).

Whatever column is chosen, wrapping behaviour is well-defined, even if
it may not be meaningful if your preferred result would need 6 bits to
represent; you don't want just the bottom 3.

But if you're in an aircraft flying along the equator, travelling 10
degrees east then 10 degrees west would normally get to you back to the
same longitude, whatever the start point, even when you cross the 180th
meridian.

James Harris

Dec 2, 2021, 12:29:52 PM
On 30/11/2021 10:28, Bart wrote:
> On 30/11/2021 08:07, James Harris wrote:
>>  From another thread, discussion between David and Bart:

...

>> B> That's a crude way of defining arrays. A[i] is simply the i'th
>> B> element of N slots, you don't need to bring offsets into it.

I disagree. A[i] is not necessarily or even naturally the ith element.
See below about cardinal and ordinal numbers.

>>
>> Why call it 'i'th? I know people do but wouldn't it be easier to call
>> it 'element n' where n is its index? Then that would work with any
>> basing.
>
> The most common base I use is 1 (about 2/3 of the time). You have a
> 3-element array, the 1st is numbered 1, the last is 3, the 3rd is 3 too.
> All very intuitive and user-friendly.
>
> But this is that 3-element array as 3 adjoining cells:
>
>    mmmmmmmmmmmmmmmmmmmmmmmmm
>    m       m       m       m
>    m   1   m   2   m   3   m     Normal indexing
>    m  +0   m  +1   m  +2   m     Offsets
>    m       m       m       m
>    mmmmmmmmmmmmmmmmmmmmmmmmm
>
>    0       1       2       3     Distance from start point
>
>
> The numbering is 1, 2, 3 as I prefer when /counting/. Or you can choose
> to use offsets from the first element as C does, shown as +0, +1, +2.
>
> There is also /measuring/, which applies more when each cell has some
> physical dimension, such as 3 adjoining square pixels. Or maybe these
> are three fence panels, and the vertical columns are the posts.

Rather than 'counting' and 'measuring', here's another way to look at it:
the natural place to count an element - any element - is when it is
complete. Where I think the conflict appears is that if an element is
known to be indivisible then it can never be partially present, so we
know it is complete as soon as we see its start. That 'trick' works for
whole elements but does not work in the general case.

To explain, consider a decimal number and take the units. You may see
them as

1, 2, 3, etc

but now take the tens position. In those numbers the tens position is
zero so they can be seen as (in normal notation, not in C-form octal!)

01, 02, 03, etc

Similarly, the number of hundreds in those numbers is also zero, i.e.
with three digits they are

001, 002, 003, etc

The tens and the hundreds are each subdivided (into smaller elements
1/10th of their value). We have to wait for the units to tick past 9
before we add 1 to the tens column, and for the tens column to tick past
9 before we add another hundred. So the mathematically natural indexing
for tens, hundreds, and all higher digit positions is from zero.
It's more consistent, then, to number the units from zero, too, but we
often find it natural to count them from 1. Here's an idea as to why:

Perhaps we think of counting units from 1 because we normally count
/whole/ objects. We don't need to wait for them to complete; since they
are indivisible we know they are complete when we first see them.

But that's a special case. The more general case is to start from zero.


>
> Here, offsets are again used, but notionally considered to be measured
> from the first 'post'.
>
> In this case, an 'index' of 2.4 is meaningful, being 2.4 units from the
> left, and 40% along that 3rd cell.

As above, that's (correctly) seeing an object as partial.

...

>> Similarly, on the day a child is born do you say that he is one year old?
>
>
> This is 'measurement'; see above. However my dad always liked to round
> his age up to the next whole year; most people round down! So the child
> would be 0 years, but in its first year.
>
> However there is not enough resolution using years to accurately measure
> ages of very young children, so people also use days, weeks and months.

Partials, again. A person doesn't become 1 year old until he reaches 12
months, for example.

Cardinals and ordinals

Going back to your point at the beginning, as above the ordinal of
something is naturally one more than its cardinal number. Our /first/
year is when we are age zero whole years. In the 20th century the
century portion of the date was 19. Etc.

So in a zero-based array it would be inconsistent to refer to

A[1]

as the first element even though lots of people do it. It is, in fact,
the second. It's probably easiest to refer to it as

element 1

then the bounds don't matter.



--
James Harris

James Harris

Dec 2, 2021, 12:39:59 PM
On 30/11/2021 09:18, Dmitry A. Kazakov wrote:
> On 2021-11-30 09:07, James Harris wrote:
>>  From another thread, discussion between David and Bart:

...

>> B> That's a crude way of defining arrays. A[i] is simply the i'th
>> B> element of N slots, you don't need to bring offsets into it.
>>
>> Why call it 'i'th? I know people do but wouldn't it be easier to call
>> it 'element n' where n is its index? Then that would work with any
>> basing.
>
> You are confusing position with index. Index can be of any ordered type.
> Position is an ordinal number: first, second, third element from the
> array beginning.

You are confusing a question to Bart with an opinion. :-)

...

>> Why not call A[2] element 2?
>
> Because it would be wrong. In most languages A[2] means the array
> element corresponding to the index 2.

"Element 2" doesn't mean "second element" so why is A[2] not "element 2"
just as A["XX"] would be element "XX"?

>
> Remember, array is a mapping:
>
>    array : index -> element
>
> In well-designed languages it is also spelt as a mapping:
>
>    A(2)

This is something I want to come back to elsewhere but since you mention
it I'm curious, Dmitry, as to whether you would accept such a mapping
returning the address of, as in

A(2) = A(2) + 1


--
James Harris

Dmitry A. Kazakov

Dec 2, 2021, 1:05:17 PM
On 2021-12-02 18:39, James Harris wrote:
> On 30/11/2021 09:18, Dmitry A. Kazakov wrote:
>> On 2021-11-30 09:07, James Harris wrote:

>>> Why not call A[2] element 2?
>>
>> Because it would be wrong. In most languages A[2] means the array
>> element corresponding to the index 2.
>
> "Element 2" doesn't mean "second element"

Again, A[2] is the element corresponding to the index 2. Not "element
2," not "second element," just an element denoted by the index value 2.

> so why is A[2] not "element 2"

Because "element 2" is undefined, so far.

> just as A["XX"] would be element "XX"?

Nope. It is the element corresponding to the index "XX".

>> Remember, array is a mapping:
>>
>>     array : index -> element
>>
>> In well-designed languages it is also spelt as a mapping:
>>
>>     A(2)
>
> This is something I want to come back to elsewhere but since you mention
> it I'm curious, Dmitry, as to whether you would accept such a mapping
> returning the address of, as in
>
>   A(2) = A(2) + 1

It does not return an address. A(2) denotes the array element corresponding
to the index 2 on both sides. No addresses at all. A is a mapping, mutable
in this case. It does not return anything.

You as always confuse implementation with semantics. There is an
uncountable number of valid implementations of a mapping. The programmer
does not care most of the time, because he presumes the compiler vendors
are sane people, until proven otherwise.

Bart

Dec 2, 2021, 3:11:46 PM
On 02/12/2021 17:29, James Harris wrote:
> On 30/11/2021 10:28, Bart wrote:


> To explain, consider a decimal number and take the units. You may see
> them as
>
>   1, 2, 3, etc
>
> but now take the tens position. In those numbers the tens position is
> zero so they can be seen as (in normal notation, not in C-form octal!)
>
>   01, 02, 03, etc
>
> Similarly, the number of hundreds in those numbers is also zero, i.e.
> with three digits they are
>
>   001, 002, 003, etc
>
> The tens and the hundreds are each subdivided (into smaller elements
> 1/10th of their value). We have to wait for the units to tick past 9
> before we add 1 to the tens column, and for the tens column to tick past
> 9 before we add another hundred. So the mathematically natural indexing
> for tens and hundreds and all higher digit positions more is from zero.
> It's more consistent, then, to number the units from zero, too, but we
> often find it natural to count them from 1.

They're not really numbered, they're counted, and the number of tens goes
from 0 to 9 in total.

Most people do count from zero, in that the start point when you have
nothing is zero; the next is designated 1; the next 2, and so on. The
last in your collection is designated N, and you have N things in all.

Except the tens in your example are not ordered nor individually
numbered, you just need the total.

I guess if you had two cars in your household, you would agree there
were '2' cars and not '1' (which would confuse everyone, and would mean
that anyone without a car would have, what, -1 cars? It doesn't work!).

But if you had to number the cars, with a number on the roof, or on the
keytags, you can choose to number them 0 and 1, or 1 and 2, or 5000 and
5001, if you needed a sequential order.

The tens digit in that column, however, must correspond to the number of
cars in a household, and not to the highest value of whatever numbering
scheme you favour.

>> This is 'measurement'; see above. However my dad always liked to round
>> his age up to the next whole year; most people round down! So the
>> child would be 0 years, but in its first year.
>>
>> However there is not enough resolution using years to accurately
>> measure ages of very young children, so people also use days, weeks
>> and months.
>
> Partials, again. A person doesn't become 1 year old until he reaches 12
> months, for example.

I think it's a mistake to conflate discrete, abstract units with
physical measurement.

If you go back to my fence and fenceposts example, you have N panels and
N+1 posts for a straight fence.

If you number the /posts/ from 0 to N, then the number gives you the
physical distance from the start (in fence panel units).

You wouldn't number the panels to get that information, because it would
be inaccurate; the panels are too wide.

The panels however do correspond to the elements of an array. This is
where I'd number them from 1 (since there is no reason to use 0 or
anything else); you'd probably use 0 for misguided reasons (perhaps too
much time spent coding in C or Python).

> Going back to your point at the beginning, as above the ordinal of
> something is naturally one more than its cardinal number. Our /first/
> year is when we are age zero whole years.

Our age in years is a continuous measure. Usually you specify it as
whole years when it has floor() applied to round it down.

> In the 20th century the
> century portion of the date was 19. Etc.

Yeah, that confuses a lot of people, but not us, right?

>
>   A[1]
>
> as the first element even though lots of people do it. It is, in fact,
> the second. It's probably easiest to refer to it as
>
>   element 1
>
> then the bounds don't matter.

Very often you do need to refer to the first or the last. In a strictly
1-based scheme, they would be A[1] and A[N]; 0-based is A[0] and A[N-1].

X-based (since N is the length) gets ugly, eg. A.[A.lwb] and A.[A.upb]
or A[$].

However it looks you're itching to start your arrays from 0; then just
do so. You don't need an excuse.

I happen to think that 1-based is better:

* It's more intuitive and easier to understand

* It corresponds to how most discrete things are numbered in real life

* If there are N elements, the first is 1, and the last N; there is no
disconnect as there is with 0-based

* It plays well with the rest of a language, so for-loops can go from
1 to N instead of 0 to N-1.

* In N-way select (n | a,b,c |z), then n=1/2/3 selects 1st/2nd/3rd

* If you have a list indexed 1..N, then a search function can return
1..N for success, and 0 for failure. How would it work for 0-based
since 0 could be a valid return value?

* Such a return code will also be True in conditional (if x in A then...)

But despite the advantages, I still use 0-based too; it's just not the
primary choice.

James Harris

Dec 2, 2021, 3:31:31 PM
On 02/12/2021 18:05, Dmitry A. Kazakov wrote:
> On 2021-12-02 18:39, James Harris wrote:
>> On 30/11/2021 09:18, Dmitry A. Kazakov wrote:
>>> On 2021-11-30 09:07, James Harris wrote:
>
>>>> Why not call A[2] element 2?
>>>
>>> Because it would be wrong. In most languages A[2] means the array
>>> element corresponding to the index 2.
>>
>> "Element 2" doesn't mean "second element"
>
> Again, A[2] is the element corresponding to the index 2. Not "element
> 2," not "second element," just an element denoted by the index value 2.
>
>> so why is A[2] not "element 2"
>
> Because "element 2" is undefined, so far.
>
>> just as A["XX"] would be element "XX"?
>
> Nope. It is the element corresponding to the index "XX".

Just as the house corresponding with the number 48a is commonly called
house 48a, then. I am suggesting referring to elements of arrays by
their labels rather than by their positions.

>
>>> Remember, array is a mapping:
>>>
>>>     array : index -> element
>>>
>>> In well-designed languages it is also spelt as a mapping:
>>>
>>>     A(2)
>>
>> This is something I want to come back to elsewhere but since you
>> mention it I'm curious, Dmitry, as to whether you would accept such a
>> mapping returning the address of, as in
>>
>>    A(2) = A(2) + 1
>
> It does not return an address. A(2) denotes the array element corresponding
> to the index 2 on both sides. No addresses at all. A is a mapping, mutable
> in this case. It does not return anything.
>
> You as always confuse implementation with semantics. There is an
> uncountable number of valid implementations of a mapping. The programmer
> does not care most of the time, because he presumes the compiler vendors
> are sane people, until proven otherwise.

Strange. I think it's you who too often conflates implementations with
semantics. But in this case I certainly was referring at least to a
reference or ideal implementation from which information (and other
potential implementations with the same semantics) can be inferred.

But to the point, are you comfortable with the idea of the A(2) in

x = A(2) + 0

meaning the same mapping result as the A(2) in

A(2) = 0

?


--
James Harris

Dmitry A. Kazakov

Dec 2, 2021, 3:49:55 PM
Yes, in both cases the result is the array element corresponding to the
index 2. That is the semantics of A(2).

James Harris

Dec 2, 2021, 4:25:45 PM
On 02/12/2021 20:11, Bart wrote:
> On 02/12/2021 17:29, James Harris wrote:
>> On 30/11/2021 10:28, Bart wrote:
>
>
>> To explain, consider a decimal number and take the units. You may see
>> them as
>>
>>    1, 2, 3, etc
>>
>> but now take the tens position. In those numbers the tens position is
>> zero so they can be seen as (in normal notation, not in C-form octal!)
>>
>>    01, 02, 03, etc
>>
>> Similarly, the number of hundreds in those numbers is also zero, i.e.
>> with three digits they are
>>
>>    001, 002, 003, etc
>>
>> The tens and the hundreds are each subdivided (into smaller elements
>> 1/10th of their value). We have to wait for the units to tick past 9
>> before we add 1 to the tens column, and for the tens column to tick
>> past 9 before we add another hundred. So the mathematically natural
>> indexing for tens and hundreds and all higher digit positions is
>> from zero. It's more consistent, then, to number the units from zero,
>> too, but we often find it natural to count them from 1.
>
> They're not really numbered, they're counted, and the number of tens go
> from 0 to 9 in total.

So (whatever you prefer to call it) do you agree that the number line
has the tens, hundreds, and above starting at zero and increasing to 9?
If so why not apply that to the units digit, too and say the natural
first number is zero?

...

> I guess if you had two cars in your household, you would agree there
> were '2' cars and not '1' (which would confuse everyone, and would mean
> that anyone without a car would have, what, -1 cars? If doesn't work!).
>
> But if you had to number the cars, with a number on the roof, or on the
> keytags, you can choose to number them 0 and 1, or 1 and 2, or 5000 and
> 5001, if you needed a sequential order.

As I said, whole units do not have partial, incomplete phases, and the
cars are whole units.

But if you were putting petrol in one of the cars would you count
yourself as having received a tankful when the first drop went in? No,
where elements are partial we don't count the whole until it is complete.

Similarly, if you sold one of the cars to a friend who was to pay you
£100 a month for it would you count yourself as having received the
payment after the first month? No, this is also partial so you'd count
it at the end.

Ergo it's only for indivisible units that 1-based can possibly be seen
as natural. It's more general, though, to begin counting from zero -
even if it is less familiar.

...

> The panels however do correspond to the elements of an array. This is
> where I'd number them from 1 (since there is no reason to use 0 or
> anything else); you'd probably use 0 for misguided reasons (perhaps too
> much time spent coding in C or Python).

No, I use 0 because it scales better. BTW, it sounds like the posts are
also an array.

...

>> In the 20th century the century portion of the date was 19. Etc.
>
> Yeah, that confuses a lot of people, but not us, right?

But do you see the point of it? The first century /naturally/ had
century number zero, not one, and the N'th century has century number

N - 1

IOW the numbering begins at zero.

That's not a convention, by the way, but how all numbering works: things
with partial phases begin at zero.

>
>>
>>    A[1]
>>
>> as the first element even though lots of people do it. It is, in fact,
>> the second. It's probably easiest to refer to it as
>>
>>    element 1
>>
>> then the bounds don't matter.
>
> Very often you do need to refer to the first or the last. In a strictly
> 1-based scheme, they would be A[1] and A[N]; 0-based is A[0] and A[N-1].
>
> X-based (since N is the length) gets ugly, eg. A.[A.lwb] and A.[A.upb]
> or A[$].
>
> However it looks you're itching to start your arrays from 0; then just
> do so. You don't need an excuse.

I wasn't looking for advice but I thought I'd have a go at challenging
your position and see where the argument led me.

>
> I happen to think that 1-based is better:
>
> * It's more intuitive and easier to understand

It's easier on indivisible elements. That's fine if you only have a
single, simple array. But if you have arrays being processed in nested
loops then it might be best if you didn't count the outer one as
complete until the first set of iterations of the inner one have
finished. That's why I asked you before if you start numbering your
three-digit numbers at 111...!

>
> * It corresponds to how most discrete things are numbered in real life
>
> * If there are N elements, the first is 1, and the last N; there is no
>   dis-connect are there is with 0-based

Yes, you are talking about discrete units which are not made of parts.

>
> * It plays well with the rest of a language, so for-loops can go from
>   1 to N instead of 0 to N-1.
>
> * In N-way select (n | a,b,c |z), then n=1/2/3 selects 1st/2nd/3rd

My version of that selects phrases from zero:

n of (a, b)

If n is zero it will pick a; if one, b. If you treat integers as
booleans (as you do below) then it doubles as a boolean test in the
order false, true - the opposite of C's ?: operator.


>
> * If you have a list indexed 1..N, then a search function can return
>   1..N for success, and 0 for failure. How would it work for 0-based
>   since 0 could be a valid return value?

That's convenient, for sure, but then so is treating the return from
strcmp as a boolean when it is typically a signum or a difference.

As for the alternative, some options: -1, N, exception, designated
default value, boolean instead of index.

>
> * Such a return code will also be True in conditional (if x in A then...)
>
> But despite the advantages, I still use 0-based too; it's just not the
> primary choice.

Sure. For discrete units either will do - and if our programming is
mainly in discrete units then we can become accustomed to thinking
1-based. Yet that begins to run out of steam when processing hierarchies.


--
James Harris

James Harris

Dec 2, 2021, 4:42:33 PM
On 02/12/2021 20:49, Dmitry A. Kazakov wrote:
> On 2021-12-02 21:31, James Harris wrote:

...

>> But to the point, are you comfortable with the idea of the A(2) in
>>
>>    x = A(2) + 0
>>
>> meaning the same mapping result as the A(2) in
>>
>>    A(2) = 0
>>
>> ?
>
> Yes, in both cases the result is the array element corresponding to the
> index 2. That is the semantics of A(2).

Cool. If A were, instead, a function that, say, ended with

return v

then what would you want those A(2)s to mean and should they still mean
the same as each other? The latter expression would look strange to many.

I've been meaning to reply to Charles about the same issue but what you
said reminded me of it.


--
James Harris

Bart

Dec 2, 2021, 5:38:50 PM
On 02/12/2021 21:25, James Harris wrote:
> On 02/12/2021 20:11, Bart wrote:

> As I said, whole units do not have partial, incomplete phases, and the
> cars are whole units.
>
> But if you were putting petrol in one of the cars would you count
> yourself as having received a tankful when the first drop went in? No,
> where elements are partial we don't count the whole until it is complete.
>
> Similarly, if you sold one of the cars to a friend who was to pay you
> £100 a month for it would you count yourself as having received the
> payment after the first month? No, this is also partial so you'd count
> it at the end.
>
> Ergo it's only for indivisible units that 1-based can possibly be seen
> as natural. It's more general, though, to begin counting from zero -
> even if it is less familiar.

Continuous measurements need to start from 0.0.

Discrete entities are counted, starting at 0 for none, then 1 for 1 (see
Xs below).

Some are in-between, where continuous quantities are represented as lots
of small steps. (Example: money in steps of £0.01, or time measured in
whole seconds.)



> ...
>
>> The panels however do correspond to the elements of an array. This is
>> where I'd number them from 1 (since there is no reason to use 0 or
>> anything else); you'd probably use 0 for misguided reasons (perhaps
>> too much time spent coding in C or Python).
>
> No, I use 0 because it scales better. BTW, it sounds like the posts are
> also an array.

The posts don't have a dimension, not abstract ones anyway, and can't
store data. If you were to draw a diagram of bits or bytes or array
elements in memory, they would be the lines separating those elements.


> But do you see the point of it? The first century /naturally/ had
> century number zero, not one, and the N'th century has century number
>
>   N - 1
>
> IOW the numbering begins at zero.

Define what you mean by numbering first.

For me it means assigning sequential integers to a series of entities.
But you need an entity to hang a number from. With no entities, where
are you going to stick that zero?


>>
>> I happen to think that 1-based is better:
>>
>> * It's more intuitive and easier to understand
>
> It's easier on indivisible elements. That's fine if you only have a
> single, simple array. But if you have arrays being processed in nested
> loops then it might be best if you didn't count the outer one as
> complete until the first set of iterations of the inner one have
> finished. That's why I asked you before if you start numbering your
> three-digit numbers at 111...!

If you write a number with the usual decimal notation then a number like:

abc

has the value a*10^2 + b*10^1 + c*10^0.

The value of each of a,b,c is in the range 0..9 inclusive. That's just
how decimal notation works. Each digit represents a count, as I said.

I'm not sure what you're trying to argue here; that because 0 is used to
mean nothing, then that must be the start point for everything?

Here are some sets of Xs increasing in size:

          How many X's?   Numbered as?   Number of the Last?
--------
-               0              -                -
--------
X               1              1                1
--------
X X             2              1 2              2
--------
X X X           3              1 2 3            3
--------

How would /you/ fill in those columns? I'd guess my '1 2 3' becomes '0 1
2', and that that last '3' becomes '2'.

But what about the first '3' on that last line; don't tell me it becomes
'2'! (Because then what happens to the '0'?)

Using your scheme (as I assume it will be), there is too much disconnect:
a '0' in the first row, and two 0s in the second; a '1' in the second, and
two 1s in the third. Everything is out of step!

> Yes, you are talking about discreet units which are not made of parts.

Yes, arrays of elements that are computer data with no physical dimensions.


>> But despite the advantages, I still use 0-based too; it's just not the
>> primary choice.
>
> Sure. For discrete units either will do - and if our programming is
> mainly in discrete units then we can become accustomed to thinking
> 1-based. Yet that begins to run out of steam when processing hierarchies.

My arrays have a general declaration that looks like this:

[A..B]T X # or [A:N] where N is the length (B=A+N-1)

So, an array X of T, indexed from A to B inclusive. Here, whether A is
0, 1 or anything else doesn't come into it.

I just need to be aware of it so that I don't assume a specific lower
bound. (But usually I will know when A is 1 so I can take advantage.)

David Brown

Dec 2, 2021, 7:08:28 PM
On 02/12/2021 22:25, James Harris wrote:
> On 02/12/2021 20:11, Bart wrote:
>> On 02/12/2021 17:29, James Harris wrote:

>>> In the 20th century the century portion of the date was 19. Etc.
>>
>> Yeah, that confuses a lot of people, but not us, right?
>
> But do you see the point of it? The first century /naturally/ had
> century number zero, not one, and the N'th century has century number
>
>   N - 1
>
> IOW the numbering begins at zero.
>
> That's not a convention, by the way, but how all numbering works: things
> with partial phases begin at zero.
>
Note, however, that the first century began with year 1 AD (or 1 CE, if
you prefer). The preceding year was 1 BC. There was no year 0. This
means the first century was the years 1 to 100 inclusive.

It really annoyed me that everyone wanted to celebrate the new
millennium on 01.01.2000, when in fact it did not begin until 01.01.2001.

It would have been so much simpler, and fitted people's expectations
better, if years have been numbered from 0 onwards instead of starting
counting at 1.

Bart

Dec 2, 2021, 8:42:26 PM
On 03/12/2021 00:08, David Brown wrote:
> On 02/12/2021 22:25, James Harris wrote:
>> On 02/12/2021 20:11, Bart wrote:
>>> On 02/12/2021 17:29, James Harris wrote:
>
>>>> In the 20th century the century portion of the date was 19. Etc.
>>>
>>> Yeah, that confuses a lot of people, but not us, right?
>>
>> But do you see the point of it? The first century /naturally/ had
>> century number zero, not one, and the N'th century has century number
>>
>>   N - 1
>>
>> IOW the numbering begins at zero.
>>
>> That's not a convention, by the way, but how all numbering works: things
>> with partial phases begin at zero.
>>
> Note, however, that the first century began with year 1 AD (or 1 CE, if
> you prefer). The preceding year was 1 BC. There was no year 0. This
> means the first century was the years 1 to 100 inclusive.

So -1 was followed by +1?


> It really annoyed me that everyone wanted to celebrate the new
> millennium on 01.01.2000, when in fact it did not begin until 01.01.2001.

> It would have been so much simpler, and fitted people's expectations
> better, if years have been numbered from 0 onwards instead of starting
> counting at 1.

I'm sure we can all pretend that the start point was the year before 1
AD, which can be an honorary year 0.

It must have been a big deal at 24:00 on 31-12-999 when a new century
began, from the 10th to the 11th, and the century number changed not
only from 9 to 10, but from 1 digit to 2 digits.

Then probably some spoilsport came along and said it didn't count, they
were still in the same century really, despite that '10' in the year,
and they'd have to wait until midnight on 31-12-1000.


David Brown

Dec 3, 2021, 2:31:44 AM
On 03/12/2021 02:42, Bart wrote:
> On 03/12/2021 00:08, David Brown wrote:
>> On 02/12/2021 22:25, James Harris wrote:
>>> On 02/12/2021 20:11, Bart wrote:
>>>> On 02/12/2021 17:29, James Harris wrote:
>>
>>>>> In the 20th century the century portion of the date was 19. Etc.
>>>>
>>>> Yeah, that confuses a lot of people, but not us, right?
>>>
>>> But do you see the point of it? The first century /naturally/ had
>>> century number zero, not one, and the N'th century has century number
>>>
>>>    N - 1
>>>
>>> IOW the numbering begins at zero.
>>>
>>> That's not a convention, by the way, but how all numbering works: things
>>> with partial phases begin at zero.
>>>
>> Note, however, that the first century began with year 1 AD (or 1 CE, if
>> you prefer).  The preceding year was 1 BC.  There was no year 0.  This
>> means the first century was the years 1 to 100 inclusive.
>
> So -1 was followed by +1?

Yes. Although of course the idea of AD and BC numbering was developed
long afterwards. The people living in 1 BC didn't know their year was
called 1 BC :-)

>
>
>> It really annoyed me that everyone wanted to celebrate the new
>> millennium on 01.01.2000, when in fact it did not begin until 01.01.2001.
>
>> It would have been so much simpler, and fitted people's expectations
>> better, if years have been numbered from 0 onwards instead of starting
>> counting at 1.
>
> I'm sure we can all pretend that the start point was the year before 1
> AD, which can be an honorary year 0.

It's difficult to change that now. But it rarely makes a big
difference, since many BC dates are only known approximately anyway.

>
> It must have been a big deal on 24:00 on 31-12-999 when not only a new
> century began, from the 10th to the 11th, and the century year changed
> not only from 9 to 10, but from 1 digit to 2 digits.
>

The new century didn't begin until 1001, but the year number got an
extra digit.

And at that time, day was from the first hour starting about dawn - what
we call 06:00 - until the twelfth hour about sunset - what we call
18:00. The length of hours in the day and the night depended on the
time of year. They were really only tracked by monasteries, where they
had their obsession about prayers and masses at different times. For
example, they needed to know when the ninth hour was (about 15:00 modern
timing) for their "noon" prayers.

It's easy to assume that people saw the change to year 1000 (or 1001) as
a big thing or perhaps the time for the "second coming" or apocalypse,
but from the records we have, it does not seem to be the case. (I'm
talking about the UK and Europe here - folks like the Mayans and Chinese
always loved a really big party at calendar rollovers.) We have banking
records of people taking out 10 year loans in 998, for example, without
any indication that it was unusual.

> Then probably some spoilsport came along and said it didn't count, they
> were still in the same century really, despite that '10' in the year,
> and they'd have to wait until midnight on 31-12-1000.
>

I doubt if anyone listened to them. No one listened to me at 01.01.2000 :-(

Dmitry A. Kazakov

Dec 3, 2021, 2:41:26 AM
On 2021-12-02 22:42, James Harris wrote:
> On 02/12/2021 20:49, Dmitry A. Kazakov wrote:
>> On 2021-12-02 21:31, James Harris wrote:
>
> ...
>
>>> But to the point, are you comfortable with the idea of the A(2) in
>>>
>>>    x = A(2) + 0
>>>
>>> meaning the same mapping result as the A(2) in
>>>
>>>    A(2) = 0
>>>
>>> ?
>>
>> Yes, in both cases the result is the array element corresponding to
>> the index 2. That is the semantics of A(2).
>
> Cool. If A were, instead, a function that, say, ended with
>
>   return v

PL/I had those, if I remember correctly. But no, it is not a function.
If you want to go for fully abstract array types, it is a procedure (and
a method):

procedure Setter (A : in out Array; I : Index; E : Element)

So

A(2) = 0

must compile into

Setter (A, 2, 0) or A.Setter (2, 0)

whatever notation you prefer.

The other one is a Getter:

function Getter (A : Array; I : Index) return Element;

> then what would you want those A(2)s to mean and should they still mean
> the same as each other?

Yes, they always mean same.

In a language that does not have abstract arrays a programmer might
implement that using helper types decomposing it in the [wrong way] you
suggested. That would involve all sorts of helper types having
referential semantics, smart pointers, etc. Unfortunately such
abstractions leak, producing quite a mess of unreadable error messages.
You asked about methods and free functions. One of the leakage points is
that these helper types are unrelated and the operations on them might
not be fully visible in some contexts, etc. This is why it is better to
provide abstract arrays on the language level.

David Brown

Dec 3, 2021, 4:08:52 AM
On 02/12/2021 22:42, James Harris wrote:
> On 02/12/2021 20:49, Dmitry A. Kazakov wrote:
>> On 2021-12-02 21:31, James Harris wrote:
>
> ...
>
>>> But to the point, are you comfortable with the idea of the A(2) in
>>>
>>>    x = A(2) + 0
>>>
>>> meaning the same mapping result as the A(2) in
>>>
>>>    A(2) = 0
>>>
>>> ?
>>
>> Yes, in both cases the result is the array element corresponding to
>> the index 2. That is the semantics of A(2).
>
> Cool. If A were, instead, a function that, say, ended with
>
>   return v
>
> then what would you want those A(2)s to mean and should they still mean
> the same as each other? The latter expression would look strange to many.
>

Do you mean like returning a reference in C++ style?


int a[10];

void foo1(int i, int x) {
    a[i] = x;
}

int& A(int i) {
    return a[i];
}

void foo2(int i, int x) {
    A(i) = x;
}

foo1 and foo2 do the same thing, and have the same code. Of course,
foo2 could add range checking, or offsets (for 1-based array), or have
multiple parameters for multi-dimensional arrays, etc. And in practice
you'd make such functions methods of a class so that the class owns the
data, rather than having a single global source of the data.

Andy Walker

Dec 3, 2021, 7:24:50 AM
On 02/12/2021 07:37, David Brown wrote:
>>> [...] But there is no doubt that the numbers
>>> generated by the Peano axioms start at 0.
>>     When Peano first wrote his axioms, he started at 1.  Later
>> he wrote a version starting at 0.  The foundational maths books on
>> my shelves, even modern ones, are split;  it really matters very
>> little.
> It matters a lot once you get into the arithmetic - 0 is the additive
> identity.

"Additive identity" is meaningless before you have defined
addition; and once you have got to anywhere interesting, "0" is
defined anyway. It's really not important, except when you get to
rationals [when 1-based is better, as you don't have to make a
special case for when the denominator is zero].

> I suppose you /could/ define addition with the starting point
> "a + 1 = succ(a)" rather than "a + 0 = a", but it is all much easier and
> neater when you start with 0.

I don't see "a + 1 == a'" as interestingly harder than
"a + 0 = a". In some ways it's easier; if we abbreviate [eg]
3' [or succ(3)] as 4, in the usual way, then 1-based has [eg]

3+4 = (3+3)' = ((3+2)')' = (((3+1)')')' = 4''' = 5'' = 6' = 7,

whereas 0-based has

3+4 = (3+3)' = ((3+2)')' = (((3+1)')')' = ((((3+0)')')')' =
3'''' = 4''' = 5'' = 6' = 7,

and you have two extra steps with every addition.

> That is certainly how I learned it at
> university, and how I have seen it a few other places - but while I
> think I have a couple of books covering them, they are buried in the
> attic somewhere.

Fine; as I said, books vary, and it's at the whim of
the lecturer [if any -- it's commonly not taught at all, except
at the level of "if you really want to know how all this stuff
gets defined, look at (some book -- Landau in my case)"].

[...]
>>> I am quite confident that the idea of starting array indexes from 0 had
>>> nothing to do with surreals.
>>     Surreal numbers were an example;  they are part of the
>> explanation for mathematics also tending to become zero-based.
> Really? Again, I would suggest that they are far too esoteric for the
> purpose.

Again, I would repeat that they were an /example/ of the
way that /mathematics/, predominantly 1-based, has /tended/ to
become 0-based. That's not a /purpose/; it just so happens that
some relatively recent maths has found uses where 0-based seems
more natural than 1-based. There are still plenty where 1-based
remains more usual/natural.

> Constructions of surreal numbers will normally start with 0 -
> but so will constructions of other more familiar types, such as
> integers, reals, ordinals, cardinals, and almost any other numbers.

You're assuming the answer! As above, you can equally
get to integers [and so rationals and reals] from 1.

> Maybe it is just that with surreals, few people ever have much idea of
> what they are, or get beyond reading how they are constructed! (Some
> day I must get the book on them - it was Conway that developed them, and
> Knuth that wrote the book, right?)

Knuth wrote /a/ book on them; /the/ book is Conway's "On
Numbers and Games", but a more accessible version is "Winning Ways"
by Berlekamp, Conway and Guy [all three of whom, sadly, died within
a year and two days in 2019-20]; expensive to buy, but there is a
PDF freely available online. What most people don't realise is the
motivation: Conway couldn't see /why/ the step from rationals to
reals is so difficult. We define naturals eg by Peano, then get
by equivalence classes to integers and rationals, and then ...?
The usual constructions of reals seem so artificial, and not at
all related to what happens earlier. So Conway wondered what
would happen if we went the other way -- start from the concept
of a Dedekind section, forget that it relies on knowing about
the rationals, and just build on what we know. Thus we get the
idea of partitioning whatever numbers we know into two sets.
That is how we build the surreals, without exceptions or special
cases. Oh, we also get [combinatorial] games as a side-effect;
which is where it gets interesting to people like me, and to CS
more generally, and why it's not as esoteric as people think.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Boccherini

David Brown

Dec 3, 2021, 9:38:14 AM
On 03/12/2021 13:24, Andy Walker wrote:
> On 02/12/2021 07:37, David Brown wrote:
>>>> [...] But there is no doubt that the numbers
>>>> generated by the Peano axioms start at 0.
>>>      When Peano first wrote his axioms, he started at 1.  Later
>>> he wrote a version starting at 0.  The foundational maths books on
>>> my shelves, even modern ones, are split;  it really matters very
>>> little.
>> It matters a lot once you get into the arithmetic - 0 is the additive
>> identity.
>
>     "Additive identity" is meaningless before you have defined
> addition;  and once you have got to anywhere interesting, "0" is
> defined anyway.  It's really not important, except when you get to
> rationals [when 1-based is better, as you don't have to make a
> special case for when the denominator is zero].
>

Rationals are easier when you have 0 :

ℚ = { p / q : p, q ∈ ℤ, q > 0 }

vs.

ℚ = { p / q : p, q ∈ ℕ⁺ } ∪ { 0 } ∪ { -p / q : p, q ∈ ℕ⁺ }


>>          I suppose you /could/ define addition with the starting point
>> "a + 1 = succ(a)" rather than "a + 0 = a", but it is all much easier and
>> neater when you start with 0.
>
>     I don't see "a + 1 == a'" as interestingly harder than
> "a + 0 = a".  In some ways it's easier;  if we abbreviate [eg]
> 3' [or succ(3)] as 4, in the usual way, then 1-based has [eg]
>
>   3+4 = (3+3)' = ((3+2)')' = (((3+1)')')' = 4''' = 5'' = 6' = 7,
>
> whereas 0-based has
>
>   3+4 = (3+3)' = ((3+2)')' = (((3+1)')')' = ((((3+0)')')')' =
>     3'''' = 4''' = 5'' = 6' = 7,
>
> and you have two extra steps with every addition.

The number of steps doesn't matter much - we are not looking for an
efficient way of adding up! But calling the additive identity "0" or
"1" /does/ matter, because one choice makes sense and the other choice
is pointlessly confusing. (The definition of addition does not actually
require an identity.)

I can happily agree that you /can/ define Peano numbers starting with 1,
and I am sure some people think that is better. But personally I think
there are good reasons why 0 is the more common starting point (as far
as I could see from a statistically invalid and non-scientific google
search) - the big hint comes from Peano himself who started with 1, then
changed his mind and started with 0.

>
>>                   That is certainly how I learned it at
>> university, and how I have seen it a few other places - but while I
>> think I have a couple of books covering them, they are buried in the
>> attic somewhere.
>
>     Fine;  as I said, books vary, and it's at the whim of
> the lecturer [if any -- it's commonly not taught at all, except
> at the level of "if you really want to know how all this stuff
> gets defined, look at (some book -- Landau in my case)"].

We were taught them as 0-based in pure maths. And in computing, we
constructed them from 0 in a Haskell-like functional programming
language as a practical exercise.

>
> [...]
>>>> I am quite confident that the idea of starting array indexes from 0 had
>>>> nothing to do with surreals.
>>>      Surreal numbers were an example;  they are part of the
>>> explanation for mathematics also tending to become zero-based.
>> Really?  Again, I would suggest that they are far too esoteric for the
>> purpose.
>
>     Again, I would repeat that they were an /example/ of the
> way that /mathematics/, predominantly 1-based, has /tended/ to
> become 0-based.  That's not a /purpose/;  it just so happens that
> some relatively recent maths has found uses where 0-based seems
> more natural than 1-based.  There are still plenty where 1-based
> remains more usual/natural.

I was referring to /your/ purpose in picking the surreals as an example.
(It's not that I don't find surreals interesting, it's just that most
people have probably never heard of them. Mind you, this thread has
given me some information about them that I didn't know, so thanks for
that anyway!)

>
>>         Constructions of surreal numbers will normally start with 0 -
>> but so will constructions of other more familiar types, such as
>> integers, reals, ordinals, cardinals, and almost any other numbers.
>
>     You're assuming the answer!  As above, you can equally
> get to integers [and so rationals and reals] from 1.
>

Of course you can always start from 1. In my experience (as someone
with a degree in mathematics and theoretical computing, but not having
worked in a mathematical profession) it is usually simpler and easier,
and more common, to start from 0.

>> Maybe it is just that with surreals, few people ever have much idea of
>> what they are, or get beyond reading how they are constructed!  (Some
>> day I must get the book on them - it was Conway that developed them, and
>> Knuth that wrote the book, right?)
>
>     Knuth wrote /a/ book on them;  /the/ book is Conway's "On
> Numbers and Games", but a more accessible version is "Winning Ways"
> by Berlekamp, Conway and Guy [all three of whom, sadly, died within
> a year and two days in 2019-20];  expensive to buy, but there is a
> PDF freely available online. 

I'll have a look for that - thanks.

(It was sad that the inventor of "life" died of Covid.)

> What most people don't realise is the
> motivation:  Conway couldn't see /why/ the step from rationals to
> reals is so difficult.  We define naturals eg by Peano, then get
> by equivalence classes to integers and rationals, and then ...?
> The usual constructions of reals seem so artificial, and not at
> all related to what happens earlier.  So Conway wondered what
> would happen if we went the other way -- start from the concept
> of a Dedekind section, forget that it relies on knowing about
> the rationals, and just build on what we know.  Thus we get the
> idea of partitioning whatever numbers we know into two sets.
> That is how we build the surreals, without exceptions  or special
> cases.

That is interesting. I had thought of surreals as trying to find gaps
and endpoints in the reals and filling them in (though I know the
construction doesn't do that).

I wonder if the lengths in TeX, which have 3 (IIRC) layers of infinities
and infinitesimals, were invented as a kind of computer approximation to
surreals?

>  Oh, we also get [combinatorial] games as a side-effect;
> which is where it gets interesting to people like me, and to CS
> more generally, and why it's not as esoteric as people think.
>

All sorts of maths that is esoteric to most people has some odd
application or two. That is often what makes it interesting.

Andy Walker

Dec 3, 2021, 11:26:42 AM
On 03/12/2021 14:38, David Brown wrote:
> Rationals are easier when you have 0 :
> ℚ = { p / q : p, q ∈ ℤ, q > 0 }
> vs.
> ℚ = { p / q : p, q ∈ ℕ⁺ } ∪ { 0 } ∪ { -p / q : p, q ∈ ℕ⁺ }

Or vs ℚ = { p, q : p ∈ ℤ, q ∈ ℕ }.

> [...] But calling the additive identity "0" or
> "1" /does/ matter, because one choice makes sense and the other choice
> is pointlessly confusing. (The definition of addition does not actually
> require an identity.)

No-one calls the additive identity 1, esp as it's not even
useful until you get to much more advanced maths, by which time you
already have 0 [part of ℤ, defined by equivalence classes of pairs
of members of ℕ].

> [...] In my experience (as someone
> with a degree in mathematics and theoretical computing, [...].

No such thing when I was a student!

> I had thought of surreals as trying to find gaps
> and endpoints in the reals and filling them in (though I know the
> construction doesn't do that).

No, rather that "filling in gaps" [ie, partitioning known
numbers] produces the reals and more. If the partitioning is
ordered, you get numbers; if unordered, you get games [of which
numbers are therefore a subset].

> I wonder if the lengths in TeX, which have 3 (IIRC) layers of infinities
> and infinitesimals, were invented as a kind of computer approximation to
> surreals?

Pass. I've never used [and don't like] TeX; we had real
experts in typography in my dept [consultants for major academic
publishers, exam boards, etc], and some of that rubbed off. They
disliked the "look" of Knuth's books, and even more so that we
kept being told that TeX knows best. They devoted their time to
tweaking Troff, which takes a more pragmatic view [and some of
the tweaks found their way back into "official" Troff]. I also
quite like Lout, FWIW.

[...]
> All sorts of maths that is esoteric to most people has some odd
> application or two. That is often what makes it interesting.

/All/ maths is esoteric to most people! A large majority
don't even know that there is maths beyond arithmetic, apart from
the algebra [etc] that never made any sense at all to them.

David Brown

Dec 3, 2021, 12:49:28 PM
On 03/12/2021 17:26, Andy Walker wrote:
> On 03/12/2021 14:38, David Brown wrote:
>> Rationals are easier when you have 0 :
>>      ℚ = { p / q : p, q ∈ ℤ, q > 0 }
>> vs.
>>      ℚ = { p / q : p, q ∈ ℕ⁺  } ∪ { 0 } ∪ { -p / q : p, q ∈ ℕ⁺  }
>
>     Or vs ℚ = { p, q : p ∈ ℤ, q ∈ ℕ }.
>

ℕ is ambiguous - you need to write something like ℕ⁺ unless it is clear
from earlier. While it is quite easy to write ℕ⁺, there is no good
argument for suggesting it is noticeably simpler than using ℤ, nor any
special case handling for 0.

>> [...] But calling the additive identity "0" or
>> "1" /does/ matter, because one choice makes sense and the other choice
>> is pointlessly confusing.  (The definition of addition does not actually
>> require an identity.)
>
>     No-one calls the additive identity 1, esp as it's not even
> useful until you get to much more advanced maths, by which time you
> already have 0 [part of ℤ, defined by equivalence classes of pairs
> of members of ℕ].

You wrote:

I don't see "a + 1 == a'" as interestingly harder than
"a + 0 = a". In some ways it's easier;


Presumably then that was a typo. (Fair enough, we all do that.)

>
>> [...] In my experience (as someone
>> with a degree in mathematics and theoretical computing, [...].
>
>     No such thing when I was a student!
>
>> I had thought of surreals as trying to find gaps
>> and endpoints in the reals and filling them in (though I know the
>> construction doesn't do that).
>
>     No, rather that "filling in gaps" [ie, partitioning known
> numbers] produces the reals and more.  If the partitioning is
> ordered, you get numbers;  if unordered, you get games [of which
> numbers are therefore a subset].
>
>> I wonder if the lengths in TeX, which have 3 (IIRC) layers of infinities
>> and infinitesimals, were invented as a kind of computer approximation to
>> surreals?
>
>     Pass.  I've never used [and don't like] TeX;  we had real
> experts in typography in my dept [consultants for major academic
> publishers, exam boards, etc], and some of that rubbed off.  They
> disliked the "look" of Knuth's books, and even more so that we
> kept being told that TeX knows best.  They devoted their time to
> tweaking Troff, which takes a more pragmatic view [and some of
> the tweaks found their way back into "official" Troff].  I also
> quite like Lout, FWIW.

I learned my typography from the TeXBook and the LaTeX Documentation
Preparation System. Knuth's early books had ugly typography - that's
why he made TeX. But of course there's a lot of scope for subjective
choice in the subject, and for picking formats, layouts, etc., that suit
your own needs. LaTeX and friends take a bit of work to use well, but I
find it is worth it. (That is especially for most people that can't
afford in-house typographers, so the alternative is LibreOffice or,
horrors, Word.)

>
> [...]
>> All sorts of maths that is esoteric to most people has some odd
>> application or two.  That is often what makes it interesting.
>
>     /All/ maths is esoteric to most people!  A large majority
> don't even know that there is maths beyond arithmetic, apart from
> the algebra [etc] that never made any sense at all to them.
>

Touché.

Andy Walker

Dec 3, 2021, 5:03:19 PM
On 03/12/2021 17:49, David Brown wrote:
[Definition of rationals:]
> ℕ is ambiguous - you need to write something like ℕ⁺ unless it is clear
> from earlier.

It was clear in context! But ℕ⁺ is still a more primitive
concept than ℤ.

> While it is quite easy to write ℕ⁺, there is no good
> argument for suggesting it is noticeably simpler than using ℤ, nor any
> special case handling for 0.

My version didn't have any special casing for 0. Note that
according to the usual development you need four members of ℕ to
construct two members of ℤ [as two equivalence classes of members
of ℕ], of which one is logically redundant.

> You wrote:
> I don't see "a + 1 == a'" as interestingly harder than
> "a + 0 = a". In some ways it's easier;
> Presumably then that was a typo. (Fair enough, we all do that.)

If it was, I still don't see it. Did you miss the "'"
[usual abbreviation for the successor function]?

Recap:
ANW: ℕ is 1, 1', 1'', ...; "+" is defined by "a + 1 == a'"
and "a + b' = (a+b)'"; "ℤ" by equivalence classes on
pairs of members of "ℕ"; "ℚ" as a triple of members of
"ℕ" [equivalently, a member of "ℤ" and a member of "ℕ"].
DB: ℕ is 0, 0', 0'', ...; "+" is defined by "a + 0 == a"
and "a + b' = (a+b)'"; "ℤ" by equivalence classes on
pairs of members of "ℕ"; "ℚ" as a pair of members of
"ℤ" of which the second is > 0 [equivalently, a member
of "ℤ" and a member of "ℕ⁺"].
I don't see an interesting difference in difficulty, nor any
good reason other than fashion to choose one over the other.

Further recap: this started with whether Bart counted "1, 2,
3, ..." or "0, 1, 2, ...". IRL virtually everyone is first
taught to count "1, 2, 3, ...". Everything else comes later
[if at all].

I doubt whether I have anything further to contribute
to this thread, which is now diverging a long way from CS.

luserdroog

Dec 4, 2021, 12:50:36 AM
On Thursday, December 2, 2021 at 1:37:33 AM UTC-6, David Brown wrote:
> On 02/12/2021 01:11, Andy Walker wrote:

> > [...]
> >> Negative numbers long pre-date the general acceptance of 0 as a
> >> "number". They were used in accountancy, as well as by a few
> >> mathematicians. But their general use, especially in Europe, came a
> >> lot later.
> >
> > My impression is that accountants used red ink rather than
> > negative numbers. As late as the 1970s, hand/electric calculators
> > still used red numerals rather than a minus sign.
> >
> Many conventions have been used, in different countries, times, and
> cultures. "Red ink" is certainly a well-known phrase in modern
> English-speaking countries. But brackets, minus signs, and other
> methods are used. Go far enough back and people didn't write with ink
> at all.
>

Dang. Y'all giving me ideas.

--
better watch out

David Brown

Dec 4, 2021, 6:29:46 AM
On 03/12/2021 23:03, Andy Walker wrote:
> On 03/12/2021 17:49, David Brown wrote:
> [Definition of rationals:]
>> ℕ is ambiguous - you need to write something like ℕ⁺ unless it is clear
>> from earlier.
>
>     It was clear in context!  But ℕ⁺ is still a more primitive
> concept than ℤ.
>

You can't say that your definition of ℕ in your definition of ℚ is clear
in the context - that would be begging the question. Obviously I know
what you meant by ℕ because I know how the rationals are defined. But
if you are giving the definition of something, you can't force readers
to assume the definition you want in order to figure out the meaning of
the things you use in your definition.

I can agree that the naturals - with or without 0 - are more primitive
than the integers. I don't see an advantage in that. Mathematics is
about building up concepts step by step - once you have a concept, you
use it freely for the next step. There is no benefit in going back to
more primitive stages.

>>         While it is quite easy to write ℕ⁺, there is no good
>> argument for suggesting it is noticeably simpler than using ℤ, nor any
>> special case handling for 0.
>
>     My version didn't have any special casing for 0.  Note that
> according to the usual development you need four members of ℕ to
> construct two members of ℤ [as two equivalence classes of members
> of ℕ], of which one is logically redundant.
>
>> You wrote:
>>      I don't see "a + 1 == a'" as interestingly harder than
>> "a + 0 = a".  In some ways it's easier;
>> Presumably then that was a typo.  (Fair enough, we all do that.)
>
>     If it was, I still don't see it.  Did you miss the "'"
> [usual abbreviation for the successor function]?
>

Ah, yes, I /did/ miss that. Sorry. A tick mark as a successor
indicator is fine on paper, but I find it can easily be missed in emails
or Usenet where typography is limited. With that sorted out, I agree
with you - it is entirely possible to start your addition recursive
definition with 1 here. I can't see a benefit from starting with 1, and
the 0 will come in very handy later on, but they are both valid
alternatives.

>
>     I doubt whether I have anything further to contribute
> to this thread, which is now diverging a long way from CS.
>

I enjoy a little side-tracking sometimes, but it is a bit off-topic for
the thread and the group!

Rod Pemberton

Dec 6, 2021, 5:39:31 AM
On Tue, 30 Nov 2021 08:07:30 +0000
James Harris <james.h...@gmail.com> wrote:

> From another thread, discussion between David and Bart:

> D> But if you have just one starting point, 0 is the sensible one.
> D> You might not like the way C handles arrays (and I'm not going to
> D> argue about it - it certainly has its cons as well as its pros),
> D> but even you would have to agree that defining "A[i]" to be the
> D> element at "address of A + i * the size of the elements" is neater
> D> and clearer than one-based indexing.
>
> B> That's a crude way of defining arrays. A[i] is simply the i'th
> B> element of N slots, you don't need to bring offsets into it.
>
> Why call it 'i'th? I know people do but wouldn't it be easier to call
> it 'element n' where n is its index? Then that would work with any
> basing.
>

'n'th, 'i'th, 'x'th, ...

Does the letter choice matter?

Why wouldn't you say the 10th or 7th or 1st etc by using an actual
number to specify a specific item? Although I do like variables, this
isn't algebra.

> 'element n'

Chemistry ... Obtuse?

Well, "item" is shorter than "element". So, that's one up vote for
"item" ...

> B> With 0-based, there's a disconnect between the ordinal number of
> B> the element you want, and the index that needs to be used. So A[2]
> B> for the 3rd element.
>
> Why not call A[2] element 2?
>
> BTW, Bart, do you consider the first ten numbers as 1 to 10 rather
> than 0 to 9?

IMO, irrelevant.

As he has his choice of either with 0-based indexing.

If he wants the first ten elements of an array to be indexed by 0 to 9,
he can use 0 to 9.

If he wants the first ten elements of an array to be indexed by 1 to
10, he can use 1 to 10, skip using 0, but he needs to remember
to allocate one additional element.

> [What does Rod prefer?]
> (Let's hope I'm not talking to myself now.)

Oh boy, you didn't ask me that ...
(Oops, it seems that I am talking to myself now.)


Maybe I've been coding in C for too long now, as I prefer zero-based
indexing, even in non-C situations like personal to-do lists.

0-based works really well with C, but I do recall it being somewhat
"unnatural" at first, even though zero was the central starting point of
the signed number line in mathematics. For programming, I strongly
prefer unsigned only. This eliminates many coding errors.

E.g., for C, I know that the address of the 0'th (zeroth) element of an
"array" (&A[0]) is the same as the base address of the said "array"
named A. This can be convenient as no address calculation needs to be
computed because there is no actual indexing into the array when the
index is zero.

E.g., for C, you can use neat 0-based "tricks" like the loop below to
detect loop termination. I.e., the value 10 (MAX) marks the end of the
loop after the 10 values of variable "i", with 0..9 being the values
actually printed:

#define MAX 10
for(i=0;i!=MAX;i++)


And, if the language is designed correctly and the variable is declared
with file scope, using zero-based indexing means that variables don't
need to be initialized to zero. They'll be cleared to zero upon
program execution. I.e., the "i=0" above should be optional in a
language for all declared file scope variables, since such variables
should be initialized to or cleared to zero by default. E.g., a C
compiler may warn if "i=0" initialization is missing when "i" is
declared with auto or local scope (within a procedure), but C compilers
will generally not warn when "i" is declared with file or global scope
(or local static) as they are initialized to zero due to BSS.


--
"If Britain were to join the United States, it would be the
second-poorest state, behind Alabama and ahead of Mississippi,"
Hunter Schwarz, Washington Post

Bart

Dec 6, 2021, 7:24:20 AM
On 06/12/2021 10:41, Rod Pemberton wrote:
> On Tue, 30 Nov 2021 08:07:30 +0000
> James Harris <james.h...@gmail.com> wrote:

>> [What does Rod prefer?]
>> (Let's hope I'm not talking to myself now.)

> 0-based works really well with C,

Well, C invented it. (If it didn't, then it made it famous.)

> but I do recall it being somewhat
> "unnatural" at first, even though zero was the central starting point of
> the signed number line in mathematics.

Yeah, this is my fence/fencepost distinction.

If you draw XY axes on squared paper, and start annotating the positive
X axis as 0 (at Y-axis), 1, 2, 3 ..., then those figures will mark the
vertical divisions between the squares.

NOT the squares themselves. The squares are what correspond to C array
elements.

> For programming, I strongly
> prefer unsigned only. This eliminates many coding errors.

OK, so we can forget about that negative X axis!


> E.g., for C, you can do neat 0-based "tricks" like below for loops to
> detect loop termination. I.e., the value of 10 below as MAX detects the
> end-of-loop of the 10 values of variable "i" with 0..9 being the actual
> printed values:
>
> #define MAX 10
> for(i=0;i!=MAX;i++)
>
>
> And, if the language is designed correctly and the variable is declared
> with file scope, using zero-based indexing means that variables don't
> need to be initialized to zero. They'll be cleared to zero upon
> program execution. I.e., the "i=0" above should be optional in a
> language for all declared file scope variables, since such variables
> should be initialized to or cleared to zero by default.

That's great. Until the second time you execute this loop:

for(; i!=MAX; i++)

since now i will have value 10 (or whatever it ended up as after 1000
more lines of code, being a file-scope variable visible from any
function). Or when you execute this separate loop further on:

for(; i<N; i++)

David Brown

Dec 6, 2021, 8:56:43 AM
On 06/12/2021 11:41, Rod Pemberton wrote:

>
> Maybe I've been coding in C for too long now, as I prefer zero-based
> indexing, even in non-C situations like personal to-do lists.
>
> 0-based works really well with C, but I do recall it being somewhat
> "unnatural" at first, even though zero was the central starting point of
> the signed number line in mathematics. For programming, I strongly
> prefer unsigned only. This eliminates many coding errors.

Out of curiosity, what kinds of coding errors do you eliminate by using
unsigned types?


>
> E.g., for C, I know that the address of the 0'th (zeroth) element of an
> "array" (&A[0]) is the same as the base address of the said "array"
> named A. This can be convenient as no address calculation needs to be
> computed because there is no actual indexing into the array when the
> index is zero.
>
> E.g., for C, you can do neat 0-based "tricks" like below for loops to
> detect loop termination. I.e., the value of 10 below as MAX detects the
> end-of-loop of the 10 values of variable "i" with 0..9 being the actual
> printed values:
>
> #define MAX 10
> for(i=0;i!=MAX;i++)
>
>
> And, if the language is designed correctly and the variable is declared
> with file scope, using zero-based indexing means that variables don't
> need to be initialized to zero. They'll be cleared to zero upon
> program execution. I.e., the "i=0" above should be optional in a
> language for all declared file scope variables, since such variables
> should be initialized to or cleared to zero by default. E.g., a C
> compiler may warn if "i=0" initialization is missing when "i" is
> declared with auto or local scope (within a procedure), but C compilers
> will generally not warn when "i" is declared with file or global scope
> (or local static) as they are initialized to zero due to BSS.
>

Loop variables are almost always local, rather than file scope (or other
static lifetime). Indeed, they are normally local to the loop itself.
The idiomatic for loop in C is :

for (int i = 0; i < MAX; i++) { ... }

Rod Pemberton

Dec 7, 2021, 7:48:43 AM
On Mon, 6 Dec 2021 12:24:18 +0000
Bart <b...@freeuk.com> wrote:

> On 06/12/2021 10:41, Rod Pemberton wrote:
> > On Tue, 30 Nov 2021 08:07:30 +0000
> > James Harris <james.h...@gmail.com> wrote:

> >> [What does Rod prefer?]
> >> (Let's hope I'm not talking to myself now.)
>
> > 0-based works really well with C,
>
> Well, C invented it. (If it didn't, then it made it famous.)
>
> > but I do recall it being somewhat
> > "unnatural" at first, even though zero was the central starting
> > point of the signed number line in mathematics.
>
> Yeah, this is my fence/fencepost distinction.
>
> If you draw XY axes on squared paper, and start annotating the
> positive X axis as 0 (at Y-axis), 1, 2, 3 ..., then those figures
> will mark the vertical divisions between the squares.
>
> NOT the squares themselves. The squares are what correspond to C
> array elements.

Well, they can represent the squares too along a single axis, either X
or Y. I.e., skip zero, use the other values.

This won't work for a 2-dimensional grid, but the same is true of
arrays, e.g., A[2][3]. I.e., what do you call the 3rd square up in
the 2nd column?

> > For programming, I strongly
> > prefer unsigned only. This eliminates many coding errors.
>
> OK, so we can forget about that negative X axis!
>

and that negative Y axis too!

Yeah, we're down to one quadrant instead of four!

> > E.g., for C, you can do neat 0-based "tricks" like below for loops
> > to detect loop termination. I.e., the value of 10 below as MAX
> > detects the end-of-loop of the 10 values of variable "i" with 0..9
> > being the actual printed values:
> >
> > #define MAX 10
> > for(i=0;i!=MAX;i++)
> >
> >
> > And, if the language is designed correctly and the variable is
> > declared with file scope, using zero-based indexing means that
> > variables don't need to be initialized to zero. They'll be cleared
> > to zero upon program execution. I.e., the "i=0" above should be
> > optional in a language for all declared file scope variables, since
> > such variables should be initialized to or cleared to zero by
> > default.
>
> That's great. Until the second time you execute this loop:
>
> for(; i!=MAX; i++)

Why would you execute the loop a second time? ...

I.e., I'd argue that in general, very generally, loops are only
executed once within most C programs. However, obviously, if the loop
is re-used, the programmer must set i to 0, as shown previously, or to
1 in your case, prior to the re-use of i.

> since now i will have value 10 (or whatever it ended up as after 1000
> more lines of code, being a file-scope variable visible from any
> function). Or when you execute this separate loop further on:
>
> for(; i<N; i++)

The paragraph above was about the clearing of file or global scope
variables, not specifically about loops. If numbering and indexing
start with a value of one, then you'll have to "clear" the BSS
variables to a value of "one", except the binary representations for
what "one" is may vary depending on the data type, e.g., integer vs
float. Whereas, the binary representation for "zero" is almost always
all bits clear.



Rod Pemberton

Dec 7, 2021, 7:52:20 AM
On Mon, 6 Dec 2021 14:56:41 +0100
David Brown <david...@hesbynett.no> wrote:

> On 06/12/2021 11:41, Rod Pemberton wrote:
>
> >
> > Maybe I've been coding in C for too long now, as I prefer zero-based
> > indexing, even in non-C situations like personal to-do lists.
> >
> > 0-based works really well with C, but I do recall it being somewhat
> > "unnatural" at first, even though zero was the central starting
> > point of the signed number line in mathematics. For programming, I
> > strongly prefer unsigned only. This eliminates many coding errors.
>
> Out of curiosity, what kinds of coding errors do you eliminate by
> using unsigned types?

Errors occur when you expect the variable to function as unsigned,
which is common in C when working with characters, pointers, or binary
data, but the variable was actually declared as signed. If you add the
value to another integer, you'll end up with the "wrong" numeric result,
i.e., other than what was expected, because the variable was signed.

i=1000
x=0xFF (signed 8-bit char)
y=0xFF (unsigned 8-bit char)

i+x = 999 (unexpected)
i+y = 1255 (expected)
The paragraph on clearing variables wasn't intended to be tied
specifically to loop variables, or even to a discussion of C
specifically; it was a generic statement about file-scope or global
variables in some new language, using C as a means of explanation.

The other option, per the prior discussion, would require that the file
scope variables in BSS all be "cleared" to a value of one. As you know
from C, the representations for a value of one may be different for
each type of variable, e.g., integer vs float.

As for C, variable declarations within the for() loop are not valid in
ANSI C (C89); they only became valid in C99, C11, and later. So one
could argue that, to ensure backwards compatibility, and hence
portability of C code, declaring a variable partway through a procedure,
such as within a for() loop, should be avoided, yes? Think of C style
guide suggestions.
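
For reference, the difference being discussed looks like this; both functions compile under C99 and later, but only the first is valid C89:

```c
/* C89 style: the loop counter must be declared at the top of the block. */
int sum89(void) {
    int i;
    int total = 0;
    for (i = 0; i < 10; i++)
        total += i;
    return total;
}

/* C99 and later: the counter can be declared in the for header,
   which scopes it to the loop body. */
int sum99(void) {
    int total = 0;
    for (int i = 0; i < 10; i++)
        total += i;
    return total;
}
```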


--

Bart

Dec 7, 2021, 9:07:13 AM
On 07/12/2021 12:52, Rod Pemberton wrote:
> On Mon, 6 Dec 2021 14:56:41 +0100
> David Brown <david...@hesbynett.no> wrote:
>
>> On 06/12/2021 11:41, Rod Pemberton wrote:
>>
>>>
>>> Maybe I've been coding in C for too long now, as I prefer zero-based
>>> indexing, even in non-C situations like personal to-do lists.
>>>
>>> 0-based works really well with C, but I do recall it being somewhat
>>> "unnatural" at first, even though zero was the central starting
>>> point of the signed number line in mathematics. For programming, I
>>> strongly prefer unsigned only. This eliminates many coding errors.
>>
>> Out of curiosity, what kinds of coding errors do you eliminate by
>> using unsigned types?
>
> When you expect the variable to function as unsigned, which is common
> in C when working with characters, pointers, or binary data, but the
> variable was actually declared as signed. If you add the value to
> another integer, you'll end up with the "wrong" numeric result, i.e.,
> other than what was expected because the variable was signed.
>
> i=1000
> x=0xFF (signed 8-bit char)

If x has int8 type then it has the value -1 not +255.


> y=0xFF (unsigned 8-bit char)
>
> i+x = 999 (unexpected)

This is only unexpected if you expect int8 to be able to represent
+255.

> i+y = 1255 (expected)

Unsigned has its own problems:

unsigned int i=1000;
unsigned int x=1001;

printf("%u\n",i-x);
printf("%f\n",(double)(i-x));

displays:

4294967295
4294967295.000000

David Brown

Dec 7, 2021, 9:40:03 AM
On 07/12/2021 13:52, Rod Pemberton wrote:
> On Mon, 6 Dec 2021 14:56:41 +0100
> David Brown <david...@hesbynett.no> wrote:
>
>> On 06/12/2021 11:41, Rod Pemberton wrote:
>>
>>>
>>> Maybe I've been coding in C for too long now, as I prefer zero-based
>>> indexing, even in non-C situations like personal to-do lists.
>>>
>>> 0-based works really well with C, but I do recall it being somewhat
>>> "unnatural" at first, even though zero was the central starting
>>> point of the signed number line in mathematics. For programming, I
>>> strongly prefer unsigned only. This eliminates many coding errors.
>>
>> Out of curiosity, what kinds of coding errors do you eliminate by
>> using unsigned types?
>
> When you expect the variable to function as unsigned, which is common
> in C when working with characters, pointers, or binary data, but the
> variable was actually declared as signed.

It's a bad idea ever to use plain "char" and think of it as a number at
all - then its signedness doesn't matter. Use "signed char" or "unsigned
char" if you need a small signed or unsigned type - or, preferably,
uint8_t, int8_t, or one of the other <stdint.h> types if you need
maximal portability.

Pointers don't have a concept of signedness.

The only type that is guaranteed to work for general "binary data"
without a proper type, is "unsigned char".

These issues are not solved by "preferring to use unsigned", they are
solved by learning the language.

> If you add the value to
> another integer, you'll end up with the "wrong" numeric result, i.e.,
> other than what was expected because the variable was signed.
>
> i=1000
> x=0xFF (signed 8-bit char)
> y=0xFF (unsigned 8-bit char)
>
> i+x = 999 (unexpected)
> i+y = 1255 (expected)
>

"unsigned" on its own means "unsigned int". You don't have any unsigned
ints here, nor would they help.

Do you mean that you prefer to use a type that can hold the data you put
into it, rather than having it truncated or converted? If so, then I
agree - that is sane practice for all types and all programming languages.

But your "unexpected" result is caused by someone trying to use a
"signed char" or plain "char" (without knowing the signedness) for
values outside its range and for something completely inappropriate. A
much more useful coding rule here is "don't use plain chars for numbers
or arithmetic - use <stdint.h> types", rather than "prefer unsigned".


I agree that people do make mistakes by assuming that plain "char" is
signed - I just disagree that "preferring unsigned" is helpful in
avoiding such mistakes.

(I personally use a lot more unsigned types than most, because types
such as uint8_t, uint16_t and uint32_t are most natural when dealing
with low-level embedded programming.)
Eh, okay, I suppose. It's a rather odd example, in that loop variables
(in any language) are perhaps the least likely of any variable uses to
have static lifetime.

> The other option, per the prior discussion, would require that the file
> scope variables in BSS all be "cleared" to a value of one. As you know
> from C, the representations for a value of one may be different for
> each type of variable, e.g., integer vs float.

It would also give a new and exciting meaning to the word "clear", as
well as making every other static-lifetime default initialised variable
wrong.

>
> As for C, variable declarations within the for() loop is not valid
> for ANSI C (C89), i.e., valid for C99 or C11 or later.

C18 is the current standard - without other qualifications, that is what
is meant by "C". If you want to pick an older standard, it is best to
specify it explicitly. (And I don't recommend using the term "ANSI C"
at all - people often use it to mean C89, when in fact it means "the
current ISO C standard" - i.e., C18 at the time of writing.)

Of course you are correct that putting declarations in the "for" loop
was introduced in C99. Rounded to the nearest percentage, 100% of C
code has been written since the introduction of C99, and probably at
least 98% since it became widely supported by common tools. There are a
few very niche situations where it makes sense to use pre-C99 today,
other than for maintaining old programs in the style in which they were
written. Other than that, C99 syntax is standard.

> So, one could
> argue, that to ensure backwards code compatibility, hence portability
> of C code, that declaring a variable somewhere within a procedure, such
> as within a for() loop, should be avoided, yes? Think of C style guide
> suggestions.
>

No.

That's like saying software should be published as printouts in a
magazine, rather than, say, on a web page, for backwards compatibility.



David Brown

Dec 7, 2021, 9:55:24 AM
On 07/12/2021 15:05, Bart wrote:
> On 07/12/2021 12:52, Rod Pemberton wrote:
>> On Mon, 6 Dec 2021 14:56:41 +0100
>> David Brown <david...@hesbynett.no> wrote:
>>
>>> On 06/12/2021 11:41, Rod Pemberton wrote:
>>>
>>>>
>>>> Maybe I've been coding in C for too long now, as I prefer zero-based
>>>> indexing, even in non-C situations like personal to-do lists.
>>>>
>>>> 0-based works really well with C, but I do recall it being somewhat
>>>> "unnatural" at first, even though zero was the central starting
>>>> point of the signed number line in mathematics.  For programming, I
>>>> strongly prefer unsigned only.  This eliminates many coding errors.
>>>
>>> Out of curiosity, what kinds of coding errors do you eliminate by
>>> using unsigned types?
>>
>> When you expect the variable to function as unsigned, which is common
>> in C when working with characters, pointers, or binary data, but the
>> variable was actually declared as signed.  If you add the value to
>> another integer, you'll end up with the "wrong" numeric result, i.e.,
>> other than what was expected because the variable was signed.
>>
>> i=1000
>> x=0xFF  (signed 8-bit char)
>
> If x has int8 type then it has the value -1 not +255.
>

Correct. "0xff" is interpreted as an integer constant (value 255).
When assigned to "x", it is converted using an implementation-dependent
algorithm (which is invariably modulo reduction, when using a two's
complement system) to the range of the target variable - arriving at -1.

I'd prefer if C had some way of initialising variables that did not have
such implicit conversions (also known as "narrowing conversions"). C++
has a method which works well in practice but is, IMHO, ugly :

int8_t x { 0xff };

instead of :

int8_t x = 0xff;

The first is an error in C++ because of the narrowing conversion, the
second works exactly like C.

A syntax such as :

int8_t x := 0xff;

would have been nicer IMHO, but it's too late for that in C (or C++).


>
>> y=0xFF  (unsigned 8-bit char)
>>
>> i+x = 999   (unexpected)
>
> This is not unexpected, only if you expect int8 to be able to represent
> +255.
>
>> i+y = 1255  (expected)
>
> Unsigned has its own problems:
>
>     unsigned int i=1000;
>     unsigned int x=1001;
>
>     printf("%u\n",i-x);
>     printf("%f\n",(double)(i-x));
>
> displsys:
>
>     4294967295
>     4294967295.000000
>

That is also not unexpected.

(Other languages could reasonably handle this differently - C's
treatment of unsigned int as a modulo type is not the only practical way
to handle things. But if you know the basics of C, it is neither a
problem nor a surprise.)

Bart

Dec 7, 2021, 7:42:33 PM
When using signed integers, then you really only get problems with
overflows and such at quite high magnitudes, around +/- 2e9 for int32,
and considerably higher for int64 at +/- 9e18.

But you will normally work with values a long way from those limits,
unless results are stored in narrower types, so such problems are rare.

With unsigned numbers however, one of those problematic limits is zero,
which is really too close for comfort! So you will see problems even
with small, very ordinary calculations, such as 2 - 3, which here
underflows that zero.


David Brown

Dec 8, 2021, 3:41:17 AM
On 08/12/2021 01:42, Bart wrote:
> On 07/12/2021 14:55, David Brown wrote:
>> On 07/12/2021 15:05, Bart wrote:
>
>>> Unsigned has its own problems:
>>>
>>>      unsigned int i=1000;
>>>      unsigned int x=1001;
>>>
>>>      printf("%u\n",i-x);
>>>      printf("%f\n",(double)(i-x));
>>>
> displays:
>>>
>>>      4294967295
>>>      4294967295.000000
>>>
>>
>> That is also not unexpected.
>>
>> (Other languages could reasonably handle this differently - C's
>> treatment of unsigned int as a modulo type is not the only practical way
>> to handle things.  But if you know the basics of C, it is neither a
>> problem nor a surprise.)
>
> When using signed integers, then you really only get problems with
> overflows and such at quite high magnitudes, around +/- 2e9 for int32,
> and considerably higher for int64 at +/- 9e18.

Yes.

>
> But you will normally work with values a long way from those limits,
> unless results are stored in narrower types, so such problems are rare.
>

Agreed.

> With unsigned numbers however, one of those problematic limits is zero,
> which is really too close for comfort! So you will see problems even
> with small, very ordinary calculations, such as 2 - 3, which here
> underflows that zero.
>

Disagreed.

Whatever kind of numbers you use, you have to apply a few brain cells.
You can't represent 1/3 with an integer, no matter how big it is. You
can't represent negative numbers with unsigned types. It's common
sense, not a "problematic limit". Anyone who finds it surprising that
you can't subtract 3 from 2 without signed numbers should give up their
programming career and go back to primary school. We have to have
/some/ standard of education in this profession!





Bart

Dec 8, 2021, 4:45:21 AM
It won't be 2 - 3, it will be A - B.

Can the result be properly represented, or not? Given ordinary (ie.
smallish) signed values, it can. Given ordinary unsigned values, the
chances are 50% that it can't!

This is why I prefer signed types for general use to unsigned types. And
why my mixed arithmetic is performed using signed types.

Imagine working with unsigned float; where would you start with all the
potential problems!

C of course prefers to use unsigned for mixed arithmetic (although the
precise rules are complex). So here:

int a = 2;
unsigned b = 3;
double c = a-b;

printf("%f\n", c);

it prints 4294967295.000000. Same using b-4 instead of a-b.

If I do the same:

int a := 2
word b := 3
real c := a-b

println c

it shows -1.000000, for b-4 too. Fewer surprises.

I actually do all arithmetic using at least i64. Values of types u8, u16
and u32 are converted losslessly to i64 first. It's only when u64 is
involved that you need to start taking care, but my example uses u64
('word'), and that has more sensible behaviour than the C.

David Brown

Dec 8, 2021, 6:07:38 AM
Again, it is absolutely no different from A / B.

>
> This is why I prefer signed types for general use to unsigned types. And
> why my mixed arithmetic is performed using signed types.

Most people use signed types for general integer arithmetic - that's
fine, and the ability to subtract them and get negative numbers is one
of the reasons for that.

>
> Imagine working with unsigned float; where would you start with all the
> potential problems!
>
> C of course prefers to use unsigned for mixed arithmetic (although the
> precise rules are complex). So here:

The precise rules are simple, not complex. Pretending they are
difficult does not help. Personally, however, I don't think they are
good - I would have preferred rules that promoted both sides to a common
type that is big enough for all the values involved. Thus "signed int +
unsigned int" should be done as "signed long" (or "signed long long" if
necessary - or the rules for sizes should be changed too). Failing
that, it should be a constraint error (i.e., fail to compile).

C doesn't give me the rules I want here, so I use warnings and errors in
my compilation that flags such mixed arithmetic use as errors. The
result is that I can't get accidents with mixed arithmetic when
developing, and the code is fine for other compilers or flags because it
is just a slightly limited subset of C.

Other programmers make other choices, of course - that's just the way I
choose to handle this.

Please don't mistake my understanding of C's rules, my acceptance of
them, my appreciation that C is used by many people for many purposes
with different preferences and requirements, my working with C and
liking C, as meaning that I think C's rules are the way I personally
would have preferred.


>
>     int a = 2;
>     unsigned b = 3;
>     double c = a-b;
>
>     printf("%f\n", c);
>
> it prints 4294967295.000000. Same using b-4 instead of a-b.

Yes. That is not surprising. In C, the types of expressions are
determined from the inside out (the actual calculations can be done in
any order, as long as the results match the sequence point
requirements). The type used on the left of an assignment operator has
no bearing on the types used on the right hand side (and vice versa).

This is the same in the solid majority of programming languages. It is
a simple and consistent choice that is easy to understand and use.

It is not the only option. In Ada, as I understand it, expressions are
influenced by the type they are assigned to. This is certainly true for
literals and it allows overloading functions based on the return type.
I don't know the full rules here (others here know better - indeed, much
of what /I/ know has come from postings here).

Applying the types from the outside in, so that in "c = a - b;" the type
of "c" is applied to those of "a" and "b" before the subtraction, is an
alternative to applying it from the inside out. It is not /better/, it
is /different/. It has some pros, and some cons. It is not in any
sense more "natural" or more "expected".

>
> If I do the same:
>
>     int a := 2
>     word b := 3
>     real c := a-b
>
>     println c
>
> it shows -1.000000, for b-4 too. Fewer surprises.

Surprises are for people who don't know what they are doing. I can
agree that a result of -1 is more likely to be useful to the programmer,
but not that a reasonably competent C programmer would find the C
version surprising.

And you are mixing two separate issues here, which does not help your case.

>
> I actually do all arithmetic using at least i64. Values of types u8, u16
> and u32 are converted losslessly to i64 first. It's only when u64 is
> involved that you need to start taking care, but my example uses u64
> ('word'), and that has more sensible behaviour than the C.

As I said above, I would be happiest with a language that when given "a
- b" would first ensure that "a" and "b" are converted to a common type
that covers the whole range of both. If that can't be done, or if the
overflow characteristics of the original types are incompatible and an
overflow is possible, then there should be an error.

This is completely orthogonal as to whether "a - b" should be converted
to "real" before the subtraction, given that the result will be assigned
to a "real", or whether it should be evaluated first in the closest
common type specified by the language ("i64" in your language, "unsigned
int" in the C version) and /then/ converted to "real".

In particular, what does your language give for :

int a := 2
int b := 3
real c := b / a;

println c


Does it print 1, or 1.5 ?

The C version would give 1. Ada, as far as I could see in a quick test
on <https://godbolt.org>, will not accept mixing types in the same
expression or assignment without explicit casts.

Bart

Dec 8, 2021, 6:55:34 AM
On 08/12/2021 11:07, David Brown wrote:
> On 08/12/2021 10:45, Bart wrote:

>> C of course prefers to use unsigned for mixed arithmetic (although the
>> precise rules are complex). So here:
>
> The precise rules are simple, not complex. Pretending they are
> difficult does not help.

Here is the table of rules for C: S means the operation is performed as
signed with a signed result; "." (chosen to make it clearer) means unsigned:

        u8   u16  u32  u64  i8   i16  i32  i64

u8      S    S    .    .    S    S    S    S
u16     S    S    .    .    S    S    S    S
u32     .    .    .    .    .    .    .    S
u64     .    .    .    .    .    .    .    .

i8      S    S    .    .    S    S    S    S
i16     S    S    .    .    S    S    S    S
i32     S    S    .    .    S    S    S    S
i64     S    S    S    .    S    S    S    S

Here is the corresponding table for my language:

        u8   u16  u32  u64  i8   i16  i32  i64

u8      .    .    .    .    S    S    S    S
u16     .    .    .    .    S    S    S    S
u32     .    .    .    .    S    S    S    S
u64     .    .    .    .    S    S    S    S

i8      S    S    S    S    S    S    S    S
i16     S    S    S    S    S    S    S    S
i32     S    S    S    S    S    S    S    S
i64     S    S    S    S    S    S    S    S

I think people can make up their own minds as to which has the simpler
rules!

(My table is missing rows/columns for i128/u128, but it's the same
pattern: unsigned/unsigned => unsigned, otherwise signed. I don't know
what C's would look like with 128-bit added.)

> Personally, however, I don't think they are
> good - I would have preferred rules that promoted both sides to a common
> type that is big enough for all the values involved. Thus "signed int +
> unsigned int" should be done as "signed long" (or "signed long long" if
> necessary - or the rules for sizes should be changed too). Failing
> that, it should be a constraint error (i.e., fail to compile).
>
> C doesn't give me the rules I want here,

Yeah, I get that feeling a lot. (Are you still wondering why I prefer my
language?)

>
> In particular, what does your language give for :
>
> int a := 2
> int b := 3
> real c := b / a;
>
> println c
>
>
> Does it print 1, or 1.5 ?

My languages have two divide operators: "/" and "%".

"%" means integer divide. "/" is supposed to be for floating point
divide, but that's only on one language; the static one will still do
integer divide when both operands are integers.

So M will give 1.0, Q will give 1.5.

But in both cases, it is the operator and the operand types that
determine what happens. It can't look beyond that, since I want the same
code to work in dynamic code where that information doesn't exist (c
will not even have a type until assigned to).

You would anyway want a term like A*B, to give the same result in terms
of value and type, no matter which expressions it is part of.

In languages that do it differently, A*B could give different results
even if repeated within the same expression!

Bart

Dec 8, 2021, 9:26:22 AM
Actually my table is not up to date. I was going to add a remark about
tweaking it so that more operations were done as signed; I'd forgotten
that I'd already made that change and was trying it out!

The current table is (arguably even simpler than before):

        u8   u16  u32  u64  i8   i16  i32  i64

u8      S    S    S    S    S    S    S    S
u16     S    S    S    S    S    S    S    S
u32     S    S    S    S    S    S    S    S
u64     S    S    S    .    S    S    S    S

i8      S    S    S    S    S    S    S    S
i16     S    S    S    S    S    S    S    S
i32     S    S    S    S    S    S    S    S
i64     S    S    S    S    S    S    S    S

Everything is done as signed (specifically as i64, not shown), except
for u64/u64.

Any scheme will give incorrect results - inappropriate signedness,
overflow, etc. - on certain combinations. This one was designed to
minimise those cases and give the most useful results for the most
common values.

Explicit u64 types are not common; but quite common are u8 u16 u32 used
in arrays and structs, which are about saving space.

But this is a demonstration of the benefit:

u8 a:=2, b:=3

println a-b

Under the old chart, this displayed 18446744073709551615 (u8-u8 =>
u64-u64 => u64). Under the new one, it shows -1 (u8-u8 => i64-i64 => i64).

BTW this is the C table showing the operation and result types (both
sides promoted to the type shown):

        u8   u16  u32  u64  i8   i16  i32  i64

u8      i32  i32  u32  u64  i32  i32  i32  i64
u16     i32  i32  u32  u64  i32  i32  i32  i64
u32     u32  u32  u32  u64  u32  u32  u32  i64
u64     u64  u64  u64  u64  u64  u64  u64  u64

i8      i32  i32  u32  u64  i32  i32  i32  i64
i16     i32  i32  u32  u64  i32  i32  i32  i64
i32     i32  i32  u32  u64  i32  i32  i32  i64
i64     i64  i64  i64  u64  i64  i64  i64  i64

And this is my own current chart:

        u8   u16  u32  u64  i8   i16  i32  i64

u8      i64  i64  i64  i64  i64  i64  i64  i64
u16     i64  i64  i64  i64  i64  i64  i64  i64
u32     i64  i64  i64  i64  i64  i64  i64  i64
u64     i64  i64  i64  u64  i64  i64  i64  i64

i8      i64  i64  i64  i64  i64  i64  i64  i64
i16     i64  i64  i64  i64  i64  i64  i64  i64
i32     i64  i64  i64  i64  i64  i64  i64  i64
i64     i64  i64  i64  i64  i64  i64  i64  i64

Spot the odd-one-out.

David Brown

Dec 8, 2021, 10:36:36 AM
On 08/12/2021 12:55, Bart wrote:
> On 08/12/2021 11:07, David Brown wrote:
>> On 08/12/2021 10:45, Bart wrote:
>
>>> C of course prefers to use unsigned for mixed arithmetic (although the
>>> precise rules are complex). So here:
>>
>> The precise rules are simple, not complex.  Pretending they are
>> difficult does not help.
>

What is it with you and your campaign to claim everything C is bad, and
everything in your useless little private language is good? It doesn't
matter what anyone writes - you /always/ twist the facts, move the
goalposts or deliberately misinterpret what others write. (And yes,
your language is useless - no one else will ever use it. You've
made useful software with it and used it in your work in the past.
That's great, and genuinely praise-worthy. But it is dead now. Move
along.)


So - let's start with some kindergarten logic. Claiming that your rules
are simpler than C's does not make C's rules complex.

In a binary arithmetic expression with integer types, any type smaller
than "int" is first converted to an "int". Then if the two parts have
different types, they are converted to the bigger type with "unsigned"
types being treated as slightly bigger than the signed types.

It is /not/ hard. It is /not/ complex. You might not think it is
ideal, and I'd agree. But it really is not rocket science, and it
doesn't need a complicated table of inappropriate made-up types to make
it look more complicated.

Oh, and your method will screw up too, for some cases. /Any/ method
will in some cases, unless you have unlimited ranges for your integers
(like Python) or point-blank refuse mixed signed expressions (like Ada).
And your language will still screw up on overflows.

(And before you post your knee-jerk response, the fact that C gets
things wrong on overflow does not mean your language is right or better.)


<snip more pointless and annoying drivel>

>> In particular, what does your language give for :
>>
>>       int a := 2
>>       int b := 3
>>       real c := b / a;
>>
>>       println c
>>
>>
>> Does it print 1, or 1.5 ?
>
> My languages have two divide operators: "/" and "%".
>
> "%" means integer divide. "/" is supposed to be for floating point
> divide, but that's only on one language; the static one will still do
> integer divide when both operands are integers.

Genius. Does it also use "and" as a keyword for the remainder after
division? Nothing says "simple" and "intuitive" like picking different
meanings for your operators than all other languages.

>
> So M will give 1.0, Q will give 1.5.
>

That's your two languages that are proudly the same syntax, but handle
expressions in completely different ways?


If you want to keep posting about your own language, please feel free -
only you can tell if you are making things up as you go along. But
/please/ stop posting shite about other languages that you refuse to
understand.

Understand me correctly here - I really don't care if you like C or not.
I don't care if anyone else here likes it or not, uses it or not. I am
not interested in promoting C or any other language - I'll use what I
want to use, and others will use what they want.

But what I /do/ react against is lies, FUD, and misrepresentations. I
am not "pro-C" - I am "anti-FUD", and it just so happens that your
bizarre hatred of C means it is C you post rubbish about. I'd react
against anyone else deliberately and repeatedly writing nonsense about
other topics too.

Bart

Dec 8, 2021, 11:58:56 AM
On 08/12/2021 15:36, David Brown wrote:
> On 08/12/2021 12:55, Bart wrote:


>
> What is it with you and your campaign to claim everything C is bad, and
> everything in your useless little private language is good?

I said the rules are complex. You said they are simple. I disagreed, and
illustrated my point with a chart.

> than "int" is first converted to an "int". Then if the two parts have
> different types, they are converted to the bigger type with "unsigned"
> types being treated as slightly bigger than the signed types.

At least, they are simpler than the rules for type syntax. And not much
simpler than the rules for charting the Mandelbrot Set!

> It is /not/ hard. It is /not/ complex. You might not think it is
> ideal, and I'd agree. But it really is not rocket science, and it
> doesn't need a complicated table of inappropriate made-up types

What made-up types? And why are they inappropriate?

Are you sure you aren't twisting and making up things yourself?

> to make
> it look more complicated.

I think most people would be surprised at how untidy that chart is. /I/ was.


>>> Does it print 1, or 1.5 ?
>>
>> My languages have two divide operators: "/" and "%".
>>
>> "%" means integer divide. "/" is supposed to be for floating point
>> divide, but that's only on one language; the static one will still do
>> integer divide when both operands are integers.
>
> Genius. Does it also use "and" as a keyword for the remainder after
> division? Nothing says "simple" and "intuitive" like picking different
> meanings for your operators than all other languages.

"%" was used for integer divide in Pascal. I adopted it in the 1980s
when I needed distinct operators.

And I use "rem" for integer REMainder instead of "%"; "ixor" instead of
"^"; "ior" instead of "|" and "or" instead of "||". Maybe it's just me,
but I find them more readable.

Why, what do other languages use for integer divide?

>> So M will give 1.0, Q will give 1.5.
>>
>
> That's your two languages that are proudly the same syntax, but handle
> expressions in completely different ways?

Funnily enough, C and Python will also give 1.0 and 1.5 respectively.

But that of course is fine.

David Brown

Dec 8, 2021, 12:13:59 PM
On 08/12/2021 17:58, Bart wrote:
> On 08/12/2021 15:36, David Brown wrote:
>> On 08/12/2021 12:55, Bart wrote:
>
>
>>
>> What is it with you and your campaign to claim everything C is bad, and
>> everything in your useless little private language is good?
>
> I said the rules are complex. You said they are simple. I disagreed, and
> illustrated my point with a chart.

A chart designed purely to make the simple rules of C appear complex -
it is FUD. You added those of your own language, which is utterly
irrelevant to C, purely to be able to claim that the rules of your
language are simple. Note that even if your language's rules are
simpler in some way, that does /not/ make C's rules complex!

>
>> than "int" is first converted to an "int".  Then if the two parts have
>> different types, they are converted to the bigger type with "unsigned"
>> types being treated as slightly bigger than the signed types.
>
> At least, they are simpler than the rules for type syntax. And not much
> simpler than the rules for charting the Mandelbrot Set!
>
>> It is /not/ hard.  It is /not/ complex.  You might not think it is
>> ideal, and I'd agree.  But it really is not rocket science, and it
>> doesn't need a complicated table of inappropriate made-up types
>
> What made-up types? And why are they inappropriate?

There are no types of the names you used in C. C has a perfectly good
set of fundamental types (regardless of what you personally might think
of them, or even what /I/ personally might think of them), and the rules
of C are given in terms of those types.

>
> Are you sure you aren't twisting and making up things yourself?
>
>> to make
>> it look more complicated.
>
> I think most people would be surprised at how untidy that chart is. /I/
> was.

You seem to find just about everything in C surprising.

But let's be clear here. Do you think people familiar and experienced
with C programming will find C's rules surprising? Or do you just think
people who have never used C will find them surprising?

>
>
>>>> Does it print 1, or 1.5 ?
>>>
>>> My languages have two divide operators: "/" and "%".
>>>
>>> "%" means integer divide. "/" is supposed to be for floating point
>>> divide, but that's only on one language; the static one will still do
>>> integer divide when both operands are integers.
>>
>> Genius.  Does it also use "and" as a keyword for the remainder after
>> division?  Nothing says "simple" and "intuitive" like picking different
>> meanings for your operators than all other languages.
>
> "%" was used for integer divide in Pascal. I adopted it in the 1980s
> when I needed distinct operators.
>
> And I use "rem" for integer REMainder instead of "%"; "ixor" instead of
> "^"; "ior" instead of "|" and "or" instead of "||". Maybe it's just me,
> but I find them more readable.
>
> Why, what do other languages use for integer divide?

Most use /. And in most languages, if they have % operator for
integers, it means modulus. (Conventions differ regarding rounding and
signs when dividing by negative integers.)

>
>>> So M will give 1.0, Q will give 1.5.
>>>
>>
>> That's your two languages that are proudly the same syntax, but handle
>> expressions in completely different ways?
>
> Funnily enough, C and Python will also give 1.0 and 1.5 respectively.
>
> But that of course is fine.

Yes.

I've no problem with different languages handling these in different
ways - just as I have no problem with different languages handling
integer promotions and implicit conversions in different ways. I merely
have a problem with claims that one method is "surprising" and another
somehow unsurprising, and I would question the benefit of making
languages designed specifically to be as similar in appearance and
syntax as possible while disagreeing on something that fundamental.

So it is /fine/ that your language promotes unsigned types to signed
types in mixed arithmetic. Those are the rules you chose, and if they
are clear and consistent, great. It is /wrong/ to say they are better,
or simpler, than other choices. OK?

Bart

Dec 8, 2021, 12:58:51 PM
On 08/12/2021 17:13, David Brown wrote:
> On 08/12/2021 17:58, Bart wrote:
>> On 08/12/2021 15:36, David Brown wrote:
>>> On 08/12/2021 12:55, Bart wrote:
>>
>>
>>>
>>> What is it with you and your campaign to claim everything C is bad, and
>>> everything in your useless little private language is good?
>>
>> I said the rules are complex. You said they are simple. I disagreed, and
>> illustrated my point with a chart.

> A chart designed purely to make the simple rules of C appear complex -

Does it correctly represent what you get when you apply those rules?
Then there's nothing underhand about it.

> it is FUD. You added those of your own language, which is utterly
> irrelevant to C, purely to be able to claim that the rules of your
> language are simple.

My chart is partly simpler because there isn't a discontinuity in the
type system between 32-bit and 64-bit types as there is in most desktop Cs.

But it is also simpler because I made it so.

>> What made-up types? And why are they inappropriate?
>
> There are no types of the names you used in C. C has a perfectly good
> set of fundamental types (regardless of what you personally might think
> of them, or even what /I/ personally might think of them), and the rules
> of C are given in terms of those types.


Oh, right, I should have written uint64_t etc. Unfortunately that would have
made for a rather wide and spaced out chart.

(Or maybe I should have included char, signed char, unsigned char,
signed/unsigned long etc as well. Then it would really have been big
/and/ complex!)

That is a ludicrous quibble; this is a language-agnostic group, and
everyone here surely can figure out what those types represent.

Besides, I wanted two charts for comparison; they need to use the same
annotations.


>>
>> Are you sure you aren't twisting and making up things yourself?
>>
>>> to make
>>> it look more complicated.
>>
>> I think most people would be surprised at how untidy that chart is. /I/
>> was.
>
> You seem to find just about everything in C surprising.
>
> But let's be clear here. Do you think people familiar and experienced
> with C programming will find C's rules surprising?

I think so. I thought for a long time that mixed arithmetic in C was
done as unsigned. But according to that chart, only 44% of mixed
combinations are done as unsigned; most are signed.


> Or do you just think
> people who have never used C will find them surprising?

There's a ton of things in C that even those who've used it for many
years, will find surprising.



>> Why, what do other languages use for integer divide?
>
> Most use /.

That's not integer divide. For example, Python uses "/" for floating
point divide, and "//" for integer divide. Although Python and its "//"
came along some years after I chose "%".

So what else is there?

Wikipedia says (https://en.wikipedia.org/wiki/Division_(mathematics)):

"Names and symbols used for integer division include div, /, \, and %"

In my IL, I used DIV, IDIV for float and integer division, and IREM for
integer remainder. (Float remainder uses FMOD.)

I had once reserved "//" for designating rational numbers.

> And in most languages, if they have a % operator for
> integers, it means modulus.

And if they don't have "%"? Here:

https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages

it seems to be split between REM, MOD and %. I chose REM.

Some languages use more than one for a choice of behaviour.

I don't think "%" is the most common; where it is used, it's often for a
language with C-style syntax.


Bart

Dec 8, 2021, 2:05:22 PM
I post criticisms of quite a few languages I come across, although in
this group it might be largely C and Algol68 that come up.

C figures highly because I can't really get away from it; it's
everywhere. It's also the one whose purpose and use-cases most closely
match my own.

But it also annoys me that it is so deified despite being such a
dreadful language.

That is not surprising given when it was created, nearly 50 years ago.
But it hasn't moved on. Its aficionados seem to treat every misfeature
as an advantage.

> I'd react
> against anyone else deliberately and repeatedly writing nonsense about
> other topics too.

You mention lots of things you don't like about C. But it sounds like
you don't have much of a choice about it; you have to rely on external
tools to make it useful. That's OK, many people are stuck with languages
they don't like.

But some of us can do something about it, yet that seems to annoy you
and you are constantly belittling people's efforts, especially mine.


David Brown

Dec 9, 2021, 7:58:52 AM
On 08/12/2021 20:05, Bart wrote:

>
> I post criticisms of quite a few languages I come across, although in
> this group it might be largely C and Algol68 that come up.
>
> C figures highly because I can't really get away from it; it's
> everywhere. It's also the one whose purpose and use-cases most closely
> match my own.
>
> But it also annoys me that it is so deified despite being such a
> dreadful language.

This is where the communication problem lies - your annoyance is based
on two incorrect ideas.

First, you think C is "deified" - it is /not/. I really wish you could
understand that, as it would make discussions so much easier. You seem
to be fundamentally incapable of distinguishing between people who
understand C and use it (of which there are vast numbers), and people
who think C is the best language ever and completely flawless (of which
there are, to my knowledge, none).

Take me, as an example - because it's a lot easier to speak for myself
than for other people! I have a good understanding of the main C
language, and a subset of the standard library (there is a great deal in
it that I never use). I have read the standards, I keep up with changes
to the new standards. I have written a great deal of C code over the
years, almost all for small embedded systems (and a little for Linux).
I have used a wide range of C compilers for a wide range of
microcontrollers. Far and away the best C compiler I have seen is gcc,
which I know well and use for several targets.

I have worked in many different languages (I have at least some
experience with perhaps 20 programming languages, ranging from
functional programming, assembly, hardware description languages,
scripting languages, imperative languages, and more). I have used
assembly on a couple of dozen architectures over the years. I regularly
use several different languages for different types of programming.

I like programming in C. I think it is a good language for a lot of what I
do, and I think it is a good language for a lot of what other people do.
But I also think it is /not/ an appropriate language for many uses
people make of it, and it is not an appropriate language for people who
are not able or willing to learn it properly. It is a language that
trusts the programmer to know what they are doing - if you are not
worthy of that trust, don't use C.

I would drop it in a heartbeat if I had something better. I /do/ drop
it without a backwards glance when I have something better for the task
at hand. Thus on some embedded systems, C++ is more appropriate and I
use that. (On occasions that are thankfully rare now, assembly was a
better choice.) On PCs or bigger systems, I often use Python - but
sometimes other languages.

C is not perfect. I have never heard anyone suggest it is - though you,
Bart, repeatedly accuse people (including me) of saying so. There are a
number of sub-optimal aspects in C that there is quite general agreement
about, and a large number where some people think it could have been
better, but different people have different opinions. For the most
part, those who know about the language understand why things are the
way they are - whether it be for historical reasons, compatibility,
limitations of old systems, or for more modern reasons and uses. No one
is in any doubt that if a language were being designed today to do the
job of C, many aspects would be different. No one is in any doubt that
C is not perfect for their needs or desires. Nonetheless, it is a good
language that works well for many programmers.

It takes effort, skill, knowledge and experience to use any language
well. You need to understand the subset that is appropriate for your
usage - all languages, bar a few niche or specialist ones, have features
and flexibility well outside what makes sense for any particular
programmer's needs. You need to understand how to use the tools for the
language as an aid to developing good code, avoiding problems, and
getting good results in the end. If you fight with the tools, you will
fail. If you fight with the language, you will lose. If you avoid the
useful features of the language, you will only make life harder for
yourself. If you are determined to find fault and dislike in every
aspect of a language, you will not like the language and you will not be
productive with it.


Your second mistake is to think C is a "dreadful language". It is not.
You place artificial limitations on it that make it a poorer language,
you misunderstand its philosophy and design, you fail to make good use
of proper tools (and C was always intended to be used with helpful
tools), and in general your emphasis is on finding faults rather than
uses. You appear unable to believe that people can successfully use the
language.


There is certainly a place for criticism, especially constructive
criticism, in all languages - /none/ are anywhere close to being
universally perfect. But there is no benefit to anyone in a repetitive,
out of context and biased stream of abuse and negativity towards a
language (or anything else, for that matter).


>
> That is not surprising given when it was created, nearly 50 years ago.
> But it hasn't moved on. Its aficionados seem to treat every misfeature
> as an advantage.

I treat things I see in C as misfeatures, as misfeatures. So does
everyone else. I don't treat things /you/ see as misfeatures that way.
In reality, there are very few misfeatures in C that cannot be avoided
by good use of tools, good general development practices, and
occasionally a little extra effort. This is the same in all programming
languages, though of course the details vary. For some reason, you
insist on avoiding good tools (and avoiding good use of tools), and
prefer to find ways to misuse every feature of C that you can.

(The primary reason I have for moving to C++ is to gain additional
features, not to move away from misfeatures.)

>
>>  I'd react
>> against anyone else deliberately and repeatedly writing nonsense about
>> other topics too.
>
> You mention lots of things you don't like about C. But it sounds like
> you don't have much of a choice about it; you have to rely on external
> tools to make it useful. That's OK, many people are stuck with languages
> they don't like.

I /do/ like C - I just don't think it is perfect (and certainly not
perfect for every task). And with good tools used well, it is a very
pleasant and effective language to work with. The same applies to any
good software developer with any language - you find a language that is
suitable for the task and fits your style, you find good tools that help
with the job, and development processes that work well. If you don't
have that, you won't like what you are doing and won't do it well. The
choice of programming language is irrelevant outside its suitability for
the task.

Perhaps you are just envious that I can happily and successfully work
with C, while you have failed? That would be a shame - I am happy, not
envious, that you have a language that you enjoy working with. And I
think it would be better if you avoided dealing with a language that you
clearly don't appreciate or enjoy.

>
> But some of us can do something about it, yet that seems to annoy you
> and you are constantly belittling people's efforts, especially mine.
>

People can choose whatever language they like, and use it as they want.
I don't belittle your effort or your language - I belittle your
attitude to your language and to C, your egotism and narcissistic
viewpoint. When you say you prefer to code in your own language, and
had success with it, that's fine. When you say your language is an
alternative to C, you are wrong. When you say it is "better" than C,
you are wrong. When you say a particular given aspect is "better" than
the equivalent aspect of C, then you /might/ be /subjectively/ right -
i.e., it could be better in some ways for some people or some use-cases.
(And I have regularly agreed on such points.)

luserdroog

Dec 10, 2021, 12:00:32 AM
Also going with 'no'. If you need backwards compatibility, wrap some extra
braces in there making a new compound statement extending to the end
of the function. Then you can declare new variables in the middle of a
function even in old timey C.

For loops require a slight adjustment. Pull the declaration out front and
wrap the whole thing in extra braces.

{
    int i;
    for (i = 0; ...) { ... }
}

Now you're cookin' with gas.

David Brown

Dec 10, 2021, 3:17:03 AM
Yes, you certainly /can/ do this with C90. But it quickly becomes quite
ugly if your functions are big. It is better, where possible, to split
things into smaller functions.

If you are still coding in C90 today (other than small changes to
maintain legacy code), it is likely to be because you are stuck with an
ancient compiler with poor optimisation - it's going to be poor at
inlining so you need to write big functions (and horrendous
function-like macros) if speed is an issue. There isn't really a good
solution here if you like clear code - just options that are bad in
different ways.


As far as I am concerned, the habit of putting all variable definitions
at the start of a function, before any statements, is as legacy and
out-dated as non-prototype function declarations or the explicit use of
"auto" for local variables.

We don't use Latin to talk about science. We don't program in Algol 68.
Let's leave C90's limitations to the history books too, as much as we
reasonably can.


Bart

Dec 10, 2021, 6:16:04 AM
Yet it makes for tidier looking code. There is a separation between the
logic of the code, and the less important details of the types of
variables, which now no longer clutter up the logic.

To find a variable's type, you just glance up at the 'cast-list' at the
top of the function.

If transferring code between languages, the executable code is likely to
be more portable sans its type declarations, which are going to be
language-specific. The other language may not even need types.

You also don't need to worry about block scope: declare that 'int i'
right here, and then you find you can't access 'i' beyond the next '}',
or find (eventually) that it is using a more outer 'i' - the wrong one.

And you don't need to worry about rearranging code where you'd need to
keep ensuring that the first use of a variable is that one with the
declaration.

I wonder if, when declaring module-scope variables, macros, enums, types
and so on, whether you place these before all the functions, or
intersperse them between the functions, to have them as close as
possible to where they are first used?

> We don't use Latin to talk about science. We don't program in Algol 68.
> Let's leave C90's limitations to the history books too, as much as we
> reasonably can.

Factual books still tend to have glossaries at one end of the book or
the other.

(BTW Algol68 allowed declarations interspersed with statements before C did.

I allow the same now (though with function-wide scope to avoid the
problems above), but tend to use that ability as often as I use 'goto';
something that is normally best avoided.)

David Brown

Dec 10, 2021, 11:51:20 AM
The type of variables is a critical part of their usage - not some extra
feature.

To be fair, there are certainly languages with weak or no typing, which
can be useful for simple tasks. And it's not unreasonable to use
generic integer types that are big enough for most purposes without
being fussy about the type.

But outside of that, types are vital information. A strong typing
system in a language /hugely/ reduces the risks of errors, as well as
improving the efficiency of the language, and is particularly important
for larger programs.

So moving types away from the use of a variable is hiding useful
information, not making the code "tidier".

Having all the local variables at the start of a function arguably made
sense long ago, with weaker compilers that needed a list of variables
and allocated a fixed stack frame for them. Re-using the same variables
for different purposes in the code was useful for efficiency. Those
days are long gone. Compilers allocate as and when needed, with
variables in registers, stack slots, optimised out entirely, or
combinations as the code progresses. Mixing declarations and statements
means you can freely split code in logical sections, naming things
usefully whenever it is convenient. You make your variables when you
have something to put in them - there is no need to stock up on
variables in advance. You can often avoid changing the value of a
variable - you just make a new one with the new value.

All this makes it far easier to understand code, analyse it, and be sure
it is correct. Your variables can have invariants - established as soon
as the variable is created. You no longer have a period where the
variable exists and could be used accidentally, but does not have an
appropriate value.

It's no surprise that many new languages make variables constant by
default, and functional programming languages - famous for letting you
write provably correct code in many cases - don't let you change
variables at all.

>
> To find a variable's type, you just glance up at the 'cast-list' at the
> top of the function.
>
> If transfering code between languages, the executable code is likely to
> be more portable sans its type declarations, which are going to be
> language-specific. The other language may not even need types.
>

That would only be the case if you are transferring code between
languages that are extremely similar, where one of them is limited to
defining variables at the start of the function (or at least the start
of the block), where you want to do the conversion manually, where you
want to keep an identical structure to the code, and where you are
willing to write sub-standard code in a riskier manner to make this all
work more simply.

/If/ all that is true, then I agree it is simpler - obviously
translating code directly between two languages is going to be simpler
if you stick to a common subset of the features of the languages.

But I would not judge such cases to be even a vaguely significant
fraction of code written. It is too obscure to bother considering.

> You also don't need to worry about block scope: declare that 'int i'
> right here, and then you find you can't access 'i' beyond the next '}',
> or find (eventually) that it is using a more outer 'i' - the wrong one.
>

It is easier to find the definition of the variable when it is close by.
If that is not the case, your function is too big and messy in the
first place.

> And you don't need to worry about rearranging code where you'd need to
> keep ensuring that the first use of a variable is that one with the
> declaration.
>

I agree that that can occasionally be a disadvantage. It is not a big
enough matter to change my overall opinion - not by a /long/ way. (And
note that more often than not, having the declaration at the usage site
makes it easier to re-arrange or copy-and-paste the code between
different functions, since everything is in one place.)

One thing that would help here would be if the language allowed
something like :

int x = ...
...
int x = ...

without starting a new block. C (and C++) do not allow this. But in my
idea of a "perfect" language, it /would/ be allowed. (Of course such a
feature could be abused to write confusing code - that's a risk with any
feature. But it would be good in some circumstances, such as
copy-and-pasted similar code sections.)

> I wonder if, when declaring module-scope variables, macros, enums, types
> and so on, whether you place these before all the functions, or
> intersperse them between the functions, to have them as close as
> possible to where they are first used?

I put them where I feel they are appropriate. If they are "exported"
from the unit, I put them in "file.h" rather than "file.c".

If they are local to the file, I put them where it makes sense according
to the grouping of the functionality within the file. Declarations in C
have to come before their use, and I don't forward declare things unless
I have a particularly good reason. Thus static variables or local types
will often come just before the functions that use them, after other
functions that happened to come earlier and didn't need them. They may
also happen to be placed earlier in the file, if it makes more logical
sense to place them alongside other declarations that come there.

So if you are asking if I put all my module-scope variables and stuff at
the start of a module before the code, the answer is no.

(I'm using C as an example here, I do similar things in other languages,
adapted according to the language.)

>
>> We don't use Latin to talk about science.  We don't program in Algol 68.
>>   Let's leave C90's limitations to the history books too, as much as we
>> reasonably can.
>
> Factual books still tend to have glossaries at one end of the book or
> the other.
>
> (BTW Algol68 allowed declarations interspersed with statements before C
> did.
>
> I allow the same now (though with function-wide scope to avoid the
> problems above), but tend to use that ability as often as I use 'goto';
> something that is normally best avoided.)

I agree with you about "goto" :-)

Bart

Dec 10, 2021, 1:37:33 PM
With an algorithm in a dynamic language, you don't need explicit types.

When in a complex macro, or in a template body, then precise types don't
matter much either.

Concrete types will be needed when such code is instantiated or
expanded, or when that dynamic code is actually run. But need not be
essential for understanding the code.

David Brown

Dec 10, 2021, 3:03:51 PM
Fair enough - when you have polymorphism, you are defining a function
over a range of types. Even then, however, particular types or part of
the types are important. Maybe you want variables in the function to
have the same type as those of the parameter, or perhaps be related to
them (such as a list of that type). Maybe you want to restrict the
types available, such as accepting any type as long as it is an
arithmetic type or a string type. (How you restrict things is often as
useful as what you support.)

I do a fair amount of programming in Python, which is very dynamic (in
many senses). It is quick and easy to write functions that work on
multiple types, but more often than not you only have one type in mind
for each function you write. In my bigger Python programs, I
desperately miss the ability to specify types (Python type annotations
were introduced long after these programs had already grown huge and
unwieldy - it's hard to add these things afterwards).

Sometimes it is nice to be able to write code that works with /any/
type. But often it is better to be able to explicitly say what you need
of the type - it lets you catch certain kinds of problems faster,
earlier in the compile process (and long before running), and it lets
you make your intentions clear to the reader.

Rod Pemberton

Dec 18, 2021, 9:17:07 PM
On Thu, 9 Dec 2021 13:58:49 +0100
David Brown <david...@hesbynett.no> wrote:

> On 08/12/2021 20:05, Bart wrote:

> > I post criticisms of quite a few languages I come across, although
> > in this group it might be largely C and Algol68 that come up.
> >
> > C figures highly because I can't really get away from it; it's
> > everywhere. It's also the one whose purpose and use-cases most
> > closely match my own.
> >
> > But it also annoys me that it is so deified despite being such a
> > dreadful language.
>
> This is where the communication problem lies - your annoyance is based
> on two incorrect ideas.
>
> First, you think C is "deified" - it is /not/.

Yes, it is. I think it's the God language. I have yet to need another
language, unlike BASIC, Pascal, Fortran, ... The only other language
that comes close to C's power, that I'm familiar with, was an early
version of PL/1, which was like Pascal with pointers. (Pascal
apparently now has pointers.)

There isn't anything you can't program in C with enough effort. And,
most other languages, at one time or another, compiled directly into C
or had apps to convert their code into C to be compiled, including C++.
You can even, with a touch of creativity, do object-oriented
programming in C.

> people who think C is the best language ever

Yes.

> completely flawless (of which there are, to my knowledge, none).

No. C has a bunch of mistakes and dark corners. It's best to program
in a limited "safe" subset of C. Some would argue "portable" instead
of "safe", but whatever ... There are obviously a number of safe C
standards and compilers out there e.g., MISRA, CompCert.

An old post to James, in the middle lists things I think C got correct:
https://groups.google.com/g/comp.lang.misc/c/4VNii2cQ_Zo/m/7SkgWMit_iIJ

> But I also think it is /not/ an appropriate language for
> many uses people make of it, and it is not an appropriate language
> for people who are not able or willing to learn it properly.

C is /always/ an appropriate language choice for a few reasons:

1) C is available everywhere, ready-to-go, works reliably and
consistently, and requires no need to learn a new language
2) C compilers produce highly optimized code or quick binaries
3) C is known by an entire generation of programmers, or a few gens
4) C consistently ranks high on the TIOBE index for usage

C is like English of the programming world. Before that, BASIC.
Arguably, nothing has quite replaced C, although some now recommend
Python. Python looks C-like to me, e.g., C meets Pascal.

> There is certainly a place for criticism, especially constructive
> criticism, in all languages - /none/ are anywhere close to being
> universally perfect. But there is no benefit to anyone in a
> repetitive, out of context and biased stream of abuse and negativity
> towards a language (or anything else, for that matter).

This isn't a language criticism group. It's a language development
group. Bart is developing. I was. James is. Others reading and
posting here are or have, or are interested in the theories.

> >>  I'd react
> >> against anyone else deliberately and repeatedly writing nonsense
> >> about other topics too.
> >
> > You mention lots of things you don't like about C. But it sounds
> > like you don't have much of a choice about it; you have to rely on
> > external tools to make it useful. That's OK, many people are stuck
> > with languages they don't like.
>
> I /do/ like C - I just don't think it is perfect (and certainly not
> perfect for every task). And with good tools used well, it is a very
> pleasant and effective language to work with. The same applies to any
> good software developer with any language - you find a language that
> is suitable for the task and fits your style, you find good tools
> that help with the job, and development processes that work well. If
> you don't have that, you won't like what you are doing and won't do
> it well. The choice of programming language is irrelevant outside
> its suitability for the task.

I would argue that you need to find a way of coding a program that
fits the functionality of the language; then the outcome will be good.

What I usually see with people who dislike C is that they attempt to
code their program the way they want to, or they attempt to implement
their program according to the design that they want, instead of
thinking, "How can this best be implemented in C?" C is a
general-purpose programming language that is exceptional with low-level
constructs, such as integers, pointers, and characters. IMO, it's not
good with floating point, and it's missing complex numbers, etc. So,
it's a language you wouldn't choose to do math intensive stuff, like is
done with Fortran or MatLab etc. If you move beyond those few things
in C, you probably need to redesign your program significantly to fit
that model. E.g., you may need to restructure the program to use more
integers or strings.


--

Rod Pemberton

Dec 18, 2021, 9:17:16 PM
On Wed, 8 Dec 2021 16:36:33 +0100
David Brown <david...@hesbynett.no> wrote:

> On 08/12/2021 12:55, Bart wrote:
> > On 08/12/2021 11:07, David Brown wrote:
> >> On 08/12/2021 10:45, Bart wrote:

> >>> C of course prefers to use unsigned for mixed arithmetic
> >>> (although the precise rules are complex). So here:
> >>
> >> The precise rules are simple, not complex.  Pretending they are
> >> difficult does not help.
> >
>
> What is it with you and your campaign to claim everything C is bad,
> and everything in your useless little private language is good?

Does he know C better than you? Is that why he makes you angry? ...

> It doesn't matter what anyone writes - you /always/ twist the facts,
> move the goalposts or deliberately misinterpret what others write.
> (And yes, your language is useless - no one else will ever use it.
> You've made useful software with it and used it in your work in
> the past. That's great, and genuinely praise-worthy. But it is dead
> now. Move along.)
>

Sigh, that really reminds me of some (incorrect) but pedantic C guys on
comp.lang.c from decades ago.

> So - let's start with some kindergarten logic. Claiming that your
> rules are simpler than C's does not make C's rules complex.

Who knows C's rules besides you? If I seriously need them, I look them
up in a book ... Otherwise, I code around them to not need to know
them, as that leads to fewer coding mistakes than using what I
remember. How long have you been on Usenet? Has it been over 3 or 4
decades? If so, you'll notice that people who "authoritatively" make
statements from memory almost always get them wrong. (Yeah, like you
just did recently on comp.compilers in regards to C type-punning.)

> >> In particular, what does your language give for :
> >>
> >>       int a := 2
> >>       int b := 3
> >>       real c := b / a;
> >>
> >>       println c
> >>
> >>
> >> Does it print 1, or 1.5 ?
> >
> > My languages have two divide operators: "/" and "%".
> >
> > "%" means integer divide. "/" is supposed to be for floating point
> > divide, but that's only on one language; the static one will still
> > do integer divide when both operands are integers.
>
> Genius. Does it also use "and" as a keyword for the remainder after
> division? Nothing says "simple" and "intuitive" like picking
> different meanings for your operators than all other languages.
>

...

> > So M will give 1.0, Q will give 1.5.
> >
>
> That's your two languages that are proudly the same syntax, but handle
> expressions in completely different ways?
>
>
> If you want to keep posting about your own language, please feel free
> - only you can tell if you are making things up as you go along. But
> /please/ stop posting shite about other languages that you refuse to
> understand.
>
> Understand me correctly here - I really don't care if you like C or
> not. I don't care if anyone else here likes it or not, uses it or
> not. I am not interested in promoting C or any other language - I'll
> use what I want to use, and others will use what they want.

Genius. Pick a dead language to program in. That way there is no
chance whatsoever that you'll encounter a new and unknown bug.
Better yet, code your own language so you never have to come across
any unknown bug or only bugs you can fix. Oh, well, that's what Bart
did ... Imagine that. You two are actually in sync.

Did you know that TIOBE ranks programming languages by usage, so you
can check that your skills aren't obsolete? TIOBE currently ranks C++
at 7.73% and C at 11.80%.

While picking a dead language is genius for code development, it's
horrible if your code is to be maintained by others. Then, they'll
need some obscure, bit-rotted, uncompilable piece of crud in order to
compile the project. That will make everyone want to hang you. This
problem seriously affects Linux, as I recently had the "pleasure" of
discovering while recompiling numerous software packages from source,
probably over two hundred.

> But what I /do/ react against is lies, FUD, and misrepresentations. I
> am not "pro-C" - I am "anti-FUD", and it just so happens that your
> bizarre hatred of C means it is C you post rubbish about. I'd react
> against anyone else deliberately and repeatedly writing nonsense about
> other topics too.

Well, that would be you.

Consider your recent incorrect post to comp.compilers about
type-punning, and your insistence that arrays in C aren't actually
just pointers, i.e., pass-by-reference, but instead magically "decay
into" pointers. You need to learn how the subscript operator [] in C
actually works. It takes a pointer and an index, in either order. It
doesn't take arrays. Then, read Dennis Ritchie's papers, where he
states that he made the array syntax match variable declarations,
because not doing so was confusing to newbies. You should then
realize that C only has array declarations, to allocate storage and
check types; there are no actual arrays in C, as they are pointers,
and array syntax is simulated by the subscript operator.


--

Rod Pemberton

unread,
Dec 18, 2021, 9:17:24 PM12/18/21
to
On Wed, 8 Dec 2021 18:13:56 +0100
David Brown <david...@hesbynett.no> wrote:

> On 08/12/2021 17:58, Bart wrote:
> > On 08/12/2021 15:36, David Brown wrote:
> >> On 08/12/2021 12:55, Bart wrote:

> >> What is it with you and your campaign to claim everything C is
> >> bad, and everything in your useless little private language is
> >> good?
> >
> > I said the rules are complex. You said they are simple. I
> > disagreed, and illustrated my point with a chart.
>
> A chart designed purely to make the simple rules of C appear complex -
> it is FUD.

And, some of the simple rules of C ...

C has 14 casting rules.
C has 11 assignment conversion rules.
C has 16 precedence levels.
C has 11 input conversion specifiers.
C has 12 output conversion specifiers.
...

Yeah, I see exactly where you're coming from on C's simple rules.

> You added those of your own language, which is utterly
> irrelevant to C, purely to be able to claim that the rules of your
> language are simple.

Um, no offense, but that seems rather far fetched.

> Note that even if your language's rules are
> simpler in some way, that does /not/ make C's rules complex!

...

> > Are you sure you aren't twisting and making up things yourself?
> >
> >> to make
> >> it look more complicated.
> >
> > I think most people would be surprised at how untidy that chart is.
> > /I/ was.
>
> You seem to find just about everything in C surprising.
>
> But let's be clear here. Do you think people familiar and experienced
> with C programming will find C's rules surprising?

Yes. Some of C's rules are rather bizarre, especially casting rules.

Haven't you ever read Harbison and Steele's "C: A Reference Manual"?
It has plenty of tables describing C's rules for all sorts of things.

> >>>> Does it print 1, or 1.5 ?
> >>>
> >>> My languages have two divide operators: "/" and "%".
> >>>
> >>> "%" means integer divide. "/" is supposed to be for floating point
> >>> divide, but that's only on one language; the static one will
> >>> still do integer divide when both operands are integers.
> >>
> >> Genius.  Does it also use "and" as a keyword for the remainder
> >> after division?  Nothing says "simple" and "intuitive" like
> >> picking different meanings for your operators than all other
> >> languages.
> >
> > "%" was used for integer divide in Pascal. I adopted it in the 1980s
> > when I needed distinct operators.
> >
> > And I use "rem" for integer REMainder instead of "%"; "ixor"
> > instead of "^"; "ior" instead of "|" and "or" instead of "||".
> > Maybe it's just me, but I find them more readable.
> >
> > Why, what do other languages use for integer divide?
>
> Most use /. And in most languages, if they have % operator for
> integers, it means modulus. (Conventions differ regarding rounding
> and signs when dividing by negative integers.)
>

As Bart proved, not "**ALL** other languages" as you claimed (emphasis
added), i.e., that was FUD. You can't dismiss Pascal as some
unimportant language.


--

Rod Pemberton

unread,
Dec 18, 2021, 9:17:59 PM12/18/21
to
On Wed, 8 Dec 2021 16:58:54 +0000
Bart <b...@freeuk.com> wrote:

> On 08/12/2021 15:36, David Brown wrote:
> > On 08/12/2021 12:55, Bart wrote:

> >>> Does it print 1, or 1.5 ?
> >>
> >> My languages have two divide operators: "/" and "%".
> >>
> >> "%" means integer divide. "/" is supposed to be for floating point
> >> divide, but that's only on one language; the static one will still
> >> do integer divide when both operands are integers.
> >
> > Genius. Does it also use "and" as a keyword for the remainder after
> > division? Nothing says "simple" and "intuitive" like picking
> > different meanings for your operators than all other languages.
>
> "%" was used for integer divide in Pascal. I adopted it in the 1980s
> when I needed distinct operators.

I knew I'd seen that somewhere, but it wasn't Forth or PostScript.
The 1980s were the last time I used Pascal. It was too limited. It
didn't have pointers back then. You were restricted to the Pascal
scope or space, and usually couldn't program the machine it was running
on.

> And I use "rem" for integer REMainder instead of "%"; "ixor" instead
> of "^"; "ior" instead of "|" and "or" instead of "||". Maybe it's
> just me, but I find them more readable.
>

As I think I might've mentioned to James, I happen to prefer C's method
of symbolic operators for math, logical, and binary operations, as it
makes it easier for me to read the variables, keywords, and procedure
names, etc., i.e., easier to separate them. This probably came from
Algol, but that was before my time.

--

Bart

unread,
Dec 19, 2021, 7:04:12 AM12/19/21
to
On 19/12/2021 02:19, Rod Pemberton wrote:
> On Wed, 8 Dec 2021 16:58:54 +0000
> Bart <b...@freeuk.com> wrote:
>
>> On 08/12/2021 15:36, David Brown wrote:

>>> Genius. Does it also use "and" as a keyword for the remainder after
>>> division? Nothing says "simple" and "intuitive" like picking
>>> different meanings for your operators than all other languages.
>>
>> "%" was used for integer divide in Pascal. I adopted it in the 1980s
>> when I needed distinct operators.
>
> I knew I'd seen that somewhere, but it wasn't Forth or Postscript.
> Pascal was 1980s the last time I used it. It was too limited. It
> didn't have pointers back then. You were restricted to the Pascal
> scope or space, and usually couldn't program the machine it was running
> on.

The DEC Pascal I used in the late 70s had pointers.

What it didn't have was a useful treatment of arrays, and some other
things necessary in a real world language. But ours was for teaching
purposes.

David Brown

unread,
Dec 19, 2021, 9:25:55 AM12/19/21
to
I see you have recently made a lot of posts in replies to things I have
written here, much of it in a rather confrontational tone.

I don't really want to reply to them all, point for point - it would be
a lot of time and effort and of little interest to most people. Some of
what you write is your own opinion, and I can't argue with it - if /you/
want to deify C and consider it the appropriate choice for all
situations, I guess that's up to you. Some of it was fact - Bart was
right about operators in Pascal. Much of it, however, is piddle. If
you don't understand the details of C, but are convinced that you /do/,
then I doubt if there is anything I can write that will change your
mind. If you want to understand what "array" means in C, then I
recommend you read up about them in the C standards. The same goes for
type punning.

Don't rely on ideas you vaguely remember someone saying in your youth -
read the /standards/. That is what defines the language - not mistaken
ideas or misunderstandings, or things that "always worked that way
before", or what someone wrote about it in the past. It doesn't even
matter if the person in question wrote the first book on C, or the first
compiler, or the first draft of the language, or worked on earlier
versions of the standards. The ghost of Dennis Ritchie could post here
saying what he intended C to be if he likes - the C standards valid
today (C18) say what the language /is/. The poster in comp.compilers
completely failed that test - his own personal opinion about older C
standards does not trump the written text of current standards. (Of
course he is free to have an opinion on what C /should/ have been - just
because the standards committee agrees as a whole on particular wording
for the standard does not mean individuals agree on each point.)

And I'd prefer it if you didn't set up imaginary straw men in your
attempts to prove me wrong in some manner. I know how the [] operator
works in C. (I make as many real mistakes as the next person, and don't
need you to invent more in my name.)



Bakul Shah

unread,
Dec 21, 2021, 1:25:43 PM12/21/21
to
On 12/1/21 4:11 PM, Andy Walker wrote:
> On 01/12/2021 08:43, David Brown wrote:
>> [I wrote:]
>>>    Zero, as a number, was invented
>>> in modern times [FSVO "modern"!].
>> (Historical note:
>> It reached Europe around 1200, but had been around in India, amongst
>> other countries, for a good while before that.
>
>     Yes, but that's nearly always zero as a placeholder, not
> as a number in its own right.  [I'm not convinced by many of the
> claimed exceptions, which often smack of flag-waving.]

Note that Indian mathematicians such as Brahmagupta used negative
numbers as early as the 7th century. Earlier (3rd-2nd century BC),
Pingala used "shunya" to refer to zero. The *notation* of zero as a
place value came later (by the 5th-7th century). This makes me think
that the understanding of zero as a number came earlier than the
notation. [Though, as with many things in Sanskrit, there are multiple
meanings of "shunya" (emptiness, for example) as well as multiple words
for describing the concept of zero!]

anti...@math.uni.wroc.pl

unread,
Jan 23, 2022, 9:34:29 AM1/23/22
to
David Brown <david...@hesbynett.no> wrote:
>
> Whatever kind of numbers you use, you have to apply a few brain cells.
> You can't represent 1/3 with an integer, no matter how big it is. You
> can't represent negative numbers with unsigned types. It's common
> sense, not a "problematic limit".

But common sense is wrong here. One can represent fractions using
integers, and it is quite useful when you need an exact result but
do not want to deal with explicit fractions. More precisely,
one uses modular integers (so in C that would be an unsigned type).
Modulo 2^n, odd numbers like 3 are invertible. So you get
171 with an 8-bit representation, 43691 with 16 bits, 2863311531
with 32 bits and 12297829382473034411 with 64 bits. That
is really not much different from using 2's complement to
represent negative numbers. Of course, in C there is no
special support for this: '+', '-' and '*' work OK, but
for division and I/O you need extra routines. And, if you
want to stick to machine operations, then 1/2 (in fact, any even
number) is problematic because such numbers are not invertible
modulo 2^n (one could use a different modulus, but then all
operations get more complicated).
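The inverses listed above are easy to check; a minimal Python sketch
(the three-argument pow(a, -1, m) computes a modular inverse and has
been available since Python 3.8):

```python
# Check that the listed values really are the inverse of 3 modulo 2^n,
# i.e. that multiplying them by 3 gives 1 mod 2^n.
for bits, inv3 in [(8, 171), (16, 43691), (32, 2863311531),
                   (64, 12297829382473034411)]:
    m = 1 << bits
    assert (3 * inv3) % m == 1      # inv3 behaves as 1/3 modulo 2^bits
    assert pow(3, -1, m) == inv3    # pow(a, -1, m) computes a^-1 mod m

# With that representation, "1/3" times 3 gives 1 in 8-bit unsigned arithmetic:
print((171 * 3) % 256)   # -> 1
```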

> Anyone who finds it surprising that
> you can't subtract 3 from 2 without signed numbers should give up their
> programming career and go back to primary school. We have to have
> /some/ standard of education in this profession!

I am afraid that we need _much_ more than primary school...

--
Waldek Hebisch

David Brown

unread,
Jan 23, 2022, 11:12:26 AM1/23/22
to
On 23/01/2022 15:34, anti...@math.uni.wroc.pl wrote:
> David Brown <david...@hesbynett.no> wrote:
>>
>> Whatever kind of numbers you use, you have to apply a few brain cells.
>> You can't represent 1/3 with an integer, no matter how big it is. You
>> can't represent negative numbers with unsigned types. It's common
>> sense, not a "problematic limit".
>

I'm trying to figure out what you are talking about here when you are
resurrecting a long-finished thread.

> But common sense is wrong here.

No, it is not.

Introducing new operations, or new ways to interpret numbers, does not
help. Being able to do something in /theory/ does not necessarily help
in /practice/.

The rationals are countably infinite. It is therefore possible to
represent any rational number (including negative ones) using a
non-negative integer, by producing an appropriate coding scheme. That
does not help in any real use, however. Similarly, you could call a 1
MB executable an 8-million bit integer, but to what purpose? It's
wonderful for proving results about computable problems, but not for
practical programming.
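Such a coding scheme is easy to exhibit; a sketch (the helper names are
my own, nothing standard) using the classic Cantor pairing function to
assign a distinct non-negative integer to every rational, which, exactly
as said above, is a theoretical device rather than a practical format:

```python
import math
from fractions import Fraction

def z_to_n(z):
    # Bijection from all integers to the non-negative integers.
    return 2 * z if z >= 0 else -2 * z - 1

def n_to_z(n):
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

def pair(a, b):
    # Cantor pairing: a bijection N x N -> N.
    return (a + b) * (a + b + 1) // 2 + b

def unpair(n):
    # Inverse of the Cantor pairing (exact integer square root).
    w = (math.isqrt(8 * n + 1) - 1) // 2
    b = n - w * (w + 1) // 2
    return w - b, b

def encode(q):
    # Code a rational as one non-negative integer via its reduced form.
    q = Fraction(q)                  # reduced, denominator > 0
    return pair(z_to_n(q.numerator), q.denominator - 1)

def decode(n):
    a, b = unpair(n)
    return Fraction(n_to_z(a), b + 1)
```

The encoding is injective on reduced fractions, so decode(encode(q))
round-trips, but of course none of the arithmetic operators do anything
useful on the codes themselves, which is the point being made.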

Invertible numbers in modulo arithmetic are equally useless for normal
arithmetic. (They certainly have their uses in encryption and other
fields.) Who cares if 171 is the inverse of 3 modulo 256? What does it
give you? Calculating that 171 is a time-consuming process. You can't
use it to divide by 3 - you only get a useful answer if the numerator
happens to be divisible by 3. You can't find the inverse of 6. You
can't distinguish it from 171, despite 171 being an entirely different
number from 1/3. It does not give you a useful way to represent fractions.

If I remember this thread correctly, the point was that any finite and
limited representation is going to have limits on what it can represent.

anti...@math.uni.wroc.pl

unread,
Jan 24, 2022, 11:08:32 AM1/24/22
to
Sure, like any computation it needs time.

> You can't
> use it to divide by 3 - you only get a useful answer if the numerator
> happens to be divisible by 3.

Not only. If you have appropriate bounds, say the numerator has an
absolute value less than 10 and the denominator is between 1 and 12,
you can _uniquely_ reconstruct the fraction from such a representation.
Of course, the range of fractions representable in a single byte is
small; if you need more, use a bigger integer type.
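A minimal sketch of that reconstruction, assuming 8-bit arithmetic and
restricting to odd denominators (even ones are not invertible modulo
2^8); the exact bounds and helper names here are illustrative:

```python
from math import gcd

M = 256  # 8-bit modulus

def encode(num, den):
    # Represent num/den in one byte as num * den^-1 mod 256.
    # den must be odd so that it is invertible modulo 2^8.
    return (num * pow(den, -1, M)) % M

def decode(code):
    # Search for the unique reduced fraction with |num| <= 9 and odd
    # denominator 1..11 mapping to this byte.  Uniqueness: if a/b and
    # c/d both map to it then a*d == c*b mod 256, and
    # |a*d - c*b| <= 2*9*11 < 256 forces exact equality.
    for den in range(1, 12, 2):
        for num in range(-9, 10):
            if gcd(abs(num), den) == 1 and encode(num, den) == code:
                return num, den
    raise ValueError("code is not in the representable range")
```

With these bounds, encode(1, 3) gives 171, matching the 8-bit value
quoted earlier in the thread, and decode recovers (1, 3) from it.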

> You can't find the inverse of 6.

Yes, here one needs different modulus, so no longer can directly
use machine arithmetic.

> You
> can't distinguish it from 171,

Well, you need enough bits to have unique representation, there is
no way around this.

> despite 171 being an entirely different
> number from 1/3. It does not give you a useful way to represent fractions.

It is useful as in "used by actual programs" which produce fractions
as results.

> If I remember this thread correctly, the point was that any finite and
> limited representation is going to have limits on what it can represent.

Sure. However, IMO your wording was unfortunate. You probably do
not need such a representation, and normal programming languages have
no support for it. But this is a trade-off based on popular needs
and not an absolute impossibility.

--
Waldek Hebisch

David Brown

unread,
Jan 24, 2022, 2:07:06 PM1/24/22
to
Yes, and that is a key point.

Another important point is that if you want to represent rationals in a
practical and useful manner, this is not the way to do it - a pair of
integers (of some fixed size, or of variable size) is vastly more
convenient for most purposes.

>
>> despite 171 being an entirely different
>> number from 1/3. It does not give you a useful way to represent fractions.
>
> It is useful as in "used by actual programs" which produce fractions
> as results.

No, it will not be - the format is too inconvenient for most purposes.
There may be niche cases where it is useful (I am guessing that
cryptography could be one area, but I am not an expert there).

The same applies to other types of "number", such as doing your
arithmetic over the Galois Field GF(2⁸). Now every number from 1 to 255
has a unique multiplicative inverse. This can let you do marvellous
things - such as guaranteeing a solution to the simultaneous equations
used to restore RAID6 disk sets when two drives are dead. But it also
means that while 20 / 5 is 4, 20 / 3 is 12 and 20 / 6 is 6. It is
useless for "normal" division (or normal multiplication, addition or
subtraction).
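Those division results can be checked with a minimal carryless-multiply
sketch (written here with the RAID6 polynomial x⁸+x⁴+x³+x²+1; for these
small operands the products fit in 8 bits, so no reduction even occurs
and the results hold for any field polynomial):

```python
def gf_mul(a, b, poly=0x11d):
    # Multiply in GF(2^8): shift-and-XOR (carryless) multiplication,
    # reducing by the field polynomial whenever bit 8 appears.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r

# "Division" means multiplying by the unique inverse, so these hold:
assert gf_mul(5, 4) == 20    # i.e. 20 / 5 == 4
assert gf_mul(3, 12) == 20   # i.e. 20 / 3 == 12
assert gf_mul(6, 6) == 20    # i.e. 20 / 6 == 6
```

Note how addition in this field is XOR, which is exactly why the results
are useless for normal arithmetic despite every element being invertible.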

>
>> If I remember this thread correctly, the point was that any finite and
>> limited representation is going to have limits on what it can represent.
>
> Sure. However, IMO your wording was unfortunate. You probably do
> not need such representation and normal programming languages have
> no support for it. But this is tradof based on popular needs
> and not an absolute impossibility.
>

I'll agree with that.

anti...@math.uni.wroc.pl

unread,
Jan 28, 2022, 9:21:48 PM1/28/22
to
Rationals as pairs of integers of fixed size are a toy, useless
if you want to do serious computations. The basic point about rationals
is that there is a tendency to get very big numerators and denominators.
Without simplifying common factors, in many cases one gets exponential
growth in the length of numerators and denominators. Canceling common
factors is expensive, and still there is growth (but only linear).
If _intermediate_ numbers exceed the representable range, the best you
can hope for is a modular result. But if you are after a modular result,
why bother with numerators and denominators?

Note: there are important special cases when the final result is small
but intermediate results, if represented in a naive way, would be
prohibitively big. And even if the final result is big, there are
cases when one can relatively easily increase the modulus so that
it is big enough to uniquely reconstruct the result.

Variable-size numerators and denominators are the general way, relatively
easy to program but possibly inefficient. Sometimes there is no
better way. But there are important cases when modular calculation
gives a result that would be essentially impossible to obtain using
the general representation (think of using several TB of memory instead
of, say, 1 GB).

> >
> >> despite 171 being an entirely different
> >> number from 1/3. It does not give you a useful way to represent fractions.
> >
> > It is useful as in "used by actual programs" which produce fractions
> > as results.
>
> No, it will not be - the format is too inconvenient for most purposes.
> There may be niche cases where it is useful (I am guessing that
> cryptography could be one area, but I am not an expert there).

I wrote my sentence in the present tense; the future is too hard to
predict... You may consider exact computation a niche. But by this
token probably most programs are niche: there is a small number of
widely used popular programs and a long tail of specialized programs
that each have a relatively small number of users. It is pretty clear
that when you count programs (as opposed to users), most programs will
be in this long tail.

BTW: I am not aware of serious use of Z(2^n) in cryptography
(I am not saying it is not used, just that the mainstream
cryptosystems I have heard about do not use it).
From my point of view cryptography is good because, thanks to
cryptography, processor makers got more serious about the performance
of multiprecision arithmetic. But I am really not going deep
into cryptography...

> The same applies to other types of "number", such as doing your
> arithmetic over the Galois Field GF(2⁸). Now every number from 1 to 255
> has a unique multiplicative inverse. This can let you do marvellous
> things - such as guaranteeing a solution to the simultaneous equations
> used to restore RAID6 disk sets when two drives are dead. But it also
> means that while 20 / 5 is 4, 20 / 3 is 12 and 20 / 6 is 6.

Not sure what you mean here: GF(2^n) is different from Z(2^n) and
there is no natural correspondence between elements of GF(2^n)
and Z(2^n). In particular, in GF(2^n) we have 1 + 1 = 0, so the natural
image of 2 = 1 + 1 definitely is not invertible. If you fix an
irreducible polynomial, there is a correspondence between bitstrings of
length n and elements of GF(2^n). But to multiply bitstrings you should
do "carryless multiplication" and then reduce modulo the irreducible
polynomial...

> It is
> useless for "normal" division (or normal multiplication, addition or
> subtraction).

GF(2^n) has uses when you deal with polynomials. But unless
hardware efficiently supports carryless multiplication, it is
usually more efficient to use Z(p) for a prime p > 2. That
is assuming that you care only about polynomials with rational
coefficients (if you deal with Z(2) you may be forced to use GF(2^n)).

--
Waldek Hebisch

David Brown

unread,
Jan 29, 2022, 8:00:42 AM1/29/22
to
I am aware of this, and I agree that rationals as pairs of integers
don't have a lot of uses, and in particular you don't want to do a lot
of arithmetic with them or you have a good chance of getting really
large numbers.

The point is that /if/ you need rationals, that is the way to hold them
and use them.

When you are using other types of arithmetic and division, such as
modular arithmetic or Galois fields, you are not using rational numbers.
It's a different field - a different mathematical structure. They can
often be useful, and much more efficient than rationals, and sometimes
you can use them for calculations that give results which correspond to
results from rational numbers (but calculated more efficiently).

>
>>>
>>>> despite 171 being an entirely different
>>>> number from 1/3. It does not give you a useful way to represent fractions.
>>>
>>> It is useful as in "used by actual programs" which produce fractions
>>> as results.
>>
>> No, it will not be - the format is too inconvenient for most purposes.
>> There may be niche cases where it is useful (I am guessing that
>> cryptography could be one area, but I am not an expert there).
>
> I wrote my sentence in present tense, future is too hard to predict...
> You may consider exact computations as a niche. But by this token
> probably most programs are niche: there is small number of widely
> used popular programs and long tail of specialized programs that each
> have relativly small number of users. It is pretty clear that
> when you count programs (as opposed to users) most programs will
> be in this long tail.

I suppose you could say that, yes.

>
> BTW: I am not aware of serious use of Z(2^n) in crypthography
> (I am not saying it is not used, just that I mainstream
> cryptosystem that heard about do not use it)

Modular arithmetic turns up a lot in cryptography, but not over 2^n.
Normally you are using bases that are calculated from large prime
numbers. (RSA public/private key systems are a popular example.)

> From my point of view crypthography is good because thanks to
> crypthography processor makers got more serious about performance
> of multiprecision arithmetic. But I am really not going deep
> into crypthography...
>

Does the hardware here have other uses? I've seen hardware accelerators
for things like 3DES and AES symmetric cyphers, and for SHA hashes.
I've also seen dedicated chips for elliptical-curve cryptography (of
which I know almost nothing). But I don't know how these could be of
any use for multiprecision integer arithmetic.

>> The same applies to other types of "number", such as doing your
>> arithmetic over the Galois Field GF(2⁸). Now every number from 1 to 255
>> has a unique multiplicative inverse. This can let you do marvellous
>> things - such as guaranteeing a solution to the simultaneous equations
>> used to restore RAID6 disk sets when two drives are dead. But it also
>> means that while 20 / 5 is 4, 20 / 3 is 12 and 20 / 6 is 6.
>
> Not sure what you mean here: GF(2^n) is different than Z(2^n) and
> there is no natural correspondence between elements of GF(2^n)
> and Z(2^n).

Correct.

> In particular in GF(2^n) we have 1 + 1 = 0, so natural
> image of 2 = 1 + 1 definitely is not invertible.

Exactly as you say, you do your multiplications modulo an irreducible
polynomial.

The most important practical choice of representation and polynomial,
since it is used to give efficient RAID6 implementations, is to reduce
modulo x⁸ + x⁴ + x³ + x² + 1. Then your "multiply by 2" operation is :

def times_g(x):
    # Multiply mod x⁸ + x⁴ + x³ + x² + 1
    if x & 0x80:
        return ((x << 1) ^ 0x1d) & 0xff
    else:
        return (x << 1) & 0xff

"2" is /not/ the sum of "1" and "1" - you don't have an additive
generator. You have a multiplicative generator "g" which corresponds to
2. The inverse of 2 corresponds to 0x8e. Since every non-zero number
raised to the power 255 is 1, you can calculate 2^-1 as 2^254, using the
"times_g" operation 254 times to get 0x8e.


> If you fix irreducible
> polynomial there is correspondence between bitstrings of length n
> and elements of GF(2^n). But to multiply bitstrings you should
> do "carryless mutiplication" and then reduce modulo irreducible
> polynomial...
>

Yes.


>> It is
>> useless for "normal" division (or normal multiplication, addition or
>> subtraction).
>
> GF(2^n) has uses when you deal with polynomials. But unless
> hardware efficiently supports carryless mutiplication it is
> usually more efficient to use Z(p) for prime p > 2. That
> is assuming that you care only about polynomials with rational
> cofficients (if you deal with Z(2) you may be forced to use GF(2^n)).
>

GF(2^8) has vital /practical/ uses.

<https://mirrors.edge.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>

James Harris

unread,
Feb 15, 2022, 7:06:49 AM2/15/22
to
On 02/12/2021 22:38, Bart wrote:
> On 02/12/2021 21:25, James Harris wrote:
>> On 02/12/2021 20:11, Bart wrote:

...

> Continuous measurements need to start from 0.0.

Surely you mean 0.1. ;-)

>
> Discrete entities are counted, starting at 0 for none, then 1 for 1 (see
> Xs below).

Counting and indexing are different things. Whether one indexes from 0
or 1 or 197 the count will still be the count.

I agree with you about /discrete/ entities being identifiable starting
from 1, but if any subdivision of the units is possible (either at the
time a program is written or later) then labelling starting from 1 could
well become awkward. That's why indexing from zero is more natural
mathematically, even though it is less natural societally.

>
> Some are in-between, where continuous quantities are represented as lots
> of small steps. (Example: money in steps of £0.01, or time measured in
> whole seconds.)

Again, if one indexes from zero in all cases then the issues of
subdivision no longer apply.

Don't get me wrong. I agree that indexing is more familiar to humans as
starting from 1. We learn to deal in discrete quantities.

...

>> But do you see the point of it? The first century /naturally/ had
>> century number zero, not one, and the N'th century has century number
>>
>>    N - 1
>>
>> IOW the numbering begins at zero.
>
> Define what you mean by numbering first.

By numbering in this case I mean indexing, labelling.

>
> For me it means assigning sequential integers to a series of entities.

Sure.

> But you need an entity to hang a number from. With no entities, where
> are you going to stick that zero?

In terms of centuries we call the first "zero". Why? Because it's
mathematically /natural/ to do so. That's despite it being more familiar
to us as humans to begin counting from 1.

I would add that it takes mental effort to say what century is the 15th,
for example, because it's not labelled 15.

Similarly, it's awkward to refer to the "15th element" of a zero-based
array so I prefer to call it "element 14" then there's no discrepancy.

...

> I'm not sure what you're trying to argue here; that because 0 is used to
> mean nothing, then that must be the start point for everything?

I'm not trying to argue or to win an argument, BTW, just to challenge
your view and explore the issue.

>
> Here are some sets of Xs increasing in size:
>
>              How many X's?   Numbered as?  Number of the Last?
>   --------
>    -         0               -             -
>   --------
>    X         1               1             1
>   --------
>    X X       2               1 2           2
>   --------
>    X X X     3               1 2 3         3
>   --------
>
> How would /you/ fill in those columns? I'd guess my '1 2 3' becomes '0 1
> 2', and that that last '3' becomes '2'.
>
> But what about the first '3' on that last line; don't tell me it becomes
> '2'! (Because then what happens to the '0'?)

In human terms I'd number them as you do but when writing software I've
found that a different way is more consistent, more scalable and more
mathematically natural.

Your How Many column I would have the same as you do.

Your Numbered As column I'd have as 0, 1, 2.

>
> Using your scheme (as I assume it will be); there is too much disconnect:
> a '0' in the first row, and two 0s the second; a '1' in the second, and
> two 1s in the third. Everything is out of step!
>
>> Yes, you are talking about discrete units which are not made of parts.
>
> Yes, arrays of elements that are computer data with no physical dimensions.

Indeed, although if there's even any chance of later subdivision then
1-based indexing becomes mathematically unnatural.

As a compiler writer you will be aware of having to work in zero-based
/offsets/ rather than 1-based indexes.


--
James Harris

James Harris

unread,
Feb 15, 2022, 7:18:18 AM2/15/22
to
On 03/12/2021 00:08, David Brown wrote:
> On 02/12/2021 22:25, James Harris wrote:

...

>> That's not a convention, by the way, but how all numbering works: things
>> with partial phases begin at zero.
>>
> Note, however, that the first century began with year 1 AD (or 1 CE, if
> you prefer). The preceding year was 1 BC. There was no year 0. This
> means the first century was the years 1 to 100 inclusive.
>
> It really annoyed me that everyone wanted to celebrate the new
> millennium on 01.01.2000, when in fact it did not begin until 01.01.2001.

I never understood that annoyance. The exact length of the millennium
was messed up by the Julian calendar (i.e. you may still have been a day
out) and the supposed start point was somewhat arbitrary anyway, with
Jesus likely being born a year or two /before/ what came to be called
year 1 and certainly not at midnight on 31 December!

So to me the notable point was when the calendar rolled over. It was
always going to be an artificial date so why not go with the numbering?

>
> It would have been so much simpler, and fitted people's expectations
> better, if years have been numbered from 0 onwards instead of starting
> counting at 1.

Indeed.


--
James Harris

James Harris

unread,
Dec 8, 2023, 2:28:31 PM12/8/23
to
On 03/12/2021 09:08, David Brown wrote:
> On 02/12/2021 22:42, James Harris wrote:
>> On 02/12/2021 20:49, Dmitry A. Kazakov wrote:
>>> On 2021-12-02 21:31, James Harris wrote:
>>
>> ...
>>
>>>> But to the point, are you comfortable with the idea of the A(2) in
>>>>
>>>>    x = A(2) + 0
>>>>
>>>> meaning the same mapping result as the A(2) in
>>>>
>>>>    A(2) = 0
>>>>
>>>> ?
>>>
>>> Yes, in both cases the result is the array element corresponding to
>>> the index 2. That is the semantics of A(2).
>>
>> Cool. If A were, instead, a function that, say, ended with
>>
>>   return v
>>
>> then what would you want those A(2)s to mean and should they still mean
>> the same as each other? The latter expression would look strange to many.
>>
>
> Do you mean like returning a reference in C++ style?

Hi David, apologies for not replying before. I was just now looking for
old posts that were outstanding in some way. (There may be many which I
have yet to reply to like your one.)

I probably left your post until I found out about C++ references and
never got round to reading up on them.

>
>
> int a[10];
>
> void foo1(int i, int x) {
>     a[i] = x;
> }
>
> int& A(int i) {
>     return a[i];
> }
>
> void foo2(int i, int x) {
>     A(i) = x;
> }
>
> foo1 and foo2 do the same thing, and have the same code. Of course,
> foo2 could add range checking, or offsets (for 1-based array), or have
> multiple parameters for multi-dimensional arrays, etc. And in practice
> you'd make such functions methods of a class so that the class owns the
> data, rather than having a single global source of the data.

I see at https://www.geeksforgeeks.org/references-in-cpp/ code which
includes

void swap(int& first, int& second)
{
    int temp = first;
    first = second;
    second = temp;
}

That's not what I was thinking about. I don't care for it because in a
call such as

swap(a, b)

it's not clear in the syntax that the arguments a and b can be modified
- a calamity of a design, IMO. :-(

But in your case you are referring to a /returned/ value. That appears
to be OK except that to match what I had in mind I think there should be
a const in there somewhere. To illustrate, I was proposing that in a
function, f, one could have

return h

which would /conceptually/ return the address of h (which I guess in
Algol terms means it would return h rather than the value of h).
Crucially, and perhaps at variance with Algol (I don't know) the value
of h (i.e. the value at the returned address) would be read-only to the
caller.

The caller would be able to use the address returned as it could any
other address, but it could not write over the referenced value. If the
callee returned with something like

return a[4]

then it would conceptually return the address of a[4] and, again, the
value at the returned address would be read-only in the caller.

What I've said so far is by default but overwriting would be possible.
To conceptually return the address of a variable which /could/ be
overwritten one would use the rw modifier as in

return rw h

or

return rw a[4]

I'll say no more just now as this is an old topic but I wanted to at
least make a reply.



--
James Harris



David Brown

unread,
Dec 9, 2023, 9:33:57 AM12/9/23
to
On 08/12/2023 20:28, James Harris wrote:
> On 03/12/2021 09:08, David Brown wrote:
>> On 02/12/2021 22:42, James Harris wrote:
>>> On 02/12/2021 20:49, Dmitry A. Kazakov wrote:
>>>> On 2021-12-02 21:31, James Harris wrote:
>>>
>>> ...
>>>
>>>>> But to the point, are you comfortable with the idea of the A(2) in
>>>>>
>>>>>     x = A(2) + 0
>>>>>
>>>>> meaning the same mapping result as the A(2) in
>>>>>
>>>>>     A(2) = 0
>>>>>
>>>>> ?
>>>>
>>>> Yes, in both cases the result is the array element corresponding to
>>>> the index 2. That is the semantics of A(2).
>>>
>>> Cool. If A were, instead, a function that, say, ended with
>>>
>>>    return v
>>>
>>> then what would you want those A(2)s to mean and should they still mean
>>> the same as each other? The latter expression would look strange to
>>> many.
>>>
>>
>> Do you mean like returning a reference in C++ style?
>
> Hi David, apologies for not replying before. I was just now looking for
> old posts that were outstanding in some way. (There may be many which I
> have yet to reply to like your one.)
>
> I probably left your post until I found out about C++ references and
> never got round to reading up on them.
>

I saw my post was dated 03.12 (or 12.03, for any date-backwards
Americans in the audience) and thought it was strange that I'd forgotten
the post from 5 days ago. Then I noticed the year...

I'll try to reply, but forgive me if I've forgotten details of the thread!

>>
>>
>> int a[10];
>>
>> void foo1(int i, int x) {
>>      a[i] = x;
>> }
>>
>> int& A(int i) {
>>      return a[i];
>> }
>>
>> void foo2(int i, int x) {
>>      A(i) = x;
>> }
>>
>> foo1 and foo2 do the same thing, and have the same code.  Of course,
>> foo2 could add range checking, or offsets (for 1-based array), or have
>> multiple parameters for multi-dimensional arrays, etc.  And in practice
>> you'd make such functions methods of a class so that the class owns the
>> data, rather than having a single global source of the data.
>
> I see at https://www.geeksforgeeks.org/references-in-cpp/ code which
> includes
>
> void swap(int& first, int& second)
> {
>     int temp = first;
>     first = second;
>     second = temp;
> }
>
> That's not what I was thinking about. I don't care for it because in a
> call such as
>
>   swap(a, b)
>
> it's not clear in the syntax that the arguments a and b can be modified
> - a calamity of a design, IMO. :-(
>

That is a reasonable view. It is not uncommon in C++ programming to
have a rule that you use pointers when the operands will be changed, and
only use references as "const T&" types. Since the function can't
change data that is passed by const reference, from the caller viewpoint
it is just like passing by value. The only real difference is the
efficiency - whether a real value is passed, or an address pointing to
the value.

> But in your case you are referring to a /returned/ value. That appears
> to be OK except that to match what I had in mind I think there should be
> a const in there somewhere. To illustrate, I was proposing that in a
> function, f, one could have
>
>   return h
>
> which would /conceptually/ return the address of h (which I guess in
> Algol terms means it would return h rather than the value of h).
> Crucially, and perhaps at variance with Algol (I don't know) the value
> of h (i.e. the value at the returned address) would be read-only to the
> caller.

That would be returning a const reference, in C++ terms. And like const
reference parameters, it is semantically very similar to returning a
value, but with possible efficiency differences. The only thing you
need to be careful about is whether the object pointed to still exists -
you don't want to return a reference to a local object!

Andy Walker

unread,
Dec 9, 2023, 4:40:14 PM12/9/23
to
On 08/12/2023 19:28, James Harris wrote:
> [...] To illustrate, I was
> proposing that in a function, f, one could have
> return h
> which would /conceptually/ return the address of h (which I guess in
> Algol terms means it would return h rather than the value of h).

In Algol terms, it would return "h" coerced to the declared
type of the function. That could be "h" itself, if it already has
that type, but more generally could involve any of the coercions
available to the compiler, singly or combined together as defined
in the syntax by the Revised Report. Some notes:

-- An applied occurrence of an identifier in Algol /always/ means
a priori the object defined by the defining occurrence of that
identifier.

-- The return value from a function is a "strong" position; the
compiler knows exactly, from the declaration of the function,
what type is required. So all available coercions can be used.
This is in contrast to other positions, such as operands and
the LHS of an assignment, where only some coercions are allowed.
This is all defined by the syntax, which is carefully designed
to be unambiguous.

-- If "h" is a variable, ie an address in the computer, then "h"
/is/ that address. If you want to use instead the value at that
address, then "h" must be dereferenced, almost always implicitly,
in accordance with the syntax.

-- Trying to explain all this always makes Algol seem much more
complicated than it really is. IRL, everything just works, and
does so in a natural and convenient way. C-style languages go
around the houses to achieve the same effects, but we are all
[sadly] used to that.

> Crucially, and perhaps at variance with Algol (I don't know) the
> value of h (i.e. the value at the returned address) would be
> read-only to the caller.

Algol has no concept of "read-only variables". There were
some proposals for "in" and "out" parameters, but they never got any
real traction. So, in the present context, it depends on what the
returned value is. Eg, a procedure returning int of course returns
something "read only"; you can't overwrite [eg] 2 with 3. If OTOH
it returns a variable [such as a "ref int"], a place in the computer,
then you can [also of course] change what value is stored there.
Thus, "2 := 3" is forbidden, but "int h := 2; ...; h := 3;" is fine.
Then you can't change "h" [it will remain the same place], but you
can [of course] change whether a pointer points at "h" or at some
other place. [The use of "letter aleph" in the RR, which creates
identifiers known to the OS but that a programmer cannot access in
the program, solves many, but not all, of the real-life problems for
which other languages need "const".]

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Richards

James Harris

unread,
Dec 12, 2023, 11:00:24 AM12/12/23
to
At the time I had to drop out of Usenet for a while but was aware that
there were a number of posts which should be replied to. I'll try to get
to some of the others, as well.


--
James Harris

