reference type for C

roland....@gmail.com

May 16, 2013, 10:08:43 AM
I like the reference type that has been introduced with C++
I wondered why isn't that feature retrofitted in C.
Is there something hindering it?

thx - roar -

Malcolm McLean

May 16, 2013, 10:51:51 AM
References are really just pointers in disguise, though in C++ they have the
feature that they can never be null.
In C, it's better to keep everything explicit. That's the design principle
of the language, to expose as much as possible of the underlying machine
operations to the programmer. So if you're passed an address, you can
examine its bits to see which area of memory it came from, for example.

--
Visit Malcolm's website
http://www.malcolmmclean.site11.com/www

James Kuyper

May 16, 2013, 10:52:33 AM
Reference types in C++ don't allow you to do anything that can't also be
done, with slightly different syntax, using pointers. Syntactic
convenience is often sufficient to justify a feature, such as += or ++;
but in this case the increased convenience is relatively minor. I'm not
saying that this can't be done - but there's just not a lot of demand
for it. There is a minor inconvenience: because references and pointers
provide 2 different ways of doing the same thing, wherever the C++
standard says something about pointers, it often has to also say
something similar about references. Adding references to C would require
adding similar text in a great many places in the C standard.
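
To make the "slightly different syntax" point concrete, here is a minimal C++ sketch. The function names are hypothetical and not from this thread; both versions modify the caller's variable and differ only in how the call site and the body are spelled.

```cpp
#include <cassert>

// Hypothetical helpers, named only for this illustration.
// Pointer version: the caller passes an address and the body dereferences it.
void add_via_pointer(int *target, int delta) { *target += delta; }

// Reference version: same effect, but no '&' at the call site
// and no '*' in the body.
void add_via_reference(int &target, int delta) { target += delta; }

// Runs both versions on equal inputs and checks that they agree.
bool both_forms_agree() {
    int a = 1;
    int b = 1;
    add_via_pointer(&a, 2);    // explicit address-of
    add_via_reference(b, 2);   // looks like pass-by-value at the call site
    assert(a == 3);
    return a == b;
}
```

Calling `both_forms_agree()` from any test harness should return true; the only visible difference between the two helpers is notation.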


ralph

May 16, 2013, 11:39:15 AM
On Thu, 16 May 2013 07:08:43 -0700 (PDT), roland....@gmail.com
wrote:

>I like the reference type that has been introduced with C++
>I wondered why isn't that feature retrofitted in C.
>Is there something hindering it?
>

Object-oriented programming (and design/analysis) has a concept of
Identity. A Reference in OO is simply a thingy containing enough
information to uniquely refer to, allow access to, a specific object.

C++ is an OOPL, essentially designed to facilitate coding OO
solutions, thus it is no surprise that C++ provides a 'reference'
thingy.

While EVERYBODY and his friends and family *knows* a C++ reference is
a 'pointer', after all they are implemented as 'pointers' (albeit with
a few restrictions) - the fact is the C++ standard goes to great
lengths to NOT define how a reference is implemented.

As C has no need to introduce another higher-level concept it boils
down to what Mr. Kuyper pointed out - "syntactic
convenience".

-ralph

Martin Shobe

May 16, 2013, 3:24:14 PM
On 5/16/2013 9:52 AM, James Kuyper wrote:
> On 05/16/2013 10:08 AM, roland....@gmail.com wrote:
>> I like the reference type that has been introduced with C++
>> I wondered why isn't that feature retrofitted in C.
>> Is there something hindering it?
>
> Reference types in C++ don't allow you to do anything that can't also be
> done, with slightly different syntax, using pointers.

While this isn't the place to go too deep into C++'s bag of tricks,
there are things that you can do with references that you can't do with
pointers. For example, you can extend the lifetime of a temporary.

Martin Shobe

BGB

May 16, 2013, 11:33:10 PM
(actually wrote earlier today, but I had accidentally sent as an email...).


On 5/16/2013 9:52 AM, James Kuyper wrote:
albeit there is a possible lazier solution:
a reference is (formally) just a slight syntactic sugar over a pointer.
(IOW: just define it as being a pointer... possibly just leaving a NULL
reference as undefined behavior).

maybe the behavior is: a variable declared as a reference will always
behave as-if it was being operated on with the '*' operator.


possibly a keyword could be added both for declarations, and partly to
take the role of the '&' operator in these cases ('_Ref' in an
expression canceling out its use in a declaration, giving access to the
raw pointer).


say, if added as a keyword:
int foo(_Ref int r) //internally 'int *r'
{
    r=3; //declaration suppresses need for explicit "*r"
}

int i;
foo(&i); //creates pointer/reference to int
foo(_Ref i); //would do basically the same thing as &i here

int bar(_Ref int r)
{
    int *pr;
    pr=_Ref r; //maybe less conceptually ambiguous than '&r'.
    ...
    _Ref r=pr; //maybe allow assigning references
}

optionally, the compiler could add sanity checks as-needed to try to
avoid NULL references.


alternatively, it could be possible to just mimic the C++ behavior.

Ian Collins

May 17, 2013, 1:45:52 AM
James Kuyper wrote:
> On 05/16/2013 10:08 AM, roland....@gmail.com wrote:
>> I like the reference type that has been introduced with C++
>> I wondered why isn't that feature retrofitted in C.
>> Is there something hindering it?
>
> Reference types in C++ don't allow you to do anything that can't also be
> done, with slightly different syntax, using pointers.

It's probably fairer to say reference types in C wouldn't allow you to do
anything that can't also be done, with slightly different syntax, using
pointers.

In C++, they enable rather a lot.

--
Ian Collins

roland....@gmail.com

May 17, 2013, 4:58:47 AM
Thanks for your contributions so far.

I'd like to remind though that the question was : is there anything in the C standard so far that would hinder implementation of this feature.

I don't want to duplicate here all the discussions one can see on the web re. the feature in C++ (esp. ref being pointers in disguise in the implementation)

(and if you wonder why I would like to have it in C : reference cannot be made to refer to another var after it is defined / I know you can go through hoops to make this statement false, and *yes* I like the syntactic sugar that reference can make code more "unobstructed" for uses of const pointers) -- but again, please no rant about my own tastes.

gwowen

May 17, 2013, 6:38:56 AM
On May 17, 4:33 am, BGB <cr88...@hotmail.com> wrote:
> albeit there is a possible lazier solution:
> a reference is (formally) just a slight syntactic sugar over a pointer.
> (IOW: just define it as being a pointer... possibly just leaving a NULL
> reference as undefined behavior).

In non-pathological usage, a C++ reference cannot be null. That's
caught at compile time, which is a *massive* win over introducing
another bit of undefined behaviour.

A C++ reference-to-T cannot be mistaken for the start of any array
of T, or an iterator into a collection of T's, so you can't do pointer
arithmetic on them.

NULL pointer dereferences and incorrect pointer arithmetic are two of
the most common causes of bugs in C/C++ programs.

To say "you can't do anything with a reference that you can't do with
a pointer" is to miss the point. The things that you *can't* do to a
reference are a feature (but can do to a pointer).

In C, a pointer is one-or-more of "a reference to an object", "an
iterator into an array of objects", "the address of some memory", "a
reference to nothing".

In C++, a reference is "a reference to an object".

I don't consider that restriction to be "slight syntactic sugar". It
actually allows the compiler to do type checking so that the function
caller's and the function writer's intentions are the same.

Of course, they'll never be introduced to C, because of the
minimalist nature of C. Introducing new types similar to old types -
regardless of the type safety advantages - will not happen.

James Kuyper

May 17, 2013, 7:36:39 AM
On 05/17/2013 06:38 AM, gwowen wrote:
...
> A C++ reference-to-T cannot be mistaken for the start of any array
> of T, or an iterator into a collection of T's, so you can't do pointer
> arithmetic on them.

Given
int original;
int &reference = original;
int *pointer = &original;

then reference corresponds to *pointer, and &reference corresponds to
pointer. You can do pointer arithmetic on &reference.
--
James Kuyper

Ian Collins

May 17, 2013, 7:42:51 AM
James Kuyper wrote:
> On 05/17/2013 06:38 AM, gwowen wrote:
> ....
>> A C++ reference-to-T cannot be mistaken for the start of any array
>> of T, or an iterator into a collection of T's, so you can't do pointer
>> arithmetic on them.
>
> Given
> int original;
> int &reference = original;
> int *pointer = &original;
>
> then reference corresponds to *pointer, and &reference corresponds to
> pointer. You can do pointer arithmetic on &reference.

You can do pointer arithmetic on &original, so what's your point?

--
Ian Collins

James Kuyper

May 17, 2013, 8:06:52 AM
That the difference between references and pointers is primarily
syntactic, and that the inability to do pointer arithmetic on references
does NOT constitute an exception to that fact - a single operator is the
only syntactic difference between a reference and something that you can
indeed do pointer arithmetic on. Properly, I should have used

int * const pointer = & original;

to make the analogy closer.
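
As a rough C++ sketch of that analogy (the function name is invented for illustration), the following checks that a reference and a const pointer to the same object behave alike, and that arithmetic on `&reference` is ordinary pointer arithmetic:

```cpp
#include <cassert>

// Hypothetical check, not from the thread: compares a reference with
// a const pointer that designates the same int.
bool reference_matches_const_pointer() {
    int original = 0;
    int &reference = original;
    int *const pointer = &original;  // cannot be reseated, like the reference

    reference = 7;                   // corresponds to *pointer = 7
    assert(*pointer == 7 && original == 7);

    // &reference yields the same address that the pointer holds.
    assert(&reference == pointer);

    // One past a single object is a valid pointer value to compute and
    // compare (though not to dereference), so this is well defined.
    assert(&reference + 1 == pointer + 1);
    return true;
}
```

The single `&` operator is all that separates the reference from something pointer arithmetic can be applied to.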
--
James Kuyper

roland....@gmail.com

May 17, 2013, 8:11:26 AM
For me, gwowen has been the best advocate for why references would be a very good thing to have in C.

I would also say that using a reference has a semantic value (the alias role) for the reader of the code, which (as gwowen said) is lost if one uses the army-knife pointer - in that sense it helps legibility.

But what I am hearing is that it will never make it, either because people might not see the value of it, or because you can do without it, or even because updating the standard would be too difficult -- sob...

Ed Prochak

May 17, 2013, 8:37:31 AM
On Thursday, May 16, 2013 3:24:14 PM UTC-4, Martin Shobe wrote:

>
> While this isn't the place to go too deep into C++'s bag of tricks,
> there are things that you can do with references that you can't do with
> pointers. For example, you can extend the lifetime of a temporary.
>
>
> Martin Shobe

I don't get it. In C I can control the lifetime of everything. I don't have a garbage collector taking things away behind my back. So I don't see this as an advantage of references.

ed

Malcolm McLean

May 17, 2013, 9:29:16 AM
On Friday, May 17, 2013 11:38:56 AM UTC+1, gwowen wrote:
> On May 17, 4:33 am, BGB <cr88...@hotmail.com> wrote:
>
>
> In non-pathological usage, a C++ reference cannot be null. That's
> caught at compile time, which is a *massive* win over introducing
> another bit of undefined behaviour.
>
>
Most aircraft crashes are caused by controlled flight into terrain.
Most errors can be corrected by automatic systems. But not the pilot
explicitly telling the aircraft to fly into a mountain, because, from
the aircraft's point of view, everything is correct and normal.

Undefined behaviour is bad, but it's only "undefined" from the point of view
of the C standard. Null pointer dereferences can be, and usually are, defined
to stop the program with an error message. Depending on the situation,
that's usually a lot better than wrong results.
>
>
> NULL pointer dereferences and incorrect pointer arithmetic are two of
> the most common causes of bugs in C/C++ programs.
>
Pointer arithmetic is seldom necessary and usually indicates old-fashioned
programming. Pointers usually point either to structures or arrays, and
don't need modifying.
Null pointer dereferences are a bug, but they aren't usually the root cause
of a bug. If a pointer that should be valid is in fact null, then usually
that's because there's a logic error somewhere upstream. So you have to fix
the bug at the place the pointer became null, not where it was dereferenced.
References might help slightly, but you can't overcome logic errors with
syntactical constructs. The flip side of references never being null is that
you can't use null as a sentinel. If you use a special "empty" sentinel value,
you've got far more potential for logic errors than if you use null pointers.
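
A minimal sketch of the null-as-sentinel idiom, written in C++ so it can sit next to the reference examples. The names `find_first` and `contains` are hypothetical; the point is only that a pointer return type has a built-in "nothing found" value, which a reference return type lacks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical lookup: returns a pointer to the first element equal to
// `wanted`, or a null pointer as the "not found" sentinel.
const int *find_first(const int *items, std::size_t count, int wanted) {
    assert(items != nullptr || count == 0);  // precondition for this sketch
    for (std::size_t i = 0; i < count; ++i) {
        if (items[i] == wanted) {
            return &items[i];
        }
    }
    return nullptr;  // nothing matched
}

// The caller has to test the sentinel before dereferencing.
bool contains(const int *items, std::size_t count, int wanted) {
    return find_first(items, count, wanted) != nullptr;
}
```

A version returning `const int &` would need some other way to report failure, such as a distinguished "empty" object or an exception.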

Ike Naar

May 17, 2013, 9:31:39 AM
Alternatively, one can view a reference not as some kind of pointer,
but as another name for a given object.
Consider the two code fragments:

int original;
int &reference = original;
// point A

and

int reference;
int &original = reference;
// point A

The situation at point A in the first fragment is indistinguishable
from the situation at point A in the second fragment.
At both locations, 'original' and 'reference' have type 'int' and refer to
a common object. At point A, it's impossible to tell which one of 'original'
or 'reference' is the original object, and which one is the reference.

Isn't it a bit odd, then, to regard 'reference' as some kind of pointer
in the first code fragment, but not in the second fragment?
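
A small C++11 sketch of this "another name" view; the function name is made up for illustration. It checks that, used as expressions, both names are lvalues of the same type and designate the same object:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical check of the aliasing view described above.
bool names_are_indistinguishable() {
    int original = 0;
    int &reference = original;
    // point A

    // Used as expressions, both names are lvalues of type int.
    static_assert(std::is_same<decltype((original)),
                               decltype((reference))>::value,
                  "both names are int lvalues");

    // Both names designate the same object.
    assert(&original == &reference);

    reference = 5;         // a write through one name ...
    return original == 5;  // ... is visible through the other
}
```

One caveat, as far as I can tell: `decltype(reference)` without the extra parentheses reports the declared type `int &`, so C++11 code can still tell the two declarations apart at compile time even though the expressions behave identically.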

Paul N

May 17, 2013, 9:42:02 AM
I don't think there's anything in the standard that would hinder
having references. In particular, if a function is called before a
prototype has been seen, and so the compiler rashly treats it as a
call-by-value rather than a call-by-reference, it will complain when
it does see the prototype.

I think it's more a matter of philosophy. In C's predecessor BCPL,
everything was the same type, so it could be arranged that calling a
function simply involved bunging the arguments on the stack and the
function could simply reel them off. (Or, indeed, use the stack
locations as where to store the parameters.) C does of course have
types, but I think it would be considered a leap too far for a
function call that looks like it is referring to a variable to
actually be pushing a disguised pointer instead. If you want a language
with different trade-offs, then C++ is there and waiting; but if you
don't, then you don't.

[PS I hope the above comes out OK as regarding line lengths and
quoting - I'm using the old Google groups which seemed to do this
right, but this time it looks suspicious.]

Noob

May 17, 2013, 10:02:53 AM
gwowen wrote:

> In non-pathological usage, a C++ reference cannot be null. That's
> caught at compile time, which is a *massive* win over introducing
> another bit of undefined behaviour.

I had thought "restrict" might help:

$ cat ref.c
int foo(int *restrict const p) { return *p; }
int bar(void) { return foo(0); }
$ gcc -std=c99 -pedantic-errors -Wall -Wextra -Os -S ref.c
/* NO WARNING */

I expected a warning.
(AFAIU, restrict pointers must be valid, foo(0) is not allowed.)
Perhaps I should test with something more recent than 4.6.3

Regards.

SG

May 17, 2013, 10:24:38 AM
On May 16, 4:08 pm, roland.arth...@gmail.com wrote:
> I like the reference type that has been introduced with C++

What do you like about it?
What would it enable in C that was not possible before?

It's like Ian said: There are a lot of situations where references
help a lot in C++. I don't see these kinds of situations in C.

> I wondered why isn't that feature retrofitted in C.
> Is there something hindering it?

I guess there has to be enough bang for the buck. Is there?

Cheers!
SG

Martin Shobe

May 17, 2013, 11:42:29 AM
In C++, some expressions result in the creation of temporary objects.
Usually these objects are destroyed when the program has finished
evaluating the expression[1] they were created in. If a temporary object
is bound to a reference, then (with a few exceptions) the object isn't
destroyed until the reference is. Garbage collection (which C++ isn't
required to have) has nothing to do with it.

Martin Shobe

[1] This expression is the one referred to in the C++ standard as the
full-expression.
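
A minimal sketch of that rule, assuming a hypothetical `Tracker` type that simply counts how many of its instances are alive. It compares a temporary bound to a const reference with one that is discarded:

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct Tracker {
    static int alive;
    Tracker() { ++alive; }
    Tracker(const Tracker &) { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

// Returns a temporary Tracker by value.
Tracker make_tracker() { return Tracker(); }

// The temporary is bound to a const reference, so it survives until
// `bound` goes out of scope; one Tracker is alive at the return statement.
int alive_while_bound() {
    const Tracker &bound = make_tracker();
    (void)bound;  // silence unused-variable warnings
    return Tracker::alive;
}

// The temporary is not bound to anything, so it is destroyed at the end
// of the full-expression; no Tracker is alive at the return statement.
int alive_after_discarded_call() {
    make_tracker();
    return Tracker::alive;
}
```

Under these assumptions `alive_while_bound()` should report one live object and `alive_after_discarded_call()` none, whether or not the compiler elides the copy in `make_tracker`.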

James Kuyper

May 17, 2013, 12:17:16 PM
On 05/17/2013 09:29 AM, Malcolm McLean wrote:
...
> Pointer arithmetic is seldom necessary and usually indicates old-fashioned
> programming. Pointers usually point either to structures or arrays, and
> don't need modifying.

How do you access any element of an array other than the first without
pointer arithmetic? array[i] is defined to behave as *(array+i), and
therefore does involve pointer arithmetic.
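
The equivalence can be checked directly. This is a small sketch with an invented function name; it applies equally to built-in arrays in C and C++:

```cpp
#include <cassert>

// Hypothetical check that subscripting and pointer addition agree.
bool subscript_is_pointer_arithmetic() {
    int array[4] = {10, 20, 30, 40};
    for (int i = 0; i < 4; ++i) {
        assert(array[i] == *(array + i));  // the defining equivalence
        assert(&array[i] == array + i);    // same element address
        assert(i[array] == array[i]);      // addition commutes, so this works too
    }
    return true;
}
```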

James Kuyper

May 17, 2013, 12:24:43 PM
The main thing that "restrict" does is give undefined behavior to
certain uses of the restricted pointer. Those uses are not constraint
violations, so no diagnostics are required. Note that foo(0) would have
had undefined behavior even if 'restrict' had not been used.


Malcolm McLean

May 17, 2013, 7:01:21 PM
On Friday, May 17, 2013 5:17:16 PM UTC+1, James Kuyper wrote:
> On 05/17/2013 09:29 AM, Malcolm McLean wrote
>
> How do you access any element of an array other than the first without
> pointer arithmetic? array[i] is defined to behave as *(array+i), and
> therefore does involve pointer arithmetic.
>
Counting on isn't usually considered to be arithmetic, though it's
a semantic argument. Someone who knows the names, order and value
of the numbers in his native language is not considered to be
numerate.
If you take the value of the pointer expression and store it,
then I'd define that as pointer arithmetic. In the olden days,
pointer arithmetic could be faster than array notation, because

for(i=0;i<N;i++)
    array[i] = 0;

would generate N additions,
whilst

for(ptr=array;N--;ptr++)
    *ptr = 0;

would generate N increments, which were a single machine instruction.

But nowadays you'd be unlucky to find a compiler which didn't optimise
the first loop.
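
For reference, here are the two loop styles as complete functions. The function names are hypothetical, and the sketch only shows that both forms produce the same result; it makes no claim about the code any particular compiler generates.

```cpp
#include <cassert>
#include <cstddef>

// Array-notation version: index i is added to the base on each access.
void zero_by_index(int *array, std::size_t n) {
    for (std::size_t i = 0; i < n; i++)
        array[i] = 0;
}

// Pointer-stepping version: the pointer itself is incremented.
// Note that n is consumed by the loop condition.
void zero_by_pointer(int *array, std::size_t n) {
    for (int *ptr = array; n--; ptr++)
        *ptr = 0;
}

// Checks that both versions clear every element.
bool loops_agree() {
    int a[5] = {1, 2, 3, 4, 5};
    int b[5] = {1, 2, 3, 4, 5};
    zero_by_index(a, 5);
    zero_by_pointer(b, 5);
    for (std::size_t i = 0; i < 5; i++)
        assert(a[i] == 0 && b[i] == 0);
    return true;
}
```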

glen herrmannsfeldt

May 17, 2013, 8:31:38 PM
Malcolm McLean <malcolm...@btinternet.com> wrote:
> On Friday, May 17, 2013 5:17:16 PM UTC+1, James Kuyper wrote:
(snip)
>> How do you access any element of an array other than the first without
>> pointer arithmetic? array[i] is defined to behave as *(array+i),
>> and therefore does involve pointer arithmetic.

> Counting on isn't usually considered to be arithmetic, though it's
> a semantic argument. Someone who knows the names, order and value
> of the numbers in his native language is not considered to be
> numerate.

> If you take the value of the pointer expression and store it,
> then I'd define that as pointer arithmetic. In the olden days,
> pointer arithmetic could be faster than array notation, because

> for(i=0;i<N;i++)
> array[i] = 0;
> would generate N additions,
> whilst

> for(ptr=array;N--;ptr++)
> *ptr = 0;

Yes, the latter is what I would call "using pointer arithmetic".

> would generate N increments, which were a single machine instruction.

As well as I remember the stories, it was for processors like the
PDP-11 where one worried about such.

Well, it gets more interesting when you do:

for(i=0;i<N;i++) array[3*i] = 0;

where the compiler might do a multiply, on a processor with a
slow (or no) multiply instruction.

But this kind of thing was done by the IBM Fortran H compiler
over 40 years ago, so it isn't new technology. Doing it in small
memory is harder, though.

> But nowadays you'd be unlucky to find a compiler which
> didn't optimise the first loop.

And, often the pointer form is less readable.

-- glen

Noob

May 18, 2013, 11:40:02 AM
James Kuyper wrote:
> On 05/17/2013 10:02 AM, Noob wrote:
>> gwowen wrote:
>>
>>> In non-pathological usage, a C++ reference cannot be null. That's
>>> caught at compile time, which is a *massive* win over introducing
>>> another bit of undefined behaviour.
>>
>> I had thought "restrict" might help:
>>
>> $ cat ref.c
>> int foo(int *restrict const p) { return *p; }
>> int bar(void) { return foo(0); }
>> $ gcc -std=c99 -pedantic-errors -Wall -Wextra -Os -S ref.c
>> /* NO WARNING */
>>
>> I expected a warning.
>> (AFAIU, restrict pointers must be valid, foo(0) is not allowed.)
>> Perhaps I should test with something more recent than 4.6.3
>
> The main thing that "restrict" does is give undefined behavior to
> certain uses of the restricted pointer. Those uses are not constraint
> violations, so no diagnostics are required.

I had QoI in mind, rather than compliance.

> Note that foo(0) would have
> had undefined behavior even if 'restrict' had not been used.

Right. I shouldn't have given the definition of foo,
just the prototype.

int foo(int *restrict const p);
int bar(void) {
return foo(0);
}

In that situation, I expect a "good" compiler (in terms of
QoI) to flag the UB.

Regards.

BartC

May 18, 2013, 4:28:05 PM


<roland....@gmail.com> wrote in message
news:fe635f20-1124-4b36...@googlegroups.com...
If that's the only C++ feature you need, why not just use C++, and use
mainly C programming style?

--
bartc

roland....@gmail.com

May 19, 2013, 2:02:37 AM
> If that's the only C++ feature you need, why not just use C++, and use
>
> mainly C programming style?
>
> --
>
> bartc

You are right, I thought about doing that.
I only wondered whether there were areas where I would get unanticipated side effects, i.e. where a C++ compiler would compile C code differently than a C compiler would.

Other features that I would like to have in C are function overloading and default parameters. But I think they sometimes render your code less obvious when C/C++ automatic conversions happen.

-roar

Bart van Ingen Schenau

May 19, 2013, 5:35:48 AM
On Sat, 18 May 2013 23:02:37 -0700, roland.arthaud wrote:

>> If that's the only C++ feature you need, why not just use C++, and use
>>
>> mainly C programming style?
>>
>> --
>>
>> bartc
>
> You are right, I thought about doing that. I only wondered whether there
> were areas where I would get unanticipated side effects, i.e. where a
> C++ compiler would compile C code differently than a C compiler would.

There are a few less used, darker corners of C and C++, where a C and a
C++ compiler silently give different results. The most notable is the type
of character literals (and consequently, the result of sizeof on them):

#include <stdio.h>
int main()
{
    if (sizeof(char) == sizeof(int))
        puts("Can't tell. char and int have same size");
    else if (sizeof('a') == 1)
        puts("C++");
    else
        puts("C");
    return 0;
}

This is the only silent difference between C and C++ that I am aware of.
If there are any others, they will likely affect you even less than this
one.

Of course, there are also some C constructs that are not valid in C++
(e.g. implicit conversion from void* to T*), but about those the compiler
will complain loudly enough.

>
> Other features that I would like to have in C are function overloading
> and default parameters. But I think they sometimes render your code
> less obvious when C/C++ automatic conversions happen.

If you use them incorrectly, then they can certainly harm the
understandability of the code, but that holds also for most of the
constructs that C already has.

>
> -roar

Bart v Ingen Schenau

Jorgen Grahn

May 19, 2013, 6:40:02 AM
On Sun, 2013-05-19, roland....@gmail.com wrote:

[attribution lost; apparently bartc]

>> If that's the only C++ feature you need, why not just use C++, and use
>> mainly C programming style?

> You are right, I thought about doing that.

Slightly easier and quicker than getting a new C feature approved and
added to all compilers you care about, teaching coworkers to like and
use it ...

> I only wondered whether there were areas where I would get
> unanticipated side effects, i.e. where a C++ compiler would compile
> C code differently than a C compiler would.

There are some such differences, but it's more common to simply get
compilation errors on things like void pointers.

In general, if you code in C++ (in order to have stuff like
references), you have to accept that you are, in fact, coding in C++.
You cannot pretend it's your own C-with-references dialect.

> Other features that I would like to have in C are function
> overloading and default parameters. But I think they sometimes
> render your code less obvious when C/C++ automatic conversions
> happen.

It is true that in C++ you can lose yourself good in combinations of
overloading, default parameters and constructors which lack the
"explicit" keyword. So don't do that!

Nothing wrong with mild use of overloading though; it's one of the C++
features I miss the most in C.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Malcolm McLean

May 19, 2013, 7:39:00 AM
On Sunday, May 19, 2013 11:40:02 AM UTC+1, Jorgen Grahn wrote:
> On Sun, 2013-05-19, roland....@gmail.com wrote:
>
>
> It is true that in C++ you can lose yourself good in combinations of
> overloading, default parameters and constructors which lack the
> "explicit" keyword. So don't do that!
>
One feature tends to pull in another.
For instance if you use the "class" keyword, you're going to need to allow
a constructor to fail. In C, you return a null pointer. In C++, you can't do
this, so the only real answer is to throw an exception.
Then in C we often need twin constructors, e.g.

COMPLEX *complex(double real, double imaginary);
COMPLEX *complexpolar(double theta, double magnitude);

In C++ you've got to overload the two functions. But in C they've got the same
signature. So it needs to be


class Complex{
public:
    Complex(double real, double imaginary);
    Complex(Angle &theta, double magnitude);
};

and in fact you'll probably want Angle templated before long.

James Kuyper

May 19, 2013, 12:10:04 PM
On 05/17/2013 07:01 PM, Malcolm McLean wrote:
> On Friday, May 17, 2013 5:17:16 PM UTC+1, James Kuyper wrote:
...
>> How do you access any element of an array other than the first without
>> pointer arithmetic? array[i] is defined to behave as *(array+i), and
>> therefore does involve pointer arithmetic.
>>
> Counting on isn't usually considered to be arithmetic, though it's
> a semantic argument. Someone who knows the names, order and value
> of the numbers in his native language is not considered to be
> numerate.

I'm unfamiliar with the use of the phrase "count on" with any meaning
other than "rely upon". So are the dictionaries I've consulted. What do
you mean by that? I can't imagine a reasonable definition of "pointer
arithmetic" that excludes pointer+integer, which is, in my opinion, the
most prototypical example of pointer arithmetic.

> If you take the value of the pointer expression and store it,
> then I'd define that as pointer arithmetic.

I consider "pointer arithmetic" to refer exclusively to the use of
arithmetic operators when one of the operands is a pointer value. In
other words, pointer+integer, integer+pointer, pointer-integer,
pointer-pointer, pointer+=integer, pointer-=integer, pointer++,
++pointer, pointer--, or --pointer. Technically, that definition also
covers +pointer, but I don't want to worry about that edge case. I
consider subscripting to be an example of hidden pointer arithmetic,
since pointer[integer] is defined as being equivalent to *(pointer+integer).

Arguably, the member selection operator is another example of hidden
pointer arithmetic, because pointer->member is normally implemented in a
manner equivalent to

*(member_type*)((char*)pointer + offsetof(struct_type, member))

However, since the standard does not actually define the member
selection operator in those terms, I only mark that as "arguable".
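
To illustrate the "normally implemented" claim, here is a hedged sketch comparing `->` with the explicit offset computation. The `Sample` struct and function names are invented for this example, and it assumes a simple struct for which `offsetof` is valid:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain struct used only for this comparison.
struct Sample {
    char tag;
    int count;
    double weight;
};

// Ordinary member selection.
double weight_via_arrow(const Sample *p) { return p->weight; }

// The same access spelled out as byte-offset pointer arithmetic.
double weight_via_offset(const Sample *p) {
    const char *base = reinterpret_cast<const char *>(p);
    return *reinterpret_cast<const double *>(base + offsetof(Sample, weight));
}

// Checks that both forms reach the same member.
bool member_access_forms_agree() {
    Sample s = {'x', 3, 2.5};
    assert(&s.weight == reinterpret_cast<const double *>(
                            reinterpret_cast<const char *>(&s) +
                            offsetof(Sample, weight)));
    return weight_via_arrow(&s) == weight_via_offset(&s);
}
```

This shows only that the two spellings agree for such a struct; it does not change the fact that the standard defines member selection without reference to pointer arithmetic.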

Your definition does not require that the pointer expression involve
arithmetic operators, which I find odd, and requires that the result be
stored, which I find even odder.

The standard never defines the meaning of the phrase "pointer
arithmetic", and uses it in only one place, 6.5.6p10:
> EXAMPLE Pointer arithmetic is well defined with pointers to variable length array types.
> {
> int n = 4, m = 3;
> int a[n][m];
> int (*p)[m] = a; // p == &a[0]
> p += 1; // p == &a[1]
> (*p)[2] = 99; // a[1][2] == 99
> n = p - a; // n == 1
> }

The wording implies that the example contains pointer arithmetic on a
pointer to a VLA type. Unfortunately, both of our definitions are
consistent with that implication:

According to your definition, as I understand it, the following
expressions from the example involve pointer arithmetic on pointers to a
VLA type. I would disagree with the first one, because it doesn't
involve any arithmetic operators:
int (*p)[m] = a;
p += 1

According to my definition, the following expressions from the example
involve pointer arithmetic on pointers to a VLA type. Am I correct in
concluding that you would disagree with the second expression, because
it does not involve storage of the value of an expression with pointer type?
p += 1
p - a

I also consider (*p)[2] to be an example of hidden pointer arithmetic,
which does not meet your definition, since it doesn't involve storage of
the value of an expression of pointer type. However, since *p is not a
pointer to a VLA type, (*p)[2] couldn't be the example pointer
arithmetic whose existence is implied by the sentence at the start of
6.5.6p10.

My definition seems consistent with the uses of the phrase "pointer
arithmetic" within the wikipedia article on pointers; insofar as there's
a difference between our definitions, yours does not.
--
James Kuyper

James Kuyper

May 19, 2013, 12:30:54 PM
On 05/19/2013 05:35 AM, Bart van Ingen Schenau wrote:
...
> There are a few less used, darker corners of C and C++, where a C and a
> C++ compiler silently give different results. The most notable is the type
> of character literals (and consequently, the result of sizeof on them):
>
> #include <stdio.h>
> int main()
> {
> if (sizeof(char) == sizeof(int))
> puts("Can't tell. char and int have same size");
> else if (sizeof('a') == 1)
> puts("C++");
> else
> puts("C");
> return 0;
> }
>
> This is the only silent difference between C and C++ that I am aware of.

There are a few others.

> If there are any others, they will likely affect you even less than this
> one.

You're probably right about that.

> Of course, there are also some C constructs that are not valid in C++
> (e.g. implicit conversion from void* to T*), but about those the compiler
> will complain loudly enough.

I've posted versions of the following material three times, most
recently in a message dated 2011-06-19. It has never provoked much
discussion, which disappointed me. I haven't updated it for the most
recent versions of the C and C++ standards, but I doubt that it needs
modification for either one.

The following is very carefully designed to make many different points
in a program that is as small as possible. I make no claim that it is an
example of good programming practice.

It is syntactically valid code in both C and C++, conforms strictly to
the C99 standard, and is well-formed code according to both the C++98
and C++03 standards. Its behavior under C90 is technically undefined,
but only by reason of its use of __cplusplus, an identifier which a C90
compiler could, in principle, have reserved for its own incompatible
usage - but such compilers are rare, and probably non-existent.

You can compile and link both modules as C code, or as C++ code; the
resulting executables are guaranteed by the applicable standards to exit
with a failure status, for two entirely different sets of reasons,
depending upon which language is used (note that many compilers
automatically infer the language to be used from the extension on the
filename - you might need to rename the files to get them to actually
compile in one language rather than the other).

If you compile the first module with C, and the second with C++, it is
guaranteed to return a successful exit status. If C++ were really just
an extension to C, then what I've said about how this program's behavior
varies with the programming language would be impossible.

shared.h:
=========
#ifndef SHARED_H
#define SHARED_H

extern char tag;
extern int enumer[2];

typedef void voidvoid(void);

int Cnmtyp(void);
int Cfunc(voidvoid*);

#endif

First module:
=============
#ifdef __cplusplus
extern "C" {
#endif

#include "shared.h"

char tag = 0;

static int hidden(voidvoid *pfunc)
{
    (*pfunc)();
    tag = sizeof 'C' == sizeof(int);
    return Cnmtyp() && enumer[0] && enumer[1];
}

int Cfunc(voidvoid* pfunc)
{
    struct tag
    {
        enum { enumer, other } in;
        int integer;
    } out;

    out.integer = sizeof(tag) == 1 && sizeof(enumer) == sizeof out.in;

    return hidden(pfunc) && out.integer;
}

#ifdef __cplusplus
}
#endif

Second module:
==============
#ifdef __cplusplus
extern "C" {
#endif

#include "shared.h"

int enumer[2] = {0, 1};

static void Cppname_Cpptype(void)
{
    enumer[0] = sizeof 'C' == 1;
    return;
}

#ifdef __cplusplus
}
#endif

int Cnmtyp(void)
{
    struct tag
    {
        enum { enumer, other } in;
        int integer;
    } out;

    out.integer = sizeof(enumer) == 2 * sizeof(int);

    return out.integer && sizeof(tag) == sizeof out;
}

static voidvoid Cppname_Ctype;

static void Cppname_Ctype(void) {
    Cppname_Cpptype();
}

int main(void) {
    return Cfunc(&Cppname_Ctype) && tag;
}

As an exercise for the student: explain precisely why three different
conditional expressions in the above code are guaranteed to have
different values in C and C++, and why two other conditionals will have
different values except in the unlikely case that sizeof(int)==1.
--
James Kuyper

Ian Collins

unread,
May 19, 2013, 3:35:55 PM5/19/13
to
Bart van Ingen Schenau wrote:
> On Sat, 18 May 2013 23:02:37 -0700, roland.arthaud wrote:
>
>>> If that's the only C++ feature you need, why not just use C++, and use
>>>
>>> mainly C programming style?
>>>
>>> --
>>>
>>> bartc
>>
>> You are right, I thought about doing that. I only wondered if there
>> wasn't areas where I would get unanticipated side effects, i.e. where
>> C++ would compile differently C code than a C compiler would.
>
> There are a few less used, darker corners of C and C++, where a C and a
> C++ compiler silently give different results. The most notable is the type
> of character literals (and consequently, the result of sizeof on them):
>
> #include <stdio.h>
> int main()
> {
>     if (sizeof(char) == sizeof(int))
>         puts("Can't tell. char and int have same size");
>     else if (sizeof('a') == 1)
>         puts("C++");
>     else
>         puts("C");
>     return 0;
> }
>
> This is the only silent difference between C and C++ that I am aware of.
> If there are any others, they will likely affect you even less than this
> one.

One of the other significant differences between the two is with an
expression like

const size_t n = 42;
int i[n];

In C, i is a VLA, in C++ it is a normal array.
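
A minimal sketch of the C side of that, assuming a compiler that supports variable length arrays (required in C99, optional since C11). The function name `vla_bytes` is made up for illustration. Inside a function, `n` is a const-qualified object rather than an integer constant expression, so `i` is a VLA and `sizeof i` is evaluated at run time:

```c
#include <stddef.h>

/* Hypothetical helper: reports the size of the array declared with
   a const-qualified (but not constant-expression) bound. */
size_t vla_bytes(void)
{
    const size_t n = 42;
    int i[n];          /* VLA in C; an ordinary array in C++ */
    return sizeof i;   /* computed at run time for a VLA */
}
```

The value is the same in both languages; the difference shows up where a VLA is not permitted, for instance at file scope.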

--
Ian Collins

glen herrmannsfeldt

unread,
May 19, 2013, 3:56:26 PM5/19/13
to
James Kuyper <james...@verizon.net> wrote:
> On 05/17/2013 07:01 PM, Malcolm McLean wrote:

(snip)
>> Counting on isn't usually considered to be arithmetic, though it's
>> a semantic argument. Someone who knows the names, order and value
>> of the numbers in his native language is not considered to be
>> numerate.

> I'm unfamiliar with the use of the phrase "count on" with any meaning
> other than "rely upon". So are the dictionaries I've consulted. What do
> you mean by that? I can't imagine a reasonable definition of "pointer
> arithmetic" that excludes pointer+integer, which is, in my opinion, the
> most prototypical example of pointer arithmetic.

The idea is to distinguish the two programming styles:

for(i=0;i<n;i++) sum += x[i];

and

for(i=0;i<n;i++) sum += *x++;
x -= n;

(For the latter, it might be that x isn't needed after this
construct, so there is no need to restore x. More commonly it
would be done with a copy of the pointer, though.)
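
As a sketch, the two styles can be written as complete functions. The names `sum_indexed` and `sum_walking` are invented here; the second works on a copy of the pointer, so nothing needs restoring afterwards:

```c
#include <stddef.h>

/* Style 1: subscript the array with an integer index. */
long sum_indexed(const int *x, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Style 2: step a copy of the pointer through the array. */
long sum_walking(const int *x, size_t n)
{
    long sum = 0;
    const int *p = x;   /* copy, so x itself is never modified */
    for (size_t i = 0; i < n; i++)
        sum += *p++;
    return sum;
}
```

Both return the same total for the same input; whether either is faster depends on the compiler and target.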

If you want to suggest a different name to distinguish the two,
then you might convince others to use it.

In the past, maybe on some of the early processors that C was
implemented on, there was a suggestion that one form might be
faster than the other.

>> If you take the value of the pointer expression and store it,
>> then I'd define that as pointer arithmetic.

> I consider "pointer arithmetic" to refer exclusively to the use of
> arithmetic operators when one of the operands is a pointer value.
> In other words, pointer+integer, integer+pointer, pointer-integer,
> pointer-pointer, pointer+=integer, pointer-=integer, pointer++,
> ++pointer, pointer--, or --pointer. Technically, that definition also
> covers +pointer, but I don't want to worry about that edge case. I
> consider subscripting to be an example of hidden pointer arithmetic,
> since pointer[integer] is defined as being equivalent
> to *(pointer+integer).

Note that Fortran has pointers, Java has Object reference variables,
both have subscripting (indexing) but not pointer arithmetic.

> Arguably, the member selection operator is another example of hidden
> pointer arithmetic, because pointer->member is normally implemented in a
> manner equivalent to

> *(member_type*)((char*)pointer + offsetof(struct_type, member))

> However, since the standard does not actually define the member
> selection operator in those terms, I only mark that as "arguable".
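
A small sketch of the equivalence described in the quoted text, using a hypothetical `struct point` and a made-up function name. It only checks that the two ways of forming the member's address agree; it does not claim that any compiler implements `->` this way:

```c
#include <stddef.h>

struct point { int x; double y; };   /* hypothetical example type */

/* Returns 1 when &p->y equals the address built from byte-offset arithmetic. */
int member_matches_offset(struct point *p)
{
    double *by_arrow  = &p->y;
    double *by_offset = (double *)((char *)p + offsetof(struct point, y));
    return by_arrow == by_offset;
}
```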

> Your definition does not require that the pointer expression involve
> arithmetic operators, which I find odd, and requires that the result be
> stored, which I find even odder.

I suppose [] isn't an arithmetic operator.

The point of the result being stored is to distinguish the two
styles shown above. But one might even do:

for(i=0;i<n;i++) {
    temp=x+i;
    sum += *temp;
}

which does store a pointer, but is logically more like the first
case than the second.

Now, consider the Fortran code:

DO 1 I=1,N
1 SUM=SUM+Y(2*I)

Fortran compilers have known how to optimize this case, keeping
an address in a register and adding the appropriate amount each
time through the loop, since before C even existed.

-- glen

James Kuyper

unread,
May 19, 2013, 5:01:00 PM5/19/13
to
On 05/19/2013 03:56 PM, glen herrmannsfeldt wrote:
> James Kuyper <james...@verizon.net> wrote:
>> On 05/17/2013 07:01 PM, Malcolm McLean wrote:
>
> (snip)
>>> Counting on isn't usually considered to be arithmetic, though it's
>>> a semantic argument. Someone who knows the names, order and value
>>> of the numbers in his native language is not considered to be
>>> numerate.
>
>> I'm unfamiliar with the use of the phrase "count on" with any meaning
>> other than "rely upon". So are the dictionaries I've consulted. What do
>> you mean by that? I can't imagine a reasonable definition of "pointer
>> arithmetic" that excludes pointer+integer, which is, in my opinion, the
>> most prototypical example of pointer arithmetic.
>
> The idea is to distinguish the two programming styles:
>
> for(i=0;i<n;i++) sum += x[i];
>
> and
>
> for(i=0;i<n;i++) sum += *x++;
> x -= n;


In any context where the second style is better, I'd expect the
following to be even more efficient (I'm expecting the optimizer to lift
"array+n" out of the loop).

for(int *pi = array; pi < array+n; pi++)
    sum += *pi;

But I would never consider using the term "pointer arithmetic" to
distinguish the two styles. One uses explicit pointer arithmetic, the
other hides it inside an array subscript, but they both use it.

> If you want to suggest a different name to distinguish the two,
> then you might convince others to use it.

I seldom have a need to distinguish them, and I normally use the term
"pointer arithmetic" for purposes other than distinguishing them. For
instance, I'll say "pointer arithmetic is not allowed on void pointers",
a statement which would be false if I were using Bart's definition of
"pointer arithmetic".

...
>> ... I
>> consider subscripting to be an example of hidden pointer arithmetic,
>> since pointer[integer] is defined as being equivalent
>> to *(pointer+integer).
...
>> Your definition does not require that the pointer expression involve
>> arithmetic operators, which I find odd, and requires that the result be
>> stored, which I find even odder.
>
> I suppose [] isn't an arithmetic operator.

Correct, which is why I consider it to be an example of hidden pointer
arithmetic.
--
James Kuyper

Malcolm McLean

unread,
May 19, 2013, 5:10:32 PM5/19/13
to
On Sunday, May 19, 2013 5:10:04 PM UTC+1, James Kuyper wrote:
> On 05/17/2013 07:01 PM, Malcolm McLean wrote:
>
> I'm unfamiliar with the use of the phrase "count on" with any meaning
> other than "rely upon". So are the dictionaries I've consulted. What
> do you mean by that? I can't imagine a reasonable definition of
> "pointer arithmetic" that excludes pointer+integer, which is, in my
> opinion, the most prototypical example of pointer arithmetic.
>
Counting one is where you take a number, let's say 42, and
ask the subject what is the nest number.
It's a necessary ability for numeracy, but it isn't in itself
considered enough to establish numeracy. It's a semantic argument,
of course.
Array notation is counting on for pointers. So it's not pointer
arithmetic. Again, it's a semantic argument. But it's not
unreasonable. Knowing that number 42 is a reasonable but
not excessive distance down the street doesn't equate to
knowing that 42 = 6 times 7.

Keith Thompson

unread,
May 19, 2013, 5:38:51 PM5/19/13
to
James Kuyper <james...@verizon.net> writes:
[...]
> I consider "pointer arithmetic" to refer exclusively to the use of
> arithmetic operators when one of the operands is a pointer value. In
> other words, pointer+integer, integer+pointer, pointer-integer,
> pointer-pointer, pointer+=integer, pointer-=integer, pointer++,
> ++pointer, pointer--, or --pointer. Technically, that definition also
> covers +pointer, but I don't want to worry about that edge case. I
> consider subscripting to be an example of hidden pointer arithmetic,
> since pointer[integer] is defined as being equivalent to *(pointer+integer).
[...]

There is no unary "+" operator for pointers. The operand of unary "+"
or "-" must be of arithmetic type; this is a constraint.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

glen herrmannsfeldt

unread,
May 19, 2013, 6:14:05 PM5/19/13
to
Keith Thompson <ks...@mib.org> wrote:
> James Kuyper <james...@verizon.net> writes:

>> I consider "pointer arithmetic" to refer exclusively to the use of
>> arithmetic operators when one of the operands is a pointer value. In
>> other words, pointer+integer, integer+pointer, pointer-integer,
>> pointer-pointer, pointer+=integer, pointer-=integer, pointer++,
>> ++pointer, pointer--, or --pointer. Technically, that definition also
>> covers +pointer, but I don't want to worry about that edge case. I
>> consider subscripting to be an example of hidden pointer arithmetic,
>> since pointer[integer] is defined as being equivalent
>> to *(pointer+integer).

> There is no unary "+" operator for pointers. The operand of unary "+"
> or "-" must be of arithmetic type; this is a constraint.

So, you can say:

x = pointer + (pointer - pointer);

or

x = pointer - pointer + pointer;

but not:

x= pointer + pointer - pointer;

Maybe too far off topic, but the OS/360 (and successor)
assemblers allow for it. That is, relocation factors can be one
of {-1, 0, 1, 2}. (It only takes one more bit in the object
program, though does complicate the relocation routine.)
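
The three expressions above can be sketched in C as follows, with the rejected form left as a comment so the rest compiles. The function name is invented, and `p` and `q` are assumed to point into the same array:

```c
#include <stddef.h>

/* Returns 1 when both permitted forms yield q. */
int pointer_sum_forms(const int *p, const int *q)
{
    const int *a = p + (q - p);   /* pointer + (pointer - pointer) */
    const int *b = q - p + p;     /* (pointer - pointer) + pointer */
    /* const int *c = p + q - p;     rejected: (p + q) adds two pointers */
    return a == q && b == q;
}
```

The order of evaluation is what matters: subtraction of two pointers gives an integer, which can then be added to a pointer, while adding two pointers is a constraint violation.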

-- glen

ralph

unread,
May 19, 2013, 6:26:22 PM5/19/13
to
On Sun, 19 May 2013 19:56:26 +0000 (UTC), glen herrmannsfeldt
<g...@ugcs.caltech.edu> wrote:


>
>I suppose [] isn't an arithmetic operator.
>

This thread is tough because we are mixing in what might be going on
and what did go on back then. In the "legacy" days the array operator
was nothing but an arithmetic operator.

Mr Kuyper pointed out that the 'modern' C standard defines array[i] to
*behave* as *(array+i), the fact is back then, array[i] *was*
*(array+i). The array operator was merely short-hand. Underneath,
after the preprocessor, array[i] became *(array + i ) which is
definitely 'arithmetic'. <g>

This was easily demonstrated as both the following would compile and
produce the same result...

x = array[i];
or
x = i[array];

The latter statement varied depending on vendor and warning levels.
eg, some would allow
x = 5[array]; // a constant, but not
x = i[array];

While the 'replacement' represented no real impact in most cases,
there was still a lot of mythology surrounding the 'array operator',
thus those of us programming in late 90s and 80s tended to avoid array
notation like the plague, unless clarity was required (or requested).
<bg>

-ralph

ralph

unread,
May 19, 2013, 6:32:45 PM5/19/13
to
On Sun, 19 May 2013 17:26:22 -0500, ralph <nt_con...@yahoo.com>
wrote:

>
>While the 'replacement' represented no real impact in most cases,
>there was still a lot of mythology surrounding the 'array operator',
>thus those of us programming in * late * 90s and 80s tended to avoid array
>notation like the plague, unless clarity was required (or requested).
><bg>
>

Make that EARLY 90's.
I started out typing late 70's, 80's, and early 90's - then edited.
<g>

James Kuyper

unread,
May 19, 2013, 8:53:25 PM5/19/13
to
On 05/19/2013 06:14 PM, glen herrmannsfeldt wrote:
> Keith Thompson <ks...@mib.org> wrote:
...
>> There is no unary "+" operator for pointers. The operand of unary "+"
>> or "-" must be of arithmetic type; this is a constraint.
>
> So, you can say:
>
> x = pointer + (pointer - pointer);
>
> or
>
> x = pointer - pointer + pointer;
>
> but not:
>
> x= pointer + pointer - pointer;

True, but not relevant as a response to Keith's comment, because your
examples did not include any unary + operators. The constraint that
Keith refers to is violated by the following example:

x = +pointer;
--
James Kuyper

James Kuyper

unread,
May 19, 2013, 8:55:34 PM5/19/13
to
On 05/19/2013 05:38 PM, Keith Thompson wrote:
...
> There is no unary "+" operator for pointers. The operand of unary "+"
> or "-" must be of arithmetic type; this is a constraint.

I'm not sure I've ever deliberately used the unary + operator, so it's
not surprising that I've forgotten what the constraints are - but I
should have checked before posting.
--
James Kuyper

James Kuyper

unread,
May 19, 2013, 9:17:30 PM5/19/13
to
On 05/19/2013 05:10 PM, Malcolm McLean wrote:
> On Sunday, May 19, 2013 5:10:04 PM UTC+1, James Kuyper wrote:
>> On 05/17/2013 07:01 PM, Malcolm McLean wrote:
>>
>> I'm unfamiliar with the use of the phrase "count on" with any meaning
>> other than "rely upon". So are the dictionaries I've consulted. What
>> do you mean by that? I can't imagine a reasonable definition of
>> "pointer arithmetic" that excludes pointer+integer, which is, in my
>> opinion, the most prototypical example of pointer arithmetic.
>>
> Counting one is where you take a number, let's say 42, and
> ask the subject what is the nest number.

So, counting on "from x" (is that the proper way to use the phrase?)
means x+1? Your definition uses "nest" (I presume, "next"?) at a key
point. Is the distinction you're making the one between x+6 and
x+1+1+1+1+1+1? They're equivalent, but if you only know how to do the
latter, I suppose it could be said that you don't know how to do
arithmetic, though that's not how I'd describe the situation.

> It's a necessary ability for numeracy, but it isn't in itself
> considered enough to establish numeracy. It's a semantic argument,
> of course.
> Array notation is counting on for pointers. ...

OK - so that means I was wrong. pointer[6] is not defined in terms of
repeatedly adding 1 to pointer 6 times. If p points at the mth element
of a array, p[n] is defined as pointing at the (m+n)th element of the
array (so long as the array contains at least m+n elements). Therefore,
if pointer[6] is just "counting on", and therefore not an example of
"pointer arithmetic", then you seem to be saying that m+n is not an
example of arithmetic - at which point I reach total confusion as to the
distinction you're making.

> ... So it's not pointer
> arithmetic. Again, it's a semantic argument. But it's not
> unreasonable. Knowing that number 42 is a reasonable but
> not excessive distance down the street doesn't equate to
> knowing that 42 = 6 times 7.

The standard does not define any meaning for multiplication that applies
to pointer arguments. If neither addition, nor (presumably?) subtraction
count as pointer arithmetic, and neither multiplication nor division are
even defined in this context, what is the motivation for using the word
"arithmetic" as part of this phrase? I'd recommend a different term that
doesn't give the false appearance of having some connection to the term
"arithmetic" as it applies to numbers.
--
James Kuyper

Keith Thompson

unread,
May 19, 2013, 10:29:02 PM5/19/13
to
Ian Collins <ian-...@hotmail.com> writes:
[...]
> One of the other significant differences between the two is with an
> expression like
>
> const size_t n = 42;
> int i[n];
>
> In C, i is a VLA, in C++ it is a normal array.

Yes, but are there cases where it matters, i.e., where a given
program has different semantics in C and C++ because of it?
There are contexts where VLAs aren't allowed, but that just means
that some programs are valid C++ and invalid C.

Keith Thompson

unread,
May 19, 2013, 10:55:58 PM5/19/13
to
What do you think has changed since the "legacy days"?

The [] operator is still commutative; array[i] is still equivalent
to i[array], even in C11. The mapping of x[y] to *(x+y) is not
done by the preprocessor, and as far as I know it never has been.

I'd be surprised to see a C compiler, even an old one, that doesn't
accept

x = i[array];

Which isn't to say that there was no such compiler, just that I'd
be surprised to see it. A compiler that conforms to any of C90,
C99, or C11 *must* accept "x = i[array];" (though it's free to warn
about the programmer's lack of taste).
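
A short sketch of that equivalence; the function name is made up for illustration. Because x[y] means *(x+y) and addition of a pointer and an integer works in either order, all three spellings below designate the same element:

```c
#include <stddef.h>

/* Returns 1 when array[i], i[array] and *(array + i) all agree. */
int subscript_commutes(const int *array, size_t i)
{
    return &array[i] == array + i      /* x[y] is *(x+y) */
        && &i[array] == array + i      /* so y[x] is *(y+x) */
        && array[i] == i[array];
}
```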

Keith Thompson

unread,
May 19, 2013, 11:05:14 PM5/19/13
to
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Sunday, May 19, 2013 5:10:04 PM UTC+1, James Kuyper wrote:
>> On 05/17/2013 07:01 PM, Malcolm McLean wrote:
>>
>> I'm unfamiliar with the use of the phrase "count on" with any meaning
>> other than "rely upon". So are the dictionaries I've consulted. What
>> do you mean by that? I can't imagine a reasonable definition of
>> "pointer arithmetic" that excludes pointer+integer, which is, in my
>> opinion, the most prototypical example of pointer arithmetic.
>>
> Counting one is where you take a number, let's say 42, and

"Counting on"?

> ask the subject what is the nest number.

"next number"?

> It's a necessary ability for numeracy, but it isn't in itself
> considered enough to establish numeracy. It's a semantic argument,
> of course.
> Array notation is counting on for pointers. So it's not pointer
> arithmetic. Again, it's a semantic argument. But it's not
> unreasonable. Knowing that number 42 is a reasonable but
> not excessive distance down the street doesn't equate to
> knowing that 42 = 6 times 7.

I wouldn't ordinarily complain about typos, but when you're using
English words and phrases in a way that seems quite distinct from
the way I believe most people use them, I suggest it's important
to be precise.

In a previous post, you seemed to be saying that pointer+integer
is "pointer arithmetic", but "pointer++" is not? If so, this
distinction is not supported by the C standard, which defines "++"
in terms of adding the value 1.

But now you seem to have some other reason for asserting that the []
operator does not involve "pointer arithmetic". Certainly arr[i]
can be thought of as accessing the i'th element of array, and you
don't necessarily have to understand pointer arithmetic to use the
[] operator. But, as I'm sure you know, [] is defined in terms of
pointer+integer arithmetic.

I don't know just what distinction you're making. Whatever it is,
I don't believe it's meaningful.

Eric Sosman

unread,
May 19, 2013, 11:26:58 PM5/19/13
to
On 5/19/2013 6:26 PM, ralph wrote:
> On Sun, 19 May 2013 19:56:26 +0000 (UTC), glen herrmannsfeldt
> <g...@ugcs.caltech.edu> wrote:
>
>
>>
>> I suppose [] isn't an arithmetic operator.
>>
>
> This thread is tough because we are mixing in what might be going on
> and what did go on back then. In the "legacy" days the array operator
> was nothing but an arithmetic operator.
>
> Mr Kuyper pointed out that the 'modern' C standard defines array[i] to
> *behave* as *(array+i), the fact is back then, array[i] *was*
> *(array+i). The array operator was merely short-hand. Underneath,
> after the preprocessor, array[i] became *(array + i ) which is
> definitely 'arithmetic'. <g>

You have it backwards. K&R says in 1978 that the subscript
operator "is interpreted in such a way that E1[E2] is identical
to *((E1)+(E2))." The original ANSI C Standard of 1989 says
"The *definition* [emphasis mine] ... is that E1[E2] is identical
to (*(E1+(E2)))." (Note the rather odd change in punctuation.)

That is, "behave as if identical" is original and "defined
as" is modern, not the other way around.

> [...] those of us programming in late 90s and 80s tended to avoid array
> notation like the plague, unless clarity was required (or requested).

Speak for yourself, not for "those of us." I wrote C in
the 2010's, 2000's, 1990's, 1980's, and 1970's, and in none of
those decades did I feel any impulse to avoid [].

--
Eric Sosman
eso...@comcast-dot-net.invalid

glen herrmannsfeldt

unread,
May 20, 2013, 12:55:34 AM5/20/13
to
Keith Thompson <ks...@mib.org> wrote:

(snip)
>> Mr Kuyper pointed out that the 'modern' C standard defines array[i] to
>> *behave* as *(array+i), the fact is back then, array[i] *was*
>> *(array+i). The array operator was merely short-hand. Underneath,
>> after the preprocessor, array[i] became *(array + i ) which is
>> definitely 'arithmetic'. <g>

>> This was easily demonstrated as both the following would compile and
>> produce the same result...

>> x = array[i];
>> or
>> x = i[array];

(snip)

> What do you think has changed since the "legacy days"?

Processors have different addressing modes.

> The [] operator is still commutative; array[i] is still equivalent
> to i[array], even in C11. The mapping of x[y] to *(x+y) is not
> done by the preprocessor, and as far as I know it never has been.

> I'd be surprised to see a C compiler, even an old one, that doesn't
> accept

> x = i[array];

> Which isn't to say that there was no such compiler, just that I'd
> be surprised to see it.

I believe one of the DEC compilers about 20 years ago wouldn't do it,
but I don't remember what it did with it.

As well as I remember, that was the VAX/VMS days. I haven't tried it
on any VMS systems since then.

-- glen

glen herrmannsfeldt

unread,
May 20, 2013, 12:58:45 AM5/20/13
to
Eric Sosman <eso...@comcast-dot-net.invalid> wrote:

(snip)

>> [...] those of us programming in late 90s and 80s tended to avoid array
>> notation like the plague, unless clarity was required (or requested).

Seems like 90's was a little late, but earlier, yes.

> Speak for yourself, not for "those of us." I wrote C in
> the 2010's, 2000's, 1990's, 1980's, and 1970's, and in none of
> those decades did I feel any impulse to avoid [].

How do you write the loop inside strcpy() if you happen to
be writing one in C?
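
For comparison, a sketch of such a loop written both ways. The function names are invented, and neither is meant to be any library's actual strcpy(); like strcpy(), both assume that dst is large enough and that the two buffers do not overlap:

```c
#include <stddef.h>

/* Pointer style: copy and advance until the terminator has been copied. */
char *copy_by_pointer(char *dst, const char *src)
{
    char *ret = dst;
    while ((*dst++ = *src++) != '\0')
        ;
    return ret;
}

/* Subscript style: same behaviour, written with an index. */
char *copy_by_index(char *dst, const char *src)
{
    size_t i = 0;
    while ((dst[i] = src[i]) != '\0')
        i++;
    return dst;
}
```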

-- glen

Malcolm McLean

unread,
May 20, 2013, 4:36:06 AM5/20/13
to
On Monday, May 20, 2013 4:05:14 AM UTC+1, Keith Thompson wrote:
> Malcolm McLean <malcolm...@btinternet.com> writes:
>
>
> "Counting on"?
>
> I don't know just what distinction you're making. Whatever it
> is, I don't believe it's meaningful.
>
Let's say I ask a primary school child to count out 42
beans, and he does it correctly. This is not sufficient to
be called "arithmetic". It's knowing the numbers, or "counting on".
It's also "counting on" if we don't start from one, let's say we
have a pile of beans, which we tell him contains a hundred, and
ask him to unite another pile of 42 beans to it, and he takes
beans from the 42 pile one by one and goes "a hundred and one,
a hundred and two ..."
That's not enough to be able to say that the child can do
simple arithmetic.

But let's say I ask him to count out forty two beans, count
out five, and then unite the piles and count them. This is
arithmetic. Because he's now taking two numbers, manipulating
them to produce a third, and storing the result.

array[42] is counting on. ptr = array + 42 is arithmetic.

Bart van Ingen Schenau

unread,
May 20, 2013, 6:37:33 AM5/20/13
to
On Sun, 19 May 2013 12:30:54 -0400, James Kuyper wrote:

> On 05/19/2013 05:35 AM, Bart van Ingen Schenau wrote: ...
>> There are a few less used, darker corners of C and C++, where a C and a
>> C++ compiler silently give different results. The most notable is the
>> type of character literals (and consequently, the result of sizeof on
>> them):
>>
>> #include <stdio.h>
>> int main()
>> {
>>     if (sizeof(char) == sizeof(int))
>>         puts("Can't tell. char and int have same size");
>>     else if (sizeof('a') == 1)
>>         puts("C++");
>>     else
>>         puts("C");
>>     return 0;
>> }
>>
>> This is the only silent difference between C and C++ that I am aware
>> of.
>
> There's a few others.

Yes, your response reminded me of the difference in rules for name lookup
in the tag namespace and for nested types.
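
A minimal sketch of the tag-namespace difference, with a made-up function name. In C, struct tags live in their own name space, so inside the function `tag` still names the file-scope char object; in C++, the local struct name hides it:

```c
char tag;   /* ordinary identifier at file scope */

/* Returns 1 when compiled as C, where sizeof(tag) is the size of the char
   object. Compiled as C++, tag names the local struct, whose size is at
   least two bytes, so the comparison is false and the result is 0. */
int tag_names_object(void)
{
    struct tag { int a; int b; };
    return sizeof(tag) == sizeof(char);
}
```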

>
>> If there are any others, they will likely affect you even less than
>> this one.
>
> You're probably right about that.
>
>> Of course, there are also some C constructs that are not valid in C++
>> (e.g. implicit conversion from void* to T*), but of those the compiler
>> will complain loudly enough.
>
> I've posted versions of the following material three times, most
> recently in a message dated 2011-06-19. It has never provoked much
> discussion, which disappointed me.

What kind of discussion were you expecting?

<snip>
> You can compile and link both modules as C code, or as C++ code; the
> resulting executables are guaranteed by the applicable standards to exit
> with an failure status, for two entirely different sets of reasons,
> depending upon which language is used (note that many compilers
> automatically infer the language to be used from the extension on the
> filename - you might need to rename the files to get them to actually
> compile in one language rather than the other).
>
> If you compile the first module with C, and the second with C++, it is
> guaranteed to return a successful exit status. If C++ were really just
> an extension to C, then what I've said about how this program's behavior
> varies with the programming language would be impossible.

Actually, I believe you are wrong about the exit status of the various
possibilities. When both modules are compiled in the same language, then
the exit status of the application is guaranteed to be *success*.
It only results in a failure exit status if the first module is compiled
in C and the second in C++.

<snip>

Bart v Ingen Schenau

James Kuyper

unread,
May 20, 2013, 7:16:07 AM5/20/13
to
On 05/20/2013 04:36 AM, Malcolm McLean wrote:
...
> Let's say I ask a primary school child to count out 42
> beans, and he does it correctly. This is not sufficient to
> be called "arithmetic". It's knowing the numbers, or "counting on".
> It's also "counting on" if we don't start from one, let's say we
> have a pile of beans, which we tell him contains a hundred, and
> ask him to unite another pile of 42 beans to it, and he takes
> beans from the 42 pile one by one and goes "a hundred and one,
> a hundred and two ..."
> That's not enough to be able to say that the child can do
> simple arithmetic.
>
> But let's say I ask him to count out forty two beans, count
> out five, and then unite the piles and count them. This is
> arithmetic. Because he's now taking two numbers, manipulating
> them to produce a third, and storing the result.
>
> array[42] is counting on. ...

I don't see the connection between your example of counting out 42
beans, and array[42]. The defined behavior of array[42] does not involve
successively stepping through the first 42 positions in the array; it's
defined in terms of pointer addition, which in turn is defined in terms
of integer addition as applied to positions within an array, something
I'd consider to unambiguously be "arithmetic".

> ... ptr = array + 42 is arithmetic.

The defined behavior of array[42] is *(array + 42), so fundamentally,
the only difference between those two expressions is the difference
between "*(expression)" and "ptr = expression". I would call the first
"dereferencing" and the second "assignment". Your use of "counting on"
vs "arithmetic" to describe that distinction seems quite odd to me.
--
James Kuyper

Malcolm McLean

unread,
May 20, 2013, 7:32:09 AM5/20/13
to
On Monday, May 20, 2013 12:16:07 PM UTC+1, James Kuyper wrote:
> On 05/20/2013 04:36 AM, Malcolm McLean wrote:
>

> I don't see the connection between your example of counting out
> 42 beans, and array[42]. The defined behavior of array[42] does
> not involve successively stepping through the first 42 positions
> in the array; it's defined in terms of pointer addition, which in
> turn is defined in terms of integer addition as applied to
> positions within an array, something I'd consider to
> unambiguously be "arithmetic".
>
Internally, yes, the processor will calculate 0x1234 + 42 to
give an address. But the programmer doesn't see that. He just
sees "take array and go 42 positions to the right". Like
going to a street called array avenue and finding house 42.
That's counting on.
(Then normally you step through an array, you seldom see
array[42] = x in real code, it's almost always
for(i=0;i<42;i++)
array[i] = x.
Very clearly this is counting on.)

James Kuyper

unread,
May 20, 2013, 7:42:15 AM5/20/13
to
On 05/20/2013 06:37 AM, Bart van Ingen Schenau wrote:
> On Sun, 19 May 2013 12:30:54 -0400, James Kuyper wrote:
...
>>> Of course, there are also some C constructs that are not valid in C++
>>> (e.g. implicit conversion from void* to T*), but of those the compiler
>>> will complain loudly enough.
>>
>> I've posted versions of the following material three times, most
>> recently in a message dated 2011-06-19. It has never provoked much
>> discussion, which disappointed me.
>
> What kind of discussion were you expecting?

Well, I was hoping to have the person who was describing C++ as a strict
superset of C to comment on how his position had changed as a result of
reviewing that program, or to explain why his position had not changed.
You didn't make exactly that claim, but you came close, and the three
previous occasions involved claims more explicitly of that type.

I was anticipating that someone might ask for an explanation - the
differences in the relevant rules are a bit obscure, and the code is not
annotated to explain why it makes a difference which language you use to
compile it.

However, at a minimum, I was hoping to have someone check it out and
tell me if I had made any errors, such as the following:

...
>> If you compile the first module with C, and the second with C++, it is
>> guaranteed to return a successful exit status. If C++ were really just
>> an extension to C, then what I've said about how this program's behavior
>> varies with the programming language would be impossible.
>
> Actually, I believe you are wrong about the exit status of the various
> possibilities. When both modules are compiled in the same language, then
> the exit status of the application is guaranteed to be *success*.
> It only results in a failure exit status if the first module is compiled
> in C and the second in C++.

Thanks, I hope I'll remember to correct that discrepancy if I ever
decide to post that code again. Of course, the fact that the exit status
depends upon which language is used to compile the code is all that
matters for the point that I was making.
--
James Kuyper

James Kuyper

unread,
May 20, 2013, 8:16:11 AM5/20/13
to
On 05/20/2013 07:32 AM, Malcolm McLean wrote:
> On Monday, May 20, 2013 12:16:07 PM UTC+1, James Kuyper wrote:
>> On 05/20/2013 04:36 AM, Malcolm McLean wrote:
>>
>
>> I don't see the connection between your example of counting out
>> 42 beans, and array[42]. The defined behavior of array[42] does
>> not involve successively stepping through the first 42 positions
>> in the array; it's defined in terms of pointer addition, which in
>> turn is defined in terms of integer addition as applied to
>> positions within an array, something I'd consider to
>> unambiguously be "arithmetic".
>>
> Internally, yes, the processor will calculate 0x1234 + 42 to
> give an address. But the programmer doesn't see that. ...

That depends upon how good the programmer's "eyes" are. Many C
constructs have behavior that can be defined as convenient packages of
more fundamental constructs; for instance, #ifdef can be defined in
terms of #if and defined(), while() can be defined in terms of if() and
goto, i++ can be defined in terms of + and =, and [] can be defined in
terms of * and +. I've always considered these packages as equivalent to
their component parts, and have never thought of them as unanalyzed
entities in their own right. YMMV

> ... He just
> sees "take array and go 42 positions to the right". Like
> going to a street called array avenue and finding house 42.

I can't say that I've ever thought about array subscripting in that way.

> That's counting on.

I could, with equal justification, describe pointer+42 as "take the
position pointed at and go 42 positions to the right". For that matter,
I could equally well describe 3+42 as "Find 3 on the number line, and go
42 numbers to the right". That is, incidentally, precisely the way I was
taught to understand addition, several decades ago. I was also taught
techniques for manipulating symbols to implement addition, but that
description is precisely how I was taught what addition means. It also
corresponds pretty accurately to the way mathematicians formally derive
the concept of addition from the concept of the successor of a number.
If being able to describe array[42] that way justified describing it as
just "counting on", and not "arithmetic", then the same would also be
true of 3+42.

> (Then normally you step through an array, you seldom see
> array[42] = x in real code, it's almost always
> for(i=0;i<42;i++)
> array[i] = x.
> Very clearly this is counting on.)

I can see i++ as an example of counting on, but not array[i], which,
being defined as *(array+i), is pointer arithmetic as I understand the
term. If you'd re-written it as:

int *end = array + n;

for(int *p = array; p<end; p++)
    *p = x;

then I could see the loop as an example of the distinction you're making
between "counting on" and "arithmetic". Of course, the preceding array+n
calculation is clearly pointer arithmetic.
--
James Kuyper

Eric Sosman

unread,
May 20, 2013, 10:13:14 AM5/20/13
to
s/is/if/ ?

I'd do it in the time-honored way, with a pair of pointers.
That's not "avoiding [] like the plague," it's just using whatever
notation suits the use case better. How do *you*, might I ask,
write Gauss-Jordan elimination for a linear system?

--
Eric Sosman
eso...@comcast-dot-net.invalid

Keith Thompson

unread,
May 20, 2013, 11:39:10 AM5/20/13
to
glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
> Keith Thompson <ks...@mib.org> wrote:
> (snip)
>>> Mr Kuyper pointed out that the 'modern' C standard defines array[i] to
>>> *behave* as *(array+i), the fact is back then, array[i] *was*
>>> *(array+i). The array operator was merely short-hand. Underneath,
>>> after the preprocessor, array[i] became *(array + i ) which is
>>> definitely 'arithmetic'. <g>
>
>>> This was easily demonstrated as both the following would compile and
>>> produce the same result...
>
>>> x = array[i];
>>> or
>>> x = i[array];
>
> (snip)
>
>> What do you think has changed since the "legacy days"?
>
> Processors have different addressing modes.

I don't see how that would necessarily affect the semantics of the []
operator, and it doesn't seem to be what ralph was referring to.

[snip]

glen herrmannsfeldt

unread,
May 20, 2013, 12:01:34 PM5/20/13
to
Eric Sosman <eso...@comcast-dot-net.invalid> wrote:
>> (snip)

(snip, someone wrote)
>>>> [...] those of us programming in late 90s and 80s tended
>>>> to avoid array notation like the plague, unless clarity
>>>> was required (or requested).

>> Seems like 90's was a little late, but earlier, yes.

>>> Speak for yourself, not for "those of us." I wrote C in
>>> the 2010's, 2000's, 1990's, 1980's, and 1970's, and in none of
>>> those decades did I feel any impulse to avoid [].

>> How do you write the loop inside strcpy() is you happen to
>> be writing one in C?

> s/is/if/ ?

Yes.

> I'd do it in the time-honored way, with a pair of pointers.

I probably would too, though that is a slightly special case,
where the original value of the pointers isn't needed later.

Given the choice between copying the pointers and indexing,
I might go to indexing.

> That's not "avoiding [] like the plague," it's just using whatever
> notation suits the use case better.

> How do *you*, might I ask, write Gauss-Jordan elimination
> for a linear system?

Probably in Fortran. Well, there are plenty of them already
written in Fortran.

Not so long ago, I needed one in Java. I found one on the web,
according to the comments translated from a version in Fortran
originally for the IBM 650. (A vacuum tube decimal machine, with,
if I remember right, addressing in decimal.)

The traditional way to write much of the matrix processing work
in Fortran is to consider the matrix as a 1D array, and compute
the appropriate offset. (This is still part of the newest
Fortran 2008 standard, part of assumed size arrays.)

The usual way to do matrix inversion, and I believe also
LU-decomposition, is to do it in place, remembering which
matrix elements are the input values, and which are the output
values.

There is the additional complication that one might want to
dimension an array larger than the size needed, but only use
part of it, such that the actual matrix elements aren't stored
contiguously. That was usual in the Fortran days before dynamic
allocation, and still useful even with dynamic allocation.

-- glen

Malcolm McLean

unread,
May 20, 2013, 12:04:53 PM5/20/13
to
On Monday, May 20, 2013 1:16:11 PM UTC+1, James Kuyper wrote:
> On 05/20/2013 07:32 AM, Malcolm McLean wrote:
>
> I could, with equal justification, describe pointer+42 as "take the
> position pointed at and go 42 positions to the right". For that matter,
> I could equally well describe 3+42 as "Find 3 on the number line, and go
> 42 numbers to the right". That is, incidentally, precisely the way I was
> taught to understand addition, several decades ago. I was also taught
> techniques for manipulating symbols to implement addition, but that
> description is precisely how I was taught what addition means.
>
Exactly. "Counting on" is how people who haven't been to school add numbers.
It's a semantic argument whether you call it "arithmetic" or not, but generally
it's not considered to be so. It's the step before arithmetic.
So at primary school, they work with the intuitive method first. It's a new
pedagogical theory - I wasn't taught with number lines, but first with
counters then to do it algorithmically.

glen herrmannsfeldt

unread,
May 20, 2013, 12:18:30 PM5/20/13
to
Malcolm McLean <malcolm...@btinternet.com> wrote:
> On Monday, May 20, 2013 12:16:07 PM UTC+1, James Kuyper wrote:
>> On 05/20/2013 04:36 AM, Malcolm McLean wrote:

>> I don't see the connection between your example of counting out
>> 42 beans, and array[42].

Someone might be able to count beans, but not able to add.

You couldn't ask for 42+5 beans, but could ask for 42 beans,
then for five more, which they would count.

If you go to the bank to cash a check for $100, and ask for it
in $5 bills, they will count it out by fives. (5, 10, 15, 20 ...).
They could just count the 20 bills and multiply by five, but it
is tradition.

>> The defined behavior of array[42] does
>> not involve successively stepping through the first 42 positions
>> in the array; it's defined in terms of pointer addition, which in
>> turn is defined in terms of integer addition as applied to
>> positions within an array, something I'd consider to
>> unambiguously be "arithmetic".

> Internally, yes, the processor will calculate 0x1234 + 42 to
> give an address. But the programmer doesn't see that.

On many processors, the program has to compute the address
of array element 42, possibly as 0x1234+4*42. Note, though,
that VAX has an indexed addressing mode that includes the size
of the element being addressed. You can put 0x1234 in one
register, 42 in another, and get the appropriate array element.

> He just sees "take array and go 42 positions to the right". Like
> going to a street called array avenue and finding house 42.
> That's counting on.
> (Then normally you step through an array, you seldom see
> array[42] = x in real code, it's almost always
> for(i=0;i<42;i++)
> array[i] = x.
> Very clearly this is counting on.)

Or maybe:

for(i=0;i<42;i++) *array++ = x;

(If array was a pointer variable, and its value wasn't needed again.)

-- glen

Keith Thompson

unread,
May 20, 2013, 1:21:51 PM5/20/13
to
Malcolm McLean <malcolm...@btinternet.com> writes:
> On Monday, May 20, 2013 4:05:14 AM UTC+1, Keith Thompson wrote:
>> Malcolm McLean <malcolm...@btinternet.com> writes:
>> "Counting on"?
>>
>> I don't know just what distinction you're making. Whatever it
>> is, I don't believe it's meaningful.
>>
> Let's say I ask a primary school child to count out 42
> beans, and he does it correctly. This is not sufficient to
> be called "arithmetic". It's knowing the numbers, or "counting on".
> It's also "counting on" if we don't start from one, lets say we
> have a pile of beans, which we tell him contains a hundred, and
> ask him to unite another pile of 42 beans to it, and he takes
> beans from the 42 pile one by one and goes "a hundred and one,
> and hundred and two ..."
> That's not enough to be able to say that the child can do
> simple arithmetic.
>
> But let's say I ask him to count out forty two beans, count
> out five, and then unite the piles and count them. This is
> arithemetic. Because he's now taking two numbers, manipulating
> them to produce a third, and storing the result.

Hmm.

I agree that counting or incrementing is not the same thing as
addition, and that the former is simpler than the latter.

I'd say that both are forms of arithmetic. Knowing how to count
doesn't imply that you know how to add -- nor does knowing how to
add imply that you know how to multiply or divide.

> array[42] is counting on. ptr = array + 42 is arithmetic.

I *think* I see the distinction you're trying to make. I still
say it's neither valid nor meaningful.

There is a meaningful distinction between accessing array elements
using "array notation" (the "[]" operator) and accessing array
elements using explicit pointer arithmetic ("+" and "*"). The two
forms are by definition semantically equivalent, but they are in
some sense conceptually distinct -- especially for someone (like me,
for example) who first learned about array indexing in a language
that doesn't define it the way C does.

But I don't think that's quite what you're saying. As far as I can
tell, the point you're making requires ignoring a great many things.

James Kuyper

unread,
May 20, 2013, 1:24:20 PM5/20/13
to
On 05/20/2013 12:04 PM, Malcolm McLean wrote:
> On Monday, May 20, 2013 1:16:11 PM UTC+1, James Kuyper wrote:
>> On 05/20/2013 07:32 AM, Malcolm McLean wrote:
>>
>> I could, with equal justification, describe pointer+42 as "take the
>> position pointed at and go 42 positions to the right". For that matter,
>> I could equally well describe 3+42 as "Find 3 on the number line, and go
>> 42 numbers to the right". That is, incidentally, precisely the way I was
>> taught to understand addition, several decades ago. I was also taught
>> techniques for manipulating symbols to implement addition, but that
>> description is precisely how I was taught what addition means.
>>
> Exactly. "Counting on" is how people who haven't been to school add numbers.
> It's a semantic argument whether you call it "arithmetic" or not, but generally
> it's not considered to be so. It's the step before arithmetic.

So, if 3+42 is just "counting on", and not "arithmetic", what is
"arithmetic" (as you apply the term to ordinary numbers, outside of the
C context)?

> So at primary school, they work with the intuitive method first. It's a new
> pedagogical theory ...

"new"? I was taught using that method nearly a half-century ago. At that
time, the "new" teaching paradigm was not one that emphasized intuitive
approaches - it emphasized taking fundamental concepts, such as set
theory, that had previously only been taught to mathematicians in grad
school, and introducing them to everybody at a young age. I've no idea
whether using the "number line" was part of that movement - it's hard to
see as "advanced" something you were first introduced to at that young
of an age - which is, of course, the whole point of that approach.

> ... - I wasn't taught with number lines, but first with

glen herrmannsfeldt

unread,
May 20, 2013, 1:58:42 PM5/20/13
to
Keith Thompson <ks...@mib.org> wrote:

(snip, someone wrote)
>>>> Mr Kuyper pointed out that the 'modern' C standard defines array[i] to
>>>> *behave* as *(array+i), the fact is back then, array[i] *was*
>>>> *(array+i). The array operator was merely short-hand. Underneath,
>>>> after the preprocessor, array[i] became *(array + i ) which is
>>>> definitely 'arithmetic'. <g>

(snip)
>>> What do you think has changed since the "legacy days"?

(snip, then I wrote)
>> Processors have different addressing modes.

> I don't see how that would necessarily affect the semantics of the []
> operator, and it doesn't seem to be what ralph was referring to.

Yes, it doesn't affect the semantics.

But the people who started programming C on the PDP-11 used
pointer arithmetic, especially the *s++ form, because it was
faster (and smaller) on the PDP-11.

(Also, the PDP-11 compilers probably weren't so good at optimizing
the [] form.)

Now, VAX has an indexed addressing mode that knows about the size,
such that one can index into an array without computing the address
(in a register or memory) of the array element. No idea how fast it
is, though.

The purpose of the comment was that what was optimal on the PDP-11
isn't necessarily optimal on later processors.

-- glen

roland....@gmail.com

unread,
May 20, 2013, 4:53:08 PM5/20/13
to
Hey guys, why not talk about Peano's axioms ?
You are going off topic (haywire?) with the pointer arithmetic discussion.

I'd like the posts to stay on the following discussions :
- are some C++ features possible with C as it currently is ?
- would using a C++ compiler give unintended results on the C code ?

Also whenever you say : the code works the same whichever way,
I *hugely* favor the code that conspicuously shows one's intentions
(so yes for indexing and not playing with pointers, so yes for references
and not doing the same with pointers in the case of the alias)
- and let the optimizer do its job.

James Kuyper

unread,
May 20, 2013, 5:50:29 PM5/20/13
to
On 05/20/2013 04:53 PM, roland....@gmail.com wrote:
> Hey guys, why not talk about Peano's axioms ?
> You are going off topic (haywire?) with the pointer arithmetic discussion.
>
> I'd like the posts to stay on the following discussions :

People will discuss what they want to discuss; there's not much that can
be done about that in an unmoderated newsgroup. Even in a moderated
newsgroup, thread drift is normal and considered acceptable, so long as
it stays within the range of topics that the newsgroup is appropriate
for. The meaning of "pointer arithmetic" (at least as it applies to C
code) is definitely on-topic for this newsgroup.

> - are some C++ features possible with C as is currently is ?

Yes, but I wouldn't consider it likely. C still exists as a language
independent of C++ precisely because some people don't want it to
include very many C++ features.

> - would using a C++ compiler give unintended results on the C code

Yes. Annex C of the C++ standard has a good summary of the possible
problems. Many of the possible problems involve a mandatory diagnostic,
which can be avoided just by paying attention to error messages while
compiling. Many of the others involve undefined behavior in one of the
two languages - the only reliable protection is to know precisely what
the issue is. However, it's also possible to write code which has
defined behavior in both languages, but it's different defined behavior
in each. The code that I posted was intended to include exactly one
instance of every such case. If you're interested in such issues, I'd
recommend attempting the "exercise for the student" that I mentioned at
the end of that message.

> Also whenever you say : the code works the same whichever way,
> I *hugely* favor the code that conspicuously shows one's intentions
> (so yes for indexing and not playing with pointers, so yes for references
> and not doing the same with pointers in the case of the alias)
> - and let the optimizer do it's job.

In many contexts, pointers can show one's intentions just as clearly as
indexing arrays, and sometimes more clearly. Most of the str*() standard
libraries could, in principle, be written in strictly conforming C
(though they're likely to be implemented in assembler on many
platforms). That C code generally looks a lot cleaner if written to use
pointers rather than arrays.

In C++, the way operator overloading is supported makes the use of
references necessary. However, unless operator overloading is added to
C, the use of references rather than pointers does not (in my opinion)
greatly affect the ability to clearly show what the code is supposed to do.

Malcolm McLean

unread,
May 20, 2013, 6:36:02 PM5/20/13
to
On Monday, May 20, 2013 6:24:20 PM UTC+1, James Kuyper wrote:
> On 05/20/2013 12:04 PM, Malcolm McLean wrote:
>
>
> So, if 3+42 is just "counting on", and not "arithmetic",
> what is "arithmetic" (as you apply the term to ordinary
> numbers, outside of the C context)?
>
Counting on is "start at three, and continue counting ...
now stop". If you conceptualise it as 3 + 42, you've made
the leap to arithmetic. People generally don't, they
count on as a consequence of knowing the numbers in their
native language, but they can't add values unless they
are explicitly taught.

Ian Collins

unread,
May 20, 2013, 7:18:55 PM5/20/13
to
Which is one reason why young children are taught basic arithmetic using
number lines; they can conceptualise counting on (or back) N before they
understand add or subtract N.

--
Ian Collins

glen herrmannsfeldt

unread,
May 20, 2013, 7:31:18 PM5/20/13
to
Malcolm McLean <malcolm...@btinternet.com> wrote:

(snip)
> Counting on is "start at three, and continue counting ...
> now stop". If you conceptualise it as 3 + 42, you've made
> the leap to arithmetic. People generally don't, they
> count on as a consequence of knowing the numbers in their
> native language, but they can't add values unless they
> are explicitly taught.

Or watch kids, not so long after they learn to count, they
can "add" by counting, and then "subtract" by counting backwards.
(As long as the numbers don't get too big.)

-- glen

glen herrmannsfeldt

unread,
May 20, 2013, 7:47:06 PM5/20/13
to
James Kuyper <james...@verizon.net> wrote:

(snip, someone wrote)
>> Also whenever you say : the code works the same whichever way,
>> I *hugely* favor the code that conspicuously shows one's intentions
>> (so yes for indexing and not playing with pointers, so yes for
>> references and not doing the same with pointers in the case of
>> the alias) - and let the optimizer do it's job.

> In many contexts, pointers can show one's intentions just as clearly as
> indexing arrays, and sometimes more clearly. Most of the str*() standard
> libraries could, in principle, be written in strictly conforming C
> (though they're likely to be implemented in assembler on many
> platforms). That C code generally looks a lot cleaner if written to use
> pointers rather than arrays.

But mostly because we are used to seeing them that way:

while( *s++ = *t++) ;

or

for(i=0; t[i]; i++) s[i]=t[i];
s[i]=0;

The latter is almost legal Java, and so almost understandable
by a Java programmer. Fortran programmers would also have an
easier time understanding it. If you consider [] as a
subscript operation, it is close to the way mathematicians
would see it.

Other than that many C compilers might have been designed to
special case the former, I don't know that one is necessarily
faster than the other on current processors.


-- glen

James Kuyper

unread,
May 20, 2013, 11:25:12 PM5/20/13
to
So the difference between 3+42 as "counting on" and 3+42 as "arithmetic"
is entirely a matter of how the person implementing the operation thinks
about it? I doubt that there's any implementation of C, anywhere, where
array[42] is implemented in a fashion that corresponds to what you call
"counting on" rather than to arithmetic. Can you cite any?
--
James Kuyper

David Brown

unread,
May 21, 2013, 3:20:26 AM5/21/13
to
On 17/05/13 14:37, Ed Prochak wrote:
> On Thursday, May 16, 2013 3:24:14 PM UTC-4, Martin Shobe wrote:
>
>>
>> While this isn't the place to go too deep into C++'s bag of tricks.
>> there are things that you can do with references that you can't do
>> with pointers. For example, you can extend the lifetime of a
>> temporary.
>>
>>
>> Martin Shobe
>
> I don't get it. In C I can control the lifetime of everything. I
> don't have a garbage collector taking things away behind my back. So
> I don't see this as an advantage of references.
>
> ed
>

This is not about "garbage collection" (which C++ does not have any more
than C does - you need libraries or classes that support it if you want
garbage collection), or the lifetime of memory allocations (which is a
separate issue). C objects never "die" - they just fade away from lack
of use, and the compiler re-uses their space (registers, stack space,
etc.)

In C++, objects have a specific "point of death" - it is when their
destructor is called.

Normally for a local or temporary object, that will happen when it goes
out of scope (the compiler can, of course, shuffle things around to
improve the code - but logically the destructor is called at the end of
scope). Using a reference, you can extend and delay this destructor call.

If you are familiar with *nix, you can think of a pointer as a soft
link, and a reference as a hard link. A hard link will prevent a file
from being deleted when the first link is rm'ed, but a soft link offers
no such protection.

roland....@gmail.com

unread,
May 21, 2013, 4:09:23 AM5/21/13
to
>
> People will discuss what they want to discuss; there's not much that can
> be done about that in an unmoderated newsgroup. Even in a moderated
> newsgroup, thread drift is normal and considered acceptable, so long as
> it stays within the range of topics that the newsgroup is appropriate
> for. The meaning of "pointer arithmetic" (at least as it applies to C
> code) is definitely on-topic for this newsgroup.

James, you are right.
I cannot prevent people from rambling on the subject of what is or what is not arithmetic. At some point, to me it is nitpicking and not very useful for a news group on the C language - my opinion.

But if you want my pick on the subject ;-) : the discussion is somewhat missing the point. I think Malcolm was, in hindsight, talking about behaviour vs implementation.

A compiler language only talks about behaviour.
In the discussion, I feel that people talk about the equivalence of indexing and pointer (arithmetic, arghh -no) usage, mostly because they "know" how it translates in machine code.

But it hinges on knowledge of one implementation.
Let me propose for the sake of the discussion that we set the stage in 500 years from now : quantum computing has come of age now.
For some reason (let's say because of particle spin or charm) memory organization has to spread out odd and even indexes to different "address space", or whatever. Now seeing indexing and an equivalent linear addressing is totally irrelevant. I know what indexing means, I don't know what pointer meddling means - but I have to have my favorite compiler of the time do the "right" thing : probably something defined in terms of (quantum) indexing, not in linear address you had in mind 500 years ago - the tables have turned.

Malcolm McLean

unread,
May 21, 2013, 4:44:43 AM5/21/13
to
If array is on the stack, then a typical assembly listing
will have a pointer called stack top, which points to
one past the end of all local variables, often with a
bit of space for the function return address or other
bits and bats. An explicit array reference would resolve
to stack top, minus the position, so if array is fifty
elements wide, it would be stack top minus (50 - 42).

But the calculation may never be done at the assembly
level. Since instruction sets are designed to give
C compilers an easy time, often there is a special "base plus
offset" instruction. Internally in the chip the base and
the offset will be joined to form an address, but often
it's not by addition - base might always have the lower bits
clear, offsets might be restricted to the value of those
lower bits. So the base offset calculation is implemented
internally as a logical operation, not an addition,
which saves circuitry because you don't need a carry.

So really it's pretty hopeless to look at details of
implementation and say something about the source code,
the C programmer's view of the program and the chip's
view of it are different things.

BartC

unread,
May 21, 2013, 5:43:20 AM5/21/13
to


<roland....@gmail.com> wrote in message
news:34ff5158-6693-4efd...@googlegroups.com...

> A compiler language only talks about behaviour.
> In the discussion, I feel that people talk about the equivalence of
> indexing and pointer (arithmetic, arghh -no) usage, mostly because they
> "know" how it translates in machine code.
>
> But it hinges on knowledge of one implementation.
> Let me propose for the sake of the discussion that we set the stage in 500
> years from now : quantum computing has come of age now.
> For some reason (let's say because of particle spin or charm) memory
> organization has to spread out odd and even indexes to different "address
> space", or whatever. Now seeing indexing and an equivalent linear
> addressing is totally irrelevant. I know what indexing means, I don't know
> what pointer meddling means - but I have to have my favorite compiler of
> the time do the "right" thing : probably something defined in terms of
> (quantum) indexing, not in linear address you had in mind 500 years ago -
> the tables have turned.

The way the hardware works should be irrelevant; the model of pointer
arithmetic that C has should still work. Because that's how the language is
defined (indexing and pointer arithmetic *are* equivalent, because C says
so). It's the compiler's job to implement that model on whatever hardware is
available.

Of course, C was designed around the way computer memory has worked over the
last few decades; so possibly it won't be the best choice of language in 500
years' time.

--
Bartc

roland....@gmail.com

unread,
May 21, 2013, 6:46:57 AM5/21/13
to
> The way the hardware works should be irrelevant; the model of pointer
> arithmetic that C has should still work. Because that's how the language is
> defined (indexing and pointer arithmetic *are* equivalent, because C says
> so). It's the compiler's job to implement that model on whatever hardware is
> available.
>
> Of course, C was designed around the way computer memory has worked over the
> last few decades; so possibly it won't be the best choice of language in 500
> years' time.
> --
> Bartc

I totally concur.
It's only that pointer operations are not the same as memory address arithmetic.

I just wanted to show that in C people so easily cross from behavior to implementation, leading them to biased choices according to a perceived efficiency to the detriment of an adequate representation - again, let the compiler do the job.

James Kuyper

unread,
May 21, 2013, 7:20:49 AM5/21/13
to
On 05/21/2013 03:20 AM, David Brown wrote:
> On 17/05/13 14:37, Ed Prochak wrote:
>> On Thursday, May 16, 2013 3:24:14 PM UTC-4, Martin Shobe wrote:
>>
>>>
>>> While this isn't the place to go too deep into C++'s bag of tricks.
>>> there are things that you can do with references that you can't do
>>> with pointers. For example, you can extend the lifetime of a
>>> temporary.
>>>
>>>
>>> Martin Shobe
>>
>> I don't get it. In C I can control the lifetime of everything. I
>> don't have a garbage collector taking things away behind my back. So
>> I don't see this as an advantage of references.
>>
>> ed
>>
>
> This is not about "garbage collection" (which C++ does not have any more
> than C does - you need libraries or classes that support it if you want
> garbage collection), or the lifetime of memory allocations (which is a
> separate issue). C objects never "die" - they just fade away from lack
> of use, and the compiler re-uses their space (registers, stack space,
> etc.)

C objects have a lifetime, and the C standard defines very precisely
when that lifetime ends. It is not just a matter of "fading away from
lack of use" - though it is in fact the case that it's no longer
possible to use an object once its lifetime has ended. The C standard
doesn't define a meaning for the term "die", but it seems reasonable to
me to use that term to refer to what happens when the lifetime of a C
object ends.

> In C++, objects have a specific "point of death" - it is when their
> destructor is called.

If an object has a non-trivial destructor, C++ rules define that the
lifetime of the object ends when that destructor starts executing.
Otherwise, it ends when storage for that object is no longer allocated,
which is essentially the same as the C rules. However, unless the
destructor is invoked explicitly, an implementation has considerable
freedom to delay execution of a destructor, so the point of death is not
as specific as you suggest, or at least no more specific than the point
of death for a C object.
--
James Kuyper

James Kuyper

unread,
May 21, 2013, 7:34:04 AM5/21/13
to
On 05/21/2013 04:44 AM, Malcolm McLean wrote:
...
> If array is on the stack, then a typical assembly listing
> will have a pointer called stack top, which points to
> one past the end of all local variables, often with a
> bit of space for the function return address or other
> bits and bats. An explict array reference would resolve
> to stack top, minus the position, so if array is fifty
> elements wide, it would be stack top minus (50 - 42).
>
> But the calculation may never be done at the assembly
> level. Since instruction sets are designed to give
> C compilers an easy time, often there a special "base plus
> offset" instruction. Internally in the chip the base and
> the offset will be joined to form an address, but often
> it's not by addition -base might always have the lower bits
> clear, offsets might be restricted to the value of those
> lower bits. So the base offset calculation is implemented
> internally as a logical operation, not an addition,
> which saves circuitry because you don't need a carry.
>
> So really it's pretty hopeless to look at details of
> implementation and say something about the source code,
> the C programmer's view of the program and the chip's
> view of it are different things.

So, for ordinary numbers being added by people, the difference between
3+42 as "counting on" and 3+42 as "arithmetic" depends upon how the
person thinks about the process. However, for pointers being added by a
computer, array[42] is inherently "counting on", regardless of how the
computer implements it, even though the way that the computer handles it
has no meaningful similarity to the way human beings do "counting on"?

The definition of "counting on", combined with its claimed applicability
to array subscription, just gets less clear every time you add to the
explanation. You don't seem to be making any headway in convincing me
that the concept makes any sense, and I don't seem to be making any
headway in convincing you that it's nonsense, so I'm just going to give
up on this.
--
James Kuyper

James Kuyper

unread,
May 21, 2013, 8:00:49 AM5/21/13
to
On 05/21/2013 04:09 AM, roland....@gmail.com wrote:
...
> James, you are right. I cannot prevent people from rambling on the
> subject of what is or what is not arithmetic. At some point, to me it
> is nitpicking and not very useful for a news group on the C language
> - my opinion.

I find the nitpicking somewhat more interesting than your original
question, but then I'm well known for my pedantry. Most of the other
people monitoring this newsgroup seem to find both topics equally
uninteresting. You seem to think I'm distracting people from discussing
the topic you want to talk about. The fact that no one's talking about
your topic is due to the fact that no one monitoring this newsgroup is
interested in talking about it; my side issue has nothing to do with that.

...
> A compiler language only talks about behaviour. In the discussion, I
> feel that people talk about the equivalence of indexing and pointer
> (arithmetic, arghh -no) usage, mostly because they "know" how it
> translates in machine code.
>
> But it hinges on knowledge of one implementation.

My understanding of what constitutes pointer arithmetic is grounded
entirely in the abstract definition of the language itself, and does not
depend in any way upon the implementation of that language. That seems
reasonable to me, since my understanding of what "arithmetic" means for
ordinary numbers is also grounded entirely in the abstract definitions
of mathematics, and does not depend in any way upon the specific symbol
manipulation processes that people use to implement arithmetic.

Malcolm's understanding does appear to depend upon the implementation,
but in ways that make no sense to me: array subscription has never, as
far as I know, been implemented in a way that has any meaningful
similarity to "counting on", as Malcolm defines that term for ordinary
numbers. It could be done; it wouldn't violate
any of C's requirements to do so; but it would make evaluation of
array[n] an O(n) process, and I can't imagine any reason why anyone
would want to do that when ways of implementing it in constant time are
well known.
--
James Kuyper

Martin Shobe

unread,
May 21, 2013, 8:02:46 AM5/21/13
to
Can you give an example of when C++ has any freedom to delay invoking
a destructor that isn't an example of the "as if" rule?

Martin Shobe

James Kuyper

unread,
May 21, 2013, 8:46:00 AM5/21/13
to
On 05/21/2013 08:02 AM, Martin Shobe wrote:
> On 5/21/2013 6:20 AM, James Kuyper wrote:
>> On 05/21/2013 03:20 AM, David Brown wrote:
...
>>> In C++, objects have a specific "point of death" - it is when their
>>> destructor is called.
>>
>> If an object has a non-trivial destructor, C++ rules define that the
>> lifetime of the object ends when that destructor starts executing.
>> Otherwise, it ends when storage for that object is no longer allocated,
>> which is essentially the same as the C rules. However, unless the
>> destructor is invoked explicitly, an implementation has considerable
>> freedom to delay execution of a destructor, so the point of death is not
>> as specific as you suggest, or at least no more specific than the point
>> of death for a C object.
>>
> Can you give an example of when C++ has any freedom to delay invoking
> a destructor that isn't an example of the "as if" rule?

The "as-if" rule often applies - actually making use of a C++ object
after its lifetime has ended has undefined behavior, so the only things
that prevent a delay in execution of a destructor are any side-effects
it may cause. If all the destructor does is release resources (a common
case), the "as-if" rule allows it to be delayed almost indefinitely.

In any event, there is one case that does not rely on the "as-if" rule:
"The completions of the destructors for all initialized objects with
thread storage duration within that thread are sequenced before the
initiation of the destructors of any object with static storage
duration." (3.6.3p1) It could be a VERY long time between the end of any
particular thread and the time when objects with static storage duration
get destroyed.
--
James Kuyper

Malcolm McLean

unread,
May 21, 2013, 9:02:44 AM5/21/13
to
On Tuesday, May 21, 2013 12:34:04 PM UTC+1, James Kuyper wrote:
> On 05/21/2013 04:44 AM, Malcolm McLean wrote:
>
> for ordinary numbers being added by people, the difference
> between 3+42 as "counting on" and 3+42 as "arithmetic"
> depends upon how the person thinks about the process.
>
Yes. Counting on is how you do addition, without really
understanding that you are adding two values.
Actually, if we define addition as operation one,
multiplication as operation two, exponentiation as
operation three, then there's an "operation zero" which bears
the same relation to addition as addition does to
multiplication.
Operation zero is "a plus count of a's". So it's a counting
operation, you start from a then count one for each time
the value a appears in the list.

> However, for pointers being added by a computer, array[42]
> is inherently "counting on", regardless of how the
> computer implements it, even though the way that the
> computer handles it has no meaningful similarity to the
> way human beings do "counting on"?
>
Looking at the machine code is a distraction. There might
be an explicit "array + 42" op code in there, there
might not. The store might even be optimised out to a
temporary register.

glen herrmannsfeldt

unread,
May 21, 2013, 9:49:56 AM5/21/13
to
roland....@gmail.com wrote:

(snip)

> But if you want my pick on the subject ;-) : the discussion is
> somewhat missing the point. I think Malcolm was in hindsight
> talking about behaviour vs implementation.

> A compiler language only talks about behaviour.
> In the discussion, I feel that people talk about the equivalence
> of indexing and pointer (arithmetic, arghh -no) usage, mostly
> because they "know" how it translates in machine code.

Yes. And while it is well known that the ++ and -- operators
did not have their origin in the PDP-11, much early C coding
was done on that machine.

As far as I know, the while(*s++=*t++) form was popularized on
the PDP-11 (even if it existed earlier) where it was known to
generate fast code. (And also a few less keystrokes are used.)

As machines changed, and compilers improved, people didn't
rethink the way to write those loops.

There should be reasonable names for the "indexing in loop"
and "increment pointers in loop" form, but I don't know what
they are.

-- glen


glen herrmannsfeldt

unread,
May 21, 2013, 10:07:16 AM5/21/13
to
James Kuyper <james...@verizon.net> wrote:

(snip)
> I find the nitpicking somewhat more interesting than your original
> question, but then I'm well known for my pedantry. Most of the other
> people monitoring this newsgroup seem to find both topics equally
> uninteresting. You seem to think I'm distracting people from discussing
> the topic you want to talk about. The fact that no one's talking about
> your topic is due to the fact that no one monitoring this newsgroup is
> interested in talking about it; my side issue has nothing to do with that.

(snip)
>> A compiler language only talks about behaviour. In the discussion, I
>> feel that people talk about the equivalence of indexing and pointer
>> (arithmetic, arghh -no) usage, mostly because they "know" how it
>> translates in machine code.

>> But it hinges on knowledge of one implementation.

Yes.

> My understanding of what constitutes pointer arithmetic is grounded
> entirely in the abstract definition of the language itself, and does not
> depend in any way upon the implementation of that language. That seems
> reasonable to me, since my understanding of what "arithmetic" means for
> ordinary numbers is also grounded entirely in the abstract definitions
> of mathematics, and does not depend in any way upon the specific symbol
> manipulation processes that people use to implement arithmetic.

Yes, but consider two specific examples to compare:

for(i=0;i<n;i++) *s++=*t++;
s -=n;
t -=n;

and

for(i=0;i<n;i++) s[i]=t[i];

it doesn't seem so unreasonable to call these the "pointer
arithmetic" version and the "indexed version" even though,
as we well know, indexing is defined in C in terms of pointer
arithmetic.

> Malcolm's understanding does appear to depend upon the implementation,
> but in ways that make no sense to me: implementation of array
> subscription has never, as far as I know, ever been implemented in a way
> that has any meaningful similarity to "counting on", as Malcolm defines
> that term for ordinary numbers.

Yes, but it does seem a convenient example to show that there are
two different ways to look at what is otherwise the same problem.

In the two cases above, one would expect the compiler to increment
two registers containing addresses for the first case, and to
increment one register containing the index for the second.

Fortran compilers have known how to optimize the index case when
needed for about 40 years now, keeping the addresses in temporary
registers. The former example requires restoring the pointers, as
they might be used again. That seems fair to me, but in some
cases, such as strcpy, not needed.

Strictly as written, the first case might require fetching two
pointers from memory each iteration, where the second might keep
the index (and loop invariant origins) in registers.

-- glen

Keith Thompson

unread,
May 21, 2013, 11:45:58 AM5/21/13
to
David Brown <da...@westcontrol.removethisbit.com> writes:
[...]
> This is not about "garbage collection" (which C++ does not have any more
> than C does - you need libraries or classes that support it if you want
> garbage collection), or the lifetime of memory allocations (which is a
> separate issue). C objects never "die" - they just fade away from lack
> of use, and the compiler re-uses their space (registers, stack space,
> etc.)

As James Kuyper points out, the beginning and end of the lifetimes of
objects are very well defined in C. The end of an object's lifetime
typically doesn't require any specific action to occur; it just means
that the behavior of accessing the object becomes undefined.

> In C++, objects have a specific "point of death" - it is when their
> destructor is called.

Not all C++ objects have destructors. (C++'s definition of the word
"object" is quite similar to C's, and has nothing to do with
"object-oriented"; given "int foo;", "foo" is an object.)

> Normally for a local or temporary object, that will happen when it goes
> out of scope (the compiler can, of course, shuffle things around to
> improve the code - but logically the destructor is called at the end of
> scope). Using a reference, you can extend and delay this destructor call.

Scope and lifetime, in both C and C++, are distinct things. For
example, if an object is defined with the "static" keyword inside a
block, it has block scope (meaning that its identifier is visible only
inside the enclosing block), but its lifetime (static storage duration)
is the entire execution of the program.

For a non-static object defined inside a block, its scope and its
lifetime end at the same point (the closing "}" of the nearest enclosing
block) -- but the scope is a region of program text, and the lifetime is
a range of time during the program's execution.
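
As a minimal sketch of the "static" case in C (the function name
next_ticket is made up for illustration): the identifier "counter" can
only be named inside the function, yet the object it designates exists
for the whole run, so handing out its address is fine.

```c
/* Hypothetical helper, for illustration only.
   Scope:    the name `counter` is visible only inside this block.
   Lifetime: the object has static storage duration, so it exists for
             the entire execution of the program. */
int *next_ticket(void)
{
    static int counter = 0;   /* initialized once, not on every call */
    counter++;
    return &counter;          /* valid: the object outlives the call */
}
```

Every call returns the same address, and the value stored there keeps
counting up between calls. Doing the same with a non-static local would
return a pointer to an object whose lifetime has already ended.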

I don't know C++ as well as I know C, but I don't think that a reference
can affect the lifetime of an object with automatic storage duration.
For example, I think this:

int& func() {
    int local = 42;
    int &ref = local;
    return ref;
}

has undefined behavior in C++.

> If you are familiar with *nix, you can think of a pointer as a soft
> link, and a reference as a hard link. A hard link will prevent a file
> from being deleted when the first link is rm'ed, but a soft link offers
> no such protection.

Keith Thompson

unread,
May 21, 2013, 11:52:51 AM5/21/13
to
roland....@gmail.com writes:
[...]
> A compiler language only talks about behaviour. In the discussion, I
> feel that people talk about the equivalence of indexing and pointer
> (arithmetic, arghh -no) usage, mostly because they "know" how it
> translates in machine code.
[...]

At least around here, most people talk about the equivalence of
indexing and pointer arithmetic simply because the C language
standard defines them to be equivalent. It defines the behavior
of pointer arithmetic (in abstract terms, not as machine-level
operations), and defines the [] operator on top of that.

I suppose it could have been done the other way around with no change
in semantics. Indexing could have been defined as a fundamental
operation, with *ptr defined as equivalent to ptr[0].

C's semantics are based on, but avoid directly depending on,
machine-level semantics. That's the motivation for defining []
in terms of pointer arithmetic.
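
A small sketch of that equivalence, assuming nothing beyond the
language definition (the function names are invented for this example):
a[i] means *(a + i), so the two spellings designate the same element,
and because the addition commutes, i[a] does too.

```c
#include <stddef.h>

/* Returns 1 if all three spellings designate the same element. */
int same_element(int *a, size_t i)
{
    return &a[i] == a + i      /* a[i] is defined as *(a + i)        */
        && &i[a] == a + i;     /* i + a == a + i, so i[a] also works */
}

/* *p and p[0] designate the same object. */
int deref_is_index_zero(int *p)
{
    return &*p == &p[0];
}
```

Both functions return 1 for any valid pointer into an array (with i
within bounds), on any conforming implementation, whatever machine code
is generated.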

Bart van Ingen Schenau

unread,
May 21, 2013, 1:00:58 PM5/21/13
to
On Tue, 21 May 2013 13:49:56 +0000, glen herrmannsfeldt wrote:

> There should be reasonable names for the "indexing in loop" and
> "increment pointers in loop" form, but I don't know what they are.

I would think that "array notation" and "pointer notation" cover the
territory well enough.
The most important aspect of the terms being that they indicate what it
looks like in the source code, not how that source code should be
interpreted by either the human reader or the compiler.

>
> -- glen

Bart v Ingen Schenau

Bart van Ingen Schenau

unread,
May 21, 2013, 1:22:47 PM5/21/13
to
On Tue, 21 May 2013 14:07:16 +0000, glen herrmannsfeldt wrote:

> James Kuyper <james...@verizon.net> wrote:
>
>> My understanding of what constitutes pointer arithmetic is grounded
>> entirely in the abstract definition of the language itself, and does
>> not depend in any way upon the implementation of that language. That
>> seems reasonable to me, since my understanding of what "arithmetic"
>> means for ordinary numbers is also grounded entirely in the abstract
>> definitions of mathematics, and does not depend in any way upon the
>> specific symbol manipulation processes that people use to implement
>> arithmetic.
>
> Yes, but consider two specific examples to compare:
>
> for(i=0;i<n;i++) *s++=*t++;
> s -=n;
> t -=n;
>
> and
>
> for(i=0;i<n;i++) s[i]=t[i];
>
> it doesn't seem so unreasonable to call these the "pointer arithmetic"
> version and the "indexed version" even though, as we well know, indexing
> is defined in C in terms of pointer arithmetic.

It is not unreasonable to refer to the two different ways to write that
loop with different names (I personally prefer respectively "pointer
notation" and "array notation"), but it *is* unreasonable to claim that
the index/array version does not have any pointer arithmetic in it.

<snip>

> Yes, but it does seem a convenient example to show that there are two
> different ways to look at what is otherwise the same problem.

But it does not help if you need to re-define core language constructs in
the process, such as re-defining array indexing such that it no longer
involves pointer arithmetic, while talking about the C language.

>
> In the two cases above, one would expect the compiler to increment two
> registers containing addresses for the first case, and to increment one
> register containing the index for the second.

For a straight-forward, unoptimized implementation, yes.

>
> Fortran compilers have known how to optimize the index case when needed
> for about 40 years now, keeping the addresses in temporary registers.
> The former example requires restoring the pointers, as they might be
> used again. That seems fair to me, but in some cases, such as strcpy,
> not needed.
>
> Strictly as written, the first case might require fetching two pointers
> from memory each iteration, where the second might keep the index (and
> loop invariant origins) in registers.

I am sorry, but I don't see why in the first case the pointers can't be
held in registers as well.
And you forgot an address calculation on each iteration of the second
loop, which isn't needed in the first loop.

I don't believe there is a clear-cut winner here and any half-decent
optimizer should be able to produce entirely equivalent code for both of
them.
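
For reference, the two styles can be sketched as complete functions
(the names copy_ptr and copy_idx are made up here). Wrapping the loops
in functions also sidesteps the "restore the pointers" step, because s
and t are then local copies of the caller's pointers.

```c
#include <stddef.h>

/* "Pointer notation": step both pointers through the arrays. */
void copy_ptr(int *s, const int *t, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        *s++ = *t++;
}

/* "Array notation": index from fixed base pointers. */
void copy_idx(int *s, const int *t, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        s[i] = t[i];
}
```

The two functions have identical observable behaviour; whether a given
compiler emits identical code for them is an implementation matter and
is not claimed here.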

glen herrmannsfeldt

unread,
May 21, 2013, 3:48:17 PM5/21/13
to
Bart van Ingen Schenau <ba...@ingen.ddns.info.invalid> wrote:

(snip)
>> James Kuyper <james...@verizon.net> wrote:
>>> My understanding of what constitutes pointer arithmetic is grounded
>>> entirely in the abstract definition of the language itself, and does
>>> not depend in any way upon the implementation of that language.

(snip, then I wrote)
>> Yes, but consider two specific examples to compare:

>> for(i=0;i<n;i++) *s++=*t++;
>> s -=n;
>> t -=n;

>> and

>> for(i=0;i<n;i++) s[i]=t[i];

>> it doesn't seem so unreasonable to call these the "pointer arithmetic"
>> version and the "indexed version" even though, as we well know, indexing
>> is defined in C in terms of pointer arithmetic.

> It is not unreasonable to refer to the two different ways to write that
> loop with different names (I personally prefer respectively "pointer
> notation" and "array notation"), but it *is* unreasonable to claim that
> the index/array version does not have any pointer arithmetic in it.

Any two names are fine with me, as long as they are different.

> <snip>

>> Yes, but it does seem a convenient example to show that there are two
>> different ways to look at what is otherwise the same problem.

> But it does not help if you need to re-define core language constructs
> in the process, such as re-defining array indexing such that it no
> longer involves pointer arithmetic, while talking about the C language.

I don't think it requires redefining the language, but how you use
the available language features. a[i] is defined in terms of pointer
arithmetic, but that doesn't require that the compiler actually
do the arithmetic. On the PDP-11, it may have actually done it.

>> In the two cases above, one would expect the compiler to increment two
>> registers containing addresses for the first case, and to increment one
>> register containing the index for the second.

> For a straight-forward, unoptimized implementation, yes.

It could get more interesting if you add:

volatile i, s, t;

>> Fortran compilers have known how to optimize the index case when needed
>> for about 40 years now, keeping the addresses in temporary registers.
>> The former example requires restoring the pointers, as they might be
>> used again. That seems fair to me, but in some cases, such as strcpy,
>> not needed.

>> Strictly as written, the first case might require fetching two pointers
>> from memory each iteration, where the second might keep the index (and
>> loop invariant origins) in registers.

> I am sorry, but I don't see why in the first case the pointers can't be
> held in registers as well.

Yes, they can be. Assuming the processor has enough available.
(And you don't add volatile.)

> And you forgot an address calculation on each iteration of the second
> loop, which isn't needed in the first loop.

You mean multiplying by sizeof(*s) and sizeof(*t)?

For one, some hardware has indexing modes that index in units of
the size being referenced. That is in addition to the ability to
do indexed addressing at all.

Otherwise, previously mentioned Fortran compilers have been known
to generate a loop with the loop variable four times the value of i,
and so avoid any multiply. Even more, fold in any constant multiplier
in the index itself:

for(i=0;i<n;i++) s[3*i]=t[3*i];

compilers can avoid both the explicit multiply by three, and the
implied multiply by sizeof(*s) and sizeof(*t).
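
A hand-written sketch of that transformation, with invented function
names; it only illustrates what a strength-reducing compiler might do
and is not the output of any particular compiler. The second function
has no multiply in the loop body, only pointers stepping three elements
at a time.

```c
#include <stddef.h>

/* As written above: copies elements 0, 3, 6, ... */
void copy_stride3_idx(int *s, const int *t, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        s[3 * i] = t[3 * i];
}

/* Strength-reduced by hand: the index multiply becomes a pointer step.
   Precondition (assumed for this sketch): both arrays hold at least
   3 * n elements, so the final s += 3 and t += 3 never move past the
   one-past-the-end position. */
void copy_stride3_ptr(int *s, const int *t, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        *s = *t;
        s += 3;
        t += 3;
    }
}
```

The precondition matters in source code: a compiler is free to step
registers past the end internally, but a C programmer forming such a
pointer gets undefined behaviour.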

> I don't believe there is a clear-cut winner here and any half-decent
> optimizer should be able to produce entirely equivalent code for
> both of them.

Consider the Fortran 66 loop:

I=5
DO 1 I=1,10
1 S(I)=T(I)

what should the value of I be after the loop?
(Especially when the compiler keeps I in a register.)

-- glen

army1987

unread,
May 21, 2013, 4:51:26 PM5/21/13
to
On Sun, 19 May 2013 09:35:48 +0000, Bart van Ingen Schenau wrote:

> This is the only silent difference between C and C++ that I am aware of.
> If there are any others, they will likely affect you even less than this
> one.

Well, if you're using C89, there's silly stuff like 2//* */2, which will
equal 2 in C++ (and C99) but 1 in C89.
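
Spelled out as a sketch (the function name is invented): which value
comes back depends on whether the compiler treats "//" as the start of
a comment.

```c
/* C89 rules: "//" is not a comment, so the initializer reads
   2 / (comment) 2, i.e. 2 / 2, which is 1.
   C99 and later (and C++): "//" starts a comment that runs to the end
   of the line, so the initializer is just 2. */
int slash_slash_value(void)
{
    int v = 2 //* */ 2
        ;
    return v;
}
```

Note that many compilers accept "//" comments as an extension even in
their default pre-C99 modes, so seeing 1 usually requires asking for
strict C89/C90 conformance.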

--
[ T H I S S P A C E I S F O R R E N T ]
Too little culture makes us ignorant, too much makes us mad.
-- fathermckenzie of it.cultura.linguistica.italiano
<http://xkcd.com/397/>

army1987

unread,
May 21, 2013, 5:08:37 PM5/21/13
to
On Sun, 19 May 2013 12:30:54 -0400, James Kuyper wrote:

> You can compile and link both modules as C code, or as C++ code; the
> resulting executables are guaranteed by the applicable standards to exit
> with a failure status, [...]

[...]

> int main(void) {
> return Cfunc(&Cppname_Ctype) && tag;
> }

Actually, it's implementation-defined what values EXIT_SUCCESS and
EXIT_FAILURE have and what exit(i) does for i other than 0 or those two;
so in principle an implementation could have EXIT_SUCCESS equal to 1, and
then the program would return a successful status no matter what. (But
that's trivial to patch up.)

James Kuyper

unread,
May 21, 2013, 6:28:52 PM5/21/13
to
On 05/21/2013 05:08 PM, army1987 wrote:
> On Sun, 19 May 2013 12:30:54 -0400, James Kuyper wrote:
>
>> You can compile and link both modules as C code, or as C++ code; the
>> resulting executables are guaranteed by the applicable standards to exit
>> with a failure status, [...]
>
> [...]
>
>> int main(void) {
>> return Cfunc(&Cppname_Ctype) && tag;
>> }
>
> Actually, it's implementation-defined what values EXIT_SUCCESS and
> EXIT_FAILURE have and what exit(i) does for i other than 0 or those two;
> so in principle an implementation could have EXIT_SUCCESS equal to 1, and
> then the program would return a successful status no matter what. (But
> that's trivial to patch up.)

I now remember that there was a "? EXIT_SUCCESS : EXIT_FAILURE" at the
end of that expression in an earlier version of that program; it must
have gotten lost at some point during editing. That would also explain
the discrepancy Bart brought up - which doesn't explain why I failed to
catch that discrepancy during testing (I know that I tested it - but I
must have been concentrating on getting different results with the two
compilers, rather than matching the description I had already written
up). I remember wanting to remove the dependency on <stdlib.h> - but I
think the reason why I wanted to do that is no longer present in this
version of the code.
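
The portable pattern under discussion can be sketched as follows; the
helper name exit_status_for is hypothetical and does not come from the
program being discussed.

```c
#include <stdlib.h>

/* Hypothetical helper: map a truth value to a portable exit status.
   Only 0, EXIT_SUCCESS and EXIT_FAILURE have standard-defined meanings
   as exit statuses; any other value is implementation-defined. */
int exit_status_for(int ok)
{
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

In main this would be used as "return exit_status_for(result);" rather
than returning the raw result of a && expression.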

army1987

unread,
May 21, 2013, 6:52:39 PM5/21/13
to
On Fri, 17 May 2013 06:42:02 -0700, Paul N wrote:

> C does of course have types, but I
> think it would be considered a leap too far for a function call that
> looks like it is referring to a variable to actually be pushing a
> disguised pointer instead.

Even in C++, I can't stand using non-const references as function
parameters in most cases (at least for POD types), because I like being
able to assume that calling foo(i) won't affect i.

Martin Shobe

unread,
May 21, 2013, 10:31:32 PM5/21/13
to
On 5/21/2013 7:46 AM, James Kuyper wrote:
> On 05/21/2013 08:02 AM, Martin Shobe wrote:
>> On 5/21/2013 6:20 AM, James Kuyper wrote:
>>> On 05/21/2013 03:20 AM, David Brown wrote:
> ...
>>>> In C++, objects have a specific "point of death" - it is when their
>>>> destructor is called.
>>>
>>> If an object has a non-trivial destructor, C++ rules define that the
>>> lifetime of the object ends when that destructor starts executing.
>>> Otherwise, it ends when storage for that object is no longer allocated,
>>> which is essentially the same as the C rules. However, unless the
>>> destructor is invoked explicitly, an implementation has considerable
>>> freedom to delay execution of a destructor, so the point of death is not
>>> as specific as you suggest, or at least no more specific than the point
>>> of death for a C object.
>>>
>> Can you give an example of when C++ has any freedom to delay invoking
>> a destructor that isn't an example of the "as if" rule?
>
> The "as-if" rule often applies - actually making use of a C++ object
> after its lifetime has ended has undefined behavior, so the only things
> that prevent a delay in execution of a destructor are any side-effects
> it may cause. If all the destructor does is release resources (a common
> case), the "as-if" rule allows it to be delayed almost indefinitely.

There's a reason why I wanted to exclude those that were a result of the
"as if" rule. Since the actual execution of a destructor isn't
observable, you can't even count on an explicitly invoked destructor
executing if the implementation can find a different way to do the same
thing.

>
> In any event, there is one case that does not rely on the "as-if" rule:
> "The completions of the destructors for all initialized objects with
> thread storage duration within that thread are sequenced before the
> initiation of the destructors of any object with static storage
> duration." (3.6.3p1) It could be a VERY long time between the end of any
> particular thread and the time when objects with static storage duration
> get destroyed.
>

That's more of what I was looking for. Thanks.

Martin Shobe

Rosario1903

unread,
May 22, 2013, 3:55:38 AM5/22/13
to
On Tue, 21 May 2013 22:52:39 +0000 (UTC), army1987
<army...@ask-for-it.invalid> wrote:

>On Fri, 17 May 2013 06:42:02 -0700, Paul N wrote:
>
>> C does of course have types, but I
>> think it would be considered a leap too far for a function call that
>> looks like it is referring to a variable to actually be pushing a
>> disguised pointer instead.
>
>Even in C++, I can't stand using non-const references as function
>parameters in most cases (at least for POD types), because I don't like
>to not be able to assume that calling foo(i) won't affect i.
There is no need to use the useless word "const"...
It is enough to adopt, as a default programming rule, that references
and ordinary values passed as function arguments, as in
f(a,b,c)
are not modified by the call: a, b and c are unchanged just after it
unless they are defined with pointer type.

To modify a value, the argument has to be a pointer, not a value or a
reference.
e.g.
int a;
f(&a, b)
so the value that a contains can be changed

if
g(int& a, b)

int a;
g(a,b)

then a is not changed.

Stephen Sprunk

unread,
May 22, 2013, 5:01:38 AM5/22/13
to
On 22-May-13 02:55, Rosario1903 wrote:
> one would not use the useless word "const"... it is enough one has
> the formal programming law that arg value for function as in f(a,b,c)
> not modify a, b c just after the call if their definition [a,b,c] is
> not type pointer
>
> for modify that value their have to be pointers not values or
> reference. e.g int a; f(&a, b) so the value that contain a can be
> changed
>
> if g(int& a, b)
>
> int a; g(a,b)
>
> than a is not changed..

The way C++ does operator overloading requires that functions be able to
modify (non-const) reference arguments.

IIRC, C++'s other major need for references is to avoid needing to call
copy constructors for const arguments.

Since C doesn't have operator overloading or constructors, though, they
would be mere syntactic sugar, not a necessity as they are in C++.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

Rosario1903

unread,
May 22, 2013, 7:57:41 AM5/22/13
to
On Wed, 22 May 2013 04:01:38 -0500, Stephen Sprunk
<ste...@sprunk.org> wrote:

>On 22-May-13 02:55, Rosario1903 wrote:
>> one would not use the useless word "const"... it is enough one has
>> the formal programming law that arg value for function as in f(a,b,c)
>> not modify a, b c just after the call if their definition [a,b,c] is
>> not type pointer
>>
>> for modify that value their have to be pointers not values or
>> reference. e.g int a; f(&a, b) so the value that contain a can be
>> changed
>>
>> if g(int& a, b)
>>
>> int a; g(a,b)
>>
>> than a is not changed..
>
>The way C++ does operator overloading requires that functions be able to
>modify (non-const) reference arguments.

But as a rule for writing code, I do not modify them.

>IIRC, C++'s other major need for references is to avoid needing to call
>copy constructors for const arguments.

Do not compiler find if no instruction modify what reference
'point'to?

A reference is useful for passing big data in function args, as with a
pointer, without writing the big data onto the stack.

Bart van Ingen Schenau

unread,
May 22, 2013, 9:57:27 AM5/22/13
to
On Tue, 21 May 2013 19:48:17 +0000, glen herrmannsfeldt wrote:

> Bart van Ingen Schenau <ba...@ingen.ddns.info.invalid> wrote:
>
>> And you forgot an address calculation on each iteration of the second
>> loop, which isn't needed in the first loop.
>
> You mean multiplying by sizeof(*s) and sizeof(*t)?

No, I meant adding the offset/index value to the origin pointers.
But I see that that can be part of the CPU instruction.

<snip>

>> I don't believe there is a clear-cut winner here and any half-decent
>> optimizer should be able to produce entirely equivalent code for both
>> of them.
>
> Consider the Fortran 66 loop:
>
> I=5
> DO 1 I=1,10
> 1 S(I)=T(I)
>
> what should the value of I be after the loop? (Especially when the
> compiler keeps I in a register.)

I am sorry, but I am not familiar enough with Fortran 66 to determine
that with any certainty.
My initial guess would be 10, but I would not be surprised if it should
be 5 or 11.
And if it is about the value in the register, I would have no problems
with that being 20, 40, or even 0.

But I don't see what that has to do with what an optimizer can do with two
semantically equivalent pieces of code written in different styles.

Stephen Sprunk

unread,
May 22, 2013, 11:30:45 AM5/22/13
to
On 22-May-13 06:57, Rosario1903 wrote:
> On Wed, 22 May 2013 04:01:38 -0500, Stephen Sprunk
> <ste...@sprunk.org> wrote:
>> On 22-May-13 02:55, Rosario1903 wrote:
>>> one would not use the useless word "const"... it is enough one
>>> has the formal programming law that arg value for function as in
>>> f(a,b,c) not modify a, b c just after the call if their
>>> definition [a,b,c] is not type pointer
>>>
>>> for modify that value their have to be pointers not values or
>>> reference. e.g int a; f(&a, b) so the value that contain a can
>>> be changed
>>>
>>> if g(int& a, b)
>>>
>>> int a; g(a,b)
>>>
>>> than a is not changed..
>>
>> The way C++ does operator overloading requires that functions be
>> able to modify (non-const) reference arguments.
>
> but i not modify them as a law for write code

Consider this basic snippet:

cin << "Hello World!" << endl;

That calls operator<<() twice, and in both cases, the function modifies
its first reference argument (cin); it cannot work any other way.

Granted, you may not take advantage of that feature of the language for
your own functions, but you're probably missing out on opportunities to
write more efficiently and clearly when that's the best solution.
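
As a rough sketch of what I mean by an operator that modifies its first reference argument: Sink below is a hypothetical stand-in, not the real stream class, and its members are invented for the example.

```cpp
#include <string>

// Hypothetical stand-in for a stream; NOT std::ostream.
struct Sink {
    std::string text;  // everything written so far
    int writes = 0;    // number of operator<< calls
};

// Takes the sink by non-const reference, changes it, and returns the same
// object so that calls can be chained: (sink << a) << b.
Sink &operator<<(Sink &out, const std::string &s) {
    out.text += s;
    ++out.writes;
    return out;
}
```

With `Sink s;`, the expression `s << "Hello" << " World!"` calls the operator twice and both calls change s itself. Written this way, the function needs a non-const reference (or a pointer) as its first parameter.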

>> IIRC, C++'s other major need for references is to avoid needing to
>> call copy constructors for const arguments.
>
> Do not compiler find if no instruction modify what reference
> 'point'to?

Sorry; I don't understand what you're saying here.

> reference is useful for pass big data as pointer in function args
> without write the big data in to the stack

Pass-by-reference and pass-by-pointer are no different in that way.

Like pointer arguments, a reference argument may be const or not
depending on whether the function needs to modify it. There is no good
reason to require that non-const arguments always be passed by pointer;
it is perfectly valid to pass them by reference. Use whichever form
makes the code easier for humans (particularly ones other than you) to
understand.
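
A small sketch of the point, using a hypothetical type Big that counts its own copies (all names here are invented for illustration):

```cpp
#include <cassert>

// Hypothetical type that records how many times it has been copied.
struct Big {
    static int copies;
    int value = 0;
    Big() = default;
    Big(const Big &other) : value(other.value) { ++copies; }
};
int Big::copies = 0;

// Pass by value: the copy constructor runs for the argument.
int read_by_value(Big b) { return b.value; }

// Pass by const reference or const pointer: no copy is made.
int read_by_const_ref(const Big &b) { return b.value; }
int read_by_const_ptr(const Big *b) {
    assert(b != nullptr);
    return b->value;
}

// Non-const reference and non-const pointer can both modify the caller's
// object; only the call syntax differs (bump_by_ref(x) vs bump_by_ptr(&x)).
void bump_by_ref(Big &b) { ++b.value; }
void bump_by_ptr(Big *b) {
    assert(b != nullptr);
    ++b->value;
}
```

Only read_by_value increments Big::copies. The reference and pointer versions behave the same as each other in both the const and the non-const case, so the choice between them is about readability at the call site.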

Keith Thompson

unread,
May 22, 2013, 11:53:45 AM5/22/13
to
Stephen Sprunk <ste...@sprunk.org> writes:
[...]
> Consider this basic snippet:
>
> cin << "Hello World!" << endl;

I think you mean "cout << ...".

> That calls operator<<() twice, and in both cases, the function modifies
> its first reference argument (cin); it cannot work any other way.

Does it? As far as I know, the overloaded << operator takes two
arguments of types whatever-the-type-of-cout-is and some-other-type, and
returns a result of type whatever-the-type-of-cout-is. It doesn't need
to modify either operand.

Similarly, the C equivalent:

fprintf(some_file, "%s\n", "Hello World!");

does not (and cannot) modify its first argument, which is of type FILE*.
If the type whatever-the-type-of-cout-is is similar to FILE*, there's no
need for << to modify it.

It may well be the case that it *does* modify it, but I think it could
work perfectly well if it didn't.
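
Here is a minimal sketch of how that could look, assuming a handle type that plays the role FILE* plays for fprintf. Buffer and Handle are hypothetical names, and this is not how the standard streams are actually defined.

```cpp
#include <string>

// The state that really changes when output is written.
struct Buffer {
    std::string text;
};

// Hypothetical handle, similar in spirit to FILE*: it only points at
// the state.
struct Handle {
    Buffer *target;
};

// The left operand is taken by value and the right one by const reference,
// so neither operand object is modified. Only the Buffer that the handle
// points to changes. Returning the handle keeps chaining possible.
Handle operator<<(Handle out, const std::string &s) {
    out.target->text += s;
    return out;
}
```

Given `Buffer buf; const Handle h{&buf};`, the expression `h << "Hello" << " World!"` appends both strings to buf.text even though h is const and is never changed.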