question about pointers involving ++

Faheem Mitha

Dec 22, 2000, 9:54:18 PM

Dear C people,

Consider the following function.

/* strcatptr: concatenate t to end of s; s must be big enough
(pointer version) */
void strcatptr(char *s, char *t)
{
    while (*s++)
        ;
    while ( *(s-1)++ = *t++)
        ;
}

This is essentially K&R2 Ex 5.3. My compiler gcc version egcs 2.91-66
throws a fit at the expression *(i-1)++. "Invalid l-value" etc. etc. I
am not sure why. I understand this to mean: take the value at the
address (pointer) i-1, and then increment the pointer to i. The
alternative expression *(i++ - 1), which I think should be equivalent,
behaves as expected. Can someone tell me what rule I am violating? I
looked at the books I have available: Steve Summit's C FAQ book, which
is useful for these kinds of questions; K&R2; and Kelley and Pohl's "A
Book on C", but received no enlightenment.

Thanks in advance for any response.

Sincerely, Faheem Mitha.

Ben Pfaff

Dec 22, 2000, 10:35:03 PM

fah...@email.unc.edu (Faheem Mitha) writes:

> My compiler gcc version egcs 2.91-66
> throws a fit at the expression *(i-1)++. "Invalid l-value" etc.etc. I
> am not sure why. I understand this to mean: take the value at the
> address (pointer) i-1, and then increment the pointer to i.

What it actually says is "Take the value of the pointer i and
subtract one from it. Then dereference the pointer result as
well as incrementing it." However, it makes no sense to
increment an arbitrary expression, so this is an error.

Suppose I had integer variables a, b, and c and wrote the
expression (a * b + c)++. What would you have that mean?
(i-1)++ is no more meaningful.
--
"It wouldn't be a new C standard if it didn't give a
new meaning to the word `static'."
--Peter Seebach on C99

Dave Vandervies

Dec 22, 2000, 10:49:35 PM

In article <slrn9484uo...@Chrestomanci.home.earth>,

Faheem Mitha <fah...@email.unc.edu> wrote:
>
>Consider the following function.
>
>/* strcatptr: concatenate t to end of s; s must be big enough
>(pointer version) */
>void strcatptr(char *s, char *t)
>{
> while (*s++)
> ;
> while ( *(s-1)++ = *t++)
> ;
>}
>
>This is essentially K&R2 Ex 5.3. My compiler gcc version egcs 2.91-66
>throws a fit at the expression *(i-1)++. "Invalid l-value" etc.etc. I
>am not sure why. I understand this to mean: take the value at the
>address (pointer) i-1, and then increment the pointer to i.

This is because unary * and ++ associate right to left, so *(i-1)++
parses as *( (i-1)++ ) . This is Bad because (i-1) isn't an lvalue,
so the ++ (which requires an lvalue) becomes an error. What would make
more sense here (though it's not what you're looking for) is (*(i-1))++
, which increments whatever (i-1) points at and returns its value
before the increment. (The extra parentheses would still be a Good
Thing even if they weren't required, since that way you wouldn't be
sending people who read your code to their operator precedence tables
to see how the expression got evaluated.)

> The
>alternative expression *(i++ - 1), which I think should be equivalent,
>behaves as expected.

This does what you want, and has the bonus of not having any confusing
operator precedence problems.

> Can someone tell me what rule I am violating? I
>looked at the books I have available: Steve Summit's C FAQ book, which
>is useful for these kinds of questions; K&R2; and Kelley and Pohl's "A
>Book on C", but received no enlightenment.

K&R2 has an operator precedence table in section 2.12 (page 53). It's
not the most intuitive place for it to be (I looked in the appendices
first), but if you look at that it will answer most questions about
operator precedence.
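
For comparison, here is a minimal sketch of two versions of the function that should compile. The first steps the pointer back onto the terminating '\0' before copying; the second keeps the *(s++ - 1) expression from the question. The const qualifier on t and the name strcatptr_alt are additions for illustration only, not something taken from the original post or from K&R2.

```c
/* strcatptr: concatenate t to end of s; s must be big enough.
   The first loop leaves s one past the '\0', so step back once. */
void strcatptr(char *s, const char *t)
{
    while (*s++)
        ;
    s--;                            /* back onto the terminating '\0' */
    while ((*s++ = *t++) != '\0')   /* copy t, including its '\0' */
        ;
}

/* Same job, using the expression from the question: the value of
   s++ is the old s, so (s++ - 1) is the slot just before it. */
void strcatptr_alt(char *s, const char *t)
{
    while (*s++)
        ;
    while ((*(s++ - 1) = *t++) != '\0')
        ;
}
```

In both versions the ++ is applied to s itself, which is an lvalue, and never to the result of a subtraction.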

dave

--
Dave Vandervies dj3v...@student.math.uwaterloo.ca

ASR: Our DSWs are longer than your DSWs.
-- David P. Murphy in the Scary Devil Monastery

Faheem Mitha

Dec 23, 2000, 12:00:58 AM

On 23 Dec 2000 03:49:35 GMT, Dave Vandervies
<dj3v...@student.math.uwaterloo.ca> wrote:

>In article <slrn9484uo...@Chrestomanci.home.earth>,
>Faheem Mitha <fah...@email.unc.edu> wrote:

>>Consider the following function.
>>
>>/* strcatptr: concatenate t to end of s; s must be big enough
>>(pointer version) */
>>void strcatptr(char *s, char *t)
>>{
>> while (*s++)
>> ;
>> while ( *(s-1)++ = *t++)
>> ;
>>}
>>
>>This is essentially K&R2 Ex 5.3. My compiler gcc version egcs 2.91-66
>>throws a fit at the expression *(i-1)++. "Invalid l-value" etc.etc. I
>>am not sure why. I understand this to mean: take the value at the
>>address (pointer) i-1, and then increment the pointer to i.
>
>This is because unary * and ++ associate right to left, so *(i-1)++
>parses as *( (i-1)++ ) . This is Bad because (i-1) isn't an lvalue,
>so the ++ (which requires an lvalue) becomes an error.

Oh, I see. K&R2 defines an l-value to be an expression referring to an
object, where an object is a named region of storage. So what this
boils down to is that the ++ operator can only be applied to
expressions corresponding to something which has storage. I presume
therefore that s-1 has not been assigned storage, presumably because
the function has only reserved space for s and t.

Now that you point it out, I see that K&R2 has a line on pg 46 which
goes: "The increment and decrement operators can only be applied to
variables; an expression like (i+j)++ is illegal." Variables and objects
are the same thing, apparently. Ie. something with storage. I take it
that it is the same kind of point being made here.

>What would make more sense here (though it's not what you're looking
>for) is (*(i-1))++ , which increments whatever (i-1) points at and
>returns its value before the increment.

But *(i-1) is here incremented by ++, and it has not been assigned
storage any more than i-1. Ie. we don't know whether i-1 points to
anything. What is the distinction here? As it happens, my compiler
throws a fit at this expression too. :-)

>K&R2 has an operator precedence table in section 2.12 (page 53). It's
>not the most intuitive place for it to be (I looked in the appendices
>first), but if you look at that it will answer most questions about
>operator precedence.

My confusion here was not about operator precedence. That is treated
in K&R2 and is also a question in Summit's book.

Thanks for the remarkably fast response. Thanks also to Ben Pfaff for
replying.

Best regards, Faheem Mitha.

Dave Vandervies

Dec 23, 2000, 10:54:12 AM

In article <slrn948cc7...@Chrestomanci.home.earth>,

Faheem Mitha <fah...@email.unc.edu> wrote:
>On 23 Dec 2000 03:49:35 GMT, Dave Vandervies
><dj3v...@student.math.uwaterloo.ca> wrote:
>>
>

>Now that you point it out, I see that K&R2 has a line on pg 46 which
>goes: "The increment and decrement operators can only be applied to
>variables; an expression like (i+j)++ is illegal." Variables and objects
>are the same thing, apparently. Ie. something with storage. I take it
>that it is the same kind of point being made here.

That's almost right. Something like *(foo-bar) is, assuming that foo
and bar have the right types, an object (since it designates a region of
storage) (I'm reasonably sure of that, at least; if any language
lawyers disagree, we'll hear about it soon), but it isn't a variable
(foo and bar are the variables here); I would guess that at this point
in the book this statement is true for everything that's been
introduced up to this point (though it's been a long time since I've
read K&R2 cover-to-cover).

>>What would make more sense here (though it's not what you're looking
>>for) is (*(i-1))++ , which increments whatever (i-1) points at and
>>returns its value before the increment.
>
>But *(i-1) is here incremented by ++, and it has not been assigned
>storage any more than i-1. Ie. we don't know whether i-1 points to
>anything.

We don't know whether i-1 points to anything, but we know that i-1 *is*
a pointer to something, so we know that *(i-1) is an object that
somebody, somewhere has hopefully allocated storage for, so it makes
sense to increment it. The compiler will check that for you and
complain if what you're trying to increment doesn't make sense (i.e.
isn't an lvalue); it's your job as a programmer to make sure that this
only gets pointers that point one past a valid region of storage, since
it tries to increment the object previous to the one the pointer points
to.

> What is the distinction here? As it happens, my compiler
>throws a fit at this expression too. :-)

Really? What does it say about it? My compiler accepts it without
complaining:
--------
dj3vande@mef08:~/clc (0) $ cat test.c
void foo(int *i)
{
    (*(i-1))++;
}
dj3vande@mef08:~/clc (0) $ gcc -W -Wall -ansi -pedantic -c test.c
dj3vande@mef08:~/clc (0) $ gcc --version
2.95.2
dj3vande@mef08:~/clc (0) $
--------

--
Dave Vandervies dj3v...@student.math.uwaterloo.ca
> Well, it's rather far from rocket science, mixing it up....
Actually, I hear it's a primary ingredient in the space shuttle's solid rocket
boosters. --Ingvar the Grey and Phillip Jones in the Scary Devil Monastery

Chris Torek

Dec 23, 2000, 12:46:13 PM

In regard to applying "++" only to actual objects (or, loosely,
"variables")...

In article <slrn948cc7...@Chrestomanci.home.earth>
Faheem Mitha <fah...@email.unc.edu> writes:
>But [in (*(i-1))++,] *(i-1) is here incremented by ++, and it has not
>been assigned storage any more than i-1. Ie. we don't know whether i-1
>points to anything. What is the distinction here?

C partitions the universe of expressions into two halves, which I
call "objects" and "values". (The C standard uses the terms "lvalue"
and "rvalue" but I think "object" and "value" are clearer for
expository purposes, and it is easy enough to mentally map "lvalue
<=> object" and "rvalue <=> value".) Roughly speaking, an object
is a chunk of memory that holds a value, and you need these because
values are fleeting and ephemeral. If you have a value around,
but fail to stick it into an object, it will be gone by the time
you want it again. (Note: all values, and all objects, always have
types, and the types are often crucial to figuring out what any
particular operation means, or whether it is even legal in the
first place. For this article I am not going to write out my
"<object, int i, 3>" and "<value, int, 3>" triples, but rather
concentrate on the first part of that set -- the object-vs-value.)

C then provides a number of operators that, either explicitly or
implicitly, *convert* something from an object to a value, or vice
versa. This latter step is actually something programmers find
quite natural, once they have written a few programs in virtually
*any* language. Consider a simple statement like:

x = y + z;

(or "x := y + z" in Pascal, or anything even vaguely similar in
any vaguely similar language, by which I mean to exclude the weirder
ones like Lisp and Forth :-) ). Somehow this statement finds the
*values* of y and z, adds those two, and stores the resulting sum
into the *variable* x. How is it that it did not store the sum
into the value of x? That is, if x was 3 before this, and y and
z were 4 and 5, how come this set x to 9, rather than setting 3 to 9?
Of course, "set 3 to 9" is nonsense, so it *has* to set x to 9 --
but the question is not "why" but rather "how" did this come about?

The C answer -- which is the same as that in Pascal, and Fortran,
and COBOL and awk and hundreds of other languages -- is that C
distinguishes between the left-hand side of an assignment operator
-- the "left-hand-side value" or "l-value" -- and the right-hand
side. What makes C different from most of these other languages
is that C "exports" that difference, between lvalue and rvalue,
AKA object and value, and gives you those extra operators.

The two most interesting operators here are the unary "&" and
the unary "*", used as "&x" or "*p":

p = &x;
*p = 3;

The unary "&" operator takes an object and "finds its address",
and gives you back that object's address as a value. (That object
must *have* an address -- variables declared with the keyword
"register" do not, and neither do bitfield members of a structure,
and on a lot of modern machines things that do not have addresses
taken can go in a CPU register and the code runs faster.) The
resulting value has some pointer type, based on the type of the
object whose address you just took.

The unary "*" operator takes a value -- which has to have some
pointer type -- and follows the rainbow, er, the pointer, wherever
it leads. Presumably the "pot of gold at the end of the rainbow"
-- the object to which the pointer points -- really *is* an object
of the right type; if not, all bets are off.
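
As a small illustration of the two operators working together, here is a sketch that stores a value into an object purely through its address. The function name store_through_pointer is made up for this example.

```c
/* Returns the final value of a local int after storing 3 into it
   through a pointer. The function name is illustrative only. */
int store_through_pointer(void)
{
    int x = 0;
    int *p = &x;  /* unary &: takes the object x, yields its address as a value */
    *p = 3;       /* unary *: follows the pointer value back to the object x */
    return x;     /* x is now 3 */
}
```

Note that x is never named on the left of an assignment after its initialization; the object is reached only by way of the pointer value.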

In short, then, whenever you have an object where the language
obviously "needs" a value, the C compiler simply fishes out the
current value of that object, probably using its address if it has
one. On the other hand, when you use an explicit "&" or "*", you
can explicitly find the address, or follow it to its object. Thus:

x = y + z;

in effect "means":

- find y and z, and fetch their values, almost like using
  &y and &z and then "*" on each of those, but likely more
  efficiently, and successfully even if y and z are "register";
- then, add the two values;
- last, using something a lot like what "&x" finds, store
  the sum into the object "x".

All this happens because the "=" assignment needs an object on the
left, and a value on the right. (I should add that the above is
very sequential: "first do this, then do that, and last do this
other thing." The C language actually permits things to happen in
all kinds of crazy orders, and requires that you, as a programmer,
make use of "sequence points" when you need to control whether
something really does finish before another step. These sequence
points usually work out nicely if you keep your expressions simple,
so at this stage, you do not have to worry about them yet.)

So what about this "(*(i-1))++" thing? Well, the "++" operator is
rather like the left-hand side of "=", in a way: it needs an object
on which it can work its incrementation. The binary "-" operator,
as in "i - 1", is quite different. It needs two values, and produces
one value, namely the result of subtracting the two input values.
The "++" operator cannot possibly work on the result of the
subtraction, because it is just a value, not an object.

Lucky for us ( :-) ) the subtraction-expression has a unary "*"
operator applied to its result. The unary "*", remember, takes a
value that has some pointer type, and "follows the rainbow" -- the
pointer -- to wherever it leads. The unary "*" always produces an
object, and it is that object that the "++" will ultimately increment.

Now, as you wrote:

>But [in (*(i-1))++,] *(i-1) is here incremented by ++, and it has not
>been assigned storage any more than i-1. Ie. we don't know whether i-1
>points to anything. What is the distinction here?

Note the passive voice here: "it" (no matter what "it" is) "has
not been assigned storage". Who has not assigned what? This is
important! When you use the explicit "*" operator, *you* -- the
programmer -- are responsible for making sure that the value you
are "*"ing is a valid pointer value that points to some actual
object somewhere in memory.

As long as "i-1" really *does* point to something, *(i-1) finds
that object. That object is the one that "++" will increment.
If "i-1" does not point to an actual object, all bets are off --
the program may crash, or the computer might start acting weirdly,
or whatever.
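
A minimal sketch of the well-behaved case, assuming the caller passes a pointer that has a valid int immediately before it (for instance, a pointer to the second or later element of an array). The name bump_previous is hypothetical.

```c
/* Increment the int stored just before the one p points at, and
   return the value it held before the increment.
   Assumption: p - 1 points at a valid, modifiable int. */
int bump_previous(int *p)
{
    return (*(p - 1))++;  /* "*" yields an object; "++" increments that object */
}
```

With int a[3] = {10, 20, 30}, calling bump_previous(&a[2]) would return 20 and leave a[1] holding 21, while a[0] and a[2] are untouched.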

Incidentally, it is worth noting here that even ordinary assignment
produces a value. You can thus embed "x = y + z" inside some other
expression. For instance:

w = (x = y + z) + 2;

is a legal C statement. I already showed how "x = y + z" requires
that x be an object, fishes out values for y and z, adds those,
and stores the result in x. The only new bit here is that the "="
assignment operator produces a value (of some type): the value it
produces is the value that the object (here, x) will have after
the assignment is done. If y and z are "int"s and are 4 and 5,
and "x" is an int, then 4+5 is 9, and x will get set to 9, so the
value of the "=" assignment to x is also 9 (and still an int).
Clearly, then, the compiler will also have to compute 9+2 and get
11, and put 11 in the object "w". If w is also an int, the whole
expression-statement has the value 11, and the "side effects" of
setting x to 9 and w to 11. The side effects are the important
thing here; the overall value -- the int 11 -- is ultimately
discarded. Since we get two separate side effects -- changes to
both w and x -- those "sequence points" I mentioned above might
start to become important. Without a sequence point, we have no
idea whether a C compiler will set w first, or set x first, or
maybe even set both of them "at the same time". (In this case, it
obviously does not matter, so the whole statement is okay.)
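
The nested assignment can be sketched as a small function that returns both results. The struct sums type and the name nested_assign are invented for this illustration.

```c
/* Holds the two objects modified by the nested assignment. */
struct sums {
    int x;
    int w;
};

/* Performs w = (x = y + z) + 2 and returns both results. */
struct sums nested_assign(int y, int z)
{
    struct sums r;
    r.w = (r.x = y + z) + 2;  /* inner "=" yields the value stored in r.x */
    return r;
}
```

With y = 4 and z = 5 this leaves x at 9 and w at 11, matching the walk-through above.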
--
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA, USA Domain: to...@bsdi.com +1 510 234 3167
http://claw.bsdi.com/torek/ (not always up) I report spam to abuse@.

Scott J. McCaughrin

Dec 23, 2000, 1:17:28 PM

I can't believe the volume of responses to such a simple question.
Imagine you are 'cc' and parsing: <expr>++ and the LALR(1) grammar
I have gives the derivation: <expr>++ => <postfix-op> <postfix-op>
=> <primary-exp> <postfix-op> => <primary-p2-exp> <postfix-op> =>
<primary-p1-op> <postfix-op> => <id-name> <postfix-op>.
Never mind pedantic arguments about reduction/production, etc. --
the point is that the respondent quoting K&R about "variables" hit
the nail on the head: the original poster's 'cc' complained because
++ postfixed a non-identifier.

Chris Torek

Dec 23, 2000, 2:45:53 PM

In article <Y_516.539$1a.207@firefly>
Scott J. McCaughrin <sjmc...@bluestem.prairienet.org> writes:
>I can't believe the volume of responses to such a simple question. ...

>Never mind pedantic arguments about reduction/production, etc. --
>the point is that the respondent quoting K&R about "variables" hit
>the nail on the head: the original poster's 'cc' complained because
>++ postfixed a non-identifier.

But postfix "++" does not have to be applied to an identifier,
nor does applying it to an identifier suffice. Consider:

    void f(void *vp) {
        (*(int *)vp)++;
    }

    void g(void) {
        int i = 3;

        f(&i);
    }

Here f() increments "i", yet the "++" operator is applied to the
result of a cast expression. Or:

    void bug(void) {
        const int a;
        int b[3];

        a++; /* ERROR */
        b++; /* ERROR */
    }

Here the postfix "++" is applied to two identifiers, yet neither one
works.

Kaz Kylheku

Dec 23, 2000, 3:03:48 PM

On Sat, 23 Dec 2000 19:45:53 GMT, Chris Torek <to...@elf.bsdi.com> wrote:
>But postfix "++" does not have to be applied to an identifier,
>nor does applying it to an identifier suffice. Consider:
>
> void f(void *vp) {
> (*(int *)vp)++;
> }
>
> void g(void) {
> int i = 3;
>
> f(&i);
> }
>
>Here f() increments "i", yet the "++" operator is applied to the
>result of a cast expression.

Of course Chris meant to write: applied to the lvalue expressed by
dereferencing the result of a cast expression. :)

Chris Torek

Dec 23, 2000, 4:09:56 PM

[I wrote, in part]

>>Here f() increments "i", yet the "++" operator is applied to the
>>result of a cast expression.

In article <slrn94a1j...@ashi.FootPrints.net>

Kaz Kylheku <k...@ashi.footprints.net> writes:
>Of course Chris meant to write: applied to the lvalue expressed by
>dereferencing the result of a cast expression. :)

Er, right.

I probably also should have initialized the "const int"s in the
other example, since uninitialized "const" variables are generally
pretty useless. :-)

-hs-

Dec 24, 2000, 9:40:28 AM

Chris Torek wrote in message <9234ag$jj5$1...@elf.bsdi.com>...

>
>I probably also should have initialized the "const int"s in the
>other example, since uninitialized "const" variables are generally
>pretty useless. :-)

Not at all. It is a design question. It is absolutely useful to define a
variable "read-only" (I prefer this wording), if your code is not supposed
to modify it after initialization (parameter, result of a computation, return
from a function etc.).

of course,

int const a = 2;

is not very useful, but

    int f(int const x) /* read-only protection */
    {
        int const y = x + 3; /* read-only protection */

        x = 0; /* compile error */
        y++; /* compile error */

        return y;
    }

is fine.

--
-hs- Tabs out, spaces in.
CLC-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
ISO-C Library: http://www.dinkum.com/htm_cl
FAQ de FCLC : http://www.isty-info.uvsq.fr/~rumeau/fclc

Chris Torek

Dec 24, 2000, 12:52:06 PM

I wrote:
>>I probably also should have initialized the "const int"s in the
>>other example, since uninitialized "const" variables are generally
>>pretty useless. :-)

In article <9251t2$1hc$1...@news5.isdnet.net>

-hs- <email....@server.invalid> writes:
>Not at all. It is a design question. It is absolutely useful to define a
>variable "read-only" (I prefer this wording), if your code is not supposed
>to modify it after initialization (parameter, result of a computation, return
>from a function etc.).

Sure -- but such variables need to have an initial value. Remember,
my original example was something like:

    void f(void) {
        const int a;
        ...
    }

>of course,

>int f(int const x) /* read-only protection */
>{
>int const y = x + 3; /* read-only protection */

Here x and y are both initialized -- x, by the parameter to f(),
and y, to x + 3. In mine, "a" was junk, maybe even a trap
representation, and could never be changed. The variable "a" thus
can neither be written -- it is read-only -- nor read.

Faheem Mitha

Dec 26, 2000, 12:31:34 PM

On Sat, 23 Dec 2000 17:46:13 GMT, Chris Torek <to...@elf.bsdi.com> wrote:
>In regard to applying "++" only to actual objects (or, loosely,
>"variables")...

Dear Chris Torek,

Thank you for your very lucid and detailed exposition. I printed it
out and studied it for a while, and it helped to clear up some
things. Now the reference manual at the back of K&R2 makes a lot more
sense. For some reason it is quite difficult for me to keep the
concept of a piece of storage and a value distinct in my mind

>In article <slrn948cc7...@Chrestomanci.home.earth>
>Faheem Mitha <fah...@email.unc.edu> writes:
>>But [in (*(i-1))++,] *(i-1) is here incremented by ++, and it has not
>>been assigned storage any more than i-1. Ie. we don't know whether i-1
>>points to anything. What is the distinction here?
>
>C partitions the universe of expressions into two halves, which I
>call "objects" and "values".

Yes, but all objects correspond to values as well, but not vice-versa, right?

>(The C standard uses the terms "lvalue"
>and "rvalue"

K&R2 do not use the term rvalue, though Summit's FAQ book does.

> For this article I am not going to write out my
>"<object, int i, 3>" and "<value, int, 3>" triples, but rather
>concentrate on the first part of that set -- the object-vs-value.)

What does the 3 correspond to?

>The C answer -- which is the same as that in Pascal, and Fortran, and
>COBOL and awk and hundreds of other languages -- is that C
>distinguishes between the left-hand side of an assignment operator --
>the "left-hand-side value" or "l-value" -- and the right-hand side.
>What makes C different from most of these other languages is that C
>"exports" that difference, between lvalue and rvalue, AKA object and
>value, and gives you those extra operators.

In what sense does it "export" the difference? I am not quite clear
what you mean here.

>So what about this "(*(i-1))++" thing? Well, the "++" operator is
>rather like the left-hand side of "=", in a way: it needs an object
>on which it can work its incrementation.

Yes, reasonably enough, since ++ does two things: it stores the
value incremented by one back into an object, and it yields an
expression whose value is the same as before.

Also, the result of the incrementation is *not* an l-value. I don't
quite understand why not. There is a natural choice for
this, namely the aforesaid object.

>Now, as you wrote:
>
>>But [in (*(i-1))++,] *(i-1) is here incremented by ++, and it has not
>>been assigned storage any more than i-1. Ie. we don't know whether i-1
>>points to anything. What is the distinction here?
>
>Note the passive voice here: "it" (no matter what "it" is) "has
>not been assigned storage". Who has not assigned what? This is
>important! When you use the explicit "*" operator, *you* -- the
>programmer -- are responsible for making sure that the value you
>are "*"ing is a valid pointer value that points to some actual
>object somewhere in memory.

Ok. I guess I was thrown by ambiguous language. Both K&R2 and "A Book
on C" say that ++ can only be applied to variables. K&R2 defines a
variable as "a location in storage", which is OK. However, "A Book on
C" says (pg 107, 4th Edn) that "Variables and constants are the
objects that a program manipulates. In C, all variables must be
declared before they can be used." Since I had not declared *(i-1), I
thought that the compiler would reject it. I guess the difference is
one between parsing and the behaviour of the program at actual
runtime. The compiler will parse this OK, but presumably if *(i-1) has
not been declared, there will be problems at runtime.

>As long as "i-1" really *does* point to something, *(i-1) finds
>that object. That object is the one that "++" will increment.
>If "i-1" does not point to an actual object, all bets are off --
>the program may crash, or the computer might start acting weirdly,
>or whatever.

Roughly what I say above, I think.

>Incidentally, it is worth noting here that even ordinary assignment
>produces a value. You can thus embed "x = y + z" inside some other
>expression. For instance:

[snipped]

Yes, I am aware of this. K&R2 use this quite a lot in the form
if ((c=getchar( ))!='\0') and so on.

I am surprised you were willing to take the time to write such a
detailed answer to a complete stranger. Thank you!

Best regards, Faheem Mitha.

Faheem Mitha

Dec 26, 2000, 12:50:13 PM

On 23 Dec 2000 15:54:12 GMT, Dave Vandervies
<dj3v...@student.math.uwaterloo.ca> wrote:

>Really? What does it say about it? My compiler accepts it without
>complaining:
>--------
>dj3vande@mef08:~/clc (0) $ cat test.c
>void foo(int *i)
>{
> (*(i-1))++;
>}
>dj3vande@mef08:~/clc (0) $ gcc -W -Wall -ansi -pedantic -c test.c
>dj3vande@mef08:~/clc (0) $ gcc --version
>2.95.2
>dj3vande@mef08:~/clc (0) $
>--------

Sorry, I was mistaken. I was trying to use something like

(*(i-1))++ = *j++;

The compiler rejects this because (*(i-1))++ is not an l-value. But it
happily accepts the function you give above. I didn't understand these
things that well when I wrote the above. I think I understand it
better now, partly due to Chris Torek's helpful tutorial (see
elsewhere in this thread).

Best regards, Faheem Mitha.

Kaz Kylheku

Dec 26, 2000, 2:08:47 PM

On Tue, 26 Dec 2000 17:31:34 -0000, Faheem Mitha <fah...@email.unc.edu> wrote:
>On Sat, 23 Dec 2000 17:46:13 GMT, Chris Torek <to...@elf.bsdi.com> wrote:
>>In regard to applying "++" only to actual objects (or, loosely,
>>"variables")...
>
>Dear Chris Torek,
>
>Thank you for your very lucid and detailed exposition. I printed it
>out and studied it for a while, and it helped to clear up some
>things. Now the reference manual at the back of K&R2 makes a lot more
>sense. For some reason it is quite difficult for me to keep the
>concept of a piece of storage and a value distinct in my mind.
>
>>In article <slrn948cc7...@Chrestomanci.home.earth>
>>Faheem Mitha <fah...@email.unc.edu> writes:
>>>But [in (*(i-1))++,] *(i-1) is here incremented by ++, and it has not
>>>been assigned storage any more than i-1. Ie. we don't know whether i-1
>>>points to anything. What is the distinction here?
>>
>>C partitions the universe of expressions into two halves, which I
>>call "objects" and "values".
>
>Yes, but all objects correspond to values as well, but not vice-versa, right?

That is correct. Values can arise in several ways:

-- an expression which designates an object (in other words an lvalue)
   is converted to the value stored in the designated object.
-- an operator is applied to some operand values to produce a new value.
-- a function call returns a value (really a special case of the previous item).
-- a literal denotes a value, like 'x' or 3.14.

In some contexts an lvalue is not turned into a value. You can apply the sizeof
operator to an lvalue to compute its size, rather than its value. And you can
apply the & operator to take the address rather than the value. As well, array
lvalues automatically convert to a pointer to the first element.
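
The three contexts just mentioned can be sketched as follows. All function names here are invented for illustration, and the array sizes are arbitrary.

```c
#include <stddef.h>

/* sizeof applied to an array lvalue: the operand is not converted
   to a pointer, so the result is the size of the whole array. */
size_t whole_array_size(void)
{
    int a[4] = {0};
    return sizeof a;
}

/* & applied to an lvalue yields the object's address, not its value.
   Returns 1 if the pointer leads back to the same object. */
int address_leads_back(void)
{
    int x = 42;
    int *p = &x;
    return p == &x && *p == 42;
}

/* In an ordinary value context an array lvalue converts to a
   pointer to its first element. Returns 1 if that holds. */
int decays_to_first_element(void)
{
    int a[4] = {1, 2, 3, 4};
    int *p = a;            /* same as &a[0] */
    return p == &a[0] && *p == 1;
}
```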

>>(The C standard uses the terms "lvalue"
>>and "rvalue"
>
>K&R2 do not use the term rvalue, though Summit's FAQ book does.

The standard does not use the term rvalue. There is no special expression type
that may only appear at the right hand of an assignment. The term that is used
is simply ``value of the expression''.

Chris Torek

Dec 28, 2000, 5:27:49 AM

In article <slrn94hlfj...@Chrestomanci.home.earth>
Faheem Mitha <fah...@email.unc.edu> writes:
>... For some reason it is quite difficult for me to keep the
>concept of a piece of storage and a value distinct in my mind.

Well, all objects in storage have a value (with one exception:
uninitialized objects may have garbage in them[%]). This means
all storage has values. The reverse is *not* true in C: you can
have values without storage. (In a really simple compiler that
does not optimize well, the latter might include "the result in
the accumulator register, which is never used for anything except
arithmetic". That accumulator register might be "storage" inside
the chip, but it is not directly accessible from C.[*]) The thing
is, values without storage soon evaporate, so they are not much
good outside of some computation-type expression. You have to save
the result -- in an object or "lvalue" -- so that you can reuse
it later.
-----
[%] This garbage or "junk" value may even be "poisonous". An
example of this is that uninitialized float or double variables
might actually secretly be initialized to a signalling NaN, so that
any use of the variable causes a runtime exception.

[*] Imagine a simple assembly language for a simple machine with
one accumulator register and everything else in memory. A simple
computation like "x = y + z" might turn into:

load Y # accum := y
add Z # accum += z
store X # x := accum

and something tougher like "y = (y + 4) * z" might turn into:

load Y
add 4
mul Z
store Y

In all cases, the "values" wind up in the accumulator; if you did
not "store X" and "store Y" at the end, it would just get wiped
out by the next expression.
-----

Anyway...

>On Sat, 23 Dec 2000 17:46:13 GMT, Chris Torek <to...@elf.bsdi.com> wrote:
>>C partitions the universe of expressions into two halves, which I
>>call "objects" and "values".

>Yes, but all objects correspond to values as well, but not vice-versa, right?

Right; as I noted above, all objects have values. Automatic objects,
and objects in malloc()ed space, have initial "junk" values that
may even be poisonous, so you should assign them new values without
looking at the junk.

>>(The C standard uses the terms "lvalue" and "rvalue"

>K&R2 do not use the term rvalue, though Summit's FAQ book does.

I goofed -- as Kaz noted, the standard just uses "value". The
terms lvalue and rvalue are standard computer-science-ese, though,
so you will see them elsewhere too.

>>For this article I am not going to write out my
>>"<object, int i, 3>" and "<value, int, 3>" triples, but rather
>>concentrate on the first part of that set -- the object-vs-value.)

>What does the 3 correspond to?

That would be the current value.

Introducing types into C -- which is one of the things that made
C "C" instead of just B or New B -- made C substantially more
complex. The B language was typeless, like BCPL before it. To
add i and j as integers, you used "i + j"; to add them as floating
point numbers, you used something more like "i #+ j", and so on.
The "add floats" operator would add two floats and it was your job,
as a programmer, to provide two floats; if you added two integers
as if they were floats, you got a nonsense result. (Assembly
languages are typically just like this today. You have to use the
"fadd" instruction to add floats, and the "fmul" to multiply floats,
and so on.) Typelessness makes the language simpler, but usually
makes writing code in it harder, and also less portable, so C has
types. The "+" operator takes two ints and adds them and the
result is an int, or two "double"s and adds those and the result
is a double -- and if you give it one "int" and one "double", it
converts.

So, suppose you have:

int i = 3;
double x, y = 5.2;

and you write:

x = y + i;

In order to describe what happens in C, you need to write down all
the values *and* their types:

y is: <object, double, 5.2>
i is: <object, int, 3>
x is: <object, double, junk> (before the add)

We cannot add an int and a double directly, so to add y and i, we
have to convert the value of i -- the int 3 -- to a "double":

value of i as a double: <value, double, 3.0>

Note that this value has no corresponding object. It is just a
value, floating around somewhere, and if we do not do something
quickly with it, it will go away, never to be seen again. Fortunately
we have something to do: we want to add it to y. That, in turn,
requires fishing the value out of y:

value of y: <value, double, 5.2>

Now we have two <value>s of the same type (double), so we add:

sum: <value, double, 5.2 + 3.0 = 8.2>

This is another one of those evanescent <value> things, but we have
something to do with it: stuff it into x. Since x is an object
(and is not "const"), this is legal; since x has type double,
it is even easy; and the result is that x changes from <object,
double, junk> to <object, double, 8.2>.

Note that if x had some other type -- say, int -- the store back
into x would require a second conversion step, from double to int.
This would take <value, double, 8.2> and produce <value, int, 8>,
according to C's rules for conversions.
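
The walk-through above can be sketched in C with the implicit conversions written out as casts. The function names add_mixed and add_mixed_to_int are invented for this sketch; the casts only make visible what the compiler would do anyway:

```c
/* Hypothetical function spelling out what "x = y + i" does. */
double add_mixed(void)
{
    int i = 3;
    double x, y = 5.2;

    /* <value, int, 3> is converted to <value, double, 3.0>,
     * then added to <value, double, 5.2>. */
    x = y + (double)i;
    return x;
}

/* Same sum, but stored into an int: a second conversion,
 * from double to int, discards the fractional part. */
int add_mixed_to_int(void)
{
    int i = 3, n;
    double y = 5.2;

    n = (int)(y + i);
    return n;
}
```

Under these assumptions add_mixed() yields a double close to 8.2 and add_mixed_to_int() yields the int 8.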

(You might want to use a shortcut here, if you were doing this
yourself. Obviously 5.2 + 3.0 is 8.2 which converts to 8, so why
not convert 5.2 to 5 first, and get 5+3=8, doing the whole thing
in "int" instead of "double"? But this is not what you wrote in
the C code, and to do this itself, the C compiler first has to
prove to itself that it always gets the right answer. If you did
something else -- say:

i = i / 1.2;

-- the answer might change. Here <object, int, 3> (i) becomes
<value, int, 3> and then <value, double, 3.0>; 3.0 / 1.2 is 2.5;
and <value, double, 2.5> converts to <value, int, 2>. If you used
a shortcut by converting <value, double, 1.2> to <value, int, 1>,
you would get 3/1=3, which is wrong.)
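
A small sketch of the two computations, side by side. The function names are hypothetical, and the explicit casts stand in for the conversions C would apply implicitly:

```c
/* What the C code says: convert i to double, divide by 1.2,
 * then convert the quotient back to int. */
int divide_as_written(int i)
{
    i = (int)(i / 1.2);
    return i;
}

/* The tempting shortcut: convert 1.2 to int first, then do the
 * whole division in int. This computes something different. */
int divide_shortcut(int i)
{
    return i / (int)1.2;
}
```

With i equal to 3, the first function gives 2 and the second gives 3, so a compiler may not substitute one for the other.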

>>What makes C different from most of these other languages is that C
>>"exports" that difference, between lvalue and rvalue, AKA object and
>>value, and gives you those extra operators.

>In what sense does it "export" the difference? I am not quite clear
>what you mean here.

"Export" may not have been the best word. "Expose" is better: C
exposes the difference, via the & and * operators.

Consider a Pascal procedure with a "var" parameter, or a C++
function with a reference parameter. Here you have something like:

procedure incr(var i: integer);
begin i := i + 1 end;

or:

void incr(int& i) { ++i; }

which you then use with any ordinary same-typed variable as:

incr(z);

Even though "z" is an ordinary variable, the procedure or function
you called -- here, "incr" -- is able to change it. Compilers
often actually implement this internally by passing a pointer to
the target variable, i.e., the address of the object in memory.
Each reference in the callee is likewise turned into an indirection
through the supplied pointer. The "i := i + 1" line really means
"fetch the target to which the pointer points, add 1, and store
the result through the pointer."

C lacks all this. Instead, the object-ness of "i" is exposed
directly to the C programmer, and a (silly) function like incr()
has to be written using explicit pointers:

void incr(int *ip) { ++(*ip); }

Likewise, the caller has to explicitly pass in a pointer to the
object to be incremented:

incr(&z);
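
Put together as a sketch, with a by-value version added for contrast. The names incr_copy and call_both are made up for this example; incr is the function from the text:

```c
/* Changes the caller's object, reached through its address. */
void incr(int *ip) { ++(*ip); }

/* Changes only a local copy: the caller passed a value, not an object. */
void incr_copy(int i) { ++i; }

/* Hypothetical caller showing the difference. */
int call_both(int z)
{
    incr_copy(z);   /* z is unchanged: only its value was passed */
    incr(&z);       /* z is incremented through the pointer */
    return z;
}
```

Here call_both(5) returns 6, not 7, because only the call that received the address of z could modify it.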

Higher-level languages (and C++ and Pascal are not really enough
"higher", but never mind that :-) ) keep these details hidden from
the programmer, making the language even more complex but allowing
compilers to use sneaky methods like "value-result" or any other
thing that happens to be more efficient than raw pointers.

>Also, the result of the incrementation is *not* a l-value. I don't
>quite understand why this is not so. There is an natural choice for
>this, namely the aforesaid object.

True at least for prefix "++" -- and in C++, the result of a prefix
"++" operator *is* an lvalue (or object). The result of a postfix
"++" operator is not. In C++, with its user-defined operators and
constructors, it is easy to see why -- if you write:

x = y++;

the semantics in C are "fetch the value in the object y, add 1 to
that value, arrange to store the result in y, and arrange to store
the old value -- pre-incrementation, as it were -- in x." In C++,
if x and y are user-defined types with user-defined operators and
user-defined constructors, "x = y" means "call the copy constructor
to copy from y to x", and "y++" means "call the user-defined
postfix++ operator on y". But "x = y++" must *first* call the
user-defined postfix++ operator, and only *then* call the copy
constructor, and by then it is too late: y has already been ++'ed.
The postfix++ operator thus has to return a temporary copy of the
*old* value. If postfix++ produced an lvalue, the object in question
would be the temporary copy, which is soon to vanish!

In any case, the C definition for "++" includes "takes an object
and produces a value", for both pre- and post-fix "++". Even though
prefix ++ could easily yield an object, it does not.
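
A sketch of what this rule allows and forbids in C. The function name plusplus_results is invented for illustration, and the rejected lines are left in a comment because a C compiler will not accept them:

```c
/* Hypothetical demo: in C, both ++i and i++ take an object
 * and produce a value. */
int plusplus_results(void)
{
    int i = 1, a, b;

    a = ++i;    /* i becomes 2; the value 2 is stored in a */
    b = i++;    /* the old value 2 is stored in b; i becomes 3 */

    /* None of these compile as C, because "=", "&" and "++" each
     * need an object as operand and are handed only a value:
     *
     *     ++i = 10;
     *     int *p = &++i;
     *     (i - 1)++;
     *
     * The last one is the same problem as (s-1)++ in the original
     * question: s-1 is a value with no storage behind it. */
    return a * 100 + b * 10 + i;
}
```

The return value packs a, b and i into one number (223) so that all three can be checked at once.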

>Ok. I guess I was thrown by ambigious language. Both K&R2 and "A Book
>on C" say that ++ can only be applied to variables. K&R2 defines a
>variable as "a location in storage", which is OK.

It is also a better definition than the next one. Consider:

int *ip = malloc(10 * sizeof *ip);
if (ip == NULL) ... handle error ...

After this, ip[i] (where 0 <= i < 10) is a valid "int" variable --
but compare that with this:

>However, "A book on C" says (pg 107, 4th Edn) that "Variable and
>constants are the objects that a program manipulates. In C, all
>variables must be declared before they can be used."

The variable "ip" is certainly declared, but the object at, say,
ip[3] is not "declared" at all. It comes from malloc() storage,
and has "allocated" storage duration, i.e., it lasts until a
corresponding free(ip). Still, ip[3] is a variable, and after
setting it to some initial value such as 42, "ip[3]++" is just
fine.
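
As a minimal sketch (the function name bump_allocated is hypothetical), the allocated object ip[3] can be assigned and incremented like any declared variable:

```c
#include <stdlib.h>

/* Hypothetical demo: ip[3] is never declared, yet it is an object
 * and "++" applies to it. Returns -1 if malloc() fails. */
int bump_allocated(void)
{
    int *ip = malloc(10 * sizeof *ip);
    int result;

    if (ip == NULL)
        return -1;
    ip[3] = 42;     /* replace the initial junk value */
    ip[3]++;        /* legal: ip[3] names an object */
    result = ip[3];
    free(ip);       /* the allocated storage duration ends here */
    return result;
}
```

After the increment the object holds 43; once free(ip) has been called, ip[3] no longer names a valid object.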

By the way, I think another example of those <object-or-value,
type, current-value> triples is appropriate here. The variable
"ip" has type "int *":

ip: <object, int *, ptr from malloc>

The subscript operator [] is defined as:

ip[i] <=> (*((ip)+(i)))

hence:

ip[3] <=> *(ip + 3)

Now we have:

ip + 3

which is:

<object, int *, ptr from malloc> + <value, int, 3>

Converting the object to its value is easy:

<value, int *, ptr from malloc> + <value, int, 3>

Next we have to look up the rules for adding an integer to a pointer.
The result is another pointer, in this case "3 objects away" from
the original one. The three objects are "int"s because the pointer
has type "pointer to int", so the sum is:

<value, int *, "3 ints away" from ptr from malloc>

The "*" indirection operator then takes the pointer and finds the
object:

<object, int, junk>

(well, initially junk anyway, until you stick something into it).
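
The equivalence between subscripting and pointer arithmetic can be checked with a short sketch. The function name subscript_forms_agree is made up for this example:

```c
#include <stdlib.h>

/* Hypothetical check that ip[3], *(ip + 3) and 3[ip] all name the
 * same object. Returns 1 if they agree, 0 if not, and -1 if
 * malloc() fails. */
int subscript_forms_agree(void)
{
    int *ip = malloc(10 * sizeof *ip);
    int same;

    if (ip == NULL)
        return -1;
    ip[3] = 7;                        /* replace the junk first */
    same = (&ip[3] == ip + 3)         /* same address ... */
        && (*(ip + 3) == 7)           /* ... so same object and value */
        && (3[ip] == 7);              /* *(3 + ip): the sum commutes */
    free(ip);
    return same;
}
```

The odd-looking 3[ip] is accepted only because the subscript operator is defined in terms of addition and indirection; it is shown here to illustrate the definition, not as recommended style.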

Faheem Mitha

Jan 4, 2001, 10:44:58 PM

On Thu, 28 Dec 2000 10:27:49 GMT, Chris Torek <to...@elf.bsdi.com> wrote:
>In article <slrn94hlfj...@Chrestomanci.home.earth>
>Faheem Mitha <fah...@email.unc.edu> writes:
>>... For some reason it is quite difficult for me to keep the
>>concept of a piece of storage and a value distinct in my mind.
>
>Well, all objects in storage have a value (with one exception:

[snipped]

Well, I don't really have a whole lot to say here, except to again say
thank you for taking the trouble to reply in such detail. I feel I
understand the issues raised quite well now, thanks to your
postings. (I read this most recent posting carefully). Have you ever
considered becoming a professional expositor? :-)

Best regards, Faheem (now trying to learn about external and static variables).