Operator Precedence when ---x, +--x

Alex Vinokur

Apr 12, 2003, 9:02:25 AM

=============================
Windows 2000
DJGPP 2.03
GNU gcc version 3.2.1
=============================

--------- C code : BEGIN ---------
int main()
{
    int x = 10;

    ---x; // compilation error : non-lvalue in decrement
    +--x; // no errors
    -++x; // no errors
    +++x; // compilation error : non-lvalue in increment

    return 0;
}

--------- C code : END -----------

How to explain the compiler behavior?

=================================
Alex Vinokur
mailto:ale...@connect.to
http://www.simtel.net/pub/oth/19088.html
=================================

Alex Vinokur

Apr 12, 2003, 9:39:25 AM

"Alex Vinokur" <ale...@bigfoot.com> wrote in message news:b78v8s$c9mnd$1...@ID-79865.news.dfncis.de...

> =============================
> Windows 2000
> DJGPP 2.03
> GNU gcc version 3.2.1
> =============================
>
> --------- C code : BEGIN ---------
> int main()
> {
> int x = 10;
>
> ---x; // compilation error : non-lvalue in decrement
> +--x; // no errors
> -++x; // no errors
> +++x; // compilation error : non-lvalue in increment
>
> return 0;
> }
>
> --------- C code : END -----------
>
> How to explain the compiler behavior?

Hypothesis :
---x is ((--)(-x))
+--x is (+(--x))
-++x is (-(++x))
+++x is ((++)(+x))

rjh

Apr 12, 2003, 8:46:55 AM

[Followups set to clc]

Alex Vinokur wrote:

> =============================
> Windows 2000
> DJGPP 2.03
> GNU gcc version 3.2.1
> =============================
>
> --------- C code : BEGIN ---------
> int main()
> {
> int x = 10;
>
> ---x; // compilation error : non-lvalue in decrement

Maximal munch principle:

-- -x

-x yields 0 - x, which is an expression but not a modifiable lvalue (i.e.
it's not a simple object).

> +--x; // no errors

+ --x

x is an object, so --x is well-defined. The + is effectively a no-op here.

> -++x; // no errors

- ++x is the equivalent of (0 - ++x). ++x is well-defined, so all is well.

> +++x; // compilation error : non-lvalue in increment

++ +x

x is an object, but +x is not.

<snip>

--
Richard Heathfield : bin...@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Tom St Denis

Apr 12, 2003, 8:58:58 AM

"Alex Vinokur" <ale...@bigfoot.com> wrote in message
news:b78v8s$c9mnd$1...@ID-79865.news.dfncis.de...

> =============================
> Windows 2000
> DJGPP 2.03
> GNU gcc version 3.2.1
> =============================
>
> --------- C code : BEGIN ---------
> int main()
> {
> int x = 10;
>
> ---x; // compilation error : non-lvalue in decrement
> +--x; // no errors
> -++x; // no errors
> +++x; // compilation error : non-lvalue in increment
>
> return 0;
> }
>
> --------- C code : END -----------
>
> How to explain the compiler behavior?

Do your own homework? If you understood how the parser works this would be
trivial [hint: try adding () around the statements, e.g. -(++x), then look at
what it gets]

more hint: ---x is the same as -- -x

Tom

Chris Torek

Apr 12, 2003, 9:18:04 AM

Preliminary note: Forget about "precedence" for now. The C standard
does not use an operator precedence grammar. "Operator precedence"
is a trick *you* use to parse an expression. A C compiler is allowed
to use a *different* trick. Whichever trick you prefer -- yours, or
the C standard's -- it only comes into play when there is more than one
way to interpret an "expression".

In article <b78v8s$c9mnd$1...@ID-79865.news.dfncis.de>

Alex Vinokur <ale...@connect.to> writes:
> ---x; // compilation error : non-lvalue in decrement
> +--x; // no errors
> -++x; // no errors
> +++x; // compilation error : non-lvalue in increment

>How to explain the compiler behavior?

One early step in interpreting C code is to make "tokens" out of
character sequences (specifically, "pp-tokens" or preprocessor
tokens). This step is often called "lexical analysis". These
tokens are (eventually) the input to the "parser" -- the part of
a compiler that matches input "words" up against some sort of
grammar. Each token is, in effect, a word in the language. Thus,
until you have "lexed up" some tokens, you cannot even begin to
parse them.

Now, C has +, -, ++, and -- as elementary tokens. So when you see
three "+" characters in a row, you -- or the compiler -- must
decide: "is this a + token followed by a ++ token, or is this a ++
token followed by a + token?" The rule C uses is incredibly simple:
ALWAYS TAKE THE LONGEST POSSIBLE MATCH. This is sometimes called
the "maximal munch".

If you take the longest match each time, the input character sequence:

---x;

turns into the four tokens "--", "-", "x", and ";". But there is no
"+-" token, so the input sequence:

+--x;

cannot turn into "+-" followed by something. Instead, the only
way to tokenize this is to produce "+", "--", "x", and ";".

Note that if we did not take the longest match, the latter might
even turn into "+", "-", "-", "x", and ";" (five tokens), which is
clearly not what we want.

(Exercise for the reader: tokenize "-++x" and "+++x", using "maximal
munch".)
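
For anyone who wants to check the exercise mechanically, here is a
minimal sketch of the longest-match rule. The function name munch and
its space-separated output format are made up for illustration; it
knows only +, -, ++, --, ; and identifiers, so it is an assumption-laden
toy and nowhere near a real C lexer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from any real compiler: split `src`
   into tokens using longest match. Tokens are written to `out`
   separated by single spaces. Returns the token count, or -1 on an
   unknown character or if `out` is too small. */
int munch(const char *src, char *out, size_t outsize)
{
    size_t used = 0;
    int count = 0;

    assert(src != NULL && out != NULL);
    if (outsize == 0)
        return -1;
    out[0] = '\0';

    while (*src != '\0') {
        size_t len;

        if (isspace((unsigned char)*src)) {
            src++;                 /* whitespace only separates tokens */
            continue;
        }
        if ((*src == '+' || *src == '-') && src[1] == *src) {
            len = 2;               /* "++" or "--": take the longer match */
        } else if (*src == '+' || *src == '-' || *src == ';') {
            len = 1;
        } else if (isalpha((unsigned char)*src) || *src == '_') {
            len = 1;               /* identifier: letters, digits, '_' */
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else {
            return -1;             /* character this toy does not know */
        }

        /* room for an optional separator, the token, and the '\0' */
        if (used + (count > 0) + len + 1 > outsize)
            return -1;
        if (count > 0)
            out[used++] = ' ';
        memcpy(out + used, src, len);
        used += len;
        out[used] = '\0';

        src += len;
        count++;
    }
    return count;
}
```

Given "---x;" this writes "-- - x ;" and returns 4; given "+--x;" it
writes "+ -- x ;". It never backs up to try a different split, which
is the whole point of the rule.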

Having produced these token sequences, *now* we get to the job of
"parsing" -- matching words against the C grammar. Given the
sequence of words:

-- - x

(I am discarding the semicolon to simplify this) the C standard
requires that a C compiler match this against the following grammar
rules (this is necessarily long, because C does not use an
operator-precedence grammar):

    primary-expr:
        identifier
        constant
        string-literal
        ( expression )

    unary-expr:
        postfix-expr
        ++ unary-expr
        -- unary-expr
        unary-operator cast-expr
        sizeof unary-expr
        sizeof ( type-name )

    unary-operator: one of
        & * + - ~ !

    cast-expr:
        unary-expr
        ( type-name ) cast-expr

Now, "x" itself is an identifier token, so it is a "primary-expr".
A "cast-expr" can be a "unary-expr" with nothing added, and a "-"
is a unary-operator, so "-" followed by a cast-expr (which is a
primary-expr) is a valid unary-expr. Hence we can use the following
to handle the "-" and "x" parts:

    x      is a primary-expr
    - x    is a unary-operator followed by a cast-expr (where
           the cast-expr is just a primary-expr, i.e., "x"),
           which makes the whole thing a valid unary-expr

This leaves only the "--" token, which can appear in front of a
unary-expr. Since "- x" has the form of a unary-expr, the whole
thing -- "-- - x" -- is also a unary-expr. This takes care of
the parsing.
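
The matching just described can be sketched as a tiny recursive
function. This is only an illustration under heavy assumptions: the
name parse_unary, the token-array interface, and the parenthesized
output are all invented here, and the sketch handles only identifiers
and the prefix operators ++, --, + and - (no casts, sizeof, or postfix
forms).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append `s` to the string already in `out`; 0 on success, -1 if full. */
static int append(char *out, size_t outsize, const char *s)
{
    size_t used = strlen(out);

    if (used + strlen(s) + 1 > outsize)
        return -1;
    strcpy(out + used, s);
    return 0;
}

/* Hypothetical sketch: match tok[*pos..] against
       unary-expr: identifier
                 | ++ unary-expr | -- unary-expr
                 | + cast-expr   | - cast-expr     (cast-expr: unary-expr)
   and append a fully parenthesized rendering to `out`, which the
   caller must have initialized to an empty string.
   Returns 0 on success, -1 on a parse error or a full buffer. */
int parse_unary(const char *tok[], int ntok, int *pos,
                char *out, size_t outsize)
{
    const char *t;

    assert(tok != NULL && pos != NULL && out != NULL);
    if (*pos >= ntok)
        return -1;                  /* ran out of tokens */
    t = tok[(*pos)++];

    if (strcmp(t, "++") == 0 || strcmp(t, "--") == 0 ||
        strcmp(t, "+") == 0 || strcmp(t, "-") == 0) {
        /* the operator applies to whatever unary-expr follows it */
        if (append(out, outsize, "(") != 0 ||
            append(out, outsize, t) != 0 ||
            append(out, outsize, " ") != 0)
            return -1;
        if (parse_unary(tok, ntok, pos, out, outsize) != 0)
            return -1;
        return append(out, outsize, ")");
    }
    if (isalpha((unsigned char)t[0]) || t[0] == '_')
        return append(out, outsize, t);     /* identifier */
    return -1;
}
```

Fed the tokens "--", "-", "x" it produces "(-- (- x))"; fed "+", "--",
"x" it produces "(+ (-- x))". Note that the parse succeeds in both
cases: nothing at this stage knows or cares about lvalues.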

Finally, having done all the parsing, now we get to the point where
a C compiler must produce a diagnostic (the "compilation error"
observed above). According to the standard (section 6.3.3 in the
draft I have handy):

    6.3.3.1 Prefix increment and decrement operators
    Constraints
    [#1] The operand of the prefix increment or decrement
    operator shall have qualified or unqualified real or pointer
    type and shall be a modifiable lvalue.

While "- x" is a valid unary-expr, it is *not* a modifiable lvalue
per section 6.3.3.3:

    6.3.3.3 Unary arithmetic operators
    ...
    Semantics
    ...
    [#3] The result of the unary - operator is the negative of
    its operand. The integer promotion is performed on the
    operand, and the result has the promoted type.

and 6.2.2.1:

    6.2.2.1 Lvalues and function designators
    [#1] An lvalue is an expression ... that designates an
    object. ...
    [#2] Except when it is the operand of the sizeof operator,
    the unary & operator, the ++ operator, the -- operator, or
    the left operand of the . operator or an assignment
    operator, an lvalue that does not have array type is
    converted to the value stored in the designated object (and
    is no longer an lvalue). ...

all of which tells us that "- x" is no longer an lvalue (much
less a modifiable one).
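
To see the constraint from the other side, here is a small sketch of
the spellings that do satisfy it. The function name unary_results and
the out[] layout are assumptions made up for this illustration; the
two rejected forms appear only in a comment, since a conforming
compiler has to diagnose them.

```c
#include <assert.h>
#include <stddef.h>

/* Fills out[0..3] with the values of four legal spellings, each
   evaluated on its own copy of x. In every case the ++ or --
   operand is a plain variable, i.e. a modifiable lvalue. */
void unary_results(int x, int out[4])
{
    int a = x, b = x, c = x, d = x;

    assert(out != NULL);
    out[0] = +--a;     /* tokens: + -- a, i.e. +(--a) */
    out[1] = -++b;     /* tokens: - ++ b, i.e. -(++b) */
    out[2] = - --c;    /* the blank keeps "-" and "--" apart: -(--c) */
    out[3] = + ++d;    /* the blank keeps "+" and "++" apart: +(++d) */

    /* Not compiled here:
           ---a;    tokens: -- - a, and "- a" is a value, not an lvalue
           +++a;    tokens: ++ + a, and "+ a" is a value, not an lvalue */
}
```

With x equal to 10, the four results are 9, -11, -9 and 11.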

Now, if we were to go back in time and change the rules for lexing
tokens in C, we might try to get rid of the "maximal munch" idea,
and observe that if only we had lexed this as the token sequence
"-", then "--", then "x", we could have parsed this by binding the
"--" to the "x" part using the "unary-expr can be -- followed by
another unary-expr" rule plus the "unary-expr can be primary-expr"
rule plus the "primary-expr can be identifier" rule. This would
then allow us to bind the first "-" using the "unary-expr can be
unary-operator followed by cast-expr" rule along with the "cast-expr
can be unary-expr" rule. The whole thing would then be a valid
parse, and now the "--" operator would apply to "x" instead of to
"- x", and as long as x is a modifiable lvalue, it would all be
valid.

If you really think this is a good idea, go ahead and build a time
machine into a DeLorean, and go back to the early 1970s and convince
Dennis Ritchie to use some rule other than "maximal munch". But
you will also have to come up with that other rule, and find a way
to implement it. The maximal munch rule has some fundamental
computer-science theory behind it, and applying this theory makes
writing compilers a lot easier.

While it is off-topic here, it is perhaps worth mentioning the CS
theory. Parsers and lexers are different animals, with different
inherent complexities. A lexer operates on so-called "regular
expressions", which express fundamentally simpler automata than do
parser grammars. Since it is simpler, it can store less state,
and potentially run faster. A lexer is incapable of, for instance,
counting parentheses to see if they match. Parsers can match
parentheses; lexers cannot.

In any case, if you really want to do gimmicky things with token
sequences in C, you can always use white-space to force the lexer
to produce the desired tokens. Instead of:

x+++++y /* error */

you can write:

x++ + ++y /* OK */

or even:

x+++ ++y /* also OK but confusing to humans */

Since C's lexical rules treat whitespace as significant (but C's
parsing rules ignore it!), either of the latter two suffices to
force the lexer to munch "++" followed by "+" followed by "++",
rather than "++" followed by "++" followed by "+".
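
A compilable sketch of the two spaced forms, in case it helps to see
them evaluated. The function name spaced_sums is an assumption for
illustration only. x and y are distinct objects, so each expression
is well-defined.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluates both spaced spellings in turn, starting from x = 1, y = 2,
   and stores the two sums through `first` and `second`. */
void spaced_sums(int *first, int *second)
{
    int x = 1, y = 2;

    assert(first != NULL && second != NULL);

    /* tokens: x ++ + ++ y, so this is (x++) + (++y) = 1 + 3 */
    *first = x++ + ++y;

    /* same five tokens; x is now 2 and y becomes 4, so 2 + 4 */
    *second = x+++ ++y;

    /* Writing x+++++y instead lexes as x ++ ++ + y, which applies the
       second ++ to the non-lvalue result of x++ and must be diagnosed. */
}
```

After the call, *first is 4 and *second is 6.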

Finally, let me say one last time that the Standard C grammar DOES
NOT use operator precedence. (At most, one can say that the grammar
"implies" a precedence. I believe the word "specifies" is too
strong, despite its appearance in a non-normative footnote.) Humans
use "precedence", along with the related concept of "associativity",
because it is easier for us to look these two up in a table than
it is to memorize things like "an expression can be an assignment-expr
which can be a conditional-expr which can be a logical-OR-expr
which ... shift-expr ... multiplicative-expr ... primary-expr" (I
left out about a dozen intermediate terms here!). Carrying all
that gunk around in our heads would leave us unable to walk and
chew gum at the same time. :-) But computers are good at these
kinds of finicky details. They never get confused; they do exactly
what you tell them to, even when it is not what you meant.
--
In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)
email: forget about it http://67.40.109.61/torek/ (for the moment)
Reading email is like searching for food in the garbage, thanks to spammers.

CBFalconer

Apr 12, 2003, 10:24:44 AM

Alex Vinokur wrote:
>
> =============================
> Windows 2000
> DJGPP 2.03
> GNU gcc version 3.2.1
> =============================
>
> --------- C code : BEGIN ---------
> int main()
> {
> int x = 10;
>
> ---x; // compilation error : non-lvalue in decrement
> +--x; // no errors
> -++x; // no errors
> +++x; // compilation error : non-lvalue in increment
>
> return 0;
> }
>
> --------- C code : END -----------
>
> How to explain the compiler behavior?

Programmer failure to include blanks between operators. Try:

- --x;
+ --x;
- ++x;
+ ++x;

The blanks shortage ended several years ago.

--
Chuck F (cbfal...@yahoo.com) (cbfal...@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Mark McIntyre

Apr 12, 2003, 6:57:10 PM

On Sat, 12 Apr 2003 14:24:44 GMT, in comp.lang.c , CBFalconer
<cbfal...@yahoo.com> wrote:
>
>The blanks shortage ended several years ago.

actuallywerestillrightoutoftheminthesticksofbuckinghamshirebutihopethatwhenimovetooxfordnextweekillgetsomedelivered

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>

Morris Dovey

Apr 12, 2003, 7:38:28 PM

Mark McIntyre wrote:
> On Sat, 12 Apr 2003 14:24:44 GMT, in comp.lang.c , CBFalconer
> <cbfal...@yahoo.com> wrote:
>
>>The blanks shortage ended several years ago.
>
> actuallywerestillrightoutoftheminthesticksofbuckinghamshirebutihopethatwhenimovetooxfordnextweekillgetsomedelivered

Spaces are workable substitutes; and they're quieter, less
expensive, and easier to store than blanks - although they may be
somewhat more expensive in Oxford than in Buckinghamshire.

--
Morris Dovey
West Des Moines, Iowa USA
C links at http://www.iedu.com/c

Mark McIntyre

Apr 12, 2003, 8:29:03 PM

On Sat, 12 Apr 2003 18:38:28 -0500, in comp.lang.c , Morris Dovey
<mrd...@iedu.com> wrote:

>Mark McIntyre wrote:
>> On Sat, 12 Apr 2003 14:24:44 GMT, in comp.lang.c , CBFalconer
>> <cbfal...@yahoo.com> wrote:
>>
>>>The blanks shortage ended several years ago.
>>
>> actuallywerestillrightoutoftheminthesticksofbuckinghamshirebutihopethatwhenimovetooxfordnextweekillgetsomedelivered
>
>Spaces are workable substitutes; and they're quieter, less
>expensive, and easier to store than blanks - although they may be
>somewhat more expensive in Oxford than in Buckinghamshire.

The last part is so true that it hurts.... :-(

Alex

Apr 14, 2003, 3:23:53 PM

In comp.lang.c Chris Torek <nos...@elf.eng.bsdi.com> wrote:
> Preliminary note: Forget about "precedence" for now. The C standard
> does not use an operator precedence grammar. "Operator precedence"
> is a trick *you* use to parse an expression. A C compiler is allowed
> to use a *different* trick. Whichever trick you prefer -- yours, or
> the C standard's -- it only comes into play when there is more than one
> way to interpret an "expression".

> In article <b78v8s$c9mnd$1...@ID-79865.news.dfncis.de>
> Alex Vinokur <ale...@connect.to> writes:
>> ---x; // compilation error : non-lvalue in decrement
>> +--x; // no errors
>> -++x; // no errors
>> +++x; // compilation error : non-lvalue in increment

>>How to explain the compiler behavior?

> One early step in interpreting C code is to make "tokens" out of
> character sequences (specifically, "pp-tokens" or preprocessor
> tokens). This step is often called "lexical analysis". These
> tokens are (eventually) the input to the "parser" -- the part of
> a compiler that matches input "words" up against some sort of
> grammar. Each token is, in effect, a word in the language. Thus,
> until you have "lexed up" some tokens, you cannot even begin to
> parse them.

Chris, please write a book. I'm tired of archiving all your posts :)

Chris Dollin

Apr 17, 2003, 10:04:18 AM

Alex Vinokur wrote:

> --------- C code : BEGIN ---------
> int main()
> {
> int x = 10;
>
> ---x; // compilation error : non-lvalue in decrement

Well, of course. You wrote

-- -x;

-x is not an lvalue, so you can't -- it.

> +--x; // no errors
> -++x; // no errors

Yep.

> +++x; // compilation error : non-lvalue in increment

You wrote

++ +x;

Same as ---x above.

> How to explain the compiler behavior?

It's implementing the language rules correctly; and in this case,
they have nothing to do with operator precedence, and everything
to do with the lexical structure of the language, which is
longest-match-wins.

--
Chris "transmogrify, worms, ducks" Dollin
C FAQs at: http://www.faqs.org/faqs/by-newsgroup/comp/comp.lang.c.html
C welcome: http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html