
unary - operator (size_t)


John Williams

Jun 5, 1996, 3:00:00 AM

Bob Nelson wrote:
>
> These five straightforward questions concerning the "unary - operator"
> are based upon reading, in particular, 6.3.3.3 and 6.2.1.3 from the
> Standard after reviewing the casting endorsed by Koenig (CT&P p. 71):
>
> /* Appropriate headers and supporting code presumed */
>
> double d;
> long i;
>
> i = -sizeof(char); /* 1). Guaranteed to be -1? */

If sizeof(size_t) == sizeof(long) then probably so, otherwise,
no, since according to what I have been told size_t *must* be an
unsigned type, so you probably will get something like 0xFFFF or
0xFFFFFFFF in i.

> d = -sizeof(char); /* 2). "Max size_t" -1? */

In this case I'm sure the value will be positive.

> d = (long)-sizeof(char); /* 3). -1? */

Same as #1 with the addition of a cast to double.

> d = -(long)sizeof(char); /* 4). -1? */

This will always be -1 because the value 1 is put into a signed type and
*then* negated--that is very well-defined. Next you are simply converting
it to a double, which AFAIK will always preserve the sign on any
implementation that isn't very badly broken.

> 5). Does the Standard provide any guidance on how a compiler shall or
> may choose to implement the conversions on the latter two statements?
> Is one preferred over the other?

As a general rule of thumb it's a good idea to avoid mixing signed and
unsigned types a lot, and if you really want to get a negative number,
use a signed type.

-- John Williams

Bob Nelson

Jun 6, 1996, 3:00:00 AM

These five straightforward questions concerning the "unary - operator"
are based upon reading, in particular, 6.3.3.3 and 6.2.1.3 from the
Standard after reviewing the casting endorsed by Koenig (CT&P p. 71):

/* Appropriate headers and supporting code presumed */


double d;
long i;

i = -sizeof(char); /* 1). Guaranteed to be -1? */

d = -sizeof(char); /* 2). "Max size_t" -1? */

d = (long)-sizeof(char); /* 3). -1? */


d = -(long)sizeof(char); /* 4). -1? */

5). Does the Standard provide any guidance on how a compiler shall or
may choose to implement the conversions on the latter two statements?
Is one preferred over the other?

--
=============================================================================
Bob Nelson: Dallas, Texas, U.S.A. - bne...@netcom.com
Linux for fun, M$ for $$$...and the NFL for what really counts!
=============================================================================


Mark Brader

Jun 6, 1996, 3:00:00 AM

> > /* Appropriate headers and supporting code presumed */
> >
> > double d; long i;
> > i = -sizeof(char); /* 1). Guaranteed to be -1? */
>
> If sizeof(size_t) == sizeof(long) then probably so...

Um, well, "probably guaranteed" means "not guaranteed".

The value of i is not guaranteed. There are three cases.

[1] size_t is narrower than int. Then it promotes to int, the
negation is done in int, and i must be -1.

[2] size_t isn't narrower than int, but is narrower than long.
Then sizeof(char) is unchanged by the integral promotions,
and on negation, it yields the maximum number representable
in size_t (I'll call it SIZE_T_MAX in this posting). This
fits in a long, so i must be SIZE_T_MAX.

[3] size_t is the same width as long. Then again the right-hand
side yields SIZE_T_MAX, but now this value cannot be represented
in a long. The value assigned to i is implementation-defined.

The *likely* cases are 2 and 3, with the most likely result in case 3
being -1. But this is a side comment merely reflecting common practice.
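
A minimal sketch of the three cases, for checking which one a given
implementation falls into. It assumes a C99 compiler, since SIZE_MAX from
<stdint.h> postdates this thread (the SIZE_T_MAX used above is informal),
and the function names are hypothetical:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>  /* SIZE_MAX is C99; an assumption of this sketch */

/* Hypothetical helper: which of the three cases applies here? */
int size_t_case(void)
{
    if (SIZE_MAX <= INT_MAX)
        return 1;   /* size_t promotes to int; -sizeof(char) is -1 */
    if (SIZE_MAX <= LONG_MAX)
        return 2;   /* negation wraps to SIZE_MAX, which fits in a long */
    return 3;       /* SIZE_MAX does not fit in a long: implementation-defined */
}

/* Stored back into a size_t, the result is always SIZE_MAX. */
size_t negated_sizeof(void)
{
    return -sizeof(char);
}

/* Question 1: the value that ends up in a long depends on the case. */
long negated_sizeof_as_long(void)
{
    return -sizeof(char);
}
```

Only negated_sizeof() has the same result everywhere; what
negated_sizeof_as_long() returns in case 3 is whatever the implementation
documents for out-of-range conversions to a signed type.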


> > d = -sizeof(char); /* 2). "Max size_t" -1? */
>
> In this case I'm sure the value will be positive.

Not quite -- in the unlikely case that size_t is narrower than int, d
will be -1, just as in #1. Otherwise the value is positive, but it's
(double)SIZE_T_MAX, not (double)(SIZE_T_MAX-1).

Note also that double need not have enough precision to represent the
value, in which case the result is chosen in an implementation-defined
manner from the nearest representable values. For example, if SIZE_T_MAX is
4294967295, d might become 4294967040 or 4294967296 rather than 4294967295.
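
A rough sketch of how one might check this, assuming a binary floating-point
format and no padding bits in size_t. The function names are hypothetical.
Because double on common IEEE 754 systems has a 53-bit significand and so
holds 32-bit values exactly, the second function uses float (typically 24
bits) to make the rounding visible:

```c
#include <float.h>
#include <limits.h>
#include <stddef.h>

/* Nonzero if every size_t value converts to double exactly.
   Assumes a binary format and no padding bits in size_t;
   conservatively returns 0 when FLT_RADIX is not 2. */
int size_t_fits_double_exactly(void)
{
    return FLT_RADIX == 2 && sizeof(size_t) * CHAR_BIT <= DBL_MANT_DIG;
}

/* Converts v to float; the volatile store discards any extra precision
   the implementation might otherwise keep in a register. */
float to_float(unsigned long v)
{
    volatile float f = (float)v;
    return f;
}
```

With a 24-bit significand, to_float(4294967295UL) cannot be exact; its two
representable neighbours are 4294967040 and 4294967296, the figures quoted
above.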

> > d = (long)-sizeof(char); /* 3). -1? */

> Same as #1 with the addition of a cast to double.
>

> > d = -(long)sizeof(char); /* 4). -1? */

> This will always be -1 ...

Right.



> > 5). Does the Standard provide any guidance on how a compiler shall or
> > may choose to implement the conversions on the latter two statements?

The standard specifies some conversions and leaves others implementation-
defined, as enumerated above.



> As a general rule of thumb it's a good idea to avoid mixing signed and
> unsigned types a lot, and if you really want to get a negative number,
> use a signed type.

Yep. In particular, arithmetic that may cause different results depending
on the sizes of types on the machine, whether because of their interaction
with promotion or because of simple overflow, is to be avoided.

--
Mark Brader, m...@sq.com "The last time I trusted you, we had Mark."
SoftQuad Inc., Toronto -- Jill, "Home Improvement" (B.K. Taylor)

My text in this article is in the public domain.

Clive D.W. Feather

Jun 6, 1996, 3:00:00 AM

In article <bnelsonD...@netcom.com>,

Bob Nelson <bne...@netcom.com> wrote:
> double d;
> long i;
> i = -sizeof(char); /* 1). Guaranteed to be -1? */
> d = -sizeof(char); /* 2). "Max size_t" -1? */
> d = (long)-sizeof(char); /* 3). -1? */
> d = -(long)sizeof(char); /* 4). -1? */

Let's start by reducing these to the essential expressions:

(long) - (size_t) 1U /* (1) */
(double) - (size_t) 1U /* (2) */
(double) (long) - (size_t) 1U /* (3) */
(double) - (long) (size_t) 1U /* (4) */

In each case, we start by converting 1U to size_t, giving us a value
with type size_t and value 1 (obvious so far). The first three
expressions then negate this. To do this, we first apply the integral
promotions; the possibilities, and the resulting values, are:

(A) size_t is unsigned char and UCHAR_MAX <= INT_MAX -> 1
(B) size_t is unsigned char and UCHAR_MAX > INT_MAX -> 1U
(C) size_t is unsigned short and USHRT_MAX <= INT_MAX -> 1
(D) size_t is unsigned short and USHRT_MAX > INT_MAX -> 1U
(E) size_t is unsigned int -> 1U
(F) size_t is unsigned long -> 1UL

[I ignore here the viewpoint that says that (B) is impossible.]

Negating the value then yields:

(A)(C) 1 -> -1 (type is signed int)
(B)(D)(E) 1U -> UINT_MAX (type is unsigned int)
(F) 1UL -> ULONG_MAX (type is unsigned long)

Casting -1 to long yields -1L; casting it to double yields -1.0 whether
via long or not. So (A) and (C) give -1L, -1.0, and -1.0 for (1) to (3).

Casting the other two values to long depends on whether they are greater
than LONG_MAX or not; if so, then the result is implementation-defined,
while if they are less than or equal to it, they remain unchanged.
When a large integer such as UINT_MAX is converted to double, the value
will be converted exactly if possible, or rounded if not. The exact
result can be determined based on the symbols in <limits.h> and <float.h>,
but the technique is not trivial.

Expression (4) is different. The first cast gives 1U in some unsigned
type, so the cast to long gives 1L, and the final result is -1.0.

[Phew. That'll teach you to mix variable unsigned types like size_t into
arithmetic expressions :-)]
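
Expression (4) can be wrapped up as a small helper. This is only a sketch:
the name negate_size and the range check are illustrative additions, not
something the Standard or the original question supplies.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: stores -(long)n in *out and returns 1,
   or returns 0 without touching *out if n does not fit in a long. */
int negate_size(size_t n, long *out)
{
    if (n > (unsigned long)LONG_MAX)
        return 0;          /* the cast to long would be implementation-defined */
    *out = -(long)n;       /* convert to a signed type first, then negate */
    return 1;
}
```

For n equal to sizeof(char) this stores -1 on every implementation, and
converting that to double gives -1.0.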

--
Clive D.W. Feather | If you lie to the compiler,
cl...@demon.net (work, preferred) | it will get its revenge.
cl...@stdc.demon.co.uk (home) | - Henry Spencer

Antoine LECA

Jun 17, 1996, 3:00:00 AM
to cd...@cityscape.co.uk

Clive D.W. Feather wrote:
>[snip; we're evaluating -sizeof(char)]
> We start by converting 1U to size_t, giving us a value

> with type size_t and value 1 (obvious so far). The first three
> expressions then negate this. To do this, we first apply the integral
> promotions; the possibilities, and the resulting values, are:
>
> (A) size_t is unsigned char and UCHAR_MAX <= INT_MAX -> 1
> (B) size_t is unsigned char and UCHAR_MAX > INT_MAX -> 1U
> (C) size_t is unsigned short and USHRT_MAX <= INT_MAX -> 1
> (D) size_t is unsigned short and USHRT_MAX > INT_MAX -> 1U
> (E) size_t is unsigned int -> 1U
> (F) size_t is unsigned long -> 1UL
>
> [I ignore here the viewpoint that says that (B) is impossible.]
>
>[rest deleted]

Tell me if I'm wrong (or maybe I haven't understood you), but the
(B) case really is possible.

For example, if char==short==int are 16 bits long.

Or if char==short==int==long are 32 bits long.

If this is impossible, please tell me why (I'm a novice in std.c).

Antoine LECA

Lawrence Kirby

Jun 17, 1996, 3:00:00 AM

In article <31C525...@renault.fr>
antoin...@renault.fr "Antoine LECA" writes:

>For example, if char==short==int are 16 bits long.
>
>Or if char==short==int==long are 32 bits long.
>
>If this is impossible, please tell me why (I'm a novice in std.c).

Yes, these are both fine.

--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------

Clive D.W. Feather

Jun 19, 1996, 3:00:00 AM

In article <31C525...@renault.fr>,
Antoine LECA <antoin...@renault.fr> wrote:

>Clive D.W. Feather wrote:
>> (B) size_t is unsigned char and UCHAR_MAX > INT_MAX -> 1U
>> [I ignore here the viewpoint that says that (B) is impossible.]
[rest deleted]

>Tell me if I'm wrong (or maybe I haven't understood you), but the
>(B) case really is possible.

>For example, if char==short==int are 16 bits long.

If unsigned char can hold the same number of values as int, or even
more, then the function "isalpha" cannot, in general, be implemented
because its domain has to be larger than the range of unsigned char.
Thus, says the logic used, since isalpha must be implementable, unsigned
char must have a smaller range than int.

I personally think this is a defect needing remedying.

James Kanze US/ESC 60/3/141 #40763

Jun 20, 1996, 3:00:00 AM

In article <Dt8Ew...@stdc.demon.co.uk> cl...@stdc.demon.co.uk (Clive
D.W. Feather) writes:

|> If unsigned char can hold the same number of values as int, or even
|> more, then the function "isalpha" cannot, in general, be implemented
|> because its domain has to be larger than the range of unsigned char.
|> Thus, says the logic used, since isalpha must be implementable, unsigned
|> char must have a smaller range than int.

Is this some sort of official interpretation, or just an argument about
why in any real implementation, unsigned char must have a range smaller
than int? (I have a certain amount of code that will break if I ever
encounter a machine in which sizeof( int ) == sizeof( char ), unless
that machine has some special logic so that UCHAR_MAX + 1 > UCHAR_MAX.
I don't like depending on something which is not guaranteed, but there
are a few cases where I was really unable to find a reasonable
alternative.)
--
James Kanze Tel.: (+33) 88 14 49 00 email: ka...@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils, études et réalisations en logiciel orienté objet --
-- A la recherche d'une activité dans une region francophone


Clive D.W. Feather

Jun 21, 1996, 3:00:00 AM

In article <DtC44...@stdc.demon.co.uk>, I wrote:
>> unless
>> that machine has some special logic so that UCHAR_MAX + 1 > UCHAR_MAX.
> UCHAR_MAX + 1 is zero, by definition.

... because if char and int have the same size, int cannot hold all the
values of unsigned char, so the latter promotes to unsigned int, so
the calculation is done in unsigned int and UCHAR_MAX == UINT_MAX, so it
wraps to zero.
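
A small sketch of that reasoning, relying on the rule that the <limits.h>
macros have the promoted type of the corresponding object. The function
names are hypothetical, and the second function assumes UCHAR_MAX is not
equal to INT_MAX, since the addition would then overflow a signed int:

```c
#include <limits.h>

/* 1 if unsigned char promotes to int here, 0 if it promotes to unsigned int. */
int uchar_promotes_to_int(void)
{
    return UCHAR_MAX <= INT_MAX;
}

/* 1 if UCHAR_MAX + 1 wraps to zero, i.e. the sum is done in unsigned int
   and UCHAR_MAX == UINT_MAX. Assumes UCHAR_MAX != INT_MAX. */
int uchar_max_plus_one_is_zero(void)
{
    return UCHAR_MAX + 1 == 0;
}
```

On an implementation where unsigned char is narrower than int, the sum is an
ordinary int one greater than UCHAR_MAX and the second function returns 0.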

Clive D.W. Feather

Jun 21, 1996, 3:00:00 AM

In article <KANZE.96J...@slsvgqt.lts.sel.alcatel.de>,

James Kanze US/ESC 60/3/141 #40763 <ka...@lts.sel.alcatel.de> wrote:
>> Thus, says the logic used, since isalpha must be implementable, unsigned
>> char must have a smaller range than int.
> Is this some sort of official interpretation, or just an argument about
> why in any real implementation, unsigned char must have a range smaller
> than int?

Just an argument.

I don't recall whether or not this issue was included in DR069, which
covers a number of integer representation issues, nor, if it was, what
WG14 said.


> unless
> that machine has some special logic so that UCHAR_MAX + 1 > UCHAR_MAX.

UCHAR_MAX + 1 is zero, by definition.

--

k...@cafe.net

Jun 27, 1996, 3:00:00 AM
In article <Dt8Ew...@stdc.demon.co.uk>,

Clive D.W. Feather <cd...@cityscape.co.uk> wrote:
>In article <31C525...@renault.fr>,
>Antoine LECA <antoin...@renault.fr> wrote:
>>Clive D.W. Feather wrote:
>>> (B) size_t is unsigned char and UCHAR_MAX > INT_MAX -> 1U
>>> [I ignore here the viewpoint that says that (B) is impossible.]
>[rest deleted]
>
>>Tell me if I'm wrong (or maybe I haven't understood you), but the
>>(B) case really is possible.
>>For example, if char==short==int are 16 bits long.
>
>If unsigned char can hold the same number of values as int, or even
>more, then the function "isalpha" cannot, in general, be implemented
>because its domain has to be larger than the range of unsigned char.

Why do you assume that the domain of isalpha() has to include all of the
possible unsigned char values? Does the / operator admit all possible
values of a given arithmetic type as its right operand?

>Thus, says the logic used, since isalpha must be implementable, unsigned
>char must have a smaller range than int.

Why is that? isalpha is an int -> int function, and doesn't involve unsigned
char types. If you must apply isalpha() to an unsigned char, the char
will be converted into an int according to implementation defined rules.

The implementation can coordinate its version of isalpha() together with these
conversion rules to give meaningful results. For example, values of unsigned
char exceeding the integer range can be mapped to negative values of int. On
two's complement machines, this involves a mere re-interpretation of the bit
pattern as a negative number, assuming the width is equal.

>I personally think this is a defect needing remedying.

There is really only one potential glitch. Suppose that one implementation is
a sign+magnitude machine with a 16-bit int and unsigned char. The unsigned
char holds 65536 distinct values, whereas the int only 65535. Why? Because
two bit patterns represent a zero int.

Nevertheless, the implementation could define a one-to-one conversion rule
in going from unsigned char to int, and isalpha() could use a bit operation
to distinguish the two distinct ``flavors'' of zero.


k...@cafe.net

Jun 27, 1996, 3:00:00 AM
In article <DtC4B...@stdc.demon.co.uk>,

Clive D.W. Feather <cd...@cityscape.co.uk> wrote:
>In article <DtC44...@stdc.demon.co.uk>, I wrote:
>>> unless
>>> that machine has some special logic so that UCHAR_MAX + 1 > UCHAR_MAX.
>> UCHAR_MAX + 1 is zero, by definition.
>
>... because if char and int have the same size, int cannot hold all the
>values of unsigned char, so the latter promotes to unsigned int, so
>the calculation is done in unsigned int and UCHAR_MAX == UINT_MAX, so it
>wraps to zero.

Thus it's not ``by definition'' at all, but by implementation.

Stephen Baynes

Jun 28, 1996, 3:00:00 AM
k...@cafe.net wrote:
: Why is that? isalpha is an int -> int function, and doesn't involve unsigned

: char types. If you must apply isalpha() to an unsigned char, the char
: will be converted into an int according to implementation defined rules.

: The implementation can coordinate its version of isalpha() together with these
: conversion rules to give meaningful results. For example, values of unsigned
: char exceeding the integer range can be mapped to negative values of int. On
: two's complement machines, this involves a mere re-interpretation of the bit
: pattern as a negative number, assuming the width is equal.

Remember that EOF is negative, and isalpha etc. have to give defined behaviour
for EOF.
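
For reference, a sketch of how the <ctype.h> functions are conventionally
called on ordinary implementations, where the argument must be either EOF or
a value representable as unsigned char. The function names are hypothetical:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Counts alphabetic characters in s. Each char is converted to unsigned
   char first so that the argument is within isalpha's domain even when
   plain char is signed. */
size_t count_alpha(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (isalpha((unsigned char)*s))
            ++n;
    }
    return n;
}

/* Classifies a value as returned by getc(): -1 for EOF,
   1 for an alphabetic character, 0 for anything else. */
int classify(int c)
{
    if (c == EOF)
        return -1;
    return isalpha(c) != 0;
}
```

The case debated in this thread is the one where an unsigned char value
converted to int could come out negative and so collide with EOF; this
sketch does nothing to resolve that.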

--
Stephen Baynes bay...@ukpsshp1.serigate.philips.nl
Philips Semiconductors Ltd
Southampton My views are my own.
United Kingdom
Are you using ISO8859-1? Do you see © as copyright, ÷ as division and ½ as 1/2?
