On 25/09/18 00:07, Bart wrote:
> On 24/09/2018 21:51, David Brown wrote:
>> On 24/09/18 17:18, Bart wrote:
>
>>> Since then, I don't think any of the 9 or 10 architectures I've coded
>>> for (and a few more I've studied) would have had anything other than
>>> an all 1s representation of -1.)
>>
>> There have never been many systems that had something other than all
>> 1's for -1, but there have been a few. You are not likely to come
>> across them, especially in modern C programming. (I have worked with
>> programming language implementations where -1 had a representation
>> other than all 1's, but it was not C.)
>
> In software anyone can make up their own schemes. As a simple example,
> they can choose to use floating point representation for integers, where
> -1 might be stored as:
>
> -100000000010000000000000000000000000000000000000000000000000000
>
That is approximately the format used by the implementation I was
thinking about (I haven't counted the number of 0's to be sure). It had
a 32-bit number format that could store either an integer in 24 bits
plus a sign bit, or a floating point value with 24-bit mantissa.
> in binary. And I use a big int library that uses signed-magnitude
> decimals with 'digits' that go from 0 to 999999. -1 would be the two
> 32-bit words (1,1).
>
> This is about common machine representations and languages that work
> closely with such representations.
Well, the thread was never about this at all - the representation of -1
on the machine is irrelevant to the fact that "-1" converted to an
unsigned type in C always gives you the maximum value of that type.
>
>>> (With pen and paper, if doing it in decimal, you also make the
>>> interesting discovery that -1 can be represented as an infinite
>>> sequence of 9s, i.e. "...999". Negative numbers denoted with
>>> "-", really signed magnitude, aren't needed after all!)
>>
>> That is just nonsense.
>
> (In what way? If I add 1 to "...999" I get 0, just as I would adding 1
> to -1.
No, you don't. I can't imagine what jumble of ideas makes you think that.
An infinite sequence of 9's is infinity. Adding 1 leaves it unchanged
as infinity. Normal mathematical integers do not wrap - nor do they
have a maximum value. (If you are interested in infinite arithmetic, we
can go into more detail, but it would be very off-topic.)
>
> If I multiply "...999" by itself I get "...0001", or +1, just as I get
> if I square -1.
>
> If I multiply it by 5 I get "...995", which if I add +5, gives me 0,
> just as -5 would do.
>
> So it all seems to work.
It only /appears/ to work if you don't actually bother doing the
calculations - you just handle an arbitrarily small number of the least
significant digits. But the maths is not integer maths, you don't have
proper rules for handling them (just try division), and if you work
things through enough to get rigorous definitions of operations and a
mapping to normal integers, you'll find that your "...999" bit is just
an inconvenient way of writing "-".
>
> But having to have an infinite number of leading digits, or using a
> fixed number with the left-most one designated a sign digit, is
> inconvenient. Inside a computer however and using binary (and with a
> fixed number of digits, not infinite), it is very helpful.)
>
It is helpful, yes. But that is mainly an effect of the way we tend to
implement arithmetic using logic gates - nothing more than that. In a
finite computer you are always going to have a finite model for
arithmetic - you can't implement mathematical integers. Two's
complement with fixed sizes is a representation that gives a good
balance between efficient implementation and useful features - it does
not mean that two's complement is inherently more "natural" or "correct"
than other representations of negative numbers.
>>>
>>>> I don't want to discourage anyone from giving advice but when you have
>>>> explicitly stated that you don't want to learn what you think of as C's
>>>> crazy rules, it seems odd that you choose to.
>>>
>>> Well I read your reply to the OP, and I couldn't make head or tail of
>>> it, not even on a second reading. Sometimes you want to keep those
>>> crazy rules at bay as much as possible and just try and make
>>> practical common sense work.
>>>
>> Try again, and if it still doesn't make sense, ask someone about the
>> parts you have trouble with.
>
> Here's the first part:
>
> "A signed integer value is converted to an unsigned type by subtracting
> (repeatedly if necessary) one more than the maximum value of that type."
>
(It should read "adding or subtracting", not just "subtracting".)
When you have a representation with no padding bits (as you usually do,
for all types other than _Bool), this has exactly the same mathematical
effect as you would get by taking the signed integer in two's complement
representation, sign-extending as much as necessary if the new unsigned
type has more bits, and then truncating from the LSB up to the size of
the unsigned type.
> Assuming "that type" means the unsigned type we're trying to determine
> the maximum value of, then that method for working out that value,
> requires that you first know what that value is!
The /compiler/ needs to know the value - the programmer doesn't.
Clearly the compiler always knows the size of the types involved.
>
> If someone is trying to work through that with pen and paper, then they
> will need to know SIZE_MAX before they can determine SIZE_MAX.
>
The point of calculations like "(size_t) -1" is that they work for any C
implementation - with values that are dependent on the implementation.
If you want to do this by pen and paper, you need to know which C
implementation you are using - and then you can look up the sizes in the
documentation. It does not make sense to try to get a calculation on
paper that is independent of the compiler.
>> If you don't understand something, ask - don't try to add your own
>> alternative confusions as an answer to someone else's question.
>
> C doesn't have a monopoly on signed and unsigned representations of
> binary integers. These same problems come up with any number of
> languages, and outside languages too as they are among the fundamental
> characteristics of nearly all computers.
>
True.
> That the byte pattern 11111111 is 255, or the maximum value of an
> unsigned byte, if interpreted one way, or -1 if interpreted as signed -
> for two's complement - has been very well known for decades. And is the
> basis for using ((size_t)-1) since the -1 gives you as many ones as needed.
>
>
8 bits is a common size of byte - it is the standard for all but niche
situations. The same applies to two's complement for signed integer
representation. However, these are /not/ fundamental to computing -
they are merely convenient standards that work well in practice, and
have thus become dominant.
That means that for all but the most unusual implementations (which /do/
exist, even if you never see them yourself), it is correct to say that
((size_t) -1) will give you lots of 1's. But what is /always/ correct
in C - even for the weirdest implementations - is that ((size_t) -1)
will give you the maximum value of type size_t.
And the OP was not interested in having a bunch of 1's. He was
interested in having the maximum value of type size_t.
Of course, there is no guarantee that size_t is the largest unsigned
type. In C99, that would be uintmax_t, which is guaranteed to be able
to represent any value of any unsigned type. (Note that pointers can be
bigger.) It is entirely possible for a C implementation on a 64-bit
computer to have size_t as 64-bit, but have solid support for 128-bit
arithmetic and define uintmax_t as 128-bit.
And for most C types, there are standard library macros for the maximum
values - such as UINTMAX_MAX. size_t is a bit unusual in this respect,
which is why something like ((size_t) -1) might be useful.