Ron Shepard <nos...@nowhere.org> wrote:
> On 5/11/23 9:14 AM, Steven G. Kargl wrote:
>> On Thu, 11 May 2023 01:52:48 -0500, Ron Shepard wrote:
> [...]
>
>> 3. Request J3 to add an 'unsigned' attribute to
>> the Fortran standard.
There is
https://github.com/j3-fortran/fortran_proposals/issues/2
for which I have written up a proposal which is supposed to be
submitted to J3.
Attached below for comment and further refinement.
>> 4. Add an extension to gfortran to have an
>> 'unsigned' attribute.
If this ever makes any headway at J3, I've committed to implementing
it. It would make an interesting project, probably with a lot
of repetitive code.
> First, there are lots of situations where unsigned arithmetic would be
> nice to have in fortran, so I'm all for that. That is about 40 or 50
> years late to the language in my opinion.
> However, for the LCG situations, this seems to me to be kind of a
> gimmick to solve this problem. In C, unsigned integer arithmetic is
> basically defined to be mod 32 arithmetic. That is, overflows with
> unsigned arithmetic are ignored.
They are not ignored; there is no overflow in mod 2^n arithmetic.
> What if you want overflows with unsigned arithmetic to be caught the
> same way they are for signed arithmetic? C has already made its choice,
> what is done is done there. But in fortran, is mod 32 arithmetic really
> what is best for the language? I don't think that answer is obvious.
If you want to check for overflow on addition with unsigned
integers, this is trivial:
c = a + b
if (c < a) then
   ! overflow occurred
end if
For multiplication, you'll need a widening multiplication and then
check whether the upper half of the result is nonzero.
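As a rough illustration in C, whose unsigned types already wrap modulo 2^n, the two checks might look like the sketch below. The helper names `add_overflows` and `mul_overflows` are made up for this example, and 32-bit operands are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a + b does not fit in 32 bits.
   The sum wraps modulo 2^32, so a wrapped result is smaller than a. */
bool add_overflows(uint32_t a, uint32_t b)
{
    uint32_t c = a + b;
    return c < a;
}

/* Hypothetical helper: true if a * b does not fit in 32 bits.
   The product is formed in 64 bits (a widening multiplication);
   any set bit in the upper half means the 32-bit result wrapped. */
bool mul_overflows(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a * b;
    return (wide >> 32) != 0;
}
```

For 64-bit operands the same idea needs a 128-bit product or a compiler builtin, since standard C has no wider integer type to widen into.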
> I think my preferred solution from all of the possibilities is to have a
> pair of fortran intrinsic functions just for this special case. One does
> additions with overflow ignored and the other does multiplications with
> overflow ignored. Of course, compilers should inline those functions for
> efficiency (they map directly to the hardware, after all), but the
> programmer would then be in control in those few situations where silent
> overflow (or mod 32 arithmetic) is desired. No compiler options or
> volatile attributes should be required.
I've proposed such intrinsics, too. For GCC, these would just
be builtin functions, which can easily be called from gfortran.
> If such functions were added to the language, then there are still some
> choices. Should they work for all integer kinds, or just the kind that
> is the same size as an address (e.g. 64 bits)?
All kinds.
> Or maybe just the default
> integer kind (e.g. 32 bits)? As a programmer, I would like all kinds to
> be supported, but if that is too difficult for compiler writers, then I
> would gladly settle for something less general just to have it
> available. (Thoughts of PDTs that are still buggy in compilers come to
> mind.)
PDTs are a different kettle of fish. Implementing such functions from
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html would
not be too hard.
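To give an idea of what such intrinsics could map to, here is a minimal C sketch built on the GCC builtins from that page. The wrapper names `wrapping_add_i32` and `wrapping_mul_i32` are invented for illustration and are not part of any proposal; the code assumes GCC or a compatible compiler such as Clang.

```c
#include <stdint.h>

/* Hypothetical wrapper: signed 32-bit addition with overflow ignored.
   __builtin_add_overflow stores the wrapped result through its third
   argument and returns true if the exact sum did not fit; the flag
   is deliberately discarded here. */
int32_t wrapping_add_i32(int32_t a, int32_t b)
{
    int32_t result;
    (void)__builtin_add_overflow(a, b, &result);
    return result;
}

/* Hypothetical wrapper: signed 32-bit multiplication with overflow
   ignored, built on __builtin_mul_overflow in the same way. */
int32_t wrapping_mul_i32(int32_t a, int32_t b)
{
    int32_t result;
    (void)__builtin_mul_overflow(a, b, &result);
    return result;
}
```

Keeping the returned flag instead of discarding it would give the checked variant, so both behaviors could come from the same builtin.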
[...]
And here's the unsigned proposal in its current form:
To: J3 J3/XX-XXX
From: Thomas König
Subject: UNSIGNED type
Date:
# 1. Introduction
Unsigned integers are a basic data type used in many programming
languages, such as C. Arithmetic on them is typically performed modulo
2^n for a datatype with n bits. They are useful for a range of
applications, including, but not limited to:
- hashing
- cryptography (including multi-precision arithmetic)
- image processing
- binary file I/O
- interfacing to the operating system
- signal processing
- data compression
Introduction of unsigned integers should not repeat the mistakes of
languages like C, and syntax and functionality should be familiar to
people who today use unsigned types in other programming languages.
# 2. C interoperability
One major use case is C interoperability, including interfacing to
operating system calls specified in C. At the moment, Fortran uses
signed int for interoperability with C unsigned int types, which has
two drawbacks:
## 2.1 Value range
An unsigned int with n bits has a value range between 0 and 2^n-1,
while Fortran model numbers have values between -2^(n-1)+1 and
2^(n-1)-1. While agreement of representation between nonnegative
interoperable Fortran integers and nonnegative unsigned ints on
a companion processor is assured by the C standard, this is not
the case for unsigned ints larger than 2^(n-1)-1.
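The effect can be sketched from the C side. The helper `bits_as_signed` below is hypothetical; it reinterprets the bits of a 32-bit unsigned value as a signed 32-bit integer, which is roughly what a Fortran caller sees today when it declares the corresponding argument as a signed interoperable integer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bit pattern of an unsigned 32-bit
   value into a signed 32-bit integer (int32_t is two's complement). */
int32_t bits_as_signed(uint32_t u)
{
    int32_t s;
    memcpy(&s, &u, sizeof s);
    return s;
}
```

Values up to 2^31-1 come through unchanged, while 3000000000 arrives as -1294967296 (that is, 3000000000 - 2^32).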
## 2.2 Automatically generated C headers
It is straightforward to generate C prototypes or declarations
suitable for inclusion in the companion processor from Fortran
interfaces. At least one compiler, gfortran, has an
[option to do this](https://gcc.gnu.org/onlinedocs/gfortran/Interoperability-Options.html).
This fails in the case where the C code specifies unsigned, and
Fortran can only specify interoperable signed integers.
# 3. Avoiding traps and pitfalls
There are numerous well-known traps and pitfalls in the way that C
implements unsigned integers. These are mostly the result of C's
integer promotion rules, which need to be avoided. Specifically,
comparing signed and unsigned values can give surprising results,
leading to hard-to-detect errors, infinite loops, and similar
problems.
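A small C sketch of the comparison pitfall follows. Both function names are invented for this example; `c_rules_less` writes out explicitly the conversion that C applies implicitly when evaluating `s < u`.

```c
#include <stdbool.h>

/* What `s < u` means under C's usual arithmetic conversions:
   s is converted to unsigned first, so a negative s turns into
   a very large positive value. */
bool c_rules_less(int s, unsigned u)
{
    return (unsigned)s < u;
}

/* Value-correct comparison: handle the negative case first. */
bool safe_less(int s, unsigned u)
{
    return s < 0 || (unsigned)s < u;
}
```

With these definitions, `c_rules_less(-1, 1u)` is false although -1 is mathematically less than 1, while `safe_less(-1, 1u)` is true. The same conversion is behind loops such as `for (unsigned i = n; i >= 0; i--)`, whose condition can never become false.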
# 4. Prior art
At least one Fortran compiler, Sun Fortran, introduced unsigned ints.
Documentation can be found at
[Oracle](https://docs.oracle.com/cd/E19205-01/819-5263/aevnb/index.html).
This proposal borrows heavily from that prior art, without sticking
to it in all details. The discussion at the
[Fortran proposals site](https://github.com/j3-fortran/fortran_proposals/issues/2)
also influenced this proposal.
# 5. Proposal
## 5.1 General
- A type name tentatively called UNSIGNED, with the same KIND
mechanism as for INTEGER, plus a SELECTED_UNSIGNED_KIND function,
to implement unsigned integers
- Unsigned integer constants are marked with a U suffix, with an
optional KIND number following
- A conversion function UINT, with an optional KIND
- A prohibition of mixing INTEGER and UNSIGNED operands in binary
operations. It is to be decided if a positive integer constant
should be valid as an operand together with an unsigned value.
- Binary operations between INTEGER and UNSIGNED are prohibited
without explicit conversion, binary operations between UNSIGNED
and REAL are permitted
- Unsigned integers to be allowed in SELECT CASE
- Unsigned integers not to be allowed as index variables in a DO
statement or array indices
- Unsigned integers can be read or written in list-directed,
namelist or unformatted I/O, and by using the usual edit
descriptors such as I,B,O and Z
- Extension of ISO_C_BINDING with KIND numbers like C_UINT, C_UINT8_T, etc.
- Likewise, ISO_FORTRAN_ENV should be extended.
- Behavior on conversion to integer outside the range of the integer
should be processor-dependent, and identical to the behavior of
the companion C processor.
## 5.2 Behavior on overflow
In the discussion on GitHub, two possible behaviors on overflow were
considered: that overflow should be forbidden (using a "shall"
directive), or that values should wrap around.
The author of this proposal is of the opinion that wrap-around semantics
(modulo 2^n for an n-bit type) should be specified, for several reasons:
- It is required for several applications which would otherwise be left
to C, such as cryptography, hashes and big-integer arithmetic
- The standard does not (up to now) mandate run-time checks, and an
implementation which does not perform overflow checks would perform
the same operation as with modulo 2^n arithmetic
- Writing checks for overflow with user code is relatively straightforward.
For example,
```
c = a + b
if (c < a) then
   ! overflow occurred
end if
```
but this is only possible if the operation being checked is not itself
illegal. Otherwise, compilers will over time tend to remove such checks,
since the language definition implies they can never be true (compare
the removal of NULL pointer checks in C).
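As an illustration of an application that depends on wrap-around, here is a sketch of the 32-bit FNV-1a hash in C. FNV-1a is used purely as an example and is not mentioned in the GitHub discussion; the function name `fnv1a_32` is likewise an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash. The multiplication is meant to wrap modulo
   2^32; with an unsigned type this is well-defined behavior in C. */
uint32_t fnv1a_32(const unsigned char *data, size_t len)
{
    uint32_t hash = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;            /* FNV prime; wraps on overflow */
    }
    return hash;
}
```

Nearly every multiplication in the loop exceeds 32 bits, so a rule that forbids overflow would make this routine non-conforming, whereas modulo 2^n semantics define exactly the intended result.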
# 6. Relation to other proposals
This proposal complements the BITS proposal, J3/07-007r2.pdf, as
proposed in J3/22-195.txt. BITS restricts its operations to logical
operations and comparisons on bit lengths, whereas this proposal is
for values requiring arithmetic operations, and is less flexible
in bit length.