Options to replace EQUIVALENCE statements involving CHARACTER variables and CHARACTER arrays

FortranFan

unread,

Sep 22, 2016, 6:13:36 PM9/22/16

to

I'm working on some older code that involves EQUIVALENCE statements involving CHARACTER variables and CHARACTER arrays and I'm looking at options to replace such statements. What comes to my mind immediately is to use the facility that makes use of the C address of the variable. And I can't think of anything else.

The older code specifies EQUIVALENCE and uses the CHARACTER variable and the array interchangeably as follows:

-- begin code --
module m

implicit none

integer, parameter :: n = 2
integer, parameter :: l = 6

character(len=n*l) :: s
character(len=l) :: t(n)

equivalence (s, t)

end module m

program p

use m, only : s, t

integer :: i

! Some sections of code define 's'
s = "123456789012"

! Then perform operations involving 't'
do i = 1, size(t)
print *, t(i)
end do

! Other sections of code define 't'
t(1) = "Hello"
t(2) = "World!"

! Then perform operations involving 's'
print *, s

stop

end program p
-- end code --

Upon execution, the above leads to
-- begin output --
123456
789012
Hello World!

-- end output --

All I could think of achieve the same that doesn't involve too many changes in client code was to employ a setter method that takes the C address of a character scalar variable and associate a Fortran rank-one character array with POINTER attribute with it, as shown below.

-- begin code --
module m

use, intrinsic :: iso_c_binding, only : c_loc, c_f_pointer

implicit none

integer, parameter :: n = 2
integer, parameter :: l = 6

character(len=n*l), target :: s
character(len=l), pointer :: t(:) => null()

contains

subroutine set_t()

call c_f_pointer( c_loc(s), t, shape=[n] )

return

end subroutine set_t

end module m

program p

use m, only : s, t, set_t

integer :: i

! Consumer has to remember to invoke this
call set_t()

! Some sections of code define 's'
s = "123456789012"

! Then perform operations involving 't'
do i = 1, size(t)
print *, t(i)
end do

! Other sections of code define 't'
t(1) = "Hello"
t(2) = "World!"

! Then perform operations involving 's'
print *, s

stop

end program p

-- end code --

The above code 'works' ok with the toolsets and platforms of interest at present with the caveat (a big concern) that the 'consumer' handles the pointer 't' carefully and doesn't do other dangerous stuff with it.

Can readers suggest other, better options?

Thanks much,

herrman...@gmail.com

unread,

Sep 22, 2016, 8:06:46 PM9/22/16

to

On Thursday, September 22, 2016 at 3:13:36 PM UTC-7, FortranFan wrote:
> I'm working on some older code that involves EQUIVALENCE statements
> involving CHARACTER variables and CHARACTER arrays and I'm
> looking at options to replace such statements.
> What comes to my mind immediately is to use the facility that
> makes use of the C address of the variable.
> And I can't think of anything else.

Using C_LOC doesn't seem to be going in the right direction.

It doesn't make it easier to understand, for one.
(Well, maybe for some C programmers.)

> The older code specifies EQUIVALENCE and uses the CHARACTER
> variable and the array interchangeably as follows:

(snip)

> character(len=n*l) :: s
> character(len=l) :: t(n)
> equivalence (s, t)

You could use just t, and substring operations.

That is, replace t(i) with s(i*l-l+1,i*l).

if t(n) also uses substrings, then it is a little harder.

-- glen

Ian Harvey

unread,

Sep 22, 2016, 8:30:21 PM9/22/16

to

On 2016-09-23 08:13, FortranFan wrote:
> I'm working on some older code that involves EQUIVALENCE statements
> involving CHARACTER variables and CHARACTER arrays and I'm looking at
> options to replace such statements. What comes to my mind
> immediately is to use the facility that makes use of the C address of
> the variable. And I can't think of anything else.
>
> The older code specifies EQUIVALENCE and uses the CHARACTER variable
> and the array interchangeably as follows:

What's the motivation for the change?

I avoid the need for equivalence in the visible aspects of new code,
but, in isolation, its use in existing code isn't something that I see
as a particular concern.

The code design aspects around equivalence that do bother me (loosely -
I don't like one thing magically pretending to be something else) also
apply to the pointer approach, plus you now have pointer and target
attributes on the objects. The rules around what you can do with C_LOC
and C_F_POINTER (which have changed with standard version) also need to
be checked carefully (I think your current C pointer approach is
non-conforming as the argument to C_LOC is not an interoperable data
entity (the length of s is not one), and therefore the FPTR argument to
C_F_POINTER must be scalar).

herrman...@gmail.com

unread,

Sep 22, 2016, 9:35:41 PM9/22/16

to

On Thursday, September 22, 2016 at 5:30:21 PM UTC-7, Ian Harvey wrote:
> On 2016-09-23 08:13, FortranFan wrote:
> > I'm working on some older code that involves EQUIVALENCE statements
> > involving CHARACTER variables and CHARACTER arrays and I'm looking at
> > options to replace such statements. What comes to my mind
> > immediately is to use the facility that makes use of the C address of
> > the variable. And I can't think of anything else.

> > The older code specifies EQUIVALENCE and uses the CHARACTER variable
> > and the array interchangeably as follows:

> What's the motivation for the change?

I agree.

In the Fortran 66 days, EQUIVALENCE was sometimes used when
playing with character data, in ways that were non-standard, and
likely not so easy to follow.

But the OP code doesn't seem so bad.

Well, since it is sample code, not the real code, it is hard to say.

The substring method I suggested may or may not be more
readable, depending on the actual usage.

-- glen

FortranFan

unread,

Sep 22, 2016, 9:36:04 PM9/22/16

to

On Thursday, September 22, 2016 at 8:06:46 PM UTC-4, herrman...@gmail.com wrote:
> ..

>
> Using C_LOC doesn't seem to be going in the right direction.
>
> It doesn't make it easier to understand, for one.
> (Well, maybe for some C programmers.)
>

I agree, hence my question. Plus see Ian Harvey's comment downthread: it's not standard-conforming either.

> ..

>
> You could use just t, and substring operations.
>
> That is, replace t(i) with s(i*l-l+1,i*l).
>
> if t(n) also uses substrings, then it is a little harder.
>

Unfortunately your suggested replacement will be too much code change which we want to avoid; in addition, it may not work under all circumstances.

FortranFan

unread,

Sep 22, 2016, 10:06:57 PM9/22/16

to

On Thursday, September 22, 2016 at 8:30:21 PM UTC-4, Ian Harvey wrote:

> ..

>
> What's the motivation for the change?
>

Twofold:

1) Over the years, the code has been modified and maintained for periods of time by engineers well-versed in their technical fields and also other coding approaches but who have little to no knowledge of Fortran: explaining EQUIVALENCE to them has been a rather unpleasant experience.

2) This use of EQUIVALENCE has been at the back of my mind for a while now and I haven't been able to come up with a suitable and safe replacement using standard Fortran that doesn't involve a lot of modifications to existing code e.g., introduce TRANSFER statements. This has bothered me.

I've never bought into the text in the Fortran standard, ".. EQUIVALENCE .. is error-prone. Whilst use of these statements was invaluable prior to Fortran 90 they are now redundant and can inhibit performance." This is from section B.3.11, WD 1539-1 J3/16-007r1 (F2015 Working Document) 1st May 2016 16:07. I do not think the last statement from the standard is fully accurate.

The question I'm now asking myself with the help of this real-life use case is also whether an added facility for coders involving CHARACTER variables and CHARACTER arrays in pointer assignment was missed when the remapping of the elements of a rank-one array was permitted in pointer assignment starting with Fortran 2003. In a rather crude sense, can not the 't' in the example above be considered a rank-two array of CHARACTER(len=1) (C char) type and the 's' be thought of as rank-one array? So starting with Fortran 2003, if one can do:
..
integer, target :: a(m*m) ! say m is a constant
integer, pointer :: p(:,:)
..
p(1:m,1:m) => a(1:m*m)

then could I simply have been able to do:

character(len=n*l), target :: s
character(len=l), pointer :: t(:) => s

In my simple-minded thinking, I tend to view both of above as just involving association with length-type of information of the objects on either side of the assignment. But one is allowed by the standard while the other is not.

Thanks much for your reminding me of the considerations involving the C interoperability feature and the non-conformance.

Gary Scott

unread,

Sep 22, 2016, 10:11:55 PM9/22/16

to

On 9/22/2016 9:06 PM, FortranFan wrote:
> On Thursday, September 22, 2016 at 8:30:21 PM UTC-4, Ian Harvey wrote:
>
>> ..
>>
>> What's the motivation for the change?
>>
>
> Twofold:
>
> 1) Over the years, the code has been modified and maintained for periods of time by engineers well-versed in their technical fields and also other coding approaches but who have little to no knowledge of Fortran: explaining EQUIVALENCE to them has been a rather unpleasant experience.

I know hundreds of engineers (mostly EEs) who would understand this
concept easily. What's so difficult about it?

FortranFan

unread,

Sep 22, 2016, 11:03:09 PM9/22/16

to

On Thursday, September 22, 2016 at 10:06:57 PM UTC-4, FortranFan wrote:

> ..

>
> 2) This use of EQUIVALENCE has been at the back of my mind for a while now and I haven't been able to come up with a suitable and safe replacement using standard Fortran that doesn't involve a lot of modifications to existing code e.g., introduce TRANSFER statements. This has bothered me.
>
> I've never bought into the text in the Fortran standard, ".. EQUIVALENCE .. is error-prone. Whilst use of these statements was invaluable prior to Fortran 90 they are now redundant and can inhibit performance." This is from section B.3.11, WD 1539-1 J3/16-007r1 (F2015 Working Document) 1st May 2016 16:07. I do not think the last statement from the standard is fully accurate.
>

> ..

Upon further thought, I wonder if the Fortran 2008 feature of a 'pointer function' which allows "a reference to a pointer function is treated as a variable and is permitted in any variable-definition context" might be useful here.

I think with Fortran 2008 I should be able to do the following:

-- begin code --
module m

implicit none

private

integer, parameter, public :: n = 2
integer, parameter :: ell = 6

character(len=n*ell), target, public :: s

public :: t

contains

function t( idx ) result( substr )

integer, intent(in) :: idx
!.. Function result
character(len=ell), pointer :: substr

if ( (idx < 0).or.(idx > n) ) then
error stop
end if

substr => s((idx-1)*ell+1:idx*ell)

return

end function t

end module m

program p

use m, only : s, t, n

integer :: i

! Some sections of code define 's'
s = "123456789012"

! Then perform operations involving 't'

do i = 1, n

print *, t(i)
end do

! Other sections of code define 't'
t(1) = "Hello"
t(2) = "World!"

! Then perform operations involving 's'
print *, s

stop

end program p
-- end code --

gfortran (GCC 7.0 eperimental version) compiles the above code without errors or warnings and is able to execute the first part ok but it generates a backtrace at line 16 of the main program that has 't(1) = "Hello"' assignment; hmm.. wonder if it's a compiler error!

Anyone have any comments on this option, other than the classic refrain: "stay away from pointer functions"!

Thanks for your attention,

Louis Krupp

unread,

Sep 22, 2016, 11:52:42 PM9/22/16

to

<snip>

>Can readers suggest other, better options?

You *could*, depending on your platform, do this in a language that's
all about redefining storage. For example:

identification division.
program-id. pst.
environment division.
data division.
working-storage section.
01 s.
05 t pic x(6) occurs 2 times.
procedure division.
begin.
move "123456789012" to s.
display t(1).
display t(2).
move "Hello" to t(1).
move "World!" to t(2).
display s.
stop run.

GnuCobol is available free for Linux and Windows and possibly other
platforms. It generates C code that I'm sure you could link to
Fortran.

I'm sure your engineers would love it. :)

If that's not realistic, and you just want to get rid of equivalence,
you could try the reverse of what Glen suggested: Keep the array t(n)
as it is, and write (1) a set_s() subroutine that would take a long
string and split it between elements of t and (2) a get_s() function
that would return the concatenation of the elements of t.

Louis

Ian Harvey

unread,

Sep 22, 2016, 11:55:41 PM9/22/16

to

On 2016-09-23 12:06, FortranFan wrote:
> On Thursday, September 22, 2016 at 8:30:21 PM UTC-4, Ian Harvey
> wrote:
>
>> ..
>>
>> What's the motivation for the change?
>>
>
> Twofold:
>
> 1) Over the years, the code has been modified and maintained for
> periods of time by engineers well-versed in their technical fields
> and also other coding approaches but who have little to no knowledge
> of Fortran: explaining EQUIVALENCE to them has been a rather
> unpleasant experience.

If they come from a C background, call it a union.

The *why* of the situation is where confusion would probably set in for
me, not the *what* or *how*. Rhetorically, as each code has its own
story, but "Why are there two variables, that are really different
things but they have to be considered the same?"

Fortran has the concept of storage association generally, and sequence
association for array arguments and certain kinds of character scalar
arguments. Those features practically constraints the layout of arrays
in memory - implementations cannot have padding between rows of the
array for example, even though doing so might be more efficient in some
situations. Because implementations are already practically
constrained, the implementation of pointer rank remapping is
incrementally zero cost from an efficiency point of view.

The storage association concept and sequence association also applies to
character scalars, in the way that you anticipate, but only for default
and C_CHAR character kind. For other character kinds it does not apply,
hence, for other character kinds, implementations have the ability to
layout arrays of strings in a manner that best suits them (each element
in the array could start on an address with a certain alignment, for
example). Different alignment requirements is a reasonable reason for
why an implementation might have a non-default character kind that still
had the same character set as for default kind. Permitting pointer rank
remapping in this situation would remove that flexibility - it is not
zero cost efficiency wise. That flexibility is practically removed even
if a program doesn't use the feature.

The implications for implementations around storage, even if the
features aren't used, are part of the argument against equivalence and
common being in the language, and is used as an argument against having
sequence association in the language. By extension it could then also
be an argument against permitting rank remapping!

paul.rich...@gmail.com

unread,

Sep 23, 2016, 3:47:23 AM9/23/16

to

ifort yields:
clf.f90(49): error #5415: Feature not yet implemented: Assignment to pointer function
t(1) = "Hello"
---^
compilation aborted for clf.f90 (code 1)

The ICE that gfortran produces is clearly wrong and an appropriate error message should be produced.

This is PR77703

Paul

Bálint Aradi

unread,

Sep 23, 2016, 4:04:27 AM9/23/16

to

>
> Can readers suggest other, better options?

In case, the F2008 pointer function trick does not work with the compiler of your choice, a minimal string type could emulate similar behaviour. It is purely Fortran 2003, but the changes could go over that you are willing to do to your code.

---BEGIN CODE
module typedefs
implicit none

integer, parameter :: nn = 2
integer, parameter :: ll = 6

type :: String
character(len=nn*ll) :: str
end type String

type :: StringRef
character(len=ll), pointer :: ptr
end type StringRef

interface assignment(=)
module procedure String_assign
module procedure StringRef_assign
end interface assignment(=)

interface char
module procedure String_toChar
module procedure StringRef_toChar
end interface char

contains

subroutine String_assign(this, rhs)
type(String), intent(out) :: this
character(*), intent(in) :: rhs

this%str = rhs

end subroutine String_assign

subroutine StringRef_assign(this, rhs)
type(StringRef), intent(inout) :: this
character(*), intent(in) :: rhs

this%ptr = rhs

end subroutine StringRef_assign

function String_toChar(this) result(toChar)
type(String), target, intent(in) :: this
character(len=len(this%str)), pointer :: toChar

toChar => this%str

end function String_toChar

function StringRef_toChar(this) result(toChar)
type(StringRef), target, intent(in) :: this
character(len=len(this%ptr)), pointer :: toChar

toChar => this%ptr

end function StringRef_toChar

function getReferences(this) result(refs)
type(String), target, intent(in) :: this
type(StringRef) :: refs(nn)

integer :: ii

do ii = 1, nn
refs(ii)%ptr => this%str((ii - 1) * ll + 1 : ii * ll)
end do

end function getReferences

end module typedefs

program test
use typedefs
implicit none

type(String), target :: ss
type(StringRef) :: tt(nn)
integer :: ii

tt(:) = getReferences(ss)

ss = '123456789012'

do ii = 1, size(tt)
print *, char(tt(ii))
end do

tt(1) = 'Hello'
tt(2) = 'World'

print *, char(ss)

end program test
---END CODE

FortranFan

unread,

Sep 23, 2016, 9:13:58 AM9/23/16

to

On Friday, September 23, 2016 at 3:47:23 AM UTC-4, paul.rich...@gmail.com wrote:

> ..

>
> The ICE that gfortran produces is clearly wrong and an appropriate error message should be produced.
>
> This is PR77703

> ..

Paul,

Interestingly, I do NOT get an ICE. Instead a run-time exception is encountered:

123456
789012
At line 16 of file C:\dev\Fortran\temp\sor\p.f90
Fortran runtime error: Unequal character lengths (1961528533/6) in pointer assig
nment

Error termination. Backtrace:

Thanks much for the PR, perhaps you can add the above shown error as a note?

FortranFan

unread,

Sep 23, 2016, 9:15:45 AM9/23/16

to

On Friday, September 23, 2016 at 4:04:27 AM UTC-4, Bálint Aradi wrote:

>
> In case, the F2008 pointer function trick does not work with the compiler of your choice, a minimal string type could emulate similar behaviour. It is purely Fortran 2003, but the changes could go over that you are willing to do to your code.

> ..

@Bálint Aradi,

Thanks much for your response; I'll keep it in mind.

Regards,

FortranFan

unread,

Sep 23, 2016, 9:41:50 AM9/23/16

to

On Thursday, September 22, 2016 at 11:55:41 PM UTC-4, Ian Harvey wrote:

> .. For other character kinds it does not apply,

> hence, for other character kinds, implementations have the ability to
> layout arrays of strings in a manner that best suits them (each element
> in the array could start on an address with a certain alignment, for
> example). Different alignment requirements is a reasonable reason for
> why an implementation might have a non-default character kind that still
> had the same character set as for default kind. Permitting pointer rank
> remapping in this situation would remove that flexibility - it is not
> zero cost efficiency wise. That flexibility is practically removed even
> if a program doesn't use the feature.
>

> ..

Thanks again for your detailed comments, but I'm not sure I quite follow. The language has type, kind, and rank (TKR) restrictions on pointer remapping and I would expect any potential enhancement to character variables and arrays will retain the same. So I would expect whatever an implementation might be doing for a non-default character kind would be immaterial, especially when the remapping that is permitted is to a rank-one array as is the case currently in the standard. John Reid writes in "The New Features of Fortran 2003", "The limitation to rank-one arrays is because pointer arrays need not occupy contiguous storage: .. but all the gaps have the same length in the rank-one case."

Surely when the language allows the following:

type, public :: t
end type t

integer, parameter :: n = 2

type(t), target :: a(n*n)
type(t), pointer :: p(:,:)

p(1:n,1:n) => a

where t can be any derived type, why not

character(kind=some_char_kind,len=n*ell), target :: s
character(kind=some_char_kind,len=ell), pointer :: t(n) => s

Looks like an oversight to me.

Thanks,

FortranFan

unread,

Sep 23, 2016, 9:46:05 AM9/23/16

to

On Thursday, September 22, 2016 at 11:52:42 PM UTC-4, Louis Krupp wrote:

> .. It generates C code that I'm sure you could link to

> Fortran.
>
> I'm sure your engineers would love it. :)
>

> ..

@Louis Krupp,

Thanks much for your input. Actually I just found out this morning there is a variation of this code managed by another team where they do effectively what you say i.e., do the storage association in C; I don't have the details but I assume it's something simple given the simple use case. That's an option too.

Regards,

Gordon Sande

unread,

Sep 23, 2016, 10:27:18 AM9/23/16

to

You never desribe the application of the character variables. Your
little example
suggests that that it is some sort of symbol table with fixed sized
symbols. There
may be some reason for wanting to concatenate all the symbols into a
longer string.

A first suggestion might be to use a 2-d array of character with the
rows for the
fixed symbols. Some amount of reworking subscripts, ranges and slices that is
just tedious with a pattern driven text editor.

Without a real desctipion of the application of the string manipulation
you are not
likely to get much insightful advice.:-(

FJ

unread,

Sep 23, 2016, 1:28:03 PM9/23/16

to

I did'n see whether somebody has shown a pure F2003 solution. Here my
proposal :

module m

implicit none

character(32) :: text="Test without iso_c_binding"
character,pointer :: ptext(:)=>null()

contains

subroutine set_ptext()
call set(text,len(text),ptext)
end subroutine

subroutine set(string,lstring,pstring)
integer ,intent(in) :: lstring
character,intent(in),target :: string(lstring)
character,pointer,intent(out) :: pstring(:)
pstring => string
end subroutine

end module

program main
use m
implicit none
call set_ptext
write(*,*) ptext
end program

FJ

unread,

Sep 23, 2016, 1:55:45 PM9/23/16

to

I did'n see whether somebody has shown a pure F2003 solution without
iso_c_binding. Here my proposal :

FortranFan

unread,

Sep 23, 2016, 2:23:55 PM9/23/16

to

FJ,

I really appreciate your input. From what I can comprehend, what you propose looks very clean, compact, and efficient that makes use of native Fortran features. It's likely the best option if we decide to be rid of EQUIVALENCE in this code.

Thanks much,

herrman...@gmail.com

unread,

Sep 23, 2016, 4:48:56 PM9/23/16

to

On Friday, September 23, 2016 at 11:23:55 AM UTC-7, FortranFan wrote:

(snip)

> I really appreciate your input. From what I can comprehend,
> what you propose looks very clean, compact, and efficient that
> makes use of native Fortran features. It's likely the best option
> if we decide to be rid of EQUIVALENCE in this code.

You still didn't say why you want to get rid of EQUIVALENCE.

Not that there shouldn't be reasons, but the result should at least
be more readable.

Seems to me that with appropriate comments, you could explain
what each of the lines using s and t are doing.

And to better answer the question, we need to know more
exactly what they are doing. So far, we only have the program
you posted, which is too simple to be the real question.

Ian Harvey

unread,

Sep 23, 2016, 6:03:51 PM9/23/16

to

Take `ell` to be three and `n` to be four, and some_char_kind to be not
the value for default character kind and not C_CHAR, and lets test my
ascii diagram skills.

If s contains the letters from 'a' through 'l' in order, the language
practically requires (substring operations, rules around storage
association) that the layout of s in its relevant storage unit be:

| 1 1 |
| 0 1 2 3 4 5 6 7 8 9 0 1 |
| |
| a b c d e f g h i j k l |

The layout of a rank one array of length three scalars is not so
constrained. Using the same storage unit definition as for the single
scalar case above, it could be laid out:

| 1 1 1 1 1 1 1 |
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 |
| |
| a b c . d e f . g h i . j k l . |

where `.` indicates padding, inserted to permit each array element to
start on a storage unit offset that the processor requires for efficient
access to the array element.

Those two layouts can currently be incompatible. Your proposal would
require them to be the same (so that "all the gaps have the same
length", as you quote from John Reid, would be true), practically, even
if a program did not use the proposed feature.

(If your proposal only applied to default character or C_CHAR character
kind scalars, then it would not introduce any additional constraints on
storage.)

The significance of this additional constraint could certainly be
debated, today no one would probably notice because the
non-default/non-C_CHAR character kinds are not common. But there is
more than a hint of regret around the implications that things like
sequence and storage association have on alignment requirements and
optimisation today. I suspect that regret is one of the reasons that
non-default, non-C_CHAR character kinds have been explicitly excluded
from the sequence and storage association rules.

Pointer rank remapping in the language currently only applies to arrays
(rank one or simply contiguous). The rules around sequence and storage
association already exclude the possibility of padding between higher
dimensions of the basic storage of an array, that is anything different
to the padding that exists between adjacent elements within a rank.
From an alignment/optimisation point of view, that horse has already
bolted.

Array subscripting can then mean that basic array storage is not
referenced contiguously, but in an argument association context where
you are going from a potentially non-contiguous or uneven reference to a
procedure that assumes contiguous or evenly spaced storage (e.g. vector
subscripted actual, or explicit shape or assumed size dummy), the
argument passing rules are such that the compiler can make a copy just
for the life of the associated procedure reference - it cannot do that
for pointer association because there is no well defined lifespan for
the copy.

Terence

unread,

Sep 23, 2016, 6:16:35 PM9/23/16

to

"FortranFan" wrote in message
news:00dd94aa-84ba-46df...@googlegroups.com...

>I'm working on some older code that involves EQUIVALENCE statements
>involving >CHARACTER variables and CHARACTER arrays and I'm looking at
>options to replace such >statements.

Here is an option:-

The use of character strings is usually in order to communicate with the
user of the application.

I have always used a technique which has all the communication text outside
the application. This requires a text file containing all the needed
language and individual text strings as an ID number, followed by an message
line number, followed by the text for that line of that ID number; all in
whatever text-coding method the language requires. Lets say the assigned
file name is 'language.txt'. In practice, the first record can be a header
which defines colour palates and the basic symbol for YES and NO to be
initially used.

In the program, there will be a matrix of pointers, indexed by the ID
number, of the number of line text entries for that ID, and an associated
matrix of the pointers to the storage area of the first text line for that
index number.

Then the program first reads this 'language.txt' file and reads any optional
header information, then for the rest of the file, the ID number and index
number and the following text until the end-of-line signal (usually a cr-lf
or lf code) is met and stores the text sequentially in a large character or
word-based matrix. And continues till end-of-file signal is met

The pointer matrix indexed by ID has the line count updated and the
associated matrix of text pointers has the new pointer added to it.

The length of each line entry is defined by the difference of the storage
addresses of the desired line and the start of the next.

A single routine is called to pass the ID number and the line index number
to return the needed pointer to the text line and its length, whenever a
communication is to be made to the user. (In the same way, responses can be
matched to provide a response number)

The application can then be provided with the address of whatever language
file is needs to use, and switching languages in mid-run is easily performed
as well.

Terence ('Oldster')

Gary Scott

unread,

Sep 23, 2016, 10:52:46 PM9/23/16

to

Yes, and some of the solutions posted are much uglier than
equivalence...KISS should apply

FortranFan

unread,

Sep 23, 2016, 11:31:01 PM9/23/16

to

On Friday, September 23, 2016 at 6:03:51 PM UTC-4, Ian Harvey wrote:

> ..
>

> Take `ell` to be three and `n` to be four, and some_char_kind to be not

> the value for default character kind and not C_CHAR, ..

>
> Those two layouts can currently be incompatible. Your proposal would
> require them to be the same (so that "all the gaps have the same
> length", as you quote from John Reid, would be true), practically, even
> if a program did not use the proposed feature.
>
> (If your proposal only applied to default character or C_CHAR character
> kind scalars, then it would not introduce any additional constraints on
> storage.)
>

> .. But there is

> more than a hint of regret around the implications that things like
> sequence and storage association have on alignment requirements and
> optimisation today. I suspect that regret is one of the reasons that
> non-default, non-C_CHAR character kinds have been explicitly excluded
> from the sequence and storage association rules.
>

> ..

I see the following in the standard:

section 16.3.5.4: "Partial association shall exist only between an object that is default character or of character sequence type and an object that is default character or of character sequence type .."

and a bit later in the same section,

"For character entities, partial association may occur only through argument association or the use of COMMON or EQUIVALENCE statements."

So are you saying when it comes to character storage units of a non-default or non-c_char kind, the association that occurs with argument association or the use of EQUIVALENCE cannot be replicated in an enhanced facility of pointer remapping? FJ's suggestion upthread nicely combines what the facility of argument association with pointer assignment, I wish I can better understand why the language can't offer an abstraction along such lines which a coder can use:

-- begin code --
module m

use, intrinsic :: iso_fortran_env, only : character_kinds
use, intrinsic :: iso_c_binding, only : c_loc, c_intptr_t

implicit none

integer, parameter :: CK = character_kinds(2)
integer, parameter :: n = 3
integer, parameter :: ell = 4

character(kind=CK, len=n*ell), target :: s
character(kind=CK, len=ell), pointer :: t(:) => null()

private :: set

contains

subroutine set_t()

call set(s,t)

end subroutine set_t

subroutine set(string, pstring)

character(kind=CK,len=ell), intent(in), target :: string(n)
character(kind=CK,len=ell), intent(out), pointer :: pstring(:)

pstring(1:n) => string(1:n)

end subroutine set

subroutine output_addresses( x, s1, s2 )

integer, intent(in) :: x
character(kind=CK, len=*), intent(in), target :: s1
character(kind=CK, len=*), intent(in), target :: s2

integer(c_intptr_t) :: add_1
integer(c_intptr_t) :: add_2

add_1 = transfer( c_loc(s1), mold=add_1)
add_2 = transfer( c_loc(s2), mold=add_2)
print "(1x,i0,3x,z0,24x,z0)", x, add_1, add_2

return

end subroutine output_addresses

end module m
program p

use m, only : s, t, output_addresses, n, ell, set_t

implicit none

integer :: i
integer :: j
integer :: k

call set_t()

do i = 1, n*ell
s(i:i) = achar(96+i, kind=kind(S))
end do

print *, "Values"
print *, "i j k s(k:k) t(i)(j:j)"

do i = 1, n

do j = 1, ell
k = (i-1)*ell + j
print "(1x,i0,3x,i0,3x,i0,3x,g0,7x,g0)", i, j, k, s(k:k), t(i)(j:j)
end do
end do

print *, "Addresses in memory"
print *, "i s((i-1)*ell+1:(i-1)*ell+1) t(i)"

do i = 1, n

call output_addresses( i, s((i-1)*ell+1:(i-1)*ell+1), t(i) )
end do

stop

end program p
-- end code --

Upon execution with gfortran which supports a second character kind,

-- begin output --
Values
i j k s(k:k) t(i)(j:j)
1 1 1 a a
1 2 2 b b
1 3 3 c c
1 4 4 d d
2 1 5 e e
2 2 6 f f
2 3 7 g g
2 4 8 h h
3 1 9 i i
3 2 10 j j
3 3 11 k k
3 4 12 l l
Addresses in memory **
i s((i-1)*ell+1:(i-1)*ell+1) t(i)
1 448020 448020
2 448030 448030
3 448040 448040

-- end output --

** the use of c_loc is questionable but I figure it's handy here.

So to summarize, my silly 'brainwave' was if a coder could do above, can't the following be allowed as a compact representation:

character(kind=CK, len=n*ell), target :: s
character(kind=CK, len=ell), pointer :: t(1:n) => s

So if I understand your point correctly, it may work with c_char and default char kinds of most compilers today and also perhaps (based on above example) with CHARACTER_KINDS(2) in gfortran, but not so in the general case. Interesting, though I'll never quite understand the finer details, suffice it to say it is yet another thought that goes up in flames!

Thanks for your attention,

FortranFan

unread,

Sep 24, 2016, 12:35:43 AM9/24/16

to

On Friday, September 23, 2016 at 4:48:56 PM UTC-4, herrman...@gmail.com wrote:

> ..
>

> And to better answer the question, ..

I think the matter has been discussed sufficiently, I'm not awaiting any "better" response. Thanks for your interest.

The use case is indeed as simple as shown in the original post. EQUIVALENCE proves a safe and succinct approach for how some character data are handled in this code, the standard having marked it as obsolescent and redundant can be kept in mind while writing new code.

Ian Harvey

unread,

Sep 24, 2016, 7:07:51 PM9/24/16

to

For non-default/non-C_CHAR character, there is no such
argument/equivalence association to replicate.

Sequence association between actual and dummy arguments only applies
between character scalars and character arrays that are default or
C_CHAR kind (F2008 12.5.2.11 p2 and p3, F2008 12.5.2.4p3 and p4).
Otherwise type and type parameters are required to match.

The rules around storage association, that apply to EQUIVALENCE and
COMMON, only permit association between character scalars and character
arrays when the character kind is default character (no special
treatment for C_CHAR even, assuming it is different). Otherwise F2008
C594 requires a match in type and type parameters for EQUIVALENCE, and
5.7.2.4p5 similarly for COMMON.

The length parameter is a type parameter.

Note also that the list of rules for storage sequences don't explicitly
call out non-default, non-C_CHAR character (go through the list in
16.5.3.2p2). In a storage association context, a non-default,
non-C_CHAR character scalar can not be considered as a sequence of its
individual characters - you can only line up the storage sequences of
scalars or array elements when the length type parameter matches.

C-interop aside (and perhaps some aspects of sequence association), the
current storage association rules reflect an unfortunate bleed-through
of underlying early language implementations into the language
specification - I consider them to there more by accident, not by design.

It is unfortunate, because modern implementations of the modern language
are still constrained by whatever choices the early language
implementations made, even though there may be much better ways of
implementing things these days, if things like storage association were
not present in the language. I think this is part of the reason why
there are moves to obsolete the storage association related features in
the language.

Newer language features, such as non-default kinds, are not constrained
by whatever was done in early implementations (because there were no
earlier implementations), consequently the rules around storage
association do not cover them. No need to compromise an implementation
to support a feature that a program may never use.

Instead, similar capability is provided though things like TRANSFER and
SEQUENCE types - which have a very local effect on implementations.

> -- begin code -- module m
> -- begin code --
> module m
>
> use, intrinsic :: iso_fortran_env, only : character_kinds
> use, intrinsic :: iso_c_binding, only : c_loc, c_intptr_t
>
> implicit none
>
> integer, parameter :: CK = character_kinds(2)
> integer, parameter :: n = 3
> integer, parameter :: ell = 4
>
> character(kind=CK, len=n*ell), target :: s
> character(kind=CK, len=ell), pointer :: t(:) => null()
>
> private :: set
>
> contains
>
> subroutine set_t()
>
> call set(s,t)

If CK is not default or C_CHAR (and `n` is not one), then this call is
non-conforming - "the type parameter values of the actual argument shall
agree with the corresponding ones of the dummy argument that are not
assumed, except... (default character/C_CHAR)" (12.5.2.4p3).

The length type parameter of the `s` actual argument is 12, but the
length type parameter of the `string` dummy argument below is 4.

Given the length parameters are constant expressions, it is conceivable
that a compiler could issue a diagnostic for this, though it is not
required to. It is also conceivable that a debugging runtime could
issue a runtime error.

Further "if the actual argument is a noncoindexed scalar, the
corresponding dummy argument shall be scalar unless the actual argument
is ... (default character/C_CHAR) ..., or is an element or substring of
an array that is not an assumed-shape, pointer, or polymorphic array"
(12.5.2.4p13).

The actual argument is scalar, it is not an array element or substring
or an array, but the dummy argument is not scalar.

It is conceivable that a compiler could issue a diagnostic for this too.