
Must declared length of a character function be matched by any procedure declaring it?

John

Sep 1, 2009, 1:06:04 AM
I'm wondering if I have developed an unnecessary habit. I have assumed
that if I had a character function that was declared with an explicit
length, any routine that used the function had to declare it to be
of the same length (this goes back to pre-module/pre-interface days).

Somewhat in support of that, someone was having problems with a large
mostly f77-ish code where strings were being returned with bad values
by a function where there was a mismatch in lengths; when changes were
made so the length was consistent, the problem went away.

But then I made a short test program while discussing this issue with
the code's developer, assuming the compiler would quickly support that
this was the case (by failing). But to my chagrin the following little
test got no warning messages from any compiler I tried ...

program testit
character(len=8),external::bigfunc ! declare function as 8 characters
write(*,*)'string=['//bigfunc()//']'
end program testit
character(len=16) function bigfunc() ! function is defined as 16 characters
bigfunc='ABCDEFGHIJKLMNOP'
end function bigfunc

So if the function declares itself to be of one length, but references
to it use another, is the behavior just equivalent to
stringa=stringb
as far as truncation or padding to the right with blanks (depending of
course on which string variable is bigger); or is this
prohibited? The answer didn't leap out at me from the standard and I'm
hoping someone knows off the cuff with authority?

The good news for me would be that this code hasn't been doing something
wrong for a considerable time; the bad news is that I will then be in
the position of not knowing why the changes that were made fixed the
problem -- and some of the answers are not good (probably smacking memory
somewhere else, for example).

John Urban


Richard Maine

Sep 1, 2009, 2:03:03 AM
John <urba...@comcast.net> wrote:

> I have assumed
> that if I had a character function that was declared with an explicit
> length that any routine that used the function had to declare it to be
> of the same length (this goes back to pre-module/pre-interface days).

That is correct. There is also the case of assumed character length, but
that is a definite oddity (and doesn't fall under your explicit length
case).

> But to my chagrin the following little
> test got no warning messages from any compiler I tried ...

There are many errors that are typically not diagnosed by compilers.
That includes most cases of mismatched procedure arguments (and function
returns) in f77-style external procedures. Such things were very common
sources of sometimes hard-to-find bugs in f77 code; that's one of the
many things that modules do much better.
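
A minimal sketch of the module approach, assuming a compiler with Fortran 90 module support; the module name bigfunc_mod is hypothetical and not taken from the code under discussion. The result length becomes part of the explicit interface that the module exports, so a caller that uses the module cannot declare a different length for bigfunc on its own:

```fortran
module bigfunc_mod
  implicit none
  private
  public :: bigfunc
contains
  ! The result length (16) is part of the explicit interface
  ! that every program unit using this module sees.
  function bigfunc() result(s)
    character(len=16) :: s
    s = 'ABCDEFGHIJKLMNOP'
  end function bigfunc
end module bigfunc_mod
```

A caller would write "use bigfunc_mod, only: bigfunc" and drop its own "character(len=8),external::bigfunc" declaration; len(bigfunc()) is then 16 in every caller.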

> So if the function declares itself to be of one length, but references
> to it use another...or is this prohibited?

This is just plain prohibited. And it is very much the kind of error
that I could imagine lying undetected for a long time.

> The answer didn't leap out at me from the standard and I'm
> hoping someone knows off the cuff with authority?

I'm quite sure of the answer. Having a bit of trouble finding it stated
quite as clearly as one might like in the standard. I see it implied,
but I'd have hoped for something better. Didn't spend a lot of time
looking.

In f2003, see 12.2.2, "Characteristics of function results." Included in
the long list of things that are characteristics is "type parameters".
Character length is a type parameter.

Then see 12.3, where it says "The characteristics of a procedure are
fixed, but the remainder of the interface may differ in different
scoping units." I take that as implying the restriction, but I'd like it
said more clearly. Hmm.

Well, if you have an explicit interface, the condition is pretty
explicit. In 12.3.2.1, 6th para after all the constraints:

"If an explicit specific interface is specified by an interface body or
a procedure declaration statement (12.3.2.3) for an external procedure,
the characteristics shall be consistent with those specified in the
procedure definition, except...[exception irrelevant]".

12.3.2.5 has stuff about implicit interfaces. It says that the function
result is specified in the obvious ways and it says that the actual
arguments have to be consistent with the characteristics of the dummy,
but it doesn't explicitly say that the function result has to be right.

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

James Van Buskirk

Sep 1, 2009, 2:40:38 AM
"John" <urba...@comcast.net> wrote in message
news:2a7f5e48-4679-4851...@p9g2000vbl.googlegroups.com...

> program testit
> character(len=8),external::bigfunc ! declare function as 8 characters
> write(*,*)'string=['//bigfunc()//']'
> end program testit
> character(len=16) function bigfunc() ! function is defined as 16 characters
> bigfunc='ABCDEFGHIJKLMNOP'
> end function bigfunc

My view of what to expect here is motivated by two considerations:
1) F77 allows functions to declare themselves to be CHARACTER*(*).
2) F77 has no provision for dynamic memory.

Given these conditions, a function at compile time might have no way
of knowing what its length would be and indeed its length could be
different when invoked from different program units. Since the lack
of dynamic memory meant that it could not allocate memory for its
result at run time, it needed the memory to be given to it by its
caller. The most usual implementation is for the caller to pass
a pointer to memory big enough to hold the result and also the length
of the function result. Since the caller has no way of knowing
whether the callee was fixed length like CHARACTER*(16) or assumed
length, CHARACTER*(*), every invocation of a character-valued function
had to be the same: pointer to result address and result length.

This goes forward into f03 when the interface is implicit as here.
Now, think about what is going to happen when your function is
invoked: program testit will allocate 8 bytes it knows are safe to
write to and tell function bigfunc that it expects a result of
length 8. Function bigfunc already knows what its length is,
however, so it will ignore that information and write 16 bytes to
the address it has been given, overwriting 8 dangerous bytes.

If you really want to see bad things happen, make bigfunc much
longer and call it from another procedure. Then, if the staging
area for bigfunc's result variable is on the stack, it should
probably overwrite somebody's return address. Let's try to
create an example like that:

C:\gfortran\clf\illegal_len>type illegal_len.f90
program main
implicit none
character(4) level1
character(4) answer
integer i

do i = 1024, 1, -1023
write(*,'(a,i0)') 'Invoking level1 with i = ', i
answer = level1(i)
write(*,'(a)') 'Back from level1'
write(*,'(a,i0,2a)') 'i = ', i, ', answer = ', answer
end do
end program main

function level1(i)
implicit none
integer i
character(4) level1
character(i) bigfunc

level1 = bigfunc()
end function level1

function bigfunc()
implicit none
character(1024) bigfunc

bigfunc = repeat('b',len(bigfunc))
end function bigfunc

C:\gfortran\clf\illegal_len>gfortran -Wall illegal_len.f90 -frecursive -oillegal_len

C:\gfortran\clf\illegal_len>illegal_len
Invoking level1 with i = 1024
Back from level1
i = 1024, answer = bbbb
Invoking level1 with i = 1
Back from level1
i = 1, answer = b

Oog, that worked! Seems that gfortran allocated (maybe too much?) memory
on the heap instead of the stack for the result of bigfunc in level1 so
that it didn't overwrite a return address after all. As for ifort:

C:\gfortran\clf\illegal_len>ifort illegal_len.f90
Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 9.1
Build 20061104
Copyright (C) 1985-2006 Intel Corporation. All rights reserved.

illegal_len.f90(19) : Error: This passed length character name has been used in an invalid context. [BIGFUNC]
character(i) bigfunc
----------------^
compilation aborted for illegal_len.f90 (code 1)

I don't know whether the problem is with my code or the old
version of ifort. I presume that accessing characters past the
declared length (in the caller) of the function result in the
callee is illegal, but I haven't been able to demonstrate an error
caused by this usage. Sorry.

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end


robin

Sep 1, 2009, 9:50:47 AM
"John" <urba...@comcast.net> wrote in message
news:2a7f5e48-4679-4851...@p9g2000vbl.googlegroups.com...
> I'm wondering if I have developed an unnecessary habit. I have assumed
> that if I had a character function that was declared with an explicit
> length that any routine that used the function had to declare it to be
> of the same length (this goes back to pre-module/pre-interface days).
>
> Somewhat in support of that someone was having problems with a large
> mostly f77-ish code where strings were being returned with bad values
> by a function where there was a mismatch in lengths; when changes were
> made so the length was consistent the problem went away.
>
> But then I made a short test program while discussing this issue with
> the code's developer, assuming the compiler would quickly support that
> this was the case (by failing). But to my chagrin the following little
> test got no warning messages from any compiler I tried ...

That's because the code is in F77 mode, and errors like that usually
cannot be diagnosed.

> program testit
> character(len=8),external::bigfunc ! declare function as 8 characters
> write(*,*)'string=['//bigfunc()//']'
> end program testit
> character(len=16) function bigfunc() ! function is defined as 16 characters
> bigfunc='ABCDEFGHIJKLMNOP'
> end function bigfunc

Try the following:

program test
character(len=8) :: t
t = bigfunc()
print *, t
print *, '[' // bigfunc() // ']'

contains

function bigfunc() result (s)
character(len=16) :: s
s = 'ABCDEFGHIJKLMNOP'
return
end function bigfunc
end program test

> So if the function declares itself to be of one length, but references
> to it use another, is the behavior just equivalent to
> stringa=stringb
> as far as truncation or padding to the right with blanks (depending of
> course on which string variable is bigger); or is this
> prohibited?

Definitely prohibited. It always has been.
The reason that the error is not detected is that there is no explicit
interface. Without an explicit interface, the compiler is relying on YOU
to get it correct. If you don't get it correct, anything can happen
because the program is wrong. Typical symptoms are a program crash or,
as you noticed, wrong results.

In Fortran 90 and later, a character function can return a string
of any length. The length can be constant, or it can depend on
the value or length of some argument. Don't forget to use an
explicit interface though.
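
A minimal sketch of both cases, assuming the functions live in a module so that the interface is explicit; the names varlen_mod, stars and reversed are hypothetical and chosen only for illustration:

```fortran
module varlen_mod
  implicit none
contains
  ! Result length depends on the VALUE of an argument.
  ! max(0, n) keeps the length and the repeat count non-negative.
  function stars(n) result(s)
    integer, intent(in) :: n
    character(len=max(0, n)) :: s
    s = repeat('*', max(0, n))
  end function stars

  ! Result length depends on the LENGTH of an argument.
  function reversed(str) result(r)
    character(len=*), intent(in) :: str
    character(len=len(str)) :: r
    integer :: i
    do i = 1, len(str)
      r(i:i) = str(len(str)-i+1:len(str)-i+1)
    end do
  end function reversed
end module varlen_mod
```

With "use varlen_mod" in the caller, the compiler knows the result length at each reference, so the caller has no separate length declaration that could drift out of step.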

robin

Sep 1, 2009, 9:50:48 AM
"John" <urba...@comcast.net> wrote in message
news:2a7f5e48-4679-4851...@p9g2000vbl.googlegroups.com...
> I'm wondering if I have developed an unnecessary habit. I have assumed
> that if I had a character function that was declared with an explicit
> length that any routine that used the function had to declare it to be
> of the same length (this goes back to pre-module/pre-interface days).
>
> Somewhat in support of that someone was having problems with a large
> mostly f77-ish code where strings were being returned with bad values
> by a function where there was a mismatch in lengths; when changes were
> made so the length was consistent the problem went away.

You can avoid a mis-match in lengths the following way:

program testit
character(len=8),external::bigfunc ! declare function as 8 characters
write(*,*)'string=['//bigfunc()//']'
end program testit

character(len=*) function bigfunc() ! function takes length from caller
bigfunc='ABCDEFGHIJKLMNOP'
end function bigfunc

Here the length of the string returned by the function is determined by
the calling program, and thus the returned length is 8, not 16.

> So if the function declares itself to be of one length, but references
> to it use another, is the behavior just equivalent to
> stringa=stringb

The above performs an equivalent action to what you suggest.


John

Sep 2, 2009, 9:36:30 PM
On Sep 1, 9:50 am, "robin" <robi...@bigpond.com> wrote:
> "John" <urbanj...@comcast.net> wrote in message

Thanks to all for the answers. I was starting to think I had been making
life unnecessarily complicated. A good reminder not to depend on the
compiler as a diagnostic tool too much. One more reason to bite the bullet
and put more code into modules. 11,342,174 lines of pre-f90 code to go --
better go get some espresso.
Thanks again!

Richard Maine

Sep 2, 2009, 10:03:55 PM
John <urba...@comcast.net> wrote:

> One more reason to bite the bullet and put more code into modules.
> 11,342,174 lines of pre-f90 code to go -- better go get some espresso.

If you are updating code by doing such things as putting it into
modules, you might want to keep in mind that assumed-length character
functions are officially obsolescent as of f95. They always were a bit
of a strange wart, misunderstood by most people who tried to use them.
To quote from B.2.5 in the f95 standard,

"Assumed character length for functions is an irregularity in the
language since elsewhere in Fortran the philosophy is that the
attributes of a function result depend only on the actual arguments
of the invocation and on any data accessible by the function..."

There are quite a few other oddities about such functions. One that
seems relevant relates to your comment about putting code into modules.
Although you can put an assumed-length character function in a module,
you cannot then reference it. (The reasons why are left as an exercise
for the student). I think I recall a debate about whether to disallow
such functions from being put into a module, since that makes them
useless and silly. I guess the conclusion must have been that Fortran
doesn't go out of its way to restrict against things just because they
are useless and silly (or a lot of existing code would be illegal).

Anyway, the short of it is that I recommend against using assumed-length
character functions.
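
One possible alternative, sketched under the assumption that the compiler implements Fortran 2003 deferred-length character results (not every compiler does); deferred_mod and bracketed are hypothetical names used only for illustration. The function itself decides the result length, which is the usual Fortran philosophy that assumed-length functions break:

```fortran
module deferred_mod
  implicit none
contains
  ! The result is allocated on assignment to exactly the length
  ! of the expression, so no caller ever declares a length for it.
  function bracketed(str) result(s)
    character(len=*), intent(in) :: str
    character(len=:), allocatable :: s
    s = '[' // trim(str) // ']'
  end function bracketed
end module deferred_mod
```

Unlike an assumed-length function, this one can be placed in a module and referenced from it, because the length comes from the function and not from a declaration in the caller.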

I have Robin killfiled, so I would not have even seen his comment if you
had not quoted it. I rather expect him to reply to this (though there is
a chance that my saying I expect it might keep it from happening), but I
won't get into a "debate" with him (or even read his posts). I quote
"debate" for a reason. :-(

robin

Sep 4, 2009, 10:26:03 AM
"James Van Buskirk" <not_...@comcast.net> wrote in message
news:h7ifl8$jds$1...@news.eternal-september.org...

> "John" <urba...@comcast.net> wrote in message
> news:2a7f5e48-4679-4851...@p9g2000vbl.googlegroups.com...
>
>> program testit
>> character(len=8),external::bigfunc ! declare function as 8 characters
>> write(*,*)'string=['//bigfunc()//']'
>> end program testit
>> character(len=16) function bigfunc() ! function is defined as 16 characters
>> bigfunc='ABCDEFGHIJKLMNOP'
>> end function bigfunc
>
> My view of what to expect here is motivated by two considerations:
> 1) F77 allows functions to declare themselves to be CHARACTER*(*).
> 2) F77 has no provision for dynamic memory.
>
> Given these conditions, a function at compile time might have no way
> of knowing what its length would be and indeed its length could be
> different when invoked from different program units. Since the lack
> of dynamic memory meant that it could not allocate memory for its
> result at run time,

That is irrelevant, as the memory is allocated at compile time by the
calling program unit.

It's no different from the situation when a character argument
is passed to a dummy argument in some procedure, where that
dummy argument is defined as CHARACTER *(*).
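
A minimal sketch of that dummy-argument case, with hypothetical names (fill_mod, fill) that do not appear in the thread. The caller owns the storage, the callee sees its length through character(len=*), and ordinary assignment truncates or blank-pads to fit:

```fortran
module fill_mod
  implicit none
contains
  ! buf takes its length from the actual argument supplied by the caller.
  subroutine fill(buf)
    character(len=*), intent(out) :: buf
    ! Assignment truncates on the right or pads with blanks as needed.
    buf = 'ABCDEFGHIJKLMNOP'
  end subroutine fill
end module fill_mod
```

Called with a character(len=8) variable this stores 'ABCDEFGH'; called with a character(len=20) variable it stores the 16 letters followed by four blanks.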


robin

Sep 4, 2009, 10:26:04 AM
"Richard Maine" <nos...@see.signature> wrote in message
news:1j5fiu3.1ybflx1yj2q02N%nos...@see.signature...

> John <urba...@comcast.net> wrote:
>
>> One more reason to bite the bullet and put more code into modules.
>> 11,342,174 lines of pre-f90 code to go -- better go get some espresso.
>
> If you are updating code by doing such things as putting it into
> modules, you might want to keep in mind that assumed-length character
> functions are officially obsolescent as of f95. They always were a bit
> of a strange wart, misunderstood by most people who tried to use them.

Including yourself, as I recall.

> To quote from B.2.5 in the f95 standard,
>
> "Assumed character length for functions is an irregularity in the
> language since elsewhere in Fortran the philosophy is that the
> attributes of a function result depend only on the actual arguments
> of the invocation and on any data accessible by the function..."
>
> There are quite a few other oddities about such functions. One that
> seems relevant relates to your comment about putting code into modules.
> Although you can put an assumed-length character function in a module,

You can? Why don't you try it?

> you cannot then reference it. (The reasons why are left as an exercise
> for the student). I think I recall a debate about whether to disallow
> such functions from being put into a module, since that makes them
> useless and silly. I guess the conclusion must have been that Fortran
> doesn't go out of its way to restrict against things just because they
> are useless and silly (or a lot of existing code would be illegal).
>
> Anyway, the short of it is that I recommend against using assumed-length
> character functions.
>
> I have Robin killfiled,

And I have you in my viper file, as one who consistently abuses
posters in c.l.f. And you have done so here again.

James Van Buskirk

Sep 5, 2009, 1:57:00 AM
"robin" <rob...@bigpond.com> wrote in message
news:%r9om.17450$ze1....@news-server.bigpond.net.au...

> "James Van Buskirk" <not_...@comcast.net> wrote in message
> news:h7ifl8$jds$1...@news.eternal-september.org...

>> My view of what to expect here is motivated by two considerations:
>> 1) F77 allows functions to declare themselves to be CHARACTER*(*).
>> 2) F77 has no provision for dynamic memory.

>> Given these conditions, a function at compile time might have no way
>> of knowing what its length would be and indeed its length could be
>> different when invoked from different program units. Since the lack
>> of dynamic memory meant that it could not allocate memory for its
>> result at run time,

> That is irrelevant, as the memory is allocated at compile time by the
> calling program unit.

The fact that memory must be allocated at compile time (in F77) by
the calling program unit is central to my argument, so yes, it is
completely relevant. Sorry to have used logic so that my post was
consequently made difficult to understand. But I wouldn't have
responded to this except that I was trying to make more points that
were missed in, e.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41235 .

Given separate compilation of function bigfunc, the buffer overrun
should normally go undetected at compilation time, and I worry about
compilers that claim to have detected an error in my example. Let's
make a little change to make the program standard conforming (at
least spending some time searching the standard I couldn't find
anything wrong with it) :

C:\gfortran\clf\illegal_len>type legal_len.f90
program main
implicit none
character(4) level1
character(4) answer
integer i

do i = 1024, 1, -1023
write(*,'(a,i0)') 'Invoking level1 with i = ', i
answer = level1(i)
write(*,'(a)') 'Back from level1'
write(*,'(a,i0,2a)') 'i = ', i, ', answer = ', answer
end do
end program main

function level1(i)
implicit none
integer i
character(4) level1
character(i) bigfunc

level1 = bigfunc()
end function level1

function bigfunc()
implicit none
character(*) bigfunc

bigfunc = repeat('b',len(bigfunc))
end function bigfunc

C:\gfortran\clf\illegal_len>gfortran -Wall legal_len.f90 -olegal_len

C:\gfortran\clf\illegal_len>legal_len
Invoking level1 with i = 1024
Back from level1
i = 1024, answer = bbbb
Invoking level1 with i = 1
Back from level1
i = 1, answer = b

All OK, right? So gfortran seems to like it and I think it's
correct. However, looking at the output of

gfortran -Wall legal_len.f90 -S

I see a little weirdness:

.globl _level1_
.def _level1_; .scl 2; .type 32; .endef
_level1_:
pushq %rbp
movq %rsp, %rbp
pushq %rsi
pushq %rbx
subq $32, %rsp
movq %rcx, 16(%rbp)
movl %edx, 24(%rbp)
movq %r8, 32(%rbp)
movq 32(%rbp), %rax
movl (%rax), %ebx
movl $0, %eax
testl %ebx, %ebx
cmovns %ebx, %eax
sall $6, %eax
testl %eax, %eax
jns L6
leaq LC8(%rip), %rcx
call __gfortran_runtime_error
L6:
cltq
movl $1, %edx
testq %rax, %rax
cmovle %rdx, %rax
movq %rax, %rcx
call _malloc
testq %rax, %rax
jne L7
leaq LC9(%rip), %rcx
call __gfortran_os_error
L7:
movq %rax, %rsi
movl $0, %eax
testl %ebx, %ebx
cmovns %ebx, %eax
movl %eax, %edx
movq %rsi, %rcx
call _bigfunc_
testl %ebx, %ebx
movl $0, %eax
testl %ebx, %ebx
cmovs %eax, %ebx
cmpl $3, %ebx
jle L8
movl $4, %r8d
movq %rsi, %rdx
movq 16(%rbp), %rcx
call _memmove
jmp L9
L8:
movslq %ebx, %rax
movq %rax, %r8
movq %rsi, %rdx
movq 16(%rbp), %rcx
call _memmove
movslq %ebx, %rax
movl $4, %edx
subq %rax, %rdx
movslq %ebx, %rax
addq 16(%rbp), %rax
movq %rdx, %r8
movl $32, %edx
movq %rax, %rcx
call _memset
L9:
movq %rsi, %rax
testq %rax, %rax
je L5
movq %rax, %rcx
call _free
L5:
addq $32, %rsp
popq %rbx
popq %rsi
leave
ret

At issue is the line

sall $6, %eax

Which seems to be allocating 64 bytes per character instead of 1
byte per character. But maybe I am just misunderstanding the purpose
of this line of code.

ifort still hates this code:

C:\gfortran\clf\illegal_len>ifort legal_len.f90
Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 9.1
Build 20061104
Copyright (C) 1985-2006 Intel Corporation. All rights reserved.

legal_len.f90(19) : Error: This passed length character name has been used in an invalid context. [BIGFUNC]
character(i) bigfunc
----------------^

compilation aborted for legal_len.f90 (code 1)

So I think ifort really hasn't detected the potential buffer overrun
in the original version (illegal_len.f90) but is just rejecting code
that is valid in isolation. I don't know whether ifort has fixed
this problem in more recent releases.
