How do I call a character function in C++

Simon

Oct 14, 2010, 6:45:00 AM

I have a legacy function (by which I mean I can't change it since it's
used everywhere) that is defined like so:

FUNCTION F_X(A) bind(c,name='F_X')
!DEC$ ATTRIBUTES DLLEXPORT::F_X
CHARACTER*(*) F_X
REAL(4), intent(in) :: A
CHARACTER*32 T
WRITE(T,'(f0.4)') A
F_X = T
END FUNCTION F_X

To call this from C++ I've tried
extern "C" {char* F_X(float*);}

float v = 1.0;
char* s = new char[32];
s = F_X(&v);

which fails at runtime.

Is it possible to call this directly from C++ or do I have to write a
wrapper routine?

TIA

Simon

glen herrmannsfeldt

Oct 14, 2010, 8:04:13 AM

Simon <si...@whiteowl.co.uk> wrote:
> I have a legacy function (by which I mean I can't change it since it's
> used everywhere) that is defined like so:

> FUNCTION F_X(A) bind(c,name='F_X')
> !DEC$ ATTRIBUTES DLLEXPORT::F_X
> CHARACTER*(*) F_X
> REAL(4), intent(in) :: A
> CHARACTER*32 T
> WRITE(T,'(f0.4)') A
> F_X = T
> END FUNCTION F_X

> To call this from C++ I've tried
> extern "C" {char* F_X(float*);}

It seems unlikely that it would be (char*).

Who would allocate it, and who would free it?

I can see how a fixed length CHARACTER would work,
but it isn't so obvious for variable length.

> float v = 1.0;
> char* s = new char[32];
> s = F_X(&v);

> which fails at runtime.

> Is it possible to call this directly from C++ or do I have to write a
> wrapper routine?

A wrapper seems like the best choice to me.

-- glen

Richard Maine

Oct 14, 2010, 12:36:12 PM

Simon <si...@whiteowl.co.uk> wrote:

> I have a legacy function (by which I mean I can't change it since it's
> used everywhere) that is defined like so:
>
> FUNCTION F_X(A) bind(c,name='F_X')
> !DEC$ ATTRIBUTES DLLEXPORT::F_X
> CHARACTER*(*) F_X
> REAL(4), intent(in) :: A
> CHARACTER*32 T
> WRITE(T,'(f0.4)') A
> F_X = T
> END FUNCTION F_X

...

> Is it possible to call this directly from C++ or do I have to write a
> wrapper routine?

I'm surprised you can call it at all from anything. You certainly have
no guarantees. It shouldn't even compile. I see two problems. One
trivial, and one not so. The trivial problem is that 4 is not guaranteed
to be a valid real kind. My quick skim overlooked that one until the nag
compiler reminded me with

Error: clf.f90, line 4: KIND value (4) does not specify a valid
representation method

While 4 is a common choice for the kind number of default real, it is
neither specified by the standard, nor universal. I'll regard that as a
minor side issue for the moment, though it might bite you someday.

Far more significant is the assumed-length character result. Do you have
any idea what that actually means? Very few people do. It is a quite
quirky feature that is completely misunderstood by the large majority of
the people who try to use it. I have heard of a few people who have used
it correctly, but only a very few. While it is related to an assumed
length dummy argument, it is not the same thing and is far more quirky.

Most people seem to think it has something to do with the result being
variable length. It doesn't. The result is fixed length, but the length
is declared in the invoking scope rather than in the function. So yes,
it can be a diferent foxed length in different invoking scopes, but it
is still a fixed length in any particular scope. One of the quirky
consequences is that you can't actually use an assumed-length function
if it is a module procedure. That's because using an assumed length
function requires that you declare the function length in the calling
scope, while module procedures disallow redeclaration. I think someone
once pointed out an even quirkier way to get around that (perhaps by
passing the module procedure as an actual argument).

It is an obsolescent feature as of f95, and is definitely not C
interoperable (quite independent of any C++ questions). As the NAG
compiler says after I fix the kind number problem with the real:

Error: clf.f90, line 8: BIND(C) function F_X has assumed CHARACTER
length

G95 compiles the code without complaint, but I don't know what it does
with it.

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

Richard Maine

Oct 14, 2010, 2:14:35 PM

glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:

> Simon <si...@whiteowl.co.uk> wrote:

> > FUNCTION F_X(A) bind(c,name='F_X')
> > !DEC$ ATTRIBUTES DLLEXPORT::F_X
> > CHARACTER*(*) F_X

...

> I can see how a fixed length CHARACTER would work,
> but it isn't so obvious for variable length.

This is not variable length. It is fixed length assumed from the
referencing scope. Not that this makes it any better for the OP's
problem. See my other post.

A wrapper isn't guaranteed to help either, as this code isn't guaranteed
to compile at all. Even if you have a compiler that it compiles on at
the moment, that could change. I'd consider it a bug that the compiler
compiled this at all. Compiler bugs sometimes do get fixed. That's going
to cause the OP problems if he "can't change it" but it won't compile as
is.

Steve Lionel

Oct 14, 2010, 2:52:27 PM

On 10/14/2010 2:14 PM, Richard Maine wrote:
>Even if you have a compiler that it compiles on at
> the moment, that could change. I'd consider it a bug that the compiler
> compiled this at all. Compiler bugs sometimes do get fixed.

Yes, indeed. And I will see that this gets fixed in Intel Fortran. It
almost certainly does not work as intended.

--
Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH

For email address, replace "invalid" with "com"

User communities for Intel Software Development Products
http://software.intel.com/en-us/forums/
Intel Software Development Products Support
http://software.intel.com/sites/support/
My Fortran blog
http://www.intel.com/software/drfortran

Ian Bush

Oct 14, 2010, 2:55:23 PM

Richard Maine wrote:
> So yes,
> it can be a diferent foxed length in different invoking scopes,

Cracking typo Richard, and I'm not referring to diferent

(http://en.wiktionary.org/wiki/foxed just in case this is UKish only)

Ian

glen herrmannsfeldt

Oct 14, 2010, 3:52:13 PM

Richard Maine <nos...@see.signature> wrote:
> glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>> Simon <si...@whiteowl.co.uk> wrote:

>> > FUNCTION F_X(A) bind(c,name='F_X')
>> > !DEC$ ATTRIBUTES DLLEXPORT::F_X
>> > CHARACTER*(*) F_X
> ...

>> I can see how a fixed length CHARACTER would work,
>> but it isn't so obvious for variable length.

> This is not variable length. It is fixed length assumed from the
> referencing scope. Not that this makes it any better for the OP's
> problem. See my other post.

I see what you mean, but there is no way to do that, even for
the Fortran versions that allow it, from C.

C can return a struct containing an array, and that array
(at least in C89), would have to have a constant length.

It might be that you could call a Fortran CHARACTER function
with a constant (known at compile time) length from C, but
not one like this.

One common C solution is to return a pointer to a static char
array, which the caller has to copy before calling the function
again. The unix/C library routine ctime() converts the time
value as commonly used by unix into a human readable string.
I once wrote a program to extract the creation, modification,
and access times for a file and print them out, using three
calls to ctime. All the times come out the same.
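
The static-buffer pitfall described above can be shown in a few lines of C++. This is only a sketch: the helper names `time_to_string` and `raw_pointers_alias` are made up for illustration, and whether successive `std::ctime` calls really return the same pointer depends on the implementation (the standard merely allows the buffer to be shared).

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Hypothetical helper: copy the text returned by std::ctime right away,
// before a later call can overwrite the shared static buffer.
std::string time_to_string(std::time_t t) {
    const char* p = std::ctime(&t);  // points into static storage
    assert(p != nullptr);            // may be null if t cannot be converted
    return std::string(p);           // this copy belongs to the caller
}

// The buggy pattern: keep the raw pointers and read them later.
// On implementations that reuse one buffer, both pointers are equal,
// so both "strings" show whatever the last call wrote.
bool raw_pointers_alias(std::time_t a, std::time_t b) {
    const char* pa = std::ctime(&a);
    const char* pb = std::ctime(&b);
    return pa == pb;
}
```

Calling `time_to_string` three times for three different file times gives three independent strings, whereas printing three saved `std::ctime` pointers would likely reproduce the "all the times come out the same" result.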

So, wrong term, but it is more variable than a C function
can return.

> A wrapper isn't guaranteed to help either, as this code isn't guaranteed
> to compile at all. Even if you have a compiler that it compiles on at
> the moment, that could change. I'd consider it a bug that the compiler
> compiled this at all. Compiler bugs sometimes do get fixed. That's going
> to cause the OP problems if he "can't change it" but it won't compile as
> is.

-- glen

glen herrmannsfeldt

Oct 14, 2010, 5:33:19 PM

Simon <si...@whiteowl.co.uk> wrote:
> I have a legacy function (by which I mean I can't change it since it's
> used everywhere) that is defined like so:

You don't have to change it, and you don't have to use it
from C. I can see that if you have thousands of lines of
legacy code that you need to call from C, but this one takes
only a few minutes to rewrite.

> FUNCTION F_X(A) bind(c,name='F_X')
> !DEC$ ATTRIBUTES DLLEXPORT::F_X
> CHARACTER*(*) F_X
> REAL(4), intent(in) :: A
> CHARACTER*32 T
> WRITE(T,'(f0.4)') A
> F_X = T
> END FUNCTION F_X

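#include <stdio.h>
#include <string.h>
#include <math.h>

/* Wrapping the array in a struct lets it be returned by value. */
struct char32 {
    char x[32];
};

struct char32 f(double x) {
    struct char32 buf;
    /* strncpy copies the single blank, then pads the rest with NULs. */
    strncpy(buf.x, " ", 32);
    /* %20.14e needs at most 22 characters plus the NUL for an IEEE
       double, so the 32-byte buffer is large enough. */
    sprintf(buf.x, "%20.14e", x);
    return buf;
}

int main() {
    struct char32 tmp;
    tmp = f(sqrt(3.));
    /* Print at most 32 characters, right-justified in a 32-wide field. */
    printf("%32.32s\n", tmp.x);
    return 0;
}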

This demonstrates a C function returning (a struct containing)
a 32 character string. I didn't try to call it from Fortran.

Note that using %f in C with sprintf is dangerous unless
you have a very large buffer.

Consider on a Cray system:

sprintf(buf,"%f",1e4000);
sprintf(buf,"%f",1e-8000);
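
The risk with `%f` can be bounded by passing the buffer size to `snprintf` and checking its return value. The following is a minimal C++ sketch under that assumption; the helper name `format_fixed` is invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Format x with "%f" into a caller-supplied buffer of cap bytes.
// Returns true if the whole text fit. On false, buf still holds a
// NUL-terminated but truncated prefix of the output.
bool format_fixed(char* buf, std::size_t cap, double x) {
    assert(buf != nullptr && cap > 0);
    // snprintf writes at most cap bytes (including the NUL) and returns
    // the length the full output would have had, or a negative value
    // on an encoding error.
    int needed = std::snprintf(buf, cap, "%f", x);
    return needed >= 0 && static_cast<std::size_t>(needed) < cap;
}
```

With a 32-byte buffer, `format_fixed(buf, sizeof buf, 1.5)` succeeds, while a value such as `1e300` is reported as not fitting instead of overrunning the buffer.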

-- glen

Ron Shepard

Oct 14, 2010, 6:03:40 PM

In article <1jqchrj.1f79rzgwgs0lcN%nos...@see.signature>,
nos...@see.signature (Richard Maine) wrote:

> Far more significant is the assumed-length character result. Do you have
> any idea what that actually means? Very few people do. It is a quite
> quirky feature that is completely misunderstood by the large majority of
> the people who try to use it. I have heard of a few people who have used
> it correctly, but only a very few. While it is related to an assumed
> length dummy argument, it is not the same thing and is far more quirky.

I have never understood why people are confused about these kinds of
functions, but I agree with you that many people are. I would say that,
as far as the fortran side is concerned, it is pretty much exactly the
same as an assumed length dummy argument to a subroutine. Maybe there
are complications when the function is referenced more than once in a
statement? Interfacing to C is of course always a problem for assumed
length arguments, but if it can be done for a subroutine argument then
it should be possible to do it for an assumed length character function.
The same information must get passed to the subroutine or function: the
character string location and its length.

In fact, if there is difficulty writing the interface as a function, one
work-around might be to change it to a subroutine in the calling
program. Except for the multiple-reference-within-a-statement thing
(which is really the only reason to use a function rather than a
subroutine, right?), the calling program would require only minimal
changes.

What does cfortran.h do for such functions?

$.02 -Ron Shepard

JB

Oct 14, 2010, 7:36:14 PM

On 2010-10-14, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Note that using %f in C with sprintf is dangerous unless
> you have a very large buffer.

... hence snprintf()

> Consider on a Cray system:
>
> sprintf(buf,"%f",1e4000);
> sprintf(buf,"%f",1e-8000);

Ok..

PGC-W-0020-Overflow of real or double precision constant (bigprint.c: 6)
PGC-W-0019-Underflow of real or double precision constant (bigprint.c: 7)

(and, it turns out that due to the overflow/underflow the constant is
replaced by 0.0. So no buffer overflow, if that was the point of the
above exercise. And no, I didn't check with the other available C
compilers, just the default.)

--
JB

Richard Maine

Oct 14, 2010, 8:03:59 PM

Ron Shepard <ron-s...@NOSPAM.comcast.net> wrote:

> In article <1jqchrj.1f79rzgwgs0lcN%nos...@see.signature>,
> nos...@see.signature (Richard Maine) wrote:
>
> > Far more significant is the assumed-length character result. Do you have
> > any idea what that actually means? Very few people do. It is a quite
> > quirky feature that is completely misunderstood by the large majority of

> > the people who try to use it....

> I have never understood why people are confused about these kinds of
> functions, but I agree with you that many people are. I would say that,
> as far as the fortran side is concerned, it is pretty much exactly the
> same as an assumed length dummy argument to a subroutine. Maybe there
> are complications when the function is referenced more than once in a
> statement?

Try more than once in scoping unit - not just in a statement. The length
must be specified in a declaration in the invoking scoping unit. That
length then applies to all references in the scoping unit. So if you
have two references in the same scoping unit, they better need the same
length.

Furthermore, because the length is specified in a declaration, it has to
be a specification expression. That means it requires obscure tricks to
get the length to be anything other than a compile-time constant. In
particular, you can't compute the length in executable statements in the
same scoping unit. Instead, what you have to do for a case where the
length is computed by executable statements is write a wrapper routine
and do the function call inside of it; then you have to figure out how
to get the data out of that wrapper routine, which probably pretty much
ruins the whole idea - might as well have used a subroutine in the first
place.

For example it is just fine to do

read(*,*) n
call some_subroutine(string(1:n))

but you can't do anything like that with assumed function length without
introducing an extra level of wrapper procedure to get the n into a
specification expression.

Finally there is the problem of putting an assumed-length function in a
module. You almost can't. There was a time when I thought there was no
way to use an assumed length module function. I think it was probably
James Van Buskirk who pointed out to me that there was an obscure way. I
might be wrong about it being James, but it seems like the kind of thing
he could come up with.

The problem is that you aren't allowed to redeclare module procedures
outside of the module, but you can't reference an assumed-length
function without redeclaring it.

The obscure trick is to pass the module procedure as an actual argument.
Then you can do the needed redeclaration on the dummy. I'm not actually
100% sure that is completely legit, but it at least might be. It is
surely obscure.

This is all much more subtle than what most people try to do with it. I
think the most common thing is to hope that the * is a synonym for
"magic" that will make everything work out without the user ever having
to specify a length anywhere. Instead, we now know the closest thing to
a syntax for that involves allocatable length. :-)

> Interfacing to C is of course always a problem for assumed
> length arguments, but if it can be done for a subroutine argument then
> it should be possible to do it for an assumed length character function.
> The same information must get passed to the subroutine or function: the
> character string location and its length.

Probably, but you can't do either one with the f2003 C interop features
that the OP was trying to use. You are talking about doing it by
figuring out what the Fortran compiler's implementation looks like and
then translating that into C. Yes, you can probably use that kind of
trick - sort of. But as I mentioned elsewhere, this code should not
compile at all as is because the BIND(C) requires that the interface be
interoperable as defined by the f2003 rules. Neither assumed-length
dummy arguments nor this are allowed by those rules. If it won't
compile at all, you won't be able to make it interoperate very well. So
yes, you could probably make something sort of like this work, but that
will require starting out by modifying this code to at least take off
the BIND(C) and then to use the appropriate compiler-dependent hackery.

> In fact, if there is difficulty writing the interface as a function, one
> work-around might be to change it to a subroutine in the calling
> program. Except for the multiple-reference-within-a-statement thing
> (which is really the only reason to use a function rather than a
> subroutine, right?), the calling program would require only minimal
> changes.

There are plenty of things one could do that would not involve much
change. That's certainly one, but it still won't compile with BIND(C). I
took the OP's comment about being unable to change the code perhaps a
little too literally, particularly as he is unlikely to be able to get by
with that choice (note Steve's agreement that it was a compiler bug to
accept that code as is).

robin

Oct 14, 2010, 11:25:00 PM

"Simon" <si...@whiteowl.co.uk> wrote in message news:i96mvd$1io$1...@news.eternal-september.org...

The FORTRAN function expects to receive the length of the string and
its address.

glen herrmannsfeldt

Oct 15, 2010, 1:35:20 AM

JB <f...@bar.invalid> wrote:
(snip)

>> Consider on a Cray system:

>> sprintf(buf,"%f",1e4000);
>> sprintf(buf,"%f",1e-8000);

> Ok..

> PGC-W-0020-Overflow of real or double precision constant (bigprint.c: 6)
> PGC-W-0019-Underflow of real or double precision constant (bigprint.c: 7)

How about on a Cray-1 or Cray-XMP system? The single precision
format is 64 bits with, I believe, a 15 bit exponent.
So, the 1e-8000 is too small, but the 1e4000 will fit, and take
4001 characters to print, not counting the decimal point or any
digits after the decimal point.

-- glen

JB

Oct 15, 2010, 8:27:56 AM

On 2010-10-15, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> JB <f...@bar.invalid> wrote:
> (snip)
>
>>> Consider on a Cray system:
>
>>> sprintf(buf,"%f",1e4000);
>>> sprintf(buf,"%f",1e-8000);
>
>> Ok..
>
>> PGC-W-0020-Overflow of real or double precision constant (bigprint.c: 6)
>> PGC-W-0019-Underflow of real or double precision constant (bigprint.c: 7)
>
> How about on a Cray-1 or Cray-XMP system?

Sorry, I don't have access to a computer museum where I could use one
of the above systems. When you said "Cray system", I assumed you meant
an arbitrary, or at least contemporary, Cray system. But, given your
posting history, that was perhaps a poor assumption.

> The single precision
> format is 64 bits with, I believe, a 15 bit exponent.
> So, the 1e-8000 is too small, but the 1e4000 will fit, and take
> 4001 characters to print, not counting the decimal point or any
> digits after the decimal point.

Yes, it was quite obvious what the point was. That being said, if you
wanted to demonstrate buffer overflow with sprintf(), surely you could
come up with an example that doesn't require some exotic piece of
hardware in a museum to test and verify?

--
JB

glen herrmannsfeldt

Oct 15, 2010, 9:18:35 AM

JB <f...@bar.invalid> wrote:
(I wrote)

Well, the x87 temporary real format, sometimes REAL*10 for
Fortran compilers or (long double) for C, also has a 15 bit
exponent field. There might be some that allow it for sprintf.

Otherwise, some might decide that a 1000 character buffer should
be big enough for printing just one floating point value.
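
Rather than guessing a buffer size, one can ask `snprintf` how long the output would be. The sketch below does that for `long double` in C++; the function names are made up for illustration, and the actual length depends on the platform's `long double` format (nearly 5000 characters for the largest value where it is the x87 80-bit format, a little over 300 where it is the same as `double`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Ask snprintf how many characters "%Lf" needs for x, then allocate
// exactly that much instead of picking a buffer size in advance.
std::string format_long_double(long double x) {
    // With a null buffer and size 0, snprintf writes nothing and only
    // reports the required length (excluding the terminating NUL).
    int needed = std::snprintf(nullptr, 0, "%Lf", x);
    assert(needed >= 0);
    std::string out(static_cast<std::size_t>(needed) + 1, '\0');
    std::snprintf(&out[0], out.size(), "%Lf", x);
    out.resize(static_cast<std::size_t>(needed));  // drop the NUL
    return out;
}

// Length of the largest finite long double when printed with "%Lf".
std::size_t longest_fixed_output() {
    return format_long_double(std::numeric_limits<long double>::max()).size();
}
```

Comparing `longest_fixed_output()` against 1000 shows whether a 1000-character buffer would be enough on a given platform.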

-- glen

Simon

Oct 15, 2010, 12:21:00 PM

Thank you all for your responses. My feeling was that it wouldn't be
possible because there is no way (that I could see) to pass the required
string length to the Fortran compiler. I've solved the problem by
writing a wrapper subroutine that defines the length explicitly and I
can arrange for the calling C++ (itself a wrapper) always to have the
correct amount of memory allocated.
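
The C++ side of such an arrangement might look roughly like the sketch below. Everything named here is an assumption for illustration: the wrapper name `f_x_wrap`, its argument list, and the fixed length of 32 are not taken from the actual code. The wrapper is replaced by a C++ stand-in so that the example runs without a Fortran compiler, and the stand-in's `%.4f` formatting only approximates what the Fortran `f0.4` edit descriptor produces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

const int kFxLen = 32;  // assumed fixed length agreed with the wrapper

// Hypothetical interface of the Fortran wrapper subroutine: it writes
// exactly kFxLen characters, blank-padded and not NUL-terminated.
extern "C" void f_x_wrap(const float* a, char* out);

// Stand-in definition so this sketch links on its own. A real build
// would drop it and link the Fortran object instead.
extern "C" void f_x_wrap(const float* a, char* out) {
    char tmp[64];
    int n = std::snprintf(tmp, sizeof tmp, "%.4f", *a);
    assert(n >= 0);
    if (n > kFxLen) n = kFxLen;      // keep within the fixed length
    std::memset(out, ' ', kFxLen);   // blank padding, Fortran style
    std::memcpy(out, tmp, static_cast<std::size_t>(n));
}

// C++ side: own the buffer, call the wrapper, strip the blank padding.
std::string call_f_x(float a) {
    char buf[kFxLen];
    f_x_wrap(&a, buf);
    std::size_t len = kFxLen;
    while (len > 0 && buf[len - 1] == ' ') --len;
    return std::string(buf, len);
}
```

Because the buffer is a local array of the agreed length, the caller never has to free anything, and the result comes back as an ordinary `std::string`.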

As for writing a C equivalent, that is exactly the problem I'm solving.
There is already a C++ mimic and this should be replaced by a call to
the (original) Fortran so that number formatting is only ever done in
one place.

Simon

James Van Buskirk

Oct 15, 2010, 2:43:59 PM

"Steve Lionel" <steve....@intel.invalid> wrote in message
news:8hp1rd...@mid.individual.net...

> On 10/14/2010 2:14 PM, Richard Maine wrote:
>>Even if you have a compiler that it compiles on at
>> the moment, that could change. I'd consider it a bug that the compiler
>> compiled this at all. Compiler bugs sometimes do get fixed.

> Yes, indeed. And I will see that this gets fixed in Intel Fortran. It
> almost certainly does not work as intended.

BTW, there was another behavior of ifort regarding assumed length
character functions that I thought was a bug that I posted in

http://groups.google.com/group/comp.lang.fortran/msg/0048deb68a1968a7?hl=en

, but I have such an old version of ifort that I'm not sure that the
problem persists in current versions.

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

James Van Buskirk

Oct 15, 2010, 3:17:16 PM

"Simon" <si...@whiteowl.co.uk> wrote in message

news:i99v1f$s7p$1...@news.eternal-september.org...

> Thank you all for your responses. My feeling was that it wouldn't be
> possible because there is no way (that I could see) to pass the required
> string length to the Fortran compiler. I've solved the problem by writing
> a wrapper subroutine that defines the length explicitly and I can arrange
> for the calling C++ (itself a wrapper) always to have the correct amount
> of memory allocated.

> As for writing a C equivalent, that is exactly the problem I'm solving.
> There is already a C++ mimic and this should be replaced by a call to the
> (original) Fortran so that number formatting is only ever done in one
> place.

Don't give up just yet! I posted an example that shows what you can
do:

http://groups.google.com/group/comp.lang.fortran/msg/2d1be8880cc894ff?hl=en

In the example I offer 3 versions: first a Fortran-only version that
employs an assumed-length character function, then a version that rewrites
the assumed-length character function in C, and finally a version that
rewrites the assumed-length character function as a BIND(C) Fortran
function.

So by following the quoted example you can see how to write a C prototype
for the assumed-length character function so that you can invoke it from
C++. Alternatively you can use the BIND(C) Fortran version so that your
function works both when invoked through its Fortran name by a Fortran
caller and through its BIND(C) name when invoked via C++.

The problem here is that different Fortran processors may put the length
and address of the result variable in different places. You can tell
where they go from examining the assembly language output of a simple
assumed-length character function. If you want to be transportable
across Fortran processors that put these in different order or use a
completely different method of returning the result variable, then
you've got a big problem for sure. Also if the assumed-length character
function is compiled as STDCALL then interoperability is particularly
problematic because ifort doesn't permit you to mix STDCALL with BIND(C)
but gfortran more or less requires you to mix them.
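
To make the "length and address of the result variable" concrete, here is a C++ sketch of one possible convention, in the f2c style where the result buffer and its length are passed as hidden leading arguments. This is not a portable interface: the symbol name `f_x_`, the argument order, and the `size_t` length type are all assumptions, and the function body is a C++ stand-in for the compiled Fortran so that the example runs by itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Assumed calling convention (f2c-style, compiler-dependent): the caller
// passes the result buffer and its length ahead of the real argument.
extern "C" void f_x_(char* result, std::size_t result_len, const float* a);

// Stand-in for the compiled Fortran function. Like Fortran character
// assignment, it truncates or blank-pads to the caller's length.
extern "C" void f_x_(char* result, std::size_t result_len, const float* a) {
    char tmp[64];
    int n = std::snprintf(tmp, sizeof tmp, "%.4f", *a);
    assert(n >= 0);
    std::size_t used = static_cast<std::size_t>(n);
    if (used > result_len) used = result_len;
    std::memset(result, ' ', result_len);
    std::memcpy(result, tmp, used);
}

// The caller fixes the length, much as a Fortran caller does by
// declaring CHARACTER*n F_X in its own scope.
std::string call_with_length(float a, std::size_t len) {
    std::string buf(len, ' ');
    if (len > 0) f_x_(&buf[0], len, &a);
    return buf;
}
```

With this stand-in, a length of 8 yields the value padded with blanks and a length of 4 yields it truncated, which matches the point made earlier in the thread that the length is fixed by the invoking scope rather than by the function.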

I notice that dummy argument int len in charstar3.c should perhaps be
size_t len and integer(C_INT), value :: len in charstar4.c should
likewise be integer(C_SIZE_T), value :: len... but maybe it was OK as
written. It was long enough ago that I'm no longer sure whether I
checked this.