Hi, John,
I took a look at FortWrap, and have a few questions. Thanks for
drawing our attention to your project.
I see that you use the 'ISO_C_BINDING' interop module, but you don't
use the 'bind(c, name="foo")' feature for the wrapped subroutines, and
instead reproduce the internal name mangling of g95 and gfortran. Why
not just do
    subroutine XYZ() bind(c, name='xyz')
        ...
    end subroutine
and then not worry about reproducing the mangling schemes of the
different compilers? That would give you a significant boost in
portability. Is there something in the compiler implementations that
prevents you from doing this?
Along those lines: as you may know, Fortran compilers are allowed to
do whatever the heck they want to the internals of derived types. For
example, if the user had a derived type:
    type objectA
        integer x
        real y
    end type
A Fortran compiler could switch around the ordering of 'x' and 'y',
add padding, or do anything else to the internals of objectA. The
ISO_C_BINDING stuff (declaring the type with BIND(C)) lets the user
turn all of that off, so a Fortran derived type matches a C struct
definition, and you can pass interoperable Fortran derived types and C
structs back and forth without any copying or accessors. This
requires modifying the type declaration, however, and may not be an
option for legacy code. The non-interoperability problem is felt
especially with arrays of derived types, which require a Fortran-level
call for every access to a derived type element, and that would kill
performance. Any thoughts on how to handle this?
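
To make the interoperable case concrete, here is a rough sketch of the
C++ side. It assumes the Fortran type has been redeclared with BIND(C)
and C kinds, as shown in the comment; the names ObjectA and sum_y are
made up for illustration and are not part of FortWrap.

```cpp
#include <cstddef>
#include <type_traits>

// C++ view of a hypothetical interoperable Fortran type declared as
//   type, bind(c) :: objectA
//     integer(c_int) :: x
//     real(c_float)  :: y
//   end type
// With bind(c), the Fortran layout matches what the companion C compiler
// uses for this struct, so no reordering surprises are possible.
struct ObjectA {
    int x;
    float y;
};
static_assert(std::is_standard_layout<ObjectA>::value,
              "ObjectA must stay a plain C-style struct");

// Direct element access over an array of interoperable objects:
// no Fortran call is needed per element.
inline double sum_y(const ObjectA* objs, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += objs[i].y;
    }
    return total;
}
```

Without BIND(C) on the type, none of this is guaranteed, which is the
legacy-code problem described above.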
Lastly, I see there is a restriction to arrays of 1 or 2 dimensions
only. Would this restriction be lifted in the future? Is there any
significant reason why it's there currently?
Thanks,
Kurt
Hi Kurt,
Thanks for your interest. There are several reasons for not using
BIND(C). The main reason is that my goal is to wrap derived types and
the approach I use is an "opaque container" approach (a.k.a. pointer
handles). The wrappers FortWrap generates allow the derived type data
to be passed around using a pointer/handle, but the internal data
members cannot be accessed directly. This follows the object
oriented mindset -- if everything is done using get/set functions,
there is no problem, because the internal data need not be touched
directly from the target language (now I realize this might not be
ideal for wrapping all Fortran code, but wrapping this type of code is
where FortWrap can provide the most benefit).
With the opaque handle approach, FortWrap can create wrappers that
will work regardless of what is stored inside the derived type. For
example, the derived type may contain data with ALLOCATABLE, POINTER,
etc., and it will not be a problem because all the wrapper code does
is obtain a pointer to an instance and pass that around to Fortran
routines that are expecting that pointer.
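
A minimal sketch of the opaque handle idea, seen from the C++ side.
This is not the code FortWrap generates: the class name ObjectAHandle
and the routine names are hypothetical, and the "Fortran" routines are
faked in C++ here only so the example compiles and runs by itself.

```cpp
// Stand-in for the Fortran side. In a real wrapper these would be Fortran
// procedures operating on the derived type, whose layout the C++ side
// never sees. (All names here are hypothetical.)
namespace fake_fortran {
struct ObjectA { int x; float y; };
inline void* object_create() { return new ObjectA{0, 0.0f}; }
inline void object_destroy(void* h) { delete static_cast<ObjectA*>(h); }
inline void object_set_x(void* h, int x) { static_cast<ObjectA*>(h)->x = x; }
inline int object_get_x(void* h) { return static_cast<ObjectA*>(h)->x; }
}  // namespace fake_fortran

// The C++ wrapper only stores the opaque pointer and forwards it.
// It never looks inside the object, so the type could just as well
// contain ALLOCATABLE or POINTER components on the Fortran side.
class ObjectAHandle {
public:
    ObjectAHandle() : handle_(fake_fortran::object_create()) {}
    ~ObjectAHandle() { fake_fortran::object_destroy(handle_); }

    // Non-copyable: exactly one wrapper owns each handle.
    ObjectAHandle(const ObjectAHandle&) = delete;
    ObjectAHandle& operator=(const ObjectAHandle&) = delete;

    void set_x(int x) { fake_fortran::object_set_x(handle_, x); }
    int get_x() const { return fake_fortran::object_get_x(handle_); }

private:
    void* handle_;  // opaque: only the "Fortran" routines interpret it
};
```

All access goes through get/set calls, so the target language never
depends on how the compiler lays out the derived type.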
So this is one reason for not using BIND(C): the opaque handle
approach lets me wrap subroutines that could not be wrapped using
BIND(C) -- e.g. cases where the derived type itself is not
interoperable, so the Fortran code wouldn't compile with BIND(C) (btw
I started what turned out to be a big discussion about this on the
gfortran list a while back:
http://gcc.gnu.org/ml/fortran/2009-06/msg00034.html).
A second reason is to cut down on the amount of code and reduce the
compile time. If FortWrap generated BIND(C) style Fortran wrapper
code for every routine being wrapped, the amount of code and compile
time would go way up.
The third reason is that generation of the wrapper code becomes a
little bit trickier because I would need to be able to reproduce all
of the argument lists and type declarations in the Fortran BIND(C)
wrapper routines (I assume sort of like Fwrap does when wrapping
routines that contain assumed size arrays). I'm not too far from
being able to do this though, so the main reasons are the first two.
Based on my opaque container approach it should be clear why I'm not
concerned about the compiler's internal representation of the derived
type. Re your question about performance of derived type
interoperability, I'm not completely sure I understand what the issue
is, but in general my recommendation would be to augment the original
Fortran code with accessor functions and use those instead of directly
accessing the data from C. Then only pointers are passed around so it
shouldn't incur a performance hit. I haven't yet figured out a good
way to work with arrays of derived types (aside from creating a new
derived type that contains them).
Regarding array dimensionality: I rarely use arrays with 3 or more
dimensions in my work, so I haven't had the need. I guess the main issue
would be making sure the data order is compatible. Currently FortWrap
creates a C++ class for handling two-dimensional arrays, and this
class takes care of storing data in Fortran order but letting C++ code
access the data as if it were in C order. This approach could
probably be extended, although it isn't super elegant as it introduces
an intermediary class.
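
For illustration, a class of that kind might look roughly like the
sketch below. This is not FortWrap's actual class; the name FMatrix and
its interface are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D array helper. Data is stored column-major (Fortran
// order) so the raw buffer can be handed to a Fortran routine expecting
// a(nrows, ncols), while C++ code indexes it as m(i, j) with zero-based
// row i and column j.
template <typename T>
class FMatrix {
public:
    FMatrix(std::size_t nrows, std::size_t ncols)
        : nrows_(nrows), ncols_(ncols), data_(nrows * ncols) {}

    // Element (i, j) lives at offset i + j*nrows in column-major storage.
    T& operator()(std::size_t i, std::size_t j) {
        return data_[i + j * nrows_];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        return data_[i + j * nrows_];
    }

    // Contiguous buffer to pass to the Fortran routine.
    T* data() { return data_.data(); }
    std::size_t nrows() const { return nrows_; }
    std::size_t ncols() const { return ncols_; }

private:
    std::size_t nrows_, ncols_;
    std::vector<T> data_;
};
```

Extending this to three or more dimensions would mean generalizing the
offset calculation in operator(), with the first index varying fastest.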
Thanks for your comments. It will be good to have someone to bounce
ideas off of.
John
This might be of interest as an approach for wrapping deep derived
types in a portable, standard-compliant way. Basically I haven't used
this approach in FortWrap for the reasons I mentioned above
(additional wrapper code, more complicated, etc.). g95 and gfortran,
at least, implement derived type procedure arguments by simply passing
a pointer to the data, so I haven't found it necessary to go the extra
mile and make that wrapping standard compliant with BIND(C).
John