I was imprecisely bundling "cannot leak memory" with general memory
management issues. Without defined assignment you will end up with a
"double free" situation, as you discuss below, if the user uses
assignment on a container.
> I can see that my present implementation can have two containers
> pointing to the same content. This might or might not be what is
> desired. If it is, then the final subroutine must handle this
> correctly.
I don't think it is possible to write a final subroutine that can handle
that situation correctly.
> I think my preferred solution would be disabling assignment (i.e.
> writing a defined assignment which calls "abort") and providing two
> methods clone and make_a_reference .
From the perspective of a client using the container, if they want to
make a copy of the value of the container, assignment is a pretty
natural way to go about doing that. Therefore, if possible, I think it
preferable to enable clients to be able to safely use that natural way,
particularly given that there is no way currently in the language
(perhaps there should be a way) of preventing clients from using
assignment at compile time.
A runtime error telling the client that they have done something silly
is probably better than nothing, but it is not ideal.
At some point you have to rely on the client reading, understanding and
implementing their use of your library as per your documentation, but as
a general principle, the more that things work as might otherwise be
naively expected (least surprise), the better.
So if you need defined assignment to make objects of your type do
sensible things with assignment, you should be providing defined assignment.
(Another somewhat relevant principle is that the more that can be
defended against at compile time, the better.)
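To make that concrete, here is a minimal sketch of a container that owns its storage through a pointer component, with defined assignment doing a deep copy and a final subroutine releasing the storage. The names (owning_container_sketch, t_container, fill) are hypothetical and not taken from your code; the component is left public purely for illustration.

```fortran
module owning_container_sketch
  implicit none
  private
  public :: t_container

  ! Hypothetical container that owns heap storage via a pointer component.
  type :: t_container
    real, pointer :: data(:) => null()
  contains
    procedure :: fill
    procedure, private :: assign_container
    generic :: assignment(=) => assign_container
    final :: destroy_container
  end type t_container

contains

  ! Replace the stored values with a copy of the supplied ones.
  subroutine fill(this, values)
    class(t_container), intent(inout) :: this
    real, intent(in) :: values(:)
    if (associated(this%data)) deallocate(this%data)
    allocate(this%data(size(values)))
    this%data = values
  end subroutine fill

  ! Defined assignment: deep copy, so two containers never share storage.
  ! The copy is made before the old storage is released, so "a = a" is safe.
  subroutine assign_container(lhs, rhs)
    class(t_container), intent(inout) :: lhs
    type(t_container), intent(in) :: rhs
    real, pointer :: fresh(:)
    nullify(fresh)
    if (associated(rhs%data)) then
      allocate(fresh(size(rhs%data)))
      fresh = rhs%data
    end if
    if (associated(lhs%data)) deallocate(lhs%data)
    lhs%data => fresh
  end subroutine assign_container

  ! Finalizer: each container releases only the storage it owns.
  subroutine destroy_container(this)
    type(t_container), intent(inout) :: this
    if (associated(this%data)) deallocate(this%data)
  end subroutine destroy_container

end module owning_container_sketch
```

With this in place, "b = a" leaves a and b with separate storage, so finalizing both does not free the same memory twice. Remove the generic assignment binding and the same statement becomes a shallow copy, which is the double free situation discussed above.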
With F2003, the natural way to copy the value of a polymorphic object is
to use ALLOCATE(dest, SOURCE=source). It very much bothers me that
this natural way will break resource management objects that rely on
finalization and defined assignment, even though I can write "don't use
ALLOCATE(dest, SOURCE=source)" in the documentation for my objects.
(VALUE arguments also have issues, but they aren't particularly natural
for Fortran procedures, so perhaps they can be ignored.)
Bear in mind that issues with assignment and ALLOCATE(...SOURCE=xxx)
apply to aggregates too, perhaps in the case where the problematic type
is several levels deep amongst the detailed implementation of a
component hierarchy, and not particularly visible to the programmer
writing an otherwise innocuous looking statement.
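A small sketch of the problem, under my reading of the standard and with hypothetical names (sourced_allocation_sketch, t_handle, assignment_calls): the type below has a deep-copying defined assignment that also counts how often it is called. Ordinary assignment goes through it; ALLOCATE(dest, SOURCE=source) does not, so the pointer component ends up shared.

```fortran
module sourced_allocation_sketch
  implicit none
  private
  public :: t_handle, assignment_calls

  ! Counts calls to the defined assignment (illustration only).
  integer, protected :: assignment_calls = 0

  ! Hypothetical resource handle with a pointer component.
  type :: t_handle
    integer, pointer :: payload => null()
  contains
    procedure, private :: assign_handle
    generic :: assignment(=) => assign_handle
  end type t_handle

contains

  ! Defined assignment: copy the value pointed at, not the association.
  subroutine assign_handle(lhs, rhs)
    class(t_handle), intent(inout) :: lhs
    type(t_handle), intent(in) :: rhs
    assignment_calls = assignment_calls + 1
    if (associated(rhs%payload)) then
      if (.not. associated(lhs%payload)) allocate(lhs%payload)
      lhs%payload = rhs%payload
    else if (associated(lhs%payload)) then
      deallocate(lhs%payload)
    end if
  end subroutine assign_handle

end module sourced_allocation_sketch
```

In client code, after "b = a" the counter is incremented and b%payload has its own target. After "allocate(c, source=a)" the counter is unchanged and c%payload is associated with the same target as a%payload. I deliberately left out a final subroutine here: adding one that deallocates payload is what turns the shared association into a double free.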
"Workaround" is probably being too nice. It is a hack that
unfortunately needs to be considered with the language as it is today.
> Some difficulties that I see are:
> 1) the code for the generic iterable container can not be compiled on
> its own, it must be included
This is a good point. Ideally, given the specification of the
requirements of the object (as per immediately below), you want the
compiler to check that the generic code is internally consistent with
those requirements - e.g. if your generic code calls a binding of an
object of parameterised type, then the requirements on that object must
explicitly specify that such a binding is accessible.
I don't think that "extending" - taking that to specifically mean type
extension - has to be the way this is done. Others have mentioned
"interfaces" as they appear in Java and other languages - I think that is
likely to be more useful - please supply a type that has a given
not-the-fortran-concept-of-interface (given characteristics?), where
that not-the-fortran-concept-of-interface looks like some subset of that
of the numeric intrinsic types.
>> Additional language features that could be used to simplify things in
>> both library and client code include the F2008 feature that permits a
>> function reference with data pointer result to appear as a variable, and
>
> Interesting! Would this allow writing getters which look like
> components, such as container%get_data()(1:2:10) ?
No. You cannot, today, chain subobject selectors, such as a subscript
list, substring range or component selector, onto a function reference.
You could provide something approaching the equivalent of the subobject
selector through additional arguments to your get_data function:
    ! Start at one, go to 2, stepping 10
    container%get_data(1,2,10) = xxx
I am in two minds about whether this is a problem or not. Careful
thought would need to be put into possible syntax ambiguity if the
possibility of chaining was considered.
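A minimal sketch of such a get_data, assuming a hypothetical container type with a pointer array component called values (none of these names come from your code). The function returns a pointer to the requested section; the extra arguments play the role of the subscript triplet.

```fortran
module section_getter_sketch
  implicit none
  private
  public :: t_container

  ! Hypothetical container; a pointer component guarantees a valid target.
  type :: t_container
    real, pointer :: values(:) => null()
  contains
    procedure :: get_data
  end type t_container

contains

  ! Return a pointer to values(first:last:stride).
  function get_data(this, first, last, stride) result(section)
    class(t_container), intent(in) :: this
    integer, intent(in) :: first, last, stride
    real, pointer :: section(:)
    section => this%values(first:last:stride)
  end function get_data

end module section_getter_sketch
```

With a compiler that implements the F2008 feature, a client could then write container%get_data(1, 9, 2) = 1.0 directly. Without it, the same effect needs a local pointer: p => container%get_data(1, 9, 2) followed by p = 1.0.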
>> the F2008 feature of intrinsic polymorphic allocation with F2003
>> functions with allocatable polymorphic results to provide container and
>> iterator constructors.
>
> Could you provide an example of this?
In the example code from your post above, you had a subroutine to return
a new iterator. It looked like:
    subroutine new_iterator(it,iterator)
      class(t_iterable_impl), intent(in) :: it
      class(c_iterator), allocatable, intent(out) :: iterator
      allocate( t_iterator_impl :: iterator )
      select type(iterator)
      class is(t_iterator_impl)
        if(size(it%objects).gt.0) then
          iterator%here = 1
          iterator%items => it%objects
        else
          iterator%here = -1
        endif
      end select
    end subroutine new_iterator
and referenced in client code:
    class(c_iterator), allocatable :: iter
    ...
    call vars%new_iterator(iter)
That could be rewritten:
    function new_iterator(it) result(iterator)
      class(t_iterable_impl), intent(in) :: it
      type(t_iterator_impl) :: iterator
      if(size(it%objects).gt.0) then
        iterator%here = 1
        iterator%items => it%objects
      else
        iterator%here = -1
      endif
    end function new_iterator
and referenced in client code:
    class(c_iterator), allocatable :: iter
    ...
    iter = vars%new_iterator()
>> With all the above I am merely highlighting generic programming
>> alternatives with the current state of the language, I am not pretending
>> that those alternatives are ideal or acceptable.
>>
>>> 2) the implementation has to use pointers to ensure that things are
>>> TARGET. It would be nice to say that some components of a derived type
>>> are TARGETs due to their internal representation, without asking the
>>> user to remember this.
>>
>> Yes. The need to use pointer components to robustly ensure that
>> something is guaranteed to be a target is a nuisance. However,
>> conceptually this could get tricky.
>>
>> An example of using INCLUDE to implement a generic "shared pointer" can
>> be found at www.megms.com.au/shared-pointers.htm
>
> Interesting! Are you aware of any proposed extension of the language
> that would allow this without INCLUDE? I wonder whether one could
> rewrite your code with such an extension together with an "ad hoc"
> preprocessor to produce standard Fortran code. Parametrized modules
> have been mentioned on this list by you and others; do you also have
> examples using them?
There have been many proposals. In terms of things reasonably concrete
that I've seen, a proposal for parameterized modules that was considered
for F2008 is at
http://j3-fortran.org/doc/year/05/05-195.pdf . That was
followed by a proposal for intelligent macros - see
http://j3-fortran.org/doc/year/05/05-280.txt - which was removed from
F2008 at quite a late stage.
(I am somewhat relieved that the macro approach was removed, as I think
the generic code that resulted with that approach was difficult to read,
plus I think there is reasonable experience from other languages that a
token substitution approach operates at too low a level - the author of
the generic code needs to be able to describe their requirements for
client supplied parameters in terms of the entities that result from the
semantic analysis of tokens.)
>> Other issues that I have encountered while putting that together:
>>
>> - Fortran's model of finalization has what I regard as significant holes
>> to do with ALLOCATE(..., SOURCE=xxx) and VALUE dummy arguments - the
>> generic library code loses control of aspects of construction and
>> finalization of those objects.
>
> As mentioned above, I also don't like source allocation. I think there
> should be a way to redefine this, much as there is now a way to
> redefine what WRITE means.
Yes - it may be possible to fix this hole by including a "value
construction binding" that is invoked as part of ALLOCATE(dest,
SOURCE=source).
Things become messier when considering VALUE arguments.
> Also, I don't like the overlapping between source allocation and
> assignment. I think that both of them are somehow too smart.
> (Concerning assignment, yes, it can be redefined, which is already
> good, but there are many overloaded operators and it is not easy to
> redefine *all* of them).
Assignment is special, compared to operators. If you don't provide a
definition of an operator for your derived type and a client uses that
operator, they get a diagnostic from the compiler. You don't "redefine"
a particular operator, you extend its definition to cover your type. You
cannot redefine the intrinsic operations.
But a user can, today, always use assignment, whether you [re]define it
or not.
>> - Often containers require defined assignment as part of their
>> implementation. Defined assignment is part of the internal
>> implementation, but it suppresses intrinsic assignment and the
>> associated reallocation on assignment of an object, which is visible to
>> clients. The workaround is to use an intermediate wrapper type in the
>> library code, which, as far as I can tell, insulates the client code.
>
> Yes, again, I see you are also not completely happy with assignment.
>
>> - Forwarding calls of a type bound procedure of a contained object
>> requires use of a temporary as you cannot apply binding references to
>> function results. (You can apply operators which map to type bound
>> procedures, and you can use non-type bound procedure references.)
>
> How is this related to your comment above about
>
>> F2008 feature that permits a function reference with data pointer
>> result to appear as a variable
Above, the inability to put a subobject selector of some sort on the end
of a function reference was discussed. You cannot stick a binding
reference on a function reference either. In the following, if get_data
is a binding of the container object to a function that returns an
object that is of a type that has a binding `some_binding`, you cannot
do something like:
    xyz = container%get_data()%some_binding(x)
In the first instance, you need a temporary:
    class(xxx), allocatable :: tmp
    tmp = container%get_data()
    xyz = tmp%some_binding(x)
(The parenthetical remarks were there because you can do something like
the following, which is simply using a function reference as an actual
argument:
    xyz = some_procedure(container%get_data(), x)
where some_procedure does the same thing as some_binding, though without
the dispatch based on dynamic type that bindings involve.
If you still want that dynamic dispatch, and the nature of the other
arguments to the procedure is such that they can be operands, then you
could alternatively write something like:
    xyz = container%get_data() .someoperator. x
where .someoperator. is a type bound binary defined operator.)
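A sketch of that last alternative, with hypothetical names throughout (t_item, t_box, scaled_by and the operator .scaledby. are made up for illustration). The operator is a generic binding of the contained type, so it can be applied straight to the function result, with no temporary.

```fortran
module operator_forwarding_sketch
  implicit none
  private
  public :: t_item, t_box

  ! Hypothetical contained type with a type bound binary defined operator.
  type :: t_item
    integer :: value = 0
  contains
    procedure :: scaled_by
    generic :: operator(.scaledby.) => scaled_by
  end type t_item

  ! Hypothetical container with a getter that returns the contained object.
  type :: t_box
    type(t_item) :: item
  contains
    procedure :: get_data
  end type t_box

contains

  ! The operation that would otherwise be written item%scaled_by(rhs).
  function scaled_by(lhs, rhs) result(res)
    class(t_item), intent(in) :: lhs
    integer, intent(in) :: rhs
    integer :: res
    res = lhs%value * rhs
  end function scaled_by

  function get_data(this) result(item)
    class(t_box), intent(in) :: this
    type(t_item) :: item
    item = this%item
  end function get_data

end module operator_forwarding_sketch
```

Client code can then write xyz = box%get_data() .scaledby. 4, whereas xyz = box%get_data()%scaled_by(4) is not permitted. Because the operator resolves to a type bound procedure, an extension of t_item that overrides scaled_by would still be dispatched on dynamic type.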