Efficiency of array memory allocation for derived types with type-bound procedures

Stefano Zaghi

Oct 24, 2012, 2:55:10 AM

Hi all,
I am trying to figure out if the use of type-bound procedures could compromise the efficiency of memory allocation.

Suppose you have a derived type like the following:

type, public:: Type_Vector_Eff
  sequence
  real:: x,y,z
end type Type_Vector_Eff
type(Type_Vector_Eff):: vec_eff(1:1000,1:1000)

Because of the "sequence" statement, the elements of the 2D array "vec_eff" are allocated sequentially, so loop operations over them are efficient.

Now suppose you want to modify the above derived type and introduce some type-bound procedures. In this case the "sequence" statement is not legal:

type, public:: Type_Vector
  real:: x,y,z
contains
  procedure:: init
end type Type_Vector
type(Type_Vector):: vec(1:1000,1:1000)

Now, how are the elements of "vec" allocated?

Is it possible that its elements are not allocated sequentially, so that access to them in loops is inefficient?

Is it possible that the type-bound procedure "init" makes access to the memory of the array "vec" inefficient?

Thank you for any suggestions.

glen herrmannsfeldt

Oct 24, 2012, 3:42:40 AM

Stefano Zaghi <stefan...@gmail.com> wrote:

> I am trying to figure out if the use of type-bound procedures could
> compromise the efficiency of memory allocation.

> Suppose you have e derived type like the following:

> type, public:: Type_Vector_Eff
> sequence
> real:: x,y,z
> end type Type_Vector_Eff
> type(Type_Vector_Eff):: vec_eff(1:1000,1:1000)

> Because of the presence of "sequence" statement the 2D array
> "vec_eff" has its own elements allocated sequentially thus the
> loops operations over them are efficient.

What do you mean by allocated sequentially?

Without SEQUENCE the compiler is allowed to rearrange structure
members, so it might come out Y,Z,X or Z,X,Y, though it is
pretty hard to see why it would do that.

In the case of members of different size, it might save padding,
but not in this case.

There are cases where there is an efficiency (mostly cache)
difference between array of structures and structure of
arrays, though as far as I know the compiler normally
won't do that.

(There is a story about a C compiler doing it for a SPEC
benchmark program, though it isn't supposed to be done
in C, either.)

> Now suppose you want to modify the above derived type and introduce
> some type-bound procedures. In this case the "sequence"
> statement is not legal:

> type, public:: Type_Vector
> real:: x,y,z
> contains
> procedure:: init
> end type Type_Vector
> type(Type_Vector):: vec(1:1000,1:1000)

> Now how are allocated the elements of "vec"?

I would be surprised to see it change.

> Is it possible that its elements are not sequentially allocated
> thus the access to them is not efficient into loops operations?

The structure is small enough that I would be surprised to
see a difference, but it is possible.

> Is it possible that the type-bound procedure "init" makes not
> efficient the access to the memory of array "vec"?

Possible, yes, but not likely. But you don't say how you are
accessing the array, so it is hard to say more.

-- glen

Stefano Zaghi

Oct 24, 2012, 3:57:21 AM

Thank you Glen, you are very kind.

> What do you mean by allocated sequentially?
> Without SEQUENCE the compiler is allowed to rearrange structure
> members, so it might come out Y,Z,X or Z,X,Y, though it is
> pretty hard to see why it would do that.

You are right. I am (often) not clear.

With "sequence" I assumed that the order is exactly "x,y,z", whereas without "sequence" and with type-bound procedures I was wondering whether
the order could be, e.g., y,"something",z,x. In other words, I was concerned about the presence of some strange padding.

> In the case of members of different size, it might save padding,
> but not in this case.

Is this true even with type-bound procedures?

> There are cases where there is an efficiency (mostly cache)
> difference between array of structures and structure of
> arrays, though as far as I know the compiler normally
> won't do that.
>
> (There is a story about a C compiler doing it for a SPEC
> benchmark program, though it isn't supposed to be done
> in C, either.)

Nice...

> > Now how are allocated the elements of "vec"?
> I would be surprised to see it change.

Why? I am a bit confused. Why does the presence of type-bound procedures make the "sequence" statement illegal? I guessed that it is due to some memory padding (obscure to me).

> > Is it possible that its elements are not sequentially allocated
> > thus the access to them is not efficient into loops operations?
> The structure is small enough that I would be surprised to
> see a difference, but it is possible.

Good news.

> > Is it possible that the type-bound procedure "init" makes not
> > efficient the access to the memory of array "vec"?
> Possible, yes, but not likely. But you don't say how you are
> accessing the array, so it is hard to say more.

A typical access could be:

do j=1,1000
  do i=1,1000
    vec3(i,j) = vec1(i,j) + vec2(i,j)
  enddo
enddo

Thank you again.

glen herrmannsfeldt

Oct 24, 2012, 5:33:43 AM

Stefano Zaghi <stefan...@gmail.com> wrote:

(snip, I wrote)

>> What do you mean by allocated sequentially?
>> Without SEQUENCE the compiler is allowed to rearrange structure
>> members, so it might come out Y,Z,X or Z,X,Y, though it is
>> pretty hard to see why it would do that.

> You are right. I am (often) not clear.

> Using "sequence" I have supposed that the order is exactly
> "x,y,z" whereas without "sequence" AND type-bound procedures
> I was wondering if it is possible that the order is, e.g.,
> y,"somethings",z,x. In other words I was concerned about
> the presence of some strange padding.

>> In the case of members of different size, it might save padding,
>> but not in this case.

(snip)

> Why? I am a bit confused. Why the presence of type-bound
> procedures makes illegal the statement "sequence"?
> I have guessed that is due to some (obscure for me) memory padding.

I don't know about that one.

(snip)

>> Possible, yes, but not likely. But you don't say how you are
>> accessing the array, so it is hard to say more.

> A typical access could be:

> do j=1,1000
> do i=1,1000
> vec3(i,j) = vec1(i,j) + vec2(i,j)
> enddo
> enddo

Now I am confused. Are you sure you are asking about type-bound
procedures and not defined operations?

I have commented before on the fact that Fortran doesn't
have structure expressions, though it does have array
expressions.

If you are asking about defined operations, then you might
want to worry about efficiency. It seems that many
implementations do a procedure call for each operation.
If you separately add the three members, then the compiler
should just generate the code inline, but likely not
for defined operations.
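
To make the two forms concrete, here is a minimal sketch, not taken from the original posts: the module name vec_ops, the function add, and the array size are made up for illustration. It performs the same addition once through a defined operation and once component by component, then checks that the results agree. Whether the defined operation is inlined depends on the compiler and its options, so the sketch says nothing about speed by itself.

```fortran
module vec_ops
  implicit none
  private
  public :: Type_Vector, operator(+)

  type :: Type_Vector
    real :: x = 0.0, y = 0.0, z = 0.0
  end type Type_Vector

  ! Defined operation: "+" between two Type_Vector values calls add.
  interface operator(+)
    module procedure add
  end interface

contains

  elemental function add(a, b) result(c)
    type(Type_Vector), intent(in) :: a, b
    type(Type_Vector) :: c
    c%x = a%x + b%x
    c%y = a%y + b%y
    c%z = a%z + b%z
  end function add

end module vec_ops

program compare_forms
  use vec_ops
  implicit none
  integer, parameter :: n = 100   ! hypothetical size, smaller than in the thread
  type(Type_Vector), allocatable :: vec1(:,:), vec2(:,:), vec3(:,:), vec4(:,:)
  integer :: i, j

  allocate(vec1(n,n), vec2(n,n), vec3(n,n), vec4(n,n))
  vec1 = Type_Vector(1.0, 2.0, 3.0)
  vec2 = Type_Vector(0.5, 0.5, 0.5)

  ! Form 1: defined operation (may or may not be inlined).
  do j = 1, n
    do i = 1, n
      vec3(i,j) = vec1(i,j) + vec2(i,j)
    end do
  end do

  ! Form 2: the three components added separately.
  do j = 1, n
    do i = 1, n
      vec4(i,j)%x = vec1(i,j)%x + vec2(i,j)%x
      vec4(i,j)%y = vec1(i,j)%y + vec2(i,j)%y
      vec4(i,j)%z = vec1(i,j)%z + vec2(i,j)%z
    end do
  end do

  if (all(vec3%x == vec4%x) .and. all(vec3%y == vec4%y) &
      .and. all(vec3%z == vec4%z)) then
    print *, 'both forms agree'
  else
    error stop 'forms differ'
  end if
end program compare_forms
```

Timing the two loops separately (for example with system_clock) at the optimization level you actually use is the only reliable way to see whether the defined operation costs anything on a given compiler.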

-- glen

Stefano Zaghi

Oct 24, 2012, 5:55:59 AM

> > do j=1,1000
> > do i=1,1000
> > vec3(i,j) = vec1(i,j) + vec2(i,j)
> > enddo
> > enddo

> Now I am confused. Are you sure you are asking about type-bound
> procedures and not defined operations?

I am asking "from a general point of view".

I have a derived type similar to the one described above, with type-bound procedures and, as a consequence, without the "sequence" statement. Among other things, I perform costly loop operations using overloaded operators, similar to (in general more complex than) the loops above. I was concerned about the effect on loop efficiency of the type-bound procedures (and, as a consequence, of the absence of the sequence statement, and finally of possible "strange" padding of the array elements).

> If you are asking aobut defined operations, then you might
> want to worry about efficiency. It seems that many
> implementations do a procedure call for each operation.
> If you separately add the three members, then the compiler
> should just generate the code inline, but likely not
> for defined operations.

Indeed, the operations involved are of many different kinds. There are loops that use overloaded operators on array elements, others that use the derived type members directly, and so on.

I suppose that the question can be summarized as:

Does the presence of type-bound procedures have an effect on the memory padding of the elements of an array?

If the answer is yes, I am concerned about efficient access to the array memory in costly loop operations.

Thank you again for your support.

glen herrmannsfeldt

Oct 24, 2012, 6:17:05 AM

Stefano Zaghi <stefan...@gmail.com> wrote:
>> > do j=1,1000
>> > do i=1,1000
>> > vec3(i,j) = vec1(i,j) + vec2(i,j)
>> > enddo
>> > enddo

>> Now I am confused. Are you sure you are asking about type-bound
>> procedures and not defined operations?

> I am asking "from a general point view".

> I have a derived type similar to the one above described, with
> type-bound procedures and, as a consequence, without
> "sequence" statement. Among other things I perform "costly
> loops operations" using overloaded operators similar (more
> complex in general) to the loops above.

As far as I know, overloaded operators (defined operations)
have a significant overhead. I would avoid them when
speed is important.

> I was concerned about the effects of the presence of
> type-bound procedures (and, as a consequence, of the absence
> of the sequence statement, and finally to the possible presence
> of "stange" padding of arrays elements) on loops efficiency.

Well, the idea behind padding is efficiency, especially
alignment. It might be that on some systems it is more
efficient to pad such that the size is four words long.

That reduces a little the chance of keeping everything
in cache, but then again your 1 million element loop
will likely fill the cache.

>> If you are asking aobut defined operations, then you might
>> want to worry about efficiency. It seems that many
>> implementations do a procedure call for each operation.
>> If you separately add the three members, then the compiler
>> should just generate the code inline, but likely not
>> for defined operations.

> Indeed the operations involved range in many different kinds.
> There are loops operations with overloaded operators accessing
> to array elements, others with direct use of derived type
> members, and so on.

> I suppose that the question can be summarized as:

> Has the presence of type-bound procedures an effect on the
> memory padding of the elements of an array?

Probably not compared to the cost of overloaded operators.

> If the answer is yes, I am concerned about the efficient
> access to the array memory into costly loop operations.

-- glen

Stefano Zaghi

Oct 24, 2012, 6:26:29 AM

> As far as I know, overloaded operators (defined operations)
> have a significant overhead. I would avoid them when
> speed is important.

You are right, I am conscious of this overhead.

> Well, the idea behind padding is effeciency, especially
> alignment. It might be that on some systems it is more
> efficient to pad such that the size is four words long.

Yes, you are right again. My concern is about the "control" of possible padding: as far as I know, padding is either programmed directly (when I was a student I used explicitly padded arrays) or introduced by a clever compiler. Possible padding due to the presence of the type-bound procedures sounds like an "uncontrolled source of padding".

> > Has the presence of type-bound procedures an effect on the
> > memory padding of the elements of an array?
> Probably not compared to the cost of overloaded operators.

Ok.

Thank you again Glen.

Ian Harvey

Oct 24, 2012, 6:35:25 AM

On 2012-10-24 8:33 PM, glen herrmannsfeldt wrote:
> Stefano Zaghi <stefan...@gmail.com> wrote:
>
> (snip, I wrote)
>>> What do you mean by allocated sequentially?
>>> Without SEQUENCE the compiler is allowed to rearrange structure
>>> members, so it might come out Y,Z,X or Z,X,Y, though it is
>>> pretty hard to see why it would do that.
>
>> You are right. I am (often) not clear.
>
>> Using "sequence" I have supposed that the order is exactly
>> "x,y,z" whereas without "sequence" AND type-bound procedures
>> I was wondering if it is possible that the order is, e.g.,
>> y,"somethings",z,x. In other words I was concerned about
>> the presence of some strange padding.

In the non-sequence case, if there was padding, it would more likely be
{x, y, z, pad}, {x, y, z, pad}, ....; or perhaps {x, pad, y, pad, z,
pad}, {x, pad, y, pad, z ,pad}.

If the compiler put padding in there, it would likely be because it
thought that it would improve performance.

Sequence types have additional constraints on them. In the general case
those constraints are going to hinder optimisation - the compiler has
less freedom to rearrange things optimally.

You use sequence types when you need to stick them in common or
equivalence them with things - i.e. when you need to do sequence
association stuff. If you don't need to do sequence association stuff,
don't use sequence types.

>
>>> In the case of members of different size, it might save padding,
>>> but not in this case.
>
> (snip)
>
>> Why? I am a bit confused. Why the presence of type-bound
>> procedures makes illegal the statement "sequence"?
>> I have guessed that is due to some (obscure for me) memory padding.

To implement type bound procedures the compiler may need to store
additional internal information (perhaps a pointer to a dispatch table
or similar) with each polymorphic object. That internal information
cannot be sanely sequence associated with anything else, hence type
bound procedures and sequence types are incompatible.

Perhaps an orthogonal point - in the absence of good reasons to the
contrary, my recommendation is that if you are not doing "type bound
stuff" with your type bound procedures (i.e., you don't ever need to
decide at runtime which specific procedure gets called for a binding
based on the dynamic type of an object), then don't use type bound
procedures. The syntax of the object%binding procedure reference might
be alluring, but it comes with a cost.

>
> I don't know about that one.
>
> (snip)
>
>>> Possible, yes, but not likely. But you don't say how you are
>>> accessing the array, so it is hard to say more.
>
>> A typical access could be:
>
>> do j=1,1000
>> do i=1,1000
>> vec3(i,j) = vec1(i,j) + vec2(i,j)
>> enddo
>> enddo
>
> Now I am confused. Are you sure you are asking about type-bound
> procedures and not defined operations?
>
> I have commented before on the fact that Fortran doesn't
> have structure expressions, though it does have array
> expressions.
>
> If you are asking aobut defined operations, then you might
> want to worry about efficiency. It seems that many
> implementations do a procedure call for each operation.
> If you separately add the three members, then the compiler
> should just generate the code inline, but likely not
> for defined operations.

I would expect (but perhaps I'm lining myself up for disappointment...)
that modern compilers could inline defined operations, with "whole of
program" type optimisation. Some simple tests here show that one of the
compilers I use can.

Rafik Zurob

Oct 24, 2012, 6:37:30 AM

Hi

Sequence means that the compiler should not insert padding
before/after/between the derived type components and should lay them out in
memory in the order they appear in the derived type definition. This is
useful when a derived type object is overlaid (e.g. part of equivalence,
common, or argument association without an interface) or when you care about
the exact size of the type.

Since components x, y, and z in your example are well aligned and don't
require padding, there should be no difference in speed between the case
with sequence and the case without. In fact, you can get worse performance
with sequence if you're not careful to ensure that the components are
aligned to their natural boundaries. Without sequence, the compiler will
ensure the components are laid out for fast access.

Also, note that type-bound procedures don't occupy any space in the derived
type. So even without sequence, the derived type will most likely be 12
bytes long, assuming real is 4 bytes. Also, since you declared vec_eff
using TYPE instead of CLASS, calling vec_eff%init can be resolved at compile
time and should be equivalent in speed to calling init(vec_eff) directly.
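
One way to check the size claim on a particular compiler is sketched below. It assumes a compiler that supports the Fortran 2008 intrinsic storage_size; the module name vector_types and the body of init are made up for illustration. The numbers printed are whatever the compiler chooses, so no particular output is promised here.

```fortran
module vector_types
  implicit none
  private
  public :: Type_Vector_Eff, Type_Vector

  ! Sequence type, as in the first post.
  type :: Type_Vector_Eff
    sequence
    real :: x, y, z
  end type Type_Vector_Eff

  ! Same components, plus a type-bound procedure (so no SEQUENCE).
  type :: Type_Vector
    real :: x, y, z
  contains
    procedure :: init
  end type Type_Vector

contains

  ! Hypothetical body for init: the thread never shows one.
  subroutine init(self)
    class(Type_Vector), intent(out) :: self
    self%x = 0.0
    self%y = 0.0
    self%z = 0.0
  end subroutine init

end module vector_types

program check_size
  use vector_types
  implicit none
  type(Type_Vector_Eff) :: a
  type(Type_Vector)     :: b

  call b%init()

  ! storage_size returns the size in bits of an array element of that type.
  print *, 'bits in a default real:        ', storage_size(1.0)
  print *, 'bits in the sequence type:     ', storage_size(a)
  print *, 'bits in the type with binding: ', storage_size(b)
end program check_size
```

If the last two numbers are equal, the type-bound procedure has added nothing to each array element on that compiler.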

Regards

Rafik
Visit the Fortran Cafe at:
https://www.ibm.com/developerworks/mydeveloperworks/groups/service/html/communityview?communityUuid=b10932b4-0edd-4e61-89f2-6e478ccba9aa

"Stefano Zaghi" <stefan...@gmail.com> wrote in message
news:4951d478-6b4e-449f...@googlegroups.com...

Stefano Zaghi

Oct 24, 2012, 6:49:18 AM

Dear Ian, thank you.

> In the non-sequence case, if there was padding, it would more likely be
> {x, y, z, pad}, {x, y, z, pad}, ....; or perhaps {x, pad, y, pad, z,
> pad}, {x, pad, y, pad, z ,pad}.
>
> If the compiler put padding in there, it would likely be because it
> thought that it would improve performance.

Yes, I know that padding is useful. As I said before, I was not sure whether possible padding caused by the type-bound procedures would hurt performance.

> Sequence types have additional constraints on them. In the general case
> those constraints are going to hinder optimisation - the compiler has
> less freedom to rearrange things optimally.

You are right.

> You use sequence types when you need to stick them in common or
> equivalence them with things - i.e. when you need to do sequence
> association stuff. If you don't need to do sequence association stuff,
> don't use sequence types.

Can you give me an example of "sequence association stuff"? This can improve my knowledge.

> To implement type bound procedures the compiler may need to store
> additional internal information (perhaps a pointer to a dispatch table
> or similar) with each polymorphic object. That internal information
> cannot be sanely sequence associated with anything else, hence type
> bound procedures and sequence types are incompatible.

This was exactly what I have guessed.

> Perhaps an orthogonal point - in the absence of good reasons to the
> contrary, my recommendation is that if you are not doing "type bound
> stuff" with your type bound procedures (i.e., you don't ever need to
> decide at runtime which specific procedure gets called for a binding
> based on the dynamic type of an object), then don't use type bound
> procedures. The syntax of the object%binding procedure reference might
> be alluring, but it comes with a cost.

Thank you very much. I was tempted by the object%binding syntax, but I do not really need "type-bound" stuff.

> I would expect (but perhaps I'm lining myself up for disappointment...)
> that modern compilers could inline defined operations, with "whole of
> program" type optimisation. Some simple tests here show that one of the
> compilers I use can.

I have performed some benchmarks and I find that, in my case, the overhead of defined operations for the above derived type is negligible. With other derived types (e.g. ones with allocatable array members) it is not.

My conclusion is to avoid type-bound procedures.

Thank you all.

Stefano Zaghi

Oct 24, 2012, 6:54:24 AM

> Also, note that type-bound procedures don't occupy any space in the derived
> type. So even without sequence, the derived type will most likely be 12
> bytes long, assuming real is 4 bytes. Also, since you declared vec_eff
> using TYPE instead of CLASS, calling vec_eff%init can be resolved at compile
> time and should be equivalent in speed to calling init(vec_eff) directly.

Now I am confused. Reading Ian's comments I understood exactly the opposite. So the presence of type-bound procedures does not matter for the above derived type? There is no strange padding? Are all the calls resolved at compile time?

Thank you Rafik.

Ian Harvey

Oct 24, 2012, 7:07:36 AM

On 2012-10-24 9:49 PM, Stefano Zaghi wrote:
> Dear Ian, thank you.
>
>> In the non-sequence case, if there was padding, it would more likely be
>> {x, y, z, pad}, {x, y, z, pad}, ....; or perhaps {x, pad, y, pad, z,
>> pad}, {x, pad, y, pad, z ,pad}.
>>
>> If the compiler put padding in there, it would likely be because it
>> thought that it would improve performance.
>
> Yes, I know the padding usefulness. As I said before I was not sure the possible padding of the type-bound procedures waste the performance.
>
>> Sequence types have additional constraints on them. In the general case
>> those constraints are going to hinder optimisation - the compiler has
>> less freedom to rearrange things optimally.
>
> You are right.
>
>> You use sequence types when you need to stick them in common or
>> equivalence them with things - i.e. when you need to do sequence
>> association stuff. If you don't need to do sequence association stuff,
>> don't use sequence types.
>
> Can you give me an example of "sequence association stuff"? This can improve my knowledge.

Perhaps I should have said "storage association" rather than sequence
association.

TYPE sequence_type
  SEQUENCE
  INTEGER :: a
  REAL :: b
END TYPE sequence_type

COMMON /my_common_name/ object
TYPE(sequence_type) :: object

(in another program unit...)

COMMON /my_common_name/ va, vb
INTEGER :: va
REAL :: vb

object%a is storage associated with va, object%b with vb.

>> To implement type bound procedures the compiler may need to store
>> additional internal information (perhaps a pointer to a dispatch table
>> or similar) with each polymorphic object. That internal information
>> cannot be sanely sequence associated with anything else, hence type
>> bound procedures and sequence types are incompatible.
>
> This was exactly what I have guessed.
>
>> Perhaps an orthogonal point - in the absence of good reasons to the
>> contrary, my recommendation is that if you are not doing "type bound
>> stuff" with your type bound procedures (i.e., you don't ever need to
>> decide at runtime which specific procedure gets called for a binding
>> based on the dynamic type of an object), then don't use type bound
>> procedures. The syntax of the object%binding procedure reference might
>> be alluring, but it comes with a cost.
>
> Thank you very much. I was tempted by the object%binding syntax, but I have not really need of "type-bound" stuff.
>
>> I would expect (but perhaps I'm lining myself up for disappointment...)
>> that modern compilers could inline defined operations, with "whole of
>> program" type optimisation. Some simple tests here show that one of the
>> compilers I use can.
>
> I have performed some benchmarks and I find that, in my case, the overhead of defined operations for the above derived type is negligible. Instead with other derived type (e.g. allocatable arrays members) it is not.
>
> My conclusion is to avoid type-bound procedures.

Well, to clarify - avoid type bound procedures if you don't need type
bound procedures. If you need type bound procedures (and they are
rather handy), then by all means use type bound procedures.

Ian Harvey

Oct 24, 2012, 7:25:08 AM

There is a difference between TYPE (not polymorphic) and CLASS
(polymorphic). When an object is not polymorphic, the dynamic type of
the object is the same as the declared type and the compiler doesn't
have to store extra internal information.

When the object is polymorphic, the dynamic type of the object can be
any extension of the declared type, and in the general case, the
compiler will not know what the dynamic type is at compile time. Hence
for polymorphic objects a typical implementation is to store additional
information about the dynamic type of the object in the form of a
pointer to a vtable or similar.

(Note that it is one pointer per declared thing - i.e. if you have
declared an array it is one extra pointer for the whole array, not one
extra pointer per array element.)

Inside a type bound procedure, the passed argument is always
polymorphic. That means that, in the general case, the compiler always
has to pass along that extra bit of information that indicates the
dynamic type of the passed argument, regardless of whether the actual
argument was polymorphic or not.

(In specific cases the compiler might be able to do some fancy
optimisation to avoid that, but that's getting pretty fancy indeed.)
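
The TYPE versus CLASS distinction can be sketched as follows. This is an illustration only: base_t, child_t, and the kind binding are hypothetical names, not anything from the posts above. With the TYPE declaration the declared type fixes which procedure is called; with the CLASS declaration the choice depends on the dynamic type at run time.

```fortran
module dispatch_types
  implicit none
  private
  public :: base_t, child_t

  type :: base_t
    integer :: id = 0
  contains
    procedure :: kind_name => base_name
  end type base_t

  ! Extension that overrides the binding.
  type, extends(base_t) :: child_t
  contains
    procedure :: kind_name => child_name
  end type child_t

contains

  function base_name(self) result(s)
    class(base_t), intent(in) :: self
    character(len=5) :: s
    s = 'base'
  end function base_name

  function child_name(self) result(s)
    class(child_t), intent(in) :: self
    character(len=5) :: s
    s = 'child'
  end function child_name

end module dispatch_types

program dispatch_demo
  use dispatch_types
  implicit none
  type(child_t)              :: fixed  ! not polymorphic: dynamic type = declared type
  class(base_t), allocatable :: poly   ! polymorphic: dynamic type chosen at run time

  ! The compiler knows statically that this is child_name.
  print *, 'TYPE object calls:  ', fixed%kind_name()

  ! Here the call goes through whatever the dynamic type of poly is.
  allocate(child_t :: poly)
  print *, 'CLASS object calls: ', poly%kind_name()
end program dispatch_demo
```

An array declared with type(...), like vec in the original question, falls in the first category.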

Stefano Zaghi

Oct 24, 2012, 7:35:57 AM

Hi Ian,
thanks for your help. The "common" example was useful (even if I purged "common" blocks from my life long ago...).

To be a little clearer: I do not really need type-bound procedures (I have no need for truly polymorphic data); I use them only because they are handy.

Regards.

Ron Shepard

Oct 24, 2012, 11:38:37 AM

In article <4951d478-6b4e-449f...@googlegroups.com>,

Stefano Zaghi <stefan...@gmail.com> wrote:

> Hi all,
> I am trying to figure out if the use of type-bound procedures could
> compromise the efficiency of memory allocation.
>
> Suppose you have e derived type like the following:
>
> type, public:: Type_Vector_Eff
> sequence
> real:: x,y,z
> end type Type_Vector_Eff
> type(Type_Vector_Eff):: vec_eff(1:1000,1:1000)
>
> Because of the presence of "sequence" statement the 2D array "vec_eff" has
> its own elements allocated sequentially thus the loops operations over them
> are efficient.

The elements of vec_eff(:,:) are always allocated sequentially,
regardless of the presence of SEQUENCE. The SEQUENCE affects only
what is within the derived type. If you think in terms of low-level
code, the addresses of vec_eff(i,j) and vec_eff(i+1,j) or of
vec_eff(i,j) and vec_eff(i,j+1) will always differ by a fixed amount.
However, the relative offsets for the components may depend on the
presence of SEQUENCE.

The addresses for vec_eff(i,j) and vec_eff(i+1,j) will be spaced
apart the same regardless of the presence of SEQUENCE, and that
spacing value will be the same for all i and j values. Further, the
addresses for vec_eff(i,j)%x and vec_eff(i+1,j)%x will be spaced
apart the same regardless of the presence of SEQUENCE. What may be
different is the spacing between vec_eff(i,j)%x and vec_eff(i,j)%y,
for example; without SEQUENCE, that might even be negative in some
cases but positive in others. But that value will still be the same
regardless of the values of i and j.

To try to relate this to something practical, an expression like
vec_eff(:,:)%x may be used as an actual argument that matches the
assumed shape dummy array

real :: x(:,:)

and in most implementations, this will NOT require anything to be
copied in or out in order for those arguments to match. In
contrast, the dummy array declarations

real :: x(m,n)
real :: x(m,*)

will almost certainly require the compiler to create a temporary
array where the values are copied in and out in order to get
contiguous array elements within the subprogram. All of this is
true regardless of the presence of SEQUENCE in the derived type.
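
A small sketch of the two kinds of dummy argument follows. The names strided_args, sum_assumed, and sum_explicit are made up for illustration, and is_contiguous is a Fortran 2008 intrinsic that older compilers may lack. Whether a temporary copy is actually made for the explicit-shape call is up to the compiler; the sketch only shows that both calls are legal and see the same values.

```fortran
module strided_args
  implicit none
contains

  ! Assumed-shape dummy: can describe a strided section directly.
  function sum_assumed(x) result(s)
    real, intent(in) :: x(:,:)
    real :: s
    print *, 'assumed-shape dummy contiguous? ', is_contiguous(x)
    s = sum(x)
  end function sum_assumed

  ! Explicit-shape dummy: the elements must be contiguous inside
  ! the procedure, so a strided actual argument is typically copied.
  function sum_explicit(m, n, x) result(s)
    integer, intent(in) :: m, n
    real, intent(in) :: x(m,n)
    real :: s
    s = sum(x)
  end function sum_explicit

end module strided_args

program component_section
  use strided_args
  implicit none

  type :: Type_Vector
    real :: x, y, z
  end type Type_Vector

  integer, parameter :: m = 4, n = 3
  type(Type_Vector) :: vec(m,n)
  real :: s1, s2

  vec%x = 1.0
  vec%y = 2.0
  vec%z = 3.0

  ! vec%x is an array of the x components only, with y and z in between.
  s1 = sum_assumed(vec%x)
  s2 = sum_explicit(m, n, vec%x)

  if (s1 == s2) then
    print *, 'both calls see the same data'
  else
    error stop 'sums differ'
  end if
end program component_section
```

Some compilers can warn when they create such a temporary (gfortran, for instance, has -Warray-temporaries), which is a convenient way to see where copies happen.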

>
> Now suppose you want to modify the above derived type and introduce some
> type-bound procedures. In this case the "sequence" statement is not legal:
>
> type, public:: Type_Vector
> real:: x,y,z
> contains
> procedure:: init
> end type Type_Vector
> type(Type_Vector):: vec(1:1000,1:1000)
>
> Now how are allocated the elements of "vec"?

They are still sequential, just as before.

>
> Is it possible that its elements are not sequentially allocated thus the
> access to them is not efficient into loops operations?
>
> Is it possible that the type-bound procedure "init" makes not efficient the
> access to the memory of array "vec"?
>
> Thank you for all suggestions.

Generally, SEQUENCE is used when storage of the derived type must
match some kind of external constraints. Within fortran, that would
include things like common blocks variables and equivalence.
Outside of fortran, it would include things like matching data
structures within an OS call or matching data structures defined in
other programming languages. SEQUENCE prohibits things like padding
and rearrangements of the components.

Also, generally speaking, the use of SEQUENCE suppresses
optimizations and makes the code run slower. You seem to be
thinking that it speeds up the code. That is sort of backwards.
There might be exceptions to this, for example where the compiler
might try to optimize for storage space rather than for execution
speed, but that is not what usually happens. Usually SEQUENCE slows
things down if it has any effect at all.

$.02 -Ron Shepard

Stefano Zaghi

Oct 24, 2012, 11:50:57 AM

Hi Ron, thanks for your help.

> The elements of vec_eff(:,:) are always allocated sequentially,
> regardless of the presence of SEQUENCE. The SEQUENCE affects only
> what is within the derived type. If you think in terms of low-level
> code, the addresses of vec_eff(i,j) and veceff(i+1,j) or of
> vec_eff(i,j) and veceff(i,j+1) will always differ by a fixed amount.
> However, the relative offsets for the components may depend on the
> presence of SEQUENCE.

Yes, this was clear for me.

> To try to relate this to something practical, an expression like
> vec_eff(:,:)%x may be used as an actual argument that matches the
> assumed shape dummy array
> real :: x(:,:)
> and in most implementations, this will NOT require anything to be
> copied in or out in order for those arguments to match. In
> contrast, the dummy array declarations
> real :: x(m,n)
> real :: x(m,*)
> will almost certainly require the compiler to create a temporary
> array where the values are copied in and out in order to get
> contiguous array elements within the subprogram. All of this is
> true regardless of the presence of SEQUENCE in the derived type.

Very nice examples, thanks.

> Generally, SEQUENCE is used when storage of the derived type must
> match some kind of external constraints. Within fortran, that would
> include things like common blocks variables and equivalence.
> Outside of fortran, it would include things like matching data
> structures within an OS call or matching data structures defined in
> other programming languages. SEQUENCE prohibits things like padding
> and rearrangements of the components.
>
> Also, generally speaking, the use of SEQUENCE suppresses
> optimizations and makes the code run slower. You seem to be
> thinking that it speeds up the code. That is sort of backwards.
> There might be exceptions to this, for example where the compiler
> might try to optimize for storage space rather than for execution
> speed, but that is not what usually happens. Usually SEQUENCE slows
> things down if it has any effect at all.

You are right. I do not know why, but in my mind SEQUENCE "ensured that no strange (uncontrolled) padding was added".

I am purging SEQUENCE from my types...

Thank you very much.

Dick Hendrickson

Oct 24, 2012, 11:54:27 AM

On 10/24/12 5:26 AM, Stefano Zaghi wrote:
>> As far as I know, overloaded operators (defined operations)
>> have a significant overhead. I would avoid them when
>> speed is important.
>
> You are right, I am conscious of this overhead.
>

Are you sure? I would think that for reasonably small procedures modern
compilers would inline the code and then run it through the optimizer.
There's no "overhead", only the cost of doing the operations you asked
for. I'm thinking the "overhead" you're referring to involves the call
code and stack management. True, you'll need to compile the code in a
way that lets the compiler inline the operator code and that isn't
always possible or easy with large codes or libraries.

Dick Hendrickson

Stefano Zaghi

Oct 24, 2012, 12:00:55 PM

Hi Dick,

> Are you sure? I would think that for reasonably small procedures modern
> compilers would inline the code and then run it through the optimizer.
> There's no "overhead", only the cost of doing the operations you asked
> for. I'm thinking the "overhead" you're referring to involves the call
> code and stack management. True, you'll need to compile the code in a
> way that lets the compiler inline the operator code and that isn't
> always possible or easy with large codes or libraries.

What I meant was: I know that defined operations COULD introduce an overhead. In my experience, as I said in some previous messages, this overhead is negligible in my cases. That is probably thanks to the optimizer, as you said.

Regards.

Tobias Burnus

Oct 25, 2012, 6:29:07 AM

Ian Harvey wrote:
> Perhaps an orthogonal point - in the absence of good reasons to the
> contrary, my recommendation is that if you are not doing "type bound
> stuff" with your type bound procedures (i.e., you don't ever need to
> decide at runtime which specific procedure gets called for a binding
> based on the dynamic type of an object), then don't use type bound
> procedures. The syntax of the object%binding procedure reference might
> be alluring, but it comes with a cost.

If you talk about computational cost, I don't see a difference between a
normal procedure call, a generic procedure, a defined operator/assignment
and a type-bound procedure, at least as long as a nonpolymorphic TYPE is
used. With CLASS, type-bound procedures have an additional cost
unless they are non_overridable. If they are non_overridable, the compiler
can call the procedure directly without extra cost.
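
To make the distinction concrete, here is a minimal sketch. The module, type and procedure names (vector_sketch, vector_t, norm_sq, scale_by, describe) are invented for illustration and are not from the code under discussion. Whether a compiler actually resolves the non_overridable call statically is up to the implementation; the sketch only shows where it is allowed to.

```fortran
module vector_sketch
  implicit none
  private
  public :: vector_t, describe

  type :: vector_t
    real :: x = 0.0, y = 0.0, z = 0.0
  contains
    ! No extension can override this binding, so even a call through a
    ! CLASS(vector_t) object may be resolved at compile time.
    procedure, non_overridable :: norm_sq
    ! An extension may override this binding, so a call through a
    ! CLASS(vector_t) object generally needs a run-time lookup.
    procedure :: scale_by
  end type vector_t

contains

  pure function norm_sq(self) result(s)
    class(vector_t), intent(in) :: self
    real :: s
    s = self%x**2 + self%y**2 + self%z**2
  end function norm_sq

  subroutine scale_by(self, factor)
    class(vector_t), intent(inout) :: self
    real, intent(in) :: factor
    self%x = factor*self%x
    self%y = factor*self%y
    self%z = factor*self%z
  end subroutine scale_by

  subroutine describe(obj)
    ! Polymorphic dummy: the dynamic type of obj is unknown here.
    class(vector_t), intent(inout) :: obj
    call obj%scale_by(2.0)     ! may need dynamic dispatch
    print *, obj%norm_sq()     ! can be a direct call
  end subroutine describe

end module vector_sketch

program demo_bindings
  use vector_sketch
  implicit none
  type(vector_t) :: v

  v = vector_t(1.0, 2.0, 2.0)
  ! v is declared with TYPE, so its dynamic type is known at compile
  ! time and both bindings can be resolved statically.
  call v%scale_by(2.0)
  print *, v%norm_sq()
  call describe(v)
end program demo_bindings
```

With TYPE(vector_t) in the main program neither call needs a lookup; the difference only shows up inside describe, where the argument is CLASS.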

(Even if you use polymorphic variables, the compiler might be able to
devirtualize the type-bound procedure calls and inline the call, but
that might require multi-file optimization (called link-time
optimization (LTO) or LNO).)

By itself, the overhead of finding the type-bound procedure at run-time
(compared to directly calling it) is small; the main issue is that it
prevents further optimization. If your procedure is in a different file
and you don't use LTO (and hence prevent inlining, cloning and other
optimizations), I don't expect a measurable difference between CLASS and
TYPE.

Regarding the cost of defined operators, for
"a = b + c"
I think the generated code works rather well after inlining. (I assume
that "+" and "*" are user-defined operators, simple and inlinable.)

However, for
a = b * c + d
defined operators might give:
tmp = b * c
a = tmp + d

If all variables are arrays, the implied loops might not get merged and
the "tmp" might not be optimized away, even if it were possible and the
compiler can do inlining. In that case, a procedure which directly
operates on "b * c + d" might be significantly faster.
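
A small sketch of the two styles, assuming a hypothetical vec_t type with component-wise "+" and "*" (all names here are made up for illustration). It only shows that both forms give the same result; which one is faster depends on the compiler and options, so it has to be timed.

```fortran
module vec_ops_sketch
  implicit none
  private
  public :: vec_t, operator(+), operator(*), mul_add

  type :: vec_t
    real :: x = 0.0, y = 0.0, z = 0.0
  end type vec_t

  interface operator(+)
    module procedure add
  end interface

  interface operator(*)
    module procedure mul
  end interface

contains

  elemental function add(a, b) result(c)
    type(vec_t), intent(in) :: a, b
    type(vec_t) :: c
    c = vec_t(a%x + b%x, a%y + b%y, a%z + b%z)
  end function add

  elemental function mul(a, b) result(c)
    ! Component-wise product, chosen only to keep the sketch short.
    type(vec_t), intent(in) :: a, b
    type(vec_t) :: c
    c = vec_t(a%x*b%x, a%y*b%y, a%z*b%z)
  end function mul

  elemental function mul_add(b, c, d) result(a)
    ! Computes b*c + d in one pass, with no array temporary for b*c.
    type(vec_t), intent(in) :: b, c, d
    type(vec_t) :: a
    a = vec_t(b%x*c%x + d%x, b%y*c%y + d%y, b%z*c%z + d%z)
  end function mul_add

end module vec_ops_sketch

program demo_ops
  use vec_ops_sketch
  implicit none
  integer, parameter :: n = 1000
  type(vec_t) :: a1(n), a2(n), b(n), c(n), d(n)

  b = vec_t(1.0, 2.0, 3.0)
  c = vec_t(2.0, 2.0, 2.0)
  d = vec_t(0.5, 0.5, 0.5)

  a1 = b*c + d            ! two defined operations; may create a temporary
  a2 = mul_add(b, c, d)   ! one procedure, one implied loop

  if (any(a1%x /= a2%x) .or. any(a1%y /= a2%y) .or. any(a1%z /= a2%z)) then
    error stop "results differ"
  end if
  print *, "results agree"
end program demo_ops
```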

As always, it depends on the actual code (and on the compiler) whether
it matters (measure, don't guess). Unless the uglier code is much faster
at a place where the speed gain matters, a clear, maintainable style is
much more important. (One can argue whether user-defined operators make
the code clearer or more obscure.)

Tobias

Stefano Zaghi

Oct 25, 2012, 6:41:06 AM

Hi Tobias,
thanks for your help. The non_overridable attribute can be useful in my case. You are right, it is necessary to measure (and not guess) the effect. That is what I am doing, with careful profiling of the code.

Thank you.

Ian Harvey

Oct 25, 2012, 6:08:10 PM

On 2012-10-25 9:29 PM, Tobias Burnus wrote:
> Ian Harvey wrote:
>> Perhaps an orthogonal point - in the absence of good reasons to the
>> contrary, my recommendation is that if you are not doing "type bound
>> stuff" with your type bound procedures (i.e., you don't ever need to
>> decide at runtime which specific procedure gets called for a binding
>> based on the dynamic type of an object), then don't use type bound
>> procedures. The syntax of the object%binding procedure reference might
>> be alluring, but it comes with a cost.
>
> If you talk about computational cost, I don't see a difference between a
> normal procedure call, a generic procedure, a defined operator/assignment
> and a type-bound procedure, at least as long as a nonpolymorphic TYPE is
> used. With CLASS, type-bound procedures have an additional cost
> unless they are non_overridable. If they are non_overridable, the compiler
> can call the procedure directly without extra cost.

Cost in the general sense - not just execution performance. Also, I
don't want to be seen to overstate the cost side of things - nearly
everything I write these days has some sort of type binding stuff going
on. But you do see questions here and elsewhere along the lines of
"I've put some bindings after contains in a type and now the compiler
won't let me do this, why?" and the answer is "because that stuff is now
type bound and what you want to do doesn't make sense in the context of
things being type bound."

Non-polymorphic/overridable - doesn't the compiler still have to
construct a descriptor with the dynamic type for the argument whenever a
binding is called - because while that specific binding might be
statically resolvable/not be overridable, in the general case (fancy
pants optimisation aside) other bindings might not be?

TYPE :: parent
CONTAINS
  PROCEDURE, NON_OVERRIDABLE :: fixed
  PROCEDURE :: wobbly
END TYPE parent
...
SUBROUTINE fixed(arg)
  CLASS(parent), INTENT(IN) :: arg
  CALL arg%wobbly ! dynamic dispatch here.
END SUBROUTINE fixed

SUBROUTINE proc
  TYPE(parent) :: a
  ! Descriptor that includes dynamic type must be built before
  ! calling either of these?
  CALL a%wobbly
  CALL a%fixed
END SUBROUTINE proc