Overloading OPERATOR(+): my usage causing memory leaks.

Paul van Delst

unread,

Nov 3, 2009, 2:51:11 PM11/3/09

to

Hi,

I'm overloading an elemental ADD function to OPERATOR(+) and I'm getting memory leaks when
I apply it to array arguments. Scalar arguments cause no problem.

Down below is my "defining" module that contains all the structure-related operations
(it's a bit long, sorry), below that is my test program, and below that is the
valgrind --leak-check=full
output.

My question is: what in the code could be causing these memory leaks?

In the test program, if I comment out the
c = c + d
line, i.e. adding the array structures, then I get no memory leaks. I made all the
routines in question elemental, so they should work with array arguments, right?

Things I have tested:
1) I know it's not the "reuse" of the "c" array on the left hand side of the assignment,
since when I replace that with a completely different structure I get the same memory leak
reports.
2) altering the "add" function to use a local structure to hold the result before copying
back into the function result() also makes no difference.

I'm using gfortran (4.4, Feb2009 static build on a RHE5.0 system). g95 exhibits similar
behaviour (but valgrind reports errors as well as memory leaks) and pgf95 v9.0 fails with
an ICE (surpise surprise). Compiling on an AIX system yields no errors or warnings (but
also no memory check).

As usual, any info/insight appreciated.

cheers,

paulv

p.s. If you actually make it through the code and valgrind output below, the beers are on
me next time you're visiting the DC area! :o)

!<-----begin module my_type_define----->
module my_type_define
implicit none

private
public :: my_type
public :: assignment(=)
public :: operator(+)
public :: my_associated
public :: my_destroy
public :: my_create
public :: my_inspect

interface assignment(=)
module procedure my_assign
end interface assignment(=)
interface operator(+)
module procedure my_add
end interface operator(+)

type :: my_type
integer :: n = 0
real, allocatable :: x(:), y(:), z(:)
end type my_type

contains

! Public

elemental function my_associated( my ) result( status )
type(my_type), intent(in) :: my
logical :: status
! test the structure members
status = allocated(my%x) .or. allocated(my%y) .or. allocated(my%z)
end function my_associated

elemental subroutine my_destroy( my )
type(my_type), intent(out) :: my
end subroutine my_destroy

elemental subroutine my_create( my, n )
type(my_type), intent(out) :: my
integer, intent(in) :: n
integer :: alloc_stat
! check input
if ( n < 1 ) return
! perform the allocation
allocate( my%x(n), my%y(n), my%z(n), stat=alloc_stat )
! initialise
if ( alloc_stat == 0 ) then
! ...dimension
my%n = n
! ...arrays
my%x = 0.0
my%y = 0.0
my%z = 0.0
end if
end subroutine my_create

subroutine my_inspect( my )
type(my_type), intent(in) :: my
if ( .not. my_associated( my ) ) return
write(*,'(5x,"x:")'); write(*,'(5(1x,es13.6))') my%x
write(*,'(5x,"y:")'); write(*,'(5(1x,es13.6))') my%y
write(*,'(5x,"z:")'); write(*,'(5(1x,es13.6))') my%z
end subroutine my_inspect

! Private

elemental subroutine my_assign( copy, original )
type(my_type), intent(out) :: copy
type(my_type), intent(in) :: original
! if input structure not used, do nothing
if ( .not. my_associated( original ) ) return
! create the output structure
call my_create( copy, original%n )
! ...return if no allocation performed
if ( .not. my_associated( copy ) ) return
! copy data
copy%x = original%x
copy%y = original%y
copy%z = original%z
end subroutine my_assign

elemental function my_add( my1, my2 ) result( mysum )
type(my_type), intent(in) :: my1, my2
type(my_type) :: mysum
! copy the first structure
mysum = my1
! and add its components to the second one
mysum%x = mysum%x + my2%x
mysum%y = mysum%y + my2%y
mysum%z = mysum%z + my2%z
end function my_add

end module my_type_define
!<-----end module my_type_define----->

!<-----begin program my_test----->
program my_test
use my_type_define
implicit none

integer, parameter :: n = 3

integer :: i
type(my_type) :: a, b
type(my_type) :: c(n), d(n)

! Scalar test
call my_create(a, 10)
a%x = 1.0
a%y = 2.0
a%z = 3.0
write(*,'(/,"A")')
call my_inspect(a)
b = a
write(*,'(/,"B")')
call my_inspect(b)
a = a + b
write(*,'(/,"A = A + B")')
call my_inspect(a)
call my_destroy(a)
call my_destroy(b)

! Rank-1 test
call my_create(c, 10)
do i = 1, n
c(i)%x = 1.0
c(i)%y = 2.0
c(i)%z = 3.0
write(*,'(/,"C(",i0,")")') i
call my_inspect(c(i))
end do
d = c
do i = 1, n
write(*,'(/,"D(",i0,")")') i
call my_inspect(d(i))
end do
c = c + d ! commenting this line makes memory leaks go away.
do i = 1, n
write(*,'(/,"C(",i0,") + D(",i0,")")') i, i
call my_inspect(c(i))
end do
call my_destroy(c)
call my_destroy(d)
end program my_test
!<-----end program my_test----->

!<-----begin valgrind output----->
==356==
==356== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 1)
==356== malloc/free: in use at exit: 360 bytes in 9 blocks.
==356== malloc/free: 146 allocs, 137 frees, 225,998 bytes allocated.
==356== For counts of detected errors, rerun with: -v
==356== searching for pointers to 9 not-freed blocks.
==356== checked 70,540 bytes.
==356==
==356== 120 bytes in 3 blocks are definitely lost in loss record 1 of 3
==356== at 0x40053C0: malloc (vg_replace_malloc.c:149)
==356== by 0x804921F: __my_type_define_MOD_my_create (in a.out)
==356== by 0x8048AE0: __my_type_define_MOD_my_assign (in a.out)
==356== by 0x80486C6: __my_type_define_MOD_my_add (in a.out)
==356== by 0x8049E64: MAIN__ (in a.out)
==356== by 0x804A20A: main (fmain.c:21)
==356==
==356==
==356== 120 bytes in 3 blocks are definitely lost in loss record 2 of 3
==356== at 0x40053C0: malloc (vg_replace_malloc.c:149)
==356== by 0x804912A: __my_type_define_MOD_my_create (in a.out)
==356== by 0x8048AE0: __my_type_define_MOD_my_assign (in a.out)
==356== by 0x80486C6: __my_type_define_MOD_my_add (in a.out)
==356== by 0x8049E64: MAIN__ (in a.out)
==356== by 0x804A20A: main (fmain.c:21)
==356==
==356==
==356== 120 bytes in 3 blocks are definitely lost in loss record 3 of 3
==356== at 0x40053C0: malloc (vg_replace_malloc.c:149)
==356== by 0x8049035: __my_type_define_MOD_my_create (in a.out)
==356== by 0x8048AE0: __my_type_define_MOD_my_assign (in a.out)
==356== by 0x80486C6: __my_type_define_MOD_my_add (in a.out)
==356== by 0x8049E64: MAIN__ (in a.out)
==356== by 0x804A20A: main (fmain.c:21)
==356==
==356== LEAK SUMMARY:
==356== definitely lost: 360 bytes in 9 blocks.
==356== possibly lost: 0 bytes in 0 blocks.
==356== still reachable: 0 bytes in 0 blocks.
==356== suppressed: 0 bytes in 0 blocks.
!<-----end valgrind output----->

If you've made it this far, I applaud you!

Richard Maine

unread,

Nov 3, 2009, 3:57:22 PM11/3/09

to

Paul van Delst <paul.v...@noaa.gov> wrote:

> I'm overloading an elemental ADD function to OPERATOR(+) and I'm getting

> Imemory leaks when apply it to array arguments. Scalar arguments cause no
> Iproblem.

> My question is: what in the code could be causing these memory leaks?
>
> In the test program, if I comment out the
> c = c + d
> line, i.e. adding the array structures, then I get no memory leaks. I made
> all the routines in question elemental, so they should work with array
> arguments, right?

[code and valgrind output elided]

I can't get anything from the valgrind output, probably because I've
never used valgrind. I'm not a beer drinker anyway. :-)

I'd guess it to be a compiler bug, and I can even roughly guess where.
I'm not at all sure, but it seems a good enough guess to mention.

I think it is the compiler failing to remove a compiler temporary. Your
addition of c+d is quite likely to generate a temporary array for that
sum. Then that temporary array will be assigned to c (using your defined
assignment). The compiler *OUGHT* to deallocate the temporary after the
assignment, but it seems a good candidate for something some compilers
might miss.

P.S. As far as I can see, your defined assignment does the same thing as
the intrinsic assignment would... except that the compiler might have a
better chance of being smart about optimizing the intrinsic one, whereas
is seems likely to just "punt" and naively invoke your defined one. Of
course, if this is just a simplification of code that involves other
things, that would be different. Or perhaps variants of this are also
intended to work with pointer components instead of allocatable ones;
pointer components tend to need more of this kind of stuff to fake what
allocatable ones do intrinsically.

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

Paul van Delst

unread,

Nov 3, 2009, 4:40:50 PM11/3/09

to

Richard Maine wrote:
> Paul van Delst <paul.v...@noaa.gov> wrote:
>
>> I'm overloading an elemental ADD function to OPERATOR(+) and I'm getting
>> Imemory leaks when apply it to array arguments. Scalar arguments cause no
>> Iproblem.
>
>> My question is: what in the code could be causing these memory leaks?
>>
>> In the test program, if I comment out the
>> c = c + d
>> line, i.e. adding the array structures, then I get no memory leaks. I made
>> all the routines in question elemental, so they should work with array
>> arguments, right?
>
> [code and valgrind output elided]
>
> I can't get anything from the valgrind output, probably because I've
> never used valgrind. I'm not a beer drinker anyway. :-)

Well, I probably put too much output in there for context. The important thing in that
output is the "LEAK SUMMARY" and "definitely lost", "possibly lost" items.

>
> I'd guess it to be a compiler bug, and I can even roughly guess where.
> I'm not at all sure, but it seems a good enough guess to mention.
>
> I think it is the compiler failing to remove a compiler temporary. Your
> addition of c+d is quite likely to generate a temporary array for that
> sum. Then that temporary array will be assigned to c (using your defined
> assignment). The compiler *OUGHT* to deallocate the temporary after the
> assignment, but it seems a good candidate for something some compilers
> might miss.

That's sort of what I thought but wanted independent opinion on. My last OS upgrade broke
most of my other linux compilers. :o( (apart from pgi which simply ICEd)

> P.S. As far as I can see, your defined assignment does the same thing as
> the intrinsic assignment would... except that the compiler might have a
> better chance of being smart about optimizing the intrinsic one, whereas
> is seems likely to just "punt" and naively invoke your defined one. Of
> course, if this is just a simplification of code that involves other
> things, that would be different. Or perhaps variants of this are also
> intended to work with pointer components instead of allocatable ones;
> pointer components tend to need more of this kind of stuff to fake what
> allocatable ones do intrinsically.

Wha..? You mean, sans my defined assignment routine, doing
a = b
where a and b are derived types with allocatable components, the compiler should allocate
the components of the lhs before doing the assignments?

Remember I'm still using f95+tr15581 compilers. I am rearranging things for an easy switch
to using type bound procedures once f2003 compilers become routinely available.
Does f95+tr15581 do more behind-the-scenes housekeeping than I thought?

Regarding the pointer components issue, I am moving away from using pointers as proxys for
allocatables for precisely the reason you state - all of the associated "stuff" to check
that things really are as they seem. We do have some structures where pointers are used
purely for aliasing via the "=>" operator, but they are in the minority (in fact I can
only think of two right now amongst >50 structure definitions).

cheers,

paulv

Richard Maine

unread,

Nov 3, 2009, 5:22:50 PM11/3/09

to

Paul van Delst <paul.v...@noaa.gov> wrote:

> Richard Maine wrote:

> > P.S. As far as I can see, your defined assignment does the same thing as
> > the intrinsic assignment would...
>

> Wha..? You mean, sans my defined assignment routine, doing
> a = b
> where a and b are derived types with allocatable components, the compiler
> should allocate the components of the lhs before doing the assignments?

Yes.

> Remember I'm still using f95+tr15581 compilers. I am rearranging things
> for an easy switch to using type bound procedures once f2003 compilers
> become routinely available. Does f95+tr15581 do more behind-the-scenes
> housekeeping than I thought?

Yes. From N1379 (as close as I have handy to the final version of the
TR)

For intrinsic assignment of objects of a derived type containing an
allocatable array component, the allocatable array component of the
variable on the left-hand-side receives the allocation status and,
if allocated, the bounds and value of the corresponding component of
the expression.

It then goes on to cover some fine points about possible optimizations
of and the effects of the allocatable component also being a target.

In fact, I recall this as being one of the points made in favor of
making allocatable assignment work "right" (by my definition :-)) in
f2003. With f95+TR, we were in the position where assignment of
allocatables did the "magic" if the allocatables were components, but
not if they were the top-level variables. To me, this seemed
inconsistent. I also thought it invited silliness such as defining a
derived type with a single component just in order to make assignment
work "sensibly".

There were people in J3 who were quite resistant to applying this
"magic" to top-level allocatables as was done in f2003. That's part of
why I recall having to make the argument - because it was an argument.

Paul van Delst

unread,

Nov 3, 2009, 6:10:56 PM11/3/09

to

Richard Maine wrote:
> Paul van Delst <paul.v...@noaa.gov> wrote:
>
>> Richard Maine wrote:
>
>>> P.S. As far as I can see, your defined assignment does the same thing as
>>> the intrinsic assignment would...
>> Wha..? You mean, sans my defined assignment routine, doing
>> a = b
>> where a and b are derived types with allocatable components, the compiler
>> should allocate the components of the lhs before doing the assignments?
>
> Yes.
>
>> Remember I'm still using f95+tr15581 compilers. I am rearranging things
>> for an easy switch to using type bound procedures once f2003 compilers
>> become routinely available. Does f95+tr15581 do more behind-the-scenes
>> housekeeping than I thought?
>
> Yes. From N1379 (as close as I have handy to the final version of the
> TR)
>
> For intrinsic assignment of objects of a derived type containing an
> allocatable array component, the allocatable array component of the
> variable on the left-hand-side receives the allocation status and,
> if allocated, the bounds and value of the corresponding component of
> the expression.
>
> It then goes on to cover some fine points about possible optimizations
> of and the effects of the allocatable component also being a target.

I did not know that. Let's see what happens in my test program when I remove my defined
assignment function so that intrinsic assignment is used instead.....

...modifying...compiling...running....

Holy crap.... the memory leaks have disappeared! Let me test the actual code I'm working
on......

OMG! No memory leaks there either! All 8000 tests passed, 88.9MB deallocated!

So it would appear removal of my assignment function did, in fact, allow the compiler to
optimise the process in the add operation when intrinsic assignment is used. This is
brilliant. Not only have the memory leaks gone, but I've eliminated the need for an
assignment procedure.

Thanks so very much.

cheers,

paulv

steve

unread,

Nov 3, 2009, 6:16:08 PM11/3/09

to

On Nov 3, 3:10 pm, Paul van Delst <paul.vande...@noaa.gov> wrote:
> Richard Maine wrote:

Paul,

Some of these memory leaks with gfortran are well-known. Paul Thomas
had developed a -fcheck=memleak (or some such named option) to help
track down leaks in not only user code but also gfortran. Sadly, that
patch
was lightly tested and withered in the mailing list. There are plans
to
revamp the internal representation in array descriptors. It is hoped
that
some (if not all) of these leaks will be caught at that time.

--
steve

Paul van Delst

unread,

Nov 3, 2009, 7:03:26 PM11/3/09

to

steve wrote:
>
> Paul,
>
> Some of these memory leaks with gfortran are well-known. Paul Thomas
> had developed a -fcheck=memleak (or some such named option) to help
> track down leaks in not only user code but also gfortran. Sadly, that
> patch
> was lightly tested and withered in the mailing list. There are plans
> to
> revamp the internal representation in array descriptors. It is hoped
> that
> some (if not all) of these leaks will be caught at that time.

Hi again Steve,

No worries. I'm treating this as more of a "don't try to fool the compiler" type of lesson
(which I should have learned long ago but there ya go :o). Granted the leaks shouldn't
occur if I write my own assignment function, but letting the compiler do what it should be
doing anyway was a quick workaround. And I learned something new and important - so there
are no complaints on my end.

Would it be useful if I tacked my (not so small) test case on to the relevant bugzilla
report (I assume there is one)?

cheers,

paulv

Tobias Burnus

unread,

Nov 4, 2009, 5:34:50 AM11/4/09

to

On 11/04/2009 01:03 AM, Paul van Delst wrote:
> Would it be useful if I tacked my (not so small) test case on to the relevant bugzilla
> report (I assume there is one)?

I was not sure there was one and created a new one at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41936
it still can be marked as duplicate if needed.

(Currently, the number of bug reports is (still?) small enough that a
bugreport too much does less harm than not-filled reports.* Fully
reduced test cases are nice, but also somewhat larger ones are OK (and
might show different issues than the tiny test cases ;-)
The bug with the largest test case I filled had was a 200,000 LOC
program (open source, linked to the download page); but I also reduced
it myself in steps to a few lines. Thus also large test cases might be
handled.)

Tobias

(* except for bug reports against g77 or against gcc 4.0.x/4.1.x - or
those GCC compile problems where the user messed up their gmp/mpfr
libraries installation.)