Derived types and allocatable

Gib Bogle

May 6, 2008, 12:01:28 AM

I am still trying to get to grips with how one should copy derived types that contain allocatable
arrays. The test program below, when compiled with IVF 10.1, displays the cdata array contents
correctly, but gives an access violation in the deallocation step. If I neglect the deallocation
there are no error messages. When compiled on IBM Linux (with xlf95_r) it runs without errors. But
running my real program, in which I am presumably doing something wrong, on the IBM machine there is
no error until the deallocation step, which is carried out exactly as in the code below, and then I
get a multitude of error messages like

glibc detected *** double free or corruption (!prev): 0x100a0220
and
glibc detected *** corrupted double-linked list: 0x1009db18

Is it OK to do

cell2(i) = cell1(i)

when cell2(i)%cdata has not been allocated, or do I first need to ensure that %cdata is allocated?

!----------------------------------------------------------------
program main
  integer :: csize(2) = (/10,100/)
  integer :: k, i
  type cell_type
    integer :: ID
    real, allocatable :: cdata(:)
  end type
  type(cell_type), allocatable :: cell1(:),cell2(:)

  allocate(cell1(2))
  allocate(cell2(2))

  do i = 1,2
    allocate(cell1(i)%cdata(csize(i)))
    do k = 1,csize(i)
      cell1(i)%cdata(k) = k
    enddo
  enddo

  do i = 1,2
    cell2(i) = cell1(i) ! cell2(i)%cdata not allocated
    write(*,*) cell2(i)%cdata
  enddo

  do i = 1,2
    do k = 1,csize(i)
      if (allocated(cell1(i)%cdata)) then
        deallocate(cell1(i)%cdata)
      endif
    enddo
    if (allocated(cell1)) then
      deallocate(cell1)
    endif
  enddo
  do i = 1,2
    do k = 1,csize(i)
      if (allocated(cell2(i)%cdata)) then
        deallocate(cell2(i)%cdata)
      endif
    enddo
    if (allocated(cell2)) then
      deallocate(cell2)
    endif
  enddo

end

Gib Bogle

May 6, 2008, 2:09:36 AM

I must be more tired than I thought. The code I posted previously is obviously wrong, in a couple
of embarrassing ways, and the reason for the access violation is clear. What I meant to do is shown
below. This code executes fine. I thought I was on the scent of the bug in my real program, but I
was confused. The interesting thing (to me) here is that the assignment cell2(i) = cell1(i) has the
effect of allocating cell2(i)%cdata. Should this be obvious?

program main
  integer :: csize(2) = (/10,100/)
  integer :: k, i
  type cell_type
    integer :: ID
    real, allocatable :: cdata(:)
  end type
  type(cell_type), allocatable :: cell1(:),cell2(:)

  allocate(cell1(2))
  allocate(cell2(2))

  do i = 1,2
    allocate(cell1(i)%cdata(csize(i)))
    do k = 1,csize(i)
      cell1(i)%cdata(k) = k
    enddo
  enddo

  do i = 1,2
    cell2(i) = cell1(i) ! cell2(i)%cdata not allocated before this
    write(*,*) cell2(i)%cdata
  enddo

  do i = 1,2
    if (allocated(cell1(i)%cdata)) then
      deallocate(cell1(i)%cdata)
    endif
  enddo
  if (allocated(cell1)) then
    deallocate(cell1)
  endif

  do i = 1,2
    if (allocated(cell2(i)%cdata)) then
      write(*,*) 'deallocate cell2(i)%cdata: ',i
      deallocate(cell2(i)%cdata)
    endif
  enddo
  if (allocated(cell2)) then
    deallocate(cell2)
  endif

end

James Van Buskirk

May 6, 2008, 2:11:29 AM

"Gib Bogle" <g.b...@auckland.no.spam.ac.nz> wrote in message
news:481fd818$1...@news.auckland.ac.nz...

> do i = 1,2
> do k = 1,csize(i)
> if (allocated(cell1(i)%cdata)) then
> deallocate(cell1(i)%cdata)
> endif
> enddo
> if (allocated(cell1)) then
> deallocate(cell1)
> endif
> enddo

Your problems start at the above loop. Consider that for each
value of i, there is only one cell1(i)%cdata array. Now you are
contemplating deallocating this array 10 or 100 times in the k
loop, but you are saved by the fact that you check for allocation
status before deallocating. You are not so lucky regarding the
consequences of deallocation of cell1 later in the loop, however.
Consider that on the first trip through the i loop, cell1 was
allocated, so you checked this, then deallocated it. On the
second trip through the i loop, your program should die when
you check whether cell1(i)%cdata is allocated because there is
no cell1(i)%cdata because there is no cell1(i) because you
deallocated cell1 on the first trip through the loop. You
might more easily have replaced the whole i loop with:

if(allocated(cell1)) deallocate(cell1)

Because that automatically deallocates its allocatable components.

> do i = 1,2
> do k = 1,csize(i)
> if (allocated(cell2(i)%cdata)) then
> deallocate(cell2(i)%cdata)
> endif
> enddo
> if (allocated(cell2)) then
> deallocate(cell2)
> endif
> enddo

See comments for other deallocation loop.
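
A minimal sketch of the one-statement cleanup described above, reusing the cell_type definition from the original post. The program name, array sizes and fill values are illustrative assumptions only, and the behaviour shown is what the language rules call for, not something checked against any particular compiler release.

```fortran
program dealloc_parent
  implicit none
  type cell_type
    integer :: id
    real, allocatable :: cdata(:)
  end type cell_type
  type(cell_type), allocatable :: cell1(:)
  integer :: i

  allocate(cell1(2))
  do i = 1, 2
    cell1(i)%id = i
    allocate(cell1(i)%cdata(10*i))
    cell1(i)%cdata = 0.0
  end do

  ! One statement, outside any loop: deallocating the parent array
  ! also deallocates every allocated cdata component.
  if (allocated(cell1)) deallocate(cell1)

  ! cell1(i)%cdata must not be referenced from here on, because
  ! cell1 itself is no longer allocated.
  print *, 'cell1 allocated after cleanup: ', allocated(cell1)
end program dealloc_parent
```

The final print should report F. The point of keeping the deallocate outside the loop is that nothing ever asks about cell1(i)%cdata once cell1 is gone.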

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

Gib Bogle

May 6, 2008, 2:39:41 AM

Please forgive me for responding to my own posts.

I made csize big: csize(2) = (/1000000,100000000/), put a do loop around the whole program, and
experimented with allocating/not allocating cell2(i)%cdata in advance of assigning values to it. I
have convinced myself that there is no memory leakage. It seems that when the derived-type
variables cell2(i) and cell1(i) are equated the allocatable component cell2(i)%cdata is allocated if
not previously allocated, and if it was already allocated it is effectively resized and any unused
memory is deallocated. In other words the compiler does what you'd want it to do.

Probably everyone else already knew this.
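
A small sketch of the two cases described above, assuming scalar variables of the same cell_type. The names src and dst and the sizes 10 and 3 are made up for illustration; the sizes printed are what the rules for allocatable components imply, so a compiler with a bug in this area could behave differently.

```fortran
program component_resize
  implicit none
  type cell_type
    integer :: id
    real, allocatable :: cdata(:)
  end type cell_type
  type(cell_type) :: src, dst

  src%id = 1
  allocate(src%cdata(10))
  src%cdata = 1.0

  ! Case 1: dst%cdata is not allocated before the assignment.
  dst = src
  print *, 'after first copy:  size = ', size(dst%cdata)

  ! Case 2: dst%cdata is already allocated, with a different size.
  deallocate(src%cdata)
  allocate(src%cdata(3))
  src%cdata = 2.0
  dst = src
  print *, 'after second copy: size = ', size(dst%cdata)
end program component_resize
```

The sizes reported should be 10 and then 3: in both cases the component on the left ends up with the shape of the component on the right.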

James Van Buskirk

May 6, 2008, 2:41:26 AM

"Gib Bogle" <g.b...@auckland.no.spam.ac.nz> wrote in message

news:481ff61f$1...@news.auckland.ac.nz...

> was confused. The interesting thing (to me) here is that the assignment
> cell2(i) = cell1(i) has the effect of allocating cell2(i)%cdata. Should
> this be obvious?

Yes, that's the way allocatable components are supposed to work.
In fact, you could have simply said:

cell2 = cell1

and cell2 would have been reallocated to the same shape as cell1
and the components of each element of the cell2 array would
have been allocated if necessary.

WARNING: you said earlier that you were using ifort 10.1. This compiler
does not behave in a standard way when assignments to whole
allocatable arrays are carried out, unless you use some switch
at compile time. Check the ifort documentation carefully for
the exact nature of its behavior. It's not standard without
the switch.

Steven Correll

May 6, 2008, 10:08:11 AM

On May 5, 11:41 pm, "James Van Buskirk" <not_va...@comcast.net> wrote:
> WARNING: you said earlier that you were using ifort 10.1. This compiler
> does not behave in a standard way when assignments to whole
> allocatable arrays are carried out, unless you use some switch
> at compile time. Check the ifort documentation carefully for
> the exact nature of its behavior. It's not standard without
> the switch.

In fairness to Intel, the compiler conforms to the Fortran standard as
of 1998; the switch "-assume realloc_lhs" makes it conform to the
standard (in this particular matter) as of 2003. Agreed, it's odd that
the 1998 TR15581 document created a situation in which allocatable
array assignment behaved differently depending on whether the array
was a component or a variable, but prior to 2003 it was the
responsibility of the programmer to allocate the target explicitly if
it wasn't a component.
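
One way to stay inside both sets of rules is to allocate the top-level variable explicitly and let the assignment handle only the components. The sketch below assumes the cell_type from the original post; the program name and sizes are illustrative, and the explicit shape check is just one possible way to write it.

```fortran
program portable_copy
  implicit none
  type cell_type
    integer :: id
    real, allocatable :: cdata(:)
  end type cell_type
  type(cell_type), allocatable :: cell1(:), cell2(:)
  integer :: i

  allocate(cell1(2))
  do i = 1, 2
    cell1(i)%id = i
    allocate(cell1(i)%cdata(5*i))
    cell1(i)%cdata = real(i)
  end do

  ! Allocate the top-level variable explicitly, so the assignment below
  ! is conformable and does not rely on F2003 reallocation of cell2.
  if (allocated(cell2)) then
    if (size(cell2) /= size(cell1)) deallocate(cell2)
  end if
  if (.not. allocated(cell2)) allocate(cell2(size(cell1)))

  ! The cdata components are still allocated by the assignment itself.
  cell2 = cell1

  do i = 1, size(cell2)
    print *, cell2(i)%id, size(cell2(i)%cdata)
  end do
end program portable_copy
```

Written this way the program should not depend on whether the compiler reallocates the left-hand side of a whole-array assignment.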

Kurt Kallblad

May 7, 2008, 1:56:49 AM

"Gib Bogle" <g.b...@auckland.no.spam.ac.nz> wrote in message
news:481ff61f$1...@news.auckland.ac.nz...

<snip>

> do i = 1,2
> if (allocated(cell1(i)%cdata)) then
> deallocate(cell1(i)%cdata)
> endif
> enddo
> if (allocated(cell1)) then
> deallocate(cell1)
> endif
> do i = 1,2
> if (allocated(cell2(i)%cdata)) then
> write(*,*) 'deallocate cell2(i)%cdata: ',i
> deallocate(cell2(i)%cdata)
> endif
> enddo
> if (allocated(cell2)) then
> deallocate(cell2)
> endif
>
> end

A different question:
Is it necessary to deallocate the components?
I thought it's enough to deallocate cell1 and cell2.
Kurt

Richard Maine

May 7, 2008, 3:15:43 AM

Kurt Kallblad <kurt.k...@tele2.se> wrote:

> Is it necessary to deallocate the components?

No. Some might prefer it as a style, but it is not necessary.

> I thought it's enough to deallocate cell1 and cell2.

As a general rule, you can assume that (barring compiler bugs),
allocatables cannot leak memory. You can pretty much deduce the details
from that. If the components weren't automatically deallocated for you,
then forgetting to deallocate them would leak memory. Therefore, you can
deduce that the components will be deallocated for you. The deduction
would be correct.
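
The same reasoning covers a local allocatable that goes out of scope. The sketch below is an illustration under that assumption: the module, subroutine and variable names are invented, and whether memory use really stays flat has to be watched with an external tool, since the program itself cannot observe a leak.

```fortran
module work_mod
  implicit none
  type cell_type
    integer :: id
    real, allocatable :: cdata(:)
  end type cell_type
contains
  subroutine do_work(n)
    integer, intent(in) :: n
    type(cell_type), allocatable :: cells(:)   ! local, not SAVEd
    integer :: i

    allocate(cells(2))
    do i = 1, 2
      cells(i)%id = i
      allocate(cells(i)%cdata(n))
      cells(i)%cdata = 0.0
    end do
    ! No DEALLOCATE here: on return, cells and every cells(i)%cdata
    ! are expected to be released automatically.
  end subroutine do_work
end module work_mod

program no_leak_loop
  use work_mod
  implicit none
  integer :: iter

  ! Repeated calls should not grow the memory footprint.
  do iter = 1, 100
    call do_work(100000)
  end do
  print *, 'done'
end program no_leak_loop
```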

--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain

Damian

May 7, 2008, 3:03:34 PM

On May 7, 12:15 am, nos...@see.signature (Richard Maine) wrote:
> Kurt Kallblad <kurt.kallb...@tele2.se> wrote:
> > "Gib Bogle" <g.bo...@auckland.no.spam.ac.nz> wrote in message

Just make sure you
1. Couple the above assumptions with empirical testing,
2. Read the compiler manual carefully to determine if you need to pass
a flag to force the automatic deallocations, and
3. Update to the latest version of your compiler if your testing
uncovers problems. Portland Group had leaks up until its latest
release (7.2?). Intel still has some in its current release (10.1).
IBM XL Fortran does the deallocations for you only if you pass it a
certain flag when you invoke the compiler. So even if you make the
right assumptions based on the language standard, don't automatically
assume the compiler has implemented the standard correctly.

Damian

Steve Lionel

May 7, 2008, 3:33:13 PM

On Wed, 7 May 2008 12:03:34 -0700 (PDT), Damian <dam...@rouson.net> wrote:

>3. Update to the latest version of your compiler if your testing
>uncovers problems. Portland Group had leaks up until its latest
>release (7.2?). Intel still has some in its current release (10.1).

I am not aware of allocatable component leaks in Intel Fortran 10.1. If you
have a support issue number for such, please send it to me in email so that I
can look it up.
--
Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH

For email address, replace "invalid" with "com"

User communities for Intel Software Development Products
http://softwareforums.intel.com/
Intel Fortran Support
http://support.intel.com/support/performancetools/fortran
My Fortran blog
http://www.intel.com/software/drfortran

Gib Bogle

May 7, 2008, 4:35:27 PM

I was going to ask that question.

Damian

May 8, 2008, 12:05:09 PM

I filed an issue report (actually two but one was fixed) and discussed
the issue by telephone with an Intel developer who said it was too
late for inclusion in 10.1 but would be fixed in the next release.
Unfortunately all of this was nearly a year ago and I have since
changed employers so I don't have access to my old e-mails and such.
I will e-mail you my contact information. I can probably give you
enough information offline so that you could track down the issue
number.

Damian

Steve Lionel

May 8, 2008, 12:52:00 PM

On Thu, 8 May 2008 09:05:09 -0700 (PDT), Damian <dam...@rouson.net> wrote:

>I will e-mail you my contact information. I can probably give you
>enough information offline so that you could track down the issue
>number.

Please do. It is probable that the fix went into an update to 10.1. Your
name and email address (at the time) should be all I need.

Damian

May 8, 2008, 10:56:55 PM

Name: Damian Rouson. At that time, my e-mail address was
damian...@nrl.navy.mil. That address is no longer active and
someone else submitted it on my behalf so it might be under her name
and e-mail address.

I found this code that I *think* was submitted with the bug report
(can't recall if this was for the first bug report, which was fixed,
or for the second, which was not fixed in the initial release of 10.1
but might be fixed in a more recent release):

MODULE Vector_Module

IMPLICIT NONE

PRIVATE ! hide all types & procedures by default
PUBLIC :: Vector ! expose type
PUBLIC :: Vector_ ! expose constructor

TYPE Vector
PRIVATE
REAL ,DIMENSION(:), ALLOCATABLE :: component
END TYPE Vector

CONTAINS

! ___________ Vector constructor: allocate/initialize state variables _______

FUNCTION Vector_(num_elements) RESULT(this)
INTEGER ,INTENT(IN) :: num_elements
TYPE(Vector) :: this

ALLOCATE(this%component(num_elements))

this%component = 0.0

END FUNCTION Vector_

END MODULE Vector_Module

MODULE Double_Vector_Module

USE Vector_Module

IMPLICIT NONE

PRIVATE ! hide all types & procedures by default
PUBLIC :: Double_Vector ! expose type
PUBLIC :: Double_Vector_ ! expose constructor

TYPE Double_Vector
PRIVATE
TYPE(Vector) :: first_vector
TYPE(Vector) :: second_vector
END TYPE Double_Vector

CONTAINS

! ___________ Vector constructor: allocate/initialize state variables _______

FUNCTION Double_Vector_(num_elements) RESULT(this)
INTEGER ,INTENT(IN) :: num_elements
TYPE(Double_Vector) :: this

this%first_vector = Vector_(num_elements)
this%second_vector = Vector_(num_elements)

END FUNCTION Double_Vector_

END MODULE Double_Vector_Module

PROGRAM main
USE Double_Vector_Module
IMPLICIT NONE

INTEGER ,PARAMETER :: loop_length=150
INTEGER ,PARAMETER :: vector_length=2**22 ! x 4 bytes/real = 16 MB vectors
INTEGER :: i

TYPE(Double_Vector) :: duplicate

DO i=1,loop_length
PRINT *, i
duplicate = Double_Vector_(vector_length)
END DO

END PROGRAM main

These files are all time-stamped April 15, 2007. I named the file
containing the above main program "leak.f90." I named the following
file "noleak.f90":

PROGRAM main
USE Vector_Module
IMPLICIT NONE

INTEGER ,PARAMETER :: loop_length=1500
INTEGER ,PARAMETER :: vector_length=2**22 ! x 4 bytes/real = 16 MB vectors
INTEGER :: i

TYPE(Vector) :: duplicate

DO i=1,loop_length
PRINT *, i
duplicate = Vector_(vector_length)
END DO

END PROGRAM main

So apparently the allocatable components have to be nested a couple of
levels deep to demonstrate the leak. In my application, I'm
interested in even deeper nesting, so I hope the leak fix is general
enough to handle arbitrarily deeply nested types.

Damian

Steve Lionel

May 9, 2008, 2:11:34 PM

On Thu, 8 May 2008 19:56:55 -0700 (PDT), Damian <dam...@rouson.net> wrote:

>I found this code that I *think* was submitted with the bug report
>(can't recall if this was for the first bug report, which was fixed,
>or for the second, which was not fixed in the initial release of 10.1
>but might be fixed in a more recent release):

Ok - I found it and remember the case now. This particular problem is not
fixed in a 10.1 update but will be fixed in the next major release, as we want
to make sure that the fix gets adequate external testing. It is a rather
unusual combination of things to show the problem, but it is indeed an
outstanding bug in 10.1.

Damian

May 9, 2008, 3:15:07 PM

Regarding the reported bug being "a rather unusual combination of
things to show a problem," the first leak I reported was simpler.
When that was fixed, I had to go to a slightly more complicated
example to demonstrate that the fix was not sufficiently universal.
Even that example is relatively simple compared to what is common in a
certain class of object-oriented scientific software projects.

I've been working on a scientific software design methodology that
relies upon heavy use of user-defined assignments and operators with
arguments that contain nested derived types. This methodology
facilitates writing "software abstractions that resemble blackboard
abstractions" to quote my colleague Kevin Long. Specifically, it
facilitates code that closely resembles equations in the physical
sciences (scalar, vector & tensor field equations). Numerous research
groups have done similar work in C++, e.g. the Sundance project at
Sandia National Laboratories, the Overture project at Lawrence
Livermore National Laboratories, and the Sophus project at the
University of Bergen.

I believe Fortran will eventually outpace these C++ efforts in several
respects (not the least of which being the automatic memory
deallocation features we're discussing), but this will only happen if
cases like the one I constructed are considered of central importance
rather than unusual cases.

Damian

James Giles

May 10, 2008, 8:26:34 PM

Steven Correll wrote:
...

> In fairness to Intel, the compiler conforms to the Fortran standard as
> of 1998; the switch "-assume realloc_lhs" makes it conform to the
> standard (in this particular matter) as of 2003. Agreed, it's odd that
> the 1998 TR15581 document created a situation in which allocatable
> array assignment behaved differently depending on whether the array
> was a component or a variable, but prior to 2003 it was the
> responsibility of the programmer to allocate the target explicitly if
> it wasn't a component.

Some people might read the above as implying that there's an
incompatibility between F95 and F2003. That's not quite right. The
syntax and semantics of those cases allowed by the old standard
are identical in the new standard. However, the new standard
provides a meaning for cases the older standard explicitly prohibited.
No standard conforming F95 program should behave any differently
under F2003.

If the left-hand side of the assignment is already allocated and is
conformable to the expression on the right-hand side, the semantics
remain precisely what the F95 standard required. And it doesn't
matter which syntax you use. Suppose A is an allocatable rank-1
array and it's already allocated with the same size as array B.
Then all the following assignments were legal in F95, and
they meant the same thing:

A = B
A(:) = B(:)
A = B(:)
A(:) = B

Well they still mean the same thing under F2003. However, if
A isn't conformable to B, or if A isn't allocated at all, the above
four assignments are all non-standard in F95. The F2003
standard gives a new meaning to the first and third variant. That's
upward compatible and perfectly within the committee's allowed
privilege. I too wish they had given the same meaning to all four
variants.

I understand! Yes, the committee wanted some form that still
required conformability. It's not clear why, but that's how the
votes lined up I guess. Of course, they could still do it. And
it would still be an upward compatible change. Conforming
F2003 programs would still run with the same behavior.
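
A minimal sketch of the difference between the whole-array and section forms, assuming a compiler with the F2003 semantics switched on (for ifort, the "-assume realloc_lhs" switch quoted above). The program name and the sizes 3 and 5 are illustrative. Without F2003 semantics the first assignment is non-conforming and its behaviour is unpredictable.

```fortran
program lhs_forms
  implicit none
  real, allocatable :: a(:)
  real :: b(5)

  b = 1.0
  allocate(a(3))

  ! Whole-array form: under F2003 rules a is reallocated to match b.
  ! (Not valid F95, since a and b are not conformable here.)
  a = b
  print *, 'size(a) after a = b: ', size(a)

  ! Section form: a must already be allocated and conformable with
  ! the right-hand side. That holds now, so this is valid in F95 too.
  a(:) = 2.0 * b
  print *, 'a = ', a
end program lhs_forms
```

Swapping the two assignments would make the section form the first one executed, with a of size 3 and a right-hand side of size 5, which neither standard allows.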

--
J. Giles

"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare

Jan Vorbrüggen

May 14, 2008, 6:07:46 AM

> I too wish they had given the same meaning to all four variants.

I don't think that can be done. On the LHS, A(:) is a slice, and
therefore must be allocated and definable for the assignment to have a
meaning at all. Think about it - what if I wrote:

A(::2) = B

What semantics would you ascribe to this assignment in case A is not
allocated in a conformable way?

Jan

James Van Buskirk

May 14, 2008, 1:03:21 PM

"James Giles" <james...@worldnet.att.net> wrote in message
news:_2rVj.186212$D_3....@bgtnsc05-news.ops.worldnet.att.net...

> Some people might read the above as implying that there's an
> incompatibility between F95 and F2003. That's not quite right. The
> syntax and semantics of those cases allowed by the old standard
> are identical in the new standard. However, the new standard
> provides a meaning for cases the older standard explicitly prohibited.
> No standard conforming F95 program should behave any differently
> under F2003.

Yes, but a standard conforming f03 program may be interpreted as
an f95 program with extensions.

C:\gfortran\clf\confusion>type confusion.f90
module mod
implicit none
contains
subroutine sub(x,label)
integer x(:,:)
character(*) label
character(80) fmt

write(*,'(a)') label//' ='
write(fmt,'(a,i0,a)') '(',size(x,2),'i3)'
write(*,fmt) transpose(x)
end subroutine sub
end module mod

program confusion
use mod
implicit none
integer, allocatable :: x1(:,:)
integer, allocatable :: y1(:,:)
integer, allocatable :: x2(:,:)
integer, allocatable :: y2(:,:)
integer i

allocate(x1(2,3))
x1 = reshape([(i,i=1,size(x1))],shape(x1))
allocate(y1(3,2))
y1 = 1
allocate(x2(3,2))
allocate(y2(2,3))
x2 = x1
y2 = y1
call sub(x1,'x1')
call sub(x2,'x2')
call sub(y1,'y1')
call sub(y2,'y2')
call sub(matmul(x2,y2),'matmul(x2,y2)')
end program confusion

C:\gfortran\clf\confusion>gfortran confusion.f90 -oconfusion

C:\gfortran\clf\confusion>confusion
x1 =
1 3 5
2 4 6
x2 =
1 3
2 4
3 6
y1 =
1 1
1 1
1 1
y2 =
1 1 1
1 1 6
matmul(x2,y2) =
4 4 19
6 6 26
9 9 39

C:\gfortran\clf\confusion>ifort /check:bounds confusion.f90
Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 9.1
Build 20061104
Copyright (C) 1985-2006 Intel Corporation. All rights reserved.

-out:confusion.exe
-subsystem:console
confusion.obj

C:\gfortran\clf\confusion>confusion
x1 =
1 3 5
2 4 6
x2 =
1 3
2 4
3 5
y1 =
1 1
1 1
1 1
y2 =
1 1***
1 1***
matmul(x2,y2) =
4 4***
6 6***
8 8***

We see the f03 behavior in neither compiler.

James Giles

May 14, 2008, 3:30:39 PM

Actually A(:) is easily recognizable as a slice that corresponds
to the whole array. A(::2) is easily recognizable as a slice
that does *not* mean the whole array. What I wanted was that
if the left-hand side is not subscripted (the whole array) or is
subscripted with only colon (:) in each subscript position,
the meaning should be the same. It's an easily recognizable
syntactic form. If the subscripts in each dimension *aren't*
merely colon (:), then the thing should follow the existing
rule for slices (even if the bounds expressions on the
subscript triplets turn out to refer to the whole array).

Hence:

real, allocatable :: x(:)
...
allocate(x(100))
...

x(1:100) = expr

This should require conformability even though the subscript
turns out to refer to the whole array.

x = expr
x(:) = expr

The above two could mean the same thing, even if that means
reallocation.

In fact, what would really be useful is the following as
well as the above:

x(5:) = expr

This reallocates if necessary and the new low-bound of the
array is 5.

x(:10) = expr

This reallocates if necessary and the new upper bound of the
array is 10. I suppose these last two would be too error prone
though and should not be allowed. :-(

James Giles

May 14, 2008, 3:37:02 PM

James Van Buskirk wrote:
> "James Giles" <james...@worldnet.att.net> wrote in message
> news:_2rVj.186212$D_3....@bgtnsc05-news.ops.worldnet.att.net...
>
>> Some people might read the above as implying that there's an
>> incompatibility between F95 and F2003. That's not quite right. The
>> syntax and semantics of those cases allowed by the old standard
>> are identical in the new standard. However, the new standard
>> provides a meaning for cases the older standard explicitly
>> prohibited. No standard conforming F95 program should behave any
>> differently under F2003.
>
> Yes, but a standard conforming f03 program may be interpreted as
> an f95 program with extensions.

That's the price any implementor pays for allowing extensions:
future standards may use the same syntax for some different
meaning. It's also the price any user pays for using extensions:
future standards may require the code to be changed. All
the standard can guarantee when they maintain backward
compatibility is the above: no previously conforming program
should behave differently under the new standard. Previously
non-conforming programs have no guarantees at all.

glen herrmannsfeldt

May 14, 2008, 4:39:38 PM

James Giles wrote:
(snip)

> Actually A(:) is easily recognizable as a slice that corresponds
> to the whole array. A(::2) is easily recognizable as a slice
> that does *not* mean the whole array. What I wanted was that
> if the left-hand side is not subscripted (the whole array) or is
> subscripted with only colon (:) in each subscript position,
> the meaning should be the same. It's an easily recognizable
> syntactic form. If the subscripts in each dimension *aren't*
> merely colon (:), then the thing should follow the existing
> rule for slices (even if the bounds expressions on the
> subscript triplets turn out to refer to the whole array).

In previous discussions it was decided that (:) would
prevent allocate by assignment. It seems to be the only
way to tell the compiler that allocate by assignment is not
desired, to save the overhead of actually testing for it.

If you want to add a notation that means 'whole array' on
the left side of an assignment, how would () be?

a()=b+c

-- glen

James Giles

May 14, 2008, 4:49:59 PM

glen herrmannsfeldt wrote:
...

> In previous discussions it was decided that (:) would
> prevent allocate by assignment. It seems to be the only
> way to tell the compiler that allocate by assignment is not
> desired, to save the overhead of actually testing for it.

It's still not clear to me what that overhead consists of.
In order to do a whole array assignment at all you must
do almost all (90+ %) of the work necessary to do conformance
tests. You have to find out how much data is moving and
from where to where, that's nearly everything. Further,
that work is outside of the data movement itself, so the
remaining (<10%) of the effort is not "inside the loop".

Since conformance testing is identical to the test for
reallocation (the only difference is what you do if
conformance didn't match), the only overhead if
both sides do conform is the test. I'm aware that
some implementors claim the overhead is significant.
I have no idea why. Is it something inherent in the
problem, or (more likely) just something they haven't
worked out yet? The latter is likely because they can
always tell users to turn off the tests if the performance
is unacceptable. This allows them to put off the job
of fixing the performance problem. Things like that
can exist in implementations for years.

> If you want to add a notation that means 'whole array' on
> the left side of an assignment, how would () be?
>
> a()=b+c

If you really need a way to defeat reallocation, how would
be:

A(::1) = b + c

glen herrmannsfeldt

May 14, 2008, 5:27:46 PM

James Giles wrote:
(snip on overhead for allocate on assignment)

> It's still not clear to me what that overhead consists of.
> In order to do a whole array assignment at all you must
> do almost all (90+ %) of the work necessary to do conformance
> tests. You have to find out how much data is moving and
> from where to where, that's nearly everything. Further,
> that work is outside of the data movement itself, so the
> remaining (<10%) of the effort is not "inside the loop".

If bounds checking is on, I agree the overhead is small.

Another consideration is in debugging where it is nice to
know the places where an array can be reallocated. If I
later find an array to be the wrong size, I have to go
through all possible reallocation statements to check.

(Partly that has to do with bugs I have chased down
in R programs, where R does allocate on assignment.
It also has to do with a feature of the R apply function.)

> Since conformance testing is identical to the test for
> reallocation (the only difference is what you do if
> conformance didn't match), the only overhead if
> both sides do conform is the test. I'm aware that
> some implementors claim the overhead is significant.
> I have no idea why.

Well, for small arrays in deeply nested loops it could
easily be very significant. Also, I am not sure how
good compilers are at optimizing bounds checking.
A DO variable (that isn't modified in the loop) can
be assumed to stay within the loop bounds, and should
avoid some bounds tests. I don't know which compilers
do that optimization.

> Is it something inherent in the
> problem, or (more likely) just something they haven't
> worked out yet? The latter is likely because they can
> always tell users to turn off the tests if the performance
> is unacceptable. This allows them to put off the job
> of fixing the performance problem. Things like that
> can exist in implementations for years.

(snip)

> If you really need a way to defeat reallocation, how would
> be:

> A(::1) = b + c

I suppose that works.

-- glen

James Giles

May 14, 2008, 5:40:48 PM

glen herrmannsfeldt wrote:
> James Giles wrote:
> (snip on overhead for allocate on assignment)
>
>> It's still not clear to me what that overhead consists of.
>> In order to do a whole array assignment at all you must
>> do almost all (90+ %) of the work necessary to do conformance
>> tests. You have to find out how much data is moving and
>> from where to where, that's nearly everything. Further,
>> that work is outside of the data movement itself, so the
>> remaining (<10%) of the effort is not "inside the loop".
>
> If bounds checking is on, I agree the overhead is small.

No, my analysis remains true whether bounds checking is
on or not. The test remains only a small part of the whole
effort to decide how much data and where. And it remains
"outside the loop". I agree that *some* implementations
might still not have efficient ways of doing it. But that's
quality of implementation and not a language design issue.

> Another consideration is in debugging where it is nice to
> know the places where an array can be reallocated. If I
> later find an array to be the wrong size, I have to go
> through all possible reallocation statements to check.

To be sure I haven't ever said that no way should exist to
avoid reallocation. I just think the existing way of expressing
that distinction is not satisfactory.

>> Since conformance testing is identical to the test for
>> reallocation (the only difference is what you do if
>> conformance didn't match), the only overhead if
>> both sides do conform is the test. I'm aware that
>> some implementors claim the overhead is significant.
>> I have no idea why.
>
> Well, for small arrays in deeply nested loops it could

> easily be very significant. [...]

Sure. And as I've said before, contrived examples can
hypothetically arise in real programs. But if you have
already identified the conformance test for reallocation-
on-assignment as your bottleneck that's the time to take
alternative action. Why hamper the language's design
in non-intuitive ways. Are not the number of people
that are confused by the feature as it stands sufficient
evidence that changes are in order?

Converting the whole array operation into a DO loop is maybe
faster in most of those instances of small data moves anyway.

> [...] Also, I am not sure how

> good compilers are at optimizing bounds checking.
> A DO variable (that isn't modified in the loop) can
> be assumed to stay within the loop bounds, and should
> avoid some bounds tests. I don't know which compilers
> do that optimization.

So, language features should not be chosen with assumption
that 30+ year old technology ought to be used?

Jan Vorbrüggen

May 15, 2008, 5:14:13 AM

> Since conformance testing is identical to the test for
> reallocation (the only difference is what you do if
> conformance didn't match), the only overhead if
> both sides do conform is the test.

However, the point is that when the compiler knows reallocation cannot
happen - because the colon form is used or, as apparently in the case of
the Intel compiler, a switch has disabled the functionality - it can
completely skip the test. Although I haven't checked, my gut feeling
tells me that conformability of array assignments is not a constraint in
the standard, and it's therefore up to the programmer to ensure Bad
Things Don't Happen and not up to the processor.

> I'm aware that
> some implementors claim the overhead is significant.

Walking through two separate data structures, comparing them for
conformability - with the performance of that test being influenced by
the detailed design of the data structure, which however also needs to
serve other uses - and with the large amount of conditional jumps that
entails does strike me as a potentially significant overhead if you're
not moving megabytes every time.

And as noted above for the Intel compiler, apparently competent people
have concluded that it's worth the overhead (in a different meaning of
the word, to be sure) of enabling their customers to control this
themselves. So there.

Jan