On Thursday, November 3, 2016 at 12:41:34 AM UTC-4, FortranFan wrote:
> On Wednesday, November 2, 2016 at 11:48:25 PM UTC-4, Ian Harvey wrote:
>
> > ..
> >
> > If each element of the array is itself of reasonable size, it can help
> > to hold each element via an allocatable component in a wrapper type,
> > then you can move_alloc that component across to the new array rather
> > than having to actually copy the element value.
>
> .. the move operation instead of a copy:
>
> call tmp(1:N)%move( foo ) ! this replaces <tmp(1:N) = foo> step in the snippet
.. The above does help me tremendously in terms of performance ..
Shown below are some results from a rather simple code example listed further down in this post; hopefully this makes things clearer. The point, as Ian Harvey explained, is that if one employs the MOVE_ALLOC intrinsic on the component(s) of the FOO_T derived type from the original post, instead of following the book example, the improvement is considerable.
In the code below, two sequences are attempted: 1) resize_copy, a contained procedure that follows the steps shown by Metcalf et al. and described in the original post, and 2) resize_move, which follows Ian Harvey's suggestion and is something I've had in the back of my mind for a while now, ever since I became familiar with the C++11 standard revision.
In the timings further down, resize_copy takes from about 0.3 seconds to over 200 seconds depending on the data size, whereas resize_move stays at a few milliseconds throughout.
---
So the question I want to pose is this: is it possible to formalize this somehow in a future revision of the Fortran standard, to make it easier and clearer for coders who need to work with any kind of *containers* of derived types? That is, I am wondering if it's possible to introduce some form of "move assignment" semantics in addition to the existing defined assignment, which is effectively a copy assignment. See this for some background with respect to C++:
https://en.wikipedia.org/wiki/Move_assignment_operator.
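To make the idea a bit more concrete, here is a minimal sketch of how far one can get today with a type-bound procedure standing in for a real move assignment. It mirrors the "call tmp(1:N)%move( foo )" line quoted above. The module name foo_move_m, the binding name move, and the small demo program are hypothetical names made up for illustration; this is not proposed syntax.

```fortran
module foo_move_m

   use, intrinsic :: iso_fortran_env, only : WP => real64

   implicit none

   private

   type, public :: foo_t
      real(WP), allocatable :: dat(:)
   contains
      ! Hypothetical binding name; stands in for a "move assignment"
      procedure, pass(this) :: move => move_foo
   end type foo_t

contains

   ! Transfer the allocation of from%dat to this%dat without copying values.
   ! ELEMENTAL, so it applies element by element to conforming arrays.
   elemental subroutine move_foo( this, from )
      class(foo_t), intent(inout) :: this
      type(foo_t),  intent(inout) :: from
      call move_alloc( from=from%dat, to=this%dat )
   end subroutine move_foo

end module foo_move_m

program demo_move

   use, intrinsic :: iso_fortran_env, only : WP => real64
   use foo_move_m, only : foo_t

   implicit none

   type(foo_t), allocatable :: foo(:), tmp(:)
   integer :: i, n

   n = 3
   allocate( foo(n) )
   do i = 1, n
      allocate( foo(i)%dat(2) )
      foo(i)%dat = real( i, kind=WP )
   end do

   ! Grow the array: move the components, then move the array itself
   allocate( tmp(2*n) )
   call tmp(1:n)%move( foo )
   call move_alloc( from=tmp, to=foo )

   if ( size(foo) /= 2*n ) error stop "unexpected size"
   if ( .not. allocated(foo(n)%dat) ) error stop "data not moved"
   if ( foo(n)%dat(1) /= real( n, kind=WP ) ) error stop "wrong value"
   if ( allocated(foo(n+1)%dat) ) error stop "new element should be empty"
   print "(a)", "move ok"

end program demo_move
```

The catch is that nothing in the language ties such a procedure to the type the way defined assignment is tied to it; the coder has to remember to call it, and has to write it correctly for every component.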
I bring this up because, in the general context of derived types and type extensions, it is rather difficult for a coder to be fully sure of how to design the type (as in FOO_T below) and how to set up the move scheme (as in move_foo_dat below) so that containers of the type, an array being the simplest, can be grown and shrunk efficiently. So if some rules, constraints, and syntax were introduced in the language around derived type components and the methods acting on them, perhaps this could be made more straightforward.
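As a small illustration of the kind of pitfall I have in mind, consider the sketch below. The extension bar_t, its component idx, and the procedure move_bar are hypothetical and exist only for this example. A move routine written for the parent type knows nothing about components added by an extension, so applying it to the parent part silently leaves the new component behind.

```fortran
module ext_demo_m

   implicit none

   private

   public :: move_foo_dat, move_bar

   type, public :: foo_t
      real, allocatable :: dat(:)
   end type foo_t

   ! Hypothetical extension that adds a second allocatable component
   type, extends(foo_t), public :: bar_t
      integer, allocatable :: idx(:)
   end type bar_t

contains

   ! Moves only what foo_t knows about
   elemental subroutine move_foo_dat( rhs, lhs )
      type(foo_t), intent(inout) :: rhs
      type(foo_t), intent(inout) :: lhs
      call move_alloc( from=rhs%dat, to=lhs%dat )
   end subroutine move_foo_dat

   ! The extension has to supply its own move for the extra component
   elemental subroutine move_bar( rhs, lhs )
      type(bar_t), intent(inout) :: rhs
      type(bar_t), intent(inout) :: lhs
      call move_foo_dat( rhs%foo_t, lhs%foo_t )
      call move_alloc( from=rhs%idx, to=lhs%idx )
   end subroutine move_bar

end module ext_demo_m

program demo_ext

   use ext_demo_m

   implicit none

   type(bar_t) :: a, b, c, d

   a%dat = [ 1.0, 2.0 ]
   a%idx = [ 10, 20 ]
   c%dat = [ 3.0, 4.0 ]
   c%idx = [ 30, 40 ]

   ! Moving through the parent component transfers dat but not idx
   call move_foo_dat( a%foo_t, b%foo_t )
   if ( .not. allocated(b%dat) ) error stop "dat not moved"
   if ( allocated(b%idx) ) error stop "idx unexpectedly moved"
   if ( .not. allocated(a%idx) ) error stop "idx lost"
   print "(a)", "parent-only move: idx left behind"

   ! The type-specific move transfers both components
   call move_bar( c, d )
   if ( .not. ( allocated(d%dat) .and. allocated(d%idx) ) ) error stop "incomplete move"
   if ( allocated(c%dat) .or. allocated(c%idx) ) error stop "source not emptied"
   print "(a)", "full move: dat and idx transferred"

end program demo_ext
```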
---
For the code listed below, here are some execution results using gfortran for a 64-bit target on a Windows 7 workstation with 16 GB RAM and an Intel Core i7-6820HQ CPU @ 2.7 GHz:
N = 100000 --------------------------------
 # Array      CPU Time       CPU Time
 Elements     resize_copy    resize_move
              (seconds)      (seconds)
-------------------------------------------
    1000      0.2893         3.9E-3
    5000      2.433          3.9E-3
   10000      30.98          4.0E-3
   15000      203.9          4.2E-3
-------------------------------------------

N = 50000 ---------------------------------
 # Array      CPU Time       CPU Time
 Elements     resize_copy    resize_move
              (seconds)      (seconds)
-------------------------------------------
    2000      0.2709         2.0E-3
   10000      1.721          2.3E-3
   20000      19.81          1.9E-3
-------------------------------------------
-- begin code --
module mykinds_m

   use, intrinsic :: iso_fortran_env, only : I8 => int64, WP => real64

   implicit none

end module mykinds_m
module foo_m

   use mykinds_m, only : WP

   implicit none

   private

   type, public :: foo_t
      real(WP), allocatable :: dat(:)
   contains
      procedure, pass(this), public :: set => set_foo
   end type foo_t

   public :: move_foo_dat

contains

   ! Transfer the allocation of rhs%dat to lhs%dat; no element values are copied
   elemental subroutine move_foo_dat( rhs, lhs )

      type(foo_t), intent(inout) :: rhs
      type(foo_t), intent(inout) :: lhs

      call move_alloc( from=rhs%dat, to=lhs%dat )

      return

   end subroutine move_foo_dat

   ! (Re)allocate the data to size n and fill it with random numbers
   impure elemental subroutine set_foo( this, n )

      class(foo_t), intent(inout) :: this
      integer, intent(in)         :: n

      if ( allocated(this%dat) ) then
         deallocate( this%dat )
      end if
      allocate( this%dat(n) )
      call random_number( this%dat )

      return

   end subroutine set_foo

end module foo_m
program p

   use mykinds_m, only : I8, WP
   use foo_m, only : foo_t, move_foo_dat

   implicit none

   integer, parameter :: N = 100000
   integer, parameter :: DATA_SIZE = 15000
   type(foo_t), allocatable :: foo(:)

   print *, "Initial array size, N = ", N
   print *, "Size of data in each array element, DATA_SIZE = ", DATA_SIZE

   ! Set up foo array
   allocate( foo(N) )
   call foo%set( DATA_SIZE )

   print *, "Before reallocation:"
   print *, "size(foo) = ", size(foo)
   print *, "foo(N)%dat(1) = ", foo(N)%dat(1)

   print *
   !call resize_copy()
   call resize_move()
   print *

   print *, "After reallocation:"
   print *, "size(foo) = ", size(foo)
   print *, "foo(N)%dat(1) = ", foo(N)%dat(1)

   stop

contains

   subroutine resize_copy()

      type(foo_t), allocatable :: tmp(:)
      real(WP) :: start_time
      real(WP) :: end_time

      call my_cpu_time( start_time )

      ! Canonical sequence for array growth per Metcalf et al.
      allocate( tmp(2*N) )
      tmp(1:N) = foo
      call move_alloc( from=tmp, to=foo )

      call my_cpu_time( end_time )

      print "(*(g0.4))", "Reallocation sequence: CPU time = ", (end_time - start_time), &
         " seconds."

      return

   end subroutine resize_copy

   subroutine resize_move()

      type(foo_t), allocatable :: tmp(:)
      real(WP) :: start_time
      real(WP) :: end_time

      call my_cpu_time( start_time )

      ! Possible sequence for array growth for types with ALLOCATABLE components
      allocate( tmp(2*N) )
      call move_foo_dat( foo, tmp(1:N) )
      call move_alloc( from=tmp, to=foo )

      call my_cpu_time( end_time )

      print "(*(g0.4))", "Reallocation sequence: CPU time = ", (end_time - start_time), &
         " seconds."

      return

   end subroutine resize_move

   ! Note: despite the name, this reads SYSTEM_CLOCK, i.e. elapsed (wall-clock) time
   subroutine my_cpu_time( time )

      !.. Argument list
      real(WP), intent(out) :: time

      !.. Local variables
      integer(I8) :: tick
      integer(I8) :: rate

      call system_clock( tick, rate )

      time = real( tick, kind=kind(time) ) / real( rate, kind=kind(time) )

      return

   end subroutine my_cpu_time

end program p
-- end code --