Kay Diederichs
unread,Mar 1, 2023, 5:14:53 AM3/1/23You do not have permission to delete messages in this group
Sign in to report message
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to
Dear all,
I'm a long-time ifort user; now with oneAPI 2022.2.1 or 2023.0.0 on Linux (CentOS-7).
I used to find the OpenMP part of ifort reliable (and I did contribute many OpenMP-related bug reports in the early years of Parallel Studio), but now I seem to run into a rather trivial bug.
The following program tries to reduce the performance impact of a CRITICAL region which is used for writing to a file, by
buffering batches of nbuf lines for each thread. So the larger nbuf is, the more output is done upon entering the CRITICAL region.
Clearly, the buffers of each thread need to be emptied at the end. This requires a parallel region. The code is
! for ifort, must raise stacksize with (say): ulimit -s 81920
! and depending on the values of nrow and ncol: export OMP_STACKSIZE 16m
USE omp_lib
IMPLICIT NONE
INTEGER, PARAMETER :: nbuf=100
INTEGER nrow,ncol,i,j,ncc,nskip,nout,ibuf,it(nbuf),jt(nbuf)
REAL, ALLOCATABLE :: matrix(:,:)
REAL cc,cct(nbuf)
nrow=99
ncol=65
ALLOCATE(matrix(nrow,ncol))
CALL RANDOM_NUMBER(matrix)
nskip=0
nout=0
OPEN(2,file='test.dat')
! calculate something, and write it out in batches of size nbuf
!$OMP PARALLEL PRIVATE(i,j,ncc,cc,it,jt,cct) SHARED(nrow,ncol,matrix) &
!$OMP& REDUCTION(+:nskip,nout) PRIVATE(ibuf) DEFAULT(none)
ibuf=0
!$OMP DO SCHEDULE(guided)
DO i=1,ncol-1
DO j=i+1,ncol
call get_stuff(matrix(1:nrow,i),matrix(1:nrow,j),nrow,cc,ncc)
IF (ncc>4) THEN
nout=nout+1
ibuf=ibuf+1
it(ibuf)=i
jt(ibuf)=j
cct(ibuf)=cc
IF (ibuf==nbuf) THEN
!$OMP CRITICAL
DO ibuf=1,nbuf
WRITE(2,'(2(i0,1x),f0.3)') it(ibuf),jt(ibuf),cct(ibuf)
END DO
!$OMP END CRITICAL
ibuf=0
END IF
ELSE
nskip=nskip+1
END IF
END DO
END DO
!$OMP END DO
! remaining buffers:
!$OMP CRITICAL
! if MOD(nount,nbuf)/=0 then the ibuf values should not be all 0
!$ print '(a,i3,a,i3,a)','thread=',omp_get_thread_num(),' ibuf=',ibuf,' (should not be all 0)'
DO i=1,ibuf
WRITE(2,'(2(i0,1x),f0.3)') it(i),jt(i),cct(i)
END DO
!$OMP END CRITICAL
!$OMP END PARALLEL
CLOSE(2)
PRINT*,'number of output lines in test.dat should be',nout, '(check with wc-l !)'
END
SUBROUTINE get_stuff(a,b,n,cc,ncc)
INTEGER, INTENT(IN) :: n
REAL, INTENT(IN) :: a(n),b(n)
REAL, INTENT(OUT) :: cc
INTEGER, INTENT(OUT) :: ncc
cc=SUM(a+b)/n/2
ncc=COUNT(a>0.1)
END SUBROUTINE get_stuff
The program works as expected with gfortran or the Sunstudio compiler.
But with ifort, the values of ibuf are all 0 when finally emptying the buffer, and consequently many output lines are missing in test.dat .
Can you confirm the bug? Or am I overlooking something?
Thanks,
Kay