
gfortran and OpenMP


Bart Vandewoestyne

Nov 9, 2007, 9:37:37 AM
Doing some first experiments with OpenMP and Fortran 95, I am
trying to compile and run the following program:

program sums

  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = selected_real_kind(2*precision(1.0_sp))

  integer, parameter :: num_steps = 2000000
  real :: t1, t2
  integer :: i, j
  real(kind=dp), dimension(0:num_steps-1) :: y

  call cpu_time(t1)

  !$omp parallel do
  do j=0,num_steps-1
    do i=0,49
      y(j) = y(j) + 0.7_dp**i
    end do
  end do
  !$omp end parallel do

  call cpu_time(t2)

  print *, "y(end) = ", y(num_steps-1)
  print *, "Reached result in ", t2-t1, " seconds processor time."

end program sums


I am using gfortran as follows:

bartv@ciney:~/openmp$ gfortran --version | head -1
GNU Fortran (GCC) 4.3.0 20071002 (experimental) [trunk revision 128946]
bartv@ciney:~/openmp$ gfortran -fopenmp -o sums sums.f95
bartv@ciney:~/openmp$ ./sums
Segmentation fault (core dumped)
bartv@ciney:~/openmp$

Does anybody see the reason for this segmentation fault? What am
I doing wrong here?

Thanks,
Bart

highegg

Nov 9, 2007, 9:51:29 AM
On Nov 9, 3:37 pm, Bart Vandewoestyne

I'm not entirely sure, but I think that only the iterator variable of
the parallelized loop itself is implicitly private, not any enclosed
loop. I think you should make I private, i.e. use

!$omp parallel do private(i)


deltaseq0

Nov 9, 2007, 9:58:37 AM

"Bart Vandewoestyne" <MyFirstName...@cs.kuleuven.be> wrote in
message news:fh1rbh$67m$1...@ikaria.belnet.be...
Does y need to be initialized? - Mike


Bart Vandewoestyne

Nov 9, 2007, 10:09:11 AM
highegg wrote:
>
> I'm not entirely sure, but I think that only the iterator variable of
> the parallelized loop itself is implicitly private, not any enclosed
> loop. I think you should make I private, i.e. use
>
> !$omp parallel do private(i)

This doesn't seem to help :-(

Bart

Bart Vandewoestyne

Nov 9, 2007, 10:09:58 AM
deltaseq0 wrote:
>
> Does y need to be initialized? - Mike

Adding

y = 0

right before the first call to cpu_time() doesn't seem to help
either :-(

Bart

Bart Vandewoestyne

Nov 9, 2007, 10:44:21 AM

Hmm... I've just figured out the cause of the problem...
apparently the value of 2000000 for num_steps is too big... if I
decrease it to 200000 the program runs fine.

However, I do not understand this... I am running this on a
computer with 2GB of main memory... A rough calculation tells me
that I need:

2000000 x 8 bytes/real = 16000000 bytes = about 15.3 MiB

so the 2 GB in my machine should be absolutely enough, shouldn't
it???

What am I missing here?

Bart

--
"Share what you know. Learn what you don't."

Salvatore

Nov 9, 2007, 11:03:51 AM
On 9 Nov, 16:44, Bart Vandewoestyne
<MyFirstName.MyLastN...@telenet.be> wrote:
The original program runs fine with gfortran 4.2.2 but segfaults with
4.3.0-20071102.
Looks like a bug in 4.3.

Salvatore

malzf...@googlemail.com

Nov 9, 2007, 11:11:32 AM


I just tried this with ifort and it segfaults, too, if I don't
increase the size of the stack. With "ulimit -s unlimited", it runs
fine for me...


lin...@sohu.com

Nov 9, 2007, 11:14:45 AM
program sums

  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = selected_real_kind(2*precision(1.0_sp))

  integer, parameter :: num_steps = 2000000
  real :: t1, t2
  integer :: i, j

  real(kind=dp), dimension(0:num_steps-1) :: y=0.0_dp

  call cpu_time(t1)

  !$omp parallel do private(i)
  do j=0,num_steps-1
    do i=0,49
      y(j) = y(j) + 0.7_dp**i
    end do
  end do
  !$omp end parallel do

  call cpu_time(t2)

  print *, "y(end) = ", y(num_steps-1)
  print *, "Reached result in ", t2-t1, " seconds processor time."

end program sums

================================
I have just made two small fixes: one is y=0.0_dp, the other is private(i).
Then the program passed.

apple@localhost ~/linux/fortran $ gfortran omp.f95 -fopenmp
apple@localhost ~/linux/fortran $ time ./a.out
y(end) = 3.3333332733844969
Reached result in 7.8244886 seconds processor time.

real 0m23.556s
user 0m7.760s
sys 0m0.072s
====================================
My computer has 768 MB of RAM, glibc version 2.7, and GNU Fortran
(GCC) 4.3.0 20071102 (experimental).

Then I set num_steps = 20000000 and gave gfortran some flags to speed
things up:

gfortran omp.f95 -fopenmp -mfpmath=sse -mtune=pentium4 -march=pentium4
-O3 -ftree-vectorize -funroll-all-loops

The result is:
apple@localhost ~/linux/fortran $ time ./a.out
y(end) = 3.3333332733844969
Reached result in 118.09538 seconds processor time.

real 3m59.905s
user 1m57.579s
sys 0m0.544s


malzf...@googlemail.com

Nov 9, 2007, 11:20:51 AM
On Nov 9, 11:14 am, linu...@sohu.com wrote:
> ================================
> I have just made two small fixes: one is y=0.0_dp, the other is private(i).
> Then the program passed.

Agreed, I made those two little changes as well. With ifort, however,
the program does not scale at all and the speed with which it runs is
virtually identical single-threaded and on two threads.
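
One possible reason for the apparently identical timings, offered here as an assumption rather than something established in this thread: cpu_time generally reports processor time accumulated by the whole process, so with two threads it can stay roughly constant even when the wall-clock time drops. A minimal sketch that records both, using omp_get_wtime from the standard omp_lib module, might look like this. The program name sums_wtime is made up for illustration, num_steps is set to the smaller value reported to run without a segfault, and the file has to be compiled with -fopenmp (or the equivalent ifort option).

```fortran
program sums_wtime

  use omp_lib, only: omp_get_wtime
  implicit none

  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = selected_real_kind(2*precision(1.0_sp))

  ! Smaller size, so the array fits in a default 8192 KiB stack.
  integer, parameter :: num_steps = 200000
  real :: c1, c2                 ! processor time from cpu_time
  double precision :: w1, w2     ! wall-clock time from omp_get_wtime
  integer :: i, j
  real(kind=dp), dimension(0:num_steps-1) :: y

  y = 0.0_dp

  call cpu_time(c1)
  w1 = omp_get_wtime()

  !$omp parallel do private(i)
  do j = 0, num_steps-1
    do i = 0, 49
      y(j) = y(j) + 0.7_dp**i
    end do
  end do
  !$omp end parallel do

  w2 = omp_get_wtime()
  call cpu_time(c2)

  print *, "y(end) = ", y(num_steps-1)
  print *, "processor time (all threads): ", c2-c1, " seconds"
  print *, "wall-clock time:              ", w2-w1, " seconds"

end program sums_wtime
```

If the wall-clock figure shrinks with OMP_NUM_THREADS=2 while the processor time does not, the loop is scaling and only the timer was misleading.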

lin...@sohu.com

Nov 9, 2007, 11:23:52 AM
I think gfortran 4.3.0 20071102 is very stable.

You will see this by doing a "make check-fortran" after compiling GCC.

Bart Vandewoestyne

Nov 9, 2007, 11:33:31 AM
On 2007-11-09, lin...@sohu.com <lin...@sohu.com> wrote:
> I think gfortran 4.3.0 20071102 is very stable.
>
> You will see this by doing a "make check-fortran" after compiling GCC.

I just tried the following program with the latest version of
gfortran (the one of today... can't get it fresher than that ;-)

program sums

  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = selected_real_kind(2*precision(1.0_sp))

  integer, parameter :: num_steps = 2000000
  real :: t1, t2
  integer :: i, j
  real(kind=dp), dimension(0:num_steps-1) :: y

  call cpu_time(t1)

  y = 0.0_dp

  !$omp parallel do private(i)
  do j=0,num_steps-1
    do i=0,49
      y(j) = y(j) + 0.7_dp**i
    end do
  end do
  !$omp end parallel do

  call cpu_time(t2)

  print *, "y(end) = ", y(num_steps-1)
  print *, "Reached result in ", t2-t1, " seconds processor time."

end program sums


bartv@vonneumann:~/openmp$ gfortran --version | head -1
GNU Fortran (GCC) 4.3.0 20071109 (experimental) [trunk revision 130034]
bartv@vonneumann:~/openmp$ gfortran -fopenmp sums.f95
bartv@vonneumann:~/openmp$ ./a.out
Segmentation fault
bartv@vonneumann:~/openmp$


If I decrease num_steps from 2000000 to 200000 then I don't get the
segmentation fault.

I'm doing this on a machine with 1.5GB of main memory.

Can we conclude that this is a bug in gfortran or not?

Regards,

lin...@sohu.com

Nov 9, 2007, 12:03:21 PM
Yes, it is a bug.

I suggest you file a bug report with the GCC team.

Toon Moene

Nov 9, 2007, 3:31:41 PM
lin...@sohu.com wrote:

> I think gfortran 4.3.0 20071102 is very stable.
>
> You will see this by doing a "make check-fortran" after compiling GCC.

Be careful. What you see with make check-fortran is only that we fixed
the bugs we caught.

Cheers,

--
Toon Moene - e-mail: to...@moene.indiv.nluug.nl - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003

Tobias Burnus

Nov 9, 2007, 5:02:12 PM
On Nov 9, 6:03 pm, linu...@sohu.com wrote:
> Yes, it is a bug.
> I suggest you file a bug report with the GCC team.

I have now filed http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34044
However, I agree with Jakub (= GCC OpenMP specialist) in wondering
whether this is merely a stack-size-limit problem.

What does ulimit -s report? I use unlimited stack size (especially as
ifort is very stack hungry) and it works for me with -m32 and -m64 and
gfortran 4.2.2 and 4.3.0 on x86-64.

On Nov 9, 9:31 pm, Toon Moene <t...@moene.indiv.nluug.nl> wrote:


> linu...@sohu.com wrote:
> > I think gfortran 4.3.0 20071102 is very stable.
> > You will see this by doing a "make check-fortran" after compiling GCC.
> Be careful. What you see with make check-fortran is only that we fixed
> the bugs we caught.

I agree that gfortran 4.3.0 is quite stable. Not many bugs have been
reported lately, and development has calmed down a bit,
especially in the middle end and back end. There might well be a GCC
release in Q1 2008. openSUSE recently switched to GCC 4.3 as the default
compiler for Factory, and Red Hat also wants to use GCC 4.3 for their
next release.

Regarding "make check-gfortran": It is required that one runs the
gfortran test suite for any patch to be applied. And for middle end
work, the tests for all languages (incl. gfortran) need to pass. Thus
"make check-gfortran" should almost never fail (on common platforms at
least, others have some failures).

Tobias

Bart Vandewoestyne

Nov 9, 2007, 6:52:20 PM
On 2007-11-09, Tobias Burnus <bur...@net-b.de> wrote:
>
> What does ulimit -s report? I use unlimited stack size (especially as
> ifort is very stack hungry) and it works for me with -m32 and -m64 and
> gfortran 4.2.2 and 4.3.0 on x86-64.

For the record:

With the following program (num_steps equal to 2000000):

program sums

  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = selected_real_kind(2*precision(1.0_sp))

  integer, parameter :: num_steps = 2000000
  real :: t1, t2
  integer :: i, j
  real(kind=dp), dimension(0:num_steps-1) :: y

  call cpu_time(t1)

  y = 0.0_dp

  !$omp parallel do private(i)
  do j=0,num_steps-1
    do i=0,49
      y(j) = y(j) + 0.7_dp**i
    end do
  end do
  !$omp end parallel do

  call cpu_time(t2)

  print *, "y(end) = ", y(num_steps-1)
  print *, "Reached result in ", t2-t1, " seconds processor time."

end program sums

and gfortran version:

bartv@ciney:~/openmp$ gfortran --version | head -1
GNU Fortran (GCC) 4.3.0 20071002 (experimental) [trunk revision 128946]

I get:

bartv@ciney:~/openmp$ gfortran -fopenmp sums.f95
bartv@ciney:~/openmp$ ./a.out
Segmentation fault (core dumped)


ulimit -s gives me:

bartv@ciney:~$ ulimit -s
8192

If you need more info or want me to do more tests, just let me know.

Regards,

lin...@sohu.com

Nov 9, 2007, 8:06:34 PM
I noticed that a new compiler flag, "-fmax-stack-var-size=N", was
introduced in gfortran. It seems to have some relationship with
OpenMP. What is it?

Regards,


lin...@sohu.com

Nov 9, 2007, 8:17:51 PM
> real(kind=dp), dimension(0:num_steps-1) :: y
>
> call cpu_time(t1)
>
> y = 0.0_dp
>
> bartv@ciney:~/openmp$ gfortran -fopenmp sums.f95
> bartv@ciney:~/openmp$ ./a.out
> Segmentation fault (core dumped)


I think it is obviously a bug,

because on my computer a.out runs normally with

real(kind=dp), dimension(0:num_steps-1) :: y=0.0_dp

and gives a segmentation fault with

real(kind=dp), dimension(0:num_steps-1) :: y
call cpu_time(t1)
y = 0.0_dp

Am I right?

lin...@sohu.com

Nov 9, 2007, 8:59:04 PM
On Nov 9, 8:31, Toon Moene <t...@moene.indiv.nluug.nl> wrote:

> linu...@sohu.com wrote:
> > I think gfortran 4.3.0 20071102 is very stable.
>
> > You will see this by doing a "make check-fortran" after compiling GCC.
>
> Be careful. What you see with make check-fortran is only that we fixed
> the bugs we caught.

the result of "make check-fortran" is

=== gfortran Summary ===

# of expected passes 21944
# of unexpected successes 1
# of expected failures 8
# of unsupported tests 16
/usr/src/gcc-4.3-20071102/f95/gcc/testsuite/gfortran/../../gfortran
version 4.3.0 20071102 (experimental) (GCC)
make: [check-gfortran] error 1 (ignored)


how about this result?

Tobias Burnus

Nov 10, 2007, 6:29:39 AM
On Nov 10, 12:52 am, Bart Vandewoestyne

<MyFirstName.MyLastN...@telenet.be> wrote:
> I get:
> bartv@ciney:~/openmp$ gfortran -fopenmp sums.f95
> bartv@ciney:~/openmp$ ./a.out
> Segmentation fault (core dumped)
>
> ulimit -s gives me:
>
> bartv@ciney:~$ ulimit -s
> 8192

This is too small. As someone has calculated:

2000000 x 8 bytes/real = 16000000 bytes = about 15.3 MiB

Try it with:
ulimit -S -s 16384 # increase soft limit to 16384 KiB
or use "unlimited".
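
The comparison behind "too small" can be checked with a few lines of Fortran. This is only an illustrative sketch: the program name array_vs_stack and the constant stack_kib are made up here, with stack_kib hard-coded to the 8192 reported by ulimit -s earlier in the thread, and storage_size is a Fortran 2008 intrinsic that the gfortran 4.3 snapshot discussed here may not provide.

```fortran
program array_vs_stack

  implicit none

  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = selected_real_kind(2*precision(1.0_sp))
  integer, parameter :: num_steps = 2000000

  ! Assumption: the soft stack limit in KiB, as printed by "ulimit -s".
  integer, parameter :: stack_kib = 8192

  integer :: bytes, kib

  ! storage_size returns bits, so divide by 8 to get bytes per element.
  bytes = num_steps * (storage_size(1.0_dp) / 8)
  kib   = bytes / 1024          ! 16000000 bytes = 15625 KiB

  print '(a,i0,a)', "array size:  ", kib, " KiB"
  print '(a,i0,a)', "stack limit: ", stack_kib, " KiB"

  if (kib > stack_kib) then
    print '(a)', "The array alone does not fit on the stack."
  end if

end program array_vs_stack
```

The array needs 15625 KiB, so a limit of 16384 KiB leaves only a little room for everything else that lives on the stack.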

For OpenMP the memory is put on the stack whereas the serial program
uses static memory. Thus for the OpenMP program your stack is too
small. Ifort actually does the same.

(In this particular case, static memory could be used as well, since "y"
is in the main program. If "y" were a local variable in a procedure that
could be called by several threads simultaneously, using static memory
would give wrong results. The sunf95 compiler seems to use static memory
for your particular program and thus does not have the stack-size problem.)
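
Besides raising the stack limit, a common workaround (not proposed by anyone in this thread, so treat it as a suggestion under that caveat) is to make the array allocatable, which places the data on the heap regardless of -fopenmp. A minimal sketch, with the hypothetical program name sums_heap and a built-in check against the closed form of the geometric sum:

```fortran
program sums_heap

  implicit none

  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = selected_real_kind(2*precision(1.0_sp))
  integer, parameter :: num_steps = 2000000

  integer :: i, j, stat
  real(kind=dp), allocatable :: y(:)
  real(kind=dp) :: expected

  ! Heap allocation: the size of y no longer counts against "ulimit -s".
  allocate(y(0:num_steps-1), stat=stat)
  if (stat /= 0) stop "allocation of y failed"

  y = 0.0_dp

  !$omp parallel do private(i)
  do j = 0, num_steps-1
    do i = 0, 49
      y(j) = y(j) + 0.7_dp**i
    end do
  end do
  !$omp end parallel do

  ! Closed form of sum_{i=0}^{49} 0.7**i, used as a sanity check.
  expected = (1.0_dp - 0.7_dp**50) / (1.0_dp - 0.7_dp)
  if (abs(y(num_steps-1) - expected) > 1.0e-12_dp) stop "unexpected result"

  print *, "y(end) = ", y(num_steps-1)

  deallocate(y)

end program sums_heap
```

The rest of the program is unchanged from the version posted above; only the declaration of y and the allocate/deallocate pair differ.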


> I noticed that a new compiler flag, "-fmax-stack-var-size=N", was
> introduced in gfortran. It seems to have some relationship with
> OpenMP. What is it?

Well, it is not applicable with OpenMP, and the relationship is as
follows.

The option -fopenmp implies -frecursive.

-frecursive
Allow indirect recursion by forcing all local arrays
to be allocated on the stack. This flag cannot be
used together with -fmax-stack-var-size= or
-fno-automatic.

-fmax-stack-var-size=n
This option specifies the size in bytes of the
largest array that will be put on the stack; if the
size is exceeded static memory is used (except in
procedures marked as RECURSIVE). Use the option
-frecursive to allow for recursive procedures which
do not have a RECURSIVE attribute or for parallel
programs. Use -fno-automatic to never use the stack.

This option currently only affects local arrays declared
with constant bounds, and may not apply to all character
variables. Future versions of GNU Fortran may improve
this behavior.

> > > I think gfortran 4.3.0 20071102 is very stable.
> > > You will see this by doing a "make check-fortran" after compiling GCC.
> >
> > Be careful. What you see with make check-fortran is only that we fixed
> > the bugs we caught.
>

> the result of "make check-fortran" is
> === gfortran Summary ===
> # of expected passes 21944
> # of unexpected successes 1
> # of expected failures 8
> # of unsupported tests 16

> how about this result?

Looks good. It means that all tests succeed, except for 8 which are
expected to fail (e.g. due to deficits in the system's C library).

Nonetheless, real-world programs may fail even if the test suite
succeeds. As the number of recently reported gfortran bugs is
quite low, the number of reported regressions is and was low, the
middle end is in bug-fix-only mode, and the number of users of gcc 4.3
and gfortran 4.3 is quite large, I agree that 4.3 is very stable.

Tobias

lin...@sohu.com

Nov 10, 2007, 7:40:39 AM
Thanks very much.
But you haven't answered one of my questions yet. Even when I use
"unlimited", gfortran still gives a segmentation fault.
Please check it and fix this bug in future versions.

================================================================

Steven G. Kargl

Nov 10, 2007, 10:26:51 AM
In article <1194698439.4...@v29g2000prd.googlegroups.com>,

lin...@sohu.com writes:
> Thanks very much.
> But you haven't answered one of my questions yet. Even when I use
> "unlimited", gfortran still gives a segmentation fault.
> Please check it and fix this bug in future versions.

In reviewing this entire thread and reading the gcc
bugzilla report, it would appear that there is no bug
in gfortran. You have hit a system resource limit on
your system. "unlimited" does not mean unlimited.
It means the maximum amount of memory that the sysadmin
has decided to allow a user to use. She could have
set the limit to 1 byte.

--
Steve
http://troutmask.apl.washington.edu/~kargl/

lin...@sohu.com

Nov 10, 2007, 12:42:49 PM
On Nov 10, 3:26, ka...@troutmask.apl.washington.edu (Steven G.
Kargl) wrote:
> In article <1194698439.449479.126...@v29g2000prd.googlegroups.com>,


Really? I don't think so.

What is the difference between

real(kind=dp), dimension(0:num_steps-1) :: y=0.0_dp

and

real(kind=dp), dimension(0:num_steps-1) :: y
y=0.0_dp

?

Why does the first one run through while the latter gives a segmentation
fault?
Try it!

regards,
X

deltaseq0

Nov 10, 2007, 1:17:42 PM
:
:

>> Steve
>> http://troutmask.apl.washington.edu/~kargl/
>
>
> Really? I don't think so.
>
> What is the difference between
> real(kind=dp), dimension(0:num_steps-1) :: y=0.0_dp
> and
> real(kind=dp), dimension(0:num_steps-1) :: y
> y=0.0_dp
> ?
>
> Why does the first one run through while the latter gives a segmentation
> fault?
> Try it!
>
> regards,
> X
>
One difference is that a local variable initialized in a type declaration
statement automatically inherits the SAVE attribute BUT I don't know why
that would account for the segmentation fault.
If you change the second declaration to:
real(kind=dp), SAVE, dimension(0:num_steps-1) :: y
do you still get a fault?
Just guessing. - Mike


lin...@sohu.com

Nov 10, 2007, 1:34:10 PM
>
> One difference is that a local variable initialized in a type declaration
> statement automatically inherits the SAVE attribute BUT I don't know why
> that would account for the segmentation fault.
> If you change the second declaration to:
> real(kind=dp), SAVE, dimension(0:num_steps-1) :: y
> do you still get a fault?
> Just guessing. - Mike

Great! It passed with

real(kind=dp),save,dimension(0:num_steps-1) :: y
y = 0.0_dp

Obviously, this is the source of the segmentation fault.


Tobias Burnus

Nov 10, 2007, 2:09:25 PM
> > What is the difference between
> > real(kind=dp), dimension(0:num_steps-1) :: y=0.0_dp
> > and
> > real(kind=dp), dimension(0:num_steps-1) :: y
> > y=0.0_dp
> > Why does the first one run through while the latter gives a segmentation
> > fault?
>
> One difference is that a local variable initialized in a type declaration
> statement automatically inherits the SAVE attribute BUT I don't know why
> that would account for the segmentation fault.

If a variable has the SAVE attribute, it ends up in static memory (and
is thus shared by all threads in OpenMP programs). This makes a HUGE
difference, as the example below illustrates. In order to give the
expected result, gfortran puts all local variables on the stack when
compiling OpenMP programs. For serial programs, many such variables are
usually put into static memory. (This is controlled by the options
-fautomatic, -fmax-stack-var-size=n and -frecursive; see the gfortran
manpage.)

Assume that the "function foo" simply returns the argument. Then the
following program

do i = 1, 10
  res(i) = foo(i)
end do

initializes the array with the values "1 2 3 4 5 6 7 8 9 10".


Now run the following OpenMP program once with "SAVE :: res" and once
without.

---------------------------
program test
  implicit none
  integer :: i, res(10)
  !$OMP parallel do
  do i = 1, 10
    res(i) = bar(i)
  end do
  print '(10(i0," "))', res
contains
  function bar(x)
    integer :: x, res, bar
    !!!! SAVE :: res
    integer j
    real :: y
    ! assign here
    res = x
    ! waste some CPU cycles
    ! to make a race more probable
    do j = 1, 20000
      y = cos(real(j))
    end do
    bar = res
  end function bar
end program test
---------------------------

The result
- for the serial program
- for the parallel program (OMP_NUM_THREADS >= 1) without SAVE:
1 2 3 4 5 6 7 8 9 10
but for the -fopenmp program with SAVE (OMP_NUM_THREADS > 1), e.g.
6 8 9 10 5 2 7 3 4 5

This shows why gfortran, with -fopenmp, puts all local variables on the
stack. (This is actually not needed for variables in the main program
("PROGRAM ..."), but gfortran and seemingly ifort also use the stack
for these.)

Tobias

Steven G. Kargl

Nov 10, 2007, 4:18:27 PM
In article <1194716569....@q5g2000prf.googlegroups.com>,

lin...@sohu.com writes:
> On Nov 10, 3:26, ka...@troutmask.apl.washington.edu (Steven G.
> Kargl) wrote:
>> In article <1194698439.449479.126...@v29g2000prd.googlegroups.com>,
>> linu...@sohu.com writes:
>>
>> > Thanks very much.
>> > But you haven't answered one of my questions yet. Even when I use
>> > "unlimited", gfortran still gives a segmentation fault.
>> > Please check it and fix this bug in future versions.
>>
>> In reviewing this entire thread and reading the gcc
>> bugzilla report, it would appear that there is no bug
>> in gfortran. You have hit a system resource limit on
>> your system. "unlimited" does not mean unlimited.
>> It means the maximum amount of memory that the sysadmin
>> has decided to allow a user to use. She could have
>> set the limit to 1 byte.
>>
>> --
>> Steve
>> http://troutmask.apl.washington.edu/~kargl/
>
>
> really?

Yes, really! It is not a bug in gfortran.

> I don't think so.

You have an extra word in the above sentence.

> What is the difference between
> real(kind=dp), dimension(0:num_steps-1) :: y=0.0_dp
> and
> real(kind=dp), dimension(0:num_steps-1) :: y
> y=0.0_dp
> ?

Static memory versus stack memory.

You have unfounded expectations of your system resources.

--
Steve
http://troutmask.apl.washington.edu/~kargl/
