FDS and OpenMP - First results

Christian Rogsch

Feb 3, 2009, 10:52:38 AM
to FDS and Smokeview Discussions
All,

I have done some parallelization work with OpenMP (only velo.f90 and
divg.f90 so far), and it works. First performance results are shown in
the uploaded file "OpenMP_FDS.jpeg". The blue curve shows the wall-clock
time of the calculation and the red curve the optimal ("wish")
wall-clock time.

If there are questions, please ask.

Best regards,
Christian

dr_jfloyd

Feb 3, 2009, 10:56:59 AM
to FDS and Smokeview Discussions
Nice. About the increase in time from 12 to 16 threads: was the
computer a quad-processor / quad-core machine?

Christian Rogsch

Feb 3, 2009, 10:58:55 AM
to FDS and Smokeview Discussions
It was a 32-core machine...

Christian Rogsch

Feb 3, 2009, 11:05:21 AM
to FDS and Smokeview Discussions
Just for clarification:

IBM Power 575 supercomputing node
see http://www-03.ibm.com/systems/power/hardware/575/

Hostikka Simo

Feb 3, 2009, 12:12:56 PM
to fds...@googlegroups.com
Very interesting!

First, I don't really know anything about using OpenMP.

- Was that a fire case or just flow? (Could you post your input file
too, please)
- How do you specify the number of threads - during the compile or
runtime?
- Was the "one thread" case a normal (unmodified) FDS or your version
with just one thread?

Simo

Christian Rogsch

Feb 3, 2009, 1:29:01 PM
to FDS and Smokeview Discussions
Simo,

the input file is in the Files section (1-Gitter_OpenMP.fds). It was
only a hot air flow case: because only divg and velo are parallelized,
I needed a case in which velo and divg take a very high share of the
calculation. Velo + Divg = 73% of the total calculation effort.

The number of threads is specified at runtime, so you can choose how
many threads should be used.

The "one thread" case was run with the modified version using only one
thread.

I used Version : 5.2.4 Serial, SVN Revision No. : 2670

Just a short comment about the parallelization: I have not yet measured
the performance of the individual parallelized DO loops; every loop
that can be parallelized has been parallelized. This sometimes
increases the calculation time, because the loop iterations must be
distributed to the threads. For a "short" DO loop (e.g. DO I=1,NEWC)
this slows things down, because distributing the loop to the threads
takes more time than calculating the loop itself. To get "good"
performance, a very detailed analysis has to be done:
1) check the calculation time for each loop,
2) check the thread distribution time for each loop using different
numbers of threads,
3) find the threshold at which it is better to a) calculate in serial,
b) use a limited number of threads, or c) use the full number of
threads.
But this is quite difficult...
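
A minimal sketch of options a) to c), assuming a hypothetical break-even length N_MIN_PARALLEL that would have to be measured for each loop and machine. It is not taken from the FDS source; it only uses the standard OpenMP clauses IF and NUM_THREADS:

```fortran
PROGRAM SHORT_LOOP_SKETCH
! Illustration only, not FDS code. N_MIN_PARALLEL and the loop bodies are hypothetical.
IMPLICIT NONE
INTEGER, PARAMETER :: EB = SELECTED_REAL_KIND(12)
INTEGER, PARAMETER :: N_MIN_PARALLEL = 10000  ! assumed break-even loop length
INTEGER :: I, N
REAL(EB), ALLOCATABLE :: A(:)

N = 100   ! a "short" loop
ALLOCATE(A(N))

! a) and c): the IF clause decides at run time. If the condition is false,
! the loop is executed by a single thread; otherwise by the full team.
!$OMP PARALLEL DO IF(N > N_MIN_PARALLEL) PRIVATE(I)
DO I=1,N
   A(I) = 2._EB*REAL(I,EB)
ENDDO
!$OMP END PARALLEL DO

! b): NUM_THREADS requests a limited team for this loop only
!$OMP PARALLEL DO NUM_THREADS(2) PRIVATE(I)
DO I=1,N
   A(I) = A(I) + 1._EB
ENDDO
!$OMP END PARALLEL DO

WRITE(*,*) 'A(N) = ', A(N)
END PROGRAM SHORT_LOOP_SKETCH
```

Compiled without the OpenMP option, the directives are treated as comments and both loops simply run in serial.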

Regards,
Christian

Kevin

Feb 3, 2009, 1:38:02 PM
to FDS and Smokeview Discussions
This is terrific! This seems like a very good way to speed up
calculations on all the multi-processor, multi-core machines. What has
to be added to the source code to make this all work? Is it possible
to "hide" the OpenMP calls for those who do not have it installed? Is
a one-thread case the exact same thing as running the code with no
OpenMP at all?

Christian Rogsch

Feb 3, 2009, 2:24:23 PM
to FDS and Smokeview Discussions
OpenMP is not something that has to be installed. OpenMP consists of
compiler directives which tell the compiler "produce this loop as a
multi-threaded loop", so there is no need to "hide" these calls.
Running the code with OpenMP is the same as running it without OpenMP,
except for one small thing (I tested only on Linux). On Linux you have
to specify the number of threads to be used by typing at the command
prompt:
export OMP_NUM_THREADS=<number of threads>
then start fds the normal way, that's all. So, if you want to run the
code with 4 threads, write:
export OMP_NUM_THREADS=4
fds5 filename.fds

Furthermore, the code can be compiled without OpenMP, because without
the directives it is nothing other than the standard serial code.
Also, a combination of OpenMP and MPI is no problem.
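
To illustrate why nothing has to be "hidden", here is a small stand-alone sketch (not FDS code; the program and variable names are made up). It relies on the OpenMP conditional compilation sentinel "!$": such lines are compiled only when the OpenMP option is switched on and are plain comments otherwise. It also assumes the compiler provides the OMP_LIB module; some compilers provide the include file omp_lib.h instead.

```fortran
PROGRAM THREAD_REPORT
! Illustration only, not FDS code.
!$ USE OMP_LIB           ! seen by the compiler only when OpenMP is enabled
IMPLICIT NONE
INTEGER :: N_THREADS

N_THREADS = 1            ! value for a build without OpenMP
! The next line is a comment for a serial build and a normal
! statement for an OpenMP build.
!$ N_THREADS = OMP_GET_MAX_THREADS()

WRITE(*,'(A,I0)') 'Threads available: ', N_THREADS
END PROGRAM THREAD_REPORT
```

Built with the OpenMP option, the program reports the value taken from the standard environment variable OMP_NUM_THREADS (or the implementation default if it is not set); built without it, it reports 1.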

Here you can see the calculation of MU with OpenMP compiler directives.
The OpenMP lines are:

!$OMP PARALLEL PRIVATE(CS)
!$OMP DO COLLAPSE(3) PRIVATE(K,J,I,DELTA,DUDX,DVDY,DWDZ,DUDY,DUDZ,DVDX,DVDZ,DWDX,DWDY,S12,S13,S23,SS,ITMP)
!$OMP END DO
!$OMP END PARALLEL

If you ignore these lines (= compiling without OpenMP) then you have
the serial code.

IF (LES) THEN
   !$OMP PARALLEL PRIVATE(CS)
   ! Each thread works with its own copy of the Smagorinsky constant CS
   CS = CSMAG
   IF (EVACUATION_ONLY(NM)) CS = 0.9_EB
   ! COLLAPSE(3) distributes the combined K, J, I iteration space to the threads
   !$OMP DO COLLAPSE(3) PRIVATE(K,J,I,DELTA,DUDX,DVDY,DWDZ,DUDY,DUDZ,DVDX,DVDZ,DWDX,DWDY,S12,S13,S23,SS,ITMP)
   DO K=1,KBAR
      DO J=1,JBAR
         DO I=1,IBAR
            IF (TWO_D) THEN
               DELTA = SQRT(DX(I)*DZ(K))
            ELSE
               DELTA = (DX(I)*DY(J)*DZ(K))**ONTH
            ENDIF
            DUDX = RDX(I)*(UU(I,J,K)-UU(I-1,J,K))
            DVDY = RDY(J)*(VV(I,J,K)-VV(I,J-1,K))
            DWDZ = RDZ(K)*(WW(I,J,K)-WW(I,J,K-1))
            DUDY = 0.25_EB*RDY(J)*(UU(I,J+1,K)-UU(I,J-1,K)+UU(I-1,J+1,K)-UU(I-1,J-1,K))
            DUDZ = 0.25_EB*RDZ(K)*(UU(I,J,K+1)-UU(I,J,K-1)+UU(I-1,J,K+1)-UU(I-1,J,K-1))
            DVDX = 0.25_EB*RDX(I)*(VV(I+1,J,K)-VV(I-1,J,K)+VV(I+1,J-1,K)-VV(I-1,J-1,K))
            DVDZ = 0.25_EB*RDZ(K)*(VV(I,J,K+1)-VV(I,J,K-1)+VV(I,J-1,K+1)-VV(I,J-1,K-1))
            DWDX = 0.25_EB*RDX(I)*(WW(I+1,J,K)-WW(I-1,J,K)+WW(I+1,J,K-1)-WW(I-1,J,K-1))
            DWDY = 0.25_EB*RDY(J)*(WW(I,J+1,K)-WW(I,J-1,K)+WW(I,J+1,K-1)-WW(I,J-1,K-1))
            S12 = 0.5_EB*(DUDY+DVDX)
            S13 = 0.5_EB*(DUDZ+DWDX)
            S23 = 0.5_EB*(DVDZ+DWDY)
            SS = SQRT(2._EB*(DUDX**2 + DVDY**2 + DWDZ**2 + 2._EB*(S12**2 + S13**2 + S23**2)))
            IF (DYNSMAG) CS = C_DYNSMAG(I,J,K)
            ITMP = 0.1_EB*TMP(I,J,K)
            MU(I,J,K) = SPECIES(0)%MU(ITMP) + RHOP(I,J,K)*(CS*DELTA)**2*SS
         ENDDO
      ENDDO
   ENDDO
   !$OMP END DO
   !$OMP END PARALLEL
ENDIF

Hostikka Simo

Feb 4, 2009, 2:58:10 AM
to fds...@googlegroups.com
Christian,
Thanks for your clarifications.

If you compare the results of one-threaded and multi-threaded cases, are they exactly the same?

Christian Rogsch

Feb 4, 2009, 4:32:31 AM
to FDS and Smokeview Discussions
Simo,

the results are the same for the multi-threaded and one-threaded runs.
They cannot be exactly identical because of NOISE = .TRUE. If the
results were not the same for one-threaded and multi-threaded runs,
this would mean that the OpenMP parallelization is wrong. To ensure a
correct parallelization, the application must be checked with the
Intel ThreadChecker; I have done that. OpenMP is the hand-coded
counterpart of the compiler's "auto-parallelization" feature. OpenMP
is a standard, so every compiler that supports it should be able to
compile an OpenMP application.

If you want to know more about OpenMP, have a look at
https://fs.hlrs.de//projects/par/par_prog_ws/

This is an online course on parallel programming, and there you will
find the slides about OpenMP. They are easier to understand than the
official manual...

Regards,
Christian

Randy McDermott

Feb 4, 2009, 8:15:26 AM
to FDS and Smokeview Discussions
Christian,

I just want to add my kudos. Very nice work!!

Randy

jkbi

Feb 26, 2009, 2:07:41 AM
to FDS and Smokeview Discussions
Hi

I have also made some tests with OpenMP in FDS, and got some of the
same speedup numbers. My question is: is this going to be implemented
in FDS in the near future, and is it a DIY project :) ?

/Jens Kristian

Kevin

Mar 1, 2009, 2:10:05 PM
to FDS and Smokeview Discussions
We plan to add OpenMP to FDS as a permanent feature. We just need to
finish up a few other projects. We will probably start with what
Christian has already done, and then go on to add the calls to other
routines.

Christian Rogsch

Mar 2, 2009, 5:17:57 AM
to FDS and Smokeview Discussions
The current status is that I have parallelized the complete FDS code
with OpenMP. The current problem is the compiler: it seems that the
Intel compiler has some problems if I choose certain optimization
flags. Once I have code that compiles and runs successfully, I will
post more results. At the end of March there is a one-week Parallel
Programming Workshop (sponsored by Microsoft, Intel and Sun) in which I
am participating, so I hope they can help me solve these problems.
There is also a talk about "Future Directions in High Performance
Computing 2009-2018", which should also be very interesting...

jkbi

Mar 4, 2009, 1:36:32 AM
to FDS and Smokeview Discussions
Hi Christian

Is it possible to get your OpenMP version of FDS together with the
makefile? I have an opportunity to test it on an IBM Power6 machine
(with Linux) for the next couple of weeks, and maybe on a Cray
computer.

/ Jens Kristian

Christian Rogsch

Mar 4, 2009, 4:39:09 AM
to FDS and Smokeview Discussions
Jens,

the OpenMP settings will be added to the official source code. I am
currently doing some tests to simplify setting the number of OpenMP
threads via an input on the MISC line. I think the first parts of the
OpenMP implementation will be done next week, so you will have to
download the code from the online repository. Build instructions for
the makefile will also be added...

jkbi

Mar 4, 2009, 9:51:37 AM
to FDS and Smokeview Discussions
Nice :)

jkbi

Mar 17, 2009, 7:33:27 AM
to FDS and Smokeview Discussions
Hi Christian

I have tried to compile the source code on an IBM Power6 with SUSE 10
and the IBM compilers, but when I try to run a job, I get an error
saying:

Problems with MATL number: 1

The input file runs on an Intel 64-bit Windows PC with FDS 5.3 without
problems.

Can you help me? What are your compiler settings for your Power
systems? My compiled version works if I have a test case with no
materials in it.

Christian Rogsch

Mar 17, 2009, 7:52:09 AM
to FDS and Smokeview Discussions
Hi,

I think this is a Windows-Linux problem, because your error message
comes from reading your input file (.fds). Check that you have
converted your Windows .fds file to a Linux .fds file. The problem is
that Windows uses different "control characters" for line endings than
Linux. There is a tool, dos2unix, which should help you. Furthermore,
there are no changes in the read subroutines, so this is not an OpenMP
problem.
If you compile the code yourself, compile two versions of the code: one
with the OpenMP settings (see your compiler settings) and one without
OpenMP. Then try your file. Normally either both versions are able to
read the file or neither is.

Here are the fully optimized compiler settings for the xlf compiler on
an AIX system with OpenMP. The _r compiler is for thread-safe
compiling, so you have to check whether you have this compiler, too.
If not, you must look in the compiler help...

#AIX, JUMP, SINGLE-Version OpenMP
aix_single_openmp : FFLAGS = -O3 -qhot -q64 -qtune=pwr6 -qarch=pwr6 -qmaxmem=-1 -bdatapsize:64K -bstackpsize:64K -btextpsize:64K -qsmp=omp
aix_single_openmp : CFLAGS = -O3 -qhot -Dpp_noappend -q64 -qtune=pwr6 -qarch=pwr6 -qmaxmem=-1 -bdatapsize:64K -bstackpsize:64K -btextpsize:64K -qsmp=omp
aix_single_openmp : FCOMPL = xlf90_r
aix_single_openmp : CCOMPL = xlc_r
aix_single_openmp : obj = fds5_jump_single_openmp
aix_single_openmp : $(obj_serial)
	$(FCOMPL) $(FFLAGS) -o $(obj) $(obj_serial)

jkbi

Mar 18, 2009, 3:11:18 AM
to FDS and Smokeview Discussions
Hi again

Dos2unix solved the problem, thanks.

But when I try to compile the OpenMP version, I get the following
syntax errors in the file velo.f90:

/opt/ibmcmp/xlf/11.1/bin/xlf90_r -c -O3 -q64 -qtune=pwr6 -qarch=pwr6 -qmaxmem=-1 -qsmp=omp -qreport=smplist source/velo.f90
"velo.f90", line 115.8: 1514-050 (S) Specification statement is out of
order. Statement is ignored.
"velo.f90", line 115.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 146.8: 1514-426 (S) The directive specified for the
DO loop does not match the END PARALLEL DO directive, or no directive
matches the END PARALLEL DO directive.
"velo.f90", line 296.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 319.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO directive, or no directive matches
the END DO directive.
"velo.f90", line 347.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 379.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 383.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 414.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 418.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 449.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO directive, or no directive matches
the END DO directive.
"velo.f90", line 1036.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 1045.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 1047.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 1055.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 1057.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 1065.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO directive, or no directive matches
the END DO directive.
"velo.f90", line 1105.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 1114.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 1116.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 1124.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 1126.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 1134.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO directive, or no directive matches
the END DO directive.
"velo.f90", line 2125.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 2137.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 2139.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 2152.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 2155.13: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 2168.8: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 2171.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 2184.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO directive, or no directive matches
the END DO directive.
"velo.f90", line 2186.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 2194.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 2197.13: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 2205.8: 1514-426 (S) The directive specified for the
DO loop does not match the END DO NOWAIT directive, or no directive
matches the END DO NOWAIT directive.
"velo.f90", line 2208.10: 1515-019 (S) Syntax is incorrect.
"velo.f90", line 2216.5: 1514-426 (S) The directive specified for the
DO loop does not match the END DO directive, or no directive matches
the END DO directive.
** velo === End of Compilation 1 ===
1501-511 Compilation failed for file velo.f90.
make: *** [velo.o] Error 1

It seems my compiler is not too happy with the syntax of lines like
these (lines 347-348):

!$OMP DO COLLAPSE(3) &
!$OMP PRIVATE(K,J,I,WP,WM,VP,VM,EPSWP,EPSWM,EPSVP,EPSVM,WOMY,VOMZ,RRHO,AH,DVDY,DWDZ,TXXP,TXXM,DTXXDX,DTXYDY,DTXZDZ,VTRM)

I am using the IBM XLF v11.1 compiler on a Power6 platform with SUSE v10.

Christian Rogsch

Mar 18, 2009, 5:05:43 AM
to FDS and Smokeview Discussions
I found the problem:
The XLF 11.1 compiler is not able to compile Fortran programs that use
the OpenMP 3.0 standard; it supports only the OpenMP 2.5 standard. The
difference between OpenMP 3.0 and OpenMP 2.5 is (in this case) the
COLLAPSE clause, which is new in OpenMP 3.0. The XLF 12.1 compiler
solves this problem.

Thus, you have two possibilities:
- change the compiler to XLF 12.1 (I compiled with 12.1 and no
problem occurred)
- remove all COLLAPSE(X) clauses from the code. The code will run,
but you will not get the full performance, because only the outermost
loop of each loop nest is parallelized.

I think you should change to the new compiler version; there is also a
test version available on the IBM website...
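
The second possibility can be sketched as follows. This is a stand-alone illustration with a made-up array F and made-up grid sizes, not a piece of velo.f90; it only shows the form of the directive that an OpenMP 2.5 compiler accepts:

```fortran
PROGRAM NO_COLLAPSE_SKETCH
! Illustration only, not FDS code. F and the grid sizes are hypothetical.
IMPLICIT NONE
INTEGER, PARAMETER :: EB = SELECTED_REAL_KIND(12)
INTEGER, PARAMETER :: IBAR = 8, JBAR = 8, KBAR = 8
REAL(EB) :: F(IBAR,JBAR,KBAR)
INTEGER :: I, J, K

! OpenMP 2.5 form: no COLLAPSE clause. Only the K iterations are divided
! among the threads; the J and I loops run completely inside each thread.
!$OMP PARALLEL DO PRIVATE(K,J,I)
DO K=1,KBAR
   DO J=1,JBAR
      DO I=1,IBAR
         F(I,J,K) = REAL(I+J+K,EB)
      ENDDO
   ENDDO
ENDDO
!$OMP END PARALLEL DO

WRITE(*,*) 'SUM(F) = ', SUM(F)
END PROGRAM NO_COLLAPSE_SKETCH
```

With this form, at most KBAR threads get work, which is why a mesh with few cells in the K direction may scale worse than with COLLAPSE(3).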

chris_cfd

Mar 18, 2009, 10:06:27 PM
to FDS and Smokeview Discussions
This is exciting! Good work, Christian. I look forward to when this is
part of an official release of FDS.

Just out of curiosity, do you know the price of your supercomputing
node? I read somewhere that the Power 575 (I think that's what you are
using) is on the order of US$500,000. It's an impressive machine.

Christian Rogsch

Mar 19, 2009, 5:28:41 AM
to FDS and Smokeview Discussions
Thanks for your reply. The costs are not so important for me, I do not
have to pay ;-)

jkbi

Mar 19, 2009, 11:23:28 AM
to FDS and Smokeview Discussions
Hi Christian

Thanks for your quick reply.

I have XLF version 12.1 now. Which version of the C compiler are you using?

Christian Rogsch

Mar 19, 2009, 11:33:27 AM
to FDS and Smokeview Discussions
The C compiler can be an older one; there are no OpenMP calls in the C
file (and I will not add any to it). The OpenMP flag is just there to
have a "complete" OpenMP compiling environment. You can compile the C
file without the OpenMP flag. So, if your compiler compiles the C file,
the compiler is OK. I have not checked, but I think version 9 is
installed on the machine...

jkbi

Apr 27, 2009, 3:14:06 PM
to FDS and Smokeview Discussions
Hi again

Finally I got some time to test on the P6 platform again, but I still
have some problems.

I have an IBM Power6 machine with 16 cores, 32 GB RAM, SUSE v10, XLF
12.1 and VACPP v9. All tested and running.

I have tried to compile SVN 3854 with:

xlf90_r -c -O2 -q64 -qtune=pwr6 -qarch=pwr6 -qmaxmem=-1 -qsmp=omp
- Code and input files work fine.

xlf90_r -c -O3 -q64 -qtune=pwr6 -qarch=pwr6 -qmaxmem=-1 -qsmp=omp
- Will not compile; I get an internal compiler error for mass.f90,
divg.f90 and velo.f90.

xlf90_r -c -O3 -q64 -qhot -qtune=pwr6 -qarch=pwr6 -qmaxmem=-1 -qsmp=omp
- Will compile, but I get the error message "STOP: Numerical
Instability" after 2 time steps.

Any suggestions why this works or not?

/ Jens Kristian

Christian Rogsch

Apr 27, 2009, 3:49:37 PM
to FDS and Smokeview Discussions
Hi,

there seems to be a problem with the -qhot or -O3 optimization. -O3
includes the -qhot optimization, -O2 does not. I think it is a problem
with the optimization done by the compiler. The manual points out that
-O3 "may change the semantics of the program slightly": the code may be
reordered and the floating-point calculations are affected. With -O2,
the floating-point calculations are not touched. So, if you want to use
-O3, test with -O3 -qstrict. -qstrict is necessary if you require the
same absolute precision in floating-point computations as you get with
-O0 or -O2 (which runs, as you write). This may be the problem. So, try
compiling with:
-c -O3 -qstrict -qtune=pwr6 -qarch=pwr6 -qmaxmem=-1 -qsmp=omp
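
As a general illustration of why reordering floating-point operations can change a result, here is a tiny stand-alone sketch with made-up values. It is not a diagnosis of the instability above; it only shows that floating-point addition is not associative, which is the kind of rearrangement a strict mode is meant to prevent:

```fortran
PROGRAM REASSOCIATION_SKETCH
! Illustration only, not FDS code. A, B and C are made-up values.
IMPLICIT NONE
INTEGER, PARAMETER :: EB = SELECTED_REAL_KIND(12)
REAL(EB) :: A, B, C

A =  1.E20_EB
B = -1.E20_EB
C =  1._EB

! Mathematically both sums are equal, but in double precision C is lost
! when it is added to B first, because B is so much larger.
WRITE(*,*) '(A+B)+C = ', (A+B)+C
WRITE(*,*) 'A+(B+C) = ', A+(B+C)
END PROGRAM REASSOCIATION_SKETCH
```

The first sum evaluates to 1 and the second to 0. In a flow solver such differences are normally tiny, but they can be amplified over many time steps.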