Hello all,
I want to run FDS on a Linux cluster where FDS 6.0.0 is currently installed. The cluster has multiple machines, but I would like to use a single machine with 16 cores.
I created the FDS model with 16 meshes of similar size.
Since I do not have experience with parallel processing in a Linux cluster, I am very confused. Any help would be appreciated!
To start the simulation on the cluster, I am using the following qsub file:
#PBS -l nodes=1:nameofmachine:ppn=16
#PBS -N testrun
#############################################################
## PBS (DO NOT CHANGE!) ##
#############################################################
#PBS -l cput=20:00:00
#PBS -W umask=002
#PBS -j oe
#PBS -c c=1
#PBS -V
export OMP_NUM_THREADS=16
cd $PBS_O_WORKDIR
umask 002
cp $PBS_NODEFILE $PBS_O_WORKDIR/hostfile.gen
chmod 664 hostfile.gen
NPROC=`wc -l <$PBS_NODEFILE`
source /wb_apps/FDS/shortcuts/bashrc_fds6 intel64
echo $LD_LIBRARY_PATH
ulimit -s unlimited
#############################################################
## END OF PBS (DO NOT CHANGE!) ##
#############################################################
runid=testrun
However, my .out file shows that OpenMP is disabled:
Fire Dynamics Simulator
Compilation Date : Sun, 03 Nov 2013
Version : FDS 6.0.0 Parallel
OpenMP Disabled
SVN Revision No. : 17279
Job TITLE : 150MW_fire
Job ID string : GN_testrun
The error output I receive is the following:
Fire Dynamics Simulator
Compilation Date : Sun, 03 Nov 2013
Current Date : February 6, 2017 16:01:18
Version: FDS 6.0.0; MPI Enabled; OpenMP Disabled
SVN Revision No. : 17279
Job TITLE : 150MW_fire
Job ID string : testrun
Time Step: 1, Simulation Time: 0.06 s
Time Step: 2, Simulation Time: 0.10 s
Time Step: 3, Simulation Time: 0.13 s
Time Step: 4, Simulation Time: 0.17 s
Time Step: 5, Simulation Time: 0.20 s
Time Step: 6, Simulation Time: 0.23 s
Time Step: 7, Simulation Time: 0.26 s
mpirun: killing job...
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 12044 on node galatea.cluster.intern exited on signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
mpirun: clean termination accomplished
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
Stack trace terminated abnormally.
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
Stack trace terminated abnormally.
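One check I can make on my side is whether the number of slots PBS hands to the job matches the number of meshes in the input file, since I intend to run one MPI process per mesh. Below is a minimal sketch of that check; it is not part of my actual job script. The function names are my own, testrun.fds is an assumed input file name, and counting &MESH lines assumes that each mesh is declared on its own line and that no MULT_ID is used to replicate meshes.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the cluster's PBS template.

# Count MESH namelist records in an FDS input file.
# Assumes one &MESH record per line and no MULT_ID replication.
count_meshes() {
    grep -c -i '^[[:space:]]*&MESH' "$1"
}

# Count the slots in a PBS-style node file (one line per slot).
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# Compare the two counts; return non-zero when they differ.
check_layout() {
    local meshes slots
    meshes=$(count_meshes "$1")
    slots=$(count_slots "$2")
    if [ "$meshes" -eq "$slots" ]; then
        echo "OK: $meshes meshes, $slots slots"
    else
        echo "MISMATCH: $meshes meshes, $slots slots"
        return 1
    fi
}

# Example call inside the job script (file name is an assumption):
#   check_layout testrun.fds "$PBS_NODEFILE" || exit 1
```

With nodes=1:ppn=16 and 16 meshes, I would expect this to report 16 of each. A mismatch would at least tell me whether the problem is in the resource request rather than in FDS itself.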