Hi,
I am currently running an FDS simulation successfully with 12 processors and 12 meshes on a single computer.
Now I would like to use more computers in a cluster.
To start with, I tried using 11 processors on the main FDS computer and 1 processor on my desktop computer (the remote computer) to see if it works.
I have tried to follow the instructions in the FDS User's Guide (ch. 3.2) together with my IT guy, but we have not managed to get it running. I hope someone can guide us.
"If you wish to run FDS on more than one computer, do the following:
1. Create a text file, say hostfile.txt, and in it list, line by line, the names of your computers."
We have done this with our remote computer "Nameofremotecomputer", but we are not sure where to place this file and have tried 2 locations. One of them is C:\Program Files\PyroSim 2021\fds\mpi (the path contains a space by default; is this a problem, as in step 3?).
Should I list only the names of the other computers I will use, or also the computer that I am running FDS from?
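To make the question concrete, here is a minimal Python sketch of the two hostfile variants we are choosing between. The helper name build_hostfile is our own (hypothetical), and the sketch only assumes what step 1 says: one computer name per line.

```python
def build_hostfile(hosts):
    """Return hostfile text: one computer name per line, skipping blanks and duplicates."""
    seen = []
    for name in hosts:
        name = name.strip()
        if name and name not in seen:
            seen.append(name)
    return "\n".join(seen) + "\n"

# Variant A: only the remote computer is listed.
remote_only = build_hostfile(["Nameofremotecomputer"])

# Variant B: the computer FDS is started from is listed as well.
both = build_hostfile(["Nameofsimulationcomputer", "Nameofremotecomputer"])

print(remote_only)
print(both)
```

As far as we can tell, variant A matches what we have tried so far; we do not know whether variant B is what mpiexec expects.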
The next step in the FDS User's Guide:
"2. Test your network by running the following test program:
mpiexec -n <procs> -f hostfile.txt test_mpi
where <procs> is the number of computers you want to test. If this command returns a “Hello World” message from each of your computers, proceed to the next step. If this command fails, check that you can “see” the other machines by “pinging” them, and check that the other computer can “see” your computer as well. Also, make sure that the same version of FDS is installed on the other computers."
We did not get any "Hello World" message when we ran the following command (with -n 1, because I was testing one remote computer):
mpiexec -n 1 -f hostfile.txt test_mpi
We ran the command from the mpi directory.
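In case the space in the PyroSim path matters, this is a small Python sketch of how the same command could be assembled with the hostfile given by its full path. Only the -n and -f options quoted from the User's Guide are used; the function name mpiexec_command and the idea of passing a full path are our assumptions, and we have not confirmed that this changes anything.

```python
import subprocess

def mpiexec_command(procs, hostfile, program="test_mpi"):
    """Build the argument list for the test command from the User's Guide."""
    return ["mpiexec", "-n", str(procs), "-f", hostfile, program]

# Full path to the hostfile (assumption: it sits in the mpi directory).
hostfile = r"C:\Program Files\PyroSim 2021\fds\mpi\hostfile.txt"
args = mpiexec_command(1, hostfile)

# list2cmdline shows how the arguments would be quoted on one Windows command line.
print(subprocess.list2cmdline(args))
```

Passing the arguments as a list (for example to subprocess.run) would avoid quoting the path by hand; the sketch only prints the line and does not run mpiexec.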
The error message that I get is the following (Nameofsimulationcomputer is the main computer and Nameofremotecomputer is the second computer):
C:\Program Files\PyroSim 2021\fds\mpi>mpiexec -n 1 -f hostfile.txt test_mpi
[proxy:0:0@Nameofremotecomputer] launch_processes (proxy.c:571): error creating process (error code 2). The system cannot find the file specified.
[proxy:0:0@Nameofremotecomputer] main (proxy.c:927): error launching_processes
[mpiexec@Nameofsimulationcomputer] wmain (mpiexec.c:2113): assert (exitcodes != NULL) failed
Pinging worked, though.
We have also followed the instructions in chapter 19.5 of the PyroSim user manual:
PyroSim is installed in exactly the same folder on both cluster machines.
The PyroSim installations are exactly the same version.
The simulation folder has no space in its name and is accessible to both machines.
Any help would be much appreciated, and an online meeting would be very welcome if anyone has the time.
Sigurdur Bjarni Gislason