Error: The number of MPI processes


Antony

Jun 17, 2020, 3:36:05 AM
to FDS and Smokeview Discussions
FDS version: 6.7.4

Hi, I am getting an error message about the number of MPI processes.

I have 2 workstations: (1) w-ashkg-d-13161 (2) w-ashkg-d-13156. Each workstation has 16 cores.

I have a domain with 23 meshes. I start 12 processes on w-ashkg-d-13161 and 11 processes on w-ashkg-d-13156 with the following command:
mpiexec -hosts 2 w-ashkg-d-13156 15 w-ashkg-d-13161 15 -env OMP_NUM_THREADS 1 -wdir \\w-ashkg-d-13156\scenc2$ fds scenc2.fds

However, this error message appeared:
ERROR: The number of MPI processes, 32, exceeds the number of meshes, 23 

The same model can be run in FDS version 6.7.1.

Antony
ScenC2.fds

Kevin McGrattan

Jun 17, 2020, 9:00:11 AM
to fds...@googlegroups.com
mpiexec -hosts 2 w-ashkg-d-13156 15 w-ashkg-d-13161 15 -env OMP_NUM_THREADS 1 -wdir \\w-ashkg-d-13156\scenc2$ fds scenc2.fds

This command is asking for 15 processes for each computer.
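
A quick arithmetic check makes the mismatch visible. The sketch below is a hypothetical helper (not part of FDS or MPI) that sums the per-host counts given after -hosts in the command above and compares the total with the 23 meshes. It assumes the "-hosts <n> <host> <count> ..." layout used in this thread. Note that the command requests 30 processes while FDS reported 32; the sketch does not explain that difference.

```python
def requested_processes(hosts_args):
    """Map host name -> process count from the tokens that follow -hosts.

    Assumed layout (as used in this thread):
    [n_hosts, host1, count1, host2, count2, ...]
    """
    n_hosts = int(hosts_args[0])
    pairs = hosts_args[1:1 + 2 * n_hosts]
    return {pairs[i]: int(pairs[i + 1]) for i in range(0, len(pairs), 2)}


# Tokens taken from the command in the original post.
counts = requested_processes(["2", "w-ashkg-d-13156", "15", "w-ashkg-d-13161", "15"])
total = sum(counts.values())
n_meshes = 23  # number of meshes in the model, per the original post

print(f"requested {total} processes for {n_meshes} meshes")
if total > n_meshes:
    print("too many processes: FDS allows at most one MPI process per mesh")
```

For the split described in the original post, the per-host counts would need to be 12 and 11 rather than 15 and 15.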

Antony

Jun 18, 2020, 4:20:55 AM
to FDS and Smokeview Discussions
Kevin, I simplified the test file to only 8 meshes.

Workstation D20HK-E0151DLJ (called "A" below) has a 6-core CPU.
Workstation w-ashkg-d-12134 (called "B" below) has dual CPUs with 12 cores in total.

I assigned 4 cores from "A" and 4 cores from "B" to do the job.
First, I ran test_mpi. Strangely, 6 processes on "A" and 6 processes on "B" were started (see capture1.png).
Then I ran the FDS file. The error message reported that 12 MPI processes were launched, which exceeds the number of meshes, 8 (see capture2.png).

It seems that MPI uses all the cores of computer "A" instead of the number that I assigned.
ScenC2.fds
Capture1.PNG
Capture2.PNG

Kevin McGrattan

Jun 18, 2020, 8:38:57 AM
to fds...@googlegroups.com
Try this

  mpiexec -n <# of processes> -ppn <# of processes per node> -f <hostfile> fds jobname.fds

hostfile is a text file listing the names of the computers, like this

computer1
computer2
computer3
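
As a rough illustration of the suggestion above, the following Python sketch writes a hostfile and assembles the matching mpiexec argument list. The helper name, the host names, and the job name are placeholders; only the -n, -ppn, and -f options come from the post, and the sketch assumes the same number of processes on every computer.

```python
import tempfile
from pathlib import Path


def build_mpiexec_command(hosts, ppn, hostfile, job):
    """Write one host name per line to `hostfile` and return the mpiexec arguments."""
    Path(hostfile).write_text("\n".join(hosts) + "\n")
    n = ppn * len(hosts)  # assumption: every host runs the same number of processes
    return ["mpiexec", "-n", str(n), "-ppn", str(ppn), "-f", str(hostfile), "fds", job]


# Placeholder host and job names; a temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp:
    hostfile = Path(tmp) / "hostfile.txt"
    command = build_mpiexec_command(["computer1", "computer2"], 4, hostfile, "jobname.fds")
    hostfile_text = hostfile.read_text()

print(" ".join(command))
print(hostfile_text, end="")
```

Building the command as a list rather than a single string avoids quoting problems if it is later passed to a process launcher.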


Antony

Jun 21, 2020, 11:03:04 PM
to FDS and Smokeview Discussions
Kevin, I followed your guidance. I found that the computers cannot be assigned different numbers of processes.

The SMV file was not created.

Could this command be included in the user guide?


hostfile.txt
Capture3.PNG

Kevin McGrattan

Jun 22, 2020, 8:19:27 AM
to fds...@googlegroups.com
I am happy it worked for you, but I do not understand how it worked. You specified -ppn 1. This means "processes per node". I would expect in your case that you should use -ppn 4. Try that.
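
One possible reading of why -ppn 1 still ran: if mpiexec places ranks on the hosts in round-robin blocks of -ppn consecutive processes (an assumption about the launcher, not something confirmed in this thread), then 8 processes on 2 computers end up 4 per computer with either -ppn 1 or -ppn 4, and only the ordering of ranks differs. The hypothetical helper below simulates that placement rule; the host names are placeholders.

```python
from collections import Counter


def place_ranks(n, ppn, hosts):
    """Assign ranks 0..n-1 to hosts in round-robin blocks of `ppn` ranks.

    This models an assumed placement rule; check your MPI documentation.
    """
    return [hosts[(rank // ppn) % len(hosts)] for rank in range(n)]


hosts = ["A", "B"]  # placeholder names for the two workstations
for ppn in (1, 4):
    placement = place_ranks(8, ppn, hosts)
    print(f"-ppn {ppn}:", placement, dict(Counter(placement)))
```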

Antony

Jun 23, 2020, 5:17:49 AM
to FDS and Smokeview Discussions
It works. The complete command is:
mpiexec -f hostfile.txt -n 8 -ppn 4 -env OMP_NUM_THREADS 1 -wdir \\D20HK-E0151DLJ\test1 fds scenc2.fds

I would be grateful if you could include this command in the user guide in the next update.

Kevin McGrattan

Jun 23, 2020, 9:10:43 AM
to fds...@googlegroups.com
Yes, this now makes sense. I will add it. Thanks.

Antony

May 11, 2021, 5:35:06 AM
to FDS and Smokeview Discussions
Hi Kevin,

I have upgraded to FDS 6.7.5, but I found that parallel processes across computers cannot start.
I followed Section 3.2 of the user guide:
1. Pinging the network computer succeeds.
2. But running test_mpi fails.
cap.png

I can run fds_local without any problem.
Do you have any idea?

Kevin McGrattan

May 11, 2021, 9:34:36 AM
to fds...@googlegroups.com
Were you able to do this with previous versions of FDS?

Antony

May 17, 2021, 5:18:22 AM
to FDS and Smokeview Discussions
No. I think the MPI connection is blocked. Can you tell me which ports, protocols, and firewall or network settings are required for the MPI connection, so that I can pass them on to our IT department? Thanks.

Glenn Forney

May 17, 2021, 8:11:46 AM
to fds...@googlegroups.com
The ports used by fds (actually hydra_service) are 8670 to 8690. They are defined by the setup_fds_firewall.bat batch file run by the installer. This batch file is in the bot repo at bot\bundle\fds\for_bundle.

There should be a way to see the ports used by running processes (hydra_service), but I don't see how to do this at the moment.
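
One way to test from the other side is to check whether the ports can be reached over the network. The sketch below is a generic TCP probe in Python (not an FDS or Intel MPI tool); it reports which ports in the 8670 to 8690 range accept a connection on a given host. It shows reachability only, not which process owns a port. The function name is hypothetical, and the example probes the local machine; substitute the remote computer name when running it for real.

```python
import socket


def reachable_ports(host, ports, timeout=0.5):
    """Return the ports on `host` that accept a TCP connection within `timeout` seconds."""
    found = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            # Refused, timed out, or unreachable: treat the port as not open.
            pass
    return found


# Port range quoted in the post above; 127.0.0.1 stands in for the remote computer name.
fds_ports = range(8670, 8691)
print(reachable_ports("127.0.0.1", fds_ports))
```

An empty list from a remote machine would be consistent with a firewall blocking the range, although it could also mean that nothing is listening there.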


--
You received this message because you are subscribed to the Google Groups "FDS and Smokeview Discussions" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fds-smv+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/fds-smv/c437416a-c505-4f34-9311-9a9d8dde3abbn%40googlegroups.com.


--
Glenn Forney

Glenn Forney

May 17, 2021, 8:24:44 AM
to fds...@googlegroups.com
Also, this is the label we give to the fds firewall rules:

"Intel MPI Port for FDS"


Antony

May 30, 2021, 11:30:50 PM
to FDS and Smokeview Discussions
Glenn, the rule labeled "Intel MPI Port for FDS" is enabled to allow the app through Windows Defender Firewall, and the port in use is 8680 with status "listen".
Is there anything else I need to check?

Antony

Jun 1, 2021, 4:28:33 AM
to FDS and Smokeview Discussions
I tried "hydra_service -install", "hydra_service -start", "mpiexec -remove" and "mpiexec -register", and I got the following error messages.
Capture.PNG

Antony

Jun 3, 2021, 12:16:04 AM
to FDS and Smokeview Discussions
Hi, 
I installed the latest FDS 6.7.6 on 2 computers, "1KQPL73" and "CJQPL73".
First, I ran test_mpi on 1KQPL73. "Hello World" was returned from 1KQPL73 only; there was no reply from CJQPL73. See attachment "test_mpi.png".
Second, I ran mpiexec, and error messages were displayed. See attachment "mpiexec.png".
Could you advise, please?

Antony

Jun 3, 2021, 12:20:45 AM
to FDS and Smokeview Discussions
Files are attached

mpiexec.PNG
test_mpi.PNG

Kevin McGrattan

Jun 3, 2021, 9:24:36 AM
to fds...@googlegroups.com
Are both computers on a Windows Domain Network? That is, do you use the same credentials for all computers? 

Antony

Jun 3, 2021, 8:44:10 PM
to FDS and Smokeview Discussions
Kevin, both computers are on the same Windows Domain Network. I use the same login, and I registered the same user name and password with mpiexec by typing "mpiexec -register".

Antony

Jun 3, 2021, 9:00:10 PM
to FDS and Smokeview Discussions
Kevin, I found a strange output during test_mpi, see screen capture below:

I use the old command to run "test_mpi" as shown in the screen capture.

First, I listed "1KQPL73" before "CJQPL73": it returned "Hello world" from "1KQPL73" only.
Second, I tested "CJQPL73" alone: it returned "Hello world" from "CJQPL73".
Third, I listed "CJQPL73" before "1KQPL73": it returned "Hello world" from "1KQPL73" only.

I get the same output when I use the command "mpiexec -n 2 -f hostfile.txt test_mpi".
The contents of hostfile.txt for the three cases are as follows:
First
1KQPL73
CJQPL73

Second
CJQPL73

Third
CJQPL73
1KQPL73

So, from the results above, does it mean that MPI can at least connect to the network computer?

Capture.PNG

Kevin McGrattan

Jun 4, 2021, 9:34:26 AM
to fds...@googlegroups.com
I cannot successfully run this command on my Windows network. This is why I use a Linux cluster for FDS simulations.

Perhaps someone else can explain how to get the Windows version of FDS working across a network. Over the years, my success rate is about 50%. Sometimes it's related to some sub-network issue, firewall, hydra service, etc. 