MPI setup problem


fde

Oct 23, 2017, 7:38:03 AM
to FDS and Smokeview Discussions
I have two PCs on the same network. Their computer names are PC-CFD1 and PC-CFD2. Both have the same FDS version installed, and the environment variables are set.
I am following the steps in the user guide (§3.2.2) to run a job across the two machines. The command prompt is run as Administrator.


As a first step, I ran the following check on each machine and got similar output from both:

C:\>mpiexec.exe -np 2 test_mpi.exe
 Hello world: rank            0  of            2  running on
 PC-CFD1.efectis.local

 Hello world: rank            1  of            2  running on
 PC-CFD1.efectis.local


Then I try:

C:\>mpiexec -hosts 2 PC-CFD1 1 PC-CFD2 1 test_mpi

But I get the following error:

Error connecting to the Service
[mpiexec@PC-CFD1] ..\hydra\utils\sock\sock.c (224): unable to get host address for PC-CFD2 (11001)

What should I check? Where might the problem be?

Thank you.


Kevin

Oct 23, 2017, 9:02:56 AM
to FDS and Smokeview Discussions
You can try to ping each machine. That is, log in to cfd2 and

ping cfd1

and then vice versa. If that does not work, there's your answer. If it does work, I do not have further advice. Others might be able to help.

Salah Benkorichi

Oct 23, 2017, 9:10:43 AM
to fds...@googlegroups.com
I haven't tried this under Windows, but have you tried connecting the two machines directly with an Ethernet cable to see whether that works? You can also find helpful resources on Google and YouTube.

--
You received this message because you are subscribed to the Google Groups "FDS and Smokeview Discussions" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fds-smv+unsubscribe@googlegroups.com.
To post to this group, send email to fds...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/fds-smv/f69c7fec-ecd2-4471-ba30-c702f331e2f0%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

fde

Oct 23, 2017, 9:27:24 AM
to FDS and Smokeview Discussions
@Kevin, I tried pinging; the machines can see each other. Thank you.

@Salah, they are not connected to each other directly via Ethernet. Maybe I can ask the IT department for help. Thank you for the suggestion.

Kevin

Oct 23, 2017, 9:45:48 AM
to FDS and Smokeview Discussions
I do not know if connecting them directly will help. If the computers share the same Windows Domain Network, that should be enough. However, I find running FDS in parallel on Windows networks to be unreliable. There are many different reasons why one machine cannot see another, having to do with DNS maps, internal firewalls, and so on. Many times here at NIST, I cannot "see" another machine just because the IP address is not understood.
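
One way to separate name-resolution failures from other causes is to ask each machine to resolve the other's host name directly. The 11001 in the earlier mpiexec error is the Winsock code WSAHOST_NOT_FOUND, which points at name lookup rather than at MPI itself. Below is a minimal Python sketch, assuming Python is available on both PCs; the function name `resolve_host` is illustrative and not part of FDS or Intel MPI.

```python
import socket

def resolve_host(name):
    """Return the sorted IP addresses that `name` resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # Name lookup failed (hypothetical helper: an empty list signals failure).
        return []
    # Each entry ends with a socket address tuple whose first item is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host("PC-CFD2")` on PC-CFD1, and `resolve_host("PC-CFD1")` on PC-CFD2, should return at least one address on each side. An empty list suggests a DNS or host-name problem that ping by IP address alone would not reveal.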

fde

Oct 23, 2017, 11:22:03 AM
to FDS and Smokeview Discussions
I removed FDS from both machines and reinstalled it.

I now receive the following message:

pmi_proxy not found on PC-CFD. Set Intel MPI environment variables

However, the pmi_proxy.exe file is in the same folder as the other executables, and that folder is listed in the environment variables.


Kevin

Oct 23, 2017, 12:04:27 PM
to FDS and Smokeview Discussions
Type this

where pmi_proxy

Salah Benkorichi

Oct 23, 2017, 12:30:21 PM
to fds...@googlegroups.com

See if this solution helps. 


Example 3

Symptom/Error Message

pmi_proxy not found on node02. Set Intel MPI environment variables
read from stdin failed, error 9.
[mpiexec@node01] ..\hydra\tools\demux\demux_select.c (78): select error (No such file or directory)
[mpiexec@node01] ..\hydra\pm\pmiserv\pmiserv_pmci.c (501): error waiting for event
[mpiexec@node01] ..\hydra\ui\mpich\mpiexec.c (1063): process manager error waiting for completion

Cause

The Intel® MPI Library runtime scripts are not available. A possible reason is that the shared space cannot be reached.

Solution

Check MPI availability on all the nodes. Possibly, there are some problems with network shared drive.


https://software.intel.com/en-us/mpi-developer-guide-windows-examples-of-mpi-failures



fde

Oct 24, 2017, 2:59:41 AM
to FDS and Smokeview Discussions
@Kevin,
Both machines find the location of the pmi_proxy file.

@Salah, from each PC I can access the folder containing the FDS files on the other PC.

A question: should the FDS files be in a folder with the identical name on both computers?

Kevin

Oct 24, 2017, 9:13:43 AM
to FDS and Smokeview Discussions
Try this

mpiexec -hosts 2 cpu1 1 cpu2 1 -wdir \\cpu1\shared_folder \\cpu1\shared_folder_pointing_to_fds.exe\fds job_name.fds

This way, both computers use the same executable. The \\ means that these are network addresses pointing to shared folders.
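
The structure of that command line can be sketched in Python for clarity. The helper name `build_mpiexec_command` and its parameters are hypothetical; the host names and share paths are the placeholders from the command above, and only the `-hosts` and `-wdir` options shown there are used.

```python
def build_mpiexec_command(hosts, share, fds_exe, job_file):
    """Assemble an mpiexec argument list in the form used above.

    hosts    -- list of (host_name, process_count) pairs
    share    -- UNC path of the shared working directory
    fds_exe  -- UNC path of the FDS executable
    job_file -- FDS input file, relative to the working directory
    """
    # "-hosts N" is followed by N pairs of host name and process count.
    command = ["mpiexec", "-hosts", str(len(hosts))]
    for name, count in hosts:
        command += [name, str(count)]
    command += ["-wdir", share, fds_exe, job_file]
    return command

# Placeholder values copied from the command above.
command = build_mpiexec_command(
    hosts=[("cpu1", 1), ("cpu2", 1)],
    share=r"\\cpu1\shared_folder",
    fds_exe=r"\\cpu1\shared_folder_pointing_to_fds.exe\fds",
    job_file="job_name.fds",
)
print(" ".join(command))
```

Printing the assembled command first makes it easy to compare against the one typed by hand. The same list could then be passed to `subprocess.run(command)`.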

fde

Oct 25, 2017, 8:59:49 AM
to FDS and Smokeview Discussions
It is still the same. Neither of the two PCs can find the pmi_proxy file on the other PC.

I think I will give up for now. 

Kevin

Oct 25, 2017, 9:13:47 AM
to FDS and Smokeview Discussions
pmi_proxy.exe and fds.exe are in the same folder. Share this folder with all other computers on the network. If your other computer cannot "see" this folder, then there is something wrong with your network sharing, DNS server, etc.
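
A quick way to confirm that the shared folder is visible from the other machine is to test for the two executables over the network path. This is only a sketch, assuming Python is installed on the machine doing the check; `missing_files` is a hypothetical helper, and the folder passed to it would be the UNC path of your own share.

```python
import os

def missing_files(folder, names=("pmi_proxy.exe", "fds.exe")):
    """Return the file names from `names` that are not present in `folder`."""
    # os.path.isfile is False both for an absent file and an unreachable folder.
    return [n for n in names if not os.path.isfile(os.path.join(folder, n))]
```

For example, running `missing_files(r"\\cpu1\shared_folder")` on the second computer (with the placeholder share name replaced by the real one) should return an empty list. If both names come back, the share itself is probably not reachable from that machine.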