We have a bunch of brand-new Mac Pro computers because we use an Apple cluster to compress the HiDef videos from our experiments. There is a lot of unused computer time on these Macs: they are very busy compressing after each experiment, but then they sit idle until the next video recording ends. These Mac Pros have truly impressive specs, and it would be a shame to let all of that computational power go to waste.
Can anyone send me an example of the host file that you use with FDS? The error that I am receiving says that it can't start the orted process on the remote computer. When I ssh to the remote computer, I can start the programs myself, so I am pretty sure that I have the environment variables correct.
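For context, an Open MPI hostfile is just a plain text file naming one node per line, optionally with a slot (process) count. A minimal sketch, assuming made-up node names and slot counts:

```shell
# Write a minimal Open MPI hostfile. The node names and slot counts
# here are placeholders, not the poster's actual cluster.
cat > hostfile.txt <<'EOF'
FRLCLSTR00 slots=4
FRLCLSTR01 slots=4
EOF

# Show what mpiexec would read.
cat hostfile.txt
```

It would then be passed with something like 'mpiexec -np 8 -hostfile hostfile.txt fds_mpi case.data' (executable and input-file names here follow the commands shown later in the thread).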
--
You received this message because you are subscribed to the Google Groups "FDS and Smokeview Discussions" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fds-smv+unsubscribe@googlegroups.com.
To post to this group, send email to fds...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/fds-smv/21e8009e-f331-4974-9532-6357d3bb27c8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Is the directory where your case is located visible in the same place on each computer where you are trying to run your jobs? I.e., does /home/drdtsheppard (if that is your home directory) show the same files (not just copies) on every node? In addition to setting up ssh keys, on our Linux cluster we cross-mount the file system containing the home directories so that it is visible on each of our compute nodes.
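Glenn's check can be done by eye: list the case directory from the machine you launch on, then again through ssh on a compute node; a cross-mounted file system shows the identical listing from both. (The compute-node hostname and case path below are placeholders, not taken from the thread.)

```
ls -l /home/drdtsheppard/mycase
ssh FRLCLSTR01 ls -l /home/drdtsheppard/mycase
```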
On Thu, Jan 29, 2015 at 6:45 AM, Dave Sheppard <drdtsh...@gmail.com> wrote:
--Glenn Forney
FRLCLSTR00:fds cluster$ mpiexec -np 4 clu...@10.243.200.102
--------------------------------------------------------------------------
mpiexec was unable to find the specified executable file, and therefore
did not launch the job. This error was first reported for process
rank 0; it may have occurred for other processes as well.
NOTE: A common cause for this error is misspelling a mpiexec command
line parameter option (remember that mpiexec interprets the first
unrecognized command line token as the executable).
Node: FRLCLSTR00
Executable: clu...@10.243.200.102
--------------------------------------------------------------------------
4 total processes failed to start

I have tried several different ways. Only the first command below worked; the command above and the last two below failed. My commands are the lines starting with the prompt, and the rest is the response from the computer. Thanks in advance for any help.
FRLCLSTR00:fds cluster$ mpiexec -np 4 -host FRLCLSTR00 fds_mpi gasfill.data
FRLCLSTR00:fds cluster$ mpiexec -np 4 FRLCLSTR00
--------------------------------------------------------------------------
mpiexec was unable to find the specified executable file, and therefore
did not launch the job. This error was first reported for process
rank 0; it may have occurred for other processes as well.
NOTE: A common cause for this error is misspelling a mpiexec command
line parameter option (remember that mpiexec interprets the first
unrecognized command line token as the executable).
Node: FRLCLSTR00
Executable: FRLCLSTR00
--------------------------------------------------------------------------
4 total processes failed to start
FRLCLSTR00:fds cluster$ mpiexec -np 4 -host FRLCLSTR02 fds_mpi gasfill.data
ssh: Could not resolve hostname FRLCLSTR02: nodename nor servname provided, or not known
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
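The "Could not resolve hostname FRLCLSTR02" line above means ssh could not map that node name to an address. One common fix is to give every node an entry in /etc/hosts on each machine. A sketch (the first address is an assumption; only 10.243.200.102 appears earlier in the thread, and its pairing with a hostname here is a guess):

```
10.243.200.100  FRLCLSTR00
10.243.200.102  FRLCLSTR02
```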
Hello Dave,
although I also use FDS on a Mac Pro, I have never coupled multiple Macs together.
But on our Windows cluster we had similar problems. As already discussed above, it was necessary to explicitly specify a file with the given hosts via 'mpirun -hostfile name_of_hostfile'.
Additionally, we also had to specify the working directory by adding '-wdir path_to_working_dir'. Did you already check this? As you wrote, the home-directory structure on your individual cluster nodes is identical, so hopefully this should work. Or have you already been able to resolve the problem with the --prefix option?
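Putting Susan's two flags together, the full invocation might look like this (the hostfile name and working-directory path are placeholders; fds_mpi and gasfill.data are taken from Dave's commands above):

```
mpirun -np 4 -hostfile my_hostfile -wdir /home/drdtsheppard/fds_case fds_mpi gasfill.data
```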
As already mentioned in louiea's thread, a very good reference for questions related to MPI is:
http://www.open-mpi.org/faq/?category=running
Best, Susan