MPI init fails when no network is available; can the import be checked beforehand?

Chris Kannas

Mar 16, 2011, 5:33:35 AM
to mpi...@googlegroups.com
When I run a script which imports mpi4py.MPI on a multicore CPU with the network disconnected,
I get the following error:
[NC-03:00212] [[5478,0],0] ORTE_ERROR_LOG: Error in file D:\ompi\OpenMPI\openmpi-1.5.1\orte\mca\ess\hnp\ess_hnp_module.c at line 218
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[NC-03:00212] [[5478,0],0] ORTE_ERROR_LOG: Error in file D:\ompi\OpenMPI\openmpi-1.5.1\orte\runtime\orte_init.c at line 128
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[NC-03:00212] [[5478,0],0] ORTE_ERROR_LOG: Error in file D:\ompi\OpenMPI\openmpi-1.5.1\orte\orted\orted_main.c at line 352
[NC-03:05188] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file D:\ompi\OpenMPI\openmpi-1.5.1\orte\mca\ess\singleton\ess_singleton_module.c at line 470
[NC-03:05188] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file D:\ompi\OpenMPI\openmpi-1.5.1\orte\mca\ess\singleton\ess_singleton_module.c at line 138
[NC-03:05188] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file D:\ompi\OpenMPI\openmpi-1.5.1\orte\runtime\orte_init.c at line 128
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Unable to start a daemon on the local node (-128) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Unable to start a daemon on the local node" (-128) instead of "Success" (0)
--------------------------------------------------------------------------
*** The MPI_Init_thread() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[NC-03:5188] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!

From some searching, this seems to be because Open MPI fails when it cannot find an active network interface.
Is there any way to check whether importing mpi4py.MPI will work before the import takes place?
I tried wrapping it in try: ... except: ..., but no luck :(.

Thanks,
Chris Kannas

Lisandro Dalcin

Mar 16, 2011, 7:09:58 AM
to mpi...@googlegroups.com
On 16 March 2011 06:33, Chris Kannas <chris...@gmail.com> wrote:
> When I run a script which imports mpi4py.MPI on a multicore CPU with the
> network disconnected,
> I get the following error:
>
> [NC-03:00212] [[5478,0],0] ORTE_ERROR_LOG: Error in file
> D:\ompi\OpenMPI\openmpi-1.5.1\orte\mca\ess\hnp\ess_hnp_module.c at line 218
>
> From some searching, this seems to be because Open MPI fails when it cannot
> find an active network interface.
>

Indeed. Did you ask on the Open MPI mailing list? Perhaps there is an
MCA parameter (this is what they call their configuration settings) that
you can set to make it work.
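
For reference, Open MPI reads MCA parameters from environment variables named OMPI_MCA_<parameter>, so one can be set from Python before mpi4py is imported. The sketch below only shows that mechanism. The helper name set_mca_param is hypothetical, and the btl value is just an illustration of the syntax; it is not known to fix the orte_init failure reported above.

```python
import os


def set_mca_param(name, value, environ=os.environ):
    """Expose an Open MPI MCA parameter as the OMPI_MCA_<name> environment variable.

    Hypothetical helper: it only writes the variable and returns its name.
    Open MPI reads it later, when MPI is initialized in this process.
    """
    key = "OMPI_MCA_" + name
    environ[key] = str(value)
    return key


# Illustrative value only (an assumption, not a confirmed fix for this thread):
# restrict the byte-transfer layer to the loopback and shared-memory components.
set_mca_param("btl", "self,sm")

# The variable has to be in place before MPI is initialized, i.e. before
#     from mpi4py import MPI
```

Which parameter, if any, helps in the no-network case is a question for the Open MPI list.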


> Is there any way to check whether importing mpi4py.MPI will work before the
> import takes place?
> I tried wrapping it in try: ... except: ..., but no luck :(.
>

Sorry, but I do not understand you... Please take into account that
MPI initialization errors are fatal; there is no way mpi4py could
recover from that and throw a Python exception. You could make mpi4py
stop calling MPI_Init() on import and call MPI.Init() yourself from
Python, but that is still going to fail badly, even inside a try: ...
except: ...
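
Since the abort kills the whole process, one way to test the import beforehand is to attempt it in a throwaway child interpreter and look at its exit status. This is a minimal sketch under stated assumptions, not something mpi4py provides: import_survives and load_mpi are hypothetical names, the syntax is modern Python 3, and a clean exit in the child does not strictly guarantee that the later import in the parent succeeds.

```python
import subprocess
import sys


def import_survives(statement="from mpi4py import MPI", timeout=60.0):
    """Run `statement` in a child interpreter; True only if it exits with status 0."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", statement],
            stdout=subprocess.DEVNULL,   # hide the child's output,
            stderr=subprocess.DEVNULL,   # including any MPI abort messages
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The child could not be started, or it hung past the timeout.
        return False
    return result.returncode == 0


def load_mpi():
    """Return the mpi4py.MPI module if the probe succeeded, otherwise None."""
    if not import_survives():
        return None
    from mpi4py import MPI
    return MPI
```

A caller can then branch on `load_mpi() is None` and fall back to serial code. The probe is meant for scripts started directly with python; whether it behaves well inside a job already launched through mpiexec is untested here.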

PS: Do you really need to use Open MPI on Windows? Their Windows
support is, IIUC, in beta status. Note that MPICH2 has supported Windows
for a long, long time.

--
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

Chris Kannas

Mar 17, 2011, 9:13:49 AM
to mpi...@googlegroups.com
Well, from the Open MPI mailing lists I found that Open MPI has some problems with the available network connections.
For example, it crashes if you have more than 8 active network connections, since Open MPI is configured to use all available connections, up to a maximum of 8.

The way I partly solved my problem was to check whether there is an active network, using a simple function:
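#!/usr/bin/env python
# Python 2 only: urllib2 does not exist in Python 3.
import urllib2

def Is_internet_on(dummy=False):
    """Return True if http://www.google.com can be opened, else False."""
    if not dummy:  # dummy=True skips the check and reports "offline"
        try:
            # Only reachability matters, so close the response right away.
            urllib2.urlopen('http://www.google.com').close()
            return True
        except urllib2.URLError:
            pass

    return False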
So if Check_Internet.Is_internet_on() returns True, I use MPI; otherwise I do not.
I know this has some disadvantages, but I consider it a tradeoff for not running on a cluster or on a machine without internet.
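
For anyone on Python 3, where urllib2 is gone, the same check can be sketched with urllib.request. The lowercase name is_internet_on and the url and timeout parameters are assumptions added here for illustration; the timeout keeps the check from hanging on a dead link.

```python
import urllib.request


def is_internet_on(url="http://www.google.com", timeout=5.0):
    """Return True if `url` can be opened within `timeout` seconds, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses,
        # so an unreachable host or a refused connection ends up here.
        return False
```

As with the original function, this tests whether a web site is reachable, not whether Open MPI will find a usable interface, so it can turn MPI off on a machine that has a working local network but no internet access.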

I will look into special Open MPI initialization options, though.