NULL Communicator Error, Invalid Communicator

julym...@gmail.com

Apr 13, 2017, 1:43:39 AM
to mpi4py
Hello,

I am just now starting to work with mpi4py and I am running into the following error: 
mpi4py.MPI.Exception: Invalid communicator, error stack: 
PMPI_Comm_rank(109): MPI_Comm_rank(MPI_COMM_NULL, rank=0x7fffffff5d5c) failed 
PMPI_Comm_rank(66).: Null communicator

I have a python function defined in a file that is spawning worker processes:
---------------------------------
import sys

import numpy as np
from mpi4py import MPI

def search_direction(f, graph, g, od):
    x = np.power(f.reshape((f.shape[0],1)), np.array([0,1,2,3,4]))
    grad = np.einsum('ij,ij->i', x, graph[:,3:])
    g.es["weight"] = grad.tolist()
    
    L = np.zeros(len(g.es),dtype='float64')

    #The following code is for MPI
    comm = MPI.COMM_SELF.Spawn(sys.executable,
                           args=['AoN_igraph_MPI.py'],
                           maxprocs=2)

    comm.bcast(od, root=MPI.ROOT)
    comm.bcast(g, root=MPI.ROOT)
    # On the spawning (root) side the send buffer is unused, so pass None
    comm.Reduce(None, L, op=MPI.SUM, root=MPI.ROOT)

    return L, grad
------------------------------------

The work/child code is as follows (saved in AoN_igraph_MPI.py):
--------------------------------------------------
from mpi4py import MPI
from mpi4py.MPI import ANY_SOURCE

comm = MPI.Comm.Get_parent()
rank = comm.Get_rank()  #Rank of the particular process
size = comm.Get_size()  #number of processes
 ..... 

---------------------------------------

Now, the code executes past the "comm = MPI.Comm.Get_parent()" call, but then stops with the error above whenever I try to get the rank or size in the worker process. Is the problem that the worker processes are being spawned from inside a function? The search_direction function is called by another function, solver_3, in the same file, which in turn is called by the main function in that file. Does mpi4py allow spawning from within a function definition?

I should also mention that I have successfully run the "Compute Pi" example provided in the tutorial, so I don't think it is a problem with the mpi4py installation on my machine. Also, this is all running on a Linux machine.

Help with this would be greatly appreciated.

Juliette

Lisandro Dalcin

Apr 13, 2017, 1:53:09 AM
to mpi4py
Have you tried running another example that uses spawn? Any chance that you have multiple installs of mpi4py? If you spawn a child and Comm.Get_parent() returns the null communicator in that child, then either your MPI is not working properly, or the MPI used in the child is not the same as the one used in the parent process. This can happen if you have multiple mpi4py installs with different MPI backends and somehow (the PYTHONPATH environment variable?) the parent imports one mpi4py while the child imports the other.
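
One way to check for such a mismatch, as a minimal sketch: report which interpreter, which mpi4py install, and which MPI library a given process actually picks up. The helper name describe_mpi4py is hypothetical and not part of mpi4py; it relies only on the standard library plus MPI.Get_library_version().

```python
import importlib.util
import sys


def describe_mpi4py():
    """Report which interpreter, mpi4py install, and MPI library this process sees.

    Hypothetical diagnostic helper, not part of mpi4py.
    """
    info = {"executable": sys.executable, "mpi4py_path": None, "mpi_library": None}

    # Locate the mpi4py package this interpreter would import, without importing it.
    spec = importlib.util.find_spec("mpi4py")
    if spec is None:
        return info  # no mpi4py visible on this interpreter's sys.path
    info["mpi4py_path"] = spec.origin

    try:
        # Importing MPI initializes the MPI library that mpi4py was built against.
        from mpi4py import MPI
        info["mpi_library"] = MPI.Get_library_version().strip()
    except ImportError:
        # mpi4py was found but could not be loaded (for example, missing MPI library).
        pass
    return info


if __name__ == "__main__":
    for key, value in describe_mpi4py().items():
        print(f"{key}: {value}")
```

Calling this near the top of both the parent script and AoN_igraph_MPI.py and comparing the output should show whether the two processes resolve different mpi4py paths or MPI libraries. In the child, it may also help to compare the result of MPI.Comm.Get_parent() against MPI.COMM_NULL before calling Get_rank(), so that the failure is reported with a clearer message.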

--
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459