Issue with Gatherv on multiple nodes

Ankit Ankit

Jun 30, 2020, 11:13:44 AM

to mpi4py

Hi Lisandro,

The following is code for gathering NumPy arrays of varying size from different processes. When I run it on a single node (a PBS node with 48 cores), it works fine. However, when I run it on multiple nodes, the gathered data is incorrect. Can you please help me solve this problem?

#--------------------------------------------------------------------------------------------------------

import numpy as np
from numpy.linalg import norm
from mpi4py import MPI

Comm = MPI.COMM_WORLD
N_Workers = Comm.Get_size()
Rank = Comm.Get_rank()

# Rank i contributes (i+1)*RefDataLen doubles.
RefDataLen = int(1e4)
VecLenList = RefDataLen*np.arange(1, N_Workers+1)
# Offset of each rank's segment in the gathered vector (exclusive prefix sum of the lengths).
VecDisplList = np.array([np.sum(VecLenList[:i]) for i in range(N_Workers)])
N_GatheredVec = np.sum(VecLenList)

# Every rank builds the full list so that rank 0 can check the gathered result.
DataList = []
for i in range(N_Workers):
    Data = np.arange(VecLenList[i])*1e-3
    DataList.append(Data)

for i in range(10):
    if Rank == 0:
        GatheredVec = np.zeros(N_GatheredVec)
        Comm.Gatherv(DataList[Rank], (GatheredVec, VecLenList, VecDisplList, MPI.DOUBLE), 0)
        # Prints 0.0 when every segment arrived intact.
        print(norm(GatheredVec-np.hstack(DataList)))
    else:
        Comm.Gatherv(DataList[Rank], None, 0)

#--------------------------------------------------------------------------------------------------------

Output for single node run (<=48 cores):

0.0

Output for multiple node run (>48 cores):

0.0

990524.488609363

Kind regards,

Ankit

Lisandro Dalcin

Jun 30, 2020, 2:13:51 PM

to mpi...@googlegroups.com

On Tue, 30 Jun 2020 at 18:13, Ankit Ankit <ankitn...@gmail.com> wrote:

> Hi Lisandro,
>
> The following is code for gathering NumPy arrays of varying size from different processes. When I run it on a single node (a PBS node with 48 cores), it works fine. However, when I run it on multiple nodes, the gathered data is incorrect.

If it works fine on a single node but not on multiple nodes, then the problem is most likely deep down in the backend MPI implementation.

> Can you please help me solve this problem?

You provided almost no additional information, though even if you had, I doubt I could say anything, as I do not have access to the machine. You should really ask the IT staff of the computing infrastructure you are using for help.

PS: As time passes, I get more and more emails (many of them to my personal address) with large code pastes asking me to spot bugs. I don't want to be hostile, but it is a bit too much; I'm not a human debugger!

--

Lisandro Dalcin
============
Research Scientist
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

Ankit Ankit

Jun 30, 2020, 8:30:47 PM

to mpi...@googlegroups.com

No worries. Thanks Lisandro.
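
For anyone who lands on this thread with the same symptom, here is a minimal sketch of how one might narrow it down before going to the cluster's IT staff. It is not part of the original exchange and it does not diagnose the problem above. It only checks the Gatherv layout (integer counts, displacements as an exclusive prefix sum, total within the C int range used by classic MPI count arguments) and reports which ranks' segments differ, instead of a single norm. The helper names `gatherv_layout` and `mismatched_ranks` are hypothetical, and the corruption in the demo is simulated with NumPy only, so no MPI is needed to run it.

```python
import numpy as np

def gatherv_layout(counts):
    """Return (counts, displs) as int64 arrays; displs is the exclusive prefix sum."""
    counts = np.asarray(counts, dtype=np.int64)
    displs = np.zeros_like(counts)
    displs[1:] = np.cumsum(counts)[:-1]
    # Classic MPI count/displacement arguments are C ints, so the total must fit.
    if counts.sum() > np.iinfo(np.int32).max:
        raise OverflowError("total element count exceeds the C int range")
    return counts, displs

def mismatched_ranks(gathered, expected_parts, counts, displs):
    """List the ranks whose segment of `gathered` differs from what was sent."""
    bad = []
    for rank, part in enumerate(expected_parts):
        segment = gathered[displs[rank]:displs[rank] + counts[rank]]
        if not np.array_equal(segment, part):
            bad.append(rank)
    return bad

# Demo with four pretend ranks; one value in rank 2's segment is altered on purpose.
counts, displs = gatherv_layout(100 * np.arange(1, 5))
parts = [np.arange(n) * 1e-3 for n in counts]
gathered = np.hstack(parts)
gathered[displs[2]] += 1.0
print(mismatched_ranks(gathered, parts, counts, displs))  # prints [2]
```

In the script from the first message, the same check could be run on rank 0 right after `Comm.Gatherv`, passing `GatheredVec`, `DataList`, `VecLenList` and `VecDisplList`. Knowing which ranks (and therefore which nodes) are affected is the kind of detail worth including in a report.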
