Scatterv with class instance


Jacob Wren

Apr 2, 2023, 12:22:02 PM
to mpi4py


I am using Scatterv to distribute m initial independent states (child_states below).


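import numpy as np

seed = 98765  # Set the seed (must be an integer; a float such as 98765. is rejected).
reps = 8  # m, the number of repetitions (illustrative value; not given in the post).

# Create the RNG to pass around.
rng = np.random.default_rng(seed)

# Get the SeedSequence of the passed RNG.
# (_seed_seq is a private attribute; recent NumPy releases also expose it as bit_generator.seed_seq.)
ss = rng.bit_generator._seed_seq

# Create m initial independent states.
child_states = np.array(ss.spawn(reps), dtype=object)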

Where an element of child_states looks like:





and is an instance of the class 'np.random.bit_generator.SeedSequence'.

How should I specify the datatype in my call to Scatterv?



Lisandro Dalcin

Apr 2, 2023, 1:03:37 PM
mpi4py's uppercase Scatterv() method cannot communicate NumPy arrays whose items are Python objects (i.e. dtype=object).
You should just use the lowercase scatter() method, which pickles arbitrary Python objects. Note that the send object passed to scatter() has to be a sequence of exactly comm.size elements. If you have more items than that, partition your data into a list of lists, with the outer list of length comm.size and the inner lists of approximately the same size for good load balancing. Python 3.12 will have an `itertools.batched` function to do this; otherwise look at more_itertools.chunked. But this is so simple that you can roll your own little function for it.
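The "roll your own" partitioning mentioned above could look roughly like the following. This is only a minimal sketch: the helper name `partition` and the example sizes are hypothetical and do not come from mpi4py or from the original post.

```python
def partition(items, n_parts):
    """Split items into n_parts contiguous sub-lists whose lengths differ by at most one."""
    items = list(items)
    base, extra = divmod(len(items), n_parts)
    parts = []
    start = 0
    for i in range(n_parts):
        # The first `extra` parts each take one additional item.
        stop = start + base + (1 if i < extra else 0)
        parts.append(items[start:stop])
        start = stop
    return parts


# Example: 10 work items shared among 4 ranks (4 stands in for comm.size).
chunks = partition(range(10), 4)
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```

With mpi4py, the root rank would presumably build `chunks = partition(child_states, comm.size)` and every rank would then call `comm.scatter(chunks if comm.rank == 0 else None, root=0)` to receive its own sub-list. One reason to write the helper by hand: `itertools.batched` and `more_itertools.chunked` take a batch size rather than a number of batches, so they can yield fewer than comm.size batches, whereas this sketch always returns exactly `n_parts` lists (some possibly empty).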


Lisandro Dalcin
Senior Research Scientist
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)

Jacob Wren

Apr 2, 2023, 3:55:31 PM
to mpi4py
Thank you, Lisandro.

Lisandro Dalcin

Apr 3, 2023, 4:10:24 AM

On Sun, 2 Apr 2023 at 22:21, Jacob Wren <> wrote:
> Thank you, Lisandro.
>
> I know this is a bit out of scope, but do you have any recommendations for producing repeatable pseudo-random numbers on each process?
>
> Say I want to run m repetitions, where m > comm.size. So, in addition to requiring independence across processes, I will need independence within each process as well. (I will also sample more than once per repetition.)

I'm not an expert, so I cannot offer any good advice. But it looks like you are already using the APIs that NumPy documents for this. They surely ensure independence across processes, but I'm not sure about repeatability.

> I am hoping to use (uppercase) Scatterv.

Remember that you cannot use Scatterv with dtypes that hold Python objects. Pickling small objects such as SeedSequence instances should not be expensive, so you can very well use the lowercase scatter().
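On the repeatability question raised in this thread, one possible approach is to spawn one child SeedSequence per repetition (rather than per process) from a single integer root seed, so that each repetition owns an independent stream it can sample from as often as needed. The sketch below is only an illustration under that assumption: the ranks are emulated serially instead of using mpi4py, and the names `run_repetition` and `simulate` as well as the sizes are hypothetical.

```python
import numpy as np


def run_repetition(child_seq, n_draws=3):
    """Run one repetition with its own generator, sampling several times."""
    rng = np.random.default_rng(child_seq)
    return rng.random(n_draws)


def simulate(seed, m, size):
    """Emulate `size` ranks serially; return results keyed by repetition index."""
    # One child SeedSequence per repetition, all spawned from a single root.
    children = np.random.SeedSequence(seed).spawn(m)
    results = {}
    for rank in range(size):
        # Round-robin assignment: rank r handles repetitions r, r + size, ...
        for rep in range(rank, m, size):
            results[rep] = run_repetition(children[rep])
    return results


first = simulate(seed=98765, m=10, size=4)
second = simulate(seed=98765, m=10, size=3)  # different number of "ranks"

# The same root seed gives the same numbers per repetition, whatever the rank count.
print(all(np.array_equal(first[k], second[k]) for k in range(10)))  # True
```

Because the streams are tied to the repetition index and not to the rank, the results should not depend on how many processes are used. In a real MPI run, the root rank could distribute the children with the lowercase scatter() instead of the serial loop shown here.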