Scatterv with class instance


Jacob Wren

Apr 2, 2023, 12:22:02 PM
to mpi4py

Hi,


I am using Scatterv to distribute m initial independent states (child_states below).

Specifically:


import numpy as np

seed = 98765  # Set the seed (an integer; default_rng rejects floats).

# Create the RNG to pass around.
rng = np.random.default_rng(seed)

# Get the SeedSequence of the passed RNG.
ss = rng.bit_generator._seed_seq

# Create m initial independent states (reps is the number of repetitions, m).
child_states = np.array(ss.spawn(reps), dtype=object)


Where an element of child_states looks like:

SeedSequence(
    entropy=1234,
    spawn_key=(0,),
),

and is an instance of class 'np.random.bit_generator.SeedSequence'.


In my call to Scatterv, how should I specify the type?


Thanks,

Jake

Lisandro Dalcin

Apr 2, 2023, 1:03:37 PM
to mpi...@googlegroups.com
mpi4py's uppercase Scatterv() method cannot communicate NumPy arrays whose items are Python objects (i.e. dtype=object).
You should just use the lowercase scatter() method. Note that the send object passed to scatter() has to be a sequence of comm.size elements. If you have more items than that, partition your data into a list of lists, with the outer list of length comm.size and the inner lists of approximately the same size for good load balancing. Python 3.12 will have an `itertools.batched` function to do this; otherwise look at more_itertools.chunked. But this is so simple that you can roll your own little function for it.
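For readers who want to see the "roll your own" partitioning, here is a minimal sketch. The names `partition` and `scatter_items` are hypothetical helpers made up for illustration, not part of mpi4py; `comm` is assumed to be an mpi4py communicator such as `MPI.COMM_WORLD`.

```python
def partition(items, nparts):
    """Split items into nparts contiguous lists whose lengths differ by at most one."""
    items = list(items)
    base, extra = divmod(len(items), nparts)
    parts = []
    start = 0
    for i in range(nparts):
        # The first `extra` parts each take one additional item.
        stop = start + base + (1 if i < extra else 0)
        parts.append(items[start:stop])
        start = stop
    return parts


def scatter_items(comm, items, root=0):
    """Scatter a list of picklable objects; each rank gets its own sub-list.

    Assumes `comm` is an mpi4py communicator (e.g. MPI.COMM_WORLD).
    Only the root needs to hold `items`; other ranks may pass None.
    """
    if comm.Get_rank() == root:
        chunks = partition(items, comm.Get_size())  # outer list of length comm.size
    else:
        chunks = None
    return comm.scatter(chunks, root=root)
```

With the code from the original post, something like `my_states = scatter_items(MPI.COMM_WORLD, ss.spawn(reps))` would hand each rank a plain list of SeedSequence objects, with no NumPy object array required.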



--
Lisandro Dalcin
============
Senior Research Scientist
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

Jacob Wren

Apr 2, 2023, 3:55:31 PM
to mpi4py
Thank you, Lisandro.

Lisandro Dalcin

Apr 3, 2023, 4:10:24 AM
to mpi...@googlegroups.com


On Sun, 2 Apr 2023 at 22:21, Jacob Wren <jacob...@gmail.com> wrote:
> Thank you, Lisandro.
>
> I know this is a bit out of scope, but do you have any recommendations for producing repeatable pseudo-random numbers across each process?
>
> Say I want to run m repetitions, where m > comm.size. So, in addition to requiring independence across processes, I will need independence within each process as well. (I will also sample more than once per repetition.)

I'm not an expert and I cannot provide any good advice. But it looks like you are already using the NumPy APIs documented at https://numpy.org/doc/stable/reference/random/parallel.html. For sure they ensure independence across processes, but I'm not sure about repeatability.

> I am hoping to use (upper case) Scatterv.

Remember you cannot use Scatterv for non-numeric dtypes involving Python objects. Passing around things like SeedSequence with pickle should not be expensive, so you can very well use lowercase scatter().
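
As a rough illustration of the points above, the sketch below emulates locally the pickle round trip that lowercase scatter() performs, and checks that rerunning from the same integer seed reproduces the same draws. It runs without MPI. The value of `reps` and the helpers `spawn_children` and `run_repetition` are hypothetical, and the repeatability shown assumes the same NumPy version and the same spawn order on every run.

```python
import pickle

import numpy as np

seed = 98765   # same integer seed as in the original post
reps = 10      # hypothetical number of repetitions (m)


def spawn_children(seed, reps):
    """Spawn one independent SeedSequence per repetition, in a fixed order."""
    return np.random.SeedSequence(seed).spawn(reps)


def run_repetition(child):
    """Hypothetical repetition: sample twice from the repetition's own generator."""
    rng = np.random.default_rng(child)
    return rng.random(3), rng.integers(0, 100, size=3)


# Lowercase scatter() pickles its payload; emulate that round trip locally.
children = spawn_children(seed, reps)
restored = pickle.loads(pickle.dumps(children))

# Draws made from the round-tripped children ...
draws = [run_repetition(c) for c in restored]
# ... can be compared with a fresh run that starts from the same seed.
draws_again = [run_repetition(c) for c in spawn_children(seed, reps)]
```

Because each repetition owns its child SeedSequence, the result of repetition i does not depend on which rank happens to run it, so m > comm.size only changes how many children each rank receives.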