Hi,
Sorry for my frequent questions.
If you have an idea, please help me.
My question is basically how to write a gmove when only some of the processes execute it. Please have a look at the attached code; it should be clearer than a description in words.
The result (copied at the bottom) indicates that the processes that only receive data also need to participate in the gmove. However, it is not clear to me how to do that.
I also tried 'gmove out async' (with the receivers calling wait_async), but this appears to cause a memory leak, even though the result may be correct (probably just by chance).
I hope XcalableMP/ACC has an intended way to handle this case.
On a different note, regarding the multi-GPU node case, I think the attached specification answers my question. Sorry for the disturbance.
I am using:
Omni Compiler: 1.3.2 with xacc option.
PGI: 20.1
DGX station
Best regards,
Noriyuki Kushida
==== result ====
1001
1002
1003
1004
0
0
0
0
3001
3002
3003
3004
0
0
0
0