Add the following:
<MPIParam>--bind-to none</MPIParam>
inside <Simulation><RunInfo><mode>
(or right after <runSbatch/>).
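
For reference, here is a minimal sketch of that placement. It assumes the run mode is mpi and that <runSbatch/> is already present in the input; the mode value and the omitted sections are assumptions for illustration, not taken from the actual input file.

```xml
<Simulation>
  <RunInfo>
    <!-- other RunInfo settings omitted; keep whatever the input already has -->
    <mode>mpi  <!-- "mpi" is assumed here; use the mode already in the input -->
      <runSbatch/>
      <!-- passes "bind-to none" through to the MPI launcher -->
      <MPIParam>--bind-to none</MPIParam>
    </mode>
  </RunInfo>
  <!-- remaining blocks of the input omitted -->
</Simulation>
```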
Joshua Cogliati
Hm, that is a new error to me.
Joshua Cogliati
Good afternoon,
I have a follow-up question about requesting resources. Since I was able to get "--bind-to none" to work, I have been attempting to request additional nodes to increase the number of cases running at a given time. However, when I request more than 3 full nodes, I begin to run into a significant number of errors. Is there a potential solution to this problem, or is there a hard limit on the number of nodes I can request within RAVEN?
Attached is an example of the slurm.out file that I get. After a certain number of cases, the remaining cases begin to fail immediately. I am getting numerous errors, including:
"<jemalloc>: background thread creation failed (11)"
"pmix_progress_thread_start failed
--> Returned value -1 instead of PMIX_SUCCESS"
"[br370:432108] PRTE ERROR: The system limit on number of children a process can have was reached in file plm_slurm_module.c at line 436"
Thank you for your time,
Jonathan