Hello deal.II community,
I am working with the deal.II finite element library and recently transitioned from workstation-based simulations to an HPC environment. My simulations are 3D problems with a large number of degrees of freedom, so I am using MPI parallelisation.
I am facing an issue while running an MPI-based deal.II application on our institute’s HPC cluster, and I would appreciate your guidance.
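For context, the parallel setup of my code follows the usual deal.II MPI pattern; the sketch below is only a simplified illustration of how the program is initialised, not my actual application:

#include <deal.II/base/mpi.h>
#include <deal.II/base/conditional_ostream.h>
#include <iostream>

int main(int argc, char *argv[])
{
  // Initialise MPI (and, through deal.II, PETSc/Trilinos if configured).
  dealii::Utilities::MPI::MPI_InitFinalize mpi_init(argc, argv, 1);

  const unsigned int rank =
    dealii::Utilities::MPI::this_mpi_process(MPI_COMM_WORLD);
  const unsigned int n_ranks =
    dealii::Utilities::MPI::n_mpi_processes(MPI_COMM_WORLD);

  // Only rank 0 writes to the screen.
  dealii::ConditionalOStream pcout(std::cout, rank == 0);
  pcout << "Running on " << n_ranks << " MPI processes" << std::endl;

  // ... distributed triangulation, assembly, and solve go here ...
  return 0;
}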
Software details:
deal.II version: 9.7.0
deal.II module: dealii_9.7.0_intel
MPI launcher available on system: /usr/bin/mpiexec (OpenMPI 4.1.5)
Intel oneAPI environment is sourced in the job script
Scheduler: PBS Pro (version 23.06.06)
HPC node configuration (from pbsnodes):
node1: 32 CPUs, 125 GB RAM
node2: 32 CPUs, 126 GB RAM
node3: 32 CPUs, 504 GB RAM
node4: 32 CPUs, 504 GB RAM
Observed behaviour:
The code runs correctly on any single node.
The code runs correctly when using node3 + node4 together.
The code fails when using node1 + node2 together, or with other mixed node combinations.
The PBS job script and the error file are attached for your reference.
Question:
Does this behaviour indicate a known issue related to MPI launcher usage, node allocation, or deal.II configuration on PBS-based clusters? Any guidance on how such node-combination-dependent failures should be diagnosed from the deal.II side would be very helpful.
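In case it helps the discussion: one diagnostic I am considering is a minimal per-rank "hello" program that reports which host each MPI rank actually lands on, so I can compare the node combinations that work against those that fail. A sketch (assuming standard MPI calls alongside the usual deal.II initialisation) is below; please tell me if this is a sensible way to narrow things down:

#include <deal.II/base/mpi.h>
#include <mpi.h>
#include <iostream>

int main(int argc, char *argv[])
{
  dealii::Utilities::MPI::MPI_InitFinalize mpi_init(argc, argv, 1);

  const unsigned int rank =
    dealii::Utilities::MPI::this_mpi_process(MPI_COMM_WORLD);
  const unsigned int n_ranks =
    dealii::Utilities::MPI::n_mpi_processes(MPI_COMM_WORLD);

  // Report the host name of every rank, so that the PBS node
  // allocation can be compared against what MPI actually uses.
  char hostname[MPI_MAX_PROCESSOR_NAME];
  int  name_length = 0;
  MPI_Get_processor_name(hostname, &name_length);

  std::cout << "Rank " << rank << " of " << n_ranks
            << " is running on host " << hostname << std::endl;

  return 0;
}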
Thank you for your time and support.
Best regards,
Newton