Nexus twist averaging for N_twists > N_mpi


Matej Ditte

Jan 6, 2026, 6:02:36 PM
to qmcpack
Hello,

I have a problem with the way Nexus bundles the QMCPACK part of twist-averaging jobs.
My setup is a desktop with at most 8 MPI ranks and 2 OpenMP threads per rank (machine='ws16').

For example, for a system with 4 independent twists, Nexus creates a dmc.in file containing

dmc.g000.twistnum_0.in.xml
dmc.g001.twistnum_1.in.xml
dmc.g002.twistnum_2.in.xml
dmc.g003.twistnum_3.in.xml

dmc.in is passed to QMCPACK, which creates 4 MPI groups and runs all 4 calculations simultaneously, each with 2 MPI ranks.
As an example, the output of dmc.g001.twistnum_1.in.xml contains

  Total number of MPI ranks = 8
  Number of MPI groups      = 4
  MPI group ID              = 1
  Number of ranks in group  = 2
  MPI ranks per node        = 8
  OMP 1st level threads     = 2
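
For reference, my setup is essentially the following minimal script. The diamond cell is just a stand-in, and I have left out the pseudopotential and orbital/Jastrow/DMC inputs, since only the twist handling matters here:

# Sketch of a Nexus twist-averaged DMC setup on a 16-core workstation.
# Physical inputs are placeholders; the point is the multi-twist system
# plus a single bundled QMCPACK job.

from nexus import settings, job, run_project
from nexus import generate_physical_system, generate_qmcpack

settings(
    pseudo_dir = './pseudopotentials',
    results    = '',
    machine    = 'ws16',
    )

system = generate_physical_system(
    units  = 'A',
    axes   = [[1.785, 1.785, 0.    ],
              [0.    , 1.785, 1.785],
              [1.785, 0.    , 1.785]],
    elem   = ('C', 'C'),
    pos    = [[0.    , 0.    , 0.    ],
              [0.8925, 0.8925, 0.8925]],
    kgrid  = (2, 2, 1),   # 4 twists, matching the file list above
    kshift = (0, 0, 0),
    C      = 4,           # valence charge of the carbon pseudopotential
    )

qmc = generate_qmcpack(
    identifier = 'dmc',
    path       = 'dmc',
    job        = job(cores=16, threads=2),  # 8 MPI ranks x 2 OMP threads
    system     = system,
    # orbital, Jastrow, and VMC/DMC inputs omitted
    )

run_project()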

The problem is that for a system with more twists (e.g. 10), Nexus tries to do the same, which leads to an error (in dmc.err):

Fatal Error. Aborting at main(). Current 8 MPI ranks cannot accommodate all the 10 individual calculations in the ensemble. Increase the number of MPI ranks or reduce the number of calculations.

Abort(1) on node 0 (rank 0 in comm 496): application called MPI_Abort(world, 1) - process 0
Abort(1) on node 1 (rank 1 in comm 496): application called MPI_Abort(world, 1) - process 1
Abort(1) on node 2 (rank 2 in comm 496): application called MPI_Abort(world, 1) - process 2
Abort(1) on node 3 (rank 3 in comm 496): application called MPI_Abort(world, 1) - process 3
Abort(1) on node 4 (rank 4 in comm 496): application called MPI_Abort(world, 1) - process 4
Abort(1) on node 5 (rank 5 in comm 496): application called MPI_Abort(world, 1) - process 5
Abort(1) on node 6 (rank 6 in comm 496): application called MPI_Abort(world, 1) - process 6
Abort(1) on node 7 (rank 7 in comm 496): application called MPI_Abort(world, 1) - process 7


Is there a way to split the bundled twist-averaged QMCPACK calculations? I found nothing about this in the documentation or in the available examples.

If you need an example, try a run with a 4x4x4 k-grid on a laptop.

I would be grateful for any help or suggestions.

Best,
Matej

Paul R. C. Kent

Jan 7, 2026, 10:39:45 AM
to qmcpack
The solution is to create the QMC calculations one twist at a time. For example:

ntwists = len(system.structure.kpoints)
qmc_job = job(processes=1, threads=2, hours=8)
for n in range(ntwists):
    qmc = generate_qmcpack(
        identifier = 'dmc.g'+str(n).zfill(3),
        twistnum   = n,
        job        = qmc_job,
        # ...
        )

The key entry here is the twistnum argument to generate_qmcpack().
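
Spelled out a little more (the per-twist directories and the one-rank job size below are a reasonable choice for a workstation, not a requirement):

# One single-twist QMCPACK job per twist instead of one bundled
# ensemble job, so e.g. 10 twists fit on an 8-rank machine.

ntwists = len(system.structure.kpoints)
qmc_job = job(processes=1, threads=2, hours=8)  # one MPI rank per twist

for n in range(ntwists):
    generate_qmcpack(
        identifier = 'dmc.g'+str(n).zfill(3),
        path       = 'dmc_twist'+str(n).zfill(3),  # one directory per twist
        twistnum   = n,        # run only this twist; no ensemble bundling
        job        = qmc_job,
        system     = system,
        # same orbital/Jastrow/DMC inputs as the bundled run
        )

run_project()  # with no arguments, runs every simulation created above

On a workstation machine like ws16, Nexus then schedules these single-twist jobs itself, running as many at a time as the core count allows.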

This was a good question and indeed the answer should be added to the docs.

I'll note a related problem: scaling a single job proportionately to the number of twists. For that you can do

system = generate_physical_system(...)
N = 10                             # nodes per twist
M = len(system.structure.kpoints)  # number of twists
qmc_job = job(nodes=N*M, ...)
qmc = generate_qmcpack(job=qmc_job, ...)
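
With nodes=N*M, QMCPACK's ensemble mode splits the job into M MPI groups of N nodes each, one group per twist. For example, N = 10 and M = 10 twists gives 100 nodes in 10 groups of 10, analogous to the 8-rank/4-group split shown above.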

Credit to Jaron Krogel for the explanations.

Matej Ditte

Jan 7, 2026, 1:38:26 PM
to qmcpack
Thank you, Paul (and Jaron), this works!

The only small issue is that it does not create the dmc.g*.twist_info.dat files automatically, so a simple qmca -a -q e *scalar.dat does not produce correct averages. An easy fix is to store the weights (supercell.structure.kweights) and create the .twist_info.dat files in the correct directories after calling run_project(). They do not need to contain the k-vectors; a single number (the weight) is sufficient.
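
In case it helps anyone else, my workaround looks roughly like this (run after run_project(); 'system' is the physical system object from the Nexus script, and the directory layout is whatever you chose when generating the per-twist jobs):

import os

# Write a minimal dmc.g*.twist_info.dat per twist so that
# 'qmca -a -q e *scalar.dat' applies the correct twist weights.

weights = system.structure.kweights     # one weight per twist
for n, w in enumerate(weights):
    tag   = 'dmc.g'+str(n).zfill(3)
    tdir  = os.path.join('runs', 'dmc_twist'+str(n).zfill(3))  # adjust to your layout
    fname = os.path.join(tdir, tag+'.twist_info.dat')
    with open(fname, 'w') as f:
        f.write(str(w)+'\n')            # a single number (the weight) suffices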

Best,
Matej