Hi everyone,
I'm running gmx_MMPBSA (version 2023.3) with quasi-harmonic entropy calculations on 1000 ns MD trajectories generated with GROMACS 2023.1. I'm working on two similar systems ("51mod" and "negative"), and runs for both frequently fail at high MPI parallelization (-np 64).
The crash happens late in the run (~90% progress) with the following error:
Error: PDB _GMXMMPBSA_avgcomplex.pdb: No frames read. atom=1619 expected 4932.
Error: Could not set up '_GMXMMPBSA_avgcomplex.pdb' for reading.
cpptraj failed with prmtop COM.prmtop!
Error occurred on rank 34.

Here's the SLURM submission script I'm using:
#!/bin/bash
#SBATCH --job-name=51mod
#SBATCH --partition=all
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=8
#SBATCH --mem=196GB
#SBATCH --output=MMPBSAENTROPY.out
#SBATCH --time=2-00:00:00

export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
export GMXMMPBSA_TEMP_DIR=$SLURM_TMPDIR

module load GCC/12.2.0
module load Anaconda3/2023.07-2
source activate gmxMMPBSA2023.3

mpirun -np 64 gmx_MMPBSA -O -i mmpbsa_initial.in \
    -cs md51mod1000.tpr -ct mol51mod1000.xtc -ci index.ndx \
    -cg 10 11 -cp 51mod.top \
    -o FINRES_MMPBSAENT.dat -eo FINRES_MMPBSAENT.csv

conda deactivate

I've seen this error on both systems multiple times. One 51mod run did manage to finish correctly once, likely by chance.
Lowering -np to 32 always works, but it is too slow to get through the full 1000 ns trajectory within my 2-day SLURM time limit. I suspect the failure is related to temporary-file I/O or a race condition when writing/reading _GMXMMPBSA_avgcomplex.pdb; one workaround I plan to test is sketched below.
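If the race really is at the filesystem level, the idea is to run each job from its own job-unique working directory on shared scratch, so that concurrent 51mod/negative jobs can never touch each other's _GMXMMPBSA_* intermediates. The sketch below is untested; /scratch/$USER is a placeholder for whatever parallel filesystem is visible to all 8 nodes. (I also wonder whether pointing GMXMMPBSA_TEMP_DIR at $SLURM_TMPDIR is itself risky on a multi-node run, since that path is typically node-local.)

# Untested workaround sketch: job-unique working directory on *shared* scratch.
# /scratch/$USER is a placeholder path; it must be visible to all 8 nodes.
WORKDIR="/scratch/$USER/gmxmmpbsa_${SLURM_JOB_ID}"
mkdir -p "$WORKDIR"
cp mmpbsa_initial.in md51mod1000.tpr mol51mod1000.xtc index.ndx 51mod.top "$WORKDIR/"
cd "$WORKDIR" || exit 1

mpirun -np 64 gmx_MMPBSA -O -i mmpbsa_initial.in \
    -cs md51mod1000.tpr -ct mol51mod1000.xtc -ci index.ndx \
    -cg 10 11 -cp 51mod.top \
    -o FINRES_MMPBSAENT.dat -eo FINRES_MMPBSAENT.csv

# Copy only the final results back; intermediates stay on scratch.
cp FINRES_MMPBSAENT.dat FINRES_MMPBSAENT.csv "$SLURM_SUBMIT_DIR/"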
Any advice would be greatly appreciated! I’m happy to provide more logs or test suggestions.
Thanks in advance
Thanks for both replies — really appreciate the insight.
I'm using quasi-harmonic (QH) entropy to complement MMPBSA binding-energy comparisons between several nanobody–antigen complexes. I initially thought one of the mutated models (51mod) was binding more favorably, given its significantly lower enthalpy (ΔH) than the wildtype's. However, after including QH entropy, 51mod showed a much larger entropy penalty (–TΔS), which significantly worsened the overall ΔG.
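For anyone reading along: as I understand it, gmx_MMPBSA obtains the QH estimate via cpptraj, which builds it from the eigenvalues λ_i of the mass-weighted covariance matrix of the atomic fluctuations (Andricioaei & Karplus, J. Chem. Phys. 2001):

S_QH = k_B Σ_i [ (ħω_i / k_B T) / (exp(ħω_i / k_B T) − 1) − ln(1 − exp(−ħω_i / k_B T)) ],  with ω_i = sqrt(k_B T / λ_i)

Since the covariance matrix converges with the number of independent samples rather than the raw frame count, thinning a long trajectory should cost little accuracy; this is relevant to the frame counts below.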
These calculations were part of a systematic analysis based on simulation length. In each case, the first 1,000 frames were excluded from the entropy calculation to avoid initial equilibration bias.
Here's a summary of ΔH, –TΔS, and ΔG for 51mod, wildtype, and a negative control:
We also have experimental binding affinity data (SPR) for the wildtype, so the goal was to use this data as a benchmark and see whether the simulations converge toward those values over time.
I now see that using ~79,000 or more frames is likely overkill: consecutive snapshots are highly correlated, so they add little independent information. I'll rerun with interval = 10 to reduce the frame count and improve snapshot independence, as suggested; a sketch of the input follows.
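For reference, here is roughly how I plan to set the entropy block in mmpbsa_initial.in. This is a sketch only: startframe=1001 and interval=10 are the values implied above, qh_entropy is the &general keyword I believe current gmx_MMPBSA releases use for quasi-harmonic entropy (some older versions used entropy=1 instead), and the rest of my input stays unchanged; worth double-checking against the docs for your release.

Sample entropy settings (sketch; verify keywords for your gmx_MMPBSA version)
&general
# drop the first 1000 frames (equilibration), then keep every 10th frame
startframe=1001,
interval=10,
# quasi-harmonic entropy; some older releases used entropy=1 instead
qh_entropy=1,
/
# ...existing &pb / &gb sections unchanged...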
I understand QH has some limitations, but in this context, it has been quite helpful in uncovering a misleading interpretation based solely on enthalpy.
Thanks again for your time and help — it’s much appreciated.