-hrex segmentation fault 11 (gromacs 2020.4, plumed 2.7)

Peter Starrs

Mar 27, 2021, 9:19:28 PM
to PLUMED users
Dear PLUMED users,

I've been trying to use the REST2 protocol (as implemented in PLUMED) to study conformations of a polyglycan in solution using 4 replicas. Unfortunately, I can't get the replica exchange simulations to run. At the first exchange attempt, they crash with the error:

[node10:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[node10:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)
[node10:mpi_rank_3][error_sighandler] Caught error: Segmentation fault (signal 11)
[node10:mpi_rank_2][error_sighandler] Caught error: Segmentation fault (signal 11)

The equilibration runs (also done with the patched GROMACS install) run fine, as do the replica exchange simulations up to the first exchange attempt. I see this behaviour with both GROMACS 2020.4/PLUMED 2.7 and GROMACS 2021/PLUMED 2.8 (development branch).

The command I submit to run my jobs is exactly this:
mpirun -np 4 gmx_mpi mdrun -multidir hremd{0..3} -plumed -replex 1000 -hrex yes -dlb no

The plumed printout at the start of the simulations in the md.log files is as follows:

Replica exchange in temperature
 300.0 300.0 300.0 300.0

Repl  p  1.00  1.00  1.00  1.00
Replica exchange interval: 1000

Replica random seed: 185548135
Replica exchange information below: ex and x = exchange, pr = probability
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest

PLUMED: PLUMED is starting
PLUMED: Version: 2.7.0 (git: Unknown) compiled on Mar 25 2021 at 16:58:39
PLUMED: Please cite these papers when using PLUMED [1][2]
PLUMED: For further information see the PLUMED web page at http://www.plumed.org
PLUMED: Root: /gpfs1/scratch/compchem/pns2/software/plumed-2.7.0-install/lib/plumed
PLUMED: For installed feature, see /gpfs1/scratch/compchem/pns2/software/plumed-2.7.0-install/lib/plumed/src/config/config.txt
PLUMED: Molecular dynamics engine: gromacs
PLUMED: Precision of reals: 4
PLUMED: Running over 1 node
PLUMED: Number of threads: 4
PLUMED: Cache line size: 512
PLUMED: Number of atoms: 7041
PLUMED: GROMACS-like replica exchange is on
PLUMED: File suffix: .0
PLUMED: FILE: plumed.dat
PLUMED: END FILE: plumed.dat
PLUMED: Timestep: 0.002000
PLUMED: KbT: 2.494339
PLUMED: Relevant bibliography:
PLUMED:   [1] The PLUMED consortium, Nat. Methods 16, 670 (2019)
PLUMED:   [2] Tribello, Bonomi, Branduardi, Camilloni, and Bussi, Comput. Phys. Commun. 185, 604 (2014)
PLUMED: Please read and cite where appropriate!
PLUMED: Finished setup
Started mdrun on rank 0 Sun Mar 28 02:01:28 2021

I notice that GROMACS reports that all 4 replicas are at 300 K (as we're varying the solute Hamiltonian) and that consequently the exchange probabilities are all 1.00. I assume this does not indicate a problem, and is simply because GROMACS doesn't know about the hrex exchange criterion...
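(As a rough sketch of why those values come out as 1.00, and this is just my reading rather than anything from the GROMACS docs: the temperature replica exchange criterion that GROMACS reports is, with beta = 1/kT,

  P = min(1, exp[(beta_i - beta_j) * (U(x_i) - U(x_j))])

so with every replica at 300 K the beta difference is zero and the predicted probability is trivially 1. My understanding is that with -hrex the actual acceptance is evaluated by PLUMED from the two Hamiltonians, roughly

  P = min(1, exp[beta * (U_i(x_i) + U_j(x_j) - U_i(x_j) - U_j(x_i))])

so the 1.00 values printed at startup shouldn't mean anything by themselves.)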

Note that the plumed.dat file is completely empty. I noticed some other reports of problems with -hrex in which changing the commands in plumed.dat seemed to solve the issue.

Has anyone else experienced this problem and found a solution?

Thanks in advance,
Peter Starrs

Daniel Burns

Mar 29, 2021, 12:44:29 PM
to PLUMED users
Hi Peter,

This might have to do with one of the easily overlooked items in the hrex tutorial:

  • Choose a neighbor list update interval (nstlist) that divides replex. Notice that when running with GPUs, GROMACS is going to change nstlist automatically; be sure that it still divides replex.
Check the nstlist setting in your mdp file and then check your log file to see whether GROMACS is changing it.

In the log file you'll see something like:
Overriding nsteps with value passed on the command line: -1 steps
Changing nstlist from 30 to 100, rlist from 1.25 to 1.337


Then set your replex interval to a multiple of the new nstlist (a quick check is sketched below). Hopefully that will make it work.
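A quick way to check (just a sketch; it assumes the md.log files sit in the hremd0..hremd3 directories from your mpirun command and that GROMACS reports the change with the wording shown above):

grep "Changing nstlist" hremd*/md.log
# e.g. "Changing nstlist from 20 to 100, ..."  ->  choose -replex as a multiple of 100

If grep finds nothing, nstlist was left at the mdp value, and that is the value that needs to divide -replex.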

Good luck,

Dan

Peter Starrs

Mar 29, 2021, 3:17:36 PM
to PLUMED users
Hi Daniel,

Thanks for your reply. I probably should have mentioned this in my post, but I am aware of that issue, and indeed GROMACS does change nstlist (from 20 to 100). However, I am attempting exchanges either every 100 or 1000 steps, both of which are divisible by 100...

May I ask if you have successfully run REST2 simulations on recent versions of GROMACS/PLUMED? I'm wondering if there is some difficulty specific to newer versions. The tutorial page for REST2 is out of date in that it references the "-multi" flag, which has been replaced in GROMACS by "-multidir".
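(In case it helps anyone else updating from the old tutorial, the layout that -multidir expects is roughly the following; this is only a sketch, assuming the default file names, topol.tpr for GROMACS and plumed.dat for PLUMED, in each replica directory:

hremd0/topol.tpr  hremd0/plumed.dat
hremd1/topol.tpr  hremd1/plumed.dat
hremd2/topol.tpr  hremd2/plumed.dat
hremd3/topol.tpr  hremd3/plumed.dat

mdrun then runs one simulation per directory, distributing the MPI ranks among them, as in the command in my first message.)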

Cheers,

Peter

Daniel Burns

Apr 11, 2021, 6:49:47 PM
to PLUMED users
Sorry to take so long to respond. I have been running HREX with GROMACS 2018.8 and PLUMED 2.5.5. I was never able to get it to run on our university HPC, but I have had no trouble on XSEDE Comet with my own installations.

The university HPC seemed to give a similar MPI error; maybe your IT staff can trace it?

Dan

Peter Starrs

May 5, 2021, 8:29:53 PM
to PLUMED users
Hi again Dan,

Thanks for your reply. It looks like I'm getting the same error on a local machine with a build that uses OpenMPI 4 instead of MVAPICH. Do you remember which MPI library you used for compilation?

Cheers,

Peter

Jeremy Lapierre

Oct 11, 2021, 12:16:27 PM
to PLUMED users
Hi Peter,

I wanted to know if you've found the cause of the issue, because I have the same problem as you with GROMACS 2020.4 + PLUMED 2.7.0 and GROMACS 2021 + PLUMED 2.8.0. I've tried going back to a simple system like alanine dipeptide, with 4 identical parallel replicas scaled with `plumed partial_tempering 1` (so no scaling) and an empty plumed file, but I still get the same error. I think this problem is related to https://groups.google.com/g/plumed-users/c/S0UQEg7fxQY.
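(For reference, the scaled topology for each replica comes from the command-line tool, roughly as in the hrex tutorial; the file names below are just placeholders, and the solute atoms have to be marked beforehand with a trailing "_" on their atom types in the processed topology:

plumed partial_tempering 1 < processed.top > scaled.top

With a scaling factor of 1 the output should be equivalent to the input topology, which is why I expected this test to behave like a plain unscaled run.)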

Cheers,
Jeremy

Peter Starrs

Oct 11, 2021, 1:15:27 PM
to plumed...@googlegroups.com
Hi Jeremy,

I ended up finding some discussion of the issue on github:

From this it looks as though the problem should have been patched (on June 10th). Have you tried GROMACS 2020.5/PLUMED-2.7.2, which are referenced in the second link?

In any case, what I did was simply use GROMACS 2019.6/PLUMED 2.6.3, which was reported to be working in the first link, and indeed I haven't had problems with it.

Either way I hope you can get it working.

Peter

Jeremy Lapierre

Oct 11, 2021, 5:38:59 PM
to plumed...@googlegroups.com
Hi Peter,

Indeed, the problem is fixed with GROMACS 2020.6 + PLUMED 2.7.2. Thanks a lot for your help!

Cheers,
Jeremy
