The latest git version of plumed seems to break NAMD.


Snow Summer

Jul 5, 2017, 6:12:42 AM
to PLUMED users
Hello,

I am testing the latest git version of PLUMED with NAMD. Recently NAMD has been crashing with a PLUMED assertion error:
terminate called after throwing an instance of 'PLMD::Exception'
 what():   
+++ Internal PLUMED error
+++ file Atoms.cpp, line 97
+++ message: assertion failed p || gatindex.size()==0, NULL mass pointer with non-zero local atoms

I only defined two torsion angles in the PLUMED configuration file:
phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17

Any ideas?

Thanks,

Snow Summer

Jul 5, 2017, 10:25:04 AM
to PLUMED users
Hello,

I reverted commit 5952dbec4f4f043dcd69c9bc6c47193d0adbbb3b and NAMD worked again.
That commit seems to improve performance. Should the NAMD patch be modified accordingly?
 

Bogdan Marekha

Jul 5, 2017, 10:37:49 AM
to PLUMED users
What was the NAMD version you used?

Snow Summer

Jul 5, 2017, 10:46:17 AM
to PLUMED users
NAMD_Git-2017-06-30
The original patch shipped with PLUMED is for NAMD 2.9; I adapted it to the newer version of NAMD.


Bogdan Marekha

Jul 5, 2017, 11:50:32 AM
to PLUMED users
Great! Would it perhaps be possible for you to share the adapted patch, so that it could be included in the official PLUMED release as well?

Unless, of course, it is a really custom-made patch that is too specific to your cluster configuration and/or NAMD installation.

Best regards

Snow Summer

Jul 5, 2017, 9:28:05 PM
to PLUMED users
OK, here's the patch.
cd into your NAMD source directory (for example ~/HDD/software/NAMD_Git-2017-06-30_Source), then run:
patch -Np2 -i 1.patch
Then create the following soft links (I installed PLUMED in /usr/local):
Plumed.cmake -> /usr/local/lib64/plumed/src/lib/Plumed.cmake.shared
Plumed.h -> /usr/local/include/plumed/wrapper/Plumed.h
Plumed.inc -> /usr/local/lib64/plumed/src/lib/Plumed.inc.shared
Then you can build NAMD.
Remember to revert commit 5952dbec4f4f043dcd69c9bc6c47193d0adbbb3b before building PLUMED.
I have added some extra features to the patch (they need more testing):
1. setKbt is called according to the thermostat temperature in NAMD (borrowing some code from colvarproxy_namd.C).
2. doCheckpoint is called according to the NAMD restart frequency setting.
3. The box size is updated (thanks to the Colvars author; see also: http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2017-2018/0446.html).
I am not a PLUMED developer, and it seems the PLUMED developers seldom use NAMD. I hope they can fix the bug and make a new patch for newer NAMD versions.

1.patch

Snow Summer

Jul 10, 2017, 11:17:46 PM
to PLUMED users
I spent some time debugging and found that for AMBER and GROMACS shuffledAtoms is 0, but for NAMD shuffledAtoms is 1 and dd is false.
The code crashes the second time void Atoms::createFullList(int*n) is called.
I don't know what shuffledAtoms means or why it is set to 1 for NAMD.
If anyone knows what is happening, please let me know.
Thanks,



Giovanni Bussi

Jul 11, 2017, 2:57:23 AM
to plumed...@googlegroups.com
Hi!

Thanks for your effort. I am afraid one of us needs to have a look at that. I will open an issue on GitHub.

Meanwhile, did I understand correctly that everything works after reverting to an older PLUMED version?

Thanks!

Giovanni


--
You received this message because you are subscribed to the Google Groups "PLUMED users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to plumed-users+unsubscribe@googlegroups.com.
To post to this group, send email to plumed...@googlegroups.com.
Visit this group at https://groups.google.com/group/plumed-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/plumed-users/6a92a689-f1d9-4e6a-bdfd-13b879db0a8a%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Snow Summer

Jul 11, 2017, 3:41:17 AM
to PLUMED users
Hi,

Reverting 5952dbec4f4f043dcd69c9bc6c47193d0adbbb3b fixes the NAMD problem.

The attachment is a simple test example (from the ABF tutorial) that I use for NAMD.

You can run
<patched namd directory>/Linux-x86_64-g++/namd2 +p1 +idlepoll ./namd.conf
to test it.

Thanks!

plumed_namd_test.tar.xz

Carlo Camilloni

Jul 11, 2017, 6:27:02 AM
to plumed...@googlegroups.com
Hi,

Could you go back to the PLUMED version that was not working and apply this patch?

namd_patch.patch

Carlo Camilloni

Jul 11, 2017, 6:59:53 AM
to plumed...@googlegroups.com
Hi,

Sorry, please ignore my previous patch; it cannot work.

C

On 11 Jul 2017, at 12:26, Carlo Camilloni <carlo.c...@gmail.com> wrote:

Hi,

Could you go back to the PLUMED version that was not working and apply this patch?

<namd_patch.patch>

In the plumed2 folder you can type
git apply namd_patch.patch
And see if this solves your problem?

Carlo





Giovanni Bussi

Jul 11, 2017, 12:49:37 PM
to plumed...@googlegroups.com
Hi,

I am trying to reproduce the problem outside NAMD. I have a small C++ program that should make the same PLUMED calls that NAMD does.

The only way I can trigger that error is to request no atoms. I am not sure it is the same problem (I get it with both PLUMED 2.3 and 2.4, so it is not related to commit 5952dbec4f4f043dcd69c9bc6c47193d0adbbb3b).

Anyway, I see the following:

with an input such as
d: DISTANCE ATOMS=1,2
PRINT ARG=d FILE=COLVAR STRIDE=1

there is no error. If I remove the PRINT line I get the error.

Is this the same as what happens with NAMD?

Giovanni



Snow Summer

Jul 11, 2017, 8:52:57 PM
to PLUMED users
Hi,

With the PRINT line I first got an error that did not throw 'PLMD::Exception' (NAMD stopped with a segfault, as reported in dmesg). So I removed the PRINT line, and then it threw 'PLMD::Exception'.

It seems these are two different issues. Sorry for the confusion.

And yes, NAMD does not throw PLMD::Exception with the PRINT line, but it still crashes with that commit.


Giovanni Bussi

Jul 12, 2017, 2:56:01 AM
to plumed...@googlegroups.com
OK. So to recapitulate:

Case 1 - without PRINT statement

# here's your input file:
d: DISTANCE ATOMS=1,2

You get the exception. I would expect this to happen with all PLUMED versions, and it can be fixed easily. After applying the patch, edit src/ComputeMgr.C as follows:

// add a private member to class GlobalMasterPlumed:
bool first;

// set it to true in void easy_init()
first=true;

// use it to force the list of atoms to be requested at the first step
// after this line:    if(index.size()!=n)redo=true;
// add
if(first) redo=true;
first=false;

This way the exception will no longer be thrown.

Anyway, I would say that using zero atoms is an uncommon choice, so this fix is not particularly important.
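
To make the flag logic of Case 1 easier to follow in isolation, here is a minimal, self-contained sketch. It is not the NAMD patch itself: `AtomRequestTracker`, `init` and `needsRedo` are hypothetical names standing in for the bookkeeping in GlobalMasterPlumed, and the element-wise comparison of the index list is an assumption about how the real code detects changes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the atom-list bookkeeping (not NAMD code).
class AtomRequestTracker {
public:
  // Mirrors "set it to true in void easy_init()".
  void init() {
    first_ = true;
    index_.clear();
  }

  // Returns true when the list of atoms has to be (re)requested.
  bool needsRedo(const std::vector<int>& requested) {
    bool redo = false;
    if (index_.size() != requested.size()) redo = true;
    // Assumption: a list of the same size but different content also counts.
    for (std::size_t i = 0; !redo && i < requested.size(); ++i) {
      if (index_[i] != requested[i]) redo = true;
    }
    if (first_) redo = true;  // force a request on the very first step
    first_ = false;
    if (redo) index_ = requested;
    return redo;
  }

private:
  bool first_ = true;
  std::vector<int> index_;
};
```

With an empty request list the size check alone never fires (0 == 0), so without the `first_` flag no request would ever be made; with the flag, the first call returns true and later calls with an unchanged list return false.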

Case 2 - with PRINT statement

# this is your plumed.dat file:
d: DISTANCE ATOMS=1,2
PRINT ARG=d FILE=COLVAR STRIDE=1

This is more complicated, since it is a segfault and we don't know where the code is stopping. Do you think you could try to locate it with gdb?

Also, could you add the following statement to src/ComputeMgr.C before every occurrence of cmd("setStep",&s):

std::cerr<<"STEP "<<s<<std::endl;

You might have to include <iostream> somewhere for this to work.

What I would like to know is:
1. Whether s is equal to zero the first time cmd("setStep",&s) is called (as it should be)
2. After how many time steps you see the crash

Thanks a lot for your feedback, it is very useful!

Giovanni




Snow Summer

Jul 12, 2017, 3:36:21 AM
to PLUMED users
Hello,

I added fprintf(stdout,"Finished setStep: at step %d\n",s); before cmd("setStep",&s);
NAMD stops at step 0.

I recompiled PLUMED with -g3 and without -O.
gdb shows:
Thread 1 "namd2" received signal SIGSEGV, Segmentation fault.
0x00007ffff77bf526 in PLMD::PlumedMain::update (this=0x16ce55c0) at PlumedMain.cpp:733
733         p->beforeUpdate();
(gdb) list
728
729       stopwatch.start("6 Update");
730     // update step (for statistics, etc)
731       updateFlags.push(true);
732       for(const auto & p : actionSet) {
733         p->beforeUpdate();
734         if(p->isActive() && p->checkUpdate() && updateFlagsTop()) p->update();
735       }
736       while(!updateFlags.empty()) updateFlags.pop();
737       if(!updateFlags.empty()) plumed_merror("non matching changes in the update flags");
backtrace:
#0  0x00007ffff77bf526 in PLMD::PlumedMain::update (this=0x16ce55c0) at PlumedMain.cpp:733
#1  0x00007ffff77be63a in PLMD::PlumedMain::performCalc (this=0x16ce55c0) at PlumedMain.cpp:637
#2  0x00007ffff77b23e6 in PLMD::PlumedMain::cmd (this=0x16ce55c0, word="performCalc", val=0x0) at PlumedMain.cpp:207
#3  0x00007ffff77c5e79 in PLMD::WithCmd::cmd (this=0x16ce55c0, key="performCalc", val=0x0) at WithCmd.h:46
#4  0x00007ffff77c5b6b in plumedmain_cmd (plumed=0x16ce55c0, key=0x133b5ea "performCalc", val=0x0) at PlumedMainInitializer.cpp:52
#5  0x00007ffff79d8933 in plumed_cmd (p=..., key=0x133b5ea "performCalc", val=0x0) at Plumed.c:177
#6  0x00000000009c91b5 in PLMD::Plumed::cmd (this=0x16ce4f50, key=0x133b5ea "performCalc", val=0x0) at src/../Plumed.h:539
#7  0x00000000009b0a25 in GlobalMasterPlumed::easy_calc (this=0x16ce4e00) at src/ComputeMgr.C:242
#8  0x0000000000c867f6 in GlobalMasterEasy::calculate (this=0x16ce4e00) at src/GlobalMasterEasy.C:169
#9  0x0000000000c68405 in GlobalMaster::processData (this=0x16ce4e00, a_i=0x16e64380, a_e=0x16e64388, p_i=0x16e64240, g_i=0x0, g_e=0x0, gm_i=0x0, gm_e=0x0, gtf_i=0x0, gtf_e=0x0,
    last_atoms_forced_i=0x0, last_atoms_forced_e=0x0, last_forces_i=0x0, forceid_i=0x0, forceid_e=0x0, totalforce_i=0x0) at src/GlobalMaster.C:47
#10 0x0000000000c6a17c in GlobalMasterServer::callClients (this=0x16ce4c10) at src/GlobalMasterServer.C:379
#11 0x0000000000c691ca in GlobalMasterServer::recvData (this=0x16ce4c10, msg=0x16e64270) at src/GlobalMasterServer.C:108
#12 0x00000000009b4704 in ComputeMgr::recvComputeGlobalData (this=0xc8769d0, msg=0x16e64270) at src/ComputeMgr.C:1454
#13 0x00000000009bcd35 in CkIndex_ComputeMgr::_call_recvComputeGlobalData_ComputeGlobalDataMsg (impl_msg=0x16e64270, impl_obj_void=0xc8769d0) at inc/ComputeMgr.def.h:2173
#14 0x0000000000ff8070 in CkDeliverMessageFree ()
#15 0x0000000000ffdb1c in _processHandler(void*, CkCoreState*) ()
#16 0x00000000010c7932 in CsdScheduleForever ()
#17 0x00000000010c7c9d in CsdScheduler ()
#18 0x00000000009097f4 in BackEnd::suspend () at src/BackEnd.C:292
#19 0x0000000000e9d1ad in ScriptTcl::suspend (this=0xc733c70) at src/ScriptTcl.C:72
#20 0x0000000000e9d386 in ScriptTcl::runController (this=0xc733c70, task=1) at src/ScriptTcl.C:110
#21 0x0000000000ea41a3 in ScriptTcl::run (this=0xc733c70) at src/ScriptTcl.C:2184
#22 0x00000000009047cc in after_backend_init (argc=2, argv=0x7fffffffd678) at src/mainfunc.C:178
#23 0x000000000090412f in main (argc=4, argv=0x7fffffffd678) at src/mainfunc.C:50

Do I also need to recompile NAMD with -g3?

Thanks,



Snow Summer

Jul 12, 2017, 3:48:57 AM
to PLUMED users
More info from gdb. Might be helpful.
(gdb) print actionSet
$1 = (PLMD::ActionSet &) @0x16ce6990: {<std::vector<PLMD::Action*, std::allocator<PLMD::Action*> >> = std::vector of length 2, capacity 2 = {0xca08570, 0xca06bd0}, plumed = @0x16ce55c0}
(gdb) print (*actionSet)
Cannot resolve function operator* to any overloaded instance
(gdb) print actionSet[0]
$2 = (PLMD::Action *) 0xca08570
(gdb) print *actionSet[0]
$3 = {_vptr.Action = 0x0, name = "", label = "", line = std::vector of length 0, capacity 0, update_from = 1.7976931348623157e+308, update_until = 1.7976931348623157e+308,
  after = std::vector of length 0, capacity 0, active = true, options = std::set with 0 elements, restart = false, doCheckPoint = false, plumed = @0x0, log = @0x0,
  files = std::set with 0 elements, replica_index = 0, comm = @0x0, multi_sim_comm = @0x0, keywords = @0x0}

Giovanni Bussi

Jul 12, 2017, 3:55:26 AM
to plumed...@googlegroups.com
This is very weird... Which compiler are you using exactly? 

Compiling NAMD with -g3 should not be necessary; the error appears to be in PLUMED.

As an additional check, you might compile PLUMED with --debug or, even better, with "--debug --debug_glibcxx" (which I suspect only works with gcc). This will make it slower but will check all array bounds.
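
To illustrate what checking array bounds means in practice, here is a small sketch. It does not show what the PLUMED debug build does internally (that is not stated in this thread); it only demonstrates the idea using `std::vector::at`, which performs the check unconditionally. The function name `checkedRead` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration; not PLUMED code.
// Returns true and stores the element in `out` if `i` is a valid index,
// and returns false if the bounds check fails.
bool checkedRead(const std::vector<double>& v, std::size_t i, double& out) {
  try {
    out = v.at(i);  // at() checks the index and throws std::out_of_range
    return true;
  } catch (const std::out_of_range&) {
    return false;   // an unchecked v[i] here would be undefined behaviour
  }
}
```

An out-of-range access caught this way is reported at the point where it happens, rather than surfacing later as a segfault somewhere else.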

Giovanni



Giovanni Bussi

Jul 12, 2017, 3:58:54 AM
to plumed...@googlegroups.com
OK, maybe I found it: valgrind detects a problem on the master branch.

I will try to find a solution now. Thanks again!

Giovanni

Giovanni Bussi

Jul 12, 2017, 5:55:24 AM
to plumed...@googlegroups.com
Hi,

I may have fixed it. Could you please try branch fix-254 on GitHub?

Here is the change:


Giovanni

Snow Summer

Jul 12, 2017, 7:46:24 AM
to PLUMED users
Hi,

It fixes the crash, but the colvar value never updates.

In the COLVAR file the time step advances normally, but the distance colvar stays the same. I also tested DUMPATOMS, and the atom positions never change.

Thanks,


Snow Summer

Jul 25, 2017, 9:30:31 PM
to PLUMED users
Hi,

After the commit Fix 241 (#248) and this commit (fix-254):

void Atoms::createFullList(int*n) sets n to 104 (the total number of atoms) during initialization, but sets n to 2 (the atoms selected in plumed.dat) in the following steps.

Before the commit Fix 241 (#248) and this commit (fix-254):

void Atoms::createFullList(int*n) always sets n to 2 (the atoms selected in plumed.dat).

Is this the expected behavior?

