Installing and running NWChem in the AWS clusters

Shuqiang Niu

Jun 17, 2022, 11:39:36 AM
to nwchem...@googlegroups.com
Hi,

I have installed NWChem on an AWS cluster following the "Installation instructions for the precompiled packages on Ubuntu Focal 20.04".

I have tested /usr/bin/nwchem on the login node, and it works well. How do I create and load an NWChem module in the Slurm script so that I can run NWChem on the multi-node cluster?

Thanks
Shuqiang

Shuqiang Niu

Jun 20, 2022, 1:07:36 PM
to nwchem...@googlegroups.com
Hi,

My question relates to odd behavior of NWChem when it runs on multiple nodes of the cluster.

If I use the Slurm script below to run NWChem on the cluster with different numbers of nodes, I get different outcomes.

******************************************************************************************
#!/bin/bash
#
#SBATCH --job-name=jobst1
#SBATCH --output=jobst1-%j.out
#SBATCH --error=jobst1-%j.err
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=10
#SBATCH --partition=regular

module purge

mpirun nwchem TiOH6-2-m06-dzvp2 > TiOH6-2-m06-dzvp2.out
 
exit
**********************************************************************************************

Although the calculations appear to run fine, the outcomes are not reproducible: with multiple nodes, I may get different results. Would creating and loading an NWChem module help solve this problem?

The following is an input example.

******************************************************************************************************
title "Ti2O4 model  ECP         "

start TiOH6-2-m06-dzvp2

echo

charge -2

geometry units angstroms
symmetry C1
Ti    7.197215000000     12.356000000000     10.655614000000
O     5.335286000000     12.157686000000     10.716351000000
H     5.118759000000     11.434242000000     10.347667000000
O     7.075484000000     12.897193000000      8.891045000000
H     6.451182000000     12.480796000000      8.512770000000
O     7.027445000000     14.110058000000     11.206510000000
H     6.403972000000     14.478761000000     10.781351000000
O     7.366985000000     10.601942000000     10.104719000000
H     7.990458000000     10.233239000000     10.529878000000
O     7.318946000000     11.814807000000     12.420184000000
H     7.943248000000     12.231204000000     12.798459000000
O     9.059144000000     12.554314000000     10.594877000000
H     9.275671000000     13.277758000000     10.963562000000
zcoord
  cvr_scaling 1.2
# bond  1   2
end
end

dft
   odft
   mult 1
   vectors input atomic output TiOH6-2-m06-dzvp2.movecs
   XC m06
   grid xfine
   iterations 200
   mulliken
end

driver
  maxiter 80
end

basis "ao basis" cartesian print
Ti  library "DZVP2 (DFT Orbital)"
H   library "DZVP2 (DFT Orbital)"
O   library "DZVP2 (DFT Orbital)"
END

task dft optimize
******************************************************************************************************************************

If I use one node (or the login node, or a workstation), the optimization goes toward an intermolecular hydrogen bond species, with these final results:

      ----------------------
      Optimization converged
      ----------------------


  Step       Energy      Delta E   Gmax     Grms     Xrms     Xmax   Walltime
  ---- ---------------- -------- -------- -------- -------- -------- --------
@   36   -1304.38794707  2.7D-06  0.00003  0.00001  0.00058  0.00131   2289.6
                                     ok       ok       ok       ok

If I use multiple nodes (for example, 3 nodes with 10 cores/node), the optimization goes toward an intramolecular hydrogen bond species, with these final results:

      ----------------------
      Optimization converged
      ----------------------


  Step       Energy      Delta E   Gmax     Grms     Xrms     Xmax   Walltime
  ---- ---------------- -------- -------- -------- -------- -------- --------
@   66   -1304.39789550 -8.9D-08  0.00003  0.00000  0.00051  0.00169   4995.3
                                     ok       ok       ok       ok

Any suggestions?

Best
Shuqiang

Edoardo Aprà

Jul 12, 2022, 6:29:26 PM
to NWChem Forum
Are these results completely deterministic? In other words, do you always get the same optimization sequence in the multi-node and single-node case, respectively?
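
One way to check this is to compare the optimization step lines (the ones starting with `@`, as in the output excerpts quoted earlier in the thread) between repeated runs. The following is only a minimal sketch: the function names `opt_trace` and `same_trace` and the output file names are made up for illustration, and it assumes each step line has the layout `@ <step> <energy> ...` shown above.

```shell
# Print "step energy" for every optimization step line of an NWChem output.
# Only lines whose first field is "@" and whose second field is a step
# number are kept, so any "@"-prefixed header lines are skipped.
opt_trace() {
    awk '$1 == "@" && $2 ~ /^[0-9]+$/ { print $2, $3 }' "$1"
}

# Report whether two runs followed the same sequence of step energies.
same_trace() {
    if [ "$(opt_trace "$1")" = "$(opt_trace "$2")" ]; then
        echo "identical"
    else
        echo "different"
    fi
}
```

For example, `same_trace run1.out run2.out` (hypothetical file names) on two multi-node runs of the same input would show whether the sequence is reproducible, and `opt_trace run1.out` shows at which step two runs start to diverge.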

Shuqiang Niu

Jul 13, 2022, 12:00:26 AM
to nwchem...@googlegroups.com
Hi Edo,

The initial geometry is a crystal structure. On a single-node machine with NWChem 7.0.2, I consistently get a local minimum with the octahedral structure (like the crystal one). I have also tested it on another cluster with NWChem 7.0.0, using different numbers of nodes and cores (submitting jobs through a Slurm script). Those results consistently converged to the same local minimum. On our AWS clusters, however, the optimizations appear to go at random toward one of two minima, especially with multiple nodes. The test results are below (1x30 means one node with 30 cores/node; "3-opt step E" means the total energy at the third optimization step).

Best
Shuqiang


[Attachment: image.png]





Edoardo Aprà

Jul 14, 2022, 12:55:41 PM
to NWChem Forum
Have you tried starting from the geometry that gives the -1304.39899 energy to see what energy you get in a single-node run?

Shuqiang Niu

Jul 14, 2022, 2:47:59 PM
to nwchem...@googlegroups.com
If I start from the geometry (stationary point) with the -1304.39899 energy, reoptimization on any computer reproduces the same result, with the -1304.39899 energy. Similarly, if I start from the geometry (stationary point) with the -1304.38795 energy, optimization on any computer reproduces the same result, with the -1304.38795 energy.

Best
Shuqiang


Edoardo Aprà

Jul 14, 2022, 8:56:57 PM
to NWChem Forum
My conclusion is that the multi-node execution has somewhat fortuitously discovered a lower energy structure that you should use.
You might want to re-do calculations with both geometries with a huge grid to see what the energy difference of the two structures becomes.
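
To put the energy difference in more familiar units, here is a minimal sketch that converts a difference between two total energies from hartree to kcal/mol. The helper name `delta_kcal` is made up for illustration, and it uses the approximate conversion 1 hartree = 627.509 kcal/mol.

```shell
# Energy difference (second minus first) in kcal/mol.
# Arguments are total energies in hartree; 627.509 kcal/mol per hartree
# is an approximate conversion factor.
delta_kcal() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (b - a) * 627.509 }'
}

# Final energies from the two converged optimizations quoted earlier in
# the thread (single-node first, multi-node second).
delta_kcal -1304.38794707 -1304.39789550   # prints -6.24
```

With the xfine-grid energies quoted in the thread, the multi-node structure is about 6.2 kcal/mol lower. The same helper can be applied to the energies from the huge-grid runs once they are available.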