Too slow for 1008 Ni atoms, 20 or more minutes per step

yell...@gmail.com

Apr 10, 2017, 4:45:34 AM
to cp2k

Dear cp2k users and developers,

I am trying to run ab initio molecular dynamics on a system of 1008 Ni atoms with the PBE functional. The run with 1*24 MPI tasks is too slow, around 80 steps per day on average; the run with 3*24 MPI tasks is even slower, 38 steps per day on average.
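
For scale, the throughput figures above convert to wall time per MD step as follows. This is plain arithmetic on the reported numbers, assuming a full 24-hour day of uninterrupted running:

```latex
t_{\text{step}} = \frac{86400\ \text{s/day}}{N_{\text{steps/day}}}, \qquad
\frac{86400}{80} = 1080\ \text{s} = 18\ \text{min}, \qquad
\frac{86400}{38} \approx 2274\ \text{s} \approx 37.9\ \text{min}
```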

Any help is highly appreciated.

 

Best regards,

 Huang


CP2K version 4.1

 SVN source code revision svn:17462

 cp2kflags: fftw3 parallel mpi2 scalapack

composer_xe_2015.2.164

Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Haswell


# Step Nr.    Time[fs]    Kin.[a.u.]         Temp[K]         Pot.[a.u.]     Cons Qty[a.u.]      UsedTime[s]
         0    0.000000   9.566949152  2000.000000000  -170544.785566084  -170535.208832257      0.000000000
         1    1.000000   9.646234174  2016.574776405  -170554.174550500  -170544.519479220  13708.945765018
         2    2.000000   9.903391164  2070.334232377  -170556.710812560  -170546.798982374   1826.637025118
         3    3.000000  10.258759808  2144.625134870  -170557.585906167  -170547.318873932    748.965154886
         4    4.000000  10.622230399  2220.609774417  -170557.531726730  -170546.901213604    856.725823164


&GLOBAL

  ! limit the runs to 5min

  ! WALLTIME 1000

  ! reduce the amount of IO

  IOLEVEL  LOW

  ! the project name is made part of most output files... useful to keep order 

  PROJECT Ni-1008-2k

  ! various runtypes (energy, geo_opt, etc.) available.

  RUN_TYPE MD             

&END GLOBAL


&FORCE_EVAL

  STRESS_TENSOR ANALYTICAL

  ! the electronic structure part of CP2K is named Quickstep

  METHOD Quickstep

  &DFT

    ! basis sets and pseudopotential files can be found in cp2k/data

    BASIS_SET_FILE_NAME BASIS_SET

    POTENTIAL_FILE_NAME GTH_POTENTIALS            


    ! Charge and multiplicity

    CHARGE 0

    MULTIPLICITY 1


    &MGRID

       ! PW cutoff ... depends on the element (basis) too small cutoffs lead to the eggbox effect.

       ! certain calculations (e.g. geometry optimization, vibrational frequencies,

       ! NPT and cell optimizations, need higher cutoffs)

       CUTOFF [Ry] 300 !500 

    &END


    &QS

       ! use the GPW method (i.e. pseudopotential based calculations with the Gaussian and Plane Waves scheme).

       METHOD GPW 

       ! default threshold for numerics ~ roughly numerical accuracy of the total energy per electron,

       ! sets reasonable values for all other thresholds.

       EPS_DEFAULT 1.0E-7 !10 

       ! used for MD, the method used to generate the initial guess.

       EXTRAPOLATION ASPC 

    &END


    &POISSON

       PERIODIC XYZ ! the default, gas phase systems should have 'NONE' and a wavelet solver

    &END


!    &PRINT

!       ! at the end of the SCF procedure generate cube files of the density

!       &E_DENSITY_CUBE OFF

!       &END E_DENSITY_CUBE

!       ! compute eigenvalues and homo-lumo gap every 20th MD step

!       &MO_CUBES

!          ! compute 4 unoccupied orbital energies

!          NLUMO 4

!          NHOMO 4

!          ! but don't write the cube files

!          WRITE_CUBE .FALSE.

!          ! do this every 20th MD step.

!          &EACH

!            MD 20

!          &END

!       &END

!    &END


    ! use the OT METHOD for robust and efficient SCF, suitable for all non-metallic systems.

    &SCF                              

      SCF_GUESS ATOMIC ! can be used to RESTART an interrupted calculation

      MAX_SCF 50

      EPS_SCF 1.0E-4 ! accuracy of the SCF procedure typically 1.0E-6 - 1.0E-7

      ! do not store the wfn during MD

      &PRINT

        &RESTART OFF

        &END

      &END

      

      &OT

        ! an accurate preconditioner suitable also for larger systems

        PRECONDITIONER FULL_SINGLE_INVERSE

        ! DIIS can be faster than CG (the most robust choice), but is not as stable.

        MINIMIZER DIIS

      &END OT

      &OUTER_SCF ! repeat the inner SCF cycle up to MAX_SCF times

        MAX_SCF 20

        EPS_SCF 1.0E-4 ! must match the above

      &END

    &END SCF


    ! specify the exchange and correlation treatment

    &XC

      ! use a PBE functional 

      &XC_FUNCTIONAL 

         &PBE

         &END

      &END XC_FUNCTIONAL

    &END XC

  &END DFT

 

  ! description of the system

  &SUBSYS

    &CELL 

      ! unit cells that are orthorhombic are more efficient with CP2K

      ABC [angstrom] 21.98167992 21.98167992 25.64529419

    &END CELL


    ! atom coordinates can be in the &COORD section,

    ! or provided as an external file.

    &TOPOLOGY

      COORD_FILE_NAME Ni-1008-2k.xyz

      COORD_FILE_FORMAT XYZ

    &END


    ! MOLOPT basis sets are fairly costly,

    ! but in the 'DZVP-MOLOPT-SR-GTH' available for all elements

    ! their contracted nature makes them suitable

    ! for condensed and gas phase systems alike.

    &KIND Ni                              

      BASIS_SET DZV-GTH-PADE        

      POTENTIAL GTH-PBE-q18             

    &END KIND

  &END SUBSYS

&END FORCE_EVAL


! how to propagate the system, selection via RUN_TYPE in the &GLOBAL section

&MOTION

! &GEO_OPT

!   OPTIMIZER LBFGS ! Good choice for 'small' systems (use LBFGS for large systems)

!   MAX_ITER  100

!   MAX_DR    [bohr] 0.003 ! adjust target as needed

!   &BFGS

!   &END

!  &END

 &MD

   ENSEMBLE NPT_I  ! isothermal-isobaric ensemble with isotropic cell fluctuations

   TEMPERATURE [K] 2000

   TIMESTEP [fs] 1

   STEPS 1000000

   # GLE thermostat as generated at http://epfl-cosmo.github.io/gle4md 

   # GLE provides an effective NVT sampling.

   &BAROSTAT

       PRESSURE 1.0

   &END BAROSTAT

   &THERMOSTAT

      &NOSE

      &END NOSE

   &END THERMOSTAT

 &END MD

 &PRINT

   &TRAJECTORY

     &EACH

       MD 100

     &END EACH

   &END TRAJECTORY

   &VELOCITIES OFF

   &END VELOCITIES

   &FORCES OFF

   &END FORCES

   &RESTART_HISTORY

     &EACH

       MD 200

     &END EACH

   &END RESTART_HISTORY

   &RESTART

     BACKUP_COPIES 3

     &EACH

       MD 200

     &END EACH

   &END RESTART

  &END PRINT

&END

zhj...@gmail.com

Apr 10, 2017, 5:45:00 AM
to cp...@googlegroups.com
1. The system (more than 1000 atoms) is very large.
2. Setting PREFERRED_DIAG_LIBRARY ELPA in the &GLOBAL section of the CP2K input can speed up the simulation, but you must install the ELPA library before using it.
3. The cutoff may be too small; larger values will make the run slower.
4. EPS_SCF is too large; tighten it to a smaller value (for example, 4.0E-7), although the simulation will then be slower.
5. Starting from a relaxed structure can speed up convergence.
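
A minimal sketch of point 2, reusing the &GLOBAL section from the original post. It assumes the CP2K binary has been built with ELPA support; the cp2kflags line in the original post (fftw3 parallel mpi2 scalapack) lists no elpa entry, so that binary would probably need to be rebuilt first.

```
&GLOBAL
  PROJECT Ni-1008-2k
  RUN_TYPE MD
  IOLEVEL  LOW
  ! assumption: CP2K was compiled and linked against the ELPA library
  PREFERRED_DIAG_LIBRARY ELPA
&END GLOBAL
```

This setting only affects runs that actually diagonalize the Kohn-Sham matrix, so it matters once the SCF is switched from OT to diagonalization.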

Marcella Iannuzzi

Apr 11, 2017, 3:58:18 AM
to cp2k
Dear Huang,

As already pointed out, you have to improve the computational settings and provide enough computational resources.
If the system is metallic, you need to use standard diagonalization as the optimisation scheme, together with smearing of the occupation numbers and a proper mixing scheme.
The ELPA library for diagonalisation, which replaces the corresponding ScaLAPACK routines, is to be preferred because of its significantly better performance.
The orbital transformation (OT) method is not going to work for metallic systems. In any case, when the SCF is not well converged, the electronic structure and the forces are going to be wrong. What happens then is totally out of control.
Kind regards
Marcella
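
The setup described in this reply could be sketched as the &SCF section below, which would replace the OT-based one in the original input (the &OT and &OUTER_SCF subsections are removed). All numerical values, including ADDED_MOS, the electronic temperature and the mixing parameters, are illustrative assumptions rather than settings given in the thread, and would need to be tested for this system.

```
&SCF
  SCF_GUESS ATOMIC
  MAX_SCF 200                 ! illustrative; diagonalization usually needs more iterations than OT
  EPS_SCF 1.0E-6              ! tighter than the 1.0E-4 of the original input
  ADDED_MOS 500               ! illustrative; smearing needs some unoccupied orbitals
  &DIAGONALIZATION
    ALGORITHM STANDARD        ! standard diagonalization instead of OT
  &END DIAGONALIZATION
  &SMEAR ON
    METHOD FERMI_DIRAC
    ELECTRONIC_TEMPERATURE [K] 2000   ! illustrative value
  &END SMEAR
  &MIXING
    METHOD BROYDEN_MIXING
    ALPHA 0.1                 ! illustrative mixing fraction
    NBROYDEN 8                ! illustrative history length
  &END MIXING
  &PRINT
    &RESTART OFF
    &END
  &END
&END SCF
```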

yell...@gmail.com

Apr 12, 2017, 5:17:19 AM
to cp2k

Thank you very much for your swift reply.

The job was waiting in the queue yesterday. The time per step is now clearly reduced.

But the run gets slower as it proceeds, because the number of SCF cycles per step increases.

 

# Step Nr.    Time[fs]    Kin.[a.u.]         Temp[K]         Pot.[a.u.]     Cons Qty[a.u.]      UsedTime[s]
         0    0.000000   9.566949152  2000.000000000  -170530.032653392  -170520.455919565      0.000000000
         1    1.000000   9.631592301  2013.513848184  -170530.197758967  -170520.559182878   3368.403749943
         2    2.000000   9.844516264  2058.026254206  -170530.614774376  -170520.760553775    119.362696171
         3    3.000000  10.134210226  2118.587663569  -170531.251818051  -170521.099615591    140.031235933

       103  103.000000   5.131475709  1072.750701983  -170544.302114205  -170530.046201117    479.899127960
       104  104.000000   5.085703611  1063.181904658  -170544.285813199  -170530.018294504    466.044290066
       105  105.000000   5.031751308  1051.903010733  -170544.056415561  -170529.785912487    447.755058050

       122  122.000000   6.246773076  1305.907029920  -170546.291929917  -170529.923306198    866.148313046
       123  123.000000   6.724651351  1405.808945809  -170545.774218367  -170528.879454790    852.673817873
       124  124.000000   7.087707135  1481.706868681  -170545.226813770  -170527.918999875    866.536659002

 

Is the following output normal?

 

*** SCF run converged in     1 steps ***

  Leaving inner SCF loop after reaching    20 steps.

 

Best regards,

Huang


On Monday, April 10, 2017 at 5:45:00 PM UTC+8, zhj...@gmail.com wrote:

yell...@gmail.com

Apr 12, 2017, 5:22:42 AM
to cp2k

Dear Marcella,

 

Thank you very much.

I have added smearing and mixing as you advised. Which basis set file is suitable for my system, BASIS_MOLOPT or BASIS_SET?

 

Best regards,

Huang
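
For reference, switching to the MOLOPT basis named in the comments of the original input would involve roughly the two changes sketched below. This only illustrates the mechanics and is not a recommendation made in the thread; it assumes the BASIS_MOLOPT file shipped in cp2k/data contains a DZVP-MOLOPT-SR-GTH entry for Ni that matches the q18 pseudopotential.

```
! in &DFT: point to the MOLOPT basis set file
BASIS_SET_FILE_NAME BASIS_MOLOPT
POTENTIAL_FILE_NAME GTH_POTENTIALS

! in &SUBSYS: select the basis for Ni
&KIND Ni
  BASIS_SET DZVP-MOLOPT-SR-GTH   ! assumption: entry exists for Ni in BASIS_MOLOPT
  POTENTIAL GTH-PBE-q18
&END KIND
```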


On Tuesday, April 11, 2017 at 3:58:18 PM UTC+8, Marcella Iannuzzi wrote: