Improve extrapolation between relaxation steps?

Nicholas Winner

Nov 16, 2021, 12:58:59 PM

to cp2k

Hello all,

When I run a relaxation for a complex system, in my case a defective transition metal oxide with hybrid DFT on a modest ~200-atom supercell, each SCF still takes quite a long time to converge.

I originally came from VASP before switching to CP2K, and in that code my experience was that once the first SCF loop converged, the next relaxation step would take much less time, and so on until the last few steps converged in just a few iterations. This isn't my experience with CP2K, though. Running CG on my most recent relaxation, I still find that after more than 20 relaxation steps, 40-50 SCF steps are required to achieve convergence.

Is this something that I could improve by changing certain settings? Below is the relevant info for my input file. Please note that I understand FULL_KINETIC and CG might lead to slower convergence than FULL_ALL + DIIS, but my question lies in why convergence is not appreciably speeding up rather than its absolute speed.

-Nick

&DFT
  BASIS_SET_FILE_NAME BASIS_MOLOPT
  BASIS_SET_FILE_NAME BASIS_MOLOPT_UCL
  BASIS_SET_FILE_NAME BASIS_ADMM
  BASIS_SET_FILE_NAME BASIS_ADMM_MOLOPT
  POTENTIAL_FILE_NAME GTH_POTENTIALS
  UKS T
  MULTIPLICITY 0
  CHARGE 4
  &SCF
    MAX_ITER_LUMO 400
    MAX_SCF 50
    EPS_SCF 1e-06
    SCF_GUESS RESTART
    &OT
      ALGORITHM STRICT
      MINIMIZER CG
      LINESEARCH 2PNT
      PRECONDITIONER FULL_KINETIC
      ENERGY_GAP 0.01
      ROTATION F
      OCCUPATION_PRECONDITIONER F
    &END OT
    &OUTER_SCF
      EPS_SCF 1e-06
      MAX_SCF 20
    &END OUTER_SCF
  &END SCF
  &AUXILIARY_DENSITY_MATRIX_METHOD
    ADMM_PURIFICATION_METHOD None
    METHOD BASIS_PROJECTION
  &END AUXILIARY_DENSITY_MATRIX_METHOD
  &QS
    EPS_DEFAULT 1e-12
    EPS_PGF_ORB 1e-16
    EXTRAPOLATION PS
    METHOD GPW
  &END QS
  &MGRID
    NGRIDS 5
    CUTOFF 550.0
    REL_CUTOFF 50.0
  &END MGRID
  &XC
    &XC_FUNCTIONAL NO_SHORTCUT
      &PBE
        SCALE_X 0.0
        SCALE_C 1.0
      &END PBE
      &XWPBE
        SCALE_X 0.063
        SCALE_X0 0.877
        OMEGA 0.07
      &END XWPBE
      &PBE_HOLE_T_C_LR
        SCALE_X 0.123
        CUTOFF_RADIUS 5.930000000000001
      &END PBE_HOLE_T_C_LR
    &END XC_FUNCTIONAL
    &HF
      FRACTION 1.0
      &SCREENING
        EPS_SCHWARZ 1e-07
        EPS_SCHWARZ_FORCES 1e-06
        SCREEN_P_FORCES T
        SCREEN_ON_INITIAL_P T
      &END SCREENING
      &INTERACTION_POTENTIAL
        POTENTIAL_TYPE MIX_CL_TRUNC
        OMEGA 0.07
        SCALE_COULOMB 0.06
        SCALE_LONGRANGE 0.063
        CUTOFF_RADIUS 5.930000000000001
        T_C_G_DATA t_c_g.dat
      &END INTERACTION_POTENTIAL
      &LOAD_BALANCE
        RANDOMIZE T
      &END LOAD_BALANCE
      &MEMORY
        EPS_STORAGE_SCALING 0.1
        MAX_MEMORY 2000
      &END MEMORY
    &END HF
  &END XC
&END DFT

Matt Watkins

Nov 17, 2021, 1:06:33 PM

to cp2k

Well, CP2K was set up to do Born-Oppenheimer MD for bio systems, so for wide-gap insulators you should for sure only need a few SCF cycles to converge.

Is your system insulating (band gap, let's say, > 0.2 eV)? If not, a different approach is needed.

If so, improve the OT setup, especially the preconditioner:

remove

PRECONDITIONER FULL_KINETIC
ENERGY_GAP 0.01

try

PRECONDITIONER FULL_SINGLE_INVERSE

or (can be expensive for larger systems)

PRECONDITIONER FULL_ALL

and leave out the energy_gap (if you are using an up-to-date cp2k)

Also check that your XC setup is correct ...

Matt
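
For reference, a minimal sketch of the &OT section with the suggestion above applied. It assumes every other keyword from the originally posted input is kept as it was; FULL_ALL appears only as a commented-out alternative, and nothing here has been benchmarked on this system.

```
&OT
  ALGORITHM STRICT
  MINIMIZER CG
  LINESEARCH 2PNT
  ! FULL_KINETIC and ENERGY_GAP 0.01 have been removed, as suggested above
  PRECONDITIONER FULL_SINGLE_INVERSE
  ! Alternative (can be expensive for larger systems):
  ! PRECONDITIONER FULL_ALL
  ROTATION F
  OCCUPATION_PRECONDITIONER F
&END OT
```

The number of SCF steps per geometry step can then be compared against the earlier run to see whether the preconditioner change alone makes a difference.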

Matt Watkins

Nov 17, 2021, 1:08:02 PM

to cp2k

'so for wide gap insulators you should for sure only need a few SCF cycles to converge'

I meant to write when fairly relaxed / close to the minimum ...

Nicholas Winner

Nov 17, 2021, 1:25:06 PM

to cp2k

It is a large-band-gap material (TiO2), but it has a shallow defect state, making the overall band gap small (~0.13 eV).

I know OT works better the wider the gap, but I had hoped that after the first SCF, it would still speed up quite a bit for the subsequent relaxation steps.

I was trying to use FULL_KINETIC because IRAC+Rotations+occupation_preconditioner together require that. This has the advantage of making defect states more robust by allowing for fractional occupations, but of course it can severely compromise convergence speed.

When I switch to the FULL_SINGLE_INVERSE preconditioner, I can grep the output and get:

outer SCF loop converged in 3 iterations or 104 steps
outer SCF loop converged in 2 iterations or 72 steps
outer SCF loop converged in 3 iterations or 116 steps
outer SCF loop converged in 2 iterations or 72 steps
outer SCF loop converged in 2 iterations or 93 steps
outer SCF loop converged in 2 iterations or 72 steps
outer SCF loop converged in 2 iterations or 63 steps
outer SCF loop converged in 2 iterations or 60 steps

It doesn't look like it's speeding up very much. I suppose 60 is better than 100, but it's not what I expected.

If this is just the way it is, then I'll live with it.

Matt Watkins

Nov 18, 2021, 3:34:57 AM

to cp2k

I've not played with smearing / OT; maybe someone else can suggest a good setup and say whether it is recommended.

If there is a modest gap I would expect to get convergence in ~10 DIIS / 20 CG SCF steps after a few relaxation steps.

Matt

Marcella Iannuzzi

Nov 18, 2021, 4:17:56 AM

to cp2k

Dear Nick

Since this is possibly a spin-polarised, charged, small-gap system, it is not that surprising that the SCF has trouble.

Does the total energy fluctuate significantly among the optimisation steps?

Have you tried to bias the magnetisation of the atoms close to the defect to help the SCF find solutions for the real-space distribution?

It can happen that the SCF gets trapped in a wrong minimum, and then convergence becomes unstable. Have you checked whether the charge distribution you get at any of those SCF steps is what you would expect?

Best

Marcella
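
One way the magnetisation bias mentioned above could be set up, sketched under assumptions: the kind name Ti_def is hypothetical, the value 1.0 is purely illustrative, and the atoms next to the defect would have to be relabelled to that kind in the coordinates. MAGNETIZATION in &KIND acts on the atomic initial guess, so this assumes the first step is started with SCF_GUESS ATOMIC rather than RESTART.

```
&SUBSYS
  ! Hypothetical extra kind for the Ti atoms next to the defect.
  ! The matching atoms in &COORD would be relabelled from Ti to Ti_def.
  &KIND Ti_def
    ELEMENT Ti
    ! Initial spin bias for the atomic guess; illustrative value only
    MAGNETIZATION 1.0
    ! BASIS_SET, BASIS_SET AUX_FIT and POTENTIAL: same as the regular Ti kind
  &END KIND
&END SUBSYS
```

After the SCF converges, the spin moments on those sites can be checked against the expected charge distribution, as asked above.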

751013040

Nov 18, 2021, 4:18:07 AM

to Marcella Iannuzzi

Hello, I have received your email. Thank you.

Nicholas Winner

Nov 18, 2021, 1:03:05 PM

to cp2k

There are not many energy fluctuations. The printed FORCE_EVAL energies (below) decrease fairly consistently, with only one CG step moving away from the minimum. I have not tried biasing the magnetization. Convergence does not seem to be unstable; it is very consistent. It's just much slower than I would expect for a proper wfn extrapolation, considering that the energy is changing by a small amount and the spin moment/charge on the site is not changing dramatically between steps.

It is possible, and has been pointed out to me, that the issue is the combination of (a) a long-range HF functional, (b) a shallow state, and (c) a reasonably tight screening of EPS_SCHWARZ 1e-7. The very small integrals that this introduces may cause the condition number to go up, therefore making the calculation slower.

ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.957685447298900
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.962799617469500
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.968174215491672
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.969982458291270
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.974875112599875
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.977152106254835
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.983613254400552
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.987573339468327
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.975501539662218
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.980535609151048
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.985348311781308
ENERGY| Total FORCE_EVAL ( QS ) energy [a.u.]: -6495.988025124417618

Marcella Iannuzzi

Nov 19, 2021, 2:48:40 AM

to cp2k

Hi Nick

Actually, EPS_SCHWARZ 1e-7 is not very tight. It may be that too many integrals are being screened out instead.

The ASPC extrapolation methods might help.
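
A minimal sketch of what switching to ASPC could look like in the &QS section of the posted input. The EXTRAPOLATION_ORDER value is only an illustrative starting point and is an assumption, not something tested on this system.

```
&QS
  EPS_DEFAULT 1e-12
  EPS_PGF_ORB 1e-16
  ! Replaces EXTRAPOLATION PS from the original input
  EXTRAPOLATION ASPC
  ! Illustrative value; worth varying to see its effect on the SCF step count
  EXTRAPOLATION_ORDER 3
  METHOD GPW
&END QS
```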

You could also try diagonalization+mixing+smearing with a high smearing temperature, but this is in general slower than OT.
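
A rough sketch of a diagonalization+mixing+smearing setup that would replace the &OT section in &SCF. All numerical values (ADDED_MOS, ALPHA, NBROYDEN, the temperature, MAX_SCF) are illustrative assumptions and would need tuning for this system.

```
&SCF
  SCF_GUESS RESTART
  EPS_SCF 1e-06
  ! Illustrative limit; diagonalization usually needs more iterations than OT
  MAX_SCF 200
  ! Empty orbitals are needed for fractional occupations; illustrative count
  ADDED_MOS 50
  &DIAGONALIZATION
    ALGORITHM STANDARD
  &END DIAGONALIZATION
  &MIXING
    METHOD BROYDEN_MIXING
    ALPHA 0.1
    NBROYDEN 8
  &END MIXING
  &SMEAR ON
    METHOD FERMI_DIRAC
    ! "High" smearing temperature; illustrative value only
    ELECTRONIC_TEMPERATURE [K] 1000
  &END SMEAR
&END SCF
```

The &OT and &OUTER_SCF sections are left out here because OT and diagonalization are alternative SCF strategies.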