Optimizing inputs: PDOS at the HSE06 level of theory and AIMD


Lorenzo Lagasco

Apr 3, 2026, 12:19:22 AM
to cp2k

Hello everyone,
I'm a new cp2k user and I have a couple of questions, not strictly related to each other:

  1. I have written an input file for a band structure calculation of a system consisting of delafossite CuAlO₂ with an organic dye (coumarin 343) anchored on its surface, at the HSE06 level of theory. However, the calculation is unusually slow: using 4 nodes with 52 processors each, it has not completed even the first SCF step after about 2 hours. Could I get feedback on how to optimize the input file, or on whether some parameters are set incorrectly? I have attached both the input file and the xyz file of the system geometry.
  2. At the same time, I am working on NVT CP2K dynamics at the PBE+D3 level of theory for aluminosilicate nanosheets in water of different sizes (the simulation boxes contain from about 719 to 900 atoms). Currently, the simulations proceed at about 25 seconds per MD step (using 1 node with 52 processors for each simulation). Since I need a trajectory of 60 ps with a timestep of 0.5 fs, is it possible to improve the computational performance? I am also attaching the input file for this case (md-GPW.inp).

Thanks for the help!
final-geom-opt.xyz
CuAlO2_HSE06.inp
md-GPW.inp

Frederick Stein

Apr 4, 2026, 5:18:21 AM
to cp2k
Dear Lorenzo,
It is highly recommended to perform hybrid calculations using the ADMM approach (https://manual.cp2k.org/trunk/CP2K_INPUT/FORCE_EVAL/DFT/AUXILIARY_DENSITY_MATRIX_METHOD.html and https://manual.cp2k.org/trunk/methods/dft/hartree-fock/admm.html). You will need to choose suitable fitting basis sets (BASIS_SET AUX_FIT <basis set name>) in the respective KIND sections, usually from the BASIS_ADMM or BASIS_ADMM_UZH files. This should accelerate your calculations significantly.
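As a rough sketch, the input changes could look like the fragment below. The specific fitting basis name (cFIT10) and the ADMM method/purification choices are illustrative assumptions; the correct names for your elements must be taken from the BASIS_ADMM or BASIS_ADMM_UZH files, and the ADMM options should be checked against the manual pages linked above.

```
&FORCE_EVAL
  &DFT
    BASIS_SET_FILE_NAME BASIS_MOLOPT
    BASIS_SET_FILE_NAME BASIS_ADMM       ! file providing the auxiliary fitting basis sets
    &AUXILIARY_DENSITY_MATRIX_METHOD
      METHOD BASIS_PROJECTION            ! example choice; see the manual for alternatives
      ADMM_PURIFICATION_METHOD NONE
    &END AUXILIARY_DENSITY_MATRIX_METHOD
  &END DFT
  &SUBSYS
    &KIND Cu
      BASIS_SET DZVP-MOLOPT-SR-GTH
      BASIS_SET AUX_FIT cFIT10           ! example name; pick the matching set from BASIS_ADMM
      POTENTIAL GTH-PBE-q11
    &END KIND
  &END SUBSYS
&END FORCE_EVAL
```

An analogous AUX_FIT line is needed in every KIND section (Al, O, C, H, N).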
Regards,
Frederick

Lorenzo Lagasco

Apr 4, 2026, 1:25:02 PM
to cp...@googlegroups.com
Good afternoon,
thanks for the information! In the meantime, I'd like to take the opportunity to ask whether you also have any advice regarding my second question.
Best regards,
Lorenzo Lagasco


Frederick Stein

Apr 4, 2026, 2:38:00 PM
to cp2k
Dear Lorenzo,
Sorry, I forgot about your second question. Others may correct me if necessary, since I can only guess. Can you run 2-3 MD steps and send us the timing report at the end of the CP2K output file? Do you run with all 52 cores or fewer? What hardware are you running on, and which software stack do you use? Which version of CP2K are you using?
Usually, standard DFT calculations are most performant when the number of MPI ranks is a square number and you use at most 2 OpenMP threads per MPI rank.
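For a hybrid MPI/OpenMP run, a launch script along these lines is one way to apply that advice. The node and core counts, the cp2k.psmp binary name, and the mpirun syntax are assumptions; adapt them to your cluster and scheduler. The command here is only echoed, not executed:

```shell
# Sketch of a hybrid MPI/OpenMP launch with 2 OpenMP threads per MPI rank.
# NODES, CORES_PER_NODE, and the binary name are placeholder assumptions.
NODES=2
CORES_PER_NODE=40
OMP_THREADS=2                                     # at most 2 threads per rank
RANKS=$(( NODES * CORES_PER_NODE / OMP_THREADS )) # 40 ranks in this example
echo "OMP_NUM_THREADS=${OMP_THREADS} mpirun -np ${RANKS} cp2k.psmp -i md-GPW.inp -o output-md.out"
```

With 52-core nodes the same arithmetic gives non-square rank counts, so it can pay off to leave a few cores idle to reach a square number (e.g. 49 or 100 ranks total).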
Best,
Frederick

Lorenzo Lagasco

Apr 6, 2026, 2:42:29 AM
to cp...@googlegroups.com
Good evening,
attached are the input and output of my AIMD calculation. In this case, I used 4 nodes with 16 processors each (the nodes with 52 processors are not available at the moment). I'm using version 2025.1 on a cluster (CPU model: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz).

Best regards
Lorenzo Lagasco

output-md.out
md-GPW.inp

Marcella Iannuzzi

Apr 6, 2026, 5:24:36 AM
to cp2k

Hi,

As Frederick mentioned, for a proper analysis one would need the timing report at the end of the output.
You could speed up the calculation by increasing EPS_DEFAULT (the default is 10^-10).
I would also set CALCULATE_C9_TERM to .FALSE.
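In input-file terms, the two suggestions above would touch the sections sketched below. The exact values, the REFERENCE_FUNCTIONAL, and the parameter file name are assumptions based on a typical PBE+D3 setup; only adopt them after checking against your own input:

```
&DFT
  &QS
    EPS_DEFAULT 1.0E-10        ! larger (looser) values are faster, at some cost in accuracy
  &END QS
  &XC
    &VDW_POTENTIAL
      POTENTIAL_TYPE PAIR_POTENTIAL
      &PAIR_POTENTIAL
        TYPE DFTD3
        REFERENCE_FUNCTIONAL PBE
        PARAMETER_FILE_NAME dftd3.dat
        CALCULATE_C9_TERM .FALSE.   ! skip the three-body C9 dispersion term
      &END PAIR_POTENTIAL
    &END VDW_POTENTIAL
  &END XC
&END DFT
```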
Kind regards
Marcella

Lorenzo Lagasco

Apr 6, 2026, 6:23:18 AM
to cp...@googlegroups.com
Good morning,
my bad; I've now attached both input and output files (with the final timing report). In this case, I repeated the calculation using 2 nodes with 40 processors each.
Thanks for your patience.

Best regards
Lorenzo Lagasco

md-GPW.inp
output-md.out

Frederick Stein

Apr 6, 2026, 10:40:01 AM
to cp2k
Dear Lorenzo,
Your timing report reveals that grid operations take about 50% of your runtime. As Marcella already suggested, you may try to increase EPS_DEFAULT. In addition, you may try to reduce the CUTOFF or increase the number of grids (NGRIDS). Please check that the accuracy still matches your expectations. Finally, you may also try 2 OpenMP threads per MPI rank, reducing the number of MPI ranks accordingly to decrease communication.
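The grid settings live in the MGRID section; a sketch of the knobs mentioned above is below. The numbers are illustrative assumptions, not recommendations for this system; any change to CUTOFF should be re-converged against energies and forces:

```
&DFT
  &MGRID
    CUTOFF 500         ! Ry; lowering this speeds up grid operations but reduces accuracy
    REL_CUTOFF 50      ! controls which grid each Gaussian is mapped to
    NGRIDS 5           ! more grid levels push diffuse Gaussians onto coarser grids
  &END MGRID
&END DFT
```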
Best regards,
Frederick

Lorenzo Lagasco

Apr 7, 2026, 3:41:09 AM
to cp...@googlegroups.com
Ok,
thanks again for the helpful hints!

Best regards
Lorenzo Lagasco
