I am trying to run single-point DFT calculations on a rather large system (500 atoms). The calculation is very slow, and the output says "Extra Local Memory (stack+heap) needed for incore: 7242 Mbytes". My current input file specifies "heap 200 mb stack 1000 mb global 2800 mb", with "direct" in the dft section (which I thought would override incore...). The node has 128 GB of memory, and I'm running the calculation across all 24 CPUs of the node. I chose 4 GB per CPU based on a recommendation to use only 75% of the total memory (128 GB / 24 × 0.75 = 4 GB).
Do you have recommendations for how to shift the memory allocation to make this calculation run faster?
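
For reference, here is a small Python sketch of the arithmetic behind my current memory split, using only the numbers quoted above. The variable names are just illustrative and are not taken from any input or output file.

```python
# Per-CPU memory budget, using the numbers from the post.
node_memory_gb = 128      # total memory on the node
processes = 24            # one process per CPU (assumed)
usable_fraction = 0.75    # rule of thumb: use only 75% of the total

per_process_gb = node_memory_gb / processes * usable_fraction

# Current split from the input file, in MB.
heap_mb, stack_mb, global_mb = 200, 1000, 2800
total_mb = heap_mb + stack_mb + global_mb
local_mb = heap_mb + stack_mb  # the "stack+heap" part the message refers to

# Extra local memory the output reports as needed for incore.
incore_extra_mb = 7242

print(f"budget per CPU:       {per_process_gb:.2f} GB")
print(f"requested per CPU:    {total_mb} MB")
print(f"local (stack+heap):   {local_mb} MB")
print(f"extra for incore:     {incore_extra_mb} MB")
```

So the 7242 MB the output asks for is roughly six times my current stack+heap of 1200 MB, which is why I am unsure how best to rebalance heap, stack, and global within the 4 GB.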