Hi Eric,
Thanks for the response. I tested this as you suggested in your example. My nodes have 56 processors each, and I ran on two nodes and compared the time per step with what I get on one node. I tested the following settings:
1 MPI 112 Threads
2 MPI 56 Threads
4 MPI 28 Threads
8 MPI 14 Threads
16 MPI 7 Threads
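
For reference, each setting above is a way of splitting the 112 processors (2 nodes x 56) between MPI processes and threads. Below is a minimal Python sketch that enumerates such splits. The function name `rank_thread_splits` is made up for illustration and is not part of CP2K. It assumes the threads are OpenMP threads, which share memory inside one MPI process and so cannot span nodes, and it flags splits whose MPI processes do not divide evenly across the nodes.

```python
def rank_thread_splits(nodes, cores_per_node):
    """List (mpi_ranks, threads_per_rank, even) splits that use every core once.

    `even` is True when the ranks divide evenly across the nodes, so that
    each rank and all of its threads fit on a single node.
    """
    total_cores = nodes * cores_per_node
    splits = []
    for ranks in range(1, total_cores + 1):
        if total_cores % ranks != 0:
            continue  # this rank count would leave cores unused
        threads = total_cores // ranks
        # Assumption: one rank lives on one node, so ranks must split evenly.
        even = ranks % nodes == 0
        splits.append((ranks, threads, even))
    return splits


if __name__ == "__main__":
    for ranks, threads, even in rank_thread_splits(nodes=2, cores_per_node=56):
        note = "" if even else "  (ranks do not divide evenly across nodes)"
        print(f"{ranks:>3} MPI x {threads:>3} threads{note}")
```

Under that assumption, the 1 MPI x 112 threads case gets flagged, since a single MPI process would sit on one node and could not spread its threads over both.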
I get the best performance with 4 or more MPI processes, but this is still slower than simply using one node with 56 MPI processes. Thanks for the suggestion though.
I'm not an experienced CP2K user; I'm just using an input file my colleague gave me to test the installation I'm helping with. I would expect a system of 100 water molecules and a single Au atom to still see some speedup on two nodes, but I might be wrong. Any other thoughts?