Go into your BIOS and disable hyper-threading. In my experience it does not help run time, whether for small or big jobs.
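If you want to confirm the BIOS change took effect, a minimal sketch along these lines may help. It assumes a Linux machine with a reasonably recent kernel that exposes the SMT state at `/sys/devices/system/cpu/smt/active`; on other systems the file will not exist and the function just reports that it cannot tell.

```python
from pathlib import Path

# Assumption: Linux sysfs exposes the SMT (hyper-threading) state at this path.
SMT_ACTIVE = Path("/sys/devices/system/cpu/smt/active")


def smt_active(path=SMT_ACTIVE):
    """Return True/False for the SMT state, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        # File missing or unreadable (non-Linux system, older kernel, etc.).
        return None


if __name__ == "__main__":
    messages = {
        True: "hyper-threading is ON",
        False: "hyper-threading is OFF",
        None: "SMT state unknown on this system",
    }
    print(messages[smt_active()])
```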
In my experience, MPI consistently gives better run times than OpenMP.
Having 18 cores doesn't guarantee much of a speedup. From my limited experience, a 12-core machine is usually sufficient to be treated as one node, and it usually comes as two sockets with 6 cores each. Filling your sockets will not actually help improve the run time.
The best strategy I have is trial and error. If I have a large model, I set it to run for only around 2-5 s and test different settings: the number of MPI processes, the number of nodes, and the number of OpenMP threads. Generally, for a large model, always set OMP_NUM_THREADS=1. Whichever setting gives the better run time is the one I choose. You might lose around 20-30 min setting this up, but it will gain you hours, if not days, if the model is quite large.
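The trial-and-error loop above can be scripted. Below is a rough sketch, not a definitive tool: it times one short run per combination of MPI process count and `OMP_NUM_THREADS` and sorts the results fastest first. The launch line in `example_cmd` is an assumption (`./solver` and `short_test.inp` are placeholders, and `mpirun -np` is the usual Open MPI / MPICH form), so swap in whatever command your solver and scheduler actually use.

```python
import os
import subprocess
import time


def time_run(cmd, omp_threads=1):
    """Run cmd once with OMP_NUM_THREADS set; return wall time in seconds."""
    env = dict(os.environ, OMP_NUM_THREADS=str(omp_threads))
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)  # raises if the run fails
    return time.perf_counter() - start


def sweep(make_cmd, rank_counts, thread_counts=(1,)):
    """Time every (ranks, threads) pair; return (seconds, ranks, threads) sorted fastest first."""
    results = []
    for ranks in rank_counts:
        for threads in thread_counts:
            elapsed = time_run(make_cmd(ranks), omp_threads=threads)
            results.append((elapsed, ranks, threads))
    return sorted(results)


def example_cmd(ranks):
    # Hypothetical launch line: "./solver" and "short_test.inp" are placeholders.
    return ["mpirun", "-np", str(ranks), "./solver", "short_test.inp"]


# Example use, with the model shortened to a few seconds first:
#   for elapsed, ranks, threads in sweep(example_cmd, (4, 6, 12), (1, 2)):
#       print(f"{ranks:3d} MPI x {threads} OMP: {elapsed:.1f} s")
```

Each combination is timed only once here, so if two settings come out close, it is worth repeating them before picking one.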