Spin contamination error using the same input file but different computational nodes

葉哲和

Dec 7, 2022, 10:42:38 AM
to molpro-user
Dear Molpro users/experts,

I am currently running some CASSCF calculations with Molpro 2022.3.0.
Our research group used Molpro 2012 for a long time and decided to upgrade to the 2022 version this summer. We have a cluster consisting of several compute nodes to which jobs can be submitted.

With either Molpro 2022.3.0 or 2022.2.2 we face the same problem: when we submit the job to one of our nodes (called Dell02), a spin contamination message almost always appears during the CASSCF iterations, and the calculation never gets anywhere near convergence. When we send the same input file to one of our other nodes (Dell08), it works as it should (that is, as it did in the 2012 version). I read through this forum, and some posts attribute this to a version issue, but I don't think that is the case here.

Could someone please help? Many many thanks.

I attach three files here: the input file and two output files. test_dell02.out is the spin-contaminated one and test_dell08.out is the OK one.
Thanks again.

P.S. Both nodes run the same 64-bit Linux distribution with the same setup. I don't know much about the hardware and OS, but I am giving what I think is important here.

Best of the best,
Che-Ho Yeh
Department of Applied Chemistry,
National Yang-Ming Chiao-Tung University,
Hsinchu City, Taiwan
test_dell02.out
test.inp
test_dell08.out

tibo...@gmail.com

Dec 7, 2022, 1:28:48 PM
to molpro-user
Dear Che-Ho,

This is a case where hardware differences between the nodes could have an impact, if they have a sufficiently different CPU, because linear algebra libraries often choose different code paths based on the capabilities of the CPU.
Could you run something like the lscpu command on both of these nodes? That would instantly tell us if they have different CPUs.
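
If the two lscpu outputs are long, a small script can make the comparison easier. The following is only a sketch, assuming each output was saved to a text file (for example with `lscpu > dell02.txt`) and that it contains the usual `Flags:` line; the function names `parse_lscpu_flags` and `diff_flags` are made up for this illustration.

```python
def parse_lscpu_flags(lscpu_text):
    """Return the set of CPU feature flags found in the text output of `lscpu`.

    Assumes the output has a line of the form "Flags: fpu vme ... avx2 ...".
    Returns an empty set if no such line is present.
    """
    for line in lscpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def diff_flags(text_a, text_b):
    """Return (flags only in A, flags only in B), each as a sorted list."""
    flags_a = parse_lscpu_flags(text_a)
    flags_b = parse_lscpu_flags(text_b)
    return sorted(flags_a - flags_b), sorted(flags_b - flags_a)
```

Reading the two saved files and calling `diff_flags(open("dell02.txt").read(), open("dell08.txt").read())` would list the instruction-set extensions (such as avx2 or avx512f) that one node has and the other lacks. The file names here are placeholders.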

Best,
Tibor

Andy May

Dec 11, 2022, 7:13:50 AM
to tibo...@gmail.com, molpro-user
This is very likely the issue: our binaries are linked against MKL. There is more information about this, and how to address it, at https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-the-conditional-numerical-reproducibility-cnr.html. On some systems I have needed to set MKL_CBWR="AVX2,STRICT" to get all testjobs to pass, but you should select the most appropriate option based on your hardware.
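
As an illustration of how the variable could be applied to a single job, here is a minimal sketch that launches a command with MKL_CBWR set in its environment. The helper names `cbwr_env` and `run_with_cbwr` are hypothetical, and the `molpro test.inp` command line in the note below is an assumption about the local installation; in a batch script, exporting the variable before the Molpro call would have the same effect.

```python
import os
import subprocess


def cbwr_env(mode="AVX2,STRICT", base=None):
    """Return a copy of the environment with MKL_CBWR set to `mode`.

    `base` defaults to the current process environment and is not modified.
    """
    env = dict(os.environ if base is None else base)
    env["MKL_CBWR"] = mode
    return env


def run_with_cbwr(cmd, mode="AVX2,STRICT"):
    """Run `cmd` (a list of arguments) with MKL_CBWR set for that process only."""
    return subprocess.run(cmd, env=cbwr_env(mode), check=True)
```

For example, `run_with_cbwr(["molpro", "test.inp"])` would start the job with MKL_CBWR="AVX2,STRICT". The chosen value should name a code branch that every node involved actually supports, which is what the lscpu comparison helps to decide.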

Best wishes,

Andy
