Hi,
I have installed Dedalus on our local cluster in a conda environment.
The code works fine on one node, but when I switch to more than one node I get errors.
I also ran a simple MPI hello-world program across multiple nodes, and it completed without any errors,
so I guess the installation is correct.
I searched the group and found a thread where a similar issue was solved -
Following it, I set "FILEHANDLER_TOUCH_TMPFILE = True" in "dedalus.cfg", and got an error like -
2020-07-13 12:47:43,027 __main__ 0/2 INFO :: Solver built
2020-07-13 12:47:43,259 __main__ 0/2 INFO :: Starting loop
2020-07-13 12:47:44,475 __main__ 1/2 ERROR :: Exception raised, triggering end of main loop.
I then checked the error file, which shows -
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/subhajitkar/wave_flow/tmpfile_p1'
On our cluster, each node has its own /scratch/subhajitkar space (they are independent of each other).
In this case I used 2 processes in total, one on each node, and
checked that the host node creates the folder but the second node cannot see it.
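
The kind of check described above can be sketched as a small script that each MPI process runs to report whether it sees the output folder. This is only a minimal sketch and not part of Dedalus: the names `describe_dir` and `get_rank` are hypothetical helpers, and reading the rank from the environment variables `OMPI_COMM_WORLD_RANK`, `PMI_RANK` or `SLURM_PROCID` assumes the job is started by Open MPI, an MPICH-style launcher, or Slurm.

```python
import os
import socket
import sys


def describe_dir(path):
    """Report how `path` looks from the process (and node) running this code."""
    exists = os.path.isdir(path)
    return {
        "host": socket.gethostname(),
        "path": path,
        "exists": exists,
        # Only meaningful if the directory is visible from this node.
        "writable": exists and os.access(path, os.W_OK),
    }


def get_rank():
    """Best-effort MPI rank taken from launcher environment variables.

    Assumption: the launcher sets one of these variables. If none is set
    (e.g. a plain serial run), rank 0 is reported.
    """
    for name in ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "SLURM_PROCID"):
        if name in os.environ:
            return int(os.environ[name])
    return 0


if __name__ == "__main__":
    # Directory to test; defaults to the current working directory.
    target = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    info = describe_dir(target)
    print("rank {}: host={host} path={path} exists={exists} "
          "writable={writable}".format(get_rank(), **info))
```

Saved under a hypothetical name such as check_dir.py, it could be launched with something like `mpirun -np 2 python check_dir.py /scratch/subhajitkar/wave_flow`. If one rank prints `exists=False`, that node does not see the folder created by the other one.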
So my question is - does each node create its temporary files independently?
Can you please suggest how to make this work when the scratch space is not shared between nodes?
Please let me know if you need any other information.
Thanks for the help!
Subhajit Kar