eaf_open failed error in 7.2.2

Tobias Glossmann

Apr 14, 2024, 6:00:31 PM
to NWChem Forum
Dear community,

Since updating from 7.02 to 7.2.2 I have been running into a problem: after a job has run for a while, it fails with the error below. Everything worked fine before, apart from some other issues I was hoping the new version would eliminate.

 ------------------------------------------------------------------------
 grid_pckbufw: eaf_open failed                   0
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
  current input line :
    67: task PYTHON
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
 An error occured while trying to read or write to disk space
 ------------------------------------------------------------------------

I have asked the administrator to compile with the "USE_NOIO=y" option (found here https://nwchemgit.github.io/Special_AWCforum/st/id2288.html ), but that slows down NWChem.

I made sure the scratch drive was linked, and even ran the program directly on the fast scratch drive.
Does anyone have a hint? Program attached.

Thank you!
Tobias

PS: A less important side note: "occurred" in the error message should have two "r"s, if anyone could fix the typo.

tce_scan_python.inp

Edoardo Aprà

Apr 15, 2024, 1:56:12 PM
to NWChem Forum
Could you provide more details about this NWChem failure?
For example:
1. What is the output of the command below when the job stops?
ls -l ./scratch
2. How many processes and nodes have you been using?
3. Please post the output file.

Edoardo Aprà

Apr 15, 2024, 5:16:45 PM
to NWChem Forum
Could you also post the output of the command

cat /proc/sys/fs/file-nr

Tobias Glossmann

Apr 15, 2024, 10:02:29 PM
to NWChem Forum
I will answer your other question in a minute. Thanks for your responses!

...$ cat /proc/sys/fs/file-nr
11200   0       3760000

Tobias Glossmann

Apr 15, 2024, 10:02:33 PM
to NWChem Forum
Hi Edoardo,

1. I'm running it again to reproduce the output; my job file deletes the scratch files after a run.

I have disabled the rm in the trap of the job script below and will get back to you as soon as possible:
dir=$SCRATCH/$SLURM_JOB_ID; mkdir -p $dir; trap "rm -r $dir" EXIT
ln -snf $dir ./scratch

mpirun nwchem tce_scan_python.inp >"output_$SLURM_JOB_ID.out"


2. One node with 40 processors.
3. I will post it as soon as the run is done.
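
Instead of removing the trap entirely, the cleanup could be made conditional so that the scratch files survive a failed run for inspection. The following is only a sketch: the function name `cleanup_scratch` is hypothetical, and it assumes that the last command of the job script (here `mpirun nwchem ...`) returns a non-zero exit status when the run fails.

```shell
# Hypothetical helper: remove the scratch directory only after a clean exit.
# $1 = exit status of the job, $2 = scratch directory to clean up.
cleanup_scratch() {
    status=$1
    dir=$2
    if [ "$status" -eq 0 ]; then
        rm -r -- "$dir"
    else
        # Keep the files so that "ls -l ./scratch" can still be run afterwards.
        echo "keeping $dir for inspection (exit status $status)" >&2
    fi
}

# Assumed usage in the job script, replacing the unconditional trap:
#   trap 'cleanup_scratch "$?" "$dir"' EXIT
```

Inside an EXIT trap, `$?` holds the status of the last command executed before the shell exits, which is why the run should stay the final command of the script.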

Thanks 
Tobias

Edoardo Aprà

Apr 15, 2024, 10:04:06 PM
to NWChem Forum
The limit value of 3760000 is a bit on the low side.
Anyhow, if you could monitor the content of /proc/sys/fs/file-nr during the actual run, it might be useful.
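
One possible way to do this monitoring is a small sampling loop started in the background before the run. This is a minimal sketch, not an NWChem feature: the function name `log_file_nr`, its arguments, and the default interval and sample count are all assumptions. On Linux the three fields of /proc/sys/fs/file-nr are the number of allocated file handles, the number of allocated but unused handles, and the system-wide maximum.

```shell
# Hypothetical helper: print timestamped samples of /proc/sys/fs/file-nr.
# $1 = seconds between samples, $2 = number of samples, $3 = file to read.
log_file_nr() {
    interval=${1:-10}                 # assumed default: 10 s
    count=${2:-6}                     # assumed default: 6 samples
    src=${3:-/proc/sys/fs/file-nr}
    i=0
    while [ "$i" -lt "$count" ]; do
        # Fields: allocated handles, unused allocated handles, system maximum.
        read -r allocated unused max < "$src"
        printf '%s allocated=%s unused=%s max=%s\n' \
            "$(date '+%Y-%m-%dT%H:%M:%S')" "$allocated" "$unused" "$max"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Assumed usage in the job script, sampling every 30 s for about an hour:
#   log_file_nr 30 120 > "file_nr_$SLURM_JOB_ID.log" &
```

A first field that climbs steadily toward the third one during the run would point at file-handle exhaustion.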