Occasionally, when writing (usually large) files to an NFS-mounted
directory, we get the console message:
"NFS WRITE ERROR on Host 'hostname' Timed Out: Error Code 62". The
process continues, and unless someone is watching the console the message
goes by undetected. For now we always watch the console and, when the
message appears, rerun the process and rewrite the file.
Is this message informational only, with NFS handling the problem itself,
or is there a real chance that the written file is corrupted? We want to
automate some of the processes that use the NFS files, and need to know
how to detect and deal with an NFS problem when the process typically
continues. We're beginning to research this, but have limited NFS
documentation at this time. Any info on this error message, or a brief
comment on the subject of NFS errors, would be greatly appreciated. Thanks
in advance!
Mike Welch
--
Mike Welch
Cornell Business Services
(607) 255 2921 fax: (607) 254 4577
mp...@cornell.edu
: Occasionally when writing (usually large files) to an NFS mounted
: directory, we get the console message:
: "NFS WRITE ERROR on Host 'hostname' Timed Out: Error Code 62" . The
I have the same problem, so I read the error.h include file. It is supposed
to define the meaning of all the error codes... but I still don't know what
the error codes mean...
Simon :-)
: process continues and if not actually observed at the console the message
The manuals seem to be singularly unhelpful on the subject of soft and hard
(no anatomy jokes, please) mounts. What'sa difference?
--
You can have it done fast |
You can have it done cheap|----- Pick any two.
You can have it done right|
>.... we have (4) SCO Unix machines running on the network using NFS
> Occasionally when writing (usually large files) to an NFS mounted
>directory, we get the console message:
> "NFS WRITE ERROR on Host 'hostname' Timed Out: Error Code 62" . The
>process continues and if not actually observed at the console the message
>goes by undetected. In all cases now, we watch the console, rerun the
>process and rewrite the file.
> Is this message info only with NFS handling the problem, or is there a
>real chance that the file write is corrupted ?
If you get timeouts on NFS writes, you do have a problem.
And you do have file corruption.
This can easily be prevented by mounting the filesystems "hard", not
soft. r/w mounted filesystems should never be mounted "soft", if you
put any value on your data at all.
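To find out which mounts are at risk, something like the following Python sketch could scan a mount table for NFS filesystems that are mounted soft and writable. It assumes a whitespace-separated table of the form "device mountpoint fstype options ..." as found in /etc/mtab or /proc/mounts on some systems; the layout on your system may differ, and the name risky_nfs_mounts is made up for illustration.

```python
def risky_nfs_mounts(mount_table):
    """Return mount points of NFS filesystems mounted soft and read/write.

    Assumes each line looks like: device mountpoint fstype options ...
    (an assumption; adjust the parsing to your system's mount table).
    """
    risky = []
    for line in mount_table.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and lines too short to carry options.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        _device, mount_point, fs_type, options = fields[:4]
        opts = set(options.split(","))
        # Soft plus writable is the dangerous combination.
        if fs_type.startswith("nfs") and "soft" in opts and "ro" not in opts:
            risky.append(mount_point)
    return risky
```

Feed it the text of the mount table; any mount point it returns is a candidate for remounting hard.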
Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
>The manuals seem to be singularly unhelpful on the subject of soft and hard
>(no anatomy jokes, please) mounts. What'sa difference?
hard mounts: the client keeps retrying reads/writes until the server
acks with either success or failure (disk full, access denied, stale file
handle, over quota). Data gets written to the server unless the server
can't write the data to disk. You can reboot the server, or halt it for
some days, and the data still gets to the server's disks.
soft mounts: the client retries reads/writes for some amount of time.
If there is no response from the server after that time, the read/write
fails with ETIMEDOUT.
Data gets lost, and programs can die (pages don't arrive from the server
fast enough).
Not all programs check their read/write error codes, so silent corruption
may occur.
For large I/O-bound computations, one timeout will require you to restart
the run.
In short, if you ever see the message "NFS server not responding", you have
lost data with soft mounts.
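
For anyone automating jobs that write over NFS, the point about unchecked error codes can be sketched in Python as below. This is only an illustration, not a cure for soft mounts: the function names write_checked and try_write are invented here, and it assumes the client reports the failure through write, fsync, or close. The fsync and close calls matter because an NFS client may buffer writes and only report an error at that point.

```python
import errno
import os


def write_checked(path, data):
    """Write bytes to path; raise OSError if any step reports a failure."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Push buffered data out so a delayed write error surfaces here.
        os.fsync(fd)
    finally:
        # close can also report a deferred error; let it propagate.
        os.close(fd)


def try_write(path, data):
    """Return None on success, or a short description of the failure."""
    try:
        write_checked(path, data)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "unknown")
        return f"{name} ({exc.errno}): {exc.strerror}"
    return None
```

A batch job could call try_write and, on a non-None result, log the message and rerun the step instead of relying on someone watching the console.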
Casper