Thank you for the comment.
I agree that there is not much we can do.
But still, for Thunderbird, I would like to see a graceful shutdown with
an easy-to-understand error message such as "network file system did not
respond.", etc.
Otherwise, the user is left with a bitter taste in the mouth thinking
"Is my last e-mail sent successfully?", "Is the last downloaded e-mail
stored securely?", etc.
For idempotent operations (that is, operations that can be tried many
times and always return the same result), a crash is OK.
E.g., FF fetching a page that would come back unchanged no matter how
many times we try, or TB letting users look at the headers of already
downloaded messages.
These are idempotent operations.
(I am ignoring the cache update, already-read-flag setting, etc.)
TB's mail handling, such as receiving/writing/sending e-mails, is not
idempotent, and for non-idempotent operations a crash is too harsh on
the user. That is why I try to make TB a bit more acceptable in the face
of serious trouble with network file systems and other I/O operations
by handling such errors sensibly (and exiting gracefully if not much can
be done).
However, it is an uphill battle since low-level I/O error handling was
not well considered or tested in TB.
But such attention should be given to FF users as well.
I suspect that an FF user in the middle of an important transaction
(such as banking/payment), which is definitely NOT an idempotent
operation, would have a similar sentiment if FF crashed just because the
underlying network file system did not respond, etc.
BTW, there is a subtle difference between Windows and Linux regarding
network file system operations (and their errors).
The Windows I/O system primitives try various network error recovery
schemes such as retrying. This includes handling short reads
automatically: when the remote server returns fewer octets than
requested at the initial call and there are still octets remaining on
the remote server, Windows tries to read as many octets as possible.
So in that sense, if a Windows I/O system call fails for a network
operation, that is when we know a hard unrecoverable error has occurred.
Windows has already tried a few error recovery methods.
OTOH, under Linux, the system call does not do such extra error recovery
and everything is passed to user code, which needs to take care of short
reads and other recovery measures, if any.
Currently TB, and presumably FF too, does not handle such recovery very
well.
At least I have produced a patch for short-read issues for TB under
Linux and have tested it locally for several years.
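
To illustrate what "taking care of the short read" in user code means,
here is a minimal sketch in Python (chosen only for brevity; os.read()
maps onto the read(2) system call). This is NOT the TB patch mentioned
above, and the name read_fully is made up for the example. It assumes a
blocking file descriptor.

```python
import os


def read_fully(fd, count):
    """Read `count` bytes from file descriptor `fd`, looping over short reads.

    Returns fewer than `count` bytes only if end-of-file is reached first.
    Any OSError (e.g. EIO from a dead network mount) propagates to the
    caller, which can then report it and shut down cleanly.
    """
    chunks = []
    remaining = count
    while remaining > 0:
        # A single read may legitimately return fewer bytes than requested.
        chunk = os.read(fd, remaining)
        if not chunk:  # empty result means end-of-file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # Small demonstration on a pipe (hypothetical data).
    r, w = os.pipe()
    os.write(w, b"abcdefgh")
    os.close(w)
    print(read_fully(r, 6))    # the first six bytes
    print(read_fully(r, 100))  # whatever is left before end-of-file
    os.close(r)
```

The point is only that a short result is not treated as an error or as
end-of-file; the loop keeps asking until it has everything or the read
really fails.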
I learned the difference between Windows and Linux network I/O error
handling at the OS level because C-C TB under Linux could not talk to a
congested remote server which occasionally returned short responses,
whereas C-C TB under Windows did not show such behavior.
I investigated and realized that C-C TB under Linux needed a fix.
While testing the code by mimicking a failing remote file server
(unplugging the network cable many times over a few weeks), I learned
there are still other I/O issues, such as failures of ftell and lseek,
which my patch does not yet handle perfectly. (It *IS* rare, and I
suspect the ftell wrapper has a bug somewhere. The error is thrown back
as a signal which C-C TB did not catch, thus crashing.)
Coping gracefully with a misbehaving remote network file system is very
hard to test without an instance of a malfunctioning remote file server
which can be controlled to "err" on demand.
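
One partial substitute for such a server is to inject the faults in
process. The sketch below is only an illustration of that idea in
Python, not something that exists in TB: FlakyReader and load_message
are hypothetical names, and the error text is made up. The wrapper
forces short reads and raises EIO after a chosen number of bytes, so the
error path of the calling code can be exercised on demand. It assumes
read() is always called with a positive size.

```python
import errno
import io


class FlakyReader:
    """Wrap a binary file object and misbehave on demand (test helper)."""

    def __init__(self, raw, max_chunk=None, fail_after=None):
        self.raw = raw
        self.max_chunk = max_chunk    # cap on bytes per read(), forces short reads
        self.fail_after = fail_after  # raise EIO once this many bytes were delivered
        self.delivered = 0

    def read(self, size):
        if self.fail_after is not None and self.delivered >= self.fail_after:
            raise OSError(errno.EIO, "injected I/O error")
        if self.max_chunk is not None:
            size = min(size, self.max_chunk)
        if self.fail_after is not None:
            # Never deliver past the point where the failure is scheduled.
            size = min(size, self.fail_after - self.delivered)
        data = self.raw.read(size)
        self.delivered += len(data)
        return data


def load_message(reader, count):
    """Example code under test: returns (data_read_so_far, error_message)."""
    buf = bytearray()
    try:
        while len(buf) < count:
            chunk = reader.read(count - len(buf))
            if not chunk:  # end-of-file
                break
            buf += chunk
    except OSError as exc:
        # Report instead of crashing; the caller decides how to shut down.
        return bytes(buf), f"file system did not respond: {exc.strerror}"
    return bytes(buf), None


if __name__ == "__main__":
    source = io.BytesIO(b"0123456789")
    print(load_message(FlakyReader(source, max_chunk=4, fail_after=6), 10))
```

This does not replace testing against a real flaky server (it cannot
reproduce hangs, or signals raised from lower layers), but it makes the
simple error paths repeatable.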
Anyway, I know this topic is tangential to the original discussion.
At least, being able to know the causes of a crash, including possible
hardware issues such as network failure, is great.
So thank you for showing how to do it.
Chiaki