Hi, I have a question for those who either use CL to talk to sockets or implement a socket interface in a CL implementation, and for those who know about TCP/IP in general.
How do you treat ECONNRESET (i.e., RST packet)? How do you want it to be treated?
Specifically, do you consider it roughly equivalent to (or a subset of) EOF (i.e., FIN packet)? (In particular, is it true that, once a TCP socket has signaled ECONNRESET, no data can be written there - or, if it can be written, it is guaranteed that the recipient will not receive it - and nothing will ever be successfully read from it?)
I expect the answers to come in two groups:
1. I don't know and I don't care. This code:

    (with-open-stream (sock (socket-connect 80 "www.google.com"))
      (write-line "GET ...." sock)
      (loop :for line = (read-line sock nil nil)
            :while line
            :do ...))

   has always worked for me just fine.

   Variants:
   A. same as above + "I actually always assumed that condition connection-reset was a subclass of end-of-file, that's why the above code works reliably"
   B. same as above + "I have to wrap READ-LINE above in IGNORE-ERRORS because I know that RST is not treated the same as FIN".
2. Nope, ECONNRESET is an _error_ while EOF is a normal element of communications, so it should not be routinely ignored (or swept under the carpet by the EOF-ERROR-P argument of READ-LINE and friends), so the code above should be wrapped in HANDLER-BIND which restarts the connection on ECONNRESET (or something).
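A minimal sketch of what option 2 might look like, assuming the implementation signals a condition for ECONNRESET. The names CONNECTION-RESET, RECONNECT, FETCH-WITH-RETRY, and MAKE-FLAKY-FETCH are hypothetical and not part of any implementation's socket API; the "connection" is simulated by a closure so the example runs without a network.

```lisp
;; Hypothetical condition: a STREAM-ERROR that is *not* an END-OF-FILE.
(define-condition connection-reset (stream-error) ())

(defun fetch-with-retry (fetch &key (max-attempts 3))
  "Call FETCH; on CONNECTION-RESET, retry it, making at most MAX-ATTEMPTS calls."
  (let ((failures 0))
    (handler-bind ((connection-reset
                     (lambda (c)
                       (declare (ignore c))
                       ;; Declining (returning normally) lets the error
                       ;; propagate once the attempts are used up.
                       (when (< (incf failures) max-attempts)
                         (invoke-restart 'reconnect)))))
      (loop
        (restart-case (return (funcall fetch))
          (reconnect () nil))))))

(defun make-flaky-fetch (failures)
  "Return a thunk that signals CONNECTION-RESET FAILURES times, then returns :OK."
  (lambda ()
    (if (plusp failures)
        (progn (decf failures)
               (error 'connection-reset :stream nil))
        :ok)))
```

In real code the thunk would open the socket, send the request, and read the reply, so that invoking the restart really does re-establish the connection. Whether blindly retrying is appropriate depends on the protocol; for a non-idempotent request it probably is not.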
> 2. nope, ECONNRESET is an _error_ while EOF is a normal element of > communications, so it should not be routinely ignored (or swept under > the carpet by the EOF-P argument READ-LINE and friends), so the code > above should be wrapped in HANDLER-BIND which restarts the connection > on ECONNRESET (or something).
I'd like to think this, but I think there's this half-duplex close thing where I might be happy to deal with a half-closed connection, but the other end might not and might then send an RST. But actually that's an error too, so I guess I always think it's an error.
> Hi, > I have a question to those who either use CL to talk to sockets or > implement a socket interface in a CL implementation, and to those who > know about TCP/IP in general.
> How do you treat ECONNRESET (i.e., RST packet)? > How do you want it to be treated?
According to Google, the Stevens book only has a few pages covering ECONNRESET. Here's one of them. The others can be found using the search box on the left.
Considering ECONNRESET is a possible value of errno, I think a CL implementation should respond accordingly...
When the CL user makes a call that eventually triggers an errno, that is the point where an error might be signalled. This is somewhat complicated by CL APIs that layer a stream over the file descriptor. The stream may not have delivered everything that had been buffered...
I wonder how Java libraries handle this.
Should a stream read return all available data and then throw the error on the next read? Or should the error be delivered eagerly? I think the error should be delayed to the next application-level read, but that might not be practical.
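To make the "delayed" option concrete, here is a rough sketch under the assumption that the lower layer records the error when it sees it and the application-level read reports it only after the buffered data has been drained. BUFFERED-SOURCE, NOTE-ERROR, and NEXT-LINE are hypothetical names used for illustration; a real implementation would do this inside its stream machinery.

```lisp
(defstruct (buffered-source (:constructor make-buffered-source (lines)))
  (lines '() :type list)   ; data already received and buffered
  (pending-error nil))     ; error seen by the lower layer, not yet reported

(defun note-error (source condition)
  "Record CONDITION without signaling it; buffered data stays readable."
  (setf (buffered-source-pending-error source) condition))

(defun next-line (source)
  "Return the next buffered line. Once the buffer is empty, signal the
pending error if there is one; otherwise return NIL."
  (cond ((buffered-source-lines source)
         (pop (buffered-source-lines source)))
        ((buffered-source-pending-error source)
         (error (buffered-source-pending-error source)))
        (t nil)))
```

The eager alternative would signal from NOTE-ERROR (or from the next read regardless of buffered data), which is simpler but throws away data the peer did manage to deliver.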
At the lowest level, it would be nice to have a raw binding to the POSIX APIs. At the high-level user API, this seems like a subclass of file-error or stream-error, depending on the API.
> Specifically, do you consider it roughly equivalent to (or a subset of) > EOF (i.e., FIN packet)?
Not for all applications. As you note, there are many times when that's a fine approximation.
+--------------- | How do you treat ECONNRESET (i.e., RST packet)? ... | I expect the answers to come in two groups: ... | 2. nope, ECONNRESET is an _error_ while EOF is a normal element of | communications, so it should not be routinely ignored (or swept under | the carpet by the EOF-P argument READ-LINE and friends), so the code | above should be wrapped in HANDLER-BIND which restarts the connection | on ECONNRESET (or something). +---------------
ECONNRESET is an *error* when doing net programming in any other language (e.g., in C). I don't see any reason that CL should be any different in that regard.
-Rob
----- Rob Warnock <r...@rpw3.org> 627 26th Avenue <http://rpw3.org/> San Mateo, CA 94403
r...@rpw3.org (Rob Warnock) writes: > Sam Steingold <s...@gnu.org> wrote: > +--------------- > | How do you treat ECONNRESET (i.e., RST packet)? > ... > | I expect the answers to come in two groups: > ... > | 2. nope, ECONNRESET is an _error_ while EOF is a normal element of > | communications, so it should not be routinely ignored (or swept under > | the carpet by the EOF-P argument READ-LINE and friends), so the code > | above should be wrapped in HANDLER-BIND which restarts the connection > | on ECONNRESET (or something). > +---------------
> ECONNRESET is an *error* when doing net programming in > any other language (e.g., in C).
More or less.
> I don't see any reason > that CL should be any different in that regard.
In Common Lisp, we have CONDITIONs, and ERRORs are just a subclass of conditions.
The question is whether ECONNRESET should be a subclass of CL:ERROR, or a subclass of CL:CONDITION.
In the case of ECONNRESET it seems that it should be regarded as a CL:ERROR, since receiving it indicates that the remote has unexpectedly closed the connection and you cannot do more I/O.
On the other hand, if it was something you might need to know, but that wouldn't make you stop normal processing, then SIGNALing a simple condition would do (e.g., receiving out-of-band data could be handled by signaling a non-ERROR, or even a non-SERIOUS-CONDITION condition). But this doesn't seem to apply to ECONNRESET.
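The distinction can be sketched as follows. OUT-OF-BAND-DATA and CONNECTION-RESET are hypothetical condition names chosen for illustration; only SIGNAL, ERROR, and the standard condition classes are taken from the language itself.

```lisp
;; A plain CONDITION: informational, processing may continue.
(define-condition out-of-band-data (condition)
  ((payload :initarg :payload :reader out-of-band-payload)))

;; A STREAM-ERROR: the connection is no longer usable.
(define-condition connection-reset (stream-error) ())

(defun demo-signal ()
  "SIGNAL returns NIL when no handler transfers control, so we carry on."
  (signal 'out-of-band-data :payload "urgent")
  :continued)

(defun demo-error ()
  "ERROR never returns; without a handler we would land in the debugger."
  (handler-case (error 'connection-reset :stream nil)
    (connection-reset () :aborted)))
```

A caller that cares about out-of-band data can establish a handler with HANDLER-BIND and take note of it; a caller that does not care loses nothing, which is exactly what should not happen for a reset.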
Pascal J. Bourguignon <p...@informatimago.com> wrote: +--------------- | r...@rpw3.org (Rob Warnock) writes: | > ECONNRESET is an *error* when doing net programming in | > any other language (e.g., in C). | | More or less. | | > I don't see any reason that CL should be any different in that regard. | | In Common Lisp, we have CONDITIONs, and ERRORs are just a subclass of | conditions. | | The question is whether ECONNRESET should be a subclass of CL:ERROR, or | a subclass of CL:CONDITION. | | In the case of ECONNRESET it seems that it should be regarded as a | CL:ERROR, since receiving it indicates that the remote has unexpectedly | closed the connection and you cannot do more I/O. +---------------
I concur.
+--------------- | On the other hand, if it was something you might need to know, but that | wouldn't make you stop normal processing, then SIGNALing a simple | condition would do (eg. receiving out-of-band data could be handled | signaling a non-ERROR, or even a non-SERIOUS-CONDITION condition. But | this doesn't seem to apply to ECONNRESET. +---------------
Nope. ECONNRESET is a fatal error (at least w.r.t. that connection).
-Rob
>> Sam Steingold <s...@gnu.org> wrote: >> +--------------- >> | How do you treat ECONNRESET (i.e., RST packet)? >> ... >> | I expect the answers to come in two groups: >> ... >> | 2. nope, ECONNRESET is an _error_ while EOF is a normal element of >> | communications, so it should not be routinely ignored (or swept under >> | the carpet by the EOF-P argument READ-LINE and friends), so the code >> | above should be wrapped in HANDLER-BIND which restarts the connection >> | on ECONNRESET (or something). >> +---------------
>> ECONNRESET is an *error* when doing net programming in >> any other language (e.g., in C).
> More or less.
>> I don't see any reason >> that CL should be any different in that regard.
> In Common Lisp, we have CONDITIONs, and ERRORs are just a subclass of > conditions.
> The question is whether ECONNRESET should be a subclass of CL:ERROR, > or a subclass of CL:CONDITION.
It is beyond any doubt that ECONNRESET is a STREAM-ERROR. The _question_ is: is ECONNRESET a subclass of END-OF-FILE also? Don Cohen argues that it is. I think it is not. Our respective arguments are expounded upon in the clisp RFE I referenced and summarized in my original message. I am sorry it was not clear.
> It is beyond any doubt that ECONNRESET is a STREAM-ERROR. > The _question_ is: is ECONNRESET a subclass of END-OF-FILE also? > Don Cohen argues that it is. I think it is not. > Our respective arguments are expounded upon in the clisp RFE I referenced > and summarized in my original message. > I am sorry it was not clear.
I agree with you. If it were an END-OF-FILE, code which was expecting to handle that might do so, believing that it had seen all the data when in fact it hasn't, because it got an RST for whatever reason. Code which does something like this would misbehave:
    (defun ts (file)
      (handler-case
          ;; obviously this is really with-open-socket or what have you.
          (with-open-file (in file :direction :input)
            (loop (read-line in)))
        (end-of-file (e)
          (format *debug-io* "~&end of file on ~A, all is well~%" file)
          (values t file e))
        (error (e)
          (format *debug-io* "~&mysteron ~A, all is not well~%" file)
          (values nil file e))))
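The misbehaviour can be shown without a socket. In this sketch RESET-AS-STREAM-ERROR and RESET-AS-EOF are hypothetical conditions standing for the two candidate class hierarchies, and CLASSIFY uses the same clause order as the function above.

```lisp
;; The two designs under discussion, as hypothetical condition types.
(define-condition reset-as-stream-error (stream-error) ())
(define-condition reset-as-eof (end-of-file) ())

(defun classify (condition-type)
  "Signal CONDITION-TYPE and report which clause of a typical reader catches it."
  (handler-case (error condition-type :stream nil)
    (end-of-file () :all-is-well)
    (error () :all-is-not-well)))
```

With the second design, (classify 'reset-as-eof) takes the END-OF-FILE clause, so a truncated transfer would be reported as complete; with the first, the reset reaches the general ERROR clause.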
> It is beyond any doubt that ECONNRESET is a STREAM-ERROR. > The _question_ is: is ECONNRESET a subclass of END-OF-FILE also? > Don Cohen argues that it is. I think it is not. > Our respective arguments are expounded upon in the clisp RFE I referenced > and summarized in my original message. > I am sorry it was not clear.
It is not an END-OF-FILE, because END-OF-FILE occurs when you are reading, but ECONNRESET occurs when you are writing. It would be more like an OUT-OF-SPACE-IN-FILESYSTEM error, if there were such a thing.
So, I'd say, CONNECTION-RESET should be a subclass of STREAM-ERROR.
> > It is beyond any doubt that ECONNRESET is a STREAM-ERROR. > > The _question_ is: is ECONNRESET a subclass of END-OF-FILE also? > > Don Cohen argues that it is. I think it is not. > > Our respective arguments are expounded upon in the clisp RFE I referenced > > and summarized in my original message. > > I am sorry it was not clear.
> It is not an END-OF-FILE, because END-OF-FILE occurs when you are > reading, but ECONNRESET occurs when you are writing. It would be more > like a OUT-OF-SPACE-IN-FILESYSTEM error, if there was one such.
Just to open a can of worms... We should really sit down and add a few "standard" (CDRized) conditions to our bag of tricks.
Sam Steingold <s...@gnu.org> wrote: +--------------- | Pascal J. Bourguignon <c...@vasbezngvzntb.pbz> [2011-04-12 05:08:14 +0200]: | > The question is whether ECONNRESET should be a subclass of CL:ERROR, | > or a subclass of CL:CONDITION. | | No. This is not the question. | | http://www.lispworks.com/documentation/HyperSpec/Body/e_end_of.htm | Class Precedence List: | end-of-file, stream-error, error, serious-condition, condition, t | | It is beyond any doubt that ECONNRESET is a STREAM-ERROR. | The _question_ is: is ECONNRESET a subclass of END-OF-FILE also? | Don Cohen argues that it is. I think it is not. +---------------
I concur with you, Tim, & Pascal. A STREAM-ERROR for sure, but *not* a subclass of END-OF-FILE. The semantics are wrong.
-Rob
I already agreed in a parallel reply that IMHO ECONNRESET is not an END-OF-FILE, but would like to quibble a bit about lesser points: ;-}
Pascal J. Bourguignon <p...@informatimago.com> wrote: +--------------- | It is not an END-OF-FILE, because END-OF-FILE occurs when you are | reading, but ECONNRESET occurs when you are writing. It would be more | like a OUT-OF-SPACE-IN-FILESYSTEM error, if there was one such. +---------------
AFAIK, ECONNRESET doesn't *only* occur when you are writing. It should be equally possible while reading as well.
IME, the most frequent error while *writing* to the net is EPIPE[1], which is *very* commonly seen by web servers when the client browser closes the connection before the full response has been sent. [This happens *very* often on slow or high-delay connections when users hit "Stop" or click on another link before the first page finishes downloading.]
Finally, OUT-OF-SPACE-IN-FILESYSTEM errors generally give ENOSPC (Linux, BSDs, other Unixes) or, if due to administrative limits, EFBIG (Linux/BSD/Unix) or EDQUOT (FreeBSD).
The semantics of ECONNRESET feel more like plain ol' EIO to me.
[By the way, note that FreeBSD distinguishes ENETRESET, ECONNABORTED, and ECONNRESET, as well as several other errors that can happen mid-connection, e.g., ENETDOWN, ENETUNREACH, EHOSTDOWN, EHOSTUNREACH, ETIMEDOUT, etc.]
-Rob
[1] Which, on Unix/Linux systems, if not ignored/caught, causes SIGPIPE! ;-} Thanks again to Dan Barlow, who once suggested simply *ignoring* SIGPIPE in CL-based web servers, and instead catching STREAM-ERROR on writes and silently ignoring it if caused by EPIPE [but logging other STREAM-ERRORs, of course].
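A rough sketch of the policy described in the footnote, assuming SIGPIPE is already being ignored at the process level and that the implementation exposes the errno on its write errors. SOCKET-WRITE-ERROR, its ERRNO slot, *EPIPE*, and CALL-IGNORING-EPIPE are hypothetical names; the value 32 for EPIPE is the usual one on Linux and the BSDs but should be taken from the platform headers in real code.

```lisp
;; Assumed errno value for EPIPE; check <errno.h> on the target platform.
(defparameter *epipe* 32)

;; Hypothetical condition carrying the errno of a failed write.
(define-condition socket-write-error (stream-error)
  ((errno :initarg :errno :reader socket-write-error-errno)))

(defun call-ignoring-epipe (writer &key (log *error-output*))
  "Call WRITER. Return :OK on success, :PEER-GONE if the write failed with
EPIPE (silently), and :LOGGED after logging any other STREAM-ERROR."
  (handler-case (progn (funcall writer) :ok)
    (socket-write-error (c)
      (cond ((eql (socket-write-error-errno c) *epipe*)
             :peer-gone)
            (t
             (format log "~&write failed, errno ~D~%"
                     (socket-write-error-errno c))
             :logged)))
    (stream-error (c)
      (format log "~&stream error of type ~S~%" (type-of c))
      :logged)))
```

A web server would wrap each response write in something like this, so that a client hitting "Stop" drops the request quietly while an ECONNRESET or anything else still shows up in the log.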
> AFAIK, ECONNRESET doesn't *only* occur when you are writing. > It should be equally possible while reading as well.
I think that as well, for instance if the writer crashes then it will leave a half-open connection which it will need to nuke by sending an RST. (I guess I don't know if the RST translates to ECONNRESET: maybe in that case it is translated to something else. In any case it is an error, not EOF.)
Though I'm actually confused about how this is discovered: the reader will presumably be sitting in ESTABLISHED, passively waiting for segments to arrive from the writer: what's the mechanism for the writer to discover that the reader is still there? Is it just keepalives?
Tim Bradshaw <t...@tfeb.org> wrote: +--------------- | Rob Warnock said: | > AFAIK, ECONNRESET doesn't *only* occur when you are writing. | > It should be equally possible while reading as well. | | I think that as well, for instance if the writer crashes then it will | leave a half-open connection which it will need to nuke by sending an | RST. (I guess I don't know if the RST translates to ECONRESET: may be | in that case it is translated to something else. In any case it is an | error, not EOF.) +---------------
Actually, it depends on what you mean by "the writer" here. If you mean the user process [and "crash" as in core dump or unhandled signal], in a Unix/Linux context that will just result in all its file descriptors being closed [albeit w/o user-mode buffers being flushed first!], which will look to the reader like a normal EOF.
But if by "the writer crashed" you mean the whole system containing the sending process, then the reader would get *no* notification until either the protocol stack in the receiving system gets a timeout [which might cause an ECONNRESET?] or until the [former] writing system reboots and later sends a RST to the receiving system when the latter sends probes or ACKs for the "stale" connection which the newly-rebooted [former] writing system doesn't think is open [which should, I think, cause an ECONNRESET on the receiving side].
+--------------- | Though I'm actually confused about how this is discovered: the reader | will presumably be sitting in ESTABLISHED, passively waiting for | segments to arrive from the writer: what's the mechanism for the writer | to discover that the reader is still there? Is it just keepalives? +---------------
I'm confused: Did you mean to ask "what's the mechanism for the *reader* to discover that the *writer* is [or is not] still there?"? If the latter, then, yes, keepalives, timeouts, etc.
-Rob
> But if by "the writer crashed" you mean the whole system containing the > sending process, then the reader would get *no* notification until either > the protocol stack in the receiving system gets a timeout [which might > cause an ECONNRESET?] or until the [former] writing system reboots and > later sends a RST to the receiving system when the latter sends probes > or ACKs for the "stale" connection which the newly-rebooted [former] > writing system doesn't think is open [which should, I think, cause an > ECONNRESET on the receiving side].
This is what I meant: I suppose technically I meant "the TCP/IP stack lost its mind", but in practice that will almost always mean a system-level crash, of course. The older RFCs are worded such that they clearly think the TCP/IP stack might be entirely separate from the host using it (and maybe that was the case - wasn't it all done in IMPs?), and I guess fancy message-passing OSs could easily suffer a crash & restart in the part that deals with TCP/IP without this being a system crash as such.
> +--------------- > | Though I'm actually confused about how this is discovered: the reader > | will presumably be sitting in ESTABLISHED, passively waiting for > | segments to arrive from the writer: what's the mechanism for the writer > | to discover that the reader is still there? Is it just keepalives? > +---------------
> I'm confused: Did you mean to ask "what's the mechanism for the *reader* > to discover that the *writer* is [or is not] still there?"? If the latter, > then, yes, keepalives, timeouts, etc.
What I meant was something like:
- process on host a has connection open to host b and is writing to b. b is not writing to a, just waiting for data.
- host a crashes and reboots
- no magic reopening of the connection happens on a (probably the initial open was from b or something), so a does not find out that b is still waiting.
So now b is sitting there, thinking the connection is open, but a knows nothing of it. But because b is passively waiting, it's not sending anything to a to prompt a to send it a RST. Now my understanding gets a bit vague, but I *think* the only two mechanisms that cause the connection to get reset are:
- if TCP keepalives are in use, then b will at some point send a keepalive to a which will then send an RST back;
- if there is an application-level timeout then at some point b will give up, try and close the connection, and get a RST.
And I guess what I'm asking is whether there's a mechanism *other* than that: if TCP keepalives are not in use, and there's no application-level timeout (including application-level keepalives which I think SSH has), does b ever discover that a has gone away? I think it doesn't, but I'm not sure. Certainly things like telnet/ssh sessions can hang around for a very long time until you type at them (at which point they obviously try and send data, and promptly discover that the other end is gone).
Tim Bradshaw <t...@tfeb.org> wrote: +--------------- | So now b is sitting there, thinking the connection is open, but a knows | nothing of it. But because b is passively waiting, it's not sending | anything to a to prompt a to send it a RST. Now my understanding gets | a bit vague, but I *think* the only two mechanisms that cause the | connection to get reset are: | - if TCP keepalives are in use, then b will at some point send a | keepalive to a which will then send an RST back; | - if there is an application-level timeout then at some point b will | give up, try and close the connection, and get a RST. +---------------
I think those are basically the choices. Oh, plus this one:
- If "a" [after rebooting] tries to initiate a new connection to "b" that happens to re-use the same port numbers as the "stale" connection on "b", then *b* might send a RST to "a" in reply to the initial SYN. Or "b" might just send an ACK, which probably won't match the SYN's sequence number, which could cause "a" to send a RST. Or something like that. [I'm not sure. It's been a while since I've walked through the TCP state diagram.]
-Rob
> Tim Bradshaw <t...@tfeb.org> wrote: > +--------------- > | So now b is sitting there, thinking the connection is open, but a knows > | nothing of it. But because b is passively waiting, it's not sending > | anything to a to prompt a to send it a RST. Now my understanding gets > | a bit vague, but I *think* the only two mechanisms that cause the > | connection to get reset are: > | - if TCP keepalives are in use, then b will at some point send a > | keepalive to a which will then send an RST back; > | - if there is an application-level timeout then at some point b will > | give up, try and close the connection, and get a RST. > +---------------
> I think those are basically the choices. Oh, plus this one:
> - If "a" [after rebooting] tries to initiate a new connection to "b" > that happens to re-use the same port numbers as the "stale" connection > on "b", then *b* might send a RST to "a" in reply to the initial SYN. > Or "b" might just send an ACK, which probably won't match the SYN's > sequence number, which could cause "a" to send a RST. Or something > like that. [I'm not sure. It's been a while since I've walked through > the TCP state diagram.]
It depends on whether the sequence number is in b's current window.
If it's in the window, b will send a RST, and the application on b should be given an ECONNRESET error.
If it's outside the window, b will send an ACK containing the sequence number of the beginning of the window. When a receives this, it should send a RST to b (it was expecting a SYN-ACK, not a plain ACK). This should then cause the application on b to get an ECONNRESET error.
So while the underlying details are different, the end result in either case should be that the application gets ECONNRESET.
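The two cases can be written down as a small decision function. This is only a sketch of the description above, not of any particular TCP stack; IN-WINDOW-P and STALE-SYN-OUTCOME are hypothetical names, and the window test assumes 32-bit sequence numbers that wrap around.

```lisp
(defun in-window-p (seq rcv-nxt rcv-wnd)
  "True if SEQ lies in [RCV-NXT, RCV-NXT + RCV-WND), modulo 2^32."
  (< (mod (- seq rcv-nxt) (expt 2 32)) rcv-wnd))

(defun stale-syn-outcome (syn-seq rcv-nxt rcv-wnd)
  "Segments exchanged when rebooted host A sends a SYN that collides with
B's stale connection. Each element is (SENDER SEGMENT-TYPE), in order."
  (if (in-window-p syn-seq rcv-nxt rcv-wnd)
      ;; SYN in B's window: B itself resets the stale connection.
      '((:b :rst))
      ;; SYN outside the window: B ACKs its window start, and A, which
      ;; expected a SYN-ACK, answers with a RST.
      '((:b :ack) (:a :rst))))
```

Either list ends with the stale connection on b being torn down by a RST, which is why the application on b should see ECONNRESET in both cases.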
-- Barry Margolin, bar...@alum.mit.edu Arlington, MA