Mysterious mochiweb crashes on R15B02

150 views
Skip to first unread message

Anthony Molinaro

unread,
Dec 6, 2012, 1:46:46 PM12/6/12
to moch...@googlegroups.com
Hi,

I'm currently trying out R15B02 on a production box next to many servers running R14B04 and the R15B02 are every so often getting the following crash

2012-12-06 00:06:46 =CRASH REPORT====
  crasher:
    initial call: mochiweb_acceptor:init/3
    pid: <0.11222.4523>
    registered_name: []
    exception error: {{badmatch,{error,einval}},[{mochiweb_http,new_request,3,[{
file,"mochiweb_http.erl"},{line,129}]},{mochiweb_http,handle_invalid_request,3,[
_lib.erl"},{line,227}]}]}
    ancestors: [webmachine_mochiweb,ox_http_gateway_sup,<0.625.0>]
    messages: [{tcp_closed,#Port<0.61144399>}]
    links: [<0.928.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1597
    stack_size: 24
    reductions: 597
  neighbours:

Which seems to suggest that somehow the socket has been closed.  I've checked all the sysctl and other settings, and the systems are identical except for R15B02.

It's happening at a low rate compared to the server request rate, but I'd still like to understand what is happening.  Has anyone seen this before?

Thanks,

-Anthony

Bob Ippolito

unread,
Dec 6, 2012, 4:50:24 PM12/6/12
to moch...@googlegroups.com
Hard to look into it without knowing exactly what version of mochiweb that is, the line numbers in the stack trace don't match up with master. Certainly some kind of error is happening, and then it fails when it tries to send the response for the invalid request. I'd start looking around here https://github.com/mochi/mochiweb/blob/master/src/mochiweb_http.erl#L69

My guess is that some kind of socket closed error message isn't handled properly. Could be a bug that was fixed a long time ago in master, from the line numbers in your stack trace it doesn't seem like a recent version.

-bob



--
You received this message because you are subscribed to the Google Groups "MochiWeb" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mochiweb/-/kJh7tJiLzj0J.
To post to this group, send email to moch...@googlegroups.com.
To unsubscribe from this group, send email to mochiweb+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mochiweb?hl=en.

Anthony Molinaro

unread,
Dec 10, 2012, 6:58:36 PM12/10/12
to moch...@googlegroups.com
Sorry, it was a build off of tag v2.3.1, so not quite the most recent.  The line number are off because I added some debugging.

The breaking function is

new_request(Socket, Request, RevHeaders) ->
    ok = mochiweb_socket:setopts(Socket, [{packet, raw}]),
    mochiweb:new_request({Socket, Request, lists:reverse(RevHeaders)}).

mochiweb_socket:setopts is returning an error where the socket has been closed.  The strange thing is I was never seeing this happen in R14B04.  Also this function has not changed in mochiweb master so it's probably not been fixed whatever it is.

-Anthony

Bob Ippolito

unread,
Dec 10, 2012, 9:29:17 PM12/10/12
to moch...@googlegroups.com
Yep, but this shouldn't be called if the socket was closed, since that's handled elsewhere. Did you take a look at the code I pointed at, where it ignores the value of the error and tries to send this error response? In order to fix it properly, we'd just need to make a case in there to handle the kind of error that did happen, but in order to do that we'd need to know the error or how to reproduce it.


To view this discussion on the web visit https://groups.google.com/d/msg/mochiweb/-/pQm1NuScST0J.

Anthony Molinaro

unread,
Dec 12, 2012, 6:02:20 PM12/12/12
to moch...@googlegroups.com
Unfortunately this is not something I've figured out how to reproduce.  I know it doesn't happen with R14B04, but does with R15B02 on our production systems.

mochiweb_socket:setopts/1 is returning {error,einval} because the socket is closed.

I went ahead and added some debugging to mochiweb_http:request/2 and mochiweb_http:headers/5 and it looks like the _Other clause is getting

{tcp_error,#Port<0.98680764>,emsgsize}

Which seems to happen if the packet size is bigger than the recv buffer


So maybe the default recv buffer size changed in R15B.  I've been looking through the inet_drv.c but not seeing much there, so it could be elsewhere.

Anyway, seems like a possible fix would be to add a clause in front of _Other in header/5 an request/2 which matches {tcp_error,_,emsgsize} and closes the socket (since it's probably in the process of being closed anyway).

I'll see if I can't construct some extra large requests to see if I can't get it to be reproducible.

-Anthony

Anthony Molinaro

unread,
Dec 12, 2012, 7:42:29 PM12/12/12
to moch...@googlegroups.com
Okay, so my last response hasn't gone through yet, but I'm pretty sure I found the issue.  I was able to test by constructing an HTTP request with a really large header and saw that for R14B04 the message you get is

{http,#Port<0.120387186>,{http_error,"<data clipped>"}}

But with R15B02 the message you get is

{tcp_error,#Port<0.104226352>,emsgsize}

I believe the cause is this commit


Which changes how decode_packet is working to prevent DOS attacks.  It seems like it may have also changed the default behavior as well.

Not sure what you all would prefer for a fix, adding the additional clause to the receive calls or attempting to decode_packet with a packet_size and line_length?

-Anthony

Bob Ippolito

unread,
Dec 12, 2012, 9:29:43 PM12/12/12
to moch...@googlegroups.com
Handling the tcp_error in the receive clause is probably the best fix


To view this discussion on the web visit https://groups.google.com/d/msg/mochiweb/-/Tg6TpbuFxegJ.

Anthony Molinaro

unread,
Dec 13, 2012, 2:48:15 PM12/13/12
to moch...@googlegroups.com
Okay, here's the pull request


-Anthony
Reply all
Reply to author
Forward
0 new messages