[erlang-questions] 8k limit on gen_tcp:recv?


Rachel Willmer

Mar 10, 2010, 6:40:11 PM
to erlang-q...@erlang.org
I've been debugging a problem whereby couchdb rejects a URL longer than 8k.

At first, I thought it was an already known limitation in mochiweb
(the stack is couchdb/mochiweb/erlang).

But after a lot of io:format debugging, I've pinned it down to a
call from mochiweb to gen_tcp:recv, which fails on this long URL.

Is this a known limitation in erlang's gen_tcp:recv?

Rachel

________________________________________________________________
erlang-questions (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-questio...@erlang.org

Bob Ippolito

Mar 10, 2010, 6:48:17 PM
to Rachel Willmer, erlang-q...@erlang.org
Since this is definitely not clear in the question, the code would
look something like this:

    inet:setopts(Socket, [{packet, http}]),
    case gen_tcp:recv(Socket, 0, ?IDLE_TIMEOUT) of
        {ok, {http_request, Method, Path, Version}} ->
            headers(Socket, {Method, Path, Version}, [], Body, 0)
    end.

Rachel - how about providing a reproducible test case and making it an
issue in the mochiweb project? There might be some continuation
response from recv that mochiweb doesn't currently expect to receive.

http://code.google.com/p/mochiweb/issues/entry
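A reproducible test case of the kind requested here could be sketched as follows. This is only an illustration, not mochiweb code: the module name `long_url_repro` and the function `run/1` are hypothetical, and the sketch assumes a loopback connection with the server side in `{packet, http}` passive mode, as in the snippet above. It returns whatever `gen_tcp:recv/3` yields, so the result for a short path and a path longer than 8k can be compared directly.

```erlang
-module(long_url_repro).
-export([run/1]).

%% Hypothetical helper: sends a GET request whose path is "/" followed by
%% PathLen copies of "a" over a loopback connection, and returns the value
%% that gen_tcp:recv/3 produces on the server side.
run(PathLen) ->
    %% Port 0 lets the OS pick a free port; accepted sockets inherit
    %% the {packet, http} and {active, false} options.
    {ok, Listen} = gen_tcp:listen(0, [list, {packet, http}, {active, false},
                                      {ip, {127,0,0,1}}]),
    {ok, Port} = inet:port(Listen),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, false}], 5000),
    {ok, Server} = gen_tcp:accept(Listen, 5000),
    Path = [$/ | lists:duplicate(PathLen, $a)],
    ok = gen_tcp:send(Client,
                      ["GET ", Path, " HTTP/1.1\r\nHost: localhost\r\n\r\n"]),
    %% Length 0 asks for the next available packet, here the request line.
    Result = gen_tcp:recv(Server, 0, 5000),
    lists:foreach(fun gen_tcp:close/1, [Client, Server, Listen]),
    Result.
```

Calling `long_url_repro:run(100)` and then `long_url_repro:run(9000)` would show whether the long request line comes back as an `{ok, ...}` packet or as an `{error, Reason}` tuple on a given Erlang/OTP release.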

Rachel Willmer

Mar 10, 2010, 6:57:07 PM
to Bob Ippolito, erlang-q...@erlang.org
I'll add the test case tomorrow morning; it is 100% repeatable.

From what I could see from the response from gen_tcp:recv, it's just
returning an error, not a continuation response, so I don't think
there's anything for mochiweb to handle.

But yes, I will add the test case and the diagnostic tomorrow. Just
wanted to check now that this wasn't already a known issue/limitation.

Bob Ippolito

Mar 10, 2010, 7:14:40 PM
to Rachel Willmer, erlang-q...@erlang.org
At some point you have to give up on parsing a long URL to prevent a
denial of service attack. I don't know if that limit is documented or
configurable, I haven't recently looked at the code that handles
parsing HTTP requests.

I'm sure someone knows about it, but it's definitely not something
everyone knows.

Is there a use case for such a long URL? Most browsers and servers
have some kind of limit (2083 for IE, 8190 for Apache 2.2 by default,
16k for IIS, ...). 8k might be a little conservative, but how far do
you really want to go?

Steve Vinoski

Mar 10, 2010, 10:43:04 PM
to Bob Ippolito, Rachel Willmer, erlang-q...@erlang.org
Try setting your socket's receive buffer to something larger than 8k
before you do the recv:

inet:setopts(Socket, [{recbuf,16384}]).

By passing a 0 for the length to recv, you're telling it to receive
all available bytes. I assume it can't receive more bytes than its
buffer can store.

--steve

Tony Rogvall

Mar 11, 2010, 5:31:11 AM
to Steve Vinoski, Bob Ippolito, Rachel Willmer, erlang-q...@erlang.org
Note that setting recbuf not only updates the inet driver read buffer but also
sets the SO_RCVBUF socket option. This is not always what you want!
First consult the networking documentation on TCP and buffer management to
see if this will result in unwanted behavior.

If you do not know exactly what recbuf does, you may instead use the option
{buffer, Size}, which only affects the inet driver read buffer.

/Tony
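The difference between the two options can be made visible with a small sketch. The module name `buffer_opts` and both function names are hypothetical; the only library calls used are `inet:setopts/2` and `inet:getopts/2` with the `buffer` and `recbuf` options discussed above. The sketch assumes `Socket` is an open gen_tcp socket.

```erlang
-module(buffer_opts).
-export([set_driver_buffer/2, show_buffers/1]).

%% Enlarge only the inet driver's user-level read buffer,
%% leaving the kernel's SO_RCVBUF setting alone.
set_driver_buffer(Socket, Size) ->
    inet:setopts(Socket, [{buffer, Size}]).

%% Return the current values of both options so the effect of a
%% setopts call can be inspected, e.g. {ok, [{buffer, B}, {recbuf, R}]}.
show_buffers(Socket) ->
    inet:getopts(Socket, [buffer, recbuf]).
```

Calling `show_buffers/1` before and after `set_driver_buffer(Socket, 16384)` is one way to check which of the two values actually changed on a given platform.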

Rachel Willmer

Mar 19, 2010, 10:20:04 AM
to erlang-q...@erlang.org
Thanks to everyone who responded, this is all really useful.

Trying it now...
Rachel

Rachel Willmer

Mar 22, 2010, 8:40:58 AM
to erlang-q...@erlang.org
Thanks again, I've got a version working now with the longer receive buffer.

The use case for the long URL is couchdb replication, by the way.

https://issues.apache.org/jira/browse/COUCHDB-644

Bob Ippolito

Mar 22, 2010, 9:10:41 AM
to Rachel Willmer, erlang-q...@erlang.org
Is changing the limit from 8k to 16k really the best solution? Beyond
technical reasons, e.g. some kinds of proxies might limit URL length,
is there a guarantee that all CouchDB URLs will be shorter than 16k?
I'd be fine with changing the default buffer size, but only if there's
proof that 16k is going to be big enough for all CouchDB use cases.
8kb is certainly enough for exposing URLs to browsers since their
practical limits are much lower.

Rachel Willmer

Mar 22, 2010, 10:52:55 AM
to Bob Ippolito, erlang-q...@erlang.org
On 22 March 2010 13:10, Bob Ippolito <b...@redivi.com> wrote:
> Is changing the limit from 8k to 16k really the best solution? Beyond
> technical reasons, e.g. some kinds of proxies might limit URL length,
> is there a guarantee that all CouchDB URLs will be shorter than 16k?

It's a good enough solution for what I need right now (which is to be
able to replicate some very large couchdb databases without the system
falling over).

But I'd agree that it's probably not the "best" solution, and I
wouldn't suggest changing the default buffer size.

I'm going to do some further digging around inside the couchdb code to
figure out why these long URLs are being generated in the first place.
There is supposed to be code in there which splits the request into
multiple requests if the revision list is too long, but that doesn't
seem to be working. Fixing that would be a better solution.
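
The idea of splitting an over-long revision list across several requests can be illustrated with a small sketch. This is not the CouchDB replicator code: the module name `rev_batches`, the function `split/2`, and the cost model (one separator byte per revision ID) are assumptions made purely for illustration.

```erlang
-module(rev_batches).
-export([split/2]).

%% Split a list of revision ID strings into batches so that the summed
%% length of the IDs in each batch, plus one separator byte per ID,
%% stays within MaxBytes. An ID that alone exceeds MaxBytes still gets
%% a batch of its own rather than being dropped.
split(Revs, MaxBytes) ->
    split(Revs, MaxBytes, [], 0, []).

split([], _Max, [], _Size, Acc) ->
    lists:reverse(Acc);
split([], _Max, Batch, _Size, Acc) ->
    lists:reverse([lists:reverse(Batch) | Acc]);
split([Rev | Rest], Max, Batch, Size, Acc) ->
    Cost = length(Rev) + 1,
    case Batch =/= [] andalso Size + Cost > Max of
        true ->
            %% Current batch is full: close it and retry Rev in a new one.
            split([Rev | Rest], Max, [], 0, [lists:reverse(Batch) | Acc]);
        false ->
            split(Rest, Max, [Rev | Batch], Size + Cost, Acc)
    end.
```

Each resulting batch could then be sent as its own request, keeping every URL below whatever limit the receiving server enforces.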
