Well, I applied your transfer-encoding patches for now. I have some
thoughts on this; read on.
As for the original behavior: I'm guessing that it was setting the
transfer-encoding to nil because Blaine designed it to work with the
output of `curl -is`.
There are two problems here:
* libcurl knows about chunked encoding, and decodes it for you (which
is reasonable: chunked responses are a transport-layer detail, and no
one wants to deal with that... that's what curl is for!). And, when
you pass -i to curl, it prints the Transfer-Encoding header if it
exists, even though libcurl decoded the chunks. Both behaviors seem
reasonable in isolation; the combination is definitely a funny edge
case.
* Net::HTTP supports chunked encoding (or at least my copy,
1.8.6p111, does), so you can't give it a response to parse with a
"Transfer-Encoding: chunked" header and an already-decoded body...
it wants to decode the chunks itself, since it thinks it's on the
network. It'll raise either an HTTPBadResponse or an EOFError if you
try it (repro below).
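Here's a minimal repro of that second point, in case anyone wants to
see it fail (this is roughly how a baked response gets parsed anyway;
StringIO stands in for the network socket):

    require 'net/http'
    require 'stringio'

    # A recorded response whose body was already de-chunked (like
    # `curl -is` output), but which still carries the header:
    raw = "HTTP/1.1 200 OK\r\n"                +
          "Transfer-Encoding: chunked\r\n\r\n" +
          "hello world"

    socket   = Net::BufferedIO.new(StringIO.new(raw))
    response = Net::HTTPResponse.read_new(socket)
    response.reading_body(socket, true) do
      response.read_body  # "hello world" isn't a valid chunk-size
    end                   # line, so this raises (EOFError here;
                          # HTTPBadResponse for other bodies)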
So to get it to work with recorded responses from `curl -is`, Blaine
stripped that header from the baked response, so that Net::HTTP
wouldn't try to decode the chunks that had already been decoded by
curl.
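If I'm reading the original code right, that stripping looked
something like this (a sketch; @header is Net::HTTPResponse's
internal hash of downcased header names, so this pokes at
implementation details):

    # original behavior, roughly: blank out the header on the baked
    # response so Net::HTTP won't try to re-decode the body
    response.instance_eval do
      @header['transfer-encoding'] = nil
    end

The catch is that the key is still present afterwards; it just maps
to nil, which is apparently enough to confuse Mechanize.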
I pasted some example request/responses here, to prove that `curl -is`
is decoding the chunks and leaving the header in:
http://pastie.org/376361
As you can see, I opened a couple of raw HTTP sessions against
www.google.com:80.
When I specified HTTP/1.1 in the request, the response body correctly
came back chunked. (Below that, there's an HTTP/1.0 request, which
isn't chunked.) curl uses HTTP/1.1, and decodes the chunks, but leaves
in the header.
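In case that pastie ever disappears: the HTTP/1.1 session went
something like this (typed into telnet; headers and body abbreviated,
chunk sizes made up):

    $ telnet www.google.com 80
    GET / HTTP/1.1
    Host: www.google.com
    Connection: close

    HTTP/1.1 200 OK
    Content-Type: text/html
    Transfer-Encoding: chunked
    ...

    8
    <html>...
    0

Those hex lines ("8", "0") are the chunk sizes; they're part of the
raw body, and they're exactly what libcurl strips out for you.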
So it seems like there are two options:
* have people stop using `curl -is`, so that chunked responses are
actually stored with the chunks they're supposed to have; and remove
the original transfer-encoding code (and your patch) altogether;
* or keep the current `curl -is` support, but delete the Transfer-
Encoding header altogether, if it exists (instead of setting it to
nil), so it matches the body. That should keep Mechanize happy.
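For reference, here's option 2 as a sketch (same caveat as above
about poking at @header):

    # option 2: remove the header entirely so it matches the
    # already-decoded body, instead of leaving a nil value behind
    response.instance_eval do
      @header.delete('transfer-encoding')
    end

After that, key?('transfer-encoding') is false and header iteration
never yields a nil value, which is presumably what was tripping
Mechanize up.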
Also, note: this isn't just a chunking issue; the other transfer-
encodings (deflate, compress, etc.) all have the exact same problem.
I think I'm more interested in approach #2... it seems to be the most
pragmatic. If we did #1, we'd need to provide a tool for users to
record responses in the proper fashion without using curl, and
probably raise if a recorded response is used that has a Transfer-
Encoding header but an already-decoded body, since it'd be invalid.
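That guard might look something like this (looks_chunked? is a
made-up helper, and a crude one: it just checks whether the body
opens with a plausible chunk-size line):

    # hypothetical check for approach #1: refuse baked responses
    # that claim a transfer-encoding the body doesn't actually have
    def validate_baked_response!(response)
      return unless response['transfer-encoding'].to_s =~ /chunked/i
      unless looks_chunked?(response.body)
        raise ArgumentError, "response declares Transfer-Encoding: " +
          "chunked, but the body is already decoded; re-record it " +
          "without `curl -is`"
      end
    end

    def looks_chunked?(body)
      # a chunked body must open with a hex chunk-size line
      !!(body =~ /\A[0-9a-fA-F]+(;[^\r\n]*)?\r\n/)
    end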
Hope this helps,
Chris Kampmeier