On Fri, Dec 28, 2012 at 3:08 PM, am_p1 <
andrewmc...@gmail.com> wrote:
> I'm using this for sure:
> response.setEncoding('utf8')
>
> but the problem is the chunks can be split more than once and with UTF8
> strings there doesn't seem to be any character that indicates the buffer was
> split. I read that JSON responses have the \n you can use but I don't see
> that anywhere in the XML response I'm receiving.
>
> If node.js doesn't put these back together for me, then I need to figure out
> what characters are at the end of the buffer to indicate a split chunk so I
> can put them back together myself. Currently I'm looking for some strings in
> the XML packet to indicate a complete or incomplete chunk but again, it's
> not working 100%.
You may be looking at two separate issues here.
1. Partial character sequences. When used as documented,
stream.setEncoding() takes care of that: if a data chunk ends in a
partial multi-byte sequence, those trailing bytes are held back and
emitted together with the next chunk.
For the curious, the relevant code is in lib/string_decoder.js.
2. Partial XML documents. node.js can't help you here; you need to
track that yourself.
If the server sets a Content-Length header, it's easy: just xml +=
data until Buffer.byteLength(xml) equals the content length. Caveat
emptor: repeatedly calling Buffer.byteLength() on the growing string
is not very efficient, but don't worry about that until later. Make
it work first, then make it work fast.
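A minimal sketch of that approach, in case it helps. createBodyCollector() and readXml() are made-up names for illustration, not node.js APIs, and the sketch assumes the server really does send a Content-Length header (without one the callback never fires).

```javascript
// Hypothetical helper (not part of node.js): appends decoded chunks and
// reports when the accumulated byte length reaches the Content-Length.
function createBodyCollector(contentLength) {
  let xml = '';
  return {
    // Feed one 'data' chunk. Returns the whole body once it is complete,
    // otherwise null.
    push(chunk) {
      xml += chunk;
      // Content-Length counts bytes, not characters, hence byteLength().
      // >= rather than === so a slightly wrong header cannot hang us.
      return Buffer.byteLength(xml, 'utf8') >= contentLength ? xml : null;
    },
  };
}

// Hypothetical wiring for an http response object.
function readXml(response, onComplete) {
  response.setEncoding('utf8'); // 'data' events now deliver strings
  const expected = Number(response.headers['content-length']);
  const collector = createBodyCollector(expected);
  response.on('data', (chunk) => {
    const body = collector.push(chunk);
    if (body !== null) onComplete(body);
  });
}
```

You would call readXml(response, callback) from inside your http.get() callback. This is the slow-but-simple version; it re-measures the whole string on every chunk.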
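If the response is sent using chunked encoding, there is no
Content-Length to compare against, so you probably need to feed the
data to a streaming (SAX) parser and let it tell you when the
document is complete.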
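Coming back to point 1, here is a small sketch of what the string decoder does with a multi-byte character that straddles two chunks. It uses the public string_decoder module directly; the sample XML and the split position are made up for the demonstration.

```javascript
const { StringDecoder } = require('string_decoder');

const decoder = new StringDecoder('utf8');

// 'ë' is two bytes in UTF-8 (c3 ab), at byte offsets 8 and 9.
const bytes = Buffer.from('<name>Zoë</name>', 'utf8');

// Simulate the network splitting the data in the middle of 'ë'.
const first = decoder.write(bytes.subarray(0, 9));  // ends with lone c3
const second = decoder.write(bytes.subarray(9));    // starts with ab

console.log(JSON.stringify(first));   // "<name>Zo"
console.log(JSON.stringify(second));  // "ë</name>"
```

The dangling byte is not emitted with the first chunk; it is buffered and comes out as a complete character with the second. So the strings you get in 'data' events can be concatenated safely, even though they can still end anywhere in the XML.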