HTTPS request incomplete when piping to a file

Matt

May 13, 2013, 2:35:32 PM

to nod...@googlegroups.com

Hi,

I've been trying to debug this problem for a week now and it's finally time to come here and ask if I'm doing something dumb. We're downloading a PDF file from American Express, with the following code:

var fs = require('fs');
var request = require('request'); // Mikeal's request library

var req = _make_req(...); // constructs a request
var r = request(req);

r.on('response', function (res) {
    res.on('error', function (err) { console.log(err) });
    res.on('end', function () { console.log('end') });
    res.on('close', function () { console.log('close') });

    // encoding shouldn't matter here as we're writing Buffer objects anyway...
    var ws = fs.createWriteStream(filename, {encoding: 'binary'});
    r.pipe(ws);

    ws.on('close', function () {
        // parse the PDF
    });
});

The problem is that sporadically the PDF is corrupted (the end is missing). I haven't managed to even replicate it outside of our live systems. The code is used to get PDFs from many places other than Amex and works just fine, and even works fine on Amex most of the time. We can't really even packet sniff because it's over https...

Is there anything obvious I'm missing? It's not an fsync issue because even hours later the file isn't complete.

I can't provide the request headers for privacy reasons, but here's the response headers:

Response Headers:

date: Mon, 13 May 2013 12:00:23 GMT

server: IBM_HTTP_Server

x-powered-by: Servlet/3.0

content-disposition: attachment; filename="Statement_Apr 2013.pdf";

pragma: max-age=86400

expires: Tue, 14 May 2013 12:00:24 GMT

lastmodified: Mon, 13 May 2013 12:00:24 GMT

transfer-encoding: chunked

content-type: application/pdf

content-language: en-US

connection: close
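
Since the response is sent with transfer-encoding: chunked and carries no content-length header, there is no advertised size to compare the file against. One way to narrow things down might be to count the bytes that actually arrive and compare that total with the size of the file on disk. This is only a minimal sketch, assuming a modern Node.js (for `Buffer.from` and arrow functions) rather than v0.8; `trackBytes` and the fake emitter are hypothetical names used purely for illustration:

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: tally the bytes seen on any emitter of 'data' events
// and record whether 'end' was ever emitted.
function trackBytes(source) {
  const stats = { bytes: 0, ended: false };
  source.on('data', (chunk) => { stats.bytes += chunk.length; });
  source.on('end', () => { stats.ended = true; });
  return stats;
}

// Simulated response, standing in for the real request object.
const fake = new EventEmitter();
const stats = trackBytes(fake);
fake.emit('data', Buffer.from('%PDF-1.4\n'));
fake.emit('data', Buffer.from('%%EOF\n'));
fake.emit('end');

console.log(stats.bytes, stats.ended);
```

In the real code this would be something like `var stats = trackBytes(r);` next to `r.pipe(ws)`, with `stats.bytes` logged when `ws` closes. If the count matches the truncated file size and `ended` is false, that would suggest the response was cut short rather than data being lost on the way to disk.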

Oh, and this is with Node v0.8.22. Upgrading to v0.10 isn't really an option right now - we need to wait for it to fully be stable first.

Matt.

mscdex

May 13, 2013, 2:45:35 PM

to nodejs

On May 13, 2:35 pm, Matt <hel...@gmail.com> wrote:
> var ws = fs.createWriteStream(filename, {encoding: 'binary'});

Don't set the encoding here if you're piping. Just do: `var ws =
fs.createWriteStream(filename);`

Matt

May 13, 2013, 2:53:06 PM

to nod...@googlegroups.com

On Mon, May 13, 2013 at 2:45 PM, mscdex <msc...@gmail.com> wrote:

> Don't set the encoding here if you're piping. Just do: `var ws =
> fs.createWriteStream(filename);`

See the comment above that line - I've tried both ways. Still occurs.

greelgorke

May 15, 2013, 4:25:24 AM

to nod...@googlegroups.com

Are you sure it's the end of the file, not the start? You're on 0.8.x, so you might just lose some packets at the start, because the stream is already flowing by the time you pipe it. That's a bug that is fixed in 0.10. Have you tried this:

var ws = fs.createWriteStream(filename, {encoding: 'binary'});
r.pipe(ws);

ws.on('close', function () {
    // parse the PDF
});

r.on('error', function (err) { console.log(err) });
r.on('end', function () { console.log('end') });
r.on('close', function () { console.log('close') });
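
The mechanism behind this suggestion can be illustrated with a plain event emitter: anything emitted before a listener is attached is simply dropped. The sketch below is a simulation with `EventEmitter` only, not the actual v0.8 stream or request internals, and the chunk names are made up:

```javascript
const { EventEmitter } = require('events');

const source = new EventEmitter();
const received = [];

// Emitted before anyone is listening: this chunk is lost.
source.emit('data', 'chunk-1');

// Listener attached late, as when piping after data has started to flow.
source.on('data', (chunk) => received.push(chunk));
source.emit('data', 'chunk-2');

console.log(received);
```

Only the second chunk is collected, which is why attaching the pipe as early as possible matters on old-style streams. Note that this kind of loss would affect the start of the data, not the end.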

Matt

May 15, 2013, 11:17:02 AM

to nod...@googlegroups.com

On Wed, May 15, 2013 at 4:25 AM, greelgorke <greel...@gmail.com> wrote:

> are you sure it's the end of the file? not the start?

Yeah - the files always start with the correct %PDF bytes. It's just missing the index at the end of the file.
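
A check along these lines could be automated before parsing, so truncated downloads are detected and retried. PDF files normally end with a `%%EOF` marker after the cross-reference data, so a rough heuristic is to look for it near the end of the file. This is a sketch under assumptions: it targets a modern Node.js rather than v0.8, `looksLikeCompletePdf` is a hypothetical name, and the 1024-byte search window is an arbitrary choice:

```javascript
// Heuristic only: true when the buffer starts with "%PDF" and contains
// "%%EOF" somewhere in its last 1024 bytes.
function looksLikeCompletePdf(buf) {
  const head = buf.subarray(0, 4).toString('latin1');
  const tail = buf.subarray(Math.max(0, buf.length - 1024)).toString('latin1');
  return head === '%PDF' && tail.includes('%%EOF');
}

// Tiny stand-in for a real file; not a valid PDF, just the markers.
const complete = Buffer.from('%PDF-1.4\n1 0 obj\n<< >>\nendobj\nstartxref\n9\n%%EOF\n');
const truncated = complete.subarray(0, complete.length - 10);

console.log(looksLikeCompletePdf(complete), looksLikeCompletePdf(truncated));
```

In practice the buffer would come from something like `fs.readFileSync(filename)` inside the `ws` 'close' handler. A file that passes this check can still be damaged elsewhere, so it only catches the missing-tail case described here.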

I'll try your suggestion anyway, and see if it makes any difference.

Matt.

Matt

May 16, 2013, 9:42:38 AM

to nod...@googlegroups.com

FWIW that did not fix the problem.
