Bad characters in gzip stream


Joe Ferner

Jul 27, 2012, 2:44:40 PM
to nod...@googlegroups.com
Before posting an issue I wanted to discuss it here to make sure I'm doing things correctly.

What I'm seeing is occasional bad characters in my gzip file.

Here is the code I use to create the file...

var fstream = require('fstream');
var tar = require('tar');
var zlib = require('zlib');

console.log("tar gzipping " + source + " -> " + dest);
fstream.Reader({path: source, type: 'Directory'})
  .pipe(tar.Pack())
  .pipe(zlib.createGzip())
  .pipe(fstream.Writer(dest));

About 9 times out of 10 this code works fine. About 1 time in 10 the output is corrupted: three extra bytes show up near the beginning of the gzip file. It happens inconsistently and I haven't been able to reproduce it reliably. I'm running on OS X with node 0.8.3.

Bytes from a working gzip it creates look like this...
1F 8B 08 00 00 00 00 00 00 03 EC BD

Bytes from a non-working gzip look like this...
1F EF BF BD 08 00 00 00 00 00 00 00 03 EF BF <- the EF BF BD bytes (red in the original post) should be 8B

If I replace the three bytes EF BF BD with 8B, the file works again.
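A quick way to check an output file for this kind of damage might look like the sketch below. It assumes a recent Node.js (for `Buffer.from`), and the helper names `fromHexDump`, `hasGzipMagic`, and `countReplacementSequences` are made up for illustration; they are not part of fstream, tar, or zlib. The sample bytes are the two dumps above.

```javascript
// Hypothetical helpers for inspecting a gzip file's bytes.
// None of these names come from fstream, tar, or zlib.

// Turn a spaced hex dump such as "1F 8B 08" into a Buffer.
function fromHexDump(dump) {
  return Buffer.from(dump.replace(/\s+/g, ''), 'hex');
}

// A gzip stream starts with the two magic bytes 1F 8B.
function hasGzipMagic(buf) {
  return buf.length >= 2 && buf[0] === 0x1f && buf[1] === 0x8b;
}

// Count complete EF BF BD sequences in the buffer.
function countReplacementSequences(buf) {
  const needle = Buffer.from([0xef, 0xbf, 0xbd]);
  let count = 0;
  let pos = buf.indexOf(needle);
  while (pos !== -1) {
    count += 1;
    pos = buf.indexOf(needle, pos + needle.length);
  }
  return count;
}

// The two dumps from this post.
const good = fromHexDump('1F 8B 08 00 00 00 00 00 00 03 EC BD');
const bad = fromHexDump('1F EF BF BD 08 00 00 00 00 00 00 00 03 EF BF');

console.log(hasGzipMagic(good), countReplacementSequences(good)); // true 0
console.log(hasGzipMagic(bad), countReplacementSequences(bad));   // false 1
```

The sequence count is only a heuristic: compressed data can legitimately contain EF BF BD by chance, so a nonzero count on a file with a valid header is a hint, not proof.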

Thomas Shinnick

Jul 27, 2012, 3:56:34 PM
to nod...@googlegroups.com
> Bytes from a working gzip it creates looks like this...
> 1F 8B 08 00 00 00 00 00 00 03 EC BD
>
> Bytes from a non-working gzip look like this...
> 1F EF BF BD 08 00 00 00 00 00 00 00 03 EF BF <- Red bytes should be 8B

EF BF BD is the UTF-8 encoding of the Unicode "replacement character" (U+FFFD).
    http://en.wikipedia.org/wiki/Replacement_character#Replacement_character
    https://www.google.com/search?q=EF%20BF%20BD
    https://www.google.com/search?q=0xEF+0xBF+0xBD

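Something in the pipeline is treating your streamed data as text rather than binary: the 8B byte is not valid UTF-8 on its own, so a decoder substitutes the replacement character for it. I don't know why a UTF-8 encoder/decoder would be called here.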
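To illustrate the effect, here is a minimal sketch that pushes the first bytes of a gzip header through a UTF-8 decode and re-encode. It assumes a recent Node.js (for `Buffer.from`), and it only simulates the symptom; it does not show where in fstream, tar, or zlib such a conversion might actually happen. The `toHex` helper is made up for display purposes.

```javascript
// First bytes of a valid gzip header, as in the working dump.
const header = Buffer.from([0x1f, 0x8b, 0x08, 0x00]);

// Simulate a stage that wrongly decodes binary data as UTF-8 text
// and then encodes the resulting string back to bytes.
const roundTripped = Buffer.from(header.toString('utf8'), 'utf8');

// Hypothetical display helper: format a buffer as spaced uppercase hex.
function toHex(buf) {
  return Array.from(buf, (b) => b.toString(16).toUpperCase().padStart(2, '0')).join(' ');
}

console.log(toHex(header));       // 1F 8B 08 00
console.log(toHex(roundTripped)); // 1F EF BF BD 08 00
```

The lone 8B becomes U+FFFD when decoded, and U+FFFD encodes to EF BF BD, which matches the start of the broken dump.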

I am not conversant with these npm modules (fstream, etc.). I am sure someone who has used them would know immediately whether any flags need to be set in your calls.

Isaac Schlueter

Jul 29, 2012, 2:34:09 PM
to nod...@googlegroups.com
Joe,

Something is definitely amiss here. npm does basically exactly this
very frequently, and pretty much flawlessly these days, so my guess is
that it's either something wrong in your code or an edge case
that is not encountered very often.

Can you share the folder contents and the rest of the program? I'd be
happy to take a look. Fstream, node-tar, and the builtin zlib
bindings are all written by me, so if it's not your bug, it's
definitely mine :)