I'm using Compress::Zlib for the first time. I'm working with strings
that are zlib-compressed but not fully RFC 1950 compliant. I'm getting
some errors and, although I have it working, I know there has to be
a better way.
I started pretty simple:
my ($inflate,$status);
($inflate,$status) = inflateInit();
while ($i<3)
{
print "$i\n";
# the strings are coming from an internet socket.
# i can probably get some sample lines if needed
read($socket,$line,$number);
($newline,$status) = $inflate->inflate($line);
print "line -> $newline\n";
print "status -> $status\n";
print "msg -> " , $inflate->msg() , "\n";
$i++;
}
and got these results
0
line ->
status -> data error
msg -> incorrect header check
1
line ->
status -> data error
msg -> incorrect header check
2
line ->
status -> data error
msg -> incorrect header check
So I did some research and found out about adding the header to the string:
my ($inflate,$status);
($inflate,$status) = inflateInit();
while ($i<3)
{
print "$i\n";
# the strings are coming from an internet socket.
# i can probably get some sample lines if needed
read($socket,$line,$number);
$line = "\x78\x01$line"; # i added this line here
($newline,$status) = $inflate->inflate($line);
print "line -> $newline\n";
print "status -> $status\n";
print "msg -> " , $inflate->msg() , "\n";
$i++;
}
and did a little better
0
line -> 06710002515770TradePAC USD17393883907^PDP.IV
status ->
msg ->
1
line ->
status -> data error
msg -> incorrect data check
2
line ->
status -> data error
msg -> incorrect data check
So I adjusted the header thusly
my ($inflate,$status);
($inflate,$status) = inflateInit();
while ($i<3)
{
print "$i\n";
# the strings are coming from an internet socket.
# i can probably get some sample lines if needed
read($socket,$line,$number);
$line = "\x78\x01$line" if $i==0; # i altered this line here
($newline,$status) = $inflate->inflate($line);
print "line -> $newline\n";
print "status -> $status\n";
print "msg -> " , $inflate->msg() , "\n";
$i++;
}
but same results
line -> 06710002515770TradePAC USD17393883907^PDP.IV
status ->
msg ->
1
line ->
status -> data error
msg -> incorrect data check
2
line ->
status -> data error
msg -> incorrect data check
Now, here is what I finally got working:
while ($i<3)
{
print "$i\n";
my ($inflate,$status);
($inflate,$status) = inflateInit(); # moved inflateInit here
# the strings are coming from an internet socket.
# i can probably get some sample lines if needed
read($socket,$line,$number);
$line = "\x78\x01$line"; # always add header
($newline,$status) = $inflate->inflate($line);
print "line -> $newline\n";
print "status -> $status\n";
print "msg -> " , $inflate->msg() , "\n";
$i++;
}
but it has to be inefficient always running inflateInit.
So can somebody help me figure out what I'm doing wrong?
Thanks.
--hymie! http://lactose.homelinux.net/~hymie hy...@lactose.homelinux.net
------------------------ Without caffeine for 546 days ------------------------
Convicted of a crime I didn't even commit! Attempted Murder -- now, honestly,
what is that? Do they give a Nobel Prize for Attempted Chemistry? Do they?
-- Sideshow Bob (The Simpsons)
-------------------------------------------------------------------------------
Do you have any more info on the structure of the data in the socket? For
example, is there any significance to you reading $number bytes at a time
from the socket?
If you can get it to work by prefixing every $number bytes with "\x78\x01"
it sounds like you are dealing with one or more RFC1951 data streams.
If you want Compress::Zlib to read an RFC1951 stream, add the WindowBits
parameter like this when you create the inflation object:
($inflate,$status) = inflateInit(WindowBits => -MAX_WBITS);
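To illustrate the WindowBits setting, here is a minimal round-trip sketch. It assumes the data is a single, complete raw deflate (RFC 1951) stream; the sample string and variable names are made up for the example. A negative WindowBits value tells zlib to expect no RFC 1950 header and no trailing checksum.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Hypothetical sample payload, not taken from the real feed.
my $original = "sample trade message";

# Produce a raw deflate stream (no zlib header, no Adler-32 trailer).
my ($deflate, $dstatus) = deflateInit(-WindowBits => -MAX_WBITS);
die "deflateInit failed: $dstatus\n" unless $dstatus == Z_OK;
my ($body) = $deflate->deflate($original);
my ($tail) = $deflate->flush();
my $compressed = $body . $tail;

# Inflate it again with the matching negative WindowBits.
my ($inflate, $istatus) = inflateInit(-WindowBits => -MAX_WBITS);
die "inflateInit failed: $istatus\n" unless $istatus == Z_OK;

# inflate() consumes its argument, so hand it a copy.
my $input = $compressed;
my ($plain, $status) = $inflate->inflate($input);
die "inflate failed: $status\n"
    unless $status == Z_OK || $status == Z_STREAM_END;

print "$plain\n";
```

If the same object is created without the negative WindowBits, zlib looks for the two-byte RFC 1950 header first, which is consistent with the "incorrect header check" message reported earlier in the thread.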
An alternative is to use one of the IO::Uncompress modules that
Compress::Zlib now uses behind the scenes. So, for example, if you expect a
single RFC1951 compressed data stream from the socket, the code below will
read it, uncompress it, and write the result into $data:
use IO::Uncompress::RawInflate qw(:all);
my $data ;
rawinflate $socket => \$data
or die "Cannot uncompress: $RawInflateError\n";
Paul
>Do you have any more info on the structure of the data in the socket? For
>example, is there any significance to you reading $number bytes at a time
>from the socket?
Data spec, including Java sample code (I don't speak Java):
The basic structure of the data is:
<blockTime><blockSequence><blockSize><block>...
blockTime = a 64-bit long representing the time of the block in
milliseconds since the epoch.
blockSequence = a 32-bit int representing a sequence number for the
block of data (see "id" parameter above).
blockSize = a 32-bit int representing the size of the compressed block
in bytes.
block = a compressed chunk of data.
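As a rough illustration of the layout above, the 16-byte prefix can be decoded with unpack. This sketch assumes big-endian (network) byte order, which the spec excerpt does not state, and it uses made-up sample values. The 64-bit time is read as two 32-bit halves so that it does not depend on 64-bit pack support.

```perl
use strict;
use warnings;

# Hypothetical sample values, only for demonstration.
my $time_ms  = 1_200_000_000_123;   # milliseconds since the epoch
my $sequence = 42;
my $size     = 57;

# Split the 64-bit time into high and low 32-bit halves.
my $time_hi = int($time_ms / 2**32);
my $time_lo = $time_ms - $time_hi * 2**32;

# Build a header the way the feed is assumed to send it (big-endian).
my $header = pack "NNNN", $time_hi, $time_lo, $sequence, $size;

# Decode it again: blockTime (two halves), blockSequence, blockSize.
my ($hi, $lo, $seq, $block_size) = unpack "NNNN", $header;
my $block_time = $hi * 2**32 + $lo;

printf "time=%s sequence=%d size=%d\n", $block_time, $seq, $block_size;
```

If the feed turned out to be little-endian, the "N" templates would become "V" and the two time halves would arrive low half first.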
You will need to take the compressed block of data and uncompress it
to produce workable raw data. You should be able to use a standard ZLIB
compression library to do this. This functionality is usually found in
your existing development environment. In Java, ZLIB is built into the
VM itself. You need to hand the data block to a java.util.zip.Inflater.
Example:
Inflater decompresser = new Inflater();
decompresser.setInput(compressedBlock, 0, compressedDataLength);
byte[] result = new byte[2048];
int resultLength = decompresser.inflate(result);
decompresser.end();
String message = new String(result, 0, resultLength, "UTF-8");
>If you can get it to work by prefixing every $number bytes with "\x78\x01"
>it sounds like you are dealing with one or more RFC1951 data streams.
No. It only works if I re-run inflateInit and *then* prefix the
string with \x78\x01
>If you want Compress::Zlib to read an RFC1951 stream, add the WindowBits
>parameter like this when you create the inflation object
>
> ($inflate,$status) = inflateInit(WindowBits => -MAX_WBITS);
That works well for my first data chunk, but after that I get:
1
line ->
status -> stream end
2
line ->
status -> stream end
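The "stream end" status suggests that each block is its own complete deflate stream, so an inflation object that has finished one block cannot simply carry on with the next. A minimal sketch of the per-block approach follows, assuming every block is an independent raw deflate stream; the helper names raw_deflate and raw_inflate_block are invented for this example.

```perl
use strict;
use warnings;
use Compress::Zlib;

# Hypothetical helper: compress one string into a raw deflate stream.
sub raw_deflate {
    my ($text) = @_;
    my ($d, $status) = deflateInit(-WindowBits => -MAX_WBITS);
    die "deflateInit failed: $status\n" unless $status == Z_OK;
    my ($out)  = $d->deflate($text);
    my ($tail) = $d->flush();
    return $out . $tail;
}

# Hypothetical helper: inflate one complete block with a fresh object.
sub raw_inflate_block {
    my ($block) = @_;
    my ($i, $status) = inflateInit(-WindowBits => -MAX_WBITS);
    die "inflateInit failed: $status\n" unless $status == Z_OK;
    my ($plain, $istatus) = $i->inflate($block);
    die "inflate failed: $istatus\n" unless $istatus == Z_STREAM_END;
    return $plain;
}

# Two independent streams, standing in for two blocks from the feed.
my @blocks = map { raw_deflate($_) } ("first block", "second block");
my @plain  = map { raw_inflate_block($_) } @blocks;

print "$_\n" for @plain;
```

Creating the object per block is therefore not wasted work here: it matches the way the data was compressed.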
> use IO::Uncompress::RawInflate qw(:all)
>
> my $data ;
> rawinflate $socket => \$data
> or die "Cannot uncompress: $RawInflateError\n";
Again, this works well for my first data chunk, but doesn't read the
cleartext data preceding the second data chunk:
0
line -> 06710003526346TradeNLS USD11221066211JRN
status ->
1
The key part I didn't know was you are dealing with a sequence of compressed
RFC 1951 data streams that are each prefixed with a bespoke header. So that
means you need to deal with reading both compressed & non-compressed data
from the socket. Is there a link you can post that describes this feed in
more detail?
So, assuming nothing else comes down the socket, you could try something like
this. You may need to change the final "N" in the unpack call to a "V" if
the block size is sent in little-endian rather than network byte order.
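use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

my ($header, $data);
my $header_size = 8 + 4 + 4; # time + sequence + size
while (read($socket, $header, $header_size) == $header_size)
{
    # the 64-bit time is unpacked as two 32-bit halves
    my ($time1, $time2, $sequence, $size) = unpack "NNNN", $header;

    # read exactly $size compressed bytes from the socket and inflate them
    rawinflate $socket => \$data, InputLength => $size
        or die "Cannot uncompress: $RawInflateError\n";
    print "$data\n";
}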
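To try the loop above without the live feed, the framing can be simulated with a temporary file standing in for the socket. This is only a sketch: the messages, the zeroed time fields, and the big-endian header are assumptions, and the real feed may behave differently.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::RawDeflate   qw(rawdeflate $RawDeflateError);
use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

# Build a fake feed: header (time, sequence, size) followed by a raw
# deflate block, repeated once per message. Messages are made up.
my @messages = ("first message", "second message");
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;

my $sequence = 0;
for my $message (@messages) {
    my $block;
    rawdeflate \$message => \$block
        or die "Cannot compress: $RawDeflateError\n";
    # Time fields are left at zero; only the size matters for framing.
    print {$fh} pack("NNNN", 0, 0, $sequence++, length $block), $block;
}
seek $fh, 0, 0 or die "Cannot rewind: $!\n";

# Read the feed back: fixed-size header, then exactly $size bytes.
my @decoded;
my $header_size = 8 + 4 + 4;
while (read($fh, my $header, $header_size) == $header_size) {
    my ($time_hi, $time_lo, $seq, $size) = unpack "NNNN", $header;
    my $data;
    rawinflate $fh => \$data, InputLength => $size
        or die "Cannot uncompress: $RawInflateError\n";
    push @decoded, $data;
}

print "$_\n" for @decoded;
```

InputLength is what keeps rawinflate from reading past the end of the current block into the next header.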
cheers
Paul
>The key part I didn't know was you are dealing with a sequence of compressed
>RFC 1951 data streams that are each prefixed with a bespoke header. So that
>means you need to deal with reading both compressed & non-compressed data
>from the socket.
Sorry. I didn't realize that the data and the acquisition of the data
were so badly intertwined.
>my $header;
>my $header_size = 8 + 4 + 4; # time + sequence + size
>while (read($socket, $header, $header_size) == $header_size)
>{
>
> my ($time1, $time2, $sequence, $size) = unpack "NNNN", $header;
>
> rawinflate $socket => \$data, InputLength => $size
> or die "Cannot uncompress: $RawInflateError\n";
> print "$data\n";
>}
This is perfect!!!
Thank you so much for your help.