typesize


KHR

Mar 23, 2011, 11:29:32 PM
to blosc

In the code below I use blosc to serialize some lists of floats.
blosc appears to be very fast while at the same time providing good
compression. It looks like a very interesting package.

My question is about the typesize parameter. What exactly does this
parameter mean?
Is 8 the correct value to use for a list of floats?


import sys
import time
import struct
import bz2
import zlib
import blosc
import random

n=500000


a = map(float,range(n))                      # linearly increasing floats
b = [1.1]*n                                  # repeated value
c = [random.uniform(1,2) for i in range(n)]  # random values
print
print 'Length of lists'
print '%8d %8d %8d\n'%(len(a),len(b),len(c))


a1 = struct.pack('%df'%n, *a )
b1 = struct.pack('%df'%n, *b )
c1 = struct.pack('%df'%n, *c )
print 'struct - length of strings'
print '%8d %8d %8d'%(len(a1),len(b1),len(c1))
print


compresslevel = 5

ts = time.time()
a2 = bz2.compress(a1,compresslevel)
b2 = bz2.compress(b1,compresslevel)
c2 = bz2.compress(c1,compresslevel)
print 'bz2 %6.3f sec'%(time.time()-ts)
print '%8d %8d %8d\n'%(len(a2),len(b2),len(c2))

ts = time.time()
a3 = zlib.compress(a1,compresslevel)
b3 = zlib.compress(b1,compresslevel)
c3 = zlib.compress(c1,compresslevel)
print 'zlib %6.3f sec'%(time.time()-ts)
print '%8d %8d %8d\n'%(len(a3),len(b3),len(c3))

# blosc with typesize=4 (matches the 4-byte 'f' items packed above)
typesize=4
ts = time.time()
a4 = blosc.compress(a1, typesize)
b4 = blosc.compress(b1, typesize)
c4 = blosc.compress(c1, typesize)
print 'blosc (typesize=%d) %6.3f sec'%(typesize,time.time()-ts)
print '%8d %8d %8d\n'%(len(a4),len(b4),len(c4))

# blosc with typesize=8, for comparison
typesize=8
ts = time.time()
a5 = blosc.compress(a1, typesize)
b5 = blosc.compress(b1, typesize)
c5 = blosc.compress(c1, typesize)
print 'blosc (typesize=%d) %6.3f sec'%(typesize,time.time()-ts)
print '%8d %8d %8d\n'%(len(a5),len(b5),len(c5))

Francesc Alted

Mar 24, 2011, 4:09:30 AM
to bl...@googlegroups.com
On Thursday 24 March 2011 04:29:32 KHR wrote:

> In the code below I use blosc to serialize some lists of floats.
> blosc appears to be very fast while at the same time providing good
> compression. It looks like a very interesting package.
>
> My question is about the typesize parameter. What exactly is this
> parameter?
> Is the value 8 the correct value to use for a list of floats?
>
>
> import sys
> import time
> import struct
> import bz2
> import zlib
> import blosc
> import random
>
> n=500000
>
>
> a = map(float,range(n))                      # linearly increasing floats
> b = [1.1]*n                                  # repeated value
> c = [random.uniform(1,2) for i in range(n)]  # random values
> print
> print 'Length of lists'
> print '%8d %8d %8d\n'%(len(a),len(b),len(c))
>
>
> a1 = struct.pack('%df'%n, *a )
[clip]

No, in this case the correct typesize is 4 bytes (the 'f' format in
struct packs each value as a 4-byte float).
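
One way to avoid guessing the value is to ask struct for the item size of
the format character that was used for packing. The snippet below is only a
minimal sketch in Python 3 syntax (the code in the original post is
Python 2); the names n, values, packed_f, packed_d, typesize_f and
typesize_d are made up for illustration, and n is much smaller than in the
original post.

```python
import struct

n = 1000  # illustrative size, smaller than the 500000 used in the thread
values = [float(i) for i in range(n)]

# 'f' packs each value as a C float, 'd' packs it as a C double.
packed_f = struct.pack('%df' % n, *values)
packed_d = struct.pack('%dd' % n, *values)

# struct.calcsize reports the number of bytes one item of a format occupies.
typesize_f = struct.calcsize('f')
typesize_d = struct.calcsize('d')

# The buffer length divided by the item count gives the same per-item size.
print('f:', typesize_f, len(packed_f) // n)
print('d:', typesize_d, len(packed_d) // n)

# Assumption: with python-blosc installed, the matching value would then be
# passed along with the buffer, e.g.
#     blosc.compress(packed_f, typesize=typesize_f)
```

So a buffer packed with 'f' would go with typesize 4, while a buffer packed
with 'd' (the same precision as a Python float) would go with typesize 8.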

--
Francesc Alted
