Also, can anyone recommend an efficient method of compressing
timestamps which could be expressed as string or numerically, ie.
11:35:22.244, disitribution of values is random.
thanks
Why would you believe that your typical input sequence will have a
simple generating function? What is your input data? Unless you
have very special input, the generating function will be as complex
to represent as the original input.
Why do you believe that the sequence of primes does not have a
generating function? It does. It's just not writable with the
"usual" transcendental functions or as a rational or otherwise
"nice" function. Here we go:
primes(z) = 2 + 3z + 5 z^2 + 7 z^3 + 11 z^4 + ...
> Also, can anyone recommend an efficient method of compressing
> timestamps which could be expressed as string or numerically, ie.
> 11:35:22.244, disitribution of values is random.
Then, you need log(24 * 60 * 60) bits in total, ignoring leap-seconds.
If you have milliseconds in your time stamp, you need log(24*60*60*1000)
bits. The algorithm for optimal compression is simple: Convert your
timestamp into the number of milliseconds from an average starting time,
and represent this number with the number of bits computed above, which
is clearly sufficient. This is the (a) optimal representation for random
distribution. (Homework: Show why!)
So long,
Thomas
> Did anybody experiment with applying generating functions to
> compression? Clearly, not all sequences will have a power series
> expansion (ie. sequence of primes) but many do.
I've seen something like this at:
http://www.oberhumer.com/products/lzo-professional/
described as "Data To Code Transformations (D2CT)", but I couldn't get in
touch with them (not only me...) to get any more information, so if you
want to get more information you should probably fake NASA employee because
they don't talk with an ordinary people...
--
Bartosz Wójcik | www.pelock.com
Sounds like snake oil to me. No products available for testing. If they
enhanced the LZO code, their code is GPL'd, too, so the sources will
have to be made available. I somehow doubt that this will happen...
So long,
Thomas
Do what? They're allowed to license their own code under any
other license they see fit, including having it closed source.
Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration
Thomas, what do you mean by log(24 * 60 * 60*1000) bits. That's ~8
bits per what?
Obviously, given random distribution, the only approach is to
represent 1ms with 1bit. So that's 24 * 60 * 60 * 1000 bits
Bits per timestamp.
> Obviously, given random distribution, the only approach is to
> represent 1ms with 1bit. So that's 24 * 60 * 60 * 1000 bits
No, the problem is not to encode events or one bit per event, but to encode a given time
with a given resolution.
So long,
Thomas
No, allow me to disagree. The problem is compress data without loss.
My example
is about time sequence, at a millisecond resolution. Details are
insignificant.
I thought somebody knew a clever way to compress set of timestamps.
It appears there isn't a way.
There might be a way, but you need to know more about the data you
compress. If all you have is a set of random time stamps, there is no
better way indeed. If you have a sequence of data, not sampled at
*random*, then there's hope, but you didn't say so.
So long,
Thomas
Time stamps are just data. You can't compress data.
Now if you know something special about the distribution of
the data, then perhaps something can be done. However, you've
not mentioned any such special distribution so we have to
presume there's none. I know that in my inertial reference
frame all microseconds are of equal length, and thus it's
equally likely that an arbitrary event will occur in each
microsecond.