Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

generating function algo

1 view
Skip to first unread message

BrainDumb

unread,
May 15, 2008, 3:01:41 PM5/15/08
to
Did anybody experiment with applying generating functions to
compression? Clearly, not all sequences will have a power series
expansion (ie. sequence of primes) but many do.

Also, can anyone recommend an efficient method of compressing
timestamps which could be expressed as string or numerically, ie.
11:35:22.244, disitribution of values is random.

thanks

Thomas Richter

unread,
May 15, 2008, 4:26:32 PM5/15/08
to
BrainDumb wrote:
> Did anybody experiment with applying generating functions to
> compression? Clearly, not all sequences will have a power series
> expansion (ie. sequence of primes) but many do.

Why would you believe that your typical input sequence will have a
simple generating function? What is your input data? Unless you
have very special input, the generating function will be as complex
to represent as the original input.

Why do you believe that the sequence of primes does not have a
generating function? It does. It's just not writable with the
"usual" transcendental functions or as a rational or otherwise
"nice" function. Here we go:

primes(z) = 2 + 3z + 5 z^2 + 7 z^3 + 11 z^4 + ...

> Also, can anyone recommend an efficient method of compressing
> timestamps which could be expressed as string or numerically, ie.
> 11:35:22.244, disitribution of values is random.

Then, you need log(24 * 60 * 60) bits in total, ignoring leap-seconds.
If you have milliseconds in your time stamp, you need log(24*60*60*1000)
bits. The algorithm for optimal compression is simple: Convert your
timestamp into the number of milliseconds from an average starting time,
and represent this number with the number of bits computed above, which
is clearly sufficient. This is the (a) optimal representation for random
distribution. (Homework: Show why!)

So long,
Thomas

Bartosz Wójcik

unread,
May 15, 2008, 6:34:10 PM5/15/08
to
On Thu, 15 May 2008 12:01:41 -0700 (PDT), BrainDumb wrote:

> Did anybody experiment with applying generating functions to
> compression? Clearly, not all sequences will have a power series
> expansion (ie. sequence of primes) but many do.

I've seen something like this at:

http://www.oberhumer.com/products/lzo-professional/

described as "Data To Code Transformations (D2CT)", but I couldn't get in
touch with them (not only me...) to get any more information, so if you
want to get more information you should probably fake NASA employee because
they don't talk with an ordinary people...

--
Bartosz Wójcik | www.pelock.com

Thomas Richter

unread,
May 16, 2008, 2:43:41 AM5/16/08
to
Bartosz Wójcik schrieb:

Sounds like snake oil to me. No products available for testing. If they
enhanced the LZO code, their code is GPL'd, too, so the sources will
have to be made available. I somehow doubt that this will happen...

So long,
Thomas

Phil Carmody

unread,
May 16, 2008, 5:15:22 AM5/16/08
to

Do what? They're allowed to license their own code under any
other license they see fit, including having it closed source.

Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration

BrainDumb

unread,
May 20, 2008, 5:54:30 PM5/20/08
to
On May 15, 4:26 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:

Thomas, what do you mean by log(24 * 60 * 60*1000) bits. That's ~8
bits per what?
Obviously, given random distribution, the only approach is to
represent 1ms with 1bit. So that's 24 * 60 * 60 * 1000 bits

Thomas Richter

unread,
May 21, 2008, 10:52:50 AM5/21/08
to
BrainDumb schrieb:

> On May 15, 4:26 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
>
> Thomas, what do you mean by log(24 * 60 * 60*1000) bits. That's ~8
> bits per what?

Bits per timestamp.

> Obviously, given random distribution, the only approach is to
> represent 1ms with 1bit. So that's 24 * 60 * 60 * 1000 bits

No, the problem is not to encode events or one bit per event, but to encode a given time
with a given resolution.

So long,
Thomas

BrainDumb

unread,
May 22, 2008, 8:39:49 PM5/22/08
to
> Obviously, given random distribution, the only approach is to
> > represent 1ms with 1bit. So that's 24 * 60 * 60 * 1000 bits
>
> No, the problem is not to encode events or one bit per event, but to encode a given time
> with a given resolution.

No, allow me to disagree. The problem is compress data without loss.
My example
is about time sequence, at a millisecond resolution. Details are
insignificant.

I thought somebody knew a clever way to compress set of timestamps.
It appears there isn't a way.

Thomas Richter

unread,
May 23, 2008, 2:14:27 AM5/23/08
to
BrainDumb schrieb:

There might be a way, but you need to know more about the data you
compress. If all you have is a set of random time stamps, there is no
better way indeed. If you have a sequence of data, not sampled at
*random*, then there's hope, but you didn't say so.

So long,
Thomas

Phil Carmody

unread,
May 23, 2008, 4:33:24 AM5/23/08
to

Time stamps are just data. You can't compress data.

Now if you know something special about the distribution of
the data, then perhaps something can be done. However, you've
not mentioned any such special distribution so we have to
presume there's none. I know that in my inertial reference
frame all microseconds are of equal length, and thus it's
equally likely that an arbitrary event will occur in each
microsecond.

0 new messages