Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Huffman codes and Fibonacci numbers

0 views
Skip to first unread message

Alex Vinokur

unread,
Apr 28, 1999, 3:00:00 AM4/28/99
to


Mark Nelson <ma...@tiny.com> wrote :
(See the thread titled
"LZ and Huffman encoding"
in comp.compression)

[snip]
> Limiting the length of Huffman codes is a popular topic. One very simple
> way to limit the lengths is simply to limit the maximum counter value
> on your first pass over the data. You will find that the worst case for
> lengths of Huffman codes are reached when the input counts follow a
> Fibonacci sequence.
[snip]


Walter Roberson <robe...@ibd.nrc.ca> wrote :
(See the thread titled
"Maximum height of a Huffman tree?"
in comp.compression)

[snip]
> Given a certain number of input characters, S, what is
> the worst case Huffman tree? My implicit answer had been ceil(log2(S))-1 .
> The poster showed how there were cases much worse than that, that
> occur when the frequencies are distributed with a Fibonacci-type
> sequence.
[snip]

Hi,
The Huffman codes are closely connected with the Fibonacci numbers.
Here are some aspects of such a connection.

Thanks in advance,
Alex


####################### 1 #######################

1. Preface
==========

Definition 1.1. The binary tree size is number of its terminal nodes.

Definition 1.2. The numbers sequence size is number of its elements.


Let T(n) denote binary tree of size n.
---------------------------------

Let Q(n) denote sequence of size n.
------------------------------


Definition 1.3. The sequence Q(n) is called T(n)-Huffman sequence
if T(n) is binary Huffman tree (code)
for the sequence Q(n).

Let S(Q) denote Huffman-cost of the sequence Q(n)
---------------------------------------------
(i.e., the weighted path length of the binary tree T(n)
for the sequence Q(n)).


Let M be some class of T(n)-Huffman sequences.
-----------------------------------------

Definition 1.4. T(n)-Huffman sequence Qmin(n) is called
a minimizing Huffman-sequence (on T(n)) in class M
if S(Qmin) <= S(Q) for all Q belonging to M.

Definition 1.5. T(n)-Huffman sequence Qmax(n) is called
a maximizing Huffman-sequence (on T(n)) in class M
if S(Qmax) >= S(Q) for all Q belonging to M.


Definition 1.6. A binary tree is called elongated
if at least one of any two adjacent nodes is terminal.
An elongated binary tree is called left-sided
if left node in each pair of adjacent nodes is terminal.

Note. Elongated binary tree of size n has maximum height
among all binary trees of size n.

Let L(n) denote left-sided binary tree of size n.
---------------------------------------------


Let F(n) denote n-th Fibonacci number.
----------------------------------


-----------------------------------------------------------------
----------------- Example 1.1 -----------------------------------
----------------- Left-sided Huffman binary tree ----------------
----------------- for the sequence {w(1), w(2), ..., w(n)} ------
----------------- (n = 6) ---------------------------------------


s5 l(i) = the path length from the tree root
/\ to the i-th terminal node
/ \
s4 w(6) w(i) = the i-th terminal node weight
/\
/ \ n = size(T) = number of terminal nodes
s3 w(5) (Huffman code size)
/\
/ \ h(T) = max l(i), i = {1, ..., n}
s2 w(4) i
/\
/ \ E(T) = sum l(i), i = {1, ..., n}
s1 w(3) i
/\
/ \ S(T) = sum (w(i) * l(i)), i = {1, ..., n}
w(1) w(2) i


h(T) is a the height of the binary tree T.

E(T) is a the path length of the binary tree T.

S(T) is a the weighted path length of the binary tree T
for the sequence {w(1), w(2), ..., w(n)}


s1 = w(1) + w(2)
s2 = s1 + w(3)
s3 = s2 + w(4)
s4 = s3 + w(5)
s5 = s4 + w(6)

Tree 1.1 - Left-sided Huffman binary tree T

-----------------------------------------------------------------


####################### 2 #######################

2. Huffman codes (trees) and minimizing properties of Fibonacci numbers.
====================================================================

Let 1) B = {b1, b2, ..., b#n} be a non-decreasing sequence
of positive integers; size (B) = n.
2) C(B,k) = {c1, c2, ..., c#(n-k)} be a non-decreasing sequence
after k-th stage of the Huffman algorithm (k = 0, 1, ..., n-1);
size (C(B,k)) = n-k.

Example.
B = 3, 4, 6, 9, 10, 20
C(B,1) = 6, (7), 9, 10, 20
C(B,2) = 9, 10, (13), 20
C(B,3) = (13), (19), 20
C(B,4) = 20, (32)
C(B,5) = (52)

Definition 2.1. A sequence B is called _strict_
if c2 < c3 in each sequence C(B,k) (k = 1, ..., n-3)

Definition 2.2. A _strict_ sequence B is called _absolutely_strict_
if b2 < b3.

Definition 2.3. A sequence B is called _non__strict_
if there is a sequence C(B,k) ( 1 <= k <= (k-3) )
in which c2 = c3.

Definition 2.4. A _non_strict_ sequence B is called _alternative_
(or _absolutely_non_strict_)
if c2 = c3 in every sequence C(B,k) ( 1 <= k <= (k-3) )
and b2 = b3.


Let 1) INT_SEQ_SET_1 denote the set
of such _absolutely_strict_ sequences for which
the binary Huffman tree (code) is elongated;
2) INT_SEQ_SET_2 denote the set
of such _strict_ sequences for which
the binary Huffman tree (code) is elongated;
3) INT_SEQ_SET_3 denote the set
of all (_strict_ and _non__strict_) sequences for which
the binary Huffman tree (code) is elongated.

Let Phi = (1+sqrt(5))/2.
-------------------

Theorem 2.1. A Fibonacci sequence U(n) = {F(1), F(2), ..., F(n)}
is a minimizing Huffman-sequence
on a left-sided binary tree L(n)
in the class INT_SEQ_SET_1
(See Definition 1.4 and Definition 2.2).
The Huffman-cost S(U(n)) = F(n+4) - (n+4).
S(U(n)) asymptotic = Phi^(n+4))/sqrt(5).


Theorem 2.2. A sequence V(n) = {v(1), v(2), ..., v(n)}
such that v(1) = v(2) = v(3) = 1,
v(i) = F(i) - F(i-4), i >= 4, F(0) = 0,
is a minimizing Huffman-sequence
on a left-sided binary tree L(n)
in the class INT_SEQ_SET_2
(See Definition 1.4 and Definition 2.1).
The Huffman-cost S(V(n)) = F(n+4) - F(n) - (n+3).
S(V(n)) asymptotic = ((Phi^n) * (Phi^4 - 1))/ sqrt(5).


Theorem 2.3. A sequence Y(n) = {y(1), y(2), ..., y(n)}
such that y(1) = 1,
y(i) = F(i-1), i >= 2.
is a minimizing Huffman-sequence
on a left-sided binary tree L(n)
in the class INT_SEQ_SET_3
(See Definition 1.4 and Definitions 2.1 - 2.3).
The Huffman-cost S(Y(n)) = F(n+3) - 3.
S(Y(n)) asymptotic = Phi^(n+3))/sqrt(5).

Corollary 2.1. The non-strict sequence Y(n) (See Theorem 2.3)
is alternative (See Definition 2.4).

Corollary 2.2. For the alternative sequence Y(n), n >= 3
(See Theorem 2.3) there exist 2^(n-3) different
elongated binary Huffman trees
(one of them is left-sided).

-----------------------------------------------------------------
----------------- Example 2.1 -----------------------------------
----------------- Minimizing Huffman-sequence -------------------
----------------- on a left-sided binary tree L(n) --------------
----------------- in the class INT_SEQ_SET_1 --------------------
----------------- (n = 6) ---------------------------------------
----------------- (See Theorem 2.1) -----------------------------


20 l(1) = 5, l(2) = 5, l(3) = 4.
/\ l(4) = 3, l(5) = 2, l(6) = 1
/ \
12 8 w(1) = 1, w(2) = 1, w(3) = 2,
/\ w(4) = 3, w(5) = 5, w(6) = 8
/ \
7 5 n = size(T) = 6
/\
/ \ h(T) = max l(i) = 5
4 3 i
/\
/ \ E(T) = sum l(i) = 20
2 2 i
/\
/ \ S(T) = sum (w(i) * l(i)) = 45
1 1 i

Note! S(T) = F(n+4) - (n+4)
= F(10) - 10

Tree 2.1 - Left-sided Huffman binary tree
for minimizing Huffman-sequence
in the class INT_SEQ_SET_1

Huffman algorithm stages :
Source sequence U : 1, 1, 2, 3, 5, 8 u(2) < u(3)
After stage#1 C(U,1) : 2, (2), 3, 5, 8 c(2) < c(3)
After stage#2 C(U,2) : 3, (4), 5, 8 c(2) < c(3)
After stage#3 C(U,3) : 5, (7), 8 c(2) < c(3)
After stage#4 C(U,4) : 8, (12)
After stage#5 C(U,5) : (20)


-----------------------------------------------------------------


-----------------------------------------------------------------
----------------- Example 2.2 -----------------------------------
----------------- Minimizing Huffman-sequence -------------------
----------------- on a left-sided binary tree L(n) --------------
----------------- in the class INT_SEQ_SET_2 --------------------
----------------- (n = 6) ---------------------------------------
----------------- (See Theorem 2.2) -----------------------------

17 l(1) = 5, l(2) = 5, l(3) = 4.
/\ l(4) = 3, l(5) = 2, l(6) = 1
/ \
10 7 w(1) = 1, w(2) = 1, w(3) = 1,
/\ w(4) = 3, w(5) = 4, w(6) = 7
/ \
6 4 n = size(T) = 6
/\
/ \ h(T) = max l(i) = 5
3 3 i
/\
/ \ E(T) = sum l(i) = 20
2 1 i
/\
/ \ S(T) = sum (w(i) * l(i)) = 38
1 1 i

Note! S(T) = F(n+4) - F(n) - (n+3)
= F(10) - F(6) - 9

Tree 2.2 - Left-sided Huffman binary tree
for minimizing Huffman-sequence
in the class INT_SEQ_SET_2


Huffman algorithm stages :
Source sequence V : 1, 1, 1, 3, 4, 7 v(2) = v(3)
After stage#1 C(V,1) : 1, (2), 3, 4, 7 c(2) < c(3)
After stage#2 C(V,2) : 3, (3), 4, 7 c(2) < c(3)
After stage#3 C(V,3) : 4, (6), 7 c(2) < c(3)
After stage#4 C(V,4) : 7, (10)
After stage#5 C(V,5) : (17)


-----------------------------------------------------------------

-----------------------------------------------------------------
----------------- Example 2.3.1 ---------------------------------
----------------- Minimizing Huffman-sequence -------------------
----------------- on a left-sided binary tree L(n) --------------
----------------- in the class INT_SEQ_SET_3 --------------------
----------------- (n = 6) ---------------------------------------
----------------- (See Theorem 2.3) -----------------------------


13 l(1) = 5, l(2) = 5, l(3) = 4.
/\ l(4) = 3, l(5) = 2, l(6) = 1
/ \
8 5 w(1) = 1, w(2) = 1, w(3) = 1,
/\ w(4) = 2, w(5) = 3, w(6) = 5
/ \
5 3 n = size(T) = 6
/\
/ \ h(T) = max l(i) = 5
3 2 i
/\
/ \ E(T) = sum l(i) = 20
2 1 i
/\
/ \ S(T) = sum (w(i) * l(i)) = 31
1 1 i

Note! S(T) = F(n+3) - 3
= F(9) - 3

Tree 2.3.1 - Left-sided Huffman binary tree
for minimizing Huffman-sequence
in the class INT_SEQ_SET_3


Huffman algorithm stages :
Source sequence Y : 1, 1, 1, 2, 3, 5 y(2) = y(3)
After stage#1 C(Y,1) : 1, (2), 2, 3, 5 c(2) = c(3)
After stage#2 C(Y,2) : 2, (3), 3, 5 c(2) = c(3)
After stage#3 C(Y,3) : 3, (5), 5 c(2) = c(3)
After stage#4 C(Y,4) : 5, (8)
After stage#5 C(Y,5) : (13)

-----------------------------------------------------------------

-----------------------------------------------------------------
----------------- Example 2.3.2 ---------------------------------
----------------- Non-minimizing Huffman-sequence ---------------
----------------- on a left-sided binary tree L(n) --------------
----------------- in the class INT_SEQ_SET_3 --------------------
----------------- (n = 6) ---------------------------------------


14 l(1) = 5, l(2) = 5, l(3) = 4.
/\ l(4) = 3, l(5) = 2, l(6) = 1
/ \
8 6 w(1) = 1, w(2) = 1, w(3) = 1,
/\ w(4) = 2, w(5) = 3, w(6) = 6
/ \
5 3 n = size(T) = 6
/\
/ \ h(T) = max l(i) = 5
3 2 i
/\
/ \ E(T) = sum l(i) = 20
2 1 i
/\
/ \ S(T) = sum (w(i) * l(i)) = 32
1 1 i


Tree 2.3.2 - Left-sided Huffman binary tree
for non-minimizing Huffman-sequence
in the class INT_SEQ_SET_3


Huffman algorithm stages :
Source sequence X : 1, 1, 1, 2, 3, 6 x(2) = x(3)
After stage#1 C(X,1) : 1, (2), 2, 3, 6 c(2) = c(3)
After stage#2 C(X,2) : 2, (3), 3, 6 c(2) = c(3)
After stage#3 C(X,3) : 3, (5), 6 c(2) < c(3)
After stage#4 C(X,4) : 6, (8)
After stage#5 C(X,5) : (14)


-----------------------------------------------------------------


####################### 3 #######################

3. Huffman codes (trees) and maximizing properties of Fibonacci numbers.
====================================================================

Definition 3.1. A sequence P = {p1, p2, ..., p#n} of positive numbers
is called non-decreasing normalized sequence
if sum (p#i) = 1, i = {1, 2, ..., n}; size (P) = n.
i

Let NORM_SEQ_SET denote the set
of all non-decreasing normalized sequences (of positive numbers)
for which the binary Huffman tree (code) is elongated.

Theorem 3.1. A sequence Z(n) = {z(1), z(2), ..., z(n)}
such that z(1) = 1/F(n+1),
z(i) = F(i-1)/F(n+1), i >= 2.
is a maximizing Huffman-sequence
on a left-sided binary tree L(n) in the class NORM_SEQ_SET
(See Definition 1.5).
The Huffman-cost S(Z(n)) = 2 + (F(n) - 3)/F(n+1).
n -> Infinity : lim S(Z(n)) = (3 + sqrt(5))/2 = 2.6180

-----------------------------------------------------------------
----------------- Example 3.1.1 ---------------------------------
----------------- Maximizing Huffman-sequence -------------------
----------------- on a left-sided binary tree L(n) --------------
----------------- in the class NORM_SEQ_SET ---------------------
----------------- (n = 6) ---------------------------------------
----------------- (See Theorem 3.1) -----------------------------

* l(1) = 5, l(2) = 5, l(3) = 4.
/\ l(4) = 3, l(5) = 2, l(6) = 1
/ \
* 5/13 w(1) = 1/13, w(2) = 1/13, w(3) = 1/13,
/\ w(4) = 2/13, w(5) = 3/13, w(6) = 5/13
/ \
* 3/13 n = size(T) = 6
/\
/ \ h(T) = max l(i) = 5
* 2/13 i
/\
/ \ E(T) = sum l(i) = 20
* 1/13 i
/\ 31
/ \ S(T) = sum (w(i) * l(i)) = -- = 2.3846
1/13 1/13 i 13

Note! S(T) = 2 + (F(n) - 3)/F(n+1)
= 2 + (F(6) - 3)/F(7)

Tree 3.1.1 - Left-sided Huffman binary tree
for maximizing Huffman-sequence
in the class NORM_SEQ_SET

Huffman algorithm stages : See Example 2.3.1

-----------------------------------------------------------------

-----------------------------------------------------------------
----------------- Example 3.1.2 ---------------------------------
----------------- Non-maximizing Huffman-sequence ---------------
----------------- on a left-sided binary tree L(n) --------------
----------------- in the class NORM_SEQ_SET ---------------------
----------------- (n = 6) ---------------------------------------

* l(1) = 5, l(2) = 5, l(3) = 4.
/\ l(4) = 3, l(5) = 2, l(6) = 1
/ \
* 6/14 w(1) = 1/14, w(2) = 1/14, w(3) = 1/14,
/\ w(4) = 2/14, w(5) = 3/14, w(6) = 6/14
/ \
* 3/14 n = size(T) = 6
/\
/ \ h(T) = max l(i) = 5
* 2/14 i
/\
/ \ E(T) = sum l(i) = 20
* 1/14 i
/\ 32
/ \ S(T) = sum (w(i) * l(i)) = -- = 2.2857
1/14 1/14 i 14


Tree 3.1.2 - Left-sided Huffman binary tree
for non-maximizing Huffman-sequence
in the class NORM_SEQ_SET

Huffman algorithm stages : See Example 2.3.2

-----------------------------------------------------------------

####################### 4 #######################

4. Huffman codes (trees) and contrast properties of Fibonacci numbers.
===================================================================

By Theorem 2.3, the integers sequence Y(n) = {1, F(1), F(2), ..., F(n-1)}
is minimizing integers Huffman-sequence of size n
on a left-sided binary tree L(n).
(See Example 2.3.1 and Example 2.3.2).


When the the sequence Y(n) is normalized, i.e., each of its elements
is divided by the sum of all elements (equal to F(n+1)),
its extremal properties are reversed:
By Theorem 3.1, the normalized sequence Z(n) =
{1/F(n+1), F(1)/F(n+1), F(2)/F(n+1), ..., F(n-1)/F(n+1)}
is maximizing normalized Huffman-sequence of size n
on a left-sided binary tree L(n).
(See Example 3.1.1 and Example 3.1.2).


#################################################

For more details please see :
1. Huffman trees and Fibonacci numbers // Kibernetika (Kiev). -
1986. - No. 6. - pp. 9-12; - Translated into English - pp.
692-696.
2. Huffman codes and maximizing properties of Fibonacci numbers
// Kibernetika and System Analysis (Kiev). - 1992. - No. 3. -
pp. 10-15; - Translated into English - pp. 329-334.

#################################################


-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

0 new messages