
Analyses on Limitations of Information Theory


wangyong

Oct 23, 2007, 7:12:59 AM
Analyses on Limitations of Information Theory
Yong WANG
(School of Computer and Control, Guilin University of Electronic
Technology, Guilin City, Guangxi Province, China, 541004)
Hel...@126.com
Abstract-The paper analyzes the limitations of Shannon's information
theory from the angles of conditional entropy and the expression of
information. It is pointed out that Shannon's definition of information
is not proper and that the probability appearing in the expression of
information may in practice be a random variable; since information
theory treats this probability as a fixed value, the applicability of
Shannon's theory is limited. The significance of the reliability of
information, which is ignored in Shannon's information theory, is
indicated, and the uncertainty of probability is suggested as a clue
for measuring the reliability of information.
Keywords-information theory, conditional entropy, probability,
uncertainty, measure
I. Introduction
Shannon's information theory has been applied with great success in
fields such as telecommunication technology [1]. However, the theory
is not applicable to many aspects of daily life. Many scholars have
recognized the limitations of Shannon's information theory. Shannon
himself realized these limitations, encouraged scholars to be critical,
and said: "workers in other fields should realize that the basic
results of the subject are aimed in a very specific direction",
"Research rather than exposition is the keynote, and our critical
thresholds should be raised. Authors should submit only their best
efforts, and these only after careful criticism by themselves and
their colleagues" [2]. Many problems of information theory have been
pointed out [3-5]. The limitations discussed so far have focused on
the limitations of classical sets, the neglect of the meanings of
language, and the neglect of the use of language. Because of the
limitations of classical sets, fuzzy sets, rough sets and other kinds
of sets were proposed [6, 7]. Considering the meanings and the use of
language, Daniel Federico, Nauta Doede, Yixin Zhong, Chenguang Lu and
others gave new methods to measure the meaning and use of information.
In the early 1990s, Generalized Information Theory (GIT) was introduced
to name a research program whose objective was to develop a broader
treatment of uncertainty-based information [5]. The above theories
focus on the reduction of uncertainty and do not consider the
reliability of information. In probability theory and information
theory, a probability is always taken as a fixed value rather than as
a random variable, so the multiple uncertainty of information is
overlooked. When studying the perfect secrecy of the one-time pad, the
posterior uncertainty of information was found to increase [8, 9].
Although the amount of posterior information is decreased, the
reliability of the posterior information is increased [10]. In
practice, people would rather select information that is more reliable
though more uncertain than the reverse. That seems to conflict with
information theory. This paper aims to explain this problem and to
indicate the limitations of information theory.

II. Deduction on the perfect secrecy of the one-time pad
We give an example of the one-time pad to bring out the problems. Let
the plaintext space be M={0,1}, the ciphertext space be C={0,1} and the
key space be K={0,1}. From the information obtained beforehand, the
cryptanalysts can assign the prior probabilities of the plaintext as
P(M=0) = 0.9 and P(M=1) = 0.1. Later the ciphertext C=0 is intercepted.
Considering only C=0 and the cryptosystem (disregarding the prior
probability of the plaintext), there is a one-to-one correspondence
between plaintexts and keys for C=0, so each plaintext has the same
probability as its corresponding key. Since all keys are equally
likely, we can deduce that all plaintexts are equally likely. The prior
probabilities of the plaintexts are seldom equal, so the two
probability distributions of the plaintext obtained from the two
different conditions conflict, and a compromise between them is
indispensable. The compromised posterior probability of the plaintext
lies between the two probabilities given by the two partial conditions.
Thus when C=0 is intercepted, the posterior probability P(M=0) is
between 0.9 and 0.5, and P(M=1) is between 0.1 and 0.5. The compromised
posterior probability of the plaintext is not equal to the prior
probability, so the OTP is not perfectly secure. In this example, we
can see that the posterior (conditional) entropy of the plaintext is
increased. But Shannon declared that the uncertainty of y is never
increased by knowledge of x; it is decreased unless x and y are
independent events, in which case it is not changed (here x and y are
two events). So the example is inconsistent with information theory.
It is obvious that the compromised posterior probability of the
plaintext is more reliable than the prior probability, for it is
obtained from more complete conditions than the prior [9]. Similarly,
the posterior information expressed by the posterior probability is
more reliable than the prior information. Even though the posterior
information is more uncertain than the prior information, people would
select the posterior because it is more reliable. The reliability of
information is very important; it is the root of the value of
information. But in information theory the reliability of information
itself is never considered or measured; only the reliability of
information transmission is considered. In the following sections we
discuss the problems concerning conditional entropy and the
reliability of information.
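To make the example above concrete, the following minimal sketch
computes the two conflicting plaintext distributions and a simple
convex-combination compromise between them. The weight w is a
hypothetical parameter introduced here only for illustration; the paper
does not specify how the compromise is formed. The sketch also shows
that the entropy of such a compromised posterior exceeds the prior
entropy, i.e. the increase in uncertainty described above.

import math

def entropy(dist):
    """Shannon entropy (bits) of a probability distribution given as a dict."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Prior plaintext distribution known to the cryptanalyst beforehand.
prior = {0: 0.9, 1: 0.1}

# Distribution implied by the intercepted ciphertext C=0 alone:
# keys are uniform and in one-to-one correspondence with plaintexts,
# so the plaintexts are taken as uniform under this partial condition.
ciphertext_only = {0: 0.5, 1: 0.5}

# Hypothetical compromise: convex combination of the two assessments.
# The weight w is NOT specified in the paper; w = 0.5 is just an illustration.
w = 0.5
posterior = {m: (1 - w) * prior[m] + w * ciphertext_only[m] for m in prior}

print("prior     :", prior, "entropy =", round(entropy(prior), 4), "bits")
print("posterior :", posterior, "entropy =", round(entropy(posterior), 4), "bits")
# For w = 0.5 the compromised posterior is (0.7, 0.3): it differs from the
# prior and has a larger entropy, so the uncertainty of the plaintext grows.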

III. Analysis of the limitations of conditional entropy
In the literature [1], Shannon gave the definitions of the entropy of
joint events and of conditional entropy, produced the corresponding
formulas, and drew the conclusion that the uncertainty of y is never
increased by knowledge of x; it is decreased unless x and y are
independent events, in which case it is not changed. The conclusion
follows from his result
H(y) >= H_x(y),
where H_x(y) is the conditional entropy of y given knowledge of x.
From the conclusion, H_x(y) should be the uncertainty of y given
knowledge of x.
Suppose for any particular value i that x can assume there is a
conditional probability p_i(j) that y has the value j.
Given the knowledge of x, say that x has taken the particular value i,
the uncertainty of y should be
H_x(y) = -SUM_j p_i(j) log p_i(j)
       = -SUM_j [p(i,j)/SUM_k p(i,k)] log [p(i,j)/SUM_k p(i,k)]   (1)
We can get from Shannon's proof the conclusion that the prior
probability distribution of y is the same as the posterior
(conditional) probability distribution of y given the knowledge of x.
In his proof, he used
H(x,y) = -SUM_{i,j} p(i,j) log p(i,j),
H(x) = -SUM_{i,j} p(i,j) log SUM_j p(i,j),
H(y) = -SUM_{i,j} p(i,j) log SUM_i p(i,j),
and got the conclusion
H(x) + H(y) >= H(x,y).
Then he defined the conditional entropy of y, H_x(y), as the average of
the entropy of y for each value of x, weighted according to the
probability of getting that particular x. The formula is
H_x(y) = -SUM_{i,j} p(i,j) log p_i(j)   (2)
and he obtained
H(x,y) = H(x) + H_x(y).
Since
H(x) + H(y) >= H(x,y) = H(x) + H_x(y),
hence
H(y) >= H_x(y).
From his proof, we can find that for any i and j, p(i,j) is the same in
the formulas used to compute H(x), H(y), H(x,y) and H_x(y); otherwise,
the proof of the inequality would be groundless. Therefore the
so-called prior probability distribution of y and the so-called
posterior (conditional) probability distribution of y are taken under
the same joint probability distribution of x and y. Thus the so-called
prior probability distribution of y can be expressed as
p(j) = SUM_i p(i,j)   for any j,
and the so-called posterior probability distribution of y given the
knowledge of x can be expressed as
p_x(j) = SUM_i p(i) p_i(j) = SUM_i p(i,j)   for any j,
so the two probability distributions are identical. The uncertainty of
y is unchanged because the prior probability distribution is already
consistent with the knowledge of x; the so-called prior one and the
posterior one are the same distribution. In fact, with the knowledge of
x, the probability distribution of y usually changes. The influence of
x on the probability of y is arbitrary, and the random uncertainty of y
given the knowledge of x can be increased, decreased or left unchanged
when compared with the prior random uncertainty of y.
Formula (2) is not the conditional entropy, i.e. the uncertainty of y
given the knowledge of x; it is only a weighted average of the separate
entropies. Since the entropy calculation involves a logarithm, which
can introduce distortion, the inequality appears. That is the reason
Shannon reached the improper conclusion.
From the above analysis, we can find that Shannon's conditional entropy
is not suitable to be regarded as a conditional entropy; it is suitable
to be regarded as the weighted average of the conditional entropies
over the possible values of the random variable x. Mostly it is
properly used as an average of conditional entropy in information
theory, which is why it has not been questioned. But calling it
"conditional entropy" is improper and confusing. Thus the conclusion
that the uncertainty of y is never increased by knowledge of x is not
absolutely correct, and it should be revised to say that the
uncertainty of y is, on average, not increased by knowledge of x.
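The following small numerical sketch illustrates this distinction,
using an illustrative joint distribution chosen here rather than taken
from the paper: for one particular observed value of x the entropy of y
exceeds the unconditional entropy H(y), while Shannon's weighted
average H_x(y) is still no larger than H(y).

import math

def entropy(probs):
    """Shannon entropy (bits) of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative joint distribution p(i, j) over x in {0, 1} and y in {0, 1},
# chosen so that observing x = 1 makes y MORE uncertain than it was a priori.
p = {(0, 0): 0.6, (0, 1): 0.1, (1, 0): 0.15, (1, 1): 0.15}

p_x = {i: sum(v for (a, _), v in p.items() if a == i) for i in (0, 1)}
p_y = {j: sum(v for (_, b), v in p.items() if b == j) for j in (0, 1)}

H_y = entropy(p_y.values())

# Entropy of y given each particular value of x (the per-value uncertainty
# discussed above), and Shannon's weighted average over the values of x.
H_y_given = {i: entropy([p[(i, j)] / p_x[i] for j in (0, 1)]) for i in (0, 1)}
H_x_y = sum(p_x[i] * H_y_given[i] for i in (0, 1))  # Shannon's H_x(y)

print("H(y)            =", round(H_y, 4))           # ~0.811 bits
print("H(y | x=0)      =", round(H_y_given[0], 4))  # ~0.592 bits (decreased)
print("H(y | x=1)      =", round(H_y_given[1], 4))  # 1.0 bit (increased)
print("H_x(y), average =", round(H_x_y, 4))         # <= H(y), as Shannon proved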
Based on the conclusion that the uncertainty of y is never increased by
knowledge of x, Shannon defined information as that which reduces
uncertainty. But in the case of the OTP, information can increase
uncertainty. Therefore the definition is not absolutely right, and the
definition of information should be changed. Since all the existing
definitions of information ignore the reliability of information, which
is very significant, a new definition of information based on
reliability was given [11].
IV. Analysis of the limitations of the expression of information
In the case of the OTP, the posterior uncertainty of the plaintext is
increased; yet if people are left to choose between the prior
probability distribution and the posterior probability distribution
that is more uncertain than the prior, they would choose the posterior
because it is more reliable, even though the amount of information in
the prior is greater than in the posterior. So another index of
information should be provided, which we call reliability. The measure
of reliability seems difficult, but we can find a clue in the
expression of information.
To find the limitations of the expression of information, former
studies focused on the limitations of the classical set, and thus fuzzy
sets and rough sets were suggested. But we find that even in the case
of classical sets, the expression of information is not universally
appropriate.
As we have pointed out, the probability of an event is always treated
as a fixed value, even though it is never directly stated that the
probability is a fixed value [10]. It can be seen from many formulas in
probability theory and information theory that the probability is
always taken as a fixed value; otherwise, the formulas may be
impossible to compute. For example, the formula of entropy would be
impossible to compute if the probability were a random variable. The
case in which a probability is a random variable is the general one,
for a fixed value is only a special case of a random variable. For
instance, a probability that comes from unreliable conditions has more
than one possible value; it is not a fixed value, so the probability is
a random variable and correspondingly has random uncertainty. Because
probability is always treated as a fixed value in information theory,
information theory cannot solve problems in which the probability is a
random variable. Generally speaking, with fixed probabilities, neither
the analysis of the reliability of information itself nor the fusion of
unreliable and incomplete information is possible. Most information in
reality is not absolutely reliable or complete, so we must compromise
and fuse different pieces of information. Since information is
expressed by probability, information is unchangeable if the
corresponding probability is considered a fixed value. Taking
probability as a fixed value is one of the fundamental reasons why
information theory cannot be used to study the reliability of
information itself or information fusion, although it can be used to
study the reliability of communication.
It may be proper to measure the reliability of information by computing
the uncertainty of the probabilities, but this is more complex than the
computation of the random uncertainty of information (entropy).
Considering the uncertainty of probability may generalize the use of
information and expand information theory.
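As one possible way to make this concrete (not developed in the paper),
the sketch below treats the probability of an event as a random
variable with a few possible values, computes the entropy for each
possible value, and uses the spread of the probability as a rough,
hypothetical proxy for the unreliability of the information. The
particular values and weights are purely illustrative assumptions.

import math

def entropy_bits(p):
    """Binary entropy (bits) of an event with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Fixed-value case: the usual information-theoretic treatment.
p_fixed = 0.7
print("fixed p = 0.7, entropy =", round(entropy_bits(p_fixed), 4), "bits")

# Random-variable case: the probability itself is uncertain because the
# conditions it was estimated from are unreliable. Values and weights
# below are purely illustrative.
p_values = [0.5, 0.7, 0.9]
weights = [0.2, 0.5, 0.3]

mean_p = sum(w * p for w, p in zip(weights, p_values))
mean_entropy = sum(w * entropy_bits(p) for w, p in zip(weights, p_values))
spread = math.sqrt(sum(w * (p - mean_p) ** 2 for w, p in zip(weights, p_values)))

print("mean of p          =", round(mean_p, 4))
print("expected entropy   =", round(mean_entropy, 4), "bits")
print("entropy at mean p  =", round(entropy_bits(mean_p), 4), "bits")
print("spread of p (hypothetical unreliability proxy) =", round(spread, 4))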
V. Conclusion
The paper analyzes the problems in the definition and calculation of
conditional entropy and rectifies the calculation formula of
conditional entropy. It is pointed out that information should not be
defined as that which reduces uncertainty, and that the expression of
information is limited because the value of probability is regarded as
fixed. The reliability of information is not considered in Shannon's
information theory, which is related to the limitation that the value
of probability is regarded as fixed. Awareness of these limitations
will help generalize information theory and solve many problems in
information science that cannot be solved by Shannon's theory. Shannon
was incredibly inventive and made many landmark contributions to
information theory; owing to the particularity of communication,
Shannon's information theory is well suited to problems in
communication, and the above limitations are consistent with that
particularity.
Acknowledgements
This work has been supported by Guangxi Science Foundation under grant
0640171 and Modern Communication National Key Laboratory Foundation
under grant 9140C1101050706.


References
[1] C. E. Shannon. A mathematical theory of communication. Bell System
Technical Journal, 27 (1948): 379-429, 623-656.
[2] C. E. Shannon. The bandwagon. IEEE Trans. on Information Theory,
1956, 2: 3.
[3] Chenguang Lu. A Generalized Information Theory (in Chinese). Hefei:
China Science and Technology University Press, 1993.
[4] Yixin Zhong. Information Science Principles (Third Edition) (in
Chinese). Beijing: Beijing University of Posts and Telecommunications
Press, 2002.
[5] George J. Klir. An update on generalized information theory.
ISIPTA, 2003: 321-334.
[6] L. A. Zadeh. Fuzzy sets. Information and Control, 1965, 8: 338-353.
[7] Z. Pawlak. Rough sets. International Journal of Computer and
Information Sciences, 1982, 11: 341-356.
[8] Yong Wang. Perfect Secrecy and Its Implement. Network & Computer
Security, 2005(05).
[9] Yong Wang, Fanglai Zhu. Security Analysis of One-time System and
Its Betterment. Journal of Sichuan University (Engineering Science
Edition), 2007, 39(5) supp.: 222-225.
[10] Yong Wang. On Relativity of Information. Presented at the First
National Conference on Social Information Science, Wuhan, China, 2007.
[11] Yong Wang. On the Perversion of Information's Definition.
Presented at the First National Conference on Social Information
Science, Wuhan, China, 2007.

Biography:
Yong WANG (1977-), born in Tianmen City, Hubei Province; male; Master
of Cryptography. Research fields: cryptography, information security,
generalized information theory, quantum information technology. Guilin
University of Electronic Technology, Guilin, Guangxi, 541004. E-mail:
hel...@126.com, wang197...@sohu.com
Mobile: 13978357217; fax: (86)7735601330 (office)
School of Computer and Control, Guilin University of Electronic
Technology, Guilin City, Guangxi Province, China, 541004

Harvey Pecker

Oct 23, 2007, 11:26:09 AM
Shut up.

"wangyong" <hel...@126.com> wrote in message
news:1193137979.0...@k35g2000prh.googlegroups.com...

"Where muh wang?"


wangyong

Nov 3, 2007, 3:39:08 AM
On Oct 23, 11:26 PM, "Harvey Pecker" <inva...@example.com> wrote:
> Shut up.
>
> "wangyong" <hell...@126.com> wrote in message

>
> news:1193137979.0...@k35g2000prh.googlegroups.com...
>
> "Where muh wang?"

What did you mean?
