jim_bow...@hotmail.com wrote:
> What are some of the potential practical problems with running the
> following proposed prize competition?
> PS: Many people disagree with the posited definition of "artificial
> intelligence quality", but it seems obvious to me. I'm not sure why
> it's not obvious to everyone but such is the case. Any suggestions on
> how to get this across more effectively? Or are we dealing with
> different cognitive types -- some will simply 'get it' and others
> won't?
> http://www.geocities.com/jim_bowery/cprize.html
> The C-Prize
Very interesting. There is a similar contest that has been going on
for 9 years. It might help you think about the details.
http://mailcom.com/challenge/
Have you picked out a text corpus? I think this is a good approach to
the AI problem, very objective, and easier to administer than the
Loebner prize.
Last year I proposed almost exactly the same thing to NSF with a
$50K prize, but they would not fund it.
The connection between text compression and AI is not obvious, but
there is a simple proof that optimal compression solves the Turing
test. The compression problem is to code text strings x with length
log2(1/P(x)) bits. But if you know the probability distribution P(x)
(specifically the distribution over chat sessions), then you could
build a machine that answers questions in a way that is
indistinguishable from a human. Given a question q, you generate
response r with distribution P(r|q) = P(qr)/P(q). You can generalize q
to any sequence of questions and responses up to the last question.
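To make the P(r|q) = P(qr)/P(q) step concrete, here is a minimal Python sketch. The table P is a made-up toy distribution over three complete question+response strings (not real chat data), and the function names are hypothetical, chosen only for illustration. P(q) is taken to be the total probability of all strings that begin with q.

```python
import math
import random

# Hypothetical toy distribution P(x) over complete chat strings.
# These strings and probabilities are invented for illustration.
P = {
    "Q: 2+2? A: 4": 0.6,
    "Q: 2+2? A: four": 0.3,
    "Q: sky? A: blue": 0.1,
}

def code_length_bits(x):
    """Ideal code length log2(1/P(x)) in bits for string x."""
    return math.log2(1.0 / P[x])

def prefix_prob(q):
    """P(q): total probability of all strings that begin with q."""
    return sum(p for x, p in P.items() if x.startswith(q))

def response_distribution(q):
    """Return {r: P(r|q)} where P(r|q) = P(qr)/P(q)."""
    pq = prefix_prob(q)
    if pq == 0.0:
        raise ValueError("question has zero probability under P")
    return {x[len(q):]: p / pq for x, p in P.items() if x.startswith(q)}

def sample_response(q, rng=random):
    """Draw one response r with probability P(r|q)."""
    dist = response_distribution(q)
    responses = list(dist)
    weights = [dist[r] for r in responses]
    return rng.choices(responses, weights=weights, k=1)[0]
```

For example, sample_response("Q: 2+2? A: ") returns "4" about two thirds of the time and "four" about one third of the time. The same table P that drives the sampler also gives the code lengths, which is the point of the argument: one distribution serves both tasks.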
I am glossing over some assumptions that need to be more clearly
stated. For example, I am assuming that the interrogator, machine, and
the machine's human opponent know the same distribution P(x), and that
people use the same distribution for writing text as they do for
recognition. I think these are reasonable approximations because the
machine only has to be close enough to fool the interrogator, not an
exact match to the "average" human.
Neither AI nor text compression is solved. In 1950 Shannon estimated
the entropy of written English to be about 1 bit per character (based
on how well humans can guess successive letters), but the best text
compressors achieve about 1.2-1.3 bits per character (bpc). These are
language models built for speech recognition that combine word trigram
models with semantic models based on word proximity. I did some of my dissertation
work in this area before I switched to a less interesting topic where I
could get funding. http://cs.fit.edu/~mmahoney/dissertation/
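
For anyone who wants to see how a bpc figure is measured, here is a rough Python sketch. It is not one of the models mentioned above: it is a much simpler adaptive character-context model with add-one smoothing, and the function name, the default order of 2, and the 256-symbol alphabet are all assumptions made for illustration. It sums the ideal code length log2(1/p) of each character given the preceding characters and divides by the text length.

```python
import math
from collections import defaultdict

def bits_per_char(text, order=2, alphabet_size=256):
    """Average ideal code length, in bits per character, of `text`
    under an adaptive order-`order` character model.

    Each character is predicted from the previous `order` characters
    using counts seen so far, with add-one smoothing over an assumed
    alphabet of `alphabet_size` symbols.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> char -> count
    totals = defaultdict(int)                       # context -> total count
    total_bits = 0.0
    for i, ch in enumerate(text):
        ctx = text[max(0, i - order):i]
        # Predict first, then update, so a decoder could do the same.
        p = (counts[ctx][ch] + 1) / (totals[ctx] + alphabet_size)
        total_bits += math.log2(1.0 / p)
        counts[ctx][ch] += 1
        totals[ctx] += 1
    return total_bits / len(text) if text else 0.0
```

A model this crude will score far worse on real English than the 1.2-1.3 bpc quoted above; the sketch only shows how the number is computed, so that different models can be compared on the same corpus.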
-- Matt Mahoney