
Markov-chain prediction tools for stand-up comedy (NOT a humor post)


Basil White

Aug 19, 2002, 8:34:23 AM
I had to cross-post this 'cos my problem is kinda cross-philosophical.
None of the groups I found seems to have real expertise in my problem's domain.
Please help.

I'm a standup comedian, and I'm trying to exploit the available
knowledge in cognitive science about text prediction in order to
improve the way humor text is edited. I'm under no illusions that
machines can write humor better than a human (yet), but it's possible
that artificial systems might be able to generate a more robust method
for determining all possible ways in which the text of a joke might be
misinterpreted by predicting expectations of subsequent sentences as
per presupposition theory, where each sentence is understood in the
context of previous sentences and is the basis for comprehending
subsequent sentences.

I shared this with a sufficiently-advanced AI theorist friend of mine.
He started talking. Forty-five seconds later, he had overwhelmed my
knowledge, intelligence and capacity for abstract thought. After I
put corks in my ears to stop the flow of blood, he told me about
Markov chain prediction tools. I searched for these on the Web. The
web pages about Markov chains all seem to be about mathematical value
predictions, and are all written in a vocabulary so dense that I was
afraid the scabs on my ears would start to break.

Are there Markov-chain prediction tools for text? Best-case scenario
for me would be a website or downloadable application where I can
paste in text and it locates the agents, actions and descriptions in
the sentences and predicts the next action or description (e.g.,
"SUBJECT (Bartender) RESPONDS TO STATEMENT BY SUBJECT (GUY)").

I'm not looking for something to write jokes. This is part of my
humor research on how machine prediction can assist in discovering
permutations of semantic relationships in text to see what's missing
in the text, false statements that the text might suggest, and where
the text seems to be leading the reader.
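
A toy mock-up of the best-case interface described above, as a few lines of
Python. Everything in it is invented (the agent list, the single hard-coded
rule, the example sentence); it only shows the shape of the desired output,
not how the real prediction would be done.

# Find agents from a tiny hand-made list and emit a guessed "next frame".
AGENTS = {"guy", "bartender", "dog"}

def predict_next_frame(sentence):
    words = [w.strip(".,").lower() for w in sentence.split()]
    found = [w for w in words if w in AGENTS]
    if len(found) >= 2:
        speaker, addressee = found[0], found[-1]
        return ("SUBJECT (%s) RESPONDS TO STATEMENT BY SUBJECT (%s)"
                % (addressee.capitalize(), speaker.capitalize()))
    return "NO PREDICTION"

print(predict_next_frame("A guy says something to the bartender."))
# -> SUBJECT (Bartender) RESPONDS TO STATEMENT BY SUBJECT (Guy)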

Please send responses to basil...@basilwhite.com.

Thanks,

Basil White
basil...@basilwhite.com

Daryl McCullough

Aug 19, 2002, 11:06:41 AM
basil...@basilwhite.com says...

>I'm a standup comedian, and I'm trying to exploit the available
>knowledge in cognitive science about text prediction in order to
>improve the way humor text is edited. I'm under no illusions that

>machines can write humor better than a human (yet)...

They all laughed when I said I wanted to build a joke-telling machine.
Well, I showed them! Nobody's laughing *now*!

--
Daryl McCullough
Ithaca, NY

Vishal Doshi

Aug 20, 2002, 5:11:35 AM

"Basil White" <basil...@basilwhite.com> wrote in message
news:cdd85393.02081...@posting.google.com...
[snip]

> machines can write humor better than a human (yet), but it's possible
> that artificial systems might be able to generate a more robust method
> for determining all possible ways in which the text of a joke might be
> misinterpreted

Up to this point ... all you need is a semantic parser that will generate all
possible parses (and possibly interpretations) for your text. Many parsers
do in fact generate all possible parses and then prune the ones they consider
unlikely.

Consider: Lingo + LKB (downloadable, but usage is *far* from intuitive. It
has a very well written manual that is freely downloadable, though).

> by predicting expectations of subsequent sentences as
> per presupposition theory, where each sentence is understood in the
> context of previous sentences and is the basis for comprehending
> subsequent sentences.
>

Essentially the only part of this that I think works even remotely well is
anaphora resolution. That is to say, figuring out what 'he' or 'it' refers
to, given a sequence of sentences.

Reading about anaphora resolution might provide some hints on how the
subject of a given sentence is identified and then maintained as current
over the next few sentences. For the most part -- this is a pre-processing
kind of step -- and does not dramatically change the semantics of the next
sentence.
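
A toy sketch, in Python, of the recency heuristic behind this; it is not LKB,
and the pronoun and noun lists are invented. Note that it happily mis-resolves
"him" in the example, which is exactly why real anaphora resolution needs
agreement and syntactic constraints on top of recency.

PRONOUNS = {"he", "she", "him", "her", "it", "they"}
CANDIDATE_NOUNS = {"guy", "bartender", "dog", "beer"}

def resolve(sentences):
    recent = []                       # candidate antecedents, newest last
    resolved = []
    for sentence in sentences:
        out = []
        for word in sentence.lower().rstrip(".").split():
            if word in PRONOUNS and recent:
                out.append(word + "[=" + recent[-1] + "]")
            else:
                out.append(word)
            if word in CANDIDATE_NOUNS:
                recent.append(word)
        resolved.append(" ".join(out))
    return resolved

print(resolve(["A guy walks into a bar.",
               "The bartender stares at him.",
               "He orders a beer."]))
# -> ['a guy walks into a bar',
#     'the bartender stares at him[=bartender]',
#     'he[=bartender] orders a beer']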

Again -- LKB is capable of producing something called MRS -- Minimal
Recursion Semantics. Basically this is a system for underspecification.
(Aside: I like to think of it as a C++ class where a chunk of the members
are not initialised -- and are not uninitialised either; it's all quantum
:)).
You could in theory produce more specialised interpretations by considering
the MRS output for a sentence and then adding information from previous
sentences in some way. Not that I know how to do this. Someone else in this
NG may ;)

> I shared this with a sufficiently-advanced AI theorist friend of mine.
> He started talking. Forty-five seconds later, he had overwhelmed my
> knowledge, intelligence and capacity for abstract thought. After I
> put corks in my ears to stop the flow of blood, he told me about
> Markov chain prediction tools. I searched for these on the Web. The
> web pages about Markov chains all seem to be about mathematical value
> predictions, and are all written in a vocabulary so dense that I was
> afraid the scabs on my ears would start to break.
>

Beyond me.

>
> Please send responses to basil...@basilwhite.com.
>

Vishal Doshi.


Basil White

Aug 20, 2002, 11:11:13 AM
da...@cogentex.com (Daryl McCullough) wrote in message news:<ajr1i...@drn.newsguy.com>...

>
> They all laughed when I said I wanted to build a joke-telling machine.
> Well, I showed them! Nobody's laughing *now*!

What I really want is a joke-LISTENING machine, e.g., "I've heard all
sentences leading up to and including N. Here's my prediction about
sentence N+1." Then I can look at the presuppositions that led to the
prediction and see if they're appropriate to the prediction that I'm
trying to create.

john bailey

Aug 20, 2002, 11:19:58 AM
On 19 Aug 2002 05:34:23 -0700, basil...@basilwhite.com (Basil White)
wrote:

>I'm a standup comedian, and I'm trying to exploit the available
>knowledge in cognitive science about text prediction in order to
>improve the way humor text is edited. I'm under no illusions that
>machines can write humor better than a human (yet), but it's possible
>that artificial systems might be able to generate a more robust method
>for determining all possible ways in which the text of a joke might be
>misinterpreted by predicting expectations of subsequent sentences as
>per presupposition theory, where each sentence is understood in the
>context of previous sentences and is the basis for comprehending
>subsequent sentences.

I was surprised at the number of websites that fit the search criteria
of Markov model AND humor:

http://www.katz.pitt.edu/index.asp?pid=01_04&ID=74
http://www.worldscinet.com/ijig/01/0103/S0219467801000232.html
http://www.informatik.uni-trier.de/~ley/db/conf/sigir/GuptaDNG99a.html
http://www.theatlantic.com/unbound/digicult/dc980429.htm
http://cassandra.sprex.com/else/tv/CV.1993.html
http://pespmc1.vub.ac.be/FTPTREE.html


John Bailey
Goto http://home.rochester.rr.com/jbxroads for my real email address

Tom Breton

Aug 20, 2002, 2:48:59 PM
basil...@basilwhite.com (Basil White) writes:

Interesting, but ISTM a human being makes those predictions better and
more subtly. Still, it couldn't hurt to have a second source.

("Three guys walk into a bar..."

(.65 "...and the bartender says...")
(.3 "...and the first guy says...")
(.05 'other))


--
Tom Breton at panix.com, username tehom. http://www.panix.com/~tehom
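
A minimal sketch of how those continuation probabilities could be estimated
from counts, assuming one already had a corpus of setup/continuation pairs;
the corpus below is made up purely for illustration.

from collections import Counter, defaultdict

# Count which continuation follows each setup line, then report the
# estimated probability of each possible next line.
corpus = [
    ("three guys walk into a bar", "and the bartender says"),
    ("three guys walk into a bar", "and the bartender says"),
    ("three guys walk into a bar", "and the first guy says"),
    ("a dog walks into a bar",     "and the bartender says"),
    ("a dog walks into a bar",     "and there's this dog drinking a beer"),
]

counts = defaultdict(Counter)
for setup, continuation in corpus:
    counts[setup][continuation] += 1

def predict(setup):
    c = counts[setup]
    total = sum(c.values())
    return {cont: n / total for cont, n in c.most_common()}

print(predict("three guys walk into a bar"))
# -> {'and the bartender says': 0.67, 'and the first guy says': 0.33} (roughly)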

Wolf Kirchmeir

Aug 20, 2002, 4:40:06 PM
On 20 Aug 2002 14:48:59 -0400, Tom Breton wrote:

>("Three guys walk into a bar..."
>
> (.65 "...and the bartender says...")
> (.3 "...and the first guy says...")
> (.05 'other))

No, no.

It's:

.5 And the bartender says ....
.3 The first guy says ....
.25 And there's this dog sitting drinking a beer....
.05 other...


Best Wishes,

Wolf Kirchmeir
Blind River, Ontario

..................................................................
You can observe a lot by watching
(Yogi Berra, Phil. Em.)
..................................................................


H.M. Hubey

Aug 20, 2002, 11:05:06 PM
It would require either

a) a huge amount of data

or

b) semantic capability

neither of which presently exists.

Basil White

Aug 21, 2002, 10:38:54 AM
Tom Breton <te...@REMOVEpanNOSPAMix.com> wrote in message news:<m3vg65b...@panix.com>...

> Still, it couldn't hurt to have a second source.
>
> ("Three guys walk into a bar..."
>
> (.65 "...and the bartender says...")
> (.3 "...and the first guy says...")
> (.05 'other))

EXACTLY!!!!!!! This is EXACTLY what I want! Thank you!

-Basil White
basil...@basilwhite.com

Basil White

Aug 21, 2002, 10:41:42 AM
Damn.
-Basil


"H.M. Hubey" <hhu...@nj.rr.com> wrote in message news:<3D6302F5...@nj.rr.com>...


Vishal Doshi

Aug 21, 2002, 11:14:34 AM

>"H.M. Hubey" <hhu...@nj.rr.com> wrote in message
news:3D6302F5...@nj.rr.com...
>It would require either

>a) a huge amount of data

All of this is rather academic and may not be practical at all -- still,
for the sake of discussion:

While there may not be any *easily accessible* pre-compiled source -- it
does not mean that the data is not available! It depends on how much you
want a joke-listening machine.. ;)

I wonder if you can get books, text, etc. of standup comedies?
I know that one can definitely get audio (tapes) which could be transcribed,
albeit with considerable effort (human, or tweaking a recogniser's output).

Another possibility would be collecting material from the usenet itself -
say rec.humor.funny archives.

Anyways, apart from all this, I have another question - one that I'm really
interested in:

Assuming that one had a large quantity of the right data ... how would you
use it to generate a sentence? Like :

Calculate trigram possibilities;

Walk through text till you reach what seems to be a sentence end

Generate a word -- and then using that word (and two previous words)
generate another word till we generate a sentence end kind of trigram?

Is this the bit where the Markov Chain stuff comes in? Anyone care to
explain a possible algorithm? (in simple words ?)


Vishal.
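
One possible reading of the trigram scheme sketched above, in Python. The
training text is made up and far too small to be useful; the point is only
the mechanics: learn which word tends to follow each pair of words, then
sample until an end-of-sentence token turns up.

import random
from collections import defaultdict

text = ("three guys walk into a bar . the bartender says we are closed . "
        "a dog walks into a bar . the dog orders a beer .")
words = text.split()

# P(next word | previous two words), stored as lists so duplicates act as counts.
trigrams = defaultdict(list)
for w1, w2, w3 in zip(words, words[1:], words[2:]):
    trigrams[(w1, w2)].append(w3)

def generate(seed=("the", "bartender"), max_words=20):
    w1, w2 = seed
    out = [w1, w2]
    for _ in range(max_words):
        candidates = trigrams.get((w1, w2))
        if not candidates:
            break
        w3 = random.choice(candidates)   # sample in proportion to observed counts
        out.append(w3)
        if w3 == ".":                    # the "sentence end kind of trigram"
            break
        w1, w2 = w2, w3
    return " ".join(out)

print(generate())
# e.g. -> "the bartender says we are closed ."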


Matthew Purver

Aug 21, 2002, 1:30:40 PM
H.M. Hubey wrote:

> It would require either
>
> a) a huge amount of data
>
> or
>
> b) semantic capability
>
> neither of which presently exists.

In fact, wouldn't it require a large amount of data *annotated with the
audience's expectations* for each sentence? (That's if I've understood the
intention correctly.) This won't already exist, but if you really want to
create it, there's nothing stopping you. And once you had this, I don't
think you'd need any semantic capability to train an n-gram (n-sentence?)
model.

Data sparsity could be a big problem, but you might be able to reduce the
amount of data required by substituting generic classes for certain words,
e.g. using "three guys go into a bar", "two dogs go into a laundrette" etc.
all to train your "N Xs go into a Y" model.

--
Matthew Purver - matt at purver dot org
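
A rough sketch of that class-substitution step, with invented word lists; the
point is only that different surface sentences collapse onto the same pattern
before any counting happens, which eases the sparsity problem.

# Map specific words to generic classes so that "three guys go into a bar"
# and "two dogs go into a laundrette" both train the same template.
NUMBERS = {"two", "three", "four"}
ACTORS  = {"guys", "dogs", "nuns", "lawyers"}
PLACES  = {"bar", "laundrette", "library"}

def to_template(sentence):
    out = []
    for word in sentence.lower().split():
        if word in NUMBERS:
            out.append("N")
        elif word in ACTORS:
            out.append("X")
        elif word in PLACES:
            out.append("Y")
        else:
            out.append(word)
    return " ".join(out)

print(to_template("Three guys go into a bar"))       # -> "N X go into a Y"
print(to_template("Two dogs go into a laundrette"))  # -> "N X go into a Y"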

john bailey

Aug 21, 2002, 2:13:34 PM
On Wed, 21 Aug 2002 16:14:34 +0100, "Vishal Doshi"
<Vishal...@techprt.co.uk> wrote:

>Assuming that one had a large quantity of the right data ... how would you
>use it to generate a sentence? Like :
> Calculate trigram possibilities;
> Walk through text till you reach what seems to be a sentence end
>Generate a word -- and then using that word (and two previous words)
>generate another word till we generate a sentence end kind of trigram?
>Is this the bit where the Markov Chain stuff comes in? Anyone care to
>explain a possible algorithm? (in simple words ?)

There was a brilliantly funny thread on rec.puzzles starting May 6,
1997 with Robert Meows simply posting:
I were you I love of worms between the lines is it, pillar to one or
e-mail a tee is it of seven a jam only one difference great white.
Myself a kiss desired result of least squares to z and sweet song and.

The thread gave a series of examples and is well worth looking up on
Google groups: Message-ID: <5koron$b...@opus.ccrwest.org>#1/1

A Markov model represents a series of states, like lily pads from
which a frog jumps from pad to pad. The transition from one pad to
another is given a probability. A hidden Markov model (HMM) is such a
model, but not all the lily pads can be seen, and the hidden ones must
be inferred. An algorithm called the Viterbi algorithm showed how to
infer the hidden lily pads, and voila -- HMMs are used for speech
recognition, language translation, DNA research, and are IMHO the
hottest software idea since the Fast Fourier Transform.
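
The lily-pad picture, as a few lines of Python; the transition probabilities
here are made up. (In a *hidden* Markov model you would only see noisy
emissions from each pad rather than the pad itself, and Viterbi recovers the
most likely sequence of pads.)

import random

# Each pad has a probability distribution over which pad the frog jumps to next.
transitions = {
    "pad_A": [("pad_A", 0.1), ("pad_B", 0.6), ("pad_C", 0.3)],
    "pad_B": [("pad_A", 0.5), ("pad_B", 0.2), ("pad_C", 0.3)],
    "pad_C": [("pad_A", 0.4), ("pad_B", 0.4), ("pad_C", 0.2)],
}

def hop(state):
    pads, probs = zip(*transitions[state])
    return random.choices(pads, weights=probs, k=1)[0]

state = "pad_A"
path = [state]
for _ in range(10):
    state = hop(state)
    path.append(state)
print(" -> ".join(path))   # one random walk through the chain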

H.M. Hubey

Aug 21, 2002, 7:19:18 PM


Vishal Doshi wrote:

> "H.M. Hubey" <hhu...@nj.rr.com> wrote in message
> news:3D6302F5...@nj.rr.com...
> > It would require either
> >
> > a) a huge amount of data
>
> All of this is rather academic and may not be practical at all -- still,
> for the sake of discussion:
>
> While there may not be any *easily accessible* pre-compiled source -- it
> does not mean that the data is not available! It depends on how much you
> want a joke-listening machine.. ;)
>
> I wonder if you can get books, text, etc. of standup comedies?
> I know that one can definitely get audio (tapes) which could be transcribed,
> albeit with considerable effort (human, or tweaking a recogniser's output).
>
> Another possibility would be collecting material from the usenet itself -
> say rec.humor.funny archives.
>
> Anyways, apart from all this, I have another question - one that I'm really
> interested in:
>
> Assuming that one had a large quantity of the right data ... how would you
> use it to generate a sentence?

One can go from various formal languages -- finite-state languages,
context-free languages, etc. -- to stochastic models, and these are
essentially Markov-type models. The transitions in formal languages are not
stochastic. Instead one (an intelligent entity) selects which transition
he/she/it wants to use.

So they complement each other.

Gerry Myerson

Aug 21, 2002, 11:23:59 PM
In article <3d63d667...@news-server.rochester.rr.com>,
john_...@rochester.rr.com (john bailey) wrote:

=> There was a brilliantly funny thread on rec.puzzles starting May 6,
=> 1997 with Robert Meows simply posting....

Moews, not Meows.
--
Gerry Myerson (ge...@mpce.mq.edi.ai) (i -> u for email)

Alex Magidow

Aug 22, 2002, 12:02:26 AM
Wolf Kirchmeir wrote:

>On 20 Aug 2002 14:48:59 -0400, Tom Breton wrote:
>
>>("Three guys walk into a bar..."
>>
>> (.65 "...and the bartender says...")
>> (.3 "...and the first guy says...")
>> (.05 'other))
>
>No, no.
>
>It's:
>
>.5 And the bartender says ....
>.3 The first guy says ....
>.25 And there's this dog sitting drinking a beer....

.05 "And says 'Ouch'"


--
"Nike only sells shoes that, if they were cars, would have been driven by pimps in the 70's. "- Some guy on slashdot

