
What Lebling said...


Charles R.

Jul 23, 2001, 12:34:54 PM

John E

Jul 23, 2001, 2:57:46 PM
The thing that most stuck out was...

<paraphrased>
EverQuest has a parser that is on a lower level than Colossal Cave.
</paraphrased>

This is very true. Most of the people you deal with are "real," as in
connected players. But the NPCs are outshone by any Infocom NPC, including Floyd!

I think the "technology" that is the Z-machine still has a modern use.
Namely, imagine adding the Inform (or TADS, or Hugo, etc.) parser to
EverQuest.

-John


Charles R. wrote in message ...

tna...@direct.ca

Jul 23, 2001, 10:14:54 PM
Thus Spake Charles R. <co13...@yahoo.com>:

It was interesting. The biggest thing I agree with him on is how much
of a throwback these graphic "adventures" like Myst are. They're pretty
to look at, but from a playability point of view, they're about as much
fun as web pages.

I wonder: how can one make a fully 3-D game without requiring a command
parser (i.e., having to type "take the paper", "fold it")? Maybe the
new gesture interfaces out there might help, but has anyone given it any
real thought?

--
-----
Travers Naran, F/T Programmer & P/T Meddler in Time & Space
New Westminster, British Columbia, Canada, Earth, Milky Way, etc.
"Stand Back! I'm a programmer!"
Visit the SFTV Science Blunders Hall of Infamy!
<http://www.geocities.com/naran500/>

Daniel Ellison

Jul 24, 2001, 6:18:00 AM
<tna...@direct.ca> wrote in message news:9jilm...@enews2.newsguy.com...

> I wonder: how can one make a fully 3-D game without requiring a command
> parser (i.e., having to type "take the paper", "fold it")? Maybe the
> new gesture interfaces out there might help, but has anyone given it any
> real thought?

Wouldn't this be a perfect place for voice recognition? I know it's STILL
not perfect - or even close to it - but this is certainly a case where the
vocabulary is restricted enough to make voice recognition viable, no?

Daniel Ellison
Toronto, Ontario
dan...@syrinx.net

Todd Nathan

Jul 24, 2001, 6:20:15 PM
I'm sure there are millions of people who have given it a thought; anyone
who has come up with a solution may already have figured there is
just not enough interest in the market, i.e., selling and making money.
But not me... I think there is great potential, and hopefully real money
to be made.

If we could interact from a command standpoint by voice, like with ViaVoice,
and typing could just go away, then the real computer interaction DLebling
mentioned could take place. Couldn't agree more -- conversational content.
This tell-the-computer, figure-out-why-it-didn't-work, tell-the-computer-again
loop of an interface is old, and most don't find it stimulating enough, me being
one of them. Visuals are great, and IMHO graphics ARE 93% of the content
and the rest is informational; that is where things need to go. Myst had a
great idea, yet wandering around taking in eye candy, and that is IT, is not the
total package, and that is where it fell short for the 'intelligent' consumer.
Which brings up a key point: computers changed into commodities 10 or so
years ago (or maybe more). Consumers don't want a computer because
they are smart; they want a computer so they can be dumb and do more and
make more money. IF and games like it don't do that -- they entertain -- and
with the likes of the PlayStation and such potentially coming out with 16x
processors and memory down to the 'free' price point, IF remains, and always
will remain, a niche market for anyone to write for or profit from: resulting
in no income, resulting in no interest, resulting in commercial death (as we
know it).

Shame? Yes. Reality? IMHO, yep. It would be great to have real, stimulating
interactions with computers, to have them challenge us as if they were real,
and also to be able to do it in a network environment (have people walking
around and discovering, reading, looking, smelling [yes, spray your face with
scents from a blow port in your monitor, and likely give us cancer in the
process] and touching [body suits, as Dave mentioned]). Or just 'read a
program' and play with your mind and think for a while. That in the end is
what IF did for me, and unfortunately we have kids leaving high school unable
to read at 3rd-grade levels (or at all), so with us sliding down the toilet
of life, it is unreasonable to raise the bar at this point; it is only going
to get worse before it gets better, or worse.

\t

<tna...@direct.ca> wrote in message news:9jilm...@enews2.newsguy.com...

Todd Nathan

Jul 24, 2001, 6:31:21 PM
Just wondering: is there a library online of stuff folks have submitted,
cataloged into types of items, like vehicles (car, truck, bicycle,
teleporter, etc.), food, buildings (home, office, and so on)?

Thanks. I don't know if I'm asking a lot; I just figured that since TADS
is OO, and actually one of the first OO languages I learned 11 years ago
(thanks High Tech ;'P, I know they don't exist, but Mr. Roberts does ;'P),
an object repository would have been started by now.

Take care, and all the best!

\t

T Raymond

Jul 26, 2001, 12:27:38 AM
Todd Nathan was overheard typing about:

> Thanks, I don't know if I'm asking a lot, I just figured that
> since TADS is OO, and actually one of the first OO languages I
> learned 11 years ago (thanks High Tech ;'P, I know they don't
> exist, but Mr. Roberts does ;'P) I would have figured that an
> object repository would have been started by now.

Sure, just visit your local IF-ARCHIVE site. Lots of source available
there. That's about the only repository of any sort that I know of,
outside of the macro files that I and other authors have undoubtedly put
together for our own programming.

Tom
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Tom Raymond ab402 AT osfnDOTorg
"The original professional amateur."
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

John Colagioia

Jul 27, 2001, 9:48:11 AM
tna...@direct.ca wrote:

> Thus Spake Charles R. <co13...@yahoo.com>:
> > What do you think?
> > http://www.adventurecollective.com/articles/interview-davelebling.htm
> It was interesting. The biggest thing I agree with him on is how much
> of a throwback these graphic "adventures" like Myst are. They're pretty
> to look at, but from a playability point of view, there about as much
> fun as web pages.

Agreed. Then there are other games, like The 7th Guest. At one point
playing that, I considered that perhaps the actual game was taking place
elsewhere in the house, and I was just bumping into it (and the plot) at
random intervals, solving arbitrary puzzles just to pass the time...


> I wonder: how can one make a fully 3-D game without requiring a command
> parser (i.e., having to type "take the paper", "fold it")? Maybe the
> new gesture interfaces out there might help, but has anyone given it any
> real thought?

It might be doable. Verbs like "fold," "rub," and "open," especially,
could be done gesturally.

Sorry, let me rephrase that. You'll always need a parser, by definition.
There will always have to be some software (or maybe hardware, if you're
really specialized) to convert the user's stream of input to internal
commands. You might be able to use a different input stream to the parser
than text, however, if you're clever about the interface. The gestures are
probably a good start.

Hmmm... With a three-button mouse, you *might* be able to expand the gestures
to semi-full hand movement, with independent fingers. That is, depressing
all buttons except the left button ("left" being the button Windows
understands as being under your index finger -- in case you have a
left-handed mouse, at which point that'd be your right button, but...) would
let you point at things, push buttons, poke people, or what have you.
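A minimal sketch of that button-chord idea, in Python purely for illustration (the button names and verb bindings are invented, and a real game would read button state from the windowing system):

```python
# Toy mapping from "which mouse buttons are held" to a gesture verb.
# Bindings are hypothetical; the point is that a chord of buttons can
# stand in for a hand pose.

GESTURES = {
    frozenset(["left"]):                    "point/poke",
    frozenset(["left", "right"]):           "grab",
    frozenset(["left", "middle", "right"]): "push",
    frozenset():                            "open hand",
}

def gesture_for(buttons_down):
    """Translate the set of currently held buttons into a gesture verb."""
    return GESTURES.get(frozenset(buttons_down), "unknown gesture")

print(gesture_for(["left"]))           # -> point/poke
print(gesture_for(["left", "right"]))  # -> grab
```

The dictionary keyed on `frozenset` means the chord is order-independent: holding left-then-right and right-then-left name the same gesture.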

On the other hand, voice isn't much fun, particularly if you're the sort of
player who likes to run through games while everyone else is asleep...


Mathias Maul

Jul 27, 2001, 3:57:38 PM
Daniel Ellison <dan...@syrinx.net> wrote:

> Wouldn't this be a perfect place for voice recognition? I know it's STILL
> not perfect - or even close to it - but this is certainly a case where the
> vocabulary is restricted enough to make voice recognition viable, no?

Regrettably, no.


--mm:)

Dennis G Jerz

Jul 28, 2001, 11:46:04 AM
mm...@gmx.net (Mathias Maul) wrote in message news:<1ex6d3l.p8laj3qslv40N%mm...@gmx.net>...

My experience with Dragon NaturallySpeaking was not actually all that
bad... it's a lot of fun to lean back with your hands over your head
and talk your way through an IF game -- but whatever pleasure that
gives is pretty much cancelled out by the tedious business of correcting
the spelling of words that are unfamiliar to NaturallySpeaking.

Dennis G. Jerz

M. D. Krauss

Jul 28, 2001, 7:23:59 PM
That raises an interesting insight. In IF, words have to be familiar to
the game anyhow. If the speech recognition software were tied to the
parsing software, it could automagically know every word relevant to the
game and select from the possibilities on the basis of what the parser will
accept in context.

-Matthew


Dennis G Jerz

Jul 29, 2001, 12:18:01 AM
M. D. Krauss <MDKr...@home.com> wrote in message news:<20010728192359.2...@home.com>...

> That raises an interesting insight. In IF, words have to be familiar to
> the game anyhow. If the speech recognition software was tied to the
> parsing software, it could automagically know every word relevant to the
> game and select from possibles on the basis of what the parser will accept
> in context.
>
> -Matthew

I have that software installed at my office, where I'm presently not,
so I can't check... but I seem to recall that you can feed a bunch of
documents into Naturally Speaking, so that it knows which words are
most likely to be chosen.

Of course part of the fun in IF is figuring out which words are the
"right" ones (guess-the-verb puzzles excepted). If the
speech-recognition software favored interpretations that advanced the
plot, you might find yourself clearing your throat and having the
following appear on the screen: "examine red speck on Lord
Chesterfield's tie" (or whatever).

Dennis

Ross Presser

Jul 30, 2001, 10:33:48 AM
jer...@uwec.edu (Dennis G Jerz) wrote:

> Of course part of the fun in IF is figuring out which words are the
> "right" ones (guess-the-verb puzzles excepted). If the
> speech-recognition software favored interpretations that advanced the
> plot, you might find yourself clearing your throat and having the
> following appear on the screen: "examine red speck on Lord
> Chesterfield's tie" (or whatever).

Kind of like scratching your nose at an auction and finding out you've
just spent several million you don't have.

--
Ross Presser * ross_p...@imtek.com
"Back stabbing is a sport best played by those that can't stand face
to face with their opponent." - Danny Taddei

Dennis G Jerz

Aug 1, 2001, 1:23:29 AM
Text-to-speech developments:

July 31, 2001

Software Called Capable of Copying Any Human Voice

AT&T Labs will start selling speech software that it says
is so good at reproducing the sounds, inflections and intonations of a
human voice that it can recreate voices and even bring the voices of
long-dead celebrities back to life. The software, which turns printed
text into synthesized speech, makes it possible for a company to use
recordings of a person's voice to utter things that the person never
actually said.

[Interesting implications for the future of computer-assisted
storytelling.]

http://www.nytimes.com/2001/07/31/technology/31VOIC.html

Dennis

Daniel Ellison

Aug 1, 2001, 6:28:04 AM
"Dennis G Jerz" <jer...@uwec.edu> wrote in message
news:792c6202.01073...@posting.google.com...

> AT&T Labs will start selling speech software that it says
> is so good at reproducing the sounds, inflections and intonations of a
> human voice that it can recreate voices and even bring the voices of
> long-dead celebrities back to life. The software, which turns printed
> text into synthesized speech, makes it possible for a company to use
> recordings of a person's voice to utter things that the person never
> actually said.

Very odd! Just yesterday I accidentally stumbled upon this site:

http://www.research.att.com/~mjm/cgi-bin/ttsdemo

It allows you to type in sentences and have the AT&T software "say" what you
typed. It isn't nearly as spectacular as they make it sound; you can
definitely tell it's computer generated. For example, when I typed "You are
in a maze of twisty little passages, all alike", "alike" was said with a
long "a", and the whole sentence was spoken as if the person were nervous or
on a mild roller coaster. The cadence was... well, not human. The voice
quality was good, though. Certainly a step in the right direction!

Andrew Plotkin

Aug 1, 2001, 10:34:21 AM
Daniel Ellison <dan...@syrinx.net> wrote:
> "Dennis G Jerz" <jer...@uwec.edu> wrote in message
> news:792c6202.01073...@posting.google.com...

>> AT&T Labs will start selling speech software that it says
>> is so good at reproducing the sounds, inflections and intonations of a
>> human voice that it can recreate voices and even bring the voices of
>> long-dead celebrities back to life. The software, which turns printed
>> text into synthesized speech, makes it possible for a company to use
>> recordings of a person's voice to utter things that the person never
>> actually said.

> Very odd! Just yesterday I accidentally stumbled upon this site:

> http://www.research.att.com/~mjm/cgi-bin/ttsdemo

> It allows you to type in sentences and have the AT&T software "say" what you
> typed. It isn't nearly as spectacular as they make it sound; you can
> definitely tell it's computer generated. For example, when I typed "You are
> in a maze of twisty little passages, all alike", "alike" was said with a
> long "a", and the whole sentence was spoken as if the person were nervous or
> on a mild roller coaster. The cadence was... well, not human.

It's really very good, though.

I bet you could add markup by hand -- indicating which words were the
subject and verb of the sentence, which phrases were the meaning and
which were auxiliary -- and improve it another big chunk.

You know, I bet there's a future in "voice capture". Record a human
actor's voice, doing lines with the desired emoting. Analyze for
pitch, rhythm, and volume. Then feed that data into a speech
synthesizer with the same text, but an entirely different voice
pattern. Hire J. Cheapo Actor, have Humphrey Bogart's voice come out
the other end. Or something entirely inhuman. (We use such simple
tricks for "inhuman" voices right now -- reverb, band-pass filters --
it could wind up being as obsolete as face-paint monster effects.)
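A rough sketch of what that capture-and-retarget step might look like. Everything here is invented for illustration -- the numbers are made-up pitch/duration/volume values, not real analysis output -- but it shows the idea of keeping the performance's contour while swapping the voice:

```python
# Toy sketch of "voice capture": keep the captured performance's
# pitch/duration/volume contour per word, but shift the pitch into a
# different voice's range.  All values are invented for illustration.

CAPTURED = [  # (word, pitch_hz, duration_s, volume)
    ("play", 180, 0.30, 0.8),
    ("it",   150, 0.15, 0.6),
    ("sam",  120, 0.45, 0.9),
]

def retarget(performance, source_base=150, target_base=75):
    """Move the contour to the target voice's pitch range, preserving
    each word's offset from the source voice's base pitch."""
    return [(word, target_base + (pitch - source_base), dur, vol)
            for word, pitch, dur, vol in performance]

deep_voice = retarget(CAPTURED)  # same emoting, much lower voice
```

A real system would work frame-by-frame rather than word-by-word and scale rather than shift pitch, but the division of labor is the same: the actor supplies the contour, the synthesizer supplies the timbre.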

--Z

"And Aholibamah bare Jeush, and Jaalam, and Korah: these were the borogoves..."
*
* Gore won the undervotes:
http://www.gopbi.com/partners/pbpost/news/election2000_pbgore.html

Daniel Ellison

Aug 1, 2001, 10:57:41 AM
"Andrew Plotkin" <erky...@eblong.com> wrote in message
news:9k941d$4t$1...@news.panix.com...

> It's really very good, though.
>
> I bet you could add markup by hand -- indicating which words were the
> subject and verb of the sentence, which phrases were the meaning and
> which were auxiliary -- and improve it another big chunk.

I suppose it IS pretty good compared to what else is out there. I guess I
just expect too much. I would have thought there would be a much larger
improvement in the ensuing TWENTY years since I was playing with the
SweetTalker ][ speech synthesis card in my Apple ][ clone, making it sound
like WOPR from WarGames. "Shall we play a game?"

I'm still enamoured with the idea of experiencing IF by talking with the
computer and having it describe the results of my actions in a semi-human
voice. I wonder how much this AT&T software will be...

Sean T Barrett

Aug 1, 2001, 11:29:07 AM
Daniel Ellison <dan...@syrinx.net> wrote:
>It allows you to type in sentences and have the AT&T software "say" what you
>typed. It isn't nearly as spectacular as they make it sound; you can
>definitely tell it's computer generated. [snip] The cadence was... well, not
>human. The voice quality was good, though. Certainly a step in the right
>direction!

One problem with judging these things is that they essentially
conflate two entirely different tasks in each product, and you're
left guessing which of the two is contributing.

The first task is "English-to-phonemese," and the second task
is "phonemese-to-audio."

I've heard more natural cadence and stress patterns in fifteen-year-old,
computer-sounding speech synthesizers, but I believe that was
because they had authored the phonemese directly. So in this example
it's hard to tell whether the problem is that the English-to-phonemese
isn't up to guessing the stresses correctly, or whether the audio
synthesis technology isn't up to snuff. From reading about the details
of the tech, which is sample-based instead of synthesis-based, it seems
possible that they actually may have too limited a range of inflections,
and no way of moving smoothly between stress levels (e.g., smoothly
rising and falling in pitch).
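The two-task split can be made concrete with a toy pipeline, in Python for illustration. The pronouncing lexicon here is hand-authored and tiny (the phoneme spellings and stress marks are guesses, not any real engine's inventory), and the "audio" stage just emits (phoneme, pitch) pairs rather than real samples:

```python
# Toy separation of the two tasks: English-to-phonemese, then
# phonemese-to-audio.  Judging them separately is exactly what the
# combined product makes hard.

LEXICON = {          # word -> phonemes, with "'" marking the stressed one
    "little":   ["L", "IH'", "T", "AH", "L"],
    "passages": ["P", "AE'", "S", "IH", "JH", "IH", "Z"],
}

def english_to_phonemese(text):
    """Task 1: look each word up in a pronouncing lexicon."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON.get(word, ["?"]))
    return phonemes

def phonemese_to_audio(phonemes, base_pitch=120):
    """Task 2: render phonemes; stressed ones get a pitch bump."""
    return [(p.rstrip("'"), base_pitch + (20 if p.endswith("'") else 0))
            for p in phonemes]

track = phonemese_to_audio(english_to_phonemese("little passages"))
```

Authoring the phonemese directly means skipping task 1 and handing stress-marked strings straight to task 2, which is why those older synthesizers could sound better-cadenced despite worse audio.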

The scenario Zarf suggests is essentially a way of recovering
phonemese directly, bypassing the English-to-phonemese problem,
which is in practice impossible to solve (actors can say the
same line twenty different ways; it's impossible to know which
one is right without both context and direction).

However, that scenario doesn't sound that likely to me in practice,
any more than virtual actors taking over the movies. Someone still
has to do the acting--the animators or the voice actor--and the
extra layer of indirection of virtual actors gives the director
less immediate control (if more repeatable control in the long
run), as well as making it *more* expensive than just using
actual actors. Sure, it'll be used for special purposes, like
synthesizing dead people, or synthesizing actors who are too
expensive but agree to having a duplicate used (although for
voice-acting human actors can already fake it pretty well; see
also 'Yellow Submarine'...), e.g. for some Star Trek computer
game--although that's not too practical for the current system,
which requires a 10-hour session to capture the voice.

SeanB

Andrew Plotkin

Aug 1, 2001, 11:55:11 AM
Sean T Barrett <buz...@world.std.com> wrote:

> The scenario Zarf suggests is essentially a way of recovering
> phonemese directly, bypassing the english-to-phonemese problem,
> which is in practice impossible to solve (actors can say the
> same line twenty different ways; it's impossible to know which
> one is right without both context and direction).

> However, that scenario doesn't sound that likely to me in practice,
> any more than virtual actors taking over the movies.

I didn't suggest it would take over; I suggested that it could become
a commonly-used technique for the cases where it's valuable. (Creating
the voice of someone who is unavailable, or a voice no human can
produce.)

Entirely-virtual actors aren't likely to take over in movies -- but
motion-capture has become big business.

Sean T Barrett

Aug 1, 2001, 2:55:42 PM
Andrew Plotkin <erky...@eblong.com> wrote:
>Entirely-virtual actors aren't likely to take over in movies -- but
>motion-capture has become big business.

Has it? It's used in a small fraction of computer games, but the
application there is very different: motion capturing allows synthesizing
*3d-character* motion which can be viewed from multiple angles, something
you couldn't do by just filming a person making the moves. There's
no analogy to virtual voice acting for that scenario. (There may be an
analogy for the "inhuman character" mocap scenario, but that's a lot
less frequent in games--inhuman characters are much more likely to be
hand-animated.)

I'm very familiar with the use of the technology in games, since that's
the industry I'm employed in, and indeed I've worked on a game which used
an awful lot of mocaps. You'll have to enlighten me on its application
in other industries.

SeanB

Billy Harris

Aug 1, 2001, 10:42:23 PM
In article <UQQ97.22225$sf2.5...@news3.rdc1.on.home.com>, Daniel
Ellison <dan...@syrinx.net> wrote:

> "Dennis G Jerz" <jer...@uwec.edu> wrote in message
> news:792c6202.01073...@posting.google.com...
>
> > AT&T Labs will start selling speech software that it says
>
> definitely tell it's computer generated. For example, when I typed "You are
> in a maze of twisty little passages, all alike", "alike" was said with a
> long "a", and the whole sentence was spoken as if the person were nervous or

These are phoneme problems rather than speech problems. The production
system will very likely have people use phonetic spellings with
capitals to indicate stresses. I don't have javascript, but see if
something like the following sounds better:

U R en a maize of TWIStee PAHsagess, awl UHlike.

Of course, documentation on what phonemes they use, and an extended
character set to use accents rather than goofy spelling would be great.
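The capitals-for-stress scheme is easy to mechanize once a human has supplied the syllables. A toy Python sketch (the syllable breaks and spellings are hand-authored guesses for illustration, not the production system's actual phoneme set):

```python
# Toy stress markup in the capitals-for-stress style: syllables are
# supplied by hand, and the stressed one is uppercased.

def stress_markup(syllables, stressed_index):
    """Join hand-authored syllables, uppercasing the stressed one."""
    return "".join(s.upper() if i == stressed_index else s.lower()
                   for i, s in enumerate(syllables))

print(stress_markup(["twis", "tee"], 0))        # -> TWIStee
print(stress_markup(["pah", "sa", "gess"], 0))  # -> PAHsagess
```

The hard part, of course, is not the string-joining but deciding the syllable breaks and stress placements in the first place -- which is the human-authoring step being advocated above.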

Alexander Deubelbeiss

Aug 2, 2001, 5:40:17 AM

Billy Harris wrote in message ...

Yes. It would mean that non-native speakers of English have a chance of
using the thing for their languages, for one thing ;) Of course, the
extended character set in question has been around for a while (the IPA
alphabet), and there is even a more-or-less widely known, although
unofficial, "standard" for transliteration into ASCII.

Matthew Russotto

Aug 2, 2001, 9:46:15 AM
In article <BC2D55F690A3B12C.6279F4C5...@lp.airnews.net>,

Way back when, when the small mammals were just beginning to take over
from the dinosaurs, I built a speech synthesizer for the Apple II
based on some General Instruments chip or another. It had a set of a
whole bunch of allophones (variant forms of phonemes). It still
didn't do tone or stress very well, but it did work.

Then I wrote a text-to-speech program in BASIC... worked pretty well
for being completely ad hoc.
--
Matthew T. Russotto russ...@pond.com
=====
Get Caught Reading, Go To Jail!
A message from the Association of American Publishers
Free Dmitry Sklyarov! DMCA delenda est!
http://www.freedmitry.org

Daniel Ellison

Aug 2, 2001, 10:03:54 AM
"Matthew Russotto" <russ...@wanda.pond.com> wrote:

> Way back when, when the small mammals were just beginning to take over
> from the dinosaurs, I built a speech synthesizer for the Apple II
> based on some General Instruments chip or another. It had a set of a
> whole bunch of allophones (variant forms of phonemes). It still
> didn't do tone or stress very well, but it did work.
>
> Then I wrote an text-to-speech program in BASIC... worked pretty well
> for being completely ad-hoc.

Well, that beats me! All I did (see earlier message) is BUY a
SweetTalker ][ from Steve Ciarcia (I think) for my Apple ][ clone. But in
my own defense, I DID write a pretty cool phoneme editor in assembly
language hooked into Applesoft BASIC.

What I want to know is (paraphrasing myself from a previous message): why
hasn't there been a much larger improvement in the ensuing TWENTY years
since I was playing with the SweetTalker ][? For all intents and purposes,
artificial speech today sounds almost identical to that produced way back
when, when the small mammals were just beginning to take over from the
dinosaurs. I would think speech recognition would be the hard problem, not
speech production.

(I'm probably exposing my ignorance here. Go ahead: roast me:)

Karl Ove Hufthammer

Aug 2, 2001, 10:45:36 AM
"Daniel Ellison" <dan...@syrinx.net> wrote in message
news:e5da7.38119$sf2.7...@news3.rdc1.on.home.com:

> What I want to know is (paraphrasing myself from previous
> message) why hasn't there been a much larger improvement in
> the ensuing TWENTY years since I was playing with the
> SweetTalker ][?

There has. Try <URL: http://www.lhsl.com/realspeak/demo.cfm >.

--
Karl Ove Hufthammer

Daniel Ellison

Aug 2, 2001, 10:52:23 AM
"Karl Ove Hufthammer" <huf...@bigfoot.com> wrote:

> There has. Try <URL: http://www.lhsl.com/realspeak/demo.cfm >.

Just tried it. When it got to the point of "saying" what I typed, my
browser gave me a "Cannot find server" page. From the original site it
jumps to 208.162.98.40. That machine itself is alive (it responded to my
"ping"), but no dice otherwise. Sigh. And I was all excited.

Mathias Maul

Aug 2, 2001, 6:10:15 PM
Daniel Ellison <dan...@syrinx.net> wrote:

> What I want to know is (paraphrasing myself from previous message) why
> hasn't there been a much larger improvement in the ensuing TWENTY years
> since I was playing with the SweetTalker ][? [...]

I'd say that one of the reasons there has been no large improvement
in speech production (or, for that matter, in speech recognition) is
that there has been no quantum leap in linguistics. Of course,
tremendous numbers of new theories and models have been developed in the
last twenty years, but as yet not one of them has been successful in
modelling a human speaker (or listener) closely enough to allow for
sensible speech production (or recognition).

Talking of "sensible" -- it all seems to boil down to the idea that you
need to *understand* what you want to say before you can actually
generate and articulate the utterance. Consequently, many people
would say that unless a machine can be made to understand (whatever
that means) what you mean (whatever that means) by, say, "insert red
book into DMCA officer," it is very, very unlikely that natural-sounding
speech can be generated, because there is no linguistic model which would
allow us to robustly analyze the sentence. Even finding out which of
the words are nouns, which are verbs, etc., is hard enough to do
algorithmically, and if you think of this as being just one tiny part of
the puzzle, you might get an idea of how far we still seem to be
from the real thing.
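The noun-or-verb problem is easy to demonstrate with a toy lexicon (entries invented for illustration): even a five-word sentence can yield a pile of legal tag sequences, and nothing in the lexicon alone says which one is right.

```python
# Tiny illustration of the tagging problem: several words admit more
# than one part of speech, and the possible tag sequences multiply.

from itertools import product

LEXICON = {
    "time":   ["noun", "verb"],
    "flies":  ["noun", "verb"],
    "like":   ["verb", "prep"],
    "a":      ["det"],
    "banana": ["noun"],
}

def tag_sequences(sentence):
    """Enumerate every part-of-speech assignment the lexicon allows."""
    options = [LEXICON[w] for w in sentence.lower().split()]
    return [list(seq) for seq in product(*options)]

seqs = tag_sequences("Time flies like a banana")
print(len(seqs))  # -> 8 candidate readings for five words
```

Picking among those eight requires exactly the kind of understanding (or at least statistical context) being discussed, and this is just one word-level slice of the full problem.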


> I would think speech recognition would be the hard problem, not
> speech production.

Unfortunately, both seem to be very very hard. NP-hard, actually. :)


Cheers,
Mathias.

--
"Dance like it hurts. Love like you need the money. Work when
people are watching." -- Dogbert.

John G Wood

Aug 3, 2001, 7:30:25 AM
In article <GHEKK...@world.std.com>, Sean T Barrett wrote:
> Andrew Plotkin <erky...@eblong.com> wrote:
> >Entirely-virtual actors aren't likely to take over in movies -- but
> >motion-capture has become big business.
>
> Has it? [...]

I suppose it depends on your definition of "big business", but I work for a
company producing motion capture equipment and it brings us several million
pounds a year.

> I'm very familiar with the use of the technology in games, since that's
> the industry I'm employed in, and indeed I've worked on a game which used
> an awful lot of mocaps. You'll have to enlighten me on its application
> in other industries.

Here are a few examples where our equipment has been used:

Film: animating non-humans (e.g., Gungans & robots in The Phantom Menace);
overlaying graphics on actors (The Mummy); creating crowd scenes (Titanic,
Gladiator).

Music Videos/TV: Much as for films.

Ergonomics: optimising equipment for human use (various car companies);
tracking people and objects in an environment (for VR & architecture).

Gait Analysis: Examining the walking difficulties of cerebral palsy
sufferers or people with artificial limbs.

Sports Science: Analysing golf swings, rugby tackles, skiing, ...

Surgery: Tracking equipment.

Research: My favourite was recreating the mating dances of pigeons!

I'm sure I've missed some of the interesting ones, but it should give some
idea of the scope.

ObIF: Nope, can't think of anything. Great. My first post in three years
and it's off-topic...

John
(Glad to be back anyway)


Karl Ove Hufthammer

Aug 3, 2001, 10:23:35 AM
"Daniel Ellison" <dan...@syrinx.net> wrote in message
news:HOda7.38266$sf2.7...@news3.rdc1.on.home.com:

> Just tried it. When it got to the point of "saying" what I
> typed, my browser gave me a "Cannot find server" page.

Sorry. Same results for me. You can try <URL:
http://www.research.att.com/~mjm/cgi-bin/ttsdemo >. Not quite as
good as RealSpeak, but not bad either.

--
Karl Ove Hufthammer

David Thornley

Aug 6, 2001, 3:40:48 PM
In article <1exjap1.1nxy0tr1foxl1aN%mm...@gmx.net>,

Mathias Maul <mm...@gmx.net> wrote:
>Daniel Ellison <dan...@syrinx.net> wrote:
>
>> What I want to know is (paraphrasing myself from previous message) why
>> hasn't there been a much larger improvement in the ensuing TWENTY years
>> since I was playing with the SweetTalker ][? [...]
>
>I'd say that one of the reasons why there has been no large improvement
>in speech production (or, for that matter, in speech recognition) is
>that there has been no quantum leap in linguistics.

It's worse than that. There has been no quantum leap in artificial
intelligence (despite numerous advances). (I hope I'm not seen as
quibbling, but I do think you're making natural computer speech
production seem far too easy.)

>Talking of "sensible" -- It all seems to boil down to the idea that you
>need to *understand* what you want to say before you can actually
>generate and articulate the utterance.

Right. In general, the less successful parts of AI have been
concerned with understanding things.

>book into DMCA officer," it is very very unlikely that natural sounding
>speech can be generated because there is no linguistic model which would
>allow to robustly analyze the sentence.

There's more to realistic inflections than sentence analysis is going
to solve.

>Even more, finding out which of
>the words are nouns, which are verbs etc., is hard enough to do
>algorithmically,

Impossible, in the general case. "Time flies like a banana." This
is a pathological case, but there are lots of pathological cases
and you never know when you're going to run into one. (I read
an article on Loglan - I think in DDJ long ago - and it mentioned
the large amount of work that went into the phrase "pretty little
girl's school".)

>and if you think of this as being just one tiny part of
>the puzzle, you might get an idea of how far away we still seem to be
>away from the real thing.
>

Right.

>> I would think speech recognition would be the hard problem, not
>> speech production.
>
>Unfortunately, both seem to be very very hard. NP-hard, actually. :)
>

Human brains can throw a lot of resources at the problems, require
a programming time of well over a decade to get it more or less
right, and are subject to a great many problems. It's not just NP-hard,
it's AI-complete.

--
David H. Thornley | If you want my opinion, ask.
da...@thornley.net | If you don't, flee.
http://www.thornley.net/~thornley/david/ | O-

Daniel Ellison

Aug 6, 2001, 7:01:41 PM
"David Thornley" <thor...@visi.com> wrote:

> intelligence (despite numerous advances). (I hope I'm not seen as
> quibbling, but I do think you're making natural computer speech
> production seem far too easy.)

Well, I DID say earlier: "I'm probably exposing my ignorance here. Go
ahead: roast me." I certainly make no claim to be knowledgeable on this
subject. That's why I brought it up in the first place; I required
enlightenment. :) I was curious about the apparent lack of advancement in
the field. After all, 20 years in computer time is pretty well equivalent to
a millennium or two in real years. And no, I have NO scientific proof of
this.

> and you never know when you're going to run into one. (I read
> an article on Loglan - I think in DDJ long ago - and it mentioned
> the large amount of work that went into the phrase "pretty little
> girl's school".)

Loglan was first publicly described in an article published by James Brown
in Scientific American magazine in 1960. Good God y'all, hit me! I
remember searching through several libraries (of course, way-pre-web) for
that issue of SciAm. Fascinating stuff! An artificial human language with
zero ambiguity that can easily be parsed by a computer... sounds tailor-made
for IF. Of course, you'd have to learn to speak another language, but
that's easy! Heh.

And yes, I know: not THAT James Brown.

Daniel Ellison
Toronto, Ontario
dan...@syrinx.net

"David Thornley" <thor...@visi.com> wrote in message
news:4pCb7.3625$x84.7...@ruti.visi.com...

David Thornley

Aug 7, 2001, 2:12:58 PM
to
In article <plFb7.5050$st4.1...@news3.rdc1.on.home.com>,

Daniel Ellison <dan...@syrinx.net> wrote:
>"David Thornley" <thor...@visi.com> wrote:
>
>> intelligence (despite numerous advances). (I hope I'm not seen as
>> quibbling, but I do think you're making natural computer speech
>> production seem far too easy.)
>
>Well, I DID say earlier: "I'm probably exposing my ignorance here. Go
>ahead: roast me".

I'm sorry. How I intended to be interpreted has apparently
suffered from the usual TCP/IP nuance filter. I meant to support
what you were saying and extend it.

>I certainly make no claim to be knowledgeable on this
>subject. That's why I brought it up in the first place; I required
>enlightenment. :) I was curious about the apparent lack of advancement in
>the field. After all 20 years in computer time is pretty well equivalent to
>a millenium or two in real years. And no, I have NO scientific proof of
>this.
>

There are a lot of things about computers that do not advance much in
twenty years. You don't hear about them because there aren't that
many people working in those areas and "Professor achieves little"
is not really newsworthy unless the journalist is trying to do
one of those "your tax dollars pay for this research" articles.

The easy part of speech production was done reasonably well
twenty years ago. (Any early Mac users remember the Talking Moose?)
The hard part is still largely intractable.

Mathias Maul

Aug 7, 2001, 5:58:37 PM
to
Hey David,

you wrote:

> (I hope I'm not seen as quibbling, but I do think you're making natural
> computer speech production seem far too easy.)

Oh, I didn't do that on purpose: I am quite aware of the issues of
"realistic" machine-based speech production. (From a mainly
linguistically oriented perspective, however. The two years I spent in
AI were too few to grasp much more than the basics.)


> It's not just NP-hard, it's AI-complete.

I fear it's even worse than that.

When I wrote my Masters Thesis (in Linguistics and CompSci) on Speech
Recognition, I was astonished at the somewhat simplistic approaches
undertaken by some researchers in the more engineering-oriented field.
Take an utterance, perform an FFT, apply some filters, route it through
some pre-trained HMMs, et voilà, there's your string of phonemes.
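The decoding step at the end of that pipeline is typically a Viterbi search over the HMM states. A minimal sketch, with invented states, observation symbols, and probabilities (not any real recognizer's model):

```python
# Minimal Viterbi decoder over a toy HMM -- the last stage of the
# FFT -> filters -> HMM pipeline described above.  States stand in for
# "phonemes"; observations stand in for quantized acoustic frames.
import math

states = ["k", "ae", "t"]                 # invented "phoneme" states
start  = {"k": 0.8, "ae": 0.1, "t": 0.1}  # initial-state probabilities
trans  = {"k":  {"k": 0.2, "ae": 0.7, "t": 0.1},
          "ae": {"k": 0.1, "ae": 0.3, "t": 0.6},
          "t":  {"k": 0.1, "ae": 0.1, "t": 0.8}}
emit   = {"k":  {"A": 0.7, "B": 0.2, "C": 0.1},
          "ae": {"A": 0.1, "B": 0.8, "C": 0.1},
          "t":  {"A": 0.1, "B": 0.1, "C": 0.8}}

def viterbi(obs):
    """Most probable state path for an observation sequence (log space)."""
    V = [{s: (math.log(start[s]) + math.log(emit[s][obs[0]]), [s])
          for s in states}]
    for o in obs[1:]:
        V.append({})
        for s in states:
            # Best predecessor state, scored in log space.
            p, path = max((V[-2][ps][0] + math.log(trans[ps][s]),
                           V[-2][ps][1]) for ps in states)
            V[-1][s] = (p + math.log(emit[s][o]), path + [s])
    return max(V[-1].values())[1]

print(viterbi(["A", "B", "C"]))  # -> ['k', 'ae', 't']
```

The "simplistic" part is exactly what this sketch leaves out: the states are assumed to be distinct, segmental, phoneme-like entities, which is the assumption some of the linguistic models mentioned above reject.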

I mean "simplistic" by no means in a derogatory sense -- many of these
approaches are considered "simplistic" from a linguist's point of view.
And FWIW, especially when taking into account that they do not
*understand* the utterances, many "technical" recog systems are doing
very well.

My point is that while the engineers were building more and more
sophisticated techniques to extract phonemes, linguists were busy
creating models of how language (in the broadest sense of the word)
works, and some of these models were even postulating the non-existence
of distinct, segmental, "phoneme"-like entities. (I'm particularly fond
of those. It makes compscientists and engineers go crazy. :) )

With these two (arbitrarily defined) groups of scientists working on
different sides of the fence, the result was that some very very clever
models emerged on both sides. But when linguists had built a nice model,
they tended to say, well, this model should be sufficient for doing
automatic speech recognition, so would someone please implement it? At
the same time, however, the engineers were so occupied with building
even better HMM libraries and noise-reduction filters and whathaveyou
that there was (and, AFAIK, is) not as much interaction between these
not-so-disparate groups as one might have wanted.

Of course, there are some pleasant exceptions to that. (Project HAL
comes to mind, anyone know if it's still alive?)


Erm. I forgot what it was I wanted to say. Oh yes, the topic of
AI-completeness. Pardon my ignorance, but is "AI-complete" a technical
term? I've never heard it in that context. If it is and if the
implication of "completeness" is the same as in "NP-complete" (which is
what I probably meant when I wrote "NP-hard") then you are most probably
right: As soon as the issue of speech production/recognition/
understanding/wa-wa is "solved," then the solution to all other
AI-complete problems might pop up instantly. And as much as I would like
this to happen tomorrow, it seems that we may have to wait and work for
another couple of millennia for this quantum leap to happen.


...


As some other poster on rai-f has recently said:

</soapbox>,
Mathias. :)

Billy Harris

Aug 8, 2001, 12:10:59 AM
to
In article <1exsfi4.10fdjuk1dsm8sgN%mm...@gmx.net>, Mathias Maul
<mm...@gmx.net> wrote:

> Erm. I forgot what it was I wanted to say. Oh yes, the topic of
> AI-completeness. Pardon my ignorance, but is "AI-complete" a technical
> term? I've never heard it in that context. If it is and if the

I think AI-complete is referring to the General Knowledge problem: the
fact that computers don't have any common sense and, even worse, don't
have any idea that their "knowledge" doesn't apply to a situation.

As applied to natural language understanding, the English sentence "Even
John could pass that test" is giving us information about John as well
as information about the test; essentially any computer able to
convincingly converse with a human must also be able to learn new
things, adapt old knowledge to new situations, form plans, understand
human languages, know when to ask relevant questions [and what question
to ask], distinguish between true statements, statements the speaker
believes to be true but might be incorrect, and outright lies, and on
and on and on.....

A similar situation exists in many other subfields of AI; for example,
a perfect machine learning algorithm would be able to learn how to
speak naturally, etc.

Magnus Olsson

Aug 8, 2001, 4:21:33 AM
to
In article <8640F72169FA14B8.8C1B1EFA...@lp.airnews.net>,

Billy Harris <wha...@mail.airmail.net> wrote:
>In article <1exsfi4.10fdjuk1dsm8sgN%mm...@gmx.net>, Mathias Maul
><mm...@gmx.net> wrote:
>
>> Erm. I forgot what it was I wanted to say. Oh yes, the topic of
>> AI-completeness. Pardon my ignorance, but is "AI-complete" a technical
>> term? I've never heard it in that context. If it is and if the
>
>I think AI-complete is refering to the General Knowledge problem. This
>is referring to the fact that computers don't have any common sense and
>even worse don't have any idea that their "knowledge" doesn't apply to
>a situation.

The term "AI-complete" is formed in analogy to "NP-complete". As far
as I understand things, unlike "NP-complete" it isn't "technical" in
the sense of having a rigorous mathematical definition, but it's
convenient.

Simply put, a problem is AI-complete if solving it means that you
have to create "real" AI. Saying that language understanding is AI-
complete implies that you can't separate language understanding from
other aspects of human intelligence; to understand human language,
you must in essence be human.


--
Magnus Olsson (m...@df.lth.se, m...@pobox.com)
------ http://www.pobox.com/~mol ------

Jason Melancon

Aug 9, 2001, 1:37:24 PM
to
On Tue, 07 Aug 2001 23:10:59 -0500, Billy Harris
<wha...@mail.airmail.net> wrote:

> essentially any computer able to
> convincingly converse with a human must also be able to learn new
> things, adapt old knowledge to new situations, form plans, understand
> human languages, know when to ask relevant questions [and what question
> to ask], distinguish between true statements, statements the speaker
> believes to be true but might be incorrect, and outright lies, and on
> and on and on.....

Heck, I wish *I* could do all that stuff.

--
Jason Melancon

Jonadab the Unsightly One

Aug 12, 2001, 12:04:56 AM
to
thor...@visi.com (David Thornley) wrote:

> >> I would think speech recognition would be the hard problem, not
> >> speech production.
> >
> >Unfortunately, both seem to be very very hard. NP-hard, actually. :)
> >
> Human brains can throw a lot of resources at the problems, require
> a programming time of well over a decade to get it more or less
> right, and are subject to a great many problems. It's not just NP-hard,
> it's AI-complete.

The thing that makes voice recognition so much worse is that
the computer has to do the understanding.

Speech generation is impossible to get sounding right, but
in most cases you can produce something that an intelligent
human, fluent in the language, can decipher with relative ease.
The intonation may be off, especially at the sentence level,
but barring unknown words and dialectical differences the
computer can at least make itself understood -- because the
one doing the understanding is the human -- and humans *can*
do things that are AI-complete.

Of course, if you set the bar to the level of producing
speech that's indistinguishable from human speech, then
speech generation may be just as hard as voice recognition.

- jonadab

David Thornley

Aug 14, 2001, 11:53:06 AM
to
In article <3b75fd7b...@news.bright.net>,

Jonadab the Unsightly One <jon...@bright.net> wrote:
>thor...@visi.com (David Thornley) wrote:
>
>> >> I would think speech recognition would be the hard problem, not
>> >> speech production.
>> >
>> >Unfortunately, both seem to be very very hard. NP-hard, actually. :)
>> >
>> Human brains can throw a lot of resources at the problems, require
>> a programming time of well over a decade to get it more or less
>> right, and are subject to a great many problems. It's not just NP-hard,
>> it's AI-complete.
>
>The thing that makes voice recognition so much worse is that
>the computer has to do the understanding.
>
There are other things. If I listen to a person, I hear distinct
words. It's hard to get a computer to do that. I can understand
people against background noise and with various sorts of distortions
(although I'm worse than average at it), and this gets very tricky.

It's not all that difficult to do primitive speech production,
which involves going from a well-defined text and producing corresponding
sounds. Recognizing speech involves dealing with highly variable
analog input.
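That "well-defined text to corresponding sounds" step can be caricatured in a few lines: a pronunciation lookup table with a letter-by-letter spelling fallback. The mini-dictionary and ARPAbet-style phoneme strings below are invented for illustration; real systems use large pronunciation lexicons plus letter-to-sound rules.

```python
# A caricature of "primitive" speech production: map each word to a
# canned phoneme string via table lookup, and spell out unknown words
# letter by letter.  Feeding the phoneme string to recorded diphones or
# a formant synthesizer would be the (much harder) next step.
PRONUNCIATIONS = {
    "open": "OW P AH N",
    "the":  "DH AH",
    "door": "D AO R",
}

def to_phonemes(text):
    """Return a '|'-separated phoneme string for each word of `text`."""
    out = []
    for word in text.lower().split():
        # Fallback: spell unknown words letter by letter.
        out.append(PRONUNCIATIONS.get(word, " ".join(word.upper())))
    return " | ".join(out)

print(to_phonemes("open the door"))  # -> OW P AH N | DH AH | D AO R
```

This is roughly the Talking Moose level of the problem; everything jarring about machine speech (intonation, stress, timing) lives downstream of it.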

>Speech generation is impossible to get sounding right, but
>in most cases you can produce something that an intelligent
>human, fluent in the language, can decipher with relative ease.
>The intonation may be off, especially at the sentence level,

I find that jarring, personally. I don't know about anybody
else, but I find it hard to listen to machine-generated speech
for long.

>but barring unknown words and dialectical differences the
>computer can at least make itself understood -- because the
>one doing the understanding is the human -- and humans *can*
>do things that are AI-complete.
>

Yup.

If you can understand the written text, you can understand the
spoken text. It won't sound natural, barring a great breakthrough.

>Of course, if you set the bar to the level of producing
>speech that's indistinguishable from human speech, then
>speech generation may be just as hard as voice recognition.
>

It may be possible to put enough ad hoc rules in to make it
sound almost as good most of the time, but it will always be
possible to make it wrong (if only by finding sentences with
two possible renderings based on meaning, and declaring the one
the computer comes up with wrong).

John W. Kennedy

Aug 23, 2001, 12:49:51 PM
to
In re: Loglan, see http://www.loglan.org.

--
John W. Kennedy
(Working from my laptop)

