Getting Visemes from Live Audio Input

shineling

Jan 12, 2008, 10:54:43 AM

I have a talking puppet whose mouth moves to SAPI5 voices.

What I'd like to do now is hook it up to a microphone.

Does anyone know how to extract visemes from a live audio input?

Thanks!

sergio

shineling

Jan 13, 2008, 5:38:44 AM

Nobody knows how to do this?

Steve Meyer [MSFT]

Jan 17, 2008, 11:57:54 PM

Sorry, Sergio, but viseme events are only created during speech synthesis,
not during speech recognition.

-- Steve Meyer

This posting is provided "AS IS" with no warranties, and confers no rights.
Please do not send e-mail directly to this alias. This alias is for
newsgroup purposes only.

shineling

Jan 18, 2008, 5:00:05 AM

I have a program (no source code) that takes a .wav file and extracts
visemes. How was that done, and could it shed some light on my
situation?

Thanks.

shineling

Jan 18, 2008, 5:11:26 AM

Correction: the program I have extracts phonemes, not visemes. Either of
them will do. I just need to extract mouth positions from spoken audio.

http://www.ccir.ed.ac.uk/~jad/phonemes.html

So it is possible, and it does work. It uses SAPI voice recognition.

If I find a solution I will post it.

Steve Meyer [MSFT]

Jan 18, 2008, 11:03:53 AM

Hi Sergio,

Maybe I misunderstood what you want to do. Yes, it is possible to get the
pronunciations for recognized speech and it is possible to convert from
phonemes to visemes, which is what James's tool does. But you can't do it
in real time because the recognition of the live audio needs to complete
before you can convert it to visemes.
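
The phoneme-to-viseme step mentioned above can be sketched as a simple table lookup. This is only an illustration, not the method phonemes.exe actually uses. The table follows the 22 viseme IDs (0 to 21) of the SAPI 5 SPVISEMES enumeration as best recalled, so check it against the SAPI documentation before relying on it. The function name and the choice to treat unknown symbols as silence are assumptions made for this example.

```python
# Assumed grouping of SAPI 5 American English phoneme symbols by viseme ID.
# Verify against the SPVISEMES documentation before using in production.
VISEME_GROUPS = {
    1: ("ae", "ax", "ah"),
    2: ("aa",),
    3: ("ao",),
    4: ("ey", "eh", "uh"),
    5: ("er",),
    6: ("y", "iy", "ih", "ix"),
    7: ("w", "uw"),
    8: ("ow",),
    9: ("aw",),
    10: ("oy",),
    11: ("ay",),
    12: ("h",),
    13: ("r",),
    14: ("l",),
    15: ("s", "z"),
    16: ("sh", "ch", "jh", "zh"),
    17: ("th", "dh"),
    18: ("f", "v"),
    19: ("d", "t", "n"),
    20: ("k", "g", "ng"),
    21: ("p", "b", "m"),
}

SILENCE = 0  # viseme 0 is the closed/neutral mouth

# Invert the table so each phoneme symbol points at its viseme ID.
PHONEME_TO_VISEME = {
    phoneme: viseme
    for viseme, phonemes in VISEME_GROUPS.items()
    for phoneme in phonemes
}


def phonemes_to_visemes(phonemes):
    """Map phoneme symbols to viseme IDs, collapsing consecutive repeats.

    Unknown symbols (pauses, stress marks, and so on) are treated as
    silence, which is an assumption of this sketch.
    """
    visemes = []
    for phoneme in phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme.lower(), SILENCE)
        if not visemes or visemes[-1] != viseme:
            visemes.append(viseme)
    return visemes
```

For example, `phonemes_to_visemes("h eh l ow".split())` returns `[12, 4, 14, 8]`. Since the phonemes only become available once recognition has finished, the resulting list is something to play back afterwards, not a live signal.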

If processing the audio beforehand and playing back the visemes later meets
your needs, then it sounds like phonemes.exe should give you what you want
(though the link to the executable isn't working for me). If you'd like to
write your own program for this, let me know and I can provide some details
that should help you out. And if this is the case, let me know what
language you'd be working in.

-- Steve Meyer

jfch...@gmail.com

Sep 10, 2012, 4:48:39 PM

If you ever get live viseme recognition working, you could build a gold mine from a new "ToonTalk" phone application for mobile devices. In a televatar application, users could specify which 2D or 3D character they want to appear as when speaking on a mobile device. There would be a website where people can upload or download 2D and/or 3D televatars. Each televatar has an image of the avatar for each viseme, and the application recognizes live which viseme each user is pronouncing. The conversation video, which should be savable to a video file, would cut live between the televatars based on which one is currently speaking. Each televatar user could also specify their own background image (either custom or from the televatar website). If done correctly, each phone conversation could be converted live into a television cartoon or 3D animation. Good luck!