how to specify long i phoneme?

greg

Sep 9, 2010, 12:54:16 PM
to TTS-for-Android
According to the X-SAMPA "English five" example at http://en.wikipedia.org/wiki/Xsampa,
the way to specify the pronunciation of five is "faIv". However, the
SVOX pico speech output for the following code

String text = "<speak xml:lang=\"en-US\"> <phoneme alphabet=\"xsampa\" ph=\"faIv\"/>.</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);

is "fiv" rather than "five". What is the proper way to specify, using
X-SAMPA, the long i sound to pico?

- Greg

greg

Sep 9, 2010, 1:31:33 PM
to TTS-for-Android
I thought perhaps adding the X-SAMPA symbol for long (i.e., :) might
help, but the SVOX pico speech output for the following code

text = "<speak xml:lang=\"en-US\"> <phoneme alphabet=\"xsampa\" ph=\"faI:v\"/>.</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);

is still "fiv" rather than "five".

- Greg


Johan Wouters

Sep 17, 2010, 9:05:36 AM
to tts-for...@googlegroups.com
Hi Greg,

You could try the following xsampa: "faI_^v". The underscore and caret
indicate that "I" is a semivowel.

Johan

greg

Sep 17, 2010, 10:07:17 AM
to TTS-for-Android
Hi, Johan.

Thanks for the tip. I was unaware of the ability to specify
semivowels. Unfortunately, the xsampa "faI_^v" still produces the
speech output "fiv" (and so does "fa_^I_^v") rather than "five".

- Greg

Johan Wouters

Sep 20, 2010, 12:35:43 PM
to tts-for...@googlegroups.com
Hi Greg,

The following xsampa sequences should work for en-US diphthongs:

o_U   nose    "no_Uz
O_I   noise   "nO_Iz
a_I   rise    "r\a_Iz
a_U   rouse   "r\a_Uz
e_I   raise   "r\e_Iz

greg

Sep 20, 2010, 2:41:10 PM
to TTS-for-Android
Hi, Johan.

Thanks. The xsampa sequence "fa_Iv" produces the desired long i sound
in the speech output "five".
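
For anyone repeating these experiments, the SSML string can be built by a small helper so that only the X-SAMPA part changes between tries. This is a minimal sketch: the class name PhonemeSsml and the method phoneme are made up for illustration, and the markup is simply the one used in the snippets in this thread. It has no Android dependency, so the generated string can be checked on a desktop JVM first.

```java
public class PhonemeSsml {
    /**
     * Wraps an X-SAMPA transcription in the same SSML markup used in this thread.
     * Hypothetical helper, not part of the Android TTS API.
     */
    static String phoneme(String lang, String xsampa) {
        return "<speak xml:lang=\"" + lang + "\"> <phoneme alphabet=\"xsampa\" ph=\""
                + xsampa + "\"/>.</speak>";
    }

    public static void main(String[] args) {
        // "fa_Iv" is the sequence reported above to produce "five".
        System.out.println(phoneme("en-US", "fa_Iv"));
    }
}
```

On Android the result would then be passed to mTts.speak(text, TextToSpeech.QUEUE_ADD, null) as in the other snippets.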

I'm not sure how to specify your example of rise ("r\a_Iz), though, due
to the backslash escape character. I tried the following, and they all
produced a speech output sounding like "huh ive".

text = "<speak xml:lang=\"en-US\"> <phoneme alphabet=\"xsampa\" ph=\"ra_Iz\"/>.</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);

text = "<speak xml:lang=\"en-US\"> <phoneme alphabet=\"xsampa\" ph=\"r\\a_Iz\"/>.</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);

text = "<speak xml:lang=\"en-US\"> <phoneme alphabet=\"xsampa\" ph=\"r*a_Iz\"/>.</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);

My attempted use of an asterisk is based upon the note in
http://en.wikipedia.org/wiki/Xsampa that "X-SAMPA uses backslashes as
modifying suffixes to create new symbols. For example O is a distinct
sound from O\, to which it bears no relation. Such use of the
backslash character can be a problem, since many programs interpret it
as an escape character for the character following it. For example,
you cannot use such X-SAMPA symbols in EMU, therefore you need to
replace backslash with some other symbol (e.g. an asterisk: '*') when
adding phonemic transcription to an EMU speech database."

Clearly, my use of http://en.wikipedia.org/wiki/Xsampa as a reference
hasn't been particularly effective. Is there a reference you
recommend for those trying to learn how to specify pronunciations
using xsampa sequences? (In a few weeks, I'll be volunteering with
some 7th and 8th graders on a project to specify the pronunciation of
the scientific names of the birds of North America using xsampa, pico,
and Android.)

Thanks again,
Greg


greg

Sep 21, 2010, 7:24:07 AM
to TTS-for-Android
Thanks, Johan. Yes, four backslashes work. The speech output
specified by the following code is "rise".

text = "<speak xml:lang=\"en-US\"> <phoneme alphabet=\"xsampa\" ph=\"r\\\\a_Iz\"/>.</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);
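
A likely explanation, offered as an assumption rather than something confirmed in this thread: the Java compiler consumes one level of backslash escaping, and Pico's phoneme parser appears to consume a second. The plain-Java sketch below (the class name BackslashDemo and the method countBackslashes are made up for illustration) only shows the first level, i.e. how many backslash characters are actually in the string handed to the engine.

```java
public class BackslashDemo {
    /** Counts the literal backslash characters contained in a string. */
    static long countBackslashes(String s) {
        return s.chars().filter(c -> c == '\\').count();
    }

    public static void main(String[] args) {
        String two = "r\\a_Iz";     // two backslashes in source, one in the string
        String four = "r\\\\a_Iz";  // four backslashes in source, two in the string

        System.out.println(two + " -> " + countBackslashes(two));
        System.out.println(four + " -> " + countBackslashes(four));
    }
}
```

With two backslashes in the source, the engine receives r\a_Iz; with four, it receives r\\a_Iz, which is the form reported above to produce "rise".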

