questions about Voice Dream text-to-speech app

Mark Spahn

Dec 16, 2013, 7:16:08 PM

to win...@voicedreams.com, hon...@googlegroups.com

Dear Winston,
Somebody turned me on to your Voice Dream app by
pointing me to your website
http://www.voicedream.com/ .
You have an intriguing text-to-speech app, but I have
two questions that I can't find answers to on your site.

(1) Your home page (rightmost column) refers to a
"Dyslexia friendly font", which is also referred to
at the bottom (third-last line) of page
http://www.voicedream.com/?page_id=80
as the "OpenDyslexia font".
Where can I find what this "dyslexia-friendly" font
looks like? I have a hard time understanding how
one font could encourage or discourage dyslexia
any more than any other font. And why would a
user want a font that is "dyslexia friendly", rather than
a dyslexia-hostile font that tries to extirpate dyslexia?

(2) Before selecting a voice, is it possible to audition
the voice by having it read a brief "lorem ipsum" text?
Under the US English voices on page
http://www.voicedream.com/?page_id=1758
is a voice described as "Saul, Hip-pop".
This might be ideal as a readback voice to listen to
as I proofread what I have written. Rather than
listening to the bland voice I hear in my head when
I read a text myself, maybe I can catch more typos
by listening to the lively rhythms of a rap artist like Saul.

By the way, here is an apparent error you will want to fix:
Under "What's the syntax for the pronunciation dictionary's
text mode?", which is the first item on page
http://www.voicedream.com/?page_id=2466 ,
are the two terms "1=Anywhere" and "Any Text".
Are these the same thing? If so, you will minimize
confusion by adopting a single term for
this match mode.
-- Mark Spahn (West Seneca, NY)

Joel Dechant

Dec 17, 2013, 8:19:40 AM

to hon...@googlegroups.com, mark...@verizon.net

Mark

In 199whenever, when I purchased the seminal Spahn and Hadamitzky kanji dictionary as a college student (and accidental student of Japanese, mind you), never in my wildest dreams did I imagine that one day I would be recommending synthetic hip-hop voices to one of the authors.

On that note, if you download the "Lite" version of the Voice Dream app, you can test drive all of the major functions and play small samples of all of the voices from the settings menu. What makes the Lite app Lite is that it stops reading the text every 30 seconds, forcing you to restart playback frequently (and prompting you to upgrade). But this should be sufficient for you to decide whether or not the app--and Hip Hop Saul--is the right solution for you. (Or you may find you are partial to Australian Tyler instead.)

Best

Joel Dechant

Fukuoka, Japan

(from a musty guesthouse in Beppu)

Mark Spahn

Dec 17, 2013, 3:34:23 PM

to hon...@googlegroups.com

Mark[,]

- - - - - - - - - -

Joel,

Haha, a strange twist of fate!

Without having the device on which the Voice Dream app is meant to run, there does not seem to be a way to audition the repertory cast of narrators; i.e., it's not as simple as playing a YouTube clip on a desktop computer.

The perfect narrator would be like the actor Edward Norton reading

http://www.amazon.com/Ambush-Fort-Bragg-Tom-Wolfe/dp/0553478966 .

He adopts different voices, accents, and pacing for the different characters and for the third-person-omniscient narrator. I do not expect such talent from any of the Voice Dream narrators, but maybe a less-than-ideal narration would better serve the purpose of catching typos.

I wonder how sensitive the Voice Dream narrators are to the stage-direction instructions that are provided by punctuation. Here's a quick-and-easy(?) experiment you can try. Have Hip-Hop Saul narrate the following dialogue.

"Mr. Zuckerberg," said the stock broker, "Consolidated Apps

is up to $14.52. Should I buy you a thousand shares?"

Mark answered, "No[,] price too high!"

Does Hip-Hop Saul's replication of Mark Zuckerberg's intonation (and his meaning) vary depending on whether the comma is there? That is, is Hip-Hop Saul smart enough to insert a pause after a comma?

-- Mark Spahn (West Seneca, NY)

P.S. I've always thought that "Beppu" sounds like it should be in Vietnam.

Mark Spahn

Dec 19, 2013, 3:22:45 AM

to hon...@googlegroups.com

Mark,

I would not expect too much from the voices, but they no longer have that computery Microsoft Sam sound circa Win 3.1. And yes, they will stop at commas.

I am sending you some links to samples of the voices you can use with VoiceDream.

They are all developed by third-party vendors, so you may be able to download and use them with another program that suits your needs. The Acapela site has a link below the voices to a demo screen where you can enter your own text and try them out.

For the record, I really like Will (BadGuy). I might just have to get him and have him read my university policy documents back to me like Walter White.

Acapela

http://www.acapela-group.com/english-us-36-text-to-voice.html

NeoSpeech

http://www.neospeech.com/

(samples available in bottom right menu)

FYI, a pharma translator I know swears by Second Speech Center, although I have never used it:

http://download.cnet.com/2nd-Speech-Center/3000-7239_4-10386344.html

Best

joel

- - - - - - - - - -

Joel,

Fascinating.

I listened to a few of the voice samples, while wondering whether they are computer generated or rather voiced by an actor. At the Acapela (why not "Acapella"?) site, I listened to Will (bad guy) and heard him pronounce the noun CONtent as the adjective conTENT, so I guess that's evidence of text-to-speech software control rather than a human voice actor. (This is kind of like a reverse Turing test: trying to detect a human who is faking a fake voice.) But in some other samples the noun "content" *is* pronounced correctly as CONtent (feedback from commenters?).
The character Will (old man) speaks with "vocal fry" (look it up; a way for women to sound authoritative); this is the creakiness heard in the voice of Elmer Fudd.

Saul certainly sounds hip-hoppy, but slightly mispronounces his own name as "Sol", and "a try with" should have more of a pause after "try" than it does.

On page

http://www.acapela-group.com/text-to-speech-interactive-demo.html ,

Saul does indeed make the proper intonation distinction between "No price too high!" and "No, price too high!".

I also listened to a few non-English voice samples. The Japanese "Sakura (Female)" at

http://www.acapela-group.com/japanese-165-text-to-voice.html

sounds very good, except that "riyou houhou" sounds too much like "ryou-houhou" (both methods). And to my ignorant ear, "Happy Jeroen" of Flanders

http://www.acapela-group.com/dutch-b-34-text-to-voice.html

sounds pretty Dutchy (the B is for Belgian Dutch, I infer).

The only voice I listened to that is sub-par is the six-second sample of speech by Sjudur of the Faeroe Islands

http://www.acapela-group.com/faeroese-37-text-to-voice.html ,

who needs to practice his word-transition elocution.

But maybe there's not much of a market for text-to-speech voices in the Faroese language

https://en.wikipedia.org/wiki/Faroese_language ,

which is spoken by fewer people than live in the town of Kennewick, Washington.

All in all, these voices are much more natural-sounding than the voice used by the theoretical physicist Stephen Hawking (but maybe he has updated his synthetic voice, as better versions become available).

Personally, I don't think I'll switch over to proofreading in this high-tech way, but I am grateful to you for this glimpse into world of text-to-speech voices.

Mark Spahn

Dec 19, 2013, 3:41:52 AM

to hon...@googlegroups.com

If you enjoy masochism as you proofread, you can

hardly get a better text reader than "Queen Elizabeth (Female)":

http://www.acapela-group.com/english-uk-35-text-to-voice.html

It actually sounds like Her Majesty, with all her royal screechiness.

(And she slurs "have a try" so that it sounds like "have a drive".

Maybe she's had a nip of one of her country's fine libations.)

-- Mark Spahn (West Seneca, NY)

P.S. This is tangentially related to the telephone voice discussed at

http://drudgegae.iavian.net/r?hop=http://newsfeed.time.com/2013/12/10/meet-the-robot-telemarketer-who-denies-shes-a-robot/

You can play a recording of the brief phone conversations.

Mark Spahn

Dec 19, 2013, 11:49:36 PM

to hon...@googlegroups.com

> Personally, I don't think I'll switch over to proofreading

> in this high-tech way, but I am grateful to you for this

> glimpse into world of text-to-speech voices.

> -- Mark Spahn (West Seneca, NY)

Drat! That should be "this glimpse into THE world of

text-to-speech voices". If only I had had a synthetic

voice read back this passage to me, this omission of

"the" would not have happened.
