We are currently looking for a limited number of beta testers
for our Text-to-Speech Object. If you are interested, please
contact us at:
While we will try to accommodate everyone who wishes to test
the Object, we have the resources to release the distribution
to only a dozen or so beta testers. We already have six testers
lined up, so the remaining places will be filled on a first-come,
first-served basis. If you are interested, please send the following
information to us:
a) Your name.
b) Your academic or business affiliation, if any.
c) A postal address to which we can mail the distribution.
d) A brief description of the application(s) in which
you may be incorporating text-to-speech.
The distribution will be on 1.44 MB floppy disks unless other
arrangements are made with us.
TEXT-TO-SPEECH OBJECT: DESCRIPTION
The Text-to-Speech Object allows developers to incorporate
real-time conversion of text to synthesized speech into any
application they write. The Object allows unlimited text
input, with pronunciations derived primarily by dictionary
lookup. The developer and user can override or supplement
the Main Pronunciation Dictionary by adding words (and their
corresponding pronunciations) to the Application and User
Dictionaries. Full parsing of numbers is also provided, and
a letter-to-sound algorithm provides pronunciations whenever
a word cannot be found in any of the pronunciation dictionaries.
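
The lookup order described above can be illustrated with a short
sketch. It is not the Object's actual interface: every name below is
hypothetical, the phoneme strings are made up, the precedence of the
User Dictionary over the Application Dictionary is an assumption (the
text says only that both can override the Main Dictionary), and the
letter-to-sound step is a toy stand-in that simply spells the word out.

```python
# Hypothetical sketch of a pronunciation lookup cascade.
# Names, phoneme strings, and dictionary precedence are assumptions.

MAIN_DICT = {"read": "r iy d", "speech": "s p iy ch"}
APP_DICT = {"dsp": "d iy eh s p iy"}
USER_DICT = {"read": "r eh d"}  # overrides the Main Dictionary entry


def letter_to_sound(word):
    """Toy stand-in for a letter-to-sound algorithm: spell out the letters."""
    return " ".join(word)


def pronounce(word, user=USER_DICT, app=APP_DICT, main=MAIN_DICT):
    """Return the first pronunciation found, falling back to letter-to-sound."""
    key = word.lower()
    # Assumed precedence: User, then Application, then Main.
    for dictionary in (user, app, main):
        if key in dictionary:
            return dictionary[key]
    # The word is in none of the dictionaries.
    return letter_to_sound(key)
```

With these toy tables, "read" resolves to the User Dictionary entry
rather than the Main Dictionary entry, and a word found in no
dictionary falls through to the letter-to-sound fallback.
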
Synthesis of the speech is done by rule in real time on the DSP.
This is in contrast to speech synthesis that uses compressed,
pre-recorded phonemes, syllables, or words. The main advantages
of speech synthesis-by-rule are that it does not require storage
for the pre-recorded segments (typically tens of megabytes), and
it is very flexible. This flexibility allows words to be connected
into smooth utterances, and lets the programmer set the voice speed,
volume, and voice quality. Also, intonation and rhythm are easily
handled by the system, which means that utterances will sound
different when, for example, a sentence ends with a question mark or
an exclamation mark, as opposed to a period. Another benefit is
that the user can specify that a particular word be emphasized
when spoken.
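
As a rough illustration of punctuation-dependent intonation, the
sketch below picks a sentence-final contour label from the terminal
punctuation mark. The function name and the three labels are
hypothetical, and the mapping is a deliberate simplification (for
example, not every question ends on a rise); it is not a description
of the rules the Object itself applies.

```python
# Hypothetical sketch: choose an intonation label from terminal punctuation.
# The labels and the mapping are simplifying assumptions.

def final_contour(sentence):
    """Return an assumed sentence-final intonation label."""
    stripped = sentence.rstrip()
    if stripped.endswith("?"):
        return "rising"
    if stripped.endswith("!"):
        return "emphatic"
    # Periods and anything else are treated as a plain statement.
    return "falling"
```
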
The Text-to-Speech system is implemented as a server that can
serve a number of clients. The advantage of this structure is
that the developer must link in only a very small amount of object
code. This means that the developer can keep the application
very small in size while retaining access to a very powerful and
large service.
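
The client/server split can be sketched as follows. This is only an
in-process analogy under stated assumptions: the class names, the
`speak` method, and the use of a thread and a queue are all
hypothetical and say nothing about how the real server and its
clients communicate. The point is that each client holds only a thin
stub, while the single server holds everything large.

```python
# Hypothetical sketch of one server shared by several thin clients.
# Class and method names are assumptions, not the Object's API.
import queue
import threading


class SpeechServer:
    """Stand-in for the large service that would own dictionaries and synthesis."""

    def __init__(self):
        self.requests = queue.Queue()
        self.spoken = []  # record of handled requests, in arrival order
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            client_id, text = self.requests.get()
            # A real server would synthesize speech here.
            self.spoken.append((client_id, text))
            self.requests.task_done()


class SpeechClient:
    """Thin stub: the only code an application would need to link in."""

    def __init__(self, server, client_id):
        self._server = server
        self._client_id = client_id

    def speak(self, text):
        # Forward the request; no synthesis logic lives in the client.
        self._server.requests.put((self._client_id, text))


server = SpeechServer()
editor = SpeechClient(server, "editor")
mailer = SpeechClient(server, "mailer")
editor.speak("File saved.")
mailer.speak("You have new mail.")
server.requests.join()  # wait until the server has handled both requests
```

Both clients share one server, and requests are handled in the order
they arrive.
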