Speech


bri

Dec 23, 1997, 3:00:00 AM

I'm looking for some software for voice recognition on the Amiga. Is there
any software out there, and if so, where? I have looked on Aminet and
a few other sites, but no luck so far. Also, is it possible to get a
reasonable-sounding voice out of my Amiga, as "SAY" sounds completely crap?


Philip Kaulfuss

Dec 23, 1997, 3:00:00 AM

Following his latest murder, bri smeared on the wall in blood...

: I'm looking for some software for voice recognition on the Amiga. Is there any
: software out there and if so then where, as I have looked in Aminet and a few
: other sites, but no luck so far.

The only thing I've seen is a little toy called Animan. It puts a rendered
head on your screen which talks to you and will respond to voice commands.

: Also is it possible to get a reasonable sounding voice out of my Amiga as "SAY"
: sounds completely crap.

There was a replacement for the Amiga's speech synthesizer released a year or
so ago (can't remember the name). It allowed you to configure almost every
aspect of how the speech sounded and was apparently very good when configured
properly. However, I came across a demo version and it sounded terrible - worse
than Say. Maybe the full version is better.

--
,- Philip Kaulfuss ------------------------- "Are you hungry? D'you want -.
|- ph...@boehme.demon.co.uk Some salt?" -|
|- http://www.boehme.demon.co.uk Vic Reeves, -|
`- PhilK in Undernet #HenryMichaels and #AmigaCafe --- talking to a bird -'


Kirk Strauser

Dec 24, 1997, 3:00:00 AM

On 23-Dec-97 15:29:19, Philip Kaulfuss (ph...@boehme.demon.co.uk) wrote:
>Following his latest murder, bri smeared on the wall in blood...

>> Also is it possible to get a reasonable sounding voice out of my Amiga as
>> "SAY" sounds completely crap.

>There was a replacement for the Amiga's speech synthesizer released a year or
>so ago (can't remember the name). It allowed you to configure almost every
>aspect of how the speech sounded and was apparently very good when configured
>properly. However, I came across a demo version and it sounded terrible -
>worse than Say. Maybe the full version is better.

I believe you're referring to "SoftTalk" or "SofTalk" or similar. It was
a commercial program for a while, but it was released as either shareware
or (I think) freeware a while back. Anyway, the full version is on
Aminet.

Kirk Strauser | Member // | Teknique on Undernet/#AmigaCafe,
kstr...@pcis.net | Team AMIGA \X/ | http://www.pcis.net/kstrauser/


sp...@spam.spam.spam

Dec 24, 1997, 3:00:00 AM

These were the words of <bri> on Tue, 23 Dec 1997 20:03:10 -0000:


> I'm looking for some software for voice recognition on the Amiga. Is there
> any software out there and if so then where, as I have looked in Aminet and
> a few other sites, but no luck so far. Also is it possible to get a
> reasonable sounding voice out of my Amiga as "SAY" sounds completely crap.

There's a simple voice recognition type thing on Aminet which does CLI
commands by speech input with a sampler. I played with it a couple of
years back, and it worked OK. Not very useful, but it was kind of neat
doing Star Trek type stuff: "Computer, Format drive dh0: name..."

util/misc/VS121.lha (may be more recent version, didn't check)


--

<-AD-> <morse 'at' ahab.demon.co.uk>


Vanilla Gorilla

Dec 24, 1997, 3:00:00 AM

On 24-Dec-97 05:27:00 wrote something like this....

There is a more recent version, VS122, and there is also VoiceShell
1.33. I don't know if it is the same app or a different one.

Timothy Rue

Dec 26, 1997, 3:00:00 AM

Am I correct in believing these apps have a vocabulary limit and are also
user specific?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*3 S.E.A.S - Virtual Interaction Configuration (VIC) - VISION OF VISIONS*
*~ ~ ~*
Timothy Rue Email: tim...@mindspring.com What's DONE in anything we do?
AI PK OI IP OP SF IQ ID KE
Web @ http://www.mindspring.com/~timrue/ >INPUT->(Processing)->OUTPUT>v
^---------<---------<--------<
Search email/name @ http://www.dejanews.com for other posts/puzzle parts.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Robert Berkey

Dec 26, 1997, 3:00:00 AM

Michael Wrote:

>Hi Bri,

>On Tue, 23 Dec 1997, bri wrote:

>: ...Also is it possible to get a reasonable sounding voice out of my Amiga
>: as "SAY" sounds completely crap.

>I happen to like the way that Wread sounds. It can be found here...

>WreadFiles47.lha util/misc 123K 147+Enhanced vocal text file reader

>...on AmiNet. It takes text files and reads them back to you. If you want
>to make it a little different, try...

>EsperantoAccen.lha dev/misc 3K Esperanto.accent for translator.library 42
>Ax_1Srpski.lha util/libs 2K Serbian accent for translator.library V43
>DeutschAkzent.lha util/libs 5K German Accent for translator lib V42
>DeutscherAkzen.lha util/libs 4K German Accent for translator lib V42
>Klingon-Accent.lha util/libs 96K Klingon Accent for translator.library v42
>LSHunAccent.lha util/libs 2K Hungarian (magyar) Accent for
> translator.library v42
>makeUSA.lha util/libs 2K Fixes American Accent of Translator42
>nlaccent.lha util/libs 2K Dutch accent-file for Translator.library
>SpanishAccent.lha util/libs 41K Spanish accent for Translator Library v42
>Tran43pch.lha util/libs 41K Patches Translator42.4, 43.0 to
> Translator43.1
>translator42.lha util/libs 80K Multilingual translator lib replacement
>

>...on AmiNet. I have only used a couple of the English ones and just played
>around with some of the others.

>And as always, YMMV.

>C-ya,

>Michael (sho...@best.com)

>Team *AMIGA*

---

OK... Right-on Michael..

One of the more forgotten aspects of the std Amiga..

Actually one of the reasons I bought mine in 1986.. <s>.

Haven't tried Wread but wanta try the Esperanto dialect..

Stalled at half way through a free ESP course..
Really curious how the language actually sounds.. :-)..

---

Bob Berkey

Member: Team AMIGA

--//--


sp...@spam.spam.spam

Dec 27, 1997, 3:00:00 AM

These were the words of <Timothy Rue> on 26 Dec 97 19:29:54 -0500:


> On 25-Dec-97 01:28:37 Vanilla Gorilla <vg...@monkeyshines.com> wrote:
> >On 24-Dec-97 05:27:00 wrote something like this....
> >>These were the words of <bri> on Tue, 23 Dec 1997 20:03:10 -0000:
>
> >>> I'm looking for some software for voice recognition on the Amiga. Is there
> >>> any software out there and if so then where, as I have looked in Aminet
> >>> and a few other sites, but no luck so far. Also is it possible to get a
> >>> reasonable sounding voice out of my Amiga as "SAY" sounds completely crap.
>

> >>There's a simple voice recognition type thing on Aminet which does CLI
> >>commands by speech input with a sampler. I played with it a couple of
> >>years back, and it worked OK. Not very useful, but it was kind of neat
> >>doing Star Trek type stuff: "Computer, Format drive dh0: name..."
>
> >> util/misc/VS121.lha (may be more recent version, didn't check)
>
> >there is a more recent version VS122, also there is VoiceShell
> >1.33. Don't know if it is the same app or a different one.
>
> >>--
> >> <-AD-> <morse 'at' ahab.demon.co.uk>
>
> Am I correct in believing these apps have a vocabulary limit and are also
> user specific?

As I remember it, there isn't an actual limit to the size of the word
dictionary given in the docs, but the more words you add, the slower
the recognition becomes. You can set a value for the 'strictness' of
match between what you say and what is in the dictionary, and this
affects speed also. When I was playing with the software, it worked
reasonably well on a friend's spoken commands after being programmed
with my voice.
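
To illustrate why both dictionary size and the strictness setting would matter, here is a minimal sketch of a brute-force template matcher. It is not how VoiceShell or voice.library actually works (I have not seen their internals); the feature vectors, the `recognise` function and the `strictness` threshold are all made up for the example.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognise(sample, dictionary, strictness):
    """Return the closest word, or None if nothing is within `strictness`.

    Every stored template is compared in turn, so the work grows
    linearly with the number of words in the dictionary.
    """
    best_word, best_dist = None, float("inf")
    for word, template in dictionary.items():
        d = distance(sample, template)
        if d < best_dist:
            best_word, best_dist = word, d
    return best_word if best_dist <= strictness else None

# Hypothetical three-number "voice prints" for three words.
dictionary = {
    "computer": [0.9, 0.1, 0.4],
    "format": [0.2, 0.8, 0.5],
    "drive": [0.1, 0.3, 0.9],
}
sample = [0.25, 0.75, 0.5]  # a slightly different utterance of "format"

print(recognise(sample, dictionary, strictness=0.2))   # loose match
print(recognise(sample, dictionary, strictness=0.05))  # strict match
```

With the loose setting the sample is accepted as "format"; with the strict one it is rejected. A looser threshold is also what would let a second speaker's voice match templates recorded by the first.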

I recall the system using a library (voice.library) for the
recognition part, so I imagine this could be used to base other speech
recognition applications on.

Does this mean that VIC is going to include speech input?

Timothy Rue

Dec 28, 1997, 3:00:00 AM

No, but this doesn't exclude the VIC from being used to improve speed of
such voice command software.

The key thing to understand about the VIC is that it is not to become
bloatware. Its evolutionary direction is just the opposite
(streamline-ware): to be a set of fundamental functionality that
supports Virtual Interaction. It is through integration (team work) of
its primary functions that we achieve additional functionality and power.
Changes in its specs are only to be made in order to better integrate
(improve the team work benefits) so it can handle "any" exceptions.
Finding exceptions it cannot handle identifies something in need of
improvement. Of course the target objective of the VIC is to provide a
tool having ultimate versatility in its ability to integrate.

--

You said the voice application slows down as the word dictionary grows.
This is a search problem where applying constraints can help improve
speed. Having the ability to change the constraints as you go along can
bring additional speed. No need to search what you know is not going to
match. By narrowing down (you might call this focusing) the search based
on input sequence, speed might well increase as you go along (to reach a
given objective).

This might sound complex if you're thinking of pattern matching voice
patterns. But there are several ways of thinking about and doing this, all
with the same underlying concept of applying and changing constraints as
you go.

To get an easy idea, take a look at the front section of a thesaurus. There
is the "Plan Of Classification" and "Tabular Synopsis Of Categories". These
provide you with what you might see as a search tree. Through these you
can quickly find a word with the meaning you want, without searching
the whole book.

Now let's apply this to the voice command applications, but not on the
primary level of voice patterns (this would be too difficult for most of us
to organize into a search tree). But we can do it with changing word
dictionaries.

First word: Computer (so it knows we are talking to it.)

Second word: Utilities (for the group application type dictionary)

Third word: Format (the application we want as well as its dictionary)

Fourth word: "D" (for the drives dictionary)

Fifth word: "F" (for the floppy dictionary)

Sixth word: "0" (identifying the specific drive)
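
The six steps above can be sketched as a walk through a set of small dictionaries, where each recognised word names the next dictionary to load. This is only an illustration under my own assumptions: the dictionary names, the `DICTIONARIES` table and the `walk` function are hypothetical and are not taken from the VIC or from any voice application.

```python
# Each dictionary maps a spoken word to the name of the dictionary that
# should be loaded next (None means there is nowhere further down to go).
DICTIONARIES = {
    "top": {"computer": "groups"},
    "groups": {"utilities": "utilities"},
    "utilities": {"format": "format"},
    "format": {"d": "drives"},
    "drives": {"f": "floppy"},
    "floppy": {"0": None, "1": None},
}

def walk(words, start="top"):
    """Follow `words` down the tree, recording which dictionary was
    searched at each step and how many entries it held."""
    current = start
    trail = []
    for word in words:
        vocabulary = DICTIONARIES[current]
        if word not in vocabulary:
            raise KeyError(f"{word!r} is not in the {current!r} dictionary")
        trail.append((current, len(vocabulary)))
        next_name = vocabulary[word]
        if next_name is None:
            break
        current = next_name
    return trail

for name, size in walk(["computer", "utilities", "format", "d", "f", "0"]):
    print(f"searched {name!r} ({size} entries)")
```

The point of the sketch is that no single lookup ever searches more than the handful of words that are valid at that moment, however many words exist in total.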


At this point we seem to have gotten ourselves into a corner, stuck in the
floppy dictionary, unable to complete the command. Although we could have
just included in the format dictionary all the things format might be used
on (which might include text files), this problem is worth a better look.

We know other actions might be performed on DF0 (i.e. backup, cd, reorg,
etc.). Instead of duplicating or reinventing this data in each
application's dictionary, it's better to simply provide specific
dictionaries and ways of getting to them from within the dictionaries that
need them. In this case, within the format dictionary, it's "D" (which is
set/defined to get the drive dictionary, though "D" might mean something
else in another dictionary).

Because DF0 can be used in so many ways, we cannot define its (or "0"'s)
definition action to go back to the format dictionary. But we do know that
DF0 is an argument to something, in this case the format utility. And
what we are really doing is creating a command line to execute. So we put
this into the contents of a variable, as is done with each element.

We have worked our way down in dictionaries and now we need to get back up
to the format dictionary and eventually back up to the dictionary
containing the word "Utilities" in order to execute the command.

But HOW do we "get back up"?

Cycles within cycles within cycles, etc.. Think of it this way:

Many years ago (while studying programming in college) I came up with what
has got to be the simplest program flowchart method of all. Circles within
circles. Draw a circle and within this circle draw a smaller circle that
touches the outer circle at some point. Consider the outermost circle of a
program its "main" function and the smaller circles that touch it
"sub-routines". Of course any circles within "sub-routines" that touch them
are sub-routines of these "sub-routines", etc.

A cycle is made up of a sequence of actions, and the point where an inner
circle touches an outer one is its entry and exit point. You might see
the OS as the biggest circle for applications but the real biggest circle
is the computer on/off switch. :)

With the above we can see that we need to create and use cycles. Cycles
we can enter into and exit out of. How do we do this?

Scripts: the basic element of a user created cycle is the sequencing of
actions in a script. Exiting a script or cycle is as simple as
coming to the end of the script or to a command that ends the script.

By having scripts executed within scripts, inherently "getting back" is
simple. In the VIC this is done thru the SF line in the PK file, a stack
of filenames and line-numbers, keeping the overhead of script file/line
sequencing down to a minimum.
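
As a rough sketch of how a stack of filenames and line numbers lets nested scripts "get back up", consider the toy interpreter below. The `CALL` and `EXIT` keywords, the script names and the `run` function are invented for this example; the real layout of the SF line in the PK file is not described in this post and is not reproduced here.

```python
def run(scripts, name):
    """Run script `name`, following CALL lines into other scripts, and
    return the ordinary lines in the order they were executed."""
    output = []
    stack = [(name, 0)]  # each frame: (script filename, next line number)
    while stack:
        filename, line_no = stack.pop()
        lines = scripts[filename]
        if line_no >= len(lines):
            continue  # end of script: the caller's frame is now on top
        stack.append((filename, line_no + 1))  # where to resume later
        line = lines[line_no]
        if line.startswith("CALL "):
            stack.append((line[5:], 0))  # enter the inner cycle
        elif line == "EXIT":
            stack.pop()  # leave this script early
        else:
            output.append(line)
    return output

scripts = {
    "format.script": ["ask drive", "CALL drives.script", "ask name"],
    "drives.script": ["ask unit", "EXIT", "never reached"],
}
print(run(scripts, "format.script"))
```

Leaving the inner script, whether by running off its end or by `EXIT`, simply uncovers the frame of the script that called it, so no script needs to know who its caller was.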


Back to our format task:

Ok, we have established which "drive" to "format" and only need to exit
any inner scripts within the format script and set the voice dictionary
being used back to the format dictionary.

Q: What's this format script? How do we know that the user's voice input
is following this script?

Let's recap for a moment:

The word "computer" gets the system ready for input (gets its attention
on mic input, other than the attention getting word "computer").

To keep things simple and direct, we will have the word "computer" also
trigger the VIC AI command, setting the VIC up for use by the voice
app and waiting for input from it.

++ Remember, this is not voice patterns we are getting from the voice app
but the results of its pattern matching against its (the voice app's) own
dictionaries. Dictionaries which we will be *changing* via the VIC.

In the spirit of simplicity, the voice app. is only used to match a
voice pattern (sound) with a word (text). A word it sends to the VIC.
In other words: The Voice app only translates from sound to text, NOT
command execution.

The VIC, in its script processing, first changes the voice app
dictionary used (for search speed). Likewise, it can change its (VIC)
dictionary to parallel the voice dictionary change. The next action (VIC
script) is to get input. Ah, a cycle: get input, act on it and repeat.

A: We are building a command line that has specific elements. All we need
do is to fill in each required element and any optional elements. So long
as we get back to the format dictionary we can fill in each of these
elements in the same way as we did "DF0", without regard to the final
or proper sequence of argument elements in a command line. With this in
mind the format script may be as simple as "get input, act on it and
repeat".

There is one more thing needed here. We need a word to exit out of this
cycle ("go" or any word defined in the format dictionary to do so) and in
the process place the elements in proper order. The result is placed in
a variable that the utility script sends to a command line for
execution. Once sent, the simple utility script exits by shutting the VIC
down, which cleans up before shutdown.
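
The "fill in the elements in any order, then put them in proper order on go" idea can be sketched as follows. This is an illustration only: the `build_command` function, the slot names and the final `format drive ... name ...` layout are my assumptions, so check the AmigaDOS Format documentation for the exact argument template before relying on it.

```python
REQUIRED_ORDER = ["drive", "name"]  # assumed order of the final arguments

def build_command(spoken):
    """Turn an utterance such as 'computer utility format ... go' into a
    command line, accepting the drive and name elements in either order."""
    words = spoken.lower().split()
    if words[:3] != ["computer", "utility", "format"]:
        raise ValueError("expected 'computer utility format ...'")
    slots = {}
    current = None
    for word in words[3:]:
        if word == "go":
            break  # exit word: stop collecting and assemble the command
        if word in REQUIRED_ORDER:
            current = word  # switch to filling this element
            slots[current] = ""
        elif current is not None:
            slots[current] += word  # spelled-out letters are joined up
        else:
            raise ValueError(f"unexpected word {word!r}")
    else:
        raise ValueError("command never finished with 'go'")
    missing = [key for key in REQUIRED_ORDER if not slots.get(key)]
    if missing:
        raise ValueError(f"missing elements: {missing}")
    drive = slots["drive"].upper() + ":"
    return f"format drive {drive} name {slots['name']}"

print(build_command("Computer utility format name f u n drive D F 0 go"))
# prints: format drive DF0: name fun
```

Saying the drive before the name gives the same command line, which is the whole point: the speaker does not have to remember the order the utility wants.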

So the vocal input might be:

"Computer utility format name f u n drive D F 0 go".

Ultimately it's a trade-off between VIC processing time and total voice
dictionary size, to accomplish overall increased voice processing speed.
With only a small voice dictionary, it might not be worth it. But if you
plan on expanding the voice dictionary(s) to a large size or perhaps to
handle specific individuals, then it will be worth it. Of course those
likely to benefit most from such are the handicapped, even those with
additional speech problems (voice word dictionaries customized for them).

I know I've left out some details (specifics of how the VIC does these
things) but I'm already pushing flames with the size of this post.

In the same way the VIC gets input from the voice app, this input could
come from other methods. Also such a VIC process could be created to
allow easy creation of parallel voice/vic dictionaries.

But do understand the VIC does not do the work of a voice translation
program, but can assist in its use. No need to reinvent the wheel, just a
way to integrate the wheel with other things, including other application
control.

A few notes:

Dragon Systems is perhaps the best or at least most notable voice command
software available today. But it's for the PC. I've seen commercials for,
I think, IBM e-business regarding voice to text conversion. And I know the
Mac has some pretty decent computer voice output.


I hope I haven't bored or otherwise confused anyone. Perhaps there will
even be an improvement in how serious the VIC is taken. Maybe even
inspire some to openly contribute to the creation of the VIC.

There are two more notes (They've been on my mind).

1) Thru the ICOAs resource pages there is mention of seven different
methods of external program control on other systems, but that AREXX (port)
on the AMIGA is the best. Not having the limitations or disadvantages
these other methods have.

I didn't know there were seven others, but know AREXX (port) is very
usable. With this in mind, perhaps now there is not so much competitive
concern regarding my stand of the VIC being freeware. Certainly as the VIC
gets out and people find its usefulness, they will be drawn towards the
system that allows optimum use of the VIC. The Amiga. And yes, I've always
known this. :) Additionally I'm well aware imitators will always crop up
but having it right to begin with, well it only leaves the imitators with
something less than right to do. :)

2) The archived resource of the newsgroups. What are we putting into this
archive? A lot of garbage and a little information that will be usable over
time? It's our choice, let's not waste this potentially valuable resource
too badly. Consider how each newsgroup is specific in its overall topic.
Consider how all newsgroups might be mapped like the front of a thesaurus
so topic and field specific information can more easily be found. For
example where might one find natural language processing functionality?
How about CAD specific vocabularies? etc. The foundation size of this
archived resource is huge and it only follows in having a huge potential
far greater than what anyone or group could hope to build outside of this
already built foundation.


I just wish I had more time to focus my effort, but the VIC has gotta be
freeware, I gotta earn an income and right now it only leaves me one day a
week to do domestic chores and focus on the VIC and related things (i.e.
web page updating). And to think I don't currently work with computers.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*3 S.E.A.S - Virtual Interaction Configuration (VIC) - VISION OF VISIONS!*
*~ ~ ~ Advancing the way we Perceive and Use the Tool of Computers!*
Timothy Rue What's DONE in anything we do?
Email @ tim...@mindspring.com v<--------<----9----<--------<
Web @ http://www.mindspring.com/~timrue/ | *AI PK OI IP OP SF IQ ID KE* |
>INPUT->(Processing)->OUTPUT>^
Search email/name @ http://www.dejanews.com for other puzzle parts/posts.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

