More fun with Speech Recognition (at least for me) :)

Mark Conrad

Nov 16, 2009, 10:22:16 PM

Whoopie! - Never knock the sheer fun aspect of
speech recognition technology.

Just for fun, I decided to crank up the dictation
speed of that medical example which I have been
posting recently.

Instead of dictating at a sedate rate of 75wpm, I
dictated at 200wpm, just to see what would happen.

Darn near sprained my tongue trying to get phrases
like "perioperative transesophageal echocardiography"
dictated at a fast rate.

The result?

Just like I said, either MacSpeech or I will make
some mistakes at those speeds with complex text.

In this case, I think MacSpeech stumbled first, not me.

...on this _correct_ segment:

"were carefully separated from primary chords"

Somehow, MacSpeech wrongly interpreted that as:

"were carefully separated from her Merry chords"

I played back the original audio, and apparently I did
slur that phrase a tiny bit.

But "her Merry", c'mon. Any human could have figured
out what I was saying, and why in the heck did MacSpeech
capitalize "Merry"? All very dumb.

Ladies and gentlemen, that was the _only_ mistake that
MacSpeech made in my entire 600 word dictation at 200wpm.
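
For anyone who wants to put a number on that, here is a rough sketch in Python of the usual word-level edit-distance count. The function name and the 600-word total are my own choices for illustration (the dictation was only approximately 600 words). Note that this strict word-by-word scoring treats "her Merry" as two word errors (one substitution plus one insertion), even though I count it as a single mistake.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Minimum number of word substitutions, insertions and deletions
    needed to turn the reference into the hypothesis
    (Levenshtein distance over words, ignoring capitalization)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


spoken = "were carefully separated from primary chords"
recognized = "were carefully separated from her Merry chords"

errors = word_errors(spoken, recognized)
total_words = 600  # approximate length of the dictation (assumption)
accuracy = 100.0 * (1 - errors / total_words)
print(errors, round(accuracy, 2))  # prints: 2 99.67
```

Either way you count it, one mistake or two word errors, the raw accuracy on this passage comes out above 99.6%.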

Looks like I am going to have to retract my previous
statement that MacSpeech is not capable of handling
complex medical speech at fast speaking speeds.

I will admit that I probably contributed to the one
error by slapping in a different microphone, without
creating a new "user profile" for the microphone.

One should always create a new user profile, when
switching to a new microphone, but I got lazy.

The brand of the microphone is "The Boom", model C.

Just too bad that general Mac users can not take
advantage of my level of speech recognition
performance, due to several factors.

Disclaimer - the following points are not a criticism
of MacSpeech. IMO they are doing a great job of
bringing modern speech recognition to the Mac,
particularly considering their tiny size,
coupled with the extreme difficulty
of programming a speech app'.

1) lousy MacSpeech help file, non-intuitive

2) very poor ability to search help file
for advice, same reason, non-intuitive

3) no way to deactivate built-in words that
will never be used, those built-in words
often interfere with proper recognition of
the words that a user *WILL* be using
(by contrast, Dragon allows most all of the
built-in words to be deactivated by the user)

4) no tutorial built into MacSpeech
(Vista Speech on a PC has a great tutorial)

Speech recognition is not easy to learn
well enough to be useful, despite all
the hype to the contrary.

For those who missed the complex medical junk
which I spoke into MacSpeech Dictate 1.5.5
(the regular $200 non-medical version),
my next post will show all (approx') 600
words of it.

Mark-

Mark Conrad

Nov 16, 2009, 10:26:35 PM

In article <161120091922163366%ae...@mostly.invalid>, Mark Conrad
<ae...@mostly.invalid> wrote:

> My next post will show all (approx') 600
> words of it.

Here it is.

---------------------------

New Surgical Procedure for Ischemic/Functional Mitral
Regurgitation: Mitral Complex Remodeling

Hirokuni Arai, MD, PhD*, Fusahiko Itoh, MD, Takeshi
Someya, MD, Keiji Oi, MD, PhD, Kiyoshi Tamura, MD,
PhD, Hiroyuki Tanaka, MD, PhD

Department of Cardiothoracic Surgery, Tokyo Medical and
Dental University Graduate School of Medicine, Tokyo,
Japan

* Address correspondence to Dr Arai, Department of
Cardiothoracic Surgery, Tokyo Medical and Dental
University Graduate School of Medicine, 1-5-45 Yushima,
Bunkyo-ku, Tokyo, 113-8519, Japan Email:
hiro...@tmd.ac.jp

On-pump beating heart mitral complex remodeling was
performed without aortic clamping. The mitral valve was
exposed through a left atriotomy posterior to the
interatrial groove. Interrupted 2-0 braided horizontal
mattress sutures without pledgets were placed around
the annulus to optimize exposure of the subvalvular
apparatus. Secondary chords to the anterior leaflet
from both papillary muscles were carefully separated
from primary chords with a nerve hook and were divided.

Two pairs of 5-0 and 4-0 Gore-Tex sutures (W. L. Gore &
Associates, Flagstaff, AZ) were each placed to both
fibrous portions of the anterior and posterior
papillary muscle tips, buttressed with pledgets of
autologous pericardium. Two pairs of the free arms of
the 5-0 Gore-Tex sutures were twice passed through the
free edge of the middle portion of the anterior leaflet
about 5 mm from the margin, from ventricular to atrial
side. Suture length was adjusted to be the same length
as the corresponding marginal chords, and the sutures
were tied.

Each pair of the free arms of the 4-0 Gore-Tex sutures
was passed through the posterior annulus at sites
around the border of the lateral and middle portions
and middle and medial portions of the annulus,
respectively (annulopapillary suture), and was also
passed through corresponding sites in the annuloplasty
ring (Carpentier-Edwards Physio; Edwards Lifesciences,
Irvine, CA). The 26-mm semi-rigid annuloplasty ring was
then seated. The annulopapillary sutures were pulled to
retract the papillary muscle tips closer to the
annulus, to the point at which leaflet coaptation
occurred in the plane of the mitral annulus during
systole, to visually confirm no residual MR. Suture
lengths were determined, and the sutures were tied.

To avoid air embolism, a vent cannula with a
pressure-monitoring catheter (TOYOBO Co Ltd, Osaka,
Japan) was inserted into the left ventricular apex and
was connected to the suction circuit equipped with a
small reservoir chamber (Senko Medical Instrument Mfg
Ltd, Saitama, Japan). During the final adjustment of
the annulopapillary suture length, this chamber was
filled with blood, and the height of the fluid level of
this chamber was adjusted to load the left ventricle.
The left ventricular systolic pressure was monitored to
keep it slightly lower than the systemic perfusion
pressure to avoid ejection through the aortic valve.

This new technique has been performed on 3 patients
with ischemic/functional MR. The patients were aged 61,
64, and 69 years; their ejection fractions were 0.34,
0.25, 0.32; their left ventricular diastolic diameters
were 62, 74, and 79 mm; and tenting heights were 11,
12, and 14 mm, respectively. Preoperatively, all
patients showed severe MR; perioperative
transesophageal echocardiography showed disappearance
of MR. Mitral valvular function has remained stable
during a mean short-term follow-up of 6 months (range,
1 to 12 months), with no or trivial MR noted.

------------------------------

Wes Groleau

Nov 16, 2009, 11:20:48 PM

Mark Conrad wrote:
> I will admit that I probably contributed to the one
> error by slapping in a different microphone, without
> creating a new "user profile" for the microphone.

No question that's the cause.

It couldn't _possibly_ have any connection
with the TV playing loudly across the room.

Don't let anyone, not even Nuance, tell you otherwise.

--
Wes Groleau

Film Review: The Blue Butterfly
http://Ideas.Lang-Learn.us/russell?itemid=1565

Mark Conrad

Nov 17, 2009, 8:00:16 AM

In article <hdt8au$jfp$2...@news.eternal-september.org>, Wes Groleau
<Grolea...@FreeShell.org> wrote:

> > I will admit that I probably contributed to the one
> > error by slapping in a different microphone, without
> > creating a new "user profile" for the microphone.
>
> No question that's the cause.
>
> It couldn't _possibly_ have any connection
> with the TV playing loudly across the room.
>
> Don't let anyone, not even Nuance, tell you otherwise.

In this case I can not blame the text error on my TV,
because I was in my quiet den, with no outside noises.

I will admit that my reason for switching microphones
was because I want to run a "fun test" at 200wpm
dictation in the near future.

...or faster; I think I can muster at least 220 wpm
on that medical example.

"The Boom" microphone is well known to be a better
performer in the presence of outside noise,
compared to the $20 headset microphone that I
commonly use.

FWIW, you are correct in your assumption that _any_
outside noise which makes it through the microphone
to the speech app' itself, can contribute to mistakes.

It is just a question of _how_ _much_ noise can be
tolerated without drastically degrading accuracy.

In my experience, quite a bit of outside noise can be
tolerated, _PROVIDED_ the microphone "setup"
is properly adjusted.

e.g. - adjust the mic' so it almost brushes your lips,
and speak in a somewhat louder voice when setting up
the microphone automatic volume control.

Speech recognition would be almost useless if a user
had to be in a quiet environment in order to
get acceptable accuracy.

BTW, continue the good work in keeping me honest,
as I welcome any criticism of my technique.

It really bugs me that the general user of speech apps
finds it so difficult to get the same level of performance
as I manage to achieve.

I wish Nuance, MacSpeech, Microsoft, would all stop the
nonsense of advertising how easy it is supposed to be
to get first-rate accuracy and speed out of their
respective speech recognition apps.

It is *NOT* easy, in fact it is downright difficult.

(at least for the sort of complex medical
dictation which I commonly speak)

Mark-

Mark Conrad

Nov 17, 2009, 2:46:50 PM

In article <171120090500168385%ae...@mostly.invalid>, Mark Conrad
<ae...@mostly.invalid> wrote:

> FWIW, you are correct in your assumption that _any_
> outside noise which makes it through the microphone
> to the speech app' itself, can contribute to text mistakes.
>
> It is just a question of _how_ _much_ noise can be
> tolerated without drastically degrading accuracy.
>
> In my experience, quite a bit of outside noise can be
> tolerated, _PROVIDED_ the microphone "setup"
> is properly adjusted.
>
> e.g. - adjust the mic' so it almost brushes your lips,
> and speak in a somewhat louder voice when setting up
> the microphone automatic volume control.

Whoops, I should have added _why_ the above adjustments
work to prevent outside loud noises from degrading the
accuracy of text output.

Essentially, the close placement of the microphone to the
user's mouth means that his voice is the loudest thing that
the microphone "hears".

His voice therefore overwhelms the background loud
blaring of the TV, despite how loud the TV appears
to sound.

It is something like shouting into the ear of a mostly
deaf person, a person so deaf that they can't hear
a loud TV.

In practice, once the microphone is properly adjusted,
I can speak in a normal volume voice, or even a somewhat
softer volume voice, however my "soft" voice is still the
loudest thing around as far as the microphone is concerned,
because I am practically swallowing the microphone.
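
To show roughly why distance matters so much, here is a small Python sketch using the free-field inverse-square law, where the level drops by 20*log10(distance) dB relative to the level at one metre. All of the figures (60 dB voice, 75 dB TV, TV three metres away) and the function names are hypothetical values picked for illustration, not measurements. A real mouth is not a point source, so the model is only approximate at a centimetre or two.

```python
import math


def level_at_mic(level_at_1m_db: float, distance_m: float) -> float:
    """Sound level at the microphone for a source that measures
    level_at_1m_db at one metre, assuming free-field inverse-square
    spreading from a point source."""
    return level_at_1m_db - 20.0 * math.log10(distance_m)


# Hypothetical figures, for illustration only.
VOICE_DB_AT_1M = 60.0  # ordinary speaking voice
TV_DB_AT_1M = 75.0     # loud TV
TV_DISTANCE_M = 3.0    # TV across the room


def voice_over_tv_db(mic_distance_m: float) -> float:
    """How many dB the voice exceeds the TV at the microphone."""
    voice = level_at_mic(VOICE_DB_AT_1M, mic_distance_m)
    tv = level_at_mic(TV_DB_AT_1M, TV_DISTANCE_M)
    return voice - tv


for cm in (10, 5, 2, 1):
    margin = voice_over_tv_db(cm / 100)
    print(f"mic at {cm:>2} cm: voice is {margin:.1f} dB above the TV")
```

Under these assumptions, every halving of the mouth-to-microphone distance buys about 6 dB over the TV, and moving from 10 cm to 1 cm buys 20 dB.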

BTW, that is directly counter to Nuance advice about
"correct" microphone positioning, but it works for me.
(i.e., no discernible accuracy penalty)

Mark-

Wes Groleau

Nov 17, 2009, 8:25:22 PM

Mark Conrad wrote:
> His voice therefore overwhelms the background loud
> blaring of the TV, despite how loud the TV appears
> to sound.

Except every time you pause to breathe.

--
Wes Groleau

New numbers for next year
http://Ideas.Lang-Learn.us/barrett?itemid=1495

Mark Conrad

Nov 18, 2009, 1:42:56 AM

In article <hdvie0$bdt$3...@news.eternal-september.org>, Wes Groleau
<Grolea...@FreeShell.org> wrote:

> > His voice therefore overwhelms the background loud
> > blaring of the TV, despite how loud the TV appears
> > to sound.
>
> Except every time you pause to breathe.

My bad. I forgot to mention that all the common
modern speech apps have a "manual" setting which
overrides Automatic Volume Control (AVC).

Even the longest pauses do not pick up background
noise. Manual works great on Dragon & MacSpeech,
does not work so well on Vista Speech, which can't
bypass AVC for some reason.

Even breathing on the mic' confuses Vista Speech.
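
Here is a toy Python model of why the manual setting helps during pauses. It is a deliberately crude sketch: the automatic gain here reacts instantly, whereas real automatic volume control ramps up and down over time. The amplitude figures and function names are made up for illustration, and none of this is taken from how Dragon, MacSpeech, or Vista Speech actually implement it.

```python
def automatic_gain(frames, target=1.0, max_gain=50.0):
    """Idealized automatic volume control: scale each frame so that its
    total level reaches the target, limited to max_gain."""
    out = []
    for voice, noise in frames:
        gain = min(max_gain, target / (voice + noise))
        out.append((voice * gain, noise * gain))
    return out


def manual_gain(frames, gain):
    """Manual setting: one fixed gain for every frame."""
    return [(voice * gain, noise * gain) for voice, noise in frames]


# Each frame is (voice amplitude, background noise amplitude).
# The voice drops to zero during a pause; the background noise does not.
frames = [(1.0, 0.05), (1.0, 0.05), (0.0, 0.05), (0.0, 0.05), (1.0, 0.05)]

auto = automatic_gain(frames)
fixed = manual_gain(frames, gain=1.0 / 1.05)  # gain set once, while speaking

for (voice, _), (_, auto_noise), (_, fixed_noise) in zip(frames, auto, fixed):
    state = "speaking" if voice > 0 else "pause"
    print(f"{state:8s} noise out: automatic={auto_noise:.3f} manual={fixed_noise:.3f}")
```

In this model the automatic setting pumps the background noise up to full speech level during each pause, while the fixed gain leaves it at the same low level throughout.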

Mark-

--
Speech Recognition Trivia...

5 Ways Speech Recognition Can Make You a Better Writer

<http://lateralaction.com/articles/speech-recognition-writing/>

More on general benefits of speech recognition
for creative writing productivity.

<http://lateralaction.com/articles/dragon-naturallyspeaking-speech-recognition/>

Wes Groleau

Nov 19, 2009, 12:21:30 AM

Mark Conrad wrote:
> My bad. I forgot to mention that all the common
> modern speech apps have a "manual" setting which
> overrides Automatic Volume Control (AVC).

I suppose you apply the common argumentative technique
of defining "modern" as "the set of speech apps which
have a manual setting."

--
Wes Groleau

Homework Again
http://Ideas.Lang-Learn.us/russell?itemid=1577

Mark Conrad

Nov 19, 2009, 1:06:16 PM

In article <he2kkp$e2t$3...@news.eternal-september.org>, Wes Groleau
<Grolea...@FreeShell.org> wrote:

> > My bad. I forgot to mention that all the common
> > modern speech apps have a "manual" setting which
> > overrides Automatic Volume Control (AVC).
>
> I suppose you apply the common argumentative technique
> of defining "modern" as "the set of speech apps which
> have a manual setting."

To be specific, I mean these 3 apps:

1) Dragon NaturallySpeaking
2) MacSpeech Dictate
3) WSR (Windows Speech Recognition)

I am not aware of any other modern speech apps which
are offered to the personal computer market.

Mark-

Wes Groleau

Nov 19, 2009, 8:25:15 PM

Mark Conrad wrote:
> To be specific, I mean these 3 apps:
>
> 1) Dragon NaturallySpeaking
> 2) MacSpeech Dictate
> 3) WSR (Windows Speech Recognition)
>
> I am not aware of any other modern speech apps which
> are offered to the personal computer market.

There are others that are VERY expensive.
And there's ViaVoice.

--
Wes Groleau

Posted Last Episodes of “Muzzy II”
http://Ideas.Lang-Learn.us/russell?itemid=1424

Mark Conrad

Nov 20, 2009, 6:23:03 AM

In article <he4r5q$74m$4...@news.eternal-september.org>, Wes Groleau
<Grolea...@FreeShell.org> wrote:

> And there's ViaVoice.

According to AppleLinks, ViaVoice has not been updated
since 2003 by IBM, its creator:

<http://www.applelinks.com/mooresviews/via.shtml>

Unfortunately the only somewhat reasonably priced
SR app for the Mac is MacSpeech Dictate; the present
update is 1.5.7 (available yesterday).

Prices vary from $200 for the regular version, to $600
for the legal and medical versions.

It is the _only_ modern and actively maintained SR app
which works directly with OS X.

Assuming a person uses the $200 MacSpeech on regular
simple everyday speech, he should expect 98% to 99%
raw accuracy of the uncorrected output text, on the average.

This is at normal speaking speed, which varies from 100wpm
to 200wpm, depending on the user's normal talking rate.

I get about 99% raw accuracy before text correction, myself.
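
To put those percentages in practical terms, here is a quick Python sketch that converts speaking rate and raw accuracy into errors that need correcting per minute of dictation. The function name is just for illustration, and it treats accuracy as a simple per-word figure, which is an assumption on my part.

```python
def errors_per_minute(wpm: float, accuracy_pct: float) -> float:
    """Expected number of wrong words per minute of dictation,
    treating accuracy as the percentage of words recognized correctly."""
    return wpm * (100.0 - accuracy_pct) / 100.0


for wpm in (100, 150, 200):
    for accuracy in (98.0, 99.0):
        rate = errors_per_minute(wpm, accuracy)
        print(f"{wpm} wpm at {accuracy:.0f}%: about {rate:.1f} errors per minute")
```

So even at 99%, a fast talker at 200wpm is still fixing roughly two words every minute, and at 98% roughly four.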

I can jack this up very close to 100% with "proper"
correction techniques, as proved by my recent posts
on complex medical speech, using the "regular" $200
version of MacSpeech, which is not designed to handle
such complex speech.

I will tell you, it was a real PITA to force the $200
MacSpeech to handle complex medical speech.

Any reasonably competent user should be able to
duplicate my results.

At the present time, MacSpeech is very buggy,
lacks many modern _fast_ text correction features,
but is steadily being improved.

(comparing MacSpeech to $200 Dragon NaturallySpeaking)

One short year ago, I did not consider older MacSpeech 1.3
worth buying.

Despite MacSpeech bugs, a determined user can find
"workarounds" so that the product is useful enough to be
of value to the user.

Regarding speech recognition apps -
> There are others that are VERY expensive.

Yeah, like the $1,600 Dragon Medical that I own.

I have to run it on a Vista partition of my MacBook.

Costly apps are still limited by present hardware and
software, despite their high price.

Huge AI apps could in theory be created to "understand"
words in context, but no present hardware is fast enough
yet to run such software at a reasonably fast rate.

BTW, those high priced apps are just the beginning,
a doctor soon learns that he has to buy other software
just to help him fill out EMR and EHR forms by voice.

For a radiologist, these additional software "helper" apps
can easily cost an additional $5,000.

The equivalent additional software site for a
MacSpeech-using doctor is below:

<http://www.macpractice.com/mp/md/>

...and ordinary people wonder why their medical costs
are so high.

Their medical bills would be even higher if their
doctors did _not_ use this costly software.

Mark-

Wes Groleau

Nov 20, 2009, 10:53:22 PM

Mark Conrad wrote:

,,, that MacSpeech is modern while ViaVoice is not.

then he described how bad MacSpeech is.

Thanks, I'll stick with the not-modern. It's apparently better.

--
Wes Groleau

Learning to see the forest instead of the trees.
http://Ideas.Lang-Learn.us/WWW?itemid=75

Davoud

Nov 21, 2009, 12:28:48 AM

Wes Groleau paraphrased Mark Conrad:

> Mark Conrad wrote:
>
> ,,, that MacSpeech is modern while ViaVoice is not.
>
> then he described how bad MacSpeech is.

And then Wes Groleau added:

> Thanks, I'll stick with the not-modern. It's apparently better.

I haven't seen Mr. Conrad's posts directly for years :-) but I often
see them second-hand. It would appear that he has become pathologically
obsessed with MacSpeech.

Davoud

--
I agree with almost everything that you have said and almost everything that
you will say in your entire life.

usenet *at* davidillig dawt cawm

Mark Conrad

Nov 21, 2009, 1:11:57 PM

In article <211120090028486468%st...@sky.net>, Davoud <st...@sky.net>
wrote:

> I haven't seen Mr. Conrad's posts directly for years :-) but I often
> see them second-hand. It would appear that he has become pathologically
> obsessed with MacSpeech.

Odd choice of words.

Actually, I spend more time and money on Dragon NaturallySpeaking,
as do 99% of others in the USA.

MacSpeech is a bit player in speech recognition, they are lucky to
make 2 million a year gross income world wide.

Nuance (Dragon) will spend 500 million a year buying out
competitors like "Philips Speech Products".

Mark-

Mark Conrad

Nov 21, 2009, 1:12:04 PM

In article <he7o7g$8hc$2...@news.eternal-september.org>, Wes Groleau
<Grolea...@FreeShell.org> wrote:

> Mark Conrad wrote:
>
> ,,, that MacSpeech is modern while ViaVoice is not.

That is correct.

> then he described how bad MacSpeech is.

I have not even started to tell you how bad it is,
but then you can observe that for yourself by
looking at what the MacSpeech customers
have to say about it in the MacSpeech forum.

<http://www.macspeech.com/extensions/forums/>

> Thanks, I'll stick with the not-modern. It's apparently better.

No, and I should know, because I still have ViaVoice.

ViaVoice is so bad that IBM quit supporting it in 2002. IBM was
effectively nudged out of the speech recognition arena because
ViaVoice could not compete with the much better product "Dragon".

In the Wes Groleau signature:

> "Learning to see the forest instead of the trees"

Looks like you are not very good at taking your own advice.

MacSpeech is about on a par with the free Vista Speech,
a $MS product with herds of $MS programmers behind it.

I can get essentially the same quality and speed out of either
MacSpeech or Vista Speech.

If you did not know which one I was using, you would not
be able to tell by the "results" of the text produced by me.

By contrast, if I were using ViaVoice, it would be readily
apparent that I was using an obsolete 3rd-rate product.

Do not take my word for this, find out for yourself.

It is easy to check out Vista Speech, as it comes free
with both the Vista and Windows-7 OSs.

Vista Speech, formally called WSR (Windows Speech Recognition),
is every bit as bad as MacSpeech, worse in some ways,
better in other ways.

Vista Speech is much easier for a speech novice to learn,
assuming the novice knows a tiny bit about how to run
the Vista OS.

I have bare minimum Windows OS knowledge, yet I have
no problem at all getting good results from Vista Speech.

Despite all the Bad Press that I give to all 3 modern speech
apps, they are good enough to be used in situations where
they outperform other methods of converting thoughts
to text.

I am surprised that no one faults me for badmouthing
all 3 modern speech apps.

Mark-