windows speech recognition fails if there are too many elements


Tyler Elliott

Apr 25, 2019, 4:09:21 PM
to Dragonfly Speech Recognition
I've noticed two problems that seem to stop WSR from working properly. In both cases the WSR UI acts as if it's listening (blue mic, audio indicator moving, the word "Listening"), but no commands are ever processed. There's no orange "What was that?" or anything.

[Attached image: wsr.png]


This seems to happen when:
1) The title of a window is too long, e.g. a long full file path in the title of a window or the long URL of a website that doesn't set its title.
2) Too many elements are reported as clickable, e.g. a web page with 1000 links or some Slack channels if there are too many messages on the screen.

In these cases, no commands work, dragonfly or WSR. I can't say, "show numbers", "close that", or any of my custom dragonfly commands.

One solution is to use the sapi5inproc engine, but then I miss out on all the useful built-in WSR features like dictation, correction, pressing arbitrary buttons, opening arbitrary programs, and clicking arbitrary buttons. And if I have WSR active at the same time as Dragonfly using the sapi5inproc engine, both try to execute commands and I get double actions for overloaded phrases (this is most unworkable when I'm in a text field, where WSR always tries to add text). My workflows tend to bounce back and forth between the two.

If I use the sapi5shared engine (my preferred setup when it works), then any time a window reports too many elements to WSR, commands completely stop working for both.

Any workarounds or solutions for this?

Tyler Elliott

Apr 25, 2019, 4:59:01 PM
to Dragonfly Speech Recognition
The one exception that I've just noticed: if an input field has focus, the dictation scratchpad continues to work (or text gets inserted into the text field if the scratchpad is turned off). But none of the scratchpad commands work ("insert", "cancel", "correct that", etc).

Dane

Apr 27, 2019, 10:49:03 AM
to Dragonfly Speech Recognition
Hello Tyler,

I haven't come across this particular problem with WSR before. I have replicated the behaviour on my system with long window titles. I think you could get around those by truncating the current window's title with "win32gui.SetWindowText(handle, text)". It doesn't sound like you can work around the too many elements problem.
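As a rough sketch of the title-truncation idea, something like the following might work. It assumes pywin32 is installed; the helper names and the 64-character limit are made up for illustration, since the length at which WSR stops responding isn't known here.

```python
def truncate_title(title, max_length=64, suffix="..."):
    """Return title shortened to at most max_length characters.

    max_length=64 is an arbitrary assumption, not a known WSR limit.
    """
    if len(title) <= max_length:
        return title
    if max_length <= len(suffix):
        # No room for the suffix, so just cut the title.
        return title[:max_length]
    return title[:max_length - len(suffix)] + suffix


def shorten_foreground_window_title(max_length=64):
    """Truncate the foreground window's title in place (Windows only)."""
    # pywin32 is imported lazily because it is only available on Windows.
    import win32gui

    handle = win32gui.GetForegroundWindow()
    title = win32gui.GetWindowText(handle)
    shortened = truncate_title(title, max_length)
    if shortened != title:
        win32gui.SetWindowText(handle, shortened)
    return shortened
```

Some applications rewrite their own title (a browser does so when the tab changes, for example), so the truncation would probably need to be repeated, e.g. whenever the foreground window changes.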

Swapping between sapi5inproc and sapi5shared without double actions could be done by using a dragonfly Context sub-class for all grammars loaded into sapi5inproc, perhaps by using a Grammar sub-class. The "matches()" method of the Context sub-class should return True if sapi5inproc should recognise speech. I'm pretty sure that programmatically checking or changing WSR's mic state is not possible. You could try checking a property of the WSR window instead. Perhaps sapi5inproc should be used if that window isn't visible?
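A minimal sketch of such a Context sub-class is below. It is untested against WSR itself: the window title "Speech Recognition" is only a guess at what the WSR window is called, and the function and class names are hypothetical. It also assumes the WSR window really is hidden whenever WSR should not be handling commands.

```python
def should_use_inproc(wsr_window_visible):
    """sapi5inproc should recognise speech only while the WSR window is hidden."""
    return not wsr_window_visible


def is_window_visible(window_title):
    """Return True if a visible top-level window has exactly this title."""
    # pywin32 is imported lazily because it is only available on Windows.
    import win32gui

    found = []

    def collect(handle, _):
        if (win32gui.IsWindowVisible(handle)
                and win32gui.GetWindowText(handle) == window_title):
            found.append(handle)

    win32gui.EnumWindows(collect, None)
    return bool(found)


def make_inproc_context(wsr_title="Speech Recognition"):
    """Build a context for grammars loaded into the sapi5inproc engine.

    wsr_title is an assumption; check the real title of the WSR window.
    """
    # dragonfly is imported lazily so the helpers above work without it.
    from dragonfly import Context

    class WsrHiddenContext(Context):
        def matches(self, executable, title, handle):
            # Ignore the foreground window; only WSR's visibility matters.
            return should_use_inproc(is_window_visible(wsr_title))

    return WsrHiddenContext()
```

The returned object would be passed as the "context" argument when constructing each Grammar loaded into sapi5inproc, so those grammars only become active while the WSR window is not visible.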

Dane Finlay