Questions about Dragon NaturallySpeaking


Gregory Tippett

May 14, 2019, 6:44:54 PM
to Dragonfly Speech Recognition
Hello, as I'm just starting off with voice recognition (and Dragonfly) I've been experimenting with Windows Speech Recognition. It's been frustrating at best, so I'm inclined to go ahead and purchase Dragon (15) which I gather is much more accurate. However, I've got a few questions:

1) Is there any question of compatibility between Dragon 15 and Natlink or Dragonfly? I would guess not, but I would like to be sure before purchase.

2) I've heard that Dragon installs a variety of extra processes which take up memory in an unnecessary way, and that this can lead to fairly regular freezing and crashes. Have other people experienced this, and what sort of workarounds do you use? The "KnowBrainer" website offers a PDF explaining how to deal with these problems with a purchase of Dragon from them. I'm inclined to take them up on this offer, but would first like to get more feedback from Dragon users to understand this issue more fully.

3) WSR does not seem to recognize my IDE (WebStorm): it's possible to move the cursor around a bit, but it won't actually change any text. In other applications (Chrome, Notepad++), it brings up an "Insert box", which is workable for composing text but a really impractical extra step for coding. Finally, for some applications (e.g. LibreOffice), WSR is completely unresponsive. Anyway, I'm wondering how Dragon will interface with text editors or IDEs. Are there any incompatibilities/gotchas/extra things that I should know about?

Thanks so much for any thoughts on these topics!
Gregory

Caspar

May 15, 2019, 11:31:50 PM
to Dragonfly Speech Recognition
Welcome! To answer your questions:

1. No, Dragon 15 works fine with Natlink as long as you use the latest version of Natlink - http://handsfreecoding.org/2018/02/24/dragon-15-now-works-with-natlink-and-dragonfly/

2. Dragon is not the most stable program, but for me it has not affected the stability of the rest of my computer. I would not worry about Dragon's extra processes taking up memory. However, I do recommend deleting (or renaming) the DragonBar executable once you're done learning how to use Dragon, as there's a bug whereby the DragonBar shows up whenever you hit the Windows key to open the Start Menu, stealing focus from it (you don't really need the DragonBar - everything it can do can be done from the tray icon instead). I have no input on whether KnowBrainer's tweaks PDF makes it worth purchasing Dragon from them; however, I have found most of their advice to be geared towards users who are less technical than typical Dragonfly users. I recommend basing any purchase on "can I download the installer again later if I need it" (because on Nuance's website, you only have 3 months to download the installer again unless you pay for some kind of extended download).

3. I have not tried WSR, but when it comes to Dragon and Dragonfly, there are 2 input "mechanisms": the first is Dragonfly's, which uses Windows APIs (PostMessage IIRC) to send keys - this works fine in essentially all IDEs and processes, with the following caveats:

* on non-QWERTY keyboard layouts you might need to do some tweaking (IIRC I got incorrect output with Key actions when I experimented with the Dvorak layout, although Text actions worked fine)
* you also want to be sure you're using dragonfly2 ( https://github.com/dictation-toolbox/dragonfly ) rather than dragonfly, as the latter is abandoned while the former has patches for better Unicode support and such.
* elevated applications (e.g. those running as administrator) cannot have input sent to them by Dragonfly (more info and a workaround for this)

The second is Dragon's input sending. This works fine for elevated applications, but it gets further subdivided into applications which support "Select and Say" and those which do not. For the former, you can use nice commands like "go before <some words>" to move the cursor around etc; mainstream IDEs are basically never supported by this, but some people have hacks for making Notepad++ and similar lite-IDEs work with this (google "dragon notepad++ select say" or similar). For applications which are not supported, the default is to show the "dictation pad" (a little window which captures your spoken input and provides a buffer for editing and a command to copy-paste that into the application), but it is slow so I recommend turning it off entirely. Personally I satisfy that select & say functionality using http://handsfreecoding.org/2018/12/27/enhanced-text-manipulation-using-accessibility-apis/ for short bits of text and http://handsfreecoding.org/2015/08/30/avoid-the-dictation-box/ (except using https://liquidninja.com/metapad/ instead of notepad, so I get multiple undos, always on top, and window transparency) for long bits of text.

Regarding incompatibilities/gotchas specifically, one annoying one that comes to mind is that opening a large file in a select & say enabled application can cause Dragon to become unresponsive. Under the covers, I believe Dragon uses COM and other tricks to mirror the text from the application into its own buffer so that it knows what words are available; for large inputs (e.g. a 1MB file opened in notepad) it runs out of memory and as a result repeatedly writes a whole bunch of debugging data to its log files (also burning up your disk space) until you restart Dragon. So err, just don't do that. (Also fun fact: the log files themselves are capped at 10MB and open in notepad by default, so you can imagine that being annoying to track down.)
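As a rough guard against the large-file problem above, a small script could check a file's size before it gets opened in a select & say enabled editor. This is only a sketch: the function name and the 1,000,000-byte threshold are assumptions based on the 1MB anecdote, not a documented Dragon limit.

```python
import os

# Assumed threshold, taken from the "1MB file in notepad" anecdote above.
# It is NOT a documented Dragon limit; tune it for your own setup.
SELECT_AND_SAY_LIMIT_BYTES = 1_000_000


def is_safe_for_select_and_say(path, limit_bytes=SELECT_AND_SAY_LIMIT_BYTES):
    """Return True if the file at `path` is smaller than `limit_bytes`."""
    return os.path.getsize(path) < limit_bytes
```

A wrapper that launches the editor could call this first and fall back to a different editor for anything over the threshold.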

Anyway, on the whole I highly recommend buying Dragon. Most of its surrounding infrastructure ranges from buggy to average, but its actual speech recognition accuracy and latency are currently unmatched, so if you're getting into this because you need to avoid using your hands and you write code for a living, then IMO it's a no-brainer to buy Dragon for use with Dragonfly (unless macOS is your poison of choice, in which case I'd recommend https://talonvoice.com w/ Dragon for Mac, though it's a little hard to buy the latter nowadays, and when you do you need to make sure you patch it to the latest version - the Talon Slack is very helpful in figuring that stuff out though).

- Caspar

Gregory Tippett

May 16, 2019, 9:25:36 PM
to Dragonfly Speech Recognition
Wow, thank you for such a thoughtful and detailed response! There's so much here for me to study and think about. I'm sure I will have more questions as I study and understand this more. For the moment:

- Am I understanding correctly that the Dragonfly "Windows API" mechanism and Dragon's "input sending" are options which I can select between - i.e., I can choose to use one mechanism or the other?

- How do I identify which IDEs have Select and Say? I've googled a bit for this, but haven't found anything that makes sense to me. Currently I am mainly using JetBrains WebStorm. In this video: https://www.youtube.com/watch?v=P5DCDiCv4TE&t=1252s the presenter uses Vim, and it's clearly editing directly on the screen. I'd be up for using Vim (on Windows of course...) if that is best with Dragon(fly), though I'd really like the full functionality of an IDE. (May I ask - what is your setup?)

Thanks! 

Caspar

May 16, 2019, 10:16:06 PM
to Dragonfly Speech Recognition
Yes, you sort of choose between Dragon and Dragonfly, in the sense that it depends on what commands you say. Voice coding via Dragonfly is made up of saying various commands, ideally one after the other in the same utterance - so one command might be "move cursor to end of line and press enter" (for me, that's "shockoon"), and another might be "insert the text function" (for me, "say function"), so "shockoon say function" is one utterance of 2 commands, and the last one causes Dragonfly to send the word "function" to the current window. These commands are grouped into grammars (by you), and for <reasons> you can only speak commands from the same grammar in one utterance; Dragonfly (or rather the underlying library Natlink) basically hooks into Dragon to define your grammars alongside the normal grammars which Dragon provides.
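To make the "several commands from one grammar in one utterance" idea concrete, here is a toy model in plain Python. It is not Dragonfly's or Natlink's API; the grammar dict, the `<End>`/`<Enter>` key notation and the function names are all made up for illustration, and the matching is a simple greedy longest-phrase-first scan.

```python
# Toy model only: NOT Dragonfly's or Natlink's API.
# A "grammar" here is just a dict mapping a spoken phrase to the
# keys/text it would emit. Key names like "<End>" are made-up notation.
EDITING_GRAMMAR = {
    "shockoon": ["<End>", "<Enter>"],  # move to end of line, press Enter
    "say function": ["function"],      # insert the literal text "function"
}


def parse_utterance(utterance, grammar):
    """Split an utterance into consecutive commands from ONE grammar.

    Greedy, longest phrase first. Returns the list of matched phrases,
    or None if some part of the utterance is not in this grammar.
    """
    words = utterance.split()
    matched = []
    i = 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest candidate first
            phrase = " ".join(words[i:j])
            if phrase in grammar:
                matched.append(phrase)
                i = j
                break
        else:
            return None  # no phrase in this grammar starts at word i
    return matched


def run(utterance, grammar):
    """Return the flat list of keys/text the utterance would send, or None."""
    commands = parse_utterance(utterance, grammar)
    if commands is None:
        return None
    output = []
    for command in commands:
        output.extend(grammar[command])
    return output
```

With this model, `run("shockoon say function", EDITING_GRAMMAR)` yields the End and Enter keys followed by the text "function", while an utterance containing a phrase from outside the grammar is rejected as a whole.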

In addition to having its own command grammar (which lets you minimise windows etc), conceptually Dragon has a "raw dictation grammar", which is the fallback for when no other grammar matches (so if you speak gibberish, Dragon will generally interpret it as "enter the text which most closely corresponds to that gibberish"); if you want, you can put Dragon in/out of command mode by saying "start/stop command mode" to avoid that fallback happening. So to get back to answering the question: which mechanism is used for sending input is determined by which words you say, which commands you have defined, and which grammars are active.
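The fallback behaviour can be sketched the same way. Again this is a toy model with hypothetical names, not how Dragon is actually implemented; in particular, treating unmatched speech in command mode as simply dropped is an assumption.

```python
# Toy model only: the names here are hypothetical, not Dragon/Dragonfly APIs.
def dispatch(utterance, active_grammars, command_mode=False):
    """Decide which mechanism handles an utterance.

    active_grammars: dict of grammar name -> set of phrases it accepts.
    Returns (handler_name, payload).
    """
    for name, phrases in active_grammars.items():
        if utterance in phrases:
            return (name, utterance)  # a command grammar matched
    if command_mode:
        # Assumption: with command mode on, unmatched speech is dropped.
        return ("ignored", None)
    # Otherwise fall back to raw dictation: the words are typed as text.
    return ("dictation", utterance)
```

So with a `"dragonfly"` grammar containing "shockoon" active, saying "shockoon" is routed to that grammar, whereas "hello world" falls through to dictation unless command mode is on.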

Regarding which IDEs have Select and Say, a decent rule of thumb is "nothing more complicated than Notepad/WordPad unless either lawyers or doctors use those apps". So MS Word works, for example. You can probably get Nuance to give you a list if you ask them. But that's sort of beside the point - even if an app doesn't have full select and say support in Dragon, Dragon will still be able to send keystrokes (and hence text) to it.

Since you asked, AFAIK JetBrains IDEs are not select and say enabled. I use a mix of VS Code, Vim in WSL (in https://github.com/mintty/wsltty for now, until the shiny new Windows terminal announced at MS Build is stable!) and sometimes JetBrains IDEs in a VM (VirtualBox works ok here, VMware doesn't work well - but VMs make other things, like conditionally enabling grammars based on the focused window, harder, and that's the main reason I stick to VS Code & Vim ATM).

- Caspar

alexander15w

May 17, 2019, 2:07:06 PM
to Dragonfly Speech Recognition
Gregory, with regard to which IDEs to use, I for one use Sublime Text most of the time with a plug-in I wrote to help me jump from line to line in mod fashion.

Take a look through that getting started document link I sent you in an earlier thread; that should give you a reasonable idea of the things you should think about with regard to customizing your IDE/text environment.

It's also worth taking a look at Mark's exposition on Win32Pad, which actually does support Select-and-Say but is not the most full-featured of editors:

Gregory Tippett

May 17, 2019, 5:01:15 PM
to Dragonfly Speech Recognition
Thanks for the info & reminder! Reading up on this now. Seems like a good intro there on Vocola, probably I will start there :)