In the prior two articles in this series, I went over the basics of getting started with voice programming and talked a little about its history and community. In this article, I’m going to go over best practices.

Let me preface with this. Your personal command set and phonetic design are going to depend on a variety of factors: accent, programming environment and languages, disability (if any), usage style (assistance versus total replacement), etc. The following is a list of guidelines based mostly on my experiences. Your mileage may vary.

Use Command Chains

If I could only impart one of these to you, it would be to use continuous command recognition, also known as command sequences. Get Dragonfly or Vocola and learn how to set it up. Speaking chains of commands is much faster and smoother than speaking individual commands with pauses in between. If you’re not convinced yet, watch Tavis Rudd do it.
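As a rough sketch of what a command chain can look like in Dragonfly, the usual pattern wraps a set of single commands in a Repetition element so that several of them can be recognized in one utterance. The names `run_chain`, `make_chain_grammar`, `SingleCommand` and `ChainRule` are made up for illustration, and the commands are plain Python callables here (an assumption made so the chain logic can be tried without Dragon); real setups usually map specs to Dragonfly actions such as Key or Text instead.

```python
def run_chain(callables):
    """Run each recognized command in the order it was spoken."""
    return [command() for command in callables]


def make_chain_grammar(commands, max_length=16):
    """Build (but do not load) a grammar that hears chains of commands.

    `commands` maps a spoken spec to a zero-argument callable. Dragonfly
    is only imported when this function is called.
    """
    from dragonfly import CompoundRule, Grammar, MappingRule, Repetition, RuleRef

    class SingleCommand(MappingRule):
        exported = False  # only reachable through the chain rule below
        mapping = dict(commands)

    class ChainRule(CompoundRule):
        spec = "<sequence>"
        extras = [
            Repetition(RuleRef(rule=SingleCommand()),
                       min=1, max=max_length, name="sequence"),
        ]

        def _process_recognition(self, node, extras):
            # extras["sequence"] holds one value per spoken command
            run_chain(extras["sequence"])

    grammar = Grammar("command chain sketch")
    grammar.add_rule(ChainRule())
    return grammar
```

With Dragon and Dragonfly running, something like `make_chain_grammar({"alpha": some_function, "bravo": other_function}).load()` would let you say “alpha bravo alpha” in one breath. The specs and functions in that call are placeholders.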

Phonetic Distinctness Trumps All

When selecting words as spoken triggers (specs) for actions, keep in mind that Dragon must understand you, and unless you’re a professional news anchor, your pronunciation is probably less than perfect.

  • James Stout points out the use of prefix and suffix words on his blog, Hands-Free Coding. Though they do add syllables to the spec, they make the spec more phonetically distinct. An example of a prefix word might be adding “fun” to the beginning of the name of a function you commonly use. Doing so also gets you in the habit of saying “fun” when a function is coming up, which, believe it or not, often buys enough time to think of the rest of the function’s name, allowing for an easy mental slide.

  • Use what you can pronounce. Don’t be afraid to steal words or phonemes from books or even other spoken languages. I personally think Korean is very easy on the tongue with its total lack of adjacent unvoiced consonants. Maybe you like German, or French.
  • Single-syllable specs are okay, but if they’re not distinct enough, Dragon may mistakenly hear them as parts of other commands (especially in command chains). As a rule of thumb, a low number of syllables is all right; a low number of phonemes isn’t.

The Frequency Bump

When you speak sentences into Dragon, it uses a frequency/proximity algorithm to determine whether you said “ice cream” or “I scream”, etc. However, it works differently for words registered as command specs. Spec words get a major frequency bump and are recognized much more easily than words in normal dictation. Take advantage of this and let Dragon do the heavy lifting. Let me give you an example of what I mean.

Dragonfly’s Dictation element and Vocola’s <_anything> allow you to create commands which take a chunk of spoken text as a parameter. For example, a Dragonfly command can print “hello N”, where N is whatever you say after the word “for”.
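A minimal sketch of such a command in Dragonfly might look like the following. The spec “hello for <text>” and the names `format_greeting`, `say_hello` and `make_hello_rule` are assumptions chosen for illustration. The sketch prints to the console; a Text action such as `Text("hello %(text)s")` would type into the active window instead.

```python
def format_greeting(text):
    """Build the output from the chunk of speech captured after "for"."""
    return "hello " + str(text)


def say_hello(text):
    print(format_greeting(text))


def make_hello_rule():
    """Build the rule. Dragonfly is only imported when this is called."""
    from dragonfly import Dictation, Function, MappingRule

    class HelloRule(MappingRule):
        # Everything spoken after "hello for" is captured as <text>.
        mapping = {"hello for <text>": Function(say_hello)}
        extras = [Dictation("text")]

    return HelloRule()
```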

I’m going to refer to these sorts of commands as free-form commands. Given a choice between setting up a Function action with free-form dictation via the Dictation element, or with a set of choices via the Choice element, the Choice element is the far superior, um… choice.

Take a command whose spec is “do some action <parameter>”. If you set up <parameter> as a Dictation element, Dragon can potentially mishear either “foo” or “bar”. If you set up <parameter> as a Choice element instead, all of the options in the Choice element (in this case, “foo” and “bar”) get registered as command words just like the phrase “do some action” does, and are therefore far more likely to be heard correctly by Dragon.
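The two setups compared above could be sketched side by side like this. The handler `do_some_action` and what it returns are hypothetical; only the difference in `extras` matters.

```python
def do_some_action(parameter):
    """Hypothetical handler: just report what it was given."""
    return "action on " + str(parameter)


def make_rules():
    """Return (free_form_rule, choice_rule). Needs Dragonfly installed."""
    from dragonfly import Choice, Dictation, Function, MappingRule

    class FreeFormRule(MappingRule):
        # <parameter> is open dictation: anything can be heard here.
        mapping = {"do some action <parameter>": Function(do_some_action)}
        extras = [Dictation("parameter")]

    class ChoiceRule(MappingRule):
        # <parameter> is limited to the spoken forms "foo" and "bar",
        # which are registered as command words.
        mapping = {"do some action <parameter>": Function(do_some_action)}
        extras = [Choice("parameter", {"foo": "foo", "bar": "bar"})]

    return FreeFormRule(), ChoiceRule()
```

Both rules share one spec, so you would load one or the other, not both. The keys of the Choice dictionary are what you say, and the values are what gets passed to the function, so they don’t have to match.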

Anchor Words and the Free-Form Problem

Let’s say we have a free-form command, like the one mentioned above, and another command with the spec “splitter”. In this hypothetical situation, let’s also say they are both part of the same command chain grammar.

Usually, I would use the “splitter” command to print out the split function, but this time I want to create a variable called “splitter”. If I say “variable splitter”, nothing will happen. This is because, when Dragon parses the command chain, it first recognizes “variable”; then, before it can get any text to feed to the “variable” command, the next command (“splitter”) closes off the dictation. This has the effect of crashing the entire command chain.

There are a few ways around this. The first is to simply give up on using free-form commands or specs with common words in command chains. Not a great solution. The second way is to use anchor words.

In a modified version of the command whose spec ends with the word “elephant”, “elephant” is being used as an anchor word, a word that tells Dragon “free-form dictation is finished at this point”. So here, I can say, “variable splitter elephant” to produce the text “var splitter”.
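An anchored version of the command might be sketched as follows. The spec layout “variable <text> elephant” is an assumption based on the description above, and `variable_text` and `make_anchored_rule` are made-up names.

```python
def variable_text(text):
    """Build the declaration from the dictated name."""
    return "var " + str(text)


def make_anchored_rule(anchor="elephant"):
    """Build a free-form rule that ends with an anchor word.

    Dragonfly is only imported when this function is called.
    """
    from dragonfly import Dictation, Function, MappingRule

    def type_variable(text):
        print(variable_text(text))

    class AnchoredVariableRule(MappingRule):
        # The trailing anchor marks where the dictation stops.
        mapping = {"variable <text> " + anchor: Function(type_variable)}
        extras = [Dictation("text")]

    return AnchoredVariableRule()
```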

Despite the effectiveness of the second workaround, I find myself getting annoyed at having to say some phonetically distinct anchor word all the time, and often use another method: pluralizing the free-form Dictation element, then speaking a command for the backspace character immediately after. For example, to produce the text “var splitter”, I could also say, “variable splitters clear”. (“Clear” is backspace in Caster.)
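To make the three cases concrete, here is a toy model of the behavior described in this section. It is not how Dragon actually parses speech, only a simplified stand-in: a free-form command grabs words until the next command word (or until its anchor, if it has one), and the chain is rejected if it captures nothing. The output of “splitter” (`.split()`) is a guess at what such a command might print.

```python
BACKSPACE = object()  # sentinel for a "delete one character" command

# Fixed specs in the same chain grammar as the free-form "variable".
FIXED = {"splitter": ".split()", "clear": BACKSPACE}


def run_chain(utterance, anchor=None):
    """Return the text the chain would type, or None if it is rejected."""
    words = utterance.split()
    out = ""
    i = 0
    while i < len(words):
        word = words[i]
        i += 1
        if word == "variable":
            captured = []
            while i < len(words):
                if anchor is None and words[i] in FIXED:
                    break  # the next command closes off the dictation
                if words[i] == anchor:
                    break
                captured.append(words[i])
                i += 1
            if anchor is not None:
                if i == len(words):
                    return None  # the anchor was never spoken
                i += 1  # consume the anchor word
            if not captured:
                return None  # nothing to feed to "variable"
            out += "var " + " ".join(captured)
        elif word in FIXED:
            action = FIXED[word]
            out = out[:-1] if action is BACKSPACE else out + action
        else:
            return None  # not a command word
    return out
```

In this model, `run_chain("variable splitter")` is rejected, while `run_chain("variable splitter elephant", anchor="elephant")` and `run_chain("variable splitters clear")` both come out as `"var splitter"`.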

I am working on a better solution to this problem and will update this article when I finish it.

Reusable Parts

On the Yahoo VoiceCoder group site, Mark Lillibridge proposes two categories for voice programmers, what he calls Type I and Type II. Type I programmers optimize strongly for a very specific programming environment. Type II programmers create more generic commands intended to work in a wide variety of environments. Along with Ben Meyer, I fall into the latter category. My job has me switching between editors and languages constantly, so I try to use lots of generic commands that aren’t tied to any one editor or language.

I also try to standardize spoken syntax between programming languages. It does take extra mental effort to program by voice, so the fewer specialized commands you have to learn across the different environments you work in, the better.
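One way to picture this is a single set of spoken specs backed by a per-language lookup table. This is a plain-Python illustration, not code from any particular toolkit. The specs, the snippets and the names `SNIPPETS` and `expand` are all invented for the example.

```python
# One spoken vocabulary, many languages: the spec stays the same and
# only the text it expands to changes.
SNIPPETS = {
    "python": {
        "if": "if :",
        "comment": "# ",
        "print": "print()",
    },
    "javascript": {
        "if": "if () {}",
        "comment": "// ",
        "print": "console.log()",
    },
}


def expand(spec, language):
    """Return the text a spoken spec should produce in a given language."""
    try:
        return SNIPPETS[language][spec]
    except KeyError:
        raise ValueError("no snippet for %r in %r" % (spec, language))
```

Saying “comment” then means the same thing everywhere, and adding a language is a matter of adding one more table.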

What About You?

That’s all I’ve got. Have any best practices or techniques of your own? Leave them in the comments; I’d love to hear them!