Dragons and Dictation Software: How the Failure Continues

In late November, Gear Diary ran a short video sneak-peek of Dragon Dictation for the iPhone, an app that, like its big brother Mac and PC counterparts, converts spoken words into written text. The teaser video begins with the words “An app that will transform your iPhone usage.”

It’s now available in the iTunes Store, and for a limited time, it’s free. But don’t rush to download it just yet. This is not a review of the Dragon Dictation app; it is a cautionary tale about being skeptical of the hype. Unless you already have a very specific need for speech-to-text technology, this app fails where all dictation software has failed before.

A Quick Recap

In the early ’90s, people started taking dictation software seriously, and developers dreamed of a not-too-distant Star Trek-inspired world in which our primary method of interaction with our computers would be the spoken word. Not just stilted single-word utterances either, but fluid, organic sentences. Natural speech, they like to call it.

It all sounds fantastic. But the hardware was a long time coming. Processors were underpowered. Microphones were too low-fidelity for dictation software to do its job reliably.

Image courtesy of SummerRain812 on Flickr

Sadly, higher-quality microphones tend to be prohibitively expensive or must be strapped to the head during use. (Not exactly user-friendly.) In any case, even when speech recognition and dictation work well, they make for a control mechanism most of us find horribly uncomfortable.

What do I mean by that? If you’ve never tried dictating an email, letter, article or essay, go do it right now. I guarantee you’ll be returning to the keyboard in next to no time.

Dictation tools still require you to explicitly dictate punctuation (an awkward skill to master). The fact is, until computers really are as smart as those in Star Trek, the biggest problem with dictation software is not with the software at all, but with you, the user. You see, you need to be carefully re-trained not only in how you go about the task of ‘writing,’ but also in how you control your computer. It’s deliciously ironic that, after a while spent training yourself to speak the right words, the right way, at the right speed and with the right tone of voice, you sound more like a robot than your computer ever could.

Challenges

No one writes an essay or lengthy document knowing in advance exactly the words they will use and in precisely what order. If you’re anything like me, you write a few lines here, an edit there, a quick jump back to the beginning to add something you forgot… Writing is a creative process that requires a lot of flexibility.

Just try moving your caret around a page using only your voice, and you soon realise that in the time it took to navigate successfully to that one particular spot on the page, you could have reached for the mouse, clicked, made your edit, completed your sentence and wandered off to watch last night’s episode of Stargate Universe.

Dragon Dictation is nothing special unless you already have a very specific need for such software, as I said at the start of this ranticle (a portmanteau of ‘rant’ and ‘article’ that I suspect would take an eternity to type using dictation software).

The problems with dictation technologies can’t be blamed on Dragon Dictation; rather, they belong to those human interface challenges that are the product of our bias towards using our hands for most activities. If your hands and arms work sufficiently well, you’ll just prefer a mouse or keyboard.

For what it’s worth (even though I did promise this was not a review), Dragon Dictation has some noteworthy limitations. Dictation has to occur in short bursts of 20-30 seconds, which will swiftly become a nuisance if you happen to speak s-l-o-w-l-y. There’s no real-time visual feedback, so you can’t tell whether the speech-to-text conversion was successful until after you’ve finished dictating. Most importantly, the Dragon Dictation app doesn’t itself perform the speech-to-text conversion; those short 20-odd seconds of speech are recorded by the iPhone and transferred via the web to a server, which does all the real work and sends the text results back to your iPhone. So not only is there an unavoidable processing delay, but you also have to be online to use it in the first place.

So, is this really the app that’s going to transform how you use your iPhone? If you constantly use Voice Control then perhaps you’ll love it. But for everyone else, this is likely one of those apps left to gather virtual dust, another victim of the harsh reality of current voice-interaction technologies.

I look forward to a Star Trek future in which we all talk to our computers and receive intelligent, useful responses. But don’t forget that the crew of the Starship Enterprise does the bulk of its computer work by hand. (And it’s all dignified tapping and swiping, mind you, not comically impractical Minority Report arm-waving!)
