Speech Recognition You Can Count On, for Under $250
Speech recognition has been positioned as the holy grail of computing for decades, but many people have found the off-the-shelf software too prone to inaccuracies. That has changed with some of the newer products, though, and I now regularly use a digital voice recorder and Dragon Naturally Speaking software to produce very accurate transcriptions of interviews, television and radio segments, videocast segments, and podcasts. If you spend a fair amount of time writing, even just writing e-mails, it’s worth looking into how accurate speech recognition is now.
I’ve always found the Dragon Naturally Speaking software products to produce the most accurate transcriptions, although ViaVoice is good too, and comes in a nice version for Mac OS X. The good news is that you can now buy very good digital voice recorders that come bundled with the Dragon software, making transcriptions of, say, interviews as easy as hitting Record and then sending your audio directly to text in a word processor. In this post, I’ll discuss some inexpensive, good ways to do this.
My favorite of the digital voice recorder and Dragon bundles is the Sony CD-MX20VTP Handheld Digital Voice Recorder, which has a five-star rating at Shopping.com. It has 32MB of flash memory, and you can add Memory Sticks to expand capacity as far as you need. You can store your recorded files in over 300 folders, and Dragon Naturally Speaking software is bundled. I found the bundle on Shopping.com for under $250. You can get well above 90 percent accuracy with this recorder and the software, and well above 95 percent accuracy if the software is recognizing just your own voice.

For a cheaper solution, you can get a Naturally Speaking Mobile 5.0 Upgrade version of the Dragon software that comes bundled with a digital voice recorder. The recorder is not as good as the Sony one, but you can get this bundle for around $200, and you can try it during a free-trial period if you like.

I have a few tips that can help you boost the accuracy of your transcriptions. First, it’s hugely important to train the software to recognize your own speech. Follow the instructions to do so, and spend at least an hour on the training. When dictating, speak at a natural pace, not too quickly. Also, a good computer with a fast processor and a healthy amount of memory will boost accuracy.
I have yet to find the speech recognition solution that makes no mistakes, but I have found these solutions to work well enough that I can watch as a transcription progresses in my word processor, take notes as I notice mistakes, and then quickly go back at the end to make corrections. This is vastly faster and less annoying than typing transcriptions.
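If you keep both the raw transcription and your corrected version, you can put a rough number on accuracy figures like the 90 and 95 percent mentioned above. The post does not say how those figures were measured, so the sketch below is only one common approach: count word-level substitutions, insertions, and deletions (edit distance) against the corrected text. The function names and sample sentences are made up for illustration and are not part of any Dragon tool.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Count word-level edits (substitute, insert, delete) needed to turn
    the hypothesis (raw transcription) into the reference (corrected text)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from transcription
                            curr[j - 1] + 1,      # extra word in transcription
                            prev[j - 1] + cost))  # match or wrong word
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (errors / reference length). Can drop below zero if the
    transcription contains many extra words."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference text is empty")
    return 1.0 - word_errors(reference, hypothesis) / ref_len


# Hypothetical example: one wrong word out of eight.
corrected = "speak at a natural pace not too quickly"
raw = "speak at a natural pays not too quickly"
print(f"{word_accuracy(corrected, raw):.1%}")  # prints 87.5%
```

Punctuation and capitalization differences are handled only crudely here (text is lowercased and split on whitespace), so treat the result as a ballpark figure rather than a formal benchmark.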
Do you have any good tips on speech recognition?
My big question is whether or not the audio goes from the recorder to text in real time. Can you speed it up?
No, you can’t actually speed up the voice-to-text transcription, and you wouldn’t want to because you would introduce more inaccuracies. But yes, you can send the audio directly to the computer for transcription.
Sam
One day not too far away, speech recognition conversion of audio files will be faster than real time because we are talking about digital data here.
Dr. Rob A. Rutenbar, professor of electrical and computer engineering at Carnegie Mellon University, believes that within 12 months he will have a system that can recognise 5,000 to 10,000 words at five to 10 times faster than real time… cool!
We have an interesting and related product as well – the Acappella Conference Audio Recorder at http://www.acappella.com.au
I really like Dragon Naturally Speaking, but be aware that version 9 requires a web connection to validate and use.
My connection to speech recognition started with Milton-Bradley, the old-school game maker (Chutes and Ladders, Battleship). In the late 70s and early 80s they got into electronic toys (Simon and some others). They were very interested for a short time in speech recognition (single words, not continuous) to make toys that you could give verbal commands.
I next went to Verbex, which had just been purchased by Exxon, where I met the Bakers. I don’t think Verbex lasted long, and I didn’t last long there either. I believe the Bakers (I can’t remember their first names), a very ‘geeky’ couple who were very serious about voice recognition, left Verbex to start Dragon Systems. I haven’t worked in that field since then, but I noticed your post, and it brought back that little snippet of my personal work history.
Speech recognition is one of those fields that was always just around the corner from being great (bubble memory!). It sounds like it’s working pretty well now.
Thanks for the article. I read a lot of books (part of a new profession) and I needed a way to rapidly convert my spoken reading notes to text. I bought an Olympus WS-300M Digital Recorder (which I love) and a copy of Dragon. I worked hard to use it, but ultimately gave up on it [1]. Now I use the best voice recognition system ever built [2]. I estimate I could get about 15-20 files done for $250. It would have been cheaper in the long run to do this via machine, but my conclusion is that, at this time, humans do it better and easier (for me).
My 2c!
[1] Matt’s Idea Blog: Notes on using a digital voice recorder for taking reading notes
[2] Matt’s Idea Blog: The 4-hour workweek applied: How I spent $100, saved hours, and boosted my reading workflow
Can more than one person use the same voice recognition software, and is it compatible with programs like ShortHand?