One of the benefits of working for the GigaOM Network is exposure to a lot of great resources. The recently launched GigaOM Pro is a major source of information that touches on much of what we cover. GigaOM Pro has published a technical report (subscription required) that delves into the impact speech technology can have on mobile tasks, particularly those performed on a phone. I have long been an advocate of speech recognition, and I believe it can play a part in creating a natural interface for working with computers. Speaking your mind takes on a special meaning when it’s done in order to interact with a computer. That interaction may benefit the mobile phone even more than it does speech-enabled computers.
The mobile phone is a personal device that has become ingrained in almost everything we do. It is designed from the ground up to work with speech, so it makes great sense that a proper interface built around speech technology could be a huge benefit.
The report touches on current real-world applications that use speech to their advantage, such as GOOG411. Speaking your queries whenever possible makes sense for the ultimate ease of use. These applications are possible thanks to more capable hardware and the use of the cloud. Phones now have good processing power, and having the heavy lifting done by remote servers (the cloud) maximizes what can be accomplished with speech. Speech recognition takes a lot of processing power, and the report points out that local and remote resources are now sufficient to do a great deal.
I have used speech recognition for years, and the thought of a completely speech-enabled phone excites me. I would love to approach the “Star Trek” era by speaking commands to my phone and having it react appropriately. We do that on a restricted level currently, as in the case of voice dialing. That is speech recognition in its most basic form, and expanding that capability would only be better. As the report indicates, the next 12 to 24 months will see this ability spread much further toward total speech operation of our phones. Heck, my Bluetooth headset has speech recognition of its own that responds to basic commands.
Phones operated by speech will have to overcome a couple of barriers to adoption, in my opinion. My experience with speech recognition, and talking to others about it, has me convinced that many are embarrassed at the thought of operating a phone by voice in public. I’ve had many people tell me that’s why they don’t use headsets, either: they don’t like to be seen by others talking to their phone. Public places are also often too noisy to allow accurate speech recognition, and this will have to be solved as well. The speech interface must work everywhere, no matter what, to gain widespread adoption. These barriers are just a result of human nature, and speech is a very human phenomenon, so they will have to be addressed.
The GigaOM Pro report is a good, comprehensive look at bringing a speech interface to the mobile phone. I recommend you take a long look at this report if you are a GigaOM Pro subscriber, and if not, maybe you should think hard about becoming one.