When it comes to designing intuitive, compelling user interfaces, Apple is hands-down the best. Starting with the Mac but most evident with each new generation of “i” products (iMac, iPod and iPhone), the company has demonstrated time and again what so many other device makers and mobile operators have failed to understand: It’s the UI, stupid! So when Apple features Voice Control in commercials for the newest iPhone 3GS, the mobile industry should sit up and take notice.

For those under a rock over the last month, Voice Control is Apple’s VUI (voice user interface) that allows you to make calls and control the iPod features on the iPhone 3GS by speaking, rather than pressing numbers or navigating via the touchscreen. None of the functions of Voice Control are particularly new, and their implementation on the iPhone has been met with mixed reviews. Still, Apple has an uncanny ability to recognize and deliver features that consumers find compelling — witness the incredible success of the touchscreen.

Apparently, Apple believes — as we at immr do — that speech recognition is the sound of things to come for mobile devices and applications (subscription required). Apple’s attention is a welcome development, and will undoubtedly accelerate the shift that began with the success of Goog411, Vlingo and other speech-enabled mobile apps. Despite the fact that mobile devices are well-suited for speech recognition — they do, after all, have microphones already built in — no OEM or operator to date has delivered a speech solution that is easy to use, much less promoted the feature to users as a key distinction. Apple is changing that, and other device makers and mobile operators that fail to keep up will be left behind in the competition for users who value simpler, more intuitive UIs.

User-friendly interfaces, such as the touchscreen, have fueled adoption and use of mobile apps. So why is speech likely to be the next big innovation in mobile user interfaces? Several factors are driving developments:

  • While UIs are much improved, mobile devices and apps still demand considerable user attention — for example, viewing displays, entering text, navigating through the UI, etc. Speech-enabled solutions free users from hands- and eyes-on distractions.
  • Platforms such as Spinvox are opening up APIs, making it easier for developers to incorporate speech into their applications.
  • Companies such as Vlingo and Google have taken advantage of sophisticated technology and an enormous user experience base to dramatically refine speech-recognition results.
  • Synthesized speech, which once sounded “computer generated,” can now be produced in a natural-sounding way; book publishers recently sued to prevent Amazon from including it on the latest-generation Kindle.

While the marriage of speech technologies and mobile is under way and irreversible, the transition won’t be a smooth one. First, many users undoubtedly remember past speech applications that didn’t work very well. That perception will need to be overcome; implementing speech with simple applications, as Apple has done with Voice Control, is a good way to start. Second, some applications are more compatible with speech than others. Selecting and listening to music, for instance, is a natural application: the number of songs and artists is limited, which improves the accuracy of speech recognition, and users typically listen to music in a closed environment or with a headset (ideally one with a built-in microphone), which reduces ambient noise and makes it easier for voice commands to be understood.

Much as RIM has carved out a loyal following by developing solutions optimized for email, there is a significant opportunity for operators and OEMs to incorporate speech into mobile devices and applications in a comprehensive way. Apple is leading the way, and others will likely follow suit.

Phil Hendrix is the founder and director of the Institute for Mobile Markets Research and a member of the GigaOM Analyst Network. His complete discussion of the impact speech technologies will have in mobile is available in the latest GigaOM Pro report, “How Speech Technologies can Transform Mobile Use” (subscription required).

  1. I failed to see the point of this article. There is no description or comparison of Apple’s Voice Control feature. And what’s up with the “gizmodo” style talking bubble?

  2. You should never have to touch your phone if you are in the car or riding a bike or whatever, and not lose function.

    The Apple voice control drops the ball in a lot of little ways though. It fails (though many of the non-integrated apps get it right) to use the phonetic pronunciation fields in contacts, making many MANY names (particularly even common forum names like Natala) a complete crap shoot. And that renders the device essentially uncontrollable; heaven help you if you have an accent.

    Voice Control also completely fails to integrate with Bluetooth, not surprising since somebody seems to simply hate BT, waiting years for even basic functions to get on board and then missing key controls (skip, volume etc). I hear some of that might get fixed in 3.1 but it’s always a crap shoot.

    The BT stack is still way too limited (printers? Hello), and the controls lack any customization (say maybe you don’t want to have to hold your button for 5 seconds to turn it on?). Wrap a huge walled garden around that and it might take a long time until the functions in the iPhone catch up to, much less surpass, other tech available now for things like WinMo.

    1. I’m trying to fully absorb that lovely picture wherein iPhone is trying to catch up to WinMo.

  3. What is it with the Apple lovefest on this site? Almost every article mentions how they revolutionized something. Vlingo is the innovator in this space, not Apple. Play Artist Michael Bolton ring a bell? Where is the article about how Ford marketed that feature and is a pioneer?

    Have you seen that new copy and paste feature for the iPhone? Everyone is kicking themselves for not marketing that feature to users first.

  4. I’ll admit it, I’m a fanboy, but Apple speech recognition is little more than a parlor trick at this point. The rumored 3.1 OS release supposedly improves the voice recognition, but unless the improvements are dramatic I don’t see many people making serious use of the feature. Accuracy needs to be close to 100% and it’s not even close in this release.

    1. As the article points out, “none of the functions of Voice Control are particularly new, and their implementation on the iPhone has been met with mixed reviews.” At present and most likely for the foreseeable future, accuracy of device-resident speech recognition will lag that of server-based solutions, such as Vlingo’s and Google’s. Despite these valid criticisms of Voice Control, Apple is calling attention to and promoting speech recognition, which will undoubtedly up the ante for the entire industry.

      1. I guess my point was, Apple drives the industry when it is demonstrably better than competing alternatives. Their voice feature, while more ambitious, is significantly *less* useful than the voice dialing that came on the last several free phones I got before switching to the iPhone. Your point may be valid, but I’m tired of Apple fans being portrayed as mindlessly worshiping whatever Apple ships. One wag wrote that if Apple shipped an iTurd, Apple fans would line up and pay $500 for it. I disagree. I think Apple fans are quite happy to heap scorn on Apple for any perceived flaw in a product. If anything, expectations are higher. In general, Apple doesn’t do check-off item features. It’s done right or it’s not in the product. In this case, I think they made an exception.

  5. This article is missing One Voice Technologies! Symbol OVOE

    Google spent quite a bit of time trying to patent what they already patented.

    http://blog.onev.com/blogsite/voice_search/

    MTNL is in the process of launching their email by voice to over 7 million.

    http://mobilevoice.mtnldelhi.in/

    Telmex has been playing with IRIS, a One Voice Mobile Voice product for over 2 years…

    http://telnor.com/iris

    It’s time we all woke up and learned to get along and put all this tech to good use.

    It’s my opinion that this article is missing the key to what it’s supposed to be about. I hope One Voice Technologies gets the press it deserves in this piece.

    1. One Voice gets kudos for being a pioneer, but, rumor has it that the speech recognition on the iPhone is licensed from Nuance Communications, and Nuance has over 1,000 patents, if it’s a patent contest.

  6. I think the next best app for voice recognition is One Voice Technologies. The web site is onev.com. See for yourself.

  7. http://en.wikipedia.org/wiki/Microsoft_Voice_Command

    November 2003 saw the release of voice control in the Windows Mobile platform. I’m sure a little research would show that Palm or some other OS had it even earlier.

    Just keep saying Apple is an innovator; repeat any lie long enough and people will think it’s true.

    Copy/paste, video, MMS, A2DP, touchscreen, apps….the list goes on and on.

  8. I don’t understand all this about who was first stuff… I remember paying $40 for Microsoft Voice Command 1.0 in December 2003 for my PocketPC, tho it took them maybe a year to port it to smartphones. My phone had copy paste then too.

  9. Rich Rosen, FastCall411 Tuesday, July 7, 2009

    “Is iPhone’s Voice Control the Sound of Things to Come?”

    Yes, and not just mobile devices, voice control (voice search, voice portals) will be prolific. There is a convergence of technology, cost, and usage. Microphones and speakers are cheap, bandwidth is easy and voice applications are up to the task. There will be voice control at retail (“tell me the specs of the new Sony flat screen”); voice portals built into vending machines; at tourist attractions – everywhere that a kiosk with a touch screen has been cost prohibitive or otherwise inaccessible.

  10. I’ve just seen the very start of the voice side of things with voice recorder. Have to say, I have failed miserably in the two attempts to use it so far, so not sure what the future holds!
