Summary:

So will Apple’s Siri be like FaceTime, widely praised but less widely used? Or will it be like touch screens, which ushered in new ways of interacting with handsets? Folks in the speech recognition and virtual assistant market are hoping it’s the latter.

Will Apple’s Siri be like FaceTime: widely praised but less widely used? Or will it be like the App Store, or even touch screens, which ushered in new ways of interacting with handsets? Folks in the speech recognition and virtual assistant market are hoping it’s the latter. Executives at Vlingo and Nuance both got on the phone today to discuss how they’re reacting to Apple’s moves and what those moves mean for the industry.

The little guy

Vlingo, which introduced a virtual assistant product called Vlingo Virtual Assistant this year, responded to Apple’s integration of its Siri acquisition on Tuesday by making its iOS application free. Prior to this, some tasks, such as checking the weather or asking Vlingo to find nearby restaurants, were free, but others, such as dictating emails or texts, were not. The Siri product does all this and more. Given that 4 million of Vlingo’s 10 million users are on the iOS platform, it stood to lose significantly if it continued to charge for what Apple now natively offers for free.

CEO Dave Grannan told me that the entrance of Apple isn’t a bad thing, however. “I think it’s a significant market maker for virtual assistants,” he said. “When Apple does something, it’s generally a tipping point and signals that something is going mainstream.” That’s good for his business. For example, when Google announced voice features for Android, Vlingo added users, Grannan said. However, as impressed as he was by the demonstrations, he also sounded a note of caution.

He wondered how far Apple went with the natural language interface, and if users might end up being too conversational and getting frustrated if the product doesn’t work. He explained that a virtual assistant has to offer two things: one easy and one really hard. The easier technology is basic speech recognition, and the harder bit is adding context to the words spoken and figuring out what the user wants to do. That requires a semantic engine and artificial intelligence that’s continually getting better but is still not perfect.

“If you don’t guide the users between some guard rails for natural language processing, it can cause problems,” Grannan says. “For those reasons we have tended to shy away from extreme natural language conditioning for our users, but that’s a balance [Apple will] have to strike in guiding their users.”

The big fish

Meanwhile, Nuance, a much bigger speech recognition and virtual assistant provider, is also excited about Apple’s potential to influence the adoption of voice as a broadly used user interface on mobile and other consumer devices. Mike Thompson, SVP & GM, mobile division at Nuance, says the company currently has over 100 million requests for mobile speech transactions on its service, and believes Apple’s move shows how speech is a “mainstream interface for mobile phones and mainstream consumer devices.”

Nuance counts Apple as a customer of its software and has worked with Siri in the past, but won’t disclose the details of the relationship it has today with Apple and Siri. But Thompson says, “When Apple puts its stamp of approval and design on something the rest of the world follows, and we expect a lot of innovation coming in deep natural language understanding where the phone takes action almost magically.”

This magic comes at a cost in terms of back-end processing and connection to the network. The Siri service will require a connection to the web not only to understand the speech, but also to figure out what steps to take once it knows what was said. Given that most tasks, such as asking Siri to tell your wife you are running late or helping you find the best vegetarian restaurant in Seattle, require broadband anyway, this isn’t really a hardship. It does mean, though, that if you’re lonely and without a data connection, even Siri won’t talk to you.

  1. > The easier technology is basic speech recognition, and the harder bit is adding context to the words spoken and figuring out what the user wants to do. That requires a semantic engine and artificial intelligence that is continually getting better but is still not perfect.

    Yes well said

    > “If you don’t guide the users between some guard rails for natural language processing it can cause problems,” Grannan says. “For those reasons we have tended to shy away from extreme natural language conditioning for our users, but that’s a balance [Apple will] have to strike in guiding their users.”

    Somewhat right but the subtext is Vlingo does not have the right semantic and artificial intelligence expertise not does Nuance.

  2. i.e. nor does Nuance.

    1. Huh? But but wasn’t/isn’t Siri based on Nuance?

      http://gigaom.com/2010/02/04/siri-make-artificial-intelligence-your-slave/

      1. Siri relies on Nuance only for speech recognition, not the semantic engine. Also, because Siri was built using a modular approach, it gives them the flexibility to swap out Nuance for another speech recognition engine without too much difficulty.

  3. Good discussion, Stacey. Our report (How Speech Technologies Will Transform Mobile Use at http://bit.ly/xW9tv) offers quite a bit of discussion on this topic… this excerpt (http://bit.ly/n9svr3) outlines situations in which voice UIs will be more or less suitable (and useful). We are also very bullish on the prospects for Assistant (see http://bit.ly/nNuOYR), particularly if Apple gets the AI right (and there is quite a bit of debate on this topic – see critique by @marshallk at http://rww.to/nMrBMn).
    Dr. Phil Hendrix, immr and GigaOm Pro analyst

  4. John Harrington, Jr. Wednesday, October 5, 2011

    I think what will separate Siri from FaceTime is this: it is reliant on the relationship between the phone-user and the phone itself.

    FaceTime on the other hand requires other people with FaceTime-capable phones to get anything started. Also consider that the biggest appeal of video chatting is that it brings two people together who aren’t able to connect face-to-face on a frequent basis (which is why it is most appealing for parents with young children + grandparents, students far from home/studying abroad, etc…). More on why I think Siri will be bigger than FaceTime here: http://bit.ly/q4UdTZ

  5. FaceTime is awesome, but not the same.

    It’s easier to compare to Copy & Paste or Multitasking. But, it’s better than that. It changes how you interact with both your phone and your computer.

    The problem I’d see with Vlingo and Nuance is that, as downloaded apps, only a few people would get a lot of use out of them; for everyone else they would be a novelty. They would need to grow, and maybe partner with Windows, Android, Blackberry, etc.

    It really needs to be baked in.

    My typical day and how I’d use it.
    Alarm Clock, Reminders, Scheduling, Weather. Between these it’s a minimum of 5 a day.
    Search (Google, Wiki, Wolfram) averaging 5-15 a day.
    iPod: at least 10.

    Then I’d move on to transcribing dreams or thoughts while running or in bed. Dictating voice into notes while reading.

    And it’s one click away. Another problem with Vlingo or Nuance: if I have to open them as apps, I might as well just launch the app I want. So they need to grow, but also bake themselves in. That being said, once Siri hits the market, people will love her and grow even more attached to their iPhones, which should scare the competition into working on their own.

