
Could Siri be the invisible interface of the future?

Amid the 200 new iOS 5 software features and updated hardware specifications of Apple’s iPhone 4S, a beta application is poised to be the breakout star. I’m talking about Siri, the “intelligent assistant” that works with natural language to help iPhone owners easily get information or take certain actions solely through speech, and the subject of my latest report at GigaOM Pro (subscription required).

Although Siri is limited in what it can do, what it does do, it does well. And based on my experiences with Siri so far, it illustrates what I think of as the “invisible interfaces” of future connected devices. Admittedly, that sounds like a bold claim, but the reality is this: Thanks to the “Internet of things,” more devices are gaining connectivity that makes them smarter and more useful. At the same time, computing interfaces haven’t changed all that much in the past several decades. They’re going to have to change, however, because we can’t juggle a different interface for every one of the myriad connected devices in this new world.

The key to Siri’s potential success is its uncanny ability to understand not just natural-language input but also context. This is great for smartphones, where we have so much personal data such as contact names, addresses, phone numbers and digital music tracks. Even better is when Siri works with multiple apps or services on our handsets, tying them together through a simple command. “Remind me to take out the trash when I get home,” for example, leverages both the Reminders application and the integrated GPS radio of an iPhone.
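To make the idea concrete, here is a toy sketch of how that trash-reminder sentence might decompose into structured pieces: an app to delegate to, a task, and a location trigger. Every name here (`parse_reminder`, the geofence trigger shape) is invented for illustration; this is in no way Apple’s actual implementation.

```python
# Hypothetical sketch: decomposing a spoken reminder into structured intent
# data. All names are invented for illustration, not Apple's real internals.
import re

def parse_reminder(command: str) -> dict:
    """Split a reminder command into a task and a location trigger."""
    match = re.match(r"remind me to (.+) when i get (\w+)", command.lower())
    if not match:
        return {}  # the assistant didn't recognize this pattern
    task, place = match.groups()
    return {
        "app": "Reminders",  # the app the assistant delegates to
        "task": task,
        # the location trigger would lean on the phone's GPS radio
        "trigger": {"type": "geofence", "place": place},
    }

print(parse_reminder("Remind me to take out the trash when I get home"))
# {'app': 'Reminders', 'task': 'take out the trash',
#  'trigger': {'type': 'geofence', 'place': 'home'}}
```

The point of the sketch is the tying-together: one sentence, parsed once, fans out to two separate phone capabilities without the user touching either.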

Think ahead and you can begin to see how powerful this concept is as more of the devices, appliances and gadgets around us gain connectivity. Saying, “Close the windows and turn on the air-conditioning if the outside temperature rises above 85 degrees,” could be a real-world example in just a few years’ time. And with a solution similar to Siri, it won’t require any mouse clicking or screen tapping to happen. If it did, the solution would be diminished in effectiveness, because you’d have to interact with each device separately, using varied interfaces, or hope that each smart object could speak to the others.
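That windows-and-air-conditioning command amounts to one rule spanning several devices, and a minimal sketch shows why a single interface beats tapping each gadget separately. The `Window` and `AirConditioner` classes below are stand-ins for whatever a real smart-home API would expose; nothing here is an actual product interface.

```python
# Hypothetical sketch: one spoken rule ("close the windows and turn on the
# A/C above 85 degrees") applied across several connected devices at once.
# The device classes are imaginary stand-ins for a real smart-home API.

class Window:
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

class AirConditioner:
    def __init__(self):
        self.on = False
    def turn_on(self):
        self.on = True

def apply_climate_rule(outside_temp_f, windows, ac, threshold_f=85):
    """Fire the whole rule once the temperature crosses the threshold."""
    if outside_temp_f > threshold_f:
        for window in windows:
            window.close()
        ac.turn_on()
        return True   # rule fired
    return False      # nothing to do yet
```

One function call stands in for the single spoken sentence; the user never addresses the windows and the air conditioner through separate interfaces.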

One interface, based on standard conversational language, however, would be a far more powerful solution. You wouldn’t see it because it’s invisible, but it would always be there, ready to carry out your spoken commands. I’m so convinced that the Siri of today is just scratching the surface of such a future that I expanded on this topic in detail this week in a GigaOM Pro report (subscription required). I’d have Siri read the report aloud for you, but it isn’t quite that good. Yet.

13 Responses to “Could Siri be the invisible interface of the future?”

  1. AppleFUD

    Interesting… when Google does it there’s very little talk about it. When Apple purchases something from Nuance and implements it, it’s the “future.”

    What? Google and/or Nuance isn’t good enough to supply something for the future?
    Does it really have to have a fruity logo on it to be “the future?”

    • Andre Goulet

      Just shows the confidence that folks have in Apple delivering things in a polished, usable format, nothing else. They make things ‘fit for the masses’ unlike other companies.

      And no, the Voice Control on Android phones and previous iPhones is not even close, not by a long shot. Siri is in a whole other league.

  2. Thomas P.

    If you ask the Siri team, they should be (they used to be) more than forthcoming about the problems with a single virtual agent and what’s termed “cross talk.” Basically, a single agent can only access so much information (or a fixed number of APIs) before it gets confused.

    The future is with brand-based or portal-specific virtual agents that don’t have to scan through wads of data and average out a probable best response, because the contextual failure rate will be unacceptable.

    So, “many Siris” is more likely, each with a distinctive knowledge pool and, more importantly, a distinctive voice.

    • “So, ‘many Siris’ is more likely, each with a distinctive knowledge pool and, more importantly, a distinctive voice.”

      That seems frustrating, like going to city hall for a building permit or a rezoning: too many conflicting responses and no standard operating procedure.

    • I believe that is not necessarily the case. Context is organized data: we might share the organization (speech in general) but not the meaning, which can be very personalized.
      Right now it’s “easy” to create shared context, and most likely that’s what Siri is doing. But the next step will be to create personalized context, which gives meaning on a personal level.

      Will Siri get there? I don’t know. It’s all about data organization: if they have a model based on fixed shared context, then most likely not, and as you say they will end up with many Siris. If they have a flexible [self-organizing] model that can build both personal and shared context, then yes, only one Siri.