
Summary:

New hardware helped Apple sell 4 million iPhone 4S handsets over the weekend, but Siri, the personal assistant software, may be the most groundbreaking aspect. Here’s a closer look at the difference in voice command and intelligence between the two largest mobile platforms: iOS and Android.


Apple reported on Monday that it sold 4 million iPhone 4S handsets in the new smartphone’s debut weekend. Some may have purchased it for the dual-core processor, while others were sold on the updated 8-megapixel camera and intelligent imaging software. But as a long-time Android owner, one feature alone pushed me to add an iPhone 4S to my stable of smartphones: Apple’s Siri service that turns the handset into a personal assistant.

Google had voice commands a year ago

“But wait,” the Android users are crying out. “Google has voice services too!” Indeed, Android devices do have similar voice services. Voice search is native to the platform and blew me away when I first used it on a Google Nexus One in Jan. 2010. Even in a crowded room, I could search the web with ease just by speaking to my phone. And in August of last year, Google introduced Voice Actions, a downloadable Android app that extended voice services to speech commands. Here’s a video demo to illustrate the functionality.

At a quick glance, Apple’s Siri and Google’s Voice Actions appear similar. In many ways, they are. Here’s a quick breakdown of the similar functions that both services provide through speech:

  • Create text messages and emails.
  • Get navigation directions.
  • Call a contact.
  • View a map of a particular area.
  • Write a note.
  • Play music.
  • Perform a web search.

Google’s Voice Actions also includes the ability to search for and call a business in one step, plus it can be used to open a web page. Siri comes close to opening web pages; when I say “go to yahoo.com” Siri does a Google search for the site, making it one tap away.

So what’s different about Siri?

That’s pretty much the extent of Google Voice Actions, which requires users to memorize exact commands, much like Microsoft’s Voice Command for Windows Mobile did at its 2003 debut. Doing so isn’t difficult, and the functionality is certainly useful. However, Siri can do even more and provides intelligence that Android doesn’t yet match. Compare the above video demo of Google’s solution with this introduction to Siri that Apple created; you’ll immediately notice that Siri isn’t simply a voice command system. Instead, it’s a semi-intelligent interactive assistant:

The biggest difference is the one that will likely have the most impact going forward: Siri’s use of natural language. Instead of requiring users to memorize set commands, Siri can understand questions even when they’re asked in different ways. The speech engine works with conversational language, much like speaking to a person.

For example, Siri will pull up my calendar items if I say, “What’s my schedule for today?” “Do I have any appointments?” or simply, “What’s next for me?” The variance isn’t just cosmetic, though. Each phrasing gets me the specific information I want: the first shows today’s calendar events, the second brings up a full week’s worth of events, and the last returns only my next appointment.

Note to pre-empt the inevitable comments the screen shot will generate: I use an Exchange setup for iOS Mail and have shared calendars; my wife and step-daughter have yoga and dance class, not me.
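To make the calendar example above concrete, here is a toy Python sketch of how several different phrasings could resolve to distinct requests. It is purely illustrative: the intent names, the regular expressions and the `match_intent` helper are all assumptions made up for this example, and nothing here reflects how Siri actually works. Real natural-language systems go far beyond pattern matching.

```python
import re

# Hypothetical intent table. The intent names and patterns are illustrative
# assumptions for this sketch, not anything Apple has documented.
INTENT_PATTERNS = {
    "calendar_today": [r"\bschedule\b.*\btoday\b"],
    "calendar_week": [r"\bappointments?\b"],
    "calendar_next": [r"\bwhat'?s next\b"],
}


def normalize(utterance):
    """Lowercase the text and turn curly apostrophes into straight ones."""
    return utterance.lower().replace("\u2019", "'").strip()


def match_intent(utterance):
    """Return the first intent whose pattern matches, or None if none do."""
    text = normalize(utterance)
    for intent, patterns in INTENT_PATTERNS.items():
        if any(re.search(pattern, text) for pattern in patterns):
            return intent
    return None


for phrase in ("What's my schedule for today?",
               "Do I have any appointments?",
               "What's next for me?"):
    print(phrase, "->", match_intent(phrase))
```

Even this crude version shows why the three phrases can return three different views of the same calendar: each one resolves to its own request rather than to a single "show calendar" command.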

Rise of the machines and artificial intelligence

Siri’s real value is illustrated when scheduling a new event, although that’s just one example. Instead of blindly creating an appointment, Siri first checks the calendar for any conflicts and warns if it finds any. Plus, it will ask if you want to move the conflicting event. Because Siri uses high-level artificial intelligence (AI), it offers far more value, but does so in a way that’s natural, so even first-time smartphone owners can use it. Siri can also learn who’s important in your life. I can tell Siri to text my son and it will automatically create the message with his phone number; no need to speak his name.
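The conflict check described above boils down to an interval-overlap test. The Python sketch below shows one simple way to do it; the `find_conflicts` function and the sample calendar entries are hypothetical, and this is not a description of Apple’s implementation.

```python
from datetime import datetime


def find_conflicts(calendar, new_start, new_end):
    """Return titles of events that overlap the range [new_start, new_end)."""
    # Two ranges overlap when each one starts before the other ends.
    return [title for title, start, end in calendar
            if start < new_end and new_start < end]


# Hypothetical sample data: (title, start, end)
calendar = [
    ("Yoga class", datetime(2011, 10, 18, 9, 0), datetime(2011, 10, 18, 10, 0)),
    ("Dance class", datetime(2011, 10, 18, 17, 0), datetime(2011, 10, 18, 18, 0)),
]

conflicts = find_conflicts(calendar,
                           datetime(2011, 10, 18, 9, 30),
                           datetime(2011, 10, 18, 10, 30))
if conflicts:
    print("Conflicts with:", ", ".join(conflicts))
```

An assistant that warns before booking would run a check like this first, then ask whether to move the conflicting event instead of creating the appointment blindly.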

Just like Google has done with Voice Actions for Android, Apple has integrated Siri with several native apps such as the phone, email, messaging, notes and web searches. Siri can search Google, but it also adds support for Wolfram Alpha, which provides answers to very specific questions. Asking Siri about the square root of a number, for example, will quickly return the answer from Wolfram Alpha.

Siri also works in iOS 5 with the Clock, Reminders and Apple’s new Find My Friends app. Instead of setting a recurring alarm, I’m now telling Siri when to wake me up in the morning, depending on how late I stay awake. You can ask for the local time or the time in a given location. Setting task reminders is simple but effective: You can tell Siri to remind you at a given time or simply say, “Remind me to take out the trash in 17 minutes.” And if your family members or friends are sharing their location, you can ask Siri where they are.

Siri is quite the comedian

On the less productive side — but no less entertaining — is Siri’s personality. By contrast, Google’s Voice Actions has none because there’s no AI involved. Here are some of the fun questions and answers I’ve had in conversations with Siri:

Me: What’s the best mobile phone?
Siri: The one you’re holding.

Me: What’s your favorite color?
Siri: My favorite color is… well, I don’t know how to say it in your language. It’s sort of greenish, but with more dimensions.

Me: What do you want to be when you grow up?
Siri: I try to be satisfied with what I have, Kevin.

Me: What do you look like?
Siri: Does my appearance matter to you, Kevin?

OK, so these little chats don’t help my productivity levels, but it’s fun to see how Siri will respond in conversation. And it illustrates the level of contextual understanding and AI that’s built into the product. It reminds me of the vision I had last year when writing a GigaOM Pro article about smartphones powering robots (subscription required): With sensors and connectivity, our smartphones have “senses” plus access to near-limitless information. Thanks to Siri, they now have the intelligence to understand our questions in everyday language and are even fun to interact with.

Google: It’s your move

Remember that Siri is only a beta product. Apple is sure to improve and expand Siri’s capabilities going forward. Google can rely on third-party apps to do the same (think Vlingo, which is experiencing a sudden surge in use thanks to Siri), but a better approach would be for Google to mature Android’s native voice functionality. Yes, third-party software may help sell phones to a degree, but native functionality, especially when it looks magical to the mass market, will have a greater impact.

As a side note, I had decided before the iPhone 4S launch that I might pass on it if Apple didn’t boost the screen size as I had hoped it would. My tired old eyes prefer a larger display and I can also type faster on a bigger software keyboard. Siri, however, mitigates this to a point. The more I use Siri, the less I look at the iPhone’s screen, because Siri provides spoken feedback. Additionally, iOS 5 adds a voice input button on the keyboard, similar to Android. Using the speech-to-text engine has me typing less than ever, because it’s extremely accurate and appears faster than Google’s own speech recognition engine.

  1. Give the Speaktoit app in the market a try.

  2. NICE YOGA AND DANCE CLASS, DUDE!!!1!!

    Just kidding :) I actually can read.

    I think Siri is pretty cool. The most worrying thing about it so far, besides missing useful features like being able to tell it to open any app on the iPhone by name and interact with them, is the up and down nature of the Siri servers so far. I think on the 15th, I was completely unable to connect with Siri all day.

    I’m assuming Apple will fix that, and I’m also wondering if that’s why they limited Siri to the 4S for now, so as to limit the impact of the sudden flow of Siri use by the masses.

    1. Well…. I *can* do some sick hip-hop moves. ;) I agree: there are many more features I’d like to see added, but given that this is a beta I figure Apple will add them over time. In fact, my gut says that this may become an interface that rivals touch in the future. The server issue is a problem and sounds like a flood of people using the service; Apple needs to get the kinks worked out for that big data center so it can handle all of this speech processing.

  3. so bomb. i want to try!

  4. This article makes me even more jealous of iPhone 4S users!!! I can’t afford to upgrade from my iPhone 4 right now, and this is the one feature I want most!

    The primary reason I want Siri is that I would find it most useful while driving. Instead of handing my phone off to my wife while I drive to perform some task on my iPhone 4, I could use Siri to search, read/reply texts, call, map a route, etc. Oh well…

  5. If you want an app that compares to Siri on Android, give Voice Actions by Pannous (not Google) a try. It has all the features of Siri as far as I can tell and you can speak in a Natural Voice/language. Can open any app, play music tracks or playlists, download and search apps in Market, answers all kind of questions, etc. It can be set to run in background and can be “woke” by shaking it or saying “hey Jeannie”. Anyways I’ve been using it for a while and it’s pretty good but ever since Siri arrived there have been comparisons of Siri vs Vlingo, or Siri vs Google Voice Actions. Both those apps are limited in features and the ability to speak naturally. Would like to see a Siri vs Voice Actions (by Pannous).

    1. I agree. I have to laugh because this is the third article I’ve seen comparing Siri to an Android app, and the writer isn’t even aware of other options that are available. I use Jeannie everyday. And she may not be perfect, but she’s not bad. And she has a sense of humor, too. I’m actually finding that the more I use her, the better she becomes. Voice Actions by Pannous is definitely more similar to Siri than the Google version or Vlingo would be.

      1. I’m aware of 3rd party apps for Android and even mentioned that in the article. This was a look at what Apple and Google are doing from a platform perspective.

      2. I use Jeannie every day too, however I’ve renamed her “Skynet” after I asked her “what is your quest?”

  6. @Kevin have you tried that one before?

    1. Marvin, I haven’t tried it, but based on your description, I’ll give it a look. I focused more on the native voice functionality in this article because it’s interesting to compare what Apple and Google are doing for their respective platforms. But you’re right: there are solid 3rd party alternatives.

      1. I’d still like that comparison though. I mean, technically Siri is a third-party app that Apple had sense enough to buy. Great article though. Informative and unbiased.

      2. what marvin said. siri is/was a 3rd party app before bought… kinda irks me when people say apple created siri… this is also my first time hearing about voice actions by pannous, im gonna try it :) glad i read this.

  7. Stan Tarnovsky Monday, October 17, 2011

    It looks like Siri is a clever and entertaining software. Let’s see if it will prove to be useful as well…

  8. Is that voice input button only available on iPhone 4s or is it on the iPhone 4 as well with iOS 5?

    1. Siri is currently available for iPhone 4S only. That could change in the future though.

  9. Intelligence is proactive, dumb [systems] are reactive.

    For example, there is no reason a system cannot ask me whether it should calculate the MPG when I get back into my car after stopping at a gas station:

    Knows car [bluetooth]
    Knows gas station [GPS]
    Got out of car [turned off, bluetooth]
    Back into car [turned back on,bluetooth]
    If in a smarter car, can get miles driven since last gas stop
    [ok, should use NFC here, but for the sake of argument]

    Real life data is [ mostly always] bound by space time[which becomes painfully clear if one teaches a machine math], in other words occurrence based data binding. Which doesn’t have to be “programmed” it can bind itself.
    The same goes for “meaning”. Context[organized data, see above] provides meaning, that’s why words can have different meanings in different situations or different words can have “same” meaning. Google has none, Siri some.

    While Siri moves in the right direction, speech or keyboard input, I believe they have a long way to go to understand and build context by using self[I know I'm a phone and I know where I'm in space time and recognize data (past/present/prediction)] to become proactive.

    Or in Google’s case. Which darn keyboard is used in dictation mode to give better results. Ever tried swype and/or Google keyboard to switch to dictation for the same text dictated and got different results? Android can do everything and nothing well. Or everybody can hack something together[1].

    1. http://techcrunch.com/2011/10/17/iris-is-sort-of-siri-for-android/

  10. I see someone mentioned it already but, I have used Voice Actions to do what Siri is doing for about a year now. It also has its own bit of humor.

    https://market.android.com/details?id=com.pannous.voice.actions.free&feature=search_result

    1. Also I just noticed Voice Actions is also on the iPhone, but I can’t test if it works as well as the Android version.

