Blog Post

Speech smack-down: Siri vs. Android Voice Actions

Apple(s aapl) reported on Monday that it has sold 4 million iPhone 4S handsets in the debut weekend of the new smartphone. Some may have purchased it for the dual-core processor, while others were sold on the updated 8 megapixel camera and intelligent imaging software. But as a long-time Android(s goog) owner, one feature alone pushed me to add an iPhone 4S to my stable of smartphones: Apple’s Siri service that turns the handset into a personal assistant.

Google had voice commands a year ago

“But wait,” the Android users are crying out. “Google has voice services too!” Indeed, Android devices do have similar voice services. Voice search is native to the platform and blew me away when I first used it on a Google Nexus One in Jan. 2010. Even in a crowded room, I could search the web with ease just by speaking to my phone. And in August of last year, Google introduced Voice Actions, a downloadable Android app that extended voice services to speech commands. Here’s a video demo to illustrate the functionality.

At a quick glance, Apple’s Siri and Google’s Voice Actions appear similar. In many ways, they are. Here’s a quick breakdown of the similar functions that both services provide through speech:

  • Create text messages and emails.
  • Get navigation directions.
  • Call a contact.
  • View a map of a particular area.
  • Write a note.
  • Play music.
  • Perform a web search.

Google’s Voice Actions also includes the ability to search for and call a business in one step, plus it can be used to open a web page. Siri comes close to opening web pages; when I say “go to” Siri does a Google search for the site, making it one tap away.

So what’s different about Siri?

That’s pretty much the end for Google Voice Actions, which requires users to memorize the exact commands much like Microsoft’s(s msft) Voice Command for Windows Mobile did back in its 2003 debut. It’s not difficult to do so, and the functionality is certainly useful. However, Siri can do even more and provides intelligence that Android doesn’t yet match. Compare the above video demo of Google’s solution with this introduction to Siri that Apple created; you’ll immediately notice that Siri isn’t simply a voice command system. Instead, it’s a semi-intelligent interactive assistant:

The biggest difference is the one that will likely have the most impact going forward: Siri’s use of natural language. Instead of requiring memorized commands, Siri can understand questions, even if they’re asked in different ways. The speech engine works with conversational language, much like when speaking to a person.

For example, Siri will pull up my calendar items if I say, “What’s my schedule for today?” “Do I have any appointments?” or simply, “What’s next for me?” The variance also lets me ask for exactly the information I want: the first phrase shows today’s calendar events, the second brings up a full week’s worth of calendar events, and the last returns only my next appointment.
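To make the idea concrete, here is a toy sketch of how differently worded requests could map to different slices of a calendar. Apple hasn't published how Siri does this, so everything below is an assumption for illustration: the `CALENDAR_INTENTS` table, the `calendar_scope` function and the scope names are all hypothetical, and a real assistant would rely on far more capable language models than a few keyword patterns.

```python
import re

# Hypothetical phrase patterns, checked in order. This is an illustration only;
# it is not how Siri actually parses speech.
CALENDAR_INTENTS = [
    (re.compile(r"\bschedule\b.*\btoday\b"), "today"),   # "What's my schedule for today?"
    (re.compile(r"\bappointments?\b"), "week"),          # "Do I have any appointments?"
    (re.compile(r"\bnext\b"), "next_event"),             # "What's next for me?"
]


def calendar_scope(utterance):
    """Return which slice of the calendar a request asks for, or None if no pattern matches."""
    text = utterance.lower()
    for pattern, scope in CALENDAR_INTENTS:
        if pattern.search(text):
            return scope
    return None


if __name__ == "__main__":
    for phrase in ("What's my schedule for today?",
                   "Do I have any appointments?",
                   "What's next for me?"):
        print(phrase, "->", calendar_scope(phrase))
```

Even this crude version shows the difference from a fixed command list: the user never has to say a specific trigger word first, and the wording itself selects how much of the calendar comes back.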

Note to pre-empt the inevitable comments the screen shot will generate: I use an Exchange setup for iOS Mail and have shared calendars; my wife and step-daughter have yoga and dance class, not me.

Rise of the machines and artificial intelligence

Siri’s real value is illustrated when scheduling a new event, although that’s just one example. Instead of blindly creating an appointment, Siri first checks the calendar for any conflicts and warns if it finds any. Plus, it will ask if you want to move the conflicting event. Because Siri uses high-level artificial intelligence (AI), it offers far more value, but does so in a way that’s natural, so even first-time smartphone owners can use it. Siri can also learn who’s important in your life. I can tell Siri to text my son and it will automatically create the message with his phone number; no need to speak his name.
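The conflict check described above is easy to picture in code. The following is a minimal sketch under my own assumptions, not Apple's implementation: the event dictionaries, `find_conflicts` and `schedule` are hypothetical names, and the replies are made up to mimic the kind of warning Siri gives.

```python
from datetime import datetime


def find_conflicts(events, start, end):
    """Return existing events that overlap the half-open interval [start, end)."""
    return [e for e in events if e["start"] < end and start < e["end"]]


def schedule(events, title, start, end):
    """Add an event only if the slot is free; otherwise warn and leave the calendar unchanged."""
    conflicts = find_conflicts(events, start, end)
    if conflicts:
        names = ", ".join(e["title"] for e in conflicts)
        return f"Note: '{title}' overlaps with {names}. Move the conflicting event?"
    events.append({"title": title, "start": start, "end": end})
    return f"OK, I scheduled '{title}'."


if __name__ == "__main__":
    calendar = [{"title": "Yoga",
                 "start": datetime(2011, 10, 20, 18, 0),
                 "end": datetime(2011, 10, 20, 19, 0)}]
    print(schedule(calendar, "Dinner",
                   datetime(2011, 10, 20, 18, 30), datetime(2011, 10, 20, 19, 30)))
    print(schedule(calendar, "Lunch",
                   datetime(2011, 10, 20, 12, 0), datetime(2011, 10, 20, 13, 0)))
```

Treating each event as a half-open interval means back-to-back appointments, such as one ending at 7 p.m. and another starting at 7 p.m., don't count as a conflict.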

Just like Google has done with Voice Actions for Android, Apple has integrated Siri with several native apps such as the phone, email, messaging, notes and web searches. Siri can search Google, but it also adds support for Wolfram Alpha, which provides answers to very specific questions. Asking Siri about the square root of a number, for example, will quickly return the answer from Wolfram Alpha.

Siri also works in iOS 5 with the Clock, Reminders and Apple’s new Find My Friends app. Instead of setting a recurring alarm, I’m now telling Siri when to wake me up in the morning, depending on how late I stay awake. You can ask for the local time or the time in a given location. Setting task reminders is simple but effective: You can tell Siri to remind you at a given time or simply say, “Remind me to take out the trash in 17 minutes.” And if your family members or friends are sharing their location, you can ask Siri where they are.

Siri is quite the comedian

On the less productive side — but no less entertaining — is Siri’s personality. By contrast, Google’s Voice Actions has none because there’s no AI involved. Here are some of the fun questions and answers I’ve had in conversations with Siri:

Me: What’s the best mobile phone?
Siri: The one you’re holding.

Me: What’s your favorite color?
Siri: My favorite color is… well, I don’t know how to say it in your language. It’s sort of greenish, but with more dimensions.

Me: What do you want to be when you grow up?
Siri:  I try to be satisfied with what I have, Kevin.

Me: What do you look like?
Siri: Does my appearance matter to you, Kevin?

OK, so these little chats don’t help my productivity levels, but it’s fun to see how Siri will respond in conversation. And it illustrates the level of contextual understanding and AI that’s built into the product. Siri reminds me of the vision I had last year when writing a GigaOM Pro article about smartphones powering robots (subscription required): With sensors and connectivity, our smartphones have “senses” plus access to near limitless information. Thanks to Siri, they now have intelligence to understand our questions in everyday language and are even fun to interact with.

Google: It’s your move

Remember that Siri is only a beta product. Apple is sure to improve and expand Siri’s capabilities going forward. Google can rely on third-party apps to do the same (think Vlingo, which is experiencing a sudden surge in use thanks to Siri), but a better approach would be for Google to mature Android’s native voice functionality. Yes, third-party software may help sell phones to a degree, but native functionality, especially when it looks magical to the mass market, will have a greater impact.

As a side note, I had decided before the iPhone 4S launch that I might pass on it if Apple didn’t boost the screen size as I had hoped it would. My tired old eyes prefer a larger display and I can also type faster on a bigger software keyboard. Siri, however, mitigates this to a point. The more I use Siri, the less I look at the iPhone’s screen, because Siri provides spoken feedback. Additionally, iOS 5 adds a voice input button on the keyboard, similar to Android. Using the speech-to-text engine has me typing less than ever, because it’s extremely accurate and appears faster than Google’s own speech recognition engine.

30 Responses to “Speech smack-down: Siri vs. Android Voice Actions”

  1. Liquidrain7

    @Kevin: As an Android user, I agree Siri is a bit more polished than Google’s built-in Voice Actions; however, you mention a couple of items that I would like to touch on. First, you can ask Google Voice Actions the square root of any number, and it will provide the correct result. So to say that it cannot perform that type of function is misleading. Also, I feel it is worth mentioning that with the introduction of Android 4.0 (Ice Cream Sandwich), dictating text is near instant and, I believe, does not require a server connection, something that Siri requires all the time to be functional. As an interested reader, I hope you are willing to revisit the comparison when the Galaxy Nexus is launched with the official Android 4.0 Ice Cream Sandwich. Thanks!

    • Totally valid points; the example I used was the square root, which Google Voice Actions can find through a web search as well. But overall, Siri appears to have a wider array of “knowledge” so to speak. And you’re right about Android 4.0 on the dictation, as it is near-real time. I’m not sure if it requires a server connection, although I suspect it does. If not, that’s a huge differentiator. Note that I wrote this article several days prior to the Android 4.0 launch, so I couldn’t have mentioned this functionality. ;) No doubt I’ll be looking at voice actions/commands when I have an Android 4.0 device in hand – should be interesting to see the difference then – thanks!

  2. George Kraev

    Unless Google has been working on a similar engine behind the scenes, they cannot just expand the functionality of Google commands. Siri is a combination of an AI and voice recognition with a very well designed semantics engine. It takes a long time to get something like this working well, and even with the amount of genius gathered at the Google campus you cannot just wake up and decide to replicate Siri. This is also the reason why Siri is still in beta and Apple will likely keep it like that for at least another year. That said, once the work is finished, Siri will become the mobile version of the OS X services and that is something I cannot wait to see!

  3. David Pat

    You can navigate a webpage by voice? You don’t look at the screen when Siri tells you the results it’s pulled up for you?

    These tech journalists don’t actually use these products outside their offices. Take Siri out and realize how much more useful voice map navigation is (Android) than being able to ask Siri how to make a sandwich as you are driving…

  4. Voice Actions and Speaktoit don’t really require specific command words. I speak naturally to Voice Actions (Jeannie), and it responds pretty much the way Siri does. But my Evo 4G is a single core phone. Once I’m ready to upgrade to a new phone (maybe the Nexus Prime Tegra 3) it should work even better. That Google video in the review is old… a lot has changed since then.

  5. I think Voice Actions is the real comparison, since it seems to have all the features Siri now has. I have been using it for about a year and just did a side-by-side comparison with my coworker’s iPhone 4S, and they both seem to hit all the points, except Voice Actions can open apps and has a robotic voice. They both have funny responses, but Siri is a good bit funnier.

  6. Intelligence is proactive, dumb [systems] are reactive.

    For example there is no reason a system can not ask me if it should calculate the MPG if I get back into my car after I stopped at a gas station:

    Knows car [bluetooth]
    Knows gas station [GPS]
    Got out of car [turned off, bluetooth]
    Back into car [turned back on,bluetooth]
    If in a smarter car, can get miles driven since last gas stop
    [ok, should use NFC here, but for the sake of argument]

    Real life data is [ mostly always] bound by space time[which becomes painfully clear if one teaches a machine math], in other words occurrence based data binding. Which doesn’t have to be “programmed” it can bind itself.
    The same goes for “meaning”. Context[organized data, see above] provides meaning, that’s why words can have different meanings in different situations or different words can have “same” meaning. Google has none, Siri some.

    While Siri moves in the right direction, speech or keyboard input, I believe they have a long way to go to understand and build context by using self[I know I’m a phone and I know where I’m in space time and recognize data (past/present/prediction)] to become proactive.

    Or in Google’s case. Which darn keyboard is used in dictation mode to give better results. Ever tried swype and/or Google keyboard to switch to dictation for the same text dictated and got different results? Android can do everything and nothing well. Or everybody can hack something together[1].


    • Marvin, I haven’t tried it, but based on your description, I’ll give it a look. I focused more on the native voice functionality in this article because it’s interesting to compare what Apple and Google are doing for their respective platforms. But you’re right: there are solid 3rd party alternatives.

      • I’d still like that comparison though. I mean, technically Siri is a third party app that Apple had sense enough to buy. Great article though. Informative and unbiased.

      • what marvin said. siri is/was a 3rd party app before bought… kinda irks me when people say apple created siri… this is also my first time hearing about voice actions by pannous, im gonna try it :) glad i read this.

  7. If you want an app that compares to Siri on Android, give Voice Actions by Pannous (not Google) a try. It has all the features of Siri as far as I can tell and you can speak in a natural voice/language. It can open any app, play music tracks or playlists, download and search apps in Market, answer all kinds of questions, etc. It can be set to run in the background and can be “woken” by shaking it or saying “hey Jeannie”. Anyway, I’ve been using it for a while and it’s pretty good, but ever since Siri arrived there have been comparisons of Siri vs Vlingo, or Siri vs Google Voice Actions. Both those apps are limited in features and the ability to speak naturally. Would like to see a Siri vs Voice Actions (by Pannous).

    • I agree. I have to laugh because this is the third article I’ve seen comparing Siri to an Android app, and the writer isn’t even aware of other options that are available. I use Jeannie every day. And she may not be perfect, but she’s not bad. And she has a sense of humor, too. I’m actually finding that the more I use her, the better she becomes. Voice Actions by Pannous is definitely more similar to Siri than the Google version or Vlingo would be.

  8. Colin Komar

    This article makes me even more jealous of iPhone 4S users!!! I can’t afford to upgrade from my iPhone 4 right now, and this is the one feature I want most!

    The primary reason I want Siri is that I would find it most useful while driving. Instead of handing my phone off to my wife while I drive to perform some task on my iPhone 4, I could use Siri to search, read/reply texts, call, map a route, etc. Oh well…


    Just kidding :) I actually can read.

    I think Siri is pretty cool. The most worrying thing about it so far, besides missing useful features like being able to tell it to open any app on the iPhone by name and interact with them, is the up and down nature of the Siri servers so far. I think on the 15th, I was completely unable to connect with Siri all day.

    I’m assuming Apple will fix that, and I’m also wondering if that’s why they limited Siri to the 4S for now, so as to limit the impact of the sudden flow of Siri use by the masses.

    • Well…. I *can* do some sick hip-hop moves. ;) I agree: there are many more features I’d like to see added, but given that this is a beta I figure Apple will add them over time. In fact, my gut says that this may become an interface that rivals touch in the future. The server issue is a problem and sounds like the result of a flood of people using the service; Apple needs to get the kinks worked out for that big data center so it can handle all of this speech processing.