
Could Yap Be The Next Big Speech Recognition Player?

The speech recognition field has quietly gotten a little more crowded. Yap, a Charlotte, N.C.-based startup, today said it has been tapped to power Microsoft’s (s msft) Talk-to-Text mobile application, which Sprint (s s) offers to BlackBerry users. The app uses Yap’s web-based speech recognition technology to automatically transcribe users’ spoken words into texts and e-mails, much like services such as Vlingo.

Yap pocketed $6.5 million in a Series A round led by SunBridge Partners two years ago, and several months ago it replaced SpinVox as Cincinnati Bell’s voicemail-to-text service provider. Cincinnati Bell’s move was something of a shot across the bow of SpinVox parent Nuance (s nuan), which last week pulled the plug on SpinVox’s consumer service to focus more intensely on its carrier business.

Microsoft’s decision to license Yap’s speech recognition technology instead of using its own is interesting, considering the software behemoth bought its way onto the field three years ago with the acquisition of Tellme Networks for a reported $800 million. But it’s a clear sign that a legitimate new player has joined the giants of the speech recognition world: Microsoft, Google (s goog) and Nuance.

Related content from GigaOM Pro (sub req’d):

How Speech Technologies Will Transform Mobile Use

Image courtesy Flickr user DJOtaku.

5 Responses to “Could Yap Be The Next Big Speech Recognition Player?”

  1. Steve Kauffman

    Maybe I’m crazy. I am not a tech person, but everything I’ve seen, including Vlingo, is like a toy compared to Nuance. There is no ulterior motive here. Has any independent research company compared these products? It just doesn’t seem close.

    • Steve, you have to carefully set up an apples-to-apples comparison. Think of three categories of speech-recognition applications:

      1. Speaker-dependent, with high-quality audio (PC-based dictation, like Dragon NaturallySpeaking)
      2. Speaker-dependent, with noisy audio (smartphone applications)
      3. Speaker-independent, with noisy audio (voicemail transcription for cloud-based telephony)

      Category 3 is the most challenging, and Yap compares quite favorably with larger players here. For that category, we (MyCaption) have been using Yap technology for almost a year now. We have seen impressive improvement in their accuracy.

      Speech recognition is far from perfect, and yes, we do end up using human editors to bridge the accuracy gap. But I am optimistic about the future of this technology in general, and of Yap in particular.