
Summary:

Google is developing technology it hopes will translate foreign languages almost instantly as users talk on their phones. But while the software could transform the way we talk to each other around the world, as Google acknowledges, it will likely take many years to achieve.

Google is developing mobile software that would translate foreign languages almost instantly, according to a new report in The Times Online. Enabling automated voice communication among speakers of the world’s 6,000-plus languages is a lofty goal, and one that may take many years to achieve — if it can be done at all.

The project combines Google’s speech-recognition technology with its automated system for translating text on computers, which now covers 52 languages. The company showcased the technology as a “concept demo” two months ago and now hopes to have a basic mobile translation system in place “within a couple of years,” according to the Times story. (A Google spokesman declined to tell me whether the project has moved beyond the concept stage and said the company has no news to announce.)

Delivering accurate translations in near real time will be an extremely difficult task, however. The language information behind such a service couldn’t possibly be stored and accessed on a phone, so Google will need the fast access to the cloud promised by HSPA and LTE. (Current speech-to-text engines send large chunks of data to the cloud for conversion to text, but translations during voice conversations will have to be far faster.) And while Google’s speech recognition software is respectable, the company may need to bulk up its portfolio by acquiring another player in the field. Nuance, which has continued to expand its own portfolio with the recent acquisitions of SpinVox and Jott, would be a particularly attractive (if expensive) option.

The value of technology that can deliver on-the-fly translations is evident in the Phraselator, a pricey gadget used by soldiers in Iraq. But to be effective in the mass market, voice recognition technologies must be able to consistently understand users regardless of speech patterns, dialects and other variables. Meanwhile, software must be able to determine context and other nuances to accurately translate one language to another. That’s difficult enough for humans to do; it may not be possible with technology alone.

Image courtesy Flickr user Jeremy Brooks.

  1. This will be very interesting to follow!

    All technologies involved here are error-prone. Overcoming their limitations will warrant some kind of reinvention of the wheel that will be impressive to behold if it works.

    Simply feeding current speech-to-text output into machine translation won’t cut it, even with automatic speech recognition accuracy rates well into the eighties. Translating even clean text into another language is a noisy channel problem, and while there are machine translation approaches that work with “broken” input, there is no wholesale solution that systematically models speech recognition errors. The problem then becomes a semantic one (not a statistical one, which is where Google excels).

    In any case, Google’s solution may just blow our minds.

    I suspect Google buying Nuance is unlikely, not only because of the company’s value but also because Nuance consists of a whole range of product and professional service groups (beyond the obvious speech technology data and expertise) that Google will not be interested in and that would be hard to break out of the overall structure.

