New “deep neural networks” technology out of Microsoft and the University of Toronto claims not only to translate your English into Mandarin but also to speak it in your voice. Microsoft head researcher Richard Rashid demonstrated the technology in China recently.


Talk about your time savers. New research from Microsoft and the University of Toronto may make it possible for non-Chinese speakers to “speak” the language in their own voices without having to learn the language. Given the trade relationships between the US and China, this could be a really big deal if it works as advertised.

While great strides have been made in speech recognition over the past decades, current systems still carry word error rates of 20 percent to 25 percent when handling “arbitrary speech,” Microsoft’s Richard Rashid wrote in a blog post. (Do you hear that, Siri?)
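
For readers wondering what a “word error rate” actually measures: the usual definition counts the word substitutions, deletions and insertions needed to turn the recognizer’s output into the correct transcript, then divides by the number of words in the correct transcript. Here is a minimal Python sketch of that standard definition. The function name and the sample sentences are made up for illustration, and nothing here comes from Microsoft’s system.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name), not Microsoft's code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of four reference words.
    print(word_error_rate("please call a cab", "please call a cat"))
```

By this measure, a 20 percent to 25 percent error rate means roughly one word in every four or five comes out wrong.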

But now, new technology called Deep Neural Networks, which mimics the way the human brain operates, enables much more discriminating speech recognition, according to Rashid, Microsoft’s chief research officer.

Rashid, who demonstrated the technology at a Microsoft conference in Tianjin, China, in late October, said the process takes the text of the speaker’s speech and runs it through a translator, which first finds the Chinese (I’m assuming Mandarin) equivalents for his words and then rearranges them into an order that is appropriate to the new language.

In addition, a text-to-speech system uses samples from a native Chinese speaker, along with pre-recorded English samples of the speaker’s own voice, to model the sound of the speaker’s voice. A video of Rashid’s demo makes it clear that the technology impressed the audience of Tianjin students. The system is not perfect, Rashid said, but it does cut the word error rate by 30 percent compared with previous methods.
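
It is worth doing the arithmetic on that last number. Assuming the 30 percent is a relative cut applied to the 20 percent to 25 percent baseline quoted above (an assumption on my part, not something spelled out here), the new error rate lands at roughly 14 percent to 17.5 percent. A quick Python sketch, with a made-up helper name:

```python
def reduced_error_rate(baseline: float, relative_cut: float) -> float:
    """Error rate after a relative reduction.

    Assumes the cut is relative to the baseline, e.g. 0.30 means the new
    rate is 70 percent of the old one. Hypothetical helper for illustration.
    """
    return baseline * (1.0 - relative_cut)


if __name__ == "__main__":
    for baseline in (0.20, 0.25):
        new_rate = reduced_error_rate(baseline, 0.30)
        print(f"{baseline:.0%} baseline -> {new_rate:.1%} after a 30% relative cut")
```

In other words, under that assumption the system still gets about one word in six or seven wrong, which is better than one in four or five but a long way from perfect.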

Microsoft, by the way, last week inked a deal with 21Vianet, a Shanghai-based ISP, to bring both Windows Azure public cloud services and Office 365 to China.

More here about the translation technology from TheNextWeb and SlashGear. And now, check out the video.



  1. lingchenjiang Friday, November 9, 2012

    as a chinese, i am wondering how it works…

    1. As an Indian I am also wondering how it works.

  2. Nigerian Newspapers Today Friday, November 9, 2012

    This is amazing!!
    How on earth did they come up with such technology?
    Amazing things happen

    1. technology means nothing, details are everything.

  3. The headline made me laugh very loudly. Good work that editor.

  4. Here’s a video on software for translating Mandarin-English conversations:

  5. An error rate of 3 percent is considered unacceptable for speech translation software. If the Microsoft translator has an error rate of 20-25 percent, it is not ready for alpha testing.

  6. It’s a Brave New World…

  7. This system has hand held application for verbal communication.
    I have a system for written info processing or info storage and retrieval called weizima which may be googled. Weizima can help to sort Chinese characters.

  8. Alvin7 writes: “An error rate of 3percent is considered unacceptable for speech translation software. If the Microsoft translator has an error rate 20-25 percent, it is not ready for alpha testing.”

    Considering my current error rate in a cab in Taipei is 1000%, I’ll take 75% success rate any day.
