Disclosure: I am the founder of Der Mundo, a multilingual blogging service and translation community that combines human and machine translation (provided in part by Google), and I have researched translation technology for more than 10 years via the Worldwide Lexicon project.

Blogoscoped reports that Google is preparing to launch Google Translation Center, a new translation tool for freelance and professional translators. This is an interesting move with broad implications for the translation industry, which until now has been fragmented and somewhat behind the times technologically.

Google has been investing significant resources in a multi-year effort to develop its statistical machine translation technology. Statistical MT works by comparing large numbers of parallel texts that have been translated between languages, and from these it learns which words and phrases usually map to others, similar to the way humans acquire language. The problem with statistical MT is that it requires a large number of directly translated sentences. These are hard to find, so SMT systems rely on sources like the proceedings of the European Parliament and the United Nations, which are fine if you’re writing in bureaucrat-speak but aren’t so great for other texts. Google Translation Center is a straightforward and very clever way to gather a large corpus of parallel texts to train its machine translation systems.
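To make the idea of learning word mappings from parallel texts concrete, here is a minimal sketch using IBM Model 1, a classic textbook alignment model. This is not a description of Google's system, whose internals are not public in this post; the three German-English sentence pairs and the function names are made up for illustration.

```python
from collections import defaultdict

# Toy parallel corpus (hypothetical data): (source sentence, target sentence).
PARALLEL_TEXTS = [
    ("das Haus", "the house"),
    ("das Buch", "the book"),
    ("ein Buch", "a book"),
]


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with expectation-maximization."""
    corpus = [(src.split(), tgt.split()) for src, tgt in pairs]
    target_vocab = {word for _, tgt in corpus for word in tgt}

    # Start with no opinion: every target word is equally likely.
    t = defaultdict(lambda: 1.0 / len(target_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links from each source word

        # E-step: share each target word among the source words in its
        # sentence, in proportion to the current probabilities.
        for src, tgt in corpus:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    share = t[(tw, sw)] / norm
                    count[(tw, sw)] += share
                    total[sw] += share

        # M-step: re-estimate the probabilities from the expected counts.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]

    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)


model = train_ibm_model1(PARALLEL_TEXTS)
for word in ("das", "Haus", "Buch", "ein"):
    print(word, "->", best_translation(model, word))
```

Even with three sentence pairs the model can separate "das" from "Haus", because "das" also appears next to "Buch". Real systems need vastly more data to do the same for a full vocabulary, which is why a steady supply of professionally translated sentences is so valuable.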

Part machine translator and part translation memory (a sort of search engine for translation that helps translators to recall translations), GTC will help translators by providing a free, global translation memory, and in turn drive costs down by reducing the amount of work needed to complete a text. It will help Google by providing an excellent source of high quality parallel texts that can be fed back into the statistical translation systems.
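For readers unfamiliar with translation memory, the sketch below shows the basic lookup: store previously translated segments, then retrieve the closest stored match for a new sentence. It assumes a simple character-similarity score from Python's standard `difflib` module; the class name, the 0.75 threshold, and the sample sentences are illustrative assumptions, and nothing here reflects how GTC itself matches segments.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy in-memory store of (source segment -> translation) pairs."""

    def __init__(self):
        self._segments = {}

    def add(self, source, translation):
        self._segments[source] = translation

    def lookup(self, query, threshold=0.75):
        """Return (score, source, translation) for the closest stored
        segment scoring at or above threshold, or None if nothing qualifies."""
        best = None
        for source, translation in self._segments.items():
            # ratio() is 1.0 for identical strings and 0.0 for no overlap.
            score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, source, translation)
        return best


tm = TranslationMemory()
tm.add(
    "Press the power button to start the device.",
    "Pulse el botón de encendido para iniciar el dispositivo.",
)

# A near-duplicate sentence retrieves the earlier translation as a starting point.
match = tm.lookup("Press the power button to restart the device.")
if match:
    score, source, translation = match
    print(f"{score:.0%} match: {translation}")
```

The translator then edits the suggested translation instead of starting from scratch, which is where the savings come from. Every confirmed edit can be added back to the memory as a new sentence pair.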

If Google releases an API for the translation management system, it could establish a de facto standard for integrated machine translation and translation memory, creating a language platform around which projects like Der Mundo can build specialized applications and collect more training data.

On the other hand, GTC could be bad news for translation service bureaus — especially those that use proprietary translation management systems as a way to hold customers and translators hostage. Most translation bureaus aren’t really technology companies and aren’t very competent at building quality software. Google Translation Center fills a void in the translation tools market that was created when the few independent companies, such as Trados, were acquired.

For freelancers, GTC could be very good news; they could work directly with clients and have access to high quality productivity tools. Overall this is a welcome move that will force service providers to focus on quality, while Google, which is competent at software, can focus on building tools. Google has a pretty mixed track record with consumer-facing services outside its core search business. But if it positions itself as a neutral service provider, it could enable projects like Der Mundo and others to create powerful and easy-to-use translation services for a broad range of industries.

Translation management is more complex than it appears, with different practices in different industries. If you’re translating a news story, you want minimal cost and fast turnaround time (publish early, correct often). If you’re translating a product spec sheet, you’re willing to spend more to have it done right before it goes to press. Google would be smart to position GTC as a utility for translators and to encourage service bureaus to standardize around it, much as they did around earlier tools like Trados, and much as Google has done with its keyword ad business. That strategy would also eliminate a potential conflict of interest, since translation professionals are understandably wary of contributing to something that could put them out of work, and it would avoid channel conflicts with the partners who will be Google’s best advocates in selling to various clients.

While it’s my guess that Google has no intention of directly monetizing the service (charging a commission on transactions it brokers would expose Google to a billing and payment disbursal nightmare), the R&D value of collecting millions of parallel sentences in every language pair imaginable is indisputable, and it will pay off in unforeseen ways. I expect Google will make this a free tool for the translation industry to use and figure the money part out later. It can afford to be patient.

Translation is a very difficult problem. If it weren’t, it would have been solved a long time ago. I remain convinced that a multilingual web will be a reality in a short time, and that a menagerie of tools and services will emerge over the next few years — some geared toward helping translators, some toward building translation communities, and others that make publishing multilingual sites and blogs easy and intuitive.

As these emerge, the web will begin translating itself, and within a short time we’ll be able to read content from sources worldwide just as we explore the web in our own language today.

  1. Watch out ProZ, here comes Google Translation Center | Global by Design Monday, August 4, 2008

    [...] I just read an insightful article on this Google’s service at GigaOm… Possibly related posts:The end of translation as we know [...]

  2. Good article and addresses some thoughts I had about translators for a while. Good that Google is doing something in this space.

    I’m still waiting for the day when I will be able to chat with people who don’t speak my language and have the chat program make that fact transparent to me.

  3. How to Surf the Multilingual Web « Shepherd’s Pi Tuesday, August 5, 2008

    [...] on the web, multilingual approaches are absolutely necessary, whether machine, human, or hybrid.  GigaOm yesterday featured a piece by Brian McConnell, founder of Der Mundo, a “multilingual blogging service and translation community that [...]

  4. You could have mentioned http://www.cucumis.org/. We have been doing exactly the same thing for 3 years, except that there are only volunteers and no money is involved.
    From my experience with cucumis, the most difficult part is proofreading the translations. We have a team of more than 100 volunteer admins there.

  5. A similar project launched a few months ago and will be available in English, too! » http://www.tolingo.com

    A Social Translation Community with Translation Memory help and online Translation Service

  6. A similar service already exists and is already open to everyone: OneHourTranslation.com

    This is a web service for people who need to translate their documents, fast and low-cost.
    It is based on thousands of freelance translators all around the globe, monitored and rated for high quality results.

  7. The last point you raised is so important.
    “As these emerge, the web will begin translating itself, and within a short time”
    Just think about all the possibilities that multi-language sites offer.
    I already use a good instant service (http://onehourtranslation.com) to translate my blogs to 2 more languages. Machine translation is still far from being perfect, and regular translators are very expensive. In a flat world this service should cost next to nothing.

  8. And Money Should Grow On Trees | “la parole exportée” Tuesday, August 5, 2008

    [...] was reading this very interesting article on Google’s new Translation Center, when my eye was drawn to the last comment: Machine [...]

  9. Thanks benni,

    Just checked onehourtranslation.com.

    bought 1000 words for 36.1$ and got my text in Spanish in no time

    It’s really working, cool

  10. I don’t think humans actually do learn language by performing statistical analysis of parallel text.

    I think humans learn language by hearing the people around them talk, people who reward them for repeating the teacher (parent, babysitter, etc.) with smiles, sounds of encouragement, etc. The sounds are often associated with specific objects or actions, such as saying “bottle” while showing the learner a bottle.

    Later on, one learns a language by studying vocabulary, grammar rules, and repetition of sounds which get approved or rejected.
