Google Translation Center: The World’s Largest Translation Memory

Brian McConnell | Monday, August 4, 2008 | 5:50 PM PT | 17 comments

Disclosure: I am the founder of Der Mundo, a multilingual blogging service and translation community that combines human and machine translation (provided in part by Google), and I have researched translation technology for more than 10 years via the Worldwide Lexicon project.

Blogoscoped reports that Google is preparing to launch Google Translation Center, a new translation tool for freelance and professional translators. This is an interesting move, and it has broad implications for the translation industry, which up until now has been fragmented and somewhat behind the times from a technology standpoint.

Google has been investing significant resources in a multi-year effort to develop its statistical machine translation technology. Statistical MT works by comparing large numbers of parallel texts that have been translated between languages, and from these it learns which words and phrases usually map to others, similar to the way humans acquire language. The problem with statistical MT is that it requires a large number of directly translated sentences. These are hard to find, and because of this SMT systems use sources like the proceedings of the European Parliament and the United Nations, which are fine if you’re writing in bureaucrat-speak but aren’t so great for other texts. Google Translation Center is a straightforward and very clever way to gather a large corpus of parallel texts to train its machine translation systems.
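To make the idea of learning word mappings from parallel texts concrete, here is a minimal sketch in Python. It uses IBM Model 1, a classic textbook word-alignment model, purely as an illustration. Nothing here describes Google's actual system, and the function names and the three-sentence German-English corpus are made up for the example.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) from sentence pairs using EM.

    `pairs` is a list of (source_words, target_words) tuples.
    Illustrative only: real systems train on millions of sentence pairs.
    """
    target_vocab = {tw for _, tgt in pairs for tw in tgt}
    # Start uniform: every source/target pairing is equally plausible.
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links per source word
        for src, tgt in pairs:
            for tw in tgt:
                # Share one unit of credit for `tw` among the source words
                # in the same sentence, in proportion to the current estimates.
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # Re-estimate the table from the expected counts.
        t = defaultdict(
            float,
            {(tw, sw): c / total[sw] for (tw, sw), c in count.items()},
        )
    return t


def best_translation(t, source_word, target_vocab):
    """Return the target word with the highest estimated probability."""
    return max(target_vocab, key=lambda tw: t[(tw, source_word)])


# Hypothetical toy corpus (German -> English).
pairs = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
table = train_ibm_model1(pairs)
english = {tw for _, tgt in pairs for tw in tgt}
for word in ["das", "Haus", "Buch", "ein"]:
    print(word, "->", best_translation(table, word, english))
```

No dictionary is supplied: the model works out that "Haus" pairs with "house" only because "das" is better explained by "the" in the other sentences. That is also why the training texts matter so much. A table learned from parliamentary proceedings knows only the vocabulary and phrasing found there.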

Part machine translator and part translation memory (a sort of search engine for translation that helps translators to recall translations), GTC will help translators by providing a free, global translation memory, and in turn drive costs down by reducing the amount of work needed to complete a text. It will help Google by providing an excellent source of high quality parallel texts that can be fed back into the statistical translation systems.
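A translation memory can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how GTC works: the `TranslationMemory` class, its method names, and the 0.75 similarity threshold are all hypothetical, and matching uses the standard library's `difflib.SequenceMatcher` rather than any industry-specific algorithm.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Hypothetical in-memory store of previously translated segments."""

    def __init__(self):
        self._segments = {}  # source sentence -> stored translation

    def add(self, source, translation):
        self._segments[source] = translation

    def lookup(self, query, threshold=0.75):
        """Return (score, source, translation) for the closest stored
        segment whose similarity is at least `threshold`, else None."""
        best = None
        for source, translation in self._segments.items():
            score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, source, translation)
        return best


tm = TranslationMemory()
tm.add(
    "Press the red button to stop the machine.",
    "Pulse el botón rojo para detener la máquina.",
)

# An exact repeat comes back with a perfect score.
exact = tm.lookup("Press the red button to stop the machine.")

# A near match ("green" instead of "red") still surfaces the earlier
# translation, so the translator only has to change one word.
fuzzy = tm.lookup("Press the green button to stop the machine.")

# An unrelated sentence finds nothing and must be translated from scratch.
miss = tm.lookup("Where is the train station?")

print(exact, fuzzy, miss, sep="\n")
```

The savings come from the fuzzy case. The more segments a shared memory holds, the more often a new sentence lands close to one that has already been translated, which is why a free, global memory could reduce the work needed per text.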

If Google releases an API for the translation management system, it could establish a de facto standard for integrated machine translation and translation memory, creating a language platform around which projects like Der Mundo can build specialized applications and collect more training data.

On the other hand, GTC could be bad news for translation service bureaus — especially those that use proprietary translation management systems as a way to hold customers and translators hostage. Most translation bureaus aren’t really technology companies and aren’t very competent at building quality software. Google Translation Center fills a void in the translation tools market that was created when the few independent companies, such as Trados, were acquired.

For freelancers, GTC could be very good news; they could work directly with clients and have access to high quality productivity tools. Overall this is a welcome move that will force service providers to focus on quality, while Google, which is competent at software, can focus on building tools. Google has a pretty mixed track record with consumer-facing services outside its core search business. But if it positions itself as a neutral service provider, it could enable projects like Der Mundo and others to create powerful and easy-to-use translation services for a broad range of industries.

Translation management is more complex than it appears, with different practices in different industries. If you’re translating a news story, you want minimal cost and fast turnaround time (publish early, correct often). If you’re translating a product spec sheet, you’re willing to spend more to have it done right before it goes to press. Google would be smart to position GTC as a utility for translators and to encourage service bureaus to standardize around it, much as they did around earlier tools like Trados, and much as Google has done with its keyword ad business. That strategy would also eliminate a potential conflict of interest, as translation professionals are understandably wary of contributing to something that could put them out of work, and it would avoid channel conflicts with the partners who will be Google’s best advocates in selling to various clients.

While it’s my guess that Google has no intention of directly monetizing the service (charging a commission on transactions it brokers would expose Google to a billing and payment disbursal nightmare), the R&D value of collecting millions of parallel sentences in every language pair imaginable is indisputable, and it will pay off in unforeseen ways. So, my guess is Google will make this a free tool for the translation industry to use, and it will figure the money part out later. It can afford to be patient.

Translation is a very difficult problem. If it weren’t, it would have been solved a long time ago. I remain convinced that a multilingual web will be a reality in a short time, and that a menagerie of tools and services will emerge over the next few years — some geared toward helping translators, some toward building translation communities, and others that make publishing multilingual sites and blogs easy and intuitive.

As these emerge, the web will begin translating itself, and within a short time, we’ll be able to read content from sources worldwide just as we currently explore the web in our own language today.

13 comments so far

August 5th, 2008
2:30 AM PT
David Codish said:

Good article and addresses some thoughts I had about translators for a while. Good that Google is doing something in this space.

I’m still waiting for the day when I will be able to chat with people who don’t speak my language and have the chat program make that fact transparent to me.

August 5th, 2008
4:31 AM PT
JP said:

You could have talked about (link). We have been doing exactly the same thing for 3 years, except that there are only volunteers and no money is involved.
From my experience with cucumis, the most difficult part is proofreading the translations. We have a team of more than 100 volunteer admins there.

August 5th, 2008
5:09 AM PT
Lukas said:

A similar project launched just a few months ago and will be available in English, too! » (link)

A Social Translation Community with Translation Memory help and online Translation Service

August 5th, 2008
7:19 AM PT
Benny said:

A similar service already exists and is already open to everyone - OneHourTranslation.com
(link)

This is a web service for people who need to translate their documents, fast and low-cost.
It is based on thousands of freelance translators all around the globe, monitored and rated for high quality results.

August 5th, 2008
7:41 AM PT
Sue said:

The last point you raised is so important.
“As these emerge, the web will begin translating itself, and within a short time”
Just think about all the possibilities that multi-language sites offer.
I already use a good instant service (http://onehourtranslation.com) to translate my blogs to 2 more languages. Machine translation is still far from being perfect, and regular translators are very expensive. In a flat world this service should cost next to nothing.

August 6th, 2008
9:54 AM PT
mendi said:

Thanks Benny,

Just checked onehourtranslation.com.

bought 1000 words for 36.1$ and got my text in Spanish in no time

It’s really working, cool

August 6th, 2008
11:15 AM PT
Chuck Baggett said:

I don’t think humans actually do learn language by performing statistical analysis of parallel text.

I think humans learn language by hearing the people around them talk, people who reward them for repeating the teacher (parent, babysitter, etc.) with smiles, sounds of encouragement, etc. The sounds are often associated with specific objects or actions, such as saying “bottle” while showing the learner a bottle.

Later on, one learns a language by studying vocabulary, grammar rules, and repetition of sounds which get approved or rejected.

August 6th, 2008
3:22 PM PT
Marco said:

The world’s largest translation memory is here:
(link)

August 8th, 2008
1:39 PM PT
Gabriel Kent said:

Onward to a meta-language!

As these systems grow better at transforming from one language to another we will find Babel redux in the aether.

Exciting ;)

August 16th, 2008
11:40 AM PT
Lautaro said:

There was an interesting discussion about this issue at (link)

Personally I think that there are different translation markets. Human translation will continue to be a business, but Proz will have a serious competitor.

August 27th, 2008
2:26 AM PT
Maxwell said:

Hi there,

I work for a large UK translations company.

There are a few points to make regarding automated translation systems such as those found online.

Software can be used for translation projects, but I would advise against it if what you’re translating is intended ultimately for publication.

There are often so many cultural nuances to take into consideration that cutting corners by using translation software can render your message practically impossible to understand for your target audience.

And beyond popular European languages like Spanish (that being 2nd only to English and 4th overall in the world), you run the risk of dramatic misinterpretation.

Often the only way to go is to utilise the services of a professional translator.

Anyway, great blog though; keep it up.

Good to have these things discussed so that people get the facts rather than being left to speculate and not get the results that they want.

Check this one…

Ning Social Bookmarking Group: Google Translation Center
(link)

October 15th, 2008
12:18 AM PT
Mark Daniels said:

As the owner of a Serbian-English translation business I certainly feel a hint of concern about the future of our industry! I recently tried Google’s Serbian-English translation tool and was pretty amazed at the result. Much of the text was understandable, even word-perfect.

However, it’s the bit that ISN’T perfect that is going to be the major obstacle and the reason why I think our foreseeable future is safe. For despite the 80% comprehensibility of the translated text, there is a 20% that I cannot see a computer dealing with in the near future, an aspect of communication that cannot be emulated by a computer. Apart from the nuances of culture, context, inflection etc. there is something more fundamental which computers lack: understanding.

I am firmly convinced that true translation is only possible when there is understanding - of the message, not just interpretation of the words, or even the sentences. True understanding, by definition, will only be achievable by an effectively sentient computer, and I don’t think we are anywhere near that just yet!

For superficial, quick-help understanding of a text, these tools are great, but for accurate, authentic communication in the target language humans will be needed for a long time to come…
