Google Translation Center: The World's Largest Translation Memory


Disclosure: I am the founder of Der Mundo, a multilingual blogging service and translation community that combines human and machine translation (provided in part by Google), and I have researched translation technology for more than 10 years via the Worldwide Lexicon project.

Blogoscoped reports that Google is preparing to launch Google Translation Center, a new translation tool for freelance and professional translators. This is an interesting move, and it has broad implications for the translation industry, which up until now has been fragmented and somewhat behind the times from a technology standpoint.

Google has been investing significant resources in a multi-year effort to develop its statistical machine translation technology. Statistical MT works by comparing large numbers of parallel texts that have been translated between languages, and from these it learns which words and phrases usually map to others, similar to the way humans acquire language. The problem with statistical MT is that it requires a large number of directly translated sentences. These are hard to find, so SMT systems rely on sources like the proceedings of the European Parliament and the United Nations, which are fine if you’re writing in bureaucrat-speak but aren’t so great for other texts. Google Translation Center is a straightforward and very clever way to gather a large corpus of parallel texts to train its machine translation systems.
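To make the idea of learning word mappings from parallel text concrete, here is a minimal sketch of a textbook word-alignment approach (IBM Model 1 style expectation-maximization). It is an illustration only and not a description of Google's system, which is far more sophisticated; the function names and the three-sentence toy corpus are made up for this example.

```python
from collections import defaultdict


def train_word_model(pairs, iterations=20):
    """Estimate t[(src, tgt)] ~ P(tgt | src) from tokenized sentence pairs."""
    # Start every co-occurring word pair with the same weight.
    t = defaultdict(float)
    for src, tgt in pairs:
        for s in src:
            for w in tgt:
                t[(s, w)] = 1.0

    for _ in range(iterations):
        count = defaultdict(float)  # expected number of (src, tgt) links
        total = defaultdict(float)  # expected number of links leaving src
        # E-step: split each target word's credit among the source words
        # of the same sentence pair, in proportion to the current estimates.
        for src, tgt in pairs:
            for w in tgt:
                norm = sum(t[(s, w)] for s in src)
                for s in src:
                    share = t[(s, w)] / norm
                    count[(s, w)] += share
                    total[s] += share
        # M-step: re-normalize so the estimates sum to 1 for each source word.
        for (s, w), c in count.items():
            t[(s, w)] = c / total[s]
    return dict(t)


def best_translation(t, src_word):
    """Return the target word with the highest estimated probability."""
    candidates = {w: p for (s, w), p in t.items() if s == src_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy corpus of German/English sentence pairs.
corpus = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]
model = train_word_model(corpus)
for word in ("das", "haus", "buch", "ein"):
    print(word, "->", best_translation(model, word))
```

Even with three sentence pairs the model sorts out which word goes with which, because "das" appears with "the" twice and "buch" with "book" twice. The same logic explains the data hunger described above: rare words and phrasings only get reliable estimates when they appear in many translated sentences.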

Part machine translator and part translation memory (a sort of search engine for translation that helps translators to recall translations), GTC will help translators by providing a free, global translation memory, and in turn drive costs down by reducing the amount of work needed to complete a text. It will help Google by providing an excellent source of high quality parallel texts that can be fed back into the statistical translation systems.
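For readers who have not used one, a translation memory can be sketched in a few lines: store previously translated segments, then return the closest stored match for a new sentence. The sketch below uses Python's standard `difflib.SequenceMatcher` for fuzzy matching. The class name, the 0.75 threshold, and the sample segments are assumptions for illustration and say nothing about how GTC is actually built.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy translation memory: stored segment pairs plus fuzzy lookup."""

    def __init__(self):
        self._segments = []  # list of (source_text, target_text)

    def add(self, source, target):
        """Remember a finished translation for later reuse."""
        self._segments.append((source, target))

    def lookup(self, query, threshold=0.75):
        """Return (source, target, score) for the closest stored source
        segment, or None if nothing scores at least `threshold`."""
        best = None
        best_score = 0.0
        for source, target in self._segments:
            # ratio() is 1.0 for identical strings and lower as they diverge.
            score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
            if score > best_score:
                best, best_score = (source, target, score), score
        return best if best_score >= threshold else None


# Hypothetical English-to-Spanish segments.
tm = TranslationMemory()
tm.add("Press the green button to start.", "Pulse el botón verde para empezar.")
tm.add("Save your work before closing.", "Guarde su trabajo antes de cerrar.")

match = tm.lookup("Press the red button to start.")
if match is not None:
    source, target, score = match
    print(f"{score:.0%} match: {source!r} -> {target!r}")
```

A near match like this still needs a human to change "verde" to "rojo", but that is much less work than translating the sentence from scratch, which is where the cost savings come from. Every confirmed correction also becomes a new parallel segment, which is the kind of data that can be fed back into a statistical system.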

If Google releases an API for the translation management system, it could establish a de facto standard for integrated machine translation and translation memory, creating a language platform around which projects like Der Mundo can build specialized applications and collect more training data.

On the other hand, GTC could be bad news for translation service bureaus — especially those that use proprietary translation management systems as a way to hold customers and translators hostage. Most translation bureaus aren’t really technology companies and aren’t very competent at building quality software. Google Translation Center fills a void in the translation tools market that was created when the few independent companies, such as Trados, were acquired.

For freelancers, GTC could be very good news; they could work directly with clients and have access to high quality productivity tools. Overall this is a welcome move that will force service providers to focus on quality, while Google, which is competent at software, can focus on building tools. Google has a pretty mixed track record with consumer-facing services outside its core search business. But if it positions itself as a neutral service provider, it could enable projects like Der Mundo and others to create powerful and easy-to-use translation services for a broad range of industries.

Translation management is more complex than it appears, with different practices in different industries. If you’re translating a news story, you want minimal cost and fast turnaround time (publish early, correct often). If you’re translating a product spec sheet, you’re willing to spend more to have it done right before it goes to press. Google would be smart to position GTC as a utility for translators and to encourage service bureaus to standardize around it, much as the industry did around earlier tools like Trados, and much as Google has done with its keyword ad business. That strategy would also eliminate a potential conflict of interest, as translation professionals are understandably wary of contributing to something that could put them out of work, and it would avoid channel conflicts with partners who will be Google’s best advocates in selling to various clients.

While it’s my guess that Google has no intention of directly monetizing the service (charging a commission on transactions it brokers would expose Google to a billing and payment disbursal nightmare), the R&D value of collecting millions of parallel sentences in every language pair imaginable is indisputable, and it will pay off in unforeseen ways. So, my guess is Google will make this a free tool for the translation industry to use, and it will figure the money part out later. It can afford to be patient.

Translation is a very difficult problem. If it weren’t, it would have been solved a long time ago. I remain convinced that a multilingual web will be a reality in a short time, and that a menagerie of tools and services will emerge over the next few years — some geared toward helping translators, some toward building translation communities, and others that make publishing multilingual sites and blogs easy and intuitive.

As these emerge, the web will begin translating itself, and within a short time, we’ll be able to read content from sources worldwide just as we explore the web in our own language today.

25 Comments

habib

I believe that your English-Arabic translation is far from accurate and often nonsense.

Ima Translator

Thank you for your insights.
I agree with your statement that it is more complex than it appears; that is why the more challenging translations have to be done by experts.

P. R.

This is pure science fiction: automated translation will never replace the human translator. Take a look at Umberto Eco’s book, Dire quasi la stessa cosa (I don’t think it was translated, even though it should contain some excerpts from the Canadian “Experiences in translation”), where the author makes some experiments with the automated translation tools available at the time. The results are just funny.

Louis

I use a translation add-on for Mozilla to understand what a text is about. The thing’s called Moztrans. It doesn’t use Google, because I have compared translations of one and the same text and they are different. Moztrans supports topical dictionaries and the results are really good when you choose the right dictionary.

John

Machine translation? I am an engineer turned translator. I agree with what was said above that a translator needs to understand the message, not just the words. Many, many times the translator picks up errors by the original author. A machine will just faithfully translate the errors, because it does not, cannot, have a clue about the author’s intentions. To translate an unknown text to find out what it’s all about, sure, use MT. But to produce something you want publicly displayed? Don’t even think about it.

Seth

Google needs to hire HUMAN translators. Machines just cannot ever read the minds of authors in their writings in context. Machines are not the answer to that. Translating old writing requires bilingual specialists who have not only spoken experience in both languages but also studies in the literature of the language. I’m a Spanish Literature major at IU Bloomington reading “El delantal blanco” just for the heck of it and I could translate it better than Google! There are figures of speech used that cannot be translated literally. So Google, can I be your first professional HUMAN translator of Spanish and English?

Jim

No doubt, a great tool for translators.

Yet Tomedes.com is still the king of language translation with their unique model.

It offers the best mixture of prices, quality and delivery time.

Mark Daniels

As the owner of a Serbian-English translation business I certainly feel a hint of concern about the future of our industry! I recently tried Google’s Serbian-English translation tool and was pretty amazed at the result. Much of the text was understandable, even word-perfect.

However, it’s the bit that ISN’T perfect that is going to be the major obstacle and the reason why I think our foreseeable future is safe. For despite the 80% comprehensibility of the translated text, there is a 20% that I cannot see a computer dealing with in the near future, an aspect of communication that cannot be emulated by a computer. Apart from the nuances of culture, context, inflection etc. there is something more fundamental which computers lack: understanding.

I am firmly convinced that true translation is only possible when there is understanding – of the message, not just interpretation of the words, or even the sentences. True understanding, by definition, will only be achievable by an effectively sentient computer, and I don’t think we are anywhere near that just yet!

For superficial, quick-help understanding of a text, these tools are great, but for accurate, authentic communication in the target language humans will be needed for a long time to come…

Maxwell

Hi there,

I work for a large UK translations company.

There are a few points to make regarding automated translation systems such as those found online.

Software can be used for translation projects, but I would advise against it if what you’re translating is intended ultimately for publication.

There are often so many cultural nuances to take into consideration that cutting corners by using translation software can render your message practically impossible to understand for your target audience.

And beyond popular European languages like Spanish (that being 2nd only to English and 4th overall in the world), you run the risk of dramatic misinterpretation.

Often the only way to go is to utilise the services of a professional translator.

Anyway, great blog though; keep it up.

Good to have these things discussed so that people get the facts rather than being left to speculate and not get the results that they want.

Gabriel Kent

Onward to a meta-language!

As these systems grow better at transforming from one language to another we will find Babel redux in the aether.

Exciting ;)

Chuck Baggett

I don’t think humans actually do learn language by performing statistical analysis of parallel text.

I think humans learn language by hearing the people around them talk, people who reward them for repeating the teacher (parent, babysitter, etc.) with smiles, sounds of encouragement, etc. The sounds are often associated with specific objects or actions, such as saying “bottle” while showing the learner a bottle.

Later on, one learns a language by studying vocabulary, grammar rules, and repetition of sounds which get approved or rejected.

mendi

Thanks benni,

Just checked onehourtranslation.com.

Bought 1000 words for $36.1 and got my text in Spanish in no time.

It’s really working, cool

Sue

The last point you raised is so important.
“As these emerge, the web will begin translating itself, and within a short time”
Just think about all the possibilities that multi-language sites offer.
I already use a good instant service (http://onehourtranslation.com) to translate my blogs to 2 more languages. Machine translation is still far from being perfect, and regular translators are very expensive. In a flat world this service should cost next to nothing.

Benny

A similar service already exists and is already open to everyone: OneHourTranslation.com
http://www.onehourtranslation.com

This is a web service for people who need to translate their documents, fast and low-cost.
It is based on thousands of freelance translators all around the globe, monitored and rated for high quality results.

Lukas

A similar project launched just a few months ago and will be available in English, too! » http://www.tolingo.com

A Social Translation Community with Translation Memory help and online Translation Service

JP

You could have talked about http://www.cucumis.org/. We have been doing exactly the same thing for 3 years, except there are only volunteers and no money involved.
From my experience with cucumis, the most difficult part is proofreading the translations. We have a team of more than 100 volunteer admins there.

David Codish

Good article and addresses some thoughts I had about translators for a while. Good that Google is doing something in this space.

I’m still waiting for the day that I will be able to chat with people who don’t speak my language and have the chat program make that fact transparent to me.
