Google Translate logo on a mobile phone. (Photo: Getty)
As a prophet, Warren Weaver was poor; as a strategist, he was a genius. When, in the 1950s, the American mathematician predicted that machines would be able to translate automatically within five years, his colleagues raised their eyebrows and investors opened their wallets. Weaver may have got carried away, but he did so with a purpose: to secure funding for a line of research that is still being refined today, machine translation.
Last January, Google announced that its translation service, Google Translate, would be able to perform simultaneous spoken translation. This milestone would let two people who speak different languages understand each other without first typing what they want to translate, as has been necessary until now. The company's headphones will be able to whisper into our ear, in our own language, everything being said in, say, a bar in Jakarta. One of the predictions that science fiction has fantasized about for decades, from Star Trek to The Hitchhiker's Guide to the Galaxy, is thus coming true. Investors and buyers are ready to open their wallets, but this time the experts have shown no surprise. Machine translation has been changing the way we understand the world (and make ourselves understood in it) for a couple of decades. This is just one more step.
Weaver's remarks fired the starting gun, and Google's announcement could mark the final sprint of an 80-year marathon. Automatic translators detect 180 languages, let us read the entire web (albeit without much precision), reduce isolation when we travel and promise a future in which barriers of language, culture and even class are blurred. But before analyzing that future potential, it is worth examining the present risks.
"If you don't know how to use it, Google Translate can make you speak worse English," says Celia Rico, an expert in translation technologies at the European University of Madrid. "It is based on a corpus of words, and this can be limited, which impoverishes the language. We cannot just use it assuming that everything that comes out of it is perfect." Rico is a translator, but her statement does not stem from a misplaced sense of rivalry. In fact, she has been studying machine translation with passion for 30 years. "Everyone imagines us with a pen and a dictionary, but translation is a highly technological profession," she says wryly. "Almost all of us use machine translation tools. They make our work easier. The thing is, you have to know how to use them."
So, how should we use Google Translate? "If it is a language we know, it can serve as a first pass, almost as inspiration," says Rico. "It can also help us if we don't know certain words." From there, the text has to be reviewed, modified and polished. And that, the translator concludes, cannot be done by a machine.
The maker of the machine, ironically, agrees. Google insists that its tool will never replace the work of a good translator, and when asked what advice it would give users, it says: "Google Translate works best with short fragments of text, such as menus, signs or articles, and can be very useful in short conversations when we need, for example, to ask for directions, check what ingredients a dish contains or find out the price of something. It is not intended to replace fluency in another language."
Another point to keep in mind when using this tool is that ways of expressing oneself differ from language to language. "The translation may be accurate, but often when you read it you realize that something is off," says Rico. "For example, in Spanish we elaborate more on the same idea and take more detours, while English uses shorter, more direct sentences. People reason differently depending on the language, and this shows in the way texts are structured." That is where the work of a good translator comes in: departing from the literal wording without altering the spirit of a text.
Carmen Torrijos has spent her entire life helping us understand each other better. She used to do it by mediating between people who speak different languages. Now she does it by mediating between humans and machines. This former translator has retrained in computational linguistics, a profession she practices at the Institute of Knowledge Engineering. That gives her a broader view of translation carried out by machines. To explain her thinking about this technology, she offers an anecdote: "On one occasion I asked Google Translate for the exact English translation of the expression 'trata de personas', and it replied 'Is about people' (that is, 'se trata de personas'). So I went to the translator DeepL, which answered 'human trafficking'. The difference was huge, but both answers were strictly valid. Only I could decide which was correct, because I knew the nature of the text, the context and the client." For this reason, she recommends reading the text to be translated carefully, for comprehension, before hitting the button and trusting the machines to work their magic.
Bureaucracy is the new Rosetta Stone
This magic, however, is becoming more and more sophisticated. In their latest versions, automatic translators take context into account before translating. "I think the big change happened in 2014, when Google started using neural networks for translation," reflects Rico. Until then, automatic translators relied on syntactic and statistical translation, treating words in isolation. The use of artificial intelligence has allowed them to begin understanding texts as a whole, putting the details in context.
To achieve this, technology companies feed their algorithms an enormous amount of text translated into several languages. Searching the available databases, they have struck a rich seam in the most unexpected corner of the internet: the place where international bureaucracy gathers dust. Trade treaties, United Nations protocols and European Union laws, written in the dozens of languages of the member countries, are the perfect food for these algorithms. Bureaucracy is the new Rosetta Stone. Perhaps this is why automatic translators are much more reliable with formal and academic language and fail more often with colloquial expressions and street slang, which is also constantly evolving.
This way of training automatic translators can lead to biases and distortions. "For example, if a system has learned from political texts, it will translate medical texts quite badly," explains Marta R. Costa-Jussà, a researcher at the Polytechnic University of Catalonia. "It can also amplify gender biases," she says. An example: in English, nouns have no grammatical gender, so professions are neutral. But when they are translated into other languages, such as the Romance languages, they have to be assigned a gender. The system then reproduces what it has read on the internet. Thus, for many years, doctors have been men and nurses women. In 2018, Google Translate addressed this problem by adding a second box so that translations appear in both genders. However, as AlgorithmWatch has reported, some sexist biases have survived the change.
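How a translator ends up making doctors men and nurses women can be illustrated with a deliberately tiny sketch. It is not how Google Translate or any real system works: the six-sentence corpus, the candidate lists and the function names are all invented for illustration. The only assumption is the one described above, that a system tends to pick whichever form it has seen most often.

```python
from collections import Counter

# Toy "training corpus" of Spanish sentences, invented for illustration.
CORPUS = [
    "el médico llegó tarde",
    "el médico firmó la receta",
    "la médica llegó tarde",
    "la enfermera tomó la muestra",
    "la enfermera firmó el informe",
    "el enfermero tomó la muestra",
]

# Possible Spanish renderings of each gender-neutral English noun (hypothetical table).
CANDIDATES = {
    "doctor": ["el médico", "la médica"],
    "nurse": ["el enfermero", "la enfermera"],
}


def count_candidates(corpus, candidates):
    """Count how often each candidate form opens a sentence in the corpus."""
    counts = Counter()
    for sentence in corpus:
        for forms in candidates.values():
            for form in forms:
                if sentence.startswith(form + " "):
                    counts[form] += 1
    return counts


def most_frequent(word, counts, candidates):
    """Single answer: the form seen most often, so the corpus bias is reproduced."""
    return max(candidates[word], key=lambda form: counts[form])


def both_genders(word, candidates):
    """Alternative in the spirit of the 2018 change: return every gendered form."""
    return list(candidates[word])


counts = count_candidates(CORPUS, CANDIDATES)
print(most_frequent("doctor", counts, CANDIDATES))  # el médico
print(most_frequent("nurse", counts, CANDIDATES))   # la enfermera
print(both_genders("doctor", CANDIDATES))
```

With a corpus skewed two to one, the single-answer function never produces "la médica" or "el enfermero", whereas returning both forms leaves the choice to the reader.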
Another side effect of feeding artificial intelligence with the texts and audio available online is the gap between languages. Obviously, there is not as much text available in Kazakh as in Japanese, so translation does not work as well. This carries over, to a lesser extent, to languages in between. "Speech recognizers work better in German than in Finnish. Machine translation between English and Portuguese is significantly better than between Dutch and Spanish," says Costa-Jussà. She is working to change that, and she wants to do it not with words but with numbers.
The LUNAR project, directed by Costa-Jussà, aims to create a kind of mathematical Esperanto. "Our idea is to find a mathematical representation of language, spoken and written," explains the researcher. "Current translation systems use deep learning algorithms that transform language into a mathematical representation." What LUNAR aims to do is use the capacity for abstraction of these algorithms to arrive at a universal representation of language: to reduce a language to a formula. This representation would make it possible to build automatic translation systems that improve quality for minority languages in which translation is not yet very good or, as Costa-Jussà calls them, "languages with few resources."
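The appeal of a shared representation can be shown with a toy sketch. This is not LUNAR's method: the project's actual models are not described here, and the integer concept IDs below merely stand in for the learned mathematical representations a real system would use. The word tables and the function name are invented for illustration.

```python
# Toy "universal representation": every concept has one ID shared by all languages.
# The integers are an assumption standing in for learned vectors.
TO_CONCEPT = {
    "en": {"house": 0, "water": 1},
    "es": {"casa": 0, "agua": 1},
    "fi": {"talo": 0, "vesi": 1},
}

# Invert each table once so that a concept ID can be turned back into a word.
FROM_CONCEPT = {
    lang: {concept: word for word, concept in table.items()}
    for lang, table in TO_CONCEPT.items()
}


def translate(word, source, target):
    """Map the word to its shared concept ID, then read that ID out in the target language."""
    concept = TO_CONCEPT[source][word]
    return FROM_CONCEPT[target][concept]


print(translate("talo", "fi", "es"))  # casa
print(translate("agua", "es", "en"))  # water
```

Nothing in the sketch links Finnish to Spanish directly. Each language only has to be connected to the shared representation once, which is why the approach is attractive for languages with few resources.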
Projects like LUNAR aim to keep breaking down language barriers, but they do not seek to (and cannot) replace the process of learning a language or the human effort of translating it, nor the cultural, personal and linguistic richness that comes with learning a new language. Umberto Eco said in his book Dire quasi la stessa cosa (translated by both Google and the publisher Debolsillo as Say almost the same thing) that translation is a matter of negotiation and cannot be reduced to a handful of formulas and algorithms. Nor does that seem to be the intention of these tools or of the researchers who improve them. "We do not seek to replace interpreters and translators or dissuade anyone from learning a new language," Google says. "We focus on breaking down language barriers and making it easier for people to communicate. Language is more than words, and we wholeheartedly support and encourage learning new languages and different cultures."
Warren Weaver was not only a great strategist; he was also a voracious collector. The mathematician was obsessed with Alice in Wonderland. He amassed as many as 160 versions of Lewis Carroll's book in 42 different languages. He even wrote a book, Alice in Many Tongues, analyzing the quality of the versions and focusing especially on what he considered hardest to translate: the Mad Hatter's logical jokes and puns. Something that, he thought, a machine would never be able to translate fluently. For now, reality seems to agree with him. Then again, as we have said, Weaver was not a particularly good prophet.