Facebook Completes Transition to Neural Machine Translation
Published August 5, 2017
Facebook, which uses machine translation to translate text in posts and comments automatically, announced in a blog post on August 3, 2017 that it has completed transitioning to a neural machine translation (NMT) system.
“We switched from using phrase-based machine translation models to neural networks to power all of our backend translation systems, which account for more than 2,000 translation directions and 4.5 billion translations each day,” Facebook engineers Juan Miguel Pino, Alexander Sidorov and Necip Fazil Ayan wrote in the company’s developers blog.
Read more: https://slator.com/technology/facebook-completes-transition-neural-machine-translation/
Language translation is one of the ways we can give people the power to build community and bring the world closer together. It can help people connect with family members who live overseas, or better understand the perspective of someone who speaks a different language. We use machine translation to translate text in posts and comments automatically, in order to break language barriers and allow people around the world to communicate with each other.
Creating seamless, highly accurate translation experiences for the 2 billion people who use Facebook is difficult. We need to account for context, slang, typos, abbreviations, and intent simultaneously. To continue improving the quality of our translations, we recently switched from using phrase-based machine translation models to neural networks to power all of our backend translation systems, which account for more than 2,000 translation directions and 4.5 billion translations each day. These new models provide more accurate and fluent translations, improving people's experience consuming Facebook content that is not written in their preferred language.
Sequence-to-sequence LSTM with attention: Using context
Our previous phrase-based statistical techniques were useful, but they also had limitations. One of the main drawbacks of phrase-based systems is that they break down sentences into individual words or phrases, and thus when producing translations they can consider only a few words at a time. This leads to difficulty translating between languages with markedly different word orderings. To remedy this and build our neural network systems, we started with a type of recurrent neural network known as sequence-to-sequence LSTM (long short-term memory) with attention. Such a network can take into account the entire context of the source sentence and everything generated so far, to create more accurate and fluent translations. This allows for long-distance reordering, as encountered between English and Turkish, for example.
[Side-by-side Turkish-to-English example from the original post, comparing the phrase-based and neural systems, not reproduced here.]
With the new system, we saw an average relative increase of 11 percent in BLEU — a widely used metric for judging the accuracy of machine translation — across all languages compared with the phrase-based systems.
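As a rough illustration of the attention idea in this section, here is a minimal NumPy sketch of computing one context vector with simple dot-product scoring. All names and dimensions are illustrative; the production models use learned attention variants, not this exact formulation.

```python
import numpy as np

def attention_context(encoder_states, decoder_state):
    """Dot-product attention: score each source position against the
    current decoder state, normalize with a softmax, and return the
    weighted sum of encoder states (the context vector)."""
    scores = encoder_states @ decoder_state        # one score per source word
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # soft alignment over the source
    context = weights @ encoder_states             # weighted sum of source states
    return context, weights

# Toy example: a 4-word source sentence with hidden size 3.
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(4, 3))           # one state per source word
decoder_state = rng.normal(size=(3,))              # current decoder hidden state
ctx, w = attention_context(encoder_states, decoder_state)
print(w)  # attention weights form a distribution over source positions
```

Because the weights cover the entire source sentence at every decoding step, the model can attend to a word far from the current output position, which is what enables the long-distance reordering described above.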
Handling unknown words
In many cases, a word in the source sentence doesn't have a direct corresponding translation in the target vocabulary. When that happens, a neural system will generate a placeholder for the unknown word. In this case, we take advantage of the soft alignment that the attention mechanism produces between source and target words in order to pass the original source word through to the target sentence. Then we look up the translation of that word in a bilingual lexicon built from our training data and replace the unknown word in the target sentence. This method is more robust than using a traditional dictionary, especially for noisy input. For example, in English-to-Spanish translation, we are able to translate “tmrw” (tomorrow) into “mañana.” Though the addition of a lexicon brings only marginal improvements in BLEU score, it leads to higher translation ratings by people on Facebook.
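The replacement step described above can be sketched as follows. This is an assumption-laden toy version: the attention matrix and lexicon are hand-made, and the hard alignment is just the argmax of each soft-alignment row.

```python
def replace_unknowns(target_tokens, source_tokens, attention, lexicon):
    """Replace <unk> placeholders by copying the most-attended source
    word and looking it up in a bilingual lexicon.

    attention[i] is the soft-alignment distribution over source
    positions produced when target position i was generated."""
    out = []
    for i, tok in enumerate(target_tokens):
        if tok == "<unk>":
            # Hard-align target position i to its highest-weight source word.
            j = max(range(len(source_tokens)), key=lambda k: attention[i][k])
            src_word = source_tokens[j]
            out.append(lexicon.get(src_word, src_word))  # fall back to copying
        else:
            out.append(tok)
    return out

lexicon = {"tmrw": "mañana"}          # built from the training data
source = ["see", "you", "tmrw"]
target = ["nos", "vemos", "<unk>"]
attention = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
print(replace_unknowns(target, source, attention, lexicon))
# → ['nos', 'vemos', 'mañana']
```

Falling back to copying the source word when the lexicon has no entry is what makes this robust for names and other words that should pass through untranslated.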
Vocabulary reduction
A typical neural machine translation model calculates a probability distribution over all the words in the target vocabulary. The more words we include in this distribution, the more time the calculation takes. We use a modeling technique called vocabulary reduction to remedy this issue at both training and inference time. With vocabulary reduction, we restrict the target vocabulary to the union of the most frequently occurring target words and a set of possible translation candidates for the individual words of a given sentence. Filtering the target vocabulary reduces the size of the output projection layer, which helps make computation much faster without degrading quality too significantly.
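A minimal sketch of the idea, assuming a toy vocabulary and a zero hidden state for simplicity: the softmax is computed only over the shortlist (frequent words plus per-sentence candidates) rather than the full vocabulary.

```python
import numpy as np

def reduced_vocab_softmax(hidden, output_weights, vocab,
                          frequent_ids, candidate_ids):
    """Compute the output distribution over a shortlist only: the
    globally frequent target words plus this sentence's translation
    candidates, instead of the full target vocabulary."""
    shortlist = sorted(set(frequent_ids) | set(candidate_ids))
    logits = output_weights[shortlist] @ hidden    # project onto shortlist rows
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return {vocab[i]: p for i, p in zip(shortlist, probs)}

vocab = ["the", "a", "cat", "dog", "mañana", "house"]
W = np.random.default_rng(1).normal(size=(len(vocab), 8))  # full projection
h = np.zeros(8)                                            # toy decoder state
dist = reduced_vocab_softmax(h, W, vocab,
                             frequent_ids=[0, 1],          # "the", "a"
                             candidate_ids=[4])            # "mañana"
print(sorted(dist))  # only 3 of the 6 vocabulary words are scored
```

The saving comes from the matrix product: only the shortlist rows of the output projection are multiplied, so the cost scales with the shortlist size instead of the full vocabulary size.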
Tuning model parameters
Neural networks almost always have tunable parameters that control things like the learning rate of the model. Picking the optimal set of these hyperparameters can be extremely beneficial to performance. However, this presents a significant challenge for machine translation at scale, since each translation direction is represented by a unique model with its own set of hyperparameters. Since the optimal values may be different for each model, we had to tune them for each system in production separately. We ran thousands of end-to-end translation experiments over several months, leveraging the FBLearner Flow platform to fine-tune hyperparameters such as learning rate, attention type, and ensemble size. This had a major impact for some systems. For example, we saw a relative improvement of 3.7 percent BLEU for English to Spanish, based only on tuning model hyperparameters.
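A hypothetical per-direction tuning loop might look like the grid search below. Everything here is a stand-in: `evaluate_bleu` is a placeholder scoring function, whereas in production each call would be a full end-to-end training and evaluation run for one translation direction.

```python
import itertools

# Illustrative hyperparameter grid (values are made up).
grid = {
    "learning_rate": [0.1, 0.5, 1.0],
    "attention": ["dot", "mlp"],
    "ensemble_size": [1, 2],
}

def evaluate_bleu(config):
    # Placeholder: pretends a mid-range learning rate and a larger
    # ensemble score best. Real tuning trains and evaluates a model.
    return 30.0 - abs(config["learning_rate"] - 0.5) + config["ensemble_size"]

# Try every combination and keep the best-scoring configuration.
best = max(
    (dict(zip(grid, values)) for values in itertools.product(*grid.values())),
    key=evaluate_bleu,
)
print(best["learning_rate"], best["ensemble_size"])  # → 0.5 2
```

With thousands of directions, exhaustively searching even a small grid like this is expensive, which is why a workflow platform such as FBLearner Flow is needed to schedule and track the experiments.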
Scaling neural machine translation with Caffe2
One of the challenges with transitioning to a neural system was getting the models to run at the speed and efficiency necessary for Facebook scale. We implemented our translation systems in the deep learning framework Caffe2. Its down-to-the-metal and flexible nature allowed us to tune the performance of our translation models during both training and inference on our GPU and CPU platforms.
For training, we implemented memory optimizations such as blob recycling and blob recomputation, which helped us to train larger batches and complete training faster. For inference, we used specialized vector math libraries and weight quantization to improve computational efficiency. Early benchmarks on existing models indicated that the computational resources to support more than 2,000 translation directions would be prohibitively high. However, the flexible nature of Caffe2 and the optimizations we implemented gave us a 2.5x boost in efficiency, which allowed us to deploy neural machine translation models into production.
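To make the weight-quantization idea concrete, here is a minimal 8-bit affine quantization sketch: weights are stored as `uint8` plus a scale and offset, and dequantized on the fly. This shows the general technique only, not Caffe2's actual implementation.

```python
import numpy as np

def quantize(weights, bits=8):
    """Map float weights onto integers in [0, 2^bits - 1] with an
    affine (scale + offset) transform."""
    lo, hi = weights.min(), weights.max()
    scale = (hi - lo) / (2 ** bits - 1)
    q = np.round((weights - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale + lo

w = np.linspace(-1.0, 1.0, 16, dtype=np.float32)   # toy weight vector
q, scale, lo = quantize(w)
err = np.abs(dequantize(q, scale, lo) - w).max()
print(q.nbytes, w.nbytes)  # 4x less storage than float32
```

The memory and bandwidth saving (4x versus float32 here) is one source of inference efficiency; the reconstruction error stays below one quantization step.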
We follow the practice, common in machine translation, of using beam search at decoding time to improve our estimate of the highest-likelihood output sentence according to the model. We exploited the generality of the recurrent neural network (RNN) abstraction in Caffe2 to implement beam search directly as a single forward network computation, which gives us fast and efficient inference.
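The beam search procedure itself can be sketched in plain Python. This toy version assumes a fixed hand-written next-token model (`step_probs`) in place of the NMT decoder, and tracks log-probabilities so hypothesis scores add rather than multiply.

```python
import math

def beam_search(step_probs, vocab, beam_size=2, max_len=3, eos="</s>"):
    """Keep the beam_size best partial hypotheses, expanding each with
    every vocabulary word at every step."""
    beams = [([], 0.0)]                      # (tokens, log-probability)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == eos:
                candidates.append((tokens, score))   # finished: carry forward
                continue
            for word, p in zip(vocab, step_probs(tokens)):
                candidates.append((tokens + [word], score + math.log(p)))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams[0][0]

vocab = ["hola", "mundo", "</s>"]

def step_probs(tokens):
    # Stand-in for the decoder: fixed toy distributions over the vocabulary.
    if not tokens:
        return [0.7, 0.2, 0.1]
    if tokens[-1] == "hola":
        return [0.1, 0.8, 0.1]
    return [0.1, 0.1, 0.8]

print(beam_search(step_probs, vocab))  # → ['hola', 'mundo', '</s>']
```

Expressing this loop as a single forward computation inside the network graph, as described above, avoids round-trips between the model and an external search driver at every step.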
Over the course of this work, we developed RNN building blocks such as LSTM, multiplicative integration LSTM, and attention. We're excited to share this technology as part of Caffe2 and to offer our learnings to the research and open source communities.
Ongoing work
The Facebook Artificial Intelligence Research (FAIR) team recently published research on using convolutional neural networks (CNNs) for machine translation. We worked closely with FAIR to bring this technology from research to production systems for the first time, which took less than three months. We launched CNN models for English-to-French and English-to-German translations, which brought BLEU quality improvements of 12.0 percent (+4.3) and 14.4 percent (+3.4), respectively, over the previous systems. These quality improvements make CNNs an exciting new development path, and we will continue our work to utilize CNNs for more translation systems.
We have just started being able to use more context for translations. Neural networks open up many future development paths related to adding further context, such as a photo accompanying the text of a post, to create better translations.
We are also starting to explore multilingual models that can translate many different language directions. This will help solve the challenge of fine-tuning each system for a specific language pair, and may also bring quality gains for some directions through the sharing of training data.
Completing the transition from phrase-based to neural machine translation is a milestone on our path to providing Facebook experiences to everyone in their preferred language. We will continue to push the boundaries of neural machine translation technology, with the aim of providing humanlike translations to everyone on Facebook.
Read more: https://code.facebook.com/posts/289921871474277/transitioning-entirely-to-neural-machine-translation/