Papers and publications
Company information
With more than 50 years of experience in translation technologies, SYSTRAN has pioneered the greatest innovations in the field, including the first web-based translation portals and the first neural translation engines combining artificial intelligence and neural networks for businesses and public organizations.
SYSTRAN provides business users with advanced and secure automated translation solutions in areas such as global collaboration, multilingual content production, customer support, electronic investigation, Big Data analysis, and e-commerce. SYSTRAN offers a tailor-made solution with an open and scalable architecture that enables seamless integration into existing third-party applications and IT infrastructures.
Domain Control for Neural Machine Translation [PDF]
Machine translation systems are very sensitive to the domains they were trained on. Several domain adaptation techniques have been deeply studied. We propose a new technique for neural machine translation (NMT) that we call domain control which is performed at runtime using a unique neural network covering multiple domains. The presented approach shows quality improvements when compared to dedicated domains translating on any of the covered domains and even on out-of-domain data. In addition, model parameters do not need to be re-estimated for each domain, making this effective to real use cases. Evaluation is carried out on English-to-French translation for two different testing scenarios. We first consider the case where …
Catherine Kobus, Josep Crego, Jean Senellart
Published in "Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017", INCOMA Ltd., Varna, Bulgaria, Sep 4–6 2017 - [v2] 12 Sep 2017
Adaptation incrémentale de modèles de traduction neuronaux (Incremental adaptation of neural translation models)
Domain adaptation is a key scientific challenge in machine translation. It generally encompasses terminology and style adaptation, in particular for human post-editing in computer-assisted translation. With neural machine translation, we study a new approach to domain adaptation that we call “specialization”, which shows promising results in both training speed and translation scores. In this article, we propose to explore this approach.
Christophe Servan, Josep Crego, Jean Senellart
24e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), Actes de TALN 2017, volume 2 (short papers), pages 218-225, June 26-30, 2017, Orléans, France
Conception d'une solution de détection d'événements basée sur Twitter (Design of a Twitter-based event detection solution)
This article presents an alert system built on the mass of data coming from Twitter. The tool aims to monitor the news across several sample domains, including sporting events and natural disasters. The monitoring results are delivered to the user through a web interface listing the events located on a map.
Christophe Servan, Catherine Kobus, Yongchao Deng, Cyril Touffet, Jungi Kim, Inès Kapp, Djamel Mostefa, Josep Crego, Jean Senellart
24e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), Actes de TALN 2017, volume 3 (demonstrations), pages 19-20, June 26-30, 2017, Orléans, France
SYSTRAN Pure Neural Machine Translation [PDF]
Each of us has experienced or heard of deep learning in day-to-day business applications. What are the fundamentals of this new technology and what new opportunities does it offer?
Domain specialization: a post-training domain adaptation for Neural Machine Translation
Domain adaptation is a key feature in Machine Translation. It generally encompasses terminology, domain and style adaptation, especially for human post-editing workflows in Computer Assisted Translation (CAT). With Neural Machine Translation (NMT), we introduce a new notion of domain adaptation that we call “specialization” and which is showing promising results both in the learning speed and in adaptation accuracy. In this paper, we propose to explore this approach under several perspectives.
Christophe Servan, Josep Crego, Jean Senellart
Computation and Language (cs.CL)
Neural Machine Translation from Simplified Translations
Text simplification aims at reducing the lexical, grammatical and structural complexity of a text while keeping the same meaning. In the context of machine translation, we introduce the idea of simplified translations in order to boost the learning ability of deep neural translation models. We conduct preliminary experiments showing that translation complexity is actually reduced in a translation of a source bi-text compared to the target reference of the bi-text while using a neural machine translation (NMT) system learned on the exact same bi-text. Based on knowledge distillation idea, we then train an NMT system using the simplified bi-text, and show that it outperforms the initial system that was built …
Josep Crego, Jean Senellart
SYSTRAN's Pure Neural Machine Translation Systems
Since the first online demonstration of Neural Machine Translation (NMT) by LISA, NMT development has recently moved from laboratory to production systems as demonstrated by several entities announcing roll-out of NMT engines to replace their existing technologies. NMT systems have a large number of training configurations and the training process of such systems is usually very long, often a few weeks, so role of experimentation is critical and important to share. In this work, we present our approach to production-ready systems simultaneously with release of online demonstrators covering a large variety of languages (12 languages, for 32 language pairs). We explore different practical choices: an efficient and evolutive open-source framework; …
Josep Crego, Jungi Kim, Guillaume Klein, Anabel Rebollo, Kathy Yang, Jean Senellart, Egor Akhanov, Patrice Brunelle, Aurelien Coquard, Yongchao Deng, Satoshi Enoue, Chiyo Geiss, Joshua Johanson, Ardas Khalsa, Raoum Khiari, Byeongil Ko, Catherine Kobus, Jean Lorieux, Leidiana Martins, Dang-Chuan Nguyen, Alexandra Priori, Thomas Riccardi, Natalia Segal, Christophe Servan, Cyril Tiquet, Bo Wang, Jin Yang, Dakun Zhang, Jing Zhou, Peter Zoldan
Computation and Language (cs.CL)
System Combination RWTH Aachen - SYSTRAN for the NTCIR-10 PatentMT Evaluation 2013 [PDF]
This paper describes the joint submission by RWTH Aachen University and SYSTRAN in the Chinese-English Patent Machine Translation Task at the 10th NTCIR Workshop. We specify the statistical systems developed by RWTH Aachen University and the hybrid machine translation systems developed by SYSTRAN. We apply RWTH Aachen’s combination techniques to create consensus hypotheses from very different systems: phrase-based and hierarchical SMT, rule-based MT (RBMT) and MT with statistical post-editing (SPE). The system combination was ranked second in BLEU and second in the human adequacy evaluation in this competition.
Minwei Feng, Markus Freitag, Hermann Ney, Bianka Buschbeck, Jean Senellart, Jin Yang
June 18-21, 2013, Tokyo, Japan
SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT 2011 [PDF]
This report describes SYSTRAN’s Chinese-English and English-Chinese machine translation systems that participated in the CWMT 2011 machine translation evaluation tasks. The base systems are SYSTRAN rule-based machine translation systems, augmented with various statistical techniques. Based on the translations of the rule-based systems, we performed statistical post-editing with the provided bilingual and monolingual training corpora. In this report, we describe the technology behind the systems, the training data, and finally the evaluation results in the CWMT 2011 evaluation. Our primary Chinese-English system was ranked first in BLEU in the translation tasks.
Jin Yang, Satoshi Enoue, Jean Senellart
Proceedings of the 7th China Workshop on Machine Translation (CWMT), September 2011.
Convergence of Translation Memory and Statistical Machine Translation [PDF]
We present two methods that merge ideas from statistical machine translation (SMT) and translation memories (TM). We use a TM to retrieve matches for source segments, and replace the mismatched parts with instructions to an SMT system to fill in the gap. We show that for fuzzy matches of over 70%, one method outperforms both SMT and TM baselines.
Philipp Koehn, Jean Senellart
JEC, November 2010.
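As a rough illustration of the translation-memory idea in the last abstract above, the sketch below retrieves the closest TM entry for a source segment and lists the source token spans that differ from it, which is where an MT system would be asked to fill the gap. It is only a sketch under stated assumptions, not the authors' method: the function names, the toy memory, and the use of Python's `difflib` ratio as the fuzzy-match score are all hypothetical stand-ins; only the 70% threshold comes from the abstract.

```python
from difflib import SequenceMatcher

# The abstract reports gains for fuzzy matches over 70%; using it as a cutoff here is an assumption.
FUZZY_THRESHOLD = 0.7


def fuzzy_score(source_tokens, tm_tokens):
    """Similarity in [0, 1]. difflib's ratio is a stand-in for a real fuzzy-match metric."""
    return SequenceMatcher(None, source_tokens, tm_tokens, autojunk=False).ratio()


def best_tm_match(source, memory):
    """Return (score, tm_source, tm_target) for the closest TM entry, or None if the memory is empty."""
    src = source.split()
    scored = [(fuzzy_score(src, tm_src.split()), tm_src, tm_tgt) for tm_src, tm_tgt in memory]
    return max(scored, key=lambda item: item[0]) if scored else None


def mismatched_spans(source, tm_source):
    """Token spans (start, end) of `source` that differ from the TM source segment.

    These are the parts a downstream MT system would have to translate.
    Tokens present only in the TM segment have no span in `source` and are not reported.
    """
    src, tm = source.split(), tm_source.split()
    matcher = SequenceMatcher(None, src, tm, autojunk=False)
    return [(i1, i2) for tag, i1, i2, _, _ in matcher.get_opcodes() if tag in ("replace", "delete")]


if __name__ == "__main__":
    # Hypothetical two-entry translation memory (source, target).
    memory = [
        ("the printer is out of paper", "l'imprimante n'a plus de papier"),
        ("restart the computer", "redémarrez l'ordinateur"),
    ]
    source = "the scanner is out of paper"
    score, tm_src, tm_tgt = best_tm_match(source, memory)
    if score >= FUZZY_THRESHOLD:
        print(tm_src, "->", tm_tgt, "| gaps:", mismatched_spans(source, tm_src))
```

A production system would typically use an edit-distance-based fuzzy score and would also need word alignments to locate the corresponding gap on the target side; both are left out of this sketch.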