Papers and publications
About Systran
With more than 50 years of experience in translation technologies, SYSTRAN has pioneered the greatest innovations in the field, including the first web-based translation portals and the first neural translation engines combining artificial intelligence and neural networks for businesses and public organizations.
SYSTRAN provides business users with advanced and secure automated translation solutions in areas such as global collaboration, multilingual content production, customer support, electronic investigation, Big Data analysis, and e-commerce. SYSTRAN offers a tailor-made solution with an open and scalable architecture that enables seamless integration into existing third-party applications and IT infrastructures.
Enhanced Transformer Model for Data-to-Text Generation
Neural models have recently shown significant progress on data-to-text generation tasks in which descriptive texts are generated conditioned on database records. In this work, we present a new Transformer-based data-to-text generation model which learns content selection and summary generation in an end-to-end fashion. We introduce two extensions to the baseline transformer model: First, we modify …
Li Gong, Josep Crego, Jean Senellart
Book: Proceedings of the 3rd Workshop on Neural Generation and Translation, pages 148--156, Association for Computational Linguistics, November 2019, Hong Kong, China
SYSTRAN @ WAT 2019: Russian-Japanese News Commentary task
This paper describes Systran's submissions to WAT 2019 Russian-Japanese News Commentary task. A challenging translation task due to the extremely low resources available and the distance of the language pair. We have used the neural Transformer architecture learned over the provided resources and we carried out synthetic data generation experiments which aim at alleviating the …
Jitao Xu, TuAnh Nguyen, MinhQuang Pham, Josep Crego, Jean Senellart
Proceedings of the 6th Workshop on Asian Translation, pages 189--194, Association for Computational Linguistics, November 2019, Hong Kong, China
SYSTRAN @ WNGT 2019: DGT Task
This paper describes SYSTRAN participation to the Document-level Generation and Translation (DGT) Shared Task of the 3rd Workshop on Neural Generation and Translation (WNGT 2019). We participate for the first time using a Transformer network enhanced with modified input embeddings and optimising an additional objective function that considers content selection. The network takes in structured …
Li Gong, Josep Crego, Jean Senellart
Proceedings of the 3rd Workshop on Neural Generation and Translation, pages 262--267, Association for Computational Linguistics, November 2019, Hong Kong, China
SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering
This paper describes the participation of SYSTRAN to the shared task on parallel corpus filtering at the Third Conference on Machine Translation (WMT 2018). We participate for the first time using a neural sentence similarity classifier which aims at predicting the relatedness of sentence pairs in a multilingual context. The paper describes the main characteristics …
Minh Quang Pham, Josep Crego, Jean Senellart
Third Conference on Machine Translation (WMT18), October 31 - November 1 2018, Brussels, Belgium
Fixing Translation Divergences in Parallel Corpora for Neural MT
Corpus-based approaches to machine translation rely on the availability of clean parallel corpora. Such resources are scarce, and because of the automatic processes involved in their preparation, they are often noisy. This paper describes an unsupervised method for detecting translation divergences …
Minh Quang Pham, Josep Crego, Jean Senellart, François Yvon
2018 Conference on Empirical Methods in Natural Language Processing, October 31 - November 4 2018, Brussels, Belgium
Analyzing Knowledge Distillation in Neural Machine Translation
Knowledge distillation has recently been successfully applied to neural machine translation. It basically allows for building shrunk networks while the resulting systems retain most of the quality of the original model. Despite that many authors report on the benefits of knowledge distillation, few works discuss the actual reasons why it works, especially in the context …
Dakun Zhang, Josep Crego, Jean Senellart
15th International Workshop on Spoken Language Translation, October 29-30 2018, Bruges, Belgium
OpenNMT System Description for WNMT 2018: 800 words/sec on a single-core CPU
We present a system description of the OpenNMT Neural Machine Translation entry for the WNMT 2018 evaluation. In this work, we developed a heavily optimized NMT inference model targeting a high-performance CPU system. The final system uses a combination of four techniques, all of them leading to significant speed-ups in combination: (a) sequence distillation, (b) …
Jean Senellart, Dakun Zhang, Bo Wang, Guillaume Klein, J.P. Ramatchandirin, Josep Crego, Alexander M. Rush
Published in "Proceedings of the 2nd Workshop on Neural Machine Translation and Generation", pages 122--128, Association for Computational Linguistics, July 20 2018, Melbourne, Australia
Neural Network Architectures for Arabic Dialect Identification
SYSTRAN competes this year for the first time to the DSL shared task, in the Arabic Dialect Identification subtask. We participate by training several Neural Network models showing that we can obtain competitive results despite the limited amount of training data available for learning. We report our experiments and detail the network architecture and parameters …
Elise Michon, Minh Quang Pham, Josep Crego, Jean Senellart
Published in "Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects", Association for Computational Linguistics, pages 128--136, August 20 2018, New Mexico, USA
Boosting Neural Machine Translation
Training efficiency is one of the main problems for Neural Machine Translation (NMT). Deep networks need for very large data as well as many training iterations to achieve state-of-the-art performance. This results in very high computation cost, slowing down research and industrialisation. In this paper, we propose to alleviate this problem with several training methods …
Dakun Zhang, Jungi Kim, Josep Crego, Jean Senellart
Published in "Proceedings of the Eighth International Joint Conference on Natural Language Processing" (Volume 2: Short Papers), Asian Federation of Natural Language Processing, 2017, Taipei, Taiwan
OpenNMT: Open-Source Toolkit for Neural Machine Translation
We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about …
Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander Rush
Published in "Proceedings of ACL 2017, System Demonstrations", pages 67--72, Association for Computational Linguistics, 2017, Vancouver, Canada