Showing 1–9 of 9 results for author: Gladkoff, S

  1. arXiv:2405.16969

    cs.CL stat.AP

    The Multi-Range Theory of Translation Quality Measurement: MQM scoring models and Statistical Quality Control

    Authors: Arle Lommel, Serge Gladkoff, Alan Melby, Sue Ellen Wright, Ingemar Strandvik, Katerina Gasova, Angelika Vaasa, Andy Benzo, Romina Marazzato Sparano, Monica Foresi, Johani Innis, Lifeng Han, Goran Nenadic

    Abstract: The year 2024 marks the 10th anniversary of the Multidimensional Quality Metrics (MQM) framework for analytic translation quality evaluation. The MQM error typology has been widely used by practitioners in the translation and localization industry and has served as the basis for many derivative projects. The annual Conference on Machine Translation (WMT) shared tasks on both human and automatic tr…

    Submitted 9 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: working paper, 20 pages, under review

  2. arXiv:2312.07250

    cs.CL cs.AI

    Neural Machine Translation of Clinical Text: An Empirical Investigation into Multilingual Pre-Trained Language Models and Transfer-Learning

    Authors: Lifeng Han, Serge Gladkoff, Gleb Erofeev, Irina Sorokina, Betty Galiano, Goran Nenadic

    Abstract: We conduct investigations on clinical text machine translation by examining multilingual neural network models using deep learning such as Transformer based structures. Furthermore, to address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs). The experimental results on thre…

    Submitted 21 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted by Frontiers in Digital Health - Health Informatics

  3. arXiv:2308.00158

    cs.CL cs.AI

    MTUncertainty: Assessing the Need for Post-editing of Machine Translation Outputs by Fine-tuning OpenAI LLMs

    Authors: Serge Gladkoff, Lifeng Han, Gleb Erofeev, Irina Sorokina, Goran Nenadic

    Abstract: Translation Quality Evaluation (TQE) is an essential step of the modern translation production process. TQE is critical in assessing both machine translation (MT) and human translation (HT) quality without reference translations. The ability to evaluate or even simply estimate the quality of translation automatically may open significant efficiency gains through process optimisation. This work exa…

    Submitted 21 June, 2024; v1 submitted 31 July, 2023; originally announced August 2023.

    Comments: Accepted by EAMT2024: The 25th Annual Conference of The European Association for Machine Translation

  4. arXiv:2303.04526

    cs.CL cs.IT math.NA stat.AP

    Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce

    Authors: Serge Gladkoff, Lifeng Han, Goran Nenadic

    Abstract: In natural language processing (NLP) we always rely on human judgement as the golden quality evaluation method. However, there has been an ongoing debate on how to better evaluate inter-rater reliability (IRR) levels for certain evaluation tasks, such as translation quality evaluation (TQE), especially when the data samples (observations) are very scarce. In this work, we first introduce the study…

    Submitted 9 July, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: Accepted to RANLP2023: Recent Advances in Natural Language Processing, Varna, Bulgaria. 30 Aug - 8 Sep https://ranlp.org/ranlp2023/

  5. arXiv:2210.06068

    cs.CL cs.AI

    Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning

    Authors: Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

    Abstract: Massively multilingual pre-trained language models (MMPLMs) are developed in recent years demonstrating superpowers and the pre-knowledge they acquire for downstream tasks. This work investigates whether MMPLMs can be applied to clinical domain machine translation (MT) towards entirely unseen languages via transfer learning. We carry out an experimental investigation using Meta-AI's MMPLMs ``wmt21…

    Submitted 4 June, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted to ClinicalNLP-2023 WS@ACL-2023

  6. arXiv:2209.07417

    cs.CL cs.AI

    Examining Large Pre-Trained Language Models for Machine Translation: What You Don't Know About It

    Authors: Lifeng Han, Gleb Erofeev, Irina Sorokina, Serge Gladkoff, Goran Nenadic

    Abstract: Pre-trained language models (PLMs) often take advantage of the monolingual and multilingual dataset that is freely available online to acquire general or mixed domain knowledge before deployment into specific tasks. Extra-large PLMs (xLPLMs) are proposed very recently to claim supreme performances over smaller-sized PLMs such as in machine translation (MT) tasks. These xLPLMs include Meta-AI's wmt…

    Submitted 25 October, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: System paper Accepted to WMT2022: BiomedicalMT Track (ClinSpEn2022)

  7. arXiv:2112.13833

    cs.CL cs.AI

    HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professional Post-Editing Towards More Effective MT Evaluation

    Authors: Serge Gladkoff, Lifeng Han

    Abstract: Traditional automatic evaluation metrics for machine translation have been widely criticized by linguists due to their low accuracy, lack of transparency, focus on language mechanics rather than semantics, and low agreement with human quality evaluation. Human evaluations in the form of MQM-like scorecards have always been carried out in real industry setting by both clients and translation servic…

    Submitted 27 December, 2021; originally announced December 2021.

    Comments: 5 figures, 9 pages

  8. arXiv:2111.07699

    cs.CL math.NA stat.AP

    Measuring Uncertainty in Translation Quality Evaluation (TQE)

    Authors: Serge Gladkoff, Irina Sorokina, Lifeng Han, Alexandra Alekseeva

    Abstract: From both human translators (HT) and machine translation (MT) researchers' point of view, translation quality evaluation (TQE) is an essential task. Translation service providers (TSPs) have to deliver large volumes of translations which meet customer specifications with harsh constraints of required quality level in tight time-frames and costs. MT researchers strive to make their models better, w…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: 13 pages, 9 figures

  9. arXiv:2108.09484

    cs.CL cs.AI cs.LG

    cushLEPOR: customising hLEPOR metric using Optuna for higher agreement with human judgments or pre-trained language model LaBSE

    Authors: Lifeng Han, Irina Sorokina, Gleb Erofeev, Serge Gladkoff

    Abstract: Human evaluation has always been expensive while researchers struggle to trust the automatic metrics. To address this, we propose to customise traditional metrics by taking advantages of the pre-trained language models (PLMs) and the limited available human labelled scores. We first re-introduce the hLEPOR metric factors, followed by the Python version we developed (ported) which achieved the auto…

    Submitted 15 November, 2021; v1 submitted 21 August, 2021; originally announced August 2021.

    Comments: Forthcoming: in Proceedings of the Sixth Conference on Machine Translation (WMT2021)