Showing 1–1 of 1 results for author: Buschiazzo, R

  1. arXiv:2006.13425  [pdf, other]

    cs.CL

    A High-Quality Multilingual Dataset for Structured Documentation Translation

    Authors: Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, Richard Socher, Caiming Xiong

    Abstract: This paper presents a high-quality multilingual dataset for the documentation domain to advance research on localization of structured text. Unlike widely-used datasets for translation of plain text, we collect XML-structured parallel text segments from the online documentation for an enterprise software platform. These Web pages have been professionally translated from English into 16 languages a…

    Submitted 23 June, 2020; originally announced June 2020.

    Comments: Published at WMT2019; the draft has been updated with our dataset's URL: https://github.com/salesforce/localization-xml-mt