Empowering LLM-based Machine Translation with Cultural Awareness

Yao, Binwei; Jiang, Ming; Yang, Diyi; Hu, Junjie

arXiv:2305.14328v1 (cs)

[Submitted on 23 May 2023 (this version), latest version 23 Mar 2024 (v2)]

Abstract: Traditional neural machine translation (NMT) systems often fail to translate sentences that contain culturally specific information. Most previous NMT methods have incorporated external cultural knowledge during training, which requires fine-tuning on low-frequency items specific to the culture. Recent in-context learning uses lightweight prompts to guide large language models (LLMs) to perform machine translation; however, whether this approach can inject cultural awareness into machine translation remains unclear. To this end, we introduce a new data curation pipeline to construct a culturally relevant parallel corpus, enriched with annotations of culture-specific entities. Additionally, we design simple but effective prompting strategies to assist this LLM-based translation. Extensive experiments show that our approaches can largely help incorporate cultural knowledge into LLM-based machine translation, outperforming traditional NMT systems in translating culture-specific sentences.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2305.14328 [cs.CL]
	(or arXiv:2305.14328v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.14328

Submission history

From: Binwei Yao
[v1] Tue, 23 May 2023 17:56:33 UTC (1,459 KB)
[v2] Sat, 23 Mar 2024 02:20:02 UTC (2,900 KB)
