Gloss2Text: Sign Language Gloss translation using LLMs and Semantically Aware Label Smoothing

Fayyazsanavi, Pooya; Anastasopoulos, Antonios; Košecká, Jana

arXiv:2407.01394 (cs)

[Submitted on 1 Jul 2024 (v1), last revised 12 Jul 2024 (this version, v2)]

Abstract: Sign language translation from video to spoken text presents unique challenges owing to the distinct grammar, expression nuances, and high variation of visual appearance across different speakers and contexts. The intermediate gloss annotations of videos aim to guide the translation process. In our work, we focus on the Gloss2Text translation stage and propose several advances: leveraging pre-trained large language models (LLMs), data augmentation, and a novel label-smoothing loss function that exploits gloss translation ambiguities. Together, these significantly improve on the performance of state-of-the-art approaches. Through extensive experiments and ablation studies on the PHOENIX Weather 2014T dataset, our approach surpasses state-of-the-art performance in Gloss2Text translation, indicating its efficacy in addressing sign language translation and suggesting promising avenues for future research and development.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2407.01394 [cs.CV]
	(or arXiv:2407.01394v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.01394

Submission history

From: Pooya Fayyazsanavi
[v1] Mon, 1 Jul 2024 15:46:45 UTC (1,808 KB)
[v2] Fri, 12 Jul 2024 14:44:33 UTC (1,808 KB)
