Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models

Wiesner, Matthew; Raj, Desh; Khudanpur, Sanjeev

arXiv:2110.04863 (eess)

[Submitted on 10 Oct 2021]

Abstract: Self-supervised model pre-training has recently garnered significant interest, but relatively few efforts have explored using additional resources in fine-tuning these models. We demonstrate how universal phoneset acoustic models can leverage cross-lingual supervision to improve transfer of pretrained self-supervised representations to new languages. We also show how target-language text can be used to enable and improve fine-tuning with the lattice-free maximum mutual information (LF-MMI) objective. In three low-resource languages, these techniques greatly improved few-shot learning performance.

Comments:	© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
Cite as:	arXiv:2110.04863 [eess.AS]
	(or arXiv:2110.04863v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2110.04863

Submission history

From: Matthew Wiesner
[v1] Sun, 10 Oct 2021 17:33:44 UTC (163 KB)
