
Showing 1–7 of 7 results for author: Gheini, M

Searching in archive cs.
  1. arXiv:2212.09982  [pdf, other]

    cs.CL cs.SD eess.AS

    Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data

    Authors: Mozhdeh Gheini, Tatiana Likhomanenko, Matthias Sperber, Hendra Setiawan

    Abstract: Self-training has been shown to be helpful in addressing data scarcity for many domains, including vision, speech, and language. Specifically, self-training, or pseudo-labeling, labels unsupervised data and adds that to the training pool. In this work, we investigate and use pseudo-labeling for a recently proposed novel setup: joint transcription and translation of speech, which suffers from an ab…

    Submitted 19 December, 2022; originally announced December 2022.

  2. arXiv:2210.05096  [pdf, other]

    cs.CL cs.AI cs.CY

    Checks and Strategies for Enabling Code-Switched Machine Translation

    Authors: Thamme Gowda, Mozhdeh Gheini, Jonathan May

    Abstract: Code-switching is a common phenomenon among multilingual speakers, where alternation between two or more languages occurs within the context of a single conversation. While multilingual humans can seamlessly switch back and forth between languages, multilingual neural machine translation (NMT) models are not robust to such sudden changes in input. This work explores multilingual NMT models' abilit…

    Submitted 10 October, 2022; originally announced October 2022.

  3. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  4. arXiv:2205.12453  [pdf, other]

    cs.CL

    Know Where You're Going: Meta-Learning for Parameter-Efficient Fine-Tuning

    Authors: Mozhdeh Gheini, Xuezhe Ma, Jonathan May

    Abstract: A recent family of techniques, dubbed lightweight fine-tuning methods, facilitates parameter-efficient transfer learning by updating only a small set of additional parameters while keeping the parameters of the pretrained language model frozen. While proven to be an effective method, there are no existing studies on if and how such knowledge of the downstream fine-tuning approach should affect the…

    Submitted 8 December, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

  5. arXiv:2104.08771  [pdf, other]

    cs.CL

    Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation

    Authors: Mozhdeh Gheini, Xiang Ren, Jonathan May

    Abstract: We study the power of cross-attention in the Transformer architecture within the context of transfer learning for machine translation, and extend the findings of studies into cross-attention when training from scratch. We conduct a series of experiments through fine-tuning a translation model on data where either the source or target language has changed. These experiments reveal that fine-tuning…

    Submitted 14 September, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

    Comments: Accepted to EMNLP 2021 Main Conference

  6. arXiv:2012.06154  [pdf, other]

    cs.CL cs.AI

    ParsiNLU: A Suite of Language Understanding Challenges for Persian

    Authors: Daniel Khashabi, Arman Cohan, Siamak Shakeri, Pedram Hosseini, Pouya Pezeshkpour, Malihe Alikhani, Moin Aminnaseri, Marzieh Bitaab, Faeze Brahman, Sarik Ghazarian, Mozhdeh Gheini, Arman Kabiri, Rabeeh Karimi Mahabadi, Omid Memarrast, Ahmadreza Mosallanezhad, Erfan Noury, Shahab Raji, Mohammad Sadegh Rasooli, Sepideh Sadeghi, Erfan Sadeqi Azer, Niloofar Safi Samghabadi, Mahsa Shafaei, Saber Sheybani, Ali Tazarv, Yadollah Yaghoobzadeh

    Abstract: Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains to be concentrated on resource-rich languages like English. This work focuses on Persian language, one of the widely spoken languages in the world, and yet there are few NLU datasets available for this rich language. The availability of high-quality evaluat…

    Submitted 13 July, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

    Comments: To appear on Transactions of the Association for Computational Linguistics (TACL), 2021

  7. arXiv:1909.06516  [pdf, other]

    cs.CL

    A Universal Parent Model for Low-Resource Neural Machine Translation Transfer

    Authors: Mozhdeh Gheini, Jonathan May

    Abstract: Transfer learning from a high-resource language pair 'parent' has been proven to be an effective way to improve neural machine translation quality for low-resource language pairs 'children.' However, previous approaches build a custom parent model or at least update an existing parent model's vocabulary for each child language pair they wish to train, in an effort to align parent and child vocabul…

    Submitted 19 September, 2019; v1 submitted 13 September, 2019; originally announced September 2019.