Showing 1–4 of 4 results for author: Amara, I

  1. arXiv:2312.07028  [pdf, other]

    cs.CL cs.AI

    Dynamic Corrective Self-Distillation for Better Fine-Tuning of Pretrained Models

    Authors: Ibtihel Amara, Vinija Jain, Aman Chadha

    Abstract: We tackle the challenging issue of aggressive fine-tuning encountered during the process of transfer learning of pre-trained language models (PLMs) with limited labeled downstream data. This problem primarily results in a decline in performance on the subsequent task. Inspired by the adaptive boosting method in traditional machine learning, we present an effective dynamic corrective self-distillat…

    Submitted 12 December, 2023; originally announced December 2023.

  2. arXiv:2310.09680  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring

    Authors: Ankitha Sudarshan, Vinay Samuel, Parth Patwa, Ibtihel Amara, Aman Chadha

    Abstract: Automatic Speech Recognition (ASR) has witnessed a profound research interest. Recent breakthroughs have given ASR systems different prospects such as faithfully transcribing spoken language, which is a pivotal advancement in building conversational agents. However, there is still an imminent challenge of accurately discerning context-dependent words and phrases. In this work, we propose a novel a…

    Submitted 3 March, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

  3. arXiv:2212.12965  [pdf, other]

    cs.CV

    BD-KD: Balancing the Divergences for Online Knowledge Distillation

    Authors: Ibtihel Amara, Nazanin Sepahvand, Brett H. Meyer, Warren J. Gross, James J. Clark

    Abstract: Knowledge distillation (KD) has gained a lot of attention in the field of model compression for edge devices thanks to its effectiveness in compressing large powerful networks into smaller lower-capacity models. Online distillation, in which both the teacher and the student are learning collaboratively, has also gained much interest due to its ability to improve on the performance of the networks…

    Submitted 25 December, 2022; originally announced December 2022.

  4. arXiv:2209.07606  [pdf, other]

    cs.CV cs.LG

    CES-KD: Curriculum-based Expert Selection for Guided Knowledge Distillation

    Authors: Ibtihel Amara, Maryam Ziaeefard, Brett H. Meyer, Warren Gross, James J. Clark

    Abstract: Knowledge distillation (KD) is an effective tool for compressing deep classification models for edge devices. However, the performance of KD is affected by the large capacity gap between the teacher and student networks. Recent methods have resorted to a multiple teacher assistant (TA) setting for KD, which sequentially decreases the size of the teacher model to relatively bridge the size gap betw…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: ICPR2022