Showing 1–2 of 2 results for author: Shiri, A

Search v0.5.6 released 2020-02-24

arXiv:2311.06493 [pdf, other]

cs.CL

L3 Ensembles: Lifelong Learning Approach for Ensemble of Foundational Language Models

Authors: Aidin Shiri, Kaushik Roy, Amit Sheth, Manas Gaur

Abstract: Fine-tuning pre-trained foundational language models (FLM) for specific tasks is often impractical, especially for resource-constrained devices. This necessitates the development of a Lifelong Learning (L3) framework that continuously adapts to a stream of Natural Language Processing (NLP) tasks efficiently. We propose an approach that focuses on extracting meaningful representations from unseen data, constructing a structured knowledge base, and improving task performance incrementally. We conducted experiments on various NLP tasks to validate its effectiveness, including benchmarks like GLUE and SuperGLUE. We measured good performance across the accuracy, training efficiency, and knowledge transfer metrics. Initial experimental results show that the proposed L3 ensemble method increases the model accuracy by 4% ~ 36% compared to the fine-tuned FLM. Furthermore, L3 model outperforms naive fine-tuning approaches while maintaining competitive or superior performance (up to 15.4% increase in accuracy) compared to the state-of-the-art language model (T5) for the given task, STS benchmark.

Submitted 11 November, 2023; originally announced November 2023.

arXiv:2308.12272 [pdf, other]

cs.CL cs.AI

Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models

Authors: Nancy Tyagi, Aidin Shiri, Surjodeep Sarkar, Abhishek Kumar Umrawal, Manas Gaur

Abstract: Foundational Language Models (FLMs) have advanced natural language processing (NLP) research. Current researchers are developing larger FLMs (e.g., XLNet, T5) to enable contextualized language representation, classification, and generation. While developing larger FLMs has been of significant advantage, it is also a liability concerning hallucination and predictive uncertainty. Fundamentally, larger FLMs are built on the same foundations as smaller FLMs (e.g., BERT); hence, one must recognize the potential of smaller FLMs which can be realized through an ensemble. In the current research, we perform a reality check on FLMs and their ensemble on benchmark and real-world datasets. We hypothesize that the ensembling of FLMs can influence the individualistic attention of FLMs and unravel the strength of coordination and cooperation of different FLMs. We utilize BERT and define three other ensemble techniques: {Shallow, Semi, and Deep}, wherein the Deep-Ensemble introduces a knowledge-guided reinforcement learning approach. We discovered that the suggested Deep-Ensemble BERT outperforms its large variation i.e. BERTlarge, by a factor of many times using datasets that show the usefulness of NLP in sensitive fields, such as mental health.

Submitted 23 August, 2023; originally announced August 2023.

Comments: Accepted at the 10th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2023)
