L3 Ensembles: Lifelong Learning Approach for Ensemble of Foundational Language Models
Authors:
Aidin Shiri,
Kaushik Roy,
Amit Sheth,
Manas Gaur
Abstract:
Fine-tuning pre-trained foundational language models (FLMs) for specific tasks is often impractical, especially on resource-constrained devices. This necessitates a Lifelong Learning (L3) framework that continuously and efficiently adapts to a stream of Natural Language Processing (NLP) tasks. We propose an approach that focuses on extracting meaningful representations from unseen data, constructing a structured knowledge base, and improving task performance incrementally. To validate its effectiveness, we conducted experiments on various NLP tasks, including benchmarks such as GLUE and SuperGLUE. We measured good performance across accuracy, training efficiency, and knowledge transfer metrics. Initial experimental results show that the proposed L3 ensemble method increases model accuracy by 4% to 36% compared to the fine-tuned FLM. Furthermore, the L3 model outperforms naive fine-tuning approaches while maintaining competitive or superior performance (up to a 15.4% increase in accuracy) compared to the state-of-the-art language model (T5) on the given task, the STS benchmark.
Submitted 11 November, 2023;
originally announced November 2023.
Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models
Authors:
Nancy Tyagi,
Aidin Shiri,
Surjodeep Sarkar,
Abhishek Kumar Umrawal,
Manas Gaur
Abstract:
Foundational Language Models (FLMs) have advanced natural language processing (NLP) research. Researchers are currently developing larger FLMs (e.g., XLNet, T5) to enable contextualized language representation, classification, and generation. While developing larger FLMs has been of significant advantage, it is also a liability concerning hallucination and predictive uncertainty. Fundamentally, larger FLMs are built on the same foundations as smaller FLMs (e.g., BERT); hence, one must recognize the potential of smaller FLMs, which can be realized through an ensemble. In the current research, we perform a reality check on FLMs and their ensembles on benchmark and real-world datasets. We hypothesize that ensembling FLMs can influence the individualistic attention of FLMs and unravel the strength of coordination and cooperation among different FLMs. We utilize BERT and define three other ensemble techniques: {Shallow, Semi, and Deep}, wherein the Deep-Ensemble introduces a knowledge-guided reinforcement learning approach. We found that the proposed Deep-Ensemble BERT outperforms its large variant, i.e., BERT-large, by a factor of many times on datasets that show the usefulness of NLP in sensitive fields, such as mental health.
Submitted 23 August, 2023;
originally announced August 2023.