Skip to main content

Showing 1–1 of 1 results for author: Odenthal, T

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.14971  [pdf, other

    cs.CL cs.AI cs.LG

    Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation

    Authors: Shamane Siriwardhana, Mark McQuade, Thomas Gauthier, Lucas Atkins, Fernando Fernandes Neto, Luke Meyers, Anneketh Vij, Tyler Odenthal, Charles Goddard, Mary MacCarthy, Jacob Solawetz

    Abstract: We conducted extensive experiments on domain adaptation of the Meta-Llama-3-70B-Instruct model on SEC data, exploring its performance on both general and domain-specific benchmarks. Our focus included continual pre-training (CPT) and model merging, aiming to enhance the model's domain-specific capabilities while mitigating catastrophic forgetting. Through this study, we evaluated the impact of int… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures