Showing 1–2 of 2 results for author: Tejaswi, A

  1. arXiv:2406.14670  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring Design Choices for Building Language-Specific LLMs

    Authors: Atula Tejaswi, Nilesh Gupta, Eunsol Choi

    Abstract: Despite rapid progress in large language models (LLMs), their performance on a vast majority of languages remains unsatisfactory. In this paper, we study building language-specific LLMs by adapting monolingual and multilingual LLMs. We conduct systematic experiments on how design choices (base model selection, vocabulary extension, and continued fine-tuning) impact the adapted LLM, both in terms of…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures, 11 tables

  2. arXiv:2405.19597  [pdf, other]

    cs.LG cs.AI cs.CL

    SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors

    Authors: Vijay Lingam, Atula Tejaswi, Aditya Vavre, Aneesh Shetty, Gautham Krishna Gudur, Joydeep Ghosh, Alex Dimakis, Eunsol Choi, Aleksandar Bojchevski, Sujay Sanghavi

    Abstract: Popular parameter-efficient fine-tuning (PEFT) methods, such as LoRA and its variants, freeze pre-trained model weights \(W\) and inject learnable matrices \(ΔW\). These \(ΔW\) matrices are structured for efficient parameterization, often using techniques like low-rank approximations or scaling vectors. However, these methods typically show a performance gap compared to full fine-tuning. Although…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 17 pages, 5 figures, 14 tables