Showing 1–4 of 4 results for author: Aski, V

  1. arXiv:2406.02290  [pdf, other]

    cs.LG

    A Study of Optimizations for Fine-tuning Large Language Models

    Authors: Arjun Singh, Nikhil Pandey, Anup Shirgaonkar, Pavan Manoj, Vijay Aski

    Abstract: Fine-tuning large language models is a popular choice among users trying to adapt them for specific applications. However, fine-tuning these models is a demanding task because the user has to examine several factors, such as resource budget, runtime, model size and context length among others. A specific challenge is that fine-tuning is memory intensive, imposing constraints on the required hardwa…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures. Revised text for clarity, updated references

  2. arXiv:2404.00213  [pdf, other]

    cs.CL

    Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning

    Authors: Nick Mecklenburg, Yiyou Lin, Xiaoxiao Li, Daniel Holstein, Leonardo Nunes, Sara Malvar, Bruno Silva, Ranveer Chandra, Vijay Aski, Pavan Kumar Reddy Yannam, Tolga Aktas, Todd Hendry

    Abstract: In recent years, Large Language Models (LLMs) have shown remarkable performance in generating human-like text, proving to be a valuable asset across various applications. However, adapting these models to incorporate new, out-of-domain knowledge remains a challenge, particularly for facts and events that occur after the model's knowledge cutoff date. This paper investigates the effectiveness of Su…

    Submitted 2 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

    Comments: 16 pages, 7 figures. Updated authors list

  3. arXiv:2401.08406  [pdf, other]

    cs.CL cs.LG

    RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture

    Authors: Angels Balaguer, Vinamra Benara, Renato Luiz de Freitas Cunha, Roberto de M. Estevão Filho, Todd Hendry, Daniel Holstein, Jennifer Marsman, Nick Mecklenburg, Sara Malvar, Leonardo O. Nunes, Rafael Padilha, Morris Sharp, Bruno Silva, Swati Sharma, Vijay Aski, Ranveer Chandra

    Abstract: There are two common ways in which developers are incorporating proprietary and domain-specific data when building applications of Large Language Models (LLMs): Retrieval-Augmented Generation (RAG) and Fine-Tuning. RAG augments the prompt with the external data, while Fine-Tuning incorporates the additional knowledge into the model itself. However, the pros and cons of both approaches are not well…

    Submitted 30 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  4. arXiv:2310.06225  [pdf, other]

    cs.AI cs.LG

    GPT-4 as an Agronomist Assistant? Answering Agriculture Exams Using Large Language Models

    Authors: Bruno Silva, Leonardo Nunes, Roberto Estevão, Vijay Aski, Ranveer Chandra

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding across various domains, including healthcare and finance. For some tasks, LLMs achieve similar or better performance than trained human beings, therefore it is reasonable to employ human exams (e.g., certification tests) to assess the performance of LLMs. We present a comprehensive evaluation o…

    Submitted 12 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.