Showing 1–2 of 2 results for author: Sherstinsky, A

Searching in archive cs.
  1. arXiv:2405.00732  [pdf, other]

    cs.CL cs.AI cs.LG

    LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

    Authors: Justin Zhao, Timothy Wang, Wael Abid, Geoffrey Angus, Arnav Garg, Jeffery Kinnison, Alex Sherstinsky, Piero Molino, Travis Addair, Devvret Rishi

    Abstract: Low Rank Adaptation (LoRA) has emerged as one of the most widely adopted methods for Parameter Efficient Fine-Tuning (PEFT) of Large Language Models (LLMs). LoRA reduces the number of trainable parameters and memory usage while achieving comparable performance to full fine-tuning. We aim to assess the viability of training and serving LLMs fine-tuned with LoRA in real-world applications. First, we…

    Submitted 29 April, 2024; originally announced May 2024.

  2. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network

    Authors: Alex Sherstinsky

    Abstract: Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN…

    Submitted 30 July, 2023; v1 submitted 9 August, 2018; originally announced August 2018.

    Comments: 43 pages, 10 figures, 78 references

    Journal ref: Elsevier "Physica D: Nonlinear Phenomena" journal, Volume 404, March 2020: Special Issue on Machine Learning and Dynamical Systems