
Showing 1–5 of 5 results for author: Golechha, S

  1. arXiv:2405.12755  [pdf, other]

    cs.LG cs.AI

    Progress Measures for Grokking on Real-world Tasks

    Authors: Satvik Golechha

    Abstract: Grokking, a phenomenon where machine learning models generalize long after overfitting, has been primarily observed and studied in algorithmic tasks. This paper explores grokking in real-world datasets using deep neural networks for classification under the cross-entropy loss. We challenge the prevalent hypothesis that the $L_2$ norm of weights is the primary cause of grokking by demonstrating tha…

    Submitted 20 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: 5 pages

    Journal ref: ICML 2024 Workshop on High-dimensional Learning Dynamics (HiLD)

  2. arXiv:2402.06733  [pdf, other]

    cs.CL cs.AI cs.LG

    NICE: To Optimize In-Context Examples or Not?

    Authors: Pragya Srivastava, Satvik Golechha, Amit Deshpande, Amit Sharma

    Abstract: Recent work shows that in-context learning and optimization of in-context examples (ICE) can significantly improve the accuracy of large language models (LLMs) on a wide range of tasks, leading to an apparent consensus that ICE optimization is crucial for better performance. However, most of these studies assume a fixed or no instruction provided in the prompt. We challenge this consensus by inves…

    Submitted 6 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted as a full paper (9 pages) at ACL 2024 (Main)

    Journal ref: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics 2024 (Volume 1: Long Papers)

  3. arXiv:2402.04620  [pdf, other]

    cs.HC cs.LG

    CataractBot: An LLM-Powered Expert-in-the-Loop Chatbot for Cataract Patients

    Authors: Pragnya Ramjee, Bhuvan Sachdeva, Satvik Golechha, Shreyas Kulkarni, Geeta Fulari, Kaushik Murali, Mohit Jain

    Abstract: The healthcare landscape is evolving, with patients seeking more reliable information about their health conditions, treatment options, and potential risks. Despite the abundance of information sources, the digital age overwhelms individuals with excess, often inaccurate information. Patients primarily trust doctors and hospital staff, highlighting the need for expert-endorsed health information.…

    Submitted 7 February, 2024; originally announced February 2024.

  4. arXiv:2402.03855  [pdf, other]

    cs.LG cs.AI

    Position Paper: Toward New Frameworks for Studying Model Representations

    Authors: Satvik Golechha, James Dao

    Abstract: Mechanistic interpretability (MI) aims to understand AI models by reverse-engineering the exact algorithms neural networks learn. Most works in MI so far have studied behaviors and capabilities that are trivial and token-aligned. However, most capabilities are not that trivial, which advocates for the study of hidden representations inside these networks as the unit of analysis. We do a literature…

    Submitted 6 February, 2024; originally announced February 2024.

  5. arXiv:2211.02943  [pdf, other]

    cs.LG cs.AI

    Predicting Treatment Adherence of Tuberculosis Patients at Scale

    Authors: Mihir Kulkarni, Satvik Golechha, Rishi Raj, Jithin Sreedharan, Ankit Bhardwaj, Santanu Rathod, Bhavin Vadera, Jayakrishna Kurada, Sanjay Mattoo, Rajendra Joshi, Kirankumar Rade, Alpan Raval

    Abstract: Tuberculosis (TB), an infectious bacterial disease, is a significant cause of death, especially in low-income countries, with an estimated ten million new cases reported globally in $2020$. While TB is treatable, non-adherence to the medication regimen is a significant cause of morbidity and mortality. Thus, proactively identifying patients at risk of dropping off their medication regimen enables…

    Submitted 15 November, 2022; v1 submitted 5 November, 2022; originally announced November 2022.

    Comments: 11 pages