Showing 1–3 of 3 results for author: Lad, V

  1. arXiv:2406.19384 [pdf, other]

    cs.LG cs.AI cs.CL

    The Remarkable Robustness of LLMs: Stages of Inference?

    Authors: Vedang Lad, Wes Gurnee, Max Tegmark

    Abstract: We demonstrate and investigate the remarkable robustness of Large Language Models by deleting and swapping adjacent layers. We find that deleting and swapping interventions retain 72-95% of the original model's prediction accuracy without fine-tuning, whereas models with more layers exhibit more robustness. Based on the results of the layer-wise intervention and further experiments, we hypothesiz…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2402.05110 [pdf, other]

    cs.LG

    Opening the AI black box: program synthesis via mechanistic interpretability

    Authors: Eric J. Michaud, Isaac Liao, Vedang Lad, Ziming Liu, Anish Mudide, Chloe Loughridge, Zifan Carl Guo, Tara Rezaei Kheirkhah, Mateja Vukelić, Max Tegmark

    Abstract: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. We test MIPS on a benchmark of 62 algorithmic tasks that can be learned by an RNN and find it highly complementary to GPT-4: MIPS solves 32 of them, including 13 that are not solved by G…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 24 pages

  3. arXiv:2307.05080 [pdf, other]

    cs.LG cs.CV

    Estimating label quality and errors in semantic segmentation data via any model

    Authors: Vedang Lad, Jonas Mueller

    Abstract: The labor-intensive annotation process of semantic segmentation datasets is often prone to errors, since humans struggle to label every pixel correctly. We study algorithms to automatically detect such annotation errors, in particular methods to score label quality, such that the images with the lowest scores are least likely to be correctly labeled. This helps prioritize what data to review in or…

    Submitted 11 July, 2023; originally announced July 2023.

    Comments: ICML Workshop on Data-centric Machine Learning Research 2023