
Showing 1–22 of 22 results for author: Raghu, M

  1. Study of the transient nature of classical Be stars using multi-epoch optical spectroscopy

    Authors: Gourav Banerjee, Blesson Mathew, K. T. Paul, Annapurni Subramaniam, Anjusha Balan, Suman Bhattacharyya, R. Anusha, Deeja Moosa, C S Dheeraj, Aleeda Charly, Megha Raghu

    Abstract: Variability is a commonly observed property of classical Be (CBe) stars. In extreme cases, complete disappearance of the Hα emission line occurs, indicating a disc-less state in CBe stars. The disc-loss and reappearing phases can be identified by studying the Hα line profiles of CBe stars on a regular basis. In this paper, we present the study of a set of selected 9 bright CBe stars, in the…

    Submitted 27 September, 2022; originally announced September 2022.

    Comments: 19 pages, 7 figures, 2 tables, accepted in JApA

  2. arXiv:2205.13647  [pdf, other]

    cs.LG stat.ML

    Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures

    Authors: Emmanuel Abbe, Samy Bengio, Elisabetta Cornacchia, Jon Kleinberg, Aryo Lotfi, Maithra Raghu, Chiyuan Zhang

    Abstract: This paper considers the Pointer Value Retrieval (PVR) benchmark introduced in [ZRKB21], where a 'reasoning' function acts on a string of digits to produce the label. More generally, the paper considers the learning of logical functions with gradient descent (GD) on neural networks. It is first shown that in order to learn logical functions with gradient descent on symmetric neural networks, the g…

    Submitted 20 October, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: To appear in NeurIPS 2022

  3. arXiv:2202.07184  [pdf, other]

    cs.LG

    On the Origins of the Block Structure Phenomenon in Neural Network Representations

    Authors: Thao Nguyen, Maithra Raghu, Simon Kornblith

    Abstract: Recent work has uncovered a striking phenomenon in large-capacity neural networks: they contain blocks of contiguous hidden layers with highly similar representations. This block structure has two seemingly contradictory properties: on the one hand, its constituent layers exhibit highly similar dominant first principal components (PCs), but on the other hand, their representations, and their commo…

    Submitted 14 February, 2022; originally announced February 2022.

  4. arXiv:2108.08810  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Do Vision Transformers See Like Convolutional Neural Networks?

    Authors: Maithra Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy

    Abstract: Convolutional neural networks (CNNs) have so far been the de-facto model for visual data. Recent work has shown that (Vision) Transformer models (ViT) can achieve comparable or even superior performance on image classification tasks. This raises a central question: how are Vision Transformers solving these tasks? Are they acting like convolutional networks, or learning entirely different visual re…

    Submitted 3 March, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

  5. arXiv:2107.12580  [pdf, other]

    cs.LG cs.AI stat.ML

    Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization

    Authors: Chiyuan Zhang, Maithra Raghu, Jon Kleinberg, Samy Bengio

    Abstract: Central to the success of artificial neural networks is their ability to generalize. But does neural network generalization primarily rely on seeing highly similar training examples (memorization)? Or are neural networks capable of human-intelligence styled reasoning, and if so, to what extent? These remain fundamental open questions on artificial neural networks. In this paper, as steps towards a…

    Submitted 18 February, 2022; v1 submitted 26 July, 2021; originally announced July 2021.

  6. arXiv:2011.03037  [pdf, other]

    cs.LG

    Teaching with Commentaries

    Authors: Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, Geoffrey Hinton

    Abstract: Effective training of deep neural networks can be challenging, and there remain many open questions on how to best learn these models. Recently developed methods to improve neural network training examine teaching: providing learned information during the training process to improve downstream model performance. In this paper, we take steps towards extending the scope of teaching. We propose a fle…

    Submitted 11 March, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

    Comments: ICLR 2021

  7. arXiv:2010.15327  [pdf, other]

    cs.LG

    Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

    Authors: Thao Nguyen, Maithra Raghu, Simon Kornblith

    Abstract: A key factor in the success of deep neural networks is the ability to scale models to improve performance by varying the architecture depth and width. This simple property of neural network design has resulted in highly effective architectures for a variety of tasks. Nevertheless, there is limited understanding of effects of depth and width on the learned representations. In this paper, we study t…

    Submitted 9 April, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: ICLR 2021

  8. arXiv:2007.07400  [pdf, other]

    cs.LG cs.CV stat.ML

    Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics

    Authors: Vinay V. Ramasesh, Ethan Dyer, Maithra Raghu

    Abstract: A central challenge in developing versatile machine learning systems is catastrophic forgetting: a model trained on tasks in sequence will suffer significant performance drops on earlier tasks. Despite the ubiquity of catastrophic forgetting, there is limited understanding of the underlying process and its causes. In this paper, we address this important knowledge gap, investigating how forgetting…

    Submitted 14 July, 2020; originally announced July 2020.

  9. arXiv:2003.11755  [pdf, other]

    cs.LG stat.ML

    A Survey of Deep Learning for Scientific Discovery

    Authors: Maithra Raghu, Eric Schmidt

    Abstract: Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. At the same time, the amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity. Taken together, this suggests many exciting opportunities for deep learning applications in scientific se…

    Submitted 26 March, 2020; originally announced March 2020.

  10. arXiv:1909.09157  [pdf, other]

    cs.LG stat.ML

    Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

    Authors: Aniruddh Raghu, Maithra Raghu, Samy Bengio, Oriol Vinyals

    Abstract: An important research direction in machine learning has centered around developing meta-learning algorithms to tackle few-shot learning. An especially successful algorithm has been Model Agnostic Meta-Learning (MAML), a method that consists of two optimization loops, with the outer loop finding a meta-initialization, from which the inner loop can efficiently learn new tasks. Despite MAML's popular…

    Submitted 12 February, 2020; v1 submitted 19 September, 2019; originally announced September 2019.

    Comments: ICLR 2020

  11. arXiv:1903.12220  [pdf, other]

    cs.CV cs.AI cs.LG

    The Algorithmic Automation Problem: Prediction, Triage, and Human Effort

    Authors: Maithra Raghu, Katy Blumer, Greg Corrado, Jon Kleinberg, Ziad Obermeyer, Sendhil Mullainathan

    Abstract: In a wide array of areas, algorithms are matching and surpassing the performance of human experts, leading to consideration of the roles of human judgment and algorithmic prediction in these domains. The discussion around these developments, however, has implicitly equated the specific task of prediction with the general task of automation. We argue here that automation is broader than just a comp…

    Submitted 28 March, 2019; originally announced March 2019.

  12. arXiv:1902.07208  [pdf, other]

    cs.CV cs.LG stat.ML

    Transfusion: Understanding Transfer Learning for Medical Imaging

    Authors: Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, Samy Bengio

    Abstract: Transfer learning from natural image datasets, particularly ImageNet, using standard large models and corresponding pretrained weights has become a de-facto method for deep learning applications to medical imaging. However, there are fundamental differences in data sizes, features and task specifications between natural image classification and the target medical tasks, and there is little underst…

    Submitted 29 October, 2019; v1 submitted 13 February, 2019; originally announced February 2019.

    Comments: NeurIPS 2019

  13. arXiv:1807.01771  [pdf, other]

    cs.LG stat.ML

    Direct Uncertainty Prediction for Medical Second Opinions

    Authors: Maithra Raghu, Katy Blumer, Rory Sayres, Ziad Obermeyer, Robert Kleinberg, Sendhil Mullainathan, Jon Kleinberg

    Abstract: The issue of disagreements amongst human experts is a ubiquitous one in both machine learning and medicine. In medicine, this often corresponds to doctor disagreements on a patient diagnosis. In this work, we show that machine learning models can be trained to give uncertainty scores to data instances that might result in high expert disagreements. In particular, they can identify patient cases th…

    Submitted 28 May, 2019; v1 submitted 4 July, 2018; originally announced July 2018.

    Comments: Accepted for publication at ICML 2019

  14. arXiv:1806.05759  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG cs.NE

    Insights on representational similarity in neural networks with canonical correlation

    Authors: Ari S. Morcos, Maithra Raghu, Samy Bengio

    Abstract: Comparing different neural network representations and determining how representations evolve over time remain challenging open questions in our understanding of the function of neural networks. Comparing representations in neural networks is fundamentally difficult as the structure of representations varies greatly, even across groups of networks trained on identical tasks, and over the course of…

    Submitted 23 October, 2018; v1 submitted 14 June, 2018; originally announced June 2018.

    Comments: NIPS 2018

  15. arXiv:1801.02774  [pdf, other]

    cs.CV

    Adversarial Spheres

    Authors: Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, Ian Goodfellow

    Abstract: State of the art computer vision models have been shown to be vulnerable to small adversarial perturbations of the input. In other words, most images in the data distribution are both correctly classified by the model and are very close to a visually similar misclassified image. Despite substantial research interest, the cause of the phenomenon is still poorly understood and remains unsolved. We h…

    Submitted 10 September, 2018; v1 submitted 8 January, 2018; originally announced January 2018.

    MSC Class: 68T45; ACM Class: I.2.6

  16. arXiv:1711.02301  [pdf, other]

    cs.AI cs.NE stat.ML

    Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?

    Authors: Maithra Raghu, Alex Irpan, Jacob Andreas, Robert Kleinberg, Quoc V. Le, Jon Kleinberg

    Abstract: Deep reinforcement learning has achieved many recent successes, but our understanding of its strengths and limitations is hampered by the lack of rich environments in which we can fully characterize optimal behavior, and correspondingly diagnose individual actions against such a characterization. Here we consider a family of combinatorial games, arising from work of Erdos, Selfridge, and Spencer,…

    Submitted 28 June, 2018; v1 submitted 7 November, 2017; originally announced November 2017.

    Comments: Accepted to ICML 2018, code opensourced at: https://github.com/rubai5/ESS_Game

  17. arXiv:1706.05806  [pdf, other]

    stat.ML cs.LG

    SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

    Authors: Maithra Raghu, Justin Gilmer, Jason Yosinski, Jascha Sohl-Dickstein

    Abstract: We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and networks) and fast to compute (allowing more comparisons to be calculated than with previous methods). We deploy this tool to measure the intrinsic dimensionality of…

    Submitted 8 November, 2017; v1 submitted 19 June, 2017; originally announced June 2017.

    Comments: Accepted to NIPS 2017, code: https://github.com/google/svcca/ , new plots on Imagenet

  18. arXiv:1704.01255  [pdf, other]

    cs.LG stat.ML

    Linear Additive Markov Processes

    Authors: Ravi Kumar, Maithra Raghu, Tamas Sarlos, Andrew Tomkins

    Abstract: We introduce LAMP: the Linear Additive Markov Process. Transitions in LAMP may be influenced by states visited in the distant history of the process, but unlike higher-order Markov processes, LAMP retains an efficient parametrization. LAMP also allows the specific dependence on history to be learned efficiently from data. We characterize some theoretical properties of LAMP, including its steady-st…

    Submitted 4 April, 2017; originally announced April 2017.

    Comments: Accepted to WWW 2017

  19. arXiv:1611.08083  [pdf, other]

    stat.ML cs.LG cs.NE

    Survey of Expressivity in Deep Neural Networks

    Authors: Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

    Abstract: We survey results on neural network expressivity described in "On the Expressive Power of Deep Neural Networks". The paper motivates and develops three natural measures of expressiveness, which all display an exponential dependence on the depth of the network. In fact, all of these measures are related to a fourth quantity, trajectory length. This quantity grows exponentially in the depth of the n…

    Submitted 24 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  20. arXiv:1606.05340  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Exponential expressivity in deep neural networks through transient chaos

    Authors: Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein, Surya Ganguli

    Abstract: We combine Riemannian geometry with the mean field theory of high dimensional chaos to study the nature of signal propagation in generic, deep neural networks with random weights. Our results reveal an order-to-chaos expressivity phase transition, with networks in the chaotic phase computing nonlinear functions whose global curvature grows exponentially with depth but not width. We prove this gene…

    Submitted 17 June, 2016; v1 submitted 16 June, 2016; originally announced June 2016.

    Comments: Fixed equation references

  21. arXiv:1606.05336  [pdf, other]

    stat.ML cs.AI cs.LG

    On the Expressive Power of Deep Neural Networks

    Authors: Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

    Abstract: We propose a new approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute. Our approach is based on an interrelated set of measures of expressivity, unified by the novel notion of trajectory length, which measures how the output of a network changes as the input sweeps along a…

    Submitted 18 June, 2017; v1 submitted 16 June, 2016; originally announced June 2016.

    Comments: Accepted to ICML 2017

  22. arXiv:1506.00147  [pdf, ps, other]

    cs.DS cs.GT

    Team Performance with Test Scores

    Authors: Jon Kleinberg, Maithra Raghu

    Abstract: Team performance is a ubiquitous area of inquiry in the social sciences, and it motivates the problem of team selection -- choosing the members of a team for maximum performance. Influential work of Hong and Page has argued that testing individuals in isolation and then assembling the highest-scoring ones into a team is not an effective method for team selection. For a broad class of performance m…

    Submitted 25 March, 2018; v1 submitted 30 May, 2015; originally announced June 2015.