
Showing 151–200 of 284 results for author: Anandkumar, A

  1. arXiv:2012.04160  [pdf, other]

    cs.LG math.OC stat.ML

    Stability and Identification of Random Asynchronous Linear Time-Invariant Systems

    Authors: Sahin Lale, Oguzhan Teke, Babak Hassibi, Anima Anandkumar

    Abstract: In many computational tasks and dynamical systems, asynchrony and randomization are naturally present and have been considered as ways to increase the speed and reduce the cost of computation while compromising the accuracy and convergence rate. In this work, we show the additional benefits of randomization and asynchrony on the stability of linear dynamical systems. We introduce a natural model f…

    Submitted 7 December, 2020; originally announced December 2020.

  2. arXiv:2011.07748  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Fast Uncertainty Quantification for Deep Object Pose Estimation

    Authors: Guanya Shi, Yifeng Zhu, Jonathan Tremblay, Stan Birchfield, Fabio Ramos, Animashree Anandkumar, Yuke Zhu

    Abstract: Deep learning-based object pose estimators are often unreliable and overconfident especially when the input image is outside the training domain, for instance, with sim2real transfer. Efficient and robust uncertainty quantification (UQ) in pose estimators is critically needed in many robotic tasks. In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose esti…

    Submitted 26 March, 2021; v1 submitted 16 November, 2020; originally announced November 2020.

    Comments: Video and code are available at https://sites.google.com/view/fastuq

    Journal ref: International Conference on Robotics and Automation (ICRA), 2021

  3. arXiv:2011.02680  [pdf, other]

    physics.chem-ph cs.LG

    Multi-task learning for electronic structure to predict and explore molecular potential energy surfaces

    Authors: Zhuoran Qiao, Feizhi Ding, Matthew Welborn, Peter J. Bygrave, Daniel G. A. Smith, Animashree Anandkumar, Frederick R. Manby, Thomas F. Miller III

    Abstract: We refine the OrbNet model to accurately predict energy, forces, and other response properties for molecules using a graph neural-network architecture based on features from low-cost approximated quantum operators in the symmetry-adapted atomic orbital basis. The model is end-to-end differentiable due to the derivation of analytic gradients for all electronic structure terms, and is shown to be tr…

    Submitted 1 December, 2020; v1 submitted 5 November, 2020; originally announced November 2020.

    Comments: Accepted for presentation at the Machine Learning for Molecules workshop at NeurIPS 2020

  4. arXiv:2010.08895  [pdf, other]

    cs.LG math.NA

    Fourier Neural Operator for Parametric Partial Differential Equations

    Authors: Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

    Abstract: The classical development of neural networks has primarily focused on learning mappings between finite-dimensional Euclidean spaces. Recently, this has been generalized to neural operators that learn mappings between function spaces. For partial differential equations (PDEs), neural operators directly learn the mapping from any functional parametric dependence to the solution. Thus, they learn an…

    Submitted 16 May, 2021; v1 submitted 17 October, 2020; originally announced October 2020.

  5. arXiv:2010.05784  [pdf, other]

    cs.LG cs.CV

    Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach

    Authors: Haoxuan Wang, Zhiding Yu, Yisong Yue, Anima Anandkumar, Anqi Liu, Junchi Yan

    Abstract: We propose a framework for learning calibrated uncertainties under domain shifts, where the source (training) distribution differs from the target (test) distribution. We detect such domain shifts via a differentiable density ratio estimator and train it together with the task network, composing an adjusted softmax predictive form concerning domain shift. In particular, the density ratio estimatio…

    Submitted 5 February, 2024; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: IJCAI 2023

  6. arXiv:2010.00840  [pdf, other]

    cs.CL

    MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models

    Authors: Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Raul Puri, Pascale Fung, Anima Anandkumar, Bryan Catanzaro

    Abstract: Existing pre-trained large language models have shown unparalleled generative capabilities. However, they are not controllable. In this paper, we propose MEGATRON-CNTRL, a novel framework that uses large-scale language models and adds control to text generation by incorporating an external knowledge base. Our framework consists of a keyword predictor, a knowledge retriever, a contextual knowledge…

    Submitted 2 October, 2020; originally announced October 2020.

    Comments: Accepted in EMNLP 2020 main conference

  7. arXiv:2010.00763  [pdf, other]

    cs.AI cs.CV cs.LG

    Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning

    Authors: Weili Nie, Zhiding Yu, Lei Mao, Ankit B. Patel, Yuke Zhu, Animashree Anandkumar

    Abstract: Humans have an inherent ability to learn novel concepts from only a few samples and generalize these concepts to different situations. Even though today's machine learning models excel with a plethora of training data on standard recognition tasks, a considerable gap exists between machine-level pattern recognition and human-level concept learning. To narrow this gap, the Bongard problems (BPs) we…

    Submitted 4 January, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: 22 pages, NeurIPS 2020

  8. arXiv:2009.10019  [pdf, other]

    cs.RO cs.LG

    Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion

    Authors: Xingye Da, Zhaoming Xie, David Hoeller, Byron Boots, Animashree Anandkumar, Yuke Zhu, Buck Babich, Animesh Garg

    Abstract: We present a hierarchical framework that combines model-based control and reinforcement learning (RL) to synthesize robust controllers for a quadruped (the Unitree Laikago). The system consists of a high-level controller that learns to choose from a set of primitives in response to changes in the environment and a low-level controller that utilizes an established control method to robustly execute…

    Submitted 23 November, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

    Comments: supplementary video: https://youtu.be/JJOmFZKpYTo

  9. arXiv:2008.11833  [pdf]

    cs.CV cs.RO

    Deep learning-based computer vision to recognize and classify suturing gestures in robot-assisted surgery

    Authors: Francisco Luongo, Ryan Hakim, Jessica H. Nguyen, Animashree Anandkumar, Andrew J Hung

    Abstract: Our previous work classified a taxonomy of suturing gestures during a vesicourethral anastomosis of robotic radical prostatectomy in association with tissue tears and patient outcomes. Herein, we train deep-learning based computer vision (CV) to automate the identification and classification of suturing gestures for needle driving attempts. Using two independent raters, we manually annotated live…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: 5 figures, 2 tables

    ACM Class: J.3

  10. arXiv:2008.07087  [pdf, other]

    cs.LG cs.AI stat.ML

    OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation

    Authors: Hongyu Ren, Yuke Zhu, Jure Leskovec, Anima Anandkumar, Animesh Garg

    Abstract: Real-world tasks often exhibit a compositional structure that contains a sequence of simpler sub-tasks. For instance, opening a door requires reaching, grasping, rotating, and pulling the door knob. Such compositional tasks require an agent to reason about the sub-task at hand while orchestrating global behavior accordingly. This can be cast as an online task inference problem, where the current t…

    Submitted 17 August, 2020; originally announced August 2020.

    Comments: UAI 2020

  11. arXiv:2007.12291  [pdf, other]

    cs.LG math.OC stat.ML

    Reinforcement Learning with Fast Stabilization in Linear Dynamical Systems

    Authors: Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

    Abstract: In this work, we study model-based reinforcement learning (RL) in unknown stabilizable linear dynamical systems. When learning a dynamical system, one needs to stabilize the unknown dynamics in order to avoid system blow-ups. We propose an algorithm that certifies fast stabilization of the underlying system by effectively exploring the environment with an improved exploration strategy. We show tha…

    Submitted 3 June, 2022; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: 25th International Conference on Artificial Intelligence and Statistics (AISTATS) 2022

  12. arXiv:2007.09250  [pdf, other]

    cs.LG cs.CV stat.ML

    Unsupervised Controllable Generation with Self-Training

    Authors: Grigorios G Chrysos, Jean Kossaifi, Zhiding Yu, Anima Anandkumar

    Abstract: Recent generative adversarial networks (GANs) are able to generate impressive photo-realistic images. However, controllable generation with GANs remains a challenging research problem. Achieving controllable generation requires semantically interpretable and disentangled factors of variation. It is challenging to achieve this goal using simple fixed distributions such as Gaussian distribution. Ins…

    Submitted 2 May, 2021; v1 submitted 17 July, 2020; originally announced July 2020.

    Comments: Accepted in IJCNN 2021

  13. arXiv:2007.09200  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Neural Networks with Recurrent Generative Feedback

    Authors: Yujia Huang, James Gornet, Sihui Dai, Zhiding Yu, Tan Nguyen, Doris Y. Tsao, Anima Anandkumar

    Abstract: Neural networks are vulnerable to input perturbations such as additive noise and adversarial attacks. In contrast, human perception is much more robust to such perturbations. The Bayesian brain hypothesis states that human brains use an internal generative model to update the posterior beliefs of the sensory input. This mechanism can be interpreted as a form of self-consistency between the maximum…

    Submitted 10 November, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

    Comments: NeurIPS 2020

  14. arXiv:2007.08479  [pdf, other]

    cs.LG stat.ML

    Active Learning under Label Shift

    Authors: Eric Zhao, Anqi Liu, Animashree Anandkumar, Yisong Yue

    Abstract: We address the problem of active learning under label shift: when the class proportions of source and target domains differ. We introduce a "medial distribution" to incorporate a tradeoff between importance weighting and class-balanced sampling and propose their combined usage in active learning. Our method is known as Mediated Active Learning under Label Shift (MALLS). It balances the bias from c…

    Submitted 25 February, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: 18 pages, 9 figures, to appear at the 2021 International Conference on Artificial Intelligence and Statistics (AIStats)

  15. arXiv:2007.08026  [pdf, other]

    physics.chem-ph cs.LG

    OrbNet: Deep Learning for Quantum Chemistry Using Symmetry-Adapted Atomic-Orbital Features

    Authors: Zhuoran Qiao, Matthew Welborn, Animashree Anandkumar, Frederick R. Manby, Thomas F. Miller III

    Abstract: We introduce a machine learning method in which energy solutions from the Schrödinger equation are predicted using symmetry adapted atomic orbitals features and a graph neural-network architecture. OrbNet is shown to outperform existing methods in terms of learning efficiency and transferability for the prediction of density functional theory results while employing low-cost features that…

    Submitted 18 January, 2022; v1 submitted 15 July, 2020; originally announced July 2020.

    Journal ref: J. Chem. Phys. 153, 124111 (2020)

  16. arXiv:2007.06965  [pdf, other]

    cs.LG cs.CV cs.RO stat.ML

    Automated Synthetic-to-Real Generalization

    Authors: Wuyang Chen, Zhiding Yu, Zhangyang Wang, Anima Anandkumar

    Abstract: Models trained on synthetic images often face degraded generalization to real data. As a convention, these models are often initialized with ImageNet pre-trained representation. Yet the role of ImageNet knowledge is seldom discussed despite common practices that leverage this knowledge to maintain the generalization ability. An example is the careful hand-tuning of early stopping and layer-wise le…

    Submitted 14 July, 2020; originally announced July 2020.

    Comments: Accepted to ICML 2020

  17. arXiv:2007.00631  [pdf, other]

    cs.LG cs.CV stat.ML

    Causal Discovery in Physical Systems from Videos

    Authors: Yunzhu Li, Antonio Torralba, Animashree Anandkumar, Dieter Fox, Animesh Garg

    Abstract: Causal discovery is at the core of human cognition. It enables us to reason about the environment and make counterfactual predictions about unseen scenarios that can vastly differ from our previous experiences. We consider the task of causal discovery from videos in an end-to-end fashion without supervision on the ground-truth graph structure. In particular, our goal is to discover the structural…

    Submitted 29 November, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: NeurIPS 2020. Project page: https://yunzhuli.github.io/V-CDN/

  18. arXiv:2006.15637  [pdf, other]

    cs.LG stat.ML

    Deep Bayesian Quadrature Policy Optimization

    Authors: Akella Ravi Tej, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Anima Anandkumar, Yisong Yue

    Abstract: We study the problem of obtaining accurate policy gradient estimates using a finite number of samples. Monte-Carlo methods have been the default choice for policy gradient estimation, despite suffering from high variance in the gradient estimates. On the other hand, more sample efficient alternatives like Bayesian quadrature methods have received little attention due to their high computational co…

    Submitted 16 December, 2020; v1 submitted 28 June, 2020; originally announced June 2020.

    Comments: Conference paper: AAAI-21. Code available at https://github.com/Akella17/Deep-Bayesian-Quadrature-Policy-Optimization

  19. arXiv:2006.14560  [pdf, other]

    cs.NE cs.LG math.NA stat.ML

    Learning compositional functions via multiplicative weight updates

    Authors: Jeremy Bernstein, Jiawei Zhao, Markus Meister, Ming-Yu Liu, Anima Anandkumar, Yisong Yue

    Abstract: Compositionality is a basic structural feature of both biological and artificial neural networks. Learning compositional functions via gradient descent incurs well known problems like vanishing and exploding gradients, making careful learning rate tuning essential for real-world applications. This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional fun…

    Submitted 8 January, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

  20. arXiv:2006.10611  [pdf, other]

    cs.LG cs.GT cs.MA stat.ML

    Competitive Policy Optimization

    Authors: Manish Prajapat, Kamyar Azizzadenesheli, Alexander Liniger, Yisong Yue, Anima Anandkumar

    Abstract: A core challenge in policy optimization in competitive Markov decision processes is the design of efficient optimization methods with desirable convergence and stability properties. To tackle this, we propose competitive policy optimization (CoPO), a novel policy gradient approach that exploits the game-theoretic nature of competitive games to derive policy updates. Motivated by the competitive gr…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: 11 pages main paper, 6 pages references, and 31 pages appendix. 14 figures

  21. arXiv:2006.10179  [pdf, other]

    math.OC cs.GT cs.LG

    Competitive Mirror Descent

    Authors: Florian Schäfer, Anima Anandkumar, Houman Owhadi

    Abstract: Constrained competitive optimization involves multiple agents trying to minimize conflicting objectives, subject to constraints. This is a highly expressive modeling language that subsumes most of modern machine learning. In this work we propose competitive mirror descent (CMD): a general method for solving such problems based on first order information that can be obtained by automatic differenti…

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: The code used to produce the numerical experiments can be found under https://github.com/f-t-s/CMD

  22. arXiv:2006.09535  [pdf, other]

    cs.LG math.NA stat.ML

    Multipole Graph Neural Operator for Parametric Partial Differential Equations

    Authors: Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

    Abstract: One of the main challenges in using deep learning-based methods for simulating physical systems and solving partial differential equations (PDEs) is formulating physics-based data in the desired structure for neural networks. Graph neural networks (GNNs) have gained popularity in this area since graphs offer a natural way of modeling particle interactions and provide a clear way of discretizing th…

    Submitted 19 October, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

  23. arXiv:2005.04374  [pdf, other]

    cs.RO cs.AI eess.SY math.OC

    Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems

    Authors: Yashwanth Kumar Nakka, Anqi Liu, Guanya Shi, Anima Anandkumar, Yisong Yue, Soon-Jo Chung

    Abstract: Learning-based control algorithms require data collection with abundant supervision for training. Safe exploration algorithms ensure the safety of this data collection process even when only partial knowledge is available. We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained stochastic optimal control with dynamics learning and feedback con…

    Submitted 27 October, 2020; v1 submitted 9 May, 2020; originally announced May 2020.

    Comments: Accepted IEEE Robotics and Automation Letters 2020

  24. arXiv:2005.01463  [pdf, other]

    cs.LG eess.IV physics.flu-dyn stat.ML

    MeshfreeFlowNet: A Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework

    Authors: Chiyu Max Jiang, Soheil Esmaeilzadeh, Kamyar Azizzadenesheli, Karthik Kashinath, Mustafa Mustafa, Hamdi A. Tchelepi, Philip Marcus, Prabhat, Anima Anandkumar

    Abstract: We propose MeshfreeFlowNet, a novel deep learning-based super-resolution framework to generate continuous (grid-free) spatio-temporal solutions from the low-resolution inputs. While being computationally efficient, MeshfreeFlowNet accurately recovers the fine-scale quantities of interest. MeshfreeFlowNet allows for: (i) the output to be sampled at all spatio-temporal resolutions, (ii) a set of Par…

    Submitted 21 August, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: Supplementary Video: https://youtu.be/mjqwPch9gDo. Accepted to SC20

  25. arXiv:2004.07984  [pdf, other]

    cs.LG stat.ML

    Spectral Learning on Matrices and Tensors

    Authors: Majid Janzamin, Rong Ge, Jean Kossaifi, Anima Anandkumar

    Abstract: Spectral methods have been the mainstay in several domains such as machine learning and scientific computing. They involve finding a certain kind of spectral decomposition to obtain basis functions that can capture important structures for the problem at hand. The most common spectral method is the principal component analysis (PCA). It utilizes the top eigenvectors of the data covariance matrix,…

    Submitted 16 April, 2020; originally announced April 2020.

    Journal ref: Foundations and Trends in Machine Learning: Vol. 12: No. 5-6, pp 393-536 (2019)

  26. arXiv:2003.11227  [pdf, other]

    cs.LG math.OC stat.ML

    Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems

    Authors: Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

    Abstract: We study the problem of system identification and adaptive control in partially observable linear dynamical systems. Adaptive and closed-loop system identification is a challenging problem due to correlations introduced in data collection. In this paper, we present the first model estimation method with finite-time guarantees in both open and closed-loop system identification. Deploying this estim…

    Submitted 23 June, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

  27. arXiv:2003.05999  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting

    Authors: Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

    Abstract: We study the problem of adaptive control in partially observable linear quadratic Gaussian control systems, where the model dynamics are unknown a priori. We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty, to effectively minimize the overall control cost. We employ the predictor state evolution representation of the system dyn…

    Submitted 23 June, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

  28. arXiv:2003.03485  [pdf, other]

    cs.LG math.NA stat.ML

    Neural Operator: Graph Kernel Network for Partial Differential Equations

    Authors: Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

    Abstract: The classical development of neural networks has been primarily for mappings between a finite-dimensional Euclidean space and a set of classes, or between two finite-dimensional Euclidean spaces. The purpose of this work is to generalize neural networks so that they can learn mappings between infinite-dimensional spaces (operators). The key innovation in our work is that a single set of network pa…

    Submitted 6 March, 2020; originally announced March 2020.

  29. arXiv:2003.03461  [pdf, other]

    cs.CV cs.LG

    Semi-Supervised StyleGAN for Disentanglement Learning

    Authors: Weili Nie, Tero Karras, Animesh Garg, Shoubhik Debnath, Anjul Patney, Ankit B. Patel, Anima Anandkumar

    Abstract: Disentanglement learning is crucial for obtaining disentangled representations and controllable generation. Current disentanglement methods face several inherent limitations: difficulty with high-resolution images, primarily focusing on learning disentangled representations, and non-identifiability due to the unsupervised setting. To alleviate these limitations, we design new architectures and los…

    Submitted 25 November, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: ICML 2020, 21 pages. Project page: https://sites.google.com/nvidia.com/semi-stylegan

  30. arXiv:2002.09131  [pdf, other]

    cs.LG cs.CV stat.ML

    Convolutional Tensor-Train LSTM for Spatio-temporal Learning

    Authors: Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Animashree Anandkumar

    Abstract: Learning from spatio-temporal data has numerous applications such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forecasting. This is because these kinds of challenging tasks require learning long-term spatio-temporal correlations in the video sequence. In this paper,…

    Submitted 4 October, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: Jiahao Su and Wonmin Byeon contributed equally to this work. 22 pages, 14 figures, NeurIPS 2020

  31. arXiv:2002.00082  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Regret Minimization in Partially Observable Linear Quadratic Control

    Authors: Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

    Abstract: We study the problem of regret minimization in partially observable linear quadratic control systems when the model dynamics are unknown a priori. We propose ExpCommit, an explore-then-commit algorithm that learns the model Markov parameters and then follows the principle of optimism in the face of uncertainty to design a controller. We propose a novel way to decompose the regret and provide an en…

    Submitted 7 March, 2020; v1 submitted 31 January, 2020; originally announced February 2020.

  32. arXiv:1912.04527  [pdf, other]

    cs.LG cs.CV cs.RO stat.ML

    Learning Pose Estimation for UAV Autonomous Navigation and Landing Using Visual-Inertial Sensor Data

    Authors: Francesca Baldini, Animashree Anandkumar, Richard M. Murray

    Abstract: In this work, we propose a new learning approach for autonomous navigation and landing of an Unmanned-Aerial-Vehicle (UAV). We develop a multimodal fusion of deep neural architectures for visual-inertial odometry. We train the model in an end-to-end fashion to estimate the current vehicle pose from streams of visual and inertial measurements. We first evaluate the accuracy of our estimation by com…

    Submitted 9 April, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

  33. arXiv:1912.03978  [pdf, other]

    cs.LG cs.CV stat.ML

    InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers

    Authors: Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar

    Abstract: Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation. However, conditioning CNFs on signals of interest for conditional image generation and downstream predictive tasks is inefficient due to the high-dimensional latent code generated by the model, which needs to be of the same si…

    Submitted 9 December, 2019; originally announced December 2019.

    Comments: 17 pages, 14 figures, 2 tables

  34. arXiv:1912.02279  [pdf, other]

    cs.LG cs.CV stat.ML

    Angular Visual Hardness

    Authors: Beidi Chen, Weiyang Liu, Zhiding Yu, Jan Kautz, Anshumali Shrivastava, Animesh Garg, Anima Anandkumar

    Abstract: Recent convolutional neural networks (CNNs) have led to impressive performance but often suffer from poor calibration. They tend to be overconfident, with the model confidence not always reflecting the underlying true ambiguity and hardness. In this paper, we propose angular visual hardness (AVH), a score given by the normalized angular distance between the sample feature embedding and the target…

    Submitted 10 July, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

  35. arXiv:1911.05811  [pdf, other]

    cs.LG stat.ML

    Triply Robust Off-Policy Evaluation

    Authors: Anqi Liu, Hao Liu, Anima Anandkumar, Yisong Yue

    Abstract: We propose a robust regression approach to off-policy evaluation (OPE) for contextual bandits. We frame OPE as a covariate-shift problem and leverage modern robust regression tools. Ours is a general approach that can be used to augment any existing OPE method that utilizes the direct method. When augmenting doubly robust methods, we call the resulting method Triply Robust. We prove upper bounds o…

    Submitted 15 November, 2019; v1 submitted 13 November, 2019; originally announced November 2019.

    Comments: Preliminary Work

  36. arXiv:1911.05332  [pdf, other]

    cs.LG cs.CY cs.SI stat.ML

    Finding Social Media Trolls: Dynamic Keyword Selection Methods for Rapidly-Evolving Online Debates

    Authors: Anqi Liu, Maya Srikanth, Nicholas Adams-Cohen, R. Michael Alvarez, Anima Anandkumar

    Abstract: Online harassment is a significant social problem. Prevention of online harassment requires rapid detection of harassing, offensive, and negative social media posts. In this paper, we propose the use of word embedding models to identify offensive and harassing social media messages in two aspects: detecting fast-changing topics for more effective data collection and representing word semantics in…

    Submitted 15 November, 2019; v1 submitted 13 November, 2019; originally announced November 2019.

    Comments: AI for Social Good workshop at NeurIPS (2019)

  37. arXiv:1911.05180  [pdf, ps, other]

    physics.comp-ph

    Turbulence forecasting via Neural ODE

    Authors: Gavin D. Portwood, Peetak P. Mitra, Mateus Dias Ribeiro, Tan Minh Nguyen, Balasubramanya T. Nadiga, Juan A. Saenz, Michael Chertkov, Animesh Garg, Anima Anandkumar, Andreas Dengel, Richard Baraniuk, David P. Schmidt

    Abstract: Fluid turbulence is characterized by strong coupling across a broad range of scales. Furthermore, besides the usual local cascades, such coupling may extend to interactions that are non-local in scale-space. As such the computational demands associated with explicitly resolving the full set of scales and their interactions, as in the Direct Numerical Simulation (DNS) of the Navier-Stokes equations…

    Submitted 12 November, 2019; originally announced November 2019.

  38. arXiv:1911.01545  [pdf, other]

    cs.LG cs.NE stat.ML

    Compositional Generalization with Tree Stack Memory Units

    Authors: Forough Arabshahi, Zhichu Lu, Pranay Mundra, Sameer Singh, Animashree Anandkumar

    Abstract: We study compositional generalization, viz., the problem of zero-shot generalization to novel compositions of concepts in a domain. Standard neural networks fail to a large extent on compositional learning. We propose Tree Stack Memory Units (Tree-SMU) to enable strong compositional generalization. Tree-SMU is a recursive neural network with Stack Memory Units (SMUs), a novel memory augmented ne…

    Submitted 15 October, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

  39. arXiv:1910.05852  [pdf, other]

    cs.LG stat.ML

    Implicit competitive regularization in GANs

    Authors: Florian Schäfer, Hongkai Zheng, Anima Anandkumar

    Abstract: To improve the stability of GAN training we need to understand why they can produce realistic samples. Presently, this is attributed to properties of the divergence obtained under an optimal discriminator. This argument has a fundamental flaw: If we do not impose regularity of the discriminator, it can exploit visually imperceptible errors of the generator to always achieve the maximal generator l…

    Submitted 30 October, 2020; v1 submitted 13 October, 2019; originally announced October 2019.

    Comments: The code used to produce the numerical experiments can be found under http://github.com/devzhk/ICR . A high-level overview of this work can be found under https://f-t-s.github.io/projects/icr/

  40. arXiv:1909.07746  [pdf, other]

    cs.LG cs.CL cs.IR

    Multi Sense Embeddings from Topic Models

    Authors: Shobhit Jain, Sravan Babu Bodapati, Ramesh Nallapati, Anima Anandkumar

    Abstract: Distributed word embeddings have yielded state-of-the-art performance in many NLP tasks, mainly due to their success in capturing useful semantic information. These representations assign only a single vector to each word whereas a large number of words are polysemous (i.e., have multiple meanings). In this work, we approach this critical problem in lexical semantics, namely that of representing v…

    Submitted 3 February, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: Accepted at ACL supported conference for Natural Language & Speech Processing. https://www.aclweb.org/anthology/W19-74, Year: 2019

  41. arXiv:1907.04572  [pdf, other

    cs.LG cs.CV stat.ML

    Out-of-Distribution Detection Using Neural Rendering Generative Models

    Authors: Yujia Huang, Sihui Dai, Tan Nguyen, Richard G. Baraniuk, Anima Anandkumar

    Abstract: Out-of-distribution (OoD) detection is a natural downstream task for deep generative models, due to their ability to learn the input probability distribution. There are mainly two classes of approaches for OoD detection using deep generative models, viz., based on likelihood measure and the reconstruction loss. However, both approaches are unable to carry out OoD detection effectively, especially…

    Submitted 10 July, 2019; originally announced July 2019.

  42. arXiv:1907.00496  [pdf, other

    physics.geo-ph cs.LG

    Directivity Modes of Earthquake Populations with Unsupervised Learning

    Authors: Zachary E. Ross, Daniel T. Trugman, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: We present a novel approach for resolving modes of rupture directivity in large populations of earthquakes. A seismic spectral decomposition technique is used to first produce relative measurements of radiated energy for earthquakes in a spatially-compact cluster. The azimuthal distribution of energy for each earthquake is then assumed to result from one of several distinct modes of rupture propag…

    Submitted 30 June, 2019; originally announced July 2019.

    Comments: 14 pages, 14 figures

  43. arXiv:1906.10437  [pdf, other

    cs.LG stat.ML

    Learning Causal State Representations of Partially Observable Environments

    Authors: Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello

    Abstract: Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP). Our method learns approximate causal state representations from RNNs trained to predi…

    Submitted 8 February, 2021; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: 35 pages, 8 figures

  44. arXiv:1906.05819  [pdf, other

    cs.LG eess.SY stat.ML

    Robust Regression for Safe Exploration in Control

    Authors: Anqi Liu, Guanya Shi, Soon-Jo Chung, Anima Anandkumar, Yisong Yue

    Abstract: We study the problem of safe learning and exploration in sequential control problems. The goal is to safely collect data samples from operating in an environment, in order to learn to achieve a challenging control goal (e.g., an agile maneuver close to a boundary). A central challenge in this setting is how to quantify uncertainty in order to choose provably-safe actions that allow us to collect i…

    Submitted 26 June, 2020; v1 submitted 13 June, 2019; originally announced June 2019.

    Comments: 2nd Annual Conference on Learning for Dynamics and Control

  45. arXiv:1905.12103  [pdf, other

    math.OC cs.GT cs.LG

    Competitive Gradient Descent

    Authors: Florian Schäfer, Anima Anandkumar

    Abstract: We introduce a new algorithm for the numerical computation of Nash equilibria of competitive two-player games. Our method is a natural generalization of gradient descent to the two-player setting where the update is given by the Nash equilibrium of a regularized bilinear local approximation of the underlying game. It avoids oscillatory and divergent behaviors seen in alternating gradient descent.…

    Submitted 30 June, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: Appeared in NeurIPS 2019. This version corrects an error in theorem 2.2. Source code used for the numerical experiments can be found under http://github.com/f-t-s/CGD. A high-level overview of this work can be found under http://f-t-s.github.io/projects/cgd/

  46. arXiv:1903.09734  [pdf, ps, other

    cs.LG stat.ML

    Regularized Learning for Domain Adaptation under Label Shifts

    Authors: Kamyar Azizzadenesheli, Anqi Liu, Fanny Yang, Animashree Anandkumar

    Abstract: We propose Regularized Learning under Label shifts (RLLS), a principled and practical domain-adaptation algorithm to correct for shifts in the label distribution between a source and a target domain. We first estimate importance weights using labeled source data and unlabeled target data, and then train a classifier on the weighted source samples. We derive a generalization bound for the classif…

    Submitted 22 March, 2019; originally announced March 2019.

    Comments: International Conference on Learning Representations (ICLR) 2019

  47. arXiv:1902.10758  [pdf, other

    cs.LG stat.ML

    Tensor Dropout for Robust Learning

    Authors: Arinbjörn Kolbeinsson, Jean Kossaifi, Yannis Panagakis, Adrian Bulat, Anima Anandkumar, Ioanna Tzoulaki, Paul Matthews

    Abstract: CNNs achieve remarkable performance by leveraging deep, over-parametrized architectures, trained on large datasets. However, they have limited generalization ability to data outside the training domain, and a lack of robustness to noise and adversarial attacks. By building better inductive biases, we can improve robustness and also obtain smaller networks that are more memory and computationally e…

    Submitted 11 December, 2020; v1 submitted 27 February, 2019; originally announced February 2019.

  48. arXiv:1901.11261  [pdf, other

    stat.ML cs.LG

    Higher-order Count Sketch: Dimensionality Reduction That Retains Efficient Tensor Operations

    Authors: Yang Shi, Animashree Anandkumar

    Abstract: Sketching is a randomized dimensionality-reduction method that aims to preserve relevant information in large-scale datasets. Count sketch is a simple popular sketch which uses a randomized hash function to achieve compression. In this paper, we propose a novel extension known as Higher-order Count Sketch (HCS). While count sketch uses a single hash function, HCS uses multiple (smaller) hash funct…

    Submitted 4 November, 2019; v1 submitted 31 January, 2019; originally announced January 2019.

  49. arXiv:1901.09490  [pdf, other

    cs.LG stat.ML

    Stochastic Linear Bandits with Hidden Low Rank Structure

    Authors: Sahin Lale, Kamyar Azizzadenesheli, Anima Anandkumar, Babak Hassibi

    Abstract: High-dimensional representations often have a lower dimensional underlying structure. This is particularly the case in many decision making settings. For example, when the representation of actions is generated from a deep neural network, it is reasonable to expect a low-rank structure whereas conventional structures like sparsity are not valid anymore. Subspace recovery methods, such as Principle…

    Submitted 27 January, 2019; originally announced January 2019.

  50. Neural Lander: Stable Drone Landing Control using Learned Dynamics

    Authors: Guanya Shi, Xichen Shi, Michael O'Connell, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung

    Abstract: Precise near-ground trajectory control is difficult for multi-rotor drones, due to the complex aerodynamic effects caused by interactions between multi-rotor airflow and the environment. Conventional control methods often fail to properly account for these complex effects and fall short in accomplishing smooth landing. In this paper, we present a novel deep-learning-based robust nonlinear controll…

    Submitted 4 March, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

    Comments: 7 pages, 5 figures, https://youtu.be/FLLsG0S78ik

    Journal ref: International Conference on Robotics and Automation (ICRA), 2019, pp. 9784-9790