-
Sigma-Delta and Distributed Noise-Shaping Quantization Methods for Random Fourier Features
Authors:
Jinjie Zhang,
Harish Kannan,
Alexander Cloninger,
Rayan Saab
Abstract:
We propose the use of low bit-depth Sigma-Delta and distributed noise-shaping methods for quantizing the Random Fourier features (RFFs) associated with shift-invariant kernels. We prove that our quantized RFFs -- even in the case of $1$-bit quantization -- allow a high accuracy approximation of the underlying kernels, and the approximation error decays at least polynomially fast as the dimension of the RFFs increases. We also show that the quantized RFFs can be further compressed, yielding an excellent trade-off between memory use and accuracy. Namely, the approximation error now decays exponentially as a function of the bits used. Moreover, we empirically show by testing the performance of our methods on several machine learning tasks that our method compares favorably to other state-of-the-art quantization methods in this context.
Submitted 12 April, 2022; v1 submitted 4 June, 2021;
originally announced June 2021.
-
Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning
Authors:
Mohammad Babaeizadeh,
Mohammad Taghi Saffar,
Danijar Hafner,
Harini Kannan,
Chelsea Finn,
Sergey Levine,
Dumitru Erhan
Abstract:
Model-based reinforcement learning (MBRL) methods have shown strong sample efficiency and performance across a variety of tasks, including when faced with high-dimensional visual observations. These methods learn to predict the environment dynamics and expected reward from interaction and use this predictive model to plan and perform the task. However, MBRL methods vary in their fundamental design choices, and there is no strong consensus in the literature on how these design decisions affect performance. In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning. We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance. A big exception to this finding is that predicting future observations (i.e., images) leads to significant task performance improvement compared to only predicting rewards. We also empirically find that image prediction accuracy, somewhat surprisingly, correlates more strongly with downstream task performance than reward prediction accuracy. We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks (that require exploration) will perform the same as the best-performing models when trained on the same training data. Simultaneously, in the absence of exploration, models that fit the data better usually perform better on the downstream task as well, but surprisingly, these are often not the same models that perform the best when learning and exploring from scratch. These findings suggest that performance and exploration place important and potentially contradictory requirements on the model.
Submitted 8 December, 2020;
originally announced December 2020.
-
Optimizing persistent homology based functions
Authors:
Mathieu Carrière,
Frédéric Chazal,
Marc Glisse,
Yuichi Ike,
Hariprasad Kannan
Abstract:
Solving optimization tasks based on functions and losses with a topological flavor is a very active, growing field of research in data science and Topological Data Analysis, with applications in non-convex optimization, statistics and machine learning. However, the approaches proposed in the literature are usually anchored to a specific application and/or topological construction, and do not come with theoretical guarantees. To address this issue, we study the differentiability of a general map associated with the most common topological construction, that is, the persistence map. Building on real analytic geometry arguments, we propose a general framework that allows us to define and compute gradients for persistence-based functions in a very simple way. We also provide a simple, explicit and sufficient condition for convergence of stochastic subgradient methods for such functions. This result encompasses all the constructions and applications of topological optimization in the literature. Finally, we provide associated code that is easy to handle and to mix with other non-topological methods and constraints, as well as some experiments showcasing the versatility of our approach.
Submitted 17 February, 2021; v1 submitted 16 October, 2020;
originally announced October 2020.
-
High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks
Authors:
Ruben Villegas,
Arkanath Pathak,
Harini Kannan,
Dumitru Erhan,
Quoc V. Le,
Honglak Lee
Abstract:
Predicting future video frames is extremely challenging, as there are many factors of variation that make up the dynamics of how frames change through time. Previously proposed solutions require complex inductive biases inside network architectures with highly specialized computation, including segmentation masks, optical flow, and foreground and background separation. In this work, we question if such handcrafted architectures are necessary and instead propose a different approach: finding minimal inductive bias for video prediction while maximizing network capacity. We investigate this question by performing the first large-scale empirical study and demonstrate state-of-the-art performance by learning large models on three different datasets: one for modeling object interactions, one for modeling human motion, and one for modeling car driving.
Submitted 5 November, 2019;
originally announced November 2019.
-
Persistent homology of unweighted complex networks via discrete Morse theory
Authors:
Harish Kannan,
Emil Saucan,
Indrava Roy,
Areejit Samal
Abstract:
Topological data analysis can reveal higher-order structure beyond pairwise connections between vertices in complex networks. We present a new method based on discrete Morse theory to study topological properties of unweighted and undirected networks using persistent homology. Leveraging the features of discrete Morse theory, our method not only captures the topology of the clique complex of such graphs via the concept of critical simplices, but also achieves close to the theoretical minimum number of critical simplices in several analyzed model and real networks. This leads to a reduced filtration scheme based on the subsequence of the corresponding critical weights, and thereby to a significant increase in computational efficiency. We have employed our filtration scheme to explore the persistent homology of several model and real-world networks. In particular, we show that our method can detect differences in the higher-order structure of networks, and the corresponding persistence diagrams can be used to distinguish between different model networks. In summary, our method based on discrete Morse theory further increases the applicability of persistent homology to investigate the global topology of complex networks.
Submitted 6 September, 2019; v1 submitted 2 January, 2019;
originally announced January 2019.
-
Adversarial Logit Pairing
Authors:
Harini Kannan,
Alexey Kurakin,
Ian Goodfellow
Abstract:
In this paper, we develop improved techniques for defending against adversarial examples at scale. First, we implement the state-of-the-art version of adversarial training at unprecedented scale on ImageNet and investigate whether it remains effective in this setting - an important open scientific question (Athalye et al., 2018). Next, we introduce enhanced defenses using a technique we call logit pairing, a method that encourages logits for pairs of examples to be similar. When applied to clean examples and their adversarial counterparts, logit pairing improves accuracy on adversarial examples over vanilla adversarial training; we also find that logit pairing on clean examples only is competitive with adversarial training in terms of accuracy on two datasets. Finally, we show that adversarial logit pairing achieves the state-of-the-art defense on ImageNet against PGD white box attacks, with an accuracy improvement from 1.5% to 27.9%. Adversarial logit pairing also successfully damages the current state-of-the-art defense against black box attacks on ImageNet (Tramer et al., 2018), dropping its accuracy from 66.6% to 47.1%. With this new accuracy drop, adversarial logit pairing ties with Tramer et al. (2018) for the state of the art on black box attacks on ImageNet.
Submitted 16 March, 2018;
originally announced March 2018.
-
Newton-type Methods for Inference in Higher-Order Markov Random Fields
Authors:
Hariprasad Kannan,
Nikos Komodakis,
Nikos Paragios
Abstract:
Linear programming relaxations are central to MAP inference in discrete Markov Random Fields. The ability to properly solve the Lagrangian dual is a critical component of such methods. In this paper, we study the benefit of using Newton-type methods to solve the Lagrangian dual of a smooth version of the problem. We investigate their ability to achieve superior convergence behavior and to better handle the ill-conditioned nature of the formulation, as compared to first order methods. We show that it is indeed possible to efficiently apply a trust region Newton method for a broad range of MAP inference problems. In this paper we propose a provably convergent and efficient framework that includes (i) an excellent compromise between computational complexity and precision concerning the Hessian matrix construction, (ii) a damping strategy that aids efficient optimization, (iii) a truncation strategy coupled with a generic pre-conditioner for Conjugate Gradients, and (iv) efficient sum-product computation for sparse clique potentials. Results for higher-order Markov Random Fields demonstrate the potential of this approach.
Submitted 5 September, 2017;
originally announced September 2017.
-
Eye Tracking for Everyone
Authors:
Kyle Krafka,
Aditya Khosla,
Petr Kellnhofer,
Harini Kannan,
Suchendra Bhandarkar,
Wojciech Matusik,
Antonio Torralba
Abstract:
From scientific research to commercial applications, eye tracking is an important tool across many domains. Despite its range of applications, eye tracking has yet to become a pervasive technology. We believe that we can put the power of eye tracking in everyone's palm by building eye tracking software that works on commodity hardware such as mobile phones and tablets, without the need for additional sensors or devices. We tackle this problem by introducing GazeCapture, the first large-scale dataset for eye tracking, containing data from over 1450 people consisting of almost 2.5M frames. Using GazeCapture, we train iTracker, a convolutional neural network for eye tracking, which achieves a significant reduction in error over previous approaches while running in real time (10-15fps) on a modern mobile device. Our model achieves a prediction error of 1.71cm and 2.53cm without calibration on mobile phones and tablets respectively. With calibration, this is reduced to 1.34cm and 2.12cm. Further, we demonstrate that the features learned by iTracker generalize well to other datasets, achieving state-of-the-art results. The code, data, and models are available at http://gazecapture.csail.mit.edu.
Submitted 18 June, 2016;
originally announced June 2016.