-
Quantum Circuit Optimization with AlphaTensor
Authors:
Francisco J. R. Ruiz,
Tuomas Laakkonen,
Johannes Bausch,
Matej Balog,
Mohammadamin Barekatain,
Francisco J. H. Heras,
Alexander Novikov,
Nathan Fitzpatrick,
Bernardino Romera-Paredes,
John van de Wetering,
Alhussein Fawzi,
Konstantinos Meichanetzidis,
Pushmeet Kohli
Abstract:
A key challenge in realizing fault-tolerant quantum computers is circuit optimization. Focusing on the most expensive gates in fault-tolerant quantum computation (namely, the T gates), we address the problem of T-count optimization, i.e., minimizing the number of T gates that are needed to implement a given circuit. To achieve this, we develop AlphaTensor-Quantum, a method based on deep reinforcement learning that exploits the relationship between optimizing T-count and tensor decomposition. Unlike existing methods for T-count optimization, AlphaTensor-Quantum can incorporate domain-specific knowledge about quantum computation and leverage gadgets, which significantly reduces the T-count of the optimized circuits. AlphaTensor-Quantum outperforms the existing methods for T-count optimization on a set of arithmetic benchmarks (even when compared without making use of gadgets). Remarkably, it discovers an efficient algorithm akin to Karatsuba's method for multiplication in finite fields. AlphaTensor-Quantum also finds the best human-designed solutions for relevant arithmetic computations used in Shor's algorithm and for quantum chemistry simulation, thus demonstrating it can save hundreds of hours of research by optimizing relevant quantum circuits in a fully automated way.
Submitted 5 March, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
On the generalization of the Kruskal-Szekeres coordinates: a global conformal charting of the Reissner-Nordstrom spacetime
Authors:
Ali Fawzi,
Dejan Stojkovic
Abstract:
The Kruskal-Szekeres coordinates construction for the Schwarzschild spacetime could be viewed geometrically as a squeezing of the $t$-line associated with the asymptotic observer into a single point, at the event horizon $r=2M$. Starting from this point, we extend the Kruskal charting to spacetimes with two horizons, in particular the Reissner-Nordström manifold, $\mathcal{M}_{RN}$. We develop a new method for constructing Kruskal-like coordinates and find two algebraically distinct classes charting $\mathcal{M}_{RN}$. We pedagogically illustrate our method by constructing two compact, conformal, and global coordinate systems labeled $\mathcal{GK_{I}}$ and $\mathcal{GK_{II}}$ for each class respectively. In both coordinates, the metric differentiability can be promoted to $C^\infty$. The conformal metric factor can be explicitly written in terms of the original $t$ and $r$ coordinates for both charts.
Submitted 31 December, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Determining rigid body motion from accelerometer data through the square-root of a negative semi-definite tensor, with applications in mild traumatic brain injury
Authors:
Yang Wan,
Alice Lux Fawzi,
Haneesh Kesari
Abstract:
Mild Traumatic Brain Injuries (mTBI) are caused by violent head motions or impacts. Most mTBI prevention strategies explicitly or implicitly rely on a "brain injury criterion". A brain injury criterion takes some descriptor of the head's motion as input and yields a prediction for that motion's potential for causing mTBI as the output. The inputs are descriptors of the head's motion that are usually synthesized from accelerometer and gyroscope data. In the context of brain injury criterion the head is modeled as a rigid body. We present an algorithm for determining the complete motion of the head using data from only four head mounted tri-axial accelerometers. In contrast to inertial measurement unit based algorithms for determining rigid body motion the presented algorithm does not depend on data from gyroscopes; which consume much more power than accelerometers. Several algorithms that also make use of data from only accelerometers already exist. However, those algorithms, except for the recently presented AO-algorithm [Rahaman MM, Fang W, Fawzi AL, Wan Y, Kesari H (2020): J Mech Phys Solids 104014], give the rigid body's acceleration field in terms of the body frame, which in general is unknown. Compared to the AO-algorithm the presented algorithm is much more insensitive to bias type errors, such as those that arise from inaccurate measurement of sensor positions and orientations.
Submitted 25 January, 2021;
originally announced January 2021.
-
An accelerometer-only algorithm for determining the acceleration field of a rigid body, with application in studying the mechanics of mild Traumatic Brain Injury
Authors:
Mohammad Masiur Rahaman,
Wenqiang Fang,
Alice Lux Fawzi,
Yang Wan,
Haneesh Kesari
Abstract:
We present an algorithm for determining the acceleration field of a rigid body using measurements from four tri-axial accelerometers. The acceleration field is an important quantity in bio-mechanics problems, especially in the study of mild Traumatic Brain Injury (mTBI). The in vivo strains in the brain, which are hypothesized to closely correlate with brain injury, are generally not directly accessible outside of a laboratory setting. However, they can be estimated on knowing the head's acceleration field. In contrast to other techniques, the proposed algorithm uses data exclusively from accelerometers, rather than from a combination of accelerometers and gyroscopes. For that reason, the proposed accelerometer only (AO) algorithm does not involve any numerical differentiation of data, which is known to greatly amplify measurement noise. For applications where only the magnitude of the acceleration vector is of interest, the algorithm is straightforward, computationally efficient and does not require computation of angular velocity or orientation. When both the magnitude and direction of acceleration are of interest, the proposed algorithm involves the calculation of the angular velocity and orientation as intermediate steps. In addition to helping understand the mechanics of mTBI, the AO-algorithm may find widespread use in several bio-mechanical applications, gyroscope-free inertial navigation units, ballistic platform guidance, and platform control.
Submitted 21 November, 2019;
originally announced November 2019.
-
Adversarial Robustness through Local Linearization
Authors:
Chongli Qin,
James Martens,
Sven Gowal,
Dilip Krishnan,
Krishnamurthy Dvijotham,
Alhussein Fawzi,
Soham De,
Robert Stanforth,
Pushmeet Kohli
Abstract:
Adversarial training is an effective methodology for training deep neural networks that are robust against adversarial, norm-bounded perturbations. However, the computational cost of adversarial training grows prohibitively as the size of the model and number of input dimensions increase. Further, training against less expensive and therefore weaker adversaries produces models that are robust against weak attacks but break down under attacks that are stronger. This is often attributed to the phenomenon of gradient obfuscation; such models have a highly non-linear loss surface in the vicinity of training examples, making it hard for gradient-based attacks to succeed even though adversarial examples still exist. In this work, we introduce a novel regularizer that encourages the loss to behave linearly in the vicinity of the training data, thereby penalizing gradient obfuscation while encouraging robustness. We show via extensive experiments on CIFAR-10 and ImageNet, that models trained with our regularizer avoid gradient obfuscation and can be trained significantly faster than adversarial training. Using this regularizer, we exceed current state of the art and achieve 47% adversarial accuracy for ImageNet with l-infinity adversarial perturbations of radius 4/255 under an untargeted, strong, white-box attack. Additionally, we match state of the art results for CIFAR-10 at 8/255.
Submitted 10 October, 2019; v1 submitted 4 July, 2019;
originally announced July 2019.
-
Learning dynamic polynomial proofs
Authors:
Alhussein Fawzi,
Mateusz Malinowski,
Hamza Fawzi,
Omar Fawzi
Abstract:
Polynomial inequalities lie at the heart of many mathematical disciplines. In this paper, we consider the fundamental computational task of automatically searching for proofs of polynomial inequalities. We adopt the framework of semi-algebraic proof systems that manipulate polynomial inequalities via elementary inference rules that infer new inequalities from the premises. These proof systems are known to be very powerful, but searching for proofs remains a major difficulty. In this work, we introduce a machine learning based method to search for a dynamic proof within these proof systems. We propose a deep reinforcement learning framework that learns an embedding of the polynomials and guides the choice of inference rules, taking the inherent symmetries of the problem as an inductive bias. We compare our approach with powerful and widely-studied linear programming hierarchies based on static proof systems, and show that our method reduces the size of the linear program by several orders of magnitude while also improving performance. These results hence pave the way towards augmenting powerful and well-studied semi-algebraic proof systems with machine learning guiding strategies for enhancing the expressivity of such proof systems.
Submitted 4 June, 2019;
originally announced June 2019.
-
Are Labels Required for Improving Adversarial Robustness?
Authors:
Jonathan Uesato,
Jean-Baptiste Alayrac,
Po-Sen Huang,
Robert Stanforth,
Alhussein Fawzi,
Pushmeet Kohli
Abstract:
Recent work has uncovered the interesting (and somewhat surprising) finding that training models to be invariant to adversarial perturbations requires substantially larger datasets than those required for standard classification. This result is a key hurdle in the deployment of robust machine learning models in many real world applications where labeled data is expensive. Our main insight is that unlabeled data can be a competitive alternative to labeled data for training adversarially robust models. Theoretically, we show that in a simple statistical setting, the sample complexity for learning an adversarially robust model from unlabeled data matches the fully supervised case up to constant factors. On standard datasets like CIFAR-10, a simple Unsupervised Adversarial Training (UAT) approach using unlabeled data improves robust accuracy by 21.7% over using 4K supervised examples alone, and captures over 95% of the improvement from the same number of labeled examples. Finally, we report an improvement of 4% over the previous state-of-the-art on CIFAR-10 against the strongest known attack by using additional unlabeled data from the uncurated 80 Million Tiny Images dataset. This demonstrates that our finding extends as well to the more realistic case where unlabeled data is also uncurated, therefore opening a new avenue for improving adversarial training.
Submitted 5 December, 2019; v1 submitted 31 May, 2019;
originally announced May 2019.
-
Verification of deep probabilistic models
Authors:
Krishnamurthy Dvijotham,
Marta Garnelo,
Alhussein Fawzi,
Pushmeet Kohli
Abstract:
Probabilistic models are a critical part of the modern deep learning toolbox - ranging from generative models (VAEs, GANs), sequence to sequence models used in machine translation and speech processing to models over functional spaces (conditional neural processes, neural processes). Given the size and complexity of these models, safely deploying them in applications requires the development of tools to analyze their behavior rigorously and provide some guarantees that these models are consistent with a list of desirable properties or specifications. For example, a machine translation model should produce semantically equivalent outputs for innocuous changes in the input to the model. A functional regression model that is learning a distribution over monotonic functions should predict a larger value at a larger input. Verification of these properties requires a new framework that goes beyond notions of verification studied in deterministic feedforward networks, since requiring worst-case guarantees in probabilistic models is likely to produce conservative or vacuous results. We propose a novel formulation of verification for deep probabilistic models that take in conditioning inputs and sample latent variables in the course of producing an output: We require that the output of the model satisfies a linear constraint with high probability over the sampling of latent variables and for every choice of conditioning input to the model. We show that rigorous lower bounds on the probability that the constraint is satisfied can be obtained efficiently. Experiments with neural processes show that several properties of interest while modeling functional spaces can be modeled within this framework (monotonicity, convexity) and verified efficiently using our algorithms.
Submitted 6 December, 2018;
originally announced December 2018.
-
Robustness via curvature regularization, and vice versa
Authors:
Seyed-Mohsen Moosavi-Dezfooli,
Alhussein Fawzi,
Jonathan Uesato,
Pascal Frossard
Abstract:
State-of-the-art classifiers have been shown to be largely vulnerable to adversarial perturbations. One of the most effective strategies to improve robustness is adversarial training. In this paper, we investigate the effect of adversarial training on the geometry of the classification landscape and decision boundaries. We show in particular that adversarial training leads to a significant decrease in the curvature of the loss surface with respect to inputs, leading to a drastically more "linear" behaviour of the network. Using a locally quadratic approximation, we provide theoretical evidence on the existence of a strong relation between large robustness and small curvature. To further show the importance of reduced curvature for improving the robustness, we propose a new regularizer that directly minimizes curvature of the loss surface, and leads to adversarial robustness that is on par with adversarial training. Besides being a more efficient and principled alternative to adversarial training, the proposed regularizer confirms our claims on the importance of exhibiting quasi-linear behavior in the vicinity of data points in order to achieve robustness.
Submitted 23 November, 2018;
originally announced November 2018.
-
SaaS: Speed as a Supervisor for Semi-supervised Learning
Authors:
Safa Cicek,
Alhussein Fawzi,
Stefano Soatto
Abstract:
We introduce the SaaS Algorithm for semi-supervised learning, which uses learning speed during stochastic gradient descent in a deep neural network to measure the quality of an iterative estimate of the posterior probability of unknown labels. Training speed in supervised learning correlates strongly with the percentage of correct labels, so we use it as an inference criterion for the unknown labels, without attempting to infer the model parameters at first. Despite its simplicity, SaaS achieves state-of-the-art results in semi-supervised learning benchmarks.
Submitted 2 May, 2018;
originally announced May 2018.
-
Adversarial vulnerability for any classifier
Authors:
Alhussein Fawzi,
Hamza Fawzi,
Omar Fawzi
Abstract:
Despite achieving impressive performance, state-of-the-art classifiers remain highly vulnerable to small, imperceptible, adversarial perturbations. This vulnerability has proven empirically to be very intricate to address. In this paper, we study the phenomenon of adversarial perturbations under the assumption that the data is generated with a smooth generative model. We derive fundamental upper bounds on the robustness to perturbations of any classification function, and prove the existence of adversarial perturbations that transfer well across different classifiers with small risk. Our analysis of the robustness also provides insights onto key properties of generative models, such as their smoothness and dimensionality of latent space. We conclude with numerical experimental results showing that our bounds provide informative baselines to the maximal achievable robustness on several datasets.
Submitted 30 November, 2018; v1 submitted 23 February, 2018;
originally announced February 2018.
-
Robustness of classifiers to uniform $\ell_p$ and Gaussian noise
Authors:
Jean-Yves Franceschi,
Alhussein Fawzi,
Omar Fawzi
Abstract:
We study the robustness of classifiers to various kinds of random noise models. In particular, we consider noise drawn uniformly from the $\ell_p$ ball for $p \in [1, \infty]$ and Gaussian noise with an arbitrary covariance matrix. We characterize this robustness to random noise in terms of the distance to the decision boundary of the classifier. This analysis applies to linear classifiers as well as classifiers with locally approximately flat decision boundaries, a condition which is satisfied by state-of-the-art deep neural networks. The predicted robustness is verified experimentally.
Submitted 22 February, 2018;
originally announced February 2018.
-
Robustness of classifiers to universal perturbations: a geometric perspective
Authors:
Seyed-Mohsen Moosavi-Dezfooli,
Alhussein Fawzi,
Omar Fawzi,
Pascal Frossard,
Stefano Soatto
Abstract:
Deep networks have recently been shown to be vulnerable to universal perturbations: there exist very small image-agnostic perturbations that cause most natural images to be misclassified by such classifiers. In this paper, we propose the first quantitative analysis of the robustness of classifiers to universal perturbations, and draw a formal link between the robustness to universal perturbations, and the geometry of the decision boundary. Specifically, we establish theoretical bounds on the robustness of classifiers under two decision boundary models (flat and curved models). We show in particular that the robustness of deep networks to universal perturbations is driven by a key property of their curvature: there exists shared directions along which the decision boundary of deep networks is systematically positively curved. Under such conditions, we prove the existence of small universal perturbations. Our analysis further provides a novel geometric method for computing universal perturbations, in addition to explaining their properties.
Submitted 1 March, 2021; v1 submitted 26 May, 2017;
originally announced May 2017.
-
Classification regions of deep neural networks
Authors:
Alhussein Fawzi,
Seyed-Mohsen Moosavi-Dezfooli,
Pascal Frossard,
Stefano Soatto
Abstract:
The goal of this paper is to analyze the geometric properties of deep neural network classifiers in the input space. We specifically study the topology of classification regions created by deep networks, as well as their associated decision boundary. Through a systematic empirical investigation, we show that state-of-the-art deep nets learn connected classification regions, and that the decision boundary in the vicinity of datapoints is flat along most directions. We further draw an essential connection between two seemingly unrelated properties of deep networks: their sensitivity to additive perturbations in the inputs, and the curvature of their decision boundary. The directions where the decision boundary is curved in fact remarkably characterize the directions to which the classifier is the most vulnerable. We finally leverage a fundamental asymmetry in the curvature of the decision boundary of deep nets, and propose a method to discriminate between original images, and images perturbed with small adversarial examples. We show the effectiveness of this purely geometric approach for detecting small adversarial perturbations in images, and for recovering the labels of perturbed images.
Submitted 26 May, 2017;
originally announced May 2017.
-
Universal adversarial perturbations
Authors:
Seyed-Mohsen Moosavi-Dezfooli,
Alhussein Fawzi,
Omar Fawzi,
Pascal Frossard
Abstract:
Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to such perturbations, albeit being quasi-imperceptible to the human eye. We further empirically analyze these universal perturbations and show, in particular, that they generalize very well across neural networks. The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundary of classifiers. It further outlines potential security breaches with the existence of single directions in the input space that adversaries can possibly exploit to break a classifier on most natural images.
Submitted 9 March, 2017; v1 submitted 26 October, 2016;
originally announced October 2016.
-
Robustness of classifiers: from adversarial to random noise
Authors:
Alhussein Fawzi,
Seyed-Mohsen Moosavi-Dezfooli,
Pascal Frossard
Abstract:
Several recent works have shown that state-of-the-art classifiers are vulnerable to worst-case (i.e., adversarial) perturbations of the datapoints. On the other hand, it has been empirically observed that these same classifiers are relatively robust to random noise. In this paper, we propose to study a \textit{semi-random} noise regime that generalizes both the random and worst-case noise regimes. We propose the first quantitative analysis of the robustness of nonlinear classifiers in this general noise regime. We establish precise theoretical bounds on the robustness of classifiers in this general regime, which depend on the curvature of the classifier's decision boundary. Our bounds confirm and quantify the empirical observations that classifiers satisfying curvature constraints are robust to random noise. Moreover, we quantify the robustness of classifiers in terms of the subspace dimension in the semi-random noise regime, and show that our bounds remarkably interpolate between the worst-case and random noise regimes. We perform experiments and show that the derived bounds provide very accurate estimates when applied to various state-of-the-art deep neural networks and datasets. This result suggests bounds on the curvature of the classifiers' decision boundaries that we support experimentally, and more generally offers important insights onto the geometry of high dimensional classification problems.
Submitted 31 August, 2016;
originally announced August 2016.
-
DeepFool: a simple and accurate method to fool deep neural networks
Authors:
Seyed-Mohsen Moosavi-Dezfooli,
Alhussein Fawzi,
Pascal Frossard
Abstract:
State-of-the-art deep neural networks have achieved impressive results on many image classification tasks. However, these same architectures have been shown to be unstable to small, well sought, perturbations of the images. Despite the importance of this phenomenon, no effective methods have been proposed to accurately compute the robustness of state-of-the-art deep classifiers to such perturbations on large-scale datasets. In this paper, we fill this gap and propose the DeepFool algorithm to efficiently compute perturbations that fool deep networks, and thus reliably quantify the robustness of these classifiers. Extensive experimental results show that our approach outperforms recent methods in the task of computing adversarial perturbations and making classifiers more robust.
Submitted 4 July, 2016; v1 submitted 14 November, 2015;
originally announced November 2015.
-
Manitest: Are classifiers really invariant?
Authors:
Alhussein Fawzi,
Pascal Frossard
Abstract:
Invariance to geometric transformations is a highly desirable property of automatic classifiers in many image recognition tasks. Nevertheless, it is unclear to which extent state-of-the-art classifiers are invariant to basic transformations such as rotations and translations. This is mainly due to the lack of general methods that properly measure such an invariance. In this paper, we propose a rigorous and systematic approach for quantifying the invariance to geometric transformations of any classifier. Our key idea is to cast the problem of assessing a classifier's invariance as the computation of geodesics along the manifold of transformed images. We propose the Manitest method, built on the efficient Fast Marching algorithm to compute the invariance of classifiers. Our new method quantifies in particular the importance of data augmentation for learning invariance from data, and the increased invariance of convolutional neural networks with depth. We foresee that the proposed generic tool for measuring invariance to a large class of geometric transformations and arbitrary classifiers will have many applications for evaluating and comparing classifiers based on their invariance, and help improving the invariance of existing classifiers.
Submitted 23 July, 2015;
originally announced July 2015.
-
Multi-task additive models with shared transfer functions based on dictionary learning
Authors:
Alhussein Fawzi,
Mathieu Sinn,
Pascal Frossard
Abstract:
Additive models form a widely popular class of regression models which represent the relation between covariates and response variables as the sum of low-dimensional transfer functions. Besides flexibility and accuracy, a key benefit of these models is their interpretability: the transfer functions provide visual means for inspecting the models and identifying domain-specific relations between inputs and outputs. However, in large-scale problems involving the prediction of many related tasks, learning independently additive models results in a loss of model interpretability, and can cause overfitting when training data is scarce. We introduce a novel multi-task learning approach which provides a corpus of accurate and interpretable additive models for a large number of related forecasting tasks. Our key idea is to share transfer functions across models in order to reduce the model complexity and ease the exploration of the corpus. We establish a connection with sparse dictionary learning and propose a new efficient fitting algorithm which alternates between sparse coding and transfer function updates. The former step is solved via an extension of Orthogonal Matching Pursuit, whose properties are analyzed using a novel recovery condition which extends existing results in the literature. The latter step is addressed using a traditional dictionary update rule. Experiments on real-world data demonstrate that our approach compares favorably to baseline methods while yielding an interpretable corpus of models, revealing structure among the individual tasks and being more robust when training data is scarce. Our framework therefore extends the well-known benefits of additive models to common regression settings possibly involving thousands of tasks.
Submitted 19 May, 2015;
originally announced May 2015.
-
Analysis of classifiers' robustness to adversarial perturbations
Authors:
Alhussein Fawzi,
Omar Fawzi,
Pascal Frossard
Abstract:
The goal of this paper is to analyze an intriguing phenomenon recently discovered in deep networks, namely their instability to adversarial perturbations (Szegedy et al., 2014). We provide a theoretical framework for analyzing the robustness of classifiers to adversarial perturbations, and show fundamental upper bounds on the robustness of classifiers. Specifically, we establish a general upper bound on the robustness of classifiers to adversarial perturbations, and then illustrate the obtained upper bound on the families of linear and quadratic classifiers. In both cases, our upper bound depends on a distinguishability measure that captures the notion of difficulty of the classification task. Our results for both classes imply that in tasks involving small distinguishability, no classifier in the considered set will be robust to adversarial perturbations, even if a good accuracy is achieved. Our theoretical framework moreover suggests that the phenomenon of adversarial instability is due to the low flexibility of classifiers, compared to the difficulty of the classification task (captured by the distinguishability). Moreover, we show the existence of a clear distinction between the robustness of a classifier to random noise and its robustness to adversarial perturbations. Specifically, the former is shown to be larger than the latter by a factor that is proportional to \sqrt{d} (with d being the signal dimension) for linear classifiers. This result gives a theoretical explanation for the discrepancy between the two robustness properties in high dimensional problems, which was empirically observed in the context of neural networks. To the best of our knowledge, our results provide the first theoretical work that addresses the phenomenon of adversarial instability recently observed for deep networks. Our analysis is complemented by experimental results on controlled and real-world data.
Submitted 28 March, 2016; v1 submitted 9 February, 2015;
originally announced February 2015.
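For the linear classifiers mentioned in the abstract above, the smallest L2 perturbation that moves a point onto the decision boundary has a well-known closed form: for f(x) = w·x + b it is r = -f(x) w / ||w||^2, with norm |f(x)| / ||w||. The sketch below only illustrates that quantity. It is not the paper's upper bound or its distinguishability measure, and the function name `minimal_perturbation` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def minimal_perturbation(
    w: Sequence[float], b: float, x: Sequence[float]
) -> Tuple[List[float], float]:
    """Smallest L2 perturbation moving x onto the hyperplane w.x + b = 0.

    Returns the perturbation vector r and its Euclidean norm.
    Assumes w is not the zero vector.
    """
    f = sum(wi * xi for wi, xi in zip(w, x)) + b   # classifier score f(x)
    norm_sq = sum(wi * wi for wi in w)             # ||w||^2
    r = [-f * wi / norm_sq for wi in w]            # orthogonal projection step
    return r, abs(f) / math.sqrt(norm_sq)


# Usage: a 2-D linear classifier and one data point.
w, b, x = [3.0, 4.0], 0.0, [1.0, 1.0]
r, dist = minimal_perturbation(w, b, x)
x_adv = [xi + ri for xi, ri in zip(x, r)]          # lies on the boundary
```

The returned norm is the adversarial robustness of this particular point under the linear model; the abstract's \sqrt{d} statement concerns how this quantity compares with robustness to random noise.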
-
Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)
Authors:
L. Jacques,
C. De Vleeschouwer,
Y. Boursier,
P. Sudhakar,
C. De Mol,
A. Pizurica,
S. Anthoine,
P. Vandergheynst,
P. Frossard,
C. Bilen,
S. Kitic,
N. Bertin,
R. Gribonval,
N. Boumal,
B. Mishra,
P. -A. Absil,
R. Sepulchre,
S. Bundervoet,
C. Schretter,
A. Dooms,
P. Schelkens,
O. Chabiron,
F. Malgouyres,
J. -Y. Tourneret,
N. Dobigeon,
et al. (42 additional authors not shown)
Abstract:
The implicit objective of the biennial "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 has gathered about 70 international participants and has featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference.
Submitted 9 October, 2014; v1 submitted 2 October, 2014;
originally announced October 2014.
-
Dictionary learning for fast classification based on soft-thresholding
Authors:
Alhussein Fawzi,
Mike Davies,
Pascal Frossard
Abstract:
Classifiers based on sparse representations have recently been shown to provide excellent results in many visual recognition and classification tasks. However, the high cost of computing sparse representations at test time is a major obstacle that limits the applicability of these methods in large-scale problems, or in scenarios where computational power is restricted. We consider in this paper a simple yet efficient alternative to sparse coding for feature extraction. We study a classification scheme that applies the soft-thresholding nonlinear mapping in a dictionary, followed by a linear classifier. A novel supervised dictionary learning algorithm tailored for this low complexity classification architecture is proposed. The dictionary learning problem, which jointly learns the dictionary and linear classifier, is cast as a difference of convex (DC) program and solved efficiently with an iterative DC solver. We conduct experiments on several datasets, and show that our learning algorithm that leverages the structure of the classification problem outperforms generic learning procedures. Our simple classifier based on soft-thresholding also competes with the recent sparse coding classifiers, when the dictionary is learned appropriately. The adopted classification scheme further requires less computational time at the testing stage, compared to other classifiers. The proposed scheme shows the potential of the adequately trained soft-thresholding mapping for classification and paves the way towards the development of very efficient classification methods for vision problems.
Submitted 2 October, 2014; v1 submitted 9 February, 2014;
originally announced February 2014.
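The classification scheme described in the abstract above (soft-thresholding applied to dictionary correlations, followed by a linear classifier) can be sketched at test time as follows. This is a minimal illustration under assumptions: it uses the standard two-sided soft-thresholding operator, which may differ from the exact variant in the paper, and the names `soft_threshold`, `features`, and `classify` are hypothetical. The dictionary `D`, weights `w`, bias `b`, and threshold `alpha` are taken as already learned; the DC-program training step is not shown.

```python
from typing import List, Sequence


def soft_threshold(z: float, alpha: float) -> float:
    """Standard soft-thresholding: shrink z toward zero by alpha."""
    if z > alpha:
        return z - alpha
    if z < -alpha:
        return z + alpha
    return 0.0


def features(
    D: Sequence[Sequence[float]], x: Sequence[float], alpha: float
) -> List[float]:
    """Soft-thresholded inner products between x and each dictionary atom."""
    return [
        soft_threshold(sum(d_i * x_i for d_i, x_i in zip(atom, x)), alpha)
        for atom in D
    ]


def classify(
    D: Sequence[Sequence[float]],
    w: Sequence[float],
    b: float,
    x: Sequence[float],
    alpha: float,
) -> int:
    """Binary label in {-1, +1} from a linear classifier on the features."""
    score = sum(w_j * f_j for w_j, f_j in zip(w, features(D, x, alpha))) + b
    return 1 if score >= 0 else -1


# Usage with a toy two-atom dictionary (illustrative values only).
D = [[1.0, 0.0], [0.0, 1.0]]
label = classify(D, w=[1.0, -1.0], b=-0.5, x=[2.0, -0.5], alpha=1.0)
```

Feature extraction here costs one matrix-vector product and an elementwise threshold, which is consistent with the low test-time cost the abstract contrasts against sparse coding.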
-
Image registration with sparse approximations in parametric dictionaries
Authors:
Alhussein Fawzi,
Pascal Frossard
Abstract:
We examine in this paper the problem of image registration from the new perspective where images are given by sparse approximations in parametric dictionaries of geometric functions. We propose a registration algorithm that looks for an estimate of the global transformation between sparse images by examining the set of relative geometrical transformations between the respective features. We propose a theoretical analysis of our registration algorithm and we derive performance guarantees based on two novel important properties of redundant dictionaries, namely the robust linear independence and the transformation inconsistency. We propose several illustrations and insights about the importance of these dictionary properties and show that common properties such as coherence or restricted isometry property fail to provide sufficient information in registration problems. We finally show with illustrative experiments on simple visual objects and handwritten digits images that our algorithm outperforms baseline competitor methods in terms of transformation-invariant distance computation and classification.
Submitted 4 July, 2013; v1 submitted 28 January, 2013;
originally announced January 2013.
-
Thresholding-based reconstruction of compressed correlated signals
Authors:
Alhussein Fawzi,
Tamara Tosic,
Pascal Frossard
Abstract:
We consider the problem of recovering a set of correlated signals (e.g., images from different viewpoints) from a few linear measurements per signal. We assume that each sensor in a network acquires a compressed signal in the form of linear measurements and sends it to a joint decoder for reconstruction. We propose a novel joint reconstruction algorithm that exploits correlation among underlying signals. Our correlation model considers geometrical transformations between the supports of the different signals. The proposed joint decoder estimates the correlation and reconstructs the signals using a simple thresholding algorithm. We give both theoretical and experimental evidence to show that our method largely outperforms independent decoding in terms of support recovery and reconstruction quality.
Submitted 2 April, 2012; v1 submitted 27 September, 2011;
originally announced September 2011.