Search | arXiv e-print repository

Addressing Sample Inefficiency in Multi-View Representation Learning

Authors: Kumar Krishna Agrawal, Arna Ghosh, Adam Oberman, Blake Richards

Abstract: Non-contrastive self-supervised learning (NC-SSL) methods like BarlowTwins and VICReg have shown great promise for label-free representation learning in computer vision. Despite the apparent simplicity of these techniques, researchers must rely on several empirical heuristics to achieve competitive performance, most notably using high-dimensional projector heads and two augmentations of the same image. In this work, we provide theoretical insights on the implicit bias of the BarlowTwins and VICReg loss that can explain these heuristics and guide the development of more principled recommendations. Our first insight is that the orthogonality of the features is more critical than projector dimensionality for learning good representations. Based on this, we empirically demonstrate that low-dimensional projector heads are sufficient with appropriate regularization, contrary to the existing heuristic. Our second theoretical insight suggests that using multiple data augmentations better represents the desiderata of the SSL objective. Based on this, we demonstrate that leveraging more augmentations per sample improves representation quality and trainability. In particular, it improves optimization convergence, leading to better features emerging earlier in the training. Remarkably, we demonstrate that we can reduce the pretraining dataset size by up to 4x while maintaining accuracy and improving convergence simply by using more data augmentations.
Combining these insights, we present practical pretraining recommendations that improve wall-clock time by 2x and improve performance on CIFAR-10/STL-10 datasets using a ResNet-50 backbone. Thus, this work provides a theoretical insight into NC-SSL and produces practical recommendations for enhancing its sample and compute efficiency.

Submitted 17 December, 2023; originally announced December 2023.

arXiv:2212.11803 [pdf, other]

EuclidNets: An Alternative Operation for Efficient Inference of Deep Learning Models

Authors: Xinlin Li, Mariana Parazeres, Adam Oberman, Alireza Ghaffari, Masoud Asgharian, Vahid Partovi Nia

Abstract: With the advent of deep learning applications on edge devices, researchers actively try to optimize their deployments on low-power and restricted memory devices. There are established compression methods such as quantization, pruning, and architecture search that leverage commodity hardware. Apart from conventional compression algorithms, one may redesign the operations of deep learning models in ways that lead to more efficient implementations. To this end, we propose EuclidNet, a compression method, designed to be implemented on hardware, which replaces multiplication, $xw$, with Euclidean distance $(x-w)^2$. We show that EuclidNet is aligned with matrix multiplication and that it can be used as a measure of similarity in the case of convolutional layers. Furthermore, we show that under various transformations and noise scenarios, EuclidNet exhibits the same performance as deep learning models designed with multiplication operations.

Submitted 22 December, 2022; originally announced December 2022.

ACM Class: I.2.6
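
The EuclidNets abstract above says that replacing the product $xw$ with the squared distance $(x-w)^2$ stays aligned with multiplication. One way to see why this is plausible is the elementary identity $-\|x-w\|^2 = 2\,x\cdot w - \|x\|^2 - \|w\|^2$: up to norm terms, the negative squared distance ranks inputs like the dot product. The sketch below only checks that identity numerically; the function names are hypothetical and this is not the authors' implementation.

```python
# Illustrative check of  -||x - w||^2 = 2 x.w - ||x||^2 - ||w||^2.
# All names here are hypothetical; this is not code from the paper.

def dot_similarity(x, w):
    """Standard multiplication-based similarity: sum of x_i * w_i."""
    return sum(xi * wi for xi, wi in zip(x, w))

def euclid_similarity(x, w):
    """Distance-based similarity: negative squared Euclidean distance."""
    return -sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def sq_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(vi * vi for vi in v)

x = [1.0, 2.0, -1.0]
w = [0.5, -1.0, 2.0]

lhs = euclid_similarity(x, w)
rhs = 2.0 * dot_similarity(x, w) - sq_norm(x) - sq_norm(w)
print(lhs, rhs)  # the two values agree
```

When inputs and weights have fixed norms, the two similarities differ only by a constant offset and a factor of 2, which is the sense in which the two operations can be compared.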

arXiv:2210.12254 [pdf, other]

Score-based Denoising Diffusion with Non-Isotropic Gaussian Noise Models

Authors: Vikram Voleti, Christopher Pal, Adam Oberman

Abstract: Generative models based on denoising diffusion techniques have led to an unprecedented increase in the quality and diversity of imagery that is now possible to create with neural generative models. However, most contemporary state-of-the-art methods are derived from a standard isotropic Gaussian formulation. In this work we examine the situation where non-isotropic Gaussian distributions are used. We present the key mathematical derivations for creating denoising diffusion models using an underlying non-isotropic Gaussian noise model. We also provide initial experiments with the CIFAR-10 dataset to help verify empirically that this more general modeling approach can also yield high-quality samples.

Submitted 22 November, 2022; v1 submitted 21 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022 Workshop ; 4 pages, 1 page of references, 18 pages of appendix, 2 figures

Journal ref: NeurIPS 2022 Workshop on Score-Based Methods

arXiv:2210.01210 [pdf, ps, other]

A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods

Authors: Tiago Salvador, Kilian Fatras, Ioannis Mitliagkas, Adam Oberman

Abstract: Unsupervised Domain Adaptation (UDA) aims at classifying unlabeled target images leveraging source labeled ones. In this work, we consider the Partial Domain Adaptation (PDA) variant, where we have extra source classes not present in the target domain. Most successful algorithms use model selection strategies that rely on target labels to find the best hyper-parameters and/or models during training. However, these strategies violate the main assumption in PDA: only unlabeled target domain samples are available. Moreover, there are also inconsistencies in the experimental settings (architecture, hyper-parameter tuning, number of runs), yielding unfair comparisons. The main goal of this work is to provide a realistic evaluation of PDA methods with the different model selection strategies under a consistent evaluation protocol. We evaluate 7 representative PDA algorithms on 2 different real-world datasets using 7 different model selection strategies. Our two main findings are: (i) without target labels for model selection, the accuracy of the methods decreases by up to 30 percentage points; (ii) only one method and model selection pair performs well on both datasets. Experiments were performed with our PyTorch framework, BenchmarkPDA, which we open source.

Submitted 3 October, 2022; originally announced October 2022.

Comments: 17 pages, 13 tables

arXiv:2203.00543 [pdf, other]

On the Generalization of Representations in Reinforcement Learning

Authors: Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare

Abstract: In reinforcement learning, state representations are used to tractably deal with large problem spaces. State representations serve both to approximate the value function with few parameters and to generalize to newly encountered states. Their features may be learned implicitly (as part of a neural network) or explicitly (for example, the successor representation of Dayan, 1993). While the approximation properties of representations are reasonably well-understood, a precise characterization of how and when these representations generalize is lacking. In this work, we address this gap and provide an informative bound on the generalization error arising from a specific state representation. This bound is based on the notion of effective dimension which measures the degree to which knowing the value at one state informs the value at other states. Our bound applies to any state representation and quantifies the natural tension between representations that generalize well and those that approximate well. We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.

Submitted 1 March, 2022; originally announced March 2022.

Comments: Accepted at AISTATS22

arXiv:2106.08462 [pdf, other]

Multi-Resolution Continuous Normalizing Flows

Authors: Vikram Voleti, Chris Finlay, Adam Oberman, Christopher Pal

Abstract: Recent work has shown that Neural Ordinary Differential Equations (ODEs) can serve as generative models of images using the perspective of Continuous Normalizing Flows (CNFs). Such models offer exact likelihood calculation, and invertible generation/density estimation. In this work we introduce a Multi-Resolution variant of such models (MRCNF), by characterizing the conditional distribution over the additional information required to generate a fine image that is consistent with the coarse image. We introduce a transformation between resolutions that allows for no change in the log likelihood. We show that this approach yields comparable likelihood values for various image datasets, with improved performance at higher resolutions, with fewer parameters, using only 1 GPU. Further, we examine the out-of-distribution properties of (Multi-Resolution) Continuous Normalizing Flows, and find that they are similar to those of other likelihood-based generative models.

Submitted 5 October, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: 10 pages, 5 figures, 3 tables, 18 equations

arXiv:2106.03762 [pdf, other]

Frustratingly Easy Uncertainty Estimation for Distribution Shift

Authors: Tiago Salvador, Vikram Voleti, Alexander Iannantuono, Adam Oberman

Abstract: Distribution shift is an important concern in deep image classification, produced either by corruption of the source images, or a complete change, with the solution involving domain adaptation. While the primary goal is to improve accuracy under distribution shift, an important secondary goal is uncertainty estimation: evaluating the probability that the prediction of a model is correct. While improving accuracy is hard, uncertainty estimation turns out to be frustratingly easy. Prior works have appended uncertainty estimation into the model and training paradigm in various ways. Instead, we show that we can estimate uncertainty by simply exposing the original model to corrupted images, and performing simple statistical calibration on the image outputs. Our frustratingly easy methods demonstrate superior performance on a wide range of distribution shifts as well as on unsupervised domain adaptation tasks, measured through extensive experimentation.

Submitted 17 October, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: 17 pages, 4 Tables, 9 Figures

arXiv:2106.03761 [pdf, other]

FairCal: Fairness Calibration for Face Verification

Authors: Tiago Salvador, Stephanie Cairns, Vikram Voleti, Noah Marshall, Adam Oberman

Abstract: Despite being widely used, face recognition models suffer from bias: the probability of a false positive (incorrect face match) strongly depends on sensitive attributes such as the ethnicity of the face. As a result, these models can disproportionately and negatively impact minority groups, particularly when used by law enforcement. The majority of bias reduction methods have several drawbacks: they use an end-to-end retraining approach, may not be feasible due to privacy issues, and often reduce accuracy. An alternative approach is post-processing methods that build fairer decision classifiers using the features of pre-trained models, thus avoiding the cost of retraining. However, they still have drawbacks: they reduce accuracy (AGENDA, PASS, FTC), or require retuning for different false positive rates (FSN). In this work, we introduce the Fairness Calibration (FairCal) method, a post-training approach that simultaneously: (i) increases model accuracy (improving the state-of-the-art), (ii) produces fairly-calibrated probabilities, (iii) significantly reduces the gap in the false positive rates, (iv) does not require knowledge of the sensitive attribute, and (v) does not require retraining, training an additional model, or retuning. We apply it to the task of Face Verification, and obtain state-of-the-art results with all the above advantages.

Submitted 30 March, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: Accepted at ICLR 2022

arXiv:2010.02508 [pdf, other]

Adversarial Boot Camp: label free certified robustness in one epoch

Authors: Ryan Campbell, Chris Finlay, Adam M Oberman

Abstract: Machine learning models are vulnerable to adversarial attacks. One approach to addressing this vulnerability is certification, which focuses on models that are guaranteed to be robust for a given perturbation size. A drawback of recent certified models is that they are stochastic: they require multiple computationally expensive model evaluations with random noise added to a given input. In our work, we present a deterministic certification approach which results in a certifiably robust model. This approach is based on an equivalence between training with a particular regularized loss, and the expected values of Gaussian averages. We achieve certified models on ImageNet-1k by retraining a model with this loss for one epoch without the use of label information.

Submitted 5 October, 2020; originally announced October 2020.

Comments: 13 pages, 5 figures, 5 tables. Under review as a conference paper at ICLR 2021. arXiv admin note: substantial text overlap with arXiv:2006.06061

arXiv:2006.06061 [pdf, other]

Deterministic Gaussian Averaged Neural Networks

Authors: Ryan Campbell, Chris Finlay, Adam M Oberman

Abstract: We present a deterministic method to compute the Gaussian average of neural networks used in regression and classification. Our method is based on an equivalence between training with a particular regularized loss, and the expected values of Gaussian averages. We use this equivalence to certify models which perform well on clean data but are not robust to adversarial perturbations. In terms of certified accuracy and adversarial robustness, our method is comparable to known stochastic methods such as randomized smoothing, but requires only a single model evaluation during inference.

Submitted 10 June, 2020; originally announced June 2020.

arXiv:2006.06033 [pdf, other]

Learning normalizing flows from Entropy-Kantorovich potentials

Authors: Chris Finlay, Augusto Gerolin, Adam M Oberman, Aram-Alexandre Pooladian

Abstract: We approach the problem of learning continuous normalizing flows from a dual perspective motivated by entropy-regularized optimal transport, in which continuous normalizing flows are cast as gradients of scalar potential functions. This formulation allows us to train a dual objective comprised only of the scalar potential functions, and removes the burden of explicitly computing normalizing flows during training. After training, the normalizing flow is easily recovered from the potential functions.

Submitted 10 June, 2020; originally announced June 2020.

arXiv:2002.02798 [pdf, other]

How to train your neural ODE: the world of Jacobian and kinetic regularization

Authors: Chris Finlay, Jörn-Henrik Jacobsen, Levon Nurbekyan, Adam M Oberman

Abstract: Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values. In practice this leads to dynamics equivalent to many hundreds or even thousands of layers. In this paper, we overcome this apparent difficulty by introducing a theoretically-grounded combination of both optimal transport and stability regularizations which encourage neural ODEs to prefer simpler dynamics out of all the dynamics that solve a problem well. Simpler dynamics lead to faster convergence and to fewer discretizations of the solver, considerably decreasing wall-clock time without loss in performance. Our approach allows us to train neural ODE-based generative models to the same performance as the unregularized dynamics, with significant reductions in training time. This brings neural ODEs closer to practical relevance in large-scale applications.

Submitted 23 June, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

Comments: Accepted to ICML 2020

arXiv:1912.02317 [pdf, other]

doi 10.1007/s10915-020-01143-x

No-collision Transportation Maps

Authors: Levon Nurbekyan, Alexander Iannantuono, Adam M. Oberman

Abstract: Transportation maps between probability measures are critical objects in numerous areas of mathematics and applications such as PDE, fluid mechanics, geometry, machine learning, computer science, and economics. Given a pair of source and target measures, one searches for a map that has suitable properties and transports the source measure to the target one. Here, we study maps that possess the no-collision property; that is, particles simultaneously traveling from sources to targets in a unit time with uniform velocities do not collide. These maps are particularly relevant for applications in swarm control problems. We characterize these no-collision maps in terms of a half-space preserving property and establish a direct connection between these maps and binary-space-partitioning (BSP) tree structures. Based on this characterization, we provide explicit BSP algorithms, of cost $O(n \log n)$, to construct no-collision maps. Moreover, interpreting these maps as approximations of optimal transportation maps, we find that they succeed in computing nearly optimal maps for the $q$-Wasserstein metric ($q=1,2$). In some cases, our maps yield costs that are just a few percent off from being optimal.

Submitted 5 April, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

Comments: 24 pages, 4 figures, 6 tables

MSC Class: 49M27;

Journal ref: J Sci Comput 82, 45 (2020)

arXiv:1910.02840 [pdf, ps, other]

Farkas layers: don't shift the data, fix the geometry

Authors: Aram-Alexandre Pooladian, Chris Finlay, Adam M Oberman

Abstract: Successfully training deep neural networks often requires either batch normalization or appropriate weight initialization, both of which come with their own challenges. We propose an alternative, geometrically motivated method for training. Using elementary results from linear programming, we introduce Farkas layers: a method that ensures at least one neuron is active at a given layer. Focusing on residual networks with ReLU activation, we empirically demonstrate a significant improvement in training capacity in the absence of batch normalization or methods of initialization across a broad range of network sizes on benchmark datasets.

Submitted 4 October, 2019; originally announced October 2019.

arXiv:1910.01612 [pdf, other]

Partial differential equation regularization for supervised machine learning

Authors: Adam M Oberman

Abstract: This article is an overview of supervised machine learning problems for regression and classification. Topics include: kernel methods, training by stochastic gradient descent, deep learning architecture, losses for classification, statistical learning theory, and dimension independent generalization bounds. Examples of implicit regularization in deep learning are presented, including data augmentation, adversarial training, and additive noise. These methods are reframed as explicit gradient regularization.

Submitted 3 October, 2019; originally announced October 2019.

Comments: 16 pages, 5 figures

MSC Class: Primary 65N99; Secondary 35A15; 49M99; 65C50

arXiv:1908.07861 [pdf, other]

Nesterov's method with decreasing learning rate leads to accelerated stochastic gradient descent

Authors: Maxime Laborde, Adam M. Oberman

Abstract: We present a coupled system of ODEs which, when discretized with a constant time step/learning rate, recovers Nesterov's accelerated gradient descent algorithm. The same ODEs, when discretized with a decreasing learning rate, lead to novel stochastic gradient descent (SGD) algorithms, one in the convex and a second in the strongly convex case. In the strongly convex case, we obtain an algorithm superficially similar to momentum SGD, but with additional terms. In the convex case, we obtain an algorithm with a novel order $k^{3/4}$ learning rate. We prove, extending the Lyapunov function approach from the full gradient case to the stochastic case, that the algorithms converge at the optimal rate for the last iterate of SGD, with rate constants which are better than previously available.

Submitted 1 September, 2020; v1 submitted 21 August, 2019; originally announced August 2019.

Comments: 23 pages, 1 table, (minor revision), accepted to AISTATS 2020

arXiv:1908.01667 [pdf, ps, other]

A principled approach for generating adversarial images under non-smooth dissimilarity metrics

Authors: Aram-Alexandre Pooladian, Chris Finlay, Tim Hoheisel, Adam Oberman

Abstract: Deep neural networks perform well on real-world data but are prone to adversarial perturbations: small changes in the input easily lead to misclassification. In this work, we propose an attack methodology not only for cases where the perturbations are measured by $\ell_p$ norms, but in fact any adversarial dissimilarity metric with a closed proximal form. This includes, but is not limited to, $\ell_1, \ell_2$, and $\ell_\infty$ perturbations; the $\ell_0$ counting "norm" (i.e. true sparseness); and the total variation seminorm, which is a (non-$\ell_p$) convolutional dissimilarity measuring local pixel changes. Our approach is a natural extension of a recent adversarial attack method, and eliminates the differentiability requirement of the metric. We demonstrate our algorithm, ProxLogBarrier, on the MNIST, CIFAR10, and ImageNet-1k datasets. We consider undefended and defended models, and show that our algorithm easily transfers to various datasets. We observe that ProxLogBarrier outperforms a host of modern adversarial attacks specialized for the $\ell_0$ case. Moreover, by altering images in the total variation seminorm, we shed light on a new class of perturbations that exploit neighboring pixel information.

Submitted 8 October, 2019; v1 submitted 5 August, 2019; originally announced August 2019.

arXiv:1908.00578 [pdf, other]

A Partial Differential Equation Obstacle Problem for the Level Set Approach to Visibility

Authors: Adam Oberman, Tiago Salvador

Abstract: In this article we consider the problem of finding the visibility set from a given point when the obstacles are represented as the level set of a given function. Although the visibility set can be computed efficiently by ray tracing, there are advantages to using a level set representation for the obstacles, and to characterizing the solution using a Partial Differential Equation (PDE). A nonlocal PDE formulation was proposed in Tsai et al. (Journal of Computational Physics 199(1):260-290, 2004); in this article we propose a simpler PDE formulation, involving a nonlinear obstacle problem. We present a simple numerical scheme and show its convergence using the framework of Barles and Souganidis. Numerical examples in both two and three dimensions are presented.

Submitted 1 August, 2019; originally announced August 2019.

Comments: 18 pages, 8 figures, 1 table

MSC Class: 35J15; 35J60; 65N06; 65N12

arXiv:1905.11468 [pdf, other]

Scaleable input gradient regularization for adversarial robustness

Authors: Chris Finlay, Adam M Oberman

Abstract: In this work we revisit gradient regularization for adversarial robustness with some new ingredients. First, we derive new per-image theoretical robustness bounds based on local gradient information. These bounds strongly motivate input gradient regularization. Second, we implement a scaleable version of input gradient regularization which avoids double backpropagation: adversarially robust ImageNet models are trained in 33 hours on four consumer grade GPUs. Finally, we show experimentally and through theoretical certification that input gradient regularization is competitive with adversarial training. Moreover we demonstrate that gradient regularization does not lead to gradient obfuscation or gradient masking.

Submitted 4 October, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

arXiv:1903.10396 [pdf, other]

The LogBarrier adversarial attack: making effective use of decision boundary information

Authors: Chris Finlay, Aram-Alexandre Pooladian, Adam M. Oberman

Abstract: Adversarial attacks for image classification are small perturbations to images that are designed to cause misclassification by a model. Adversarial attacks formally correspond to an optimization problem: find a minimum norm image perturbation, constrained to cause misclassification. A number of effective attacks have been developed. However, to date, no gradient-based attacks have used best practices from the optimization literature to solve this constrained minimization problem. We design a new untargeted attack, based on these best practices, using the established logarithmic barrier method. On average, our attack distance is similar to or better than all state-of-the-art attacks on benchmark datasets (MNIST, CIFAR10, ImageNet-1K). In addition, our method performs significantly better on the most challenging images, those which normally require larger perturbations for misclassification. We employ the LogBarrier attack on several adversarially defended models, and show that it adversarially perturbs all images more efficiently than other attacks: the distance needed to perturb all images is significantly smaller with the LogBarrier attack than with other state-of-the-art attacks.

Submitted 25 March, 2019; originally announced March 2019.

Comments: 12 pages, 4 figures, 6 tables

arXiv:1903.09215 [pdf, other]

Calibrated Top-1 Uncertainty estimates for classification by score based models

Authors: Adam M. Oberman, Chris Finlay, Alexander Iannantuono, Tiago Salvador

Abstract: While the accuracy of modern deep learning models has significantly improved in recent years, the ability of these models to generate uncertainty estimates has not progressed to the same degree. Uncertainty methods are designed to provide an estimate of class probabilities when predicting class assignment. While there are a number of proposed methods for estimating uncertainty, they all suffer from a lack of calibration: predicted probabilities can be off from empirical ones by a few percent or more. By restricting the scope of our predictions to only the probability of Top-1 error, we can decrease the calibration error of existing methods to less than one percent. As a result, the scores of the methods also improve significantly over benchmarks.

Submitted 16 June, 2020; v1 submitted 21 March, 2019; originally announced March 2019.

Comments: 12 pages, 5 figures, 6 tables (major revision, new benchmark allows us to show model calibration is better)

arXiv:1903.08688 [pdf, other]

Stochastic Gradient Descent with Polyak's Learning Rate

Authors: Adam M. Oberman, Mariana Prazeres

Abstract: Stochastic gradient descent (SGD) for strongly convex functions converges at the rate $\mathcal{O}(1/k)$. However, achieving good results in practice requires tuning the parameters (for example the learning rate) of the algorithm. In this paper we propose a generalization of the Polyak step size, used for subgradient methods, to stochastic gradient descent. We prove a non-asymptotic convergence at the rate $\mathcal{O}(1/k)$ with a rate constant which can be better than the corresponding rate constant for optimally scheduled SGD. We demonstrate that the method is effective in practice, both on convex optimization problems and on training deep neural networks, and compare to the theoretical rate.

Submitted 11 July, 2019; v1 submitted 20 March, 2019; originally announced March 2019.

arXiv:1810.00953

Improved robustness to adversarial examples using Lipschitz regularization of the loss

Authors: Chris Finlay, Adam Oberman, Bilal Abbasi

Abstract: We augment adversarial training (AT) with worst case adversarial training (WCAT) which improves adversarial robustness by 11% over the current state-of-the-art result in the $\ell_2$ norm on CIFAR-10. We obtain verifiable average case and worst case robustness guarantees, based on the expected and maximum values of the norm of the gradient of the loss. We interpret adversarial training as Total Variation Regularization, which is a fundamental tool in mathematical image processing, and WCAT as Lipschitz regularization.

Submitted 13 September, 2019; v1 submitted 1 October, 2018; originally announced October 2018.

Comments: Merged with arXiv:1808.09540

arXiv:1808.09540 [pdf, other]

Lipschitz regularized Deep Neural Networks generalize and are adversarially robust

Authors: Chris Finlay, Jeff Calder, Bilal Abbasi, Adam Oberman

Abstract: In this work we study input gradient regularization of deep neural networks, and demonstrate that such regularization leads to generalization proofs and improved adversarial robustness. The proof of generalization does not overcome the curse of dimensionality, but it is independent of the number of layers in the networks. The adversarial robustness regularization combines adversarial training, which we show to be equivalent to Total Variation regularization, with Lipschitz regularization. We demonstrate empirically that the regularized models are more robust, and that gradient norms of images can be used for attack detection.

Submitted 11 September, 2019; v1 submitted 28 August, 2018; originally announced August 2018.

Comments: 18 pages, 4 figures (merged with arXiv:1810.00953)

arXiv:1807.05150 [pdf, other]

Improved accuracy of monotone finite difference schemes on point clouds and regular grids

Authors: Chris Finlay, Adam Oberman

Abstract: Finite difference schemes are the method of choice for solving nonlinear, degenerate elliptic PDEs, because the Barles-Souganidis convergence framework [Barles and Souganidis, Asymptotic Analysis, 4(3):271-283, 1991] provides sufficient conditions for convergence to the unique viscosity solution [Crandall, Ishii and Lions, Bull. Amer. Math. Soc., 27(1):1-67, 1992]. For anisotropic operators, such as the Monge-Ampere equation, wide stencil schemes are needed [Oberman, SIAM J. Numer. Anal., 44(2):879-895]. The accuracy of these schemes depends on both the distances to neighbors, $R$, and the angular resolution, $dθ$. On uniform grids, the accuracy is $\mathcal O(R^2 + dθ)$. On point clouds, the most accurate schemes are of $\mathcal O(R + dθ)$, by Froese [Numerische Mathematik, 138(1):75-99, 2018]. In this work, we construct geometrically motivated schemes of higher accuracy in both cases: order $\mathcal O(R + dθ^2)$ on point clouds, and $\mathcal O(R^2 + dθ^2)$ on uniform grids.

Submitted 13 July, 2018; originally announced July 2018.

MSC Class: 65N06; 65N12; 65N22; 35J15; 35J25

arXiv:1710.10311 [pdf, ps, other]

doi 10.1007/s10915-018-0730-x

Approximate homogenization of fully nonlinear elliptic PDEs: estimates and numerical results for Pucci type equations

Authors: Chris Finlay, Adam M. Oberman

Abstract: We are interested in the shape of the homogenized operator $\overline F(Q)$ for PDEs which have the structure of a nonlinear Pucci operator. A typical operator is $H^{a_1,a_2}(Q,x) = a_1(x) λ_{\min}(Q) + a_2(x)λ_{\max}(Q)$. Linearization of the operator leads to a non-divergence form homogenization problem, which can be solved by averaging against the invariant measure. We estimate the error obtained by linearization based on semi-concavity estimates on the nonlinear operator. These estimates show that away from high curvature regions, the linearization can be accurate. Numerical results show that for many values of $Q$, the linearization is highly accurate, and that even near corners, the error can be small (a few percent) even for relatively wide ranges of the coefficients.

Submitted 14 May, 2018; v1 submitted 27 October, 2017; originally announced October 2017.

Comments: Journal of Scientific Computing (2018)

MSC Class: 35B27

arXiv:1710.10309 [pdf, ps, other]

Approximate homogenization of convex nonlinear elliptic PDEs

Authors: Chris Finlay, Adam M. Oberman

Abstract: We approximate the homogenization of fully nonlinear, convex, uniformly elliptic Partial Differential Equations in the periodic setting, using a variational formula for the optimal invariant measure, which may be derived via Legendre-Fenchel duality. The variational formula expresses $\bar H$ as an average of the operator against the optimal invariant measure, generalizing the linear case. Several nontrivial analytic formulas for $\bar H$ are obtained. These formulas are compared to numerical simulations, using both PDE and variational methods. We also perform a numerical study of convergence rates for homogenization in the periodic and random setting and compare these to theoretical results.

Submitted 27 October, 2017; originally announced October 2017.

MSC Class: 35B27

arXiv:1710.07746 [pdf, other]

Stochastic Backward Euler: An Implicit Gradient Descent Algorithm for $k$-means Clustering

Authors: Penghang Yin, Minh Pham, Adam Oberman, Stanley Osher

Abstract: In this paper, we propose an implicit gradient descent algorithm for the classic $k$-means problem. The implicit gradient step or backward Euler is solved via stochastic fixed-point iteration, in which we randomly sample a mini-batch gradient in every iteration. It is the average of the fixed-point trajectory that is carried over to the next gradient step. We draw connections between the proposed stochastic backward Euler and the recent entropy stochastic gradient descent (Entropy-SGD) for improving the training of deep neural networks. Numerical experiments on various synthetic and real datasets show that the proposed algorithm provides better clustering results compared to $k$-means algorithms in the sense that it decreases the objective function (the cluster) and is much more robust to initialization.

Submitted 21 May, 2018; v1 submitted 20 October, 2017; originally announced October 2017.

arXiv:1707.00424 [pdf, other]

Parle: parallelizing stochastic gradient descent

Authors: Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar, Adam Oberman

Abstract: We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4x faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters. We exploit the phenomenon of flat minima that has been shown to lead to improved generalization error for deep networks. Parle requires very infrequent communication with the parameter server and instead performs more computation on each client, which makes it well-suited to both single-machine, multi-GPU settings and distributed implementations.

Submitted 10 September, 2017; v1 submitted 3 July, 2017; originally announced July 2017.

arXiv:1704.04932 [pdf, other]

Deep Relaxation: partial differential equations for optimizing deep neural networks

Authors: Pratik Chaudhari, Adam Oberman, Stanley Osher, Stefano Soatto, Guillaume Carlier

Abstract: In this paper we establish a connection between non-convex optimization methods for training deep neural networks and nonlinear partial differential equations (PDEs). Relaxation techniques arising in statistical physics which have already been used successfully in this context are reinterpreted as solutions of a viscous Hamilton-Jacobi PDE. A stochastic control interpretation allows us to prove that the modified algorithm performs better in expectation than stochastic gradient descent. Well-known PDE regularity results allow us to analyze the geometry of the relaxed energy landscape, confirming empirical evidence. The PDE is derived from a stochastic homogenization problem, which arises in the implementation of the algorithm. The algorithms scale well in practice and can effectively tackle the high dimensionality of modern neural networks.

Submitted 1 June, 2017; v1 submitted 17 April, 2017; originally announced April 2017.

arXiv:1703.01350 [pdf, other]

Approximate Convex Hulls: sketching the convex hull using curvature

Authors: Robert Graham, Adam M. Oberman

Abstract: Convex hulls are fundamental objects in computational geometry. In moderate dimensions or for large numbers of vertices, computing the convex hull can be impractical due to the computational complexity of convex hull algorithms. In this article we approximate the convex hull using a scalable algorithm which finds high curvature vertices with high probability. The algorithm is particularly effective for approximating convex hulls which have a relatively small number of extreme points.

Submitted 14 June, 2017; v1 submitted 27 February, 2017; originally announced March 2017.

Comments: 16 pages, 8 figures

MSC Class: Primary: 52-04; Secondary: 52A41; 52A20; 65Y20; 53C45

arXiv:1612.06813 [pdf, other]

doi 10.1093/imanum/drx068

A partial differential equation for the strictly quasiconvex envelope

Authors: Bilal Abbasi, Adam M. Oberman

Abstract: In a series of papers Barron, Goebel, and Jensen studied Partial Differential Equations (PDEs) for quasiconvex (QC) functions \cite{barron2012functions, barron2012quasiconvex,barron2013quasiconvex,barron2013uniqueness}. To overcome the lack of uniqueness for the QC PDE, they introduced a regularization: a PDE for $\epsilon$-robust QC functions, which is well-posed. Building on this work, we introduce a stronger regularization which is amenable to numerical approximation. We build convergent finite difference approximations, comparing the QC envelope and the two regularizations. Solutions of this PDE are strictly convex, and smoother than the robust-QC functions.

Submitted 20 December, 2016; originally announced December 2016.

Comments: 20 pages, 6 figures, 1 table

Journal ref: IMA Journal of Numerical Analysis Volume 39 Issue 1, 25 January 2019 Pages 141-166

arXiv:1612.05584 [pdf, ps, other]

Computing the quasiconvex envelope using a nonlocal line solver

Authors: Bilal Abbasi, Adam M. Oberman

Abstract: Recently, in a series of articles, Barron, Goebel, and Jensen \cite{barron2012functions} \cite{barron2012quasiconvex} \cite{barron2013quasiconvex} \cite{barron2013uniqueness} have studied second order degenerate elliptic PDEs and first order nonlocal PDEs for the quasiconvex envelope. Quasiconvex functions are functions whose level sets are convex. The PDE is difficult to solve. In this article we present an algorithm for computing the quasiconvex envelope (QCE) of a given function. The QCE operator is a level set operator, so this algorithm gives a method to compute the convex hull of sets represented by level set functions. We present a nonlocal line solver for the QCE, based on solving the one dimensional problem on lines. We find an explicit formula for the QCE of a function defined on a line.

Submitted 16 December, 2016; originally announced December 2016.

Comments: 15 pages, 7 figures

arXiv:1611.00164 [pdf, other]

Finite difference methods for fractional Laplacians

Authors: Yanghong Huang, Adam Oberman

Abstract: The fractional Laplacian $(-Δ)^{α/2}$ is the prototypical non-local elliptic operator. While analytical theory has been advanced and understood for some time, there remain many open problems in the numerical analysis of the operator. In this article, we study several different finite difference discretisations of the fractional Laplacian on uniform grids in one dimension that take the same form. Many properties can be compared and summarised in this relatively simple setting, to tackle more important questions like the nonlocality, singularity and flat tails common in practical implementations. The accuracy and the asymptotic behaviours of the methods are also studied, together with treatment of the far field boundary conditions, providing a unified perspective on the further development of the scheme in higher dimensions.

Submitted 1 November, 2016; originally announced November 2016.

arXiv:1610.08831 [pdf, other]

Numerical methods for motion of level sets by affine curvature

Authors: Adam M. Oberman, Tiago Salvador

Abstract: We study numerical methods for the nonlinear partial differential equation that governs the motion of level sets by affine curvature. We show that standard finite difference schemes are nonlinearly unstable. We build convergent finite difference schemes, using the theory of viscosity solutions. We demonstrate that our approximate solutions capture the affine invariance and morphological properties of the evolution. Numerical experiments demonstrate the accuracy and stability of the discretization.

Submitted 28 October, 2016; v1 submitted 27 October, 2016; originally announced October 2016.

Comments: 36 pages, 6 figures, 3 tables

arXiv:1608.04348 [pdf, ps, other]

doi 10.1137/17M1121184

Anomaly detection and classification for streaming data using PDEs

Authors: Bilal Abbasi, Jeff Calder, Adam M. Oberman

Abstract: Nondominated sorting, also called Pareto Depth Analysis (PDA), is widely used in multi-objective optimization and has recently found important applications in multi-criteria anomaly detection. Recently, a partial differential equation (PDE) continuum limit was discovered for nondominated sorting leading to a very fast approximate sorting algorithm called PDE-based ranking. We propose in this paper a fast real-time streaming version of the PDA algorithm for anomaly detection that exploits the computational advantages of PDE continuum limits. Furthermore, we derive new PDE continuum limits for sorting points within their nondominated layers and show how the new PDEs can be used to classify anomalies based on which criterion was more significantly violated. We also prove statistical convergence rates for PDE-based ranking, and present the results of numerical experiments with both synthetic and real data.

Submitted 15 March, 2017; v1 submitted 15 August, 2016; originally announced August 2016.

MSC Class: 35D40; 49L25; 65N06; 06A07; 35F21; 68Q87 ACM Class: I.5; G.3; H.2.8

Journal ref: SIAM Journal on Applied Math, 78(2), 921--941, 2018

arXiv:1605.03155 [pdf, other]

doi 10.1007/s00205-017-1092-5

A partial differential equation for the rank one convex envelope

Authors: Adam M. Oberman, Yuanlong Ruan

Abstract: In this article we introduce a Partial Differential Equation (PDE) for the rank one convex envelope. Rank one convex envelopes arise in non-convex vector valued variational problems \cite{BallElasticity, kohn1986optimal1, BallJames87, chipot1988equilibrium}. More generally, we study a PDE for directional convex envelopes, which includes the usual convex envelope \cite{ObermanConvexEnvelope} and the rank one convex envelope as special cases. Existence and uniqueness of viscosity solutions to the PDE are established. Wide stencil elliptic finite difference schemes are built. Convergence of finite difference solutions to the viscosity solution of the PDE is proven. Numerical examples of rank one and other directional convex envelopes are presented. Additionally, laminates are computed from the rank one convex envelope.

Submitted 12 January, 2017; v1 submitted 10 May, 2016; originally announced May 2016.

Comments: 26 pages, 5 figures, 4 tables

arXiv:1509.03668 [pdf, other]

An efficient linear programming method for Optimal Transportation

Authors: Adam M. Oberman, Yuanlong Ruan

Abstract: An efficient method for computing solutions to the Optimal Transportation (OT) problem with a wide class of cost functions is presented. The standard linear programming (LP) discretization of the continuous problem becomes intractable for moderate grid sizes. A grid refinement method results in a linear cost algorithm. Weak convergence of solutions is established. Barycentric projection of transference plans is used to improve the accuracy of solutions. The method is applied to more general problems, including partial optimal transportation, and barycenter problems. Computational examples validate the accuracy and efficiency of the method. Optimal maps between nonconvex domains, partial OT free boundaries, and high accuracy barycenters are presented.

Submitted 11 September, 2015; originally announced September 2015.

Comments: 25 pages, 11 figures, 2 tables

arXiv:1502.04969 [pdf, other]

Numerical Methods for the 2-Hessian Elliptic Partial Differential Equation

Authors: Brittany D. Froese, Adam M. Oberman, Tiago Salvador

Abstract: The elliptic 2-Hessian equation is a fully nonlinear partial differential equation (PDE) that is related to intrinsic curvature for three dimensional manifolds. We introduce two numerical methods for this PDE: the first is provably convergent to the viscosity solution, and the second is more accurate, and convergent in practice but lacks a proof. The PDE is elliptic on a restricted set of functions: a convexity type constraint is needed for the ellipticity of the PDE operator. Solutions with both discretizations are obtained using Newton's method. Computational results are presented on a number of exact solutions which range in regularity from smooth to nondifferentiable and in shape from convex to nonconvex.

Submitted 10 February, 2016; v1 submitted 17 February, 2015; originally announced February 2015.

Comments: 26 pages, 6 figures, 8 tables

arXiv:1412.3057 [pdf, other]

Adaptive finite difference methods for nonlinear elliptic and parabolic partial differential equations with free boundaries

Authors: Adam M. Oberman, Ian Zwiers

Abstract: Monotone finite difference methods provide stable convergent discretizations of a class of degenerate elliptic and parabolic Partial Differential Equations (PDEs). These methods are best suited to regular rectangular grids, which leads to low accuracy near curved boundaries or singularities of solutions. In this article we combine monotone finite difference methods with an adaptive grid refinement technique to produce a PDE discretization and solver which is applied to a broad class of equations, in curved or unbounded domains which include free boundaries. The grid refinement is flexible and adaptive. The discretization is combined with a fast solution method, which incorporates asynchronous time stepping adapted to the spatial scale. The framework is validated on linear problems in curved and unbounded domains. Key applications include the obstacle problem and the one-phase Stefan free boundary problem.

Submitted 18 November, 2015; v1 submitted 9 December, 2014; originally announced December 2014.

Comments: 19 pages, 12 figures, 2 tables

arXiv:1411.7018 [pdf, ps, other]

doi 10.1080/00207160.2016.1247443

A multigrid scheme for 3D Monge-Ampère equations

Authors: Jun Liu, Brittany D. Froese, Adam M. Oberman, Mingqing Xiao

Abstract: The elliptic Monge-Ampère equation is a fully nonlinear partial differential equation which has been the focus of increasing attention from the scientific computing community. Fast three dimensional solvers are needed, for example in medical image registration, but are not yet available. We build fast solvers for smooth solutions in three dimensions using a nonlinear full-approximation storage multigrid method. Starting from a second-order accurate centered finite difference approximation, we present a nonlinear Gauss-Seidel iterative method which has a mechanism for selecting the convex solution of the equation. The iterative method is used as an effective smoother, combined with the full-approximation storage multigrid method. Numerical experiments are provided to validate the accuracy of the finite difference scheme and illustrate the computational efficiency of the proposed multigrid solver.

Submitted 29 December, 2016; v1 submitted 25 November, 2014; originally announced November 2014.

Comments: 18 pages, 1 figure, 7 tables, 41 references. Accepted by International Journal of Computer Mathematics (published online: 21 Nov 2016)

MSC Class: 65N06; 65N22; 65N55; 65N12

arXiv:1411.3602 [pdf, other]

Numerical methods for matching for teams and Wasserstein barycenters

Authors: Guillaume Carlier, Adam Oberman, Edouard Oudet

Abstract: Equilibrium multi-population matching (matching for teams) is a problem from mathematical economics which is related to multi-marginal optimal transport. A special but important case is the Wasserstein barycenter problem, which has applications in image processing and statistics. Two algorithms are presented: a linear programming algorithm and an efficient nonsmooth optimization algorithm, which applies in the case of the Wasserstein barycenters. The measures are approximated by discrete measures: convergence of the approximation is proved. Numerical results are presented which illustrate the efficiency of the algorithms.

Submitted 13 November, 2014; originally announced November 2014.

Comments: 29 pages, 13 figures

arXiv:1411.3205 [pdf, ps, other]

doi 10.1016/j.jcp.2014.12.039

Filtered schemes for Hamilton-Jacobi equations: a simple construction of convergent accurate difference schemes

Authors: Adam M. Oberman, Tiago Salvador

Abstract: We build a simple and general class of finite difference schemes for first order Hamilton-Jacobi (HJ) Partial Differential Equations. These filtered schemes are convergent to the unique viscosity solution of the equation. The schemes are accurate: we implement second, third and fourth order accurate schemes in one dimension and second order accurate schemes in two dimensions, indicating how to build higher order ones. They are also explicit, which means they can be solved using the fast sweeping method or the fast marching method. The accuracy of the method is validated with computational results for the eikonal equation in one and two dimensions, using filtered schemes made from standard centered differences, higher order upwinding and ENO interpolation.

Submitted 12 November, 2014; originally announced November 2014.

Comments: 31 pages, 9 figures, 9 tables

MSC Class: 35J15; 35J25; 35J60; 35J96; 65N06; 65N12; 65N22

arXiv:1311.7691 [pdf, ps, other]

Numerical Methods for the Fractional Laplacian: a Finite Difference-quadrature Approach

Authors: Yanghong Huang, Adam Oberman

Abstract: The fractional Laplacian $(-Δ)^{α/2}$ is a non-local operator which depends on the parameter $α$ and recovers the usual Laplacian as $α\to 2$. A numerical method for the fractional Laplacian is proposed, based on the singular integral representation for the operator. The method combines finite difference with numerical quadrature, to obtain a discrete convolution operator with positive weights. The accuracy of the method is shown to be $O(h^{3-α})$. Convergence of the method is proven. The treatment of far field boundary conditions using an asymptotic approximation to the integral is used to obtain an accurate method. Numerical experiments on known exact solutions validate the predicted convergence rates. Computational examples include exponentially and algebraically decaying solutions with varying regularity. The generalization to nonlinear equations involving the operator is discussed: the obstacle problem for the fractional Laplacian is computed.

Submitted 13 November, 2014; v1 submitted 29 November, 2013; originally announced November 2013.

Comments: 29 pages, 9 figures

arXiv:1212.0834 [pdf, other]

Nonlinear elliptic Partial Differential Equations and p-harmonic functions on graphs

Authors: Juan J. Manfredi, Adam M. Oberman, Alex P. Svirodov

Abstract: In this article we study the well-posedness (uniqueness and existence of solutions) of nonlinear elliptic Partial Differential Equations (PDEs) on a finite graph. These results are obtained using the discrete comparison principle and connectivity properties of the graph. This work is in the spirit of the theory of viscosity solutions for PDEs. The equations include the graph Laplacian, the $p$-Laplacian, the Infinity Laplacian, the Mean Curvature equation, and the Eikonal operator on the graph.

Submitted 28 February, 2013; v1 submitted 4 December, 2012; originally announced December 2012.

Comments: 19 pages, 2 figures

MSC Class: 35J20; 35J60; 35J70

arXiv:1208.4873 [pdf, other]

A viscosity solution approach to the Monge-Ampere formulation of the Optimal Transportation Problem

Authors: Jean-David Benamou, Brittany D. Froese, Adam M. Oberman

Abstract: In this work we present a numerical method for the Optimal Mass Transportation problem. Optimal Mass Transportation (OT) is an active research field in mathematics. It has recently led to significant theoretical results as well as applications in diverse areas. Numerical solution techniques for the OT problem remain underdeveloped. The solution is obtained by solving the second boundary value problem for the MA equation, a fully nonlinear elliptic partial differential equation (PDE). Instead of standard boundary conditions the problem has global state constraints. These are reformulated as a tractable local PDE. We give a proof of convergence of the numerical method, using the theory of viscosity solutions. Details of the implementation and a fast solution method are provided in the companion paper arXiv:1208.4870.

Submitted 2 August, 2013; v1 submitted 23 August, 2012; originally announced August 2012.

Comments: 22 pages, 7 figures

MSC Class: 35J96; 65l12; 49M25

arXiv:1208.4870 [pdf, other]

Numerical solution of the Optimal Transportation problem using the Monge-Ampere equation

Authors: Jean-David Benamou, Brittany D. Froese, Adam M. Oberman

Abstract: A numerical method for the solution of the elliptic Monge-Ampere Partial Differential Equation, with boundary conditions corresponding to the Optimal Transportation (OT) problem, is presented. A local representation of the OT boundary conditions is combined with a finite difference scheme for the Monge-Ampere equation. Newton's method is implemented, leading to a fast solver, comparable to solving the Laplace equation on the same grid several times. Theoretical justification for the method is given by a convergence proof in the companion paper (Benamou et al., 2012). In this paper, the algorithm is modified to a simpler compact stencil implementation and details of the implementation are given. Solutions are computed with densities supported on non-convex and disconnected domains. Computational examples demonstrate robust performance on singular solutions and fast computational times.

Submitted 23 August, 2012; originally announced August 2012.

Comments: 27 pages, 7 figures

arXiv:1204.5798 [pdf, other]

Convergent filtered schemes for the Monge-Ampère partial differential equation

Authors: Brittany D. Froese, Adam M. Oberman

Abstract: The theory of viscosity solutions has been effective for representing and approximating weak solutions to fully nonlinear Partial Differential Equations (PDEs) such as the elliptic Monge-Ampère equation. The approximation theory of Barles-Souganidis [Barles and Souganidis, Asymptotic Anal., 4 (1999) 271-283] requires that numerical schemes be monotone (or elliptic in the sense of [Oberman, SIAM J. Numer. Anal, 44 (2006) 879-895]). But such schemes have limited accuracy. In this article, we establish a convergence result for nearly monotone schemes. This allows us to construct finite difference discretizations of arbitrarily high-order. We demonstrate that the higher accuracy is achieved when solutions are sufficiently smooth. In addition, the filtered scheme provides a natural detection principle for singularities. We employ this framework to construct a formally second-order scheme for the Monge-Ampère equation and present computational results on smooth and singular solutions.

Submitted 3 December, 2012; v1 submitted 25 April, 2012; originally announced April 2012.

Comments: 24 pages, to appear in SINUM

MSC Class: 35J15; 35J25; 35J60; 35J96; 65N06; 65N12; 65N22

arXiv:1107.5290 [pdf, other]

A numerical method for variational problems with convexity constraints

Authors: Adam M. Oberman

Abstract: We consider the problem of approximating the solution of variational problems subject to the constraint that the admissible functions must be convex. This problem is at the interface between convex analysis, convex optimization, variational problems, and partial differential equation techniques. The approach is to approximate the (non-polyhedral) cone of convex functions by a polyhedral cone which can be represented by linear inequalities. This approach leads to an optimization problem with linear constraints which can be computed efficiently, hundreds of times faster than existing methods.

Submitted 23 August, 2012; v1 submitted 26 July, 2011; originally announced July 2011.

Comments: 21 pages, 6 figures, 6 tables

MSC Class: 65K15; 90C25; 26B25; 65N06; 52A41; 91B68

arXiv:1107.5278 [pdf, other]

Finite difference methods for the Infinity Laplace and p-Laplace equations

Authors: Adam M. Oberman

Abstract: We build convergent discretizations and semi-implicit solvers for the Infinity Laplacian and the game theoretical $p$-Laplacian. The discretizations simplify and generalize earlier ones. We prove convergence of the solution of the Wide Stencil finite difference schemes to the unique viscosity solution of the underlying equation. We build a semi-implicit solver, which solves the Laplace equation at each step. It is fast in the sense that the number of iterations is independent of the problem size. This is an improvement over previous explicit solvers, which are slow due to the CFL condition.

Submitted 5 December, 2012; v1 submitted 26 July, 2011; originally announced July 2011.

Comments: 22 pages, 10 figures

Showing 1–50 of 55 results for author: Oberman, A