Lipschitz constant estimation for general neural network architectures using control tools

Authors: Patricia Pauli, Dennis Gramlich, Frank Allgöwer

Abstract: This paper is devoted to the estimation of the Lipschitz constant of neural networks using semidefinite programming. For this purpose, we interpret neural networks as time-varying dynamical systems, where the $k$-th layer corresponds to the dynamics at time $k$. A key novelty with respect to prior work is that we use this interpretation to exploit the series interconnection structure of neural networks with a dynamic programming recursion. Nonlinearities, such as activation functions and nonlinear pooling layers, are handled with integral quadratic constraints. If the neural network contains signal processing layers (convolutional or state space model layers), we realize them as 1-D/2-D/N-D systems and exploit this structure as well. We distinguish ourselves from related work on Lipschitz constant estimation by more extensive structure exploitation (scalability) and a generalization to a large class of common neural network architectures. To show the versatility and computational advantages of our method, we apply it to different neural network architectures trained on MNIST and CIFAR-10.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2403.11938 [pdf, other]

State space representations of the Roesser type for convolutional layers

Authors: Patricia Pauli, Dennis Gramlich, Frank Allgöwer

Abstract: From the perspective of control theory, convolutional layers (of neural networks) are 2-D (or N-D) linear time-invariant dynamical systems. The usual representation of convolutional layers by the convolution kernel corresponds to the representation of a dynamical system by its impulse response. However, many analysis tools from control theory, e.g., involving linear matrix inequalities, require a state space representation. For this reason, we explicitly provide a state space representation of the Roesser type for 2-D convolutional layers with $c_\mathrm{in}r_1 + c_\mathrm{out}r_2$ states, where $c_\mathrm{in}$/$c_\mathrm{out}$ is the number of input/output channels of the layer and $r_1$/$r_2$ characterizes the width/length of the convolution kernel. This representation is shown to be minimal for $c_\mathrm{in} = c_\mathrm{out}$. We further construct state space representations for dilated, strided, and N-D convolutions.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2401.14033 [pdf, ps, other]

Novel Quadratic Constraints for Extending LipSDP beyond Slope-Restricted Activations

Authors: Patricia Pauli, Aaron Havens, Alexandre Araujo, Siddharth Garg, Farshad Khorrami, Frank Allgöwer, Bin Hu

Abstract: Recently, semidefinite programming (SDP) techniques have shown great promise in providing accurate Lipschitz bounds for neural networks. Specifically, the LipSDP approach (Fazlyab et al., 2019) has received much attention and provides the least conservative Lipschitz upper bounds that can be computed with polynomial time guarantees. However, one main restriction of LipSDP is that its formulation requires the activation functions to be slope-restricted on $[0,1]$, preventing its further use for more general activation functions such as GroupSort, MaxMin, and Householder. One can rewrite MaxMin activations for example as residual ReLU networks. However, a direct application of LipSDP to the resultant residual ReLU networks is conservative and even fails in recovering the well-known fact that the MaxMin activation is 1-Lipschitz. Our paper bridges this gap and extends LipSDP beyond slope-restricted activation functions. To this end, we provide novel quadratic constraints for GroupSort, MaxMin, and Householder activations via leveraging their underlying properties such as sum preservation. Our proposed analysis is general and provides a unified approach for estimating $\ell_2$ and $\ell_\infty$ Lipschitz bounds for a rich class of neural network architectures, including non-residual and residual neural networks and implicit models, with GroupSort, MaxMin, and Householder activations. Finally, we illustrate the utility of our approach with a variety of experiments and show that our proposed SDPs generate less conservative Lipschitz bounds in comparison to existing approaches.

Submitted 25 January, 2024; originally announced January 2024.

Comments: accepted as a conference paper at ICLR 2024
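
The abstract above refers to the fact that the MaxMin activation is 1-Lipschitz. As a minimal numerical sanity check (not the paper's SDP-based analysis), the sketch below samples random input pairs and confirms that MaxMin never expands $\ell_2$ distances. The function names `maxmin`, `l2_dist`, and `worst_ratio` are hypothetical and chosen for illustration only, and the pairing of consecutive entries is an assumed convention.

```python
import math
import random


def maxmin(x):
    """MaxMin activation: map each consecutive pair (a, b) to (max, min).

    Assumes len(x) is even; the grouping of consecutive entries is an
    illustrative convention, not taken from the paper.
    """
    out = []
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        out.extend([max(a, b), min(a, b)])
    return out


def l2_dist(u, v):
    """Euclidean distance between two equally long lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def worst_ratio(dim=4, trials=1000, seed=0):
    """Largest observed ratio ||f(x) - f(y)|| / ||x - y|| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        y = [rng.uniform(-1, 1) for _ in range(dim)]
        d_in = l2_dist(x, y)
        if d_in > 0:
            worst = max(worst, l2_dist(maxmin(x), maxmin(y)) / d_in)
    return worst
```

A sampled ratio only gives a lower bound on the true Lipschitz constant, so this check can refute but never certify a bound; certification is what the SDP-based methods in the listing address.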

arXiv:2303.11835 [pdf, ps, other]

Lipschitz-bounded 1D convolutional neural networks using the Cayley transform and the controllability Gramian

Authors: Patricia Pauli, Ruigang Wang, Ian R. Manchester, Frank Allgöwer

Abstract: We establish a layer-wise parameterization for 1D convolutional neural networks (CNNs) with built-in end-to-end robustness guarantees. In doing so, we use the Lipschitz constant of the input-output mapping characterized by a CNN as a robustness measure. We base our parameterization on the Cayley transform that parameterizes orthogonal matrices and the controllability Gramian of the state space representation of the convolutional layers. The proposed parameterization by design fulfills linear matrix inequalities that are sufficient for Lipschitz continuity of the CNN, which further enables unconstrained training of Lipschitz-bounded 1D CNNs. Finally, we train Lipschitz-bounded 1D CNNs for the classification of heart arrhythmia data and show their improved robustness.

Submitted 25 January, 2024; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Published as a conference paper at CDC 2023

arXiv:2303.03042 [pdf, other]

Convolutional Neural Networks as 2-D systems

Authors: Dennis Gramlich, Patricia Pauli, Carsten W. Scherer, Frank Allgöwer, Christian Ebenbauer

Abstract: This paper introduces a novel representation of convolutional Neural Networks (CNNs) in terms of 2-D dynamical systems. To this end, the usual description of convolutional layers with convolution kernels, i.e., the impulse responses of linear filters, is realized in state space as a linear time-invariant 2-D system. The overall convolutional Neural Network composed of convolutional layers and nonlinear activation functions is then viewed as a 2-D version of a Lur'e system, i.e., a linear dynamical system interconnected with static nonlinear components. One benefit of this 2-D Lur'e system perspective on CNNs is that we can use robust control theory much more efficiently for Lipschitz constant estimation than previously possible.

Submitted 11 April, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

arXiv:2211.15253 [pdf, other]

Lipschitz constant estimation for 1D convolutional neural networks

Authors: Patricia Pauli, Dennis Gramlich, Frank Allgöwer

Abstract: In this work, we propose a dissipativity-based method for Lipschitz constant estimation of 1D convolutional neural networks (CNNs). In particular, we analyze the dissipativity properties of convolutional, pooling, and fully connected layers making use of incremental quadratic constraints for nonlinear activation functions and pooling operations. The Lipschitz constant of the concatenation of these mappings is then estimated by solving a semidefinite program which we derive from dissipativity theory. To make our method as efficient as possible, we exploit the structure of convolutional layers by realizing these finite impulse response filters as causal dynamical systems in state space and carrying out the dissipativity analysis for the state space realizations. The examples we provide show that our Lipschitz bounds are advantageous in terms of accuracy and scalability.

Submitted 20 June, 2023; v1 submitted 28 November, 2022; originally announced November 2022.
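
The abstract above mentions realizing finite impulse response (FIR) filters as causal dynamical systems in state space. A minimal single-channel sketch of that idea follows, assuming a shift-register realization in which the state stores the last $r$ inputs; this is a textbook construction used here for illustration and is not claimed to be the authors' realization. The function names `fir_direct` and `fir_state_space` are hypothetical.

```python
def fir_direct(h, u):
    """Direct convolution: y[k] = sum_i h[i] * u[k - i], with u[j] = 0 for j < 0."""
    return [
        sum(h[i] * u[k - i] for i in range(len(h)) if k - i >= 0)
        for k in range(len(u))
    ]


def fir_state_space(h, u):
    """Same filter written as x[k+1] = A x[k] + B u[k], y[k] = C x[k] + D u[k].

    The state x[k] = (u[k-1], ..., u[k-r]) with r = len(h) - 1 is a shift
    register, so A is a shift matrix, B inserts u[k] in the first slot,
    C = (h[1], ..., h[r]) and D = h[0]. The matrices are applied implicitly.
    """
    r = len(h) - 1
    x = [0.0] * r  # zero initial state
    y = []
    for uk in u:
        # Output equation: y[k] = C x[k] + D u[k]
        y.append(h[0] * uk + sum(h[i + 1] * x[i] for i in range(r)))
        # State update: shift in the newest input, drop the oldest one
        x = ([uk] + x)[:r]
    return y
```

For example, `fir_state_space([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 1.0])` and `fir_direct` with the same arguments produce the same output sequence; the state dimension equals the kernel length minus one.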

arXiv:2201.00632 [pdf, other]

Neural network training under semidefinite constraints

Authors: Patricia Pauli, Niklas Funcke, Dennis Gramlich, Mohamed Amine Msalmi, Frank Allgöwer

Abstract: This paper is concerned with the training of neural networks (NNs) under semidefinite constraints, which allows for NN training with robustness and stability guarantees. In particular, we focus on Lipschitz bounds for NNs. Exploiting the banded structure of the underlying matrix constraint, we set up an efficient and scalable training scheme for NN training problems of this kind based on interior point methods. Our implementation allows to enforce Lipschitz constraints in the training of large-scale deep NNs such as Wasserstein generative adversarial networks (WGANs) via semidefinite constraints. In numerical examples, we show the superiority of our method and its applicability to WGAN training.

Submitted 19 September, 2022; v1 submitted 3 January, 2022; originally announced January 2022.

Comments: to be published in 61st IEEE Conference on Decision and Control

arXiv:2103.17106 [pdf, ps, other]

Linear systems with neural network nonlinearities: Improved stability analysis via acausal Zames-Falb multipliers

Authors: Patricia Pauli, Dennis Gramlich, Julian Berberich, Frank Allgöwer

Abstract: In this paper, we analyze the stability of feedback interconnections of a linear time-invariant system with a neural network nonlinearity in discrete time. Our analysis is based on abstracting neural networks using integral quadratic constraints (IQCs), exploiting the sector-bounded and slope-restricted structure of the underlying activation functions. In contrast to existing approaches, we leverage the full potential of dynamic IQCs to describe the nonlinear activation functions in a less conservative fashion. To be precise, we consider multipliers based on the full-block Yakubovich / circle criterion in combination with acausal Zames-Falb multipliers, leading to linear matrix inequality based stability certificates. Our approach provides a flexible and versatile framework for stability analysis of feedback interconnections with neural network nonlinearities, allowing to trade off computational efficiency and conservatism. Finally, we provide numerical examples that demonstrate the applicability of the proposed framework and the achievable improvements over previous approaches.

Submitted 30 September, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

arXiv:2011.14006 [pdf, ps, other]

Offset-free setpoint tracking using neural network controllers

Authors: Patricia Pauli, Johannes Köhler, Julian Berberich, Anne Koch, Frank Allgöwer

Abstract: In this paper, we present a method to analyze local and global stability in offset-free setpoint tracking using neural network controllers and we provide ellipsoidal inner approximations of the corresponding region of attraction. We consider a feedback interconnection of a linear plant in connection with a neural network controller and an integrator, which allows for offset-free tracking of a desired piecewise constant reference that enters the controller as an external input. Exploiting the fact that activation functions used in neural networks are slope-restricted, we derive linear matrix inequalities to verify stability using Lyapunov theory. After stating a global stability result, we present less conservative local stability conditions (i) for a given reference and (ii) for any reference from a certain set. The latter result even enables guaranteed tracking under setpoint changes using a reference governor which can lead to a significant increase of the region of attraction. Finally, we demonstrate the applicability of our analysis by verifying stability and offset-free tracking of a neural network controller that was trained to stabilize a linearized inverted pendulum.

Submitted 29 April, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

arXiv:2005.02929 [pdf, ps, other]

doi 10.1109/LCSYS.2021.3050444

Training robust neural networks using Lipschitz bounds

Authors: Patricia Pauli, Anne Koch, Julian Berberich, Paul Kohler, Frank Allgöwer

Abstract: Due to their susceptibility to adversarial perturbations, neural networks (NNs) are hardly used in safety-critical applications. One measure of robustness to such perturbations in the input is the Lipschitz constant of the input-output map defined by an NN. In this work, we propose a framework to train multi-layer NNs while at the same time encouraging robustness by keeping their Lipschitz constant small, thus addressing the robustness issue. More specifically, we design an optimization scheme based on the Alternating Direction Method of Multipliers that minimizes not only the training loss of an NN but also its Lipschitz constant resulting in a semidefinite programming based training procedure that promotes robustness. We design two versions of this training procedure. The first one includes a regularizer that penalizes an accurate upper bound on the Lipschitz constant. The second one allows to enforce a desired Lipschitz bound on the NN at all times during training. Finally, we provide two examples to show that the proposed framework successfully increases the robustness of NNs.

Submitted 15 September, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

arXiv:1707.06002 [pdf, other]

doi 10.18653/v1/D17-2002

Argotario: Computational Argumentation Meets Serious Games

Authors: Ivan Habernal, Raffael Hannemann, Christian Pollak, Christopher Klamm, Patrick Pauli, Iryna Gurevych

Abstract: An important skill in critical thinking and argumentation is the ability to spot and recognize fallacies. Fallacious arguments, omnipresent in argumentative discourse, can be deceptive, manipulative, or simply leading to 'wrong moves' in a discussion. Despite their importance, argumentation scholars and NLP researchers with focus on argumentation quality have not yet investigated fallacies empirically. The nonexistence of resources dealing with fallacious argumentation calls for scalable approaches to data acquisition and annotation, for which the serious games methodology offers an appealing, yet unexplored, alternative. We present Argotario, a serious game that deals with fallacies in everyday argumentation. Argotario is a multilingual, open-source, platform-independent application with strong educational aspects, accessible at www.argotario.net.

Submitted 19 July, 2017; originally announced July 2017.

Comments: EMNLP 2017 demo paper. Source codes: https://github.com/UKPLab/argotario

Showing 1–11 of 11 results for author: Pauli, P