-
How do ionic superdiscs self-assemble in nanopores?
Authors:
Zhuoqing Li,
Aileen R. Raab,
Mohamed A. Kolmangadi,
Mark Busch,
Marco Grunwald,
Felix Demel,
Florian Bertram,
Andriy V. Kityk,
Andreas Schoenhals,
Sabine Laschat,
Patrick Huber
Abstract:
Discotic ionic liquid crystals (DILCs) consist of self-assembled superdiscs of cations and anions that spontaneously stack in linear columns with high one-dimensional ionic and electronic charge mobility, making them prominent model systems for functional soft matter. Unfortunately, a homogeneous alignment of DILCs on the macroscale is often not achievable, which significantly limits their applicability. Infiltration into nanoporous solid scaffolds can in principle overcome this drawback. However, owing to the extreme experimental challenges of scrutinising liquid crystalline order in extreme spatial confinement, little is known about the structures of DILCs in nanopores. Here, we present temperature-dependent high-resolution optical birefringence measurements and 3D reciprocal space mapping based on synchrotron X-ray scattering to investigate the thermotropic phase behaviour of dopamine-based ionic liquid crystals confined in cylindrical channels of 180 nm diameter in macroscopic anodic aluminum oxide (AAO) membranes. As a function of the membranes' hydrophilicity and thus the molecular anchoring to the pore walls (edge-on or face-on) and the variation of the hydrophilic-hydrophobic balance between the aromatic cores and the alkyl side chain motifs of the superdiscs by tailored chemical synthesis, we find a particularly rich phase behaviour, which is not present in the bulk state. It is governed by a complex interplay of liquid crystalline elastic energies (bending and splay deformations), polar interactions and pure geometric confinement, and includes textural transitions between radial and axial alignment of the columns with respect to the long nanochannel axis.
Submitted 23 January, 2024;
originally announced January 2024.
-
Long-Term Fairness with Unknown Dynamics
Authors:
Tongxin Yin,
Reilly Raab,
Mingyan Liu,
Yang Liu
Abstract:
While machine learning can myopically reinforce social inequalities, it may also be used to dynamically seek equitable outcomes. In this paper, we formalize long-term fairness in the context of online reinforcement learning. This formulation can accommodate dynamical control objectives, such as driving equity inherent in the state of a population, that cannot be incorporated into static formulations of fairness. We demonstrate that this framing allows an algorithm to adapt to unknown dynamics by sacrificing short-term incentives to drive a classifier-population system towards more desirable equilibria. For the proposed setting, we develop an algorithm that adapts recent work in online learning. We prove that this algorithm achieves simultaneous probabilistic bounds on cumulative loss and cumulative violations of fairness (as statistical regularities between demographic groups). We compare our proposed algorithm to the repeated retraining of myopic classifiers, as a baseline, and to a deep reinforcement learning algorithm that lacks safety guarantees. Our experiments model human populations according to evolutionary game theory and integrate real-world datasets.
Submitted 7 June, 2023; v1 submitted 18 April, 2023;
originally announced April 2023.
-
Conjugate Natural Selection
Authors:
Reilly Raab,
Luca de Alfaro,
Yang Liu
Abstract:
We prove that Fisher-Rao natural gradient descent (FR-NGD) optimally approximates the continuous time replicator equation (an essential model of evolutionary dynamics), and term this correspondence "conjugate natural selection". This correspondence promises alternative approaches for evolutionary computation over continuous or high-dimensional hypothesis spaces. As a special case, FR-NGD also provides the optimal approximation of continuous Bayesian inference when hypotheses compete on the basis of predicting actual observations. In this case, the method avoids the need to compute prior probabilities. We demonstrate our findings on a non-convex optimization problem and a system identification task for a stochastic process with time-varying parameters.
Submitted 12 June, 2023; v1 submitted 29 August, 2022;
originally announced August 2022.
-
Fairness Transferability Subject to Bounded Distribution Shift
Authors:
Yatong Chen,
Reilly Raab,
Jialu Wang,
Yang Liu
Abstract:
Given an algorithmic predictor that is "fair" on some source distribution, will it still be fair on an unknown target distribution that differs from the source within some bound? In this paper, we study the transferability of statistical group fairness for machine learning predictors (i.e., classifiers or regressors) subject to bounded distribution shifts. Such shifts may be introduced by initial training data uncertainties, user adaptation to a deployed predictor, dynamic environments, or the use of pre-trained models in new settings. Herein, we develop a bound that characterizes such transferability, flagging potentially inappropriate deployments of machine learning for socially consequential tasks. We first develop a framework for bounding violations of statistical fairness subject to distribution shift, formulating a generic upper bound for transferred fairness violations as our primary result. We then develop bounds for specific worked examples, focusing on two commonly used fairness definitions (i.e., demographic parity and equalized odds) and two classes of distribution shift (i.e., covariate shift and label shift). Finally, we compare our theoretical bounds to deterministic models of distribution shift and against real-world data, finding that we are able to estimate fairness violation bounds in practice, even when simplifying assumptions are only approximately satisfied.
Submitted 15 December, 2022; v1 submitted 31 May, 2022;
originally announced June 2022.
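The two fairness definitions named in the abstract above, demographic parity and equalized odds, can be illustrated with a minimal sketch. This is not the paper's code: the function names, the toy data, and the choice to report equalized odds as the larger of the two per-label gaps are assumptions made here for illustration.

```python
def rate(values):
    """Fraction of positive (1) entries; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def demographic_parity_gap(y_pred, group):
    """|P(Yhat=1 | G=0) - P(Yhat=1 | G=1)| for binary predictions and groups."""
    g0 = [p for p, g in zip(y_pred, group) if g == 0]
    g1 = [p for p, g in zip(y_pred, group) if g == 1]
    return abs(rate(g0) - rate(g1))


def equalized_odds_gap(y_true, y_pred, group):
    """Largest gap in P(Yhat=1 | Y=y, G=g) between groups, over y in {0, 1}.

    Taking the maximum over y is one common convention (an assumption here).
    """
    gaps = []
    for y in (0, 1):
        g0 = [p for t, p, g in zip(y_true, y_pred, group) if t == y and g == 0]
        g1 = [p for t, p, g in zip(y_true, y_pred, group) if t == y and g == 1]
        gaps.append(abs(rate(g0) - rate(g1)))
    return max(gaps)


# Hypothetical toy data: eight individuals, two groups of four.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_gap(y_pred, group)
eo_gap = equalized_odds_gap(y_true, y_pred, group)
print(dp_gap, eo_gap)
```

Evaluating the same two functions on a source sample and on a shifted target sample gives the kind of empirical fairness violations that a transfer bound is meant to control.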
-
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification
Authors:
Leo Schwinn,
Leon Bungert,
An Nguyen,
René Raab,
Falk Pulsmeyer,
Doina Precup,
Björn Eskofier,
Dario Zanca
Abstract:
The reliability of neural networks is essential for their use in safety-critical applications. Existing approaches generally aim at improving the robustness of neural networks to either real-world distribution shifts (e.g., common corruptions and perturbations, spatial transformations, and natural adversarial examples) or worst-case distribution shifts (e.g., optimized adversarial examples). In this work, we propose the Decision Region Quantification (DRQ) algorithm to improve the robustness of any differentiable pre-trained model against both real-world and worst-case distribution shifts in the data. DRQ analyzes the robustness of local decision regions in the vicinity of a given data point to make more reliable predictions. We theoretically motivate the DRQ algorithm by showing that it effectively smooths spurious local extrema in the decision surface. Furthermore, we propose an implementation using targeted and untargeted adversarial attacks. An extensive empirical evaluation shows that DRQ increases the robustness of adversarially and non-adversarially trained models against real-world and worst-case distribution shifts on several computer vision benchmark datasets.
Submitted 19 May, 2022;
originally announced May 2022.
-
Unintended Selection: Persistent Qualification Rate Disparities and Interventions
Authors:
Reilly Raab,
Yang Liu
Abstract:
Realistically -- and equitably -- modeling the dynamics of group-level disparities in machine learning remains an open problem. In particular, we desire models that do not suppose inherent differences between artificial groups of people -- but rather endogenize disparities by appeal to unequal initial conditions of insular subpopulations. In this paper, agents each have a real-valued feature $X$ (e.g., credit score) informed by a "true" binary label $Y$ representing qualification (e.g., for a loan). Each agent alternately (1) receives a binary classification label $\hat{Y}$ (e.g., loan approval) from a Bayes-optimal machine learning classifier observing $X$ and (2) may update their qualification $Y$ by imitating successful strategies (e.g., seek a raise) within an isolated group $G$ of agents to which they belong. We consider the disparity of qualification rates $\Pr(Y=1)$ between different groups and how this disparity changes subject to a sequence of Bayes-optimal classifiers repeatedly retrained on the global population. We model the evolving qualification rates of each subpopulation (group) using the replicator equation, which derives from a class of imitation processes. We show that differences in qualification rates between subpopulations can persist indefinitely for a set of non-trivial equilibrium states due to uniformed classifier deployments, even when groups are identical in all aspects except initial qualification densities. We next simulate the effects of commonly proposed fairness interventions on this dynamical system along with a new feedback control mechanism capable of permanently eliminating group-level qualification rate disparities. We conclude by discussing the limitations of our model and findings and by outlining potential future work.
Submitted 29 December, 2021; v1 submitted 1 November, 2021;
originally announced November 2021.
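The replicator equation mentioned in the abstract above has the standard form dx_i/dt = x_i (f_i - f_avg), where x_i is the population share of strategy i, f_i its fitness, and f_avg the share-weighted mean fitness. The sketch below integrates it with forward Euler steps. The constant fitness values, the step size, and the function names are hypothetical; in the paper the payoffs depend on the deployed classifier, which is not modelled here.

```python
def replicator_step(shares, fitness, dt):
    """One forward-Euler step of dx_i/dt = x_i * (f_i - f_avg)."""
    f_avg = sum(x * f for x, f in zip(shares, fitness))
    updated = [x + dt * x * (f - f_avg) for x, f in zip(shares, fitness)]
    total = sum(updated)  # equals 1 up to rounding; renormalise to stop drift
    return [u / total for u in updated]


def simulate(shares, fitness, dt=0.1, steps=200):
    """Iterate the replicator step and return the full trajectory."""
    trajectory = [list(shares)]
    for _ in range(steps):
        shares = replicator_step(shares, fitness, dt)
        trajectory.append(shares)
    return trajectory


# Hypothetical example: strategy 0 ("qualified") has higher constant fitness
# than strategy 1 ("unqualified"), starting from a 10% qualification rate.
trajectory = simulate([0.1, 0.9], fitness=[1.0, 0.5])
final_shares = trajectory[-1]
print(final_shares)
```

With constant fitness the higher-payoff strategy simply takes over; the persistent disparities described in the abstract arise only when fitness depends on the population state through the retrained classifier.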
-
Exploring Misclassifications of Robust Neural Networks to Enhance Adversarial Attacks
Authors:
Leo Schwinn,
René Raab,
An Nguyen,
Dario Zanca,
Bjoern Eskofier
Abstract:
Progress in making neural networks more robust against adversarial attacks is mostly marginal, despite the great efforts of the research community. Moreover, the robustness evaluation is often imprecise, making it difficult to identify promising approaches. We analyze the classification decisions of 19 different state-of-the-art neural networks trained to be robust against adversarial attacks. Our findings suggest that current untargeted adversarial attacks induce misclassification towards only a limited number of different classes. Additionally, we observe that both over- and under-confidence in model predictions result in an inaccurate assessment of model robustness. Based on these observations, we propose a novel loss function for adversarial attacks that consistently improves attack success rate compared to prior loss functions for 19 out of 19 analyzed models.
Submitted 25 May, 2021; v1 submitted 21 May, 2021;
originally announced May 2021.
-
CLIP: Cheap Lipschitz Training of Neural Networks
Authors:
Leon Bungert,
René Raab,
Tim Roith,
Leo Schwinn,
Daniel Tenbrinck
Abstract:
Despite the large success of deep neural networks (DNNs) in recent years, most neural networks still lack mathematical guarantees in terms of stability. For instance, DNNs are vulnerable to small or even imperceptible input perturbations, so-called adversarial examples, that can cause false predictions. This instability can have severe consequences in applications which influence the health and safety of humans, e.g., biomedical imaging or autonomous driving. While bounding the Lipschitz constant of a neural network improves stability, most methods rely on restricting the Lipschitz constants of each layer, which gives a poor bound for the actual Lipschitz constant.
In this paper we investigate a variational regularization method named CLIP for controlling the Lipschitz constant of a neural network, which can easily be integrated into the training procedure. We mathematically analyze the proposed model, in particular discussing the impact of the chosen regularization parameter on the output of the network. Finally, we numerically evaluate our method on both a nonlinear regression problem and the MNIST and Fashion-MNIST classification databases, and compare our results with a weight regularization approach.
Submitted 31 October, 2022; v1 submitted 23 March, 2021;
originally announced March 2021.
-
Identifying Untrustworthy Predictions in Neural Networks by Geometric Gradient Analysis
Authors:
Leo Schwinn,
An Nguyen,
René Raab,
Leon Bungert,
Daniel Tenbrinck,
Dario Zanca,
Martin Burger,
Bjoern Eskofier
Abstract:
The susceptibility of deep neural networks to untrustworthy predictions, including out-of-distribution (OOD) data and adversarial examples, still prevents their widespread use in safety-critical applications. Most existing methods either require a re-training of a given model to achieve robust identification of adversarial attacks or are limited to out-of-distribution sample detection only. In this work, we propose a geometric gradient analysis (GGA) to improve the identification of untrustworthy predictions without retraining of a given model. GGA analyzes the geometry of the loss landscape of neural networks based on the saliency maps of their respective input. To motivate the proposed approach, we provide theoretical connections between gradients' geometrical properties and local minima of the loss function. Furthermore, we demonstrate that the proposed method outperforms prior approaches in detecting OOD data and adversarial attacks, including state-of-the-art and adaptive attacks.
Submitted 24 February, 2021;
originally announced February 2021.
-
Dynamically Sampled Nonlocal Gradients for Stronger Adversarial Attacks
Authors:
Leo Schwinn,
An Nguyen,
René Raab,
Dario Zanca,
Bjoern Eskofier,
Daniel Tenbrinck,
Martin Burger
Abstract:
The vulnerability of deep neural networks to small and even imperceptible perturbations has become a central topic in deep learning research. Although several sophisticated defense mechanisms have been introduced, most were later shown to be ineffective. However, a reliable evaluation of model robustness is mandatory for deployment in safety-critical scenarios. To overcome this problem we propose a simple yet effective modification to the gradient calculation of state-of-the-art first-order adversarial attacks. Normally, the gradient update of an attack is directly calculated for the given data point. This approach is sensitive to noise and small local optima of the loss function. Inspired by gradient sampling techniques from non-convex optimization, we propose Dynamically Sampled Nonlocal Gradient Descent (DSNGD). DSNGD calculates the gradient direction of the adversarial attack as the weighted average over past gradients of the optimization history. Moreover, distribution hyperparameters that define the sampling operation are automatically learned during the optimization scheme. We empirically show that by incorporating this nonlocal gradient information, we are able to give a more accurate estimation of the global descent direction on noisy and non-convex loss surfaces. In addition, we show that DSNGD-based attacks are on average 35% faster while achieving 0.9% to 27.1% higher success rates compared to their gradient descent-based counterparts.
Submitted 27 September, 2021; v1 submitted 5 November, 2020;
originally announced November 2020.
-
Towards Rapid and Robust Adversarial Training with One-Step Attacks
Authors:
Leo Schwinn,
René Raab,
Björn Eskofier
Abstract:
Adversarial training is the most successful empirical method for increasing the robustness of neural networks against adversarial attacks. However, the most effective approaches, like training with Projected Gradient Descent (PGD), are accompanied by high computational complexity. In this paper, we present two ideas that, in combination, enable adversarial training with the computationally less expensive Fast Gradient Sign Method (FGSM). First, we add uniform noise to the initial data point of the FGSM attack, which creates a wider variety of adversaries, thus prohibiting overfitting to one particular perturbation bound. Further, we add a learnable regularization step prior to the neural network, which we call Pixelwise Noise Injection Layer (PNIL). Inputs propagated through the PNIL are resampled from a learned Gaussian distribution. The regularization induced by the PNIL prevents the model from learning to obfuscate its gradients, a factor that hindered prior approaches from successfully applying one-step methods for adversarial training. We show that noise injection in conjunction with FGSM-based adversarial training achieves comparable results to adversarial training with PGD while being considerably faster. Moreover, we outperform PGD-based adversarial training by combining noise injection and PNIL.
Submitted 17 March, 2020; v1 submitted 24 February, 2020;
originally announced February 2020.
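The first idea in the abstract above, adding uniform noise to the data point before the FGSM step, can be sketched on a toy model. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses a logistic-regression loss instead of a neural network, sets the step size equal to the perturbation bound `eps`, and projects the result back onto the `eps`-ball around the clean input. The PNIL layer is not modelled.

```python
import math
import random


def logistic_loss(w, x, y):
    """log(1 + exp(-y * <w, x>)) for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log(1.0 + math.exp(-margin))


def loss_grad_x(w, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_random_start(w, x, y, eps, rng):
    """One-step FGSM from a uniformly perturbed starting point."""
    # Uniform noise in [-eps, eps] on every coordinate (the random start).
    start = [xi + rng.uniform(-eps, eps) for xi in x]
    grad = loss_grad_x(w, start, y)
    # Single signed-gradient ascent step on the loss.
    stepped = [si + eps * sign(gi) for si, gi in zip(start, grad)]
    # Project back so the total perturbation stays within eps of x.
    return [min(max(v, xi - eps), xi + eps) for v, xi in zip(stepped, x)]


# Hypothetical weights and input.
w = [1.0, -2.0, 0.5]
x = [0.5, -0.3, 0.8]
y = 1
eps = 0.1
x_adv = fgsm_random_start(w, x, y, eps, random.Random(0))
print(logistic_loss(w, x, y), logistic_loss(w, x_adv, y))
```

Because the start point differs on every call, repeated calls yield adversaries of varying strength within the same bound, which is the variety the abstract credits with preventing overfitting to a single perturbation.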