-
Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows
Authors:
Yifan Chen,
Daniel Zhengyu Huang,
Jiaoyang Huang,
Sebastian Reich,
Andrew M. Stuart
Abstract:
In this paper, we study efficient approximate sampling for probability distributions known up to normalization constants. We specifically focus on a problem class arising in Bayesian inference for large-scale inverse problems in science and engineering applications. The computational challenges we address with the proposed methodology are: (i) the need for repeated evaluations of expensive forward models; (ii) the potential existence of multiple modes; and (iii) the fact that gradients of, or adjoint solvers for, the forward model might not be available.
While existing Bayesian inference methods meet some of these challenges individually, we propose a framework that tackles all three systematically. Our approach builds upon the Fisher-Rao gradient flow in probability space, yielding a dynamical system for probability densities that converges towards the target distribution at a uniform exponential rate. This rapid convergence is advantageous for the computational burden outlined in (i). We apply Gaussian mixture approximations with operator splitting techniques to simulate the flow numerically; the resulting approximation can capture multiple modes thus addressing (ii). Furthermore, we employ the Kalman methodology to facilitate a derivative-free update of these Gaussian components and their respective weights, addressing the issue in (iii).
The proposed methodology results in an efficient derivative-free sampler flexible enough to handle multi-modal distributions: Gaussian Mixture Kalman Inversion (GMKI). The effectiveness of GMKI is demonstrated both theoretically and numerically in several experiments with multimodal target distributions, including proof-of-concept and two-dimensional examples, as well as a large-scale application: recovering the Navier-Stokes initial condition from solution data at positive times.
Submitted 25 June, 2024;
originally announced June 2024.
-
Robust parameter estimation for partially observed second-order diffusion processes
Authors:
Jan Albrecht,
Sebastian Reich
Abstract:
Estimating parameters of a diffusion process given continuous-time observations of the process via maximum likelihood approaches or, online, via stochastic gradient descent or Kalman filter formulations constitutes a well-established research area. It has also been established previously that these techniques are, in general, not robust to perturbations in the data in the form of temporal correlations. While the subject is relatively well understood and appropriate modifications have been suggested in the context of multi-scale diffusion processes and their reduced model equations, we consider here an alternative setting where a second-order diffusion process in positions and velocities is only observed via its positions. In this note, we propose a simple modification to standard stochastic gradient descent and Kalman filter formulations, which eliminates the arising systematic estimation biases. The modification can be extended to standard maximum likelihood approaches and avoids computation of previously proposed correction terms.
Submitted 20 June, 2024;
originally announced June 2024.
-
Raman scattering by carbon nanotubes coupled to quantum dots via dipolar excitonic interaction
Authors:
Anna Wroblewska,
Niclas S. Mueller,
Mariusz Zdrojek,
Stephanie Reich,
Georgy Gordeev
Abstract:
The dipole-dipole interactions between excitons are of paramount importance in nanoscale structures. When two excitons are placed close together, they can exchange energy, which can manifest in the resonant Raman cross sections. We provide a theoretical framework for such effects by combining the coupled oscillator model and perturbation theory. We apply this theory to a hybrid film comprising semiconducting quantum dots and metallic carbon nanotubes. The quantum dot exciton has a fixed energy, while the nanotube resonances span a larger range from 1.7 to 1.93 eV. We acquire the resonant Raman profiles of the pristine nanotubes and hybrids and find a relative shift between them. The shift direction depends on the relative energies of the CNT and QD excitons, as predicted by our theory.
Submitted 8 April, 2024;
originally announced April 2024.
-
Resonant Raman signatures of exciton polarons in a transition metal oxide: BiVO$_4$
Authors:
Georgy Gordeev,
Christina Hill,
Angelina Gudima,
Stephanie Reich,
Mael Guennou
Abstract:
In this work we investigate the delocalized excitons and excitons trapped by a polaron formation in BiVO$_4$ by means of resonant Raman spectroscopy. We record Raman spectra with 16 laser lines between 1.9 and 2.6 eV and analyze intensity variations of the Raman peaks for different vibrational modes. The resonant Raman cross sections of the $A_g$ modes contain two types of resonances. The first, high-energy resonance near 2.45 eV belongs to a transition between delocalized states; it is close to the absorption edge measured at 2.3 eV and exhibits a characteristic 50 meV anisotropy between polarization parallel and perpendicular to the $c$ axis. The second, lower-energy Raman resonance occurs inside the gap at 1.94 eV for all crystallographic directions.
The in-gap resonance can involve a localized transition. We attribute it to an exciton-polaron, formed by a small localized electron polaron of Holstein type and delocalized holes. It manifests in the vibrations of vanadium and oxygen atoms where polaron localization occurs and the resonance energy matches theoretical predictions. The vibrational modes couple to the polaron with different efficiency determined from resonant Raman profiles.
Submitted 5 April, 2024;
originally announced April 2024.
-
Dielectric Screening Inside Carbon Nanotubes
Authors:
Georgy Gordeev,
Sören Wasserroth,
Han Li,
Ado Jorio,
Benjamin S. Flavel,
Stephanie Reich
Abstract:
Dielectric screening plays a vital role for physical properties at the nanoscale and also alters our ability to detect and characterize nanomaterials by optical techniques. We study the dielectric screening inside carbon nanotubes and how it changes electromagnetic fields and many-body effects for encapsulated nanostructures. First, we show that the local electric field inside a nanotube is altered by one-dimensional screening, with dramatic effects on the effective Raman scattering efficiency of the encapsulated species for metallic walls. The scattering intensity of the inner tube is two orders of magnitude weaker than for the tube in air, which is nicely reproduced by local field calculations. Secondly, we find that the optical transition energies of the inner nanotubes shift to lower energies compared to single-walled carbon nanotubes of the same chirality. The shift is larger if the outer tube is metallic than when it is semiconducting. The magnitude of the shift suggests that the excitons of small-diameter inner metallic tubes are thermally dissociated at room temperature if the outer tube is also metallic, so that in essence we observe band-to-band transitions.
Submitted 1 April, 2024;
originally announced April 2024.
-
Stable generative modeling using diffusion maps
Authors:
Georg Gottwald,
Fengyi Li,
Youssef Marzouk,
Sebastian Reich
Abstract:
We consider the problem of sampling from an unknown distribution for which only a sufficiently large number of training samples are available. Such settings have recently drawn considerable interest in the context of generative modelling. In this paper, we propose a generative model combining diffusion maps and Langevin dynamics. Diffusion maps are used to approximate the drift term from the available training samples, which is then implemented in a discrete-time Langevin sampler to generate new samples. By setting the kernel bandwidth to match the time step size used in the unadjusted Langevin algorithm, our method effectively circumvents any stability issues typically associated with time-stepping stiff stochastic differential equations. More precisely, we introduce a novel split-step scheme, ensuring that the generated samples remain within the convex hull of the training samples. Our framework can be naturally extended to generate conditional samples. We demonstrate the performance of our proposed scheme through experiments on synthetic datasets with increasing dimensions and on a stochastic subgrid-scale parametrization conditional sampling problem.
Submitted 9 January, 2024;
originally announced January 2024.
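For readers unfamiliar with the unadjusted Langevin algorithm mentioned in the abstract above, the following is a minimal sketch of the plain scheme only. It assumes the score of the target is known in closed form (here a hypothetical standard normal target); it does not implement the diffusion-map drift approximation or the split-step scheme the paper proposes, and the function names are illustrative.

```python
import numpy as np

def ula(grad_log_target, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm: Euler-Maruyama steps, no Metropolis correction."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step * grad_log_target(x) + np.sqrt(2.0 * step) * noise
    return x

def demo(step=0.1, n_chains=5000, n_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical target: standard normal, so grad log pi(x) = -x.
    samples = ula(lambda x: -x, np.zeros(n_chains), step, n_steps, rng)
    return samples.mean(), samples.var()
```

For this linear drift the iteration is stable only for step sizes below 2, and its stationary variance is 1 / (1 - step / 2) rather than 1, which illustrates the step-size sensitivity that motivates stabilized schemes.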
-
Filtered data based estimators for stochastic processes driven by colored noise
Authors:
Grigorios A. Pavliotis,
Sebastian Reich,
Andrea Zanoni
Abstract:
We consider the problem of estimating unknown parameters in stochastic differential equations driven by colored noise, which we model as a sequence of Gaussian stationary processes with decreasing correlation time. We aim to infer parameters in the limit equation, driven by white noise, given observations of the colored noise dynamics. We consider both the maximum likelihood and the stochastic gradient descent in continuous time estimators, and we propose to modify them by including filtered data. We provide a convergence analysis for our estimators showing their asymptotic unbiasedness in a general setting and asymptotic normality under a simplified scenario.
Submitted 22 January, 2024; v1 submitted 26 December, 2023;
originally announced December 2023.
-
Levitin-Polyak well-posedness of split multivalued variational inequalities
Authors:
Soumitra Dey,
Simeon Reich
Abstract:
We introduce and study the split multivalued variational inequality problem (SMVIP) and the parametric SMVIP. We examine, in particular, Levitin-Polyak well-posedness of SMVIPs and parametric SMVIPs in Hilbert spaces. We provide several examples to illustrate our theoretical results. We also discuss several important special cases.
Submitted 29 November, 2023;
originally announced November 2023.
-
Particle-based algorithm for stochastic optimal control
Authors:
Sebastian Reich
Abstract:
The solution to a stochastic optimal control problem can be determined by computing the value function from a discretization of the associated Hamilton-Jacobi-Bellman equation. Alternatively, the problem can be reformulated in terms of a pair of forward-backward SDEs, which makes Monte-Carlo techniques applicable. More recently, the problem has also been viewed from the perspective of forward and reverse time SDEs and their associated Fokker-Planck equations. This approach is closely related to techniques used in diffusion-based generative models. Forward and reverse time formulations express the value function as the ratio of two probability density functions; one stemming from a forward McKean-Vlasov SDE and another one from a reverse McKean-Vlasov SDE. In this paper, we extend this approach to a more general class of stochastic optimal control problems and combine it with ensemble Kalman filter type and diffusion map approximation techniques in order to obtain efficient and robust particle-based algorithms.
Submitted 27 February, 2024; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Longitudinal Polaritons in Crystals
Authors:
Eduardo B. Barros,
Stephanie Reich
Abstract:
The collective excitations of solids are classified as longitudinal and transverse depending on their relative polarization and propagation direction. This seemingly formal classification results in surprisingly distinct types of excitations if calculated within the Coulomb gauge. Transverse modes couple to free-space photons and hybridize into polaritons for strong light-matter coupling. Longitudinal modes, in contrast, are seen as pure matter excitations that produce a dynamic polarization inside the material without photon coupling. Here we show that both longitudinal and transverse modes become polaritons in the explicitly covariant Lorenz gauge. Longitudinal excitations couple to longitudinal and scalar photons, which have been considered elusive so far. We show that the dipolar excitations become three-fold degenerate in the long-wavelength limit when including all photonic degrees of freedom, as expected from symmetry. Our findings demonstrate how choosing a gauge determines our thinking about materials excitations and how gauge fixing reveals new pathways for tailoring polaritons in crystals, metamaterials, and surfaces. Longitudinal polaritons will interact with longitudinal near fields located at surfaces, which provides additional excitation channels to engineer scanning near-field microscopy and surface-enhanced spectroscopy.
Submitted 15 April, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
New iterative algorithms for solving split variational inclusions
Authors:
Soumitra Dey,
Chinedu Izuchukwu,
Adeolu Taiwo,
Simeon Reich
Abstract:
In this paper we study a class of split variational inclusion (SVI) and regularized split variational inclusion (RSVI) problems in real Hilbert spaces. We discuss various analytical properties of the net generated by the RSVI and establish the existence and uniqueness of the solution to the RSVI. Using analytical properties of this net and under certain assumptions on the parameters and mappings associated with the SVI, we establish the strong convergence of the sequence generated by our proposed iterative algorithm. We also deduce another iterative algorithm by taking the regularization parameters to be zero in our proposed algorithm. We establish the weak convergence of the sequence generated by our new algorithm under certain assumptions. Moreover, we discuss two special cases of the SVI, namely the split convex minimization and the split variational inequality problems, and give several numerical examples.
Submitted 16 October, 2023;
originally announced October 2023.
-
Tweedie Moment Projected Diffusions For Inverse Problems
Authors:
Benjamin Boys,
Mark Girolami,
Jakiw Pidstrigach,
Sebastian Reich,
Alan Mosca,
O. Deniz Akyildiz
Abstract:
Diffusion generative models unlock new possibilities for inverse problems as they allow for the incorporation of strong empirical priors into the process of scientific inference. Recently, diffusion models have received significant attention for solving inverse problems by posterior sampling, but many challenges remain open due to the intractability of this sampling process. Prior work resorted to Gaussian approximations to conditional densities of the reverse process, leveraging Tweedie's formula to parameterise its mean, complemented with various heuristics. In this work, we leverage higher order information using Tweedie's formula and obtain a finer approximation with a principled covariance estimate. This novel approximation removes any time-dependent step-size hyperparameters required by earlier methods, and enables higher quality approximations of the posterior density, which results in better samples. Specifically, we tackle noisy linear inverse problems and obtain a novel approximation to the gradient of the likelihood. We then plug this gradient estimate into various diffusion models and show that this method is optimal for a Gaussian data distribution. We illustrate the empirical effectiveness of our approach for general linear inverse problems on toy synthetic examples as well as image restoration using pretrained diffusion models as the prior. We show that our method improves the sample quality by providing statistically principled approximations to the diffusion posterior sampling problem.
Submitted 22 November, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Sampling via Gradient Flows in the Space of Probability Measures
Authors:
Yifan Chen,
Daniel Zhengyu Huang,
Jiaoyang Huang,
Sebastian Reich,
Andrew M Stuart
Abstract:
Sampling a target probability distribution with an unknown normalization constant is a fundamental challenge in computational science and engineering. Recent work shows that algorithms derived by considering gradient flows in the space of probability measures open up new avenues for algorithm development. This paper makes three contributions to this sampling approach by scrutinizing the design components of such gradient flows. Any instantiation of a gradient flow for sampling needs an energy functional and a metric to determine the flow, as well as numerical approximations of the flow to derive algorithms. Our first contribution is to show that the Kullback-Leibler divergence, as an energy functional, has the unique property (among all f-divergences) that gradient flows resulting from it do not depend on the normalization constant of the target distribution. Our second contribution is to study the choice of metric from the perspective of invariance. The Fisher-Rao metric is known as the unique choice (up to scaling) that is diffeomorphism invariant. As a computationally tractable alternative, we introduce a relaxed, affine invariance property for the metrics and gradient flows. In particular, we construct various affine invariant Wasserstein and Stein gradient flows. Affine invariant gradient flows are shown to behave more favorably than their non-affine-invariant counterparts when sampling highly anisotropic distributions, in theory and by using particle methods. Our third contribution is to study, and develop efficient algorithms based on Gaussian approximations of the gradient flows; this leads to an alternative to particle methods. We establish connections between various Gaussian approximate gradient flows, discuss their relation to gradient methods arising from parametric variational inference, and study their convergence properties both theoretically and numerically.
Submitted 9 March, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Strong Coupling of Two-Dimensional Excitons and Plasmonic Photonic Crystals: Microscopic Theory Reveals Triplet Spectra
Authors:
Lara Greten,
Robert Salzwedel,
Tobias Göde,
David Greten,
Stephanie Reich,
Stephen Hughes,
Malte Selig,
Andreas Knorr
Abstract:
Monolayers of transition metal dichalcogenides (TMDC) are direct-gap semiconductors with strong light-matter interactions featuring tightly bound excitons, while plasmonic crystals (PCs), consisting of metal nanoparticles that act as meta-atoms, exhibit collective plasmon modes and allow one to tailor electric fields on the nanoscale. Recent experiments show that TMDC-PC hybrids can reach the strong-coupling limit between excitons and plasmons forming new quasiparticles, so-called plexcitons. To describe this coupling theoretically, we develop a self-consistent Maxwell-Bloch theory for TMDC-PC hybrid structures, which allows us to compute the scattered light in the near- and far-field explicitly and provide guidance for experimental studies. Our calculations reveal a spectral splitting signature of strong coupling of more than $100\,$meV in gold-MoSe$_2$ structures with $30\,$nm nanoparticles, manifesting in a hybridization of exciton and plasmon into two effective plexcitonic bands. In addition to the hybridized states, we find a remaining excitonic mode with significantly smaller coupling to the plasmonic near-field, emitting directly into the far-field. Thus, hybrid spectra in the strong coupling regime can contain three emission peaks.
Submitted 18 September, 2023;
originally announced September 2023.
-
Affine Invariant Ensemble Transform Methods to Improve Predictive Uncertainty in Neural Networks
Authors:
Diksha Bhandari,
Jakiw Pidstrigach,
Sebastian Reich
Abstract:
We consider the problem of performing Bayesian inference for logistic regression using appropriate extensions of the ensemble Kalman filter. We propose two interacting particle systems that sample from an approximate posterior and prove quantitative convergence rates of these interacting particle systems to their mean-field limit as the number of particles tends to infinity. Furthermore, we apply these techniques and examine their effectiveness as methods of Bayesian approximation for quantifying predictive uncertainty in neural networks.
Submitted 1 July, 2024; v1 submitted 9 September, 2023;
originally announced September 2023.
-
Dropout Ensemble Kalman inversion for high dimensional inverse problems
Authors:
Shuigen Liu,
Sebastian Reich,
Xin T. Tong
Abstract:
Ensemble Kalman inversion (EKI) is an ensemble-based method to solve inverse problems. Its gradient-free formulation makes it an attractive tool for problems with involved formulations. However, EKI suffers from the "subspace property", i.e., the EKI solutions are confined to the subspace spanned by the initial ensemble. This implies that the ensemble size should be larger than the problem dimension to ensure EKI's convergence to the correct solution. Such scaling of the ensemble size is impractical and prevents the use of EKI in high-dimensional problems. To address this issue, we propose a novel approach using dropout regularization to mitigate the subspace problem. We prove that dropout-EKI converges in the small-ensemble setting, and that the computational cost of the algorithm scales linearly with dimension. We also show that dropout-EKI reaches the optimal query complexity, up to a constant factor. Numerical examples demonstrate the effectiveness of our approach.
Submitted 31 August, 2023;
originally announced August 2023.
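The subspace property described in the abstract above can be checked numerically. The sketch below is a standard deterministic EKI update (no perturbed observations, no dropout), applied to a hypothetical random linear forward map; it is not the dropout-EKI algorithm of the paper, and all names and sizes are illustrative assumptions.

```python
import numpy as np

def eki_step(U, G, y, Gamma):
    """One deterministic EKI update. U has shape (J, d), one particle per row."""
    GU = np.array([G(u) for u in U])          # (J, k) forward evaluations
    dU = U - U.mean(axis=0)                   # parameter deviations
    dG = GU - GU.mean(axis=0)                 # output deviations
    J = U.shape[0]
    C_uG = dU.T @ dG / (J - 1)                # (d, k) cross-covariance
    C_GG = dG.T @ dG / (J - 1)                # (k, k) output covariance
    K = C_uG @ np.linalg.inv(C_GG + Gamma)    # Kalman-type gain, no derivatives of G
    return U + (y - GU) @ K.T

def subspace_residual(d=10, k=3, J=4, n_steps=10, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((k, d))           # hypothetical linear forward map
    y = A @ rng.standard_normal(d)            # noise-free synthetic data
    Gamma = 0.1 * np.eye(k)
    U0 = rng.standard_normal((J, d))          # J < d: small ensemble
    U = U0.copy()
    for _ in range(n_steps):
        U = eki_step(U, lambda u: A @ u, y, Gamma)
    # Distance of the final particles from the affine span of the initial ensemble.
    D0 = (U0[1:] - U0[0]).T                   # (d, J-1) spanning directions
    B = (U - U0[0]).T                         # (d, J) displacements
    coef = np.linalg.lstsq(D0, B, rcond=None)[0]
    return np.abs(D0 @ coef - B).max()
```

Calling `subspace_residual()` returns a value at round-off level: every update is a linear combination of ensemble deviations, so with four particles in ten dimensions the iterates can never leave a three-dimensional affine subspace.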
-
Comparing the Methods of Alternating and Simultaneous Projections for Two Subspaces
Authors:
Simeon Reich,
Rafał Zalas
Abstract:
We study the well-known methods of alternating and simultaneous projections when applied to two nonorthogonal linear subspaces of a real Euclidean space. Assuming that both of the methods have a common starting point chosen from either one of the subspaces, we show that the method of alternating projections converges significantly faster than the method of simultaneous projections. On the other hand, we provide examples of subspaces and starting points, where the method of simultaneous projections outperforms the method of alternating projections.
Submitted 14 November, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
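The comparison in the abstract above can be illustrated in the simplest setting: two lines through the origin in the plane, whose intersection is the origin. This is a sketch under that assumption, with a hypothetical angle `theta` between the lines and a common starting point on the first line; it does not reproduce the paper's general analysis or its counterexamples.

```python
import numpy as np

def projector(direction):
    """Orthogonal projector onto the line spanned by `direction`."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    return np.outer(d, d)

def run(theta=0.3, n_iter=20):
    P_A = projector([1.0, 0.0])
    P_B = projector([np.cos(theta), np.sin(theta)])
    x_alt = x_sim = np.array([1.0, 0.0])      # common starting point, chosen in A
    for _ in range(n_iter):
        x_alt = P_A @ (P_B @ x_alt)           # alternating projections
        x_sim = 0.5 * (P_A + P_B) @ x_sim     # simultaneous (averaged) projections
    # The intersection is {0}, so the norm is the distance to the limit.
    return np.linalg.norm(x_alt), np.linalg.norm(x_sim)
```

In this configuration the alternating iterate contracts by cos^2(theta) per step, while the averaged operator has largest eigenvalue (1 + cos(theta)) / 2, so the alternating error is the smaller of the two returned values.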
-
Collective States in Molecular Monolayers on 2D Materials
Authors:
Sabrina Juergensen,
Moritz Kessens,
Charlotte Berrezueta-Palacios,
Nikolai Severin,
Sumaya Ifland,
Jürgen P. Rabe,
Niclas S. Mueller,
Stephanie Reich
Abstract:
Collective excited states form in organic two-dimensional layers through the Coulomb coupling of the molecular transition dipole moments. They manifest as characteristic strong and narrow peaks in the excitation and emission spectra that are shifted to lower energies compared to the monomer transition. We study experimentally and theoretically how robust the collective states are against homogeneous and inhomogeneous broadening as well as spatial disorder that occur in real molecular monolayers. Using a microscopic model for a two-dimensional dipole lattice in real space we calculate the properties of collective states and their extinction spectra. We find that the collective states persist even for 1-10% random variation in the molecular position and in the transition frequency, with similar peak position and integrated intensity as for the perfectly ordered system. We measure the optical response of a monolayer of the perylene-derivative MePTCDI on two-dimensional materials. On the wide band-gap insulator hexagonal boron nitride it shows strong emission from the collective state with a line width that is dominated by the inhomogeneous broadening of the molecular state. When using the semimetal graphene as a substrate, however, the luminescence is completely quenched. By combining optical absorption, luminescence, and multi-wavelength Raman scattering we verify that the MePTCDI molecules form very similar collective monolayer states on hexagonal boron nitride and graphene substrates, but on graphene the line width is dominated by non-radiative excitation transfer from the molecules to the substrate. Our study highlights the transition from the localized molecular state of the monomer to a delocalized collective state in the two-dimensional molecular lattice that is entirely based on Coulomb coupling between optically active excitations of the electrons or molecular vibrations.
Submitted 14 August, 2023; v1 submitted 18 June, 2023;
originally announced June 2023.
-
The Information Retrieval Experiment Platform
Authors:
Maik Fröbe,
Jan Heinrich Reimer,
Sean MacAvaney,
Niklas Deckers,
Simon Reich,
Janek Bevendorff,
Benno Stein,
Matthias Hagen,
Martin Potthast
Abstract:
We integrate ir_datasets, ir_measures, and PyTerrier with TIRA in the Information Retrieval Experiment Platform (TIREx) to promote more standardized, reproducible, scalable, and even blinded retrieval experiments. Standardization is achieved when a retrieval approach implements PyTerrier's interfaces and the input and output of an experiment are compatible with ir_datasets and ir_measures. However, none of this is a must for reproducibility and scalability, as TIRA can run any dockerized software locally or remotely in a cloud-native execution environment. Version control and caching ensure efficient (re)execution. TIRA allows for blind evaluation when an experiment runs on a remote server or cloud not under the control of the experimenter. The test data and ground truth are then hidden from public access, and the retrieval software has to process them in a sandbox that prevents data leaks.
We currently host an instance of TIREx with 15 corpora (1.9 billion documents) on which 32 shared retrieval tasks are based. Using Docker images of 50 standard retrieval approaches, we automatically evaluated all approaches on all tasks (50 $\cdot$ 32 = 1,600 runs) in less than a week on a midsize cluster (1,620 CPU cores and 24 GPUs). This instance of TIREx is open for submissions and will be integrated with the IR Anthology, as well as released open source.
Submitted 30 May, 2023;
originally announced May 2023.
-
On forward-backward SDE approaches to continuous-time minimum variance estimation
Authors:
Jin Won Kim,
Sebastian Reich
Abstract:
The work of Kalman and Bucy has established a duality between filtering and optimal estimation in the context of time-continuous linear systems. This duality has recently been extended to time-continuous nonlinear systems in terms of an optimization problem constrained by a backward stochastic partial differential equation. Here we revisit this problem from the perspective of appropriate forward-backward stochastic differential equations. This approach sheds new light on the estimation problem and provides a unifying perspective. It is also demonstrated that certain formulations of the estimation problem lead to deterministic formulations similar to the linear Gaussian case as originally investigated by Kalman and Bucy. Finally, optimal control of partially observed diffusion processes is discussed as an application of the proposed estimators.
Submitted 14 August, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
EnKSGD: A Class Of Preconditioned Black Box Optimization And Inversion Algorithms
Authors:
Brian Irwin,
Sebastian Reich
Abstract:
In this paper, we introduce the Ensemble Kalman-Stein Gradient Descent (EnKSGD) class of algorithms. The EnKSGD class of algorithms builds on the ensemble Kalman filter (EnKF) line of work, applying techniques from sequential data assimilation to unconstrained optimization and parameter estimation problems. The essential idea is to exploit the EnKF as a black box (i.e. derivative-free, zeroth order) optimization tool if iterated to convergence. In this paper, we return to the foundations of the EnKF as a sequential data assimilation technique, including its continuous-time and mean-field limits, with the goal of developing faster optimization algorithms suited to noisy black box optimization and inverse problems. The resulting EnKSGD class of algorithms can be designed to both maintain the desirable property of affine-invariance, and employ the well-known backtracking line search. Furthermore, EnKSGD algorithms are designed to not necessitate the subspace restriction property and variance collapse property of previous iterated EnKF approaches to optimization, as both these properties can be undesirable in an optimization context. EnKSGD also generalizes beyond the $L^{2}$ loss, and is thus applicable to a wider class of problems than the standard EnKF. Numerical experiments with both linear and nonlinear least squares problems, as well as maximum likelihood estimation, demonstrate the faster convergence of EnKSGD relative to alternative EnKF approaches to optimization.
Submitted 29 March, 2023;
originally announced March 2023.
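The abstract above refers to using the EnKF as a derivative-free optimizer when iterated to convergence. The following is a minimal sketch of that baseline idea only, not of EnKSGD itself: a deterministic iterated ensemble Kalman update applied to a small linear least-squares problem. The function name `eki_step`, the matrix `A`, the noise covariance, and the ensemble size are all illustrative assumptions, and the update omits perturbed observations.

```python
import numpy as np

def eki_step(ensemble, forward, y, noise_cov):
    """One deterministic ensemble Kalman update (illustrative baseline, not EnKSGD).

    ensemble: (J, d) array of particles; forward: map R^d -> R^m;
    y: (m,) data vector; noise_cov: (m, m) observation noise covariance.
    Only forward evaluations are used, no derivatives of `forward`.
    """
    outputs = np.array([forward(u) for u in ensemble])      # (J, m)
    du = ensemble - ensemble.mean(axis=0)                   # parameter deviations
    dw = outputs - outputs.mean(axis=0)                     # output deviations
    n = ensemble.shape[0]
    c_uw = du.T @ dw / (n - 1)                              # (d, m) cross-covariance
    c_ww = dw.T @ dw / (n - 1)                              # (m, m) output covariance
    # Kalman gain K = C_uw (C_ww + Gamma)^{-1}, computed with a linear solve.
    gain = np.linalg.solve(c_ww + noise_cov, c_uw.T).T
    return ensemble + (y - outputs) @ gain.T

# Hypothetical test problem: recover u from y = A u.
rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0], [1.0, 3.0]])
forward = lambda u: A @ u
y = forward(np.array([1.0, -0.5]))
noise_cov = 0.01 * np.eye(2)

ensemble = rng.standard_normal((20, 2))
misfit_start = np.linalg.norm(y - forward(ensemble.mean(axis=0)))
for _ in range(30):
    ensemble = eki_step(ensemble, forward, y, noise_cov)
misfit_end = np.linalg.norm(y - forward(ensemble.mean(axis=0)))
```

In this sketch every particle stays in the affine span of the initial ensemble and the ensemble spread shrinks at each step, which are the subspace restriction and variance collapse properties that the abstract says EnKSGD is designed to avoid.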
-
Bayesian Dynamical Modeling of Fixational Eye Movements
Authors:
Lisa Schwetlick,
Sebastian Reich,
Ralf Engbert
Abstract:
Humans constantly move their eyes, even during visual fixations, where miniature (or fixational) eye movements are produced involuntarily. Fixational eye movements are composed of slow components (physiological drift and tremor) and fast microsaccades. The complex dynamics of physiological drift can be modeled qualitatively as a statistically self-avoiding random walk (SAW model, see Engbert et al., 2011). In this study, we implement a data assimilation approach for the SAW model to explain quantitative differences in experimental data obtained from high-resolution, video-based eye tracking. We present a likelihood function for the SAW model which allows us to apply Bayesian parameter estimation at the level of individual human participants. Based on the model fits we find a relationship between the activation predicted by the SAW model and the occurrence of microsaccades. The latent model activation relative to microsaccade onsets and offsets using experimental data reveals evidence for a triggering mechanism for microsaccades. These findings suggest that the SAW model is capable of capturing individual differences and can serve as a tool for exploring the relationship between physiological drift and microsaccades as the two most important components of fixational eye movements. Our results contribute to the understanding of individual variability in microsaccade behaviors and the role of fixational eye movements in visual information processing.
Submitted 21 March, 2023;
originally announced March 2023.
-
Gradient Flows for Sampling: Mean-Field Models, Gaussian Approximations and Affine Invariance
Authors:
Yifan Chen,
Daniel Zhengyu Huang,
Jiaoyang Huang,
Sebastian Reich,
Andrew M. Stuart
Abstract:
Sampling a probability distribution with an unknown normalization constant is a fundamental problem in computational science and engineering. This task may be cast as an optimization problem over all probability measures, and an initial distribution can be evolved to the desired minimizer dynamically via gradient flows. Mean-field models, whose law is governed by the gradient flow in the space of probability measures, may also be identified; particle approximations of these mean-field models form the basis of algorithms. The gradient flow approach is also the basis of algorithms for variational inference, in which the optimization is performed over a parameterized family of probability distributions such as Gaussians, and the underlying gradient flow is restricted to the parameterized family.
By choosing different energy functionals and metrics for the gradient flow, different algorithms with different convergence properties arise. In this paper, we concentrate on the Kullback-Leibler divergence after showing that, up to scaling, it has the unique property that the gradient flows resulting from this choice of energy do not depend on the normalization constant. For the metrics, we focus on variants of the Fisher-Rao, Wasserstein, and Stein metrics; we introduce the affine invariance property for gradient flows, and their corresponding mean-field models, determine whether a given metric leads to affine invariance, and modify it to make it affine invariant if it does not. We study the resulting gradient flows in both probability density space and Gaussian space. The flow in the Gaussian space may be understood as a Gaussian approximation of the flow. We demonstrate that the Gaussian approximation based on the metric and through moment closure coincide, establish connections between them, and study their long-time convergence properties showing the advantages of affine invariance.
Submitted 2 November, 2023; v1 submitted 21 February, 2023;
originally announced February 2023.
-
Infinite-Dimensional Diffusion Models
Authors:
Jakiw Pidstrigach,
Youssef Marzouk,
Sebastian Reich,
Sven Wang
Abstract:
Diffusion models have had a profound impact on many application areas, including those where data are intrinsically infinite-dimensional, such as images or time series. The standard approach is first to discretize and then to apply diffusion models to the discretized data. While such approaches are practically appealing, the performance of the resulting algorithms typically deteriorates as discretization parameters are refined. In this paper, we instead directly formulate diffusion-based generative models in infinite dimensions and apply them to the generative modeling of functions. We prove that our formulations are well posed in the infinite-dimensional setting and provide dimension-independent distance bounds from the sample to the target measure. Using our theory, we also develop guidelines for the design of infinite-dimensional diffusion models. For image distributions, these guidelines are in line with the canonical choices currently made for diffusion models. For other distributions, however, we can improve upon these canonical choices, which we show both theoretically and empirically, by applying the algorithms to data distributions on manifolds and inspired by Bayesian inverse problems or simulation-based inference.
Submitted 3 October, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Nanomechanical absorption spectroscopy of 2D materials with femtowatt sensitivity
Authors:
Jan N. Kirchhof,
Yuefeng Yu,
Denis Yagodkin,
Nele Stetzuhn,
Daniel B. de Araújo,
Kostas Kanellopulos,
Samuel Manas-Valero,
Eugenio Coronado,
Herre van der Zant,
Stephanie Reich,
Silvan Schmid,
Kirill I. Bolotin
Abstract:
Nanomechanical spectroscopy (NMS) is a recently developed approach to determine optical absorption spectra of nanoscale materials via mechanical measurements. It is based on measuring changes in the resonance frequency of a membrane resonator vs. the photon energy of incoming light. This method is a direct measurement of absorption, which has practical advantages compared to common optical spectroscopy approaches. In the case of two-dimensional (2D) materials, NMS overcomes limitations inherent to conventional optical methods, such as the complications associated with measurements at high magnetic fields and low temperatures. In this work, we develop a protocol for NMS of 2D materials that yields two orders of magnitude improved sensitivity compared to previous approaches, while being simpler to use. To this end, we use electrical sample actuation, which simplifies the experiment and provides a reliable calibration for greater accuracy. Additionally, the use of low-stress silicon nitride membranes as our substrate reduces the noise-equivalent power to $\mathrm{NEP} = 890\,\mathrm{fW}/\sqrt{\mathrm{Hz}}$, comparable to commercial semiconductor photodetectors. We use our approach to spectroscopically characterize a two-dimensional transition metal dichalcogenide (WS$_2$), a layered magnetic semiconductor (CrPS$_4$), and a plasmonic supercrystal consisting of gold nanoparticles.
Submitted 28 January, 2023;
originally announced January 2023.
-
What do Vision Transformers Learn? A Visual Exploration
Authors:
Amin Ghiasi,
Hamid Kazemi,
Eitan Borgnia,
Steven Reich,
Manli Shu,
Micah Goldblum,
Andrew Gordon Wilson,
Tom Goldstein
Abstract:
Vision transformers (ViTs) are quickly becoming the de-facto architecture for computer vision, yet we understand very little about why they work and what they learn. While existing studies visually analyze the mechanisms of convolutional neural networks, an analogous exploration of ViTs remains challenging. In this paper, we first address the obstacles to performing visualizations on ViTs. Assisted by these solutions, we observe that neurons in ViTs trained with language model supervision (e.g., CLIP) are activated by semantic concepts rather than visual features. We also explore the underlying differences between ViTs and CNNs, and we find that transformers detect image background features, just like their convolutional counterparts, but their predictions depend far less on high-frequency information. On the other hand, both architecture types behave similarly in the way features progress from abstract patterns in early layers to concrete objects in late layers. In addition, we show that ViTs maintain spatial information in all layers except the final layer. In contrast to previous works, we show that the last layer most likely discards the spatial information and behaves as a learned global pooling operation. Finally, we conduct large-scale visualizations on a wide range of ViT variants, including DeiT, CoaT, ConViT, PiT, Swin, and Twin, to validate the effectiveness of our method.
Submitted 13 December, 2022;
originally announced December 2022.
-
Observation of multi-directional energy transfer in a hybrid plasmonic-excitonic nanostructure
Authors:
Tommaso Pincelli,
Thomas Vasileiadis,
Shuo Dong,
Samuel Beaulieu,
Maciej Dendzik,
Daniela Zahn,
Sang-Eun Lee,
Hélène Seiler,
Yinpeng Qi,
R. Patrick Xian,
Julian Maklar,
Emerson Coy,
Niclas S. Müller,
Yu Okamura,
Stephanie Reich,
Martin Wolf,
Laurenz Rettig,
Ralph Ernstorfer
Abstract:
Hybrid plasmonic devices involve a nanostructured metal supporting localized surface plasmons to amplify light-matter interaction, and a non-plasmonic material to functionalize charge excitations. Application-relevant epitaxial heterostructures, however, give rise to ballistic ultrafast dynamics that challenge the conventional semiclassical understanding of unidirectional nanometal-to-substrate energy transfer. We study epitaxial Au nanoislands on WSe$_2$ with time- and angle-resolved photoemission spectroscopy and femtosecond electron diffraction: this combination of techniques resolves material, energy and momentum of charge-carriers and phonons excited in the heterostructure. We observe a strong non-linear plasmon-exciton interaction that transfers the energy of sub-bandgap photons very efficiently to the semiconductor, leaving the metal cold until non-radiative exciton recombination heats the nanoparticles on hundreds of femtoseconds timescales. Our results resolve a multi-directional energy exchange on timescales shorter than the electronic thermalization of the nanometal. Electron-phonon coupling and diffusive charge-transfer determine the subsequent energy flow. This complex dynamics opens perspectives for optoelectronic and photocatalytic applications, while providing a constraining experimental testbed for state-of-the-art modelling.
Submitted 29 November, 2022; v1 submitted 8 November, 2022;
originally announced November 2022.
-
Segmentation of Multiple Sclerosis Lesions across Hospitals: Learn Continually or Train from Scratch?
Authors:
Enamundram Naga Karthik,
Anne Kerbrat,
Pierre Labauge,
Tobias Granberg,
Jason Talbott,
Daniel S. Reich,
Massimo Filippi,
Rohit Bakshi,
Virginie Callot,
Sarath Chandar,
Julien Cohen-Adad
Abstract:
Segmentation of Multiple Sclerosis (MS) lesions is a challenging problem. Several deep-learning-based methods have been proposed in recent years. However, most methods tend to be static, that is, a single model trained on a large, specialized dataset, which does not generalize well. Instead, the model should learn across datasets arriving sequentially from different hospitals by building upon the characteristics of lesions in a continual manner. In this regard, we explore experience replay, a well-known continual learning method, in the context of MS lesion segmentation across multi-contrast data from 8 different hospitals. Our experiments show that replay is able to achieve positive backward transfer and reduce catastrophic forgetting compared to sequential fine-tuning. Furthermore, replay outperforms the multi-domain training, thereby emerging as a promising solution for the segmentation of MS lesions. The code is available at this link: https://github.com/naga-karthik/continual-learning-ms
Submitted 26 October, 2022;
originally announced October 2022.
-
Ensemble Kalman Methods: A Mean Field Perspective
Authors:
Edoardo Calvello,
Sebastian Reich,
Andrew M. Stuart
Abstract:
This paper provides a unifying mean field based framework for the derivation and analysis of ensemble Kalman methods. Both state estimation and parameter estimation problems are considered, and formulations in both discrete and continuous time are employed. For state estimation problems both the control and filtering approaches are studied; analogously, for parameter estimation (inverse) problems the optimization and Bayesian perspectives are both studied. The approach taken unifies a wide-ranging literature in the field, provides a framework for analysis of ensemble Kalman methods, and suggests open problems.
Submitted 22 September, 2022;
originally announced September 2022.
-
Data assimilation: A dynamic homotopy-based coupling approach
Authors:
Sebastian Reich
Abstract:
Homotopy approaches to Bayesian inference have found widespread use especially if the Kullback-Leibler divergence between the prior and the posterior distribution is large. Here we extend one of these homotopy approaches to include an underlying stochastic diffusion process. The underlying mathematical problem is closely related to the Schrödinger bridge problem for given marginal distributions. We demonstrate that the proposed homotopy approach provides a computationally tractable approximation to the underlying bridge problem. In particular, our implementation builds upon the widely used ensemble Kalman filter methodology and extends it to Schrödinger bridge problems within the context of sequential data assimilation.
Submitted 3 November, 2022; v1 submitted 12 September, 2022;
originally announced September 2022.
-
Levitin-Polyak Well-posedness for Split Equilibrium Problems
Authors:
Soumitra Dey,
Aviv Gibali,
Simeon Reich
Abstract:
The notion of well-posedness has drawn the attention of many researchers in the field of nonlinear analysis, as it allows one to explore problems in which exact solutions are not known and/or are hard to compute. Roughly speaking, for a given problem, well-posedness guarantees the convergence of approximations to exact solutions via an iterative method. Thus, in this paper we extend the concept of Levitin-Polyak well-posedness to split equilibrium problems in real Banach spaces. In particular, we establish a metric characterization of Levitin-Polyak well-posedness by perturbations and also show an equivalence between Levitin-Polyak well-posedness by perturbations for split equilibrium problems and the existence and uniqueness of their solutions.
Submitted 3 May, 2023; v1 submitted 15 August, 2022;
originally announced August 2022.
-
Strong Convergence of Forward-Reflected-Backward Splitting Methods for Solving Monotone Inclusions with Applications to Image Restoration and Optimal Control
Authors:
Chinedu Izuchukwu,
Simeon Reich,
Yekini Shehu,
Adeolu Taiwo
Abstract:
In this paper, we propose and study several strongly convergent versions of the forward-reflected-backward splitting method of Malitsky and Tam for finding a zero of the sum of two monotone operators in a real Hilbert space. Our proposed methods only require one forward evaluation of the single-valued operator and one backward evaluation of the set-valued operator at each iteration; a feature that is absent in many other available strongly convergent splitting methods in the literature. We also develop inertial versions of our methods and strong convergence results are obtained for these methods when the set-valued operator is maximal monotone and the single-valued operator is Lipschitz continuous and monotone. Finally, we discuss some examples from image restoration and optimal control regarding the implementations of our methods in comparison with known related methods in the literature.
Submitted 14 August, 2022;
originally announced August 2022.
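For orientation, the basic forward-reflected-backward iteration of Malitsky and Tam that the abstract above builds on can be sketched as follows. This is the underlying weakly convergent scheme, not one of the strongly convergent or inertial variants proposed in the paper. The test problem is an assumption chosen for easy checking: the set-valued operator is the normal cone of the box $[0,1]^2$ (its resolvent is the projection onto the box), the single-valued operator is $B(x) = x - c$ with Lipschitz constant $L = 1$, and the step size satisfies the usual condition of being below $1/(2L)$.

```python
import numpy as np

def frb(resolvent, B, x0, step, n_iter):
    """Forward-reflected-backward iteration (basic scheme, illustrative):

        x_{k+1} = J(x_k - 2*step*B(x_k) + step*B(x_{k-1})),

    where J is the resolvent of the set-valued operator.
    """
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    b_prev = B(x_prev)
    for _ in range(n_iter):
        b = B(x)  # the only forward evaluation in this iteration
        x_next = resolvent(x - 2.0 * step * b + step * b_prev)  # one backward step
        x_prev, x, b_prev = x, x_next, b
    return x

# Hypothetical test problem: find x in [0, 1]^2 with 0 in N_box(x) + (x - c).
c = np.array([2.0, -1.0])
B = lambda x: x - c                           # monotone, Lipschitz with L = 1
resolvent = lambda z: np.clip(z, 0.0, 1.0)    # projection onto the box
x_star = frb(resolvent, B, x0=np.zeros(2), step=0.25, n_iter=50)
# x_star is the projection of c onto the box, i.e. [1, 0].
```

Reusing the stored value `b_prev` is what keeps the cost at one forward evaluation and one resolvent evaluation per iteration, the feature the abstract emphasizes.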
-
Generalized projections on general Banach spaces
Authors:
Akhtar A. Khan,
Jinlu Li,
Simeon Reich
Abstract:
In general Banach spaces, the metric projection map lacks the powerful properties it enjoys in Hilbert spaces. There are a few generalized projections that have been proposed in order to resolve many of the deficiencies of the metric projection. However, such notions are predominantly studied in Banach spaces with rich topological structures, such as uniformly convex Banach spaces. In this paper, we investigate two notions of generalized projection in general Banach spaces. Various examples are provided to demonstrate the proposed notions and the loss of structure in the generalized projections after migrating from specially structured Banach spaces to general Banach spaces. Connections between the generalized projection and the metric projection are thoroughly explored.
Submitted 19 July, 2022;
originally announced July 2022.
-
Convergence of Two Simple Methods for Solving Monotone Inclusion Problems in Reflexive Banach Spaces
Authors:
Chinedu Izuchukwu,
Simeon Reich,
Yekini Shehu
Abstract:
We propose two very simple methods, the first one with constant step sizes and the second one with self-adaptive step sizes, for finding a zero of the sum of two monotone operators in real reflexive Banach spaces. Our methods require only one evaluation of the single-valued operator at each iteration. Weak convergence results are obtained when the set-valued operator is maximal monotone and the single-valued operator is Lipschitz continuous, and strong convergence results are obtained when either one of these two operators is required, in addition, to be strongly monotone. We also obtain the rate of convergence of our proposed methods in real reflexive Banach spaces. Finally, we apply our results to solving generalized Nash equilibrium problems for gas markets.
Submitted 10 July, 2022; v1 submitted 16 June, 2022;
originally announced June 2022.
-
Polynomial Estimates for the Method of Cyclic Projections in Hilbert Spaces
Authors:
Simeon Reich,
Rafał Zalas
Abstract:
We study the method of cyclic projections when applied to closed and linear subspaces $M_i$, $i=1,\ldots,m$, of a real Hilbert space $\mathcal H$. We show that the average distance to individual sets enjoys a polynomial behaviour $o(k^{-1/2})$ along the trajectory of the generated iterates. Surprisingly, when the starting points are chosen from the subspace $\sum_{i=1}^{m}M_i^\perp$, our result yields a polynomial rate of convergence $\mathcal O(k^{-1/2})$ for the method of cyclic projections itself. Moreover, if $\sum_{i=1}^{m} M_i^\perp$ is not closed, then both of the aforementioned rates are best possible in the sense that the corresponding polynomial $k^{1/2}$ cannot be replaced by $k^{1/2+\varepsilon}$ for any $\varepsilon >0$.
Submitted 17 April, 2023; v1 submitted 27 May, 2022;
originally announced May 2022.
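The method of cyclic projections studied in the entry above can be illustrated with a small finite-dimensional sketch. The two planes in $\mathbb{R}^3$, the starting point, and the helper names `projector` and `cyclic_projections` are assumptions made for illustration. In finite dimensions the sum of the orthogonal complements is always closed, so the iteration converges linearly here; the polynomial rates in the abstract concern the infinite-dimensional case in which that sum is not closed.

```python
import numpy as np

def projector(basis):
    """Orthogonal projector onto the column span of `basis` (via reduced QR)."""
    q, _ = np.linalg.qr(np.asarray(basis, dtype=float))
    return q @ q.T

def cyclic_projections(projectors, x0, n_cycles):
    """Apply the projectors P_1, ..., P_m in order, repeated n_cycles times."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_cycles):
        for p in projectors:
            x = p @ x
    return x

# M_1 = span{e1, e2}, M_2 = span{e1, e2 + e3}; their intersection is span{e1}.
P1 = projector([[1, 0], [0, 1], [0, 0]])
P2 = projector([[1, 0], [0, 1], [0, 1]])
x = cyclic_projections([P1, P2], [1.0, 1.0, 1.0], n_cycles=60)
# The iterates approach the projection of x0 onto the intersection, i.e. [1, 0, 0].
```

For this pair of planes each full cycle halves the component orthogonal to the intersection, so after 60 cycles the iterate agrees with the limit to machine precision.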
-
Generic properties of nonexpansive mappings on unbounded domains
Authors:
Christian Bargetz,
Simeon Reich,
Daylen Thimm
Abstract:
We investigate typical properties of nonexpansive mappings on unbounded complete hyperbolic metric spaces. For two families of metrics of uniform convergence on bounded sets, we show that the typical nonexpansive mapping is a Rakotch contraction on every bounded subset and that there is a bounded set which is mapped into itself by this mapping. In particular, we obtain that the typical nonexpansive mapping in this setting has a unique fixed point which can be reached by iterating the mapping. Nevertheless, it turns out that the typical mapping is not a Rakotch contraction on the whole space and that it has the maximal possible Lipschitz constant of one on a residual subset of its domain. By typical we mean that the complement of the set of mappings with this property is $σ$-$φ$-porous, that is, small in a metric sense. For a metric of pointwise convergence, we show that the set of Rakotch contractions is meagre.
Submitted 8 February, 2023; v1 submitted 21 April, 2022;
originally announced April 2022.
-
A Neural Network for Solving Inverse Quasi-Variational Inequalities
Authors:
Soumitra Dey,
Simeon Reich
Abstract:
We study the existence and uniqueness of solutions to the inverse quasi-variational inequality problem. Motivated by the neural network approach to solving optimization problems such as variational inequality, monotone inclusion, and inverse variational problems, we consider a neural network associated with the inverse quasi-variational inequality problem, and establish the existence and uniqueness of a solution to the proposed network. We prove that every trajectory of the proposed neural network converges to the unique solution of the inverse quasi-variational inequality problem and that the network is globally asymptotically stable at its equilibrium point. We also prove that if the function which governs the inverse quasi-variational inequality problem is strongly monotone and Lipschitz continuous, then the network is globally exponentially stable at its equilibrium point. We discretize the network and show that the sequence generated by the discretization of the network converges strongly to a solution of the inverse quasi-variational inequality problem under certain assumptions on the parameters involved. Finally, we provide numerical examples to support and illustrate our theoretical results.
Submitted 12 April, 2022;
originally announced April 2022.
-
Efficient Derivative-free Bayesian Inference for Large-Scale Inverse Problems
Authors:
Daniel Zhengyu Huang,
Jiaoyang Huang,
Sebastian Reich,
Andrew M. Stuart
Abstract:
We consider Bayesian inference for large scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model. This renders most Markov chain Monte Carlo approaches infeasible, since they typically require $O(10^4)$ model runs, or more. Moreover, the forward model is often given as a black box or is impractical to differentiate. Therefore derivative-free algorithms are highly desirable. We propose a framework, which is built on Kalman methodology, to efficiently perform Bayesian inference in such inverse problems. The basic method is based on an approximation of the filtering distribution of a novel mean-field dynamical system into which the inverse problem is embedded as an observation operator. Theoretical properties of the mean-field model are established for linear inverse problems, demonstrating that the desired Bayesian posterior is given by the steady state of the law of the filtering distribution of the mean-field dynamical system, and proving exponential convergence to it. This suggests that, for nonlinear problems which are close to Gaussian, sequentially computing this law provides the basis for efficient iterative methods to approximate the Bayesian posterior. Ensemble methods are applied to obtain interacting particle system approximations of the filtering distribution of the mean-field model; and practical strategies to further reduce the computational and memory cost of the methodology are presented, including low-rank approximation and a bi-fidelity approach. The effectiveness of the framework is demonstrated in several numerical experiments, including proof-of-concept linear/nonlinear examples and two large-scale applications: learning of permeability parameters in subsurface flow; and learning subgrid-scale parameters in a global climate model from time-averaged statistics.
Submitted 11 August, 2022; v1 submitted 9 April, 2022;
originally announced April 2022.
-
Unrestricted Douglas-Rachford algorithms for solving convex feasibility problems in Hilbert space
Authors:
Kay Barshad,
Aviv Gibali,
Simeon Reich
Abstract:
In this work we focus on the convex feasibility problem (CFP) in Hilbert space. A specific method in this area that has gained a lot of interest in recent years is the Douglas-Rachford (DR) algorithm. This algorithm was originally introduced in 1956 for solving stationary and non-stationary heat equations. Then in 1979, Lions and Mercier adjusted and extended the algorithm with the aim of solving CFPs and even more general problems, such as finding zeros of the sum of two maximally monotone operators. Many developments which implement various concepts concerning this algorithm have occurred during the last decade. We introduce an unrestricted DR algorithm, which provides a general framework for such concepts. Using unrestricted products of a finite number of strongly nonexpansive operators, we apply this framework to provide new iterative methods, where, inter alia, such operators may be interlaced between the operators used in the scheme of our unrestricted DR algorithm.
Submitted 5 November, 2022; v1 submitted 1 April, 2022;
originally announced April 2022.
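As background for the abstract above, here is a minimal sketch of the classical two-set Douglas-Rachford iteration for a convex feasibility problem. It is not the unrestricted variant the authors propose. The sets (a line and a disc in the plane), the starting point, and all function names are illustrative assumptions.

```python
import math

# Illustrative sets (assumptions, not from the abstract):
# A = {(x, y) : y = 0}, B = closed disc of radius 1 centred at (0, 0.5).
CENTER = (0.0, 0.5)
RADIUS = 1.0

def project_line(p):
    """Nearest point of A (the x-axis) to p."""
    return (p[0], 0.0)

def project_ball(p):
    """Nearest point of B (the disc) to p."""
    dx, dy = p[0] - CENTER[0], p[1] - CENTER[1]
    dist = math.hypot(dx, dy)
    if dist <= RADIUS:
        return p
    scale = RADIUS / dist
    return (CENTER[0] + scale * dx, CENTER[1] + scale * dy)

def douglas_rachford(x, iterations=200):
    """Classical DR update x <- x + P_B(2 P_A x - x) - P_A x.

    Returns the 'shadow' point P_A x, which approximates a point of
    the intersection of A and B when that intersection is nonempty.
    """
    for _ in range(iterations):
        pa = project_line(x)
        reflected = (2 * pa[0] - x[0], 2 * pa[1] - x[1])
        pb = project_ball(reflected)
        x = (x[0] + pb[0] - pa[0], x[1] + pb[1] - pa[1])
    return project_line(x)

shadow = douglas_rachford((3.0, 2.0))
```

The returned `shadow` lies on the line by construction; checking that its distance to `CENTER` is at most `RADIUS` confirms that the iteration has found a feasible point for this toy problem.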
-
Light Control over Chirality Selective Functionalization of Substrate Supported Carbon Nanotubes
Authors:
Georgy Gordeev,
Thomas Rosenkranz,
Frank Hennrich,
Stephanie Reich,
Ralph Krupke
Abstract:
Diazonium reactions with carbon nanotubes form optical $sp^3$ defects that can be used in optical and electrical circuits. We investigate a direct on-device reaction supported by confined laser irradiation and present a technique where an arbitrary carbon nanotube can be preferentially functionalized within a device by matching the light frequency with its transition energy. An exemplary reaction was carried out between (9,7) nanotube and 4-bromobenzenediazonium tetrafluoroborate. The substrate supported nanotubes of multiple semiconducting chiralities were locally exposed to laser light while monitoring the reaction kinetics in-situ via Raman spectroscopy. The chiral selectivity of the reaction was confirmed by resonant Raman spectroscopy, reporting a 10 meV $E_{22}$ transition energy red-shift only of the targeted species. We further demonstrated this method on a single tube (9,7) electroluminescent device and show a 25 meV red-shifted emission of the ground state $E_{11}$ compared to the emission from the pristine tubes.
Submitted 18 May, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
A Deep Dive into Dataset Imbalance and Bias in Face Identification
Authors:
Valeriia Cherepanova,
Steven Reich,
Samuel Dooley,
Hossein Souri,
Micah Goldblum,
Tom Goldstein
Abstract:
As the deployment of automated face recognition (FR) systems proliferates, bias in these systems is not just an academic question, but a matter of public concern. Media portrayals often center imbalance as the main source of bias, i.e., that FR models perform worse on images of non-white people or women because these demographic groups are underrepresented in training data. Recent academic research paints a more nuanced picture of this relationship. However, previous studies of data imbalance in FR have focused exclusively on the face verification setting, while the face identification setting has been largely ignored, despite being deployed in sensitive applications such as law enforcement. This is an unfortunate omission, as 'imbalance' is a more complex matter in identification; imbalance may arise in not only the training data, but also the testing data, and furthermore may affect the proportion of identities belonging to each demographic group or the number of images belonging to each identity. In this work, we address this gap in the research by thoroughly exploring the effects of each kind of imbalance possible in face identification, and discuss other factors which may impact bias in this setting.
Submitted 15 March, 2022;
originally announced March 2022.
-
Nanomechanical spectroscopy of 2D materials
Authors:
Jan N. Kirchhof,
Yuefeng Yu,
Gabriel Antheaume,
Georgy Gordeev,
Denis Yagodkin,
Peter Elliott,
Daniel B. de Araújo,
Sangeeta Sharma,
Stephanie Reich,
Kirill I. Bolotin
Abstract:
We introduce a nanomechanical platform for fast and sensitive measurements of the spectrally-resolved optical dielectric function of 2D materials. At the heart of our approach is a suspended 2D material integrated into a nanomechanical resonator illuminated by a wavelength-tunable laser source. From the heating-related frequency shift of the resonator as well as its optical reflection measured as a function of photon energy, we obtain the real and imaginary parts of the dielectric function. Our measurements are unaffected by substrate-related screening and do not require any assumptions on the underlying optical constants. This fast ($τ_{rise}$ $\sim$ 135 ns), sensitive (noise-equivalent power = 90 $\frac{pW}{\sqrt{Hz}}$), and broadband (1.2 $-$ 3.1 eV, extendable to UV-THz) method provides an attractive alternative to spectroscopic or ellipsometric characterisation techniques.
Submitted 26 September, 2022; v1 submitted 14 March, 2022;
originally announced March 2022.
-
Slow relaxation and aging in the model of randomly connected cycles network
Authors:
S. Reich,
S. Maoz,
Y. Kaplan,
H. Rappeport,
N. Q. Balaban,
O. Agam
Abstract:
We propose a statistical model of a large random network with high connectivity in order to describe the behavior of E. coli cells after exposure to acute stress. The building blocks of this network are feedback cycles typical of the genetic and metabolic networks of a cell. Each node on the cycles is a spin degree of freedom representing a component in the cell's network that can be in one of two states: active or inactive. The cycles are interconnected by regulation or by the exchange of metabolites. Stress is realized by an external magnetic field that drives the nodes into an inactive state, and the time at which the magnetization first passes through zero represents the first division event of the cell after the stress period. The numerical and analytical solutions for this first passage problem reproduce the aging dynamics observed in the experimental data.
Submitted 21 February, 2022;
originally announced February 2022.
-
Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
Authors:
Amin Ghiasi,
Hamid Kazemi,
Steven Reich,
Chen Zhu,
Micah Goldblum,
Tom Goldstein
Abstract:
Existing techniques for model inversion typically rely on hard-to-tune regularizers, such as total variation or feature regularization, which must be individually calibrated for each network in order to produce adequate images. In this work, we introduce Plug-In Inversion, which relies on a simple set of augmentations and does not require excessive hyper-parameter tuning. Under our proposed augmentation-based scheme, the same set of augmentation hyper-parameters can be used for inverting a wide range of image classification models, regardless of input dimensions or the architecture. We illustrate the practicality of our approach by inverting Vision Transformers (ViTs) and Multi-Layer Perceptrons (MLPs) trained on the ImageNet dataset, tasks which to the best of our knowledge have not been successfully accomplished by any previous works.
Submitted 30 January, 2022;
originally announced January 2022.
-
Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models
Authors:
Liam Fowl,
Jonas Geiping,
Steven Reich,
Yuxin Wen,
Wojtek Czaja,
Micah Goldblum,
Tom Goldstein
Abstract:
A central tenet of federated learning (FL), which trains models without centralizing user data, is privacy. However, previous work has shown that the gradient updates used in FL can leak user information. While most industrial uses of FL are for text applications (e.g. keystroke prediction), nearly all attacks on FL privacy have focused on simple image classifiers. We propose a novel attack that reveals private user text by deploying malicious parameter vectors, and which succeeds even with mini-batches, multiple users, and long sequences. Unlike previous attacks on FL, the attack exploits characteristics of both the Transformer architecture and the token embedding, separately extracting tokens and positional embeddings to retrieve high-fidelity text. This work suggests that FL on text, which has historically been resistant to privacy attacks, is far more vulnerable than previously thought.
Submitted 31 May, 2023; v1 submitted 29 January, 2022;
originally announced January 2022.
-
Cortical lesions, central vein sign, and paramagnetic rim lesions in multiple sclerosis: emerging machine learning techniques and future avenues
Authors:
Francesco La Rosa,
Maxence Wynen,
Omar Al-Louzi,
Erin S Beck,
Till Huelnhagen,
Pietro Maggi,
Jean-Philippe Thiran,
Tobias Kober,
Russell T Shinohara,
Pascal Sati,
Daniel S Reich,
Cristina Granziera,
Martina Absinta,
Meritxell Bach Cuadra
Abstract:
The current multiple sclerosis (MS) diagnostic criteria lack specificity, and this may lead to misdiagnosis, which remains an issue in present-day clinical practice. In addition, conventional biomarkers only moderately correlate with MS disease progression. Recently, advanced MS lesional imaging biomarkers such as cortical lesions (CL), the central vein sign (CVS), and paramagnetic rim lesions (PRL), visible in specialized magnetic resonance imaging (MRI) sequences, have shown higher specificity in differential diagnosis. Moreover, studies have shown that CL and PRL are potential prognostic biomarkers, the former correlating with cognitive impairments and the latter with early disability progression. As machine learning-based methods have achieved extraordinary performance in the assessment of conventional imaging biomarkers, such as white matter lesion segmentation, several automated or semi-automated methods have been proposed for CL, CVS, and PRL as well. In the present review, we first introduce these advanced MS imaging biomarkers and their imaging methods. Subsequently, we describe the corresponding machine learning-based methods that were used to tackle these clinical questions, putting them into context with respect to the challenges they are still facing, including non-standardized MRI protocols, limited datasets, and moderate inter-rater variability. We conclude by presenting the current limitations that prevent their broader deployment and suggesting future research directions.
Submitted 19 January, 2022;
originally announced January 2022.
-
Robust parameter estimation using the ensemble Kalman filter
Authors:
Sebastian Reich
Abstract:
Standard maximum likelihood or Bayesian approaches to parameter estimation for stochastic differential equations are not robust to perturbations in the continuous-in-time data. In this paper, we give a rather elementary explanation of this observation in the context of continuous-time parameter estimation using an ensemble Kalman filter. We employ the frequentist perspective to shed new light on three robust estimation techniques; namely subsampling the data, rough path corrections, and data filtering. We illustrate our findings through a simple numerical experiment.
Submitted 19 December, 2023; v1 submitted 3 January, 2022;
originally announced January 2022.
-
Bulk morphology of porous materials at sub-micrometer scale studied by multi-modal X-ray imaging with Hartmann masks
Authors:
M. Zakharova,
A. Mikhaylov,
S. Reich,
A. Plech,
D. Kunka
Abstract:
We present the quantitative investigation of the submicron structure in the bulk of porous graphite by using the scattering signal in the multi-modal X-ray imaging with Hartmann masks. By scanning the correlation length and measuring the mask visibility reduction, we obtain average pore size, relative pore fraction, fractal dimension, and Hurst exponent of the structure. Profiting from the dimensionality of the mask, we apply the method to study pore size anisotropy. The measurements were performed in a simple and flexible imaging setup with relaxed requirements on beam coherence.
Submitted 14 October, 2021;
originally announced October 2021.
-
Synthesis of Multifunctional Charge Transfer Agents: Towards Single Walled Carbon Nanotubes with Defined Covalent Functionality and Preserved π System
Authors:
Alphonse Fiebor,
Antonio Setaro,
Andreas J. Achazi,
Georgy Gordeev,
Manuela Weber,
Daniel Franz,
Beate Paulus,
Mohsen Adeli,
Stephanie Reich
Abstract:
The attachment of well-defined charge transfer agents to the surface of nanomaterials is an efficient strategy to control their charge density and also to tune their optical, electrical, and physicochemical properties. Particularly interesting are charge transfer agents that either donate or withdraw electrons depending on the arrangements of their building units and that promise a non-destructive attachment to delicate nanomaterials like sp$^2$ compounds. In this work, we rationally synthesize molecular moieties with versatile functionalities. A reactive anchor group allows them to be attached to carbon nanotubes as defined charge transfer agents while preserving the tube $π$-conjugation. The charge transfer agents were synthesized through the stepwise nucleophilic substitution of either one (monosubstituted series) or two (disubstituted series) chlorine atoms of cyanuric chloride by aniline derivatives containing one, two or three methoxy groups in the para and meta positions. Variation in the number and position of the methoxy groups, which act as electron-transferring groups, helps us to manipulate the electronic and optical properties of the molecular probes and their charge transfer to the SWNTs systematically. The correlation between the optical properties of these molecular probes and their functionality was investigated by experiments and quantum chemical calculations. While the optoelectronic properties of the conjugated charge transfer agents were dominated by the aniline segments, the triazine ensures the ability to attach nondestructively to the surface of SWNTs. This study is a step towards the production of SWNTs with desired optical and electrical properties by covalent $π$-preserving functionalization.
Submitted 1 September, 2021;
originally announced September 2021.
-
Combining machine learning and data assimilation to forecast dynamical systems from noisy partial observations
Authors:
Georg A. Gottwald,
Sebastian Reich
Abstract:
We present a supervised learning method to learn the propagator map of a dynamical system from partial and noisy observations. In our computationally cheap and easy-to-implement framework a neural network consisting of random feature maps is trained sequentially by incoming observations within a data assimilation procedure. By employing Takens' embedding theorem, the network is trained on delay coordinates. We show that the combination of random feature maps and data assimilation, called RAFDA, outperforms standard random feature maps for which the dynamics is learned using batch data.
Submitted 2 September, 2021; v1 submitted 7 August, 2021;
originally announced August 2021.
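The abstract above mentions training on delay coordinates. As a rough illustration of that one ingredient only (not of RAFDA itself), the sketch below builds delay vectors from a scalar time series. The function name `delay_embed` and the parameters `dim` and `lag` are hypothetical choices for this note.

```python
def delay_embed(series, dim, lag=1):
    """Build delay-coordinate vectors from a scalar time series.

    Each vector is (s[i], s[i - lag], ..., s[i - (dim - 1) * lag]),
    for every index i at which all of those entries exist.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    start = (dim - 1) * lag
    return [
        tuple(series[i - k * lag] for k in range(dim))
        for i in range(start, len(series))
    ]

# Usage: three-dimensional delay vectors with unit lag.
vectors = delay_embed([0, 1, 2, 3, 4], dim=3, lag=1)
```

A series of length n yields n - (dim - 1) * lag vectors, so the embedding dimension and lag need to be small relative to the amount of available data.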