-
Towards Compositionality in Concept Learning
Authors:
Adam Stein,
Aaditya Naik,
Yinjun Wu,
Mayur Naik,
Eric Wong
Abstract:
Concept-based interpretability methods offer a lens into the internals of foundation models by decomposing their embeddings into high-level concepts. These concept representations are most useful when they are compositional, meaning that the individual concepts compose to explain the full sample. We show that existing unsupervised concept extraction methods find concepts which are not compositional. To automatically discover compositional concept representations, we identify two salient properties of such representations, and propose Compositional Concept Extraction (CCE) for finding concepts which obey these properties. We evaluate CCE on five different datasets over image and text data. Our evaluation shows that CCE finds more compositional concept representations than baselines and yields better accuracy on four downstream classification tasks. Code and data are available at https://github.com/adaminsky/compositional_concepts .
Submitted 26 June, 2024;
originally announced June 2024.
-
Transformers Can Do Arithmetic with the Right Embeddings
Authors:
Sean McLeish,
Arpit Bansal,
Alex Stein,
Neel Jain,
John Kirchenbauer,
Brian R. Bartoldson,
Bhavya Kailkhura,
Abhinav Bhatele,
Jonas Geiping,
Avi Schwarzschild,
Tom Goldstein
Abstract:
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix enables architectural modifications such as input injection and recurrent layers to improve performance even further.
With positions resolved, we can study the logical extrapolation ability of transformers. Can they solve arithmetic problems that are larger and more complex than those in their training data? We find that by training on only 20-digit numbers with a single GPU for one day, we can reach state-of-the-art performance, achieving up to 99% accuracy on 100-digit addition problems. Finally, we show that these gains in numeracy also unlock improvements on other multi-step reasoning tasks including sorting and multiplication.
Submitted 27 May, 2024;
originally announced May 2024.
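Illustration for the entry above: the abstract describes giving each digit an embedding that encodes its position relative to the start of its number. The sketch below only computes such per-digit position indices for a character-level token sequence. The function name `digit_positions`, the 1-based numbering, and the use of 0 for non-digit tokens are assumptions made for illustration, not details taken from the paper. In a model, each index would presumably select a row of a learned embedding table.

```python
def digit_positions(tokens):
    """Return, for each token, its 1-based offset within the current
    run of digit tokens, or 0 if the token is not a digit.

    Hypothetical helper: the indexing convention is an assumption.
    """
    positions = []
    run = 0  # length of the digit run seen so far
    for tok in tokens:
        # Extend the run on a digit, reset it on anything else.
        run = run + 1 if tok.isdigit() else 0
        positions.append(run)
    return positions


tokens = list("123+45=")
print(digit_positions(tokens))  # [1, 2, 3, 0, 1, 2, 0]
```

Because the index restarts at every number, a digit's index depends only on its place within its own number, not on where the number sits in the sequence.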
-
Scalable Circuit Cutting and Scheduling in a Resource-constrained and Distributed Quantum System
Authors:
Shuwen Kan,
Zefan Du,
Miguel Palma,
Samuel A Stein,
Chenxu Liu,
Wenqi Wei,
Juntao Chen,
Ang Li,
Ying Mao
Abstract:
Despite quantum computing's rapid development, current systems remain limited in practical applications due to their limited qubit count and quality. Various technologies, such as superconducting, trapped-ion, and neutral-atom quantum computing, are progressing towards a fault-tolerant era; however, they all face a diverse set of challenges in scalability and control. Recent efforts have focused on multi-node quantum systems that connect multiple smaller quantum devices to execute larger circuits. Future demonstrations hope to use quantum channels to couple systems; however, current demonstrations can leverage classical communication with circuit cutting techniques. This involves cutting large circuits into smaller subcircuits and reconstructing them post-execution. However, existing cutting methods are hindered by lengthy search times as the number of qubits and gates increases. Additionally, they often fail to effectively utilize the resources of various worker configurations in a multi-node system. To address these challenges, we introduce FitCut, a novel approach that transforms quantum circuits into weighted graphs and utilizes a community-based, bottom-up approach to cut circuits according to resource constraints, e.g., qubit counts, on each worker. FitCut also includes a scheduling algorithm that optimizes resource utilization across workers. Implemented with Qiskit and evaluated extensively, FitCut significantly outperforms the Qiskit Circuit Knitting Toolbox, reducing time costs by factors ranging from 3 to 2000 and improving resource utilization rates by up to 3.88 times on the worker side, achieving a system-wide improvement of 2.86 times.
Submitted 7 May, 2024;
originally announced May 2024.
-
Benchmarking Optimizers for Qumode State Preparation with Variational Quantum Algorithms
Authors:
Shuwen Kan,
Miguel Palma,
Zefan Du,
Samuel A Stein,
Chenxu Liu,
Juntao Chen,
Ang Li,
Ying Mao
Abstract:
Quantum state preparation involves preparing a target state from an initial system, a process integral to applications such as quantum machine learning and solving systems of linear equations. Recently, there has been a growing interest in qumodes due to advancements in the field and their potential applications. However, there is a notable gap in the literature specifically addressing this area. This paper aims to bridge this gap by providing performance benchmarks of various optimizers used in state preparation with Variational Quantum Algorithms. We conducted extensive testing across multiple scenarios, including different target states, both ideal and sampling simulations, and varying numbers of basis gate layers. Our evaluations offer insights into the complexity of learning each type of target state and demonstrate that some optimizers perform better than others in this context. Notably, the Powell optimizer was found to be exceptionally robust against sampling errors, making it a preferred choice in scenarios prone to such inaccuracies. Additionally, the Simultaneous Perturbation Stochastic Approximation optimizer was distinguished for its efficiency and ability to handle increased parameter dimensionality effectively.
Submitted 7 May, 2024;
originally announced May 2024.
-
Single-Shot Readout and Weak Measurement of a Tin-Vacancy Qubit in Diamond
Authors:
Eric I. Rosenthal,
Souvik Biswas,
Giovanni Scuri,
Hope Lee,
Abigail J. Stein,
Hannah C. Kleidermacher,
Jakob Grzesik,
Alison E. Rugar,
Shahriar Aghaeimeibodi,
Daniel Riedel,
Michael Titze,
Edward S. Bielejec,
Joonhee Choi,
Christopher P. Anderson,
Jelena Vuckovic
Abstract:
The negatively charged tin-vacancy center in diamond (SnV$^-$) is an emerging platform for building the next generation of long-distance quantum networks. This is due to the SnV$^-$'s favorable optical and spin properties including bright emission, insensitivity to electronic noise, and long spin coherence times at temperatures above 1 Kelvin. Here, we demonstrate measurement of a single SnV$^-$ electronic spin with a single-shot readout fidelity of $87.4\%$, which can be further improved to $98.5\%$ by conditioning on multiple readouts. We show this performance is compatible with rapid microwave spin control, demonstrating that the trade-off between optical readout and spin control inherent to group-IV centers in diamond can be overcome for the SnV$^-$. Finally, we use weak quantum measurement to study measurement induced dephasing; this illuminates the fundamental interplay between measurement and decoherence in quantum mechanics, and makes use of the qubit's spin coherence as a metrological tool. Taken together, these results overcome an important hurdle in the development of the SnV$^-$ based quantum technologies, and in the process, develop techniques and understanding broadly applicable to the study of solid-state quantum emitters.
Submitted 19 March, 2024;
originally announced March 2024.
-
AQM: A Refresh of the Abstract Qubit Model for Quantum Computing Co-design
Authors:
Chenxu Liu,
Samuel A. Stein,
Muqing Zheng,
James Ang,
Ang Li
Abstract:
Qubits are the fundamental building blocks of quantum information science and applications, whose concept is widely utilized in both quantum physics and quantum computation. While the significance of qubits and their implementation in physical devices have been extensively examined, now is the right time to revisit this understanding. In this paper, we introduce an abstract qubit model (AQM), offering a mathematical framework for higher-level algorithms and applications, and setting forth criteria for lower-level physical devices to enable quantum computation. We first provide a comprehensive definition of "qubits", regarded as the foundational principle for quantum computing algorithms (bottom-up support), and examine their requisites for devices (top-down demand). We then investigate the feasibility of relaxing specific requirements, thereby broadening device support while considering techniques that trade off extra costs to counterbalance this relaxation. Lastly, we delve into the quantum applications that only require partial support of "qubits", and discuss physical systems that offer limited support for the AQM but remain valuable in quantum applications. AQM may serve as an intermediate interface between quantum algorithms and devices, facilitating quantum algorithm-device co-design.
Submitted 18 April, 2024; v1 submitted 17 March, 2024;
originally announced March 2024.
-
Public Goods Games in Disease Evolution and Spread
Authors:
Christo Morison,
Małgorzata Fic,
Thomas Marcou,
Javad Mohamadichamgavi,
Javier Redondo Antón,
Golsa Sayyar,
Alexander Stein,
Frank Bastian,
Hana Krakovská,
Nandakishor Krishnan,
Diogo L. Pires,
Mohammadreza Satouri,
Frederik J. Thomsen,
Kausutua Tjikundi,
Wajid Ali
Abstract:
Cooperation arises in nature at every scale, from within cells to entire ecosystems. In the framework of evolutionary game theory, public goods games (PGGs) are used to analyse scenarios where individuals can cooperate or defect, and can predict when and how these behaviours emerge. However, too few examples motivate the transferal of knowledge from one application of PGGs to another. Here, we focus on PGGs arising in disease modelling of cancer evolution and the spread of infectious diseases. We use these two systems as case studies for the development of the theory and applications of PGGs, which we succinctly review and compare. We also posit that applications of evolutionary game theory to decision-making in cancer, such as interactions between a clinician and a tumour, can learn from the PGGs studied in epidemiology, where cooperative behaviours such as quarantine and vaccination compliance have been more thoroughly investigated. Furthermore, instances of cellular-level cooperation observed in cancers point to a corresponding area of potential interest for modellers of other diseases, be they viral, bacterial or otherwise. We aim to demonstrate the breadth of applicability of PGGs in disease modelling while providing a starting point for those interested in quantifying cooperation arising in healthcare.
Submitted 27 February, 2024;
originally announced February 2024.
-
Coercing LLMs to do and reveal (almost) anything
Authors:
Jonas Geiping,
Alex Stein,
Manli Shu,
Khalid Saifullah,
Yuxin Wen,
Tom Goldstein
Abstract:
It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements. In this work, we argue that the spectrum of adversarial attacks on LLMs is much larger than merely jailbreaking. We provide a broad overview of possible attack surfaces and attack goals. Based on a series of concrete examples, we discuss, categorize and systematize attacks that coerce varied unintended behaviors, such as misdirection, model control, denial-of-service, or data extraction.
We analyze these attacks in controlled experiments, and find that many of them stem from the practice of pre-training LLMs with coding capabilities, as well as the continued existence of strange "glitch" tokens in common LLM vocabularies that should be removed for security reasons.
Submitted 21 February, 2024;
originally announced February 2024.
-
Towards Global Glacier Mapping with Deep Learning and Open Earth Observation Data
Authors:
Konstantin A. Maslov,
Claudio Persello,
Thomas Schellenberger,
Alfred Stein
Abstract:
Accurate global glacier mapping is critical for understanding climate change impacts. Despite its importance, automated glacier mapping at a global scale remains largely unexplored. Here we address this gap and propose Glacier-VisionTransformer-U-Net (GlaViTU), a convolutional-transformer deep learning model, and five strategies for multitemporal global-scale glacier mapping using open satellite imagery. Assessing the spatial, temporal and cross-sensor generalisation shows that our best strategy achieves intersection over union >0.85 on previously unobserved images in most cases, which drops to >0.75 for debris-rich areas such as High-Mountain Asia and increases to >0.90 for regions dominated by clean ice. A comparative validation against human expert uncertainties in terms of area and distance deviations underscores GlaViTU performance, approaching or matching expert-level delineation. Adding synthetic aperture radar data, namely, backscatter and interferometric coherence, increases the accuracy in all regions where available. The calibrated confidence for glacier extents is reported making the predictions more reliable and interpretable. We also release a benchmark dataset that covers 9% of glaciers worldwide. Our results support efforts towards automated multitemporal and global glacier mapping.
Submitted 29 May, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
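Illustration for the entry above: the abstract reports results as intersection over union (IoU), the number of pixels where the predicted and reference masks are both positive divided by the number where either is positive. The sketch below computes that ratio for two small binary masks. The function name `intersection_over_union`, the nested-list mask format, and the choice to return 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
def intersection_over_union(pred, truth):
    """IoU of two binary masks given as equally sized nested lists of 0/1.

    Hypothetical helper: returning 1.0 for two empty masks is an
    assumed convention, not something stated in the abstract.
    """
    inter = 0  # pixels positive in both masks
    union = 0  # pixels positive in at least one mask
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 1.0


pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(intersection_over_union(pred, truth))  # 0.5
```

In this example two pixels are positive in both masks and four are positive in at least one, so the score is 2/4.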
-
Application-Oriented Co-Design of Motors and Motions for a 6DOF Robot Manipulator
Authors:
Adrian Stein,
Yebin Wang,
Yusuke Sakamoto,
Bingnan Wang,
Huazhen Fang
Abstract:
This work investigates an application-driven co-design problem where the motion and motors of a six degrees of freedom robotic manipulator are optimized simultaneously, and the application is characterized by a set of tasks. Unlike the state-of-the-art which selects motors from a product catalogue and performs co-design for a single task, this work designs the motor geometry as well as motion for a specific application. Contributions are made towards solving the proposed co-design problem in a computationally-efficient manner. First, a two-step process is proposed, where multiple motor designs are identified by optimizing motions and motors for multiple tasks one by one, and then are reconciled to determine the final motor design. Second, magnetic equivalent circuit modeling is exploited to establish the analytic mapping from motor design parameters to dynamic models and objective functions to facilitate the subsequent differentiable simulation. Third, a direct-collocation-based differentiable simulator of motor and robotic arm dynamics is developed to balance the computational complexity and numerical stability. Simulation verifies that higher performance for a specific application can be achieved with the multi-task method, compared to several benchmark co-design methods.
Submitted 4 October, 2023;
originally announced October 2023.
-
Quantum Memory: A Missing Piece in Quantum Computing Units
Authors:
Chenxu Liu,
Meng Wang,
Samuel A. Stein,
Yufei Ding,
Ang Li
Abstract:
Memory is an indispensable component in classical computing systems. While the development of quantum computing is still in its early stages, current quantum processing units mainly function as quantum registers. Consequently, the actual role of quantum memory in future advanced quantum computing architectures remains unclear. With the rapid scaling of qubits, it is opportune to explore the potential and feasibility of quantum memory across different substrate device technologies and application scenarios. In this paper, we provide a full design stack view of quantum memory. We start from the elementary component of a quantum memory device, quantum memory cells. We provide an abstraction to a quantum memory cell and define metrics to measure the performance of physical platforms. Combined with addressing functionality, we then review two types of quantum memory devices: random access quantum memory (RAQM) and quantum random access memory (QRAM). Building on top of these devices, quantum memory units in the computing architecture, including building a quantum memory unit, quantum cache, quantum buffer, and using QRAM for the quantum input-output module, are discussed. We further propose the programming model for the quantum memory units and discuss their possible applications. By presenting this work, we aim to attract more researchers from both the Quantum Information Science (QIS) and classical memory communities to enter this emerging and exciting area.
Submitted 2 November, 2023; v1 submitted 25 September, 2023;
originally announced September 2023.
-
TorchQL: A Programming Framework for Integrity Constraints in Machine Learning
Authors:
Aaditya Naik,
Adam Stein,
Yinjun Wu,
Mayur Naik,
Eric Wong
Abstract:
Finding errors in machine learning applications requires a thorough exploration of their behavior over data. Existing approaches used by practitioners are often ad-hoc and lack the abstractions needed to scale this process. We present TorchQL, a programming framework to evaluate and improve the correctness of machine learning applications. TorchQL allows users to write queries to specify and check integrity constraints over machine learning models and datasets. It seamlessly integrates relational algebra with functional programming to allow for highly expressive queries using only eight intuitive operators. We evaluate TorchQL on diverse use-cases including finding critical temporal inconsistencies in objects detected across video frames in autonomous driving, finding data imputation errors in time-series medical records, finding data labeling errors in real-world images, and evaluating biases and constraining outputs of language models. Our experiments show that TorchQL enables up to 13x faster query executions than baselines like Pandas and MongoDB, and up to 40% shorter queries than native Python. We also conduct a user study and find that TorchQL is natural enough for developers familiar with Python to specify complex integrity constraints.
Submitted 14 February, 2024; v1 submitted 13 August, 2023;
originally announced August 2023.
-
An Antithetic Multilevel Monte Carlo-Milstein Scheme for Stochastic Partial Differential Equations
Authors:
Abdul-Lateef Haji-Ali,
Andreas Stein
Abstract:
We present a novel multilevel Monte Carlo approach for estimating quantities of interest for stochastic partial differential equations (SPDEs). Drawing inspiration from [Giles and Szpruch: Antithetic multilevel Monte Carlo estimation for multi-dimensional SDEs without Lévy area simulation, Annals of Appl. Prob., 2014], we extend the antithetic Milstein scheme for finite-dimensional stochastic differential equations to Hilbert space-valued SPDEs. Our method has the advantages of both Euler and Milstein discretizations, as it is easy to implement and does not involve intractable Lévy area terms. Moreover, the antithetic correction in our method leads to the same variance decay in a MLMC algorithm as the standard Milstein method, resulting in significantly lower computational complexity than a corresponding MLMC Euler scheme. Our approach is applicable to a broader range of non-linear diffusion coefficients and does not require any commutative properties. The key component of our MLMC algorithm is a truncated Milstein-type time stepping scheme for SPDEs, which accelerates the rate of variance decay in the MLMC method when combined with an antithetic coupling on the fine scales. We combine the truncated Milstein scheme with appropriate spatial discretizations and noise approximations on all scales to obtain a fully discrete scheme and show that the antithetic coupling does not introduce an additional bias.
Submitted 26 July, 2023;
originally announced July 2023.
-
Deep Operator Network Approximation Rates for Lipschitz Operators
Authors:
Christoph Schwab,
Andreas Stein,
Jakob Zech
Abstract:
We establish universality and expression rate bounds for a class of neural Deep Operator Networks (DON) emulating Lipschitz (or Hölder) continuous maps $\mathcal G:\mathcal X\to\mathcal Y$ between (subsets of) separable Hilbert spaces $\mathcal X$, $\mathcal Y$. The DON architecture considered uses linear encoders $\mathcal E$ and decoders $\mathcal D$ via (biorthogonal) Riesz bases of $\mathcal X$, $\mathcal Y$, and an approximator network of an infinite-dimensional, parametric coordinate map that is Lipschitz continuous on the sequence space $\ell^2(\mathbb N)$. Unlike previous works ([Herrmann, Schwab and Zech: Neural and Spectral operator surrogates: construction and expression rate bounds, SAM Report, 2022], [Marcati and Schwab: Exponential Convergence of Deep Operator Networks for Elliptic Partial Differential Equations, SAM Report, 2022]), which required for example $\mathcal G$ to be holomorphic, the present expression rate results require mere Lipschitz (or Hölder) continuity of $\mathcal G$. Key in the proof of the present expression rate bounds is the use of either super-expressive activations (e.g. [Yarotski: Elementary superexpressive activations, Int. Conf. on ML, 2021], [Shen, Yang and Zhang: Neural network approximation: Three hidden layers are enough, Neural Networks, 2021], and the references there) which are inspired by the Kolmogorov superposition theorem, or of nonstandard NN architectures with standard (ReLU) activations as recently proposed in [Zhang, Shen and Yang: Neural Network Architecture Beyond Width and Depth, Adv. in Neural Inf. Proc. Sys., 2022]. We illustrate the abstract results by approximation rate bounds for emulation of a) solution operators for parametric elliptic variational inequalities, and b) Lipschitz maps of Hilbert-Schmidt operators.
Submitted 19 July, 2023;
originally announced July 2023.
-
A modern framework for jet tagger development
Authors:
Annika Stein
Abstract:
This paper presents a new tool to perform various steps in jet tagger development in an efficient and comprehensive way. A common data structure is used for training, as well as for performance evaluation in data. The introduction of this new framework reduces the amount of data to be stored while accomplishing the same tasks, and shortens waiting times between algorithm development and data-to-simulation results becoming available from months to days, taking typical CMS experiment pipelines as a reference. Proper utilization of high-throughput systems enables first data-to-simulation studies with a recent neural network architecture, Particle Transformer, adapted to jet flavour tagging. Unlike official implementations of the collaboration, the new framework allows investigating different variants, like different training paradigms, and their impact on data/simulation agreement, without producing any new large files on disk, and within the same run of the analysis framework. Besides being more time- and storage-efficient and thus enabling the first results of that kind to be available just a few hours after finishing neural network training, the framework is currently the only realization capable of studying how adversarial techniques affect data/simulation agreement for tagger algorithm outputs as well as inputs.
Submitted 30 June, 2023;
originally announced June 2023.
-
Microwave Spin Control of a Tin-Vacancy Qubit in Diamond
Authors:
Eric I. Rosenthal,
Christopher P. Anderson,
Hannah C. Kleidermacher,
Abigail J. Stein,
Hope Lee,
Jakob Grzesik,
Giovanni Scuri,
Alison E. Rugar,
Daniel Riedel,
Shahriar Aghaeimeibodi,
Geun Ho Ahn,
Kasper Van Gasse,
Jelena Vuckovic
Abstract:
The negatively charged tin-vacancy (SnV-) center in diamond is a promising solid-state qubit for applications in quantum networking due to its high quantum efficiency, strong zero phonon emission, and reduced sensitivity to electrical noise. The SnV- has a large spin-orbit coupling, which allows for long spin lifetimes at elevated temperatures, but unfortunately suppresses the magnetic dipole transitions desired for quantum control. Here, by use of a naturally strained center, we overcome this limitation and achieve high-fidelity microwave spin control. We demonstrate a pi-pulse fidelity of up to 99.51+/-0.03% and a Hahn-echo coherence time of T2echo = 170.0+/-2.8 microseconds, both the highest yet reported for the SnV- platform. This performance comes without compromise to optical stability, and is demonstrated at 1.7 Kelvin where ample cooling power is available to mitigate drive induced heating. These results pave the way for SnV- spins to be used as a building block for future quantum technologies.
Submitted 30 August, 2023; v1 submitted 22 June, 2023;
originally announced June 2023.
-
TopEx: Topic-based Explanations for Model Comparison
Authors:
Shreya Havaldar,
Adam Stein,
Eric Wong,
Lyle Ungar
Abstract:
Meaningfully comparing language models is challenging with current explanation methods. Current explanations are overwhelming for humans due to large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between DistilRoBERTa and GPT-2 on a variety of NLP tasks.
Submitted 1 June, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
DeepMerge: Deep-Learning-Based Region-Merging for Image Segmentation
Authors:
Xianwei Lv,
Claudio Persello,
Wangbin Li,
Xiao Huang,
Dongping Ming,
Alfred Stein
Abstract:
Image segmentation aims to partition an image according to the objects in the scene and is a fundamental step in analysing very high spatial-resolution (VHR) remote sensing imagery. Current methods struggle to effectively consider land objects with diverse shapes and sizes. Additionally, the determination of segmentation scale parameters frequently adheres to a static and empirical doctrine, posing limitations on the segmentation of large-scale remote sensing images and yielding algorithms with limited interpretability. To address the above challenges, we propose a deep-learning-based region merging method dubbed DeepMerge to handle the segmentation of complete objects in large VHR images by integrating deep learning and region adjacency graph (RAG). This is the first method to use deep learning to learn the similarity and merge similar adjacent super-pixels in RAG. We propose a modified binary tree sampling method to generate shift-scale data, serving as inputs for transformer-based deep learning networks, a shift-scale attention with 3-Dimension relative position embedding to learn features across scales, and an embedding to fuse learned features with hand-crafted features. DeepMerge can achieve high segmentation accuracy in a supervised manner from large-scale remotely sensed images and provides an interpretable optimal scale parameter, which is validated using a remote sensing image of 0.55 m resolution covering an area of 5,660 km^2. The experimental results show that DeepMerge achieves the highest F value (0.9550) and the lowest total error TE (0.0895), correctly segmenting objects of different sizes and outperforming all competing segmentation methods.
Submitted 5 January, 2024; v1 submitted 31 May, 2023;
originally announced May 2023.
-
Rectifying Group Irregularities in Explanations for Distribution Shift
Authors:
Adam Stein,
Yinjun Wu,
Eric Wong,
Mayur Naik
Abstract:
It is well-known that real-world changes constituting distribution shift adversely affect model performance. How to characterize those changes in an interpretable manner is poorly understood. Existing techniques to address this problem take the form of shift explanations that elucidate how to map samples from the original distribution toward the shifted one by reducing the disparity between these two distributions. However, these methods can introduce group irregularities, leading to explanations that are less feasible and robust. To address these issues, we propose Group-aware Shift Explanations (GSE), a method that produces interpretable explanations by leveraging worst-group optimization to rectify group irregularities. We demonstrate how GSE not only maintains group structures, such as demographic and hierarchical subpopulations, but also enhances feasibility and robustness in the resulting explanations in a wide range of tabular, language, and image settings.
Submitted 25 May, 2023;
originally announced May 2023.
-
Improving robustness of jet tagging algorithms with adversarial training: exploring the loss surface
Authors:
Annika Stein
Abstract:
In the field of high-energy physics, deep learning algorithms continue to gain in relevance and provide performance improvements over traditional methods, for example when identifying rare signals or finding complex patterns. From an analyst's perspective, obtaining highest possible performance is desirable, but recently, some attention has been shifted towards studying robustness of models to investigate how well these perform under slight distortions of input features. Especially for tasks that involve many (low-level) inputs, the application of deep neural networks brings new challenges. In the context of jet flavor tagging, adversarial attacks are used to probe a typical classifier's vulnerability and can be understood as a model for systematic uncertainties. A corresponding defense strategy, adversarial training, improves robustness, while maintaining high performance. Investigating the loss surface corresponding to the inputs and models in question reveals geometric interpretations of robustness, taking correlations into account.
Submitted 25 March, 2023;
originally announced March 2023.
-
From Playground Swings to Sway Control of Cranes: An Active Pendulum Experiment
Authors:
Adrian Stein,
Tarik Parcic,
Tarunraj Singh
Abstract:
Dynamics is a core discipline in Mechanical and Aerospace Engineering programs and with the ubiquitous nature of control in modern day applications, the field of mechatronics has gained popularity. Mechatronics refers to the field of engineering which integrates the engineering disciplines of mechanical, control, electronics and computing. To create a testbed to illustrate a tabletop mechatronics system, the paper details the design, and fabrication of an active pendulum whose length can be changed in real-time using solenoids. This permits illustrating two concepts: (1) damping of pendulum oscillations which emulates the sway of a crane and (2) amplification of the oscillations which emulates the pumping of a playground swing. The paper describes the steps prior to experimental validation which include: modeling, system identification, signal processing, and controller implementation. Numerical simulations are used to prototype the controller and eventually to compare the simulation results to the experimental ones. The results of all the experiments illustrate a close match between the simulated and experimental results. To permit reproduction of the experiment, the design details and code to implement the controllers are posted in a public repository.
Submitted 3 March, 2023;
originally announced March 2023.
-
Neural Auctions Compromise Bidder Information
Authors:
Alex Stein,
Avi Schwarzschild,
Michael Curry,
Tom Goldstein,
John Dickerson
Abstract:
Single-shot auctions are commonly used as a means to sell goods, for example when selling ad space or allocating radio frequencies, however devising mechanisms for auctions with multiple bidders and multiple items can be complicated. It has been shown that neural networks can be used to approximate optimal mechanisms while satisfying the constraints that an auction be strategyproof and individually rational. We show that despite such auctions maximizing revenue, they do so at the cost of revealing private bidder information. While randomness is often used to build in privacy, in this context it comes with complications if done without care. Specifically, it can violate rationality and feasibility constraints, fundamentally change the incentive structure of the mechanism, and/or harm top-level metrics such as revenue and social welfare. We propose a method that employs stochasticity to improve privacy while meeting the requirements for auction mechanisms with only a modest sacrifice in revenue. We analyze the cost to the auction house that comes with introducing varying degrees of privacy in common auction settings. Our results show that despite current neural auctions' ability to approximate optimal mechanisms, the resulting vulnerability that comes with relying on neural networks must be accounted for.
Submitted 28 February, 2023;
originally announced March 2023.
-
Learning to Select Pivotal Samples for Meta Re-weighting
Authors:
Yinjun Wu,
Adam Stein,
Jacob Gardner,
Mayur Naik
Abstract:
Sample re-weighting strategies provide a promising mechanism to deal with imperfect training data in machine learning, such as noisily labeled or class-imbalanced data. One such strategy involves formulating a bi-level optimization problem called the meta re-weighting problem, whose goal is to optimize performance on a small set of perfect pivotal samples, called meta samples. Many approaches have been proposed to efficiently solve this problem. However, all of them assume that a perfect meta sample set is already provided, while we observe that the selection of the meta sample set is performance-critical. In this paper, we study how to learn to identify such a meta sample set from a large, imperfect training set, which is subsequently cleaned and used to optimize performance in the meta re-weighting setting. We propose a learning framework which reduces the meta sample selection problem to a weighted K-means clustering problem through rigorous theoretical analysis. We propose two clustering methods within our learning framework, Representation-based clustering method (RBC) and Gradient-based clustering method (GBC), for balancing performance and computational efficiency. Empirical studies demonstrate the performance advantage of our methods over various baseline methods.
Submitted 8 February, 2023;
originally announced February 2023.
-
Multilevel Markov Chain Monte Carlo for Bayesian Elliptic Inverse Problems with Besov Random Tree Priors
Authors:
Andreas Stein,
Viet Ha Hoang
Abstract:
We propose a multilevel Monte Carlo-FEM algorithm to solve elliptic Bayesian inverse problems with "Besov random tree prior". These priors are given by a wavelet series with stochastic coefficients, and certain terms in the expansion vanishing at random, according to the law of so-called Galton-Watson trees. This allows to incorporate random fractal structures and large deviations in the log-diffusion, which occur naturally in many applications from geophysics or medical imaging. This framework entails two main difficulties: First, the associated diffusion coefficient does not satisfy a uniform ellipticity condition, which leads to non-integrable terms and thus divergence of standard multilevel estimators. Secondly, the associated space of parameters is Polish, but not a normed linear space. We address the first point by introducing cut-off functions in the estimator to compensate for the non-integrable terms, while the second issue is resolved by employing an independence Metropolis-Hastings sampler. The resulting algorithm converges in the mean-square sense with essentially optimal asymptotic complexity, and dimension-independent acceptance probabilities.
Submitted 1 February, 2023;
originally announced February 2023.
-
Multilevel Monte Carlo FEM for Elliptic PDEs with Besov Random Tree Priors
Authors:
Christoph Schwab,
Andreas Stein
Abstract:
We develop a multilevel Monte Carlo (MLMC)-FEM algorithm for linear, elliptic diffusion problems in polytopal domain $\mathcal D\subset \mathbb R^d$, with Besov-tree random coefficients. This is to say that the logarithms of the diffusion coefficients are sampled from so-called Besov-tree priors, which have recently been proposed to model data for fractal phenomena in science and engineering. Numerical analysis of the fully discrete FEM for the elliptic PDE includes quadrature approximation and must account for a) nonuniform pathwise upper and lower coefficient bounds, and for b) low path-regularity of the Besov-tree coefficients. Admissible non-parametric random coefficients correspond to random functions exhibiting singularities on random fractals with tunable fractal dimension, but involve no a-priori specification of the fractal geometry of singular supports of sample paths. Optimal complexity and convergence rate estimates for quantities of interest and for their second moments are proved. A convergence analysis for MLMC-FEM is performed which yields choices of the algorithmic steering parameters for efficient implementation. A complexity (``error vs work'') analysis of the MLMC-FEM approximations is provided.
Submitted 1 February, 2023;
originally announced February 2023.
-
Faithful Chain-of-Thought Reasoning
Authors:
Qing Lyu,
Shreya Havaldar,
Adam Stein,
Li Zhang,
Delip Rao,
Eric Wong,
Marianna Apidianaki,
Chris Callison-Burch
Abstract:
While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (aka. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query $\rightarrow$ symbolic reasoning chain) and Problem Solving (reasoning chain $\rightarrow$ answer), using an LM and a deterministic solver respectively. This guarantees that the reasoning chain provides a faithful explanation of the final answer. Aside from interpretability, Faithful CoT also improves empirical performance: it outperforms standard CoT on 9 of 10 benchmarks from 4 diverse domains, with a relative accuracy gain of 6.3% on Math Word Problems (MWP), 3.4% on Planning, 5.5% on Multi-hop Question Answering (QA), and 21.4% on Relational Inference. Furthermore, with GPT-4 and Codex, it sets the new state-of-the-art few-shot performance on 7 datasets (with 95.0+ accuracy on 6 of them), showing a strong synergy between faithfulness and accuracy.
Submitted 20 September, 2023; v1 submitted 30 January, 2023;
originally announced January 2023.
-
Minimum Time Control of a Gantry Crane System with Rate Constraints
Authors:
Adrian Stein,
Tarunraj Singh
Abstract:
This paper focuses on the development of minimum time control profiles for point-to-point motion of a gantry crane system in the presence of uncertainties in modal parameters. Assuming that the velocity of the trolley of the crane can be commanded and is subject to limits, an optimal control problem is posed to determine the bang-off-bang control profile to transition the system from a point of rest to the terminal states with no residual vibrations. Both undamped and underdamped systems are considered and the variation of the structure of the optimal control profiles as a function of the final displacement is studied. As the magnitude of the rigid body displacement is increased, the collapse and birthing of switches in the optimal control profile are observed and explained. Robustness to uncertainties in modal parameters is accounted for by forcing the state sensitivities at the terminal time to zero. The observation that the time-optimal control profile merges with the robust time-optimal control is noted for specific terminal displacements and the migration of zeros of the time-delay filter parameterizing the optimal control profile are used to explain this counter intuitive result. A two degree of freedom gantry crane system is used to experimentally validate the observations of the numerical studies and the tradeoff of increase in maneuver time to the reduction of residual vibrations is experimentally illustrated.
Submitted 20 January, 2023;
originally announced January 2023.
-
Shapley Effect Estimation using Polynomial Chaos
Authors:
Adrian Stein,
Tarunraj Singh
Abstract:
This paper presents an approach for estimating Shapley effects for use as global sensitivity metrics to quantify the relative importance of uncertain model parameters. Polynomial Chaos expansion, a well-established approach for developing surrogate models, is proposed to be used to estimate Shapley effects. Polynomial Chaos permits the transformation of a stochastic process to a deterministic model which can then be used to efficiently evaluate statistical moments of the quantity of interest. These moments include conditional variances which are algebraically mapped to Shapley effects. The polynomial chaos based estimates of Shapley effects are validated using Monte Carlo simulations and tested on the benchmark Ishigami function and on the dynamic SEIR epidemic model and the Bergman Type 1 diabetes model. The results illustrate the correct ranking of uncertain variables for the Ishigami function in contrast to the Sobol indices and illustrate the time-varying rank ordering of the model parameters for the dynamic models.
Submitted 11 January, 2023;
originally announced January 2023.
-
A-posteriori QMC-FEM error estimation for Bayesian inversion and optimal control with entropic risk measure
Authors:
Marcello Longo,
Christoph Schwab,
Andreas Stein
Abstract:
We propose a novel a-posteriori error estimation technique where the target quantities of interest are ratios of high-dimensional integrals, as occur e.g. in PDE constrained Bayesian inversion and PDE constrained optimal control subject to an entropic risk measure. We consider in particular parametric, elliptic PDEs with affine-parametric diffusion coefficient, on high-dimensional parameter spaces. We combine our recent a-posteriori Quasi-Monte Carlo (QMC) error analysis, with Finite Element a-posteriori error estimation. The proposed approach yields a computable a-posteriori estimator which is reliable, up to higher order terms. The estimator's reliability is uniform with respect to the PDE discretization, and robust with respect to the parametric dimension of the uncertain PDE input.
Submitted 6 July, 2023; v1 submitted 9 January, 2023;
originally announced January 2023.
-
Architectures for Multinode Superconducting Quantum Computers
Authors:
James Ang,
Gabriella Carini,
Yanzhu Chen,
Isaac Chuang,
Michael Austin DeMarco,
Sophia E. Economou,
Alec Eickbusch,
Andrei Faraon,
Kai-Mei Fu,
Steven M. Girvin,
Michael Hatridge,
Andrew Houck,
Paul Hilaire,
Kevin Krsulich,
Ang Li,
Chenxu Liu,
Yuan Liu,
Margaret Martonosi,
David C. McKay,
James Misewich,
Mark Ritter,
Robert J. Schoelkopf,
Samuel A. Stein,
Sara Sussman,
Hong X. Tang
, et al. (8 additional authors not shown)
Abstract:
Many proposals to scale quantum technology rely on modular or distributed designs where individual quantum processors, called nodes, are linked together to form one large multinode quantum computer (MNQC). One scalable method to construct an MNQC is using superconducting quantum systems with optical interconnects. However, a limiting factor of these machines will be internode gates, which may be two to three orders of magnitude noisier and slower than local operations. Surmounting the limitations of internode gates will require a range of techniques, including improvements in entanglement generation, the use of entanglement distillation, and optimized software and compilers, and it remains unclear how improvements to these components interact to affect overall system performance, what performance from each is required, or even how to quantify the performance of each. In this paper, we employ a `co-design' inspired approach to quantify overall MNQC performance in terms of hardware models of internode links, entanglement distillation, and local architecture. In the case of superconducting MNQCs with microwave-to-optical links, we uncover a tradeoff between entanglement generation and distillation that threatens to degrade performance. We show how to navigate this tradeoff, lay out how compilers should optimize between local and internode gates, and discuss when noisy quantum links have an advantage over purely classical links. Using these results, we introduce a roadmap for the realization of early MNQCs which illustrates potential improvements to the hardware and software of MNQCs and outlines criteria for evaluating the landscape, from progress in entanglement generation and quantum memory to dedicated algorithms such as distributed quantum phase estimation. While we focus on superconducting devices with optical interconnects, our approach is general across MNQC implementations.
Submitted 12 December, 2022;
originally announced December 2022.
-
QuCNN : A Quantum Convolutional Neural Network with Entanglement Based Backpropagation
Authors:
Samuel A. Stein,
Ying Mao,
James Ang,
Ang Li
Abstract:
Quantum Machine Learning continues to be a highly active area of interest within Quantum Computing. Many of these approaches have adapted classical approaches to the quantum settings, such as QuantumFlow, etc. We push forward this trend and demonstrate an adaption of the Classical Convolutional Neural Networks to quantum systems - namely QuCNN. QuCNN is a parameterised multi-quantum-state based neural network layer computing similarities between each quantum filter state and each quantum data state. With QuCNN, back propagation can be achieved through a single-ancilla qubit quantum routine. QuCNN is validated by applying a convolutional layer with a data state and a filter state over a small subset of MNIST images, comparing the back propagated gradients, and training a filter state against an ideal target state.
Submitted 11 October, 2022;
originally announced October 2022.
-
Early life height and weight production functions with endogenous energy and protein inputs
Authors:
Esteban Puentes,
Fan Wang,
Jere R. Behrman,
Flávio Cunha,
John Hoddinott,
John A. Maluccio,
Linda S. Adair,
Judith B. Borja,
Reynaldo Martorell,
Aryeh D. Stein
Abstract:
We examine effects of protein and energy intakes on height and weight growth for children between 6 and 24 months old in Guatemala and the Philippines. Using instrumental variables to control for endogeneity and estimating multiple specifications, we find that protein intake plays an important and positive role in height and weight growth in the 6-24 month period. Energy from other macronutrients, however, does not have a robust relation with these two anthropometric measures. Our estimates indicate that in contexts with substantial child undernutrition, increases in protein-rich food intake in the first 24 months can have important growth effects, which previous studies indicate are related significantly to a range of outcomes over the life cycle.
Submitted 5 April, 2022;
originally announced April 2022.
-
Improving Robustness of Jet Tagging Algorithms with Adversarial Training
Authors:
Annika Stein,
Xavier Coubez,
Spandan Mondal,
Andrzej Novak,
Alexander Schmidt
Abstract:
Deep learning is a standard tool in the field of high-energy physics, facilitating considerable sensitivity enhancements for numerous analysis strategies. In particular, in identification of physics objects, such as jet flavor tagging, complex neural network architectures play a major role. However, these methods are reliant on accurate simulations. Mismodeling can lead to non-negligible differences in performance in data that need to be measured and calibrated against. We investigate the classifier response to input data with injected mismodelings and probe the vulnerability of flavor tagging algorithms via application of adversarial attacks. Subsequently, we present an adversarial training strategy that mitigates the impact of such simulated attacks and improves the classifier robustness. We examine the relationship between performance and vulnerability and show that this method constitutes a promising approach to reduce the vulnerability to poor modeling.
Submitted 16 September, 2022; v1 submitted 25 March, 2022;
originally announced March 2022.
-
A Case Study of Vehicle Route Optimization
Authors:
Veronika Lesch,
Maximilian König,
Samuel Kounev,
Anthony Stein,
Christian Krupitzer
Abstract:
In the last decades, the classical Vehicle Routing Problem (VRP), i.e., assigning a set of orders to vehicles and planning their routes has been intensively researched. As only the assignment of order to vehicles and their routes is already an NP-complete problem, the application of these algorithms in practice often fails to take into account the constraints and restrictions that apply in real-world applications, the so called rich VRP (rVRP) and are limited to single aspects. In this work, we incorporate the main relevant real-world constraints and requirements. We propose a two-stage strategy and a Timeline algorithm for time windows and pause times, and apply a Genetic Algorithm (GA) and Ant Colony Optimization (ACO) individually to the problem to find optimal solutions. Our evaluation of eight different problem instances against four state-of-the-art algorithms shows that our approach handles all given constraints in a reasonable time.
Submitted 17 November, 2021;
originally announced November 2021.
-
Electronic Properties of Tetraazaperopyrene Derivatives on Au(111): Energy Level Alignment and Interfacial Band Formation
Authors:
Arnulf Stein,
Daniela Rolf,
Christian Lotze,
Sascha Feldmann,
David Gerbert,
Benjamin Günther,
Andreas Jeindl,
Johannes J. Cartus,
Oliver T. Hofmann,
Lutz H. Gade,
Katharina J. Franke,
Petra Tegeder
Abstract:
N-Heteropolycyclic aromatic compounds are promising organic electron-transporting semiconductors for applications in field effect transistors. Here, we investigated the electronic properties of 1,3,8,10-tetraazaperopyrene derivatives adsorbed on Au(111) using a complementary experimental approach, namely scanning tunneling spectroscopy and two-photon photoemission combined with state-of-the-art de…
▽ More
N-Heteropolycyclic aromatic compounds are promising organic electron-transporting semiconductors for applications in field effect transistors. Here, we investigated the electronic properties of 1,3,8,10-tetraazaperopyrene derivatives adsorbed on Au(111) using a complementary experimental approach, namely scanning tunneling spectroscopy and two-photon photoemission combined with state-of-the-art density functional calculations. We find signatures of weak physisorption of the molecular layers, such as the absence of charge transfer, a nearly unperturbed surface state and an intact herringbone reconstruction underneath the molecular layer. Interestingly, molecular states in the energy region of the \emph{sp}- and \emph{d}-bands of the Au(111) substrate exhibit hole-like dispersive character. We ascribe this band character to hybridization with the delocalized states of the substrate. We suggest that such bands, which effectively leave the molecular frontier orbitals largely unperturbed, to be a promising lead for the design of organic-metal interfaces with a low charge injection barrier.
Submitted 22 August, 2021;
originally announced August 2021.
-
Widening, Transition and Coalescence of Local Resonance Band Gaps in Multi-resonator Acoustic Metamaterials: From Unit Cells to Finite Chains
Authors:
A. Stein,
M. Nouh,
T. Singh
Abstract:
Local resonance band gaps in acoustic metamaterials are widely known for their strong attenuation yet narrow frequency span. The latter limits the practical ability to implement subwavelength band gaps for broadband attenuation and has motivated novel metamaterial designs in recent years. In this paper, we investigate the behavior of acoustic metamaterials where unit cells house multiple resonating elements stacked in different configurations, aimed at instigating a wide array of wave propagation profiles that are otherwise unattainable. The dispersion mechanics of the multi-resonator metamaterials are developed using purely analytical expressions which depict and explain the underlying dynamics of such systems both at the unit cell level as well as the frequency response of their finite realizations. The framework reveals the mechanism behind the transition of the lower and upper band gap bounds in metamaterials with parallel resonators resulting in a significant band gap widening. The analysis also illustrates the ability of metamaterials with dual-periodic super cells to exhibit a range of dispersion transitions culminating in collapsing solutions of acoustic and optic bands, enabling a coalescence of local resonance band gaps, vanishing resonances, and a number of intriguing scenarios in between.
Submitted 11 January, 2022; v1 submitted 18 July, 2021;
originally announced July 2021.
-
Multiple energy-scales in vertex-frustrated mesospin systems
Authors:
Henry Stopfel,
Unnar B. Arnalds,
Aaron Stein,
Thomas P. A. Hase,
Björgvin Hjörvarsson,
Vassilios Kapaklis
Abstract:
The interplay between topology and energy-hierarchy plays a vital role in the collective magnetic order in artificial ferroic systems. Here we investigate, experimentally, the effect of having one or two activation energies of interacting Ising-like magnetic islands -- mesospins -- in thermalized, vertex-frustrated lattices. The thermally arrested magnetic states of the elements were determined using synchrotron-based magnetic microscopy after cooling the samples from temperatures above the Curie temperature of the material. Statistical analysis of the correlations between mesospins across several length-scales reveals changes in the magnetic order, reflecting the amount of ground state plaquettes realized for a vertex-frustrated lattice. We show that the latter depends on the presence, or not, of different activation energies.
Submitted 15 October, 2021; v1 submitted 19 June, 2021;
originally announced June 2021.
-
QuClassi: A Hybrid Deep Neural Network Architecture based on Quantum State Fidelity
Authors:
Samuel A. Stein,
Betis Baheri,
Daniel Chen,
Ying Mao,
Qiang Guan,
Ang Li,
Shuai Xu,
Caiwen Ding
Abstract:
In the past decade, remarkable progress has been achieved in deep learning related systems and applications. In the post Moore's Law era, however, the limit of semiconductor fabrication technology along with the increasing data size have slowed down the development of learning algorithms. In parallel, the fast development of quantum computing has pushed it into a new era. Google illustrates quantum supremacy by completing a specific task (random sampling problem) in 200 seconds, which is impracticable for the largest classical computers. Due to the limitless potential, quantum based learning is an area of interest, in hopes that certain systems might offer a quantum speedup. In this work, we propose a novel architecture QuClassi, a quantum neural network for both binary and multi-class classification. Powered by a quantum differentiation function along with a hybrid quantum-classic design, QuClassi encodes the data with a reduced number of qubits and generates the quantum circuit, pushing it to the quantum platform for the best states, iteratively. We conduct intensive experiments on both the simulator and IBM-Q quantum platform. The evaluation results demonstrate that QuClassi is able to outperform the state-of-the-art quantum-based solutions, Tensorflow-Quantum and QuantumFlow by up to 53.75% and 203.00% for binary and multi-class classifications. When compared to traditional deep neural networks, QuClassi achieves a comparable performance with 97.37% fewer parameters.
Submitted 31 March, 2022; v1 submitted 21 March, 2021;
originally announced March 2021.
-
3D Fully Convolutional Neural Networks with Intersection Over Union Loss for Crop Mapping from Multi-Temporal Satellite Images
Authors:
Sina Mohammadi,
Mariana Belgiu,
Alfred Stein
Abstract:
Information on cultivated crops is relevant for a large number of food security studies. Different scientific efforts are dedicated to generating this information from remote sensing images by means of machine learning methods. Unfortunately, these methods do not take account of the spatial-temporal relationships inherent in remote sensing images. In our paper, we explore the capability of a 3D Fully Convolutional Neural Network (FCN) to map crop types from multi-temporal images. In addition, we propose the Intersection Over Union (IOU) loss function for increasing the overlap between the predicted classes and ground reference data. The proposed method was applied to identify soybean and corn from a study area situated in the US corn belt using multi-temporal Landsat images. The study shows that our method outperforms related methods, obtaining a Kappa coefficient of 91.8%. We conclude that using the IOU loss function provides a superior choice to learn individual crop types.
Submitted 19 October, 2021; v1 submitted 14 February, 2021;
originally announced February 2021.
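The abstract above does not give the exact form of the proposed IOU loss, so the following is only a minimal sketch of one common "soft" IoU loss for a single class, written in plain Python. The function name `soft_iou_loss`, the flat-list inputs, and the smoothing term `eps` are assumptions made for illustration and are not taken from the paper.

```python
def soft_iou_loss(pred, target, eps=1e-7):
    """Soft IoU loss for one class (illustrative sketch, not the paper's code).

    pred   -- predicted class probabilities in [0, 1], one per pixel
    target -- ground-reference labels (0 or 1), one per pixel
    eps    -- assumed smoothing constant that avoids division by zero
    """
    # Soft intersection: probability mass placed on true pixels.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Soft union: |P| + |G| - |P intersect G|.
    union = sum(pred) + sum(target) - intersection
    # The loss falls as the overlap between prediction and reference grows.
    return 1.0 - (intersection + eps) / (union + eps)


if __name__ == "__main__":
    # Hypothetical four-pixel example for one crop class.
    print(soft_iou_loss([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0]))
```

Minimizing this quantity directly rewards overlap between the predicted class and the reference data, which is the motivation stated in the abstract; a multi-class variant would typically average the per-class values.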
-
Quantum-Inspired Classical Algorithm for Slow Feature Analysis
Authors:
Daniel Chen,
Yekun Xu,
Betis Baheri,
Samuel A. Stein,
Chuan Bi,
Ying Mao,
Qiang Quan,
Shuai Xu
Abstract:
Recently, there has been a surge of interest in quantum computation for its ability to exponentially speed up algorithms, including machine learning algorithms. However, Tang suggested that the exponential speed up can also be done on a classical computer. In this paper, we proposed an algorithm for slow feature analysis, a machine learning algorithm that extracts the slow-varying features, with a run time O(polylog(n)poly(d)). To achieve this, we assumed necessary preprocessing of the input data as well as the existence of a data structure supporting a particular sampling scheme. The analysis of the algorithm borrowed results from matrix perturbation theory, which was crucial for the algorithm's correctness. This work demonstrates the possible application and extent to which quantum-inspired computation can be used.
Submitted 1 December, 2020;
originally announced December 2020.
-
A Hybrid System for Learning Classical Data in Quantum States
Authors:
Samuel A. Stein,
Ryan L'Abbate,
Wenrui Mu,
Yue Liu,
Betis Baheri,
Ying Mao,
Qiang Guan,
Ang Li,
Bo Fang
Abstract:
Deep neural network powered artificial intelligence has rapidly changed our daily life with various applications. However, as one of the essential steps of deep neural networks, training a heavily weighted network requires a tremendous amount of computing resources. Especially in the post-Moore's Law era, the limit of semiconductor fabrication technology has restricted the development of learning algorithms to cope with the increasing high-intensity training data. Meanwhile, quantum computing has demonstrated its significant potential in terms of speeding up the traditionally compute-intensive workloads. For example, Google illustrated quantum supremacy by completing a sampling calculation task in 200 seconds, which is otherwise impracticable on the world's largest supercomputers. To this end, quantum-based learning has become an area of interest, with the potential of a quantum speedup. In this paper, we propose GenQu, a hybrid and general-purpose quantum framework for learning classical data through quantum states. We evaluate GenQu with real datasets and conduct experiments on both simulations and the real quantum computer IBM-Q. Our evaluation demonstrates that, compared with classical solutions, the proposed models running on the GenQu framework achieve similar accuracy with a much smaller number of qubits, while significantly reducing the parameter size by up to 95.86% and converging 33.33% faster.
Submitted 20 August, 2021; v1 submitted 30 November, 2020;
originally announced December 2020.
-
QuGAN: A Quantum State Fidelity based Generative Adversarial Network
Authors:
Samuel A. Stein,
Betis Baheri,
Daniel Chen,
Ying Mao,
Qiang Guan,
Ang Li,
Bo Fang,
Shuai Xu
Abstract:
Tremendous progress has been witnessed in artificial intelligence where neural network backed deep learning systems have been used, with applications in almost every domain. As a representative deep learning framework, Generative Adversarial Network (GAN) has been widely used for generating artificial images, text-to-image or image augmentation across areas of science, arts and video games. However, GANs are computationally expensive, sometimes computationally prohibitive. Furthermore, training GANs may suffer from convergence failure and mode collapse. Aiming at the acceleration of use cases for practical quantum computers, we propose QuGAN, a quantum GAN architecture that provides stable convergence, quantum-state based gradients and significantly reduced parameter sets. The QuGAN architecture runs both the discriminator and the generator purely on quantum state fidelity and utilizes the swap test on qubits to calculate the values of quantum-based loss functions. Built on quantum layers, QuGAN achieves similar performance with a 94.98% reduction on the parameter set when compared to classical GANs. With the same number of parameters, additionally, QuGAN outperforms state-of-the-art quantum based GANs in the literature providing a 48.33% improvement in system performance compared to others attaining less than 0.5% in terms of similarity between generated distributions and original data sets. QuGAN code is released at https://github.com/yingmao/Quantum-Generative-Adversarial-Network
Submitted 22 September, 2022; v1 submitted 18 October, 2020;
originally announced October 2020.
-
XCS Classifier System with Experience Replay
Authors:
Anthony Stein,
Roland Maier,
Lukas Rosenbauer,
Jörg Hähner
Abstract:
XCS constitutes the most deeply investigated classifier system today. It bears strong potential and comes with inherent capabilities for mastering a variety of different learning tasks. Besides outstanding successes in various classification and regression tasks, XCS also proved very effective in certain multi-step environments from the domain of reinforcement learning. Especially in the latter domain, recent advances have been mainly driven by algorithms which model their policies based on deep neural networks -- among which the Deep-Q-Network (DQN) is a prominent representative. Experience Replay (ER) constitutes one of the crucial factors for the DQN's successes, since it facilitates stabilized training of the neural network-based Q-function approximators. Surprisingly, XCS barely takes advantage of similar mechanisms that leverage stored raw experiences encountered so far. To bridge this gap, this paper investigates the benefits of extending XCS with ER. On the one hand, we demonstrate that for single-step tasks ER bears massive potential for improvements in terms of sample efficiency. On the downside, however, we reveal that the use of ER might further aggravate well-studied issues not yet solved for XCS when applied to sequential decision problems demanding long action chains.
Submitted 13 February, 2020;
originally announced February 2020.
-
Bootstrapping a DQN Replay Memory with Synthetic Experiences
Authors:
Wenzel Baron Pilar von Pilchau,
Anthony Stein,
Jörg Hähner
Abstract:
An important component of many Deep Reinforcement Learning algorithms is the Experience Replay which serves as a storage mechanism or memory of made experiences. These experiences are used for training and help the agent to stably find the perfect trajectory through the problem space. The classic Experience Replay however makes only use of the experiences it actually made, but the stored samples bear great potential in form of knowledge about the problem that can be extracted. We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner. The Interpolated Experience Replay is evaluated on the FrozenLake environment and we show that it can support the agent to learn faster and even better than the classic version.
Submitted 4 February, 2020;
originally announced February 2020.
-
Collective magnetic dynamics in artificial spin ice probed by AC susceptibility
Authors:
Merlin Pohlit,
Giuseppe Muscas,
Ioan-Augustin Chioar,
Henry Stopfel,
Agne Ciuciulkaite,
Erik Östman,
Spyridon D. Pappas,
Aaron Stein,
Björgvin Hjörvarsson,
Petra E. Jönsson,
Vassilios Kapaklis
Abstract:
We report on the study of the thermal dynamics of square artificial spin ice, probed by means of temperature and frequency dependent AC susceptibility. Pronounced influence of the inter-island coupling strength was found on the frequency response of the samples. Through the subsequent analysis of the frequency- and coupling-dependent freezing temperatures, we discuss the phenomenological parameters obtained in the framework of the Vogel-Fulcher-Tammann law in terms of the samples' microscopic features. The high sensitivity and robust signal to noise ratio of AC susceptibility validates the latter as a promising and simple experimental technique for resolving the dynamics and temperature driven dynamics crossovers for the case of artificial spin ice.
Submitted 11 March, 2020; v1 submitted 14 January, 2020;
originally announced January 2020.
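For orientation, the Vogel-Fulcher-Tammann law named in the abstract above is commonly written in the form below. The symbols are assumptions of this note, since the abstract does not state the authors' parameterization: $\tau = 1/(2\pi f)$ is the relaxation time at probing frequency $f$, $T_f$ the freezing temperature, $\tau_0$ an attempt time, $E_a$ an activation energy, and $T_0$ the Vogel-Fulcher temperature, usually read as a measure of the interaction strength.

```latex
\tau = \tau_0 \exp\!\left( \frac{E_a}{k_B \,(T_f - T_0)} \right)
```

Setting $T_0 = 0$ recovers the Arrhenius law for non-interacting elements, which is why $T_0$ is the natural parameter to compare across samples with different inter-island coupling.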
-
Stochastic Transport with Lévy Noise -- Fully Discrete Numerical Approximation
Authors:
Andrea Barth,
Andreas Stein
Abstract:
Semilinear hyperbolic stochastic partial differential equations (SPDEs) find widespread applications in the natural and engineering sciences. However, the traditional Gaussian setting may prove too restrictive, as phenomena in mathematical finance, porous media, and pollution models often exhibit noise of a different nature. To capture temporal discontinuities and accommodate heavy-tailed distributions, Hilbert space-valued Lévy processes or Lévy fields are employed as driving noise terms. The numerical discretization of such SPDEs presents several challenges. The low regularity of the solution in space and time leads to slow convergence rates and instability in space/time discretization schemes. Furthermore, the Lévy process can take values in an infinite-dimensional Hilbert space, necessitating projections onto finite-dimensional subspaces at each discrete time point. Additionally, unbiased sampling from the resulting Lévy field may not be feasible. In this study, we introduce a novel fully discrete approximation scheme that tackles these difficulties. Our main contribution is a discontinuous Galerkin scheme for spatial approximation, derived naturally from the weak formulation of the SPDE. We establish optimal convergence properties for this approach and combine it with a suitable time stepping scheme to prevent numerical oscillations. Furthermore, we approximate the driving noise process using truncated Karhunen-Loève expansions. This approximation yields a sum of scaled and uncorrelated one-dimensional Lévy processes, which can be simulated with controlled bias using Fourier inversion techniques.
Submitted 3 July, 2023; v1 submitted 31 October, 2019;
originally announced October 2019.
-
Dielectric Metasurfaces for Complete and Independent Control of Optical Amplitude and Phase
Authors:
Adam C. Overvig,
Sajan Shrestha,
Stephanie C. Malek,
Ming Lu,
Aaron Stein,
Changxi Zheng,
Nanfang Yu
Abstract:
Metasurfaces are optically thin metamaterials that promise complete control of the wavefront of light but are primarily used to control only the phase of light. Here, we present an approach, simple in concept and in practice, that uses meta-atoms with a varying degree of form birefringence and rotation angles to create high-efficiency dielectric metasurfaces that control both the optical amplitude and phase at one or two frequencies. This opens up applications in computer-generated holography, allowing faithful reproduction of both the phase and amplitude of a target holographic scene without the iterative algorithms required in phase-only holography. We demonstrate all-dielectric metasurface holograms with independent and complete control of the amplitude and phase at up to two optical frequencies simultaneously to generate two- and three-dimensional holographic objects. We show that phase-amplitude metasurfaces enable a few features not attainable in phase-only holography; these include creating artifact-free two-dimensional holographic images, encoding phase and amplitude profiles separately at the object plane, encoding intensity profiles at the metasurface and object planes separately, and controlling the surface textures of three-dimensional holographic objects.
Submitted 5 September, 2019; v1 submitted 1 March, 2019;
originally announced March 2019.
-
Towards Automated Melanoma Detection with Deep Learning: Data Purification and Augmentation
Authors:
Devansh Bisla,
Anna Choromanska,
Jennifer A. Stein,
David Polsky,
Russell Berman
Abstract:
Melanoma is one of the ten most common cancers in the US. Early detection is crucial for survival, but often the cancer is diagnosed in the fatal stage. Deep learning has the potential to improve cancer detection rates, but its applicability to melanoma detection is compromised by the limitations of the available skin lesion databases, which are small, heavily imbalanced, and contain images with occlusions. We build deep-learning-based tools for data purification and augmentation to counteract these limitations. The developed tools can be utilized in a deep learning system for lesion classification and we show how to build such a system. The system heavily relies on the processing unit for removing image occlusions and the data generation unit, based on generative adversarial networks, for populating scarce lesion classes, or equivalently creating virtual patients with pre-defined types of lesions. We empirically verify our approach and show that incorporating these two units into a melanoma detection system results in superior performance over common baselines.
Submitted 14 May, 2019; v1 submitted 16 February, 2019;
originally announced February 2019.
-
Numerical analysis for time-dependent advection-diffusion problems with random discontinuous coefficients
Authors:
Andrea Barth,
Andreas Stein
Abstract:
Subsurface flows are commonly modeled by advection-diffusion equations. Insufficient measurements or uncertain material procurement may be accounted for by random coefficients. To represent, for example, transitions in heterogeneous media, the parameters of the equation are spatially discontinuous. Specifically, a scenario with coupled advection and diffusion coefficients that are modeled as sums of continuous random fields and discontinuous jump components is considered. For the numerical approximation of the solution, an adaptive, pathwise discretization scheme based on a Finite Element approach is introduced. To stabilize the numerical approximation and accelerate convergence, the discrete space-time grid is chosen with respect to the varying discontinuities in each sample of the coefficients, leading to a stochastic formulation of the Galerkin projection and the Finite Element basis.
Submitted 22 January, 2021; v1 submitted 6 February, 2019;
originally announced February 2019.
-
A Multilevel Monte Carlo Algorithm for Parabolic Advection-Diffusion Problems with Discontinuous Coefficients
Authors:
Andrea Barth,
Andreas Stein
Abstract:
The Richards' equation is a model for flow of water in unsaturated soils. The coefficients of this (nonlinear) partial differential equation describe the permeability of the medium. Insufficient or uncertain measurements are commonly modeled by random coefficients. For flows in heterogeneous/fractured/porous media, the coefficients are modeled as discontinuous random fields, where the interfaces along the stochastic discontinuities represent transitions in the media. More precisely, the random coefficient is given by the sum of a (continuous) Gaussian random field and a (discontinuous) jump part. In this work moments of the solution to the random partial differential equation are calculated using a path-wise numerical approximation combined with multilevel Monte Carlo sampling. The discontinuities dictate the spatial discretization, which leads to a stochastic grid. Hence, the refinement parameter and problem-dependent constants in the error analysis are random variables and we derive (optimal) a-priori convergence rates in a mean-square sense.
Submitted 9 March, 2020; v1 submitted 6 February, 2019;
originally announced February 2019.