Towards Compositionality in Concept Learning

Authors: Adam Stein, Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong

Abstract: Concept-based interpretability methods offer a lens into the internals of foundation models by decomposing their embeddings into high-level concepts. These concept representations are most useful when they are compositional, meaning that the individual concepts compose to explain the full sample. We show that existing unsupervised concept extraction methods find concepts which are not compositional. To automatically discover compositional concept representations, we identify two salient properties of such representations, and propose Compositional Concept Extraction (CCE) for finding concepts which obey these properties. We evaluate CCE on five different datasets over image and text data. Our evaluation shows that CCE finds more compositional concept representations than baselines and yields better accuracy on four downstream classification tasks. Code and data are available at https://github.com/adaminsky/compositional_concepts .

Submitted 26 June, 2024; originally announced June 2024.

Comments: Accepted at ICML 2024. 26 pages, 10 figures

arXiv:2405.17399 [pdf, other]

Transformers Can Do Arithmetic with the Right Embeddings

Authors: Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

Abstract: The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix enables architectural modifications such as input injection and recurrent layers to improve performance even further. With positions resolved, we can study the logical extrapolation ability of transformers. Can they solve arithmetic problems that are larger and more complex than those in their training data? We find that training on only 20 digit numbers with a single GPU for one day, we can reach state-of-the-art performance, achieving up to 99% accuracy on 100 digit addition problems. Finally, we show that these gains in numeracy also unlock improvements on other multi-step reasoning tasks including sorting and multiplication.

Submitted 27 May, 2024; originally announced May 2024.
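
The position scheme described in the abstract above can be illustrated with a minimal sketch. It assumes character-level tokens and a hypothetical helper name, `digit_positions`; the abstract does not specify digit ordering, offsets, or how the indices are turned into embeddings, so this shows only the idea of indexing each digit relative to the start of its number.

```python
def digit_positions(tokens):
    """Return, for each character token, its 1-based offset from the start
    of the number it belongs to, or 0 for non-digit tokens.

    Hypothetical illustration only: the resulting indices could be used to
    look up a learned embedding that is added to each digit's token embedding.
    """
    positions = []
    offset = 0  # how many digits of the current number have been seen
    for tok in tokens:
        if tok.isdigit():
            offset += 1
            positions.append(offset)
        else:
            offset = 0  # any non-digit token ends the current number
            positions.append(0)
    return positions


if __name__ == "__main__":
    print(digit_positions(list("123+45=")))
```

Because the index restarts at every number, digits of equal significance in different operands can share a position index regardless of where the operands sit in the full sequence.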

arXiv:2405.04514 [pdf, other]

Scalable Circuit Cutting and Scheduling in a Resource-constrained and Distributed Quantum System

Authors: Shuwen Kan, Zefan Du, Miguel Palma, Samuel A Stein, Chenxu Liu, Wenqi Wei, Juntao Chen, Ang Li, Ying Mao

Abstract: Despite quantum computing's rapid development, current systems remain limited in practical applications due to their limited qubit count and quality. Various technologies, such as superconducting, trapped ions, and neutral atom quantum computing technologies are progressing towards a fault tolerant era, however they all face a diverse set of challenges in scalability and control. Recent efforts have focused on multi-node quantum systems that connect multiple smaller quantum devices to execute larger circuits. Future demonstrations hope to use quantum channels to couple systems, however current demonstrations can leverage classical communication with circuit cutting techniques. This involves cutting large circuits into smaller subcircuits and reconstructing them post-execution. However, existing cutting methods are hindered by lengthy search times as the number of qubits and gates increases. Additionally, they often fail to effectively utilize the resources of various worker configurations in a multi-node system. To address these challenges, we introduce FitCut, a novel approach that transforms quantum circuits into weighted graphs and utilizes a community-based, bottom-up approach to cut circuits according to resource constraints, e.g., qubit counts, on each worker. FitCut also includes a scheduling algorithm that optimizes resource utilization across workers. Implemented with Qiskit and evaluated extensively, FitCut significantly outperforms the Qiskit Circuit Knitting Toolbox, reducing time costs by factors ranging from 3 to 2000 and improving resource utilization rates by up to 3.88 times on the worker side, achieving a system-wide improvement of 2.86 times.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.04499 [pdf, other]

Benchmarking Optimizers for Qumode State Preparation with Variational Quantum Algorithms

Authors: Shuwen Kan, Miguel Palma, Zefan Du, Samuel A Stein, Chenxu Liu, Juntao Chen, Ang Li, Ying Mao

Abstract: Quantum state preparation involves preparing a target state from an initial system, a process integral to applications such as quantum machine learning and solving systems of linear equations. Recently, there has been a growing interest in qumodes due to advancements in the field and their potential applications. However there is a notable gap in the literature specifically addressing this area. This paper aims to bridge this gap by providing performance benchmarks of various optimizers used in state preparation with Variational Quantum Algorithms. We conducted extensive testing across multiple scenarios, including different target states, both ideal and sampling simulations, and varying numbers of basis gate layers. Our evaluations offer insights into the complexity of learning each type of target state and demonstrate that some optimizers perform better than others in this context. Notably, the Powell optimizer was found to be exceptionally robust against sampling errors, making it a preferred choice in scenarios prone to such inaccuracies. Additionally, the Simultaneous Perturbation Stochastic Approximation optimizer was distinguished for its efficiency and ability to handle increased parameter dimensionality effectively.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2403.13110 [pdf, other]

Single-Shot Readout and Weak Measurement of a Tin-Vacancy Qubit in Diamond

Authors: Eric I. Rosenthal, Souvik Biswas, Giovanni Scuri, Hope Lee, Abigail J. Stein, Hannah C. Kleidermacher, Jakob Grzesik, Alison E. Rugar, Shahriar Aghaeimeibodi, Daniel Riedel, Michael Titze, Edward S. Bielejec, Joonhee Choi, Christopher P. Anderson, Jelena Vuckovic

Abstract: The negatively charged tin-vacancy center in diamond (SnV$^-$) is an emerging platform for building the next generation of long-distance quantum networks. This is due to the SnV$^-$'s favorable optical and spin properties including bright emission, insensitivity to electronic noise, and long spin coherence times at temperatures above 1 Kelvin. Here, we demonstrate measurement of a single SnV$^-$ electronic spin with a single-shot readout fidelity of $87.4\%$, which can be further improved to $98.5\%$ by conditioning on multiple readouts. We show this performance is compatible with rapid microwave spin control, demonstrating that the trade-off between optical readout and spin control inherent to group-IV centers in diamond can be overcome for the SnV$^-$. Finally, we use weak quantum measurement to study measurement induced dephasing; this illuminates the fundamental interplay between measurement and decoherence in quantum mechanics, and makes use of the qubit's spin coherence as a metrological tool. Taken together, these results overcome an important hurdle in the development of the SnV$^-$ based quantum technologies, and in the process, develop techniques and understanding broadly applicable to the study of solid-state quantum emitters.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.11329 [pdf, other]

AQM: A Refresh of the Abstract Qubit Model for Quantum Computing Co-design

Authors: Chenxu Liu, Samuel A. Stein, Muqing Zheng, James Ang, Ang Li

Abstract: Qubits are the fundamental building blocks of quantum information science and applications, whose concept is widely utilized in both quantum physics and quantum computation. While the significance of qubits and their implementation in physical devices have been extensively examined, now is the right time to revisit this understanding. In this paper, we introduce an abstract qubit model (AQM), offering a mathematical framework for higher-level algorithms and applications, and setting forth criteria for lower-level physical devices to enable quantum computation. We first provide a comprehensive definition of "qubits", regarded as the foundational principle for quantum computing algorithms (bottom-up support), and examine their requisites for devices (top-down demand). We then investigate the feasibility of relaxing specific requirements, thereby broadening device support while considering techniques that tradeoff extra costs to counterbalance this relaxation. Lastly, we delve into the quantum applications that only require partial support of "qubits", and discuss the physical systems with limited support of the AQM but remain valuable in quantum applications. AQM may serve as an intermediate interface between quantum algorithms and devices, facilitating quantum algorithm-device co-design.

Submitted 18 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

Comments: 36 pages, 3 figures, 2 tables

arXiv:2402.17842 [pdf, other]

doi 10.5281/zenodo.10719143

Public Goods Games in Disease Evolution and Spread

Authors: Christo Morison, Małgorzata Fic, Thomas Marcou, Javad Mohamadichamgavi, Javier Redondo Antón, Golsa Sayyar, Alexander Stein, Frank Bastian, Hana Krakovská, Nandakishor Krishnan, Diogo L. Pires, Mohammadreza Satouri, Frederik J. Thomsen, Kausutua Tjikundi, Wajid Ali

Abstract: Cooperation arises in nature at every scale, from within cells to entire ecosystems. In the framework of evolutionary game theory, public goods games (PGGs) are used to analyse scenarios where individuals can cooperate or defect, and can predict when and how these behaviours emerge. However, too few examples motivate the transferal of knowledge from one application of PGGs to another. Here, we focus on PGGs arising in disease modelling of cancer evolution and the spread of infectious diseases. We use these two systems as case studies for the development of the theory and applications of PGGs, which we succinctly review and compare. We also posit that applications of evolutionary game theory to decision-making in cancer, such as interactions between a clinician and a tumour, can learn from the PGGs studied in epidemiology, where cooperative behaviours such as quarantine and vaccination compliance have been more thoroughly investigated. Furthermore, instances of cellular-level cooperation observed in cancers point to a corresponding area of potential interest for modellers of other diseases, be they viral, bacterial or otherwise. We aim to demonstrate the breadth of applicability of PGGs in disease modelling while providing a starting point for those interested in quantifying cooperation arising in healthcare.

Submitted 27 February, 2024; originally announced February 2024.

Comments: 12 pages, 2 figures, 3 tables

arXiv:2402.14020 [pdf, other]

Coercing LLMs to do and reveal (almost) anything

Authors: Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein

Abstract: It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements. In this work, we argue that the spectrum of adversarial attacks on LLMs is much larger than merely jailbreaking. We provide a broad overview of possible attack surfaces and attack goals. Based on a series of concrete examples, we discuss, categorize and systematize attacks that coerce varied unintended behaviors, such as misdirection, model control, denial-of-service, or data extraction. We analyze these attacks in controlled experiments, and find that many of them stem from the practice of pre-training LLMs with coding capabilities, as well as the continued existence of strange "glitch" tokens in common LLM vocabularies that should be removed for security reasons.

Submitted 21 February, 2024; originally announced February 2024.

Comments: 32 pages. Implementation available at https://github.com/JonasGeiping/carving

arXiv:2401.15113 [pdf, other]

Towards Global Glacier Mapping with Deep Learning and Open Earth Observation Data

Authors: Konstantin A. Maslov, Claudio Persello, Thomas Schellenberger, Alfred Stein

Abstract: Accurate global glacier mapping is critical for understanding climate change impacts. Despite its importance, automated glacier mapping at a global scale remains largely unexplored. Here we address this gap and propose Glacier-VisionTransformer-U-Net (GlaViTU), a convolutional-transformer deep learning model, and five strategies for multitemporal global-scale glacier mapping using open satellite imagery. Assessing the spatial, temporal and cross-sensor generalisation shows that our best strategy achieves intersection over union >0.85 on previously unobserved images in most cases, which drops to >0.75 for debris-rich areas such as High-Mountain Asia and increases to >0.90 for regions dominated by clean ice. A comparative validation against human expert uncertainties in terms of area and distance deviations underscores GlaViTU performance, approaching or matching expert-level delineation. Adding synthetic aperture radar data, namely, backscatter and interferometric coherence, increases the accuracy in all regions where available. The calibrated confidence for glacier extents is reported making the predictions more reliable and interpretable. We also release a benchmark dataset that covers 9% of glaciers worldwide. Our results support efforts towards automated multitemporal and global glacier mapping.

Submitted 29 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: after major revision, discussion extended, added comparison with human experts, added comparison with band ratio
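
For readers unfamiliar with the intersection-over-union figures quoted in the abstract above, the following is a minimal sketch of the metric on flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
def intersection_over_union(pred, truth):
    """Compute IoU between two equal-length binary masks (flattened pixels).

    A pixel counts toward the intersection when both masks mark it as glacier,
    and toward the union when at least one of them does.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks agree perfectly.
    return inter / union if union else 1.0


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(intersection_over_union(predicted, reference))
```

In this toy case one pixel is shared and three pixels are marked by at least one mask, so the score is one third; a reported value above 0.85 means the predicted and reference outlines overlap far more closely than that.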

arXiv:2310.03132 [pdf, ps, other]

Application-Oriented Co-Design of Motors and Motions for a 6DOF Robot Manipulator

Authors: Adrian Stein, Yebin Wang, Yusuke Sakamoto, Bingnan Wang, Huazhen Fang

Abstract: This work investigates an application-driven co-design problem where the motion and motors of a six degrees of freedom robotic manipulator are optimized simultaneously, and the application is characterized by a set of tasks. Unlike the state-of-the-art which selects motors from a product catalogue and performs co-design for a single task, this work designs the motor geometry as well as motion for a specific application. Contributions are made towards solving the proposed co-design problem in a computationally-efficient manner. First, a two-step process is proposed, where multiple motor designs are identified by optimizing motions and motors for multiple tasks one by one, and then are reconciled to determine the final motor design. Second, magnetic equivalent circuit modeling is exploited to establish the analytic mapping from motor design parameters to dynamic models and objective functions to facilitate the subsequent differentiable simulation. Third, a direct-collocation-based differentiable simulator of motor and robotic arm dynamics is developed to balance the computational complexity and numerical stability. Simulation verifies that higher performance for a specific application can be achieved with the multi-task method, compared to several benchmark co-design methods.

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2309.14432 [pdf, other]

Quantum Memory: A Missing Piece in Quantum Computing Units

Authors: Chenxu Liu, Meng Wang, Samuel A. Stein, Yufei Ding, Ang Li

Abstract: Memory is an indispensable component in classical computing systems. While the development of quantum computing is still in its early stages, current quantum processing units mainly function as quantum registers. Consequently, the actual role of quantum memory in future advanced quantum computing architectures remains unclear. With the rapid scaling of qubits, it is opportune to explore the potential and feasibility of quantum memory across different substrate device technologies and application scenarios. In this paper, we provide a full design stack view of quantum memory. We start from the elementary component of a quantum memory device, quantum memory cells. We provide an abstraction to a quantum memory cell and define metrics to measure the performance of physical platforms. Combined with addressing functionality, we then review two types of quantum memory devices: random access quantum memory (RAQM) and quantum random access memory (QRAM). Building on top of these devices, quantum memory units in the computing architecture, including building a quantum memory unit, quantum cache, quantum buffer, and using QRAM for the quantum input-output module, are discussed. We further propose the programming model for the quantum memory units and discuss their possible applications. By presenting this work, we aim to attract more researchers from both the Quantum Information Science (QIS) and classical memory communities to enter this emerging and exciting area.

Submitted 2 November, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: 41 pages, 11 figures, 7 tables

arXiv:2308.06686 [pdf, other]

TorchQL: A Programming Framework for Integrity Constraints in Machine Learning

Authors: Aaditya Naik, Adam Stein, Yinjun Wu, Mayur Naik, Eric Wong

Abstract: Finding errors in machine learning applications requires a thorough exploration of their behavior over data. Existing approaches used by practitioners are often ad-hoc and lack the abstractions needed to scale this process. We present TorchQL, a programming framework to evaluate and improve the correctness of machine learning applications. TorchQL allows users to write queries to specify and check integrity constraints over machine learning models and datasets. It seamlessly integrates relational algebra with functional programming to allow for highly expressive queries using only eight intuitive operators. We evaluate TorchQL on diverse use-cases including finding critical temporal inconsistencies in objects detected across video frames in autonomous driving, finding data imputation errors in time-series medical records, finding data labeling errors in real-world images, and evaluating biases and constraining outputs of language models. Our experiments show that TorchQL enables up to 13x faster query executions than baselines like Pandas and MongoDB, and up to 40% shorter queries than native Python. We also conduct a user study and find that TorchQL is natural enough for developers familiar with Python to specify complex integrity constraints.

Submitted 14 February, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

arXiv:2307.14169 [pdf, ps, other]

An Antithetic Multilevel Monte Carlo-Milstein Scheme for Stochastic Partial Differential Equations

Authors: Abdul-Lateef Haji-Ali, Andreas Stein

Abstract: We present a novel multilevel Monte Carlo approach for estimating quantities of interest for stochastic partial differential equations (SPDEs). Drawing inspiration from [Giles and Szpruch: Antithetic multilevel Monte Carlo estimation for multi-dimensional SDEs without Lévy area simulation, Annals of Appl. Prob., 2014], we extend the antithetic Milstein scheme for finite-dimensional stochastic differential equations to Hilbert space-valued SPDEs. Our method has the advantages of both Euler and Milstein discretizations, as it is easy to implement and does not involve intractable Lévy area terms. Moreover, the antithetic correction in our method leads to the same variance decay in a MLMC algorithm as the standard Milstein method, resulting in significantly lower computational complexity than a corresponding MLMC Euler scheme. Our approach is applicable to a broader range of non-linear diffusion coefficients and does not require any commutative properties. The key component of our MLMC algorithm is a truncated Milstein-type time stepping scheme for SPDEs, which accelerates the rate of variance decay in the MLMC method when combined with an antithetic coupling on the fine scales. We combine the truncated Milstein scheme with appropriate spatial discretizations and noise approximations on all scales to obtain a fully discrete scheme and show that the antithetic coupling does not introduce an additional bias.

Submitted 26 July, 2023; originally announced July 2023.

Comments: 35 pages

MSC Class: 65C05; 65C30; 65M12

arXiv:2307.09835 [pdf, ps, other]

Deep Operator Network Approximation Rates for Lipschitz Operators

Authors: Christoph Schwab, Andreas Stein, Jakob Zech

Abstract: We establish universality and expression rate bounds for a class of neural Deep Operator Networks (DON) emulating Lipschitz (or Hölder) continuous maps $\mathcal G:\mathcal X\to\mathcal Y$ between (subsets of) separable Hilbert spaces $\mathcal X$, $\mathcal Y$. The DON architecture considered uses linear encoders $\mathcal E$ and decoders $\mathcal D$ via (biorthogonal) Riesz bases of $\mathcal X$, $\mathcal Y$, and an approximator network of an infinite-dimensional, parametric coordinate map that is Lipschitz continuous on the sequence space $\ell^2(\mathbb N)$. Unlike previous works ([Herrmann, Schwab and Zech: Neural and Spectral operator surrogates: construction and expression rate bounds, SAM Report, 2022], [Marcati and Schwab: Exponential Convergence of Deep Operator Networks for Elliptic Partial Differential Equations, SAM Report, 2022]), which required for example $\mathcal G$ to be holomorphic, the present expression rate results require mere Lipschitz (or Hölder) continuity of $\mathcal G$. Key in the proof of the present expression rate bounds is the use of either super-expressive activations (e.g. [Yarotski: Elementary superexpressive activations, Int. Conf. on ML, 2021], [Shen, Yang and Zhang: Neural network approximation: Three hidden layers are enough, Neural Networks, 2021], and the references there) which are inspired by the Kolmogorov superposition theorem, or of nonstandard NN architectures with standard (ReLU) activations as recently proposed in [Zhang, Shen and Yang: Neural Network Architecture Beyond Width and Depth, Adv. in Neural Inf. Proc. Sys., 2022]. We illustrate the abstract results by approximation rate bounds for emulation of a) solution operators for parametric elliptic variational inequalities, and b) Lipschitz maps of Hilbert-Schmidt operators.

Submitted 19 July, 2023; originally announced July 2023.

Comments: 31 pages

MSC Class: 41A65; 68T15; 68Q32

arXiv:2306.17846

A modern framework for jet tagger development

Authors: Annika Stein

Abstract: This paper presents a new tool to perform various steps in jet tagger development in an efficient and comprehensive way. A common data structure is used for training, as well as for performance evaluation in data. The introduction of this new framework reduces the amount of data to be stored while accomplishing the same tasks, and shortens waiting times between algorithm development and data-to-simulation results becoming available from months to days, taking typical CMS experiment pipelines as a reference. Proper utilization of high-throughput systems enables first data-to-simulation studies with a recent neural network architecture, Particle Transformer, adapted to jet flavour tagging. Unlike official implementations of the collaboration, the new framework allows investigating different variants, like different training paradigms, and their impact on data/simulation agreement, without producing any new large files on disk, and within the same run of the analysis framework. Besides being more time- and storage-efficient and thus enabling the first results of that kind to be available just few hours after finishing neural network training, the framework is currently the only realization capable of studying how adversarial techniques affect data/simulation agreement for tagger algorithm outputs as well as inputs.

Submitted 30 June, 2023; originally announced June 2023.

Comments: This article has been removed by arXiv administrators because the submitter did not have the authority to grant the license assigned at the time of submission

arXiv:2306.13199 [pdf, other]

doi 10.1103/PhysRevX.13.031022

Microwave Spin Control of a Tin-Vacancy Qubit in Diamond

Authors: Eric I. Rosenthal, Christopher P. Anderson, Hannah C. Kleidermacher, Abigail J. Stein, Hope Lee, Jakob Grzesik, Giovanni Scuri, Alison E. Rugar, Daniel Riedel, Shahriar Aghaeimeibodi, Geun Ho Ahn, Kasper Van Gasse, Jelena Vuckovic

Abstract: The negatively charged tin-vacancy (SnV-) center in diamond is a promising solid-state qubit for applications in quantum networking due to its high quantum efficiency, strong zero phonon emission, and reduced sensitivity to electrical noise. The SnV- has a large spin-orbit coupling, which allows for long spin lifetimes at elevated temperatures, but unfortunately suppresses the magnetic dipole transitions desired for quantum control. Here, by use of a naturally strained center, we overcome this limitation and achieve high-fidelity microwave spin control. We demonstrate a pi-pulse fidelity of up to 99.51+/-0.03% and a Hahn-echo coherence time of T2echo = 170.0+/-2.8 microseconds, both the highest yet reported for SnV- platform. This performance comes without compromise to optical stability, and is demonstrated at 1.7 Kelvin where ample cooling power is available to mitigate drive induced heating. These results pave the way for SnV- spins to be used as a building block for future quantum technologies.

Submitted 30 August, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

Comments: Final published version

Journal ref: Phys. Rev. X 13, 031022 (2023)

arXiv:2306.00976 [pdf, other]

TopEx: Topic-based Explanations for Model Comparison

Authors: Shreya Havaldar, Adam Stein, Eric Wong, Lyle Ungar

Abstract: Meaningfully comparing language models is challenging with current explanation methods. Current explanations are overwhelming for humans due to large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between DistilRoBERTa and GPT-2 on a variety of NLP tasks.

Submitted 1 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: Accepted to ICLR 2023, Tiny Papers Track

arXiv:2305.19787 [pdf, other]

DeepMerge: Deep-Learning-Based Region-Merging for Image Segmentation

Authors: Xianwei Lv, Claudio Persello, Wangbin Li, Xiao Huang, Dongping Ming, Alfred Stein

Abstract: Image segmentation aims to partition an image according to the objects in the scene and is a fundamental step in analysing very high spatial-resolution (VHR) remote sensing imagery. Current methods struggle to effectively consider land objects with diverse shapes and sizes. Additionally, the determination of segmentation scale parameters frequently adheres to a static and empirical doctrine, posing limitations on the segmentation of large-scale remote sensing images and yielding algorithms with limited interpretability. To address the above challenges, we propose a deep-learning-based region merging method dubbed DeepMerge to handle the segmentation of complete objects in large VHR images by integrating deep learning and region adjacency graph (RAG). This is the first method to use deep learning to learn the similarity and merge similar adjacent super-pixels in RAG. We propose a modified binary tree sampling method to generate shift-scale data, serving as inputs for transformer-based deep learning networks, a shift-scale attention with 3-Dimension relative position embedding to learn features across scales, and an embedding to fuse learned features with hand-crafted features. DeepMerge can achieve high segmentation accuracy in a supervised manner from large-scale remotely sensed images and provides an interpretable optimal scale parameter, which is validated using a remote sensing image of 0.55 m resolution covering an area of 5,660 km^2.
The experimental results show that DeepMerge achieves the highest F value (0.9550) and the lowest total error TE (0.0895), correctly segmenting objects of different sizes and outperforming all competing segmentation methods.

Submitted 5 January, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

arXiv:2305.16308 [pdf, other]

Rectifying Group Irregularities in Explanations for Distribution Shift

Authors: Adam Stein, Yinjun Wu, Eric Wong, Mayur Naik

Abstract: It is well-known that real-world changes constituting distribution shift adversely affect model performance. How to characterize those changes in an interpretable manner is poorly understood. Existing techniques to address this problem take the form of shift explanations that elucidate how to map samples from the original distribution toward the shifted one by reducing the disparity between these two distributions. However, these methods can introduce group irregularities, leading to explanations that are less feasible and robust. To address these issues, we propose Group-aware Shift Explanations (GSE), a method that produces interpretable explanations by leveraging worst-group optimization to rectify group irregularities. We demonstrate how GSE not only maintains group structures, such as demographic and hierarchical subpopulations, but also enhances feasibility and robustness in the resulting explanations in a wide range of tabular, language, and image settings.

Submitted 25 May, 2023; originally announced May 2023.

Comments: 19 pages, 5 figures

arXiv:2303.14511 [pdf, other]

Improving robustness of jet tagging algorithms with adversarial training: exploring the loss surface

Authors: Annika Stein

Abstract: In the field of high-energy physics, deep learning algorithms continue to gain in relevance and provide performance improvements over traditional methods, for example when identifying rare signals or finding complex patterns. From an analyst's perspective, obtaining highest possible performance is desirable, but recently, some attention has been shifted towards studying robustness of models to investigate how well these perform under slight distortions of input features. Especially for tasks that involve many (low-level) inputs, the application of deep neural networks brings new challenges. In the context of jet flavor tagging, adversarial attacks are used to probe a typical classifier's vulnerability and can be understood as a model for systematic uncertainties. A corresponding defense strategy, adversarial training, improves robustness, while maintaining high performance. Investigating the loss surface corresponding to the inputs and models in question reveals geometric interpretations of robustness, taking correlations into account.

Submitted 25 March, 2023; originally announced March 2023.

Comments: 5 pages, 2 figures; submitted to ACAT 2022 proceedings

arXiv:2303.02044 [pdf, ps, other]

doi 10.1177/03064190231159330

From Playground Swings to Sway Control of Cranes: An Active Pendulum Experiment

Authors: Adrian Stein, Tarik Parcic, Tarunraj Singh

Abstract: Dynamics is a core discipline in Mechanical and Aerospace Engineering programs and with the ubiquitous nature of control in modern day applications, the field of mechatronics has gained popularity. Mechatronics refers to the field of engineering which integrates the engineering disciplines of mechanical, control, electronics and computing. To create a testbed to illustrate a tabletop mechatronics system, the paper details the design, and fabrication of an active pendulum whose length can be changed in real-time using solenoids. This permits illustrating two concepts: (1) damping of pendulum oscillations which emulates the sway of a crane and (2) amplification of the oscillations which emulates the pumping of a playground swing. The paper describes the steps prior to experimental validation which include: modeling, system identification, signal processing, and controller implementation. Numerical simulations are used to prototype the controller and eventually to compare the simulation results to the experimental ones. The results of all the experiments illustrate a close match between the simulated and experimental results. To permit reproduction of the experiment, the design details and code to implement the controllers are posted in a public repository.

Submitted 3 March, 2023; originally announced March 2023.

Journal ref: International Journal of Mechanical Engineering Education February 23, 2023

arXiv:2303.00116 [pdf, other]

Neural Auctions Compromise Bidder Information

Authors: Alex Stein, Avi Schwarzschild, Michael Curry, Tom Goldstein, John Dickerson

Abstract: Single-shot auctions are commonly used as a means to sell goods, for example when selling ad space or allocating radio frequencies, however devising mechanisms for auctions with multiple bidders and multiple items can be complicated. It has been shown that neural networks can be used to approximate optimal mechanisms while satisfying the constraints that an auction be strategyproof and individually rational. We show that despite such auctions maximizing revenue, they do so at the cost of revealing private bidder information. While randomness is often used to build in privacy, in this context it comes with complications if done without care. Specifically, it can violate rationality and feasibility constraints, fundamentally change the incentive structure of the mechanism, and/or harm top-level metrics such as revenue and social welfare. We propose a method that employs stochasticity to improve privacy while meeting the requirements for auction mechanisms with only a modest sacrifice in revenue. We analyze the cost to the auction house that comes with introducing varying degrees of privacy in common auction settings. Our results show that despite current neural auctions' ability to approximate optimal mechanisms, the resulting vulnerability that comes with relying on neural networks must be accounted for.

Submitted 28 February, 2023; originally announced March 2023.

arXiv:2302.04418 [pdf, other]

Learning to Select Pivotal Samples for Meta Re-weighting

Authors: Yinjun Wu, Adam Stein, Jacob Gardner, Mayur Naik

Abstract: Sample re-weighting strategies provide a promising mechanism to deal with imperfect training data in machine learning, such as noisily labeled or class-imbalanced data. One such strategy involves formulating a bi-level optimization problem called the meta re-weighting problem, whose goal is to optimize performance on a small set of perfect pivotal samples, called meta samples. Many approaches have been proposed to efficiently solve this problem. However, all of them assume that a perfect meta sample set is already provided, while we observe that the selection of the meta sample set is performance critical. In this paper, we study how to learn to identify such a meta sample set from a large, imperfect training set, that is subsequently cleaned and used to optimize performance in the meta re-weighting setting. We propose a learning framework which reduces the meta samples selection problem to a weighted K-means clustering problem through rigorous theoretical analysis. We propose two clustering methods within our learning framework, Representation-based clustering method (RBC) and Gradient-based clustering method (GBC), for balancing performance and computational efficiency. Empirical studies demonstrate the performance advantage of our methods over various baseline methods.

Submitted 8 February, 2023; originally announced February 2023.

Comments: Published in AAAI 2023 (oral)

arXiv:2302.00678 [pdf, other]

Multilevel Markov Chain Monte Carlo for Bayesian Elliptic Inverse Problems with Besov Random Tree Priors

Authors: Andreas Stein, Viet Ha Hoang

Abstract: We propose a multilevel Monte Carlo-FEM algorithm to solve elliptic Bayesian inverse problems with "Besov random tree prior". These priors are given by a wavelet series with stochastic coefficients, and certain terms in the expansion vanishing at random, according to the law of so-called Galton-Watson trees. This allows to incorporate random fractal structures and large deviations in the log-diffusion, which occur naturally in many applications from geophysics or medical imaging. This framework entails two main difficulties: First, the associated diffusion coefficient does not satisfy a uniform ellipticity condition, which leads to non-integrable terms and thus divergence of standard multilevel estimators. Secondly, the associated space of parameters is Polish, but not a normed linear space. We address the first point by introducing cut-off functions in the estimator to compensate for the non-integrable terms, while the second issue is resolved by employing an independence Metropolis-Hastings sampler. The resulting algorithm converges in the mean-square sense with essentially optimal asymptotic complexity, and dimension-independent acceptance probabilities.

Submitted 1 February, 2023; originally announced February 2023.

Comments: 31 pages. arXiv admin note: text overlap with arXiv:2302.00522

MSC Class: 35R30; 65C05; 65C40; 65N12; 65N15; 65N30; 60G60

arXiv:2302.00522 [pdf, other]

Multilevel Monte Carlo FEM for Elliptic PDEs with Besov Random Tree Priors

Authors: Christoph Schwab, Andreas Stein

Abstract: We develop a multilevel Monte Carlo (MLMC)-FEM algorithm for linear, elliptic diffusion problems in polytopal domain $\mathcal D\subset \mathbb R^d$, with Besov-tree random coefficients. This is to say that the logarithms of the diffusion coefficients are sampled from so-called Besov-tree priors, which have recently been proposed to model data for fractal phenomena in science and engineering. Numerical analysis of the fully discrete FEM for the elliptic PDE includes quadrature approximation and must account for a) nonuniform pathwise upper and lower coefficient bounds, and for b) low path-regularity of the Besov-tree coefficients. Admissible non-parametric random coefficients correspond to random functions exhibiting singularities on random fractals with tunable fractal dimension, but involve no a-priori specification of the fractal geometry of singular supports of sample paths. Optimal complexity and convergence rate estimates for quantities of interest and for their second moments are proved. A convergence analysis for MLMC-FEM is performed which yields choices of the algorithmic steering parameters for efficient implementation. A complexity (``error vs work'') analysis of the MLMC-FEM approximations is provided.

Submitted 1 February, 2023; originally announced February 2023.

Comments: 41 pages

MSC Class: 65C05; 65N12; 65N15; 65N30; 60G60

arXiv:2301.13379 [pdf, other]

Faithful Chain-of-Thought Reasoning

Authors: Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

Abstract: While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (aka. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query $\rightarrow$ symbolic reasoning chain) and Problem Solving (reasoning chain $\rightarrow$ answer), using an LM and a deterministic solver respectively. This guarantees that the reasoning chain provides a faithful explanation of the final answer. Aside from interpretability, Faithful CoT also improves empirical performance: it outperforms standard CoT on 9 of 10 benchmarks from 4 diverse domains, with a relative accuracy gain of 6.3% on Math Word Problems (MWP), 3.4% on Planning, 5.5% on Multi-hop Question Answering (QA), and 21.4% on Relational Inference. Furthermore, with GPT-4 and Codex, it sets the new state-of-the-art few-shot performance on 7 datasets (with 95.0+ accuracy on 6 of them), showing a strong synergy between faithfulness and accuracy.

Submitted 20 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: IJCNLP-AACL 2023 camera-ready version

arXiv:2301.08716 [pdf, ps, other]

Minimum Time Control of a Gantry Crane System with Rate Constraints

Authors: Adrian Stein, Tarunraj Singh

Abstract: This paper focuses on the development of minimum time control profiles for point-to-point motion of a gantry crane system in the presence of uncertainties in modal parameters. Assuming that the velocity of the trolley of the crane can be commanded and is subject to limits, an optimal control problem is posed to determine the bang-off-bang control profile to transition the system from a point of rest to the terminal states with no residual vibrations. Both undamped and underdamped systems are considered and the variation of the structure of the optimal control profiles as a function of the final displacement is studied. As the magnitude of the rigid body displacement is increased, the collapse and birthing of switches in the optimal control profile are observed and explained. Robustness to uncertainties in modal parameters is accounted for by forcing the state sensitivities at the terminal time to zero. The observation that the time-optimal control profile merges with the robust time-optimal control is noted for specific terminal displacements and the migration of zeros of the time-delay filter parameterizing the optimal control profile are used to explain this counter intuitive result. A two degree of freedom gantry crane system is used to experimentally validate the observations of the numerical studies and the tradeoff of increase in maneuver time to the reduction of residual vibrations is experimentally illustrated.

Submitted 20 January, 2023; originally announced January 2023.

arXiv:2301.04315 [pdf, ps, other]

Shapley Effect Estimation using Polynomial Chaos

Authors: Adrian Stein, Tarunraj Singh

Abstract: This paper presents an approach for estimating Shapley effects for use as global sensitivity metrics to quantify the relative importance of uncertain model parameters. Polynomial Chaos expansion, a well established approach for developing surrogate models, is proposed to be used to estimate Shapley effects. Polynomial Chaos permits the transformation of a stochastic process to a deterministic model which can then be used to efficiently evaluate statistical moments of the quantity of interest. These moments include conditional variances which are algebraically mapped to Shapley effects. The polynomial chaos based estimates of Shapley effects are validated using Monte Carlo simulations and tested on the benchmark Ishigami function and on the dynamic SEIR epidemic model and the Bergman Type 1 diabetes model. The results illustrate the correct ranking of uncertain variables for the Ishigami function in contrast to the Sobol indices and illustrate the time-varying rank ordering of the model parameters for the dynamic models.

Submitted 11 January, 2023; originally announced January 2023.

arXiv:2301.03327 [pdf, other]

A-posteriori QMC-FEM error estimation for Bayesian inversion and optimal control with entropic risk measure

Authors: Marcello Longo, Christoph Schwab, Andreas Stein

Abstract: We propose a novel a-posteriori error estimation technique where the target quantities of interest are ratios of high-dimensional integrals, as occur e.g. in PDE constrained Bayesian inversion and PDE constrained optimal control subject to an entropic risk measure. We consider in particular parametric, elliptic PDEs with affine-parametric diffusion coefficient, on high-dimensional parameter spaces. We combine our recent a-posteriori Quasi-Monte Carlo (QMC) error analysis, with Finite Element a-posteriori error estimation. The proposed approach yields a computable a-posteriori estimator which is reliable, up to higher order terms. The estimator's reliability is uniform with respect to the PDE discretization, and robust with respect to the parametric dimension of the uncertain PDE input.

Submitted 6 July, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

MSC Class: 65C05; 65C10; 65N15; 65N30; 65N50

arXiv:2212.06167 [pdf, other]

Architectures for Multinode Superconducting Quantum Computers

Authors: James Ang, Gabriella Carini, Yanzhu Chen, Isaac Chuang, Michael Austin DeMarco, Sophia E. Economou, Alec Eickbusch, Andrei Faraon, Kai-Mei Fu, Steven M. Girvin, Michael Hatridge, Andrew Houck, Paul Hilaire, Kevin Krsulich, Ang Li, Chenxu Liu, Yuan Liu, Margaret Martonosi, David C. McKay, James Misewich, Mark Ritter, Robert J. Schoelkopf, Samuel A. Stein, Sara Sussman, Hong X. Tang , et al. (8 additional authors not shown)

Abstract: Many proposals to scale quantum technology rely on modular or distributed designs where individual quantum processors, called nodes, are linked together to form one large multinode quantum computer (MNQC). One scalable method to construct an MNQC is using superconducting quantum systems with optical interconnects. However, a limiting factor of these machines will be internode gates, which may be two to three orders of magnitude noisier and slower than local operations. Surmounting the limitations of internode gates will require a range of techniques, including improvements in entanglement generation, the use of entanglement distillation, and optimized software and compilers, and it remains unclear how improvements to these components interact to affect overall system performance, what performance from each is required, or even how to quantify the performance of each. In this paper, we employ a `co-design' inspired approach to quantify overall MNQC performance in terms of hardware models of internode links, entanglement distillation, and local architecture. In the case of superconducting MNQCs with microwave-to-optical links, we uncover a tradeoff between entanglement generation and distillation that threatens to degrade performance. We show how to navigate this tradeoff, lay out how compilers should optimize between local and internode gates, and discuss when noisy quantum links have an advantage over purely classical links.
Using these results, we introduce a roadmap for the realization of early MNQCs which illustrates potential improvements to the hardware and software of MNQCs and outlines criteria for evaluating the landscape, from progress in entanglement generation and quantum memory to dedicated algorithms such as distributed quantum phase estimation. While we focus on superconducting devices with optical interconnects, our approach is general across MNQC implementations.

Submitted 12 December, 2022; originally announced December 2022.

Comments: 23 pages, white paper

arXiv:2210.05443 [pdf, other]

QuCNN : A Quantum Convolutional Neural Network with Entanglement Based Backpropagation

Authors: Samuel A. Stein, Ying Mao, James Ang, Ang Li

Abstract: Quantum Machine Learning continues to be a highly active area of interest within Quantum Computing. Many of these approaches have adapted classical approaches to the quantum settings, such as QuantumFlow, etc. We push forward this trend and demonstrate an adaption of the Classical Convolutional Neural Networks to quantum systems - namely QuCNN. QuCNN is a parameterised multi-quantum-state based neural network layer computing similarities between each quantum filter state and each quantum data state. With QuCNN, back propagation can be achieved through a single-ancilla qubit quantum routine. QuCNN is validated by applying a convolutional layer with a data state and a filter state over a small subset of MNIST images, comparing the back propagated gradients, and training a filter state against an ideal target state.

Submitted 11 October, 2022; originally announced October 2022.

arXiv:2204.02542 [pdf]

doi 10.1016/j.ehb.2016.03.002

Early life height and weight production functions with endogenous energy and protein inputs

Authors: Esteban Puentes, Fan Wang, Jere R. Behrman, Flávio Cunha, John Hoddinott, John A. Maluccio, Linda S. Adair, Judith B. Borja, Reynaldo Martorell, Aryeh D. Stein

Abstract: We examine effects of protein and energy intakes on height and weight growth for children between 6 and 24 months old in Guatemala and the Philippines. Using instrumental variables to control for endogeneity and estimating multiple specifications, we find that protein intake plays an important and positive role in height and weight growth in the 6-24 month period. Energy from other macronutrients, however, does not have a robust relation with these two anthropometric measures. Our estimates indicate that in contexts with substantial child undernutrition, increases in protein-rich food intake in the first 24 months can have important growth effects, which previous studies indicate are related significantly to a range of outcomes over the life cycle.

Submitted 5 April, 2022; originally announced April 2022.

MSC Class: 62P10; 62P20; 62P10; 92C60 ACM Class: J.3; J.4

Journal ref: Economics & Human Biology 22 (September 1, 2016): 65-81

arXiv:2203.13890 [pdf, other]

doi 10.1007/s41781-022-00087-1

Improving Robustness of Jet Tagging Algorithms with Adversarial Training

Authors: Annika Stein, Xavier Coubez, Spandan Mondal, Andrzej Novak, Alexander Schmidt

Abstract: Deep learning is a standard tool in the field of high-energy physics, facilitating considerable sensitivity enhancements for numerous analysis strategies. In particular, in identification of physics objects, such as jet flavor tagging, complex neural network architectures play a major role. However, these methods are reliant on accurate simulations. Mismodeling can lead to non-negligible differences in performance in data that need to be measured and calibrated against. We investigate the classifier response to input data with injected mismodelings and probe the vulnerability of flavor tagging algorithms via application of adversarial attacks. Subsequently, we present an adversarial training strategy that mitigates the impact of such simulated attacks and improves the classifier robustness. We examine the relationship between performance and vulnerability and show that this method constitutes a promising approach to reduce the vulnerability to poor modeling.

Submitted 16 September, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

Comments: 17 pages, 16 figures, 2 tables. Replaced with the published version. Added the journal reference and the DOI. Code accessible under https://github.com/AnnikaStein/Adversarial-Training-for-Jet-Tagging

Journal ref: Comput Softw Big Sci 6 (2022) 15

arXiv:2111.09087 [pdf, other]

doi 10.1007/s10489-021-03035-5

A Case Study of Vehicle Route Optimization

Authors: Veronika Lesch, Maximilian König, Samuel Kounev, Anthony Stein, Christian Krupitzer

Abstract: In the last decades, the classical Vehicle Routing Problem (VRP), i.e., assigning a set of orders to vehicles and planning their routes, has been intensively researched. As only the assignment of orders to vehicles and their routes is already an NP-complete problem, the application of these algorithms in practice often fails to take into account the constraints and restrictions that apply in real-world applications, the so-called rich VRP (rVRP), and is limited to single aspects. In this work, we incorporate the main relevant real-world constraints and requirements. We propose a two-stage strategy and a Timeline algorithm for time windows and pause times, and apply a Genetic Algorithm (GA) and Ant Colony Optimization (ACO) individually to the problem to find optimal solutions. Our evaluation of eight different problem instances against four state-of-the-art algorithms shows that our approach handles all given constraints in a reasonable time.

Submitted 17 November, 2021; originally announced November 2021.

arXiv:2108.09681 [pdf, other]

doi 10.1021/acs.jpcc.1c04217

Electronic Properties of Tetraazaperopyrene Derivatives on Au(111): Energy Level Alignment and Interfacial Band Formation

Authors: Arnulf Stein, Daniela Rolf, Christian Lotze, Sascha Feldmann, David Gerbert, Benjamin Günther, Andreas Jeindl, Johannes J. Cartus, Oliver T. Hofmann, Lutz H. Gade, Katharina J. Franke, Petra Tegeder

Abstract: N-Heteropolycyclic aromatic compounds are promising organic electron-transporting semiconductors for applications in field effect transistors. Here, we investigated the electronic properties of 1,3,8,10-tetraazaperopyrene derivatives adsorbed on Au(111) using a complementary experimental approach, namely scanning tunneling spectroscopy and two-photon photoemission combined with state-of-the-art density functional calculations. We find signatures of weak physisorption of the molecular layers, such as the absence of charge transfer, a nearly unperturbed surface state and an intact herringbone reconstruction underneath the molecular layer. Interestingly, molecular states in the energy region of the sp- and d-bands of the Au(111) substrate exhibit hole-like dispersive character. We ascribe this band character to hybridization with the delocalized states of the substrate. We suggest that such bands, which effectively leave the molecular frontier orbitals largely unperturbed, are a promising lead for the design of organic-metal interfaces with a low charge injection barrier.

Submitted 22 August, 2021; originally announced August 2021.

Journal ref: Journal of Physical Chemistry C 125, 19969 (2021)

arXiv:2107.08431 [pdf, ps, other]

doi 10.1016/j.jsv.2021.116716

Widening, Transition and Coalescence of Local Resonance Band Gaps in Multi-resonator Acoustic Metamaterials: From Unit Cells to Finite Chains

Authors: A. Stein, M. Nouh, T. Singh

Abstract: Local resonance band gaps in acoustic metamaterials are widely known for their strong attenuation yet narrow frequency span. The latter limits the practical ability to implement subwavelength band gaps for broadband attenuation and has motivated novel metamaterial designs in recent years. In this paper, we investigate the behavior of acoustic metamaterials where unit cells house multiple resonating elements stacked in different configurations, aimed at instigating a wide array of wave propagation profiles that are otherwise unattainable. The dispersion mechanics of the multi-resonator metamaterials are developed using purely analytical expressions which depict and explain the underlying dynamics of such systems both at the unit cell level as well as the frequency response of their finite realizations. The framework reveals the mechanism behind the transition of the lower and upper band gap bounds in metamaterials with parallel resonators resulting in a significant band gap widening. The analysis also illustrates the ability of metamaterials with dual-periodic super cells to exhibit a range of dispersion transitions culminating in collapsing solutions of acoustic and optic bands, enabling a coalescence of local resonance band gaps, vanishing resonances, and a number of intriguing scenarios in between.

Submitted 11 January, 2022; v1 submitted 18 July, 2021; originally announced July 2021.

arXiv:2106.10510 [pdf, other]

doi 10.1103/PhysRevMaterials.5.114410

Multiple energy-scales in vertex-frustrated mesospin systems

Authors: Henry Stopfel, Unnar B. Arnalds, Aaron Stein, Thomas P. A. Hase, Björgvin Hjörvarsson, Vassilios Kapaklis

Abstract: The interplay between topology and energy-hierarchy plays a vital role in the collective magnetic order in artificial ferroic systems. Here we investigate, experimentally, the effect of having one or two activation energies of interacting Ising-like magnetic islands -- mesospins -- in thermalized, vertex-frustrated lattices. The thermally arrested magnetic states of the elements were determined using synchrotron-based magnetic microscopy after cooling the samples from temperatures above the Curie temperature of the material. Statistical analysis of the correlations between mesospins across several length-scales reveals changes in the magnetic order, reflecting the amount of ground state plaquettes realized for a vertex-frustrated lattice. We show that the latter depends on the presence, or not, of different activation energies.

Submitted 15 October, 2021; v1 submitted 19 June, 2021; originally announced June 2021.

Comments: 10 pages, 8 figures

Journal ref: Phys. Rev. Materials 5, 114410 (2021)

arXiv:2103.11307 [pdf, other]

QuClassi: A Hybrid Deep Neural Network Architecture based on Quantum State Fidelity

Authors: Samuel A. Stein, Betis Baheri, Daniel Chen, Ying Mao, Qiang Guan, Ang Li, Shuai Xu, Caiwen Ding

Abstract: In the past decade, remarkable progress has been achieved in deep learning related systems and applications. In the post Moore's Law era, however, the limit of semiconductor fabrication technology along with the increasing data size have slowed down the development of learning algorithms. In parallel, the fast development of quantum computing has pushed it into a new era. Google illustrated quantum supremacy by completing a specific task (a random sampling problem) in 200 seconds, which is impracticable for the largest classical computers. Due to this potential, quantum based learning is an area of interest, in hopes that certain systems might offer a quantum speedup. In this work, we propose a novel architecture QuClassi, a quantum neural network for both binary and multi-class classification. Powered by a quantum differentiation function along with a hybrid quantum-classic design, QuClassi encodes the data with a reduced number of qubits and generates the quantum circuit, pushing it to the quantum platform for the best states, iteratively. We conduct intensive experiments on both the simulator and IBM-Q quantum platform. The evaluation results demonstrate that QuClassi is able to outperform the state-of-the-art quantum-based solutions, Tensorflow-Quantum and QuantumFlow by up to 53.75% and 203.00% for binary and multi-class classifications. When comparing to traditional deep neural networks, QuClassi achieves a comparable performance with 97.37% fewer parameters.

Submitted 31 March, 2022; v1 submitted 21 March, 2021; originally announced March 2021.

arXiv:2102.07280 [pdf, other]

doi 10.1109/IGARSS47720.2021.9554573

3D Fully Convolutional Neural Networks with Intersection Over Union Loss for Crop Mapping from Multi-Temporal Satellite Images

Authors: Sina Mohammadi, Mariana Belgiu, Alfred Stein

Abstract: Information on cultivated crops is relevant for a large number of food security studies. Different scientific efforts are dedicated to generating this information from remote sensing images by means of machine learning methods. Unfortunately, these methods do not take account of the spatial-temporal relationships inherent in remote sensing images. In our paper, we explore the capability of a 3D Fully Convolutional Neural Network (FCN) to map crop types from multi-temporal images. In addition, we propose the Intersection Over Union (IOU) loss function for increasing the overlap between the predicted classes and ground reference data. The proposed method was applied to identify soybean and corn from a study area situated in the US corn belt using multi-temporal Landsat images. The study shows that our method outperforms related methods, obtaining a Kappa coefficient of 91.8%. We conclude that using the IOU loss function provides a superior choice to learn individual crop types.

Submitted 19 October, 2021; v1 submitted 14 February, 2021; originally announced February 2021.

Comments: Accepted by IGARSS 2021

Journal ref: 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, 2021, pp. 5834-5837

arXiv:2012.00824 [pdf, ps, other]

Quantum-Inspired Classical Algorithm for Slow Feature Analysis

Authors: Daniel Chen, Yekun Xu, Betis Baheri, Samuel A. Stein, Chuan Bi, Ying Mao, Qiang Quan, Shuai Xu

Abstract: Recently, there has been a surge of interest in quantum computation for its ability to exponentially speed up algorithms, including machine learning algorithms. However, Tang suggested that the exponential speed up can also be done on a classical computer. In this paper, we proposed an algorithm for slow feature analysis, a machine learning algorithm that extracts the slow-varying features, with a run time O(polylog(n)poly(d)). To achieve this, we assumed necessary preprocessing of the input data as well as the existence of a data structure supporting a particular sampling scheme. The analysis of the algorithm borrowed results from matrix perturbation theory, which was crucial for the algorithm's correctness. This work demonstrates the possible application and extent for which quantum-inspired computation can be used.

Submitted 1 December, 2020; originally announced December 2020.

arXiv:2012.00256 [pdf, other]

A Hybrid System for Learning Classical Data in Quantum States

Authors: Samuel A. Stein, Ryan L'Abbate, Wenrui Mu, Yue Liu, Betis Baheri, Ying Mao, Qiang Guan, Ang Li, Bo Fang

Abstract: Deep neural network powered artificial intelligence has rapidly changed our daily life with various applications. However, as one of the essential steps of deep neural networks, training a heavily weighted network requires a tremendous amount of computing resources. Especially in the post-Moore's Law era, the limit of semiconductor fabrication technology has restricted the development of learning algorithms to cope with the increasing high-intensity training data. Meanwhile, quantum computing has demonstrated its significant potential in terms of speeding up the traditionally compute-intensive workloads. For example, Google illustrated quantum supremacy by completing a sampling calculation task in 200 seconds, which is otherwise impracticable on the world's largest supercomputers. To this end, quantum-based learning has become an area of interest, with the potential of a quantum speedup. In this paper, we propose GenQu, a hybrid and general-purpose quantum framework for learning classical data through quantum states. We evaluate GenQu with real datasets and conduct experiments on both simulations and the real quantum computer IBM-Q. Our evaluation demonstrates that, compared with classical solutions, the proposed models running on the GenQu framework achieve similar accuracy with a much smaller number of qubits, while reducing the parameter size by up to 95.86% and converging 33.33% faster.

Submitted 20 August, 2021; v1 submitted 30 November, 2020; originally announced December 2020.

arXiv:2010.09036 [pdf, other]

doi 10.1109/QCE52317.2021.00023

QuGAN: A Quantum State Fidelity based Generative Adversarial Network

Authors: Samuel A. Stein, Betis Baheri, Daniel Chen, Ying Mao, Qiang Guan, Ang Li, Bo Fang, Shuai Xu

Abstract: Tremendous progress has been witnessed in artificial intelligence where neural network backed deep learning systems have been used, with applications in almost every domain. As a representative deep learning framework, Generative Adversarial Network (GAN) has been widely used for generating artificial images, text-to-image or image augmentation across areas of science, arts and video games. However, GANs are computationally expensive, sometimes computationally prohibitive. Furthermore, training GANs may suffer from convergence failure and modal collapse. Aiming at the acceleration of use cases for practical quantum computers, we propose QuGAN, a quantum GAN architecture that provides stable convergence, quantum-state based gradients and significantly reduced parameter sets. The QuGAN architecture runs both the discriminator and the generator purely on quantum state fidelity and utilizes the swap test on qubits to calculate the values of quantum-based loss functions. Built on quantum layers, QuGAN achieves similar performance with a 94.98% reduction on the parameter set when compared to classical GANs. With the same number of parameters, additionally, QuGAN outperforms state-of-the-art quantum based GANs in the literature providing a 48.33% improvement in system performance compared to others attaining less than 0.5% in terms of similarity between generated distributions and original data sets. QuGAN code is released at https://github.com/yingmao/Quantum-Generative-Adversarial-Network

Submitted 22 September, 2022; v1 submitted 18 October, 2020; originally announced October 2020.

Comments: 2021 IEEE International Conference on Quantum Computing and Engineering (QCE)

arXiv:2002.05628 [pdf, other]

XCS Classifier System with Experience Replay

Authors: Anthony Stein, Roland Maier, Lukas Rosenbauer, Jörg Hähner

Abstract: XCS constitutes the most deeply investigated classifier system today. It bears strong potentials and comes with inherent capabilities for mastering a variety of different learning tasks. Besides outstanding successes in various classification and regression tasks, XCS also proved very effective in certain multi-step environments from the domain of reinforcement learning. Especially in the latter domain, recent advances have been mainly driven by algorithms which model their policies based on deep neural networks -- among which the Deep-Q-Network (DQN) is a prominent representative. Experience Replay (ER) constitutes one of the crucial factors for the DQN's successes, since it facilitates stabilized training of the neural network-based Q-function approximators. Surprisingly, XCS barely takes advantage of similar mechanisms that leverage stored raw experiences encountered so far. To bridge this gap, this paper investigates the benefits of extending XCS with ER. On the one hand, we demonstrate that for single-step tasks ER bears massive potential for improvements in terms of sample efficiency. On the shady side, however, we reveal that the use of ER might further aggravate well-studied issues not yet solved for XCS when applied to sequential decision problems demanding for long-action-chains.

Submitted 13 February, 2020; originally announced February 2020.

arXiv:2002.01370 [pdf, other]

Bootstrapping a DQN Replay Memory with Synthetic Experiences

Authors: Wenzel Baron Pilar von Pilchau, Anthony Stein, Jörg Hähner

Abstract: An important component of many Deep Reinforcement Learning algorithms is the Experience Replay which serves as a storage mechanism or memory of made experiences. These experiences are used for training and help the agent to stably find the perfect trajectory through the problem space. The classic Experience Replay however makes only use of the experiences it actually made, but the stored samples bear great potential in form of knowledge about the problem that can be extracted. We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner. The Interpolated Experience Replay is evaluated on the FrozenLake environment and we show that it can support the agent to learn faster and even better than the classic version.

Submitted 4 February, 2020; originally announced February 2020.

arXiv:2001.04746 [pdf, other]

doi 10.1103/PhysRevB.101.134404

Collective magnetic dynamics in artificial spin ice probed by AC susceptibility

Authors: Merlin Pohlit, Giuseppe Muscas, Ioan-Augustin Chioar, Henry Stopfel, Agne Ciuciulkaite, Erik Östman, Spyridon D. Pappas, Aaron Stein, Björgvin Hjörvarsson, Petra E. Jönsson, Vassilios Kapaklis

Abstract: We report on the study of the thermal dynamics of square artificial spin ice, probed by means of temperature and frequency dependent AC susceptibility. Pronounced influence of the inter-island coupling strength was found on the frequency response of the samples. Through the subsequent analysis of the frequency- and coupling-dependent freezing temperatures, we discuss the phenomenological parameters obtained in the framework of the Vogel-Fulcher-Tammann law in terms of the samples' microscopic features. The high sensitivity and robust signal to noise ratio of AC susceptibility validates the latter as a promising and simple experimental technique for resolving the dynamics and temperature driven dynamics crossovers for the case of artificial spin ice.

Submitted 11 March, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

Comments: 7 pages, 4 figures

Journal ref: Phys. Rev. B 101, 134404 (2020)

arXiv:1910.14657 [pdf, other]

Stochastic Transport with Lévy Noise -- Fully Discrete Numerical Approximation

Authors: Andrea Barth, Andreas Stein

Abstract: Semilinear hyperbolic stochastic partial differential equations (SPDEs) find widespread applications in the natural and engineering sciences. However, the traditional Gaussian setting may prove too restrictive, as phenomena in mathematical finance, porous media, and pollution models often exhibit noise of a different nature. To capture temporal discontinuities and accommodate heavy-tailed distributions, Hilbert space-valued Lévy processes or Lévy fields are employed as driving noise terms. The numerical discretization of such SPDEs presents several challenges. The low regularity of the solution in space and time leads to slow convergence rates and instability in space/time discretization schemes. Furthermore, the Lévy process can take values in an infinite-dimensional Hilbert space, necessitating projections onto finite-dimensional subspaces at each discrete time point. Additionally, unbiased sampling from the resulting Lévy field may not be feasible. In this study, we introduce a novel fully discrete approximation scheme that tackles these difficulties. Our main contribution is a discontinuous Galerkin scheme for spatial approximation, derived naturally from the weak formulation of the SPDE. We establish optimal convergence properties for this approach and combine it with a suitable time stepping scheme to prevent numerical oscillations. Furthermore, we approximate the driving noise process using truncated Karhunen-Loève expansions. This approximation yields a sum of scaled and uncorrelated one-dimensional Lévy processes, which can be simulated with controlled bias using Fourier inversion techniques.

Submitted 3 July, 2023; v1 submitted 31 October, 2019; originally announced October 2019.

MSC Class: 60H15; 60H35; 35R60; 60G51; 60J76; 65M15; 65M60

arXiv:1903.00578 [pdf]

doi 10.1038/s41377-019-0201-7

Dielectric Metasurfaces for Complete and Independent Control of Optical Amplitude and Phase

Authors: Adam C. Overvig, Sajan Shrestha, Stephanie C. Malek, Ming Lu, Aaron Stein, Changxi Zheng, Nanfang Yu

Abstract: Metasurfaces are optically thin metamaterials that promise complete control of the wavefront of light but are primarily used to control only the phase of light. Here, we present an approach, simple in concept and in practice, that uses meta-atoms with a varying degree of form birefringence and rotation angles to create high-efficiency dielectric metasurfaces that control both the optical amplitude and phase at one or two frequencies. This opens up applications in computer-generated holography, allowing faithful reproduction of both the phase and amplitude of a target holographic scene without the iterative algorithms required in phase-only holography. We demonstrate all-dielectric metasurface holograms with independent and complete control of the amplitude and phase at up to two optical frequencies simultaneously to generate two- and three-dimensional holographic objects. We show that phase-amplitude metasurfaces enable a few features not attainable in phase-only holography; these include creating artifact-free two-dimensional holographic images, encoding phase and amplitude profiles separately at the object plane, encoding intensity profiles at the metasurface and object planes separately, and controlling the surface textures of three-dimensional holographic objects.

Submitted 5 September, 2019; v1 submitted 1 March, 2019; originally announced March 2019.

Comments: 47 pages, 20 figures; additional results, references added

arXiv:1902.06061 [pdf, other]

Towards Automated Melanoma Detection with Deep Learning: Data Purification and Augmentation

Authors: Devansh Bisla, Anna Choromanska, Jennifer A. Stein, David Polsky, Russell Berman

Abstract: Melanoma is one of the ten most common cancers in the US. Early detection is crucial for survival, but often the cancer is diagnosed in the fatal stage. Deep learning has the potential to improve cancer detection rates, but its applicability to melanoma detection is compromised by the limitations of the available skin lesion databases, which are small, heavily imbalanced, and contain images with occlusions. We build deep-learning-based tools for data purification and augmentation to counteract these limitations. The developed tools can be utilized in a deep learning system for lesion classification and we show how to build such a system. The system heavily relies on the processing unit for removing image occlusions and the data generation unit, based on generative adversarial networks, for populating scarce lesion classes, or equivalently creating virtual patients with pre-defined types of lesions. We empirically verify our approach and show that incorporating these two units into a melanoma detection system results in superior performance over common baselines.

Submitted 14 May, 2019; v1 submitted 16 February, 2019; originally announced February 2019.

Comments: Accepted to CVPR ISIC Workshop - 2019

arXiv:1902.02130 [pdf, other]

Numerical analysis for time-dependent advection-diffusion problems with random discontinuous coefficients

Authors: Andrea Barth, Andreas Stein

Abstract: Subsurface flows are commonly modeled by advection-diffusion equations. Insufficient measurements or uncertain material procurement may be accounted for by random coefficients. To represent, for example, transitions in heterogeneous media, the parameters of the equation are spatially discontinuous. Specifically, a scenario with coupled advection and diffusion coefficients that are modeled as sums of continuous random fields and discontinuous jump components is considered. For the numerical approximation of the solution, an adaptive, pathwise discretization scheme based on a Finite Element approach is introduced. To stabilize the numerical approximation and accelerate convergence, the discrete space-time grid is chosen with respect to the varying discontinuities in each sample of the coefficients, leading to a stochastic formulation of the Galerkin projection and the Finite Element basis.

Submitted 22 January, 2021; v1 submitted 6 February, 2019; originally announced February 2019.

arXiv:1902.02129 [pdf, ps, other]

A Multilevel Monte Carlo Algorithm for Parabolic Advection-Diffusion Problems with Discontinuous Coefficients

Authors: Andrea Barth, Andreas Stein

Abstract: The Richards' equation is a model for flow of water in unsaturated soils. The coefficients of this (nonlinear) partial differential equation describe the permeability of the medium. Insufficient or uncertain measurements are commonly modeled by random coefficients. For flows in heterogeneous\fractured\porous media, the coefficients are modeled as discontinuous random fields, where the interfaces along the stochastic discontinuities represent transitions in the media. More precisely, the random coefficient is given by the sum of a (continuous) Gaussian random field and a (discontinuous) jump part. In this work moments of the solution to the random partial differential equation are calculated using a path-wise numerical approximation combined with multilevel Monte Carlo sampling. The discontinuities dictate the spatial discretization, which leads to a stochastic grid. Hence, the refinement parameter and problem-dependent constants in the error analysis are random variables and we derive (optimal) a-priori convergence rates in a mean-square sense.

Submitted 9 March, 2020; v1 submitted 6 February, 2019; originally announced February 2019.

MSC Class: 60H25; 60H30; 60H35; 35R60; 65C05; 58J65; 35K10

Showing 1–50 of 126 results for author: Stein, A