-
PIC2O-Sim: A Physics-Inspired Causality-Aware Dynamic Convolutional Neural Operator for Ultra-Fast Photonic Device FDTD Simulation
Authors:
Pingchuan Ma,
Haoyu Yang,
Zhengqi Gao,
Duane S. Boning,
Jiaqi Gu
Abstract:
The finite-difference time-domain (FDTD) method, which is important in photonic hardware design flow, is widely adopted to solve time-domain Maxwell equations. However, FDTD is known for its prohibitive runtime cost, taking minutes to hours to simulate a single device. Recently, AI has been applied to realize orders-of-magnitude speedup in partial differential equation (PDE) solving. However, AI-based FDTD solvers for photonic devices have not been clearly formulated. Directly applying off-the-shelf models to predict the optical field dynamics shows unsatisfying fidelity and efficiency since the model primitives are agnostic to the unique physical properties of Maxwell equations and lack algorithmic customization. In this work, we thoroughly investigate the synergy between neural operator designs and the physical property of Maxwell equations and introduce a physics-inspired AI-based FDTD prediction framework PIC2O-Sim which features a causality-aware dynamic convolutional neural operator as its backbone model that honors the space-time causality constraints via careful receptive field configuration and explicitly captures the permittivity-dependent light propagation behavior via an efficient dynamic convolution operator. Meanwhile, we explore the trade-offs among prediction scalability, fidelity, and efficiency via a multi-stage partitioned time-bundling technique in autoregressive prediction. Multiple key techniques have been introduced to mitigate iterative error accumulation while maintaining efficiency advantages during autoregressive field prediction. Extensive evaluations on three challenging photonic device simulation tasks show the superiority of PIC2O-Sim: 51.2% lower roll-out prediction error and 23.5 times fewer parameters than state-of-the-art neural operators, and 300-600x higher simulation speed than an open-source FDTD numerical solver.
Submitted 24 June, 2024;
originally announced June 2024.
-
Rare Event Probability Learning by Normalizing Flows
Authors:
Zhengqi Gao,
Dinghuai Zhang,
Luca Daniel,
Duane S. Boning
Abstract:
A rare event is defined by a low probability of occurrence. Accurate estimation of such small probabilities is of utmost importance across diverse domains. Conventional Monte Carlo methods are inefficient, demanding an exorbitant number of samples to achieve reliable estimates. Inspired by the exact sampling capabilities of normalizing flows, we revisit this challenge and propose normalizing flow assisted importance sampling, termed NOFIS. NOFIS first learns a sequence of proposal distributions associated with predefined nested subset events by minimizing KL divergence losses. Next, it estimates the rare event probability by utilizing importance sampling in conjunction with the last proposal. The efficacy of our NOFIS method is substantiated through comprehensive qualitative visualizations, affirming the optimality of the learned proposal distribution, as well as a series of quantitative experiments encompassing 10 distinct test cases, which highlight NOFIS's superiority over baseline approaches.
Submitted 29 October, 2023;
originally announced October 2023.
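The final step described in the abstract above, importance sampling with a proposal concentrated on the rare event, can be illustrated with a minimal sketch. This is not NOFIS: the learned normalizing-flow proposal is replaced here by a fixed Gaussian shifted to the event boundary, and the target event (a standard normal exceeding a threshold) and all function names are assumptions chosen so that the exact answer is available for comparison.

```python
import math
import random


def estimate_tail_probability(threshold: float, n_samples: int = 20000, seed: int = 0) -> float:
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Assumption: the proposal q is N(threshold, 1), a hand-picked stand-in
    for a learned proposal distribution.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal q
        if x > threshold:  # indicator of the rare event
            # Importance weight p(x) / q(x) for unit-variance Gaussians:
            # log p - log q = -x^2/2 + (x - t)^2/2 = t^2/2 - t*x
            total += math.exp(0.5 * threshold ** 2 - threshold * x)
    return total / n_samples


def exact_tail_probability(threshold: float) -> float:
    """Closed-form Gaussian tail probability, used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    print("importance sampling:", estimate_tail_probability(4.0))
    print("exact:              ", exact_tail_probability(4.0))
```

Plain Monte Carlo with the same sample budget would typically observe zero or one exceedance of the threshold 4, which is why a proposal placed near the event matters.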
-
KirchhoffNet: A Scalable Ultra Fast Analog Neural Network
Authors:
Zhengqi Gao,
Fan-Keng Sun,
Ron Rohrer,
Duane S. Boning
Abstract:
In this paper, we leverage a foundational principle of analog electronic circuitry, Kirchhoff's current and voltage laws, to introduce a distinctive class of neural network models termed KirchhoffNet. Essentially, KirchhoffNet is an analog circuit that can function as a neural network, utilizing its initial node voltages as the neural network input and the node voltages at a specific time point as the output. The evolution of node voltages within the specified time is dictated by learnable parameters on the edges connecting nodes. We demonstrate that KirchhoffNet is governed by a set of ordinary differential equations (ODEs), and notably, even in the absence of traditional layers (such as convolution layers), it attains state-of-the-art performances across diverse and complex machine learning tasks. Most importantly, KirchhoffNet can be potentially implemented as a low-power analog integrated circuit, leading to an appealing property -- irrespective of the number of parameters within a KirchhoffNet, its on-chip forward calculation can always be completed within a short time. This characteristic makes KirchhoffNet a promising and fundamental paradigm for implementing large-scale neural networks, opening a new avenue in analog neural networks for AI.
Submitted 6 May, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
Nominality Score Conditioned Time Series Anomaly Detection by Point/Sequential Reconstruction
Authors:
Chih-Yu Lai,
Fan-Keng Sun,
Zhengqi Gao,
Jeffrey H. Lang,
Duane S. Boning
Abstract:
Time series anomaly detection is challenging due to the complexity and variety of patterns that can occur. One major difficulty arises from modeling time-dependent relationships to find contextual anomalies while maintaining detection accuracy for point anomalies. In this paper, we propose a framework for unsupervised time series anomaly detection that utilizes point-based and sequence-based reconstruction models. The point-based model attempts to quantify point anomalies, and the sequence-based model attempts to quantify both point and contextual anomalies. Under the formulation that the observed time point is a two-stage deviated value from a nominal time point, we introduce a nominality score calculated from the ratio of a combined value of the reconstruction errors. We derive an induced anomaly score by further integrating the nominality score and anomaly score, then theoretically prove the superiority of the induced anomaly score over the original anomaly score under certain conditions. Extensive studies conducted on several public datasets show that the proposed framework outperforms most state-of-the-art baselines for time series anomaly detection.
Submitted 23 October, 2023;
originally announced October 2023.
-
Provable Routing Analysis of Programmable Photonics
Authors:
Zhengqi Gao,
Xiangfeng Chen,
Zhengxing Zhang,
Chih-Yu Lai,
Uttara Chakraborty,
Wim Bogaerts,
Duane S. Boning
Abstract:
Programmable photonic integrated circuits (PPICs) are an emerging technology recently proposed as an alternative to custom-designed application-specific integrated photonics. Light routing is one of the most important functions that need to be realized on a PPIC. Previous literature has investigated the light routing problem from an algorithmic or experimental perspective, e.g., adopting graph theory to route an optical signal. In this paper, we also focus on the light routing problem, but from a complementary and theoretical perspective, to answer questions about what is possible to be routed. Specifically, we demonstrate that not all path lengths (defined as the number of tunable basic units that an optical signal traverses) can be realized on a square-mesh PPIC, and a rigorous realizability condition is proposed and proved. We further consider multi-path routing, where we provide an analytical expression on path length sum, upper bounds on path length mean/variance, and the maximum number of realizable paths. All of our conclusions are proven mathematically. Illustrative potential optical applications using our observations are also presented.
Submitted 21 June, 2023;
originally announced June 2023.
-
A Review of Bayesian Methods in Electronic Design Automation
Authors:
Zhengqi Gao,
Duane S. Boning
Abstract:
The utilization of Bayesian methods has been widely acknowledged as a viable solution for tackling various challenges in electronic integrated circuit (IC) design under stochastic process variation, including circuit performance modeling, yield/failure rate estimation, and circuit optimization. As the post-Moore era brings about new technologies (such as silicon photonics and quantum circuits), many of the associated issues there are similar to those encountered in electronic IC design and can be addressed using Bayesian methods. Motivated by this observation, we present a comprehensive review of Bayesian methods in electronic design automation (EDA). By doing so, we hope to equip researchers and designers with the ability to apply Bayesian methods in solving stochastic problems in electronic circuits and beyond.
Submitted 13 March, 2023;
originally announced April 2023.
-
NeurOLight: A Physics-Agnostic Neural Operator Enabling Parametric Photonic Device Simulation
Authors:
Jiaqi Gu,
Zhengqi Gao,
Chenghao Feng,
Hanqing Zhu,
Ray T. Chen,
Duane S. Boning,
David Z. Pan
Abstract:
Optical computing is an emerging technology for next-generation efficient artificial intelligence (AI) due to its ultra-high speed and efficiency. Electromagnetic field simulation is critical to the design, optimization, and validation of photonic devices and circuits. However, costly numerical simulation significantly hinders the scalability and turn-around time in the photonic circuit design loop. Recently, physics-informed neural networks have been proposed to predict the optical field solution of a single instance of a partial differential equation (PDE) with predefined parameters. Their complicated PDE formulation and lack of efficient parametrization mechanisms limit their flexibility and generalization in practical simulation scenarios. In this work, for the first time, a physics-agnostic neural operator-based framework, dubbed NeurOLight, is proposed to learn a family of frequency-domain Maxwell PDEs for ultra-fast parametric photonic device simulation. We balance the efficiency and generalization of NeurOLight via several novel techniques. Specifically, we discretize different devices into a unified domain, represent parametric PDEs with a compact wave prior, and encode the incident light via masked source modeling. We design our model with parameter-efficient cross-shaped NeurOLight blocks and adopt superposition-based augmentation for data-efficient learning. With these synergistic approaches, NeurOLight generalizes to a large space of unseen simulation settings, demonstrates 2-orders-of-magnitude faster simulation speed than numerical solvers, and outperforms prior neural network models by ~54% lower prediction error with ~44% fewer parameters. Our code is available at https://github.com/JeremieMelo/NeurOLight.
Submitted 19 September, 2022;
originally announced September 2022.
-
Automatic Synthesis of Light Processing Functions for Programmable Photonics: Theory and Realization
Authors:
Zhengqi Gao,
Xiangfeng Chen,
Zhengxing Zhang,
Uttara Chakraborty,
Wim Bogaerts,
Duane S. Boning
Abstract:
Linear light processing functions (e.g., routing, splitting, filtering) are key functions requiring configuration to implement on a programmable photonic integrated circuit (PPIC). In recirculating waveguide meshes (which include loop-backs), this is usually done manually. Some previous results describe explorations to perform this task automatically, but their efficiency or applicability is still limited. In this paper, we propose an efficient method that can automatically realize configurations for many light processing functions on a square-mesh PPIC. At its heart is an automatic differentiation subroutine built upon analytical expressions of scattering matrices, that enables gradient descent optimization for functional circuit synthesis. Similar to the state-of-the-art synthesis techniques, our method can realize configurations for a wide range of light processing functions, and multiple functions on the same PPIC simultaneously. However, we do not need to separate the functions spatially into different subdomains of the mesh, and the resulting optimum can have multiple functions using the same part of the mesh. Furthermore, compared to non-gradient or numerical differentiation based methods, our proposed approach achieves 3x time reduction in computational cost.
Submitted 10 February, 2023; v1 submitted 30 August, 2022;
originally announced August 2022.
-
Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion
Authors:
Zhengqi Gao,
Fan-Keng Sun,
Mingran Yang,
Sucheng Ren,
Zikai Xiong,
Marc Engeler,
Antonio Burazer,
Linda Wildling,
Luca Daniel,
Duane S. Boning
Abstract:
Data lies at the core of modern deep learning. The impressive performance of supervised learning is built upon a base of massive accurately labeled data. However, in some real-world applications, accurate labeling might not be viable; instead, multiple noisy labels (instead of one accurate label) are provided by several annotators for each data sample. Learning a classifier on such a noisy training dataset is a challenging task. Previous approaches usually assume that all data samples share the same set of parameters related to annotator errors, while we demonstrate that label error learning should be both annotator and data sample dependent. Motivated by this observation, we propose a novel learning algorithm. The proposed method displays superiority compared with several state-of-the-art baseline methods on MNIST, CIFAR-100, and ImageNet-100. Our code is available at: https://github.com/zhengqigao/Learning-from-Multiple-Annotator-Noisy-Labels.
Submitted 22 July, 2022;
originally announced July 2022.
-
FreDo: Frequency Domain-based Long-Term Time Series Forecasting
Authors:
Fan-Keng Sun,
Duane S. Boning
Abstract:
The ability to forecast far into the future is highly beneficial to many applications, including but not limited to climatology, energy consumption, and logistics. However, due to noise or measurement error, it is questionable how far into the future one can reasonably predict. In this paper, we first mathematically show that due to error accumulation, sophisticated models might not outperform baseline models for long-term forecasting. To demonstrate this, we show that a non-parametric baseline model based on periodicity can actually achieve comparable performance to a state-of-the-art Transformer-based model on various datasets. We further propose FreDo, a frequency domain-based neural network model that is built on top of the baseline model to enhance its performance and which greatly outperforms the state-of-the-art model. Finally, we validate that the frequency domain is indeed better by comparing univariate models trained in the frequency vs. time domain.
Submitted 24 May, 2022;
originally announced May 2022.
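The abstract above mentions a non-parametric baseline based on periodicity without spelling it out. One plausible reading, sketched below, forecasts each future step as the average of past values at the same phase of a known period. The function name, the assumption that the period is given, and the averaging rule are all illustrative choices, not the paper's definition.

```python
from typing import List, Sequence


def periodic_average_forecast(history: Sequence[float], period: int, horizon: int) -> List[float]:
    """Forecast `horizon` steps by averaging past values with the same phase.

    Assumption: `period` is known in advance and the history covers at
    least one full period.
    """
    if period <= 0 or len(history) < period:
        raise ValueError("history must cover at least one full period")
    # Mean of all observations whose time index i satisfies i % period == phase.
    phase_means = []
    for phase in range(period):
        values = history[phase::period]
        phase_means.append(sum(values) / len(values))
    # Future time indices continue from len(history).
    start = len(history)
    return [phase_means[(start + step) % period] for step in range(horizon)]


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0] * 4 + [1.0]  # period-3 pattern, 13 points
    print(periodic_average_forecast(series, period=3, horizon=4))
```

Because the forecast depends only on phase averages, its error does not grow with the horizon, which matches the abstract's point about error accumulation in long-term forecasting.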
-
Variational Quantum Pulse Learning
Authors:
Zhiding Liang,
Hanrui Wang,
Jinglei Cheng,
Yongshan Ding,
Hang Ren,
Zhengqi Gao,
Zhirui Hu,
Duane S. Boning,
Xuehai Qian,
Song Han,
Weiwen Jiang,
Yiyu Shi
Abstract:
Quantum computing is among the most promising emerging techniques to solve problems that are computationally intractable on classical hardware. A large body of existing work focuses on using variational quantum algorithms on the gate level for machine learning tasks, such as the variational quantum circuit (VQC). However, VQC has limited flexibility and expressibility due to its limited number of parameters, e.g., only one parameter can be trained in one rotation gate. On the other hand, we observe that quantum pulses sit below quantum gates in the quantum computing stack and offer more control parameters. Inspired by the promising performance of VQC, in this paper we propose variational quantum pulses (VQP), a novel paradigm to directly train quantum pulses for learning tasks. The proposed method manipulates variational quantum pulses by pulling and pushing the amplitudes of pulses in an optimization framework. Similar to variational quantum algorithms, our framework to train pulses maintains the robustness to noise on Noisy Intermediate-Scale Quantum (NISQ) computers. In an example task of binary classification, VQP learning achieves up to 11% and 9% higher accuracy compared with VQC learning on the Qiskit noise simulators (with noise model from real machine) and ibmq_jakarta, respectively, demonstrating its effectiveness and feasibility. Stability for VQP to obtain reliable results has also been verified in the presence of noise.
Submitted 5 August, 2022; v1 submitted 31 March, 2022;
originally announced March 2022.
-
Adjusting for Autocorrelated Errors in Neural Networks for Time Series
Authors:
Fan-Keng Sun,
Christopher I. Lang,
Duane S. Boning
Abstract:
An increasing body of research focuses on using neural networks to model time series. A common assumption in training neural networks via maximum likelihood estimation on time series is that the errors across time steps are uncorrelated. However, errors are actually autocorrelated in many cases due to the temporality of the data, which makes such maximum likelihood estimations inaccurate. In this paper, in order to adjust for autocorrelated errors, we propose to learn the autocorrelation coefficient jointly with the model parameters. In our experiments, we verify the effectiveness of our approach on time series forecasting. Results across a wide range of real-world datasets with various state-of-the-art models show that our method enhances performance in almost all cases. Based on these results, we suggest empirical critical values to determine the severity of autocorrelated errors. We also analyze several aspects of our method to demonstrate its advantages. Finally, other time series tasks are also considered to validate that our method is not restricted to only forecasting.
Submitted 8 October, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
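The idea in the abstract above, estimating an autocorrelation coefficient together with the model parameters, can be illustrated on a much simpler model than a neural network. The sketch below assumes a one-parameter linear model with first-order autocorrelated errors and uses an alternating least-squares iteration in the style of Cochrane-Orcutt rather than the paper's joint gradient-based training; the data generator, function names, and constants are hypothetical.

```python
import random
from typing import List, Tuple


def simulate(n: int, slope: float, rho: float, seed: int = 0) -> Tuple[List[float], List[float]]:
    """Generate y_t = slope * x_t + e_t with AR(1) errors e_t = rho * e_{t-1} + noise."""
    rng = random.Random(seed)
    xs, ys = [], []
    err = 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        err = rho * err + rng.gauss(0.0, 0.1)
        xs.append(x)
        ys.append(slope * x + err)
    return xs, ys


def fit_slope_and_rho(xs: List[float], ys: List[float], n_iters: int = 20) -> Tuple[float, float]:
    """Alternate between fitting the slope and the error autocorrelation rho."""
    slope, rho = 0.0, 0.0
    for _ in range(n_iters):
        # Given rho, remove the autocorrelated part by quasi-differencing,
        # then fit the slope by least squares on the transformed series.
        dx = [xs[t] - rho * xs[t - 1] for t in range(1, len(xs))]
        dy = [ys[t] - rho * ys[t - 1] for t in range(1, len(ys))]
        slope = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
        # Given the slope, fit rho by regressing each residual on the previous one.
        res = [y - slope * x for x, y in zip(xs, ys)]
        rho = sum(res[t] * res[t - 1] for t in range(1, len(res))) / sum(r * r for r in res[:-1])
    return slope, rho


if __name__ == "__main__":
    xs, ys = simulate(2000, slope=2.0, rho=0.8)
    print(fit_slope_and_rho(xs, ys))
```

With uncorrelated errors the estimated rho stays near zero and the procedure reduces to ordinary least squares, so the adjustment costs little when it is not needed.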
-
Variational inference formulation for a model-free simulation of a dynamical system with unknown parameters by a recurrent neural network
Authors:
Kyongmin Yeo,
Dylan E. C. Grullon,
Fan-Keng Sun,
Duane S. Boning,
Jayant R. Kalagnanam
Abstract:
We propose a recurrent neural network for a "model-free" simulation of a dynamical system with unknown parameters without prior knowledge. The deep learning model aims to jointly learn the nonlinear time marching operator and the effects of the unknown parameters from a time series dataset. We assume that the time series data set consists of an ensemble of trajectories for a range of the parameters. The learning task is formulated as a statistical inference problem by considering the unknown parameters as random variables. A latent variable is introduced to model the effects of the unknown parameters, and a variational inference method is employed to simultaneously train probabilistic models for the time marching operator and an approximate posterior distribution for the latent variable. Unlike the classical variational inference, where a factorized distribution is used to approximate the posterior, we employ a feedforward neural network supplemented by an encoder recurrent neural network to develop a more flexible probabilistic model. The approximate posterior distribution makes an inference on a trajectory to identify the effects of the unknown parameters. The time marching operator is approximated by a recurrent neural network, which takes a latent state sampled from the approximate posterior distribution as one of the input variables, to compute the time evolution of the probability distribution conditioned on the latent variable. In the numerical experiments, it is shown that the proposed variational inference model makes a more accurate simulation compared to the standard recurrent neural networks. It is found that the proposed deep learning model is capable of correctly identifying the dimensions of the random parameters and learning a representation of complex time series data.
Submitted 26 February, 2021; v1 submitted 2 March, 2020;
originally announced March 2020.