-
Early warning indicators via latent stochastic dynamical systems
Authors:
Lingyu Feng,
Ting Gao,
Wang Xiao,
Jinqiao Duan
Abstract:
Detecting early warning indicators for abrupt dynamical transitions in complex systems or high-dimensional observation data is essential in many real-world applications, such as brain diseases, natural disasters, and engineering reliability. To this end, we develop a novel approach: the directed anisotropic diffusion map that captures the latent evolutionary dynamics in the low-dimensional manifold. Then three effective warning signals (Onsager-Machlup Indicator, Sample Entropy Indicator, and Transition Probability Indicator) are derived through the latent coordinates and the latent stochastic dynamical systems. To validate our framework, we apply this methodology to authentic electroencephalogram (EEG) data. We find that our early warning indicators are capable of detecting the tipping point during state transition. This framework not only bridges the latent dynamics with real-world data but also shows the potential ability for automatic labeling on complex high-dimensional time series.
Submitted 5 April, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Target PCA: Transfer Learning Large Dimensional Panel Data
Authors:
Junting Duan,
Markus Pelger,
Ruoxuan Xiong
Abstract:
This paper develops a novel method to estimate a latent factor model for a large target panel with missing observations by optimally using the information from auxiliary panel data sets. We refer to our estimator as target-PCA. Transfer learning from auxiliary panel data allows us to deal with a large fraction of missing observations and weak signals in the target panel. We show that our estimator is more efficient and can consistently estimate weak factors, which are not identifiable with conventional methods. We provide the asymptotic inferential theory for target-PCA under very general assumptions on the approximate factor model and missing patterns. In an empirical study of imputing data in a mixed-frequency macroeconomic panel, we demonstrate that target-PCA significantly outperforms all benchmark methods.
Submitted 29 August, 2023;
originally announced August 2023.
-
Reservoir Computing with Error Correction: Long-term Behaviors of Stochastic Dynamical Systems
Authors:
Cheng Fang,
Yubin Lu,
Ting Gao,
Jinqiao Duan
Abstract:
The prediction of stochastic dynamical systems and the capture of dynamical behaviors are profound problems. In this article, we propose a data-driven framework combining Reservoir Computing and Normalizing Flow to study this issue, which mimics error modeling to improve traditional Reservoir Computing performance and integrates the virtues of both approaches. With few assumptions about the underlying stochastic dynamical systems, this model-free method successfully predicts the long-term evolution of stochastic dynamical systems and replicates dynamical behaviors. We verify the effectiveness of the proposed framework in several experiments, including the stochastic Van der Pol oscillator, El Niño-Southern Oscillation simplified model, and stochastic Lorenz system. These experiments consist of Markov/non-Markov and stationary/non-stationary stochastic processes which are defined by linear/nonlinear stochastic differential equations or stochastic delay differential equations. Additionally, we explore the noise-induced tipping phenomenon, relaxation oscillation, stochastic mixed-mode oscillation, and replication of the strange attractor.
Submitted 30 July, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Sparse Positive-Definite Estimation for Covariance Matrices with Repeated Measurements
Authors:
Sunpeng Duan,
Guo Yu,
Juntao Duan,
Yuedong Wang
Abstract:
Repeated measurements are common in many fields, where random variables are observed repeatedly across different subjects. Such data have an underlying hierarchical structure, and it is of interest to learn covariance/correlation at different levels. Most existing methods for sparse covariance/correlation matrix estimation assume independent samples. Ignoring the underlying hierarchical structure and correlation within the subject leads to erroneous scientific conclusions. In this paper, we study the problem of sparse and positive-definite estimation of between-subject and within-subject covariance/correlation matrices for repeated measurements. Our estimators are solutions to convex optimization problems that can be solved efficiently. We establish estimation error rates for the proposed estimators and demonstrate their favorable performance through theoretical analysis and comprehensive simulation studies. We further apply our methods to construct between-subject and within-subject covariance graphs of clinical variables from hemodialysis patients.
Submitted 10 June, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
Two-stage Hypothesis Tests for Variable Interactions with FDR Control
Authors:
Jingyi Duan,
Yang Ning,
Xi Chen,
Yong Chen
Abstract:
In many scenarios such as genome-wide association studies where dependences between variables commonly exist, it is often of interest to infer the interaction effects in the model. However, testing pairwise interactions among millions of variables in complex and high-dimensional data suffers from low statistical power and huge computational cost. To address these challenges, we propose a two-stage testing procedure with false discovery rate (FDR) control, which is known as a less conservative multiple-testing correction. Theoretically, the difficulty in FDR control is due to the data dependence among test statistics in the two stages, and the fact that the number of hypothesis tests conducted in the second stage depends on the screening result in the first stage. By using the Cramér type moderate deviation technique, we show that our procedure controls FDR at the desired level asymptotically in the generalized linear model (GLM), where the model is allowed to be misspecified. In addition, the asymptotic power of the FDR control procedure is rigorously established. We demonstrate via comprehensive simulation studies that our two-stage procedure is computationally more efficient than the classical BH procedure, with comparable or improved statistical power. Finally, we apply the proposed method to bladder cancer data from dbGaP where the scientific goal is to identify genetic susceptibility loci for bladder cancer.
Submitted 31 August, 2022;
originally announced September 2022.
-
Learning effective dynamics from data-driven stochastic systems
Authors:
Lingyu Feng,
Ting Gao,
Min Dai,
Jinqiao Duan
Abstract:
Multiscale stochastic dynamical systems have been widely applied to a variety of scientific and engineering problems due to their capability of depicting complex phenomena in many real-world applications. This work is devoted to investigating the effective dynamics for slow-fast stochastic dynamical systems. Given observation data over a short-term period satisfying some unknown slow-fast stochastic systems, we propose a novel algorithm including a neural network called Auto-SDE to learn the invariant slow manifold. Our approach captures the evolutionary nature of a series of time-dependent autoencoder neural networks with the loss constructed from a discretized stochastic differential equation. Our algorithm is also validated to be accurate, stable and effective through numerical experiments under various evaluation metrics.
Submitted 29 December, 2023; v1 submitted 9 May, 2022;
originally announced May 2022.
-
LoCoV: low dimension covariance voting algorithm for portfolio optimization
Authors:
JunTao Duan,
Ionel Popescu
Abstract:
Minimum-variance portfolio optimization relies on an accurate covariance estimator to obtain optimal portfolios. However, it usually suffers from large errors in the sample covariance matrix when the sample size $n$ is not significantly larger than the number of assets $p$. We analyze the random matrix aspects of portfolio optimization, identify the order of errors in the sample optimal portfolio weights, and show that portfolio risk is underestimated when using samples. We also provide the LoCoV (low dimension covariance voting) algorithm to reduce the error inherited from random samples. In various experiments, LoCoV is shown to outperform the classical method by a large margin.
Submitted 1 April, 2022;
originally announced April 2022.
-
Enhanced Nearest Neighbor Classification for Crowdsourcing
Authors:
Jiexin Duan,
Xingye Qiao,
Guang Cheng
Abstract:
In machine learning, crowdsourcing is an economical way to label a large amount of data. However, the noise in the produced labels may deteriorate the accuracy of any classification method applied to the labelled data. We propose an enhanced nearest neighbor classifier (ENN) to overcome this issue. Two algorithms are developed to estimate the worker quality (which is often unknown in practice): one is to construct the estimate based on the denoised worker labels by applying the $k$NN classifier to the expert data; the other is an iterative algorithm that works even without access to the expert data. Beyond strong numerical evidence, our proposed methods are proven to achieve the same regret as the oracle version based on high-quality expert data. As a technical by-product, a lower bound on the sample size assigned to each worker to reach the optimal convergence rate of regret is derived.
Submitted 26 February, 2022;
originally announced March 2022.
-
An end-to-end deep learning approach for extracting stochastic dynamical systems with $α$-stable Lévy noise
Authors:
Cheng Fang,
Yubin Lu,
Ting Gao,
Jinqiao Duan
Abstract:
Recently, extracting data-driven governing laws of dynamical systems through deep learning frameworks has gained a lot of attention in various fields. Moreover, a growing amount of research work tends to transfer deterministic dynamical systems to stochastic dynamical systems, especially those driven by non-Gaussian multiplicative noise. However, lots of log-likelihood based algorithms that work well for Gaussian cases cannot be directly extended to non-Gaussian scenarios which could have high error and low convergence issues. In this work, we overcome some of these challenges and identify stochastic dynamical systems driven by $α$-stable Lévy noise from only random pairwise data. Our innovations include: (1) designing a deep learning approach to learn both drift and diffusion coefficients for Lévy induced noise with $α$ across all values, (2) learning complex multiplicative noise without restrictions on small noise intensity, (3) proposing an end-to-end complete framework for stochastic systems identification under a general input data assumption, that is, $α$-stable random variable. Finally, numerical experiments and comparisons with the non-local Kramers-Moyal formulas with moment generating function confirm the effectiveness of our method.
Submitted 2 July, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Recover the spectrum of covariance matrix: a non-asymptotic iterative method
Authors:
Juntao Duan,
Ionel Popescu,
Heinrich Matzinger
Abstract:
It is well known that the sample covariance has a consistent bias in the spectrum; for example, the spectrum of a Wishart matrix follows the Marchenko-Pastur law. In this work, we introduce an iterative algorithm 'Concent' that actively eliminates this bias and recovers the true spectrum for small and moderate dimensions.
Submitted 1 January, 2022;
originally announced January 2022.
-
Invariance principle of random projection for the norm
Authors:
Juntao Duan,
Ionel Popescu,
Heinrich Matzinger
Abstract:
The Johnson-Lindenstrauss lemma guarantees that certain topological structure is preserved under random projections when projecting high-dimensional deterministic vectors to low-dimensional vectors. In this work, we try to understand how random matrices affect norms of random vectors. In particular, we prove that the distribution of the norm of a random vector $X \in \mathbb{R}^n$, whose entries are i.i.d. random variables, is preserved by a random projection $S:\mathbb{R}^n \to \mathbb{R}^m$. More precisely, \[ \frac{X^TS^TSX - mn}{\sqrt{σ^2 m^2n+2mn^2}} \xrightarrow[\quad m/n\to 0 \quad ]{ m,n\to \infty } \mathcal{N}(0,1) \] We also prove a concentration of the random norm transformed by either random projection or random embedding. Overall, our results show that random matrices have low distortion for the norm of random vectors with i.i.d. entries.
Submitted 25 July, 2022; v1 submitted 1 December, 2021;
originally announced December 2021.
-
Neural network stochastic differential equation models with applications to financial data forecasting
Authors:
Luxuan Yang,
Ting Gao,
Yubin Lu,
Jinqiao Duan,
Tao Liu
Abstract:
In this article, we employ a collection of stochastic differential equations with drift and diffusion coefficients approximated by neural networks to predict the trend of chaotic time series which has big jump properties. Our contributions are, first, we propose a model called Lévy induced stochastic differential equation network, which explores compounded stochastic differential equations with $α$-stable Lévy motion to model complex time series data and solve the problem through neural network approximation. Second, we theoretically prove that the numerical solution through our algorithm converges in probability to the solution of corresponding stochastic differential equation, without curse of dimensionality. Finally, we illustrate our method by applying it to real financial time series data and find the accuracy increases through the use of non-Gaussian Lévy processes. We also present detailed comparisons in terms of data patterns, various models, different shapes of Lévy motion and the prediction lengths.
Submitted 3 November, 2022; v1 submitted 25 November, 2021;
originally announced November 2021.
-
Extracting stochastic dynamical systems with $α$-stable Lévy noise from data
Authors:
Yang Li,
Yubin Lu,
Shengyuan Xu,
Jinqiao Duan
Abstract:
With the rapid increase of valuable observational, experimental and simulated data for complex systems, much effort has been devoted to identifying governing laws underlying the evolution of these systems. Despite the wide applications of non-Gaussian fluctuations in numerous physical phenomena, the data-driven approaches to extract stochastic dynamical systems with (non-Gaussian) Lévy noise are relatively few so far. In this work, we propose a data-driven method to extract stochastic dynamical systems with $α$-stable Lévy noise from short burst data based on the properties of $α$-stable distributions. More specifically, we first estimate the Lévy jump measure and noise intensity via computing mean and variance of the amplitude of the increment of the sample paths. Then we approximate the drift coefficient by combining nonlocal Kramers-Moyal formulas with normalizing flows. Numerical experiments on one- and two-dimensional prototypical examples illustrate the accuracy and effectiveness of our method. This approach will become an effective scientific tool in discovering stochastic governing laws of complex phenomena and understanding dynamical behaviors under non-Gaussian fluctuations.
Submitted 30 September, 2021;
originally announced September 2021.
-
Extracting Stochastic Governing Laws by Nonlocal Kramers-Moyal Formulas
Authors:
Yubin Lu,
Yang Li,
Jinqiao Duan
Abstract:
With the rapid development of computational techniques and scientific tools, great progress of data-driven analysis has been made to extract governing laws of dynamical systems from data. Despite the wide occurrences of non-Gaussian fluctuations, the effective data-driven methods to identify stochastic differential equations with non-Gaussian Lévy noise are relatively few so far. In this work, we propose a data-driven approach to extract stochastic governing laws with both (Gaussian) Brownian motion and (non-Gaussian) Lévy motion, from short bursts of simulation data. Specifically, we use the normalizing flows technology to estimate the transition probability density function (solution of nonlocal Fokker-Planck equation) from data, and then substitute it into the recently proposed nonlocal Kramers-Moyal formulas to approximate Lévy jump measure, drift coefficient and diffusion coefficient. We demonstrate that this approach can learn the stochastic differential equation with Lévy motion. We present examples with one- and two-dimensional, decoupled and coupled systems to illustrate our method. This approach will become an effective tool for discovering stochastic governing laws and understanding complex dynamical behaviors.
Submitted 31 August, 2021; v1 submitted 28 August, 2021;
originally announced August 2021.
-
Test of Significance for High-dimensional Thresholds with Application to Individualized Minimal Clinically Important Difference
Authors:
Huijie Feng,
Jingyi Duan,
Yang Ning,
Jiwei Zhao
Abstract:
This work is motivated by learning the individualized minimal clinically important difference, a vital concept to assess clinical importance in various biomedical studies. We formulate the scientific question into a high-dimensional statistical problem where the parameter of interest lies in an individualized linear threshold. The goal is to develop a hypothesis testing procedure for the significance of a single element in this parameter as well as of a linear combination of this parameter. The difficulty in developing such a testing procedure is due to the high-dimensional nuisance, and also stems from the fact that this high-dimensional threshold model is nonregular and the limiting distribution of the corresponding estimator is nonstandard. To deal with these challenges, we construct a test statistic via a new bias-corrected smoothed decorrelated score approach, and establish its asymptotic distributions under both null and local alternative hypotheses. We propose a double-smoothing approach to select the optimal bandwidth in our test statistic and provide theoretical guarantees for the selected bandwidth. We conduct simulation studies to demonstrate how our proposed procedure can be applied in empirical studies. We apply the proposed method to a clinical trial where the scientific goal is to assess the clinical importance of a surgery procedure.
Submitted 26 March, 2023; v1 submitted 9 August, 2021;
originally announced August 2021.
-
Learning the temporal evolution of multivariate densities via normalizing flows
Authors:
Yubin Lu,
Romit Maulik,
Ting Gao,
Felix Dietrich,
Ioannis G. Kevrekidis,
Jinqiao Duan
Abstract:
In this work, we propose a method to learn multivariate probability distributions using sample path data from stochastic differential equations. Specifically, we consider temporally evolving probability distributions (e.g., those produced by integrating local or nonlocal Fokker-Planck equations). We analyze this evolution through machine learning assisted construction of a time-dependent mapping that takes a reference distribution (say, a Gaussian) to each and every instance of our evolving distribution. If the reference distribution is the initial condition of a Fokker-Planck equation, what we learn is the time-T map of the corresponding solution. Specifically, the learned map is a multivariate normalizing flow that deforms the support of the reference density to the support of each and every density snapshot in time. We demonstrate that this approach can approximate probability density function evolutions in time from observed sampled data for systems driven by both Brownian and Lévy noise. We present examples with two- and three-dimensional, uni- and multimodal distributions to validate the method.
Submitted 3 May, 2022; v1 submitted 29 July, 2021;
originally announced July 2021.
-
Extracting Governing Laws from Sample Path Data of Non-Gaussian Stochastic Dynamical Systems
Authors:
Yang Li,
Jinqiao Duan
Abstract:
Advances in data science are leading to new progress in the analysis and understanding of complex dynamics for systems with experimental and observational data. With numerous physical phenomena exhibiting bursting, flights, hopping, and intermittent features, stochastic differential equations with non-Gaussian Lévy noise are suitable to model these systems. Thus it is desirable and essential to infer such equations from available data to reasonably predict dynamical behaviors. In this work, we consider a data-driven method to extract stochastic dynamical systems with non-Gaussian asymmetric (rather than the symmetric) Lévy process, as well as Gaussian Brownian motion. We establish a theoretical framework and design a numerical algorithm to compute the asymmetric Lévy jump measure, drift and diffusion (i.e., nonlocal Kramers-Moyal formulas), hence obtaining the stochastic governing law, from noisy data. Numerical experiments on several prototypical examples confirm the efficacy and accuracy of this method. This method will become an effective tool in discovering the governing laws from available data sets and in understanding the mechanisms underlying complex random phenomena.
Submitted 21 July, 2021;
originally announced July 2021.
-
A Machine Learning Framework for Computing the Most Probable Paths of Stochastic Dynamical Systems
Authors:
Yang Li,
Jinqiao Duan,
Xianbin Liu
Abstract:
The emergence of transition phenomena between metastable states induced by noise plays a fundamental role in a broad range of nonlinear systems. The computation of the most probable paths is a key issue to understand the mechanism of transition behaviors. Shooting method is a common technique for this purpose to solve the Euler-Lagrange equation for the associated action functional, while losing its efficacy in high-dimensional systems. In the present work, we develop a machine learning framework to compute the most probable paths in the sense of Onsager-Machlup action functional theory. Specifically, we reformulate the boundary value problem of Hamiltonian system and design a neural network to remedy the shortcomings of shooting method. The successful applications of our algorithms to several prototypical examples demonstrate its efficacy and accuracy for stochastic systems with both (Gaussian) Brownian noise and (non-Gaussian) Lévy noise. This novel approach is effective in exploring the internal mechanisms of rare events triggered by random fluctuations in various scientific fields.
Submitted 24 December, 2020; v1 submitted 1 October, 2020;
originally announced October 2020.
-
Multivariate Time-series Anomaly Detection via Graph Attention Network
Authors:
Hang Zhao,
Yujing Wang,
Juanyong Duan,
Congrui Huang,
Defu Cao,
Yunhai Tong,
Bixiong Xu,
Jing Bai,
Jie Tong,
Qi Zhang
Abstract:
Anomaly detection on multivariate time-series is of great importance in both data mining research and industrial applications. Recent approaches have achieved significant progress in this topic, but there are remaining limitations. One major limitation is that they do not capture the relationships between different time-series explicitly, resulting in inevitable false alarms. In this paper, we propose a novel self-supervised framework for multivariate time-series anomaly detection to address this issue. Our framework considers each univariate time-series as an individual feature and includes two graph attention layers in parallel to learn the complex dependencies of multivariate time-series in both temporal and feature dimensions. In addition, our approach jointly optimizes a forecasting-based model and a reconstruction-based model, obtaining better time-series representations through a combination of single-timestamp prediction and reconstruction of the entire time-series. We demonstrate the efficacy of our model through extensive experiments. The proposed method outperforms other state-of-the-art models on three real-world datasets. Further analysis shows that our method has good interpretability and is useful for anomaly diagnosis.
Submitted 4 September, 2020;
originally announced September 2020.
-
Solving Inverse Stochastic Problems from Discrete Particle Observations Using the Fokker-Planck Equation and Physics-informed Neural Networks
Authors:
Xiaoli Chen,
Liu Yang,
Jinqiao Duan,
George Em Karniadakis
Abstract:
The Fokker-Planck (FP) equation governing the evolution of the probability density function (PDF) is applicable to many disciplines, but it requires specification of the coefficients for each case, which can be functions of space-time and not just constants, hence requiring the development of a data-driven modeling approach. When the data available is directly on the PDF, then there exist methods for inverse problems that can be employed to infer the coefficients and thus determine the FP equation and subsequently obtain its solution. Herein, we address a more realistic scenario, where only sparse data are given on the particles' positions at a few time instants, which are not sufficient to accurately construct the PDF directly even at those times from existing methods, e.g., kernel estimation algorithms. To this end, we develop a general framework based on physics-informed neural networks (PINNs) that introduces a new loss function using the Kullback-Leibler divergence to connect the stochastic samples with the FP equation, to simultaneously learn the equation and infer the multi-dimensional PDF at all times. In particular, we consider two types of inverse problems: type I, where the FP equation is known but the initial PDF is unknown, and type II, in which, in addition to the unknown initial PDF, the drift and diffusion terms are also unknown. In both cases, we investigate problems with either Brownian or Lévy noise or a combination of both. We demonstrate the new PINN framework in detail in the one-dimensional case (1D), but we also provide results for up to 5D, demonstrating that we can infer both the FP equation and the dynamics simultaneously at all times with high accuracy using only very few discrete observations of the particles.
Submitted 24 August, 2020;
originally announced August 2020.
-
A Data-Driven Approach for Discovering Stochastic Dynamical Systems with Non-Gaussian Levy Noise
Authors:
Yang Li,
Jinqiao Duan
Abstract:
With the rapid increase of valuable observational, experimental and simulated data for complex systems, great efforts are being devoted to discovering governing laws underlying the evolution of these systems. However, the existing techniques are limited to extracting governing laws from data as either deterministic differential equations or stochastic differential equations with Gaussian noise. In the present work, we develop a new data-driven approach to extract stochastic dynamical systems with non-Gaussian symmetric Lévy noise, as well as Gaussian noise. First, we establish a feasible theoretical framework, by expressing the drift coefficient, diffusion coefficient and jump measure (i.e., anomalous diffusion) for the underlying stochastic dynamical system in terms of sample paths data. We then design a numerical algorithm to compute the drift, diffusion coefficient and jump measure, and thus extract a governing stochastic differential equation with Gaussian and non-Gaussian noise. Finally, we demonstrate the efficacy and accuracy of our approach by applying it to several prototypical one-, two- and three-dimensional systems. This new approach will become a tool in discovering governing dynamical laws from noisy data sets, from observing or simulating complex phenomena, such as rare events triggered by random fluctuations with heavy as well as light tail statistical features.
Submitted 10 December, 2020; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Difference Attention Based Error Correction LSTM Model for Time Series Prediction
Authors:
Yuxuan Liu,
Jiangyong Duan,
Juan Meng
Abstract:
In this paper, we propose a novel model for time series prediction in which a difference-attention LSTM model and an error-correction LSTM model are employed and combined in a cascade. The difference-attention LSTM model introduces a difference feature to perform attention in a traditional LSTM, focusing on the obvious changes in the time series, while the error-correction LSTM model refines the prediction error of the difference-attention LSTM model to further improve the prediction accuracy. Finally, we design a training strategy to jointly train both models. With additional difference features and a new principle learning framework, our model can improve the prediction accuracy in time series. Experiments on various time series are conducted to demonstrate the effectiveness of our method.
Submitted 30 March, 2020;
originally announced March 2020.
-
Safe Reinforcement Learning for Autonomous Vehicles through Parallel Constrained Policy Optimization
Authors:
Lu Wen,
Jingliang Duan,
Shengbo Eben Li,
Shaobing Xu,
Huei Peng
Abstract:
Reinforcement learning (RL) is attracting increasing interest in autonomous driving due to its potential to solve complex classification and control problems. However, existing RL algorithms are rarely applied to real vehicles because of two predominant problems: behaviours are unexplainable, and they cannot guarantee safety under new scenarios. This paper presents a safe RL algorithm, called Parallel Constrained Policy Optimization (PCPO), for two autonomous driving tasks. PCPO extends today's common actor-critic architecture to a three-component learning framework, in which three neural networks are used to approximate the policy function, value function and a newly added risk function, respectively. Meanwhile, a trust region constraint is added to allow large update steps without breaking the monotonic improvement condition. To ensure the feasibility of safety constrained problems, synchronized parallel learners are employed to explore different state spaces, which accelerates learning and policy-update. The simulations of two scenarios for autonomous vehicles confirm we can ensure safety while achieving fast learning.
Submitted 2 March, 2020;
originally announced March 2020.
-
Identifying stochastic governing equations from data of the most probable transition trajectories
Authors:
Jian Ren,
Jinqiao Duan
Abstract:
Extracting governing stochastic differential equation models from elusive data is crucial to understand and forecast dynamics for complex systems. We devise a method to extract the drift term and estimate the diffusion coefficient of a governing stochastic dynamical system, from its time-series data of the most probable transition trajectory. By the Onsager-Machlup theory, the most probable transition trajectory satisfies the corresponding Euler-Lagrange equation, which is a second order deterministic ordinary differential equation involving the drift term and diffusion coefficient. We first estimate the coefficients of the Euler-Lagrange equation based on the data of the most probable trajectory, and then we calculate the drift and diffusion coefficients of the governing stochastic dynamical system. These two steps involve sparse regression and optimization. Finally, we illustrate our method with an example and some discussions.
Submitted 19 August, 2020; v1 submitted 18 February, 2020;
originally announced February 2020.
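For orientation, the second-order equation mentioned in the abstract above can be written out in a simple special case. The sketch below assumes a scalar system $dX_t = f(X_t)\,dt + σ\,dB_t$ with constant noise intensity $σ$ and smooth drift $f$; these symbols and the constant-noise restriction are assumptions made here for illustration and are not taken from the abstract, which may treat a more general setting.

```latex
% Onsager-Machlup Lagrangian for dX_t = f(X_t) dt + \sigma dB_t (scalar, constant \sigma)
L(z, \dot z) = \frac{1}{2}\left(\frac{\dot z - f(z)}{\sigma}\right)^{2} + \frac{1}{2} f'(z)

% Euler-Lagrange equation: d/dt (\partial L / \partial \dot z) = \partial L / \partial z
\frac{\ddot z - f'(z)\,\dot z}{\sigma^{2}}
  = -\frac{\bigl(\dot z - f(z)\bigr) f'(z)}{\sigma^{2}} + \frac{1}{2} f''(z)

% The f'(z) \dot z terms cancel, leaving a second-order deterministic ODE
\ddot z = f(z)\, f'(z) + \frac{\sigma^{2}}{2}\, f''(z)
```

In this special case the right-hand side depends only on $f$ and $σ$, which is why fitting the coefficients of the second-order equation to trajectory data can, in principle, give access to the drift and the diffusion coefficient.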
-
Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic
Authors:
Yangang Ren,
Jingliang Duan,
Shengbo Eben Li,
Yang Guan,
Qi Sun
Abstract:
Reinforcement learning (RL) has achieved remarkable performance in numerous sequential decision making and control tasks. However, a common problem is that learned nearly optimal policy always overfits to the training environment and may not be extended to situations never encountered during training. For practical applications, the randomness of environment usually leads to some devastating events, which should be the focus of safety-critical systems such as autonomous driving. In this paper, we introduce the minimax formulation and distributional framework to improve the generalization ability of RL algorithms and develop the Minimax Distributional Soft Actor-Critic (Minimax DSAC) algorithm. Minimax formulation aims to seek optimal policy considering the most severe variations from environment, in which the protagonist policy maximizes action-value function while the adversary policy tries to minimize it. Distributional framework aims to learn a state-action return distribution, from which we can model the risk of different returns explicitly, thereby formulating a risk-averse protagonist policy and a risk-seeking adversarial policy. We implement our method on the decision-making tasks of autonomous vehicles at intersections and test the trained policy in distinct environments. Results demonstrate that our method can greatly improve the generalization ability of the protagonist agent to different environmental variations.
Submitted 30 September, 2020; v1 submitted 13 February, 2020;
originally announced February 2020.
-
Direct and indirect reinforcement learning
Authors:
Yang Guan,
Shengbo Eben Li,
Jingliang Duan,
Jie Li,
Yangang Ren,
Qi Sun,
Bo Cheng
Abstract:
Reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. In this paper, we classify RL into direct and indirect RL according to how they seek the optimal policy of the Markov decision process problem. The former solves the optimal policy by directly maximizing an objective function using gradient descent methods, in which the objective function is usually the expectation of accumulative future rewards. The latter indirectly finds the optimal policy by solving the Bellman equation, which is the sufficient and necessary condition from Bellman's principle of optimality. We study policy gradient forms of direct and indirect RL and show that both of them can derive the actor-critic architecture and can be unified into a policy gradient with the approximate value function and the stationary state distribution, revealing the equivalence of direct and indirect RL. We employ a Gridworld task to verify the influence of different forms of policy gradient, suggesting their differences and relationships experimentally. Finally, we classify current mainstream RL algorithms using the direct and indirect taxonomy, together with other ones including value-based and policy-based, model-based and model-free.
Submitted 11 May, 2021; v1 submitted 22 December, 2019;
originally announced December 2019.
-
$Σ$-net: Ensembled Iterative Deep Neural Networks for Accelerated Parallel MR Image Reconstruction
Authors:
Jo Schlemper,
Chen Qin,
Jinming Duan,
Ronald M. Summers,
Kerstin Hammernik
Abstract:
We explore an ensembled $Σ$-net for fast parallel MR imaging, including parallel coil networks, which perform implicit coil weighting, and sensitivity networks, involving explicit sensitivity maps. The networks in $Σ$-net are trained in a supervised way, including content and GAN losses, and with various ways of data consistency, i.e., proximal mappings, gradient descent and variable splitting. A semi-supervised finetuning scheme allows us to adapt to the k-space data at test time, which, however, decreases the quantitative metrics, although generating the visually most textured and sharp images. For this challenge, we focused on robust and high SSIM scores, which we achieved by ensembling all models to a $Σ$-net.
Submitted 11 December, 2019;
originally announced December 2019.
-
Data consistency networks for (calibration-less) accelerated parallel MR image reconstruction
Authors:
Jo Schlemper,
Jinming Duan,
Cheng Ouyang,
Chen Qin,
Jose Caballero,
Joseph V. Hajnal,
Daniel Rueckert
Abstract:
We present simple reconstruction networks for multi-coil data by extending deep cascade of CNN's and exploiting the data consistency layer. In particular, we propose two variants, where one is inspired by POCSENSE and the other is calibration-less. We show that the proposed approaches are competitive relative to the state of the art both quantitatively and qualitatively.
Submitted 25 September, 2019;
originally announced September 2019.
-
dAUTOMAP: decomposing AUTOMAP to achieve scalability and enhance performance
Authors:
Jo Schlemper,
Ilkay Oksuz,
James R. Clough,
Jinming Duan,
Andrew P. King,
Julia A. Schnabel,
Joseph V. Hajnal,
Daniel Rueckert
Abstract:
AUTOMAP is a promising generalized reconstruction approach; however, it is not scalable, and hence its practicality is limited. We present dAUTOMAP, a novel way of decomposing the domain transformation of AUTOMAP, making the model scale linearly. We show dAUTOMAP outperforms AUTOMAP with significantly fewer parameters.
Submitted 25 September, 2019; v1 submitted 24 September, 2019;
originally announced September 2019.
-
Rates of Convergence for Large-scale Nearest Neighbor Classification
Authors:
Xingye Qiao,
Jiexin Duan,
Guang Cheng
Abstract:
Nearest neighbor is a popular class of classification methods with many desirable properties. For a large data set which cannot be loaded into the memory of a single machine due to computation, communication, privacy, or ownership limitations, we consider the divide and conquer scheme: the entire data set is divided into small subsamples, on which nearest neighbor predictions are made, and then a final decision is reached by aggregating the predictions on subsamples by majority voting. We name this method the big Nearest Neighbor (bigNN) classifier, and provide its rates of convergence under minimal assumptions, in terms of both the excess risk and the classification instability, which are proven to be the same rates as the oracle nearest neighbor classifier and cannot be improved. To significantly reduce the prediction time that is required for achieving the optimal rate, we also consider the pre-training acceleration technique applied to the bigNN method, with proven convergence rate. We find that in the distributed setting, the optimal choice of the neighbor $k$ should scale with both the total sample size and the number of partitions, and there is a theoretical upper limit for the latter. Numerical studies have verified the theoretical findings.
Submitted 30 October, 2019; v1 submitted 3 September, 2019;
originally announced September 2019.
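The divide-and-conquer scheme described in the bigNN abstract (split the data into subsamples, predict with nearest neighbors on each, aggregate by majority vote) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the random interleaved split, and the toy two-cluster data are all assumptions made here, and the sketch ignores the distributed-storage and pre-training aspects the abstract mentions.

```python
import random
from collections import Counter


def knn_predict(points, labels, query, k):
    """Majority label among the k points closest to `query` (squared Euclidean distance)."""
    order = sorted(
        range(len(points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], query)),
    )
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


def bignn_predict(points, labels, query, n_splits, k, seed=0):
    """Hypothetical bigNN-style classifier: k-NN on each subsample, then a majority vote."""
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)  # random partition of the full data set
    votes = []
    for s in range(n_splits):
        part = idx[s::n_splits]  # every n_splits-th shuffled index forms one subsample
        sub_points = [points[i] for i in part]
        sub_labels = [labels[i] for i in part]
        votes.append(knn_predict(sub_points, sub_labels, query, k))
    return Counter(votes).most_common(1)[0][0]


# Toy data (an assumption): two well-separated Gaussian clusters in the plane.
rng = random.Random(1)
points = [(rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)) for _ in range(300)]
points += [(rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)) for _ in range(300)]
labels = [0] * 300 + [1] * 300

pred_low = bignn_predict(points, labels, (-2.0, -2.0), n_splits=5, k=3)
pred_high = bignn_predict(points, labels, (2.0, 2.0), n_splits=5, k=3)
```

Using odd values for both `k` and `n_splits` avoids ties in a binary problem. How `k` should scale with the total sample size and the number of partitions is the subject of the paper and is not addressed by this sketch.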
-
A cost-reducing partial labeling estimator in text classification problem
Authors:
Jiangning Chen,
Zhibo Dai,
Juntao Duan,
Qianli Hu,
Ruilin Li,
Heinrich Matzinger,
Ionel Popescu,
Haoyan Zhai
Abstract:
We propose a new approach to address text classification problems when learning with partial labels is beneficial. Instead of offering each training sample a set of candidate labels, we assign negative-oriented labels to the ambiguous training examples if they are unlikely to fall into certain classes. We construct our new maximum likelihood estimators with a self-correction property, and prove that under some conditions, our estimators converge faster. We also discuss the advantages of applying one of our estimators to a fully supervised learning problem. The proposed method has potential applicability in many areas, such as crowdsourcing, natural language processing and medical image analysis.
Submitted 9 June, 2019;
originally announced June 2019.
-
Naive Bayes with Correlation Factor for Text Classification Problem
Authors:
Jiangning Chen,
Zhibo Dai,
Juntao Duan,
Heinrich Matzinger,
Ionel Popescu
Abstract:
The Naive Bayes estimator is widely used in text classification problems. However, it does not perform well with small training datasets. We propose a new method based on the Naive Bayes estimator to solve this problem. A correlation factor is introduced to incorporate the correlation among different classes. Experimental results show that our estimator achieves better accuracy than traditional Naive Bayes on real-world data.
Submitted 8 May, 2019;
originally announced May 2019.
-
Estimation of group means in generalized linear mixed models
Authors:
Jiexin Duan,
Michael Levine,
Junxiang Luo,
Yongming Qu
Abstract:
In this manuscript, we investigate the concept of the mean response for a treatment group as well as its estimation and prediction for generalized linear models with a subject-wise random effect. Generalized linear models are commonly used to analyze categorical data. The model-based mean for a treatment group usually estimates the response at the mean covariate. However, the mean response for the treatment group for the studied population is at least equally important in the context of clinical trials. New methods were proposed to estimate such a mean response in generalized linear models; however, this has only been done when there are no random effects in the model. We suggest that, in a generalized linear mixed model (GLMM), there are at least two possible definitions of a treatment group mean response that can serve as estimation/prediction targets. The estimation of these treatment group means is important for healthcare professionals to be able to understand the absolute benefit versus risk. For both of these treatment group means, we propose a new set of methods that suggests how to estimate/predict them in a GLMM with a univariate subject-wise random effect. Our methods also suggest an easy way of constructing corresponding confidence and prediction intervals for both possible treatment group means. Simulations show that the proposed confidence and prediction intervals provide correct empirical coverage probability under most circumstances. The proposed methods have also been applied to analyze hypoglycemia data from diabetes clinical trials.
Submitted 2 November, 2019; v1 submitted 12 April, 2019;
originally announced April 2019.
-
Probabilistic Load Forecasting via Point Forecast Feature Integration
Authors:
Qicheng Chang,
Yishen Wang,
Xiao Lu,
Di Shi,
Haifeng Li,
Jiajun Duan,
Zhiwei Wang
Abstract:
Short-term load forecasting is a critical element of power systems energy management systems. In recent years, probabilistic load forecasting (PLF) has gained increased attention for its ability to provide uncertainty information that helps to improve the reliability and economics of system operation performances. This paper proposes a two-stage probabilistic load forecasting framework by integrating point forecast as a key probabilistic forecasting feature into PLF. In the first stage, all related features are utilized to train a point forecast model and also obtain the feature importance. In the second stage the forecasting model is trained, taking into consideration point forecast features, as well as selected feature subsets. During the testing period of the forecast model, the final probabilistic load forecast results are leveraged to obtain both point forecasting and probabilistic forecasting. Numerical results obtained from ISO New England demand data demonstrate the effectiveness of the proposed approach in the hour-ahead load forecasting, which uses the gradient boosting regression for the point forecasting and quantile regression neural networks for the probabilistic forecasting.
Submitted 26 March, 2019;
originally announced March 2019.
-
Submodular Load Clustering with Robust Principal Component Analysis
Authors:
Yishen Wang,
Xiao Lu,
Yiran Xu,
Di Shi,
Zhehan Yi,
Jiajun Duan,
Zhiwei Wang
Abstract:
Traditional load analysis is facing challenges with the new electricity usage patterns due to demand response as well as increasing deployment of distributed generation, including photovoltaics (PV), electric vehicles (EV), and energy storage systems (ESS). In the transmission system, despite irregular load behaviors in different areas, highly aggregated load shapes still share similar characteristics. Load clustering aims to discover such intrinsic patterns and provide useful information to other load applications, such as load forecasting and load modeling. This paper proposes an efficient submodular load clustering method for transmission-level load areas. Robust principal component analysis (R-PCA) first decomposes the annual load profiles into low-rank components and sparse components to extract key features. A novel submodular cluster center selection technique is then applied to determine the optimal cluster centers through a constructed similarity graph. Following the selection results, load areas are efficiently assigned to different clusters for further load analysis and applications. Numerical results obtained from PJM load data demonstrate the effectiveness of the proposed approach.
Submitted 19 February, 2019;
originally announced February 2019.
-
Deep learning cardiac motion analysis for human survival prediction
Authors:
Ghalib A. Bello,
Timothy J. W. Dawes,
Jinming Duan,
Carlo Biffi,
Antonio de Marvao,
Luke S. G. E. Howard,
J. Simon R. Gibbs,
Martin R. Wilkins,
Stuart A. Cook,
Daniel Rueckert,
Declan P. O'Regan
Abstract:
Motion analysis is used in computer vision to understand the behaviour of moving objects in sequences of images. Optimising the interpretation of dynamic biological systems requires accurate and precise motion tracking as well as efficient representations of high-dimensional motion trajectories so that these can be used for prediction tasks. Here we use image sequences of the heart, acquired using cardiac magnetic resonance imaging, to create time-resolved three-dimensional segmentations using a fully convolutional network trained on anatomical shape priors. This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. To handle right-censored survival outcomes, our network used a Cox partial likelihood loss function. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p < .0001) for our model C=0.73 (95$\%$ CI: 0.68 - 0.78) than the human benchmark of C=0.59 (95$\%$ CI: 0.53 - 0.65). This work demonstrates how a complex computer vision task using high-dimensional medical image data can efficiently predict human survival.
Submitted 8 October, 2018;
originally announced October 2018.
-
Efficient Computational Algorithm for Optimal Continuous Experimental Designs
Authors:
Jiangtao Duan,
Wei Gao,
Hon Keung Tony Ng
Abstract:
A simple yet efficient computational algorithm for computing the continuous optimal experimental design for linear models is proposed. An alternative proof of the monotonic convergence for the $D$-optimal criterion on continuous design spaces is provided. We further show that the proposed algorithm converges to the $D$-optimal design. We also provide an algorithm for the $A$-optimality and conjecture that the algorithm converges monotonically on continuous design spaces. Different numerical examples are used to demonstrate the usefulness and performance of the proposed algorithms.
Submitted 8 April, 2018;
originally announced April 2018.
-
Diffusion Maximum Correntropy Criterion Algorithms for Robust Distributed Estimation
Authors:
Wentao Ma,
Badong Chen,
Jiandong Duan,
Haiquan Zhao
Abstract:
Robust diffusion adaptive estimation algorithms based on the maximum correntropy criterion (MCC), including adaptation to combination MCC and combination to adaptation MCC, are developed to deal with distributed estimation over networks in impulsive (long-tailed) noise environments. The cost functions used in distributed estimation are in general based on the mean square error (MSE) criterion, which is desirable when the measurement noise is Gaussian. In non-Gaussian situations, such as the impulsive-noise case, MCC based methods may achieve much better performance than the MSE methods as they take into account higher order statistics of the error distribution. The proposed methods can also outperform the robust diffusion least mean p-power (DLMP) and diffusion minimum error entropy (DMEE) algorithms. The mean and mean square convergence analyses of the new algorithms are also carried out.
Submitted 3 February, 2016; v1 submitted 8 August, 2015;
originally announced August 2015.
-
State estimation under non-Gaussian Levy noise: A modified Kalman filtering method
Authors:
Xu Sun,
Jinqiao Duan,
Xiaofan Li,
Xiangjun Wang
Abstract:
The Kalman filter is extensively used for state estimation for linear systems under Gaussian noise. When non-Gaussian Lévy noise is present, the conventional Kalman filter may fail to be effective because the non-Gaussian Lévy noise may have infinite variance. A modified Kalman filter for linear systems with non-Gaussian Lévy noise is devised. It works effectively with reasonable computational cost. Simulation results are presented to illustrate this non-Gaussian filtering method.
Submitted 10 March, 2013;
originally announced March 2013.
-
A sufficient condition on monotonic increase of the number of nonzero entry in the optimizer of L1 norm penalized least-square problem
Authors:
J. Duan,
Charles Soussen,
David Brie,
Jerome Idier,
Y. -P. Wang
Abstract:
Optimization based on the $\ell_1$ norm is widely used in signal processing, especially in recent compressed sensing theory. This paper studies the solution path of the $\ell_1$-norm penalized least-squares problem, whose constrained form is known as the Least Absolute Shrinkage and Selection Operator (LASSO). A solution path is the set of all optimizers as the hyperparameter (Lagrange multiplier) varies. Studying the solution path is of great significance for viewing and understanding the tradeoff between the approximation and regularization terms. If the solution path of a given problem is known, it can help us find the optimal hyperparameter under a given criterion such as the Akaike Information Criterion. In this paper we present a sufficient condition for the $\ell_1$-norm penalized least-squares problem. Under this sufficient condition, the number of nonzero entries in the optimizer or solution vector increases monotonically as the hyperparameter decreases. We also generalize the result to the often-used total variation case, where the $\ell_1$ norm is taken over the first-order derivative of the solution vector. We prove that the proposed condition has intrinsic connections with the condition given by Donoho et al. \cite{Donoho08} and the positive cone condition of Efron et al. \cite{Efron04}. However, the proposed condition does not need to assume the sparsity level of the signal, as required by the condition of Donoho et al., and is easier to verify in practical applications than the positive cone condition of Efron et al.
Submitted 19 April, 2011;
originally announced April 2011.