Search | arXiv e-print repository

Research on Autonomous Robots Navigation based on Reinforcement Learning

Authors: Zixiang Wang, Hao Yan, Yining Wang, Zhengjia Xu, Zhuoyue Wang, Zhizhong Wu

Abstract: Reinforcement learning continuously optimizes decision-making based on real-time feedback reward signals through continuous interaction with the environment, demonstrating strong adaptive and self-learning capabilities. In recent years, it has become one of the key methods to achieve autonomous navigation of robots. In this work, an autonomous robot navigation method based on reinforcement learnin… ▽ More Reinforcement learning continuously optimizes decision-making based on real-time feedback reward signals through continuous interaction with the environment, demonstrating strong adaptive and self-learning capabilities. In recent years, it has become one of the key methods to achieve autonomous navigation of robots. In this work, an autonomous robot navigation method based on reinforcement learning is introduced. We use the Deep Q Network (DQN) and Proximal Policy Optimization (PPO) models to optimize the path planning and decision-making process through the continuous interaction between the robot and the environment, and the reward signals with real-time feedback. By combining the Q-value function with the deep neural network, deep Q network can handle high-dimensional state space, so as to realize path planning in complex environments. Proximal policy optimization is a strategy gradient-based method, which enables robots to explore and utilize environmental information more efficiently by optimizing policy functions. These methods not only improve the robot's navigation ability in the unknown environment, but also enhance its adaptive and self-learning capabilities. Through multiple training and simulation experiments, we have verified the effectiveness and robustness of these models in various complex scenarios. △ Less

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.01015 [pdf, other]

Bayesian Entropy Neural Networks for Physics-Aware Prediction

Authors: Rahul Rathnakumar, Jiayu Huang, Hao Yan, Yongming Liu

Abstract: This paper addresses the need for deep learning models to integrate well-defined constraints into their outputs, driven by their application in surrogate models, learning with limited data and partial information, and scenarios requiring flexible model behavior to incorporate non-data sample information. We introduce Bayesian Entropy Neural Networks (BENN), a framework grounded in Maximum Entropy… ▽ More This paper addresses the need for deep learning models to integrate well-defined constraints into their outputs, driven by their application in surrogate models, learning with limited data and partial information, and scenarios requiring flexible model behavior to incorporate non-data sample information. We introduce Bayesian Entropy Neural Networks (BENN), a framework grounded in Maximum Entropy (MaxEnt) principles, designed to impose constraints on Bayesian Neural Network (BNN) predictions. BENN is capable of constraining not only the predicted values but also their derivatives and variances, ensuring a more robust and reliable model output. To achieve simultaneous uncertainty quantification and constraint satisfaction, we employ the method of multipliers approach. This allows for the concurrent estimation of neural network parameters and the Lagrangian multipliers associated with the constraints. Our experiments, spanning diverse applications such as beam deflection modeling and microstructure generation, demonstrate the effectiveness of BENN. The results highlight significant improvements over traditional BNNs and showcase competitive performance relative to contemporary constrained deep learning methods. △ Less

Submitted 1 July, 2024; originally announced July 2024.

Comments: 15 pages

ACM Class: I.5.1

arXiv:2404.04403 [pdf, other]

Low-Rank Robust Subspace Tensor Clustering for Metro Passenger Flow Modeling

Authors: Jiuyun Hu, Ziyue Li, Chen Zhang, Fugee Tsung, Hao Yan

Abstract: Tensor clustering has become an important topic, specifically in spatio-temporal modeling, due to its ability to cluster spatial modes (e.g., stations or road segments) and temporal modes (e.g., time of the day or day of the week). Our motivating example is from subway passenger flow modeling, where similarities between stations are commonly found. However, the challenges lie in the innate high-di… ▽ More Tensor clustering has become an important topic, specifically in spatio-temporal modeling, due to its ability to cluster spatial modes (e.g., stations or road segments) and temporal modes (e.g., time of the day or day of the week). Our motivating example is from subway passenger flow modeling, where similarities between stations are commonly found. However, the challenges lie in the innate high-dimensionality of tensors and also the potential existence of anomalies. This is because the three tasks, i.e., dimension reduction, clustering, and anomaly decomposition, are inter-correlated to each other, and treating them in a separate manner will render a suboptimal performance. Thus, in this work, we design a tensor-based subspace clustering and anomaly decomposition technique for simultaneously outlier-robust dimension reduction and clustering for high-dimensional tensors. To achieve this, a novel low-rank robust subspace clustering decomposition model is proposed by combining Tucker decomposition, sparse anomaly decomposition, and subspace clustering. An effective algorithm based on Block Coordinate Descent is proposed to update the parameters. Prudent experiments prove the effectiveness of the proposed framework via the simulation study, with a gain of +25% clustering accuracy than benchmark methods in a hard case. The interrelations of the three tasks are also analyzed via ablation studies, validating the interrelation assumption. Moreover, a case study in the station clustering based on real passenger flow data is conducted, with quite valuable insights discovered. △ Less

Submitted 5 April, 2024; originally announced April 2024.

Comments: Conditionally Accepted in INFORMS Journal of Data Science

arXiv:2310.20224 [pdf, other]

Choose A Table: Tensor Dirichlet Process Multinomial Mixture Model with Graphs for Passenger Trajectory Clustering

Authors: Ziyue Li, Hao Yan, Chen Zhang, Lijun Sun, Wolfgang Ketter, Fugee Tsung

Abstract: Passenger clustering based on trajectory records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, including multiple trips within each passenger and multi-dimensional information about each trip. Furthermore, existing approaches rely on an accurate specification of the clus… ▽ More Passenger clustering based on trajectory records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, including multiple trips within each passenger and multi-dimensional information about each trip. Furthermore, existing approaches rely on an accurate specification of the clustering number to start. Finally, existing methods do not consider spatial semantic graphs such as geographical proximity and functional similarity between the locations. In this paper, we propose a novel tensor Dirichlet Process Multinomial Mixture model with graphs, which can preserve the hierarchical structure of the multi-dimensional trip information and cluster them in a unified one-step manner with the ability to determine the number of clusters automatically. The spatial graphs are utilized in community detection to link the semantic neighbors. We further propose a tensor version of Collapsed Gibbs Sampling method with a minimum cluster size requirement. A case study based on Hong Kong metro passenger data is conducted to demonstrate the automatic process of cluster amount evolution and better cluster quality measured by within-cluster compactness and cross-cluster separateness. The code is available at https://github.com/bonaldli/TensorDPMM-G. △ Less

Submitted 31 October, 2023; originally announced October 2023.

Comments: Accepted in ACM SIGSPATIAL 2023. arXiv admin note: substantial text overlap with arXiv:2306.13794

arXiv:2310.08721 [pdf, other]

Drug Supply Chain Optimization for Adaptive Clinical Trials

Authors: **cheng Pang, Hong Yan, Zoe Hua

Abstract: With increasing interest in adaptive clinical trial designs, challenges are present to drug supply chain management which may offset the benefit of adaptive designs. Thus, it is necessary to develop an optimization tool to facilitate the decision making and analysis of drug supply chain planning. The challenges include the uncertainty of maximum drug supply needed, the shifting of supply requireme… ▽ More With increasing interest in adaptive clinical trial designs, challenges are present to drug supply chain management which may offset the benefit of adaptive designs. Thus, it is necessary to develop an optimization tool to facilitate the decision making and analysis of drug supply chain planning. The challenges include the uncertainty of maximum drug supply needed, the shifting of supply requirement, and rapid availability of new supply at decision points. In this paper, statistical simulations are designed to optimize the pre-study medication supply strategy and monitor ongoing drug supply using real-time data collected with the progress of study. Particle swarm algorithm is applied when performing optimization, where feature extraction is implemented to reduce dimensionality and save computational cost. △ Less

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2309.03439 [pdf, other]

Personalized Tucker Decomposition: Modeling Commonality and Peculiarity on Tensor Data

Authors: Jiuyun Hu, Naichen Shi, Raed Al Kontar, Hao Yan

Abstract: We propose personalized Tucker decomposition (perTucker) to address the limitations of traditional tensor decomposition methods in capturing heterogeneity across different datasets. perTucker decomposes tensor data into shared global components and personalized local components. We introduce a mode orthogonality assumption and develop a proximal gradient regularized block coordinate descent algori… ▽ More We propose personalized Tucker decomposition (perTucker) to address the limitations of traditional tensor decomposition methods in capturing heterogeneity across different datasets. perTucker decomposes tensor data into shared global components and personalized local components. We introduce a mode orthogonality assumption and develop a proximal gradient regularized block coordinate descent algorithm that is guaranteed to converge to a stationary point. By learning unique and common representations across datasets, we demonstrate perTucker's effectiveness in anomaly detection, client classification, and clustering through a simulation study and two case studies on solar flare detection and tonnage signal classification. △ Less

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2306.13794 [pdf, other]

Tensor Dirichlet Process Multinomial Mixture Model for Passenger Trajectory Clustering

Authors: Ziyue Li, Hao Yan, Chen Zhang, Andi Wang, Wolfgang Ketter, Lijun Sun, Fugee Tsung

Abstract: Passenger clustering based on travel records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, namely: each passenger has multiple trips, and each trip contains multi-dimensional multi-mode information. Furthermore, existing approaches rely on an accurate specification of th… ▽ More Passenger clustering based on travel records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, namely: each passenger has multiple trips, and each trip contains multi-dimensional multi-mode information. Furthermore, existing approaches rely on an accurate specification of the clustering number to start, which is difficult when millions of commuters are using the transport systems on a daily basis. In this paper, we propose a novel Tensor Dirichlet Process Multinomial Mixture model (Tensor-DPMM), which is designed to preserve the multi-mode and hierarchical structure of the multi-dimensional trip information via tensor, and cluster them in a unified one-step manner. The model also has the ability to determine the number of clusters automatically by using the Dirichlet Process to decide the probabilities for a passenger to be either assigned in an existing cluster or to create a new cluster: This allows our model to grow the clusters as needed in a dynamic manner. Finally, existing methods do not consider spatial semantic graphs such as geographical proximity and functional similarity between the locations, which may cause inaccurate clustering. To this end, we further propose a variant of our model, namely the Tensor-DPMM with Graph. For the algorithm, we propose a tensor Collapsed Gibbs Sampling method, with an innovative step of "disband and relocating", which disbands clusters with too small amount of members and relocates them to the remaining clustering. This avoids uncontrollable growing amounts of clusters. A case study based on Hong Kong metro passenger data is conducted to demonstrate the automatic process of learning the number of clusters, and the learned clusters are better in within-cluster compactness and cross-cluster separateness. △ Less

Submitted 23 June, 2023; originally announced June 2023.

Comments: Under Review of Transportation Research Part C: Emerging Technologies

arXiv:2305.14767 [pdf]

Interpretation and visualization of distance covariance through additive decomposition of correlations formula

Authors: Andi Wang, Hao Yan, Juan Du

Abstract: Distance covariance is a widely used statistical methodology for testing the dependency between two groups of variables. Despite the appealing properties of consistency and superior testing power, the testing results of distance covariance are often hard to be interpreted. This paper presents an elementary interpretation of the mechanism of distance covariance through an additive decomposition of… ▽ More Distance covariance is a widely used statistical methodology for testing the dependency between two groups of variables. Despite the appealing properties of consistency and superior testing power, the testing results of distance covariance are often hard to be interpreted. This paper presents an elementary interpretation of the mechanism of distance covariance through an additive decomposition of correlations formula. Based on this formula, a visualization method is developed to provide practitioners with a more intuitive explanation of the distance covariance score. △ Less

Submitted 24 May, 2023; originally announced May 2023.

arXiv:2208.08855 [pdf, other]

Adaptive Partially-Observed Sequential Change Detection and Isolation

Authors: Xinyu Zhao, Jiuyun Hu, Yajun Mei, Hao Yan

Abstract: High-dimensional data has become popular due to the easy accessibility of sensors in modern industrial applications. However, one specific challenge is that it is often not easy to obtain complete measurements due to limited sensing powers and resource constraints. Furthermore, distinct failure patterns may exist in the systems, and it is necessary to identify the true failure pattern. This work f… ▽ More High-dimensional data has become popular due to the easy accessibility of sensors in modern industrial applications. However, one specific challenge is that it is often not easy to obtain complete measurements due to limited sensing powers and resource constraints. Furthermore, distinct failure patterns may exist in the systems, and it is necessary to identify the true failure pattern. This work focuses on the online adaptive monitoring of high-dimensional data in resource-constrained environments with multiple potential failure modes. To achieve this, we propose to apply the Shiryaev-Roberts procedure on the failure mode level and utilize the multi-arm bandit to balance the exploration and exploitation. We further discuss the theoretical property of the proposed algorithm to show that the proposed method can correctly isolate the failure mode. Finally, extensive simulations and two case studies demonstrate that the change point detection performance and the failure mode isolation accuracy can be greatly improved. △ Less

Submitted 25 August, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

Comments: Accepted in Technometrics

arXiv:2208.05045 [pdf, other]

Adaptive Resources Allocation CUSUM for Binomial Count Data Monitoring with Application to COVID-19 Hotspot Detection

Authors: Jiuyun Hu, Yajun Mei, Sarah Holte, Hao Yan

Abstract: In this paper, we present an efficient statistical method (denoted as "Adaptive Resources Allocation CUSUM") to robustly and efficiently detect the hotspot with limited sampling resources. Our main idea is to combine the multi-arm bandit (MAB) and change-point detection methods to balance the exploration and exploitation of resource allocation for hotspot detection. Further, a Bayesian weighted up… ▽ More In this paper, we present an efficient statistical method (denoted as "Adaptive Resources Allocation CUSUM") to robustly and efficiently detect the hotspot with limited sampling resources. Our main idea is to combine the multi-arm bandit (MAB) and change-point detection methods to balance the exploration and exploitation of resource allocation for hotspot detection. Further, a Bayesian weighted update is used to update the posterior distribution of the infection rate. Then, the upper confidence bound (UCB) is used for resource allocation and planning. Finally, CUSUM monitoring statistics to detect the change point as well as the change location. For performance evaluation, we compare the performance of the proposed method with several benchmark methods in the literature and showed the proposed algorithm is able to achieve a lower detection delay and higher detection precision. Finally, this method is applied to hotspot detection in a real case study of county-level daily positive COVID-19 cases in Washington State WA) and demonstrates the effectiveness with very limited distributed samples. △ Less

Submitted 17 August, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

Comments: Accepted in Journal of Applied Statistics

arXiv:2112.04492 [pdf, other]

Daily peak electrical load forecasting with a multi-resolution approach

Authors: Yvenn Amara-Ouali, Matteo Fasiolo, Yannig Goude, Hui Yan

Abstract: In the context of smart grids and load balancing, daily peak load forecasting has become a critical activity for stakeholders of the energy industry. An understanding of peak magnitude and timing is paramount for the implementation of smart grid strategies such as peak shaving. The modelling approach proposed in this paper leverages high-resolution and low-resolution information to forecast daily… ▽ More In the context of smart grids and load balancing, daily peak load forecasting has become a critical activity for stakeholders of the energy industry. An understanding of peak magnitude and timing is paramount for the implementation of smart grid strategies such as peak shaving. The modelling approach proposed in this paper leverages high-resolution and low-resolution information to forecast daily peak demand size and timing. The resulting multi-resolution modelling framework can be adapted to different model classes. The key contributions of this paper are a) a general and formal introduction to the multi-resolution modelling approach, b) a discussion on modelling approaches at different resolutions implemented via Generalised Additive Models and Neural Networks and c) experimental results on real data from the UK electricity market. The results confirm that the predictive performance of the proposed modelling approach is competitive with that of low- and high-resolution alternatives. △ Less

Submitted 8 December, 2021; originally announced December 2021.

arXiv:2105.08180 [pdf, other]

Deep Multistage Multi-Task Learning for Quality Prediction of Multistage Manufacturing Systems

Authors: Hao Yan, Nurretin Dorukhan Sergin, William A. Brenneman, Stephen Joseph Lange, Shan Ba

Abstract: In multistage manufacturing systems, modeling multiple quality indices based on the process sensing variables is important. However, the classic modeling technique predicts each quality variable one at a time, which fails to consider the correlation within or between stages. We propose a deep multistage multi-task learning framework to jointly predict all output sensing variables in a unified end-… ▽ More In multistage manufacturing systems, modeling multiple quality indices based on the process sensing variables is important. However, the classic modeling technique predicts each quality variable one at a time, which fails to consider the correlation within or between stages. We propose a deep multistage multi-task learning framework to jointly predict all output sensing variables in a unified end-to-end learning framework according to the sequential system architecture in the MMS. Our numerical studies and real case study have shown that the new model has a superior performance compared to many benchmark methods as well as great interpretability through developed variable selection techniques. △ Less

Submitted 17 May, 2021; originally announced May 2021.

Comments: Accepted by Journal of Quality Technology

arXiv:2101.06839 [pdf, other]

Adaptive Change Point Monitoring for High-Dimensional Data

Authors: Teng Wu, Runmin Wang, Hao Yan, Xiaofeng Shao

Abstract: In this paper, we propose a class of monitoring statistics for a mean shift in a sequence of high-dimensional observations. Inspired by the recent U-statistic based retrospective tests developed by Wang et al.(2019) and Zhang et al.(2020), we advance the U-statistic based approach to the sequential monitoring problem by develo** a new adaptive monitoring procedure that can detect both dense and… ▽ More In this paper, we propose a class of monitoring statistics for a mean shift in a sequence of high-dimensional observations. Inspired by the recent U-statistic based retrospective tests developed by Wang et al.(2019) and Zhang et al.(2020), we advance the U-statistic based approach to the sequential monitoring problem by develo** a new adaptive monitoring procedure that can detect both dense and sparse changes in real-time. Unlike Wang et al.(2019) and Zhang et al.(2020), where self-normalization was used in their tests, we instead introduce a class of estimators for $q$-norm of the covariance matrix and prove their ratio consistency. To facilitate fast computation, we further develop recursive algorithms to improve the computational efficiency of the monitoring procedure. The advantage of the proposed methodology is demonstrated via simulation studies and real data illustrations. △ Less

Submitted 17 January, 2021; originally announced January 2021.

arXiv:2009.10645 [pdf, other]

Partially Observable Online Change Detection via Smooth-Sparse Decomposition

Authors: Jie Guo, Hao Yan, Chen Zhang, Steven Hoi

Abstract: We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities. On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes. On the other, the scheme should be able… ▽ More We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities. On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes. On the other, the scheme should be able to adaptively and actively select the most important variables to observe to maximize the detection power. To address these two points, in this paper, we propose a novel detection scheme called CDSSD. In particular, it describes the structure of high dimensional data with sparse changes by smooth-sparse decomposition, whose parameters can be learned via spike-slab variational Bayesian inference. Then the posterior Bayes factor, which incorporates the learned parameters and sparse change information, is formulated as a detection statistic. Finally, by formulating the statistic as the reward of a combinatorial multi-armed bandit problem, an adaptive sampling strategy based on Thompson sampling is proposed. The efficacy and applicability of our method in practice are demonstrated with numerical studies and a real case study. △ Less

Submitted 22 September, 2020; originally announced September 2020.

Comments: 48 pages

arXiv:2008.10030 [pdf, other]

doi 10.1109/TPAMI.2020.3014218

Unsupervised Domain Adaptation via Discriminative Manifold Propagation

Authors: You-Wei Luo, Chuan-Xian Ren, Dao-Qing Dai, Hong Yan

Abstract: Unsupervised domain adaptation is effective in leveraging rich information from a labeled source domain to an unlabeled target domain. Though deep learning and adversarial strategy made a significant breakthrough in the adaptability of features, there are two issues to be further studied. First, hard-assigned pseudo labels on the target domain are arbitrary and error-prone, and direct application… ▽ More Unsupervised domain adaptation is effective in leveraging rich information from a labeled source domain to an unlabeled target domain. Though deep learning and adversarial strategy made a significant breakthrough in the adaptability of features, there are two issues to be further studied. First, hard-assigned pseudo labels on the target domain are arbitrary and error-prone, and direct application of them may destroy the intrinsic data structure. Second, batch-wise training of deep learning limits the characterization of the global structure. In this paper, a Riemannian manifold learning framework is proposed to achieve transferability and discriminability simultaneously. For the first issue, this framework establishes a probabilistic discriminant criterion on the target domain via soft labels. Based on pre-built prototypes, this criterion is extended to a global approximation scheme for the second issue. Manifold metric alignment is adopted to be compatible with the embedding space. The theoretical error bounds of different alignment metrics are derived for constructive guidance. The proposed method can be used to tackle a series of variants of domain adaptation problems, including both vanilla and partial settings. Extensive experiments have been conducted to investigate the method and a comparative study shows the superiority of the discriminative manifold learning framework. △ Less

Submitted 23 August, 2020; originally announced August 2020.

Comments: To be published in IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:2007.06136 [pdf, other]

Bayesian Bi-clustering Methods with Applications in Computational Biology

Authors: Han Yan, Jiexing Wu, Yang Li, Jun S. Liu

Abstract: Bi-clustering is a useful approach in analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach in tackling bi-clustering problems in moderate to high dimensions, and propose three Bayesian bi-clustering models on categorical data, which increase in complexities in their modeling of the distributions of fe… ▽ More Bi-clustering is a useful approach in analyzing biological data when observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach in tackling bi-clustering problems in moderate to high dimensions, and propose three Bayesian bi-clustering models on categorical data, which increase in complexities in their modeling of the distributions of features across bi-clusters. Our proposed methods apply to a wide range of scenarios: from situations where data are cluster-distinguishable only among a small subset of features but masked by a large amount of noise, to situations where different groups of data are identified by different sets of features or data exhibit hierarchical structures. Through simulation studies, we show that our methods outperform existing (bi-)clustering methods in both identifying clusters and recovering feature distributional patterns across bi-clusters. We apply our methods to two genetic datasets, though the area of application of our methods is even broader. △ Less

Submitted 9 February, 2021; v1 submitted 12 July, 2020; originally announced July 2020.

arXiv:2004.11710 [pdf, other]

Rapid Detection of Hot-spots via Tensor Decomposition with applications to Crime Rate Data

Authors: Yujie Zhao, Hao Yan, Sarah Holte, Yajun Mei

Abstract: We propose an efficient statistical method (denoted as SSR-Tensor) to robustly and quickly detect hot-spots that are sparse and temporal-consistent in a spatial-temporal dataset through the tensor decomposition. Our main idea is first to build an SSR model to decompose the tensor data into a Smooth global trend mean, Sparse local hot-spots, and Residuals. Next, tensor decomposition is utilized as… ▽ More We propose an efficient statistical method (denoted as SSR-Tensor) to robustly and quickly detect hot-spots that are sparse and temporal-consistent in a spatial-temporal dataset through the tensor decomposition. Our main idea is first to build an SSR model to decompose the tensor data into a Smooth global trend mean, Sparse local hot-spots, and Residuals. Next, tensor decomposition is utilized as follows: bases are introduced to describe within-dimension correlation, and tensor products are used for between-dimension interaction. Then, a combination of LASSO and fused LASSO is used to estimate the model parameters, where an efficient recursive estimation procedure is developed based on the large-scale convex optimization, where we first transform the general LASSO optimization into regular LASSO optimization and apply FISTA to solve it with the fastest convergence rate. Finally, a CUSUM procedure is applied to detect when and where the hot-spot event occurs. We compare the performance of the proposed method in a numerical simulation study and a real-world case study, which contains a dataset including a collection of three types of crime rates for U.S. mainland states during the year 1965-2014. In both cases, the proposed SSR-Tensor is able to achieve the fast detection and accurate localization of the hot-spots. △ Less

Submitted 14 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

Comments: arXiv admin note: text overlap with arXiv:2001.11685

arXiv:2004.11022 [pdf, other]

Long-Short Term Spatiotemporal Tensor Prediction for Passenger Flow Profile

Authors: Ziyue Li, Hao Yan, Chen Zhang, Fugee Tsung

Abstract: Spatiotemporal data is very common in many applications, such as manufacturing systems and transportation systems. It is typically difficult to be accurately predicted given intrinsic complex spatial and temporal correlations. Most of the existing methods based on various statistical models and regularization terms, fail to preserve innate features in data alongside their complex correlations. In… ▽ More Spatiotemporal data is very common in many applications, such as manufacturing systems and transportation systems. It is typically difficult to be accurately predicted given intrinsic complex spatial and temporal correlations. Most of the existing methods based on various statistical models and regularization terms, fail to preserve innate features in data alongside their complex correlations. In this paper, we focus on a tensor-based prediction and propose several practical techniques to improve prediction. For long-term prediction specifically, we propose the "Tensor Decomposition + 2-Dimensional Auto-Regressive Moving Average (2D-ARMA)" model, and an effective way to update prediction real-time; For short-term prediction, we propose to conduct tensor completion based on tensor clustering to avoid oversimplifying and ensure accuracy. A case study based on the metro passenger flow data is conducted to demonstrate the improved performance. △ Less

Submitted 23 April, 2020; originally announced April 2020.

arXiv:2004.10977 [pdf]

Real-time Detection of Clustered Events in Video-imaging data with Applications to Additive Manufacturing

Authors: Hao Yan, Marco Grasso, Kamran Paynabar, Bianca Maria Colosimo

Abstract: The use of video-imaging data for in-line process monitoring applications has become more and more popular in the industry. In this framework, spatio-temporal statistical process monitoring methods are needed to capture the relevant information content and signal possible out-of-control states. Video-imaging data are characterized by a spatio-temporal variability structure that depends on the unde… ▽ More The use of video-imaging data for in-line process monitoring applications has become more and more popular in the industry. In this framework, spatio-temporal statistical process monitoring methods are needed to capture the relevant information content and signal possible out-of-control states. Video-imaging data are characterized by a spatio-temporal variability structure that depends on the underlying phenomenon, and typical out-of-control patterns are related to the events that are localized both in time and space. In this paper, we propose an integrated spatio-temporal decomposition and regression approach for anomaly detection in video-imaging data. Out-of-control events are typically sparse spatially clustered and temporally consistent. Therefore, the goal is to not only detect the anomaly as quickly as possible ("when") but also locate it ("where"). The proposed approach works by decomposing the original spatio-temporal data into random natural events, sparse spatially clustered and temporally consistent anomalous events, and random noise. Recursive estimation procedures for spatio-temporal regression are presented to enable the real-time implementation of the proposed methodology. Finally, a likelihood ratio test procedure is proposed to detect when and where the hotspot happens. The proposed approach was applied to the analysis of video-imaging data to detect and locate local over-heating phenomena ("hotspots") during the layer-wise process in a metal additive manufacturing process. △ Less

Submitted 23 April, 2020; originally announced April 2020.

arXiv:2001.11685 [pdf, ps, other]

Rapid Detection of Hot-spot by Tensor Decomposition with Application to Weekly Gonorrhea Data

Authors: Yujie Zhao, Hao Yan, Sarah E. Holte, Roxanne P. Kerani, Yajun Mei

Abstract: In many bio-surveillance and healthcare applications, data sources are measured from many spatial locations repeatedly over time, say, daily/weekly/monthly. In these applications, we are typically interested in detecting hot-spots, which are defined as some structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method… ▽ More In many bio-surveillance and healthcare applications, data sources are measured from many spatial locations repeatedly over time, say, daily/weekly/monthly. In these applications, we are typically interested in detecting hot-spots, which are defined as some structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method to detect when and where the hot-spots occur. Our proposed methods represent the observed raw data as a three-dimensional tensor including a circular time dimension for daily/weekly/monthly patterns, and then decompose the tensor into three components: smooth global trend, local hot-spots, and residuals. A combination of LASSO and fused LASSO is used to estimate the model parameters, and a CUSUM procedure is applied to detect when and where the hot-spots might occur. The usefulness of our proposed methodology is validated through numerical simulation and a real-world dataset in the weekly number of gonorrhea cases from $2006$ to $2018$ for $50$ states in the United States. △ Less

Submitted 7 April, 2020; v1 submitted 31 January, 2020; originally announced January 2020.

Journal ref: The XIIIth International Workshop on Intelligent Statistical Quality Control, pp 289- 310, Hong Kong, 2019

arXiv:1912.05693 [pdf, other]

Tensor Completion for Weakly-dependent Data on Graph for Metro Passenger Flow Prediction

Authors: Ziyue Li, Nurettin Dorukhan Sergin, Hao Yan, Chen Zhang, Fugee Tsung

Abstract: Low-rank tensor decomposition and completion have attracted significant interest from academia given the ubiquity of tensor data. However, the low-rank structure is a global property, which will not be fulfilled when the data presents complex and weak dependencies given specific graph structures. One particular application that motivates this study is the spatiotemporal data analysis. As shown in… ▽ More Low-rank tensor decomposition and completion have attracted significant interest from academia given the ubiquity of tensor data. However, the low-rank structure is a global property, which will not be fulfilled when the data presents complex and weak dependencies given specific graph structures. One particular application that motivates this study is the spatiotemporal data analysis. As shown in the preliminary study, weakly dependencies can worsen the low-rank tensor completion performance. In this paper, we propose a novel low-rank CANDECOMP / PARAFAC (CP) tensor decomposition and completion framework by introducing the $L_{1}$-norm penalty and Graph Laplacian penalty to model the weakly dependency on graph. We further propose an efficient optimization algorithm based on the Block Coordinate Descent for efficient estimation. A case study based on the metro passenger flow data in Hong Kong is conducted to demonstrate improved performance over the regular tensor completion methods. △ Less

Submitted 11 December, 2019; originally announced December 2019.

Comments: Accepted at AAAI 2020

arXiv:1911.00482 [pdf, other]

doi 10.1080/00224065.2021.1903821

Toward a Better Monitoring Statistic for Profile Monitoring via Variational Autoencoders

Authors: Nurettin Sergin, Hao Yan

Abstract: Wide accessibility of imaging and profile sensors in modern industrial systems created an abundance of high-dimensional sensing variables. This led to a a growing interest in the research of high-dimensional process monitoring. However, most of the approaches in the literature assume the in-control population to lie on a linear manifold with a given basis (i.e., spline, wavelet, kernel, etc) or an… ▽ More Wide accessibility of imaging and profile sensors in modern industrial systems created an abundance of high-dimensional sensing variables. This led to a a growing interest in the research of high-dimensional process monitoring. However, most of the approaches in the literature assume the in-control population to lie on a linear manifold with a given basis (i.e., spline, wavelet, kernel, etc) or an unknown basis (i.e., principal component analysis and its variants), which cannot be used to efficiently model profiles with a nonlinear manifold which is common in many real-life cases. We propose deep probabilistic autoencoders as a viable unsupervised learning approach to model such manifolds. To do so, we formulate nonlinear and probabilistic extensions of the monitoring statistics from classical approaches as the expected reconstruction error (ERE) and the KL-divergence (KLD) based monitoring statistics. Through extensive simulation study, we provide insights on why latent-space based statistics are unreliable and why residual-space based ones typically perform much better for deep learning based approaches. Finally, we demonstrate the superiority of deep probabilistic models via both simulation study and a real-life case study involving images of defects from a hot steel rolling process. △ Less

Submitted 10 August, 2022; v1 submitted 1 November, 2019; originally announced November 2019.

Comments: Journal of Quality Technology 53 (2021) 454-473

arXiv:1910.09979 [pdf, other]

Orthogonal Nonnegative Tucker Decomposition

Authors: Junjun Pan, Michael K. Ng, Ye Liu, Xiongjun Zhang, Hong Yan

Abstract: In this paper, we study the nonnegative tensor data and propose an orthogonal nonnegative Tucker decomposition (ONTD). We discuss some properties of ONTD and develop a convex relaxation algorithm of the augmented Lagrangian function to solve the optimization problem. The convergence of the algorithm is given. We employ ONTD on the image data sets from the real world applications including face rec… ▽ More In this paper, we study the nonnegative tensor data and propose an orthogonal nonnegative Tucker decomposition (ONTD). We discuss some properties of ONTD and develop a convex relaxation algorithm of the augmented Lagrangian function to solve the optimization problem. The convergence of the algorithm is given. We employ ONTD on the image data sets from the real world applications including face recognition, image representation, hyperspectral unmixing. Numerical results are shown to illustrate the effectiveness of the proposed algorithm. △ Less

Submitted 27 October, 2019; v1 submitted 21 October, 2019; originally announced October 2019.

arXiv:1910.05513 [pdf, other]

On Robustness of Neural Ordinary Differential Equations

Authors: Hanshu Yan, Jiawei Du, Vincent Y. F. Tan, Jiashi Feng

Abstract: Neural ordinary differential equations (ODEs) have been attracting increasing attention in various research domains recently. There have been some works studying optimization issues and approximation capabilities of neural ODEs, but their robustness is still yet unclear. In this work, we fill this important gap by exploring robustness properties of neural ODEs both empirically and theoretically. W… ▽ More Neural ordinary differential equations (ODEs) have been attracting increasing attention in various research domains recently. There have been some works studying optimization issues and approximation capabilities of neural ODEs, but their robustness is still yet unclear. In this work, we fill this important gap by exploring robustness properties of neural ODEs both empirically and theoretically. We first present an empirical study on the robustness of the neural ODE-based networks (ODENets) by exposing them to inputs with various types of perturbations and subsequently investigating the changes of the corresponding outputs. In contrast to conventional convolutional neural networks (CNNs), we find that the ODENets are more robust against both random Gaussian perturbations and adversarial attack examples. We then provide an insightful understanding of this phenomenon by exploiting a certain desirable property of the flow of a continuous-time ODE, namely that integral curves are non-intersecting. Our work suggests that, due to their intrinsic robustness, it is promising to use neural ODEs as a basic block for building robust deep network models. To further enhance the robustness of vanilla neural ODEs, we propose the time-invariant steady neural ODE (TisODE), which regularizes the flow on perturbed data via the time-invariant property and the imposition of a steady-state constraint. We show that the TisODE method outperforms vanilla neural ODEs and also can work in conjunction with other state-of-the-art architectural methods to build more robust deep networks. △ Less

Submitted 3 March, 2022; v1 submitted 12 October, 2019; originally announced October 2019.

arXiv:1910.02119 [pdf, other]

AKM$^2$D : An Adaptive Framework for Online Sensing and Anomaly Quantification

Authors: Hao Yan, Kamran Paynabar, Jianjun Shi

Abstract: In point-based sensing systems such as coordinate measuring machines (CMM) and laser ultrasonics where complete sensing is impractical due to the high sensing time and cost, adaptive sensing through a systematic exploration is vital for online inspection and anomaly quantification. Most of the existing sequential sampling methodologies focus on reducing the overall fitting error for the entire sam… ▽ More In point-based sensing systems such as coordinate measuring machines (CMM) and laser ultrasonics where complete sensing is impractical due to the high sensing time and cost, adaptive sensing through a systematic exploration is vital for online inspection and anomaly quantification. Most of the existing sequential sampling methodologies focus on reducing the overall fitting error for the entire sampling space. However, in many anomaly quantification applications, the main goal is to estimate sparse anomalous regions in the pixel-level accurately. In this paper, we develop a novel framework named Adaptive Kernelized Maximum-Minimum Distance AKM$^2$D to speed up the inspection and anomaly detection process through an intelligent sequential sampling scheme integrated with fast estimation and detection. The proposed method balances the sampling efforts between the space-filling sampling (exploration) and focused sampling near the anomalous region (exploitation). The proposed methodology is validated by conducting simulations and a case study of anomaly detection in composite sheets using a guided wave test. △ Less

Submitted 4 October, 2019; originally announced October 2019.

Comments: Under review in IISE Transaction

arXiv:1906.03586 [pdf, other]

Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling

Authors: Hao Peng, Jianxin Li, Hao Yan, Qiran Gong, Senzhang Wang, Lin Liu, Lihong Wang, Xiang Ren

Abstract: Network representation learning, as an approach to learn low dimensional representations of vertices, has attracted considerable research attention recently. It has been proven extremely useful in many machine learning tasks over large graph. Most existing methods focus on learning the structural representations of vertices in a static network, but cannot guarantee an accurate and efficient embedd… ▽ More Network representation learning, as an approach to learn low dimensional representations of vertices, has attracted considerable research attention recently. It has been proven extremely useful in many machine learning tasks over large graph. Most existing methods focus on learning the structural representations of vertices in a static network, but cannot guarantee an accurate and efficient embedding in a dynamic network scenario. To address this issue, we present an efficient incremental skip-gram algorithm with negative sampling for dynamic network embedding, and provide a set of theoretical analyses to characterize the performance guarantee. Specifically, we first partition a dynamic network into the updated, including addition/deletion of links and vertices, and the retained networks over time. Then we factorize the objective function of network embedding into the added, vanished and retained parts of the network. Next we provide a new stochastic gradient-based method, guided by the partitions of the network, to update the nodes and the parameter vectors. The proposed algorithm is proven to yield an objective function value with a bounded difference to that of the original objective function. Experimental results show that our proposal can significantly reduce the training time while preserving the comparable performance. We also demonstrate the correctness of the theoretical analysis and the practical usefulness of the dynamic network embedding. We perform extensive experiments on multiple real-world large network datasets over multi-label classification and link prediction tasks to evaluate the effectiveness and efficiency of the proposed framework, and up to 22 times speedup has been achieved. △ Less

Submitted 9 June, 2019; originally announced June 2019.

Comments: Accepted by China Science Information Science. arXiv admin note: text overlap with arXiv:1811.05932 by other authors

arXiv:1807.10278 [pdf, other]

doi 10.1080/00401706.2018.1529628

Structured Point Cloud Data Analysis via Regularized Tensor Regression for Process Modeling and Optimization

Authors: Hao Yan, Kamran Paynabar, Massimo Pacella

Abstract: Advanced 3D metrology technologies such as Coordinate Measuring Machine (CMM) and laser 3D scanners have facilitated the collection of massive point cloud data, beneficial for process monitoring, control and optimization. However, due to their high dimensionality and structure complexity, modeling and analysis of point clouds are still a challenge. In this paper, we utilize multilinear algebra tec… ▽ More Advanced 3D metrology technologies such as Coordinate Measuring Machine (CMM) and laser 3D scanners have facilitated the collection of massive point cloud data, beneficial for process monitoring, control and optimization. However, due to their high dimensionality and structure complexity, modeling and analysis of point clouds are still a challenge. In this paper, we utilize multilinear algebra techniques and propose a set of tensor regression approaches to model the variational patterns of point clouds and to link them to process variables. The performance of the proposed methods is evaluated through simulations and a real case study of turning process optimization. △ Less

Submitted 1 December, 2018; v1 submitted 25 July, 2018; originally announced July 2018.

Comments: Technometrics, accepted

Journal ref: Technometrics 61.3 (2019): 385-395

arXiv:1804.03797 [pdf, other]

Dynamic Multivariate Functional Data Modeling via Sparse Subspace Learning

Authors: Chen Zhang, Hao Yan, Seungho Lee, Jianjun Shi

Abstract: Multivariate functional data from a complex system are naturally high-dimensional and have complex cross-correlation structure. The complexity of data structure can be observed as that (1) some functions are strongly correlated with similar features, while some others may have almost no cross-correlations with quite diverse features; and (2) the cross-correlation structure may also change over tim… ▽ More Multivariate functional data from a complex system are naturally high-dimensional and have complex cross-correlation structure. The complexity of data structure can be observed as that (1) some functions are strongly correlated with similar features, while some others may have almost no cross-correlations with quite diverse features; and (2) the cross-correlation structure may also change over time due to the system evolution. With this regard, this paper presents a dynamic subspace learning method for multivariate functional data modeling. In particular, we consider different functions come from different subspaces, and only functions of the same subspace have cross-correlations with each other. The subspaces can be automatically formulated and learned by reformatting the problem as a sparse regression. By allowing but regularizing the regression change over time, we can describe the cross-correlation dynamics. The model can be efficiently estimated by the fast iterative shrinkage-thresholding algorithm (FISTA), and the features of every subspace can be extracted using the smooth multi-channel functional PCA. Numerical studies together with case studies demonstrate the efficiency and applicability of the proposed methodology. △ Less

Submitted 10 April, 2018; originally announced April 2018.

arXiv:1804.03346 [pdf, other]

Learning Latent Events from Network Message Logs

Authors: Siddhartha Satpathi, Supratim Deb, R Srikant, He Yan

Abstract: We consider the problem of separating error messages generated in large distributed data center networks into error events. In such networks, each error event leads to a stream of messages generated by hardware and software components affected by the event. These messages are stored in a giant message log. We consider the unsupervised learning problem of identifying the signatures of events that g… ▽ More We consider the problem of separating error messages generated in large distributed data center networks into error events. In such networks, each error event leads to a stream of messages generated by hardware and software components affected by the event. These messages are stored in a giant message log. We consider the unsupervised learning problem of identifying the signatures of events that generated these messages; here, the signature of an error event refers to the mixture of messages generated by the event. One of the main contributions of the paper is a novel map** of our problem which transforms it into a problem of topic discovery in documents. Events in our problem correspond to topics and messages in our problem correspond to words in the topic discovery problem. However, there is no direct analog of documents. Therefore, we use a non-parametric change-point detection algorithm, which has linear computational complexity in the number of messages, to divide the message log into smaller subsets called episodes, which serve as the equivalents of documents. After this map** has been done, we use a well-known algorithm for topic discovery, called LDA, to solve our problem. We theoretically analyze the change-point detection algorithm, and show that it is consistent and has low sample complexity. We also demonstrate the scalability of our algorithm on a real data set consisting of $97$ million messages collected over a period of $15$ days, from a distributed data center network which supports the operations of a large wireless service provider. △ Less

Submitted 17 July, 2019; v1 submitted 10 April, 2018; originally announced April 2018.

Comments: To Appear in IEEE Transactions on Networking, Appeared in Workshop on MiLeTS, SIGKDD 2018

arXiv:1803.00138 [pdf, ps, other]

doi 10.1080/00401706.2019.1708463

A novel approach for fusion of heterogeneous sources of data

Authors: Mostafa Reisi Gahrooei, Hao Yan, Kamran Paynabar, Jianjun Shi

Abstract: With advancements in sensor technology, a heterogeneous set of data, containing samples of scalar, waveform signal, image, or even structured point cloud are becoming increasingly popular. Develo** a statistical model, representing the behavior of the underlying system based upon such a heterogeneous set of data can be used in monitoring, control, and optimization of the system. Unfortunately, a… ▽ More With advancements in sensor technology, a heterogeneous set of data, containing samples of scalar, waveform signal, image, or even structured point cloud are becoming increasingly popular. Develo** a statistical model, representing the behavior of the underlying system based upon such a heterogeneous set of data can be used in monitoring, control, and optimization of the system. Unfortunately, available methods only focus on the scalar and curve data and do not provide a general framework that can integrate different sources of data to construct a model. This paper poses the problem of estimating a process output, measured by a scalar, curve, an image, or a point cloud by a set of heterogeneous process variables such as scalar process setting, sensor readings, and images. We introduce a general approach in which each set of input data (predictor) as well as the output measurements are represented by tensors. We formulate a linear regression model between the input and output tensors and estimate the parameters by minimizing a least square loss function. In order to avoid overfitting and to reduce the number of parameters to be estimated, we decompose the model parameters using several bases, spanning the input and output spaces. Next, we learn both the bases and their spanning coefficients when minimizing the loss function using an alternating least square (ALS) algorithm. We show that such a minimization has a closed-form solution in each iteration and can be computed very efficiently. Through several simulation and case studies, we evaluate the performance of the proposed method. The results reveal the advantage of the proposed method over some benchmarks in the literature in terms of the mean square prediction error. △ Less

Submitted 19 April, 2018; v1 submitted 28 February, 2018; originally announced March 2018.

Journal ref: Technometrics, 2020

arXiv:1612.07439 [pdf, other]

Estimating fiber orientation distribution from diffusion MRI with spherical needlets

Authors: Hao Yan, Owen Carmichael, Debashis Paul, Jie Peng

Abstract: We present a novel method for estimation of the fiber orientation distribution (FOD) function based on diffusion-weighted Magnetic Resonance Imaging (D-MRI) data. We formulate the problem of FOD estimation as a regression problem through spherical deconvolution and a sparse representation of the FOD by a spherical needlets basis that form a multi-resolution tight frame for spherical functions. Thi… ▽ More We present a novel method for estimation of the fiber orientation distribution (FOD) function based on diffusion-weighted Magnetic Resonance Imaging (D-MRI) data. We formulate the problem of FOD estimation as a regression problem through spherical deconvolution and a sparse representation of the FOD by a spherical needlets basis that form a multi-resolution tight frame for spherical functions. This sparse representation allows us to estimate FOD by an $l_1$-penalized regression under a non-negativity constraint. The resulting convex optimization problem is solved by an alternating direction method of multipliers (ADMM) algorithm. The proposed method leads to a reconstruction of the FODs that is accurate, has low variability and preserves sharp features. Through extensive experiments, we demonstrate the effectiveness and favorable performance of the proposed method compared with two existing methods. Particularly, we show the ability of the proposed method in successfully resolving fiber crossing at small angles and in automatically identifying isotropic diffusion. We also apply the proposed method to real 3T D-MRI data sets of healthy elderly individuals. The results show realistic descriptions of crossing fibers that are more accurate and less noisy than competing methods even with a relatively small number of gradient directions. △ Less

Submitted 21 December, 2016; originally announced December 2016.

arXiv:1110.3257 [pdf, other]

Modelling the impact of human activity on nitrogen dioxide concentrations in Europe

Authors: Gavin Shaddick, Haojie Yan, Danielle Vienneau

Abstract: Ambient concentrations of many pollutants are associated with emissions due to human activity, such as road transport and other combustion sources. In this paper we consider air pollution as a multi--level phenomenon within a Bayesian hierarchical model. We examine different scales of variation in pollution concentrations ranging from large scale transboundary effects to more localised effects whi… ▽ More Ambient concentrations of many pollutants are associated with emissions due to human activity, such as road transport and other combustion sources. In this paper we consider air pollution as a multi--level phenomenon within a Bayesian hierarchical model. We examine different scales of variation in pollution concentrations ranging from large scale transboundary effects to more localised effects which are directly related to human activity. Specifically, in the first stage of the model, we isolate underlying patterns in pollution concentrations due to global factors such as underlying climate and topography, which are modelled together with spatial structure. At this stage measurements from monitoring sites located within rural areas are used which, as far as possible, are chosen to reflect background concentrations. Having isolated these global effects, in the second stage we assess the effects of human activity on pollution in urban areas. The proposed model was applied to concentrations of nitrogen dioxide measured throughout the EU for which significant increases are found to be associated with human activity in urban areas. The approach proposed here provides valuable information that could be used in performing health impact assessments and to inform policy. △ Less

Submitted 14 October, 2011; originally announced October 2011.

Comments: 22 pages, 5 figures

Showing 1–32 of 32 results for author: Yan, H