-
Dynamic Exploration-Exploitation Trade-Off in Active Learning Regression with Bayesian Hierarchical Modeling
Authors:
Upala Junaida Islam,
Kamran Paynabar,
George Runger,
Ashif Sikandar Iquebal
Abstract:
Active learning provides a framework to adaptively query the most informative experiments towards learning an unknown black-box function. Various approaches to active learning have been proposed in the literature; however, they focus on either exploration or exploitation in the design space. Methods that do consider exploration and exploitation simultaneously employ fixed or ad-hoc measures to control the trade-off that may not be optimal. In this paper, we develop a Bayesian hierarchical approach, referred to as BHEEM, to dynamically balance the exploration-exploitation trade-off as more data points are queried. To sample from the posterior distribution of the trade-off parameter, we subsequently formulate an approximate Bayesian computation approach based on the linear dependence of queried data in the feature space. Simulated and real-world examples show the proposed approach achieves at least 21% and 11% average improvement when compared to pure exploration and exploitation strategies respectively. More importantly, we note that by optimally balancing the trade-off between exploration and exploitation, BHEEM performs better than, or at least as well as, either pure exploration or pure exploitation.
Submitted 30 September, 2023; v1 submitted 15 April, 2023;
originally announced April 2023.
-
A Novel Two-level Causal Inference Framework for On-road Vehicle Quality Issues Diagnosis
Authors:
Qian Wang,
Huanyi Shui,
Thi Tu Trinh Tran,
Milad Zafar Nezhad,
Devesh Upadhyay,
Kamran Paynabar,
Anqi He
Abstract:
In the automotive industry, the full cycle of managing in-use vehicle quality issues can take weeks to investigate. The process involves isolating root causes, defining and implementing appropriate treatments, and refining treatments if needed. The main pain point is the lack of a systematic method to identify causal relationships, evaluate treatment effectiveness, and direct the next actionable treatment if the current treatment is deemed ineffective. This paper shows how we leverage causal Machine Learning (ML) to speed up such processes. A real-world data set collected from on-road vehicles is used to demonstrate the proposed framework. Open challenges for vehicle quality applications are also discussed.
Submitted 31 March, 2023;
originally announced April 2023.
-
Maximum Covariance Unfolding Regression: A Novel Covariate-based Manifold Learning Approach for Point Cloud Data
Authors:
Qian Wang,
Kamran Paynabar
Abstract:
Point cloud data are widely used in manufacturing applications for process inspection, modeling, monitoring and optimization. State-of-the-art tensor regression techniques have been used effectively for the analysis of structured point cloud data, where the measurements on a uniform grid can be formed into a tensor. However, these techniques are not capable of handling unstructured point cloud data that are often in the form of manifolds. In this paper, we propose a nonlinear dimension reduction approach named Maximum Covariance Unfolding Regression that is able to learn the low-dimensional (LD) manifold of point clouds with the highest correlation with explanatory covariates. This LD manifold is then used for regression modeling and process optimization based on process variables. The performance of the proposed method is subsequently evaluated and compared with benchmark methods through simulations and a case study of steel bracket manufacturing.
Submitted 31 March, 2023;
originally announced March 2023.
-
Multi-Objective Allocation of COVID-19 Testing Centers: Improving Coverage and Equity in Access
Authors:
Zhen Zhong,
Ribhu Sengupta,
Kamran Paynabar,
Lance A. Waller
Abstract:
At the time of this article, COVID-19 has been transmitted to more than 42 million people and resulted in more than 673,000 deaths across the United States. Throughout this pandemic, public health authorities have monitored the results of diagnostic testing to identify hotspots of transmission. Such information can help reduce or block transmission paths of COVID-19 and help infected patients receive early treatment. However, most current schemes of test site allocation have been based on experience or convenience, often resulting in low efficiency and non-optimal allocation. In addition, the historical sociodemographic patterns of populations within cities can result in measurable inequities in access to testing between various racial and income groups. To address these pressing issues, we propose a novel test site allocation scheme to (a) maximize population coverage, (b) minimize prediction uncertainties associated with projections of outbreak trajectories, and (c) reduce inequities in access. We illustrate our approach with case studies comparing our allocation scheme with the recorded allocation of testing sites in Georgia, revealing both increases in population coverage and improvements in equity of access over current practice.
Submitted 20 September, 2021;
originally announced October 2021.
-
An Online Approach to Cyberattack Detection and Localization in Smart Grid
Authors:
Dan Li,
Nagi Gebraeel,
Kamran Paynabar,
A. P. Sakis Meliopoulos
Abstract:
Complex interconnections between information technology and digital control systems have significantly increased cybersecurity vulnerabilities in smart grids. Cyberattacks involving data integrity can be very disruptive because of their potential to compromise physical control by manipulating measurement data. This is especially true in large and complex electric networks that often rely on traditional intrusion detection systems focused on monitoring network traffic. In this paper, we develop an online detection algorithm to detect and localize covert attacks on smart grids. Using a network system model, we develop a theoretical framework by characterizing a covert attack on a generator bus in the network as sparse features in the state-estimation residuals. We leverage such sparsity via a regularized linear regression method to detect and localize covert attacks based on the regression coefficients. We conduct a comprehensive numerical study on both linear and nonlinear system models to validate our proposed method. The results show that our method outperforms conventional methods in both detection delay and localization accuracy.
Submitted 22 February, 2021;
originally announced February 2021.
-
Deep Learning based Covert Attack Identification for Industrial Control Systems
Authors:
Dan Li,
Paritosh Ramanan,
Nagi Gebraeel,
Kamran Paynabar
Abstract:
Cybersecurity of Industrial Control Systems (ICS) is drawing significant concern as data communication increasingly leverages wireless networks. Many data-driven methods have been developed for detecting cyberattacks, but few focus on distinguishing them from equipment faults. In this paper, we develop a data-driven framework that can be used to detect, diagnose, and localize a type of cyberattack called covert attacks on smart grids. The framework has a hybrid design that combines an autoencoder, a recurrent neural network (RNN) with a Long Short-Term Memory (LSTM) layer, and a Deep Neural Network (DNN). This data-driven framework considers the temporal behavior of a generic physical system and extracts features from the time series of the sensor measurements that can be used for detecting covert attacks, distinguishing them from equipment faults, and localizing the attack/fault. We evaluate the performance of the proposed method through a realistic simulation study on the IEEE 14-bus model as a typical example of ICS. We compare the performance of the proposed method with the traditional model-based method to show its applicability and efficacy.
Submitted 25 September, 2020;
originally announced September 2020.
-
Real-time Detection of Clustered Events in Video-imaging data with Applications to Additive Manufacturing
Authors:
Hao Yan,
Marco Grasso,
Kamran Paynabar,
Bianca Maria Colosimo
Abstract:
The use of video-imaging data for in-line process monitoring applications has become increasingly popular in industry. In this framework, spatio-temporal statistical process monitoring methods are needed to capture the relevant information content and signal possible out-of-control states. Video-imaging data are characterized by a spatio-temporal variability structure that depends on the underlying phenomenon, and typical out-of-control patterns are related to events that are localized both in time and space. In this paper, we propose an integrated spatio-temporal decomposition and regression approach for anomaly detection in video-imaging data. Out-of-control events are typically sparse, spatially clustered, and temporally consistent. Therefore, the goal is not only to detect the anomaly as quickly as possible ("when") but also to locate it ("where"). The proposed approach works by decomposing the original spatio-temporal data into random natural events; sparse, spatially clustered, and temporally consistent anomalous events; and random noise. Recursive estimation procedures for spatio-temporal regression are presented to enable the real-time implementation of the proposed methodology. Finally, a likelihood ratio test procedure is proposed to detect when and where the hotspot happens. The proposed approach was applied to the analysis of video-imaging data to detect and locate local over-heating phenomena ("hotspots") during the layer-wise process in metal additive manufacturing.
Submitted 23 April, 2020;
originally announced April 2020.
-
AKM$^2$D: An Adaptive Framework for Online Sensing and Anomaly Quantification
Authors:
Hao Yan,
Kamran Paynabar,
Jianjun Shi
Abstract:
In point-based sensing systems such as coordinate measuring machines (CMM) and laser ultrasonics, where complete sensing is impractical due to the high sensing time and cost, adaptive sensing through systematic exploration is vital for online inspection and anomaly quantification. Most of the existing sequential sampling methodologies focus on reducing the overall fitting error for the entire sampling space. However, in many anomaly quantification applications, the main goal is to estimate sparse anomalous regions accurately at the pixel level. In this paper, we develop a novel framework named Adaptive Kernelized Maximum-Minimum Distance (AKM$^2$D) to speed up the inspection and anomaly detection process through an intelligent sequential sampling scheme integrated with fast estimation and detection. The proposed method balances the sampling effort between space-filling sampling (exploration) and focused sampling near the anomalous region (exploitation). The proposed methodology is validated by conducting simulations and a case study of anomaly detection in composite sheets using a guided wave test.
Submitted 4 October, 2019;
originally announced October 2019.
-
Large Multistream Data Analytics for Monitoring and Diagnostics in Manufacturing Systems
Authors:
Samaneh Ebrahimi,
Chitta Ranjan,
Kamran Paynabar
Abstract:
The high dimensionality and volume of large-scale multistream data have inhibited significant research progress in developing an integrated monitoring and diagnostics (M&D) approach. This data, also categorized as big data, is becoming common in manufacturing plants. In this paper, we propose an integrated M&D approach for large-scale streaming data. We develop a novel monitoring method named Adaptive Principal Component monitoring (APC), which adaptively chooses the PCs that are most likely to vary due to the change, for early detection. Importantly, we integrate a novel diagnostic approach, Principal Component Signal Recovery (PCSR), to enable a streamlined SPC. This diagnostic approach draws inspiration from Compressed Sensing and uses Adaptive Lasso for identifying the sparse change in the process. We theoretically motivate our approaches and evaluate the performance of our integrated M&D method through simulations and case studies.
Submitted 26 December, 2018;
originally announced December 2018.
-
Dataset: Rare Event Classification in Multivariate Time Series
Authors:
Chitta Ranjan,
Mahendranath Reddy,
Markku Mustonen,
Kamran Paynabar,
Karim Pourak
Abstract:
A real-world dataset is provided from a pulp-and-paper manufacturing industry. The dataset comes from a multivariate time series process. The data contains a rare event of paper break that commonly occurs in the industry. The data contains sensor readings at regular time-intervals (x's) and the event label (y). The primary purpose of the data is thought to be building a classification model for early prediction of the rare event. However, it can also be used for multivariate time series data exploration and building other supervised and unsupervised models.
Submitted 31 May, 2019; v1 submitted 27 September, 2018;
originally announced September 2018.
-
Structured Point Cloud Data Analysis via Regularized Tensor Regression for Process Modeling and Optimization
Authors:
Hao Yan,
Kamran Paynabar,
Massimo Pacella
Abstract:
Advanced 3D metrology technologies such as Coordinate Measuring Machines (CMM) and laser 3D scanners have facilitated the collection of massive point cloud data, beneficial for process monitoring, control and optimization. However, due to their high dimensionality and structural complexity, modeling and analysis of point clouds are still a challenge. In this paper, we utilize multilinear algebra techniques and propose a set of tensor regression approaches to model the variational patterns of point clouds and to link them to process variables. The performance of the proposed methods is evaluated through simulations and a real case study of turning process optimization.
Submitted 1 December, 2018; v1 submitted 25 July, 2018;
originally announced July 2018.
-
Sequence Graph Transform (SGT): A Feature Embedding Function for Sequence Data Mining
Authors:
Chitta Ranjan,
Samaneh Ebrahimi,
Kamran Paynabar
Abstract:
Sequence feature embedding is a challenging task due to the unstructuredness of sequences, i.e., arbitrary strings of arbitrary length. Existing methods are efficient in extracting short-term dependencies but typically suffer from computation issues for long-term ones. Sequence Graph Transform (SGT), a feature embedding function that can extract a varying amount of short- to long-term dependencies without increasing the computation, is proposed. SGT's properties are analytically proved for interpretation under normal and uniform distribution assumptions. SGT features yield significantly superior results in sequence clustering and classification, with higher accuracy and lower computation compared to existing methods, including the state-of-the-art sequence/string kernels and LSTM.
Submitted 4 October, 2021; v1 submitted 11 August, 2016;
originally announced August 2016.
-
An overview and perspective on social network monitoring
Authors:
William H. Woodall,
Meng J. Zhao,
Kamran Paynabar,
Ross Sparks,
James D. Wilson
Abstract:
In this expository paper we give an overview of some statistical methods for the monitoring of social networks. We discuss the advantages and limitations of various methods as well as some relevant issues. One of our primary contributions is to give the relationships between network monitoring methods and monitoring methods in engineering statistics and public health surveillance. We encourage researchers in the industrial process monitoring area to work on developing and comparing the performance of social network monitoring methods. We also discuss some of the issues in social network monitoring and give a number of research ideas.
Submitted 31 March, 2016;
originally announced March 2016.