-
Boosted Conformal Prediction Intervals
Authors:
Ran Xie,
Rina Foygel Barber,
Emmanuel J. Candès
Abstract:
This paper introduces a boosted conformal procedure designed to tailor conformalized prediction intervals toward specific desired properties, such as enhanced conditional coverage or reduced interval length. We employ machine learning techniques, notably gradient boosting, to systematically improve upon a predefined conformity score function. This process is guided by carefully constructed loss functions that measure the deviation of prediction intervals from the targeted properties. The procedure operates post-training, relying solely on model predictions and without modifying the trained model (e.g., the deep network). Systematic experiments demonstrate that starting from conventional conformal methods, our boosted procedure achieves substantial improvements in reducing interval length and decreasing deviation from target conditional coverage.
Submitted 11 June, 2024;
originally announced June 2024.
-
MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts
Authors:
Renchunzi Xie,
Ambroise Odonnat,
Vasilii Feofanov,
Weijian Deng,
Jianfeng Zhang,
Bo An
Abstract:
Leveraging the models' outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution (OOD) samples without requiring access to the corresponding ground truth labels. Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence issues, leading to prediction bias, especially under the natural shift. In this work, we first study the relationship between logits and generalization performance from the view of low-density separation assumption. Our findings motivate our proposed method MaNo which (1) applies a data-dependent normalization on the logits to reduce prediction bias, and (2) takes the $L_p$ norm of the matrix of normalized logits as the estimation score. Our theoretical analysis highlights the connection between the provided score and the model's uncertainty. We conduct an extensive empirical study on common unsupervised accuracy estimation benchmarks and demonstrate that MaNo achieves state-of-the-art performance across various architectures in the presence of synthetic, natural, or subpopulation shifts.
Submitted 24 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Federated Graph Learning for EV Charging Demand Forecasting with Personalization Against Cyberattacks
Authors:
Yi Li,
Renyou Xie,
Chaojie Li,
Yi Wang,
Zhaoyang Dong
Abstract:
Mitigating cybersecurity risk in electric vehicle (EV) charging demand forecasting plays a crucial role in the safe operation of collective EV charging, the stability of the power grid, and cost-effective infrastructure expansion. However, existing methods either suffer from data privacy issues and susceptibility to cyberattacks or fail to consider the spatial correlation among different stations. To address these challenges, a federated graph learning approach involving multiple charging stations is proposed to collaboratively train a more generalized deep learning model for demand forecasting while capturing spatial correlations among various stations and enhancing robustness against potential attacks. Firstly, for better model performance, a Graph Neural Network (GNN) model is leveraged to characterize the geographic correlation among different charging stations in a federated manner. Secondly, to ensure robustness and deal with the data heterogeneity in a federated setting, a message-passing scheme that utilizes a global attention mechanism to aggregate personalized models for each client is proposed. Thirdly, to address cyberattacks, a special credit-based function is designed to mitigate potential threats from malicious clients or unwanted attacks. Extensive experiments on a public EV charging dataset are conducted using various deep learning techniques and federated learning methods to demonstrate the prediction accuracy and robustness of the proposed approach.
Submitted 30 April, 2024;
originally announced May 2024.
-
Knockoff-Guided Feature Selection via A Single Pre-trained Reinforced Agent
Authors:
Xinyuan Wang,
Dongjie Wang,
Wangyang Ying,
Rui Xie,
Haifeng Chen,
Yanjie Fu
Abstract:
Feature selection prepares the AI-readiness of data by eliminating redundant features. Prior research falls into two primary categories: i) Supervised Feature Selection (SFS), which identifies the optimal feature subset based on their relevance to the target variable; ii) Unsupervised Feature Selection (UFS), which reduces the feature space dimensionality by capturing the essential information within the feature set instead of using the target variable. However, SFS approaches suffer from time-consuming processes and limited generalizability due to the dependence on the target variable and downstream ML tasks. UFS methods are constrained because the deduced feature space is latent and untraceable. To address these challenges, we introduce an innovative framework for feature selection, which is guided by knockoff features and optimized through reinforcement learning, to identify the optimal and effective feature subset. In detail, our method involves generating "knockoff" features that replicate the distribution and characteristics of the original features but are independent of the target variable. Each feature is then assigned a pseudo label based on its correlation with all the knockoff features, serving as a novel metric for feature evaluation. Our approach utilizes these pseudo labels to guide the feature selection process in 3 novel ways, optimized by a single reinforced agent: 1) A deep Q-network, pre-trained with the original features and their corresponding pseudo labels, is employed to improve the efficacy of the exploration process in feature selection. 2) We introduce unsupervised rewards to evaluate the feature subset quality based on the pseudo labels and the feature space reconstruction loss to reduce dependencies on the target variable. 3) A new ε-greedy strategy is used, incorporating insights from the pseudo labels to make the feature selection process more effective.
Submitted 6 March, 2024;
originally announced March 2024.
-
Smoothing spline analysis of variance models: A new tool for the analysis of accelerometer data
Authors:
Rui Xie,
Lulu Chen,
Joon-Hyuk Park,
Jeffrey Stout,
Ladda Thiamwong
Abstract:
Accelerometer data is commonplace in physical activity research, exercise science, and public health studies, where the goal is to understand and compare physical activity differences between groups and/or subject populations, and to identify patterns and trends in physical activity behavior to inform interventions for improving public health. We propose using mixed-effects smoothing spline analysis of variance (SSANOVA) as a new tool for analyzing accelerometer data. By representing data as functions or curves, smoothing spline allows for accurate modeling of the underlying physical activity patterns throughout the day, especially when the accelerometer data is continuous and sampled at high frequency. The SSANOVA framework makes it possible to decompose the estimated function into the portion that is common across groups (i.e., the average activity) and the portion that differs across groups. By decomposing the function of physical activity measurements in such a manner, we can estimate group differences and identify the regions of difference. In this study, we demonstrate the advantages of utilizing SSANOVA models to analyze accelerometer-based physical activity data collected from community-dwelling older adults across various fall risk categories. Using Bayesian confidence intervals, the SSANOVA results can be used to reliably quantify physical activity differences between fall risk groups and identify the time regions that differ throughout the day.
Submitted 12 June, 2023;
originally announced June 2023.
-
Optimal Sampling Designs for Multi-dimensional Streaming Time Series with Application to Power Grid Sensor Data
Authors:
Rui Xie,
Shuyang Bai,
Ping Ma
Abstract:
The Internet of Things (IoT) system generates massive high-speed temporally correlated streaming data and is often connected with online inference tasks under computational or energy constraints. Online analysis of these streaming time series data often faces a trade-off between statistical efficiency and computational cost. One important approach to balance this trade-off is sampling, where only a small portion of the sample is selected for the model fitting and update. Motivated by the demands of dynamic relationship analysis of IoT system, we study the data-dependent sample selection and online inference problem for a multi-dimensional streaming time series, aiming to provide low-cost real-time analysis of high-speed power grid electricity consumption data. Inspired by D-optimality criterion in design of experiments, we propose a class of online data reduction methods that achieve an optimal sampling criterion and improve the computational efficiency of the online analysis. We show that the optimal solution amounts to a strategy that is a mixture of Bernoulli sampling and leverage score sampling. The leverage score sampling involves auxiliary estimations that have a computational advantage over recursive least squares updates. Theoretical properties of the auxiliary estimations involved are also discussed. When applied to European power grid consumption data, the proposed leverage score based sampling methods outperform the benchmark sampling method in online estimation and prediction. The general applicability of the sampling-assisted online estimation method is assessed via simulation studies.
Submitted 14 March, 2023;
originally announced March 2023.
-
A random energy approach to deep learning
Authors:
Rongrong Xie,
Matteo Marsili
Abstract:
We study a generic ensemble of deep belief networks which is parametrized by the distribution of energy levels of the hidden states of each layer. We show that, within a random energy approach, statistical dependence can propagate from the visible to deep layers only if each layer is tuned close to the critical point during learning. As a consequence, efficiently trained learning machines are characterised by a broad distribution of energy levels. The analysis of Deep Belief Networks and Restricted Boltzmann Machines on different datasets confirms these conclusions.
Submitted 17 December, 2021;
originally announced December 2021.
-
Model-based Sparse Coding beyond Gaussian Independent Model
Authors:
Xin Xing,
Rui Xie,
Wenxuan Zhong
Abstract:
Sparse coding aims to model data vectors as sparse linear combinations of basis elements, but a majority of related studies are restricted to continuous data without spatial or temporal structure. A new model-based sparse coding (MSC) method is proposed to provide an effective and flexible framework for learning features from different data types: continuous, discrete, or categorical, and modeling different types of correlations: spatial or temporal. The specification of the sparsity level and how to adapt the estimation method to large-scale studies are also addressed. A fast EM algorithm is proposed for estimation, and its superior performance is demonstrated in simulation and multiple real applications such as image denoising, brain connectivity study, and spatial transcriptomic imaging.
Submitted 22 August, 2021;
originally announced August 2021.
-
LowCon: A design-based subsampling approach in a misspecified linear model
Authors:
Cheng Meng,
Rui Xie,
Abhyuday Mandal,
Xinlian Zhang,
Wenxuan Zhong,
Ping Ma
Abstract:
We consider a measurement-constrained supervised learning problem, that is, (1) the full sample of predictors is given; (2) the response observations are unavailable and expensive to measure. Thus, it is ideal to select a subsample of predictor observations, measure the corresponding responses, and then fit the supervised learning model on the subsample of the predictors and responses. However, model fitting is a trial-and-error process, and a postulated model for the data could be misspecified. Our empirical studies demonstrate that most of the existing subsampling methods perform unsatisfactorily when the models are misspecified. In this paper, we develop a novel subsampling method, called "LowCon", which outperforms the competing methods when the working linear model is misspecified. Our method uses orthogonal Latin hypercube designs to achieve a robust estimation. We show that the proposed design-based estimator approximately minimizes the so-called "worst-case" bias with respect to many possible misspecification terms. Both the simulated and real-data analyses demonstrate that the proposed estimator is more robust than several subsample least squares estimators obtained by state-of-the-art subsampling methods.
Submitted 23 October, 2020;
originally announced October 2020.
-
ATL: Autonomous Knowledge Transfer from Many Streaming Processes
Authors:
Mahardhika Pratama,
Marcus de Carvalho,
Renchunzi Xie,
Edwin Lughofer,
Jie Lu
Abstract:
Transferring knowledge across many streaming processes remains uncharted territory in the existing literature and features unique characteristics: no labelled instances in the target domain, covariate shift between the source and target domains, and different drift periods in the source and target domains. Autonomous transfer learning (ATL) is proposed in this paper as a flexible deep learning approach for the online unsupervised transfer learning problem across many streaming processes. ATL offers an online domain adaptation strategy via the generative and discriminative phases coupled with a KL-divergence-based optimization strategy to produce a domain-invariant network while putting forward an elastic network structure. It automatically evolves its network structure from scratch with or without the presence of ground truth to overcome independent concept drifts in the source and target domains. A rigorous numerical evaluation has been conducted along with a comparison against recently published works. ATL demonstrates improved performance while showing significantly faster training speed than its counterparts.
Submitted 19 October, 2019; v1 submitted 8 October, 2019;
originally announced October 2019.
-
They may look and look, yet not see: BMDs cannot be tested adequately
Authors:
Philip B. Stark,
Ran Xie
Abstract:
Bugs, misconfiguration, and malware can cause ballot-marking devices (BMDs) to print incorrect votes. Several approaches to testing BMDs have been proposed. In logic and accuracy testing (LAT) and parallel or live testing, auditors input known test votes into the BMD and check the printout. Passive testing monitors the rate of "spoiled" BMD printout, on the theory that if BMDs malfunction, the rate will increase noticeably. We show that these approaches cannot reliably detect outcome-altering problems, because: (i) The number of possible interactions with BMDs is enormous, so testing interactions uniformly at random is hopeless. (ii) To probe the space of interactions intelligently requires an accurate model of voter behavior, but because the space of interactions is so large, building an accurate model requires observing a huge number of voters in every jurisdiction in every election--more voters than there are in most jurisdictions. (iii) Even with a perfect model of voter behavior, the number of tests needed exceeds the number of voters in most jurisdictions. (iv) An attacker can target interactions that are expensive to test, e.g., because they involve voting slowly; or interactions for which tampering is less likely to be noticed, e.g., because the voter uses the audio interface. (v) Whether BMDs misbehave or not, the distribution of spoiled ballots is unknown and varies by election and possibly by ballot style: historical data do not help much. Hence, there is no way to calibrate a threshold for passive testing, e.g., to guarantee at least a 95% chance of noticing that 5% of the votes were altered, with at most a 5% false alarm rate. (vi) Even if the distribution of spoiled ballots were known to be Poisson, the vast majority of jurisdictions do not have enough voters for passive testing to have a large chance of detecting problems but only a small chance of false alarms.
Submitted 25 July, 2022; v1 submitted 21 August, 2019;
originally announced August 2019.