-
Bayesian Spillover Graphs for Dynamic Networks
Authors:
Grace Deng,
David S. Matteson
Abstract:
We present Bayesian Spillover Graphs (BSG), a novel method for learning temporal relationships, identifying critical nodes, and quantifying uncertainty for multi-horizon spillover effects in a dynamic system. BSG combines an interpretable framework based on forecast error variance decompositions (FEVD) with comprehensive uncertainty quantification from Bayesian time series models to contextualize temporal relationships in terms of systemic risk and prediction variability. The forecast horizon hyperparameter $h$ allows for learning both short-term and equilibrium-state network behaviors. Experiments on identifying source and sink nodes under various graph and error specifications show significant performance gains over state-of-the-art Bayesian network and deep-learning baselines. Applications to real-world systems also showcase BSG as an exploratory analysis tool for uncovering indirect spillovers and quantifying systemic risk.
Submitted 16 June, 2022; v1 submitted 3 March, 2022;
originally announced March 2022.
-
IB-GAN: A Unified Approach for Multivariate Time Series Classification under Class Imbalance
Authors:
Grace Deng,
Cuize Han,
Tommaso Dreossi,
Clarence Lee,
David S. Matteson
Abstract:
Classification of large multivariate time series with strong class imbalance is an important task in real-world applications. Standard methods such as class weights, oversampling, or parametric data augmentation do not always yield significant improvements for predicting minority classes of interest. Non-parametric data augmentation with Generative Adversarial Networks (GANs) offers a promising solution. We propose Imputation Balanced GAN (IB-GAN), a novel method that joins data augmentation and classification in a one-step process via an imputation-balancing approach. IB-GAN uses imputation and resampling techniques to generate higher-quality samples from randomly masked vectors than from white noise, and augments classification through a class-balanced set of real and synthetic samples. The imputation hyperparameter $p_{miss}$ allows for regularization of classifier variability by tuning the innovations introduced via generator imputation. IB-GAN is simple to train and model-agnostic, pairing any deep learning classifier with a generator-discriminator duo and resulting in higher accuracy for under-observed classes. Empirical experiments on open-source UCR data and a proprietary 90K product dataset show significant performance gains over state-of-the-art parametric and GAN baselines.
Submitted 14 October, 2021;
originally announced October 2021.
-
Active-set algorithms based statistical inference for shape-restricted generalized additive Cox regression models
Authors:
Geng Deng,
Guangning Xu,
Qiang Fu,
Xindong Wang,
Jing Qin
Abstract:
Shape-restricted inference has recently gained popularity in the statistical and econometric literature as a way to relax the assumption of linear or quadratic covariate effects in regression analyses. Typical shape restrictions on a covariate effect include monotonicity (increasing or decreasing), convexity, and concavity. In this paper, we introduce shape-restricted inference to the celebrated Cox regression model (SR-Cox), in which the covariate response is modeled as shape-restricted additive functions. The SR-Cox regression approximates the shape-restricted functions using a spline basis expansion with a data-driven choice of knots. The underlying minimization of the negative log-likelihood function is formulated as a convex optimization problem, which is solved with an active-set optimization algorithm. The highlight of this algorithm is that it eliminates superfluous knots automatically. When covariate effects include combinations of convex or concave terms with unknown forms and linear terms, the most interesting finding is that SR-Cox produces accurate linear covariate effect estimates that are comparable to the maximum partial likelihood estimates obtained when the forms are known. We conclude that concave or convex SR-Cox models could significantly improve nonlinear covariate response recovery and model goodness of fit.
Submitted 2 July, 2021; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Critical Risk Indicators (CRIs) for the electric power grid: A survey and discussion of interconnected effects
Authors:
Judy P. Che-Castaldo,
Rémi Cousin,
Stefani Daryanto,
Grace Deng,
Mei-Ling E. Feng,
Rajesh K. Gupta,
Dezhi Hong,
Ryan M. McGranaghan,
Olukunle O. Owolabi,
Tianyi Qu,
Wei Ren,
Toryn L. J. Schafer,
Ashutosh Sharma,
Chaopeng Shen,
Mila Getmansky Sherman,
Deborah A. Sunter,
Lan Wang,
David S. Matteson
Abstract:
The electric power grid is a critical societal resource connecting multiple infrastructural domains such as agriculture, transportation, and manufacturing. The electrical grid as an infrastructure is shaped by human activity and public policy in terms of demand and supply requirements. Further, the grid is subject to changes and stresses due to solar weather, climate, hydrology, and ecology. The emerging interconnected and complex network dependencies make such interactions increasingly dynamic, causing potentially large swings and thus presenting new challenges in managing the coupled human-natural system. This paper provides a survey of models and methods that seek to explore the significant interconnected impact of the electric power grid and interdependent domains. We also provide relevant critical risk indicators (CRIs) across diverse domains that may influence electric power grid risks, including climate, ecology, hydrology, finance, space weather, and agriculture. We discuss the convergence of indicators from individual domains to explore possible systemic risk, i.e., holistic risk arising from cross-domain interconnections. Our study provides an important first step towards data-driven analysis and predictive modeling of risks in coupled interconnected systems. Further, we propose a compositional approach to risk assessment that incorporates diverse domain expertise and information, data science, and computer science to identify domain-specific CRIs and their union in systemic risk indicators.
Submitted 9 June, 2021; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Extended Missing Data Imputation via GANs for Ranking Applications
Authors:
Grace Deng,
Cuize Han,
David S. Matteson
Abstract:
We propose Conditional Imputation GAN, an extended missing data imputation method based on Generative Adversarial Networks (GANs). The motivating use case is learning-to-rank, the cornerstone of modern search, recommendation system, and information retrieval applications. Empirical ranking datasets do not always follow standard Gaussian distributions or Missing Completely At Random (MCAR) mechanisms, which are standard assumptions of classic missing data imputation methods. Our methodology provides a simple solution that offers compatible imputation guarantees while relaxing assumptions for missing mechanisms and sidestepping the approximation of intractable distributions to improve imputation quality. We prove that the optimal GAN imputation is achieved for Extended Missing At Random (EMAR) and Extended Always Missing At Random (EAMAR) mechanisms, beyond the naive MCAR. Our method demonstrates the highest imputation quality on the open-source Microsoft Research Ranking (MSR) Dataset and a synthetic ranking dataset compared to state-of-the-art benchmarks and across various feature distributions. Using a proprietary Amazon Search ranking dataset, we also demonstrate comparable ranking quality metrics for ranking models trained on GAN-imputed data compared to ground-truth data.
Submitted 10 November, 2021; v1 submitted 3 November, 2020;
originally announced November 2020.