Clusterpath Gaussian Graphical Modeling

Authors: D. J. W. Touw, A. Alfons, P. J. F. Groenen, I. Wilms

Abstract: Graphical models serve as effective tools for visualizing conditional dependencies between variables. However, as the number of variables grows, interpretation becomes increasingly difficult, and estimation uncertainty increases due to the large number of parameters relative to the number of observations. To address these challenges, we introduce the Clusterpath estimator of the Gaussian Graphical Model (CGGM) that encourages variable clustering in the graphical model in a data-driven way. Through the use of a clusterpath penalty, we group variables together, which in turn results in a block-structured precision matrix whose block structure remains preserved in the covariance matrix. We present a computationally efficient implementation of the CGGM estimator by using a cyclic block coordinate descent algorithm. In simulations, we show that CGGM not only matches, but oftentimes outperforms other state-of-the-art methods for variable clustering in graphical models. We also demonstrate CGGM's practical advantages and versatility on a diverse collection of empirical applications.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 43 pages, 11 figures

arXiv:2406.19702 [pdf, ps, other]

Vector AutoRegressive Moving Average Models: A Review

Authors: Marie-Christine Düker, David S. Matteson, Ruey S. Tsay, Ines Wilms

Abstract: Vector AutoRegressive Moving Average (VARMA) models form a powerful and general model class for analyzing dynamics among multiple time series. While VARMA models encompass the Vector AutoRegressive (VAR) models, their popularity in empirical applications is dominated by the latter. Can this phenomenon be explained fully by the simplicity of VAR models? Perhaps many users of VAR models have not fully appreciated what VARMA models can provide. The goal of this review is to provide a comprehensive resource for researchers and practitioners seeking insights into the advantages and capabilities of VARMA models. We start by reviewing the identification challenges inherent to VARMA models thereby encompassing classical and modern identification schemes and we continue along the same lines regarding estimation, specification and diagnosis of VARMA models. We then highlight the practical utility of VARMA models in terms of Granger Causality analysis, forecasting and structural analysis as well as recent advances and extensions of VARMA models to further facilitate their adoption in practice. Finally, we discuss some interesting future research directions where VARMA models can fulfill their potentials in applications as compared to their subclass of VAR models.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2405.18987 [pdf, other]

Transmission Channel Analysis in Dynamic Models

Authors: Enrico Wegner, Lenard Lieb, Stephan Smeekes, Ines Wilms

Abstract: We propose a framework for the analysis of transmission channels in a large class of dynamic models. To this end, we formulate our approach both using graph theory and potential outcomes, which we show to be equivalent. Our method, labelled Transmission Channel Analysis (TCA), allows for the decomposition of total effects captured by impulse response functions into the effects flowing along transmission channels, thereby providing a quantitative assessment of the strength of various transmission channels. We establish that this requires no additional identification assumptions beyond the identification of the structural shock whose effects the researcher wants to decompose. Additionally, we prove that impulse response functions are sufficient statistics for the computation of transmission effects. We also demonstrate the empirical relevance of TCA for policy evaluation by decomposing the effects of various monetary policy shock measures into instantaneous implementation effects and effects that likely relate to forward guidance.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2402.09033 [pdf, other]

Cross-Temporal Forecast Reconciliation at Digital Platforms with Machine Learning

Authors: Jeroen Rombouts, Marie Ternes, Ines Wilms

Abstract: Platform businesses operate on a digital core and their decision making requires high-dimensional accurate forecast streams at different levels of cross-sectional (e.g., geographical regions) and temporal aggregation (e.g., minutes to days). It also necessitates coherent forecasts across all levels of the hierarchy to ensure aligned decision making across different planning units such as pricing, product, controlling and strategy. Given that platform data streams feature complex characteristics and interdependencies, we introduce a non-linear hierarchical forecast reconciliation method that produces cross-temporal reconciled forecasts in a direct and automated way through the use of popular machine learning methods. The method is sufficiently fast to allow forecast-based high-frequency decision making that platforms require. We empirically test our framework on unique, large-scale streaming datasets from a leading on-demand delivery platform in Europe and a bicycle sharing system in New York City.

Submitted 31 May, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2401.09144 [pdf, other]

Monitoring Machine Learning Forecasts for Platform Data Streams

Authors: Jeroen Rombouts, Ines Wilms

Abstract: Data stream forecasts are essential inputs for decision making at digital platforms. Machine learning algorithms are appealing candidates to produce such forecasts. Yet, digital platforms require a large-scale forecast framework that can flexibly respond to sudden performance drops. Re-training ML algorithms at the same speed as new data batches enter is usually computationally too costly. On the other hand, infrequent re-training requires specifying the re-training frequency and typically comes with a severe cost of forecast deterioration. To ensure accurate and stable forecasts, we propose a simple data-driven monitoring procedure to answer the question when the ML algorithm should be re-trained. Instead of investigating instability of the data streams, we test if the incoming streaming forecast loss batch differs from a well-defined reference batch. Using a novel dataset constituting 15-min frequency data streams from an on-demand logistics platform operating in London, we apply the monitoring procedure to popular ML algorithms including random forest, XGBoost and lasso. We show that monitor-based re-training produces accurate forecasts compared to viable benchmarks while preserving computational feasibility. Moreover, the choice of monitoring procedure is more important than the choice of ML algorithm, thereby permitting practitioners to combine the proposed monitoring procedure with one's favorite forecasting algorithm.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2312.00090 [pdf, other]

Tree-based Forecasting of Day-ahead Solar Power Generation from Granular Meteorological Features

Authors: Nick Berlanger, Noah van Ophoven, Tim Verdonck, Ines Wilms

Abstract: Accurate forecasts for day-ahead photovoltaic (PV) power generation are crucial to support a high PV penetration rate in the local electricity grid and to assure stability in the grid. We use state-of-the-art tree-based machine learning methods to produce such forecasts and, unlike previous studies, we hereby account for (i) the effects various meteorological as well as astronomical features have on PV power production, and this (ii) at coarse as well as granular spatial locations. To this end, we use data from Belgium and forecast day-ahead PV power production at an hourly resolution. The insights from our study can assist utilities, decision-makers, and other stakeholders in optimizing grid operations, economic dispatch, and in facilitating the integration of distributed PV power into the electricity grid.

Submitted 30 November, 2023; originally announced December 2023.

arXiv:2311.13911 [pdf, ps, other]

Identifying Important Pairwise Logratios in Compositional Data with Sparse Principal Component Analysis

Authors: Viktorie Nesrstová, Ines Wilms, Karel Hron, Peter Filzmoser

Abstract: Compositional data are characterized by the fact that their elemental information is contained in simple pairwise logratios of the parts that constitute the composition. While pairwise logratios are typically easy to interpret, the number of possible pairs to consider quickly becomes (too) large even for medium-sized compositions, which might hinder interpretability in further multivariate analyses. Sparse methods can therefore be useful to identify few, important pairwise logratios (respectively parts contained in them) from the total candidate set. To this end, we propose a procedure based on the construction of all possible pairwise logratios and employ sparse principal component analysis to identify important pairwise logratios. The performance of the procedure is demonstrated both with simulated and real-world data. In our empirical analyses, we propose three visual tools showing (i) the balance between sparsity and explained variability, (ii) stability of the pairwise logratios, and (iii) importance of the original compositional parts to aid practitioners with their model interpretation.

Submitted 23 November, 2023; originally announced November 2023.

Comments: 23 pages, 13 figures, 1 appendix, submitted to: Mathematical Geosciences

arXiv:2303.01887 [pdf, other]

Fast Forecasting of Unstable Data Streams for On-Demand Service Platforms

Authors: Yu Jeffrey Hu, Jeroen Rombouts, Ines Wilms

Abstract: On-demand service platforms face a challenging problem of forecasting a large collection of high-frequency regional demand data streams that exhibit instabilities. This paper develops a novel forecast framework that is fast and scalable, and automatically assesses changing environments without human intervention. We empirically test our framework on a large-scale demand data set from a leading on-demand delivery platform in Europe, and find strong performance gains from using our framework against several industry benchmarks, across all geographical regions, loss functions, and both pre- and post-Covid periods. We translate forecast gains to economic impacts for this on-demand service platform by computing financial gains and reductions in computing costs.

Submitted 31 May, 2024; v1 submitted 3 March, 2023; originally announced March 2023.

arXiv:2302.01233 [pdf, ps, other]

Sparse High-Dimensional Vector Autoregressive Bootstrap

Authors: Robert Adamek, Stephan Smeekes, Ines Wilms

Abstract: We introduce a high-dimensional multiplier bootstrap for time series data based on capturing dependence through a sparsely estimated vector autoregressive model. We prove its consistency for inference on high-dimensional means under two different moment assumptions on the errors, namely sub-gaussian moments and a finite number of absolute moments. In establishing these results, we derive a Gaussian approximation for the maximum mean of a linear process, which may be of independent interest.

Submitted 2 February, 2023; originally announced February 2023.

arXiv:2301.10592 [pdf, other]

Hierarchical Regularizers for Reverse Unrestricted Mixed Data Sampling Regressions

Authors: Alain Hecq, Marie Ternes, Ines Wilms

Abstract: Reverse Unrestricted MIxed DAta Sampling (RU-MIDAS) regressions are used to model high-frequency responses by means of low-frequency variables. However, due to the periodic structure of RU-MIDAS regressions, the dimensionality grows quickly if the frequency mismatch between the high- and low-frequency variables is large. Additionally, the number of high-frequency observations available for estimation decreases. We propose to counteract this reduction in sample size by pooling the high-frequency coefficients and further reduce the dimensionality through a sparsity-inducing convex regularizer that accounts for the temporal ordering among the different lags. To this end, the regularizer prioritizes the inclusion of lagged coefficients according to the recency of the information they contain. We demonstrate the proposed method on an empirical application for daily realized volatility forecasting where we explore whether modeling high-frequency volatility data in terms of low-frequency macroeconomic data pays off.

Submitted 25 January, 2023; originally announced January 2023.

arXiv:2211.01686 [pdf, other]

Principal Balances of Compositional Data for Regression and Classification using Partial Least Squares

Authors: V. Nesrstová, I. Wilms, J. Palarea-Albaladejo, P. Filzmoser, J. A. Martín-Fernández, D. Friedecký, K. Hron

Abstract: High-dimensional compositional data are commonplace in the modern omics sciences amongst others. Analysis of compositional data requires a proper choice of orthonormal coordinate representation as their relative nature is not compatible with the direct use of standard statistical methods. Principal balances, a specific class of log-ratio coordinates, are well suited to this context since they are constructed in such a way that the first few coordinates capture most of the variability in the original data. Focusing on regression and classification problems in high dimensions, we propose a novel Partial Least Squares (PLS) based procedure to construct principal balances that maximize explained variability of the response variable and notably facilitates interpretability when compared to the ordinary PLS formulation. The proposed PLS principal balance approach can be understood as a generalized version of common logcontrast models, since multiple orthonormal (instead of one) logcontrasts are estimated simultaneously. We demonstrate the performance of the method using both simulated and real data sets.

Submitted 3 November, 2022; originally announced November 2022.

Comments: 26 pages, 10 figures, 1 appendix, submitted to: Journal of Chemometrics

arXiv:2209.07374 [pdf, other]

The Influence Function of Graphical Lasso Estimators

Authors: Gaëtan Louvet, Jakob Raymaekers, Germain Van Bever, Ines Wilms

Abstract: The precision matrix that encodes conditional linear dependency relations among a set of variables forms an important object of interest in multivariate analysis. Sparse estimation procedures for precision matrices such as the graphical lasso (Glasso) gained popularity as they facilitate interpretability, thereby separating pairs of variables that are conditionally dependent from those that are independent (given all other variables). Glasso lacks, however, robustness to outliers. To overcome this problem, one typically applies a robust plug-in procedure where the Glasso is computed from a robust covariance estimate instead of the sample covariance, thereby providing protection against outliers. In this paper, we study such estimators theoretically, by deriving and comparing their influence function, sensitivity curves and asymptotic variances.

Submitted 8 March, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

arXiv:2209.03218 [pdf, other]

doi 10.1093/ectj/utae012

Local Projection Inference in High Dimensions

Authors: Robert Adamek, Stephan Smeekes, Ines Wilms

Abstract: In this paper, we estimate impulse responses by local projections in high-dimensional settings. We use the desparsified (de-biased) lasso to estimate the high-dimensional local projections, while leaving the impulse response parameter of interest unpenalized. We establish the uniform asymptotic normality of the proposed estimator under general conditions. Finally, we demonstrate small sample performance through a simulation study and consider two canonical applications in macroeconomic research on monetary policy and government spending.

Submitted 16 April, 2024; v1 submitted 7 September, 2022; originally announced September 2022.

arXiv:2207.02362 [pdf, other]

Regularized Predictive Models for Beef Eating Quality of Individual Meals

Authors: Garth Tarr, Ines Wilms

Abstract: Faced with changing markets and evolving consumer demands, beef industries are investing in grading systems to maximise value extraction throughout their entire supply chain. The Meat Standards Australia (MSA) system is a customer-oriented total quality management system that stands out internationally by predicting quality grades of specific muscles processed by a designated cooking method. The model currently underpinning the MSA system requires laborious effort to estimate and its prediction performance may be less accurate in the presence of unbalanced data sets where many "muscle x cook" combinations have few observations and/or few predictors of palatability are available. This paper proposes a novel predictive method for beef eating quality that bridges a spectrum of muscle x cook-specific models. At one extreme, each muscle x cook combination is modelled independently; at the other extreme a pooled predictive model is obtained across all muscle x cook combinations. Via a data-driven regularization method, we cover all muscle x cook-specific models along this spectrum. We demonstrate that the proposed predictive method attains considerable accuracy improvements relative to independent or pooled approaches on unique MSA data sets.

Submitted 5 July, 2022; originally announced July 2022.

Comments: 26 pages, 5 figures

arXiv:2107.10572 [pdf, other]

Graphical Influence Diagnostics for Changepoint Models

Authors: Ines Wilms, Rebecca Killick, David S. Matteson

Abstract: Changepoint models enjoy a wide appeal in a variety of disciplines to model the heterogeneity of ordered data. Graphical influence diagnostics to characterize the influence of single observations on changepoint models are, however, lacking. We address this gap by developing a framework for investigating instabilities in changepoint segmentations and assessing the influence of single observations on various outputs of a changepoint analysis. We construct graphical diagnostic plots that allow practitioners to assess whether instabilities occur; how and where they occur; and to detect influential individual observations triggering instability. We analyze well-log data to illustrate how such influence diagnostic plots can be used in practice to reveal features of the data that may otherwise remain hidden.

Submitted 22 July, 2021; originally announced July 2021.

arXiv:2102.11780 [pdf, other]

Hierarchical Regularizers for Mixed-Frequency Vector Autoregressions

Authors: Alain Hecq, Marie Ternes, Ines Wilms

Abstract: Mixed-frequency Vector AutoRegressions (MF-VAR) model the dynamics between variables recorded at different frequencies. However, as the number of series and high-frequency observations per low-frequency period grow, MF-VARs suffer from the "curse of dimensionality". We curb this curse through a regularizer that permits hierarchical sparsity patterns by prioritizing the inclusion of coefficients according to the recency of the information they contain. Additionally, we investigate the presence of nowcasting relations by sparsely estimating the MF-VAR error covariance matrix. We study predictive Granger causality relations in a MF-VAR for the U.S. economy and construct a coincident indicator of GDP growth. Supplementary Materials for this article are available online.

Submitted 18 March, 2022; v1 submitted 23 February, 2021; originally announced February 2021.

Comments: Forthcoming in Journal of Computational and Graphical Statistics

arXiv:2101.12503 [pdf, other]

Tree-based Node Aggregation in Sparse Graphical Models

Authors: Ines Wilms, Jacob Bien

Abstract: High-dimensional graphical models are often estimated using regularization that is aimed at reducing the number of edges in a network. In this work, we show how even simpler networks can be produced by aggregating the nodes of the graphical model. We develop a new convex regularized method, called the tree-aggregated graphical lasso or tag-lasso, that estimates graphical models that are both edge-sparse and node-aggregated. The aggregation is performed in a data-driven fashion by leveraging side information in the form of a tree that encodes node similarity and facilitates the interpretation of the resulting aggregated nodes. We provide an efficient implementation of the tag-lasso by using the locally adaptive alternating direction method of multipliers and illustrate our proposal's practical advantages in simulation and in applications in finance and biology.

Submitted 29 January, 2021; originally announced January 2021.

arXiv:2007.12249 [pdf, other]

bootUR: An R Package for Bootstrap Unit Root Tests

Authors: Stephan Smeekes, Ines Wilms

Abstract: Unit root tests form an essential part of any time series analysis. We provide practitioners with a single, unified framework for comprehensive and reliable unit root testing in the R package bootUR. The package's backbone is the popular augmented Dickey-Fuller test paired with a union of rejections principle, which can be performed directly on single time series or multiple (including panel) time series. Accurate inference is ensured through the use of bootstrap methods. The package addresses the needs of both novice users, by providing user-friendly and easy-to-implement functions with sensible default options, as well as expert users, by giving full user-control to adjust the tests to one's desired settings. Our parallelized C++ implementation ensures that all unit root tests are scalable to datasets containing many time series.

Submitted 13 July, 2022; v1 submitted 23 July, 2020; originally announced July 2020.

arXiv:1711.03623 [pdf, other]

Interpretable Vector AutoRegressions with Exogenous Time Series

Authors: Ines Wilms, Sumanta Basu, Jacob Bien, David S. Matteson

Abstract: The Vector AutoRegressive (VAR) model is fundamental to the study of multivariate time series. Although VAR models are intensively investigated by many researchers, practitioners often show more interest in analyzing VARX models that incorporate the impact of unmodeled exogenous variables (X) into the VAR. However, since the parameter space grows quadratically with the number of time series, estimation quickly becomes challenging. While several proposals have been made to sparsely estimate large VAR models, the estimation of large VARX models is under-explored. Moreover, typically these sparse proposals involve a lasso-type penalty and do not incorporate lag selection into the estimation procedure. As a consequence, the resulting models may be difficult to interpret. In this paper, we propose a lag-based hierarchically sparse estimator, called "HVARX", for large VARX models. We illustrate the usefulness of HVARX on a cross-category management marketing application. Our results show how it provides a highly interpretable model, and improves out-of-sample forecast accuracy compared to a lasso-type approach.

Submitted 9 November, 2017; originally announced November 2017.

Comments: Presented at NIPS 2017 Symposium on Interpretable Machine Learning

arXiv:1707.09208 [pdf, other]

Sparse Identification and Estimation of Large-Scale Vector AutoRegressive Moving Averages

Authors: Ines Wilms, Sumanta Basu, Jacob Bien, David S. Matteson

Abstract: The Vector AutoRegressive Moving Average (VARMA) model is fundamental to the theory of multivariate time series; however, identifiability issues have led practitioners to abandon it in favor of the simpler but more restrictive Vector AutoRegressive (VAR) model. We narrow this gap with a new optimization-based approach to VARMA identification built upon the principle of parsimony. Among all equivalent data-generating models, we use convex optimization to seek the parameterization that is "simplest" in a certain sense. A user-specified strongly convex penalty is used to measure model simplicity, and that same penalty is then used to define an estimator that can be efficiently computed. We establish consistency of our estimators in a double-asymptotic regime. Our non-asymptotic error bound analysis accommodates both model specification and parameter estimation steps, a feature that is crucial for studying large-scale VARMA algorithms. Our analysis also provides new results on penalized estimation of infinite-order VAR, and elastic net regression under a singular covariance structure of regressors, which may be of independent interest. We illustrate the advantage of our method over VAR alternatives on three real data examples.

Submitted 8 June, 2021; v1 submitted 28 July, 2017; originally announced July 2017.

arXiv:1612.07971 [pdf, ps, other]

Cellwise robust regularized discriminant analysis

Authors: Stéphanie Aerts, Ines Wilms

Abstract: Quadratic and Linear Discriminant Analysis (QDA/LDA) are the most often applied classification rules under normality. In QDA, a separate covariance matrix is estimated for each group. If there are more variables than observations in the groups, the usual estimates are singular and cannot be used anymore. Assuming homoscedasticity, as in LDA, reduces the number of parameters to estimate. This rather strong assumption is however rarely verified in practice. Regularized discriminant techniques that are computable in high-dimension and cover the path between the two extremes QDA and LDA have been proposed in the literature. However, these procedures rely on sample covariance matrices. As such, they become inappropriate in presence of cellwise outliers, a type of outliers that is very likely to occur in high-dimensional datasets. In this paper, we propose cellwise robust counterparts of these regularized discriminant techniques by inserting cellwise robust covariance matrices. Our methodology results in a family of discriminant methods that (i) are robust against outlying cells, (ii) cover the gap between LDA and QDA and (iii) are computable in high-dimension. The good performance of the new methods is illustrated through simulated and real data examples. As a by-product, visual tools are provided for the detection of outliers.

Submitted 23 December, 2016; originally announced December 2016.

arXiv:1610.02653 [pdf, other]

Lasso-based forecast combinations for forecasting realized variances

Authors: Ines Wilms, Jeroen Rombouts, Christophe Croux

Abstract: Volatility forecasts are key inputs in financial analysis. While lasso based forecasts have shown to perform well in many applications, their use to obtain volatility forecasts has not yet received much attention in the literature. Lasso estimators produce parsimonious forecast models. Our forecast combination approach hedges against the risk of selecting a wrong degree of model parsimony. Apart from the standard lasso, we consider several lasso extensions that account for the dynamic nature of the forecast model. We apply forecast combined lasso estimators in a comprehensive forecasting exercise using realized variance time series of ten major international stock market indices. We find the lasso extended "ordered lasso" to give the most accurate realized variance forecasts. Multivariate forecast models, accounting for volatility spillovers between different stock markets, outperform univariate forecast models for longer forecast horizons.

Submitted 9 October, 2016; originally announced October 2016.

arXiv:1605.03325 [pdf, other]

Multi-class Vector AutoRegressive Models for Multi-store Sales Data

Authors: Ines Wilms, Luca Barbaglia, Christophe Croux

Abstract: Retailers use the Vector AutoRegressive (VAR) model as a standard tool to estimate the effects of prices, promotions and sales in one product category on the sales of another product category. Besides, these price, promotion and sales data are available for not just one store, but a whole chain of stores. We propose to study cross-category effects using a multi-class VAR model: we jointly estimate cross-category effects for several distinct but related VAR models, one for each store. Our methodology encourages effects to be similar across stores, while still allowing for small differences between stores to account for store heterogeneity. Moreover, our estimator is sparse: unimportant effects are estimated as exactly zero, which facilitates the interpretation of the results. A simulation study shows that the proposed multi-class estimator improves estimation accuracy by borrowing strength across classes. Finally, we provide three visual tools showing (i) the clustering of stores on identical cross-category effects, (ii) the networks of product categories and (iii) the similarity matrices of shared cross-category effects across stores.

Submitted 11 May, 2016; originally announced May 2016.

arXiv:1512.05153 [pdf, other]

An algorithm for the multivariate group lasso with covariance estimation

Authors: Ines Wilms, Christophe Croux

Abstract: We study a group lasso estimator for the multivariate linear regression model that accounts for correlated error terms. A block coordinate descent algorithm is used to compute this estimator. We perform a simulation study with categorical data and multivariate time series data, typical settings with a natural grouping among the predictor variables. Our simulation studies show the good performance of the proposed group lasso estimator compared to alternative estimators. We illustrate the method on a time series data set of gene expressions.

Submitted 16 December, 2015; originally announced December 2015.

arXiv:1508.02846 [pdf, other]

The predictive power of the business and bank sentiment of firms: A high-dimensional Granger Causality approach

Authors: Ines Wilms, Sarah Gelper, Christophe Croux

Abstract: We study the predictive power of industry-specific economic sentiment indicators for future macro-economic developments. In addition to the sentiment of firms towards their own business situation, we study their sentiment with respect to the banking sector - their main credit providers. The use of industry-specific sentiment indicators results in a high-dimensional forecasting problem. To identify the most predictive industries, we present a bootstrap Granger Causality test based on the Adaptive Lasso. This test is more powerful than the standard Wald test in such high-dimensional settings. Forecast accuracy is improved by using only the most predictive industries rather than all industries.

Submitted 12 August, 2015; originally announced August 2015.

arXiv:1506.01589 [pdf, other]

Identifying Demand Effects in a Large Network of Product Categories

Authors: Sarah Gelper, Ines Wilms, Christophe Croux

Abstract: Planning marketing mix strategies requires retailers to understand within- as well as cross-category demand effects. Most retailers carry products in a large variety of categories, leading to a high number of such demand effects to be estimated. At the same time, we do not expect cross-category effects between all categories. This paper outlines a methodology to estimate a parsimonious product category network without prior constraints on its structure. To do so, sparse estimation of the Vector AutoRegressive Market Response Model is presented. We find that cross-category effects go beyond substitutes and complements, and that categories have asymmetric roles in the product category network. Destination categories are most influential for other product categories, while convenience and occasional categories are most responsive. Routine categories are moderately influential and moderately responsive.

Submitted 4 June, 2015; originally announced June 2015.

arXiv:1501.01250 [pdf, other]

Sparse cointegration

Authors: Ines Wilms, Christophe Croux

Abstract: Cointegration analysis is used to estimate the long-run equilibrium relations between several time series. The coefficients of these long-run equilibrium relations are the cointegrating vectors. In this paper, we provide a sparse estimator of the cointegrating vectors. The estimation technique is sparse in the sense that some elements of the cointegrating vectors will be estimated as zero. For this purpose, we combine a penalized estimation procedure for vector autoregressive models with sparse reduced rank regression. The sparse cointegration procedure achieves a higher estimation accuracy than the traditional Johansen cointegration approach in settings where the true cointegrating vectors have a sparse structure, and/or when the sample size is low compared to the number of time series. We also discuss a criterion to determine the cointegration rank and we illustrate its good performance in several simulation settings. In a first empirical application we investigate whether the expectations hypothesis of the term structure of interest rates, implying sparse cointegrating vectors, holds in practice. In a second empirical application we show that forecast performance in high-dimensional systems can be improved by sparsely estimating the cointegration relations.

Submitted 6 January, 2015; originally announced January 2015.

arXiv:1501.01233 [pdf, other]

Robust Sparse Canonical Correlation Analysis

Authors: Ines Wilms, Christophe Croux

Abstract: Canonical correlation analysis (CCA) is a multivariate statistical method which describes the associations between two sets of variables. The objective is to find linear combinations of the variables in each data set having maximal correlation. This paper discusses a method for Robust Sparse CCA. Sparse estimation produces canonical vectors with some of their elements estimated as exactly zero. As such, their interpretability is improved. We also robustify the method such that it can cope with outliers in the data. To estimate the canonical vectors, we convert the CCA problem into an alternating regression framework, and use the sparse Least Trimmed Squares estimator. We illustrate the good performance of the Robust Sparse CCA method in several simulation studies and two real data examples.

Submitted 6 January, 2015; originally announced January 2015.

arXiv:1501.01231 [pdf, other]

Sparse canonical correlation analysis from a predictive point of view

Authors: Ines Wilms, Christophe Croux

Abstract: Canonical correlation analysis (CCA) describes the associations between two sets of variables by maximizing the correlation between linear combinations of the variables in each data set. However, in high-dimensional settings where the number of variables exceeds the sample size or when the variables are highly correlated, traditional CCA is no longer appropriate. This paper proposes a method for sparse CCA. Sparse estimation produces linear combinations of only a subset of variables from each data set, thereby increasing the interpretability of the canonical variates. We consider the CCA problem from a predictive point of view and recast it into a regression framework. By combining an alternating regression approach together with a lasso penalty, we induce sparsity in the canonical vectors. We compare the performance with other sparse CCA techniques in different simulation settings and illustrate its usefulness on a genomic data set.

Submitted 6 January, 2015; originally announced January 2015.

arXiv:1412.5250 [pdf, other]

High Dimensional Forecasting via Interpretable Vector Autoregression

Authors: William B. Nicholson, Ines Wilms, Jacob Bien, David S. Matteson

Abstract: Vector autoregression (VAR) is a fundamental tool for modeling multivariate time series. However, as the number of component series is increased, the VAR model becomes overparameterized. Several authors have addressed this issue by incorporating regularized approaches, such as the lasso in VAR estimation. Traditional approaches address overparameterization by selecting a low lag order, based on the assumption of short range dependence, assuming that a universal lag order applies to all components. Such an approach constrains the relationship between the components and impedes forecast performance. The lasso-based approaches work much better in high-dimensional situations but do not incorporate the notion of lag order selection. We propose a new class of hierarchical lag structures (HLag) that embed the notion of lag selection into a convex regularizer. The key modeling tool is a group lasso with nested groups which guarantees that the sparsity pattern of lag coefficients honors the VAR's ordered structure. The HLag framework offers three structures, which allow for varying levels of flexibility. A simulation study demonstrates improved performance in forecasting and lag order selection over previous approaches, and a macroeconomic application further highlights forecasting improvements as well as HLag's convenient, interpretable output.

Submitted 7 September, 2020; v1 submitted 16 December, 2014; originally announced December 2014.
