-
Sparse Bayesian factor analysis when the number of factors is unknown
Authors:
Sylvia Frühwirth-Schnatter,
Darjus Hosszejni,
Hedibert Freitas Lopes
Abstract:
There has been increased research interest in the subfield of sparse Bayesian factor analysis with shrinkage priors, which achieve additional sparsity beyond the natural parsimony of factor models. In this spirit, we estimate the number of common factors in the widely used sparse latent factor model with spike-and-slab priors on the factor loadings matrix. Our framework leads to a natural, efficient and simultaneous coupling of model estimation and selection on the one hand and model identification and rank estimation (number of factors) on the other hand. More precisely, by embedding the unordered generalized lower triangular loadings representation into overfitting sparse factor modelling, we obtain posterior summaries regarding factor loadings, common factors as well as the factor dimension via postprocessing draws from our efficient and customized Markov chain Monte Carlo scheme.
Submitted 16 January, 2023;
originally announced January 2023.
-
When it counts -- Econometric identification of the basic factor model based on GLT structures
Authors:
Sylvia Frühwirth-Schnatter,
Darjus Hosszejni,
Hedibert Freitas Lopes
Abstract:
Despite the popularity of factor models with sparse loading matrices, little attention has been given to formally addressing the identifiability of these models beyond standard rotation-based identification such as the positive lower triangular (PLT) constraint. To fill this gap, we review the advantages of variance identification in sparse factor analysis and introduce the generalized lower triangular (GLT) structures. We show that the GLT assumption is an improvement over PLT without compromise: GLT is also unique but, unlike PLT, a non-restrictive assumption. Furthermore, we provide a simple counting rule for variance identification under GLT structures, and we demonstrate that within this model class the unknown number of common factors can be recovered in an exploratory factor analysis. Our methodology is illustrated for simulated data in the context of post-processing posterior draws in Bayesian sparse factor analysis.
Submitted 16 January, 2023;
originally announced January 2023.
-
Uncertainty quantification through Monte Carlo method in a cloud computing setting
Authors:
A. Cunha Jr,
R. Nasser,
R. Sampaio,
H. Lopes,
K. Breitman
Abstract:
The Monte Carlo (MC) method is the most common technique used for uncertainty quantification, due to its simplicity and good statistical results. However, its computational cost is extremely high, and, in many cases, prohibitive. Fortunately, the MC algorithm is easily parallelizable, which allows its use in simulations where the computation of a single realization is very costly. This work presents a methodology for the parallelization of the MC method, in the context of cloud computing. This strategy is based on the MapReduce paradigm, and allows an efficient distribution of tasks in the cloud. This methodology is illustrated on a problem of structural dynamics that is subject to uncertainties. The results show that the technique is capable of producing good results concerning statistical moments of low order. It is shown that even a simple problem may require many realizations for convergence of histograms, which makes the cloud computing strategy very attractive (due to its high scalability and low cost). Additionally, the results regarding the time of processing and storage space usage allow one to qualify this new methodology as a solution for simulations that require a number of MC realizations beyond the standard.
Submitted 20 May, 2021;
originally announced May 2021.
-
The Illusion of the Illusion of Sparsity: An exercise in prior sensitivity
Authors:
Bruno Fava,
Hedibert F. Lopes
Abstract:
The emergence of Big Data raises the question of how to model economic relations when there is a large number of possible explanatory variables. We revisit the issue by comparing the possibility of using dense or sparse models in a Bayesian approach, allowing for variable selection and shrinkage. More specifically, we discuss the results reached by Giannone, Lenza, and Primiceri (2020) through a "Spike-and-Slab" prior, which suggest an "illusion of sparsity" in economic data, as no clear patterns of sparsity could be detected. We make a further revision of the posterior distributions of the model, and propose three experiments to evaluate the robustness of the adopted prior distribution. We find that the pattern of sparsity is sensitive to the prior distribution of the regression coefficients, and present evidence that the model indirectly induces variable selection and shrinkage, which suggests that the "illusion of sparsity" could be, itself, an illusion. Code is available on github.com/bfava/IllusionOfIllusion.
Submitted 29 September, 2020;
originally announced September 2020.
-
Dynamic sparsity on dynamic regression models
Authors:
Paloma W. Uribe,
Hedibert F. Lopes
Abstract:
In the present work, we consider variable selection and shrinkage for the Gaussian dynamic linear regression within a Bayesian framework. In particular, we propose a novel method that allows for time-varying sparsity, based on an extension of spike-and-slab priors for dynamic models. This is done by assigning appropriate Markov switching priors for the time-varying coefficients' variances, extending the previous work of Ishwaran and Rao (2005). Furthermore, we investigate different priors, including the common inverted gamma prior for the process variances, and other mixture prior distributions such as Gamma priors for both the spike and the slab, which leads to a mixture of Normal-Gamma priors (Griffin and Brown, 2010) for the coefficients. In this sense, our prior can be viewed as a dynamic variable selection prior which induces either smoothness (through the slab) or shrinkage towards zero (through the spike) at each time point. The MCMC method used for posterior computation uses Markov latent variables that can assume binary regimes at each time point to generate the coefficients' variances. In that way, our model is a dynamic mixture model, so we can use the algorithm of Gerlach et al. (2000) to generate the latent processes without conditioning on the states. Finally, our approach is exemplified through simulated examples and a real data application.
Submitted 29 September, 2020;
originally announced September 2020.
-
Decoupling Shrinkage and Selection in Gaussian Linear Factor Analysis
Authors:
Henrique Bolfarine,
Carlos M. Carvalho,
Hedibert F. Lopes,
Jared S. Murray
Abstract:
Factor Analysis is a popular method for modeling dependence in multivariate data. However, determining the number of factors and obtaining a sparse orientation of the loadings are still major challenges. In this paper, we propose a decision-theoretic approach that brings to light the relation between a sparse representation of the loadings and factor dimension. This relation is established through a summary of the information contained in the multivariate posterior. To construct such a summary, we introduce a three-step approach. In the first step, the model is fitted with a conservative factor dimension. In the second step, a series of sparse point-estimates, with a decreasing number of factors, is obtained by minimizing an expected predictive loss function. In step three, the degradation in utility in relation to the sparse loadings and factor dimensions is displayed in the posterior summary. The findings are illustrated with applications in classical data from the Factor Analysis literature. We used different prior choices and factor dimensions to demonstrate the flexibility of the proposed method.
Submitted 24 July, 2021; v1 submitted 21 June, 2020;
originally announced June 2020.
-
Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network
Authors:
Raul de Araújo Lima,
Rômulo César Costa de Sousa,
Simone Diniz Junqueira Barbosa,
Hélio Cortês Vieira Lopes
Abstract:
Organizing songs, albums, and artists into groups with shared similarity can be done with the help of genre labels. In this paper, we present a novel approach for automatically classifying musical genre in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distributed in 14 genres. We apply SVM, Random Forest and a Bidirectional Long Short-Term Memory (BLSTM) network combined with different word embedding techniques to address this classification task. Our experiments show that the BLSTM method outperforms the other models with an average F1-score of 0.48. Some genres like "gospel", "funk-carioca" and "sertanejo", which obtained F1-scores of 0.89, 0.70 and 0.69, respectively, can be defined as the most distinct and easy to classify in the Brazilian musical genres context.
Submitted 6 March, 2020;
originally announced March 2020.
-
Learning a latent pattern of heterogeneity in the innovation rates of a time series of counts
Authors:
Helton Graziadei,
Hedibert F. Lopes,
Paulo C. Marques F
Abstract:
We develop a Bayesian hierarchical semiparametric model for phenomena related to time series of counts. The main feature of the model is its capability to learn a latent pattern of heterogeneity in the distribution of the process innovation rates, which are softly clustered through time with the help of a Dirichlet process placed at the top of the model hierarchy. The probabilistic forecasting capabilities of the model are put to the test in the analysis of crime data in Pittsburgh, with favorable results.
Submitted 6 July, 2019;
originally announced July 2019.
-
Tree-Based Bayesian Treatment Effect Analysis
Authors:
Pedro Henrique Filipini dos Santos,
Hedibert Freitas Lopes
Abstract:
The inclusion of the propensity score as a covariate in Bayesian regression trees for causal inference can reduce the bias in treatment effect estimations, which occurs due to the regularization-induced confounding phenomenon. This study advocates for the use of the propensity score by evaluating it under a full-Bayesian variable selection setting, and for the use of Individual Conditional Expectation Plots, a graphical tool that can improve treatment effect analysis on tree-based Bayesian models and other "black box" models. The former, even if poorly estimated, can lead to bias reduction on the estimated treatment effects, while the latter can be used to find groups of individuals who respond differently to the applied treatment and to analyze the impact of each variable on the estimated treatment effect.
Submitted 28 August, 2018;
originally announced August 2018.
-
Efficient sampling for Gaussian linear regression with arbitrary priors
Authors:
P. Richard Hahn,
Jingyu He,
Hedibert Lopes
Abstract:
This paper develops a slice sampler for Bayesian linear regression models with arbitrary priors. The new sampler has two advantages over current approaches. One, it is faster than many custom implementations that rely on auxiliary latent variables, if the number of regressors is large. Two, it can be used with any prior with a density function that can be evaluated up to a normalizing constant, making it ideal for investigating the properties of new shrinkage priors without having to develop custom sampling algorithms. The new sampler takes advantage of the special structure of the linear regression likelihood, allowing it to produce better effective sample size per second than common alternative approaches.
Submitted 14 June, 2018;
originally announced June 2018.
-
Sparse Bayesian Factor Analysis when the Number of Factors is Unknown
Authors:
Sylvia Fruehwirth-Schnatter,
Hedibert Freitas Lopes
Abstract:
Despite the popularity of sparse factor models, little attention has been given to formally addressing the identifiability of these models beyond standard rotation-based identification such as the positive lower triangular constraint. To fill this gap, we provide a counting rule on the number of nonzero factor loadings that is sufficient for achieving uniqueness of the variance decomposition in the factor representation. Furthermore, we introduce the generalised lower triangular representation to resolve rotational invariance and show that within this model class the unknown number of common factors can be recovered in an overfitting sparse factor model. By combining point-mass mixture priors with a highly efficient and customised MCMC scheme, we obtain posterior summaries regarding the number of common factors as well as the factor loadings via postprocessing. Our methodology is illustrated for monthly exchange rates of 22 currencies with respect to the euro over a period of eight years and for monthly log returns of 73 firms from the NYSE100 over a period of 20 years.
Submitted 11 April, 2018;
originally announced April 2018.
-
Calibration of Machine Learning Classifiers for Probability of Default Modelling
Authors:
Pedro G. Fonseca,
Hugo D. Lopes
Abstract:
Binary classification is widely used in credit scoring to estimate the probability of default. The validation of such predictive models is based on both rank ability and calibration (i.e. how accurately the probabilities output by the model map to the observed probabilities). In this study we cover the current best practices regarding calibration for binary classification, and explore how different approaches yield different results on real-world credit scoring data. The limitations of evaluating credit scoring models using only rank ability metrics are explored. A benchmark is run on 18 real-world datasets, and the results are compared. The calibration techniques used are Platt Scaling and Isotonic Regression. Different machine learning models are also used: Logistic Regression, Random Forest Classifiers, and Gradient Boosting Classifiers. Results show that when the dataset is treated as a time series, re-calibration with Isotonic Regression improves long-term calibration more than the alternative methods. Using re-calibration, the non-parametric models are able to outperform the Logistic Regression on Brier Score Loss.
Submitted 24 October, 2017;
originally announced October 2017.
-
Efficient Bayesian Inference for Multivariate Factor Stochastic Volatility Models
Authors:
Gregor Kastner,
Sylvia Frühwirth-Schnatter,
Hedibert Freitas Lopes
Abstract:
We discuss efficient Bayesian estimation of dynamic covariance matrices in multivariate time series through a factor stochastic volatility model. In particular, we propose two interweaving strategies (Yu and Meng, Journal of Computational and Graphical Statistics, 20(3), 531-570, 2011) to substantially accelerate convergence and mixing of standard MCMC approaches. Similar to marginal data augmentation techniques, the proposed acceleration procedures exploit non-identifiability issues which frequently arise in factor models. Our new interweaving strategies are easy to implement and come at almost no extra computational cost; nevertheless, they can boost estimation efficiency by several orders of magnitude as is shown in extensive simulation studies. To conclude, the application of our algorithm to a 26-dimensional exchange rate data set illustrates the superior performance of the new approach for real-world data.
Submitted 19 July, 2017; v1 submitted 25 February, 2016;
originally announced February 2016.
-
Scalable semiparametric inference for the means of heavy-tailed distributions
Authors:
Matt Taddy,
Hedibert Freitas Lopes,
Matt Gardner
Abstract:
Heavy tailed distributions present a tough setting for inference. They are also common in industrial applications, particularly with Internet transaction datasets, and machine learners often analyze such data without considering the biases and risks associated with the misuse of standard tools. This paper outlines a procedure for inference about the mean of a (possibly conditional) heavy tailed distribution that combines nonparametric analysis for the bulk of the support with Bayesian parametric modeling -- motivated from extreme value theory -- for the heavy tail. The procedure is fast and massively scalable. The resulting point estimators attain lowest-possible error rates and, unique among alternatives, we are able to provide accurate uncertainty quantification for these estimators. The work should find application in settings wherever correct inference is important and reward tails are heavy; we illustrate the framework in causal inference for A/B experiments involving hundreds of millions of users of eBay.com.
Submitted 13 October, 2016; v1 submitted 25 February, 2016;
originally announced February 2016.
-
Shrinkage priors for linear instrumental variable models with many instruments
Authors:
P. Richard Hahn,
Hedibert Lopes
Abstract:
This paper addresses the weak instruments problem in linear instrumental variable models from a Bayesian perspective. The new approach has two components. First, a novel predictor-dependent shrinkage prior is developed for the many instruments setting. The prior is constructed based on a factor model decomposition of the matrix of observed instruments, allowing many instruments to be incorporated into the analysis in a robust way.
Second, the new prior is implemented via an importance sampling scheme, which utilizes posterior Monte Carlo samples from a first-stage Bayesian regression analysis. This modular computation makes sensitivity analyses straightforward.
Two simulation studies are provided to demonstrate the advantages of the new method. As an empirical illustration, the new method is used to estimate a key parameter in macro-economic models: the elasticity of inter-temporal substitution. The empirical analysis produces substantive conclusions in line with previous studies, but certain inconsistencies of earlier analyses are resolved.
Submitted 3 August, 2014;
originally announced August 2014.
-
Measuring the vulnerability of the Uruguayan population to vector-borne diseases via spatially hierarchical factor models
Authors:
Hedibert F. Lopes,
Alexandra M. Schmidt,
Esther Salazar,
Mariana Gómez,
Marcel Achkar
Abstract:
We propose a model-based index of the vulnerability of the population of Uruguay to vector-borne diseases. We have available measurements of a set of variables at the census tract level for the 19 Departmental capitals of Uruguay. In particular, we propose an index that combines different sources of information via a set of micro-environmental indicators and geographical location in the country. Our index is based on a new class of spatially hierarchical factor models that explicitly account for the different levels of hierarchy in the country, such as census tracts at the city level, and cities at the country level. We compare our approach with that obtained when data are aggregated at the city level. We show that our proposal outperforms current and standard approaches, which fail to properly account for discrepancies in the region sizes, for example, the number of census tracts. We also show that data aggregation can seriously affect the estimation of the cities' vulnerability rankings under benchmark models.
Submitted 19 March, 2012;
originally announced March 2012.
-
Particle Learning and Smoothing
Authors:
Carlos M. Carvalho,
Michael S. Johannes,
Hedibert F. Lopes,
Nicholas G. Polson
Abstract:
Particle learning (PL) provides state filtering, sequential parameter learning and smoothing in a general class of state space models. Our approach extends existing particle methods by incorporating the estimation of static parameters via a fully-adapted filter that utilizes conditional sufficient statistics for parameters and/or states as particles. State smoothing in the presence of parameter uncertainty is also solved as a by-product of PL. In a number of examples, we show that PL outperforms existing particle filtering alternatives and proves to be a competitor to MCMC.
Submitted 4 November, 2010;
originally announced November 2010.