Sparse Bayesian factor analysis when the number of factors is unknown

Authors: Sylvia Frühwirth-Schnatter, Darjus Hosszejni, Hedibert Freitas Lopes

Abstract: There has been increased research interest in the subfield of sparse Bayesian factor analysis with shrinkage priors, which achieve additional sparsity beyond the natural parsimony of factor models. In this spirit, we estimate the number of common factors in the widely used sparse latent factor model with spike-and-slab priors on the factor loadings matrix. Our framework leads to a natural, efficient and simultaneous coupling of model estimation and selection on one hand and model identification and rank estimation (number of factors) on the other hand. More precisely, by embedding the unordered generalized lower triangular loadings representation into overfitting sparse factor modelling, we obtain posterior summaries regarding factor loadings, common factors as well as the factor dimension via postprocessing draws from our efficient and customized Markov chain Monte Carlo scheme.

Submitted 16 January, 2023; originally announced January 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:1804.04231

MSC Class: 62H25 (Primary) 62F15; 65C05 (Secondary)

arXiv:2301.06354 [pdf, ps, other]

doi 10.3390/econometrics11040026

When it counts -- Econometric identification of the basic factor model based on GLT structures

Authors: Sylvia Frühwirth-Schnatter, Darjus Hosszejni, Hedibert Freitas Lopes

Abstract: Despite the popularity of factor models with sparse loading matrices, little attention has been given to formally address identifiability of these models beyond standard rotation-based identification such as the positive lower triangular (PLT) constraint. To fill this gap, we review the advantages of variance identification in sparse factor analysis and introduce the generalized lower triangular (GLT) structures. We show that the GLT assumption is an improvement over PLT without compromise: GLT is also unique but, unlike PLT, a non-restrictive assumption. Furthermore, we provide a simple counting rule for variance identification under GLT structures, and we demonstrate that within this model class the unknown number of common factors can be recovered in an exploratory factor analysis. Our methodology is illustrated for simulated data in the context of post-processing posterior draws in Bayesian sparse factor analysis.

Submitted 16 January, 2023; originally announced January 2023.

MSC Class: 62H25 (Primary) 15A24; 62F15 (Secondary)

arXiv:2105.09512 [pdf, other]

doi 10.1016/j.cpc.2014.01.006

Uncertainty quantification through Monte Carlo method in a cloud computing setting

Authors: A. Cunha Jr, R. Nasser, R. Sampaio, H. Lopes, K. Breitman

Abstract: The Monte Carlo (MC) method is the most common technique used for uncertainty quantification, due to its simplicity and good statistical results. However, its computational cost is extremely high, and, in many cases, prohibitive. Fortunately, the MC algorithm is easily parallelizable, which allows its use in simulations where the computation of a single realization is very costly. This work presents a methodology for the parallelization of the MC method, in the context of cloud computing. This strategy is based on the MapReduce paradigm, and allows an efficient distribution of tasks in the cloud. This methodology is illustrated on a problem of structural dynamics that is subject to uncertainties. The results show that the technique is capable of producing good results concerning statistical moments of low order. It is shown that even a simple problem may require many realizations for convergence of histograms, which makes the cloud computing strategy very attractive (due to its high scalability capacity and low cost). Additionally, the results regarding the time of processing and storage space usage allow one to qualify this new methodology as a solution for simulations that require a number of MC realizations beyond the standard.

Submitted 20 May, 2021; originally announced May 2021.

MSC Class: 62D05

ACM Class: G.3

Journal ref: Computer Physics Communications, vol. 185, pp. 1355-1363, 2014

arXiv:2009.14296 [pdf, other]

The Illusion of the Illusion of Sparsity: An exercise in prior sensitivity

Authors: Bruno Fava, Hedibert F. Lopes

Abstract: The emergence of Big Data raises the question of how to model economic relations when there is a large number of possible explanatory variables. We revisit the issue by comparing the possibility of using dense or sparse models in a Bayesian approach, allowing for variable selection and shrinkage. More specifically, we discuss the results reached by Giannone, Lenza, and Primiceri (2020) through a "Spike-and-Slab" prior, which suggest an "illusion of sparsity" in economic data, as no clear patterns of sparsity could be detected. We make a further revision of the posterior distributions of the model, and propose three experiments to evaluate the robustness of the adopted prior distribution. We find that the pattern of sparsity is sensitive to the prior distribution of the regression coefficients, and present evidence that the model indirectly induces variable selection and shrinkage, which suggests that the "illusion of sparsity" could be, itself, an illusion. Code is available on github.com/bfava/IllusionOfIllusion.

Submitted 29 September, 2020; originally announced September 2020.

Comments: 33 pages, 11 figures

arXiv:2009.14131 [pdf, other]

Dynamic sparsity on dynamic regression models

Authors: Paloma W. Uribe, Hedibert F. Lopes

Abstract: In the present work, we consider variable selection and shrinkage for the Gaussian dynamic linear regression within a Bayesian framework. In particular, we propose a novel method that allows for time-varying sparsity, based on an extension of spike-and-slab priors for dynamic models. This is done by assigning appropriate Markov switching priors for the time-varying coefficients' variances, extending the previous work of Ishwaran and Rao (2005). Furthermore, we investigate different priors, including the common Inverted gamma prior for the process variances, and other mixture prior distributions such as Gamma priors for both the spike and the slab, which leads to a mixture of Normal-Gamma priors (Griffin and Brown, 2010) for the coefficients. In this sense, our prior can be viewed as a dynamic variable selection prior which induces either smoothness (through the slab) or shrinkage towards zero (through the spike) at each time point. The MCMC method used for posterior computation uses Markov latent variables that can assume binary regimes at each time point to generate the coefficients' variances. In that way, our model is a dynamic mixture model, so we can use the algorithm of Gerlach et al. (2000) to generate the latent processes without conditioning on the states. Finally, our approach is exemplified through simulated examples and a real data application.

Submitted 29 September, 2020; originally announced September 2020.

Comments: 31 pages, 5 figures

arXiv:2006.11908 [pdf, ps, other]

Decoupling Shrinkage and Selection in Gaussian Linear Factor Analysis

Authors: Henrique Bolfarine, Carlos M. Carvalho, Hedibert F. Lopes, Jared S. Murray

Abstract: Factor Analysis is a popular method for modeling dependence in multivariate data. However, determining the number of factors and obtaining a sparse orientation of the loadings are still major challenges. In this paper, we propose a decision-theoretic approach that brings to light the relation between a sparse representation of the loadings and factor dimension. This relation is established through a summary of the information contained in the multivariate posterior. To construct such a summary, we introduce a three-step approach. In the first step, the model is fitted with a conservative factor dimension. In the second step, a series of sparse point-estimates, with a decreasing number of factors, is obtained by minimizing an expected predictive loss function. In step three, the degradation in utility in relation to the sparse loadings and factor dimensions is displayed in the posterior summary. The findings are illustrated with applications in classical data from the Factor Analysis literature. We used different prior choices and factor dimensions to demonstrate the flexibility of the proposed method.

Submitted 24 July, 2021; v1 submitted 21 June, 2020; originally announced June 2020.

Comments: 22 pages, 7 figures

arXiv:2003.05377 [pdf, other]

Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network

Authors: Raul de Araújo Lima, Rômulo César Costa de Sousa, Simone Diniz Junqueira Barbosa, Hélio Cortês Vieira Lopes

Abstract: Organizing songs, albums, and artists into groups with shared similarity can be done with the help of genre labels. In this paper, we present a novel approach for automatically classifying musical genre in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distributed in 14 genres. We apply SVM, Random Forest and a Bidirectional Long Short-Term Memory (BLSTM) network combined with different word embedding techniques to address this classification task. Our experiments show that the BLSTM method outperforms the other models with an F1-score average of 0.48. Some genres like "gospel", "funk-carioca" and "sertanejo", which obtained F1-scores of 0.89, 0.70 and 0.69, respectively, can be defined as the most distinct and easy to classify in the Brazilian musical genres context.

Submitted 6 March, 2020; originally announced March 2020.

Comments: 7 pages, 4 figures, 3 tables

MSC Class: 68T50 (Primary); 68T05 (Secondary)

ACM Class: I.2.7; I.2.6

arXiv:1907.03155 [pdf, other]

Learning a latent pattern of heterogeneity in the innovation rates of a time series of counts

Authors: Helton Graziadei, Hedibert F. Lopes, Paulo C. Marques F

Abstract: We develop a Bayesian hierarchical semiparametric model for phenomena related to time series of counts. The main feature of the model is its capability to learn a latent pattern of heterogeneity in the distribution of the process innovation rates, which are softly clustered through time with the help of a Dirichlet process placed at the top of the model hierarchy. The probabilistic forecasting capabilities of the model are put to test in the analysis of crime data in Pittsburgh, with favorable results.

Submitted 6 July, 2019; originally announced July 2019.

arXiv:1808.09507 [pdf, other]

Tree-Based Bayesian Treatment Effect Analysis

Authors: Pedro Henrique Filipini dos Santos, Hedibert Freitas Lopes

Abstract: The inclusion of the propensity score as a covariate in Bayesian regression trees for causal inference can reduce the bias in treatment effect estimations, which occurs due to the regularization-induced confounding phenomenon. This study advocates for the use of the propensity score by evaluating it under a full-Bayesian variable selection setting, and for the use of Individual Conditional Expectation Plots, a graphical tool that can improve treatment effect analysis on tree-based Bayesian models and other "black box" models. The first one, even if poorly estimated, can lead to bias reduction in the estimated treatment effects, while the latter can be used to find groups of individuals which have different responses to the applied treatment, and to analyze the impact of each variable on the estimated treatment effect.

Submitted 28 August, 2018; originally announced August 2018.

arXiv:1806.05738 [pdf, other]

Efficient sampling for Gaussian linear regression with arbitrary priors

Authors: P. Richard Hahn, Jingyu He, Hedibert Lopes

Abstract: This paper develops a slice sampler for Bayesian linear regression models with arbitrary priors. The new sampler has two advantages over current approaches. One, it is faster than many custom implementations that rely on auxiliary latent variables, if the number of regressors is large. Two, it can be used with any prior with a density function that can be evaluated up to a normalizing constant, making it ideal for investigating the properties of new shrinkage priors without having to develop custom sampling algorithms. The new sampler takes advantage of the special structure of the linear regression likelihood, allowing it to produce better effective sample size per second than common alternative approaches.

Submitted 14 June, 2018; originally announced June 2018.

arXiv:1804.04231 [pdf, other]

Sparse Bayesian Factor Analysis when the Number of Factors is Unknown

Authors: Sylvia Fruehwirth-Schnatter, Hedibert Freitas Lopes

Abstract: Despite the popularity of sparse factor models, little attention has been given to formally address identifiability of these models beyond standard rotation-based identification such as the positive lower triangular constraint. To fill this gap, we provide a counting rule on the number of nonzero factor loadings that is sufficient for achieving uniqueness of the variance decomposition in the factor representation. Furthermore, we introduce the generalised lower triangular representation to resolve rotational invariance and show that within this model class the unknown number of common factors can be recovered in an overfitting sparse factor model. By combining point-mass mixture priors with a highly efficient and customised MCMC scheme, we obtain posterior summaries regarding the number of common factors as well as the factor loadings via postprocessing. Our methodology is illustrated for monthly exchange rates of 22 currencies with respect to the euro over a period of eight years and for monthly log returns of 73 firms from the NYSE100 over a period of 20 years.

Submitted 11 April, 2018; originally announced April 2018.

Comments: 62 pages, 7 figures, 7 tables

arXiv:1710.08901 [pdf]

Calibration of Machine Learning Classifiers for Probability of Default Modelling

Authors: Pedro G. Fonseca, Hugo D. Lopes

Abstract: Binary classification is widely used in credit scoring in the estimation of probability of default. The validation of such predictive models is based both on rank ability and on calibration (i.e. how accurately the probabilities output by the model map to the observed probabilities). In this study we cover the current best practices regarding calibration for binary classification, and explore how different approaches yield different results on real world credit scoring data. The limitations of evaluating credit scoring models using only rank ability metrics are explored. A benchmark is run on 18 real world datasets, and the results are compared. The calibration techniques used are Platt Scaling and Isotonic Regression. Also, different machine learning models are used: Logistic Regression, Random Forest Classifiers, and Gradient Boosting Classifiers. Results show that when the dataset is treated as a time series, the use of re-calibration with Isotonic Regression is able to improve the long term calibration better than the alternative methods. Using re-calibration, the non-parametric models are able to outperform the Logistic Regression on Brier Score Loss.

Submitted 24 October, 2017; originally announced October 2017.

Comments: Keywords: Binary classification, Probability of Default, Calibration, Credit Risk, Isotonic Regression, Platt Scaling

arXiv:1602.08154 [pdf, other]

doi 10.1080/10618600.2017.1322091

Efficient Bayesian Inference for Multivariate Factor Stochastic Volatility Models

Authors: Gregor Kastner, Sylvia Frühwirth-Schnatter, Hedibert Freitas Lopes

Abstract: We discuss efficient Bayesian estimation of dynamic covariance matrices in multivariate time series through a factor stochastic volatility model. In particular, we propose two interweaving strategies (Yu and Meng, Journal of Computational and Graphical Statistics, 20(3), 531-570, 2011) to substantially accelerate convergence and mixing of standard MCMC approaches. Similar to marginal data augmentation techniques, the proposed acceleration procedures exploit non-identifiability issues which frequently arise in factor models. Our new interweaving strategies are easy to implement and come at almost no extra computational cost; nevertheless, they can boost estimation efficiency by several orders of magnitude as is shown in extensive simulation studies. To conclude, the application of our algorithm to a 26-dimensional exchange rate data set illustrates the superior performance of the new approach for real-world data.

Submitted 19 July, 2017; v1 submitted 25 February, 2016; originally announced February 2016.

Journal ref: Journal of Computational and Graphical Statistics 26(4), 905-917 (2017)

arXiv:1602.08066 [pdf, other]

Scalable semiparametric inference for the means of heavy-tailed distributions

Authors: Matt Taddy, Hedibert Freitas Lopes, Matt Gardner

Abstract: Heavy tailed distributions present a tough setting for inference. They are also common in industrial applications, particularly with Internet transaction datasets, and machine learners often analyze such data without considering the biases and risks associated with the misuse of standard tools. This paper outlines a procedure for inference about the mean of a (possibly conditional) heavy tailed distribution that combines nonparametric analysis for the bulk of the support with Bayesian parametric modeling -- motivated from extreme value theory -- for the heavy tail. The procedure is fast and massively scalable. The resulting point estimators attain lowest-possible error rates and, unique among alternatives, we are able to provide accurate uncertainty quantification for these estimators. The work should find application in settings wherever correct inference is important and reward tails are heavy; we illustrate the framework in causal inference for A/B experiments involving hundreds of millions of users of eBay.com.

Submitted 13 October, 2016; v1 submitted 25 February, 2016; originally announced February 2016.

arXiv:1408.0462 [pdf, other]

Shrinkage priors for linear instrumental variable models with many instruments

Authors: P. Richard Hahn, Hedibert Lopes

Abstract: This paper addresses the weak instruments problem in linear instrumental variable models from a Bayesian perspective. The new approach has two components. First, a novel predictor-dependent shrinkage prior is developed for the many instruments setting. The prior is constructed based on a factor model decomposition of the matrix of observed instruments, allowing many instruments to be incorporated into the analysis in a robust way. Second, the new prior is implemented via an importance sampling scheme, which utilizes posterior Monte Carlo samples from a first-stage Bayesian regression analysis. This modular computation makes sensitivity analyses straightforward. Two simulation studies are provided to demonstrate the advantages of the new method. As an empirical illustration, the new method is used to estimate a key parameter in macro-economic models: the elasticity of inter-temporal substitution. The empirical analysis produces substantive conclusions in line with previous studies, but certain inconsistencies of earlier analyses are resolved.

Submitted 3 August, 2014; originally announced August 2014.

Comments: 27 pages, 6 figures, 3 tables

arXiv:1203.4119 [pdf, ps, other]

doi 10.1214/11-AOAS497

Measuring the vulnerability of the Uruguayan population to vector-borne diseases via spatially hierarchical factor models

Authors: Hedibert F. Lopes, Alexandra M. Schmidt, Esther Salazar, Mariana Gómez, Marcel Achkar

Abstract: We propose a model-based vulnerability index of the population from Uruguay to vector-borne diseases. We have available measurements of a set of variables at the census tract level of the 19 Departmental capitals of Uruguay. In particular, we propose an index that combines different sources of information via a set of micro-environmental indicators and geographical location in the country. Our index is based on a new class of spatially hierarchical factor models that explicitly account for the different levels of hierarchy in the country, such as census tracts at the city level, and cities at the country level. We compare our approach with that obtained when data are aggregated at the city level. We show that our proposal outperforms current and standard approaches, which fail to properly account for discrepancies in the region sizes, for example, number of census tracts. We also show that data aggregation can seriously affect the estimation of the cities' vulnerability rankings under benchmark models.

Submitted 19 March, 2012; originally announced March 2012.

Comments: Published at http://dx.doi.org/10.1214/11-AOAS497 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS497

Journal ref: Annals of Applied Statistics 2012, Vol. 6, No. 1, 284-303

arXiv:1011.1098 [pdf, ps, other]

doi 10.1214/10-STS325

Particle Learning and Smoothing

Authors: Carlos M. Carvalho, Michael S. Johannes, Hedibert F. Lopes, Nicholas G. Polson

Abstract: Particle learning (PL) provides state filtering, sequential parameter learning and smoothing in a general class of state space models. Our approach extends existing particle methods by incorporating the estimation of static parameters via a fully-adapted filter that utilizes conditional sufficient statistics for parameters and/or states as particles. State smoothing in the presence of parameter uncertainty is also solved as a by-product of PL. In a number of examples, we show that PL outperforms existing particle filtering alternatives and proves to be a competitor to MCMC.

Submitted 4 November, 2010; originally announced November 2010.

Comments: Published at http://dx.doi.org/10.1214/10-STS325 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-STS-STS325

Journal ref: Statistical Science 2010, Vol. 25, No. 1, 88-106

Showing 1–17 of 17 results for author: Lopes, H