-
Incorporating circuit theory into a dynamic model for crowd-sourced observations of migratory birds
Authors:
Michael F. Christensen,
Peter D. Hoff
Abstract:
While the overarching pattern of biannual avian migration is well understood, there are significant questions pertaining to this phenomenon that invite further study. Necessary to any analysis of these questions is an understanding of how a given species' spatial distribution evolves in time. While studies of animal movement are often conducted using telemetry data, the collection of such data can be time- and resource-intensive, frequently resulting in small sample sizes. Ecological surveys of animal populations are also indicative of species distribution trends, but may be constrained to a limited spatial domain. Within this article we utilize crowd-sourced observations from the eBird database to model the abundance of migratory bird species in space and time. While crowd-sourced observations are individually less reliable than those produced by experts, the sheer size and spatial coverage of the eBird database make it attractive for use in this setting. We introduce a hidden Markov model for observed bird counts utilizing a novel transition structure developed using principles from circuit theory. After illustrating model properties we fit it to observations of Baltimore orioles and yellow-rumped warblers within the eastern United States and discuss insight it provides into the migratory patterns for these species.
Submitted 2 July, 2024;
originally announced July 2024.
-
Frequentist Prediction Sets for Species Abundance using Indirect Information
Authors:
Elizabeth Bersson,
Peter D. Hoff
Abstract:
Citizen science databases that consist of volunteer-led sampling efforts of species communities are relied on as essential sources of data in ecology. Summarizing such data across counties with frequentist-valid prediction sets for each county provides an interpretable comparison across counties of varying size or composition. As citizen science data often feature unequal sampling efforts across a spatial domain, prediction sets constructed with indirect methods that share information across counties may be used to improve precision. In this article, we present a nonparametric framework to obtain precise prediction sets for a multinomial random sample based on indirect information that maintain frequentist coverage guarantees for each county. We detail a simple algorithm to obtain prediction sets for each county using indirect information where the computation time does not depend on the sample size and scales nicely with the number of species considered. The indirect information may be estimated by a proposed empirical Bayes procedure based on information from auxiliary data. Our approach makes inference for under-sampled counties more precise, while maintaining area-specific frequentist validity for each county. Our method is used to provide a useful description of avian species abundance in North Carolina, USA based on citizen science data from the eBird database.
Submitted 27 November, 2023;
originally announced November 2023.
-
Bayesian Covariance Estimation for Multi-group Matrix-variate Data
Authors:
Elizabeth Bersson,
Peter D. Hoff
Abstract:
Multi-group covariance estimation for matrix-variate data with small within group sample sizes is a key part of many data analysis tasks in modern applications. To obtain accurate group-specific covariance estimates, shrinkage estimation methods which shrink an unstructured, group-specific covariance either across groups towards a pooled covariance or within each group towards a Kronecker structure have been developed. However, in many applications, it is unclear which approach will result in more accurate covariance estimates. In this article, we present a hierarchical prior distribution which flexibly allows for both types of shrinkage. The prior linearly combines shrinkage across groups towards a shared pooled covariance and shrinkage within groups towards a group-specific Kronecker covariance. We illustrate the utility of the proposed prior in speech recognition and an analysis of chemical exposure data.
Submitted 7 March, 2024; v1 submitted 17 February, 2023;
originally announced February 2023.
-
A flexible and interpretable spatial covariance model for data on graphs
Authors:
Michael F. Christensen,
Peter D. Hoff
Abstract:
Spatial models for areal data are often constructed such that all pairs of adjacent regions are assumed to have near-identical spatial autocorrelation. In practice, data can exhibit dependence structures more complicated than can be represented under this assumption. In this article we develop a new model for spatially correlated data observed on graphs, which can flexibly represent many types of spatial dependence patterns while retaining aspects of the original graph geometry. Our method implies an embedding of the graph into Euclidean space wherein covariance can be modeled using traditional covariance functions, such as those from the Matérn family. We parameterize our model using a class of graph metrics compatible with such covariance functions, and which characterize distance in terms of network flow, a property useful for understanding proximity in many ecological settings. By estimating the parameters underlying these metrics, we recover the "intrinsic distances" between graph nodes, which assist in the interpretation of the estimated covariance and allow us to better understand the relationship between the observed process and spatial domain. We compare our model to existing methods for spatially dependent graph data, primarily conditional autoregressive models and their variants, and illustrate advantages of our method over traditional approaches. We fit our model to bird abundance data for several species in North Carolina, and show how it provides insight into the interactions between species-specific spatial distributions and geography.
Submitted 2 July, 2024; v1 submitted 21 July, 2022;
originally announced July 2022.
-
Optimal Conformal Prediction for Small Areas
Authors:
Elizabeth Bersson,
Peter D. Hoff
Abstract:
Existing inferential methods for small area data involve a trade-off between maintaining area-level frequentist coverage rates and improving inferential precision via the incorporation of indirect information. In this article, we propose a method to obtain an area-level prediction region for a future observation which mitigates this trade-off. The proposed method takes a conformal prediction approach in which the conformity measure is the posterior predictive density of a working model that incorporates indirect information. The resulting prediction region has guaranteed frequentist coverage regardless of the working model, and, if the working model assumptions are accurate, the region has minimum expected volume compared to other regions with the same coverage rate. When constructed under a normal working model, we prove such a prediction region is an interval and construct an efficient algorithm to obtain the exact interval. We illustrate the performance of our method through simulation studies and an application to EPA radon survey data.
Submitted 17 April, 2022;
originally announced April 2022.
-
The multirank likelihood for semiparametric canonical correlation analysis
Authors:
Jordan G. Bryan,
Jonathan Niles-Weed,
Peter D. Hoff
Abstract:
Many analyses of multivariate data focus on evaluating the dependence between two sets of variables, rather than the dependence among individual variables within each set. Canonical correlation analysis (CCA) is a classical data analysis technique that estimates parameters describing the dependence between such sets. However, inference procedures based on traditional CCA rely on the assumption that all variables are jointly normally distributed. We present a semiparametric approach to CCA in which the multivariate margins of each variable set may be arbitrary, but the dependence between variable sets is described by a parametric model that provides low-dimensional summaries of dependence. While maximum likelihood estimation in the proposed model is intractable, we propose two estimation strategies: one using a pseudolikelihood for the model and one using a Markov chain Monte Carlo (MCMC) algorithm that provides Bayesian estimates and confidence regions for the between-set dependence parameters. The MCMC algorithm is derived from a multirank likelihood function, which uses only part of the information in the observed data in exchange for being free of assumptions about the multivariate margins. We apply the proposed Bayesian inference procedure to Brazilian climate data and monthly stock returns from the materials and communications market sectors.
Submitted 22 April, 2024; v1 submitted 14 December, 2021;
originally announced December 2021.
-
A Latent Variable Model for Relational Events with Multiple Receivers
Authors:
Joris Mulder,
Peter D. Hoff
Abstract:
Directional relational event data, such as email data, often contain unicast messages (i.e., messages from one sender to one receiver) and multicast messages (i.e., messages from one sender to multiple receivers). In the Enron email data that are the focus of this paper, 31% of the messages are multicast. Multicast messages contain important information about the roles of actors in the network, which is needed for better understanding social interaction dynamics. In this paper a multiplicative latent factor model is proposed to analyze such relational data. For a given message, all potential receiver actors are placed on a suitability scale, and the actors whose suitability scores exceed a threshold value are included in the receiver set. Unobserved heterogeneity in the social interaction behavior is captured using a multiplicative latent factor structure with latent variables for actors (which differ for actors as senders and receivers) and latent variables for individual messages. A Bayesian computational algorithm, which relies on Gibbs sampling, is proposed for model fitting. Model assessment is done using posterior predictive checks. Based on our analyses of the Enron email data, a mc-amen model with a two-dimensional latent variable can accurately capture the empirical distribution of the cardinality of the receiver set and the composition of the receiver sets for commonly observed messages. Moreover, the results show that actors have a comparable (but not identical) role as a sender and as a receiver in the network.
Submitted 21 February, 2024; v1 submitted 13 January, 2021;
originally announced January 2021.
-
Hierarchical Multidimensional Scaling for the Comparison of Musical Performance Styles
Authors:
Anna K. Yanchenko,
Peter D. Hoff
Abstract:
Quantification of stylistic differences between musical artists is of academic interest to the music community, and is also useful for other applications such as music information retrieval and recommendation systems. Information about stylistic differences can be obtained by comparing the performances of different artists across common musical pieces. In this article, we develop a statistical methodology for identifying and quantifying systematic stylistic differences among artists that are consistent across audio recordings of a common set of pieces, in terms of several musical features. Our focus is on a comparison of ten different orchestras, based on data from audio recordings of the nine Beethoven symphonies. As generative or fully parametric models of raw audio data can be highly complex, and more complex than necessary for our goal of identifying differences between orchestras, we propose to reduce the data from a set of audio recordings down to pairwise distances between orchestras, based on different musical characteristics of the recordings, such as tempo, dynamics, and timbre. For each of these characteristics, we obtain multiple pairwise distance matrices, one for each movement of each symphony. We develop a hierarchical multidimensional scaling (HMDS) model to identify and quantify systematic differences between orchestras in terms of these three musical characteristics, and interpret the results in the context of known qualitative information about the orchestras. This methodology is able to recover several expected systematic similarities between orchestras, as well as to identify some more novel results. For example, we find that modern recordings exhibit a high degree of similarity to each other, as compared to older recordings.
Submitted 21 December, 2020; v1 submitted 28 April, 2020;
originally announced April 2020.
-
Smaller $p$-values in genomics studies using distilled historical information
Authors:
Jordan G. Bryan,
Peter D. Hoff
Abstract:
Medical research institutions have generated massive amounts of biological data by genetically profiling hundreds of cancer cell lines. In parallel, academic biology labs have conducted genetic screens on small numbers of cancer cell lines under custom experimental conditions. In order to share information between these two approaches to scientific discovery, this article proposes a "frequentist assisted by Bayes" (FAB) procedure for hypothesis testing that allows historical information from massive genomics datasets to increase the power of hypothesis tests in specialized studies. The exchange of information takes place through a novel probability model for multimodal genomics data, which distills historical information pertaining to cancer cell lines and genes across a wide variety of experimental contexts. If the relevance of the historical information for a given study is high, then the resulting FAB tests can be more powerful than the corresponding classical tests. If the relevance is low, then the FAB tests yield as many discoveries as the classical tests. Simulations and practical investigations demonstrate that the FAB testing procedure can increase the number of effects discovered in genomics studies while still maintaining strict control of type I error and false discovery rates.
Submitted 16 April, 2020;
originally announced April 2020.
-
Smaller $p$-values via indirect information
Authors:
Peter D. Hoff
Abstract:
This article develops $p$-values for evaluating means of normal populations that make use of indirect or prior information. A $p$-value of this type is based on a biased test statistic that is optimal on average with respect to a probability distribution that encodes indirect information about the mean parameter, resulting in a smaller $p$-value if the indirect information is accurate. In a variety of multiparameter settings, we show how to adaptively estimate the indirect information for each mean parameter while still maintaining uniformity of the $p$-values under their null hypotheses. This is done using a linking model through which indirect information about the mean of one population may be obtained from the data of other populations. Importantly, the linking model does not need to be correct to maintain the uniformity of the $p$-values under their null hypotheses. This methodology is illustrated in several data analysis scenarios, including small area inference, spatially arranged populations, interactions in linear regression, and generalized linear models.
Submitted 10 December, 2019; v1 submitted 29 July, 2019;
originally announced July 2019.
-
Monte Carlo simulation on the Stiefel manifold via polar expansion
Authors:
Michael Jauch,
Peter D. Hoff,
David B. Dunson
Abstract:
Motivated by applications to Bayesian inference for statistical models with orthogonal matrix parameters, we present $\textit{polar expansion},$ a general approach to Monte Carlo simulation from probability distributions on the Stiefel manifold. To bypass many of the well-established challenges of simulating from the distribution of a random orthogonal matrix $\boldsymbol{Q},$ we construct a distribution for an unconstrained random matrix $\boldsymbol{X}$ such that $\boldsymbol{Q}_X,$ the orthogonal component of the polar decomposition of $\boldsymbol{X},$ is equal in distribution to $\boldsymbol{Q}.$ The distribution of $\boldsymbol{X}$ is amenable to Markov chain Monte Carlo (MCMC) simulation using standard methods, and an approximation to the distribution of $\boldsymbol{Q}$ can be recovered from a Markov chain on the unconstrained space. When combined with modern MCMC software, polar expansion allows for routine and flexible posterior inference in models with orthogonal matrix parameters. We find that polar expansion with adaptive Hamiltonian Monte Carlo is an order of magnitude more efficient than competing MCMC approaches in a benchmark protein interaction network application. We also propose a new approach to Bayesian functional principal components analysis which we illustrate in a meteorological time series application.
Submitted 18 June, 2019;
originally announced June 2019.
-
Structured Shrinkage Priors
Authors:
Maryclare Griffin,
Peter D. Hoff
Abstract:
In many regression settings the unknown coefficients may have some known structure, for instance they may be ordered in space or correspond to a vectorized matrix or tensor. At the same time, the unknown coefficients may be sparse, with many nearly or exactly equal to zero. However, many commonly used priors and corresponding penalties for coefficients do not encourage simultaneously structured and sparse estimates. In this paper we develop structured shrinkage priors that generalize multivariate normal, Laplace, exponential power and normal-gamma priors. These priors allow the regression coefficients to be correlated a priori without sacrificing elementwise sparsity or shrinkage. The primary challenges in working with these structured shrinkage priors are computational, as the corresponding penalties are intractable integrals and the full conditional distributions that are needed to approximate the posterior mode or simulate from the posterior distribution may be non-standard. We overcome these issues using a flexible elliptical slice sampling procedure, and demonstrate that these priors can be used to introduce structure while preserving sparsity.
Submitted 26 April, 2023; v1 submitted 13 February, 2019;
originally announced February 2019.
-
Random orthogonal matrices and the Cayley transform
Authors:
Michael Jauch,
Peter D. Hoff,
David B. Dunson
Abstract:
Random orthogonal matrices play an important role in probability and statistics, arising in multivariate analysis, directional statistics, and models of physical systems, among other areas. Calculations involving random orthogonal matrices are complicated by their constrained support. Accordingly, we parametrize the Stiefel and Grassmann manifolds, represented as subsets of orthogonal matrices, in terms of Euclidean parameters using the Cayley transform. We derive the necessary Jacobian terms for change of variables formulas. Given a density defined on the Stiefel or Grassmann manifold, these allow us to specify the corresponding density for the Euclidean parameters, and vice versa. As an application, we describe and illustrate through examples a Markov chain Monte Carlo approach to simulating from distributions on the Stiefel and Grassmann manifolds. Finally, we establish an asymptotic independent normal approximation for the distribution of the Euclidean parameters which corresponds to the uniform distribution on the Stiefel manifold. This result contributes to the growing literature on normal approximations to the entries of random orthogonal matrices or transformations thereof.
Submitted 5 October, 2018;
originally announced October 2018.
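The Cayley transform mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common convention, $Q = (I - A)(I + A)^{-1}$ for a skew-symmetric matrix $A$, applied to square matrices only; the paper's own parametrization of the Stiefel and Grassmann manifolds and its Jacobian terms are not reproduced here. The helper names (`skew_from_params`, `cayley`, and the small matrix utilities) are hypothetical and chosen for illustration.

```python
# Sketch: map unconstrained Euclidean parameters to an orthogonal matrix
# via the Cayley transform Q = (I - A)(I + A)^{-1}, with A skew-symmetric.
# Assumption: this convention and all helper names are illustrative only.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    eye = identity(n)
    aug = [[float(v) for v in a[i]] + eye[i] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def skew_from_params(params, n):
    """Build an n x n skew-symmetric matrix from n(n-1)/2 free parameters."""
    a = [[0.0] * n for _ in range(n)]
    it = iter(params)
    for i in range(n):
        for j in range(i):
            v = next(it)
            a[i][j] = v      # strictly lower triangle holds the parameters
            a[j][i] = -v     # upper triangle is the negated mirror image
    return a

def cayley(params, n):
    """Return the orthogonal matrix (I - A)(I + A)^{-1} for the given parameters."""
    a = skew_from_params(params, n)
    eye = identity(n)
    i_minus = [[eye[i][j] - a[i][j] for j in range(n)] for i in range(n)]
    i_plus = [[eye[i][j] + a[i][j] for j in range(n)] for i in range(n)]
    # I + A is always invertible because A has purely imaginary eigenvalues.
    return mat_mul(i_minus, inverse(i_plus))

q = cayley([0.3, -1.2, 0.7], 3)
qtq = mat_mul(transpose(q), q)   # should be numerically close to the identity
```

Because any real vector of length $n(n-1)/2$ yields a valid orthogonal matrix under this map, a sampler can work on the unconstrained parameters, which is the general idea the abstract describes; a density on the parameters would additionally need the Jacobian term derived in the paper.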
-
Additive and multiplicative effects network models
Authors:
Peter D. Hoff
Abstract:
Network datasets typically exhibit certain types of statistical dependencies, such as within-dyad correlation, row and column heterogeneity, and third-order dependence patterns such as transitivity and clustering. The first two of these can be well-represented statistically with a social relations model, a type of additive random effects model originally developed for continuous dyadic data. Third-order patterns can be represented with multiplicative random effects models, which are related to matrix decompositions commonly used for matrix-variate data analysis. Additionally, these multiplicative random effects models generalize other popular latent variable network models, such as the stochastic blockmodel and the latent space model. In this article we review a general regression framework for the analysis of network data that combines these two types of random effects and accommodates a variety of network data types, including continuous, binary and ordinal network relations.
Submitted 20 July, 2018;
originally announced July 2018.
-
Adaptive Sign Error Control
Authors:
Chaoyu Yu,
Peter D. Hoff
Abstract:
In multiple testing scenarios, the sign of a parameter is typically inferred when its estimate exceeds some significance threshold in absolute value. The significance threshold is usually chosen to control the experimentwise type I error rate, family-wise type I error rate or the false discovery rate. However, controlling these error rates does not explicitly control the sign error rate. In this paper, we propose two procedures for adaptively selecting an experimentwise significance threshold in order to control the sign error rate. The first controls the sign error rate conservatively, without any distributional assumptions on the parameters of interest. The second is an empirical Bayes procedure, and achieves optimal performance asymptotically when a model for the distribution of the parameters is correctly specified. We also discuss an adaptive procedure to minimize the sign error rate when the experimentwise type I error rate is held fixed.
Submitted 30 December, 2017;
originally announced January 2018.
-
Testing Sparsity-Inducing Penalties
Authors:
Maryclare Griffin,
Peter D. Hoff
Abstract:
Many penalized maximum likelihood estimators correspond to posterior mode estimators under specific prior distributions. Appropriateness of a particular class of penalty functions can therefore be interpreted as the appropriateness of a prior for the parameters. For example, the appropriateness of a lasso penalty for regression coefficients depends on the extent to which the empirical distribution of the regression coefficients resembles a Laplace distribution. We give a testing procedure of whether or not a Laplace prior is appropriate and accordingly, whether or not using a lasso penalized estimate is appropriate. This testing procedure is designed to have power against exponential power priors which correspond to $\ell_q$ penalties. Via simulations, we show that this testing procedure achieves the desired level and has enough power to detect violations of the Laplace assumption when the numbers of observations and unknown regression coefficients are large. We then introduce an adaptive procedure that chooses a more appropriate prior and corresponding penalty from the class of exponential power priors when the null hypothesis is rejected. We show that this can improve estimation of the regression coefficients both when they are drawn from an exponential power distribution and when they are drawn from a spike-and-slab distribution.
Submitted 8 September, 2018; v1 submitted 17 December, 2017;
originally announced December 2017.
-
Multiplicative Coevolution Regression Models for Longitudinal Networks and Nodal Attributes
Authors:
Yanjun He,
Peter D. Hoff
Abstract:
We introduce a simple and extendable coevolution model for the analysis of longitudinal network and nodal attribute data. The model features parameters that describe three phenomena: homophily, contagion and autocorrelation of the network and nodal attribute process. Homophily here describes how changes to the network may be associated with between-node similarities in terms of their nodal attributes. Contagion refers to how node-level attributes may change depending on the network. The model we present is based upon a pair of intertwined autoregressive processes. We obtain least-squares parameter estimates for continuous-valued fully-observed network and attribute data. We also provide methods for Bayesian inference in several other cases, including ordinal network and attribute data, and models involving latent nodal attributes. These model extensions are applied to an analysis of international relations data and to data from a study of teen delinquency and friendship networks.
Submitted 7 December, 2017;
originally announced December 2017.
-
Influence Networks in International Relations
Authors:
Shahryar Minhas,
Peter D. Hoff,
Michael D. Ward
Abstract:
Measuring influence and determining what drives it are persistent questions in political science and in network analysis more generally. Herein we focus on the domain of international relations. Our major substantive question is: How can we determine what characteristics make an actor influential? To address the topic of influence, we build on a multilinear tensor regression framework (MLTR) that captures influence relationships using a tensor generalization of a vector autoregression model. Influence relationships in that approach are captured in a pair of n x n matrices and provide measurements of how the network actions of one actor may influence the future actions of another. A limitation of the MLTR and earlier latent space approaches is that there are no direct mechanisms through which to explain why a certain actor is more or less influential than others. Our new framework, social influence regression, provides a way to statistically model the influence of one actor on another as a function of characteristics of the actors. Thus we can move beyond just estimating that an actor influences another to understanding why. To highlight the utility of this approach, we apply it to studying monthly-level conflictual events between countries as measured through the Integrated Crisis Early Warning System (ICEWS) event data project.
Submitted 27 June, 2017;
originally announced June 2017.
-
Exact adaptive confidence intervals for linear regression coefficients
Authors:
Peter D. Hoff,
Chaoyu Yu
Abstract:
We propose an adaptive confidence interval procedure (CIP) for the coefficients in the normal linear regression model. This procedure has a frequentist coverage rate that is constant as a function of the model parameters, yet provides smaller intervals than the usual interval procedure, on average across regression coefficients. The proposed procedure is obtained by defining a class of CIPs that all have exact $1-α$ frequentist coverage, and then selecting from this class the procedure that minimizes a prior expected interval width. Such a procedure may be described as "frequentist, assisted by Bayes" or FAB. We describe an adaptive approach for estimating the prior distribution from the data so that exact non-asymptotic $1-α$ coverage is maintained. Additionally, in a "$p$ growing with $n$" asymptotic scenario, this adaptive FAB procedure is asymptotically Bayes-optimal among $1-α$ frequentist CIPs.
Submitted 6 July, 2017; v1 submitted 23 May, 2017;
originally announced May 2017.
-
Lasso ANOVA Decompositions for Matrix and Tensor Data
Authors:
Maryclare Griffin,
Peter D. Hoff
Abstract:
Consider the problem of estimating the entries of an unknown mean matrix or tensor given a single noisy realization. In the matrix case, this problem can be addressed by decomposing the mean matrix into a component that is additive in the rows and columns, i.e., the additive ANOVA decomposition of the mean matrix, plus a matrix of elementwise effects, and assuming that the elementwise effects may be sparse. Accordingly, the mean matrix can be estimated by solving a penalized regression problem, applying a lasso penalty to the elementwise effects. Although solving this penalized regression problem is straightforward, specifying appropriate values of the penalty parameters is not. Leveraging the posterior mode interpretation of the penalized regression problem, moment-based empirical Bayes estimators of the penalty parameters can be defined. Estimation of the mean matrix using these moment-based empirical Bayes estimators can be called LANOVA penalization, and the corresponding estimate of the mean matrix can be called the LANOVA estimate. The empirical Bayes estimators are shown to be consistent. Additionally, LANOVA penalization is extended to accommodate sparsity of row and column effects and to estimate an unknown mean tensor. The behavior of the LANOVA estimate is examined under misspecification of the distribution of the elementwise effects, and LANOVA penalization is applied to several datasets, including a matrix of microarray data, a three-way tensor of fMRI data and a three-way tensor of wheat infection data.
Submitted 8 February, 2019; v1 submitted 24 March, 2017;
originally announced March 2017.
-
Adaptive multigroup confidence intervals with constant coverage
Authors:
Chaoyu Yu,
Peter D. Hoff
Abstract:
Confidence intervals for the means of multiple normal populations are often based on a hierarchical normal model. While commonly used interval procedures based on such a model have the nominal coverage rate on average across a population of groups, their actual coverage rate for a given group will be above or below the nominal rate, depending on the value of the group mean. Alternatively, a coverage rate that is constant as a function of a group's mean can be simply achieved by using a standard $t$-interval, based on data only from that group. The standard $t$-interval, however, fails to share information across the groups and is therefore not adaptive to easily obtained information about the distribution of group-specific means.
In this article we construct confidence intervals that have a constant frequentist coverage rate and that make use of information about across-group heterogeneity, resulting in constant-coverage intervals that are narrower than standard $t$-intervals on average across groups. Such intervals are constructed by inverting biased tests for the mean of a normal population. Given a prior distribution on the mean, Bayes-optimal biased tests can be inverted to form Bayes-optimal confidence intervals with frequentist coverage that is constant as a function of the mean. In the context of multiple groups, the prior distribution is replaced by a model of across-group heterogeneity. The parameters for this model can be estimated using data from all of the groups, and used to obtain confidence intervals with constant group-specific coverage that adapt to information about the distribution of group means.
Submitted 25 December, 2016;
originally announced December 2016.
-
Inferential Approaches for Network Analyses: AMEN for Latent Factor Models
Authors:
Shahryar Minhas,
Peter D. Hoff,
Michael D. Ward
Abstract:
We introduce a Bayesian approach to conduct inferential analyses on dyadic data while accounting for interdependencies between observations through a set of additive and multiplicative effects (AME). The AME model is built on a generalized linear modeling framework and is thus flexible enough to be applied to a variety of contexts. We contrast the AME model to two prominent approaches in the literature: the latent space model (LSM) and the exponential random graph model (ERGM). Relative to these approaches, we show that the AME approach a) is easy to implement; b) is interpretable in a general linear model framework; c) is computationally straightforward; d) is not prone to degeneracy; e) captures 1st, 2nd, and 3rd order network dependencies; and f) notably outperforms ERGMs and LSMs on a variety of metrics and in an out-of-sample context. In summary, AME offers a straightforward way to undertake nuanced, principled inferential network analysis for a wide range of social science questions.
Submitted 27 July, 2018; v1 submitted 1 November, 2016;
originally announced November 2016.
-
Lasso, fractional norm and structured sparse estimation using a Hadamard product parametrization
Authors:
Peter D. Hoff
Abstract:
Using a multiplicative reparametrization, I show that a subclass of $L_q$ penalties with $q\leq 1$ can be expressed as sums of $L_2$ penalties. It follows that the lasso and other norm-penalized regression estimates may be obtained using a very simple and intuitive alternating ridge regression algorithm. As compared to a similarly intuitive EM algorithm for $L_q$ optimization, the proposed algorithm avoids some numerical instability issues and is also competitive in terms of speed. Furthermore, the proposed algorithm can be extended to accommodate sparse high-dimensional scenarios, generalized linear models, and can be used to create structured sparsity via penalties derived from covariance models for the parameters. Such model-based penalties may be useful for sparse estimation of spatially or temporally structured parameters.
Submitted 18 May, 2017; v1 submitted 31 October, 2016;
originally announced November 2016.
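As an illustration of the alternating ridge regression idea described in the abstract above, here is a minimal sketch for the lasso case only. It is not the author's implementation. It assumes the objective is scaled as 0.5*||y - X b||^2 + lam*||b||_1 and relies on the elementary identity |b| = min over u*v = b of (u^2 + v^2)/2, so that writing b = u * v (elementwise) turns the L1 penalty into two L2 penalties. The function name `hadamard_lasso`, the starting values, and the fixed iteration count are hypothetical choices made for the example.

```python
import numpy as np

def hadamard_lasso(X, y, lam, n_iter=500):
    """Illustrative alternating-ridge solver (an assumption, not the paper's code).

    Minimizes 0.5*||y - X(u*v)||^2 + (lam/2)*(||u||^2 + ||v||^2) by
    alternating exact ridge updates for u and v; the product u*v then
    targets the lasso objective 0.5*||y - X b||^2 + lam*||b||_1.
    """
    p = X.shape[1]
    XtX, Xty = X.T @ X, X.T @ y
    eye = np.eye(p)
    # Start away from zero: u = v = 0 is a stationary point of the updates.
    u = np.ones(p)
    v = np.ones(p)
    for _ in range(n_iter):
        # Ridge regression of y on X diag(v), penalty lam: solve for u.
        u = np.linalg.solve(XtX * np.outer(v, v) + lam * eye, v * Xty)
        # Ridge regression of y on X diag(u), penalty lam: solve for v.
        v = np.linalg.solve(XtX * np.outer(u, u) + lam * eye, u * Xty)
    return u * v

# Check on an orthonormal design, where the lasso solution is known in
# closed form as soft-thresholding of X'y at level lam.
X = np.eye(4)
y = np.array([3.0, -2.0, 0.5, 0.1])
lam = 1.0
beta = hadamard_lasso(X, y, lam)
soft = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
print(beta)
print(soft)
```

Each half-step is an ordinary ridge solve, which is what makes the approach simple to implement. Coefficients that the lasso sets to zero are reached only in the limit here, so a practical version would likely add a convergence check and a small threshold.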
-
Limitations on detecting row covariance in the presence of column covariance
Authors:
Peter D. Hoff
Abstract:
Many inference techniques for multivariate data analysis assume that the rows of the data matrix are realizations of independent and identically distributed random vectors. Such an assumption will be met, for example, if the rows of the data matrix are multivariate measurements on a set of independently sampled units. In the absence of an independent random sample, a relevant question is whether or not a statistical model that assumes such row exchangeability is plausible. One method for assessing this plausibility is a statistical test of row covariation. Maintenance of a constant type I error rate regardless of the column covariance or matrix mean can be accomplished with a test that is invariant under an appropriate group of transformations. In the context of a class of elliptically contoured matrix regression models (such as matrix normal models), I show that there are no non-trivial invariant tests if the number of rows is not sufficiently larger than the number of columns. Furthermore, I show that even if the number of rows is large, there are no non-trivial invariant tests that have power to detect arbitrary row covariance in the presence of arbitrary column covariance. However, we can construct biased tests that have power to detect certain types of row covariance that may be encountered in practice.
Submitted 30 December, 2015;
originally announced December 2015.
-
A Pivot-Based Improvement to Sandwich-Based Confidence Intervals
Authors:
James W. Harmon,
Peter D. Hoff
Abstract:
The current standard for confidence interval construction in the context of a possibly misspecified model is to use an interval based on the sandwich estimate of variance. These intervals provide asymptotically correct coverage, but small-sample coverage is known to be poor. By eliminating a plug-in assumption, we derive a pivot-based method for confidence interval construction under possibly misspecified models. When compared against confidence intervals generated by the sandwich estimate of variance, this method provides more accurate coverage of the pseudo-true parameter at small sample sizes. This is shown in the results of several simulation studies. Asymptotic results show that our pivot-based intervals have large sample efficiency equal to that of intervals based on the sandwich estimate of variance.
Submitted 29 December, 2015;
originally announced December 2015.
-
Dyadic data analysis with amen
Authors:
Peter D. Hoff
Abstract:
Dyadic data on pairs of objects, such as relational or social network data, often exhibit strong statistical dependencies. Certain types of second-order dependencies, such as degree heterogeneity and reciprocity, can be well-represented with additive random effects models. Higher-order dependencies, such as transitivity and stochastic equivalence, can often be represented with multiplicative effects. The "amen" package for the R statistical computing environment provides estimation and inference for a class of additive and multiplicative random effects models for ordinal, continuous, binary and other types of dyadic data. The package also provides methods for missing, censored and fixed-rank nomination data, as well as longitudinal dyadic data. This tutorial illustrates the "amen" package via example statistical analyses of several of these different data types.
Submitted 26 June, 2015;
originally announced June 2015.
-
Relax, Tensors Are Here: Dependencies in International Processes
Authors:
Shahryar Minhas,
Peter D. Hoff,
Michael D. Ward
Abstract:
Previous models of international conflict have suffered two shortfalls. They tended not to embody dynamic changes, focusing rather on static slices of behavior over time. These models have also been empirically evaluated in ways that assumed the independence of each country, when in reality they are searching for the interdependence among all countries. We illustrate a solution to these two hurdles and evaluate this new, dynamic, network based approach to the dependencies among the ebb and flow of daily international interactions using a newly developed, and openly available, database of events among nations.
Submitted 30 April, 2015;
originally announced April 2015.
-
Multilinear tensor regression for longitudinal relational data
Authors:
Peter D. Hoff
Abstract:
A fundamental aspect of relational data, such as from a social network, is the possibility of dependence among the relations. In particular, the relations between members of one pair of nodes may have an effect on the relations between members of another pair. This article develops a type of regression model to estimate such effects in the context of longitudinal and multivariate relational data, or other data that can be represented in the form of a tensor. The model is based on a general multilinear tensor regression model, a special case of which is a tensor autoregression model in which the tensor of relations at one time point is parsimoniously regressed on relations from previous time points. This is done via a separable, or Kronecker-structured, regression parameter along with a separable covariance model. In the context of an analysis of longitudinal multivariate relational data, it is shown how the multilinear tensor regression model can represent patterns that often appear in relational and network data, such as reciprocity and transitivity.
Submitted 5 November, 2015; v1 submitted 28 November, 2014;
originally announced December 2014.
-
A higher-order LQ decomposition for separable covariance models
Authors:
David C. Gerard,
Peter D. Hoff
Abstract:
We develop a higher order generalization of the LQ decomposition and show that this decomposition plays an important role in likelihood-based estimation and testing for separable, or Kronecker structured, covariance models, such as the multilinear normal model. This role is analogous to that of the LQ decomposition in likelihood inference for the multivariate normal model. Additionally, this higher order LQ decomposition can be used to construct an alternative version of the popular higher order singular value decomposition for tensor-valued data. We also develop a novel generalization of the polar decomposition to tensor-valued data.
Submitted 4 October, 2014;
originally announced October 2014.
-
Equivariant and scale-free Tucker decomposition models
Authors:
Peter David Hoff
Abstract:
Analyses of array-valued datasets often involve reduced-rank array approximations, typically obtained via least-squares or truncations of array decompositions. However, least-squares approximations tend to be noisy in high-dimensional settings, and may not be appropriate for arrays that include discrete or ordinal measurements. This article develops methodology to obtain low-rank model-based representations of continuous, discrete and ordinal data arrays. The model is based on a parameterization of the mean array as a multilinear product of a reduced-rank core array and a set of index-specific orthogonal eigenvector matrices. It is shown how orthogonally equivariant parameter estimates can be obtained from Bayesian procedures under invariant prior distributions. Additionally, priors on the core array are developed that act as regularizers, leading to improved inference over the standard least-squares estimator, and providing robustness to misspecification of the array rank. This model-based approach is extended to accommodate discrete or ordinal data arrays using a semiparametric transformation model. The resulting low-rank representation is scale-free, in the sense that it is invariant to monotonic transformations of the data array. In an example analysis of a multivariate discrete network dataset, this scale-free approach provides a more complete description of data patterns.
Submitted 22 December, 2013;
originally announced December 2013.
-
Joint Mean and Covariance Modeling of Multiple Health Outcome Measures
Authors:
Xiaoyue Niu,
Peter D. Hoff
Abstract:
Health exams determine a patient's health status by comparing the patient's measurement with a population reference range, a 95% interval derived from a homogeneous reference population. Similarly, most of the established relations among health problems are assumed to hold for the entire population. We use data from the 2009-2010 National Health and Nutrition Examination Survey (NHANES) on four major health problems in the U.S. and apply a joint mean and covariance model to study how the reference ranges and associations of those health outcomes could vary among subpopulations. We discuss guidelines for model selection and evaluation, using standard criteria such as AIC in conjunction with posterior predictive checks. The results from the proposed model can help identify subpopulations in which more data need to be collected to refine the reference range and to study the specific associations among those health problems.
Submitted 31 May, 2018; v1 submitted 11 November, 2013;
originally announced November 2013.
-
Testing for nodal dependence in relational data matrices
Authors:
Alexander Volfovsky,
Peter D. Hoff
Abstract:
Relational data are often represented as a square matrix, the entries of which record the relationships between pairs of objects. Many statistical methods for the analysis of such data assume some degree of similarity or dependence between objects in terms of the way they relate to each other. However, formal tests for such dependence have not been developed. We provide a test for such dependence using the framework of the matrix normal model, a type of multivariate normal distribution parameterized in terms of row- and column-specific covariance matrices. We develop a likelihood ratio test (LRT) for row and column dependence based on the observation of a single relational data matrix. We obtain a reference distribution for the LRT statistic, thereby providing an exact test for the presence of row or column correlations in a square relational data matrix. Additionally, we provide extensions of the test to accommodate common features of such data, such as undefined diagonal entries, a non-zero mean, multiple observations, and deviations from normality.
Submitted 24 June, 2013;
originally announced June 2013.
-
Testing and Modeling Dependencies Between a Network and Nodal Attributes
Authors:
Bailey K. Fosdick,
Peter D. Hoff
Abstract:
Network analysis is often focused on characterizing the dependencies between network relations and node-level attributes. Potential relationships are typically explored by modeling the network as a function of the nodal attributes or by modeling the attributes as a function of the network. These methods require specification of the exact nature of the association between the network and attributes, reduce the network data to a small number of summary statistics, and are unable to provide predictions simultaneously for missing attribute and network information. Existing methods that model the attributes and network jointly also assume the data are fully observed. In this article we introduce a unified approach to analysis that addresses these shortcomings. We use a latent variable model to obtain a low dimensional representation of the network in terms of node-specific network factors and use a test of dependence between the network factors and attributes as a surrogate for a test of dependence between the network and attributes. We propose a formal testing procedure to determine if dependencies exist between the network factors and attributes. We also introduce a joint model for the network and attributes, for use if the test rejects, that can capture a variety of dependence patterns and be used to make inference and predictions for missing observations.
Submitted 19 June, 2013;
originally announced June 2013.
-
Comment on "Bayesian Nonparametric Inference - Why and How" by Mueller and Mitra
Authors:
Peter D. Hoff
Abstract:
Due to their great flexibility, nonparametric Bayes methods have proven to be a valuable tool for discovering complicated patterns in data. The term "nonparametric Bayes" suggests that these methods inherit model-free operating characteristics of classical nonparametric methods, as well as coherent uncertainty assessments provided by Bayesian procedures. However, as the authors say in the conclusion to their article, nonparametric Bayesian methods may be more aptly described as "massively parametric." Furthermore, I argue that many of the default nonparametric Bayes procedures are only Bayesian in the weakest sense of the term, and cannot be assumed to provide honest assessments of uncertainty merely because they carry the Bayesian label. However useful such procedures may be, we should be cautious about advertising default nonparametric Bayes procedures as either being "assumption free" or providing descriptions of our uncertainty. If we want our nonparametric Bayes procedures to have a Bayesian interpretation, we should modify default NP Bayes methods to accommodate real prior information, or at the very least, carefully evaluate the effects of hyperparameters on posterior quantities of interest.
Submitted 12 April, 2013;
originally announced April 2013.
-
Bayesian analysis of matrix data with rstiefel
Authors:
Peter D. Hoff
Abstract:
We illustrate the use of the R-package "rstiefel" for matrix-variate data analysis in the context of two examples. The first example considers estimation of a reduced-rank mean matrix in the presence of normally distributed noise. The second example considers the modeling of a social network of friendships among teenagers. Bayesian estimation for these models requires the ability to simulate from the matrix-variate von Mises-Fisher distributions and the matrix-variate Bingham distributions on the Stiefel manifold.
Submitted 12 April, 2013;
originally announced April 2013.
-
Separable factor analysis with applications to mortality data
Authors:
Bailey K. Fosdick,
Peter D. Hoff
Abstract:
Human mortality data sets can be expressed as multiway data arrays, the dimensions of which correspond to categories by which mortality rates are reported, such as age, sex, country and year. Regression models for such data typically assume an independent error distribution or an error model that allows for dependence along at most one or two dimensions of the data array. However, failing to account for other dependencies can lead to inefficient estimates of regression parameters, inaccurate standard errors and poor predictions. An alternative to assuming independent errors is to allow for dependence along each dimension of the array using a separable covariance model. However, the number of parameters in this model increases rapidly with the dimensions of the array and, for many arrays, maximum likelihood estimates of the covariance parameters do not exist. In this paper, we propose a submodel of the separable covariance model that estimates the covariance matrix for each dimension as having factor analytic structure. This model can be viewed as an extension of factor analysis to array-valued data, as it uses a factor model to estimate the covariance along each dimension of the array. We discuss properties of this model as they relate to ordinary factor analysis, describe maximum likelihood and Bayesian estimation methods, and provide a likelihood ratio testing procedure for selecting the factor model ranks. We apply this methodology to the analysis of data from the Human Mortality Database, and show in a cross-validation experiment how it outperforms simpler methods. Additionally, we use this model to impute mortality rates for countries that have no mortality data for several years. Unlike other approaches, our methodology is able to estimate similarities between the mortality rates of countries, time periods and sexes, and use this information to assist with the imputations.
Submitted 14 April, 2014; v1 submitted 16 November, 2012;
originally announced November 2012.
-
Hierarchical array priors for ANOVA decompositions of cross-classified data
Authors:
Alexander Volfovsky,
Peter D. Hoff
Abstract:
ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays that can adapt to the presence of such interactions. These prior distributions are based on a type of array-variate normal distribution, for which a covariance matrix for each factor is estimated. This prior is able to adapt to potential similarities among the levels of a factor, and incorporate any such information into the estimation of the effects in which the factor appears. In the presence of such similarities, this prior is able to borrow information from well-estimated main effects and lower-order interactions to assist in the estimation of higher-order terms for which data information is limited.
Submitted 14 April, 2014; v1 submitted 8 August, 2012;
originally announced August 2012.
-
Marginally Specified Priors for Nonparametric Bayesian Estimation
Authors:
David C. Kessler,
Peter D. Hoff,
David B. Dunson
Abstract:
Prior specification for nonparametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. Realistically, a statistician is unlikely to have informed opinions about all aspects of such a parameter, but may have real information about functionals of the parameter, such as the population mean or variance. This article proposes a new framework for nonparametric Bayes inference in which the prior distribution for a possibly infinite-dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a nonparametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard nonparametric prior distributions in common use, and inherit the large support of the standard priors upon which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard nonparametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modeling of high-dimensional sparse contingency tables.
Submitted 29 April, 2012;
originally announced April 2012.
-
Small-Sample Behavior of Novel Phase I Cancer Trial Designs
Authors:
Assaf P. Oron,
Peter D. Hoff
Abstract:
Novel dose-finding designs, using estimation to assign the best estimated maximum-tolerated-dose (MTD) at each point in the experiment, most commonly via Bayesian techniques, have recently entered large-scale implementation in Phase I cancer clinical trials. We examine the small-sample behavior of these "Bayesian Phase I" (BP1) designs, and also of non-Bayesian designs sharing the same main "long-memory" traits (hereafter: LMP1s).
For all LMP1s examined, the number of cohorts treated at the true MTD (denoted here as n*) was highly variable between numerical runs drawn from the same toxicity-threshold distribution, especially when compared with "up-and-down" (U&D) short-memory designs. Further investigation using the same set of thresholds in permuted order produced a nearly identical magnitude of variability in n*. Therefore, this LMP1 behavior is driven by a strong sensitivity to the order in which toxicity thresholds appear in the experiment. We suggest that the sensitivity is related to LMP1's tendency to "settle" early on a specific dose level, a tendency caused by the repeated likelihood-based "winner-takes-all" dose assignment rule, which grants the early cohorts a disproportionately large influence upon experimental trajectories.
Presently, U&D designs offer a simpler and more stable alternative, with roughly equivalent MTD estimation performance. A promising direction for combining the two approaches is briefly discussed (note: the '3+3' protocol is not a U&D design).
Submitted 20 September, 2012; v1 submitted 22 February, 2012;
originally announced February 2012.
-
Information bounds for Gaussian copulas
Authors:
Peter D. Hoff,
Xiaoyue Niu,
Jon A. Wellner
Abstract:
Often of primary interest in the analysis of multivariate data are the copula parameters describing the dependence among the variables, rather than the univariate marginal distributions. Since the ranks of a multivariate dataset are invariant to changes in the univariate marginal distributions, rank-based estimators are natural candidates for semiparametric copula estimation. Asymptotic information bounds for such estimators can be obtained from an asymptotic analysis of the rank likelihood, that is, the probability of the multivariate ranks. In this article, we obtain limiting normal distributions of the rank likelihood for Gaussian copula models. Our results cover models with structured correlation matrices, such as exchangeable or circular correlation models, as well as unstructured correlation matrices. For all Gaussian copula models, the limiting distribution of the rank likelihood ratio is shown to be equal to that of a parametric likelihood ratio for an appropriately chosen multivariate normal model. This implies that the semiparametric information bounds for rank-based estimators are the same as the information bounds for estimators based on the full data, and that the multivariate normal distributions are least favorable.
Submitted 12 March, 2014; v1 submitted 16 October, 2011;
originally announced October 2011.
-
A covariance regression model
Authors:
Peter D. Hoff,
Xiaoyue Niu
Abstract:
Classical regression analysis relates the expectation of a response variable to a linear combination of explanatory variables. In this article, we propose a covariance regression model that parameterizes the covariance matrix of a multivariate response vector as a parsimonious quadratic function of explanatory variables. The approach is analogous to the mean regression model, and is similar to a factor analysis model in which the factor loadings depend on the explanatory variables. Using a random-effects representation, parameter estimation for the model is straightforward using either an EM-algorithm or an MCMC approximation via Gibbs sampling. The proposed methodology provides a simple but flexible representation of heteroscedasticity across the levels of an explanatory variable, improves estimation of the mean function and gives better calibrated prediction regions when compared to a homoscedastic model.
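As a hedged illustration of the "quadratic function of explanatory variables" described above, the sketch below assumes the covariance takes the form Psi + B x x^T B^T, where Psi is a baseline covariance matrix and B is a coefficient matrix. That functional form, the names `covariance_at`, `PSI` and `B`, and all numeric values are assumptions made for illustration; none of them is given in the abstract.

```python
# Minimal sketch of a covariance that varies quadratically with x,
# assuming the form Psi + (B x)(B x)^T. All values are hypothetical.

def mat_vec(B, x):
    """Multiply matrix B (list of rows) by vector x."""
    return [sum(b * xi for b, xi in zip(row, x)) for row in B]

def covariance_at(psi, B, x):
    """Return Psi + (B x)(B x)^T as a list of rows."""
    bx = mat_vec(B, x)
    p = len(psi)
    return [[psi[i][j] + bx[i] * bx[j] for j in range(p)] for i in range(p)]

# Hypothetical baseline covariance and coefficients for a 2-dimensional
# response; x = (intercept, one covariate).
PSI = [[1.0, 0.2], [0.2, 1.0]]
B = [[0.5, 1.0], [0.0, -1.0]]

cov_low = covariance_at(PSI, B, [1.0, 0.0])   # covariate at 0
cov_high = covariance_at(PSI, B, [1.0, 2.0])  # covariate at 2
```

Under these assumptions both the variances and the covariance between the two responses change with the covariate, which is the kind of heteroscedasticity the abstract refers to. Adding the rank-one term (B x)(B x)^T to a positive definite Psi keeps the result a valid covariance matrix.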
Submitted 28 February, 2011;
originally announced February 2011.
-
A mixed effects model for longitudinal relational and network data, with applications to international trade and conflict
Authors:
Anton H. Westveld,
Peter D. Hoff
Abstract:
The focus of this paper is an approach to the modeling of longitudinal social network or relational data. Such data arise from measurements on pairs of objects or actors made at regular temporal intervals, resulting in a social network for each point in time. In this article we represent the network and temporal dependencies with a random effects model, resulting in a stochastic process defined by a set of stationary covariance matrices. Our approach builds upon the social relations models of Warner, Kenny and Stoto [Journal of Personality and Social Psychology 37 (1979) 1742-1757] and Gill and Swartz [Canad. J. Statist. 29 (2001) 321-331] and allows for an intra- and inter-temporal representation of network structures. We apply the methodology to two longitudinal data sets: international trade (continuous response) and militarized interstate disputes (binary response).
Submitted 17 August, 2011; v1 submitted 7 September, 2010;
originally announced September 2010.
-
Separable covariance arrays via the Tucker product, with applications to multivariate relational data
Authors:
Peter D. Hoff
Abstract:
Modern datasets are often in the form of matrices or arrays, potentially having correlations along each set of data indices. For example, data involving repeated measurements of several variables over time may exhibit temporal correlation as well as correlation among the variables. A possible model for matrix-valued data is the class of matrix normal distributions, which is parametrized by two covariance matrices, one for each index set of the data. In this article we describe an extension of the matrix normal model to accommodate multidimensional data arrays, or tensors. We generate a class of array normal distributions by applying a group of multilinear transformations to an array of independent standard normal random variables. The covariance structures of the resulting class take the form of outer products of dimension-specific covariance matrices. We derive some properties of these covariance structures and the corresponding array normal distributions, discuss maximum likelihood and Bayesian estimation of covariance parameters and illustrate the model in an analysis of multivariate longitudinal network data.
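The construction described above can be written out as a short worked equation. The notation here ($Z$, $A_k$, $\Sigma_k$, and the Tucker product symbol $\times$) is assumed for illustration and is not taken from the abstract; a mean array is omitted for brevity.

```latex
% Z: K-way array of independent standard normal entries (assumed notation)
% A_k: square matrix acting along the k-th index set
\[
  Y = Z \times \{A_1, \ldots, A_K\},
  \qquad
  \operatorname{Cov}[\operatorname{vec}(Y)]
    = \Sigma_K \otimes \cdots \otimes \Sigma_1,
  \qquad
  \Sigma_k = A_k A_k^{\top}.
\]
% Matrix case (K = 2): Y = A_1 Z A_2^T, with covariance Sigma_2 (x) Sigma_1.
```

For $K = 2$ this reduces to the matrix normal model mentioned in the abstract, with one covariance matrix for the rows and one for the columns.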
Submitted 12 August, 2010;
originally announced August 2010.
-
A Statistical View of Learning in the Centipede Game
Authors:
Anton H. Westveld,
Peter D. Hoff
Abstract:
In this article we evaluate the statistical evidence that a population of students learns about the sub-game perfect Nash equilibrium of the centipede game via repeated play of the game. This is done by formulating a model in which a player's error in assessing the utility of decisions changes as they gain experience with the game. We first estimate parameters in a statistical model where the probabilities of choices of the players are given by a Quantal Response Equilibrium (QRE) (McKelvey and Palfrey, 1995, 1996, 1998), but are allowed to change with repeated play. This model gives a better fit to the data than similar models previously considered. However, substantial correlation of outcomes of games having a common player suggests that a statistical model that captures within-subject correlation is more appropriate. Thus we then estimate parameters in a model that allows for within-player correlation of decisions and rates of learning. Throughout the paper we also consider and compare the use of randomization tests and posterior predictive tests in the context of exploratory and confirmatory data analyses.
Submitted 7 September, 2010; v1 submitted 10 March, 2010;
originally announced March 2010.
-
Convergence of Nonparametric Long-Memory Phase I Designs
Authors:
Assaf P. Oron,
David Azriel,
Peter D. Hoff
Abstract:
We examine nonparametric dose-finding designs that use toxicity estimates based on all available data at each dose allocation decision. We prove that one such design family, called here "interval design", converges almost surely to the maximum tolerated dose (MTD), if the MTD is the only dose level whose toxicity rate falls within the pre-specified interval around the desired target rate. Another nonparametric family, called "point design", has a positive probability of not converging. In a numerical sensitivity study, a diverse sample of dose-toxicity scenarios was randomly generated. On this sample, the "interval design" convergence conditions are met far more often than the conditions for one-parameter design convergence (the Shen-O'Quigley conditions), suggesting that the interval-design conditions are less restrictive. Implications of these theoretical and numerical results for small-sample behavior of the designs, and for future research, are discussed.
Submitted 14 June, 2010; v1 submitted 18 August, 2009;
originally announced August 2009.
-
Modeling homophily and stochastic equivalence in symmetric relational data
Authors:
Peter D. Hoff
Abstract:
This article discusses a latent variable model for inference and prediction of symmetric relational data.
The model, based on the idea of the eigenvalue decomposition, represents the relationship between two nodes as the weighted inner-product of node-specific vectors of latent characteristics. This "eigenmodel" generalizes other popular latent variable models, such as latent class and distance models: It is shown mathematically that any latent class or distance model has a representation as an eigenmodel, but not vice versa. The practical implications of this are examined in the context of three real datasets, for which the eigenmodel has as good or better out-of-sample predictive performance than the other two models.
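To make the "weighted inner-product of node-specific vectors" concrete, the following minimal sketch computes only that latent term for a few nodes. The function name `eigen_score`, the node labels, the latent vectors and the weights are all hypothetical, and the sketch leaves out any regression terms, link function or noise that a full model would include.

```python
# Minimal sketch of a weighted inner product between latent node vectors:
# score(i, j) = sum_k lam[k] * u_i[k] * u_j[k]. All values are hypothetical.

def eigen_score(u_i, u_j, lam):
    """Weighted inner product of two latent vectors with weights lam."""
    return sum(l * a * b for l, a, b in zip(lam, u_i, u_j))

# Hypothetical latent characteristics: nodes "a" and "c" are identical,
# node "b" differs from them in the second coordinate.
U = {"a": [1.0, 1.0], "b": [1.0, -1.0], "c": [1.0, 1.0]}

LAM_POSITIVE = [1.0, 1.0]   # all weights positive
LAM_MIXED = [1.0, -1.0]     # one negative weight

similar_pos = eigen_score(U["a"], U["c"], LAM_POSITIVE)   # 2.0
different_pos = eigen_score(U["a"], U["b"], LAM_POSITIVE) # 0.0
similar_mix = eigen_score(U["a"], U["c"], LAM_MIXED)      # 0.0
different_mix = eigen_score(U["a"], U["b"], LAM_MIXED)    # 2.0
```

With positive weights, nodes with similar latent vectors receive the higher score; with a negative weight the pattern can reverse, so that nodes with dissimilar vectors score higher. The score is symmetric in the two nodes, matching the symmetric relational data the abstract addresses.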
Submitted 7 November, 2007;
originally announced November 2007.
-
Extending the rank likelihood for semiparametric copula estimation
Authors:
Peter D. Hoff
Abstract:
Quantitative studies in many fields involve the analysis of multivariate data of diverse types, including measurements that we may consider binary, ordinal and continuous. One approach to the analysis of such mixed data is to use a copula model, in which the associations among the variables are parameterized separately from their univariate marginal distributions. The purpose of this article is to provide a method of semiparametric inference for copula models via the construction of what we call a marginal set likelihood function for the association parameters. The proposed method of inference can be viewed as a generalization of marginal likelihood estimation, in which inference for a parameter of interest is based on a summary statistic whose sampling distribution is not a function of any nuisance parameters. In the context of copula estimation, the marginal set likelihood is a function of the association parameters only and its applicability does not depend on any assumptions about the marginal distributions of the data, thus making it appropriate for the analysis of mixed continuous and discrete data with arbitrary margins. Estimation and inference for parameters of the Gaussian copula are available via a straightforward Markov chain Monte Carlo algorithm based on Gibbs sampling.
Submitted 10 March, 2007; v1 submitted 12 October, 2006;
originally announced October 2006.
-
Model averaging and dimension selection for the singular value decomposition
Authors:
Peter D. Hoff
Abstract:
Many multivariate data analysis techniques for an $m\times n$ matrix $\mathbf{Y}$ are related to the model $\mathbf{Y} = \mathbf{M} + \mathbf{E}$, where $\mathbf{Y}$ is an $m\times n$ matrix of full rank and $\mathbf{M}$ is an unobserved mean matrix of rank $K< (m\wedge n)$. Typically the rank of $\mathbf{M}$ is estimated in a heuristic way and then the least-squares estimate of $\mathbf{M}$ is obtained via the singular value decomposition of $\mathbf{Y}$, yielding an estimate that can have a very high variance. In this paper we suggest a model-based alternative to the above approach by providing prior distributions and posterior estimation for the rank of $\mathbf{M}$ and the components of its singular value decomposition. In addition to providing more accurate inference, such an approach has the advantage of being extendable to more general data-analysis situations, such as inference in the presence of missing data and estimation in a generalized linear modeling framework.
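For reference, the heuristic least-squares step that the abstract contrasts with its model-based approach can be written as follows. The symbols $\mathbf{U}$, $\mathbf{D}$, $\mathbf{V}$, $d_k$, $\mathbf{u}_k$, $\mathbf{v}_k$ and $\hat{\mathbf{M}}_K$ are assumed notation introduced here, and the rank $K$ is treated as already chosen.

```latex
% Singular value decomposition of the data matrix, singular values in
% decreasing order (assumed notation):
\[
  \mathbf{Y} = \mathbf{U}\mathbf{D}\mathbf{V}^{\top}
             = \sum_{k=1}^{m \wedge n} d_k\, \mathbf{u}_k \mathbf{v}_k^{\top},
  \qquad d_1 \ge d_2 \ge \cdots \ge 0 .
\]
% Rank-K least-squares estimate: keep the K largest singular values.
\[
  \hat{\mathbf{M}}_K
    = \operatorname*{arg\,min}_{\operatorname{rank}(\mathbf{M}) \le K}
      \lVert \mathbf{Y} - \mathbf{M} \rVert_F^2
    = \sum_{k=1}^{K} d_k\, \mathbf{u}_k \mathbf{v}_k^{\top}.
\]
```

The second equality is the standard Eckart-Young result for truncated singular value decompositions. The estimate depends on $K$ only through which terms are kept, which is why a heuristic choice of rank feeds directly into the estimate of $\mathbf{M}$.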
Submitted 1 September, 2006;
originally announced September 2006.