Search | arXiv e-print repository

Incorporating circuit theory into a dynamic model for crowd-sourced observations of migratory birds

Authors: Michael F. Christensen, Peter D. Hoff

Abstract: While the overarching pattern of biannual avian migration is well understood, there are significant questions pertaining to this phenomenon that invite further study. Necessary to any analysis of these questions is an understanding of how a given species' spatial distribution evolves in time. While studies of animal movement are often conducted using telemetry data, the collection of such data can be time- and resource-intensive, frequently resulting in small sample sizes. Ecological surveys of animal populations are also indicative of species distribution trends, but may be constrained to a limited spatial domain. Within this article we utilize crowd-sourced observations from the eBird database to model the abundance of migratory bird species in space and time. While crowd-sourced observations are individually less reliable than those produced by experts, the sheer size and spatial coverage of the eBird database make it attractive for use in this setting. We introduce a hidden Markov model for observed bird counts utilizing a novel transition structure developed using principles from circuit theory. After illustrating model properties we fit it to observations of Baltimore orioles and yellow-rumped warblers within the eastern United States and discuss insight it provides into the migratory patterns for these species.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 33 pages, 12 figures

arXiv:2311.15860 [pdf, other]

Frequentist Prediction Sets for Species Abundance using Indirect Information

Authors: Elizabeth Bersson, Peter D. Hoff

Abstract: Citizen science databases that consist of volunteer-led sampling efforts of species communities are relied on as essential sources of data in ecology. Summarizing such data across counties with frequentist-valid prediction sets for each county provides an interpretable comparison across counties of varying size or composition. As citizen science data often feature unequal sampling efforts across a spatial domain, prediction sets constructed with indirect methods that share information across counties may be used to improve precision. In this article, we present a nonparametric framework to obtain precise prediction sets for a multinomial random sample based on indirect information that maintain frequentist coverage guarantees for each county. We detail a simple algorithm to obtain prediction sets for each county using indirect information where the computation time does not depend on the sample size and scales nicely with the number of species considered. The indirect information may be estimated by a proposed empirical Bayes procedure based on information from auxiliary data. Our approach makes inference for under-sampled counties more precise, while maintaining area-specific frequentist validity for each county. Our method is used to provide a useful description of avian species abundance in North Carolina, USA based on citizen science data from the eBird database.

Submitted 27 November, 2023; originally announced November 2023.

Comments: 16 pages, 4 figures, 2 tables

arXiv:2310.12460 [pdf, other]

Linear Source Apportionment using Generalized Least Squares

Authors: Jordan Bryan, Peter Hoff

Abstract: Motivated by applications to water quality monitoring using fluorescence spectroscopy, we develop the source apportionment model for high dimensional profiles of dissolved organic matter (DOM). We describe simple methods to estimate the parameters of a linear source apportionment model, and show how the estimates are related to those of ordinary and generalized least squares. Using this least squares framework, we analyze the variability of the estimates, and we propose predictors for missing elements of a DOM profile. We demonstrate the practical utility of our results on fluorescence spectroscopy data collected from the Neuse River in North Carolina.

Submitted 19 October, 2023; originally announced October 2023.

Comments: 31 pages, 5 figures

arXiv:2308.02260 [pdf, other]

Information Geometry and Asymptotics for Kronecker Covariances

Authors: Andrew McCormack, Peter Hoff

Abstract: We explore the information geometry and asymptotic behaviour of estimators for Kronecker-structured covariances, in both growing-$n$ and growing-$p$ scenarios, with a focus towards examining the quadratic form or partial trace estimator proposed by Linton and Tang. It is shown that the partial trace estimator is asymptotically inefficient. An explanation for this inefficiency is that the partial trace estimator does not scale sub-blocks of the sample covariance matrix optimally. To correct for this, an asymptotically efficient, rescaled partial trace estimator is proposed. Motivated by this rescaling, we introduce an orthogonal parameterization for the set of Kronecker covariances. High-dimensional consistency results using the partial trace estimator are obtained that demonstrate a blessing of dimensionality. In settings where an array has at least order three, it is shown that as the array dimensions jointly increase, it is possible to consistently estimate the Kronecker covariance matrix, even when the sample size is one.

Submitted 4 August, 2023; originally announced August 2023.

Comments: 42 total pages, 21 pages of main text, 4 tables, 8 figures

MSC Class: 62

arXiv:2302.09211 [pdf, other]

Bayesian Covariance Estimation for Multi-group Matrix-variate Data

Authors: Elizabeth Bersson, Peter D. Hoff

Abstract: Multi-group covariance estimation for matrix-variate data with small within group sample sizes is a key part of many data analysis tasks in modern applications. To obtain accurate group-specific covariance estimates, shrinkage estimation methods which shrink an unstructured, group-specific covariance either across groups towards a pooled covariance or within each group towards a Kronecker structure have been developed. However, in many applications, it is unclear which approach will result in more accurate covariance estimates. In this article, we present a hierarchical prior distribution which flexibly allows for both types of shrinkage. The prior linearly combines shrinkage across groups towards a shared pooled covariance and shrinkage within groups towards a group-specific Kronecker covariance. We illustrate the utility of the proposed prior in speech recognition and an analysis of chemical exposure data.

Submitted 7 March, 2024; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: 28 pages, 7 figures, 5 tables

arXiv:2207.12484 [pdf, other]

Core Shrinkage Covariance Estimation for Matrix-variate Data

Authors: Peter Hoff, Andrew McCormack, Anru R. Zhang

Abstract: A separable covariance model for a random matrix provides a parsimonious description of the covariances among the rows and among the columns of the matrix, and permits likelihood-based inference with a very small sample size. However, in many applications the assumption of exact separability is unlikely to be met, and data analysis with a separable model may overlook or misrepresent important dependence patterns in the data. In this article, we propose a compromise between separable and unstructured covariance estimation. We show how the set of covariance matrices may be uniquely parametrized in terms of the set of separable covariance matrices and a complementary set of "core" covariance matrices, where the core of a separable covariance matrix is the identity matrix. This parametrization defines a Kronecker-core decomposition of a covariance matrix. By shrinking the core of the sample covariance matrix with an empirical Bayes procedure, we obtain an estimator that can adapt to the degree of separability of the population covariance matrix.

Submitted 25 July, 2022; originally announced July 2022.

MSC Class: 62H20; 15A23

arXiv:2207.10513 [pdf, other]

A flexible and interpretable spatial covariance model for data on graphs

Authors: Michael F. Christensen, Peter D. Hoff

Abstract: Spatial models for areal data are often constructed such that all pairs of adjacent regions are assumed to have near-identical spatial autocorrelation. In practice, data can exhibit dependence structures more complicated than can be represented under this assumption. In this article we develop a new model for spatially correlated data observed on graphs, which can flexibly represent many types of spatial dependence patterns while retaining aspects of the original graph geometry. Our method implies an embedding of the graph into Euclidean space wherein covariance can be modeled using traditional covariance functions, such as those from the Matérn family. We parameterize our model using a class of graph metrics compatible with such covariance functions, and which characterize distance in terms of network flow, a property useful for understanding proximity in many ecological settings. By estimating the parameters underlying these metrics, we recover the "intrinsic distances" between graph nodes, which assist in the interpretation of the estimated covariance and allow us to better understand the relationship between the observed process and spatial domain. We compare our model to existing methods for spatially dependent graph data, primarily conditional autoregressive models and their variants, and illustrate advantages of our method over traditional approaches. We fit our model to bird abundance data for several species in North Carolina, and show how it provides insight into the interactions between species-specific spatial distributions and geography.

Submitted 2 July, 2024; v1 submitted 21 July, 2022; originally announced July 2022.

Comments: 36 pages, 7 figures

arXiv:2204.08122 [pdf, other]

Optimal Conformal Prediction for Small Areas

Authors: Elizabeth Bersson, Peter D. Hoff

Abstract: Existing inferential methods for small area data involve a trade-off between maintaining area-level frequentist coverage rates and improving inferential precision via the incorporation of indirect information. In this article, we propose a method to obtain an area-level prediction region for a future observation which mitigates this trade-off. The proposed method takes a conformal prediction approach in which the conformity measure is the posterior predictive density of a working model that incorporates indirect information. The resulting prediction region has guaranteed frequentist coverage regardless of the working model, and, if the working model assumptions are accurate, the region has minimum expected volume compared to other regions with the same coverage rate. When constructed under a normal working model, we prove such a prediction region is an interval and construct an efficient algorithm to obtain the exact interval. We illustrate the performance of our method through simulation studies and an application to EPA radon survey data.

Submitted 17 April, 2022; originally announced April 2022.

Comments: 24 pages, 9 figures

arXiv:2203.12732 [pdf, other]

Tests of Linear Hypotheses using Indirect Information

Authors: Andrew McCormack, Peter Hoff

Abstract: In multigroup data settings with small within-group sample sizes, standard $F$-tests of group-specific linear hypotheses can have low power, particularly if the within-group sample sizes are not large relative to the number of explanatory variables. To remedy this situation, in this article we derive alternative test statistics based on information-sharing across groups. Each group-specific test has potentially much larger power than the standard $F$-test, while still exactly maintaining a target type I error rate if the hypothesis for the group is true. The proposed test for a given group uses a statistic that has optimal marginal power under a prior distribution derived from the data of the other groups. This statistic approaches the usual $F$-statistic as the prior distribution becomes more diffuse, but approaches a limiting "cone" test statistic as the prior distribution becomes extremely concentrated. We compare the power and $p$-values of the cone test to that of the $F$-test in some high-dimensional asymptotic scenarios. An analysis of educational outcome data is provided, demonstrating empirically that the proposed test is more powerful than the $F$-test.

Submitted 23 March, 2022; originally announced March 2022.

Comments: 37 pages, 6 figures, 3 tables

arXiv:2203.02569 [pdf, other]

Coverage Properties of Empirical Bayes Intervals

Authors: Peter Hoff

Abstract: This note is an invited discussion of the article "Confidence Intervals for Nonparametric Empirical Bayes Analysis" by Ignatiadis and Wager. In this discussion, I review some goals of empirical Bayes data analysis and the contribution of Ignatiadis and Wager. Differences between across-group inference and group-specific inference are discussed. Standard empirical Bayes interval procedures focus on controlling the across-group average coverage rate. However, if group-specific inferences are of primary interest, confidence intervals with group-specific coverage control may be preferable.

Submitted 4 March, 2022; originally announced March 2022.

Comments: Invited comments for the JASA discussion article "Confidence Intervals for Nonparametric Empirical Bayes Analysis" by Ignatiadis and Wager

MSC Class: 62C12

arXiv:2112.07465 [pdf, other]

The multirank likelihood for semiparametric canonical correlation analysis

Authors: Jordan G. Bryan, Jonathan Niles-Weed, Peter D. Hoff

Abstract: Many analyses of multivariate data focus on evaluating the dependence between two sets of variables, rather than the dependence among individual variables within each set. Canonical correlation analysis (CCA) is a classical data analysis technique that estimates parameters describing the dependence between such sets. However, inference procedures based on traditional CCA rely on the assumption that all variables are jointly normally distributed. We present a semiparametric approach to CCA in which the multivariate margins of each variable set may be arbitrary, but the dependence between variable sets is described by a parametric model that provides low-dimensional summaries of dependence. While maximum likelihood estimation in the proposed model is intractable, we propose two estimation strategies: one using a pseudolikelihood for the model and one using a Markov chain Monte Carlo (MCMC) algorithm that provides Bayesian estimates and confidence regions for the between-set dependence parameters. The MCMC algorithm is derived from a multirank likelihood function, which uses only part of the information in the observed data in exchange for being free of assumptions about the multivariate margins. We apply the proposed Bayesian inference procedure to Brazilian climate data and monthly stock returns from the materials and communications market sectors.

Submitted 22 April, 2024; v1 submitted 14 December, 2021; originally announced December 2021.

arXiv:2105.14045 [pdf, other]

Bayes-optimal prediction with frequentist coverage control

Authors: Peter Hoff

Abstract: This article illustrates how indirect or prior information can be optimally used to construct a prediction region that maintains a target frequentist coverage rate. If the indirect information is accurate, the volume of the prediction region is lower on average than that of other regions with the same coverage rate. Even if the indirect information is inaccurate, the resulting region still maintains the target coverage rate. Such a prediction region can be constructed for models that have a complete sufficient statistic, which includes many widely-used parametric and nonparametric models. Particular examples include a Bayes-optimal conformal prediction procedure that maintains a constant coverage rate across distributions in a nonparametric model, as well as a prediction procedure for the normal linear regression model that can utilize a regularizing prior distribution, yet maintain a frequentist coverage rate that is constant as a function of the model parameters and explanatory variables. No results in this article rely on asymptotic approximations.

Submitted 28 May, 2021; originally announced May 2021.

MSC Class: 62C10; 28B20

arXiv:2104.03397 [pdf, other]

Equivariant Estimation of Fréchet Means

Authors: Andrew McCormack, Peter Hoff

Abstract: The Fréchet mean generalizes the concept of a mean to a metric space setting. In this work we consider equivariant estimation of Fréchet means for parametric models on metric spaces that are Riemannian manifolds. The geometry and symmetry of such a space is encoded by its isometry group. Estimators that are equivariant under the isometry group take into account the symmetry of the metric space. For some models there exists an optimal equivariant estimator, which necessarily will perform as well or better than other common equivariant estimators, such as the maximum likelihood estimator or the sample Fréchet mean. We derive the general form of this minimum risk equivariant estimator and in a few cases provide explicit expressions for it. In other models the isometry group is not large enough relative to the parametric family of distributions for there to exist a minimum risk equivariant estimator. In such cases, we introduce an adaptive equivariant estimator that uses the data to select a submodel for which there is an MRE. Simulation results show that the adaptive equivariant estimator performs favorably relative to alternative estimators.

Submitted 7 April, 2021; originally announced April 2021.

Comments: 31 pages, 1 figure

arXiv:2101.05135 [pdf, other]

A Latent Variable Model for Relational Events with Multiple Receivers

Authors: Joris Mulder, Peter D. Hoff

Abstract: Directional relational event data, such as email data, often contain unicast messages (i.e., messages of one sender towards one receiver) and multicast messages (i.e., messages of one sender towards multiple receivers). The Enron email data that is the focus in this paper consists of 31% multicast messages. Multicast messages contain important information about the roles of actors in the network, which is needed for better understanding social interaction dynamics. In this paper a multiplicative latent factor model is proposed to analyze such relational data. For a given message, all potential receiver actors are placed on a suitability scale, and the actors are included in the receiver set whose suitability score exceeds a threshold value. Unobserved heterogeneity in the social interaction behavior is captured using a multiplicative latent factor structure with latent variables for actors (which differ for actors as senders and receivers) and latent variables for individual messages. A Bayesian computational algorithm, which relies on Gibbs sampling, is proposed for model fitting. Model assessment is done using posterior predictive checks. Based on our analyses of the Enron email data, a mc-amen model with a 2 dimensional latent variable can accurately capture the empirical distribution of the cardinality of the receiver set and the composition of the receiver sets for commonly observed messages. Moreover the results show that actors have a comparable (but not identical) role as a sender and as a receiver in the network.

Submitted 21 February, 2024; v1 submitted 13 January, 2021; originally announced January 2021.

Comments: 56 pages, 41 figures

arXiv:2009.09101 [pdf, other]

The Stein Effect for Frechet Means

Authors: Andrew McCormack, Peter Hoff

Abstract: The Frechet mean is a useful description of location for a probability distribution on a metric space that is not necessarily a vector space. This article considers simultaneous estimation of multiple Frechet means from a decision-theoretic perspective, and in particular, the extent to which the unbiased estimator of a Frechet mean can be dominated by a generalization of the James-Stein shrinkage estimator. It is shown that if the metric space satisfies a non-positive curvature condition, then this generalized James-Stein estimator asymptotically dominates the unbiased estimator as the dimension of the space grows. These results hold for a large class of distributions on a variety of spaces - including Hilbert spaces - and therefore partially extend known results on the applicability of the James-Stein estimator to non-normal distributions on Euclidean spaces. Simulation studies on metric trees and symmetric-positive-definite matrices are presented, numerically demonstrating the efficacy of this generalized James-Stein estimator.

Submitted 18 September, 2020; originally announced September 2020.

arXiv:2006.15095 [pdf, other]

doi 10.1103/PhysRevC.102.014329

Fast-timing study of $^{81}$Ga from the $β$ decay of $^{81}$Zn

Authors: V. Paziy, L. M. Fraile, H. Mach, B. Olaizola, G. S. Simpson, A. Aprahamian, C. Bernards, J. A. Briz, B. Bucher, C. J. Chiara, Z. Dlouhý, I. Gheorghe, D. Ghiţǎ, P. Hoff, J. Jolie, U. Köster, W. Kurcewicz, R. Licǎ, N. Mǎrginean, R. Mǎrginean, J. -M. Régis, M. Rudigier, T. Sava, M. Stǎnoiu, L. Stroe, et al. (1 additional author not shown)

Abstract: The $β^{-}$ decay of $^{81}$Zn to the neutron magic $N=50$ nucleus $^{81}$Ga, with only three valence protons with respect to $^{78}$Ni, was investigated. The study was performed at the ISOLDE facility at CERN by means of $γ$ spectroscopy. The $^{81}$Zn half-life was determined to be $T_{1/2}=290(4)$ ms while the $β$-delayed neutron emission probability was measured as $P_n=23(4)\%$. The analysis of the $β$-gated $γ$-ray singles and $γ$-$γ$ coincidences from the decay of $^{81}$Zn provides 47 new levels and 70 new transitions in $^{81}$Ga. The $β^-$$n$ decay of $^{81}$Zn was observed and a new decay scheme into the odd-odd $^{80}$Ga nucleus was established. The half-lives of the first and second excited states of $^{81}$Ga were measured via the fast-timing method using LaBr$_3$(Ce) detectors. The level scheme and transition rates are compared to large-scale shell-model calculations. The low-lying structure of $^{81}$Ga is interpreted in terms of the coupling of the three valence protons outside the doubly-magic $^{78}$Ni core.

Submitted 26 June, 2020; originally announced June 2020.

Comments: Submitted to Phys. Rev. C

arXiv:2004.13870 [pdf, other]

doi 10.1214/20-AOAS1391

Hierarchical Multidimensional Scaling for the Comparison of Musical Performance Styles

Authors: Anna K. Yanchenko, Peter D. Hoff

Abstract: Quantification of stylistic differences between musical artists is of academic interest to the music community, and is also useful for other applications such as music information retrieval and recommendation systems. Information about stylistic differences can be obtained by comparing the performances of different artists across common musical pieces. In this article, we develop a statistical methodology for identifying and quantifying systematic stylistic differences among artists that are consistent across audio recordings of a common set of pieces, in terms of several musical features. Our focus is on a comparison of ten different orchestras, based on data from audio recordings of the nine Beethoven symphonies. As generative or fully parametric models of raw audio data can be highly complex, and more complex than necessary for our goal of identifying differences between orchestras, we propose to reduce the data from a set of audio recordings down to pairwise distances between orchestras, based on different musical characteristics of the recordings, such as tempo, dynamics, and timbre. For each of these characteristics, we obtain multiple pairwise distance matrices, one for each movement of each symphony. We develop a hierarchical multidimensional scaling (HMDS) model to identify and quantify systematic differences between orchestras in terms of these three musical characteristics, and interpret the results in the context of known qualitative information about the orchestras.
This methodology is able to recover several expected systematic similarities between orchestras, as well as to identify some more novel results. For example, we find that modern recordings exhibit a high degree of similarity to each other, as compared to older recordings.

Submitted 21 December, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

Comments: Published in the Annals of Applied Statistics (https://projecteuclid.org/euclid.aoas/1608346888)

Journal ref: Annals of Applied Statistics, Volume 14, Number 4 (2020), 1581-1603

arXiv:2004.07887 [pdf, other]

Smaller $p$-values in genomics studies using distilled historical information

Authors: Jordan G. Bryan, Peter D. Hoff

Abstract: Medical research institutions have generated massive amounts of biological data by genetically profiling hundreds of cancer cell lines. In parallel, academic biology labs have conducted genetic screens on small numbers of cancer cell lines under custom experimental conditions. In order to share information between these two approaches to scientific discovery, this article proposes a "frequentist assisted by Bayes" (FAB) procedure for hypothesis testing that allows historical information from massive genomics datasets to increase the power of hypothesis tests in specialized studies. The exchange of information takes place through a novel probability model for multimodal genomics data, which distills historical information pertaining to cancer cell lines and genes across a wide variety of experimental contexts. If the relevance of the historical information for a given study is high, then the resulting FAB tests can be more powerful than the corresponding classical tests. If the relevance is low, then the FAB tests yield as many discoveries as the classical tests. Simulations and practical investigations demonstrate that the FAB testing procedure can increase the number of effects discovered in genomics studies while still maintaining strict control of type I error and false discovery rates.

Submitted 16 April, 2020; originally announced April 2020.

arXiv:2003.06024 [pdf, other]

Existence and Uniqueness of the Kronecker Covariance MLE

Authors: Mathias Drton, Satoshi Kuriki, Peter Hoff

Abstract: In matrix-valued datasets the sampled matrices often exhibit correlations among both their rows and their columns. A useful and parsimonious model of such dependence is the matrix normal model, in which the covariances among the elements of a random matrix are parameterized in terms of the Kronecker product of two covariance matrices, one representing row covariances and one representing column covariance. An appealing feature of such a matrix normal model is that the Kronecker covariance structure allows for standard likelihood inference even when only a very small number of data matrices is available. For instance, in some cases a likelihood ratio test of dependence may be performed with a sample size of one. However, more generally the sample size required to ensure boundedness of the matrix normal likelihood or the existence of a unique maximizer depends in a complicated way on the matrix dimensions. This motivates the study of how large a sample size is needed to ensure that maximum likelihood estimators exist, and exist uniquely with probability one. Our main result gives precise sample size thresholds in the paradigm where the number of rows and the number of columns of the data matrices differ by at most a factor of two. Our proof uses invariance properties that allow us to consider data matrices in canonical form, as obtained from the Kronecker canonical form for matrix pencils.

Submitted 14 January, 2021; v1 submitted 12 March, 2020; originally announced March 2020.

arXiv:1907.12589 [pdf, other]

Smaller $p$-values via indirect information

Authors: Peter D. Hoff

Abstract: This article develops $p$-values for evaluating means of normal populations that make use of indirect or prior information. A $p$-value of this type is based on a biased test statistic that is optimal on average with respect to a probability distribution that encodes indirect information about the mean parameter, resulting in a smaller $p$-value if the indirect information is accurate. In a variety of multiparameter settings, we show how to adaptively estimate the indirect information for each mean parameter while still maintaining uniformity of the $p$-values under their null hypotheses. This is done using a linking model through which indirect information about the mean of one population may be obtained from the data of other populations. Importantly, the linking model does not need to be correct to maintain the uniformity of the $p$-values under their null hypotheses. This methodology is illustrated in several data analysis scenarios, including small area inference, spatially arranged populations, interactions in linear regression, and generalized linear models.

Submitted 10 December, 2019; v1 submitted 29 July, 2019; originally announced July 2019.

arXiv:1906.07684 [pdf, other]

Monte Carlo simulation on the Stiefel manifold via polar expansion

Authors: Michael Jauch, Peter D. Hoff, David B. Dunson

Abstract: Motivated by applications to Bayesian inference for statistical models with orthogonal matrix parameters, we present $\textit{polar expansion},$ a general approach to Monte Carlo simulation from probability distributions on the Stiefel manifold. To bypass many of the well-established challenges of simulating from the distribution of a random orthogonal matrix $\boldsymbol{Q},$ we construct a distribution for an unconstrained random matrix $\boldsymbol{X}$ such that $\boldsymbol{Q}_X,$ the orthogonal component of the polar decomposition of $\boldsymbol{X},$ is equal in distribution to $\boldsymbol{Q}.$ The distribution of $\boldsymbol{X}$ is amenable to Markov chain Monte Carlo (MCMC) simulation using standard methods, and an approximation to the distribution of $\boldsymbol{Q}$ can be recovered from a Markov chain on the unconstrained space. When combined with modern MCMC software, polar expansion allows for routine and flexible posterior inference in models with orthogonal matrix parameters. We find that polar expansion with adaptive Hamiltonian Monte Carlo is an order of magnitude more efficient than competing MCMC approaches in a benchmark protein interaction network application. We also propose a new approach to Bayesian functional principal components analysis which we illustrate in a meteorological time series application.

Submitted 18 June, 2019; originally announced June 2019.

Comments: 24 pages, 4 figures, 1 table

arXiv:1902.05106 [pdf, other]

Structured Shrinkage Priors

Authors: Maryclare Griffin, Peter D. Hoff

Abstract: In many regression settings the unknown coefficients may have some known structure, for instance they may be ordered in space or correspond to a vectorized matrix or tensor. At the same time, the unknown coefficients may be sparse, with many nearly or exactly equal to zero. However, many commonly used priors and corresponding penalties for coefficients do not encourage simultaneously structured and sparse estimates. In this paper we develop structured shrinkage priors that generalize multivariate normal, Laplace, exponential power and normal-gamma priors. These priors allow the regression coefficients to be correlated a priori without sacrificing elementwise sparsity or shrinkage. The primary challenges in working with these structured shrinkage priors are computational, as the corresponding penalties are intractable integrals and the full conditional distributions that are needed to approximate the posterior mode or simulate from the posterior distribution may be non-standard. We overcome these issues using a flexible elliptical slice sampling procedure, and demonstrate that these priors can be used to introduce structure while preserving sparsity.

Submitted 26 April, 2023; v1 submitted 13 February, 2019; originally announced February 2019.

arXiv:1810.02881 [pdf, other]

Random orthogonal matrices and the Cayley transform

Authors: Michael Jauch, Peter D. Hoff, David B. Dunson

Abstract: Random orthogonal matrices play an important role in probability and statistics, arising in multivariate analysis, directional statistics, and models of physical systems, among other areas. Calculations involving random orthogonal matrices are complicated by their constrained support. Accordingly, we parametrize the Stiefel and Grassmann manifolds, represented as subsets of orthogonal matrices, in terms of Euclidean parameters using the Cayley transform. We derive the necessary Jacobian terms for change of variables formulas. Given a density defined on the Stiefel or Grassmann manifold, these allow us to specify the corresponding density for the Euclidean parameters, and vice versa. As an application, we describe and illustrate through examples a Markov chain Monte Carlo approach to simulating from distributions on the Stiefel and Grassmann manifolds. Finally, we establish an asymptotic independent normal approximation for the distribution of the Euclidean parameters which corresponds to the uniform distribution on the Stiefel manifold. This result contributes to the growing literature on normal approximations to the entries of random orthogonal matrices or transformations thereof.

Submitted 5 October, 2018; originally announced October 2018.

Comments: 34 pages, 2 figures

arXiv:1809.09159 [pdf, other]

Exact adaptive confidence intervals for small areas

Authors: Kyle Burris, Peter Hoff

Abstract: In the analysis of survey data it is of interest to estimate and quantify uncertainty about means or totals for each of several non-overlapping subpopulations, or areas. When the sample size for a given area is small, standard confidence intervals based on data only from that area can be unacceptably wide. In order to reduce interval width, practitioners often utilize multilevel models to borrow information across areas, resulting in intervals centered around shrinkage estimators. However, such intervals only have the nominal coverage rate on average across areas under the assumed model for across-area heterogeneity. The coverage rate for a given area depends on the actual value of the area mean, and can be nearly zero for areas with means that are far from the across-group average. As such, the use of uncertainty intervals centered around shrinkage estimators is inappropriate when area-specific coverage rates are desired. In this article, we propose an alternative confidence interval procedure for area means and totals under normally distributed sampling errors. This procedure not only has constant $1-α$ frequentist coverage for all values of the target quantity, but also uses auxiliary information to borrow information across areas. Because of this, the corresponding intervals have shorter expected lengths than standard confidence intervals centered on the unbiased direct estimator. Importantly, the coverage of the procedure does not depend on the assumed model for across-area heterogeneity. Rather, improvements to the model for across-area heterogeneity result in reduced expected interval width.

Submitted 24 September, 2018; originally announced September 2018.

arXiv:1807.08038 [pdf, other]

Additive and multiplicative effects network models

Authors: Peter D. Hoff

Abstract: Network datasets typically exhibit certain types of statistical dependencies, such as within-dyad correlation, row and column heterogeneity, and third-order dependence patterns such as transitivity and clustering. The first two of these can be well-represented statistically with a social relations model, a type of additive random effects model originally developed for continuous dyadic data. Third-order patterns can be represented with multiplicative random effects models, which are related to matrix decompositions commonly used for matrix-variate data analysis. Additionally, these multiplicative random effects models generalize other popular latent variable network models, such as the stochastic blockmodel and the latent space model. In this article we review a general regression framework for the analysis of network data that combines these two types of random effects and accommodates a variety of network data types, including continuous, binary and ordinal network relations.

Submitted 20 July, 2018; originally announced July 2018.

MSC Class: 62H25; 62F15

arXiv:1801.00152 [pdf, other]

Adaptive Sign Error Control

Authors: Chaoyu Yu, Peter D. Hoff

Abstract: In multiple testing scenarios, typically the sign of a parameter is inferred when its estimate exceeds some significance threshold in absolute value. Typically, the significance threshold is chosen to control the experimentwise type I error rate, family-wise type I error rate or the false discovery rate. However, controlling these error rates does not explicitly control the sign error rate. In this paper, we propose two procedures for adaptively selecting an experimentwise significance threshold in order to control the sign error rate. The first controls the sign error rate conservatively, without any distributional assumptions on the parameters of interest. The second is an empirical Bayes procedure, and achieves optimal performance asymptotically when a model for the distribution of the parameters is correctly specified. We also discuss an adaptive procedure to minimize the sign error rate when the experimentwise type I error rate is held fixed.

Submitted 30 December, 2017; originally announced January 2018.

Comments: 19 pages, 4 figures

arXiv:1712.06230 [pdf, other]

Testing Sparsity-Inducing Penalties

Authors: Maryclare Griffin, Peter D. Hoff

Abstract: Many penalized maximum likelihood estimators correspond to posterior mode estimators under specific prior distributions. Appropriateness of a particular class of penalty functions can therefore be interpreted as the appropriateness of a prior for the parameters. For example, the appropriateness of a lasso penalty for regression coefficients depends on the extent to which the empirical distribution of the regression coefficients resembles a Laplace distribution. We give a testing procedure of whether or not a Laplace prior is appropriate and accordingly, whether or not using a lasso penalized estimate is appropriate. This testing procedure is designed to have power against exponential power priors which correspond to $\ell_q$ penalties. Via simulations, we show that this testing procedure achieves the desired level and has enough power to detect violations of the Laplace assumption when the numbers of observations and unknown regression coefficients are large. We then introduce an adaptive procedure that chooses a more appropriate prior and corresponding penalty from the class of exponential power priors when the null hypothesis is rejected. We show that this can improve estimation of the regression coefficients both when they are drawn from an exponential power distribution and when they are drawn from a spike-and-slab distribution.

Submitted 8 September, 2018; v1 submitted 17 December, 2017; originally announced December 2017.

arXiv:1712.02497 [pdf, other]

Multiplicative Coevolution Regression Models for Longitudinal Networks and Nodal Attributes

Authors: Yanjun He, Peter D. Hoff

Abstract: We introduce a simple and extendable coevolution model for the analysis of longitudinal network and nodal attribute data. The model features parameters that describe three phenomena: homophily, contagion and autocorrelation of the network and nodal attribute process. Homophily here describes how changes to the network may be associated with between-node similarities in terms of their nodal attributes. Contagion refers to how node-level attributes may change depending on the network. The model we present is based upon a pair of intertwined autoregressive processes. We obtain least-squares parameter estimates for continuous-valued fully-observed network and attribute data. We also provide methods for Bayesian inference in several other cases, including ordinal network and attribute data, and models involving latent nodal attributes. These model extensions are applied to an analysis of international relations data and to data from a study of teen delinquency and friendship networks.

Submitted 7 December, 2017; originally announced December 2017.

Comments: 20 pages

arXiv:1706.09072 [pdf, other]

Influence Networks in International Relations

Authors: Shahryar Minhas, Peter D. Hoff, Michael D. Ward

Abstract: Measuring influence and determining what drives it are persistent questions in political science and in network analysis more generally. Herein we focus on the domain of international relations. Our major substantive question is: How can we determine what characteristics make an actor influential? To address the topic of influence, we build on a multilinear tensor regression framework (MLTR) that captures influence relationships using a tensor generalization of a vector autoregression model. Influence relationships in that approach are captured in a pair of n x n matrices and provide measurements of how the network actions of one actor may influence the future actions of another. A limitation of the MLTR and earlier latent space approaches is that there are no direct mechanisms through which to explain why a certain actor is more or less influential than others. Our new framework, social influence regression, provides a way to statistically model the influence of one actor on another as a function of characteristics of the actors. Thus we can move beyond just estimating that an actor influences another to understanding why. To highlight the utility of this approach, we apply it to studying monthly-level conflictual events between countries as measured through the Integrated Crisis Early Warning System (ICEWS) event data project.

Submitted 27 June, 2017; originally announced June 2017.

Comments: 23 pages, 8 figures

arXiv:1705.08331 [pdf, other]

Exact adaptive confidence intervals for linear regression coefficients

Authors: Peter D. Hoff, Chaoyu Yu

Abstract: We propose an adaptive confidence interval procedure (CIP) for the coefficients in the normal linear regression model. This procedure has a frequentist coverage rate that is constant as a function of the model parameters, yet provides smaller intervals than the usual interval procedure, on average across regression coefficients. The proposed procedure is obtained by defining a class of CIPs that all have exact $1-α$ frequentist coverage, and then selecting from this class the procedure that minimizes a prior expected interval width. Such a procedure may be described as "frequentist, assisted by Bayes" or FAB. We describe an adaptive approach for estimating the prior distribution from the data so that exact non-asymptotic $1-α$ coverage is maintained. Additionally, in a "$p$ growing with $n$" asymptotic scenario, this adaptive FAB procedure is asymptotically Bayes-optimal among $1-α$ frequentist CIPs.

Submitted 6 July, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

MSC Class: 62J05

arXiv:1703.08620 [pdf, other]

Lasso ANOVA Decompositions for Matrix and Tensor Data

Authors: Maryclare Griffin, Peter D. Hoff

Abstract: Consider the problem of estimating the entries of an unknown mean matrix or tensor given a single noisy realization. In the matrix case, this problem can be addressed by decomposing the mean matrix into a component that is additive in the rows and columns, i.e., the additive ANOVA decomposition of the mean matrix, plus a matrix of elementwise effects, and assuming that the elementwise effects may be sparse. Accordingly, the mean matrix can be estimated by solving a penalized regression problem, applying a lasso penalty to the elementwise effects. Although solving this penalized regression problem is straightforward, specifying appropriate values of the penalty parameters is not. Leveraging the posterior mode interpretation of the penalized regression problem, moment-based empirical Bayes estimators of the penalty parameters can be defined. Estimation of the mean matrix using these moment-based empirical Bayes estimators can be called LANOVA penalization, and the corresponding estimate of the mean matrix can be called the LANOVA estimate. The empirical Bayes estimators are shown to be consistent. Additionally, LANOVA penalization is extended to accommodate sparsity of row and column effects and to estimate an unknown mean tensor. The behavior of the LANOVA estimate is examined under misspecification of the distribution of the elementwise effects, and LANOVA penalization is applied to several datasets, including a matrix of microarray data, a three-way tensor of fMRI data and a three-way tensor of wheat infection data.

Submitted 8 February, 2019; v1 submitted 24 March, 2017; originally announced March 2017.

arXiv:1612.08287 [pdf, other]

Adaptive multigroup confidence intervals with constant coverage

Authors: Chaoyu Yu, Peter D. Hoff

Abstract: Confidence intervals for the means of multiple normal populations are often based on a hierarchical normal model. While commonly used interval procedures based on such a model have the nominal coverage rate on average across a population of groups, their actual coverage rate for a given group will be above or below the nominal rate, depending on the value of the group mean. Alternatively, a coverage rate that is constant as a function of a group's mean can be simply achieved by using a standard $t$-interval, based on data only from that group. The standard $t$-interval, however, fails to share information across the groups and is therefore not adaptive to easily obtained information about the distribution of group-specific means. In this article we construct confidence intervals that have a constant frequentist coverage rate and that make use of information about across-group heterogeneity, resulting in constant-coverage intervals that are narrower than standard $t$-intervals on average across groups. Such intervals are constructed by inverting biased tests for the mean of a normal population. Given a prior distribution on the mean, Bayes-optimal biased tests can be inverted to form Bayes-optimal confidence intervals with frequentist coverage that is constant as a function of the mean. In the context of multiple groups, the prior distribution is replaced by a model of across-group heterogeneity. The parameters for this model can be estimated using data from all of the groups, and used to obtain confidence intervals with constant group-specific coverage that adapt to information about the distribution of group means.

Submitted 25 December, 2016; originally announced December 2016.

MSC Class: 62C12

arXiv:1611.00460 [pdf, other]

Inferential Approaches for Network Analyses: AMEN for Latent Factor Models

Authors: Shahryar Minhas, Peter D. Hoff, Michael D. Ward

Abstract: We introduce a Bayesian approach to conduct inferential analyses on dyadic data while accounting for interdependencies between observations through a set of additive and multiplicative effects (AME). The AME model is built on a generalized linear modeling framework and is thus flexible enough to be applied to a variety of contexts. We contrast the AME model to two prominent approaches in the literature: the latent space model (LSM) and the exponential random graph model (ERGM). Relative to these approaches, we show that the AME approach a) is easy to implement; b) is interpretable in a general linear model framework; c) is computationally straightforward; d) is not prone to degeneracy; e) captures 1st, 2nd, and 3rd order network dependencies; and f) notably outperforms ERGMs and LSMs on a variety of metrics and in an out-of-sample context. In summary, AME offers a straightforward way to undertake nuanced, principled inferential network analysis for a wide range of social science questions.

Submitted 27 July, 2018; v1 submitted 1 November, 2016; originally announced November 2016.

arXiv:1611.00040 [pdf, other]

Lasso, fractional norm and structured sparse estimation using a Hadamard product parametrization

Authors: Peter D. Hoff

Abstract: Using a multiplicative reparametrization, I show that a subclass of $L_q$ penalties with $q\leq 1$ can be expressed as sums of $L_2$ penalties. It follows that the lasso and other norm-penalized regression estimates may be obtained using a very simple and intuitive alternating ridge regression algorithm. As compared to a similarly intuitive EM algorithm for $L_q$ optimization, the proposed algorithm avoids some numerical instability issues and is also competitive in terms of speed. Furthermore, the proposed algorithm can be extended to accommodate sparse high-dimensional scenarios, generalized linear models, and can be used to create structured sparsity via penalties derived from covariance models for the parameters. Such model-based penalties may be useful for sparse estimation of spatially or temporally structured parameters.

Submitted 18 May, 2017; v1 submitted 31 October, 2016; originally announced November 2016.

Comments: This revision includes a comparison to cyclic coordinate descent and a new algorithm for sparse high-dimensional settings

MSC Class: 62-04

arXiv:1607.03045 [pdf, other]

Shared Subspace Models for Multi-Group Covariance Estimation

Authors: Alexander Franks, Peter Hoff

Abstract: We develop a model-based method for evaluating heterogeneity among several p x p covariance matrices in the large p, small n setting. This is done by assuming a spiked covariance model for each group and sharing information about the space spanned by the group-level eigenvectors. We use an empirical Bayes method to identify a low-dimensional subspace which explains variation across all groups and use an MCMC algorithm to estimate the posterior uncertainty of eigenvectors and eigenvalues on this subspace. The implementation and utility of our model is illustrated with analyses of high-dimensional multivariate gene expression.

Submitted 21 October, 2019; v1 submitted 11 July, 2016; originally announced July 2016.

arXiv:1603.01525 [pdf, ps, other]

doi 10.1103/PhysRevC.93.054303

The structure of low-lying states in ${}^{140}$Sm studied by Coulomb excitation

Authors: M. Klintefjord, K. Hadyńska-Klȩk, A. Görgen, C. Bauer, F. L. Bello Garrote, S. Bönig, B. Bounthong, A. Damyanova, J. -P. Delaroche, V. Fedosseev, D. A. Fink, F. Giacoppo, M. Girod, P. Hoff, N. Imai, W. Korten, A. C. Larsen, J. Libert, R. Lutter, B. A. Marsh, P. L. Molkanov, H. Naïdja, P. Napiorkowski, F. Nowacki, J. Pakarinen, et al. (19 additional authors not shown)

Abstract: The electromagnetic structure of $^{140}$Sm was studied in a low-energy Coulomb excitation experiment with a radioactive ion beam from the REX-ISOLDE facility at CERN. The $2^+$ and $4^+$ states of the ground-state band and a second $2^+$ state were populated by multi-step excitation. The analysis of the differential Coulomb excitation cross sections yielded reduced transition probabilities between all observed states and the spectroscopic quadrupole moment for the $2_1^+$ state. The experimental results are compared to large-scale shell model calculations and beyond-mean-field calculations based on the Gogny D1S interaction with a five-dimensional collective Hamiltonian formalism. Simpler geometric and algebraic models are also employed to interpret the experimental data. The results indicate that $^{140}$Sm shows considerable $γ$ softness, but in contrast to earlier speculation no signs of shape coexistence at low excitation energy. This work sheds more light on the onset of deformation and collectivity in this mass region.

Submitted 4 March, 2016; originally announced March 2016.

Comments: 15 pages, 12 figures

Journal ref: Phys. Rev. C 93, 054303 (2016)

arXiv:1512.09020 [pdf, other]

Limitations on detecting row covariance in the presence of column covariance

Authors: Peter D. Hoff

Abstract: Many inference techniques for multivariate data analysis assume that the rows of the data matrix are realizations of independent and identically distributed random vectors. Such an assumption will be met, for example, if the rows of the data matrix are multivariate measurements on a set of independently sampled units. In the absence of an independent random sample, a relevant question is whether or not a statistical model that assumes such row exchangeability is plausible. One method for assessing this plausibility is a statistical test of row covariation. Maintenance of a constant type I error rate regardless of the column covariance or matrix mean can be accomplished with a test that is invariant under an appropriate group of transformations. In the context of a class of elliptically contoured matrix regression models (such as matrix normal models), I show that there are no non-trivial invariant tests if the number of rows is not sufficiently larger than the number of columns. Furthermore, I show that even if the number of rows is large, there are no non-trivial invariant tests that have power to detect arbitrary row covariance in the presence of arbitrary column covariance. However, we can construct biased tests that have power to detect certain types of row covariance that may be encountered in practice.

Submitted 30 December, 2015; originally announced December 2015.

MSC Class: 62H15

arXiv:1512.08815 [pdf, other]

A Pivot-Based Improvement to Sandwich-Based Confidence Intervals

Authors: James W. Harmon, Peter D. Hoff

Abstract: The current standard for confidence interval construction in the context of a possibly misspecified model is to use an interval based on the sandwich estimate of variance. These intervals provide asymptotically correct coverage, but small-sample coverage is known to be poor. By eliminating a plug-in assumption, we derive a pivot-based method for confidence interval construction under possibly misspecified models. When compared against confidence intervals generated by the sandwich estimate of variance, this method provides more accurate coverage of the pseudo-true parameter at small sample sizes. This is shown in the results of several simulation studies. Asymptotic results show that our pivot-based intervals have large sample efficiency equal to that of intervals based on the sandwich estimate of variance.

Submitted 29 December, 2015; originally announced December 2015.

MSC Class: 62G15

arXiv:1506.08237 [pdf, other]

Dyadic data analysis with amen

Authors: Peter D. Hoff

Abstract: Dyadic data on pairs of objects, such as relational or social network data, often exhibit strong statistical dependencies. Certain types of second-order dependencies, such as degree heterogeneity and reciprocity, can be well-represented with additive random effects models. Higher-order dependencies, such as transitivity and stochastic equivalence, can often be represented with multiplicative effects. The "amen" package for the R statistical computing environment provides estimation and inference for a class of additive and multiplicative random effects models for ordinal, continuous, binary and other types of dyadic data. The package also provides methods for missing, censored and fixed-rank nomination data, as well as longitudinal dyadic data. This tutorial illustrates the "amen" package via example statistical analyses of several of these different data types.

Submitted 26 June, 2015; originally announced June 2015.

Comments: This is a vignette for the R package "amen"

MSC Class: 62-07; 62F15

arXiv:1505.02114 [pdf, other]

doi 10.1214/17-EJS1330

Adaptive Higher-order Spectral Estimators

Authors: David Gerard, Peter Hoff

Abstract: Many applications involve estimation of a signal matrix from a noisy data matrix. In such cases, it has been observed that estimators that shrink or truncate the singular values of the data matrix perform well when the signal matrix has approximately low rank. In this article, we generalize this approach to the estimation of a tensor of parameters from noisy tensor data. We develop new classes of estimators that shrink or threshold the mode-specific singular values from the higher-order singular value decomposition. These classes of estimators are indexed by tuning parameters, which we adaptively choose from the data by minimizing Stein's unbiased risk estimate. In particular, this procedure provides a way to estimate the multilinear rank of the underlying signal tensor. Using simulation studies under a variety of conditions, we show that our estimators perform well when the mean tensor has approximately low multilinear rank, and perform competitively when the signal tensor does not have approximately low multilinear rank. We illustrate the use of these methods in an application to multivariate relational data.

Submitted 22 February, 2017; v1 submitted 8 May, 2015; originally announced May 2015.

Comments: 29 pages, 3 figures

MSC Class: 62H12 (Primary) 15A69; 62C99; 91D30; 62H35 (Secondary)

Journal ref: Electronic Journal of Statistics 11 (2017) 3703--3737

arXiv:1504.08218 [pdf, other]

Relax, Tensors Are Here: Dependencies in International Processes

Authors: Shahryar Minhas, Peter D. Hoff, Michael D. Ward

Abstract: Previous models of international conflict have suffered two shortfalls. They tended not to embody dynamic changes, focusing rather on static slices of behavior over time. These models have also been empirically evaluated in ways that assumed the independence of each country, when in reality they are searching for the interdependence among all countries. We illustrate a solution to these two hurdles and evaluate this new, dynamic, network based approach to the dependencies among the ebb and flow of daily international interactions using a newly developed, and openly available, database of events among nations.

Submitted 30 April, 2015; originally announced April 2015.

arXiv:1412.0048 [pdf, ps, other]

doi 10.1214/15-AOAS839

Multilinear tensor regression for longitudinal relational data

Authors: Peter D. Hoff

Abstract: A fundamental aspect of relational data, such as from a social network, is the possibility of dependence among the relations. In particular, the relations between members of one pair of nodes may have an effect on the relations between members of another pair. This article develops a type of regression model to estimate such effects in the context of longitudinal and multivariate relational data, or other data that can be represented in the form of a tensor. The model is based on a general multilinear tensor regression model, a special case of which is a tensor autoregression model in which the tensor of relations at one time point is parsimoniously regressed on relations from previous time points. This is done via a separable, or Kronecker-structured, regression parameter along with a separable covariance model. In the context of an analysis of longitudinal multivariate relational data, it is shown how the multilinear tensor regression model can represent patterns that often appear in relational and network data, such as reciprocity and transitivity.

Submitted 5 November, 2015; v1 submitted 28 November, 2014; originally announced December 2014.

Comments: Published at http://dx.doi.org/10.1214/15-AOAS839 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS839

Journal ref: Annals of Applied Statistics 2015, Vol. 9, No. 3, 1169-1193

arXiv:1410.1094 [pdf, other]

doi 10.1016/j.laa.2016.04.033

A higher-order LQ decomposition for separable covariance models

Authors: David C. Gerard, Peter D. Hoff

Abstract: We develop a higher order generalization of the LQ decomposition and show that this decomposition plays an important role in likelihood-based estimation and testing for separable, or Kronecker structured, covariance models, such as the multilinear normal model. This role is analogous to that of the LQ decomposition in likelihood inference for the multivariate normal model. Additionally, this higher order LQ decomposition can be used to construct an alternative version of the popular higher order singular value decomposition for tensor-valued data. We also develop a novel generalization of the polar decomposition to tensor-valued data.

Submitted 4 October, 2014; originally announced October 2014.

Comments: 30 pages

MSC Class: 15A69; 62H12; 62H15; 65F99

Journal ref: Linear Algebra and its Applications 505 (2016) 57--84

arXiv:1408.0424 [pdf, other]

doi 10.1016/j.jmva.2015.01.020

Equivariant minimax dominators of the MLE in the array normal model

Authors: David Gerard, Peter Hoff

Abstract: Inference about dependencies in a multiway data array can be made using the array normal model, which corresponds to the class of multivariate normal distributions with separable covariance matrices. Maximum likelihood and Bayesian methods for inference in the array normal model have appeared in the literature, but there have not been any results concerning the optimality properties of such estimators. In this article, we obtain results for the array normal model that are analogous to some classical results concerning covariance estimation for the multivariate normal model. We show that under a lower triangular product group, a uniformly minimum risk equivariant estimator (UMREE) can be obtained via a generalized Bayes procedure. Although this UMREE is minimax and dominates the MLE, it can be improved upon via an orthogonally equivariant modification. Numerical comparisons of the risks of these estimators show that the equivariant estimators can have substantially lower risks than the MLE.

Submitted 2 August, 2014; originally announced August 2014.

MSC Class: 62H12; 62C20; 62F10; 62F15

Journal ref: Journal of Multivariate Analysis 137 (2015) 32--49

arXiv:1312.6397 [pdf, other]

Equivariant and scale-free Tucker decomposition models

Authors: Peter David Hoff

Abstract: Analyses of array-valued datasets often involve reduced-rank array approximations, typically obtained via least-squares or truncations of array decompositions. However, least-squares approximations tend to be noisy in high-dimensional settings, and may not be appropriate for arrays that include discrete or ordinal measurements. This article develops methodology to obtain low-rank model-based representations of continuous, discrete and ordinal data arrays. The model is based on a parameterization of the mean array as a multilinear product of a reduced-rank core array and a set of index-specific orthogonal eigenvector matrices. It is shown how orthogonally equivariant parameter estimates can be obtained from Bayesian procedures under invariant prior distributions. Additionally, priors on the core array are developed that act as regularizers, leading to improved inference over the standard least-squares estimator, and providing robustness to misspecification of the array rank. This model-based approach is extended to accommodate discrete or ordinal data arrays using a semiparametric transformation model. The resulting low-rank representation is scale-free, in the sense that it is invariant to monotonic transformations of the data array. In an example analysis of a multivariate discrete network dataset, this scale-free approach provides a more complete description of data patterns.

Submitted 22 December, 2013; originally announced December 2013.

MSC Class: 62H25; 62F15

arXiv:1311.2610 [pdf, other]

Joint Mean and Covariance Modeling of Multiple Health Outcome Measures

Authors: Xiaoyue Niu, Peter D. Hoff

Abstract: Health exams determine a patient's health status by comparing the patient's measurement with a population reference range, a 95% interval derived from a homogeneous reference population. Similarly, most of the established relations among health problems are assumed to hold for the entire population. We use data from the 2009-2010 National Health and Nutrition Examination Survey (NHANES) on four major health problems in the U.S. and apply a joint mean and covariance model to study how the reference ranges and associations of those health outcomes could vary among subpopulations. We discuss guidelines for model selection and evaluation, using standard criteria such as AIC in conjunction with posterior predictive checks. The results from the proposed model can help identify subpopulations in which more data need to be collected to refine the reference range and to study the specific associations among those health problems.

Submitted 31 May, 2018; v1 submitted 11 November, 2013; originally announced November 2013.

Comments: 20 pages, 8 figures

arXiv:1306.5786 [pdf, other]

Testing for nodal dependence in relational data matrices

Authors: Alexander Volfovsky, Peter D. Hoff

Abstract: Relational data are often represented as a square matrix, the entries of which record the relationships between pairs of objects. Many statistical methods for the analysis of such data assume some degree of similarity or dependence between objects in terms of the way they relate to each other. However, formal tests for such dependence have not been developed. We provide a test for such dependence using the framework of the matrix normal model, a type of multivariate normal distribution parameterized in terms of row- and column-specific covariance matrices. We develop a likelihood ratio test (LRT) for row and column dependence based on the observation of a single relational data matrix. We obtain a reference distribution for the LRT statistic, thereby providing an exact test for the presence of row or column correlations in a square relational data matrix. Additionally, we provide extensions of the test to accommodate common features of such data, such as undefined diagonal entries, a non-zero mean, multiple observations, and deviations from normality.

Submitted 24 June, 2013; originally announced June 2013.

arXiv:1306.4708 [pdf, other]

Testing and Modeling Dependencies Between a Network and Nodal Attributes

Authors: Bailey K. Fosdick, Peter D. Hoff

Abstract: Network analysis is often focused on characterizing the dependencies between network relations and node-level attributes. Potential relationships are typically explored by modeling the network as a function of the nodal attributes or by modeling the attributes as a function of the network. These methods require specification of the exact nature of the association between the network and attributes, reduce the network data to a small number of summary statistics, and are unable to provide predictions simultaneously for missing attribute and network information. Existing methods that model the attributes and network jointly also assume the data are fully observed. In this article we introduce a unified approach to analysis that addresses these shortcomings. We use a latent variable model to obtain a low dimensional representation of the network in terms of node-specific network factors and use a test of dependence between the network factors and attributes as a surrogate for a test of dependence between the network and attributes. We propose a formal testing procedure to determine if dependencies exist between the network factors and attributes. We also introduce a joint model for the network and attributes, for use if the test rejects, that can capture a variety of dependence patterns and be used to make inference and predictions for missing observations.

Submitted 19 June, 2013; originally announced June 2013.

arXiv:1304.3676 [pdf, ps, other]

Comment on "Bayesian Nonparametric Inference - Why and How" by Mueller and Mitra

Authors: Peter D. Hoff

Abstract: Due to their great flexibility, nonparametric Bayes methods have proven to be a valuable tool for discovering complicated patterns in data. The term "nonparametric Bayes" suggests that these methods inherit model-free operating characteristics of classical nonparametric methods, as well as coherent uncertainty assessments provided by Bayesian procedures. However, as the authors say in the conclusion to their article, nonparametric Bayesian methods may be more aptly described as "massively parametric." Furthermore, I argue that many of the default nonparametric Bayes procedures are only Bayesian in the weakest sense of the term, and cannot be assumed to provide honest assessments of uncertainty merely because they carry the Bayesian label. However useful such procedures may be, we should be cautious about advertising default nonparametric Bayes procedures as either being "assumption free" or providing descriptions of our uncertainty. If we want our nonparametric Bayes procedures to have a Bayesian interpretation, we should modify default NP Bayes methods to accommodate real prior information, or at the very least, carefully evaluate the effects of hyperparameters on posterior quantities of interest.

Submitted 12 April, 2013; originally announced April 2013.

Comments: Invited discussion of "Bayesian Nonparametric Inference - Why and How" by Mueller and Mitra, to appear in Bayesian Analysis, June 2013

MSC Class: 62G99; 62C10

arXiv:1304.3673 [pdf, other]

Bayesian analysis of matrix data with rstiefel

Authors: Peter D. Hoff

Abstract: We illustrate the use of the R-package "rstiefel" for matrix-variate data analysis in the context of two examples. The first example considers estimation of a reduced-rank mean matrix in the presence of normally distributed noise. The second example considers the modeling of a social network of friendships among teenagers. Bayesian estimation for these models requires the ability to simulate from the matrix-variate von Mises-Fisher distributions and the matrix-variate Bingham distributions on the Stiefel manifold.

Submitted 12 April, 2013; originally announced April 2013.

Comments: This is a vignette for the R-package "rstiefel"

MSC Class: 62H11; 62H25; 65C40

Showing 1–50 of 72 results for author: Hoff, P