-
Bayesian GARCH Modeling of Functional Sports Data
Authors:
Patric Dolmeta,
Raffaele Argiento,
Silvia Montagna
Abstract:
The use of statistical methods in sport analytics has attracted rapidly growing interest over the last decade and is now common practice. In particular, the interest in understanding and predicting an athlete's performance throughout his/her career is motivated by the need to evaluate the efficacy of training programs, anticipate fatigue to prevent injuries, and detect unexpected or disproportionate increases in performance that might be indicative of doping. Moreover, fast-evolving data gathering technologies require up-to-date modelling techniques that adapt to the distinctive features of sports data. In this work, we propose a hierarchical Bayesian model for describing and predicting the evolution of performance over time for shot put athletes. To account for seasonality and heterogeneity in recorded results, we rely both on a smooth functional contribution and on a linear mixed-effects model with heteroskedastic errors to represent the athlete-specific trajectories. The resulting model provides an accurate description of the performance trajectories and helps specify both the intra- and inter-seasonal variability of measurements. Further, the model allows for the prediction of athletes' performance in future seasons. We apply our model to an extensive real-world data set of performance data for professional shot put athletes recorded at elite competitions.
Submitted 20 January, 2021;
originally announced January 2021.
-
Bayesian isotonic logistic regression via constrained splines: an application to estimating the serve advantage in professional tennis
Authors:
Silvia Montagna,
Vanessa Orani,
Raffaele Argiento
Abstract:
In professional tennis, it is often acknowledged that the server has an initial advantage. Indeed, the majority of points are won by the server, making the serve one of the most important elements in this sport. In this paper, we focus on the role of the serve advantage in winning a point as a function of the rally length. We propose a Bayesian isotonic logistic regression model for the probability of winning a point on serve. In particular, we decompose the logit of the probability of winning via a linear combination of B-spline basis functions, with athlete-specific basis function coefficients. Further, we ensure the serve advantage decreases with rally length by imposing constraints on the spline coefficients. We also consider the rally ability of each player, and study how the different types of court may affect a player's rally ability. We apply our methodology to a dataset of Grand Slam singles matches.
Submitted 29 August, 2019;
originally announced September 2019.
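The abstract of this entry says that the logit of the probability of winning a point on serve is a linear combination of B-spline basis functions, with constraints on the coefficients that force it to decrease with rally length. The sketch below illustrates one sufficient way to obtain that behaviour: making the coefficients themselves decreasing. The parametrisation through positive decrements, the knot placement, the numeric values, and the helper names `bspline_basis` and `decreasing_coefficients` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    n_intervals = len(knots) - 1
    # Degree 0: indicator of each half-open knot interval.
    basis = np.zeros((x.size, n_intervals))
    for i in range(n_intervals):
        basis[:, i] = (knots[i] <= x) & (x < knots[i + 1])
    # Include the right end point in the last non-empty interval.
    last = np.max(np.nonzero(knots[:-1] < knots[1:])[0])
    basis[x == knots[-1], last] = 1.0
    for d in range(1, degree + 1):
        new = np.zeros((x.size, len(knots) - d - 1))
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            if left_den > 0:
                new[:, i] += (x - knots[i]) / left_den * basis[:, i]
            if right_den > 0:
                new[:, i] += (knots[i + d + 1] - x) / right_den * basis[:, i + 1]
        basis = new
    return basis


def decreasing_coefficients(first, log_steps):
    """Hypothetical parametrisation: subtract cumulative positive decrements."""
    steps = np.exp(log_steps)  # strictly positive
    return first - np.concatenate(([0.0], np.cumsum(steps)))


rng = np.random.default_rng(0)
degree = 3
# Clamped cubic knots on rally lengths 0..15 (illustrative choice).
knots = np.concatenate((np.zeros(degree + 1), [5.0, 10.0], np.full(degree + 1, 15.0)))
coef = decreasing_coefficients(1.5, rng.normal(size=5))  # 6 coefficients

rally = np.linspace(0.0, 15.0, 16)
design = bspline_basis(rally, knots, degree)
logit = design @ coef
prob = 1.0 / (1.0 + np.exp(-logit))  # probability of winning on serve
```

Because the derivative of a B-spline is a non-negative combination of consecutive coefficient differences, decreasing coefficients give a decreasing logit and hence a decreasing win probability. In a Bayesian fit, a prior would be placed on `first` and `log_steps` for each athlete rather than drawing them at random as done here.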
-
High-dimensional Bayesian Fourier Analysis For Detecting Circadian Gene Expressions
Authors:
Silvia Montagna,
Irina Irincheeva,
Surya T. Tokdar
Abstract:
In genomic applications, there is often interest in identifying genes whose time-course expression trajectories exhibit periodic oscillations with a period of approximately 24 hours. Such genes are usually referred to as circadian, and their identification is a crucial step toward discovering physiological processes that are clock-controlled. It is natural to expect that the expression of gene i at time j might depend to some degree on the expression of the other genes measured at the same time. However, widely used rhythmicity detection techniques do not account for the potential dependence across genes. We develop a Bayesian approach for periodicity identification that explicitly takes into account the complex dependence structure across time-course trajectories in gene expressions. We employ a latent factor representation to accommodate dependence, while representing the true trajectories in the Fourier domain allows for inference on period, phase, and amplitude of the signal. Identification of circadian genes is enabled by a carefully chosen variable selection prior on the Fourier basis coefficients. The methodology is applied to a novel mouse liver circadian dataset. Although motivated by time-course gene expression array data, the proposed approach is applicable to the analysis of dependent functional data more broadly.
Submitted 27 February, 2024; v1 submitted 12 September, 2018;
originally announced September 2018.
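The abstract of this entry notes that representing a trajectory in the Fourier domain allows inference on the phase and amplitude of the signal. As a minimal illustration of that link only, the sketch below fits a single harmonic with a known 24-hour period by ordinary least squares and converts the cosine and sine coefficients into an amplitude and a phase. The least-squares fit stands in for the paper's Bayesian model with latent factors and a variable selection prior, which is not reproduced here. The function name `fit_harmonic` and the synthetic data are assumptions.

```python
import numpy as np


def fit_harmonic(t, y, period):
    """Least-squares fit of y ~ mean + a*cos(w t) + b*sin(w t), with w = 2*pi/period.

    Returns (mean, amplitude, phase) such that the fitted curve equals
    mean + amplitude * cos(w t - phase).
    """
    omega = 2.0 * np.pi / period
    design = np.column_stack((np.ones_like(t), np.cos(omega * t), np.sin(omega * t)))
    mean, a, b = np.linalg.lstsq(design, y, rcond=None)[0]
    # a = A*cos(phase) and b = A*sin(phase), hence:
    return mean, np.hypot(a, b), np.arctan2(b, a)


# Synthetic, noise-free expression profile sampled every 2 hours for 48 hours.
t = np.arange(0.0, 48.0, 2.0)
y = 5.0 + 2.0 * np.cos(2.0 * np.pi * t / 24.0 - 0.7)

mean, amplitude, phase = fit_harmonic(t, y, period=24.0)
```

An amplitude close to zero corresponds to a non-rhythmic gene, which is the decision that a variable selection prior on the Fourier coefficients formalises in a Bayesian setting.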
-
The coordinate-based meta-analysis of neuroimaging data
Authors:
Pantelis Samartsidis,
Silvia Montagna,
Thomas E. Nichols,
Timothy D. Johnson
Abstract:
Neuroimaging meta-analysis is an area of growing interest in statistics. The special characteristics of neuroimaging data render classical meta-analysis methods inapplicable and therefore new methods have been developed. We review existing methodologies, explaining the benefits and drawbacks of each. A demonstration on a real dataset of emotion studies is included. We discuss some still-open problems in the field to highlight the need for future research.
Submitted 29 November, 2017; v1 submitted 28 October, 2016;
originally announced October 2016.
-
Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data
Authors:
Silvia Montagna,
Tor Wager,
Lisa Feldman-Barrett,
Timothy D. Johnson,
Thomas E. Nichols
Abstract:
Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-based Meta-analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and a neuroimaging meta-analysis dataset.
Submitted 22 June, 2016;
originally announced June 2016.
-
Computer emulation with non-stationary Gaussian processes
Authors:
Silvia Montagna,
Surya T. Tokdar
Abstract:
Gaussian process (GP) models are widely used to emulate uncertainty propagation in computer experiments. GP emulation sits comfortably within an analytically tractable Bayesian framework. Apart from propagating uncertainty of the input variables, a GP emulator trained on finitely many runs of the experiment also offers error bars for response surface estimates at unseen input values. This helps select future input values where the experiment should be run to minimize the uncertainty in the response surface estimation. However, traditional GP emulators use stationary covariance functions, which perform poorly and lead to sub-optimal selection of future input points when the response surface has sharp local features, such as a jump discontinuity or an isolated tall peak. We propose an easily implemented non-stationary GP emulator, based on two stationary GPs, one nested into the other, and demonstrate its superior ability in handling local features and selecting future input points from the boundaries of such features.
Submitted 29 January, 2015; v1 submitted 21 August, 2013;
originally announced August 2013.