Empirical Determination of Baseball Eras: Multivariate Changepoint Analysis in Major League Baseball

Authors: Mena CR Whalen, Gregory J Matthews, Brian M Mills

Abstract: We use multivariate change point analysis methods to identify not only mean shifts but also changes in variance across a wide array of statistical time series. Our primary objective is to empirically discern distinct eras in the evolution of baseball, shedding light on significant transformations in team performance and management strategies. We leverage a rich dataset comprising baseball statistics from the late 1800s to 2020, spanning over a century of the sport's history. Results confirm previous historical research, pinpointing well-known baseball eras, such as the Dead Ball Era, Integration Era, Steroid Era, and Post-Steroid Era. Moreover, the study delves into the detection of substantial changes in team performance, effectively identifying periods of both dynasties and collapses within a team's history. The multivariate change point analysis proves to be a valuable tool for understanding the intricate dynamics of baseball's evolution. The method offers a data-driven approach to unveil structural shifts in the sport's historical landscape, providing fresh insights into the impact of rule changes, player strategies, and external factors on baseball's evolution. This not only enhances our comprehension of baseball, showing more robust identification of eras than past univariate time series work, but also showcases the broader applicability of multivariate change point analysis in the domain of sports research and beyond.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2305.10262 [pdf, other]

doi 10.1080/00031305.2023.2242442

Here Comes the STRAIN: Analyzing Defensive Pass Rush in American Football with Player Tracking Data

Authors: Quang Nguyen, Ronald Yurko, Gregory J. Matthews

Abstract: In American football, a pass rush is an attempt by the defensive team to disrupt the offense and prevent the quarterback (QB) from completing a pass. Existing metrics for assessing pass rush performance are either discrete-time quantities or based on subjective judgment. Using player tracking data, we propose STRAIN, a novel metric for evaluating pass rushers in the National Football League (NFL) at the continuous-time within-play level. Inspired by the concept of strain rate in materials science, STRAIN is a simple and interpretable means for measuring defensive pressure in football. It is a directly-observed statistic as a function of two features: the distance between the pass rusher and QB, and the rate at which this distance is being reduced. Our metric possesses great predictability of pressure and stability over time. We also fit a multilevel model for STRAIN to understand the defensive pressure contribution of every pass rusher at the play-level. We apply our approach to NFL data and present results for the first eight weeks of the 2021 regular season. In particular, we provide comparisons of STRAIN for different defensive positions and play outcomes, and rankings of the NFL's best pass rushers according to our metric.

Submitted 30 July, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: 12 figures, 7 tables
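
The abstract above describes STRAIN only as a function of the rusher-to-QB distance and the rate at which that distance shrinks. A minimal sketch of one such function follows, assuming the form STRAIN(t) = -s'(t) / s(t), where s(t) is the distance and s'(t) its rate of change, estimated by a backward difference between tracking frames. The formula, the function name `strain`, and the default frame spacing `dt` are assumptions made for illustration; they are not taken from this listing.

```python
def strain(distances, dt=0.1):
    """Estimate STRAIN at frames 1..n-1 from rusher-to-QB distances.

    Assumed form (hypothetical, for illustration): -s'(t) / s(t),
    with s'(t) approximated by a backward difference over `dt` seconds.
    """
    values = []
    for prev, curr in zip(distances, distances[1:]):
        velocity = (curr - prev) / dt  # negative while the rusher closes in
        values.append(-velocity / curr)  # larger when closer and closing faster
    return values


# Hypothetical distances (yards) for one rusher over four frames.
closing_in = strain([6.0, 5.5, 4.8, 3.9])
backing_off = strain([4.0, 5.0], dt=0.5)
```

Under this assumed form the value is positive while the rusher closes on the QB, negative when the gap widens, and grows as the remaining distance shrinks.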

arXiv:2301.04001 [pdf, other]

Big Ideas in Sports Analytics and Statistical Tools for their Investigation

Authors: Benjamin S. Baumer, Gregory J. Matthews, Quang Nguyen

Abstract: Sports analytics -- broadly defined as the pursuit of improvement in athletic performance through the analysis of data -- has expanded its footprint both in the professional sports industry and in academia over the past 30 years. In this paper, we connect four big ideas that are common across multiple sports: the expected value of a game state, win probability, measures of team strength, and the use of sports betting market data. For each, we explore both the shared similarities and individual idiosyncrasies of analytical approaches in each sport. While our focus is on the concepts underlying each type of analysis, any implementation necessarily involves statistical methodologies, computational tools, and data sources. Where appropriate, we outline how data, models, tools, and knowledge of the sport combine to generate actionable insights. We also describe opportunities to share analytical work, but omit an in-depth discussion of individual player evaluation as beyond our scope. This paper should serve as a useful overview for anyone becoming interested in the study of sports analytics.

Submitted 10 January, 2023; originally announced January 2023.

MSC Class: 62P99 ACM Class: J.2

arXiv:2210.02383 [pdf, other]

Filling the Gaps: A Multiple Imputation Approach to Estimating Aging Curves in Baseball

Authors: Quang Nguyen, Gregory J. Matthews

Abstract: In sports, an aging curve depicts the relationship between average performance and age in athletes' careers. This paper investigates the aging curves for offensive players in Major League Baseball. We study this problem in a missing data context and account for different types of dropouts of baseball players during their careers. We employ a multiple imputation framework for multilevel data to impute the player performance associated with the missing seasons, and estimate the aging curves based on the imputed datasets. We then evaluate the effects of different dropout mechanisms on the aging curves through simulation, before applying our method to analyze MLB player data from past seasons. Results suggest an overestimation of the aging curves constructed without considering the unobserved seasons, whereas estimates obtained from multiple imputation address this shortcoming.

Submitted 11 March, 2024; v1 submitted 5 October, 2022; originally announced October 2022.

arXiv:2111.05310 [pdf, other]

doi 10.6339/22-JDS1042

An Examination of Olympic Sport Climbing Competition Format and Scoring System

Authors: Quang Nguyen, Hannah Butler, Gregory J. Matthews

Abstract: Sport climbing, which made its Olympic debut at the 2020 Summer Games, generally consists of three separate disciplines: speed climbing, bouldering, and lead climbing. However, the International Olympic Committee (IOC) only allowed one set of medals each for men and women in sport climbing. As a result, the governing body of sport climbing, rather than choosing only one of the three disciplines to include in the Olympics, decided to create a competition combining all three disciplines. In order to determine a winner, a combined scoring system was created using the product of the ranks across the three disciplines to determine an overall score for each climber. In this work, the rank-product scoring system of sport climbing is evaluated through simulation to investigate its general features, specifically, the advancement probabilities and scores for climbers given certain placements. Additionally, analyses of historical climbing contest results are presented and real examples of violations of the independence of irrelevant alternatives are illustrated. Finally, this work finds evidence that the current competition format is putting speed climbers at a disadvantage.

Submitted 28 March, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

Comments: 17 pages, 7 figures
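
The rank-product scoring described in the abstract above, and the way it can violate the independence of irrelevant alternatives, can be illustrated with a small sketch. The three climbers and their finishing orders below are invented for illustration and are not results from the paper; the helper names `rank_product_scores` and `without` are likewise hypothetical, and ties are ignored.

```python
from math import prod


def rank_product_scores(orders):
    """Multiply each climber's rank across disciplines (lower score is better).

    `orders` holds one finishing order per discipline, best climber first.
    """
    climbers = orders[0]
    return {c: prod(order.index(c) + 1 for order in orders) for c in climbers}


def without(orders, name):
    """Drop one climber and let everyone behind them move up a place."""
    return [[c for c in order if c != name] for order in orders]


# Invented finishing orders for three climbers.
orders = [
    ["C", "B", "A"],  # speed
    ["C", "B", "A"],  # bouldering
    ["A", "C", "B"],  # lead
]

full = rank_product_scores(orders)                   # A: 3*3*1, B: 2*2*3, C: 1*1*2
reduced = rank_product_scores(without(orders, "C"))  # A: 2*2*1, B: 1*1*2
```

With all three climbers present, A finishes ahead of B. Once C is removed and the ranks are recomputed, B finishes ahead of A, even though no head-to-head result between A and B changed.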

arXiv:2009.04576 [pdf, other]

Bang the Can Slowly: An Investigation into the 2017 Houston Astros

Authors: Ryan T. Elmore, Gregory J. Matthews

Abstract: This manuscript is a statistical investigation into the 2017 Major League Baseball scandal involving the Houston Astros, the World Series championship winner that same year. The Astros were alleged to have stolen their opponents' pitching signs in order to provide their batters with a potentially unfair advantage. This work finds compelling evidence that the Astros' on-field performance was significantly affected by their sign-stealing ploy and quantifies the effects. The three main findings in the manuscript are: 1) the Astros' odds of swinging at a pitch were reduced by approximately 27% (OR: 0.725, 95% CI: (0.618, 0.850)) when the sign was stolen, 2) when an Astros player swung, the odds of making contact with the ball increased roughly 80% (OR: 1.805, 95% CI: (1.342, 2.675)) on non-fastball pitches, and 3) when the Astros made contact with a ball on a pitch in which the sign was known, the ball's exit velocity (launch speed) increased on average by 2.386 (95% CI: (0.334, 4.451)) miles per hour.

Submitted 9 September, 2020; originally announced September 2020.
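
The percentages quoted in the abstract above follow from the odds ratios by simple arithmetic: an odds ratio of 0.725 is roughly a 27% reduction in odds, and 1.805 roughly an 80% increase. The sketch below shows that conversion and how an odds ratio would shift a probability. The baseline probability of 0.5 is a hypothetical value chosen for illustration, not a figure from the paper, and both function names are invented here.

```python
def pct_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Shift a baseline probability by an odds ratio and return the new probability."""
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)


swing_change = pct_change_in_odds(0.725)    # about -27.5 (odds of swinging)
contact_change = pct_change_in_odds(1.805)  # about +80.5 (odds of contact)

# Hypothetical baseline: a 50% swing probability without a stolen sign.
swing_prob_with_sign = apply_odds_ratio(0.5, 0.725)
```

Note that a 27% drop in odds is not a 27% drop in probability: from the hypothetical 50% baseline, the swing probability falls to about 42%.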

arXiv:1804.05882 [pdf, other]

Confidence intervals for the area under the receiver operating characteristic curve in the presence of ignorable missing data

Authors: Hunyong Cho, Gregory J. Matthews, Ofer Harel

Abstract: Receiver operating characteristic (ROC) curves are widely used as a measure of accuracy of diagnostic tests and can be summarized using the area under the ROC curve (AUC). Often, it is useful to construct confidence intervals for the AUC; however, since there are a number of different proposed methods to measure the variance of the AUC, there are many different resulting methods for constructing these intervals. In this manuscript, we compare different methods of constructing Wald-type confidence intervals in the presence of missing data where the missingness mechanism is ignorable. We find that constructing confidence intervals using multiple imputation (MI) based on logistic regression (LR) gives the most robust coverage probability and the choice of CI method is less important. However, when the missingness rate is less severe (e.g. less than 70%), we recommend using Newcombe's Wald method for constructing confidence intervals along with multiple imputation using predictive mean matching (PMM).

Submitted 16 April, 2018; originally announced April 2018.

Comments: 32 pages

arXiv:1802.05778 [pdf, other]

doi 10.1080/02664763.2018.1441381

A comparison of machine learning techniques for taxonomic classification of teeth from the Family Bovidae

Authors: Gregory J Matthews, Juliet K. Brophy, Maxwell P. Luetkemeier, Hongie Gu, George K. Thiruvathukal

Abstract: This study explores the performance of modern, accurate machine learning algorithms on the classification of fossil teeth in the Family Bovidae. Isolated bovid teeth are typically the most common fossils found in southern Africa and they often constitute the basis for paleoenvironmental reconstructions. Taxonomic identification of fossil bovid teeth, however, is often imprecise and subjective. Using modern teeth with known taxa, machine learning algorithms can be trained to classify fossils. Previous work by Brophy et al. (2014) uses elliptical Fourier analysis of the form (size and shape) of the outline of the occlusal surface of each tooth as features in a linear discriminant analysis framework. This manuscript expands on that previous work by exploring how different machine learning approaches classify the teeth and testing which technique is best for classification. Five different machine learning techniques including linear discriminant analysis, neural networks, nuclear penalized multinomial regression, random forests, and support vector machines were used to estimate these models. Support vector machines and random forests perform the best in terms of both log-loss and misclassification rate; both of these methods are improvements over linear discriminant analysis. With the identification and application of these superior methods, bovid teeth can be classified with higher accuracy.

Submitted 15 February, 2018; originally announced February 2018.

Journal ref: Gregory J. Matthews, Juliet K. Brophy, Maxwell Luetkemeier, Hongie Gu & George K. Thiruvathukal (2018) A comparison of machine learning techniques for taxonomic classification of teeth from the Family Bovidae, Journal of Applied Statistics

arXiv:1701.05976 [pdf, other]

How often does the best team win? A unified approach to understanding randomness in North American sport

Authors: Michael J. Lopez, Gregory J. Matthews, Benjamin S. Baumer

Abstract: Statistical applications in sports have long centered on how to best separate signal (e.g. team talent) from random noise. However, most of this work has concentrated on a single sport, and the development of meaningful cross-sport comparisons has been impeded by the difficulty of translating luck from one sport to another. In this manuscript, we develop Bayesian state-space models using betting market data that can be uniformly applied across sporting organizations to better understand the role of randomness in game outcomes. These models can be used to extract estimates of team strength, the between-season, within-season, and game-to-game variability of team strengths, as well as each team's home advantage. We implement our approach across a decade of play in each of the National Football League (NFL), National Hockey League (NHL), National Basketball Association (NBA), and Major League Baseball (MLB), finding that the NBA demonstrates both the largest dispersion in talent and the largest home advantage, while the NHL and MLB stand out for their relative randomness in game outcomes. We conclude by proposing new metrics for judging competitiveness across sports leagues, both within the regular season and using traditional postseason tournament formats. Although we focus on sports, we discuss a number of other situations in which our generalizable models might be usefully applied.

Submitted 22 November, 2017; v1 submitted 20 January, 2017; originally announced January 2017.

Comments: 40 pages, 20 figures, 5 tables, code available at https://github.com/bigfour/competitiveness

arXiv:1312.7158 [pdf, other]

openWAR: An Open Source System for Evaluating Overall Player Performance in Major League Baseball

Authors: Benjamin S. Baumer, Shane T. Jensen, Gregory J. Matthews

Abstract: Within baseball analytics, there is substantial interest in comprehensive statistics intended to capture overall player performance. One such measure is Wins Above Replacement (WAR), which aggregates the contributions of a player in each facet of the game: hitting, pitching, baserunning, and fielding. However, current versions of WAR depend upon proprietary data, ad hoc methodology, and opaque calculations. We propose a competitive aggregate measure, openWAR, that is based upon public data and methodology with greater rigor and transparency. We discuss a principled standard for the nebulous concept of a "replacement" player. Finally, we use simulation-based techniques to provide interval estimates for our openWAR measure.

Submitted 24 March, 2015; v1 submitted 26 December, 2013; originally announced December 2013.

Comments: 27 pages including supplement

MSC Class: 62P99

Showing 1–10 of 10 results for author: Matthews, G J