Empirical Determination of Baseball Eras: Multivariate Changepoint Analysis in Major League Baseball

Mena CR Whalen, Gregory J Matthews
Mathematics and Statistics
Loyola University Chicago
Chicago, IL
{mwhalen3, gmatthews1}@luc.edu
Brian M Mills
Kinesiology and Health Education
University of Texas at Austin
Austin, TX
[email protected]
Abstract

We use multivariate change point analysis methods to identify not only mean shifts but also changes in variance across a wide array of statistical time series. Our primary objective is to empirically discern distinct eras in the evolution of baseball, shedding light on significant transformations in team performance and management strategies. We leverage a rich dataset comprising baseball statistics from the late 1800s to 2020, spanning over a century of the sport’s history. Results confirm previous historical research, pinpointing well-known baseball eras, such as the Dead Ball Era, Integration Era, Steroid Era, and Post-Steroid Era. Moreover, the study delves into the detection of substantial changes in team performance, effectively identifying periods of both dynasties and collapses within a team’s history. The multivariate change point analysis proves to be a valuable tool for understanding the intricate dynamics of baseball’s evolution. The method offers a data-driven approach to unveil structural shifts in the sport’s historical landscape, providing fresh insights into the impact of rule changes, player strategies, and external factors on baseball’s evolution. This not only enhances our comprehension of baseball, showing more robust identification of eras than past univariate time series work, but also showcases the broader applicability of multivariate change point analysis in the domain of sports research and beyond.

Keywords Baseball, Change Point Analysis, Panel Time Series

1 Introduction

The first professional baseball team in the United States, the Cincinnati Red Stockings, was formed in 1869 ([1]). Many leagues came and went in the late 1800s, but the National League (NL), formed in 1876, emerged as the predominant league of the time. Near the turn of the century, the American League (AL) began growing in popularity and eventually reached an agreement with the NL to be the two major leagues of baseball with the winner of each league playing in the World Series starting in 1903.

Throughout the long history of the game, baseball has gone through many changes and distinct eras, often classified qualitatively by historians. [2] classify six eras of modern baseball: "Baseball has endured much change over the course of its history, and because of constant change, the modern era of baseball has been segmented into six distinct sub-eras. This common list is presented at Baseball-Reference, similarly depicting the eras as the Dead Ball Era (1901-1919), the Live Ball Era (1920-1941), the Integration Era (1942-1960), the Expansion Era (1961-1976), the Free Agency Era (1977-1993) and the Long Ball/Steroid Era (1994-2005)." For example, the time period between approximately 1900-1919, the "Dead Ball Era", was marked by low scoring games, few home runs, and dominant pitching. Alternatively, the "Steroid Era", lasting from approximately 1994 through 2005, was characterized by a rapid increase in power hitting often attributed to players using performance enhancing drugs. More recently adding to this list, Woltring et al. also identified and named a seventh era after 2006: the "Post Steroid Era."

Although the determination of eras tends to be inexact, identification of regimes between large changes can be helpful to understand innovations in gameplay, rules, league structure, team management, and athleticism. For this reason, academic work has sought to empirically identify structural changes in skills and gameplay, competitive balance, and fan demand (attendance). For example, evaluating univariate time series of performance measures, [3] "let the data speak", looking for structural change points over the period from 1871-2020. They analyzed four statistics: slugging percentage (SLG), home run (HR) rate, batting average (BA), and runs batted in (RBI) rate. For each of these statistics, they computed the mean and standard deviation across all players who had at least 100 at bats in a given season, yielding a univariate time series for each of these statistical measures. They then used the Lagrange Multiplier (LM) unit root test proposed in [4] to find change points, identifying two changes in slugging percentage in 1921 and 1992. The first change point marked the end of the Dead Ball Era, while the latter corresponds with the start of the Steroid Era. Similarly, [5] tested for structural change points in a single player’s performance: Barry Bonds. This work investigated a univariate time series of monthly On-Base-Plus-Slugging (OPS) of home run champion Barry Bonds over the course of his career, finding two large changes, one in June of 1993 and another in September of 2000. The authors note that the latter occurred late in his career - quite unexpectedly - leading to speculation about steroid use by Bonds himself during the Steroid Era.

Additional work has evaluated the existence of structural changes in attendance and competitive balance. For example, [6] and [7] separately tested the stationarity of, and estimated structural changes in, competitive balance of the American and National Leagues from 1901 to 1999 using methods developed by [8, 9] and others ([10, 11, 12]). They measure competitive balance using the classical Noll-Scully ratio, adjusting for ties ([13, 14]). The analysis identified change points in competitive balance in 1912, 1926, and 1933 for the NL and in 1926 and 1957 in the AL, with improving trends in balance since this time. The authors largely attribute this to equalization of population centers (market sizes), television revenues, and increases in utilization of the international talent pool.

Baseball is not the only sport where this type of analysis has been applied. [15] estimated structural change points in soccer using data from British soccer leagues through 1996. Notably, this work identified a change point in the mean of margin of victory in 1925 related to the change in the definition of offsides (changed from 3 players to 2 players). Additional change points occurred in the variability of number of goals in the early 1980s and 1992, corresponding with a change in the number of points for a win (from 2 points to 3 points) and a change in the backpass rule, respectively. [16] looked for change points in competitive balance for three other major North American sports leagues: National Basketball Association (NBA), National Hockey League (NHL), and National Football League (NFL). They identified a number of change points in each sport that often, though not always, corresponded to league expansion, league mergers, or other major economic events in a sport (e.g. increased number of foreign players in the NBA in the late 1990s/early 2000s).

Subsequent work has analyzed attendance shifts and trend changes in all four major North American leagues, seeking to understand the effects of league policies – such as free agency and expansion – on both balance and subsequent fan demand ([17, 18, 19]). Many of the largest change points in attendance tended to be coincident with major wars, the Great Depression, league expansion, and labor disputes that resulted in cancelled games or seasons. Further empirical analysis has addressed competitive balance in college football and college basketball, identifying structural changes associated with realignment and a split into various divisions in the National Collegiate Athletic Association (NCAA) ([20, 21]).

All of this previous work focuses on change point analysis in a univariate time series context. However, recent methodological developments in change point analysis allow for the estimation of change points in multivariate panel data, which is the focus of the current manuscript. We use the Double CUSUM ([22]) and Sparsified Binary Segmentation algorithm ([23]) to identify change points in Major League Baseball (MLB) at multiple levels. We first seek to identify change points in a panel of league-level statistical measures and, separately, a team-level panel of individual statistical measures to empirically define different eras in baseball, providing a broader data-driven empirical context for historical analysis of the sport than past work. Subsequently, we search for change points within individual teams, using a panel of team-level statistics, to determine relevant eras of team performance and management. This latter analysis is used to empirically locate the initial emergence (and subsequent demise) of so-called "dynasties", periods of sustained excellent performance by a team.

2 Methods

Change point analysis is a statistical technique for identifying structural changes in time series data. A common method for identifying change points in the data is to use the cumulative sum (CUSUM) statistic [24]. The statistic as described by [22] can be seen in equation 1. Let $C_b$ denote a CUSUM statistic which takes $x_{j,t}$ over a generic interval $t \in [s,e]$ with $1 \leq s < e \leq T$ as an input and returns $C_b(\{x_{j,t}/\sigma_j\}_{t=s}^{e}) =$

$$\mathbf{X}_{s,b,e}^{j} = \frac{1}{\sigma_j}\Bigg[\sqrt{\frac{e-b}{(e-s+1)(b-s+1)}}\sum_{t=s}^{b}x_{j,t} - \sqrt{\frac{b-s+1}{(e-s+1)(e-b)}}\sum_{t=b+1}^{e}x_{j,t}\Bigg] \tag{1}$$

for $b = s, \dots, e-1$, with a suitably chosen scaling constant $\sigma_j$.
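As an illustration of equation 1, the following minimal Python sketch computes the CUSUM contrast for a single series. The function name `cusum`, the 1-based inclusive indexing, and the default `sigma=1.0` are assumptions made for this example and are not taken from [22] or from the "hdbinseg" package.

```python
import math

def cusum(x, s, b, e, sigma=1.0):
    """CUSUM contrast of equation 1 for one series x on [s, e], split at b.

    Indices are 1-based and inclusive to match the text; requires s <= b < e.
    sigma is the scaling constant (assumed 1.0 here for illustration).
    """
    if not (1 <= s <= b < e <= len(x)):
        raise ValueError("need 1 <= s <= b < e <= len(x)")
    n = e - s + 1                      # number of points in the interval
    left = sum(x[s - 1:b])             # x_s + ... + x_b
    right = sum(x[b:e])                # x_{b+1} + ... + x_e
    w_left = math.sqrt((e - b) / (n * (b - s + 1)))
    w_right = math.sqrt((b - s + 1) / (n * (e - b)))
    return (w_left * left - w_right * right) / sigma

# A noiseless mean shift after the 4th observation.
series = [0, 0, 0, 0, 1, 1, 1, 1]
values = {b: cusum(series, 1, b, 8) for b in range(1, 8)}
best_b = max(values, key=lambda b: abs(values[b]))
```

In this toy series the absolute contrast is largest at the true break, and a constant series gives a contrast of zero at every candidate $b$, because the two weighted sums cancel.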

The CUSUM statistic is used by breaking up the length of the time series into smaller intervals and then calculating the CUSUM statistic for each of the generic intervals. Within an interval, the statistic is computed at every candidate $b$ until a potential change point is found. A change point is declared when the CUSUM, which can be thought of as a weighted difference between the two segments on either side of $b$, is large relative to a threshold, meaning that the series differs substantially before and after $b$. Binary segmentation then splits the time series at the first change point (the $b$ with the largest CUSUM) and repeats the search within each resulting segment until no more change points are detected. The threshold that determines whether a CUSUM value is large enough depends on the error term and the length of the time series ([25]).
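The recursion described above can be sketched as follows. This is a minimal univariate illustration, assuming a user-supplied `threshold` and a scaling constant of 1; the function names are hypothetical, and the sketch does not reproduce the threshold selection of [25] or the "hdbinseg" implementation.

```python
import math

def cusum(x, s, b, e):
    """Equation 1 with sigma = 1; 1-based inclusive indices, s <= b < e."""
    n = e - s + 1
    left, right = sum(x[s - 1:b]), sum(x[b:e])
    return (math.sqrt((e - b) / (n * (b - s + 1))) * left
            - math.sqrt((b - s + 1) / (n * (e - b))) * right)

def binary_segmentation(x, threshold, s=1, e=None):
    """Return sorted change points b, each the last index of a left segment."""
    if e is None:
        e = len(x)
    if e - s < 1:                      # fewer than two points: nothing to split
        return []
    # Candidate split with the largest absolute CUSUM on [s, e].
    best_b = max(range(s, e), key=lambda b: abs(cusum(x, s, b, e)))
    if abs(cusum(x, s, best_b, e)) <= threshold:
        return []                      # no sufficiently large change here
    # Search again on the segments to the left and right of the split.
    return (binary_segmentation(x, threshold, s, best_b)
            + [best_b]
            + binary_segmentation(x, threshold, best_b + 1, e))

# Toy series with two mean shifts; the threshold value is an arbitrary assumption.
toy = [0] * 10 + [5] * 10 + [0] * 10
found = binary_segmentation(toy, threshold=3.0)
```

The threshold is left as a free parameter here because its appropriate value depends on the noise level and series length, as noted in the text.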

This method is used in univariate analysis to detect changes in the mean of a time series, but it has limitations in some situations. Our interest is in segmenting multivariate time series for both mean and variance changes, to understand the combined influence of change points across series. With univariate methodology, analyzing multivariate panel data would require a separate analysis for each of the time series, resulting in multiple potential change point locations with different thresholding parameters. Removing some potential change points based on a universal threshold parameter is possible, but can prove difficult in high dimensions and does not appropriately use information about shared change point locations. [22] proposed the Double CUSUM (DC) statistic, which computes CUSUM statistics across the same intervals of all the time series and then creates a test statistic using ordered CUSUM values within that interval.

Using similar notation from equation 1, the Double CUSUM (DC) statistic is,

$$\mathbf{D}_m^{\phi}\Big(\{|\mathbf{X}_{s,b,e}^{(j)}|\}_{j=1}^{n}\Big) = \Big\{\frac{m(2n-m)}{2n}\Big\}^{\phi}\Bigg(\frac{1}{m}\sum_{j=1}^{m}|\mathbf{X}_{s,b,e}^{(j)}| - \frac{1}{2n-m}\sum_{j=m+1}^{n}|\mathbf{X}_{s,b,e}^{(j)}|\Bigg) \tag{2}$$

for $b \in [s,e)$ and $m \in \{1, \dots, n\}$, where the DC operator $\mathbf{D}_m^{\phi}$ takes the ordered CUSUM values $|\mathbf{X}_{s,b,e}^{(1)}| \geq |\mathbf{X}_{s,b,e}^{(2)}| \geq \dots \geq |\mathbf{X}_{s,b,e}^{(n)}|$ at each $b$ as its input, for some $\phi \in [0,1]$.
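To make equation 2 concrete, the sketch below evaluates the DC operator for the CUSUM values of $n$ series at a single candidate $b$. The function names and the default `phi=0.5` are assumptions for illustration; maximizing over $m$ in the helper is one plausible way to summarize the statistic at a given $b$ and is not claimed to match the "hdbinseg" internals.

```python
def double_cusum(cusum_values, m, phi=0.5):
    """Equation 2 for the CUSUM values of n series at one candidate b.

    cusum_values may be signed and unordered; absolute values are taken
    and sorted in decreasing order before the operator is applied.
    """
    n = len(cusum_values)
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    ordered = sorted((abs(v) for v in cusum_values), reverse=True)
    top_mean = sum(ordered[:m]) / m            # (1/m) * sum of the m largest
    rest = sum(ordered[m:]) / (2 * n - m)      # remaining values over (2n - m)
    scale = (m * (2 * n - m) / (2 * n)) ** phi
    return scale * (top_mean - rest)

def dc_over_m(cusum_values, phi=0.5):
    """Largest DC value over m = 1, ..., n at a fixed b (illustrative helper)."""
    n = len(cusum_values)
    return max(double_cusum(cusum_values, m, phi) for m in range(1, n + 1))
```

With `phi=0` the scaling factor equals 1, so the statistic reduces to the difference between the average of the $m$ largest absolute CUSUM values and the down-weighted sum of the rest.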

The ordering of the CUSUM statistics prioritizes time series that have larger values since these would most likely have change points occurring within this interval and potentially in others. DC statistics across all intervals of the time series are then compared against a threshold to remove intervals that even at their largest CUSUM value would not result in a change point. This thresholding process helps remove locations in time where there possibly is not a meaningful change point across all of the time series.

The Sparsified Binary Segmentation (SBS) ([23]) is then used with the DC statistics to compare across different time intervals and within different time series to determine when change points occur across all of the time series. Determining the choice of a thresholding parameter based on the panel data can be challenging due to potential high autocorrelation within the data. The authors propose using a Generalized Dynamic Factor Model (GDFM) bootstrap algorithm which accounts for potential correlations within and between high-dimensional time series to determine an appropriate threshold for the panel data ([22]). This methodology also enables the detection of second-order change points, that is, change points in the variability as opposed to the mean, by applying methodology from [26] that uses Haar wavelet periodograms and cross-periodograms, as opposed to the DC statistics. For further details on the methods please refer to [22] and [23]. Due to the nature of the methodology, time series of differing lengths cannot be evaluated together. We apply this methodology to multiple forms of panel data in baseball to evaluate and confirm changes in eras of the game for league-level statistics and determine dynasties within teams with team-level statistics. The R programming language ([27]) and the package "hdbinseg" ([28]) were used to wrangle, model, and visualize the data.

2.1 Data

We obtained season-level baseball statistics from the Lahman database from 1900 through 2020 ([29]). Over the course of the sport’s history, numerous statistics have been meticulously collected and maintained, which we leverage for our analysis, focusing on year-end statistics at the team- and league-level. Using league-level statistics, we include all teams existing for each year in question from 1900 through 2020. However, for the league-level team-panel analysis, it was necessary for each team to have existed from 1900 until the end of the data set in 2020, as the multivariate methodological approach requires equal length time series. This sample (a panel of team-level measures, estimated separately for each statistical measure) encompassed a total of 16 franchises, each presented in Table 1. At the team level, using a panel of statistical measures for each team, we include all existing teams across their entire existence, treating each as a single franchise irrespective of geographic moves or name changes.

The statistics of interest for these teams include runs (R), hits (H), home runs (HR), base-on-balls (walks; BB), strikeouts (K), at-bats (AB), stolen bases (SB), number of games played in a season (G), runs against (RA), hits against (HA), home runs against (HRA), base-on-balls against (BBA), strikeouts against (SO), and attendance (ATT). We use the G variable to transform all count statistics into per-game rate statistics. However, a few statistics, such as BB, SB, K, and ATT, had missing data for certain years, generally pre-1900. Because rates of missingness were low, and to ensure consistent data across all analyses, we imputed missing values with predicted values from a linear regression model using MICE [30].
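The per-game transformation described above can be sketched as follows. The dictionary layout, the function name `per_game_rates`, and the choice to leave missing counts as `None` for later imputation are assumptions for illustration; the sketch does not perform the MICE imputation itself.

```python
def per_game_rates(season, count_keys=("R", "HR", "BB", "K", "SB")):
    """Divide season totals by games played (G) for one team-season.

    `season` is a hypothetical dict such as {"G": 154, "HR": 120}.
    Counts that are missing (None or absent) stay None so that they
    can be imputed in a later step.
    """
    games = season["G"]
    if not games or games <= 0:
        raise ValueError("G must be a positive number of games")
    return {key: (None if season.get(key) is None else season[key] / games)
            for key in count_keys}

# Made-up totals for a 162-game season with walks missing.
rates = per_game_rates({"G": 162, "HR": 243, "R": 810, "BB": None},
                       count_keys=("R", "HR", "BB"))
```

Dividing by G keeps seasons of different lengths comparable, which matters here because the number of games played varies considerably across the sample period.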

2.2 Eras and Dynasties

2.2.1 Eras

Because we are interested in finding where in time different eras of baseball have emerged, we first applied the multivariate change point procedure to a league-level panel of time series for the 4 key seasonal league-average measures which have been analyzed in previous research: HR, K, BB, and SB. These represent differences in approach with respect to power, contact, reaching base, and speed, respectively. Each was standardized by the number of games in a season (G). The standardization was required due to the large changes in games played in MLB across this time period, ranging from 60 games in the 1800s to as many as 163 in recent years where a tiebreaker game was required at the end of the regular season. Here, the multivariate dimension in our analysis is the various statistical measures for each season (HR, K, BB, and SB).

We next examined how some key statistics evolved individually, using the time series of all 16 teams. Those statistics included HR, K, BB, SB, R, and ATT, which were each standardized to the number of games played in a season (G). Unlike the prior league-level analysis, here the multivariate dimension of analysis was the 16 different team-level time series, with the change point procedure applied to each individual statistic (16 team-level time series for each). Teams included in this panel are listed in Table 1.

Table 1: Franchise label and their modern team name.

| Franchise Label | Current Franchise Name |
| --- | --- |
| ATL | Atlanta Braves |
| BAL | Baltimore Orioles |
| BOS | Boston Red Sox |
| CHC | Chicago Cubs |
| CHW | Chicago White Sox |
| CIN | Cincinnati Reds |
| CLE | Cleveland Guardians |
| DET | Detroit Tigers |
| LAD | Los Angeles Dodgers |
| MIN | Minnesota Twins |
| NYY | New York Yankees |
| OAK | Oakland Athletics |
| PHI | Philadelphia Phillies |
| PIT | Pittsburgh Pirates |
| SFG | San Francisco Giants |
| STL | St. Louis Cardinals |

We determined a thresholding parameter using the GDFM bootstrap, which was then used to find where change points in both mean and variance existed ([23]). The procedure was applied separately to the league-average panel of statistics and to the panel of teams for each of the 6 statistics noted earlier.

2.2.2 Dynasties

Continuing our investigation, we used change point analysis to empirically identify "dynasties" and "collapses" within modern baseball teams, irrespective of the length of a team’s existence. For each team, we examined ten key statistics that represent a team’s overall performance in scoring and preventing runs (R, H, HR, BB, K, RA, HA, HRA, BBA, and SO) as a multivariate panel at the team-level, analyzing both mean and variance changes to identify potential change points. More specifically, the change point procedure was applied separately for each team panel, with 10 statistical time series for each team. Using this method, a dynasty would be identified when a team experiences a noticeable positive shift across their statistical outputs, whereas a collapse would be identified through negative shifts at a change point.

By scrutinizing these indicators, we aimed to uncover shifts in a team’s performance and assess their significance within the context of dynasties. As noted earlier, these measures were standardized using the season average and standard deviation derived from all teams present in each respective season. By standardizing, we capture a team’s offensive and defensive performance relative to peers within a season. Notably, if a team exhibits a high and positive offensive performance, it can be balanced to zero by large and negative defensive statistics when standardized. Therefore, if a team excels in both aspects of the game, it would have numerous time points above zero, whereas under-performing teams would display many below-zero values.
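The within-season standardization described above can be illustrated with a short sketch. The function name, the input layout, and the use of the population standard deviation are assumptions for this example; the text does not state whether the population or sample standard deviation was used.

```python
from statistics import mean, pstdev

def standardize_within_season(values_by_team):
    """Z-score one statistic across all teams present in a single season.

    `values_by_team` is a hypothetical mapping such as {"NYY": 5.4, "BOS": 4.9}.
    Assumption: population standard deviation (pstdev) is used.
    """
    values = list(values_by_team.values())
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:                        # all teams identical: no spread to scale by
        return {team: 0.0 for team in values_by_team}
    return {team: (v - mu) / sd for team, v in values_by_team.items()}

# Made-up runs per game for three teams in one season.
z = standardize_within_season({"A": 4.0, "B": 4.5, "C": 5.0})
```

A value above zero then indicates that a team was above the league average in that season for that statistic, which is what allows performance to be compared across seasons with very different scoring environments.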

3 Results

3.1 Eras: League-Aggregate Multivariate Time Series

We begin by describing change points identified using the league-level multivariate time series of statistical measures. Figure 1 presents the estimated change points at the league-level, with 4 change points in mean, and 1 change point in variance of the multivariate time series of HR, SB, K, and BB. The variance change point (1919) aligns closely with the first mean change point (1924), and appears to identify the end of the Dead Ball Era and beginning of the Live Ball Era. It is likely that the variance change point occurred slightly earlier because home run hitting in the Live Ball Era was largely kicked off by a few players, or a single player (Babe Ruth), in its early stages, leading to high variability in approaches. Subsequently, other players followed suit and swung for the fences after realizing the benefits of such an approach following recent changes to the way balls were used in games. As with past work, the end of the Dead Ball Era tends to be the most strongly empirically confirmed change point in MLB’s history.

The second mean change occurred in 1954, concurrent with a sharp and continued increase in strikeouts. Stolen bases also began to increase after they had been decreasing since the start of the Dead Ball Era. [31] describes the 1950s as "The Nadir of Stolen Bases." While not directly tied to traditionally separate eras, this tends to point toward a change in managerial philosophy, as well as possible umpire influences. Umpires during the 1950s were often accused of failing to enforce the set rule for pitchers, which made stealing bases particularly difficult. In 1950, there was also an increase in the size of the strike zone, and subsequent dramatic increases in strikeouts ensued throughout the decade. The combination of these differences seems to be the impetus for the detected change point around 1954, particularly as steals began increasing again during the second half of the decade.

The more recent changes, in 1993 and 2009, likely have straightforward interpretations. 1993 marked expansion to Florida and Colorado in the NL, which potentially diluted the talent pool across the league. Although these teams do not appear in this league-level analysis, they could potentially influence output by other teams due to the low quality (including, again, a dramatic increase in strikeouts and walks). Further, the subsequent season marked one of the most consequential work stoppages in MLB history, cancelling the 1994 World Series and some of the 1995 season. This also coincided with the early part of the Steroid Era.

Beginning around 2009, there was a well-documented expansion of the called strike zone by MLB umpires - which continued through 2014 - due to new monitoring and evaluation systems in place ([32]) that led to stark increases in strikeouts and reductions in walks and offense in general ([33]). As before, large changes to the relative rates of walks and strikeouts seem to again be a partial driver of this change point.

Figure 1: Standardized season average of all statistics and their change points.

We exhibit the timing of estimated changes in Figure 1 alongside the individual time series of the key metrics (BB, HR, SB, and SO). Visually, many of the changes tend to be associated with batter strikeouts (SO), with large increases after the 1954, 1993, and 2009 mean change points, and to a lesser degree, home runs (HR). For the first change points in 1919 (variance) and 1924 (mean), the opposite was true: there was a sudden decrease in strikeouts a few years prior. Additionally, the first 3 mean change points coincide with a recent inversion of the relative rates of SB and HR, indicating particularly stark changes to game play during these eras. In all cases, these inversions were associated with strong increases in HR relative to SB.

3.2 Eras: Metric-Specific Multivariate Time Series

Here, we turn our attention to the estimation of change points for individual statistics, leveraging the panel of team-level time series of each measure to assist in understanding drivers of the league-aggregate change points. We present mean and variance change points for each statistic in Table 2. These are also presented visually in Figure 2.

3.2.1 Attendance

We begin with attendance to be able to compare directly to the abundant literature using univariate change point methods on attendance data in professional sports. Attendance and other statistical measures’ change points can be found in Table 2 and Figure 2. We find similar change points to past work coinciding with the end of World War II (both for the mean in 1945, and the variance in 1944). Our analysis also identifies a change point in 1976, near the start of the Free Agency Era and the expansion of the league to Toronto and (back to) Seattle. [17] identify an attendance change point in the mid-1960s followed by a very strong upward trend - particularly in the American League - beginning shortly thereafter. While they did not estimate a structural change during the 1960s for the National League, there is evidence that an increasing attendance trend began near our estimated 1976 change point. Figure 2 also shows this steepening attendance trend around this time. Finally, a variance change point was identified in 2008, which was the final year that baseball was played in both Yankee Stadium and Shea Stadium, with both teams, the Yankees and the Mets, moving to stadiums with smaller capacities.

Table 2: Change points present in statistics across the league. Change points are in the mean unless marked (var) for variance.

| Statistic | Changes |
| --- | --- |
| Attendance | 1944 (var), 1945, 1976, 2008 (var) |
| Walks | 1936 |
| Home Runs | 1920, 1927 (var), 1944 (var), 1946, 1967, 1993, 1995 (var), 2007 (var) |
| Runs | 1919, 1935 (var), 1940 |
| Stolen Bases | 1919, 1919 (var), 1966 (var), 1967 |
| Strikeouts | 1956, 1993, 1995 (var), 2009 |

3.2.2 Runs

Our method identified two mean change points and one variance change point using the runs per game measure. The first mean change point in 1919 is a common finding in the literature and across some of our other statistics, coinciding with World War I, the Spanish Flu pandemic, and the end of the Dead Ball Era. The second mean change in 1940 is approximately concurrent with the end of the Live Ball Era and just prior to the start of the Integration Era. Lastly, the variance change point, which occurred in 1935, is near the peak of Live Ball Era scoring, suggesting that as run scoring came back down, variability in scoring was also reduced. In 1931, MLB instituted an official rule to standardize the height of the pitcher’s mound at 15 inches. Mound height has been shown to have important impacts on run scoring, and this standardization may have been the reason for what appears to be reductions in the variability in run scoring across teams detected in 1935.

3.2.3 Home Runs and Walks

There are 4 estimated change points in HR for each of the mean and variance, which largely align with the traditionally noted eras. Mean change points approximately correspond with the start of the Live Ball Era (1920), the end of World War II and start of the Integration Era (1946), and the start of the Steroid Era (1993). The fourth mean change point was in 1967, just as the league lowered the pitcher’s mound by 5 inches and just before shrinking the strike zone. Variance change points are associated with the approximate start of the Integration Era (1944), the Steroid Era (1995), and the Post Steroid Era (2007). A fourth variance change point occurred in 1927 as the league shifted out of the Dead Ball Era and into the Live Ball Era, where a much larger emphasis was placed on power hitting. A mean change point in walks was identified in 1936. This change point occurs during the latter part of the Live Ball Era and is consistent with pitchers trying to avoid throwing pitches down the middle to home run hitters that were now more plentiful throughout the league.

Figure 2: Change points of all statistics over the league for 120 years across 16 teams.

3.2.4 Stolen Bases

As offensive output trended toward hitting home runs in the Live Ball Era, the need for SB decreased ([34]). This is seen in the estimation of a change point in the mean (associated with a drop) in SB in 1919. In that same year, a change point in the variance was also observed. In the latter half of the 1960s, another mean and variance change point pair is observed: a variance change point in SB in 1966 and a mean change point a year later in 1967. This was concurrent with a convergence in SB and HR rates that began around this time and accelerated into the mid-1970s. [31] notes that as larger parks began to open, Dead Ball Era play styles re-emerged in the early 1970s.

3.2.5 Strikeouts

The procedure detected 3 mean change points and 1 variance change point for strikeouts. Here, we note that two of the mean change points (one of them together with the variance change point) are associated with the start of traditional eras. The first mean change point in 1956 indicates a dramatic change in strikeouts (see Figure 2), which was later reduced by lowering the pitching mound in 1968. The 1960s marked the Expansion Era, perhaps adding fuel to the strikeout increases during this time by increasing the number of players in the league (and subsequently diluting the talent pool in the short term). The 1993 mean change point (1995 variance change point) again approximately aligns with the start of the Steroid Era. The 2009 change point, while close to the start of the Post Steroid Era, also aligns with other work on changes to the umpire-called strike zone. Specifically, [32, 33] show that an expansion of the strike zone, which was concurrent with new umpire monitoring technology, accounted for as much as 40 percent of the changes to run scoring during this time through a decrease in the odds of contact on any pitch by 73 percent. This expansion began right after the 2009 season.

3.3 Team Series: Dynasties and Collapses

Table 3 exhibits the estimated change points for the team-level multivariate time series analyses used to identify team-specific eras across all statistics. These are also visualized in Figure 3. In this case, we identified change points for each team in the data using all team-specific statistical measures noted previously. Because there are a large number of change points across teams, we split our analysis by first consolidating common change point seasons across teams, and subsequently discuss particularly strong dynasties or collapses for relevant teams in the data. We leave inspection of remaining team-level change points to the reader.

In nearly every decade, there were change points estimated that were common across teams and consistent with some of the league-level change points discussed earlier. For example, 3 different teams (BOS, CIN, PHI) were estimated to have change points around 1918, near the end of the Dead Ball Era. Similarly, there were 5 different teams (BOS, LAD, STL, CLE, DET) which experienced change points shortly before or after the start of the Integration Era (these ranged from 1932 to 1944). There were also 4 teams (BAL, CLE, DET, MIN) with change points at the end of the Integration Era. The most common single year associated with a change point was 1977, with 3 teams (LAA, MIL, SDP) experiencing estimated changes at the start of the Free Agency Era. The 1973-1978 time period also saw 4 other teams with estimated change points (CWS, SF, NYM, CIN). Additionally, 6 different teams were estimated to have change points near the start of the Steroid Era and around the 1994-1995 work stoppage (CLE, MIL, BAL, HOU, KCR, SEA). Finally, there were 4 teams (TB, BOS, HOU, WAS) with change points associated with the start of the Post Steroid Era, with change points ranging from 2007 to 2011.

Table 3: Change points for franchises in mean unless noted as for variance (marked *).

Franchise                 Change points
Arizona Diamondbacks      None
Atlanta Braves            1887, 1990
Baltimore Orioles         1959, 1999
Boston Red Sox            1918, 1937, 2008
Chicago Cubs              1891
Chicago White Sox         1969, 1973
Cincinnati Reds           1918, 1952, 1978
Cleveland Guardians       1944, 1958, 1966, 1993
Colorado Rockies          None
Detroit Tigers            1912, 1942, 1958, 1988
Houston Astros            1999, 2010*
Kansas City Royals        1997
Los Angeles Angels        1977
Los Angeles Dodgers       1940, 1961
Miami Marlins             None
Milwaukee Brewers         1977, 1996
Minnesota Twins           1959
New York Mets             1974
New York Yankees          1918, 1964
Oakland Athletics         None
Philadelphia Phillies     1917, 1949
Pittsburgh Pirates        1902, 2001
San Diego Padres          1977, 1977
San Francisco Giants      1903, 1973
Seattle Mariners          1999
St. Louis Cardinals       1932
Tampa Bay Rays            2007
Texas Rangers             1985
Toronto Blue Jays         None
Washington Nationals      2011
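
The consolidation of change point seasons that are shared across franchises can be sketched as a simple window query over Table 3. The Python below is illustrative only and is not the code used for the analysis. The years are transcribed from Table 3 with mean and variance change points pooled, and the helper name `franchises_with_change` is hypothetical.

```python
# Change point years transcribed from Table 3 (mean and variance pooled;
# franchises with no detected change point map to an empty list).
TABLE_3 = {
    "Arizona Diamondbacks": [],
    "Atlanta Braves": [1887, 1990],
    "Baltimore Orioles": [1959, 1999],
    "Boston Red Sox": [1918, 1937, 2008],
    "Chicago Cubs": [1891],
    "Chicago White Sox": [1969, 1973],
    "Cincinnati Reds": [1918, 1952, 1978],
    "Cleveland Guardians": [1944, 1958, 1966, 1993],
    "Colorado Rockies": [],
    "Detroit Tigers": [1912, 1942, 1958, 1988],
    "Houston Astros": [1999, 2010],  # 2010 is the variance change point
    "Kansas City Royals": [1997],
    "Los Angeles Angels": [1977],
    "Los Angeles Dodgers": [1940, 1961],
    "Miami Marlins": [],
    "Milwaukee Brewers": [1977, 1996],
    "Minnesota Twins": [1959],
    "New York Mets": [1974],
    "New York Yankees": [1918, 1964],
    "Oakland Athletics": [],
    "Philadelphia Phillies": [1917, 1949],
    "Pittsburgh Pirates": [1902, 2001],
    "San Diego Padres": [1977, 1977],
    "San Francisco Giants": [1903, 1973],
    "Seattle Mariners": [1999],
    "St. Louis Cardinals": [1932],
    "Tampa Bay Rays": [2007],
    "Texas Rangers": [1985],
    "Toronto Blue Jays": [],
    "Washington Nationals": [2011],
}

def franchises_with_change(table, first_year, last_year):
    """Sorted franchise names with a change point in [first_year, last_year]."""
    return sorted(
        name for name, years in table.items()
        if any(first_year <= year <= last_year for year in years)
    )

print(franchises_with_change(TABLE_3, 1977, 1977))
print(franchises_with_change(TABLE_3, 2007, 2011))
```

The first query returns the three franchises sharing the 1977 change point, and the second returns the four franchises with change points between 2007 and 2011.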

The change point detection was seemingly able to identify particularly strong shifts in performance of individual teams, and we focus on more recent change points here. In the 1990s, the Atlanta Braves (1990) and Cleveland Guardians (1993) quickly saw their fortunes change, culminating in multiple World Series berths for both teams, which were seen as the two top teams dueling for the title of World Series Champion throughout the 1990s. (The similarity in their team mascots at the time likely heightened awareness of this cross-league rivalry.) While the Braves of this era are considered a dynasty by most baseball fans - headlined by their Hall of Fame pitching trifecta that included Greg Maddux, Tom Glavine, and John Smoltz - Cleveland is often remembered less fondly, given their inability to win a World Series during this stretch. Nevertheless, from 1994 to 2001, Cleveland was the best hitting team in baseball, leading the league in both on-base percentage and slugging percentage during this stretch. During this same peak of the Steroid Era, the Braves had a team earned run average of 3.55, leading the next closest team - the Los Angeles Dodgers (3.94) - by nearly half a run per game. This was arguably the most sustainably dominant pitching staff in the history of baseball.

As both Atlanta and Cleveland saw the ends of their historic runs, the Seattle Mariners experienced a change point around the 1999 season. Seattle put together a team that had sustained success from 2000 through 2003, tying the all-time single season team wins record in 2001, which stands to this day. Most surprisingly, they managed this feat shortly after the departure of three of the greatest players in MLB history near the prime of their respective careers: Randy Johnson in 1998, Ken Griffey, Jr. in 2000, and Alex Rodriguez in 2001.

Figure 3: Change point location in time for each franchise throughout their history.

More recently, the Tampa Bay Rays experienced a change point near the 2007 season, marking the last year of a streak of abysmal team records. From their inception in 1998 through the 2007 season, the Rays never won more than 70 games in a season, but in 2008 they suddenly vaulted to an American League East Division title - over the historically dynastic Yankees - and appeared in the World Series for the first time in franchise history. Since that season, the Rays have regularly been Division and Wild Card contenders, winning the division 3 more times through 2021. Meanwhile, the Washington Nationals - which were moved from Montreal in 2005, shedding the Expos name - had a run of success starting in 2012 that culminated in a World Series title in 2019. This is likely identified by the 2011 change point in our data, and is concurrent with the drafting of back-to-back first overall selections in the MLB draft (Stephen Strasburg and Bryce Harper). Prior to this, the Nationals had never had a winning season, and their predecessor in Montreal had only a single playoff appearance in their history. As a whole, the multivariate change point method seems to have identified these stark changes at the individual team level rather well.

4 Conclusions

Past work estimating structural changes to time series data in sports has exclusively focused upon univariate time series or on separately estimating changes for different individual time series. We add to this literature by implementing advances to change point detection in the multivariate context, allowing for changes to take place not just in the mean, but also in the variance across a panel of series. In our context, we were able to empirically identify traditional league eras directly from these multivariate data, including the Dead Ball Era, the Integration Era, the Steroid Era, and the Post-Steroid Era. This serves as strong support for the historical record as it relates to baseball eras. We were also able to classify when dynasties began and to identify declines for individual teams. These results are encouraging for the use of this method in sports data, and could be used to assist in historical description of leagues and teams across a number of dimensions that interest sports analysts and researchers.

Given the success of this approach here, we propose that other research questions related to sports leagues, teams, and management policies would be well served by the multivariate change point procedure. For example, there are opportunities to detect change points for individual athletes with multivariate time series. Past work has, again, focused on a single performance statistic at a time. However, it is now possible to identify structural changes to approach, power, speed, and game play for individual athletes across a variety of measures. Additionally, prior research has struggled to reconcile the diverse measurement of competitive balance in sports leagues, which is made up of various conceptualizations or components of balance and uncertainty ([35, 18, 36]). Related work has focused on integrating various characteristics of balance into a single measure ([37]). Rather than separately focusing on measures within their own series - or aggregating into a single measure that loses information about certain dimensions - this method could allow change point detection across a collection of balance measure time series.
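
As a rough illustration of what detecting a change point in a collection of series involves, the Python sketch below sums scale-normalised CUSUM curves across a small panel and reports the split at which the pooled evidence is largest. This is a deliberately simplified stand-in, not the double CUSUM statistic of [22] or the procedure implemented in [28]. The function names, the crude normalisation by each series' overall standard deviation, and the toy panel are all assumptions made for illustration.

```python
import math

def cusum_curve(x):
    """Weighted CUSUM contrast at every split k = 1..n-1 for one series."""
    n = len(x)
    total = sum(x)
    left = 0.0
    curve = []
    for k in range(1, n):
        left += x[k - 1]
        weight = math.sqrt(k * (n - k) / n)
        curve.append(weight * abs(left / k - (total - left) / (n - k)))
    return curve

def panel_changepoint(panel):
    """Sum normalised CUSUM curves over all series; return (k, pooled stat)."""
    n = len(panel[0])
    pooled = [0.0] * (n - 1)
    for x in panel:
        mean = sum(x) / n
        # Crude scale: overall standard deviation (falls back to 1.0 if zero).
        sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n) or 1.0
        for i, value in enumerate(cusum_curve(x)):
            pooled[i] += value / sd
    best = max(range(n - 1), key=pooled.__getitem__)
    return best + 1, pooled[best]

# Toy panel of three hypothetical balance measures over 12 seasons:
# two shift level after season 6, the third only oscillates.
panel = [
    [1.0] * 6 + [2.0] * 6,
    [10.0] * 6 + [7.0] * 6,
    [0.1, -0.1] * 6,
]
k, stat = panel_changepoint(panel)
print(k)
```

Pooling in this way lets series that share a change reinforce one another, while a series without a change contributes little at the common split. A full procedure would also need a threshold and a scheme for locating multiple change points.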

5 Supplemental Materials

All code for reproducing the analyses in this paper is publicly available at github <provided after review>

Acknowledgement

We thank Michael Lopez for suggesting we do "something with change point analysis".

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

References

  • [1] Matt Rothenberg. Pro baseball began in Cincinnati in 1869, n.d.
  • [2] M. Woltring, J. Rost, and C. Jubenville. Examining perceptions of baseball’s eras: A statistical comparison. The Sport Journal, 2018.
  • [3] Peter A. Groothuis, Kurt W. Rotthoff, and Mark C. Strazicich. Structural breaks in the game: The case of major league baseball. Journal of Sports Economics, 18(6):622–637, 2017.
  • [4] J. Lee and M. Strazicich. Minimum lagrange multiplier unit root test with two structural breaks. The Review of Economics and Statistics, 85(4):1082–1089, 2003.
  • [5] Michael L. Nieswiadomy, Mark C. Strazicich, and Stephen Clayton. Was there a structural break in barry bonds’s bat? Journal of Quantitative Analysis in Sports, 8(3), 2012.
  • [6] Y.H. Lee and R. Fort. Structural change in mlb competitive balance: The depression, team location, and integration. Economic Inquiry, 43(1):158–169, 2005.
  • [7] Y.H. Lee and R. Fort. Stationarity and major league baseball attendance analysis. Journal of Sports Economics, 7:408–415, 2006.
  • [8] J. Bai and P. Perron. Estimating and testing linear models with multiple structural changes. Econometrica, 66:47–78, 1998.
  • [9] J. Bai and P. Perron. Computation and analysis of multiple structural change models. Journal of Applied Econometrics, 18:1–22, 2003.
  • [10] D.W.K. Andrews. Tests for parameter instability and structural change with unknown change point. Econometrica, 61(4):821–856, 1993.
  • [11] Jushan Bai. Estimating multiple breaks one at a time. Econometric Theory, 13(3):315–352, 1997.
  • [12] J. Bai. Likelihood ratio tests for multiple structural change points. Journal of Econometrics, 91:299–323, 1999.
  • [13] R.G. Noll. Professional basketball. Stanford University Studies in Industrial Economics, (144), 1988.
  • [14] G. W. Scully. The Business of Major League Baseball. University of Chicago Press, 1989.
  • [15] I. Palacios-Huerta. Structural changes during a century of the world’s most popular sport. Statistical Methods and Applications, 12:241–258, 2004.
  • [16] R. Fort and Y.H. Lee. Structural change, competitive balance, and the rest of the major leagues. Economic Inquiry, 45(3):519–532, 2007.
  • [17] Y.H. Lee and R. Fort. Attendance and the uncertainty-of-outcome hypothesis in baseball. Review of Industrial Organization, 33(4):281–295, 2008.
  • [18] Brian M. Mills and Rodney Fort. League-level attendance and outcome uncertainty in u.s. pro sports leagues. Economic Inquiry, 52(1):205–218, 2014.
  • [19] Brian M. Mills and Rodney Fort. Team-level time series analysis in mlb, the nba, and the nhl: Attendance and outcome uncertainty. Journal of Sports Economics, 19(7):911–933, 2018.
  • [20] S. Salaga and R. Fort. Structural change in competitive balance in big-time college football. Review of Industrial Organization, 50:27–41, 2017.
  • [21] Brian M. Mills and Steven Salaga. Historical time series perspectives on competitive balance in ncaa division i basketball. Journal of Sports Economics, 16(6):614–646, 2015.
  • [22] H. Cho. Change-point detection in panel data via double cusum statistic. Electronic Journal of Statistics, 10:2000–2038, 2016.
  • [23] H. Cho and P. Fryzlewicz. Multiple-change-point detection for high dimensional time series via sparsified binary segmentation. Journal of the Royal Statistical Society: Series B, 77:475–507, 2014.
  • [24] ES Page. A test for a change in a parameter occurring at an unknown point. Biometrika, 42(3/4):523–527, 1955.
  • [25] David V Hinkley. Inference about the change-point from cumulative sum tests. Biometrika, 58(3):509–523, 1971.
  • [26] Haeran Cho and Piotr Fryzlewicz. Multiscale and multilevel technique for consistent segmentation of nonstationary time series. Statistica Sinica, 22(1):207–229, 2012.
  • [27] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2023.
  • [28] Haeran Cho and Piotr Fryzlewicz. hdbinseg: Change-Point Analysis of High-Dimensional Time Series via Binary Segmentation, 2018. R package version 1.0.1.
  • [29] Michael Friendly, Chris Dalzell, Martin Monkman, and Dennis Murphy. Lahman: Sean ’Lahman’ Baseball Database, 2021. R package version 9.0-0.
  • [30] Stef van Buuren and Karin Groothuis-Oudshoorn. mice: Multivariate imputation by chained equations in r. Journal of Statistical Software, 45(3):1–67, 2011.
  • [31] John McMurray. Examining stolen base trends by decade from the deadball era through the 1970s. Baseball Research Journal, Fall, 2015.
  • [32] Brian M. Mills. Technological innovations in monitoring and evaluation: Evidence of performance impacts among major league baseball umpires. Labour Economics, 46:189–199, 2017.
  • [33] Brian M. Mills. Policy changes in major league baseball: Improved agent behavior and ancillary productivity outcomes. Economic Inquiry, 55:1104–1118, 2017.
  • [34] John McMurray. Stolen bases in the deadball era: A relentless approach. SABR Deadball Era Newsletter, 2015.
  • [35] Allen R. Sanderson. The many dimensions of competitive balance. Journal of Sports Economics, 3:204–228, 2002.
  • [36] B Gerrard and M. Kringstad. The multi-dimensionality of competitive balance: Evidence from european football. Sport, Business, and Management, 12:382–402, 2022.
  • [37] Brad R. Humphreys. Alternative measures of competitive balance. Journal of Sports Economics, 3:133–148, 2002.