-
Learning Risk Preferences in Markov Decision Processes: an Application to the Fourth Down Decision in Football
Authors:
Nathan Sandholtz,
Lucas Wu,
Martin Puterman,
Timothy C. Y. Chan
Abstract:
For decades, National Football League (NFL) coaches' observed fourth down decisions have been largely inconsistent with prescriptions based on statistical models. In this paper, we develop a framework to explain this discrepancy using a novel inverse optimization approach. We model the fourth down decision and the subsequent sequence of plays in a game as a Markov decision process (MDP), the dynamics of which we estimate from NFL play-by-play data from the 2014 through 2022 seasons. We assume that coaches' observed decisions are optimal but that the risk preferences governing their decisions are unknown. This yields a novel inverse decision problem for which the optimality criterion, or risk measure, of the MDP is the estimand. Using the quantile function to parameterize risk, we estimate which quantile-optimal policy yields the coaches' observed decisions as minimally suboptimal. In general, we find that coaches' fourth-down behavior is consistent with optimizing low quantiles of the next-state value distribution, which corresponds to conservative risk preferences. We also find that coaches exhibit higher risk tolerances when making decisions in the opponent's half of the field than in their own, and that league average fourth down risk tolerances have increased over the seasons in our data.
Submitted 1 September, 2023;
originally announced September 2023.
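Illustrative note on the entry above: the abstract describes choosing actions by a quantile of the next-state value distribution rather than by its mean. The following is a minimal sketch of that single idea, not the authors' estimation procedure. The function names, the action labels, and every number in the example are hypothetical, and the one-step comparison stands in for a full MDP solution.

```python
def discrete_quantile(values, probs, tau):
    """Smallest value v such that P(V <= v) >= tau for a discrete distribution."""
    pairs = sorted(zip(values, probs))
    cumulative = 0.0
    for value, prob in pairs:
        cumulative += prob
        if cumulative >= tau:
            return value
    # Guard against a floating-point shortfall in the summed probabilities.
    return pairs[-1][0]


def quantile_greedy_action(next_state_values, next_state_probs, tau):
    """Return the action whose next-state value distribution has the largest tau-quantile."""
    scores = {
        action: discrete_quantile(next_state_values[action], next_state_probs[action], tau)
        for action in next_state_values
    }
    best_action = max(scores, key=scores.get)
    return best_action, scores


# Hypothetical next-state values (in points) and probabilities for two fourth-down options.
values = {"go_for_it": [-2.0, 3.0], "punt": [0.0, 0.5]}
probs = {"go_for_it": [0.5, 0.5], "punt": [0.5, 0.5]}

conservative_choice, _ = quantile_greedy_action(values, probs, tau=0.2)
aggressive_choice, _ = quantile_greedy_action(values, probs, tau=0.8)
```

With these made-up numbers, a low quantile level favors the option with the better worst case, while a high quantile level favors the option with the better upside, which is the sense in which low quantiles correspond to conservative risk preferences.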
-
Miss It Like Messi: Extracting Value from Off-Target Shots in Soccer
Authors:
Ethan Baron,
Nathan Sandholtz,
Devin Pleuler,
Timothy C. Y. Chan
Abstract:
Measuring soccer shooting skill is a challenging analytics problem due to the scarcity and highly contextual nature of scoring events. The introduction of more advanced data surrounding soccer shots has given rise to model-based metrics which better cope with these challenges. Specifically, metrics such as expected goals added, goals above expectation, and post-shot expected goals all use advanced data to offer an improvement over the classical conversion rate. However, all metrics developed to date assign a value of zero to off-target shots, which account for almost two-thirds of all shots, since these shots have no probability of scoring. We posit that there is non-negligible shooting skill signal contained in the trajectories of off-target shots and propose two shooting skill metrics that incorporate the signal contained in off-target shots. Specifically, we develop a player-specific generative model for shot trajectories based on a mixture of truncated bivariate Gaussian distributions. We use this generative model to compute metrics that allow us to attach non-zero value to off-target shots. We demonstrate that our proposed metrics are more stable than current state-of-the-art metrics and have increased predictive power.
Submitted 24 December, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Equity, diversity, and inclusion in sports analytics
Authors:
Craig Fernandes,
Jason D. Vescovi,
Richard Norman,
Cheri L. Bradish,
Nathan Taback,
Timothy C. Y. Chan
Abstract:
This paper presents a landmark study of equity, diversity and inclusion (EDI) in the field of sports analytics. We developed a survey that examined personal and job-related demographics, as well as individual perceptions and experiences about EDI in the workplace. We sent the survey to individuals in the five major North American professional leagues, representatives from the Olympic and Paralympic Committees in Canada and the U.S., the NCAA Division I programs, companies in sports tech/analytics, and university research groups. Our findings indicate the presence of a clear dominant group in sports analytics identifying as: young (72.0%), White (69.5%), heterosexual (89.7%) and male (82.0%). Within professional sports, males in management positions earned roughly 30,000 USD (27%) more on average compared to females. A smaller but equally alarming pay gap of 17,000 USD (14%) was found between White and non-White management personnel. Of concern, females were nearly five times as likely to experience discrimination and twice as likely to have considered leaving their job due to isolation or feeling unwelcome. While they had similar levels of agreement regarding fair processes for rewards and compensation, females "strongly agreed" less often than males regarding equitable support, equitable workload, having a voice, and being taken seriously. Over one third (36.3%) of females indicated that they "strongly agreed" that they must work harder than others to be valued equally, compared to 9.8% of males. We conclude the paper with concrete recommendations that could be considered to create a more equitable, diverse and inclusive environment for individuals working within the sports analytics sector.
Submitted 14 June, 2022; v1 submitted 2 April, 2022;
originally announced April 2022.
-
A Markov process approach to untangling intention versus execution in tennis
Authors:
Timothy C. Y. Chan,
Douglas S. Fearing,
Craig Fernandes,
Stephanie Kovalchik
Abstract:
Value functions are used in sports applications to determine the optimal action players should employ. However, most literature implicitly assumes that the player can perform the prescribed action with known and fixed probability of success. The effect of varying this probability or, equivalently, "execution error" in implementing an action (e.g., hitting a tennis ball to a specific location on the court) on the design of optimal strategies, has received limited attention. In this paper, we develop a novel modeling framework based on Markov reward processes and Markov decision processes to investigate how execution error impacts a player's value function and strategy in tennis. We power our models with hundreds of millions of simulated tennis shots with 3D ball and 2D player tracking data. We find that optimal shot selection strategies in tennis become more conservative as execution error grows, and that having perfect execution with the empirical shot selection strategy is roughly equivalent to choosing one or two optimal shots with average execution error. We find that execution error on backhand shots is more costly than on forehand shots, and that optimal shot selection on a serve return is more valuable than on any other shot, over all values of execution error.
Submitted 4 October, 2021;
originally announced October 2021.
-
An Inverse Optimization Approach to Measuring Clinical Pathway Concordance
Authors:
Timothy C. Y. Chan,
Maria Eberg,
Katharina Forster,
Claire Holloway,
Luciano Ieraci,
Yusuf Shalaby,
Nasrin Yousefi
Abstract:
Clinical pathways outline standardized processes in the delivery of care for a specific disease. Patient journeys through the healthcare system, though, can deviate substantially from these pathways. Given the positive benefits of clinical pathways, it is important to measure the concordance of patient pathways so that variations in health system performance or bottlenecks in the delivery of care can be detected, monitored, and acted upon. This paper proposes the first data-driven inverse optimization approach to measuring pathway concordance in any problem context. Our specific application considers clinical pathway concordance for stage III colon cancer. We develop a novel concordance metric and demonstrate using real patient data from Ontario, Canada that it has a statistically significant association with survival. Our methodological approach considers a patient's journey as a walk in a directed graph, where the costs on the arcs are derived by solving an inverse shortest path problem. The inverse optimization model uses two sources of information to find the arc costs: reference pathways developed by a provincial cancer agency (primary) and data from real-world patient-related activity from patients with both positive and negative clinical outcomes (secondary). Thus, our inverse optimization framework extends existing models by including data points of both varying "primacy" and "alignment". Data primacy is addressed through a two-stage approach to imputing the cost vector, while data alignment is addressed by a hybrid objective function that aims to minimize and maximize suboptimality error for different subsets of input data.
Submitted 15 January, 2021; v1 submitted 6 June, 2019;
originally announced June 2019.
-
Automated Treatment Planning in Radiation Therapy using Generative Adversarial Networks
Authors:
Rafid Mahmood,
Aaron Babier,
Andrea McNiven,
Adam Diamant,
Timothy C. Y. Chan
Abstract:
Knowledge-based planning (KBP) is an automated approach to radiation therapy treatment planning that involves predicting desirable treatment plans before they are then corrected to deliverable ones. We propose a generative adversarial network (GAN) approach for predicting desirable 3D dose distributions that eschews the previous paradigms of site-specific feature engineering and predicting low-dimensional representations of the plan. Experiments on a dataset of oropharyngeal cancer patients show that our approach significantly outperforms previous methods on several clinical satisfaction criteria and similarity metrics.
Submitted 17 July, 2018;
originally announced July 2018.
-
Learning to Optimize Contextually Constrained Problems for Real-Time Decision-Generation
Authors:
Aaron Babier,
Timothy C. Y. Chan,
Adam Diamant,
Rafid Mahmood
Abstract:
The topic of learning to solve optimization problems has received interest from both the operations research and machine learning communities. In this work, we combine techniques from both fields to address the problem of learning to generate decisions to instances of continuous optimization problems where the feasible set varies with contextual features. We propose a novel framework for training a generative model to estimate optimal decisions by combining interior point methods and adversarial learning, which we further embed within a data generation algorithm. Decisions generated by our model satisfy in-sample and out-of-sample optimality guarantees. Finally, we investigate case studies in portfolio optimization and personalized treatment design, demonstrating that our approach yields advantages over predict-then-optimize and supervised deep learning techniques, respectively.
Submitted 21 April, 2022; v1 submitted 23 May, 2018;
originally announced May 2018.