Speed-accuracy tradeoff and its effect in the game of cricket: predictive modeling from statistical mechanics perspective

Mohd Suhail Rizvi [email protected] Department of Biomedical Engineering, Indian Institute of Technology Hyderabad, Kandi, Telangana, 502285, India.
(July 2, 2024)
Abstract

Speed-accuracy tradeoffs are prevalent in a wide range of physical systems. In this paper, we demonstrate speed-accuracy tradeoffs in the game of cricket, where ‘batters’ score runs off the balls bowled by the ‘bowlers’. It is shown that the run scoring rate of a batter and the probability of dismissal follow a power-law relation. Owing to the availability of extensive data, the game of cricket is an excellent model for studying the effect of the speed-accuracy tradeoff on the overall performance of a system. It is shown that the exponent of the power law governs the adaptability of the player in different conditions and can be used for their assessment. Further, it is demonstrated that players with extreme values of the power-law exponent are better suited to different playing conditions than those with moderate values. These findings can be used to identify the potential of cricket players for different game formats and can help team management devise strategies for the best outcomes with a given set of players.
Keywords: Cricket, speed-accuracy tradeoff, player performance


I Introduction

In a wide variety of systems, be they natural or man-made, operational speed and functional reliability do not go hand in hand; this phenomenon is known as the “speed-accuracy tradeoff” [1–9]. Depending on the context, the speed can represent the speed of a physical motion [2, 4, 6], the characteristic time of a decision-making or memory-retrieval process [3, 7], the rate of cell proliferation [5], or the rate of a chemical reaction or a natural process [1, 8, 9]. Correspondingly, the functional reliability or accuracy stands for the deviation from a given target in the case of physical motion, the error in making the right decision, the chance of a lethal genetic mutation, or the formation of undesired products. The presence of speed-accuracy tradeoffs in such a diverse range of scenarios points to underlying similarities among these systems.

We show in this paper that similar speed-accuracy tradeoffs are also present in the game of cricket and can be used as performance indicators of players. Cricket, arguably the second most popular sport in the world [10], is played between two teams and primarily involves three sets of players: batters, bowlers and fielders. On a cricket pitch, a bowler of one team (the ‘bowling’ team) delivers a ball towards a batter of the opposing team (the ‘batting’ team) who, in turn, hits the ball with a bat. The fielders of the bowling team try to collect or catch the ball after the batter hits it and bring it back to the pitch. Meanwhile, the goal of the batters is to score points, called ‘runs’ in cricket, by running between the wickets at each end of the pitch such that they reach the wicket before a fielder can collect the ball and knock the wicket down. The batters can also score 4 or 6 runs on a delivery by hitting the ball beyond a marked boundary along the ground or through the air, respectively. Thus a batter can normally score 0, 1, 2, 3, 4, 5 or 6 runs on each delivery. If, however, either the wicket is knocked down or a fielder catches the ball before it touches the ground, the batter is declared “out” (also known as ‘losing a wicket’) and can no longer bat in that game. When a batter gets out, the next batter comes in to bat, and this goes on until all the players of the batting team are out, which marks the end of an ‘innings’. This is followed by the second innings, in which the opposing team bats to score more runs. Further, in the fixed ‘overs’ formats of the game (an over being six consecutive balls bowled), the innings of a team also ends when the prescribed overs are finished. For instance, in the T20I format one innings can last 20 overs (120 balls), in one-day international (‘ODI’) cricket it lasts 50 overs, and in the ‘test’ format there are virtually unlimited overs available for batting.

The availability of extensive statistical data on international as well as club-level cricket matches has encouraged the use of various statistics as measures of batting ability, such as the strike rate (average runs scored per ball faced), the average runs scored per innings, and the number of innings with a score of 50 or 100. Similar statistics, such as the economy rate (average runs conceded per over bowled) and the strike rate (average balls bowled per wicket taken), are also used for the performance assessment of bowlers. Although these quantities do represent the quality of a player in each format of the game, they do not capture the changes across the three formats. In this report, we demonstrate that players’ performances across the three game formats are linked to each other in the form of speed-accuracy tradeoffs in batting as well as bowling. For this purpose we define “speed” in batting (bowling) as the run scoring (conceding) rate and “accuracy” as the number of balls faced (delivered) before losing (taking) a wicket. We also show that these tradeoffs are powerful indicators and predictors of players’ performances across the game formats, and we use predictive modeling to identify the players better suited to different playing conditions.

II Data acquisition and Methods

For the analysis, we obtained data from the ESPN Cricinfo website [11] for international T20I, ODI and test matches. For each batter, the data included total runs scored, total balls faced, total number of dismissals and the run scoring rate. In the test, ODI and T20I formats, players who have scored a total of at least 5000, 3000 and 500 runs, respectively, were considered for analysis. Similarly, for each bowler, total runs conceded, total balls bowled and average runs conceded per ball were obtained. Bowlers who have taken at least 100, 100 and 20 wickets in tests, ODIs and T20Is, respectively, were included in the study.

The statistical analyses were performed in R and the predictive modeling of batting was implemented in MATLAB. For each statistical analysis, the $p$-value and effect size were obtained and are reported with the data. For the predictive parameter, the exponent $\alpha$, 95% confidence intervals were also calculated. For the collective analysis of all the players (Fig. 1), all the batters and bowlers selected by the criteria mentioned above were considered. For the analysis of individual players (Figs. 2C and 2D), only those players were selected for whom complete statistics were available for all three cricket formats, which resulted in 24 batters and 16 bowlers.

III Results and Discussion

III.1 Speed-accuracy tradeoffs in cricket

From the data obtained from the ESPN Cricinfo website [11], we calculate two parameters for batters: the run scoring rate, $r$, and the half-life, $\tau$, of the innings. The run scoring rate is defined as the average number of runs scored by a batter per ball faced, whereas the average half-life of a batter’s innings is given by

$$\tau = \frac{\log 2}{p_e} = \frac{B \log 2}{d} \tag{1}$$

where $p_e$ is the probability of a batter getting out on a given ball, and $B$ and $d$ are the total number of balls faced and the number of dismissals, respectively. We found that for batters the run scoring rate and the innings half-life are related through a power-law relationship, as shown in Fig. 1A and Table 1. That is

$$r = K_{bat}\,\tau^{-\alpha} = K_{bat}\left(\frac{B \log 2}{d}\right)^{-\alpha} \tag{2}$$
Table 1: Summary of the collective statistical analysis of all the players (Fig. 1).

| Player type | $\alpha$ | Effect size (Pearson’s $R$) | $p$-value |
|---|---|---|---|
| Batters | 0.618 | $-0.902$ | $< 10^{-15}$ |
| Bowlers | 0.695 | $-0.898$ | $< 10^{-15}$ |
Figure 1: Power-law relationships in the game of cricket. (A) The run scoring rate, $r$, measured as runs scored per ball faced, and the duration of batting before getting out, quantified in terms of the half-life, $\tau$, are related through a power law for batters. (B) For bowlers too, the run conceding rate, $r$, and the balls delivered before taking a wicket follow a power-law relationship. In both plots, $\alpha$ and $R$ represent the power-law exponent (see relation 2) and the coefficient of correlation, respectively. (C) The variation in the total runs scored, $S$, in an unlimited number of balls as a function of the run scoring rate, $r$, for three different values of $\alpha$. The runs in (C) are shown in arbitrary units.

where $K_{bat}$ is a phenomenological constant associated with batting and $\alpha$, the power-law exponent, dictates the nature of the batter’s adaptability under different circumstances. Similarly, for bowling too, the run conceding rate and the balls delivered per wicket follow a power-law relation (Fig. 1B, Table 1). Owing to this relationship, the total runs, $S$, scored by a batter before getting out scale as

$$S \sim \tau r \sim r^{\left(1 - 1/\alpha\right)}, \tag{3}$$

and therefore depend on $\alpha$ and the run scoring rate. Relation 3 demonstrates that the exponent $\alpha$ is an indicator of player performance by which cricket batters can be grouped into three categories (Fig. 1C). For batters with $\alpha > 1$, the total runs scored, $S$, increase with an increase in the run scoring rate, $r$, whereas for $\alpha < 1$ and $\alpha = 1$ the total runs decrease and remain unchanged, respectively. This shows that for an average batter ($\alpha = 0.62$, Fig. 1A) an increase in the run scoring rate leads to a reduction in the total runs scored. Similarly, for an average bowler ($\alpha = 0.7$, Fig. 1B) an increase in the run conceding rate results in fewer runs conceded, as it leads to faster dismissal of the opposing batters. Therefore, the exponent $\alpha$ is an indicator of the batting and bowling performances of players in different conditions.
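As a minimal sketch of how relations 1 and 2 can be applied to career totals, the Python snippet below computes the half-life from balls faced and dismissals and estimates $\alpha$ by ordinary least squares on the log-transformed data. The function names and the three synthetic records are illustrative assumptions, not ESPN Cricinfo data, and the fits reported in this paper were obtained in R rather than with this code.

```python
import math

def half_life(balls_faced, dismissals):
    """Innings half-life tau = B * ln(2) / d, following relation 1."""
    return balls_faced * math.log(2) / dismissals

def fit_power_law(taus, rates):
    """Least-squares fit of log r = log K - alpha * log tau (relation 2).

    Returns (alpha, K, pearson_r), where pearson_r is the correlation
    coefficient of the log-transformed data.
    """
    x = [math.log(t) for t in taus]
    y = [math.log(r) for r in rates]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    alpha = -slope                      # power-law exponent
    K = math.exp(my - slope * mx)       # prefactor K_bat
    pearson_r = sxy / math.sqrt(sxx * syy)
    return alpha, K, pearson_r

# Hypothetical (balls faced, dismissals) records; the rates are generated
# to obey r = 8 * tau**-0.6 exactly, so the fit should recover alpha = 0.6.
records = [(12000, 100), (6000, 150), (1500, 75)]
taus = [half_life(b, d) for b, d in records]
rates = [8.0 * t ** -0.6 for t in taus]
alpha, K, R = fit_power_law(taus, rates)
```

With real data the points scatter around the line, so the correlation coefficient is closer to the values in Table 1 than to the exact $-1$ produced by this synthetic input.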

III.2 Analysis of individual players

In Fig. 1, an average power-law relationship is obtained for all the players together, but a similar relation can also be obtained for individuals. Players who have played all three formats of the game (tests, ODIs and T20Is) each have three sets of run scoring rate, $r$, and half-life, $\tau$. The log-log plot of the run scoring rate against the half-life for individual players demonstrates that, here too, the power-law relation holds for batting (Fig. 2A, Table 2) as well as bowling (Fig. 2B, Table 2).

Table 2: Summary of the statistical analysis of individual players (Fig. 2).

| Player type | Average $\alpha$ | 95% CI for $\alpha$ | Effect size (Pearson’s $R$) |
|---|---|---|---|
| Batters | 0.680 | [0.617, 0.743] | $< -0.880$ |
| Bowlers | 0.679 | [0.623, 0.734] | $< -0.930$ |
Figure 2: Power-law exponents for individual players. Power-law fits between (A) the run scoring rate, $r$, and the innings half-life, $\tau$, for three representative batters, and (B) the run conceding rate and the number of deliveries before a wicket for three bowlers, across the three cricket formats. Distributions of the power-law exponent, $\alpha$, for individual (C) batters and (D) bowlers. In (C) and (D) the dashed lines correspond to the respective average values of $\alpha$ and the yellow regions span a width of two standard deviations of the distributions. $n$ stands for the number of players considered in (C) and (D).

From these relationships, the power-law exponent, $\alpha$, was obtained for each batter and bowler. The respective frequency distributions of $\alpha$ are shown in Figs. 2C and 2D. The frequency distributions show that $0.4 \leq \alpha \leq 1.1$ for batting as well as bowling (see Table 2 for the 95% CI). It can be seen that $\alpha < 1$ for a majority of players, which indicates that for an average batter ($\alpha < 1$) an increase in the run scoring rate leads to a decrease in the total runs scored (Fig. 1C).

The power-law relation can also be interpreted in terms of the proportional relative change between the two quantities, that is

$$\frac{\partial r}{r} = -\alpha \frac{\partial \tau}{\tau}. \tag{4}$$

Batters with a high value of $\alpha$ are therefore better suited to batting at higher run scoring rates, because an increase in their run scoring rate results in a smaller proportional decrease in their innings half-life, $\tau$. By the same argument, batters with a very low value of $\alpha$ are more apt at lower run scoring rates. Similarly, for bowlers, a lower $\alpha$ makes the player suitable for the shorter formats of the game, where higher run scoring rates are more common.
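For completeness, relation 4 follows from relation 2 by taking logarithms and differentiating. The short derivation below is a sketch added for clarity and uses only the symbols already defined.

```latex
\begin{align}
\log r &= \log K_{bat} - \alpha \log \tau, \\
\frac{dr}{r} &= -\alpha\,\frac{d\tau}{\tau}
\quad\Longleftrightarrow\quad
\frac{d\tau}{\tau} = -\frac{1}{\alpha}\,\frac{dr}{r}.
\end{align}
```

To first order, a 10% increase in $r$ therefore shortens $\tau$ by about $10\%/\alpha$, that is, roughly 16% for $\alpha = 0.62$ but only about 9% for $\alpha = 1.1$.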

Therefore, the exponent $\alpha$ is a powerful indicator of a player’s adaptability to situations where high or low run scoring rates are desired. This parameter can be used to identify the suitability of a given player for different formats of the game: batters with high $\alpha$ are more suitable for shorter games (T20Is), whereas those with smaller $\alpha$ are more appropriate for the longer formats (tests).

III.3 Predictive model for run scoring

In order to understand the influence of the power-law relationship between $r$ and $\tau$ on the overall run scoring of a batter, we model the score evolution as a one-dimensional random walk with drift. This model can also be applied to bowlers with appropriate changes in the model parameters. The anomalous-diffusion nature of the score evolution in cricket has previously been shown using the available ESPN Cricinfo data [11, 12]. Although the score evolution for the whole team follows anomalous diffusion [12], we model the run scoring of a single batter as normal diffusion. If $P_S(S,B)$ is the probability of scoring $S$ runs after playing $B$ balls, then $P_S(S,B)$ can be written as the following recurrence relation

$$P_S(S,B) = \sum_{i=0}^{6}\left[\left(1-p_e\right) p_i P_S(S-i,B-1)\right] - p_e P_S(S,B-1) \tag{5}$$

where $p_i$ is the run-scoring distribution of a player, that is, the probability of scoring $i$ runs on a single ball, and $p_e = \frac{\log 2}{\tau}$ is the probability of the batter getting out on a given ball. Here the two terms on the right-hand side of the recurrence relation correspond to scoring runs and getting out on a given ball, respectively. As a player can score at most 6 runs on each ball, the summation is performed over $0 \leq i \leq 6$. For ease of analysis, equation 5 can be written in the continuum limit as

$$\frac{\partial P_s(s,b)}{\partial b} = -r\frac{\partial P_s(s,b)}{\partial s} + D\frac{\partial^2 P_s(s,b)}{\partial s^2} - r_e P_s(s,b) \tag{6}$$

where $P_s$, $s$ and $b$ are the continuum counterparts of the discrete variables $P_S$, $S$ and $B$, respectively, and $r = \left(1-p_e\right)\sum_{i=0}^{6} i\,p_i$, $D = \frac{1-p_e}{2}\sum_{i=0}^{6} i^2 p_i$ and $r_e = -\log\left(1-p_e\right)$ [13]. The left-hand side of the equation is the change in the probability $P_s$ at each ball, and the three terms on the right-hand side correspond to the average scoring rate of the batter, the variation in the runs scored by the batter at each ball, and dismissal at a given ball, respectively. It should be noted that this predictive model describes the average performance of a player, with all other factors remaining unchanged.
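The three continuum parameters can be computed directly from a per-ball run distribution. The sketch below does this in Python. The function name and the example distribution are hypothetical, and reading $p_i$ as the distribution on balls the batter survives is an assumption of this sketch.

```python
import math

def continuum_parameters(p_runs, p_out):
    """Drift r, diffusion D and dismissal rate r_e as defined after Eq. (6).

    p_runs[i] is the probability of scoring i runs (i = 0..6) on a ball,
    assumed here to be conditional on survival; p_out is the per-ball
    dismissal probability p_e.
    """
    if len(p_runs) != 7 or abs(sum(p_runs) - 1.0) > 1e-9:
        raise ValueError("p_runs must hold 7 probabilities summing to 1")
    if not 0.0 <= p_out < 1.0:
        raise ValueError("p_out must lie in [0, 1)")
    mean_runs = sum(i * p for i, p in enumerate(p_runs))
    mean_sq = sum(i * i * p for i, p in enumerate(p_runs))
    r = (1.0 - p_out) * mean_runs        # average runs per ball
    D = 0.5 * (1.0 - p_out) * mean_sq    # spread of runs per ball
    r_e = -math.log(1.0 - p_out)         # continuous dismissal rate
    return r, D, r_e

# Hypothetical run distribution, for illustration only (not fitted to data)
p_runs = [0.40, 0.35, 0.08, 0.01, 0.12, 0.00, 0.04]
r, D, r_e = continuum_parameters(p_runs, p_out=0.02)
```

For small $p_e$ the rate $r_e$ is close to $p_e$ itself, so the half-life of relation 1 is approximately $\log 2 / r_e$.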

For this model of diffusion, the probability of reaching a target score of $s$ runs in $b$ balls or fewer can be obtained by integrating the first-passage time distribution [14] as

$$\psi(s,b) = \int_0^b \frac{s}{\sqrt{4\pi D t^3}}\exp\left(-r_e t - \frac{\left(s-rt\right)^2}{4Dt}\right)dt \tag{7}$$

The difference between $P_s(s,b)$ and $\psi(s,b)$ should be noted: $P_s(s,b)$ is the probability density of scoring $s$ runs in $b$ balls, whereas $\psi(s,b)$ is the probability of achieving $s$ runs in $b$ balls or fewer. With this expression, the dynamics of the score evolution for a batter can be studied under two scenarios. In the first scenario, the number of balls available to a batter is fixed and the aim is to score as many runs as possible. In the second scenario, the target number of runs is fixed and the batter is expected to achieve that target in as few balls as possible. We study these two scenarios in the following.
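The integral in relation 7 has no singularity in practice, because the integrand vanishes rapidly as $t \to 0$, so a simple quadrature is enough to evaluate it. The sketch below uses a midpoint rule in plain Python. The function names, step count and parameter values are illustrative assumptions, and this is not the MATLAB implementation used for the figures.

```python
import math

def first_passage_density(t, s, r, D, r_e):
    """Integrand of Eq. (7): density of first reaching score s at ball t."""
    return (s / math.sqrt(4.0 * math.pi * D * t ** 3)
            * math.exp(-r_e * t - (s - r * t) ** 2 / (4.0 * D * t)))

def psi(s, b, r, D, r_e, steps=20000):
    """Midpoint-rule estimate of psi(s, b), the integral in Eq. (7)."""
    h = b / steps
    return h * sum(first_passage_density((k + 0.5) * h, s, r, D, r_e)
                   for k in range(steps))

# Hypothetical parameter values chosen only to exercise the function
r, D, r_e = 1.2, 2.0, 0.02
p20 = psi(s=30, b=20, r=r, D=D, r_e=r_e)   # 30 runs within 20 balls
p60 = psi(s=30, b=60, r=r, D=D, r_e=r_e)   # 30 runs within 60 balls
```

Because the integrand is non-negative, `psi` can only grow with `b`. Setting `r_e = 0` with a positive drift and a large `b` gives a value close to 1, which is a convenient sanity check on the quadrature.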

III.3.1 Scenario 1: Run scoring in the first innings of a game

In the fixed-overs formats of cricket (ODIs and T20Is), the number of balls available to a batter in the first innings of a match is fixed. Therefore, in this scenario, the runs scored by the batter at different run scoring rates $r$ can be studied by incorporating the power-law relation between $r$ and $\tau$. By substituting

$$r_e = -\log\left(1 - \frac{\log 2}{\tau_0}\left(\frac{r_0}{r}\right)^{-1/\alpha}\right) \tag{8}$$

in relation 7, the total runs scored by a batter in $b$ balls can be obtained for different run scoring rates as

$$s_m(b) = \int_0^\infty \int_0^b \frac{s}{\sqrt{4\pi D t^3}}\exp\left(-r_e t - \frac{\left(s-rt\right)^2}{4Dt}\right)dt\,ds. \tag{9}$$

Relation 8 is obtained by substituting the power-law relation into $r_e = -\log\left(1-p_e\right)$, where $r_0$ and $\tau_0$ are the reference run scoring rate and innings half-life, respectively. For two extreme cases, $b = 20$ and $b = 100$, the total runs scored by batters with different $\alpha$ are shown as a function of the run scoring rate $r$ in Figs. 3A and 3B, respectively. For low $\alpha$, the runs scored in a finite number of balls vary non-monotonically with the run scoring rate (Figs. 3A-B). On the other hand, for high $\alpha$, the runs scored vary monotonically for a smaller number of balls (Figs. 3A-B). Further, these non-monotonic relationships show that, for a fixed number of balls, the maximum possible runs a batter can score can be estimated. Fig. 3A shows that for a small number of balls the batter with higher $\alpha$ would score more runs than the batter with lower $\alpha$. On the other hand, for a large number of balls, the batter with smaller $\alpha$ is predicted to score more runs (Fig. 3B). The maximum runs scored by batters for different $\alpha$ and numbers of balls are shown in Fig. 3C. This demonstrates that in the first innings of the shorter format of the game (smaller number of balls) batters with larger $\alpha$ would score more runs (downward arrow in Fig. 3C), and in the longer format (larger number of balls) batters with smaller $\alpha$ would perform better (upward arrow in Fig. 3C). This also shows that batters with $\alpha > 1$, who are rare (Fig. 2C), always perform better irrespective of the game format.
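Relation 8 is the only place where $\alpha$ enters the continuum model, so it is worth seeing how sharply the dismissal rate responds to the scoring rate. The following Python sketch evaluates relation 8 for a few exponents. The function name and the reference point ($r_0 = 0.8$ runs per ball, $\tau_0 = 40$ balls) are hypothetical choices for illustration, not values taken from the data.

```python
import math

def dismissal_rate(r, r0, tau0, alpha):
    """Dismissal rate r_e from Eq. (8) at scoring rate r.

    r0 and tau0 are the reference scoring rate and innings half-life;
    the per-ball dismissal probability is
    p_e = (ln 2 / tau0) * (r0 / r) ** (-1 / alpha).
    """
    p_e = math.log(2) / tau0 * (r0 / r) ** (-1.0 / alpha)
    if p_e >= 1.0:
        raise ValueError("scoring rate too high: p_e would reach 1")
    return -math.log(1.0 - p_e)

# Hypothetical reference point, for illustration only
for alpha in (0.5, 0.7, 1.1):
    rates = [dismissal_rate(r, 0.8, 40.0, alpha) for r in (0.6, 0.8, 1.2, 1.6)]
    print(alpha, [round(x, 4) for x in rates])
```

At $r = r_0$ every exponent gives the same rate, while above $r_0$ the low-$\alpha$ batter is dismissed fastest, which is the mechanism behind the format dependence discussed above.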

Figure 3: Drift-diffusion based predictive model of run scoring and target chasing in cricket. Total runs scored, $s$, against run scoring rate, $r$, in (A) $b = 20$ balls and (B) $b = 100$ balls, as obtained from the continuum diffusion model of the score evolution for different $\alpha$. (C) Maximum possible runs scored by batters with different $\alpha$ in a finite number of balls. Rate of reaching a target score of (D) $s = 50$ runs and (E) $s = 300$ runs as a function of $r$. (F) Highest rate of chasing a finite target for different $\alpha$. In (C) and (F), smaller numbers of balls and smaller target scores, respectively, stand for the shorter formats of cricket (T20Is), whereas large $b$ and $s$ correspond to the longer format (tests). The arrows in (C) and (F) mark the most suitable $\alpha$ in the two formats of the game.

III.3.2 Scenario 2: Target chasing in the second innings of a game

In the second innings of a fixed-overs game of cricket, the batters are tasked with chasing a target score. In chasing a target score, it is important to see whether a batter with a specific $\alpha$ can achieve the target at all. The probability of a batter reaching a target score $s$ is given by

$$P(s) = \lim_{b\rightarrow\infty}\psi(s,b) = \int_0^\infty \frac{s}{\sqrt{4\pi D t^3}}\exp\left(-r_e t - \frac{\left(s-rt\right)^2}{4Dt}\right)dt, \tag{10}$$

where $r_e$ is given by relation 8. In addition to the probability of reaching a target score (relation 10), the average number of balls required to achieve the target is also an important quantity. It is given by

$$b_m(s) = \frac{s}{P(s)}\int_0^\infty \frac{t}{\sqrt{4\pi D t^3}}\exp\left(-r_e t - \frac{\left(s-rt\right)^2}{4Dt}\right)dt. \tag{11}$$

The quantities $P(s)$ and $b_m(s)$ can be combined as $P/b_m$ to obtain the effective rate of reaching a target score. Low values of $P/b_m$ imply that the probability of reaching the target score is low and the average number of balls required is high. On the other hand, a high $P/b_m$ indicates a high probability and a smaller number of balls required to achieve the target score. As shown in Figs. 3D and 3E, the quantity $P/b_m$ varies non-monotonically with the run scoring rate, $r$. Further, for smaller targets the batter with higher $\alpha$ would chase the runs more effectively (Fig. 3D), and for larger targets the batter with smaller $\alpha$ is more reliable (Fig. 3E). The highest value of $P/b_m$ for different values of $\alpha$ and target scores is shown in Fig. 3F, which demonstrates the suitability of batters with larger $\alpha$ for shorter run chases (downward arrow) and of batters with smaller $\alpha$ for longer run chases (upward arrow).
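The chase quantities of relations 10 and 11 can be estimated with the same kind of quadrature as relation 7. The sketch below is a minimal Python version under stated assumptions: the infinite upper limit is truncated at an ad hoc multiple of the mean passage time $s/r$, and the function names and parameter values are hypothetical.

```python
import math

def _density(t, s, r, D, r_e):
    """First-passage integrand shared by Eqs. (10) and (11)."""
    return (s / math.sqrt(4.0 * math.pi * D * t ** 3)
            * math.exp(-r_e * t - (s - r * t) ** 2 / (4.0 * D * t)))

def chase_metrics(s, r, D, r_e, t_max=None, steps=40000):
    """Return (P, b_m, P / b_m) for a target of s runs.

    The upper limit of both integrals is truncated at t_max; the default
    of 20 * s / r is an assumption of this sketch, not part of the model.
    """
    if t_max is None:
        t_max = 20.0 * s / r
    h = t_max / steps
    ts = [(k + 0.5) * h for k in range(steps)]          # midpoint nodes
    fs = [_density(t, s, r, D, r_e) for t in ts]
    P = h * sum(fs)                                     # Eq. (10)
    b_m = h * sum(t * f for t, f in zip(ts, fs)) / P    # Eq. (11)
    return P, b_m, P / b_m

# Hypothetical parameter values, for illustration only
P, b_m, rate = chase_metrics(s=50, r=1.2, D=2.0, r_e=0.02)
```

As a check that does not come from this paper, the integrand is an inverse Gaussian density damped by $e^{-r_e t}$, for which the standard Laplace-transform result gives $P(s) = \exp\left[s\left(r - \sqrt{r^2 + 4 D r_e}\right)/(2D)\right]$ and $b_m(s) = s/\sqrt{r^2 + 4 D r_e}$; the numerical values should agree with these closely.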

The analysis of the two scenarios (run scoring in the first innings and target chasing in the second innings) demonstrates that batters with larger $\alpha$ are more suitable for the shorter format of the game (T20Is), whereas batters with smaller $\alpha$ are more apt for the longer format (tests).

It should be highlighted that the drift-diffusion based predictive model assumes that all the balls faced by a batter are the same, and the effect of different bowlers is ignored. Therefore, the predictions of the present model are for the average performance of a batter. To assess a batter’s performance against specific bowlers, similar statistical data for that particular batter are required; once they are available, the same analysis is straightforward. The statistical analysis and the predictive model presented in this paper can therefore be used by the management of cricket teams to analyze players’ performances and to identify the best strategies against any given opponent.

References

  • [1] C. C. Wu, O. S. Kwon, and E. Kowler. Fitts’s Law and speed/accuracy trade-offs during sequences of saccades: Implications for strategies of saccadic planning. Vision Res., 50(21):2142–2157, Oct 2010.
  • [2] M. Dean, S. W. Wu, and L. T. Maloney. Trading off speed and accuracy in rapid, goal-directed movements. J Vis, 7(5):1–12, 2007.
  • [3] S. Yantis and D. E. Meyer. Dynamics of activation in semantic and episodic memory. J Exp Psychol Gen, 117(2):130–147, Jun 1988.
  • [4] G. Nilsson. Traffic safety dimensions and the power model to describe the effect of speed on safety. PhD thesis, Lund Institute of Technology, 2004.
  • [5] C. Tomasetti and B. Vogelstein. Cancer etiology. Variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science, 347(6217):78–81, Jan 2015.
  • [6] R. Plamondon and A. M. Alimi. Speed/accuracy trade-offs in target-directed movements. Behav Brain Sci, 20(2):279–303, Jun 1997.
  • [7] C. C. Liu and T. Watanabe. Accounting for speed-accuracy tradeoff in perceptual learning. Vision Res., 61:107–114, May 2012.
  • [8] R. P. Heitz. The speed-accuracy tradeoff: history, physiology, methodology, and behavior. Front Neurosci, 8:150, 2014.
  • [9] W. A. Wickelgren. Speed-accuracy tradeoff and information processing dynamics. Acta Psychologica, 41:67–85, 1977.
  • [10] The Economist. And the silver goes to…, 2011.
  • [11] ESPN Cricinfo. ESPN Cricinfo-Statsguru, 2016.
  • [12] H. V. Ribeiro, S. Mukherjee, and X. H. Zeng. Anomalous diffusion and long-range correlations in the score evolution of the game of cricket. Phys Rev E Stat Nonlin Soft Matter Phys, 86(2 Pt 1):022102, Aug 2012.
  • [13] R. M. Feldman and C. Valdez-Flores. Applied Probability and Stochastic Processes. Springer Heidelberg Dordrecht London New York, 2010.
  • [14] M. Ding and G. Rangarajan. First passage time problem: A Fokker-Planck approach. In L. Wille, editor, New Directions in Statistical Physics, pages 31–46. Springer Berlin Heidelberg, 2004.