A Remark on Baserunning risk: Waiting Can Cost You the Game
Peter MacDonald, Department of Mathematics & Statistics, McMaster University, Hamilton, ON, Canada
Dan McQuillan, Department of Mathematics, Norwich University, Northfield, VT, USA
Ian McQuillan, Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
Abstract
We address the value of a baserunner at first base waiting to see if a ball in play falls in for a hit, before running. When a ball is hit in the air, the baserunner will usually wait, to gather additional information as to whether the ball will fall for a hit before deciding to run aggressively. This additional information guarantees that there will not be a double play and an “unnecessary out”. However, waiting could potentially cost the runner the opportunity to reach third base, or even to score on the play, if the ball falls for a hit. This in turn affects the probability of scoring at least one run henceforth in the inning. We create a new statistic, the baserunning risk threshold (BRT), which measures the minimum probability with which the baserunner should be sure that a ball in play will fall in for a hit, before running without waiting to see if the ball will be caught, with the goal of scoring at least one run in the inning. We measure a 1-out and a 0-out version of BRT, both in aggregate, and also in high leverage situations, where scoring one run is particularly important. We show a drop in BRT for pitchers who pitch in more high leverage innings, and a very low BRT on average for “elite closers”. It follows that baserunners should frequently run without waiting, and accept being thrown out in double plays regularly, to maximize their chances of scoring at least one run.

Consider the situation of a late inning of a close baseball game, with one out, and a single baserunner, who is on first base. Assume the batter hits the ball towards the outfield and it is not clear if the ball will fall in for a hit. The baserunner typically runs partway towards second base and then waits to see if the outfielder will catch the ball or not before continuing. We will refer to this strategy as the conventional baserunning strategy. This strategy is almost always used in order to prevent a double play.
Indeed, if the ball is caught before it touches the ground, the baserunner must return to first base before the defense can get the ball to first base, or they are also ruled out, ending the inning. In this situation, standing and waiting can prevent a double play.

In contrast, when there are two outs, the baserunner has the freedom to run aggressively without waiting. Indeed, if the ball is caught, the inning is over anyway, so there is no possibility of a double play.

In this paper, we make the case that the two-out, aggressive baserunning strategy should be employed with fewer than two outs in specific situations. In practice, this aggressive baserunning strategy is almost never used with fewer than two outs. In the rare cases when it is used, traditional baseball statistics cannot detect it, unless it results in a double play. Therefore, our arguments are necessarily indirect, rather than purely statistical.

Assume a runner is on first base in the 9th inning with one out, and a ball is hit to shallow right field. The outfielder is running very fast, attempting to catch the ball. Let us say that the probability of a hit is 0.5. Such a probability assessment could be made by the coaches, or the baserunner, in the time it takes the baserunner to reach the point where they would conventionally stop (we will discuss the problem of estimating this probability in Section 4). In Section 2, we show that in this scenario, they should not stop. Indeed, the only way stopping could be beneficial is if the ball is actually caught. Therefore, the maximum reward of the conventional baserunning strategy is exactly one runner on first base, with two outs. We will show that this reward is relatively insignificant.
As a clear example, consider that between 2003 and 2008, late-inning MLB specialist Mariano Rivera pitched in 58 different high leverage innings where there was exactly one baserunner, who was on first base with two outs. (We define high leverage situations to be either the eighth or ninth inning where the difference in score is at most one.) None of those 58 runners eventually scored in those innings. With hindsight, it is clear that no one should have coveted such a reward. Baserunners would have been better off pretending that there were two outs instead of one out, to increase their chances of scoring. Indeed, Rivera is only slightly better than the average pitcher with a runner on third base and one out.

In the literature, a number of similar questions have been addressed. In Tango et al. (2007) and Winston (2009), the expected number of runs is shown for each configuration of bases occupied, and by the event that occurs. The notion we are proposing is quite similar in nature to sacrifice bunting, stealing bases, or aggressive baserunning (taking an “extra” base), which have been studied (for example, also in Tango et al. 2007, Click 2006, Winston 2009). It is known, for example, that bunting reduces the expected runs in general, but can be a good idea for a poor hitter, or if scoring only one run is desired. In Click, the probability of scoring at least one run given each base configuration is given. In general, runners should be far more aggressive in taking an “extra base” than they are in practice (as in Tango et al., Winston). In Tango et al., the authors calculate the run value of taking an extra base. In this paper, however, we are more concerned with the opportunities lost by waiting instead of running.
Consider the possibility that there are $i$ outs, where $i = 0$ or $i = 1$, and, as a simplifying assumption, that aggressive baserunning without waiting, from first base, will result in the runner ending up on third base instead of second base should a ball in play fall in for a hit. We examine this reward: the offense has a runner on third base instead of second base.

Definition 1. We define the following, for $i \in \{0, 1\}$:
• Let $T_i$ be the probability of at least one run scoring later in the (half) inning, starting from a situation where there is a runner on third base and $i$ outs (no requirements regarding first and second base).
• Let $S_i$ be the probability of at least one run scoring later in the (half) inning, starting from a situation where there are $i$ outs and a runner on second base, but no runner on third base (no requirements regarding first base).
• Let $F_i$ be the probability that at least one runner will score later in the inning, starting from the situation that there are $i + 1$ outs, a runner on first base and no runners on second or third.

We will measure $T_i$, $S_i$ and $F_i$ for each pitcher, as well as in aggregate. To calculate $T_i$, as numerator, we use the number of half innings where there was a runner on third with $i$ outs and the pitcher pitching, and at least one run scored later on in the half inning. Therefore, even if the scenario occurs more than once in the same half inning, it is only counted once. As denominator we use the same value as the numerator plus the number of half innings where the pitcher was in that situation and zero runs were scored henceforth. We calculate $S_i$ and $F_i$ similarly.

Note that the reward for the aggressive strategy with $i$ outs can be measured by the difference $T_i - S_i$, and that the only possible reward from the conventional strategy can be measured by $F_i$ (we use the subscript $i$ because $F_i$ is contributing to the $i$-out statistic, even though $F_i$ is calculated by examining situations where there are $i + 1$ outs).

Now assume the ball is hit to the outfield and that the probability that the ball falls for a hit is judged to be $p$, where $0 \le p \le 1$. The probability that at least one run scores, using the conventional strategy, is:
\[ p \cdot S_i + (1 - p) \cdot F_i. \]
The probability that at least one run scores with our proposed aggressive strategy is:
\[ p \cdot T_i. \]
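The counting rule for $T_i$, $S_i$ and $F_i$ and the two strategy probabilities above can be sketched in Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the half-inning counts in the first usage line are made up, and the second usage example borrows Mariano Rivera's $T_1$, $S_1$, $F_1$ from Table 3 together with the $p = 0.5$ scenario from the introduction.

```python
def at_least_one_run_rate(innings_scored: int, innings_scoreless: int) -> float:
    """Estimate T_i, S_i or F_i from half-inning counts.

    innings_scored:    half innings in which the situation occurred and at
                       least one run scored afterwards.
    innings_scoreless: half innings in which the situation occurred and no
                       run scored afterwards.
    Each half inning is counted once, even if the situation recurs in it.
    """
    total = innings_scored + innings_scoreless
    if total == 0:
        raise ValueError("no half innings observed for this situation")
    return innings_scored / total


def conventional_probability(p: float, s_i: float, f_i: float) -> float:
    """P(at least one run) when the runner waits: p * S_i + (1 - p) * F_i."""
    return p * s_i + (1.0 - p) * f_i


def aggressive_probability(p: float, t_i: float) -> float:
    """P(at least one run) when the runner does not wait: p * T_i.

    Under the paper's simplifying assumptions, a caught ball ends the
    inning's scoring chance for this comparison, so it contributes zero.
    """
    return p * t_i


# Hypothetical counts: 60 half innings with a later run, 40 without.
example_rate = at_least_one_run_rate(60, 40)

# Rivera's Table 3 values (T_1, S_1, F_1) with p = 0.5.
wait = conventional_probability(0.5, s_i=0.328, f_i=0.043)
run = aggressive_probability(0.5, t_i=0.595)
```

With these inputs the aggressive strategy gives the larger probability of scoring at least one run (`run > wait`), which is the comparison the next step formalizes.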
Thus, the aggressive strategy is at least as beneficial as the conventional strategy, with the goal of scoring a run in the inning, whenever:
\[ p \cdot T_i \ge p \cdot S_i + (1 - p) \cdot F_i. \tag{1} \]
Then, $\mathrm{BRT}_i$ is the minimum value of $p$ for which risky baserunning is at least as good as the conventional strategy. In the unlikely event that $S_i \ge T_i$, it is not possible for the aggressive strategy to be better than the conventional strategy. (Observe that since $0 \le p \le 1$, Equation (1) would have no solution with $p < 1$ when $F_i > 0$.) In this case, we automatically declare $\mathrm{BRT}_i$ to equal $1$, meaning that one must be completely certain that the ball will fall in for a hit before running too far away from first base.

If, on the other hand, $S_i < T_i$, then it makes sense to ask: which values of $p$ make the aggressive strategy at least as good as the conventional one? The answer is found by solving for $p$ in Equation (1):
\[ p \ge \frac{F_i}{F_i + T_i - S_i}. \tag{2} \]
We will therefore define BRT as follows:

Definition 2. For $i \in \{0, 1\}$ (the number of outs), we define the $i$-out baserunning risk threshold ($\mathrm{BRT}_i$) as
\[ \mathrm{BRT}_i = \begin{cases} \dfrac{F_i}{F_i + T_i - S_i} & \text{if } T_i - S_i > 0, \\ 1 & \text{if } T_i - S_i \le 0. \end{cases} \tag{3} \]

Note that $T_i$ will almost certainly be bigger than $S_i$, unless our data is insufficient, in the context used, to have confidence in those probabilities in the first place. If we did not include the second case in Equation (3), then the BRT statistic would be less than zero if and only if $S_i > T_i + F_i$, would be undefined if and only if $S_i = T_i + F_i$, and would be bigger than one if and only if $T_i < S_i < T_i + F_i$. One of these scenarios, or $S_i = T_i$, holds if and only if $T_i - S_i \le 0$, and therefore the BRT is mapped to one.

Definition 2 was chosen based on two assumptions:
1. running aggressively without waiting to see if the ball in play is a hit, from first base, will result in the runner ending up safely at third base,
2. conventional baserunning would result in the baserunner ending up safely at second base.

Our work shows that in order to maximize the probability of scoring at least one run under these assumptions, the aggressive baserunning strategy is at least as good as conventional baserunning whenever $p \ge \mathrm{BRT}$. Without our assumptions the discussion would be much different. In the event that running aggressively would result in the original baserunner being thrown out at third, our strategy would not be as effective. However, it is also possible that running aggressively could be much more effective, if, for example, the runner from first could score by running aggressively. This point will be discussed further in Section 4.
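Definition 2 translates directly into a few lines of Python. The sketch below is illustrative only: the function names `baserunning_risk_threshold` and `should_run_without_waiting` are hypothetical, and the sample inputs are the $T_1$, $S_1$, $F_1$ values listed for Mariano Rivera and Bruce Sutter in Table 3.

```python
def baserunning_risk_threshold(t_i: float, s_i: float, f_i: float) -> float:
    """Return BRT_i as in Definition 2.

    t_i, s_i, f_i are the probabilities T_i, S_i and F_i of Definition 1.
    When T_i - S_i <= 0 the threshold is defined to be 1.
    """
    gain = t_i - s_i  # reward of reaching third instead of second
    if gain <= 0:
        return 1.0
    return f_i / (f_i + gain)


def should_run_without_waiting(p: float, brt: float) -> bool:
    """Aggressive running is at least as good as waiting when p >= BRT_i."""
    return p >= brt


# Table 3 inputs (T_1, S_1, F_1) for two pitchers.
rivera_brt = baserunning_risk_threshold(0.595, 0.328, 0.043)
sutter_brt = baserunning_risk_threshold(0.639, 0.336, 0.072)

print(round(rivera_brt, 3))  # 0.139, matching Table 3
print(round(sutter_brt, 3))  # 0.192, matching Table 3
print(should_run_without_waiting(0.5, rivera_brt))  # True
```

Because Table 3 reports inputs rounded to three decimals, recomputing the threshold from those rounded values can differ from the tabulated BRT in the last digit for some pitchers.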
All data, unless noted otherwise, is from Retrosheet. We give restrictions on the data used for the various statistics provided below.

The aggregate $\mathrm{BRT}_1$ and $\mathrm{BRT}_0$ statistics for all pitchers, in any inning from 1984-2011, appear in Table 1. This suggests that in any inning where the goal is to score at least one run, if there is one out, then the baserunner should run without waiting whenever the chance that a ball in play falls for a hit is at least the aggregate $\mathrm{BRT}_1$. This is quite clearly less than the conventional strategy requires (as discussed in Section 4). Moreover, even if there is zero out, only a chance equal to the aggregate $\mathrm{BRT}_0$ is required to make not waiting the best strategy. In addition, we calculate the aggregate statistics for all pitchers from 1984-2011 in high leverage situations, which we define to be either the eighth or ninth inning where the difference in score is at most one, also in Table 1.

Table 1: On the left, aggregate statistics for all pitchers in any inning from 1984-2011, for one and zero out. On the right is a restriction to high leverage situations.

We next examine BRT by partitioning pitchers based on the number of high leverage innings they pitched in their career. In all forthcoming statistics, we include career statistics in high leverage innings (ending in 2011) from all pitchers who either retired since 1984 or are currently active. We include their career statistics even if they played before 1984 but retired after 1984. We summarize the data for $\mathrm{BRT}_1$ and $\mathrm{BRT}_0$ in Table 2.

$\mathrm{BRT}_1$
              100    150    200    250    300    350+   top 10 saves   all plays
cumulative    0.364  0.354  0.305  0.309  0.324  0.298  0.278          0.370
mean          0.425  0.411  0.338  0.329  0.344  0.317  0.302          -
standard dev  0.260  0.237  0.176  0.137  0.117  0.119  0.111          -

$\mathrm{BRT}_0$
              100    150    200    250    300    350+   top 10 saves   all plays
cumulative    0.569  0.576  0.576  0.520  0.518  0.545  0.528          0.578
mean          0.623  0.619  0.645  0.590  0.555  0.574  0.540          -
standard dev  0.263  0.250  0.230  0.254  0.194  0.137  0.107          -
Table 2: We provide the $\mathrm{BRT}_1$ (top) and $\mathrm{BRT}_0$ (bottom) for pitchers in high leverage innings. We provide the statistics over all pitchers with career high leverage innings in the ranges $[100, 150)$, $[150, 200)$, $[200, 250)$, $[250, 300)$, $[300, 350)$ and $[350, \infty)$, as well as the top 10 all-time career save leaders (save leaders from Baseball Reference), and all plays in high leverage situations. For each, we identify the cumulative statistics (without separating individual pitchers within each set), the mean statistics over all the individual pitchers in each set, and the standard deviation between pitchers.

Table 2 demonstrates that, on average, the $\mathrm{BRT}_1$ is lower when examining sets of pitchers with more career high leverage innings pitched. The cumulative statistic for the top 10 save leaders provides a $\mathrm{BRT}_1$ of 0.278, which is significantly less than the statistic of 0.370 for all pitchers in high leverage situations. Hence, baserunners should be significantly more aggressive against elite closers.

For the zero-out statistic, $\mathrm{BRT}_0$, also in Table 2, we also see a tendency for the BRT to be lower for pitchers with more high leverage innings, although there is a smaller difference in the cumulative statistic between all pitchers and the top 10 save leaders (0.578 to 0.528) than for the $\mathrm{BRT}_1$ statistic.

In Table 3, we provide each pitcher's $T_1$, $S_1$, $F_1$ and $\mathrm{BRT}_1$ for all pitchers with at least 350 appearances in high leverage innings. We also collect the earned run average for each pitcher. The table is ranked in increasing order by $\mathrm{BRT}_1$.
Last Name    First Name  T_1    S_1    F_1    BRT_1  ERA
Rivera       Mariano     0.595  0.328  0.043  0.139  2.21
Sutter       Bruce       0.639  0.336  0.072  0.192  2.83
Orosco       Jesse       0.692  0.376  0.078  0.197  3.16
Gossage      Rich        0.658  0.354  0.078  0.204  3.57
Righetti     Dave        0.667  0.317  0.095  0.214  3.46
Stanton      Mike        0.707  0.359  0.099  0.221  3.92
Fingers      Rollie      0.638  0.338  0.094  0.239  2.90
Minton       Greg        0.543  0.330  0.068  0.243  3.10
Jackson      Michael     0.568  0.365  0.067  0.247  3.42
Eckersley    Dennis      0.667  0.338  0.117  0.262  3.50
Hoffman      Trevor      0.688  0.363  0.123  0.274  2.87
Tekulve      Kent        0.585  0.339  0.101  0.291  2.85
McGraw       Tug         0.638  0.436  0.084  0.295  3.14
Jones        Doug        0.694  0.468  0.094  0.295  3.30
Franco       John        0.671  0.330  0.143  0.295  2.62
Smith        Lee         0.603  0.283  0.136  0.298  3.03
Reardon      Jeff        0.549  0.356  0.082  0.298  3.16
McDowell     Roger       0.652  0.371  0.130  0.317  3.30
Jones        Todd        0.621  0.359  0.127  0.327  3.97
Plesac       Dan         0.591  0.385  0.118  0.363  3.64
Timlin       Mike        0.583  0.418  0.121  0.423  3.63
Quisenberry  Dan         0.543  0.392  0.115  0.432  2.76
Hernandez    Roberto     0.569  0.420  0.144  0.492  3.45
Garber       Gene        0.568  0.434  0.146  0.521  3.34
Lavelle      Gary        0.607  0.465  0.191  0.573  2.93
Wagner       Billy       0.436  0.375  0.084  0.580  2.31
Table 3: The table above collects together all $\mathrm{BRT}_1$ data for all pitchers with 350 appearances in high leverage innings. Their corresponding $T_1$, $S_1$, $F_1$ contributing to their $\mathrm{BRT}_1$ are also provided. Each pitcher's career earned run average is also given, from Baseball Reference. The table is sorted in ascending order by $\mathrm{BRT}_1$.

Discussion and Concluding Remarks
Many aspects of baseball strategy assume an ability to approximately judge probabilities in real time. For example, when the third base coach decides whether or not to send a runner home from third base on a sacrifice fly ball, the coach is (perhaps unwittingly) estimating the probability that the runner can beat the throw to home plate, and comparing that to the probability the run will score in some other way. These estimated probabilities may not be accurately calculated, but they are likely approximately correct. They may be influenced by actual calculations and discussions before the game, and they may be adjusted after risks are taken and then reassessed. Therefore trial and experience help improve accuracy. The conventional baserunning strategy, as defined in this paper, tacitly assumes that BRT is equal to 1.

One other method that the baserunner or third base coach could use to estimate the probability that the ball falls in for a hit is to use batting average on balls in play (BABIP) in different contexts. If the ball is hit in some particular situation where the BABIP is greater than the BRT, then the baserunner should run as soon as they determine that situation is occurring. For example, in Fast (2011), the author calculates the BABIP depending on the horizontal angle of contact off of the bat (data from 2008). It is demonstrated that for a particular range of angles off the bat, the BABIP is greater than the aggregate BRT. The estimate for the angle off of the bat can be made on contact, perhaps by the third base coach, and if it is less than the upper bound of that range, then the runner should run immediately.

We could not tell from the statistical record how often the aggressive baserunning strategy is employed. However, it seems to be extremely rare. This suggests that the current practice of major league baseball teams is not approximately correct in certain situations.
By focusing on a shallow hit to right field, with the outfielder playing deep and running very fast, we have suggested a specific situation where it is very likely that current practice is not correct. There may be many other situations where current practice is not approximately correct, but it may be more difficult to make those determinations until people start tracking the data necessary to make that analysis. Suppose a team starts to employ our suggested baserunning strategy. An observer would then keep track of the number of times that the runner does not stop. If a double play results, it counts as a failure. If the ball falls in for a hit, and the observer judges that the runner advanced further by running aggressively, then it is a success. With such data, much more substantial analysis would be possible. In other words, we do not know how to analyze our strategy directly until a team tries it and records its successes.

References
Baseball Reference. 2012. Extracted January 2012.
Retrosheet. 2012. Extracted January 2012.
James Click.
Baseball Between the Numbers, chapter 4–2: When is one run worth more than two? Prospectus Entertainment Ventures LLC, New York, 2006.
Mike Fast. Spinning yarn. 2011.
Tom M. Tango, Mitchel G. Lichtman, and Andrew E. Dolphin. The Book: Playing the Percentages in Baseball. Potomac Books, Inc., Virginia, USA, 2007.
Wayne Winston.