Statistical Baseball Research Bibliography

Hi everyone – I’ve attached the latest version of my Statistical Baseball Bibliography, along with instructions for using it.  I am in the middle of a major revision, including a revised classification scheme; entries in bold are ones I have not yet had time to reclassify.  I hope you find this interesting and useful.  If any of you need a particular article for your work, let me know and I can probably get it for you.  I will have more material to share with you in a few weeks.




defense, Statcast

The Future Of Defensive Analysis

Statcast Lab: Outfield fielding components of Reaction, Route, Burst, Speed

The current state of defensive metrics is “pretty good” at evaluating who the best defenders are.  There are mixed views on that, but the metrics used to generate the SABR Defensive Index (SDI) have so far largely aligned with Statcast’s OAA on a player-by-player basis.  Baseball Info Solutions’ DRS (for most of this blog, I will just reference DRS, since UZR uses the same data) and RED, based on STATS’ ZR data, both do a good job of describing how well players turn batted balls into outs.

From the information Tom Tango talks about here, we will be able to understand *why* players are better or worse at turning batted balls into outs.  Which players get the best jumps?  Which players take the best routes?  What is the spread of those talents, and how predictable is it?

What I have seen (though not fully studied) is that players come into the league and learn to adjust to the speed of the game, performing around or below average.  In their second, third, and fourth seasons, they use their youth – speed, reflexes, reaction – to maximize the balls they turn into outs, peaking early in runs saved, and then begin the decline toward designated hitter or retirement.

The defensive aging curve is similar in shape to the offensive aging curve, but it peaks a couple of years earlier.  Defense is a younger man’s game because much of it relies on speed and reaction.  As more data rolls out of Statcast at Baseball Savant (née Daren Willman’s site), we’ll be better at understanding how a player succeeds, and probably better at predicting whether he will continue to do so.



The Length of World Series Games

There has recently been a lot of discussion over the length of baseball games. But to really understand the problem or even decide if there is one, we first need to put it in context and examine what’s behind the increase. To that end, I have looked at changes in World Series games over the past 100 years.

In SABR’s 2000 Baseball Research Journal I highlighted changes in pitch counts based on data I found in the Spalding Guide for the 1919 World Series. I have since found pitch count data for the 1916 World Series in The Sporting News. With this information, and wanting to include recent seasons (particularly given the shifts that seem to have occurred after the 2015 All-Star Game), I thought it might be interesting to compare averages over five-year periods. Accordingly, I averaged the World Series data over the years ending in five through nine, including only those seasons where I found complete statistics (using baseball-reference.com). The most recent averages, therefore, cover 2015 to 2017, while 1915 to 1919 includes only 1916 and 1919.

Here’s the evolution of World Series game times over the past century:

Period     Average game time (h:mm)
1915/19    1:56
1975/79    2:41
1985/89    3:05
1995/99    3:18
2005/09    3:31
2015/19    3:34

A century ago a World Series game lasted right around two hours. By the late 1970s it had increased to slightly over two and a half hours. The late 1980s saw World Series games finally break the three-hour mark, on average. Today a World Series game lasts just over three and a half hours. Thus, over the last 100 years, the average time of a World Series game has nearly doubled, increasing by 85% (from 116 minutes to 214).
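The 85% figure can be checked directly from the table of game times. The following is a minimal sketch in Python; the helper name `to_minutes` and the dictionary layout are illustrative choices, not part of the original study.

```python
# Average World Series game times by period, copied from the table above.
game_times = {
    "1915/19": "1:56",
    "1975/79": "2:41",
    "1985/89": "3:05",
    "1995/99": "3:18",
    "2005/09": "3:31",
    "2015/19": "3:34",
}


def to_minutes(hmm: str) -> int:
    """Convert an 'h:mm' string to whole minutes (hypothetical helper)."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


minutes = {period: to_minutes(t) for period, t in game_times.items()}

first = minutes["1915/19"]
last = minutes["2015/19"]
pct_increase = (last - first) / first * 100

print(f"{first} min -> {last} min: {pct_increase:.1f}% increase")
```

Run as written, this gives an increase of about 84.5%, which rounds to the 85% quoted in the text.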

World Series data is not fully representative of the regular season in that teams obviously manage differently in a short, winner-take-all series than over the grind of a regular season. Today, regular season games average closer to three hours than three and a half, but comparing World Series games against each other offers an interesting and valid look at trends over time.

Defining the number of pitches in a game as the “clock,” two possible explanations exist for the increase in game times: more pitches per game and/or fewer pitches per minute. Let’s look at the number of pitches per game first.

Period     Pitches per game
1915/19    116
1975/79    130
1985/89    141
1995/99    149
2005/09    147
2015/19    145

The number of pitches per game has clearly surged from 100 years ago. From the teens of the twentieth century through the late 1990s, the number of pitches per game increased by around 30—roughly 25%—and has held relatively steady since. As an aside, this has interesting connotations when comparing pitcher workloads over time. Assuming workload is closely tied to pitches per game, Corey Kluber or Max Scherzer tossing 7 1/3 innings today is equivalent to a complete game out of Walter Johnson or Grover Cleveland Alexander a century ago.
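The workload comparison is simple proportional arithmetic, sketched below. It rests on an assumption that is not stated in the text: pitches are spread evenly across the nine innings of a game, so a pitch budget converts to innings at a constant rate.

```python
# Pitches per game, copied from the table above.
pitches_then = 116  # 1915/19
pitches_now = 145   # 2015/19

# ASSUMPTION: pitches are spread evenly over nine innings in each era.
pitches_per_inning_now = pitches_now / 9

# Innings a modern pitcher covers with a Deadball Era complete-game pitch count.
equivalent_innings = pitches_then / pitches_per_inning_now

print(f"{pitches_then} pitches today is about {equivalent_innings:.1f} innings")
```

The result is roughly 7.2 innings, in line with the 7 1/3 innings cited above.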

At a macro level, there are only two ways for the number of pitches per game to rise: an increase in the number of batters faced per game and/or an increase in the number of pitches per batter. In fact, as the table below makes clear, both have occurred. As the run scoring environment jumped at the end of the Deadball Era around 1920, a pitcher would have to face more batters to get his three outs in an inning. Over the past 20 years, however, as a smaller share of runs has come from stringing together multiple hits and a larger share from home runs, the number of batters faced per game has come back down.

While the increase in batters faced (BFP) per game is meaningful, most of the increase in pitches per game can be attributed to an increase in pitches per batter. Pitchers have been going deeper into counts with each hitter, highlighted by the recent increase in strikeouts.

Period     BFP/game   BB%     K%      BB+K%
1915/19    36.4       7.4%    9.9%    17.4%
1975/79    37.9       8.1%    14.0%   22.0%
1985/89    38.8       8.6%    17.1%   25.7%
1995/99    39.2       11.5%   17.0%   28.5%
2005/09    38.6       9.4%    20.3%   29.8%
2015/19    37.4       8.1%    22.5%   30.6%

In the World Series over the last three years, just over 30% of plate appearances ended in a walk or a strikeout. Clearly, the average plate appearance ends deeper in the count today than it did thirty years ago and much deeper than it did a century ago.
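One way to see how much of the rise in pitches comes from each source is to divide pitches per game by batters faced per game. The sketch below assumes the two tables are on the same basis (both appear to be per-team figures, but the text does not say so explicitly), and the pitches-per-batter values it produces are derived here rather than taken from the original study.

```python
# Figures copied from the two tables above.
pitches_per_game = {
    "1915/19": 116, "1975/79": 130, "1985/89": 141,
    "1995/99": 149, "2005/09": 147, "2015/19": 145,
}
bfp_per_game = {
    "1915/19": 36.4, "1975/79": 37.9, "1985/89": 38.8,
    "1995/99": 39.2, "2005/09": 38.6, "2015/19": 37.4,
}

# ASSUMPTION: both tables count the same side of the game (one team),
# so their ratio is pitches per batter faced.
pitches_per_batter = {
    period: pitches_per_game[period] / bfp_per_game[period]
    for period in pitches_per_game
}

for period, value in pitches_per_batter.items():
    print(f"{period}: {value:.2f} pitches per batter")

# Split the century-long rise in pitches into its two factors.
first, last = "1915/19", "2015/19"
bfp_growth = bfp_per_game[last] / bfp_per_game[first] - 1
ppb_growth = pitches_per_batter[last] / pitches_per_batter[first] - 1
print(f"BFP per game: {bfp_growth:+.1%}, pitches per batter: {ppb_growth:+.1%}")
```

Under that assumption, batters faced per game rose by under 3% over the century while pitches per batter rose by more than 20%, which is consistent with the claim that most of the increase comes from deeper counts.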

The 25% increase in pitches per game—from both the increase in BFP per game and the number of pitches per batter—does not fully account for the fact that game times have increased by 85%. The second possibility, pitches per minute, shows an even more dramatic shift, highlighted in the table below.

Period     Pitches per minute
1915/19    2.01
1975/79    1.62
1985/89    1.52
1995/99    1.51
2005/09    1.40
2015/19    1.36

Over the last century there has been a significant and steady decrease in the number of pitches per minute during a World Series game. One hundred years ago there were roughly two pitches per minute when averaged over the length of a game. Today this has fallen to 1.36.
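The pitches-per-minute figures can be approximately reproduced from the earlier tables, which also shows how the two explanations combine. The sketch below assumes the pitches-per-game table is a per-team figure, so a full game contains twice as many pitches; that reading is an inference from the numbers, not something stated in the text.

```python
# Figures copied from the tables in this post.
game_time = {
    "1915/19": "1:56", "1975/79": "2:41", "1985/89": "3:05",
    "1995/99": "3:18", "2005/09": "3:31", "2015/19": "3:34",
}
pitches_per_game = {
    "1915/19": 116, "1975/79": 130, "1985/89": 141,
    "1995/99": 149, "2005/09": 147, "2015/19": 145,
}
published_rate = {
    "1915/19": 2.01, "1975/79": 1.62, "1985/89": 1.52,
    "1995/99": 1.51, "2005/09": 1.40, "2015/19": 1.36,
}


def to_minutes(hmm: str) -> int:
    """Convert an 'h:mm' string to whole minutes (hypothetical helper)."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


# ASSUMPTION: pitches per game is per team, so both teams together
# throw twice that number over the length of the game.
implied_rate = {
    period: 2 * pitches_per_game[period] / to_minutes(game_time[period])
    for period in game_time
}

for period in game_time:
    print(f"{period}: implied {implied_rate[period]:.2f}, "
          f"published {published_rate[period]:.2f}")

# Game time = pitches / pace, so the growth in time is the product of
# the growth in pitches and the slowdown in pace.
first, last = "1915/19", "2015/19"
pitch_ratio = pitches_per_game[last] / pitches_per_game[first]
pace_ratio = implied_rate[first] / implied_rate[last]
time_ratio = to_minutes(game_time[last]) / to_minutes(game_time[first])
print(f"pitches x{pitch_ratio:.2f} * pace x{pace_ratio:.2f} = time x{time_ratio:.2f}")
```

The implied rates land within about 0.01 of the published ones, with the small gaps presumably due to rounding in the source tables. The final line shows the decomposition: 25% more pitches combined with a pace that is nearly 50% slower yields the roughly 85% longer game.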

In sum, game lengths have expanded as pitchers have gone deeper into counts and the time between pitches and innings has risen. Quantifying these causes helps provide a framework for how we might roll back game lengths without affecting their watchability or integrity. Reducing the number of pitches per game is likely a much more difficult and intrusive challenge than reducing the time between pitches. Any change to the number of strikeouts and walks will require a fundamental change in the way both batter and pitcher now approach their matchup. The huge number of strikeouts has been receiving a large amount of scrutiny recently due to the negative aesthetic of pitches not being put in play. Any rule change that transforms the current approach and leads to a decrease in strikeouts in favor of balls being put in play will likely also decrease the times of games, at least at the margins.

My sense is that there is more low-hanging fruit on the pitches-per-minute front. Limiting mound visits this year was one relatively easy action that should be having at least a marginal impact. There are many other possibilities that have been floated as well, such as a pitch clock, limiting time between innings, and regulating the batter’s ability to step in and out of the batter’s box, as well as more radical ideas.

A century ago a fan’s time commitment for a World Series game resembled what now might be required for a college basketball game. A World Series game today requires a time commitment closer to a college football game. While that might be acceptable for a short, highly intense series, having a 162 game season with game lengths approaching the time required to play out the spectacle of a college football Saturday—without the requisite increase in on-field action—is a recipe for a shrinking interest in our national game.



Welcome to the blog for SABR’s Statistical Analysis Committee.  Although our members have published content for decades in a variety of fora, this blog should provide an additional outlet for our members, one which will result in more timely and communal feedback.  Buckle up.