In baseball today, the importance of player development is no longer debatable. Everyone has rallied around the necessity of impactful training. Instead, battles are now erupting over which skills are crucial to develop in young players.
Pitchers are the focal point of these debates, as new-school biometric training facilities have revolutionized the process by concentrating on improving spin, mechanics, and velocity. This strategy has been met with harsh counterclaims from old-school coaches who emphasize repeatable mechanics, control, and pitch sequencing over the “glamour” skills that the newer biometric companies covet.
This battle has been waged in MLB front offices for years, and at this point, the new training seems to be winning. This past off-season, biometric data-driven coaches were brought in by numerous franchises. These guys are touted as the future of player development. But are the new skills they focus on any more strongly correlated with MLB success than the old-school skills? In this analysis, we will dive into the data and find out which underlying skills are the most crucial to big league success and whether these new-school guys are in fact correct.
As you can probably guess, each individual skill will have a very low correlation to overall success. That’s because pitching is a complex conglomerate of skills that can be combined in myriad ways and still be very successful. There is no perfect mix. With that being said, when you compare each skill, you can still see which ones as a whole contribute most to the overall package and are, in general, the most crucial. The skills I chose to test are as follows, generically grouped into old school and new school:
Old School
Ability to locate pitches: Measured by BP’s CMD Metric
Ability to change speeds: Measured by velocity drop off between primary fastball and off-speed
A mixed arsenal: Measured by breaking ball, off-speed, and fastball percentages of pitches thrown.
New School
Fastball Velocity: Measured by highest average fastball velocity
Fastball Spin Rate: Measured by average RPM for 4 primary fastball
Slider Spin Rate: Measured by average RPM for all pitchers who threw the offering at least 5% of the time
Curveball Spin Rate: Measured by average RPM for all pitchers who threw the offering at least 5% of the time
The first step was isolating each variable’s effect on success. I quantified success as MLB ERA instead of more pitcher-isolating metrics like FIP. I did this for two reasons. The first is that FIP and its follower stats tend to be biased toward pitchers who lean towards the new-school approach. This is due to the old-school approach’s emphasis on generating weak contact, which FIP factors out completely. Using ERA puts both skillsets on relatively the same playing field, even though it may in turn give pitchers too much credit for weak contact, slightly helping out the old school. Second, I wanted to measure the most basic definition of pitcher success: limiting runs. This keeps it simple and is easily understood by the general public.
After running simple linear regressions for each of my 9 variables (I split arsenal into 3 parts for the regressions), it became clear that some of the variables had zero or almost zero impact on ERA when isolated. These included some that I assumed beforehand, like breaking ball percentage thrown and changeup percentage thrown, but also more interesting discoveries, including command and ability to change speeds. Here are the plots:
Both of these highly touted old-school skills failed the correlation test, with correlation coefficients of basically zero and flat slopes, meaning that as they get better, ERA doesn’t follow suit. This might have to do partially with sampling bias, as only the last two seasons of Statcast data are publicly available. This limits my sample to the current game, which has trended away from maximizing these skills. Even so, this level of separation from ERA is very noteworthy and should be taken into account. The data is not saying these skills are entirely unimportant, as they are good auxiliary skills, but just that they alone aren’t enough to drive success.
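For readers who want to try this kind of single-variable test themselves, here is a minimal sketch of a simple linear regression of ERA on one skill at a time, reporting the slope and R^2. It is not the code behind this article: the function name `simple_regression`, the column names, and all of the numbers are assumptions, and the data is randomly generated purely so the example runs.

```python
import numpy as np


def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # polyfit with degree 1 returns coefficients highest power first.
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation around the mean
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic placeholder data (hypothetical, NOT the article's sample).
rng = np.random.default_rng(0)
n = 300
velocity = rng.normal(93.0, 2.5, n)   # made-up fastball velocities in MPH
command = rng.normal(50.0, 10.0, n)   # made-up command scores
# ERA is built to depend weakly on velocity and not at all on command.
era = 4.2 - 0.12 * (velocity - 93.0) + rng.normal(0.0, 1.0, n)

results = {}
for name, skill in {"fastball_velocity": velocity, "command": command}.items():
    results[name] = simple_regression(skill, era)
    slope, intercept, r2 = results[name]
    print(f"{name}: slope={slope:.3f}, R^2={r2:.3f}")
```

A flat slope with an R^2 near zero is what "failing the correlation test" looks like here, while a negative slope means ERA tends to fall as the skill improves.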
Next, I will delve into which skills my analysis found most predictive of MLB success. These were, as new-school advocates are already aware, fastball velocity and fastball spin rate, followed by slider spin.
Each of these skills may still seem to have a small R^2, at .08, .078, and .023 respectively, but when isolating a single trait, the first two are about as good as you can ask for. Just like you wouldn’t have a shot at telling me a player’s ERA if you just knew he threw 93 MPH, the computer can’t really tell, either. But the computer does have a better shot at it with those two skills than with any other I measured, by a wide margin. Another important finding about these three skills is that they each have relatively steep negative slopes, meaning that as they increase, ERA falls. Velocity and spin rate have been buzzwords in baseball for years now, and this is just more backing for them to gain further influence in the future.
Now that we’ve gotten through the breakdown of methodology and explanation of my backing, here’s the ranking of each skill, from most important to least important based on their correlation and slope:
- Fastball Spin Rate (pushed ahead by a slightly steeper slope)
- Fastball Velocity
- Slider Spin Rate
- Fastball Percentage Thrown
- Curveball Spin Rate
- Command
- Fastball-Changeup Velocity Delta
- Breaking Ball Percentage Thrown
- Changeup Percentage Thrown
It’s worth noting that after Slider Spin Rate, all other variables have basically zero effect by themselves.
As you can probably tell from the individual skills analysis, old school seems to be at a clear disadvantage. But as its advocates always like to preach, it’s the total package that makes a player. To account for this, I used the new/old skill groupings listed above and ran a multiple linear regression for each. This basically means the computer took each group’s variables into account together and measured the relationship between them and ERA. The results were, well, not that surprising: old school got crushed again. The correlation coefficient for the old-school group was tiny for a multi-variable model at .024, about the same as slider spin rate alone. The predicted-value scatter plot also shows this, as the computer had no idea how to place anything and just threw everything around the mean to hedge its losses.
The new school group fared much better, having a pretty strong correlation, all things considered, at .115. The plot also showed this with a more accurate, spread out distribution and less severe errors.
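The group comparison can be sketched the same way. The snippet below is an illustrative assumption rather than the code used for this piece: `multiple_regression` is a hypothetical helper, the variable names are invented, and the data is synthetic. It fits one multiple linear regression per skill group with ordinary least squares and reports each group's R^2.

```python
import numpy as np


def multiple_regression(features, y):
    """Ordinary least squares of y on several features plus an intercept.

    Returns (coefficients, predicted_values, r_squared); the first
    coefficient is the intercept.
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(f, dtype=float) for f in features]
    X = np.column_stack(columns)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    predicted = X @ coefs
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coefs, predicted, 1.0 - ss_res / ss_tot


# Synthetic placeholder data (hypothetical, NOT the article's sample).
rng = np.random.default_rng(1)
n = 300
velocity = rng.normal(93.0, 2.5, n)       # made-up fastball velocity, MPH
fb_spin = rng.normal(2250.0, 150.0, n)    # made-up fastball spin, RPM
command = rng.normal(50.0, 10.0, n)       # made-up command score
velo_gap = rng.normal(8.0, 2.0, n)        # made-up fastball-changeup gap, MPH
era = (4.2 - 0.10 * (velocity - 93.0)
       - 0.001 * (fb_spin - 2250.0)
       + rng.normal(0.0, 1.0, n))

groups = {
    "new_school": [velocity, fb_spin],
    "old_school": [command, velo_gap],
}
fits = {name: multiple_regression(cols, era) for name, cols in groups.items()}
for name, (coefs, predicted, r2) in fits.items():
    print(f"{name}: R^2={r2:.3f}, spread of predictions={predicted.std():.3f}")
```

In a least-squares fit with an intercept, the variance of the predicted values equals R^2 times the variance of the actual values, which is why a low-R^2 model ends up placing nearly every pitcher close to the mean ERA.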
The combination of the individual skill evaluations and the group results clearly shows that the new-school training regimes are focusing on the more data-backed skills. This finding is no surprise, as one of their main selling points is embracing data and implementing it in a useful way. While this work does show the success of these traits in MLB, what it doesn’t take into account is whether these skills can be taught and whether they contribute to increased injury risk, two big complaints from skeptics. These might be covered in later pieces, but I thought them important to mention here as well.
This analysis may seem to completely write off the skills of command, changing speeds, and mixing your pitches that old-school baseball loves to glorify. But these skills absolutely have their place. As secondary skills, they are needed along with the other abilities, but in general, they can’t hold up on their own. Maybe some guys can get by with just location and changing speeds, but if you are forced to choose one of the two sets of skills, the data shows you should pick new school.
Fastball velocity and pitch spin have been the main drivers of success. If you can’t hit the broad side of a barn with your pitches, they’re obviously a moot point. But studies like mine have repeatedly shown they are crucial to pitching in today’s game, so they should be a focus of player development.
Wonderful research. A couple things I’d love to see you add to the analysis: 1) add interaction effects. Your regressions are isolating the effects of each variable right now and then asking how knowing each thing on its own improves our ability to predict ERA. BUT, it may be that command on its own doesn’t do much, and changing speeds on its own doesn’t do much, but being able to do both has a multiplicative effect rather than additive. An interaction term would capture that. Now, this may still put new school on top as an interaction between velocity and spin rate may be very highly predictive. But the analysis would be informative; 2) combine new school and old school. Especially with interactions. What if fastball spin rate becomes super predictive of ERA when interacted with command or ability to change speeds? It may be that a combo of old and new school is best, and it would be interesting to see what the data says about what combos are best.
Thanks for your research!