Introducing the Projections

I’ve always been an avid follower of the great work done by people like Chris Mitchell, Clay Davenport, and Dan Szymborski, among many others. Prospect projections and minor league stat translations have been around for a while. A previous iteration of my own projections can even be found in the archives of PitcherList. My current approach attempts to blend these two methods while also folding in some of the recently available public Statcast data from certain minor leagues. The core rate statistics are predicted using park- and league-adjusted stat translations, with some tweaks for each player’s age relative to level, length of previous professional experience, and a handful of other relevant metrics. Those translated rates are then weighted against the player’s output from previous seasons and regressed. This allows for greater predictive value and more meaningful results in smaller samples than simple translations alone.
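To make the weighting and regression step concrete, here is a minimal Python sketch. It is not the actual projection code: the function name, the per-season decay of 0.6, the 200 plate appearance stabilization constant, and the choice to regress toward a league-average rate are all assumptions made for illustration.

```python
def regressed_rate(seasons, league_rate, season_decay=0.6, stabilization_pa=200.0):
    """Blend translated rates from several seasons, then regress the result.

    seasons: list of (translated_rate, plate_appearances), most recent first.
    league_rate: the rate to regress toward (assumed here: league average).
    season_decay, stabilization_pa: hypothetical tuning values, not fitted.
    """
    weighted_sum = 0.0
    weighted_pa = 0.0
    for years_ago, (rate, pa) in enumerate(seasons):
        # Older seasons count for less; larger samples count for more.
        weight = pa * season_decay ** years_ago
        weighted_sum += rate * weight
        weighted_pa += weight
    # Regression: mix in `stabilization_pa` plate appearances of league-average play.
    return (weighted_sum + league_rate * stabilization_pa) / (weighted_pa + stabilization_pa)


# One 200 PA season at a .300 translated rate, regressed toward a .200 league rate.
example = regressed_rate([(0.300, 200)], league_rate=0.200)
```

With these placeholder settings, a player with very few plate appearances lands close to the league rate, while a player with several full seasons is dominated by his own translated performance.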

This post mainly looks to outline some key differences in methodology from other sources. Future posts will likely dive deeper into why I made some of the decisions I did and what statistical backing they have. Starting with the batting projections, one new wrinkle I was excited to implement was exit velocity. The physical impact of exit velocity has been well documented. The challenge is incorporating it into projections, particularly for minor league players, when the data only goes back several years. My approach to this problem is to use a historical proxy that is scaled to match exit velocity and can therefore be swapped out for exit velocity whenever the latter is available. In this instance, median flyball distance and 90th percentile flyball distance serve as the proxies for median exit velocity and 90th percentile exit velocity. These metrics correlate well with each other, though exit velocity can be viewed as the more ‘accurate’ measure of power.
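One simple way to scale a proxy so it can stand in for exit velocity is to match means and standard deviations across players who have both measurements. The sketch below assumes that approach; the actual scaling used in the projections may differ, and the function names and dictionary keys (`ev90`, `fb_dist90`) are hypothetical.

```python
from statistics import mean, pstdev


def fit_proxy_scaler(distances, exit_velocities):
    """Return a function that maps flyball distance onto the exit velocity scale.

    Both inputs come from players who have both measurements. The mapping
    simply matches mean and standard deviation (an assumed, simple choice).
    """
    d_mu, d_sd = mean(distances), pstdev(distances)
    ev_mu, ev_sd = mean(exit_velocities), pstdev(exit_velocities)

    def to_ev_scale(distance):
        return ev_mu + (distance - d_mu) / d_sd * ev_sd

    return to_ev_scale


def power_input(player, to_ev_scale):
    """Use tracked 90th percentile exit velocity when present, else the scaled proxy."""
    ev90 = player.get("ev90")
    if ev90 is not None:
        return ev90
    return to_ev_scale(player["fb_dist90"])


# Toy calibration sample; the numbers are made up for illustration.
scaler = fit_proxy_scaler([300.0, 320.0, 340.0], [100.0, 102.0, 104.0])
tracked = power_input({"ev90": 105.3, "fb_dist90": 330.0}, scaler)
untracked = power_input({"ev90": None, "fb_dist90": 340.0}, scaler)
```

Because the proxy is expressed in exit velocity units, a single downstream model can consume either value without knowing which source it came from.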

On the pitching side, fastball velocity was another key metric I wanted to integrate. In this case, there isn’t a great proxy variable available to replicate the previous technique. What does go back much further than minor league velocity, however, is major league velocity. I produce two distinct sets of models for pitcher projections: one trained with historical major league velocity as an input variable, and one trained without it. Pitchers with tracked minor league velocity are projected using the former set of models, and pitchers without Statcast tracking are projected using the latter.
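The routing between the two sets of models can be sketched in a few lines of Python. The two model functions below are placeholders with made-up coefficients, and the field names are hypothetical; only the dispatch logic reflects the description above.

```python
def model_with_velo(pitcher):
    # Placeholder for a model trained with velocity as an input.
    # The 0.01 per mph coefficient and 93 mph baseline are invented for illustration.
    return pitcher["translated_k_rate"] + 0.01 * (pitcher["fastball_velo"] - 93.0)


def model_without_velo(pitcher):
    # Placeholder for a model trained without velocity.
    return pitcher["translated_k_rate"]


def project_k_rate(pitcher):
    """Pick the model set based on whether tracked velocity exists for the pitcher."""
    if pitcher.get("fastball_velo") is not None:
        return model_with_velo(pitcher)
    return model_without_velo(pitcher)


tracked_pitcher = {"translated_k_rate": 0.25, "fastball_velo": 95.0}
untracked_pitcher = {"translated_k_rate": 0.25, "fastball_velo": None}
with_velo = project_k_rate(tracked_pitcher)
without_velo = project_k_rate(untracked_pitcher)
```

Keeping two separately trained models avoids having to impute a velocity for pitchers who were never tracked.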

The last main upgrade is the introduction of some rudimentary defensive metrics. Fielding opportunities can be estimated using minor league hit coordinates and compared against the defensive outs the player accrued. While this method is crude, it outperforms fielding percentage and is statistically significant in projecting major league defensive value.
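As a rough illustration of the idea, the Python sketch below counts a ball in play as a fielding opportunity when its hit coordinates fall within a fixed radius of the fielder’s nominal position, then compares the outs the player recorded with what a league-average fielder would be expected to convert. The circular zone, the radius, the league out rate, and the function name are all assumptions; the real opportunity estimate may be built quite differently.

```python
from math import hypot


def outs_above_expected(hit_coords, position_xy, zone_radius, outs_recorded, league_out_rate):
    """Estimate opportunities from hit coordinates and compare with outs recorded.

    hit_coords: iterable of (x, y) locations for balls in play while the player fielded.
    position_xy: assumed nominal (x, y) location for the player's position.
    zone_radius: assumed radius that defines an opportunity.
    league_out_rate: assumed share of opportunities an average fielder converts.
    """
    px, py = position_xy
    opportunities = sum(
        1 for x, y in hit_coords if hypot(x - px, y - py) <= zone_radius
    )
    expected_outs = opportunities * league_out_rate
    return opportunities, outs_recorded - expected_outs


# Toy data: two of three balls land inside the zone, and the fielder recorded two outs.
opps, delta = outs_above_expected(
    hit_coords=[(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)],
    position_xy=(0.0, 0.0),
    zone_radius=5.0,
    outs_recorded=2,
    league_out_rate=0.5,
)
```

Unlike fielding percentage, a measure of this shape gives credit for range, because balls a fielder never reached still count as opportunities.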

While these enhancements seem to improve model accuracy, it’s still important to bear in mind their limitations. The models don’t currently have access to anything beyond what players have produced in their minor league seasons. This means that top amateur or international players may be underrated if they don’t perform right away. There’s also value in traditional scouting that shouldn’t be neglected. The value in these projections is to serve as an unbiased, multi-factor evaluation of prospect performance and to identify players who may be overvalued or undervalued by consensus.