
Category Correlation Analysis (2020 Fantasy Baseball)

Standard Roto and H2H Categories scoring uses five hitting categories: R, HR, RBI, AVG, and SB. As sabermetrics has become more mainstream in the baseball community, some leagues have adopted OBP as a sixth category or as a replacement for AVG. Unlike points leagues, where you do not care where the production comes from, categories-based leagues make every stat equally important. Intuitively, this means you want to draft a balanced roster so you can compete in every category. So which individual stat should you value most on draft day to help you compete in every category? Is there a stat that matters less than the others?


To answer these questions, I decided to assess how each stat correlates with the others. Correlation is a statistical measure of how closely two things move together. A positive correlation means that as one thing increases, the other tends to increase. A negative correlation means that as one thing increases, the other tends to decrease. The correlation coefficient ranges from -1 to 1, with ±1 indicating a perfect linear relationship and 0 indicating no linear relationship.
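For readers who want to see the arithmetic, here is a minimal sketch of the Pearson correlation coefficient, which is the usual meaning of "correlation" (the article does not say which tool was used to compute it). The function name `pearson` and the toy numbers are assumptions for illustration, not player data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: how much the two sequences vary together around their means
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: how much each sequence varies on its own
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return co_variation / math.sqrt(var_x * var_y)

# Toy numbers (hypothetical, not real stats)
up = pearson([1, 2, 3, 4], [2, 4, 6, 8])    # both rise together
down = pearson([1, 2, 3, 4], [8, 6, 4, 2])  # one rises while the other falls
print(f"{up:+.2f} {down:+.2f}")
```

The first pair lands at the positive end of the scale and the second at the negative end; real baseball stats fall somewhere in between.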

Now for the analysis. I took the 2019 hitting stats for every player who had 300 or more plate appearances (273 hitters) and calculated the correlations between the six most popular fantasy scoring categories (R, HR, RBI, AVG, OBP, SB). Below is a chart of these correlations.

The highest correlation was between HR/RBI (seen below, right). A 0.86 correlation is a strong indication that the players who were good at HR were good at RBI too, and vice versa. If you know anything about baseball, this should make sense, since hitting a HR means you also get at least one RBI. In the same sense, hitting a HR means you also score a run, which is why the correlation between R/HR is a strong 0.74.


The correlation between AVG/OBP is 0.70, which makes sense since both stats are driven largely by hits; OBP differs by also crediting walks and hit by pitches and by counting sac flies as opportunities. The correlation between R/OBP is 0.62, which also makes sense since you need to, ya know, get on base to score a run.
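To make the overlap between the two stats concrete, here are the standard formulas for AVG and OBP as small functions. The stat line is a hypothetical hitter invented for illustration.

```python
def batting_average(h, ab):
    """AVG = hits / at-bats."""
    return h / ab

def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)

# Hypothetical stat line (not a real player)
line = {"H": 150, "BB": 60, "HBP": 5, "AB": 500, "SF": 5}

avg = batting_average(line["H"], line["AB"])
obp = on_base_percentage(line["H"], line["BB"], line["HBP"], line["AB"], line["SF"])
print(f"AVG {avg:.3f}  OBP {obp:.3f}")  # AVG 0.300  OBP 0.377
```

Hits appear in both numerators, which is why the two stats track each other; walks and hit by pitches are what pull them apart.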


The noteworthy takeaway from this analysis is how low the correlations between SB and the other categories are. A correlation below 0.2 indicates that there is almost no linear relationship between the two stats. The players who are good at SB are generally not the same players who are good at HR, RBI, AVG, and OBP. That does not mean they are necessarily bad at those categories, but the skills needed to be good at SB are not the skills needed to be good at RBI, for example (seen below, left). The one stat that correlates with SB better than the rest is R. This makes sense because a SB advances you to 2nd or 3rd base, which is considered scoring position.

To make sure this sample (273 hitters with 300 or more plate appearances) is representative and that 2019 was not a fluky year, I compared these findings with the same analysis conducted on all qualified hitters from 2017 to 2019 (419 hitters). The results show that HR/RBI still has the highest correlation, and SB still has most of the weakest correlations with the other categories (seen below). The five highest correlations in 2019 were also the five highest correlations from 2017 to 2019. Both samples show that R is involved in four of the six highest correlations. Thus, the 2019 findings should generally be applicable to future years.
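One simple way to run this kind of consistency check is to rank the category pairs in each sample and see how many of the top pairs overlap. The sketch below assumes the correlations are already stored in dictionaries; the helper `top_pairs` and every number in it are hypothetical placeholders, not the article's results.

```python
def top_pairs(correlations, n):
    """Return the n category pairs with the highest correlation, as a set."""
    ranked = sorted(correlations, key=correlations.get, reverse=True)
    return set(ranked[:n])

# Placeholder values for two samples (hypothetical, for illustration only)
single_season = {
    ("HR", "RBI"): 0.80, ("R", "HR"): 0.65, ("AVG", "OBP"): 0.60,
    ("R", "OBP"): 0.50, ("R", "SB"): 0.25, ("HR", "SB"): -0.05,
}
multi_season = {
    ("HR", "RBI"): 0.78, ("R", "HR"): 0.58, ("AVG", "OBP"): 0.62,
    ("R", "OBP"): 0.48, ("R", "SB"): 0.20, ("HR", "SB"): -0.02,
}

# Pairs that rank in the top 3 of both samples
shared = top_pairs(single_season, 3) & top_pairs(multi_season, 3)
print(sorted(shared))
```

Comparing sets ignores the order within the top group, so two samples can agree on which pairs are strongest even if their exact ranking differs slightly.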

The next step in this analysis is to look at the mean (average) correlation of each scoring category. For example, in leagues that use AVG and not OBP, the mean correlation for R is the average of the R/HR, R/RBI, R/AVG, and R/SB correlations. The mean correlation for R in AVG leagues is 0.59, which indicates that players who are good at R are, to a moderate degree, the same players who are good at the other categories. The mean correlation for SB in AVG leagues is 0.12, which indicates that players who are good at SB are mostly not the same players who are good at the rest of the categories. The mean correlations for R, RBI, HR, and OBP in OBP leagues are all higher than the mean correlations for R, RBI, HR, and AVG in AVG leagues. Fans of Moneyball-style sabermetrics should enjoy this finding, as it suggests that players who are good at OBP are generally better at the other stats than players who are good at AVG. In other words, having a good OBP is more desirable than having a good AVG. Although both are very weak, the correlation between SB/AVG is higher than the correlation between SB/OBP. This is also in line with the Moneyball philosophy of not stealing bases or bunting, because the most valuable thing is getting on base and conserving outs.
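The mean correlation described above is just an average over the pairs that involve one category. Here is a minimal sketch for an AVG league; the function name `mean_correlation` is an assumption, and the pairwise values are round placeholders chosen for easy arithmetic, not the article's figures.

```python
AVG_LEAGUE = ["R", "HR", "RBI", "AVG", "SB"]

# Placeholder pairwise correlations (hypothetical, NOT the article's values)
correlations = {
    ("R", "HR"): 0.7, ("R", "RBI"): 0.8, ("R", "AVG"): 0.5, ("R", "SB"): 0.3,
    ("HR", "RBI"): 0.9, ("HR", "AVG"): 0.2, ("HR", "SB"): 0.0,
    ("RBI", "AVG"): 0.3, ("RBI", "SB"): 0.0,
    ("AVG", "SB"): 0.1,
}

def mean_correlation(category, league_categories, correlations):
    """Average the correlations between one category and every other league category."""
    values = []
    for other in league_categories:
        if other == category:
            continue
        # Each pair is stored once, so look it up in either order
        r = correlations.get((category, other), correlations.get((other, category)))
        if r is None:
            raise KeyError(f"no correlation stored for {category}/{other}")
        values.append(r)
    return sum(values) / len(values)

for cat in AVG_LEAGUE:
    print(f"{cat}: {mean_correlation(cat, AVG_LEAGUE, correlations):.3f}")
```

Swapping `"AVG"` for `"OBP"` in the category list (and supplying the OBP pairs) would give the OBP-league version of the same averages.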

Investing in players who score more runs will help you the most in the other categories. On the other hand, investing in players who steal more bases will help you the least in the other categories. In both AVG and OBP leagues, R is the most correlated category followed by HR, RBI, AVG/OBP, and then lastly SB.


These findings support the strategy of punting SB in a Roto or H2H Categories league. Punting a category means you do not consider or value that stat on draft day or in your auction. By putting all of your resources into players who are better at R, HR, RBI, and AVG/OBP than at SB, you will come away with a team whose best skills are more highly correlated with each other. In an H2H Categories league, you may still be able to compete in SB some weeks by assessing your opponent or adding a free agent SB specialist who is weak in other categories. Punting SB should only be a strategy for the draft or auction. You should still monitor free agency during the year for breakout players who get steals. Although these players may be less valuable to your team, other teams may place a higher value on SB since you upset the natural balance of SB in your league.
