In prior posts, I have shown how to download fantasy football projections from ESPN, CBS, and NFL.com. In this post, I will demonstrate how to take the projected points from these sources and calculate the projected points for your custom league given your league settings. Calculating players’ projected points in your league will be important for picking the ideal team for your league.
The R Script
The R script for calculating custom fantasy football projections for your league is located at:
In the first portion of the league settings script, we define (and can modify) your league settings. Here are the settings for my fantasy league:
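As a rough illustration, league settings can be stored as one multiplier per statistical category. The variable names and values below are assumptions chosen for this example (a fairly typical standard-scoring setup), not necessarily the contents of the script; substitute your own league's scoring.

```r
# Hypothetical league settings: points awarded per unit of each stat.
# Names and values are illustrative assumptions; edit to match your league.
passYdsMultiplier <- 1/25   # 1 point per 25 passing yards
passTdsMultiplier <- 4      # 4 points per passing touchdown
passIntMultiplier <- -2     # -2 points per interception thrown
rushYdsMultiplier <- 1/10   # 1 point per 10 rushing yards
rushTdsMultiplier <- 6      # 6 points per rushing touchdown
recMultiplier     <- 0      # 0 points per reception (non-PPR league)
recYdsMultiplier  <- 1/10   # 1 point per 10 receiving yards
recTdsMultiplier  <- 6      # 6 points per receiving touchdown
twoPtsMultiplier  <- 2      # 2 points per two-point conversion
fumblesMultiplier <- -2     # -2 points per fumble lost
```

Keeping every multiplier in one place means a change to the league's scoring only has to be made once.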
Calculating projected points for each source
We then take the projected stats for each of the categories above (e.g., passing yards, rushing yards) from each source and multiply them by the multiplier defined for your league above. Here are the calculations:
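A minimal sketch of this step, using a made-up two-player data frame and assumed multipliers (the player names, stat values, and the `multipliers` vector are all hypothetical):

```r
# Toy projections from a single source (values are invented for illustration)
projections <- data.frame(
  name    = c("QB A", "RB B"),
  passYds = c(4000, 0),
  passTds = c(30, 0),
  rushYds = c(100, 1200),
  rushTds = c(1, 10)
)

# Assumed league multipliers, keyed by stat name
multipliers <- c(passYds = 1/25, passTds = 4, rushYds = 1/10, rushTds = 6)

# Create one "<stat>Pts" column per category: projected stat * multiplier
for (stat in names(multipliers)) {
  projections[[paste0(stat, "Pts")]] <- projections[[stat]] * multipliers[[stat]]
}

projections[, c("name", "passYdsPts", "passTdsPts", "rushYdsPts", "rushTdsPts")]
```

Each new column holds the points a player is projected to earn from that one category.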
The projected points for a given source are the linear (additive) combination of these point categories. In other words, we add up the projected points from each of the above categories:
# Sum the points from every category; na.rm = TRUE treats missing categories as 0
projections$projectedPts <- rowSums(projections[, c("passYdsPts", "passTdsPts", "passIntPts",
                                                    "rushYdsPts", "rushTdsPts",
                                                    "recPts", "recYdsPts", "recTdsPts",
                                                    "twoPtsPts", "fumblesPts")], na.rm = TRUE)
We complete each of these steps for each source (ESPN, CBS, NFL.com, etc.). Once we have the projected points for each source, we have several options: 1) We can compute a simple average across the sites’ projections. 2) We can compute a robust average that is less affected by outliers. 3) We can compute a weighted average that gives more weight to the sources we trust more. 4) We can compute a latent variable that represents the common variance among the sources. I tend not to trust individual sources of projections because they tend not to reliably outperform the average, so I will compute an average, a robust average, and a latent variable (but see here if you want an example of a weighted average of fantasy projections).
Average across sources
To calculate an average of projections across sources, we first calculate an average of projected statistics for each of the categories across sources:
Then we multiply the projected stats categories by the multiplier defined by our league settings:
The average projected points across sources is the sum of the points across categories (that have been averaged across sources):
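The three steps above can be sketched as follows. The data frames, stat values, and multipliers are invented for the example, and the sketch assumes the rows of each source are already aligned so that row i is the same player everywhere.

```r
# Toy projections from three sources (invented values; rows aligned by player)
espn <- data.frame(name = c("QB A", "RB B"), passYds = c(4000, 0), passTds = c(30, 0),  rushYds = c(100, 1200))
cbs  <- data.frame(name = c("QB A", "RB B"), passYds = c(4200, 0), passTds = c(28, 0),  rushYds = c(80, 1100))
nfl  <- data.frame(name = c("QB A", "RB B"), passYds = c(3800, 0), passTds = c(32, NA), rushYds = c(120, 1300))

# Assumed league multipliers
multipliers <- c(passYds = 1/25, passTds = 4, rushYds = 1/10)
stats <- names(multipliers)

avgProjections <- data.frame(name = espn$name)
for (stat in stats) {
  # Step 1: average each stat across sources, ignoring missing values
  avgProjections[[stat]] <- rowMeans(cbind(espn[[stat]], cbs[[stat]], nfl[[stat]]), na.rm = TRUE)
  # Step 2: convert the averaged stat to points
  avgProjections[[paste0(stat, "Pts")]] <- avgProjections[[stat]] * multipliers[[stat]]
}

# Step 3: sum the category points to get average projected points
avgProjections$projectedPts <- rowSums(avgProjections[, paste0(stats, "Pts")], na.rm = TRUE)
avgProjections[, c("name", "projectedPts")]
```

With real data the sources would first need to be merged on a player identifier, since sites rarely list players in the same order.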
Robust average (Hodges-Lehmann)
Means of projections are not ideal because they can be affected by extreme values called outliers. In other words, if most sources have Tom Brady scoring 300 points, and a crazy source has Tom Brady scoring 50 points, the average would be greatly affected by the source predicting 50 points. In order to calculate a more meaningful average, we calculate a robust average that is not as affected by outliers. We use the Hodges-Lehmann estimator, which is the median of all pairwise means, and is a continuous version of the median. Here’s how we calculate the robust average of all statistical categories across sources:
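One way to compute this estimator in base R is shown below, as a sketch rather than the exact code from the script. The helper name `hodges_lehmann` is made up for this example, and it uses the common definition in which each value is also paired with itself.

```r
# Hodges-Lehmann estimator: the median of all pairwise means
# (each value is also paired with itself; NAs are dropped)
hodges_lehmann <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_real_)
  pairMeans <- outer(x, x, "+") / 2                   # matrix of all pairwise means
  median(pairMeans[upper.tri(pairMeans, diag = TRUE)]) # keep each pair once
}

# Three sources near 300 points and one outlier at 50
bradyPts <- c(300, 310, 290, 50)
mean(bradyPts)            # 237.5, pulled down by the outlier
hodges_lehmann(bradyPts)  # 292.5, close to the consensus
```

To apply it to every player and category, the same function can be run row-wise over the sources, for example with `apply(cbind(espn$passYds, cbs$passYds, nfl$passYds), 1, hodges_lehmann)`, assuming the rows are aligned by player.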
Then we can calculate projected points like we did above: first multiplying each category by its multiplier and then taking the sum of points across categories.
Latent variable
Latent variables are helpful for calculating an unobserved variable based on the common variance among various indicator variables. Latent variables tend to have stronger psychometric properties than simple average variables because they retain the common variance (thought to be “true” variance) and discard the unique variance (i.e., measurement error). Examination of the correlations among the various sources suggests that they are highly correlated (rs > .89), suggesting that they are measuring the same thing, and that they can be combined in a latent variable.
# One-factor model of the three sources, with Bartlett factor scores
factor.analysis <- factanal(~ espn + cbs + nfl, factors = 1, scores = "Bartlett", data = projections)
factor.analysis$scores
We want to know each player’s value on the latent metric, but the factor scores are standardized with a mean of 0 and a standard deviation of 1. As a result, we have to rescale the values so that they are a meaningful representation of the players’ projected points. To do that, we rescale the latent factor scores to the same distribution (Weibull) as the average projections. Then we rescale the distribution to have the same range as the average projections:
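The sketch below illustrates only the second step, the range rescaling, using a simple min-max transformation. The Weibull step is not shown, the helper name `rescale_range` is made up for this example, and the input values are invented.

```r
# Linearly map x onto the range of target (min-max rescaling).
# Assumes x has at least two distinct non-missing values.
rescale_range <- function(x, target) {
  xMin <- min(x, na.rm = TRUE)
  xMax <- max(x, na.rm = TRUE)
  tMin <- min(target, na.rm = TRUE)
  tMax <- max(target, na.rm = TRUE)
  (x - xMin) / (xMax - xMin) * (tMax - tMin) + tMin
}

# Toy standardized factor scores and toy average projections
factorScores <- c(-1, 0, 2)
avgPts       <- c(50, 120, 290)

latentPts <- rescale_range(factorScores, avgPts)
latentPts   # 50, 130, 290: same range as avgPts, same ordering as factorScores
```

A linear rescaling like this preserves the ordering and relative spacing of the factor scores while putting them back on the points scale.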
That’s it! We now have projections for our league from ESPN, CBS, and NFL.com, in addition to the average and robust average among their projections, and the latent combination of the three. Here’s a density plot showing the similarities among the distributions of projected points from the three different sources:
library(ggplot2)
# Overlaid density curves of projected points, one per source
ggplot(densityData, aes(x = pointDensity, fill = sourceDensity)) +
  geom_density(alpha = .3) +
  xlab("Player's Projected Points") +
  ggtitle("Density Plot of Projected Points") +
  theme(legend.title = element_blank())