In a future post, we will determine how to select the best possible team by maximizing your team’s projected points and minimizing its downside risk. To do this, we will have to rely on our best guess of how many points each player will score. We will use 2012 projections from ESPN, CBS, and NFL.com and actual fantasy points from Yahoo. Our selected team, however, will only be as good as our projections. Garbage in, garbage out. Thus, it’s crucial to evaluate the accuracy of projections so we know how much confidence to place in them.
A couple of years ago, an article in the New York Times compared the fantasy football projections from ESPN, CBS, and Yahoo. The authors found that Yahoo’s projections were more accurate than CBS’s, which were in turn more accurate than ESPN’s. FantasyPros.com, a website co-founded by one of the article’s authors, has tracked the historical accuracy of various sources of projections since 2009. The website uses this information to infer that some sources of projections are more accurate than others, and that the weight you give each source should depend on its prior performance. Sounds reasonable, right? The problem is that many of these “sources” are just individual so-called experts. Just like mutual fund managers trying to predict the stock market, these “experts” are not reliably able to outperform the average (see, e.g., here and here). That’s why even casual bloggers outperform the experts (see, e.g., here).
If I shouldn’t trust the experts, whom should I trust?
To answer this question, it’s important to consider psychometrics. In classical test theory, any observed score (e.g., a fantasy football projection) is composed of two parts: “true score” (i.e., the signal) and error (i.e., the noise, such as bias on the part of any individual source). One of the easiest ways to maximize the signal-to-noise ratio is to aggregate information. In general, an average or latent variable will be more reliable and valid than the individual sources that compose it (assuming the sources are valid measures of the same thing). In other words, combining projections from various sources allows us to approximate the “true” projection for a player more accurately. In fact, one of the most reliable ways of obtaining accurate measurements in many domains is the wisdom of the crowd (see, e.g., here), where the best guess is the average of many individuals’ responses. Unfortunately, I’m not aware of any sites that apply “wisdom of the crowd” calculations to fantasy football projections (but for rankings, see here and here). As a result, we will compare last year’s projections from ESPN, CBS, and NFL.com to see which was most accurate, and then compare those to the average and latent combinations of the three.
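The noise-canceling effect of aggregation is easy to demonstrate with a quick simulation. Below is a minimal sketch (all numbers are hypothetical, chosen only for illustration): three “sources” each observe a player’s true score plus their own bias and random error, and the simple average of the three recovers the true score with lower error than any single source.

```r
# Simulation: averaging noisy, biased estimates recovers the true score
# better than any single source. All parameters here are illustrative.
set.seed(1)
truePoints <- rnorm(100, mean = 120, sd = 40)  # "true" season points for 100 players

# Three sources: true score + source-specific bias + random noise
source1 <- truePoints + 10 + rnorm(100, sd = 25)
source2 <- truePoints - 5  + rnorm(100, sd = 25)
source3 <- truePoints      + rnorm(100, sd = 25)
crowd   <- (source1 + source2 + source3) / 3

# Root-mean-square error of each estimate against the true score
rmse <- function(est) sqrt(mean((est - truePoints)^2))
sapply(list(source1 = source1, source2 = source2,
            source3 = source3, crowd = crowd), rmse)
```

Because the sources’ errors are (mostly) independent, averaging shrinks the random component by roughly the square root of the number of sources, which is exactly the classical-test-theory argument above.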
If you know of any sites with publicly available projections that aggregate across many sources, let me know.
The R Script
In my prior post, I demonstrated how to calculate projections for your custom league based on various sources of projections. We will now take those projections from last year and examine them in relation to the actual points scored. We will examine three different metrics of prediction accuracy: R-squared (R2), Harrell’s c-index, and the intraclass correlation (ICC). 1) R-squared represents the proportion of variance in the outcome that is explained by the predictor. R-squared is better than the simple Pearson r correlation coefficient when evaluating predictions because R-squared is better able to detect shifts in the data. 2) Harrell’s c-index is a measure of concordance that is equivalent to the area under the curve (AUC) of a receiver operating characteristic (ROC) curve, which represents the tradeoff between a predictor’s sensitivity and specificity. 3) The ICC is commonly used to assess inter-rater reliability. Because the projected and actual points are supposed to measure the same thing on the same metric, we are interested not only in how predictive the projections are, but also in how accurate (i.e., similar in value) they are. We will use the absolute agreement form of the ICC to determine the accuracy of the projections.
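As a sketch of how these metrics can be computed, the block below runs them on simulated projected/actual vectors (the data and variable names are assumptions, not the post’s actual data). R-squared comes from a simple linear model; the c-index is computed by brute force over all pairs of players (packages such as Hmisc, via rcorr.cens, offer an equivalent and faster version); the absolute-agreement ICC is available from psych::ICC, as noted in the comment.

```r
# Illustrative data: 'actual' and 'projected' are simulated stand-ins
# for one source's projections and the realized fantasy points.
set.seed(2)
actual    <- rnorm(150, mean = 100, sd = 40)
projected <- 0.8 * actual + rnorm(150, mean = 20, sd = 25)

# 1) R-squared: proportion of variance in actual points explained
r2 <- summary(lm(actual ~ projected))$r.squared

# 2) Harrell's c-index: proportion of player pairs ranked concordantly
# (no ties here since the data are continuous)
idx <- combn(length(actual), 2)
concordant <- sign(actual[idx[1, ]] - actual[idx[2, ]]) ==
              sign(projected[idx[1, ]] - projected[idx[2, ]])
cIndex <- mean(concordant)

# 3) ICC (absolute agreement): with the psych package installed, e.g.
#    psych::ICC(cbind(projected, actual))
# and read off the two-way, single-rater, absolute-agreement row (ICC2).

c(r2 = r2, cIndex = cIndex)
```

Note that the c-index only rewards getting the *ordering* of players right, whereas the absolute-agreement ICC also penalizes projections that are systematically too high or too low, which is why the post uses all three metrics together.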
Here’s a table of the accuracy of the predictions according to these 3 metrics (top 2 for each metric in blue):
The evidence suggests that CBS’s projections from last year were more accurate than ESPN’s and NFL.com’s. Interestingly, though, simply averaging the projections from ESPN, CBS, and NFL.com gets us nearly as accurate as CBS. The average is not quite as accurate as the latent variable, however, which is nearly as good as the CBS projections (and, in terms of absolute accuracy (ICC), perhaps better). Do you think that CBS is really that much better than the other projections every year? If not, I would recommend using the latent variable. That way we can obtain projections that are about as accurate as the best single source without having to know in advance which source that will be (which is impossible). In other words, save the guesswork, and combine the projections with a latent variable for the most likely accuracy.
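The post doesn’t show how its latent variable is estimated, so here is one common way to do it, sketched on simulated data: fit a one-factor model to the three sources and use each player’s factor score as the latent projection. The data frame and column names (espn, cbs, nfl) are assumptions for illustration; the post’s actual method may differ.

```r
# One-factor latent variable over three simulated projection sources.
# Column names and all parameters are hypothetical.
set.seed(3)
true <- rnorm(200, mean = 110, sd = 35)
proj <- data.frame(espn = true + rnorm(200, sd = 20),
                   cbs  = true + rnorm(200, sd = 15),
                   nfl  = true + rnorm(200, sd = 25))

# Maximum-likelihood factor analysis; regression-method factor scores
fit    <- factanal(proj, factors = 1, scores = "regression")
latent <- as.vector(fit$scores)  # standardized latent projection

# Rescale to the points metric, since absolute agreement (ICC) depends
# on the projections being on the same scale as actual points
latentPts <- latent * sd(rowMeans(proj)) + mean(rowMeans(proj))
head(latentPts)
```

Unlike a simple average, the factor model weights each source by how strongly it loads on the common factor, so a noisier source contributes less to the combined projection.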
Below is a scatterplot of the association between our latent projected points for 2012 and the actual fantasy points scored in 2012. The R-squared of .57 indicates that we are explaining about 57% of the variance in actual fantasy points scored, which suggests that, although fairly accurate, our projections have room for improvement. To improve our projections, we will want to incorporate other sources of projections.
library(ggplot2)

ggplot(data = projectedWithActualPts, aes(x = projectedPtsLatent, y = actualPts)) +
  geom_point() +
  geom_smooth() +
  xlab("Projected Fantasy Football Points") +
  ylab("Actual Fantasy Football Points") +
  ggtitle("Association Between Projected Fantasy Points and Actual Points") +
  annotate("text", x = 80, y = max(projectedWithActualPts$projectedPtsLatent),
           label = paste("R-Squared = ",
                         round(summary(lm(actualPts ~ projectedPtsLatent,
                                          data = projectedWithActualPts))$r.squared, 2),
                         sep = ""))