Scraping Fantasy Football Projections from the Web
In this post, I show how to download fantasy football projections from the web using R. In prior posts, I showed how to scrape projections from ESPN, CBS, NFL.com, and FantasyPros. In this post, I compile the R scripts for scraping projections from these sites, in addition to the following sites: Accuscore, Fantasy Football Nerd, FantasySharks, FFtoday, Footballguys, FOX Sports, WalterFootball, and Yahoo.
Why Scrape Projections?
Scraping projections from multiple sources on the web allows us to automate importing the projections with a simple script. Automation makes importing more efficient so we don’t have to manually download the projections whenever they’re updated. Once we import all of the projections, there’s a lot we can do with them, like:
- Examine historical accuracy of projections
- Determine who has the most accurate projections
- Calculate projections for your league
- Calculate players’ risk levels
- Calculate players’ value over replacement
- Identify sleepers
- Calculate the highest value you should bid on a player in an auction draft
- Draft the best starting lineup
- Win your auction draft
- Win your snake draft
The R Scripts
To scrape the projections from the websites, I use the readHTMLTable function from the XML package in R. Here’s an example of how to scrape projections from FantasyPros:
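As a minimal sketch of the readHTMLTable approach (not the original FantasyPros script), the code below parses an HTML table into a data frame. It assumes the XML package is installed. The inline HTML string, the player names, the column names, and the numbers are all invented stand-ins for a real projections page; for a live page you would pass the page's URL or a downloaded file instead of the inline string.

```r
library(XML)  # provides htmlParse() and readHTMLTable()

# Stand-in for a downloaded projections page; players and numbers are invented
html <- '
<html><body>
<table id="data">
  <thead><tr><th>Player</th><th>PassYds</th><th>PassTDs</th></tr></thead>
  <tbody>
    <tr><td>Player A</td><td>4500</td><td>32</td></tr>
    <tr><td>Player B</td><td>4100</td><td>28</td></tr>
  </tbody>
</table>
</body></html>'

# Parse the HTML text, then convert every <table> into a data frame
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)

# Keep the first table and convert the stat columns from text to numbers
qb <- tables[[1]]
qb$PassYds <- as.numeric(qb$PassYds)
qb$PassTDs <- as.numeric(qb$PassTDs)

print(qb)
```

readHTMLTable returns one data frame per table on the page, so on a real site you typically need to inspect the list to find which element holds the projections.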
The R Scripts for scraping the different sources are located below (note that updated scripts are available via our ffanalytics R package):
- Accuscore
- CBS – Jamey Eisenberg
- CBS – Dave Richard
- CBS – Average
- ESPN
- Fantasy Football Nerd
- FantasyPros
- FantasySharks
- FFtoday
- Footballguys – David Dodds
- Footballguys – Bob Henry
- Footballguys – Maurile Tremblay
- Footballguys – Jason Wood
- FOX Sports
- NFL.com
- WalterFootball
- Yahoo
Density Plot
Below is a density plot of the projections from the different sources:
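A plot of this kind can be sketched in base R as follows. This is only an illustration under stated assumptions: the three source names are hypothetical and the points are simulated, not real projections, and the original plot may have been built with different tools.

```r
# Simulated projected points for three hypothetical sources (not real data)
set.seed(1)
projections <- data.frame(
  source = rep(c("SourceA", "SourceB", "SourceC"), each = 100),
  points = c(rgamma(100, shape = 2, scale = 50),
             rgamma(100, shape = 2, scale = 55),
             rgamma(100, shape = 2, scale = 45))
)

# One kernel density estimate per source
dens <- lapply(split(projections$points, projections$source), density)

# Shared axis limits so all curves fit on the same plot
x_range <- range(sapply(dens, function(d) range(d$x)))
y_max <- max(sapply(dens, function(d) max(d$y)))

# Empty canvas, then one line per source
plot(NULL, xlim = x_range, ylim = c(0, y_max),
     xlab = "Projected points", ylab = "Density",
     main = "Density of projected points by source")
for (i in seq_along(dens)) {
  lines(dens[[i]], col = i)
}
legend("topright", legend = names(dens), col = seq_along(dens), lty = 1)
```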
Conclusion
Scraping projections from the web is fast, easy, and automated with R. Once you’ve downloaded the projections, there’s so much you can do with the data to help you win your league! Let me know in the comments if there are other sources you want included (please provide a link).
Could you include Footballguys in the projections?
Thanks for the suggestion, Mike! Just added scripts to scrape projections from Footballguys. I also updated the draft apps to include them in the calculations. Cheers!
-Isaac
Hi Isaac,
Which site has the best accuracy rate for their draft package: Footballguys, FantasyPros, or a different site? Thanks for your help!
See here:
https://fantasyfootballanalytics.net/2014/06/best-fantasy-football-projections-2014.html
Hey, Thanks a lot for posting your scripts. I’m working on building a draft tool in R and am stuck on what to do for QB combination logic. For example, the combination of QB12 and QB14 if played correctly could potentially give you top 5 production. I’m in data analytics but I’m no stats expert, so I’d be very curious to hear how you would approach that.
Hey Brian,
Two principles come to mind:
1) Draft starters first and bench players last.
2) Draft QBs after RBs, WRs, and TEs.
In greater detail:
1) Draft starters first and bench players last. A Harvard analysis showed that you should spend your resources (auction dollars or early picks in a snake draft) on starters (http://harvardsportsanalysis.files.wordpress.com/2012/04/fantasyfootballdraftanalysis1.pdf). You generally shouldn’t draft bench players until filling out your starting lineup because bench players only contribute if they outperform your starters. For more info, see here:
https://fantasyfootballanalytics.net/2013/06/win-your-fantasy-football-auction-draft.html
2) Draft QBs after RBs, WRs, and TEs. You should draft RBs, WRs, and TEs early because their expected points drop exponentially with position rank. QBs’ expected points don’t drop as fast, so you can get good-value QBs later in the draft. For more info, see here:
https://fantasyfootballanalytics.net/2013/07/expected-points-by-position-rank-in-fantasy-football.html
In summary, having QB12 as your starter may be okay, but I wouldn’t draft QB14 until all of your other key starting positions (QB, RB, WR, TE) have been filled.
Hope that helps!
-Isaac
Please add WalterFootball to this. He was #2 on FantasyPros in 2013.
Where can I scrape/download these? They would need to be in a tabular format to be able to parse the data.
Just added WalterFootball projections to our apps. Cheers!
Where can I get the Functions.R file?
https://github.com/isaactpetersen/FantasyFootballAnalyticsR/blob/master/R%20Scripts/Functions/Functions.R
JP and I are getting into these R scripts (mostly JP; he is the Albert Einstein of this generation!). We were both wondering: how accurate are your projections? You should get with FantasyPros and track it. Thanks for all the help!
Hey Leo, this is our first year for calculating projections, so we won’t be able to test our accuracy until next year. Here’s how we tested the accuracy of other projections last year:
https://fantasyfootballanalytics.net/2014/06/best-fantasy-football-projections-2014.html
Good suggestion. I’ll submit our projections next year to FantasyPros.
Thanks!
-Isaac
I was trying to play with the Yahoo projections script to pull the projections from my customized league instead of messing with multipliers. However, I can’t seem to get it to work. Is there an issue because the custom league can only be seen once I login? The error is with all the http links saying that it cannot load the data from that entity. I even tried logging into my custom league in a browser first but that didn’t help. Is this even possible?
If it’s a password-protected site, you’d have to modify the script to input a username and password (https://stackoverflow.com/questions/24723606/scrape-password-protected-website-in-r). It would likely be considerably easier to scrape projections from a publicly viewable league (as my script does) and to use multipliers.
Thanks Isaac!
With the risk of sounding like a moron…
Can someone spend a few minutes explaining (or post a link to a place that explains) how a layperson can use the scripts? I have a basic understanding of coding (and I have downloaded the GitHub app to my computer) and have grabbed the scripts from GitHub, but I don’t know what to do with them now that I have them.
I’ve been painstakingly copying and pasting from different sites and using excel to average them all together to get a basic stat projection (and then applying our league point structure – which is an odd one – and no site allows me to choose it – .25 per rush attempt, 8 yards for 1 pt receiving…)
This sounds like it would save me SO much time and would love to learn how to do this.
Alright. I did some digging and have figured a few things out (Phew! This is some tough stuff for a teacher with minimal coding background!) 🙂
I downloaded Github and grabbed FantasyFootballAnalyticsR. I downloaded R.app (I’m on a Mac), and installed some of the main libraries (XML, plyr, Rglpk, ggplot2, stringr).
I’m on the correct track, right?
Now, in R.app, when I type:
source("/Users/bz/Documents/FantasyFootballAnalyticsR/R Scripts/Projections/CBS Projections.R")
I get:
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
cannot open file '/Users/bz/R Scripts/Functions/Functions.R': No such file or directory
What am I missing? Thanks for any help you can provide!!
-Brian
Hey Brian,
Thanks for your interest. For how to run the R scripts, you might look into some of the resources for learning R here: https://fantasyfootballanalytics.net/2014/06/learn-r.html. It looks like the error you were receiving was because it was trying to run the Functions.R file I created but couldn’t find it in the folder specified in the script. You can download the Functions.R file from here: https://github.com/isaactpetersen/FantasyFootballAnalyticsR/blob/master/R%20Scripts/Functions/Functions.R.
Alternatively, if you want to avoid having to run the scripts and just want to get the projections, try these apps:
https://fantasyfootballanalytics.net/2014/07/fantasy-football-draft-optimizer-shiny-app-2014-update.html
Hope that helps!
-Isaac
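To illustrate the path issue described above: the scripts build file paths from getwd(), so the working directory has to be the project root before you call source(). The sketch below is an assumption-laden illustration, not part of the repo. It creates a throwaway copy of the folder layout in a temporary directory (with a dummy Functions.R containing a hypothetical helper) so it can run anywhere; on your machine you would point project_dir at the folder where you cloned FantasyFootballAnalyticsR.

```r
# Throwaway copy of the expected folder layout (stand-in for the cloned repo)
project_dir <- file.path(tempdir(), "FantasyFootballAnalyticsR")
functions_dir <- file.path(project_dir, "R Scripts", "Functions")
dir.create(functions_dir, recursive = TRUE, showWarnings = FALSE)

# Dummy Functions.R with a hypothetical helper, only for this illustration
writeLines("helper <- function() 'loaded'",
           file.path(functions_dir, "Functions.R"))

# Make the project root the working directory; setwd() returns the old one
old_wd <- setwd(project_dir)

# Build the path the same way the scripts do, relative to getwd()
functions_path <- file.path(getwd(), "R Scripts", "Functions", "Functions.R")

# Check before sourcing so a wrong working directory gives a clear message
if (file.exists(functions_path)) {
  source(functions_path)
} else {
  stop("Functions.R not found at: ", functions_path)
}

# Restore the previous working directory
setwd(old_wd)
```

In the error shown earlier, the working directory was /Users/bz, which is why R looked for '/Users/bz/R Scripts/Functions/Functions.R'.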
Yeah, I found that after I had downloaded everything. The app is great, but it was also good for me to learn just a bit about R scripts for learning’s sake…
Wait a sec… I just looked in the data folder and noticed a .csv file from each site…
I didn’t realize it would do all of this truly automatically!
Does it update the stats (and if so, how often)? If not, how do I get it to run again?
Mind. Blown.
Sorry for cluttering up your comments with my naivety, but wow. Just wow.
Also, what is used to figure out the InflatedCost in the AvgCost.csv?
It updates the stats whenever you run the script (or whenever I update the stats by running the script and uploading them to GitHub). Inflated cost is calculated by taking the AAV source you want (ESPN, Yahoo, etc.), scaling it to your league cap and number of teams, and applying a 10% premium to top players and 10% discount to bottom players. For more info, see here:
https://fantasyfootballanalytics.net/2013/06/win-your-fantasy-football-auction-draft.html
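The inflated cost description above can be sketched roughly as follows. This is not the code behind AvgCost.csv. The function name and all of its parameters are hypothetical, the rescaling is assumed to be linear in total league money (cap times number of teams), and "top" and "bottom" players are assumed to be the highest and lowest 20% by cost; the actual cutoffs may differ.

```r
# Hypothetical sketch of an inflated-cost calculation (assumptions in comments)
inflated_cost <- function(aav, source_cap, source_teams,
                          league_cap, league_teams,
                          top_frac = 0.2, bottom_frac = 0.2, adjust = 0.10) {
  # Assumption: rescale linearly by total money in the league
  scaled <- aav * (league_cap * league_teams) / (source_cap * source_teams)

  # Rank players from most to least expensive
  ranks <- rank(-scaled, ties.method = "first")
  n <- length(scaled)

  # Assumption: premium for the top fraction, discount for the bottom fraction
  multiplier <- rep(1, n)
  multiplier[ranks <= ceiling(top_frac * n)] <- 1 + adjust
  multiplier[ranks > n - floor(bottom_frac * n)] <- 1 - adjust

  round(scaled * multiplier, 2)
}

# Invented average auction values for ten players
aav <- c(60, 45, 30, 12, 5, 3, 2, 1, 1, 1)
costs <- inflated_cost(aav, source_cap = 200, source_teams = 10,
                       league_cap = 200, league_teams = 12)
print(costs)
```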
Isaac,
Great stuff. Do you know if FantasyPros, ESPN, etc., have archived projections available? Or only the current year? Would be nice to work with a five-year data set….
Thanks,
Jon
Hey Jon,
I don’t think the sites keep their historical projections posted (boo!). I have historical projections saved on GitHub for some sites that go back to 2012:
https://github.com/isaactpetersen/FantasyFootballAnalyticsR/tree/master/Data/Historical%20Projections
Let me know if you find any historical projections that I’m missing!
Thanks,
Isaac
The only multi-year archive I’ve found is fftoday.com, which runs back to 2008. (e.g., http://www.fftoday.com/rankings/playerproj.php?Season=2008&PosID=20&LeagueID=1)
I don’t know their methodology, but at first glance it doesn’t appear to be consensus sourcing (see: Toby Gerhart, projected 5th among RBs in 2014).
Would be interesting to construct a joint PDF for each position using a dataset like this. The advantage of using projections over ADP would be recouping the information lost in, say, Jimmy Graham’s #1 TE ranking in 2014 (40 pts higher than #2). And the advantage of a joint distribution would be preserving the variance in the data, so that one could quickly compute the probability of any player scoring between n and (n + m) points based on projection and position.
Jon
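A much simpler stand-in for the joint-distribution idea is sketched below. It assumes that a player's actual points are normally distributed around the projection with a known standard deviation, which is a simplification rather than the joint PDF proposed above. The function name, the projection, and the error standard deviation are all hypothetical.

```r
# Probability that a player scores between n and n + m points, assuming
# actual points ~ Normal(projection, error_sd). This is a simplification.
prob_between <- function(n, m, projection, error_sd) {
  pnorm(n + m, mean = projection, sd = error_sd) -
    pnorm(n, mean = projection, sd = error_sd)
}

# Invented example: projected for 180 points with a 40-point error SD
p <- prob_between(n = 150, m = 50, projection = 180, error_sd = 40)
print(p)
```

In practice the error standard deviation would have to be estimated per position from historical projections and actual results, such as the archives mentioned above.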
Hi Isaac – thank you for all your work and information. I’m a stat and data nerd and love what you’ve done. I’m curious if there’s a way to integrate defense and kicker projections and enable them as part of the shiny app for snake draft.
Hey Scott,
Yes, this is on our to-do list for next season. There aren’t as many sources of projections for kickers and IDP/DST. Let me know if you find any.
Thanks!
-Isaac
Hi Isaac, First off –Great site! Are there plans to scrape the sites for weekly projections?
This is a longer-term goal. One of our readers contributed the following script that scrapes weekly projections from Yahoo:
https://github.com/isaactpetersen/FantasyFootballAnalyticsR/blob/master/R%20Scripts/Weekly%20Projections/Yahoo.R
Any update on this? I would love to use weekly aggregate projections this year for player management.
Hi Daniel,
Yes, we will be releasing weekly projections in our tools. Stay tuned!
-Isaac
Is there a way to apply the weekly scraper in a historical fashion? Would love to grab historical weekly projections
Hi Victor,
We will be storing weekly data moving forward, so you will be able to access weekly data historically when we set it up (beginning with 2016 Week 1).
Thanks,
Isaac
This is amazing stuff. I wish I would have found this before now. My draft starts in less than 4 hours…
I am running the code — got through projections but in the Calculate League Projections.R I got this error:
> #Duplicate last names
> projections$lastName source(paste(getwd(),"/R Scripts/Calculations/Wisdom of the Crowd.R", sep=""), echo=TRUE)
I also had to comment out two of the name corrections from the Footballguys script because I was getting an error there.
Hey John,
Hope you didn’t try to run all that right before your draft. We have apps that do it for you!
https://fantasyfootballanalytics.net/2014/07/fantasy-football-draft-optimizer-shiny-app-2014-update.html
Regarding your R question. First, that doesn’t appear to be an error. It appears to be combining two different lines:
projections$lastName  # this line shows the last names of players
source(paste(getwd(),"/R Scripts/Calculations/Wisdom of the Crowd.R", sep=""), echo=TRUE)  # this line runs the "Wisdom of the Crowd" script
It would give an error if they were on the same line, but they aren’t on the same line in my scripts on GitHub:
https://github.com/isaactpetersen/FantasyFootballAnalyticsR/blob/master/R%20Scripts/Calculations/Calculate%20League%20Projections.R
I’d have to see the actual errors in the Footballguys script to troubleshoot. Footballguys is password-protected, so you’ll have to enter your own username and password.
Hope that helps!
Sorry, I didn’t paste the error, just the code that produced it. I think something about an NA threw the error (I think it was an index out of bounds, but I can’t remember). My fix is below. I’ll check out your app. I did use your scripts from GitHub.
Here’s the code I rewrote. It’s not concise, but I think it functions the way you intend.
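#Duplicate last names
#Strip punctuation and the suffixes Jr, Sr, and III from the player name
projections$lastName <- gsub("Sr", "", gsub("Jr", "", gsub("III", "", gsub("[[:punct:]]", "", projections$player))))
#Remove a trailing space left behind by a stripped suffix
projections$lastName <- gsub(" $", "", projections$lastName, perl = TRUE)
#Split on whitespace and keep the last word as the last name
projections$lastName <- sapply(strsplit(projections$lastName, "\\s+"), tail, n = 1)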
Hey John,
Not sure what the error is, but the script works for me. Feel free to post it here if you have questions.
-Isaac
There seem to be stronger projection sites out there, like 4for4, Bloomberg Sports, PickingPros, and Profootballfocus. Yahoo partnered with Profootballfocus this year, so Yahoo’s projections will be more accurate. Numberfire seems to be gaining ground. Yahoo used to get theirs from Accuscore, but they are not too good, and ESPN is average as well. I just looked at FantasyFootballNerd, and they have some horrible projections for Week 1: Brady at 31 completions for 402 yards, Rivers at 28 completions, Locker at 49 attempts for 27 completions, Charles at 23 rush attempts, and Ellington at 22. These are all too high.
Hey Evan,
These scripts are for scraping the season projections (not weekly projections). We have goals to include weekly projections in the future, though. Most of those sites you listed (except PickingPros) appear to be subscription-only, so we wouldn’t be able to include them (via scraping). I will look into including PickingPros next season.
Thanks!
-Isaac
The Yahoo script is malfunctioning for me. The columns have the wrong designations, so I’m worried that the data may not be handled properly overall (passing yards are registered as return TDs, rushing yards as reception yards, rushing attempts as passing yards, reception yards as interceptions etc.).
Maybe it’s just for me, but you might want to look into that.
Hey Axel,
I last updated the script and projections data the day before the start of the NFL season. The updated data are correct; you can verify them on our GitHub repo (https://github.com/isaactpetersen/FantasyFootballAnalyticsR/blob/master/Data/Yahoo-Projections.csv). Websites sometimes change their structure, so the scripts may have to be updated. I don’t plan on updating the season projection scripts until after the season (for next season). If you’re talking about the Weekly Projections script (https://github.com/isaactpetersen/FantasyFootballAnalyticsR/blob/master/R%20Scripts/Weekly%20Projections/Yahoo.R), that was contributed by a reader. Feel free to update it and create a pull request to the repo, so everyone can benefit.
Thanks and hope that helps!
-Isaac
I’m fairly certain that the projection categories Yahoo displays vary depending on your league settings. For example, it no longer shows first down projections for me, since we removed that category from our league this year. Thus the structure can differ from league to league.
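Because the column layout can differ from league to league, one defensive option is to check the number of scraped columns before assigning names, so a mismatch fails loudly instead of silently mislabeling stats. The sketch below is an illustration only: assign_columns is a hypothetical helper that is not part of the repo, and the table and column names are invented.

```r
# Hypothetical helper: rename scraped columns only when the count matches
assign_columns <- function(tbl, expected_names) {
  if (ncol(tbl) != length(expected_names)) {
    stop(sprintf(
      "Expected %d columns but found %d; the page layout may differ for this league.",
      length(expected_names), ncol(tbl)
    ))
  }
  names(tbl) <- expected_names
  tbl
}

# Toy stand-in for a scraped table (values invented)
scraped <- data.frame(V1 = c("Player A", "Player B"),
                      V2 = c("4500", "4100"),
                      V3 = c("32", "28"),
                      stringsAsFactors = FALSE)

# Matching column count: names are assigned
qb <- assign_columns(scraped, c("player", "passYds", "passTds"))

# Mismatched column count: the error message is captured instead of renaming
mismatch <- tryCatch(
  assign_columns(scraped, c("player", "passYds")),
  error = function(e) conditionMessage(e)
)
print(mismatch)
```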
Isaac,
Is it possible to get the code for the density plot?
Thanks,
Hey Brian,
Yes, you can find the code for the density plot in the following post (and the accompanying GitHub code):
https://fantasyfootballanalytics.net/2013/03/calculating-custom-fantasy-football.html
Hope that helps!
-Isaac
Hi Isaac. I used your web scraping code for a personal project I was working on in R and your code worked perfectly. Very helpful. I just want to say thank you for posting your script for web scraping.
Best,
Brittany Pugh
Happy to send you info. Feel free to shoot me an email.
Hey Isaac, thanks for the great website. A question on the Yahoo scrape: do I need to adjust my proxy settings somehow in RStudio? I noticed that you need to be logged in to Yahoo for the URL to direct to the pages it is scraping.
Thanks again.
George
Hi George,
You can scrape Yahoo public leagues. I set one up this season for that purpose:
https://football.fantasysports.yahoo.com/f1/170716
Hope that helps,
Isaac
haha. wow, i feel dumb. thanks so much
m/
Hi Isaac. I am trying to scrape my Yahoo private league projections and I am running into the problem of having to log in to Yahoo. How do I solve this in R? (I am also using this to learn R.) Thanks
See here:
https://fantasyfootballanalytics.net/2014/06/scraping-fantasy-football-projections.html#comment-34280
-Isaac
I’m a newb with R, to the extent that I know what it can be used for, but I’ve always just installed packages and run other people’s scripts that I’ve acquired and manipulated to make them work for me.
I’m trying to scrape projections / stats from Yahoo using the script linked.
Are there any instructions for dummies like me to make this work? I’m getting all kinds of error messages.
Thanks!
Hi Ryan,
We haven’t had a chance to update the scripts on the repo because we’ve been busy working on the tools. That’s on our to-do list. Feel free to update them and create a pull request to GitHub so you can share them with the community!
Thanks,
Isaac
Isaac,
Is there a problem with the download projections app? Every time I have tried to download projections today, it has tried to save the file as an HTML or text file. I even rebooted my computer and had the same problem.
I have downloaded several times in the past and no problems. Please help – my draft is tomorrow and got to prepare.
Thanks and did you guys get the weekly projections ready for this season? I know you were planning to.
Thanks again and the website is amazing !!
Vik
Hey Vik,
It works for me. What browser are you using? It essentially is a text file; you just want to save it as .csv. You might try this:
https://fantasyfootballanalytics.net/2015/05/2015-fantasy-football-projections.html#comment-31086
Hope that helps,
Isaac
Love your site! Very well done!
Can you pull weekly DFS price data from DraftKings, FanDuel, and Yahoo? Having the cost for each player, along with all of these other stats, would be amazing for optimizing lineups. Users would be able to identify players who have the highest ratio of projected points to price, develop high-floor teams (low risk), identify players with high ceilings, and countless other things.
Once again, awesome work!
Yes, we have salaries from DFS sites in the Projections and Optimizer tools (change Week and League Scoring).
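The ratio of projected points to price mentioned above can be sketched as follows. The players, points, and salaries are invented for illustration, and the column names are assumptions rather than the names used in the tools.

```r
# Invented weekly projections and DFS salaries
players <- data.frame(
  player = c("Player A", "Player B", "Player C"),
  points = c(20, 15, 12),
  salary = c(8000, 5000, 3000),
  stringsAsFactors = FALSE
)

# Projected points per $1,000 of salary
players$points_per_1k <- players$points / (players$salary / 1000)

# Sort from best to worst value
ranked <- players[order(-players$points_per_1k), ]
print(ranked)
```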
Just yesterday, I came upon your scrapers for the fantasy football projections. I was hoping to utilize them for DFS on DraftKings. I updated the LeagueSettings.R file to select the 2015 season, and week 7 last night (as a test to compare the week 7 projections to actuals), but when I ran the ESPN scraper, it still appeared that I was getting full season worth of projections, rather than the weekly.
Have you encountered this before? Is there a step [or ten 🙂 ] that I missed?
Thank you for taking the time to share all of this with the community. It is really helping me build my R skills.
We haven’t updated the scripts on the GitHub repo for the current season (it’s on our to-do list). You might have to update the URL in the script to get ESPN’s weekly projections.
Hey Isaac,
Your site is amazing! I’ll probably spend whatever free time I have for the next couple of days reading further into the material on this site.
Meanwhile, I was wondering if you could answer a few questions that I had.
First of all, it seems like the approach you take for projecting players’ performances is almost entirely based around other experts’ projections. I was wondering if you’ve attempted coming up with your own projections purely based on players’ and teams’ statistics: for instance, by training a model based on players’ stats vs. opponent’s strength in defense, etc. For instance, maybe for a WR’s projection you would consider the WR’s QB’s stats and his O-line’s stats. Maybe even different factors like days of rest, home/away game, weather can come into play?
If you had your own projections based on raw data, maybe you could even incorporate that with what you already have and weight it accordingly, based on how confident you are of your own algorithm :p
Secondly, if you have already tried this, or plan on trying this, do you know where I can find these additional data (such as schedules, difficulty of opponent, etc)? The only data that I’m seeing from your site are stats revolving around others’ projections. I was hoping to try to come up with an algorithm for my own projections and was wondering if you could help me find some more extensive data.
Again, thanks for your awesome site and your contributions!
Hi Andrew,
I love the idea to generate our own projections in a transparent way based on how various variables (e.g., age, strength of surrounding offense, prior performance) predict future performance, especially because other websites’ projections are basically a black box. Creating accurate projections that consider many factors is a major undertaking, though (e.g., figuring out what variables to consider, collecting big data, organizing big data, finding a model that predicts well historically, and continual cross-validation), so it’s one of our future directions to consider. In the meantime, we’ve shown that aggregated projections are more accurate than any individual source of projections, so they might provide a useful starting point. Let us know if you find good sources of these data!
-Isaac
Can you show us how to scrape the salary for each player from FanDuel.com???
I think it’s behind a login wall, so I’m not sure how to scrape. Let us know if you find a publicly available link!
Old thread, but I found a good source for historical salaries: http://rotoguru1.com/cgi-bin/fyday.pl?gameyr=fd2014. I’ve already scraped FanDuel data from 2011-2015 and posted it on my GitHub. Writeup here: http://www.ergosum.co/scrape-historical-draft-kings-20-minutes/
I’m still looking for a good source of weekly salaries because roto-guru doesn’t look active anymore. I might just go fight the good fight and get behind the login wall
Isaac, I really enjoyed your projections last year and used them every week. Thanks for creating this
Hi Matteo,
Do you have a link to download the historical FanDuel data? We got the historical DraftKings data.
Thanks!
-Isaac
Yep but only 2011-2015 https://raw.githubusercontent.com/rogerfitz/tutorials/master/draft-kings-history-scrape/fan_duel_salaries2011-2015.csv
I’ll let you know if I find last year
When will your new 2016 projections be out?
Hey Dean,
I think most sites release projections after the NFL draft, so hopefully soon after that.
-Isaac
Hey Isaac,
Any chance you have historical data from your FantasyPros scrapes? I’m looking to do a study on the variability of consensus accuracy by examining several offseasons’ worth of predictions.
Best,
Garrett
Hi Garrett,
We have historical projections from FantasyPros for 2013 and 2014 in our Projections tool. Note, though, that we’ve shown that our projections are more accurate than FantasyPros’. We also have FantasyPros expert consensus rankings and average draft position from 2015.
Hope that helps!
-Isaac
FFChamps.com. Let me know if you want my login
Just found your site today, and I’ll be going through it over the next few days. Do your scrapes separate the individual Experts or Analysts out for each site or does it only grab the average by all the analysts?
We scrape individually and calculate the average from the individual scrapes:
https://fantasyfootballanalytics.net/2014/06/custom-rankings-and-projections-for-your-league.html
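Averaging across individually scraped sources can be sketched like this. The sources, players, and point totals are invented, and a plain mean is assumed here; the linked post describes the actual calculation.

```r
# Invented projections: one row per player per source
projections <- data.frame(
  player = rep(c("Player A", "Player B"), each = 3),
  source = rep(c("SourceA", "SourceB", "SourceC"), times = 2),
  points = c(300, 280, 320, 250, 270, 230),
  stringsAsFactors = FALSE
)

# Average projected points per player across all sources
avg <- aggregate(points ~ player, data = projections, FUN = mean)
print(avg)
```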
Hi, I know this is an old post but I was wondering if you could help
I keep getting the following error when trying to scrape the yahoo projections:
Error: failed to load external entity “http://football.fantasysports.yahoo.com/f1/34124/players?status=ALL&cut_type=9&myteam=0&sort=PTS&sdir=1&count=0&pos=QB&stat1=S_PS_2016”
when I run the following line (along with everything before it in that script; I changed the default league ID to my own):
yahoo <- lapply(yahoo_urls, function(x) {data.table(readHTMLTable(x, stringsAsFactors = FALSE)[2]$'NULL')})
Everything seems to work up to that point, but I always hit a problem there.
You’ll have to use a different league ID because that appears to be a private league:
http://football.fantasysports.yahoo.com/f1/34124/players?status=ALL&cut_type=9&myteam=0&sort=PTS&sdir=1&count=0&pos=QB&stat1=S_PS_2016
You’ll want to change it to a public league ID.
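One way to make the league ID easy to swap is to build the URLs from a single variable. The sketch below copies the query string from the URL in the error above; build_yahoo_urls is a hypothetical helper, the league ID 12345 is a placeholder rather than a real public league, and the position codes other than QB are assumptions.

```r
# Hypothetical helper: build one projections URL per position for a league
build_yahoo_urls <- function(league_id, season,
                             positions = c("QB", "RB", "WR", "TE")) {
  urls <- sprintf(
    paste0("http://football.fantasysports.yahoo.com/f1/%s/players",
           "?status=ALL&cut_type=9&myteam=0&sort=PTS&sdir=1&count=0",
           "&pos=%s&stat1=S_PS_%s"),
    league_id, positions, season
  )
  setNames(urls, positions)
}

# 12345 is a placeholder; substitute the ID of a publicly viewable league
yahoo_urls <- build_yahoo_urls(league_id = 12345, season = 2016)
print(yahoo_urls[["QB"]])
```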
Do you have any python web scraping sources that would be a good resource?
Where do you scrape the Salary column from in your weekly projections? It looks as if you use FanDuel projections. I just wanted confirmation on whether or not that was true.
Thanks