Data discussion


Henry Harrison

Jan 2, 2015, 9:36:35 PM
to dfs-portfo...@googlegroups.com
Please don't share any data you paid for, unless the source allows it.

Gathering data is really the first step, so this is where I've been focusing my efforts lately.

For NFL, all you should ever need for statistics is nfldb. I've been working on an nfldb add-on for projections so that we can store projections in a database rather than in a bunch of CSV files, as I've been doing. I think getting this up and running is our #1 priority so we can begin logging as many projections as we can. I would appreciate any and all contributors. Find the repo here. Currently the database is set up; the next step is to create a general routine for inserting projection sets, and then to write scripts that scrape websites and add the data to the database.
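A minimal sketch of what a general routine for inserting projection sets could look like. Everything here is an assumption for illustration: the table name, the columns, and the player IDs are hypothetical, and it uses Python's built-in sqlite3 so it runs anywhere, not the actual schema or database of the nfldb add-on.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table: one row per (source, season, week, player).
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS projection (
            source TEXT NOT NULL,
            season INTEGER NOT NULL,
            week INTEGER NOT NULL,
            player_id TEXT NOT NULL,
            fantasy_points REAL NOT NULL,
            PRIMARY KEY (source, season, week, player_id)
        )
        """
    )


def insert_projection_set(conn, source, season, week, projections):
    """Insert one projection set given as {player_id: projected points}.

    Re-inserting the same (source, season, week, player) overwrites the
    earlier value, so a scraper can be re-run safely.
    """
    rows = [(source, season, week, pid, pts) for pid, pts in projections.items()]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT OR REPLACE INTO projection VALUES (?, ?, ?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
create_schema(conn)
# Placeholder player IDs and point values, not real data.
insert_projection_set(conn, "numberfire", 2014, 17, {"player-a": 21.4, "player-b": 18.9})
```

Keying on (source, season, week, player) means each scraping script only has to produce a dictionary of projected points and hand it to the same routine.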

For other sports, I'm not sure what the best options are, so feel free to discuss them here. I think building a database for projections, as I am doing for NFL, is critical, and I would encourage people to work on this for other sports. At the very least, we should keep an organized system of CSV files.

Historical DFS salaries (and FP totals) are available at http://rotoguru.net/, for current seasons at least.

I've been collecting some webscraping scripts at https://bitbucket.org/hharrison/daily-ff-picker. The code is ugly and there is no actual lineup picker. But there are some scraping scripts. I plan to port them to nfldb in the near future, but feel free to check that repo out for inspiration.


Devin McCabe

Jan 3, 2015, 2:20:20 PM
to dfs-portfo...@googlegroups.com
I'm planning to start scraping the necessary NBA data, and SI.com offers a JSON feed similar to the NFL feed used by nfldb. Here's a URL for pulling the games on a certain date:

If you get a game_id from that file, you can access a box score with play-by-play data here:

I haven't checked whether SI.com is using the same kind of feed for every sport, but I'd guess that they are and it might be the best source for data.

Robert Del Vicario

Jan 3, 2015, 2:28:38 PM
to dfs-portfo...@googlegroups.com
Here's a link to a flat file containing box scores and advanced stats scraped from Basketball-Reference (BR) for this season.


Henry Harrison

Jan 15, 2015, 4:53:12 PM
to dfs-portfo...@googlegroups.com

In case anyone's interested, I finally got around to scraping NumberFire's historical projections. I hope to include the scraping script in nfldb-projections when that is ready, but in the meantime I figured I'd share the data:

https://www.dropbox.com/s/kimu8bw9i0hsdcv/numberfire-2014.csv?dl=0

Unfortunately, rotoguru doesn't have salaries/results for the playoff games, so I only have the projections for those. If anyone knows a second source of historical DFS salary information, we can fill in those blanks.

NumberFire is the only source I know of where it's possible (though difficult) to scrape NFL projections for previous weeks. FantasyPros ECR is possible too, but players who are not active at the time you are scraping do not show up in past weeks, so you can never be sure the data is complete (I do have the complete FantasyPros ECR projections for maybe half the NFL season).
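One way to gauge how complete a scraped past week is, sketched under assumptions: compare the players in the scraped projection set against the players who actually recorded stats that week (for example, a list pulled from nfldb). The function names and the placeholder IDs below are hypothetical, and a player flagged as missing may simply never have been projected, so this only lists candidates to check.

```python
def missing_from_projections(projected_ids, played_ids):
    """Return, sorted, the IDs of players who played but have no projection."""
    return sorted(set(played_ids) - set(projected_ids))


def coverage(projected_ids, played_ids):
    """Fraction of players who played that also appear in the projections."""
    played = set(played_ids)
    if not played:
        return 1.0  # nothing to cover, so treat the week as complete
    return len(played & set(projected_ids)) / len(played)


# Placeholder IDs: "C" played that week but is absent from the scrape.
projected = ["A", "B"]
played = ["A", "B", "C"]
missing = missing_from_projections(projected, played)
ratio = coverage(projected, played)
```

A low coverage ratio for a past week would suggest the scrape was taken after players had dropped off the active list.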

Devin McCabe

Jan 16, 2015, 8:40:21 PM
to dfs-portfo...@googlegroups.com
There are actually several bugs with that FantasyPros projections page you should be aware of:

1. Players currently on IR never show for any week, as you mentioned.
2. Players' old teams are displayed.
3. If it's 2014 Week 10, you obviously can't look at 2014 Week 11 yet. However, you also can't look at 2013 Week 11, since the check on the week number takes priority over the check on the year.
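The behavior in item 3 can be illustrated with a small sketch. This is only a guess at logic that would produce the described symptom, not FantasyPros's actual code; both function names are hypothetical.

```python
def week_is_viewable_buggy(year, week, current_year, current_week):
    # The week number is checked first and on its own, so any week later
    # than the current week is rejected regardless of the year.
    if week > current_week:
        return False
    return year <= current_year


def week_is_viewable(year, week, current_year, current_week):
    # Comparing (year, week) as a pair lets every week of a past season
    # through while still blocking future weeks of the current season.
    return (year, week) <= (current_year, current_week)


# Current date in the example from the list: 2014 Week 10.
blocked_past_week = week_is_viewable_buggy(2013, 11, 2014, 10)
allowed_past_week = week_is_viewable(2013, 11, 2014, 10)
```

With the pair comparison, 2013 Week 11 is viewable while 2014 Week 11 stays blocked.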

I emailed FantasyPros about these things and they brushed me off. My hope is that when this season is over, at least 1 and 3 will no longer be relevant and that data can be downloaded and trusted.

Within the past year I've also emailed 4for4 support (I'm a paid subscriber) and NumberFire about downloading old projections, and both said it wasn't possible. I didn't push them on it, though. As far as salary data goes, I bet FanDuel would just give you a spreadsheet. They've been accommodating to me in the past about giving me old contest results after the season is over.

Devin McCabe

Jan 17, 2015, 12:23:20 PM
to dfs-portfo...@googlegroups.com
Typo, #2 should be players' new teams are displayed. I'm pretty sure Percy Harvin is a Jet for all previous weeks in that tool.

Henry Harrison

Jan 17, 2015, 12:31:04 PM
to Devin McCabe, dfs-portfo...@googlegroups.com
Yeah, basically all their info is from the current week, with the exception of the projection itself: salary, team, position (I assume, though that doesn't really change), and of course whether they are even active (i.e. whether they are listed).


Henry Harrison

Jan 17, 2015, 12:32:46 PM
to Devin McCabe, dfs-portfolio-system
I also contacted the same places you did. FantasyPros, at least, doesn't care, since as far as I can tell the past weeks are not meant to be publicly accessible.

Spencer Cushman

Jan 23, 2015, 7:15:34 PM
to dfs-portfo...@googlegroups.com
My go-to for NBA stats is basketball-reference.com.

I only use their basic box scores and I haven't gotten around to automating the scraping yet (or checking for an API), but their format is consistent, so I can pull starters vs. bench pretty easily. I wrote a tiny macro in Notepad++ to clean their CSV on the box score page, and then I add the date, team, opponent, and pace to each player's line. It's half manual, half scripted, but it forces me to look at every box score of every game, which I think helps.
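The step of adding the date, team, opponent, and pace to each player's line could be scripted roughly as follows. This is a sketch that assumes the box score has already been cleaned into a plain CSV with a header row; the column names, team codes, and values in the example are placeholders, not real Basketball-Reference output.

```python
import csv
import io


def annotate_box_score(csv_text, date, team, opponent, pace):
    """Append date, team, opponent and pace columns to every player row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    extra = {"date": date, "team": team, "opponent": opponent, "pace": pace}
    rows = [{**row, **extra} for row in reader]
    # Keep the original column order and put the new columns at the end.
    fieldnames = list(reader.fieldnames or []) + list(extra)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Placeholder box score with made-up columns and values.
cleaned = "Player,MP,PTS\nPlayer A,30,12\nPlayer B,18,4\n"
annotated = annotate_box_score(cleaned, "2015-01-23", "AAA", "BBB", 95.2)
```

The function works on text so it can be tried on a pasted box score before any scraping is automated.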

Robert Del Vicario

Jan 25, 2015, 3:12:09 PM
to dfs-portfo...@googlegroups.com
Hey All,

You can scrape this season with the attached file (R).
basketball reference historical scraper.R