Rosters and recruiting


Bill Connelly

Jan 21, 2014, 9:37:47 AM
to study-hal...@googlegroups.com
Hey guys, I thought I'd try to rejuvenate this list a bit by sharing a file. I have been attempting to link together rosters with recruiting rankings. Attached is a file that features 2013 rosters (culled from the data Marty shares at CFBstats) and 2008-13 Rivals.com recruiting rankings. I wanted to use 247 data, but it's not in a data-friendly format (i.e. it can't be copy-pasted easily), and the 247 data folks couldn't help me much. My goal will be to use their data one day -- I think the 247 Composite is easily the best recruiting ranking to use overall -- but for now we'll make do with this.

Anyway, through a variety of VLOOKUPs and whatnot, I tried to link the data in the Rivals tab to the rosters at hand, but with slight name changes, misspellings, etc., there are still a lot of players that aren't linked up properly. If you want to help link more names to ratings, I'd appreciate it. I tried to make this a Google Doc, but it's enormous, and Google Docs doesn't handle files this size well, so I'm open to suggestions on how to open this up to editing for folks (without 10 people working from 10 different files).
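For anyone who would rather script the linking than chain VLOOKUPs, here is a rough Python sketch of the same idea: build a lookup on (team, name) and normalize the names first so punctuation and suffixes don't block a match. The field names (`team`, `name`, `stars`), the suffix list, and the sample rows are all assumptions for illustration, not the actual columns in the attached file.

```python
import re
import unicodedata

# Assumed list of suffixes to ignore; extend as needed.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

def name_key(name):
    """Reduce a name to a lowercase key without accents, punctuation, or suffixes."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    cleaned = re.sub(r"[^a-z\s]", "", ascii_name.lower())
    return " ".join(t for t in cleaned.split() if t not in SUFFIXES)

def link_rosters(roster, recruits):
    """Roughly a VLOOKUP on (team, name), but on normalized keys."""
    lookup = {}
    for rec in recruits:
        lookup.setdefault((rec["team"], name_key(rec["name"])), []).append(rec)
    linked, unlinked = [], []
    for player in roster:
        hits = lookup.get((player["team"], name_key(player["name"])), [])
        if len(hits) == 1:
            linked.append((player, hits[0]))
        else:
            unlinked.append(player)  # zero or ambiguous hits need a manual look
    return linked, unlinked

# Made-up rows, purely to show the shape of the data.
roster = [{"team": "Alabama", "name": "A.J. Example Jr."},
          {"team": "Alabama", "name": "Nobody Here"}]
recruits = [{"team": "Alabama", "name": "AJ Example", "stars": 4}]
linked, unlinked = link_rosters(roster, recruits)
```

This only catches formatting differences. Real misspellings and name changes would still land in `unlinked` and need a manual pass.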
2013 Roster-Rivals share.xlsx

Justin Moore

Jan 23, 2014, 9:22:54 PM
to study-hal...@googlegroups.com
Thanks for sending this out.

What's the data format for the 247 data?

What limitations do you run into with Google Docs?

-j


--
You received this message because you are subscribed to the Google Groups "Football Study Hall: Data Sharing" group.

Bill Connelly

Jan 24, 2014, 2:44:45 PM
to study-hal...@googlegroups.com
a) The only data format for 247 data is whatever you do to copy-paste it. They weren't able to give me anything more workable than that. Neither was Rivals, but while Rivals data is very copy-pasteable (that's a word now), 247 profiles are broken into a couple of different rows. And unless I'm mistaken (I hope I am), you would basically have to copy-paste each team's commits, then do it again, then do it again. It's not meant for nerds to jump in and parse.
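If the pasted 247 text really does come out as a fixed number of rows per profile, a few lines of Python could stitch the rows back together. This is only a sketch: the two-rows-per-profile default and the sample lines are guesses, since the actual layout isn't described beyond "a couple of different rows".

```python
def regroup(lines, rows_per_profile=2):
    """Join every `rows_per_profile` non-blank pasted lines into one record."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    if len(cleaned) % rows_per_profile:
        raise ValueError("pasted text does not divide evenly into profiles")
    return [
        cleaned[i:i + rows_per_profile]
        for i in range(0, len(cleaned), rows_per_profile)
    ]

# Hypothetical pasted text; real 247 rows may look nothing like this.
pasted = ["Player One", "QB  6-2 / 200", "", "Player Two", "WR  6-0 / 180"]
records = regroup(pasted)
```

It would still mean one paste per team, but the cleanup afterward would be automatic.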

b) My limitations with Google Docs are basically that you can only paste in a little bit of data at a time, and the spreadsheets become extremely difficult (incredibly slow/lagging) to work with in a short amount of time. They're incredibly useful overall, but not so much for large data sets, at least not in my experience.

Matthew Smith

Feb 5, 2014, 1:53:33 AM
to study-hal...@googlegroups.com
I've started to dig into this thing, and have noticed some seemingly important limitations to the data, namely that there seem to be some duplicates.  Some examples:

For Akron, there are two Christopher Hendersons in the recruiting data, one a 2009 DT recruit from NY and the other a 2010 DT recruit from NY. Similarly, there are two Clayton Moores, each of which is noted as a 2011 QB from MS. Meanwhile, neither of these names shows up at all in the roster data.

For Alabama, there are two Brandon Hills, one a 2012 OL recruit from TN, the other a 2013 OL recruit from VA. Similarly, there are two Brandon Lewises, one a 2008 DE recruit from AL, the other a 2010 DT recruit from MS. And there are also two Deion Belues, one a 2010 DB recruit from AL, the other a 2012 DB recruit from MS. And finally there are two Quinton Dials, one a 2009 DT recruit from AL, the other a 2011 DE recruit from MS.

Meanwhile, on the roster there are no Brandon Lewises, just one Brandon Hill (from TN), one Deion Belue (from AL), and no Quinton Dials.

etc.
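
A quick way to list every case like these, rather than spotting them by eye, is to group the recruiting rows by team and name and keep any group with more than one row. The sketch below uses the Alabama rows described above; the column names (`team`, `name`, `year`, `pos`, `state`) are assumed, not taken from the file.

```python
from collections import defaultdict

def find_duplicate_names(recruits):
    """Group recruiting rows by (team, lowercased name); keep groups with 2+ rows."""
    groups = defaultdict(list)
    for rec in recruits:
        groups[(rec["team"], rec["name"].strip().lower())].append(rec)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}

# Rows as described in the Alabama examples above; column names are assumed.
sample = [
    {"team": "Alabama", "name": "Brandon Hill", "year": 2012, "pos": "OL", "state": "TN"},
    {"team": "Alabama", "name": "Brandon Hill", "year": 2013, "pos": "OL", "state": "VA"},
    {"team": "Alabama", "name": "Deion Belue", "year": 2010, "pos": "DB", "state": "AL"},
    {"team": "Alabama", "name": "Deion Belue", "year": 2012, "pos": "DB", "state": "MS"},
]

duplicates = find_duplicate_names(sample)
for (team, name), rows in duplicates.items():
    print(team, name, [(r["year"], r["state"]) for r in rows])
```

Whether a flagged pair is a true duplicate or two different people (like the Iowa State case) still has to be judged by hand.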

Some other notes:

1) If you took the player data from the cfbstats site, I think that it assigns a player ID number.  Is it possible to share a version of this file with that number attached?  I think that'd help keep things organized in a consistent manner.

2) Apparently Iowa State really did have two distinct Sam Richardsons, which looks like the only duplicate that shows up consistently in both the roster and recruiting data.

3) Can you give a few examples of things you want to automatically match?  I don't know what you've already matched, so some tangible examples might help.  I'd guess off the cuff you've matched everything with a recruiting rank on the rosters tab, but am not 100% on this.

Matthew Smith

Feb 5, 2014, 2:23:18 AM
to study-hal...@googlegroups.com
As a follow-up, I looked at the Alabama roster, just the 20 players who had blanks for stars data. None of them showed up in the recruiting data tab at all, assuming no one changed both their last name and prior school (no matches on first names, exact or similar [Ty checks against Ty and Tyler, Chris against Chris and Christopher, etc.]). So assuming the stars data was already generated by some kind of automated process, I think this is as close a match as you're going to be able to find. The databases just aren't 1:1 in a lot of places, it seems.

PS: If you did a manual match for some of the players (such as Ha Ha), and you're looking for an easy way to flag potential fill-ins for blank entries for non-star players or teams, then I'd suggest matching on the first four or five letters of the first and last names instead of the whole thing. I'd flag each hit for a quick check, and then do a lookup to make sure the detail looks right. The easy way is to assign a "Recruiting ID number" to each row in the recruiting data, use the 4/4 lookup (first four letters of the first name, first four of the last) to find the recruiting number, then use the recruiting number to look up the full name and compare. Then you can fairly easily evaluate whether they "should" match. If it's helpful, I can send you an example of this.
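
Here is a rough Python version of that 4/4 lookup, in case it helps to see it outside a spreadsheet. It assigns each recruiting row an ID, indexes rows by the first few letters of first and last name, and returns candidate pairs for a manual check. The field names (`first`, `last`) and the roster row are assumptions for illustration, and the sketch ignores team for brevity.

```python
def prefix_key(first, last, n=4):
    """First n letters of the first and last name, lowercased."""
    return (first.strip().lower()[:n], last.strip().lower()[:n])

def flag_candidates(roster, recruits, n=4):
    """Return (roster_row, recruiting_id, recruit_full_name) tuples to review by hand."""
    by_prefix = {}
    for rid, rec in enumerate(recruits, start=1):  # rid plays the "Recruiting ID number" role
        by_prefix.setdefault(prefix_key(rec["first"], rec["last"], n), []).append(rid)
    flagged = []
    for player in roster:
        for rid in by_prefix.get(prefix_key(player["first"], player["last"], n), []):
            rec = recruits[rid - 1]  # look the full name back up by ID
            flagged.append((player, rid, f'{rec["first"]} {rec["last"]}'))
    return flagged

# Recruit names are from the thread; the roster spelling is hypothetical.
recruits = [{"first": "Christopher", "last": "Henderson"},
            {"first": "Clayton", "last": "Moore"}]
roster = [{"first": "Chris", "last": "Henderson"}]
flagged = flag_candidates(roster, recruits)
```

One caveat: with a fixed four-letter prefix, a short name like "Ty" will not match "Tyler", so very short first names would need a smaller `n` or a separate check.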



Matthew Smith

Feb 5, 2014, 6:35:48 PM
to study-hal...@googlegroups.com
I actually had some other questions on the data, again using Alabama as a guide.

The Rivals website ( http://sports.yahoo.com/footballrecruiting/football/recruiting/commitments/2013/alabama-73 and http://sports.yahoo.com/footballrecruiting/football/recruiting/teamrank/2013/all/all ) lists 25 recruits for Bama 2013, while the data file lists just 24. The missing player in the data appears to be Raheem Falkins, who does seem to be at Bama as part of the 2013 class ( http://www.rolltide.com/sports/m-footbl/mtt/raheem_falkins_844039.html ) and is totally missing from the database (i.e. not part of a different class or a different/unassigned program). Any ideas why he would be missing from the Rivals database that they sent you (that's my guess as to where that table came from)?

Also, does anyone know how Rivals calculates their overall points? Bama's class got 3166 points on their overall team rankings with a displayed star average of 3.84, and the "Rivals rating" average is 5.87, and the sum of the Rivals ratings gets you 140.90 without Falkins and 146.70 with him. The points total doesn't obviously follow from any of the other numbers. Does anyone know how to translate either between Rivals rating and overall class score or between star rating and overall class score?
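
As a sanity check on the figures quoted above, the arithmetic can be run in a few lines of Python. It only uses the numbers in this post (24 recruits in the file, 25 on the website) and does not attempt to reproduce the Rivals points formula, which isn't given here.

```python
# Figures quoted in the post above.
sum_without, n_without = 140.90, 24   # data file, Falkins missing
sum_with, n_with = 146.70, 25         # website count, Falkins included
team_points = 3166

avg_without = sum_without / n_without
avg_with = sum_with / n_with
falkins_rating = sum_with - sum_without   # implied rating for the missing recruit
points_per_recruit = team_points / n_with

print(round(avg_without, 2), round(avg_with, 2), round(falkins_rating, 2))
print(round(points_per_recruit, 2))
```

Both averages round to 5.87, so the displayed rating average is consistent with either sum, and the implied rating for Falkins is about 5.8. What stays unexplained is how the 3166 points figure is derived from the ratings or stars.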

PS Thank you so much for sharing this data, it looks really useful.  If by some chance rivals has any kind of expanded data (i.e. some prior years as well) that they're able to share, that'd make this thing even more useful.

Matthew Smith

May 4, 2014, 6:50:52 PM
to study-hal...@googlegroups.com
Yeah I think there are just serious issues with the rivals data, unfortunately.  I went through the Alabama roster for people that seemed missing from the database, and they were all off of the database, and a number really should have been on the database.  One by one for the first 10 or so:

Bradley Bozeman (recruit, missing from database)
http://sports.yahoo.com/footballrecruiting/football/recruiting/player-Bradley-Bozeman-124365

Caleb Sims (walk-on, missing from database)
http://sports.yahoo.com/uab/football/recruiting/player-Caleb-Sims-117811
http://www.al.com/sports/index.ssf/2012/01/hoover_star_caleb_sims_will_wa.html

Corey McCarron (transfer from USA, missing from database)
http://www.al.com/sports/index.ssf/2011/12/corey_mccarron_will_transfer_t.html

Harold Nicholson (in 2011 class, missing from rivals website and database)
http://247sports.com/Player/Harold-Nicholson-31685

Issac Leon (actual name is Isaac, misspelled in the roster file; 2013 class, missing from rivals website and database)
http://247sports.com/Player/Issac-Leon-34877

Jai Miller (spent 10 years playing pro ball, walked onto bama, missing from rivals website and database)
https://alabama.rivals.com/cviewplayer.asp?Player=462429
http://www.rolltide.com/sports/m-footbl/mtt/jai_miller_844049.html

Jerrod Bierbower (looks unranked by rivals, missing from database)
http://www.rolltide.com/sports/m-footbl/mtt/jerrod_bierbower_768188.html
https://alabama.rivals.com/cviewplayer.asp?Player=450469#scouting


Kyle Kazakevicius (class of 2011, missing from rivals website and database)
http://247sports.com/Player/Kyle-Kazakevicius-34876

Matt Tinney (class of 2009, missing from rivals website and database, though 247 says he was both a recruit and a walk-on)
http://247sports.com/Player/Matt-Tinney-31690

Michael Newsome (class of 2011, was recruited, missing from database)
https://alabama.rivals.com/cviewplayer.asp?Player=457539
http://247sports.com/Player/Michael-Newsome-20102