I've started digging into this and have noticed what looks like an important limitation of the data: the recruiting data appear to contain duplicate entries. Some examples:
For Akron, there are two Christopher Hendersons in the recruiting data, one a 2009 DT recruit from NY and the other a 2010 DT recruit from NY. Similarly, there are two Clayton Moores, each of which is listed as a 2011 QB from MS. Meanwhile, neither of these names shows up at all in the roster data.
For Alabama, there are two Brandon Hills, one a 2012 OL recruit from TN, the other a 2013 OL recruit from VA. Similarly, there are two Brandon Lewises, one a 2008 DE recruit from AL, the other a 2010 DT recruit from MS. There are also two Deion Belues, one a 2010 DB recruit from AL, the other a 2012 DB recruit from MS. And finally, there are two Quinton Dials, one a 2009 DT recruit from AL, the other a 2011 DE recruit from MS.
Meanwhile, on the roster there are no Brandon Lewises, just one Brandon Hill (from TN), one Deion Belue (from AL), and no Quinton Dials.
etc.
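For what it's worth, here is roughly how I've been spotting these. This is only a minimal sketch: the tuple layout and the function names (duplicate_recruits, roster_counts) are my own assumptions, not the structure of the actual file, and the rows are a few of the Alabama examples above typed in by hand.

```python
from collections import defaultdict

# Assumed layout: (school, name, year, position, state) for recruits,
# (school, name, state) for the roster. The real file may differ.
recruits = [
    ("Alabama", "Brandon Hill", 2012, "OL", "TN"),
    ("Alabama", "Brandon Hill", 2013, "OL", "VA"),
    ("Alabama", "Deion Belue", 2010, "DB", "AL"),
    ("Alabama", "Deion Belue", 2012, "DB", "MS"),
    ("Alabama", "Quinton Dial", 2009, "DT", "AL"),
    ("Alabama", "Quinton Dial", 2011, "DE", "MS"),
]
roster = [
    ("Alabama", "Brandon Hill", "TN"),
    ("Alabama", "Deion Belue", "AL"),
]

def duplicate_recruits(rows):
    """Group recruiting rows by (school, name); keep groups with more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[1])].append(row)
    return {key: group for key, group in groups.items() if len(group) > 1}

def roster_counts(rows):
    """Count how many roster rows share each (school, name) pair."""
    counts = defaultdict(int)
    for school, name, _state in rows:
        counts[(school, name)] += 1
    return counts

dups = duplicate_recruits(recruits)
counts = roster_counts(roster)
for key, group in sorted(dups.items()):
    # counts.get avoids inserting missing keys into the defaultdict
    print(key, len(group), "recruiting rows;", counts.get(key, 0), "roster rows")
```

Any (school, name) pair with more recruiting rows than roster rows is a candidate duplicate worth checking by hand.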
Some other notes:
1) If you took the player data from the cfbstats site, I think it assigns each player an ID number. Is it possible to share a version of this file with that number attached? I think that would help keep the records organized consistently.
2) Apparently Iowa State really did have two distinct Sam Richardsons, which looks like the only duplicate name that appears consistently in both the roster and the recruiting data.
3) Can you give a few examples of things you want to match automatically? I don't know what you've already matched, so some tangible examples would help. Off the cuff, I'd guess you've matched everything with a recruiting rank on the rosters tab, but I'm not 100% sure of this.
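To make note 3 a bit more concrete, here is the sort of automatic match I have in mind. It is a sketch under my own assumptions, not a description of what you've already done: match on school plus normalized name, and fall back to home state only when the name alone is ambiguous. The tuple layout, normalize, and match_recruit are all hypothetical.

```python
# Assumed layout: (school, name, year, position, state) for recruits,
# (school, name, state) for roster rows. Hypothetical, for illustration only.
recruits = [
    ("Alabama", "Brandon Hill", 2012, "OL", "TN"),
    ("Alabama", "Brandon Hill", 2013, "OL", "VA"),
    ("Alabama", "Deion Belue", 2010, "DB", "AL"),
    ("Alabama", "Deion Belue", 2012, "DB", "MS"),
]

def normalize(name):
    """Lowercase and collapse whitespace so trivial formatting differences don't block a match."""
    return " ".join(name.lower().split())

def match_recruit(roster_row, recruit_rows):
    """Return the single recruiting row matching a roster row, or None if zero or ambiguous."""
    school, name, state = roster_row
    candidates = [
        r for r in recruit_rows
        if r[0] == school and normalize(r[1]) == normalize(name)
    ]
    if len(candidates) == 1:
        return candidates[0]
    # Several same-name recruits: use home state as a tiebreaker.
    by_state = [r for r in candidates if r[4] == state]
    return by_state[0] if len(by_state) == 1 else None

hill = match_recruit(("Alabama", "Brandon Hill", "TN"), recruits)
belue = match_recruit(("Alabama", "Deion  Belue", "AL"), recruits)
# "Example Player" is a made-up name used only to show the no-match case.
missing = match_recruit(("Alabama", "Example Player", "AL"), recruits)
print(hill, belue, missing)
```

This would not resolve the two Akron Christopher Hendersons, since both are listed from NY, which is part of why a player ID would help.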