Injuries


Bill Connelly

Oct 24, 2012, 10:46:13 AM
to study-hal...@googlegroups.com
Alright, let's talk injuries for a second. One of my goals this offseason will be to find a pretty consistent method for determining the impact of injuries on a given team this year (and, in theory, in previous years too). My thought was simply this: each official box score lists the game's starters. If we can catalog each team's starters for every game, we could note when a starter changed. It wouldn't take too terribly long (especially with interns, ahem) to do some quick Google searches to figure out why there was a new starter -- injury, demotion, etc. That wouldn't account for the projected starters who got hurt in the preseason (RIP, 26 different Mizzou offensive linemen), but it would get us pretty far down that road. Thoughts? It would be a bit time-consuming, but everything is time-consuming when there are 124 teams involved.
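
A minimal sketch of the starter-change idea, assuming the box-score starters have already been parsed into rows. The row layout, team name, and player names are hypothetical placeholders, not real data:

```python
from collections import defaultdict

# Hypothetical box-score extract: (team, game_number, position, starter_name).
STARTERS = [
    ("Team X", 1, "LT", "Player A"),
    ("Team X", 1, "QB", "Player B"),
    ("Team X", 2, "LT", "Player C"),
    ("Team X", 2, "QB", "Player B"),
]


def starter_changes(rows):
    """Return (team, game, position, previous, current) for each starter change."""
    by_slot = defaultdict(dict)  # (team, position) -> {game_number: starter}
    for team, game, position, name in rows:
        by_slot[(team, position)][game] = name

    changes = []
    for (team, position), games in sorted(by_slot.items()):
        ordered = sorted(games)
        # Compare each game with the one before it for the same team and position.
        for prev_game, game in zip(ordered, ordered[1:]):
            if games[prev_game] != games[game]:
                changes.append((team, game, position, games[prev_game], games[game]))
    return changes


print(starter_changes(STARTERS))
```

Each row it returns is only a prompt for the manual lookup (injury, demotion, etc.); the code cannot tell why the starter changed.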

David Fobare

Oct 24, 2012, 10:56:58 AM
to study-hal...@googlegroups.com
You'll need a small staff just to track the work of AIRBHG.


Bill Connelly

Oct 24, 2012, 10:57:48 AM
to study-hal...@googlegroups.com
Ha, yeah, just document "All Iowa RB games lost to injury" and move on immediately.




Jake T.

Oct 24, 2012, 10:58:07 AM
to study-hal...@googlegroups.com

My thoughts:

I think this is definitely a fruitful pursuit - especially for in-week projections and even correlation with lines.  I agree scraping the data wouldn't be too difficult; scrubbing it would take a little longer.  I have some lingering questions...


How would impact be determined? Would there be a player rating, i.e. if a more important/better player is injured, is that injury weighted more heavily? Or will the method just analyze team performance based on the number of usual starters missing in each game?

Also, personnel and the type of offense (or defense) being played against can determine starters. For example, a team usually plays 3 LBs but starts 5 DBs against a spread-heavy team; what if their 5th DB is injured in this situation? Is that accounted for? I'd lean towards saying it should be, but it definitely could muddy the waters and take more time in scrubbing data.

Jake






jmblackmer

Oct 24, 2012, 11:03:01 AM
to study-hal...@googlegroups.com
Why not just look for the difference between starters and replacements? A lot of positions rotate consistently throughout the game. The one caveat to that is that rotation is the difference between first and second string players and injury is the difference between first/second and third string players. I suppose this brings up depth charts as a useful set of data that we could possibly look into getting.

As a side note, I'd think you could also use this data to predict the effect of players leaving the team between seasons. Essentially, let's say first string is 100% production, second is 85% production, and third is 70% production, and you can expect a 10-20% increase in production from players between seasons. If your first string graduates, then you are looking at 93.5-102% production from your new first string (last year's second string, 85% x 1.10 to 85% x 1.20) and 77-84% production from your new second string (last year's third string). So, if you can figure this out and also figure out how much you can expect players to get better between seasons (or as seasons progress), then you'd be able to do accurate preseason predictions for teams.
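
The arithmetic in that example can be checked with a short sketch. The 85%/70% production levels and the 10-20% improvement band are the illustrative numbers from the post, not measured values, and the function name is made up for the example:

```python
def projected_range(current_production, growth_low=0.10, growth_high=0.20):
    """Project next season's production for a unit promoted one string.

    current_production is relative to last year's first string (1.0 = 100%).
    The growth band is an illustrative assumption, not an estimate from data.
    """
    low = current_production * (1 + growth_low)
    high = current_production * (1 + growth_high)
    return low, high


# Last year's second string (85%) becomes the new first string.
new_first = projected_range(0.85)
# Last year's third string (70%) becomes the new second string.
new_second = projected_range(0.70)

print([round(x, 3) for x in new_first])
print([round(x, 3) for x in new_second])
```

The real work would be estimating the string-to-string gaps and the year-over-year improvement from data; the sketch only shows how the pieces combine.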


Jake T.

Oct 24, 2012, 11:07:12 AM
to study-hal...@googlegroups.com
Rivals has pretty decent NCAA depth charts...

Bill Connelly

Oct 24, 2012, 11:08:48 AM
to study-hal...@googlegroups.com
Depends on the site. Some are great, some are updated once a year.




jmblackmer

Oct 24, 2012, 11:09:00 AM
to study-hal...@googlegroups.com
I'd think that you'd have to look at changes in the running game and passing game based on when the player leaves and look at differences in stats. So, if the opponent gets 4 YPC with the first string in and 4.2 YPC with the second string in, then there is a 5% increase in opponent production. If, at the same time, the YPA in the passing game goes from 10 to 9, then you have a 10% decrease in opponent production. From there you have to decide how to combine them. Obviously, on offense, you'd just look at the offense's production instead of opponent.
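
A rough sketch of that comparison, using the YPC and YPA figures from the example above. The combining step is only one possible choice: weighting by run/pass play share, and the 50/50 split itself, are assumptions made for illustration:

```python
def pct_change(before, after):
    """Percent change from the first-string rate to the replacement's rate."""
    return (after - before) / before * 100


rush_change = pct_change(4.0, 4.2)    # opponent YPC: 4.0 -> 4.2
pass_change = pct_change(10.0, 9.0)   # opponent YPA: 10 -> 9


def combined_change(rush_pct, pass_pct, run_share):
    """Weight the two changes by the share of plays that were runs (assumed)."""
    return run_share * rush_pct + (1 - run_share) * pass_pct


print(round(rush_change, 1), round(pass_change, 1))
print(round(combined_change(rush_change, pass_change, run_share=0.5), 2))
```

Other weightings (by yards, by success rate, by down and distance) would be just as defensible; the play-share version is simply the easiest to compute.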

David Fobare

Oct 24, 2012, 11:14:12 AM
to study-hal...@googlegroups.com
As a side note, that brings up an important question: is Success Rate a better description of running proficiency in CFB than YPC, as Brian Burke has demonstrated for the NFL?


jmblackmer

Oct 24, 2012, 11:14:52 AM
to study-hal...@googlegroups.com
I don't know about other teams, but Michigan releases a new depth chart every week. I just checked the site and it doesn't look like they have any historical documents for that; they just overwrite the document each week. If we start now, we could probably scrape that data each week; however, that doesn't give us historical data to work with.

Travis Fossett

Oct 24, 2012, 11:25:16 AM
to jmblackmer, study-hal...@googlegroups.com
For LSU, the Times-Picayune tracks snap counts for each game. It may not offer the detail of how successful runs and passes were based on personnel, but it would help identify overall trends.


Marty Couvillon

Oct 24, 2012, 11:32:19 AM
to Bill Connelly, study-hal...@googlegroups.com
It's not just the starters you'd want to look at, but overall game participation.  If a starter is demoted, he could still play.  If you have a guy who is a starter one game and doesn't play at all the next then he should probably be on the candidate list of injured players.
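
As a sketch of that candidate-list rule, assuming each game's starters and full participation list are already available as sets (the data layout and player names are hypothetical):

```python
# Hypothetical per-game data for one team: game_number -> starters and everyone who played.
GAMES = {
    1: {"starters": {"A", "B", "C"}, "played": {"A", "B", "C", "D"}},
    2: {"starters": {"A", "D", "E"}, "played": {"A", "B", "D", "E"}},
}


def injury_candidates(games):
    """Flag players who started one game and did not play at all in the next."""
    candidates = []
    ordered = sorted(games)
    for prev_game, game in zip(ordered, ordered[1:]):
        missing = games[prev_game]["starters"] - games[game]["played"]
        for player in sorted(missing):
            candidates.append((game, player))
    return candidates


print(injury_candidates(GAMES))
```

In this toy data, B lost the starting job but still played, so only C lands on the candidate list for game 2. A suspension or a coach's decision would be flagged the same way, so the list still needs a manual check.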

I'm planning on releasing starters and game participation in a future release of data, though there is a fair amount of work to be done.  Would we want to use the starters/participation that is listed in the box scores on the NCAA website or use the box scores from the individual schools?  The ones on the NCAA website are from the schools anyway.

Marty


Jonathan Hodges

Oct 24, 2012, 11:44:52 AM
to jmblackmer, study-hal...@googlegroups.com
Most teams incorporate the depth chart into their weekly game notes, and many schools keep game notes available (though some take some digging to get to, having to go to previous news posts instead of being posted in one place).

Unfortunately, doing any kind of automated scraping would be next to impossible since game notes are almost always in PDF, the position of the depth chart varies in the notes, and the formats of the depth charts vary completely between schools.  But this is theoretically possible if all of the game notes are compiled.

I honestly don't know if there is a good way to compile reliable injury data except for significant injuries where you can do a one-off analysis.  Players come in and out of games with injuries, get injured in the middle of a game, and get passed on the depth chart for reasons other than injury.  There are also unreported injuries and slight injuries that guys play with.

One thing that would be interesting but also next to impossible to get would be participation on any given play (which someone mentioned a local paper does for LSU, but I personally haven't seen elsewhere).  If you have this, all kinds of things are possible, but it would take a lot of work and may not even be possible unless one is at the game or has access to coaches' film, since TV angles usually cut off some players.

To sum it up, scraping the starters from the box score is probably the easiest thing to do right now.

--
Jonathan Hodges
Contributor, HailToPurple
Web: http://www.hailtopurple.com/jhodges/
Twitter: @hailtopurple
Facebook: https://www.facebook.com/hailtopurple
Email: j-ho...@alumni.northwestern.edu




Bill Connelly

Oct 24, 2012, 11:45:31 AM
to study-hal...@googlegroups.com
Is there a big difference between the NCAA and official sites' participation lists? I wouldn't figure so, but when in doubt I'd say go with the official sites ... unless that's a lot harder.

Matthew Smith

Oct 24, 2012, 11:53:12 AM
to study-hal...@googlegroups.com
Rather than completely reinventing the wheel, would it make sense to use http://www.collegeinjuryreport.com/ as a source?  Their data isn't awesomely formatted, but it does provide weekly lists of injured players.  I'd think that could be cross-referenced with starters lists to get something useful.  It'd be in projected format (questionable, probable, out, etc.) but I'd think that's a lot more doable than trying to independently parse all of the box score data to try and guess at who is and isn't injured.
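
A minimal sketch of that cross-referencing step, assuming both the weekly injury list and the list of displaced starters have already been converted into plain records. The record layouts, names, and statuses are hypothetical; nothing here reflects the site's actual format:

```python
# Hypothetical weekly injury list: (team, week, player) -> listed status.
INJURY_REPORT = {
    ("Team X", 2, "Player A"): "out",
    ("Team X", 3, "Player D"): "questionable",
}

# Hypothetical starters who dropped out of the lineup: (team, week, player).
DISPLACED_STARTERS = [
    ("Team X", 2, "Player A"),
    ("Team X", 2, "Player B"),
]


def label_displaced(displaced, report):
    """Attach the listed injury status, or 'unexplained' when none is listed."""
    return [
        (team, week, player, report.get((team, week, player), "unexplained"))
        for team, week, player in displaced
    ]


for row in label_displaced(DISPLACED_STARTERS, INJURY_REPORT):
    print(row)
```

Anything left as "unexplained" would still need a manual lookup, and matching player names between the two sources is likely to be the hard part in practice.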

Bill Connelly

Oct 24, 2012, 11:54:24 AM
to study-hal...@googlegroups.com
Interesting. I actually had no idea that site existed...


Matthew Smith

Oct 24, 2012, 12:01:23 PM
to study-hal...@googlegroups.com
Yeah, it's a pretty cool site.  I'd mulled over trying to parse their data into a usable structure (including blending it into performance data) but never got to that point.  Part of the issue I had was that they ID'd starters and key players by formatting (bold = starters, red = key players), which I couldn't translate into actual data, so it ended up going nowhere.  Still, I think it'd be potentially very useful for this kind of analysis.

Marty Couvillon

Oct 24, 2012, 12:03:50 PM
to Bill Connelly, study-hal...@googlegroups.com
Somehow my first reply did not show up.

The official box scores would be easier for me since I'm already parsing them.  I still need to store the starter and participation information to my database.  At some point, probably in another thread, we'd need to talk about normalizing the starters' listed positions for easier analysis.

Marty

Bill Connelly

Oct 24, 2012, 12:15:59 PM
to Marty Couvillon, study-hal...@googlegroups.com
By the way, is anybody familiar enough with Google Groups to know where the settings for responses are? Marty's first response didn't show up because he hit reply, and it went just to me instead of to the group address.

Chris Treadaway

Oct 24, 2012, 12:19:07 PM
to Bill Connelly, Marty Couvillon, study-hal...@googlegroups.com
When replying via e-mail, you have to Reply All for everyone to see it in the group.  Otherwise it is a private message.

--
Chris Treadaway
CEO, Notice Technologies
blog - http://treadaway.typepad.com
e-mail - ch...@noticetechnologies.com

Bill Connelly

Oct 24, 2012, 12:20:36 PM
to Chris Treadaway, Marty Couvillon, study-hal...@googlegroups.com
Right, but that's not the case on other Google Groups I'm on. Not sure where to make that change.

Bill Connelly

Oct 24, 2012, 2:53:26 PM
to study-hal...@googlegroups.com
Alright, think the e-mail thing has been fixed.

Matt Mills

Oct 24, 2012, 3:08:53 PM
to study-hal...@googlegroups.com
To get back to the Injury thread....

It seems like there are a lot of cool things to research with injury information and its effect on team performance. But a great start would be a weekly tracker of starter changes for each team from the box scores, plus a weekly injury list from the college injury report website. That would probably make things easier for whoever is looking up injuries/demotions on the interwebs. It seems we already have people parsing box scores and people looking at injuries, so we just need a platform to display this information to everyone. Any ideas?

-Matt

David Smith

Oct 24, 2012, 3:27:35 PM
to study-hal...@googlegroups.com
One thing I've always been interested in is some sort of usage and efficiency measure for all players. Specifically, the percentage of plays in which Player X participates (is on the field), the percentage of the plays in which he participates that he makes a play (touch/target on offense, tackle/PBU/INT/FF on defense), and the YPP/PPP on those plays. Adjusting efficiency numbers based on the other players on the field and sorting data by offensive & defensive formation and play type could lead to very granular insight into how players are most effectively used/beaten (great for coaching staffs). Combined with the injury data being discussed here, it could possibly have great predictive value.
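
To make those three measures concrete, here is a small sketch that assumes per-play participation data exists, which is a big assumption for college football. The play records, field names, and player labels are hypothetical:

```python
# Hypothetical play-level records: who was on the field, who made a play, yards gained.
PLAYS = [
    {"on_field": {"WR1", "RB1"}, "involved": {"WR1"}, "yards": 12},
    {"on_field": {"WR1", "RB1"}, "involved": {"RB1"}, "yards": 3},
    {"on_field": {"WR1", "RB2"}, "involved": {"WR1"}, "yards": 0},
    {"on_field": {"WR2", "RB1"}, "involved": {"RB1"}, "yards": 5},
]


def usage_profile(plays, player):
    """Participation rate, involvement rate, and yards per play for one player."""
    on_field = [p for p in plays if player in p["on_field"]]
    involved = [p for p in on_field if player in p["involved"]]
    return {
        # Share of all plays with the player on the field.
        "participation_rate": len(on_field) / len(plays) if plays else 0.0,
        # Share of his on-field plays where he got a touch/target (or made a stop).
        "involvement_rate": len(involved) / len(on_field) if on_field else 0.0,
        # Average yards on the plays he was directly involved in.
        "yards_per_play": (
            sum(p["yards"] for p in involved) / len(involved) if involved else 0.0
        ),
    }


print(usage_profile(PLAYS, "WR1"))
```

Adjusting for teammates on the field, formation, and play type would layer on top of this, but each of those needs the same per-play participation records.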


Bill Connelly

Oct 24, 2012, 3:33:54 PM
to study-hal...@googlegroups.com
I agree that it would be incredible data, but ... unfortunately participation data is REALLY difficult to procure, even if you get charting involved. You need All-22 film, and ... that's not going to happen in college football. Clemson and a couple of other schools catalog that, but not nearly everybody.





David Fobare

Oct 24, 2012, 5:05:21 PM
to study-hal...@googlegroups.com
I don't care if I'm interrupting. All the thanks to Bill for getting the reply-to thingie fixed. Much better now.

Matthew Smith

Jan 28, 2013, 3:07:59 PM
to study-hal...@googlegroups.com
Anyone happen to have updates on this thread?