Problem Formulation

rboe
Mar 15, 2012, 6:59:39 AM
to Machine March Madness
Hi all,

Just joined the group after finding out about it a couple days ago.
Thought I'd share what I have done so far. I think it is quite similar
(and possibly identical) to the PMF method described by Danny but I am
not exactly sure. Forgive me if I am just rambling on about things
everyone here is already aware of.

In any case, I have put the problem in a compact matrix form that
makes it a lot clearer, to me at least. Also, the problem can be
solved with convex optimization, so it doesn't depend on machine
learning, which I have next to zero experience with.

Problem description:
For each season (e.g. 2006-2007) I have enumerated the teams and compiled the scores of the games into a matrix S. For example, if team 1 beat team 2 with a score of 82-72, then S12=82 and S21=72. Ideally, each team would play every other team at least once, but this is obviously not the case, so the matrix S is sparse. Using the method proposed by George Dahl, I define vectors o and d that correspond to each team's offensive and defensive ability. The approximation to the matrix S is then just the outer product od' (for example, (od')_12 = o1*d2 is the estimate of S12). This is a simple rank-one approximation of the matrix. If every team played every other team at least once, the matrix S would be dense and the vectors o and d could be found from the SVD of S (see http://www.stanford.edu/~boyd/ee263/notes/low_rank_approx.pdf). Because this is not the case, we instead define a matrix P that records which teams played that season. For example, P12=P21=1 if teams 1 and 2 played a game. Now the problem stated by George can be expressed compactly as "minimize ||P.*(o*d')-S||_F", where '.*' is the Hadamard (elementwise) product and ||.||_F is the Frobenius norm. In this form, it is easy to see that for a constant vector o and variable vector d this is a convex problem, and likewise for a constant d and variable o. Therefore, by solving a series of convex problems, alternating the vector variable between o and d, the iteration converges rapidly in about 5 to 10 steps (see the "Nonnegative Matrix Factorizations" code here: http://cvxr.com/cvx/examples/). From this point the problem is easily expanded to handle higher-rank approximations.
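For concreteness, here is a minimal MATLAB/CVX sketch of the alternation (illustrative only, not my exact code; the all-ones initialization and the fixed 10 iterations are just placeholder choices):

    % S: n-by-n score matrix, P: n-by-n played indicator, built as described above.
    n = size(S, 1);
    o = ones(n, 1);                    % crude initial guess for offensive ability
    for iter = 1:10                    % about 5 to 10 alternations are enough
        cvx_begin quiet
            variable d(n)
            minimize( norm(P .* (o * d') - S, 'fro') )
        cvx_end
        cvx_begin quiet
            variable o(n)
            minimize( norm(P .* (o * d') - S, 'fro') )
        cvx_end
    end
    Sest = o * d';                     % rank-one estimate of every matchup score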

Some notes:
1) Higher-rank approximations seem to overfit the data and make less
accurate predictions than rank-one fits. I have only tried rank 5 and
rank 10, though; it is possible rank 2 or so is the sweet spot (see
the sketch after this list).
2) My current results seem to agree very closely with those found
here: http://thepowerrank.com/visual/NCAA_Tournament_Predictions . Any
other prediction websites that people like?
3) Predictions from my code here: http://dl.dropbox.com/u/6998157/Predictions2012.xlsx
4) If anyone is interested in the MATLAB/CVX code I used I can share
it, but it is very hacked together. The CVX code is only a few lines -
most of the effort is forming the matrix S (which isn't really much
effort).
5) The algorithm correctly predicted the national champion for the
last 6 years. This year it is calling for Ohio St to squeak out a win
over Kentucky. It will be fun to watch and see how that goes. In any
case, I think it is fantastic that a simple rank one approximation can
give such accurate predictions.
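Since note 1 mentions higher-rank fits, here is the same alternation with rank-k factors (again just a sketch; k and the random initialization are placeholder choices):

    k = 2;                             % rank of the approximation
    n = size(S, 1);
    O = rand(n, k);                    % arbitrary starting point
    for iter = 1:10
        cvx_begin quiet
            variable D(n, k)
            minimize( norm(P .* (O * D') - S, 'fro') )
        cvx_end
        cvx_begin quiet
            variable O(n, k)
            minimize( norm(P .* (O * D') - S, 'fro') )
        cvx_end
    end
    Sest = O * D';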

Thanks for sharing the starter code Danny. It made it effortless to
jump into this fun problem. I'm looking forward to next year when I
will know about this ahead of time and be able to put more effort into
it.

- Ryan

Danny
Mar 15, 2012, 10:20:09 AM
to Machine March Madness
This is great, Ryan. Thanks for the great description of what you
did, and I like the idea of the alternating minimizations. Do you
mind if I put this on my blog as a guest post, so it gets a bit more
visibility?

Also, just to be sure, did you hear about and enter the contest we're
holding for the algorithms? This is the link to enter your
algorithm's picks (for you, or any other last-minute people):
http://tournament.fantasysports.yahoo.com/t1/register/joinprivategroup_assign_team?GID=9198&P=robotsvshumans

Anyhow, sounds great!

-danny

Scott Turner
Mar 15, 2012, 11:21:35 AM
to machine-ma...@googlegroups.com

In my experimentation, I've found that a model where O and D are additive rather than multiplicative is more accurate. Perhaps you can tweak your model and report those results.

I'm in Pittsburgh right now with a group of friends to watch the "second" and "third" round games here.  We're hoping for an historic upset of a #1 seed.  (Well, the Syracuse fans aren't, but I am :-).  This is our 20th year doing this...

-- Scott

rboe
Mar 15, 2012, 3:09:51 PM
to Machine March Madness
Sure, feel free to make a blog post.

Looks like registration is closed - I'll have to join next year. I did
fill out an ESPN bracket last night
(http://games.espn.go.com/tournament-challenge-bracket/en/entry?entryID=4613722).

Lee-Ming Zen
Mar 15, 2012, 3:15:51 PM
to machine-ma...@googlegroups.com
Ryan, we'll include the predictions you had on your Dropbox as your entry. I forgot to include some ESPN analyst/etc. brackets, so I'll need to do some manual scoring anyway :)

rboe
Mar 15, 2012, 5:11:09 PM
to machine-ma...@googlegroups.com
Scott,

I updated the analysis. The new cost function is just ||P.*(o1' + 1d') - S||_F, where 1 is the n x 1 vector of ones (so the estimate of S12 is o1+d2, matching the earlier convention). I have only tested it on the 06-07 data, and the results are somewhat promising.
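Since this cost is affine in o and d jointly, no alternation is needed; a single CVX solve does it (a sketch only, not my exact code):

    n = size(S, 1);
    one = ones(n, 1);
    cvx_begin quiet
        variables o(n) d(n)
        minimize( norm(P .* (o * one' + one * d') - S, 'fro') )
    cvx_end
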
The good news:
  • Only made 7 first-round mispicks
  • Predicts the VCU upset over Duke (6/11 seeds). This is probably just noise, though.
The bad news:
  • The predicted scores don't seem to match reality. For example, it predicts Florida over Jackson St. by 118 to 44.
  • It incorrectly predicted that NC (1) and Washington St. (3) would lose in the first round. NC was predicted to lose 82-167!
  • No correct picks for the Final Four (as compared to the multiplicative method, which guessed all four teams right that season).
I think what it comes down to is that the additive method couples the defensive and offensive abilities. To see this, consider the difference between the scores of team 1 and team 2 (which is what decides a win, after all). This is (o1+d2) - (o2+d1) = (o1-d1) - (o2-d2). This combination appears everywhere, so we may as well define new variables od1 = o1-d1 and od2 = o2-d2; each team effectively gets a single strength number, which shows how the two variables are coupled. Notice that in the multiplicative case the corresponding difference, o1d2 - o2d1, is not separable.

What is the exact model you are using? Maybe I am doing something a bit different.

Ryan

rboe
Mar 15, 2012, 5:19:57 PM
to machine-ma...@googlegroups.com
Thanks! In the spreadsheet I posted I manually chose Duke over Baylor in round 3 because the score difference was only 0.01. I also took Gonzaga over W Virginia in round 1, where the difference was only 0.1. In the spirit of the competition, though, these should be reversed because that is technically what the algorithm predicts. In both cases the team loses in the following round anyway, so it shouldn't make much difference.

Lee-Ming Zen
Mar 15, 2012, 6:17:54 PM
to machine-ma...@googlegroups.com
Thanks for the honesty :) Who knows, you might have it right!

Scott Turner
Mar 16, 2012, 9:14:33 AM
to machine-ma...@googlegroups.com
On Thu, Mar 15, 2012 at 5:11 PM, rboe <ryanb...@gmail.com> wrote:
I think what it comes down to is that the additive method couples the defensive and offensive abilities. [...]

Correct -- if all you care about is MOV (Margin of Victory), then you can simplify to a single "strength" vector.

Another option to try is:

    Score(team1) = MO(team1)*MD(team2) + AO(team1) + AD(team2)

which captures the (possible) notion that there are both multiplicative and additive effects going on. I couldn't get the optimization step for this formula to work correctly (using batch gradient descent), but I did build a model where I used the additive factors to make the initial prediction and then a multiplicative factor to predict the error from that. I found the second step improved my predictions by about 0.5 points of MOV. (All my predictors work on MOV.)
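Roughly, the two-step idea looks like this (a sketch only, not my actual code: it reuses Ryan's CVX alternation and fits raw scores rather than MOV, and the variable names are just for illustration):

    n = size(S, 1);  one = ones(n, 1);
    % Step 1: additive fit.
    cvx_begin quiet
        variables ao(n) ad(n)
        minimize( norm(P .* (ao * one' + one * ad') - S, 'fro') )
    cvx_end
    R = P .* (S - (ao * one' + one * ad'));   % residuals on the played games
    % Step 2: rank-one multiplicative fit to the residuals, by alternation.
    mo = ones(n, 1);
    for iter = 1:10
        cvx_begin quiet
            variable md(n)
            minimize( norm(P .* (mo * md') - R, 'fro') )
        cvx_end
        cvx_begin quiet
            variable mo(n)
            minimize( norm(P .* (mo * md') - R, 'fro') )
        cvx_end
    end
    Sest = ao * one' + one * ad' + mo * md';  % combined prediction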

But I shouldn't give away all my results :-)

By the way, if you continue to work on this, you should use MOV (or predicted scores) as your metric, not correct predictions.  Predicting that the home team wins every game gets you to about a 70% rate right off the bat.  (And better in the tournament, where the home team is almost always the stronger team.)  Heck, the model I made up for the blog posting did 83% or something like that.
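As a concrete example of what I mean by an MOV metric (hypothetical variable names, just to illustrate):

    % Score a predictor by how close its predicted margins are to the real ones,
    % rather than by the fraction of winners it picks correctly.
    predMOV   = predScore1 - predScore2;    % predicted margins for a set of games
    actualMOV = score1 - score2;            % actual margins for the same games
    mae = mean(abs(predMOV - actualMOV));   % mean absolute error, in points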

On the other hand, if you did predict UConn correctly last year, that's pretty good.  I doubt very much they were the strongest team coming into the tournament.  I had that prediction correct, but only because I force a certain number of upsets in predicting the tournament.

-- Scott