I built a model to predict the outcomes of NCAA football bowl games. The model uses only the results of regular-season games and is an ensemble of two simple mathematical models.
The first is a random walk model, the same idea behind Google's PageRank. Imagine a network in which each node (point) represents a team. Each game adds a one-way edge between the two teams that played, pointing from the loser toward the winner. Start the walk at a random node and travel along the edges at random. A team's rating is the probability that you are at that team's node after a very long time. The insight of the PageRank algorithm is that adding a little more randomness, a small probability of jumping from any node to any other node, makes this limiting probability well defined.
I weight the graph's edges by the score differential. I tried applying various functions to the score differential, as well as to the score differential scaled by the winning team's score. An idiosyncrasy of the random walk model is that it heavily rewards a team for beating an opponent with very few losses, even when the winning team had mediocre results over the rest of the season. To compensate for this quirk, I used a large Google constant; that is, I made the probability of jumping from any node to any other node quite a bit larger than the default value.
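As a rough illustration of the random walk rating described above, here is a minimal sketch in Python with NumPy. It is not the code behind my model: the function name `random_walk_ranking`, the `(winner, loser, margin)` input format, the use of the raw margin as the edge weight, and the `teleport=0.3` value are all assumptions made for the example. A team with no losses has no outgoing edges, so the sketch lets the walker jump uniformly from such a node.

```python
import numpy as np

def random_walk_ranking(games, teleport=0.3, n_iter=1000, tol=1e-12):
    """Rate teams by the stationary distribution of a PageRank-style walk.

    games: list of (winner, loser, margin) tuples (hypothetical input format).
    teleport: probability of jumping to a uniformly random node at each step;
              an assumed value, larger than the common 0.15 default.
    """
    teams = sorted({t for winner, loser, _ in games for t in (winner, loser)})
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)

    # W[i, j] is the total weight on edges from loser i to winner j.
    W = np.zeros((n, n))
    for winner, loser, margin in games:
        W[idx[loser], idx[winner]] += margin

    # Row-normalize into transition probabilities. Teams with no losses
    # have no outgoing edges, so the walker leaves them uniformly at random.
    row_sums = W.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    P = np.where(row_sums > 0, W / safe_sums, 1.0 / n)

    # Mix in the uniform jump so the limiting distribution is unique.
    G = (1.0 - teleport) * P + teleport / n

    # Power iteration: repeatedly advance the distribution by one step.
    r = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        r_next = r @ G
        converged = np.abs(r_next - r).sum() < tol
        r = r_next
        if converged:
            break
    return dict(zip(teams, r))
```

Calling `random_walk_ranking([("A", "B", 7), ("B", "C", 3), ("A", "C", 10)])` returns a dictionary of probabilities that sum to one, with larger values for teams the walker visits more often. Raising `teleport` flattens the ratings, which is the lever described above for damping the effect of a single win over a team with few losses.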
The second is a regression model in which each game is an observation. The teams are the features: for each game I arbitrarily designate one participant as team one and the other as team two, then set team one's feature to +1, team two's feature to -1, and every other team's feature to 0. The label is the difference in score (team one's score minus team two's score). One advantage of the regression model is that model selection with cross-validation was very simple. I used L2 regularization and chose the amount of regularization through cross-validation.
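The regression setup can be sketched in a few lines of Python with NumPy. Again, this is an illustration under assumptions rather than my actual code: the helper names (`build_design`, `fit_ridge`, `cv_error`, `select_lambda`), the `(team_one, team_two, score_one, score_two)` input format, and the choice of k-fold cross-validation with mean squared error are all hypothetical. The ridge fit uses the closed-form solution, which needs a regularization strength greater than zero because the +1/-1 columns are linearly dependent.

```python
import numpy as np

def build_design(games, teams):
    """Encode each game as a row: +1 for team one, -1 for team two, 0 elsewhere.

    games: list of (team_one, team_two, score_one, score_two) tuples
           (hypothetical input format).
    """
    idx = {t: i for i, t in enumerate(teams)}
    X = np.zeros((len(games), len(teams)))
    y = np.zeros(len(games))
    for row, (team_one, team_two, score_one, score_two) in enumerate(games):
        X[row, idx[team_one]] = 1.0
        X[row, idx[team_two]] = -1.0
        y[row] = score_one - score_two  # label: team one score less team two score
    return X, y

def fit_ridge(X, y, lam):
    """Closed-form L2-regularized least squares; lam must be > 0."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def cv_error(X, y, lam, k=5, seed=0):
    """Mean squared error of the ridge fit, averaged over k held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for fold in folds:
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        beta = fit_ridge(X[train], y[train], lam)
        errors.append(np.mean((X[fold] @ beta - y[fold]) ** 2))
    return float(np.mean(errors))

def select_lambda(X, y, lambdas, k=5):
    """Return the candidate regularization strength with the lowest CV error."""
    return min(lambdas, key=lambda lam: cv_error(X, y, lam, k))
```

Each fitted coefficient acts as a team rating, and the predicted margin for a matchup is the difference between the two teams' coefficients, so the predicted winner is simply the team with the larger one.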
Winner | Loser | Model Prediction |
---|---|---|
Virginia Tech 55 | Tulsa 52 | Virginia Tech |
Nebraska 37 | UCLA 29 | Nebraska |
Navy 44 | Pittsburgh 28 | Navy |
Minnesota 21 | Central Michigan 14 | Central Michigan |
California 55 | Air Force 36 | California |
Baylor 49 | North Carolina 38 | North Carolina |
Nevada 28 | Colorado State 23 | Colorado State |
LSU 56 | Texas Tech 27 | LSU |
Auburn 31 | Memphis 10 | Memphis |
Mississippi State 51 | NC State 28 | Mississippi State |
Louisville 27 | Texas A&M 21 | Texas A&M |
Wisconsin 23 | USC 21 | USC |
Houston 38 | Florida State 24 | Florida State |
Clemson 37 | Oklahoma 17 | Clemson |
Alabama 38 | Michigan State 0 | Michigan State |
Tennessee 45 | Northwestern 6 | Northwestern |
Michigan 41 | Florida 7 | Michigan |
Ohio State 44 | Notre Dame 28 | Ohio State |
(6) Stanford 45 | (5) Iowa 16 | Stanford |
(12) Ole Miss 48 | (16) Oklahoma State 20 | Ole Miss |
Georgia 24 | Penn State 17 | Georgia |
Arkansas 45 | Kansas State 23 | Arkansas |
(11) TCU 47 | (15) Oregon 41 (3OT) | TCU |
West Virginia 43 | Arizona State 42 | West Virginia |
(2) Alabama 45 | (1) Clemson 40 | Clemson |
Click here to see the code for my model.