Posted by Huckleberry on December 11th, 2008 under Football
As everyone should know by now, I am a computer ratings geek. I have my own set of ratings, but I recently decided to set out to design a computer ratings system that accommodates the factors that either must or in my opinion should be reflected in a system used by the BCS. With that in mind, here are the constraints and logical requirements I started with when designing the system:
So, starting with these constraints, I began to design the inputs for the system. As you will see later, the first constraint is extremely important and is largely responsible for the specific results in any season. The best ratings systems are iterative: for each game, they compare the expected result to the actual outcome. In a system where the actual score of the game may not be considered, the actual-outcome input for each game is very limited in its possible values. Without game location or date information, in fact, this input must be binary (1 for a win and -1 for a loss, for example). However, as shown in #2 above, I added some variation to that input by considering game location.

How did I arrive at specific numbers? Historically (I only looked at the last 4 years, but the figure is very consistent), home teams win a little over 57% of all college football games. At first, then, it would seem we could use 1.14 for a road win (2 × 0.57), 1 for a neutral win, and 0.86 for a home win (2 × 0.43) as our game outcomes. Instead I used 1.1, 1, and 0.9. I did this because in college football better teams play more home games, which contributes a little to the 57% number, so I called it 55% for my purposes. This is obviously a rather arbitrary determination, and later I might revisit this figure and determine a more accurate value.

The game outcome values, then, are as follows: 1.1 for a road win, 1 for a neutral win, 0.9 for a home win, -0.9 for a road loss, -1 for a neutral loss, and -1.1 for a home loss.
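As a minimal sketch, the six outcome values above can be written as a small lookup. The function and variable names here are my own illustration, not part of the actual system; the only thing taken from the post is the set of six numbers. Note that a loss simply mirrors the opponent's win: a home loss is the other team's road win.

```python
# Win values by the location of the team being scored (numbers from the post).
WIN_VALUE = {"road": 1.1, "neutral": 1.0, "home": 0.9}

# The opponent's location, seen from the other sideline.
OPPONENT_LOCATION = {"road": "home", "neutral": "neutral", "home": "road"}


def game_outcome(location: str, won: bool) -> float:
    """Return the actual-outcome input for one team in one game.

    location is where this team played: "road", "neutral", or "home".
    """
    if won:
        return WIN_VALUE[location]
    # A loss is the negative of the opponent's win value.
    return -WIN_VALUE[OPPONENT_LOCATION[location]]
```

For example, `game_outcome("home", False)` gives -1.1, the harshest result, and `game_outcome("road", True)` gives 1.1, the best one.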
Using these outcome values, I then decided to use a system loosely based on the Elo rating methodology. This requires, as discussed above, an expected outcome and an actual outcome for each game. For these calculations, the expected outcome for a Team A playing Team B is initially set at Team A’s rating minus Team B’s rating, all divided by 100. This value is capped, though, depending on the game location: Team A’s expected value is capped at 1.1 for a road game, 1 for a neutral game, and 0.9 for a home game. This processing step maintains our #3 and #4 requirements above. Working with a neutral site game, if a team is rated more than 100 points above its opponent, winning the game will affect neither team’s rating. Losing the game, however, would cause an adjustment.
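Here is a rough sketch of that expected-value step. The upper caps are the ones stated above. The lower bounds are an assumption on my part for this illustration: they mirror the caps using the loss values (-0.9 road, -1 neutral, -1.1 home), so that a heavy underdog who loses as expected sees no adjustment either. All names are hypothetical.

```python
# Upper caps on the expected value, by Team A's location (from the post).
CAP = {"road": 1.1, "neutral": 1.0, "home": 0.9}

# ASSUMPTION: lower bounds mirror the caps, using the loss outcome values.
FLOOR = {"road": -0.9, "neutral": -1.0, "home": -1.1}


def expected_outcome(rating_a: float, rating_b: float, location_a: str) -> float:
    """Expected outcome for Team A, given both ratings and where Team A plays."""
    raw = (rating_a - rating_b) / 100.0
    # Clamp the raw value into the range allowed for this location.
    return max(FLOOR[location_a], min(CAP[location_a], raw))
```

With these bounds, a team rated 150 points higher at a neutral site has an expected value of exactly 1, so a win there changes nothing.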
The second step in the Elo method for each game is to adjust each team’s rating based on the comparison between expected value and actual outcome. So in this system, after each game an adjustment score is tracked for each team. This ratings adjustment is equal to 100 times the difference between the actual outcome and the expected value. So, for example, if Team A is rated 150 points more highly than Team B and wins at a neutral site, there is no adjustment: the expected value was capped at 1, and the actual outcome was 1. Now, if Team A is rated 50 points lower than Team B and wins at a neutral site, the expected value was -0.5 and the actual outcome was 1, so Team A’s rating adjustment score would be +150 and Team B’s would be -150. These adjustment scores for each team are summed, divided by the number of games the team has played, and applied at the end of each pass through the season.
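The adjustment arithmetic can be sketched in a few lines. This is only an illustration of the formula as described (100 times actual minus expected, then averaged over the team's games); the function names are made up for the example.

```python
from typing import Sequence


def rating_adjustment(actual: float, expected: float) -> float:
    """Per-game adjustment score: 100 * (actual outcome - expected value)."""
    return 100.0 * (actual - expected)


def season_adjustment(game_adjustments: Sequence[float]) -> float:
    """Average a team's per-game adjustment scores over the games it played."""
    if not game_adjustments:
        return 0.0  # a team with no games is left unchanged
    return sum(game_adjustments) / len(game_adjustments)
```

A neutral-site win with an expected value capped at 1 gives `rating_adjustment(1.0, 1.0)`, which is 0. A neutral-site win with an expected value of 0.5 gives `rating_adjustment(1.0, 0.5)`, which is 50.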
The algorithm then iterates as many times as necessary until the ratings stabilize to a suitable degree.
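Putting the pieces together, one way the whole loop might look is sketched below. Several things here are assumptions for illustration and are not stated in the post: the starting rating of 500, the stopping tolerance, the iteration limit, the lower bound on the expected value (mirroring the upper cap), and all of the names. It is not the actual implementation.

```python
from dataclasses import dataclass

WIN_VALUE = {"road": 1.1, "neutral": 1.0, "home": 0.9}
FLIP = {"road": "home", "neutral": "neutral", "home": "road"}


@dataclass(frozen=True)
class Game:
    winner: str
    loser: str
    winner_location: str  # "road", "neutral", or "home"


def rate(games, start=500.0, tol=0.01, max_iter=100_000):
    """Iterate over the full season until ratings move less than tol.

    start, tol, and max_iter are ASSUMED values for this sketch.
    """
    teams = {t for g in games for t in (g.winner, g.loser)}
    ratings = {t: start for t in teams}
    for _ in range(max_iter):
        totals = {t: 0.0 for t in teams}
        counts = {t: 0 for t in teams}
        for g in games:
            actual = WIN_VALUE[g.winner_location]  # winner's outcome value
            # ASSUMED lower bound: the winner's loss value at this location.
            floor = -WIN_VALUE[FLIP[g.winner_location]]
            raw = (ratings[g.winner] - ratings[g.loser]) / 100.0
            expected = max(floor, min(actual, raw))  # cap equals the win value
            adj = 100.0 * (actual - expected)
            totals[g.winner] += adj
            totals[g.loser] -= adj  # the loser's adjustment mirrors the winner's
            counts[g.winner] += 1
            counts[g.loser] += 1
        # Apply the averaged adjustments only after the full pass.
        new = {t: ratings[t] + totals[t] / counts[t] for t in teams}
        change = max(abs(new[t] - ratings[t]) for t in teams)
        ratings = new
        if change < tol:
            break
    return ratings
```

As a toy usage example, `rate([Game("A", "B", "neutral"), Game("B", "C", "neutral")])` orders the three teams A, B, C. The `max_iter` guard is there because this simple sketch is not guaranteed to settle for every schedule.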
With all those things in mind, here are the Top 25 results of the system for the 2008 year-to-date (here’s the full list). I have made some comments on these results below the ratings:
1 Texas 804.23
2 Texas Tech 800.90
3 Utah 797.57
3 Oklahoma 797.57
5 TCU 707.57
6 Oklahoma St. 656.13
7 Boise St. 638.30
8 BYU 617.57
9 Florida 604.23
10 Alabama 595.50
11 Southern Cal 579.98
12 Penn St. 575.56
13 Georgia Tech 565.15
14 Cincinnati 558.74
15 Ohio St. 557.04
16 Virginia Tech 554.32
17 Florida St. 553.72
18 Boston College 550.13
19 Missouri 546.13
20 Georgia 540.59
21 North Carolina 540.57
22 Pittsburgh 529.27
23 Nebraska 528.32
24 Oregon 528.30
25 Clemson 527.82
Because I’m sure there will still be confusion about what the system is designed to do and what it means, here is the forecasted ratings set after bowl season if all teams favored in Vegas win:
1 Texas 679.27
2 Texas Tech 676.77
3 Oklahoma 659.98
4 Florida 657.70
5 Alabama 648.57
6 Utah 620.12
7 Oklahoma St. 601.06
8 Southern Cal 597.76
9 TCU 593.09
10 Boise St. 575.01
11 Cincinnati 571.97
12 Georgia Tech 564.94
13 Penn St. 562.11
14 Georgia 557.90
15 Florida St. 554.37
16 Ohio St. 552.03
17 Boston College 546.61
18 Virginia Tech 542.53
19 Missouri 541.71
20 North Carolina 539.63
21 Oregon St. 538.69
22 Clemson 534.15
23 Oregon 527.56
24 Wake Forest 521.18
25 California 519.26
BYU falls completely out for losing to Arizona. The ratings all get closer together because of the “upsets,” which leads to things like TCU falling 4 spots despite beating Boise State. It is interesting that Oklahoma stays ahead of Florida, but the Sooners would then have two losses, to #1 and #4, and wins over #2 and #7. Florida would have a loss to #33 and wins over #3 and #5.
I think the system is extremely logically sound and the only screwiness it has is because of the requirement not to use margin of victory. I would be interested in any specific questions about certain teams’ ratings that anyone has because I haven’t seen one that is not defensible based on the logical constraints above. I have posted the Top 10 outputs both pre- and post-bowl season for each season since the BCS started here. I would suggest reviewing those lists to get a better feel for the ratings. The 2008 season has been fairly crazy, so only looking at the one output won’t tell you everything. Two things to remember about the raw values for each season are that the linked page’s ratings sets used only Division 1-A teams, explaining why the raw numbers are so much lower across the board, and also that you can’t compare raw numbers from year to year.
Anyway, now anyone that actually read this far can see the steps taken in putting together one particular computer ratings system. I will attempt to answer any reasonable questions in the comments section.
© 2009 Fantake. All rights reserved unless otherwise indicated.
Ben said:
December 11th, 2008 at 1:02 pm
I’m confused on #5. Why would a win over a similar ranked team count more than a huge upset victory for the winning teams?
This is great though. If you keep publishing it, maybe you can get it included in the BCS someday.
Stuck in MN said:
December 11th, 2008 at 1:09 pm
While the BCS does not allow margin of victory could you still use margin of loss? Teams don’t get extra credit for big margins when they win, but they are penalized less for close losses than big losses? This might stay within the spirit of the BCS mandated rules since it does not create an incentive to run up the score on weaker teams but still reflects the results.
Also, you say “losing a game should never increase a team’s rating.” From a point of view where you are not permitted to look at any margin this is a logical step, but if the #110 team in the nation loses to #1 on the road by 1, shouldn’t they get some boost?
Huckleberry said:
December 11th, 2008 at 1:13 pm
Ben – #5 was phrased poorly. The games that should have little effect are the ones where two unevenly matched teams play and the result is the expected result. Upsets are given their due weight.
Stuck – No on the first question. Absolutely agreed on the second question. In my standard system, such a result would cause a significant change in the ratings for each team.
jc25 said:
December 11th, 2008 at 1:28 pm
Scip wows us with witty prose, you wow us with numbers. Well done.
Current Student said:
December 11th, 2008 at 2:33 pm
What do you think of using margin of loss as a conference tiebreaker or weighting factor in losses?
horny711 said:
December 11th, 2008 at 4:14 pm
http://www.youtube.com/watch?v=P1hTx9CNKcY
paste it or go to youtube and type in hitler hates the bcs
exuLt said:
December 11th, 2008 at 4:15 pm
Somehow there’s a systemic flaw in your system which allows middle tier teams equal rating with tier 1 schools. There’s no quality of equal opponent qualifier within league play, other than the inevitable on field defeat by a higher quality OOC team, and often times not before bowl play. And if the opponent is superior enough, there’s no rating penalty exacted at all. Not sure if win-loss records are enough or exactly how to address this disparity.
Or am I simply misunderstanding?
Huckleberry said:
December 11th, 2008 at 4:49 pm
I understand your concern, but this season’s set is unique with that number of highly rated MWC squads. Keep in mind that Utah beat Oregon State and that the MWC had a winning record against the Pac-10 this season. And because the computer can’t look at the scores, this is a major factor in what you see above.
exuLt said:
December 11th, 2008 at 5:29 pm
Still seems like an elaborate network of OOC Div. 1 common opponent evaluations is still required in order to qualify teams within leagues and leagues themselves. Or are you saying that this is inherently built into your and the other existing computer algorithms?
Huckleberry said:
December 11th, 2008 at 5:38 pm
Well, yes, it’s built in. It would obviously be nicer if there were more non-conference games, but the ratings here analyze every game in every iteration. There were 3,922 games so far this year and the ratings stabilized after 7,201 iterations. That’s over 28 million game analyses.
exuLt said:
December 11th, 2008 at 6:45 pm
Doh. I realized after I posted that I was exactly describing the infamous SOS. And therein lies the deviled details.
The Wood Shed said:
December 12th, 2008 at 7:47 am
First, I believe margin of victory should be a factor, but in an amended form. Losing to a team by 6 in the last second should not have the same effect as losing by 30 going away.
Since you are using expected outcome as a factor in your formula, you could also translate that to expected value of the margin of victory by using the spread. Scores can be adjusted to the spread while also factoring in the team’s ranking.
Huckleberry said:
December 12th, 2008 at 7:57 am
I agree in theory, and my normal ratings do that. However, as stated above in constraint #1, margin of victory is explicitly ruled out as a factor by the BCS folks. It can’t be used in a system considered by the BCS.
By the way, here is what the Top 10 would look like if Washington had beaten BYU:
doog said:
December 12th, 2008 at 8:48 am
Back when it was allowed, the BCS computers did use margin of victory in their calculations. The way I understood it, though, the margin was based on a curve, so that there was a point of diminishing returns.
EX: Winning by 14 rather than 3 had a higher positive delta than winning by 35 rather than 21. These earlier BCS comp models had a built in “running up the score doesn’t matter” algorithm. I believe Sagarin’s Pure Predictor model does this.
Huckleberry said:
December 12th, 2008 at 9:02 am
All good computer systems that use margin of victory do that (blatant personal opinion alert for that one, I guess).
The way it works is still based on the expected value versus actual outcome comparison. For example, in my standard ratings, the PMV set assigns a game score from 0 to 1. The winning team gets a certain score and the losing team gets one minus that score. The winning team’s score can vary from just barely above 1/2 to just barely below 1. Well, with rounding it can get to 1, but not in reality. This is because the game score represents the ratings’ guess for the probability that the team that won the game would win in a rematch.
So, obviously, the difference between winning 21-20 and winning 56-3 is large. The 21-20 victor’s game score is only barely above 1/2. The 56-3 winner’s game score is above 0.99, but that means that there’s not much improvement to be had for winning 63-3 instead of 56-3. It’s an exponential function based on the margin of victory and also the score ratio.
colin said:
December 12th, 2008 at 12:53 pm
I like your math.
The Current BCS = BS.
We need to abolish the BCS “System” once and for all.
I found this clever website which will send a “BS Flag” to the BCS mailbox for you for FREE.
Pretty clever. I just did it. If enough true fans do the same thing, maybe, just maybe, the BCS Committee will get the message loud and clear!
http://www.BSFlag.com/BCS
TaylorTRoom said:
December 13th, 2008 at 11:26 am
I looked at the BCS computer rating websites. Most of them describe their methodology, even if in just vague terms. Only two describe qualitatively what they think the top team should be. Billingsley thinks the top team is the team playing best at season’s end (unfortunately, that means your November schedule is everything). Colley says he is looking for the “best body of work” in a season.
Billingsley and Colley also seem to not know what to do with D-1AA teams, so they ignore those games. Therefore, their systems penalize a team for playing SMU, but not for playing Chattanooga, which is 10 times worse.
TaylorTRoom said:
December 13th, 2008 at 11:44 am
My experience in mathematical modeling is that you always need to keep in mind what your goal is. What are you trying to capture? There are always so many decisions and tradeoffs that you can’t construct a good model without an “ideal” or goal in mind. If you’re trying to model a ranking of college football teams, what are your criteria? Best team at the end (argument for OU)? Best body of work (Texas, because it has the best win to date)? Most consistently good (argument for Utah, or any other undefeated team)?
The polls capture all this by surveying scores of biased individuals, with their biases factored out by the large number voting. A well-constructed computer system can do this with objective data, but you need to know what it will yield.