Designing a New BCS Computer Ratings System

Posted by Huckleberry on December 11th, 2008 under Football

As everyone should know by now, I am a computer ratings geek. I have my own set of ratings, but I recently decided to set out to design a computer ratings system that accommodates the factors that either must or in my opinion should be reflected in a system used by the BCS. With that in mind, here are the constraints and logical requirements I started with when designing the system:

  1. No score information, only who won and who lost the game. This constraint is required by the BCS.
  2. Road Win > Neutral Win > Home Win > > > > > > > > > Road Loss > Neutral Loss > Home Loss. While the score of the game is not allowed to be used, the location is fair game.
  3. Beating a team that you are considerably higher rated than should have no effect on your rating. All 6 of the systems used by the BCS (I’m only 90% sure of this for Sagarin, though) violate this constraint. In some of them, your rating improves when you win a game, no matter the opponent. In at least one of the systems, it is possible for your rating to decrease if you beat a bad enough team. This should also never happen as a team can do no more than win a game.
  4. Losing to a team that you are considerably lower rated than should have no effect on your rating. Simply the converse of #3. Also conversely from #3, losing a game should never increase a team’s rating.
  5. A corollary to #3 and #4 is that games against teams of relatively equal strength should have a much greater impact on your rating than games against teams vastly superior or inferior to your team.
  6. Losing to a bad team is worse than losing to a good team. This should be obvious, but most of the current systems treat every game equally to a fault, IMO. And that includes my standard ratings. In fact, you can go to the Colley website and see that if you change the season so that Texas beat Texas Tech, but lost to Florida Atlantic, we become #1 in the ratings.

So starting with these constraints I began to design the inputs for the system. As you will see later, the first constraint is extremely important and is largely responsible for the specific results in any season. The best ratings systems are iterative ones that, with each game analysis, compare the expected result to the actual outcome. In a system where the actual score of the game is not allowed to be considered, the actual outcome input for each game is very limited in its possible values. Without game location or date information, in fact, this input must be binary (1 for a win, -1 for a loss, for example). However, as shown in #2 above, I added to the variation in that input by considering game location.

How did I arrive at specific numbers? Well, historically speaking, and I only looked at the last 4 years but it’s very consistent, home teams win a little over 57% of all college football games. At first, then, it would seem we could use 1.14 for a road win, 1 for a neutral win, and 0.86 for a home win as our game outcomes. Instead I used 1.1, 1, and 0.9 for the values. I did this because I consider that in college football better teams play more home games and therefore this contributes a little to the 57% number, so I called it 55% for my purposes. This is obviously a rather arbitrary determination and later I might revisit this figure and determine a more accurate value. The game outcome values, then, are as follows: 1.1 for a road win, 1 for a neutral win, 0.9 for a home win, -0.9 for a road loss, -1 for a neutral loss, and -1.1 for a home loss.
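
The outcome values above can be sketched as a small lookup in Python. This is only an illustration: the function and table names are hypothetical, and `location_weights` assumes the 1.14 and 0.86 figures come from doubling the 57% and 43% win rates, which reproduces the numbers quoted but is not spelled out in the text.

```python
# Illustrative sketch only; these names are hypothetical, not from the actual ratings code.

def location_weights(home_win_rate):
    """Return (road_win, neutral_win, home_win) for an assumed home-win rate.

    Assumption: each weight is twice the relevant win rate, which gives
    1.14 / 1 / 0.86 at 57% and 1.1 / 1 / 0.9 at 55%.
    """
    return (2 * home_win_rate, 1.0, 2 * (1 - home_win_rate))

# The six outcome values quoted in the post, keyed by (won, location).
# Location is always from the point of view of the team being scored.
OUTCOME = {
    (True, "road"): 1.1, (True, "neutral"): 1.0, (True, "home"): 0.9,
    (False, "road"): -0.9, (False, "neutral"): -1.0, (False, "home"): -1.1,
}

def outcome_value(won, location):
    """Look up the actual-outcome input for one team in one game."""
    return OUTCOME[(won, location)]

road, neutral, home = location_weights(0.57)
print(round(road, 2), neutral, round(home, 2))  # 1.14 1.0 0.86
```

Note that the two sides of any game get equal and opposite values: a road win (1.1) for one team is a home loss (-1.1) for the other.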

Using these outcome values, I then decided to use a system loosely based on the Elo rating methodology. This requires, as discussed above, an expected outcome and an actual outcome for each game. For these calculations, the expected outcome for a Team A playing Team B is initially set at Team A’s rating minus Team B’s rating, all divided by 100. This value is capped, though, depending on the game location: Team A’s expected value is capped at 1.1 for a road game, 1 for a neutral game, and 0.9 for a home game. This processing step maintains our #3 and #4 requirements above. Working with a neutral site game, if a team is rated more than 100 points higher than its opponent, winning the game will affect neither team’s rating. Losing the game, however, would cause an adjustment.
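
A minimal Python sketch of that expected-value step follows. The names are hypothetical, and the lower caps are an assumption: the text only states the upper caps, but requirement #4 (a loss to a far better team changes nothing) suggests the expected value is also floored at the matching loss value.

```python
# Hypothetical helper; the caps mirror the outcome values quoted in the post.
UPPER_CAP = {"road": 1.1, "neutral": 1.0, "home": 0.9}     # value of a win at that location
LOWER_CAP = {"road": -0.9, "neutral": -1.0, "home": -1.1}  # value of a loss (assumed floor)

def expected_value(rating_a, rating_b, location_a):
    """Expected outcome for Team A, where location_a is 'road', 'neutral' or 'home'."""
    raw = (rating_a - rating_b) / 100.0
    return max(LOWER_CAP[location_a], min(UPPER_CAP[location_a], raw))

print(expected_value(950, 800, "neutral"))  # 1.0  (raw 1.5, capped)
print(expected_value(750, 800, "neutral"))  # -0.5 (inside the caps)
print(expected_value(500, 800, "home"))     # -1.1 (raw -3.0, floored)
```

With the caps mirrored this way, Team B's expected value is always the negative of Team A's, just as the outcome values are.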

The second step in the Elo method for each game is to adjust each team’s rating based on the comparison between expected value and actual outcome. So in this system, after each game an adjustment score is tracked for each team. This ratings adjustment is equal to 100 times the difference between the actual outcome and the expected value. So, for example, if Team A is rated 150 points more highly than Team B and wins at a neutral site, there is no adjustment. The expected value was capped at 1, and the actual outcome was 1. Now, if Team A is rated 50 points lower than Team B and wins at a neutral site, the expected value was -0.5 and the actual outcome was 1, so Team A’s rating adjustment score would be +150 and Team B’s would be -150. These adjustment scores are not applied game by game: at the end of each iteration through the season, each team’s adjustment scores are summed, divided by the number of games the team has played, and then applied to its rating.
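
The adjustment rule can be written out directly from the formula, 100 times (actual outcome minus expected value). The function names below are hypothetical, and the four-game season is made up purely to show the averaging.

```python
def game_adjustment(actual, expected):
    """Adjustment score for one team in one game: 100 * (actual - expected)."""
    return 100.0 * (actual - expected)

def season_step(adjustments):
    """Average one team's adjustment scores over all the games it has played."""
    return sum(adjustments) / len(adjustments)

# Heavy favorite wins at a neutral site: expected value capped at 1, actual 1.
print(game_adjustment(1.0, 1.0))   # 0.0

# Team rated 50 points lower wins at a neutral site: expected -0.5, actual 1.
print(game_adjustment(1.0, -0.5))  # 150.0

# A made-up four-game season with one upset win and three expected wins.
print(season_step([0.0, 150.0, 0.0, 0.0]))  # 37.5
```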

The algorithm then iterates as many times as necessary until the ratings stabilize to a suitable degree.
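
Putting the pieces together, one way to sketch the whole loop in Python is shown below. This is a sketch under stated assumptions rather than the actual ratings code: the starting rating of 500, the stopping tolerance, the mirrored lower cap, and the `(winner, loser, winner_location)` input format are all choices made for the example.

```python
from collections import defaultdict

WIN = {"road": 1.1, "neutral": 1.0, "home": 0.9}  # winner's outcome by winner's location
FLIP = {"road": "home", "home": "road", "neutral": "neutral"}

def rate(games, start=500.0, tol=1e-6, max_passes=100_000):
    """games: list of (winner, loser, winner_location). Returns (ratings, passes)."""
    teams = sorted({t for w, l, _ in games for t in (w, l)})
    ratings = {t: start for t in teams}
    for passes in range(1, max_passes + 1):
        total = defaultdict(float)
        played = defaultdict(int)
        for winner, loser, loc in games:
            raw = (ratings[winner] - ratings[loser]) / 100.0
            # Cap at the win value; floor at the loss value for that location (assumed).
            expected = max(-WIN[FLIP[loc]], min(WIN[loc], raw))
            delta = 100.0 * (WIN[loc] - expected)
            total[winner] += delta
            total[loser] -= delta  # the loser's adjustment is the mirror image
            played[winner] += 1
            played[loser] += 1
        # Apply the averaged adjustments only after the full pass through the season.
        steps = {t: total[t] / played[t] for t in teams}
        for t in teams:
            ratings[t] += steps[t]
        if max(abs(s) for s in steps.values()) < tol:
            break
    return ratings, passes

games = [("A", "B", "neutral"), ("A", "C", "neutral"), ("B", "C", "neutral")]
ratings, passes = rate(games)
print(ratings, passes)  # {'A': 600.0, 'B': 500.0, 'C': 400.0} 2
```

In this toy three-team season every game is at a neutral site, so the ratings settle exactly 100 points apart. No damping factor is applied because none is described, so very small schedules can overshoot in a single pass.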

With all those things in mind, here are the Top 25 results of the system for the 2008 year-to-date (here’s the full list). I have made some comments on these results below the ratings:

1    Texas             804.23
2    Texas Tech        800.90
3    Utah              797.57
3    Oklahoma          797.57
5    TCU               707.57
6    Oklahoma St.      656.13
7    Boise St.         638.30
8    BYU               617.57
9    Florida           604.23
10   Alabama           595.50
11   Southern Cal      579.98
12   Penn St.          575.56
13   Georgia Tech      565.15
14   Cincinnati        558.74
15   Ohio St.          557.04
16   Virginia Tech     554.32
17   Florida St.       553.72
18   Boston College    550.13
19   Missouri          546.13
20   Georgia           540.59
21   North Carolina    540.57
22   Pittsburgh        529.27
23   Nebraska          528.32
24   Oregon            528.30
25   Clemson           527.82

  1. A reminder that margin of victory is ignored per the BCS requirements. Obviously the mid-majors wouldn’t be as high if it were included. But it’s not.
  2. Losing to a bad team kills you in this system, as I think it should. Essentially your rating then has to stay in range of the team you lost to: within 110 points if you lost on the road, 100 at a neutral site, or 90 at home, although this isn’t a hard and fast rule and can be violated with enough data. This is why Florida and Southern Cal are so low. They lost to “bad” teams and the other top teams did not. Ole Miss is #33 with a 509.36 rating and Oregon St. is #30 with a 519.83 rating. (Iowa’s #43 ranking and 494.21 rating keep Penn State down.)
  3. As you can see, an undefeated team’s rating is essentially determined based on their best win. Boise State beat Oregon in Eugene. Their rating is 110 points higher than the Ducks.
  4. Also, a group of teams that is undefeated outside of the group has their ratings essentially set based on the group’s best outside win. These two aspects show why Utah and Oklahoma are tied. Utah’s best win was TCU at home. Similarly, the best win of the Texas/Texas Tech/Oklahoma trio outside that group was Oklahoma’s win over TCU at home. They are tied 90 points ahead of the Horned Frogs. The logical conclusion, then, is that if Texas had beaten Oklahoma in Austin, all four teams would be tied at the top. Keeping in mind that we can’t consider the scores of the games, this is logically reasonable. All we know is that Utah has beaten everyone they have played, and that they beat TCU at home. Their rating then stabilizes at 90 points above the Horned Frogs.
  5. Remember that the intent of this system is to rank teams for the BCS. This means that I intend it to reflect who has accomplished the most based on what has already happened, within the logical constraints above, and that predicting the strengths of the teams for future contests is completely ignored. I have my power ratings for that. Before anyone asks: no, I do not think that this list is in order of how good each team is.
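
The “best win” behavior in comments 3 and 4 can be checked with a simplified Python sketch. It assumes the opponent’s rating stays fixed, that the undefeated team plays 12 games, and that every win other than the best one is already capped (zero adjustment). The function name and the starting rating of 500 are hypothetical.

```python
WIN = {"road": 1.1, "neutral": 1.0, "home": 0.9}
LOSS = {"road": -0.9, "neutral": -1.0, "home": -1.1}  # assumed floor on the expected value

def settle_undefeated(best_win_rating, location, games_played, start=500.0, passes=500):
    """Rating an undefeated team drifts to when only its best win still moves it."""
    rating = start
    for _ in range(passes):
        raw = (rating - best_win_rating) / 100.0
        expected = max(LOSS[location], min(WIN[location], raw))
        # All other wins are assumed capped, so only this game contributes to the average.
        rating += 100.0 * (WIN[location] - expected) / games_played
    return rating

print(round(settle_undefeated(707.57, "home", 12), 2))  # 797.57 (90 above TCU)
print(round(settle_undefeated(528.30, "road", 12), 2))  # 638.3  (110 above Oregon)
```

Under those assumptions the rating climbs until the best win stops producing an adjustment, which is 90 points above a team beaten at home and 110 above a team beaten on the road.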

Because I’m sure there will still be confusion about what the system is designed to do and what it means, here is the forecasted ratings set after bowl season if all teams favored in Vegas win:

1    Texas           679.27
2    Texas Tech      676.77
3    Oklahoma        659.98
4    Florida         657.70
5    Alabama         648.57
6    Utah            620.12
7    Oklahoma St.    601.06
8    Southern Cal    597.76
9    TCU             593.09
10   Boise St.       575.01
11   Cincinnati      571.97
12   Georgia Tech    564.94
13   Penn St.        562.11
14   Georgia         557.90
15   Florida St.     554.37
16   Ohio St.        552.03
17   Boston College  546.61
18   Virginia Tech   542.53
19   Missouri        541.71
20   North Carolina  539.63
21   Oregon St.      538.69
22   Clemson         534.15
23   Oregon          527.56
24   Wake Forest     521.18
25   California      519.26

BYU falls completely out for losing to Arizona. The ratings all get closer together because of the “upsets,” which leads to things like TCU falling 4 spots despite beating Boise State. Interesting that Oklahoma stays ahead of Florida, but they now have two losses, to #1 and #4, and wins over #2 and #7. Florida has a loss to #33 and wins over #3 and #5.

I think the system is extremely logically sound and the only screwiness it has is because of the requirement not to use margin of victory. I would be interested in any specific questions about certain teams’ ratings that anyone has because I haven’t seen one that is not defensible based on the logical constraints above. I have posted the Top 10 outputs both pre- and post-bowl season for each season since the BCS started here. I would suggest reviewing those lists to get a better feel for the ratings. The 2008 season has been fairly crazy, so only looking at the one output won’t tell you everything. Two things to remember about the raw values for each season are that the linked page’s ratings sets used only Division 1-A teams, explaining why the raw numbers are so much lower across the board, and also that you can’t compare raw numbers from year to year.

Anyway, now anyone that actually read this far can see the steps taken in putting together one particular computer ratings system. I will attempt to answer any reasonable questions in the comments section.

18 Responses

  1. I’m confused on #5. Why would a win over a similarly ranked team count more than a huge upset victory for the winning team?

    This is great though. If you keep publishing it, maybe you can get it included in the BCS someday.

  2. Stuck in MN said:

    December 11th, 2008 at 1:09 pm

    While the BCS does not allow margin of victory could you still use margin of loss? Teams don’t get extra credit for big margins when they win, but they are penalized less for close losses than big losses? This might stay within the spirit of the BCS mandated rules since it does not create an incentive to run up the score on weaker teams but still reflects the results.

    Also, where you say “losing a game should never increase a team’s rating.” From a point of view where you are not permitted to look at any margin this is a logical step, but if the #110 team in the nation loses to #1 on the road by 1, shouldn’t they get some boost?

  3. Ben – #5 was phrased poorly. The games that should have little effect are the ones where two unevenly matched teams play and the result is the expected result. Upsets are given their due weight.

    Stuck – No on the first question. Absolutely agreed on the second question. In my standard system, such a result would cause a significant change in the ratings for each team.

  4. Scip wows us with witty prose, you wow us with numbers. Well done.

  5. Current Student said:

    December 11th, 2008 at 2:33 pm

    What do you think of using margin of loss as a conference tiebreaker or weighting factor in losses?

  6. http://www.youtube.com/watch?v=P1hTx9CNKcY
    paste it or go to youtube and type in hitler hates the bcs

  7. Somehow there’s a systemic flaw in your system which allows middle tier teams equal rating with tier 1 schools. There’s no quality of equal opponent qualifier within league play, other than the inevitable on field defeat by a higher quality OOC team, and often times not before bowl play. And if the opponent is superior enough, there’s no rating penalty exacted at all. Not sure if win-loss records are enough or exactly how to address this disparity.

    Or am I simply misunderstanding?

  8. I understand your concern, but this season’s set is unique with that number of highly rated MWC squads. Keep in mind that Utah beat Oregon State and that the MWC had a winning record against the Pac-10 this season. And because the computer can’t look at the scores, this is a major factor in what you see above.

  9. Still seems like an elaborate network of OOC Div. 1 common opponent evaluations is still required in order to qualify teams within leagues and leagues themselves. Or are you saying that this is inherently built into your and the other existing computer algorithms?

  10. Well, yes, it’s built in. It would obviously be nicer if there were more non-conference games, but the ratings here analyze every game in every iteration. There were 3,922 games so far this year and the ratings stabilized after 7,201 iterations. That’s over 28 million game analyses.

  11. Doh. I realized after I posted that I was exactly describing the infamous SOS. And therein lies the deviled details.

  12. The Wood Shed said:

    December 12th, 2008 at 7:47 am

    First, I believe margin of victory should be a factor, but in an amended form. Losing to a team by 6 in the last second should not have the same effect as losing by 30 going away.

    Since you are using expected outcome as a factor in your formula, you could also translate that to expected value of the margin of victory by using the spread. Scores can be adjusted to the spread while also factoring in the team’s ranking.

  13. I agree in theory, and my normal ratings do that. However, as stated above in constraint #1, margin of victory is explicitly ruled out as a factor by the BCS folks. It can’t be used in a system considered by the BCS.

    By the way, here is what the Top 10 would look like if Washington had beaten BYU:

    1    Texas           773.14
    2    Texas Tech      769.81
    3    Oklahoma        766.47
    4    Utah            680.86
    5    Oklahoma St.    656.47
    6    Boise St.       647.12
    7    Florida         605.43
    8    Alabama         596.75
    9    TCU             590.86
    10   Southern Cal    585.67

  14. Back when it was allowed, the BCS computers did use margin of victory in their calculations. The way I understood it, though, the margin was based on a curve, so that there was a point of diminishing returns.

    EX: Winning by 14 rather than 3 had a higher positive delta than winning by 35 rather than 21. These earlier BCS comp models had a built in “running up the score doesn’t matter” algorithm. I believe Sagarin’s Pure Predictor model does this.

  15. All good computer systems that use margin of victory do that (blatant personal opinion alert for that one, I guess).

    The way it works is still based on the expected value versus actual outcome comparison. For example, in my standard ratings, the PMV set assigns a game score from 0 to 1. The winning team gets a certain score and the losing team gets one minus that score. The winning team’s score can vary from just barely above 1/2 to just barely below 1. Well, with rounding it can get to 1, but not in reality. This is because the game score represents the ratings’ guess for the probability that the team that won the game would win in a rematch.

    So, obviously, the difference between winning 21-20 and winning 56-3 is large. The 21-20 victor’s game score is only barely above 1/2. The 56-3 winner’s game score is above 0.99, but that means that there’s not much improvement to be had for winning 63-3 instead of 56-3. It’s an exponential function based on the margin of victory and also the score ratio.

  16. I like your math.
    The Current BCS = BS.
    We need to abolish the BCS “System” once and for all.
    I found this clever website which will send a “BS Flag” to the BCS mailbox for you for FREE.
    Pretty clever. I just did it. If enough true fans do the same thing, maybe, just maybe, the BCS Committee will get the message loud and clear!
    http://www.BSFlag.com/BCS

  17. I looked at the BCS computer rating websites. Most of them describe their methodology, even if in just vague terms. Only two describe qualitatively what they think the top team should be. Billingsley thinks the top team is the team playing best at season’s end (unfortunately, that means your November schedule is everything). Colley says he is looking for the “best body of work” in a season.

    Billingsley and Colley also seem to not know what to do with D-1AA teams, so they ignore those games. Therefore, their systems penalize a team for playing SMU, but not for playing Chattanooga, which is 10 times worse.

  18. My experience in mathematical modeling is that you always need to keep in mind what your goal is. What are you trying to capture? There are always so many decisions and tradeoffs that you can’t construct a good model without an “ideal” or goal in mind. If you’re trying to model a ranking of college football teams, what are your criteria? Best team at the end (argument for OU)? Best body of work (Texas, because it has the best win to date)? Most consistently good (argument for Utah, or any other undefeated teams)?

    The polls capture all this by surveying scores of biased individuals, with their biases factored out by the large number voting. A well-constructed computer system can do this with objective data, but you need to know what it will yield.
