A Modest Proposal for the College Football Playoffs

Nobody is going to be happy, so let's just own it

Ah, the College Football Playoff. The crucible where legacies are forged, fans are angered, and gamblers lose money. Possibly the most controversial subject of the past two years, and for good reason. Nevertheless, I will be sticking my dick in the proverbial beehive, if for no better reason than to have something to point at when someone gets mad at me on Xitter.

Today, I will be presenting my spin on a playoff ranking and selection mechanism, along with my methodology, reasoning, and backtesting results.

1. Assumptions

All models[1] need assumptions, and this one is no exception. My primary ones are as follows:

  1. The Markovian assumption: team performance is independent year-to-year[2][3]
  2. On average, good teams will win games
  3. Qualitative data fails when attempting to take the entire season into account[4]
  4. Ceteris paribus, the best team in the country will win the championship so long as they are in the tournament

2. Motivation

Originally, I wanted to come up with a system free from the controversy-causing bias that existing systems carry with them (which is why the AP poll and any and all committees are conspicuous by their absence)[5]. However, after some contemplation, it was revealed to me that no matter what system is in place, fans are going to bitch so long as their favorite team isn't included, however poor that team's performance was.[6]

However, the rest of my motivations remain relevant; namely: rewarding teams for tough scheduling, disincentivizing cupcake games, avoiding overindexing on gameable statistics, and making good television.

3. Judging Criteria

My system is based around Elo[7], which relies purely on wins and losses, and nothing else[8]. Now, I can already hear the mooing of the farmyard cattle, which sounds suspiciously like "box score merchant". However, you foolish ungulates, there's a good reason for this. By assumption 2, any particular metric that you'd prefer to be integrated into the model is already "priced in", because it (ostensibly) contributes to a team win.

Furthermore, these rankings are calculated at the end of the season, after conference championships[9], which means that, assuming good scheduling[10], any single-game events will have averaged out over the season as further evidence is introduced[11].

So, our basic model looks like this:

  • All teams begin the season with an Elo rating of 1000
  • All FCS teams maintain a constant Elo rating of 500[12]
  • Each week, the Elo rating of a team is adjusted by the outcome of its game and the rating of its opponent[13]
  • At the end of the season, the ratings are finalized and used to select the teams that will play in the playoff
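For the curious, those bullets translate into something like the Python below. Treat it as a minimal sketch under stated assumptions rather than the code linked in the citations: the 400-point scale is the conventional Elo default and is assumed here, every game is assumed to have a winner, and the names `expected_score`, `run_season`, and `fcs_teams` are made up for illustration.

```python
from collections import defaultdict

BASE_RATING = 1000.0  # every FBS team starts the season here
FCS_RATING = 500.0    # FCS opponents are pinned at this rating
K = 64.0              # K-factor; 32 is the other value mentioned in the footnotes
SCALE = 400.0         # assumption: the conventional Elo scale


def expected_score(rating, opp_rating):
    """Win probability for `rating` against `opp_rating` on the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - rating) / SCALE))


def run_season(games, fcs_teams=frozenset()):
    """Apply Elo updates for (winner, loser) pairs given in chronological order."""
    ratings = defaultdict(lambda: BASE_RATING)
    for winner, loser in games:
        r_w = FCS_RATING if winner in fcs_teams else ratings[winner]
        r_l = FCS_RATING if loser in fcs_teams else ratings[loser]
        # The winner gains exactly what the loser drops.
        delta = K * (1.0 - expected_score(r_w, r_l))
        if winner not in fcs_teams:
            ratings[winner] = r_w + delta
        if loser not in fcs_teams:
            ratings[loser] = r_l - delta  # FCS teams never move
    return dict(ratings)
```

With these constants, `expected_score(1000, 500)` lands a little under 0.95, so beating a cupcake is worth only a few points, while losing to one costs nearly the full K.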

Pretty simple! But it is at the fourth step that we run into a bit of a quandary.

3.1. To Weight Or Not To Weight

You see, most games a team plays are in-conference, and some conferences are stronger (i.e., have a higher concentration of better teams) than others. That raises the question: are we to "play it straight" and have Elo ratings affected only by other Elo ratings, or should we bring in conference Elo ratings when it comes to the final choice?

I tested both[14], and found that, while playing it straight could be a bit of a G5stravaganza[15], weighting allowed inferior teams to ride on the coattails of their conference; e.g., a 2024 playoff selected this way would have included Arkansas and Oklahoma. Despite being a fan of both teams, and an alum of the first school, I will be the first to admit that that's bullshit.
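To make the coattail effect concrete, here is a rough sketch of the weighted variant: each team's final number is the mean of its own Elo and its conference's Elo. How the conference rating itself is computed is not spelled out above, so the sketch assumes it is the plain average of the members' final ratings; `conference_weighted` and `conference_of` are hypothetical names.

```python
from statistics import mean


def conference_weighted(ratings, conference_of):
    """Blend each team's final Elo with its conference's rating (mean of the two)."""
    members = {}
    for team, rating in ratings.items():
        members.setdefault(conference_of[team], []).append(rating)
    # Assumption: a conference's rating is the average of its members' final ratings.
    conf_rating = {conf: mean(vals) for conf, vals in members.items()}
    return {
        team: (rating + conf_rating[conference_of[team]]) / 2.0
        for team, rating in ratings.items()
    }
```

With made-up numbers: if teams A (1400) and B (1100) share a conference while C (1150) and D (950) share another, the straight order is A, C, B, D, but the weighted order is A, B, C, D. B leapfrogs C purely because of who A is.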

In my opinion, KISS and run straight Elo. It's a little bit noisier, but it gets the best teams nationally in, and introduces a bit more chaos with the G5 teams that show up.

[Image: olympics_turkish_shooter_internet_legend.jpg]

4. Bracket Structure

Sixteen teams, no byes, with seeding structured as a reflection across the middle of the list (e.g., #1 vs #16, #2 vs #15, etc.). If we're playing it straight, then this means that the G5 teams that squeaked in act as pseudo-byes, rewarding the teams that maximized their Elo rating in the regular season without creating a rest-versus-rust dilemma. Furthermore, any blowouts are likely to occur in the first round, so you don't have an anticlimactic beatdown for your championship game, like TCU vs Georgia in 2023[16]. Any major upsets are also likely to occur in the first round, giving you solid storylines for underdog teams.
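The first-round pairing is simple enough to sketch in a few lines. This assumes a dict of final ratings keyed by team name; `first_round` is a hypothetical helper, and since nothing above says how to break an exact ratings tie, the sketch just falls back on input order.

```python
def first_round(ratings, field_size=16):
    """Return first-round pairings as (higher_seed, lower_seed) tuples."""
    seeds = sorted(ratings, key=ratings.get, reverse=True)[:field_size]
    if len(seeds) < field_size:
        raise ValueError("not enough rated teams to fill the bracket")
    # Mirror across the middle of the list: #1 vs #16, #2 vs #15, ..., #8 vs #9.
    return [(seeds[i], seeds[field_size - 1 - i]) for i in range(field_size // 2)]
```

Passing a smaller `field_size` (say 12 or 8) gives the same mirrored structure if anyone wants to compare field sizes.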

5. Citations

The data comes from sports-reference.com, and my code and results are here.

Footnotes:

1

Which is what this and its peers are

2

E.g., a good performance last year does not imply a good performance this year, and vice versa

3

This is pretty easy to justify in the NIL era, since teams lose half their members to the transfer portal; however, as I will demonstrate later, it works fine even in the old days when that wasn't the case

4

Recency bias, conformity bias, etc

5

Caesar's wife must be above suspicion, ya know?

6

If any Alabama fans want to jump up my ass, your precious Tide lost to Michigan in the ReliaQuest Bowl. You weren't going to win the natty, so shut up

7

"Why not Glicko or another, newer rating system?" KISS, YAGNI, and we don't have enough data within the season to justify a more complex model, since all we're trying to do is find the top teams, not predict their outcomes

8

The BCS used this partially, but never on its own, and it also integrated garbage like polls

9

Incidentally, this part also incentivizes trying to win the conference, rather than sandbagging and keeping your best players off the field to prep them for the national championship, because the winner of the conference championship is almost certainly going to have the highest Elo rating in the conference going into the playoff

10

I.e., you didn't lose to an FCS team

11

Much Bayes, very wow

12

Giving a ~95% chance of a cupcake win by an FBS team still sitting at its starting rating of 1000

13

I tested this with K values of 32 and 64. There are some swings in the particular ranking of a given team, but the set of teams that makes the list remains largely the same. I'd use 64, since our dataset is pretty small

14

Weighting by using the mean of the final team Elo rating and the final conference Elo rating

15

I suspect this is due to the SEC's cupcake games costing its teams some of their edge, since all the ratings were pretty tight at the end

16

Quick tangent: this game is a stupid example for "G5 teams will never be relevant in the postseason", since a) TCU was a power 4 team (Big 12), b) TCU had to beat Michigan (51-45) to get there in the first place, they didn't just spawn in, and c) the most recent example of a G5 team against a power 4 team is Boise State versus Penn State, which ended 31-14 and would likely have been closer had someone other than Dirk Koetter been running Boise State's offense

Created: 2025-08-10 Sun 18:30
