What’s the Best NCAA Bracket Strategy?


I spent the first 22 years of my life in Kentucky. Anyone who grew up in the Commonwealth has no choice but to be a college basketball fan. You are indoctrinated at an early age and destined to be either a Kentucky fan or a Louisville fan. Those are the only choices. Want to cheer for North Carolina? Not in our state. Duke? Go find Christian Laettner and get out of here! I am personally a Kentucky fan and, unlike with my other teams (Cincinnati Reds, Cincinnati Bengals), life is good as a Kentucky fan. They are a perennial top seed in the tournament and regularly win the whole thing. They’ve won eight championships in their history, second only to UCLA, which has won eleven. And they’ve been to seventeen Final Fours, tied with UCLA and second only to North Carolina, which has been to nineteen.


So, as a college basketball fan, March is one of the best times of the year. I regularly participate in bracket pools and, though I follow the sport closely, I usually do quite poorly. Every year, I do my best to make educated guesses on which teams will make it through the tournament, which underdogs will be responsible for the biggest upsets, etc. But I’ve secretly always believed that I would do much better if I just used the “Top Seed” method. The idea here is that you always pick the higher-seeded team to win each game. When the number ones meet in the Final Four, you then pick the team with the better overall seed. For example, in the 2016 tournament, the four one seeds were Kansas, Oregon, North Carolina, and Virginia (alas, Kentucky was only a four seed last year). But the selection committee also assigns overall seeds to all of the 64 teams. Last year, Kansas was the overall number one, followed by North Carolina, Virginia, and Oregon, respectively. So, with the Top Seed method, you’d pick Kansas to defeat Oregon and North Carolina to defeat Virginia. And, in the finals, you’d choose Kansas.
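To make the rule concrete, here is a minimal Python sketch of the Top Seed method applied to the 2016 Final Four described above. The `Team` type and the `top_seed_pick` function are names I made up for illustration; the seeds and overall rankings are the ones listed in the paragraph above.

```python
from typing import NamedTuple


class Team(NamedTuple):
    name: str
    seed: int          # regional seed, 1 is best
    overall_seed: int  # committee's overall ranking, 1 is best


def top_seed_pick(a: Team, b: Team) -> Team:
    """Pick the better regional seed; break ties with the overall seed."""
    return min(a, b, key=lambda t: (t.seed, t.overall_seed))


# The four 2016 one seeds, with their overall rankings.
kansas = Team("Kansas", 1, 1)
north_carolina = Team("North Carolina", 1, 2)
virginia = Team("Virginia", 1, 3)
oregon = Team("Oregon", 1, 4)

# Final Four matchups as described above.
semi_1 = top_seed_pick(kansas, oregon)
semi_2 = top_seed_pick(north_carolina, virginia)
champion = top_seed_pick(semi_1, semi_2)

print(semi_1.name, semi_2.name, champion.name)
```

The same function works for every earlier game too, since the regional seeds differ in those matchups and the overall seed is only needed once the one seeds meet.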

Testing the Theory
With March right around the corner, I decided to test the theory. I set out to collect 2016 brackets from a number of different sources, which fall into two different categories:

    1) Experts, Analysts, and Sports Writers
    2) Computer-Aided Predictive Models

Once I had collected them, I would compare the results of these brackets to the Top Seed model. Unfortunately (and perhaps not coincidentally), it was pretty difficult to find complete pre-tournament brackets for the first category. Perhaps these experts simply don’t want people like me following up to measure their results? CBS was the only organization I found that posted its full brackets, and those were available only for 2016; I could not find brackets from any prior year. Ideally we’d be analyzing data for a large number of analysts and predictive models over the course of numerous years, but since the data does not seem to be readily available, I’m going to settle for analyzing a handful of brackets for 2016.

Twelve of CBS’s writers and analysts filled out brackets for the 2016 tournament. Rather than analyze all twelve, I chose the two who represent the best and worst performances. The best performance went to Josh Nagel, a SportsLine Analyst, and the worst went to Wally Szczerbiak, a Sports Network Analyst and former NBA player. CBS also included a bracket from Bracket Voodoo, an organization which uses big data and statistical models to help people make the best of their brackets. Josh and Wally will go into our first category and Bracket Voodoo will go into the second.

So, now I had the following brackets to compare:

    1) Josh Nagel
    2) Wally Szczerbiak
    3) Bracket Voodoo
    4) Top Seed Method

I then used Tableau to create a fully interactive tournament bracket. Unfortunately, to fit all the information onscreen, the visualization is fairly large and will not fit nicely into this blog post. So, for the purpose of this discussion, I’ll be using screenshots. But I highly encourage you to check out the full visualization. It allows you to select a bracket to view, then by hovering over a team, you can highlight the path the bracket predicts that team will take. And you can use the “Highlight Incorrect” filter at the top to highlight, in green and red, the correct and incorrect predictions. Please note that each screenshot below will take you directly to the Tableau Visualization already filtered to match what’s on the image.

How’d They Do?
So, how did they do? Let’s start by showing the actual results of the tournament.


If you were to pick every single game accurately, you’d score the maximum of 192 points, as shown above. Please note that I’m using the scoring method used by CBS and most other news organizations and office pools, which doubles the points for each round and, therefore, results in each round being worth a total of 32 points:

    Round 1 – 32 Games at 1 Point Each
    Round 2 – 16 Games at 2 Points Each
    Round 3 (Sweet Sixteen) – 8 Games at 4 Points Each
    Round 4 (Elite Eight) – 4 Games at 8 Points Each
    Round 5 (Final Four) – 2 Games at 16 Points Each
    Round 6 (Championship) – 1 Game at 32 Points
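As a quick check on the arithmetic, the scoring scheme above can be sketched in a few lines of Python. The function name `score_bracket` and its input format (a count of correct picks per round) are my own assumptions for illustration, not anything CBS publishes.

```python
# Games and points per game for each of the six rounds, as listed above.
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]
POINTS_PER_GAME = [1, 2, 4, 8, 16, 32]


def score_bracket(correct_picks_per_round):
    """Total score given the number of correct picks in each round."""
    if len(correct_picks_per_round) != len(GAMES_PER_ROUND):
        raise ValueError("expected one count per round (six rounds)")
    for correct, games in zip(correct_picks_per_round, GAMES_PER_ROUND):
        if not 0 <= correct <= games:
            raise ValueError("correct picks cannot exceed games in a round")
    return sum(
        correct * points
        for correct, points in zip(correct_picks_per_round, POINTS_PER_GAME)
    )


# A perfect bracket gets every game right in every round.
max_score = score_bracket(GAMES_PER_ROUND)
print(max_score)  # 192
```

Because the points double while the games halve, every round is worth the same 32 points, which is why a single wrong champion costs as much as missing the entire first round.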

Wally Szczerbiak
Now let’s take a look at poor Wally Szczerbiak. As noted previously, he had the poorest performance of all of the CBS analysts. Here’s his bracket:


Wally failed to pick a single Final Four team. The early loss of Michigan State hurt him badly, as it did many others, since he had them winning the tournament. Wally finished with a measly 52 points.

Josh Nagel
Josh Nagel, on the other hand, had the best performance of the CBS team. Here’s his bracket:


He picked two of the Final Four teams—Oklahoma and North Carolina—and one of the Championship teams—North Carolina. He had Michigan State losing in the Sweet Sixteen, so their early exit did not impact him as much as it did others. He finished with 87 points.

Bracket Voodoo
Now, onto the predictive model. Here’s Bracket Voodoo’s bracket:


Yikes! Though Bracket Voodoo did better than Wally, they were nowhere near Josh. They chose only one of the Final Four teams and neither of the Championship teams, finishing with 71 points. This falls somewhere in the middle of the 12 CBS analysts, performing better than 7 of them, but worse than 5. This is very disconcerting. A computer-aided predictive model should tend to perform better than people. Granted, this is only a single year and by no means a comprehensive analysis, but this is still worrisome. But, as I’ve said before, predicting the future is hard! And this is why we must all be very careful with predictive models. Simply because someone says they have a system and that system is aided by computers and statistics does not mean that system is necessarily any better.

Top Seed
We’ve looked at the two analysts and the one predictive model, so let’s circle back to my original theory about the Top Seed method. Here’s a bracket based on this method:


Now look at that! The Top Seed method chose one of the Final Four teams and one of the Championship teams (North Carolina), but performed much better in previous rounds. This method tied our top analyst with a total of 87 points. To reiterate, this simple method performed as well as or better than all 12 CBS analysts and significantly better than the Bracket Voodoo predictive model.

As I’ve noted earlier, this data set is not comprehensive enough to consider these findings proof of my theory. To do that, we’d need to look at a much larger group of analysts and predictive models, covering numerous years’ tournaments. But it is at least some small amount of evidence to indicate that the brackets of “experts” and predictive models won’t necessarily perform that well.

So, should we just use the Top Seed method in the 2017 tournament? Not so fast…I have to admit that I hadn’t previously heard of Bracket Voodoo and their methodology is unclear. But there are other organizations with a strong track record in predictive analytics. So, I decided to take a look at perhaps the most well-known of them, FiveThirtyEight.

FiveThirtyEight
FiveThirtyEight does not fill out a typical bracket with winners and losers, but rather produces a model which assigns probabilities of each team making it to a given round. For instance, they projected the following probabilities for Villanova, the eventual champion:

    Advance to 2nd Round: 96%
    Advance to Sweet Sixteen: 77%
    Advance to Elite Eight: 47%
    Advance to Final Four: 22%
    Advance to Championship: 13%
    Win Championship: 6%

Though it is a gross oversimplification, I analyzed their model and built a bracket which chose the team with the highest probability of advancing. For example, Kansas had a >99% chance of advancing to round 2 and played Austin Peay who had a <1% chance. So, in this case, the bracket had Kansas moving on. I continued this process through to the end to create a complete FiveThirtyEight bracket. Here’s the result:


Now this is interesting. Like Josh Nagel, FiveThirtyEight chose two of the Final Four teams—Oklahoma and North Carolina—and one of the Championship teams—North Carolina. But their performance in earlier rounds was much better than Josh’s, allowing them to outscore him by 11 points, with a final score of 98.
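For anyone who wants to repeat the exercise, the process I used to turn probabilities into a bracket (at each game, advance the team with the higher probability of reaching the next round) can be sketched roughly like this in Python. The function name, the data layout, and the four-team example with its probabilities are all made up for illustration; they are not FiveThirtyEight’s numbers or format.

```python
def build_bracket(teams, advance_prob):
    """Fill out a bracket by always advancing the more likely team.

    teams: team names in bracket order (adjacent teams play each other);
           the length must be a power of two.
    advance_prob: maps team -> list of probabilities, where entry r is the
                  probability of that team winning its game in round r
                  (0-indexed) and advancing.
    Returns a list of rounds, each a list of predicted winners.
    """
    rounds = []
    current = list(teams)
    round_index = 0
    while len(current) > 1:
        winners = [
            max(pair, key=lambda team: advance_prob[team][round_index])
            for pair in zip(current[0::2], current[1::2])
        ]
        rounds.append(winners)
        current = winners
        round_index += 1
    return rounds


# Hypothetical four-team mini-bracket with made-up probabilities.
teams = ["A", "B", "C", "D"]
advance_prob = {
    "A": [0.90, 0.50],
    "B": [0.10, 0.02],
    "C": [0.45, 0.20],
    "D": [0.55, 0.28],
}

predicted = build_bracket(teams, advance_prob)
print(predicted)  # [['A', 'D'], ['A']]
```

Note that this compares each team’s cumulative chance of advancing rather than a head-to-head win probability, which is part of why I call it a gross oversimplification of their model.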

What’s All This Mean?
So what does all this mean? Well, clearly there is at least some evidence that the Top Seed method will perform better than analysts and at least some predictive models. We can also see that not all predictive models are created equal. So, what strategy should you employ on your 2017 bracket? Well, Top Seed seems like a reasonable choice, but perhaps you should leverage FiveThirtyEight’s predictions for that slight edge. But whatever you do, don’t listen to former NBA players!

Onto 2017…
I had fun with this analysis and really want to see if these findings are true of other years. So, I intend to do the same analysis for the 2017 tournament. Once the experts and predictive models publish their projections, I’ll put them into my Tableau-based bracket and I’ll report on the results throughout the tournament. Look for another post coming soon! And please take a look at the full visualization on Tableau Public. I had a lot of fun building it and hope you enjoy interacting with it.

Ken Flerlage, February 24, 2017
Twitter | LinkedIn | Tableau Public

4 comments:

  1. Thanks for putting this together. I had a hard time finding an article that looked back on bracket predictions and this was very insightful.

    1. Thanks. I wish I had more data to test with, but I'll add in some 2017 data after the tournament is over.

  2. I just used the FiveThirtyEight for this year and they only picked 2 upsets with nova as champ. What are your predictions for final 4 and champ?

    1. Sorry, just saw this. If you used Fivethirtyeight, then like others, your bracket is not in great shape. It'll be interesting to see. I personally have Kentucky, but that's because I'm a fan.

