Open Access Publications from the University of California

Using Statistics to Create the Perfect March Madness Bracket

Creative Commons 'BY-NC-ND' version 4.0 license

The goal of this project is to analyze regular-season data from NCAA Division I men's basketball teams to predict how they will perform in the national championship tournament, colloquially known as March Madness. I use a data set that ranks teams according to their Pomeroy College Basketball Ratings[1]. These ratings provide in-depth basketball statistics for each year from 2002 to the present and use several different measures to quantify how good or bad a team is. My analysis has three parts: simple linear regression, multiple linear regression, and polynomial regression. I start with simple linear regression on the 2016 data, first using Adjusted Offensive Efficiency as the predictor and then using Adjusted Defensive Efficiency. Next, I fit a multiple linear regression and find that using both Adjusted Offensive Efficiency and Adjusted Defensive Efficiency greatly improves the predictions, although they are still not perfect. Finally, I fit a polynomial regression using Adjusted Offensive Efficiency as the predictor. None of these methods can predict the perfect bracket; however, the multiple linear regression performs surprisingly well, making the correct final ranking predictions approximately 62.33% of the time.
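As a rough illustration of how the three model types compare, the following is a minimal sketch, not the code used in this project. It assumes synthetic data in place of the Pomeroy ratings, and the names `design_matrix`, `fit_ols`, `adj_o`, `adj_d`, and `outcome` are hypothetical. Each model is fit by ordinary least squares and summarized by its R² value.

```python
import numpy as np

def design_matrix(*columns):
    """Place predictor columns next to an intercept column of ones."""
    n = len(columns[0])
    return np.column_stack([np.ones(n)] + [np.asarray(c, dtype=float) for c in columns])

def fit_ols(X, y):
    """Fit ordinary least squares; return the coefficients and R^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot

# Synthetic stand-in data: these are NOT real Pomeroy ratings.
rng = np.random.default_rng(0)
n_teams = 64
adj_o = rng.normal(105.0, 6.0, n_teams)  # assumed offensive efficiency values
adj_d = rng.normal(100.0, 6.0, n_teams)  # assumed defensive efficiency values
# Hypothetical response that depends on both predictors plus noise.
outcome = 0.5 * adj_o - 0.5 * adj_d + rng.normal(0.0, 2.0, n_teams)

# Simple linear regression, one predictor at a time.
_, r2_offense = fit_ols(design_matrix(adj_o), outcome)
_, r2_defense = fit_ols(design_matrix(adj_d), outcome)

# Multiple linear regression with both predictors.
_, r2_both = fit_ols(design_matrix(adj_o, adj_d), outcome)

# Quadratic polynomial regression in offense (centered for numerical stability).
centered_o = adj_o - adj_o.mean()
_, r2_poly = fit_ols(design_matrix(centered_o, centered_o ** 2), outcome)

print(f"offense only: R^2 = {r2_offense:.3f}")
print(f"defense only: R^2 = {r2_defense:.3f}")
print(f"both:         R^2 = {r2_both:.3f}")
print(f"polynomial:   R^2 = {r2_poly:.3f}")
```

Because the one-predictor models are nested inside both the two-predictor model and the polynomial model, their in-sample R² can never exceed that of the larger models. A higher in-sample R² therefore does not by itself show better bracket predictions; that has to be checked against actual tournament outcomes.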

