It. Is. Here.
I was hoping to get this out by yesterday, but better late than never I guess. I am sure some of you are still scrambling to fill out your brackets. If you are anything like me, you have spent so much time doing research that you now find yourself running out of time to fill out your actual brackets! So, without further ado, I’ll get right to unveiling the model and what it projects.
About the Model
I ran a stepwise logistic regression on over 1,000 NCAA tournament games dating back to 2008.[1] The stepwise procedure selected the most important variables (out of the 50+ that I began with) for predicting the outcome of an NCAA tournament game. These came out to be KenPom Adjusted Efficiency, Wins Above Bubble (WAB), Bart Torvik’s Talent metric, Bart Torvik’s Resume metric, and Seed. By applying the fitted model to the 2025 tourney teams and potential matchups, I was able to get win probabilities for every possible game. These probabilities were then converted to predicted point spreads via the formula below, where ‘prob’ represents the probability of the higher-seeded team winning.
The spread is then made negative if the higher seed is favored, to reflect how it appears in betting lines. I then used the variables from the logistic regression model to forecast a projected point total via linear regression, which was adjusted relative to the average NCAA tournament game.[2] Lastly, I adjusted the point spread based on the projected total, so that a higher-than-average scoring game yields a proportionally larger spread, and vice versa. I will go round by round to present what the model forecasts, ultimately giving my full bracket picks. You will see a column on the tables named “implied odds”; this is the win probability translated to American betting odds (rounded to resemble traditional listings).
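For anyone who wants to play with the last two steps, here is a rough Python sketch (not my actual R code). `american_odds` uses the standard conversion from win probability to American odds; rounding to the nearest 5 is only an assumption about what "traditional listings" look like. `scale_spread` is one simple reading of "proportionally": it multiplies the spread by the ratio of the projected total to the average total. The function names and the 140-point average in the usage lines are placeholders for illustration, not values from the model.

```python
def american_odds(prob: float, step: int = 5) -> int:
    """Convert a win probability to American odds.

    Favorites (prob >= 0.5) get negative odds, underdogs positive odds.
    The result is rounded to a multiple of `step` (assumed: 5).
    """
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must be strictly between 0 and 1")
    if prob >= 0.5:
        raw = -100.0 * prob / (1.0 - prob)  # amount risked to win 100
    else:
        raw = 100.0 * (1.0 - prob) / prob   # amount won on a 100 stake
    return int(round(raw / step) * step)


def scale_spread(spread: float, projected_total: float, average_total: float) -> float:
    """Scale a spread in proportion to the projected scoring level.

    Assumed interpretation: a game projected 10% above the average
    total gets a spread that is 10% larger in magnitude.
    """
    if average_total <= 0:
        raise ValueError("average_total must be positive")
    return spread * projected_total / average_total


# Usage with made-up numbers (140 is a placeholder average total).
favorite_line = american_odds(0.75)
underdog_line = american_odds(0.25)
adjusted = scale_spread(-6.0, projected_total=154.0, average_total=140.0)
```

A 75% favorite comes out at -300 and the matching 25% underdog at +300; real sportsbook lines would differ because they build in a margin.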
First Round
Second Round
Sweet Sixteen
Elite Eight
Final Four
BONUS: Sleepers to Watch
A few weeks back, I wrote about teams to potentially watch out for as “big upset” candidates based on similarity scores to previous successful underdogs. Now it is no longer hypothetical: we have actual underdogs to keep an eye on. Using the findings from that analysis, here are the teams most equipped to pull off a big (5+ seed-line) upset. The table below also shows how each team ranks among all 2025 teams across college basketball, not just tournament teams, in terms of “big upset” team resemblance.
Famously, the 2011 Butler team advanced all the way to the national championship game. Also of note, the 2017 Xavier squad advanced to the Elite Eight as an 11-seed, while Florida State did the same as a 9-seed in 2018.
I will update the model after each round with the new set of confirmed games, as I fully accept that I will not have gotten all of these games right (although a guy can dream).
If you like these predictions, I encourage you to also check out the forecasts at Neil’s Substack and at Silver Bulletin.
Bring on the madness!
[1] Accessed via the wonderful March Madness Data dataset on Kaggle.
[2] Yes, I know I could have gone through and selected variables more pertinent to point totals, but a guy’s only got so much time… and reusing the same variables was much easier to work into my R function.
Hey Sean!
Absolutely love the writeup. In your model, do team win percentages fluctuate based on performance as we go through the tournament?