Fun With Numbers: MAC Women's Tournament Edition

Wherein I estimate the likelihood of each team winning the tournament, consider how this has been changed by the new tournament format, and discover an oddity.

One of the simpler methods for calculating game outcome probabilities (and thereby estimating tournament outcomes) is log5, which comes from Bill James' sabermetrics. Even though the only inputs to log5 are the two teams' winning percentages, it tends to give reasonably accurate probabilities. The estimates can be improved by using Pythagorean win percentages, which use scoring margins over the season to create "expected" win percentages. I might go that route for the men's tournament, because plenty of sites have already calculated Pythagorean win percentages for men's basketball, but I think actual win percentages will give us a good approximation here. If you want to learn more about log5, here's the theoretical justification, and here's the history.
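For the curious, the log5 estimate is a one-line formula: if team A wins a fraction p_a of its games and team B wins a fraction p_b, the estimated chance that A beats B is (p_a - p_a*p_b) / (p_a + p_b - 2*p_a*p_b). Here is a minimal Python sketch of it. The function name `log5` and the sample winning percentages are illustrative assumptions, not numbers from this season.

```python
def log5(p_a, p_b):
    """Estimated probability that team A beats team B.

    p_a and p_b are season winning percentages expressed as fractions
    between 0 and 1 (for example, 0.750 for a 24-8 team).
    """
    denom = p_a + p_b - 2 * p_a * p_b
    if denom == 0:
        # Both teams are 0.000 or both are 1.000, so the formula has
        # nothing to go on. Treating it as a coin flip is an assumption.
        return 0.5
    return (p_a - p_a * p_b) / denom


# Hypothetical matchup: a .750 team against a .500 team.
print(log5(0.750, 0.500))  # 0.75
```

Against a .500 opponent the estimate is just the team's own winning percentage, and the two sides of any matchup always add up to 1.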

So what does log5 have to tell us about the MAC tournament? Find out, past the jump.

First, here is the calculated likelihood, using full-season winning percentages, that each team will win the tournament (rounded to a tenth of a percent):

  • Bowling Green: 44.8
  • EMU: 22.8
  • Toledo: 18.2
  • Miami: 14.0
  • Central Michigan: 1.2
  • Akron: 0.5
  • Northern Illinois: 0.5
  • Ohio: 0.4
  • Kent State: 0.0
  • Western Michigan: 0.1
  • Buffalo: 0.1
  • Ball State: 0.1

No big surprises there, right?

As a check, I also calculated championship likelihood using only MAC record. The advantage of this is that the schedule strengths within conference play are roughly the same -- much more similar than non-conference schedules, anyway -- but the tradeoff is that using fewer games probably introduces more uncertainty. The percentages here are similar, although there is a stronger relationship between seed and win probability -- to be expected, since seeding is obviously more closely correlated with conference winning percentage than with overall winning percentage. This also allows us to examine the benefit of the double-bye. All other things being equal -- and in this case, since both EMU and Toledo finished 13-3, they are -- the #2 seed this year is 18% more likely to win than the #3 seed.

Next I figured the numbers under the old tournament format and compared them. Obviously, the top two seeds, with the double-byes, benefit from the new format. An EMU championship is 16% more likely and a Bowling Green championship 10% more likely because of the extra bye. Unsurprisingly, seeds 5-12 are much worse off (about 40% to 60% worse), since they now need to win five games instead of four. What surprised me, until I thought about it some more, is that seeds 3 and 4 are also 12% to 14% worse off. The reason is that they are now guaranteed to face one of the top two teams in their second game, whereas under the old system there was a possibility that the #1 seed could be upset by the #8 or #9 seed, and/or the #2 seed by the #7 or #10 seed, in the quarterfinals.
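A comparison like this can be sketched in a few lines of Python by writing each bracket as nested pairs and working out, round by round, each team's chance of coming out of its part of the bracket. This is a rough sketch under stated assumptions, not the actual spreadsheet behind the numbers above. The bracket layouts are an assumed reading of the two formats (standard seeding, with seeds 1 and 2 entering in the semifinals under the new format), the winning percentages are made-up placeholders, and the names `title_odds`, `NEW_FORMAT`, `OLD_FORMAT` and `WIN_PCT` are hypothetical.

```python
def log5(p_a, p_b):
    """log5 estimate that a team with win fraction p_a beats one with p_b."""
    denom = p_a + p_b - 2 * p_a * p_b
    return 0.5 if denom == 0 else (p_a - p_a * p_b) / denom


def title_odds(bracket, win_pct):
    """Map each team in `bracket` to its chance of winning that bracket.

    A bracket is either a team name (str) or a pair (left, right) of
    smaller brackets whose winners play each other.
    """
    if isinstance(bracket, str):
        return {bracket: 1.0}
    left = title_odds(bracket[0], win_pct)
    right = title_odds(bracket[1], win_pct)
    odds = {}
    for a, reach_a in left.items():
        for b, reach_b in right.items():
            meet = reach_a * reach_b  # chance this exact matchup happens
            a_wins = log5(win_pct[a], win_pct[b])
            odds[a] = odds.get(a, 0.0) + meet * a_wins
            odds[b] = odds.get(b, 0.0) + meet * (1.0 - a_wins)
    return odds


# Assumed layouts, labeled by seed. New format: seeds 1-2 wait in the
# semifinals, seeds 3-4 in the quarterfinals, seeds 5-12 play two rounds first.
NEW_FORMAT = (
    ("S1", ("S4", (("S8", "S9"), ("S5", "S12")))),
    ("S2", ("S3", (("S7", "S10"), ("S6", "S11")))),
)
# Old format: seeds 1-4 all enter in the quarterfinals.
OLD_FORMAT = (
    (("S1", ("S8", "S9")), ("S4", ("S5", "S12"))),
    (("S2", ("S7", "S10")), ("S3", ("S6", "S11"))),
)

# Placeholder winning percentages by seed (NOT real MAC records).
PLACEHOLDER = [0.85, 0.75, 0.75, 0.70, 0.55, 0.50,
               0.50, 0.45, 0.40, 0.40, 0.35, 0.30]
WIN_PCT = {f"S{seed}": pct for seed, pct in enumerate(PLACEHOLDER, start=1)}

new = title_odds(NEW_FORMAT, WIN_PCT)
old = title_odds(OLD_FORMAT, WIN_PCT)
for seed in range(1, 13):
    team = f"S{seed}"
    change = new[team] / old[team] - 1.0  # relative change, e.g. +0.16
    print(f"{team:>3}  old {old[team]:6.1%}  new {new[team]:6.1%}  change {change:+.0%}")
```

Swapping real winning percentages into `WIN_PCT` would give the kind of old-versus-new percentages quoted above. Moving a team to a different slot in the nested pairs is an easy way to ask "what if they had been seeded differently?"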

As an EMU fan, I was also curious how big a hit EMU took by not getting the top seed, which, theoretically, has (slightly) easier opponents. In fact, it turns out that, this year at least, the #2 seed is the better place to be, by a meaningful amount: EMU has a better chance of winning the tournament from the #2 seed than it would have had as the #1 seed! This has to do with where, and by how much, team quality drops as you go down the seeds; it just happens that the teams on the #2 seed's side of the bracket are a little better than the teams on the #1 seed's side.