Chapter 106
Can You Solve The Puzzle Of The Baseball Division Champs?
In a sport where each team plays $162$ games a season, take a division of five teams of exactly equal ability: each has a $50\%$ chance of winning any given game. What is the expected number of wins for the team that finishes first?
The Riddler, FiveThirtyEight (Nick Keenan) (original post)
Solution
A single team of average ability wins half its games, $81$. The division champion is not an average team, though: it is the best of five, and being the best of a group pulls the number up. The expected first-place total is about $88.39$ wins.
The honest caveat first: a real schedule couples the teams, since one club’s win is another’s loss, so the five totals are not quite independent. But each team plays only part of its schedule inside the division and the rest outside it, and those out-of-division games swamp the coupling. The clean model that the column settles on treats every game as its own coin flip, making each team’s season an independent $\mathrm{Binomial}(162, \tfrac12)$ draw and the champion their maximum.
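To get a feel for how much the coupling matters, here is a minimal Monte Carlo sketch of a coupled division. The split is an assumption made for illustration, not a figure taken from the column: each team plays 19 games against each of its four rivals (76 in-division) and the remaining 86 outside. The function name and its parameters are hypothetical.

```python
import numpy as np

def coupled_champion_mean(seasons=200_000, teams=5, per_pair=19, total=162, seed=0):
    """Estimate the champion's expected wins when division rivals play each other.

    per_pair is an ASSUMED number of games between each pair of rivals.
    """
    rng = np.random.default_rng(seed)
    outside = total - per_pair * (teams - 1)  # games against non-division opponents
    # Out-of-division games are independent fair coin flips for every team.
    wins = rng.binomial(outside, 0.5, size=(seasons, teams))
    # Each pair of rivals splits per_pair games: whatever team i wins, team j loses.
    for i in range(teams):
        for j in range(i + 1, teams):
            i_wins = rng.binomial(per_pair, 0.5, size=seasons)
            wins[:, i] += i_wins
            wins[:, j] += per_pair - i_wins
    # The champion's total is the row-wise maximum; average it over seasons.
    return wins.max(axis=1).mean()

estimate = coupled_champion_mean()
print(round(estimate, 2))
```

Setting `per_pair=0` removes the coupling and recovers the independent coin-flip model, so the two settings can be compared directly.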
For the maximum of five independent counts, lean on the tail rather than the bell. Write $M$ for the champion’s win total. The champion wins at least $w$ games unless all five fall short of $w$, so with $F$ the single-team cumulative distribution, $$P(M \ge w) = 1 - F(w-1)^5.$$ Summing this against the exact binomial gives $$E[M] = \sum_{w=1}^{162} \bigl[1 - F(w-1)^5\bigr] \approx 88.39,$$ so the first-place team averages a little over $88$ wins, some seven games above the $81$ an average team manages. (A real big-league schedule nudges this only slightly.)
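The tail sum rests on a standard identity for nonnegative integer-valued random variables. As a short sketch, with $M$ the champion’s win total and $F$ the single-team cumulative distribution (notation assumed here for illustration):

```latex
\begin{aligned}
M &= \sum_{w=1}^{162} \mathbf{1}\{M \ge w\}
  && \text{(an integer in } 0,\dots,162 \text{ counts the thresholds it clears)} \\
E[M] &= \sum_{w=1}^{162} P(M \ge w)
  && \text{(linearity of expectation)} \\
 &= \sum_{w=1}^{162} \bigl[\,1 - F(w-1)^5\,\bigr]
  && \text{(all five independent teams at or below } w-1\text{)}
\end{aligned}
```

Re-indexing with $v = w - 1$ gives $\sum_{v=0}^{161} \bigl[1 - F(v)^5\bigr]$. Extending the sum to $v = 162$ changes nothing, because $F(162) = 1$ makes the last term zero.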
The computation
Compute the single-team distribution exactly, raise its cumulative function to the fifth power for the champion, and sum the tail. A Monte Carlo season cross-checks the closed form.
from math import comb
import numpy as np
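n, teams = 162, 5 # games per season, teams in the division
# Single-team CDF: cdf[w] = P(wins <= w) for a fair-coin Binomial(n, 1/2) season
cdf = [sum(comb(n, k) for k in range(w + 1)) / 2**n for w in range(n + 1)]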
exact = sum(1 - cdf[w] ** teams for w in range(n + 1)) # E[max] via the tail sum
print(round(exact, 4)) # 88.3943
# Monte Carlo cross-check: simulate independent seasons and average the division maximum
rng = np.random.default_rng(0)
sim = rng.binomial(n, 0.5, size=(2_000_000, teams)).max(axis=1).mean()
print(round(sim, 3)) # 88.391
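The same cumulative distribution also yields the champion’s full distribution, not just its mean. The following is a minimal sketch under the same independent coin-flip model, using $P(M = w) = F(w)^5 - F(w-1)^5$. The variable names are illustrative, and the spread and mode are computed here rather than quoted from the column.

```python
from math import comb

n, teams = 162, 5
total = 2**n

# Single-team CDF F(w) = P(wins <= w), built from a running sum of binomial coefficients.
cdf, running = [], 0
for w in range(n + 1):
    running += comb(n, w)
    cdf.append(running / total)

# P(M = w) = F(w)^teams - F(w-1)^teams, taking F(-1) = 0.
pmf = [cdf[w] ** teams - (cdf[w - 1] ** teams if w else 0.0) for w in range(n + 1)]

mean = sum(w * p for w, p in enumerate(pmf))
std = sum((w - mean) ** 2 * p for w, p in enumerate(pmf)) ** 0.5
mode = max(range(n + 1), key=pmf.__getitem__)  # most likely champion total
print(round(mean, 2), round(std, 2), mode)
```

The mean computed from this probability mass function should agree with the tail-sum value, which gives one more consistency check on the closed form.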