Chapter 244
Which Baseball Team Will Win The Riddler Fall Classic?
Riddler Express
If a baseball team is truly a team (a chance of winning each game, games independent), what is the probability that it has won exactly two of its last four games and exactly four of its last eight?
The Riddler, FiveThirtyEight, September 27, 2019(original post)
Solution
The last four games sit inside the last eight, so split the eight into the recent four and the four before them. “Two of the last four” means two wins in the recent four; “four of the last eight” then forces two wins in the earlier four as well. The two groups are independent, and each is a fair four-game stretch, so That is about . The trap is to treat “two of four” and “four of eight” as one combined count; seeing that the first condition fixes two of the eight wins and leaves exactly two more for the earlier four is what makes it a clean product.
The computation
Encode the experiment: flip eight fair games, and count the fraction of trials with exactly two wins in the last four and exactly four in all eight.
import random
rng = random.Random(0); T = 2_000_000; hits = 0
for _ in range(T):
g = [rng.random() < 0.5 for _ in range(8)] # g[4:] = last four
hits += (sum(g[4:]) == 2 and sum(g) == 4)
print(f"simulated {hits / T:.4f} vs 9/64 = {9 / 64:.4f}")
# simulated 0.1407 vs 9/64 = 0.1406
Riddler Classic
Three teams play a season against one another. Each Moonwalkers batter walks with probability (else strikes out); each Doubloons batter doubles with probability , driving in everyone on base (else strikes out); each Taters batter homers with probability (else strikes out). Games are nine innings (extra innings to break ties), and each team plays an equal number against each opponent. Which team most likely finishes with the best record?
The Riddler, FiveThirtyEight, September 27, 2019(original post)
Solution
The surprise is that the team scoring the fewest runs on average wins the most games. Counting average runs per nine innings, the Moonwalkers lead (their walks string together patiently), the Taters are in the middle, and the Doubloons trail: Wins are decided head-to-head, not by average runs, and run distribution matters more than the mean. The Taters score in rare, large bursts (a home run clears the bases), so their runs are highly variable: many shutout innings punctuated by big innings. That variance wins head-to-head games. Think of a team that scores one run every game versus a team that is shut out nine games in ten but erupts for twenty in the tenth: the second scores twice as many runs yet loses nine matchups out of ten. The Taters are the mild version of that erupting team.
This is exactly why Pythagorean win expectation (wins tracking runs scored and allowed) can mislead: it assumes a fixed run distribution, and the Taters break that assumption.
The computation
Encode each team’s half-inning as the actual at-bat process (random events until three outs, with correct base-running), play full nine-inning games (extra innings on ties), and simulate each head-to-head matchup. Report average runs and head-to-head win rates, then scale to a -game season.
A half-inning: draw at-bats until three outs, advancing runners by the team’s rule.
A game: nine innings each, extra innings until the tie is broken.
Simulate each pairing many times; convert win rates into season win totals.
import random
rng = random.Random(0)
def half_inning(team):
bases = [False, False, False]; outs = 0; runs = 0
while outs < 3:
r = rng.random()
if team == 'M': # walk 40% else out
if r < 0.40:
if all(bases): runs += 1 # bases loaded: forced run
elif bases[0] and bases[1]: bases[2] = True
elif bases[0]: bases[1] = True
bases[0] = True
else: outs += 1
elif team == 'D': # double 20%, clears bases
if r < 0.20: runs += sum(bases); bases = [False, True, False]
else: outs += 1
else: # Taters home run 10%
if r < 0.10: runs += sum(bases) + 1; bases = [False, False, False]
else: outs += 1
return runs
def game(a, b):
ra = sum(half_inning(a) for _ in range(9))
rb = sum(half_inning(b) for _ in range(9))
while ra == rb: ra += half_inning(a); rb += half_inning(b)
return ra > rb
def winprob(a, b, n): return sum(game(a, b) for _ in range(n)) / n
N = 300_000
TD, TM, MD = winprob('T','D',N), winprob('T','M',N), winprob('M','D',N)
print(f"T>D={TD:.3f} T>M={TM:.3f} M>D={MD:.3f}")
print(f"season wins Taters~{81*TD+81*TM:.0f} "
f"Moonwalkers~{81*(1-TM)+81*MD:.0f} Doubloons~{81*(1-TD)+81*(1-MD):.0f}")
# T>D=0.627 T>M=0.518 M>D=0.586
# season wins Taters~93 Moonwalkers~86 Doubloons~64
The Taters win the most despite scoring fewer runs than the Moonwalkers: their boom-or-bust scoring takes the head-to-head games.