Chapter 220
Can You Turn America’s Pastime Into A Game Of Yahtzee?
Riddler Express
In the late-19th-century dice game Our National Ball Game, two players take turns rolling two standard dice and reading off a baseball event from the following table.
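| Roll | Event |
|---|---|
| 1-1 | double |
| 1-2, 1-3, 1-4 | single |
| 1-5 | base on error |
| 1-6 | base on balls |
| 2-2, 2-3, 2-4, 2-5 | strike |
| 2-6 | foul out |
| 3-3, 3-4, 3-5, 3-6 | out at 1st |
| 4-4, 4-5, 4-6 | fly out |
| 5-5 | double play |
| 5-6 | triple |
| 6-6 | home run |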
(Each ordered pair of dice is equally likely; the outcomes use the standard 36-roll weighting, where the ordered pairs $(i, j)$ and $(j, i)$ with $i \neq j$ are merged, so a mixed pair is twice as likely as a double.) Innings end at three outs. Standard baserunning applies: runners on second score on a single, a runner on third scores on a fly out (sacrifice fly), forced runners advance on a walk or error, and so on. What is the average number of runs scored in a nine-inning game, and what does the distribution of runs look like?
The Riddler, FiveThirtyEight, March 22, 2019 (original post)
Solution
There is no closed form for the run distribution: the dynamics of the half-inning depend on the joint state of outs, strikes, and baserunners in a way that does not factor. The Solution gives the model and the headline numbers; the computation section runs the half-inning as a Markov chain (or, equivalently, as a Monte Carlo simulation of the dice game) and reproduces the distribution.
The half-inning state. A half-inning evolves over a state $(o, s, b)$, where $o \in \{0, 1, 2\}$ is the number of outs (the half-inning ends at $o = 3$), $s \in \{0, 1, 2\}$ is the strike count on the current batter (a strikeout happens at the third strike, so $s$ resets when the batter changes), and $b \in \{0, 1\}^3$ is the occupied-bases indicator for first, second, and third base. Each roll picks one of the events with the 36-roll weighting and updates the state.
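The state space described above is small enough to list outright. The following is a minimal sketch, assuming outs and strikes are stored as integers and base occupancy as a tuple of 0/1 flags; the names `STATES` and `START` are illustrative and do not come from the original.

```python
from itertools import product

# Transient states: outs in {0, 1, 2}, strikes in {0, 1, 2}, and a 0/1 flag
# for each of first, second, and third base. A third out ends the half-inning,
# so it is not listed as a state here.
STATES = [
    (outs, strikes, bases)
    for outs in range(3)
    for strikes in range(3)
    for bases in product((0, 1), repeat=3)
]

START = (0, 0, (0, 0, 0))  # no outs, no strikes, bases empty

print(len(STATES))  # 3 * 3 * 8 = 72
```

With only 72 transient states, the chain can be handled either exactly or by simulation.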
Event semantics. Strikes accumulate against the current batter; the third strike ends the at-bat with an out. Any non-strike event ends the at-bat, so the next batter starts with zero strikes. The other events update bases and runs as follows.
Single. Runner on third scores; runner on second scores; runner on first advances to second; batter to first.
Double. Runner on third scores; runner on second scores; runner on first advances to third; batter to second.
Triple. All runners score; batter to third.
Home run. All runners score plus the batter.
Base on balls. Batter to first; runners advance only when forced.
Base on error. Treated as a single: runners advance exactly as they would on a single (the defence has bobbled the play), and the batter is on first.
Foul out / out at first. One out; bases unchanged.
Fly out. One out; if it is not the third out of the inning, a runner on third scores (sacrifice fly).
Double play. Two outs if a runner was on first (force at second); otherwise one out.
The runner-advancement choices follow the puzzle’s stated assumptions. Reasonable variants (such as a runner from first taking the extra base on a single) move the headline number by about half a run, not by an order of magnitude.
Headline. Running the dice game gives a per-team mean over nine innings far above real-baseball levels (the two-team total is twice the per-team figure), with a long right tail. Real Major League Baseball games of that era were much lower scoring. The dice game scores so heavily because the base-reaching rolls (1-1; 1-2; 1-3; 1-4; 1-5; 1-6; 5-6; 6-6) collectively occur on 14 of every 36 rolls (about 39%) and chain together easily.
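The share of rolls that put the batter on base can be tallied directly from the dice-to-event table. The sketch below assumes the 36-roll weighting (a double counts once, a mixed pair twice); the names `TABLE`, `probs`, and `ON_BASE` are illustrative only.

```python
from fractions import Fraction

# Unordered dice pair (low, high) -> event, following the puzzle's table.
TABLE = {
    (1, 1): 'double',
    (1, 2): 'single', (1, 3): 'single', (1, 4): 'single',
    (1, 5): 'error', (1, 6): 'walk',
    (2, 2): 'strike', (2, 3): 'strike', (2, 4): 'strike', (2, 5): 'strike',
    (2, 6): 'foulout',
    (3, 3): 'out1', (3, 4): 'out1', (3, 5): 'out1', (3, 6): 'out1',
    (4, 4): 'flyout', (4, 5): 'flyout', (4, 6): 'flyout',
    (5, 5): 'dp', (5, 6): 'triple', (6, 6): 'hr',
}

# Accumulate each event's probability: 1/36 for a double, 2/36 otherwise.
probs = {}
for (low, high), event in TABLE.items():
    weight = 1 if low == high else 2
    probs[event] = probs.get(event, Fraction(0)) + Fraction(weight, 36)

# Events on which the batter reaches base.
ON_BASE = {'single', 'double', 'triple', 'hr', 'walk', 'error'}
on_base = sum(probs[event] for event in ON_BASE)

print(on_base)  # 7/18, i.e. 14 of 36 rolls
```

Using `Fraction` keeps the tally exact, so the event probabilities can be checked to sum to one.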
The computation
Encode the half-inning as the actual dice process, not as a formula. Build the 21-row event table with the 36-roll weights, simulate many games, and read off the mean and the distribution.
1. Build the dice-to-event table and weights.
2. Implement `half_inning` as a loop over rolls that updates the state and accumulates runs.
3. Play a large number of nine-inning games; report mean and histogram.
import random
from collections import Counter

# Unordered dice pair (low, high) -> event name.
EVENTS = {
    (1, 1): 'double',
    (1, 2): 'single', (1, 3): 'single', (1, 4): 'single',
    (1, 5): 'error', (1, 6): 'walk',
    (2, 2): 'strike', (2, 3): 'strike',
    (2, 4): 'strike', (2, 5): 'strike',
    (2, 6): 'foulout',
    (3, 3): 'out1', (3, 4): 'out1', (3, 5): 'out1', (3, 6): 'out1',
    (4, 4): 'flyout', (4, 5): 'flyout', (4, 6): 'flyout',
    (5, 5): 'dp', (5, 6): 'triple', (6, 6): 'hr',
}


def roll(rng):
    """Roll two dice; sorting the pair yields the 36-roll weighting."""
    a, b = rng.randint(1, 6), rng.randint(1, 6)
    if a > b:
        a, b = b, a
    return EVENTS[(a, b)]


def half_inning(rng):
    """Play one half-inning and return the runs scored."""
    outs, strikes, runs = 0, 0, 0
    B = [0, 0, 0]  # first, second, third
    while outs < 3:
        ev = roll(rng)
        if ev == 'strike':
            strikes += 1
            if strikes == 3:  # strikeout
                outs += 1
                strikes = 0
            continue
        strikes = 0  # batter changes
        if ev in ('single', 'error'):  # an error is treated as a single
            runs += B[1] + B[2]        # runners on second and third score
            B = [1, B[0], 0]           # first -> second, batter to first
        elif ev == 'double':
            runs += B[1] + B[2]
            B = [0, 1, B[0]]           # first -> third, batter to second
        elif ev == 'triple':
            runs += sum(B)
            B = [0, 0, 1]
        elif ev == 'hr':
            runs += sum(B) + 1
            B = [0, 0, 0]
        elif ev == 'walk':             # only forced runners advance
            if B[0] and B[1] and B[2]:
                runs += 1
            elif B[0] and B[1]:
                B[2] = 1
            elif B[0]:
                B[1] = 1
            B[0] = 1
        elif ev in ('foulout', 'out1'):
            outs += 1
        elif ev == 'flyout':
            outs += 1
            if outs < 3 and B[2]:      # sacrifice fly, not on the third out
                runs += 1
                B[2] = 0
        elif ev == 'dp':
            if B[0]:                   # force at second: batter and runner out
                outs += 2
                B[0] = 0
            else:
                outs += 1
    return runs


def play_game(rng):
    """Runs scored by one team over nine innings."""
    return sum(half_inning(rng) for _ in range(9))


rng = random.Random(42)
trials = 100_000
runs = [play_game(rng) for _ in range(trials)]
print(f"mean per-team runs per nine innings = {sum(runs)/trials:.3f}")
print(f"two-team game = {2 * sum(runs) / trials:.3f}")
h = Counter(runs)
for r in range(0, 26):
    print(f" {r:3d}: {100 * h[r] / trials:5.2f}%")
The script prints the per-team mean and the two-team total, then the run histogram: a unimodal distribution with a long right tail. The headline matches the model within Monte Carlo error; the modest gap to the official solution's total reflects the choice of baserunning conventions, not a different game.
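The Markov-chain route mentioned in the Solution can be sketched without any random numbers. The block below is an illustration under the same baserunning conventions as the simulation, not the column's own method: it solves for the expected runs from each state by value iteration. The names `WEIGHTS`, `step`, and `expected_runs_half_inning` are assumptions introduced here.

```python
from itertools import product

# Event weights out of 36 (a double counts once, a mixed pair twice).
WEIGHTS = {
    'double': 1, 'single': 6, 'error': 2, 'walk': 2, 'strike': 7,
    'foulout': 2, 'out1': 7, 'flyout': 5, 'dp': 1, 'triple': 2, 'hr': 1,
}


def step(outs, strikes, bases, ev):
    """Apply one event; return (outs, strikes, bases, runs_scored)."""
    f, s, t = bases  # first, second, third
    if ev == 'strike':
        if strikes == 2:
            return outs + 1, 0, bases, 0
        return outs, strikes + 1, bases, 0
    if ev in ('single', 'error'):  # error treated as a single
        return outs, 0, (1, f, 0), s + t
    if ev == 'double':
        return outs, 0, (0, 1, f), s + t
    if ev == 'triple':
        return outs, 0, (0, 0, 1), f + s + t
    if ev == 'hr':
        return outs, 0, (0, 0, 0), f + s + t + 1
    if ev == 'walk':  # only forced runners advance
        if f and s and t:
            return outs, 0, bases, 1
        if f and s:
            return outs, 0, (1, 1, 1), 0
        if f:
            return outs, 0, (1, 1, t), 0
        return outs, 0, (1, s, t), 0
    if ev in ('foulout', 'out1'):
        return outs + 1, 0, bases, 0
    if ev == 'flyout':  # sacrifice fly unless it is the third out
        if outs + 1 < 3 and t:
            return outs + 1, 0, (f, s, 0), 1
        return outs + 1, 0, bases, 0
    # Double play: two outs if a runner was on first, otherwise one.
    if f:
        return outs + 2, 0, (0, s, t), 0
    return outs + 1, 0, bases, 0


def expected_runs_half_inning(tol=1e-12):
    """Expected runs from the empty start state, by value iteration."""
    states = list(product(range(3), range(3), product((0, 1), repeat=3)))
    value = {state: 0.0 for state in states}
    while True:
        delta = 0.0
        for outs, strikes, bases in states:
            total = 0.0
            for ev, weight in WEIGHTS.items():
                o2, k2, b2, r = step(outs, strikes, bases, ev)
                future = value[(o2, k2, b2)] if o2 < 3 else 0.0
                total += weight / 36 * (r + future)
            delta = max(delta, abs(total - value[(outs, strikes, bases)]))
            value[(outs, strikes, bases)] = total
        if delta < tol:
            return value[(0, 0, (0, 0, 0))]


# By linearity of expectation, nine half-innings give nine times the mean.
print(9 * expected_runs_half_inning())
```

The exact mean gives a check on the Monte Carlo estimate, while the simulation remains the simpler way to obtain the full nine-inning histogram.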
Riddler Classic
The Classic asks you to invent your own dice-to-event table that matches modern Major League run distributions more closely than the 19th-century table, then to add fidelity to other statistics (strikeouts per game, batting average, and so on).
The Riddler, FiveThirtyEight, March 22, 2019 (original post)
Status
The Classic is a participatory design contest with no canonical right answer: each submitter proposes a custom dice-to-event mapping, and the column tabulates several. The winning entry (Tyler Burch’s “Burchball”) gives a distribution close to real-MLB run scoring, with its own roll assignments for triple, base on error, home run, strikeout, and so on. Because the Classic is a submission contest rather than a derivable problem, it is excluded from the worked-solution standard.
If a successor edition introduces a fixed target distribution (for example, exactly match the National League run-per-game histogram), the same Markov-chain simulator from the Express, evaluated under a candidate dice-to-event table, becomes the objective for a small mixed-integer search. The puzzle as posed does not pin the target, so the problem is open by design.
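If a target histogram were fixed, the search would need a score for how far a candidate table's simulated runs fall from it. The sketch below uses total variation distance purely as an assumed choice of objective; the helper names `histogram` and `total_variation` and the toy numbers are hypothetical, not taken from the column.

```python
from collections import Counter


def histogram(samples):
    """Empirical distribution {runs: probability} from a list of run totals."""
    n = len(samples)
    return {r: c / n for r, c in Counter(samples).items()}


def total_variation(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(r, 0.0) - q.get(r, 0.0)) for r in support)


# Toy inputs, made up only to show the call shape (not real data).
simulated = histogram([3, 4, 4, 5])
target = {3: 0.25, 4: 0.25, 5: 0.5}
print(total_variation(simulated, target))  # 0.25
```

A search over candidate tables would simulate each one, build its histogram, and keep the table with the smallest distance to the target.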