Vamshi Jandhyala


Chapter 220

Can You Turn America’s Pastime Into A Game Of Yahtzee?

Riddler Express

In the late-19th-century dice game Our National Ball Game, two players take turns rolling two standard dice and reading off a baseball event from the following table.

| Roll | Event | Roll | Event |
|---|---|---|---|
| (1,1) | double | (3,3)–(3,6) | out at 1st |
| (1,2)–(1,4) | single | (4,4)–(4,6) | fly out |
| (1,5) | base on error | (5,5) | double play |
| (1,6) | base on balls | (5,6) | triple |
| (2,2)–(2,5) | strike | (6,6) | home run |
| (2,6) | foul out | | |

(The $21$ unordered outcomes are not equally likely: they carry the standard $36$-roll weighting, in which the ordered pairs $(a,b)$ and $(b,a)$ with $a \ne b$ are merged, so each non-double is twice as likely as each double.) Innings end at three outs. Standard baserunning applies: a runner on second scores on a single, a runner on third scores on a fly out (sacrifice fly), forced runners advance on a walk or error, and so on. What is the average number of runs scored in a nine-inning game, and what does the distribution of runs look like?

The Riddler, FiveThirtyEight, March 22, 2019 (original post)

Solution

There is no closed form for the run distribution: the dynamics of the half-inning depend on the joint state $(\text{outs}, \text{strikes}, \text{bases})$ in a way that does not factor. The Solution section gives the model and the headline numbers; the section titled The computation runs the half-inning as a Markov chain on that state (in practice, a Monte Carlo simulation of the dice game) and reproduces the distribution.

The half-inning state. A half-inning evolves over a state $(o, s, B)$, where $o \in \{0, 1, 2, 3\}$ is the number of outs (the half-inning ends at $o = 3$), $s \in \{0, 1, 2\}$ is the strike count on the current batter (the third strike is a strikeout, and $s$ resets when the batter changes), and $B \in \{0, 1\}^{3}$ is the occupied-bases indicator for $(\text{first}, \text{second}, \text{third})$. Each roll picks one of the $21$ dice outcomes with the $36$-roll weighting, maps it to its event, and updates the state.
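The 36-roll weighting can be made concrete with a short enumeration. The following is a minimal sketch, not part of the original solution: the names `EVENT_BY_PAIR` and `event_probabilities` are introduced here for illustration, and the mapping is read off the puzzle's table. It walks over all 36 ordered rolls, merges $(a,b)$ with $(b,a)$, and reports the exact probability of each event.

```python
from fractions import Fraction
from itertools import product

# Event for each unordered pair (low die, high die), read off the puzzle's table.
EVENT_BY_PAIR = {
    (1, 1): "double",
    (1, 2): "single", (1, 3): "single", (1, 4): "single",
    (1, 5): "base on error", (1, 6): "base on balls",
    (2, 2): "strike", (2, 3): "strike", (2, 4): "strike", (2, 5): "strike",
    (2, 6): "foul out",
    (3, 3): "out at 1st", (3, 4): "out at 1st",
    (3, 5): "out at 1st", (3, 6): "out at 1st",
    (4, 4): "fly out", (4, 5): "fly out", (4, 6): "fly out",
    (5, 5): "double play", (5, 6): "triple", (6, 6): "home run",
}

def event_probabilities():
    """Exact probability of each event under the 36-roll weighting."""
    probs = {}
    for a, b in product(range(1, 7), repeat=2):      # 36 ordered rolls
        event = EVENT_BY_PAIR[(min(a, b), max(a, b))]  # merge (a,b) with (b,a)
        probs[event] = probs.get(event, Fraction(0)) + Fraction(1, 36)
    return probs

if __name__ == "__main__":
    ranked = sorted(event_probabilities().items(), key=lambda kv: -kv[1])
    for event, p in ranked:
        print(f"{event:14s} {str(p):>5s}  ({float(p):.3f})")
```

Using `Fraction` keeps the weights exact, so the probabilities can be checked by hand against the table (a double such as $(1,1)$ contributes $1/36$, a non-double such as $(1,2)$ contributes $2/36$).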

Event semantics. Strikes accumulate against the current batter; the third strike ends the at-bat with an out. Any non-strike event ends the at-bat, so the next batter starts at $s = 0$. The other events update bases and runs as follows.

  • Single. Runner on third scores; runner on second scores; runner on first advances to second; batter to first.

  • Double. Runner on third scores; runner on second scores; runner on first advances to third; batter to second.

  • Triple. All runners score; batter to third.

  • Home run. All runners score plus the batter.

  • Base on balls. Batter to first; runners advance only when forced.

  • Base on error. Treated as a single in this model: runners on second and third score, a runner on first advances to second, and the batter takes first (the defence has bobbled the play).

  • Foul out / out at first. One out; bases unchanged.

  • Fly out. One out; if that is not the third out of the inning, a runner on third scores (sacrifice fly).

  • Double play. Two outs if a runner was on first (force at second), with that runner removed from the bases; otherwise one out.
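Of the rules above, the forced advance on a base on balls is the easiest to get wrong, so here is a minimal sketch of it as a pure function. The function name `walk` and the tuple representation of the bases are assumptions made for this illustration; the rule itself is the one stated in the list.

```python
from itertools import product

def walk(bases):
    """Apply a base on balls.

    bases is a (first, second, third) tuple of 0/1 flags.
    Returns (new_bases, runs_scored). Runners move only when forced.
    """
    first, second, third = bases
    runs = 0
    if first:                 # runner on first is forced by the batter
        if second:            # runner on second is forced by the runner on first
            if third:         # bases loaded: runner on third is forced home
                runs = 1
            third = 1         # runner from second now stands on third
        second = 1            # runner from first now stands on second
    first = 1                 # batter takes first
    return (first, second, third), runs

if __name__ == "__main__":
    # Show the effect of a walk on each of the eight base configurations.
    for bases in product((0, 1), repeat=3):
        print(bases, "->", walk(bases))
```

Note that a runner on third with first or second empty does not move, which is what distinguishes a walk from a single.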

The runner-advancement choices follow the puzzle’s stated assumptions, with one exception: the base on error is scored like a single rather than as a forced advance. Reasonable variants (such as a runner from first taking the extra base on a single) move the headline number by about half a run, not by an order of magnitude.

Headline. Running the dice game gives a per-team mean of roughly $13.7$ runs over nine innings (so the two-team game scores roughly $27$ runs), with a long right tail. By comparison, real Major League Baseball games of that era averaged about $9$ runs per game in total. The dice game is much higher scoring because the base-reaching outcomes ($(1,1)$; $(1,2)$; $(1,3)$; $(1,4)$; $(1,5)$; $(1,6)$; $(5,6)$; $(6,6)$) carry a combined weight of $1 + 5 \cdot 2 + 2 + 1 = 14$, so they occur on $14/36 \approx 39\%$ of rolls and chain together easily.

The computation

Encode the half-inning as the actual dice process, not as a formula. Build the $21$-row event table with the $36$-roll weights, simulate many games, and read off the mean and the distribution.

  1. Build the dice-to-event table and weights.

  2. Implement half_inning as a loop over rolls that updates $(\text{outs}, \text{strikes}, \text{bases})$ and accumulates runs.

  3. Play a large number of nine-inning games; report mean and histogram.

import random
from collections import Counter

EVENTS = {
    (1,1): 'double',
    (1,2): 'single', (1,3): 'single', (1,4): 'single',
    (1,5): 'error',  (1,6): 'walk',
    (2,2): 'strike', (2,3): 'strike',
    (2,4): 'strike', (2,5): 'strike',
    (2,6): 'foulout',
    (3,3): 'out1', (3,4): 'out1', (3,5): 'out1', (3,6): 'out1',
    (4,4): 'flyout', (4,5): 'flyout', (4,6): 'flyout',
    (5,5): 'dp', (5,6): 'triple', (6,6): 'hr',
}

def roll(rng):
    a, b = rng.randint(1, 6), rng.randint(1, 6)
    if a > b: a, b = b, a
    return EVENTS[(a, b)]

def half_inning(rng):
    outs, strikes, runs = 0, 0, 0
    B = [0, 0, 0]                                  # first, second, third
    while outs < 3:
        ev = roll(rng)
        if ev == 'strike':
            strikes += 1
            if strikes == 3:
                outs += 1
                strikes = 0
            continue
        strikes = 0                                # batter changes
        if ev == 'single':
            if B[2]: runs += 1; B[2] = 0
            if B[1]: runs += 1; B[1] = 0
            if B[0]: B[1] = 1; B[0] = 0
            B[0] = 1
        elif ev == 'double':
            if B[2]: runs += 1; B[2] = 0
            if B[1]: runs += 1; B[1] = 0
            if B[0]: B[2] = 1; B[0] = 0
            B[1] = 1
        elif ev == 'triple':
            runs += sum(B); B = [0, 0, 0]; B[2] = 1
        elif ev == 'hr':
            runs += sum(B) + 1; B = [0, 0, 0]
        elif ev == 'walk':
            if B[0] and B[1] and B[2]: runs += 1
            elif B[0] and B[1]: B[2] = 1
            elif B[0]: B[1] = 1
            B[0] = 1
        elif ev == 'error':                        # treat as single
            if B[2]: runs += 1; B[2] = 0
            if B[1]: runs += 1; B[1] = 0
            if B[0]: B[1] = 1; B[0] = 0
            B[0] = 1
        elif ev in ('foulout', 'out1'):
            outs += 1
        elif ev == 'flyout':
            outs += 1
            if outs < 3 and B[2]: runs += 1; B[2] = 0
        elif ev == 'dp':
            if B[0]:
                outs += 2; B[0] = 0
            else:
                outs += 1
    return runs

def play_game(rng):
    return sum(half_inning(rng) for _ in range(9))

rng = random.Random(42)
trials = 100_000
runs = [play_game(rng) for _ in range(trials)]
print(f"mean per-team runs per nine innings = {sum(runs)/trials:.3f}")
print(f"two-team game = {2 * sum(runs) / trials:.3f}")
h = Counter(runs)
for r in range(0, 31):                             # cover the right tail out to 30 runs
    print(f"  {r:3d}: {100 * h[r] / trials:5.2f}%")

The script prints a per-team mean near $13.7$ runs (two-team near $27$), a unimodal distribution peaking around $11$–$13$ runs, and a right tail out to roughly $30$. The headline matches the model within Monte Carlo error; the modest gap to the official solution’s $\approx 30$ total runs reflects the choice of baserunning conventions, not a different game.
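The Markov-chain view of the half-inning also allows the mean to be computed without sampling. The following is a sketch under the same baserunning conventions as the script above, not the column's official method: `WEIGHTS`, `step`, and `expected_runs_per_half_inning` are names introduced here, and the solver is plain value iteration over the $3 \times 3 \times 8 = 72$ live states $(o, s, B)$. By linearity of expectation, nine times the half-inning value is the per-team mean, which should agree with the simulated mean up to Monte Carlo error.

```python
from itertools import product

# Weight of each event in 36ths, from the 36-roll weighting of the dice table.
WEIGHTS = {"double": 1, "single": 6, "error": 2, "walk": 2, "strike": 7,
           "foulout": 2, "out1": 7, "flyout": 5, "dp": 1, "triple": 2, "hr": 1}

def step(outs, strikes, bases, ev):
    """Apply one event; return (outs, strikes, bases, runs_scored)."""
    f, s, t = bases                                  # first, second, third
    if ev == "strike":
        if strikes == 2:                             # third strike: strikeout
            return outs + 1, 0, bases, 0
        return outs, strikes + 1, bases, 0
    if ev in ("single", "error"):                    # error treated as a single
        return outs, 0, (1, f, 0), s + t
    if ev == "double":
        return outs, 0, (0, 1, f), s + t
    if ev == "triple":
        return outs, 0, (0, 0, 1), f + s + t
    if ev == "hr":
        return outs, 0, (0, 0, 0), f + s + t + 1
    if ev == "walk":                                 # runners move only if forced
        forced_from_second = f and s
        new_bases = (1, int(f or s), int(t or forced_from_second))
        return outs, 0, new_bases, int(forced_from_second and t)
    if ev == "flyout":
        if outs + 1 < 3 and t:                       # sacrifice fly
            return outs + 1, 0, (f, s, 0), 1
        return outs + 1, 0, bases, 0
    if ev == "dp" and f:                             # force at second
        return outs + 2, 0, (0, s, t), 0
    return outs + 1, 0, bases, 0                     # foulout, out1, dp w/o runner on first

def expected_runs_per_half_inning(tol=1e-12):
    """Expected runs from (0 outs, 0 strikes, bases empty) by value iteration."""
    states = [(o, k, b) for o in range(3) for k in range(3)
              for b in product((0, 1), repeat=3)]
    value = {st: 0.0 for st in states}
    while True:
        delta = 0.0
        for st in states:
            total = 0.0
            for ev, w in WEIGHTS.items():
                o, k, b, r = step(*st, ev)
                future = value[(o, k, b)] if o < 3 else 0.0   # inning over at 3 outs
                total += w / 36 * (r + future)
            delta = max(delta, abs(total - value[st]))
            value[st] = total
        if delta < tol:
            return value[(0, 0, (0, 0, 0))]

if __name__ == "__main__":
    e = expected_runs_per_half_inning()
    print(f"expected runs per half-inning = {e:.4f}")
    print(f"expected per-team runs per nine innings = {9 * e:.4f}")
```

Value iteration is used here only because it needs no linear-algebra library; solving the 72 linear equations directly would give the same expectation. This sketch yields the mean only, so the full run distribution still comes from the simulation.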

Riddler Classic

The Classic asks you to invent your own dice-to-event table that matches modern Major League run distributions more closely than the $1880$s table, then to add fidelity to other statistics (strikeouts per game, batting average, and so on).

The Riddler, FiveThirtyEight, March 22, 2019 (original post)

Status

The Classic is a participatory design contest with no canonical right answer: each submitter proposes a custom $21$-row mapping and the column tabulates several. The winning entry (Tyler Burch’s “Burchball”) gives a distribution close to real-MLB run scoring, with $(1,1) \to$ triple, $(2,2) \to$ base on error, $(4,4) \to$ home run, $(6,6) \to$ strikeout, and so on. Because the Classic is a submission contest rather than a derivable problem, it is deferred from the worked-solution standard.

If a successor edition introduces a fixed target distribution (for example, exactly match the $2018$ National League run-per-game histogram), the same Markov-chain simulator from the Express, evaluated under a candidate $21$-row table, becomes the objective for a small mixed-integer search. The puzzle as posed does not pin the target, so the problem is open by design.
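If such a target were fixed, the objective could be as simple as a distance between two run histograms. The sketch below is one possible choice, not anything specified by the puzzle: total-variation distance is an assumption made here, the helper names `empirical_pmf` and `total_variation` are hypothetical, and the numbers in the usage example are toy values for illustration, not MLB data or simulator output.

```python
from collections import Counter

def empirical_pmf(runs):
    """Turn a list of per-game run totals into a {runs: probability} dict."""
    counts = Counter(runs)
    n = len(runs)
    return {r: c / n for r, c in counts.items()}

def total_variation(p, q):
    """Total-variation distance between two pmfs given as dicts (0 = identical)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(r, 0.0) - q.get(r, 0.0)) for r in support)

if __name__ == "__main__":
    # Toy values only: stand-ins for simulated games and a target histogram.
    simulated = empirical_pmf([3, 4, 4, 5, 7, 2, 4, 6])
    target = {2: 0.125, 3: 0.125, 4: 0.375, 5: 0.125, 6: 0.125, 7: 0.125}
    print(total_variation(simulated, target))
```

A search over candidate tables would call the simulator for each table, pass the resulting run totals through `empirical_pmf`, and minimise the distance to the target.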