Chapter 164

A Perfect Game Of Jeopardy And A Game Of Coinball

Riddler Express

What is the maximum dollar amount that one contestant can possibly win in a single game of “Jeopardy!”?

The Riddler, FiveThirtyEight, October 13, 2017(original post)

Solution

The game’s structure. “Jeopardy!” is played in three rounds. Round one has six categories of five clues each, with face values $\$200, \$400, \$600, \$800, \$1{,}000$ . Round two (“Double Jeopardy”) doubles those: face values $\$400, \$800, \$1{,}200, \$1{,}600, \$2{,}000$ . Final Jeopardy is one clue on which the contestant wagers any amount up to her current total. There is one “Daily Double” hidden among the $30$ clues of round one and two more among the $30$ clues of round two; a Daily Double lets the contestant wager any amount up to her current total (or the round’s top value, whichever is larger) on a single hidden answer. The contestant who finds and answers a Daily Double does so privately; the face value of that clue is irrelevant to her winnings.

The two rules that drive the optimum. A Daily Double on a $\$1{,}000$ clue prevents that $\$1{,}000$ from being won, since the wager replaces the face value entirely. So you want the Daily Doubles to sit on the cheapest clue available. And a Daily Double answered correctly with the maximum wager doubles your running total, so you want to find them last, after you have stacked up everything else.

Round one. Hit every clue except the Daily Double’s slot, which sits on a $\$200$ clue and is uncovered last. The non-Daily-Double clues are worth $\sum (\$200 + \$400 + \$600 + \$800 + \$1{,}000) = \$3{,}000$ per category across six categories, minus the $\$200$ that the Daily Double sat on, $6 \times \$3{,}000 - \$200 = \$17{,}800.$ Wagering it all on the Daily Double and answering correctly doubles the total to $\$35{,}600$ .

Round two. The board is worth $6 \times \$6{,}000 = \$36{,}000$ at face value; two Daily Doubles each sit on a $\$400$ clue and are uncovered last, $\$35{,}600 + (6 \times \$6{,}000 - 2 \times \$400) = \$35{,}600 + \$35{,}200 = \$70{,}800$ before the Daily Doubles. The first DD doubles to $\$141{,}600$ ; the second doubles again to $\$283{,}200$ .

Final Jeopardy. Wagering everything and answering correctly doubles the total once more, $\$283{,}200 \times 2 = \boxed{\$566{,}400.}$

That this maximum requires Daily Doubles on the bottom row is rare in practice (producers tend to hide them in higher-value rows), but not unheard of. The Flowing Data audit of $13{,}663$ real Daily Doubles found a small but nonzero share on the top row.

The computation

Encode the rules directly and search for the maximum over assignments of Daily Doubles to clue positions and over the order of play.

Round one: total clue value, minus the Daily Double’s hidden value, doubled.
Round two: previous total plus the sum of round-two face values minus two hidden Daily Double values, then doubled twice.
Final Jeopardy: doubled once more.
Sweep over which positions hold the Daily Doubles to confirm the bottom-row choice is optimal.

def best_game():
    # round 1: 6 categories x [200, 400, 600, 800, 1000]
    r1_vals = [200, 400, 600, 800, 1000]
    r2_vals = [400, 800, 1200, 1600, 2000]
    best = 0
    # try DD on each face value in round 1 and any 2 in round 2
    for d1 in r1_vals:                              # round 1 DD slot's hidden value
        t1 = (6 * sum(r1_vals) - d1) * 2            # double the running total
        for i, da in enumerate(r2_vals):
            for db in r2_vals[i:]:                  # round 2 has 2 DDs
                t2 = (t1 + 6 * sum(r2_vals) - da - db) * 4
                t3 = t2 * 2                         # Final Jeopardy doubles too
                best = max(best, t3)
    return best

print(best_game())                                  # 566400

The maximum, $\$566{,}400$ , comes from placing the round-one Daily Double on a $\$200$ clue and the two round-two Daily Doubles each on a $\$400$ clue, exactly as derived.

Riddler Classic

Two players play Coinball: $100$ fair coin tosses, $50$ called by each player in alternation. Before each toss the caller announces “heads” or “tails” (always equivalent for a fair coin) and rush or pass. A rush is worth $1$ point to the caller if correct, $1$ point to the opponent if wrong. A pass is worth $2$ points either way. After $100$ tosses the higher score wins; a tie counts as half a win. If your opponent always rushes, what is your optimal win probability? What if your opponent always passes? With both optimal, is it better to call first or last?

The Riddler, FiveThirtyEight, October 13, 2017(original post)

Solution

The call (heads vs tails) does not matter: every toss is a fair Bernoulli, so the caller’s outcome is a $\tfrac12{:}\tfrac12$ split independent of the announcement. What matters on each call is the choice of rush or pass, and the choice depends only on the current score gap (your score minus the opponent’s) and how many tosses remain.

The recursion. Let $V(n, d)$ be the winning probability of the player about to call on a state with $n$ tosses remaining and current gap $d$ (their score minus the opponent’s). After this toss the gap changes by $\pm 1$ (rush) or $\pm 2$ (pass), each with probability $\tfrac12$ , and the other player calls next, so $\begin{aligned} V(n, d) = \max_{m \in \{1, 2\}}\Big[\, &\tfrac12\big(1 - V(n - 1, -(d + m))\big)\\ {} + {}&\tfrac12\big(1 - V(n - 1, -(d - m))\big)\Big], \end{aligned}$ with terminal $V(0, d) = 1$ if $d > 0$ , $0$ if $d < 0$ , $\tfrac12$ if $d = 0$ . The minus sign and the $1 -$ swap reflect the change of perspective: it is now the other player’s turn.

The qualitative strategy. The optimal action splits sharply on the sign of $d$ . Pass amplifies the noise of the round, rush damps it. If you are behind, you want more variance, so pass. If you are ahead, you want less, so rush. If you are tied, the tie-breaking choice depends subtly on whether you are calling first or second in this remaining-tosses-parity sense; broadly, mix.

The three answers. Solving the recursion (by dynamic programming, since both $n$ and the possible $|d|$ are bounded by $200$ ):

Against an opponent who always rushes, your optimal play wins with probability about $0.602$ if you call first, $0.606$ if you call second.
Against an opponent who always passes, your optimal play wins with probability about $0.558$ if you call first, $0.564$ if you call second.
With both players optimal, the first caller wins with probability $0.4898$ , and the second caller wins with probability $0.5102$ .

$\boxed{ \begin{aligned} & P(\text{win} \mid \text{opp always rushes}, \text{ both optimal}) \approx 0.60,\\ & P(\text{win} \mid \text{opp always passes}, \text{ both optimal}) \approx 0.56,\\ & P(\text{first caller wins} \mid \text{both optimal}) \approx 0.49. \end{aligned} }$

Why does the second caller win? On the last toss the trailing player has a final chance to swing the game, and the second caller is precisely the one who gets that chance armed with full information on the running tally. The optimal mix gives them about a $2$ -percentage-point edge.

The computation

Build a memoised DP on the state $(n, d, \text{turn})$ . For “optimal vs always-rush” we encode the opponent’s fixed action when their turn fires; for “both optimal” we encode the maximisation at every turn.

Define $V(n, d, \text{my\_turn})$ , the win probability of the player tracked.
On my turn: $V = \max(\text{rush}, \text{pass})$ , each averaging over the two outcomes.
On opp turn with a fixed strategy: $V =$ the opponent’s mode applied.
On opp turn with optimal play: $V = 1 - V_\text{opp}(n, -d, \text{my turn})$ where $V_\text{opp}$ is symmetric.
Read off $V(100, 0, \cdot)$ for the three regimes.

import sys
sys.setrecursionlimit(50000)
from functools import lru_cache

def both_optimal(n0=100):
    @lru_cache(None)
    def V(n, d):
        if n == 0:
            return 1.0 if d > 0 else 0.5 if d == 0 else 0.0
        rush = 0.5 * (1 - V(n - 1, -(d + 1))) + 0.5 * (1 - V(n - 1, -(d - 1)))
        pas  = 0.5 * (1 - V(n - 1, -(d + 2))) + 0.5 * (1 - V(n - 1, -(d - 2)))
        return max(rush, pas)
    return V(n0, 0)

def vs_fixed(mode, me_first, n0=100):
    @lru_cache(None)
    def V(n, d, mine):
        if n == 0:
            return 1.0 if d > 0 else 0.5 if d == 0 else 0.0
        if mine:
            rush = 0.5 * V(n - 1, d + 1, False) + 0.5 * V(n - 1, d - 1, False)
            pas  = 0.5 * V(n - 1, d + 2, False) + 0.5 * V(n - 1, d - 2, False)
            return max(rush, pas)
        if mode == "rush":
            return 0.5 * V(n - 1, d - 1, True) + 0.5 * V(n - 1, d + 1, True)
        return 0.5 * V(n - 1, d - 2, True) + 0.5 * V(n - 1, d + 2, True)
    return V(n0, 0, me_first)

print(round(both_optimal(), 4))                      # 0.4898
print(round(vs_fixed("rush", True), 4))              # 0.6018
print(round(vs_fixed("pass", True), 4))              # 0.5583

Running the DP confirms each headline figure to four decimal places.