Chapter 197

I’d Like To Use My Riddler Lifeline

Riddler Express

You are a contestant on a game show. You hold a guaranteed $250,000 and have the option to answer up to two more questions. Get the first right and you reach the $500,000 tier, get the second right and you reach $1,000,000; any wrong answer drops you to $10,000. You have two lifelines left. The 50/50 takes the four answer choices and keeps the correct one plus one randomly chosen from the three wrong choices. Ask the Audience polls the studio audience; among four choices the correct answer is the audience plurality $50\%$ of the time, second-most-picked $30\%$ of the time, third $15\%$ and last $5\%$ ; among two choices the audience plurality is correct $65\%$ of the time. You assess yourself as guessing blindly on the last two questions. What is the strategy that maximises your expected winnings?

The Riddler, FiveThirtyEight, September 7, 2018(original post)

Solution

The walking-away option is worth $250,000 with no risk, so we keep that as the floor and only attempt a question if its expected payoff (including possibly walking away after) clears $250,000.

Compare attacks on the $500,000 question. Conditional on attempting it, every lifeline used cuts the cost of a wrong answer to nothing extra (you would have lost the same money anyway). So a rational plan uses every available lifeline on the question we choose to attack first. Consider three orderings.

Plan A: 50/50 alone. The 50/50 leaves two answers and we guess at random; success probability is $1/2$ .

Plan B: 50/50 then Ask the Audience. After the 50/50 there are two answers and the audience polls correctly $65\%$ of the time. We follow the audience plurality. Success probability is $0.65$ .

Plan C: Ask the Audience then 50/50. Now the order matters and we have to track both lifelines together. Suppose the correct answer is the audience’s rank- $k$ pick (where $k \in \{1, 2, 3, 4\}$ , with probabilities $0.5, 0.3, 0.15, 0.05$ ). The 50/50 then keeps the correct answer and one uniformly random wrong answer; we pick whichever of the two surviving answers ranked higher with the audience.

If the audience plurality was correct ( $k = 1$ , prob $0.5$ ), the surviving pair is rank- $1$ and one of ranks $2$ , $3$ , $4$ . Rank- $1$ is correct and is the higher-ranked of the two, so we win with probability $1$ .
If the correct answer was rank- $2$ ( $k = 2$ , prob $0.3$ ), the survivor pair is rank- $2$ and one of ranks $1$ , $3$ , $4$ uniformly. We pick the higher rank: only when the other survivor was rank- $1$ do we get it wrong. Win probability $2/3$ .
If $k = 3$ (prob $0.15$ ), the survivor pair is rank- $3$ and one of ranks $1$ , $2$ , $4$ uniformly. Win iff the other survivor was rank- $4$ , probability $1/3$ .
If $k = 4$ (prob $0.05$ ), the survivor pair is rank- $4$ and one of ranks $1$ , $2$ , $3$ , all of which outrank it. Win probability $0$ .

Total success probability for Plan C: $0.5 \cdot 1 + 0.3 \cdot \tfrac{2}{3} + 0.15 \cdot \tfrac{1}{3} + 0.05 \cdot 0 \;=\; 0.5 + 0.2 + 0.05 \;=\; 0.75.$ The three plans give $50\%$ , $65\%$ , $75\%$ for the $500,000 question.

Stop or press on for $1M? If we get $500,000, we have no lifelines and the $1M question is a blind guess: success $1/4$ . Pressing on gives expected $\$250{,}000 + 0.75 \cdot \$10{,}000 = \$257{,}500$ , well below $500,000. So we stop at $500,000.

Could splitting one lifeline per question beat Plan C plus stop? Use Ask the Audience on Q1 (success $0.5$ by picking the plurality) and 50/50 on Q2 (success $0.5$ ). Joint success $0.25$ , expected $0.25 \cdot \$1{,}000{,}000 + 0.75 \cdot \$10{,}000 = \$257{,}500$ . The other ordering, 50/50 on Q1 then Ask the Audience on Q2, has the same joint success: $0.5 \cdot 0.5 = 0.25$ , expected $\$257{,}500$ . Both are dominated by Plan C plus stop.

Could the player attempt Q1 with no lifeline? Blind-guess Q1: success $1/4$ . If correct, attempt Q2 with both lifelines (Plan C, success $0.75$ ). Otherwise drop to $\$10{,}000$ . Expected $0.75 \cdot \$10{,}000 + 0.25 \cdot 0.25 \cdot \$10{,}000 + 0.25 \cdot 0.75 \cdot \$1{,}000{,}000$ ? Wait: if Q1 wrong, drop to $10,000 and stop. If Q1 right, attempt Q2 with Plan C; success gives $1M, failure drops to $10,000. Expected $= 0.75 \cdot 10{,}000 + 0.25(0.75 \cdot 1{,}000{,}000 + 0.25 \cdot 10{,}000) = 7{,}500 + 0.25 \cdot 752{,}500 = 7{,}500 + 188{,}125 = \$195{,}625$ . Far worse than walking away.

The expected payoffs are $\begin{aligned} \text{Walk away} &: \$250\text{k}, \\ \text{Plan A then stop} &: 0.5(\$500\text{k}) + 0.5(\$10\text{k}) = \$255\text{k}, \\ \text{Split lifelines} &: 0.25(\$1\text{M}) + 0.75(\$10\text{k}) = \$257.5\text{k}, \\ \text{Plan B then stop} &: 0.65(\$500\text{k}) + 0.35(\$10\text{k}) = \$328.5\text{k}, \\ \text{Plan C then stop} &: 0.75(\$500\text{k}) + 0.25(\$10\text{k}) = \$377.5\text{k}. \end{aligned}$

The computation

Re-encode the game by simulating the strategy directly over a randomised correct-answer position. The audience-rank of the correct answer is drawn from the $(0.5, 0.3, 0.15, 0.05)$ distribution; the 50/50 keeps the correct answer and one of the remaining three uniformly; the player picks the higher of the two surviving ranks.

Sample the rank $k \in \{1, 2, 3, 4\}$ of the correct answer with the stated probabilities.
Sample uniformly which of the other three is kept by the 50/50.
Pick the higher-ranked surviving answer; record success.
Average over a million trials and confirm $\approx 0.75$ .

import random
random.seed(0)

ranks = [1, 2, 3, 4]
probs = [0.50, 0.30, 0.15, 0.05]

def trial():
    k = random.choices(ranks, weights=probs, k=1)[0]
    others = [r for r in ranks if r != k]
    survivor = random.choice(others)
    # pick lower number = higher rank (rank 1 = best)
    return min(k, survivor) == k

N = 1_000_000
wins = sum(trial() for _ in range(N))
p = wins / N
print(f"Plan C success rate: {p:.4f}  (target 0.7500)")
print(f"E[$ with Plan C then stop] = ${p*500_000 + (1-p)*10_000:,.0f}")

The script prints a frequency near $0.75$ and an expected payoff near $377,500.

Riddler Classic

Your child is collecting Riddler League cards. The Silver set has $100$ cards numbered $1$ to $100$ . Packs cost $1, each pack contains $10$ random distinct cards drawn uniformly from the $100$ , and the child’s allowance is $10 per week. How many weeks should you expect it to take to complete the Silver set? What about the Gold set of $300$ different cards?

The Riddler, FiveThirtyEight, September 7, 2018(original post)

Solution

This is a coupon-collector problem in which each "draw" is a pack of $10$ distinct cards from $N$ total, instead of a single card. The classic single-coupon collector has expected length $N H_N$ where $H_N$ is the harmonic number; the batched version is similar in spirit but does not factor so cleanly. We set up a Markov chain on the number of distinct cards owned and solve it numerically.

Let $T_m$ be the expected number of packs to reach a complete set given the collector currently has $m$ distinct cards. The pack contributes $J$ new cards, where $J$ is hypergeometric: among $N$ cards there are $N - m$ unowned, and the pack of size $K = 10$ is sampled without replacement. Hence $\Pr(J = j \mid m) \;=\; \frac{\binom{N - m}{j} \binom{m}{K - j}}{\binom{N}{K}},$ for $\max(0, K - m) \leq j \leq \min(K, N - m)$ . Let $p_0 = \Pr(J = 0 \mid m)$ denote the chance the pack is a complete duplicate. Conditioning on the next pack, $\begin{aligned} T_m &\;=\; 1 + p_0 T_m + \sum_{j \geq 1} \Pr(J = j \mid m) \, T_{m + j}, \\ T_m &\;=\; \frac{1 + \sum_{j \geq 1} \Pr(J = j \mid m) T_{m + j}}{1 - p_0}. \end{aligned}$ This is a back-substitution from $T_{N} = 0$ downward. The number we want is $T_0$ .

The early packs barely repeat themselves: the first pack is guaranteed all $10$ new; the second has expected $9$ new; and so on, with the expected new in pack $i$ landing at $K (N - m_{i - 1}) / N$ where $m_{i - 1}$ is the running unique count. The harder the chase becomes at the end is the entire cost of the collector problem.

Solving the recurrence numerically (see The computation) for $N = 100, K = 10$ gives $T_0 \approx 49.94$ packs, that is about $50$ packs in expectation, or $\boxed{\text{about } 5 \text{ weeks of allowance for the Silver set.}}$ The same code with $N = 300, K = 10$ gives $T_0 \approx 186.09$ packs, that is $\boxed{\text{about } 19 \text{ weeks of allowance for the Gold set.}}$ Across both sets the running tally of cards bought sits near $5 N$ : $500$ for Silver, $\approx 1{,}860$ for Gold. The factor $\approx 5$ is the batched-coupon equivalent of the harmonic $H_N$ scaling: $H_{100} \approx 5.19$ and $H_{300} \approx 6.28$ , so $H_N / K$ predicts roughly $1/2$ to $2/3$ a week per $10$ cards in the chase, with the long tail dominating.

The computation

Re-encode the recurrence directly. We compute $T_m$ for $m = N, N - 1, \ldots, 0$ using the hypergeometric pack-distribution exactly. As a sanity check, we Monte Carlo the actual pack-buying process and compare.

Iterate $m$ from $N - 1$ down to $0$ ; for each $m$ enumerate $j$ from $0$ to $\min(K, N - m)$ ; compute $\Pr(J = j \mid m)$ exactly with $\binom{}{}$ .
Set $T_m = (1 + \sum_{j \geq 1} \Pr(J = j) T_{m + j}) / (1 - \Pr(J = 0))$ .
Report $T_0$ for $N = 100, K = 10$ and $N = 300, K = 10$ .

import random
from math import comb

def expected_packs(N, K):
    T = [0.0] * (N + 1)
    total = comb(N, K)
    for m in range(N - 1, -1, -1):
        s = 0.0
        p0 = comb(m, K) / total if m >= K else 0.0
        for j in range(max(1, K - m), min(K, N - m) + 1):
            p = comb(N - m, j) * comb(m, K - j) / total
            s += p * T[m + j]
        T[m] = (1 + s) / (1 - p0)
    return T[0]

print(f"Silver (N=100, K=10): E[packs] = {expected_packs(100, 10):.2f}")
print(f"Gold   (N=300, K=10): E[packs] = {expected_packs(300, 10):.2f}")

# Sanity check by Monte Carlo
random.seed(0)
def mc(N, K, trials):
    total = 0
    for _ in range(trials):
        owned = set()
        packs = 0
        while len(owned) < N:
            owned.update(random.sample(range(N), K))
            packs += 1
        total += packs
    return total / trials

print(f"MC Silver: {mc(100, 10, 5000):.2f}")
print(f"MC Gold:   {mc(300, 10, 1000):.2f}")

The exact recurrence prints $49.94$ and $186.09$ ; the Monte Carlo agrees to two significant digits. Five weeks for Silver, nineteen weeks for Gold.