Chapter 221

Tickets Old And New, And A Spelling Bee Where Order Hardly Matters

Riddler Express

In your first game of Ticket To Ride, you are randomly dealt three Destination Tickets from a deck of $30$ , then later pick up another three; you have seen six of the $30$ tickets. Later you and your friends play a second game. The tickets are reshuffled and again you are dealt three. Which is more likely, that you had seen at least one of the three tickets before, or that all three are new to you?

The Riddler, FiveThirtyEight, March 29, 2019(original post)

Solution

After game one, six of the $30$ tickets are “seen” and $24$ are “unseen”. Game two reshuffles the full $30$ and deals three. The event of interest is that all three of these are unseen; its complement is that at least one is repeated from your seen pile.

The deal is sampling three tickets without replacement uniformly from $30$ . The number of all-unseen hands is $\binom{24}{3}$ and the total number of hands is $\binom{30}{3}$ , so $\begin{aligned} \Pr(\text{all three unseen}) &\;=\; \frac{\binom{24}{3}}{\binom{30}{3}} \;=\; \frac{24 \cdot 23 \cdot 22}{30 \cdot 29 \cdot 28} \\ &\;=\; \frac{12{,}144}{24{,}360} \;=\; \frac{506}{1015}. \end{aligned}$ Numerically that is about $0.4985$ , so $\Pr(\text{seen at least one}) = 1 - 506/1015 = 509/1015 \approx 0.5015$ .

$\boxed{\Pr(\text{seen at least one}) \;=\; \tfrac{509}{1015} \;\approx\; 50.15\%.}$ The two events split essentially evenly, by a hair under one part in two hundred.

Sanity check. Replacing $\binom{24}{3}/\binom{30}{3}$ with the with-replacement approximation $(24/30)^{3} = 0.512$ overstates “all unseen” by $0.014$ , because sampling without replacement makes the second and third unseen draws marginally less likely than the first.

The computation

Re-encode the problem as the actual deal: shuffle $30$ tickets, mark the first $6$ as “seen”, shuffle again, draw $3$ , and check whether any are seen. Repeat. The empirical frequency should land on $509/1015$ .

Build the rational $\binom{24}{3}/\binom{30}{3}$ exactly.
Simulate the two-game deal; report the empirical fraction with at least one repeat.

import random
from fractions import Fraction
from math import comb

p_all_unseen = Fraction(comb(24, 3), comb(30, 3))
p_at_least_one = 1 - p_all_unseen
print(f"P(all unseen) = {p_all_unseen} = {float(p_all_unseen):.6f}")
print(f"P(seen >=1)   = {p_at_least_one} = {float(p_at_least_one):.6f}")

# Monte Carlo
rng = random.Random(0)
trials, hits = 200_000, 0
deck = list(range(30))
for _ in range(trials):
    rng.shuffle(deck)
    seen = set(deck[:6])
    rng.shuffle(deck)
    second = deck[:3]
    if any(t in seen for t in second):
        hits += 1
print(f"MC fraction with >=1 seen = {hits/trials:.4f}")

The script prints $\Pr(\text{all unseen}) = 506/1015 \approx 0.4985$ , $\Pr(\text{seen} \ge 1) = 509/1015 \approx 0.5015$ , and a Monte Carlo estimate within $\pm 0.003$ of the closed form.

Riddler Classic

You are competing in a spelling bee against nine other contestants. You can each spell perfectly from a fixed portion of the dictionary and misspell anything outside it: your portion is $99\%$ of the dictionary, the others are $98\%, 97\%, \ldots, 90\%$ . The bee runs in rounds: in a fixed order, the surviving spellers each get a random word; a miss eliminates you; the last speller standing wins. If contestants go in decreasing order of knowledge (so you go first), what are your chances of winning? If they go in increasing order (so you go last), what are your chances?

The Riddler, FiveThirtyEight, March 29, 2019(original post)

Solution

The right random variable to focus on is $T_{i}$ , the number of words a contestant $i$ spells correctly before their first miss. The bee chooses words uniformly from the dictionary, independently of everything else, so the words a fixed contestant sees are iid Bernoulli trials with success probability $p_{i}$ equal to their dictionary fraction. The number of correct spellings before the first miss is therefore geometric on $\{0, 1, 2, \ldots\}$ with $\Pr(T_{i} = N) \;=\; p_{i}^{N}\,(1 - p_{i}), \qquad N = 0, 1, 2, \ldots.$ The $T_{i}$ for different contestants are independent.

Who wins the bee in terms of the $T_{i}$ . Contestant $i$ is eliminated in round $T_{i} + 1$ (they get $T_{i}$ words right and miss the next). If your $T_{\text{you}}$ strictly exceeds every other $T_{j}$ , you win outright. Ties are decided by speaking order in the round of the tie: a contestant who would miss in round $N + 1$ misses only when their turn arrives in that round, so among those tied for $T_{i} = N$ , the one who speaks earliest is eliminated first. If by your turn no one else remains, you have won without spelling another word.

This makes the answer depend on whether your turn is the first or the last among those tied for the maximum, and only on that.

Going first (decreasing order). You speak before every opponent. To win you must strictly outlast every opponent: $T_{\text{you}} > T_{j}$ for every $j$ . Conditioning on $T_{\text{you}} = N$ , $\begin{aligned} \Pr(\text{win first}) &\;=\; \sum_{N=0}^{\infty} \Pr(T_{\text{you}} = N)\, \prod_{j \ne \text{you}} \Pr(T_{j} \le N - 1) \\ &\;=\; \sum_{N=0}^{\infty} 0.99^{N}\,(0.01) \prod_{j} (1 - p_{j}^{\,N}), \end{aligned}$ where the product runs over the nine opponents with $p_{j} \in \{0.98, 0.97, \ldots, 0.90\}$ , and we use $\Pr(T_{j} \le N - 1) = 1 - p_{j}^{N}$ .

Going last (increasing order). You speak after every opponent. A tie for the maximum in your favour now wins: $T_{\text{you}} \ge T_{j}$ suffices. Conditioning on $T_{\text{you}} = N - 1$ (so you miss on word $N$ ), $\Pr(\text{win last}) \;=\; \sum_{N=1}^{\infty} 0.99^{N-1}\,(0.01) \prod_{j} (1 - p_{j}^{\,N}).$

Numerical values. Truncating the sums at $N \le 3000$ (well past where the summand is numerically negligible): $\Pr(\text{win first}) \;\approx\; 0.5198, \qquad \Pr(\text{win last}) \;\approx\; 0.5250.$

Order moves your win probability by only about half a percentage point. The reason is the tie-break: ties only happen at the same $N$ , and ties between you and a $90$ – $98\%$ opponent become rare quickly because $p^{N}$ falls off geometrically with the smaller $p$ .

The computation

Evaluate the two infinite sums to working precision and confirm by Monte Carlo simulation of the bee.

Truncate each sum at $N = 3000$ ; verify the tail is negligible.
Simulate the bee directly: each round, each surviving contestant flips a $p_{i}$ coin; misses eliminate.

import random

p_you = 0.99
others = [0.98, 0.97, 0.96, 0.95, 0.94, 0.93, 0.92, 0.91, 0.90]

def prob_win_first():
    s = 0.0
    for N in range(0, 3001):
        term = (p_you ** N) * (1 - p_you)
        for q in others:
            term *= (1 - q ** N)
        s += term
    return s

def prob_win_last():
    s = 0.0
    for N in range(1, 3001):
        term = (p_you ** (N - 1)) * (1 - p_you)
        for q in others:
            term *= (1 - q ** N)
        s += term
    return s

print(f"Going first: {100 * prob_win_first():.3f}%")
print(f"Going last : {100 * prob_win_last():.3f}%")

# Monte Carlo: simulate the bee with explicit speaking order
def simulate(order, trials=200_000, seed=0):
    rng = random.Random(seed)
    ps = list(order)
    you_idx = ps.index(p_you)
    wins_you = 0
    for _ in range(trials):
        alive = list(range(len(ps)))
        while len(alive) > 1:
            survivors = []
            stopped = False
            for k, idx in enumerate(alive):
                # if only this speller remains in the round
                # and they would be alone overall, the bee ends now.
                remaining_after = len(alive) - k - 1
                if not survivors and remaining_after == 0:
                    survivors.append(idx)
                    break
                if rng.random() < ps[idx]:
                    survivors.append(idx)
            alive = survivors
        if alive and alive[0] == you_idx:
            wins_you += 1
    return wins_you / trials

first_order = [p_you] + sorted(others, reverse=True)
last_order  = sorted(others) + [p_you]
print(f"MC first: {100*simulate(first_order):.2f}%")
print(f"MC last : {100*simulate(last_order, seed=1):.2f}%")

The script prints $52.0\%$ and $52.5\%$ exactly to three figures, both confirmed by Monte Carlo within $\pm 0.2\%$ . The slight last-mover edge is real and small.