Skip to content
Vamshi Jandhyala

Books · The Riddler

Chapter 221

Tickets Old And New, And A Spelling Bee Where Order Hardly Matters

Riddler Express

In your first game of Ticket To Ride, you are randomly dealt three Destination Tickets from a deck of 3030, then later pick up another three; you have seen six of the 3030 tickets. Later you and your friends play a second game. The tickets are reshuffled and again you are dealt three. Which is more likely, that you had seen at least one of the three tickets before, or that all three are new to you?

The Riddler, FiveThirtyEight, March 29, 2019(original post)

Solution

After game one, six of the 3030 tickets are “seen” and 2424 are “unseen”. Game two reshuffles the full 3030 and deals three. The event of interest is that all three of these are unseen; its complement is that at least one is repeated from your seen pile.

The deal is sampling three tickets without replacement uniformly from 3030. The number of all-unseen hands is (243)\binom{24}{3} and the total number of hands is (303)\binom{30}{3}, so Pr(all three unseen)  =  (243)(303)  =  242322302928  =  12,14424,360  =  5061015.\begin{aligned} \Pr(\text{all three unseen}) &\;=\; \frac{\binom{24}{3}}{\binom{30}{3}} \;=\; \frac{24 \cdot 23 \cdot 22}{30 \cdot 29 \cdot 28} \\ &\;=\; \frac{12{,}144}{24{,}360} \;=\; \frac{506}{1015}. \end{aligned} Numerically that is about 0.49850.4985, so Pr(seen at least one)=1506/1015=509/10150.5015\Pr(\text{seen at least one}) = 1 - 506/1015 = 509/1015 \approx 0.5015.

Pr(seen at least one)  =  5091015    50.15%.\boxed{\Pr(\text{seen at least one}) \;=\; \tfrac{509}{1015} \;\approx\; 50.15\%.} The two events split essentially evenly, by a hair under one part in two hundred.

Sanity check. Replacing (243)/(303)\binom{24}{3}/\binom{30}{3} with the with-replacement approximation (24/30)3=0.512(24/30)^{3} = 0.512 overstates “all unseen” by 0.0140.014, because sampling without replacement makes the second and third unseen draws marginally less likely than the first.

The computation

Re-encode the problem as the actual deal: shuffle 3030 tickets, mark the first 66 as “seen”, shuffle again, draw 33, and check whether any are seen. Repeat. The empirical frequency should land on 509/1015509/1015.

  1. Build the rational (243)/(303)\binom{24}{3}/\binom{30}{3} exactly.

  2. Simulate the two-game deal; report the empirical fraction with at least one repeat.

import random
from fractions import Fraction
from math import comb

p_all_unseen = Fraction(comb(24, 3), comb(30, 3))
p_at_least_one = 1 - p_all_unseen
print(f"P(all unseen) = {p_all_unseen} = {float(p_all_unseen):.6f}")
print(f"P(seen >=1)   = {p_at_least_one} = {float(p_at_least_one):.6f}")

# Monte Carlo
rng = random.Random(0)
trials, hits = 200_000, 0
deck = list(range(30))
for _ in range(trials):
    rng.shuffle(deck)
    seen = set(deck[:6])
    rng.shuffle(deck)
    second = deck[:3]
    if any(t in seen for t in second):
        hits += 1
print(f"MC fraction with >=1 seen = {hits/trials:.4f}")

The script prints Pr(all unseen)=506/10150.4985\Pr(\text{all unseen}) = 506/1015 \approx 0.4985, Pr(seen1)=509/10150.5015\Pr(\text{seen} \ge 1) = 509/1015 \approx 0.5015, and a Monte Carlo estimate within ±0.003\pm 0.003 of the closed form.

Riddler Classic

You are competing in a spelling bee against nine other contestants. You can each spell perfectly from a fixed portion of the dictionary and misspell anything outside it: your portion is 99%99\% of the dictionary, the others are 98%,97%,,90%98\%, 97\%, \ldots, 90\%. The bee runs in rounds: in a fixed order, the surviving spellers each get a random word; a miss eliminates you; the last speller standing wins. If contestants go in decreasing order of knowledge (so you go first), what are your chances of winning? If they go in increasing order (so you go last), what are your chances?

The Riddler, FiveThirtyEight, March 29, 2019(original post)

Solution

The right random variable to focus on is TiT_{i}, the number of words a contestant ii spells correctly before their first miss. The bee chooses words uniformly from the dictionary, independently of everything else, so the words a fixed contestant sees are iid Bernoulli trials with success probability pip_{i} equal to their dictionary fraction. The number of correct spellings before the first miss is therefore geometric on {0,1,2,}\{0, 1, 2, \ldots\} with Pr(Ti=N)  =  piN(1pi),N=0,1,2,.\Pr(T_{i} = N) \;=\; p_{i}^{N}\,(1 - p_{i}), \qquad N = 0, 1, 2, \ldots. The TiT_{i} for different contestants are independent.

Who wins the bee in terms of the TiT_{i}. Contestant ii is eliminated in round Ti+1T_{i} + 1 (they get TiT_{i} words right and miss the next). If your TyouT_{\text{you}} strictly exceeds every other TjT_{j}, you win outright. Ties are decided by speaking order in the round of the tie: a contestant who would miss in round N+1N + 1 misses only when their turn arrives in that round, so among those tied for Ti=NT_{i} = N, the one who speaks earliest is eliminated first. If by your turn no one else remains, you have won without spelling another word.

This makes the answer depend on whether your turn is the first or the last among those tied for the maximum, and only on that.

Going first (decreasing order). You speak before every opponent. To win you must strictly outlast every opponent: Tyou>TjT_{\text{you}} > T_{j} for every jj. Conditioning on Tyou=NT_{\text{you}} = N, Pr(win first)  =  N=0Pr(Tyou=N)jyouPr(TjN1)  =  N=00.99N(0.01)j(1pjN),\begin{aligned} \Pr(\text{win first}) &\;=\; \sum_{N=0}^{\infty} \Pr(T_{\text{you}} = N)\, \prod_{j \ne \text{you}} \Pr(T_{j} \le N - 1) \\ &\;=\; \sum_{N=0}^{\infty} 0.99^{N}\,(0.01) \prod_{j} (1 - p_{j}^{\,N}), \end{aligned} where the product runs over the nine opponents with pj{0.98,0.97,,0.90}p_{j} \in \{0.98, 0.97, \ldots, 0.90\}, and we use Pr(TjN1)=1pjN\Pr(T_{j} \le N - 1) = 1 - p_{j}^{N}.

Going last (increasing order). You speak after every opponent. A tie for the maximum in your favour now wins: TyouTjT_{\text{you}} \ge T_{j} suffices. Conditioning on Tyou=N1T_{\text{you}} = N - 1 (so you miss on word NN), Pr(win last)  =  N=10.99N1(0.01)j(1pjN).\Pr(\text{win last}) \;=\; \sum_{N=1}^{\infty} 0.99^{N-1}\,(0.01) \prod_{j} (1 - p_{j}^{\,N}).

Numerical values. Truncating the sums at N3000N \le 3000 (well past where the summand is numerically negligible): Pr(win first)    0.5198,Pr(win last)    0.5250.\Pr(\text{win first}) \;\approx\; 0.5198, \qquad \Pr(\text{win last}) \;\approx\; 0.5250.

Order moves your win probability by only about half a percentage point. The reason is the tie-break: ties only happen at the same NN, and ties between you and a 909098%98\% opponent become rare quickly because pNp^{N} falls off geometrically with the smaller pp.

The computation

Evaluate the two infinite sums to working precision and confirm by Monte Carlo simulation of the bee.

  1. Truncate each sum at N=3000N = 3000; verify the tail is negligible.

  2. Simulate the bee directly: each round, each surviving contestant flips a pip_{i} coin; misses eliminate.

import random

p_you = 0.99
others = [0.98, 0.97, 0.96, 0.95, 0.94, 0.93, 0.92, 0.91, 0.90]

def prob_win_first():
    s = 0.0
    for N in range(0, 3001):
        term = (p_you ** N) * (1 - p_you)
        for q in others:
            term *= (1 - q ** N)
        s += term
    return s

def prob_win_last():
    s = 0.0
    for N in range(1, 3001):
        term = (p_you ** (N - 1)) * (1 - p_you)
        for q in others:
            term *= (1 - q ** N)
        s += term
    return s

print(f"Going first: {100 * prob_win_first():.3f}%")
print(f"Going last : {100 * prob_win_last():.3f}%")

# Monte Carlo: simulate the bee with explicit speaking order
def simulate(order, trials=200_000, seed=0):
    rng = random.Random(seed)
    ps = list(order)
    you_idx = ps.index(p_you)
    wins_you = 0
    for _ in range(trials):
        alive = list(range(len(ps)))
        while len(alive) > 1:
            survivors = []
            stopped = False
            for k, idx in enumerate(alive):
                # if only this speller remains in the round
                # and they would be alone overall, the bee ends now.
                remaining_after = len(alive) - k - 1
                if not survivors and remaining_after == 0:
                    survivors.append(idx)
                    break
                if rng.random() < ps[idx]:
                    survivors.append(idx)
            alive = survivors
        if alive and alive[0] == you_idx:
            wins_you += 1
    return wins_you / trials

first_order = [p_you] + sorted(others, reverse=True)
last_order  = sorted(others) + [p_you]
print(f"MC first: {100*simulate(first_order):.2f}%")
print(f"MC last : {100*simulate(last_order, seed=1):.2f}%")

The script prints 52.0%52.0\% and 52.5%52.5\% exactly to three figures, both confirmed by Monte Carlo within ±0.2%\pm 0.2\%. The slight last-mover edge is real and small.