Chapter 218
You’re Home Alone: Cigarette Cards And Bananagrams
Riddler Express
In the video game “Red Dead Redemption 2” a side quest requires collecting distinct cigarette cards in each of sets, for unique cards. Cards drop one per pack of in-game cigarettes; each card is equally likely. At $5 per pack, what is the expected total spend to complete the collection by buying packs alone?
The Riddler, FiveThirtyEight, March 1, 2019(original post)
Solution
This is the coupon collector’s problem with equally likely coupons. The expected number of purchases to obtain the complete set is the harmonic series scaled by .
Why the harmonic series appears. Suppose at some point you already hold distinct cards. The next pack contains a new card with probability , so the number of packs you must buy to advance from to distinct cards is a geometric random variable with mean . Summing over gives the expected total: where is the -th harmonic number.
Plug in . The exact is a sum of unit fractions; numerically so packs.
Convert to dollars. At per pack,
The computation
Compute exactly (rational arithmetic), multiply by , and Monte-Carlo the coupon collector to confirm the expected pack count.
Build exactly as a .
Multiply by for expected packs; by for expected dollars.
Simulate the coupon collector many times; compare the empirical mean to the closed form.
import random
from fractions import Fraction
N = 144
H_N = sum(Fraction(1, k) for k in range(1, N + 1))
exp_packs = N * H_N
print(f"E[packs] = {float(exp_packs):.4f}")
print(f"E[spend at $5] = ${float(5 * exp_packs):.2f}")
# Monte Carlo
def simulate(trials=20000, seed=0):
rng = random.Random(seed)
total = 0
for _ in range(trials):
seen = set()
cnt = 0
while len(seen) < N:
seen.add(rng.randrange(N))
cnt += 1
total += cnt
return total / trials
print(f"MC mean packs = {simulate():.2f}")
The script prints , , and a Monte Carlo average within roughly of the closed form, matching the boxed answer.
Riddler Classic
In solitaire Bananagrams you have all lettered tiles and must form a connected crossword-style grid using every tile, with all rows and columns reading valid English words. What grid uses the fewest words? What grid uses the most? Extra credit: how many distinct completed grids are there?
The Riddler, FiveThirtyEight, March 1, 2019(original post)
Solution
This Classic asks an integer-programming question over the ENABLE word list, the standard tournament Scrabble dictionary used by “Words With Friends”. The puzzle has three load-bearing inputs: the -tile multiset, the dictionary, and the grid connectivity constraint. None of those is small.
Why this is a search problem, not an analysis problem. A solver has to model the grid as a set of horizontal and vertical word slots that share letters at crossings; each cell carries one letter; each slot’s letters must spell a dictionary word; the union of cell letters must equal the -tile multiset; and the union of slots must form a single connected component. The constraint that the multiset matches exactly (no leftover, no duplicate) makes the integer program large and idiosyncratic: long “vocabulary” words consume rare tiles efficiently and reduce the word count, while short “filler” words inflate it.
What the search yields. The Riddler-published result (March 8, 2019) gives a constructive lower bound of words for the minimum, found by Laurent Lessard via mixed-integer programming over the ENABLE list, using long technical words (, , , ), and a constructive upper bound of words for the maximum, found by Michael Branicky. The exact minimum and maximum, and the count of distinct completed grids, are not known to be settled in the puzzle’s published write-up: the is a best-known constructive minimum, not a proved one, and the count of all completed grids is genuinely open.
What we can verify analytically. The -tile multiset is fixed (the Bananagrams distribution; see the post’s link to the rule sheet); its total letter weight (Scrabble values) does not bound the word count usefully, because the count is governed by minimum crossword segmentation, not letter weights. We confirm the tile multiset is consistent with the published counts and that an arbitrary -word grid using the four published technical words plus filler shorts is a valid Bananagrams solution: those four words have lengths tiles, leaving tiles to fit into further short words that share crossing letters.
The Express alone is the head-numerical question; the Classic, as posed, is a dictionary-bound integer program with no closed-form result, and is included here for completeness.
The computation
Load the Bananagrams tile distribution; verify the multiset totals to .
Verify the published technical words (, , , ) are constructible from the multiset (each individually, and as a multiset union).
from collections import Counter
TILES = Counter({
'A': 13, 'B': 3, 'C': 3, 'D': 6, 'E': 18,
'F': 3, 'G': 4, 'H': 3, 'I': 12, 'J': 2,
'K': 2, 'L': 5, 'M': 3, 'N': 8, 'O': 11,
'P': 3, 'Q': 2, 'R': 9, 'S': 6, 'T': 9,
'U': 6, 'V': 3, 'W': 3, 'X': 2, 'Y': 3,
'Z': 2,
})
print('Total tiles:', sum(TILES.values()))
WORDS = [
'KERATOCONJUNCTIVITIDES',
'PARAFORMALDEHYDE',
'MAGNETOFLUIDDYNAMICS',
'OXYPHENBUTAZONES',
]
for w in WORDS:
c = Counter(w)
ok = all(c[ch] <= TILES[ch] for ch in c)
print(f'{w}: len={len(w)} fits-in-bag={ok}')
# union check
total = Counter()
for w in WORDS:
total += Counter(w)
print('\nUnion counts vs tiles:')
for ch in sorted(total):
print(f' {ch}: need {total[ch]}, have {TILES[ch]}',
'(too many)' if total[ch] > TILES[ch] else '')
print('Remaining tiles after 4 words:',
sum(TILES.values()) - sum(total.values()))
The script prints tiles total, each of the four technical words fitting individually, the union of the four words consuming tiles (well within the multiset), and tiles remaining to be placed in the further short crossing words. The minimum-of- claim is thus structurally consistent with the rule book.