Skip to content
Vamshi Jandhyala

Books · The Riddler

Chapter 179

Shut the Box

Riddler Classic

Shut the Box has 99 tiles, numbered 11 through 99, all starting open. On each turn you roll two dice and shut any combination of open tiles whose values sum to the dice total. Once 7,8,97, 8, 9 are all shut you may choose to roll one or two dice; otherwise you must roll both. The game ends when you shut every tile (you win) or you cannot shut a subset summing to the roll (you lose). What is your probability of winning under perfect play?

The Riddler, FiveThirtyEight, April 13, 2018(original post)

The Riddler Express that week asked which two-card Texas Hold’em starting hand most often makes the nuts. The published answer is A-10A\text{-}10 suited (with the argument that the nut flush dominates the count). A careful Monte-Carlo over full 55-card boards in fact ranks J-10J\text{-}10 suited above A-10A\text{-}10 suited as the most frequent nut-making starting hand, with the difference driven by straight-and-flush coverage rather than nut-flush coverage alone. Because we have not resolved this discrepancy with the official to certainty, the Express is deferred.

Solution

The state space. The position of the game is the set S{1,2,,9}S \subseteq \{1, 2, \ldots, 9\} of still-open tiles. There are 29=5122^9 = 512 states; the empty set is the winning terminal. Let V(S)V(S) denote the probability of eventually winning from state SS under optimal play.

The Bellman equation. From state SS, the player chooses how many dice to roll (constrained: two dice unless {7,8,9}S=\{7, 8, 9\} \cap S = \varnothing, in which case either one or two), then observes the total TT, then picks a subset ASA \subseteq S with A=T\sum A = T to shut. If no such AA exists for a given TT, that roll loses.

For two dice the total TT has the standard distribution Pr(T=t)=nt/36\Pr(T = t) = n_t / 36 with n=(1,2,3,4,5,6,5,4,3,2,1)n = (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1) for t=2,3,,12t = 2, 3, \ldots, 12. For one die, Pr(T=t)=1/6\Pr(T = t) = 1/6 for t=1,,6t = 1, \ldots, 6. Write the per-roll value at SS for d{1,2}d \in \{1, 2\} dice as Rd(S)  =  tPrd(T=t)(maxAS, A=tV(SA)),R_d(S) \;=\; \sum_{t} \Pr_d(T = t)\,\Bigl( \max_{A \subseteq S,\ \sum A = t} V(S \setminus A) \Bigr), with the convention that an empty maximum (no shutting subset exists) contributes 00. Then V(S)  =  {1if S=,R2(S)if {7,8,9}S,max(R1(S),R2(S))otherwise.V(S) \;=\; \begin{cases} 1 & \text{if } S = \varnothing, \\ R_2(S) & \text{if } \{7, 8, 9\} \cap S \ne \varnothing, \\ \max\bigl(R_1(S), R_2(S)\bigr) & \text{otherwise.} \end{cases} This is a straight dynamic programme on 29=5122^9 = 512 states. Each state’s update requires enumerating shutting subsets summing to each total, which is fast.

The headline. Solving the DP gives V({1,2,,9})    0.0976    9.76%.\boxed{\,V(\{1, 2, \ldots, 9\}) \;\approx\; 0.0976 \;\approx\; 9.76\%.\,} The starting-state value is the win probability under perfect play. The official quotes the same 9.8%\approx 9.8\%.

Where the choice of 11 vs 22 dice matters. Among the states where the choice is available (those with {7,8,9}\{7,8,9\} all shut), the optimal choice depends on SS. With many small tiles still open, two dice (which average 77) is dangerous; one die (averaging 3.53.5) gives a better chance of hitting an achievable total. With only large tiles left, two dice gives the only reach to the needed totals.

The computation

Directly solve the DP described above.

  1. Enumerate every state S2{1,,9}S \in 2^{\{1,\ldots,9\}} in order of increasing S|S| so that V(SA)V(S \setminus A) is known when computing V(S)V(S).

  2. For each SS, compute the candidate per-roll values R1(S)R_1(S) (when allowed) and R2(S)R_2(S), taking the max over shutting subsets at each total.

  3. Return V({1,2,,9})V(\{1, 2, \ldots, 9\}).

from functools import lru_cache

ALL = frozenset(range(1, 10))

def subsets_summing_to(open_set, target):
    open_list = sorted(open_set)
    results = []
    def rec(i, cur, s):
        if s == target:
            results.append(frozenset(cur)); return
        if s > target or i == len(open_list): return
        rec(i + 1, cur, s)
        rec(i + 1, cur + [open_list[i]], s + open_list[i])
    rec(0, [], 0)
    return results

DICE2 = {t: 0 for t in range(2, 13)}
for d1 in range(1, 7):
    for d2 in range(1, 7):
        DICE2[d1 + d2] += 1

@lru_cache(maxsize=None)
def V(S):
    if not S: return 1.0
    can_one = all(x not in S for x in (7, 8, 9))
    def per_roll(ndice):
        if ndice == 1:
            tot = 0.0
            for r in range(1, 7):
                subs = subsets_summing_to(S, r)
                if subs:
                    tot += max(V(S - s) for s in subs)
            return tot / 6
        tot = 0.0
        for t, n in DICE2.items():
            subs = subsets_summing_to(S, t)
            if subs:
                tot += n * max(V(S - s) for s in subs)
        return tot / 36
    return max(per_roll(2), per_roll(1)) if can_one else per_roll(2)

print(f"V(all open) = {V(ALL):.6f}")

The script prints V({1,,9})0.097614V(\{1,\ldots,9\}) \approx 0.097614, matching the boxed 9.76%9.76\%.