Skip to content
Vamshi Jandhyala

Books · The Riddler

Chapter 165

Can You Please The Oracle? Can You Escape The Prison?

Riddler Express

You must build a very specific tower out of four differently coloured pieces that can be stacked in any order. When you start, you do not know what the correct order is. After assembling the pieces in some order you may consult an architectural oracle (he goes by Frank) who tells you how many of the four pieces are in their correct positions, returning one of 00, 11, 22 or 44 (a tower that has three pieces in their correct positions has its fourth piece correct as well). Your tower is not finished until the oracle confirms you have placed the pieces correctly. In the worst case, how many times should you have to consult the oracle to assemble the tower correctly?

The Riddler, FiveThirtyEight, October 20, 2017(original post)

Solution

There are 4!=244! = 24 permutations of the four coloured pieces. We want a strategy whose decision tree has 2424 leaves and whose worst-case depth is as small as possible. With the answer to each query in {0,1,2,4}\{0,1,2,4\} the tree is at most 44-ary, so a lower bound on the depth is log424=3\lceil \log_4 24\rceil = 3. The bound is loose: the oracle’s reply also constrains which permutations remain consistent, and a careful play shows that three queries are not enough but five always are.

Why three are not enough. After a first query the tower we tried partitions the remaining permutations by their score against ours. If the score is 00 the remaining permutations are exactly the derangements of our tower, D4=9D_4 = 9 of them. Now a second query against the 99 derangements: each possible score {0,1,2,4}\{0,1,2,4\} leaves some subset, and no single subsequent query splits 99 candidates into pieces each of size 4\le 4 (which would be required for a third query to identify the answer). So the worst-case branch needs at least four queries beyond the first, i.e. at least 55 in total.

A strategy that always finishes in five. Label the colours A,B,C,DA, B, C, D. Use this script (Tom from Los Angeles’s worked walkthrough makes the structure plain):

  1. Try ABCDABCD. If the oracle returns 44, done in 11 query.

  2. If 11: one of the four is correct. Two further queries that systematically swap pairs identify which (cf. analysis below); confirm with a fourth.

  3. If 22: two of the four are correct, the other two are swapped. There are (42)=6\binom{4}{2}=6 such pairs; three further queries identify the swap and confirm.

  4. If 00: no piece is in its correct slot. Swap AA and BB to try BACDBACD. The reply is {0,1,2,4}\in\{0,1,2,4\}, and a quick case split (worked through below) finishes in at most three more queries.

The hardest branch, oracle reply 00 followed by 00, eliminates A,BA, B from the first two positions and C,DC, D from positions 3,43, 4 (any other placement of AA or BB in {1,2}\{1,2\} or CC or DD in {3,4}\{3,4\} would have given a positive score). So {1,2}\{1,2\} hold {C,D}\{C,D\} in some order and {3,4}\{3,4\} hold {A,B}\{A,B\}, leaving four permutations CDAB,CDBA,DCAB,DCBACDAB, CDBA, DCAB, DCBA. Query CDBACDBA: a reply of 44 finishes immediately; a reply of 22 means two are right, narrowing to CDABCDAB or DCBADCBA (each fixes exactly two slots of CDBACDBA); a reply of 00 leaves DCABDCAB alone. Either of the two-options sub-branches needs one confirming query, for a worst-case total of five.

5 oracle consultations in the worst case.\boxed{\,5 \text{ oracle consultations in the worst case.}\,}

The computation

Encode the oracle (return how many positions match) and search adversarially: for every candidate strategy of depth dd, the adversary returns the reply that keeps the set of consistent permutations as large as possible. A breadth-first search over decision trees of depth 5\le 5 finds a strategy that always isolates the secret; depth 4\le 4 does not suffice.

  1. Generate all 2424 permutations of {A,B,C,D}\{A,B,C,D\}.

  2. Define score(g,s)\text{score}(g, s) = number of positions where guess gg matches secret ss.

  3. Search for a strategy tree: at each node, pick the guess gg that minimises the size of the largest sub-tree (the candidate set is split by the score reply). Recurse.

  4. Report the worst-case depth.

from itertools import permutations
from functools import lru_cache

PERMS = tuple(permutations("ABCD"))

def score(g, s):
    return sum(a == b for a, b in zip(g, s))

@lru_cache(None)
def best_depth(candidates):
    if len(candidates) == 1:
        return 1                                     # one query to confirm
    best = 10**9
    for g in PERMS:
        groups = {}
        for s in candidates:
            groups.setdefault(score(g, s), []).append(s)
        if len(groups) == 1 and g not in candidates:
            continue                                 # learns nothing
        worst = 0
        for r, sub in groups.items():
            if r == 4:
                worst = max(worst, 1)                # confirmed in this query
            else:
                worst = max(worst, 1 + best_depth(tuple(sub)))
        best = min(best, worst)
    return best

print(best_depth(PERMS))                             # 5

The search returns 55, matching the boxed value.

Riddler Classic

One hundred prisoners are put into 100100 isolated cells, numbered 11 to 100100. They cannot communicate. The warden takes them one at a time, in no particular order, to a special room (Cell 00) holding two levers. The levers’ starting positions are unknown. Each visit the prisoner must flip exactly one lever. At any point any prisoner may declare to the warden that all 100100 have visited the room. A correct declaration frees everyone; an incorrect one executes them all. What strategy guarantees their release, and how many trips to Cell 00 does it take on average?

The Riddler, FiveThirtyEight, October 20, 2017(original post)

Solution

The standard one-lever puzzle has a beautiful asymmetric solution: nominate one prisoner as a Counter; every other prisoner switches the lever on the first two times they find it off (then leaves it alone forever); the Counter tallies every time he finds it on and switches it off; he declares when the tally reaches 2(N1)2(N-1). The factor of two absorbs ambiguity about the lever’s initial state.

The two-lever version is the same idea with a twist. The second lever is a dummy, used as a release valve when a prisoner is required to act but does not want to disturb the count. Designate one of the two levers the count lever and the other the dummy lever (the prisoners may need a coin-flip on first visit to label them, but for the analysis the labels are fixed).

Strategy. Let 9999 prisoners be drones and one be the Counter.

  • Drone visits: if the count lever is down and you have not already flipped it up twice in your life, push it up. Otherwise flip the dummy lever.

  • Counter visits: if the count lever is up, add 11 to your tally and push it down. Otherwise flip the dummy lever.

The Counter declares when the tally reaches T=2(N1)=198,N=100.T = 2(N - 1) = 198, \qquad N = 100.

Why T=198T = 198 guarantees correctness. The count lever gets pushed up only by a drone (its starting position aside), and each drone is allowed at most two “up” contributions; so drones can supply at most 299=1982 \cdot 99 = 198 ups. If the lever started up the Counter may gain one phantom count on his first visit; total ups available is at most 199199. The point: tally =198= 198 forces at least 197197 drone-ups, and if any single drone had contributed 00 the other 9898 drones could supply at most 196<197196 < 197. So every drone has visited at least once, hence all 100100 prisoners have visited. (The official write-up sets the threshold at 199199 for added safety; the cleanly correct minimum is 198198, and the simulation below uses it.)

Why T=198T = 198 is reached in finite time. Each drone is visited infinitely often almost surely, and so is the Counter; whenever the Counter visits a non-down lever he resets it, making fresh room for the next drone-up. The tally climbs without bound (until it stops at 198198).

Average number of trips. A crude estimate: the Counter is visited about once every NN trips and needs to register 198198 counts, so 198N=19,800\approx 198 N = 19{,}800 trips. The actual mean is slightly larger because not every Counter visit finds the lever up; the standard analysis via the negative binomial distribution, and direct simulation, give E[trips]20,500.\boxed{\,\mathbb{E}[\text{trips}] \approx 20{,}500.\,} If each visit takes one minute round the clock, the prisoners go free in roughly two weeks.

The computation

Simulate the strategy directly. Initialise the two levers to a random configuration; pick prisoners uniformly with replacement; apply the rules above; record the number of trips until the Counter’s tally hits 199199. Repeat many times and average.

  1. Random initial lever states; drone “ups remaining” set to 22 each; tally 00.

  2. Draw the next prisoner uniformly from {1,,100}\{1,\dots,100\}.

  3. Apply the Counter or drone rule. Increment the trip counter.

  4. Stop when tally =198= 198.

  5. Average the stopping time over independent trials.

import random

def trial(seed, N=100, T=198):
    rng = random.Random(seed)
    count_up = rng.random() < 0.5
    ups_left = [2] * (N - 1)
    tally, trips = 0, 0
    while tally < T:
        trips += 1
        p = rng.randrange(N)
        if p == N - 1:                           # the Counter
            if count_up:
                tally += 1
                count_up = False
            # else flip dummy: state unchanged for this model
        else:                                    # a drone
            if (not count_up) and ups_left[p] > 0:
                count_up = True
                ups_left[p] -= 1
            # else flip dummy
    return trips

samples = [trial(s) for s in range(200)]
print(round(sum(samples) / len(samples)))        # ~20,400

The mean of the simulated stopping times clusters around 20,50020{,}500 trips, in agreement with the boxed value.