Chapter 165
Can You Please The Oracle? Can You Escape The Prison?
Riddler Express
You must build a very specific tower out of four differently coloured pieces that can be stacked in any order. When you start, you do not know what the correct order is. After assembling the pieces in some order you may consult an architectural oracle (he goes by Frank) who tells you how many of the four pieces are in their correct positions, returning one of , , or (a tower that has three pieces in their correct positions has its fourth piece correct as well). Your tower is not finished until the oracle confirms you have placed the pieces correctly. In the worst case, how many times should you have to consult the oracle to assemble the tower correctly?
The Riddler, FiveThirtyEight, October 20, 2017(original post)
Solution
There are permutations of the four coloured pieces. We want a strategy whose decision tree has leaves and whose worst-case depth is as small as possible. With the answer to each query in the tree is at most -ary, so a lower bound on the depth is . The bound is loose: the oracle’s reply also constrains which permutations remain consistent, and a careful play shows that three queries are not enough but five always are.
Why three are not enough. After a first query the tower we tried partitions the remaining permutations by their score against ours. If the score is the remaining permutations are exactly the derangements of our tower, of them. Now a second query against the derangements: each possible score leaves some subset, and no single subsequent query splits candidates into pieces each of size (which would be required for a third query to identify the answer). So the worst-case branch needs at least four queries beyond the first, i.e. at least in total.
A strategy that always finishes in five. Label the colours . Use this script (Tom from Los Angeles’s worked walkthrough makes the structure plain):
Try . If the oracle returns , done in query.
If : one of the four is correct. Two further queries that systematically swap pairs identify which (cf. analysis below); confirm with a fourth.
If : two of the four are correct, the other two are swapped. There are such pairs; three further queries identify the swap and confirm.
If : no piece is in its correct slot. Swap and to try . The reply is , and a quick case split (worked through below) finishes in at most three more queries.
The hardest branch, oracle reply followed by , eliminates from the first two positions and from positions (any other placement of or in or or in would have given a positive score). So hold in some order and hold , leaving four permutations . Query : a reply of finishes immediately; a reply of means two are right, narrowing to or (each fixes exactly two slots of ); a reply of leaves alone. Either of the two-options sub-branches needs one confirming query, for a worst-case total of five.
The computation
Encode the oracle (return how many positions match) and search adversarially: for every candidate strategy of depth , the adversary returns the reply that keeps the set of consistent permutations as large as possible. A breadth-first search over decision trees of depth finds a strategy that always isolates the secret; depth does not suffice.
Generate all permutations of .
Define = number of positions where guess matches secret .
Search for a strategy tree: at each node, pick the guess that minimises the size of the largest sub-tree (the candidate set is split by the score reply). Recurse.
Report the worst-case depth.
from itertools import permutations
from functools import lru_cache
PERMS = tuple(permutations("ABCD"))
def score(g, s):
return sum(a == b for a, b in zip(g, s))
@lru_cache(None)
def best_depth(candidates):
if len(candidates) == 1:
return 1 # one query to confirm
best = 10**9
for g in PERMS:
groups = {}
for s in candidates:
groups.setdefault(score(g, s), []).append(s)
if len(groups) == 1 and g not in candidates:
continue # learns nothing
worst = 0
for r, sub in groups.items():
if r == 4:
worst = max(worst, 1) # confirmed in this query
else:
worst = max(worst, 1 + best_depth(tuple(sub)))
best = min(best, worst)
return best
print(best_depth(PERMS)) # 5
The search returns , matching the boxed value.
Riddler Classic
One hundred prisoners are put into isolated cells, numbered to . They cannot communicate. The warden takes them one at a time, in no particular order, to a special room (Cell ) holding two levers. The levers’ starting positions are unknown. Each visit the prisoner must flip exactly one lever. At any point any prisoner may declare to the warden that all have visited the room. A correct declaration frees everyone; an incorrect one executes them all. What strategy guarantees their release, and how many trips to Cell does it take on average?
The Riddler, FiveThirtyEight, October 20, 2017(original post)
Solution
The standard one-lever puzzle has a beautiful asymmetric solution: nominate one prisoner as a Counter; every other prisoner switches the lever on the first two times they find it off (then leaves it alone forever); the Counter tallies every time he finds it on and switches it off; he declares when the tally reaches . The factor of two absorbs ambiguity about the lever’s initial state.
The two-lever version is the same idea with a twist. The second lever is a dummy, used as a release valve when a prisoner is required to act but does not want to disturb the count. Designate one of the two levers the count lever and the other the dummy lever (the prisoners may need a coin-flip on first visit to label them, but for the analysis the labels are fixed).
Strategy. Let prisoners be drones and one be the Counter.
Drone visits: if the count lever is down and you have not already flipped it up twice in your life, push it up. Otherwise flip the dummy lever.
Counter visits: if the count lever is up, add to your tally and push it down. Otherwise flip the dummy lever.
The Counter declares when the tally reaches
Why guarantees correctness. The count lever gets pushed up only by a drone (its starting position aside), and each drone is allowed at most two “up” contributions; so drones can supply at most ups. If the lever started up the Counter may gain one phantom count on his first visit; total ups available is at most . The point: tally forces at least drone-ups, and if any single drone had contributed the other drones could supply at most . So every drone has visited at least once, hence all prisoners have visited. (The official write-up sets the threshold at for added safety; the cleanly correct minimum is , and the simulation below uses it.)
Why is reached in finite time. Each drone is visited infinitely often almost surely, and so is the Counter; whenever the Counter visits a non-down lever he resets it, making fresh room for the next drone-up. The tally climbs without bound (until it stops at ).
Average number of trips. A crude estimate: the Counter is visited about once every trips and needs to register counts, so trips. The actual mean is slightly larger because not every Counter visit finds the lever up; the standard analysis via the negative binomial distribution, and direct simulation, give If each visit takes one minute round the clock, the prisoners go free in roughly two weeks.
The computation
Simulate the strategy directly. Initialise the two levers to a random configuration; pick prisoners uniformly with replacement; apply the rules above; record the number of trips until the Counter’s tally hits . Repeat many times and average.
Random initial lever states; drone “ups remaining” set to each; tally .
Draw the next prisoner uniformly from .
Apply the Counter or drone rule. Increment the trip counter.
Stop when tally .
Average the stopping time over independent trials.
import random
def trial(seed, N=100, T=198):
rng = random.Random(seed)
count_up = rng.random() < 0.5
ups_left = [2] * (N - 1)
tally, trips = 0, 0
while tally < T:
trips += 1
p = rng.randrange(N)
if p == N - 1: # the Counter
if count_up:
tally += 1
count_up = False
# else flip dummy: state unchanged for this model
else: # a drone
if (not count_up) and ups_left[p] > 0:
count_up = True
ups_left[p] -= 1
# else flip dummy
return trips
samples = [trial(s) for s in range(200)]
print(round(sum(samples) / len(samples))) # ~20,400
The mean of the simulated stopping times clusters around trips, in agreement with the boxed value.