Chapter 201
How Much Will It Cost To Sniff Out The Spies?
Riddler Express
What is the next number in this series?
The Riddler, FiveThirtyEight, October 5, 2018(original post)
Solution
The first term is suggestive but the second term rules out an arithmetic sequence on the integers. The differences are The and the trailing are the giveaways: integer terms displayed in a non-decimal base. Reading the terms as hexadecimal (base ) numerals, they correspond to the decimal values The pattern is now obvious: for . The next term is , which in hexadecimal is : The hexadecimal letters stand for the decimal values through , so .
The computation
Generate the sequence directly from and format each term in hexadecimal; confirm the first ten terms match and read off the eleventh.
for n in range(1, 12):
v = (n + 2) ** 2
print(f"n={n:2d}: decimal {v:3d} -> hex {format(v, 'X')}")
The script prints the sequence .
Riddler Classic
The Riddler Intelligence Agency has agents, of whom are spies. You can send any subset of agents on a retreat. If every spy is present at a retreat, the spies hold a meeting; if any spy is missing, the meeting does not happen. You learn only whether the meeting happened. You know and . What is the minimum number of retreats needed to guarantee you identify all spies, in the worst case? Take , .
The Riddler, FiveThirtyEight, October 5, 2018(original post)
Solution
The question has a clean information-theoretic lower bound and a (substantially harder) constructive matching upper bound. We give the lower bound in full and report the constructive result.
Lower bound (information theory). Before any retreat, the set of possible spy configurations is the family of -subsets of , of size . Each retreat returns one bit (meeting or no meeting). After retreats the agent has seen at most distinct response patterns, so the strategy can distinguish at most spy configurations from one another. To force a unique identification in the worst case, For and , The minimum is at least retreats.
Constructive upper bound. A specific strategy of Tim Black achieves the bound with retreats (within of optimal), and an alternative due to Thomas Swayze costs about $93 million at $1,000 per agent per retreat. A more retreat-heavy strategy by Laurent Lessard uses retreats but only $66 million in total cost. The headline answer the puzzle asks for is the lower bound:
Why is tight in principle. The information bound is reachable when each retreat partitions the surviving possibilities into two halves of equal size. A retreat sends some subset . The response is "yes" iff the entire -subset of spies sits inside . The fraction of surviving spy-configurations with all spies in depends on the current knowledge state and on . A balanced split is not always achievable for every state with the simple "send everyone in " primitive, which is why the constructive upper bound exceeds .
Binary-search outline. A clean (if not optimal) strategy is to identify spies one at a time by binary search.
Maintain a working set of agents known to contain at least one not-yet-identified spy.
Pick a subset to keep at home; send everyone except .
If the meeting happens, every spy is outside ; remove from .
If the meeting fails, at least one spy is in ; replace with .
Iterate to corner an individual spy, then mark them and search for the next.
Choosing near gives roughly retreats per spy, and retreats is the rough cost, comfortably above the information bound.
The computation
Compute exactly and confirm . Then simulate a one-spy-at-a-time binary-search strategy on a small randomised instance and confirm it never exceeds the analytical bound.
Compute exactly and take .
For each small , run a binary search to identify the spies and record retreats used.
Confirm that the simulated cost lies between the information bound and the rough estimate.
import math, random
N_big, K_big = 1024, 17
nck = math.comb(N_big, K_big)
print(f"C(1024,17) = {nck}")
print(f"log2(C) = {math.log2(nck):.4f}")
print(f"min retreats (info bound) = {math.ceil(math.log2(nck))}")
def simulate(N, K, seed=0):
rng = random.Random(seed)
spies = set(rng.sample(range(N), K))
found = set()
suspects = set(range(N))
retreats = 0
while len(found) < K:
W = set(suspects)
while len(W) > 1:
T = set(list(W)[: len(W) // 2])
sent = (set(range(N)) - T) | found
retreats += 1
meeting = spies.issubset(sent)
if meeting:
W = W - T
else:
W = T
s = W.pop()
found.add(s)
suspects.discard(s)
assert found == spies
return retreats
random.seed(0)
for N, K in [(32, 3), (64, 4), (128, 5), (256, 6)]:
info = math.ceil(math.log2(math.comb(N, K)))
sim = simulate(N, K)
print(f"N={N}, K={K}: info bound {info}, binary search {sim}")
The script prints the information bound for the headline case, and for small instances the binary-search cost exceeds the information bound by a small constant, consistent with the published constructive strategies needing extra retreats at .