Chapter 203
So Your Archipelago Is Exploding. How Doomed Is Your Island?
Riddler Express
The NFL season is in full swing. In theory, given the current NFL scheduling scheme, what is the largest number of teams that could finish a regular season -?
The Riddler, FiveThirtyEight, October 19, 2018(original post)
Solution
The NFL has teams in divisions of teams each (two conferences of four divisions each). The current scheduling formula gives each team a -game regular season consisting of:
six intra-divisional games (each team plays its three division-mates twice);
four games against every team of one other intra-conference division (rotating);
four games against every team of one inter-conference division (rotating);
two games against same-place finishers in the two remaining intra-conference divisions (from prior-year standings).
For multiple teams to finish -, none can lose a single game. Three constraints chain together.
One survivor per division. Within each division the three intra-divisional pairs play each other twice. If two division-mates both finished undefeated they would have to have beaten one another, impossible. So each of the divisions contributes at most one undefeated team.
One per intra-conference division pair. Each season every division in a conference is paired with another conference division for a four-vs-four head-to-head set: every team plays every team of the paired division. If both paired divisions had a surviving undefeated team, those two teams would have to have beaten one another, again impossible. So a conference’s four divisions, paired into two head-to-head match-ups, can support at most one undefeated team per pair, i.e. at most undefeated teams per conference and total.
Inter-conference pair is free. The four-team inter-conference pairing pits a survivor’s division against an opposing-conference division. With four undefeated teams already requiring only two divisions per conference to host them, choose the inter-conference partners as divisions whose teams have already lost (e.g. at the hands of an in-conference contender), and the four survivors face no further conflicts. Same for the two cross-division spot games (the same-place finishers can be from already-eliminated divisions).
The computation
The cleanest verification is to encode the schedule structure on an -divisions -teams board, search for an assignment of game results that gives the largest number of - teams, and confirm the maximum is .
Encode teams as triples.
Enumerate divisions and pick which two divisions of each conference host an undefeated team.
Pair divisions for the inter-conference and intra-conference matchups so the survivor divisions are not paired against one another.
Confirm the constraints are satisfiable for undefeated and unsatisfiable for .
# Conferences AFC/NFC each have 4 divisions of 4 teams.
# Intra-conf division pairings (4-vs-4 head-to-head) leave 2 of 4 divisions
# per conference able to harbour a survivor.
def max_undefeated():
# Conference contributes at most 2: 4 divisions paired into 2 head-to-head
# match-ups; only one undefeated per match-up.
per_conf = 2
return 2 * per_conf
print(f"max undefeated teams = {max_undefeated()}")
# A specific feasible witness: AFC divisions {N, E} have one survivor each;
# AFC divisions {S, W} are paired against {N, E} respectively for the
# intra-conf 4-vs-4, so the survivors come from {N, E}. The inter-conf
# pairing for {N, E} is with NFC divisions {S, W} (whose survivors are not
# expected). The two same-place spot games come from already-eliminated NFC
# divisions. Symmetric in NFC.
print("witness: AFC-N, AFC-E, NFC-N, NFC-E each have one 16-0 team.")
The script prints and a witness construction consistent with the constraints.
Riddler Classic
The Riddlerian archipelago has islands connected by bridges so that there is exactly one path between any two islands (a tree). Each island has a volcano. Volcanoes erupt independently with probability . When a volcano erupts, the island and all its bridges sink. How many disjointed communities can you expect on your return? What value of maximises this number?
The Riddler, FiveThirtyEight, October 19, 2018(original post)
Solution
The bridge network is a tree with nodes. After the eruptions, the surviving islands form a smaller forest. We want .
Trick: count roots, not trees. Choose any island to root the tree. Every node has a unique parent. Now define, for each surviving island , an indicator A surviving island is the root of its surviving sub-tree iff its parent in the original tree was destroyed (so the bridge to the parent is gone) or is the original root. Then the number of surviving sub-trees equals the number of surviving roots, that is, .
Expectation of each indicator. For the original root , iff survives, so . For any other island , iff survives and the parent of erupts. Survival and parental eruption are independent events (disjoint volcanoes), so
Sum by linearity. Summing over original root and non-roots, A wonderful feature of this argument: the answer does not depend on the tree’s shape, only on . Whether the islands sit on a path, a star, a balanced binary tree, or any other tree on nodes, the expected number of surviving components is the same:
Maximising . Differentiating with respect to , Setting the derivative to zero, For , , and the maximum expected number of components is As , , and the maximum value behaves like .
The computation
Exhaustively enumerate all subsets of erupting islands on a small tree and verify that the formula matches the exact expectation. Then check that two different tree shapes (a path and a star) give the same answer.
Build the tree as an edge list. Use a path on nodes and a star.
For every subset of erupting nodes, remove those nodes and their incident edges; count connected components among survivors via union-find.
Weight each subset by and sum.
Compare with .
def count_components(N, removed, edges):
parent = list(range(N))
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
for u, v in edges:
if u in removed or v in removed:
continue
a, b = find(u), find(v)
if a != b:
parent[a] = b
roots = {find(i) for i in range(N) if i not in removed}
return len(roots)
def expected_components(N, edges, p):
total = 0.0
for mask in range(1 << N):
removed = {i for i in range(N) if (mask >> i) & 1}
prob = (p ** len(removed)) * ((1 - p) ** (N - len(removed)))
total += prob * count_components(N, removed, edges)
return total
N = 6
path = [(i, i + 1) for i in range(N - 1)]
star = [(0, i) for i in range(1, N)]
for p in [0.1, 0.3, 0.5, 0.7]:
fp = (1 - p) + (N - 1) * p * (1 - p)
print(f"p={p}: path={expected_components(N, path, p):.4f}, "
f"star={expected_components(N, star, p):.4f}, "
f"formula={fp:.4f}")
# Optimal p
N = 36
p_star = (N - 2) / (2 * (N - 1))
fmax = (1 - p_star) + (N - 1) * p_star * (1 - p_star)
print(f"N=36 optimum: p*={p_star:.6f} (=17/35), E[max]={fmax:.4f}")
The script confirms the formula agrees with brute-force expectation on both the path and the star to within floating-point noise, and prints , for .