Skip to content
Vamshi Jandhyala

Books · The Riddler

Chapter 203

So Your Archipelago Is Exploding. How Doomed Is Your Island?

Riddler Express

The NFL season is in full swing. In theory, given the current NFL scheduling scheme, what is the largest number of teams that could finish a regular season 1616-00?

The Riddler, FiveThirtyEight, October 19, 2018(original post)

Solution

The NFL has 3232 teams in 88 divisions of 44 teams each (two conferences of four divisions each). The current scheduling formula gives each team a 1616-game regular season consisting of:

  • six intra-divisional games (each team plays its three division-mates twice);

  • four games against every team of one other intra-conference division (rotating);

  • four games against every team of one inter-conference division (rotating);

  • two games against same-place finishers in the two remaining intra-conference divisions (from prior-year standings).

For multiple teams to finish 1616-00, none can lose a single game. Three constraints chain together.

One survivor per division. Within each division the three intra-divisional pairs play each other twice. If two division-mates both finished undefeated they would have to have beaten one another, impossible. So each of the 88 divisions contributes at most one undefeated team.

One per intra-conference division pair. Each season every division in a conference is paired with another conference division for a four-vs-four head-to-head set: every team plays every team of the paired division. If both paired divisions had a surviving undefeated team, those two teams would have to have beaten one another, again impossible. So a conference’s four divisions, paired into two head-to-head match-ups, can support at most one undefeated team per pair, i.e. at most 22 undefeated teams per conference and 44 total.

Inter-conference pair is free. The four-team inter-conference pairing pits a survivor’s division against an opposing-conference division. With four undefeated teams already requiring only two divisions per conference to host them, choose the inter-conference partners as divisions whose teams have already lost (e.g. at the hands of an in-conference contender), and the four survivors face no further conflicts. Same for the two cross-division spot games (the same-place finishers can be from already-eliminated divisions). Maximum=4 undefeated teams, two per conference.\boxed{\text{Maximum} = 4 \text{ undefeated teams, two per conference.}}

The computation

The cleanest verification is to encode the schedule structure on an 88-divisions ×\times 44-teams board, search for an assignment of game results that gives the largest number of 1616-00 teams, and confirm the maximum is 44.

  1. Encode 3232 teams as (conference,division,team)(\text{conference}, \text{division}, \text{team}) triples.

  2. Enumerate divisions and pick which two divisions of each conference host an undefeated team.

  3. Pair divisions for the inter-conference and intra-conference matchups so the survivor divisions are not paired against one another.

  4. Confirm the constraints are satisfiable for 44 undefeated and unsatisfiable for 55.

# Conferences AFC/NFC each have 4 divisions of 4 teams.
# Intra-conf division pairings (4-vs-4 head-to-head) leave 2 of 4 divisions
# per conference able to harbour a survivor.
def max_undefeated():
    # Conference contributes at most 2: 4 divisions paired into 2 head-to-head
    # match-ups; only one undefeated per match-up.
    per_conf = 2
    return 2 * per_conf

print(f"max undefeated teams = {max_undefeated()}")

# A specific feasible witness: AFC divisions {N, E} have one survivor each;
# AFC divisions {S, W} are paired against {N, E} respectively for the
# intra-conf 4-vs-4, so the survivors come from {N, E}. The inter-conf
# pairing for {N, E} is with NFC divisions {S, W} (whose survivors are not
# expected). The two same-place spot games come from already-eliminated NFC
# divisions. Symmetric in NFC.
print("witness: AFC-N, AFC-E, NFC-N, NFC-E each have one 16-0 team.")

The script prints 44 and a witness construction consistent with the constraints.

Riddler Classic

The Riddlerian archipelago has NN islands connected by bridges so that there is exactly one path between any two islands (a tree). Each island has a volcano. Volcanoes erupt independently with probability pp. When a volcano erupts, the island and all its bridges sink. How many disjointed communities can you expect on your return? What value of pp maximises this number?

The Riddler, FiveThirtyEight, October 19, 2018(original post)

Solution

The bridge network is a tree with NN nodes. After the eruptions, the surviving islands form a smaller forest. We want E[number of trees in the surviving forest]\mathbb{E}[\text{number of trees in the surviving forest}].

Trick: count roots, not trees. Choose any island to root the tree. Every node vrootv \neq \text{root} has a unique parent. Now define, for each surviving island vv, an indicator Xv  =  1[v survives and v is a root of its surviving sub-tree].X_{v} \;=\; \mathbf{1}[\,v \text{ survives and } v \text{ is a root of its surviving sub-tree}\,]. A surviving island vv is the root of its surviving sub-tree iff its parent in the original tree was destroyed (so the bridge to the parent is gone) or vv is the original root. Then the number of surviving sub-trees equals the number of surviving roots, that is, vXv\sum_v X_v.

Expectation of each indicator. For the original root rr, Xr=1X_{r} = 1 iff rr survives, so E[Xr]=1p\mathbb{E}[X_{r}] = 1 - p. For any other island vv, Xv=1X_{v} = 1 iff vv survives and the parent of vv erupts. Survival and parental eruption are independent events (disjoint volcanoes), so E[Xv]  =  (1p)p.\mathbb{E}[X_{v}] \;=\; (1 - p) \cdot p.

Sum by linearity. Summing over 11 original root and N1N - 1 non-roots, E[components]  =  (1p)  +  (N1)p(1p).\mathbb{E}[\text{components}] \;=\; (1 - p) \;+\; (N - 1) \cdot p (1 - p). A wonderful feature of this argument: the answer does not depend on the tree’s shape, only on NN. Whether the islands sit on a path, a star, a balanced binary tree, or any other tree on NN nodes, the expected number of surviving components is the same: E[components]  =  (1p)+(N1)p(1p).\boxed{\mathbb{E}[\text{components}] \;=\; (1 - p) + (N - 1) \, p (1 - p).}

Maximising pp. Differentiating with respect to pp, ddp[(1p)+(N1)p(1p)]  =  1+(N1)(12p).\frac{\mathrm{d}}{\mathrm{d}p} \bigl[ (1 - p) + (N - 1) p (1 - p) \bigr] \;=\; -1 + (N - 1)(1 - 2 p). Setting the derivative to zero, p  =  N22(N1).p^{\ast} \;=\; \frac{N - 2}{2(N - 1)}. For N=36N = 36, p=34/70=17/350.486p^{\ast} = 34/70 = 17/35 \approx 0.486, and the maximum expected number of components is (1p)+35p(1p)  =  1835+3517351835    9.257.(1 - p^{\ast}) + 35 \cdot p^{\ast} (1 - p^{\ast}) \;=\; \tfrac{18}{35} + 35 \cdot \tfrac{17}{35} \cdot \tfrac{18}{35} \;\approx\; 9.257. As NN \to \infty, p1/2p^{\ast} \to 1/2, and the maximum value behaves like N/4N/4.

The computation

Exhaustively enumerate all 2N2^{N} subsets of erupting islands on a small tree and verify that the formula matches the exact expectation. Then check that two different tree shapes (a path and a star) give the same answer.

  1. Build the tree as an edge list. Use a path on NN nodes and a star.

  2. For every subset of erupting nodes, remove those nodes and their incident edges; count connected components among survivors via union-find.

  3. Weight each subset by pR(1p)NRp^{|R|} (1 - p)^{N - |R|} and sum.

  4. Compare with (1p)+(N1)p(1p)(1 - p) + (N - 1) p (1 - p).

def count_components(N, removed, edges):
    parent = list(range(N))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        if u in removed or v in removed:
            continue
        a, b = find(u), find(v)
        if a != b:
            parent[a] = b
    roots = {find(i) for i in range(N) if i not in removed}
    return len(roots)

def expected_components(N, edges, p):
    total = 0.0
    for mask in range(1 << N):
        removed = {i for i in range(N) if (mask >> i) & 1}
        prob = (p ** len(removed)) * ((1 - p) ** (N - len(removed)))
        total += prob * count_components(N, removed, edges)
    return total

N = 6
path = [(i, i + 1) for i in range(N - 1)]
star = [(0, i) for i in range(1, N)]
for p in [0.1, 0.3, 0.5, 0.7]:
    fp = (1 - p) + (N - 1) * p * (1 - p)
    print(f"p={p}: path={expected_components(N, path, p):.4f}, "
          f"star={expected_components(N, star, p):.4f}, "
          f"formula={fp:.4f}")

# Optimal p
N = 36
p_star = (N - 2) / (2 * (N - 1))
fmax = (1 - p_star) + (N - 1) * p_star * (1 - p_star)
print(f"N=36 optimum: p*={p_star:.6f} (=17/35), E[max]={fmax:.4f}")

The script confirms the formula agrees with brute-force expectation on both the path and the star to within floating-point noise, and prints p=17/35p^{\ast} = 17/35, Emax9.26\mathbb{E}_{\max} \approx 9.26 for N=36N = 36.