Chapter 183
How Hard Is It To Find A Running Buddy?
Riddler Express
A grocer’s aisles are randomly assigned to stockers. (i) How many different ways are there for one stocker to be assigned exactly three aisles out of the ? (ii) How many different ways are there to assign all aisles to the stockers, three aisles each? (iii) If the manager may assign any number of aisles to any stocker, how many assignments are possible?
The Riddler, FiveThirtyEight, May 11, 2018(original post)
Solution
Part (i): one stocker, three aisles. The order in which the three aisles are assigned to a single stocker does not matter; the aisle set does. This is the unordered choice of from :
Part (ii): all aisles, three each. Pick the first stocker’s three aisles, then the second’s from the remaining, then the third’s from the remaining, and so on. The five stockers are distinguishable (they are different people), so the multinomial coefficient is
Part (iii): unrestricted. Each aisle is assigned to one of the five stockers independently, with no constraint on how many a single stocker may take. By the product rule,
So the three answers are
The computation
Evaluate the three counts symbolically with Python’s exact integer arithmetic and confirm the product structure of part (ii) by direct multinomial enumeration on a smaller toy case.
Compute , , and .
Cross-check the multinomial count by the product of nested binomials.
Verify part (ii) on a toy case (say aisles, stockers, each) against the brute enumeration of all such partitions.
import math
from itertools import combinations
print("(i) ", math.comb(15, 3))
print("(ii) ", math.factorial(15) // (math.factorial(3) ** 5))
print("nested",
math.comb(15, 3) * math.comb(12, 3) * math.comb(9, 3)
* math.comb(6, 3) * math.comb(3, 3))
print("(iii) ", 5 ** 15)
# toy brute force: 9 aisles, 3 stockers (A,B,C), 3 each
toy = 0
aisles = list(range(9))
for A in combinations(aisles, 3):
rem1 = [x for x in aisles if x not in A]
for B in combinations(rem1, 3):
toy += 1 # C is forced
print("toy 9-aisles, 3 stockers, 3 each =", toy,
" formula =", math.factorial(9) // (math.factorial(3) ** 3))
The three answers print, the nested-binomial product matches the multinomial, and the toy case returns , equal to , confirming the structure.
Riddler Classic
people go for a run. Each person’s preferred speed is independent and normally distributed with mean and variance . Persons and can be running buddies if . How large does need to be before the probability that a given person has a running buddy is at least ? (Assume , , and are fixed.)
The Riddler, FiveThirtyEight, May 11, 2018(original post)
Solution
The right thing to condition on. Fix person and ask: given person ’s speed , what is the probability that none of the other runners is within of ? Once we condition on , the other are independent of one another. So the no-buddy event factors: where and is the standard normal cumulative distribution function. Marginalising over the density of ,
The headline. The probability of having a buddy is . We want the smallest for which this is at least :
Why the published one-liner is too small. The official write-up uses the formula which treats the events for different as independent. They share the common variable , so they are not. Conditional on they factor cleanly; unconditional they are positively correlated, which raises above the product formula. The official formula therefore under-estimates the needed and yields a value (around for marathon-fit parameters) that is several times too small.
Worked numbers (Boston Marathon parameters). Take , miles per hour, and try the bracket of values mentioned in the official write-up.
| official formula | integral | |
|---|---|---|
The integral column is what the puzzle actually asks for. To get to buddy-coverage you need substantially more runners than the official formula suggests: for the case, around , not .
Closed-form scaling. The integral depends on the dimensionless ratio only. For small , is approximately for , and the integral is approximately . So scales like , four times the scaling the official formula suggests. The intuition is that someone with an extreme speed has nobody at all near them; even with many average-paced runners, the tails dictate the no-buddy probability.
The computation
Evaluate the integral by numerical quadrature for a sweep of , find the threshold that drops below , and confirm by Monte Carlo simulation.
For each from upwards, compute by Simpson’s rule on a wide grid (in standardised units).
Return the first for which the integral falls below .
Cross-check with -trial Monte Carlo.
import math, random
from scipy import stats
from scipy.integrate import quad
def threshold(sigma, s, mu=0.0, target=0.01):
def q(x):
a = stats.norm.cdf((x + s - mu) / sigma)
b = stats.norm.cdf((x - s - mu) / sigma)
return a - b
def no_buddy(N):
f = lambda x: stats.norm.pdf((x - mu) / sigma) / sigma * (1 - q(x)) ** (N - 1)
v, _ = quad(f, mu - 8 * sigma, mu + 8 * sigma)
return v
N = 2
while no_buddy(N) > target:
N += 1
return N
def mc(sigma, s, N, mu=0.0, trials=200_000):
has = 0
for _ in range(trials):
xs = [random.gauss(mu, sigma) for _ in range(N)]
if any(abs(xs[0] - xs[j]) <= s for j in range(1, N)):
has += 1
return has / trials
for sigma, s in [(1.0, 1.0), (1.16, 1.0), (1.16, 0.5)]:
N = threshold(sigma, s)
print(f"sigma={sigma}, s={s}: N* = {N}, MC at N* = {mc(sigma, s, N):.4f}")
For the script prints with MC buddy-rate ; for , with MC . These match the integral column in the table above.