Skip to content
Vamshi Jandhyala

Books · The Fiddler: Solutions

Chapter 101

Can You Paint by Number?

An infinitely long strip of canvas, 11 cm wide, is divided into 1×11\times1 squares. Each square is independently labelled 00 or 11 with equal probability, then painted red for 00 and blue for 11. A cluster is a maximal run of same-coloured squares. On average, how large is a cluster?

The Fiddler, Zach Wissner-Gross, April 5, 2024(original post)

Solution

A cluster is a run of identical fair coin flips. Reading along the strip, once a cluster has begun each following square continues it with probability 12\tfrac12 and ends it with probability 12\tfrac12, so a cluster’s length is geometric with mean 1/12=21/\tfrac12=2. The same count from the other side: each square starts a new cluster exactly when it differs from its left neighbour, which happens half the time, so there is one cluster for every two squares. Either way the average cluster size is 2.\boxed{2}.

The computation

Lay down a long random strip and count clusters directly as one more than the number of colour changes between adjacent squares; the strip length divided by that count is the average size.

import numpy as np
rng = np.random.default_rng(0)
g = rng.integers(0, 2, size=4_000_000)
breaks = 1 + np.count_nonzero(g[1:] != g[:-1])     # number of clusters
print(len(g) / breaks)                              # ~2.0

Extra Credit

Now the strip is 22 cm wide, a 2×2\times\infty grid, with squares adjacent only across shared edges. What is the average cluster size?

Solution

In two dimensions a run-length count no longer works, because a cluster can branch. Euler’s formula does, though. For the graph whose vertices are the cells of one colour and whose edges join same-coloured edge-neighbours, the number of connected pieces is #clusters=VEsame+F,\#\text{clusters}=V-E_{\text{same}}+F, where FF counts the independent cycles, which on this grid are exactly the monochromatic 2×22\times2 blocks (a filled square encloses one face). Take expectations per column of the 2×2\times\infty strip: V=2V=2 cells; the same-colour edges average 12\tfrac12 (the one vertical pair) plus 2122\cdot\tfrac12 (the two horizontal pairs reaching the next column); and a given 2×22\times2 block is all one colour with probability 18\tfrac18. So each column contributes E[#clusters]232+18=58\mathbb{E}[\#\text{clusters}]\to 2-\tfrac32+\tfrac18=\tfrac58 clusters. With two cells per column, the average cluster size is 2/582\big/\tfrac58: 165=3.2.\boxed{\tfrac{16}{5}=3.2.}

The computation

Generate a tall random 2×L2\times L strip and flood-fill its four-connected same-colour clusters, then divide the cell count by the number of clusters.

import numpy as np
from collections import deque
def avg_cluster(L, seed):                           # 2 x L strip, 4-connected clusters
    rng = np.random.default_rng(seed); g = rng.integers(0, 2, size=(2, L))
    seen = np.zeros_like(g, bool); n = 0
    for i in range(2):
        for j in range(L):
            if not seen[i, j]:
                n += 1; c = g[i, j]; q = deque([(i, j)]); seen[i, j] = True
                while q:
                    a, b = q.popleft()
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        x, y = a + da, b + db
                        if 0 <= x < 2 and 0 <= y < L and not seen[x, y] and g[x, y] == c:
                            seen[x, y] = True; q.append((x, y))
    return 2 * L / n
print(avg_cluster(3_000_000, 0))                    # ~3.200