Chapter 100

Can You Win This Hot New Game Show?

Two players go into separate booths. Each presses a button and sees a number drawn uniformly from $[0,1]$ . A player may keep that number, or press again to discard it and take a second draw, which must be kept. The players then compare their kept numbers, and the higher one wins the gold. Both reason the same way, so both use the same cutoff: keep a first draw at or above some value $C$ , discard it below. What cutoff $C$ is optimal?

The Riddler, FiveThirtyEight (Stephen Mellendorf)(original post)

Solution

The cutoff that maximises your average kept number is $\tfrac12$ , but this is not a contest of averages, it is a contest against one opponent, so the right cutoff is different. The clean way in is the boundary itself: if $C$ is the best cutoff, then a player whose first draw is exactly $C$ must be perfectly torn between keeping it and pressing again. At any other value the choice is strict, so indifference pins $C$ down. Both quantities below are computed against an opponent also playing $C$ .

Keep $C$ . You win when the opponent’s final number falls below $C$ . That needs both of the opponent’s draws below $C$ (a first draw below $C$ forces the second), which happens with probability $C\cdot C = C^2$ .

Press again. Your new number is uniform on $[0,1]$ , independent of everything. With probability $C$ the opponent also pressed again, leaving you two fresh draws and an even $\tfrac12$ chance. With probability $1-C$ the opponent kept a first draw, which is then uniform on $[C,1]$ , and a fresh uniform beats a uniform on $[C,1]$ with probability $\tfrac{1-C}{2}$ . So pressing wins with probability $C\cdot\tfrac12 + (1-C)\cdot\tfrac{1-C}{2}$ .

Setting the two equal, $\begin{aligned} C^2 = \frac{C}{2} + \frac{(1-C)^2}{2} &\;\Longrightarrow\; 2C^2 = C + 1 - 2C + C^2\\ &\;\Longrightarrow\; C^2 + C - 1 = 0, \end{aligned}$ whose positive root is the golden ratio’s smaller cousin, $C = \frac{\sqrt5 - 1}{2} \approx \boxed{0.618}.$ Keep a first draw above about $0.618$ , and throw anything below it back.

The computation

Play the game itself. With both players using a cutoff, simulate millions of rounds and read off each side’s win rate. A cutoff is optimal only if no other value does better against it, so hold the opponent at $0.618$ and check that $0.618$ is the best reply: nudging it either way loses ground.

import numpy as np
rng = np.random.default_rng(0)
N = 5_000_000

def final(cutoff):                         # one player's kept number
    first = rng.random(N)
    return np.where(first >= cutoff, first, rng.random(N))

def win_rate(mine, opp):
    return (final(mine) > final(opp)).mean()

C = (5 ** 0.5 - 1) / 2
for mine in (0.55, C, 0.70):
    print(f"{mine:.3f} vs 0.618: win {win_rate(mine, C):.4f}")
# 0.550 vs 0.618: win 0.4983
# 0.618 vs 0.618: win 0.5001   <- best reply is 0.618 itself
# 0.700 vs 0.618: win 0.4945