Skip to content
Vamshi Jandhyala

Books · The Fiddler: Solutions

Chapter 98

Can You Play ‘The Price Is Right’ Continuously?

Players AA and BB take turns at a wheel that returns a real number uniform on [0,1][0,1]. AA goes first: spin once, or spin twice and add the two results, but a total above 11 is an instant loss. Then BB does the same, knowing AA’s result. Whoever ends with the higher total at or below 11 wins. With both playing optimally, what is AA’s probability of winning?

The Fiddler, Zach Wissner-Gross, April 26, 2024(original post)

Solution

Work backwards from BB. Suppose AA finishes with a valid total tt. If BB’s first spin exceeds tt, BB stops and wins; otherwise BB must spin again and clear tt without busting, which happens with probability 1t1-t. So BB wins with probability (1t)+t(1t)=1t2(1-t)+t(1-t)=1-t^2, and AA wins with probability exactly t2t^2 (a bust by BB counts for AA).

So AA should maximise the expected value of t2t^2. After a first spin xx, stopping scores x2x^2, while spinning again scores 01x(x+y)2dy=1x33\int_0^{1-x}(x+y)^2\,\mathrm dy=\tfrac{1-x^3}{3}. Stopping wins once x21x33x^2\ge\tfrac{1-x^3}{3}, that is past the root c0.532c\approx0.532 of x3+3x21=0x^3+3x^2-1=0. Therefore P(A)=c1x2dx+0c1x33dx,P(A)=\int_c^1 x^2\,\mathrm dx+\int_0^c\frac{1-x^3}{3}\,\mathrm dx, which evaluates to  34cos(2π/9)2+sin(π/18)2 0.4538.\boxed{\ \frac34-\frac{\cos(2\pi/9)}{2}+\frac{\sin(\pi/18)}{2}\ }\approx0.4538. Going first is a real disadvantage: AA wins less than half the time.

The computation

Play the game under the optimal rules: AA stops if its first spin reaches cc, otherwise spins again; BB, knowing AA’s total tt, stops if its first spin beats tt, otherwise spins again. The simulated win frequency for AA matches the closed form.

import numpy as np
from scipy.optimize import brentq
rng = np.random.default_rng(0)
c = brentq(lambda x: x**3 + 3*x**2 - 1, 0, 1)       # optimal stop threshold ~0.532
N = 8_000_000
x, y = rng.random(N), rng.random(N)
tA = np.where(x >= c, x, np.where(x + y <= 1, x + y, 0.0))   # A stops at first spin >= c
u, v = rng.random(N), rng.random(N)
tB = np.where(u > tA, u, np.where(u + v <= 1, u + v, 0.0))   # B stops at first spin beating A
print((tA > tB).mean())                                     # ~0.4538

Extra Credit

Now three players AA, BB, CC spin in turn, each once or twice, highest valid total wins. What is AA’s probability of winning?

Solution

The backward induction gains one layer. The last player CC, facing the current best total bb, stops on any first spin above bb and otherwise spins again, beating bb with probability 1b1-b, so CC overtakes a total tt with probability 1t21-t^2. Knowing this, the middle player BB, holding AA’s total aa, values stopping at a first spin uu at u2u^2 (it beats AA but CC may overtake) and spinning again at 1max(u,a)33\tfrac{1-\max(u,a)^3}{3}; comparing the two, BB’s optimal cutoff rises to max(a,c)\max(a,c), with c0.532c\approx0.532 the two-player threshold. The first player AA, now exposed to two opponents, raises its own first-spin threshold to about 0.650.65. Sweeping that threshold and playing the game out gives P(A)0.305,\boxed{P(A)\approx0.305,} and the order penalty deepens: first, middle, and last win roughly 30%30\%, 33%33\%, and 37%37\%. Spinning last, with full information and no one to follow, is worth the most. (A closed form is more than the puzzle needs; this is the numerical optimum.)

The computation

Encode all three players with their optimal rules (CC stops above the best, BB at max(a,c)\max(a,c)) and sweep AA’s first-spin threshold, taking the best.

import numpy as np
from scipy.optimize import brentq
rng = np.random.default_rng(0)
c = brentq(lambda x: x**3 + 3*x**2 - 1, 0, 1)       # optimal stop threshold ~0.532
N = 4_000_000
def shares(tA_thr):
    x, y = rng.random(N), rng.random(N)
    tA = np.where(x >= tA_thr, x, np.where(x + y <= 1, x + y, 0.0))
    u, v = rng.random(N), rng.random(N)
    thrB = np.maximum(tA, c)                                 # B's optimal cutoff
    tB = np.where(u > thrB, u, np.where(u + v <= 1, u + v, 0.0))
    best = np.maximum(tA, tB)
    uc, vc = rng.random(N), rng.random(N)
    tC = np.where(uc > best, uc, np.where(uc + vc <= 1, uc + vc, 0.0))
    return [((p > q) & (p > r)).mean() for p, q, r in
            [(tA, tB, tC), (tB, tA, tC), (tC, tA, tB)]]
best = max((shares(t)[0], t) for t in np.linspace(0.55, 0.75, 11))
print(round(best[0], 3), "at", round(best[1], 2))           # ~0.305 at 0.65