Chapter 136

The Lifeguard’s Dash and Toddler Poker

The Riddler for February 10, 2017. The Express is a refraction problem in disguise; the Classic is the continuous limit of baby poker, where bluffing is again optimal.

Riddler Express

You are a lifeguard at the water’s edge. A swimmer is in trouble $100$ meters to your right and $100$ meters out from shore. You run $100$ meters in $15$ seconds but swim $100$ meters in $75$ seconds, and you cannot run in the water. What is the fastest you can reach them?

The Riddler, FiveThirtyEight, February 10, 2017(original post)

Solution

Running is five times faster than swimming, so you should spend more of the journey on sand and cut the slow water leg short, the same trade-off light makes when it bends entering water. The only choice is the point where you leave the shore. Run $s$ meters along the beach, then swim the hypotenuse to the victim. Running covers $s$ meters; swimming covers $\sqrt{(100-s)^2 + 100^2}$ meters; converting distances to time at $0.15$ s/m on sand and $0.75$ s/m in water, $T(s) = 0.15\,s + 0.75\sqrt{(100-s)^2 + 100^2}.$ Minimising (set $T'(s)=0$ ) gives the optimal entry point $s \approx 79.6$ m and a time of $\boxed{\approx 88.5 \text{ seconds}},$ beating the $90$ seconds of running the full $100$ m and then swimming straight out. The condition $T'(s)=0$ is exactly Snell’s law of refraction: the ratio of the sines of the “angles” on sand and in water equals the ratio of speeds. The lifeguard, without knowing it, bends their path like a ray of light.

The computation

Encode the time function and minimise it numerically, comparing against the naive run-then-swim route.

import math
from scipy.optimize import minimize_scalar
T = lambda s: 0.15 * s + 0.75 * math.hypot(100 - s, 100)
r = minimize_scalar(T, bounds=(0, 100), method="bounded")
print("optimal entry s:", round(r.x, 1), "m   fastest time:", round(r.fun, 1), "s")
print("naive run-then-swim:", round(T(100), 1), "s")
# optimal entry s: 79.6 m   fastest time: 88.5 s
# naive run-then-swim: 90.0 s

Riddler Classic

Toddler poker: each player is dealt a number drawn uniformly from $[0,1]$ , and both ante $1. Player A may “call” (showdown for the $2, higher number wins) or “raise” $1. If A raises, B may “fold” (A takes the pot, B loses only the ante) or “call” the dollar (showdown for $4). What are the optimal strategies, and what is the game worth to A?

Extra credit: what if the raise is worth $ $k$ instead of $2?

The Riddler, FiveThirtyEight, February 10, 2017(original post)

Solution

B can never bluff, the hand ends the moment she acts, so she simply calls with high numbers and folds with low ones. A, moving first, can bluff, and should. He raises his strong hands for value, but he also raises his weakest hands: a low number is a sure loser at showdown, so betting it risks nothing extra on average while sometimes scaring B into folding a better hand. Solving for the thresholds where each player is indifferent gives the equilibrium $\boxed{ \begin{array}{l} \text{A: raise if } x > 0.7 \text{ or } x < 0.1;\ \text{call if } 0.1 \le x \le 0.7.\\ \text{B: call a raise if } y > 0.4;\ \text{fold otherwise.} \end{array}}$ To value the game, split the unit square by these thresholds and add up A’s average profit (each ante is already on the table): $\begin{aligned} x<0.1\ (\text{bluff}): &\quad (+1)(0.04) + (-2)(0.06) = -0.08,\\ 0.1<x<0.7\ (\text{call}): &\quad \int_{0.1}^{0.7}(2x-1)\,dx = -0.12,\\ x>0.7\ (\text{raise}): &\quad (+1)(0.12) + \int_{0.7}^{1}(4x-2.8)\,dx = +0.30. \end{aligned}$ Summing the three regions, the game is worth $-0.08 - 0.12 + 0.30 = \boxed{\$0.10}$ to Player A. A’s first-move advantage is worth exactly a dime a hand. For the extra credit, raising the stake to $ $k$ keeps the same three-region structure, A value-raises high, bluffs low, calls in the middle, but slides the thresholds: a bigger pot makes B call more and trims A’s profitable bluffing range. The exact boundaries shift continuously with $k$ .

The computation

Encode the payoff on the unit square and integrate to confirm the value, then check that the boxed thresholds are mutual best responses (A maximising, B minimising A’s take on a grid).

import numpy as np
def Apay(x, y, alo=0.1, ahi=0.7, bcall=0.4):
    if not (x > ahi or x < alo):                 # A calls -> showdown for $2
        return 1.0 if x > y else (-1.0 if x < y else 0.0)
    if y < bcall:                                # A raises, B folds
        return 1.0
    return 2.0 if x > y else (-2.0 if x < y else 0.0)   # A raises, B calls -> $4

def value(alo, ahi, bcall, n=600):
    g = (np.arange(n) + 0.5) / n
    return np.mean([Apay(x, y, alo, ahi, bcall) for x in g for y in g])

print("value at thresholds .1/.7/.4:", round(value(0.1, 0.7, 0.4), 4))
bestA = max(((a, b) for a in np.linspace(0, .3, 13) for b in np.linspace(.5, .9, 13)),
            key=lambda t: value(t[0], t[1], 0.4))
print("A best response vs B=.4:", [round(float(v), 2) for v in bestA], "-> value", round(value(*bestA, 0.4), 4))
bestB = min(np.linspace(0, 1, 41), key=lambda c: value(0.1, 0.7, c))
print("B holds A to:", round(value(0.1, 0.7, bestB), 4))
# value at thresholds .1/.7/.4: 0.1
# A best response vs B=.4: [0.1, 0.7] -> value 0.1
# B holds A to: 0.1