Arrange the numbers 1 through 20 in a circle so that the sum of any two neighboring numbers is prime. The problem is known as the “Prime Circle Problem” and is due to Antonio Filz (Problem 1046, J. Recr. Math., vol. 14, p. 64, 1982; vol. 15, p. 71, 1983). It appears in Richard Guy’s classic book Unsolved Problems in Number Theory (2nd edition).
A prime circle is a Hamiltonian cycle in the graph whose edges connect pairs of numbers with a prime sum. Since every prime greater than 2 is odd, each edge must join an odd number to an even number, so the graph is bipartite. Here is my solution in Python using the networkx library.
import networkx as nx
import matplotlib.pyplot as plt
import itertools
from networkx.algorithms.cycles import simple_cycles
from networkx import DiGraph
# Function to check if a number is prime
def is_prime(n):
if n <= 1:
return False
for i in range(2, int(n**0.5) + 1):
if n % i == 0:
return False
return True
# Create an undirected graph
G = nx.Graph()
# Add nodes
G.add_nodes_from(range(1, 21))
# Add edges between nodes if their sum is prime
for u, v in itertools.combinations(G.nodes, 2):
if is_prime(u + v):
G.add_edge(u, v)
# Initialize a variable to store a Hamiltonian cycle
hamiltonian_cycle = None
# Build a directed copy of G so that simple_cycles can enumerate its cycles
DG = DiGraph(G)
# Iteratively check each cycle
for cycle in simple_cycles(DG):
if len(cycle) == len(G.nodes):
hamiltonian_cycle = cycle
break # Exit the loop as soon as a Hamiltonian cycle is found
# Plot the graph
pos = nx.circular_layout(G) # Position nodes in a circle
nx.draw(G, pos, with_labels=True, node_color='lightblue', edge_color='red', font_weight='bold')
# Highlight the Hamiltonian cycle if found
if hamiltonian_cycle:
# Ensure the cycle is in the correct order for plotting
hamiltonian_cycle.append(hamiltonian_cycle[0]) # Make it a cycle
nx.draw_networkx_edges(G, pos, edgelist=list(zip(hamiltonian_cycle, hamiltonian_cycle[1:])), width=2, edge_color='green')
print("Hamiltonian cycle found.")
else:
print("No Hamiltonian cycle found.")
plt.show()
Two other similar problems where the Hamiltonian cycle and Hamiltonian path approach works are given below.
Write out the numbers 1 through 16 as a sequence so that every pair of neighboring numbers sums to a perfect square. (For example, 2, 14, 11 could be part of the sequence because 2 + 14 = 16 and 14 + 11 = 25.)
Here is a solution using the code below. The idea is to create a graph in which two nodes are adjacent exactly when they sum to a perfect square, and then check whether the graph contains a Hamiltonian path.
import networkx as nx
import math
import matplotlib.pyplot as plt
def is_perfect_square(n):
root = int(math.sqrt(n))
return root * root == n
def find_hamiltonian_path(graph, node, visited, path):
visited[node] = True
path.append(node)
if len(path) == len(graph.nodes()):
return True
for neighbor in graph.neighbors(node):
if not visited[neighbor]:
if find_hamiltonian_path(graph, neighbor, visited, path):
return True
path.pop()
visited[node] = False
return False
def hamiltonian_path(graph):
visited = {node: False for node in graph.nodes()}
path = []
for node in graph.nodes():
if find_hamiltonian_path(graph, node, visited, path):
return path
return None
G = nx.Graph()
G.add_nodes_from(range(1, 17))
for i in G.nodes():
for j in G.nodes():
if i < j and is_perfect_square(i + j):
G.add_edge(i, j)
path = hamiltonian_path(G)
if path:
print("Hamiltonian Path:", path)
else:
print("No Hamiltonian Path found!")
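As a sanity check on the search output, here is one valid arrangement of 1 through 16, verified programmatically (a hand-picked illustrative example; any valid Hamiltonian path is acceptable, and the search above may return a different one):

```python
from math import isqrt

# One known square-sum arrangement of 1..16 (an illustrative example,
# not necessarily the one the search above returns):
seq = [16, 9, 7, 2, 14, 11, 5, 4, 12, 13, 3, 6, 10, 15, 1, 8]
ok = sorted(seq) == list(range(1, 17)) and all(
    isqrt(a + b) ** 2 == a + b for a, b in zip(seq, seq[1:]))
print(ok)  # True
```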
Write out the numbers 1 through 32 as a sequence so that every pair of neighboring numbers sums to a perfect square; the first and last entries must also sum to a square.
Here is a solution. The idea is the same as before, except that we now check whether the graph contains a Hamiltonian cycle.
import networkx as nx
import math
import matplotlib.pyplot as plt
def is_perfect_square(n):
root = int(math.sqrt(n))
return root * root == n
G = nx.Graph()
G.add_nodes_from(range(1, 33))
for i in G.nodes():
for j in G.nodes():
if i < j and is_perfect_square(i + j):
G.add_edge(i, j)
def hamiltonian_cycle(graph):
    # networkx >= 3.1 supports simple_cycles on undirected graphs
    for cycle in nx.simple_cycles(graph):
if len(cycle) == len(graph.nodes):
return cycle
return None
cycle = hamiltonian_cycle(G)
if cycle:
print("Hamiltonian Cycle:", cycle)
else:
print("No Hamiltonian Cycle found!")
It’s time for a random number duel! You and I will both use random number generators, which should give us random real numbers between 0 and 1. Whoever’s number is greater wins the duel!
There’s just one problem. I’ve hacked your random number generator. Instead of giving you a random number between 0 and 1, it gives you a random number between 0.1 and 0.8.
What are your chances of winning the duel?
The hacked generator produces $X \sim U(a, b)$ with $0 \le a < b \le 1$, while the unhacked generator produces $Y \sim U(0, 1)$. We need the probability $P(X > Y)$. Geometrically, this is the area of the trapezium $\{(x, y) : a \le x \le b,\ 0 \le y < x\}$, whose parallel sides have lengths $a$ and $b$ and whose height is $b - a$, divided by the area of the rectangle $[a, b] \times [0, 1]$:
$$P(X > Y) = \frac{\tfrac{1}{2}(a + b)(b - a)}{b - a} = \frac{a + b}{2}.$$
In our particular case $a = 0.1$ and $b = 0.8$, therefore the probability of me (with the hacked generator) winning is $(0.1 + 0.8)/2 = 0.45$.
The probability of me winning the duel as per the simulation below is approximately 0.45, which validates the result we got earlier.
from numpy.random import uniform
def prob_win(p1l, p1h, p2l, p2h, runs = 1000000):
total_wins = 0
for _ in range(runs):
x, y = uniform(p1l, p1h), uniform(p2l, p2h)
if x > y:
total_wins += 1
return total_wins/runs
print(prob_win(0.1,0.8,0,1))
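The trapezium formula can also be packaged as a closed-form check (a sketch; the helper name is mine, and it assumes $0 \le a < b \le 1$):

```python
def exact_win_prob(a, b):
    # P(X > Y) for X ~ U(a, b), Y ~ U(0, 1): trapezium area over rectangle
    # area, which simplifies to the midpoint (a + b) / 2 when [a, b] is in [0, 1].
    return ((a + b) * (b - a) / 2) / (b - a)

print(exact_win_prob(0.1, 0.8))  # ~0.45
```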
I have in my possession 1 million fair coins. Before you ask, these are not legal tender. Among these, I want to find the “luckiest” coin.
I first flip all 1 million coins simultaneously (I’m great at multitasking like that), discarding any coins that come up tails. I flip all the coins that come up heads a second time, and I again discard any of these coins that come up tails. I repeat this process, over and over again. If at any point I am left with one coin, I declare that to be the “luckiest” coin.
But getting to one coin is no sure thing. For example, I might find myself with two coins, flip both of them and have both come up tails. Then I would have zero coins, never having had exactly one coin.
What is the probability that I will — at some point — have exactly one “luckiest” coin?
Let $p_n$ be the probability that we will have exactly one “luckiest” coin starting with $n$ coins. Conditioning on the number of heads $i$ in the first round gives the recurrence relation
$$p_n = \sum_{i=0}^{n} \binom{n}{i}\left(\frac{1}{2}\right)^n p_i,$$
because the probability of ending up with $i$ heads when you flip $n$ coins is $\binom{n}{i}/2^n$, and the game then restarts with $i$ coins. The $i = n$ term contains $p_n$ itself, so solving for $p_n$ (with $p_0 = 0$ and $p_1 = 1$) gives
$$p_n = \frac{\sum_{i=1}^{n-1}\binom{n}{i} p_i}{2^n - 1}.$$
From the code below, we get $p_{100} \approx 0.72$. Assuming convergence, the answer for 1 million coins is also approximately 0.72.
from functools import lru_cache
from math import comb
def prob_luckiest_coin(n):
@lru_cache()
def prob(n):
if n == 1:
return 1
else:
total_prob = 0
for i in range(1, n):
total_prob += prob(i)*comb(n,i)
return total_prob/(2**n-1)
return prob(n)
print(prob_luckiest_coin(100))
My condo complex has a single elevator that serves four stories: the garage (G), the first floor (1), the second floor (2) and the third floor (3). Unfortunately, the elevator is malfunctioning and stopping at every single floor, no matter what. The elevator always goes G, 1, 2, 3, 2, 1, G, 1, 2, etc.
I want to board the elevator on a random floor (with all four floors being equally likely). As I round the corner to approach the elevator, I hear that its doors have closed, but I have no further information about which floor it’s on or whether the elevator is going up or down. The doors might have just closed on my floor, for all I know.
On average, how many stops will the elevator make until it opens on my floor (including the stop on my floor)? For example, if I am waiting on the second floor, and I heard the doors closing on the garage level, then the elevator would open on my floor in two stops.
Extra credit: Instead of four floors, suppose my condo had $n$ floors. On average, how many stops will the elevator make until it opens on my floor?
From the simulation below, we see that the average number of stops when there are four floors is approximately 2.83 (the exact value works out to 17/6).
from itertools import cycle
from random import choice, randint
def avg_stops(n = 4, runs = 100000):
rotate = lambda l, n: l[-n:] + l[:-n]
floors, elevator_cycle = list(range(n)), list(range(n)) + list(range(n-2,0,-1))
total_stops = 0
for _ in range(runs):
my_floor, start = choice(floors), randint(0, len(elevator_cycle)-1)
for f in cycle(rotate(elevator_cycle, start)):
total_stops += 1
if f == my_floor:
break
return total_stops/runs
print(avg_stops())
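The simulated value can be confirmed by exact enumeration over the equally likely (floor, door-closing position) pairs (a sketch; the function name is mine):

```python
from fractions import Fraction

def exact_avg_stops(n=4):
    # Elevator stop cycle for n floors, e.g. [0, 1, 2, 3, 2, 1] for n = 4.
    cyc = list(range(n)) + list(range(n - 2, 0, -1))
    L = len(cyc)
    total = Fraction(0)
    for f in range(n):        # my floor
        for s in range(L):    # cycle position where the doors just closed
            k = 1
            while cyc[(s + k) % L] != f:
                k += 1
            total += k        # stops until the doors open on my floor
    return total / (n * L)

print(exact_avg_stops())  # 17/6
```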
You are the coach at Riddler Fencing Academy, where your three students are squaring off against a neighboring squad. Each of your students has a different probability of winning any given point in a match. The strongest fencer has a 75 percent chance of winning each point. The weakest has only a 25 percent chance of winning each point. The remaining fencer has a 50 percent probability of winning each point.
The match will be a relay. First, one of your students will face off against an opponent. As soon as one team reaches a score of 15, both fencers are swapped out. Then, a different student of yours faces a different opponent, continuing from wherever the score left off. When one team reaches 30 (not necessarily the same team that first reached 15), both fencers are swapped out again. The remaining two fencers continue the relay until one team reaches 45 points.
As the coach, you can choose the order in which your three students occupy the three positions in the relay: going first, second or third. How will you order them? And then what will be your team’s chances of winning the relay?
From the simulation below, we can compare the winning probability of each of the six possible orderings; the code prints the estimate for every permutation.
from itertools import permutations
from random import random
def winning_probabilities(runs = 100000):
win_prob = {'w':0.25,'s':0.75, 'm':0.5}
probs = []
for p1,p2,p3 in permutations(win_prob.keys()):
total_wins = 0
for _ in range(runs):
s1, s2 = 0, 0
while (s1 < 15 and s2 < 15):
if (random() < win_prob[p1]):
s1 += 1
else:
s2 += 1
while (s1 < 30 and s2 < 30):
if (random() < win_prob[p2]):
s1 += 1
else:
s2 += 1
while (s1 < 45 and s2 < 45):
if (random() < win_prob[p3]):
s1 += 1
else:
s2 += 1
if s1 == 45:
total_wins += 1
probs.append((p1,p2,p3,total_wins/runs))
return probs
for p1,p2,p3,prob in winning_probabilities():
print(f"Probability of the permutation {(p1,p2,p3)} winning is {prob}")
I have a most peculiar menorah. Like most menorahs, it has nine total candles — a central candle, called the shamash, four to the left of the shamash and another four to the right. But unlike most menorahs, the eight candles on either side of the shamash are numbered. The two candles adjacent to the shamash are both numbered 1, the next two candles out from the shamash are both numbered 2, the next pair are numbered 3, and the outermost pair are numbered 4.
The shamash is always lit. How many ways are there to light the remaining eight candles so that the sums on either side of the menorah are “balanced”? (For example, one such way is to light candles 1 and 4 on one side and candles 2 and 3 on the other side. In this case, the sums on both sides are 5, so the menorah is balanced.)
The number of ways of lighting the candles satisfying the conditions is 25. The different ways of lighting the candles are given below:
(('l', 1), ('r', 1))
(('l', 2), ('r', 2))
(('l', 3), ('r', 3))
(('l', 4), ('r', 4))
(('l', 1), ('l', 2), ('r', 3))
(('l', 1), ('l', 3), ('r', 4))
(('l', 3), ('r', 1), ('r', 2))
(('l', 4), ('r', 1), ('r', 3))
(('l', 1), ('l', 2), ('r', 1), ('r', 2))
(('l', 1), ('l', 3), ('r', 1), ('r', 3))
(('l', 1), ('l', 4), ('r', 1), ('r', 4))
(('l', 1), ('l', 4), ('r', 2), ('r', 3))
(('l', 2), ('l', 3), ('r', 1), ('r', 4))
(('l', 2), ('l', 3), ('r', 2), ('r', 3))
(('l', 2), ('l', 4), ('r', 2), ('r', 4))
(('l', 3), ('l', 4), ('r', 3), ('r', 4))
(('l', 1), ('l', 2), ('l', 3), ('r', 2), ('r', 4))
(('l', 1), ('l', 2), ('l', 4), ('r', 3), ('r', 4))
(('l', 2), ('l', 4), ('r', 1), ('r', 2), ('r', 3))
(('l', 3), ('l', 4), ('r', 1), ('r', 2), ('r', 4))
(('l', 1), ('l', 2), ('l', 3), ('r', 1), ('r', 2), ('r', 3))
(('l', 1), ('l', 2), ('l', 4), ('r', 1), ('r', 2), ('r', 4))
(('l', 1), ('l', 3), ('l', 4), ('r', 1), ('r', 3), ('r', 4))
(('l', 2), ('l', 3), ('l', 4), ('r', 2), ('r', 3), ('r', 4))
(('l', 1), ('l', 2), ('l', 3), ('l', 4), ('r', 1), ('r', 2), ('r', 3), ('r', 4))
from itertools import product, combinations
def menorah_lighting(n=4):
side_sum = lambda comb, side: sum([i for s, i in comb if s == side])
candles = list(product(["l","r"], range(1, n+1)))
cnt, lightings = 0, []
for k in range(2, 2*n+1):
for comb in combinations(candles, k):
if side_sum(comb, "l") == side_sum(comb, "r"):
lightings.append(comb)
cnt += 1
return cnt, lightings
cnt, lightings = menorah_lighting()
print(cnt)
for l in lightings:
print(l)
I have three dice (d4, d6, d8) on my desk that I fiddle with while working, much to the chagrin of my co-workers. For the uninitiated, the d4 is a tetrahedron that is equally likely to land on any of its four faces (numbered 1 through 4), the d6 is a cube that is equally likely to land on any of its six faces (numbered 1 through 6), and the d8 is an octahedron that is equally likely to land on any of its eight faces (numbered 1 through 8).
I like to play a game in which I roll all three dice in “numerical” order: the d4, then the d6 and then the d8. I win this game when the three rolls form a strictly increasing sequence (such as 2-4-7, but not 2-4-4). What is my probability of winning?
Extra credit: Instead of three dice, I now have six dice: a d4, d6, d8, d10, d12 and d20. If I roll all six dice in “numerical” order, what is the probability I’ll get a strictly increasing sequence?
From the simulation below, we see that the probability of winning with the d4, d6 and d8 is 1/4 (0.25), and the probability of winning with all six dice is approximately 0.0118.
from random import choice
def prob(dice_num_faces, runs=10000000):
dice = {n:list(range(1, n+1)) for n in dice_num_faces}
cnt_succ = 0
for _ in range(runs):
roll = [choice(dice[d]) for d in sorted(dice.keys())]
cnt_succ += all(i < j for i, j in zip(roll, roll[1:]))
return cnt_succ/runs
print(prob([4,6,8]))
print(prob([4,6,8,10,12,20]))
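The simulation agrees with exact enumeration of all equally likely outcomes (a sketch; with six dice there are 460,800 outcomes, which is still fast to enumerate):

```python
from itertools import product
from fractions import Fraction

def exact_prob(faces):
    # Count strictly increasing rolls among all equally likely outcomes.
    inc = total = 0
    for r in product(*(range(1, f + 1) for f in faces)):
        total += 1
        inc += all(a < b for a, b in zip(r, r[1:]))
    return Fraction(inc, total)

print(exact_prob([4, 6, 8]))             # 1/4
print(exact_prob([4, 6, 8, 10, 12, 20]))
```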
A group of 101 people join a social network, and each person has a random, 50 percent chance of being friends with each of the other 100 people. Friendship is a symmetric relationship, so if you’re friends with me, then I am also friends with you. I pick a random person among the 101 — let’s suppose her name is Marcia. On average, how many friends would you expect each of Marcia’s friends to have?
From the simulation below, we see that the expected number of friends of each of Marcia’s friends is 50.5. It is interesting to note that Marcia herself, on average, would have only 50 friends.
import networkx as nx
from random import random, randint, choice
def exp_num_friends(n, runs = 10000):
total_deg = 0
for _ in range(runs):
G = nx.Graph()
for i in range(n):
G.add_node(i)
for i in range(n-1):
for j in range(i+1, n):
if random() < 0.5:
G.add_edge(i,j)
marcia = randint(0,n-1)
marcia_friends = list(G.adj[marcia].keys())
if marcia_friends:
total_deg += G.degree[choice(marcia_friends)]
return total_deg/runs
print(exp_num_friends(101))
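The simulated value matches a closed form for this model of the friendship paradox: a friend of Marcia is certainly friends with Marcia, and with each of the remaining $n - 2$ people independently with probability $p$ (a sketch; the function name is mine):

```python
def expected_friend_degree(n, p):
    # Expected number of friends of a randomly chosen friend of Marcia:
    # 1 (Marcia herself) + (n - 2) * p (everyone else, independently).
    return 1 + (n - 2) * p

print(expected_friend_degree(101, 0.5))  # 50.5
```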
The sum of the factors of 36 — including 36 itself — is 91. Coincidentally, 36 inches rounded to the nearest centimeter is 91 centimeters (36 × 2.54 = 91.44)!
Can you find another whole number like 36, where you can “compute” the sum of its factors by converting from inches to centimeters?
Extra credit: Can you find a third whole number with this property? How many more whole numbers can you find?
From the code below we see that 378 and 49,600 are two numbers below 1,000,000 that satisfy the given property.
from functools import reduce
def sum_of_factors(n):
return sum(set(reduce(list.__add__, ([i, n//i] for i in range(1, int(n**0.5) + 1) if n % i == 0))))
def inches_to_cm_same_as_sum_of_divisors():
nums= []
for i in range(37,1000000):
if round(i*2.54) == sum_of_factors(i):
nums.append(i)
return nums
print(inches_to_cm_same_as_sum_of_divisors())
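A quick spot check of 36 and the two further hits, 378 and 49,600, printed by the search (a sketch; `sigma` is a naive divisor-sum helper of my own):

```python
def sigma(n):
    # Sum of all divisors of n, including n itself (naive trial division).
    return sum(d for d in range(1, n + 1) if n % d == 0)

for n in (36, 378, 49600):
    print(n, sigma(n), round(n * 2.54))
```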
I have a spherical pumpkin. I carefully calculate its volume in cubic inches, as well as its surface area in square inches.
But when I came back to my calculations, I saw that my units — the square inches and the cubic inches — had mysteriously disappeared from my calculations. But it didn’t matter, because both numerical values were the same!
What is the radius of my spherical pumpkin?
Extra credit: Let’s dispense with 3D thinking. Instead, suppose I have an -hyperspherical pumpkin. Once again, I calculate its volume (with units ) and surface area (with units ). Miraculously, the numerical values are once again the same! What is the radius of my -hyperspherical pumpkin?
Let $r$ be the radius of the spherical pumpkin. Setting the numerical values of the volume and surface area equal, we have
$$\frac{4}{3}\pi r^3 = 4\pi r^2 \quad\Longrightarrow\quad r = 3.$$
The relation between the volume and the surface area of an $n$-ball is that the surface area is the derivative of the volume with respect to the radius:
$$S_n(r) = \frac{d}{dr} V_n(r) = \frac{n}{r}\, V_n(r).$$
Setting $V_n(r) = S_n(r)$ gives $r = n$. If $n = 3$, we recover $r = 3$.
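A quick numerical check that the volume and surface-area values coincide at $r = n$ (a sketch; uses the standard closed form $V_n(r) = \pi^{n/2} r^n / \Gamma(n/2 + 1)$):

```python
from math import pi, gamma, isclose

def ball_volume(n, r):
    # Volume of an n-ball of radius r.
    return pi ** (n / 2) / gamma(n / 2 + 1) * r ** n

def ball_surface(n, r):
    # Surface area is the r-derivative of the volume: (n / r) * V_n(r).
    return n / r * ball_volume(n, r)

for n in range(2, 8):
    print(n, isclose(ball_volume(n, n), ball_surface(n, n)))
```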
Congratulations, you’ve made it to the fifth round of The Squiddler — a competition that takes place on a remote island. In this round, you are one of the 16 remaining competitors who must cross a bridge made up of 18 pairs of separated glass squares. Here is what the bridge looks like from above:
To cross the bridge, you must jump from one pair of squares to the next. However, you must choose one of the two squares in a pair to land on. Within each pair, one square is made of tempered glass, while the other is made of normal glass. If you jump onto tempered glass, all is well, and you can continue on to the next pair of squares. But if you jump onto normal glass, it will break, and you will be eliminated from the competition.
You and your competitors have no knowledge of which square within each pair is made of tempered glass. The only way to figure it out is to take a leap of faith and jump onto a square. Once a pair is revealed — either when someone lands on a tempered square or a normal square — all remaining competitors take notice and will choose the tempered glass when they arrive at that pair.
On average, how many of the competitors will survive and make it to the next round of the competition?
Let $S(n, m)$ be the expected number of survivors when there are $n$ competitors and $m$ pairs of glass squares. Conditioning on the next unrevealed jump (a fair coin flip between tempered and normal glass), we have the recurrence relation
$$S(n, m) = \frac{1}{2}S(n, m-1) + \frac{1}{2}S(n-1, m-1),$$
with $S(n, 0) = n$ and $S(0, m) = 0$.
The Python code to compute $S(n, m)$ is given below:
def S(n , m):
if n == 0:
return 0
if m == 0:
return n
return 0.5*S(n, m-1) + 0.5*S(n-1, m-1)
On average, if there are 16 competitors and 18 pairs of glass squares, we will have $S(16, 18) \approx 7$ survivors.
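Evaluating the recursion for these numbers (a sketch; I add memoization so larger inputs stay fast, though this input is fine without it):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def S(n, m):
    # Expected number of survivors: n competitors, m unrevealed pairs.
    if n == 0:
        return 0.0
    if m == 0:
        return float(n)
    # Each unrevealed pair is a fair coin flip: either the jumper guesses
    # right (no loss) or one competitor is eliminated; the pair is then known.
    return 0.5 * S(n, m - 1) + 0.5 * S(n - 1, m - 1)

print(S(16, 18))  # just over 7
```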
Duke Leto Atreides knows for a fact that there are not one, but two traitors within his royal household. The suspects are Lady Jessica, Dr. Wellington Yueh, Gurney Halleck and Duncan Idaho. Leto’s advisor, Thufir Hawat, will assist him in questioning the four suspects. Anyone who is a traitor will tell a lie, while anyone who is not a traitor will tell the truth.
Upon interrogation, Jessica says that she is not the traitor, while Wellington similarly says that he is not the traitor. Gurney says that Jessica or Wellington is a traitor. Finally, Duncan says that Jessica or Gurney is a traitor. (Thufir, being the logician that he is, notes that when someone says thing A is true or thing B is true, both A and B can technically be true.)
After playing back the interrogations in his mind, Thufir is ready to report the name of one of the traitors to the duke. Whose name does he report?
Let $j$, $w$, $g$, $d$ be the boolean variables indicating whether Jessica, Wellington, Gurney and Duncan, respectively, are traitors. A traitor’s statement must be false, and a non-traitor’s statement must be true; note that Jessica’s and Wellington’s denials are consistent either way, so they impose no constraint. The remaining statements, together with the fact that there are two traitors, reduce to the following set of logical propositions: $g \Rightarrow \neg(j \lor w)$, $\neg g \Rightarrow (j \lor w)$, $d \Rightarrow \neg(j \lor g)$, $\neg d \Rightarrow (j \lor g)$, and at least two of $j$, $w$, $g$, $d$ are true.
Here is the Python code using Z3 to check whether the propositions above can be satisfied:
from z3 import *
j = Bool('j')
w = Bool('w')
g = Bool('g')
d = Bool('d')
s = Solver()
s.add(Implies(g,Not(Or(j,w))))
s.add(Implies(Not(g),Or(j,w)))
s.add(Implies(d,Not(Or(j,g))))
s.add(Implies(Not(d),Or(j,g)))
s.add(Or([And(j,w), And(j,g), And(j,d), And(w,g), And(w,d), And(g,d)]))
while s.check() == sat:
m = s.model()
print(m)
s.add(Or(j != m[j], g != m[g], w != m[w], d != m[d]))
We see that there are two cases where the propositions above are satisfied:
[g = False, j = True, w = True, d = False]
[w = True, g = False, j = False, d = True]
In both cases, Wellington is one of the traitors, which means that Thufir reports Wellington’s name to the duke.
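The Z3 result can be double-checked with a plain brute-force enumeration of all 16 truth assignments (a sketch mirroring the same constraints; no solver required):

```python
from itertools import product

def find_models():
    # Traitors lie and non-traitors tell the truth. Jessica's and Wellington's
    # denials are consistent either way, so only Gurney's and Duncan's
    # statements, plus "at least two traitors", constrain the assignment.
    models = []
    for j, w, g, d in product([False, True], repeat=4):
        gurney_ok = (j or w) == (not g)  # "Jessica or Wellington is a traitor"
        duncan_ok = (j or g) == (not d)  # "Jessica or Gurney is a traitor"
        if gurney_ok and duncan_ok and sum([j, w, g, d]) >= 2:
            models.append((j, w, g, d))
    return models

print(find_models())  # Wellington is a traitor in every model
```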
Suppose you have an equilateral triangle. You pick three random points, one along each of its three edges, uniformly along the length of each edge — that is, each point along each edge has the same probability of being selected.
With those three randomly selected points, you can form a new triangle inside the original one. What is the probability that the center of the larger triangle also lies inside the smaller one?
The logic for determining if a point is inside or outside a given triangle is described here.
The Python code for implementing the above logic is given below:
import numpy as np
from math import sqrt
from random import random
triangle = [np.array([-.5,-sqrt(3)/6]),
np.array([.5,-sqrt(3)/6]),
np.array([0, sqrt(3)/3])]
center = np.array([0,0])
def exp_percent_center_inside(triangle, center, runs = 100000):
    det = lambda p, q: p[0]*q[1] - p[1]*q[0]  # 2-D cross product (np.cross on 2-D vectors is deprecated)
c=0
for _ in range(runs):
[A, B, C], v = triangle, center
D, E, F = A + random()*(B - A), B + random()*(C - B), C + random()*(A - C)
v0, v1, v2 = D, E - D, F - D
a = (det(v, v2) - det(v0, v2))/det(v1, v2)
b = -((det(v, v1) - det(v0, v1))/det(v1, v2))
if a > 0 and b > 0 and a + b < 1:
c += 1
return c/runs
print(exp_percent_center_inside(triangle, center))
From the simulation, we can estimate the probability that the center of the equilateral triangle lies inside the smaller random triangle.
While watching batter spread out on my waffle iron, and thinking back to a recent conversation I had with Friend-of-The-Riddler™ Benjamin Dickman, I noticed the following sequence of numbers:
Before you ask — yes, you can find this sequence on the On-Line Encyclopedia of Integer Sequences. However, for the full Riddler experience, I urge you to not look it up. See if you can find the next few terms in the sequence, as well as the pattern.
Now, for the actual riddle: Once you’ve determined the pattern, can you figure out the average value of the entire sequence?
Let the sequence be denoted by $r(n)$. You can find more about the sequence here 😊. Then $r(n)$ represents the number of ways an integer $n$ can be expressed as a sum of two squares (positive, negative, or zero). That is, $r(n)$ denotes the number of solutions in integers to the equation $x^2 + y^2 = n$. For example, $r(5) = 8$, since the solutions to $x^2 + y^2 = 5$ are $(\pm 1, \pm 2)$ and $(\pm 2, \pm 1)$. Because $r(n) = 0$ whenever $n \equiv 3 \pmod 4$, $r$ is a very erratic function.

Thankfully, the problem is about the average value of $r(n)$ as $n \to \infty$. If we define $T(n)$ to be the number of solutions in integers to $x^2 + y^2 \le n$, then the average of $r(i)$ for $0 \le i \le n$ is $T(n)/(n+1)$.
The code (using brute force) for calculating $T(n)$ and $T(n)/(n+1)$ is given below:
from math import sqrt, ceil
def t(n):
t, l = 0, ceil(sqrt(n))+1
for i in range(0, n+1):
for j in range(-l, l):
for k in range(-l, l):
if j**2 + k**2 == i:
t += 1
return t, t/(n+1)
print(list(map(t, [1,2,3,4,5,10,20,50,100])))
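For larger $n$, the brute force above is slow; Jacobi’s two-square theorem gives $r(n) = 4\,(d_1(n) - d_3(n))$, where $d_1$ and $d_3$ count the divisors of $n$ congruent to 1 and 3 mod 4 (a sketch; by convention $r(0) = 1$):

```python
def r(n):
    # Number of ways to write n as x^2 + y^2 over the integers.
    if n == 0:
        return 1
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return 4 * (d1 - d3)

print(sum(r(i) for i in range(0, 101)))  # T(100) = 317
```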
Here is a table of $T(n)$ and $T(n)/(n+1)$ for a few values of $n$:

| $n$ | 1 | 2 | 3 | 4 | 5 | 10 | 20 | 50 | 100 |
|---|---|---|---|---|---|---|---|---|---|
| $T(n)$ | 5 | 9 | 9 | 13 | 21 | 37 | 69 | 161 | 317 |
| $T(n)/(n+1)$ | 2.5 | 3 | 2.25 | 2.6 | 3.5 | 3.36 | 3.29 | 3.16 | 3.14 |
The averages tend to $\pi$:
$$\lim_{n \to \infty} \frac{T(n)}{n+1} = \pi.$$
The proof from the reference below is based on a geometric interpretation of $T(n)$, and is due to Carl Friedrich Gauss: $T(n)$ is the number of points with integer coordinates in or on a circle of radius $\sqrt{n}$. For example, $T(5) = 21$, since the circle centered at the origin of radius $\sqrt{5}$ contains 21 lattice points, as illustrated in the figure below:
If we draw a unit square (area 1) centered at each of the lattice points, then the total area (in grey) of the squares is also $T(n)$. Thus we would expect the area of the squares to be approximately the area of the circle, that is, $T(n)$ to be approximately $\pi n$. If we expand the circle of radius $\sqrt{n}$ by half the length of the diagonal of a unit square, i.e. by $\sqrt{2}/2$, then the expanded circle contains all the squares. If we contract the circle by the same amount, then the contracted circle is contained in the union of all the squares, as seen in the figure below:
Thus,
$$\pi\left(\sqrt{n} - \frac{\sqrt{2}}{2}\right)^2 \le T(n) \le \pi\left(\sqrt{n} + \frac{\sqrt{2}}{2}\right)^2.$$
Dividing each term by $n$ and applying the squeeze theorem for limits yields the desired result.
The finals of the sport climbing competition has eight climbers, each of whom compete in three different events: speed climbing, bouldering and lead climbing. Based on their time and performance, each of the eight climbers is given a ranking (first through eighth, with no ties allowed) in each event, as well as a corresponding score (1 through 8, respectively).
The three scores each climber earns are then multiplied together to give a final score. For example, a climber who placed second in speed climbing, fifth in bouldering and sixth in lead climbing would receive a score of 2 × 5 × 6, or 60, points. The gold medalist is whoever achieves the lowest final score among the eight finalists.
What is the highest (i.e., worst) score one could achieve in this event and still have a chance of winning (or at least tying for first place overall)?
From the simulation below, we can estimate the worst score one could achieve and still have a chance of winning; note that random sampling of rankings only gives a lower bound on the true maximum.
from random import shuffle
from operator import mul
from functools import reduce
def max_min_score(np, ne, runs = 1000000):
max_min_score = 0
for _ in range(runs):
ranks = [list(range(1,np+1)) for i in range(ne)]
for i in range(ne):
shuffle(ranks[i])
scores = [reduce(mul, [ranks[j][i] for j in range(ne)]) for i in range(np)]
min_score = min(scores)
if min_score > max_min_score:
max_min_score = min_score
return max_min_score
print(max_min_score(8,3))
I recently came across a rather peculiar recipe for something called Babylonian radish pie. Intrigued, I began to follow the directions, which said I could start with any number of cups of flour.
Any number? I mean, I had to start with some flour, so zero cups wasn’t an option. But according to the recipe, any positive value was fair game. Next, I needed a second amount of flour that was 3 divided by my original number. For example, if I had started with two cups of flour, then the recipe told me I now needed 3 divided by 2, or 1.5, cups at this point.
I was then instructed to combine these amounts of flour and discard half. Apparently, this was my new starting amount of flour. I was to repeat the process, combining this amount with 3 divided by it and then discarding half.
The recipe told me to keep doing this, over and over. Eventually, I’d have the proper number of cups of flour for my radish pie.
How many cups of flour does the recipe ultimately call for?
In the limit after convergence, let $x$ be the number of cups of flour required as per the recipe. We have
$$x = \frac{1}{2}\left(x + \frac{3}{x}\right) \quad\Longrightarrow\quad x^2 = 3 \quad\Longrightarrow\quad x = \sqrt{3}.$$
This involves recognizing that the recipe is in fact a classic algorithm: the Babylonian (Heron’s) method, equivalently Newton’s method, for computing the square root of 3, i.e. for solving $x^2 - 3 = 0$. The algorithm involves the following steps:
Start with some guess $x_0 > 0$.
Compute the sequence of improved guesses $x_{k+1} = \frac{1}{2}\left(x_k + \frac{3}{x_k}\right)$.
In light of the above, it is easy to see that the recipe ultimately calls for $\sqrt{3}$ cups of flour.
From the code below, we see that the recipe ultimately calls for approximately 1.732 cups.
def num_cups():
s, c = 2, 0
while True:
c = 0.5*(s + 3/s)
if abs(s-c) < 0.000001:
break
s = c
return c
print(num_cups())
Lately, Rushabh has been thinking about very large regular polygons — that is, polygons all of whose sides and angles are congruent. His latest construction is a particular regular 1,000-gon, which has sides of length 2. Rushabh picks one of its longest diagonals, which connects two opposite vertices.
Now, this 1,000-gon has many diagonals, but only some are perpendicular to that first diagonal Rushabh picked. If you were to slice the polygon along all these perpendicular diagonals, you’d break the first diagonal into 500 distinct pieces. Rushabh is curious — what is the product of the lengths of all these pieces?
Extra credit: Now suppose you have a regular 1,001-gon, each of whose sides has length 2. You pick a vertex and draw an altitude to the opposite side of the polygon. Again, you slice the polygon along all the perpendicular diagonals, breaking the altitude into 500 distinct pieces. What’s the product of the lengths of all these pieces this time?
The segments of the problem are the vertical components of the polygon’s sides on its right half, as shown in the diagram below; these lengths are $s\sin\frac{(2k+1)\pi}{n}$, with $k = 0, 1, \dots, \frac{n}{2}-1$, where $n$ is even (here we make use of the fact that the polygon’s side rotates through $\frac{2\pi}{n}$ as it moves to the next side). In our problem $n = 1000$ and $s = 2$.
The product of the lengths of all these segments is given by
$$P = s^{n/2}\prod_{k=0}^{n/2-1} \sin\frac{(2k+1)\pi}{n}.$$
Let $p(x) = \frac{x^n - 1}{x - 1} = x^{n-1} + \cdots + x + 1$. The roots of this polynomial are the non-trivial $n$-th roots of unity $\omega^k = e^{2\pi i k/n}$, $k = 1, \dots, n-1$, so
$$p(x) = \prod_{k=1}^{n-1}\left(x - \omega^k\right).$$
Plugging in $x = 1$ yields $\prod_{k=1}^{n-1}\left(1 - \omega^k\right) = n$.
Using the above result and $\left|1 - \omega^k\right| = 2\sin\frac{k\pi}{n}$, we have $\prod_{k=1}^{n-1} 2\sin\frac{k\pi}{n} = n$. The even-indexed terms ($k = 2j$) reproduce the same product for an $\frac{n}{2}$-gon, $\prod_{j=1}^{n/2-1} 2\sin\frac{j\pi}{n/2} = \frac{n}{2}$, so the odd-indexed terms multiply to
$$\prod_{k=0}^{n/2-1} 2\sin\frac{(2k+1)\pi}{n} = \frac{n}{n/2} = 2.$$
Therefore, when $n = 1000$ and $s = 2$, the required product has the value $P = 2\left(\frac{s}{2}\right)^{n/2} = 2$.
For the case when $n$ is odd, we need to calculate the product
$$P = s^{(n-1)/2}\prod_{k=0}^{(n-1)/2-1} \sin\frac{(2k+1)\pi}{n}.$$
We make use of the result below; you can find the proof here. For odd $n$,
$$\prod_{k=1}^{(n-1)/2} 2\sin\frac{k\pi}{n} = \sqrt{n}.$$
Since $\sin\frac{k\pi}{n} = \sin\frac{(n-k)\pi}{n}$ pairs each odd $k$ with an even $n - k$, the product over odd indices has the same value, and we have
$$\prod_{k=0}^{(n-1)/2-1} 2\sin\frac{(2k+1)\pi}{n} = \sqrt{n}.$$
When $n = 1001$ and $s = 2$, the required product has the value $\left(\frac{s}{2}\right)^{(n-1)/2}\sqrt{n} = \sqrt{1001} \approx 31.64$.
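Both products can be checked numerically (a sketch; floating point, so compare within a tolerance):

```python
from math import sin, pi, prod, sqrt, isclose

def piece_product(n, s):
    # Product of the n // 2 piece lengths s * sin((2k + 1) * pi / n).
    return prod(s * sin((2 * k + 1) * pi / n) for k in range(n // 2))

print(piece_product(1000, 2))  # ~2
print(piece_product(1001, 2))  # ~sqrt(1001) ~ 31.64
```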
Earlier this year, Dakota Jones used a crystal key to gain access to a hidden temple, deep in the Riddlerian Jungle. According to an ancient text, the crystal had exactly six edges, five of which were 1 inch long. Also, the key was the largest such polyhedron (by volume) with these edge lengths.
However, after consulting an expert, Jones realized she had the wrong translation. Instead of definitively having five edges that were 1 inch long, the crystal only needed to have four edges that were 1 inch long. In other words, a fifth edge could have been 1 inch (or all six, for that matter), but the crystal definitely had at least four edges that were 1 inch long.
The translator confirmed that the key was indeed the largest such polyhedron (by volume) with these edge lengths.
Once again, Jones needs your help. Now what is the volume of the crystal key?
Given the distances between the vertices of a tetrahedron, the volume can be computed using the Cayley–Menger determinant:
$$288\,V^2 = \begin{vmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & d_{12}^2 & d_{13}^2 & d_{14}^2 \\ 1 & d_{12}^2 & 0 & d_{23}^2 & d_{24}^2 \\ 1 & d_{13}^2 & d_{23}^2 & 0 & d_{34}^2 \\ 1 & d_{14}^2 & d_{24}^2 & d_{34}^2 & 0 \end{vmatrix}$$
where the subscripts represent the vertices and $d_{ij}$ is the pairwise distance between them – i.e., the length of the edge connecting the two vertices.
If $a$, $b$, $c$ are the three edges that meet at a point, and $A$, $B$, $C$ the respectively opposite edges, the volume of the tetrahedron is given by
$$V = \frac{\sqrt{4a^2b^2c^2 - a^2X^2 - b^2Y^2 - c^2Z^2 + XYZ}}{12}$$
where $X = b^2 + c^2 - A^2$, $Y = a^2 + c^2 - B^2$ and $Z = a^2 + b^2 - C^2$.
In our case, take vertices $A$, $B$, $C$, $P$ with the four unit edges $AB = AC = AP = BC = 1$, and let $BP = x$, $CP = y$. Substituting into the formula above, we have
$$V(x, y) = \frac{\sqrt{3 - (2 - x^2)^2 - (2 - y^2)^2 + (2 - x^2)(2 - y^2)}}{12}.$$
We need to find the values of $x$ and $y$ that maximize $V(x, y)$. Setting the partial derivatives $\partial V/\partial x$ and $\partial V/\partial y$ to zero, we get the equations $2(2 - x^2) = 2 - y^2$ and $2(2 - y^2) = 2 - x^2$, whose only solution with positive edge lengths is $x = y = \sqrt{2}$. Therefore, $V$ attains the maximum value of $\frac{\sqrt{3}}{12} \approx 0.144$ cubic inches when $x = y = \sqrt{2}$.
https://en.wikipedia.org/wiki/Tetrahedron
A polyhedron with six edges has to be a tetrahedron. In this particular case, we have a tetrahedron in which one face is an equilateral triangle of side length 1. The volume of such a tetrahedron (a triangular pyramid whose base is an equilateral triangle of side 1) with height $h$ is given by
$$V = \frac{1}{3} \cdot \frac{\sqrt{3}}{4} \cdot h = \frac{\sqrt{3}}{12}\,h.$$
The volume is maximized when the height is maximized, i.e., when another edge of length 1 is perpendicular to the base, giving $h = 1$. Therefore the volume of the crystal key is $\frac{\sqrt{3}}{12} \approx 0.144$ cubic inches.
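As a numerical cross-check of the two arguments, we can evaluate the Cayley–Menger determinant at the configuration $AB = AC = AP = BC = 1$, $BP = CP = \sqrt{2}$ (a sketch; numpy is used only for the determinant):

```python
import numpy as np
from math import sqrt, isclose

def tetra_volume(d2):
    # d2[i][j]: squared distance between vertices i and j (4 vertices).
    M = np.ones((5, 5))
    M[0, 0] = 0.0
    for i in range(4):
        for j in range(4):
            M[i + 1, j + 1] = d2[i][j]
    # Cayley-Menger: 288 V^2 equals the determinant of the bordered matrix.
    return sqrt(np.linalg.det(M) / 288.0)

# Vertices A, B, C, P: AB = AC = AP = BC = 1, BP^2 = CP^2 = 2.
d2 = [[0, 1, 1, 1],
      [1, 0, 1, 2],
      [1, 1, 0, 2],
      [1, 2, 2, 0]]
print(tetra_volume(d2), sqrt(3) / 12)
```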
One morning, Phil was playing with my daughter, who loves to cut paper with her safety scissors. She especially likes cutting paper into “strips,” which are rectangular pieces of paper whose shorter sides are at most 1 inch long.
Whenever Phil gives her a piece of standard printer paper (8.5 inches by 11 inches), she picks one of the four sides at random and then cuts a 1-inch wide strip parallel to that side. Next, she discards the strip and repeats the process, picking another side at random and cutting another strip. Eventually, she is left with nothing but strips.
On average, how many cuts will she make before she is left only with strips?
Extra credit: Instead of 8.5 by 11-inch paper, what if the paper measures $m$ by $n$ inches? (And for a special case of this, what if the paper is square?)
From the simulation below, we see that the expected number of cuts before we are left only with strips is approximately 14.29 (in the code, the 8.5-inch side is modeled as 9, since it becomes a strip after eight 1-inch cuts).
from random import random
def avg_num_cuts_mc(m, n, runs= 1000000):
sum_num_cuts = 0
for _ in range(runs):
num_cuts, cm, cn = 0, m, n
while cm > 1 and cn > 1:
r = random()
if r < 0.5:
cm -= 1
else:
cn -= 1
num_cuts += 1
sum_num_cuts += num_cuts
return sum_num_cuts/runs
print(f"Expected number of cuts is {avg_num_cuts_mc(11, 9)}")
We have the following recurrence relation for the expected number of cuts for an $m \times n$ sheet (dimensions measured in strips):
$$C(m, n) = 1 + \frac{1}{2}\left(C(m-1, n) + C(m, n-1)\right)$$
with initial conditions $C(1, n) = C(m, 1) = 0$ (a sheet that is one strip wide needs no more cuts).
Solving the recurrence relation using the code below, we see that the expected number of cuts is indeed approximately 14.29.
def avg_num_cuts_rr(m, n):
C = [[0 for _ in range(n)] for _ in range(m)]
for i in range(1, m):
for j in range(1, n):
C[i][j] = 1 + 0.5*(C[i][j-1] + C[i-1][j])
return C[m-1][n-1]
print(f"Expected number of cuts is {avg_num_cuts_rr(11, 9)}")
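The same dynamic program with exact rational arithmetic pins down the value (a sketch; the function name is mine):

```python
from fractions import Fraction

def exact_avg_cuts(m, n):
    # C[i][j]: expected cuts for a sheet i+1 strips by j+1 strips;
    # a sheet one strip wide (i == 0 or j == 0) needs no cuts.
    C = [[Fraction(0)] * n for _ in range(m)]
    for i in range(1, m):
        for j in range(1, n):
            C[i][j] = 1 + Fraction(1, 2) * (C[i][j - 1] + C[i - 1][j])
    return C[m - 1][n - 1]

v = exact_avg_cuts(11, 9)
print(v, float(v))  # ~14.29
```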
A camel is loaded with straws until its back breaks. Each straw has a weight uniformly distributed between 0 and 1, independent of the other straws. The camel’s back breaks as soon as the total weight of all the straws exceeds 1.
Find the expected number of straws that break the camel’s back.
Let X be the random variable representing the weight of each straw and let f(x) be the expected number of straws that need to be drawn for the total weight to exceed x, for 0 ≤ x ≤ 1.
We have the following integral equation: f(x) = 1 + ∫ from 0 to x of f(u) du.
Differentiating on both sides, we get the differential equation f′(x) = f(x) with the boundary condition f(0) = 1.
The solution to the above differential equation is f(x) = e^x and the expected number of straws is f(1) = e ≈ 2.7183.
Let S_n be the random variable representing the sum of n uniform random variables (i.e. the sum of the weights of n straws in this case). S_n follows the Irwin–Hall distribution. For 0 ≤ x ≤ 1, the PDF of S_n is given by x^(n−1)/(n−1)!.
Let N be the random variable for the number of straws required such that S_N > 1. We see that P(N > n) = P(S_n ≤ 1) = 1/n!.
The expected number of straws satisfying the condition is given by E[N] = Σ over n ≥ 0 of P(N > n) = Σ over n ≥ 0 of 1/n! = e.
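As a quick numerical check: since P(N > n) = 1/n!, the expected number of straws is the series Σ over n ≥ 0 of 1/n!, which can be summed directly (truncated at 30 terms, well past double precision):

```python
from math import factorial

# E[N] = sum over n >= 0 of P(N > n) = sum over n >= 0 of 1/n!, which is e
expected_straws = sum(1 / factorial(n) for n in range(30))
print(expected_straws)  # ~2.718281828
```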
From the simulation below, we see that the expected number of straws is indeed ≈ 2.7183.
from random import random

runs, sum_num_straws = 1000000, 0
for _ in range(runs):
    num_straws, weight = 0, 0
    while weight < 1:
        weight += random()
        num_straws += 1
    sum_num_straws += num_straws
print(sum_num_straws/runs)
Find the probability that the weight of the last straw is less than or equal to some value x.
Let W be the random variable representing the weight of the last straw and let U be the total weight just before the last straw, whose density on [0, 1) is e^u (the sum of the Irwin–Hall densities u^(n−1)/(n−1)! over n ≥ 1). Given U = u, the last straw is a uniform weight conditioned to exceed 1 − u, so we have the conditional distribution P(W ≤ x | U = u) = (x − (1 − u))/u for 1 − u ≤ x ≤ 1.
Therefore the required probability distribution for W is P(W ≤ x) = e·x + e^(1−x) − e for 0 ≤ x ≤ 1.
Find the expected weight of the last straw that breaks the camel’s back.
From the probability distribution calculated above, we get the probability density function of W, which is f(x) = e − e^(1−x) for 0 ≤ x ≤ 1.
Therefore, the expected weight of the last straw is given by E[W] = ∫ from 0 to 1 of x·(e − e^(1−x)) dx = 2 − e/2 ≈ 0.6409.
Let X be the random variable representing the weight of each straw and let g(x) be the expected weight of the last straw that needs to be drawn for the total weight to exceed x, for 0 ≤ x ≤ 1. We have the following integral equation: g(x) = ∫ from x to 1 of w dw + ∫ from 0 to x of g(u) du = (1 − x²)/2 + ∫ from 0 to x of g(u) du.
Differentiating both sides, we get the differential equation g′(x) = g(x) − x with the boundary condition g(0) = 1/2.
The solution to the above differential equation is g(x) = x + 1 − e^x/2 and the expected weight of the last straw is g(1) = 2 − e/2 ≈ 0.6409.
From the simulation below, we see that the expected weight of the last straw is indeed ≈ 0.6409.
from random import random

runs, sum_last_straw_weight = 1000000, 0
for _ in range(runs):
    weight = 0
    while weight < 1:
        straw = random()
        weight += straw
    sum_last_straw_weight += straw
print(sum_last_straw_weight/runs)
Jandhyala, Vamshi. 2021. “On Camels and Straws,” August 27, 2021. URL
Hames Jarrison has just intercepted a pass at one end zone of a football field, and begins running — at a constant speed of miles per hour — to the other end zone, yards away.
At the moment he catches the ball, you are on the very same goal line, but on the other end of the field, yards away from Jarrison. Caught up in the moment, you decide you will always run directly toward Jarrison’s current position, rather than plan ahead to meet him downfield along a more strategic course.
Assuming you run at a constant speed (i.e., don’t worry about any transient acceleration), how fast must you be in order to catch Jarrison before he scores a touchdown?
Let the chaser and the runner be at their respective starting positions the instant the pursuit begins, with the runner running at constant speed along a straight line. The chaser runs at a constant speed along a curved path such that he is always moving directly toward the runner, that is, the velocity vector of the chaser points directly at the runner at every instant of time.
To find the curve of pursuit of the chaser, we assume that the chaser is at the location at time . At time , the runner is at the point and so, the slope of the tangent line to the pursuit curve (the value of at ) is given by
We also know that the chaser would have run a distance along it by the time . This arc-length is also given by the expression on the right below:
Eliminating from the above two equations, we get
Differentiating under the integral sign with respect to , we arrive at
Integrating the above equation, we get
At the start of the pursuit, the runner as well as the chaser are on the goal line, which fixes the constant of integration, and so
From the above, we get
Integrating the above equation, we get
Since when , pursuit curve equation is given by
In the given problem, we have miles/hr, and when . Substituting these values in the above equation, we get
Solving the above quadratic and taking the positive value for , we see that the speed of the chaser should be
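As a sanity check on the pursuit analysis, a direct simulation of the chase can be compared against the classical pure-pursuit capture distance u·v·d/(u² − v²), where v is the runner's speed, u is the chaser's speed, and d is their initial separation along the goal line. The numbers below are illustrative values, not the puzzle's:

```python
import math

def pursuit_catch_distance(v, u, d, dt=1e-4, tol=1e-2):
    # Runner starts at the origin moving along +x at speed v.
    # Chaser starts at (0, d) and always heads toward the runner at speed u.
    rx, cx, cy = 0.0, 0.0, d
    while math.hypot(rx - cx, cy) > tol:
        dx, dy = rx - cx, -cy
        dist = math.hypot(dx, dy)
        cx += u * dt * dx / dist
        cy += u * dt * dy / dist
        rx += v * dt
    return rx  # distance the runner has covered when caught

# Illustrative values: runner speed v, chaser speed u > v, separation d.
v, u, d = 15.0, 20.0, 50.0
predicted = u * v * d / (u**2 - v**2)  # classical capture distance
simulated = pursuit_catch_distance(v, u, d)
print(simulated, predicted)
```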
Help, there’s a cricket on my floor! I want to trap it with a cup so that I can safely move it outside. But every time I get close, it hops exactly 1 foot in a random direction.
I take note of its starting position and come closer. Boom — it hops in a random direction. I get close again. Boom — it takes another hop in a random direction, independent of the direction of the first hop.
What is the most probable distance between the cricket’s current position after two random jumps and its starting position? (Note: This puzzle is not asking for the expected distance, but rather the most probable distance. In other words, if you consider the probability distribution over all possible distances, where is the peak of this distribution?)
Let θ be the angle between the cricket’s jumps. Using the cosine law, we see that the distance from the starting point to the end point of the second hop is d(θ) = √(2 − 2cos θ). The least distance from the starting point is 0 and the maximum distance is 2. We can assume that θ is uniformly distributed on (0, π).
For any real d ∈ [0, 2], we have the CDF of the distance D, P(D ≤ d) = (2/π)·arcsin(d/2).
Taking the derivative of the CDF, the PDF of D is f(d) = 2/(π·√(4 − d²)) on (0, 2).
The PDF approaches infinity as d → 2, therefore the most probable distance is 2 feet.
From the histogram below, we see that the most probable distance is indeed 2 feet.
import numpy as np
from math import pi, sqrt, sin, cos
import matplotlib.pyplot as plt

def distances(num_samples=1000000):
    def dist(row):
        return sqrt((cos(row[0])-cos(row[1]))**2 + (sin(row[0])-sin(row[1]))**2)
    angles = np.random.uniform(0, 2*pi, (num_samples, 2))
    return np.apply_along_axis(dist, 1, angles)

def plot_dist_hist(distances):
    plt.hist(distances, bins='auto')
    plt.xlabel('Distance')
    plt.ylabel('Frequency')
    plt.title('Distance from the starting point')
    plt.show()

plot_dist_hist(distances())
When you roll a pair of fair dice, the most likely outcome is 7 (which occurs 1/6 of the time) and the least likely outcomes are 2 and 12 (which each occur 1/36 of the time).
Annoyed by the variance of these probabilities, I set out to create a pair of “uniform dice.” These dice still have six sides that are uniquely numbered from 1 to 6, and they are identical to each other. However, they are weighted so that their sum is more uniformly distributed between 2 and 12 than that of fair dice.
Unfortunately, it is impossible to create a pair of such dice so that the probabilities of all sums from 2 to 12 are identical (i.e., they are all 1/11). But I bet we can get pretty close.
The variance of the probabilities is the average value of the squared difference between each probability and the average probability (which is, again, 1/11). One way to make my dice as uniform as possible is to minimize this variance.
So how should I make my dice as uniform as possible? In other words, which specific weighting of the dice minimizes the variance among the probabilities? That is, what should the probabilities be for rolling 1, 2, 3, 4, 5 or 6 with one of the dice?
Let p_1, …, p_6 be the probabilities for rolling 1, 2, 3, 4, 5 or 6 with one of the dice. We have p_1 + p_2 + … + p_6 = 1 and p_i ≥ 0 for i = 1, …, 6. To calculate the variance among the probabilities, we first need to calculate P(sum = s) for s = 2, …, 12. The probabilities are fairly straightforward to calculate, e.g. P(sum = 2) = p_1² and P(sum = 3) = 2·p_1·p_2.
The variance is then given by
From the code below, we see the minimum value of the variance and the corresponding probabilities for the (identical) dice.
from scipy.optimize import minimize

def variance(probs):
    prob_sums = {}
    for d1, d2 in [(i, j) for i in range(6) for j in range(6)]:
        prob_sums[d1+d2] = prob_sums.get(d1+d2, 0) + probs[d1]*probs[6+d2]
    return sum([(p - 1/11)**2 for p in prob_sums.values()])/11

minimize(variance, [0.5]*12,
         constraints=[{'type': 'eq', 'fun': lambda probs: sum(probs[:6])-1},
                      {'type': 'eq', 'fun': lambda probs: sum(probs[6:])-1}])
If we allow the two dice to have different weightings, we can do much better. From the code below, we see the minimum value of the variance and the optimal probabilities for each of the two dice.
minimize(variance, [0.5]*12, bounds=[(0, 1)]*12,
         constraints=[{'type': 'eq', 'fun': lambda probs: sum(probs[:6])-1},
                      {'type': 'eq', 'fun': lambda probs: sum(probs[6:])-1},
                      {'type': 'eq', 'fun': lambda probs: probs[0]-0.5}])
You are very clever when it comes to solving Riddler Express puzzles. You are so clever, in fact, that you are in the top 10 percent of solvers in Riddler Nation (which, as you know, has a very large population). You don’t know where in the top 10 percent you are — in fact, you realize that you are equally likely to be anywhere in the topmost decile. Also, no two people in Riddler Nation are equally clever.
One Friday morning, you walk into a room with nine members randomly selected from Riddler Nation. What is the probability that you are the cleverest solver in the room?
Without loss of generality, we can assume that the normalized IQ of the population follows the Uniform distribution U(0, 1). Let X_1, …, X_n be the random variables representing the IQs of the n other individuals in the room apart from you, where X_i ~ U(0, 1) for i = 1, …, n. Let Y be the random variable representing your IQ, where Y ~ U(0.9, 1). We need the probability P(M < Y), where M = max(X_1, …, X_n). From Order Statistics of the Uniform distribution, we know that P(M ≤ y) = y^n. Therefore the required probability is given by
P(M < Y) = ∫ from 0.9 to 1 of (y^n / 0.1) dy = (10/(n + 1))·(1 − 0.9^(n+1)).
When n = 9, the required probability is 1 − 0.9^10 ≈ 0.6513.
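Checking the arithmetic for n = 9 (nine other people in the room, matching the simulation below) takes only a couple of lines:

```python
# P(cleverest) = integral over [0.9, 1] of y**n / 0.1 dy, with n = 9 others
n = 9
p = (10 / (n + 1)) * (1 - 0.9 ** (n + 1))
print(p)  # ~0.6513
```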
From the simulation below, we can see that the required probability is indeed ≈ 0.6513.
import numpy as np

def prob_you_cleverest(n, start_iq=0.9, end_iq=1.0, runs=5000000):
    others_iqs = np.random.rand(runs, n)
    your_iqs = np.random.uniform(start_iq, end_iq, runs)
    return np.mean(np.where(np.max(others_iqs, axis=1) < your_iqs, [1], [0]))

print(prob_you_cleverest(9))
You have four standard dice, and your goal is simple: Maximize the sum of your rolls. So you roll all four dice at once, hoping to achieve a high score.
But wait, there’s more! If you’re not happy with your roll, you can choose to reroll zero, one, two or three of the dice. In other words, you must “freeze” one or more dice and set them aside, never to be rerolled.
You repeat this process with the remaining dice — you roll them all and then freeze at least one. You repeat this process until all the dice are frozen.
If you play strategically, what score can you expect to achieve on average?
Extra credit: Instead of four dice, what if you start with five dice? What if you start with six dice? What if you start with dice?
Let the number of dice be n. A roll can be represented as an n-tuple (d_1, …, d_n) where 1 ≤ d_i ≤ 6 for i = 1, …, n. There are a total of 6^n possible rolls. Let MES be the function which gives you the maximum expected score that can be achieved starting from a given roll by choosing which dice to reroll (always freezing at least one). The expected maximum score E_n is then given by
E_n = (1/6^n)·Σ_r MES(r), where the sum runs over all 6^n possible rolls r, and E_0 = 0.
For a given roll r, let s = (s_1, …, s_n) be the corresponding tuple sorted in descending order.
The maximum expected score that can be obtained by following the optimal strategy starting from a given initial roll can be calculated as follows:
We first freeze the maximum value of the roll.
We then calculate the maximum expected score that can result from rerolling the remaining dice. If we reroll j of the dice, the maximum expected score is the sum of the expected maximum score E_j for those j dice and the top n − j values of the sorted roll (which include the maximum value that was already frozen).
This leads to the following definition of the function MES:
MES(r) = max over 0 ≤ j ≤ n − 1 of (s_1 + … + s_(n−j) + E_j), where j stands for the number of dice that are rerolled.
From the above it is clear that E_n is ultimately defined in terms of E_0, …, E_(n−1), and so it can be calculated iteratively.
Using the code below, we see that the maximum score that can be achieved on average with four dice is 989065/52488 ≈ 18.84.
from itertools import product
from fractions import Fraction

def MESs(num_dice):
    exp_max_scores = {}
    exp_max_scores[0] = 0
    for n in range(1, num_dice+1):
        total_mes = 0
        for roll in product(range(1, 6 + 1), repeat=n):
            sorted_roll = sorted(list(roll), reverse=True)
            total_mes += max(sum(sorted_roll[:n-j]) + exp_max_scores[j] for j in range(n))
        exp_max_scores[n] = Fraction(total_mes, 6**n)
    return exp_max_scores

mess = MESs(5)
for i in range(1, 5):
    print(f"The average max score for {i} dice is {mess[i]}")
The average max score for 1 dice is 7/2
The average max score for 2 dice is 593/72
The average max score for 3 dice is 13049/972
The average max score for 4 dice is 989065/52488
The first thing to note is that there are at least two possible interpretations of the term evolutionary system. It is frequently used in a very general sense to describe a system that changes incrementally over time. The second sense, and the one in this post, is the narrower use of the term in biology, namely, to mean a Darwinian evolutionary system. One way of characterizing a Darwinian evolutionary system is to identify a set of core components that constitutes such a system. The core components are:
one or more populations of individuals competing for limited resources,
the notion of dynamically changing populations due to the birth and death of individuals,
a concept of fitness which reflects the ability of an individual to survive and reproduce, and
a concept of variational inheritance: offspring closely resemble their parents, but are not identical.
Such a characterization leads naturally to the view of an evolutionary system as a process that, given particular initial conditions, follows a trajectory over time through a complex evolutionary state space. One can then study various aspects of these processes such as their convergence properties, their sensitivity to initial conditions, their transient behavior, and so on. Depending on one’s goals and interests, various components of such a system may be fixed or themselves subject to evolutionary pressures.
It does not take much imagination to interpret an evolutionary system as a parallel adaptive search procedure not unlike a swarm of ants exploring a landscape in search of food. Initial random individual movement gives way to more focused exploration, not as the result of some pre-planned group search procedure, but rather through dynamic reorganization as clues regarding food locations are encountered. Mathematically, one can view individuals in the population as representing sample points in a search space that provide clues about the location of regions of high fitness. The simulated evolutionary dynamics produce an adaptive, fitness-biased exploration of the search space. When the evolutionary process is terminated, the results of that search process (e.g., the best point found) can be viewed as the “answer” to the search problem.
In order to be applied to a particular problem, this abstract notion of an evolutionary algorithm (EA)-based parallel adaptive search procedure must be instantiated by a series of key design decisions involving:
deciding what an individual in the population represents,
providing a means for computing the fitness of an individual,
deciding how children (new search points) are generated from parents (current search points),
specifying population sizes and dynamics,
defining a termination criterion for stopping the evolutionary process, and
returning an answer.
In the simplest representation, the individuals in the population represent potential solutions to the problem at hand, and the fitness of an individual is defined in terms of the quality of the solution it represents or in terms of its proximity to a solution. There are, of course, other possible scenarios. The population as a whole could represent a solution. For example, each individual could represent a decision rule, and the population could represent the current set of rules being evaluated collectively for its decision-making effectiveness. Alternatively, we might develop a hybrid system in which individuals in the population represent the initial conditions for a problem-specific local search procedure, such as the initial weights used by an artificial neural network backpropagation procedure.
For simplicity, the focus of this post will be on EAs that search solution spaces. However, having committed to searching solution spaces still leaves open the question of how to represent that space internally in an EA.
There are two primary approaches one might take in choosing a representation: a phenotypic approach in which individuals represent solutions internally exactly as they are represented externally, and a genotypic approach in which individuals internally represent solutions encoded in a universal representation language. Both approaches have their advantages and disadvantages. A phenotypic approach generally allows for more exploitation of problem-specific properties, but at the expense of more EA software development time. A genotypic approach encourages rapid prototyping of new applications, but makes it more difficult to take advantage of domain knowledge.
The simplest and most natural internal representation for an EA involves individuals that consist of fixed-length vectors of genes. Hence, solution spaces that are defined as n-dimensional parameter spaces are the simplest and easiest to represent internally in an EA, since solutions are described by fixed-length vectors of parameters, and simple internal representations are obtained by considering each parameter a “gene”. In this case the only decision involves whether individual parameters are internally represented phenotypically (i.e., as is) or encoded genotypically (e.g., as binary strings). Other representations using variable-length objects are possible, but then a lot of care is required when devising reproductive operators.
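To make the phenotypic/genotypic distinction concrete, here is a minimal sketch of a genotypic encoding that maps a real-valued parameter onto a fixed-length bit string (the 16-bit width and the parameter range are arbitrary choices for illustration):

```python
def encode_binary(x, lo, hi, bits=16):
    # Genotypic encoding: map a real parameter in [lo, hi] onto one of
    # 2**bits equally spaced levels, written as a fixed-length bit string.
    step = (hi - lo) / (2 ** bits - 1)
    return format(round((x - lo) / step), f'0{bits}b')

def decode_binary(s, lo, hi):
    # Decode: interpret the bit string as a level index and map it back.
    step = (hi - lo) / (2 ** len(s) - 1)
    return lo + int(s, 2) * step

genome = encode_binary(0.5, 0.0, 1.0)
print(genome, decode_binary(genome, 0.0, 1.0))
```

The round trip is exact only up to the quantization step, which is the usual price of a genotypic encoding.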
It was noted in the previous section that closely tied to EA design decisions about representation are the choices made for reproductive operators. If our EA is searching solution spaces, we need reproductive operators that use the current population to generate interesting new solution candidates. With simple EAs we have two basic strategies for doing so: one-parent reproduction, in which an offspring is a mutated copy of its parent, and two-parent reproduction, in which an offspring is created by recombining material from both parents.
There is no a priori reason to choose one or the other. In fact, these two reproductive strategies are quite complementary in nature, and an EA that uses both is generally more robust than an EA using either one alone. However, the specific form that these reproductive operators take depends heavily on the choice made for representation.
EA theory provides some additional high-level guidance to help the practitioner. First, EA theory tells us that a smoothly running EA engine has reproductive operators that exhibit high fitness correlation between parents and offspring. That is, reproductive operators are not expected to work magic with low fitness parents, nor are they expected to produce (on average) poor quality offspring from high fitness parents. While fitness correlation does not give specific advice on how to construct a particular reproductive operator, it can be used to compare the effectiveness of candidate operators. Operators that achieve good fitness correlation are those that effectively manipulate semantically meaningful building blocks. This is why the choices of representation and operators are so closely coupled. Finally, in order for mutation and recombination operators to manipulate building blocks effectively, the building blocks must not in general be highly epistatic; that is, they must not interact too strongly with each other with respect to their effects on the viability of an individual and its fitness.
The classic one-parent reproductive mechanism is mutation that operates by cloning a parent and then providing some variation by modifying one or more genes in the offspring’s genome. The amount of variation is controlled by specifying how many genes are to be modified and the manner in which genes are to be modified. These two aspects interact in the following way. It is often the case that there is a natural distance metric associated with the values that a gene can take on, such as Euclidean distance for the real-valued parameter landscapes. Using this metric, one can then quantify the exploration level of mutation operators based on the number of genes modified and the amount of change they make to a gene’s value.
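One concrete instantiation of such a mutation operator for real-valued genomes might look like this (the Gaussian step and the parameter names are illustrative choices of mine, not prescribed above):

```python
import random

def gaussian_mutation(parent, num_genes=1, sigma=0.1):
    # Clone the parent, then perturb num_genes randomly chosen genes.
    # Exploration is controlled both by how many genes change (num_genes)
    # and by how far each one moves (sigma, the Gaussian step size).
    child = list(parent)
    for i in random.sample(range(len(child)), num_genes):
        child[i] += random.gauss(0.0, sigma)
    return child

print(gaussian_mutation([0.0, 0.0, 0.0, 0.0, 0.0], num_genes=2))
```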
The classic two-parent reproductive mechanism is recombination in which subcomponents of the parents’ genomes are cloned and reassembled to create an offspring genome. For simple fixed-length linear genome representations, the recombination operators have traditionally taken the form of k-point crossover operators, in which the k crossover points mark the linear subsegments on the parents’ genomes to be copied and reassembled. So, for example, a 1-point crossover operator would produce an offspring by randomly selecting a crossover point between genes i and i+1, and then copying the genes before the crossover point from parent 1 and the remaining genes from parent 2. Similarly, a 2-point crossover operator randomly selects two crossover points and copies segments one and three from parent 1 and segment two from parent 2. For these kinds of reproductive operators, the amount of variation introduced when producing children is dependent on two factors: how many crossover points there are and how similar the parents are to each other. The interesting implication of this is that, unlike mutation, the amount of variation introduced by crossover operating on fixed-length linear genomes diminishes over time as selection makes the population more homogeneous. This dependency on the contents of the population makes it much more difficult to estimate the level of crossover-induced variation. What can be calculated is the variation due to the number of crossover points. One traditional way of doing this is to calculate the “disruptiveness” of a crossover operator. This is done by calculating the probability that a child will inherit a set of genes from one of its parents. Increasing the number of crossover points increases variation and simultaneously reduces the likelihood that a set of genes will be passed on to a child. Even the 1- and 2-point crossover operators described above, which are the ones used in canonical GAs, can be shown to introduce adequate amounts of variation when the parent population is fairly heterogeneous.
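A minimal sketch of a k-point crossover on fixed-length linear genomes (this particular implementation returns a single child; producing the complementary sibling is symmetric):

```python
import random

def k_point_crossover(p1, p2, k):
    # Choose k distinct crossover points, then copy alternating segments
    # from the two parents into the child.
    points = sorted(random.sample(range(1, len(p1)), k))
    child, use_p1, prev = [], True, 0
    for cut in points + [len(p1)]:
        child.extend((p1 if use_p1 else p2)[prev:cut])
        use_p1, prev = not use_p1, cut
    return child

print(k_point_crossover([0] * 8, [1] * 8, 2))
```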
One of the difficulties with these traditional crossover operators is that, by always choosing a fixed number of crossover points, they introduce a (generally undesirable) distance bias in that genes close together on a genome are more likely to be inherited as a group than if they are widely separated. A solution to this dilemma is to make the number of crossover points a stochastic variable as well. One method for achieving this is to imagine flipping a coin at each gene position. If it comes up heads, the child inherits that gene from parent 1; otherwise, the child inherits that gene from parent 2. Hence, a coin flip sequence of the form HH…HTT…T corresponds to a 1-point crossover, while a sequence of the form HH…HTT…THH…H corresponds to a 2-point crossover operation. Depending on the coin flip sequence, anywhere from zero to L−1 crossover points can be generated for a genome of length L. If the probability of heads is 0.5, then the average number of crossover points generated is (L−1)/2, and the operator has been dubbed “uniform crossover”. The key feature of uniform crossover is that it can be shown to have no distance bias. Hence, the location of a set of genes on a genome does not affect its heritability. However, uniform crossover can be shown to have a much higher level of disruption than 1- or 2-point crossover, and for many situations its level of disruption (variation) is too high. Fortunately, the level of disruption can be controlled without introducing a distance bias simply by varying the probability of a heads occurring from 0.0 (no disruption) to 0.5 (maximum disruption). This unbiased crossover operator with tunable disruption has been dubbed “parameterized uniform crossover” and is widely used in practice.
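Parameterized uniform crossover itself is only a few lines; here p_heads plays the role of the heads probability discussed above (a sketch, not a reference implementation):

```python
import random

def parameterized_uniform_crossover(p1, p2, p_heads=0.5):
    # Flip a biased coin at each gene position: heads inherits the gene
    # from p1, tails from p2. p_heads tunes the disruption level.
    return [a if random.random() < p_heads else b for a, b in zip(p1, p2)]

print(parameterized_uniform_crossover([0] * 10, [1] * 10, p_heads=0.5))
```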
Intuitively, the parent population size can be viewed as a measure of the degree of parallel search an EA supports, since the parent population is the basis for generating new search points. For simple landscapes only small amounts of parallelism are required. However, for more complex, multi-peaked landscapes, populations of hundreds or even thousands of parents are frequently required.
By contrast, the offspring population size plays quite different roles in an EA. One important role relates to the exploration/exploitation balance that is critical for good EA search behavior. The current parent population reflects where in the solution space an EA is focusing its search, based on the feedback from earlier generations. The number of offspring generated is a measure of how long one is willing to continue to use the current parent population as the basis for generating new offspring without integrating the newly generated high-fitness offspring back into the parent population.
The simple EAs all maintain a population of size m by repeatedly:
using the current population as a source of parents to produce n offspring, and
reducing the expanded population from m + n to m individuals.
Regardless of the particular values of m and n, both steps involve selecting a subset of individuals from a given set. In step one, the required number of parents are selected to produce offspring. In step two, m individuals are selected to survive. So far, we have seen several examples of the two basic categories of selection mechanisms: deterministic and stochastic selection methods. With deterministic methods, each individual in the selection pool is assigned a fixed number that corresponds to the number of times they will be selected. With stochastic selection mechanisms, individuals in the selection pool are assigned a fixed probability of being chosen. So, for example, in a GA parents are selected stochastically using a fitness-proportional probability distribution. Stochastic selection can be used to add “noise” to EA-based problem solvers in a way that improves their “robustness” by decreasing the likelihood of converging to a sub-optimal solution.
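A sketch of the stochastic, fitness-proportional parent selection mentioned above (assuming non-negative fitness values):

```python
import random

def fitness_proportional_selection(population, fitnesses, n):
    # Stochastic selection: each individual is chosen with probability
    # proportional to its (non-negative) fitness.
    return random.choices(population, weights=fitnesses, k=n)

print(fitness_proportional_selection(['a', 'b', 'c'], [1.0, 3.0, 1.0], 5))
```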
However, more important than whether selection is stochastic or deterministic is how a particular selection mechanism distributes selection pressure over the selection pool of candidate individuals. The various selection schemes can be ranked according to selection pressure strength. The ranking from weakest to strongest is:
So far there has not been any discussion regarding exactly which individuals are competing for survival. The answer differs depending on whether a particular EA implements an overlapping or a non-overlapping generation model. With non-overlapping models, the entire parent population dies off each generation and the offspring only compete with each other for survival. In non-overlapping models, if the offspring population size is significantly larger than the parent population size then competition for survival increases. Hence, we see another role that increasing offspring population size plays, namely, amplifying non-uniform survival selection pressure. However, a much more significant effect on selection pressure occurs when using an EA with an overlapping-generation model. In this case, parents and offspring compete with each other for survival. The combination of a larger selection pool and the fact that, as evolution proceeds, the parents provide stronger and stronger competition, results in a significant increase in selection pressure over a non-overlapping version of the same EA.
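The two generation models differ only in who enters the survival competition. Using deterministic truncation as the survival mechanism (one common choice, picked here purely for illustration), they can be sketched as:

```python
def survive_non_overlapping(parents, offspring, m, fitness):
    # Non-overlapping model: the parent population dies off each generation,
    # so only the offspring compete for the m surviving slots.
    return sorted(offspring, key=fitness, reverse=True)[:m]

def survive_overlapping(parents, offspring, m, fitness):
    # Overlapping model: parents and offspring compete together.
    return sorted(parents + offspring, key=fitness, reverse=True)[:m]

print(survive_non_overlapping([10, 9], [8, 7, 6], 2, lambda x: x))
print(survive_overlapping([10, 9], [8, 7, 6], 2, lambda x: x))
```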
There are a number of EA properties that one can potentially use as indicators of convergence and stopping criteria. Ideally, of course, we want an EA to stop when it “finds the answer”. For some classes of search problems (e.g., constraint satisfaction problems) it is easy to detect that an answer has been found. But for most problems (e.g., global optimization) there is no way of knowing for sure. Rather, the search process is terminated on the basis of other criteria (e.g., convergence of the algorithm) and the best solution encountered during the search process is returned as “the answer”.
The most obvious way to detect convergence in an EA is recognizing when an EA has reached a fixed point in the sense that no further changes in the population will occur. The difficulty with this is that only the simplest EAs converge to a static fixed point in finite time. Almost every EA of sufficient complexity to be of use as a problem solver converges in the limit, as the number of generations approaches infinity, to a probability distribution over population states. To the observer, this appears as a sort of “punctuated equilibrium” in which, as evolution proceeds, an EA will appear to have converged and then exhibit a sudden improvement in fitness. So, in practice, we need to be able to detect when an EA has converged in the sense that a “law of diminishing returns” has set in.
As we saw earlier, from a dynamical systems point of view homogeneous populations are basins of attraction from which it is difficult for EAs to escape. Hence, one useful measure of convergence is the degree of homogeneity of the population. This provides direct evidence of how focused the EA search is at any particular time, and allows one to monitor over time an initially broad and diverse population that, under selection pressure, becomes increasingly more narrow and focused. By choosing an appropriate measure of population homogeneity (e.g., spatial dispersion, entropy, etc.) one typically observes a fairly rapid decrease in homogeneity and then a settling into a steady state as the slope of the homogeneity measure approaches zero.
A simpler, but often just as effective measure is to monitor the best-so-far objective fitness during an EA run. Best-so-far curves are typically the mirror image of homogeneity curves in that best-so-far curves rise rapidly and then flatten out. A “diminishing returns” signal can be triggered if little or no improvement in global objective fitness is observed for g generations (typically, 10–20). Both of these measures (population homogeneity and global objective fitness improvements) are fairly robust, problem-independent measures of convergence. However, for problem domains that are computationally intensive (e.g., every fitness evaluation involves running a war game simulation), it may be the case that one cannot afford to wait until convergence is signaled. Instead, one is given a fixed computational budget (e.g., a maximum of 10,000 simulations) and runs an EA until convergence or until the computational budget is exceeded.
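The “diminishing returns” signal described above can be implemented directly on the best-so-far history (the window g and the improvement threshold eps are tunable):

```python
def diminishing_returns(best_so_far, g=10, eps=1e-9):
    # Signal convergence when the best-so-far fitness has improved by no
    # more than eps over the last g generations.
    if len(best_so_far) <= g:
        return False
    return best_so_far[-1] - best_so_far[-1 - g] <= eps

print(diminishing_returns([1.0] * 15))
```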
A final benefit of having a simple EA search solution spaces is that the process of returning an answer is fairly straightforward: whenever the stopping criterion is met, the individual with the best objective fitness encountered during the evolutionary run is returned. Of course, there is nothing to prohibit one from returning the most fit solutions encountered if that is perceived as desirable. This raises the issue of how an EA “remembers” the most fit individuals encountered during an entire evolutionary run. One solution is to use an EA with an elitist policy that guarantees that the best individuals will never be deleted from the population. At the other end of the spectrum are “generational” EAs in which parents only survive for one generation. For such EAs the best solutions encountered over an entire run may not be members of the population at the time the stopping criterion is met, and so an additional “memo pad” is required to record the list of the top solutions encountered. Does it matter which of these two approaches one uses? In general, the answer is yes. The more elitist an EA is, the faster it converges, but at the increased risk of not finding the best solution. In computer science terms, increasing the degree of elitism increases the “greediness” of the algorithm. If we reduce the degree of elitism, we slow down the rate of convergence and increase the probability of finding the best solution. The difficulty, of course, is that if we slow down convergence too much, the answer returned when the stopping criterion is met may be worse than those obtained with more elitist EAs. As noted earlier, appropriate rates of convergence are quite problem-specific and often are determined experimentally.
Parent selection: All parents are selected once.
Recombination: One-point crossover.
Mating: Parents are sorted by fitness and each adjacent pair in the sorted population is recombined with the one-point crossover operator.
Population model: Non-overlapping, i.e., the parent population is completely replaced by the child population.
import numpy as np
from typing import Optional
from collections.abc import Iterable
from dataclasses import dataclass
from abc import ABC, abstractmethod

@dataclass
class Chromosome:
    genes: Iterable
    size: int
    fitness: Optional[float] = None
    age: int = 0

class Problem(ABC):
    @abstractmethod
    def genotype(self):
        pass

    @abstractmethod
    def terminate(self, population):
        pass

    @abstractmethod
    def fitness_fun(self, chromosome):
        pass

class Crossover:
    @staticmethod
    def one_point_crossover(x, y):
        from random import randint
        size = len(x.genes)
        cx_point = randint(0, size)
        (h1, t1), (h2, t2) = np.split(x.genes, [cx_point]), np.split(y.genes, [cx_point])
        return (Chromosome(genes=np.concatenate([h1, t2]), size=size),
                Chromosome(genes=np.concatenate([h2, t1]), size=size))

class Selection:
    @staticmethod
    def elite(population, fitness_fun, n):
        return sorted(population, key=fitness_fun, reverse=True)[:n]

    @staticmethod
    def random(population, n):
        from random import choices
        return choices(population, k=n)

class GeneticAlgorithm:
    def __init__(self, problem):
        self.problem = problem
        self.population = None

    def initialize(self):
        return [self.problem.genotype() for _ in range(self.problem.population_size)]

    def evaluate(self):
        for i in range(len(self.population)):
            self.population[i].fitness = self.problem.fitness_fun(self.population[i])
            self.population[i].age += 1
        self.population.sort(key=lambda x: x.fitness, reverse=True)

    def run(self):
        self.population = self.initialize()
        self.evaluate()
        while not self.problem.terminate(self.population):
            self.reproduce()
            self.evaluate()
        return self.population[0]

    def reproduce(self):
        def chunks(lst, n):
            for i in range(0, len(lst), n):
                yield lst[i:i + n]
        new_population = []
        for x, y in chunks(self.population, 2):
            new_population.extend(list(self.problem.crossover(x, y)))
        self.population = new_population
Arrange n queens on an n-by-n chess board so that no two queens attack each other. This problem, known as n-queens, is a fundamental constraint satisfaction problem.
Genotype: A random permutation of {0, 1, …, n − 1}, where the i-th entry gives the position of the queen in row i.
Fitness: The number of distinct values in the genotype minus the number of diagonal clashes; a conflict-free placement attains the maximum value n.
Parent selection: All parents are selected once.
Recombination: One-point crossover.
Mating: Parents are sorted by fitness and each adjacent pair in the sorted population is recombined with the one-point crossover operator.
Population model: Non-overlapping, i.e., the parent population is completely replaced by the child population.
Termination: The best chromosome attains the maximum fitness n.
class NQueens(Problem):
    def __init__(self, l, population_size, crossover):
        self.population_size = population_size
        self.crossover = crossover
        self.l = l

    def fitness_fun(self, chromosome):
        diag_clashes = 0
        for i in range(chromosome.size):
            for j in range(chromosome.size):
                if i != j:
                    dx = abs(i - j)
                    dy = abs(chromosome.genes[i] - chromosome.genes[j])
                    diag_clashes += 1 if dx == dy else 0
        return len(np.unique(chromosome.genes)) - diag_clashes

    def genotype(self):
        genes = np.random.permutation(self.l)
        chromosome = Chromosome(genes=genes, size=self.l)
        chromosome.fitness = self.fitness_fun(chromosome)
        return chromosome

    def terminate(self, population):
        return population[0].fitness == self.l

ga_n_queens = GeneticAlgorithm(NQueens(8, 1000, Crossover.one_point_crossover))
print(ga_n_queens.run())
Evolutionary Computation: A Unified Approach by Kenneth A. De Jong
The method of simulated annealing (SA) draws its inspiration from the physical process of metallurgy and uses terminology that comes from that field. When a metal is heated to a sufficiently high temperature, its atoms undergo disordered movements of large amplitude. If one now cools the metal down progressively, the atoms reduce their movement and tend to stabilize around fixed positions in a regular crystal structure with minimal energy. In this state, in which internal structural constraints are minimized, ductility is improved and the metal becomes easier to work. This slow cooling process is called annealing by metallurgists, and it is to be contrasted with the quenching process, which consists of a rapid cooling down of the metal or alloy. Quenching causes the cooled metal to be more fragile, but also harder and more resistant to wear and vibration. In this case, the resulting atomic structure corresponds to a local energy minimum whose value is higher than the one corresponding to the arrangement produced by annealing. Note finally that in practice metallurgists often use a process called tempering, in which heating and cooling phases are alternated to obtain the desired physical properties.
We can intuitively understand this process in the following way: at high temperature, atoms undergo large random movements thereby exploring a large number of possible configurations. Since in nature the energy of systems tends to be minimized, low-energy configurations will be preferred, but, at this stage, higher energy configurations remain accessible thanks to the thermal energy transferred to the system. During the exploration, the system might find itself in a low-energy configuration by chance. If the energy barrier to leave this configuration is high, then the system will stay there longer on average. As temperature decreases, the system will be more and more constrained to exploit low-amplitude movements and, finally, it will “freeze” into a low-energy minimum that may be, but is not guaranteed to be, the global one.
In an optimization context, SA seeks to emulate this process. SA begins at a very high temperature, at which the input values are allowed to vary over a wide range. As the algorithm progresses, the temperature is allowed to fall, restricting the degree to which the inputs are allowed to vary. This often leads the algorithm to a better solution, just as a metal achieves a better crystal structure through the actual annealing process. So, as long as the temperature is being decreased, changes to the inputs produce successively better solutions, finally giving rise to a near-optimal set of input values when the temperature is close to zero.
SA can be used to find the minimum of an objective function, and it is expected that the algorithm will find the inputs that produce the minimum value of the objective function. The main feature of the SA algorithm is its ability to avoid being trapped in local minima. This is done by letting the algorithm accept not only better solutions but also worse solutions with a given probability. The main disadvantage is that the choice of some control parameters (initial temperature, cooling rate, etc.) is somewhat subjective and must be made on an empirical basis. This means that the algorithm must be tuned in order to maximize its performance.
The simulated annealing method is used to search for the minimum of a given objective function, often called the energy E, by analogy with the physical origins of the method. The algorithm follows the basic principles of all metaheuristics. The process begins by choosing an arbitrary admissible initial solution, also called the initial configuration. Furthermore, an initial "temperature" must also be defined. Next, the moves that allow the current configuration to reach its neighbors must also be defined. These moves are also called elementary transformations. The algorithm doesn't test all the neighbors of the current configuration; instead, a random move is selected among the allowed ones. If the move leads to a lower energy value, then the new configuration is accepted and becomes the current solution. But the original feature of SA is that even moves that lead to an increase of the energy can be accepted with positive probability. The probability of accepting a move that worsens the fitness is computed from the energy variation ΔE = E(new) − E(current) before and after the given move:
The probability p of accepting the new configuration is defined by the exponential

p = min(1, e^(−ΔE/T)),

where T is the current temperature.
This relation is called the Metropolis rule for historical reasons. The rule says that for ΔE ≤ 0, the acceptance probability is one, as the exponential is larger than (or equal to) one in this case. In other words, a solution that is better than the current one will always be accepted. On the other hand, if ΔE > 0, which means that the fitness of the new configuration is less good, the new configuration will nonetheless be accepted with the probability computed according to the above equation. Thus, a move that worsens the fitness can still be accepted. It is also clear that the larger ΔE is, the smaller p will be and, for a given ΔE, p becomes larger with increasing temperature T. As a consequence, at high temperatures worsening moves are more likely to be accepted, making it possible to overcome fitness barriers, providing exploration capabilities, and preventing the search from getting stuck in local minima. In contrast, as the temperature is progressively lowered, the configurations will tend to converge towards a local minimum, thus exploiting a good region of the search space. Indeed, in the limit T → 0, p → 0 and no new configuration with ΔE > 0 is accepted.
The choice of the Metropolis rule for the acceptance probability is not arbitrary. The corresponding stochastic process, which generates changes and accepts them with the probability p defined above, samples the system configurations according to a well-defined probability distribution known in equilibrium statistical mechanics as the Maxwell-Boltzmann distribution. It is for this reason that the Metropolis rule is so widely used in the so-called Monte Carlo physical simulation methods. A fundamental aspect of simulated annealing is the fact that the temperature is progressively decreased during the search. The details of this process are specified by a temperature schedule, also called a cooling schedule, and can be defined in different ways. For instance, the temperature can be decreased at each iteration following a given law. In practice, it is more often preferred to lower the temperature in stages: after a given number of steps at a constant temperature, the search reaches a stationary value of the energy that fluctuates around a given average value that doesn't change any more. At this point, the temperature is decreased to allow the system to achieve convergence to a lower energy state. Finally, after several stages in which the temperature has been decreased, there are no possible fitness improvements; a state is reached that is to be considered the final one, and the algorithm stops.
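The Metropolis acceptance test described above takes only a few lines of Python; the function name is an illustrative choice, but the rule itself is the one just stated.

```python
from math import exp
from random import random

def metropolis_accept(delta_E, T):
    """Metropolis rule: always accept an improving or neutral move
    (delta_E <= 0); accept a worsening move with probability exp(-delta_E/T)."""
    if delta_E <= 0:
        return True
    return random() < exp(-delta_E / T)

# For a fixed worsening move of delta_E = 1, acceptance grows with temperature:
for T in (0.1, 1.0, 10.0):
    print(T, exp(-1.0 / T))
```

At T = 10 a move that worsens the energy by 1 is accepted roughly nine times out of ten, while at T = 0.1 it is almost never accepted, which is exactly the exploration-to-exploitation transition discussed above.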
Input: Cooling schedule.
s = s0 ; /∗ Generation of the initial solution ∗/
T = Tmax ; /∗ Starting temperature ∗/
Repeat
    Repeat /∗ At a fixed temperature ∗/
        Generate a random neighbor s′ ;
        ΔE = f(s′) − f(s) ;
        If ΔE ≤ 0 Then s = s′ /∗ Accept the neighbor solution ∗/
        Else Accept s′ with a probability e^(−ΔE/T) ;
    Until Equilibrium condition /∗ e.g. a given number of iterations executed at each temperature ∗/
    T = g(T) ; /∗ Temperature update ∗/
Until Stopping criteria satisfied /∗ e.g. T < Tmin ∗/
Output: Best solution found.
In order to start a simulated annealing search, an initial temperature must be specified. Many methods have been proposed in the literature to compute the initial temperature T0.
One scheme is to choose T0 = −ΔE / ln(p0), which means that the temperature is high enough to allow the system to traverse energy barriers of size ΔE with probability p0.
It is suggested to take T0 = ΔEmax, where ΔEmax is the maximal cost difference between any two neighboring solutions.
Another scheme, based on a more precise estimation of the cost distribution, has been proposed with multiple variants. It is recommended to choose T0 = K·σ, where K is a constant typically ranging from 5 to 10 and σ² is the second moment of the energy distribution at very high temperature. σ is estimated by randomly generating some solutions.
A more classical procedure consists in computing a temperature T0 such that the acceptance ratio is approximately equal to a given value a0. First, we choose a large initial temperature. Then, we perform a number of transitions using this temperature. The ratio of accepted transitions is compared with a0. If it is less than a0, then the temperature is multiplied by a factor greater than one. The procedure continues until the observed acceptance ratio exceeds a0. Other variants have been proposed to obtain an acceptance ratio close to a0. It is, for example, possible to divide the temperature by a corresponding factor if the acceptance ratio is much higher than a0. Using this kind of rule, cycles are avoided and a good estimate of the temperature can be found.
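The trial-and-error calibration just described might be sketched as follows; the callback name, the sample size n, and the multiplicative factor are illustrative assumptions.

```python
from math import exp
from random import random, uniform

def calibrate_T0(random_transition, a0=0.8, T0=1.0, n=1000, factor=2.0):
    """Raise T0 until the observed acceptance ratio of random transitions
    reaches the target a0. random_transition() is a hypothetical callback
    returning the energy variation delta_E of one random move."""
    while True:
        deltas = [random_transition() for _ in range(n)]
        accepted = sum(1 for d in deltas
                       if d <= 0 or random() < exp(-d / T0))
        if accepted / n >= a0:
            return T0
        T0 *= factor  # acceptance too low: increase the temperature and retry

# Toy problem whose random moves change the energy uniformly in (-1, 1)
print(calibrate_T0(lambda: uniform(-1.0, 1.0), a0=0.9))
```

In a real application `random_transition` would sample a random neighbor of a random solution and return the resulting cost change.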
In another procedure, the temperature is obtained using the formula

T0 = −ΔE⁺ / ln(a0),

where ΔE⁺ is an estimation of the average cost increase over strictly positive transitions and a0 is the desired initial acceptance ratio. This estimation is again obtained by randomly generating some transitions. Notice that −δ(t) / ln(a0), where δ(t) is the cost increase induced by a transition t, is the temperature allowing this transition to be accepted with a probability a0. In other terms, T0 is the average of these temperatures over a set of random transitions.
The perturbation mechanism is the method used to create new solutions from the current solution; in other words, it is a method to explore the neighborhood of the current solution by making small changes to it. SA is commonly used in combinatorial problems where the parameters being optimized are integer numbers. In an application where the parameters vary continuously, the exploration of neighboring solutions can be done as presented next. A solution x is defined as a vector representing a point in the search space. A new solution is generated using a vector σ of standard deviations to create a perturbation from the current solution. A neighbor solution is then produced from the present solution by

x′ᵢ = xᵢ + N(0, σᵢ),

where N(0, σᵢ) is a random Gaussian number with zero mean and standard deviation σᵢ.
In the SA algorithm, the temperature is decreased gradually such that Tᵢ > 0 at every iteration i and Tᵢ → 0 as i → ∞.
There is always a compromise between the quality of the obtained solutions and the speed of the cooling schedule. If the temperature is decreased slowly, better solutions are obtained but with a more significant computation time. The temperature can be updated in different ways:
Linear: In the trivial linear schedule, the temperature is updated as T ← T − β, where β is a specified constant value. Hence, we have Tᵢ = T0 − i·β, where Tᵢ represents the temperature at iteration i.
Geometric: In the geometric schedule, the temperature is updated using the formula T ← α·T, where 0 < α < 1. It is the most popular cooling function. Experience has shown that α should be between 0.5 and 0.99.
Very slow decrease: The main trade-off in a cooling schedule is the use of a large number of iterations at a few temperatures or a small number of iterations at many temperatures. A very slow cooling schedule such as

Tᵢ₊₁ = Tᵢ / (1 + β·Tᵢ)

may be used, where β is a small constant computed from the initial temperature, the final temperature TF, and the allowed number of iterations. Only one iteration is allowed at each temperature in this very slow decreasing function.
Nonmonotonic: Typical cooling schedules use monotone temperatures. Some nonmonotone schemes, in which the temperature is increased again from time to time, have been suggested. This encourages diversification in the search space. For some types of search landscapes, the optimal schedule is nonmonotone.
Adaptive: Most cooling schedules are static in the sense that the cooling schedule is defined completely a priori. In this case, the cooling schedule is "blind" to the characteristics of the search landscape. In an adaptive cooling schedule, the decreasing rate is dynamic and depends on information obtained during the search. A dynamic cooling schedule may be used where a small number of iterations are carried out at high temperatures and a large number of iterations at low temperatures.
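The three monotone schedules above can be sketched as one-line update functions; the default parameter values are illustrative.

```python
def linear(T, beta=1.0):
    # T_i = T0 - i*beta
    return T - beta

def geometric(T, alpha=0.9):
    # T_i = alpha**i * T0; alpha typically between 0.5 and 0.99
    return alpha * T

def very_slow(T, beta=0.001):
    # one iteration per temperature; T decreases very slowly when beta*T is small
    return T / (1 + beta * T)

# Ten geometric updates starting from T0 = 100:
T = 100.0
for _ in range(10):
    T = geometric(T)
print(round(T, 4))  # → 34.8678
```

An adaptive schedule would replace the fixed `alpha` or `beta` with a value recomputed from statistics gathered during the run.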
To reach an equilibrium state at each temperature, a sufficient number of transitions (moves) must be applied. Theory suggests that the number of iterations at each temperature might be exponential in the problem size, which is a difficult strategy to apply in practice. The number of iterations must be set according to the size of the problem instance, and particularly proportional to the neighborhood size |N(s)|. The number of transitions may be determined as follows:
Static: In a static strategy, the number of transitions is determined before the search starts. For instance, a given proportion y of the neighborhood N(s) is explored, and hence the number of neighbors generated from a solution s is y·|N(s)|. The more significant the ratio y, the higher the computational cost and the better the results. In practice, we may assume that an equilibrium state has been attained once a number of accepted elementary transformations proportional to N has been reached, out of a total quantity of tried moves also proportional to N, where N is the number of degrees of freedom of the problem, i.e., the number of variables that define a solution.
Adaptive: The number of generated neighbors depends on the characteristics of the search. For instance, it is not necessary to reach the equilibrium state at each temperature; nonequilibrium simulated annealing algorithms may be used, where the cooling schedule is applied as soon as an improving neighbor solution is generated. This feature may reduce the computational time without compromising the quality of the obtained solutions. Another adaptive approach uses both the worst and the best solutions found in the inner loop of the algorithm.
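A static equilibrium criterion of the kind described above might be sketched as follows; the function and parameter names are illustrative, and the toy quadratic energy in the usage example is mine.

```python
from math import exp
from random import random, gauss

def anneal_at_temperature(state, energy, neighbor, T, max_accepted, max_tried):
    """Run moves at a fixed temperature T until either max_accepted moves
    have been accepted or max_tried moves have been attempted (a static
    equilibrium criterion; both limits are tunable, e.g. proportional to
    the number of degrees of freedom N of the problem)."""
    accepted = tried = 0
    E = energy(state)
    while accepted < max_accepted and tried < max_tried:
        tried += 1
        s2 = neighbor(state)
        E2 = energy(s2)
        if E2 - E <= 0 or random() < exp(-(E2 - E) / T):
            state, E = s2, E2  # accept the move (Metropolis rule)
            accepted += 1
    return state, E

# Toy 1-D quadratic energy; limits chosen proportional to N with N = 1
s, e = anneal_at_temperature(5.0, lambda x: x * x,
                             lambda x: x + gauss(0, 1), T=1.0,
                             max_accepted=12, max_tried=100)
print(round(e, 3))
```

The outer loop of the full algorithm would call this once per temperature stage and then apply the cooling schedule.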
Concerning the stopping condition, theory suggests a final temperature equal to 0. In practice, one can stop the search when the probability of accepting a move becomes negligible. The following stopping criteria may also be used:
Maximum number of iterations is reached.
Reaching a final temperature TF is the most popular stopping criterion. This temperature must be low (e.g., 0.01).
Achieving a predetermined number of iterations without improvement of the best found solution.
Achieving a predetermined number of temperature stages with a low acceptance percentage; that is, a counter is increased by one each time a temperature stage is completed with a smaller percentage of accepted moves than a predetermined limit, and is reset to zero when a new best solution is found. If the counter reaches a predetermined limit, the SA algorithm is stopped.
The initial temperature is set to T0 = 1000.
The final temperature is set to Tmin = 0.01.
The geometric cooling schedule is used with α = 0.9.
The neighbour is generated using the Gaussian perturbation mechanism, ensuring that it lies within the domain [0, 5] × [0, 5] specified in the problem.
For each temperature, we generate 1000 neighbours.
When the temperature is less than the minimum temperature, we stop.
Using simulated annealing we see that the function attains its minimum value of 0 at a point close to (3, 2). The actual minimum value of 0 is attained exactly at (3, 2).
The Python code for solving the above problem is given below.
import numpy as np
from random import random
from math import exp

def f(x1, x2):
    # Himmelblau's function
    return (x1**2 + x2 - 11)**2 + (x2**2 + x1 - 7)**2

def neighbour(x1, x2):
    while True:
        x1n, x2n = np.array([x1, x2]) + np.random.normal(0, 1, 2)
        if 0 <= x1n <= 5 and 0 <= x2n <= 5:
            return (x1n, x2n)

def simulated_annealing():
    T, Tmin, alpha, x1, x2 = 1000, 0.01, 0.9, 2.5, 2.5
    while True:
        for _ in range(1000):
            x1n, x2n = neighbour(x1, x2)
            delta_E = f(x1n, x2n) - f(x1, x2)
            if delta_E <= 0:
                x1, x2 = x1n, x2n
            elif random() < exp(-delta_E/T):
                x1, x2 = x1n, x2n
        T = alpha*T
        if T < Tmin:
            return (x1, x2, f(x1, x2))

print(simulated_annealing())
Riddler Nation is competing against Conundrum Country at an Olympic archery event. Each team fires three arrows toward a circular target 70 meters away. Hitting the bull's-eye earns a team 10 points, while regions successively farther away from the bull's-eye are worth fewer and fewer points.
Whichever team has more points after three rounds wins. However, if the teams are tied after each team has taken three shots, both sides will fire another three arrows. (If they remain tied, they will continue firing three arrows each until the tie is broken.)
For every shot, each archer of Riddler Nation has a one-third chance of hitting the bull's-eye (i.e., earning 10 points), a one-third chance of earning 9 points and a one-third chance of earning 5 points.
Meanwhile, each archer of Conundrum Country earns 8 points with every arrow.
Which team is favored to win?
Extra credit: What is the probability that the team you identified as the favorite will win?
Let p be the probability of Riddler Nation winning. Riddler Nation either wins the first round outright with probability p_w, or draws the first round with probability p_d and then goes on to win with probability p. Therefore we have

p = p_w + p_d·p, which gives p = p_w / (1 − p_d).
There are a couple of ways by which p_w and p_d can be calculated.
One approach is to use brute-force enumeration of all the 3-tuples of points earned in each of the 3 shots, counting the tuples whose sum exceeds 24 (Conundrum Country's certain three-shot total) for p_w and the tuples whose sum equals 24 for p_d. The Python code for counting the winning tuples is given below:
from itertools import product
print(sum(1 for p in product(*[[10, 9, 5]]*3) if sum(p) > 24))
The other approach is to use generating functions, i.e., the coefficient of x^s in (x^10 + x^9 + x^5)^3 gives the count of the number of cases where the sum of the scores of the three shots is s.
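Both counts can be computed exactly in a few lines; the snippet below builds the coefficients of (x^10 + x^9 + x^5)^3 by direct expansion and then solves p = p_w + p_d·p with exact rational arithmetic.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# counts[s] = coefficient of x^s in (x^10 + x^9 + x^5)^3, i.e. the number
# of the 27 equally likely 3-shot sequences whose scores sum to s
counts = Counter(sum(shots) for shots in product([10, 9, 5], repeat=3))

p_w = Fraction(sum(c for s, c in counts.items() if s > 24), 27)
p_d = Fraction(counts[24], 27)  # Conundrum Country always totals 24

# p = p_w + p_d*p  =>  p = p_w / (1 - p_d)
p = p_w / (1 - p_d)
print(p_w, p_d, p)  # → 11/27 2/9 11/21
```

So Riddler Nation wins a given round with probability 11/27, draws with probability 6/27, and wins the match with probability 11/21.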
From the analysis above and the simulation below, we see that Riddler Nation is favoured to win, and the probability of a win is 11/21 ≈ 0.524.
from random import random

def shot_pts_riddler():
    p = random()
    if p < 1/3:
        return 10
    elif p < 2/3:
        return 9
    else:
        return 5

def shot_pts_conundrum():
    return 8

def prob_win_riddler(runs=1000000):
    total_wins = 0
    for _ in range(runs):
        pts_riddler, pts_conundrum = 0, 0
        while pts_riddler == pts_conundrum:
            pts_riddler = sum(shot_pts_riddler() for _ in range(3))
            pts_conundrum = sum(shot_pts_conundrum() for _ in range(3))
        if pts_riddler > pts_conundrum:
            total_wins += 1
    return total_wins/runs

print(prob_win_riddler())
Suppose you have a chain with infinitely many flat (i.e., one-dimensional) links. The first link has length 1, and the length of each successive link is a fraction f of the previous link's length. As you might expect, f is less than 1. You place the chain flat on a table and some ink at the very end of the chain (i.e., the end with the infinitesimal links).
Initially, the chain forms a straight line segment, and the longest link is fixed in place. From there, the links are constrained to move in a very specific way: The angle between each link and the next, smaller link is always the same throughout the chain. For example, if the first link and the second link form a given clockwise angle, then so do the second link and the third link.
After you move the chain around as much as you can, what shape is drawn by the ink that was at the tail end of the chain?
The lengths of the links in the chain form a geometric progression 1, f, f², f³, …
Using the formula for the sum of a geometric series, the position of the end of the chain (where the ink is) can be represented as a complex function of θ, the common angle between successive links, measured anticlockwise. We have

z(θ) = Σ_{k≥0} f^k e^{ikθ} = 1 / (1 − f e^{iθ}).

If x and y are the coordinates of the end point of the last link in the chain, we have

x + iy = 1 / (1 − f e^{iθ}).

Eliminating θ from the above using brute force, we get

(x − 1/(1 − f²))² + y² = (f / (1 − f²))²,

which is the equation of a circle centered at (1/(1 − f²), 0) with radius f/(1 − f²).
The mapping z ↦ 1 / (1 − f·z) is a Möbius transformation applied to the complex number z = e^{iθ}, which lies on the unit circle centered at the origin.
Here are a couple of fundamental results related to Möbius Transformations that we will use:
Two end points of a diameter, z = 1 and z = −1, of the unit circle get mapped to the two end points 1/(1 − f) and 1/(1 + f) of a diameter of the mapped circle by the symmetry principle. Therefore the centre of the mapped circle is at

(1/(1 − f) + 1/(1 + f)) / 2 = 1/(1 − f²).
The radius of the circle is given by

(1/(1 − f) − 1/(1 + f)) / 2 = f/(1 − f²).
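As a sanity check (my own verification, not part of the original argument), we can confirm numerically that the chain endpoint 1/(1 − f·e^{iθ}) stays on the predicted circle.

```python
from cmath import exp as cexp
from math import isclose, pi

f = 0.5
center = 1 / (1 - f**2)   # predicted centre (on the real axis)
radius = f / (1 - f**2)   # predicted radius

# Endpoint of the chain for angle theta: z = 1/(1 - f*e^{i*theta})
for k in range(12):
    theta = 2 * pi * k / 12
    z = 1 / (1 - f * cexp(1j * theta))
    assert isclose(abs(z - center), radius, abs_tol=1e-12)
print("endpoint lies on the predicted circle for all sampled angles")
```

For f = 0.5 the centre is 4/3 and the radius is 2/3, and every sampled endpoint is exactly radius away from the centre.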
The following 8-by-8 grid is covered with a total of 64 chess pieces, with one piece on each square. You should begin this puzzle at the white bishop on the green square. You can then move from white piece to white piece via the following rules:
If you are on a pawn, move up one space diagonally (left or right). If you are on a knight, move in an "L" shape: two spaces up, down, left or right, and then one space in a perpendicular direction. If you are on a bishop, move one space in any diagonal direction. If you are on a rook, move one space up, down, left or right. If you are on a queen, move one space in any direction (horizontal, vertical or diagonal). (The full board layout, with "b" for black pieces and "w" for white pieces, is given row by row in the board1 array in the code below.)
For example, suppose your first move from the bishop is diagonally down and to the right. Now you’re at a white rook, so your possible moves are left or up to a pawn or right to a knight.
Your objective is to reach one of the four black kings on the grid. However, at no point can you land on any of the other black pieces. (Knights are allowed to hop over the black pieces.)
What sequence of moves will allow you to reach a king?
Here is the Python code for solving the problem using directed graphs. Every square on the chessboard is a node. All squares that are reachable from a given square, subject to the movement constraints of the piece on that square, are neighbours of that square. The neighbours of a square are connected by directed edges whose source is the original square and whose destination is a neighbour. Given this representation, the above problem reduces to finding a directed path in the graph from the source (i.e., the green square) to a destination (i.e., one of the squares with a black king).
import networkx as nx

def moves(piece, i, j, size=8):
    def is_valid(pos):
        i, j = pos
        return 0 <= i < size and 0 <= j < size
    # "h" (for horse) denotes a knight, so that "k" can denote a king
    piece_moves = {
        "p": [(i+1,j+1),(i-1,j+1)],
        "b": [(i+1,j+1),(i-1,j+1),(i-1,j-1),(i+1,j-1)],
        "r": [(i,j+1),(i-1,j),(i+1,j),(i,j-1)],
        "h": [(i-1,j+2),(i+1,j+2),(i-1,j-2),(i+1,j-2),
              (i-2,j-1),(i-2,j+1),(i+2,j-1),(i+2,j+1)],
        "q": [(i,j+1),(i-1,j),(i+1,j),(i,j-1),(i+1,j+1),
              (i-1,j+1),(i-1,j-1),(i+1,j-1)]
    }
    return list(filter(is_valid, piece_moves[piece]))

def paths(board, source, targets):
    pieces = {}
    for i, line in enumerate(reversed(board)):
        for j, piece in enumerate(line):
            pieces[(j, i)] = piece.strip()
    G = nx.DiGraph()
    for (i, j), piece in pieces.items():
        if piece[0] == "w":
            for move in moves(piece[1], i, j):
                if pieces[move][0] == "w" or pieces[move] == "bk":
                    G.add_edge((i, j), move)
    paths = []
    for target in targets:
        try:
            paths.append(nx.shortest_path(G, source, target))
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            pass
    return paths
board1 = [
["bk", "bb", "wb", "bb", "bk", "wr", "wb", "wr"],
["wh", "wr", "wh", "wh", "wp", "wh", "bk", "wb"],
["wr", "bp", "wh", "bh", "wp", "wr", "wp", "wr"],
["bp", "bb", "wp", "wr", "bh", "wh", "br", "bh"],
["wh", "wh", "wh", "wb", "br", "wb", "wr", "wb"],
["wq", "wr", "wh", "wp", "br", "wh", "wr", "wq"],
["bb", "br", "wr", "wp", "wb", "wp", "wb", "wq"],
["bk", "wb", "wq", "wh", "wp", "wr", "wh", "wh"]
]
print(paths(board1, (4,1), [(0,0),(0,7),(4,7),(6,6)]))
Today marks the beginning of the Summer Olympics! One of the brand-new events this year is sport climbing, in which competitors try their hands (and feet) at lead climbing, speed climbing and bouldering.
Suppose the event's organizers accidentally forgot to place all the climbing holds on and had to do it last-minute for their 10-meter wall (the regulation height for the purposes of this riddle). Climbers won't have any trouble moving horizontally along the wall. However, climbers can't move between holds that are more than 1 meter apart vertically.
In a rush, the organizers place climbing holds randomly until there are no vertical gaps between climbing holds (including the bottom and top of the wall). Once they are done placing the holds, how many will there be on average (not including the bottom and top of the wall)?
Extra credit: Now suppose climbers find it just as difficult to move horizontally as vertically, meaning they can't move between any two holds that are more than 1 meter apart in any direction. Suppose also that the climbing wall is a 10-by-10 meter square. If the organizers again place the holds randomly, how many have to be placed on average until it's possible to climb the wall?
Start with the array [0, 10] and repeatedly insert uniform random values drawn from (0, 10) into this array (while keeping it sorted) until all consecutive differences are less than 1. The average length of the array (excluding the two endpoints) across multiple runs gives us the average number of holds that need to be placed; the simulation below estimates this average.
from random import uniform

def avg_holds_1d(height, d, runs=10000):
    sum_hold_cnts = 0
    for _ in range(runs):
        holds = [0, height]
        while any(holds[i+1] - holds[i] > d for i in range(len(holds)-1)):
            new_hold = uniform(0, height)
            # insert the new hold keeping the list sorted
            i = 0
            while holds[i] < new_hold:
                i += 1
            holds.insert(i, new_hold)
        sum_hold_cnts += len(holds) - 2  # exclude the bottom and top of the wall
    return sum_hold_cnts/runs

print(avg_holds_1d(10.0, 1.0))
The computational trick to increase the speed of the simulation is to use the union-find algorithm to keep track of all climbable paths (i.e., paths without any gaps) identified so far as new holds are randomly added. When we find a climbable path that contains both the top and bottom of the wall, we stop the simulation run.
Using the simulation code below, we can estimate the average number of holds needed when there are no vertical gaps greater than 1 m, and when there are no gaps (in any direction) greater than 1 m.
from networkx.utils.union_find import UnionFind
from random import uniform
from math import sqrt

def dist(p1, p2):
    return sqrt((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)

def vdist(p1, p2):
    return abs(p1[1]-p2[1])

def avg_holds(height, d, climb_type="1d", runs=1000):
    dist_fun = vdist if climb_type == "1d" else dist
    sum_holds_cnt = 0
    for _ in range(runs):
        holds, top, bot, cnt = {}, 1, 2, 2
        holds[top], holds[bot] = (0, 0), (0, height)
        climbable_paths = UnionFind()
        climbable_paths.union(top)
        climbable_paths.union(bot)
        while True:
            cnt, new_hold = cnt+1, (uniform(0, height), uniform(0, height))
            for i, hold in holds.items():
                if i == top or i == bot:
                    # only the vertical gap matters at the top and bottom of the wall
                    if vdist(hold, new_hold) <= d:
                        climbable_paths.union(i, cnt)
                elif dist_fun(hold, new_hold) <= d:
                    climbable_paths.union(i, cnt)
            climbable_paths.union(cnt)
            holds[cnt] = new_hold
            if climbable_paths[top] == climbable_paths[bot]:
                break
        sum_holds_cnt += cnt-2
    return sum_holds_cnt/runs

print(avg_holds(10, 1, "1d"))
print(avg_holds(10, 1, "2d"))
I have three dogs: Fatch, Fetch and Fitch. Yesterday, I found a brown stick for them to play with. I marked the top and bottom of the stick and then threw it for Fatch. Fatch, a Dalmatian, bit it in a random spot, leaving a mark, and returned it to me. In her honor, I painted the stick black from the top to the bite and white from the bottom to the bite.
I subsequently threw the stick for Fetch and then for Fitch, each of whom retrieved the stick by biting a random spot. What is the probability that Fetch and Fitch both bit the same color (i.e., both black or both white)?
The three bite marks can be permuted in 3! = 6 ways, all of which are equally likely because of symmetry. Out of the 6 permutations, there are 4 permutations where the bite marks of Fitch and Fetch lie on one side of the bite mark of Fatch. Therefore, the probability that Fetch and Fitch both bit the same color is 4/6 = 2/3.
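The counting argument can be checked by direct enumeration of the six orderings:

```python
from itertools import permutations

# Order the three bite marks along the stick; all 3! = 6 orderings are
# equally likely by symmetry (fa = Fatch, fe = Fetch, fi = Fitch).
favorable = 0
for order in permutations(["fa", "fe", "fi"]):
    # Fetch and Fitch bite the same colour iff both their marks lie on the
    # same side of Fatch's mark, i.e. Fatch's mark is an extreme one.
    if order[0] == "fa" or order[-1] == "fa":
        favorable += 1
print(favorable, "/ 6")  # → 4 / 6
```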
From the simulation below, we see that the probability that Fetch and Fitch both bit the same color is indeed close to 2/3.
from random import random

def prob(runs=1000000):
    succ_cnt = 0
    for _ in range(runs):
        fa, fe, fi = random(), random(), random()
        if (fe < fa and fi < fa) or (fe > fa and fi > fa):
            succ_cnt += 1
    return succ_cnt/runs

print(prob())
Italy defeated England in a heartbreaking (for England) European Championship that came down to a penalty shootout. In a shootout, teams alternate taking shots over the course of five rounds. If, at any point, a team is guaranteed to have outscored its opponent after five rounds, the shootout ends prematurely, even if each side has not yet taken five shots. If teams are tied after five rounds, they continue one round at a time until one team scores and another misses.
If each player has a 70 percent chance of making any given penalty shot, then how many total shots will be taken on average?
The simulation below estimates the average number of total shots taken.
from random import random

def avg_total_shots(p, runs=1000000):
    shot = lambda p: 1 if random() < p else 0
    def game_over(t1, t2, g1, g2):
        # the shootout ends once one side cannot catch up with its remaining shots
        return max(5-t1, 0) + g1 < g2 or max(5-t2, t1-t2) + g2 < g1
    sum_ts = 0
    for _ in range(runs):
        t1, t2, g1, g2 = 0, 0, 0, 0
        while True:
            t1, g1 = t1 + 1, g1 + shot(p)
            if game_over(t1, t2, g1, g2):
                break
            t2, g2 = t2 + 1, g2 + shot(p)
            if game_over(t1, t2, g1, g2):
                break
        sum_ts += t1 + t2
    return sum_ts/runs

print(avg_total_shots(0.7))