[MATH] An Exact Formula for the Board Game RISK

Scope: ⭐⭐

A Classic Boardgame Question

The board game RISK has reigned as one of the most popular board games of all time since its introduction in 1957. In the game, combat occurs between two countries with a number of attacking troops $a$ and a number of defending troops $b$.

The authors of the DataGenetics blog analyzed the rules of combat in this post. The authors analyze the probability that attacking force $a$ defeats defending force $b$, where $a$ continues attacking until either the attacking force or the defending force is reduced to $0$. Their main results are numerical estimates the probability of $a$ attackers succeeding in such an attack.

The RISK Recurrence

Let $t(a,b)$ represent this success probability. My contribution is formalizing a relationship between values of $t$, which I call the RISK recurrence. Below is a table that summarizes the relation between these values.

Domain	Relation
$a=0,b \geq 0$	$t(0,b) = 0$
$a=1,b \geq 0$	$t(1,b) = 0$
$a \geq 2,b=0$	$t(a,0) = 1$
$a=2,b \geq 1$	$t(2,b) = \left(\frac{55}{216}\right)^{b-1}\left(\frac{15}{36}\right)$
$a \geq 3,b=1$	$t(a,1) = 1 - \left(\frac{441}{1296}\right)^{a-3}\left(\frac{91}{216}\right)\left(\frac{21}{36}\right)$
$a=3,b \geq 2$	$t(3,b) = \frac{295}{1296}t(3,b-2) + \left(\frac{55}{216}\right)^{b-2}\left(\frac{420}{1296}\right)\left(\frac{15}{36}\right)$
$a \geq 4,b \geq 2$	$t(a,b) = \frac{2890}{7776}t(a,b-2) + \frac{2611}{7776}t(a-1,b-1) + \frac{2275}{7776}t(a-2,b)$

This recurrence can be used to compute any reasonable values of $t(a,b)$ very quickly. A simple Python implementation can be found below.

from numpy import zeros

# NOTE this function returns the probability of a attackers beating b defenders in the board game RISK
def get_risk_probability(a, b):
    table = zeros((a+1, b+1))
    for i in range(2,a+1): table[i][0] = 1
    for i in range(3,a+1): table[i][1] = 1-(21/36)*(91/216)*(441/1296)**(i-3)
    for j in range(1,b+1): table[2][j] = (15/36)*(55/216)**(j-1)

    temp = table[3][0]
    for j in range(2,b+1,2): 
        temp = (295/1296)*temp + (15/36)*(420/1296)*(55/216)**(j-2)
        table[3][j] = temp

    temp = table[3][1]
    for j in range(3,b+1,2): 
        temp = (295/1296)*temp + (15/36)*(420/1296)*(55/216)**(j-2)
        table[3][j] = temp

    for i in range(4,a+1):
        for j in range(2,b+1):
            table[i][j] = (2890/7776)*table[i][j-2] + (2611/7776)*table[i-1][j-1] +(2275/7776)*table[i-2][j]

    return table[a][b]

if __name__ == '__main__':
    a, b = 5, 2
    prob_val = get_risk_probability(a=a, b=b)
    print(f'\nProbability of {a} attackers defeating {b} defenders in RISK is {100*prob_val:.4f}%\n')

This is completely sufficient for any reasonable RISK player. But I couldn’t ignore the fact that the RISK recurrence is linear, and so can be solved exactly using (multivariate) generating functions. I knew it would be awful to work out, but I really wanted an exact formula. So I set out to solve it.

An Academic Exercise

The first step was to compute the multivariate generating function.

Theorem 1. Let

\[T(x,y) = \sum_{a,b \geq 0}t(a,b)x^a y^b = \frac{x^{2}P(x,y)}{Q(x,y)}.\]

where

$Q(x,y) = 36 (x - 1)(49 x - 144)(55 y - 216)(295 y^{2} - 1296)(2275 x^{2} + 2611 x y + 2890 y^{2} - 7776)$

and the numerator $P(x,y)$ can be found in the risk.pdf summary here.

Proof. The proof, though tedious, is a straightforward application of generating functions. This computation was completed with the assistance of SageMath.

Using Theorem 1, we can compute the exact formula for $t$. The result is ridiculous, but is an exact expression for the probability of winning.

Theorem 2. Define $T(x,y) = T_1(x,y) + T_2(x,y) + T_3(x,y) + T_4(x,y) + T_5(x,y)$, where $T_i$ is defined by the following.

\[T_1(x,y) = -x^2\left(\frac{7776 + 7776x + 3240y + 3254xy}{(7776 - 2275x^2 - 2611xy - 2890y^2)}\right)\] \[T_2(x,y) = x^2\left(\frac{7776-2611xy-2275x^2}{(1-x)(7776 - 2275x^2 - 2611xy - 2890y^2)}\right)\] \[T_3(x,y) = x^2\left(\frac{7776+1260y-2611xy-2890y^2-\frac{91385}{216}xy^2 - \frac{50575}{108}y^3}{(1-\frac{55}{216}y)(7776 - 2275x^2 - 2611xy - 2890y^2)}\right)\] \[T_4(x,y) = x^2\left(\frac{3240y + \frac{3045}{2} xy - \frac{6965}{12} x^{2} y - \frac{2309125}{5184} x^{3} y - \frac{557375}{5184} x^{4} y}{(1-\frac{49}{144}x)(1-x)(7776 - 2275x^2 - 2611xy - 2890y^2)}\right)\] \[T_5(x,y) = x^2\left(\frac{7776 x + 3885 x y - \frac{240005}{72} x y^{2} - \frac{1871275}{1296} x y^{3} + \frac{46131625}{279936} x y^{4}}{(1-\frac{55}{216}y)(1-\frac{295}{1296}y^2)(7776 - 2275x^2 - 2611xy - 2890y^2)}\right)\]

Now we define the following notation to solve the generating function.

\[F(x,y) = \frac{1}{7776-2275x^2-2611xy-2890y^2}\]

This function has coefficients

\[\begin{eqnarray*} t_1(a,b) & = & F(x,y)[x^a][y^b] \\ & = & \frac{1}{7776} \left(\frac{2275}{7776}\right)^{\frac{a}{2}} \left(\frac{2890}{7776}\right)^{\frac{b}{2}} \sum_{j=0}^{\frac{a+b}{2}}\binom{\frac{a+b}{2}}{\frac{a-b}{2}+2j}\binom{\frac{a-b}{2}+2j}{j}\left(\frac{2611}{85\sqrt{910}}\right)^{b-2j} \end{eqnarray*}\]

for even $a+b$ and $0$ otherwise. We then similarly define

\[t_2(a,b) = \frac{1}{(1-x)}F(x,y)[x^a][y^b] = \sum_{i=0}^a t_1(i,b)\] \[t_3(a,b) = \frac{1}{(1-\frac{55}{216}y)}F(x,y)[x^a][y^b] = \sum_{j=0}^b t_1(a,j)\left(\frac{55}{216}\right)^{b-j}\] \[t_4(a,b) =\frac{1}{(1-x)(1-\frac{49}{144}x)}F(x,y)[x^a][y^b] = \sum_{i_2=0}^a\sum_{i_1=0}^{i_2} t_1(i_1,b)\left(\frac{49}{144}\right)^{i_2-i_1}\] \[\begin{eqnarray*} t_5(a,b) & = & \frac{1}{(1-\frac{55}{216}y)(1-\frac{295}{1296}y^2)}F(x,y)[x^a][y^b] \\ & = & \frac{1}{2} \sum_{i_2=0}^b\sum_{i_1=0}^{i_2} t_1(a,i_1)\left(\frac{55}{216}\right)^{b-i_2} \left(\left(\sqrt{\frac{295}{1296}}\right)^{i_2-i_1} + \left(-\sqrt{\frac{295}{1296}}\right)^{i_2-i_1}\right) \end{eqnarray*}\]

Finally, we have

\[\begin{eqnarray*} t(a+2,b) & = & -7776 t_1(a,b) - 7776t_1(a-1,b) - 3240 t_1(a,b-1) - 3254 t_1(a-1,b-1) \\ & & {} + 7776 t_2(a,b) - 2611 t_2(a-1,b-1) - 2275 t_2(a-2,b) \\ & & {} + 7776 t_3(a,b) + 1260 t_3(a,b-1) -2611 t_3(a-1,b-1) - 2890 t_3(a,b-2) \\ & & {} - \frac{91385}{216} t_3(a-1,b-2) - \frac{50575}{108}t_3(a,b-3) \\ & & {} + 3240 t_4(a,b-1) + \frac{3045}{2}t_4(a-1,b-1) - \frac{6965}{12} t_4(a-2,b-1) \\ & & {} - \frac{2309125}{5184}t_4(a-3,b-1) -\frac{557375}{5184}t_4(a-4,b-1) \\ & & {} + 7776 t_5(a-1,b) + 3885 t_5(a-1,b-1) - \frac{240005}{72} t_5(a-1,b-2) \\ & & {} - \frac{1871275}{1296} t_5(a-1,b-3) + \frac{46131625}{279936} t_5(a-1,b-4) \end{eqnarray*}\]

Proof. This proof is also extremely tedious and was completed with the assistance of SageMath. Python implementations of Theorems 1 and 2 can be found here.

With the formula completed, I was admittedly disappointed with how ridiculously ugly it was. If the RISK recurrence was simpler, would I get a simpler answer?

An Even Further Afeild Academic Exercise

I wanted a simpler formula for a similar recurrence. I considered the following recurrence relation as a model for the RISK recurrence.

\[f(x,y) = \alpha f(x,y-2) + (1 - \alpha - \beta)f(x-1,y-1) + \beta f(x-2,y)\]

with $\alpha, \beta, 1-\alpha-\beta \in [0,1]$ for real $x$, if one was to choose suitable boundary conditions. This is a significant relaxation, as we allow for any boundary conditions that simplifies the final expression.

This reminded me of a recurrence relation that the Beta function exhibits. The Beta function is defined as follows

\[B(x,y) = \int_{0}^{1} t^{x-1}(1-t)^{y-1}dt\]

This function exhibits the similar looking relation

\[B(x,y) = B(x,y+1) + B(x+1,y)\]

\[\int_{0}^{1} t^{x-1}(1-t)^{y}dt + \int_{0}^{1} t^{x}(1-t)^{y-1}dt = \int_{0}^{1} (t+1-t) t^{x-1}(1-t)^{y-1}dt = B(x,y)\]

We can similarly define a solution function $f$ so that

\[f(x,y) = \int_{1=\frac{\alpha}{t_1^2}+\frac{1-\alpha-\beta}{t_1t_2}+\frac{\beta}{t_2^2}} t_1^{x} t_2^{y} dt_1 dt_2\]

which immediately solves the recurrence. We can further simplify $f$ by solving for $t_2$ and choosing $t_1=\frac{1}{2}$ to get the following expression

\[f(x,y) = 2^{-x}\beta^y \left(\sqrt{(1-\alpha-\beta)^2 - 4 \alpha \beta + \beta} - (1-\alpha - \beta)\right)^{-y}\]

Further Questions

Can one get a similarly nice expression for given boundary conditions?

Thank you for reading!