GitHub Repo 👾 YouTube Video 📺 Scope: ⭐⭐⭐

A Minecraft Optimization Problem

Harvesting crops in Minecraft is a fundamental mechanic in the game. Crops progress through a series of growth stages, until a last stage is reached. When a crop is harvested at this last stage, it will yield an item. Harvesting a crop before its last stage yields no items.

Let’s call the number of growth stages $n$. For example, wheat has $n=8$, as it has $8$ growth stages.

For a specific crop’s number of growth stages, refer to the Minecraft Wiki.

Most players wait until a large majority of their crops are ready before harvesting. But if we were to build a machine that waits some specific time before harvesting, what is the best time for the machine to wait? Clearly the value is dependent on $n$, but what are the specific values?

This periodic farming strategy may seem antiquated in the Minecraft versions that have observers (blocks that can detect a random tick of a specific block), but this is actually not the case. Ilmango, a well-known Minecraft automated farm designer, points out in this YouTube video that Amethyst Clusters are a good candidate for this type of farm. So this question is important for many Minecraft versions, new and old. But to answer it we first need to examine crop growth itself.

How Do Minecraft Crops Grow?

Every 0.05 seconds, Minecraft updates its game state. This update is called a tick. Every tick, the following procedure runs for every subchunk loaded by a player:

  • Randomly select $m$ blocks within the subchunk (with replacement)
  • For each block that is selected, do the associated random tick event

For this process, the chosen blocks are said to be random ticked, and the value $m$ is (confusingly) called the tick speed. Note that Minecraft has $m$ set to $3$.

Most blocks in the game do nothing when random ticked, but certain blocks do special behaviors. Importantly for us, crop blocks advance their growth stage when random ticked. We can now mathematically write down our optimization question.

Formalizing the Problem

The number of times a block is random ticked after $t$ ticks follows a Binomial distribution $X_{mt,p} \sim B(mt,p)$, where $p=2^{-12}$ is the probability that a specific block is chosen in a subchunk.

Then our problem is the same as calculating the following function.

\[\Lambda_{m,p}(n) = \max_{t} \frac{1}{t}\mathbb{P}(X_{mt,p} \geq n) = \max_{t} \frac{1}{t}\left(1-(1-p)^{mt-n+1}\sum_{k=0}^{n-1}p^k \binom{mt}{k}\right).\]

Note that for Minecraft we are specifically interested in $\Lambda_{3,2^{-12}}(n)$.

A Useful Approximation

Due to $p$ being small, we can approximate $X_{mt,p}$ with a Poisson random variable $Y \sim \text{Poisson}(t)$. We then define the corresponding function for $Y$, namely

\[\lambda(n) = \max_t \frac{1}{t}\mathbb{P}(Y \geq n) = \max_t \frac{1}{t}\left(1-e^{-t}\sum_{k=0}^{n-1}\frac{t^k}{k!}\right)\]

We then have, for large $n$ and small $p$, that

\[\Lambda_{m,p}(n) \approx \frac{1}{mp}\lambda(n).\]

Direct Computation

We want to simply our expressions for $\Lambda_{m,p}(n)$ and $\lambda(n)$ by getting rid of the $max$ in both expressions. To do this, we take the derivative with respect to $t$ and set the expression to $0$. For $\Lambda_{m,p}(n)$ this gives

\[\begin{eqnarray*} 0 & = & (1-p)^{n-1}e^{-m\ln(1-p)t} + \frac{(p)^{n-1}}{(n-1)!}\ln(1-p)(mt)^n -1 \\ & + & \sum_{k=1}^{n-1}\left(\frac{\ln(1-p)p^{k-1}}{(k-1)!} + \sum_{i=k}^{n-1}((k-1)s(i,k)+\ln(1-p)s(i,k-1))\frac{p^i}{i!}\right)(mt)^k, \end{eqnarray*}\]

The above equation is critical, as it says that for a given $n$, the optimal time to wait is the $t>0$ so that the above expression is true.

Similarly for $\lambda(n)$, we have

\[0 = e^t - \frac{t^n}{(n-1)!} - \sum_{k=0}^{n-1}\frac{t^k}{k!}.\]

Numerical Results

Both of these equations can easily be solved numerically using computers. Running root finding algorithms for small $n$ gives the following table

$n$ $\lambda(n)$ $(2^{12}/3)\lambda(n)$ $\Lambda_{3,2^{-12}}(n)$ Opimal Tick
1 0 0 0 1
2 1.7932821329 2448.4278721205 2449.1597630796 2449
3 3.3836342829 4619.7886741889 4620.6492653516 4621
4 4.8812774913 6664.5708681839 6665.5420888695 6666
5 6.3225055510 8632.3275790109 8633.3997579089 8633
6 7.7245836004 10546.6314757967 10547.7982629752 10548
7 9.0973444026 12420.9075576501 12422.1643509140 12422
8 10.4470306813 14263.6792235702 14265.0224812591 14265
9 11.7779066065 16080.7684866791 16082.1953698680 16082
10 13.0930424983 17876.3673576602 17877.8755221598 17878
11 14.3947389019 19653.6168474178 19655.2043132205 19655
12 15.6847741726 21414.9450036956 21416.6100682348 21417
13 16.9645579170 23162.2764093484 23164.0175872686 23164
14 18.2352307260 24897.1683511953 24898.9843310297 24899
15 19.4977315702 26620.9028372304 26622.7924493235 26623
16 20.7528448428 28334.5508254074 28336.5129688193 28337

Notice how close the exact and approximate solutions are! They often only disagree by a single tick. The final column lists the best tick to harvest a crop of a specific growth stage.

Let’s repeat the calculations done in IlMango’s video for amethyst clusters. Amethyst clusters are an $n=4$ crop, that grow on the sides of blocks (ie. there is a $1$ in $6$ chance that a specific cluster grows). There is an additional rule where if the block is random ticked there is only a $1$ in $5$ chance that the growth stage will advance. This results in an optimal time of

\[\Lambda_{3,2^{-12}}(4) * 6 * 5 = 199966 \text{ ticks} \approx 167 \text{ minutes}\]

which is what IlMango arrives at in his video.

The above table should be sufficient for any periodic Minecraft farms at the time of writing, as all crops in the game have a small number of growth stages. But what happens if there is a crop introduced to the game that has a much larger growth rate?

An Academic Exercise

The computation of $\Lambda_{m,p}(n)$ and $\lambda(n)$ for large $n$ can be difficult even for modern root finding algorithms. However, one can show that the leading terms of $\lambda(n)$ are as follows.

Theorem 1. The leading terms of $\lambda(n)$ are

\[\lambda(n) = n + \sqrt{n\ln\left(\frac{n}{2\pi}\right)} + \dots\]

This gives an asymptotic estimate for $\lambda(n)$, which is an estimate that decreases in percent error as $n$ increases.

Proof. We begin with a definition of $\lambda(n)$ that allows for continuous $n$, namely

\[1 = \int_0^1 e^{(n\ln{(1-u)}+\lambda(n)u)} du.\]

We can use Laplace’s method to derive leading terms of $\lambda(n)$.

\[1 = \int_0^1 e^{(n\ln{(t)}+\lambda(n)(1-t))} dt = \int_0^1 e^{nf(n,t)}dt\]

there $f(n,t) = \ln{(t)}+\frac{\lambda(n)}{n}(1-t)$. We start by writing a taylor explansion for $f(n,t)$.

\[f(n,t) = \ln \left(\frac{n}{\lambda(n)}\right) + \frac{\lambda(n)}{n} -1 + \sum_{k \geq 2}\left(\frac{\lambda(n)}{n}\right)^k \frac{(-1)^{k-1}}{k}\left(t-\frac{n}{\lambda(n)}\right)^k\]

The above equation is valid for $t \in (\frac{n}{\lambda(n)}-1, \frac{n}{\lambda(n)}+1)$. As $\frac{n}{\lambda(n)} > 0$, we have $(0,1) \subset (\frac{n}{\lambda(n)}-1, \frac{n}{\lambda(n)}+1)$ and can integrate over the region $(0,1)$. We can then write

\[\int_0^1 \exp{(nf(n,t))} dt = \left(\frac{n}{\lambda(n)}\right)^n e^{\lambda(n)}e^{-n}\int_0^1\exp{\left(n\sum_{k\geq2}\frac{(-1)^{k-1}}{k}\left(\frac{\lambda(n)}{n}t-1\right)^k\right)}dt\]

Let us call the last integral

\[I(n,\lambda(n)) = \lim_{n\to\infty} I_m(n,\lambda(n)) = \lim_{n\to\infty} \int_0^1\exp{\left(n\sum_{k=2}^m\frac{(-1)^{k-1}}{k}\left(\frac{\lambda(n)}{n}t-1\right)^k\right)}dt\]

Following Laplace’s method, we approximate $I$ with $I_2$. We can then say that, for $n \to \infty$, that

\[\left(\frac{n}{\lambda(n)}\right)^ne^{\lambda(n)}e^{-n}I_2(n,\lambda(n)) \to 1\]

or

\[n\ln(n) -n\ln(\lambda(n)) + \lambda(n)-n+\ln(I_2(n,\lambda(n))) \to 0.\]

We then suppose that $\lambda(n) = n + n\delta_1(n)$. We can then say that $\delta_1(n) \to 0$ as $n \to \infty$.

\[-n\ln(1+\delta_1(n)) + n\delta_1(n) + \ln I_2(n,\lambda(n)) \to 0\]

What can be said about $I_2$? We have

\[I_2(n,\lambda(n)) = \int_0^1\exp{\left(-\left(\frac{\lambda(n)}{\sqrt{2n}}t-\sqrt{\frac{n}{2}}\right)^2\right)}dt\]

and written in terms of common functions

\[I_2(n,\lambda(n)) = \sqrt{\frac{\pi n}{2}}\frac{1}{\lambda(n)}\left(\text{erf}\left(\sqrt{\frac{n}{2}}\delta_1(n)\right) + \text{erf}\left({\sqrt{\frac{n}{2}}}\right)\right).\]

Therefore, we have

\[-n\ln(1+\delta_1(n)) + n\delta_1(n) +\frac{1}{2}\ln\left(\frac{\pi}{2}\right) + \frac{1}{2}\ln(n) - \ln(\lambda(n)) + \ln(2) \to 0\]

as \(\text{erf}\left(\sqrt{\frac{n}{2}}\delta_1(n)\right) + \text{erf}\left({\sqrt{\frac{n}{2}}}\right) \to 2\) quickly. We can then write

\[-n\ln(1+\delta_1(n)) + n\delta_1(n) - \frac{1}{2}\ln(n) -\ln(1+\delta_1(n)) + \frac{1}{2}\ln(2\pi) \to 0.\]

We can again use the Taylor expansion of $\ln(1+x)$ to get the further approximate

\[\frac{n\delta_1^2(n)}{2} + \frac{1}{2}\ln\left(\frac{2\pi}{n}\right)-\delta_1(n)+\frac{\delta_1^2(n)}{2} \to 0.\]

Finally, completing the square gives the following expression for $\delta_1(n)$

\[\delta_1(n) \approx \frac{1}{n+1}\left( 1 + \sqrt{1+(n+1)\ln\left(\frac{n}{2\pi}\right)}\right)\]

or ignoring the $-\ln(1+\delta_1(n))$ term gives the cleaner but less accurate expression

\[\delta_1(n) \approx \sqrt{\frac{1}{n}\ln\left(\frac{n}{2\pi}\right)},\]

which completes the proof. The above theorem and proof were completed with the help of my academic advisor Professor Iddo Ben-Ari.

On the opposite end of the spectrum, can any of the smaller values be written exactly?

Another Academic Exercise

Though slightly contrived, it turns out one can write $\lambda(2)$ and $\lambda(3)$ exactly using Lagrange Inversion. Using this technique along with certain identities for Partial Bell polynomials gives the following results.

\[\lambda(2) = \frac{e^2-3}{e^2-5} - \sum_{n \geq 1}f_n \left(\frac{e^2-7}{e^2-5} \right)^{n+1},\]

where

\[f_n = \sum_{i=1}^{n}\sum_{j=0}^i\sum_{k=0}^j\frac{(-1)^{n+k}(n+i)!S(n-i+j+k,k)e^{2j}}{(n+1)!(i-j)!(j-k)!(n-i+j+k)!(e^2-5)^{i}}.\]

Similarly,

\[\lambda(3) = \frac{3e^4-71}{e^4-29} - \sum_{n\geq1}g_n\left(\frac{e^4-45}{e^4-29}\right)^{n+1}\]

where

\[g_n = \sum_{i=1}^n\sum_{j=0}^n\sum_{k=0}^i\sum_{\ell=0}^{k-i}\frac{(-1)^{n-\ell}(n+i)!13^{2k+j-n}S(j+\ell,\ell)e^{4(j-k)}}{(n+1)!(j+\ell)!(i-k-\ell)!k!2^k(e^4-29)^i}\binom{k}{n-j-k}.\]

Note that above has $\binom{k}{n-j-k}=0$ if $k < n-j-k$ and $S(a,b)$ is the $(a,b)$-th Stirling Number of the second kind, which is defined as $S(a,b) = \frac{1}{b!}\sum_{k=0}^b(-1)^k \binom{b}{k}(b-k)^{a}$. Both sums converge slowly and so are of little use.

Further Questions

  1. Are there other constants in the main table that can be written exactly?

  2. The process for oxidation in copper blocks is different than other random tick events. What are the optimal times for harvesting a specific oxidation stage for copper?

Thank you for reading!