Home Projects Blog

Symmetry, the Goldbach Conjecture and the Twin Prime Conjecture

math goldbach conjecture

The Goldbach Conjecture

Veritasium, a YouTube channel with science, released a video on the Goldbach Conjecture. Because the problem statement is so simple, it’s very tempting to play around with the math. Based on a symmetry in addition, there is an equivalent way of stating the conjecture.

The Conjecture

The original conjecture is stated as

Every even integer greater than 22 can be written as the sum of two primes.

Symmetric Reformulation

My claim is that an equivalent statement is that

For all integers kk greater than 11, there exist at least one integer dd, 0≀d<k0 \leq d < k such that kβˆ’dk-d and k+dk+d are prime.

In other words, the two primes that sum up to n=2kn = 2k from the Goldbach Conjecture must be at an equal distance dd from kk, the midpoint of nn. You can find the proof below.

Justification

The symmetry is not due to the Goldbach Conjecture itself, but it originates from addition. For any integers a,b,k>0a,b,k > 0 such that a+b=2ka + b = 2k, aa and bb must be of the form k+dk+d and kβˆ’dk-d for some dd. The reformulation of the Goldbach Conjecture just follows from applying this fact to the conjecture.

Relation to the Twin Prime Conjecture

The Twin Prime Conjecture is that

There are infinitely many primes pp such that p+2p+2 is also prime.

This is equivalent to having an integer kk with d=1d = 1 such that kβˆ’dk-d and k+dk+d are prime. Such a twin prime pair would satisfy the Goldbach Conjecture for this n=2kn = 2k. At the same time, if the Goldbach Conjecture is true, and one could show that we always have some nn where d=1d = 1, then this would prove the Twin Prime Conjecture.

Some Plots

The Goldbach Conjecture has been shown to be true up to 4β‹…10184\cdot 10^{18} 1. For example for n=16n = 16 we know that it can be written as the sum of two primes by 5+115 + 11, or by 3+133 + 13. The midpoint, kk, is 88. For both pairs of numbers, kk is in the middle. k+3=11k + 3 = 11 and kβˆ’3=5k-3 = 5.

Likewise, we can rewrite 33 as 8βˆ’58-5 and 1313 as 8+58+5.

The following plot contains some aggregate statistics for the distance dd. The maximum and minimum clearly follow the bounds n/2n/2 and 00. Fitting the slope for the average size of dd for 2<n≀106 2 < n \leq 10^6 we get f(n)β‰ˆ0.238579x+βˆ’43.772569f(n) \approx 0.238579x + -43.772569. The R-Squared metric is 0.9992100.999210, so it’s a good fit.

Maximum, minimum, average of the distances $d$, and number of distances $d$, for each $k$. Further, the solid line is the maximum of $d$, $n/2$.
Maximum, minimum, average of the distances dd, and number of distances dd, for each kk. Further, the solid line is the maximum of dd, n/2n/2.

The linear relationship in the average is due to the distribution of distances dd in its search space. For a given nn, dd must be between 00 and n/2n/2. Plotting all the distances, we can see the distances are roughly evenly distributed. Thus, the average of them is approximately in the middle of the search space, n/4n/4, which reflects the 0.2385…0.2385\dots in the approximation.

All the distances $d$ plotted, for each of the $n$.
All the distances dd plotted, for each of the nn.

The diagonal grid-like pattern in the distances are reminiscent of the Ulam Spiral2 pattern, so they are likely due to the pattern in prime numbers themselves.

Proof

Let n=2kn = 2k beany even postive integer. If a+b=2ka + b = 2k, for some integers a,b>0a,b > 0, then aa and bb can be written as k+dk + d and kβˆ’dk-d for some integer dd. We achieve this by reformulating a+b=2ka + b = 2k as aβˆ’k=kβˆ’ba-k = k-b, which we define as dd.

k+d=k+(aβˆ’k)=a\begin{equation} \begin{split} k + d &= k + (a -k) \\ &= a \end{split} \end{equation} kβˆ’d=kβˆ’(kβˆ’b)=b\begin{equation} \begin{split} k - d &= k - (k - b)\\ &= b \end{split} \end{equation}

so (k+d)+(kβˆ’d)=a+b=2k(k+d) + (k-d) = a + b = 2k always holds.

Assume that aa is greater than or equal to bb. The two summands aa and bb are greater than zero by assumption. The larger of the two, aa, can be at most 2kβˆ’12k-1, which is reached with equality when b=1b = 1. Thus, d=aβˆ’k≀2kβˆ’1βˆ’k=kβˆ’1d = a - k \leq 2k-1-k = k-1, our upper bound for dd. The lower bound is reached when a=b=ka = b = k, thus d=0d = 0.

This proves that any integer n=2kn = 2k and a,ba,b such that a+b=na + b = n, aa and bb are of the form (kβˆ’d)(k-d) respectively (k+d)(k+d), for some dd, 0≀d<k0 \leq d < k. β– \blacksquare

Application to Goldbach Conjecture

Let nn be an even integer greater than 22. As nn is even, we can write n=2kn=2k for some kk. The lower bound on nn translates to k>1k > 1. As we can map bijectively from all even nn to kk, proving the statement over all kk is equivalent.

If we assume the Goldbach Conjecture to be true for some n=2kn = 2k, then there exist primes p1,p2p_1, p_2 such that p1+p2=2kp_1 + p_2 = 2k. As proven earlier, they must be of the form kβˆ’dk-d and k+dk + d. If we on the other hand assume 2k2k can be written as the sum of kβˆ’dk-d and k+dk+d which are prime, we have shown the other direction, as 2k2k can be written as the sum of two primes. Thus, the reformulation is equivalent to the original conjecture.

Footnotes

  1. Goldbach Conjecture on Wikipedia ↑

  2. Ulam Spiral on Wikipedia ↑

Back to the Top