On Prime Depth as an Analytic Tool for Prime Number Generation.
ABSTRACTA new concept called prime depth at level k is introduced and used as a tool for constructing a martingale process for predicting the (n+1)th prime, [P.sub.n+1], given [P.sub.1], [P.sub.2],..., [P.sub.n]. The prime depths at level 2 was shown to obey a Laplace distribution with location parameter at 0 and scale parameter [beta]. The scale parameter is the same scale parameter of the exponential distribution describing the prime gaps. The estimator of the (n+1)th prime is given by the stochastic equation: [P.sub.n+1] = [P.sub.n] + [P.sub.n+1] - [P.sub. n-2] + [delta] where [delta] is a martingale with Laplacian filtration.
Keywords: Prime depth at level k, Laplace distribution, martingale process, filtration
INTRODUCTION
We introduce the concept of a prime depth at level k and use the concept to predict the (n+1)th prime, [P.sub.n+1], given the sequence {[P.sub.1], [P.sub.2],..., [P.sub.n]} of primes.
Definition 1. Let [P.sub.1], [P.sub.2],...,[P.sub.n] be the sequence of primes. The nth prime gap is:
g(n) = [P.sub.n+1] - [P.sub.n], n [greater than or equal to] 1 (1)
From (1), it follows that:|
[P.sub.n+1] = [P.sub.n] + g(n) (2)
and by recursion on (2), we obtain:
[P.sub.n+1] = g(1)+g(2)+ [ctdot] + g(n) (3)
Knowing the prime gaps {g(n)} allows us to express the nth prime [P.sub.n] as the sum of the previous prime gaps.
Prime gaps can be arbitrarily large (Tao 2013). For any n>0, however, Bertrand's Postulate states that:
g(n) = [P.sub.n+1] - [P.sub.n] [less than or equal to] [P.sub.n] (4)
Statistical upper bounds on prime gaps had been found. Yitang Zhang (2013) stated:
g(n) = [P.sub.n+1] - [P.sub.n] [less than or equal to] 7 x [10.sup.7] infinitely often (5)
while the Polymath Project of Tao (2013) improved this to:
g(n) = [P.sub.n+1] - [P.sub.n] [less than or equal to] 6 (6)
Using Sieve Theory, Tao postulated that (6) may be the best upper bound that can be produced by Sieve analysis.
The unpredictability of the appearance of primes led to stochastic approaches in the study of prime gaps. Cramer (1935) first proposed the introduction of a uniform probability model on a subset of {p | 0 < p [less than or equal to] x} and demonstrated that prime gaps {g(x)} obey a Poisson process such that:
E(g(x)) ~ log x (7)
V(g(x)) ~ [log.sup.2] x
Selberg (1948) generalized Cramer's (1935) model by considering a general weight function [gamma](x) such that:
[[summation over (term)].sub.x][gamma](x) [greater than or equal to] 1 (8)
Yamasaki et al. (1995) fitted an exponential distribution on the prime gaps and used the result to prove that:
[mathematical expression not reproducible] (9)
2.0 Prime Depths
We consider primes as pseudo-random quantities. Consequently, the prime gaps (1) will likewise be pseudo-random. Yamasakin et al. (1997) fitted an exponential distribution to the prime gaps
Definition 2. Let {g(x)} be the sequence of prime gaps. The prime depth at level k is given by:
[D.sub.k] = g (n + k) - g(n), k < n (10)
Thus, [D.sub.1] = g(n + 1) - g(n) = [P.sub.n+2] - [P.sub.n+1] - [P.sub.n+1] + [P.sub.n] = [P.sub.n+2] - 2[P.sub.n+1] + [P.sub.n];
[D.sub.2] = g(n + 2) - g(n) = [P.sub.n+3] - [P.sub.n+2] - [P.sub.n+1] + [P.sub.n]; and so on.
We wish to find the distribution of [D.sub.k] for k < n. We need:
Theorem 1: Let X and Y be independent and identically distributed exponential random variables with parameter [beta]. Then, Z = X - Y has a Laplace distribution with location parameter [mu] = 0 and scale parameter [beta].
Proof: The moment generating function of X and Y is given by:
[mathematical expression not reproducible]
Since X and Y are independent, if follows that the moment generating function of Z = X - Y is:
[mathematical expression not reproducible]
which is the moment-generating function of a Laplace random variable.|
The [D.sub.k] are expressed as differences of prime gaps which are exponentially distributed (Yamasaki et al., 1997) and, hence, fall squarely under the ambit of Theorem 1 except for k = 1.
Corollary 1. Let [{[D.sub.2](k)}.sub.k=1.sup.[infinity]] be the prime depths at level 2. Then, [{[D.sub.2](k)}.sub.k=1.sup.[infinity]] has a Laplace distribution with location parameter [mu] = 0 and [beta] ~ log (n) for large n.
Proof: Let [D.sub.2] = g(n + 2) - g(n). For large n, g(n + 2) and g(n) have exponential distributions with parameter log (n + 2) and log (n) respectively. However,
[mathematical expression not reproducible]
showing that g(n + 2) [congruent to] g(n) for n [greater than or equal to] N. It remains to show independence of g(n + k) and g(n).
Now,
g(n + 2) = [P.sub.n+3] - [P.sub.n+2] g(n) = [P.sub.n+1] - [P.sub.n]
It follows that g(n + 2) and g(n) are independent iff k [not equal to] 1. The result follows from Theorem 1.
For k = 1, [D.sub.1]. = g(n + 1) - g(n) is the difference between two identical but dependent exponential variates. Let X = g(n + 1) and Y = g(n), Z = X - Y, then we have the following result.
Theorem 2. Let Z = [D.sub.1] = X - Y where X = g(n + 1) and Y = g(n). Then Z has a symmetric distribution given by:
[mathematical expression not reproducible]
Proof: Symmetry follows easily from:
[mathematical expression not reproducible]
which implies X - Y [??] Y - X.
Next,
[mathematical expression not reproducible]
from which we obtain g(z) = [1/2][beta][e.sup.-[beta]z], z > 0. The other two properties are similarly established.
The depth [D.sub.2] = g(n + 2) - g(n) = [P.sub.n+3] - [P.sub.n+2] - [P.sub.n+1] + [P.sub.n] is the only one that involves four (4) consecutive primes [P.sub.n+3], [P.sub.n+2], [P.sub.n+1], [P.sub.n] while the rest do not. Let {[delta]} be a random variable generated from a Laplace distribution, then
[P.sub.n+3] - [P.sub.n+2] - [P.sub.n+1] + [P.sub.n] = [delta]
which can be written as:
[P.sub.n+3] = [P.sub.n+2] + [P.sub.n+1] - [P.sub.n] + [delta]
or equivalently:
[P.sub.n+1] = [P.sub.n] + [P.sub.n-1] - [P.sub.n-2] + [delta]. (11)
The histograms of [D.sub.1] and [D.sub.2] are shown below.
However, [mathematical expression not reproducible] implying that [P.sub.n+1] - [P.sub.n] = [P.sub.n-1] - [P.sub.n-2], i.e., that consecutive gaps are equal for all n. This is clearly not acceptable and so, we employ the machinery of Laplacian process to ensure [mathematical expression not reproducible] is also random.
3.0 Filtration and Martingale Processes
In order to make use of (11) in predicting [P.sub.n+1], we model the depth [D.sub.2] as a martingale process. This section develops this idea.
Definition 3. Let X [not equal to] [empty set]. A sigma algebra ([sigma]--algebra) F on X is a class of subsets of X such that:
(a) If A [member of] F, then [A.sup.c][member of] F
(b) If A, B [member of] F, then A [union] B [member of] F
(c) X [member of] F.
Definition 4. A sequence {[F.sub.n]: n [member of] N} of [sigma]--algebras such that [F.sub.n] [subset] [F.sub.n+1] for every n[member of]N is called a filtration.
For example, if we have a stochastic process {[X.sub.t]} and observed [X.sub.1], [X.sub.2], ..., [X.sub.n]. Let [F.sub.n]=[sigma]([X.sub.1], [X.sub.2],..., [X.sub.n]) be the smallest [sigma]--algebra that makes the random variable [X.sub.1], [X.sub.2], ..., [X.sub.n]:[OMEGA] [right arrow] [??] measurable. The [sigma]--algebra [F.sub.n] represents the amount of information available up to time n. Next, we connect the idea of a filtration and stochastic processes.
Definition 5. Let {[X.sub.n]:n > 0} be a filtration i.e. [mathematical expression not reproducible] of [sigma]--algebras. We call [X.sub.n] an [mathematical expression not reproducible]--adapted process if [X.sub.n] is [mathematical expression not reproducible]--measurable for all n [greater than or equal to] 0.
In practical terms, we consider a stochastic process indexed by discrete time T[member of]N. For each n, [X.sub.n] is a realization from some random process [mathematical expression not reproducible] e.g. Wiener process, Brownian motion.
Definition 6. A discrete process {[X.sub.n]:n > 0} is called a martingale process adapted to the filtration [mathematical expression not reproducible] if:
(a) [X.sub.n] is adapted to [mathematical expression not reproducible];
(b) E (|[X.sub.n]|) < [infinity] for all n, and
(c) E ([X.sub.n] | [F.sub.n-1] ) = [X.sub.n-1] almost surely for all n [greater than or equal to] 1.
Theorem 3: The sequence {[D.sub.2.sup.n]} adapted to the filtration {[mathematical expression not reproducible]} characterized by a Laplace distribution with [mu] = 0 and scale parameter [beta] = log (n) is a martingale.
Proof:
We verify the properties of a martingale.
(i) For n = 1: [mathematical expression not reproducible]
(ii) For n = 2: [mathematical expression not reproducible]
Assume that it also holds for n = k, that is,
[mathematical expression not reproducible]
Now, for n = k + 1,
[mathematical expression not reproducible]
Therefore, [mathematical expression not reproducible]. Hence, [x.sub.n] is adapted to [mathematical expression not reproducible].
Note that as n [right arrow] [infinity], [mathematical expression not reproducible], thus Laplace process.
(ii) [mathematical expression not reproducible]
(iii) [mathematical expression not reproducible]
4.0 Simulation Results
This section presents the simulation using the Laplacian martingale process' prediction for the (n+1)th prime which is then compared to the prediction using the Prime Number Theorem. We utilized the data set of prime numbers below [10.sup.7] to compute the prime depths.
4.1 Parameter Estimation
The Laplace distribution has two (2) parameters, namely, the location parameter [mu] and the scale parameter b. Given a random sample of size n, the maximum likelihood estimators of these parameters can be explicitly computed as:
[mathematical expression not reproducible] (12)
where the [x.sub.j]'s are the prime depths. By the invariance property of MLE's, it follows that the maximum likelihood estimator of the variance is given by:
[mathematical expression not reproducible] (13)
4.2 Numerical Simulation
The primary equation to be simulated is given by:
pred([P.sub.n+1]) = [P.sub.n] + [P.sub.n-1] - [P.sub.n-2] + [delta] (14)
We used Equation (14) to predict the (n+1)th prime given the previous three (3) primes. Using a Laplace filtration, we generated 100 random |numbers [delta] from a Laplace distribution whose location parameter equals the immediate past prime depth value consistent with the martingale nature of the filtration. The average of the random numbers is obtained:
[mathematical expression not reproducible] (15)
and subsequently plugged into:
pred ([P.sub.n+1]) = [P.sub.n] + [P.sub.n+1] - [P.sub.n-2] + [bar.[delta]] (16)
The absolute error:
Absolute error = |pred([P.sub.n+1]) - actual ([P.sub.n+1])| (17)
is computed and averaged out over 100 repetitions. The mean absolute relative error is computed as the average of:
[mathematical expression not reproducible] (18)
The result of the simulation is summarized in Table 1. As shown, we compared the actual (n+1)th prime and its average predicted value using Laplacian process. The mean absolute error (MAE) and mean absolute relative error (MARE) are also introduced as measures of accuracy.
For comparison purposes, we tabulate the actual primes, PNT predictions of the primes, and Laplacian predictions in Table 2:
Table 2 shows that the Laplacian predictions of the nth primes are almost identical to the actual primes themselves. In contrast, the Prime Number Theorem predictions are off by as much as [10.sup.4]. Graphically, the orange and blue curves representing the actual primes and the Laplacian predictors are almost overlapping while the Prime Number Theorem predictions, in gray, are consistently below the two curves.
Note that the predictions using the Laplacian process are not exactly primes but near the true primes ([P.sub.n+1]). For this reason we define:
Definition 7. The predicted values [mathematical expression not reproducible] using a Laplacian martingale are called near-primes.
The near-primes {[mathematical expression not reproducible]} are within a [delta]-neighborhood of the true prime {[P.sub.n]}. So, it remains to show the magnitude of [delta].
From the defining equation:
[mathematical expression not reproducible]
we obtain
[mathematical expression not reproducible]
where [d.sub.n] = [P.sub.n+1] - [P.sub.n]. We observe that the absolute error increases in the order O (log(n)). However, the relative error:
[mathematical expression not reproducible]
as n [right arrow] [infinity].
Theorem 4.1 Let [mathematical expression not reproducible] be the near-prime prediction of [P.sub.n+1] based on a Laplace martingale process. Then,
[mathematical expression not reproducible]
for large n.
Proof: As outlined above.
LITERATURE CITED
Granvill, A. (1995). Unexpected irregularities in the distribution of prime numbers. Proceedings of the international congress of mathematicians, Vol. 1, 388-399.
Granville, A. (2009). Different approaches to the distribution of primes. Milan J. Math. 78, 1-25.
Granville, A. (1995). Harald cramer and the distribution of prime numbers. Scand. Actuarial J., 1, 12-28
Hardy, G.H. & Littlewood, J.E. (1923). Some problems of "Partitio Numerorum", III: on the expression of a number as a sum of primes. Acta Math, 44, 1-70. Reprinted in "Collected papers of G.H.Hardy", vol.I 561-630. Clarendon Press, Oxford, 1966.
Jones, James P., Sato, Daihachiro, Wada, Hideo, Wiens, Douglas (1976). "Diophantine representation of the set of prime numbers", (American Mathematical Monthly (Mathematical Association of America) 83 (6): 449-464)
Jones, James P. (1982). "Universal Diophantine equation",(Journal of Symbolic Logic 47 (3):549-571)
Lowther, G (2009). Filtrations and adapted processes. (lecture notes retrieved on March 27, 2016 filed under Lecture Notes in Stochastic Calculus)
Matiyasevich, Yuri V. (1999). "Formulas for prime numbers", in Tabachnikov, Serge, (Kvant Selecta: Algebra and Analysis, II, American Mathematical Society, pp. 13-24)
Mills, W. H. (1947). "A prime-representing function" (PDF), Bulletin of the American Mathematical Society 53 (6): 604
Perichon, Benoat (2010). A world record AP26 (Arithmetic progression of 26 primes) (PDF), the AP26 is listed in "Jens Kruse Andersen's primes in arithmetic progression records page"
Pintz, J. (2007). Cramer vs. cramer. On cramer's probabilistic model for primes. Functiones et Approximatio XXXVII. 2, 361-376.
Regimbal, Stephen (1975). "An Explicit Formula for the k-th prime number", (Mathematics Magazine (Mathematical Association of America) 48 (4): 230-232)
Rowland, Eric S. (2008). A natural prime-generating recurrence, (Journal of Integer Sequences 11)
Selberg, A. (1943). On the normal density of primes in small intervals and the difference between consecutive primes. Arch. Math. Naturvid. 47, 87-105.
MARLON S. FRIAS
ORCID No. 0000-0001-8891-3805
marlonfrias121487@gmail.com
University of Science and Technology of Southern Philippines
Cagayan de Oro City, Philippines
ROBERTO N. PADUA
ORCID No. 0000-0002-2054-0835
rnpadua@yahoo.com
Liceo de Cagayan University Cagayan de Oro City, Philippines
doi: http://dx.doi.org/10.7828/ljher.v12i1.959
Table 1. Laplacian Martingale Prediction of the nth prime Actual (n+1)th Average Location Scale N Prime Number Predicted Parameter Parameter Value of [P.sub.n+1] 5000 48619 48618 -2 7.632 10000 104743 104730 -4 8.411 15000 163847 163842 -4 8.899 30000 350381 350358 -24 9.611 100000 1299721 1299729 6 10.865 200000 2750161 2750178 10 11.641 300000 4256249 4256273 2 12.080 400000 5800139 5800114 28 12.365 500000 7368791 7368788 -4 12.586 600000 8960467 8960486 28 12.790 664578 9999991 10000008 6 12.899 Mean Absolute Mean Absolute N Error Relative Error 5000 7.043 0.0001449 10000 14.724 0.0001406 15000 9.230 0.0000563 30000 23.715 0.0000677 100000 13.654 0.0000105 200000 19.783 0.0000072 300000 24.955 0.0000059 400000 25.254 0.0000044 500000 11.951 0.0000016 600000 22.306 0.0000025 664578 20.027 0.0000020 Table 2. PNT versus Laplacian Prediction of the nth Prime N Actual (n+1)th Prime Predicted [P.sub.n+1] Using Number PNT 5000 48619 42595 10000 104743 92114 15000 163847 144248 30000 350381 309280 100000 1299721 1151305 200000 2750161 2441228 300000 4256249 3783475 400000 5800139 5159702 500000 7368791 6561196 600000 8960467 7982825 664578 9999991 8909950 N Laplacian Prediction 5000 48618 10000 104730 15000 163842 30000 350358 100000 1299729 200000 2750178 300000 4256273 400000 5800114 500000 7368788 600000 8960486 664578 10000008
![]() ![]() ![]() ![]() | |
Author: | Frias, Marlon S.; Padua, Roberto N. |
---|---|
Publication: | Liceo Journal of Higher Education Research |
Article Type: | Report |
Date: | Dec 1, 2016 |
Words: | 2858 |
Previous Article: | Influence of Teachers' Beliefs and Attitude towards Action Research. |
Next Article: | Motivational Factors Influencing Civilian Volunteers to Enlist in the Philippine Army. |
Topics: |