Printer Friendly

On Prime Depth as an Analytic Tool for Prime Number Generation.


A new concept called prime depth at level k is introduced and used as a tool for constructing a martingale process for predicting the (n+1)th prime, [P.sub.n+1], given [P.sub.1], [P.sub.2],..., [P.sub.n]. The prime depths at level 2 was shown to obey a Laplace distribution with location parameter at 0 and scale parameter [beta]. The scale parameter is the same scale parameter of the exponential distribution describing the prime gaps. The estimator of the (n+1)th prime is given by the stochastic equation: [P.sub.n+1] = [P.sub.n] + [P.sub.n+1] - [P.sub. n-2] + [delta] where [delta] is a martingale with Laplacian filtration.

Keywords: Prime depth at level k, Laplace distribution, martingale process, filtration


We introduce the concept of a prime depth at level k and use the concept to predict the (n+1)th prime, [P.sub.n+1], given the sequence {[P.sub.1], [P.sub.2],..., [P.sub.n]} of primes.

Definition 1. Let [P.sub.1], [P.sub.2],...,[P.sub.n] be the sequence of primes. The nth prime gap is:

g(n) = [P.sub.n+1] - [P.sub.n], n [greater than or equal to] 1 (1)

From (1), it follows that:|

[P.sub.n+1] = [P.sub.n] + g(n) (2)

and by recursion on (2), we obtain:

[P.sub.n+1] = g(1)+g(2)+ [ctdot] + g(n) (3)

Knowing the prime gaps {g(n)} allows us to express the nth prime [P.sub.n] as the sum of the previous prime gaps.

Prime gaps can be arbitrarily large (Tao 2013). For any n>0, however, Bertrand's Postulate states that:

g(n) = [P.sub.n+1] - [P.sub.n] [less than or equal to] [P.sub.n] (4)

Statistical upper bounds on prime gaps had been found. Yitang Zhang (2013) stated:

g(n) = [P.sub.n+1] - [P.sub.n] [less than or equal to] 7 x [10.sup.7] infinitely often (5)

while the Polymath Project of Tao (2013) improved this to:

g(n) = [P.sub.n+1] - [P.sub.n] [less than or equal to] 6 (6)

Using Sieve Theory, Tao postulated that (6) may be the best upper bound that can be produced by Sieve analysis.

The unpredictability of the appearance of primes led to stochastic approaches in the study of prime gaps. Cramer (1935) first proposed the introduction of a uniform probability model on a subset of {p | 0 < p [less than or equal to] x} and demonstrated that prime gaps {g(x)} obey a Poisson process such that:

E(g(x)) ~ log x (7)

V(g(x)) ~ [log.sup.2] x

Selberg (1948) generalized Cramer's (1935) model by considering a general weight function [gamma](x) such that:

[[summation over (term)].sub.x][gamma](x) [greater than or equal to] 1 (8)

Yamasaki et al. (1995) fitted an exponential distribution on the prime gaps and used the result to prove that:

[mathematical expression not reproducible] (9)

2.0 Prime Depths

We consider primes as pseudo-random quantities. Consequently, the prime gaps (1) will likewise be pseudo-random. Yamasakin et al. (1997) fitted an exponential distribution to the prime gaps

Definition 2. Let {g(x)} be the sequence of prime gaps. The prime depth at level k is given by:

[D.sub.k] = g (n + k) - g(n), k < n (10)

Thus, [D.sub.1] = g(n + 1) - g(n) = [P.sub.n+2] - [P.sub.n+1] - [P.sub.n+1] + [P.sub.n] = [P.sub.n+2] - 2[P.sub.n+1] + [P.sub.n];

[D.sub.2] = g(n + 2) - g(n) = [P.sub.n+3] - [P.sub.n+2] - [P.sub.n+1] + [P.sub.n]; and so on.

We wish to find the distribution of [D.sub.k] for k < n. We need:

Theorem 1: Let X and Y be independent and identically distributed exponential random variables with parameter [beta]. Then, Z = X - Y has a Laplace distribution with location parameter [mu] = 0 and scale parameter [beta].

Proof: The moment generating function of X and Y is given by:

[mathematical expression not reproducible]

Since X and Y are independent, if follows that the moment generating function of Z = X - Y is:

[mathematical expression not reproducible]

which is the moment-generating function of a Laplace random variable.|

The [D.sub.k] are expressed as differences of prime gaps which are exponentially distributed (Yamasaki et al., 1997) and, hence, fall squarely under the ambit of Theorem 1 except for k = 1.

Corollary 1. Let [{[D.sub.2](k)}.sub.k=1.sup.[infinity]] be the prime depths at level 2. Then, [{[D.sub.2](k)}.sub.k=1.sup.[infinity]] has a Laplace distribution with location parameter [mu] = 0 and [beta] ~ log (n) for large n.

Proof: Let [D.sub.2] = g(n + 2) - g(n). For large n, g(n + 2) and g(n) have exponential distributions with parameter log (n + 2) and log (n) respectively. However,

[mathematical expression not reproducible]

showing that g(n + 2) [congruent to] g(n) for n [greater than or equal to] N. It remains to show independence of g(n + k) and g(n).


g(n + 2) = [P.sub.n+3] - [P.sub.n+2] g(n) = [P.sub.n+1] - [P.sub.n]

It follows that g(n + 2) and g(n) are independent iff k [not equal to] 1. The result follows from Theorem 1.

For k = 1, [D.sub.1]. = g(n + 1) - g(n) is the difference between two identical but dependent exponential variates. Let X = g(n + 1) and Y = g(n), Z = X - Y, then we have the following result.

Theorem 2. Let Z = [D.sub.1] = X - Y where X = g(n + 1) and Y = g(n). Then Z has a symmetric distribution given by:

[mathematical expression not reproducible]

Proof: Symmetry follows easily from:

[mathematical expression not reproducible]

which implies X - Y [??] Y - X.


[mathematical expression not reproducible]

from which we obtain g(z) = [1/2][beta][e.sup.-[beta]z], z > 0. The other two properties are similarly established.

The depth [D.sub.2] = g(n + 2) - g(n) = [P.sub.n+3] - [P.sub.n+2] - [P.sub.n+1] + [P.sub.n] is the only one that involves four (4) consecutive primes [P.sub.n+3], [P.sub.n+2], [P.sub.n+1], [P.sub.n] while the rest do not. Let {[delta]} be a random variable generated from a Laplace distribution, then

[P.sub.n+3] - [P.sub.n+2] - [P.sub.n+1] + [P.sub.n] = [delta]

which can be written as:

[P.sub.n+3] = [P.sub.n+2] + [P.sub.n+1] - [P.sub.n] + [delta]

or equivalently:

[P.sub.n+1] = [P.sub.n] + [P.sub.n-1] - [P.sub.n-2] + [delta]. (11)

The histograms of [D.sub.1] and [D.sub.2] are shown below.

However, [mathematical expression not reproducible] implying that [P.sub.n+1] - [P.sub.n] = [P.sub.n-1] - [P.sub.n-2], i.e., that consecutive gaps are equal for all n. This is clearly not acceptable and so, we employ the machinery of Laplacian process to ensure [mathematical expression not reproducible] is also random.

3.0 Filtration and Martingale Processes

In order to make use of (11) in predicting [P.sub.n+1], we model the depth [D.sub.2] as a martingale process. This section develops this idea.

Definition 3. Let X [not equal to] [empty set]. A sigma algebra ([sigma]--algebra) F on X is a class of subsets of X such that:

(a) If A [member of] F, then [A.sup.c][member of] F

(b) If A, B [member of] F, then A [union] B [member of] F

(c) X [member of] F.

Definition 4. A sequence {[F.sub.n]: n [member of] N} of [sigma]--algebras such that [F.sub.n] [subset] [F.sub.n+1] for every n[member of]N is called a filtration.

For example, if we have a stochastic process {[X.sub.t]} and observed [X.sub.1], [X.sub.2], ..., [X.sub.n]. Let [F.sub.n]=[sigma]([X.sub.1], [X.sub.2],..., [X.sub.n]) be the smallest [sigma]--algebra that makes the random variable [X.sub.1], [X.sub.2], ..., [X.sub.n]:[OMEGA] [right arrow] [??] measurable. The [sigma]--algebra [F.sub.n] represents the amount of information available up to time n. Next, we connect the idea of a filtration and stochastic processes.

Definition 5. Let {[X.sub.n]:n > 0} be a filtration i.e. [mathematical expression not reproducible] of [sigma]--algebras. We call [X.sub.n] an [mathematical expression not reproducible]--adapted process if [X.sub.n] is [mathematical expression not reproducible]--measurable for all n [greater than or equal to] 0.

In practical terms, we consider a stochastic process indexed by discrete time T[member of]N. For each n, [X.sub.n] is a realization from some random process [mathematical expression not reproducible] e.g. Wiener process, Brownian motion.

Definition 6. A discrete process {[X.sub.n]:n > 0} is called a martingale process adapted to the filtration [mathematical expression not reproducible] if:

(a) [X.sub.n] is adapted to [mathematical expression not reproducible];

(b) E (|[X.sub.n]|) < [infinity] for all n, and

(c) E ([X.sub.n] | [F.sub.n-1] ) = [X.sub.n-1] almost surely for all n [greater than or equal to] 1.

Theorem 3: The sequence {[D.sub.2.sup.n]} adapted to the filtration {[mathematical expression not reproducible]} characterized by a Laplace distribution with [mu] = 0 and scale parameter [beta] = log (n) is a martingale.


We verify the properties of a martingale.

(i) For n = 1: [mathematical expression not reproducible]

(ii) For n = 2: [mathematical expression not reproducible]

Assume that it also holds for n = k, that is,

[mathematical expression not reproducible]

Now, for n = k + 1,

[mathematical expression not reproducible]

Therefore, [mathematical expression not reproducible]. Hence, [x.sub.n] is adapted to [mathematical expression not reproducible].

Note that as n [right arrow] [infinity], [mathematical expression not reproducible], thus Laplace process.

(ii) [mathematical expression not reproducible]

(iii) [mathematical expression not reproducible]

4.0 Simulation Results

This section presents the simulation using the Laplacian martingale process' prediction for the (n+1)th prime which is then compared to the prediction using the Prime Number Theorem. We utilized the data set of prime numbers below [10.sup.7] to compute the prime depths.

4.1 Parameter Estimation

The Laplace distribution has two (2) parameters, namely, the location parameter [mu] and the scale parameter b. Given a random sample of size n, the maximum likelihood estimators of these parameters can be explicitly computed as:

[mathematical expression not reproducible] (12)

where the [x.sub.j]'s are the prime depths. By the invariance property of MLE's, it follows that the maximum likelihood estimator of the variance is given by:

[mathematical expression not reproducible] (13)

4.2 Numerical Simulation

The primary equation to be simulated is given by:

pred([P.sub.n+1]) = [P.sub.n] + [P.sub.n-1] - [P.sub.n-2] + [delta] (14)

We used Equation (14) to predict the (n+1)th prime given the previous three (3) primes. Using a Laplace filtration, we generated 100 random |numbers [delta] from a Laplace distribution whose location parameter equals the immediate past prime depth value consistent with the martingale nature of the filtration. The average of the random numbers is obtained:

[mathematical expression not reproducible] (15)

and subsequently plugged into:

pred ([P.sub.n+1]) = [P.sub.n] + [P.sub.n+1] - [P.sub.n-2] + [bar.[delta]] (16)

The absolute error:

Absolute error = |pred([P.sub.n+1]) - actual ([P.sub.n+1])| (17)

is computed and averaged out over 100 repetitions. The mean absolute relative error is computed as the average of:

[mathematical expression not reproducible] (18)

The result of the simulation is summarized in Table 1. As shown, we compared the actual (n+1)th prime and its average predicted value using Laplacian process. The mean absolute error (MAE) and mean absolute relative error (MARE) are also introduced as measures of accuracy.

For comparison purposes, we tabulate the actual primes, PNT predictions of the primes, and Laplacian predictions in Table 2:

Table 2 shows that the Laplacian predictions of the nth primes are almost identical to the actual primes themselves. In contrast, the Prime Number Theorem predictions are off by as much as [10.sup.4]. Graphically, the orange and blue curves representing the actual primes and the Laplacian predictors are almost overlapping while the Prime Number Theorem predictions, in gray, are consistently below the two curves.

Note that the predictions using the Laplacian process are not exactly primes but near the true primes ([P.sub.n+1]). For this reason we define:

Definition 7. The predicted values [mathematical expression not reproducible] using a Laplacian martingale are called near-primes.

The near-primes {[mathematical expression not reproducible]} are within a [delta]-neighborhood of the true prime {[P.sub.n]}. So, it remains to show the magnitude of [delta].

From the defining equation:

[mathematical expression not reproducible]

we obtain

[mathematical expression not reproducible]

where [d.sub.n] = [P.sub.n+1] - [P.sub.n]. We observe that the absolute error increases in the order O (log(n)). However, the relative error:

[mathematical expression not reproducible]

as n [right arrow] [infinity].

Theorem 4.1 Let [mathematical expression not reproducible] be the near-prime prediction of [P.sub.n+1] based on a Laplace martingale process. Then,

[mathematical expression not reproducible]

for large n.

Proof: As outlined above.


Granvill, A. (1995). Unexpected irregularities in the distribution of prime numbers. Proceedings of the international congress of mathematicians, Vol. 1, 388-399.

Granville, A. (2009). Different approaches to the distribution of primes. Milan J. Math. 78, 1-25.

Granville, A. (1995). Harald cramer and the distribution of prime numbers. Scand. Actuarial J., 1, 12-28

Hardy, G.H. & Littlewood, J.E. (1923). Some problems of "Partitio Numerorum", III: on the expression of a number as a sum of primes. Acta Math, 44, 1-70. Reprinted in "Collected papers of G.H.Hardy", vol.I 561-630. Clarendon Press, Oxford, 1966.

Jones, James P., Sato, Daihachiro, Wada, Hideo, Wiens, Douglas (1976). "Diophantine representation of the set of prime numbers", (American Mathematical Monthly (Mathematical Association of America) 83 (6): 449-464)

Jones, James P. (1982). "Universal Diophantine equation",(Journal of Symbolic Logic 47 (3):549-571)

Lowther, G (2009). Filtrations and adapted processes. (lecture notes retrieved on March 27, 2016 filed under Lecture Notes in Stochastic Calculus)

Matiyasevich, Yuri V. (1999). "Formulas for prime numbers", in Tabachnikov, Serge, (Kvant Selecta: Algebra and Analysis, II, American Mathematical Society, pp. 13-24)

Mills, W. H. (1947). "A prime-representing function" (PDF), Bulletin of the American Mathematical Society 53 (6): 604

Perichon, Benoat (2010). A world record AP26 (Arithmetic progression of 26 primes) (PDF), the AP26 is listed in "Jens Kruse Andersen's primes in arithmetic progression records page"

Pintz, J. (2007). Cramer vs. cramer. On cramer's probabilistic model for primes. Functiones et Approximatio XXXVII. 2, 361-376.

Regimbal, Stephen (1975). "An Explicit Formula for the k-th prime number", (Mathematics Magazine (Mathematical Association of America) 48 (4): 230-232)

Rowland, Eric S. (2008). A natural prime-generating recurrence, (Journal of Integer Sequences 11)

Selberg, A. (1943). On the normal density of primes in small intervals and the difference between consecutive primes. Arch. Math. Naturvid. 47, 87-105.


ORCID No. 0000-0001-8891-3805

University of Science and Technology of Southern Philippines

Cagayan de Oro City, Philippines


ORCID No. 0000-0002-2054-0835

Liceo de Cagayan University Cagayan de Oro City, Philippines

Table 1. Laplacian Martingale Prediction of the nth prime

        Actual (n+1)th    Average     Location    Scale
   N     Prime Number    Predicted   Parameter   Parameter
                          Value of

  5000     48619           48618        -2        7.632
 10000    104743          104730        -4        8.411
 15000    163847          163842        -4        8.899
 30000    350381          350358       -24        9.611
100000   1299721         1299729         6       10.865
200000   2750161         2750178        10       11.641
300000   4256249         4256273         2       12.080
400000   5800139         5800114        28       12.365
500000   7368791         7368788        -4       12.586
600000   8960467         8960486        28       12.790
664578   9999991        10000008         6       12.899

        Mean Absolute   Mean Absolute
   N        Error      Relative Error

  5000    7.043         0.0001449
 10000   14.724         0.0001406
 15000    9.230         0.0000563
 30000   23.715         0.0000677
100000   13.654         0.0000105
200000   19.783         0.0000072
300000   24.955         0.0000059
400000   25.254         0.0000044
500000   11.951         0.0000016
600000   22.306         0.0000025
664578   20.027         0.0000020

Table 2. PNT versus Laplacian Prediction of the nth Prime

     N  Actual (n+1)th Prime  Predicted [P.sub.n+1] Using
                Number                      PNT

  5000                 48619                        42595
 10000                104743                        92114
 15000                163847                       144248
 30000                350381                       309280
100000               1299721                      1151305
200000               2750161                      2441228
300000               4256249                      3783475
400000               5800139                      5159702
500000               7368791                      6561196
600000               8960467                      7982825
664578               9999991                      8909950

     N  Laplacian Prediction

  5000                 48618
 10000                104730
 15000                163842
 30000                350358
100000               1299729
200000               2750178
300000               4256273
400000               5800114
500000               7368788
600000               8960486
664578              10000008
COPYRIGHT 2016 Liceo de Cagayan University
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2016 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Author:Frias, Marlon S.; Padua, Roberto N.
Publication:Liceo Journal of Higher Education Research
Article Type:Report
Date:Dec 1, 2016
Previous Article:Influence of Teachers' Beliefs and Attitude towards Action Research.
Next Article:Motivational Factors Influencing Civilian Volunteers to Enlist in the Philippine Army.

Terms of use | Privacy policy | Copyright © 2022 Farlex, Inc. | Feedback | For webmasters |