Differential equation modeling distribution of primes

Babak

unread,

Jul 16, 2008, 5:44:33 PM7/16/08

to

Hi everyone,

I've come up with a simple way to "derive" Gauss's prime density
function
using probability heuristics, alone. The "derivation" involves
setting up a
nonlinear differential equation whose solution happens to agree
exactly
with Gauss's estimate. I don't know if I've discovered
something interesting, or [more likely] old. At any rate, here's my
argument:

>From my blog: http://babaksjournal.blogspot.com/2008/07/differential-equation-estimating.html

========================================================

I have found a simple way to derive Gauss's estimate of the prime
density function using probability heuristics.

Background

Excerpt From MathWorld: http://mathworld.wolfram.com/PrimeNumberTheorem.html

..
In 1792, when only 15 years old, Gauss proposed that pi(n), the prime
density function

pi(n)∼n/(ln n).

Gauss later refined his estimate to
pi(n)∼Li(n),

where
Li(n) = Integral] (1 / ln x) dx)

..

I have not seen a simple derivation for this estimate, and if it
exists, I am surprised why it is not more widely used in expositions
on the subject of the distribution of primes. What follows is a very
short argument based on a probability model. According to this model,
we'll find that the distribution of primes is governed by a non-linear
differential equation of the form

Q'(x) = - Q(x) Q( sqrt(x) ) / x

whose solution is

Q(x) = 1 / 2 ln x

It's probably the only solution, but then, I don't know how to prove
uniqueness. (Can there be more than one solution to any given non-
linear differential equation, anyway? I suspect not.)

Anyway, to me, what's interesting is how using some specious
probability arguments about the distribution primes I was able to set
up the equation and try a test function inspired by Gauss's estimate
to solve it. This hints at something more meaningful in my Clouseau-
esque accident.

The setup

The setup, I have come to find, is similar in spirit to Harald
Cramer's probabilistic, heuristic arguments for estimating of the
distribution primes (the difference here being that we don't use any
other results from number theory). Here it is..

Equ. 1:
The joint probability of a randomly chosen positive number n not
being divisible by two relative primes p and k is (1 - 1/p)(1 - 1/k).

Equ. 2:
Define Q(x) = Pi(1 - 1/p) taken over all primes p <= x
(Pi stands for the product symbol).

Using Equ.s 1 & 2, vigorous hand waving, and a pinch of salt, we can
say

Equ. 3:
The probability that a randomly chosen positive number x being either
1 or a prime is Q( sqrt(x) ).

What we're saying here is that for x to be prime, it suffices to show
that it is not divisible by any prime less than the square root of x.
Equ. 3 says that in the neighborhood of x, the 'average' distance
between primes is 1 / Q( sqrt(x) ).

Now the n.d.e. above comes from trying to approximate Q(x) using this
probabilistic model. The idea is to use that approximation in order to
estimate a prime counting function:

Integral] Q (sqrt(x)) dx

But we don't have an analytic expression that approximates Q, yet.
Instead of setting up an integral equation, we try a differential
approach. Consider the change in Q as x passes over 2 very large
consecutive primes p1 < p2:

delta(Q) = -Q(p1) / p2 ~ -Q(x) / x
delta(x) ~ 1 / Q( sqrt(x) )

Dividing the top equation by the bottom one, you get the n.d.e. I
described above.

The solution

I had read somewhere how the 15 year old Gauss had been able to come
up with his logarithmic integral for estimating the number of primes
less than n. Was his integral inspired by a similar probabilistic
argument? Maybe, but googling it, I couldn't find much. So, I plugged
in C / ln x and solved for the constant C (=1/2).

Does it mean anything?

I suspect it might, which is why I posted it. I did a bit of cursory
reading on the topic, but alas, I'm an amateur. My claim that

Q(x) = Pi( 1 - 1/p) ~ 1 / 2 ln x

does not agree with Merten's asymptotic formula (1874)

Pi( 1 - 1/p) ~ exp(-C) / ln x,

where C is Euler's constant.

Still, there's something here that piques my nose. My result, when
plugged into the prime counting integral, agrees exactly with Gauss's
estimate:

Integral] Q (sqrt(x)) dx ~ Integral] (1 / ln x) dx

What do you think? Is this interesting, or is this old?

-Babak

tc...@lsa.umich.edu

unread,

Jul 16, 2008, 8:39:35 PM7/16/08

to

In article <ba0292e1-4b25-40b5...@d1g2000hsg.googlegroups.com>,

Babak <far...@gmail.com> wrote:
>we'll find that the distribution of primes is governed by a non-linear
>differential equation of the form
>
>Q'(x) = - Q(x) Q( sqrt(x) ) / x

See

E. M. Wright, A functional equation in the heuristic theory of primes,
Math. Gaz. 44 (1960), 15-16.

G. Hoffman de Visme, "The density of prime numbers," Math. Gaz. 45
(1961), 13-14.
--
Tim Chow tchow-at-alum-dot-mit-dot-edu
The range of our projectiles---even ... the artillery---however great, will
never exceed four of those miles of which as many thousand separate us from
the center of the earth. ---Galileo, Dialogues Concerning Two New Sciences

Dan

unread,

Jul 17, 2008, 5:20:17 PM7/17/08

to

On Jul 16, 4:44 pm, Babak <farh...@gmail.com> wrote:

> I have not seen a simple derivation for this estimate, and if it
> exists, I am surprised why it is not more widely used in expositions
> on the subject of the distribution of primes. What follows is a very
> short argument based on a probability model. According to this model,
> we'll find that the distribution of primes is governed by a non-linear
> differential equation of the form
>
> Q'(x) = - Q(x) Q( sqrt(x) ) / x

Strictly speaking, this is a _functional_ differential
equation. A DE should depend only on the current value of
G(x) and its derivatives. This is a _delay_ differential
equation, since the current value of the derivative depend
on earlier values of G.

DEs tend to be associated with initial conditions that make
the solution unique. On the other hand, a DDE is associated
with an initial _set_ of values that makes the solution unique.

>
> whose solution is
>
> Q(x) = 1 / 2 ln x
>
> It's probably the only solution, but then, I don't know how to prove
> uniqueness. (Can there be more than one solution to any given non-
> linear differential equation, anyway? I suspect not.)

Nonlinear differential equations typically have infinitely
many solutions. An initial condition typically pins it down
to a unique one. A delay differential equation will also
typically have infinitely many solutions and an initial
_range_ of values will typically pin it down to a unique one.

Given any differentiable function q(x) on any interval of
the form [a, a^2] (a > 1) that satisfies the boundary
consistency condition

q'(a^2) = -q(a^2)q(a)/(a^2),

there is a uniquely determined solution Q of

Q'(x)= -Q(x)Q(sqrt(x))/x

obtained on [a^2, a^4] by integrating

Q(x)'/Q(x) = -q(sqrt(x))/x .

Whence, by induction, one obtains a solution for all x > a.
The solution can also be extended to (1,a): obtain it on
(sqrt a, a) by

Q(x) = -x^2 q'(x^2)/q(x^2)

and use induction again.

Dan

A N Niel

unread,

Jul 18, 2008, 8:59:58 AM7/18/08

to

In article
<9f6826b8-f66a-4f3a...@r66g2000hsg.googlegroups.com>,
Dan <luec...@uark.edu> wrote:

> > Q'(x) = - Q(x) Q( sqrt(x) ) / x
>
> Strictly speaking, this is a _functional_ differential
> equation. A DE should depend only on the current value of
> G(x) and its derivatives. This is a _delay_ differential
> equation, since the current value of the derivative depend
> on earlier values of G.

When x<1, sqrt(x) is a *future* time.
When x=1, x=sqrt(x) ... this suggests expanding Q in powers of (x-1) ??

Babak

unread,

Jul 26, 2008, 4:30:02 PM7/26/08

to

Thanks for pointing out that this is in fact a *delay* diff equ.
Re your method of solution..

> Given any differentiable function q(x) on any interval of
> the form [a, a^2] (a > 1) that satisfies the boundary
> consistency condition
>
> q'(a^2) = -q(a^2)q(a)/(a^2),
>
> there is a uniquely determined solution Q of
>
> Q'(x)= -Q(x)Q(sqrt(x))/x
>
> obtained on [a^2, a^4] by integrating
>
> Q(x)'/Q(x) = -q(sqrt(x))/x .
>
> Whence, by induction, one obtains a solution for all x > a.

I'm not following the argument. I can't see how one can obtain
a solution by integrating

Q(x)'/Q(x) = -q(sqrt(x))/x .

Let me try a counter example. Consider

q(x) = - x^2 / 2

Now
q'(a^2) = -q(a^2)q(a)/(a^2)

is satisfied for a = 2. However, integrating

Q'/Q = - x / 2

does not yield a solution to the original delay diff equ.

Can you please explain?

-Babak

Babak

unread,

Jul 26, 2008, 4:30:02 PM7/26/08

to

Thanks for pointing out de Visme's heuristic argument along
a similar line of argument. One thing I noted in my original post
was that though my claim that

Q(x) = Pi( 1 - 1/p) ~ 1 / 2 ln x

does not agree with Merten's asymptotic formula

Pi( 1 - 1/p) ~ exp(-C) / ln x,

where C is Euler's constant, my result *surprisingly* yielded
the correct prime counting integral:

Integral] Q(sqrt(x)) dx = Integral] (1 / ln x) dx

That observation puzzled me, and while attempting to fix my
heuristic model, I discovered that in fact *any* model claiming
that the probability of x being prime is roughly equal to

Pi(1 - 1/p)

taken over all primes p < x^a, a > 0, yields a d.d.e of the form

Q'(x) + Q(x)Q(x^a)/x = 0

which has a solution of the form

Q(x) = a / ln x .

and which when plugged into the prime counting integral gives

Integral] Q(x^a) dx ~ Integral] (1 / ln x) dx

So one surprise is replaced with another: it doesn't matter whether
the model looks into the past (a < 1), or into the future (a > 1)--the
resulting prime counting integral is the same.

On Jul 16, 6:39 pm, tc...@lsa.umich.edu wrote:
> In article <ba0292e1-4b25-40b5-9d61-7581108bc...@d1g2000hsg.googlegroups.com>,

Dan

unread,

Aug 3, 2008, 8:00:02 AM8/3/08

to

On Jul 26, 3:30 pm, Babak <farh...@gmail.com> wrote:
> Thanks for pointing out that this is in fact a *delay* diff equ.
> Re your method of solution..
>
> > Given any differentiable function q(x) on any interval of
> > the form [a, a^2] (a > 1) that satisfies the boundary
> > consistency condition
>
> > q'(a^2) = -q(a^2)q(a)/(a^2),
>
> > there is a uniquely determined solution Q of
>
> > Q'(x)= -Q(x)Q(sqrt(x))/x
>
> > obtained on [a^2, a^4] by integrating
>
> > Q(x)'/Q(x) = -q(sqrt(x))/x .
>
> > Whence, by induction, one obtains a solution for all x > a.
>
> I'm not following the argument. I can't see how one can obtain
> a solution by integrating
>
> Q(x)'/Q(x) = -q(sqrt(x))/x .

If a^2 <= x <= a^4, the a <= sqrt(x) <= a^2. We let Q(x) = q(x) for
a <= x <= a^2 and for a^2 <= x <- a^4 we obtain Q(x) by integrating
the above equation and using the obvious initial condition. Q(a^2) =
q(a^2).
Clearly then
Q'(x)/Q(x) = - Q(sqrt(x))/x
for x in [a^2,a^4] and so we have Q defined on [a,a^4]. Perform the
same steps for x in [a^4,a^8] using the values obtained on [[a^2,a^4],
etc.

>
> Let me try a counter example. Consider
>
> q(x) = - x^2 / 2
>
> Now
> q'(a^2) = -q(a^2)q(a)/(a^2)
>
> is satisfied for a = 2. However, integrating
>
> Q'/Q = - x / 2
>
> does not yield a solution to the original delay diff equ.

But -q(sqrt(x))/x = 1/2 so one needs to integrate 1/2.
With this correction it yields ln Q(x) = x/2 + C or

Q(x) = C exp(x/2).

Using the initial condition Q(4) = q(4) = -8 we get C = -8 exp(-2)
whence

Q(x) = -x^2/2, on [2,4]
Q(x) = -8 exp (x/2 - 2), on [4,16].

which clearly satisfies the equation for all x in [4,16].
for x in the interval [16,256] we would integrate
Q'(x)/Q(x) = 8 exp (sqrt(x)/2 - 2) / x
and use the initial condition Q(16) = -8 exp(6).

In order to get the equation satisfied on [2,4] we need
Q defined on [sqrt(2), 2] one again uses the equation to
define it: Q(sqrt(x)) = -x q'(x)/q(x), or

Q(x) = - x^2 q'(x^2)/q(x^2)
= - 2, x in [sqrt(2),2].

Unfortunately, Q fails to be differentiable at 2. It also
seems we cannot continue to [2^{1/4},2^{1/2}] without
sacrificing continuity. However, getting a solution
on [4,infinity) is no problem and the proceedure is
standard for this type of delay differential equation.

Better?

Dan