A proof that Euler missed...


№ 1 (1979) · pp.195–203


Apéry's Proof of the Irrationality of ζ(3)

An Informal Report of Alfred van der Poorten


1. Journées Arithmétiques de Marseille–Luminy, June, 1978

The board of programme changes informed us that R. Apéry (Caen) would speak Thursday, 14:00 «Sur l'irrationalité de ζ(3)». Though there had been earlier rumours of his claiming a proof, scepticism was general. The lecture tended to strengthen this view to rank disbelief. Those who listened casually, or who were afflicted with being non-Francophone, appeared to hear only a sequence of unlikely assertions.

EXERCISE

Prove the following amazing claims:

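1 For all a₁, a₂, ...

$$\sum_{k=1}^{\infty} \frac{a_1 a_2 \cdots a_{k-1}}{(x+a_1)(x+a_2)\cdots(x+a_k)} = \frac{1}{x}.$$

2

$$\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} = \frac{5}{2}\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^3\binom{2n}{n}}. \tag{1}$$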

3 Consider the recursion:

n³u_n + (n – 1)³u_n–2 = (34n³ – 51n² + 27n – 5)u_n–1, n ≥ 2.

(2)

Let {b_n} be the sequence defined by b₀ = 1, b₁ = 5, and b_n = u_n for all n; then the b_n are integers! Let {a_n} be the sequence defined by a₀ = 0, a₁ = 6, a_n = u_n for all n; then the a_n are rational numbers with denominator dividing 2[1, 2, ..., n]³ (here [1, 2, ..., n] is the LCM (lowest common multiple) of 1, 2, ..., n).

4 a_n/b_n → ζ(3); indeed the convergence is so fast as to prove that ζ(3) cannot be rational. To be precise, for all integers p, q with q sufficiently large relative to ε > 0

$$\left|\zeta(3) - \frac{p}{q}\right| > \frac{1}{q^{\theta+\varepsilon}}, \qquad \theta = 13.417820\ldots.$$

Moreover, analogous claims were made for ζ(2):

2'

$$\zeta(2) = \sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6} = 3\sum_{n=1}^{\infty}\frac{1}{n^2\binom{2n}{n}}. \tag{3}$$

3' Consider the recursion:

n²u_n – (n – 1)²u_{n–2} = (11n² – 11n + 3)u_{n–1}, n ≥ 2. (4)

Let {B_n} be the sequence defined by B₀ = 1, B₁ = 3, and B_n = u_n for all n; then the B_n all are integers! Let {A_n} be the sequence defined by A₀ = 0, A₁ = 5, A_n = u_n for all n; then the A_n are rational numbers with denominator dividing [1, 2, ..., n]².

4' A_n/B_n → ζ(2); indeed the convergence is so fast as to prove that ζ(2) cannot be rational. To be precise, for all integers p, q with q sufficiently large relative to ε > 0

$$\left|\pi^2 - \frac{p}{q}\right| > \frac{1}{q^{11.850782\ldots+\varepsilon}}.$$

I heard with some incredulity that, for one, Henri Cohen (Grenoble) believed that these claims might well be valid. Very much intrigued, I joined Hendrik Lenstra (Amsterdam) and Cohen in an evening's discussion, in which Cohen explained and demonstrated most of the details of the proof. We came away convinced that Professor Apéry had indeed found a quite miraculous and magnificent demonstration of the irrationality of ζ(3). But we remained unable to prove a critical step.

2. For the Nonexpert Reader

A number β is irrational if it is not of the form p₀/q₀ with p₀, q₀ integers (∈ ℤ). A rational number b is characterised by the property that for p, q ∈ ℤ (q > 0) and b ≠ p/q there exists an integer q₀ (> 0, of course) such that |b – p/q| ≥ 1/qq₀. On the other hand, for irrational β there are always infinitely many p/q (for instance, the convergents of the continued fraction expansion of β) such that |β – p/q| < 1/q². Plainly this yields a criterion for irrationality: if there is a δ > 0 and a sequence {p_n/q_n} of rational numbers such that p_n/q_n ≠ β and |β – p_n/q_n| < 1/q_n^{1+δ}, n ∈ ℕ, then β is irrational.

A successful application of the criterion may yield a measure of irrationality: if |β – p_n/q_n| < 1/q_n^{1+δ}, and the q_n are monotonic increasing with q_n < q_{n–1}^{1+k} (for n sufficiently large relative to k > 0), then for all integers p, q > 0 (with q sufficiently large relative to ε > 0),

$$\left|\beta - \frac{p}{q}\right| > \frac{1}{q^{(1+\delta)/(\delta-k)+\varepsilon}}.$$

For example, if the sequence {q_n} increases geometrically we may take k > 0 arbitrarily small, so that 1 + 1/δ becomes an irrationality degree for β. To see the claim suppose that |β – p/q| ≤ 1/q^σ and select n so that q_{n–1}^{1+δ} ≤ q^σ ≤ q_n^{1+δ}. Then

$$\frac{1}{qq_n} \le \left|\frac{p}{q} - \frac{p_n}{q_n}\right| \le \left|\beta - \frac{p_n}{q_n}\right| + \left|\beta - \frac{p}{q}\right| \le \frac{1}{q_n^{1+\delta}} + \frac{1}{q^{\sigma}} \le \frac{2}{q^{\sigma}}.$$

Hence ½q^σ ≤ qq_n < qq_{n–1}^{1+k} ≤ q^{1+σ(1+k)/(1+δ)}, or σ < (1+δ)/(δ – k) + ε as claimed. This argument is effective (the "sufficiently large" requirements can be made explicit).
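As a small illustration of the criterion in the geometric case (k arbitrarily small), the following sketch computes the exponent 1 + 1/δ from two growth rates. The function name `irrationality_exponent` and its parameters are hypothetical; the two sample inputs use the growth rates quoted later in this report for ζ(3) and π², so the numbers are only as good as those estimates.

```python
import math


def irrationality_exponent(log_q_rate: float, log_err_rate: float) -> float:
    """Exponent 1 + 1/delta from the criterion, for geometrically growing q_n.

    Assumes q_n grows like exp(n * log_q_rate) and |beta - p_n/q_n| decays
    like exp(-n * log_err_rate), so the error is about q_n^-(1 + delta).
    """
    delta = log_err_rate / log_q_rate - 1.0
    if delta <= 0:
        raise ValueError("delta must be positive for the criterion to apply")
    return 1.0 + 1.0 / delta


# zeta(3): q_n = O(alpha^(4n) e^(3n)), error O(alpha^(-8n)), alpha = 1 + sqrt(2)
log_alpha = math.log(1 + math.sqrt(2))
zeta3_exponent = irrationality_exponent(4 * log_alpha + 3, 8 * log_alpha)

# pi^2: q_n = O(omega^(5n) e^(2n)), error O(omega^(-10n)), omega = (1 + sqrt(5))/2
log_omega = math.log((1 + math.sqrt(5)) / 2)
pi2_exponent = irrationality_exponent(5 * log_omega + 2, 10 * log_omega)

print(zeta3_exponent, pi2_exponent)
```

The e^(3n) and e^(2n) factors stand for the cube and square of [1, 2, ..., n], using the rough bound [1, 2, ..., n] ≤ e^{n(1+ε)}.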

It is well known (the theorem of Thue–Siegel–Roth) that for β algebraic (a zero of a polynomial a₀Xⁿ + a₁X^{n–1} + ... + a_n, a_i ∈ ℤ) always |β – p/q| > 1/q^{2+ε}, for q sufficiently large relative to ε > 0. So, if β is too well approximable by rationals (δ > 1 above) then β is not algebraic, but transcendental. Unfortunately, only a set of measure zero of transcendental numbers can be detected in this way, whilst since the set of algebraic numbers is countable, almost all numbers are transcendental. It is notoriously difficult to prove that any given naturally occurring number is irrational, let alone transcendental. One may be fortunate: for example the usual series for e implies immediately (easy exercise) that e is irrational. In the case of the Riemann ζ-function:

$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s} \qquad (\operatorname{Re} s > 1)$$

there is the quite well-known fact that

$$\zeta(2k) = \sum_{n=1}^{\infty}\frac{1}{n^{2k}} = \frac{(-1)^{k-1}(2\pi)^{2k}}{2\,(2k)!}\,B_{2k}, \qquad k \in \mathbb{N}, \tag{5}$$

where the Bernoulli numbers, B_m, are rational (ζ(2) = π²/6, ζ(4) = π⁴/90, ζ(6) = π⁶/945, ...). There are some classical techniques (see [1]) for detecting the irrationality of powers of π, but it is most useful to appeal to the theorem of Hermite–Lindemann (whereby e^α is transcendental for algebraic α ≠ 0) whence π is transcendental (because e^{πi} = –1) and so a fortiori its powers π^k are irrational, k ∈ ℕ. On the other hand there are no useful analogous closed evaluations of ζ at odd arguments. There is however a famous formula of Ramanujan: let α and β be positive numbers such that αβ = π². Then if n is any positive integer

$$\frac{1}{\alpha^{n}}\left(\frac{\zeta(2n+1)}{2} + \sum_{k=1}^{\infty}\frac{1}{k^{2n+1}(e^{2\alpha k}-1)}\right) = \frac{(-1)^n}{\beta^{n}}\left(\frac{\zeta(2n+1)}{2} + \sum_{k=1}^{\infty}\frac{1}{k^{2n+1}(e^{2\beta k}-1)}\right) - 2^{2n}\sum_{k=0}^{n+1}(-1)^k\frac{B_{2k}}{(2k)!}\,\frac{B_{2n+2-2k}}{(2n+2-2k)!}\,\alpha^{n+1-k}\beta^{k}.$$

Taking α a rational multiple of π one sees that ζ(2n+1) is given as a rational multiple of π^{2n+1} plus two very rapidly convergent series. (See for example [2]. Indeed the above formula is the natural analogue of Euler's formula (5). The cited paper gives many other formulas and detailed references.) Incidentally, (5) is demonstrated quite easily. The Bernoulli numbers are defined by the generating function (a nontrivial example of an even function!)

$$\frac{z}{e^z-1} + \frac{z}{2} = \sum_{m=0}^{\infty}\frac{B_{2m}}{(2m)!}\,z^{2m},$$

hence by the recursion

$$\binom{n}{0}B_0 + \binom{n}{1}B_1 + \cdots + \binom{n}{n-1}B_{n-1} = 0, \qquad B_0 = 1,\; B_1 = -\tfrac12, \qquad n = 3, 4, \ldots.$$

On the other hand it is well known that

$$\sin \pi z = \pi z\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2}\right),$$

so

$$\frac{\pi\cos\pi z}{\sin\pi z} = \pi\cot\pi z = \frac{1}{z} - \sum_{n=1}^{\infty}\frac{2z}{n^2-z^2}.$$

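But

$$\pi z\cot\pi z = \pi i z\,\frac{e^{\pi i z}+e^{-\pi i z}}{e^{\pi i z}-e^{-\pi i z}} = \frac{2\pi i z}{e^{2\pi i z}-1} + \pi i z = \sum_{m=0}^{\infty}(-1)^m\frac{(2\pi)^{2m}}{(2m)!}\,B_{2m}z^{2m},$$

and on the other hand

$$\pi z\cot\pi z = 1 - 2\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{z^{2m}}{n^{2m}} = 1 - 2\sum_{m=1}^{\infty}\zeta(2m)z^{2m}.$$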

Comparing coefficients one has (5). With a little ingenuity one can avoid a direct appeal to the infinite product for sin πz or to the expansion for π cot πz. (For a detailed set of references, and some new proofs, see [3].) Indeed proving the irrationality of ζ(2n+1), n ∈ ℕ, constitutes one of the outstanding problems of the theory (ranking with the arithmetic nature of

$$\gamma = \lim_{n\to\infty}\left(\sum_{k=1}^{n}\frac{1}{k} - \ln n\right),$$

and of eπ, e + π, ... which are yet undetermined).

It is some measure of Apéry's achievement that these questions have been considered by mathematicians of the top rank over the past few centuries without much success being achieved.

3. Some Irrelevant Explanations

For many of the following details I am indebted to Henri Cohen. All this is due to Apéry, of course. The identity

$$\sum_{k=1}^{K}\frac{a_1a_2\cdots a_{k-1}}{(x+a_1)(x+a_2)\cdots(x+a_k)} = \frac{1}{x} - \frac{a_1a_2\cdots a_K}{x(x+a_1)(x+a_2)\cdots(x+a_K)}$$

follows easily on writing the right-hand side as A₀ – A_K and noting that each term on the left is A_k–1 – A_k. This explains 1 . Now put x = n², a_k = –k², and take k ≤ K ≤ n – 1, to obtain

$$\sum_{k=1}^{n-1}\frac{(-1)^{k-1}(k-1)!^2}{(n^2-1^2)\cdots(n^2-k^2)} = \frac{1}{n^2} - \frac{(-1)^{n-1}(n-1)!^2}{n^2(n^2-1^2)\cdots(n^2-(n-1)^2)} = \frac{1}{n^2} - \frac{2(-1)^{n-1}}{n^2\binom{2n}{n}}.$$

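Writing

$$\varepsilon_{n,k} = \frac{1}{2}\,\frac{k!^2\,(n-k)!}{k^3\,(n+k)!},$$

because

$$(-1)^k\,n\,(\varepsilon_{n,k}-\varepsilon_{n-1,k}) = \frac{(-1)^{k-1}(k-1)!^2}{(n^2-1^2)\cdots(n^2-k^2)},$$

we have

$$\sum_{n=1}^{N}\sum_{k=1}^{n-1}(-1)^k(\varepsilon_{n,k}-\varepsilon_{n-1,k}) = \sum_{n=1}^{N}\frac{1}{n^3} - 2\sum_{n=1}^{N}\frac{(-1)^{n-1}}{n^3\binom{2n}{n}}$$

$$= \sum_{k=1}^{N}(-1)^k(\varepsilon_{N,k}-\varepsilon_{k,k}) = \frac{1}{2}\sum_{k=1}^{N}\frac{(-1)^k}{k^3\binom{N+k}{k}\binom{N}{k}} - \frac{1}{2}\sum_{k=1}^{N}\frac{(-1)^k}{k^3\binom{2k}{k}}$$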

and on noting that as N → ∞ the first term on the right vanishes, we have 2. Actually the formula 2 is quite well known: it was observed some years ago by Raymond Ayoub (Penn. State) and it in fact appears in [4]; independently again it was noticed by R. William Gosper, Jr. (Palo Alto) in [5]. Henri Cohen remarked that the formula is equivalent to

$$\zeta(3) = \frac{5}{4}\,\mathrm{Li}_3\!\left(\frac{1}{\tau^2}\right) + \frac{\pi^2}{6}\ln\tau - \frac{5}{6}\ln^3\tau,$$

where τ = ½(1 + √5) and Li₃(x) = ∑ xⁿ/n³ is the trilogarithm. Hjortnaes, Ayoub, and respectively Gosper note the integral representations (easily shown equivalent)

$$\zeta(3) = 10\int_0^{\ln\tau} t^2\coth t\,dt = 10\int_0^{1/2}\frac{\operatorname{arsinh}^2 t}{t}\,dt.$$

In the case of ζ(2) the formula is even better known. It is, for example, referred to by Z. R. Melzak [6], but the suggested proof is not quite appropriate. 2' may be proved by slightly varying the argument above: multiply by (–1)^{n–1} instead of dividing by n. Many formulas similar to 2 and 2' appear in the literature and the folklore.
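The two accelerated series can be checked numerically. The sketch below is only an illustration in floating point (the function names are made up for this example); it compares the central-binomial series of 2 and 2' with the known values of ζ(3) and π²/6, and shows how slowly the defining series ∑ 1/n³ converges by comparison.

```python
import math
from math import comb


def zeta3_accelerated(terms: int) -> float:
    """Partial sum of (5/2) * sum (-1)^(n-1) / (n^3 * C(2n, n))."""
    return 2.5 * sum((-1) ** (n - 1) / (n**3 * comb(2 * n, n))
                     for n in range(1, terms + 1))


def zeta2_accelerated(terms: int) -> float:
    """Partial sum of 3 * sum 1 / (n^2 * C(2n, n))."""
    return 3.0 * sum(1 / (n**2 * comb(2 * n, n)) for n in range(1, terms + 1))


ZETA3 = 1.2020569031595942  # reference value, rounded to double precision

# Thirty terms of the accelerated series against one thousand of the naive one.
fast_error = abs(zeta3_accelerated(30) - ZETA3)
slow_error = abs(sum(1 / n**3 for n in range(1, 1001)) - ZETA3)
zeta2_error = abs(zeta2_accelerated(40) - math.pi**2 / 6)

print(fast_error, slow_error, zeta2_error)
```

Each term of the accelerated series is roughly a quarter of the previous one, which is why a few dozen terms already exhaust double precision.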

4. Some Nearly Relevant Explanations

All this is quite irrelevant to the proof. It would suffice to introduce the quantities

$$c_{n,k} = \sum_{m=1}^{n}\frac{1}{m^3} + \sum_{m=1}^{k}\frac{(-1)^{m-1}}{2m^3\binom{n}{m}\binom{n+m}{m}}, \qquad k \le n, \tag{6}$$

and to remark that plainly c_n,k → ζ(3) as n → ∞ uniformly in k. One might hope that a sequence c_n,k already implies the irrationality of ζ(3) (say, the diagonal, with k = n) but this is not quite so. To see this, it is useful to prove a lemma:

Lemma:

$$2c_{n,k}\binom{n+k}{k} \in \mathbb{Z} + \frac{\mathbb{Z}}{2^3} + \cdots + \frac{\mathbb{Z}}{n^3} = \frac{\mathbb{Z}}{[1, 2, \ldots, n]^3}.$$

Equivalently: $2[1, 2, \ldots, n]^3\,c_{n,k}\binom{n+k}{k}$ is an integer.

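Proof. We check the number of times that any given prime p divides the denominator. But

$$\frac{\binom{n+m}{m}}{\binom{n+k}{k}} = \frac{\binom{k}{m}}{\binom{n+k}{k-m}}$$

and

$$\operatorname{ord}_p\binom{n}{m} \le \left[\frac{\ln n}{\ln p}\right] - \operatorname{ord}_p m = \operatorname{ord}_p[1, \ldots, n] - \operatorname{ord}_p m,$$

so, we have

$$\operatorname{ord}_p\frac{m^3\binom{n}{m}\binom{n+m}{m}}{\binom{n+k}{k}} = \operatorname{ord}_p\frac{m^3\binom{n}{m}\binom{k}{m}}{\binom{n+k}{k-m}} \le 3\operatorname{ord}_p m + \left[\frac{\ln n}{\ln p}\right] + \left[\frac{\ln k}{\ln p}\right] - 2\operatorname{ord}_p m,$$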

which yields the assertion, because m ≤ k ≤ n. We remark that those who know the prime number theorem well know that for n sufficiently large relative to ε > 0, [1, 2, ..., n] ≤ e^{n(1+ε)}; roughly,

$$[1, \ldots, n] = \prod_{p\le n}p^{[\ln n/\ln p]} \le \prod_{p\le n} n \sim n^{n/\ln n} = e^n.$$

(Those who know it really well write

$$\ln[1, \ldots, n] = \sum_{m=1}^{\infty}\theta(n^{1/m}) = \psi(n), \quad\text{where}\quad \theta(n) = \sum_{p\le n}\ln p.$$

Then it is known that ψ(n)/n ≤ 1.03883... (with maximum at n = 113) and indeed ψ(n) – n < 0.0242334...·n/ln n for n ≥ 525752; see [7].)
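The lemma is easy to test for small n with exact rational arithmetic. The following is a minimal sketch, not part of the original argument; the helper names `c` and `lemma_holds` are chosen for this example, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import comb, lcm


def c(n: int, k: int) -> Fraction:
    """The quantity c_{n,k} of (6), computed exactly."""
    head = sum((Fraction(1, m**3) for m in range(1, n + 1)), Fraction(0))
    tail = sum((Fraction((-1) ** (m - 1),
                         2 * m**3 * comb(n, m) * comb(n + m, m))
                for m in range(1, k + 1)), Fraction(0))
    return head + tail


def lemma_holds(n: int) -> bool:
    """Check that 2 [1..n]^3 c_{n,k} C(n+k, k) is an integer for all k <= n."""
    big_l = lcm(*range(1, n + 1))  # lcm() of no arguments is 1, covering n = 0
    return all((2 * big_l**3 * c(n, k) * comb(n + k, k)).denominator == 1
               for k in range(n + 1))


checked = all(lemma_holds(n) for n in range(13))
print(checked)
```

A finite check of this kind proves nothing, of course, but it is a quick guard against misreading the indices in (6).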

It will turn out that the c_{n,k} have too large a denominator relative to their closeness to ζ(3). Hence to apply the irrationality criterion we must somehow accelerate the convergence. Apéry described this process as follows: Consider two triangular arrays (defined for k ≤ n) with entries

$$d^{(0)}_{n,k} = c_{n,k}\binom{n+k}{k} \quad\text{and}\quad \binom{n+k}{k}$$

respectively. We recall that the arrays have the property that their "quotient" converges to ζ(3), in the sense that given any "diagonal" {n, k(n)}, the quotient of the corresponding elements of the two arrays converges to ζ(3). Now apply the following transformations to each array:

$$d^{(0)}_{n,k} \to d^{(0)}_{n,n-k} = d^{(1)}_{n,k} \to \binom{n}{k}d^{(1)}_{n,k} = d^{(2)}_{n,k} \to \sum_{m=0}^{k}\binom{k}{m}d^{(2)}_{n,m} = d^{(3)}_{n,k} \to \binom{n}{k}d^{(3)}_{n,k} = d^{(4)}_{n,k} \to \sum_{m=0}^{k}\binom{k}{m}d^{(4)}_{n,m} = d^{(5)}_{n,k},$$

$$\binom{n+k}{k} \to \binom{2n-k}{n} \to \binom{n}{k}\binom{2n-k}{n} \to \sum_{m=0}^{k}\binom{k}{m}\binom{n}{m}\binom{2n-m}{n} \to \sum_{m=0}^{k}\binom{k}{m}\binom{n}{m}\binom{n}{k}\binom{2n-m}{n} \to \sum_{l=0}^{k}\sum_{m=0}^{l}\binom{k}{l}\binom{l}{m}\binom{n}{l}\binom{n}{m}\binom{2n-m}{n}.$$

Of course, the arrays have retained the property that their "quotient" converges to ζ(3), and we still have 2[1, 2, ..., n]³d_{n,k} ∈ ℤ. We now take the main diagonals (k = n) of the arrays, calling them respectively {a_n} and {b_n}, and make the fantastic assertions embodied in 3! That is, each sequence satisfies the recurrence (2)! This is plainly absurd since surely inter alia a solution {u_n} of (2) (with integral initial values u₀, u₁) will have {u_n} with denominator more like n!³ than like 1 (or even 2[1, 2, ..., n]³). In Marseille, our amazement was total when our HP-67s, calculating {b_n} on the one hand from the definition above, and on the other hand by the recurrence (2), kept on producing the same values.

5. It Seems that Apery Has Shown that ζ(3) Is Irrational

We were quite unable to prove that the sequences {a_n} defined above did satisfy the recurrence (2) (Apéry rather tartly pointed out to me in Helsinki that he regarded this more a compliment than a criticism of his method). But empirically (numerically) the evidence in favour was utterly compelling. It seemed indeed that ζ(3) had been proved irrational, because the rest, thus 4, follows quite easily: Given (with P(n – 1) = 34n³ – 51n² + 27n – 5),

n³a_n – p(n – 1)a_n–1 + (n – 1)³a_n–2 = 0, n³b_n – p(n – 1)b_n–1 + (n – 1)³b_n–2 = 0,

one multiplies the first equation by b_n–1, the second by a_n–1, to obtain

n³(a_nb_n–1 – a_n–1b_n) = (n – 1)³(a_n–1b_n–2 – a_n–2b_n–1).

Recalling a₁b₀ – a₀b₁ = 6·1 – 0·5 = 6 this cleverly yields

a_nb_n–1 – a_n–1b_n = 6/n³.

(7)

Seeing that ζ(3) – a₀/b₀ = ζ(3), it is easily induced (write ζ(3) – a_n/b_n = x_n, and note that we have x_n – x_{n–1} = –6/(n³b_nb_{n–1}) and x_∞ = 0) that

$$\zeta(3) - \frac{a_n}{b_n} = \sum_{k=n+1}^{\infty}\frac{6}{k^3b_kb_{k-1}}, \qquad\text{so}\qquad \zeta(3) - \frac{a_n}{b_n} = O\!\left(\frac{1}{b_n^2}\right).$$

On the other hand the recurrence relation makes it easy to estimate b_n, at any rate asymptotically. We have

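$$b_n - \left(34 - \frac{51}{n} + \frac{27}{n^2} - \frac{5}{n^3}\right)b_{n-1} + \left(1 - \frac{3}{n} + \frac{3}{n^2} - \frac{1}{n^3}\right)b_{n-2} = 0$$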

and since the polynomial x² – 34x + 1 has zeros 17 ± 12√2 = (√2 ± 1)⁴ we readily conclude that b_n= O(α⁴ⁿ), α = 1 + √2. In fact Cohen has, more precisely, calculated that

$$b_n = \frac{(1+\sqrt2)^2}{(2\pi\sqrt2)^{3/2}}\,\frac{(1+\sqrt2)^{4n}}{n^{3/2}}\left(1 - \frac{48-15\sqrt2}{64n} + O\!\left(\frac{1}{n^2}\right)\right). \tag{8}$$

We have to recall that the a_n are not integers. But writing p_n = 2[1, 2, ..., n]³a_n, q_n = 2[1, 2, ..., n]³b_n we have p_n, q_n ∈ ℤ and q_n = O(α⁴ⁿe³ⁿ),

$$\zeta(3) - \frac{p_n}{q_n} = O\!\left(\alpha^{-8n}\right) = O\!\left(q_n^{-(1+\delta)}\right), \quad\text{with}\quad \delta = \frac{4\ln\alpha-3}{4\ln\alpha+3} = 0.080529\ldots > 0.$$

Hence, by the irrationality criterion, ζ(3) is indeed irrational, and moreover, because 1 + 1/δ = 13.417820... we have: for all integers p, q > 0 sufficiently large relative to ε > 0

$$\left|\zeta(3) - \frac{p}{q}\right| > \frac{1}{q^{13.417820\ldots+\varepsilon}}.$$
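The chain of deductions in this section can be replayed with exact rational arithmetic, much as was done on the HP-67s. The sketch below (function and variable names are ours, not Apéry's) runs the recurrence (2) from the two sets of initial values, then checks integrality of the b_n, the denominators of the a_n, the identity (7), and the speed of convergence. It assumes Python 3.9+ for `math.lcm`.

```python
from fractions import Fraction
from math import lcm


def apery_sequence(u0: int, u1: int, count: int) -> list:
    """Solve n^3 u_n + (n-1)^3 u_{n-2} = (34n^3 - 51n^2 + 27n - 5) u_{n-1}."""
    u = [Fraction(u0), Fraction(u1)]
    for n in range(2, count):
        p = 34 * n**3 - 51 * n**2 + 27 * n - 5
        u.append((p * u[n - 1] - (n - 1) ** 3 * u[n - 2]) / n**3)
    return u


N = 12
a = apery_sequence(0, 6, N)
b = apery_sequence(1, 5, N)

# The b_n come out as integers even though each step divides by n^3.
b_integral = all(x.denominator == 1 for x in b)

# The a_n have denominators dividing 2 [1, 2, ..., n]^3.
a_near_integral = all(
    (2 * lcm(*range(1, n + 1)) ** 3 * a[n]).denominator == 1 for n in range(N)
)

# Identity (7): a_n b_{n-1} - a_{n-1} b_n = 6 / n^3.
wronskian_ok = all(
    a[n] * b[n - 1] - a[n - 1] * b[n] == Fraction(6, n**3) for n in range(1, N)
)

approximation = float(a[N - 1] / b[N - 1])
print(b[:4], approximation)
```

Already a₁₁/b₁₁ agrees with ζ(3) to far more digits than a double can hold, which is the O(1/b_n²) estimate at work.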

6. Some Trivial Verifications

To convince ourselves of the validity of Apery's proof we need only complete the following exercise.

EXERCISE

5 Prove the following identities: let c_{n,k} be defined by (6) and

$$a_n = \sum_{k=0}^{n}\binom{n}{k}^2\binom{n+k}{k}^2c_{n,k}, \qquad b_n = \sum_{k=0}^{n}\binom{n}{k}^2\binom{n+k}{k}^2.$$

Then a₀ = 0, a₁ = 6; b₀ = 1, b₁ = 5 and each sequence {a_n} and {b_n} satisfies the recurrence (2).

In the same spirit, the case of ζ(2) requires:

5' Let

$$C_{n,k} = 2\sum_{m=1}^{n}\frac{(-1)^{m-1}}{m^2} + \sum_{m=1}^{k}\frac{(-1)^{n+m-1}}{m^2\binom{n}{m}\binom{n+m}{m}},$$

$$A_n = \sum_{k=0}^{n}\binom{n}{k}^2\binom{n+k}{k}C_{n,k}, \qquad B_n = \sum_{k=0}^{n}\binom{n}{k}^2\binom{n+k}{k}.$$

Then A₀ = 0, A₁ = 5; B₀ = 1, B₁ = 3 and each sequence {A_n} and {B_n} satisfies the recurrence (4).
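Before attempting the exercise one may wish to see the claims hold numerically. The sketch below evaluates the sums directly and tests the initial values and the recurrences for small n. It rests on two readings that should be treated as assumptions of this example: the binomial coefficients in a_n, b_n are both squared, only the first is squared in A_n, B_n, and the ζ(2) recurrence is taken in the form n²u_n – (n – 1)²u_{n–2} = (11n² – 11n + 3)u_{n–1}. All function names are ours.

```python
from fractions import Fraction
from math import comb


def c3(n: int, k: int) -> Fraction:
    """c_{n,k} of (6), for zeta(3)."""
    head = sum((Fraction(1, m**3) for m in range(1, n + 1)), Fraction(0))
    tail = sum((Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
                for m in range(1, k + 1)), Fraction(0))
    return head + tail


def c2(n: int, k: int) -> Fraction:
    """C_{n,k} of 5', for zeta(2)."""
    head = 2 * sum((Fraction((-1) ** (m - 1), m**2) for m in range(1, n + 1)),
                   Fraction(0))
    tail = sum((Fraction((-1) ** (n + m - 1), m**2 * comb(n, m) * comb(n + m, m))
                for m in range(1, k + 1)), Fraction(0))
    return head + tail


def a3(n): return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 * c3(n, k) for k in range(n + 1))
def b3(n): return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))
def a2(n): return sum(comb(n, k) ** 2 * comb(n + k, k) * c2(n, k) for k in range(n + 1))
def b2(n): return sum(comb(n, k) ** 2 * comb(n + k, k) for k in range(n + 1))


def satisfies_2(u, n: int) -> bool:
    """Recurrence (2) at index n."""
    return n**3 * u(n) + (n - 1) ** 3 * u(n - 2) == (34 * n**3 - 51 * n**2 + 27 * n - 5) * u(n - 1)


def satisfies_4(u, n: int) -> bool:
    """Recurrence (4) at index n, with the minus sign assumed above."""
    return n**2 * u(n) - (n - 1) ** 2 * u(n - 2) == (11 * n**2 - 11 * n + 3) * u(n - 1)


zeta3_ok = all(satisfies_2(a3, n) and satisfies_2(b3, n) for n in range(2, 9))
zeta2_ok = all(satisfies_4(a2, n) and satisfies_4(b2, n) for n in range(2, 9))
print(zeta3_ok, zeta2_ok)
```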

It is useful to notice that very little more than just proving these claims is required for Apéry's proof. After all, it is quite plain that a_n/b_n → ζ(3); the b_n are integers, and the lemma of Section 4 shows that the a_n are "near-integers". In Section 5 we showed that if the sequences satisfy the recursion (2) then the irrationality of ζ(3) follows, because from ln α > ¾ we obtain δ > 0. Thus, as implied in various asides, most of the earlier argument is quite irrelevant. Indeed I am indebted to John Conway for the remark that even 5 is irrelevant.

EXERCISE

Be the first in your block to prove by a 2-line argument that ζ(3) is irrational (The author does not pretend to be able to do this. Notice that in fact even less is needed: it is sufficient to show a_nb_n–1 – a_n–1b_n = O(γⁿ) and b_n = O(βⁿ), with ln β – ln γ > 3).

6 Given the definitions of 5 show that a_nb_{n–1} – a_{n–1}b_n = 6/n³ and b_n = O(α⁴ⁿ) with α = 1 + √2. Conclude that ζ(3) is irrational because ln α > ¾.

EXERCISE

Astound your friends with an excellent irrationality measure for π².
6' Given the definitions of 5' show that A_nB_{n–1} – A_{n–1}B_n = 5(–1)^{n–1}/n² and B_n = O(ω⁵ⁿ) with ω = ½(1 + √5). Conclude that for all integers p, q > 0 sufficiently large relative to ε > 0

$$\left|\pi^2 - \frac{p}{q}\right| > \frac{1}{q^{11.850782\ldots+\varepsilon}}.$$

Though we have long known that ζ(2) is irrational, Apery's result in this case is significant. The irrationality degree for π² is the best known; the irrationality degree implied for π is 23.701564... . These results compare very favourably with those of Mahler [8]: |π – p/q| > q^–42.

Wirsing announced |π – p/q| > q^–21 and Mignotte proved that (for q sufficiently large) |π – p/q| > q^–20; this is the best known result. It should be noted that the cited results depend on deep techniques and complicated estimates in transcendence theory as contrasted with the essentially elementary methods in Apéry's proof. Mignotte also shows that |π² – p/q| > q^–18, which is weaker than Apéry's result.

7. ICM'78. Helsinki, August 1978

Neither Cohen nor I had been able to prove 5 or 5' in the intervening 2 months. After a few days of fruitless effort the specific problem was mentioned to Don Zagier (Bonn), and with irritating speed he showed that indeed the sequence {B_n} satisfies the recurrence (4). This more or less broke the dam and 5 and 5' were quickly conquered. Henri Cohen addressed a very well-attended meeting at 17:00 on Friday, August 18 in the language of the majority, proving 5 and explaining how this implied the irrationality of ζ(3). Apery then made some remarks on the status of the French language, and alluded to the underlying motivation (as mentioned in Section 3) for his astonishing proof.

EXERCISE

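7 Show that

$$\zeta(3) = \cfrac{6}{5 - \cfrac{1}{117 - \cfrac{64}{535 - \cdots - \cfrac{n^6}{34n^3+51n^2+27n+5 - \cdots}}}}$$

and deduce that ζ(3) = 1.202056903... is irrational.

7' Show that

$$\zeta(2) = \cfrac{5}{3 + \cfrac{1}{25 + \cfrac{16}{69 + \cdots + \cfrac{n^4}{11n^2+11n+3 + \cdots}}}}$$

and deduce that π² has irrationality degree at most 11.850782... .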
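For the ζ(3) continued fraction, a truncation can be evaluated from the bottom up in exact arithmetic. This is a sketch under the reading that the partial numerators are n⁶ and the partial denominators are P(n) = 34n³ + 51n² + 27n + 5 with P(0) = 5; the names `P` and `zeta3_cf` are hypothetical.

```python
from fractions import Fraction


def P(n: int) -> int:
    """Partial denominators 5, 117, 535, ... of the continued fraction."""
    return 34 * n**3 + 51 * n**2 + 27 * n + 5


def zeta3_cf(depth: int) -> Fraction:
    """Value of 6 / (P(0) - 1^6 / (P(1) - 2^6 / (... - depth^6 / P(depth))))."""
    tail = Fraction(P(depth))
    for n in range(depth, 0, -1):
        tail = P(n - 1) - Fraction(n**6) / tail
    return 6 / tail


first = zeta3_cf(0)      # 6/5
second = zeta3_cf(1)     # equals a_2/b_2 from the recurrence (2)
deep = float(zeta3_cf(10))
print(first, second, deep)
```

The truncations reproduce the quotients a_n/b_n, which is why the continued fraction carries no more information than the recurrence itself.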

8. Some Rather Complicated but Ingenious Explanations

According to a dictum of Littlewood any identity, once verified, is trivial. Surely 5 is very nearly a counterexample. The following is principally due to Zagier and Cohen. Incidentally, we first considered 5' which appeared simpler, but this was because we had failed to notice that

$$\sum_{k=0}^{n}\sum_{l=0}^{k}\binom{n}{k}^2\binom{n}{l}\binom{k}{l}\binom{2n-l}{n} = \sum_{k=0}^{n}\binom{n}{k}^2\binom{2n-k}{n}^2.$$

Now writing n – k for k links the arrays of Section 4 to 5 . It is quite convenient to write:

$$b_{n,k} = \binom{n}{k}^2\binom{n+k}{k}^2, \qquad a_{n,k} = b_{n,k}c_{n,k}, \qquad b_n = \sum_{k=0}^{n}b_{n,k}, \qquad a_n = \sum_{k=0}^{n}b_{n,k}c_{n,k}.$$

Then we wish to show that

$$\sum_{k}\left((n+1)^3b_{n+1,k} - (34n^3+51n^2+27n+5)b_{n,k} + n^3b_{n-1,k}\right) = 0.$$

We cleverly construct

$$B_{n,k} = 4(2n+1)\left(k(2k+1) - (2n+1)^2\right)\binom{n}{k}^2\binom{n+k}{k}^2$$

with the motive that

$$B_{n,k} - B_{n,k-1} = (n+1)^3\binom{n+1}{k}^2\binom{n+1+k}{k}^2 - (34n^3+51n^2+27n+5)\binom{n}{k}^2\binom{n+k}{k}^2 + n^3\binom{n-1}{k}^2\binom{n-1+k}{k}^2$$

and, O mirabile dictu, the sequence {b_n} does indeed satisfy the recurrence (2) by virtue of the method of creative telescoping (by the usual conventions: B_{n, k} = 0 for k < 0 or k > n; note also that P(n) = 34n³ + 51n² + 27n + 5 implies P(n–1) = –P(–n)). The rest is plain sailing (or is it plane sailing?) We notice that

(n + 1)³b_{n+1, k}c_{n+1, k} – P(n)b_{n, k}c_{n, k} + n³b_{n–1, k}c_{n–1, k} =

= (B_{n, k} – B_{n, k–1})c_{n, k} + (n + 1)³b_{n+1, k}(c_{n+1, k} – c_{n, k}) – n³b_{n–1, k}(c_{n, k} – c_{n–1, k}).

(9)

Clearly

$$c_{n,k} - c_{n-1,k} = \frac{1}{n^3} + \sum_{m=1}^{k}\frac{(-1)^m(m-1)!^2(n-m-1)!}{(n+m)!}$$

$$= \frac{1}{n^3} + \sum_{m=1}^{k}\left(\frac{(-1)^m\,m!^2(n-m-1)!}{n^2(n+m)!} - \frac{(-1)^{m-1}(m-1)!^2(n-m)!}{n^2(n+m-1)!}\right) = \frac{(-1)^k\,k!^2(n-k-1)!}{n^2(n+k)!}$$

whilst not even a minor miracle is required to write down c_{n,k} – c_{n,k–1}. After some massive reorganization (9) becomes A_{n,k} – A_{n,k–1} with

$$A_{n,k} = B_{n,k}c_{n,k} + \frac{5(2n+1)(-1)^{k-1}k}{n(n+1)}\binom{n}{k}\binom{n+k}{k}$$

and we have completed 5 , and, in passing, proved 3 . This of course verifies Apery's claim to have proved ζ(3) irrational.
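The telescoping identity for B_{n,k} is a finite statement about integers for each n and k, so it can be checked mechanically. The sketch below does so for small n, taking b_{n,k} with both binomial coefficients squared (as in the definition of b_{n,k} in this section) and using the convention that terms vanish for k < 0 or k > n. The function names are ours.

```python
from math import comb


def b(n: int, k: int) -> int:
    """b_{n,k} = C(n,k)^2 C(n+k,k)^2, taken as 0 outside 0 <= k <= n."""
    if n < 0 or k < 0 or k > n:
        return 0
    return comb(n, k) ** 2 * comb(n + k, k) ** 2


def big_b(n: int, k: int) -> int:
    """B_{n,k} = 4(2n+1)(k(2k+1) - (2n+1)^2) b_{n,k}."""
    return 4 * (2 * n + 1) * (k * (2 * k + 1) - (2 * n + 1) ** 2) * b(n, k)


def p(n: int) -> int:
    return 34 * n**3 + 51 * n**2 + 27 * n + 5


def telescopes(n: int, k: int) -> bool:
    """B_{n,k} - B_{n,k-1} equals the k-th summand of the recurrence."""
    lhs = big_b(n, k) - big_b(n, k - 1)
    rhs = (n + 1) ** 3 * b(n + 1, k) - p(n) * b(n, k) + n**3 * b(n - 1, k)
    return lhs == rhs


# k runs one step past n, where only the b_{n+1,k} term survives.
identity_ok = all(telescopes(n, k) for n in range(10) for k in range(n + 2))
print(identity_ok)
```

Summing the identity over k = 0, ..., n + 1 makes the left side collapse to zero, which is exactly the recurrence for b_n.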

9. The Case of ζ(2)

The arguments required to deal with the exercises 2' – 6' are quite similar to those already described. It may however be a kindness to the reader to reveal that it would be wise to take

$$B_{n,k} = \left(k^2 + 3(2n+1)k - 11n^2 - 9n - 2\right)\binom{n}{k}^2\binom{n+k}{k}, \qquad A_{n,k} = B_{n,k}C_{n,k} + 3(-1)^{n+k-1}\frac{(n-1)!}{(k-1)!\,(n-k)!}.$$

Moreover

$$C_{n,k} - C_{n-1,k} = 2(-1)^{n+k-1}\frac{k!^2(n-k-1)!}{n\,(n+k)!}$$

and

$$B_n = \frac{\left[\tfrac12(1+\sqrt5)\right]^{5n+4}}{2\pi n\sqrt{5+2\sqrt5}}\left(1 + O\!\left(\frac{1}{n}\right)\right) \tag{10}$$

(also note that if Q(n) = 11n² + 11n + 3 then Q(n–1) = Q(–n)).

10. What on Earth is Going on Here?

Apéry's incredible proof appears to be a mixture of miracles and mysteries. The dominating question is how to generalize all this, down to the Euler constant γ and up to the general ζ(t)? Here we have, apparently, the tip of an iceberg which relates 1 + √2 to ζ(3) and ½(1 + √5) to ζ(2); we have surprising identities 2 and 2', and startling continued fractions (produced by Cohen for his Helsinki talk), 7 and 7'. Does the complete berg look like this? For my part I incline to the view that much of what has been presented constitutes a mystification rather than an explanation. For example Richard Askey (Madison, Wisconsin) has pointed out to me that the sequences {b_n} and {B_n} may be recognized as special values of certain hypergeometric polynomials; immediately the recurrences (2) and (4) become identities relating hypergeometric functions and much of the magic fades away. Unfortunately the difficulties remain, because not all that much is known about the higher generalizations of the classical hypergeometric functions. For this, and other reasons, it is however likely that one should think about recurrences of order greater than 2. This, incidentally, means that the continued fractions constitute a red herring. In any event 7 obscures a fundamental miracle. Its convergents P_n/Q_n are of course such that the sequences {P_n} and {Q_n} both satisfy U_{n+1} = (34n³ + 51n² + 27n + 5)U_n – n⁶U_{n–1}. The proof works (not because the continued fraction does not terminate; that only works for regular continued fractions, but) because if U₀ = 1, U₁ = 5 then it happens that n!³ divides the integers U_n; more honestly: it is already enough (and is necessary) that for any initial integer values U₀, U₁, n!³ always divides 2[1, 2, ..., n]³U_n. An analogous miracle makes the recurrence U_{n+1} = (11n² + 11n + 3)U_n + n⁴U_{n–1} useful in proving the irrationality of ζ(2).
Tom Cusick (Buffalo) has noticed that the following recurrences also yield continued fractions converging to π²/6: n²u_n = (7n² – 7n + 2)u_{n–1} + 8(n – 1)²u_{n–2} (one solution of which is $\sum_{k=0}^{n}\binom{n}{k}^3$), and n³u_n = 2(2n – 1)(3n² – 3n + 1)u_{n–1} + (4n – 3)(4n – 4)(4n – 5)u_{n–2} (a solution is $\sum_{k=0}^{n}\binom{n}{k}^4$).

On first impression the first yields a worse irrationality degree for π² than that obtained by Apery, and the second does not yield irrationality at all. Apery's results are indeed remarkable. These surprises generalize the following quite well known fact (to which I was alerted by Frits Beukers (Leiden)): the recurrence U_n+1 = (6n + 3)U_n – n²U_n–1 is such that n! divides U_n if U₀ = 1, U₁ = 3; and n! divides [1, 2, ..., n]U_n for all integer initial values U₀, U₁.
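The divisibility fact attributed to Beukers is easy to watch happen. The sketch below runs the integer recurrence U_{n+1} = (6n + 3)U_n – n²U_{n–1} from the special initial values (1, 3) and from an arbitrary pair (2, 7) chosen only for illustration, then tests the two divisibility claims. It assumes Python 3.9+ for `math.lcm`; the function name `solve` is ours.

```python
from math import factorial, lcm


def solve(u0: int, u1: int, count: int) -> list:
    """First `count` terms of U_{n+1} = (6n + 3) U_n - n^2 U_{n-1}."""
    u = [u0, u1]
    for n in range(1, count - 1):
        u.append((6 * n + 3) * u[n] - n * n * u[n - 1])
    return u


COUNT = 16
special = solve(1, 3, COUNT)
general = solve(2, 7, COUNT)

# n! divides U_n for the initial values U_0 = 1, U_1 = 3 ...
special_ok = all(special[n] % factorial(n) == 0 for n in range(COUNT))

# ... and n! divides [1, 2, ..., n] U_n for other integer initial values.
general_ok = all(
    (lcm(*range(1, n + 1)) * general[n]) % factorial(n) == 0
    for n in range(COUNT)
)
print(special[:5], special_ok, general_ok)
```

Dividing the special solution by n! gives 1, 3, 13, 63, ..., the sequence b_n of the exercise that follows.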

EXERCISE

What are the higher analogues?

8 Show that if

$$B(z) = \frac{1}{\sqrt{1-6z+z^2}} = \sum_{n=0}^{\infty}b_nz^n, \quad\text{then}\quad b_n = \sum_{k=0}^{n}\binom{n}{k}\binom{n+k}{k}.$$

Find an expression for the a_n in

$$A(z) = \frac{1}{\sqrt{1-6z+z^2}}\int_0^z\frac{dt}{\sqrt{1-6t+t^2}} = \sum_{n=0}^{\infty}a_nz^n$$

and notice that the [1, 2, ..., n]a_n all are integers. Show that the sequences {a_n} (a₀ = 0, a₁ = 1) and {b_n} (b₀ = 1, b₁ = 3) both satisfy nu_n + (n – 1)u_{n–2} = (6n – 3)u_{n–1}. Now prove that there is a constant λ such that

$$A(z) - \lambda B(z) = \sum_{n=0}^{\infty}c_nz^n$$

has no singularity at (√2 – 1)². Deduce that then c_n = O((√2 – 1)²ⁿ) and conclude that it follows that ln 2 has irrationality degree at most 4.6221008... .
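A numerical experiment makes the exercise less mysterious. The sketch below generates {a_n} and {b_n} from the stated recurrence, checks the closed form for b_n and the near-integrality of the a_n, and estimates the limit of a_n/b_n. That this limit is ½ ln 2 is not stated in the text; it is what evaluating the integral up to (√2 – 1)² suggests, and the code merely tests it numerically. The degree is computed as 1 + 1/δ under the assumption that q_n = [1, 2, ..., n]b_n grows like (e(1 + √2)²)ⁿ. Requires Python 3.9+ for `math.lcm`.

```python
from fractions import Fraction
from math import comb, lcm, log, sqrt


def solve(u0: int, u1: int, count: int) -> list:
    """Solve n u_n + (n - 1) u_{n-2} = (6n - 3) u_{n-1} exactly."""
    u = [Fraction(u0), Fraction(u1)]
    for n in range(2, count):
        u.append(((6 * n - 3) * u[n - 1] - (n - 1) * u[n - 2]) / n)
    return u


N = 16
a = solve(0, 1, N)
b = solve(1, 3, N)

b_matches_sum = all(
    b[n] == sum(comb(n, k) * comb(n + k, k) for k in range(n + 1))
    for n in range(N)
)
a_near_integers = all(
    (lcm(*range(1, n + 1)) * a[n]).denominator == 1 for n in range(N)
)

# Assumed value of the constant lambda: half of ln 2.
limit_estimate = float(a[N - 1] / b[N - 1])

log_alpha = 2 * log(1 + sqrt(2))          # b_n grows like (1 + sqrt 2)^(2n)
delta = (log_alpha - 1) / (log_alpha + 1)  # q_n ~ e^n alpha^n, error ~ alpha^(-2n)
degree = 1 + 1 / delta
print(limit_estimate, degree)
```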

Of course, 6 should remind us that recurrences may be quite irrelevant to the proof. The vital thing then is a suitable definition of the c_{n,k}, so one is brought back to looking for generalizations of 2. But, for the present, generalization of Apéry's work remains, as they say, a mystery wrapped in an enigma. Well, not really. It is just that it is not at all clear where to go. A numerical test (suggested by Cohen) implies that

$$\zeta(4) = \frac{\pi^4}{90} = \frac{36}{17}\sum_{n=1}^{\infty}\frac{1}{n^4\binom{2n}{n}}$$

(so this is true for all practical purposes) and it has been shown by Gosper that

$$\zeta(5) = \frac{5}{2}\sum_{n=1}^{\infty}\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{(n-1)^2} - \frac{4}{5n^2}\right)\frac{(-1)^n}{n^3\binom{2n}{n}}.$$

David Hawkins (Boulder) suggests similar formulas. Apparently such expressions can be generated virtually at will on using appropriate series accelerator identities. Most startling of all though should be the fact that Apery's proof has no aspect that would not have been accessible to a mathematician of 200 years ago. The proof we have seen is one that many mathematicians could have found, but missed.

This note was written at Queen's University, Kingston, Ontario whilst the author was on study leave from the University of New South Wales, Sydney, Australia (October, 1978).

P.S. See [9] for many delightful facts including the trilogarithm formula of Section 3, which is given at p.139. At p.89 of [10] one is astonished to be asked to prove as an exercise that

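$$\sum_{n=1}^{\infty}\frac{1}{\binom{2n}{n}} = \frac{1}{3} + \frac{2\pi\sqrt3}{27}, \qquad \sum_{n=1}^{\infty}\frac{1}{n\binom{2n}{n}} = \frac{\pi\sqrt3}{9}, \qquad \sum_{n=1}^{\infty}\frac{1}{n^2\binom{2n}{n}} = \frac{\pi^2}{18}, \qquad \sum_{n=1}^{\infty}\frac{1}{n^4\binom{2n}{n}} = \frac{17\pi^4}{3240}.$$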

Seeing that

$$\sum_{n=1}^{\infty}\frac{x^{2n}}{n^2\binom{2n}{n}} = 2\arcsin^2\left(\frac{x}{2}\right)$$

([6], p.108) the first three formulae (and the one with trilogarithm) become quite accessible to proof, but I had not detected anyone able to prove the expression for ζ(4), until I proved it in March 1979 after noticing a remark of Lewin that also

$$2\int_0^{\pi/3}x\ln^2\left(2\sin\frac{x}{2}\right)dx = \frac{17\pi^4}{3240}.$$

Sam Wagstaff (Illinois) and Andrew Odlyzko (Bell Labs) have mentioned to me that numerical evidence suggests that there are formulae of the shape 2 and 2' for ζ(t) only for t = 2, 3, 4, and this is verified by my studies in a current manuscript Some wonderful formulae. The recurrences (9) are long known, see [10], p.90. One can recognise the b_n as b_n = ₄F₃(n+1, –n, n+1, –n; 1,1,1; 1) and determine the recurrence 3 by way of three term relations with contiguous balanced series; see [11].

Frits Beukers (Leiden) [12] has found an elegant approach to Apéry's proof which entirely avoids explicit identities, recurrences and other magic. Instead just consider

$$I = -\frac{1}{2}\int_0^1\!\!\int_0^1\frac{P_n(x)P_n(y)\ln xy}{1-xy}\,dx\,dy = b_n\zeta(3) - a_n$$

noticing that the b_n are integers and the a_n are rationals with the 2[1, 2, ..., n]³a_n integers, whilst |I| ≤ ζ(3)(√2 – 1)⁴ⁿ. Here P_n(z) = (d/dz)ⁿ[zⁿ(1 – z)ⁿ]/n! is the Legendre polynomial. Again, there is no obvious way to generalize the proof.

In retrospect it seems clear that 8 really is useful; implications are being considered by Bombieri et al (at Princeton). For example, one's intuition is just wrong in feeling incredulity at the facts of 3. All that this reports is that the differential equation

$$(x^4-34x^3+x^2)\frac{d^3y}{dx^3} + (6x^3-153x^2+3x)\frac{d^2y}{dx^2} + (7x^2-112x+1)\frac{dy}{dx} + (x-5)y - (u_1-5u_0) = 0$$

has two G-function solutions, namely a(x) = a₁x + a₂x² + ...; b(x) = 1 + b₁x + b₂x² + ...; and a(x) – ζ(3)b(x) is regular (in fact vanishes) at α = (√2 – 1)⁴. This is interesting, but no longer incredible; and it is readily generalizable... All this too is an idea of Beukers. Some officious readers have been critical of my casual use of the O-symbol; the fault is mine, not Apery's. No harm is done. Similarly it has been claimed that Apery's proof was not missed by Euler – «Euler did not know the prime number theorem»; to me it seems hypercritical to suggest that [1, 2, ..., n] = O((√2 + 1)^4n/3) could not have been noticed at the time, had it been needed. Anyhow, I considered it a racy title. It arose after Cohen's report at Helsinki, with someone sourly commenting «A victory for the French peasant...»; to this Nick Katz retorted: «No...! No! This is marvellous! It is something Euler could have done...».

School of Mathematics and Physics,
Macquarie University
North Ryde, New South Wales,
Australia 2113

March 1979

References

[1] I. Niven. Irrational Numbers (Carus Monograph #11). MAA-Wiley, 1967.
[2] Bruce C. Berndt. Modular transformations and generalizations of several formulae of Ramanujan. Rocky Mountain J. of Maths 7 (1977), pp. 147–189.
[3] Bruce C. Berndt. Elementary evaluation of ζ(2n). Maths Mag. 48 (1975), pp. 148–153.
[4] Margrethe Munthe Hjortnaes. Overforing av rekken ∑ 1/k³ til et bestemt integral. Proc. 12th Cong. Scand. Maths, Lund 10–15 Aug. 1953 (Lund 1954).
[5] R. William Gosper, Jr. A calculus of series rearrangements. In "Algorithms and Complexity, New Directions and Recent Results", ed. J. Traub. Academic Press, 1976, pp. 121–151.
[6] Z. R. Melzak. Introduction to Concrete Mathematics. Wiley, 1973, p. 85.
[7] J. Barkley Rosser and Lowell Schoenfeld. Math. Comp. 29 (1975), pp. 243–269.
[8] K. Mahler. Applications of some formulae by Hermite to the approximation of exponentials and logarithms. Math. Annalen 168 (1967), pp. 200–227.
[9] Leonard Lewin. Dilogarithms and associated functions. Macdonald, London, 1958.
[10] Louis Comtet. Advanced Combinatorics. D. Reidel, Dordrecht, 1974.
[11] J. A. Wilson. Hypergeometric series, recurrence relations and some new orthogonal functions. Ph.D. Thesis; U. Wisconsin–Madison, 1978.
[12] Frits Beukers. A note on the irrationality of ζ(2) and ζ(3). J. Lond. Math. Soc. (to appear).
