Sturm's theorem on zeros of linear combinations of eigenfunctions
PIERRE BÉRARD AND BERNARD HELFFER
Abstract.
Motivated by recent questions about the extension of Courant’s nodal domain theorem, we revisit a theorem published by C. Sturm in 1836, which deals with zeros of linear combinations of eigenfunctions of Sturm-Liouville problems. Although well known in the nineteenth century, this theorem seems to have been ignored or forgotten by some of the specialists in spectral theory since the second half of the twentieth century. Although not specialists in the History of Sciences, we have tried to put this theorem into the context of nineteenth century mathematics.
To appear in Expositiones Mathematicae 2018, except for the Appendices C to E (in blue).

1. Introduction
In this paper, we are interested in the following one-dimensional eigenvalue problem, where $r$ denotes the spectral parameter.
\[ \frac{d}{dx}\left( K \frac{dV}{dx} \right) + (rG - L)\, V = 0, \quad \text{for } x \in ]\alpha,\beta[, \tag{1.1} \]
\[ \left( K \frac{dV}{dx} - hV \right)(\alpha) = 0, \tag{1.2} \]
\[ \left( K \frac{dV}{dx} + HV \right)(\beta) = 0. \tag{1.3} \]
Here,
\[ K, G, L : [\alpha,\beta] \to \mathbb{R} \text{ are positive functions,} \tag{1.4} \]
\[ h, H \in [0,\infty] \text{ are non-negative constants, possibly infinite.} \tag{1.5} \]
Precise assumptions on $K, G, L$ are given below.

Remark 1.1. When $h = \infty$ (resp. $H = \infty$), the boundary condition should be understood as the Dirichlet boundary condition $V(\alpha) = 0$ (resp. as the Dirichlet boundary condition $V(\beta) = 0$).

Date: October 3, 2017. Revised July 31, 2018.
2010 Mathematics Subject Classification.
Key words and phrases. Sturm-Liouville eigenvalue problem, Sturm’s theorems.

Note that when $K = G \equiv 1$, (1.1)–(1.3) is an eigenvalue problem for the classical operator $-\frac{d^2 V}{dx^2} + LV$.

This eigenvalue problem, in the above generality ($K, G, L$ functions of $x$), was first studied by Charles Sturm in a Memoir presented to the Paris Academy of sciences in September 1833, summarized in [37, 38], and published in [39, 40].

Remark 1.2. In this paper, we have mainly retained the notation of [39], except that we use $[\alpha,\beta]$ for the interval, instead of Sturm’s notation [x, X]. We otherwise use today’s notation and vocabulary. Note that in [40], Sturm uses lower case letters for the functions $K, G, L$, the same notation as Joseph Fourier in [13].

As far as the eigenvalue problem (1.1)–(1.3) is concerned, Sturm’s results can be roughly summarized in the following theorems.
Theorem 1.3 (Sturm, 1836). Under the assumptions (1.4)–(1.5), the eigenvalue problem (1.1)–(1.3) admits an increasing infinite sequence $\{\rho_i, i \ge 1\}$ of positive simple eigenvalues, tending to infinity. Furthermore, the associated eigenfunctions $V_i$ have the following remarkable property: the function $V_i$ vanishes, and changes sign, precisely $(i-1)$ times in the open interval $]\alpha,\beta[$.

Theorem 1.4 (Sturm, 1836). Let $Y = A_m V_m + \cdots + A_n V_n$ be a nontrivial linear combination of eigenfunctions of the eigenvalue problem (1.1)–(1.3), with $1 \le m \le n$, and $\{A_j, m \le j \le n\}$ real constants such that $A_m^2 + \cdots + A_n^2 \neq 0$. Then, the function $Y$ has at least $(m-1)$, and at most $(n-1)$ zeros in the open interval $]\alpha,\beta[$.

The first theorem today appears in most textbooks on Sturm-Liouville theory. Although well known in the nineteenth century, the second theorem (as well as the more precise Theorem 2.15) seems to have been ignored or forgotten by some of the specialists in spectral theory since the second half of the twentieth century, as the following chronology indicates.
Sturm’s Memoir presented to the Paris Academy of sciences in September 1833, summarized in [37, 38].
Sturm’s papers [39, 40] published in 1836. Joseph Liouville summarizes Sturm’s results in [23, § III, p. 257], and uses them to study the expansion of a given function $f$ into a series of eigenfunctions of (1.1).

Lord Rayleigh writes “a beautiful theorem has been discovered by Sturm” as he mentions Theorem 1.4 in [34, Section 142].
F. Pockels [30, pp. 68-73] gives a summary of Sturm’s results, including Theorem 1.4, and mentions the different proofs provided by Sturm, Liouville and Rayleigh. On the basis of a note of Sturm in Férussac’s Bulletin [36], Pockels (p. 71, lines 12-17) also suggests that Sturm may have looked for a statement in higher dimension as well, without success. Sturm indeed mentions studying an example with spherical symmetry in dimension 3 (leading to an ordinary differential equation with singularity), to which he may have applied Theorem 1.4.
Hurwitz [19] gives a lower bound for the number of zeros of the sum of a trigonometric series with a spectral gap and refers, somewhat inaccurately, to Sturm’s Theorems. This result, known as the Sturm-Hurwitz theorem, already appears in a more general framework in Liouville’s paper [23]. See [12, § 2] for a generalization of the Sturm-Hurwitz theorem to Fourier integrals with a spectral gap, [28] for geometric applications, and the recent paper [35] which quantifies the Sturm-Hurwitz theorem.
Courant & Hilbert [10, 11] extensively mention the Sturm-Liouville problem. They do not refer to the original papers of Sturm, but to Bôcher’s book [8] which does not include Theorem 1.4. They then state an extension of the so-called Courant’s nodal domain theorem to linear combinations of eigenfunctions, [10, footnote, p. 394] and [11, footnote, p. 454], and refer to the dissertation of H. Herrmann [18]. It turns out that neither Herrmann’s dissertation, nor his later papers, consider this extension of Courant’s Theorem.
The book [15] by F. Gantmacher and M. Krein contains several notes on Sturm’s contributions. One result (Corollary, Chap. III.5, p. 138), stated in the context of Chebyshev systems, is stronger than Theorem 1.4, yet weaker than Theorem 2.15. The book does not, however, mention [40].
Pleijel mentions Sturm’s Theorem 1.4, somewhat inaccurately, in [29, p. 543 and 550].
V. Arnold [1] points out that an extension of Courant’s theorem to linear combinations of eigenfunctions cannot be true in general. Counterexamples were first given by O. Viro for the 3-sphere (with the canonical metric) [41] and, more recently, in the papers [4, 5], see also [17].

It seems to us that Arnold may not have been aware of Theorem 1.4. Indeed, in [3], see also the Supplementary problem 9 in [2, p. 327], he mentions a proof, suggested by I. Gelfand, of the upper bound in Theorem 1.4. Gelfand’s idea is to “use fermions rather than bosons”, and to apply Courant’s nodal domain theorem in the fermionic context. However, Arnold concludes by writing [3, p. 30], “the arguments [given by Gelfand] do not yet provide a proof”. It is interesting to note that Liouville’s and Rayleigh’s proofs of the lower bound in Theorem 1.4 use an idea similar to Gelfand’s, see the proof of Claim 3.5.

As far as we know, the first implementation of Gelfand’s idea into a complete proof of Theorem 1.4 is given in [5, 6].
Remark 1.5. In [40], Theorem 1.4 first appears as a corollary to a much deeper theorem [40, § XXIV], in which Sturm describes the time evolution of the $x$-zeros of a solution $u(x,t)$ of the heat equation. We shall not consider this topic here, and we refer to [14, 26] for modern formulations and a historical analysis.

Our interest in Theorem 1.4 arose from reading [20], and investigating Courant’s nodal domain theorem and its extension to linear combinations of eigenfunctions.

The main purpose of this paper is to popularize Theorems 1.4 and 2.15, as well as Sturm’s originality and ideas. Sturm’s results are clearly stated in the summaries [37, 38]. Unfortunately, Sturm’s detailed papers [39, 40] are written linearly, and contain very few tagged statements. Our second purpose is to provide an accessible proof of Theorems 1.4 and 2.15, meeting today’s standards of rigor. In particular, we state precise assumptions, clarify some technical points, and provide some alternative proofs. We otherwise closely follow the original proofs, and we provide precise cross-references to Sturm’s papers.

In this paper, we make the following strong assumptions.
\[ [\alpha,\beta] \subset ]\alpha_0,\beta_0[, \qquad K, G, L \in C^{\infty}(]\alpha_0,\beta_0[), \qquad K, G, L > 0 \text{ on } ]\alpha_0,\beta_0[. \tag{1.6} \]

Remark 1.6. Neither Sturm nor Liouville makes any explicit regularity assumptions, see Subsection 4.3 and Remark 3.7 for more details.
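To fix ideas, the eigenvalue problem (1.1)–(1.3) can be explored on a model instance. The sketch below assumes the special case $K = G = 1$, $L = 1$ on $[0,\pi]$ with Dirichlet conditions at both ends ($h = H = \infty$), for which $V_i(x) = \sin(ix)$ and $\rho_i = i^2 + 1$. This model and the helper names (`rho`, `V`, `residual`, `sign_changes`) are illustrative assumptions and do not come from Sturm’s papers. The code checks equation (1.1) and counts the sign changes of $V_i$ on a sample grid.

```python
import math

# Model instance (an assumption for illustration): K = G = 1, L = 1 on [0, pi],
# with Dirichlet conditions at both ends (h = H = infinity).

def rho(i):
    """Eigenvalue rho_i = i^2 + 1 of the model problem."""
    return i * i + 1.0

def V(i, x):
    """Eigenfunction V_i(x) = sin(i x); it vanishes at 0 and at pi."""
    return math.sin(i * x)

def residual(i, x):
    """Left-hand side of (1.1) for K = G = L = 1 and r = rho_i: V'' + (rho_i - 1) V."""
    second_derivative = -i * i * math.sin(i * x)
    return second_derivative + (rho(i) - 1.0) * V(i, x)

def sign_changes(f, samples=4000):
    """Count the sign changes of f along interior sample points of ]0, pi[."""
    count, previous = 0, 0.0
    for s in range(1, samples + 1):
        value = f(math.pi * s / (samples + 1))
        if value == 0.0:
            continue  # skip exact zeros; the neighbouring samples decide
        if previous != 0.0 and (value > 0.0) != (previous > 0.0):
            count += 1
        previous = value
    return count

for i in range(1, 7):
    worst = max(abs(residual(i, math.pi * s / 50)) for s in range(51))
    print(i, rho(i), sign_changes(lambda x, i=i: V(i, x)), worst)
```

In this model, the printed number of sign changes of $V_i$ is $i - 1$, as predicted by Theorem 1.3. A grid count only detects zeros at which the function changes sign, so it is a sanity check rather than a proof.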
Organization of the paper.
In Section 2, we prove Theorem 2.15, Sturm’s refined version of Theorem 1.4, following the ideas of [40, § XXVI]. In Section 3, we prove Theorem 3.2, Liouville’s version of Theorem 1.4, following [23, 24]. In Section 4, we describe the context of Sturm’s papers and his ideas. Appendix A provides the detailed proof of a technical argument. Appendix B contains the citations from Sturm’s papers in their original French formulation. Appendix C considers Sturm’s theorem under weaker assumptions. Appendices D and E provide cross-references between our paper and the papers of Sturm and Liouville.
Acknowledgements.
The authors would like to thank N. Kuznetsov and J. Lützen for their comments on an earlier version of this paper. They also thank the anonymous referees for their constructive remarks.
2. Sturm’s o.d.e. proof of Theorem 1.4
2.1. Preliminary lemmas and notation. Recall that $\{(\rho_j, V_j), j \ge 1\}$ are the eigenvalues and eigenfunctions of the eigenvalue problem (1.1)–(1.3). By our assumption $L > 0$, the eigenvalues are positive, $\rho_j > 0$. Under the Assumptions (1.6), the functions $V_j$ are $C^{\infty}$ on $]\alpha_0,\beta_0[$. This follows from Cauchy’s existence and uniqueness theorem, or from Liouville’s existence proof [23]. Note that the assumption $L > 0$ is not essential: it suffices that $L/G$ be bounded from below.

In this section, we fix
\[ Y = \sum_{j=m}^{n} A_j V_j, \tag{2.1} \]
a linear combination of eigenfunctions of the eigenvalue problem (1.1)–(1.3), where $1 \le m \le n$, and where the $A_j$ are real constants.

Remark 2.1. We shall always assume that $Y \not\equiv 0$, i.e., that $\sum_{j=m}^{n} A_j^2 \neq 0$. As far as the statement of Theorem 1.4 is concerned, and without loss of generality, it is simpler to assume that $A_m A_n \neq 0$.

We also introduce the associated family of functions, $\{Y_k, k \in \mathbb{Z}\}$, where
\[ Y_k = (-1)^k \sum_{j=m}^{n} \rho_j^k A_j V_j. \tag{2.2} \]
Note that $Y_0$ is the original linear combination $Y$, and that $Y_k \equiv 0$ if and only if $Y \equiv 0$.

Sturm’s idea is to show that the number of zeros of $Y_k$, in the interval $]\alpha,\beta[$, is non-decreasing with respect to $k$, and then to take the limit when $k$ tends to infinity, see Subsection 2.3. Up to changing the constants $A_j$, it suffices to compare the numbers of zeros of $Y$ and $Y_1$. For this purpose, Sturm compares the signs of $Y$ and $Y_1$ near the zeros of $Y$ (Lemma 2.4), and at the non-zero local extrema of $Y$ (Lemmas 2.7 and 2.9). The main ingredient for this purpose is the differential relation (2.6). In the sequel, we indicate the pages in Sturm’s papers corresponding to the different steps of the proof.

For $m \le p \le n$, write the equations satisfied by the eigenfunction $V_p$,
\[ \frac{d}{dx}\left( K \frac{dV_p}{dx} \right) + (\rho_p G - L)\, V_p = 0, \tag{2.3} \]
\[ \left( K \frac{dV_p}{dx} - hV_p \right)(\alpha) = 0, \tag{2.4} \]
\[ \left( K \frac{dV_p}{dx} + HV_p \right)(\beta) = 0, \tag{2.5} \]
and multiply the $p$-th equation by $(-1)^k \rho_p^k A_p$. Summing up from $p = m$ to $n$ yields the following lemma.

Lemma 2.2.
Assume that (1.6) holds. Let $k \in \mathbb{Z}$.
(1) The function $Y_k$ satisfies the boundary conditions (1.2) and (1.3).
(2) The functions $Y_k$ and $Y_{k+1}$ satisfy the differential relation
\[ G\, Y_{k+1} = K \frac{d^2 Y_k}{dx^2} + \frac{dK}{dx} \frac{dY_k}{dx} - L\, Y_k. \tag{2.6} \]
(3) Under the Assumptions (1.6), the function $Y_k$ cannot vanish at infinite order at a point $\xi \in [\alpha,\beta]$, unless $Y \equiv 0$.

Proof. [40, p. 437] Assertions (1) and (2) are clear by linearity. For Assertion (3), assume that $Y_k$ vanishes at infinite order at $\xi$. Then, according to (2.6) and its successive derivatives, the function $Y_{k+1}$ also vanishes at infinite order at $\xi$, and so does $Y_\ell$ for any $\ell \ge k$. Assume, as indicated in Remark 2.1, that $A_n \neq 0$. Fixing some $p \ge 0$, we can write, for any $\ell \ge k$,
\[ \frac{d^p V_n}{dx^p}(\xi) + \sum_{j=m}^{n-1} \left( \frac{\rho_j}{\rho_n} \right)^{\ell} \frac{A_j}{A_n} \frac{d^p V_j}{dx^p}(\xi) = 0. \]
Since $\rho_n > \rho_j$ for $m \le j \le n-1$, letting $\ell$ tend to infinity, we conclude that $\frac{d^p V_n}{dx^p}(\xi) = 0$. This would be true for all $p$, which is impossible by Cauchy’s uniqueness theorem, or by Sturm’s argument [39, § II]. □
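The differential relation (2.6) can be checked numerically on a model problem. The sketch below assumes the special case $K = G = 1$, $L = 1$ on $[0,\pi]$ with Dirichlet conditions, for which $V_j(x) = \sin(jx)$ and $\rho_j = j^2 + 1$, so that (2.6) reads $Y_{k+1} = Y_k'' - Y_k$. The model, the coefficients in `A`, and the helper names (`rho`, `Y`, `relation_defect`) are illustrative assumptions.

```python
import math

# Model assumption: K = G = 1, L = 1 on [0, pi], Dirichlet at both ends,
# so that V_j(x) = sin(j x) and rho_j = j^2 + 1.

def rho(j):
    return j * j + 1.0

def Y(k, A, x, second_derivative=False):
    """Y_k(x) = (-1)^k * sum_j rho_j^k A_j sin(j x), as in (2.2), or its second x-derivative."""
    total = 0.0
    for j, a in A.items():
        term = rho(j) ** k * a * math.sin(j * x)
        if second_derivative:
            term *= -(j * j)  # d^2/dx^2 sin(j x) = -j^2 sin(j x)
        total += term
    return (-1) ** k * total

def relation_defect(k, A, x):
    """Y_{k+1} - (Y_k'' - Y_k): the defect in (2.6) when K = G = L = 1."""
    return Y(k + 1, A, x) - (Y(k, A, x, second_derivative=True) - Y(k, A, x))

# Hypothetical coefficients A_2, A_3, A_5 (so m = 2 and n = 5).
A = {2: 1.0, 3: 0.5, 5: -0.25}
for k in range(-2, 3):
    worst = max(abs(relation_defect(k, A, 0.01 * s)) for s in range(315))
    print(k, worst)
```

The printed defects should be at the level of floating-point rounding, for negative as well as positive $k$, which reflects the fact that (2.6) holds for every $k \in \mathbb{Z}$.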
Remark 2.3. Assertion (3), and the fact that the zeros of $Y$ are isolated, with finite multiplicities, are implicit in [40].

Lemma 2.4. Assume that (1.6) holds. Let $U$ denote any $Y_k$, and $U_1 = Y_{k+1}$. Let $\xi \in [\alpha,\beta]$ be a zero of $U$, of order $p \ge 2$. Then, there exist constants $B_\xi$ and $B_{1,\xi}$, and smooth functions $R_\xi$ and $R_{1,\xi}$, such that
\[ \begin{aligned} U(x) &= B_\xi (x-\xi)^p + (x-\xi)^{p+1} R_\xi(x), \\ U_1(x) &= B_{1,\xi} (x-\xi)^{p-2} + (x-\xi)^{p-1} R_{1,\xi}(x), \end{aligned} \tag{2.7} \]
with $B_\xi B_{1,\xi} > 0$.

Proof. [40, p. 439] Assume that $\xi$ is a zero of order $p \ge 2$ of $U$, so that
\[ U(\xi) = \cdots = \frac{d^{p-1} U}{dx^{p-1}}(\xi) = 0 \quad \text{and} \quad \frac{d^p U}{dx^p}(\xi) \neq 0. \]
Taylor’s formula with integral remainder term, see Laplace [21, p. 179] (in Livre premier, Partie 2, Chap. 3, § 44), gives the existence of some function $R_\xi$ such that
\[ U(x) = B_\xi (x-\xi)^p + (x-\xi)^{p+1} R_\xi(x), \quad \text{where } B_\xi = \frac{1}{p!} \frac{d^p U}{dx^p}(\xi) \neq 0. \]
Equation (2.6) implies that
\[ (G U_1)(x) = p(p-1) B_\xi (x-\xi)^{p-2} K(x) + (x-\xi)^{p-1} S_\xi(x), \]
for some smooth function $S_\xi$. It follows that
\[ U_1(x) = B_{1,\xi} (x-\xi)^{p-2} + (x-\xi)^{p-1} R_{1,\xi}(x), \]
for some function $R_{1,\xi}$, with $B_{1,\xi} = p(p-1) \frac{K(\xi)}{G(\xi)} B_\xi$. In particular, $B_{1,\xi} B_\xi > 0$. □

Lemma 2.5.
Assume that (1.6) holds. Assume that $h \in [0,\infty[$, i.e., that the boundary condition at $\alpha$ is not the Dirichlet boundary condition. Let $U$ denote any $Y_k$, $U_1 = Y_{k+1}$, and assume that $U(\alpha) = 0$. Then, $\alpha$ is a zero of $U$ of even order, i.e., there exists $n_U \in \mathbb{N} \setminus \{0\}$ such that $\frac{d^p U}{dx^p}(\alpha) = 0$ for $0 \le p \le 2n_U - 1$ and $\neq 0$ for $p = 2n_U$. When $H \in [0,\infty[$, a similar statement holds at the boundary $\beta$.

Proof. [40, p. 440-441] Assume that $U(\alpha) = 0$. By Lemma 2.2, $U$ does not vanish at infinite order at $\alpha$, so that there exists $p \ge 1$ such that
\[ U(\alpha) = \cdots = \frac{d^{p-1} U}{dx^{p-1}}(\alpha) = 0 \quad \text{and} \quad \frac{d^p U}{dx^p}(\alpha) \neq 0. \]
Taylor’s formula with integral remainder term gives
\[ U(x) = B_\alpha (x-\alpha)^p + (x-\alpha)^{p+1} R_\alpha(x), \quad \text{where } B_\alpha = \frac{1}{p!} \frac{d^p U}{dx^p}(\alpha) \neq 0. \]
The boundary condition at $\alpha$ implies that $\frac{dU}{dx}(\alpha) = 0$, and hence that $p \ge 2$. By Lemma 2.4, we can write
\[ U_1(x) = B_{1,\alpha} (x-\alpha)^{p-2} + (x-\alpha)^{p-1} R_{1,\alpha}(x), \]
with $B_{1,\alpha} B_\alpha > 0$. If $p = 2$, then $U_1(\alpha) \neq 0$. If $p > 2$, one can continue.

If $p = 2q$, one arrives at
\[ Y_{k+q}(x) = B_{k+q,\alpha} + (x-\alpha) R_{k+q,\alpha}(x), \]
with $Y_{k+q}(\alpha) = B_{k+q,\alpha}$ and $B_{k+q,\alpha} B_{k,\alpha} > 0$.

If $p = 2q+1$, one arrives at
\[ Y_{k+q}(x) = B_{k+q,\alpha} (x-\alpha) + (x-\alpha)^2 R_{k+q,\alpha}(x), \]
with $B_{k+q,\alpha} B_{k,\alpha} > 0$, so that $\frac{dY_{k+q}}{dx}(\alpha) = B_{k+q,\alpha} \neq 0$. On the other hand, since $Y_{k+q}$ satisfies (1.2) and $Y_{k+q}(\alpha) = 0$, we must have $\frac{dY_{k+q}}{dx}(\alpha) = 0$, because $h$ is finite and $K(\alpha) > 0$. Hence the case $p = 2q+1$ cannot occur. The lemma is proved. □

2.2. Counting zeros.
Assume that (1.6) holds. Let $U$ denote any $Y_k$, and $U_1 = Y_{k+1}$. They satisfy the relation (2.6).

From Lemma 2.2, we know that $U$ cannot vanish at infinite order at a point $\xi \in [\alpha,\beta]$. If $\xi \in ]\alpha,\beta[$ and $U(\xi) = 0$, we define the multiplicity $m(U,\xi)$ of the zero $\xi$ by
\[ m(U,\xi) = \min \left\{ p \;\middle|\; \frac{d^p U}{dx^p}(\xi) \neq 0 \right\}. \tag{2.8} \]
From Lemma 2.5, we know that the multiplicity $m(U,\alpha)$ is even if $h \in [0,\infty[$, and that the multiplicity $m(U,\beta)$ is even if $H \in [0,\infty[$. We define the reduced multiplicity of $\alpha$ by
\[ \overline{m}(U,\alpha) = \begin{cases} \frac{1}{2}\, m(U,\alpha) & \text{if } h \in [0,\infty[, \\ 0 & \text{if } h = \infty, \end{cases} \tag{2.9} \]
and a similar formula for the reduced multiplicity of $\beta$.

By Lemma 2.2, the function $U$ has finitely many distinct zeros $\xi_1(U) < \xi_2(U) < \cdots < \xi_p(U)$ in the interval $]\alpha,\beta[$. We define the number of zeros of $U$ in $]\alpha,\beta[$, counted with multiplicities, by
\[ N_m(U, ]\alpha,\beta[) = \sum_{j=1}^{p} m(U, \xi_j(U)), \tag{2.10} \]
and we use the notation $N_m(U)$ whenever the interval is clear.

We define the number of zeros of $U$ in $[\alpha,\beta]$, counted with multiplicities, by
\[ \overline{N}_m(U, [\alpha,\beta]) = \sum_{j=1}^{p} m(U, \xi_j(U)) + \overline{m}(U,\alpha) + \overline{m}(U,\beta), \tag{2.11} \]
and we use the notation $\overline{N}_m(U)$ whenever the interval is clear.

We define the number of zeros of $U$ in $]\alpha,\beta[$ (multiplicities not accounted for) by
\[ N(U, ]\alpha,\beta[) = p, \tag{2.12} \]
and we use the notation $N(U)$ whenever the interval is clear.

Finally, we define the number of sign changes of $U$ in the interval $]\alpha,\beta[$ by
\[ N_v(U, ]\alpha,\beta[) = \sum_{j=1}^{p} \frac{1}{2} \left[ 1 - (-1)^{m(U,\xi_j(U))} \right]. \tag{2.13} \]

Remark 2.6. Note that sign changes of the function $Y$ correspond to zeros with odd multiplicity.

2.3. Comparing the numbers of zeros of $Y_k$ and $Y_{k+1}$. Assume that (1.6) holds. Let $U$ be some $Y_k$ and $U_1 = Y_{k+1}$. In this subsection, we show that the number of zeros of $U_1$ is not smaller than the number of zeros of $U$.

Lemma 2.7.
Let $\xi < \eta$ be two zeros of $U$ in $[\alpha,\beta]$. Then, there exists some $a_{\xi,\eta} \in ]\xi,\eta[$ such that $U(a_{\xi,\eta})\, U_1(a_{\xi,\eta}) < 0$.

Remark 2.8. We do not assume that $\xi, \eta$ are consecutive zeros.
Proof. [40, p. 437] Since $U$ cannot vanish identically in $]\xi,\eta[$ (see Lemma 2.2), there exists some $x_0 \in ]\xi,\eta[$ such that $U(x_0) \neq 0$. Let $\varepsilon = \operatorname{sign}(U(x_0))$. Then $\varepsilon U$ takes a positive value at $x_0$, and hence
\[ M := \sup \{ \varepsilon U(x) \mid x \in [\xi,\eta] \} \]
is positive and achieved at some $a_{\xi,\eta} \in ]\xi,\eta[$. Denote this point by $a$ for short. Then,
\[ \varepsilon U(a) > 0, \quad \frac{dU}{dx}(a) = 0 \quad \text{and} \quad \varepsilon \frac{d^2 U}{dx^2}(a) \le 0. \]
It follows from (2.6) that $\varepsilon U_1(a) < 0$, i.e., $U(a)\, U_1(a) < 0$. □

Lemma 2.9.
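Let $\xi \in ]\alpha,\beta]$. Assume that $U(\xi) = 0$, and that $U$ does not change sign in $]\alpha,\xi[$. Then, there exists some $a_\xi \in [\alpha,\xi[$ such that $U(a_\xi)\, U_1(a_\xi) < 0$.
Let $\eta \in [\alpha,\beta[$. Assume that $U(\eta) = 0$, and that $U$ does not change sign in $]\eta,\beta[$. Then, there exists some $b_\eta \in ]\eta,\beta]$ such that $U(b_\eta)\, U_1(b_\eta) < 0$.

Proof. [40, p. 438] Since $U$ cannot vanish identically in $]\alpha,\xi[$ (see Lemma 2.2), there exists $x_0 \in ]\alpha,\xi[$ such that $U(x_0) \neq 0$. Let $\varepsilon_\xi = \operatorname{sign}(U(x_0))$. Since $U$ does not change sign in $]\alpha,\xi[$, we have $\varepsilon_\xi U(x) \ge 0$ in $]\alpha,\xi[$. Then,
\[ M_\xi := \sup \{ \varepsilon_\xi U(x) \mid x \in [\alpha,\xi] \} > 0. \]
Let
\[ a_\xi := \inf \{ x \in [\alpha,\xi] \mid \varepsilon_\xi U(x) = M_\xi \}. \]
Then $a_\xi \in [\alpha,\xi[$.

If $a_\xi \in ]\alpha,\xi[$, then $\varepsilon_\xi U(a_\xi) > 0$, $\frac{dU}{dx}(a_\xi) = 0$, and $\varepsilon_\xi \frac{d^2 U}{dx^2}(a_\xi) \le 0$. By (2.6), this implies that $\varepsilon_\xi U_1(a_\xi) < 0$. Equivalently, $U(a_\xi)\, U_1(a_\xi) < 0$.

Claim 2.10. If $a_\xi = \alpha$, then $\varepsilon_\xi U(\alpha) > 0$, $h = 0$, $\frac{dU}{dx}(\alpha) = 0$, and $\varepsilon_\xi \frac{d^2 U}{dx^2}(\alpha) \le 0$.

Proof of the claim. Assume that $a_\xi = \alpha$. Then $\varepsilon_\xi U(\alpha) > 0$, so that $h \neq \infty$. If $h$ were in $]0,\infty[$, the boundary condition (1.2) would give $\varepsilon_\xi K(\alpha) \frac{dU}{dx}(\alpha) = h\, \varepsilon_\xi U(\alpha) > 0$, and hence $a_\xi > \alpha$. It follows that the assumption $a_\xi = \alpha$ implies that $h = 0$ and $\frac{dU}{dx}(\alpha) = 0$. If $\varepsilon_\xi \frac{d^2 U}{dx^2}(\alpha)$ were positive, we would have $a_\xi > \alpha$. Therefore, the assumption $a_\xi = \alpha$ also implies that $\varepsilon_\xi \frac{d^2 U}{dx^2}(\alpha) \le 0$. The claim is proved.

If $a_\xi = \alpha$, then by Claim 2.10 and (2.6), we have $\varepsilon_\xi U_1(\alpha) < 0$. Equivalently, $U(\alpha)\, U_1(\alpha) < 0$. The first assertion of the lemma is proved. The proof of the second assertion is similar. □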
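The sign rule of Lemmas 2.7 and 2.9 can be observed numerically. The sketch below assumes the model case $K = G = 1$, $L = 1$ on $[0,\pi]$ with Dirichlet conditions, so that $V_j(x) = \sin(jx)$, $\rho_j = j^2 + 1$, and both endpoints are zeros of every $Y_k$. The model, the coefficients in `A`, and the helper names (`rho`, `Y`, `extremum_products`) are illustrative assumptions. Between consecutive detected sign changes of $U = Y_k$, the code picks the grid point where $|U|$ is largest and evaluates $U \cdot U_1$ there.

```python
import math

# Model assumption: K = G = 1, L = 1 on [0, pi], Dirichlet at both ends,
# so that V_j(x) = sin(j x), rho_j = j^2 + 1, and U(0) = U(pi) = 0.

def rho(j):
    return j * j + 1.0

def Y(k, A, x):
    """Y_k(x) = (-1)^k * sum_j rho_j^k A_j sin(j x), as in (2.2)."""
    return (-1) ** k * sum(rho(j) ** k * a * math.sin(j * x) for j, a in A.items())

def extremum_products(k, A, n=4001):
    """For U = Y_k and U_1 = Y_{k+1}, return U(a) * U_1(a) at the grid point a
    where |U| is largest, on each interval between detected sign changes of U."""
    xs = [math.pi * i / n for i in range(n + 1)]
    u = [Y(k, A, x) for x in xs]
    # A sign change of U between xs[i] and xs[i + 1] brackets an interior zero.
    crossings = [i for i in range(1, n - 1) if u[i] * u[i + 1] < 0]
    starts = [1] + [i + 1 for i in crossings]
    ends = crossings + [n - 1]
    products = []
    for s, e in zip(starts, ends):
        a = max(range(s, e + 1), key=lambda i: abs(u[i]))
        products.append(u[a] * Y(k + 1, A, xs[a]))
    return products

# Hypothetical coefficients A_2 = 1, A_3 = 0.5 (so m = 2 and n = 3).
A = {2: 1.0, 3: 0.5}
for k in (-1, 0, 1):
    print(k, extremum_products(k, A))
```

All printed products should be negative. Zeros of even order are not detected by this grid search, which is harmless here since the lemmas do not require the zeros to be consecutive.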
Proposition 2.11.
Assume that (1.6) holds, and let $k \in \mathbb{Z}$. Then,
\[ N_v(Y_{k+1}, ]\alpha,\beta[) \ge N_v(Y_k, ]\alpha,\beta[), \tag{2.14} \]
i.e., in the interval $]\alpha,\beta[$, the function $Y_{k+1}$ changes sign at least as many times as the function $Y_k$.

Proof. [40, p. 437-439] We keep the notation $U = Y_k$ and $U_1 = Y_{k+1}$. By Lemma 2.2, the functions $U$ and $U_1$ have finitely many zeros in $]\alpha,\beta[$, with finite multiplicities. Since $\alpha$ and $\beta$ are fixed, we skip the mention of the interval $]\alpha,\beta[$ in the proof, and we examine several cases.

Case 1. If $N_v(U) = 0$, there is nothing to prove.

Case 2. Assume that $N_v(U) = 1$. Then $U$ admits a unique zero $\xi \in ]\alpha,\beta[$ having odd multiplicity. Without loss of generality, we may assume that $U \ge 0$ in $]\alpha,\xi[$ and $U \le 0$ in $]\xi,\beta[$. By Lemma 2.9, there exist $a \in [\alpha,\xi[$ and $b \in ]\xi,\beta]$ such that $U_1(a) < 0$ and $U_1(b) > 0$. It follows that $U_1$ vanishes and changes sign at least once in $]\alpha,\beta[$, so that $N_v(U_1) \ge 1 = N_v(U)$, which proves the proposition in Case 2.

Case 3. If $N_v(U) = 2$, the function $U$ has exactly two zeros having odd multiplicities, $\xi$ and $\eta$ in $]\alpha,\beta[$, $\alpha < \xi < \eta < \beta$, and we may assume that $U|_{]\alpha,\xi[} \ge 0$, $U|_{]\xi,\eta[} \le 0$, and $U|_{]\eta,\beta[} \ge 0$. The arguments given in Case 2 imply that there exist $a \in [\alpha,\xi[$ such that $U_1(a) < 0$, and $b \in ]\eta,\beta]$ such that $U_1(b) < 0$. In $]\xi,\eta[$ the function $U$ does not vanish identically and therefore achieves a global minimum at a point $c$ such that $U(c) < 0$, $\frac{dU}{dx}(c) = 0$, and $\frac{d^2 U}{dx^2}(c) \ge 0$. By (2.6), $U_1(c) > 0$. We can conclude that the function $U_1$ vanishes and changes sign at least twice in $]\alpha,\beta[$, so that $N_v(U_1) \ge 2 = N_v(U)$.

Case 4. Assume that $N_v(U) = p \ge 3$. Then $U$ has exactly $p$ zeros with odd multiplicities in $]\alpha,\beta[$, $\alpha < \xi_1 < \xi_2 < \cdots < \xi_p < \beta$, and one can assume that
\[ U|_{]\alpha,\xi_1[} \ge 0, \quad (-1)^p\, U|_{]\xi_p,\beta[} \ge 0, \quad \text{and} \quad (-1)^i\, U|_{]\xi_i,\xi_{i+1}[} \ge 0 \text{ for } 1 \le i \le p-1. \]
One can repeat the arguments given in Cases 2 and 3, and conclude that there exist $a_0, \ldots, a_p$ with $a_0 \in [\alpha,\xi_1[$, $a_i \in ]\xi_i,\xi_{i+1}[$ for $1 \le i \le p-1$, and $a_p \in ]\xi_p,\beta]$ such that $(-1)^i\, U_1(a_i) < 0$. We can then conclude that the function $U_1$ vanishes and changes sign at least $p$ times in $]\alpha,\beta[$, i.e. that $N_v(U_1) \ge p = N_v(U)$.

This concludes the proof of Proposition 2.11. □

Proposition 2.12.
Assume that (1.6) holds. For any $k \in \mathbb{Z}$,
\[ N_m(Y_{k+1}, ]\alpha,\beta[) \ge N_m(Y_k, ]\alpha,\beta[), \tag{2.15} \]
i.e., in the interval $]\alpha,\beta[$, counting multiplicities of zeros, the function $Y_{k+1}$ vanishes at least as many times as the function $Y_k$.

Proof. [40, p. 439-442] Let $U = Y_k$ and $U_1 = Y_{k+1}$. If $U$ does not vanish in $]\alpha,\beta[$, there is nothing to prove. We now assume that $U$ has at least one zero in $]\alpha,\beta[$. By Lemma 2.2, $U$ and $U_1$ have finitely many zeros in $]\alpha,\beta[$. Let
\[ \alpha < \xi_1 < \cdots < \xi_k < \beta \]
be the distinct zeros of $U$, with multiplicities $p_i = m(U,\xi_i)$ for $1 \le i \le k$. Let $\sigma_0$ be the sign of $U$ in $]\alpha,\xi_1[$, $\sigma_i$ the sign of $U$ in $]\xi_i,\xi_{i+1}[$ for $1 \le i \le k-1$, and $\sigma_k$ the sign of $U$ in $]\xi_k,\beta[$. Note that
\[ \sigma_i = \operatorname{sign}\left( \frac{d^{p_i} U}{dx^{p_i}}(\xi_i) \right) \quad \text{for } 1 \le i \le k. \]
By Lemma 2.9, there exist $a_0 \in [\alpha,\xi_1[$ and $a_k \in ]\xi_k,\beta]$ such that $U(a_0)\, U_1(a_0) < 0$ and $U(a_k)\, U_1(a_k) < 0$. By Lemma 2.7, there exist $a_i \in ]\xi_i,\xi_{i+1}[$, $1 \le i \le k-1$, such that $U(a_i)\, U_1(a_i) < 0$. Summarizing,
\[ \text{for } 0 \le i \le k, \quad U(a_i)\, U_1(a_i) < 0. \tag{2.16} \]
We have the relation
\[ N_m(U, ]\alpha,\beta[) = \sum_{i=1}^{k} N_m(U, ]a_{i-1},a_i[) = \sum_{i=1}^{k} p_i. \tag{2.17} \]
Indeed, for $1 \le i \le k$, the interval $]a_{i-1},a_i[$ contains precisely one zero $\xi_i$ of $U$, with multiplicity $p_i$. For $U_1$, we have the inequality
\[ N_m(U_1, ]\alpha,\beta[) \ge \sum_{i=1}^{k} N_m(U_1, ]a_{i-1},a_i[), \tag{2.18} \]
because $U_1$ might have zeros in the interval $]\alpha,a_0[$ if $a_0 > \alpha$ (resp. in the interval $]a_k,\beta[$ if $a_k < \beta$).

Claim 2.13. For $1 \le i \le k$,
\[ N_m(U_1, ]a_{i-1},a_i[) \ge N_m(U, ]a_{i-1},a_i[) = p_i. \]

To prove the claim, we consider several cases.

• If $p_i = 1$, then $U(a_{i-1})\, U(a_i) < 0$ and, by (2.16), $U_1(a_{i-1})\, U_1(a_i) < 0$, so that $N_m(U_1, ]a_{i-1},a_i[) \ge 1$.

• If $p_i \ge 2$, we apply Lemma 2.4 at $\xi_i$: there exist real numbers $B, B_1$ and smooth functions $R$ and $R_1$, such that, in a neighborhood of $\xi_i$,
\[ \begin{aligned} U(x) &= B (x-\xi_i)^{p_i} + (x-\xi_i)^{p_i+1} R(x), \\ U_1(x) &= B_1 (x-\xi_i)^{p_i-2} + (x-\xi_i)^{p_i-1} R_1(x), \end{aligned} \tag{2.19} \]
where $\operatorname{sign}(B) = \operatorname{sign}(B_1) = \sigma_i$. We now use (2.16) and the fact that $\operatorname{sign}(U(a_i)) = \sigma_i$.

⋄ If $p_i \ge 2$ is odd, then $\sigma_{i-1} \sigma_i = -1$. It follows that $\sigma_i U_1(a_i) < 0$ and $\sigma_i U_1(a_{i-1}) > 0$. By (2.19), for $\varepsilon$ small enough, we also have $\sigma_i U_1(\xi_i + \varepsilon) > 0$ and $\sigma_i U_1(\xi_i - \varepsilon) < 0$. This means that $U_1$ vanishes at order $p_i - 2$ at $\xi_i$, and at least once in each of the intervals $]a_{i-1},\xi_i - \varepsilon[$ and $]\xi_i + \varepsilon,a_i[$, so that
\[ N_m(U_1, ]a_{i-1},a_i[) \ge (p_i - 2) + 2 = p_i = N_m(U, ]a_{i-1},a_i[). \]

⋄ If $p_i \ge 2$ is even, then $\sigma_{i-1} \sigma_i = 1$. It follows that $\sigma_i U_1(a_i) < 0$ and $\sigma_i U_1(a_{i-1}) < 0$. By (2.19), for $\varepsilon$ small enough, we also have $\sigma_i U_1(\xi_i + \varepsilon) > 0$ and $\sigma_i U_1(\xi_i - \varepsilon) > 0$. This means that $U_1$ vanishes at order $p_i - 2$ at $\xi_i$, and at least once in each of the intervals $]a_{i-1},\xi_i - \varepsilon[$ and $]\xi_i + \varepsilon,a_i[$, so that
\[ N_m(U_1, ]a_{i-1},a_i[) \ge (p_i - 2) + 2 = p_i = N_m(U, ]a_{i-1},a_i[). \]

The claim is proved, and the proposition as well. □
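The monotonicity in $k$ established above can be illustrated by counting sign changes of $Y_k$ on a grid. The sketch below assumes the model case $K = G = 1$, $L = 1$ on $[0,\pi]$ with Dirichlet conditions, so that $V_j(x) = \sin(jx)$ and $\rho_j = j^2 + 1$. The model, the coefficients in `A`, and the helper names (`rho`, `Y`, `sign_changes`) are illustrative assumptions. Only zeros with a sign change are detected, so the count corresponds to $N_v$ rather than $N_m$.

```python
import math

# Model assumption: K = G = 1, L = 1 on [0, pi], Dirichlet at both ends,
# so that V_j(x) = sin(j x) and rho_j = j^2 + 1.

def rho(j):
    return j * j + 1.0

def Y(k, A, x):
    """Y_k(x) = (-1)^k * sum_j rho_j^k A_j sin(j x), as in (2.2)."""
    return (-1) ** k * sum(rho(j) ** k * a * math.sin(j * x) for j, a in A.items())

def sign_changes(f, samples=4000):
    """Count the sign changes of f along interior sample points of ]0, pi[."""
    count, previous = 0, 0.0
    for s in range(1, samples + 1):
        value = f(math.pi * s / (samples + 1))
        if value == 0.0:
            continue  # skip exact zeros; the neighbouring samples decide
        if previous != 0.0 and (value > 0.0) != (previous > 0.0):
            count += 1
        previous = value
    return count

# Hypothetical coefficients A_2 = 1, A_3 = 0.5 (so m = 2 and n = 3).
A = {2: 1.0, 3: 0.5}
counts = [sign_changes(lambda x, k=k: Y(k, A, x)) for k in range(-3, 4)]
print(counts)
```

For these coefficients the counts never decrease as $k$ goes from $-3$ to $3$, and they stay between $m - 1 = 1$ and $n - 1 = 2$. For large negative $k$ the term in $V_2$ dominates, and for large positive $k$ the term in $V_3$ dominates.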
Proposition 2.14.
Assume that (1.6) holds. For any $k \in \mathbb{Z}$,
\[ \overline{N}_m(Y_{k+1}, [\alpha,\beta]) \ge \overline{N}_m(Y_k, [\alpha,\beta]), \tag{2.20} \]
i.e., in the interval $[\alpha,\beta]$, counting multiplicities of interior zeros, and reduced multiplicities of $\alpha$ and $\beta$, the function $Y_{k+1}$ vanishes at least as many times as the function $Y_k$.

Proof. [40, p. 440-442] Recall that the reduced multiplicity of $\alpha$ (resp. $\beta$) is zero if the Dirichlet condition holds at $\alpha$ (resp. at $\beta$) or if $U(\alpha) \neq 0$ (resp. $U(\beta) \neq 0$). Furthermore, according to Lemma 2.5, if $h \in [0,\infty[$ and $U(\alpha) = 0$ (resp. if $H \in [0,\infty[$ and $U(\beta) = 0$), then $m(U,\alpha) = 2p$ (resp. $m(U,\beta) = 2q$).

Case 1. Assume that $N_m(U, ]\alpha,\beta[) = 0$. Without loss of generality, we may assume that $U > 0$ in $]\alpha,\beta[$.

• If $U(\alpha) \neq 0$ and $U(\beta) \neq 0$, there is nothing to prove.

• Assume that $U(\alpha) = U(\beta) = 0$. Then, there exists $a \in ]\alpha,\beta[$ such that
\[ U(a) = \sup \{ U(x) \mid x \in [\alpha,\beta] \}, \quad \text{with } U(a) > 0, \; \frac{dU}{dx}(a) = 0, \text{ and } \frac{d^2 U}{dx^2}(a) \le 0. \]
It follows from (2.6) that $U_1(a) < 0$. We have
\[ \overline{N}_m(U_1, [\alpha,\beta]) = \overline{N}_m(U_1, [\alpha,a]) + \overline{N}_m(U_1, [a,\beta]). \tag{2.21} \]
It now suffices to look separately at the intervals $[\alpha,a]$ and $[a,\beta]$.

⋄ Interval $[\alpha,a]$. If the Dirichlet condition holds at $\alpha$, there is nothing to prove. If $h \in [0,\infty[$, then $m(U,\alpha) = 2p \ge 2$ and, by Lemma 2.4,
\[ \begin{aligned} U(x) &= B (x-\alpha)^{2p} + (x-\alpha)^{2p+1} R(x), \\ U_1(x) &= B_1 (x-\alpha)^{2p-2} + (x-\alpha)^{2p-1} R_1(x), \end{aligned} \tag{2.22} \]
with $B > 0$ and $B_1 > 0$. It follows that $U_1(\alpha + \varepsilon) > 0$ for $\varepsilon$ small enough, so that $N_m(U_1, ]\alpha,a[) \ge 1$. It follows that
\[ \overline{N}_m(U_1, [\alpha,a]) = \overline{m}(U_1,\alpha) + N_m(U_1, ]\alpha,a[) \ge (p-1) + 1, \tag{2.23} \]
i.e. $\overline{N}_m(U_1, [\alpha,a]) \ge \overline{N}_m(U, [\alpha,a])$.

⋄ Interval $[a,\beta]$. The proof is similar.

• Assume that $U(\alpha) = 0$ and $U(\beta) \neq 0$. The proof is similar to the previous one with $a \in ]\alpha,\beta]$.

• Assume that $U(\alpha) \neq 0$ and $U(\beta) = 0$. The proof is similar to the previous one with $a \in [\alpha,\beta[$.

Case 2. Assume that $N_m(U, ]\alpha,\beta[) \ge 1$.

• If $U(\alpha) \neq 0$ (resp. $U(\beta) \neq 0$), there is nothing to prove for the boundary $\alpha$ (resp. $\beta$).

• If $U(\alpha) = 0$ (resp. $U(\beta) = 0$), the number $a_0$ (resp. $a_k$) which appears in the proof of Proposition 2.12 belongs to the open interval $]\alpha,\xi_1[$ (resp. to the open interval $]\xi_k,\beta[$), where $\xi_1$ (resp. $\xi_k$) is the smallest (resp. largest) zero of $U$ in $]\alpha,\beta[$. We can then apply the proof of Case 1 to the interval $[\alpha,a_0]$ (resp. to the interval $[a_k,\beta]$) to prove that $\overline{N}_m(U_1, [\alpha,a_0]) \ge \overline{N}_m(U, [\alpha,a_0])$ (resp. to prove that $\overline{N}_m(U_1, [a_k,\beta]) \ge \overline{N}_m(U, [a_k,\beta])$).

This proves Proposition 2.14. □

We can now state Sturm’s refined version of Theorem 1.4.
Theorem 2.15.
Assume that (1.6) holds, and let $Y$ be the nontrivial linear combination
\[ Y = \sum_{p=m}^{n} A_p V_p, \tag{2.24} \]
where $1 \le m \le n$, and where $\{A_p, m \le p \le n\}$ are real constants such that $A_m^2 + \cdots + A_n^2 \neq 0$. Then, with the notation of Subsection 2.2,
\[ N_v(Y, ]\alpha,\beta[) \le N_m(Y, ]\alpha,\beta[) \le \overline{N}_m(Y, [\alpha,\beta]), \tag{2.25} \]
\[ (m-1) \le N_v(Y, ]\alpha,\beta[) \quad \text{and} \quad \overline{N}_m(Y, [\alpha,\beta]) \le (n-1). \tag{2.26} \]

Proof. [40, p. 442] Let $N(V)$ be any of the above functions. We may of course assume that $A_m \neq 0$ and $A_n \neq 0$. In the preceding propositions, we have proved that $N(Y_{k+1}) \ge N(Y_k)$ for any $k \in \mathbb{Z}$. This inequality can also be rewritten as
\[ N(Y_{-k}) \le N(Y) \le N(Y_k) \quad \text{for any } k \ge 1. \tag{2.27} \]
Letting $k$ tend to infinity, we conclude that
\[ N(V_m) \le N(Y) \le N(V_n), \tag{2.28} \]
and we can apply Theorem 1.3. □

Remark 2.16. For a complete proof of the limiting argument when $k$ tends to infinity, we refer to Appendix A.

3. Liouville’s approach to Theorem 1.4
Main statement.
We keep the notation of Section 2. Starting from a linear combination $Y$ as in (2.1), Liouville also considers the family $Y_k$ given by (2.2), and shows that the number of zeros of $Y_{k+1}$ is not smaller than the number of zeros of $Y_k$. His proof is based on a generalization of Rolle’s theorem.

Remark 3.1. In his proof, Liouville [24] only considers the zeros in the open interval $]\alpha,\beta[$.

As in Section 2, for $1 \le m \le n$, we fix $Y = \sum_{j=m}^{n} A_j V_j$, a linear combination of eigenfunctions of the eigenvalue problem (1.1)–(1.3), and we assume that $A_m A_n \neq 0$, see Remark 2.1.

Theorem 3.2.
Counting zeros with multiplicities in the interval $]\alpha,\beta[$, the function $Y$
(1) has at most $(n-1)$ zeros and,
(2) has at least $(m-1)$ zeros.
Proof.
Liouville uses the following version of Rolle’s theorem (Michel Rolle (1652-1719) was a French mathematician). This version of Rolle’s theorem seems to go back to Cauchy and Lagrange.
Lemma 3.3.
Let $f$ be a function in $]\alpha_0,\beta_0[$. Assume that $f(x') = f(x'') = 0$ for some $x', x''$, $\alpha_0 < x' < x'' < \beta_0$.
(1) If the function $f$ is differentiable, and has $\nu - 1$ distinct zeros in the interval $]x',x''[$, then the derivative $f'$ has at least $\nu$ distinct zeros in $]x',x''[$.
(2) If the function $f$ is smooth, and has $\mu - 1$ zeros counted with multiplicities in the interval $]x',x''[$, then the derivative $f'$ has at least $\mu$ zeros counted with multiplicities in $]x',x''[$.

Proof of the lemma.
Call $x_1 < x_2 < \cdots < x_{\nu-1}$ the distinct zeros of $f$ in $]x',x''[$. Since $f(x') = f(x'') = 0$, by Rolle’s theorem [32], the function $f'$ vanishes at least once in each open interval determined by the $x_j$, $1 \le j \le \nu - 1$, as well as in the intervals $]x',x_1[$ and $]x_{\nu-1},x''[$. It follows that $f'$ has at least $\nu$ distinct zeros in $]x',x''[$, which proves the first assertion.

Call $m_j$ the multiplicity of the zero $x_j$, $1 \le j \le \nu - 1$. Then $f'$ has at least $\nu$ zeros, one in each of the open intervals determined by $x'$, $x''$ and the $x_j$’s, and has a zero at each $x_j$ with multiplicity $m_j - 1$, provided that $m_j > 1$. It follows that the number of zeros of $f'$ in $]x',x''[$, counting multiplicities, is at least
\[ \sum_{j=1}^{\nu-1} (m_j - 1) + \nu = \sum_{j=1}^{\nu-1} m_j + 1, \]
which proves the second assertion. □

Proof of the assertion “$Y$ has at most $(n-1)$ zeros in $]\alpha,\beta[$, counting multiplicities”.

Write (1.1) for $V_1$ and for $V_p$, for some $m \le p \le n$. Multiply the first equation by $-V_p$, the second by $V_1$, and add the resulting equations. Then
\[ V_1 \frac{d}{dx}\left( K \frac{dV_p}{dx} \right) - V_p \frac{d}{dx}\left( K \frac{dV_1}{dx} \right) + (\rho_p - \rho_1)\, G V_1 V_p = 0. \tag{3.1} \]
Use the identity
\[ V_1 \frac{d}{dx}\left( K \frac{dV_p}{dx} \right) - V_p \frac{d}{dx}\left( K \frac{dV_1}{dx} \right) = \frac{d}{dx}\left( V_1 K \frac{dV_p}{dx} - V_p K \frac{dV_1}{dx} \right), \tag{3.2} \]
and integrate from $\alpha$ to $t$ to get the identity
\[ (\rho_1 - \rho_p) \int_{\alpha}^{t} G V_1 V_p \, dx = K(t) \left( V_1(t) \frac{dV_p}{dx}(t) - V_p(t) \frac{dV_1}{dx}(t) \right). \tag{3.3} \]
Here we have used the boundary condition (1.2) which implies that
\[ V_1(\alpha) \frac{dV_p}{dx}(\alpha) - V_p(\alpha) \frac{dV_1}{dx}(\alpha) = 0. \]
Multiplying the identity (3.3) by $A_p$, and summing for $p$ from $m$ to $n$, we obtain
\[ \int_{\alpha}^{t} G V_1 \sum_{p=m}^{n} (\rho_1 - \rho_p) A_p V_p \, dx = K(t) \left( V_1 \frac{dY}{dx} - Y \frac{dV_1}{dx} \right)(t), \tag{3.4} \]
or
\[ \int_{\alpha}^{t} G V_1 \sum_{p=m}^{n} (\rho_1 - \rho_p) A_p V_p \, dx = K(t)\, V_1^2(t)\, \frac{d}{dt}\left( \frac{Y}{V_1} \right)(t), \tag{3.5} \]
where we have used the fact that the function $V_1$ does not vanish in the interval $]\alpha,\beta[$.

Let $\Psi(x) = \frac{Y}{V_1}(x)$. The zeros of $Y$ in $]\alpha,\beta[$ are the same as the zeros of $\Psi$, with the same multiplicities. Let $\mu$ be the number of zeros of $Y$, counted with multiplicities. Using Lemma 3.3, Assertion (2), one can show that $\frac{d\Psi}{dx}$ has at least $\mu - 1$ zeros in $]\alpha,\beta[$, and hence so does the left-hand side of (3.5),
\[ \int_{\alpha}^{t} G V_1 \sum_{p=m}^{n} (\rho_1 - \rho_p) A_p V_p \, dx. \]
On the other hand, this function vanishes at $\alpha$ and $\beta$ (because of the boundary condition (1.3) or orthogonality). By Lemma 3.3, its derivative,
\[ G V_1 \sum_{p=m}^{n} (\rho_1 - \rho_p) A_p V_p, \tag{3.6} \]
has at least $\mu$ zeros counted with multiplicities in $]\alpha,\beta[$. Since $G$ and $V_1$ do not vanish in $]\alpha,\beta[$, we have proved the following

Lemma 3.4.
If the function $Y = \sum_{p=m}^{n} A_p V_p$ has at least $\mu$ zeros counted with multiplicities in the interval $]\alpha,\beta[$, then the function $Y_1 = \sum_{p=m}^{n} (\rho_1 - \rho_p) A_p V_p$ has at least $\mu$ zeros, counted with multiplicities, in $]\alpha,\beta[$.

Applying this lemma iteratively, we deduce that if $Y$ has at least $\mu$ zeros counted with multiplicities in $]\alpha,\beta[$, then, for any $k \ge 1$, the function
\[ Y_k = \sum_{p=m}^{n} (\rho_1 - \rho_p)^k A_p V_p \tag{3.7} \]
has at least $\mu$ zeros, counted with multiplicities, in $]\alpha,\beta[$.

We may of course assume that the coefficient $A_n$ is non-zero. The above assertion can be rewritten as the statement:

For all $k \ge 1$, the equation
\[ A_m \left( \frac{\rho_m - \rho_1}{\rho_n - \rho_1} \right)^k V_m + \cdots + A_{n-1} \left( \frac{\rho_{n-1} - \rho_1}{\rho_n - \rho_1} \right)^k V_{n-1} + A_n V_n = 0 \tag{3.8} \]
has at least $\mu$ solutions in $]\alpha,\beta[$, counting multiplicities.

Letting $k$ tend to infinity, and using the fact that $V_n$ has exactly $(n-1)$ zeros in $]\alpha,\beta[$, this implies that $\mu \le (n-1)$. □

Proof of the assertion “$Y$ has at least $(m-1)$ zeros in $]\alpha,\beta[$, counting multiplicities”.

We have seen that the number of zeros of $Y_k$ is less than or equal to the number of zeros of the function $Y_{k+1}$. This assertion actually holds for any $k \in \mathbb{Z}$, and can also be rewritten as
\[ N_m(Y_{-k}) \le N_m(Y), \tag{3.9} \]
for any $k \ge 0$, where
\[ Y_{-k} = A_m (\rho_m - \rho_1)^{-k} V_m + \cdots + A_n (\rho_n - \rho_1)^{-k} V_n, \tag{3.10} \]
and we can again let $k$ tend to infinity. The second assertion is proved and Theorem 3.2 as well. □

Liouville’s 2nd approach to the 2nd part of Theorem 3.2.
Suppose that the function $Y$ has finitely many distinct zeros and changes sign at $\mu$ of them (so that $\mu$ is at most the number of distinct zeros). We call $a_i$, $\alpha < a_1 < \cdots < a_\mu < \beta$, the points at which $Y$ changes sign.

Claim 3.5. The function $Y$ changes sign at least $(m-1)$ times in the interval $]\alpha,\beta[$.

Proof of the claim. Assume, by contradiction, that $\mu \le (m-2)$. Consider the function
$$ x \mapsto W(x) := \Delta(a_1, \ldots, a_\mu; x), \tag{3.11} $$
where the function $\Delta$ is defined as the determinant
$$ \begin{vmatrix} V_1(a_1) & V_1(a_2) & \cdots & V_1(a_\mu) & V_1(x) \\ V_2(a_1) & V_2(a_2) & \cdots & V_2(a_\mu) & V_2(x) \\ \vdots & \vdots & & \vdots & \vdots \\ V_{\mu+1}(a_1) & V_{\mu+1}(a_2) & \cdots & V_{\mu+1}(a_\mu) & V_{\mu+1}(x) \end{vmatrix}. \tag{3.12} $$
The function $W$ vanishes at the points $a_i$, $1 \le i \le \mu$. According to the first part of Theorem 3.2, $W$, being a linear combination of the first $\mu + 1$ eigenfunctions, vanishes at most $\mu$ times in $]\alpha,\beta[$, counting multiplicities. This implies that each zero $a_i$ of $W$ has order one, and that $W$ does not have any other zero in $]\alpha,\beta[$. It follows that the function $YW$ vanishes only at the points $\{a_i\}$, $1 \le i \le \mu$, and that it does not change sign. We can assume that $YW \ge 0$. On the other hand,
$$ \int_\alpha^\beta G\, Y W \, dx = 0, $$
because $Y$ involves the functions $V_p$ with $p \ge m$ and $W$ the functions $V_q$ with $q \le \mu + 1 \le m - 1$. Since $G$ is positive and $YW$ is continuous and non-negative, this is a contradiction. □

Remark. Liouville does not actually use the determinant (3.12), but a similar approach, see [23, p. 259], Lemme 1er. The determinant $\Delta$ appears in [34, Section 142]. The paper [6] is based on a careful analysis of this determinant.

Remark. The arguments in Subsection 3.2, using Assertion (1) of Lemma 3.3 instead of Assertion (2), yield an upper bound on the number of zeros of $Y$, multiplicities not accounted for. This estimate holds under weaker regularity assumptions, namely only assuming that the functions $G, L$ are continuous, and that the function $K$ is $C^1$, see Appendix C, and compare with [15], Chap. III.5.

4. Mathematical context of Sturm’s papers. Sturm’s motivations and ideas
4.1. On Sturm’s style.
Sturm’s papers [39, 40] are written in French, and are quite long, about 80 pages each. One difficulty in reading them is the lack of layout structure. The papers are written linearly, and divided into sequences of sections, without any title. Most results are stated without tags, “Theorem” and the like, and only appear in the body of the text. For example, [39] only contains one theorem stated as such, see § XII, p. 125. In order to have an overview of the results contained in [39], the reader should look at the announcement [37]. Theorem 1.4 is stated in [38].

For a more thorough analysis of Sturm’s papers on differential equations, we refer to [26, 14]. We refer to [7, 33] for the relationships between Theorem 1.3 and Sturm’s theorem on the number of real roots of real polynomials.

4.2. Sturm’s motivations.
Sturm’s motivations come from mathematical physics, and more precisely, from the problem of heat diffusion in a non-homogeneous bar. He considers the heat equation,
$$ G \frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(K \frac{\partial u}{\partial x}\right) - L u, \quad \text{for } (x,t) \in\, ]\alpha,\beta[\, \times \mathbb{R}_+, \tag{4.1} $$
with boundary conditions
$$ K(\alpha) \frac{\partial u}{\partial x}(\alpha,t) - h\, u(\alpha,t) = 0, \qquad K(\beta) \frac{\partial u}{\partial x}(\beta,t) + H\, u(\beta,t) = 0, \tag{4.2} $$
for all $t > 0$, and with the initial condition
$$ u(x,0) = f(x), \quad \text{for } x \in\, ]\alpha,\beta[, \tag{4.3} $$
where $f$ is a given function.

The functions $K, G, L$ and the constants $h, H$ describe the physical properties of the bar, see [40, Introduction, p. 376]. Sturm refers to the book of Siméon Denis Poisson [31], rather than to Fourier’s book [13], because Poisson’s equations are more general, see [33, Chap. III]. The boundary conditions (1.2)-(1.3) and (4.2) first appeared in the work of Fourier [13] but are called “Robin’s condition” in the recent literature. Victor Gustave Robin (1855-1897) was a French mathematician.

As was popularized by Fourier and Poisson, in order to solve (4.1), Sturm uses the method of separation of variables, and is therefore led to the eigenvalue problem (1.1)–(1.3). Indeed, a product $u(x,t) = e^{-rt} V(x)$ satisfies (4.1)–(4.2) if and only if $V$ satisfies (1.1)–(1.3) with spectral parameter $r$.

4.3. Sturm’s assumptions.
In [39, 40], Sturm implicitly assumes that the functions $K, G, L$ are $C^\infty$ and, explicitly, that $K$ is positive, see [39, p. 108]. For the eigenvalue problem, he also assumes that $G, L$ are positive, see [40, p. 381]. In [40, p. 394], he mentions that $L$ could take negative values, and implicitly assumes, in this case, that $L/G$ is bounded from below.

In [24], Liouville does not mention any regularity assumption on the functions $G, K, L$. He however indicates a regularity assumption (piecewise C functions) in a previous paper, [23, Footnote (∗), p. 256].

4.4. Sturm’s originality.
Before explaining Sturm’s proofs, we would like to stress the originality of his approach. Indeed, unlike his predecessors, Sturm does not look for explicit solutions of the differential equation (4.4) (i.e., solutions in closed form, or given as sums of series or as integrals), but rather looks for qualitative properties of the solutions, properties which can be deduced directly from the differential equation itself. The following excerpts are translated from [39, Introduction] (see Appendix B for the original citations in French).

“One only knows how to integrate these equations in a very small number of particular cases, and one can otherwise not even obtain a first integral; even when one knows the expression of the function which satisfies such an equation, in finite form, as a series, as integrals either definite or indefinite, it is most generally difficult to recognize in this expression the behaviour and the characteristic properties of this function. . . .”

“Although it is important to be able to determine the value of the unknown function for an isolated value of the variable it depends upon, it is not less necessary to discuss the behaviour of this function, or otherwise stated, the form and the twists and turns of the curve whose ordinate would be the function, and the abscissa the independent variable. It turns out that one can achieve this goal by the sole consideration of the differential equations themselves, without having to integrate them. This is the purpose of the present memoir. . . .”
4.5. Sturm and the existence and uniqueness theorem for ordinary differential equations.

In [39, p. 108], Sturm considers the differential equation
$$ \frac{d}{dx}\left(K \frac{dV}{dx}\right) + G V = 0, \tag{I} $$
and takes the existence and uniqueness theorem for granted. More precisely, he claims [39, p. 108], without any reference whatsoever,

“The complete integral of equation (I) must contain two arbitrary constants, for which one can take the values of $V$ and of $\frac{dV}{dx}$ corresponding to some particular value of $x$. Once these values are fixed, the function $V$ is fully determined by equation (I), it has a uniquely determined value for each value of $x$.”

On the other hand, he gives two arguments for the fact that a solution of (I) and its derivative cannot vanish simultaneously at a point without vanishing identically, see [39, § II]. When the coefficients $K, G$ of the differential equation depend upon a parameter $m$, e.g. continuously, Sturm also takes for granted the fact that the solution $V(x,m)$, and its zeros, depend continuously on $m$.

In [40, § II], Sturm mentions the existence proof given by Liouville in [23], see also [22]. According to [16], Augustin-Louis Cauchy may have presented the existence and uniqueness theorem for ordinary differential equations in his course at École polytechnique as early as the year 1817-1818. Following a recommendation of the administration of the school, Cauchy delivered the notes of his lectures in 1824, see [9] and, in particular, the introduction by Christian Gilain who discovered these notes in 1974. These notes apparently had a limited distribution. Liouville entered the École polytechnique in 1825, and there attended the mathematics course given by Ampère (as a matter of fact Ampère and Cauchy gave the course every other year, alternately). Liouville’s proof of the existence theorem for differential equations in [22], à la Picard but before Picard, though limited to the particular case of 2nd order linear equations, might be the first well circulated proof of an existence theorem for differential equations, see [25, § § ].

(We are grateful to J. Lützen for providing this information.)
4.6. Sturm’s proof of Theorem 1.3.

Theorem 1.3 is proved in [40]. For the first assertion, see § III (p. 384) to VII; for the second assertion, see § VIII (p. 396) to X.

The proof is based on the paper [39] in which Sturm studies the zeros of the solution of the initial value problem,
$$ \frac{d}{dx}\left(K(x,m) \frac{dV}{dx}(x,m)\right) + G(x,m)\, V(x,m) = 0, \tag{4.4} $$
$$ \left(K \frac{dV}{dx} - hV\right)(\alpha, m) = 0. \tag{4.5} $$
Here $K, G$ are assumed to be functions of $x$ depending on a real parameter $m$, with $K$ positive (the constants $h$ and $H$ may also depend on the parameter $m$). The solution $V(x,m)$ is well defined up to a scaling factor. The main part of [39] is devoted to studying how the zeros of the function $V(x,m)$ (and other related functions) depend on the parameter $m$, see [39, § XII, p. 125]. While developing this program, Sturm proves the oscillation, separation and comparison theorems which nowadays bear his name, [39, § XV, XVI and XXXVII].

The eigenvalue problem (1.1)–(1.3) itself is studied in [40]. For this purpose, Sturm considers the functions $K(x,r) \equiv K(x)$ and $G(x,r) = rG(x) - L(x)$, the solution $V(x,r)$ of the corresponding initial value problem (4.4)–(4.5), and applies the results and methods of [39].

The spectral data of the eigenvalue problem (1.1)–(1.3) are determined by the following transcendental equation in the spectral parameter $r$,
$$ K(\beta) \frac{dV}{dx}(\beta, r) + H V(\beta, r) = 0, \tag{4.6} $$
see [40, § III, p. 383, line 8 from bottom].

4.7. Sturm’s two proofs of Theorem 1.4.
Theorem 1.4 appears in [40, § XXV, p. 431], see also the announcement [38].

Sturm’s general motivation, see the introductions to [39] and [40], was the investigation of heat diffusion in a (non-homogeneous) bar, whose physical properties are described by the functions $K, G, L$. He first obtained Theorem 1.4 as a corollary of a much deeper theorem which describes the behaviour, as time varies, of the $x$-zeros of a solution $u(x,t)$ of the heat equation (4.1)-(4.3). When the initial temperature $u(x,0)$ is given by a linear combination of simple states,
$$ u(x,0) = Y(x) = \sum_{j=m}^{n} A_j V_j, \tag{4.7} $$
the function $u(x,t)$ is given by
$$ u(x,t) = \sum_{j=m}^{n} e^{-t\rho_j} A_j V_j. \tag{4.8} $$
When $t$ tends to infinity, the $x$-zeros of $u(x,t)$ approach those of $V_p$, where $p$ is the least integer $j$, $m \le j \le n$, such that $A_j \ne 0$.

J. Liouville, who was aware of Theorem 1.4, made use of it in [23], and provided a purely “ordinary differential equation” proof in [24], a few months before the actual publication of [40]. This induced Sturm to provide two proofs of Theorem 1.4 in [40], his initial proof using the heat equation, and another proof based on the sole ordinary differential equation. The proofs of Sturm actually give a more precise result. In [40, p. 379], Sturm writes,

“M. Liouville gave a direct proof of this theorem, which for me was a mere corollary of the preceding one, without taking care of the particular case in which the function vanishes at one of the extremities of the bar. I have also found, after him, another direct proof which I give in this memoir. M. Liouville made use of the same theorem in a very nice memoir which he published in the July issue of his journal, and which deals with the expansion of an arbitrary function into a series made of the functions $V$ which we have considered.”

The time independent analog to studying the behaviour of the $x$-zeros of (4.8) is to study the behaviour of the zeros of the family of functions $\{Y_k\}_{k \in \mathbb{Z}}$, where
$$ Y_k(x) = \sum_{j=m}^{n} \rho_j^k A_j V_j, \tag{4.9} $$
as $k$ tends to infinity.

Appendix A. The limiting argument in (3.8)

Recall that we assume that $A_n \ne 0$. Define
$$ \omega = \left(\frac{\rho_{n-1} - \rho_1}{\rho_n - \rho_1}\right)^k. \tag{A.1} $$
One can rewrite (3.8) as $V_n(x) + \omega\, \Pi(x) = 0$, where
$$ \Pi(x) = \sum_{p=m}^{n-1} \left(\frac{\rho_p - \rho_1}{\rho_{n-1} - \rho_1}\right)^k \frac{A_p}{A_n} V_p. \tag{A.2} $$
It follows that $\Pi$ is uniformly bounded by
$$ |\Pi(x)| \le M := n \max_p \left|\frac{A_p}{A_n}\right| \max_p \sup_{[\alpha,\beta]} |V_p|. \tag{A.3} $$
Similarly,
$$ \left|\frac{d\Pi}{dx}(x)\right| \le N := n \max_p \left|\frac{A_p}{A_n}\right| \max_p \sup_{[\alpha,\beta]} \left|\frac{dV_p}{dx}\right|. \tag{A.4} $$
Call $\xi_1 < \xi_2 < \cdots < \xi_{n-1}$ the zeros of the function $V_n$ in the interval $]\alpha,\beta[$.

• Assume that $V_n(\alpha) \ne 0$ and $V_n(\beta) \ne 0$.
Since $\frac{dV_n}{dx}(\xi_i) \ne 0$, there exist $\delta, \varepsilon > 0$ such that $\left|\frac{dV_n}{dx}(x)\right| \ge \varepsilon$ for $x \in [\xi_i - \delta, \xi_i + \delta]$, and $|V_n(x)| \ge \varepsilon$ in $[\alpha,\beta] \setminus \cup\, ]\xi_i - \delta, \xi_i + \delta[$.
For $k$ large enough, we have $\omega M, \omega N \le \varepsilon/2$. It follows that in the interval $[\xi_i - \delta, \xi_i + \delta]$,
$$ \left|\frac{d}{dx}(V_n + \omega \Pi)\right| \ge \left|\frac{dV_n}{dx}\right| - \omega N \ge \varepsilon/2. $$
Furthermore,
$$ |V_n(\xi_i \pm \delta) + \omega \Pi(\xi_i \pm \delta)| \ge |V_n(\xi_i \pm \delta)| - \omega M \ge \varepsilon/2, $$
so that $V_n + \omega\Pi$ has the same sign as $V_n$ at the points $\xi_i \pm \delta$. Since $V_n(\xi_i + \delta)\, V_n(\xi_i - \delta) < 0$, we can conclude that $V_n + \omega\Pi$ has exactly one zero in each interval $]\xi_i - \delta, \xi_i + \delta[$.
In $[\alpha,\beta] \setminus \cup\, ]\xi_i - \delta, \xi_i + \delta[$, we have
$$ |V_n(x) + \omega \Pi(x)| \ge |V_n(x)| - \omega M \ge \varepsilon/2, $$
which implies that $V_n(x) + \omega\Pi(x) \ne 0$.

• Assume that $V_n(\alpha) = 0$ and $V_n(\beta) \ne 0$. This corresponds to the case $h = +\infty$ and $H \ne +\infty$. Hence the $V_j$ satisfy the Dirichlet condition at $\alpha$, and so does $\Pi$. Observing that $V_n'(\alpha) \ne 0$, it is immediate to see that there exists $\delta > 0$ such that, for $k$ large enough, $V_n(x) + \omega\Pi(x)$ has only $\alpha$ as zero in $[\alpha, \alpha + \delta]$.

• The other cases are treated in the same way. □
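The limiting argument can be checked numerically on a model problem. The sketch below is only an illustration, under assumptions that are not made in the text: it takes $K = G = L \equiv 1$ on $[0,\pi]$ with Dirichlet conditions ($h = H = \infty$), so that $V_j(x) = \sin(jx)$ and $\rho_j = j^2 + 1$. The coefficients $A_p$, the grid, and the helper names `normalized_Yk` and `count_sign_changes` are hypothetical choices. The code evaluates the left-hand side of (3.8) divided by $A_n$ and counts sign changes on a grid, which gives a lower bound for the number of zeros.

```python
import numpy as np

def count_sign_changes(values):
    """Count strict sign changes between consecutive samples (exact zeros are skipped)."""
    signs = np.sign(values)
    signs = signs[signs != 0]
    return int(np.sum(signs[:-1] * signs[1:] < 0))

def normalized_Yk(x, coeffs, m, k):
    """Left-hand side of (3.8) divided by A_n, assuming V_p(x) = sin(p x), rho_p = p**2 + 1.

    coeffs[i] is A_{m+i}; the last coefficient A_n must be non-zero and n >= 2.
    """
    n = m + len(coeffs) - 1
    rho = lambda p: p**2 + 1.0
    total = np.zeros_like(x)
    for i, a in enumerate(coeffs):
        p = m + i
        ratio = (rho(p) - rho(1)) / (rho(n) - rho(1))
        total += (a / coeffs[-1]) * ratio**k * np.sin(p * x)
    return total

# Midpoint grid in ]0, pi[, chosen so that no sample sits on a zero of sin(n x).
N = 2000
x = (np.arange(N) + 0.5) * np.pi / N

coeffs = [1.0, -2.0, 0.5, 1.0]   # A_1, ..., A_4 (arbitrary illustrative values)
m, n = 1, 4

changes = {k: count_sign_changes(normalized_Yk(x, coeffs, m, k)) for k in (0, 1, 5, 40)}
print(changes)
```

On this example the count for $k = 40$ equals $n - 1 = 3$, the number of zeros of $V_4$ in $]0,\pi[$, and for every $k$ the count stays at most $n - 1$, in line with the argument above.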
Appendix B. Citations from Sturm’s papers. French original and English translation

Citation from [39, Introduction].

On ne sait [ces équations] les intégrer que dans un très petit nombre de cas particuliers hors desquels on ne peut pas même en obtenir une intégrale première ; et lors même qu’on possède l’expression de la fonction qui vérifie une telle équation, soit sous forme finie, soit en série, soit en intégrales définies ou indéfinies, il est le plus souvent difficile de reconnaître dans cette expression la marche et les propriétés caractéristiques de cette fonction. . . .
S’il importe de pouvoir déterminer la valeur de la fonction inconnue pour une valeur isolée quelconque de la variable dont elle dépend, il n’est pas moins nécessaire de discuter la marche de cette fonction, ou en d’autres termes, d’examiner la forme et les sinuosités de la courbe dont cette fonction serait l’ordonnée variable, en prenant pour abscisse la variable indépendante. Or on peut arriver à ce but par la seule considération des équations différentielles elles-mêmes, sans qu’on ait besoin de leur intégration. Tel est l’objet du présent mémoire. . . .

One only knows how to integrate these equations in a very small number of particular cases, and one can otherwise not even obtain a first integral; even when one knows the expression of the function which satisfies such an equation, in finite form, as a series, as integrals either definite or indefinite, it is most generally difficult to recognize in this expression the behaviour and the characteristic properties of this function. . . .
Although it is important to be able to determine the value of the unknown function for an isolated value of the variable it depends upon, it is not less necessary to discuss the behaviour of this function, or otherwise stated, the form and the twists and turns of the curve whose ordinate would be the function, and the abscissa the independent variable. It turns out that one can achieve this goal by the sole consideration of the differential equations themselves, without having to integrate them. This is the purpose of the present memoir. . . .
Citation from [39, p. 108].

L’intégrale complète de l’équation (I) doit contenir deux constantes arbitraires, pour lesquelles on peut prendre les valeurs de $V$ et de $\frac{dV}{dx}$ correspondantes à une valeur particulière de $x$. Lorsque ces valeurs sont fixées, la fonction $V$ est entièrement définie par l’équation (I), elle a une valeur déterminée et unique pour chaque valeur de $x$.

The complete integral of equation (I) must contain two arbitrary constants, for which one can take the values of $V$ and of $\frac{dV}{dx}$ corresponding to some particular value of $x$. Once these values are fixed, the function $V$ is fully determined by equation (I), it has a uniquely determined value for each value of $x$.

Citation from [40, p. 379].

M. Liouville a démontré directement ce théorème, qui n’était pour moi qu’un corollaire du précédent, sans s’occuper du cas particulier où la fonction serait nulle à l’une des extrémités de la barre. J’en ai aussi trouvé après lui une autre démonstration directe que je donne dans ce mémoire. M. Liouville a fait usage du même théorème dans un très beau Mémoire qu’il a publié dans le numéro de juillet de son journal et qui a pour objet le développement d’une fonction arbitraire en une série composée de fonctions $V$ que nous avons considérées.

M. Liouville gave a direct proof of this theorem, which for me was a mere corollary of the preceding one, without taking care of the particular case in which the function vanishes at one of the extremities of the bar. I have also found, after him, another direct proof which I give in this memoir. M. Liouville made use of the same theorem in a very nice memoir which he published in the July issue of his journal, and which deals with the expansion of an arbitrary function into a series made of the functions $V$ which we have considered.

Appendix C. Sturm’s results under weaker assumptions
We proved Theorems 2.15 and 3.2 under the Assumptions (1.6). In this section, we consider the weaker assumptions
$$ [\alpha,\beta] \subset\, ]\alpha_0,\beta_0[, \qquad K \in C^1(]\alpha_0,\beta_0[), \qquad G, L \in C^0(]\alpha_0,\beta_0[), \qquad K, G, L > 0 \text{ on } ]\alpha_0,\beta_0[. \tag{C.1} $$
Under these assumptions, the functions $V_j$ are $C^2$ on $]\alpha_0,\beta_0[$. This follows easily, for example, from Liouville’s existence proof [23], and we have the following lemma, whose proof is analogous to the proof of Lemma 2.2.

Lemma C.1.
Let $k \in \mathbb{Z}$.
(1) The function $Y_k$ satisfies the boundary conditions (1.2) and (1.3).
(2) The functions $Y_k$ and $Y_{k+1}$ satisfy the relation
$$ G\, Y_{k+1} = K \frac{d^2 Y_k}{dx^2} + \frac{dK}{dx} \frac{dY_k}{dx} - L\, Y_k. \tag{C.2} $$
(3) Under the Assumptions (C.1), the function $Y_k$ cannot vanish identically on an open subinterval of $]\alpha_0,\beta_0[$, unless $Y \equiv 0$.

In Subsection 3.2, we have used Lemma 3.3 (2), which relies on the fact that the functions $V_j$ are $C^\infty$. If the functions $V_j$ are only $C^2$, we can apply Lemma 3.3 (1). It is easy to conclude that Liouville’s proofs in Subsections 3.2 and 3.3 go through, under the weaker Assumptions (C.1), if we only count distinct zeros, see (2.11). More precisely, we can prove the following claim.

Claim C.2.
Under the Assumptions (C.1), for any $k \in \mathbb{Z}$, if the function $Y_k$ has at least $\mu$ distinct zeros in the interval $]\alpha,\beta[$, then the function $Y_{k+1}$ has at least $\mu$ distinct zeros in the interval $]\alpha,\beta[$.

We can then deduce from this claim, as in Section 3, that a linear combination $Y = \sum_{j=m}^{n} A_j V_j$ has at most $(n-1)$ distinct zeros (in particular it has finitely many zeros).

Once this result is secured, we can define zeros at which $Y$ changes sign (without using the multiplicity), and apply Sturm’s lower bound argument to conclude that the function $Y$ must change sign at least $(m-1)$ times.
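The two bounds obtained in this appendix (at most $n-1$ distinct zeros, at least $m-1$ sign changes) can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the text: the model problem $K = G = L \equiv 1$ on $[0,\pi]$ with Dirichlet conditions, for which $V_j(x) = \sin(jx)$; random coefficients drawn with a fixed seed; and a hypothetical helper `sign_changes_of_combination`. Sign changes are counted on a fine grid, so the count can miss zeros that are closer together than the grid step.

```python
import numpy as np

def sign_changes_of_combination(coeffs, m, num_points=20000):
    """Sign changes in ]0, pi[ of Y = sum_j A_j sin(j x), j = m..n, sampled on a grid.

    coeffs[i] is A_{m+i}. The midpoint grid excludes the endpoints 0 and pi,
    where every sin(j x) vanishes.
    """
    x = (np.arange(num_points) + 0.5) * np.pi / num_points
    y = sum(a * np.sin((m + i) * x) for i, a in enumerate(coeffs))
    signs = np.sign(y)
    signs = signs[signs != 0]          # ignore exact zeros on the grid
    return int(np.sum(signs[:-1] * signs[1:] < 0))

m, n = 3, 6
rng = np.random.default_rng(0)         # fixed seed, illustrative only
counts = []
for _ in range(5):
    coeffs = rng.standard_normal(n - m + 1)   # A_3, ..., A_6
    counts.append(sign_changes_of_combination(coeffs, m))

print(counts)   # each entry is expected to lie between m - 1 and n - 1
```

For a single eigenfunction the two bounds coincide: `sign_changes_of_combination([1.0], 4)` counts the three sign changes of $\sin(4x)$ in $]0,\pi[$.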
Appendix D. Sturm’s original o.d.e. proof
The first proof of Theorem 1.4 appears in [40, § XXV, p. 431], as a corollary of a more profound theorem (§ XXIV) which describes the behaviour, as $t$ grows from $0$ to infinity, of the zeros of $x \mapsto u(x,t)$, where $u$ is a solution of the heat equation (4.1)-(4.3).

Sturm proves that the number $N(t)$ of zeros of the function $x \mapsto u(x,t)$ is piecewise constant, non-increasing in $t$, and that jumps occur precisely for values of $t$ such that $u(x,t)$ and $\frac{\partial u}{\partial t}(x,t)$ have common zeros. We refer to [14] for an analysis of this aspect of Sturm’s paper [40].

The second proof, purely o.d.e., is developed in [40, § XXVI, p. 436 ff]. In this section, we give the main steps of this proof (with page numbers and number of line from top ℓ↓, resp. from bottom ℓ↑).

p. 436 ℓ↑13, Sturm mentions Liouville’s proof [24].
“M. Liouville a démontré directement le théorème du numéro précédent (dans le cahier d’août de son journal) sans employer la considération de la variable auxiliaire t qui entre dans la fonction u (42) dont j’ai fait usage. Il n’a pas tenu compte toutefois de la racine x ou X lorsqu’elle existe. Je vais donner ici une autre démonstration directe du même théorème, indépendante de celui du n° XXIV.”
(In the quotation above, x and X are respectively $\alpha$ and $\beta$ with our notation.)

He introduces the linear combination
$$ Y = C_i V_i + C_{i+1} V_{i+1} + \cdots + C_p V_p, \tag{43} $$
and, p. 436 ℓ↑1, its companion
$$ Y_1 = -\left(C_i \rho_i V_i + C_{i+1} \rho_{i+1} V_{i+1} + \cdots + C_p \rho_p V_p\right). $$

p. 437, Sturm establishes the differential relation
$$ g Y_1 = k \frac{d^2 Y}{dx^2} + \frac{dk}{dx} \frac{dY}{dx} - \ell\, Y. \tag{44} $$
He also notes, ℓ↓5, that the function $Y_1$ satisfies the boundary conditions (1.2)-(1.3). Sturm’s idea, “Je vais prouver . . .”, is to prove that the function $Y_1$ has at least as many zeros in $]\alpha,\beta[$, counted with multiplicities, as the function $Y$ in the same circumstances.

p. 437 ℓ↑10, Sturm makes the implicit assumption that the zeros of $Y$ are isolated.

p. 439 ℓ↓5, Sturm states that the number of sign changes of $Y_1$ in $]\alpha,\beta[$ is not smaller than the number of sign changes of $Y$. He then considers the zeros with multiplicities, and implicitly assumes that the function $Y$ (assumed not to be identically zero) does not vanish at infinite order at some point.

p. 440 ℓ↑13, Sturm states that the number of zeros of $Y_1$ in $]\alpha,\beta[$, counted with multiplicities, is not smaller than the number of zeros of $Y$. He then examines (ℓ↑6) the possible zeros of $Y$ at $\alpha$ or $\beta$.

p. 442 ℓ↓7, Sturm states that the number of zeros of $Y_1$ in $[\alpha,\beta]$, counted with multiplicities (with a special rule for counting multiplicities at $\alpha$, $\beta$), is not smaller than the number of zeros of $Y$.

p. 442 ℓ↓13, Sturm iterates the procedure (with $Y_k$), and uses a limiting argument to conclude that the number of zeros of $Y$ in $[\alpha,\beta]$, counting multiplicities, is at most $p - 1$.

p. 443, Sturm proves the lower bound for the number of zeros and (ℓ↑) observes that the functions $Y_k$ are equal to $\frac{d^k u}{dt^k}(x,0)$. He also states that $Y$ cannot vanish identically unless all the coefficients $C_j$ are zero. He does not mention the fact that $Y$ can actually not vanish at infinite order at any point.

p. 444 ℓ↓3, Sturm explains what to do when no assumption is made on the sign of the function $\ell$. Taking $Y$ as above, and defining
$$ Y_1 = -\left(C_i (\rho_i + c) V_i + C_{i+1} (\rho_{i+1} + c) V_{i+1} + \cdots + C_p (\rho_p + c) V_p\right), $$
where $c$ is a constant, he obtains
$$ g Y_1 = k \frac{d^2 Y}{dx^2} + \frac{dk}{dx} \frac{dY}{dx} - (gc + \ell)\, Y. $$
It suffices to assume that the constant $c$ is such that $gc + \ell > 0$, and to argue as before with $Y$.

Appendix E. Cross references to Sturm’s and Liouville’s papers
In this Appendix, we give the references to pages in Sturm’s paper [40, § XXVI] for the results in our paper.

• Lemma 2.2: p. 437. Note that the third assertion does not appear in Sturm’s paper. He indeed implicitly assumes that the zeros of $Y$ are isolated.
• Lemma 2.4: p. 439.
• Lemma 2.5: p. 440-441.
• Lemma 2.7: p. 437.
• Lemma 2.9: p. 438.
• Proposition 2.11: p. 437-439.
• Proposition 2.12: p. 439-442.
• Proposition 2.14: p. 440-442.
• Theorem 2.15: p. 442.

Here are the pages in Liouville’s paper [24].

• Theorem 3.2: p. 272.
• Lemma 3.3: Mentioned p. 272. No precise statement, no proof provided by Liouville.
• Proof of first assertion. Lemma 3.4: p. 274.
• Proof of second assertion: p. 276 and reference to [23]. Claim 3.5: We use the determinant $\Delta$ to simplify Liouville’s [23, Lemme 1er, p. 259].

Numbers inserted after a reference indicate the pages where it is cited.
References

[1] V. Arnold. Topology of real algebraic curves (works of I.G. Petrovsky and their development) [Russian]. Usp. Mat. Nauk. 28:5 (1973), 260–262. Translated by O. Viro, in V.I. Arnold, Collected works, Vol. 2, pp. 251–254. Springer 2014. 3
[2] V. Arnold. Ordinary differential equations. Translated from the 3rd Russian edition by Roger Cooke. Springer-Verlag 1992. 3
[3] V. Arnold. Topological properties of eigenoscillations in mathematical physics. Proc. Steklov Inst. Math., 273 (2011), 25–34. 3
[4] P. Bérard and B. Helffer. On Courant’s nodal domain property for linear combinations of eigenfunctions, Part I. arXiv:1705.03731. 3
[5] P. Bérard and B. Helffer. On Courant’s nodal domain property for linear combinations of eigenfunctions, Part II. arXiv:1803.00449. 3, 4
[6] P. Bérard and B. Helffer. Sturm’s theorem on the zeros of sums of eigenfunctions: Gelfand’s strategy implemented. arXiv:1807.03990. 4, 18
[7] M. Bôcher. The published and unpublished work of Charles Sturm on algebraic and differential equations. Bull. Amer. Math. Soc., 18 (1911), 1–18. 18
[8] M. Bôcher. Leçons sur les méthodes de Sturm dans la théorie des équations différentielles linéaires et leurs développements modernes. Gauthier-Villars et Cie, Éditeurs. Paris 1917. 3
[9] A. L. Cauchy. Équations différentielles ordinaires. Cours inédit (Fragment). Critical edition by Christian Gilain. Paris-Québec: Études vivantes, and New York: Johnson Reprint, p. I–LVI et p. 1–146, 1981. 20
[10] R. Courant and D. Hilbert. Methoden der mathematischen Physik, Vol. I. Springer 1931. 3
[11] R. Courant and D. Hilbert. Methods of mathematical physics. Vol. 1. First English edition. Interscience, New York 1953. 3
[12] A. Eremenko and D. Novikov. Oscillation of Fourier integrals with a spectral gap. Journal de mathématiques pures et appliquées, 83 (2004), 313–365. 3
[13] J.B. Joseph Fourier. Théorie analytique de la chaleur. Chez Firmin Didot, père et fils, 1822 (639 pages). 2, 19
[14] V.A. Galaktionov and P.J. Harwin. Sturm’s theorems on zero sets in nonlinear parabolic equations. In: Sturm-Liouville theory: Past and present. W.O. Amrein, A.M. Hinz, D.B. Pearson ed. Birkhäuser Verlag Basel 2005, 173–199. 4, 18, 27
[15] F. Gantmacher and M. Krein. Oscillation matrices and kernels and small vibrations of mechanical systems. AMS Chelsea Publishing, 2002. 3, 18
[16] C. Gilain. Cauchy et le cours d’analyse de l’École polytechnique. Revue de la SABIX, 5 (1989), 3–31. 20
[17] G. Gladwell and H. Zhu. The Courant-Herrmann conjecture. ZAMM - Z. Angew. Math. Mech., 83:4 (2003), 275–281. 3
[18] H. Herrmann. Beiträge zur Theorie der Eigenwerte und Eigenfunktionen. Göttinger Dissertation 1932. Published by Teubner. 3
[19] A. Hurwitz. Über die Fourierschen Konstanten integrierbaren Funktionen. Math. Annalen, 57 (1903), 425–446. 3
[20] N. Kuznetsov. On delusive nodal sets of free oscillations. Newsletter of the European Mathematical Society, 96 (2015), 34–40. 4
[21] P.-S. Laplace. Théorie analytique des probabilités. 3rd edition. Coursier Paris 1820. 7
[22] J. Liouville. Analyse appliquée. Mémoire sur la théorie analytique de la chaleur. Annales de Mathématiques Pures et Appliquées, 21 (1830), 131–181. 20
[23] J. Liouville. Mémoire sur le développement de fonctions ou parties de fonctions en séries dont les divers termes sont assujétis à satisfaire à une même équation différentielle du second ordre, contenant un paramètre variable. Journal de Mathématiques Pures et Appliquées, 1 (1836), 253–265. 2, 3, 4, 5, 18, 19, 20, 22, 26, 29
[24] J. Liouville. Démonstration d’un théorème dû à M. Sturm et relatif à une classe de fonctions transcendantes. Journal de Mathématiques Pures et Appliquées, 1 (1836), 269–277. 4, 14, 19, 22, 27, 28
[25] J. Lützen. Sturm and Liouville’s work on ordinary linear differential equations. The emergence of Sturm-Liouville theory. Archive History Exact Sciences, 29:4 (1984), 309–376. 20
[26] J. Lützen and A. Mingarelli. Charles François Sturm and differential equations. In: Collected works of Charles François Sturm. Jean-Claude Pont (ed.), in coll. with Flavia Padovani. Birkhäuser Verlag Basel, 2009, 25–47. 4, 18
[27] F.N.M. Moigno (Monsieur l’abbé). Leçons de calcul différentiel et de calcul intégral, rédigées principalement d’après les méthodes et les ouvrages de M. A.-L. Cauchy, et étendues aux travaux les plus récents des géomètres. Bachelier, Paris. Vol. 1 (1840) and Vol. 2 (1844). 20
[28] V. Ovsienko and S. Tabachnikov. Projective differential geometry old and new: from Schwarzian derivative to cohomology of diffeomorphism groups. Cambridge University Press 2005. 3
[29] Å. Pleijel. Remarks on Courant’s nodal theorem. Comm. Pure. Appl. Math., 9 (1956), 543–550. 3
[30] F. Pockels. Über die partielle Differentialgleichung $\Delta u + k^2 u = 0$ und deren Auftreten in der mathematischen Physik. Teubner, Leipzig 1891. 2
[31] S. D. Poisson. Théorie mathématique de la chaleur. Bachelier, Paris 1835. 19
[32] M. Rolle. Démonstration d’une méthode pour résoudre les égalités de tous les degrés. Chez Jean Cusson, Paris 1691. 15
[33] H. Sinaceur. Corps et modèles. Essai sur l’histoire de l’algèbre réelle. Seconde édition corrigée. Vrin, Paris 1999. 18, 19
[34] J.W. Strutt, Baron Rayleigh. The Theory of Sound. Vol. I. Macmillan and Co., London, 1877. 2, 18
[35] S. Steinerberger. Quantitative projections in the Sturm oscillation theorem. arXiv:1804.05779, 18 Apr 2018. 3
[36] C. Sturm. Extrait d’un mémoire de M. Sturm, présenté à l’Académie des sciences, dans sa séance du 1er juin 1829. Bulletin de Férussac, XI (1829), 422–425. 3
[37] C. Sturm. Analyse générale d’un mémoire sur les propriétés générales des fonctions qui dépendent d’équations différentielles linéaires du second ordre, présenté à l’Académie des sciences de Paris, le 30 septembre 1833. L’institut. Journal général des sociétés et travaux scientifiques de la France et de l’étranger, 1 (1833), 247–248. 2, 4, 18
[38] C. Sturm. Monsieur Sturm nous prie d’insérer la note suivante. L’institut. Journal général des sociétés et travaux scientifiques de la France et de l’étranger, 1 (1833), 247–248. 2, 4, 18, 21
[39] C. Sturm. Mémoire sur les équations différentielles linéaires du second ordre. Journal de Mathématiques Pures et Appliquées, 1 (1836), 106–186. 2, 4, 6, 18, 19, 20, 21, 24, 25
[40] C. Sturm. Mémoire sur une classe d’équations à différences partielles. Journal de Mathématiques Pures et Appliquées, 1 (1836), 373–444. 2, 3, 4, 6, 7, 9, 10, 11, 13, 14, 18, 19, 20, 21, 22, 25, 27, 28
[41] O. Viro. Construction of multi-component real algebraic surfaces. Soviet Math. Dokl., 20:5 (1979), 991–995. 3

PB: Institut Fourier, Université Grenoble Alpes and CNRS, B.P.74, F38402 Saint Martin d’Hères Cedex, France.
E-mail address: [email protected]

BH: Laboratoire Jean Leray, Université de Nantes and CNRS, F44322 Nantes Cedex, France.
E-mail address: