A lower bound for the modulus of the Dirichlet eta function on partition P from 2-D principal component analysis
arXiv [math.GM]
A lower bound for the modulus of the Dirichlet eta function when $\Re(s) \ge 1/2$ using 2-D principal component analysis
Yuri Heymann
Abstract
The present manuscript aims to derive an expression for the lower bound of the modulus of the Dirichlet eta function on vertical lines $\Re(s) = \alpha$. An approach based on a two-dimensional principal component analysis matching the dimensionality of the complex plane, which is built on a parametric ellipsoidal shape, has been undertaken to achieve this result. This lower bound, which is expressed as $\forall s \in \mathbb{C}$ s.t. $\Re(s) \in P$, $|\eta(s)| \ge \left|1 - \frac{\sqrt{2}}{2^{\alpha}}\right|$, where $\eta$ is the Dirichlet eta function, has implications for the Riemann hypothesis as $|\eta(s)| > 0$ for any such $s \in \mathbb{C}$ s.t. $\Re(s) \in P$, where $P$ is a partition spanning one half of the critical strip, on a side of the critical line $\Re(s) = 1/2$.

Keywords
Dirichlet eta function, PCA, Analytic continuation
Yuri Heymann, Georgia Institute of Technology, Atlanta, GA 30332, USA. E-mail: [email protected]

The Dirichlet eta function is an alternating series related to the Riemann zeta function of interest in the field of number theory for the study of the distribution of primes [9]. Both series are tied together on a two-by-two relationship expressed as $\eta(s) = \left(1 - 2^{1-s}\right)\zeta(s)$ where $s$ is a complex number. The location of the so-called non-trivial zeros of the Riemann zeta function in the critical strip $\Re(s) \in \,]0, 1[$ is key in the prime-number theory. For example, the Riemann-von Mangoldt explicit formula, which is an asymptotic expansion of the prime-counting function, involves a sum over the non-trivial zeros of the Riemann zeta function [7]. The Riemann hypothesis, which is focused on the domain of existence of the zeros in the critical strip, has implications for the accurate estimate of the error involved in the prime-number theorem and a variety of conjectures such as the Lindelöf hypothesis [2], conjectures about short intervals containing primes [6], Montgomery's pair correlation conjecture [5], the inverse spectral problem for fractal strings [4], etc. Moreover, variants of the Riemann hypothesis falling under the generalized Riemann hypothesis in the study of modular L-functions [8] are core for many fundamental results in number theory and related fields such as the theory of computational complexity. For instance, the asymptotic behavior of the number of primes less than $x$ described in the prime-number theorem, $\pi(x) \sim \frac{x}{\ln(x)}$, provides a smooth transition of time complexity as $x$ approaches infinity. As such, the time complexity of the prime-counting function using the $x/(\ln x)$ approximation is of order $O(M(n) \log n)$, where $n$ is the number of digits of $x$ and $M(n)$ is the time complexity for multiplying two $n$-digit numbers. This figure is based on the time complexity to compute the natural logarithm using the arithmetic-geometric mean approach, which is of the same order as above but where $n$ represents the number of digits of precision at which the natural logarithm is to be evaluated.

The below definitions provided in the current manuscript are supplementary to standard definitions when referring to reals, complex numbers and holomorphic functions. Any real number can be expressed either as an integer, a rational or an irrational number. A complex number is the composite of a real and imaginary number, forming a 2-D surface, where the pure imaginary axis is represented by the letter $i$ such that $i^2 = -1$. The neutral element and index $i$ form a complementary basis spanning a vector space. The Dirichlet eta function is a holomorphic function having for domain a subset of the complex plane where reals are positive, denoted $\mathbb{C}^+$, whose arguments are sent to the codomain belonging to $\mathbb{C}$. Any holomorphic function is characterized by its modulus, a variable having for support an axial vertex orthonormal to the complex plane forming a hologram comprised of five semi-surfaces and a hemisphere, which is akin to a polyhedron of four vertices. The conformal way to describe space in geometrical terms is the orthogonal system, which consists of eight windows delimited by the axes of the Cartesian coordinates, between the three even surfaces of the Euclidean space adjacent to each other. The hexagon known today as a flat shape refers to the six directions orthonormal to the faces of a cube embedded within the Cardinal mund, which led to the definition of Tensors, to describe the three orthogonal directions associated with each face of the cube.

As a reminder, the definition of the Riemann zeta function and its analytic continuation to the critical strip are displayed below. The Riemann zeta function is commonly expressed as follows:

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}}, \tag{1}$$

where $s$ is a complex number and $\Re(s) > 1$.

The standard approach for the analytic continuation of the Riemann zeta function to the critical strip $\Re(s) \in \,]0, 1[$ is performed with the multiplication of $\zeta(s)$ with the function $\left(1 - \frac{2}{2^{s}}\right)$, leading to the Dirichlet eta function. By definition, we have:

$$\eta(s) = \left(1 - \frac{2}{2^{s}}\right)\zeta(s) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^{s}}, \tag{2}$$

where $\Re(s) > 0$ and $\eta$ is the Dirichlet eta function. By continuity as $s$ approaches one, $\eta(1) = \ln(2)$.

The function $\left(1 - \frac{2}{2^{s}}\right)$ has an infinity of zeros on the line $\Re(s) = 1$ given by $s_k = 1 + \frac{2k\pi i}{\ln 2}$ where $k \in \mathbb{Z}^{*}$. As $\left(1 - \frac{2}{2^{s}}\right) = 2 \times \frac{2^{\alpha-1} e^{i\beta \ln 2} - 1}{2^{\alpha} e^{i\beta \ln 2}}$, the factor $\left(1 - \frac{2}{2^{s}}\right)$ has no poles nor zeros in the critical strip $\Re(s) \in \,]0, 1[$.
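The definitions in (2) and the zeros of the factor $\left(1 - \frac{2}{2^{s}}\right)$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `eta_partial`, `eta_factor` and `factor_zero` are hypothetical, and averaging two consecutive partial sums is an assumption made here only to speed up the convergence of the alternating series.

```python
import math

def eta_partial(s, terms=20000):
    """Approximate eta(s) from the alternating series in (2).

    Assumption: the mean of the last two partial sums is used, which
    damps the oscillation of the alternating series for Re(s) > 0.
    """
    total = 0j
    previous = 0j
    for n in range(1, terms + 1):
        previous = total
        total += (-1) ** (n + 1) / n ** s
    return (total + previous) / 2

def eta_factor(s):
    """The factor (1 - 2 / 2^s) linking eta and zeta in (2)."""
    return 1 - 2 / 2 ** s

def factor_zero(k):
    """Candidate zero s_k = 1 + 2*k*pi*i / ln 2 of the factor."""
    return 1 + 2j * math.pi * k / math.log(2)

if __name__ == "__main__":
    print(abs(eta_partial(1) - math.log(2)))     # small: eta(1) = ln 2
    print(abs(eta_factor(factor_zero(1))))       # small: zero on Re(s) = 1
    print(abs(eta_factor(0.5 + 14.1347j)))       # nonzero inside the strip
```

The last line evaluates the factor at an arbitrary sample point of the critical strip; by the reverse triangle inequality its modulus is at least $|2^{1-\alpha} - 1|$, which is positive for $\alpha \neq 1$.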
We can infer that the Dirichlet eta function can be used as a proxy of the Riemann zeta function for zero finding in the critical strip $\Re(s) \in \,]0, 1[$.

From the above, the Dirichlet eta function is expressed as:

$$\eta(s) = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{e^{-i\beta \ln(n)}}{n^{\alpha}}, \tag{3}$$

where $s = \alpha + i\beta$ is a complex number, and $\alpha$ and $\beta$ are real numbers.

We have $\frac{1}{n^{s}} = \frac{1}{n^{\alpha} \exp(\beta i \ln n)} = \frac{1}{n^{\alpha}\left(\cos(\beta \ln n) + i \sin(\beta \ln n)\right)}$. We then multiply both the numerator and denominator by $\cos(\beta \ln(n)) - i \sin(\beta \ln(n))$. After several simplifications, $\eta(s) = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\cos(\beta \ln n) - i \sin(\beta \ln n)}{n^{\alpha}}$.

In the remaining portion of the manuscript, the Riemann zeta function is referring to its formal definition and analytic continuation by congruence.

Proposition 1
Given $z_1$ and $z_2$ two complex numbers, we have:

$$\left| z_1 + z_2 \right| \ge \Big|\, |z_1| - |z_2| \,\Big|, \tag{4}$$

where $|z|$ denotes the modulus of the complex number $z$. As a reverse triangle inequality, (4) can be extended to any normed vector space.

Proof
Let us write the complex numbers in polar form and set $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$, where their respective moduli and the constituents of their imaginary arguments, $\theta_1$ and $\theta_2$, are reals. We get:

$$\left| z_1 + z_2 \right| = \left| r_1 e^{i\theta_1} + r_2 e^{i\theta_2} \right| = \sqrt{\left(r_1 \cos\theta_1 + r_2 \cos\theta_2\right)^2 + \left(r_1 \sin\theta_1 + r_2 \sin\theta_2\right)^2} = \sqrt{r_1^2 + r_2^2 + 2 r_1 r_2 \cos(\theta_1 - \theta_2)}. \tag{5}$$

The trigonometric identities $2\cos(a)\cos(b) = \cos(a-b) + \cos(a+b)$ and $2\sin(a)\sin(b) = \cos(a-b) - \cos(a+b)$ were used in (5), see [1] formulas 4.3.31 and 4.3.32, page 72. Eq. (4) follows from (5) and (6) below:

$$\Big|\, |z_1| - |z_2| \,\Big| = \left| r_1 + r_2 e^{i\pi} \right| = \sqrt{r_1^2 + r_2^2 - 2 r_1 r_2}. \tag{6}$$

Proposition 2
Let us consider an ellipse $(x_t, y_t) = [a\cos(t), b\sin(t)]$ where $a$ and $b$ are two positive reals corresponding to the lengths of the semi-major and semi-minor axes of the ellipse ($a \ge b$) and $t \in [0, 2\pi]$ is the angle between the x-axis and the vector $(x_t, y_t)$.

Let us set $t$ such that the semi-major axis of the ellipse is aligned with the x-axis, which is the angle maximizing the objective function defined as the modulus of $(x_t, y_t)$. When $|(x_t, y_t)|$ is maximized, we have:

$$|(x_t, y_t)| = x_t + y_t = a. \tag{7}$$

Note that by maximizing $x_t + y_t$, we would get $|(x_t, y_t)| < x_t + y_t$, as the expression $x_t + y_t$ is maximized when $t = \arctan(b/a)$, leading to $\max(x_t + y_t) = \sqrt{a^2 + b^2}$.

Proof
Principal Component Analysis (PCA) is a statistical method for reducing the dimensionality of a variable space by representing it with a few orthogonal variables capturing most of the variability of an observable. In the current context, a two-dimensional principal component analysis built on a parametric ellipsoidal shape is introduced to match the dimensionality of the complex plane. An ellipse centered at the origin of the coordinate system can be parametrized as follows: $(x_t, y_t) = [a\cos(t), b\sin(t)]$ where $a$ and $b$ are positive real numbers corresponding to the lengths of the semi-major and semi-minor axes of the ellipse ($a \ge b$) and $t \in \mathbb{R}$ is the angle between the x-axis and the vector $(x_t, y_t)$. As the directions of the x and y-axes are orthogonal, the objective function is maximized with respect to $t$ when the major axis is aligned with the x-axis. When $|(x_t, y_t)|$ is at its maximum value, we get $|(x_t, y_t)| = x_t + y_t = a$ where $a$ is the length of the semi-major axis. The modulus of $(x_t, y_t)$ is as follows: $|(x_t, y_t)| = \sqrt{a^2\cos^2(t) + b^2\sin^2(t)} \le a$, $\forall t \in \mathbb{R}$.

By analogy with principal component analysis, $x_t$ represents the first principal component and $y_t$ the second principal component. Say $x_t$ and $y_t$ were not orthogonal, then there would be a phase shift $\varphi$ between the components, i.e. $x_t = a\cos(t)$ and $y_t = b\sin(t + \varphi)$.

Proposition 3
Given a vector $V_E = [u(E), v(E)]$ defined in a bidimensional vector space, where $u(E)$ and $v(E)$ are two real-valued functions, say on $\mathbb{R}^{\nu} \to \mathbb{R}$, where $\nu$ represents the degrees of freedom of the system. The reference of a point in such a system is described by the set $E = \{\varepsilon_1, \varepsilon_2, ..., \varepsilon_{\nu}\}$ representing a multidimensional coordinate system. Thus, we have:

$$|V_E| = u(E) + v(E), \tag{8}$$

only if $u(E)\,v(E) = 0$ and $u(E) + v(E) \ge 0$.

Given the basis $\{e_1, e_2\}$ of the above-mentioned vector space, where $|e_i| = 1$ for $i = 1, 2$, proposition 3 is true only if the inner product across basis elements is equal to zero, i.e. $e_1 \cdot e_2 = 0$, meaning that the basis elements are disentangled from each other. We say that $\{e_1, e_2\}$ is an orthonormal basis. This condition is also necessary for propositions 2 and 4 to be true. The above vector space is commonly referred to as the Euclidean space in Cartesian coordinates.

Proof
By the square rule, we have $(u + v)^2 = u^2 + v^2 + 2uv$. The modulus of a vector $V$ as defined in an orthonormal basis is $|V| = \sqrt{u^2 + v^2}$, leading to $|V| = |u + v|$ only if $uv = 0$.

Proposition 4
Given a circle of radius $r \in \mathbb{R}^{+}$ parametrized as follows: $(x_t, y_t) = [r\cos(t), r\sin(t)]$ where $t$ is a real variable in $[0, 2\pi]$, we construct a function $f(t) = a\cos(t) + b\sin(t + \varphi)$ where $a$ and $b$ are two positive reals and $\varphi$ a real variable which can be positive or negative such that:

$$r\cos(t) + r\sin(t) = a\cos(t) + b\sin(t + \varphi). \tag{9}$$

Let us say $u_t = a\cos(t)$ and $v_t = b\sin(t + \varphi)$.

When $a \ge b$, the first component $u_t$ is the one which carries most of the variance of $f(t)$ and we have:

$$|(x_t, y_t)| \le \max(v_t) \le \max(u_t), \tag{10}$$

where $t \in [0, 2\pi]$, $\max(u_t)$ is the maximum value of $u_t$ and $\max(v_t)$ is the maximum value of $v_t$.

When $a \le b$, the component $v_t$ carries most of the variance of $f(t)$ and we have:

$$\max(u_t) \le |(x_t, y_t)| \le \max(v_t), \tag{11}$$

where $t \in [0, 2\pi]$.

When the functions $u_t$ and $v_t$ are orthogonal, i.e. $\varphi = 0$, we have $r = a = b$.

Proof
We have $(r + \delta)\cos(t) + (r - \delta_1)\sin(t) = r\cos(t) + r\sin(t) + \delta\cos(t) - \delta_1\sin(t)$. Hence, we want that $\delta\cos(t) - \delta_1\sin(t) = 0$. Thus, $\delta_1 = \delta\cot(t)$. Therefore we get: $r\cos(t) + r\sin(t) = (r + \delta)\cos(t) + (r - \delta\cot(t))\sin(t)$. As $r\sin(t) - \delta\cos(t) = \sqrt{r^2 + \delta^2}\,\sin(t + \varphi)$ where $\varphi = -\arctan(\delta/r)$, we get $\forall t \in [0, 2\pi]$, $r\cos(t) + r\sin(t) = (r + \delta)\cos(t) + \sqrt{r^2 + \delta^2}\,\sin(t + \varphi)$. We set $a = r + \delta$ and $b = \sqrt{r^2 + \delta^2}$ where $-r \le \delta \le r$, leading to (9). If $\delta \ge 0$, we have $r \le \sqrt{r^2 + \delta^2} \le r + \delta$, leading to (10). If $\delta \le 0$, we have $r + \delta \le r \le \sqrt{r^2 + \delta^2}$, leading to (11). As $\max\{r\cos(t) + r\sin(t)\} = \sqrt{2}\,r$ which occurs when $t = \frac{\pi}{4}$, we have $\delta \in [-r, r]$ for any $r \ge 0$.

Proposition 5
Given an alternating series $S_m$ constructed on a sequence $\{a_n\}$ monotonically decreasing with respect to its index $n \ge m \in \mathbb{N}$ where $a_n > 0$ and $\lim_{n \to \infty} a_n = 0$, defined such that:

$$S_m = \sum_{n=m}^{\infty} (-1)^{n+1} a_n, \tag{12}$$

where $m$ is an integer, we get the below upper bound inequality on the absolute value of the series:

$$\left| \sum_{n=m}^{\infty} (-1)^{n+1} a_n \right| \le a_m, \tag{13}$$

where $a_m$ represents a radius in Leibniz's notations.

Proof
By Leibniz's rule, the series $S_m$ is convergent as $\{a_n\}$ is monotonically decreasing and $\lim_{n \to \infty} a_n = 0$. Let us define the series $L = \sum_{n=1}^{\infty} (-1)^{n+1} a_n$ and its partial sum $S_k = \sum_{n=1}^{k} (-1)^{n+1} a_n$. The odd partial sums decrease as $S_{2(m+1)+1} = S_{2m+1} - a_{2m+2} + a_{2m+3} \le S_{2m+1}$. The even partial sums increase as $S_{2(m+1)} = S_{2m} + a_{2m+1} - a_{2m+2} \ge S_{2m}$. As the odd and even partial sums converge to the same value, we have $S_{2m} \le L \le S_{2m+1}$ for any finite $m \in \mathbb{N}^{*}$.

In (14) and (15) below, $S_m$ on the left-hand side is the series defined in (12), while $S_m$ inside $(L - S_m)$ is the partial sum up to index $m$.

When $m$ is odd:

$$S_m = a_m + \sum_{n=m+1}^{\infty} (-1)^{n+1} a_n = a_m + (L - S_m) \le a_m. \tag{14}$$

When $m$ is even:

$$S_m = -a_m + \sum_{n=m+1}^{\infty} (-1)^{n+1} a_n = -a_m + (L - S_m) \ge -a_m. \tag{15}$$

Leading to:

$$|S_m| \le a_m, \tag{16}$$

where $m$ is a natural number, finite in $\mathbb{N}^{*}$ by definition.

The complex number $s$ is expressed as $s = \alpha + i\beta$ where $\alpha$ and $\beta$ are reals in their corresponding basis. Note the zeros of the Dirichlet eta function are also the zeros of its complex conjugate. For convenience, let us introduce the conjugate of the Dirichlet eta function, which is expressed as follows:

$$\bar{\eta}(s) = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{e^{i\beta \ln n}}{n^{\alpha}}, \tag{17}$$

where $\Re(s) > 0$. By applying proposition 1 to (17), we get:

$$|\bar{\eta}(s)| \ge \left| 1 - \left[ \sum_{n=2}^{\infty} (-1)^{n+1} \frac{e^{i\beta \ln n}}{n^{\alpha}} \right] \right|, \tag{18}$$

where $[z]$ denotes the norm of the complex number $z$. The square brackets are used as smart delimiters for operator precedence.

With respect to the expression $\left| \sum_{n=2}^{\infty} (-1)^{n+1} \frac{e^{i\beta \ln n}}{n^{\alpha}} \right|$, its decomposition into sub-components $u_n = \frac{(-1)^{n+1}}{n^{\alpha}} e^{i\beta \ln n}$ is a vector representation where $\beta \ln n + (n+1)\pi$ is the angle between the real axis and the orientation of the vector itself and $\frac{1}{n^{\alpha}}$ its modulus. The idea is to apply a rotation by the angle $\theta$ to all the component vectors simultaneously, resulting in a rotation of the vector of their sum. The resulting vector after rotation $\theta$ expressed in complex notations is $v_{\theta,\beta} = \sum_{n=2}^{\infty} \frac{(-1)^{n+1}}{n^{\alpha}} e^{i(\beta \ln n + \theta)}$, where $\theta$ and $\beta$ are real numbers.

Let us introduce the objective function $w$, defined as the sum of the real and imaginary parts of $v_{\theta,\beta}$, i.e. $w = v_x + v_y$ where $v_x = \Re(v_{\theta,\beta})$ and $v_y = \Im(v_{\theta,\beta})$. We get:

$$w = \sum_{n=2}^{\infty} \frac{(-1)^{n+1}}{n^{\alpha}} \left( \cos(\beta \ln n + \theta) + \sin(\beta \ln n + \theta) \right) = \sum_{n=2}^{\infty} (-1)^{n+1} \frac{\sqrt{2}}{n^{\alpha}} \cos\left( \beta \ln n + \theta - \frac{\pi}{4} \right). \tag{19}$$

The trigonometric identity $\cos(x) + \sin(x) = \sqrt{2}\cos\left(x - \frac{\pi}{4}\right)$, which follows from $\cos(a)\cos(b) + \sin(a)\sin(b) = \cos(a - b)$ with $b = \frac{\pi}{4}$, is used in (19), see [1] formulas 4.3.31 and 4.3.32, page 72. The finite sum of a subset of the elements of the second line of (19) from 2 to $n \in \mathbb{N}$ is further referred to as the $w$-series.

In the remainder of the manuscript, orthogonality between functions is defined in terms of the inner product. While orthogonality between vectors is defined in terms of the scalar product between such pair of vectors, for functions we usually define an inner product. Let us say we have two real-valued functions $f$ and $g$, which are squared-Lebesgue integrable on a segment $[a, b]$ and where the inner product between $f$ and $g$ is given by:

$$\langle f, g \rangle = \int_a^b f(x)\,g(x)\,dx. \tag{20}$$

The functions $f$ and $g$ are squared-Lebesgue integrable, meaning such functions can be normalized, i.e. the squared norm as defined by $\langle f, f \rangle$ is finite. For sinusoidal functions such as sine and cosine, it is common to say $[a, b] = [0, 2\pi]$, which interval corresponds to one period. The condition for the functions $f$ and $g$ to be orthogonal is that the inner product as defined in (20) is equal to zero.

The objective function $w$ was constructed by adding together the real and imaginary parts of $v_{\theta,\beta}$. We note that the real and imaginary parts of $v_{\theta,\beta}$ are orthogonal due to the Euler formula. Hence, $\Re(v_{\theta,\beta})$ is maximized in absolute value when $\Im(v_{\theta,\beta}) = 0$, which occurs for example when $\beta = 0$. As $w_{\theta = \frac{\pi}{4}, \beta} = \sqrt{2}\,\Re(v_{\theta = 0, \beta})$, the maximum value of the objective function $w$ provided that $\theta = \frac{\pi}{4}$ occurs when $\beta = 0$. We get:

$$\max{}^{\star}\{|w|\} = \left| \sum_{n=2}^{\infty} (-1)^{n+1} \frac{\sqrt{2}}{n^{\alpha}} \right|, \tag{21}$$

which is also the maximum value of the objective function $w$, given that $\beta = 0$. Although there may be pairs $(\theta, \beta)$ leading to larger values of $\max\{|w|\}$, a particularity of the expression $\max^{\star}\{|w|\}$ as defined above is to be equal to $\sqrt{2}\,\left| v_{\theta = \frac{\pi}{4}, \beta = 0} \right|$.

By applying proposition 5 to (21), we get $\max^{\star}\{|w|\} \le \frac{\sqrt{2}}{2^{\alpha}}$, leading to the below inequality:

$$\left| v_{\theta = \frac{\pi}{4}, \beta = 0} \right| \le \frac{1}{2^{\alpha}} < \frac{\sqrt{2}}{2^{\alpha}}, \tag{22}$$

which is less than the maximum of $|v_{\theta,\beta}|$, while the inner components of $v_{\theta,\beta}$ are collinear as $\beta = 0$.

Let us decompose $w$ into, say, principal components $w_1$ and $w_2$, expressed as follows:

$$w_1 = -\frac{\sqrt{2}}{2^{\alpha}} \cos\left( \beta \ln(2) + \theta - \frac{\pi}{4} \right), \tag{23}$$

and

$$w_2 = \sum_{n=3}^{\infty} (-1)^{n+1} \frac{\sqrt{2}}{n^{\alpha}} \cos\left( \beta \ln n + \theta - \frac{\pi}{4} \right), \tag{24}$$

where most of the variance of the $w$-series comes from the leading component.

By construction $w$ is the sum of the real and imaginary parts of $v_{\theta,\beta}$, which are orthogonal functions. Let us say $v_{\theta}$ is the parametric notation of $v_{\theta,\beta}$ for a given $\beta$ value. We note that for any given $\beta$ value, the complex number $v_{\theta}$ describes a circle in the complex plane, centered in the origin. Hence, in symbolic notations, $w$ can be written as $w = r\cos(t) + r\sin(t)$. The components $w_1$ and $w_2$ can be expressed as $w_1 = a\cos(t)$ and $w_2 = b\sin(t + \varphi)$ where $t$ is in $[0, 2\pi]$. As we suppose that $w_1$ carries most of the variance of $w$ (i.e. $a \ge b$), the modulus $|v_{\theta}|$ is smaller or equal to the maximum value of $|w_1|$, by proposition 4. Yet, $|v_{\theta}|$ is equal to $\max\{w_1\}$ if $w_1$ and $w_2$ are orthogonal and $t = 0$. By proposition 2, at its maximum value $|(w_1, w_2)| = w_1 + w_2$ when $\varphi = 0$ and $t = 0$, which, in light of the above, is also equal to the maximum of $|v_{\theta}|$. As the inner product of $w_1$ and $w_2$ does not depend on $\theta$, orthogonality between $w_1$ and $w_2$ is determined by $\beta$ values. We then apply a rotation by an angle $\theta$ to maximize the objective function $|(w_1, w_2)|$. We consider two scenarios respectively, whether $w_1$ or $w_2$ is the leading component of the $w$-series.

When $w_1$ is the leading component:

In this scenario, if we suppose that $w_1$ and $w_2$ are orthogonal, at the maximum value of $|(w_1, w_2)|$, $w_1 = \frac{\sqrt{2}}{2^{\alpha}}$ and $w_2 = 0$. Hence, we get $\max\{|(w_1, w_2)|\} = \frac{\sqrt{2}}{2^{\alpha}}$. If we suppose that $w_1$ and $w_2$ are not orthogonal, by proposition 4 we would get $\left| \sum_{n=2}^{\infty} (-1)^{n+1} \frac{e^{i\beta \ln n}}{n^{\alpha}} \right| < \frac{\sqrt{2}}{2^{\alpha}}$, and $|\eta(s)|$ would be strictly larger than zero when $\alpha = 1/2$, i.e. on the line $\Re(s) = 1/2$, which is known to be false.

Hence, we can say that when $w_1$ is the leading component, the functions $w_1$ and $w_2$ are orthogonal at the maximum value of $|v_{\theta}|$. Thus, we have:

$$\forall s \in \mathbb{C}^{+} \text{ s.t. } w_1 \text{ is leading component}, \quad \left| \sum_{n=2}^{\infty} (-1)^{n+1} \frac{e^{i\beta \ln n}}{n^{\alpha}} \right| \le \frac{\sqrt{2}}{2^{\alpha}}. \tag{25}$$

Say for $\Re(s) = \alpha \ge \frac{1}{2}$, (18) and (25) imply that:

$$|\eta(s)| \ge 1 - \frac{\sqrt{2}}{2^{\alpha}}. \tag{26}$$

When $w_2$ is the leading component:

In this scenario, we fall on eq. (11) of proposition 4, leading to:

$$\forall s \in \mathbb{C}^{+} \text{ s.t. } w_2 \text{ is leading component}, \quad \left| \sum_{n=2}^{\infty} (-1)^{n+1} \frac{e^{i\beta \ln n}}{n^{\alpha}} \right| \ge \frac{\sqrt{2}}{2^{\alpha}}. \tag{27}$$

Say for $\Re(s) = \alpha \le \frac{1}{2}$, (18) and (27) imply that:

$$|\eta(s)| \ge \frac{\sqrt{2}}{2^{\alpha}} - 1. \tag{28}$$

By the Riemann zeta functional, if $|\eta(s)| > 0$ [...] $w$ introducing multiple terms of the $w$-series as first principal component, say $w_1 = \frac{\sqrt{2}}{2^{\alpha}} \cos\left(\beta \ln 2 + \theta - \frac{\pi}{4}\right) + \frac{\sqrt{2}}{4^{\alpha}} \cos\left(\beta \ln 4 + \theta - \frac{\pi}{4}\right)$, $|w_1|$ is maximized when $\beta = 0$ and $\theta = \frac{\pi}{4}$. Hence $\left| v_{\theta = \frac{\pi}{4}, \beta = 0} \right| \le \frac{\sqrt{2}}{2^{\alpha}} + \frac{\sqrt{2}}{4^{\alpha}}$, which introduces a bias in (22). Furthermore, the floor function of the modulus of the Dirichlet eta function on vertical lines $\Re(s) = \alpha$ does not depend on $\beta$, which is reflected by the linear relationship between $\theta$ and $\beta$ in the cosine argument of the first principal component as a single term of the $w$-series. This is no longer true when introducing several terms of the $w$-series as first principal component. In such scenarios, neither $w_1$ nor $w_2$ is a principal component. A special case of polynomial made up of a subset of terms of the $w$-series which does not depend on $\beta$ occurs if there exists such a polynomial which is equal to zero for any $\beta$. For such a polynomial to be a first principal component implies that the subsequent components are also equal to zero, leading to the degenerate case $|v_{\theta}| = 0$. This occurs when $\alpha$ tends to infinity, leading to $w = 0$. The combination of multiple terms of the $w$-series as first principal component implies that $w_1$ is a function composed of terms of the form $a_n = \pm\frac{1}{n^{\alpha}} \cos(\beta \ln(n) + \theta)$, where $n$ is the index of the corresponding term in $v_{\theta}$. Due to the multiplicity of bivariate collinear arguments in the cosine functions, whose comovements are not parallel across the index $n$, there is no bijection between $\theta$ and $\beta$, i.e. a one-degree of freedom relationship, such that all cosine arguments $\beta \ln n + \theta$ of the $w_1$ component are decoupled from $\beta$. As aforementioned, the lower bound of the modulus of the Dirichlet eta function obtained from (18) needs to be decoupled from $\beta$, to be a floor function on vertical lines $\Re(s) = \alpha$. Moreover, the principal components involved in modulus maximization need to be disentangled for PCA to be applicable. As a simple rule, one degree of freedom is needed for every additional principal component, when matching the dimensionality of the variable space in the parametric ellipsoidal model. Hence, the approach consisting of using a countable set of terms of the $w$-series of multiplicity larger than one as first principal component is not suitable to the problem addressed in the present text.

A further observation is that in the first scenario, when $w_1$ is the leading component, the second term of the Dirichlet eta function is orthogonal to the vector comprised of the remaining terms of the series of index larger than 2 at zeros of the function, by the Hadamard product.
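The remark that a single-term first principal component is decoupled from $\beta$, while a two-term one is not, can be illustrated numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the two-term candidate uses the terms of index 2 and 4 with the signs as written in the text above, and the maximum over $\theta$ is approximated on a uniform grid rather than computed in closed form.

```python
import math

SQRT2 = math.sqrt(2)

def w1_single(alpha, beta, theta):
    """Single-term first component, as in (23)."""
    return -SQRT2 / 2 ** alpha * math.cos(beta * math.log(2) + theta - math.pi / 4)

def w1_two_terms(alpha, beta, theta):
    """Two-term candidate built from the terms of index 2 and 4."""
    return (SQRT2 / 2 ** alpha * math.cos(beta * math.log(2) + theta - math.pi / 4)
            + SQRT2 / 4 ** alpha * math.cos(beta * math.log(4) + theta - math.pi / 4))

def max_over_theta(component, alpha, beta, steps=3600):
    """Grid approximation of the maximum of |component| over theta in [0, 2*pi)."""
    return max(abs(component(alpha, beta, 2 * math.pi * k / steps))
               for k in range(steps))

if __name__ == "__main__":
    alpha = 0.5
    for beta in (0.0, 5.0, math.pi / math.log(2)):
        single = max_over_theta(w1_single, alpha, beta)
        double = max_over_theta(w1_two_terms, alpha, beta)
        # The first value stays at sqrt(2) / 2^alpha, the second varies with beta.
        print(f"beta={beta:8.4f}  single={single:.6f}  two-term={double:.6f}")
```

For the single term, a rotation $\theta$ can always absorb $\beta \ln 2$, so the maximum over $\theta$ is $\frac{\sqrt{2}}{2^{\alpha}}$ for every $\beta$. For the two-term candidate the two cosine arguments differ by $\beta \ln 2$, so the maximum over $\theta$ ranges between the difference and the sum of the two amplitudes as $\beta$ varies.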
Furthermore, the angle between the axis defined by the second component of the Dirichlet eta function and the vector comprised of the remaining components of index larger than 2 is in the range $\left[\frac{\pi}{2}, \frac{3\pi}{2}\right]$.

The below are based on the premise that the Dirichlet eta function can be used as a proxy of the Riemann zeta function for zero finding in the critical strip, and interpretations about the lower bound of the modulus of the Dirichlet eta function as a floor function. The surface spanned by the modulus of the Dirichlet eta function is a continuum resulting from the application of a real-valued function over the dimensions of the complex plane, which is a planar representation where the reals form an ad-dextram vertice contiguous with the imaginary axis, and where the squares of imaginary numbers are subtracted from zero. The modulus of the Dirichlet eta function is a holographic function sending a complex number into a real number, whereas the floor function is a projection of the former onto the real axis.

When $\alpha = 1/2$, $|\eta(s)| \ge 0$, so that zeros may occur on the critical line $\Re(s) = 1/2$, which is known to be true [10], p. 256. $\forall \alpha \in \,]1/2, 1[$, $|\eta(s)| > 0$, meaning that there are no zeros in the strip $\Re(s) \in \,]1/2, 1[$. As the Dirichlet eta function and the Riemann zeta function share the same zeros in the critical strip, this implies that the Riemann zeta function has no zeros in the strip $\Re(s) \in \,]1/2, 1[$. As an excerpt from [3]: given $s$ a complex number and $\bar{s}$ its complex conjugate, if $s$ is a zero of the Riemann zeta function in the strip $\Re(s) \in \,]0, 1[$, then we have $\zeta(s) = \zeta(1 - \bar{s})$. Accordingly, there are no zeros in the strip $\Re(s) \in \,]0, 1/2[$. It follows that the Dirichlet eta function does not have any zeros on either part of the critical strip, i.e. $\Re(s) \in \,]0, 1/2[$ and $]1/2, 1[$, which is a prerequisite to say that the non-trivial zeros of the Riemann zeta function lie on the critical line $\Re(s) = 1/2$.

The lower bound of the modulus of the Dirichlet eta function derived in the present manuscript from 2-D principal component analysis is $\forall s \in \mathbb{C}$ s.t. $\Re(s) \in P$, $|\eta(s)| \ge \left|1 - \frac{\sqrt{2}}{2^{\alpha}}\right|$, where $P$ is a partition spanning one half of the critical strip, on a side of the critical line depending upon a variable, and where $\eta$ is the Dirichlet eta function. As a proxy of the Riemann zeta function for zero finding in the critical strip $\Re(s) \in \,]0, 1[$, the floor function of the modulus of the Dirichlet eta function implies that the Riemann zeta function has no zeros on either side of the critical line within the critical strip. Further observations are made with respect to the orthogonality between the second component of the Dirichlet eta function and the vector comprised of the remaining elements of the series of index larger than 2 at the so-called zeros of the Dirichlet eta function.

References