arXiv [math.GM], May
A Solution of polynomial equations
N. Tsirivas
Abstract
We present a method for the solution of polynomial equations. We do not intend to present one more method among several others, because today there are many excellent methods. Our main aim is educational. Here we attempt to present a method with elementary tools, so that it can be understood and used by students and educators. For this reason, we provide a self-contained approach. Our method is a variation of the well-known method of the resultant, which has its origin in Euler. Our goal in the present paper is in the spirit of calculus and secondary school mathematics. The reader can find an extensive discussion of the theory of zeros of polynomials and of extremal problems for polynomials in the books [10] and [13].
MSC (2010) : 65H04
Keywords : Polynomial equation, resultant, Gröbner bases.
Introduction
It is well known that many problems in Physics, Chemistry and Science generally lead to a polynomial equation. In pure mathematics there are also classical problems that lead to a polynomial equation. Let us give two examples:

1) If we are to compute the integral $\int_{\alpha}^{\beta} \frac{p(x)}{q(x)}\,dx$, where $\alpha, \beta \in \mathbb{R}$, $\alpha < \beta$, $p(x)$, $q(x)$ are two real polynomials of one variable, and $q(x)$ is a non-zero polynomial that does not have any root in the interval $[\alpha, \beta]$, then we are led to the problem of finding the real roots of $q(x)$.

2) Let $n \in \mathbb{N}$, $a_i \in \mathbb{R}$ for $i = 0, \ldots, n$, where $\mathbb{N}$, $\mathbb{R}$ are the sets of natural and real numbers respectively. We can consider the differential equation
$$a_n y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0,$$
where $y$ is the unknown function. In order to solve this simple equation we have to find all the roots of the polynomial $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$.

(Author's address: N. Tsirivas, Diovouniotou 30-32, T.K. 11741, Athens, Greece, e-mail: [email protected])

So the usefulness of solving a polynomial equation, or in other words of finding the roots of a polynomial, is beyond doubt. This is a very old, classical problem in mathematics, and in Numerical Analysis especially. For this reason, there exist many methods that solve it.

However, if a scientist wants to solve an equation for his work, nowadays it is sufficient to use programs such as Mathematica and Maple. So the usefulness of the problem has taken another direction, which is the search for better algorithms and programs. This is the main line of research of the experts in the area nowadays.

We are moving in another direction in this paper. Our main aim is educational. We present a method of solving a polynomial equation in full detail, so that a student of the sciences can improve the level of his knowledge of the subject.

First of all, let us state our problem. We denote by $\mathbb{C}$ the set of complex numbers. Let $n \in \mathbb{N}$ and $a_i \in \mathbb{C}$ for $i = 0, 1, \ldots, n$. We then consider the polynomial
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0,$$
that is, a polynomial of one complex variable $z$ with complex coefficients. We suppose that $a_n \neq 0$. The natural number $n$ is called the degree of $p(z)$ and it is denoted by $\deg p(z) = n$. The number $a \in \mathbb{C}$ is a root of $p(z)$ when $p(a) = 0$. Our problem is to find the roots of $p(z)$, or in other words to solve the equation $p(z) = 0$. Polynomials are simple and specific functions that have the following fundamental property:
Fundamental Theorem of Algebra. Every polynomial of one complex variable with complex coefficients and degree greater than or equal to one has at least one root in $\mathbb{C}$.

This result is central. It is the basis of our method. However, even though this theorem is fundamental, its proof is not trivial. Its simplest proof comes from complex analysis, which many students do not learn at university. In the appendix we give one of the simplest proofs of the Fundamental Theorem.

Many of the best methods for our problem are iterative. They are based on the construction of specific sequences that approach the roots of the given polynomial. Our method here uses algebra as much as possible, and when algebra cannot go further, analysis takes over. Here we do not deal with the problem of the speed of convergence. We use numerical analysis as little as we can. It is sufficient for us to use the simplest method for finding a root in a specific real open interval, the bisection method.
Most books on numerical analysis describe the bisection method in detail; see, for example, [8], [11]. There are some formulas that provide bounds for the roots of a polynomial. A. Cauchy gave such a bound, see [8]. In the frame of our method we provide such a bound. There are some results that give information about the number of positive or real roots, for example Descartes' rule of signs and Sturm's sequence [11]. A basic problem is to find disjoint real intervals so that each of them contains exactly one root. There are also many methods for this.

Let us now describe, roughly, the stages of our method.

1) In the first stage we find all the real roots of a polynomial. For this we rely on two results: first, the bisection method, and secondly the following result: if we have a polynomial $p$ of one real variable with real coefficients and degree greater than or equal to one for which we know the roots of $p'$, then using the bisection method we can find all the real roots of $p$. The first stage is simple. It uses only elementary knowledge and it is also convenient for students of secondary school. We think that it is very useful for students of secondary school to know a method that finds all the real roots of an arbitrary real polynomial within their knowledge base.

2) In the second stage, we provide a method that gives all the real roots of a system of the form
$$p(x, y) = 0, \qquad q(x, y) = 0, \qquad \text{(A)}$$
where $p(x, y)$, $q(x, y)$ are polynomials of two real variables $x$ and $y$ with real coefficients. Our method here is a variation of the well-known method of the resultant (see [6], [13]), which has its origin in Euler. With this method the solution of the above system (A) is reduced to the first stage.
As in the first stage, the second stage is also convenient for students of secondary school (except for Theorem 3.17 in our prerequisites).

3) In the third stage we show that the solution of our problem is reduced to the second stage.

So, roughly speaking, our main aim in this paper is to present a method that is within the frame of the usual lessons of calculus in secondary school or at university, and to present it with all the necessary details so that it can be understood by students.

As for the notation: let $p(x, y)$ be a polynomial of two real variables $x$ and $y$ with real coefficients. We denote by $\deg_x p(x, y)$ the greatest degree of $p(x, y)$ with respect to $x$ and by $\deg_y p(x, y)$ the greatest degree of $p(x, y)$ with respect to $y$. If $\deg_x p(x, y) \geq 1$ and $\deg_y p(x, y) \geq 1$, we call the polynomial $p(x, y)$ a pure polynomial. If $p(z)$, $q(z)$ are two complex polynomials, we write $p(z) \equiv q(z)$ when they are identically equal. We also write $p(z) \equiv 0$ when $p(z)$ is identically equal to the zero polynomial. We write $p(z) \not\equiv q(z)$ when the polynomials $p(z)$, $q(z)$ are not identically equal, and $p(z) \not\equiv 0$ when $p(z)$ is not the zero polynomial.
There are many methods and algorithms for the solution of polynomial equations. Some of them are very old, like the methods of Horner, Graeffe and Bernoulli, whereas today there are others, like the methods of Rutishauser, Lehmer, Lin, Bairstow, Bareiss and many more. Another method, similar to Bernoulli's method, is the QD method. A classical and popular method today is that of Muller. It is a general method, not only for polynomials. The interested reader can find the details of some of the above methods in the books of our references, see [1], [3], [4], [7], [9], [10], [11] and [12]. As we said, there exist many algorithms and programs for our problem. One of the best is the subroutine ZEROIN. One can find the details of this program in [4].

As we said earlier, the basis of our method is the resultant (or eliminant). With this method we can convert a system of polynomial equations into one equation with only one unknown. Theoretically we can succeed in that, but the complexity of the calculations is enormous, so today it is of value only for polynomial equations of low degree, and as a theoretical tool. For details on the resultant see [6], [12]. Apart from this, there are some cases where the resultant fails. This can happen, for example, when we have to solve a system of two equations with two unknowns, one of the two equations is a multiple of the other, and the system has a finite number of solutions. See, for example, the equation
$$(x^2 - 1)^2 + (y^2 - 2)^2 = 0,$$
which has the set of solutions $L = \{(1, \sqrt{2}), (1, -\sqrt{2}), (-1, \sqrt{2}), (-1, -\sqrt{2})\}$. We describe in detail how we handle these cases in our method here.

An alternative method for our problem is to solve it with Gröbner bases. Gröbner bases are a tool that was developed in the 1960s for the division of polynomials in more than one variable. With Gröbner bases we can also convert a system of equations into an equation with only one unknown, as the resultant does. This is the main application of Gröbner bases. This can be done in most cases. However, there are some cases where Gröbner bases fail to achieve this, like the case above. For Gröbner bases, see [2]. Many secondary school books contain the elementary theory of polynomials and Euclidean division that we refer to in our prerequisites.

The structure of our paper is as follows. In the first section we give a rough description of our method. In the second section we give the complete description of our method. In the third section we collect all the prerequisite tools of our method from Algebra and Analysis and we present them with all the necessary proofs, especially for results that one cannot find easily in books. Finally, in the last section 4 (Appendix) we give one of the simplest proofs of the Fundamental Theorem of Algebra, which one cannot find easily in books. We also give a short description of the solution of the binomial equation $x^n = a$, where $n \in \mathbb{N}$, $n \geq 2$ and $a$ is a positive number.
For methodological reasons, we divide the solution of our problem into three stages.

In the first stage we find all the real roots of the polynomial equation
$$a_v x^v + a_{v-1} x^{v-1} + \cdots + a_1 x + a_0 = 0,$$
where $v \in \mathbb{N}$ and $a_i \in \mathbb{R}$ for every $i = 0, 1, \ldots, v$.

Let $p_1(x, y)$, $p_2(x, y)$ be two polynomials of two real variables $x$ and $y$ whose coefficients are in $\mathbb{R}$. We consider the system of equations
$$p_1(x, y) = 0 \quad (1), \qquad p_2(x, y) = 0 \quad (2). \qquad \text{(A)}$$
Let $L_A$ be the set of solutions of the above system (A) in $\mathbb{R}^2$, that is,
$$L_A := \{(x, y) \in \mathbb{R}^2 \mid p_1(x, y) = p_2(x, y) = 0\}.$$
In the second stage we find the set $L_A$ under the following supposition (S).

(S) Supposition: We suppose that the set $L_A$ is finite.

That is, we solve the above system (A) in $\mathbb{R}^2$ in the case where supposition (S) holds. We note that we carry out the second stage using the first stage.

In the third stage we completely solve our initial problem of finding all the roots of the polynomial equation
$$a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0,$$
where $n \in \mathbb{N}$, $a_i \in \mathbb{C}$ for every $i = 0, 1, \ldots, n$, using the previous two stages.

The first stage is the analytical part, whereas the second and third stages are the algebraic parts of our method. The prerequisites of our method are few. Elementary calculus and the elementary linear algebra of secondary school are enough, except for one specific case, where we use Theorem 3.17 from our prerequisites (a very well-known result from the calculus of several variables).

In the following section, we give the complete description of our method.
Let $v \in \mathbb{N}$, $a_i \in \mathbb{R}$ for $i = 0, 1, \ldots, v$, and consider the polynomial
$$p(x) = a_v x^v + a_{v-1} x^{v-1} + \cdots + a_1 x + a_0,$$
where $a_v \neq 0$, so $\deg p(x) = v$.

Here we find all the real roots of $p(x)$. If $v = 1$ or $v = 2$, we know how to find the roots of $p(x)$ from secondary school. Let us suppose that $v \geq 3$. We find all the real roots of $p'$ (if any) and then we find the roots of $p$ by applying the basic Lemma 3.8 or Corollaries 3.9, 3.10.

More generally, we suppose that $p$ has degree $v \in \mathbb{N}$, $v \geq 3$. We consider the polynomials $p', p'', \ldots, p^{(v-3)}, p^{(v-2)}$. The polynomial $p'$ has degree $v - 1$, $p''$ has degree $v - 2$, and so on, and $p^{(v-2)}$ has degree 2. We find the roots of $p^{(v-2)}$ (if any). Then, using the basic Lemma 3.8 or Corollaries 3.9, 3.10, we find the roots of $p^{(v-3)}$ and, proceeding inductively, after a finite number of steps we find the roots of $p'$ and finally, in the same way, the roots of $p$. Thus we complete our first stage.

We will now consider two polynomials $p_1(x, y)$, $p_2(x, y)$ of two real variables $x$ and $y$ with coefficients in $\mathbb{R}$. We solve the system (A), where
$$p_1(x, y) = 0 \quad (1), \qquad p_2(x, y) = 0 \quad (2). \qquad \text{(A)}$$
We solve system (A) under the following supposition.

Supposition: We suppose that system (A) has a finite number of solutions, that is, the set
$$L_A = \{(x, y) \in \mathbb{R}^2 \mid p_1(x, y) = p_2(x, y) = 0\}$$
is non-void and finite.
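The first stage described above (real roots of $p$ obtained from the real roots of $p'$ by bisection) can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions, not the author's implementation. The function names (`poly_eval`, `derivative`, `bisect`, `real_roots`), the coefficient-list representation, the tolerance `eps` and the use of the bound $1 + \max_i |a_i / a_v|$ (one classical form of Cauchy's bound) are all choices made for this example. Lemma 3.8 is not reproduced; the sketch only uses the fact that $p$ is monotone between consecutive real roots of $p'$.

```python
def poly_eval(c, x):
    """Evaluate c[0] + c[1]*x + ... + c[n]*x**n by Horner's scheme."""
    value = 0.0
    for a in reversed(c):
        value = value * x + a
    return value


def derivative(c):
    """Coefficient list of the derivative polynomial."""
    return [k * c[k] for k in range(1, len(c))]


def bisect(c, lo, hi, tol=1e-12):
    """Bisection on [lo, hi]; assumes p(lo) and p(hi) have opposite signs."""
    f_lo = poly_eval(c, lo)
    for _ in range(200):  # hard cap so the loop always terminates
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2.0
        f_mid = poly_eval(c, mid)
        if f_mid == 0.0:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def real_roots(c, eps=1e-9):
    """Approximate real roots of p, found recursively from the roots of p'."""
    c = list(c)
    while len(c) > 1 and c[-1] == 0:  # drop zero leading coefficients
        c.pop()
    n = len(c) - 1
    if n == 0:
        return []
    if n == 1:
        return [-c[0] / c[1]]
    crit = real_roots(derivative(c), eps)  # real roots of p'
    # Assumed bound: every root z of p satisfies |z| < 1 + max|a_i / a_n|.
    bound = 1.0 + max(abs(a) for a in c[:-1]) / abs(c[-1])
    pts = [-bound] + [t for t in crit if -bound < t < bound] + [bound]
    # A critical point where p (numerically) vanishes is a multiple root.
    roots = [t for t in pts if abs(poly_eval(c, t)) <= eps]
    # p is monotone on each interval between consecutive points of pts,
    # so a sign change there isolates exactly one simple root.
    for lo, hi in zip(pts, pts[1:]):
        f_lo, f_hi = poly_eval(c, lo), poly_eval(c, hi)
        if abs(f_lo) > eps and abs(f_hi) > eps and f_lo * f_hi < 0:
            roots.append(bisect(c, lo, hi))
    return sorted(roots)
```

For instance, `real_roots([-6, 11, -6, 1])` treats $x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)$ and returns three values close to 1, 2 and 3. Multiple roots are detected only up to the tolerance `eps`, which is a limitation of this sketch.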
Firstly, we notice that at least one of the polynomials $p_1(x, y)$, $p_2(x, y)$ is non-zero; otherwise, if $p_1(x, y) \equiv p_2(x, y) \equiv 0$ for every $(x, y) \in \mathbb{R}^2$, then we have $L_A = \mathbb{R}^2$, which is false because the set $L_A$ is finite. We will examine some cases.

First of all, we suppose that at least one of the polynomials is of one variable only. We can distinguish some cases here. Let $p_1(x, y) \equiv q_1(x)$, $p_2(x, y) \equiv q_2(x)$. Then we solve the equations $q_1(x) = 0$, $q_2(x) = 0$, and $L_A$ is the set of all $(x, y)$, where $x$ is one of the common solutions of the equations $q_1(x) = 0$, $q_2(x) = 0$ and $y \in \mathbb{R}$; that is, $L_A$ is an infinite set, which is false by our supposition. So this case cannot occur. Similarly, we cannot have the case where $p_1(x, y) \equiv r_1(y)$ and $p_2(x, y) \equiv r_2(y)$.

Now we consider the case where $p_1(x, y) \equiv q_1(x)$ and $p_2(x, y) \equiv q_2(y)$. Then we can solve the equations $q_1(x) = 0$, $q_2(y) = 0$. Let $A = \{\rho_1, \rho_2, \ldots, \rho_v\}$ and $B = \{\lambda_1, \lambda_2, \ldots, \lambda_m\}$, $A \cup B \subseteq \mathbb{R}$, where $A$ is the set of roots of $q_1$ and $B$ is the set of roots of $q_2$, $v, m \in \mathbb{N}$. Then we have $L_A = \{(\rho_i, \lambda_j),\ i = 1, \ldots, v,\ j = 1, \ldots, m\}$. In a similar way, we can solve system (A) when $p_1(x, y) = r_1(y)$ and $p_2(x, y) = r_2(x)$, for some polynomials $r_1(y)$, $r_2(x)$.

Now we consider the case where $p_1(x, y)$, $p_2(x, y)$ are two pure polynomials.

(i) The simplest case is when $\deg_y p_1(x, y) = \deg_y p_2(x, y) = 1$. Then we have
$$p_1(x, y) = \alpha_1(x) y + \alpha_0(x) \quad \text{and} \quad p_2(x, y) = \beta_1(x) y + \beta_0(x),$$
where $\alpha_1(x)$, $\alpha_0(x)$, $\beta_1(x)$, $\beta_0(x)$ are polynomials of the real variable $x$ only and $\alpha_1(x) \not\equiv 0$, $\beta_1(x) \not\equiv 0$, because $p_1(x, y)$, $p_2(x, y)$ are pure polynomials. So we have to solve the system
$$\alpha_1(x) y + \alpha_0(x) = 0 \quad (3), \qquad \beta_1(x) y + \beta_0(x) = 0 \quad (4).$$
We can distinguish some cases here. Suppose there exists $(x_0, y_0) \in L_A$ so that:

1) $\alpha_1(x_0) = \beta_1(x_0) = 0$. Then with (3) and (4) we get $\alpha_0(x_0) = \beta_0(x_0) = 0$. We get $(x_0, y) \in L_A$ for every $y \in \mathbb{R}$, which is false because $L_A$ is finite. So this case cannot occur.

2) $\alpha_1(x_0) = 0$, $\beta_1(x_0) \neq 0$. Then with (4) we get
$$y_0 = -\frac{\beta_0(x_0)}{\beta_1(x_0)}. \quad (5)$$
With (3) we have $\alpha_0(x_0) = 0$. So we find the common roots of $\alpha_1(x)$ and $\alpha_0(x)$, and for every common root $x_0$ of $\alpha_1(x)$ and $\alpha_0(x)$ with $\beta_1(x_0) \neq 0$, the couple $(x_0, y_0) \in L_A$, where $y_0$ is given by (5). Of course we find the real roots of the polynomials $\alpha_1(x)$ and $\alpha_0(x)$ with the method of our first stage. In a similar way we find the solutions $(x_0, y_0) \in L_A$ with $\alpha_1(x_0) \neq 0$, $\beta_1(x_0) = 0$.

3) $\alpha_1(x_0) \neq 0$, $\beta_1(x_0) \neq 0$. We distinguish the following cases.

(i) $\alpha_0(x_0) = \beta_0(x_0) = 0$. Then with (3), (4) and our supposition we get $y_0 = 0$. So we find the common roots of $\alpha_0(x)$ and $\beta_0(x)$ that are not roots of the polynomials $\alpha_1(x)$ and $\beta_1(x)$, with the method of the first stage. If $x_0$ is such a root, that is, $\alpha_0(x_0) = \beta_0(x_0) = 0$, $\alpha_1(x_0) \neq 0$, $\beta_1(x_0) \neq 0$, then $(x_0, 0) \in L_A$.

(ii) $\alpha_0(x_0) = 0$, $\beta_0(x_0) \neq 0$. By (3), since $\alpha_1(x_0) \neq 0$ and $\alpha_0(x_0) = 0$, we get $y_0 = 0$. Then, because $y_0 = 0$, by (4) we get $\beta_0(x_0) = 0$, which contradicts our supposition. So this case cannot occur.

(iii) $\alpha_0(x_0) \neq 0$, $\beta_0(x_0) = 0$. As in the previous case (ii), this case cannot occur.

(iv) $\alpha_0(x_0) \neq 0$, $\beta_0(x_0) \neq 0$. Then by (3) and (4) we get
$$y_0 = -\frac{\alpha_0(x_0)}{\alpha_1(x_0)} \quad (6) \qquad \text{and} \qquad y_0 = -\frac{\beta_0(x_0)}{\beta_1(x_0)}. \quad (7)$$
With (6) and (7) we get
$$-\frac{\alpha_0(x_0)}{\alpha_1(x_0)} = -\frac{\beta_0(x_0)}{\beta_1(x_0)} \iff \alpha_1(x_0)\beta_0(x_0) - \alpha_0(x_0)\beta_1(x_0) = 0. \quad (8)$$
So we consider the two systems
$$\alpha_1(x) y + \alpha_0(x) = 0 \ \text{(i)}, \quad \beta_1(x) y + \beta_0(x) = 0 \ \text{(ii)}, \quad \alpha_1(x) \neq 0,\ \alpha_0(x) \neq 0,\ \beta_1(x) \neq 0,\ \beta_0(x) \neq 0 \qquad (A_1)$$
and
$$\alpha_1(x)\beta_0(x) - \alpha_0(x)\beta_1(x) = 0 \ \text{(i)}, \quad y = -\frac{\alpha_0(x)}{\alpha_1(x)} \ \text{(ii)}, \quad \alpha_1(x) \neq 0,\ \alpha_0(x) \neq 0,\ \beta_1(x) \neq 0,\ \beta_0(x) \neq 0. \qquad (A_2)$$
Let $L_{A_1}$, $L_{A_2}$ be the sets of solutions of systems $(A_1)$ and $(A_2)$ respectively. We prove that $L_{A_1} = L_{A_2}$. By the previous procedure and equalities (6) and (8) we get
$$L_{A_1} \subseteq L_{A_2}. \quad (9)$$
Now let $(x, y) \in L_{A_2}$. Then equality (ii) of $(A_2)$ gives equality (i) of $(A_1)$. By equality (i) of $(A_2)$ we get $\alpha_1(x)\beta_0(x) = \alpha_0(x)\beta_1(x)$, and by the fact that $\alpha_1(x) \neq 0$, $\beta_1(x) \neq 0$ we get
$$-\frac{\alpha_0(x)}{\alpha_1(x)} = -\frac{\beta_0(x)}{\beta_1(x)}. \quad (10)$$
Through equality (ii) of $(A_2)$ and (10) we get
$$y = -\frac{\beta_0(x)}{\beta_1(x)}. \quad (11)$$
Equality (11) gives equality (ii) of $(A_1)$. So we have $(x, y) \in L_{A_1}$, that is,
$$L_{A_2} \subseteq L_{A_1}. \quad (12)$$
By (9) and (12) we get $L_{A_1} = L_{A_2}$. So we have proved that in order to solve system $(A_1)$ it suffices to solve system $(A_2)$. Thus we solve equation (i) of $(A_2)$ with the method of the first stage, and for every root $x$ of the polynomial $\alpha_1(x)\beta_0(x) - \alpha_0(x)\beta_1(x)$ with $\alpha_1(x) \neq 0$, $\alpha_0(x) \neq 0$, $\beta_1(x) \neq 0$, $\beta_0(x) \neq 0$, we get the respective $y$ from equality (ii) of $(A_2)$.

So far we have completely solved system (A) in the case $\deg_y p_1(x, y) = \deg_y p_2(x, y) = 1$. We now solve it in the case $\deg_y p_1(x, y) \leq 2$, $\deg_y p_2(x, y) \leq 2$, where $p_1(x, y)$, $p_2(x, y)$ are two pure polynomials. Of course, we have $\deg_y p_1(x, y) \geq 1$, $\deg_y p_2(x, y) \geq 1$, because $p_1(x, y)$, $p_2(x, y)$ are pure polynomials. We have already examined the case $\deg_y p_1(x, y) = \deg_y p_2(x, y) = 1$, so we suppose that at least one of $\deg_y p_1(x, y)$, $\deg_y p_2(x, y)$ is equal to 2.

We examine, firstly, the case where $\deg_y p_1(x, y) = 2$, $\deg_y p_2(x, y) = 1$. Then we can write system (A) as follows:
$$\alpha_2(x) y^2 + \alpha_1(x) y + \alpha_0(x) = 0 \quad (13), \qquad \beta_1(x) y + \beta_0(x) = 0 \quad (14). \qquad \text{(A)}$$
If $\alpha_2(x) \equiv 0$, we have the previous system. So we suppose that $\alpha_2(x) \not\equiv 0$. Let $(x_0, y_0) \in L_A$ as above. We distinguish some cases:

1) $\alpha_2(x_0) = 0$. Then we solve the system
$$\alpha_1(x) y + \alpha_0(x) = 0, \qquad \beta_1(x) y + \beta_0(x) = 0$$
as previously, and we keep the solutions $(x_0, y_0)$ of this system for which $\alpha_2(x_0) = 0$.

2) $\alpha_2(x_0) \neq 0$. We distinguish some cases.

(i) $\alpha_1(x_0) = 0$, $\beta_1(x_0) = 0$. Then we have to solve the system
$$\alpha_2(x) y^2 + \alpha_0(x) = 0 \quad (15), \qquad \beta_0(x) = 0 \quad (16). \qquad (B_1)$$
By (15) we get
$$y^2 = -\frac{\alpha_0(x)}{\alpha_2(x)}.$$
So, in order to solve this system, we do the following. First of all we find all the common roots $x_0$ of the three polynomials $\alpha_1(x)$, $\beta_1(x)$, $\beta_0(x)$ that are not roots of the polynomial $\alpha_2(x)$. If $x_0 \in \mathbb{R}$ and $\alpha_1(x_0) = \beta_1(x_0) = \beta_0(x_0) = 0$, $\alpha_2(x_0) \neq 0$, we consider the number $-\frac{\alpha_0(x_0)}{\alpha_2(x_0)}$. If $-\frac{\alpha_0(x_0)}{\alpha_2(x_0)} \geq 0$, then we set
$$y_1 = \sqrt{-\frac{\alpha_0(x_0)}{\alpha_2(x_0)}} \quad \text{and} \quad y_2 = -\sqrt{-\frac{\alpha_0(x_0)}{\alpha_2(x_0)}} \quad \text{if } -\frac{\alpha_0(x_0)}{\alpha_2(x_0)} > 0,$$
and $y_0 = 0$ if $\alpha_0(x_0) = 0$; under the above conditions these couples $(x_0, y)$ belong to $L_A$. We find the roots of the polynomials $\alpha_1(x)$, $\beta_1(x)$, $\beta_0(x)$ with the method of the first stage. Of course, if we cannot find couples $(x_0, y_0) \in \mathbb{R}^2$ for which all the above conditions hold, this means that we do not have solutions in this case.

(ii) $\alpha_1(x_0) = 0$, $\beta_1(x_0) \neq 0$. Then we have the system
$$\alpha_2(x) y^2 + \alpha_0(x) = 0 \quad (17), \qquad \beta_1(x) y + \beta_0(x) = 0 \quad (18). \qquad (B_2)$$
Through (17) and (18) we get
$$y^2 = -\frac{\alpha_0(x)}{\alpha_2(x)} \quad (19), \qquad y = -\frac{\beta_0(x)}{\beta_1(x)} \quad (20) \ \Rightarrow\ y^2 = \left(\frac{\beta_0(x)}{\beta_1(x)}\right)^2. \quad (21)$$
Through (19) and (21) we get
$$-\frac{\alpha_0(x)}{\alpha_2(x)} = \left(\frac{\beta_0(x)}{\beta_1(x)}\right)^2 \iff \alpha_2(x)\beta_0(x)^2 + \alpha_0(x)\beta_1(x)^2 = 0. \quad (22)$$
From the above, in order to find a solution of system $(B_2)$ we do the following. We find all the common roots of the two polynomials $\alpha_2(x)\beta_0(x)^2 + \alpha_0(x)\beta_1(x)^2$ and $\alpha_1(x)$ that are not roots of the polynomials $\alpha_2(x)$ and $\beta_1(x)$ (if any). Let $x_0$ be such a root. We set $y_0 = -\frac{\beta_0(x_0)}{\beta_1(x_0)}$, and then $(x_0, y_0)$ is a solution of $(B_2)$, and we get all the other solutions of $(B_2)$ in the same way.

(iii) $\alpha_1(x_0) \neq 0$, $\beta_1(x_0) = 0$. Then by (14) we get $\beta_0(x_0) = 0$. So, in order to solve system (A) in this case, we do the following.
We find all the common roots (if any) $x_0$ of the polynomials $\beta_1(x)$, $\beta_0(x)$ for which $\alpha_2(x_0) \neq 0$, $\alpha_1(x_0) \neq 0$. Of course this is a finite set of numbers $x_0$. For such a root $x_0$ we solve the equation $\alpha_2(x_0) y^2 + \alpha_1(x_0) y + \alpha_0(x_0) = 0$ and find its real roots $y$ (if any). All these couples $(x_0, y) \in \mathbb{R}^2$ (if any) form the set of solutions of system (A) in this case.

(iv) $\alpha_1(x_0) \neq 0$, $\beta_1(x_0) \neq 0$. […]

We now examine the case $\deg_y p_1(x, y) = \deg_y p_2(x, y) = 2$. We have the system
$$\alpha_2(x) y^2 + \alpha_1(x) y + \alpha_0(x) = 0 \quad (23), \qquad \beta_2(x) y^2 + \beta_1(x) y + \beta_0(x) = 0 \quad (24). \qquad (B_3)$$
Here we examine some cases.

1) Let $(x_0, y_0) \in L_{B_3}$. If $\alpha_2(x_0) = 0$ or $\beta_2(x_0) = 0$, we have one of the systems examined previously. So we suppose that $\alpha_2(x_0) \neq 0$, $\beta_2(x_0) \neq 0$. Now we can distinguish some cases.

i) $\alpha_1(x_0) = \beta_1(x_0) = 0$. Then we have
$$\alpha_2(x) y^2 + \alpha_0(x) = 0 \quad (25) \qquad \text{and} \qquad \beta_2(x) y^2 + \beta_0(x) = 0. \quad (26)$$
Through (25) we have $y^2 = -\frac{\alpha_0(x)}{\alpha_2(x)}$ (27) and by (26) we get $y^2 = -\frac{\beta_0(x)}{\beta_2(x)}$ (28). Through (27) and (28) we get
$$-\frac{\alpha_0(x)}{\alpha_2(x)} = -\frac{\beta_0(x)}{\beta_2(x)} \iff \alpha_0(x)\beta_2(x) - \beta_0(x)\alpha_2(x) = 0. \quad (29)$$
From the above, we have the following solution. We find the common roots of the polynomials $\alpha_1(x)$, $\beta_1(x)$, $\alpha_0(x)\beta_2(x) - \beta_0(x)\alpha_2(x)$ for which $\alpha_2(x) \neq 0$, $\beta_2(x) \neq 0$. Let $x_0$ be such a root. If $\alpha_0(x_0) = 0$, we get $y_0 = 0$. If $\frac{\alpha_0(x_0)}{\alpha_2(x_0)} < 0$, we consider
$$y_1 = \sqrt{-\frac{\alpha_0(x_0)}{\alpha_2(x_0)}}, \qquad y_2 = -\sqrt{-\frac{\alpha_0(x_0)}{\alpha_2(x_0)}},$$
and $(x_0, y_1), (x_0, y_2) \in L_{B_3}$. We get all the other solutions of $(B_3)$ in the same way. Of course, if no $(x_0, y_0)$ exists with the above conditions, we do not have solutions of $(B_3)$ in this case.

ii) $\alpha_1(x_0) \neq 0$, $\beta_1(x_0) \neq 0$ (together with $\alpha_2(x_0) \neq 0$, $\beta_2(x_0) \neq 0$ as above) in system $(B_3)$. We consider the number
$$D = \alpha_2(x_0)\beta_1(x_0) - \alpha_1(x_0)\beta_2(x_0) = \begin{vmatrix} \alpha_2(x_0) & \alpha_1(x_0) \\ \beta_2(x_0) & \beta_1(x_0) \end{vmatrix},$$
which we call the determinant of system $(B_3)$. We distinguish two cases.

a) $D \neq 0$. We consider the linear system
$$\alpha_2(x_0) z + \alpha_1(x_0) \omega = -\alpha_0(x_0) \quad (30), \qquad \beta_2(x_0) z + \beta_1(x_0) \omega = -\beta_0(x_0) \quad (31). \qquad (B_4)$$
This linear system has determinant $D \neq 0$, so it has exactly one solution. We set
$$D_1 = \begin{vmatrix} -\alpha_0(x_0) & \alpha_1(x_0) \\ -\beta_0(x_0) & \beta_1(x_0) \end{vmatrix} = \alpha_1(x_0)\beta_0(x_0) - \alpha_0(x_0)\beta_1(x_0) \quad (32)$$
and
$$D_2 = \begin{vmatrix} \alpha_2(x_0) & -\alpha_0(x_0) \\ \beta_2(x_0) & -\beta_0(x_0) \end{vmatrix} = \alpha_0(x_0)\beta_2(x_0) - \alpha_2(x_0)\beta_0(x_0). \quad (33)$$
Through Cramer's rule of linear algebra we get the unique solution $(z, \omega)$ of system $(B_4)$, that is, $z = \frac{D_1}{D}$ and $\omega = \frac{D_2}{D}$.

By our supposition the couple $(x_0, y_0) \in L_{B_3}$. This means that the numbers $y_0^2$ and $y_0$ satisfy equations (23) and (24) of $(B_3)$, or in other words the couple $(y_0^2, y_0)$ is a solution of the linear system $(B_4)$. But because $D \neq 0$, the couple $(z, \omega)$, where $z = \frac{D_1}{D}$ (34) and $\omega = \frac{D_2}{D}$ (35), is the unique solution of system $(B_4)$, as is well known in linear algebra. So we have $z = y_0^2$ and $\omega = y_0$, and by (34) and (35) we get
$$y_0^2 = \frac{D_1}{D} \quad (36) \qquad \text{and} \qquad y_0 = \frac{D_2}{D}. \quad (37)$$
Now we exploit the inner relation between the numbers $y_0^2$ and $y_0$, that is,
$$y_0^2 = y_0 \cdot y_0. \quad (38)$$
Replacing (36) and (37) in relation (38), we get
$$\frac{D_1}{D} = y_0 \cdot \frac{D_2}{D} \ \Rightarrow\ D_1 - y_0 D_2 = 0. \quad (39)$$
By (37) we have $D y_0 - D_2 = 0$ (40). So $(x_0, y_0) \in L_{B_3}$ also satisfies the system
$$D_1 - y D_2 = 0 \quad (39), \qquad D y - D_2 = 0. \quad (40)$$
From the above, we have the two systems
$$\alpha_2(x) y^2 + \alpha_1(x) y + \alpha_0(x) = 0, \quad \beta_2(x) y^2 + \beta_1(x) y + \beta_0(x) = 0, \quad \alpha_2(x) \neq 0,\ \beta_2(x) \neq 0,\ \alpha_1(x) \neq 0,\ \beta_1(x) \neq 0,\ D \neq 0 \qquad (B_5)$$
and
$$D_1 - y D_2 = 0, \quad D y - D_2 = 0, \quad \alpha_2(x) \neq 0,\ \beta_2(x) \neq 0,\ \alpha_1(x) \neq 0,\ \beta_1(x) \neq 0,\ D \neq 0. \qquad (B_6)$$
Let $L_{B_5}$, $L_{B_6}$ be the sets of solutions of systems $(B_5)$ and $(B_6)$. We now show that $L_{B_5} = L_{B_6}$. Of course we have $L_{B_5} \subseteq L_{B_6}$ from the previous procedure, because we obtained equalities (39) and (40) of system $(B_6)$ from system $(B_5)$.

Conversely, let $(x, y) \in L_{B_6}$. From the first two equalities of $(B_6)$ we get $y = \frac{D_1}{D_2}$ and $y = \frac{D_2}{D}$ (37). We multiply these equalities and we get $y^2 = \frac{D_1}{D}$ (36). Now we consider the linear system $(B_4)$. Because $D \neq 0$, it has the unique solution $(z, \omega) = \left(\frac{D_1}{D}, \frac{D_2}{D}\right)$ (41), as is well known from Cramer's rule. From (36), (37) and (41) we have $z = y^2$ (42) and $\omega = y$ (43). Replacing (42) and (43) in (30) and (31) of $(B_4)$ we get the first two equalities of $(B_5)$, that is, $(x, y) \in L_{B_5}$. So we have $L_{B_6} \subseteq L_{B_5}$.

From the above we have $L_{B_5} = L_{B_6}$. So we are led to solve system $(B_6)$, which is of the form we have examined previously, namely a system $p_1(x, y) = p_2(x, y) = 0$ (A) with $\deg_y p_1(x, y) \leq 1$, $\deg_y p_2(x, y) \leq 1$.

b) $D = 0$. We find the roots of $D$ as a polynomial in $x$, that is, of $\alpha_2(x)\beta_1(x) - \alpha_1(x)\beta_2(x) = 0$, for which $\alpha_2(x) \neq 0$, $\beta_2(x) \neq 0$, $\alpha_1(x) \neq 0$, $\beta_1(x) \neq 0$. We get $y$ that satisfies one of the equations (23) or (24) of $(B_3)$, that is,
$$\alpha_2(x) y^2 + \alpha_1(x) y + \alpha_0(x) = 0.$$
This holds because the two equations (23) and (24) are equivalent (as we have shown in the prerequisites of linear algebra), and each of them is a multiple of the other.
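Case a) above ($D \neq 0$) can be checked on a small example. The sketch below is only an illustration under our own assumptions: the two polynomials, the helper names (`poly_mul`, `poly_sub`, `poly_eval`) and the coefficient-list representation are hypothetical choices, not taken from the paper. It takes $p_1 = y^2 + y - 2x$ and $p_2 = 2y^2 - y - x$, forms $D$, $D_1$, $D_2$ as polynomials in $x$, and combines (39) and (40) into the single condition $D_2^2 - D \cdot D_1 = 0$ on $x$. That one-line combination is our shortcut for the illustration; the text itself reduces to the degree-one system instead.

```python
def poly_mul(a, b):
    """Product of two polynomials in x given as coefficient lists (low to high)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_sub(a, b):
    """Difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [s - t for s, t in zip(a, b)]


def poly_eval(c, x):
    """Horner evaluation of a coefficient list at x."""
    value = 0.0
    for coeff in reversed(c):
        value = value * x + coeff
    return value


# Hypothetical example: p1 = y^2 + y - 2x, p2 = 2y^2 - y - x.
a2, a1, a0 = [1], [1], [0, -2]    # alpha_2, alpha_1, alpha_0 as polynomials in x
b2, b1, b0 = [2], [-1], [0, -1]   # beta_2, beta_1, beta_0

D = poly_sub(poly_mul(a2, b1), poly_mul(a1, b2))    # alpha_2*beta_1 - alpha_1*beta_2
D1 = poly_sub(poly_mul(a1, b0), poly_mul(a0, b1))   # alpha_1*beta_0 - alpha_0*beta_1
D2 = poly_sub(poly_mul(a0, b2), poly_mul(a2, b0))   # alpha_0*beta_2 - alpha_2*beta_0

# Eliminating y between D1 - y*D2 = 0 and D*y - D2 = 0 gives D2^2 - D*D1 = 0.
E = poly_sub(poly_mul(D2, D2), poly_mul(D, D1))     # here: 9x^2 - 9x


def p1(x, y):
    return y * y + y - 2 * x


def p2(x, y):
    return 2 * y * y - y - x


# The roots of E are x = 0 and x = 1; for each, y = D2(x) / D(x) as in (37).
solutions = []
for x0 in (0.0, 1.0):
    y0 = poly_eval(D2, x0) / poly_eval(D, x0)
    solutions.append((x0, y0))
```

In this example $D = -3$ is a non-zero constant, so case b) never arises. In general the roots of `E` would be found with the first stage, and roots where $D$ vanishes would have to be treated separately.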
Remark 2.2.1.
We note that the three remaining cases we have left are similar tocase a) above where D = deg y p ( x , y ) ≤ deg y p ( x , y ) ≤
2. We set m : = max { deg y p ( x , y ) , deg y p ( x , y ) } . We solve system (A) in the general case with induction above the number m . Wehave examined the cases where m = m = k ∈ N , k ≥
3, we have solved system (A) for every systemso that m ≤ k −
1. We now solve system (A) when m = k .We can write polynomials p ( x , y ) , p ( x , y ) as follows: Solution of polynomial equations 15 α m ( x ) y m + α v ( x ) y v + q ( x , y ) = p ( x , y ) and β n ( x ) y n + β µ ( x ) y µ + q ( x , y )= p ( x , y ) , where v < m , v , m ∈ N , deg y q ( x , y ) < v and µ < n , µ , n ∈ N , n ≤ m , deg y q ( x , y ) < µ .So, the initial system can be written as follows: ( α m ( x ) y m + α v ( x ) y v + q ( x , y ) = ( ) β n ( x ) y n + β µ ( x ) y µ + q ( x , y ) = ( ) (A)where α m ( x ) , q v ( x ) , β n ( x ) , β µ ( x ) are polynomials of the real variable x only and q ( x , y ) , q ( x , y ) be polynomials of real variables x and y .We, also, suppose that a m ( x )
0. We can distinguish some cases as previously:1) Let α v ( x ) ≡ q ( x , y ) ≡ β n ( x ) ≡ β µ ( x ) ≡ q ( x , y ) ≡
0. Then we have thesystem: α m ( x ) y m =
0. If m =
0, and α m ( x ) = c ∈ R then every couple ( x , ) ∈ L A and the system has an infinite set of solutions, which is false. So, this case cannotoccur.2) β n ( x ) ≡ β µ ( x ) ≡ q ( x , y ) ≡ β n ( x ) α v ( x ) ≡ q ( x , y ) ≡ β µ ( x ) ≡ q ( x , y ) ≡ ( α m ( x ) y m = β n ( x ) y n = deg α m ( x ) = deg β n ( x ) =
0, then any couple ( x , ) ∈ L A and the set of solutionsis infinite, which is false. So, this case cannot occur.4) β n ( x ) α v ( x ) ≡ β µ ( x ) ≡ ≡ q ( x , y ) and q ( x , y ) ≡ r ( x ) ( α m ( x ) y m = ( ) β n ( x ) y n + r ( x ) = ( ) (A)We can distinguish some cases here.Firstly, we suppose that (A) has a solution ( x , y ) ∈ L A .i) α m ( x ) = = β n ( x ) . Then, of course r ( x ) =
0. So, if the polynomials α m ( x ) , β n ( x ) , r ( x ) , has a common root x , then any couple ( x , y ) ∈ L A , for every y ∈ R , which is false of course.So, this case cannot happen.ii) α m ( x ) = β n ( x ) = y = − r ( x ) β n ( x ) .Thus, in this case we solve the system as follows: We find the roots x of α m ( x ) , so that β n ( x ) =
0. For every such root the couple ( x , y ) = (cid:16) x , − r ( x ) β n ( x ) (cid:17) ∈ L A .We get all the other solutions of this system in the same way.iii) α m ( x ) = β n ( x ) = y =
0. By (4), we take r ( x ) = r ( x ) suchthat α m ( x ) = β n ( x ) =
0. For every such root x , the couple ( x , ) ∈ L A .iv) α m ( x ) = β n ( x ) = y =
0, and by (4), for y = r ( x ) = x of r ( x ) sothat α m ( x ) = β n =
0. Then the couple ( x , ) is a solution of (A).v) In a similar way we can solve a system of the form: ( α m ( x ) y m + r ( x ) = β n ( x ) y n = . β n ( x ) α v ( x ) ≡ β µ ( x ) ≡ q ( x , y ) ≡ r v ( x ) q ( x , y ) ≡ r ( x ) ( α m ( x ) y m + r ( x ) = β n ( x ) y n + r ( x ) = ( x , y ) ∈ L A .If r ( x ) = r ( x ) =
0, then we have the system of the previous case 4. So,we suppose that: r ( x ) = r ( x ) =
0. We can distinguish some cases:(i) α m ( x ) = β n ( x ) = ( r ( x ) = r ( x ) = . (A)Let x be a common root of r ( x ) , r ( x ) . Then, we have ( x , y ) ∈ L A , for every y ∈ R , which is false of course, because the set L A is finite.So, this case cannot occur.(ii) α m ( x ) = β n ( x ) =
0. Then, we get r ( x ) = y n = − r ( x ) β n ( x ) , andif n is odd we have y = n s − r ( x ) β n ( x ) if r ( x ) β n ( x ) ≤ y = − n s r ( x ) β n ( x ) if r ( x ) β n ( x ) > n is even and r ( x ) β n ( x ) ≤
0, we have:
Solution of polynomial equations 17 y = n s − r ( x ) β n ( x ) and y = − n s − r ( x ) β n ( x ) . So, in this case we solve the system as follows:We find the common roots of polynomials α m ( x ) and r ( x ) , such that β n ( x ) =
0. Let x such a root.Then, if n is odd and r ( x ) β n ( x ) ≤
0, then the couple ( x , y ) ∈ L A , where y = n s − r ( x ) β n ( x ) , whereas if r ( x ) β n ( x ) >
0, then the couple ( x , y ) ∈ L A , where y = − n s r ( x ) β n ( x ) . If n is even, we set y = n s − r ( x ) β n ( x ) and y = − n s − r ( x ) β n ( x ) , and the couples ( x , y ) , ( x , y ) ∈ L A , where r ( x ) β n ( x ) ≤
0. This case can happenif the above conditions hold, of course.(iii) α m ( x ) = β n ( x ) = α m ( x ) = β n ( x ) = ( A ) we get: y m = − r ( x ) α m ( x ) and y n = − r ( x ) β n ( x ) . Through these equations we get: y n m = ( − ) n (cid:18) r ( x ) α m ( x ) (cid:19) n and y n m = ( − ) m (cid:18) r ( x ) β n ( x ) (cid:19) m and by these equations we get: ( − ) n (cid:18) r ( x ) α m ( x ) (cid:19) n = ( − ) m (cid:18) r ( x ) β n ( x ) (cid:19) m ⇔ ( − ) m − n r ( x ) m α m ( x ) n − r ( x ) n β n ( x ) m = . So, we solve this case as follows:We find the real roots of polynomial ( − ) m − n r ( x ) m α m ( x ) n − r ( x ) n β n ( x ) m ,so that α m ( x ) = β n ( x ) = x , we get y so that y m = − r ( x ) α m ( x ) as in the previous case (ii).6) β n ( x ) α v ( x ) ≡ β µ ( x ) ≡ q ( x , y ) , q ( x , y ) are two pure polynomials.So, we have to solve the system: ( α m ( x ) y m + q ( x , y ) = β n ( x ) y n + q ( x , y ) = deg y q ( x , y ) ≥ deg y q ( x , y ) ≥
1, and q ( x , y ) , q ( x , y ) are two monomials.So, in this case we have the system: α m ( x ) y m + α ( x ) y λ = β n ( x ) y n + β ( x ) y λ = , where α ( x ) , β ( x ) are two polynomials so that one of them (at least) is non-zeroand λ , λ ∈ N , so that λ < m and λ < n . We can write the system as follows: y λ ( α m ( x ) y m − λ + α ( x )) = y λ ( β n ( x ) n − λ + β ( x )) = , (A)so for every x ∈ R , the couple ( x , ) ∈ L A , which is false, because L A is finite. Thus,this cannot occur.7) We suppose that β n ( x ) q ( x , y ) ≡ r ( x ) , α v ( x ) β µ ( x ) q ( x , y ) ≡ r ( x ) , where r ( x ) r ( x )
0. So, we get the system: ( α m ( x ) y m + α v ( x ) y v + r ( x ) = β n ( x ) y n + β µ ( x ) y µ + r ( x ) = r ( x ) ≡ r ( x )
0. So, we get the system: ( α m ( x ) y m + α v ( x ) y v = β n ( x ) y n + β µ ( x ) y µ + r ( x ) = ( x , y ) ∈ L A .Through the first equation we get y v ( α m ( x ) y m − v + α v ( x )) = Solution of polynomial equations 19 If y =
0, by the second equation we get r ( x ) =
0. So, if x is a root of r ( x ) ,then the couple ( x , ) ∈ L A .Let y =
0. Then we get: α m ( x ) y m − v + α v ( x ) = β n ( x ) y n + β µ ( x ) y µ + r ( x ) = . If r ( x ) = r ( x ) =
0. If n < m we have a system that we supposedlycan solve through the induction step. Thus, we suppose that n = m . So we havethe system: α m ( x ) y m − v + α v ( x ) = β n ( x ) y m + β µ ( x ) y µ + r ( x ) = r ( x ) r ( x ) ≡ r ( x ) r ( x ) ( x , y ) ∈ L A . If r ( x ) =
0, or r ( x ) = r ( x ) = r ( x ) =
0. We have some cases:(i) α v ( x ) β µ ( x ) ≡ ( α m ( x ) y m + α v ( x ) y v + r ( x ) = β n ( x ) y m + r ( x ) = . (A)We consider some cases:(a) Let ( x , y ) ∈ L A , α m ( x ) = β n ( x ) =
0. We have examined this case previ-ously.(b) α m ( x ) = β n ( x ) =
0. We have examined this case previously.(c) α m ( x ) = β n ( x ) = α v ( x ) = α v ( x ) =
0. Then we get r ( x ) = β n ( x ) =
0, that is false by our supposition. So, this case cannot occur.(d) α m ( x ) = β n ( x ) = α v ( x ) =
0, we have examined this case previously. So, we suppose that α v ( x ) =
0. We will examine this case later.(ii) α v ( x ) ≡ β µ ( x ) α v ( x ) β µ ( x ) (a) We suppose that β n ( x ) = β µ ( x ) =
0, and α m ( x ) = α v ( x ) =
0. So,we have the system: ( α m ( x ) y m + α v ( x ) y v + r ( x ) = r ( x ) = . (A)This cannot happen because r ( x ) = α m ( x ) = α v ( x ) =
0. Then we get that r ( x ) =
0, whichis false by our supposition. Thus, this case cannot occur.(c) If α m ( x ) =
0, or α v ( x ) =
0, or β n ( x ) =
0, or β µ ( x ) =
0, then we getsome of the previous cases.(d) α m ( x ) = α v ( x ) = β n ( x ) = β µ ( x ) = ( α m ( x ) y m + α v ( x ) y v + r ( x ) = β n ( x ) y m + β µ ( x ) y µ + r ( x ) = . (A)We have here the basic case of this system.We will examine this case later in a more general case.Now, we will examine the system: ( α m ( x ) y m + α v ( x ) y v + q ( x , y ) = β n ( x ) y n + β µ ( x ) y µ + q ( x , y ) = , where α m ( x ) m ≥ v < m , n ≤ m n > µ , q ( x , y ) , q ( x , y ) are two purepolynomials, α v ( x ) β µ ( x ) β n ( x ) ≡ ( x , y ) ∈ L A .If α m ( x ) =
0, then we have a system from the induction step. So, we suppose α m ( x ) = α v ( x ) = β µ ( x ) = q ( x , y ) , q ( x , y ) and we reach a system of thefollowing form ( α m ( x ) y m + α v ( x ) y v + q ( x , y ) = β m ( x ) y µ + q ( x , y ) = , (A)where α v ( x ) = β µ ( x ) = v < m , µ ∈ N , deg y q ( x , y ) < v , deg y q ( x , y ) < µ .We will see later how we can solve such a system. Solution of polynomial equations 21 (ii) α v ( x ) = β µ ( x ) = q ( x , y ) and we reach a system of the followingform: α v ( x ) y m + α v ( x ) y v + q ( x , y ) = ( ) β µ ( x ) y µ + r ( x ) = ( ) , (B)If β µ ( x ) =
0, we get from (2) that: r ( x ) = x that is a common root of polynomials in the second equation (B) sothat α m ( x ) = α v ( x ) =
0, and we find y from the first equation of (B).(iii) α v ( x ) = β µ ( x ) =
0. We will see later how we solve this system.
After all the above cases, we now reach the most important case. We have the system:

\[
\begin{cases}
\alpha_m(x)\,y^m + \alpha_v(x)\,y^v + q_1(x,y) = 0 \\
\beta_n(x)\,y^n + \beta_\mu(x)\,y^\mu + q_2(x,y) = 0
\end{cases} \tag{A}
\]

where $m > v > \deg_y q_1(x,y)$, $n > \mu > \deg_y q_2(x,y)$, $\alpha_m(x)\,\alpha_v(x)\,\beta_n(x)\,\beta_\mu(x) \not\equiv 0$, and $q_1(x,y)$, $q_2(x,y)$ are two pure polynomials.

We distinguish some cases:

(i) $m = n$. We also have some cases here.

(a) $v = \mu$. So, we have the system:

\[
\begin{cases}
\alpha_m(x)\,y^m + \alpha_v(x)\,y^v + q_1(x,y) = 0 \\
\beta_m(x)\,y^m + \beta_v(x)\,y^v + q_2(x,y) = 0
\end{cases} \tag{A}
\]

Let $(x,y) \in L_A$ with $\alpha_m(x)\,\alpha_v(x)\,\beta_m(x)\,\beta_v(x) \neq 0$. Let

\[
D = \begin{vmatrix} \alpha_m(x) & \alpha_v(x) \\ \beta_m(x) & \beta_v(x) \end{vmatrix} = \alpha_m(x)\beta_v(x) - \alpha_v(x)\beta_m(x).
\]

We suppose that $D \neq 0$. This is the first basic case. We will study three basic cases overall.

We consider the linear system in the unknowns $z$, $\omega$:

\[
\begin{cases}
\alpha_m(x)\,z + \alpha_v(x)\,\omega = -q_1(x,y) & (1) \\
\beta_m(x)\,z + \beta_v(x)\,\omega = -q_2(x,y) & (2)
\end{cases} \tag{B}
\]

We set:

\[
D_1 = \begin{vmatrix} -q_1(x,y) & \alpha_v(x) \\ -q_2(x,y) & \beta_v(x) \end{vmatrix}
\quad\text{and}\quad
D_2 = \begin{vmatrix} \alpha_m(x) & -q_1(x,y) \\ \beta_m(x) & -q_2(x,y) \end{vmatrix}.
\]

That is, we have:

\[
D_1 = \alpha_v(x)\,q_2(x,y) - \beta_v(x)\,q_1(x,y)
\quad\text{and}\quad
D_2 = \beta_m(x)\,q_1(x,y) - \alpha_m(x)\,q_2(x,y).
\]

Because $D \neq 0$, by our supposition, system (B) has only one solution $(z, \omega)$, where $z = \frac{D_1}{D}$ (3) and $\omega = \frac{D_2}{D}$ (4), by Cramer's rule, as is well known from linear algebra.

Because $(x,y) \in L_A$ (by our supposition), the couple $(y^m, y^v)$ is a solution of system (B). But $(z, \omega)$ is the unique solution of system (B). So, we have $(z, \omega) = (y^m, y^v)$, that is, $z = y^m$ (5) and $\omega = y^v$ (6). By (3), (4), (5) and (6), we get:

\[
y^m = \frac{D_1}{D} \quad (7) \qquad\text{and}\qquad y^v = \frac{D_2}{D}. \quad (8)
\]

Now, we use the obvious relation between the numbers $y^m$ and $y^v$, that is, $y^m = y^{m-v} \cdot y^v$ (9), where $v < m$ by our supposition. Replacing (7) and (8) in (9), we get:

\[
\frac{D_1}{D} = y^{m-v} \cdot \frac{D_2}{D} \iff D_2\,y^{m-v} - D_1 = 0. \quad (10)
\]

From the above we see that $(x,y)$ satisfies the two equations:

\[
\begin{cases}
D\,y^v - D_2 = 0 & (11) \\
D_2\,y^{m-v} - D_1 = 0 & (12)
\end{cases} \tag{C}
\]

We notice that the polynomials in (11) and (12) have degree with respect to $y$ lower than $m$.

Let us now consider the following two systems:

\[
\begin{cases}
\alpha_m(x)\,y^m + \alpha_v(x)\,y^v + q_1(x,y) = 0 & (13) \\
\beta_m(x)\,y^m + \beta_v(x)\,y^v + q_2(x,y) = 0 & (14) \\
y \cdot \alpha_m(x) \cdot \alpha_v(x) \cdot \beta_m(x) \cdot \beta_v(x) \cdot D \neq 0
\end{cases} \tag{A}
\]

\[
\begin{cases}
D\,y^v - D_2 = 0 & (15) \\
D_2\,y^{m-v} - D_1 = 0 & (16) \\
y \cdot \alpha_m(x) \cdot \alpha_v(x) \cdot \beta_m(x) \cdot \beta_v(x) \cdot D \neq 0
\end{cases} \tag{B}
\]

where $D = \alpha_m(x)\beta_v(x) - \alpha_v(x)\beta_m(x)$, $D_1 = \alpha_v(x)\,q_2(x,y) - \beta_v(x)\,q_1(x,y)$ and $D_2 = \beta_m(x)\,q_1(x,y) - \alpha_m(x)\,q_2(x,y)$.

It is obvious that $\deg_y(D\,y^v) = v < m$, because $D \neq 0$, and that $\deg_y(D_2\,y^{m-v}) < m$, because $\deg_y D_2 < v$ by our suppositions. We will now prove that $L_A = L_B$. It is obvious that $L_A \subseteq L_B$ (17) from the previous procedure, because we obtained equations (11) and (12) of system (C) from the equations of system (A).

Now, let $(x,y) \in L_B$. Through equations (15), (16) of (B) and the fact that $y \neq 0$, we get:

\[
y^v = \frac{D_2}{D} \quad (18) \qquad\text{and}\qquad y^{m-v} = \frac{D_1}{D_2}. \quad (19)
\]

Through equations (18) and (19) we get $y^m = \frac{D_1}{D}$ (20). Now, we consider again the linear system in $z$, $\omega$ given by equations (1) and (2). Because $D \neq 0$, its unique solution is $(z, \omega) = \left(\frac{D_1}{D}, \frac{D_2}{D}\right)$ (21), by Cramer's rule. Through (18), (20) and (21) we get $z = y^m$ (22) and $\omega = y^v$ (23). Replacing (22) and (23) in equations (1) and (2), we find that $(x,y) \in L_A$, so $L_B \subseteq L_A$ (24). By (17) and (24), we have $L_A = L_B$ (25).

Equality (25) means that, in order to solve system (A), it suffices to solve system (B), whose degree with respect to $y$ is smaller than $m$, the degree of system (A) with respect to $y$. But by the induction step we can solve a system whose degree with respect to $y$ is smaller than $m$, and thus we complete this case.

The second basic case is the following: $D \not\equiv 0$, but $D(x) = \alpha_m(x)\beta_v(x) - \alpha_v(x)\beta_m(x) = 0$.

In this case the two equations of system (A) are equivalent, as we have shown in the prerequisites from linear algebra. So, we can solve this case as follows: we find the roots $x$ of the polynomial $D = \alpha_m(x)\beta_v(x) - \alpha_v(x)\beta_m(x)$ for which $\alpha_m(x) \cdot \alpha_v(x) \cdot \beta_m(x) \cdot \beta_v(x) \neq 0$. For every such root $x$ we find $y$ from one of the two equivalent equations of (A). We complete this case by finding the solutions of the form $(x, 0)$ (if any).

Third basic case (singular case)
We suppose that $D \equiv 0$, that is, $\alpha_m(x)\beta_v(x) - \alpha_v(x)\beta_m(x) \equiv 0$. We call this case the singular case.

We consider the system:

\[
\begin{cases}
\alpha_m(x)\,y^m + \alpha_v(x)\,y^v + q_1(x,y) = 0 & (1) \\
\beta_m(x)\,y^m + \beta_v(x)\,y^v + q_2(x,y) = 0 & (2) \\
\alpha_m(x)\,\alpha_v(x)\,\beta_m(x)\,\beta_v(x) \neq 0
\end{cases} \tag{A}
\]

We keep our general supposition, that is, we suppose $L_A \neq \emptyset$. Let $(x_0, y_0) \in L_A$. Then we get:

\[
\begin{cases}
\alpha_m(x_0)\,y_0^m + \alpha_v(x_0)\,y_0^v = -q_1(x_0,y_0) & (3) \\
\beta_m(x_0)\,y_0^m + \beta_v(x_0)\,y_0^v = -q_2(x_0,y_0) & (4)
\end{cases} \tag{B}
\]

We have $D(x_0) = \alpha_m(x_0)\beta_v(x_0) - \alpha_v(x_0)\beta_m(x_0) = 0$. Let

\[
D_1(x,y) = \alpha_v(x)\,q_2(x,y) - \beta_v(x)\,q_1(x,y), \qquad
D_2(x,y) = \beta_m(x)\,q_1(x,y) - \alpha_m(x)\,q_2(x,y).
\]

We now consider the following system in the unknowns $z$, $\omega$:

\[
\begin{cases}
\alpha_m(x_0)\,z + \alpha_v(x_0)\,\omega = -q_1(x_0,y_0) & (5) \\
\beta_m(x_0)\,z + \beta_v(x_0)\,\omega = -q_2(x_0,y_0) & (6)
\end{cases} \tag{$\Gamma$}
\]

Through the previous system (B) we have that $(y_0^m, y_0^v)$ is a solution of ($\Gamma$). That is, ($\Gamma$) is a linear system that has a solution and $D(x_0) = 0$. So, we have that $D_1(x_0,y_0) = 0$. Hence every $(x,y) \in L_A$ with $y \neq 0$ satisfies the system:

\[
\begin{cases}
\alpha_m(x)\,y^m + \alpha_v(x)\,y^v + q_1(x,y) = 0 & (7) \\
\alpha_v(x)\,q_2(x,y) - \beta_v(x)\,q_1(x,y) = 0 & (8) \\
y \cdot \alpha_m(x)\,\alpha_v(x)\,\beta_m(x)\,\beta_v(x) \neq 0
\end{cases} \tag{$\Delta$}
\]

From the above we have $L_A \subseteq L_\Delta$ (9).

We can now prove the reverse inclusion of (9). Let $(x_0, y_0) \in L_\Delta$. Of course, $(x_0, y_0)$ satisfies equation (1) of (A). We distinguish two cases:

(i) $q_1(x_0,y_0) \neq 0$. We consider system ($\Gamma$). This system has $D = D(x_0) = 0$, because $D \equiv 0$, and $D_1 = D_1(x_0,y_0) = 0$ by equation (8). So, system ($\Gamma$) has an infinity of solutions and $D_2(x_0,y_0) = 0$, because $D = D_1(x_0,y_0) = 0$ and $\alpha_m(x_0)\,\alpha_v(x_0)\,\beta_m(x_0)\,\beta_v(x_0) \neq 0$, by our supposition.

From the relation $D(x_0) = 0$ we get:

\[
\alpha_m(x_0)\beta_v(x_0) - \alpha_v(x_0)\beta_m(x_0) = 0 \iff \frac{\alpha_m(x_0)}{\beta_m(x_0)} = \frac{\alpha_v(x_0)}{\beta_v(x_0)}. \quad (10)
\]
By equation D ( x , y ) = α v ( x ) q ( x , y ) = β v ( x ) q ( x , y ) ⇒ α v ( x ) β v ( x ) = q ( x , y ) q ( x , y ) (11)We have q ( x , y ) = q ( x , y ) = ⇒ q ( x , y ) =
0, that is false by oursupposition. So, (11) holds. By (10) and (11) we set0 = λ = α m ( x ) β m ( x ) = α v ( x ) β v ( x ) = q ( x , y ) q ( x , y ) ⇒ β m ( x ) = λα m ( x ) , (12) β v ( x ) = λα v ( x ) , (13) q ( x , y ) = λ q ( x , y ) . (14)By (12), (13) and (14) we get: β m ( x ) y m + β v ( x ) y v + q ( x , y ) = λα m ( x ) y m + λα v ( x ) y v + λ q ( x , y )= λ ( α m ( x ) y m + α v ( x ) y v + q ( x , y )= λ · = , because ( x , y ) ∈ L ∆ , which means that ( x , y ) satisfies equality (7). So, we provedthat if ( x , y ) ∈ L ∆ and q ( x , y ) =
0, then ( x , y ) ∈ L A .(ii) q ( x , y ) = ( x , y ) ∈ L ∆ , through equality (8) we get: q ( x , y ) =
0, because α v ( x ) =
0, by our supposition.As previously, because D ( x ) = β m ( x ) β v ( x ) =
0, we take it that: (12),and (13) holds, so β m ( x ) y m + β v ( x ) y v + q ( x , y ) = λα m ( x ) y m + λα v ( x ) y v + = λ ( α m ( x ) y m + α v ( x ) y v + q ( x , y ) = , by equality (7) of ( ∆ ) because ( x , y ) ∈ L ∆ by our supposition.So, equality (2) of (A) holds, that is ( x , y ) ∈ L A . So, we have L ∆ ⊆ L A (15).Through (9) and (15), we get: L A = L ∆ . So, in order to solve system (A), it sufficesto solve system ( ∆ ) . What is the profit from system ( ∆ ) . The profit is that polynomialin equation (8) of ( ∆ ) , that is D , has deg y D ( x , y ) < v or D ( x , y ) ≡
0. We examinenow how we exploit these facts.We leave the case D ( x , y ) ≡
0, for the end.We examine now the case where D ( x , y )
0. We can write D ( x , y ) in the fol-lowing form: D ( x , y ) = α v ( x ) y v + α v ( x ) y v + q ( x , y ) , where v > v > v , deg y q ( x , y ) < v , or q ( x , y ) ≡
0. This is the general case.We suppose, also, that α v ( x ) α v ( x ) q ( x , y ) is a pure polynomial.We get: y m − v D ( x , y ) = α v ( x ) y m + α v ( x ) y m − v + v + y m − v q ( x , y ) , where deg y ( y m − v q ( x , y )) < m − v + v because deg y q ( x , y ) < v , by our sup-position.We consider the system: α m ( x ) y m + α v ( x ) y v + q ( x , y ) = α v ( x ) y m + α v ( x ) y m − v + v + y m − v q ( x , y ) = y α m ( x ) α v ( x ) β m ( x ) β v ( x ) = . (E)If α v ( x ) =
0, or α v ( x ) = x ∈ R we examine whether system (E) has aroot of α v ( x ) or α v ( x ) that satisfies system (E). So, we examine the case where α v ( x ) · α v ( x ) = D = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) α m ( x ) α v ( x ) α v ( x ) α v ( x ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = α m ( x ) α v ( x ) − α v ( x ) α v ( x ) . Then, system (E) is a system similar to system (A).So, we examine the similar cases with the same way.Here, we examine only the case where D = α m ( x ) α v ( x ) − α v ( x ) α v ( x ) ≡ ( ∆ ) , so that the respective equation(8) of the new system has D ( x , y ) ( ∆ ) . So it helps us to take the equation in which the respectivepure polynomial q ( x , y ) or q ( x , y ) has the smallest number of terms. For this reasonin system (E) (that is similar to A) we take as a first equation (of system ( ∆ ) ) thesecond equation because this polynomial y m − v q ( x , y ) has at most v terms withrespect to y (by its definition), where v < v < v ⇒ v ≤ v − ( ∆ ) we take that the respective D ( x , y ) polynomial of ( ∆ ) has deg y D ( x , y ) < v , so, if we write this polynomial again in the form: D ( x , y ) = α v ( x ) y v + α v ( x ) y v + q ( x , y ) , the new polynomial q ( x , y ) , has at most v ≤ v − q ( x , y ) , q ( x , y ) of the new system ( ∆ ) will have at most v − D ( x , y ) also. So, the profit is the following: Solution of polynomial equations 27
In system ( ∆ ) polynomial D ( x , y ) has at most v terms with respect to y , whereasin a new system like ( ∆ ) in a following stage the respective polynomial D ( x , y ) ofthe new system ( ∆ ) will have at most v − y .With the same procedure we can see that the terms of the respective polynomials D ( x , y ) are decreasing, so that after a finite number of steps we reach a polynomial D ( x , y ) ≡ D ( x , y ) ≡ r ( x ) for polynomial r ( x )
0. If D ( x , y ) ≡ r ( x )
0, it suffices to find the roots of polynomial r ( x ) , otherwise we have some of the previous cases that we have already examined.Now we will examine the remaining case. In system (A), page 21. If v = µ and n = m we have the first basic case where D ( x ) = m = n , that is n < m . If n ≥ v , we have the first basic case. So, wecan examine the case v > n . In this case we have: y m − n ( β n ( x ) y n + β µ ( x ) y µ + q ( x , y )) = ⇔ β n ( x ) y m + β µ ( x ) y m − n + µ + y m − n q ( x , y ) = ( α m ( x ) y m + α v ( x ) y v + q ( x , y ) = β n ( x ) y m + β µ ( x ) y m − n + µ + y m − n q ( x , y ) = m = n , which we have already examined. So,up to now, we have examined all the possible cases of the initial system except oneonly, that we will examine now.In the third basic case we will examine now the case where D ( x , y ) ≡
0. Then,as in pages 23, 24 we take it that D ( x , y ) ≡
0, also that for every ( x , y ) ∈ R thereexists c ∈ R , such that α m ( x ) y m + α v ( x ) y v + q ( x , y ) = c · ( β m ( x ) y m + β v ( x ) y v + q ( x , y )) ( ∗ )and c =
0. The number c depends on the couple ( x , y ) , so it is better to write c ( x , y ) ,instead of c .Now, we will consider system (A ∗ ) ( α m ( x ) y m + α v ( x ) y v + q ( x , y ) = α m ( x ) α v ( x ) β m ( x ) β v ( x ) = . (A ∗ )Equality ( ∗ ) gives us that L A = L A ∗ . So, in order to solve system (A) it suffices to solve the “simpler” system (A ∗ ) thathas only one equation.Now, it is the time to exploit the unique supposition that we have not used up tonow. That is: The set L A = L A ∗ is finite. As we have seen in the prerequisites there arepolynomials p ( x , y ) of two real variables that have a finite set of roots only.For example let: p ( x , y ) = ( x − ) + ( y − ) . It is easy to see that L p ( x , y ) = { ( , ) , ( , − ) , ( − , ) , ( − , − ) } . We denote: R ( x , y ) = α m ( x ) y m + α v ( x ) y v + q ( x , y ) , for simplicity.So, we solve the system: ( R ( x , y ) = α m ( x ) α v ( x ) β m ( x ) β v ( x ) = . (A ∗ )Of course, we get R ( x , y )
0, because α m α m ( x ) m >
1, that givesthat R ( x , y ) is a pure polynomial, that has a finite set of roots, non empty.We apply Corollary 3.16 by our prerequisites and we take it that 0 is the globalmaximum or minimum of R ( x , y ) .Without loss of generality we suppose that 0 is the global minimum of R ( x , y ) .This means that if we consider the function F : U → R (where U = { ( x , y ) ∈ R | α m ( x ) α v ( x ) β m ( x ) β v ( x ) = } is an open subset of R ). F (( x , y )) = R ( x , y ) for every ( x , y ) ∈ U , then it holds F (( x , y )) ≥ ( x , y ) ∈ U , and there exists ( x , y ) ∈ U so that F (( x , y )) = ( x , y ) ∈ R so that ( x , y ) ∈ L A ∗ . Then, we have F (( x , y )) = F has a global minimum in ( x , y ) . Then, by Theorem 3.17 we get ∇ F ( x , y ) = ( , ) . So we have: ∂ F ∂ y (( x , y )) = F (( x , y )) = ∂ F ∂ y (( x , y )) = α m ( x ) α v ( x ) β m ( x ) β v ( x ) = . ((A )Of course we get L A ⊆ L A ∗ = L A and by the above we also get: L ∗ A ⊆ L A . So weget: L A = L A . So, in order to solve system (A ∗ ) it suffices to solve system (A ). We need to writea more analytic system (A ). We get: Solution of polynomial equations 29 α m ( x ) y m + α v ( x ) y v + q ( x , y ) = m α m ( x ) y m − + v α v ( x ) y v − + ∂ q ∂ y ( x , y ) = α m ( x ) α v ( x ) β m ( x ) β v ( x ) = )Because m > v ⇒ m − ≥ v . This shows that system (A ) is the first basic case,and so we can transfer system (A ) to a system that has smaller than m degreewith respect to y that we can solve with the induction step. So, inductively we havemanaged to solve the initial system in any case. So, we have completed our secondstage. Let a polynomial p ( z ) = α + α z + · · · + α v − z v − + α v z v , for v ∈ N , α ∈ C , for i = , , . . ., v , α v =
0, of one complex variable.We are now ready to solve completely the equation p ( z ) =
0, or in other wordsto find the roots of polynomial p ( z ) with degree v .We distinguish two cases:(i) α i ∈ R for every i = , , . . ., v , and (ii) α i ∈ C , i = , , . . ., v . Firstly, we provethe following lemma: Lemma 2.4 (A well known lemma).Let p ( z ) , be a polynomial as above with degree v = degp ( z ) ∈ N . Then, thereexist two polynomials p ( x , y ) , p ( x , y ) of two real variables with real coefficients,so that it holds: p ( x + yi ) = p ( x , y ) + ip ( x , y ) for every ( x , y ) ∈ R .Proof. We can prove this lemma with induction above the degree v of p ( z ) . Let p ( z ) = α + α z , α , α ∈ R , α =
0. Let ( x , y ) ∈ R . We get: p ( x + yi ) = α + α ( x + yi ) = ( α + α x ) + α yi , for v = p ( x , y ) = α + α x and p ( x , y ) = α y , the result holds.For v = p ( z ) = α + α z + α z , where α , α , α ∈ R , α = z = x + yi , ( x , y ) ∈ R . We get: p ( z ) = p ( x + yi ) = α + α ( x + yi ) + α ( x + yi ) = ( α + α + α x − α y ) + ( α y + α xy ) i , so for p ( x , y ) = α + α x + α x − α y and p ( x , y ) = α y + α xy , the result holds.We suppose now, that the result holds for any 1 ≤ i ≤ k ∈ N . We can prove thatresult holds for k + p ( z ) = α + α z + · · · + α n z k + α k + z k + , be a polynomial with α k + = α i ∈ R , for every i = , , . . ., k + ( x , y ) ∈ R . We have: p ( z ) = q ( z ) + α k + z k + , we distinguish two cases:(a) q ( z )
0. Then, through the induction step we can show that there exist twopolynomials p ( x , y ) , p ( x , y ) of two real variables x and y with real coefficients, sothat: q ( x + y i ) = p ( x , y ) + p ( x , y ) i for every ( x , y ) ∈ R . (1)We get: α k + z k + = α k + ( x + yi ) k + = α k + k + ∑ j = (cid:18) k + j (cid:19) x j · ( yi ) k + − j = α k + k + ∑ j = (cid:18) k + j (cid:19) x j y k + − j i k + − j = ∑ k + − j = ρρ ∈ N α < j ≤ k + α k + (cid:18) k + j (cid:19) x j y k + − j ( − ) ( n + − j ) / + ∑ k + − j = ρ + ρ ∈ N ≤ j ≤ k + α k + x j y k + − j i k + − j = q ( x , y ) + iq ( x , y ) (2)where q ( x , y ) = ∑ k + − j = ρρ ∈ N ≤ j ≤ k + α k + (cid:18) k + j (cid:19) x j y k + − j ( − ) ( n + − j ) / and iq ( x , y ) = ∑ k + − j = ρ + ρ ∈ N ≤ j ≤ k + α k + x j y k + − j i k + − j where q ( x , y ) , q ( x , y ) are two polynomials of the two real variables with real coef-ficients because i v + = i or − i , v ∈ N . Solution of polynomial equations 31
So, we get: $\alpha_{k+1} z^{k+1} = q_1(x,y) + i\,q_2(x,y)$. So, by (1) and (2), we get:

\[
\begin{aligned}
p(z) = q(z) + \alpha_{k+1} z^{k+1} &= (p_1(x,y) + p_2(x,y)\,i) + (q_1(x,y) + q_2(x,y)\,i) \\
&= (p_1(x,y) + q_1(x,y)) + (p_2(x,y) + q_2(x,y))\,i
\end{aligned}
\]

and the result also holds for every $(x,y) \in \mathbb{R}^2$.

(b) $q(z) \equiv 0$. Then, with the above equality (2), we get $p(z) = \alpha_{k+1} z^{k+1} = q_1(x,y) + i\,q_2(x,y)$ for every $(x,y) \in \mathbb{R}^2$ and the result also holds. So, by induction we see that the result holds in this case.

Now, we suppose that $\alpha_i \in \mathbb{C}$ for every $i = 0, 1, \ldots, v$. Let $\alpha_j = \beta_j + \gamma_j i$ for every $j = 0, 1, \ldots, v$, where $\beta_j, \gamma_j \in \mathbb{R}$. Let $z = x + yi \in \mathbb{C}$, $(x,y) \in \mathbb{R}^2$. We get:

\[
\begin{aligned}
p(z) = p(x+yi) &= \alpha_0 + \alpha_1 z + \cdots + \alpha_{v-1} z^{v-1} + \alpha_v z^v \\
&= (\beta_0 + \gamma_0 i) + (\beta_1 + \gamma_1 i) z + \cdots + (\beta_{v-1} + \gamma_{v-1} i) z^{v-1} + (\beta_v + \gamma_v i) z^v \\
&= (\beta_0 + \beta_1 z + \cdots + \beta_{v-1} z^{v-1} + \beta_v z^v) + (\gamma_0 + \gamma_1 z + \cdots + \gamma_{v-1} z^{v-1} + \gamma_v z^v)\,i. \quad (3)
\end{aligned}
\]

By the previous case (i), there exist polynomials $p_1(x,y)$, $p_2(x,y)$, $q_1(x,y)$, $q_2(x,y)$ of the two real variables $x$ and $y$ with real coefficients, so that

\[
\beta_0 + \beta_1 z + \cdots + \beta_{v-1} z^{v-1} + \beta_v z^v = p_1(x,y) + p_2(x,y)\,i \quad (4)
\]

and

\[
\gamma_0 + \gamma_1 z + \cdots + \gamma_{v-1} z^{v-1} + \gamma_v z^v = q_1(x,y) + q_2(x,y)\,i \quad (5)
\]

for every $(x,y) \in \mathbb{R}^2$. By (3), (4) and (5) we get:

\[
p(z) = (p_1(x,y) + p_2(x,y)\,i) + (q_1(x,y) + q_2(x,y)\,i)\,i = (p_1(x,y) - q_2(x,y)) + (p_2(x,y) + q_1(x,y))\,i
\]

and the result also holds. ∎

With the help of this lemma we can now solve the equation $p(z) = 0$, where $p(z) = \alpha_0 + \alpha_1 z + \cdots + \alpha_{v-1} z^{v-1} + \alpha_v z^v$, $v \in \mathbb{N}$, $\alpha_v \neq 0$, $\alpha_i \in \mathbb{C}$ for every $i = 0, 1, \ldots, v$.

By the above lemma we write:

\[
p(x + yi) = q_1(x,y) + q_2(x,y)\,i \tag{$\ast$}
\]

for every $(x,y) \in \mathbb{R}^2$, where $q_1(x,y)$, $q_2(x,y)$ are two polynomials of the two real variables $x$ and $y$ with real coefficients.

Let $A$ be the set of roots of $p(z)$. We consider the system:

\[
\begin{cases}
q_1(x,y) = 0 \\
q_2(x,y) = 0
\end{cases} \tag{B}
\]

It is obvious from the above equality ($\ast$) that $A = L_B$, where we identify the complex number $x + yi$ with the couple $(x,y)$.
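As an illustration of Lemma 2.4 and of system (B), the following Python sketch computes the two real polynomials $q_1$, $q_2$ from the coefficients of $p(z)$ by expanding each power $(x + yi)^k$ with the binomial formula, as in the proof. The function names `split_real_imag` and `evaluate`, and the representation of a polynomial of two variables as a dictionary, are assumptions made for this example only; they are not part of the paper.

```python
from math import comb


def split_real_imag(coeffs):
    """Return (q1, q2) such that p(x + iy) = q1(x, y) + i * q2(x, y).

    coeffs[k] is the (possibly complex) coefficient of z**k.
    Each returned polynomial is a dict mapping an exponent pair (j, l)
    to the real coefficient of x**j * y**l.
    """
    q1, q2 = {}, {}
    for k, a in enumerate(coeffs):
        a = complex(a)
        for j in range(k + 1):
            l = k - j  # power of (i*y) in the binomial term
            # i**l cycles through 1, i, -1, -i
            unit = (1, 1j, -1, -1j)[l % 4]
            term = a * comb(k, j) * unit
            if term.real:
                q1[(j, l)] = q1.get((j, l), 0.0) + term.real
            if term.imag:
                q2[(j, l)] = q2.get((j, l), 0.0) + term.imag
    return q1, q2


def evaluate(poly, x, y):
    """Evaluate a two-variable polynomial stored as {(j, l): coefficient}."""
    return sum(c * x**j * y**l for (j, l), c in poly.items())


# Example: p(z) = z**2 + 1 gives q1 = 1 + x**2 - y**2 and q2 = 2*x*y.
q1, q2 = split_real_imag([1, 0, 1])
```

For $p(z) = z^2 + 1$ both `q1` and `q2` vanish at $(x,y) = (0,1)$, which corresponds to the root $z = i$; this is the identification between the roots of $p$ and the real solutions of system (B).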
So, in order to find all the elements of $A$, it suffices to find all the real solutions of system (B). So, we solve system (B) with the method we have developed in the second stage, and thus we find all the roots of the polynomial $p(z)$.

Our method is now complete, because our supposition (S) (that system (B) has a solution) is satisfied, since the same holds for (A). So, in all the cases we can reduce our initial system to a system in which the two polynomials have a lower degree than that of the polynomials of the initial system. Thus, we apply the induction step and the system is solved inductively.

a) Prerequisites from Algebra.

We use some basic tools and results from the theory of polynomials. We denote by $\mathbb{C}[z]$ the set of complex polynomials. We denote by $\mathbb{R}[x]$ the set of real polynomials, that is, the set of polynomials of one real variable with coefficients in the set of real numbers $\mathbb{R}$.

We begin with the following basic result, which is a simple consequence of the algorithm of Euclidean division.

Proposition 3.1
Let $p(z) \in \mathbb{C}[z]$, $\deg p(z) \geq 1$. The number $r \in \mathbb{C}$ is a root of $p(z)$ if and only if there exists a unique polynomial $q(z) \in \mathbb{C}[z]$ so that $p(z) = (z - r)\,q(z)$. ∎

We need the definition of multiplicity of a root of a polynomial.

Definition 3.2
Let $p(z) \in \mathbb{C}[z]$. Let $\rho \in \mathbb{C}$ be a root of $p(z)$. The natural number $m$ is the multiplicity of the root $\rho$ of $p(z)$ if the polynomial $(z - \rho)^m$ divides $p(z)$, whereas the polynomial $(z - \rho)^{m+1}$ does not divide $p(z)$.

As a consequence of Proposition 3.1 there is the following proposition:

Proposition 3.3
Every root of a polynomial $p(z) \in \mathbb{C}[z]$ has a multiplicity, which is unique. ∎

We now state the fundamental theorem of algebra, whose proof is not simple and needs some tools from analysis.

Theorem 3.4
Every complex polynomial $p(z)$ with $\deg p(z) \geq 1$ has at least one root. ∎

From Theorem 3.4 and Proposition 3.1 we get the following fundamental result:
Theorem 3.5
Let $p(z) \in \mathbb{C}[z]$ be a complex polynomial with $\deg p(z) \geq 1$. Then $p(z)$ has a finite number of different roots. Let $\rho_1, \rho_2, \ldots, \rho_v$ be the different roots of $p(z)$ with respective multiplicities $m_1, m_2, \ldots, m_v$. Then, the following formula holds:

\[
p(z) = \alpha \cdot (z - \rho_1)^{m_1} (z - \rho_2)^{m_2} \cdots (z - \rho_v)^{m_v},
\]

where $\alpha \neq 0$ is the coefficient of the monomial of greatest degree $m = \deg p(z)$, and $m = m_1 + m_2 + \cdots + m_v$. ∎

Now, we describe a simple algorithm in order to find the multiplicity of a root of a complex polynomial.
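Before the detailed description, here is a minimal Python sketch of this algorithm: it divides repeatedly by $(z - \rho)$ with Horner's scheme and counts how many times the remainder vanishes. The function names `horner_divide` and `multiplicity`, the ordering of the coefficients (leading coefficient first) and the tolerance `tol` used for floating-point roots are assumptions of this example, not part of the paper.

```python
def horner_divide(coeffs, rho):
    """Divide p(z) by (z - rho) using Horner's scheme.

    coeffs lists the coefficients of p from the leading one down to the
    constant term. Returns (quotient coefficients, remainder), where the
    remainder equals p(rho).
    """
    acc = []
    value = 0
    for c in coeffs:
        value = value * rho + c
        acc.append(value)
    return acc[:-1], acc[-1]


def multiplicity(coeffs, rho, tol=1e-9):
    """Count how many times (z - rho) divides p(z); 0 means rho is not a root."""
    count = 0
    current = list(coeffs)
    # Stop when the current quotient is a constant: no further division is possible.
    while len(current) > 1:
        quotient, remainder = horner_divide(current, rho)
        if abs(remainder) > tol:
            break  # the current quotient does not vanish at rho
        count += 1
        current = quotient
    return count


# Example: p(z) = z**3 - 3z + 2 = (z - 1)**2 (z + 2).
print(multiplicity([1, 0, -3, 2], 1))   # root 1
print(multiplicity([1, 0, -3, 2], -2))  # root -2
```

The loop ends after at most $\deg p(z)$ divisions, which mirrors the termination argument given in the text.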
Let $p(z) \in \mathbb{C}[z]$ be a complex polynomial of degree $\deg p(z) \geq 1$. Then $p(z)$ has a finite number of roots. Let $\rho$ be a root of $p(z)$. We describe in detail a way to find the multiplicity of $\rho$.

By Proposition 3.1 there exists a unique polynomial $q_0(z)$ so that:

\[
p(z) = (z - \rho)\,q_0(z). \quad (1)
\]

We find the polynomial $q_0(z)$ through the algorithm of Euclidean division, for example using Horner's scheme. Afterwards, we compute the number $q_0(\rho)$, for example with Horner's scheme.

If $q_0(\rho) \neq 0$, then the root $\rho$ has multiplicity 1. In order to prove this, we suppose that the root $\rho$ does not have multiplicity 1. By Proposition 3.3 the root $\rho$ has a unique multiplicity $m \in \mathbb{N}$ (see Definition 3.2). Because $m \neq 1$, we have $m \geq 2$. By the definition of multiplicity, the polynomial $(z - \rho)^m$ divides $p(z)$. This means (by the definition of division) that there exists a polynomial $R(z) \in \mathbb{C}[z]$ so that:

\[
p(z) = (z - \rho)^m R(z). \quad (2)
\]

By relations (1) and (2) we get:

\[
(z - \rho)\,q_0(z) = (z - \rho)^m R(z) \iff (z - \rho)\left(q_0(z) - (z - \rho)^{m-1} R(z)\right) = 0. \quad (3)
\]

The expressions $z - \rho$ and $q_0(z) - (z - \rho)^{m-1} R(z)$ are of course polynomials in $\mathbb{C}[z]$, because $m \geq 2$. Because $z - \rho \not\equiv 0$, we get

\[
q_0(z) - (z - \rho)^{m-1} R(z) = 0, \quad (4)
\]

because the ring of polynomials $\mathbb{C}[z]$ is an integral domain, as is well known from Algebra. Relation (4) gives $q_0(\rho) = 0$, because $m \geq 2$, which is false, because $q_0(\rho) \neq 0$. So, if $q_0(\rho) \neq 0$, then the root $\rho$ has multiplicity 1.

If instead $q_0(\rho) = 0$, then by Proposition 3.1 there exists a polynomial $q_1(z) \in \mathbb{C}[z]$ so that:

\[
q_0(z) = (z - \rho)\,q_1(z). \quad (5)
\]

By (1) and (5) we get

\[
p(z) = (z - \rho)^2 q_1(z). \quad (6)
\]

Relation (6) tells us that the polynomial $(z - \rho)^2$ divides $p(z)$. Afterwards, we find the polynomial $q_1(z)$ from (5) with the Euclidean algorithm, for example with Horner's scheme, because we have found the polynomial $q_0(z)$ previously. After that, we compute the number $q_1(\rho)$, for example with Horner's scheme. If $q_1(\rho) \neq 0$, then the multiplicity of $\rho$ is 2, with a proof similar to the previous one. Otherwise, if $q_1(\rho) = 0$, then again by Proposition 3.1 there exists a polynomial $q_2(z) \in \mathbb{C}[z]$ so that

\[
q_1(z) = (z - \rho)\,q_2(z). \quad (7)
\]

Through (6) and (7) we get:

\[
p(z) = (z - \rho)^3 q_2(z). \quad (8)
\]

We continue this procedure inductively, finding a sequence of polynomials $q_j(z) \in \mathbb{C}[z]$, $j = 0, 1, \ldots$, where $q_j(z) = (z - \rho)\,q_{j+1}(z)$ and $p(z) = (z - \rho)^{j+1} q_j(z)$. If $p(z) = (z - \rho)^{j+1} q_j(z)$ for some $j$, where $q_j(\rho) \neq 0$, then the multiplicity of $\rho$ is $j + 1$, with a proof similar to the one shown previously.

This procedure stops after finitely many steps. Indeed, let $\deg p(z) = v \in \mathbb{N}$. If the procedure did not stop, we would get $p(z) = (z - \rho)^{v+1} q_v(z)$, where $q_v(z) \not\equiv 0$ (otherwise $p(z) \equiv 0$, which is false because $\deg p(z) \geq 1$ by supposition), so $\deg\left((z - \rho)^{v+1} q_v(z)\right) \geq v + 1$, which is false because $\deg p(z) = v$.

That is, either $p(z) = (z - \rho)^{j+1} q_j(z)$ for some $j < v - 1$ with $q_j(\rho) \neq 0$, which gives that the multiplicity of $\rho$ is $j + 1 < v$, or otherwise

\[
p(z) = (z - \rho)^v q_{v-1}(z). \quad (9)
\]

Relation (9) gives that $q_{v-1}(z) \not\equiv 0$ (or else $p(z) \equiv 0$) and, by comparing degrees, that $q_{v-1}(z)$ is a constant polynomial with value, say, $c$. That is, $p(z) = (z - \rho)^v c$. Of course, the polynomial $(z - \rho)^{v+1}$ cannot divide $p(z)$, because it has degree $\deg\left((z - \rho)^{v+1}\right) > v = \deg p(z)$, which gives that the multiplicity of $\rho$ is $v$ (by the definition of multiplicity). So we have described a complete algorithm that gives us the multiplicity of a root of a complex polynomial. ∎

Remark 3.7
We can combine Proposition 3.1 with Theorem 3.4 and the previous algorithm (and of course Proposition 3.3) in order to prove Theorem 3.5. We leave it as an easy exercise for the reader. So far we have developed all we need from polynomials of one complex variable. We also need some basic results from Linear Algebra.

We consider the following linear system of two equations:

\[
\begin{cases}
\alpha_1 x + \beta_1 y = \gamma_1 & (1) \\
\alpha_2 x + \beta_2 y = \gamma_2 & (2)
\end{cases} \tag{A}
\]

where $\alpha_i, \beta_i, \gamma_i \in \mathbb{C}$ for $i = 1, 2$. We consider the determinants $D$, $D_x$, $D_y$, where

\[
D = \begin{vmatrix} \alpha_1 & \beta_1 \\ \alpha_2 & \beta_2 \end{vmatrix} = \alpha_1\beta_2 - \alpha_2\beta_1, \quad
D_x = \begin{vmatrix} \gamma_1 & \beta_1 \\ \gamma_2 & \beta_2 \end{vmatrix} = \gamma_1\beta_2 - \beta_1\gamma_2, \quad
D_y = \begin{vmatrix} \alpha_1 & \gamma_1 \\ \alpha_2 & \gamma_2 \end{vmatrix} = \alpha_1\gamma_2 - \alpha_2\gamma_1.
\]

When $D \neq 0$, system (A) has only one solution $(x,y)$, where $x = \frac{D_x}{D}$, $y = \frac{D_y}{D}$. When $D = 0$ and $D_x \neq 0$ or $D_y \neq 0$, system (A) does not have any solution, whereas when $D = D_x = D_y = 0$, system (A) has an infinite number of solutions, except only in the case where $\alpha_1 = \alpha_2 = \beta_1 = \beta_2 = 0$ and at least one of the numbers $\gamma_1$, $\gamma_2$ is non-zero. We need the case where $D \neq 0$ and the case where $D = D_x = D_y = 0$.

We consider the case where $D = D_x = D_y = 0$. We suppose that system (A) is a pure system of the two variables $x$ and $y$, that is, we suppose that at least one of the numbers $\alpha_1$, $\alpha_2$ is non-zero and at least one of the numbers $\beta_1$, $\beta_2$ is non-zero. That is, $\alpha_1 \neq 0$ or $\alpha_2 \neq 0$, and $\beta_1 \neq 0$ or $\beta_2 \neq 0$; otherwise we do not have a system of equations of two different variables.

We have two cases:

(i) One of the six numbers $\alpha_i, \beta_i, \gamma_i$, $i = 1, 2$, is zero. Suppose, for example, that $\alpha_1 = 0$ (3). We have $D = 0$, that is, $\alpha_1\beta_2 - \alpha_2\beta_1 = 0 \Rightarrow \alpha_1\beta_2 = \alpha_2\beta_1$ (4). Through (3) and (4) we have $\alpha_2\beta_1 = 0$. Because $\alpha_1 = 0$, we have $\alpha_2 \neq 0$, so $\beta_1 = 0$. Then $D_y = 0$ gives $\alpha_2\gamma_1 = 0$, so $\gamma_1 = 0$. That is, equation (1) is the equation $0 \cdot x + 0 \cdot y = 0$, with set of solutions the set $\mathbb{R}^2$. This means that system (A) is equivalent to equation (2) of (A) only. If $\beta_1 = 0$ or $\gamma_1 = 0$, we get in a similar way that $\alpha_1 = \beta_1 = \gamma_1 = 0$. Similarly, if $\alpha_2 = 0$, or $\beta_2 = 0$, or $\gamma_2 = 0$, we get $\alpha_2 = \beta_2 = \gamma_2 = 0$.

(ii) $\alpha_1\beta_1\gamma_1\alpha_2\beta_2\gamma_2 \neq 0$, that is, none of the six numbers $\alpha_i, \beta_i, \gamma_i$, $i = 1, 2$, is zero. We have

\[
D = 0 \iff \alpha_1\beta_2 - \alpha_2\beta_1 = 0 \iff \frac{\alpha_2}{\alpha_1} = \frac{\beta_2}{\beta_1} \quad (\alpha_1 \neq 0,\ \beta_1 \neq 0).
\]

We set $\lambda = \frac{\alpha_2}{\alpha_1} = \frac{\beta_2}{\beta_1}$ (8). We have

\[
D_x = 0 \iff \gamma_1\beta_2 - \beta_1\gamma_2 = 0 \iff \lambda = \frac{\beta_2}{\beta_1} = \frac{\gamma_2}{\gamma_1}. \quad (9)
\]

With (8) and (9) we get $\alpha_2 = \lambda\alpha_1$, $\beta_2 = \lambda\beta_1$, $\gamma_2 = \lambda\gamma_1$, that is,

\[
\alpha_2 x + \beta_2 y = \gamma_2 \iff (\lambda\alpha_1) x + (\lambda\beta_1) y = \lambda\gamma_1 \iff \lambda(\alpha_1 x + \beta_1 y) = \lambda\gamma_1 \overset{\lambda \neq 0}{\iff} \alpha_1 x + \beta_1 y = \gamma_1,
\]

that is, equations (1) and (2) of (A) are equivalent: they have the same set of solutions, which means that system (A) is equivalent to only one of the equations (1) and (2), whichever of the two.

So, we have proved that in the case $D = D_x = D_y = 0$, system (A) has an infinite number of solutions and is equivalent to only one of the equations (1) and (2). So we have stated our prerequisites from Algebra.

b) Prerequisites from Analysis
As is well known from Galois theory, there is no general formula that expresses the roots of an arbitrary polynomial of degree five or higher in terms of its coefficients using radicals. So, for an arbitrary polynomial, the only general way to find its roots is to approximate them with a numerical method. Perhaps the simplest numerical method for algebraic equations is the bisection method, which is presented in all classical books on Numerical Analysis. It is a simple method, and we base our approach on it. The bisection method has very weak hypotheses, and it is accessible to secondary school students as well.

Let $\alpha, \beta \in \mathbb{R}$, $\alpha < \beta$, and let $f : [\alpha, \beta] \to \mathbb{R}$ be a continuous function. We suppose that $f(\alpha) \cdot f(\beta) < 0$. Then $f$ has at least one root in the interval $(\alpha, \beta)$, and the bisection method approximates a root of $f$ in $(\alpha, \beta)$ to within any prescribed error.

There are many different numerical methods that find the roots in a specific interval. We will not discuss this subject, which is a vast one in Numerical Analysis. In this text it is enough for us to find only one root in a specific interval and approximate it using the bisection method.

The computation of all the real roots of a polynomial will be based on the following basic lemma.

Basic Lemma 3.8. Let $v \in \mathbb{N}$, $v \ge 3$, and let $p(x) = \alpha_v x^v + \alpha_{v-1} x^{v-1} + \cdots + \alpha_1 x + \alpha_0$ be a polynomial $p(x) \in \mathbb{R}[x]$ with degree $\deg p(x) = v$. Let $\rho_1, \rho_2, \ldots, \rho_k$ be all the distinct real roots of the polynomial $p'(x)$, $k \in \mathbb{N}$, $k \ge 2$, $\rho_i \ne \rho_j$ for all $i, j \in \{1, 2, \ldots, k\}$, $i \ne j$. Then we can find, with an algorithm, all the real roots of $p(x)$, with their multiplicities.

Proof. Let $L = \{\rho_1, \rho_2, \ldots, \rho_k\}$ be the set of all real roots of $p'(x)$. We suppose also, without loss of generality, that $\rho_1 < \rho_2 < \cdots < \rho_k$.

Let $i \in \{1, \ldots, k-1\}$. Then $p'(x) > 0$ for every $x \in (\rho_i, \rho_{i+1})$ or $p'(x) < 0$ for every $x \in (\rho_i, \rho_{i+1})$, because $p'$ is continuous and has no root between two consecutive roots. This gives that $p$ is a strictly increasing or strictly decreasing function on $[\rho_i, \rho_{i+1}]$. If $p(\rho_i) = 0$, then $\rho_i$ is the unique root of $p$ in $[\rho_i, \rho_{i+1}]$. The same holds if $p(\rho_{i+1}) = 0$, that is, $\rho_{i+1}$ is then the unique root of $p$ in $[\rho_i, \rho_{i+1}]$. Of course $p$ cannot have both $\rho_i$ and $\rho_{i+1}$ as roots simultaneously, by its monotonicity.

We suppose now that $p(\rho_i) \cdot p(\rho_{i+1}) \ne 0$. Then, if $p(\rho_i) \cdot p(\rho_{i+1}) > 0$, the polynomial $p$ does not have any root in $[\rho_i, \rho_{i+1}]$. If $p(\rho_i) \cdot p(\rho_{i+1}) < 0$, then $p$ has exactly one root in the interval $[\rho_i, \rho_{i+1}]$, and more specifically this root belongs to $(\rho_i, \rho_{i+1})$.
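The sign test above is exactly the hypothesis of the bisection method. As an illustration only, the following Python sketch implements bisection and the scan over the intervals between consecutive roots of $p'$ described so far. The function names (`bisect`, `horner`, `roots_between_critical_points`), the tolerance and the example polynomial are assumptions made for this sketch and are not part of the paper. Exact comparisons with zero are used for brevity; a floating-point implementation would normally use a small tolerance instead.

```python
import math
from typing import Callable, List, Sequence


def bisect(f: Callable[[float], float], a: float, b: float,
           tol: float = 1e-12, max_iter: int = 200) -> float:
    """Approximate a root of a continuous f on [a, b], assuming f(a) * f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("bisection needs f(a) and f(b) of opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fm == 0.0 or (b - a) / 2 < tol:
            return mid
        # Keep the half-interval on which the sign change survives.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


def horner(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a_v x^v + ... + a_0; coefficients are listed from a_v down to a_0."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def roots_between_critical_points(coeffs: Sequence[float],
                                  crit: Sequence[float]) -> List[float]:
    """Real roots of p in [crit[0], crit[-1]].

    `crit` must hold the distinct real roots of p' in increasing order,
    so that p is strictly monotone between consecutive entries.
    """
    p = lambda x: horner(coeffs, x)
    roots: List[float] = []
    for left, right in zip(crit, crit[1:]):
        p_left, p_right = p(left), p(right)
        if p_left == 0.0:
            roots.append(left)  # the left endpoint is the unique root here
        elif p_right != 0.0 and p_left * p_right < 0:
            roots.append(bisect(p, left, right))
        # p_right == 0 is picked up as the left endpoint of the next interval.
    if p(crit[-1]) == 0.0:
        roots.append(crit[-1])
    return roots


# Example (assumed for illustration): p(x) = x^4 - 5x^2 + 4, p'(x) = 2x(2x^2 - 5).
crit = [-math.sqrt(2.5), 0.0, math.sqrt(2.5)]
inner = roots_between_critical_points([1, 0, -5, 0, 4], crit)
```

For this example polynomial the two values in `inner` approximate the roots $-1$ and $1$; the remaining roots $\pm 2$ lie outside the interval spanned by the roots of $p'$ and belong to the two unbounded intervals.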
Applying the bisection method we find this root, because the hypotheses of the bisection method are now satisfied. We do the same in every interval $[\rho_i, \rho_{i+1}]$. So we find all the roots of $p$ in the interval $[\rho_1, \rho_k]$.

We examine the roots in $[\rho_k, +\infty)$. Because $\alpha_v \ne 0$, we have two cases:

i) If $\alpha_v > 0$, then $\lim_{x \to +\infty} p(x) = +\infty$. Then $p$ is a strictly increasing function on $[\rho_k, +\infty)$.
a) If $p(\rho_k) = 0$, then $\rho_k$ is the unique root of $p$ in $[\rho_k, +\infty)$.
b) If $p(\rho_k) > 0$, then $p$ does not have any root in $[\rho_k, +\infty)$.
c) If $p(\rho_k) < 0$, then $p$ has exactly one root (say $\rho_{k+1}$) in $[\rho_k, +\infty)$, and more specifically $\rho_{k+1} \in (\rho_k, +\infty)$. Because $\lim_{x \to +\infty} p(x) = +\infty$, there exists some $x_0 \in \mathbb{R}$, $x_0 > \rho_k$, so that $p(x_0) > 0$. Then $p(\rho_k) \cdot p(x_0) < 0$ and $\rho_{k+1} \in (\rho_k, x_0)$. Applying the bisection method in $[\rho_k, x_0]$, we approximate the root $\rho_{k+1}$. Later, we will see how we compute a number like $x_0$ in order to apply the bisection method.

ii) If $\alpha_v < 0$, then $\lim_{x \to +\infty} p(x) = -\infty$. The polynomial $p$ is a strictly decreasing function on $[\rho_k, +\infty)$.
a) If $p(\rho_k) = 0$, then $\rho_k$ is the unique root of $p$ in $[\rho_k, +\infty)$.
b) If $p(\rho_k) < 0$, then $p$ does not have any root in $[\rho_k, +\infty)$.
c) If $p(\rho_k) > 0$, then $p$ has a unique root in $[\rho_k, +\infty)$ (say $\rho_{k+1}$), and more specifically $\rho_{k+1} \in (\rho_k, +\infty)$. Because $\lim_{x \to +\infty} p(x) = -\infty$, there exists some $x_0 \in (\rho_k, +\infty)$ so that $p(x_0) < 0$. Then $p(\rho_k) \cdot p(x_0) < 0$ and $\rho_{k+1} \in (\rho_k, x_0)$, and applying the bisection method we approximate the unique root $\rho_{k+1}$ in $(\rho_k, x_0)$.

Now we examine the roots in $(-\infty, \rho_1]$. If $p(\rho_1) = 0$, then $\rho_1$ is the unique root of $p$ in $(-\infty, \rho_1]$. Now we suppose that $p(\rho_1) \ne 0$. We examine two cases:

i) $\lim_{x \to -\infty} p(x) = +\infty$. This happens when $v$ is even and $\alpha_v > 0$, or $v$ is odd and $\alpha_v < 0$. Then $p$ is a strictly decreasing function on $(-\infty, \rho_1]$.
i), 1) If $p(\rho_1) > 0$, then $p$ does not have any root in $(-\infty, \rho_1]$.
i), 2) If $p(\rho_1) < 0$, then $p$ has a unique root in $(-\infty, \rho_1]$ (say $\rho_{k+2}$), and more specifically $\rho_{k+2} \in (-\infty, \rho_1)$. Because $\lim_{x \to -\infty} p(x) = +\infty$, there exists some $x_0 < \rho_1$ so that $p(x_0) > 0$. Then $\rho_{k+2} \in (x_0, \rho_1)$, and applying the bisection method in $[x_0, \rho_1]$ we approximate the root $\rho_{k+2}$.

ii) $\lim_{x \to -\infty} p(x) = -\infty$. This happens when $v$ is even and $\alpha_v < 0$, or $v$ is odd and $\alpha_v > 0$. Then $p$ is a strictly increasing function on $(-\infty, \rho_1]$. We have two cases:
ii), 1) $p(\rho_1) < 0$. Then $p$ does not have any root in $(-\infty, \rho_1]$.
ii), 2) $p(\rho_1) > 0$. Then $p$ has a unique root in $(-\infty, \rho_1]$ (say $\rho_{k+2}$), and more specifically $\rho_{k+2} \in (-\infty, \rho_1)$. Because $\lim_{x \to -\infty} p(x) = -\infty$, there exists some $x_0 < \rho_1$ such that $p(x_0) < 0$. Then $p(x_0) \cdot p(\rho_1) < 0$ and $\rho_{k+2} \in (x_0, \rho_1)$. Applying the bisection method in $[x_0, \rho_1]$ we approximate the root $\rho_{k+2}$.

All the implications used in this lemma are easy to prove and are left as an exercise for the interested reader. The proofs are at secondary school level.

Corollary 3.9.
Basic Lemma 3.8 holds again in the case when the polynomial $p'$ has only one real root.

Proof. The proof is similar to that of the Basic Lemma, applied to the intervals $(-\infty, \rho]$ and $[\rho, +\infty)$, where $p'(\rho) = 0$. ∎

Corollary 3.10.
Let $v \in \mathbb{N}$, $v \ge 3$, and let $p(x) = \alpha_v x^v + \alpha_{v-1} x^{v-1} + \cdots + \alpha_1 x + \alpha_0$ be a polynomial $p(x) \in \mathbb{R}[x]$ with degree $\deg p(x) = v$. We suppose that $p'$ does not have any real root. Then $p$ has a unique real root and we can construct an algorithm in order to find it.

Proof. Since $p'$ has no real root, $p'$ is a polynomial of even degree $\deg p' = v - 1$, so $p$ is a polynomial of odd degree. Thus $p$ has at least one real root. Because $p'$ does not have any root, we have $p'(x) \ne 0$ for every $x \in \mathbb{R}$. Thus $p'(x) > 0$ for every $x \in \mathbb{R}$, or $p'(x) < 0$ for every $x \in \mathbb{R}$. Otherwise there exist $\alpha, \beta \in \mathbb{R}$ so that $p'(\alpha) < 0$ and $p'(\beta) > 0$ (of course $\alpha \ne \beta$); then, because $p'$ is a continuous function (as a polynomial) and $p'(\alpha) \cdot p'(\beta) < 0$, there exists $\gamma \in (\alpha, \beta)$ (if $\alpha < \beta$) or $\gamma \in (\beta, \alpha)$ (if $\beta < \alpha$) so that $p'(\gamma) = 0$, which is a contradiction because $p'(x) \ne 0$ for every $x \in \mathbb{R}$.

Thus $p$ is a strictly increasing function on $\mathbb{R}$ if $p'(x) > 0$ for every $x \in \mathbb{R}$, or else $p$ is a strictly decreasing function on $\mathbb{R}$ if $p'(x) < 0$ for every $x \in \mathbb{R}$. If $p$ is strictly increasing, then $\lim_{x \to +\infty} p(x) = +\infty$ and $\lim_{x \to -\infty} p(x) = -\infty$; if $p$ is strictly decreasing, then $\lim_{x \to +\infty} p(x) = -\infty$ and $\lim_{x \to -\infty} p(x) = +\infty$. The polynomial $p$ is strictly increasing if $\alpha_v > 0$, and strictly decreasing if $\alpha_v < 0$.

If $\alpha_v > 0$, then because $\lim_{x \to +\infty} p(x) = +\infty$ there exists $y \in \mathbb{R}$ so that $p(y) > 0$, and because $\lim_{x \to -\infty} p(x) = -\infty$ there exists $x \in \mathbb{R}$, $x < y$, so that $p(x) < 0$. Thus $p(x) \cdot p(y) < 0$, and $p$ has a unique root in $\mathbb{R}$ (say $\rho$), with $\rho \in (x, y)$.

If $\alpha_v < 0$, then because $\lim_{x \to +\infty} p(x) = -\infty$ there exists $y \in \mathbb{R}$ so that $p(y) < 0$, and because $\lim_{x \to -\infty} p(x) = +\infty$ there exists $x \in \mathbb{R}$, $x < y$, so that $p(x) > 0$. Thus $p(x) \cdot p(y) < 0$, and $p$ has a unique root in $\mathbb{R}$ (say $\rho$), with $\rho \in (x, y)$.

We will see later how we compute numbers $x$, $y$ as above. In each of the cases above we apply the bisection method in the interval $[x, y]$ to find the unique real root of $p$. ∎

Remark 3.11
The multiplicity of a root is found with the algebraic algorithm 3.6. However, we can find the multiplicity of a root in an analytic way. More specifically:

Let $p(x) \in \mathbb{C}[x]$ be a polynomial and $\rho$ be a root of $p$, where $\deg p(x) = v \in \mathbb{N}$. Then there exists a unique number $k \in \mathbb{N} \cup \{0\}$, $k \le v - 1$, so that
$$p(\rho) = 0,\ p'(\rho) = 0,\ \ldots,\ p^{(k)}(\rho) = 0 \quad \text{and} \quad p^{(k+1)}(\rho) \ne 0,$$
that is, $p^{(i)}(\rho) = 0$ for all $i = 0, 1, \ldots, k$ and $p^{(k+1)}(\rho) \ne 0$, where $p^{(0)}(\rho) = p(\rho)$. The natural number $k + 1$ is the multiplicity of the root $\rho$ of $p$. (Of course we always have $p^{(v)}(\rho) \ne 0$.)

This is a classical result in calculus that is proven easily.

Now we cover the gap left in Basic Lemma 3.8 by computing a number like $x_0$ in this lemma.

Remark 3.12.
Let $p(x) \in \mathbb{R}[x]$ be a real polynomial, $p(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_{v-1} x^{v-1} + \alpha_v x^v$, $v = \deg p(x)$, $v \ge 2$. We suppose that $\alpha_v > 0$ and that $p'(x)$ has real roots.

Proof. Let $\rho$ be the greatest real root of $p'(x)$. We suppose that $p(\rho) < 0$. We consider an arbitrary real number $x_0$ so that $x_0 > \rho$, $x_0 > 1$ and
$$x_0 > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v}.$$
We prove that $p(x_0) > 0$.

We have $-|y| \le y$ for every $y \in \mathbb{R}$ (1). We apply (1) for $y = \frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x}$, for some $x \in \mathbb{R} \setminus \{0\}$, and we have:
$$-\left|\frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x}\right| \le \frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x}. \tag{2}$$
Adding the number $\alpha_v$ to both sides of (2) we get:
$$\alpha_v - \left|\frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x}\right| \le \alpha_v + \frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x}. \tag{3}$$
By the triangle inequality we get, for $x > 0$:
$$\left|\frac{\alpha_0}{x^v} + \cdots + \frac{\alpha_{v-1}}{x}\right| \le \frac{|\alpha_0|}{x^v} + \frac{|\alpha_1|}{x^{v-1}} + \cdots + \frac{|\alpha_{v-1}|}{x} \iff -\left(\frac{|\alpha_0|}{x^v} + \frac{|\alpha_1|}{x^{v-1}} + \cdots + \frac{|\alpha_{v-1}|}{x}\right) \le -\left|\frac{\alpha_0}{x^v} + \cdots + \frac{\alpha_{v-1}}{x}\right|. \tag{4}$$
Adding the number $\alpha_v$ to both sides of (4) we get, for $x > 0$:
$$\alpha_v - \left(\frac{|\alpha_0|}{x^v} + \frac{|\alpha_1|}{x^{v-1}} + \cdots + \frac{|\alpha_{v-1}|}{x}\right) \le \alpha_v - \left|\frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x}\right|. \tag{5}$$
Let $x > 1$. Then we have $x^2 \ge x,\ x^3 \ge x,\ \ldots,\ x^v \ge x$, so $\frac{1}{x^2} \le \frac{1}{x},\ \ldots,\ \frac{1}{x^v} \le \frac{1}{x}$, and therefore
$$\frac{|\alpha_{v-1}|}{x} \le \frac{|\alpha_{v-1}|}{x},\quad \frac{|\alpha_{v-2}|}{x^2} \le \frac{|\alpha_{v-2}|}{x},\quad \ldots,\quad \frac{|\alpha_1|}{x^{v-1}} \le \frac{|\alpha_1|}{x},\quad \frac{|\alpha_0|}{x^v} \le \frac{|\alpha_0|}{x}.$$
Adding the above inequalities we get:
$$\frac{|\alpha_{v-1}|}{x} + \cdots + \frac{|\alpha_1|}{x^{v-1}} + \frac{|\alpha_0|}{x^v} \le \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x},$$
hence
$$\alpha_v - \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x} \le \alpha_v - \left(\frac{|\alpha_{v-1}|}{x} + \cdots + \frac{|\alpha_1|}{x^{v-1}} + \frac{|\alpha_0|}{x^v}\right). \tag{6}$$
Through inequalities (3), (5) and (6) we get, for $x > 1$:
$$\alpha_v - \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x} \le \alpha_v + \frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x}. \tag{7}$$
Now, for every $x > 1$ with $x > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v}$ we have
$$\alpha_v > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x} \implies \alpha_v - \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x} > 0. \tag{8}$$
Through (7) and (8) we conclude that for every $x > 1$ with $x > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v}$ we get
$$\alpha_v + \frac{\alpha_{v-1}}{x} + \cdots + \frac{\alpha_1}{x^{v-1}} + \frac{\alpha_0}{x^v} > 0. \tag{$\ast$}$$
This gives
$$x^v \cdot \left(\alpha_v + \frac{\alpha_{v-1}}{x} + \cdots + \frac{\alpha_1}{x^{v-1}} + \frac{\alpha_0}{x^v}\right) > 0 \iff p(x) > 0 \quad \text{(by the definition of } p(x)\text{)}. \tag{9}$$
We apply (9) for the number $x_0$ chosen above and we conclude that $p(x_0) > 0$, so $p(\rho) \cdot p(x_0) < 0$. This means that the unique real root of $p(x)$ in $[\rho, +\infty)$ belongs to $(\rho, x_0)$. Applying the bisection method in the interval $[\rho, x_0]$, we compute the unique real root $x^*$ of $p(x)$ in $[\rho, +\infty)$, that is, $x^* \in (\rho, x_0)$. Of course, if $p(\rho) > 0$, the polynomial $p$ does not have any real root in $[\rho, +\infty)$, as we have seen in Basic Lemma 3.8, and if $p(\rho) = 0$, then $\rho$ is the unique real root of $p$ in $[\rho, +\infty)$.

Now we suppose that $\alpha_v < 0$ and that $p'(x)$ has real roots. Let $\rho$ be the greatest real root of $p'(x)$. We suppose that $p(\rho) > 0$. We consider an arbitrary real number $x_0$ so that $x_0 > \rho$, $x_0 > 1$ and
$$x_0 > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{-\alpha_v} = \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{|\alpha_v|}. \tag{10}$$
By (10) we get (because $|\alpha_v| = -\alpha_v > 0$):
$$|\alpha_v| > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x_0} \implies \alpha_v + \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x_0} < 0. \tag{11}$$
Let $x > 1$. Then $x^2 \ge x,\ x^3 \ge x,\ \ldots,\ x^v \ge x$, so $\frac{1}{x^v} \le \frac{1}{x},\ \frac{1}{x^{v-1}} \le \frac{1}{x},\ \ldots,\ \frac{1}{x^2} \le \frac{1}{x}$, and therefore
$$\frac{|\alpha_0|}{x^v} \le \frac{|\alpha_0|}{x},\quad \frac{|\alpha_1|}{x^{v-1}} \le \frac{|\alpha_1|}{x},\quad \ldots,\quad \frac{|\alpha_{v-1}|}{x} \le \frac{|\alpha_{v-1}|}{x}.$$
Adding the previous inequalities we get:
$$\alpha_v + \frac{|\alpha_0|}{x^v} + \frac{|\alpha_1|}{x^{v-1}} + \cdots + \frac{|\alpha_{v-1}|}{x} \le \alpha_v + \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{x}. \tag{12}$$
Of course we have $\alpha_0 \le |\alpha_0|,\ \alpha_1 \le |\alpha_1|,\ \ldots,\ \alpha_{v-1} \le |\alpha_{v-1}|$, and since $x > 0$ this gives $\frac{\alpha_0}{x^v} \le \frac{|\alpha_0|}{x^v},\ \frac{\alpha_1}{x^{v-1}} \le \frac{|\alpha_1|}{x^{v-1}},\ \ldots,\ \frac{\alpha_{v-1}}{x} \le \frac{|\alpha_{v-1}|}{x}$. Adding these inequalities we get:
$$\alpha_v + \frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x} \le \alpha_v + \frac{|\alpha_0|}{x^v} + \frac{|\alpha_1|}{x^{v-1}} + \cdots + \frac{|\alpha_{v-1}|}{x}. \tag{13}$$
Now, for $x_0 > 1$, $x_0 > \rho$ and $x_0 > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{|\alpha_v|}$, the inequalities (11), (12) and (13) hold simultaneously for $x = x_0$, thus we get
$$\alpha_v + \frac{\alpha_{v-1}}{x_0} + \cdots + \frac{\alpha_1}{x_0^{v-1}} + \frac{\alpha_0}{x_0^v} < 0 \implies x_0^v \left(\alpha_v + \frac{\alpha_{v-1}}{x_0} + \cdots + \frac{\alpha_1}{x_0^{v-1}} + \frac{\alpha_0}{x_0^v}\right) < 0 \implies p(x_0) < 0$$
(by the definition of $p$). So we get $p(\rho) \cdot p(x_0) < 0$, and $p$ has exactly one root in $(\rho, x_0)$ (say $x^*$), where $x^*$ is the unique real root of $p$ in $[\rho, +\infty)$. We apply the bisection method in the interval $[\rho, x_0]$ and we compute the unique real root $x^* \in (\rho, x_0)$ of $p$ in $[\rho, +\infty)$. Of course, as we have seen in Basic Lemma 3.8, if $p(\rho) < 0$ then $p$ does not have any root in $[\rho, +\infty)$, and if $p(\rho) = 0$ the number $\rho$ is the unique real root of $p$ in $[\rho, +\infty)$.

So far we have seen how we compute the unique real root of $p$ (if any) in $[\rho, +\infty)$, when $\rho$ is the greatest real root of $p'$. In a similar way we compute the unique real root of $p$ in $(-\infty, \rho^*]$ (if any), where $\rho^*$ is the smallest real root of $p'$. It suffices to observe the following. We simply write $p(x) = p(-(-x))$ and we easily find a polynomial $q \in \mathbb{R}[x]$ such that
$$p(x) = q(-x) \tag{14}$$
(it is trivial to find such a polynomial $q$). Let $r$ be an arbitrary real root of $p$. Then $p(r) = q(-r) = 0$, so $-r$ is a real root of $q$. By (14) we also get that $p'(x) = -q'(-x)$, which gives that if $\alpha$ is a real root of $p'$, then $-\alpha$ is a real root of $q'$. Thus, because by supposition $p'$ has real roots, the same holds for $q'$, and $-\rho^*$ is the greatest real root of $q'$. Then we apply the previous procedure to $q$: if $q$ has a real root in $[-\rho^*, +\infty)$, we compute it (say $-\rho$), which means that $\rho$ is the unique real root of $p$ in $(-\infty, \rho^*]$. Thus we also compute the real root of $p$ in $(-\infty, \rho^*]$ in any case.

Now we consider a polynomial $p \in \mathbb{R}[x]$ with $\deg p(x) = v \in \mathbb{N}$, $v \ge 3$, such that the polynomial $p'$ does not have any real root. Because $p'$ does not have any root and $\deg p'(x) \ge 2$ (as $\deg p(x) \ge 3$), $p'$ is a polynomial of even degree (because any polynomial of odd degree has at least one real root). This means that $p$ is a polynomial of odd degree, which has at least one real root. Because $p'(x) \ne 0$ for every $x \in \mathbb{R}$, we conclude that $p'(x) > 0$ for every $x \in \mathbb{R}$ or $p'(x) < 0$ for every $x \in \mathbb{R}$. If $\alpha_v > 0$, then $p'(x) > 0$ for every $x \in \mathbb{R}$ and $p$ is a strictly increasing function on $\mathbb{R}$, with $\lim_{x \to +\infty} p(x) = +\infty$ and $\lim_{x \to -\infty} p(x) = -\infty$. Thus there exist $x < 0$ with $p(x) < 0$ and $y > 0$ with $p(y) > 0$, which gives that $p$ has a real root in $(x, y)$, say $\rho$. Because $p$ is a strictly monotone function, the root $\rho$ is the unique real root of $p$.

We can now compute some numbers $x$, $y$ with the above properties. By inequality ($\ast$), if $x \in \mathbb{R}$ with $x > 1$ and $x > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v}$, then $\alpha_v + \frac{\alpha_{v-1}}{x} + \cdots + \frac{\alpha_1}{x^{v-1}} + \frac{\alpha_0}{x^v} > 0$. We choose some $y > 1$ so that $y > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v}$; then by inequality ($\ast$) we get
$$\alpha_v + \frac{\alpha_{v-1}}{y} + \cdots + \frac{\alpha_1}{y^{v-1}} + \frac{\alpha_0}{y^v} > 0 \implies y^v \left(\alpha_v + \frac{\alpha_{v-1}}{y} + \cdots + \frac{\alpha_1}{y^{v-1}} + \frac{\alpha_0}{y^v}\right) > 0 \iff p(y) > 0. \tag{15}$$
Now let $x < -1$. Then we have ($x \ne 0$):
$$\frac{\alpha_0}{x^v} \ge -\left|\frac{\alpha_0}{x^v}\right|,\quad \frac{\alpha_1}{x^{v-1}} \ge -\left|\frac{\alpha_1}{x^{v-1}}\right|,\quad \ldots,\quad \frac{\alpha_{v-1}}{x} \ge -\left|\frac{\alpha_{v-1}}{x}\right|.$$
Adding these inequalities we get:
$$\frac{\alpha_0}{x^v} + \frac{\alpha_1}{x^{v-1}} + \cdots + \frac{\alpha_{v-1}}{x} \ge -\left(\left|\frac{\alpha_0}{x^v}\right| + \left|\frac{\alpha_1}{x^{v-1}}\right| + \cdots + \left|\frac{\alpha_{v-1}}{x}\right|\right). \tag{16}$$
We also get $|x| > 1 \implies |x^2| \ge |x|,\ |x^3| > |x|,\ \ldots,\ |x^v| > |x|$, so $\frac{1}{|x^2|} \le \frac{1}{|x|},\ \ldots,\ \frac{1}{|x^v|} < \frac{1}{|x|}$, and therefore
$$\left|\frac{\alpha_{v-1}}{x}\right| \le \frac{|\alpha_{v-1}|}{|x|},\quad \ldots,\quad \left|\frac{\alpha_1}{x^{v-1}}\right| \le \frac{|\alpha_1|}{|x|},\quad \left|\frac{\alpha_0}{x^v}\right| \le \frac{|\alpha_0|}{|x|}.$$
Multiplying by $-1$ and adding the resulting inequalities we get:
$$-\left(\left|\frac{\alpha_0}{x^v}\right| + \left|\frac{\alpha_1}{x^{v-1}}\right| + \cdots + \left|\frac{\alpha_{v-1}}{x}\right|\right) \ge -\frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{|x|}. \tag{17}$$
By (16) and (17) we get
$$\alpha_v + \frac{\alpha_{v-1}}{x} + \cdots + \frac{\alpha_1}{x^{v-1}} + \frac{\alpha_0}{x^v} \ge \alpha_v - \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{|x|} \tag{18}$$
for every $x \in \mathbb{R}$, $x < -1$. We now take $x \in \mathbb{R}$ so that $x < -1$ and
$$x < -\frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v}. \tag{19}$$
Then by (19), and because $\alpha_v > 0$, we get:
$$-x > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v} \implies |x| > \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v} \implies \alpha_v - \frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{|x|} > 0. \tag{20}$$
So, for $x < -1$ and $x < -\frac{|\alpha_0| + |\alpha_1| + \cdots + |\alpha_{v-1}|}{\alpha_v}$, we get from (18) and (20) that
$$\alpha_v + \frac{\alpha_{v-1}}{x} + \cdots + \frac{\alpha_1}{x^{v-1}} + \frac{\alpha_0}{x^v} > 0,$$
and, because $x < 0$ and $v$ is odd, $x^v < 0$, so
$$x^v \cdot \left(\alpha_v + \frac{\alpha_{v-1}}{x} + \cdots + \frac{\alpha_1}{x^{v-1}} + \frac{\alpha_0}{x^v}\right) < 0 \iff p(x) < 0.$$
Thus, for such an $x$ and for $y$ as in (15), we get $p(x) < 0 < p(y)$, so $p(x) \cdot p(y) < 0$, and applying the bisection method in $[x, y]$ we compute the unique real root $\rho$ of $p$, with $\rho \in (x, y)$.

Finally, in the case $\alpha_v < 0$, we consider the polynomial $-p(x)$. Then $(-p)'(x) = -p'(x) \ne 0$ for every $x \in \mathbb{R}$, and the leading coefficient of $-p$ is now positive. We apply the previous argument to $-p$ and we compute the unique real root $\rho$ of $-p$, which is also the unique real root of $p$.

So in this remark we have covered the gap left in Basic Lemma 3.8 and Corollaries 3.9 and 3.10, and we have computed specific numbers $x$, $y$. For the sequel we also need some tools concerning real polynomials of two real variables. Before this, let us give some specific examples for the roots of polynomials.

As is well known from elementary calculus, any polynomial of odd degree has at least one real root. On the other hand, there are many polynomials of any even degree that do not have any real root. For example, let $p(x)$ be any non-constant polynomial, let $k \in \mathbb{N}$ and $\theta > 0$. Then the polynomial $q(x) = p(x)^{2k} + \theta$ does not have any real root, as we can easily see, and has degree $2k \cdot v$, where $v = \deg p(x)$.

Of course, for every finite set of real numbers $A = \{\rho_1, \rho_2, \ldots, \rho_v\}$, $v \in \mathbb{N}$, $\rho_i \ne \rho_j$ for $i, j \in \{1, 2, \ldots, v\}$, $i \ne j$, the polynomial $p(x) = (x - \rho_1)(x - \rho_2) \cdots (x - \rho_v)$ has as roots the numbers $\rho_i$, $i = 1, \ldots, v$, and the polynomial $q(x) = ((x - \rho_1)(x - \rho_2) \cdots (x - \rho_v))^{2k} = p(x)^{2k}$ is a polynomial of even degree $\deg q(x) = 2kv$ with the same roots $\rho_i$, $i = 1, 2, \ldots, v$.

Now we consider polynomials of two real variables with real coefficients, that is, we consider the set
$$\mathbb{R}[x, y] = \{p(x, y) : p(x, y) \text{ is a polynomial of two real variables } x \text{ and } y \text{ with coefficients in } \mathbb{R}\}.$$
Let $p(x, y) \in \mathbb{R}[x, y]$. We say that the polynomial $p(x, y)$ is a pure polynomial when $\deg_x p(x, y) \ge 1$ and $\deg_y p(x, y) \ge 1$, where $\deg_x p(x, y)$ and $\deg_y p(x, y)$ are the greatest degrees of its monomials with respect to $x$ and $y$ respectively. The set of roots of $p(x, y)$ is the set
$$L_{p(x,y)} = \{(x, y) \in \mathbb{R}^2 \mid p(x, y) = 0\}.$$
As for polynomials of one real variable, we can easily see that there are many pure polynomials $p(x, y) \in \mathbb{R}[x, y]$ that do not have any roots.
For example, let $p(x, y)$ be any pure polynomial. Then the polynomial $q(x, y) = p(x, y)^{2k} + \theta$, where $k \in \mathbb{N}$, $\theta > 0$, is a pure polynomial that does not have any real root, as we easily see. Of course these polynomials are of even degree with respect to $x$ and $y$.

On the other hand, let $A = \{\alpha_1, \alpha_2, \ldots, \alpha_v\}$, $B = \{\beta_1, \beta_2, \ldots, \beta_m\}$, $A \cup B \subseteq \mathbb{R}$, $v, m \in \mathbb{N}$, $\alpha_i \ne \alpha_j$ for every $i, j \in \{1, \ldots, v\}$, $i \ne j$, and $\beta_i \ne \beta_j$ for every $i, j \in \{1, 2, \ldots, m\}$, $i \ne j$. Let also $k \in \mathbb{N}$. We consider the pure polynomial
$$p(x, y) = \bigl((x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_v)(y - \beta_1)(y - \beta_2) \cdots (y - \beta_m)\bigr)^{2k}.$$
Then $p$ is a pure polynomial of even degree with respect to $x$ and $y$ such that
$$L = \{(\alpha_i, \beta_j) : i \in \{1, \ldots, v\},\ j \in \{1, 2, \ldots, m\}\} \subsetneq L_{p(x,y)}.$$
We remark also that for every $y \in \mathbb{R}$ the pair $(\alpha_i, y)$ is a root of $p(x, y)$. This fact differentiates pure polynomials $p(x, y)$ from polynomials of one variable. That is, there exist uncountably many pure polynomials, each one having an uncountable set of real roots. In particular, this holds for pure polynomials of odd degree with respect to $x$ or $y$. We have the following proposition.

Proposition 3.13.
Let $p(x, y)$ be a pure polynomial such that $\deg_x p(x, y) = v$ is odd or $\deg_y p(x, y)$ is odd. Then for every $r \in \mathbb{R}$ the set $L_r = \{(x, y) \in \mathbb{R}^2 : p(x, y) = r\}$ is uncountable.

Proof. We suppose, without loss of generality, that the number $v = \deg_x p(x, y)$ is odd. Then, as we can easily see, we can write the polynomial $p(x, y)$ as follows:
$$p(x, y) = \alpha_v(y) x^v + \alpha_{v-1}(y) x^{v-1} + \cdots + \alpha_1(y) x + \alpha_0(y),$$
where $\alpha_i(y) \in \mathbb{R}[y]$ for every $i = 0, 1, \ldots, v$, and $\alpha_v(y) \not\equiv 0$ because $v = \deg_x p(x, y)$. Because $\alpha_v(y) \not\equiv 0$, the polynomial $\alpha_v(y)$ has a finite set of roots. Let $A_v$ be the set of roots of $\alpha_v(y)$, that is, $A_v = \{y \in \mathbb{R} \mid \alpha_v(y) = 0\}$.

Let $y_0 \in \mathbb{R} \setminus A_v$. Then $\alpha_v(y_0) \ne 0$. Let $r \in \mathbb{R}$. We consider the polynomial
$$p_r(x) = \alpha_v(y_0) x^v + \alpha_{v-1}(y_0) x^{v-1} + \cdots + \alpha_1(y_0) x + \alpha_0(y_0) - r.$$
Then $p_r(x)$ is a polynomial of odd degree $\deg p_r(x) = v$, thus $p_r(x)$ has at least one real root, say $x_0$, that is, we have:
$$p_r(x_0) = 0 \implies \alpha_v(y_0) x_0^v + \alpha_{v-1}(y_0) x_0^{v-1} + \cdots + \alpha_1(y_0) x_0 + \alpha_0(y_0) = r \iff p(x_0, y_0) = r \implies (x_0, y_0) \in L_r.$$
That is, we proved that for every $y \in \mathbb{R} \setminus A_v$ there exists some $x \in \mathbb{R}$ such that $(x, y) \in L_r$. Of course, if $y_1, y_2 \in \mathbb{R} \setminus A_v$, $y_1 \ne y_2$ and $(x_1, y_1), (x_2, y_2) \in L_r$, we have $(x_1, y_1) \ne (x_2, y_2)$. Since $\mathbb{R} \setminus A_v$ is uncountable, the set $L_r$ is uncountable, and the proof of this proposition is complete. ∎

Corollary 3.14.
Let $p(x, y)$ be a pure polynomial such that $\deg_x p(x, y)$ or $\deg_y p(x, y)$ is odd. Then the set of real roots of $p(x, y)$ is uncountable.

Proof. It is a simple application of the previous Proposition 3.13 for $r = 0$. ∎

As we have noticed previously, there are also pure polynomials for which both numbers $\deg_x p(x, y)$ and $\deg_y p(x, y)$ are even and whose set of real roots is uncountable, as well as pure polynomials that do not have any roots. Here the natural question arises: are there pure polynomials $p(x, y)$ whose set of real roots is non-empty and finite? There are, and we give a simple example. We consider the polynomial
$$p(x, y) = (x^2 - 1)^2 + (y^2 - 1)^2.$$
It is easy to see that $L_{p(x,y)} = \{(1, 1), (1, -1), (-1, 1), (-1, -1)\}$.

More generally, let $p_1(x)$ be a polynomial with real roots $\alpha_1, \alpha_2, \ldots, \alpha_v$, and $p_2(y)$ be a polynomial with real roots $\beta_1, \beta_2, \ldots, \beta_m$. We consider the pure polynomial $p(x, y) = p_1(x)^{2k_1} + p_2(y)^{2k_2}$, where $k_1, k_2 \in \mathbb{N}$. Then it is easy to see that
$$L_{p(x,y)} = \{(\alpha_i, \beta_j) : i \in \{1, \ldots, v\},\ j \in \{1, \ldots, m\}\}.$$
Of course, by Corollary 3.14, only pure polynomials for which both numbers $\deg_x p(x, y)$, $\deg_y p(x, y)$ are even can have a finite set of roots, as in the previous examples. From the previous results we also have a significant observation: these polynomials have the number zero as a global minimum!

For the sequel, we have to concentrate our attention on pure polynomials $p(x, y)$ that have a finite set of roots. So, from the previous observation, we are led to ask whether the converse holds. That is, does any pure polynomial that has a global minimum have a finite set of real roots? The answer is no, and we can give a simple example. We consider a pure polynomial of the form $p(x, y) = ((x - a)(y - b))^2 - 7$, where $a, b$ are fixed real numbers. It is easy to check that $p(x, y)$ has the number $-7$ as its global minimum, attained on the lines $x = a$ and $y = b$. Fix any $y_0 \ne b$ and set $x_1 = a + \frac{3}{|y_0 - b|}$. Then $p(a, y_0) = -7 < 0$ and $p(x_1, y_0) = 9 - 7 = 2 > 0$, thus, by continuity, there exists $x_0 \in (a, x_1)$ so that $p(x_0, y_0) = 0$. So for every $y \ne b$ there exists $x \in \mathbb{R}$ so that $p(x, y) = 0$, and of course if $y_1 \ne y_2$, $x_1, x_2 \in \mathbb{R}$ and $p(x_1, y_1) = p(x_2, y_2) = 0$, we have $(x_1, y_1) \ne (x_2, y_2)$, so the set of real roots of $p$ is uncountable even though the polynomial $p(x, y)$ has a global minimum. However, having a global minimum (or maximum) is a crucial property shared by all pure polynomials that have a finite, non-empty set of roots, as we will prove now with the following proposition.
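Before the proposition is stated, the last example can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes a polynomial of the form $((x - A)(y - B))^2 - 7$, where the shift constants `A` and `B` are hypothetical placeholders chosen only for this sketch, and the helper name `root_on_horizontal_line` is likewise not from the paper. It uses the explicit root $x = A + \sqrt{7}/|y - B|$ on each horizontal line $y \ne B$ to exhibit many distinct roots, while the value $-7$ is attained on the line $x = A$.

```python
import math

# Hypothetical constants, chosen only for illustration.
A, B, C = 1.0, 2.0, 7.0


def p(x: float, y: float) -> float:
    """Pure polynomial ((x - A)(y - B))^2 - C, bounded below by -C."""
    return ((x - A) * (y - B)) ** 2 - C


def root_on_horizontal_line(y: float) -> float:
    """Return an x > A with p(x, y) = 0, valid for every y != B."""
    if y == B:
        raise ValueError("p(x, B) = -C for all x, so this line carries no root")
    # (x - A)^2 (y - B)^2 = C  <=>  |x - A| = sqrt(C) / |y - B|
    return A + math.sqrt(C) / abs(y - B)


# One root on each of several distinct horizontal lines.
ys = [B + 0.5 + 0.1 * j for j in range(20)]
roots = [(root_on_horizontal_line(y), y) for y in ys]
```

Every sampled line contributes its own root, which mirrors the argument that distinct values of $y$ give distinct roots.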
Proposition 3.15. (topological lemma). Let $p(x, y)$ be a pure polynomial. We suppose that there exist two pairs $(x_1, y_1), (x_2, y_2) \in \mathbb{R}^2$ such that $p(x_1, y_1) \cdot p(x_2, y_2) < 0$. Then the set $L_{p(x,y)}$ is uncountable.

Proof. We set $A = (x_1, y_1)$, $B = (x_2, y_2)$. We get $A \ne B$, for otherwise we have $A = B$ and $p(x_1, y_1) \cdot p(x_2, y_2) = p(A) \cdot p(B) = p(A)^2 \ge 0$, which is false. So we get $A \ne B$.

We consider the perpendicular bisector $\ell$ of the segment $[A, B]$. For every point $\Gamma \in \ell$ we consider the union of the two segments $[A, \Gamma] \cup [\Gamma, B]$, and we write $A\Gamma B = [A, \Gamma] \cup [\Gamma, B]$ for simplicity. Of course $A\Gamma B \subseteq \mathbb{R}^2$. We consider the restriction $p|_{A\Gamma B}$, and we write $p = p|_{A\Gamma B}$ also for simplicity.

Of course the set $A\Gamma B$ is a compact and connected subset of $\mathbb{R}^2$. Since $p$ is continuous, the set $p(A\Gamma B)$ is a closed interval of $\mathbb{R}$. We suppose, without loss of generality, that $p(A) < 0$ and $p(B) > 0$. So $p(A), p(B) \in p(A\Gamma B)$, which gives that $0 \in p(A\Gamma B)$, that is, there exists some point $\Delta \in A\Gamma B$ so that $p(\Delta) = 0$. Of course $\Delta \ne A$ and $\Delta \ne B$. So, for every $\Gamma \in \ell$ and every curve $A\Gamma B$, there exists some $\Delta \in A\Gamma B$, $\Delta \ne A$, $\Delta \ne B$, such that $p(\Delta) = 0$.

Because the set $\mathcal{A} = \{A\Gamma B : \Gamma \in \ell\}$ is an uncountable subset of $\mathcal{P}(\mathbb{R}^2)$ (the power set of $\mathbb{R}^2$), and for every $\Gamma_1, \Gamma_2 \in \ell$, $\Gamma_1 \ne \Gamma_2$, we have $A\Gamma_1 B \cap A\Gamma_2 B = \{A, B\}$, the set
$$\mathcal{B} = \{\Delta \in A\Gamma B \mid \Gamma \in \ell \text{ and } p(\Delta) = 0\}$$
is uncountable. This gives that the set $L_{p(x,y)}$ of roots of $p(x, y)$ is uncountable, and the proof of this proposition is complete. ∎

Corollary 3.16.
Let $p(x, y)$ be a pure polynomial that has a finite, non-empty set of roots. Then the number 0 is the global minimum or the global maximum of $p(x, y)$. In other words, the polynomial $p(x, y)$ has a global maximum or minimum, and this global maximum or minimum is the number 0.

Proof. There exist no two points $(x_1, y_1), (x_2, y_2) \in \mathbb{R}^2$ so that $p(x_1, y_1) \cdot p(x_2, y_2) < 0$. Otherwise, if there exist two points $(x_1, y_1), (x_2, y_2) \in \mathbb{R}^2$ so that $p(x_1, y_1) \cdot p(x_2, y_2) < 0$, then the set $L_{p(x,y)}$ is uncountable (by the previous Proposition 3.15), which is false by our supposition. This means that we have:
i) $p(x, y) \ge 0$ for every $(x, y) \in \mathbb{R}^2$, or
ii) $p(x, y) \le 0$ for every $(x, y) \in \mathbb{R}^2$.

We suppose that i) holds. Because the set $L_{p(x,y)}$ is non-empty, there exists $(x_0, y_0) \in \mathbb{R}^2$ so that $p(x_0, y_0) = 0$, so we get $p(x, y) \ge p(x_0, y_0)$ for every $(x, y) \in \mathbb{R}^2$. So the polynomial $p(x, y)$ attains at the point $(x_0, y_0)$ its global minimum, the number 0, because $p(x_0, y_0) = 0$. If ii) holds, then we conclude in a similar way that $p$ attains the number 0 as its global maximum at some point, and the proof of the corollary is complete. ∎

The above corollary is a basic result that we use in the second stage of our method.

Finally, we state here the most advanced result that we use in our method. This result is often called Fermat's Theorem in the calculus of several variables.
Theorem 3.17. Let $U \subseteq \mathbb{R}^2$ be open and let $f : U \to \mathbb{R}$ be a function that is differentiable at $x_0 \in U$, where $x_0$ is a point of local maximum or local minimum of $f$. Then the following holds: $\nabla f(x_0) = 0$, that is, $x_0$ is a critical point of $f$, where $\nabla f(x_0)$ is the gradient of $f$ at $x_0$.

The Fundamental Theorem of Algebra is a powerful and basic result in the theory of polynomials, especially for polynomial equations. Gauss gave the first complete proof of this result in his Ph.D. thesis. There are many proofs of this important theorem, but none of them is simple enough to be presented in secondary school books.

Its simplest proof comes from complex analysis and uses an advanced theorem of complex analysis, Liouville's Theorem. Here we give a proof that uses the most elementary tools that an undergraduate student learns. We think that it is difficult for an undergraduate student to find this proof in books, so we try to present it in detail for educational reasons. For this reason we first give some elementary lemmas.
Lemma 4.1. Let $p(z) \in \mathbb{C}[z]$ be a complex polynomial, and $z_0 \in \mathbb{C}$. We consider the polynomial $Q(z) = p(z_0 + z)$, $z \in \mathbb{C}$. If $p(z) \equiv 0$, then of course $Q(z) \equiv 0$. If $p(z) \not\equiv 0$, then $\deg Q(z) = \deg p(z)$.

Proof. If $\deg p(z) = 0$, then the result is obvious. Let $\deg p(z) = n \in \mathbb{N}$, $n \ge 1$. We suppose that $n = 1$, so we get $p(z) = az + b$, where $a, b \in \mathbb{C}$, $a \ne 0$. We get $Q(z) = p(z_0 + z) = a(z_0 + z) + b = az + (az_0 + b)$, and $\deg Q(z) = 1$ because $a \ne 0$. So the result holds for $n = 1$.

We prove the result inductively. We suppose $z_0 \ne 0$. As we have seen, for $n = 1$ the result holds. We suppose that the result holds for some $k \in \mathbb{N}$, $k \ge 1$, and for every $j \in \mathbb{N}$, $1 \le j \le k$. We prove that the result holds for $k + 1$. Let $p(z) = a_0 + a_1 z + \cdots + a_k z^k + a_{k+1} z^{k+1}$ with $a_{k+1} \ne 0$, so $\deg p(z) = k + 1$. We distinguish two cases:

(i) $q(z) = a_0 + a_1 z + \cdots + a_k z^k \not\equiv 0$. We have $p(z) = q(z) + a_{k+1} z^{k+1}$, so
$$Q(z) = p(z_0 + z) = q(z_0 + z) + a_{k+1}(z_0 + z)^{k+1}. \tag{1}$$
We set $r(z) = q(z_0 + z)$, $z \in \mathbb{C}$. Because $q(z) \not\equiv 0$, by the induction hypothesis (or by the case of degree 0) we have
$$\deg r(z) = \deg q(z) \le k. \tag{2}$$
By Newton's binomial formula we have
$$a_{k+1}(z_0 + z)^{k+1} = a_{k+1} \sum_{j=0}^{k+1} \binom{k+1}{j} z^{k+1-j} z_0^j = \sum_{j=0}^{k+1} a_{k+1} \binom{k+1}{j} z_0^j z^{k+1-j}. \tag{3}$$
Because $a_{k+1} \ne 0$, the coefficient of $z^{k+1}$ in (3) is non-zero, so
$$\deg\bigl(a_{k+1}(z_0 + z)^{k+1}\bigr) = k + 1. \tag{4}$$
By (1), (2) and (4) we get $\deg Q(z) = k + 1$.

(ii) $q(z) = a_0 + a_1 z + \cdots + a_k z^k \equiv 0$. The proof is similar to case (i), so the result holds by induction.

Of course, if $z_0 = 0$, then $Q(z) = p(z)$, so $\deg Q(z) = \deg p(z)$. ∎

Lemma 4.2.
We consider a polynomial $p(z) \in \mathbb{C}[z]$. Of course we have $|p(z)| \ge 0$ for every $z \in \mathbb{C}$. So the set $A = \{x \in \mathbb{R} \mid \exists z \in \mathbb{C} : x = |p(z)|\}$ is bounded below by 0. We set $m = \inf(A)$. Then there exists $R_0 > 0$ so that
$$m = \inf\bigl(\{x \in \mathbb{R} \mid \exists z \in D(0, R_0) : x = |p(z)|\}\bigr),$$
where $D(0, R_0) = \{z \in \mathbb{C} : |z| \le R_0\}$.

Proof. We set $B_R = \{x \in \mathbb{R} \mid \exists z \in D(0, R) : x = |p(z)|\}$ for $R > 0$. The result is obvious if $p(z) \equiv 0$, so we suppose that $p(z) \not\equiv 0$. It is obvious that $B_R \subseteq A$ by the definitions of the sets $A$ and $B_R$ for $R > 0$. Let $x \in B_R$ for some $R > 0$. Then $x \in A$, so $m \le x$, because $m$ is a lower bound of $A$. So we have $m \le x$ for every $x \in B_R$. This means that $m$ is a lower bound of $B_R$, so
$$m \le m^*_R, \tag{1}$$
where $m^*_R = \inf(B_R)$. That is, we get $m \le m^*_R$ for every $R > 0$.

Let $p(z) = a_0 + a_1 z + \cdots + a_n z^n$, where $n \in \mathbb{N} \cup \{0\}$, $a_n \ne 0$. When $n = 0$ we get of course $m^*_R = m = |p(z)|$ for every $z \in \mathbb{C}$ and every $R > 0$, and the result is obvious. So we suppose that $n \ge 1$. For every $z \in \mathbb{C} \setminus \{0\}$ we get
$$p(z) = z^n \cdot \left(\frac{a_0}{z^n} + \frac{a_1}{z^{n-1}} + \cdots + \frac{a_{n-1}}{z} + a_n\right).$$
By the calculus of elementary limits in complex analysis we have
$$\lim_{z \to \infty} \frac{a_0}{z^n} = \lim_{z \to \infty} \frac{a_1}{z^{n-1}} = \cdots = \lim_{z \to \infty} \frac{a_{n-1}}{z} = 0 \quad \text{and} \quad \lim_{z \to \infty} z^n = \infty,$$
so we have (since $a_n \ne 0$) $\lim_{z \to \infty} p(z) = \infty$. By the definition of $\lim_{z \to \infty} p(z)$ this means that, for the number $m + 1$, there exists $R_0 > 0$ so that
$$|p(z)| > m + 1 \quad \text{for every } z \in \mathbb{C},\ |z| > R_0. \tag{$\ast$}$$
From (1) we have of course
$$m \le m^*_{R_0}. \tag{2}$$
Take $w \in \mathbb{C}$ with $|w| > R_0$. Then, by the above, we have
$$|p(w)| > m + 1. \tag{3}$$
There exists $z_0 \in \mathbb{C}$ so that
$$|p(z_0)| < m + 1, \tag{4}$$
for otherwise $|p(z)| \ge m + 1$ for every $z \in \mathbb{C}$, so $m + 1$ is a lower bound of $A$, that is, $m = \inf(A) \ge m + 1$, which is false. Of course $z_0 \in D(0, R_0)$ by implication ($\ast$), for otherwise $|z_0| > R_0$, which means $|p(z_0)| > m + 1$, contradicting (4). So we have
$$m^*_{R_0} \le |p(z_0)| < m + 1 < |p(w)| \implies m^*_{R_0} \le |p(w)|. \tag{5}$$
So we get $m^*_{R_0} \le |p(z)|$ for every $z \in \mathbb{C}$ with $|z| > R_0$. Of course we also have $m^*_{R_0} \le |p(z)|$ for every $z \in D(0, R_0)$, by the definition of $m^*_{R_0}$. So we get $m^*_{R_0} \le |p(z)|$ for every $z \in \mathbb{C}$, which means that $m^*_{R_0}$ is a lower bound of $A$, that is,
$$m^*_{R_0} \le m. \tag{6}$$
From (2) and (6) we get $m = m^*_{R_0}$, that is, Lemma 4.2 has been proven. ∎

Remark 4.3. De Moivre Theorem:
We recall here the following result. Let $n \in \mathbb{N}$, $n \ge 1$. Then every non-zero complex number has exactly $n$ roots of order $n$; that is, if $w \in \mathbb{C}$, $w \ne 0$, then the equation $z^n = w$ has exactly $n$ solutions. This result is proven easily using elementary properties of complex numbers and of the sine and cosine functions, and it is well known as De Moivre's Theorem.

We also need a topological theorem.

Theorem 4.4
Let $K \subseteq \mathbb{C}$ be compact and $f : K \to \mathbb{R}$ be continuous. Then $f$ attains its supremum and its infimum, and both are finite.

For this theorem see [5]. After the above, we are now ready to give the proof of the Fundamental Theorem of Algebra.

Fundamental Theorem of Algebra 4.5
Proof.
We consider a polynomial $p(z) = a_0 + a_1 z + \cdots + a_n z^n$, with $a_i \in \mathbb{C}$ for every $i = 0, 1, \ldots, n$, $n \in \mathbb{N}$, $n \ge 1$, $a_n \ne 0$. We prove that $p$ has a root, that is, there exists $z_0 \in \mathbb{C}$ such that $p(z_0) = 0$.

First of all we examine the case $a_n = 1$. We have $|p(z)| \ge 0$ for every $z \in \mathbb{C}$. We set $A = \{x \in \mathbb{R} \mid \exists z \in \mathbb{C} : x = |p(z)|\}$. The set $A$ is bounded below by $0$. We set $m = \inf(A)$. Of course $m \ge 0$. For every $R > 0$ we set $B_R = \{x \in \mathbb{R} \mid \exists z \in D(0,R) : x = |p(z)|\}$ and $m^*_R = \inf(B_R)$, where $D(0,R) = \{z \in \mathbb{C} : |z| \le R\}$. Applying Lemma 4.2, we obtain $R_0 > 0$ such that
\[ m = m^*_{R_0}. \tag{1} \]

By Theorem 4.3, page 233 of [5], the ball $D(0,R_0) = \{z \in \mathbb{C} : |z| \le R_0\}$ is a compact set, being closed and bounded. The polynomial $p$ is a continuous function on $\mathbb{C}$; this is a well known result in elementary complex analysis. The usual modulus $|\cdot| : \mathbb{C} \to \mathbb{R}$ is also a continuous function on $\mathbb{C}$. So the function $F : \mathbb{C} \to \mathbb{R}$, $F = |\cdot| \circ p$, with formula $F(z) = |p(z)|$ for every $z \in \mathbb{C}$, is continuous as the composition of the continuous functions $|\cdot|$ and $p$. Applying Theorem 4.4 for $K = D(0,R_0)$ and $f = F$, we conclude that $F$ attains its infimum at some point $z_0 \in D(0,R_0)$. This means that
\[ |p(z_0)| = m^*_{R_0}. \tag{2} \]
By (1) and (2) we have
\[ m = |p(z_0)|. \tag{3} \]

We argue that $m = 0$. To reach a contradiction we suppose that $m > 0$. Because $|p(z_0)| = m > 0$, we have $p(z_0) \ne 0$. We set
\[ Q(z) = \frac{p(z + z_0)}{p(z_0)}, \]
which is well defined because $p(z_0) \ne 0$. Because $\deg p(z + z_0) = \deg p(z) = n$, and by the definition of $Q(z)$, we get $\deg Q(z) = n$. We have $Q(0) = p(0 + z_0)/p(z_0) = 1$, so the polynomial $Q(z)$ has constant term equal to $1$. Let
\[ Q(z) = 1 + c_k z^k + \cdots + c_n z^n, \quad c_n \ne 0, \quad \text{for every } z \in \mathbb{C}, \]
where $k \in \mathbb{N}$, $1 \le k \le n$, and $k$ is the smallest natural number such that $c_k \ne 0$ (maybe $k = n$, of course).

So we get $-|c_k|/c_k \ne 0$. From Remark 4.3 there exists $j \in \mathbb{C}$ such that
\[ j^k = -\frac{|c_k|}{c_k}. \tag{4} \]
(Of course there are $k$ different complex numbers such that (4) holds.) By (4) we get
\[ |j^k| = \left| -\frac{|c_k|}{c_k} \right| = 1 \;\Rightarrow\; |j| = 1. \tag{5} \]
In what follows $r$ denotes a real number with $0 < r \le |c_k|^{-1/k}$, so that $0 \le 1 - |c_k| r^k < 1$. For this $j$ and every such $r$ we have, by (4),
\[ |1 + c_k r^k j^k| = \left| 1 + c_k r^k \cdot \left( -\frac{|c_k|}{c_k} \right) \right| = 1 - |c_k| r^k. \tag{6} \]
By the definition of $Q(z)$ we compute, for $z = rj$,
\[ Q(rj) = 1 + c_k (rj)^k + \cdots + c_n (rj)^n = 1 + c_k r^k j^k + c_{k+1} r^{k+1} j^{k+1} + \cdots + c_n r^n j^n. \tag{7} \]
By (7) and the triangle inequality we get
\[ |Q(rj)| \le |1 + c_k r^k j^k| + |c_{k+1} r^{k+1} j^{k+1}| + \cdots + |c_n r^n j^n|. \tag{8} \]
Applying (5) and (6) in (8), we get
\[ |Q(rj)| \le 1 - |c_k| r^k + |c_{k+1}| r^{k+1} + \cdots + |c_n| r^n = 1 - r^k \bigl( |c_k| - |c_{k+1}| r - \cdots - |c_n| r^{n-k} \bigr). \tag{10} \]

By the definition of $m$ we get
\[ m \le |p(z + z_0)| \quad \text{for every } z \in \mathbb{C}. \tag{11} \]
By (3) and (11) we get
\[ |p(z_0)| \le |p(z + z_0)| \;\Rightarrow\; \left| \frac{p(z + z_0)}{p(z_0)} \right| \ge 1 \;\Rightarrow\; |Q(z)| \ge 1 \quad \text{for every } z \in \mathbb{C}, \tag{12} \]
by the definition of $Q$.

Now we distinguish two cases:

(i) $k = n$. Then from (10) we get
\[ |Q(rj)| \le 1 - r^k |c_k|. \tag{13} \]
So, for every $r$ with $0 < r \le |c_k|^{-1/k}$, we get $|Q(rj)| \ge 1$ by (12) and $|Q(rj)| \le 1 - r^k |c_k| < 1$ by (13), which is a contradiction.

(ii) $k < n$. By properties of limits we get
\[ \lim_{r \to 0^+} \bigl( |c_k| - |c_{k+1}| r - \cdots - |c_n| r^{n-k} \bigr) = |c_k| > 0. \]
This limit shows that there exists some small $r_0$, with $0 < r_0 \le |c_k|^{-1/k}$, such that
\[ |c_k| - |c_{k+1}| r_0 - \cdots - |c_n| r_0^{n-k} > 0. \tag{15} \]
We set $\theta := |c_k| - |c_{k+1}| r_0 - \cdots - |c_n| r_0^{n-k}$, so $\theta > 0$. From (10) and (15) we get
\[ |Q(r_0 j)| \le 1 - r_0^k \theta < 1. \tag{16} \]
From (12) we get $|Q(r_0 j)| \ge 1$, which contradicts (16).

In both cases the assumption $m > 0$ leads to a contradiction, so $m = 0 = |p(z_0)|$, that is, the number $z_0$ is a root of the polynomial $p$.

If $a_n \ne 1$, we consider the polynomial
\[ \frac{1}{a_n} p(z) = \frac{a_0}{a_n} + \frac{a_1}{a_n} z + \cdots + z^n, \]
and applying the previous result we find some $w \in \mathbb{C}$ such that $\frac{1}{a_n} p(w) = 0 \Leftrightarrow p(w) = 0$, so the polynomial $p$ has a root again. The proof of the Fundamental Theorem is now complete. ∎

Remark 4.6.
In our work we have used the well known binomial equation $x^n = a$, where $a > 0$. We recall here how we solve this equation for $n \in \mathbb{N}$, $n \ge 2$. We distinguish two cases:

(i) $a > 1$. We consider the function $f : [1,a] \to \mathbb{R}$ with the formula $f(x) = x^n - a$ for every $x \in [1,a]$. We get $f(1) = 1^n - a < 0$ from our supposition, and $f(a) = a^n - a = a(a^{n-1} - 1) > 0$. So we have $f(1) \cdot f(a) < 0$, and because $f$ is continuous, Bolzano's Theorem gives some $x_0 \in (1,a)$ such that $f(x_0) = 0 \Leftrightarrow x_0^n - a = 0 \Leftrightarrow x_0 = \sqrt[n]{a}$. Because $f$ is strictly increasing on $[1,a]$ (since $f'(x) = n x^{n-1} > 0$ for every $x \in [1,a]$), the equation $f(x) = 0$ has a unique root in $[1,a]$, namely the number $\sqrt[n]{a}$. Applying the bisection method we approximate the number $\sqrt[n]{a}$, or in other words we solve the equation $x^n = a$.

(ii) $a \in (0,1)$. Then we apply the above procedure similarly to the function $g : [0,1] \to \mathbb{R}$ with the formula $g(x) = x^n - a$ for every $x \in [0,1]$.

Acknowledgements:
Many thanks to Vasilli Karali for his contribution to the presentation of this paper.
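The bisection procedure described in Remark 4.6 can be sketched in Python as follows. This is only an illustrative sketch and not part of the original method: the function name `nth_root_bisect`, the tolerance `tol` and the iteration cap `max_iter` are assumptions chosen for the example, and the trivial case $a = 1$, which the remark does not treat, is handled separately.

```python
def nth_root_bisect(a: float, n: int, tol: float = 1e-12, max_iter: int = 200) -> float:
    """Approximate the positive solution of x**n = a by bisection.

    Assumes a > 0 and n >= 2, as in Remark 4.6. The names and default
    values here are hypothetical choices made for this sketch.
    """
    if a <= 0 or n < 2:
        raise ValueError("expected a > 0 and n >= 2")
    if a == 1.0:
        # Trivial case, not covered by the remark: the root is 1.
        return 1.0

    # Case (i): a > 1, work on [1, a]. Case (ii): 0 < a < 1, work on [0, 1].
    # In both cases x**n - a is negative at lo and positive at hi.
    lo, hi = (1.0, a) if a > 1 else (0.0, 1.0)

    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if mid ** n - a < 0:
            lo = mid   # the root lies in the right half
        else:
            hi = mid   # the root lies in the left half (or mid is the root)
        if hi - lo < tol:
            break
    return (lo + hi) / 2


if __name__ == "__main__":
    print(nth_root_bisect(8.0, 3))    # close to 2
    print(nth_root_bisect(0.25, 2))   # close to 0.5
```

Because $x^n - a$ is strictly increasing on the chosen interval, the sign test at the midpoint always keeps the unique root inside $[lo, hi]$, and the interval length is halved at every step.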
References
1. Conte, S. D., and De Boor, C., Elementary Numerical Analysis: An Algorithmic Approach, 3rd ed., McGraw-Hill, New York, 1980.
2. Cox, D., Little, J., O'Shea, D., Ideals, Varieties, and Algorithms.
3. Cox, D., Little, J., O'Shea, D., Using Algebraic Geometry, Springer-Verlag, 1998.
4. Dennis, J. E., and Schnabel, R. B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, N.J., 1983.
5. Dugundji, J., Topology, Allyn and Bacon, Boston, 1966.
6. Forsythe, C. E., Malcolm, M. A., Moler, C. B., Computer Methods for Mathematical Computations, Prentice-Hall, 1977.
7. Gelfand, I. M., Kapranov, M. M., Zelevinsky, A. V., Discriminants, Resultants and Multidimensional Determinants, Birkhäuser, Boston, 1994.
8. Henrici, P., Essentials of Numerical Analysis with Pocket Calculator Demonstrations, Wiley, New York, 1982.
9. Householder, A. S., The Theory of Matrices in Numerical Analysis, Blaisdell, 1964.
10. Milovanovic, G. V., Mitrinovic, D. S., and Rassias, Th. M., Topics in Polynomials: Extremal Problems, Inequalities, Zeros, World Scientific Publ. Company, Singapore, New Jersey, London, 1994.
11. Ortega, J. M., and Rheinboldt, W., The Numerical Solution of Nonlinear Systems, Academic Press, New York, 1970.
12. Rabinowitz, P., Numerical Methods for Non-linear Algebraic Equations, Gordon and Breach, 1970.
13. Rassias, Th. M., Srivastava, H. M., and Yanushauskas, A. (eds), Topics in Polynomials of One and Several Variables and their Applications, World Scientific Publishing Company, Singapore, New Jersey, London, 1993.
14. Scheid, F., Schaum's Outline Series, Numerical Analysis, McGraw-Hill, 1968.
15. Varga, R. S., Matrix Iterative Analysis, Prentice-Hall, 1962.