A Visualizable, Constructive Proof of the Fundamental Theorem of Algebra, and a Parallel Polynomial Root Estimation Algorithm
Christopher Thron and Jordan Barry
Texas A&M University-Central Texas, 79549 USA
email address: [email protected]; email address: [email protected]
Abstract:
This paper presents an alternative proof of the Fundamental Theorem of Algebra that has several distinct advantages. The proof is based on simple ideas involving continuity and differentiation. Visual software demonstrations can be used to convey the gist of the proof. A rigorous version of the proof can be developed using only single-variable calculus and basic properties of complex numbers, but the technical details are somewhat involved. In order to facilitate the reader’s intuitive grasp of the proof, we first present the main points of the argument, which can be illustrated by computer experiments. Next we fill in some of the details, using single-variable calculus. Finally, we give a numerical procedure for finding all roots of an $n$th degree polynomial by solving $n$ differential equations in parallel.
Keywords:
Fundamental Theorem of Algebra, calculus, chain rule, continuity.
AMS 2010 Subject Classification:
The fundamental theorem of algebra states that any polynomial function from the complex numbers to the complex numbers with complex coefficients has at least one root. There are several proofs of the fundamental theorem of algebra, which employ a number of different domains of mathematics, including complex analysis (Liouville’s theorem, Cauchy’s integral theorem or the mean value property) ([1], [2]), topology (Brouwer’s fixed point theorem) [3], differential topology [4], calculus ([1], [5]), and “elementary” methods using meshes or lattices [6], [7]. For easily accessible and readable web references that explain these proofs, see [8], [9], [10], [11]. Many of these proofs are beautiful and elegant. Most are not constructive and do not provide a practical method for finding roots ([6] and [7] are exceptions). The proof we present here is constructive, and it provides a practical method for finding all roots of any polynomial through the numerical solution of differential equations with different initial conditions. Furthermore, we have created a simple, intuitive visual display that demonstrates the construction of roots.
Consider the polynomial $f(z) = \sum_{n=0}^{N} a_n z^n$, where $N$ is a positive integer and the $a_n$ are complex. We want to show that $f(z) = 0$ has at least one complex solution. To approach this problem, we make some preliminary empirical observations on the behavior of polynomial functions.

Fig. 1: R Shiny interface for dynamic display of $f(C_r)$ and its preimage $C_r$, which shows upcrossings (solid blue dots) and downcrossings (hollow red dots). The function in this case is a cubic with roots at i, − . . i, and − . − . i. The large black dot in the preimage plot is the root − . − . i, which maps to 0 in the image plot. The R Shiny app used to create these plots is available online at https://github.com/jthomasbarry/complex_plot_r .

First we may consider how $f(z)$ behaves for some specific values of $z$. When $z = 0$ we have $f(z) = a_0$, and when $z$ has a very large magnitude the terms $a_n z^n$ in $f(z)$ also have large magnitudes, especially the leading-order term $a_N z^N$. To understand the behavior of $f(z)$ between these two extremes, we isolate the behavior of $f(z)$ for different values of $|z|$, as described below.

A complex number can be written in polar form as $z = re^{i\theta}$, where $r > 0$ is the magnitude of $z$ and $0 \le \theta < 2\pi$. If we fix $r$ and allow $\theta$ to vary, then the set of points $\{re^{i\theta},\ 0 \le \theta < 2\pi\}$ is a circle of radius $r$ in the complex plane, which we denote as $C_r$. Since the function $f$ is defined on all complex numbers, in particular it is defined on each circle $C_r$. The image of $C_r$ under $f$ is also a set (a curve, actually) in the complex plane, which we may denote as $f(C_r)$.

Using computer software, we may investigate the changes in the shape of $f(C_r)$ as $r$ increases from 0, for different polynomials $f(z)$.
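The two extremes described above (small and large $|z|$) can be checked with a few lines of code. The following Python sketch is an illustration only and is separate from the R Shiny application of this paper; the example polynomial $f(z) = -1 + z + \tfrac{i}{2}z^2$ and the names `poly_eval` and `image_of_circle` are assumptions chosen for the demonstration.

```python
import cmath
import math


def poly_eval(coeffs, z):
    """Evaluate sum(coeffs[n] * z**n) by Horner's rule; coeffs[0] is a_0."""
    result = 0j
    for a in reversed(coeffs):
        result = result * z + a
    return result


def image_of_circle(coeffs, r, num_points=360):
    """Sample the curve f(C_r) at theta = 2*pi*k/num_points, k = 0..num_points-1."""
    return [
        poly_eval(coeffs, cmath.rect(r, 2 * math.pi * k / num_points))
        for k in range(num_points)
    ]


# Hypothetical example (not from the paper): f(z) = -1 + z + (i/2) z^2
coeffs = [-1, 1, 0.5j]

# Small r: f(C_r) stays close to the constant term a_0 = -1.
small = image_of_circle(coeffs, 0.01)
max_dev_small = max(abs(w - coeffs[0]) for w in small)

# Large r: |f| is dominated by the leading term |a_N| r^N.
r_large = 100.0
large = image_of_circle(coeffs, r_large)
ratios = [abs(w) / (abs(coeffs[-1]) * r_large ** 2) for w in large]

print("max |f + 1| on C_0.01:", max_dev_small)
print("range of |f| / (|a_N| r^N) on C_100:", min(ratios), max(ratios))
```

Plotting the real and imaginary parts of the sampled points for a sequence of radii gives the kind of picture that the interactive display produces.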
To support such experiments, an R Shiny code (listed in the Appendix) was developed that displays $f(C_r)$ and $C_r$ as well as upcrossings and downcrossings, for any given value of $r$ and for an arbitrary polynomial with complex coefficients, as specified by the user. A screenshot of the interface is shown in Figure 1. A sequence of $f(C_r)$ plots for different values of $r$ is shown in Figure 2.

Fig. 2: Curves $f(C_r)$ for different values of $r$, for the polynomial $f(z) = (a_1 a_2 a_3)^{-1}(z - a_1)(z - a_2)(z - a_3)$ where a = 1. i), a = 1. − i), a = 1. − − i). The solid blue dots indicate upcrossings (which move to the right), while the hollow red dots indicate downcrossings (which move to the left). A new downcrossing-upcrossing pair is introduced when r ≈ . (as a lobe in the upper half plane expands down across the real axis) and another pair is introduced when . < r < (when a lobe in the lower half plane expands up across the real axis). Roots are found at upcrossings for moduli r ≈ . , . , . (compare | a | = 2. , | a | = 2. , | a | = 2. ).

Without loss of generality we may assume $a_0 = -1$: given any polynomial with $a_0 \ne 0$ we may obtain a polynomial with the same roots and having constant coefficient $-1$ by dividing by $-a_0$.

If we look at several different polynomials $f$ and see how the curve $f(C_r)$ evolves as $r$ increases, we may make the following observations:
(i) When $r$ is sufficiently small, $f(C_r)$ has a nearly circular shape with center $-1$ and small radius. The curve $f(C_r)$ has multiple intersections with the real axis.
(ii) As $r$ increases, these points of intersection between $f(C_r)$ and the real axis move continuously along the real axis (although sometimes they disappear: see point (v) below).
(iii) There are two types of intersections: some move consistently to the right as $r$ increases, and others move consistently to the left. When $r$ is small, the rightmost intersection is always rightward-moving.
(iv) New intersections with the real axis may appear as $r$ increases. From a geometrical viewpoint, these new intersections occur when a lobe of $f(C_r)$ located in the upper (resp. lower) half-plane shifts downward (resp. upward) as $r$ increases so that it intersects the axis. These new intersections always appear first as a single point that splits into a left-moving and a right-moving intersection as $r$ increases.
(v) A right-moving intersection continues to move to the right unless it runs into a left-moving intersection, in which case both intersections may disappear. From the two-dimensional viewpoint, this occurs when a lobe of $f(C_r)$ that intersects the real axis moves above or below the axis.
(vi) When $r$ is very large, the shape of $f(C_r)$ approaches a large circle centered at the origin. In particular, the rightmost intersection between $f(C_r)$ and the real axis is large and positive.
(vii) The rightmost intersection when $r$ is small is always continuously connected to the rightmost intersection when $r$ is large by a series of right-moving or left-moving intersections. Since the origin is on the real axis between these two intersections, the origin must be either a right-moving or left-moving intersection for some value of $r$.

To understand the difference between right-moving and left-moving intersections, we may look more closely into the nature of the curve $f(C_r)$. As we mentioned above, the circle $C_r$ is parametrized by the angle $\theta$. As $\theta$ increases, the corresponding point on $C_r$ (given by $re^{i\theta}$) moves counterclockwise around $C_r$, while the image of the point under the function $f$ (given by $f(re^{i\theta})$) traces out the curve $f(C_r)$. As the tracing point crosses the real axis, we find there are two types of crossings: either upcrossings (from below to above), or downcrossings (from above to below).
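The distinction between upcrossings and downcrossings can be made concrete by sampling $f(re^{i\theta})$ and watching the sign of its imaginary part. The Python sketch below is illustrative only (it is not the R Shiny code); the polynomial $f(z) = -1 + z + \tfrac{i}{2}z^2$, the sample count, and the function name `find_crossings` are assumptions made for the example.

```python
import cmath
import math


def poly_eval(coeffs, z):
    """Evaluate sum(coeffs[n] * z**n) by Horner's rule; coeffs[0] is a_0."""
    result = 0j
    for a in reversed(coeffs):
        result = result * z + a
    return result


def find_crossings(coeffs, r, num_points=2000):
    """Return (upcrossings, downcrossings) of f(C_r) as real-axis locations.

    A sign change of Im f from negative to non-negative as theta increases
    is recorded as an upcrossing; the reverse change as a downcrossing.
    The crossing location is estimated by linear interpolation.
    """
    thetas = [2 * math.pi * k / num_points for k in range(num_points + 1)]
    values = [poly_eval(coeffs, cmath.rect(r, t)) for t in thetas]
    ups, downs = [], []
    for w0, w1 in zip(values, values[1:]):
        if w0.imag < 0 <= w1.imag:
            crossing_list = ups
        elif w0.imag >= 0 > w1.imag:
            crossing_list = downs
        else:
            continue
        t = w0.imag / (w0.imag - w1.imag)  # fraction of the step to Im f = 0
        crossing_list.append(w0.real + t * (w1.real - w0.real))
    return ups, downs


# Hypothetical example (not from the paper): f(z) = -1 + z + (i/2) z^2
coeffs = [-1, 1, 0.5j]
for radius in (0.5, 0.6):
    ups, downs = find_crossings(coeffs, radius)
    print(radius, "upcrossings:", ups, "downcrossings:", downs)
```

For this example, comparing the output at the two radii shows the upcrossing moving to the right and the downcrossing moving to the left as $r$ increases.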
It may be observed experimentally (and we shall soon show mathematically) that the upcrossings correspond to the rightward-moving intersection points as noted above, and downcrossings correspond to leftward-moving intersection points.

We may summarize a systematic procedure for using the software to locate roots of $f$:
(I) Ensure that $a_0 = -1$ by dividing $f$ by $-a_0$ (if $a_0 = 0$, then 0 is a root already);
(II) Start with a small value of $r$ and locate an upcrossing point on $f(C_r)$;
(III) Follow the upcrossing point as it moves rightward. Eventually it will either pass over the origin, or run into a leftward-moving downcrossing point and disappear.
(IV) If the latter holds, follow the leftward-moving point backwards (i.e. decreasing $r$). Eventually, either it will pass over the origin, or it will merge with a rightward-moving point and disappear.
(V) Follow this rightward-moving point forward (increasing $r$) until it either passes through the origin or merges with a leftward-moving point.
(VI) Continue iterating Steps IV and V until the origin is reached.

This procedure is the basis for a formal proof of the theorem. Some of the technical details are rather involved, but the guiding intuition is captured by the procedure described above.

The proof proceeds in several steps:
1. The roots of $f(z)$ are identical to the roots of $-f(z)/a_0$. So without loss of generality, we may assume that the constant coefficient $a_0$ is equal to $-1$.
2. Assume for the moment that $f'(z)$ has no zeros on the real axis between $-1$ and $0$. (Later we will deal with the case where this is not true.)
3. For any value of $r > 0$, we define the curve $f_r(t) \equiv f(re^{it})$, $0 \le t \le 2\pi$. Since $f_r(0) = f_r(2\pi)$, it follows that this is a closed (possibly self-intersecting) curve in the complex plane.
4. Denote by upcrossing (resp. downcrossing) a point where $f_r$ crosses the real axis from below (resp. above).
In other words, the real number $x$ is an upcrossing for $f_r$ if there exists $t$ such that $f_r(t) = x$, and there exists $\delta > 0$ such that $\operatorname{Im} f_r(s) \le 0$ for $t - \delta \le s \le t$ and $\operatorname{Im} f_r(s) \ge 0$ for $t \le s \le t + \delta$. By continuity, every root of $\operatorname{Im} f_r$ is either an upcrossing, a downcrossing, or a point where the real line is tangent to $f_r$.
5. If $x_r = f_r(t)$ is an upcrossing and $f_r'(t) \ne 0$, then the crossing point moves continuously to the right as a function of $r$. More precisely, there exist $\epsilon, \delta > 0$ and a continuous, real-valued function $g(s)$ that is strictly increasing on the interval $r - \epsilon < s < r + \epsilon$ such that $g(r) = x_r$ and $g(s) = f_s(t')$ for some $t'$ in the interval $t - \delta < t' < t + \delta$. We call the function $g$ an upcrossing function. A similar statement holds for the downcrossing case, except that $g$ is decreasing; the function $g$ in this case is called a downcrossing function.
6. The domain of any upcrossing function $g$ may be extended to an open interval, such that either the range of $g$ includes the origin, or the right endpoint $b$ of the domain is such that $g(b)$ is a point of tangency of the curve $f_b$.
7. Every point of tangency that is the right endpoint of the domain of an upcrossing function is the left endpoint of the domain of a downcrossing function.
8. Every point in the interval $[-1, 0]$ is either an upcrossing, a downcrossing, or a point of tangency of $f_r$ for some positive value of $r$. In particular, the origin is either an upcrossing, downcrossing, or point of tangency, and is thus equal to $f(re^{it})$ for some values of $r$ and $t$.

Steps (1-8) handle the case where none of the roots of $f'$ lie on the real segment $[-1, 0]$. If on the other hand $f'$ does have a root on $[-1, 0]$, we may consider the ray $\theta = \pi + \nu$, for sufficiently small $\nu$, which (by continuity) will have at least one upcrossing intersection with $f(C_r)$ if $r, \nu$ are sufficiently small.
We denote this intersection as $\tilde{z}$. Since the roots of $f'$ are isolated, we can also choose $\nu$ such that $f'$ has no roots on the ray. We may then consider the function $\tilde{f}(z) \equiv f(z)e^{-i\nu}$. Then $\tilde{z}e^{-i\nu}$ is an upcrossing point for $\tilde{f}$ of the negative real axis. Steps (1-8) above then go through for $\tilde{f}$, and the roots of $\tilde{f}$ are identical with the roots of $f$.

As above, we suppose $f(z) = \sum_{n=0}^{N} a_n z^n$ with $a_0 = -1$, where $z = re^{i\theta}$. We seek equations satisfied by upcrossing locations $x$ as a function of $r$. Note that in order for $x = f(re^{i\theta})$ to be an upcrossing, the complex argument $\theta$ varies as $r$ varies, so we must consider both $\theta$ and $x$ as functions of $r$. To make this clear, we will use $\phi = \phi(r)$ to denote the complex argument, so that $x(r) = f(z(r))$ where $z(r) = re^{i\phi}$.

Using the chain rule, we have:
$$\frac{dx}{dr} = f'(z)\frac{d}{dr}(z) = f'(z)e^{i\phi}\left(1 + ir\frac{d\phi}{dr}\right) \tag{1}$$
Since $x$ is real-valued, $\frac{dx}{dr}$ is also a real function and $\frac{dx}{dr} = \left(\frac{dx}{dr}\right)^*$, where $*$ denotes complex conjugate. This gives
$$f'(z)e^{i\phi}\left(1 + ir\frac{d\phi}{dr}\right) = f'(z)^* e^{-i\phi}\left(1 - ir\frac{d\phi}{dr}\right). \tag{2}$$
Solving for $r\frac{d\phi}{dr}$, we obtain
$$r\frac{d\phi}{dr} = \frac{f'(z)^* e^{-i\phi} - f'(z)e^{i\phi}}{i\left(f'(z)e^{i\phi} + f'(z)^* e^{-i\phi}\right)} = -\frac{\operatorname{Im}\left(f'\left(re^{i\phi(r)}\right)e^{i\phi(r)}\right)}{\operatorname{Re}\left(f'\left(re^{i\phi(r)}\right)e^{i\phi(r)}\right)}, \tag{3}$$
where we have replaced $z$ with $re^{i\phi(r)}$ to highlight the $r$ dependence.
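Equation (3) can be integrated numerically to follow a crossing point as $r$ grows. The Python sketch below does this for one hypothetical polynomial, $f(z) = -1 + z + \tfrac{i}{2}z^2$, chosen so that the tracked upcrossing reaches the origin without meeting a tangency. The fixed-step fourth-order Runge-Kutta scheme, the Newton start-up, the step size, and all function names are assumptions made for illustration, not a method prescribed in this paper.

```python
import cmath


def f(z):
    """Hypothetical example polynomial with constant coefficient a_0 = -1."""
    return -1 + z + 0.5j * z * z


def df(z):
    """Derivative f'(z) of the example polynomial."""
    return 1 + 1j * z


def dphi_dr(r, phi):
    """Right-hand side of equation (3), divided by r."""
    w = df(cmath.rect(r, phi)) * cmath.exp(1j * phi)
    return -w.imag / (r * w.real)


def rk4_step(r, phi, h):
    """One classical Runge-Kutta step for phi(r)."""
    k1 = dphi_dr(r, phi)
    k2 = dphi_dr(r + h / 2, phi + h * k1 / 2)
    k3 = dphi_dr(r + h / 2, phi + h * k2 / 2)
    k4 = dphi_dr(r + h, phi + h * k3)
    return phi + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def initial_angle(r, phi, iterations=20):
    """Newton's method in theta on Im f(r e^{i theta}) = 0 to start on a crossing."""
    for _ in range(iterations):
        z = cmath.rect(r, phi)
        phi -= f(z).imag / (df(z) * 1j * z).imag
    return phi


def track_to_root(r0=0.1, h=0.01, r_max=5.0):
    """Follow the crossing x(r) = f(r e^{i phi(r)}) until it passes the origin."""
    r, phi = r0, initial_angle(r0, 0.0)
    x = f(cmath.rect(r, phi)).real
    max_imag = 0.0  # largest |Im f| seen along the track (should stay tiny)
    while r < r_max:
        phi_next = rk4_step(r, phi, h)
        w = f(cmath.rect(r + h, phi_next))
        max_imag = max(max_imag, abs(w.imag))
        if x < 0 <= w.real:
            t = -x / (w.real - x)  # linear interpolation to x = 0
            return cmath.rect(r + t * h, phi + t * (phi_next - phi)), max_imag
        r, phi, x = r + h, phi_next, w.real
    return None, max_imag


root, drift = track_to_root()
print("estimated root:", root, "|f(root)|:", abs(f(root)), "max |Im f|:", drift)
```

In this sketch the crossing location is read off directly as $\operatorname{Re} f(re^{i\phi})$ rather than obtained from a second differential equation, and the final root estimate is only as accurate as the linear interpolation over the last step.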
From (1) and (3) we may calculate:
$$\frac{dx}{dr} = f'(z)e^{i\phi}\left(1 - i\,\frac{\operatorname{Im}\left(f'(z)e^{i\phi}\right)}{\operatorname{Re}\left(f'(z)e^{i\phi}\right)}\right) = \frac{\left|f'\left(re^{i\phi(r)}\right)e^{i\phi(r)}\right|^2}{\operatorname{Re}\left(f'\left(re^{i\phi(r)}\right)e^{i\phi(r)}\right)} \tag{4}$$
Equations (3) and (4) express $\frac{d\phi}{dr}$ and $\frac{dx}{dr}$ respectively in terms of $f'\left(re^{i\phi(r)}\right)e^{i\phi(r)}$. Fortunately, this rather complicated expression turns out to have a relatively simple interpretation. The definition of $f_r$ implies that $\frac{d}{d\theta}f_r(\theta) = f'(re^{i\theta})(ire^{i\theta})$, so if we define:
$$\alpha(r) \equiv \left.\frac{d}{d\theta}f_r(\theta)\right|_{\theta=\phi(r)} \tag{5}$$
then we may re-express the system (3)-(4) as:
$$\frac{d\phi}{dr} = -\frac{\operatorname{Im}(-i\alpha(r))}{r\operatorname{Re}(-i\alpha(r))} = \frac{\operatorname{Re}(\alpha(r))}{r\operatorname{Im}(\alpha(r))}; \qquad \frac{dx}{dr} = \frac{|-i\alpha(r)|^2}{r\operatorname{Re}(-i\alpha(r))} = \frac{|\alpha(r)|^2}{r\operatorname{Im}(\alpha(r))}. \tag{6}$$
For future reference, note that $\operatorname{Im}(\alpha(r))$ is positive or negative depending on whether $x(r)$ is an upcrossing or downcrossing.

Alternatively, we can pose the system such that $\theta$ is the independent variable, and $r, x$ are the dependent variables. To clarify the dependence of $r$ on $\theta$, we use $\rho = \rho(\theta)$ here to represent the complex modulus as a function of $\theta$, so that $f(\rho e^{i\theta})$ is an upcrossing point for the curve $f(C_\rho)$. In analogy to (5), we define:
$$\beta(\theta) \equiv \left.\frac{d}{d\theta}f_r(\theta)\right|_{r=\rho(\theta)}, \tag{7}$$
and in analogy to (6) we obtain:
$$\frac{d\rho}{d\theta} = -\frac{\rho\operatorname{Re}(-i\beta(\theta))}{\operatorname{Im}(-i\beta(\theta))} = \frac{\rho\operatorname{Im}(\beta(\theta))}{\operatorname{Re}(\beta(\theta))}; \qquad \frac{dx}{d\theta} = -\frac{|-i\beta(\theta)|^2}{\operatorname{Im}(-i\beta(\theta))} = \frac{|\beta(\theta)|^2}{\operatorname{Re}(\beta(\theta))}. \tag{8}$$
It follows that given a crossing point $x = f_r(\theta)$, we can always make the crossing point ‘move’ continuously to the right by following this strategy:
1. If $\lim_{z \to re^{i\theta}} \left|\operatorname{Im} f'(z) / \operatorname{Re} f'(z)\right| > c$ (where c < is a fixed positive parameter), then propagate $x$ to the right using (6) with either increasing or decreasing $r$, depending on the sign of $\operatorname{Im} f'(re^{i\theta})$;
2. Otherwise, propagate $x$ to the right using (8) with either increasing or decreasing $\theta$, depending on the sign of $\operatorname{Im} f'(re^{i\theta})$.

Following this procedure will yield a monotonically increasing crossing point. If the initial crossing point is chosen between $-1$ and $0$ on the real axis, then eventually the crossing point will pass 0 and a root will be obtained.

The above procedure is only guaranteed to obtain a single root $\gamma_1$. Subsequent roots may be estimated by taking $f^{(1)}(z) \equiv f(z)(1 - z/\gamma_1)^{-1}$ and finding another root $\gamma_2$, then iterating the procedure with $f^{(j)}(z) \equiv f^{(j-1)}(z)(1 - z/\gamma_j)^{-1}$, $j = 1, 2, \ldots$ (where $f^{(0)} \equiv f$) until all roots are found. However, numerical stability problems are possible, because numerical error means that $f^{(j)}(z)$ is no longer exactly a polynomial for $j \ge 1$.

An alternative approach finds all roots in parallel as follows. If the polynomial has degree $n$, the $z^n$ term dominates the behavior of $f(C_r)$ when $r$ is large. It follows that for $r$ sufficiently large, $f(C_r)$ must have at least $n$ upcrossings on the positive real axis and at least $n$ downcrossings of the negative real axis. All upcrossings may be followed leftwards using the reverse of the rightward-tracking procedure described above, and all downcrossings may be followed rightward by a similar procedure. Not all of these $2n$ tracks (which may be computed in parallel) will result in a root; however, it is guaranteed that all $n$ roots will be obtained through the procedure.

References
[1] Schep, A.: A simple complex analysis and an advanced calculus proof of the fundamental theorem of algebra. American Mathematical Monthly 116(1), 67–68 (2009).
[2] Vyborny, R.: A simple proof of the fundamental theorem of algebra. Mathematica Bohemica 135(1), 57–61 (2010).
[3] Arnold, B.H.: A topological proof of the fundamental theorem of algebra. The American Mathematical Monthly 56(7), 465–466 (1949).
[4] Guillemin, V., Pollack, A.: Differential Topology. 1st edn.
Prentice-Hall (1974).
[5] Fefferman, C.: An easy proof of the fundamental theorem of algebra. The American Mathematical Monthly 74(7), 854–855 (1967).
[6] Rosenbloom, P.C.: An elementary constructive proof of the fundamental theorem of algebra. The American Mathematical Monthly 52(10), 562–570 (1945).
[7] Brenner, J.L., Lyndon, R.C.: Proof of the fundamental theorem of algebra. American Mathematical Monthly 88(4), 253–256 (1981).
[8] File, D., Miller, S.: Fundamental theorem of algebra lecture notes from the Reading Classics (Euler) working group, autumn 2003, https://people.math.osu.edu/sinnott.1/ReadingClassics/FundThmAlg_DFile.pdf , last accessed 2020/2/1.
[9] Steed, M.: Proofs of the fundamental theorem of algebra, http://math.uchicago.edu/~may/REU2014/REUPapers/Steed.pdf , last accessed 2020/1/12.
[10] Linford, K.: An analysis of Charles Fefferman’s proof of the fundamental theorem of algebra, http://commons.emich.edu/honors/504, last accessed 2020/2/1.
[11] Dunfield, N.: The fundamental theorem of algebra (class notes), https://faculty.math.illinois.edu/~nmd/classes/2015/418/notes/fund_thm_alg.pdf