Continuation Newton Methods with the Residual Trust-region Time-stepping Scheme for Nonlinear Equations
Xin-long Luo ∗ · Hang Xiao · Jia-hui Lv
Received: date / Accepted: date
Abstract
For nonlinear equations, homotopy methods (continuation methods) are popular in engineering fields because their convergence regions are large and they are quite reliable for finding a solution. The disadvantage of the classical homotopy methods is that they are time-consuming, since they need to solve many auxiliary nonlinear systems during the intermediate continuation process. In order to overcome this shortcoming, we consider a special explicit continuation Newton method with a new time-stepping scheme for nonlinear equations. According to our numerical experiments, it is more robust and faster at finding the required solution of a real-world problem than the traditional optimization method (the built-in subroutine fsolve.m of the MATLAB environment) and the homotopy continuation methods (HOMPACK90 and NAClab). Furthermore, we analyze the global convergence and the local superlinear convergence of the new method.
Keywords
Continuation Newton method · trust-region technique · nonlinear equations · homotopy continuation method · equilibrium problem

Mathematics Subject Classification (2010)

Xin-long Luo (corresponding author), Hang Xiao, Jia-hui Lv
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, P. O. Box 101, Xitucheng Road No. 10, Haidian District, 100876, Beijing, China
E-mail: [email protected]
1 Introduction

In engineering fields, we often need to solve for the equilibrium state of a differential equation [26,34,41,47] of the form
$$\frac{dx}{dt} = F(x), \quad x(t_0) = x_0. \tag{1}$$
That is to say, we need to solve the following system of nonlinear equations:
$$F(x) = 0, \tag{2}$$
where $F: \Re^n \to \Re^n$ is a vector function. For the nonlinear system (2), there are many popular traditional optimization methods [6,10,19,37] and classical homotopy continuation methods [1,12,38,48].

For the traditional optimization methods, such as the trust-region methods and the line search methods, the solution $x^{*}$ of the nonlinear system (2) is found by solving the equivalent nonlinear least-squares problem
$$\min_{x \in \Re^n} f(x) = \tfrac{1}{2}\,\|F(x)\|^2, \tag{3}$$
where $\|\cdot\|$ denotes the Euclidean vector norm or its induced matrix norm. Generally speaking, the traditional optimization methods based on the merit function (3) are efficient for large-scale problems, since they have local superlinear convergence near the solution $x^{*}$ [6,37].

However, the line search methods and the trust-region methods are apt to stagnate at a local minimum point $x^{*}$ of problem (3) when the Jacobian matrix $J(x^{*})$ of $F(x)$ at $x^{*}$ is singular or nearly singular, where $J(x) = \partial F(x)/\partial x$ (see p. 304, [37]). Furthermore, the termination condition
$$\|\nabla f(x_k)\| = \|J(x_k)^{T} F(x_k)\| < \varepsilon \tag{4}$$
may lead these methods to stop early, far away from the local minimum $x^{*}$. This can be illustrated as follows. Consider
$$F(x) = Ax = 0, \quad A = \begin{bmatrix} 1 & 0 \\ 0 & 10^{-8} \end{bmatrix}, \tag{5}$$
which has the unique solution $x^{*} = (0, 0)^{T}$. Set $\varepsilon = 10^{-6}$; then the traditional optimization methods stop early, far away from $x^{*}$, whenever $x_k = (0, c)^{T}$ with $1 \ll |c| < 10^{10}$, since $\|\nabla f(x_k)\| = 10^{-16}\,|c| < \varepsilon$ while $\|x_k - x^{*}\| = |c|$ is large.
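The following small MATLAB check (our illustration, not code from the paper) makes this early-stopping phenomenon concrete for the data in (5):

% Early termination of the merit-function approach on example (5):
% the gradient norm ||J'*F|| is tiny although x_k is far from x* = (0,0)'.
A  = [1, 0; 0, 1e-8];          % nearly singular Jacobian J(x) = A
xk = [0; 1e3];                 % far away from the solution x*
Fk = A*xk;                     % F(x_k) = A*x_k
grad_norm = norm(A'*Fk);       % ||grad f(x_k)|| = ||J(x_k)'*F(x_k)|| = 1e-13
fprintf('||grad f|| = %.1e, ||x_k - x*|| = %.1e\n', grad_norm, norm(xk));
% With epsilon = 1e-6, the test (4) stops here although ||x_k - x*|| = 1e3.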
For the classical homotopy methods, the solution $x^{*}$ of the nonlinear system (2) is found by constructing the homotopy function
$$H(x, \lambda) = (1 - \lambda)\, G(x) + \lambda\, F(x), \tag{6}$$
and attempting to trace the implicitly defined curve $(x(t), \lambda(t)) \in H^{-1}(0)$ from the starting point $(x_0, 0)$ to a solution $(x^{*}, 1)$ by predictor-corrector methods [1,12], where the zero point $x_0$ of the artificial smooth function $G(x)$ is known. Generally speaking, the homotopy continuation methods are more reliable than the merit-function methods, and they are very popular in engineering fields [26]. The disadvantage of the classical homotopy methods is that they require significantly more function and derivative evaluations and linear algebra operations than the merit-function methods, since they need to solve many auxiliary nonlinear systems during the intermediate continuation process.

In order to overcome this shortcoming of the traditional homotopy methods, we consider a special continuation method based on the Newton flow [3,4,7,46]
$$\frac{dx(t)}{dt} = -J(x)^{-1} F(x), \quad x(t_0) = x_0, \tag{7}$$
and construct a special ODE method with a new time-stepping scheme, based on the trust-region technique, to trace the trajectory of the Newton flow (7) and thereby obtain the required solution $x^{*}$ of the nonlinear system (2) efficiently.

The rest of this article is organized as follows. In the next section, we describe the explicit continuation Newton method with the new time-stepping scheme based on the trust-region technique for nonlinear equations. In Section 3, under some standard assumptions, we prove the global convergence and the local superlinear convergence of the new method. In Section 4, some promising numerical results of the new method are reported, with comparisons to the traditional trust-region method (the built-in subroutine fsolve.m of the MATLAB environment [32,35]) and the classical homotopy continuation methods (HOMPACK90 [48] and NAClab [24,49,50]). Finally, some conclusions and future work are discussed in Section 5. Throughout this article, we assume that $F(x)$ has a zero point $x^{*}$.

2 Continuation Newton methods with the residual trust-region time-stepping scheme

In this section, based on the trust-region technique, we construct a new time-stepping scheme for the continuation Newton method to trace the trajectory of the Newton flow and obtain its equilibrium point $x^{*}$.

2.1 The continuous Newton flow

If we consider the damped Newton method with the line search strategy for the nonlinear system (2) [22,37], we have
$$x_{k+1} = x_k - \alpha_k\, J(x_k)^{-1} F(x_k). \tag{8}$$
We denote by $o(\alpha)$ a higher-order infinitesimal of $\alpha$, that is to say, $\lim_{\alpha \to 0} o(\alpha)/\alpha = 0$. In equation (8), if we let $x_k = x(t_k)$ and $x_{k+1} = x(t_k + \alpha_k) + o(\alpha_k)$, we obtain the continuous Newton flow (7) when $\alpha_k \to 0$. Conversely, if we apply the explicit Euler method [15,44] to the continuous Newton flow (7), we recover the damped Newton method (8).
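As a concrete illustration, one explicit Euler step of length $\alpha_k$ on the flow (7) is exactly the iteration (8); the following MATLAB sketch (ours, with an assumed toy system) performs a single such step:

% One explicit Euler step on dx/dt = -J(x)^{-1} F(x) reproduces (8).
F = @(x) [x(1)^2 + x(2)^2 - 2; exp(x(1) - 1) + x(2)^3 - 2];  % toy system
J = @(x) [2*x(1), 2*x(2); exp(x(1) - 1), 3*x(2)^2];          % its Jacobian
xk    = [1.5; 0.5];
alpha = 0.5;                               % damping factor alpha_k
xk1   = xk - alpha*(J(xk)\F(xk));          % x_{k+1} = x_k - alpha_k J^{-1} F
fprintf('||F(x_k)|| = %.3e, ||F(x_{k+1})|| = %.3e\n', norm(F(xk)), norm(F(xk1)));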
Since the Jacobian matrix $J(x)$ may be singular, we reformulate the continuous Newton flow (7) in the more general form
$$-J(x)\,\frac{dx(t)}{dt} = F(x), \quad x(t_0) = x_0. \tag{9}$$
The continuous Newton flow (9) is an old method that can be traced back to Davidenko's work [7] in 1953. It was subsequently investigated by Branin [4], Deuflhard et al. [9], Tanabe [46] and Kalaba et al. [20] in the 1970s, and has recently been applied to nonlinear boundary-value problems by Axelsson and Sysala [3]. The reason for the continuing research interest in this method is that it has some nice properties. One is that the solution $x(t)$ of the continuous Newton flow converges globally from any initial point, as described by the following Property 1.

Property 1 (Branin [4] and Tanabe [46]) Assume that $x(t)$ is the solution of the continuous Newton flow (9). Then the energy function $f(x(t)) = \frac{1}{2}\|F(x(t))\|^2$ converges to zero as $t$ tends to infinity. That is, every limit point $x^{*}$ of $x(t)$ is a solution of the nonlinear system (2). Furthermore, $F(x(t))$ converges to zero with the linear rate $e^{-t}$. If the Jacobian matrix $J(x)$ is nonsingular, $x(t)$ cannot reach its equilibrium $x^{*}$ in any finite time.

Proof.
Assume that $x(t)$ is the solution of the continuous Newton flow (9). Then we have
$$\frac{d}{dt}\left(e^{t} F(x(t))\right) = e^{t} J(x)\,\frac{dx(t)}{dt} + e^{t} F(x) = 0,$$
which gives
$$F(x(t)) = F(x_0)\, e^{-t}. \tag{10}$$
From equation (10), it is not difficult to see that $F(x(t))$ converges to zero with the linear rate $e^{-t}$ as $t$ tends to infinity. Thus, if the solution $x(t)$ of the continuous Newton flow (9) stays in a compact set, it converges to a point $x^{*}$ as $t$ tends to infinity, and this limit point $x^{*}$ is a solution of the nonlinear system (2). Furthermore, if the Jacobian matrix $J(x)$ is nonsingular, the ordinary differential equation (ODE) (9) is equivalent to the continuous Newton flow (7). Thus, from the dynamical properties of the ODE, it is not difficult to see that the ODE (7) has a unique solution $x(t)$ and that it cannot reach its equilibrium $x^{*}$ in any finite time (pp. 79-82, [40]). ⊓⊔

The inverse matrix $J(x)^{-1}$ can be regarded as a preconditioner of $F(x)$, chosen such that the solution components $x_i(t)$ $(i = 1, 2, \ldots, n)$ of the continuous Newton flow (7) have roughly the same convergence rates, and it mitigates the stiffness of the ODE (7) (the definition of a stiff problem can be found in [16] and the references therein). This property is very useful, since it allows us to adopt an explicit ODE method to trace the trajectory of the Newton flow. It can be roughly illustrated as follows. If we consider $F(x) = Ax$, the ODE (9) becomes
$$A\,\frac{dx}{dt} = -Ax, \quad x(0) = x_0. \tag{11}$$
By integrating the linear ODE (11), we obtain
$$x(t) = e^{-t} x_0. \tag{12}$$
From equation (12), we see that all components of the solution $x(t)$ of the ODE (11) converge linearly to zero with the same rate $e^{-t}$ as $t$ tends to infinity.

2.2 Continuation Newton methods

From subsection 2.1, we know that the solution $x(t)$ of the continuous Newton flow (9) has the nice global convergence property. On the other hand, when the Jacobian matrix $J(x)$ is singular or nearly singular, the ODE (9) is a system of differential-algebraic equations (DAEs), and its trajectory cannot be followed efficiently by a general ODE method such as the backward differentiation formulas (the built-in subroutine ode15s.m of the MATLAB environment [2,5,16,32,44]). Thus, we need to construct a special method to handle this problem, and we expect the new method to have the global convergence of the homotopy continuation methods and the fast local convergence of the merit-function methods near the solution $x^{*}$. In order to achieve these two aims, we construct a special continuation Newton method with the new step size $\alpha_k = \Delta t_k/(1 + \Delta t_k)$, where the time-stepping size $\Delta t_k$ is adaptively adjusted by the trust-region technique for problem (9).

We first apply the implicit Euler method to the continuous Newton flow (9) [2,5], which gives
$$J(x_{k+1})\,\frac{x_{k+1} - x_k}{\Delta t_k} = -F(x_{k+1}). \tag{13}$$
The scheme (13) is an implicit method, so it needs to solve a system of nonlinear equations at every iteration. To avoid this, we replace $J(x_{k+1})$ and $F(x_{k+1})$ with $J(x_k)$ and $F(x_k) + J(x_k)(x_{k+1} - x_k)$, respectively, in equation (13). Thus, we obtain the explicit continuation Newton method
$$J(x_k)\, s_k = -\frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k), \tag{14}$$
$$x_{k+1} = x_k + s_k. \tag{15}$$
The explicit continuation Newton method (14)-(15) coincides with the damped Newton method (8) if we let $\alpha_k = \Delta t_k/(1 + \Delta t_k)$ in equation (14).
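The stability contrast discussed in the next paragraph can be previewed on the linear model problem $F(x) = Ax$, where the Newton flow reduces to $dx/dt = -x$; the MATLAB sketch below (ours) compares one large step of each scheme:

% For F(x) = A*x one step of each method gives
%   explicit Euler:   x_{k+1} = (1 - dt) * x_k    (unstable for dt > 2)
%   scheme (14)-(15): x_{k+1} = x_k / (1 + dt)    (contracts for all dt > 0)
A  = [2, 1; 1, 3];  x0 = [1; -1];  dt = 5;
F  = @(x) A*x;  Jac = A;
x_euler = x0 - dt*(Jac\F(x0));              % |1 - dt| = 4: error grows
x_cnm   = x0 - (dt/(1 + dt))*(Jac\F(x0));   % 1/(1 + dt) = 1/6: error shrinks
fprintf('explicit Euler: %.2f, continuation Newton: %.2f\n', ...
        norm(x_euler)/norm(x0), norm(x_cnm)/norm(x0));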
However, from the viewpoint of ODE methods, they are very different. The damped Newton method (8) is obtained by applying the explicit Euler scheme to the continuous Newton flow (9), and its time-stepping size $\Delta t_k$ is restricted by numerical stability [16,44]. That is, a large time-stepping size $\Delta t_k$ cannot be adopted in the steady-state phase. The explicit continuation Newton method (14)-(15) is obtained by applying the semi-implicit Euler method to the continuous Newton flow (9), and its time-stepping size $\Delta t_k$ is not restricted by numerical stability. Therefore, a large time-stepping size can be adopted in the steady-state phase, so that the method mimics the Newton method near the equilibrium solution $x^{*}$ and has a fast local convergence rate. Above all, the new step size $\alpha_k = \Delta t_k/(\Delta t_k + 1)$ makes it convenient to adopt the trust-region technique for adaptively adjusting the time-stepping size $\Delta t_k$, so that the explicit continuation Newton method (14)-(15) accurately traces the trajectory of the continuous Newton flow in the transient-state phase and achieves a fast convergence rate near the equilibrium point $x^{*}$.

For real-world problems, the Jacobian matrix $J(x)$ may be singular for physical reasons. For example, for the chemical kinetic reaction problem (1), the elements of $x(t)$ represent reaction concentrations and they must satisfy the linear conservation law [27]. A system is said to satisfy the linear conservation law ([42], or p. 35, [44]) if there is a constant vector $c \neq 0$ such that
$$c^{T} x(t) = c^{T} x(0) \tag{16}$$
holds for all $t \geq 0$. If there exists a constant vector $c$ such that
$$c^{T} F(x) = 0, \quad \forall x \in \Re^n, \tag{17}$$
we have
$$c^{T} J(x) = 0, \quad \forall x \in \Re^n. \tag{18}$$
From equation (18), we know that the Jacobian matrix $J(x)$ is singular. In this case, the solution $x(t)$ of the ODE (1) satisfies the linear conservation law (16).

For an isolated singularity of the Jacobian matrix $J(x)$, there are some efficient approaches to handle this problem [14]. Here, since the singular set of the Jacobian matrix $J(x)$ may be connected, we adopt the regularization technique [17,21] to modify the explicit continuation Newton method (14)-(15) as follows:
$$(\mu_k I - J(x_k))\, s_k^{P} = F(x_k), \tag{19}$$
$$s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, s_k^{P}, \tag{20}$$
$$x_{k+1} = x_k + s_k, \tag{21}$$
where $\mu_k$ is a small positive number. In order to achieve a fast convergence rate near the solution $x^{*}$, the regularization continuation Newton method (19)-(21) is required to approximate the Newton method $x_{k+1} = x_k - J(x_k)^{-1} F(x_k)$ near the solution $x^{*}$ [11]. Thus, we select the regularization parameter $\mu_k$ as follows:
$$\mu_k = \begin{cases} c_{\varepsilon}, & \text{if } \Delta t_k \leq 1/c_{\varepsilon}, \\ 1/\Delta t_k, & \text{otherwise}, \end{cases} \tag{22}$$
where $c_{\varepsilon}$ is a small positive constant, such as $c_{\varepsilon} = 10^{-6}$ in practice.

It is not difficult to verify that the regularization continuation Newton method (19)-(21) preserves the linear conservation law (16) whenever there exists a constant vector $c \in \Re^n$ such that $c^{T} F(x) = 0$, $\forall x \in \Re^n$.
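A direct MATLAB transcription of the regularized step (19)-(21), together with the parameter rule (22), might look as follows; this is our sketch, and the function name cnm_reg_step and the default value of c_eps are ours:

function [xk1, sk] = cnm_reg_step(Fk, Jk, xk, dt, c_eps)
% One regularization continuation Newton step (19)-(21):
%   (mu*I - J(x_k)) s^P = F(x_k),  s_k = dt/(1+dt) * s^P,  x_{k+1} = x_k + s_k.
if nargin < 5, c_eps = 1e-6; end           % small constant in (22)
n = length(xk);
if dt <= 1/c_eps
    mu = c_eps;                            % regularization parameter (22)
else
    mu = 1/dt;
end
sP  = (mu*eye(n) - Jk) \ Fk;               % solve (19)
sk  = (dt/(1 + dt)) * sP;                  % scale by the step size (20)
xk1 = xk + sk;                             % update (21)
end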
Actually, from $c^{T} F(x) = 0$ we have $c^{T} J(x) = 0$, and consequently, from equations (19)-(20),
$$c^{T} x_{k+1} = c^{T} x_k + c^{T} s_k = c^{T} x_k + \frac{1}{\mu_k}\, c^{T} \left( \frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k) + J(x_k)\, s_k \right) = c^{T} x_k. \tag{23}$$
That is to say, the regularization continuation Newton method (19)-(21) preserves the linear conservation law (16).

2.3 The residual trust-region time-stepping scheme

Another issue is how to adjust the time-stepping size $\Delta t_k$ adaptively at every iteration. A popular way to control the time-stepping size is based on the trust-region technique [6,8,19,30]. For this time-stepping scheme, one needs to select a suitable merit function and construct an approximation model of it. Here, we adopt the residual $\|F(x)\|$ as the merit function and $\|F(x_k) + J(x_k)(x_{k+1} - x_k)\|$ as the approximation model of $\|F(x_{k+1})\|$. Thus, according to the ratio
$$\rho_k = \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)(x_{k+1} - x_k)\|}, \tag{24}$$
we enlarge or reduce the time-stepping size $\Delta t_k$ at every iteration. A particular adjustment strategy is given as follows:
$$\Delta t_{k+1} = \begin{cases} \gamma_1\, \Delta t_k, & \text{if } |1 - \rho_k| \leq \eta_1, \\ \Delta t_k, & \text{if } \eta_1 < |1 - \rho_k| < \eta_2, \\ \gamma_2\, \Delta t_k, & \text{if } |1 - \rho_k| \geq \eta_2, \end{cases} \tag{25}$$
where the constants are selected as $\gamma_1 = 2$, $\gamma_2 = 0.5$, $\eta_1 = 0.25$ and $\eta_2 = 0.75$, according to our numerical experiments.

Based on the above discussion, the detailed implementation of the regularization continuation Newton method with the residual trust-region time-stepping scheme for nonlinear equations is given in Algorithm 1.
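Before stating Algorithm 1, we note that in code the adjustment rule (25) with the quoted constants reduces to a few lines; the following MATLAB helper (our sketch) will also be convenient later:

function dt_new = update_dt(rho, dt)
% Trust-region time-stepping adjustment (25) with
% gamma_1 = 2, gamma_2 = 0.5, eta_1 = 0.25, eta_2 = 0.75.
if abs(1 - rho) <= 0.25
    dt_new = 2*dt;        % the model fits well: enlarge the time step
elseif abs(1 - rho) >= 0.75
    dt_new = 0.5*dt;      % the model fits poorly: reduce the time step
else
    dt_new = dt;          % keep the current time step
end
end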
Algorithm 1 Continuation Newton method with the residual trust-region time-stepping scheme (the CNMTr method)

Input: the nonlinear function F(x); the initial point x_0 (optional); the termination parameter ε (optional).
Output: an approximate solution x* of the nonlinear equations.
1: Set the default x_0 = [1, 1, ..., 1]^T and ε = 10^{-6} when the calling routine does not provide x_0 or ε.
2: Initialize the parameters: η_a = 10^{-6}, η_1 = 0.25, γ_1 = 2, η_2 = 0.75, γ_2 = 0.5. Set k = 0.
3: Evaluate F(x_0) and the Jacobian matrix J(x_0). Select the initial time-stepping size Δt_0 = min{0.1, 1/‖F(x_0)‖}. Compute the residual Res_0 = ‖F(x_0)‖.
4: while Res_k ≥ ε do
5:   Solve the linear system (19)-(20) to obtain the trial step s_k.
6:   Set x_{k+1} = x_k + s_k.
7:   Evaluate F(x_{k+1}).
8:   Compute the residual Res_{k+1} = ‖F(x_{k+1})‖.
9:   if ‖F(x_k)‖ < ‖F(x_k) + J(x_k) s_k‖ then
10:     Set ρ_k = −1.
11:  else
12:     Compute the ratio ρ_k from equation (24).
13:  end if
14:  Adjust the time-stepping size Δt_{k+1} according to the trust-region scheme (25).
15:  if ρ_k ≥ η_a then
16:     Accept the trial step s_k, and evaluate the Jacobian matrix J(x_{k+1}).
17:  else
18:     Reject the trial step: set x_{k+1} = x_k, Res_{k+1} = Res_k and J(x_{k+1}) = J(x_k).
19:  end if
20:  Set k ← k + 1.
21: end while
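To make Algorithm 1 concrete, the following self-contained MATLAB function is a minimal sketch of one possible implementation; it is our transcription under the parameter choices stated above, not the authors' released code, and the name cnmtr is ours.

function [x, res, k] = cnmtr(F, Jac, x, tol)
% CNMTR  A minimal sketch of Algorithm 1 (CNMTr) for F(x) = 0.
%   F   : function handle, F(x) returns an n-vector
%   Jac : function handle, Jac(x) returns the n-by-n Jacobian J(x)
if nargin < 4, tol = 1e-6; end
eta_a = 1e-6; eta1 = 0.25; eta2 = 0.75; gam1 = 2; gam2 = 0.5; c_eps = 1e-6;
n  = length(x);  I = eye(n);
Fk = F(x);  Jk = Jac(x);  res = norm(Fk);
dt = min(0.1, 1/res);                     % initial time-stepping size
k  = 0;
while res >= tol
    % regularized trial step (19)-(21) with mu_k from (22)
    mu = c_eps;  if dt > 1/c_eps, mu = 1/dt; end
    sk = (dt/(1 + dt)) * ((mu*I - Jk) \ Fk);
    x1 = x + sk;
    F1 = F(x1);  res1 = norm(F1);
    pred = res - norm(Fk + Jk*sk);        % predicted reduction
    if pred <= 0
        rho = -1;                         % model predicts no decrease: reject
    else
        rho = (res - res1)/pred;          % ratio (24)
    end
    if abs(1 - rho) <= eta1               % time-stepping adjustment (25)
        dt = gam1*dt;
    elseif abs(1 - rho) >= eta2
        dt = gam2*dt;
    end
    if rho >= eta_a                       % accept the trial step
        x = x1;  Fk = F1;  res = res1;  Jk = Jac(x);
    end
    k = k + 1;
end
end

For example, cnmtr(@(x) [x(1)^2 + x(2)^2 - 2; exp(x(1) - 1) + x(2)^3 - 2], @(x) [2*x(1), 2*x(2); exp(x(1) - 1), 3*x(2)^2], [3; 3]) should drive the residual below 10^{-6} and return an approximation of the root (1, 1)^T.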
3 Convergence analysis

In this section, we discuss some theoretical properties of Algorithm 1. Firstly, we estimate a lower bound of the predicted reduction $\|F(x_k)\| - \|F(x_k) + J(x_k)(x_{k+1} - x_k)\|$, which is similar to the corresponding estimate for the trust-region method in unconstrained optimization [39].

Lemma 1 Assume that there exists a positive constant $m$ such that
$$\|J(x_k)\, y\| \geq m\, \|y\|, \quad \forall y \in \Re^n, \; k = 0, 1, 2, \ldots. \tag{26}$$
Furthermore, suppose that $s_k$ is the solution of the regularization continuation Newton method (19)-(21), where the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$. Then we have the estimate
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq c_r\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|, \tag{27}$$
where the positive constant $c_r$ satisfies $0 < c_r < 1$.

Proof.
From equations (19)-(20), we have
$$-J(x_k)\, s_k + \mu_k\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k). \tag{28}$$
Thus, from equation (28), we obtain
$$\|J(x_k)\, s_k + F(x_k)\| = \left\| \mu_k\, s_k + \frac{1}{1 + \Delta t_k}\, F(x_k) \right\| = \left\| \frac{\mu_k\, \Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} F(x_k) + \frac{1}{1 + \Delta t_k}\, F(x_k) \right\| \leq \frac{1}{1 + \Delta t_k} \left( \Delta t_k \left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} \right\| + 1 \right) \|F(x_k)\|. \tag{29}$$
According to the definition of the induced matrix norm [13], we have
$$\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} \right\| = \max_{z \neq 0} \frac{\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} z \right\|}{\|z\|} = \max_{y \neq 0} \frac{\|y\|}{\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right) y \right\|} = \max_{\|y\| = 1} \frac{1}{\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right) y \right\|}. \tag{30}$$
On the other hand, when $\|y\| = 1$, from the nonsingularity assumption (26) on $J(x_k)$, we have
$$\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right) y \right\| \geq \frac{1}{\mu_k}\, \|J(x_k)\, y\| - \|y\| \geq \frac{m}{\mu_k} - 1. \tag{31}$$
Thus, from the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$ and equations (30)-(31), we have
$$\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} \right\| \leq \frac{\mu_k}{m - \mu_k} \leq \frac{c_{\varepsilon}}{m - c_{\varepsilon}}. \tag{32}$$
By substituting inequality (32) into inequality (29), we have
$$\|J(x_k)\, s_k + F(x_k)\| \leq \frac{1}{1 + \Delta t_k} \left( 1 + \frac{c_{\varepsilon}}{m - c_{\varepsilon}}\, \Delta t_k \right) \|F(x_k)\|,$$
that is to say,
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq \left( \frac{m - 2c_{\varepsilon}}{m - c_{\varepsilon}} \right) \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{33}$$
Set $c_r = (m - 2c_{\varepsilon})/(m - c_{\varepsilon})$ in the above inequality. Then we obtain the estimate (27). ⊓⊔

In order to prove that the sequence $\{\|F(x_k)\|\}$ converges to zero as $k$ tends to infinity, we also need to estimate a lower bound of the time-stepping size $\Delta t_k$.

Lemma 2
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and its Jacobian $J$ is Lipschitz continuous, that is to say, there exists a positive number $L$ such that
$$\|J(x) - J(y)\| \leq L\, \|x - y\|, \quad \forall x, y \in \Re^n. \tag{34}$$
Furthermore, suppose that the sequence $\{x_k\}$ is generated by Algorithm 1 and that the nonsingularity condition (26) on $J(x_k)$ holds. Then, when the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$, there exists a positive number $\delta_{\Delta t}$ such that
$$\Delta t_k \geq \delta_{\Delta t} > 0, \quad k = 0, 1, 2, \ldots, \tag{35}$$
where $\Delta t_k$ is adaptively adjusted by formulas (24)-(25).

Proof.
From the Lipschitz continuity (34) of $J(x)$, we have
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| = \left\| \int_0^1 J(x_k + t\, s_k)\, s_k\, dt - J(x_k)\, s_k \right\| = \left\| \int_0^1 \left( J(x_k + t\, s_k) - J(x_k) \right) s_k\, dt \right\| \leq \int_0^1 \|J(x_k + t\, s_k) - J(x_k)\|\, \|s_k\|\, dt \leq \int_0^1 L\, \|s_k\|^2\, t\, dt = \frac{L}{2}\, \|s_k\|^2. \tag{36}$$
On the other hand, from equations (19)-(20), we have
$$\|s_k\| = \frac{\Delta t_k}{1 + \Delta t_k} \left\| (\mu_k I - J(x_k))^{-1} F(x_k) \right\| \leq \frac{\Delta t_k}{1 + \Delta t_k} \left\| (\mu_k I - J(x_k))^{-1} \right\| \|F(x_k)\|. \tag{37}$$
Similarly to the estimate (32), from the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$ and the nonsingularity assumption (26) on $J(x_k)$, we have
$$\left\| (\mu_k I - J(x_k))^{-1} \right\| \leq \frac{1}{m - \mu_k} \leq \frac{1}{m - c_{\varepsilon}}. \tag{38}$$
Thus, from inequalities (36)-(38), we obtain
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| \leq \frac{L}{2\,(m - c_{\varepsilon})^2} \left( \frac{\Delta t_k}{1 + \Delta t_k} \right)^{2} \|F(x_k)\|^2. \tag{39}$$
From the definition (24) of $\rho_k$, the estimate (33), and inequality (39), we have
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} \leq \frac{L}{2\, c_r\, (m - c_{\varepsilon})^2}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{40}$$
According to Algorithm 1, the sequence $\{\|F(x_k)\|\}$ is monotonically decreasing. Consequently, we have $\|F(x_k)\| \leq \|F(x_0)\|$, $k = 0, 1, \ldots$. Set
$$\bar{\delta}_{\Delta t} \triangleq \frac{2\, c_r\, (m - c_{\varepsilon})^2\, \eta_1}{L\, \|F(x_0)\|}. \tag{41}$$
Assume that $K$ is the first index such that $\Delta t_K \leq \bar{\delta}_{\Delta t}$. Then, from inequalities (40)-(41), we obtain $|\rho_K - 1| \leq \eta_1$. Consequently, $\Delta t_{K+1}$ will be greater than $\Delta t_K$ according to the adaptive adjustment scheme (25). Set $\delta_{\Delta t} = \min\{\Delta t_K, \bar{\delta}_{\Delta t}\}$. Then we have $\Delta t_k \geq \delta_{\Delta t}$, $k = 0, 1, 2, \ldots$. ⊓⊔

By using the estimates of Lemma 1 and Lemma 2, we can prove that the sequence $\{\|F(x_k)\|\}$ converges to zero as $k$ tends to infinity.

Theorem 1
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and its Jacobian $J$ satisfies the Lipschitz condition (34). Furthermore, suppose that the sequence $\{x_k\}$ is generated by Algorithm 1 and that $J(x_k)$ satisfies the nonsingularity assumption (26). Then, when the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$, we have
$$\liminf_{k \to \infty}\, \|F(x_k)\| = 0. \tag{42}$$

Proof.
According to Algorithm 1 and inequality (40), there exists an infinite subsequence $\{x_{k_l}\}$ of accepted iterates such that
$$\frac{\|F(x_{k_l})\| - \|F(x_{k_l + 1})\|}{\|F(x_{k_l})\| - \|F(x_{k_l}) + J(x_{k_l})\, s_{k_l}\|} \geq \eta_a, \quad l = 0, 1, 2, \ldots. \tag{43}$$
From inequalities (27) and (43), we have
$$\|F(x_{k_l})\| - \|F(x_{k_l + 1})\| \geq \eta_a\, c_r\, \frac{\Delta t_{k_l}}{1 + \Delta t_{k_l}}\, \|F(x_{k_l})\|. \tag{44}$$
By rearranging inequality (44) and combining it with the conclusion (35), we have
$$\|F(x_{k_l + 1})\| \leq \left( 1 - \eta_a\, c_r\, \frac{\Delta t_{k_l}}{1 + \Delta t_{k_l}} \right) \|F(x_{k_l})\| \leq \left( 1 - \eta_a\, c_r\, \frac{\delta_{\Delta t}}{1 + \delta_{\Delta t}} \right) \|F(x_{k_l})\|. \tag{45}$$
Therefore, from inequality (45) and $0 \leq q \triangleq 1 - \eta_a\, c_r\, \delta_{\Delta t}/(1 + \delta_{\Delta t}) < 1$, we obtain $\lim_{l \to \infty} \|F(x_{k_l})\| = 0$. That is to say, the result (42) is proved. ⊓⊔
Under the standard nonsingularity assumption (26) on $J(x^{*})$ and the local Lipschitz continuity (34), we now analyze the local superlinear convergence rate of Algorithm 1 near the solution $x^{*}$. For convenience, we define the neighbourhood of $x^{*}$ as $B_{\delta}(x^{*}) = \{x : \|x - x^{*}\| \leq \delta\}$.

Theorem 2
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $F(x^{*}) = 0$. Furthermore, suppose that $J$ satisfies the local Lipschitz continuity (34) near $x^{*}$ and the nonsingularity condition (26) when $x \in B_{\delta}(x^{*})$. Then, when the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$, there exists a neighborhood $B_r(x^{*})$ of $x^{*}$ such that the sequence $\{x_k\}$ generated by Algorithm 1 with $x_0 \in B_r(x^{*})$ converges to $x^{*}$ superlinearly.

Proof.
The framework of the proof can be roughly described as follows. Firstly, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ linearly when $x_0$ is close enough to $x^{*}$. Then, we prove $\lim_{k \to \infty} \Delta t_k = \infty$. Finally, we prove that the search step $s_k$ approximates the Newton step $s_k^{N}$; consequently, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly.

Firstly, similarly to the estimate (32), from the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$, we obtain
$$\left\| (\mu_k I - J(x_k))^{-1} \right\| \leq \frac{1}{m - c_{\varepsilon}}, \quad \forall x_k \in B_{\delta}(x^{*}), \; k = 0, 1, 2, \ldots. \tag{46}$$
We denote $e_k = x_k - x^{*}$. From equations (19)-(20) and $F(x^{*}) = 0$, we have
$$e_{k+1} = e_k + s_k = e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} \left( F(x_k) - F(x^{*}) \right) = e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} \int_0^1 J(x^{*} + t\, e_k)\, e_k\, dt. \tag{47}$$
By rearranging equation (47), we obtain
$$e_{k+1} = \frac{1}{1 + \Delta t_k}\, e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} \int_0^1 \left( J(x^{*} + t\, e_k) - J(x_k) + \mu_k I \right) e_k\, dt.$$
By using the Lipschitz continuity (34) of $J$, the estimate (46), and the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$, we have
$$\|e_{k+1}\| \leq \frac{1}{1 + \Delta t_k}\, \|e_k\| + \frac{\Delta t_k}{1 + \Delta t_k}\, \frac{1}{m - \mu_k} \left( \mu_k + \frac{L}{2}\, \|e_k\| \right) \|e_k\| = \frac{1 + \frac{1}{m - \mu_k} \left( \mu_k + \frac{L}{2} \|e_k\| \right) \Delta t_k}{1 + \Delta t_k}\, \|e_k\| \leq \frac{1 + \frac{1}{m - c_{\varepsilon}} \left( c_{\varepsilon} + \frac{L}{2} \|e_k\| \right) \Delta t_k}{1 + \Delta t_k}\, \|e_k\|. \tag{48}$$
We denote
$$q_k \triangleq \frac{1 + \frac{1}{m - c_{\varepsilon}} \left( c_{\varepsilon} + \frac{L}{2} \|e_k\| \right) \Delta t_k}{1 + \Delta t_k}, \tag{49}$$
and select $x_0 \in B_{\delta}(x^{*})$ such that
$$\|e_0\| < \frac{m - 2c_{\varepsilon}}{L}. \tag{50}$$
Set $r = \min\{\delta, (m - 2c_{\varepsilon})/L\}$. When $x_0 \in B_r(x^{*})$, from equations (48)-(50) and the assumption $c_{\varepsilon} < 0.5\, m$, by induction we have
$$\|e_{k+1}\| \leq q_k\, \|e_k\|, \quad q_k < \frac{1 + \frac{m}{2(m - c_{\varepsilon})}\, \Delta t_k}{1 + \Delta t_k} < 1, \quad k = 0, 1, \ldots. \tag{51}$$
It is not difficult to see that $f(t) \triangleq (1 + \alpha t)/(1 + t)$ is monotonically decreasing in $t$ when $0 \leq \alpha < 1$. Thus, from the estimate (35) of the time-stepping size $\Delta t_k$ and inequality (51), we obtain
$$\|e_{k+1}\| \leq q_k\, \|e_k\| \leq q\, \|e_k\|, \quad q \triangleq \frac{1 + \frac{m}{2(m - c_{\varepsilon})}\, \delta_{\Delta t}}{1 + \delta_{\Delta t}} < 1.$$
Therefore, we have
$$\|e_{k+1}\| \leq q^{\,k+1}\, \|e_0\| \to 0, \quad \text{when } k \to \infty. \tag{52}$$
That is to say, $\lim_{k \to \infty} x_k = x^{*}$.

Secondly, from equations (19)-(20) and inequality (46), we have
$$\|s_k\| = \frac{\Delta t_k}{1 + \Delta t_k} \left\| (\mu_k I - J(x_k))^{-1} F(x_k) \right\| \leq \frac{1}{m - c_{\varepsilon}}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{53}$$
Similarly to the estimate (40), from the definition (24) of $\rho_k$ and inequalities (33) and (53), we have
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{L}{2\, c_r\, (m - c_{\varepsilon})^2}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{L}{2\, c_r\, (m - c_{\varepsilon})^2}\, \|F(x_k)\|. \tag{54}$$
Since $\lim_{k \to \infty} x_k = x^{*}$ and $F(x^{*}) = 0$, we can select a sufficiently large number $K$ such that
$$\|F(x_k)\| \leq \frac{2\, c_r\, (m - c_{\varepsilon})^2\, \eta_1}{L}, \quad \text{when } k \geq K. \tag{55}$$
From inequalities (54)-(55), we have $|\rho_k - 1| \leq \eta_1$ when $k \geq K$. This means $\Delta t_{k+1} = \gamma_1\, \Delta t_k$ when $k \geq K$, according to the time-stepping scheme (25). That is to say, we have
$$\lim_{k \to \infty} \Delta t_k = \infty. \tag{56}$$
Finally, since $\lim_{k \to \infty} \Delta t_k = \infty$, we can select a sufficiently large number $K_{\mu}$ such that $1/\Delta t_k < c_{\varepsilon}$ when $k \geq K_{\mu}$. Consequently, from the definition (22) of the regularization parameter $\mu_k$, we obtain $\mu_k = 1/\Delta t_k$ when $k \geq K_{\mu}$. By substituting this into inequality (48), we have
$$\frac{\|e_{k+1}\|}{\|e_k\|} \leq \frac{1}{1 + \Delta t_k} + \frac{\Delta t_k}{1 + \Delta t_k}\, \frac{1}{m - \mu_k} \left( \mu_k + \frac{L}{2}\, \|e_k\| \right) = \frac{1}{1 + \Delta t_k} + \frac{1}{m - 1/\Delta t_k} \left( \frac{1}{\Delta t_k} + \frac{L}{2}\, \|e_k\| \right) \frac{\Delta t_k}{1 + \Delta t_k}, \quad \text{when } k \geq K_{\mu}. \tag{57}$$
From equations (52) and (56), we know $\lim_{k \to \infty} \|e_k\| = 0$ and $\lim_{k \to \infty} \Delta t_k = \infty$, respectively. Therefore, by combining them with inequality (57), we obtain
$$\lim_{k \to \infty} \frac{\|e_{k+1}\|}{\|e_k\|} = 0.$$
That is to say, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly. ⊓⊔

For real-world problems, the singularity of $J(x)$ may arise from a linear conservation law, such as the conservation of mass or the conservation of charge [29,42,43]; in this case, the Jacobian matrix $J(x)$ is singular. Similarly to the standard assumption for nonlinear dynamical systems, we suppose that $J(x)$ satisfies the one-sided Lipschitz condition (see p. 303, [8] or p. 180, [16])
$$y^{T} J(x)\, y \leq -\nu\, \|y\|^2, \quad \text{for } y \in S_c = \{ y \,|\, c^{T} y = 0 \}, \; \nu > 0, \tag{58}$$
where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$. The positive number $\nu$ is called the one-sided Lipschitz constant. Under the one-sided Lipschitz condition (58), the matrix $(\mu I - J(x))$ is nonsingular when $\mu > 0$, as the following property shows.

Property 2
Assume that $J(x)$ satisfies the one-sided Lipschitz condition (58) in the subspace $S_c = \{y \,|\, c^{T} y = 0\}$, where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$. Then the matrix $(\mu I - J(x))$ is nonsingular for every $\mu > 0$, and the solution $s_k$ of equations (19)-(20) satisfies $c^{T} s_k = 0$.

Proof.
We prove the nonsingularity by contradiction. Assume that the matrix $(\mu I - J(x))$ is singular. Then there exists a nonzero vector $y$ such that
$$(\mu I - J(x))\, y = 0. \tag{59}$$
Consequently, from the assumption $c^{T} F(x) = 0$, $\forall x \in \Re^n$, we have $c^{T} J(x) = 0$ and hence
$$c^{T} y = \frac{1}{\mu}\, c^{T} J(x)\, y = \frac{1}{\mu} \left( c^{T} J(x) \right) y = 0,$$
so that $y \in S_c$. Thus, from the one-sided Lipschitz condition (58) and $\mu > 0$, $\nu > 0$, we obtain
$$y^{T} (\mu I - J(x))\, y = \mu\, \|y\|^2 - y^{T} J(x)\, y \geq (\mu + \nu)\, \|y\|^2 > 0,$$
which contradicts the assumption (59). Therefore, the matrix $(\mu I - J(x))$ is nonsingular.

From equations (19)-(20), we have
$$(\mu_k I - J(x_k))\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k). \tag{60}$$
By combining this with the assumption $c^{T} F(x_k) = 0$, we obtain
$$c^{T} (\mu_k I - J(x_k))\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, c^{T} F(x_k) = 0. \tag{61}$$
Therefore, by substituting $c^{T} J(x_k) = 0$ into equation (61), we obtain
$$\mu_k\, c^{T} s_k = c^{T} J(x_k)\, s_k = \left( c^{T} J(x_k) \right) s_k = 0. \tag{62}$$
That is to say, $c^{T} s_k = 0$. ⊓⊔
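Property 2 is easy to sanity-check numerically; in the following MATLAB sketch (ours), the toy residual is chosen so that c^T F(x) = 0 holds identically for c = (1, 1)^T:

% For any residual with c'*F(x) = 0, the step (19)-(20) keeps c'*s_k = 0.
c  = [1; 1];
F  = @(x) [x(2) - x(1)^2; x(1)^2 - x(2)];   % c'*F(x) = 0 for all x
J  = @(x) [-2*x(1), 1; 2*x(1), -1];         % singular Jacobian, c'*J(x) = 0
xk = [2; 0.5];  dt = 1;  mu = 1e-6;
sk = (dt/(1 + dt)) * ((mu*eye(2) - J(xk)) \ F(xk));
fprintf('c''*s_k = %.2e (zero up to rounding)\n', c'*sk);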
Similarly to the estimate (27) of $\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|$ for a nonsingular Jacobian matrix $J(x_k)$, we also have a lower bound when $J(x_k)$ is singular, $k = 0, 1, 2, \ldots$.

Lemma 3 Assume that $J(x_k)$ satisfies the one-sided Lipschitz condition (58) in the subspace $S_c$, where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$. Let $s_k$ be the solution of equations (19)-(21). Then we have
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq c_s\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|, \tag{63}$$
where the positive constant $c_s$ satisfies $0 < c_s < 1$.

Proof.
From Property 2, we know that the matrix $(\mu_k I - J(x_k))$ is nonsingular and that $s_k$ satisfies $c^{T} s_k = 0$, so $s_k \in S_c$. From equations (19)-(20) and the Cauchy-Schwarz inequality $|x^{T} y| \leq \|x\|\, \|y\|$, we have
$$\mu_k\, \|s_k\|^2 - s_k^{T} J(x_k)\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, s_k^{T} F(x_k) \leq \frac{\Delta t_k}{1 + \Delta t_k}\, \|s_k\|\, \|F(x_k)\|. \tag{64}$$
By substituting the one-sided Lipschitz condition (58) into equation (64), we obtain
$$(\mu_k + \nu)\, \|s_k\|^2 \leq \mu_k\, \|s_k\|^2 - s_k^{T} J(x_k)\, s_k \leq \frac{\Delta t_k}{1 + \Delta t_k}\, \|s_k\|\, \|F(x_k)\|.$$
Consequently, we have
$$\|s_k\| \leq \frac{1}{\mu_k + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{65}$$
From equations (19)-(20) and (65), we have
$$\|F(x_k) + J(x_k)\, s_k\| = \left\| \mu_k\, s_k + \frac{1}{1 + \Delta t_k}\, F(x_k) \right\| \leq \mu_k\, \|s_k\| + \frac{1}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{\mu_k}{\mu_k + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| + \frac{1}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{66}$$
From the definition (22) of the parameter $\mu_k$, we know $\mu_k \leq c_{\varepsilon}$. By substituting this into inequality (66), we obtain
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq \frac{\nu}{\mu_k + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \geq \frac{\nu}{c_{\varepsilon} + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{67}$$
Set $c_s = \nu/(c_{\varepsilon} + \nu)$. Then, from inequality (67), we obtain the estimate (63). ⊓⊔

Similarly to the lower bound (35) on the time-stepping size $\Delta t_k$ for a nonsingular Jacobian matrix $J(x_k)$, we also have a lower bound when $J(x_k)$ is singular, $k = 0, 1, 2, \ldots$.

Lemma 4
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $J(x)$ satisfies the Lipschitz continuity (34) and the one-sided Lipschitz condition (58). Suppose that the sequence $\{x_k\}$ is generated by Algorithm 1. Then there exists a positive number $\delta_s$ such that
$$\Delta t_k \geq \delta_s > 0, \quad k = 0, 1, 2, \ldots, \tag{68}$$
where $\Delta t_k$ is adaptively updated by the trust-region scheme (24)-(25).

Proof.
From the Lipschitz continuity (34) of $J(x)$, we have
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| = \left\| \int_0^1 \left( J(x_k + t\, s_k) - J(x_k) \right) s_k\, dt \right\| \leq \int_0^1 \|J(x_k + t\, s_k) - J(x_k)\|\, \|s_k\|\, dt \leq \int_0^1 L\, \|s_k\|^2\, t\, dt = \frac{L}{2}\, \|s_k\|^2. \tag{69}$$
By substituting the estimate (65) of $s_k$ into inequality (69), we obtain
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| \leq \frac{L}{2\,(\mu_k + \nu)^2} \left( \frac{\Delta t_k}{1 + \Delta t_k} \right)^{2} \|F(x_k)\|^2. \tag{70}$$
Thus, from the definition (24) of $\rho_k$ and inequalities (67) and (70), we obtain
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} \leq \frac{L}{2\, \nu\, (\mu_k + \nu)}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{L}{2\, \nu^2}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{71}$$
According to Algorithm 1, the sequence $\{\|F(x_k)\|\}$ is monotonically decreasing. Consequently, we have $\|F(x_k)\| \leq \|F(x_0)\|$, $k = 0, 1, \ldots$. Set
$$\bar{\delta}_s = \frac{2\, \nu^2\, \eta_1}{L\, \|F(x_0)\|}. \tag{72}$$
Assume that $K$ is the first index such that $\Delta t_K \leq \bar{\delta}_s$. Then, from inequalities (71)-(72), we obtain $|\rho_K - 1| \leq \eta_1$. Consequently, $\Delta t_{K+1}$ will be greater than $\Delta t_K$ according to the time-stepping scheme (25). Set $\delta_s = \min\{\Delta t_K, \bar{\delta}_s\}$. Then we have $\Delta t_k \geq \delta_s$, $k = 0, 1, 2, \ldots$. ⊓⊔

Now, from Lemma 3 and Lemma 4, the sequence $\{\|F(x_k)\|\}$ converges to zero as $k$ tends to infinity; the proof is similar to that of Theorem 1, so we state the result as the following Theorem 3 and omit its proof.

Theorem 3
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $J(x)$ satisfies the Lipschitz continuity (34) and the one-sided Lipschitz condition (58). Suppose that the sequence $\{x_k\}$ is generated by Algorithm 1. Then we have
$$\liminf_{k \to \infty}\, \|F(x_k)\| = 0. \tag{73}$$

Theorem 4
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $J(x)$ satisfies the Lipschitz continuity (34) and the one-sided Lipschitz condition (58). Furthermore, suppose that the sequence $\{x_k\}$ is generated by Algorithm 1 and that a subsequence $\{x_{k_i}\}$ converges to $x^{*}$. Then the whole sequence $\{x_k\}$ converges to $x^{*}$ superlinearly.

Proof.
The framework of the proof can be roughly described as follows. Firstly, we prove $\lim_{k \to \infty} \Delta t_k = \infty$. Then, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ linearly. Finally, we prove that the search step $s_k$ approximates the Newton step $s_k^{N}$; consequently, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly.

From Property 2, we know that the matrix $(\mu_k I - J(x_k))$ is nonsingular and that $s_k$ satisfies $c^{T} s_k = 0$, where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$, and $s_k$ is the solution of equations (19)-(20).

Firstly, we prove that there exists an index $K$ such that $\Delta t_k$ is enlarged at every iteration when $k \geq K$; consequently, $\lim_{k \to \infty} \Delta t_k = \infty$. From the lower bound (67) and inequality (71), we have
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{L}{2\, \nu\, (\mu_k + \nu)}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{L}{2\, \nu^2}\, \|F(x_k)\|. \tag{74}$$
Since the subsequence $\{x_{k_i}\}$ converges to $x^{*}$, there exists an index $K_F$ such that
$$\|F(x_{K_F})\| \leq \frac{2\, \nu^2\, \eta_1}{L}. \tag{75}$$
Furthermore, according to Algorithm 1, the sequence $\{\|F(x_k)\|\}$ is monotonically decreasing. Consequently, we have $\|F(x_k)\| \leq \|F(x_{K_F})\|$ when $k \geq K_F$. Thus, from inequalities (74)-(75), we have
$$|\rho_k - 1| \leq \eta_1, \quad \text{when } k \geq K_F. \tag{76}$$
Consequently, according to the time-stepping scheme (25), we know that $\Delta t_{k+1} = \gamma_1\, \Delta t_k$ when $k \geq K_F$. Therefore, we obtain $\lim_{k \to \infty} \Delta t_k = \infty$.

Secondly, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ linearly. We denote
$$e_k = x_k - x^{*}. \tag{77}$$
From equations (19)-(20) and (77), we have
$$e_{k+1} = e_k + s_k = e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} F(x_k). \tag{78}$$
By rearranging equation (78), we obtain
$$(\mu_k I - J(x_k))\, e_{k+1} = (\mu_k I - J(x_k))\, e_k + \frac{\Delta t_k}{1 + \Delta t_k} \left( F(x_k) - F(x^{*}) \right) = \mu_k\, e_k - \frac{1}{1 + \Delta t_k}\, J(x_k)\, e_k + \frac{\Delta t_k}{1 + \Delta t_k} \int_0^1 \left( J(x^{*} + t\, e_k) - J(x_k) \right) e_k\, dt. \tag{79}$$
Since $c^{T} s_k = 0$ and the subsequence $\{x_{k_i}\}_{i=0}^{+\infty}$ converges to $x^{*}$, from equation (78) we have
$$c^{T} e_{k+1} = c^{T} e_k + c^{T} s_k = c^{T} e_k = \cdots = c^{T} e_{k_i} \to 0, \quad \text{when } i \to \infty.$$
That is to say, $c^{T} e_k = 0$, $k = 0, 1, \ldots$, so $e_k \in S_c$. Thus, from the one-sided Lipschitz condition (58) and the Cauchy-Schwarz inequality $|x^{T} y| \leq \|x\|\, \|y\|$, we have
$$\|e_{k+1}\| \left\| (\mu_k I - J(x_k))\, e_{k+1} \right\| \geq e_{k+1}^{T} (\mu_k I - J(x_k))\, e_{k+1} = \mu_k\, e_{k+1}^{T} e_{k+1} - e_{k+1}^{T} J(x_k)\, e_{k+1} \geq (\mu_k + \nu)\, \|e_{k+1}\|^2. \tag{80}$$
By rearranging inequality (80), we obtain
$$\|e_{k+1}\| \leq \frac{1}{\mu_k + \nu} \left\| (\mu_k I - J(x_k))\, e_{k+1} \right\|. \tag{81}$$
From the continuity of $J$ at $x^{*}$, there exist positive constants $M$ and $\epsilon$ such that $\|J(x)\| \leq M$ when $\|x - x^{*}\| < \epsilon$. Since the subsequence $\{x_{k_i}\}$ converges to $x^{*}$, there exists $K_1$ such that $\|x_{K_1} - x^{*}\| < \epsilon$. By combining this with $\lim_{k \to \infty} \Delta t_k = \infty$, we can select a sufficiently large number $K_2$ such that $\Delta t_{K_2} \geq 2M/\nu$ and $\|e_{K_2}\| \leq 0.5\, \nu/L$. Set $K_3 = \max\{K_1, K_2\}$. From equation (79) and the Lipschitz continuity (34), we have
$$\left\| (\mu_{K_3} I - J(x_{K_3}))\, e_{K_3 + 1} \right\| \leq \mu_{K_3}\, \|e_{K_3}\| + \frac{1}{1 + \Delta t_{K_3}}\, \|J(x_{K_3})\|\, \|e_{K_3}\| + \frac{\Delta t_{K_3}}{1 + \Delta t_{K_3}} \int_0^1 \|J(x^{*} + t\, e_{K_3}) - J(x_{K_3})\|\, \|e_{K_3}\|\, dt \leq \left( \mu_{K_3} + \frac{M}{1 + \Delta t_{K_3}} + \frac{L}{2}\, \|e_{K_3}\| \right) \|e_{K_3}\|.$$
By combining this with inequality (81), $\Delta t_{K_3} \geq 2M/\nu$, and $\|e_{K_3}\| \leq \nu/(2L)$, we obtain
$$\|e_{K_3 + 1}\| \leq \frac{\mu_{K_3} + \frac{M}{1 + \Delta t_{K_3}} + \frac{L}{2}\, \|e_{K_3}\|}{\mu_{K_3} + \nu}\, \|e_{K_3}\| \leq \frac{\mu_{K_3} + \frac{3}{4}\, \nu}{\mu_{K_3} + \nu}\, \|e_{K_3}\| < \|e_{K_3}\| < \epsilon. \tag{82}$$
Therefore, by induction, we obtain
$$\|e_{k+1}\| \leq \frac{\mu_k + \frac{M}{1 + \Delta t_k} + \frac{L}{2}\, \|e_k\|}{\mu_k + \nu}\, \|e_k\| \leq \frac{\mu_k + \frac{3}{4}\, \nu}{\mu_k + \nu}\, \|e_k\|, \quad \text{when } k \geq K_3. \tag{83}$$
Furthermore, from the definition (22), we know that $\mu_k \leq c_{\varepsilon}$. By substituting this into inequality (83), we have
$$\|e_{k+1}\| \leq q\, \|e_k\| \leq \cdots \leq q^{\,k - K_3 + 1}\, \|e_{K_3}\|, \quad q \triangleq \frac{c_{\varepsilon} + \frac{3}{4}\, \nu}{c_{\varepsilon} + \nu} < 1, \quad \text{when } k \geq K_3.$$
Consequently, we obtain $\lim_{k \to \infty} \|e_k\| = 0$.

Finally, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly. From $\lim_{k \to \infty} \Delta t_k = \infty$, we can select a sufficiently large number $K_{\mu}$ such that $1/\Delta t_k < c_{\varepsilon}$ when $k \geq K_{\mu}$. Thus, from the definition (22) of the regularization parameter $\mu_k$, we know $\mu_k = 1/\Delta t_k$ when $k \geq K_{\mu}$. By substituting this into inequality (83), and using $\lim_{k \to \infty} \Delta t_k = \infty$ and $\lim_{k \to \infty} \|e_k\| = 0$, we obtain
$$\frac{\|e_{k+1}\|}{\|e_k\|} \leq \frac{\mu_k + \frac{M}{1 + \Delta t_k} + \frac{L}{2}\, \|e_k\|}{\mu_k + \nu} = \frac{\frac{1}{\Delta t_k} + \frac{M}{1 + \Delta t_k} + \frac{L}{2}\, \|e_k\|}{\frac{1}{\Delta t_k} + \nu} \to 0, \quad \text{when } k \to \infty.$$
That is to say, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly. ⊓⊔
4 Numerical experiments

In this section, for some practical equilibrium problems and some classical test problems of nonlinear equations, we test the performance of Algorithm 1 (CNMTr) and compare it with the trust-region method (the built-in subroutine fsolve.m of the MATLAB environment [32,35]) and the homotopy continuation methods (HOMPACK90 [48] and NAClab [24,49,50]).

HOMPACK90 [48] is a classical homotopy method implemented in Fortran 90 for nonlinear equations, and it is very popular in engineering fields. Another state-of-the-art homotopy method is the built-in subroutine psolve.m of the NAClab environment [24,49]. Since psolve.m only solves polynomial systems, we replace psolve.m with its subroutine GaussNewton.m (the Gauss-Newton method) for non-polynomial systems. We therefore compare these two homotopy methods with Algorithm 1 as well.

We collect 26 test problems of nonlinear equations, some of which come from the equilibrium problems of chemical reactions [16,18,41,47], and some of which come from the classical test problems [8,10,28,36,37]. Brief descriptions are given in Table 2, and detailed descriptions can be found in the preprint [31] and the references therein. Their dimensions vary from 1 to 3000, and the Jacobian matrix $J(x)$ of $F(x)$ is singular for some of the test problems. The codes are executed on an HP Pavilion notebook with an Intel quad-core CPU. The termination condition is given by
$$\|F(x_{it})\|_{\infty} \leq 10^{-6}. \tag{84}$$
The numerical results are given in Table 1 and Table 3. The number of iterations of CNMTr and fsolve is illustrated by Figure 1, and the consumed time of the four methods (CNMTr, HOMPACK90, fsolve and NAClab) is illustrated by Figure 2. From Table 1 and Table 3, we find that CNMTr performs well on these test problems, while the trust-region method (fsolve) and the classical homotopy methods (HOMPACK90 and NAClab) fail to solve some of them, especially the practical problems with non-isolated singular Jacobian matrices, such as examples 1, 2, 3, 4, 6, 21 and 23. Furthermore, from Figures 1 and 2, we also find that CNMTr has the same fast convergence property as the traditional optimization method (fsolve). Detailed descriptions of the numerical results can be found in the preprint [31].

Table 1: Statistical results.

                                      CNMTr   HOMPACK90   fsolve   NAClab
number of failed problems               0         7          9       13
number of the minimum consumed time    19         0          7        0
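For readers who wish to reproduce this kind of comparison, a minimal MATLAB harness might look as follows; this is our sketch, it reuses the cnmtr function sketched after Algorithm 1, and the fsolve option names assume a reasonably recent MATLAB release:

% Compare CNMTr with fsolve on one small nonlinear system (a toy run).
F   = @(x) [x(1)^2 + x(2)^2 - 2; exp(x(1) - 1) + x(2)^3 - 2];
Jac = @(x) [2*x(1), 2*x(2); exp(x(1) - 1), 3*x(2)^2];
x0  = [3; 3];

tic;  [x1, res1, iter1] = cnmtr(F, Jac, x0, 1e-6);  t1 = toc;

opts = optimoptions('fsolve', 'SpecifyObjectiveGradient', true, ...
                    'FunctionTolerance', 1e-6, 'Display', 'off');
tic;  x2 = fsolve(@(x) deal(F(x), Jac(x)), x0, opts);  t2 = toc;

fprintf('CNMTr : %.3f s, ||F|| = %.2e (%d iterations)\n', t1, res1, iter1);
fprintf('fsolve: %.3f s, ||F|| = %.2e\n', t2, norm(F(x2)));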
Fig. 1: The number of iterations (CNMTr vs. fsolve).

Fig. 2: The consumed time (CNMTr, HOMPACK90, fsolve and NAClab).

5 Conclusions

In this article, we consider the continuation Newton method with the new residual trust-region time-stepping scheme (CNMTr) for nonlinear equations, and we analyze its local and global convergence under standard assumptions. Finally, for some classical test problems, we compare it with the classical homotopy methods (HOMPACK90 and psolve.m of NAClab) and the traditional optimization method (fsolve.m). Numerical results show that CNMTr preserves the robust convergence property of the classical homotopy methods and the fast convergence property of the traditional optimization method. From our point of view, the explicit continuation Newton method (Algorithm 1) is worth investigating further as a special continuation method. Furthermore, we will apply it to global optimization problems and report the numerical results in the near future.
Acknowledgments
This work was supported in part by Grant 61876199 from the National Natural Science Foundation of China and Grant YJCB2011003HI from the Innovation Research Program of Huawei Technologies Co., Ltd.
References
1. Allgower, E.L., Georg, K.: Introduction to Numerical Continuation Methods, SIAM, Philadelphia (2003)
2. Ascher, U.M., Petzold, L.R.: Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, SIAM, Philadelphia (1998)
3. Axelsson, O., Sysala, S.: Continuation Newton methods. Comput. Math. Appl. 70, 2621-2637 (2015)
4. Branin, F.H.: Widely convergent method for finding multiple solutions of simultaneous nonlinear equations. IBM J. Res. Dev. 16, 504-521 (1972)
5. Brenan, K.E., Campbell, S.L., Petzold, L.R.: Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations, SIAM, Philadelphia (1996)
6. Conn, A.R., Gould, N., Toint, Ph.L.: Trust-Region Methods, SIAM, Philadelphia (2000)
7. Davidenko, D.F.: On a new method of numerical solution of systems of nonlinear equations (in Russian). Dokl. Akad. Nauk SSSR 88, 601-602 (1953)
8. Deuflhard, P.: Newton Methods for Nonlinear Problems: Affine Invariance and Adaptive Algorithms, Springer, Berlin (2004)
9. Deuflhard, P., Pesch, H.J., Rentrop, P.: A modified continuation method for the numerical solution of nonlinear two-point boundary value problems by shooting techniques. Numer. Math. 26, 327-343 (1975)
10. Dennis, J.E., Schnabel, R.B.: Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Philadelphia (1996)
11. Dennis, J.E., Moré, J.J.: A characterization of superlinear convergence and its application to quasi-Newton methods. Math. Comput. 28, 549-560 (1974)
12. Doedel, E.J.: Lecture notes in numerical analysis of nonlinear equations. In: Krauskopf, B., Osinga, H.M., Galán-Vioque, J. (eds.): Numerical Continuation Methods for Dynamical Systems, pp. 1-50, Springer, Berlin (2007)
13. Golub, G.H., Van Loan, C.F.: Matrix Computations (4th ed.), The Johns Hopkins University Press, Baltimore (2013)
14. Griewank, A.: On solving nonlinear equations with simple singularities or nearly singular solutions. SIAM Rev. 27, 537-563 (1985)
15. Hairer, E., Nørsett, S.P., Wanner, G.: Solving Ordinary Differential Equations I, Nonstiff Problems (2nd ed.), Springer, Berlin (1993)
16. Hairer, E., Wanner, G.: Solving Ordinary Differential Equations II, Stiff and Differential-Algebraic Problems (2nd ed.), Springer, Berlin (1996)
17. Hansen, P.C.: Regularization Tools: A MATLAB package for analysis and solution of discrete ill-posed problems. Numer. Algorithms 6, 1-35 (1994)
18. Hiebert, K.L.: An evaluation of mathematical software that solves systems of nonlinear equations. ACM Trans. Math. Softw. 8, 5-20 (1982)
19. Higham, D.J.: Trust region algorithms and timestep selection. SIAM J. Numer. Anal. 37, 194-210 (1999)
20. Kalaba, R.F., Zagustin, E., Holbrow, W., Huss, R.: A modification of Davidenko's method for nonlinear systems. Comput. Math. Appl. 3, 315-319 (1977)
21. Kelley, C.T., Keyes, D.E.: Convergence analysis of pseudo-transient continuation. SIAM J. Numer. Anal. 35, 508-523 (1998)
22. Kelley, C.T.: Solving Nonlinear Equations with Newton's Method, SIAM, Philadelphia (2003)
23. Kelley, C.T.: Numerical methods for nonlinear equations. Acta Numer. 27, 207-287 (2018)
24. Lee, T.L., Li, T.Y., Tsai, C.H.: HOM4PS-2.0: a software package for solving polynomial systems by the polyhedral homotopy continuation method. Computing 83, 109-133 (2008)
25. Levenberg, K.: A method for the solution of certain problems in least squares. Q. Appl. Math. 2, 164-168 (1944)
26. Liao, S.J.: Homotopy Analysis Method in Nonlinear Differential Equations, Springer, Berlin (2012)
27. Logan, S.R.: Fundamentals of Chemical Kinetics, Longman Group Limited, London (1996)
28. Lukšan, L.: Inexact trust region method for large sparse systems of nonlinear equations. J. Optim. Theory Appl. 81, 569-590 (1994)
29. Luo, X.-L.: A trajectory-following method for solving the steady state of chemical reaction rate equations. J. Theor. Comput. Chem. 8, 1025-1044 (2009)
30. Luo, X.-L.: A second-order pseudo-transient method for steady-state problems. Appl. Math. Comput. 216, 1752-1762 (2010)
31. Luo, X.-L., Xiao, H., Lv, J.-H.: Continuation Newton method with the trust-region time-stepping scheme. arXiv preprint 2006.02634, available at http://arxiv.org/abs/2006.02634, Beijing (June 2020)
32. MATLAB 9.6.0 (R2019a), The MathWorks Inc. (2019)
33. Marquardt, D.: An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 11, 431-441 (1963)
34. Meintjes, K., Morgan, A.P.: Chemical equilibrium systems as numerical test problems. ACM Trans. Math. Softw. 16, 143-151 (1990)
35. Moré, J.J.: The Levenberg-Marquardt algorithm: implementation and theory. In: Watson, G.A. (ed.): Numerical Analysis, Lecture Notes in Mathematics, vol. 630, pp. 105-116, Springer, Berlin (1978)
36. Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 17-41 (1981)
37. Nocedal, J., Wright, S.J.: Numerical Optimization, Springer, Berlin (1999)
38. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables, SIAM, Philadelphia (2000)
39. Powell, M.J.D.: Convergence properties of a class of minimization algorithms. In: Mangasarian, O.L., Meyer, R.R., Robinson, S.M. (eds.): Nonlinear Programming 2, pp. 1-27, Academic Press, New York (1975)
40. Pontryagin, L.S.: Ordinary Differential Equations, Higher Education Press, Beijing (2006)
41. Robertson, H.H.: The solution of a set of reaction rate equations. In: Walsh, J. (ed.): Numerical Analysis, an Introduction, pp. 178-182, Academic Press, New York (1966)
42. Shampine, L.F.: Linear conservation laws for ODEs. Comput. Math. Appl. 35, 45-53 (1998)
43. Shampine, L.F.: Conservation laws and the numerical solution of ODEs, II. Comput. Math. Appl. 38, 61-72 (1999)
46. Tanabe, K.: Continuous Newton-Raphson method for solving an underdetermined system of nonlinear equations. Nonlinear Anal. 3, 495-503 (1979)
48. Watson, L.T., Sosonkina, M., Melville, R.C., Morgan, A.P., Walker, H.F.: Algorithm 777: HOMPACK90: a suite of Fortran 90 codes for globally convergent homotopy algorithms. ACM Trans. Math. Softw. 23, 514-549 (1997)
49. Zeng, Z., Li, T.Y.: NAClab: a Matlab toolbox for numerical algebraic computation. ACM Commun. Comput. Algebra 47, 170-173 (2013). Software available at http://homepages.neiu.edu/~zzeng/naclab.html
50. Zeng, Z.G.: A Newton's iteration converges quadratically to nonisolated solutions too. Preprint, Department of Mathematics, Northeastern Illinois University, Chicago (2019), http://homepages.neiu.edu/~zzeng

Table 2: Test problems.

No.        dimension    problem description
Exam 1
Exam 2
Exam 3     n = 20       the pollution problem [47]
Exam 4
Exam 5     n = 1        F(x) = sin(x) - x (p. 279, [37])
Exam 6     n = 2        a system with exp(x + y) and sin(x + y) terms
Exam 7
Exam 8
Exam 9
Exam 10
Exam 11
Exam 12
Exam 13
Exam 14
Exam 15    n = 10       the tridiagonal system [28]
Exam 16    n = 10       the discrete boundary-value problem [28]
Exam 17    n = 100      Broyden tridiagonal problem [28]
Exam 18
Exam 19
Exam 20    n = 2        f_1(x) = x_1^2 + x_2^2 - 2, f_2(x) = exp(x_1 - 1) + x_2^3 - 2
Exam 21
Exam 22
Exam 23
Exam 24    n = 10       Brown almost linear function [36]
Exam 25
Exam 26
Table 3: Numerical results.

Problem   CNMTr                  HOMPACK90              fsolve                 NAClab (psolve)
          CPU (s)  ‖F(x_it)‖∞    CPU (s)  ‖F(x_it)‖∞    CPU (s)  ‖F(x_it)‖∞    CPU (s)  ‖F(x_it)‖∞