Continuation Newton Methods with the Residual Trust-region Time-stepping Scheme for Nonlinear Equations
Xin-long Luo ∗ · Hang Xiao · Jia-hui Lv
Received: date / Accepted: date
Abstract
For nonlinear equations, homotopy methods (continuation methods) are popular in engineering fields because their convergence regions are large and they are quite reliable for finding a solution. The disadvantage of the classical homotopy methods is that they are time-consuming, since they need to solve many auxiliary nonlinear systems during the intermediate continuation process. In order to overcome this shortcoming, we consider a special explicit continuation Newton method with a new time-stepping scheme for nonlinear equations. According to our numerical experiments, it is more robust and faster at finding the required solution of a real-world problem than the traditional optimization method (the built-in subroutine fsolve.m of the MATLAB environment) and the homotopy continuation methods (HOMPACK90 and NAClab). Furthermore, we analyze the global convergence and the local superlinear convergence of the new method.
Keywords
Continuation Newton method · trust-region technique · nonlinear equations · homotopy continuation method · equilibrium problem

Mathematics Subject Classification (2010)

Xin-long Luo (corresponding author), Hang Xiao, Jia-hui Lv
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, P. O. Box 101, Xitucheng Road No. 10, Haidian District, 100876, Beijing, China
E-mail: [email protected]
1 Introduction

In engineering fields, we often need to solve for the equilibrium state of a differential equation [26,34,41,47] of the form
$$\frac{dx}{dt} = F(x), \quad x(t_0) = x_0. \tag{1}$$
That is to say, we need to solve the following system of nonlinear equations:
$$F(x) = 0, \tag{2}$$
where $F: \Re^n \to \Re^n$ is a vector function. For the nonlinear system (2), there are many popular traditional optimization methods [6,10,19,37] and classical homotopy continuation methods [1,12,38,48].

For the traditional optimization methods, such as the trust-region methods and the line search methods, the solution $x^{*}$ of the nonlinear system (2) is found by solving the equivalent nonlinear least-squares problem
$$\min_{x \in \Re^n} f(x) = \tfrac{1}{2}\,\|F(x)\|^2, \tag{3}$$
where $\|\cdot\|$ denotes the Euclidean vector norm or its induced matrix norm. Generally speaking, the traditional optimization methods based on the merit function (3) are efficient for large-scale problems, since they have local superlinear convergence near the solution $x^{*}$ [6,37].

However, the line search methods and the trust-region methods are apt to stagnate at a local minimum point $x^{*}$ of problem (3) when the Jacobian matrix $J(x^{*})$ of $F(x)$ at $x^{*}$ is singular or nearly singular, where $J(x) = \partial F(x)/\partial x$ (see p. 304, [37]). Furthermore, the termination condition
$$\|\nabla f(x_k)\| = \|J(x_k)^{T} F(x_k)\| < \varepsilon \tag{4}$$
may lead these methods to stop early, far away from the local minimum $x^{*}$. This can be illustrated as follows. Consider
$$F(x) = Ax = 0, \quad A = \begin{bmatrix} 1 & 0 \\ 0 & 10^{-8} \end{bmatrix}, \tag{5}$$
which has the unique solution $x^{*} = (0, 0)^{T}$. Set $\varepsilon = 10^{-6}$; then the traditional optimization methods stop early, far away from $x^{*}$, whenever $x_k = (0, c)^{T}$ with $1 \ll |c| < 10^{10}$, since $\|\nabla f(x_k)\| = 10^{-16}\,|c| < \varepsilon$ while $\|x_k - x^{*}\| = |c|$ is large.
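The following small MATLAB check (our illustration, not code from the paper) makes this early-stopping phenomenon concrete for the data in (5):

% Early termination of the merit-function approach on example (5):
% the gradient norm ||J'*F|| is tiny although x_k is far from x* = (0,0)'.
A  = [1, 0; 0, 1e-8];          % nearly singular Jacobian J(x) = A
xk = [0; 1e3];                 % far away from the solution x*
Fk = A*xk;                     % F(x_k) = A*x_k
grad_norm = norm(A'*Fk);       % ||grad f(x_k)|| = ||J(x_k)'*F(x_k)|| = 1e-13
fprintf('||grad f|| = %.1e, ||x_k - x*|| = %.1e\n', grad_norm, norm(xk));
% With epsilon = 1e-6, the test (4) stops here although ||x_k - x*|| = 1e3.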
For the classical homotopy methods, the solution $x^{*}$ of the nonlinear system (2) is found by constructing the homotopy function
$$H(x, \lambda) = (1 - \lambda)\, G(x) + \lambda\, F(x), \tag{6}$$
and attempting to trace the implicitly defined curve $(x(t), \lambda(t)) \in H^{-1}(0)$ from the starting point $(x_0, 0)$ to a solution $(x^{*}, 1)$ by predictor-corrector methods [1,12], where the zero point $x_0$ of the artificial smooth function $G(x)$ is known. Generally speaking, the homotopy continuation methods are more reliable than the merit-function methods, and they are very popular in engineering fields [26]. The disadvantage of the classical homotopy methods is that they require significantly more function and derivative evaluations and linear algebra operations than the merit-function methods, since they need to solve many auxiliary nonlinear systems during the intermediate continuation process.

In order to overcome this shortcoming of the traditional homotopy methods, we consider a special continuation method based on the Newton flow [3,4,7,46]
$$\frac{dx(t)}{dt} = -J(x)^{-1} F(x), \quad x(t_0) = x_0, \tag{7}$$
and construct a special ODE method with a new time-stepping scheme, based on the trust-region technique, to trace the trajectory of the Newton flow (7) and thereby obtain the required solution $x^{*}$ of the nonlinear system (2) efficiently.

The rest of this article is organized as follows. In the next section, we describe the explicit continuation Newton method with the new time-stepping scheme based on the trust-region technique for nonlinear equations. In Section 3, under some standard assumptions, we prove the global convergence and the local superlinear convergence of the new method. In Section 4, some promising numerical results of the new method are reported, with comparisons to the traditional trust-region method (the built-in subroutine fsolve.m of the MATLAB environment [32,35]) and the classical homotopy continuation methods (HOMPACK90 [48] and NAClab [24,49,50]). Finally, some conclusions and future work are discussed in Section 5. Throughout this article, we assume that $F(x)$ has a zero point $x^{*}$.

2 Continuation Newton methods with the residual trust-region time-stepping scheme

In this section, based on the trust-region technique, we construct a new time-stepping scheme for the continuation Newton method to trace the trajectory of the Newton flow and obtain its equilibrium point $x^{*}$.

2.1 The continuous Newton flow

If we consider the damped Newton method with the line search strategy for the nonlinear system (2) [22,37], we have
$$x_{k+1} = x_k - \alpha_k\, J(x_k)^{-1} F(x_k). \tag{8}$$
We denote by $o(\alpha)$ a higher-order infinitesimal of $\alpha$, that is to say, $\lim_{\alpha \to 0} o(\alpha)/\alpha = 0$. In equation (8), if we let $x_k = x(t_k)$ and $x_{k+1} = x(t_k + \alpha_k) + o(\alpha_k)$, we obtain the continuous Newton flow (7) when $\alpha_k \to 0$. Conversely, if we apply the explicit Euler method [15,44] to the continuous Newton flow (7), we recover the damped Newton method (8).
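As a concrete illustration, one explicit Euler step of length $\alpha_k$ on the flow (7) is exactly the iteration (8); the following MATLAB sketch (ours, with an assumed toy system) performs a single such step:

% One explicit Euler step on dx/dt = -J(x)^{-1} F(x) reproduces (8).
F = @(x) [x(1)^2 + x(2)^2 - 2; exp(x(1) - 1) + x(2)^3 - 2];  % toy system
J = @(x) [2*x(1), 2*x(2); exp(x(1) - 1), 3*x(2)^2];          % its Jacobian
xk    = [1.5; 0.5];
alpha = 0.5;                               % damping factor alpha_k
xk1   = xk - alpha*(J(xk)\F(xk));          % x_{k+1} = x_k - alpha_k J^{-1} F
fprintf('||F(x_k)|| = %.3e, ||F(x_{k+1})|| = %.3e\n', norm(F(xk)), norm(F(xk1)));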
Since the Jacobian matrix $J(x)$ may be singular, we reformulate the continuous Newton flow (7) in the more general form
$$-J(x)\,\frac{dx(t)}{dt} = F(x), \quad x(t_0) = x_0. \tag{9}$$
The continuous Newton flow (9) is an old method that can be traced back to Davidenko's work [7] in 1953. It was subsequently investigated by Branin [4], Deuflhard et al. [9], Tanabe [46] and Kalaba et al. [20] in the 1970s, and has recently been applied to nonlinear boundary-value problems by Axelsson and Sysala [3]. The reason for the continuing research interest in this method is that it has some nice properties. One is that the solution $x(t)$ of the continuous Newton flow converges globally from any initial point, as described by the following Property 1.

Property 1 (Branin [4] and Tanabe [46]) Assume that $x(t)$ is the solution of the continuous Newton flow (9). Then the energy function $f(x(t)) = \frac{1}{2}\|F(x(t))\|^2$ converges to zero as $t$ tends to infinity. That is, every limit point $x^{*}$ of $x(t)$ is a solution of the nonlinear system (2). Furthermore, $F(x(t))$ converges to zero with the linear rate $e^{-t}$. If the Jacobian matrix $J(x)$ is nonsingular, $x(t)$ cannot reach its equilibrium $x^{*}$ in any finite time.

Proof.
Assume that $x(t)$ is the solution of the continuous Newton flow (9). Then we have
$$\frac{d}{dt}\left(e^{t} F(x(t))\right) = e^{t} J(x)\,\frac{dx(t)}{dt} + e^{t} F(x) = 0,$$
which gives
$$F(x(t)) = F(x_0)\, e^{-t}. \tag{10}$$
From equation (10), it is not difficult to see that $F(x(t))$ converges to zero with the linear rate $e^{-t}$ as $t$ tends to infinity. Thus, if the solution $x(t)$ of the continuous Newton flow (9) stays in a compact set, it converges to a point $x^{*}$ as $t$ tends to infinity, and this limit point $x^{*}$ is a solution of the nonlinear system (2). Furthermore, if the Jacobian matrix $J(x)$ is nonsingular, the ordinary differential equation (ODE) (9) is equivalent to the continuous Newton flow (7). Thus, from the dynamical properties of the ODE, it is not difficult to see that the ODE (7) has a unique solution $x(t)$ and that it cannot reach its equilibrium $x^{*}$ in any finite time (pp. 79-82, [40]). ⊓⊔

The inverse matrix $J(x)^{-1}$ can be regarded as a preconditioner of $F(x)$, chosen such that the solution components $x_i(t)$ $(i = 1, 2, \ldots, n)$ of the continuous Newton flow (7) have roughly the same convergence rates, and it mitigates the stiffness of the ODE (7) (the definition of a stiff problem can be found in [16] and the references therein). This property is very useful, since it allows us to adopt an explicit ODE method to trace the trajectory of the Newton flow. It can be roughly illustrated as follows. If we consider $F(x) = Ax$, the ODE (9) becomes
$$A\,\frac{dx}{dt} = -Ax, \quad x(0) = x_0. \tag{11}$$
By integrating the linear ODE (11), we obtain
$$x(t) = e^{-t} x_0. \tag{12}$$
From equation (12), we see that all components of the solution $x(t)$ of the ODE (11) converge linearly to zero with the same rate $e^{-t}$ as $t$ tends to infinity.

2.2 Continuation Newton methods

From subsection 2.1, we know that the solution $x(t)$ of the continuous Newton flow (9) has the nice global convergence property. On the other hand, when the Jacobian matrix $J(x)$ is singular or nearly singular, the ODE (9) is a system of differential-algebraic equations (DAEs), and its trajectory cannot be followed efficiently by a general ODE method such as the backward differentiation formulas (the built-in subroutine ode15s.m of the MATLAB environment [2,5,16,32,44]). Thus, we need to construct a special method to handle this problem, and we expect the new method to have the global convergence of the homotopy continuation methods and the fast local convergence of the merit-function methods near the solution $x^{*}$. In order to achieve these two aims, we construct a special continuation Newton method with the new step size $\alpha_k = \Delta t_k/(1 + \Delta t_k)$, where the time-stepping size $\Delta t_k$ is adaptively adjusted by the trust-region technique for problem (9).

We first apply the implicit Euler method to the continuous Newton flow (9) [2,5], which gives
$$J(x_{k+1})\,\frac{x_{k+1} - x_k}{\Delta t_k} = -F(x_{k+1}). \tag{13}$$
The scheme (13) is an implicit method, so it needs to solve a system of nonlinear equations at every iteration. To avoid this, we replace $J(x_{k+1})$ and $F(x_{k+1})$ with $J(x_k)$ and $F(x_k) + J(x_k)(x_{k+1} - x_k)$, respectively, in equation (13). Thus, we obtain the explicit continuation Newton method
$$J(x_k)\, s_k = -\frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k), \tag{14}$$
$$x_{k+1} = x_k + s_k. \tag{15}$$
The explicit continuation Newton method (14)-(15) coincides with the damped Newton method (8) if we let $\alpha_k = \Delta t_k/(1 + \Delta t_k)$ in equation (14).
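The stability contrast discussed in the next paragraph can be previewed on the linear model problem $F(x) = Ax$, where the Newton flow reduces to $dx/dt = -x$; the MATLAB sketch below (ours) compares one large step of each scheme:

% For F(x) = A*x one step of each method gives
%   explicit Euler:   x_{k+1} = (1 - dt) * x_k    (unstable for dt > 2)
%   scheme (14)-(15): x_{k+1} = x_k / (1 + dt)    (contracts for all dt > 0)
A  = [2, 1; 1, 3];  x0 = [1; -1];  dt = 5;
F  = @(x) A*x;  Jac = A;
x_euler = x0 - dt*(Jac\F(x0));              % |1 - dt| = 4: error grows
x_cnm   = x0 - (dt/(1 + dt))*(Jac\F(x0));   % 1/(1 + dt) = 1/6: error shrinks
fprintf('explicit Euler: %.2f, continuation Newton: %.2f\n', ...
        norm(x_euler)/norm(x0), norm(x_cnm)/norm(x0));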
However, from the viewpoint of ODE methods, they are very different. The damped Newton method (8) is obtained by applying the explicit Euler scheme to the continuous Newton flow (9), and its time-stepping size $\Delta t_k$ is restricted by numerical stability [16,44]. That is, a large time-stepping size $\Delta t_k$ cannot be adopted in the steady-state phase. The explicit continuation Newton method (14)-(15) is obtained by applying the semi-implicit Euler method to the continuous Newton flow (9), and its time-stepping size $\Delta t_k$ is not restricted by numerical stability. Therefore, a large time-stepping size can be adopted in the steady-state phase, so that the method mimics the Newton method near the equilibrium solution $x^{*}$ and has a fast local convergence rate. Above all, the new step size $\alpha_k = \Delta t_k/(\Delta t_k + 1)$ makes it convenient to adopt the trust-region technique for adaptively adjusting the time-stepping size $\Delta t_k$, so that the explicit continuation Newton method (14)-(15) accurately traces the trajectory of the continuous Newton flow in the transient-state phase and achieves a fast convergence rate near the equilibrium point $x^{*}$.

For real-world problems, the Jacobian matrix $J(x)$ may be singular for physical reasons. For example, for the chemical kinetic reaction problem (1), the elements of $x(t)$ represent reaction concentrations and they must satisfy the linear conservation law [27]. A system is said to satisfy the linear conservation law ([42], or p. 35, [44]) if there is a constant vector $c \neq 0$ such that
$$c^{T} x(t) = c^{T} x(0) \tag{16}$$
holds for all $t \geq 0$. If there exists a constant vector $c$ such that
$$c^{T} F(x) = 0, \quad \forall x \in \Re^n, \tag{17}$$
we have
$$c^{T} J(x) = 0, \quad \forall x \in \Re^n. \tag{18}$$
From equation (18), we know that the Jacobian matrix $J(x)$ is singular. In this case, the solution $x(t)$ of the ODE (1) satisfies the linear conservation law (16).

For an isolated singularity of the Jacobian matrix $J(x)$, there are some efficient approaches to handle this problem [14]. Here, since the singular set of the Jacobian matrix $J(x)$ may be connected, we adopt the regularization technique [17,21] to modify the explicit continuation Newton method (14)-(15) as follows:
$$(\mu_k I - J(x_k))\, s_k^{P} = F(x_k), \tag{19}$$
$$s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, s_k^{P}, \tag{20}$$
$$x_{k+1} = x_k + s_k, \tag{21}$$
where $\mu_k$ is a small positive number. In order to achieve a fast convergence rate near the solution $x^{*}$, the regularization continuation Newton method (19)-(21) is required to approximate the Newton method $x_{k+1} = x_k - J(x_k)^{-1} F(x_k)$ near the solution $x^{*}$ [11]. Thus, we select the regularization parameter $\mu_k$ as follows:
$$\mu_k = \begin{cases} c_{\varepsilon}, & \text{if } \Delta t_k \leq 1/c_{\varepsilon}, \\ 1/\Delta t_k, & \text{otherwise}, \end{cases} \tag{22}$$
where $c_{\varepsilon}$ is a small positive constant, such as $c_{\varepsilon} = 10^{-6}$ in practice.

It is not difficult to verify that the regularization continuation Newton method (19)-(21) preserves the linear conservation law (16) whenever there exists a constant vector $c \in \Re^n$ such that $c^{T} F(x) = 0$, $\forall x \in \Re^n$.
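A direct MATLAB transcription of the regularized step (19)-(21), together with the parameter rule (22), might look as follows; this is our sketch, and the function name cnm_reg_step and the default value of c_eps are ours:

function [xk1, sk] = cnm_reg_step(Fk, Jk, xk, dt, c_eps)
% One regularization continuation Newton step (19)-(21):
%   (mu*I - J(x_k)) s^P = F(x_k),  s_k = dt/(1+dt) * s^P,  x_{k+1} = x_k + s_k.
if nargin < 5, c_eps = 1e-6; end           % small constant in (22)
n = length(xk);
if dt <= 1/c_eps
    mu = c_eps;                            % regularization parameter (22)
else
    mu = 1/dt;
end
sP  = (mu*eye(n) - Jk) \ Fk;               % solve (19)
sk  = (dt/(1 + dt)) * sP;                  % scale by the step size (20)
xk1 = xk + sk;                             % update (21)
end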
Actually, from $c^{T} F(x) = 0$ we have $c^{T} J(x) = 0$, and consequently, from equations (19)-(20),
$$c^{T} x_{k+1} = c^{T} x_k + c^{T} s_k = c^{T} x_k + \frac{1}{\mu_k}\, c^{T} \left( \frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k) + J(x_k)\, s_k \right) = c^{T} x_k. \tag{23}$$
That is to say, the regularization continuation Newton method (19)-(21) preserves the linear conservation law (16).

2.3 The residual trust-region time-stepping scheme

Another issue is how to adjust the time-stepping size $\Delta t_k$ adaptively at every iteration. A popular way to control the time-stepping size is based on the trust-region technique [6,8,19,30]. For this time-stepping scheme, one needs to select a suitable merit function and construct an approximation model of it. Here, we adopt the residual $\|F(x)\|$ as the merit function and $\|F(x_k) + J(x_k)(x_{k+1} - x_k)\|$ as the approximation model of $\|F(x_{k+1})\|$. Thus, according to the ratio
$$\rho_k = \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)(x_{k+1} - x_k)\|}, \tag{24}$$
we enlarge or reduce the time-stepping size $\Delta t_k$ at every iteration. A particular adjustment strategy is given as follows:
$$\Delta t_{k+1} = \begin{cases} \gamma_1\, \Delta t_k, & \text{if } |1 - \rho_k| \leq \eta_1, \\ \Delta t_k, & \text{if } \eta_1 < |1 - \rho_k| < \eta_2, \\ \gamma_2\, \Delta t_k, & \text{if } |1 - \rho_k| \geq \eta_2, \end{cases} \tag{25}$$
where the constants are selected as $\gamma_1 = 2$, $\gamma_2 = 0.5$, $\eta_1 = 0.25$ and $\eta_2 = 0.75$, according to our numerical experiments.

Based on the above discussion, the detailed implementation of the regularization continuation Newton method with the residual trust-region time-stepping scheme for nonlinear equations is given in Algorithm 1.
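Before stating Algorithm 1, we note that in code the adjustment rule (25) with the quoted constants reduces to a few lines; the following MATLAB helper (our sketch) will also be convenient later:

function dt_new = update_dt(rho, dt)
% Trust-region time-stepping adjustment (25) with
% gamma_1 = 2, gamma_2 = 0.5, eta_1 = 0.25, eta_2 = 0.75.
if abs(1 - rho) <= 0.25
    dt_new = 2*dt;        % the model fits well: enlarge the time step
elseif abs(1 - rho) >= 0.75
    dt_new = 0.5*dt;      % the model fits poorly: reduce the time step
else
    dt_new = dt;          % keep the current time step
end
end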
Algorithm 1 Continuation Newton method with the residual trust-region time-stepping scheme (the CNMTr method)

Input: the nonlinear function F(x); the initial point x_0 (optional); the termination parameter ε (optional).
Output: an approximate solution x* of the nonlinear equations.
1: Set the default x_0 = [1, 1, ..., 1]^T and ε = 10^{-6} when the calling routine does not provide x_0 or ε.
2: Initialize the parameters: η_a = 10^{-6}, η_1 = 0.25, γ_1 = 2, η_2 = 0.75, γ_2 = 0.5. Set k = 0.
3: Evaluate F(x_0) and the Jacobian matrix J(x_0). Select the initial time-stepping size Δt_0 = min{0.1, 1/‖F(x_0)‖}. Compute the residual Res_0 = ‖F(x_0)‖.
4: while Res_k ≥ ε do
5:   Solve the linear system (19)-(20) to obtain the trial step s_k.
6:   Set x_{k+1} = x_k + s_k.
7:   Evaluate F(x_{k+1}).
8:   Compute the residual Res_{k+1} = ‖F(x_{k+1})‖.
9:   if ‖F(x_k)‖ < ‖F(x_k) + J(x_k) s_k‖ then
10:     Set ρ_k = −1.
11:  else
12:     Compute the ratio ρ_k from equation (24).
13:  end if
14:  Adjust the time-stepping size Δt_{k+1} according to the trust-region scheme (25).
15:  if ρ_k ≥ η_a then
16:     Accept the trial step s_k, and evaluate the Jacobian matrix J(x_{k+1}).
17:  else
18:     Reject the trial step: set x_{k+1} = x_k, Res_{k+1} = Res_k and J(x_{k+1}) = J(x_k).
19:  end if
20:  Set k ← k + 1.
21: end while
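To make Algorithm 1 concrete, the following self-contained MATLAB function is a minimal sketch of one possible implementation; it is our transcription under the parameter choices stated above, not the authors' released code, and the name cnmtr is ours.

function [x, res, k] = cnmtr(F, Jac, x, tol)
% CNMTR  A minimal sketch of Algorithm 1 (CNMTr) for F(x) = 0.
%   F   : function handle, F(x) returns an n-vector
%   Jac : function handle, Jac(x) returns the n-by-n Jacobian J(x)
if nargin < 4, tol = 1e-6; end
eta_a = 1e-6; eta1 = 0.25; eta2 = 0.75; gam1 = 2; gam2 = 0.5; c_eps = 1e-6;
n  = length(x);  I = eye(n);
Fk = F(x);  Jk = Jac(x);  res = norm(Fk);
dt = min(0.1, 1/res);                     % initial time-stepping size
k  = 0;
while res >= tol
    % regularized trial step (19)-(21) with mu_k from (22)
    mu = c_eps;  if dt > 1/c_eps, mu = 1/dt; end
    sk = (dt/(1 + dt)) * ((mu*I - Jk) \ Fk);
    x1 = x + sk;
    F1 = F(x1);  res1 = norm(F1);
    pred = res - norm(Fk + Jk*sk);        % predicted reduction
    if pred <= 0
        rho = -1;                         % model predicts no decrease: reject
    else
        rho = (res - res1)/pred;          % ratio (24)
    end
    if abs(1 - rho) <= eta1               % time-stepping adjustment (25)
        dt = gam1*dt;
    elseif abs(1 - rho) >= eta2
        dt = gam2*dt;
    end
    if rho >= eta_a                       % accept the trial step
        x = x1;  Fk = F1;  res = res1;  Jk = Jac(x);
    end
    k = k + 1;
end
end

For example, cnmtr(@(x) [x(1)^2 + x(2)^2 - 2; exp(x(1) - 1) + x(2)^3 - 2], @(x) [2*x(1), 2*x(2); exp(x(1) - 1), 3*x(2)^2], [3; 3]) should drive the residual below 10^{-6} and return an approximation of the root (1, 1)^T.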
3 Convergence analysis

In this section, we discuss some theoretical properties of Algorithm 1. Firstly, we estimate a lower bound of the predicted reduction $\|F(x_k)\| - \|F(x_k) + J(x_k)(x_{k+1} - x_k)\|$, which is similar to the corresponding estimate for the trust-region method in unconstrained optimization [39].

Lemma 1 Assume that there exists a positive constant $m$ such that
$$\|J(x_k)\, y\| \geq m\, \|y\|, \quad \forall y \in \Re^n, \; k = 0, 1, 2, \ldots. \tag{26}$$
Furthermore, suppose that $s_k$ is the solution of the regularization continuation Newton method (19)-(21), where the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$. Then we have the estimate
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq c_r\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|, \tag{27}$$
where the positive constant $c_r$ satisfies $0 < c_r < 1$.

Proof.
From equations (19)-(20), we have
$$-J(x_k)\, s_k + \mu_k\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k). \tag{28}$$
Thus, from equation (28), we obtain
$$\|J(x_k)\, s_k + F(x_k)\| = \left\| \mu_k\, s_k + \frac{1}{1 + \Delta t_k}\, F(x_k) \right\| = \left\| \frac{\mu_k\, \Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} F(x_k) + \frac{1}{1 + \Delta t_k}\, F(x_k) \right\| \leq \frac{1}{1 + \Delta t_k} \left( \Delta t_k \left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} \right\| + 1 \right) \|F(x_k)\|. \tag{29}$$
According to the definition of the induced matrix norm [13], we have
$$\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} \right\| = \max_{z \neq 0} \frac{\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} z \right\|}{\|z\|} = \max_{y \neq 0} \frac{\|y\|}{\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right) y \right\|} = \max_{\|y\| = 1} \frac{1}{\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right) y \right\|}. \tag{30}$$
On the other hand, when $\|y\| = 1$, from the nonsingularity assumption (26) on $J(x_k)$, we have
$$\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right) y \right\| \geq \frac{1}{\mu_k}\, \|J(x_k)\, y\| - \|y\| \geq \frac{m}{\mu_k} - 1. \tag{31}$$
Thus, from the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$ and equations (30)-(31), we have
$$\left\| \left( I - \frac{1}{\mu_k}\, J(x_k) \right)^{-1} \right\| \leq \frac{\mu_k}{m - \mu_k} \leq \frac{c_{\varepsilon}}{m - c_{\varepsilon}}. \tag{32}$$
By substituting inequality (32) into inequality (29), we have
$$\|J(x_k)\, s_k + F(x_k)\| \leq \frac{1}{1 + \Delta t_k} \left( 1 + \frac{c_{\varepsilon}}{m - c_{\varepsilon}}\, \Delta t_k \right) \|F(x_k)\|,$$
that is to say,
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq \left( \frac{m - 2c_{\varepsilon}}{m - c_{\varepsilon}} \right) \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{33}$$
Set $c_r = (m - 2c_{\varepsilon})/(m - c_{\varepsilon})$ in the above inequality. Then we obtain the estimate (27). ⊓⊔

In order to prove that the sequence $\{\|F(x_k)\|\}$ converges to zero as $k$ tends to infinity, we also need to estimate a lower bound of the time-stepping size $\Delta t_k$.

Lemma 2
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and its Jacobian $J$ is Lipschitz continuous, that is to say, there exists a positive number $L$ such that
$$\|J(x) - J(y)\| \leq L\, \|x - y\|, \quad \forall x, y \in \Re^n. \tag{34}$$
Furthermore, suppose that the sequence $\{x_k\}$ is generated by Algorithm 1 and that the nonsingularity condition (26) on $J(x_k)$ holds. Then, when the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$, there exists a positive number $\delta_{\Delta t}$ such that
$$\Delta t_k \geq \delta_{\Delta t} > 0, \quad k = 0, 1, 2, \ldots, \tag{35}$$
where $\Delta t_k$ is adaptively adjusted by formulas (24)-(25).

Proof.
From the Lipschitz continuity (34) of $J(x)$, we have
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| = \left\| \int_0^1 J(x_k + t\, s_k)\, s_k\, dt - J(x_k)\, s_k \right\| = \left\| \int_0^1 \left( J(x_k + t\, s_k) - J(x_k) \right) s_k\, dt \right\| \leq \int_0^1 \|J(x_k + t\, s_k) - J(x_k)\|\, \|s_k\|\, dt \leq \int_0^1 L\, \|s_k\|^2\, t\, dt = \frac{L}{2}\, \|s_k\|^2. \tag{36}$$
On the other hand, from equations (19)-(20), we have
$$\|s_k\| = \frac{\Delta t_k}{1 + \Delta t_k} \left\| (\mu_k I - J(x_k))^{-1} F(x_k) \right\| \leq \frac{\Delta t_k}{1 + \Delta t_k} \left\| (\mu_k I - J(x_k))^{-1} \right\| \|F(x_k)\|. \tag{37}$$
Similarly to the estimate (32), from the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$ and the nonsingularity assumption (26) on $J(x_k)$, we have
$$\left\| (\mu_k I - J(x_k))^{-1} \right\| \leq \frac{1}{m - \mu_k} \leq \frac{1}{m - c_{\varepsilon}}. \tag{38}$$
Thus, from inequalities (36)-(38), we obtain
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| \leq \frac{L}{2\,(m - c_{\varepsilon})^2} \left( \frac{\Delta t_k}{1 + \Delta t_k} \right)^{2} \|F(x_k)\|^2. \tag{39}$$
From the definition (24) of $\rho_k$, the estimate (33), and inequality (39), we have
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} \leq \frac{L}{2\, c_r\, (m - c_{\varepsilon})^2}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{40}$$
According to Algorithm 1, the sequence $\{\|F(x_k)\|\}$ is monotonically decreasing. Consequently, we have $\|F(x_k)\| \leq \|F(x_0)\|$, $k = 0, 1, \ldots$. Set
$$\bar{\delta}_{\Delta t} \triangleq \frac{2\, c_r\, (m - c_{\varepsilon})^2\, \eta_1}{L\, \|F(x_0)\|}. \tag{41}$$
Assume that $K$ is the first index such that $\Delta t_K \leq \bar{\delta}_{\Delta t}$. Then, from inequalities (40)-(41), we obtain $|\rho_K - 1| \leq \eta_1$. Consequently, $\Delta t_{K+1}$ will be greater than $\Delta t_K$ according to the adaptive adjustment scheme (25). Set $\delta_{\Delta t} = \min\{\Delta t_K, \bar{\delta}_{\Delta t}\}$. Then we have $\Delta t_k \geq \delta_{\Delta t}$, $k = 0, 1, 2, \ldots$. ⊓⊔

By using the estimates of Lemma 1 and Lemma 2, we can prove that the sequence $\{\|F(x_k)\|\}$ converges to zero as $k$ tends to infinity.

Theorem 1
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and its Jacobian $J$ satisfies the Lipschitz condition (34). Furthermore, suppose that the sequence $\{x_k\}$ is generated by Algorithm 1 and that $J(x_k)$ satisfies the nonsingularity assumption (26). Then, when the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$, we have
$$\liminf_{k \to \infty}\, \|F(x_k)\| = 0. \tag{42}$$

Proof.
According to Algorithm 1 and inequality (40), there exists an infinite subsequence $\{x_{k_l}\}$ of accepted iterates such that
$$\frac{\|F(x_{k_l})\| - \|F(x_{k_l + 1})\|}{\|F(x_{k_l})\| - \|F(x_{k_l}) + J(x_{k_l})\, s_{k_l}\|} \geq \eta_a, \quad l = 0, 1, 2, \ldots. \tag{43}$$
From inequalities (27) and (43), we have
$$\|F(x_{k_l})\| - \|F(x_{k_l + 1})\| \geq \eta_a\, c_r\, \frac{\Delta t_{k_l}}{1 + \Delta t_{k_l}}\, \|F(x_{k_l})\|. \tag{44}$$
By rearranging inequality (44) and combining it with the conclusion (35), we have
$$\|F(x_{k_l + 1})\| \leq \left( 1 - \eta_a\, c_r\, \frac{\Delta t_{k_l}}{1 + \Delta t_{k_l}} \right) \|F(x_{k_l})\| \leq \left( 1 - \eta_a\, c_r\, \frac{\delta_{\Delta t}}{1 + \delta_{\Delta t}} \right) \|F(x_{k_l})\|. \tag{45}$$
Therefore, from inequality (45) and $0 \leq q \triangleq 1 - \eta_a\, c_r\, \delta_{\Delta t}/(1 + \delta_{\Delta t}) < 1$, we obtain $\lim_{l \to \infty} \|F(x_{k_l})\| = 0$. That is to say, the result (42) is proved. ⊓⊔
Under the standard nonsingularity assumption (26) on $J(x^{*})$ and the local Lipschitz continuity (34), we now analyze the local superlinear convergence rate of Algorithm 1 near the solution $x^{*}$. For convenience, we define the neighbourhood of $x^{*}$ as $B_{\delta}(x^{*}) = \{x : \|x - x^{*}\| \leq \delta\}$.

Theorem 2
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $F(x^{*}) = 0$. Furthermore, suppose that $J$ satisfies the local Lipschitz continuity (34) near $x^{*}$ and the nonsingularity condition (26) when $x \in B_{\delta}(x^{*})$. Then, when the regularization parameter $\mu_k$ defined by equation (22) and the constant $c_{\varepsilon}$ satisfy $\mu_k \leq c_{\varepsilon} < 0.5\, m$, there exists a neighborhood $B_r(x^{*})$ of $x^{*}$ such that the sequence $\{x_k\}$ generated by Algorithm 1 with $x_0 \in B_r(x^{*})$ converges to $x^{*}$ superlinearly.

Proof.
The framework of the proof can be roughly described as follows. Firstly, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ linearly when $x_0$ is close enough to $x^{*}$. Then, we prove $\lim_{k \to \infty} \Delta t_k = \infty$. Finally, we prove that the search step $s_k$ approximates the Newton step $s_k^{N}$; consequently, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly.

Firstly, similarly to the estimate (32), from the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$, we obtain
$$\left\| (\mu_k I - J(x_k))^{-1} \right\| \leq \frac{1}{m - c_{\varepsilon}}, \quad \forall x_k \in B_{\delta}(x^{*}), \; k = 0, 1, 2, \ldots. \tag{46}$$
We denote $e_k = x_k - x^{*}$. From equations (19)-(20) and $F(x^{*}) = 0$, we have
$$e_{k+1} = e_k + s_k = e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} \left( F(x_k) - F(x^{*}) \right) = e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} \int_0^1 J(x^{*} + t\, e_k)\, e_k\, dt. \tag{47}$$
By rearranging equation (47), we obtain
$$e_{k+1} = \frac{1}{1 + \Delta t_k}\, e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} \int_0^1 \left( J(x^{*} + t\, e_k) - J(x_k) + \mu_k I \right) e_k\, dt.$$
By using the Lipschitz continuity (34) of $J$, the estimate (46), and the assumption $\mu_k \leq c_{\varepsilon} < 0.5\, m$, we have
$$\|e_{k+1}\| \leq \frac{1}{1 + \Delta t_k}\, \|e_k\| + \frac{\Delta t_k}{1 + \Delta t_k}\, \frac{1}{m - \mu_k} \left( \mu_k + \frac{L}{2}\, \|e_k\| \right) \|e_k\| = \frac{1 + \frac{1}{m - \mu_k} \left( \mu_k + \frac{L}{2} \|e_k\| \right) \Delta t_k}{1 + \Delta t_k}\, \|e_k\| \leq \frac{1 + \frac{1}{m - c_{\varepsilon}} \left( c_{\varepsilon} + \frac{L}{2} \|e_k\| \right) \Delta t_k}{1 + \Delta t_k}\, \|e_k\|. \tag{48}$$
We denote
$$q_k \triangleq \frac{1 + \frac{1}{m - c_{\varepsilon}} \left( c_{\varepsilon} + \frac{L}{2} \|e_k\| \right) \Delta t_k}{1 + \Delta t_k}, \tag{49}$$
and select $x_0 \in B_{\delta}(x^{*})$ such that
$$\|e_0\| < \frac{m - 2c_{\varepsilon}}{L}. \tag{50}$$
Set $r = \min\{\delta, (m - 2c_{\varepsilon})/L\}$. When $x_0 \in B_r(x^{*})$, from equations (48)-(50) and the assumption $c_{\varepsilon} < 0.5\, m$, by induction we have
$$\|e_{k+1}\| \leq q_k\, \|e_k\|, \quad q_k < \frac{1 + \frac{m}{2(m - c_{\varepsilon})}\, \Delta t_k}{1 + \Delta t_k} < 1, \quad k = 0, 1, \ldots. \tag{51}$$
It is not difficult to see that $f(t) \triangleq (1 + \alpha t)/(1 + t)$ is monotonically decreasing in $t$ when $0 \leq \alpha < 1$. Thus, from the estimate (35) of the time-stepping size $\Delta t_k$ and inequality (51), we obtain
$$\|e_{k+1}\| \leq q_k\, \|e_k\| \leq q\, \|e_k\|, \quad q \triangleq \frac{1 + \frac{m}{2(m - c_{\varepsilon})}\, \delta_{\Delta t}}{1 + \delta_{\Delta t}} < 1.$$
Therefore, we have
$$\|e_{k+1}\| \leq q^{\,k+1}\, \|e_0\| \to 0, \quad \text{when } k \to \infty. \tag{52}$$
That is to say, $\lim_{k \to \infty} x_k = x^{*}$.

Secondly, from equations (19)-(20) and inequality (46), we have
$$\|s_k\| = \frac{\Delta t_k}{1 + \Delta t_k} \left\| (\mu_k I - J(x_k))^{-1} F(x_k) \right\| \leq \frac{1}{m - c_{\varepsilon}}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{53}$$
Similarly to the estimate (40), from the definition (24) of $\rho_k$ and inequalities (33) and (53), we have
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{L}{2\, c_r\, (m - c_{\varepsilon})^2}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{L}{2\, c_r\, (m - c_{\varepsilon})^2}\, \|F(x_k)\|. \tag{54}$$
Since $\lim_{k \to \infty} x_k = x^{*}$ and $F(x^{*}) = 0$, we can select a sufficiently large number $K$ such that
$$\|F(x_k)\| \leq \frac{2\, c_r\, (m - c_{\varepsilon})^2\, \eta_1}{L}, \quad \text{when } k \geq K. \tag{55}$$
From inequalities (54)-(55), we have $|\rho_k - 1| \leq \eta_1$ when $k \geq K$. This means $\Delta t_{k+1} = \gamma_1\, \Delta t_k$ when $k \geq K$, according to the time-stepping scheme (25). That is to say, we have
$$\lim_{k \to \infty} \Delta t_k = \infty. \tag{56}$$
Finally, since $\lim_{k \to \infty} \Delta t_k = \infty$, we can select a sufficiently large number $K_{\mu}$ such that $1/\Delta t_k < c_{\varepsilon}$ when $k \geq K_{\mu}$. Consequently, from the definition (22) of the regularization parameter $\mu_k$, we obtain $\mu_k = 1/\Delta t_k$ when $k \geq K_{\mu}$. By substituting this into inequality (48), we have
$$\frac{\|e_{k+1}\|}{\|e_k\|} \leq \frac{1}{1 + \Delta t_k} + \frac{\Delta t_k}{1 + \Delta t_k}\, \frac{1}{m - \mu_k} \left( \mu_k + \frac{L}{2}\, \|e_k\| \right) = \frac{1}{1 + \Delta t_k} + \frac{1}{m - 1/\Delta t_k} \left( \frac{1}{\Delta t_k} + \frac{L}{2}\, \|e_k\| \right) \frac{\Delta t_k}{1 + \Delta t_k}, \quad \text{when } k \geq K_{\mu}. \tag{57}$$
From equations (52) and (56), we know $\lim_{k \to \infty} \|e_k\| = 0$ and $\lim_{k \to \infty} \Delta t_k = \infty$, respectively. Therefore, by combining them with inequality (57), we obtain
$$\lim_{k \to \infty} \frac{\|e_{k+1}\|}{\|e_k\|} = 0.$$
That is to say, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly. ⊓⊔

For real-world problems, the singularity of $J(x)$ may arise from a linear conservation law, such as the conservation of mass or the conservation of charge [29,42,43]; in this case, the Jacobian matrix $J(x)$ is singular. Similarly to the standard assumption for nonlinear dynamical systems, we suppose that $J(x)$ satisfies the one-sided Lipschitz condition (see p. 303, [8] or p. 180, [16])
$$y^{T} J(x)\, y \leq -\nu\, \|y\|^2, \quad \text{for } y \in S_c = \{ y \,|\, c^{T} y = 0 \}, \; \nu > 0, \tag{58}$$
where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$. The positive number $\nu$ is called the one-sided Lipschitz constant. Under the one-sided Lipschitz condition (58), the matrix $(\mu I - J(x))$ is nonsingular when $\mu > 0$, as the following property shows.

Property 2
Assume that $J(x)$ satisfies the one-sided Lipschitz condition (58) in the subspace $S_c = \{y \,|\, c^{T} y = 0\}$, where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$. Then the matrix $(\mu I - J(x))$ is nonsingular for every $\mu > 0$, and the solution $s_k$ of equations (19)-(20) satisfies $c^{T} s_k = 0$.

Proof.
We prove the nonsingularity by contradiction. Assume that the matrix $(\mu I - J(x))$ is singular. Then there exists a nonzero vector $y$ such that
$$(\mu I - J(x))\, y = 0. \tag{59}$$
Consequently, from the assumption $c^{T} F(x) = 0$, $\forall x \in \Re^n$, we have $c^{T} J(x) = 0$ and hence
$$c^{T} y = \frac{1}{\mu}\, c^{T} J(x)\, y = \frac{1}{\mu} \left( c^{T} J(x) \right) y = 0,$$
so that $y \in S_c$. Thus, from the one-sided Lipschitz condition (58) and $\mu > 0$, $\nu > 0$, we obtain
$$y^{T} (\mu I - J(x))\, y = \mu\, \|y\|^2 - y^{T} J(x)\, y \geq (\mu + \nu)\, \|y\|^2 > 0,$$
which contradicts the assumption (59). Therefore, the matrix $(\mu I - J(x))$ is nonsingular.

From equations (19)-(20), we have
$$(\mu_k I - J(x_k))\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, F(x_k). \tag{60}$$
By combining this with the assumption $c^{T} F(x_k) = 0$, we obtain
$$c^{T} (\mu_k I - J(x_k))\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, c^{T} F(x_k) = 0. \tag{61}$$
Therefore, by substituting $c^{T} J(x_k) = 0$ into equation (61), we obtain
$$\mu_k\, c^{T} s_k = c^{T} J(x_k)\, s_k = \left( c^{T} J(x_k) \right) s_k = 0. \tag{62}$$
That is to say, $c^{T} s_k = 0$. ⊓⊔
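Property 2 is easy to sanity-check numerically; in the following MATLAB sketch (ours), the toy residual is chosen so that c^T F(x) = 0 holds identically for c = (1, 1)^T:

% For any residual with c'*F(x) = 0, the step (19)-(20) keeps c'*s_k = 0.
c  = [1; 1];
F  = @(x) [x(2) - x(1)^2; x(1)^2 - x(2)];   % c'*F(x) = 0 for all x
J  = @(x) [-2*x(1), 1; 2*x(1), -1];         % singular Jacobian, c'*J(x) = 0
xk = [2; 0.5];  dt = 1;  mu = 1e-6;
sk = (dt/(1 + dt)) * ((mu*eye(2) - J(xk)) \ F(xk));
fprintf('c''*s_k = %.2e (zero up to rounding)\n', c'*sk);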
Similarly to the estimate (27) of $\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|$ for a nonsingular Jacobian matrix $J(x_k)$, we also have a lower bound when $J(x_k)$ is singular, $k = 0, 1, 2, \ldots$.

Lemma 3 Assume that $J(x_k)$ satisfies the one-sided Lipschitz condition (58) in the subspace $S_c$, where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$. Let $s_k$ be the solution of equations (19)-(21). Then we have
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq c_s\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|, \tag{63}$$
where the positive constant $c_s$ satisfies $0 < c_s < 1$.

Proof.
From Property 2, we know that the matrix $(\mu_k I - J(x_k))$ is nonsingular and that $s_k$ satisfies $c^{T} s_k = 0$, so $s_k \in S_c$. From equations (19)-(20) and the Cauchy-Schwarz inequality $|x^{T} y| \leq \|x\|\, \|y\|$, we have
$$\mu_k\, \|s_k\|^2 - s_k^{T} J(x_k)\, s_k = \frac{\Delta t_k}{1 + \Delta t_k}\, s_k^{T} F(x_k) \leq \frac{\Delta t_k}{1 + \Delta t_k}\, \|s_k\|\, \|F(x_k)\|. \tag{64}$$
By substituting the one-sided Lipschitz condition (58) into equation (64), we obtain
$$(\mu_k + \nu)\, \|s_k\|^2 \leq \mu_k\, \|s_k\|^2 - s_k^{T} J(x_k)\, s_k \leq \frac{\Delta t_k}{1 + \Delta t_k}\, \|s_k\|\, \|F(x_k)\|.$$
Consequently, we have
$$\|s_k\| \leq \frac{1}{\mu_k + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{65}$$
From equations (19)-(20) and (65), we have
$$\|F(x_k) + J(x_k)\, s_k\| = \left\| \mu_k\, s_k + \frac{1}{1 + \Delta t_k}\, F(x_k) \right\| \leq \mu_k\, \|s_k\| + \frac{1}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{\mu_k}{\mu_k + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| + \frac{1}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{66}$$
From the definition (22) of the parameter $\mu_k$, we know $\mu_k \leq c_{\varepsilon}$. By substituting this into inequality (66), we obtain
$$\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\| \geq \frac{\nu}{\mu_k + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \geq \frac{\nu}{c_{\varepsilon} + \nu}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{67}$$
Set $c_s = \nu/(c_{\varepsilon} + \nu)$. Then, from inequality (67), we obtain the estimate (63). ⊓⊔

Similarly to the lower bound (35) on the time-stepping size $\Delta t_k$ for a nonsingular Jacobian matrix $J(x_k)$, we also have a lower bound when $J(x_k)$ is singular, $k = 0, 1, 2, \ldots$.

Lemma 4
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $J(x)$ satisfies the Lipschitz continuity (34) and the one-sided Lipschitz condition (58). Suppose that the sequence $\{x_k\}$ is generated by Algorithm 1. Then there exists a positive number $\delta_s$ such that
$$\Delta t_k \geq \delta_s > 0, \quad k = 0, 1, 2, \ldots, \tag{68}$$
where $\Delta t_k$ is adaptively updated by the trust-region scheme (24)-(25).

Proof.
From the Lipschitz continuity (34) of $J(x)$, we have
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| = \left\| \int_0^1 \left( J(x_k + t\, s_k) - J(x_k) \right) s_k\, dt \right\| \leq \int_0^1 \|J(x_k + t\, s_k) - J(x_k)\|\, \|s_k\|\, dt \leq \int_0^1 L\, \|s_k\|^2\, t\, dt = \frac{L}{2}\, \|s_k\|^2. \tag{69}$$
By substituting the estimate (65) of $s_k$ into inequality (69), we obtain
$$\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\| \leq \frac{L}{2\,(\mu_k + \nu)^2} \left( \frac{\Delta t_k}{1 + \Delta t_k} \right)^{2} \|F(x_k)\|^2. \tag{70}$$
Thus, from the definition (24) of $\rho_k$ and inequalities (67) and (70), we obtain
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{\|F(x_{k+1}) - F(x_k) - J(x_k)\, s_k\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} \leq \frac{L}{2\, \nu\, (\mu_k + \nu)}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{L}{2\, \nu^2}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\|. \tag{71}$$
According to Algorithm 1, the sequence $\{\|F(x_k)\|\}$ is monotonically decreasing. Consequently, we have $\|F(x_k)\| \leq \|F(x_0)\|$, $k = 0, 1, \ldots$. Set
$$\bar{\delta}_s = \frac{2\, \nu^2\, \eta_1}{L\, \|F(x_0)\|}. \tag{72}$$
Assume that $K$ is the first index such that $\Delta t_K \leq \bar{\delta}_s$. Then, from inequalities (71)-(72), we obtain $|\rho_K - 1| \leq \eta_1$. Consequently, $\Delta t_{K+1}$ will be greater than $\Delta t_K$ according to the time-stepping scheme (25). Set $\delta_s = \min\{\Delta t_K, \bar{\delta}_s\}$. Then we have $\Delta t_k \geq \delta_s$, $k = 0, 1, 2, \ldots$. ⊓⊔

Now, from Lemma 3 and Lemma 4, the sequence $\{\|F(x_k)\|\}$ converges to zero as $k$ tends to infinity; the proof is similar to that of Theorem 1, so we state the result as the following Theorem 3 and omit its proof.

Theorem 3
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $J(x)$ satisfies the Lipschitz continuity (34) and the one-sided Lipschitz condition (58). Suppose that the sequence $\{x_k\}$ is generated by Algorithm 1. Then we have
$$\liminf_{k \to \infty}\, \|F(x_k)\| = 0. \tag{73}$$

Theorem 4
Assume that $F: \Re^n \to \Re^n$ is continuously differentiable and $J(x)$ satisfies the Lipschitz continuity (34) and the one-sided Lipschitz condition (58). Furthermore, suppose that the sequence $\{x_k\}$ is generated by Algorithm 1 and that a subsequence $\{x_{k_i}\}$ converges to $x^{*}$. Then the whole sequence $\{x_k\}$ converges to $x^{*}$ superlinearly.

Proof.
The framework of the proof can be roughly described as follows. Firstly, we prove $\lim_{k \to \infty} \Delta t_k = \infty$. Then, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ linearly. Finally, we prove that the search step $s_k$ approximates the Newton step $s_k^{N}$; consequently, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly.

From Property 2, we know that the matrix $(\mu_k I - J(x_k))$ is nonsingular and that $s_k$ satisfies $c^{T} s_k = 0$, where the constant vector $c$ satisfies $c^{T} F(x) = 0$, $\forall x \in \Re^n$, and $s_k$ is the solution of equations (19)-(20).

Firstly, we prove that there exists an index $K$ such that $\Delta t_k$ is enlarged at every iteration when $k \geq K$; consequently, $\lim_{k \to \infty} \Delta t_k = \infty$. From the lower bound (67) and inequality (71), we have
$$|\rho_k - 1| = \left| \frac{\|F(x_k)\| - \|F(x_{k+1})\|}{\|F(x_k)\| - \|F(x_k) + J(x_k)\, s_k\|} - 1 \right| \leq \frac{L}{2\, \nu\, (\mu_k + \nu)}\, \frac{\Delta t_k}{1 + \Delta t_k}\, \|F(x_k)\| \leq \frac{L}{2\, \nu^2}\, \|F(x_k)\|. \tag{74}$$
Since the subsequence $\{x_{k_i}\}$ converges to $x^{*}$, there exists an index $K_F$ such that
$$\|F(x_{K_F})\| \leq \frac{2\, \nu^2\, \eta_1}{L}. \tag{75}$$
Furthermore, according to Algorithm 1, the sequence $\{\|F(x_k)\|\}$ is monotonically decreasing. Consequently, we have $\|F(x_k)\| \leq \|F(x_{K_F})\|$ when $k \geq K_F$. Thus, from inequalities (74)-(75), we have
$$|\rho_k - 1| \leq \eta_1, \quad \text{when } k \geq K_F. \tag{76}$$
Consequently, according to the time-stepping scheme (25), we know that $\Delta t_{k+1} = \gamma_1\, \Delta t_k$ when $k \geq K_F$. Therefore, we obtain $\lim_{k \to \infty} \Delta t_k = \infty$.

Secondly, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ linearly. We denote
$$e_k = x_k - x^{*}. \tag{77}$$
From equations (19)-(20) and (77), we have
$$e_{k+1} = e_k + s_k = e_k + \frac{\Delta t_k}{1 + \Delta t_k}\, (\mu_k I - J(x_k))^{-1} F(x_k). \tag{78}$$
By rearranging equation (78), we obtain
$$(\mu_k I - J(x_k))\, e_{k+1} = (\mu_k I - J(x_k))\, e_k + \frac{\Delta t_k}{1 + \Delta t_k} \left( F(x_k) - F(x^{*}) \right) = \mu_k\, e_k - \frac{1}{1 + \Delta t_k}\, J(x_k)\, e_k + \frac{\Delta t_k}{1 + \Delta t_k} \int_0^1 \left( J(x^{*} + t\, e_k) - J(x_k) \right) e_k\, dt. \tag{79}$$
Since $c^{T} s_k = 0$ and the subsequence $\{x_{k_i}\}_{i=0}^{+\infty}$ converges to $x^{*}$, from equation (78) we have
$$c^{T} e_{k+1} = c^{T} e_k + c^{T} s_k = c^{T} e_k = \cdots = c^{T} e_{k_i} \to 0, \quad \text{when } i \to \infty.$$
That is to say, $c^{T} e_k = 0$, $k = 0, 1, \ldots$, so $e_k \in S_c$. Thus, from the one-sided Lipschitz condition (58) and the Cauchy-Schwarz inequality $|x^{T} y| \leq \|x\|\, \|y\|$, we have
$$\|e_{k+1}\| \left\| (\mu_k I - J(x_k))\, e_{k+1} \right\| \geq e_{k+1}^{T} (\mu_k I - J(x_k))\, e_{k+1} = \mu_k\, e_{k+1}^{T} e_{k+1} - e_{k+1}^{T} J(x_k)\, e_{k+1} \geq (\mu_k + \nu)\, \|e_{k+1}\|^2. \tag{80}$$
By rearranging inequality (80), we obtain
$$\|e_{k+1}\| \leq \frac{1}{\mu_k + \nu} \left\| (\mu_k I - J(x_k))\, e_{k+1} \right\|. \tag{81}$$
From the continuity of $J$ at $x^{*}$, there exist positive constants $M$ and $\epsilon$ such that $\|J(x)\| \leq M$ when $\|x - x^{*}\| < \epsilon$. Since the subsequence $\{x_{k_i}\}$ converges to $x^{*}$, there exists $K_1$ such that $\|x_{K_1} - x^{*}\| < \epsilon$. By combining this with $\lim_{k \to \infty} \Delta t_k = \infty$, we can select a sufficiently large number $K_2$ such that $\Delta t_{K_2} \geq 2M/\nu$ and $\|e_{K_2}\| \leq 0.5\, \nu/L$. Set $K_3 = \max\{K_1, K_2\}$. From equation (79) and the Lipschitz continuity (34), we have
$$\left\| (\mu_{K_3} I - J(x_{K_3}))\, e_{K_3 + 1} \right\| \leq \mu_{K_3}\, \|e_{K_3}\| + \frac{1}{1 + \Delta t_{K_3}}\, \|J(x_{K_3})\|\, \|e_{K_3}\| + \frac{\Delta t_{K_3}}{1 + \Delta t_{K_3}} \int_0^1 \|J(x^{*} + t\, e_{K_3}) - J(x_{K_3})\|\, \|e_{K_3}\|\, dt \leq \left( \mu_{K_3} + \frac{M}{1 + \Delta t_{K_3}} + \frac{L}{2}\, \|e_{K_3}\| \right) \|e_{K_3}\|.$$
By combining this with inequality (81), $\Delta t_{K_3} \geq 2M/\nu$, and $\|e_{K_3}\| \leq \nu/(2L)$, we obtain
$$\|e_{K_3 + 1}\| \leq \frac{\mu_{K_3} + \frac{M}{1 + \Delta t_{K_3}} + \frac{L}{2}\, \|e_{K_3}\|}{\mu_{K_3} + \nu}\, \|e_{K_3}\| \leq \frac{\mu_{K_3} + \frac{3}{4}\, \nu}{\mu_{K_3} + \nu}\, \|e_{K_3}\| < \|e_{K_3}\| < \epsilon. \tag{82}$$
Therefore, by induction, we obtain
$$\|e_{k+1}\| \leq \frac{\mu_k + \frac{M}{1 + \Delta t_k} + \frac{L}{2}\, \|e_k\|}{\mu_k + \nu}\, \|e_k\| \leq \frac{\mu_k + \frac{3}{4}\, \nu}{\mu_k + \nu}\, \|e_k\|, \quad \text{when } k \geq K_3. \tag{83}$$
Furthermore, from the definition (22), we know that $\mu_k \leq c_{\varepsilon}$. By substituting this into inequality (83), we have
$$\|e_{k+1}\| \leq q\, \|e_k\| \leq \cdots \leq q^{\,k - K_3 + 1}\, \|e_{K_3}\|, \quad q \triangleq \frac{c_{\varepsilon} + \frac{3}{4}\, \nu}{c_{\varepsilon} + \nu} < 1, \quad \text{when } k \geq K_3.$$
Consequently, we obtain $\lim_{k \to \infty} \|e_k\| = 0$.

Finally, we prove that the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly. From $\lim_{k \to \infty} \Delta t_k = \infty$, we can select a sufficiently large number $K_{\mu}$ such that $1/\Delta t_k < c_{\varepsilon}$ when $k \geq K_{\mu}$. Thus, from the definition (22) of the regularization parameter $\mu_k$, we know $\mu_k = 1/\Delta t_k$ when $k \geq K_{\mu}$. By substituting this into inequality (83), and using $\lim_{k \to \infty} \Delta t_k = \infty$ and $\lim_{k \to \infty} \|e_k\| = 0$, we obtain
$$\frac{\|e_{k+1}\|}{\|e_k\|} \leq \frac{\mu_k + \frac{M}{1 + \Delta t_k} + \frac{L}{2}\, \|e_k\|}{\mu_k + \nu} = \frac{\frac{1}{\Delta t_k} + \frac{M}{1 + \Delta t_k} + \frac{L}{2}\, \|e_k\|}{\frac{1}{\Delta t_k} + \nu} \to 0, \quad \text{when } k \to \infty.$$
That is to say, the sequence $\{x_k\}$ converges to $x^{*}$ superlinearly. ⊓⊔
4 Numerical experiments

In this section, for some practical equilibrium problems and some classical test problems of nonlinear equations, we test the performance of Algorithm 1 (CNMTr) and compare it with the trust-region method (the built-in subroutine fsolve.m of the MATLAB environment [32,35]) and the homotopy continuation methods (HOMPACK90 [48] and NAClab [24,49,50]).

HOMPACK90 [48] is a classical homotopy method implemented in Fortran 90 for nonlinear equations, and it is very popular in engineering fields. Another state-of-the-art homotopy method is the built-in subroutine psolve.m of the NAClab environment [24,49]. Since psolve.m only solves polynomial systems, we replace psolve.m with its subroutine GaussNewton.m (the Gauss-Newton method) for non-polynomial systems. We therefore compare these two homotopy methods with Algorithm 1 as well.

We collect 26 test problems of nonlinear equations, some of which come from the equilibrium problems of chemical reactions [16,18,41,47], and some of which come from the classical test problems [8,10,28,36,37]. Brief descriptions are given in Table 2, and detailed descriptions can be found in the preprint [31] and the references therein. Their dimensions vary from 1 to 3000, and the Jacobian matrix $J(x)$ of $F(x)$ is singular for some of the test problems. The codes are executed on an HP Pavilion notebook with an Intel quad-core CPU. The termination condition is given by
$$\|F(x_{it})\|_{\infty} \leq 10^{-6}. \tag{84}$$
The numerical results are given in Table 1 and Table 3. The number of iterations of CNMTr and fsolve is illustrated by Figure 1, and the consumed time of the four methods (CNMTr, HOMPACK90, fsolve and NAClab) is illustrated by Figure 2. From Table 1 and Table 3, we find that CNMTr performs well on these test problems, while the trust-region method (fsolve) and the classical homotopy methods (HOMPACK90 and NAClab) fail to solve some of them, especially the practical problems with non-isolated singular Jacobian matrices, such as examples 1, 2, 3, 4, 6, 21 and 23. Furthermore, from Figures 1 and 2, we also find that CNMTr has the same fast convergence property as the traditional optimization method (fsolve). Detailed descriptions of the numerical results can be found in the preprint [31].

Table 1: Statistical results.

                                      CNMTr   HOMPACK90   fsolve   NAClab
number of failed problems               0         7          9       13
number of the minimum consumed time    19         0          7        0
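For readers who wish to reproduce this kind of comparison, a minimal MATLAB harness might look as follows; this is our sketch, it reuses the cnmtr function sketched after Algorithm 1, and the fsolve option names assume a reasonably recent MATLAB release:

% Compare CNMTr with fsolve on one small nonlinear system (a toy run).
F   = @(x) [x(1)^2 + x(2)^2 - 2; exp(x(1) - 1) + x(2)^3 - 2];
Jac = @(x) [2*x(1), 2*x(2); exp(x(1) - 1), 3*x(2)^2];
x0  = [3; 3];

tic;  [x1, res1, iter1] = cnmtr(F, Jac, x0, 1e-6);  t1 = toc;

opts = optimoptions('fsolve', 'SpecifyObjectiveGradient', true, ...
                    'FunctionTolerance', 1e-6, 'Display', 'off');
tic;  x2 = fsolve(@(x) deal(F(x), Jac(x)), x0, opts);  t2 = toc;

fprintf('CNMTr : %.3f s, ||F|| = %.2e (%d iterations)\n', t1, res1, iter1);
fprintf('fsolve: %.3f s, ||F|| = %.2e\n', t2, norm(F(x2)));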
Fig. 1: The number of iterations (CNMTr vs. fsolve).

Fig. 2: The consumed time (CNMTr, HOMPACK90, fsolve and NAClab).

5 Conclusions

In this article, we consider the continuation Newton method with the new residual trust-region time-stepping scheme (CNMTr) for nonlinear equations, and we analyze its local and global convergence under standard assumptions. Finally, for some classical test problems, we compare it with the classical homotopy methods (HOMPACK90 and psolve.m of NAClab) and the traditional optimization method (fsolve.m). Numerical results show that CNMTr preserves the robust convergence property of the classical homotopy methods and the fast convergence property of the traditional optimization method. From our point of view, the explicit continuation Newton method (Algorithm 1) is worth investigating further as a special continuation method. Furthermore, we will apply it to global optimization problems and report the numerical results in the near future.
Acknowledgments
This work was supported in part by Grant 61876199 from the National Natural Science Foundation of China and Grant YJCB2011003HI from the Innovation Research Program of Huawei Technologies Co., Ltd.
References
1. Allgower, E.L., Georg, K.: Introduction to Numerical Continuation Methods, SIAM, Philadelphia (2003)
2. Ascher, U.M., Petzold, L.R.: Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, SIAM, Philadelphia (1998)
3. Axelsson, O., Sysala, S.: Continuation Newton methods. Comput. Math. Appl. 70, 2621-2637 (2015)
4. Branin, F.H.: Widely convergent method for finding multiple solutions of simultaneous nonlinear equations. IBM J. Res. Dev. 16, 504-521 (1972)
5. Brenan, K.E., Campbell, S.L., Petzold, L.R.: Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations, SIAM, Philadelphia (1996)
6. Conn, A.R., Gould, N., Toint, Ph.L.: Trust-Region Methods, SIAM, Philadelphia (2000)
7. Davidenko, D.F.: On a new method of numerical solution of systems of nonlinear equations (in Russian). Dokl. Akad. Nauk SSSR 88, 601-602 (1953)
8. Deuflhard, P.: Newton Methods for Nonlinear Problems: Affine Invariance and Adaptive Algorithms, Springer, Berlin (2004)
9. Deuflhard, P., Pesch, H.J., Rentrop, P.: A modified continuation method for the numerical solution of nonlinear two-point boundary value problems by shooting techniques. Numer. Math. 26, 327-343 (1975)
10. Dennis, J.E., Schnabel, R.B.: Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Philadelphia (1996)
11. Dennis, J.E., Moré, J.J.: A characterization of superlinear convergence and its application to quasi-Newton methods. Math. Comput. 28, 549-560 (1974)
12. Doedel, E.J.: Lecture notes in numerical analysis of nonlinear equations. In: Krauskopf, B., Osinga, H.M., Galán-Vioque, J. (eds.): Numerical Continuation Methods for Dynamical Systems, pp. 1-50, Springer, Berlin (2007)
13. Golub, G.H., Van Loan, C.F.: Matrix Computations (4th ed.), The Johns Hopkins University Press, Baltimore (2013)
14. Griewank, A.: On solving nonlinear equations with simple singularities or nearly singular solutions. SIAM Rev. 27, 537-563 (1985)
15. Hairer, E., Nørsett, S.P., Wanner, G.: Solving Ordinary Differential Equations I, Nonstiff Problems (2nd ed.), Springer, Berlin (1993)
16. Hairer, E., Wanner, G.: Solving Ordinary Differential Equations II, Stiff and Differential-Algebraic Problems (2nd ed.), Springer, Berlin (1996)
17. Hansen, P.C.: Regularization Tools: A MATLAB package for analysis and solution of discrete ill-posed problems. Numer. Algorithms 6, 1-35 (1994)
18. Hiebert, K.L.: An evaluation of mathematical software that solves systems of nonlinear equations. ACM Trans. Math. Softw. 8, 5-20 (1982)
19. Higham, D.J.: Trust region algorithms and timestep selection. SIAM J. Numer. Anal. 37, 194-210 (1999)
20. Kalaba, R.F., Zagustin, E., Holbrow, W., Huss, R.: A modification of Davidenko's method for nonlinear systems. Comput. Math. Appl. 3, 315-319 (1977)
21. Kelley, C.T., Keyes, D.E.: Convergence analysis of pseudo-transient continuation. SIAM J. Numer. Anal. 35, 508-523 (1998)
22. Kelley, C.T.: Solving Nonlinear Equations with Newton's Method, SIAM, Philadelphia (2003)
23. Kelley, C.T.: Numerical methods for nonlinear equations. Acta Numer. 27, 207-287 (2018)
24. Lee, T.L., Li, T.Y., Tsai, C.H.: HOM4PS-2.0: a software package for solving polynomial systems by the polyhedral homotopy continuation method. Computing 83, 109-133 (2008)
25. Levenberg, K.: A method for the solution of certain problems in least squares. Q. Appl. Math. 2, 164-168 (1944)
26. Liao, S.J.: Homotopy Analysis Method in Nonlinear Differential Equations, Springer, Berlin (2012)
27. Logan, S.R.: Fundamentals of Chemical Kinetics, Longman Group Limited, London (1996)
28. Lukšan, L.: Inexact trust region method for large sparse systems of nonlinear equations. J. Optim. Theory Appl. 81, 569-590 (1994)
29. Luo, X.-L.: A trajectory-following method for solving the steady state of chemical reaction rate equations. J. Theor. Comput. Chem. 8, 1025-1044 (2009)
30. Luo, X.-L.: A second-order pseudo-transient method for steady-state problems. Appl. Math. Comput. 216, 1752-1762 (2010)
31. Luo, X.-L., Xiao, H., Lv, J.-H.: Continuation Newton method with the trust-region time-stepping scheme. arXiv preprint 2006.02634, available at http://arxiv.org/abs/2006.02634, Beijing (June 2020)
32. MATLAB 9.6.0 (R2019a), The MathWorks Inc. (2019)
33. Marquardt, D.: An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 11, 431-441 (1963)
34. Meintjes, K., Morgan, A.P.: Chemical equilibrium systems as numerical test problems. ACM Trans. Math. Softw. 16, 143-151 (1990)
35. Moré, J.J.: The Levenberg-Marquardt algorithm: implementation and theory. In: Watson, G.A. (ed.): Numerical Analysis, Lecture Notes in Mathematics, vol. 630, pp. 105-116, Springer, Berlin (1978)
36. Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 17-41 (1981)
37. Nocedal, J., Wright, S.J.: Numerical Optimization, Springer, Berlin (1999)
38. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables, SIAM, Philadelphia (2000)
39. Powell, M.J.D.: Convergence properties of a class of minimization algorithms. In: Mangasarian, O.L., Meyer, R.R., Robinson, S.M. (eds.): Nonlinear Programming 2, pp. 1-27, Academic Press, New York (1975)
40. Pontryagin, L.S.: Ordinary Differential Equations, Higher Education Press, Beijing (2006)
41. Robertson, H.H.: The solution of a set of reaction rate equations. In: Walsh, J. (ed.): Numerical Analysis, an Introduction, pp. 178-182, Academic Press, New York (1966)
42. Shampine, L.F.: Linear conservation laws for ODEs. Comput. Math. Appl. 35, 45-53 (1998)
43. Shampine, L.F.: Conservation laws and the numerical solution of ODEs, II. Comput. Math. Appl. 38, 61-72 (1999)
46. Tanabe, K.: Continuous Newton-Raphson method for solving an underdetermined system of nonlinear equations. Nonlinear Anal. 3, 495-503 (1979)
48. Watson, L.T., Sosonkina, M., Melville, R.C., Morgan, A.P., Walker, H.F.: Algorithm 777: HOMPACK90: a suite of Fortran 90 codes for globally convergent homotopy algorithms. ACM Trans. Math. Softw. 23, 514-549 (1997)
49. Zeng, Z., Li, T.Y.: NAClab: a Matlab toolbox for numerical algebraic computation. ACM Commun. Comput. Algebra 47, 170-173 (2013). Software available at http://homepages.neiu.edu/~zzeng/naclab.html
50. Zeng, Z.G.: A Newton's iteration converges quadratically to nonisolated solutions too. Preprint, Department of Mathematics, Northeastern Illinois University, Chicago (2019), http://homepages.neiu.edu/~zzeng

Table 2: Test problems.

No.        dimension    problem description
Exam 1
Exam 2
Exam 3     n = 20       the pollution problem [47]
Exam 4
Exam 5     n = 1        F(x) = sin(x) - x (p. 279, [37])
Exam 6     n = 2        a system with exp(x + y) and sin(x + y) terms
Exam 7
Exam 8
Exam 9
Exam 10
Exam 11
Exam 12
Exam 13
Exam 14
Exam 15    n = 10       the tridiagonal system [28]
Exam 16    n = 10       the discrete boundary-value problem [28]
Exam 17    n = 100      Broyden tridiagonal problem [28]
Exam 18
Exam 19
Exam 20    n = 2        f_1(x) = x_1^2 + x_2^2 - 2, f_2(x) = exp(x_1 - 1) + x_2^3 - 2
Exam 21
Exam 22
Exam 23
Exam 24    n = 10       Brown almost linear function [36]
Exam 25
Exam 26
Table 3: Numerical results.

Problem   CNMTr                  HOMPACK90              fsolve                 NAClab (psolve)
          CPU (s)  ‖F(x_it)‖∞    CPU (s)  ‖F(x_it)‖∞    CPU (s)  ‖F(x_it)‖∞    CPU (s)  ‖F(x_it)‖∞