On the order optimality of the regularization via inexact Newton iterations
Qinian Jin

Abstract
Inexact Newton regularization methods have been proposed by Hanke and Rieder for solving nonlinear ill-posed inverse problems. Every such method consists of two components: an outer Newton iteration and an inner scheme providing increments by regularizing local linearized equations. The method is terminated by a discrepancy principle. In this paper we consider the inexact Newton regularization methods with the inner scheme defined by Landweber iteration, the implicit iteration, the asymptotic regularization and Tikhonov regularization. Under certain conditions we obtain the order optimal convergence rate result which improves the suboptimal one of Rieder. We in fact obtain a more general order optimality result by considering these inexact Newton methods in Hilbert scales.
Mathematics Subject Classification (2000) · ·

Inverse problems arise whenever one searches for unknown causes based on observations of their effects. Driven by the requirements of a huge number of practical applications, the field of inverse problems has undergone tremendous growth. Such problems are usually ill-posed in the sense that their solutions do not depend continuously on the data. In practical applications one never has exact data; instead, only noisy data are available due to errors in the measurements. Even if the deviation is very small, algorithms developed for well-posed problems may fail, since noise can be amplified by an arbitrarily large factor. Therefore, the development of stable methods for solving inverse problems is a central topic.
Qinian Jin
Mathematical Sciences Institute, The Australian National University, Canberra, ACT 0200, Australia
E-mail: [email protected]
In this paper we consider the stable resolution of nonlinear inverse problems which mathematically can be formulated as the nonlinear equation

F(x) = y,    (1.1)

where F : D(F) ⊂ X → Y is a nonlinear Fréchet differentiable operator between two Hilbert spaces X and Y whose norms and inner products are denoted by ‖·‖ and (·,·) respectively. We use F′(x) to denote the Fréchet derivative of F at x ∈ D(F) and F′(x)* to denote the adjoint of F′(x). We assume that (1.1) has a solution x† in the domain D(F) of F, i.e. F(x†) = y. Let y^δ be the only available noisy data of y satisfying

‖y^δ − y‖ ≤ δ    (1.2)

with a given small noise level δ > 0. Due to the intrinsic ill-posedness, regularization methods should be employed to produce from y^δ a stable approximate solution of (1.1).

Many regularization methods have been considered in the last two decades. Due to their straightforward implementation and fast convergence property, Newton type regularization methods are attractive for solving nonlinear inverse problems. In [8] we considered a general class of Newton type methods of the form

x_{n+1} = x_n + g_{t_n}(F′(x_n)* F′(x_n)) F′(x_n)* (y^δ − F(x_n)),    (1.3)

where x_0 is an initial guess of x†, {t_n} is a sequence of positive numbers, and {g_t} is a family of spectral filter functions. The scheme (1.3) can be derived by applying the linear regularization method defined by {g_t} to the linearized equation

F′(x_n)(x − x_n) = y^δ − F(x_n),

which follows from (1.1) by replacing y by y^δ and F(x) by its linearization F(x_n) + F′(x_n)(x − x_n) at x_n. When the sequence {t_n} is given a priori with suitable properties, we showed in [8] that, under the discrepancy principle, the methods are convergent and order optimal. We also considered in [9] these methods in Hilbert scales and obtained the order optimal convergence rates.

In the definition of the Newton type methods (1.3), one may instead determine the sequence {t_n} adaptively during the computation. Motivated by the inexact Newton methods in [1] for well-posed problems, Hanke proposed in [4] his regularizing Levenberg-Marquardt scheme for solving nonlinear inverse problems, with {t_n} chosen to satisfy

‖y^δ − F(x_n) − F′(x_n)(x_{n+1} − x_n)‖ = η ‖y^δ − F(x_n)‖

at each step for some preassigned number η ∈ (0, 1) and with the discrepancy principle used to terminate the iteration. Rieder generalized the idea in [4] and proposed in [12] (see also [10]) a general class of inexact Newton methods; every such method consists of two components: an outer Newton iteration and an inner scheme providing the increment by regularizing local linearized equations. When the inner scheme is defined by an iterative method, the number of iterations is determined adaptively, which has the advantage of avoiding the over-solving of the linearized equation that may occur when the inner scheme is terminated a priori. The convergence rates of inexact Newton regularization methods were considered in [13], but only suboptimal ones were derived. It is a longstanding question whether the inexact Newton methods are order optimal. Important progress has been made recently in [5], where the regularizing Levenberg-Marquardt scheme is shown to be order optimal. In this paper we consider the inexact Newton regularization methods in which the inner schemes are defined by applying various linear regularization methods, including Landweber iteration, the implicit iteration, the asymptotic regularization and Tikhonov regularization, to the local linearized equations, and we show that these methods are indeed order optimal by exploiting ideas developed in [5,9,10]. We in fact consider these methods in Hilbert scales and derive the order optimal convergence rates. Our theoretical results confirm the numerical illustrations in [12,13].

This paper is organized as follows. In Section 2 we formulate the methods precisely and state the main results on the order optimal convergence rates. In Section 3 we show that these methods are well-defined and prove that the error decays monotonically. In Section 4 we complete the proof of the main result by deriving the order optimal convergence rates.
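As an illustration only (not taken from the paper), the outer/inner structure described above can be sketched in code. The sketch below uses Landweber iteration as the inner scheme on a hypothetical two-dimensional problem; the forward map F, the noise, and the parameters η = 0.7, τ = 3 (chosen so that τη > 2) and the Landweber step size ω are all assumptions made for this example.

```python
# Illustrative sketch (not the paper's setting): inexact Newton method with an
# inner Landweber iteration on a hypothetical 2-D problem. F, eta, tau and the
# step size omega are assumptions for this toy example; tau*eta = 2.1 > 2.

def F(x):
    # mildly nonlinear forward operator (illustrative)
    return [x[0] + 0.1 * x[1] ** 2, x[1] + 0.1 * x[0] ** 2]

def jac(x):
    # Frechet derivative (Jacobian) of F at x
    return [[1.0, 0.2 * x[1]], [0.2 * x[0], 1.0]]

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def matvec_t(A, v):
    # multiply by the transpose (adjoint) of A
    return [A[0][0] * v[0] + A[1][0] * v[1], A[0][1] * v[0] + A[1][1] * v[1]]

def norm(v):
    return (v[0] ** 2 + v[1] ** 2) ** 0.5

def inexact_newton(y_delta, delta, x0, eta=0.7, tau=3.0, omega=0.4):
    x = list(x0)
    for _ in range(200):                          # outer Newton iteration
        b = [y_delta[i] - F(x)[i] for i in range(2)]
        if norm(b) <= tau * delta:                # discrepancy principle stop
            break
        A = jac(x)
        u = [0.0, 0.0]
        # inner Landweber iteration, stopped once the linearized residual
        # has been reduced by the factor eta
        while norm([b[i] - matvec(A, u)[i] for i in range(2)]) > eta * norm(b):
            r = [b[i] - matvec(A, u)[i] for i in range(2)]
            g = matvec_t(A, r)                    # Landweber step u += omega * A^T r
            u = [u[0] + omega * g[0], u[1] + omega * g[1]]
        x = [x[0] + u[0], x[1] + u[1]]            # Newton update x_{n+1} = x_n + u_n
    return x

x_true = [0.5, -0.3]
y = F(x_true)
delta = 1e-3
y_delta = [y[0] + delta, y[1]]                    # noisy data with ||y_delta - y|| = delta
x_rec = inexact_newton(y_delta, delta, [0.0, 0.0])
```

The adaptive inner stopping rule is what distinguishes the scheme from a Newton method with an a priori chosen number of inner steps: the inner loop runs only until the linearized residual has dropped by the factor η, avoiding over-solving of the linearized equation.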
The inexact Newton regularization methods are a family of methods for solving nonlinear ill-posed inverse problems. Every such method consists of two components, an outer Newton iteration and an inner scheme providing increments by regularizing local linearized equations. An approximate solution is output by a discrepancy principle.

To be more precise, the method starts with an initial guess x_0 ∈ D(F). Assuming that x_n is the current iterate, one may apply any regularization scheme to the linearized equation

F′(x_n) u = y^δ − F(x_n)    (2.1)

to produce a family of regularized approximations {u_n(t)}. One may then choose t_n to be the smallest number t_n > 0 such that

‖y^δ − F(x_n) − F′(x_n) u_n(t_n)‖ ≤ η ‖y^δ − F(x_n)‖    (2.2)

for some preassigned value 0 < η < 1. The next iterate is then updated as x_{n+1} = x_n + u_n(t_n). The outer Newton iteration is terminated by the discrepancy principle

‖y^δ − F(x_{n_δ})‖ ≤ τδ < ‖y^δ − F(x_n)‖,  0 ≤ n < n_δ,    (2.3)

for some given number τ > 1. This outputs an integer n_δ and hence x_{n_δ}, which is used to approximate the exact solution x†.

The convergence rates of the inexact Newton regularization methods have been considered in [12,13]. It has been shown that if x_0 − x† ∈ R((F′(x†)* F′(x†))^µ) for some 0 < µ ≤ 1/2, then there is a number 0 < µ̃ < µ such that

‖x_{n_δ} − x†‖ = O(δ^{(µ−µ̃)/(1+2µ)}),

which is only suboptimal. It is a long-standing question whether the inexact Newton regularization methods are order optimal. Important progress has been made recently in [5], where the regularizing Levenberg-Marquardt scheme is proved to be order optimal.

In this paper we will consider the inexact Newton regularization methods in which the inner schemes are defined by applying Landweber iteration, the implicit iteration, the asymptotic regularization, or Tikhonov regularization to the linearized equation (2.1), and show that these methods are indeed order optimal. For these four methods, u_n(t) is defined by

u_n(t) = g_t(F′(x_n)* F′(x_n)) F′(x_n)* (y^δ − F(x_n))

with the spectral filter functions {g_t} given by

g_t(λ) = Σ_{j=0}^{[t]−1} (1 − λ)^j,   Σ_{j=1}^{[t]} (1 + λ)^{−j},   λ^{−1}(1 − e^{−tλ}),   (t^{−1} + λ)^{−1}    (2.4)

respectively, where [t] denotes the largest integer not greater than t.

We need the following standard condition which is known as the Newton-Mysovskii condition (see [2]).

Assumption 2.1 (a) There exists K ≥ 0 such that

‖[F′(x) − F′(z)]h‖ ≤ K ‖x − z‖ ‖F′(z)h‖,  ∀ h ∈ X,

for all x, z ∈ B_ρ(x†) ⊂ D(F), where B_ρ(x†) denotes the ball of radius ρ > 0 with center at x†.

(b) F is properly scaled so that ‖F′(x)‖ ≤ Θ < 1 for all x ∈ B_ρ(x†).

The order optimality of these four inexact Newton regularization methods is contained in the following result.
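As a quick numerical sanity check (illustrative only, not part of the paper's argument), the four filters in (2.4) have the associated residual functions r_t(λ) := 1 − λ g_t(λ) equal to (1 − λ)^{[t]}, (1 + λ)^{−[t]}, e^{−tλ} and (1 + tλ)^{−1} respectively; these closed forms, which are used repeatedly in the analysis, can be verified directly:

```python
# Illustrative check (not from the paper) of the residual functions
# r_t(lambda) = 1 - lambda * g_t(lambda) for the four filters in (2.4).
import math

def g_landweber(t, lam):
    return sum((1.0 - lam) ** j for j in range(int(t)))            # sum_{j=0}^{[t]-1}

def g_implicit(t, lam):
    return sum((1.0 + lam) ** (-j) for j in range(1, int(t) + 1))  # sum_{j=1}^{[t]}

def g_asymptotic(t, lam):
    return (1.0 - math.exp(-t * lam)) / lam

def g_tikhonov(t, lam):
    return 1.0 / (1.0 / t + lam)

t, lam = 10, 0.5
r_landweber = 1.0 - lam * g_landweber(t, lam)    # should equal (1 - lam)^[t]
r_implicit = 1.0 - lam * g_implicit(t, lam)      # should equal (1 + lam)^(-[t])
r_asymptotic = 1.0 - lam * g_asymptotic(t, lam)  # should equal exp(-t*lam)
r_tikhonov = 1.0 - lam * g_tikhonov(t, lam)      # should equal (1 + t*lam)^(-1)
```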
Theorem 2.1
Let F satisfy Assumption 2.1, let τ > 1 and 0 < η < 1 be such that τη > 2, and let x_0 ∈ B_ρ(x†). If K‖x_0 − x†‖ is sufficiently small, then the inexact Newton regularization methods with the inner scheme defined by Landweber iteration, the implicit iteration, the asymptotic regularization, or Tikhonov regularization are well-defined and terminate after n_δ = O(1 + |log δ|) iterations. If, in addition, x_0 − x† = (F′(x†)* F′(x†))^µ ω for some ω ∈ N(F′(x†))^⊥ ⊂ X and 0 < µ ≤ 1/2, and if K‖ω‖ is sufficiently small, then there holds

‖x_{n_δ} − x†‖ ≤ C ‖ω‖^{1/(1+2µ)} δ^{2µ/(1+2µ)}

for some constant C independent of δ and ‖ω‖.

We will not give the proof of Theorem 2.1 directly. Instead, we will prove a more general result by considering these four inexact Newton regularization methods in Hilbert scales. Let L be a densely defined self-adjoint strictly positive linear operator in X satisfying

‖x‖² ≤ γ (Lx, x),  x ∈ D(L),

for some constant γ > 0, where D(L) denotes the domain of L. For each t ∈ ℝ, we define X_t to be the completion of ∩_{k=0}^∞ D(L^k) with respect to the Hilbert space norm ‖x‖_t := ‖L^t x‖. This family of Hilbert spaces {X_t}_{t∈ℝ} is called the Hilbert scale generated by L. The following are fundamental properties (see [3]):

(a) For any −∞ < q < r < ∞, X_r is densely and continuously embedded into X_q with

‖x‖_q ≤ γ^{r−q} ‖x‖_r,  x ∈ X_r.    (2.5)

(b) For any −∞ < p < q < r < ∞ there holds the interpolation inequality

‖x‖_q ≤ ‖x‖_p^{(r−q)/(r−p)} ‖x‖_r^{(q−p)/(r−p)},  x ∈ X_r.    (2.6)

(c) If T : X → Y is a bounded linear operator satisfying

m ‖h‖_{−a} ≤ ‖T h‖ ≤ M ‖h‖_{−a},  h ∈ X,

for some constants M ≥ m > 0 and a ≥ 0, then for the operator A := T L^{−s} : X → Y with s ≥ −a there holds for any |ν| ≤ 1

c(ν) ‖h‖_{−ν(a+s)} ≤ ‖(A*A)^{ν/2} h‖ ≤ c̄(ν) ‖h‖_{−ν(a+s)}    (2.7)

on D((A*A)^{ν/2}), where A* := L^{−s} T* : Y → X is the adjoint of A, c(ν) := min{m^ν, M^ν} and c̄(ν) := max{m^ν, M^ν}.

We will consider the inexact Newton regularization methods in which the inner schemes are defined by applying Landweber iteration, the implicit iteration, the asymptotic regularization, or Tikhonov regularization in Hilbert scales to the linearized equation (2.1). Now we have

u_n(t) = g_t(L^{−2s} F′(x_n)* F′(x_n)) L^{−2s} F′(x_n)* (y^δ − F(x_n))    (2.8)

with g_t defined by (2.4), where s ∈ ℝ is a suitably chosen number. The iterative solutions are defined by x_{n+1} = x_n + u_n(t_n) with t_n determined by (2.2), and the discrepancy principle (2.3) outputs n_δ. We will use x_{n_δ}, constructed from these four inexact Newton regularization methods in Hilbert scales, to approximate the true solution x† of (1.1) and derive the order optimal convergence rate when x_0 − x† ∈ X_µ with s < µ ≤ b + 2s. We need the following condition on the nonlinear operator F.

Assumption 2.2 (a) There exist constants a ≥ 0 and 0 < m ≤ M < ∞ such that

m ‖h‖_{−a} ≤ ‖F′(x)h‖ ≤ M ‖h‖_{−a},  h ∈ X,

for all x ∈ B_ρ(x†).

(b) F is properly scaled so that ‖F′(x) L^{−s}‖_{X→Y} ≤ Θ < 1 for all x ∈ B_ρ(x†), where s ≥ −a.

(c) There exist 0 < β ≤ 1, 0 ≤ b ≤ a and K ≥ 0 such that

‖F′(x) − F′(z)‖_{X_{−b}→Y} ≤ K ‖x − z‖^β

for all x, z ∈ B_ρ(x†).

This condition was first used in [11] for the convergence analysis of the nonlinear Landweber iteration in Hilbert scales. It was then used recently in [7] and [9] for nonlinear Tikhonov regularization and some Newton-type regularization methods in Hilbert scales respectively. One can consult [11,7] for several examples satisfying Assumption 2.2.
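As a concrete toy illustration (an assumption of this example, not from the paper), for a diagonal strictly positive operator L on a finite-dimensional space the scale norms ‖x‖_t = ‖L^t x‖ can be computed directly, and the interpolation inequality (2.6) reduces to Hölder's inequality:

```python
# Toy illustration (not from the paper): Hilbert-scale norms ||x||_t = ||L^t x||
# for a diagonal, strictly positive L on R^3, and a numerical check of the
# interpolation inequality (2.6) with (p, q, r) = (0, 1, 2).

def scale_norm(x, l, t):
    # ||x||_t = ||L^t x|| for L = diag(l) with positive entries l
    return sum((li ** t * xi) ** 2 for li, xi in zip(l, x)) ** 0.5

x = [1.0, 2.0, 3.0]
l = [1.0, 2.0, 5.0]   # eigenvalues of L; all >= 1, so ||x||^2 <= (Lx, x) with gamma = 1

p, q, r = 0, 1, 2
lhs = scale_norm(x, l, q)
rhs = (scale_norm(x, l, p) ** ((r - q) / (r - p))
       * scale_norm(x, l, r) ** ((q - p) / (r - p)))
```

Here the inequality lhs ≤ rhs is exactly (2.6), and since all eigenvalues of L are at least 1, the embedding (2.5) holds with γ = 1, i.e. ‖x‖_0 ≤ ‖x‖_1 in this toy setting.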
Theorem 2.2
Let F satisfy Assumption 2.2 with s ≥ (a − b)/β, let τ > 1 and 0 < η < 1 be such that τη > 2, and let x_0 ∈ D(F) be such that γ^s ‖x_0 − x†‖_s ≤ ρ. If K‖x_0 − x†‖_s^β is sufficiently small, then the inexact Newton regularization methods with the inner scheme defined by Landweber iteration, the implicit iteration, the asymptotic regularization, or Tikhonov regularization in Hilbert scales are well-defined and terminate after n_δ = O(1 + |log δ|) iterations. If, in addition, x_0 − x† ∈ X_µ for some s < µ ≤ b + 2s and K‖x_0 − x†‖_µ^β is sufficiently small, then there holds

‖x_{n_δ} − x†‖_r ≤ C ‖x_0 − x†‖_µ^{(a+r)/(a+µ)} δ^{(µ−r)/(a+µ)}

for all r ∈ [−a, s], where C is a constant independent of δ and ‖x_0 − x†‖_µ.

The proof of Theorem 2.2 will be given in the next two sections. Here some remarks are in order.
Remark 2.1
When the inner scheme is defined by the asymptotic regularization or Tikhonov regularization, there is flexibility to choose t_n to satisfy

η̄ ‖y^δ − F(x_n)‖ ≤ ‖y^δ − F(x_n) − F′(x_n) u_n(t_n)‖ ≤ η ‖y^δ − F(x_n)‖

with some numbers 0 < η̄ ≤ η < 1. Furthermore, we then only need τ > 1 and τη̄ > 1.

Remark 2.2

When s > (a − b)/β, the same order optimal convergence rate in Theorem 2.2 holds for x_0 − x† ∈ X_µ with s ≤ µ ≤ b + 2s, which can be seen from the proof of Lemma 4.4 in Section 4.

Remark 2.3

If the Fréchet derivative F′(x) satisfies the Lipschitz condition

‖F′(x) − F′(z)‖ ≤ K ‖x − z‖,  x, z ∈ B_ρ(x†),

then Assumption 2.2 (c) holds with b = 0 and β = 1, and thus, for these inexact Newton regularization methods in Hilbert scales with s ≥ a, the order optimal convergence rates hold for x_0 − x† ∈ X_µ with s < µ ≤ 2s.

Remark 2.4

We indicate how Theorem 2.1 can be derived from Theorem 2.2. First, we note that Assumption 2.1 (a) implies

‖F(x) − F(z) − F′(z)(x − z)‖ ≤ K ‖x − z‖ ‖F′(z)(x − z)‖

for all x, z ∈ B_ρ(x†). One can then follow the proofs in Section 3 to show that, if x_0 ∈ B_ρ(x†) and K‖x_0 − x†‖ is sufficiently small, then these inexact Newton regularization methods are well-defined and

‖x_{n+1} − x†‖ ≤ ‖x_n − x†‖,  n = 0, ···, n_δ − 1;

in particular x_n ∈ B_ρ(x†) for 0 ≤ n ≤ n_δ. By shrinking the ball B_ρ(x†) if necessary, we can derive from Assumption 2.1 (a) that there exist two constants 0 < C_0 ≤ C_1 < ∞ such that

C_0 ‖F′(z)h‖ ≤ ‖F′(x)h‖ ≤ C_1 ‖F′(z)h‖,  h ∈ X,    (2.9)

for all x, z ∈ B_ρ(x†). This implies that all the operators F′(x) have the same null space N as long as x ∈ B_ρ(x†). By the condition of Theorem 2.1 we have x_0 − x† ∈ N^⊥. By the definition of {x_n} we also have x_{n+1} − x_n ∈ R(F′(x_n)*) ⊂ N^⊥ for n = 0, ···, n_δ − 1. By considering the operator G(z) := F(z + x_0) if necessary, we may assume x_0 = 0. Therefore x†, x_n ∈ N^⊥ for n = 0, ···, n_δ, and we may consider the equation (1.1) on N^⊥. Consequently we may assume N = {0}, i.e. each F′(x) is injective for x ∈ B_ρ(x†).

Now we introduce the operator L := (F′(x†)* F′(x†))^{−1/2}, which is clearly a densely defined self-adjoint strictly positive linear operator in X satisfying

‖x‖² ≤ Θ (Lx, x),  x ∈ D(L).

From (2.9) it follows that

C_0 ‖h‖_{−1} ≤ ‖F′(x)h‖ ≤ C_1 ‖h‖_{−1},

which implies Assumption 2.2 (a) with a = 1. Moreover, from Assumption 2.1 (a) it follows for x, z ∈ B_ρ(x†) that

‖F′(x) − F′(z)‖_{X_{−1}→Y} = ‖[F′(x) − F′(z)] L‖_{X→Y} ≤ K ‖x − z‖ ‖F′(z) L‖_{X→Y}.

Since (2.9) implies ‖F′(z)L‖_{X→Y} ≤ C_1, Assumption 2.2 (c) holds with b = 1 and β = 1. Since R((F′(x†)* F′(x†))^µ) = X_{2µ}, Theorem 2.1 follows immediately from Theorem 2.2 with s = 0.

We start with a simple consequence of Assumption 2.2 which will be used frequently.
Lemma 3.1
Let F satisfy Assumption 2.2 and let x, z ∈ B_ρ(x†). If t ≥ 0, then

‖F(x) − F(z) − F′(z)(x − z)‖ ≤ (1/(1+β)) K ‖x − z‖_t^{(a(1+β)−b)/(a+t)} ‖x − z‖_{−a}^{(t(1+β)+b)/(a+t)}.    (3.1)

If, in addition, t ≥ (a − b)/β, then

‖F(x) − F(z) − F′(z)(x − z)‖ ≤ (1/(1+β)) γ^{tβ+b−a} K ‖x − z‖_t^β ‖x − z‖_{−a}.    (3.2)

Proof
From Assumption 2.2 (c) and the identity

F(x) − F(z) − F′(z)(x − z) = ∫₀¹ [F′(z + t(x − z)) − F′(z)](x − z) dt

it follows immediately that

‖F(x) − F(z) − F′(z)(x − z)‖ ≤ (1/(1+β)) K ‖x − z‖^β ‖x − z‖_{−b}.    (3.3)

With the help of the interpolation inequality (2.6) we have

‖x − z‖ ≤ ‖x − z‖_t^{a/(a+t)} ‖x − z‖_{−a}^{t/(a+t)}  and  ‖x − z‖_{−b} ≤ ‖x − z‖_t^{(a−b)/(a+t)} ‖x − z‖_{−a}^{(t+b)/(a+t)}.

This together with (3.3) gives (3.1). If, in addition, t ≥ (a − b)/β, then we have [t(1+β) + b]/(a + t) ≥ 1. Thus, by using ‖x − z‖_{−a} ≤ γ^{a+t} ‖x − z‖_t, which follows from the embedding (2.5), we can derive (3.2) immediately from (3.1). ✷

In this section we will use the ideas from [4,6,10] to show that the four inexact Newton regularization methods in Hilbert scales stated in Theorem 2.2 are well-defined, and that for the error e_n := x_n − x† there holds ‖e_{n+1}‖_s ≤ ‖e_n‖_s for n = 0, ···, n_δ − 1.

We will use the notation

T := F′(x†),  T_n := F′(x_n),  A := T L^{−s}  and  A_n := T_n L^{−s}.

It follows easily from the definition (2.8) of {u_n(t)} that

u_n(t) = L^{−s} g_t(A_n* A_n) A_n* (y^δ − F(x_n))    (3.4)

and

y^δ − F(x_n) − T_n u_n(t) = r_t(A_n A_n*) (y^δ − F(x_n)),    (3.5)

where r_t(λ) := 1 − λ g_t(λ) denotes the residual function associated with g_t. For the spectral filter functions given in (2.4), it is easy to see that lim_{t→∞} r_t(λ) = 0 for each λ > 0. This implies that

lim_{t→∞} ‖y^δ − F(x_n) − T_n u_n(t)‖ = ‖P_{R(A_n)^⊥}(y^δ − F(x_n))‖,    (3.6)

where P_{R(A_n)^⊥} denotes the orthogonal projection of Y onto R(A_n)^⊥, the orthogonal complement of the range R(A_n) of A_n.

Lemma 3.2
Let F satisfy Assumption 2.2 with s ≥ (a − b)/β, let τ > 1 and 0 < η < 1 satisfy τη > 1, and let x_0 ∈ D(F) be such that γ^s ‖e_0‖_s ≤ ρ. Assume that K‖e_0‖_s^β is sufficiently small. If ‖y^δ − F(x_n)‖ > τδ and ‖e_n‖_s ≤ ‖e_0‖_s, then t_n is well-defined and t_n ≥ c for some constant c > 0 independent of n and δ.

Proof
From (2.5) and the given conditions it follows that

‖e_n‖ ≤ γ^s ‖e_n‖_s ≤ γ^s ‖e_0‖_s ≤ ρ,

which implies x_n ∈ B_ρ(x†). Since ‖e_n‖_s ≤ ‖e_0‖_s < ∞ implies L^s e_n ∈ X, we have

‖P_{R(A_n)^⊥}(y^δ − F(x_n))‖ ≤ ‖y^δ − F(x_n) + A_n L^s e_n‖ = ‖y^δ − F(x_n) + T_n e_n‖.

In order to show that t_n is well-defined, in view of (3.6) it suffices to show

‖y^δ − F(x_n) + T_n e_n‖ < η ‖y^δ − F(x_n)‖.    (3.7)

Since s ≥ (a − b)/β, we can use (1.2) and (3.2) in Lemma 3.1 to derive

‖y^δ − F(x_n) + T_n e_n‖ ≤ δ + (1/(1+β)) γ^{sβ+b−a} K ‖e_n‖_s^β ‖e_n‖_{−a}.

Now by using Assumption 2.2 (a), ‖e_n‖_s ≤ ‖e_0‖_s and τδ < ‖y^δ − F(x_n)‖, we obtain with C = γ^{sβ+b−a}/[(1+β)m] that

‖y^δ − F(x_n) + T_n e_n‖ ≤ (1/τ) ‖y^δ − F(x_n)‖ + CK ‖e_0‖_s^β ‖T_n e_n‖
  ≤ (1/τ + CK ‖e_0‖_s^β) ‖y^δ − F(x_n)‖ + CK ‖e_0‖_s^β ‖y^δ − F(x_n) + T_n e_n‖.

Since τη > 1, we therefore obtain (3.7) if K‖e_0‖_s^β is sufficiently small.

For the inner scheme defined by Landweber iteration or the implicit iteration in Hilbert scales, it is obvious that t_n is an integer with t_n ≥ 1. For the inner scheme defined by the asymptotic regularization or Tikhonov regularization in Hilbert scales, we have

η ‖y^δ − F(x_n)‖ = ‖y^δ − F(x_n) − T_n u_n(t_n)‖ = ‖r_{t_n}(A_n A_n*)(y^δ − F(x_n))‖,

where r_t(λ) = e^{−tλ} or r_t(λ) = (1 + tλ)^{−1}. Since ‖A_n‖ ≤ 1, we can obtain either e^{−t_n} ≤ η or (1 + t_n)^{−1} ≤ η. Therefore t_n ≥ log(1/η) or t_n ≥ 1/η − 1. ✷

Lemma 3.3
Let F satisfy Assumption 2.2 with s ≥ (a − b)/β, let τ > 1 and 0 < η < 1 be such that τη > 2, and let x_0 ∈ D(F) be such that γ^s ‖e_0‖_s ≤ ρ. If K‖e_0‖_s^β is sufficiently small, then the four inexact Newton regularization methods in Hilbert scales stated in Theorem 2.2 are well-defined and terminate after n_δ < ∞ iterations, and

Σ_{n=0}^{n_δ−1} t_n ‖y^δ − F(x_n)‖² ≤ C ‖e_0‖_s²    (3.8)

for some constant C > 0. Moreover

‖x_{n+1} − x†‖_s ≤ ‖x_n − x†‖_s    (3.9)

for n = 0, ···, n_δ − 1.

Proof
We will prove this result for the four inexact Newton methods case by case.

(a) We first consider the inexact Newton method with the inner scheme defined by Landweber iteration in Hilbert scales. We first show the monotonicity (3.9). We may assume n_δ ≥ 1. Let 0 ≤ n < n_δ and assume that ‖e_n‖_s ≤ ‖e_0‖_s. By the definition of n_δ we have ‖y^δ − F(x_n)‖ > τδ. It follows from Lemma 3.2 that t_n is a well-defined positive integer. Let u_{n,k} := u_n(k) for each integer k. Then u_{n,0} = 0 and

u_{n,k} = u_{n,k−1} + L^{−2s} T_n* (y^δ − F(x_n) − T_n u_{n,k−1})

for k = 1, ···, t_n. Recall that x_{n+1} = x_n + u_{n,t_n}. Therefore, in order to show ‖e_{n+1}‖_s ≤ ‖e_n‖_s, it suffices to show

‖e_n + u_{n,k}‖_s ≤ ‖e_n + u_{n,k−1}‖_s,  k = 1, ···, t_n.    (3.10)

We set z_{n,k} := y^δ − F(x_n) − T_n u_{n,k}. Then u_{n,k} − u_{n,k−1} = L^{−2s} T_n* z_{n,k−1} and thus

‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s² = 2(e_n + u_{n,k−1}, u_{n,k} − u_{n,k−1})_s + ‖u_{n,k} − u_{n,k−1}‖_s²
  = (u_{n,k} − u_{n,k−1}, u_{n,k} + u_{n,k−1} + 2e_n)_s
  = (z_{n,k−1}, T_n(u_{n,k} + u_{n,k−1} + 2e_n)).

According to the definition of z_{n,k} one can see that

T_n(u_{n,k} + u_{n,k−1} + 2e_n) = −z_{n,k} − z_{n,k−1} + 2(y^δ − F(x_n) + T_n e_n).

Therefore

‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s² = −(z_{n,k−1}, z_{n,k}) − ‖z_{n,k−1}‖² + 2(z_{n,k−1}, y^δ − F(x_n) + T_n e_n).

Observing that (3.5) and r_t(λ) = (1 − λ)^{[t]} imply z_{n,k} = (I − A_n A_n*)^k (y^δ − F(x_n)), we have (z_{n,k−1}, z_{n,k}) ≥ 0. Hence

‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s² ≤ −‖z_{n,k−1}‖ (‖z_{n,k−1}‖ − 2‖y^δ − F(x_n) + T_n e_n‖).

Since τη > 2, we can pick 0 < η₁ < η/2 such that τη₁ > 1. By using Assumption 2.2, τδ < ‖y^δ − F(x_n)‖ and ‖e_n‖_s ≤ ‖e_0‖_s, we can derive as in the proof of Lemma 3.2 that if K‖e_0‖_s^β is sufficiently small then

‖y^δ − F(x_n) + T_n e_n‖ ≤ η₁ ‖y^δ − F(x_n)‖.

On the other hand, by the definition of t_n we have ‖z_{n,k−1}‖ > η ‖y^δ − F(x_n)‖ for k = 1, ···, t_n. Therefore

‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s² ≤ −ε ‖y^δ − F(x_n)‖²,    (3.11)

where ε := η(η − 2η₁) > 0. This in particular implies (3.10) and hence ‖e_{n+1}‖_s ≤ ‖e_n‖_s. An induction argument then shows the monotonicity result (3.9).

Moreover, it follows from (3.11) that

‖e_{n+1}‖_s² − ‖e_n‖_s² = Σ_{k=1}^{t_n} (‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s²) ≤ −ε t_n ‖y^δ − F(x_n)‖².

Consequently

ε Σ_{n=0}^{n_δ−1} t_n ‖y^δ − F(x_n)‖² ≤ ‖e_0‖_s² − ‖e_{n_δ}‖_s² ≤ ‖e_0‖_s² < ∞,

which shows (3.8). Since t_n ≥ 1 and ‖y^δ − F(x_n)‖ > τδ for 0 ≤ n < n_δ, one can see that n_δ must be finite.

(b) For the inexact Newton method with the inner scheme defined by the implicit iteration in Hilbert scales, all t_n must be positive integers, and with the notation u_{n,k} := u_n(k) we have u_{n,0} = 0 and

u_{n,k} = u_{n,k−1} + (L^{2s} + T_n* T_n)^{−1} T_n* (y^δ − F(x_n) − T_n u_{n,k−1}).

Let z_{n,k} := y^δ − F(x_n) − T_n u_{n,k}. We have from (3.5) and r_t(λ) = (1 + λ)^{−[t]} that

z_{n,k} = (I + A_n A_n*)^{−1} z_{n,k−1}  and  u_{n,k} − u_{n,k−1} = L^{−2s} T_n* z_{n,k}.

Thus

‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s² = (u_{n,k} − u_{n,k−1}, u_{n,k} + u_{n,k−1} + 2e_n)_s
  = (z_{n,k}, T_n(u_{n,k} + u_{n,k−1} + 2e_n))
  = (z_{n,k}, −z_{n,k} − z_{n,k−1} + 2(y^δ − F(x_n) + T_n e_n)).

Note that (z_{n,k}, z_{n,k−1}) ≥ ‖z_{n,k}‖². We then obtain

‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s² ≤ −2‖z_{n,k}‖ (‖z_{n,k}‖ − ‖y^δ − F(x_n) + T_n e_n‖).

By using ‖A_n‖ ≤ 1, we have

‖z_{n,k}‖ ≥ (1/2) ‖z_{n,k−1}‖ ≥ (η/2) ‖y^δ − F(x_n)‖,  k = 1, ···, t_n.

Since τη > 2, we can pick 0 < η₁ < η/2 such that τη₁ > 1, and we can obtain

‖e_n + u_{n,k}‖_s² − ‖e_n + u_{n,k−1}‖_s² ≤ −η(η/2 − η₁) ‖y^δ − F(x_n)‖²

for k = 1, ···, t_n when K‖e_0‖_s^β is sufficiently small. This together with an induction argument implies (3.8) and (3.9).

(c) For the inexact Newton method with the inner scheme defined by the asymptotic regularization in Hilbert scales, u_n(t) is the solution of the initial value problem

(d/dt) u_n(t) = L^{−2s} T_n* (y^δ − F(x_n) − T_n u_n(t)),  t > 0,  u_n(0) = 0.

Therefore, with z_n(t) := y^δ − F(x_n) − T_n u_n(t) we have

(d/dt) ‖e_n + u_n(t)‖_s² = 2((d/dt) u_n(t), e_n + u_n(t))_s = 2(z_n(t), T_n(e_n + u_n(t)))
  = 2(z_n(t), −z_n(t) + y^δ − F(x_n) + T_n e_n)
  ≤ −2‖z_n(t)‖ (‖z_n(t)‖ − ‖y^δ − F(x_n) + T_n e_n‖).

According to the definition of t_n we have ‖z_n(t_n)‖ = η ‖y^δ − F(x_n)‖ and ‖z_n(t)‖ > η ‖y^δ − F(x_n)‖ for 0 ≤ t ≤ t_n. Since τη > 1, we therefore obtain

(d/dt) ‖e_n + u_n(t)‖_s² ≤ −2η(η − η₁) ‖y^δ − F(x_n)‖²,  0 < t ≤ t_n,

if K‖e_0‖_s^β is sufficiently small, where 0 < η₁ < η is such that τη₁ > 1. In view of u_n(0) = 0 and x_{n+1} = x_n + u_n(t_n), we obtain

‖e_{n+1}‖_s² − ‖e_n‖_s² ≤ −2η(η − η₁) t_n ‖y^δ − F(x_n)‖².

This implies (3.8) and (3.9) immediately.

(d) For the inexact Newton method with the inner scheme defined by Tikhonov regularization, we have

u_n(t) = (t^{−1} L^{2s} + T_n* T_n)^{−1} T_n* (y^δ − F(x_n)).

We first observe that

‖e_{n+1}‖_s² − ‖e_n‖_s² = ‖x_{n+1} − x_n‖_s² + 2(x_{n+1} − x_n, e_n)_s ≤ 2(x_{n+1} − x_n, x_{n+1} − x_n + e_n)_s.

Let z_n := y^δ − F(x_n) − T_n(x_{n+1} − x_n). We have from (3.5) and r_t(λ) = (1 + tλ)^{−1} that u_n(t_n) = t_n L^{−2s} T_n* z_n and hence x_{n+1} − x_n = t_n L^{−2s} T_n* z_n. Therefore

‖e_{n+1}‖_s² − ‖e_n‖_s² ≤ 2t_n (z_n, T_n(x_{n+1} − x_n + e_n)) = 2t_n (z_n, −z_n + (y^δ − F(x_n) + T_n e_n))
  ≤ −2t_n ‖z_n‖ (‖z_n‖ − ‖y^δ − F(x_n) + T_n e_n‖).

By the definition of t_n we have ‖z_n‖ = η ‖y^δ − F(x_n)‖. Since τη > 1, we can obtain

‖e_{n+1}‖_s² − ‖e_n‖_s² ≤ −2η(η − η₁) t_n ‖y^δ − F(x_n)‖²

if K‖e_0‖_s^β is sufficiently small, where 0 < η₁ < η is such that τη₁ > 1. This implies (3.8) and (3.9). ✷

Remark 3.1
The inequality (3.8) will find its use in the proof of Lemma 4.4. From (3.8), t_n ≥ c > 0, and the fact that ‖y^δ − F(x_n)‖ > τδ for 0 ≤ n < n_δ, it follows easily that n_δ = O(δ^{−2}), which gives only a rough estimate on the number of outer iterations. However, we should point out that the inexact Newton iterations in Hilbert scales in fact terminate after n_δ = O(1 + |log δ|) outer iterations. This can be confirmed by using the fact

η ‖y^δ − F(x_n)‖ ≥ ‖y^δ − F(x_n) − T_n(x_{n+1} − x_n)‖,  0 ≤ n < n_δ,    (3.12)

which follows from the definition of t_n and x_{n+1} = x_n + u_n(t_n). To see this, by using (3.2) in Lemma 3.1 we have

‖F(x_{n+1}) − F(x_n) − T_n(x_{n+1} − x_n)‖ ≤ (1/(1+β)) γ^{sβ+b−a} K ‖x_{n+1} − x_n‖_s^β ‖x_{n+1} − x_n‖_{−a}.

Since (3.9) implies ‖x_{n+1} − x_n‖_s ≤ ‖e_{n+1}‖_s + ‖e_n‖_s ≤ 2‖e_0‖_s, from Assumption 2.2 (a) we have with C := 2^β γ^{sβ+b−a}/[(1+β)m] that

‖F(x_{n+1}) − F(x_n) − T_n(x_{n+1} − x_n)‖ ≤ CK ‖e_0‖_s^β ‖T_n(x_{n+1} − x_n)‖.

Therefore, if K‖e_0‖_s^β is sufficiently small, then there holds ‖T_n(x_{n+1} − x_n)‖ ≤ 2‖F(x_{n+1}) − F(x_n)‖ and consequently

‖F(x_{n+1}) − F(x_n) − T_n(x_{n+1} − x_n)‖ ≤ 2CK ‖e_0‖_s^β ‖F(x_{n+1}) − F(x_n)‖.    (3.13)

Combining this with (3.12) yields

η ‖y^δ − F(x_n)‖ ≥ ‖y^δ − F(x_{n+1})‖ − 2CK ‖e_0‖_s^β ‖F(x_{n+1}) − F(x_n)‖.

Since η < 1, this in particular implies that if K‖e_0‖_s^β is sufficiently small then

‖y^δ − F(x_{n+1})‖ / ‖y^δ − F(x_n)‖ ≤ (η + 2CK ‖e_0‖_s^β) / (1 − 2CK ‖e_0‖_s^β) ≤ (1 + η)/2 < 1.

Therefore for all n = 0, ···, n_δ there holds

‖y^δ − F(x_n)‖ ≤ ((1 + η)/2)^n ‖y^δ − F(x_0)‖.

By taking n = n_δ − 1 and using ‖y^δ − F(x_{n_δ−1})‖ > τδ we obtain

τδ ≤ ((1 + η)/2)^{n_δ−1} ‖y^δ − F(x_0)‖,

which shows that n_δ = O(1 + |log δ|).

In this section we will show the order optimality of the four inexact Newton methods in Hilbert scales stated in Theorem 2.2. For simplicity of exposition, we will always use C to denote a generic constant independent of δ and n; we will also use the convention Φ ≲ Ψ to mean that Φ ≤ CΨ for some generic constant C when the explicit expression of C is not important. Furthermore, we will use Φ ∼ Ψ to mean that Φ ≲ Ψ and Ψ ≲ Φ.

Lemma 4.1
Under the same conditions as in Lemma 3.3, there holds

‖y^δ − F(x_n)‖ ≲ ‖y^δ − F(x_{n+1})‖,  n = 0, ···, n_δ − 1.

Proof
We first claim that there is a constant c₀ > 0 such that

c₀ ‖y^δ − F(x_n)‖ ≤ ‖y^δ − F(x_n) − T_n(x_{n+1} − x_n)‖.    (4.1)

This is clear from the definition of t_n when the inner scheme is defined by Tikhonov regularization or the asymptotic regularization. When the inner scheme is defined by Landweber iteration, we have r_t(λ) = (1 − λ)^{[t]}. According to the definition of t_n and (3.5), we have

η ‖y^δ − F(x_n)‖ ≤ ‖y^δ − F(x_n) − T_n u_n(t_n − 1)‖ = ‖(I − A_n A_n*)^{t_n−1} (y^δ − F(x_n))‖.

Since ‖A_n‖ ≤ Θ < 1, we have ‖(I − A_n A_n*)^{−1}‖ ≤ (1 − Θ²)^{−1}. Therefore, using (3.5) again it follows that

(1 − Θ²) η ‖y^δ − F(x_n)‖ ≤ ‖(I − A_n A_n*)^{t_n} (y^δ − F(x_n))‖ = ‖y^δ − F(x_n) − T_n(x_{n+1} − x_n)‖,

which shows (4.1) with c₀ = (1 − Θ²)η. When the inner scheme is defined by the implicit iteration, we have r_t(λ) = (1 + λ)^{−[t]}. Thus it follows from (3.5) and ‖A_n‖ ≤ 1 that

η ‖y^δ − F(x_n)‖ ≤ ‖(I + A_n A_n*)^{−t_n+1} (y^δ − F(x_n))‖ ≤ 2 ‖(I + A_n A_n*)^{−t_n} (y^δ − F(x_n))‖ = 2 ‖y^δ − F(x_n) − T_n(x_{n+1} − x_n)‖,

which shows (4.1) with c₀ = η/2. Combining (4.1) with (3.13) then gives

c₀ ‖y^δ − F(x_n)‖ ≤ ‖y^δ − F(x_{n+1})‖ + 2CK ‖e_0‖_s^β ‖F(x_{n+1}) − F(x_n)‖
  ≤ ‖y^δ − F(x_{n+1})‖ + 2CK ‖e_0‖_s^β (‖y^δ − F(x_{n+1})‖ + ‖y^δ − F(x_n)‖).

This shows the result if K‖e_0‖_s^β is sufficiently small. ✷

For the spectral filter functions defined by (2.4), we have shown in [9] that for any sequence of positive numbers {t_n} there hold

0 ≤ λ^ν ∏_{k=j}^{n−1} r_{t_k}(λ) ≤ (s_n − s_j)^{−ν},    (4.2)

0 ≤ λ^ν g_{t_j}(λ) ∏_{k=j+1}^{n−1} r_{t_k}(λ) ≤ t_j (s_n − s_j)^{−ν}    (4.3)

and

0 ≤ λ^ν Σ_{i=0}^{n−1} g_{t_i}(λ) ∏_{k=i+1}^{n−1} r_{t_k}(λ) ≤ s_n^{1−ν}    (4.4)

for 0 ≤ ν ≤ 1, 0 ≤ λ ≤ 1 and j = 0, 1, ···, n − 1, where {s_n} is defined by

s_0 = 0  and  s_n = Σ_{j=0}^{n−1} t_j  for n = 1, 2, ···.    (4.5)

Moreover, we have the following crucial estimate.

Lemma 4.2
Let F satisfy Assumption 2.2, let {g_t} be defined by (2.4) and r_t(λ) = 1 − λ g_t(λ), and let {t_n} be a sequence of positive numbers with {s_n} defined by (4.5). Let A = F′(x†) L^{−s} and for any x ∈ B_ρ(x†) let A_x = F′(x) L^{−s}. Then for −(b+s)/(2(a+s)) ≤ ν ≤ 1/2 there holds

‖(A*A)^ν ∏_{k=j+1}^{n−1} r_{t_k}(A*A) [g_{t_j}(A*A) A* − g_{t_j}(A_x* A_x) A_x*]‖ ≲ t_j (s_n − s_j)^{−ν−(b+s)/(2(a+s))} K ‖x − x†‖^β

for j = 0, 1, ···, n − 1.

Proof We refer to [9, Lemma 2], in which similar estimates have been derived for a general class of spectral filter functions. ✷

We also need the following estimate concerning sums of a suitable type which will occur in the convergence analysis.
Lemma 4.3
Let {t_n} be a sequence of numbers satisfying t_n ≥ c > 0, and let s_n be defined by (4.5). Let p ≥ 0 and q ≥ 0 be two numbers. Then

Σ_{j=0}^{n−1} t_j (s_n − s_j)^{−p} s_{j+1}^{−q} ≤ C ·
  s_n^{1−p−q}  if max{p, q} < 1,
  s_n^{1−p−q} log(1 + s_n)  if max{p, q} = 1,
  s_n^{max{p,q}−p−q}  if max{p, q} > 1,

where C is a constant depending only on p, q and c.

Proof This is essentially contained in [5, Lemma 4.3] and its proof. A simplified proof can be found in [9, Lemma 3]. ✷

Now we are ready to give the crucial estimates on ‖e_n‖_µ and ‖T e_n‖ for 0 ≤ n < n_δ. We will exploit the ideas developed in [5,8,9].

Lemma 4.4
Let $F$ satisfy Assumption 2.2 with $s\ge(a-b)/\beta$, let $\tau>2$ and $0<\eta<1$ be such that $\tau\eta>1$, and let $x_0\in D(F)$ satisfy $\gamma_s\|e_0\|_s\le\rho$. If $e_0\in X_\mu$ for some $s<\mu\le b+2s$ and if $K\|e_0\|_\mu^\beta$ is sufficiently small, then there exists a constant $C_*>0$ such that
$$\|e_n\|_\mu \le C_*\|e_0\|_\mu \quad\text{and}\quad \|Te_n\| \le C_*\|e_0\|_\mu(1+s_n)^{-\frac{a+\mu}{2(a+s)}}$$
for all $n=0,\cdots,n_\delta-1$.

Proof Since $s<\mu\le b+2s$, from (2.7) we have $\|e_n\|_\mu \sim \|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_n\|$. Therefore, it suffices to show that there exists a constant $C_*>0$ such that
$$\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_n\| \le C_*\|e_0\|_\mu \quad\text{and}\quad \|Te_n\| \le C_*\|e_0\|_\mu(1+s_n)^{-\frac{a+\mu}{2(a+s)}} \qquad (4.6)$$
for all $n=0,\cdots,n_\delta-1$. We will show (4.6) by induction. By using (2.7) and Assumption 2.2 (b) we have $\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_0\| \le c\big(\tfrac{s-\mu}{2(a+s)}\big)\|e_0\|_\mu$ and
$$\|Te_0\| = \|(A^*A)^{1/2}L^se_0\| \le \|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_0\| \le c\big(\tfrac{s-\mu}{2(a+s)}\big)\|e_0\|_\mu.$$
Therefore (4.6) with $n=0$ holds for $C_*\ge c\big(\tfrac{s-\mu}{2(a+s)}\big)$. Now we assume that (4.6) is true for all $0\le n<l$ for some $0<l<n_\delta$ and want to show that it is also true for $n=l$.

From the equation (3.4) and $x_{n+1}=x_n+u_n(t_n)$ it follows that
$$e_{n+1} = e_n + L^{-s}g_{t_n}(A_n^*A_n)A_n^*\big(y^\delta-F(x_n)\big) = L^{-s}r_{t_n}(A^*A)L^se_n + L^{-s}g_{t_n}(A^*A)A^*\big(y^\delta-F(x_n)+Te_n\big) + L^{-s}\big[g_{t_n}(A_n^*A_n)A_n^* - g_{t_n}(A^*A)A^*\big]\big(y^\delta-F(x_n)\big).$$
By induction on this equation we obtain
$$e_l = L^{-s}\prod_{j=0}^{l-1}r_{t_j}(A^*A)L^se_0 + L^{-s}\sum_{j=0}^{l-1}\prod_{k=j+1}^{l-1}r_{t_k}(A^*A)\,g_{t_j}(A^*A)A^*(y^\delta-y) + L^{-s}\sum_{j=0}^{l-1}\prod_{k=j+1}^{l-1}r_{t_k}(A^*A)\,g_{t_j}(A^*A)A^*\big(y-F(x_j)+Te_j\big) + L^{-s}\sum_{j=0}^{l-1}\prod_{k=j+1}^{l-1}r_{t_k}(A^*A)\big[g_{t_j}(A_j^*A_j)A_j^* - g_{t_j}(A^*A)A^*\big]\big(y^\delta-F(x_j)\big). \qquad (4.7)$$
By multiplying (4.7) by $T:=F'(x^\dagger)$, noting that $A=TL^{-s}$, and using the identity
$$1-\lambda\sum_{j=0}^{l-1}g_{t_j}(\lambda)\prod_{k=j+1}^{l-1}r_{t_k}(\lambda) = \prod_{j=0}^{l-1}r_{t_j}(\lambda),$$
which follows from the relation $r_t(\lambda)=1-\lambda g_t(\lambda)$, we can obtain
$$Te_l = A\prod_{j=0}^{l-1}r_{t_j}(A^*A)L^se_0 + \Big[I-\prod_{j=0}^{l-1}r_{t_j}(AA^*)\Big](y^\delta-y) + \sum_{j=0}^{l-1}\prod_{k=j+1}^{l-1}r_{t_k}(AA^*)\,g_{t_j}(AA^*)AA^*\big(y-F(x_j)+Te_j\big) + \sum_{j=0}^{l-1}A\prod_{k=j+1}^{l-1}r_{t_k}(A^*A)\big[g_{t_j}(A_j^*A_j)A_j^* - g_{t_j}(A^*A)A^*\big]\big(y^\delta-F(x_j)\big).$$
(4.8)

Since $e_0\in X_\mu$ with $s<\mu\le b+2s$, by using (2.7), (4.2), (4.3), (4.4) and Lemma 4.2 we can derive from (4.7) that
$$\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_l\| \le c_1\|e_0\|_\mu + s_l^{\frac{a+\mu}{2(a+s)}}\delta + \sum_{j=0}^{l-1}t_j(s_l-s_j)^{-\frac{a+2s-\mu}{2(a+s)}}\|y-F(x_j)+Te_j\| + C\sum_{j=0}^{l-1}t_j(s_l-s_j)^{-\frac{b+2s-\mu}{2(a+s)}}K\|e_j\|^\beta\|y^\delta-F(x_j)\|, \qquad (4.9)$$
where $c_1=c\big(\tfrac{s-\mu}{2(a+s)}\big)$ and $C$ is a generic constant independent of $l$ and $\delta$. Next, by using again $e_0\in X_\mu$ with $s<\mu\le b+2s$, (2.7) and (4.2), we can obtain
$$\bigg\|A\prod_{j=0}^{l-1}r_{t_j}(A^*A)L^se_0\bigg\| \le \bigg\|A\prod_{j=0}^{l-1}r_{t_j}(A^*A)(A^*A)^{\frac{\mu-s}{2(a+s)}}\bigg\|\,\Big\|(A^*A)^{-\frac{\mu-s}{2(a+s)}}L^se_0\Big\| \le c_1\sup_{0\le\lambda\le1}\lambda^{\frac{a+\mu}{2(a+s)}}\prod_{j=0}^{l-1}r_{t_j}(\lambda)\,\|e_0\|_\mu \le c_1 s_l^{-\frac{a+\mu}{2(a+s)}}\|e_0\|_\mu.$$
Therefore, it follows from (4.8), (4.3) and Lemma 4.2 that
$$\|Te_l\| \le c_1 s_l^{-\frac{a+\mu}{2(a+s)}}\|e_0\|_\mu + \delta + \sum_{j=0}^{l-1}t_j(s_l-s_j)^{-1}\|y-F(x_j)+Te_j\| + C\sum_{j=0}^{l-1}t_j(s_l-s_j)^{-\frac{b+a+2s}{2(a+s)}}K\|e_j\|^\beta\|y^\delta-F(x_j)\|. \qquad (4.10)$$

We first use (4.10) to derive the desired estimate for $\|Te_l\|$. According to the relation $\|e_j\|_\mu \sim \|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_j\|$, we have from the induction hypotheses that
$$\|e_j\|_\mu \lesssim \|e_0\|_\mu \quad\text{and}\quad \|Te_j\| \lesssim \|e_0\|_\mu(1+s_j)^{-\frac{a+\mu}{2(a+s)}}, \qquad 0\le j\le l-1. \qquad (4.11)$$
We need to estimate the terms $\|e_j\|$, $\|y^\delta-F(x_j)\|$ and $\|y-F(x_j)+Te_j\|$ for $0\le j\le l-1$. For each term we will give two types of estimates, one valid for all $0\le j\le l-1$ and the other valid only for $0\le j<l-1$. Since $\tau\delta\le\|y^\delta-F(x_j)\|$ for $0\le j<n_\delta$, we have
$$\|y^\delta-F(x_j)+Te_j\| \le \delta + \|y-F(x_j)+Te_j\| \le \delta + CK\|e_j\|_s^\beta\|e_j\|_{-a} \le \tau^{-1}\|y^\delta-F(x_j)\| + CK\|e_0\|_s^\beta\|Te_j\|.$$
This shows for $0\le j<n_\delta$ that
$$\|y^\delta-F(x_j)\| \le \frac{\tau}{\tau-1}\big(1+CK\|e_0\|_s^\beta\big)\|Te_j\|, \qquad (4.12)$$
$$\|y^\delta-F(x_j)\| \ge \frac{\tau}{\tau+1}\big(1-CK\|e_0\|_s^\beta\big)\|Te_j\|. \qquad (4.13)$$
The inequalities (4.12), (4.13) and Lemma 4.1 imply that if $K\|e_0\|_s^\beta$ is sufficiently small then
$$\|Te_j\| \lesssim \|Te_{j+1}\|, \qquad 0\le j<n_\delta-1. \qquad (4.14)$$
Consequently, we have from (4.12) and (4.14) that
$$\|y^\delta-F(x_j)\| \lesssim \|Te_{j+1}\|, \qquad 0\le j<n_\delta-1. \qquad (4.15)$$
This together with (4.11) gives
$$\|y^\delta-F(x_j)\| \lesssim \|e_0\|_\mu\,s_{j+1}^{-\frac{a+\mu}{2(a+s)}}, \qquad 0\le j<l-1. \qquad (4.16)$$

Next we estimate $\|y-F(x_j)+Te_j\|$. We have from (3.2) in Lemma 3.1, Assumption 2.2 (a), and (4.11) that
$$\|y-F(x_j)+Te_j\| \lesssim K\|e_j\|_\mu^\beta\|e_j\|_{-a} \lesssim K\|e_0\|_\mu^\beta\|Te_j\|.$$
Therefore, it follows from (4.14) that
$$\|y-F(x_j)+Te_j\| \lesssim K\|e_0\|_\mu^\beta\|Te_{j+1}\|, \qquad 0\le j\le l-1. \qquad (4.17)$$
On the other hand, by using (3.1) in Lemma 3.1 and Assumption 2.2 (a), we have
$$\|y-F(x_j)+Te_j\| \le K\|e_j\|_\mu^{\frac{a(1+\beta)-b}{a+\mu}}\|e_j\|_{-a}^{\frac{\mu(1+\beta)+b}{a+\mu}} \lesssim K\|e_j\|_\mu^{\frac{a(1+\beta)-b}{a+\mu}}\|Te_j\|^{\frac{\mu(1+\beta)+b}{a+\mu}}.$$
Therefore, it follows from (4.14) and (4.11) that
$$\|y-F(x_j)+Te_j\| \lesssim K\|e_0\|_\mu^{1+\beta}\,s_{j+1}^{-\frac{\mu(1+\beta)+b}{2(a+s)}}, \qquad 0\le j<l-1. \qquad (4.18)$$
For the term $\|e_j\|$, we first have from the interpolation inequality (2.6), Lemma 3.3, and Assumption 2.2 (a) that
$$\|e_j\| \le \|e_j\|_s^{\frac{a}{a+s}}\|e_j\|_{-a}^{\frac{s}{a+s}} \lesssim \|e_0\|_s^{\frac{a}{a+s}}\|Te_j\|^{\frac{s}{a+s}}.$$
With the help of (4.13) we then obtain
$$\|e_j\| \lesssim \|e_0\|_s^{\frac{a}{a+s}}\|y^\delta-F(x_j)\|^{\frac{s}{a+s}}, \qquad 0\le j\le l-1. \qquad (4.19)$$
On the other hand, by using the interpolation inequality (2.6) and Assumption 2.2 (a) we also obtain for $0\le j\le l-1$
$$\|e_j\| \le \|e_j\|_\mu^{\frac{a}{a+\mu}}\|e_j\|_{-a}^{\frac{\mu}{a+\mu}} \lesssim \|e_j\|_\mu^{\frac{a}{a+\mu}}\|Te_j\|^{\frac{\mu}{a+\mu}}.$$
This together with (4.14) and (4.11) gives
$$\|e_j\| \lesssim \|e_0\|_\mu\,s_{j+1}^{-\frac{\mu}{2(a+s)}}, \qquad 0\le j<l-1. \qquad (4.20)$$

Now we use (4.15), (4.17) and (4.19) with $j=l-1$, and (4.16), (4.18) and (4.20) for $0\le j<l-1$; we then obtain from (4.10) that
$$\|Te_l\| \le c_1\|e_0\|_\mu s_l^{-\frac{a+\mu}{2(a+s)}} + \delta + CK\|e_0\|_\mu^{1+\beta}\sum_{j=0}^{l-2}t_j(s_l-s_j)^{-1}s_{j+1}^{-\frac{\mu(1+\beta)+b}{2(a+s)}} + CK\|e_0\|_\mu^\beta\|Te_l\| + CK\|e_0\|_s^{\frac{a\beta}{a+s}}t_{l-1}^{\frac{a-b}{2(a+s)}}\|y^\delta-F(x_{l-1})\|^{\frac{s\beta}{a+s}}\|Te_l\| + CK\|e_0\|_\mu^{1+\beta}\sum_{j=0}^{l-2}t_j(s_l-s_j)^{-\frac{b+a+2s}{2(a+s)}}s_{j+1}^{-\frac{\mu(1+\beta)+a}{2(a+s)}}.$$
Since $\mu>s\ge(a-b)/\beta$, we can use Lemma 4.3 to derive that
$$\|Te_l\| \le \big(c_1+CK\|e_0\|_\mu^\beta\big)\|e_0\|_\mu s_l^{-\frac{a+\mu}{2(a+s)}} + \delta + CK\|e_0\|_\mu^\beta\|Te_l\| + CK\|e_0\|_s^{\frac{a\beta}{a+s}}t_{l-1}^{\frac{a-b}{2(a+s)}}\|y^\delta-F(x_{l-1})\|^{\frac{s\beta}{a+s}}\|Te_l\|.$$
Recall that (3.8) in Lemma 3.3 implies $t_{l-1}\|y^\delta-F(x_{l-1})\|^2 \lesssim \|e_0\|_s^2$. Since $s\ge(a-b)/\beta$ and $t_{l-1}\ge c>0$, we have
$$t_{l-1}^{\frac{a-b}{2(a+s)}}\|y^\delta-F(x_{l-1})\|^{\frac{s\beta}{a+s}} \le \big(t_{l-1}\|y^\delta-F(x_{l-1})\|^2\big)^{\frac{s\beta}{2(a+s)}}\,t_{l-1}^{\frac{a-b-s\beta}{2(a+s)}} \lesssim \|e_0\|_s^{\frac{s\beta}{a+s}}.$$
Therefore, noting $\|e_0\|_s \lesssim \|e_0\|_\mu$, we obtain
$$\|Te_l\| \le \big(c_1+CK\|e_0\|_\mu^\beta\big)\|e_0\|_\mu s_l^{-\frac{a+\mu}{2(a+s)}} + \delta + CK\|e_0\|_\mu^\beta\|Te_l\|. \qquad (4.21)$$

Since $l<n_\delta$, we have from the definition of $n_\delta$ and (4.12) that
$$\delta \le \tau^{-1}\|y^\delta-F(x_l)\| \le \frac{1}{\tau-1}\big(1+CK\|e_0\|_\mu^\beta\big)\|Te_l\|. \qquad (4.22)$$
Combining this with (4.21) gives
$$\|Te_l\| \le \big(c_1+CK\|e_0\|_\mu^\beta\big)\|e_0\|_\mu s_l^{-\frac{a+\mu}{2(a+s)}} + \Big(\frac{1+CK\|e_0\|_\mu^\beta}{\tau-1}+CK\|e_0\|_\mu^\beta\Big)\|Te_l\|.$$
Recall that $\tau>2$. Therefore, if $K\|e_0\|_\mu^\beta$ is sufficiently small, then we have
$$\|Te_l\| \le c_1\frac{\tau-1}{\tau-2}\|e_0\|_\mu s_l^{-\frac{a+\mu}{2(a+s)}}.$$
Since $l\ge1$ and $s_l\ge t_{l-1}\ge c$, we have $1+s_l\le(1+1/c)s_l$. Therefore
$$\|Te_l\| \le C_*\|e_0\|_\mu(1+s_l)^{-\frac{a+\mu}{2(a+s)}}$$
if we choose $C_*\ge c_1(1+1/c)^{\frac{a+\mu}{2(a+s)}}(\tau-1)/(\tau-2)$.

Next we derive the estimate on $\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_l\|$. Since we have verified the estimates for $\|Te_l\|$, the estimates (4.16), (4.18) and (4.20) can be improved to include $j=l-1$; this is clear from the above argument. Consequently we have from (4.9) that
$$\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_l\| \le c_1\|e_0\|_\mu + s_l^{\frac{a+\mu}{2(a+s)}}\delta + CK\|e_0\|_\mu^{1+\beta}\sum_{j=0}^{l-1}t_j(s_l-s_j)^{-\frac{a+2s-\mu}{2(a+s)}}s_{j+1}^{-\frac{\mu(1+\beta)+b}{2(a+s)}} + CK\|e_0\|_\mu^{1+\beta}\sum_{j=0}^{l-1}t_j(s_l-s_j)^{-\frac{b+2s-\mu}{2(a+s)}}s_{j+1}^{-\frac{\mu(1+\beta)+a}{2(a+s)}}.$$
It then follows from Lemma 4.3 that
$$\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_l\| \le \big(c_1+CK\|e_0\|_\mu^\beta\big)\|e_0\|_\mu + s_l^{\frac{a+\mu}{2(a+s)}}\delta.$$
With the help of (4.22) and the estimate on $\|Te_l\|$, we obtain
$$\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_l\| \le \big(c_1+CK\|e_0\|_\mu^\beta\big)\|e_0\|_\mu + \frac{C_*}{\tau-1}\big(1+CK\|e_0\|_\mu^\beta\big)\|e_0\|_\mu.$$
Since $\tau>2$, we thus obtain $\|(A^*A)^{\frac{s-\mu}{2(a+s)}}L^se_l\| \le C_*\|e_0\|_\mu$ for any $C_*\ge c_1(\tau-1)/(\tau-2)$ if $K\|e_0\|_\mu^\beta$ is sufficiently small. The proof is therefore complete. ✷

Now we are ready to complete the proof of Theorem 2.2, the main result in this paper.
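Before turning to the proof of Theorem 2.2, we record as an aside the elementary calculus bound behind filter estimates such as (4.2). The sketch below treats only the simplest case of the Landweber filter $r_t(\lambda)=(1-\lambda)^{[t]}$ with integer steps $t_k$, an illustrative assumption:

```latex
\lambda^{\nu}\prod_{k=j}^{n-1} r_{t_k}(\lambda)
  = \lambda^{\nu}(1-\lambda)^{s_n-s_j}
  \le \lambda^{\nu} e^{-(s_n-s_j)\lambda}
  \le (s_n-s_j)^{-\nu}\sup_{u\ge 0}u^{\nu}e^{-u}
  = \Big(\frac{\nu}{e}\Big)^{\nu}(s_n-s_j)^{-\nu}
  \le (s_n-s_j)^{-\nu},
  \qquad 0\le\nu\le 1,\ 0\le\lambda\le 1,
```

since $(\nu/e)^\nu\le1$ for $0\le\nu\le1$; this recovers (4.2) in that special case.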
Proof of Theorem 2.2.
Considering Lemma 3.3 and Remark 3.1, it remains only to derive the order optimal convergence rates. When $n_\delta=0$, the proof is standard. So we may assume $n_\delta>0$. From Lemma 4.4 it follows that $\|e_{n_\delta-1}\|_\mu \lesssim \|e_0\|_\mu$. By using Lemma 4.1 and the definition of $n_\delta$ we have $\|y^\delta-F(x_{n_\delta-1})\| \lesssim \delta$, which together with (4.13) implies that $\|e_{n_\delta-1}\|_{-a} \lesssim \|Te_{n_\delta-1}\| \lesssim \delta$. Therefore, from the interpolation inequality (2.6) it follows that
$$\|e_{n_\delta-1}\|_s \le \|e_{n_\delta-1}\|_\mu^{\frac{a+s}{a+\mu}}\|e_{n_\delta-1}\|_{-a}^{\frac{\mu-s}{a+\mu}} \lesssim \|e_0\|_\mu^{\frac{a+s}{a+\mu}}\delta^{\frac{\mu-s}{a+\mu}}.$$
In view of (3.9) in Lemma 3.3, we consequently obtain $\|e_{n_\delta}\|_s \lesssim \|e_0\|_\mu^{\frac{a+s}{a+\mu}}\delta^{\frac{\mu-s}{a+\mu}}$. By using the definition of $n_\delta$ and (1.2) we have $\|y-F(x_{n_\delta})\| \le (1+\tau)\delta$. Observe that (3.2) in Lemma 3.1 and (3.9) in Lemma 3.3 imply
$$\|Te_{n_\delta}\| \le \|y-F(x_{n_\delta})\| + \|y-F(x_{n_\delta})+Te_{n_\delta}\| \le \|y-F(x_{n_\delta})\| + CK\|e_{n_\delta}\|_s^\beta\|Te_{n_\delta}\| \le \|y-F(x_{n_\delta})\| + CK\|e_0\|_s^\beta\|Te_{n_\delta}\|.$$
Thus, if $K\|e_0\|_s^\beta \lesssim K\|e_0\|_\mu^\beta$ is sufficiently small, then $\|Te_{n_\delta}\| \lesssim \|y-F(x_{n_\delta})\|$. Consequently $\|e_{n_\delta}\|_{-a} \lesssim \|Te_{n_\delta}\| \lesssim \delta$. Now we can use again the interpolation inequality (2.6) to derive for all $r\in[-a,s]$ that
$$\|e_{n_\delta}\|_r \le \|e_{n_\delta}\|_s^{\frac{a+r}{a+s}}\|e_{n_\delta}\|_{-a}^{\frac{s-r}{a+s}} \lesssim \|e_0\|_\mu^{\frac{a+r}{a+\mu}}\delta^{\frac{\mu-r}{a+\mu}}.$$
The proof is therefore complete. ✷

Conclusion

Inexact Newton regularization methods have been suggested by Hanke and Rieder in [4] and [12], respectively, for solving nonlinear ill-posed inverse problems. The convergence rates of these methods have been considered in [12,13]; the results, however, turned out to be inferior to the so-called order optimal rates. For a long time it has been an open problem whether these inexact Newton methods are order optimal, although the numerical illustrations in [12,13] give strong indication. Important progress has been made recently in [5], where the regularizing Levenberg-Marquardt scheme is shown to be order optimal.
In this paper we considered a general class of inexact Newton methods in which the inner schemes are defined by Landweber iteration, the implicit iteration, the asymptotic regularization and Tikhonov regularization. By establishing the monotonicity of the iteration errors and deriving a series of subtle estimates, we succeeded in proving the order optimality of these methods. We also extended these order optimality results to a more general situation in which the inner schemes are defined by linear regularization methods in Hilbert scales. Our theoretical findings confirm the numerical results in [12,13].
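For the reader's convenience, the order optimal rate established at the end of the proof of Theorem 2.2 can be summarized as:

```latex
\|x_{n_\delta}-x^\dagger\|_r \;\lesssim\; \|x_0-x^\dagger\|_\mu^{\frac{a+r}{a+\mu}}\,\delta^{\frac{\mu-r}{a+\mu}},
\qquad r\in[-a,s],
```

which for $r=0$ reduces to the H\"older rate $O\big(\delta^{\mu/(a+\mu)}\big)$.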
Acknowledgement.
Part of this work was carried out during a stay at the Department of Mathematics, Virginia Tech.

References
1. R. S. Dembo, S. C. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal., 19 (1982), 400–408.
2. P. Deuflhard, H. W. Engl and O. Scherzer, A convergence analysis of iterative methods for the solution of nonlinear ill-posed problems under affinely invariant conditions, Inverse Problems, 14 (1998), 1081–1106.
3. H. W. Engl, M. Hanke and A. Neubauer, Regularization of Inverse Problems, Kluwer, Dordrecht, 1996.
4. M. Hanke, A regularizing Levenberg-Marquardt scheme with applications to inverse groundwater filtration problems, Inverse Problems, 13 (1997), 79–95.
5. M. Hanke, The regularizing Levenberg-Marquardt scheme is of optimal order, J. Integral Equations and Applications, 22 (2010), no. 2, 259–283.
6. M. Hanke, A. Neubauer and O. Scherzer, A convergence analysis of the Landweber iteration for nonlinear ill-posed problems, Numer. Math., 72 (1995), 21–37.
7. T. Hohage and M. Pricop, Nonlinear Tikhonov regularization in Hilbert scales for inverse boundary value problems with random noise, Inverse Problems and Imaging, 2 (2008), 271–290.
8. Q. Jin, A general convergence analysis of some Newton-type methods for nonlinear inverse problems, SIAM J. Numer. Anal., 49 (2011), 549–573.
9. Q. Jin and U. Tautenhahn, Inexact Newton regularization methods in Hilbert scales, Numer. Math., 117 (2011), 555–579.
10. A. Lechleiter and A. Rieder, Towards a general convergence theory for inexact Newton regularizations, Numer. Math., 114 (2010), no. 3, 521–548.
11. A. Neubauer, On Landweber iteration for nonlinear ill-posed problems in Hilbert scales, Numer. Math., 85 (2000), 309–328.
12. A. Rieder, On the regularization of nonlinear ill-posed problems via inexact Newton iterations, Inverse Problems, 15 (1999), 309–327.
13. A. Rieder,