On Krylov solutions to infinite-dimensional inverse linear problems
NOE CARUSO, ALESSANDRO MICHELANGELI, AND PAOLO NOVATI
Abstract.
We discuss, in the context of inverse linear problems in Hilbert space, the notion of the associated infinite-dimensional Krylov subspace, and we produce necessary and sufficient conditions for the Krylov-solvability of a given inverse problem, together with a series of model examples and numerical experiments.

1. Introduction and outlook
Krylov subspace methods constitute a wide class of efficient numerical schemes for finite-dimensional inverse linear problems, even counted among the 'Top 10 Algorithms' of the 20th century [8, 6]. Whereas this framework is by now classical and deeply understood for finite-dimensional inverse problems (see, e.g., the monographs [29, 21] or also [28]), it is instead less explored, and surely lacks a systematic study, in the infinite-dimensional case [20, 7, 19, 22, 31, 18].

In this work we focus on the general setting of infinite-dimensional inverse linear problems that are solved by means of finite-dimensional truncations taken with respect to a basis of the associated Krylov subspace, and we investigate the possibility that the solution can indeed be well approximated by vectors in the Krylov subspace.

To fix the nomenclature and the notation, let us consider an inverse linear problem in Hilbert space: given a Hilbert space H, a linear operator A acting on H, and a vector g ∈ H, determine the solution(s) f ∈ H to the linear equation

(1.1)  Af = g.

We shall say that: (1.1) is solvable if a solution f exists, namely if g ∈ ran A; (1.1) is well-defined if additionally the solution f is unique, i.e., if A is also injective (in which case one refers to f as the 'exact' solution); (1.1) is well-posed if there exists a unique solution that depends continuously (i.e., in the norm of H) on the datum g, equivalently, if g ∈ ran A and A has bounded inverse on its range.

Although well-defined inverse linear problems are in a sense trivial theoretically, as the existence and uniqueness of the solution is not of concern, a crucial numerical issue is the control of the truncation to the finite-dimensional space in which approximate solutions are to be computed. Obviously this refers to the case when dim H = ∞ and A is a genuinely infinite-dimensional operator on H: by this we mean, as customary [30, Sect. 1.4], that A is not reduced as A = A₁ ⊕ A₂ by an orthogonal direct sum decomposition H = H₁ ⊕ H₂ with dim H₁ < ∞, dim H₂ = ∞, and A₂ = O.

Date: August 28, 2019.
Key words and phrases. Inverse linear problems, infinite-dimensional Hilbert space, ill-posed problems, orthonormal basis discretisation, bounded linear operators, self-adjoint operators, Krylov subspaces, cyclic operators, Krylov solution, GMRES, CG, LSQR.
In the framework of standard Galerkin and Petrov-Galerkin methods [10, 25], typically developed for partial differential operators, the well-posedness of the problem (1.1) is ensured by various classical conditions (in practice some kind of coercivity of A), such as the Banach-Nečas-Babuška Theorem or the Lax-Milgram Lemma [10, Chapter 2]. Analogous conditions guarantee the well-posedness of the truncated problems, and in order for the finite-dimensional solutions to converge strongly in the infinite-dimensional limit one requires stringent yet often plausible conditions [10, Sect. 2.2-2.4], [25, Sect. 4.2] both on the truncation spaces, which need to approximate the ambient space H suitably well ('approximability', hence the interpolation capability of finite elements), and on the behaviour of the reduced problems, which need to admit solutions that are uniformly controlled by the data ('uniform stability') and that are suitably good approximate solutions of the original problem ('asymptotic consistency'), together with some suitable boundedness of the problem in appropriate topologies ('uniform continuity').

For non-differential inverse problems, for example when A is a compact or a generic bounded operator with a badly-behaved inverse (e.g., when A is non-coercive), as is often the case when A is an integral operator, the solvability or singularity of the truncated problems and the error analysis in the infinite-dimensional limit are being studied as well [12, 5].

In this respect, Krylov subspace methods are a class of algorithms where approximate solutions to (1.1) are sought among the linear combinations of the vectors g, Ag, A²g, …, which span the so-called 'Krylov subspace' K(A, g) associated with A and g. The infinite-dimensionality of the underlying Hilbert space H comes with a load of new issues, starting from the very definition of the Krylov vectors A^k g if A is unbounded [4].
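In finite dimensions, the construction just recalled can be made concrete in a few lines. The following minimal sketch (our own illustration, not taken from the paper; it assumes only numpy) builds an orthonormal basis of K_N(A, g) by Arnoldi-style Gram-Schmidt orthogonalisation and extracts the GMRES-type approximant, i.e. the minimiser of ‖Af − g‖ over f in the Krylov subspace:

```python
import numpy as np

def krylov_basis(A, g, N):
    """Orthonormal basis of K_N(A, g) = span{g, Ag, ..., A^{N-1} g}."""
    Q = np.zeros((len(g), N), dtype=complex)
    Q[:, 0] = g / np.linalg.norm(g)
    for k in range(1, N):
        v = A @ Q[:, k - 1]
        v -= Q[:, :k] @ (Q[:, :k].conj().T @ v)  # Gram-Schmidt against previous vectors
        nv = np.linalg.norm(v)
        if nv < 1e-12:                           # the Krylov subspace became A-invariant
            return Q[:, :k]
        Q[:, k] = v / nv
    return Q

def gmres_like_solve(A, g, N):
    """Minimise ||A Q c - g|| over the coefficients c of the Krylov basis Q."""
    Q = krylov_basis(A, g, N)
    c, *_ = np.linalg.lstsq(A @ Q, g, rcond=None)
    return Q @ c

rng = np.random.default_rng(0)
A = np.eye(6) + 0.1 * rng.standard_normal((6, 6))  # well-conditioned test matrix
f_exact = rng.standard_normal(6)
g = A @ f_exact
f = gmres_like_solve(A, g, 6)                      # full Krylov space: exact recovery
```

When the Krylov subspace reaches the full dimension, the least-squares residual vanishes and the exact solution is recovered; the infinite-dimensional difficulties discussed in this paper concern precisely what survives of this picture when no finite N exhausts the space.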
Even when A is everywhere defined and bounded, and hence K(A, g) is well-defined, it may well happen that K(A, g) is not dense in H, thus preventing the truncation spaces from having that approximability feature which, as mentioned above, is a typical assumption for (Petrov-)Galerkin schemes.

Among such potential difficulties, the first crucial question is whether the solution(s) to (1.1) can be well approximated by vectors in K(A, g), say, whether they belong to the closure of K(A, g) taken in the H-norm topology. In the affirmative case, the Krylov subspace is a reliable space for the approximants of the exact solution(s): we shall refer to such an occurrence by saying that the problem (1.1) is 'Krylov-solvable', and a solution f to (1.1) belonging to the closure of K(A, g) will be referred to as a 'Krylov solution'.

Additional relevant questions then arise: for example, in the presence of a multiplicity of solutions, some may be Krylov and others may not.

In the present work we investigate a number of mathematical aspects of a (genuinely) infinite-dimensional bounded inverse linear problem in Hilbert space with respect to the underlying Krylov subspace. After fixing the natural generalisation of the Krylov subspace in infinite dimensions (Sect. 2), we address the general question of Krylov solvability. Through several paradigmatic examples and counter-examples we show the typical occurrences where such a feature may hold or fail. Most importantly, we demonstrate necessary and sufficient conditions, for certain relevant classes of bounded operators, in order for the solution to be a Krylov solution (Sect. 3). To this aim, we identify a somewhat 'intrinsic' notion associated to the operator A and the datum g, a subspace that we call the 'Krylov intersection', which turns out to qualify the operator-theoretic mechanism for the Krylov-solvability of the problem.
We observe that for the study case that is most investigated in the previous literature on infinite-dimensional Krylov subspaces, namely the self-adjoint bounded inverse linear problems, this mechanism takes a more explicit form, which we shall refer to as the 'Krylov reducibility' of the operator A.

Last, in the concluding part, Section 4, we investigate the main features discussed theoretically through a series of numerical tests on inverse problems in infinite-dimensional Hilbert space, suitably truncated and analysed by increasing the size of the truncation.

General notation.
Besides further notation that will be declared in due time, we shall keep the following conventions. H denotes a complex Hilbert space, assumed to be separable throughout this note, with norm ‖·‖_H and scalar product ⟨·,·⟩, anti-linear in the first entry and linear in the second. Bounded operators on H shall be tacitly understood as linear and everywhere defined: they naturally form a space, denoted by B(H), with Banach norm ‖·‖_op, the customary operator norm. 𝟙 and O denote, respectively, the identity and the zero operator, meant as finite matrices or infinite-dimensional operators depending on the context. An upper bar denotes the complex conjugate \overline{z} when z ∈ C, and the norm closure \overline{V} of the span of the vectors in V when V is a subset of H. For ψ, φ ∈ H, by |ψ⟩⟨ψ| and |ψ⟩⟨φ| we shall denote the H → H rank-one maps acting respectively as f ↦ ⟨ψ, f⟩ψ and f ↦ ⟨φ, f⟩ψ on generic f ∈ H. For identities such as ψ(x) = φ(x) in L²-spaces, the standard 'for almost every x' declaration will be tacitly understood.

2. Krylov subspace in infinite-dimensional Hilbert space
2.1. Definition and generalities.
As well known, given a d × d (complex) matrix A and a vector g ∈ C^d, the N-th order Krylov subspace associated with A and g is the subspace

(2.1)  K_N(A, g) := span{g, Ag, …, A^{N−1} g} ⊂ C^d.

Clearly, 1 ≤ dim K_N(A, g) ≤ N, and there always exists some N₀ ≤ d such that all N-th order spaces K_N(A, g) are the same whenever N ≥ N₀: one then refers to the Krylov subspace associated with A and g as the maximal subspace K_{N₀}(A, g).

This notion has a natural generalisation to a Hilbert space H with dim H = ∞, an everywhere defined, bounded linear operator A : H → H, and a vector g ∈ H: clearly, unlike the finite-dimensional case, it may happen that sup_N dim K_N(A, g) = ∞. The Krylov subspace associated with A and g is then defined as

(2.2)  K(A, g) := span{A^k g | k = 0, 1, 2, …},

a definition that applies to the finite-dimensional case too. In fact, (2.2) makes sense also when A is unbounded, provided that g simultaneously belongs to the domains of all the powers of A. Yet, in the present discussion we shall stick to bounded operators acting on (possibly infinite-dimensional) Hilbert spaces. Obviously, when dim K(A, g) = ∞ the subspace K(A, g) is not closed in H. Its closure can either be a proper closed subspace of H, or even the whole H itself.

Example 2.1. (i) For the right-shift operator R on ℓ²(N) (Sec. A.2) and the vector g = e_{m+1} (one of the canonical basis vectors), \overline{K(R, e_{m+1})} = span{e₁, …, e_m}^⊥, which is a proper subspace of ℓ²(N) if m ≥ 1, and instead is the whole ℓ²(N) if g = e₁.
(ii) For the Volterra integral operator V on L²[0,1] (Sec. A.5) and the function g = 𝟙 (the constant function with value 1), it follows from (A.10) or (A.15) that the functions Vg, V²g, V³g, … are (multiples of) the polynomials x, x², x³, …; therefore K(V, g) is the space of polynomials on [0,1], which is dense in L²[0,1].

In the literature, K(A, g) is customarily referred to as the cyclic space of A relative to the vector g; the spanning vectors g, Ag, A²g, … form the orbit of g under A; the density of K(A, g) in H is called the cyclicity of g, in which case g is called a cyclic vector for A; and when A admits cyclic vectors in H one says that A is a cyclic operator.

For completeness of information, let us recall a few well-known facts about cyclic vectors and cyclic operators [15].
(I) In non-separable Hilbert spaces there are no cyclic vectors.
(II) The set of (bounded) cyclic operators on a Hilbert space H is dense in B(H), with respect to the ‖·‖_op-norm, if dim H < ∞; instead, it is not dense in B(H) if dim H = ∞.
(III) The set of cyclic operators on a separable Hilbert space H is not closed in B(H). It is open in B(H) if dim H < ∞; it is not open if dim H = ∞.
(IV) If dim H = ∞ and H is separable, then the set of non-cyclic operators on H is dense in B(H) (whereas, instead, the set of cyclic operators is not).
(V) It is not known whether there exists a bounded operator on a separable Hilbert space H such that every non-zero vector in H is cyclic.
(VI) The set of cyclic vectors for a bounded operator A on a Hilbert space H is either empty or a dense subset of H [14]. If g is a cyclic vector for A, then so is g^(n) := (𝟙 − αA)^n g for any |α| ∈ (0, ‖A‖_op^{−1}) and any n ∈ N, and the g^(n)'s thus defined span the whole H.
(VII) A bounded operator A on the separable Hilbert space H is cyclic if and only if there is an orthonormal basis (e_n)_n of H with respect to which the matrix elements a_ij := ⟨e_i, A e_j⟩ are such that a_ij = 0 for i > j + 1 and a_ij ≠ 0 for i = j + 1 (thus, A is represented by an upper Hessenberg infinite-dimensional matrix).

2.2. Krylov reducibility and Krylov intersection.
For given A ∈ B(H) and g ∈ H, there is the orthogonal decomposition

(2.3)  H = \overline{K(A, g)} ⊕ K(A, g)^⊥,

which we shall often refer to as the Krylov decomposition of H relative to A and g. The Krylov subspace is invariant under A and its orthogonal complement is invariant under A*, that is,

(2.4)  A K(A, g) ⊂ K(A, g),   A* K(A, g)^⊥ ⊂ K(A, g)^⊥.

The first statement is obvious and the second follows from ⟨A*w, z⟩ = ⟨w, Az⟩ = 0 for all z ∈ K(A, g), where w is a generic vector in K(A, g)^⊥. Owing to the evident relations

(2.5)  AV ⊂ A\overline{V} ⊂ \overline{AV} = \overline{A\overline{V}},   with A\overline{V} = \overline{AV} if A^{−1} ∈ B(H),

valid for any subspace V ⊂ H and any A ∈ B(H), (2.4) implies also

(2.6)  A \overline{K(A, g)} ⊂ \overline{K(A, g)}.

A relevant occurrence for our purposes is when the operator A is reduced by the Krylov decomposition (2.3), meaning that both \overline{K(A, g)} and K(A, g)^⊥ are invariant under A. For short, we shall refer to this feature as the Krylov reducibility of A, or also, to avoid ambiguity, K(A, g)-reducibility. We shall discuss the relevance of this feature in the subsequent sections, when we present general conditions for Krylov-solvability. It follows from this definition that if A is K(A, g)-reduced, then A* is K(A, g)-reduced too, and vice versa, as one sees from the following elementary Lemma, whose proof is omitted.
Lemma 2.2. If A is a bounded operator on a Hilbert space H and V ⊂ H is a closed subspace, then properties (i) and (ii) below are equivalent:
(i) AV ⊂ V and AV^⊥ ⊂ V^⊥;
(ii) A*V ⊂ V and A*V^⊥ ⊂ V^⊥.

Example 2.3. (i) For generic A ∈ B(H) and g ∈ H, A may fail to be K(A, g)-reduced. All bounded self-adjoint operators are surely Krylov-reduced, owing to (2.4).
(ii) Yet, Krylov reducibility is not a feature of self-adjoint operators only. To see this, let A, B ∈ B(H) and Ã := A ⊕ B : H ⊕ H → H ⊕ H. If g ∈ H is a cyclic vector for A in H and g̃ := g ⊕ 0, then \overline{K(Ã, g̃)} = H ⊕ {0}, implying that K(Ã, g̃)^⊥ = {0} ⊕ H. Therefore Ã is K(Ã, g̃)-reduced. On the other hand, Ã is self-adjoint (on H ⊕ H) if and only if so are both A and B on H.

For normal operators we have the following equivalent characterisation of Krylov reducibility.

Proposition 2.4.
Let A be a bounded normal operator on a Hilbert space H and let g ∈ H. Then A is K(A, g)-reduced if and only if A*g ∈ \overline{K(A, g)}.

Proof. If A is K(A, g)-reduced, then \overline{K(A, g)} is invariant under A* (Lemma 2.2), hence in particular A*g ∈ \overline{K(A, g)}. Conversely, if A*g ∈ \overline{K(A, g)}, then

K(A, A*g) = span{A^k A*g | k = 0, 1, 2, …} ⊂ \overline{K(A, g)},

and moreover, since A is normal, A* K(A, g) = K(A, A*g); therefore (using (2.5)),

A* \overline{K(A, g)} ⊂ \overline{K(A, A*g)} ⊂ \overline{K(A, g)}.

The latter property, together with A* K(A, g)^⊥ ⊂ K(A, g)^⊥ (from (2.4) above), implies that A* is K(A, g)-reduced, and so is A itself, owing to Lemma 2.2. ∎

For A ∈ B(H) and g ∈ H, an obvious consequence of A being K(A, g)-reduced is that \overline{K(A, g)} ∩ (A K(A, g)^⊥) = {0}. For its relevance in the following, we shall call the intersection

(2.7)  \overline{K(A, g)} ∩ (A K(A, g)^⊥)

the Krylov intersection for the given A and g.

Example 2.5.
The Krylov intersection may be trivial also in the absence of Krylov reducibility. This is already clear for finite-dimensional matrices: for example, one can exhibit on the Hilbert space C² a one-parameter family of 2 × 2 matrices A_θ, θ ∈ (0, π], and a vector g ∈ C² such that A_θ is K(A_θ, g)-reduced only when θ = π, whereas the Krylov intersection (2.7) is trivial for any θ ∈ (0, π).

3. Krylov solutions for a bounded linear inverse problem
3.1. Krylov solvability. Examples.
Let us re-consider the bounded linear inverse problem of the type (1.1): given A ∈ B(H) and the datum g ∈ ran A, one searches for solution(s) f ∈ H to Af = g. The general question we are studying here is when the solution f to Af = g admits arbitrarily close (in the norm of H) approximants expressed by finite linear combinations of the spanning vectors A^k g, equivalently, when f belongs to the closure \overline{K(A, g)} of the Krylov subspace relative to A and g. A solution f satisfying the above property is referred to as a Krylov solution. Informally, we shall use the expression Krylov solvability for the feature that a linear inverse problem has a Krylov solution.
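Krylov solvability, and its failure, can already be probed numerically on finite truncations. The sketch below (our own illustration, assuming only numpy; the operator is the n × n truncation of the right shift R of Example 3.1(iv)) computes the distance of the exact solution f = e₁ of Rf = e₂ from the truncated Krylov subspace K(R, e₂): the distance stays equal to 1 however large the truncation is taken, the numerical signature of a problem that is not Krylov-solvable:

```python
import numpy as np

n = 40
R = np.zeros((n, n))
R[np.arange(1, n), np.arange(n - 1)] = 1.0   # truncated right shift: R e_k = e_{k+1}

g = np.zeros(n); g[1] = 1.0                  # datum g = e_2
f = np.zeros(n); f[0] = 1.0                  # exact solution of R f = g is f = e_1

# the Krylov vectors g, Rg, R^2 g, ... are e_2, e_3, e_4, ...
K = np.column_stack([np.linalg.matrix_power(R, k) @ g for k in range(n - 1)])
Q, _ = np.linalg.qr(K)                       # orthonormal basis of the Krylov subspace
dist = np.linalg.norm(f - Q @ (Q.T @ f))     # distance of f from K(R, g): stays 1
```

The same distance, computed for a boundedly invertible self-adjoint truncation with a generic datum, drops to zero, in line with the general results of this section.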
Example 3.1.
(i) The self-adjoint multiplication operator M : L²[1,2] → L²[1,2], ψ ↦ xψ, is bounded and invertible with an everywhere defined bounded inverse. The solution to Mf = 𝟙 is the function f(x) = x^{−1}. Moreover, K(M, 𝟙) is the space of polynomials on [1,2], which is dense in L²[1,2]; hence f is a Krylov solution.
(ii) The multiplication operator M_z : L²(Ω) → L²(Ω), f ↦ zf, where Ω ⊂ C is a bounded open subset separated from the origin, say, Ω = {z ∈ C | |z − 2| < 1}, is a normal bounded bijection on L²(Ω) (Sec. A.6), and the solution to M_z f = g for given g ∈ L²(Ω) is the function f(z) = z^{−1} g(z). Moreover, K(M_z, g) = {p g | p a polynomial in z on Ω}. One can see that f ∈ \overline{K(M_z, g)}, and hence the problem M_z f = g is Krylov-solvable. Indeed, Ω ∋ z ↦ z^{−1} is holomorphic and hence is realised by a uniformly convergent power series (e.g., the Taylor expansion of z^{−1} about z = 2). If (p_n)_n is such a sequence of polynomial approximants, then

‖f − p_n g‖_{L²(Ω)} = ‖(z^{−1} − p_n) g‖_{L²(Ω)} ≤ ‖z^{−1} − p_n‖_{L^∞(Ω)} ‖g‖_{L²(Ω)} → 0 as n → ∞.

(iii) The left-shift operator L on ℓ²(N) (Sec. A.2) is bounded, not injective, and with range ran L = ℓ²(N). The solution to Lf = g with g := Σ_{n∈N} e_n/(n−1)! is f = Σ_{n∈N} e_{n+1}/(n−1)!. Moreover, K(L, g) is dense in ℓ²(N) and therefore f is a Krylov solution. To see the density of K(L, g): the vector e₁ belongs to \overline{K(L, g)}, because

‖k! L^k g − e₁‖²_{ℓ²} = ‖(1, 1/(k+1), 1/((k+2)(k+1)), …) − (1, 0, 0, …)‖²_{ℓ²} = Σ_{n=1}^∞ (k!/(n+k)!)² → 0 as k → ∞.

As a consequence, (0, 1/k!, 1/(k+1)!, 1/(k+2)!, …) = L^{k−1} g − ((k−1)!)^{−1} e₁ ∈ \overline{K(L, g)}, and therefore the vector e₂ too belongs to \overline{K(L, g)}, because

‖k! (L^{k−1} g − ((k−1)!)^{−1} e₁) − e₂‖²_{ℓ²} = Σ_{n=1}^∞ (k!/(n+k)!)² → 0 as k → ∞.

Repeating inductively the above two-step argument proves that every e_n belongs to \overline{K(L, g)}, whence the cyclicity of g.
(iv) The right-shift operator R on ℓ²(N) (Sec. A.2) is bounded and injective, with non-dense range, and the solution to Rf = e₂ is f = e₁. However, f is not a Krylov solution, for \overline{K(R, e₂)} = \overline{span{e₂, e₃, …}}. The problem Rf = e₂ is not Krylov-solvable.
(v) The compact weighted right-shift operator R_σ on ℓ²(Z) (Sec. A.4) is normal, injective, and with dense range, and the solution to R_σ f = σ₀ e₁ is f = e₀. However, f is not a Krylov solution, for \overline{K(R_σ, σ₀ e₁)} = \overline{span{e₁, e₂, …}}. The problem R_σ f = σ₀ e₁ is not Krylov-solvable.
(vi) Let A be a bounded injective operator on a Hilbert space H with cyclic vector g ∈ ran A, and let φ ∈ H \ {0}. Let f ∈ H be the solution to Af = g. The operator Ã := A ⊕ |φ⟩⟨φ| on H̃ := H ⊕ H is bounded. One solution to Ã f̃ = g̃ := g ⊕ 0 is f̃ = f ⊕ 0 ∈ H ⊕ {0} = \overline{K(Ã, g̃)}: a Krylov solution. Another solution is f̃_ξ = f ⊕ ξ, where ξ ∈ H \ {0} and ξ ⊥ φ. Clearly, f̃_ξ ∉ \overline{K(Ã, g̃)}.
(vii) If V is the Volterra operator on L²[0,1] (Sec. A.5) and g(x) = x, then f = 𝟙 is the unique solution to Vf = g. On the other hand, K(V, g) is spanned by the monomials x, x², x³, …, whence K(V, g) = {x p(x) | p is a polynomial on [0,1]}. Therefore f ∉ K(V, g), because f(x) = x · x^{−1} and x^{−1} ∉ L²[0,1]. Nevertheless f ∈ \overline{K(V, g)}, because in fact K(V, g) is dense in L²[0,1]: if h ∈ K(V, g)^⊥, then 0 = ∫₀¹ \overline{h(x)} x p(x) dx for every polynomial p; the L²-density of polynomials on [0,1] implies necessarily that x h = 0, whence also h = 0. This proves that K(V, g)^⊥ = {0} and hence \overline{K(V, g)} = L²[0,1].

3.2. General conditions for Krylov solvability.
Even stringent assumptions on A, such as the simultaneous occurrence of compactness, normality, injectivity, and density of the range, do not ensure, in general, that the solution f to Af = g, for given g ∈ ran A, is a Krylov solution (Example 3.1(v)).

A necessary condition for the solution to a well-defined bounded linear inverse problem to be a Krylov solution, which becomes necessary and sufficient if the linear map is a bounded bijection, is the following. (Recall that for A ∈ B(H) these three properties are equivalent: A is a bijection; A is invertible with everywhere defined, bounded inverse on H; the spectral point 0 belongs to the resolvent set of A.)

Proposition 3.2. Let A be a bounded and injective operator on a Hilbert space H, and let f ∈ H be the solution to Af = g, given g ∈ ran A. One has the following.
(i) If f ∈ \overline{K(A, g)}, then A K(A, g) is dense in \overline{K(A, g)}.
(ii) Assume further that A is invertible with everywhere defined, bounded inverse on H. Then f ∈ \overline{K(A, g)} if and only if A K(A, g) is dense in \overline{K(A, g)}.

Proof. One has \overline{A K(A, g)} = \overline{A \overline{K(A, g)}} and A K(A, g) = span{A^k g | k = 1, 2, 3, …}, owing to the definition of Krylov subspace and to (2.5). If f ∈ \overline{K(A, g)}, then A \overline{K(A, g)} ∋ Af = g, in which case A \overline{K(A, g)} ⊃ span{A^k g | k = 0, 1, 2, …} = K(A, g); the latter inclusion, by means of (2.5) and (2.6), implies \overline{K(A, g)} ⊃ \overline{A \overline{K(A, g)}} ⊃ \overline{K(A, g)}, whence \overline{A K(A, g)} = \overline{K(A, g)}. This proves part (i) and the 'only if' implication in part (ii). For the converse, let us now assume that A^{−1} ∈ B(H) and that A K(A, g) is dense in \overline{K(A, g)}. Let (A v_n)_{n∈N} be a sequence in A K(A, g) of approximants of g ∈ \overline{K(A, g)}, for some v_n's in K(A, g). Since A^{−1} is bounded on H, (v_n)_{n∈N} is a Cauchy sequence, hence v_n → v as n → ∞ for some v ∈ \overline{K(A, g)}. By continuity, Af = g = lim_{n→∞} A v_n = Av, and by injectivity f = v ∈ \overline{K(A, g)}, which proves also the 'if' implication of part (ii). ∎

A sufficient condition for the Krylov solvability of a well-defined bounded linear inverse problem is the Krylov reducibility introduced in Sec. 2.
Proposition 3.3. Let A be a bounded and injective operator on a Hilbert space H, and let f ∈ H be the solution to Af = g, given g ∈ ran A. If A is K(A, g)-reduced, then f ∈ \overline{K(A, g)}. In particular, if A is bounded, injective, and self-adjoint, then Af = g implies f ∈ \overline{K(A, g)}.

Proof. Let P_K : H → H be the orthogonal projection onto \overline{K(A, g)}. On the one hand, A(𝟙 − P_K)f ∈ \overline{K(A, g)}, because A P_K f ∈ \overline{K(A, g)} (from (2.6) above) and Af = g ∈ K(A, g). On the other hand, owing to the Krylov reducibility, A(𝟙 − P_K)f ∈ K(A, g)^⊥. Then necessarily A(𝟙 − P_K)f = 0, and by injectivity f = P_K f ∈ \overline{K(A, g)}. ∎

In the above proof, Krylov reducibility was only used to deduce that the vector A(𝟙 − P_K)f ∈ A K(A, g)^⊥ must belong to K(A, g)^⊥; thus, the vanishing of A(𝟙 − P_K)f, and hence the same conclusion, follows also by merely assuming that the Krylov intersection \overline{K(A, g)} ∩ (A K(A, g)^⊥) is trivial. And for bounded bijections, the latter sufficient condition becomes also necessary.
Let A be a bounded and injective operator on a Hilbert space H, and let f ∈ H be the solution to Af = g, given g ∈ ran A.
(i) If \overline{K(A, g)} ∩ (A K(A, g)^⊥) = {0}, then f ∈ \overline{K(A, g)}.
(ii) Assume further that A is invertible with everywhere defined, bounded inverse on H. Then f ∈ \overline{K(A, g)} if and only if \overline{K(A, g)} ∩ (A K(A, g)^⊥) = {0}.

Proof. Part (i) and the 'if' implication of part (ii) follow as argued right before stating the Proposition. Conversely, if f ∈ \overline{K(A, g)} and A^{−1} ∈ B(H), then A K(A, g) is dense in \overline{K(A, g)} (Proposition 3.2(ii)). Let now z ∈ \overline{K(A, g)} ∩ (A K(A, g)^⊥), say, z = Aw for some w ∈ K(A, g)^⊥, and, based on the density just recalled, let (A x_n)_{n∈N} be a sequence in A K(A, g) of approximants of z, for some x_n's in K(A, g). From A x_n → z = Aw and ‖A^{−1}‖_op < +∞ one has x_n → w as n → ∞. Since x_n ⊥ w, then

0 = lim_{n→∞} ‖x_n − w‖²_H = lim_{n→∞} (‖x_n‖²_H + ‖w‖²_H) = 2 ‖w‖²_H,

whence necessarily w = 0 and z = 0. This proves the 'only if' implication of (ii). ∎

Propositions 3.2(ii) and 3.4(ii) provide equivalent conditions for the Krylov solvability of linear inverse problems on Hilbert space when the linear maps are bounded bijections. (As such, these results do not apply to compact operators on infinite-dimensional Hilbert spaces.) In particular, Proposition 3.4(ii) shows that for such linear inverse problems the Krylov solvability is tantamount to the triviality of the Krylov intersection, which was the actual reason to introduce the space (2.7).
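In finite dimensions the triviality of the Krylov intersection (2.7) can be checked with elementary linear algebra. The sketch below (our own illustration, assuming only numpy; the helper names are ours) computes dim(K(A, g) ∩ A K(A, g)^⊥) via subspace ranks, and exhibits a non-trivial intersection precisely for a truncated right shift, whose inverse problem is not Krylov-solvable, versus a trivial one for a self-adjoint invertible diagonal matrix:

```python
import numpy as np

def orth(M, tol=1e-10):
    """Orthonormal basis of the column span of M (left singular vectors above tol)."""
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, s > tol]

def krylov_intersection_dim(A, g):
    """dim( K(A,g) ∩ A K(A,g)^perp ) = dim X + dim Y - dim(X + Y)."""
    n = len(g)
    K = orth(np.column_stack([np.linalg.matrix_power(A, k) @ g for k in range(n)]))
    Kperp = orth(np.eye(n) - K @ K.T)            # orthogonal complement of K(A, g)
    AKperp = orth(A @ Kperp) if Kperp.shape[1] else Kperp
    joint = orth(np.column_stack([K, AKperp]))   # basis of the sum of the two subspaces
    return K.shape[1] + AKperp.shape[1] - joint.shape[1]

n = 6
R = np.zeros((n, n)); R[np.arange(1, n), np.arange(n - 1)] = 1.0  # truncated right shift
g_r = np.zeros(n); g_r[1] = 1.0                                   # datum e_2
D = np.diag(np.linspace(1.0, 2.0, n))                             # self-adjoint, invertible
g_d = np.ones(n)                                                  # generic datum

dim_shift = krylov_intersection_dim(R, g_r)   # non-trivial intersection
dim_diag = krylov_intersection_dim(D, g_d)    # trivial intersection
```

The rank arithmetic in the last line of the helper is the standard dimension formula dim(X ∩ Y) = dim X + dim Y − dim(X + Y) for finite-dimensional subspaces.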
Remark 3.5.
Clearly our interest here is to discuss the possible occurrence of Krylov solvability, in the spirit that if the solution f to Af = g is a Krylov solution, then it has the practically favourable feature of being well approximable by linear combinations of g, Ag, A²g, … through one of the many efficient iterative algorithms of the class of 'Krylov subspace methods'. In this case, the next crucial question concerns the rate of convergence of the approximate iterates to the actual f, a point of view that we do not develop in the present work but that is of course the object of an ample part of the literature – see, e.g., the monographs [23, 9, 29].

3.3. Krylov reducibility and Krylov solvability.
Concerning the relation between Krylov reducibility and Krylov solvability, we know that the former implies the latter (Prop. 3.3). Moreover, there are classes of operators for which the two notions coincide, as the following remark shows.
Remark 3.6.
For unitary operators, the Krylov solvability of the associated inverse problem is equivalent to Krylov reducibility. Indeed, when U : H → H is unitary and f = U*g is the solution to Uf = g for some g ∈ H, the assumption f ∈ \overline{K(U, g)} implies U*g ∈ \overline{K(U, g)}, which by Proposition 2.4 is the same as the fact that U is K(U, g)-reduced.

There are also Krylov-solvable inverse problems whose operator is not Krylov-reduced, even among well-defined inverse problems, namely when A is bounded and injective and g ∈ ran A. This is the case of Example 2.5 when θ ≠ π.

Even in the relevant class of (bounded, injective) normal operators (the operator A_θ of Example 2.5 is not normal), Krylov solvability does not necessarily imply Krylov reducibility. Let us discuss a counterexample that elucidates this. We first need to recall the following fact from complex functional analysis – see, e.g., [3, Theorem 4.4.10].

Proposition 3.7.
Let U ⊂ C be an open subset of the complex plane, and denote by H(U) the space of holomorphic functions on U. Then the space H(U) ∩ L²(U) is closed in L²(U).

Example 3.8.
Consider the multiplication operator M_z : L²(Ω) → L²(Ω), M_z f = zf, with Ω = {z ∈ C | |z − 2| < 1}, and let g ∈ L²(Ω) be such that ε ≤ |g(z)| ≤ ε′ for all z ∈ Ω, for given ε, ε′ > 0. Then:
(i) M_z is bounded, injective, and normal;
(ii) the inverse linear problem M_z f = g is Krylov-solvable: f ∈ \overline{K(M_z, g)};
(iii) however, M_z is not Krylov-reduced.

Parts (i) and (ii) were discussed in Example 3.1(ii). It was also observed therein that K(M_z, g) = {p g | p ∈ P_Ω[z]}, where P_Ω[z] denotes the space of polynomials in z ∈ Ω with complex coefficients. Let us show that

(*)  \overline{K(M_z, g)} = { φg | φ ∈ \overline{P_Ω[z]}^{‖·‖_{L²}} }.

Indeed, if w ∈ \overline{K(M_z, g)}, then p_n g → w in L²(Ω) for a sequence (p_n)_{n∈N} in P_Ω[z], and since

‖p_n − p_m‖_{L²(Ω)} ≤ ε^{−1} ‖g p_n − g p_m‖_{L²(Ω)},

(p_n)_{n∈N} is a Cauchy sequence in L²(Ω), with p_n → φ for some φ ∈ \overline{P_Ω[z]}^{‖·‖_{L²}}; whence w = φg. Conversely, if w = φg for some φ ∈ \overline{P_Ω[z]}^{‖·‖_{L²}}, then φ is the L²-limit of a sequence (p_n)_{n∈N} of approximants in P_Ω[z], and

‖w − p_n g‖_{L²(Ω)} = ‖φg − p_n g‖_{L²(Ω)} ≤ ε′ ‖φ − p_n‖_{L²(Ω)} → 0 as n → ∞,

whence w ∈ \overline{K(M_z, g)}. The identity (*) is therefore established. Now, if by contradiction M_z were reduced with respect to the decomposition L²(Ω) = \overline{K(M_z, g)} ⊕ K(M_z, g)^⊥, then \overline{z} g = M_z* g ∈ \overline{K(M_z, g)} (Prop. 2.4), and identity (*) would imply that the function Ω ∋ z ↦ \overline{z} belongs to \overline{P_Ω[z]}^{‖·‖_{L²}}; however, the latter space, owing to Prop. 3.7, consists of holomorphic functions, and the function z ↦ \overline{z} clearly is not holomorphic. Part (iii) is thus proved.

3.4. More on Krylov solutions in the lack of well-posedness.
Let us consider, more generally, solvable inverse problems (g ∈ ran A) which are not necessarily well-posed (i.e., A is possibly non-injective). First, we see that Krylov reducibility still guarantees the existence of Krylov solutions; indeed Prop. 3.3 has a counterpart valid also in the lack of injectivity, which reads as follows.

Proposition 3.9. Let A be a bounded operator on a Hilbert space H, and let g ∈ ran A. If A is K(A, g)-reduced, then there exists a Krylov solution to the problem Af = g. For example, if f° ∈ H satisfies Af° = g and P_K is the orthogonal projection onto \overline{K(A, g)}, then f := P_K f° is a Krylov solution.

Proof. One has A(𝟙 − P_K)f° = 0, owing to the very same argument as in the above proof of Prop. 3.3. Thus, A P_K f° = g, that is, f := P_K f° is a Krylov solution. ∎

Generic bounded linear inverse problems may or may not admit a Krylov solution, and when they do there may exist further, non-Krylov solutions (Example 3.1). For a fairly general class of such problems, however, the Krylov solution, when it exists, is unique.

Proposition 3.10.
Let A be a bounded normal operator on a Hilbert space H, and let Af = g be the associated linear inverse problem, given g ∈ ran A. Then there exists at most one solution f ∈ \overline{K(A, g)}. More generally, the same conclusion holds if A is bounded with ker A ⊂ ker A*.

Proof. If f₁, f₂ ∈ \overline{K(A, g)} and A f₁ = g = A f₂, then f₁ − f₂ ∈ ker A ∩ \overline{K(A, g)}. By normality, ker A = ker A*, and moreover obviously \overline{K(A, g)} ⊂ \overline{ran A}. Therefore f₁ − f₂ ∈ ker A* ∩ \overline{ran A}. But ker A* ∩ \overline{ran A} = {0}, whence f₁ = f₂. The second statement is then obvious. ∎

This proposition is similar to comments made in [11, 2, 12] about Krylov solutions to singular systems in finite dimensions. Propositions 3.9 and 3.10 above have a noticeable consequence.
Corollary 3.11. If A ∈ B(H) is self-adjoint, then the inverse problem Af = g with g ∈ ran A admits a unique Krylov solution.

Proof. A is K(A, g)-reduced (Example 2.3(i)), hence the induced inverse problem admits a Krylov solution (Prop. 3.9). Such a solution is then necessarily unique (Prop. 3.10). ∎
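Corollary 3.11 has a direct numerical counterpart: for a self-adjoint, non-injective truncation, conjugate gradients started from zero produces iterates inside K(A, g) and therefore converges to the unique Krylov solution, not to any of the other solutions. A minimal sketch (our own illustration, assuming only numpy; the plain CG loop is textbook, not the paper's code):

```python
import numpy as np

def cg(A, g, iters=100):
    """Plain conjugate gradients started from f = 0: every iterate lies in K(A, g)."""
    f = np.zeros_like(g)
    r = g.copy()                 # residual g - A f
    p = r.copy()                 # search direction
    for _ in range(iters):
        Ap = A @ p
        rr = r @ r
        if rr < 1e-24:           # residual at machine-precision level: converged
            break
        alpha = rr / (p @ Ap)
        f = f + alpha * p
        r = r - alpha * Ap
        p = r + ((r @ r) / rr) * p
    return f

A = np.diag([0.0, 1.0, 2.0, 3.0, 4.0])       # self-adjoint, ker A = span{e_1}
f_K = np.array([0.0, 1.0, -2.0, 0.5, 3.0])   # the Krylov solution: no ker-A component
g = A @ f_K                                  # consistent datum, g in ran A

f = cg(A, g)                                 # CG recovers f_K; f[0] stays exactly 0
```

Every f_K + α e₁ solves Af = g, but the CG iterates can never leave span{e₂, …, e₅} = K(A, g), so the method singles out the Krylov solution.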
It is worth noticing that the self-adjoint case has always deserved a special status in this context, both theoretically and in applications: the convergence of Krylov techniques for self-adjoint operators is the object of an ample literature – see, e.g., [20, 7, 19, 31, 22, 18, 24].
Example 3.12.
The test problems

blur, deriv2, foxgood, gravity, heat, i_laplace, parallax, phillips, shaw, ursell

of Hansen's REGULARIZATION TOOLS Matlab package [17] correspond to integral operators $A_K$ on some $L^2[a,b]$ whose integral kernels $K(x,y)$ are square-integrable and have the property $K(x,y) = \overline{K(y,x)}$, namely they are Hilbert–Schmidt and self-adjoint operators. Owing to Corollary 3.11, all such inverse problems admit a unique Krylov solution (in fact they are Krylov-solvable, as long as $g \in \operatorname{ran}A_K$ and $A_K$ is injective).

Example 3.13.
The PRdiffusion two-dimensional test problem of Gazzola–Hansen–Nagy's IR Tools [13] consists of reconstructing, from the heat diffusion problem
$$\frac{\partial u}{\partial t} = \Delta_N u\,, \qquad u(0) = u_0\,,$$
with unknown $u \equiv u(x,y;t)$ in the Hilbert space $L^2([0,1]\times[0,1])$, the initial datum $u_0$ starting from the knowledge of the function $u(t)$ at time $t > 0$. By standard functional-analytic arguments one has $u_0 = \mathrm{e}^{-t\Delta_N}u(t)$, that is, for given $t$ the inverse problem $u(t) \mapsto u_0$ is self-adjoint and hence (Corollary 3.11) it admits a unique Krylov solution.

Example 3.14.
For given $k \in L^2[0,1]$, the operator $A : L^2[0,1] \to L^2[0,1]$ defined by
(3.1) $\displaystyle (Au)(x) := \int_0^1 k(x-y)\,u(y)\,\mathrm{d}y$
(with $k$ understood periodically) is a Hilbert–Schmidt normal operator, with norm $\|A\|_{\mathrm{op}} \leqslant \|k\|_{L^2}$, the integral kernel $\kappa_{A^*A}$ of $A^*A$ being the function
(3.2) $\displaystyle \kappa_{A^*A}(x,y) = \int_0^1 k(y-z)\,\overline{k(x-z)}\,\mathrm{d}z\,.$
$A$ is self-adjoint if and only if $k(x) = \overline{k(-x)}$ for almost every $x \in [0,1]$; let us instead fix
(3.3) $\displaystyle k(x) := \frac{\mathrm{e}^x \sin(\pi x)}{(1+\mathrm{e})\,\pi} - \frac{1}{1+\pi^2}\,.$
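The Fourier coefficients of this kernel can be cross-checked numerically. In the sketch below the closed form of $k$ is our reading of the formula above and should be treated as an assumption, as is the sign convention $c_n = \int_0^1 \mathrm{e}^{2\pi\mathrm{i}nx}k(x)\,\mathrm{d}x$:

```python
import numpy as np

# Midpoint-rule quadrature on [0,1]; the expression for k and the sign
# convention for c_n are assumptions of this sketch.
x = (np.arange(200_000) + 0.5) / 200_000
k = np.exp(x) * np.sin(np.pi * x) / ((1 + np.e) * np.pi) - 1 / (1 + np.pi**2)

def c(n):
    # c_n = ∫_0^1 e^{2πi n x} k(x) dx via the midpoint rule
    return np.mean(np.exp(2j * np.pi * n * x) * k)

assert abs(c(0)) < 1e-8                      # c_0 = 0: A is not injective
for n in (1, -1, 2):                         # c_n non-zero for n ≠ 0
    assert abs(c(n) - 1 / (1 + 4j * np.pi * n + (1 - 4 * n**2) * np.pi**2)) < 1e-8
```

The vanishing of $c_0$ is exactly what produces the one-dimensional kernel discussed next.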
With the above choice, in order to investigate the Krylov-solvability of the inverse problem $Af = g$ for given $g \in \operatorname{ran}A$ one must then go through an ad hoc analysis. Let us introduce the orthonormal basis $\{\varphi_n \,|\, n \in \mathbb{Z}\}$ of $L^2[0,1]$ with $\varphi_n(x) = \mathrm{e}^{2\pi\mathrm{i}nx}$. Then
(3.4) $\displaystyle k = \sum_{n\in\mathbb{Z}} c_n \varphi_n\,, \qquad c_n := \langle \varphi_n, k\rangle_{L^2}\,,$
and a straightforward explicit computation yields
(3.5) $\displaystyle c_n = \begin{cases} 0 & \text{if } n = 0\,, \\ \dfrac{1}{1 + 4\pi\mathrm{i}\,n + (1-4n^2)\pi^2} & \text{if } n \in \mathbb{Z}\setminus\{0\}\,. \end{cases}$
As a consequence,
(3.6) $\displaystyle (Au)(x) = \int_0^1 k(x-y)\,u(y)\,\mathrm{d}y = \sum_{n\in\mathbb{Z}} c_n \int_0^1 \varphi_n(x-y)\,u(y)\,\mathrm{d}y = \sum_{n\in\mathbb{Z}} c_n \varphi_n(x) \int_0^1 \overline{\varphi_n(y)}\,u(y)\,\mathrm{d}y\,,$
that is,
(3.7) $\displaystyle A = \sum_{n\in\mathbb{Z}} c_n |\varphi_n\rangle\langle\varphi_n| = \sum_{n\in\mathbb{Z}} \lambda_n |\psi_n\rangle\langle\varphi_n|\,, \qquad \begin{cases} \lambda_n := |c_n| \in \mathbb{R}\,, \\ \psi_n := \mathrm{e}^{\mathrm{i}\arg(c_n)}\varphi_n\,. \end{cases}$
It is standard to see that $\{\psi_n \,|\, n \in \mathbb{Z}\}$ is just another orthonormal basis of $L^2[0,1]$, and that (3.7) provides the singular value decomposition of $A$. We can now draw a number of conclusions.
• $A$ is not injective: $\ker A = \operatorname{span}\{\varphi_0\}$.
• $\overline{\operatorname{ran}A} = \overline{\operatorname{span}}\{\psi_n \,|\, n \in \mathbb{Z}\setminus\{0\}\} = \overline{\operatorname{span}}\{\varphi_n \,|\, n \in \mathbb{Z}\setminus\{0\}\}$.
• If $g \in \operatorname{ran}A$, $g \neq 0$, and $J \subset \mathbb{Z}\setminus\{0\}$ is the subset of non-zero integers $n$ such that $g_n := \langle\psi_n, g\rangle_{L^2} \neq 0$, then
$$g = \sum_{n\in J} g_n \psi_n$$
and the inverse linear problem $Af = g$ admits an infinity of solutions of the form $f = \alpha\varphi_0 + f_K$ for arbitrary $\alpha \in \mathbb{C}$, where
$$f_K := \sum_{n\in J} \frac{g_n}{\lambda_n}\,\varphi_n$$
(recall that $\lambda_n \neq 0$ whenever $n \neq 0$).
• Moreover, due to the property $\psi_n = \mathrm{e}^{\mathrm{i}\arg(c_n)}\varphi_n$, the vectors $g, Ag, A^2g, \dots$ have non-zero components only of order $n \in J$; this, together with the fact that the $c_n$'s are all distinct, implies that $\overline{\mathcal{K}(A,g)} = \overline{\operatorname{span}}\{\psi_n \,|\, n \in J\} = \overline{\operatorname{span}}\{\varphi_n \,|\, n \in J\}$.
• The functions $f = \alpha\varphi_0 + f_K$ with $\alpha \in \mathbb{C}\setminus\{0\}$ are non-Krylov solutions to the problem $Af = g$, whereas $f_K$ is the unique Krylov solution, consistently with Prop. 3.10.

3.5. Special classes of Krylov-solvable problems.
In the current lack (to our knowledge) of a complete characterisation of all Krylov-solvable inverse problems on infinite-dimensional Hilbert space, it is of interest to identify special sub-classes of them. We already examined simple explicit cases in Example 3.1, parts (i), (ii), (iii), and (vii). We also concluded (Cor. 3.11) that a whole class of paramount relevance, the bounded self-adjoint operators, induce inverse problems that are Krylov-solvable, and Examples 3.12 and 3.13 survey a number of applications.

It is worth mentioning that in the special case where the operator $A$ is bounded, self-adjoint, and positive definite, an alternative analysis by Nemirovskiy and Polyak [22] (for a more recent discussion of which we refer to [9, Sect. 7.2] and [16, Sect. 3.2], as well as [4]) proved that the corresponding linear inverse problem $Af = g$ with $g \in \operatorname{ran}A$ is actually Krylov-solvable. In particular it was proved that the sequence of Krylov approximations from the conjugate gradient algorithm converges strongly to the exact solution.

Krylov-solvable problems can surely be found for suitable non-self-adjoint operators too (Example 3.14), although, as already commented, Krylov-solvability is not automatic for compact, normal, injective operators (Example 3.1(v)).

To conclude this Section, let us present one further class of well-posed inverse linear problems that are Krylov-solvable (Corollary 3.16 below). For shortness, we shall say that an operator $A$ is of class-K when
• $A \in \mathcal{B}(H)$,
• $0 \notin \sigma(A)$,
• there exists an open subset $W \subset \mathbb{C}$ such that $\sigma(A) \subset W$, $\overline{W}$ is compact with $0 \notin \overline{W}$, and $\mathbb{C}\setminus\overline{W}$ is connected in $\mathbb{C}$.
(Observe, for instance, that the multiplication operator $M_z$ considered in Example 3.1(ii) is of class-K, whereas unitary operators are not.)

Class-K operators have a polynomial approximation of their inverse, which eventually yields Krylov-solvability of the associated inverse problem.

Proposition 3.15.
Let $A$ be an operator of class-K on a Hilbert space $H$. Then there exists a polynomial sequence $(p_n)_{n\in\mathbb{N}}$ over $\mathbb{C}$ such that $\|p_n(A) - A^{-1}\|_{\mathrm{op}} \to 0$ as $n \to \infty$.

Proof. Let
$U \subset \mathbb{C}$ be an open set such that $0 \notin U$ and $\overline{W} \subset U$, where $W$ is an open set fulfilling the definition of class-K for the given $A$. The function $z \mapsto z^{-1}$ is holomorphic on $U$ and hence (see, e.g., [27, Theorem 13.7]) there exists a polynomial sequence $(p_n)_{n\in\mathbb{N}}$ such that
$$\|z^{-1} - p_n(z)\|_{L^\infty(\overline{W})} \xrightarrow{\;n\to\infty\;} 0\,.$$
On the other hand, there exists a closed curve $\Gamma \subset W \setminus \sigma(A)$ such that (see, e.g., [27, Theorem 13.5])
$$z^{-1} = \frac{1}{2\pi\mathrm{i}}\oint_\Gamma \frac{\mathrm{d}\zeta}{\zeta(\zeta - z)}\,, \qquad p_n(z) = \frac{1}{2\pi\mathrm{i}}\oint_\Gamma \frac{p_n(\zeta)}{\zeta - z}\,\mathrm{d}\zeta\,,$$
whence also (see, e.g., [26, Chapter XI, Sect. 151])
$$A^{-1} = \frac{1}{2\pi\mathrm{i}}\oint_\Gamma \zeta^{-1}(\zeta\mathbb{1} - A)^{-1}\,\mathrm{d}\zeta\,, \qquad p_n(A) = \frac{1}{2\pi\mathrm{i}}\oint_\Gamma p_n(\zeta)\,(\zeta\mathbb{1} - A)^{-1}\,\mathrm{d}\zeta\,.$$
Thus,
$$\|A^{-1} - p_n(A)\|_{\mathrm{op}} = \Big\|\frac{1}{2\pi\mathrm{i}}\oint_\Gamma \big(\zeta^{-1} - p_n(\zeta)\big)(\zeta\mathbb{1} - A)^{-1}\,\mathrm{d}\zeta\Big\|_{\mathrm{op}} \leqslant \|z^{-1} - p_n(z)\|_{L^\infty(\overline{W})}\,\Big\|\frac{1}{2\pi\mathrm{i}}\oint_\Gamma (\zeta\mathbb{1} - A)^{-1}\,\mathrm{d}\zeta\Big\|_{\mathrm{op}} = \|z^{-1} - p_n(z)\|_{L^\infty(\overline{W})}$$
(indeed, $(2\pi\mathrm{i})^{-1}\oint_\Gamma (\zeta\mathbb{1} - A)^{-1}\,\mathrm{d}\zeta = \mathbb{1}$), and the conclusion follows. $\square$

Corollary 3.16.
Let $A$ be an operator of class-K on a Hilbert space $H$. Then the inverse problem $Af = g$ for given $g \in H$ is Krylov-solvable, i.e., the unique solution $f$ belongs to $\overline{\mathcal{K}(A,g)}$.

Proof. As $\|p_n(A) - A^{-1}\|_{\mathrm{op}} \xrightarrow{n\to\infty} 0$ (Prop. 3.15), one has $\|p_n(A)g - f\|_H = \|p_n(A)g - A^{-1}g\|_H \xrightarrow{n\to\infty} 0$, and obviously $p_n(A)g \in \mathcal{K}(A,g)$. $\square$

4. Numerical tests and examples
In this final Section we examine the main features discussed theoretically so far through a series of numerical tests on inverse problems in infinite-dimensional Hilbert space, suitably truncated using the GMRES algorithm, and analysed by increasing the size of the truncation (i.e., the number of iterations of GMRES). We focus on the behaviour of the truncated problems under these circumstances:
I) when the solution to the original problem is or is not a Krylov solution;
II) when the linear operator is or is not injective (well-defined vs ill-defined problem).

4.1.
Four inverse linear problems.
As a 'baseline' case, where the solution is known a priori to be a Krylov solution, we considered the compact, injective, self-adjoint multiplication operator on $\ell^2(\mathbb{N})$ (Sect. A.1)
(4.1) $\displaystyle M = \sum_{n=1}^{\infty} \sigma_n |e_n\rangle\langle e_n|\,, \qquad \sigma_n = (5n)^{-1}\,.$
In comparison to $M$ we tested a non-injective version of it, namely
(4.2) $\displaystyle \widetilde{M} = \sum_{n=1}^{\infty} \widetilde{\sigma}_n |e_n\rangle\langle e_n|\,, \qquad \widetilde{\sigma}_n = \begin{cases} 0 & \text{for three selected indices } n\,, \\ \sigma_n & \text{otherwise}\,, \end{cases}$
as well as the weighted right shift (Sect. A.3)
(4.3) $\displaystyle R = \sum_{n=1}^{\infty} \sigma_n |e_{n+1}\rangle\langle e_n|$
with the same weights as in (4.1). We thus investigated the inverse problems $Mf = g$, $\widetilde{M}f = g$, and $Rf = g$ with datum $g$ generated by the a priori chosen solution
(4.4) $\displaystyle f = \sum_{n\in\mathbb{N}} f_n e_n\,, \qquad f_n = \begin{cases} n^{-1} & \text{if } n \leqslant 250\,, \\ 0 & \text{otherwise}\,. \end{cases}$
Let us observe that
(4.5) $\displaystyle \|f\|_{\ell^2} = \sqrt{\tfrac{\pi^2}{6} - \Psi^{(1)}(251)} \simeq 1.2810\,,$
where $\Psi^{(k)}$ is the polygamma function of order $k$ [1, Sect. 6.4].

Fourth and last, we considered the inverse problem $Vf = g$ where $V$ is the Volterra operator in $L^2[0,1]$ (Sect. A.5) and $g(x) = \frac{1}{2}x^2$. The problem has unique solution
(4.6) $\displaystyle f(x) = x\,, \qquad \|f\|_{L^2[0,1]} = \tfrac{1}{\sqrt{3}} \simeq 0.5774\,.$
Depending on the context, we shall denote respectively by $H$ and by $A$ the Hilbert space ($\ell^2(\mathbb{N})$ or $L^2[0,1]$) and the operator ($M$, $\widetilde{M}$, $R$, or $V$) under consideration.

The inverse problems in $H$ associated with $M$ and $\widetilde{M}$ are Krylov-solvable (Corollary 3.11), and so too is the inverse problem associated with $V$, with $\mathcal{K}(V,g)$ dense in $L^2[0,1]$ (Example 3.1(vii)). Instead, the problem associated with $R$ is not Krylov-solvable, for $\mathcal{K}(R,g)^\perp$ always contains the first canonical vector $e_1$.

For each operator $A$, we proceeded numerically by generating the spanning vectors $g, Ag, A^2g, \dots$ of $\mathcal{K}(A,g)$ up to order $N_{\max} = 500$ if $A = M, \widetilde{M}, R$, and up to order $N_{\max} = 175$ if $A = V$. Such values represent our practical choice of 'infinite' dimension for $\mathcal{K}(A,g)$. Analogously, when $A = M, \widetilde{M}, R$ we allocated for each of the considered vectors $f, g, Ag, A^2g, \dots$ an amount of 2500 entries with respect to the canonical basis of $\ell^2(\mathbb{N})$: such a value represents our practical choice of 'infinite' dimension for $H$. Let us observe, in particular, that by repeated application of $R$ up to 500 times, the vectors $R^k g$ have non-trivial entries up to order $251 + 500 = 751$ (by construction the last non-zero entries of $f$ and of $g$ are the components, respectively, $e_{250}$ and $e_{251}$), and by repeated application of $M$ and $\widetilde{M}$ the vectors $M^k g$ and $\widetilde{M}^k g$ have the component $e_{250}$ as last non-zero entry: all such limits stay well below our 'infinity' threshold of 2500 for $H$.

From each collection $\{g, Ag, \dots, A^{N-1}g\}$ we then obtained an orthonormal basis of the $N$-dimensional truncation of $\mathcal{K}(A,g)$, $N \leqslant N_{\max}$, and we truncated the 'infinite-dimensional' inverse problem $Af = g$ to an $N$-dimensional one, that we solved by means of the GMRES algorithm, in the same spirit as our general discussion [5, Sect. 2]. Denoting by $\widehat{f}^{(N)} \in H$ the vector of the solution from the GMRES algorithm at the $N$-th iterate, we analysed two natural indicators of the convergence 'as $N \to \infty$', the infinite-dimensional error $E_N$ and the infinite-dimensional residual $R_N$, defined respectively [5, Sect. 2] as
(4.7) $\displaystyle E_N := f - \widehat{f}^{(N)}\,, \qquad R_N := g - A\widehat{f}^{(N)}\,.$

4.2. Krylov vs non-Krylov solutions.
The (norm) behaviours of the infinite-dimensional error $\|E_N\|_H$, of the infinite-dimensional residual $\|R_N\|_H$, and of the approximated solution $\|\widehat{f}^{(N)}\|_H$ at the $N$-th step of the algorithm are illustrated in Figure 1 as a function of $N$. The numerical evidence is the following.
[Figure 1: four panels — (a) case $M$, (b) case $\widetilde{M}$, (c) case $R$, (d) case $V$ — each displaying the error norm, the residual norm, and $\|\widehat{f}^{(N)}\|$ as functions of $N$.]

Figure 1.
Error norm and residual norm as a function of iterations for the cases of the injective multiplication operator $M$ (baseline case), the weighted right shift $R$, the non-injective multiplication operator $\widetilde{M}$, and the Volterra operator $V$.

• The error norm of the baseline case and the Volterra case tend to vanish with $N$, and so does the residual norm, consistently with the obvious property $\|R_N\|_H \leqslant \|A\|_{\mathrm{op}}\|E_N\|_H$. Moreover, $\|\widehat{f}^{(N)}\|_H$ stays uniformly bounded and attains asymptotically the theoretical value prescribed by (4.5) or (4.6).
• Instead, the error norm of the right shift remains of order one, indicating a lack of norm-convergence, regardless of the truncation size. An analogous lack of convergence is displayed by the norm of the residual. Again, $\|\widehat{f}^{(N)}\|_H$ remains uniformly bounded, but attains an asymptotic value that is strictly smaller than the theoretical value (4.5).

Figure 2.
Support of the error vector (blue bars) for the non-injective problem $\widetilde{M}f = g$ at final iteration $N = 500$. The red lines mark the entry positions of the components of the kernel of $\widetilde{M}$.

The asymptotics $\|f - \widehat{f}^{(N)}\|_{\ell^2} \to 1$ and $\|g - R\widehat{f}^{(N)}\|_{\ell^2} \to 0.2$ for the problem $Rf = g$ can be understood as follows. Since $\widehat{f}^{(N)} \in \mathcal{K}(R,g)$ and since the latter subspace only contains vectors with zero component along $e_1$, the error vector $E_N = f - \widehat{f}^{(N)}$ tends to approach asymptotically the vector $e_1$ that gives the first component of $f = (1, \frac{1}{2}, \frac{1}{3}, \dots)$, and this explains $\|E_N\|_{\ell^2} \to 1$. Since $g = (0, \sigma_1, \frac{\sigma_2}{2}, \frac{\sigma_3}{3}, \dots)$, and since the asymptotics on $E_N$ implies that each component of $\widehat{f}^{(N)}$ but the first one converges to the corresponding component of $f$, then $\widehat{f}^{(N)} \approx (0, \frac{1}{2}, \frac{1}{3}, \dots)$ for large $N$, whence also $R\widehat{f}^{(N)} \approx (0, 0, \frac{\sigma_2}{2}, \frac{\sigma_3}{3}, \dots)$. Thus $g$ and $R\widehat{f}^{(N)}$ tend to differ by only the vector $\sigma_1 e_2$, which explains $\|R_N\|_{\ell^2} \to \sigma_1 = 0.2$.

In fact, the lack of norm vanishing of error and residual for the problem $Rf = g$ is far from meaning that the approximants $\widehat{f}^{(N)}$ carry no information about the exact solution $f$: in complete analogy to what we discussed in a more general context in [5, Sect. 3 and Sect. 4] – in particular in [5, Theorems 3.2 and 4.1] – $\widehat{f}^{(N)}$ reproduces $f$ component-wise for all components but the first.

To summarise the above findings, the Krylov-solvable infinite-dimensional problems ($Mf = g$, $Vf = g$) display good (i.e., norm-) convergence of error and residual, which is sharper for the multiplication operator $M$ and quite slower for the Volterra operator $V$, indicating that the choice of the Krylov bases is not equally effective for the two problems. This is in contrast with the non-Krylov-solvable problem ($Rf = g$), which does not converge in norm at all.
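A hedged miniature of this experiment (with our own, much smaller, truncation sizes and a plain least-squares realisation of GMRES) reproduces the contrast between the self-adjoint baseline and the weighted right shift:

```python
import numpy as np

def krylov_error(A, f_exact, steps):
    """Error ||f - f_N|| of the least-squares (GMRES-style) solution
    computed on the N-dimensional truncations of K(A,g), g = A f_exact."""
    g = A @ f_exact
    V = (g / np.linalg.norm(g)).reshape(-1, 1)
    errs = []
    for _ in range(steps):
        Q, _ = np.linalg.qr(V)                        # orthonormal basis of K_N
        y, *_ = np.linalg.lstsq(A @ Q, g, rcond=None) # minimise ||A Q y - g||
        errs.append(np.linalg.norm(Q @ y - f_exact))
        V = np.column_stack([Q, A @ Q[:, -1]])        # extend to K_{N+1}
    return errs

n = 300
sigma = 1.0 / (5.0 * np.arange(1, n + 1))
f = np.where(np.arange(1, n + 1) <= 50, 1.0 / np.arange(1, n + 1), 0.0)

err_M = krylov_error(np.diag(sigma), f, 60)            # self-adjoint baseline
err_R = krylov_error(np.diag(sigma[:-1], -1), f, 60)   # weighted right shift

assert err_M[-1] < 1e-6           # Krylov-solvable: error norm decays
assert 0.99 < err_R[-1] < 2.0     # stalls at >= |f_1| = 1, since e_1 ⟂ K(R,g)
```

The stalling of `err_R` reflects exactly the mechanism explained above: every Krylov approximant has zero first component, so the error can never drop below $|f_1| = 1$.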
The uniformity in the size of the solutions produced by the GMRES algorithm appears not to be affected by the presence or the lack of Krylov-solvability.

4.3. Lack of injectivity.
We then focussed on the behaviour of the truncated problems in the absence of injectivity, by means of the case study operator $\widetilde{M}$ defined in (4.2). Let us observe that the inverse problem $\widetilde{M}f = g$, with $g \in \operatorname{ran}\widetilde{M}$, admits an infinity of solutions, yet even in the lack of injectivity Corollary 3.11 guarantees that such a problem admits a unique Krylov solution. Numerically we found the following.
• As opposed to the baseline case $M$, the infinite-dimensional error norm $\|E_N\|_{\ell^2} = \|f - \widehat{f}^{(N)}\|_{\ell^2}$ does not vanish with the truncation size and remains instead uniformly bounded. The infinite-dimensional residual norm $\|\widetilde{M}\widehat{f}^{(N)} - g\|_{\ell^2}$, instead, displays the same vanishing behaviour as for $M$ (Fig. 1).
• The norm of the approximated solution $\|\widehat{f}^{(N)}\|_{\ell^2}$ remains uniformly bounded (Fig. 1).
The reason for the observed lack of convergence of the error is unmasked in Figure 2. There one can see that the only non-zero components of the error vector are those corresponding to the kernel entries. This shows that the Krylov algorithm has indeed found a solution to the problem, modulo the kernel components in $f$.

Appendix A. Some prototypical example operators
Let us review in this Appendix certain operators in Hilbert space that were useful in the course of our discussion, both as a source of examples and counter-examples, and as a playground to understand certain mechanisms typical of the infinite dimensionality.

A.1.
The multiplication operator on $\ell^2(\mathbb{N})$. Let us denote by $(e_n)_{n\in\mathbb{N}}$ the canonical orthonormal basis of $\ell^2(\mathbb{N})$. For a given bounded sequence $a \equiv (a_n)_{n\in\mathbb{N}}$ in $\mathbb{C}$, the multiplication by $a$ is the operator $M^{(a)} : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ defined by $M^{(a)}e_n = a_n e_n$ $\forall n \in \mathbb{N}$ and then extended by linearity and density, in other words the operator given by the series
(A.1) $\displaystyle M^{(a)} = \sum_{n=1}^{\infty} a_n |e_n\rangle\langle e_n|$
(that converges strongly in the operator sense). $M^{(a)}$ is bounded with norm $\|M^{(a)}\|_{\mathrm{op}} = \sup_n |a_n|$ and spectrum $\sigma(M^{(a)})$ given by the closure in $\mathbb{C}$ of the set $\{a_1, a_2, a_3, \dots\}$. Its adjoint is the multiplication by $\overline{a}$. Thus, $M^{(a)}$ is normal. $M^{(a)}$ is self-adjoint whenever $a$ is real, and it is compact if $\lim_{n\to\infty} a_n = 0$.

A.2. The right-shift operator on $\ell^2(\mathbb{N})$. The operator $R : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ defined by $Re_n = e_{n+1}$ $\forall n \in \mathbb{N}$ and then extended by linearity and density, in other words the operator given by the series
(A.2) $\displaystyle R = \sum_{n=1}^{\infty} |e_{n+1}\rangle\langle e_n|$
(that converges strongly in the operator sense), is called the right-shift operator. $R$ is an isometry (i.e., it is norm-preserving) with closed range $\operatorname{ran}R = \{e_1\}^{\perp}$. In particular, it is bounded with $\|R\|_{\mathrm{op}} = 1$, yet not compact; it is injective and invertible on its range, with bounded inverse
(A.3) $\displaystyle R^{-1} : \operatorname{ran}R \to H\,, \qquad R^{-1} = \sum_{n=1}^{\infty} |e_n\rangle\langle e_{n+1}|\,.$
The adjoint of $R$ on $H$ is the so-called left-shift operator, namely the everywhere defined and bounded operator $L : H \to H$ defined by the (strongly convergent, in the operator sense) series
(A.4) $\displaystyle L = \sum_{n=1}^{\infty} |e_n\rangle\langle e_{n+1}|\,, \qquad L = R^*\,.$
Thus, $L$ inverts $R$ on $\operatorname{ran}R$, i.e., $LR = \mathbb{1}$, yet $RL = \mathbb{1} - |e_1\rangle\langle e_1|$. One has $\ker R^* = \operatorname{span}\{e_1\}$.
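The identities $LR = \mathbb{1}$ and $RL = \mathbb{1} - |e_1\rangle\langle e_1|$ can be checked on finite truncations (a sketch with matrix sizes of our own choosing; the corner defect in $LR$ is purely a truncation artifact, absent in $\ell^2(\mathbb{N})$):

```python
import numpy as np

# N-by-N truncations of the right shift R and its adjoint L = R*.
N = 6
R = np.eye(N, k=-1)          # R e_n = e_{n+1}; R e_N is truncated to 0
L = R.T                      # left shift
I = np.eye(N)

# LR = 1 except in the last coordinate (truncation edge effect)
assert np.allclose(L @ R, np.diag([1.0] * (N - 1) + [0.0]))
# RL = 1 - |e_1><e_1| holds exactly under truncation
assert np.allclose(R @ L, I - np.outer(I[:, 0], I[:, 0]))
```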
$R$ and $L$ have the same spectrum $\sigma(R) = \sigma(L) = \{z \in \mathbb{C} \,|\, |z| \leqslant 1\}$, but $R$ has no eigenvalue, whereas the eigenvalues of $L$ form the open unit ball $\{z \in \mathbb{C} \,|\, |z| < 1\}$.

A.3. The compact (weighted) right-shift operator on $\ell^2(\mathbb{N})$. This is the operator $\mathcal{R} : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ defined by the operator-norm convergent series
(A.5) $\displaystyle \mathcal{R} = \sum_{n=1}^{\infty} \sigma_n |e_{n+1}\rangle\langle e_n|\,,$
where $\sigma \equiv (\sigma_n)_{n\in\mathbb{N}}$ is a given bounded sequence with $0 < \sigma_{n+1} < \sigma_n$ $\forall n \in \mathbb{N}$ and $\lim_{n\to\infty}\sigma_n = 0$. Thus, $\mathcal{R}e_n = \sigma_n e_{n+1}$. $\mathcal{R}$ is injective and compact, and (A.5) is its singular value decomposition, with norm $\|\mathcal{R}\|_{\mathrm{op}} = \sigma_1$, $\overline{\operatorname{ran}\mathcal{R}} = \{e_1\}^{\perp}$, and adjoint
(A.6) $\displaystyle \mathcal{R}^* = \mathcal{L} = \sum_{n=1}^{\infty} \sigma_n |e_n\rangle\langle e_{n+1}|\,.$
Thus, $\mathcal{L}\mathcal{R} = M^{(\sigma^2)}$, the operator of multiplication by $(\sigma_n^2)_{n\in\mathbb{N}}$, whereas $\mathcal{R}\mathcal{L} = M^{(\sigma^2)} - \sigma_1^2 |e_1\rangle\langle e_1|$.

A.4. The compact (weighted) right-shift operator on $\ell^2(\mathbb{Z})$. This is the operator $\mathcal{R} : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ defined by the operator-norm convergent series
(A.7) $\displaystyle \mathcal{R} = \sum_{n\in\mathbb{Z}} \sigma_{|n|} |e_{n+1}\rangle\langle e_n|\,,$
where $\sigma \equiv (\sigma_n)_n$ is a given bounded sequence with $0 < \sigma_{n+1} < \sigma_n$ and $\lim_{n\to\infty}\sigma_n = 0$. Thus, $\mathcal{R}e_n = \sigma_{|n|}e_{n+1}$. $\mathcal{R}$ is injective and compact, with $\operatorname{ran}\mathcal{R}$ dense in $H$ and norm $\|\mathcal{R}\|_{\mathrm{op}} = \sigma_0$. (A.7) gives the singular value decomposition. The adjoint of $\mathcal{R}$ is
(A.8) $\displaystyle \mathcal{R}^* = \mathcal{L} = \sum_{n\in\mathbb{Z}} \sigma_{|n|} |e_n\rangle\langle e_{n+1}|\,.$
Thus, $\mathcal{L}\mathcal{R} = M^{(\sigma^2)} = \mathcal{R}\mathcal{L}$. The 'inverse of $\mathcal{R}$ on its range' is the densely defined, surjective, unbounded operator $\mathcal{R}^{-1} : \operatorname{ran}\mathcal{R} \to H$ acting as
(A.9) $\displaystyle \mathcal{R}^{-1} = \sum_{n\in\mathbb{Z}} \sigma_{|n|}^{-1} |e_n\rangle\langle e_{n+1}|\,,$
as a series that converges on $\operatorname{ran}\mathcal{R}$ in the strong operator sense.

A.5. The Volterra operator on $L^2[0,1]$. This is the operator $V : L^2[0,1] \to L^2[0,1]$ defined by
(A.10) $\displaystyle (Vf)(x) = \int_0^x f(y)\,\mathrm{d}y\,, \qquad x \in [0,1]\,.$
$V$ is compact and injective with spectrum $\sigma(V) = \{0\}$ (thus, the spectral point $0$ is not an eigenvalue) and norm $\|V\|_{\mathrm{op}} = \frac{2}{\pi}$. Its adjoint $V^*$ acts as
(A.11) $\displaystyle (V^*f)(x) = \int_x^1 f(y)\,\mathrm{d}y\,, \qquad x \in [0,1]\,,$
therefore $V + V^*$ is the rank-one orthogonal projection
(A.12) $\displaystyle V + V^* = |\mathbb{1}\rangle\langle\mathbb{1}|$
onto the constant function $\mathbb{1}(x) = 1$.
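These facts about $V$ can be probed with a crude Nyström-type discretization (grid size and quadrature are our own choices, so the tolerances are loose): $(Vf)(x_i) \approx h\sum_{j\leqslant i} f(x_j)$ as a lower-triangular matrix.

```python
import numpy as np

# Lower-triangular discretization of the Volterra operator on an N-point grid.
N = 800
h = 1.0 / N
V = h * np.tril(np.ones((N, N)))

# The two largest singular values approach 2/π and 2/(3π) as N grows.
s = np.linalg.svd(V, compute_uv=False)
assert abs(s[0] - 2 / np.pi) < 5e-3
assert abs(s[1] - 2 / (3 * np.pi)) < 5e-3

# V + V* is close to the rank-one projection onto the constant function 1:
# one eigenvalue near 1, all the others near 0.
w = np.linalg.eigvalsh(V + V.T)
assert abs(w[-1] - 1.0) < 5e-2
assert np.max(np.abs(w[:-1])) < 5e-2
```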
The singular value decomposition of $V$ is
(A.13) $\displaystyle V = \sum_{n=0}^{\infty} \sigma_n |\psi_n\rangle\langle\varphi_n|\,, \qquad \begin{cases} \sigma_n = \frac{2}{(2n+1)\pi}\,, \\ \varphi_n(x) = \sqrt{2}\,\cos\frac{(2n+1)\pi x}{2}\,, \\ \psi_n(x) = \sqrt{2}\,\sin\frac{(2n+1)\pi x}{2}\,, \end{cases}$
where both $(\varphi_n)_{n\in\mathbb{N}_0}$ and $(\psi_n)_{n\in\mathbb{N}_0}$ are orthonormal bases of $L^2[0,1]$. $\operatorname{ran}V$ is dense, but strictly contained in $H$: for example, $\mathbb{1} \notin \operatorname{ran}V$. (Observe, though, that the dense subspace of the polynomials on $[0,1]$ is mapped by $V$ onto the non-closed $\operatorname{span}\{x, x^2, x^3, \dots\}$.) In fact, $V$ is invertible on its range, but does not have an (everywhere defined) bounded inverse; yet $V - z\mathbb{1}$ does, for any $z \in \mathbb{C}\setminus\{0\}$ (recall that $\sigma(V) = \{0\}$), and
(A.14) $\displaystyle \big((z\mathbb{1} - V)^{-1}\psi\big)(x) = \frac{1}{z}\,\psi(x) + \frac{1}{z^2}\int_0^x \mathrm{e}^{\frac{x-y}{z}}\,\psi(y)\,\mathrm{d}y \qquad \forall \psi \in H\,,\ z \in \mathbb{C}\setminus\{0\}\,.$
The explicit action of the powers of $V$ is
(A.15) $\displaystyle (V^n f)(x) = \frac{1}{(n-1)!}\int_0^x (x-y)^{n-1} f(y)\,\mathrm{d}y\,, \qquad n \in \mathbb{N}\,.$

A.6.
The multiplication operator over $\Omega \subset \mathbb{C}$ in $L^2(\Omega)$. This is the operator $M_z : L^2(\Omega) \to L^2(\Omega)$, $f \mapsto zf$, where $\Omega$ is a bounded open region in $\mathbb{C}$. $M_z$ is a normal bounded bijection with norm $\|M_z\|_{\mathrm{op}} = \sup_{z\in\Omega}|z|$, spectrum $\sigma(M_z) = \overline{\Omega}$, and adjoint given by $M_z^* f = \overline{z}f$.

References

[1]
M. Abramowitz and I. A. Stegun, Handbook of mathematical functions with formulas, graphs, and mathematical tables, vol. 55 of National Bureau of Standards Applied Mathematics Series, U.S. Government Printing Office, Washington, D.C., 1964.
[2] P. N. Brown and H. F. Walker, GMRES on (nearly) singular systems, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 37–51.
[3] N. A. Caruso, On Krylov methods in infinite-dimensional Hilbert space, Ph.D. thesis, SISSA Trieste (2019).
[4] N. A. Caruso and A. Michelangeli, Convergence of the conjugate gradient method with unbounded operators, SISSA preprint 20/2019/MATE (2019).
[5] N. A. Caruso, A. Michelangeli, and P. Novati, Truncation and convergence issues for bounded linear inverse problems in Hilbert space, preprint (2018).
[6] B. A. Cipra, The best of the 20th century: Editors name top 10 algorithms, SIAM News, 33 (2000).
[7] J. W. Daniel, The conjugate gradient method for linear and nonlinear operator equations, SIAM J. Numer. Anal., 4 (1967), pp. 10–26.
[8] J. Dongarra and F. Sullivan, The Top 10 Algorithms (Guest editors' introduction), Comput. Sci. Eng., 2 (2000), pp. 22–23.
[9] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of inverse problems, vol. 375 of Mathematics and its Applications, Kluwer Academic Publishers Group, Dordrecht, 1996.
[10] A. Ern and J.-L. Guermond, Theory and practice of finite elements, vol. 159 of Applied Mathematical Sciences, Springer-Verlag, New York, 2004.
[11] R. W. Freund and M. Hochbruck, On the use of two QMR algorithms for solving singular systems and applications in Markov chain modeling, Numer. Linear Algebra Appl., 1 (1994), pp. 403–420.
[12] M. G. Gasparo, A. Papini, and A. Pasquali, Some properties of GMRES in Hilbert spaces, Numer. Funct. Anal. Optim., 29 (2008), pp. 1276–1285.
[13] S. Gazzola, P. C. Hansen, and J. G. Nagy, IR Tools: a MATLAB package of iterative regularization methods and large-scale test problems, Numerical Algorithms, (2018).
[14] L. Gehér, Cyclic vectors of a cyclic operator span the space, Proc. Amer. Math. Soc., 33 (1972), pp. 109–110.
[15] P. R. Halmos, A Hilbert space problem book, vol. 19 of Graduate Texts in Mathematics, Springer-Verlag, New York–Berlin, second ed., 1982. Encyclopedia of Mathematics and its Applications, 17.
[16] M. Hanke, Conjugate gradient type methods for ill-posed problems, vol. 327 of Pitman Research Notes in Mathematics Series, Longman Scientific & Technical, Harlow, 1995.
[17] P. C. Hansen, Regularization Tools – A Matlab Package for Analysis and Solution of Discrete Ill-Posed Problems, ∼pcha/Regutools/.
[18] R. Herzog and E. Sachs, Superlinear convergence of Krylov subspace methods for self-adjoint problems in Hilbert space, SIAM J. Numer. Anal., 53 (2015), pp. 1304–1324.
[19] W. J. Kammerer and M. Z. Nashed, On the convergence of the conjugate gradient method for singular linear operator equations, SIAM J. Numer. Anal., 9 (1972), pp. 165–181.
[20] W. Karush, Convergence of a method of solving linear problems, Proc. Amer. Math. Soc., 3 (1952), pp. 839–851.
[21] J. Liesen and Z. Strakoš, Krylov subspace methods, Numerical Mathematics and Scientific Computation, Oxford University Press, Oxford, 2013. Principles and analysis.
[22] A. S. Nemirovskiy and B. T. Polyak, Iterative methods for solving linear ill-posed problems under precise information. I, Izv. Akad. Nauk SSSR Tekhn. Kibernet., (1984), pp. 13–25, 203.
[23] O. Nevanlinna, Convergence of iterations for linear equations, Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, 1993.
[24] P. Novati, A convergence result for some Krylov–Tikhonov methods in Hilbert spaces, Numer. Funct. Anal. Optim., 39 (2018), pp. 655–666.
[25] A. Quarteroni, Numerical models for differential problems, vol. 16 of MS&A. Modeling, Simulation and Applications, Springer, Cham, 2017. Third edition.
[26] F. Riesz and B. Sz.-Nagy, Functional analysis, Frederick Ungar Publishing Co., New York, 1955. Translated by Leo F. Boron.
[27] W. Rudin, Real and complex analysis, McGraw-Hill Book Co., New York, third ed., 1987.
[28] Y. Saad, Krylov subspace methods for solving large unsymmetric linear systems, Math. Comp., 37 (1981), pp. 105–126.
[29] Y. Saad, Iterative methods for sparse linear systems, Society for Industrial and Applied Mathematics, Philadelphia, PA, second ed., 2003.
[30] K. Schmüdgen, Unbounded self-adjoint operators on Hilbert space, vol. 265 of Graduate Texts in Mathematics, Springer, Dordrecht, 2012.
[31] R. Winther, Some superlinear convergence results for the conjugate gradient method, SIAM J. Numer. Anal., 17 (1980), pp. 14–17.

(N. Caruso)
International School for Advanced Studies – SISSA, via Bonomea 265, 34136 Trieste (Italy).
E-mail address: [email protected]

(A. Michelangeli) International School for Advanced Studies – SISSA, via Bonomea 265, 34136 Trieste (Italy).
E-mail address: [email protected]

(P. Novati) Università degli Studi di Trieste, Piazzale Europa 1, 34127 Trieste (Italy).
E-mail address: