Krylov solvability under perturbations of abstract inverse linear problems
arXiv [math.FA]
NOÈ ANGELO CARUSO AND ALESSANDRO MICHELANGELI
Abstract.
When a solution to an abstract inverse linear problem on Hilbert space is approximable by finite linear combinations of vectors from the cyclic subspace associated with the datum and with the linear operator of the problem, the solution is said to be a Krylov solution, i.e., it belongs to the Krylov subspace of the problem. Krylov solvability of the inverse problem allows for solution approximations that, in applications, correspond to the very efficient and popular Krylov subspace methods. We study here the possible behaviours of persistence, gain, or loss of Krylov solvability under suitable small perturbations of the inverse problem. The underlying motivations are the stability or instability of Krylov methods under small noise or uncertainties, as well as the possibility to decide a priori whether an inverse problem is Krylov solvable by investigating a potentially easier, perturbed problem. We present a whole scenario of occurrences in the first part of the work. In the second, we exploit the weak gap metric induced, in the sense of Hausdorff distance, by the Hilbert weak topology, in order to conveniently monitor the distance between perturbed and unperturbed Krylov subspaces.

1. Introduction
The ubiquitous occurrence of linear phenomena that produce an output $g$ from an input $f$ according to a linear law $A$, so that from the exact or approximate measurement of $g$ one tries to recover exact or approximate information on $f$, gives rise to the following abstraction. The possible inputs and outputs form an abstract linear vector space $\mathcal{H}$ on which an action is performed by a linear operator $A$, and given a datum $g \in \mathrm{ran}\,A$ one searches for solution(s) to the inverse linear problem $Af = g$. In fact, a vast variety of phenomena are encompassed by such a mathematical generalisation when $\mathcal{H}$ is taken to be a (possibly infinite-dimensional) inner product and complete complex vector space, namely a complex Hilbert space, and $A$ is a closed linear operator acting on $\mathcal{H}$. The (somewhat minimal) requirement of operator closedness is aimed at having a non-trivial notion of spectrum of $A$, hence at allowing for the possible use of spectral methods in solving $Af = g$. The primary interest in such a setting is to obtain convenient approximations of $f$ in terms of approximants produced by certain algorithms.

Here various levels of abstraction are implemented: (a) the dimensionality of $\mathcal{H}$, finite or infinite, (b) the boundedness or unboundedness of $A$, (c) the spectral properties of $A$ (from a purely discrete spectrum to richer structures with continuous components, separated or not from zero, and so on). When $\dim\mathcal{H} < \infty$ the inverse problem involves finite matrices and is typically under very accurate control in all its aspects (algebraic, analytical, numerical, including in applications the control of the rate of convergence of approximants, etc.). To a lesser degree one has a theory of inverse linear problems governed by bounded operators on infinite-dimensional spaces: it consists of a more limited amount of sophisticated results, only few of which have a counterpart in the unbounded case. The literature is obviously enormous: we refer to that body of ideas and tools generally called iterative methods [29], Petrov-Galerkin methods [9, 28], generalised projection methods [6], Krylov projection methods [29, 24], in which, as said, the body of knowledge when $A$ is bounded on an infinite-dimensional $\mathcal{H}$ is certainly less systematic, let alone when $A$ itself is unbounded.

Date: March 1, 2021.
Key words and phrases. Inverse linear problems, infinite-dimensional Hilbert space, Krylov subspaces, Krylov solvability, cyclic operators, cyclic vectors, spectral theory, Hausdorff distance, subspace perturbations, weak topology, weak convergence.
This work is partially supported by the Alexander von Humboldt Foundation.

With a clear motivation from applications, in the abstract problem outlined above it is relevant to investigate when the solution $f$ admits a subspace of distinguished approximants in $\mathcal{H}$, explicitly constructed from $A$ and $g$ as finite linear combinations of $g, Ag, A^2g, A^3g, \dots$. That is, one introduces the ‘Krylov subspace’

$$\mathcal{K}(A,g) := \mathrm{span}\{A^k g \mid k \in \mathbb{N}_0\} \subset \mathcal{H}, \tag{1.1}$$

and inquires whether $f \in \overline{\mathcal{K}(A,g)}$ (the closure of $\mathcal{K}(A,g)$ in the norm topology of $\mathcal{H}$). When this is the case, the problem $Af = g$ is said to be ‘Krylov solvable’, and one refers to the solution(s) $f$ as ‘Krylov solution(s)’.

Let us stress that for unbounded $A$ the notion of Krylov subspace only makes sense if the datum $g$ is ‘$A$-smooth’, meaning $g \in C^\infty(A)$ with

$$C^\infty(A) := \bigcap_{N \in \mathbb{N}} \mathcal{D}(A^N), \tag{1.2}$$

where $\mathcal{D}(\cdot)$ is the notation for the operator domain (in applications where $A$ is a differential operator, $g \in C^\infty(A)$ is a regularity requirement); $A$-smoothness is automatic when $A$ is everywhere defined and bounded on $\mathcal{H}$.
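To make definition (1.1) concrete, here is a minimal finite-dimensional sketch; it is not taken from the paper. It assumes $\mathcal{H} = \mathbb{C}^d$, where $\mathcal{K}(A,g)$ is automatically closed and is spanned by $g, Ag, \dots, A^{d-1}g$, so that Krylov solvability amounts to $g \in A\,\mathcal{K}(A,g)$. The helper names (`matvec`, `rank`, `krylov_vectors`, `is_krylov_solvable`) and the two test matrices are hypothetical choices made for illustration only.

```python
from fractions import Fraction

def matvec(A, v):
    """Apply the square matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rank(vectors):
    """Rank of a list of vectors, by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                t = rows[i][c] / rows[r][c]
                rows[i] = [a - t * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def krylov_vectors(A, g):
    """Return [g, Ag, ..., A^(d-1) g], which spans K(A, g) when dim H = d."""
    vecs, v = [], list(g)
    for _ in range(len(g)):
        vecs.append(v)
        v = matvec(A, v)
    return vecs

def is_krylov_solvable(A, g):
    """True iff some f in K(A, g) satisfies A f = g, i.e. g lies in A K(A, g)."""
    images = [matvec(A, v) for v in krylov_vectors(A, g)]
    return rank(images + [g]) == rank(images)

# Assumed toy operator: truncated (nilpotent) right shift on C^4,
# e_1 -> e_2 -> e_3 -> e_4 -> 0, with datum g = e_2 = shift(e_1).
shift = [[0, 0, 0, 0],
         [1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0]]
e2 = [0, 1, 0, 0]
shift_dim = rank(krylov_vectors(shift, e2))
shift_solvable = is_krylov_solvable(shift, e2)

# Assumed toy operator: invertible diagonal matrix, datum with no zero entry.
diag = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
ones = [1, 1, 1, 1]
diag_dim = rank(krylov_vectors(diag, ones))
diag_solvable = is_krylov_solvable(diag, ones)

print(shift_dim, shift_solvable)  # prints: 3 False
print(diag_dim, diag_solvable)    # prints: 4 True
```

In the first case every solution of the truncated problem has a non-zero component along $e_1$, whereas the Krylov subspace is $\mathrm{span}\{e_2, e_3, e_4\}$, so no solution is a Krylov solution. In the second case the datum is cyclic and the question is trivial. Exact rational arithmetic is used so that the rank computations are not affected by rounding.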
Let us also observe that the occurrence $\overline{\mathcal{K}(A,g)} = \mathcal{H}$ (which makes the Krylov solvability question trivial) corresponds to the fact that $g$ is a cyclic vector for $A$. Notably, the set of cyclic vectors for a bounded operator is either empty or dense in $\mathcal{H}$ [10], and it is unknown whether there exists a bounded operator on a separable Hilbert space $\mathcal{H}$ such that every non-zero vector in $\mathcal{H}$ is cyclic. A prototypical mechanism for $\overline{\mathcal{K}(A,g)}$ to be only a proper closed subspace of $\mathcal{H}$ is provided by the right shift operator $R$ on $\ell^2(\mathbb{N})$ (defined as usual by $Re_n = e_{n+1}$ on the canonical basis): for instance, $\overline{\mathcal{K}(R,e_2)}$ is the orthogonal complement to the span of $e_1$. Another such mechanism is when $A$ is reduced with respect to the Hilbert space decomposition $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$ and $g \in \mathcal{H}_1$.

Again, it is no surprise that Krylov solvability is well understood in finite dimensions [29, 24], with instead only partial results for bounded $A$ in infinite dimensions [20, 7, 19, 26, 27, 36, 18, 5] or for unbounded $A$ [4, 3].

In the bounded case it is worth recalling a few facts we recently established in [5]. As customary, let us denote by $\mathcal{B}(\mathcal{H})$ the algebra of operators that are everywhere defined and bounded on $\mathcal{H}$, equipped with the usual operator norm $\|A\|_{\mathrm{op}}$. Thus, let now $A \in \mathcal{B}(\mathcal{H})$ and $g \in \mathrm{ran}\,A$.

(I) If $A$ is reduced with respect to the ‘Krylov decomposition’ $\mathcal{H} = \overline{\mathcal{K}(A,g)} \oplus \mathcal{K}(A,g)^\perp$ (for short, if $A$ is ‘$\mathcal{K}(A,g)$-Krylov-reducible’), then there exists a Krylov solution to $Af = g$.

(II) If the ‘Krylov intersection’ subspace $\mathcal{I}(A,g) := \overline{\mathcal{K}(A,g)} \cap A\big(\mathcal{K}(A,g)^\perp\big)$ is the trivial set $\{0\}$, then there exists a Krylov solution to $Af = g$.

Thus, (I) and (II) provide mechanisms for Krylov solvability, the second being in fact more general (Krylov reducibility always implies triviality of the Krylov intersection, but not the other way around, in general).
(II) is in a sense the intrinsic mechanism under the additional condition that $A$ has everywhere defined and bounded inverse, for in this case Krylov solvability is equivalent to the triviality of $\mathcal{I}(A,g)$. Furthermore:

(III) If $A$ is normal (or, more generally, if $\ker A \subset \ker A^*$), then the Krylov solution to $Af = g$, if existing, is unique.

(IV) If $A$ is self-adjoint, then the problem $Af = g$ admits a unique Krylov solution $f$.

Property (IV), in the particular case when $A$ is positive definite, was previously established by Nemirovskiy and Polyak [26], by showing that the sequence of Krylov approximants obtained by the conjugate gradient algorithm converges strongly to the exact solution. Further examples and classes of operators in $\mathcal{B}(\mathcal{H})$ giving rise to Krylov solvable problems were discussed in [5].

To come to the object of the present work, let us stress that the picture outlined so far concerns inverse linear problems where both the datum $g$ and the linear operator $A$ are known exactly. Our analysis here is abstract and operator-theoretic in nature, but with reference to the initial motivation, it is as if the law $A$ is precisely understood and the output $g$ is measured with full precision.
A more general perspective is to allow for some amount of uncertainty affecting the knowledge of $A$, or $g$, or both. Or, from another point of view, instead of only focusing on the inverse problem of interest, one may consider also an auxiliary, possibly more tractable problem, close in some sense to the original one, which allows for useful approximate information.

Thus, in this work we consider perturbations of the original problem $Af = g$ of the form $A'f' = g'$, where $A$ and $A'$, as well as $g$ and $g'$, are close in a controlled sense, and we study the effect of the perturbation on the Krylov solvability.

This context is clearly connected with the general framework of “ill-posed” inverse linear problems [14, 15], where only the perturbed quantities $A'$ or $g'$ are accessible, due for instance to measurement errors, and ill-posedness manifests for instance through the fact that $g' \notin \mathrm{ran}\,A$, the goal being to approximate the actual solution $f$ in a controlled sense.

Yet, the questions that we intend to address have a different spirit. We keep regarding $A$ and $g$ as exactly known or, in principle, exactly accessible, but with the idea that close to the problem $Af = g$ there is a perturbed problem $A'f' = g'$ that serves as an auxiliary one, possibly more easily tractable, say, with Krylov subspace methods, in order to obtain conclusions on the Krylov solvability of the original problem. Or, conversely, we inquire under which conditions the nice property of Krylov solvability for $Af = g$ is stable enough to survive a small perturbation (that in applications could arise, again, from experimental or numerical uncertainties), or when instead Krylov solvability is washed out by even small inaccuracies in the precise knowledge of $A$ or $g$, an occurrence in which Krylov subspace methods would prove to be unstable.
And, more abstractly, we pose the question of a convenient notion of vicinity between the subspaces $\mathcal{K}(A,g)$ and $\mathcal{K}(A',g')$ when $A$ and $A'$ (respectively, $g$ and $g'$) are suitably close.

In Section 2 we elaborate more diffusely on this body of questions and their conceptual relevance in the present abstract setting.

There exists a large amount of literature that accounts for perturbations in Krylov subspace methods, most of which approaches the problem from the point of view of inexact Krylov methods (see, e.g., [38, 35, 33, 32, 31, 37, 8]). A good outline of the theory of inexact Krylov methods may be found in [32], and see in particular [38] for a general setting of Krylov algorithms under perturbations. The idea underlying inexact Krylov methods is that the exact (typically non-singular) inverse linear problem in $\mathbb{C}^N$, $Af = g$, is perturbed in $A$ by a series of linear operators $(E_k)_{k=1}^N$ on $\mathbb{C}^N$ that may change at each step $k$ of the algorithm. Typical scenarios that could induce such perturbations at each step of the algorithm include, but are not limited to, truncation and rounding errors in finite precision machines, or approximation errors from calculating complicated matrix-vector products. The main results reported in [38, 35, 33, 32, 31, 37, 8] include the convergence behaviour of the error and residual terms, in particular their rates, and typically bounds on how far these indicators of convergence are from the unperturbed setting at a given iteration number $k$.

Yet, these investigations are of a different nature than what we propose here. To begin with, the typical analysis of inexact Krylov methods is in the finite-dimensional setting, where we already have a good control over the Krylov solvability of inverse problems as well as the rates of convergence to a solution.
Furthermore, this literature does not discuss how the underlying Krylov subspaces themselves change, nor the richer phenomena pertaining to the Krylov solvability (or lack thereof) of the underlying problem and its perturbations, which are some of the very questions we are interested in investigating.

The point is that, to our knowledge, this line of investigation is so far essentially uncharted. In this spirit, and in view of the set of general questions outlined in Section 2, in Section 3 we present an overview of typical phenomena that may occur to the Krylov solvability of an inverse linear problem $Af = g$ in terms of the Krylov solvability, or lack thereof, of auxiliary inverse problems where $A$ or $g$ or both are perturbed in a controlled sense. Such a survey indicates that the sole control of the operator or of the data perturbation, in the respective operator and Hilbert norm, still leaves the possibility open to all phenomena such as the persistence, gain, or loss of Krylov solvability in the limit $A_n \to A$ or $g_n \to g$, where $A_n$ (and likewise $g_n$) is the generic element of a sequence of perturbed objects. The implicit explanation is that information such as $A_n \to A$ or $g_n \to g$ is not enough to account for a suitable vicinity of the corresponding Krylov subspaces. We discussed in points (I) and (II) above that the Krylov solvability of the inverse problem $Af = g$ corresponds to certain structural properties of the subspace $\mathcal{K}(A,g)$, therefore one implicitly needs to monitor how the latter properties are preserved or altered under the perturbation.
This also suggests that the additional constraint of performing the perturbation within certain subclasses of operators may supply further information on Krylov solvability. This is in principle a vast programme; in Section 4 we focus on the operators of $\mathscr{K}$-class that we previously considered in [5], and discuss the robustness and fragility of this class from the perturbative perspective of the induced inverse problems.

In the second part of this work, Sections 5-7, we address more systematically the issue of vicinity of Krylov subspaces in a sense that is informative for the Krylov solvability of the corresponding inverse problems. What shows encouraging properties, next to some serious limitations, is the comparison of (the closures of) two Krylov subspaces in terms of the Hausdorff distance between the respective unit balls, considered as closed subsets of the Hilbert unit ball when the latter is metrised with respect to the weak Hilbert topology. (Had we used the norm topology, that would not even have controlled the very intuitive convergence of the finite-dimensional Krylov subspaces, namely those with iterates up to some $A^N g$, to their infinite-dimensional counterpart, as $N \to \infty$.) This framework leads to appealing approximation results, such as the inner approximability of Krylov subspaces established in Subsect. 7.2. Right after, Proposition 7.7 is a prototype of the kind of perturbative results we had originally in mind, namely a control of the perturbation, formulated in terms of the perturbed and unperturbed Krylov subspaces, that predicts the persistence of Krylov solvability when the perturbation is removed.
In this spirit we rather intended (and in the above sense managed) to open a perspective on a general problem, essentially not addressed so far, that is operator-theoretic in nature, yet with direct motivations from Krylov approximation algorithms in numerical computation. The corpus of partial results that we present here only scratches the surface of a problem that, in our intentions, needs to be further investigated. We shall collect more explicit conclusions in this sense in the final Section 8.
Notation.
Besides further notation that will be declared in due time, we shall keep the following convention. $\mathcal{H}$ denotes a complex Hilbert space with norm $\|\cdot\|$ and scalar product $\langle\cdot,\cdot\rangle$, anti-linear in the first entry and linear in the second. Norm and weak convergence in $\mathcal{H}$ are denoted, as usual, with $x_n \to x$ and $x_n \rightharpoonup x$. $\mathcal{B}(\mathcal{H})$ is the complete normed $*$-algebra of everywhere defined and bounded linear operators on $\mathcal{H}$, equipped with the customary operator norm $\|\cdot\|_{\mathrm{op}}$. We shall often omit the adjective ‘linear’ with reference to operators. $\mathbb{1}$ and $\mathbb{O}$ denote, respectively, the identity and the zero operator. $\sigma(A)$ denotes the spectrum of some (closed) linear operator $A$ on $\mathcal{H}$. $\overline{\mathcal{V}}$ is the norm closure of the span of the vectors in $\mathcal{V}$ when $\mathcal{V}$ is a subset of $\mathcal{H}$, and $\mathcal{V}^\perp$ is the largest closed subspace of $\mathcal{H}$ whose vectors are orthogonal to all elements of the subset $\mathcal{V} \subset \mathcal{H}$. For $\psi, \varphi \in \mathcal{H}$, by $|\psi\rangle\langle\psi|$ and $|\psi\rangle\langle\varphi|$ we shall denote the $\mathcal{H} \to \mathcal{H}$ rank-one maps acting respectively as $f \mapsto \langle\psi, f\rangle\psi$ and $f \mapsto \langle\varphi, f\rangle\psi$ on generic $f \in \mathcal{H}$.

2. Krylov solvability from a perturbative perspective
As argued already in the Introduction, the question of the effects of perturbations on the Krylov solvability of an infinite-dimensional inverse linear problem is essentially new.

One easily realises that such a question takes a multitude of related, yet somewhat different formulations depending on the precise perspective from which one looks at it. Given the essential novelty of this line of investigation, we find it instructive to organise the most relevant of such queries into a coherent scheme, which is the goal of this Section. This serves both as a reference for the results and explicit partial answers that we give in this work, and as an ideal road map for future studies.

In practice, let us discuss the following main categories of connected problems. For the first three of them, we work in the bounded case, with $\mathcal{H}$ being the underlying complex infinite-dimensional Hilbert space.

I. Comparison between “close” Krylov subspaces.
This is the abstract problem of providing a meaningful comparison between $\overline{\mathcal{K}(A,g)}$ and $\overline{\mathcal{K}(A',g')}$, as two closed subspaces of $\mathcal{H}$, for given $A, A' \in \mathcal{B}(\mathcal{H})$ and $g, g' \in \mathcal{H}$ such that in some convenient sense $A$ and $A'$, as well as $g$ and $g'$, are close. As a priori such subspaces might only have a trivial intersection, the framework is rather that of comparison of subspaces of a normed space, in practice introducing convenient topologies or metric distances.

A more application-oriented version of the same problem is the following. Given $A \in \mathcal{B}(\mathcal{H})$ and $g \in \mathcal{H}$, one considers approximants of one or the other (or both), say, sequences $(A_n)_{n\in\mathbb{N}}$ and $(g_n)_{n\in\mathbb{N}}$ respectively in $\mathcal{B}(\mathcal{H})$ and $\mathcal{H}$, such that $\|A_n - A\|_{\mathrm{op}} \to 0$ and $\|g_n - g\| \to 0$ as $n \to \infty$. Then the question is whether a meaningful notion of limit $\mathcal{K}(A_n, g_n) \to \mathcal{K}(A,g)$ can be defined.
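The difficulty behind this question can already be seen in a finite-dimensional caricature, sketched below under assumptions that are ours and not the paper's: $\mathcal{H} = \mathbb{C}^3$, a diagonal matrix `D` with a perturbed datum, and a rank-one projection `P` with a perturbed operator `P_eps`. The helper names are hypothetical. In both cases the perturbation is arbitrarily small in norm, yet the dimension of the Krylov subspace jumps, so closeness of $(A_n, g_n)$ to $(A, g)$ alone does not force any naive closeness of the subspaces.

```python
from fractions import Fraction

def matvec(A, v):
    """Apply the square matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rank(vectors):
    """Rank of a list of vectors, by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                t = rows[i][c] / rows[r][c]
                rows[i] = [a - t * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def krylov_dimension(A, g):
    """dim K(A, g) in C^d, computed from g, Ag, ..., A^(d-1) g."""
    vecs, v = [], list(g)
    for _ in range(len(g)):
        vecs.append(v)
        v = matvec(A, v)
    return rank(vecs)

scales = (10, 1000, 10**6)   # perturbation size is 1/n for n in scales

# Data perturbation (assumed toy case): D fixed, g = e_1 versus
# g_eps = e_1 + eps (e_2 + e_3).
D = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
dim_exact_datum = krylov_dimension(D, [1, 0, 0])
dims_perturbed_datum = [
    krylov_dimension(D, [1, Fraction(1, n), Fraction(1, n)]) for n in scales
]

# Operator perturbation (assumed toy case): P = |e_2><e_2| versus
# P_eps = P + eps * (truncated right shift), with datum g = e_2.
P = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
dim_exact_operator = krylov_dimension(P, [0, 1, 0])
dims_perturbed_operator = []
for n in scales:
    eps = Fraction(1, n)
    P_eps = [[0, 0, 0], [eps, 1, 0], [0, eps, 0]]
    dims_perturbed_operator.append(krylov_dimension(P_eps, [0, 1, 0]))

print(dim_exact_datum, dims_perturbed_datum)        # prints: 1 [3, 3, 3]
print(dim_exact_operator, dims_perturbed_operator)  # prints: 1 [2, 2, 2]
```

The dimension is of course a very crude invariant. The sketch is only meant to suggest why a dedicated topology or metric on subspaces is needed before one can speak of a limit of Krylov subspaces.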
II. Perturbations preserving/creating Krylov solvability.
This question is inspired by the possibility that, given $A \in \mathcal{B}(\mathcal{H})$ and $g \in \mathrm{ran}\,A$, instead of solving the “difficult” inverse problem $Af = g$ one solves a convenient perturbed problem $A'f' = g'$, with $A' \in \mathcal{B}(\mathcal{H})$ and $g' \in \mathrm{ran}\,A'$ close respectively to $A$ and $g$, which is “easily” Krylov solvable, and the Krylov solution of which provides approximate information on the original solution $f$.
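As a purely numerical illustration of how a perturbed problem can carry approximate information on $f$, the sketch below compares $f = A^{-1}g$ and $f' = A'^{-1}g$ for a $2\times 2$ invertible matrix and checks the estimate $\|f - f'\| \leq 2\|g\|\,\|A^{-1}\|_{\mathrm{op}}^2\,\|A' - A\|_{\mathrm{op}}$, valid when $\|A - A'\|_{\mathrm{op}} \leq (2\|A^{-1}\|_{\mathrm{op}})^{-1}$ (this is the bound of Theorem 4.1(iii) in Section 4). The matrices `A`, `E`, the datum `g`, and all helper names are assumptions chosen for illustration. In finite dimension with invertible operators both problems are automatically Krylov solvable, so only the quantitative part is being probed here.

```python
import math

def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    """Apply a 2x2 matrix to a vector of length 2."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def op_norm_2x2(M):
    """Operator (spectral) norm: square root of the top eigenvalue of M^T M."""
    (a, b), (c, d) = M
    p = a * a + c * c          # entry (1,1) of M^T M
    q = a * b + c * d          # entry (1,2) of M^T M
    s = b * b + d * d          # entry (2,2) of M^T M
    top = (p + s) / 2 + math.sqrt(((p - s) / 2) ** 2 + q * q)
    return math.sqrt(top)

# Assumed data: a positive definite A, a small perturbation E, a datum g.
A = [[3.0, 1.0], [1.0, 2.0]]
E = [[0.05, -0.02], [0.03, 0.04]]
A_pert = [[A[i][j] + E[i][j] for j in range(2)] for i in range(2)]
g = [1.0, -1.0]

f = matvec(inverse_2x2(A), g)            # solution of A f = g
f_pert = matvec(inverse_2x2(A_pert), g)  # solution of A' f' = g

norm_inv = op_norm_2x2(inverse_2x2(A))   # ||A^{-1}||_op
size = op_norm_2x2(E)                    # ||A' - A||_op
threshold = 1 / (2 * norm_inv)           # smallness condition on the perturbation
error = math.hypot(f[0] - f_pert[0], f[1] - f_pert[1])
bound = 2 * math.hypot(*g) * norm_inv ** 2 * size

print(size <= threshold, error <= bound)  # prints: True True
```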
Here is an explicit set-up for this question. Assume that one finds $(A_n)_{n\in\mathbb{N}}$ in $\mathcal{B}(\mathcal{H})$ and $(g_n)_{n\in\mathbb{N}}$ in $\mathcal{H}$ such that the inverse problems $A_n f_n = g_n$ are all Krylov solvable and $\|A_n - A\|_{\mathrm{op}} \to 0$ and $\|g_n - g\| \to 0$ as $n \to \infty$. Is $Af = g$ Krylov solvable too? And if at each perturbed level $n$ there is a unique Krylov solution $f_n$, does one have $\|f_n - f\| \to 0$, where $f$ is a (Krylov) solution to $Af = g$?

One scenario of applications is that for $A_n f_n = g_n$ Krylov solvability comes with a much easier (say, faster) solution algorithm, so that $f$ is rather determined as $f = \lim_{n\to\infty} f_n$ instead of directly approaching the problem $Af = g$.

Another equally relevant scenario is that the possible Krylov solvability of the problem of interest $Af = g$ is initially unknown, and prior to launching resource-consuming Krylov algorithms for solving the problem, one wants to be guaranteed that a Krylov solution indeed exists. To this aim one checks the Krylov solvability for $A_n f_n = g_n$ uniformly in $n$, and the convergence $A_n \to A$, $g_n \to g$, thus coming to an affirmative answer.

III. Perturbations destroying Krylov solvability.
The opposite occurrence has to be monitored as well, namely the possibility that a small perturbation of the Krylov solvable problem $Af = g$ produces a non-Krylov solvable problem $A'f' = g'$. Say, if in the above setting none of the problems $A_n f_n = g_n$ are Krylov solvable and yet $A_n \to A$ and $g_n \to g$, under what conditions does one gain Krylov solvability in the limit for the problem $Af = g$? A comprehension of this phenomenon would be of great relevance to identify those circumstances when Krylov methods are intrinsically unstable, in the sense that even a tiny uncertainty in the knowledge of $A$ and/or $g$ leads to a perturbed problem $A'f' = g'$ for which, unlike the exact problem of interest $Af = g$, Krylov methods are not applicable.

In the unbounded case, more precisely when the operator $A$ is closed and unbounded on $\mathcal{H}$, in principle all the above questions have their own counterpart, except that the fundamental condition $g \in C^\infty(A)$ required to have a meaningful notion of $\mathcal{K}(A,g)$ is highly unstable under perturbations, and one has to ensure case by case that certain problems are well posed.

Yet, for its evident relevance let us highlight the following additional class of questions.
IV. Perturbations-regularisations exploiting Krylov solvability.
For the problem of interest $Af = g$ one might well have $g \in \mathrm{ran}\,A$ but $g \notin C^\infty(A)$. In this case Krylov methods are not applicable: there is no actual notion of Krylov subspace associated to $A$ and $g$, hence no actual Krylov approximants to utilize iteratively. Assume though that one finds a sequence $(g_n)_{n\in\mathbb{N}}$ entirely in $\mathrm{ran}\,A \cap C^\infty(A)$ with $\|g_n - g\| \to 0$. This occurrence is quite typical: if $A$ is a differential operator on $L^2(\mathbb{R}^d)$, everyone is familiar with sequences $(g_n)_{n\in\mathbb{N}}$ of functions that all have high regularity, uniformly in $n$, and for which the $L^2$-limit $g_n \to g$ produces a rough function $g$. Assume further that each problem $Af_n = g_n$ is Krylov solvable. For example, as we showed in [3, Theorem 4.1], for a vast class of self-adjoint or skew-adjoint $A$'s, possibly unbounded, there exists a unique solution $f_n \in \overline{\mathcal{K}(A, g_n)}$. This brings the following questions. First, do the $f_n$'s have a limit $f$ and does $f$ solve $Af = g$? And, more abstractly speaking, is there a meaningful notion of the limit $\lim_{n\to\infty} \mathcal{K}(A, g_n)$, irrespective of the approximant sequence $(g_n)_{n\in\mathbb{N}}$, that could then be interpreted as a replacement for the non-existing Krylov subspace associated to $A$ and $g$? The elements of such a limit subspace would provide exploitable approximants for the solution to the original problem $Af = g$.

One last remark concerns the topologies underlying all the questions above. We explicitly formulated them in terms of the operator norm and Hilbert norm, but alternatively there is a variety of weaker notions of convergence that are still highly informative for the solution to the considered inverse problem; we discussed this point extensively in [6]. Thus, the “weaker” counterpart of the above questions represents equally challenging and potentially useful problems to address.

3. Gain or loss of Krylov solvability under perturbations
This Section is meant to present examples of different behaviours that may occur in those cases belonging to the categories II and III contemplated in the previous Section. In practice we are comparing here the “unperturbed” inverse linear problem $Af = g$ with “perturbed” problems of the form $Af_n = g_n$, or $A_n f_n = g$, along a sequence $(g_n)_{n\in\mathbb{N}}$ such that $\|g_n - g\| \to 0$, or along a sequence $(A_n)_{n\in\mathbb{N}}$ such that $\|A_n - A\|_{\mathrm{op}} \to 0$. Our particular focus is the Krylov solvability, namely its preservation, or gain, or loss in the limit $n \to \infty$. Next to operator perturbations ($A_n \to A$) and data perturbations ($g_n \to g$), it is also natural to consider simultaneous perturbations of both of them.

The purpose here is two-fold: we want to convey a concrete flavour of how inverse problems behave under controlled perturbations of the operator or of the datum, as far as having Krylov solutions is concerned, and we also want to highlight the emerging, fundamental, and a priori unexpected lesson. In short, the lesson is the following: the sole control that $A_n \to A$ or $g_n \to g$ is not enough to predict whether Krylov solvability is preserved, or gained, or lost in the limit, i.e., each such behaviour can actually occur. The immediate corollary of this conclusion is: one must describe the perturbation of the problem $Af = g$ by means of additional information, say, by restricting to particular sub-classes of inverse problems, or by introducing suitable notions of vicinity of Krylov subspaces, in order to control the effect of the perturbation on Krylov solvability. It is this latter consideration that motivates the more specific discussion of Section 4 and of Sections 5-7.

For the examples that follow we shall choose concrete playgrounds that allow for the (in general non-trivial) explicit identification of the Krylov subspace.

• As typical cases of Krylov solvable inverse problems we should have in mind, for instance, self-adjoint operators $A \in \mathcal{B}(\mathcal{H})$ with $g \in \mathrm{ran}\,A$ ([5, Corollary 3.11]), or the Volterra operator $V$ on $\mathcal{H} = L^2[0,1]$, defined by $(Vf)(x) := \int_0^x f(y)\,\mathrm{d}y$ (in particular, $x^k \mapsto \frac{1}{k+1}x^{k+1}$ for any $k \in \mathbb{N}$), for which we know that $\overline{\mathcal{K}(V,g)} = \mathcal{H}$ for any monomial $g = x^k$ ([5, Example 3.1]). One may find other possibilities discussed in our work [5].
• Instead, as a typical source of lack of Krylov solvability ([5, Appendix A]) we use the right-shift operator on $\ell^2$-spaces: the basic version is on $\mathcal{H} = \ell^2(\mathbb{N})$, with canonical orthonormal basis $(e_k)_{k\in\mathbb{N}}$, where the right-shift is $R = \sum_{k\in\mathbb{N}} |e_{k+1}\rangle\langle e_k|$ (the sum converging strongly in the operator sense, $\|R\|_{\mathrm{op}} = 1$, $Re_k = e_{k+1}$). Other variants are the right shift on $\ell^2(\mathbb{Z})$, or the compact counterpart $R = \sum_k \lambda_{|k|}\,|e_{k+1}\rangle\langle e_k|$ with weights $\lambda_k > \lambda_{k+1} > 0$, $\lambda_k \xrightarrow{k\to\infty} 0$. $R$ admits both a dense set of non-cyclic vectors and a dense set of cyclic vectors (see, e.g., [17, 30]).

3.1. Operator perturbations.

Example 3.1.
Let $R$ be the weighted (compact) right-shift operator

$$R := \sum_{k=1}^{\infty} \frac{1}{k^2}\,|e_{k+1}\rangle\langle e_k|$$

on the Hilbert space $\ell^2(\mathbb{N})$, and define

$$R_n := \sum_{k=1}^{n-1} \frac{1}{k^2}\,|e_{k+1}\rangle\langle e_k| + \frac{1}{n^2}\,|e_1\rangle\langle e_n|, \qquad n \in \mathbb{N},\ n \geq 2.$$

As

$$R - R_n = \sum_{k=n}^{\infty} \frac{1}{k^2}\,|e_{k+1}\rangle\langle e_k| - \frac{1}{n^2}\,|e_1\rangle\langle e_n|, \qquad \|R - R_n\|_{\mathrm{op}} \leq \sum_{k=n}^{\infty} \frac{1}{k^2} + \frac{1}{n^2},$$

then $R_n \to R$ in operator norm as $n \to \infty$. For $g := e_2$ and any $n \geq 2$, the inverse problem induced by $R_n$ and with datum $g$ has unique solution $f_n := f := e_1$, and so does the inverse problem induced by $R$ and with the same datum, i.e., $R_n f_n = g$ and $Rf = g$. On the other hand,

$$\mathcal{K}(R_n, g) = \overline{\mathcal{K}(R_n, g)} = \mathrm{span}\{e_1, \dots, e_n\}, \qquad \overline{\mathcal{K}(R, g)} = \{e_1\}^\perp.$$

Thus, the inverse problem $R_n f_n = g$ is Krylov solvable, and obviously $f_n \to f$ in norm, yet the inverse problem $Rf = g$ is not.

Example 3.2.
Let $R = \sum_{k=1}^{\infty} |e_{k+1}\rangle\langle e_k|$ be the right-shift operator on the Hilbert space $\ell^2(\mathbb{N})$, and let

$$A_n := |e_2\rangle\langle e_2| + \frac{1}{n} R, \quad n \in \mathbb{N}, \qquad A := |e_2\rangle\langle e_2|, \qquad g := e_2.$$

Clearly, $A_n \to A$ in operator norm as $n \to \infty$. The inverse problem $A_n f_n = g$ has unique solution $f_n = n e_1$; as $(A_n)^k g = \sum_{j=0}^{k} n^{-j} e_{j+2}$ for any $k \in \mathbb{N}$, and therefore $\overline{\mathcal{K}(A_n, g)} = \{e_1\}^\perp$, such a solution is not a Krylov solution. Instead, passing to the limit, the inverse problem $Af = g$ has solution $f = e_2$ (unique modulo $\ker A$), which is a Krylov solution since $\mathcal{K}(A, g) = \mathrm{span}\{e_2\}$. Observe also that $f_n$ does not converge to $f$.

3.2. Data perturbations.

Example 3.3.
Let $R : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ be the usual right-shift operator. $R$ is unitary, with $R^* = R^{-1} = L$, the left-shift operator. Moreover $R$ admits a dense subset $\mathcal{C} \subset \ell^2(\mathbb{Z})$ of cyclic vectors and a dense subset $\mathcal{N} \subset \ell^2(\mathbb{Z})$, consisting of all finite linear combinations of canonical basis vectors, such that for $g \in \mathcal{N}$ the solution $f$ to the inverse problem $Rf = g$ does not belong to $\overline{\mathcal{K}(R, g)}$. All vectors in $\mathcal{N}$ are non-cyclic for $R$.

(i) (Loss of Krylov solvability.) For a datum $g \in \mathcal{N}$, the inverse problem $Rf = g$ admits a unique solution $f$, and $f$ is not a Krylov solution. Yet, by density, there exists a sequence $(g_n)_{n\in\mathbb{N}}$ in $\mathcal{C}$ with $g_n \to g$ (in $\ell^2$-norm) as $n \to \infty$, and each perturbed inverse problem $Rf_n = g_n$ is Krylov solvable with unique solution $f_n = R^{-1}g_n = Lg_n$, and with $f_n \to f$ as $n \to \infty$. Krylov solvability is lost in the limit, still with the approximant Krylov solutions converging to the solution $f$ to the original problem.

(ii) (Gain of Krylov solvability.) For a datum $g \in \mathcal{C}$, the inverse problem $Rf = g$ is obviously Krylov solvable, as $\overline{\mathcal{K}(R, g)} = \ell^2(\mathbb{Z})$, owing to the cyclicity of $g$. Yet, by density, there exists a sequence $(g_n)_{n\in\mathbb{N}}$ in $\mathcal{N}$ with $g_n \to g$, and each perturbed inverse problem $Rf_n = g_n$ is not Krylov solvable. Krylov solvability is absent along the perturbations and only emerges in the limit, still with the solution approximation $f_n \to f$.

Example 3.4.
With respect to the Hilbert space orthogonal sum $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, let $A = A^{(1)} \oplus A^{(2)}$ with $A^{(j)} \in \mathcal{B}(\mathcal{H}_j)$, and $g^{(j)} \in \mathrm{ran}\,A^{(j)}$, $j \in \{1, 2\}$, such that the problem $A^{(1)} f^{(1)} = g^{(1)}$ is Krylov solvable in $\mathcal{H}_1$, with Krylov solution $f^{(1)}$ (for instance, $\mathcal{H}_1 = L^2[0,1]$, $A^{(1)} = V$, the Volterra operator, $g^{(1)} = x$, $f^{(1)} = \mathbf{1}$), and the problem $A^{(2)} f^{(2)} = g^{(2)}$ is not Krylov solvable in $\mathcal{H}_2$ (for instance, $\mathcal{H}_2 = \ell^2(\mathbb{N})$, $A^{(2)} = R$, the right shift, $g^{(2)} = e_2$, $f^{(2)} = e_1$).

(i) (Lack of Krylov solvability persists in the limit.) The inverse problems $Af_n = g_n$, $n \in \mathbb{N}$, with $g_n := \big(\tfrac{1}{n} g^{(1)}\big) \oplus g^{(2)}$, are all non-Krylov solvable, with solution(s) $f_n = \big(\tfrac{1}{n} f^{(1)}\big) \oplus f^{(2)}$. In the limit, $g_n \to g := 0 \oplus g^{(2)}$ in $\mathcal{H}$, whence also $f_n \to 0 \oplus f^{(2)} =: f$. The inverse problem $Af = g$ has solution $f$ (modulo $\ker A^{(1)} \oplus \{0\}$), but $f$ is not a Krylov solution.

(ii) (Krylov solvability emerges in the limit.) The inverse problems $Af_n = g_n$, $n \in \mathbb{N}$, with $g_n := g^{(1)} \oplus \big(\tfrac{1}{n} g^{(2)}\big)$, are all non-Krylov solvable, with solution(s) $f_n = f^{(1)} \oplus \big(\tfrac{1}{n} f^{(2)}\big)$. In the limit, $g_n \to g := g^{(1)} \oplus 0$ in $\mathcal{H}$, whence also $f_n \to f^{(1)} \oplus 0 =: f$. The inverse problem $Af = g$ has solution $f$ (modulo $\{0\} \oplus \ker A^{(2)}$), and $f$ is a Krylov solution.

3.3. Simultaneous perturbations of operator and data.

Example 3.5.
Same setting as in Example 3.4.

(i) (Lack of Krylov solvability persists in the limit.) The inverse problems $A_n f_n = g_n$, $n \in \mathbb{N}$, with $A_n := \big(\tfrac{1}{n} A^{(1)}\big) \oplus A^{(2)}$ and $g_n := \big(\tfrac{1}{n} g^{(1)}\big) \oplus g^{(2)}$, are all non-Krylov solvable, with solutions $f_n = f^{(1)} \oplus f^{(2)}$. In the limit, $A_n \to A := \mathbb{O} \oplus A^{(2)}$ in operator norm and $g_n \to g := 0 \oplus g^{(2)}$ in $\mathcal{H}$. The inverse problem $Af = g$ has solution $f := 0 \oplus f^{(2)}$ (modulo $\ker A^{(1)} \oplus \{0\}$), which is not a Krylov solution. Moreover in general $f_n$ does not converge to $f$.

(ii) (Krylov solvability emerges in the limit.) The inverse problems $A_n f_n = g_n$, $n \in \mathbb{N}$, with $A_n := A^{(1)} \oplus \big(\tfrac{1}{n} A^{(2)}\big)$ and $g_n := g^{(1)} \oplus \big(\tfrac{1}{n} g^{(2)}\big)$, are all non-Krylov solvable, with solutions $f_n = f^{(1)} \oplus f^{(2)}$. In the limit, $A_n \to A := A^{(1)} \oplus \mathbb{O}$ in operator norm and $g_n \to g := g^{(1)} \oplus 0$ in $\mathcal{H}$. The inverse problem $Af = g$ has solution $f := f^{(1)} \oplus 0$ (modulo $\{0\} \oplus \ker A^{(2)}$), which is a Krylov solution. Moreover in general $f_n$ does not converge to $f$.

4. Krylov solvability along perturbations of $\mathscr{K}$-class

In a previous work [5] we singled out a class of operators in $\mathcal{B}(\mathcal{H})$ that in retrospect display relevant behaviour as far as Krylov solvability along perturbations is concerned. In this Section we elaborate further on that class, in view of the scheme of general questions presented in Sect. 2.

By definition, a linear operator $A$ acting on a complex Hilbert space $\mathcal{H}$ is in the $\mathscr{K}$-class when $A$ is everywhere defined and bounded, and there exists a bounded open $W \subset \mathbb{C}$ containing the spectrum $\sigma(A)$ and such that $0 \notin W$ and $\mathbb{C} \setminus W$ is connected. In particular, a $\mathscr{K}$-class operator has everywhere defined bounded inverse.

We have this result.

Theorem 4.1.
Let $A$ be a $\mathscr{K}$-class operator on a complex Hilbert space $\mathcal{H}$.

(i) For every $g \in \mathcal{H}$ the inverse problem $Af = g$ is Krylov solvable, with unique solution $f = A^{-1}g$.

(ii) The $\mathscr{K}$-class is open in $\mathcal{B}(\mathcal{H})$. In particular, there is $\varepsilon_A > 0$ such that for any other operator $A' \in \mathcal{B}(\mathcal{H})$ with $\|A' - A\|_{\mathrm{op}} < \varepsilon_A$ the inverse problem $A'f' = g$ has a unique solution, $f' = A'^{-1}g$, which is also a Krylov solution.

(iii) When in addition (and without loss of generality) $\|A - A'\|_{\mathrm{op}} \leq (2\|A^{-1}\|_{\mathrm{op}})^{-1}$, then $f$ and $f'$ from (i) and (ii) satisfy

$$\|f - f'\| \leq 2\,\|g\|\,\|A^{-1}\|_{\mathrm{op}}^2\,\|A' - A\|_{\mathrm{op}}.$$

Theorem 4.1 addresses questions of type II from the general scheme of Section 2: it provides a framework where Krylov solvability is preserved under perturbations of the linear operator inducing the inverse problem. Indeed, an obvious consequence of Theorem 4.1 is: if a sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathcal{B}(\mathcal{H})$ satisfies $A_n \to A$ in operator norm for some $\mathscr{K}$-class operator $A$, then eventually in $n$ the $A_n$'s are all of $\mathscr{K}$-class, the associated inverse problems $A_n f_n = g$ are Krylov solvable with unique solution $f_n = A_n^{-1}g$, and moreover $f_n \to f$ in $\mathcal{H}$, where $f = A^{-1}g$ is the unique (Krylov) solution to $Af = g$.

Proof of Theorem 4.1.
As we proved in [5, Prop. 3.15] for all $\mathscr{K}$-class operators, there exists a polynomial sequence $(p_n)_{n\in\mathbb{N}}$, consisting of polynomials in the variable $z\in\mathbb{C}$, such that $\|p_n(A)-A^{-1}\|_{\mathrm{op}}\to 0$ as $n\to\infty$. Thus, the unique solution $f$ to $Af=g$ satisfies
\[ \|f-p_n(A)g\| = \|A^{-1}g-p_n(A)g\| \leq \|g\|\,\|p_n(A)-A^{-1}\|_{\mathrm{op}} \xrightarrow{\,n\to\infty\,} 0, \]
meaning that $f\in\overline{\mathcal{K}(A,g)}$. This proves part (i).

Concerning (ii), we use the fact that $\sigma(A)$ is an upper semi-continuous function of $A\in\mathcal{B}(\mathcal{H})$ (see, e.g., [13, Problem 103] and [21, Theorem IV.3.1 and Remark IV.3.3]), meaning that for every bounded open set $\Omega\subset\mathbb{C}$ with $\sigma(A)\subset\Omega$ there exists $\varepsilon_A>0$ such that, if $A'\in\mathcal{B}(\mathcal{H})$ with $\|A'-A\|_{\mathrm{op}}<\varepsilon_A$, then $\sigma(A')\subset\Omega$. Applying this to $\Omega=W$ we deduce that any such $A'$ is again of $\mathscr{K}$-class. The remaining part of the thesis then follows from (i).

As for (iii), clearly
\[ \|f-f'\| \leq \|g\|\,\|A^{-1}-A'^{-1}\|_{\mathrm{op}} = \|g\|\,\|A'^{-1}(A-A')A^{-1}\|_{\mathrm{op}} \leq \|g\|\,\|A'^{-1}\|_{\mathrm{op}}\,\|A^{-1}\|_{\mathrm{op}}\,\|A-A'\|_{\mathrm{op}}, \]
and
\[ A'^{-1} = A^{-1}\sum_{n=0}^{\infty}\big((A-A')A^{-1}\big)^{n} \]
when $\|A-A'\|_{\mathrm{op}}<\|A^{-1}\|_{\mathrm{op}}^{-1}$, whence, when additionally $\|A-A'\|_{\mathrm{op}}\leq(2\|A^{-1}\|_{\mathrm{op}})^{-1}$, one has $\|A'^{-1}\|_{\mathrm{op}}\leq 2\,\|A^{-1}\|_{\mathrm{op}}$. Plugging the latter inequality into the above estimate for $\|f-f'\|$ yields the conclusion. □

Remark 4.2.
Such an 'elementary' proof of Theorem 4.1 relies on a non-trivial toolbox, the above-mentioned result [5, Prop. 3.15].

Theorem 4.1 only scratches the surface of expectedly relevant features of $\mathscr{K}$-class operators, in view of the study of perturbations preserving Krylov solvability (see questions of type II in Sect. 2).

That the issue is non-trivial, however, is demonstrated by important difficulties that one soon encounters when trying to extend the scope of Theorem 4.1. Let us discuss here one point in particular: in the same spirit of questions of type II, it is natural to inquire whether $\mathscr{K}$-class operators allow one to establish Krylov solvability in the limit when the perturbation is removed.

To begin with, if a sequence $(A_n)_{n\in\mathbb{N}}$ of $\mathscr{K}$-class operators on $\mathcal{H}$ converges in operator norm, the limit $A$ fails in general to be of $\mathscr{K}$-class. Indeed:
Lemma 4.3. The $\mathscr{K}$-class is not closed in $\mathcal{B}(\mathcal{H})$.

Proof. It suffices to consider a positive, compact operator $A:\mathcal{H}\to\mathcal{H}$, with zero in its spectrum, and its perturbations $A_n := A + n^{-1}\mathbb{1}$, $n\in\mathbb{N}$. Then each $A_n$ is of $\mathscr{K}$-class and $\|A_n-A\|_{\mathrm{op}}\to 0$ as $n\to\infty$, but by construction $A$ is not of $\mathscr{K}$-class. □

One might be misled to believe that the general mechanism for such failure is the appearance of zero in the spectrum of the limit operator $A$, and that therefore a uniform separation of $\sigma(A_n)$ from zero as $n\to\infty$ would produce a limit $A$ still in the $\mathscr{K}$-class. To show that this is not the case either, let us work out the following example.

Example 4.4.
Let $A$ and $A_n$, $n\in\mathbb{N}$, be the operators on the Hilbert space $L^2[0,1]$ defined by
\[ (Af)(x) := e^{2\pi\mathrm{i}x}f(x), \qquad (A_nf)(x) := \begin{cases} e^{2\pi\mathrm{i}x}f(x), & \text{if } x\in(\tfrac{1}{\pi n},1], \\ (1+\tfrac{1}{n})\,e^{2\pi\mathrm{i}x}f(x), & \text{if } x\in[0,\tfrac{1}{\pi n}], \end{cases} \]
for $f\in L^2[0,1]$ and a.e. $x\in[0,1]$. Then
\[ \|A\|_{\mathrm{op}} = 1, \qquad \|A_n\|_{\mathrm{op}} = 1+\tfrac{1}{n}, \qquad \|A-A_n\|_{\mathrm{op}} = \tfrac{1}{n}\,\|A\|_{\mathrm{op}} \xrightarrow{\,n\to\infty\,} 0. \]
Moreover, each $A_n$ is a $\mathscr{K}$-class operator: its spectrum $\sigma(A_n)$ covers the unit circle except for a 'lid' arc, corresponding to the angle in $[0,\tfrac{2}{n}]$, which lies on the circle of larger radius $1+\tfrac{1}{n}$; therefore it is possible to include $\sigma(A_n)$ into a suitable bounded open $W$ separated from zero and with connected complement in $\mathbb{C}$. In the limit $n\to\infty$ the 'lid' closes the unit circle: $\sigma(A)$ is indeed the whole unit circle. Thus, even if the limit operator $A$ satisfies $0\notin\sigma(A)$, $A$ fails to belong to the $\mathscr{K}$-class. In addition, not only is the $\mathscr{K}$-class condition lost in the limit, but so too is the Krylov solvability. Indeed, by means of the Hilbert space isomorphism
\[ L^2[0,1] \xrightarrow{\;\cong\;} \ell^2(\mathbb{Z}), \qquad e^{2\pi\mathrm{i}kx}\mapsto e_k \]
(namely with respect to the orthonormal bases $(e^{2\pi\mathrm{i}kx})_{k\in\mathbb{Z}}$ and $(e_k)_{k\in\mathbb{Z}}$), $A$ is unitarily equivalent to the right-shift operator on $\ell^2(\mathbb{Z})$: thus, any choice $g\cong e_k$ for some $k\in\mathbb{Z}$ produces a non-Krylov solvable inverse problem $Af=g$.

In conclusion, the $\mathscr{K}$-class proves to be an informative sub-class of operators that is very robust and preserves Krylov solvability under perturbations of an unperturbed $\mathscr{K}$-class inverse problem (Theorem 4.1), but on the contrary is very fragile when from a sequence of approximating inverse problems of $\mathscr{K}$-class one wants to extract information on the Krylov solvability of the limit problem (Lemma 4.3, Example 4.4).

5. Weak gap metric for weakly closed parts of the unit ball
Let us introduce now a convenient indicator of vicinity of closed subspaces of a given Hilbert space, which turns out to possess convenient properties when comparing (closures of) Krylov subspaces, and to provide a rigorous language to express and control limits of the form $\overline{\mathcal{K}(A_n,g_n)}\to\overline{\mathcal{K}(A,g)}$. Even though such an indicator is not optimal, in that it lacks other desired properties that would make it fully informative, we discuss it in depth here as a first attempt towards an efficient measurement of vicinity and convergence of Krylov subspaces under perturbations.

One natural motivation is provided by the failure of describing the intuitive convergence $\mathcal{K}_N(A,g)\xrightarrow{N\to\infty}\overline{\mathcal{K}(A,g)}$, where
\[ \mathcal{K}_N(A,g) := \mathrm{span}\{g,Ag,\dots,A^{N-1}g\}, \qquad N\in\mathbb{N}, \tag{5.1} \]
is the $N$-th order Krylov subspace, by means of the ordinary 'gap metric' between closed subspaces of the underlying Hilbert space.

Let us recall (see, e.g., [21, Chapt. 4]) that, given a Hilbert space $\mathcal{H}$ and two closed subspaces $U,V\subset\mathcal{H}$, the 'gap' and the 'gap distance' between them are, respectively, the quantities
\[ \widehat{\delta}(U,V) := \max\{\delta(U,V),\delta(V,U)\} \tag{5.2} \]
and
\[ \widehat{d}(U,V) := \max\{d(U,V),d(V,U)\}, \tag{5.3} \]
where
\[ \delta(U,V) := \sup_{\substack{u\in U\\ \|u\|=1}}\;\inf_{v\in V}\|u-v\|, \qquad d(U,V) := \sup_{\substack{u\in U\\ \|u\|=1}}\;\inf_{\substack{v\in V\\ \|v\|=1}}\|u-v\|, \tag{5.4} \]
and with the tacit definitions $\delta(\{0\},V):=0$, $d(\{0\},V):=0$, $d(U,\{0\}):=2$ for $U\neq\{0\}$, when one of the two entries is the trivial subspace. The short-hands $B_{\mathcal{H}}$ for the closed unit ball of $\mathcal{H}$, $S_{\mathcal{H}}$ for the closed unit sphere, $B_U := U\cap B_{\mathcal{H}}$, $S_U := U\cap S_{\mathcal{H}}$, and $\mathrm{dist}(x,C)$ for the norm distance of a point $x\in\mathcal{H}$ from the closed subset $C\subset\mathcal{H}$ will be used throughout. Thus,
\[ \delta(U,V) = \sup_{u\in S_U}\mathrm{dist}(u,V), \qquad d(U,V) = \sup_{u\in S_U}\mathrm{dist}(u,S_V). \tag{5.5} \]
As a matter of fact, on the set of all closed subspaces of $\mathcal{H}$ both $\widehat{\delta}$ and $\widehat{d}$ are two equivalent metrics, with
\[ \widehat{\delta}(U,V) \leq \widehat{d}(U,V) \leq 2\,\widehat{\delta}(U,V), \tag{5.6} \]
and the resulting metric space is complete [11].

The construction that we recalled here is for the Hilbert space setting and was introduced first in [22] as 'opening' between (closed) subspaces (i.e., the operator norm distance between their orthogonal projections). It also applies to the more general case when $\mathcal{H}$ is a Banach space, a generalisation originally discussed in [23] and [1, §34] (except that in the non-Hilbert case the gap $\widehat{\delta}$ is not a metric, even though it still satisfies (5.6) and hence induces the same topology as the metric $\widehat{d}$). Let us also recall that by linearity the closedness of the above subspaces $U$ and $V$ can be equivalently formulated in the norm or in the weak topology of $\mathcal{H}$.

Now, given $A\in\mathcal{B}(\mathcal{H})$ and $g\in\mathcal{H}$, for the closed subspaces $\mathcal{K} := \overline{\mathcal{K}(A,g)}$ and $\mathcal{K}_N := \mathcal{K}_N(A,g)$, $N\in\mathbb{N}$, of $\mathcal{H}$ one obviously has $\mathcal{K}_N\subset\mathcal{K}$ and hence $\delta(\mathcal{K}_N,\mathcal{K})=0$; on the other hand, if $\dim\mathcal{K}=\infty$, one can find for every $N$ a vector $u\in S_{\mathcal{K}}$ such that $u\perp\mathcal{K}_N$, thus with $\mathrm{dist}(u,\mathcal{K}_N)=1$, and hence $\delta(\mathcal{K},\mathcal{K}_N)\geq 1$. This shows that $\widehat{d}(\mathcal{K}_N,\mathcal{K})\geq\widehat{\delta}(\mathcal{K}_N,\mathcal{K})\geq 1$, therefore the sequence $(\mathcal{K}_N)_{N\in\mathbb{N}}$ fails to converge to $\mathcal{K}$ in the $\widehat{d}$-metric. In this respect, the $\widehat{d}$-metric is certainly not a convenient tool to monitor the vicinity of Krylov subspaces, for it cannot accommodate the most intuitive convergence $\mathcal{K}_N\to\mathcal{K}$.

With this observation in mind, it is natural to weaken the ordinary gap distance $\widehat{d}$ so as to encompass a larger class of limits. To do so, we exploit the fact (see, e.g., [2, Theorem 3.29]) that in any separable Hilbert space $\mathcal{H}$ the norm-closed unit ball $B_{\mathcal{H}}$ is metrisable in the Hilbert space weak topology. More precisely, there exists a norm $\|\cdot\|_w$ on $\mathcal{H}$ (and hence a metric $\varrho_w(x,y) := \|x-y\|_w$) such that $\|x\|_w\leq\|x\|$ and whose metric topology restricted to $B_{\mathcal{H}}$ is precisely the Hilbert space weak topology. For concreteness one may define
\[ \|x\|_w := \sum_{n=1}^{\infty}\frac{1}{2^{n}}\,|\langle\xi_n,x\rangle| \]
for a dense countable collection $(\xi_n)_{n\in\mathbb{N}}$ in $B_{\mathcal{H}}$ which identifies the norm $\|\cdot\|_w$.

On the other hand, since a Hilbert space is reflexive, $B_{\mathcal{H}}$ is compact in the weak topology (see, e.g., [2, Theorem 3.16]), and hence in the $\varrho_w$-metric. Since $(B_{\mathcal{H}},\varrho_w)$ is a metric space, its compactness is equivalent to the property of being simultaneously complete and totally bounded (see, e.g., [25, Theorem 45.1]). In conclusion, the metric space $(B_{\mathcal{H}},\varrho_w)$ is compact and complete, and its metric topology is the Hilbert space weak topology (restricted to $B_{\mathcal{H}}$). In fact, the construction that follows, including Theorem 5.1 below, is applicable to the more general case where $\mathcal{H}$ is a reflexive Banach space with separable dual: indeed, the same properties above for $(B_{\mathcal{H}},\varrho_w)$ hold.

In $(B_{\mathcal{H}},\varrho_w)$ we denote the relative weakly open balls (namely the $\varrho_w$-open balls of $\mathcal{H}$ intersected with $B_{\mathcal{H}}$) as
\[ B_w(x_0,\varepsilon) := \{x\in B_{\mathcal{H}}\,|\,\|x-x_0\|_w<\varepsilon\} \tag{5.7} \]
for given $x_0\in B_{\mathcal{H}}$ and $\varepsilon >$
0. Observe that any such open ball $B_w(x_0,\varepsilon)$ always contains points of the unit sphere (not all, if $\varepsilon$ is small enough); thus, at fixed $x_0\in B_{\mathcal{H}}$, and along a sequence of radii $\varepsilon_n\downarrow 0$, one can select a sequence $(y_n)_{n\in\mathbb{N}}$ with $\|y_n\|=1$ and $\|y_n-x_0\|_w<\varepsilon_n$, whence the conclusion $y_n\xrightarrow{\varrho_w}x_0$, which reproduces, in the metric space language, the topological statement that $y_n\rightharpoonup x_0$, i.e., that the unit ball is the weak closure of the unit sphere.

Based on the weak (and metric) topology on $B_{\mathcal{H}}$ it is natural to weaken the gap distance $\widehat{d}$ considered before, as we shall do in a moment, except that dealing now with weak limits instead of norm limits one has to expect possible "discontinuous jumps", say, in the form of sudden expansions or contractions of the limit object as compared to its approximants (in the same spirit of taking the closure of the unit sphere $S_{\mathcal{H}}$: the norm-closure gives again $S_{\mathcal{H}}$, the weak closure gives the whole $B_{\mathcal{H}}$). For this reason we set up the new notion of weak gap metric in the more general class
\[ \mathcal{C}_w(\mathcal{H}) := \{\text{non-empty and weakly closed subsets of } B_{\mathcal{H}}\}, \tag{5.8} \]
instead of the subclass of unit balls of closed subspaces of $\mathcal{H}$.

For $U,V\in\mathcal{C}_w(\mathcal{H})$ let us then set
\[ d_w(U,V) := \sup_{u\in U}\,\inf_{v\in V}\|u-v\|_w, \qquad \widehat{d}_w(U,V) := \max\{d_w(U,V),d_w(V,U)\}. \tag{5.9} \]
We shall now establish the fundamental properties of the map $\widehat{d}_w$. They are summarised as follows.

Theorem 5.1.
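Let $\mathcal{H}$ be a separable Hilbert space.

(i) $\widehat{d}_w$ is a metric on $\mathcal{C}_w(\mathcal{H})$.

(ii) The metric space $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$ is complete.

(iii) If $\widehat{d}_w(U_n,U)\xrightarrow{n\to\infty}0$ for an element $U$ and a sequence $(U_n)_{n\in\mathbb{N}}$ in $\mathcal{C}_w(\mathcal{H})$, then
\[ U = \{u\in B_{\mathcal{H}}\,|\,u_n\rightharpoonup u \text{ for a sequence } (u_n)_{n\in\mathbb{N}} \text{ with } u_n\in U_n\}. \tag{5.10} \]

(iv) The metric space $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$ is compact.

(v) If $\widehat{d}_w(U_n,U)\xrightarrow{n\to\infty}0$ for an element $U$ and a sequence $(U_n)_{n\in\mathbb{N}}$ in $\mathcal{C}_w(\mathcal{H})$, then $\widehat{d}_w(f(U_n),f(U))\xrightarrow{n\to\infty}0$ for any weakly closed and weakly continuous map $f:\mathcal{H}\to\mathcal{H}$ such that $f(B_{\mathcal{H}})\subset B_{\mathcal{H}}$.

We shall also write $U_n\xrightarrow{\widehat{d}_w}U$ as an alternative to $\widehat{d}_w(U_n,U)\to 0$.

Remark 5.2.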
The completeness and the compactness results of Theorem 5.1 are in a sense folk knowledge in the context of the Hausdorff distance. In fact, the gap distance $\widehat{d}(U,V)$ introduced in (5.3)-(5.4) is, apart from zero-sets, the Hausdorff distance between $U$ and $V$ as subsets of the metric (normed) space $(\mathcal{H},\|\cdot\|)$, and our modified weak gap distance $\widehat{d}_w(U,V)$ defined in (5.9) between elements of $\mathcal{C}_w(\mathcal{H})$ is the Hausdorff distance between sets in the metric space $(B_{\mathcal{H}},\varrho_w)$. The completeness and the compactness of $(B_{\mathcal{H}},\varrho_w)$ then lift, separately, to the completeness and compactness of $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$; they are actually equivalent (see, e.g., [34, Theorem 5.38]). We chose to present here both results and their proofs in detail for three important reasons. First, we wanted to make the discussion self-consistent (also in view of the rather miscellaneous literature we could track down; our proof of completeness, in particular, follows a route independent of the general discussion in [16, 12, 34]). Second, we intended to expose reasonings, tailored on the weak topology setting, which we shall use repeatedly in the proofs of the various statements of the following Sections. Third, having the proof of completeness of $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$ fully laid down is of further help in understanding the failure of completeness of the metric space $(\mathcal{S}(\mathcal{H}),\widehat{d}_w)$ that we will consider in the next Section, for applications to Krylov subspaces.

There are further technical properties of $d_w$ and $\widehat{d}_w$ that are worth being singled out. Let us collect them in the following lemmas.

Lemma 5.3.
Let $U,V,Z\in\mathcal{C}_w(\mathcal{H})$ for some separable Hilbert space $\mathcal{H}$. Then
\[ d_w(U,V)=0 \;\Leftrightarrow\; U\subset V, \tag{5.11} \]
\[ \widehat{d}_w(U,V)=0 \;\Leftrightarrow\; U=V, \tag{5.12} \]
\[ d_w(U,Z)\leq d_w(U,V)+d_w(V,Z), \tag{5.13} \]
\[ \widehat{d}_w(U,Z)\leq\widehat{d}_w(U,V)+\widehat{d}_w(V,Z). \tag{5.14} \]

Lemma 5.4.
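Given a separable Hilbert space $\mathcal{H}$ and a collection $(U_n)_{n\in\mathbb{N}}$ in $\mathcal{C}_w(\mathcal{H})$, the set
\[ \mathcal{U} := \{x\in B_{\mathcal{H}}\,|\,u_n\rightharpoonup x \text{ for a sequence } (u_n)_{n\in\mathbb{N}} \text{ with } u_n\in U_n\} \tag{5.15} \]
is closed in the weak topology of $\mathcal{H}$.

Given $U\in\mathcal{C}_w(\mathcal{H})$ we define its 'weakly open $\varepsilon$-expansion' in $B_{\mathcal{H}}$ as
\[ U(\varepsilon) := \bigcup_{u\in U}B_w(u,\varepsilon). \tag{5.16} \]
Observe that $U(\varepsilon)$ is a weakly open subset of $B_{\mathcal{H}}$.

Lemma 5.5.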
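Let $U,V\in\mathcal{C}_w(\mathcal{H})$ for some separable Hilbert space $\mathcal{H}$ and let $\varepsilon>0$. Then:

(i) $d_w(U,V)<\varepsilon \;\Rightarrow\; V\cap B_w(u,\varepsilon)\neq\emptyset \;\;\forall u\in U$;

(ii) $d_w(U,V)<\varepsilon \;\Leftrightarrow\; U\subset V(\varepsilon)$;

(iii) if $U\subset V(\varepsilon)$ and $U\cap B_w(v,\varepsilon)\neq\emptyset$ $\forall v\in V$, then $\widehat{d}_w(V,U)<\varepsilon$.

The remaining part of this Section is devoted to proving the above statements.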
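The quantities in (5.9) and (5.16) can be explored numerically on finite sets. What follows is only a minimal sketch under stated assumptions, not part of the argument above: the Hilbert space is truncated to $\mathbb{R}^N$, and the weak norm is replaced by the stand-in $\sum_n 2^{-n}|x_n|$, i.e. the collection $(\xi_n)$ is taken to be the canonical basis, which is not dense in $B_{\mathcal{H}}$. All function names (`weak_norm`, `d_w`, `d_w_hat`, `in_expansion`) are hypothetical. Finite subsets of the unit ball are weakly closed, so they are legitimate elements of $\mathcal{C}_w(\mathcal{H})$.

```python
def weak_norm(x):
    """Stand-in for ||x||_w: sum_n 2^{-n} |x_n| (assumes xi_n = e_n)."""
    return sum(abs(c) / 2 ** (n + 1) for n, c in enumerate(x))


def dist_w(u, v):
    """Weak-metric distance rho_w(u, v) = ||u - v||_w."""
    return weak_norm([a - b for a, b in zip(u, v)])


def d_w(U, V):
    """One-sided quantity sup_{u in U} inf_{v in V} ||u - v||_w, cf. (5.9)."""
    return max(min(dist_w(u, v) for v in V) for u in U)


def d_w_hat(U, V):
    """Symmetrised quantity max{d_w(U, V), d_w(V, U)}, cf. (5.9)."""
    return max(d_w(U, V), d_w(V, U))


def in_expansion(U, V, eps):
    """True when U is contained in V(eps), the eps-expansion of V, cf. (5.16)."""
    return all(any(dist_w(u, v) < eps for v in V) for u in U)


N = 6                      # truncation dimension (assumption)
zero = [0.0] * N


def e(k):
    """k-th canonical basis vector of R^N, k = 1, ..., N."""
    return [1.0 if i == k - 1 else 0.0 for i in range(N)]


U = [zero, e(1)]
V = [zero, e(1), e(3)]

print(d_w(U, V))           # 0.0, since U is a subset of V, cf. (5.11)
print(d_w(V, U))           # 0.125, the weak distance of e_3 from U
print(d_w_hat(U, V))       # 0.125
# Lemma 5.5(ii): d_w(V, U) < eps exactly when V is inside U(eps)
print(in_expansion(V, U, 0.2), in_expansion(V, U, 0.1))  # True False
```

The example shows the asymmetry of $d_w$ that makes the maximum in (5.9) necessary: $d_w(U,V)$ vanishes as soon as $U\subset V$, whereas $d_w(V,U)$ detects the extra point of $V$.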
Proof of Lemma 5.3. For (5.11), the inclusion $U\subset V$ implies $\inf_{v\in V}\|u-v\|_w=0$ for every $u\in U$, whence $d_w(U,V)=0$; conversely, if $0=d_w(U,V)=\sup_{u\in U}\inf_{v\in V}\|u-v\|_w$, then $\inf_{v\in V}\|u-v\|_w=0$ for every $u\in U$, whence the fact, by weak closedness of $V$, that any such $u$ belongs also to $V$. As for the property (5.12), it follows from (5.11) exploiting separately both inclusions $U\subset V$ and $U\supset V$.

Last, let us prove the triangular inequalities (5.13)-(5.14). Let $u\in U$: then, owing to the weak compactness of $V$ (as a closed subset of the compact metric space $(B_{\mathcal{H}},\varrho_w)$), $\inf_{v\in V}\|u-v\|_w=\|u-v_0\|_w$ for some $v_0\in V$, whence
\[ \|u-v_0\|_w = \inf_{v\in V}\|u-v\|_w \leq \sup_{u\in U}\inf_{v\in V}\|u-v\|_w = d_w(U,V) \leq \widehat{d}_w(U,V). \]
As a consequence,
\[ \inf_{z\in Z}\|u-z\|_w \leq \|u-v_0\|_w+\inf_{z\in Z}\|v_0-z\|_w \leq d_w(U,V)+d_w(V,Z) \leq \widehat{d}_w(U,V)+\widehat{d}_w(V,Z), \]
having used the triangular inequality of the $\|\cdot\|_w$-norm in the first inequality. By the arbitrariness of $u\in U$, thus taking the supremum over all such $u$'s,
\[ d_w(U,Z)\leq d_w(U,V)+d_w(V,Z), \qquad d_w(U,Z)\leq\widehat{d}_w(U,V)+\widehat{d}_w(V,Z). \]
With the first inequality above we proved (5.13). Next, let us combine the second inequality above with the corresponding bound for $d_w(Z,U)$, which is established in a similar manner: let now $z\in Z$, and again by weak compactness there exists $v_0\in V$ with $\|v_0-z\|_w=\inf_{v\in V}\|v-z\|_w\leq d_w(Z,V)\leq\widehat{d}_w(V,Z)$, and also $\inf_{u\in U}\|u-v_0\|_w\leq d_w(V,U)\leq\widehat{d}_w(U,V)$, whence
\[ \inf_{u\in U}\|u-z\|_w \leq \inf_{u\in U}\|u-v_0\|_w+\|v_0-z\|_w \leq \widehat{d}_w(U,V)+\widehat{d}_w(V,Z). \]
Taking the supremum over all $z\in Z$ yields
\[ d_w(Z,U)\leq\widehat{d}_w(U,V)+\widehat{d}_w(V,Z). \]
Combining the above estimates for $d_w(U,Z)$ and $d_w(Z,U)$ yields the conclusion (5.14). □
Proof of Theorem 5.1(i). It follows directly from (5.12)-(5.14) of Lemma 5.3. □
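As a sanity check of the metric axioms just established, one may test (5.12)-(5.14) on randomly generated finite subsets of the unit ball. The sketch below is illustrative only and rests on the same assumptions as any finite-dimensional experiment: a truncation to $\mathbb{R}^{d}$, the stand-in weak norm $\sum_n 2^{-n}|x_n|$ (canonical basis in place of a dense collection $(\xi_n)$), and hypothetical helper names. It verifies the inequalities up to a floating-point tolerance; it is of course no substitute for the proof.

```python
import random


def weak_norm(x):
    """Stand-in for ||x||_w: sum_n 2^{-n} |x_n| (assumes xi_n = e_n)."""
    return sum(abs(c) / 2 ** (n + 1) for n, c in enumerate(x))


def d_w(U, V):
    """sup_{u in U} inf_{v in V} ||u - v||_w for finite sets U, V."""
    return max(
        min(weak_norm([a - b for a, b in zip(u, v)]) for v in V) for u in U
    )


def d_w_hat(U, V):
    """max{d_w(U, V), d_w(V, U)}."""
    return max(d_w(U, V), d_w(V, U))


def random_point(rng, dim):
    """Random point of the closed Euclidean unit ball of R^dim."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    norm = sum(c * c for c in x) ** 0.5
    scale = rng.random() / norm if norm > 0 else 0.0
    return [c * scale for c in x]


def random_set(rng, dim, size):
    """Finite (hence weakly closed) subset of the unit ball."""
    return [random_point(rng, dim) for _ in range(size)]


def check_axioms(trials=200, dim=5, seed=0, tol=1e-12):
    """Test identity, symmetry, and the triangle inequalities on random sets."""
    rng = random.Random(seed)
    for _ in range(trials):
        U, V, Z = (random_set(rng, dim, rng.randint(1, 4)) for _ in range(3))
        assert d_w_hat(U, U) == 0.0                                  # (5.12)
        assert d_w_hat(U, V) == d_w_hat(V, U)                        # symmetry
        assert d_w(U, Z) <= d_w(U, V) + d_w(V, Z) + tol              # (5.13)
        assert d_w_hat(U, Z) <= d_w_hat(U, V) + d_w_hat(V, Z) + tol  # (5.14)
    return True


print(check_axioms())
```

The one-sided quantity $d_w$ already satisfies the triangle inequality (5.13); only the symmetrised $\widehat{d}_w$ is a metric, because $d_w(U,V)=0$ merely encodes the inclusion $U\subset V$.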
Proof of Lemma 5.4.
Let x ∈ U w , the closure of U in the weak topology, and letus construct a sequence ( u n ) n ∈ N with u n ∈ U n and u n ⇀ x , thereby showing that x ∈ U .By assumption ∃ x ∈ U with k x − x k w and k x − u (2) n k w n →∞ −−−−→ u (2) n ) n ∈ N with u (2) n ∈ U n . In particular, there is N ∈ N with k x − u (2) n k w ∀ n > N . For the integer N with N > N + 1 to be fixed in a moment,set u n := u (2) n , n ∈ { N , . . . , N − } . By construction, k x − u n k w k x − x n k w + k x n − u n k w ∀ n ∈ { N , . . . , N − } .Next, for each integer k > N k ) ∞ k =3 in N with N k > N k − + 1, and vectors u n ∈ U n for n ∈ { N k , . . . , N k +1 − } as follows.By assumption ∃ x k ∈ U with k x − x k k w k and k x k − u ( k ) n k w n →∞ −−−−→ u ( k ) n ) n ∈ N with u ( k ) n ∈ U n . In particular, it is always possible to find N k ∈ N with N k > N k − + 1 such that k x k − u ( k ) n k w k ∀ n > N k . For the integer N k +1 > N k + 1 set u n := u ( k ) n , n ∈ { N k , . . . , N k +1 − } . By construction, k x − u n k w k x − x n k w + k x n − u n k w k ∀ n ∈ { N k , . . . , N k +1 − } .This yields a sequence ( u n ) n ∈ N (having added, if needed, finitely many irrelevantvectors u , . . . , u N − ) with u n ∈ U n and such that, for any integer k > k x − u n k w k ∀ n > N k . Hence k x − u n k w n →∞ −−−−→
0, thus $x\in\mathcal{U}$. □

Proof of Theorem 5.1(ii) and (iii).
One needs to show that given $(U_n)_{n\in\mathbb{N}}$, Cauchy sequence in $\mathcal{C}_w(\mathcal{H})$, there exists $U\in\mathcal{C}_w(\mathcal{H})$ with $\widehat{d}_w(U_n,U)\xrightarrow{n\to\infty}0$, and that $U$ has precisely the form (5.10). Moreover, as $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$ is a metric space, it suffices to establish the above statement for one subsequence of $(U_n)_{n\in\mathbb{N}}$.

By the Cauchy property, $\widehat{d}_w(U_n,U_m)\xrightarrow{n,m\to\infty}0$. Up to extracting a subsequence, henceforth denoted again with $(U_n)_{n\in\mathbb{N}}$, one can further assume that $\widehat{d}_w(U_n,U_m)\leq 2^{-n}$ $\forall m\geq n$. We shall establish the $\widehat{d}_w$-convergence of such (sub-)sequence.

First of all, fixing any $n\in\mathbb{N}$ and any $u_n\in U_n$, we construct a sequence with
• an (irrelevant) choice of vectors $u_1,\dots,u_{n-1}$ in the first $n-1$ positions, with $u_1\in U_1,\dots,u_{n-1}\in U_{n-1}$,
• precisely the considered vector $u_n$ in position $n$,
• and an infinite collection $u_{n+1},u_{n+2},u_{n+3},\dots$ determined recursively so that, given $u_k\in U_k$ ($k\geq n$), the next $u_{k+1}$ is that element of $U_{k+1}$ satisfying $\inf_{v\in U_{k+1}}\|u_k-v\|_w=\|u_k-u_{k+1}\|_w$, a choice that is always possible, owing to the weak compactness of $U_{k+1}$ as a closed subset of the compact metric space $(B_{\mathcal{H}},\varrho_w)$.

Let us refer to such $(u_k)_{k\in\mathbb{N}}$ as the sequence 'originating from the given $u_n$' (tacitly understanding that it is one representative of infinitely many sequences with the same property, owing to the irrelevant choice of the first $n-1$ vectors), and let us denote it, when needed, as $\big(u_k^{(u_n)}\big)_{k\in\mathbb{N}}$: thus, $u_n^{(u_n)}\equiv u_n$.

By construction, for any $k\geq n$,
\[ \|u_k-u_{k+1}\|_w = \inf_{v\in U_{k+1}}\|u_k-v\|_w \leq \sup_{z\in U_k}\inf_{v\in U_{k+1}}\|z-v\|_w = d_w(U_k,U_{k+1}) \leq \widehat{d}_w(U_k,U_{k+1}) \leq \frac{1}{2^{k}}, \]
whence, for any $m\geq n$,
\[ \|u_n-u_m\|_w \leq \sum_{k=n}^{m-1}\|u_k-u_{k+1}\|_w \leq \sum_{k=n}^{m-1}\frac{1}{2^{k}} \leq \frac{1}{2^{n-1}}. \]
This implies that the sequence $(u_k)_{k\in\mathbb{N}}$ originating from the considered $u_n\in U_n$ is a Cauchy sequence in $(B_{\mathcal{H}},\varrho_w)$ and we denote its weak limit as $u_\infty^{(u_n)}\in B_{\mathcal{H}}$. The same construction can be repeated for any $n\in\mathbb{N}$ and starting the sequence from any $u_n\in U_n$: the collection of all possible limit points is
\[ U_\infty := \big\{u\in B_{\mathcal{H}}\,\big|\,u=u_\infty^{(u_n)} \text{ for some } n\in\mathbb{N} \text{ and some 'starting' } u_n\in U_n\big\}. \]
Compare now the set $U_\infty$ with the set
\[ \widetilde{U} := \{u\in B_{\mathcal{H}}\,|\,u_n\rightharpoonup u \text{ for a sequence } (u_n)_{n\in\mathbb{N}} \text{ with } u_n\in U_n\}. \]
The set $\widetilde{U}$ is weakly closed (Lemma 5.4), and obviously $U_\infty\subset\widetilde{U}$.
We claim that $\widetilde{U}=\overline{U_\infty}^{\|\,\|_w}$ (the weak closure of $U_\infty$).
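For arbitrary $u\in\widetilde{U}$ and $\varepsilon>0$ there are $n_\varepsilon\in\mathbb{N}$ and $u_{n_\varepsilon}\in U_{n_\varepsilon}$ with $\|u-u_{n_\varepsilon}\|_w\leq\varepsilon$. Non-restrictively, $n_\varepsilon\to\infty$ as $\varepsilon\downarrow 0$. For a sequence $\big(u_k^{(u_{n_\varepsilon})}\big)_{k\in\mathbb{N}}$ originating from $u_{n_\varepsilon}$ and for its weak limit $u_\infty^{(u_{n_\varepsilon})}\in U_\infty$, there is $k_\varepsilon\in\mathbb{N}$ with $k_\varepsilon\geq n_\varepsilon$ satisfying both $\|u_{n_\varepsilon}-u_{k_\varepsilon}^{(u_{n_\varepsilon})}\|_w\leq 2^{-(n_\varepsilon-1)}$ (because of the above property of the sequences originating from one element) and $\|u_{k_\varepsilon}^{(u_{n_\varepsilon})}-u_\infty^{(u_{n_\varepsilon})}\|_w\leq\varepsilon$ (because of the convergence $u_k^{(u_{n_\varepsilon})}\rightharpoonup u_\infty^{(u_{n_\varepsilon})}$). Thus,
\[ \|u-u_\infty^{(u_{n_\varepsilon})}\|_w \leq \|u-u_{n_\varepsilon}\|_w+\|u_{n_\varepsilon}-u_{k_\varepsilon}^{(u_{n_\varepsilon})}\|_w+\|u_{k_\varepsilon}^{(u_{n_\varepsilon})}-u_\infty^{(u_{n_\varepsilon})}\|_w \leq 2^{-(n_\varepsilon-1)}+2\varepsilon. \]
Taking $\varepsilon\downarrow 0$ shows that $u$ indeed belongs to the weak closure of $U_\infty$.

It remains to prove that $\widehat{d}_w(U_n,\widetilde{U})\xrightarrow{n\to\infty}0$. Let us control $d_w(U_n,\widetilde{U})$ first. Pick $n\in\mathbb{N}$ and $u_n\in U_n$. For the sequence $(u_k)_{k\in\mathbb{N}}$ originating from $u_n$ (thus, $u_k\rightharpoonup u_\infty^{(u_n)}$) and for arbitrary $\varepsilon>0$ there is $k_\varepsilon\in\mathbb{N}$ with $k_\varepsilon\geq n$ such that $\|u_\infty^{(u_n)}-u_{k_\varepsilon}\|_w\leq\varepsilon$ and $\|u_{k_\varepsilon}-u_n\|_w\leq 2^{-(n-1)}$. Thus,
\[ \inf_{u\in\widetilde{U}}\|u-u_n\|_w \leq \|u_\infty^{(u_n)}-u_n\|_w \leq \|u_\infty^{(u_n)}-u_{k_\varepsilon}\|_w+\|u_{k_\varepsilon}-u_n\|_w \leq \varepsilon+\frac{1}{2^{n-1}}, \]
whence also
\[ d_w(U_n,\widetilde{U}) = \sup_{u_n\in U_n}\inf_{u\in\widetilde{U}}\|u-u_n\|_w \leq \varepsilon+\frac{1}{2^{n-1}}. \]
This implies that $\limsup_n d_w(U_n,\widetilde{U})\leq\varepsilon$, and owing to the arbitrariness of $\varepsilon$, finally $d_w(U_n,\widetilde{U})\xrightarrow{n\to\infty}0$.

The other limit $d_w(\widetilde{U},U_n)\xrightarrow{n\to\infty}0$ is established by exploiting the weak density of $U_\infty$ in $\widetilde{U}$. Pick arbitrary $n\in\mathbb{N}$, $\varepsilon>0$, and $u\in\widetilde{U}$. We already argued, for the proof of the identity $\widetilde{U}=\overline{U_\infty}^{\|\,\|_w}$, that there is $n_\varepsilon\in\mathbb{N}$ with $n_\varepsilon\to\infty$ as $\varepsilon\downarrow$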
0, and there is $u_{n_\varepsilon}\in U_{n_\varepsilon}$, such that
\[ \big\|u-u_\infty^{(u_{n_\varepsilon})}\big\|_w \leq 2^{-(n_\varepsilon-1)}+2\varepsilon. \]
Non-restrictively, $n_\varepsilon\geq n$. In turn, as $u_k^{(u_{n_\varepsilon})}\rightharpoonup u_\infty^{(u_{n_\varepsilon})}$, there is $k_\varepsilon\in\mathbb{N}$ with $k_\varepsilon\geq n_\varepsilon\geq n$ such that $\big\|u_\infty^{(u_{n_\varepsilon})}-u_{k_\varepsilon}^{(u_{n_\varepsilon})}\big\|_w\leq\varepsilon$ and $\big\|u_{k_\varepsilon}^{(u_{n_\varepsilon})}-u_n^{(u_{n_\varepsilon})}\big\|_w\leq 2^{-(n-1)}$. Therefore,
\[ \inf_{v\in U_n}\|u-v\|_w \leq \big\|u-u_\infty^{(u_{n_\varepsilon})}\big\|_w+\inf_{v\in U_n}\big\|u_\infty^{(u_{n_\varepsilon})}-v\big\|_w \leq \big\|u-u_\infty^{(u_{n_\varepsilon})}\big\|_w+\big\|u_\infty^{(u_{n_\varepsilon})}-u_{k_\varepsilon}^{(u_{n_\varepsilon})}\big\|_w+\big\|u_{k_\varepsilon}^{(u_{n_\varepsilon})}-u_n^{(u_{n_\varepsilon})}\big\|_w \leq \frac{1}{2^{n_\varepsilon-1}}+3\varepsilon+\frac{1}{2^{n-1}}, \]
whence also
\[ d_w(\widetilde{U},U_n) = \sup_{u\in\widetilde{U}}\inf_{u_n\in U_n}\|u-u_n\|_w \leq \frac{1}{2^{n_\varepsilon-1}}+3\varepsilon+\frac{1}{2^{n-1}}. \]
As above, the limit $\varepsilon\downarrow 0$ and the arbitrariness of $n$ imply $d_w(\widetilde{U},U_n)\xrightarrow{n\to\infty}0$. Thus, $\widehat{d}_w(U_n,U)\xrightarrow{n\to\infty}0$ with $U=\widetilde{U}$. □

Proof of Lemma 5.5. (i) If, for contradiction, $V\cap B_w(u_0,\varepsilon)=\emptyset$ for some $u_0\in U$, then the weak metric distance ($\varrho_w$) of $u_0$ from $V$ is at least $\varepsilon$, meaning that $d_w(U,V)\geq\inf_{v\in V}\|u_0-v\|_w\geq\varepsilon$.

(ii) Assume that $d_w(U,V)<\varepsilon$. On account of the weak compactness of $V$ (as a closed subset of the compact metric space $(B_{\mathcal{H}},\varrho_w)$), for any $u\in U$ there is $v_u\in V$ with $\|u-v_u\|_w=\inf_{v\in V}\|u-v\|_w\leq d_w(U,V)<\varepsilon$, meaning that $u\in B_w(v_u,\varepsilon)$. Thus, $U\subset V(\varepsilon)$. Conversely, if $U\subset V(\varepsilon)$, then any $u\in U$ belongs to a ball $B_w(v_u,\varepsilon)$ for some $v_u\in V$, whence
\[ f(u) := \inf_{v\in V}\|u-v\|_w \leq \|u-v_u\|_w < \varepsilon. \]
The function $f:U\to\mathbb{R}$ is continuous on the weakly compact set $U$, hence it attains its maximum at a point $u=u_0$ and
\[ d_w(U,V) = \sup_{u\in U}f(u) = \inf_{v\in V}\|u_0-v\|_w < \varepsilon. \]

(iii) As by assumption $U\subset V(\varepsilon)$, we know from (ii) that $d_w(U,V)<\varepsilon$. In addition, for any $v\in V$ it is assumed that $B_w(v,\varepsilon)$ is not disjoint from $U$, meaning that there is $u_v\in U$ with $\|u_v-v\|_w<\varepsilon$.
Therefore,
\[ g(v) := \inf_{u\in U}\|u-v\|_w < \varepsilon, \]
and from the continuity of $g:V\to\mathbb{R}$ on the weakly compact $V$,
\[ d_w(V,U) = \sup_{v\in V}g(v) = g(v_0) < \varepsilon, \]
where $v_0$ is some point of maximum for $g$. In conclusion, $\widehat{d}_w(V,U)<\varepsilon$. □

Proof of Theorem 5.1(iv).
As the metric space $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$ is complete, compactness follows if one proves that for any $\varepsilon>0$ the set $\mathcal{C}_w(\mathcal{H})$ can be covered by finitely many $\widehat{d}_w$-open balls of radius $\varepsilon$ (total boundedness and completeness indeed imply compactness for a metric space).

To this aim, let us observe first that, owing to the compactness of $(B_{\mathcal{H}},\varrho_w)$, for any $\varepsilon>0$ the unit ball $B_{\mathcal{H}}$ is covered by finitely many weakly open balls $B_w(x_1,\varepsilon),\dots,B_w(x_M,\varepsilon)$ for some $x_1,\dots,x_M\in B_{\mathcal{H}}$ and $M\in\mathbb{N}$ all depending on $\varepsilon$. Each $B_w(x_n,\varepsilon)$ is the $\varepsilon$-expansion of the weakly closed set $\{x_n\}$, hence
\[ Z(\varepsilon) = \bigcup_{x\in Z}B_w(x,\varepsilon) \qquad \forall Z\subset Z_M := \{x_1,\dots,x_M\}. \]
Let us now show that the finitely many $\widehat{d}_w$-open balls of the form
\[ \{U\in\mathcal{C}_w(\mathcal{H})\,|\,\widehat{d}_w(U,Z)<\varepsilon\}, \]
centred at some $Z\subset Z_M$, actually cover $\mathcal{C}_w(\mathcal{H})$. Pick $U\in\mathcal{C}_w(\mathcal{H})$: as $U\subset B_{\mathcal{H}}$, $U$ intersects some of the balls $B_w(x_n,\varepsilon)$, so let $Z_U\subset Z_M$ be the collection of the corresponding centres of such balls. Thus, $U\subset Z_U(\varepsilon)$ and $U\cap B_w(x,\varepsilon)\neq\emptyset$ for any $x\in Z_U$. The last two properties are precisely the assumption of Lemma 5.5(iii), that then implies $\widehat{d}_w(U,Z_U)<\varepsilon$. In conclusion, each $U\in\mathcal{C}_w(\mathcal{H})$ belongs to the $\widehat{d}_w$-open ball centred at $Z_U$ and with radius $\varepsilon$, and irrespectively of $U$ the number of such balls is finite, thus realising a finite cover of $\mathcal{C}_w(\mathcal{H})$. □

Proof of Theorem 5.1(v).
Both $f(U_n)$ and $f(U)$ are weakly closed, hence also weakly compact subsets of $B_{\mathcal{H}}$. In particular it makes sense to evaluate $\widehat{d}_w(f(U_n),f(U))$.

We start with proving that $d_w(f(U_n),f(U))\xrightarrow{n\to\infty}0$. Let $\varepsilon>0$. The weakly open $\varepsilon$-expansion $f(U)(\varepsilon)$ of $f(U)$ (see (5.16) above) is weakly open in $B_{\mathcal{H}}$, namely open in the relative topology of $B_{\mathcal{H}}$ induced by the weak topology of $\mathcal{H}$. By weak continuity, $f^{-1}(f(U)(\varepsilon))$ too is weakly open in $B_{\mathcal{H}}$, and in fact it is a relatively open neighbourhood of $U$, for $f(U)\subset f(U)(\varepsilon)\Rightarrow U\subset f^{-1}(f(U)(\varepsilon))$. The set $B_{\mathcal{H}}\setminus f^{-1}(f(U)(\varepsilon))$ is therefore weakly closed and hence weakly compact in $B_{\mathcal{H}}$, implying that from any point $u\in U$ one has a notion of weak metric distance between $u$ and $B_{\mathcal{H}}\setminus f^{-1}(f(U)(\varepsilon))$. So set
\[ \widetilde{\varepsilon} := \inf_{u\in U}\,\inf\big\{\|u-z\|_w\,\big|\,z\in B_{\mathcal{H}}\setminus f^{-1}(f(U)(\varepsilon))\big\}. \]
It must be $\widetilde{\varepsilon}>0$, otherwise there would be a common point in $U$ and $B_{\mathcal{H}}\setminus f^{-1}(f(U)(\varepsilon))$ (owing to the weak closedness of the latter). Thus, any weakly open expansion of $U$ up to $U(\widetilde{\varepsilon})$ is surely contained in $f^{-1}(f(U)(\varepsilon))$, whence also $f(U(\widetilde{\varepsilon}))\subset f(U)(\varepsilon)$. Now, as $U_n\xrightarrow{\widehat{d}_w}U$, there is $n_\varepsilon\in\mathbb{N}$ (in fact depending on $\widetilde{\varepsilon}$, and therefore on $\varepsilon$) such that $d_w(U_n,U)<\widetilde{\varepsilon}$ for all $n\geq n_\varepsilon$: then (Lemma 5.5(ii)) $U_n\subset U(\widetilde{\varepsilon})$ for all $n\geq n_\varepsilon$. As a consequence, for all $n\geq n_\varepsilon$, $f(U_n)\subset f(U(\widetilde{\varepsilon}))\subset f(U)(\varepsilon)$. Using again Lemma 5.5(ii), $d_w(f(U_n),f(U))<\varepsilon$ for all $n\geq n_\varepsilon$, meaning that $d_w(f(U_n),f(U))\xrightarrow{n\to\infty}0$.

Next, we prove that $d_w(f(U),f(U_n))$ tends, as $n\to\infty$, to
0. Assume for con-tradiction that, up to passing to a subsequence, still denoted with ( U n ) n ∈ N , thereis ε > d w ( f ( U ) , f ( U n )) > ε ∀ n ∈ N . With respect to such ε , asproved in the first part, there is n ε ∈ N such that f ( U n ) ⊂ f ( U )( ε ) ∀ n > n ε .For any such n > n ε , on account of Lemma 5.5(iii) one deduces from the lattertwo properties, namely d w ( f ( U ) , f ( U n )) > ε and f ( U n ) ⊂ f ( U )( ε ), that there is y n ∈ f ( U ) such that f ( U n ) ∩ B w ( y n , ε ) = ∅ , whence also U n ∩ f − ( B w ( y n , ε )) = ∅ . From this condition we want now to construct a sufficiently small weak open ball ofa point u ∈ U that is disjoint from all the U n ’s as well. The sequence ( u ( n ) ) ∞ n = n ε with each u ( n ) ∈ U such that f ( u ( n ) ) = y n , owing to the weak compactness of U , hasa weakly convergent subsequence to some u ∈ U . (The superscript in u ( n ) is to warnthat each u ( n ) belongs to U , not to U n .) So, up to further refinement, u ( n ) ⇀ u in U ,and by weak continuity y n = f ( u ( n ) ) ⇀ f ( u ) =: y . The latter convergence impliesthat, eventually in n , say, ∀ n > m ε for some m ε ∈ N , B w ( y, ε ) ⊂ B w ( y n , ε ).In view of the disjointness condition above, one then deduces U n ∩ f − ( B w ( y, ε )) = ∅ ∀ n > m ε . As f − ( B w ( y, ε )) above is an open neighbourhood of u ∈ U in the relative weaktopology of B H (weak continuity of f ), it contains a ball B w ( u, ε ) around u forsome radius ε >
0, whence
\[ U_n\cap B_w(u,\varepsilon) = \emptyset \qquad \forall n\geq m_\varepsilon. \]
On account of Lemma 5.5(i), this implies $d_w(U,U_n)\geq\varepsilon$ $\forall n\geq m_\varepsilon$. However, this contradicts the assumption $\widehat{d}_w(U,U_n)\to 0$. □

6. Weak gap metric for linear subspaces
Our primary interest is to exploit the $\widehat{d}_w$-convergence for closed subspaces of $\mathcal{H}$, and ultimately for Krylov subspaces, in the sense of the convergence naturally induced by the convergence of the corresponding unit balls as elements of $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$.

In other words, given two closed subspaces $U,V\subset\mathcal{H}$, by definition we identify
\[ \widehat{d}_w(U,V) \equiv \widehat{d}_w(B_U,B_V) \tag{6.1} \]
with the r.h.s. defined in (5.9), since $B_U,B_V\in\mathcal{C}_w(\mathcal{H})$. Analogously, given $U$ and a sequence $(U_n)_{n\in\mathbb{N}}$, all closed subspaces of $\mathcal{H}$, we write $U_n\xrightarrow{\widehat{d}_w}U$ to mean that $B_{U_n}\xrightarrow{\widehat{d}_w}B_U$ in the sense of the definition given in the previous Section. This provides a metric topology and a notion of convergence on the set
\[ \mathcal{S}(\mathcal{H}) := \{\text{closed linear subspaces of } \mathcal{H}\}. \tag{6.2} \]
By linearity, the closedness of each subspace of $\mathcal{H}$ is equivalently meant in the $\mathcal{H}$-norm or in the weak topology. (Recall, however, that the weak topology on $\mathcal{H}$ is not induced by the norm $\|\cdot\|_w$, as this is only the case in $B_{\mathcal{H}}$.)

Lemma 6.1.
The set $(\mathcal{S}(\mathcal{H}),\widehat{d}_w)$ is a metric space.

Proof. Positivity and triangular inequality are obvious from (6.1) and Lemma 5.3. Last, to deduce from $\widehat{d}_w(U,V)=0$ that $U=V$, one observes that $\widehat{d}_w(B_U,B_V)=0$ and hence $B_U=B_V$. If $u\in U$, $u\neq 0$, then $u/\|u\|\in B_U=B_V$, whence by linearity $u\in V$, thus, $U\subset V$. Exchanging the role of the two subspaces, also $V\subset U$. □

The metric space $(\mathcal{S}(\mathcal{H}),\widehat{d}_w)$ contains in particular the closures of Krylov subspaces, and monitoring the distance between two such subspaces in the $\widehat{d}_w$-metric turns out to be informative in many respects. Unfortunately there is a major drawback, for:

Lemma 6.2.
The metric space $(\mathcal{S}(\mathcal{H}),\widehat{d}_w)$ is not complete.

Proof. It is enough to provide an example of $\widehat{d}_w$-Cauchy sequence in $\mathcal{S}(\mathcal{H})$ that does not converge in $\mathcal{S}(\mathcal{H})$. So take $\mathcal{H}=\ell^2(\mathbb{N})$, with the usual canonical orthonormal basis $(e_n)_{n\in\mathbb{N}}$. For $n\in\mathbb{N}$, set $U_n := \mathrm{span}\{e_1+e_n\}\in\mathcal{S}(\mathcal{H})$.

Let us show first of all that the sequence $(U_n)_{n\in\mathbb{N}}$ is $\widehat{d}_w$-Cauchy, i.e., that the corresponding unit balls form a Cauchy sequence $(B_{U_n})_{n\in\mathbb{N}}$ in the metric space $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$. A generic $u\in B_{U_n}$ has the form $u=\alpha_u(e_1+e_n)$ for some $\alpha_u\in\mathbb{C}$ with $|\alpha_u|\leq\frac{1}{\sqrt{2}}$. Therefore,
\[ \inf_{v\in B_{U_m}}\|u-v\|_w \leq \|\alpha_u(e_1+e_n)-\alpha_u(e_1+e_m)\|_w \leq \frac{1}{\sqrt{2}}\,\|e_n-e_m\|_w, \]
the first inequality following from the concrete choice $v=\alpha_u(e_1+e_m)\in B_{U_m}$. Using the above estimate and the fact that $e_n\rightharpoonup 0$, and hence $(e_n)_{n\in\mathbb{N}}$ is Cauchy in the $\varrho_w$-metric, one deduces
\[ d_w(B_{U_n},B_{U_m}) = \sup_{u\in B_{U_n}}\inf_{v\in B_{U_m}}\|u-v\|_w \leq \frac{1}{\sqrt{2}}\,\|e_n-e_m\|_w \xrightarrow{\,n,m\to\infty\,} 0. \]
Inverting $n$ and $m$ one also finds $d_w(B_{U_m},B_{U_n})\xrightarrow{n,m\to\infty}0$. The Cauchy property is thus proved.

On account of the completeness of $(\mathcal{C}_w(\mathcal{H}),\widehat{d}_w)$ (Theorem 5.1(ii)), $B_{U_n}\xrightarrow{\widehat{d}_w}B$ for some $B\in\mathcal{C}_w(\mathcal{H})$. Next, let us show that there is no closed subspace $U\subset\mathcal{H}$ with $B_U=B$, which prevents the sequence $(U_n)_{n\in\mathbb{N}}$ from converging in $(\mathcal{S}(\mathcal{H}),\widehat{d}_w)$. To this aim, we shall show that although the line segment $\{\beta e_1\,|\,\beta\in\mathbb{C},\,|\beta|\leq\frac{1}{\sqrt{2}}\}$ is entirely contained in $B$, however $e_1\notin B$: this clearly prevents $B$ from being the unit ball of a linear subspace.

Assume for contradiction that $e_1\in B$; then, owing to Theorem 5.1(iii) (see formula (5.10) therein), $u_n\rightharpoonup e_1$ for a sequence $(u_n)_{n\in\mathbb{N}}$ with $u_n\in B_{U_n}$. In fact, weak approximants from $B_{\mathcal{H}}$ of points of the unit sphere $S_{\mathcal{H}}$ are necessarily also norm approximants: explicitly, owing to weak convergence, the norm is lower semi-continuous along the sequence $(u_n)_{n\in\mathbb{N}}$, thus,
\[ 1 = \|e_1\| \leq \liminf_{n\to\infty}\|u_n\| \leq 1, \qquad\text{whence}\quad \lim_{n\to\infty}\|u_n\| = 1; \]
then, since $u_n\rightharpoonup e_1$ and $\|u_n\|\to\|e_1\|$, one has $u_n\to e_1$ in the $\mathcal{H}$-norm. As a consequence, writing $u_n=\alpha_n(e_1+e_n)$ for a suitable $\alpha_n\in\mathbb{C}$ with $|\alpha_n|\leq\frac{1}{\sqrt{2}}$, one has $|\alpha_n|\sqrt{2}=\|u_n\|\to 1$, whence $|\alpha_n|\to\frac{1}{\sqrt{2}}$. This implies though that
\[ \|u_n-e_1\|^2 = \|\alpha_n(e_1+e_n)-e_1\|^2 = |1-\alpha_n|^2+|\alpha_n|^2 \]
cannot vanish as $n\to\infty$, a contradiction. Therefore, $e_1\notin B$.

On the other hand, for any $\beta\in\mathbb{C}$ with $|\beta|\leq\frac{1}{\sqrt{2}}$, $B_{U_n}\ni\beta(e_1+e_n)\rightharpoonup\beta e_1$, which by Theorem 5.1(iii) means that $\beta e_1\in B$. □

Despite the lack of completeness, the metric $\widehat{d}_w$ in $\mathcal{S}(\mathcal{H})$ displays useful properties for our purposes. The first is the counterpart of Theorem 5.1(iii).

Proposition 6.3.
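Let $\mathcal{H}$ be a separable Hilbert space and assume that $U_n \xrightarrow{\widehat{d}_w} U$ as $n \to \infty$ for some $(U_n)_{n\in\mathbb{N}}$ and $U$ in $\mathcal{S}(\mathcal{H})$. Then
\[ U = \{ u \in \mathcal{H} \,|\, u_n \rightharpoonup u \text{ for a sequence } (u_n)_{n\in\mathbb{N}} \text{ with } u_n \in U_n \} . \tag{6.3} \]

Proof. Call temporarily
\[ \widehat{U} := \{ u \in \mathcal{H} \,|\, u_n \rightharpoonup u \text{ for a sequence } (u_n)_{n\in\mathbb{N}} \text{ with } u_n \in U_n \} , \]
so that the proof consists of showing that $\widehat{U} = U$. From Theorem 5.1(iii) we know that
\[ B_{U_n} \xrightarrow{\widehat{d}_w} B_U = \{ \widetilde{u} \in B_{\mathcal{H}} \,|\, \widetilde{u}_n \rightharpoonup \widetilde{u} \text{ for a sequence } (\widetilde{u}_n)_{n\in\mathbb{N}} \text{ with } \widetilde{u}_n \in B_{U_n} \} . \]
So now if $u \in U$ (with $u \neq 0$, the case $u = 0$ being trivial), then $\widetilde{u}_n \rightharpoonup u/\|u\| \in B_U$ for some $(\widetilde{u}_n)_{n\in\mathbb{N}}$ with $\widetilde{u}_n \in B_{U_n}$, whence $U_n \ni \|u\|\,\widetilde{u}_n \rightharpoonup u$, meaning that $u \in \widehat{U}$. Conversely, if $u \in \widehat{U}$, and hence $u_n \rightharpoonup u$ for some $(u_n)_{n\in\mathbb{N}}$ with $u_n \in U_n$, then by uniform boundedness $\|u_n\| \leq \kappa$ for all $n \in \mathbb{N}$ and some $\kappa > 0$, and by lower semi-continuity of the norm along the limit $\|u\| \leq \kappa$ as well. As a consequence, $B_{U_n} \ni \kappa^{-1} u_n \rightharpoonup \kappa^{-1} u$, meaning that $\kappa^{-1} u \in B_U$ and therefore $u \in U$. □

Other relevant features of the $\widehat{d}_w$-metric will be worked out in the next Section in application to Krylov subspaces.

7. Krylov perturbations in the weak gap metric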
We are mainly concerned with controlling how close two (closures of) Krylov subspaces $\mathcal{K} \equiv \overline{\mathcal{K}(A,g)}$ and $\mathcal{K}' \equiv \overline{\mathcal{K}(A',g')}$ are within the metric space $(\mathcal{S}(\mathcal{H}), \widehat{d}_w)$ of closed subspaces of the separable Hilbert space $\mathcal{H}$ with the weak gap metric $\widehat{d}_w$, for given $A, A' \in \mathcal{B}(\mathcal{H})$ and $g, g' \in \mathcal{H}$.

7.1. Preliminary properties.
A first noticeable feature, which closes the problem left open as one initial motivation in Section 5, is the $\widehat{d}_w$-convergence of the finite-dimensional Krylov subspace to the corresponding closed Krylov subspace.

Lemma 7.1.
Let $A \in \mathcal{B}(\mathcal{H})$ and $g \in \mathcal{H}$ for an infinite-dimensional, separable Hilbert space $\mathcal{H}$. Then
\[ \widehat{d}_w\big( \mathcal{K}_N(A,g), \overline{\mathcal{K}(A,g)} \big) \xrightarrow{\;N\to\infty\;} 0 , \tag{7.1} \]
with the two spaces defined in (5.1) and (1.1), respectively.

This means that the $\widehat{d}_w$-metric provides the appropriate language to measure the distance between $\mathcal{K}_N(A,g)$ and $\overline{\mathcal{K}(A,g)}$ in an informative way: $\mathcal{K}_N(A,g) \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$, whereas we saw that it is false in general that $\mathcal{K}_N(A,g) \xrightarrow{\widehat{d}} \overline{\mathcal{K}(A,g)}$.

For the simple proof of this fact, and for later purposes, it is convenient to work out the construction stated in Lemma 7.2 below.
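As a side illustration of the contrast just described, here is a minimal numerical sketch in Python. It rests on assumptions that are not part of the text: the space $\ell^2(\mathbb{N})$ is truncated to a finite dimension `D`, the operator is the right shift with $g = e_1$ (so that $\mathcal{K}_N = \mathrm{span}\{e_1,\dots,e_N\}$ and the closed Krylov subspace is the whole space), and the weak norm is realised as $\|x\|_w = \sum_n 2^{-n} |\langle e_n, x\rangle|$, one admissible choice that metrises the weak topology on the unit ball but not necessarily the one fixed in Section 5. All function names are hypothetical. Choosing $v = P_N u$ (the orthogonal projection of $u$ onto $\mathcal{K}_N$) in the infimum gives the upper bound $d_w(B_{\mathcal{K}}, B_{\mathcal{K}_N}) \leq \sup_{\|u\|\leq 1} \|u - P_N u\|_w = \big(\sum_{n>N} 4^{-n}\big)^{1/2}$, which decays like $2^{-N}$, whereas the norm distance of $e_{N+1}$ from $\mathcal{K}_N$ stays equal to 1.

```python
import math

D = 60  # finite truncation of l^2(N); an assumption of this sketch


def weak_norm(x):
    """Assumed weak norm: sum over n >= 1 of 2^{-n} |x_n| (0-based list index n-1)."""
    return sum(0.5 ** (n + 1) * abs(c) for n, c in enumerate(x))


def project_K_N(u, N):
    """Orthogonal projection onto K_N = span{e_1, ..., e_N}."""
    return [c if n < N else 0.0 for n, c in enumerate(u)]


def worst_case_vector(N, dim=D):
    """Unit vector maximising ||u - P_N u||_w: coordinates proportional to 2^{-n} for n > N."""
    raw = [0.0 if n < N else 0.5 ** (n + 1) for n in range(dim)]
    norm = math.sqrt(sum(c * c for c in raw))
    return [c / norm for c in raw]


def weak_bound(N, dim=D):
    """Closed form of sup_{||u|| <= 1} ||u - P_N u||_w, by Cauchy-Schwarz."""
    return math.sqrt(sum(0.25 ** n for n in range(N + 1, dim + 1)))


def norm_distance_to_K_N(N, dim=D):
    """Norm distance of the unit vector e_{N+1} from K_N."""
    e = [1.0 if n == N else 0.0 for n in range(dim)]  # e_{N+1} in 0-based indexing
    p = project_K_N(e, N)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e, p)))


for N in (1, 5, 10, 20):
    u = worst_case_vector(N)
    residual = [a - b for a, b in zip(u, project_K_N(u, N))]
    # weak upper bound (attained, closed form) versus norm distance
    print(N, weak_norm(residual), weak_bound(N), norm_distance_to_K_N(N))
```

The printed weak quantities shrink geometrically with `N`, while the last column does not move; the sketch only bounds the one-sided weak distance from above and makes no claim about the exact value of $\widehat{d}_w$.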
Lemma 7.2.
Let $\mathcal{H}$ be a separable Hilbert space, and let $A \in \mathcal{B}(\mathcal{H})$, $g \in \mathcal{H}$. Set $\mathcal{K} := \overline{\mathcal{K}(A,g)}$.
(i) For every $x \in B_{\mathcal{K}}$ there exists a sequence $(u_n)_{n\in\mathbb{N}}$ in $\mathcal{K}(A,g)$ such that $\|u_n\| < 1$ for all $n \in \mathbb{N}$ and $u_n \to x$ in norm as $n \to \infty$.
(ii) For every $\varepsilon > 0$ there is a cover of $B_{\mathcal{K}}$ consisting of finitely many weakly open balls $B_w(x_1,\varepsilon), \dots, B_w(x_M,\varepsilon)$ for some $x_1, \dots, x_M \in B_{\mathcal{K}}$ and $M \in \mathbb{N}$ all depending on $\varepsilon$. Moreover, each centre $x_j \in B_{\mathcal{K}}$ has an approximant $p_j(A)g$ for some polynomial $p_j$ on $\mathbb{R}$ with $\|p_j(A)g\| < 1$ and $\|p_j(A)g - x_j\| \leq \varepsilon$ for all $j \in \{1,\dots,M\}$.

Proof. As $x \in \overline{\mathcal{K}(A,g)}$, then $\widetilde{u}_n \to x$ in norm for some sequence $(\widetilde{u}_n)_{n\in\mathbb{N}}$ in $\mathcal{K}(A,g)$. In particular, $\|\widetilde{u}_n\| \to \|x\|$, and it is not restrictive to assume $\|\widetilde{u}_n\| > 0$ for all $n$. Therefore,
\[ u_n := \frac{n-1}{n}\, \frac{\|x\|}{\|\widetilde{u}_n\|}\, \widetilde{u}_n \in \mathcal{K}(A,g) , \qquad \|u_n\| = \frac{n-1}{n}\, \|x\| < 1 , \]
and obviously $u_n \to x$ in norm as $n \to \infty$. This proves part (i). Concerning part (ii), the existence of such a cover follows from the compactness of $(B_{\mathcal{H}}, \varrho_w)$ and the closedness of $B_{\mathcal{K}}$ in $B_{\mathcal{H}}$. The approximants $p_j(A)g$ are then found based on part (i). □

Proof of Lemma 7.1.
Let us use the shorthand $\mathcal{K}_N \equiv \mathcal{K}_N(A,g)$ and $\mathcal{K} \equiv \overline{\mathcal{K}(A,g)}$. As $\mathcal{K}_N \subset \mathcal{K}$, then $d_w(\mathcal{K}_N, \mathcal{K}) = 0$ (see (5.11), Lemma 5.3 above), so only the limit $d_w(\mathcal{K}, \mathcal{K}_N) \equiv d_w(B_{\mathcal{K}}, B_{\mathcal{K}_N}) \to 0$ is to be checked.

For arbitrary $\varepsilon > 0$ consider the $\varepsilon$-cover of $B_{\mathcal{K}}$ constructed in Lemma 7.2 with centres $x_1, \dots, x_M$ and Krylov approximants $p_1(A)g, \dots, p_M(A)g$. Let $N_0$ be the largest degree of the $p_j$'s, thus ensuring that $p_j(A)g \in B_{\mathcal{K}_N}$ for all $j \in \{1,\dots,M\}$ and all $N \geq N_0 + 1$.

Now consider an arbitrary integer $N \geq N_0 + 1$ and an arbitrary $u \in B_{\mathcal{K}}$. The vector $u$ clearly belongs to at least one of the balls of the finite open cover above: up to re-naming the centres, it is non-restrictive to claim that $u \in B_w(x_1, \varepsilon)$, and consider the above approximant $p_1(A)g \in B_{\mathcal{K}_N}$ of the ball's centre $x_1$. Thus, $\|u - x_1\|_w < \varepsilon$ and $\|x_1 - p_1(A)g\|_w \leq \|x_1 - p_1(A)g\| \leq \varepsilon$. Then
\[ \inf_{v \in B_{\mathcal{K}_N}} \|u - v\|_w \;\leq\; \|u - x_1\|_w + \|x_1 - p_1(A)g\|_w + \inf_{v \in B_{\mathcal{K}_N}} \|p_1(A)g - v\|_w \;<\; 2\varepsilon , \]
the last infimum vanishing because $p_1(A)g \in B_{\mathcal{K}_N}$, whence also
\[ d_w(B_{\mathcal{K}}, B_{\mathcal{K}_N}) = \sup_{u \in B_{\mathcal{K}}} \inf_{v \in B_{\mathcal{K}_N}} \|u - v\|_w \;\leq\; 2\varepsilon . \]
□

Despite the encouraging property stated in Lemma 7.1, one soon learns that sequences of (closures of) Krylov subspaces whose Krylov data $A$ and/or $g$ have good convergence properties display in general quite a diverse (including non-convergent) behaviour in the $\widehat{d}_w$-metric. This suggests that an efficient control of the $\widehat{d}_w$-convergence of Krylov subspaces is only possible under suitable restrictive assumptions.

Lemma 7.3, Example 7.4 and Example 7.5 below are meant to shed some light on this scenario. In particular, Lemma 7.3 establishes that the convergence $g_n \to g$ in $\mathcal{H}$ is sufficient to have $d_w\big(\overline{\mathcal{K}(A,g)}, \overline{\mathcal{K}(A,g_n)}\big) \to 0$.

Lemma 7.3.
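Given a separable Hilbert space $\mathcal{H}$ and $A \in \mathcal{B}(\mathcal{H})$, assume that $g_n \to g$ in norm as $n \to \infty$ for vectors $g, g_n \in \mathcal{H}$. Set $\mathcal{K}_n \equiv \overline{\mathcal{K}(A,g_n)}$ and $\mathcal{K} \equiv \overline{\mathcal{K}(A,g)}$.
(i) One has $d_w(\mathcal{K}, \mathcal{K}_n) \to 0$ as $n \to \infty$.
(ii) From the sequence $(B_{\mathcal{K}_n})_{n\in\mathbb{N}}$ extract, by compactness of $(\mathcal{C}_w(\mathcal{H}), \widehat{d}_w)$, a subsequence converging to some $B \in \mathcal{C}_w(\mathcal{H})$. Then $B_{\mathcal{K}} \subset B$.

Proof. (i) For $\varepsilon > 0$ consider the $\varepsilon$-cover of $B_{\mathcal{K}}$ constructed in Lemma 7.2 with centres $x_1, \dots, x_M$ and Krylov approximants $p_1(A)g, \dots, p_M(A)g$. In view of the finitely many conditions $\|p_j(A)g\| < 1$ and $p_j(A)g_n \to p_j(A)g$ in norm as $n \to \infty$, $j \in \{1,\dots,M\}$, there is $n_\varepsilon \in \mathbb{N}$ such that $\|p_j(A)g_n - p_j(A)g\| \leq \varepsilon$ and $\|p_j(A)g_n\| < 1$ for all $n \geq n_\varepsilon$ and all $j \in \{1,\dots,M\}$.

Take $u \in B_{\mathcal{K}}$. Up to re-naming the centres of the cover's balls, $\|u - x_1\|_w < \varepsilon$, $\|x_1 - p_1(A)g\|_w \leq \varepsilon$, $\|p_1(A)g_n - p_1(A)g\| \leq \varepsilon$, and $\|p_1(A)g_n\| < 1$ for all $n \geq n_\varepsilon$. Then, for any $n \geq n_\varepsilon$,
\[ \inf_{v \in B_{\mathcal{K}_n}} \|u - v\|_w \;\leq\; \|u - x_1\|_w + \|x_1 - p_1(A)g\|_w + \inf_{v \in B_{\mathcal{K}_n}} \|p_1(A)g - v\|_w \;\leq\; 2\varepsilon + \|p_1(A)g - p_1(A)g_n\|_w \;\leq\; 3\varepsilon , \]
whence also, for $n \geq n_\varepsilon$,
\[ d_w(\mathcal{K}, \mathcal{K}_n) \equiv d_w(B_{\mathcal{K}}, B_{\mathcal{K}_n}) = \sup_{u \in B_{\mathcal{K}}} \inf_{v \in B_{\mathcal{K}_n}} \|u - v\|_w \;\leq\; 3\varepsilon . \]
This means precisely that $d_w(\mathcal{K}, \mathcal{K}_n) \to 0$.

(ii) Denote again by $(B_{\mathcal{K}_n})_{n\in\mathbb{N}}$ the extracted subsequence, so that $B_{\mathcal{K}_n} \xrightarrow{\widehat{d}_w} B$. On account of (5.13) (Lemma 5.3),
\[ d_w(B_{\mathcal{K}}, B) \;\leq\; d_w(B_{\mathcal{K}}, B_{\mathcal{K}_n}) + d_w(B_{\mathcal{K}_n}, B) . \]
Since $d_w(B_{\mathcal{K}_n}, B) \leq \widehat{d}_w(B_{\mathcal{K}_n}, B) \to 0$, and since $d_w(B_{\mathcal{K}}, B_{\mathcal{K}_n}) \to 0$ by part (i), one concludes $d_w(B_{\mathcal{K}}, B) = 0$. Owing to (5.11) (Lemma 5.3), this implies $B_{\mathcal{K}} \subset B$. □

Example 7.4.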
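In general, the assumptions of Lemma 7.3 are not enough to guarantee that also $d_w(\mathcal{K}_n, \mathcal{K}) \to 0$, and hence that $\mathcal{K}_n \xrightarrow{\widehat{d}_w} \mathcal{K}$. Consider for instance $\mathcal{H} = \ell^2(\mathbb{N})$ and the right-shift operator $A \equiv R$, acting as $R e_k = e_{k+1}$ on the canonical basis $(e_k)_{k\in\mathbb{N}}$. As in Example 3.3, $R$ admits a dense set of cyclic vectors, as well as a dense set of non-cyclic vectors: so, with respect to the general setting of Lemma 7.3, take now $g$ to be non-cyclic, say, $g = e_2$, and $(g_n)_{n\in\mathbb{N}}$ to be a sequence of $\mathcal{H}$-norm approximants of $g$ that are all cyclic. Concerning the subspaces $\mathcal{K}_n := \overline{\mathcal{K}(R,g_n)}$ and $\mathcal{K} := \overline{\mathcal{K}(R,g)}$, one has $\mathcal{K}_n = \mathcal{H}$ for all $n \in \mathbb{N}$ by cyclicity, and $\mathcal{K} = \{e_1\}^{\perp} \subsetneq \mathcal{H}$. As $\mathcal{K} \subset \mathcal{K}_n$, then $d_w(\mathcal{K}, \mathcal{K}_n) = 0$, a conclusion consistent with Lemma 7.3, for $(\mathcal{K}_n)_{n\in\mathbb{N}}$ is obviously $\widehat{d}_w$-Cauchy and Lemma 7.3 implies $d_w(\mathcal{K}, \mathcal{K}_n) \to 0$. On the other hand,
\[ d_w(B_{\mathcal{K}_n}, B_{\mathcal{K}}) = \sup_{u \in B_{\mathcal{K}_n}} \inf_{v \in B_{\mathcal{K}}} \|u - v\|_w \;\geq\; \inf_{v \in B_{\mathcal{K}}} \|e_1 - v\|_w \;>\; 0 , \]
which prevents $d_w(\mathcal{K}_n, \mathcal{K})$ from vanishing as $n \to \infty$.

Example 7.5.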
In general, with respect to the setting of Lemma 7.3 and Example 7.4, the sole convergence $g_n \to g$ in norm is not enough to guarantee that $(\mathcal{K}_n)_{n\in\mathbb{N}}$ be $\widehat{d}_w$-Cauchy. For, again with the right-shift $R$ on $\mathcal{H} = \ell^2(\mathbb{N})$, take now a sequence $(\widetilde{g}_n)_{n\in\mathbb{N}}$ of cyclic vectors for $R$ such that $\widetilde{g}_n \to e_2$ in norm, and set
\[ g_n := \begin{cases} \widetilde{g}_n & \text{for even } n , \\ e_2 & \text{for odd } n . \end{cases} \]
Thus, $g_n \to g := e_2$ in norm. For even $n$, $\mathcal{K}_n := \overline{\mathcal{K}(R,g_n)} = \mathcal{H}$ and $\mathcal{K}_{n+1} = \{e_1\}^{\perp}$, whence
\[ d_w(B_{\mathcal{K}_n}, B_{\mathcal{K}_{n+1}}) = \sup_{u \in B_{\mathcal{K}_n}} \inf_{v \in B_{\mathcal{K}_{n+1}}} \|u - v\|_w \;\geq\; \inf_{v \in B_{\mathcal{K}_{n+1}}} \|e_1 - v\|_w \;>\; 0 , \]
which prevents $(\mathcal{K}_m)_{m\in\mathbb{N}}$ from being $\widehat{d}_w$-Cauchy.

7.2. Existence of $\widehat{d}_w$-limits. Krylov inner approximability.

Based on the examples discussed above, one is to expect a variety of sufficient conditions ensuring the convergence of a sequence of (closures of) Krylov subspaces to a (closure of) Krylov subspace. In this Subsection we discuss one mechanism of convergence that is meaningful in our context of Krylov perturbations.
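Returning briefly to Examples 7.4 and 7.5, the strictly positive lower bound $\inf_{v \in B_{\mathcal{K}}} \|e_1 - v\|_w > 0$ that obstructs convergence can be made explicit in a minimal sketch. The assumptions, none of which come from the text, are: the non-cyclic datum is read as $g = e_2$, so that the closed Krylov subspace of the right shift is $\mathcal{K} = \{e_1\}^{\perp}$; the weak norm is realised as $\|x\|_w = \sum_n 2^{-n} |\langle e_n, x\rangle|$ (a hypothetical concrete choice); vectors are real and truncated to finitely many coordinates. Under these assumptions every $v \in B_{\mathcal{K}}$ has vanishing first coordinate, hence $\|e_1 - v\|_w = \tfrac12 + \sum_{n \geq 2} 2^{-n} |v_n| \geq \tfrac12$, with equality at $v = 0$. The Python code below samples points of $B_{\mathcal{K}}$ and checks this bound; the function names are illustrative only.

```python
import math
import random


def weak_norm(x):
    """Assumed weak norm: sum over n >= 1 of 2^{-n} |x_n| (0-based list index n-1)."""
    return sum(0.5 ** (n + 1) * abs(c) for n, c in enumerate(x))


def random_point_in_ball_of_K(dim, rng):
    """Random real vector with norm below 1 and first coordinate 0, i.e. a point of B_K."""
    raw = [0.0] + [rng.uniform(-1.0, 1.0) for _ in range(dim - 1)]
    norm = math.sqrt(sum(c * c for c in raw))
    scale = rng.random() / norm if norm > 0 else 0.0
    return [scale * c for c in raw]


def weak_distance_from_e1(v):
    """Weak norm of e_1 - v."""
    diff = [1.0 - v[0]] + [-c for c in v[1:]]
    return weak_norm(diff)


rng = random.Random(0)
dim = 40  # truncation dimension; an assumption of this sketch
samples = [random_point_in_ball_of_K(dim, rng) for _ in range(1000)]
smallest = min(weak_distance_from_e1(v) for v in samples)
at_zero = weak_distance_from_e1([0.0] * dim)
print(at_zero, smallest)
```

The value at $v = 0$ equals $\tfrac12$ and no sampled point does better, in line with the closed-form bound; sampling is of course only a sanity check, the inequality itself being elementary.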
Proposition 7.6.
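Let $\mathcal{H}$ be a separable Hilbert space, $A \in \mathcal{B}(\mathcal{H})$, and $g \in \mathcal{H}$. Assume further that there is a sequence $(g_n)_{n\in\mathbb{N}}$ such that
\[ g_n \in \overline{\mathcal{K}(A,g)} \;\; \forall n \in \mathbb{N} \qquad \text{and} \qquad g_n \to g \text{ in norm as } n \to \infty . \tag{7.2} \]
Then $\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$.

Proof. Let us use the shorthand $\mathcal{K}_n \equiv \overline{\mathcal{K}(A,g_n)}$, $\mathcal{K} \equiv \overline{\mathcal{K}(A,g)}$. As $\mathcal{K}_n \subset \mathcal{K}$, then $d_w(\mathcal{K}_n, \mathcal{K}) = 0$. As $g_n \to g$ in $\mathcal{H}$, then $d_w(\mathcal{K}, \mathcal{K}_n) \to 0$ by Lemma 7.3(i). Therefore, $\mathcal{K}_n \xrightarrow{\widehat{d}_w} \mathcal{K}$. □

The above convergence $\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$, in view of condition (7.2), expresses the “inner approximability” of $\overline{\mathcal{K}(A,g)}$. In fact, $\overline{\mathcal{K}(A,g_n)} \subset \overline{\mathcal{K}(A,g)}$. Condition (7.2) includes also the case of approximants $g_n$ from $\mathcal{K}(A,g)$ or also from $\mathcal{K}_n(A,g)$ (the $n$-th order Krylov subspace (5.1)). For instance, set
\[ g_n := \sum_{k=0}^{n-1} \frac{1}{n^k\, \|A^k\|_{\mathrm{op}}}\, A^k g \in \mathcal{K}_n(A,g) , \qquad n \in \mathbb{N} , \]
and as
\[ \|g - g_n\| \;\leq\; \sum_{k=1}^{n-1} \frac{\|A^k g\|}{n^k\, \|A^k\|_{\mathrm{op}}} \;\leq\; \|g\| \sum_{k=1}^{n-1} \frac{1}{n^k} \;\leq\; \frac{2\|g\|}{n} , \]
then $\overline{\mathcal{K}(A,g)} \supset \mathcal{K}(A,g) \supset \mathcal{K}_n(A,g) \ni g_n \to g$ in $\mathcal{H}$.

7.3. Krylov solvability along $\widehat{d}_w$-limits.

Let us finally scratch the surface of a very central question for the present investigation, namely how a perturbation of a given inverse linear problem that is small in the $\widehat{d}_w$-sense for the corresponding Krylov subspaces affects the Krylov solvability.

Far from answering in general, we have at least the tools to control the following class of cases. The proof is fast, but it relies on two non-trivial toolboxes.

Proposition 7.7.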
Let $\mathcal{H}$ be a separable Hilbert space. The following be given:
• an operator $A \in \mathcal{B}(\mathcal{H})$ with inverse $A^{-1} \in \mathcal{B}(\mathcal{H})$;
• a sequence $(g_n)_{n\in\mathbb{N}}$ in $\mathcal{H}$ such that for each $n$ the (unique) solution $f_n := A^{-1} g_n$ to the inverse problem $A f_n = g_n$ is a Krylov solution;
• a vector $g \in \mathcal{H}$ such that $\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$ as $n \to \infty$.
Then the (unique) solution $f := A^{-1} g$ to the inverse problem $A f = g$ is a Krylov solution. If in addition $g_n \to g$, respectively $g_n \rightharpoonup g$, then $f_n \to f$, respectively $f_n \rightharpoonup f$.
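Proof. As $A$ is a bounded bijection of $\mathcal{H}$ with bounded inverse, $A$ is a strongly continuous and closed $\mathcal{H} \to \mathcal{H}$ (linear) map, and therefore it is also weakly continuous and weakly closed. Up to a non-restrictive scaling one may assume that $\|A\|_{\mathrm{op}} \leq 1$, so that $A$ maps $B_{\mathcal{H}}$ into itself. The conditions of Theorem 5.1(v) are therefore matched. Thus, from $\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$ one deduces $A\,\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} A\,\overline{\mathcal{K}(A,g)}$.

On the other hand, based on a result that we proved in [5, Prop. 3.2(ii)], the assumption that $f_n \in \overline{\mathcal{K}(A,g_n)}$ is equivalent to $A\,\overline{\mathcal{K}(A,g_n)} = \overline{\mathcal{K}(A,g_n)}$. Thus, $\overline{\mathcal{K}(A,g_n)} = A\,\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} A\,\overline{\mathcal{K}(A,g)}$. The $\widehat{d}_w$-limit being unique, $A\,\overline{\mathcal{K}(A,g)} = \overline{\mathcal{K}(A,g)}$. Then, again on account of [5, Prop. 3.2(ii)], $f \in \overline{\mathcal{K}(A,g)}$. This proves the main statement; the additional convergences of $(f_n)_{n\in\mathbb{N}}$ to $f$ follow from the boundedness of $A^{-1}$. □

Remark 7.8.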
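It is worth stressing that the control of the perturbation in Proposition 7.7, namely the assumption $\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$, does not necessarily correspond to some $\mathcal{H}$-norm vicinity between $g_n$ and $g$ (in Proposition 7.6, instead, we had discussed a case where $\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$ is a consequence of $g_n \to g$ in $\mathcal{H}$). The following example elucidates the situation. With respect to the general setting of Proposition 7.7, consider $\mathcal{H} = \ell^2(\mathbb{N})$, $A = \mathbb{1}$, $g = 0$, $g_n = e_n$ (the $n$-th canonical basis vector), and hence
\[ \mathcal{K}_n := \overline{\mathcal{K}(A,g_n)} = \mathrm{span}\{e_n\} , \qquad \mathcal{K} := \overline{\mathcal{K}(A,g)} = \{0\} . \]
Obviously $B_{\mathcal{K}} \subset B_{\mathcal{K}_n}$, whence $d_w(\mathcal{K}, \mathcal{K}_n) = 0$, on account of (5.11) and (6.1). On the other hand, a generic $u \in B_{\mathcal{K}_n}$ has the form $u = \alpha e_n$ for some $|\alpha| \leq 1$; therefore,
\[ d_w(\mathcal{K}_n, \mathcal{K}) = \sup_{u \in B_{\mathcal{K}_n}} \inf_{v \in B_{\mathcal{K}}} \|u - v\|_w = \sup_{u \in B_{\mathcal{K}_n}} \|u\|_w \;\leq\; \|e_n\|_w \xrightarrow{\;n\to\infty\;} 0 . \]
This shows that $\overline{\mathcal{K}(A,g_n)} \xrightarrow{\widehat{d}_w} \overline{\mathcal{K}(A,g)}$. Thus, all assumptions of Proposition 7.7 are matched. However, it is false that $g_n$ converges to $g$ in norm: in this case it is only true that $g_n \rightharpoonup g$ (weakly in $\mathcal{H}$), indeed $e_n \rightharpoonup 0$.

8. Conclusions and perspectives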
In retrospect, a few concluding observations are in order.

We have already elaborated in the opening Section 1 that the main perspective of this kind of investigation is to regard a perturbed inverse problem as a potentially “easier” source of information, including Krylov solvability, for the original, unperturbed problem, and conversely to understand when a given inverse problem loses Krylov solvability under small perturbations, which in practice would correspond to uncertainties of various sorts, thus making Krylov subspace methods potentially unstable.

The evidence from Section 3 is that a controlled vicinity of the perturbed operator or the perturbed datum is not sufficient, alone, to decide on the above questions, for Krylov solvability may well persist, disappear, or appear in the limit when the perturbation is removed. And the idea inspiring Section 4 is that constraining the perturbation within certain classes of operators may provide the additional information needed. Thus, a first plausible research programme is to investigate what classes of operators undergo perturbations that make Krylov solvability stable.

The attempt we then made in Sections 5-7 is to encode the inverse problem perturbation into a convenient topology that allows one to predict whether Krylov solvability persists or is washed out. On a conceptual footing this is the appropriate approach, because we know from our previous investigation [5] that Krylov solvability is essentially a structural property of the Krylov subspace $\overline{\mathcal{K}(A,g)}$, therefore it is natural to compare Krylov subspaces in a meaningful sense. The weak gap metric for linear subspaces of $\mathcal{H}$, while being encouraging in many respects ($\mathcal{K}_N \to \mathcal{K}$, inner approximability, stability under perturbations in the sense of Proposition 7.7), suffers various limitations that need to be further understood (indirectly due to the lack of completeness of the $\widehat{d}_w$-metric out of the Hilbert closed unit ball, in turn due to the lack of metrisability of the weak topology out of the unit ball). It is plausible to expect, and so is our next commitment, that the informative control of the inverse problem perturbation, as far as Krylov solvability is concerned, is a combination of an efficient distance between Krylov subspaces, vicinity of operators and of data, and restriction to classes of distinguished operators.

At this stage, this preliminary investigation completes a first cycle of study on abstract inverse linear problems, their finite-dimensional truncations and approximations, their Krylov solvability in the bounded and unbounded case, and the stability of Krylov solvability under perturbations, that we developed in our previous recent works [5, 6, 4, 3] and in the present one.

References
[1]
N. I. Akhiezer and I. M. Glazman, Theory of linear operators in Hilbert space, Dover Publications, Inc., New York, 1993. Translated from the Russian and with a preface by Merlynd Nestell, Reprint of the 1961 and 1963 translations, Two volumes bound as one.
[2] H. Brezis, Functional analysis, Sobolev spaces and partial differential equations, Universitext, Springer, New York, 2011.
[3] N. A. Caruso and A. Michelangeli, Krylov Solvability of Unbounded Inverse Linear Problems, Integral Equations Operator Theory, 93 (2021), Paper No. 1.
[4] N. A. Caruso and A. Michelangeli, Convergence of the conjugate gradient method with unbounded operators, arXiv:1908.10110 (2019).
[5] N. A. Caruso, A. Michelangeli, and P. Novati, On Krylov solutions to infinite-dimensional inverse linear problems, Calcolo, 56 (2019), p. 32.
[6] N. A. Caruso, A. Michelangeli, and P. Novati, On general projection methods and convergence behaviours for abstract linear inverse problems, arXiv:1811.08195 (2018).
[7] J. W. Daniel, The conjugate gradient method for linear and nonlinear operator equations, SIAM J. Numer. Anal., 4 (1967), pp. 10–26.
[8] X. Du, M. Sarkis, C. E. Schaerer, and D. B. Szyld, Inexact and truncated Parareal-in-time Krylov subspace methods for parabolic optimal control problems, Electron. Trans. Numer. Anal., 40 (2013), pp. 36–57.
[9] A. Ern and J.-L. Guermond, Theory and practice of finite elements, vol. 159 of Applied Mathematical Sciences, Springer-Verlag, New York, 2004.
[10] L. Gehér, Cyclic vectors of a cyclic operator span the space, Proc. Amer. Math. Soc., 33 (1972), pp. 109–110.
[11] I. C. Gohberg and A. S. Markus, Two theorems on the opening between subspaces of Banach space, Uspekhi Mat. Nauk., 5(89) (1959), pp. 135–140.
[12] A. K. Gupta and S. Mukherjee, On Hausdorff Metric Spaces, arXiv:1909.07195 (2019).
[13] P. R. Halmos, A Hilbert space problem book, vol. 19 of Graduate Texts in Mathematics, Springer-Verlag, New York-Berlin, second ed., 1982. Encyclopedia of Mathematics and its Applications, 17.
[14] M. Hanke, Conjugate gradient type methods for ill-posed problems, vol. 327 of Pitman Research Notes in Mathematics Series, Longman Scientific & Technical, Harlow, 1995.
[15] P. C. Hansen, Rank-deficient and discrete ill-posed problems, SIAM Monographs on Mathematical Modeling and Computation, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1998. Numerical aspects of linear inversion.
[16] J. Henrikson, Completeness and total boundedness of the Hausdorff metric, MIT Undergrad J. Math., 1 (1999), pp. 69–80.
[17] D. A. Herrero, Eigenvectors and cyclic vectors for bilateral weighted shifts, Rev. Un. Mat. Argentina, 26 (1972/73), pp. 24–41.
[18] R. Herzog and E. Sachs, Superlinear convergence of Krylov subspace methods for self-adjoint problems in Hilbert space, SIAM J. Numer. Anal., 53 (2015), pp. 1304–1324.
[19] W. J. Kammerer and M. Z. Nashed, On the convergence of the conjugate gradient method for singular linear operator equations, SIAM J. Numer. Anal., 9 (1972), pp. 165–181.
[20] W. Karush, Convergence of a method of solving linear problems, Proc. Amer. Math. Soc., 3 (1952), pp. 839–851.
[21] T. Kato, Perturbation theory for linear operators, Classics in Mathematics, Springer-Verlag, Berlin, 1995. Reprint of the 1980 edition.
[22] M. G. Kreĭn and M. A. Krasnosel'skiĭ, Fundamental theorems on the extension of Hermitian operators and certain of their applications to the theory of orthogonal polynomials and the problem of moments, Uspehi Matem. Nauk (N. S.), 2 (1947), pp. 60–106.
[23] M. G. Kreĭn, M. A. Krasnosel'skiĭ, and D. Mil'man, Concerning the deficiency numbers of linear operators in Banach space and some geometric questions, Sbornik Trudov Instit. Mat. Akad. Nauk. Ukr. S.S.R., (1948), pp. 97–112.
[24] J. Liesen and Z. Strakoš, Krylov subspace methods, Numerical Mathematics and Scientific Computation, Oxford University Press, Oxford, 2013. Principles and analysis.
[25] J. R. Munkres, Topology, Prentice Hall, Inc., Upper Saddle River, NJ, 2000.
[26] A. S. Nemirovskiy and B. T. Polyak, Iterative methods for solving linear ill-posed problems under precise information. I, Izv. Akad. Nauk SSSR Tekhn. Kibernet., (1984), pp. 13–25, 203.
[27] A. S. Nemirovskiy and B. T. Polyak, Iterative methods for solving linear ill-posed problems under precise information. II, Engineering Cybernetics, 22 (1984), pp. 50–57.
[28] A. Quarteroni, Numerical models for differential problems, vol. 16 of MS&A. Modeling, Simulation and Applications, Springer, Cham, 2017. Third edition.
[29] Y. Saad, Iterative methods for sparse linear systems, Society for Industrial and Applied Mathematics, Philadelphia, PA, second ed., 2003.
[30] S. Shkarin, A weighted bilateral shift with cyclic square is supercyclic, Bull. Lond. Math. Soc., 39 (2007), pp. 1029–1038.
[31] J. A. Sifuentes, M. Embree, and R. B. Morgan, GMRES Convergence for Perturbed Coefficient Matrices, with Application to Approximate Deflation Preconditioning, SIAM Journal on Matrix Analysis and Applications, 34 (2013), pp. 1066–1088.
[32] V. Simoncini and D. B. Szyld, Theory of Inexact Krylov Subspace Methods and Applications to Scientific Computing, SIAM Journal on Scientific Computing, 25 (2003), pp. 454–477.
[33] V. Simoncini and D. B. Szyld, On the Occurrence of Superlinear Convergence of Exact and Inexact Krylov Subspace Methods, SIAM Review, 47 (2005), pp. 247–272.
[34] A. A. Tuzhilin, Lectures on Hausdorff and Gromov-Hausdorff Distance Geometry, arXiv:2012.00756 (2020).
[35] J. van den Eshof, G. L. Sleijpen, and M. B. van Gijzen, Relaxation strategies for nested Krylov methods, Journal of Computational and Applied Mathematics, 177 (2005), pp. 347–365.
[36] R. Winther, Some superlinear convergence results for the conjugate gradient method, SIAM J. Numer. Anal., 17 (1980), pp. 14–17.
[37] F. Xue and H. C. Elman, Fast inexact subspace iteration for generalized eigenvalue problems with spectral transformation, Linear Algebra and its Applications, 435 (2011), pp. 601–622. Special Issue: Dedication to Pete Stewart on the occasion of his 70th birthday.
[38] J.-P. M. Zemke, Abstract perturbed Krylov methods, Linear Algebra and its Applications, 424 (2007), pp. 405–434.

(N. A. Caruso)
Gran Sasso Science Institute, Viale F. Crispi 7, I-67100 L’Aquila (ITALY)
Email address: [email protected]

(A. Michelangeli) Institute for Applied Mathematics, and Hausdorff Center of Mathematics, University of Bonn, Endenicher Allee 60, D-53115 Bonn (GERMANY).
Email address ::