An octagon containing the numerical range of a bounded linear operator
A. Melman
Department of Applied Mathematics
School of Engineering, Santa Clara University
Santa Clara, CA 95053
e-mail: [email protected]
Abstract
A polygon is derived that contains the numerical range of a bounded linear operator on a complex Hilbert space, using only norms. In its most general form, the polygon is an octagon, symmetric with respect to the origin, and tangent to the closure of the numerical range in at least four points when the spectral norm is used.
Key words: linear operator, numerical range, field of values, polynomial eigenvalue, bounds
AMS(MOS) subject classification :
The numerical range of $T \in B(H)$, the algebra of bounded linear operators on a complex Hilbert space $H$ equipped with the inner product $\langle \cdot , \cdot \rangle$, is the subset of $\mathbb{C}$ defined by
$$W(T) = \{ \langle Tu, u \rangle : u \in H, \ \|u\| = 1 \} ,$$
where $\|u\|^2 = \langle u, u \rangle$. Also referred to as the field of values, it plays an important role in several fields of mathematics and engineering. By the Toeplitz-Hausdorff theorem, $W(T)$ is a convex set. A related quantity, the numerical radius, is defined as
$$w(T) = \sup_{\|u\|=1} | \langle Tu, u \rangle | .$$

The numerical range can be enclosed by a polygonal envelope (for matrices, but easily generalized to bounded operators; see [3, Section 1.5]), although this requires the computation of eigenvalues and corresponding eigenvectors, which is impractical when matrix sizes are large or when the matrix is only implicitly defined.

Our purpose is to enclose the numerical range in an easily computable region (in its most general form an octagon) using only norms and avoiding the computation of spectral or spectral-related quantities. On the one hand, this leads to a cruder approximation than could be obtained by using spectral information, but on the other, it is faster and much simpler. Whether accuracy or computational simplicity is preferable depends on the application, but such matters are beyond our scope here.

To begin, we briefly review a few basic properties of $B(H)$ and the numerical range, as can be found in any standard text on these subjects (e.g., [1], [2], [6]). We denote by $T^*$ the adjoint of $T \in B(H)$, defined by $\langle Tu, u \rangle = \langle u, T^* u \rangle$, $u \in H$. An operator $T$ is self-adjoint if $T = T^*$. The Cartesian decomposition of $T \in B(H)$ is given by $T = T_H + iT_S$, where $T_H$ and $T_S$ are self-adjoint bounded operators defined as
$$T_H = \frac{1}{2} ( T + T^* ) \quad \text{and} \quad T_S = \frac{1}{2i} ( T - T^* ) .$$
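For matrices, the Cartesian decomposition is straightforward to compute. The following is a minimal sketch, assuming NumPy is available; the function name `cartesian_decomposition`, the 2 × 2 matrix `T`, and the unit vector `u` are hypothetical choices made only for illustration and do not come from the paper.

```python
import numpy as np

def cartesian_decomposition(T):
    """Return (T_H, T_S) such that T = T_H + 1j * T_S with both parts Hermitian."""
    T = np.asarray(T, dtype=complex)
    T_star = T.conj().T                # adjoint (conjugate transpose)
    T_H = (T + T_star) / 2             # T_H = (T + T*) / 2
    T_S = (T - T_star) / (2j)          # T_S = (T - T*) / (2i)
    return T_H, T_S

# Hypothetical example matrix (an assumption, not taken from the paper).
T = np.array([[1 + 2j, 3], [1j, -1]])
T_H, T_S = cartesian_decomposition(T)

# One point <Tu, u> of the numerical range W(T) for a unit vector u.
u = np.array([3, 4j]) / 5
point = np.vdot(u, T @ u)              # np.vdot conjugates its first argument
```

Here `point` is a single element of $W(T)$; sampling many unit vectors `u` gives a rough picture of the set.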
It follows from this decomposition that
$$\langle Tu, u \rangle = \langle T_H u, u \rangle + i \langle T_S u, u \rangle , \quad u \in H ,$$
with $\langle T_H u, u \rangle , \langle T_S u, u \rangle \in \mathbb{R}$. The spectral norm of $T \in B(H)$ is defined as $\|T\|_\sigma = \sup_{\|u\|=1} \|Tu\|$. There exist several upper bounds for $w(T)$ expressed in terms of the spectral norm. First, as an immediate consequence of the definition of $w(T)$, one has the standard bound
$$w(T) \le \|T\|_\sigma . \tag{1}$$
However, this bound is not necessarily satisfied when the norm differs from the spectral norm: a finite-dimensional counterexample is provided by a $2 \times 2$ matrix $M$ (an operator on $\mathbb{C}^2$) with the matrix 1-norm and a unit vector $v$ for which $| v^* M v | > \|M\|_1$ (the numerical entries of $M$ and $v$ are not legible in this copy).

Two recent improvements of the bound in (1) are the following:
$$w(T) \le \frac{1}{2} \left( \|T\|_\sigma + \|T^2\|_\sigma^{1/2} \right) \quad \text{from [4]}, \tag{2}$$
$$w(T) \le \left( \frac{1}{2} \| T^* T + T T^* \|_\sigma \right)^{1/2} \quad \text{from [5]}. \tag{3}$$
When $T \in B(H)$ is self-adjoint, then $w(T) = \|T\|_\sigma$, $w(T) \le \|T\|$ for any norm, and $\overline{W(T)} \subseteq [ -\|T\|_\sigma , \|T\|_\sigma ]$, where $\overline{W(T)}$ is the closure of $W(T)$. Throughout, we denote the real and imaginary parts of a complex number $z$ by $\Re z$ and $\Im z$, respectively.

We now derive the enclosing octagon mentioned earlier. Theorem 2.1 below forms the basis for the construction of a polygon containing the numerical range.
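The bounds (1)–(3), and the failure of (1) for the matrix 1-norm, can be checked numerically for small matrices. The sketch below assumes NumPy. It estimates $w(T)$ from below using the standard identity $w(T) = \max_\theta \lambda_{\max}\big( \tfrac{1}{2} ( e^{i\theta} T + e^{-i\theta} T^* ) \big)$ on a finite grid of angles. The matrix `M` and vector `u` are illustrative choices that violate $w(M) \le \|M\|_1$; they are not necessarily those of the original counterexample, and all function names are hypothetical.

```python
import numpy as np

def spectral_norm(T):
    """Largest singular value of T."""
    return np.linalg.norm(T, 2)

def numerical_radius_estimate(T, n_angles=720):
    """Lower estimate of w(T) obtained by sampling rotation angles theta."""
    T = np.asarray(T, dtype=complex)
    best = 0.0
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        R = np.exp(1j * theta) * T
        H = (R + R.conj().T) / 2                      # Hermitian part of e^{i theta} T
        best = max(best, np.linalg.eigvalsh(H)[-1])   # eigvalsh sorts ascending
    return best

def bound_1(T):
    return spectral_norm(T)

def bound_2(T):
    return 0.5 * (spectral_norm(T) + np.sqrt(spectral_norm(T @ T)))

def bound_3(T):
    T_star = T.conj().T
    return np.sqrt(0.5 * spectral_norm(T_star @ T + T @ T_star))

# Illustrative matrix and unit vector (assumptions, not from the paper).
M = np.array([[2.0, 2.0], [0.0, 0.0]])
u = np.array([np.sqrt(3) / 2, 0.5])

w = numerical_radius_estimate(M)
b1, b2, b3 = bound_1(M), bound_2(M), bound_3(M)

quad = np.vdot(u, M @ u)            # <Mu, u>, a point of W(M)
one_norm = np.linalg.norm(M, 1)     # matrix 1-norm (maximum column sum)
print(f"w(M) ~ {w:.4f}, bounds (1)-(3): {b1:.4f}, {b2:.4f}, {b3:.4f}")
print(f"|<Mu,u>| = {abs(quad):.4f} exceeds the 1-norm {one_norm:.4f}")
```

For this particular `M`, all three spectral-norm bounds hold, while the 1-norm underestimates the numerical radius.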
Theorem 2.1.
Let $T \in B(H)$ have the Cartesian decomposition $T = T_H + iT_S$, let $T \neq cQ$ for any $c \in \mathbb{C}$ and $Q \in B(H)$ with $Q = Q^*$, and let $\alpha, \beta, \gamma, \delta > 0$. For any norm, define the rectangle $R(T)$ and the parallelogram $P_{\alpha\beta\gamma\delta}(T)$, both centered at the origin in the complex plane, by
$$R(T) = \left\{ x + iy : x, y \in \mathbb{R}, \ |x| \le \|T_H\| \ \text{and} \ |y| \le \|T_S\| \right\}$$
and
$$P_{\alpha\beta\gamma\delta}(T) = \left\{ x + iy : x, y \in \mathbb{R}, \ | \alpha x + \beta y | \le \| \alpha T_H + \beta T_S \| \ \text{and} \ | \gamma x - \delta y | \le \| \gamma T_H - \delta T_S \| \right\} . \tag{4}$$
Then the following holds.
(1) $\overline{W(T)} \subseteq R(T) \cap P_{\alpha\beta\gamma\delta}(T)$.
(2) The corner points of the rectangle $R(T)$ either lie outside $R(T) \cap P_{\alpha\beta\gamma\delta}(T)$ or on the boundary of this intersection.
(3) If the spectral norm is used to construct $R(T)$ and $P_{\alpha\beta\gamma\delta}(T)$, then each side or its opposing side of $R(T)$ and $P_{\alpha\beta\gamma\delta}(T)$ is tangent to $\overline{W(T)}$, where the disjunction is inclusive.
If $T = cQ$ for $c \in \mathbb{C}$ and $Q \in B(H)$ with $Q = Q^*$, then $W(T)$ is contained in the closed line segment determined by the endpoints $\pm c \|Q\|_\sigma$, at least one of which is a boundary point of $W(T)$ if the norm is the spectral norm.

Proof. Consider $T \in B(H)$ that is not a complex multiple of a self-adjoint operator. Since $\langle Tu, u \rangle = \langle T_H u, u \rangle + i \langle T_S u, u \rangle$ and $T_H$ and $T_S$ are self-adjoint, we have that
$$| \Re \langle Tu, u \rangle | = | \langle T_H u, u \rangle | \le \|T_H\| \quad \text{and} \quad | \Im \langle Tu, u \rangle | = | \langle T_S u, u \rangle | \le \|T_S\| ,$$
so that $\langle Tu, u \rangle \in R(T)$, and, since $R(T)$ is closed, the limit points of any sequence $\{ \langle Tu_n, u_n \rangle \}$, $\|u_n\| = 1$, also lie in $R(T)$. Moreover, for any $\alpha, \beta \in \mathbb{R}$,
$$\alpha \langle T_H u, u \rangle + \beta \langle T_S u, u \rangle = \langle ( \alpha T_H + \beta T_S ) u, u \rangle \implies | \alpha \langle T_H u, u \rangle + \beta \langle T_S u, u \rangle | \le \| \alpha T_H + \beta T_S \| ,$$
which is equivalent to
$$| \alpha \Re \langle Tu, u \rangle + \beta \Im \langle Tu, u \rangle | \le \| \alpha T_H + \beta T_S \| .$$
The second inequality in (4) follows analogously for any $\gamma, \delta \in \mathbb{R}$. When $\alpha, \beta, \gamma, \delta > 0$, the inequalities in (4) define the closed parallelogram $P_{\alpha\beta\gamma\delta}(T)$ centered at the origin and bounded by the lines $L_j$ ($j = 1, 2, 3, 4$), defined, after the usual identification of $\mathbb{C}$ with $\mathbb{R}^2$, by
$$L_1 : \alpha x + \beta y = \| \alpha T_H + \beta T_S \| , \qquad L_2 : \alpha x + \beta y = - \| \alpha T_H + \beta T_S \| ,$$
$$L_3 : \gamma x - \delta y = \| \gamma T_H - \delta T_S \| , \qquad L_4 : \gamma x - \delta y = - \| \gamma T_H - \delta T_S \| ,$$
as illustrated in Figure 1. The lines $L_j$ define a nondegenerate parallelogram because their right-hand sides never vanish, as the latter would imply that $T_H$ and $T_S$ are multiples of each other, and this was explicitly excluded by the condition that $T \neq cQ$ for a self-adjoint operator $Q$. Consequently, $\langle Tu, u \rangle$ and the limit points of any sequence $\{ \langle Tu_n, u_n \rangle \}$, $\|u_n\| = 1$, lie in $P_{\alpha\beta\gamma\delta}(T)$ as well, and the first part of the theorem follows.

We prove the second part for the upper and lower right-hand corners of $R(T)$, as the result then follows for the remaining corner points from the symmetry with respect to the origin of both $R(T)$ and $P_{\alpha\beta\gamma\delta}(T)$. For the upper right-hand corner $( \|T_H\| , \|T_S\| )$, we obtain with $L_1$:
$$\alpha \|T_H\| + \beta \|T_S\| \ge \| \alpha T_H + \beta T_S \| \implies ( \|T_H\| , \|T_S\| ) \in \partial P_{\alpha\beta\gamma\delta}(T) \ \text{or} \ ( \|T_H\| , \|T_S\| ) \notin P_{\alpha\beta\gamma\delta}(T) ,$$
whereas for the lower right-hand corner $( \|T_H\| , -\|T_S\| )$, we obtain with $L_3$:
$$\gamma \|T_H\| + \delta \|T_S\| \ge \| \gamma T_H - \delta T_S \| \implies ( \|T_H\| , -\|T_S\| ) \in \partial P_{\alpha\beta\gamma\delta}(T) \ \text{or} \ ( \|T_H\| , -\|T_S\| ) \notin P_{\alpha\beta\gamma\delta}(T) ,$$
and the second part of the proof follows.

For the last part of the proof, where the norm is assumed to be the spectral norm, we first consider the self-adjoint operator $T_H$, which satisfies $\|T_H\|_\sigma = \sup_{\|u\|=1} | \langle T_H u, u \rangle |$. From this it follows that there exists a sequence $\{ u_n \}$ in $H$ with $\|u_n\| = 1$, such that
$$\|T_H\|_\sigma = \lim_{n \to \infty} | \langle T_H u_n, u_n \rangle | .$$
Therefore, the real sequence $\{ \langle T_H u_n, u_n \rangle \}$ contains a subsequence $\{ \langle T_H v_n, v_n \rangle \}$ that converges either to $\|T_H\|_\sigma$ or to $-\|T_H\|_\sigma$. Since $\{ \langle T_H v_n, v_n \rangle \} = \{ \Re \langle T v_n, v_n \rangle \}$, this means that $\{ \langle T v_n, v_n \rangle \}$ converges to the left or right side of $R(T)$, which is then necessarily tangent to $\overline{W(T)}$. An analogous argument for the self-adjoint operator $T_S$ and a convergent sequence $\{ \Im \langle T r_n, r_n \rangle \}$ shows that the top or bottom side of $R(T)$ is tangent to $\overline{W(T)}$. In the case of the self-adjoint operator $\alpha T_H + \beta T_S$, one similarly obtains with the help of a sequence $\{ \alpha \Re \langle T s_n, s_n \rangle + \beta \Im \langle T s_n, s_n \rangle \}$ that the top right or bottom left side of $P_{\alpha\beta\gamma\delta}(T)$ is tangent to $\overline{W(T)}$, and an analogous argument for $\gamma T_H - \delta T_S$ shows that the top left or bottom right side of $P_{\alpha\beta\gamma\delta}(T)$ is tangent to $\overline{W(T)}$.

Finally, if $T = cQ$ for $c \in \mathbb{C}$ and $Q \in B(H)$ with $Q = Q^*$, then the statement in the theorem follows from the fact that $W(cQ) = cW(Q)$ and from $\sup_{\|u\|=1} | \langle Qu, u \rangle | \le \|Q\|$, with equality for the spectral norm. This concludes the proof.

Figure 1: $R(T)$ and $P_{\alpha\beta\gamma\delta}(T)$ for Theorem 2.1.

Theorem 2.1 with an appropriate choice of the parameters $\alpha, \beta, \gamma, \delta$ implies the following corollary, which leads to a polygon that contains the numerical range and exhibits useful properties.

Corollary 2.1.
Let $T \in B(H)$ have the Cartesian decomposition $T = T_H + iT_S$, and let $T \neq cQ$ for any $c \in \mathbb{C}$ and $Q \in B(H)$ with $Q = Q^*$. Then $W(T)$ is contained in a convex polygon, defined, for any norm, by the eight (not necessarily distinct) vertices in the complex plane
$$\big( \|T_S\| - \|T_H - T_S\| , \ \|T_S\| \big) , \qquad \big( \|T_H - T_S\| - \|T_S\| , \ -\|T_S\| \big) ,$$
$$\big( \|T_H + T_S\| - \|T_S\| , \ \|T_S\| \big) , \qquad \big( \|T_S\| - \|T_H + T_S\| , \ -\|T_S\| \big) ,$$
$$\big( \|T_H\| , \ \|T_H + T_S\| - \|T_H\| \big) , \qquad \big( -\|T_H\| , \ \|T_H\| - \|T_H + T_S\| \big) ,$$
$$\big( \|T_H\| , \ \|T_H\| - \|T_H - T_S\| \big) , \qquad \big( -\|T_H\| , \ \|T_H - T_S\| - \|T_H\| \big) ,$$
resulting in a quadrilateral, hexagon, or octagon that is symmetric with respect to the origin. If the norm is the spectral norm, then this polygon is tangent to $\overline{W(T)}$ in at least four points: one for each pair of opposing sides.
The numerical radius $w(T)$ of $T$ satisfies the inequality
$$w(T) \le \left( \max \left\{ \eta_1^2 + \|T_H\|^2 , \ \eta_2^2 + \|T_S\|^2 \right\} \right)^{1/2} ,$$
where
$$\eta_1 = \max \Big\{ \big| \|T_H + T_S\| - \|T_H\| \big| , \ \big| \|T_H - T_S\| - \|T_H\| \big| \Big\} , \qquad \eta_2 = \max \Big\{ \big| \|T_H + T_S\| - \|T_S\| \big| , \ \big| \|T_H - T_S\| - \|T_S\| \big| \Big\} .$$
If $T = cQ$ for $c \in \mathbb{C}$ and $Q \in B(H)$ with $Q = Q^*$, then $W(T)$ is contained in the closed line segment determined by the endpoints $\pm c \|Q\|_\sigma$, at least one of which is a boundary point of $W(T)$ if the norm is the spectral norm.

Proof. Theorem 2.1 with $\alpha = \beta = \gamma = \delta = 1$ implies that $\overline{W(T)}$ is contained in the intersection of the rectangle $R(T) = [ -\|T_H\| , \|T_H\| ] \times [ -\|T_S\| , \|T_S\| ]$ and the parallelogram $P_{1111}(T)$, defined, after the usual identification of $\mathbb{C}$ with $\mathbb{R}^2$, by the lines $x \pm y = \pm \|T_H \pm T_S\|$. When the spectral norm is used, this intersection is tangent to $\overline{W(T)}$ in at least four points.

Theorem 2.1 shows that the corners of the rectangle $R(T)$ are cut off by these lines. If we label the top and right-hand sides of $R(T)$, respectively, as $S_1$ and $S_2$, then the vertices of $R(T) \cap P_{1111}(T)$ are given by the following four intersection points and their reflections with respect to the origin:
$$L_1 \cap S_1 = \big( \|T_H + T_S\| - \|T_S\| , \ \|T_S\| \big) , \qquad L_4 \cap S_1 = \big( \|T_S\| - \|T_H - T_S\| , \ \|T_S\| \big) ,$$
$$L_1 \cap S_2 = \big( \|T_H\| , \ \|T_H + T_S\| - \|T_H\| \big) , \qquad L_3 \cap S_2 = \big( \|T_H\| , \ \|T_H\| - \|T_H - T_S\| \big) ,$$
where $L_1 : x + y = \|T_H + T_S\|$, $L_3 : x - y = \|T_H - T_S\|$, and $L_4 : x - y = -\|T_H - T_S\|$ are the lines from the proof of Theorem 2.1 with $\alpha = \beta = \gamma = \delta = 1$. Each side of $R(T)$ contains two vertices of the intersection, which may coincide. To show this, it is sufficient to consider $S_1$, as the arguments for the other sides are analogous.
The real part of the intersection of the top side $S_1$ of $R(T)$ with the line $L_4 : x - y = -\|T_H - T_S\|$ satisfies
$$\|T_S\| - \|T_H - T_S\| = \|T_S\| - \|T_H + T_S - 2T_S\| \le \|T_S\| - \big( 2\|T_S\| - \|T_H + T_S\| \big) = \|T_H + T_S\| - \|T_S\| ,$$
which means that this vertex lies to the left of the intersection of $S_1$ with the line $L_1 : x + y = \|T_H + T_S\|$, although it may coincide with it. As a result, the intersection of $R(T)$ and $P_{1111}(T)$ takes the form of a quadrilateral, hexagon, or octagon.

The polygon determined by these vertices is closed and convex, since it is the intersection of two closed convex sets, so that the largest distance from the origin to any point in the polygon is attained at one or more vertices. Defining
$$\eta_1 = \max \Big\{ \big| \|T_H + T_S\| - \|T_H\| \big| , \ \big| \|T_H - T_S\| - \|T_H\| \big| \Big\} , \qquad \eta_2 = \max \Big\{ \big| \|T_H + T_S\| - \|T_S\| \big| , \ \big| \|T_H - T_S\| - \|T_S\| \big| \Big\} ,$$
that maximum distance is given by
$$\left( \max \left\{ \eta_1^2 + \|T_H\|^2 , \ \eta_2^2 + \|T_S\|^2 \right\} \right)^{1/2} ,$$
which is necessarily an upper bound on the numerical radius $w(T)$.

Finally, the statement in the corollary for the case $T = cQ$, $c \in \mathbb{C}$, and $Q \in B(H)$ with $Q = Q^*$, follows immediately from the corresponding case in Theorem 2.1. This concludes the proof.

Figure 2 illustrates Corollary 2.1 for two matrices $A$ and $B$ with complex entries (the numerical entries are not legible in this copy). The octagon can approximate the numerical range well, as for the matrix $A$, or less satisfactorily, as for the matrix $B$.
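The construction in Corollary 2.1 needs only four norms, which the following sketch makes explicit. It assumes NumPy and uses the spectral norm by default. The function names and the random 5 × 5 test matrix are hypothetical and are not the matrices $A$ and $B$ of the paper.

```python
import numpy as np

def spectral_norm(M):
    return np.linalg.norm(M, 2)

def octagon_norms(T, norm=spectral_norm):
    """Return (h, s, p, m) = (||T_H||, ||T_S||, ||T_H + T_S||, ||T_H - T_S||)."""
    T = np.asarray(T, dtype=complex)
    T_H = (T + T.conj().T) / 2
    T_S = (T - T.conj().T) / (2j)
    return norm(T_H), norm(T_S), norm(T_H + T_S), norm(T_H - T_S)

def octagon_vertices(T, norm=spectral_norm):
    """Eight (not necessarily distinct) vertices, in counterclockwise order."""
    h, s, p, m = octagon_norms(T, norm)
    half = [
        (h, h - m),    # right side meets x - y =  ||T_H - T_S||
        (h, p - h),    # right side meets x + y =  ||T_H + T_S||
        (p - s, s),    # top side   meets x + y =  ||T_H + T_S||
        (s - m, s),    # top side   meets x - y = -||T_H - T_S||
    ]
    return half + [(-x, -y) for x, y in half]   # reflections through the origin

def radius_bound(T, norm=spectral_norm):
    """Upper bound on w(T): the largest distance from the origin to a vertex."""
    h, s, p, m = octagon_norms(T, norm)
    eta1 = max(abs(p - h), abs(m - h))
    eta2 = max(abs(p - s), abs(m - s))
    return np.sqrt(max(eta1**2 + h**2, eta2**2 + s**2))

def contains(T, z, norm=spectral_norm, tol=1e-12):
    """Check whether the complex number z lies in the octagon for T."""
    h, s, p, m = octagon_norms(T, norm)
    x, y = z.real, z.imag
    return (abs(x) <= h + tol and abs(y) <= s + tol
            and abs(x + y) <= p + tol and abs(x - y) <= m + tol)

# Hypothetical random test matrix (an assumption, not from the paper).
rng = np.random.default_rng(0)
T = rng.uniform(-1, 1, (5, 5)) + 1j * rng.uniform(-1, 1, (5, 5))
vertices = octagon_vertices(T)
bound = radius_bound(T)
```

Passing a different `norm` callable (for example the matrix 1-norm) changes the polygon but, by the corollary, not the containment property. The tangency statement applies only to the spectral norm.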
For both matrices $A$ and $B$, however, the approximation to the numerical radius is equally good. The latter remains true even for very elongated numerical ranges with an area much smaller than that of the approximating octagon.

Figure 2: Octagons containing the numerical ranges of the matrices $A$ and $B$.

The bound on the numerical radius obtained in Corollary 2.1 is not necessarily better than existing bounds, although it often is. To obtain an idea of the relative performance of the bound in Corollary 2.1 with the spectral norm, we have compared it to the bounds in (2) and (3). To do this, we have generated 1000 $m \times m$ matrices, with $m = 10, 100, 500, 1000$, whose entries are generated on an interval (its endpoints are not legible in this copy). Table 1 lists the average ratios of the respective bounds to the spectral norm of the matrix (the smaller the ratio, the better the bound), which demonstrates the advantage of Corollary 2.1. Moreover, the results appear to be quite insensitive to the size of the matrix.

m       Bound (2)   Bound (3)   Corollary 2.1
10      0.91        0.88        0.80
100     0.90        0.86        0.77
500     0.90        0.86        0.77
1000    0.90        0.85        0.77

Table 1: Comparison of bounds on the numerical radius for $m = 10, 100, 500, 1000$.

References
[1] Akhiezer, N.I. and Glazman, I.M.
Theory of Linear Operators in Hilbert Space. Dover Publications, Inc., 1993.
[2] Gustafson, K.E. and Rao, D.K.M. Numerical range. The field of values of linear operators and matrices. Universitext. Springer-Verlag, New York, 1997.
[3] Horn, R.A. and Johnson, C.R. Topics in Matrix Analysis. Cambridge University Press, Cambridge, 1999.
[4] Kittaneh, F. A numerical radius inequality and an estimate for the numerical radius of the Frobenius companion matrix. Studia Math., 158 (2003), 11–17.
[5] Kittaneh, F. Numerical radius inequalities for Hilbert space operators. Studia Math., 168 (2005), 73–80.
[6] Weidmann, J. Linear Operators in Hilbert Spaces.