Non-Commutative Distributions
Featuring: Operator-Valued Free Probability Theory
Lecture notes
Summer 2019
Prof. Dr. Roland Speicher

This is an introduction to the theory of non-commutative distributions of non-commuting operators or random matrices. Starting from the basic problem of finding a good approach to the meaning of "non-commutative distribution" we will, in particular, cover: free analysis, which is a version of complex analysis for several non-commuting variables; the operator-valued version of free probability theory (combinatorial but also analytic aspects); the linearization trick to reduce non-linear scalar problems to linear operator-valued problems; the combination of operator-valued convolution and linearization to calculate the distribution of polynomials in free variables; and the basic theory of non-commutative rational functions.

On one hand, this is a continuation of the Free Probability Lecture Notes. On the other hand, the theory of free probability is developed again, but in a more general, operator-valued context. So, in principle and with some additional effort, it should be possible to read the present notes without prior knowledge of free probability. Big parts of the material also deal not so much with free variables, but more generally with analytic and algebraic aspects of maximally non-commuting variables.

The material here was presented in the summer term 2019 at Saarland University in 20 lectures of 90 minutes each. The lectures were recorded and can be found online.

Many of the presented results were actually achieved in recent years in the context of the ERC grant "Non-Commutative Distributions in Free Probability" (2014-19).

Table of contents

$B$-valued joint distributions
3.3 Compatibility of operator-valued freeness with matrix amplifications
5.4 Structure of formulas for mixed moments in free variables
5.5 Positivity of free product constructions
$R$-Transforms
10 Operator-Valued Free Convolution via Subordination Function and the Distribution of Polynomials in Free Variables
11 Distribution of Rational Expressions in Free Random Variables
12 Unbounded Rational Expressions
13 Exercises
Some Off-the-Record Remarks
    Is there anything special about distributions of generators of non-embeddable von Neumann algebras?
    The $q$-Gaussian operators
Bibliography

Introduction
We are interested in properties, preferably analytic, of distributions $\mu_{X_1,\dots,X_n}$ of

○ operators $X_1,\dots,X_n$ on Hilbert spaces (typically from $C^*$-algebras or von Neumann algebras);
○ often those operators are "limits in distribution" of random matrix models;
○ typically our operators don't commute, which makes our distributions "non-commutative".

Consider first the classical case of "commutative" distributions. Then random variables $X_1,\dots,X_n$ are measurable functions $X_i\colon\Omega\to\mathbb{R}$ ($i=1,\dots,n$), where $(\Omega,\mathcal{A},P)$ is a probability space, i.e., $P$ is a probability measure on the $\sigma$-algebra $\mathcal{A}$ over $\Omega$, and the distribution $\mu_{X_1,\dots,X_n}$ is a probability measure on $\mathbb{R}^n$, given as push-forward of $P$, i.e.,
\[ \mu_{X_1,\dots,X_n}(B) = P\bigl(\{\omega\in\Omega \mid (X_1(\omega),\dots,X_n(\omega))\in B\}\bigr) \]
for any Borel set $B$ of $\mathbb{R}^n$.

There are various ways of describing or working with this object: $\mu_{X_1,\dots,X_n}$ is

(i) a probability measure on $\mathbb{R}^n$;

(ii) a positive linear map, which allows us to average over continuous functions of $X_1,\dots,X_n$:
\[ E[f(X_1,\dots,X_n)] = \int_{\mathbb{R}^n} f(t_1,\dots,t_n)\,d\mu_{X_1,\dots,X_n}(t_1,\dots,t_n) = \int_\Omega f(X_1(\omega),\dots,X_n(\omega))\,dP(\omega), \]
for continuous $f\colon\mathbb{R}^n\to\mathbb{C}$; this is the same as (i) via the Riesz representation theorem;

(iii) uniquely determined by its Fourier transform
\[ \mathcal{F}(t_1,\dots,t_n) = E\bigl[e^{-i(t_1X_1+\cdots+t_nX_n)}\bigr], \]
or other nice analytic functions on $\mathbb{R}^n$ or $\mathbb{C}^n$; e.g., for $n=1$, one also has the Cauchy or Stieltjes transform $G(z) = E[(z-X)^{-1}]$;

(iv) in many cases (e.g., the compactly supported case) uniquely determined by its moments $E[X_1^{k_1}\cdots X_n^{k_n}]$ for all $k_1,\dots,k_n\in\mathbb{N}$.

Consider now general (i.e., not necessarily commuting) $X_1,\dots,X_n\in\mathcal{A}$ for a non-commutative probability space $(\mathcal{A},\varphi)$, where $\mathcal{A}$ is a unital algebra and $\varphi\colon\mathcal{A}\to\mathbb{C}$ a unital linear functional (usually with some additional analytic structure). Can we give sense to $\mu_{X_1,\dots,X_n}$ in this setting?

The only item which makes sense directly is the combinatorial item (iv); and this will serve as our definition in the non-commutative case: $\mu_{X_1,\dots,X_n}$ is the collection of all moments
\[ \varphi(X_{i(1)}\cdots X_{i(k)}) \qquad \text{for all } k\in\mathbb{N};\ 1\le i(1),\dots,i(k)\le n \]
of our (non-commutative) random variables $X_1,\dots,X_n$.

Our goal is an analytic understanding of this; i.e., to find non-commutative versions of, or replacements for, items (i)-(iii). In particular, we would like to have notions for and results on

○ "smoothness" or "regularity" of non-commutative distributions;
○ absence of "atoms";
○ existence of "densities".

We still don't know what a "non-commutative probability measure" is, but there has been quite some progress in recent years on dealing with this via versions of (ii) and (iii). In particular, we can say quite a bit about the distribution of $f(X_1,\dots,X_n)$ for big classes of $(X_1,\dots,X_n)$ and big classes of $f$. In particular, this gives results on the asymptotic eigenvalue distribution of polynomials in independent random matrices; or, equivalently, the distribution of polynomials in free variables.

These results rely in particular on progress on

○ operator-valued versions of Voiculescu's free probability theory;
○ free analysis (aka free non-commutative function theory);
○ relating analytic questions about operators in von Neumann algebras to the theory (of Cohn et al.) of non-commutative linear algebra and the free skew field (aka non-commutative rational functions).

All of this, and much more, will be covered in the coming chapters.

Basic Definitions and Examples
We start with the basic definitions and the most prominent example of a non-commutative distribution: namely free semicircular variables. They show up as the sum of creation and annihilation operators on the full Fock space as well as the limit of our most beloved random matrices, namely independent Gaussian random matrices.
Definition 1.1. (1) A non-commutative probability space $(\mathcal{A},\varphi)$ consists of
○ a unital algebra $\mathcal{A}$;
○ a unital linear functional $\varphi\colon\mathcal{A}\to\mathbb{C}$; unital means $\varphi(1)=1$.
(2) A $C^*$-probability space is a non-commutative probability space $(\mathcal{A},\varphi)$, where
○ $\mathcal{A}$ is a unital $C^*$-algebra;
○ $\varphi$ is a state, i.e., $\varphi(A^*A)\ge 0$ for all $A\in\mathcal{A}$.
(3) Elements $A_1,\dots,A_n\in(\mathcal{A},\varphi)$ are called (non-commutative) random variables.

Remark 1.2. By the GNS construction, a $C^*$-probability space can always be written as:
○ $\mathcal{A}\subset B(\mathcal{H})$, for a Hilbert space $\mathcal{H}$;
○ $\varphi(A)=\langle A\xi,\xi\rangle$, for some unit vector $\xi\in\mathcal{H}$.

Definition 1.3. Let $(\mathcal{A},\varphi)$ be a non-commutative probability space and consider $A_1,\dots,A_n\in\mathcal{A}$. The (non-commutative) distribution $\mu_{A_1,\dots,A_n}$ of $A_1,\dots,A_n$ is given by the collection of all their joint moments:
\[ \mu_{A_1,\dots,A_n} \mathrel{\hat=} \{\varphi(A_{i_1}\cdots A_{i_k}) \mid k\in\mathbb{N};\ 1\le i_1,\dots,i_k\le n\}. \]

Remark 1.4. We will usually work in a $C^*$-probability space and consider selfadjoint operators $X_1,\dots,X_n$. Our main goal is to get a better analytic understanding of the distribution $\mu_{X_1,\dots,X_n}$. For $n=1$ (or for general $n$, but with commuting $X_i$) there is a lot of (commutative) analysis available.

Example 1.5. (1) $n=1$: Consider $X=X^*\in\mathcal{A}$, where $(\mathcal{A},\varphi)$ is a $C^*$-probability space. Then $\mu_X$ can be identified with a probability measure on $\mathbb{R}$ (with compact support) via
\[ \varphi(X^k) = \int_{\mathbb{R}} t^k\,d\mu_X(t) \qquad \text{for all } k\in\mathbb{N}. \]
This follows from the Weierstraß approximation theorem (approximation of continuous functions by polynomials on compact intervals) and the Riesz representation theorem.
(2) The same applies to the general commutative situation. For a $C^*$-probability space $(\mathcal{A},\varphi)$ and selfadjoint commuting $X_1,\dots,X_n\in\mathcal{A}$, the distribution $\mu_{X_1,\dots,X_n}$ can be identified with a compactly supported probability measure $\mu$ on $\mathbb{R}^n$ via
\[ \varphi(X_{i_1}\cdots X_{i_k}) = \int_{\mathbb{R}^n} t_{i_1}\cdots t_{i_k}\,d\mu(t_1,\dots,t_n) \qquad \text{for all } k\in\mathbb{N};\ 1\le i_1,\dots,i_k\le n. \]

Remark 1.6. (1) Thus, in the classical case, distributions "are" probability measures on $\mathbb{R}^n$ and we can ask questions about their regularity:
○ do they have atoms;
○ do they have a density (with respect to Lebesgue measure or, equivalently but maybe conceptually better, with respect to Gaussian measure);
○ what are the regularity properties of those densities?
(2) There are nice analytic functions which contain all the relevant information about classical distributions; in particular we have
(i) the Fourier transform (aka characteristic function) $\mathcal{F}(t_1,\dots,t_n) = E[e^{-i(t_1X_1+\cdots+t_nX_n)}]$;
(ii) the Cauchy transform (in the case $n=1$)
\[ G(z) = \int_{\mathbb{R}} \frac{1}{z-t}\,d\mu(t) = \varphi\bigl((z-X)^{-1}\bigr), \]
which is defined and analytic on $\mathbb{C}^+ := \{z\in\mathbb{C}\mid \operatorname{Im} z>0\}$.
(3) Given a $C^*$-probability space $(\mathcal{A},\varphi)$, why are we not happy with $\varphi$ restricted to the $C^*$-algebra generated by $X_1,\dots,X_n$ as our analytic description? Actually, don't we say that $C^*(X_1,\dots,X_n)$ is like the continuous functions of $X_1,\dots,X_n$ and $\mathrm{vN}(X_1,\dots,X_n)$ is like the measurable functions of $X_1,\dots,X_n$? Indeed … but in these phrases we cannot separate the functions from the operators. What we really want is to compare random variables $X_1,\dots,X_n$ in $(\mathcal{A},\varphi)$ with random variables $Y_1,\dots,Y_n$ in $(\mathcal{B},\psi)$, for two possibly different non-commutative probability spaces $(\mathcal{A},\varphi)$ and $(\mathcal{B},\psi)$. We can only do this by comparing $\varphi(f(X_1,\dots,X_n))$ with $\psi(f(Y_1,\dots,Y_n))$ for as big classes of $f$ as possible. Thus, $f$ must make sense as an abstract function which can be applied to tuples of non-commuting operators.
The same applies to the classical situation. Given classical probability spaces $(\Omega,P)$ and $(\tilde\Omega,\tilde P)$ and random variables $X\colon\Omega\to\mathbb{R}$ and $Y\colon\tilde\Omega\to\mathbb{R}$, we do not compare $P$ with $\tilde P$ or $X$ and $Y$ directly, but just their distributions, i.e.,
\[ \int_\Omega f(X(\omega))\,dP(\omega) \quad\text{with}\quad \int_{\tilde\Omega} f(Y(\tilde\omega))\,d\tilde P(\tilde\omega) \]
for special classes of functions $f$; like: monomials, continuous functions, measurable functions.

Example 1.7. For a Hilbert space $\mathcal{H}$ we define the full Fock space by
\[ \mathcal{F}(\mathcal{H}) := \bigoplus_{k\ge 0} \mathcal{H}^{\otimes k} = \mathbb{C}\cdot\Omega \oplus \mathcal{H} \oplus \mathcal{H}^{\otimes 2} \oplus \cdots, \]
where $\Omega$ is a unit vector in $\mathcal{H}^{\otimes 0}\simeq\mathbb{C}$, called the vacuum. Elements in $\mathcal{F}(\mathcal{H})$ are given by square summable linear combinations of $f_1\otimes\cdots\otimes f_k$ ($k=0,1,2,\dots$; $f_1,\dots,f_k\in\mathcal{H}$) with inner product
\[ \langle f_1\otimes\cdots\otimes f_k,\, g_1\otimes\cdots\otimes g_l\rangle = \delta_{kl}\,\langle f_1,g_1\rangle\cdots\langle f_k,g_k\rangle. \]
For $f\in\mathcal{H}$, we define the (left) creation operator $l(f)$, determined by
\[ l(f)\,\Omega = f, \qquad l(f)\, f_1\otimes\cdots\otimes f_k = f\otimes f_1\otimes\cdots\otimes f_k, \]
and the (left) annihilation operator $l^*(f)$, given by
\[ l^*(f)\,\Omega = 0, \qquad l^*(f)\, f_1\otimes\cdots\otimes f_k = \langle f_1,f\rangle\, f_2\otimes\cdots\otimes f_k. \]
Let $\xi_1,\dots,\xi_n$ be an orthonormal system of vectors in $\mathcal{H}$ (i.e., $\xi_i\perp\xi_j$ for $i\neq j$ and $\|\xi_i\|=1$ for all $i$); then we consider the selfadjoint operators
\[ S_i := l(\xi_i) + l^*(\xi_i) \qquad (i=1,\dots,n). \]
For $\varphi$ we take the "vacuum expectation state" $\varphi(A) := \langle A\Omega,\Omega\rangle$. We are interested in the non-commutative distribution $\mu_{S_1,\dots,S_n}$ of the operators $S_1,\dots,S_n$ in the $C^*$-probability space $(B(\mathcal{F}(\mathcal{H})),\varphi)$. We have a quite good understanding of this, namely we know:
○ $S_1,\dots,S_n$ are free (in the sense of Voiculescu's free probability theory);
○ each $S_i$ has a semicircular distribution $d\mu_{S_i}(t) = \frac{1}{2\pi}\sqrt{4-t^2}\,dt$ on $[-2,2]$, i.e.,
\[ \varphi(S_i^k) = \frac{1}{2\pi}\int_{-2}^{+2} t^k\sqrt{4-t^2}\,dt = \begin{cases} 0, & k \text{ odd},\\[2pt] \frac{1}{k/2+1}\binom{k}{k/2}, & k \text{ even}. \end{cases} \]
The non-zero moments are the Catalan numbers.
This $\mu_{S_1,\dots,S_n}$, the non-commutative distribution of free semicircular variables, is our benchmark; other distributions will be compared to this. In particular, the notion of a density (if there is any!) should be with regard to this.

Example 1.8. Many important distributions are given as limits of random matrices. Let $P(x_1,\dots,x_n)$ be a selfadjoint polynomial in $n$ non-commuting variables; for example, for $n=2$, $P(x_1,x_2) = x_1+x_2$ or $P(x_1,x_2) = x_1^2 + x_1x_2x_1 + x_2^2$. We consider on the space of $n$-tuples $(X_1^{(N)},\dots,X_n^{(N)})$ of selfadjoint $N\times N$ matrices the probability measure $\mu_N$ given by
\[ d\mu_N(X_1^{(N)},\dots,X_n^{(N)}) = c_N\cdot e^{-N \operatorname{tr}[P(X_1^{(N)},\dots,X_n^{(N)})]}\, d\lambda(X_1^{(N)})\cdots d\lambda(X_n^{(N)}), \]
where $c_N$ is a normalization constant such that $\mu_N$ is a probability measure, $\operatorname{tr}$ denotes the normalized trace on matrices, and
\[ d\lambda(X^{(N)}) = \prod_{i=1}^N d(\operatorname{Re}x_{ii}) \prod_{1\le i<j\le N} d(\operatorname{Re}x_{ij})\, d(\operatorname{Im}x_{ij}) \]
is the Lebesgue measure on all entries of the selfadjoint matrix $X^{(N)} = (x_{ij})_{i,j=1}^N$ which are not constrained by the selfadjointness condition. Then we consider on selfadjoint $N\times N$ matrices a state $\varphi_N$ given by, for $k\in\mathbb{N}$ and $1\le i_1,\dots,i_k\le n$,
\[ \varphi_N\bigl(X_{i_1}^{(N)}\cdots X_{i_k}^{(N)}\bigr) := \int \operatorname{tr}\bigl[X_{i_1}^{(N)}\cdots X_{i_k}^{(N)}\bigr]\, d\mu_N(X_1^{(N)},\dots,X_n^{(N)}), \]
and denote by $\mu_{X_1,\dots,X_n}$ the limit of this distribution, given by the moments
\[ \varphi(X_{i_1}\cdots X_{i_k}) := \lim_{N\to\infty} \varphi_N\bigl(X_{i_1}^{(N)}\cdots X_{i_k}^{(N)}\bigr), \]
provided these limits exist. The latter depends on $P$ and is, for $n\ge 2$, a big open question. Only some simple situations are well understood. E.g., for $P(x_1,\dots,x_n) = x_1^2+\cdots+x_n^2$, corresponding to independent Gaussian random matrices (GUE), this limit exists and is, by results of Voiculescu, equal to the one from Example 1.7, given by free semicirculars.
To summarize, we are interested in the limits of multi-matrix models and want to understand whether such limits exist and, in particular, how to describe them.
The assignments address some more details about free semicirculars, in the context of the full Fock space (Exercise 1) and as the limit of random matrices (Exercise 2).

1.4 Non-commutative polynomials and distributions

Definition 1.9. (1) We denote by $\mathbb{C}\langle x_1,\dots,x_n\rangle$ the polynomials in $n$ non-commuting indeterminates $x_1,\dots,x_n$; i.e., the unital algebra in $n$ algebraically free non-commuting generators $x_1,\dots,x_n$. Thus, a linear basis of $\mathbb{C}\langle x_1,\dots,x_n\rangle$ is given by all monomials $x_{i_1}\cdots x_{i_k}$ ($k\in\mathbb{N}$; $1\le i_1,\dots,i_k\le n$; for $k=0$ this is $1$). A general $p = p(x_1,\dots,x_n)\in\mathbb{C}\langle x_1,\dots,x_n\rangle$ is thus of the form
\[ p(x_1,\dots,x_n) = \alpha_0 + \sum_{k=1}^d \sum_{i_1,\dots,i_k=1}^n \alpha_{i_1,\dots,i_k}\, x_{i_1}\cdots x_{i_k} \tag{1.1} \]
for $d\in\mathbb{N}$ and $\alpha_0,\alpha_{i_1,\dots,i_k}\in\mathbb{C}$. We can make $\mathbb{C}\langle x_1,\dots,x_n\rangle$ into a $*$-algebra by declaring $x_i^* = x_i$ for all $i=1,\dots,n$.
(2) If $(\mathcal{A},\varphi)$ is a $C^*$-probability space and $X_i = X_i^*\in\mathcal{A}$ ($i=1,\dots,n$), then we have the evaluation map
\[ \mathbb{C}\langle x_1,\dots,x_n\rangle \to \mathcal{A}, \qquad p(x_1,\dots,x_n)\mapsto p(X_1,\dots,X_n), \]
which is the $*$-homomorphism given by $1\mapsto 1$ and $x_i\mapsto X_i$ ($i=1,\dots,n$). More explicitly, for a non-commutative polynomial $p(x_1,\dots,x_n)$ of the form (1.1) we have
\[ p(X_1,\dots,X_n) = \alpha_0 + \sum_{k=1}^d \sum_{i_1,\dots,i_k=1}^n \alpha_{i_1,\dots,i_k}\, X_{i_1}\cdots X_{i_k}. \tag{1.2} \]
We denote by $\mathbb{C}\langle X_1,\dots,X_n\rangle\subset\mathcal{A}$ the image of this map, i.e., the unital $*$-subalgebra of $\mathcal{A}$ which is generated by $X_1,\dots,X_n$.
(3) We define now, more precisely than in Definition 1.3, the (non-commutative) distribution $\mu_{X_1,\dots,X_n}$ as the linear functional
\[ \mu_{X_1,\dots,X_n}\colon \mathbb{C}\langle x_1,\dots,x_n\rangle\to\mathbb{C}, \qquad p(x_1,\dots,x_n)\mapsto \varphi\bigl(p(X_1,\dots,X_n)\bigr). \]

Remark 1.10. (1) With $\mathbb{C}[x_1,\dots,x_n]$ we denote, as usual, the ring of polynomials in $n$ commuting variables.
(2) We might also need at some point the non-selfadjoint versions of Definition 1.9; i.e., if $(\mathcal{A},\varphi)$ is just a non-commutative probability space then we do not put a $*$-structure on $\mathbb{C}\langle x_1,\dots,x_n\rangle$; or, if we deal with general, not necessarily selfadjoint, $A_1,\dots,A_n$ in a $C^*$-probability space, we have the $*$-polynomials $\mathbb{C}\langle z_1,\dots,z_n,z_1^*,\dots,z_n^*\rangle$ in $n$ non-commuting non-selfadjoint indeterminates $z_1,\dots,z_n$.

Remark 1.11. Recently there have appeared some generalizations of non-commutative distributions in the context of free probability, like:
(i) Bi-distributions, or pairs of faces (Voiculescu 2014 [Voi14]). There the random variables are divided into two classes: some random variables are declared as right variables, others as left variables.
(ii) Trace polynomial distributions (Cébron 2013 [Ceb]). There $\mathbb{C}\langle x_1,\dots,x_n\rangle$, the polynomials in $x_1,\dots,x_n$ with "constant" coefficients, is replaced by $\mathbb{C}\{x_1,\dots,x_n\}$, the polynomials in $x_1,\dots,x_n$ with coefficients depending on "(tracial) moments" of $x_1,\dots,x_n$.
(iii) Traffic distributions (Male 2011 [Mal]). Moments can be identified with cyclic graphs (in the case when $\varphi$ is a trace); for example,
\[ \varphi(T_1T_2T_3) = \frac{1}{N}\sum_{i,j,k=1}^N t^{(1)}_{ij}\, t^{(2)}_{jk}\, t^{(3)}_{ki} \]
corresponds to the cyclic graph with vertices $i$, $j$, $k$ and edges labelled by $T_1$, $T_2$, $T_3$.
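The identification of the trace moment with a sum over the cyclic graph can be checked directly; the following is a small numerical sketch (the matrix size $N=5$ and the random matrices are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
T1, T2, T3 = (rng.standard_normal((N, N)) for _ in range(3))

# Left-hand side: the trace moment phi(T1 T2 T3) with phi = (1/N) Tr
lhs = np.trace(T1 @ T2 @ T3) / N

# Right-hand side: the sum over the cyclic graph i -> j -> k -> i
rhs = sum(T1[i, j] * T2[j, k] * T3[k, i]
          for i in range(N) for j in range(N) for k in range(N)) / N

assert np.allclose(lhs, rhs)
```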
More general graphs are also allowed; for instance, a graph with edges labelled by matrices $T_1,\dots,T_{12}$ and vertices $i_1,\dots,i_8$ corresponds to the normalized sum $\frac{1}{N}\sum_{i_1,\dots,i_8=1}^{N}$ over the products of the corresponding matrix entries. For those generalizations, a general analytic theory is even more unclear than for the ordinary non-commutative distributions, and we will not address those generalizations in the following.
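As a numerical illustration of the Gaussian case of Example 1.8 (a sketch; the normalization of the GUE-type matrix is chosen so that the limiting spectrum is $[-2,2]$), the even trace moments of a single such matrix should be close to the Catalan numbers $1, 2, 5, 14$ for large $N$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
# GUE-type matrix, normalized so that its spectrum converges to [-2, 2]
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = (A + A.conj().T) / (2 * np.sqrt(N))

tr = lambda M: np.trace(M).real / N  # normalized trace

# even moments tr(X^(2m)) approximate the Catalan numbers C_m
for m, catalan in enumerate([1, 2, 5, 14], start=1):
    moment = tr(np.linalg.matrix_power(X, 2 * m))
    assert abs(moment - catalan) < 0.5
```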
Operator-Valued Distributions and Operator-Valued Cauchy Transform
Our main analytic object for dealing with non-commutative distributions will be a version of the Cauchy transform. However, this can only be defined easily for one operator, albeit in a more general, operator-valued setting. Since the information about the non-commutative distribution of a non-commutative tuple can be rewritten in terms of one operator-valued variable, this opens the door to the analytic world of non-commutative distributions.
Definition 2.1.
Let $(\mathcal{A},\varphi)$ be a $C^*$-probability space and $X=X^*\in\mathcal{A}$. The function
\[ G_X\colon \mathbb{C}^+\to\mathbb{C}^-;\qquad z\mapsto \varphi\bigl((z-X)^{-1}\bigr) = \int_{\mathbb{R}} \frac{1}{z-t}\,d\mu_X(t) \tag{2.1} \]
is called the Cauchy transform of $X$ (or of $\mu_X$).

Remark 2.2. A Cauchy transform $G_X$ has the following properties.
(i) $G_X$ is analytic on $\mathbb{C}^+$;
(ii) $G_X$ has a power series expansion about $\infty$:
\[ G_X(z) = \sum_{k=0}^\infty \frac{\varphi(X^k)}{z^{k+1}} \qquad \text{for } |z| > \|X\|; \]
(iii) we have
\[ \lim_{\substack{z\in\mathbb{C}^+\\ |z|\to\infty}} z\,G_X(z) = \varphi(1) = 1; \]
(iv) $\mu_X$ can be recovered from $G_X$ by the Stieltjes inversion formula
\[ d\mu_X(t) = -\frac{1}{\pi}\lim_{\varepsilon\searrow 0} \operatorname{Im} G_X(t+i\varepsilon)\,dt; \]
one should note that $t\mapsto -\operatorname{Im} G_X(t+i\varepsilon)/\pi$ is, for each $\varepsilon>0$, the density of a probability measure.
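The properties (ii)-(iv) can be tested numerically. The following sketch takes the semicircle distribution for $\mu_X$, computes $G_X(z)$ by naive quadrature, and checks Stieltjes inversion and the behaviour at infinity; the grid size and the value of $\varepsilon$ are arbitrary choices.

```python
import numpy as np

# semicircle density on [-2, 2], playing the role of mu_X
s = np.linspace(-2.0, 2.0, 8001)
ds = s[1] - s[0]
rho = np.sqrt(np.maximum(4.0 - s**2, 0.0)) / (2.0 * np.pi)

def G(z):
    # Cauchy transform G(z) = int 1/(z - t) dmu_X(t), by Riemann sum
    return np.sum(rho / (z - s)) * ds

# (iv) Stieltjes inversion: -Im G(t + i*eps)/pi approximates the density
t, eps = 0.7, 0.05
recovered = -G(t + 1j * eps).imag / np.pi
exact = np.sqrt(4.0 - t**2) / (2.0 * np.pi)
assert abs(recovered - exact) < 1e-2

# (iii) z G(z) -> 1 as |z| -> infinity
assert abs(1e6j * G(1e6j) - 1.0) < 1e-3
```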
Motivation 2.3. Let $(\mathcal{C},\varphi)$ be a $C^*$-probability space and consider selfadjoint $X_1,\dots,X_n\in\mathcal{C}$. We would like to encode the information about $\mu_{X_1,\dots,X_n}$ in an analytic function, something like
\[ \sum_{k=0}^\infty \sum_{i_1,\dots,i_k=1}^n z_{i_1}\cdots z_{i_k}\, \varphi(X_{i_1}\cdots X_{i_k}). \tag{2.2} \]
Since the $X_i$ do not commute in general, the variables $z_1,\dots,z_n$ should also not commute. Thus we need something like an analytic function in non-commuting variables. It is not clear how to give (2.2) a good analytic meaning (in particular, if we want this in some non-commutative half-planes for $z_1,\dots,z_n$).
Instead, we will rewrite the above in terms of one variable $X$, but in an operator-valued setting. For this we put
\[ M_n(\mathcal{C}) := M_n(\mathbb{C})\otimes\mathcal{C} = \{(A_{ij})_{i,j=1}^n \mid A_{ij}\in\mathcal{C}\} \]
and
\[ \mathrm{id}\otimes\varphi\colon M_n(\mathcal{C})\to M_n(\mathbb{C});\qquad (A_{ij})_{i,j=1}^n \mapsto (\varphi(A_{ij}))_{i,j=1}^n. \]
Denoting
\[ \mathcal{A} := M_n(\mathcal{C}),\qquad \mathcal{B} := M_n(\mathbb{C}),\qquad E := \mathrm{id}\otimes\varphi\colon \mathcal{A}\to\mathcal{B}, \]
we now have an operator-valued probability space, where, compared to Definition 1.1, $\mathbb{C}$ is replaced by a (non-commutative) subalgebra $\mathcal{B}$ of $\mathcal{A}$ and $\varphi$ is replaced by a conditional expectation $E$ onto $\mathcal{B}$. In this setting we put
\[ X := \begin{pmatrix} X_1 & 0 & \cdots & 0\\ 0 & X_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & X_n \end{pmatrix} \in\mathcal{A}. \]
All moments of $X_1,\dots,X_n$ with respect to $\varphi$ can then be recovered from the $\mathcal{B}$-valued moments
\[ E[b_1 X b_2 X \cdots b_{k-1} X b_k] \qquad (b_1,\dots,b_k\in\mathcal{B}) \]
of $X$. For example, for $n=2$, $\varphi(X_2X_1)$ can be recovered from
\[ E\left[\begin{pmatrix}0&1\\0&0\end{pmatrix} \begin{pmatrix}X_1&0\\0&X_2\end{pmatrix} \begin{pmatrix}0&0\\1&0\end{pmatrix} \begin{pmatrix}X_1&0\\0&X_2\end{pmatrix} \begin{pmatrix}1&0\\0&0\end{pmatrix}\right] = E\left[\begin{pmatrix}X_2X_1&0\\0&0\end{pmatrix}\right] = \begin{pmatrix}\varphi(X_2X_1)&0\\0&0\end{pmatrix}, \]
i.e., $\varphi(X_2X_1)$ is an entry of $E[b_1Xb_2Xb_3]$ with
\[ b_1 = \begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad b_2 = \begin{pmatrix}0&0\\1&0\end{pmatrix},\qquad b_3 = \begin{pmatrix}1&0\\0&0\end{pmatrix}. \]

Definition 2.4. (1) An operator-valued (non-commutative) probability space $(\mathcal{A},\mathcal{B},E)$ consists of
○ a unital algebra $\mathcal{A}$;
○ a unital subalgebra $1\in\mathcal{B}\subset\mathcal{A}$;
○ a conditional expectation $E\colon\mathcal{A}\to\mathcal{B}$, i.e.,
  - $E$ is linear;
  - $E(1)=1$;
  - $E$ has the bimodule property: $E[b_1Ab_2] = b_1E[A]b_2$ for all $b_1,b_2\in\mathcal{B}$, $A\in\mathcal{A}$;
thus also in particular: $E[b]=b$ for all $b\in\mathcal{B}$.
(2) If $\mathcal{A}$ and $\mathcal{B}$ are unital $C^*$-algebras and $E$ is positive (i.e., for all $A\in\mathcal{A}$ there is $b\in\mathcal{B}$ such that $E[A^*A]=b^*b$), then $(\mathcal{A},\mathcal{B},E)$ is an operator-valued $C^*$-probability space.
(3) Elements $X\in\mathcal{A}$ are called operator-valued (or $\mathcal{B}$-valued) random variables.
(4) The operator-valued moments of $X$ are of the form
\[ E[X],\quad E[Xb_1X],\quad E[Xb_1Xb_2X],\quad\dots,\quad E[Xb_1Xb_2\cdots Xb_{k-1}X],\quad\dots \]
(5) The collection of all operator-valued moments constitutes the operator-valued distribution $\mu_X$ of $X$.

Definition 2.5. Let $(\mathcal{A},\mathcal{B},E)$ be an operator-valued $C^*$-probability space and $X=X^*\in\mathcal{A}$. Then we define the operator-valued Cauchy transform $G_X\colon\mathcal{B}\to\mathcal{B}$ (actually not everywhere defined; a nice domain will be specified later) by
\[ G_X(b) = E\bigl[(b-X)^{-1}\bigr] \qquad \text{(if } b-X \text{ is invertible)}. \]

Remark 2.6. (1) $G_X$ is an analytic function between the Banach spaces $\mathcal{B}\to\mathcal{B}$ in the Gâteaux or Fréchet sense; more on this later.
(2) Formally, $G_X$ has a power series expansion: for $\|b^{-1}\| < 1/\|X\|$ we have
\[ (b-X)^{-1} = \bigl(b[1-b^{-1}X]\bigr)^{-1} = \sum_{k\ge 0} (b^{-1}X)^k b^{-1} = b^{-1} + b^{-1}Xb^{-1} + b^{-1}Xb^{-1}Xb^{-1} + \cdots, \]
and thus
\[ G_X(b) = \sum_{k\ge 0} E\bigl[(b^{-1}X)^k b^{-1}\bigr]. \]
(3) As we see from the power series expansion, $G_X$ does not contain information about all moments, but only about symmetric moments of the form $E[XbXbXb\cdots bX]$. In order to get all moments we have to consider matricial extensions (amplifications) $G_X^{(m)}$ of $G_X$. For each $m\in\mathbb{N}$, we amplify our setting to $(M_m(\mathcal{A}), M_m(\mathcal{B}), E\otimes\mathrm{id}_m)$ and consider there the Cauchy transform of
\[ X\otimes 1_m = \begin{pmatrix} X&0&\cdots&0\\ 0&X&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&X \end{pmatrix}\in M_m(\mathcal{A}); \]
i.e., for $G_X^{(m)}\colon M_m(\mathcal{B})\to M_m(\mathcal{B})$ and $b=(b_{ij})_{i,j=1}^m$ we have
\[ G_X^{(m)}(b) = E\otimes\mathrm{id}\bigl[(b - X\otimes 1)^{-1}\bigr] = E\otimes\mathrm{id}\left[\begin{pmatrix} b_{11}-X & b_{12} & \cdots & b_{1m}\\ b_{21} & b_{22}-X & \cdots & b_{2m}\\ \vdots&\vdots&\ddots&\vdots\\ b_{m1}&b_{m2}&\cdots&b_{mm}-X \end{pmatrix}^{-1}\right]. \]
(4) Note that unsymmetric moments on the base level $m=1$, like $E[Xb_1Xb_2X]$, can be recovered from a symmetric moment for $m=3$: with
\[ b = \begin{pmatrix} 0&b_1&0\\ 0&0&b_2\\ 0&0&0 \end{pmatrix},\qquad X\otimes 1 = \begin{pmatrix} X&0&0\\ 0&X&0\\ 0&0&X \end{pmatrix}, \]
we have
\[ E\otimes\mathrm{id}\,\bigl[X\otimes 1\cdot b\cdot X\otimes 1\cdot b\cdot X\otimes 1\bigr] = \begin{pmatrix} 0&0&E[Xb_1Xb_2X]\\ 0&0&0\\ 0&0&0 \end{pmatrix}. \]
(5) Thus, in order to encode all operator-valued moments of $X$ in some analytic function, we do not just need $G_X = G_X^{(1)}$, but also all its matrix amplifications $G_X^{(m)}$. Those $G_X^{(m)}$ are related to each other for different $m$ as follows:
(i) For $b_1\in M_{m_1}(\mathcal{B})$ and $b_2\in M_{m_2}(\mathcal{B})$, with the relevant inverses existing, we have
\[ G_X^{(m_1+m_2)}\!\begin{pmatrix} b_1&0\\0&b_2 \end{pmatrix} = E\otimes\mathrm{id}\left[\begin{pmatrix} (b_1 - X\otimes 1)^{-1} & 0\\ 0 & (b_2-X\otimes 1)^{-1}\end{pmatrix}\right] = \begin{pmatrix} G_X^{(m_1)}(b_1) & 0\\ 0 & G_X^{(m_2)}(b_2)\end{pmatrix}; \]
(ii) for invertible $S\in M_m(\mathbb{C})$ and $b\in M_m(\mathcal{B})$ we have (note that $S\cdot X\otimes 1\cdot S^{-1} = X\otimes 1$)
\begin{align*}
G_X^{(m)}(SbS^{-1}) &= E\otimes\mathrm{id}\bigl[(SbS^{-1} - X\otimes 1)^{-1}\bigr] = E\otimes\mathrm{id}\bigl[(SbS^{-1} - S\cdot X\otimes 1\cdot S^{-1})^{-1}\bigr]\\
&= E\otimes\mathrm{id}\bigl[S(b - X\otimes 1)^{-1}S^{-1}\bigr] = S\cdot E\otimes\mathrm{id}\bigl[(b-X\otimes 1)^{-1}\bigr]\cdot S^{-1} = S\cdot G_X^{(m)}(b)\cdot S^{-1}.
\end{align*}
(6) Collections of functions which satisfy (i) and (ii) are called fully matricial functions (by Voiculescu [Voi04]) or (free) non-commutative functions (by Vinnikov et al. [KVV]). We will have a closer look at them in the next chapter.

Non-Commutative Functions
We will now formalize the algebraic properties of the $G_X^{(m)}$, but at first ignore the question of domain. The main point will be to see that "analyticity" can be encoded in algebraic properties over matrices. Later, in Section 3.2, we will also address the question of the domain. A good source for the material in this and the next chapter are the expository notes Operator-valued non-commutative probability by David Jekel [Jek18].

Definition 3.1.
Let $\mathcal{B}$ be a unital algebra. A collection $f = (f_m)_{m\in\mathbb{N}}$ of functions
\[ f_m\colon M_m(\mathcal{B})\to M_m(\mathcal{B}),\qquad z\mapsto f_m(z), \]
is called a non-commutative function (or fully matricial function) if it satisfies the following two conditions.
(i) $f$ respects direct sums:
\[ f_{m_1+m_2}\!\left[\begin{pmatrix} z_1&0\\0&z_2\end{pmatrix}\right] = \begin{pmatrix} f_{m_1}(z_1)&0\\0&f_{m_2}(z_2)\end{pmatrix} \tag{3.1} \]
for all $m_1,m_2\in\mathbb{N}$, $z_1\in M_{m_1}(\mathcal{B})$, $z_2\in M_{m_2}(\mathcal{B})$.
(ii) $f$ respects similarities:
\[ f_m(SzS^{-1}) = S f_m(z) S^{-1} \tag{3.2} \]
for all $m\in\mathbb{N}$, $z\in M_m(\mathcal{B})$, and invertible $S\in M_m(\mathbb{C})$.

Remark 3.2. (1) It is fairly easy to see (and you are asked in Exercise 4 to check this) that (i) and (ii) are equivalent to the fact that $f$ respects intertwinings: for all $n,m\in\mathbb{N}$, $z_1\in M_n(\mathcal{B})$, $z_2\in M_m(\mathcal{B})$, and an $n\times m$ matrix $T\in M_{n,m}(\mathbb{C})$ we have
\[ z_1 T = T z_2 \implies f_n(z_1)\, T = T\, f_m(z_2). \]
(2) In general, our functions (like the $G_X^{(m)}$) are not defined on all of $M_n(\mathcal{B})$, but only on subsets. The conditions above then have to be modified accordingly. We will ignore this for the moment, but come back to this issue later.
(3) We will often just write $f(z)$ instead of $f_m(z)$ when the $m$ is clear.
(4) We claim now that (i) and (ii) encode analyticity in an algebraic way. In particular, they should allow us to distinguish analytic functions like $f(z)=z$ for all $z\in M_m(\mathcal{B})$ from non-analytic ones, like $g(z)=z^*$ for all $z\in M_m(\mathcal{B})$. Note that (i) does not see a difference here,
\[ g\!\begin{pmatrix} z_1&0\\0&z_2\end{pmatrix} = \begin{pmatrix} z_1^*&0\\0&z_2^*\end{pmatrix} = \begin{pmatrix} g(z_1)&0\\0&g(z_2)\end{pmatrix}, \]
but (ii) does:
\[ f(SzS^{-1}) = SzS^{-1} = S f(z) S^{-1}, \qquad\text{but}\qquad g(SzS^{-1}) = (SzS^{-1})^* = S^{*-1} z^* S^* \neq S g(z) S^{-1} \]
in general for $S\in M_m(\mathbb{C})$ with $m\ge 2$. Note that (ii) gives no information for $m=1$, since there $S\in\mathbb{C}$ is a scalar.

Example 3.3. Consider the case
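The claim in (4) can be checked with concrete matrices; as a sketch, take $f(z)=z^2$ (a slightly less trivial analytic example than $f(z)=z$) and $g(z)=z^*$, and test condition (ii) numerically for a generic invertible scalar matrix $S$.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
S = np.eye(2) * 3 + rng.standard_normal((2, 2))   # generic invertible S
Sinv = np.linalg.inv(S)

f = lambda M: M @ M            # analytic: respects similarities
g = lambda M: M.conj().T       # z -> z*: does not

# (ii) holds for f ...
assert np.allclose(f(S @ z @ Sinv), S @ f(z) @ Sinv)
# ... but fails for g (for a generic non-unitary S)
assert not np.allclose(g(S @ z @ Sinv), S @ g(z) @ Sinv)
```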
$\mathcal{B}=\mathbb{C}$; i.e., let $f\colon\mathbb{C}\to\mathbb{C}$ be an analytic function. Then one can extend this by holomorphic functional calculus to matrices via
\[ f_m\colon M_m(\mathbb{C})\to M_m(\mathbb{C}),\qquad z\mapsto f_m(z) := \frac{1}{2\pi i}\int_\Gamma f(\xi)\,(\xi-z)^{-1}\,d\xi, \]
where we integrate around the eigenvalues of the matrix $z$. The collection $f=(f_m)_{m\in\mathbb{N}}$ then satisfies (i) and (ii):
\[ f\!\begin{pmatrix} z_1&0\\0&z_2\end{pmatrix} = \frac{1}{2\pi i}\int_\Gamma f(\xi)\begin{pmatrix} \xi-z_1&0\\0&\xi-z_2\end{pmatrix}^{-1}\! d\xi = \frac{1}{2\pi i}\int_\Gamma f(\xi)\begin{pmatrix} (\xi-z_1)^{-1}&0\\0&(\xi-z_2)^{-1}\end{pmatrix} d\xi = \begin{pmatrix} f(z_1)&0\\0&f(z_2)\end{pmatrix} \]
and
\[ f(SzS^{-1}) = \frac{1}{2\pi i}\int_\Gamma f(\xi)\,(\xi - SzS^{-1})^{-1}\,d\xi = \frac{1}{2\pi i}\int_\Gamma f(\xi)\, S(\xi-z)^{-1}S^{-1}\,d\xi = S f(z) S^{-1}. \]
We can in this case also recover the derivative $f'$ from the action of the higher $f_m$, without taking limits. For this consider $z_1,z_2,w\in\mathbb{C}$; then
\[ f\!\begin{pmatrix} z_1&w\\0&z_2\end{pmatrix} = \frac{1}{2\pi i}\int_\Gamma f(\xi)\begin{pmatrix} \xi-z_1&-w\\0&\xi-z_2\end{pmatrix}^{-1}\! d\xi = \frac{1}{2\pi i}\int_\Gamma f(\xi)\begin{pmatrix} (\xi-z_1)^{-1} & (\xi-z_1)^{-1}w(\xi-z_2)^{-1}\\ 0 & (\xi-z_2)^{-1}\end{pmatrix} d\xi = \begin{pmatrix} f(z_1) & *\\ 0 & f(z_2)\end{pmatrix} \]
with
\[ * = \frac{1}{2\pi i}\, w \int_\Gamma f(\xi)\,(\xi-z_1)^{-1}(\xi-z_2)^{-1}\,d\xi = \frac{1}{2\pi i}\, w\int_\Gamma \frac{f(\xi)}{z_1-z_2}\left[\frac{1}{\xi-z_1} - \frac{1}{\xi-z_2}\right] d\xi = w\,\frac{f(z_1)-f(z_2)}{z_1-z_2}, \]
and thus
\[ f\!\begin{pmatrix} z&w\\0&z\end{pmatrix} = \begin{pmatrix} f(z)& f'(z)\,w\\ 0& f(z)\end{pmatrix}. \]

Remark 3.4. In the same way, derivatives can be recovered for non-commutative functions, relying just on properties (i) and (ii) (and some continuity or boundedness condition). We address this in the following. For this one should note that upper triangular matrices are similar to diagonal matrices:
\[ \begin{pmatrix} z_1 & z_1-z_2\\ 0 & z_2\end{pmatrix} = \underbrace{\begin{pmatrix} 1&-1\\0&1\end{pmatrix}}_{S} \begin{pmatrix} z_1&0\\0&z_2\end{pmatrix} \underbrace{\begin{pmatrix} 1&1\\0&1\end{pmatrix}}_{S^{-1}}. \]

Lemma 3.5. Let $f$ be a non-commutative function. Then we have for $z_1\in M_n(\mathcal{B})$, $z_2\in M_m(\mathcal{B})$, $w\in M_{n,m}(\mathcal{B})$:
\[ f\!\begin{pmatrix} z_1&w\\0&z_2\end{pmatrix} = \begin{pmatrix} f_n(z_1) & *\\ 0 & f_m(z_2)\end{pmatrix}. \]
We denote the entry in $*$ by $\partial f(z_1,z_2)\sharp w$ or by $\Delta f(z_1,z_2)[w]$. Note that this is an element in $M_{n,m}(\mathcal{B})$.

Proof.
Write
\[ f\!\begin{pmatrix} z_1&w\\0&z_2\end{pmatrix} = \begin{pmatrix} a&b\\ c&d\end{pmatrix}. \]
Note that we then have by (3.1)
\[ f\!\begin{pmatrix} z_1&w&0\\ 0&z_2&0\\ 0&0&z_1\end{pmatrix} = \begin{pmatrix} f\!\begin{pmatrix} z_1&w\\0&z_2\end{pmatrix} & 0\\ 0 & f(z_1)\end{pmatrix} = \begin{pmatrix} a&b&0\\ c&d&0\\ 0&0&f(z_1)\end{pmatrix}. \]
Furthermore, we have
\[ \begin{pmatrix} z_1&w&0\\ 0&z_2&0\\ 0&0&z_1\end{pmatrix} = \underbrace{\begin{pmatrix} 1&0&-1\\ 0&1&0\\ 0&0&1\end{pmatrix}}_{S} \begin{pmatrix} z_1&w&0\\ 0&z_2&0\\ 0&0&z_1\end{pmatrix} \underbrace{\begin{pmatrix} 1&0&1\\ 0&1&0\\ 0&0&1\end{pmatrix}}_{S^{-1}} \]
and thus by (3.2)
\begin{align*}
\begin{pmatrix} a&b&0\\ c&d&0\\ 0&0&f(z_1)\end{pmatrix} = f\!\begin{pmatrix} z_1&w&0\\ 0&z_2&0\\ 0&0&z_1\end{pmatrix}
&= \begin{pmatrix} 1&0&-1\\ 0&1&0\\ 0&0&1\end{pmatrix} \cdot f\!\begin{pmatrix} z_1&w&0\\ 0&z_2&0\\ 0&0&z_1\end{pmatrix} \cdot \begin{pmatrix} 1&0&1\\ 0&1&0\\ 0&0&1\end{pmatrix}\\
&= \begin{pmatrix} 1&0&-1\\ 0&1&0\\ 0&0&1\end{pmatrix} \cdot \begin{pmatrix} a&b&0\\ c&d&0\\ 0&0&f(z_1)\end{pmatrix} \cdot \begin{pmatrix} 1&0&1\\ 0&1&0\\ 0&0&1\end{pmatrix}
= \begin{pmatrix} a&b&a-f(z_1)\\ c&d&c\\ 0&0&f(z_1)\end{pmatrix}.
\end{align*}
This implies that $a=f(z_1)$ and $c=0$. Similarly, one gets that $d=f(z_2)$.

Lemma 3.6. $\partial f(z_1,z_2)\sharp w$ is linear in $w$.

Proof. We have to show that
(i) for all $\lambda\in\mathbb{C}$ and $w\in M_{n,m}(\mathcal{B})$: $\partial f(z_1,z_2)\sharp(\lambda w) = \lambda\cdot\partial f(z_1,z_2)\sharp w$;
(ii) for all $w_1,w_2\in M_{n,m}(\mathcal{B})$: $\partial f(z_1,z_2)\sharp(w_1+w_2) = \partial f(z_1,z_2)\sharp w_1 + \partial f(z_1,z_2)\sharp w_2$.
We only show (i); the second part is similar, see Exercise 5.
(i) The case $\lambda=0$ is clear, since $\partial f(z_1,z_2)\sharp 0 = 0$. Thus assume that $\lambda\neq 0$. We have
\[ \begin{pmatrix} z_1&\lambda w\\ 0&z_2\end{pmatrix} = \begin{pmatrix} \lambda&0\\0&1\end{pmatrix} \begin{pmatrix} z_1&w\\0&z_2\end{pmatrix} \begin{pmatrix} 1/\lambda&0\\0&1\end{pmatrix} \]
and thus
\[ \begin{pmatrix} f(z_1)&\partial f(z_1,z_2)\sharp(\lambda w)\\ 0&f(z_2)\end{pmatrix} = \begin{pmatrix} \lambda&0\\0&1\end{pmatrix} \begin{pmatrix} f(z_1)&\partial f(z_1,z_2)\sharp w\\ 0&f(z_2)\end{pmatrix} \begin{pmatrix} 1/\lambda&0\\0&1\end{pmatrix} = \begin{pmatrix} f(z_1)&\lambda\cdot\partial f(z_1,z_2)\sharp w\\ 0&f(z_2)\end{pmatrix}. \]

Proposition 3.7. (1) $\partial f(z_1,z_2)$ is a difference operator, i.e., we have for all $m\in\mathbb{N}$ and all $z_1,z_2\in M_m(\mathcal{B})$
\[ f(z_1) - f(z_2) = \partial f(z_1,z_2)\sharp(z_1-z_2). \]
(2) If $f$ is continuous, then, for all $m\in\mathbb{N}$ and all $z\in M_m(\mathcal{B})$, $\partial f(z,z)$ is a differential operator, i.e.,
\[ \partial f(z,z)\sharp w = \lim_{\varepsilon\searrow 0}\frac{f(z+\varepsilon w) - f(z)}{\varepsilon}. \]

Proof. (1) Put $S = \begin{pmatrix} 1&-1\\0&1\end{pmatrix}$; then
\[ \begin{pmatrix} z_1&z_1-z_2\\ 0&z_2\end{pmatrix} = S\begin{pmatrix} z_1&0\\0&z_2\end{pmatrix}S^{-1}, \qquad
\begin{pmatrix} f(z_1)&\partial f(z_1,z_2)\sharp(z_1-z_2)\\ 0&f(z_2)\end{pmatrix} = S\begin{pmatrix} f(z_1)&0\\0&f(z_2)\end{pmatrix}S^{-1} = \begin{pmatrix} f(z_1)&f(z_1)-f(z_2)\\ 0&f(z_2)\end{pmatrix}. \]
(2) By Lemma 3.6 and by part (1), we have
\[ \varepsilon\cdot\partial f(z,z+\varepsilon w)\sharp w = \partial f(z,z+\varepsilon w)\sharp(\varepsilon w) = f(z+\varepsilon w) - f(z). \]
This yields
\[ \partial f(z,z+\varepsilon w)\sharp w = \frac{1}{\varepsilon}\bigl[f(z+\varepsilon w) - f(z)\bigr], \]
and thus
\[ f\!\begin{pmatrix} z&w\\ 0&z+\varepsilon w\end{pmatrix} = \begin{pmatrix} f(z) & \frac{1}{\varepsilon}[f(z+\varepsilon w)-f(z)]\\ 0 & f(z+\varepsilon w)\end{pmatrix}. \]
As $f$ is assumed to be continuous, the left-hand side of this converges for $\varepsilon\searrow 0$ to
\[ f\!\begin{pmatrix} z&w\\0&z\end{pmatrix} = \begin{pmatrix} f(z)&\partial f(z,z)\sharp w\\ 0&f(z)\end{pmatrix}. \]
This implies that the right-hand side of the above equation also converges, and we must have
\[ \partial f(z,z)\sharp w = \lim_{\varepsilon\searrow 0}\frac{1}{\varepsilon}\bigl[f(z+\varepsilon w)-f(z)\bigr]. \]

Definition 3.8.
Let $(E,\|\cdot\|_E)$ and $(F,\|\cdot\|_F)$ be complex Banach spaces and let $\emptyset\neq\Omega\subset E$ be open. A function $f\colon\Omega\to F$ is called
(i) Gâteaux holomorphic on $\Omega$, if
\[ \lim_{\substack{z\to 0\\ z\in\mathbb{C}\setminus\{0\}}} \frac{1}{z}\bigl[f(x+zh) - f(x)\bigr] =: \delta f(x;h) \]
exists in $(F,\|\cdot\|_F)$ for all $x\in\Omega$ and all $h\in E$;
(ii) analytic on $\Omega$, if it is Gâteaux holomorphic and locally bounded, i.e., for all $x\in\Omega$ there exists $r=r(x)>0$ such that
\[ \sup_{\substack{y\in\Omega\\ \|y-x\|_E<r}} \|f(y)\|_F < \infty. \]

Remark 3.9. (1) By a theorem of Hille (1944) one knows that an analytic function is actually also Fréchet holomorphic, i.e., the "total derivative" $\delta f(x;\cdot)\colon E\to F$ is a bounded linear operator and
\[ \lim_{\|h\|_E\to 0} \frac{1}{\|h\|_E}\,\bigl\|f(x+h) - f(x) - \delta f(x;h)\bigr\|_F = 0. \]
Moreover, $f$ has locally a uniformly convergent "Taylor series expansion".
(2) In Proposition 3.7 we have seen how to get Gâteaux holomorphicity from the algebraic conditions on our non-commutative functions, under the condition of continuity. According to Definition 3.8 and the first part of this remark, local boundedness is a more natural condition to ask for. It turns out that this is actually sufficient to ensure continuity (and thus analyticity) for our non-commutative functions.

Proposition 3.10.
Let $f = (f_m)_{m\in\mathbb{N}}$, $f_m\colon M_m(\mathcal{B})\to M_m(\mathcal{B})$, be a non-commutative function. If $f$ is locally bounded (i.e., each $f_m$ is locally bounded), then $f$ is continuous (i.e., each $f_m$ is continuous). [To be precise: boundedness and continuity are here with respect to the $C^*$-norm on each $M_m(\mathcal{B})$.]

Proof. We know, by Proposition 3.7, that for $z_1,z_2\in M_m(\mathcal{B})$
\[ f\!\begin{pmatrix} z_1&z_1-z_2\\ 0&z_2\end{pmatrix} = \begin{pmatrix} f(z_1)&f(z_1)-f(z_2)\\ 0&f(z_2)\end{pmatrix}; \]
and, since $\partial f(z_1,z_2)\sharp w$ is linear in $w$ (by Lemma 3.6), also, for $\lambda\in\mathbb{C}$,
\[ f\!\begin{pmatrix} z_1&\lambda(z_1-z_2)\\ 0&z_2\end{pmatrix} = \begin{pmatrix} f(z_1)&\lambda[f(z_1)-f(z_2)]\\ 0&f(z_2)\end{pmatrix}. \tag{3.3} \]
Now take $z\in M_m(\mathcal{B})$ and $\varepsilon>0$; we want to find $\delta>0$ such that $w\in M_m(\mathcal{B})$ and $\|w-z\|<\delta$ implies $\|f(w)-f(z)\|\le\varepsilon$. For this we go to $M_{2m}(\mathcal{B})$ and consider there
\[ z\oplus z := \begin{pmatrix} z&0\\0&z\end{pmatrix}. \]
Since $f_{2m}$ is locally bounded we find $r>0$ such that $\sup\|f(y)\| =: C < \infty$, where the supremum is over $y\in M_{2m}(\mathcal{B})$ such that $\|y - z\oplus z\| < r$. Now we choose
\[ \delta := \min\Bigl\{\frac{r}{2},\ \frac{\varepsilon r}{2C}\Bigr\} \]
and consider $w\in M_m(\mathcal{B})$ with $\|w-z\|<\delta$. Then we have
\[ \left\|\begin{pmatrix} w&\frac{C}{\varepsilon}(w-z)\\ 0&z\end{pmatrix} - \begin{pmatrix} z&0\\0&z\end{pmatrix}\right\| = \left\|\begin{pmatrix} w-z&\frac{C}{\varepsilon}(w-z)\\ 0&0\end{pmatrix}\right\| \le \underbrace{\|w-z\|}_{<\delta\le r/2} + \frac{C}{\varepsilon}\underbrace{\|w-z\|}_{<\delta\le \varepsilon r/(2C)} < r \]
and thus, by also using (3.3),
\[ \left\|\begin{pmatrix} f(w)&\frac{C}{\varepsilon}[f(w)-f(z)]\\ 0&f(z)\end{pmatrix}\right\| = \left\|f\!\begin{pmatrix} w&\frac{C}{\varepsilon}(w-z)\\ 0&z\end{pmatrix}\right\| \le C. \]
This then implies $\bigl\|\frac{C}{\varepsilon}[f(w)-f(z)]\bigr\| \le C$, thus $\|f(w)-f(z)\|\le\varepsilon$.

Remark 3.11. (1) This shows that for locally bounded non-commutative functions we get the derivative $\delta f(z;w) = \partial f(z,z)\sharp w$ as part of the data of the higher $f_m$:
\[ f\!\begin{pmatrix} z&w\\0&z\end{pmatrix} = \begin{pmatrix} f(z)&\delta f(z;w)\\ 0&f(z)\end{pmatrix}. \]
(2) In the same way we also get higher derivatives:
\[ f\!\begin{pmatrix} z_1&w_1&0\\ 0&z_2&w_2\\ 0&0&z_3\end{pmatrix} = \begin{pmatrix} f(z_1)&\partial f(z_1,z_2)\sharp w_1&*\\ 0&f(z_2)&\partial f(z_2,z_3)\sharp w_2\\ 0&0&f(z_3)\end{pmatrix}, \]
where $* =: \partial^2 f(z_1,z_2,z_3)\sharp(w_1,w_2)$ is a second-order difference quotient, which gives the second derivative $\partial^2 f(z,z,z)\sharp(w,w)$.
(3) One should also note that uniform local boundedness of $f$ allows us to control the size of the derivatives, so that one gets a convergent "Taylor-Taylor expansion"
\[ f(z+w) = \sum_{k=0}^\infty \partial^k f(\underbrace{z,z,\dots,z}_{k+1})\sharp(\underbrace{w,\dots,w}_{k}). \]
In Exercise 7 you are asked to prove this expansion. The Taylors here are two different people: Brook Taylor ($\sim$1715) and Joseph L. Taylor ($\sim$1972). A good reference for this material is the book Foundations of Free Non-Commutative Function Theory (2014) by D. Kaliuzhnyi-Verbovetskyi and V. Vinnikov [KVV].

3.2 Rigorous definition of fully matricial functions, caring also about domain
Definition 3.12. (1) For a C∗-algebra B we denote:
(i) M_n(B) = M_n(ℂ) ⊗ B =: B^(n);
(ii) for z ∈ B^(n) we put z^(m) = 1_m ⊗ z = diag(z, z, …, z) ∈ M_{nm}(B);
(iii) for z_1 ∈ B^(n), z_2 ∈ B^(m) we put z_1 ⊕ z_2 = ( z_1  0 ; 0  z_2 ) ∈ B^(n+m);
(iv) for z ∈ B^(n) and r > 0 we put
B^(n)(z, r) := { w ∈ B^(n) | ∥z − w∥ < r }  and  B(z, r) := ⋃_{m≥1} B^(nm)(z^(m), r).
(2) A fully matricial domain Ω = (Ω^(n))_{n∈ℕ} over B is a sequence of sets Ω^(n) ⊂ B^(n) satisfying the following conditions:
(i) Ω respects direct sums: z_1 ∈ Ω^(n) and z_2 ∈ Ω^(m) implies that z_1 ⊕ z_2 ∈ Ω^(n+m);
(ii) Ω is uniformly open; i.e., for each z ∈ Ω^(n) there exists r > 0 such that B(z, r) ⊂ Ω;
(iii) Ω is non-empty; i.e., at least one Ω^(n) is non-empty.
(3) Let Ω_1 and Ω_2 be fully matricial domains over B_1 and B_2, respectively. A fully matricial function f = (f^(n))_{n∈ℕ}: Ω_1 → Ω_2 is a sequence of functions f^(n): Ω_1^(n) → Ω_2^(n) satisfying the following conditions:
(i) f respects intertwinings; i.e., for z_1 ∈ Ω_1^(n), z_2 ∈ Ω_1^(m), T ∈ M_{n×m}(ℂ) we have: z_1 T = T z_2 implies that f^(n)(z_1) T = T f^(m)(z_2);
(ii) f is uniformly locally bounded; i.e., for each z ∈ Ω_1^(n) there exist r > 0 and M > 0 such that B(z, r) ⊂ Ω_1 and f(B(z, r)) ⊂ B(0, M).

Example 3.13. (1) Non-commutative monomials and polynomials over B are fully matricial with domains Ω_1^(n) = Ω_2^(n) = M_n(B); see Exercise 6.
(2) Consider Ω^(n) := { z ∈ B^(n) | z is invertible }. Then Ω = (Ω^(n))_{n∈ℕ} is a fully matricial domain and f: Ω → Ω; z ↦ f(z) := z^{−1} is fully matricial.

Proof.
It is clear that Ω respects direct sums and is non-empty. To see that Ω is uniformly open, we claim that B(z, 1/∥z^{−1}∥) ⊂ Ω. To check this, note that for w ∈ B(z, 1/∥z^{−1}∥) we have

w^{−1} = [z − (z − w)]^{−1} = z^{−1}[1 − (z − w)z^{−1}]^{−1} = z^{−1} Σ_{k≥0} [(z − w)z^{−1}]^k.

Since ∥(z − w)z^{−1}∥ < 1, the series converges in norm, and thus w ∈ Ω. From this calculation we also get that

∥w^{−1}∥ ≤ ∥z^{−1}∥ / (1 − ∥z^{−1}∥·∥z − w∥),

which shows that f[B(z, 1/(2∥z^{−1}∥))] ⊂ B(0, 2∥z^{−1}∥); thus f is uniformly locally bounded. f also respects intertwinings: suppose that z_1 T = T z_2; this implies that T z_2^{−1} = z_1^{−1} T, i.e., f(z_1) T = T f(z_2).

Proposition 3.14. (1) Suppose that f, g: Ω_1 → Ω_2 are fully matricial. Then so are f + g and fg.
(2) Suppose that f: Ω_1 → Ω_2 and g: Ω_2 → Ω_3 are fully matricial. Then so is the composition g ∘ f: Ω_1 → Ω_3.

Proof. We only prove (1). Suppose that z_1 T = T z_2. Then we have

(f + g)(z_1)·T = f(z_1)T + g(z_1)T = T f(z_2) + T g(z_2) = T·(f + g)(z_2)

and

(fg)(z_1)·T = f(z_1) g(z_1) T = f(z_1) T g(z_2) = T f(z_2) g(z_2) = T·(fg)(z_2).

If z ∈ Ω_1^(n), then there are r_1, M_1 and r_2, M_2 such that f(B(z, r_1)) ⊂ B(0, M_1) and g(B(z, r_2)) ⊂ B(0, M_2). Put r := min(r_1, r_2); then we have for w ∈ B(z, r)

∥(f + g)(w)∥ ≤ ∥f(w)∥ + ∥g(w)∥ ≤ M_1 + M_2  and  ∥(fg)(w)∥ ≤ ∥f(w)∥·∥g(w)∥ ≤ M_1·M_2.

4 The Operator-Valued Cauchy Transform
Now let’s get serious about the operator-valued Cauchy transform as a fully matricialfunction.
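As a warm-up, recall the scalar-valued case B = ℂ, where the Cauchy transform of a probability measure µ is just G_µ(z) = ∫ (z − t)^{−1} dµ(t). The following numerical sketch (all concrete choices — a sampled Wigner matrix as a stand-in for a semicircular element, the evaluation point z — are illustrative assumptions, not part of the notes) compares the empirical Cauchy transform of the eigenvalue distribution with the closed form for the semicircle law:

```python
import numpy as np

# Illustrative sketch: scalar Cauchy transform (B = C) of the semicircle law,
# approximated by the eigenvalues of a large Wigner matrix.
rng = np.random.default_rng(0)
N = 1000
A = rng.standard_normal((N, N))
X = (A + A.T) / np.sqrt(2 * N)           # eigenvalue distribution ≈ semicircle on [-2, 2]
eig = np.linalg.eigvalsh(X)

z = 5j                                    # a point in the upper half-plane
G_emp = np.mean(1.0 / (z - eig))          # empirical G_X(z) = E[(z - X)^{-1}]

# Closed form for the standard semicircle law: G(z) = (z - sqrt(z^2 - 4))/2
G_formula = (z - np.sqrt(z**2 - 4 + 0j)) / 2
print(G_emp, G_formula)
```

The empirical value lies in the lower half-plane and agrees with the closed form up to finite-size fluctuations; it also exhibits the leading order z G(z) → 1 for z → ∞, which will reappear below.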
Definition 4.1.
Let (A, B, E) be an operator-valued C∗-probability space and let X = X∗ ∈ A. The Cauchy transform G_X = (G_X^(n))_{n∈ℕ} of X is defined by

G_X^(n): H⁺(M_n(B)) → H⁻(M_n(B)), z ↦ id ⊗ E[(z − 1 ⊗ X)^{−1}],

where (z − 1 ⊗ X)^{−1} ∈ M_n(A) and H⁺ and H⁻ denote the upper and lower, respectively, half-plane.

Notation 4.2.
Let A be a unital C∗-algebra.
(1) For A ∈ A we put
Re(A) := (A + A∗)/2  (real part),  Im(A) := (A − A∗)/(2i)  (imaginary part).
(2) We define the strict upper/lower half-plane of A by
H⁺(A) := { A ∈ A | ∃ ε > 0 : Im(A) ≥ ε·1 },
H⁻(A) := { A ∈ A | ∃ ε > 0 : Im(A) ≤ −ε·1 }.
Instead of ∃ ε > 0 : Im(A) ≥ ε·1 we will also write Im(A) > 0; and in the same spirit Im(A) < 0.

Proposition 4.3. Let A ∈ H⁺(A). Then A is invertible and A^{−1} ∈ H⁻(A).

Proof. Put X := Re(A) and Y := Im(A); by assumption Y is positive and invertible, and thus we can write

A = X + iY = Y^{1/2}[Y^{−1/2} X Y^{−1/2} + i] Y^{1/2}.

Since Y^{−1/2} X Y^{−1/2} is selfadjoint, we have that −i is not in its spectrum and hence A is invertible, with

A^{−1} = Y^{−1/2}[Y^{−1/2} X Y^{−1/2} + i]^{−1} Y^{−1/2}.

Let us denote Y^{−1/2} X Y^{−1/2} by X̃; then we can calculate

[X̃ + i]^{−1} = [(X̃ − i)(X̃ + i)]^{−1}(X̃ − i) = (X̃² + 1)^{−1}(X̃ − i),

which gives finally

Im(A^{−1}) = Y^{−1/2}·Im[X̃ + i]^{−1}·Y^{−1/2} = −Y^{−1/2}·(X̃² + 1)^{−1}·Y^{−1/2} < 0.

Proposition 4.4. H⁺(B_nc) := (H⁺(M_n(B)))_{n∈ℕ} is a fully matricial domain over B.

Proof. (i) H⁺(B_nc) respects direct sums. Consider z_1 ∈ H⁺(M_n(B)) and z_2 ∈ H⁺(M_m(B)); then Im z_1 ≥ ε_1·1 and Im z_2 ≥ ε_2·1 give

Im(z_1 ⊕ z_2) = ( Im z_1  0 ; 0  Im z_2 ) ≥ ( ε_1·1  0 ; 0  ε_2·1 ) ≥ min(ε_1, ε_2)·1,

hence z_1 ⊕ z_2 ∈ H⁺(M_{n+m}(B)).
(ii) H⁺(B_nc) is uniformly open. Consider z ∈ H⁺(M_n(B)), i.e., Im z ≥ ε·1; we claim that then B(z, ε/2) ⊂ H⁺(B_nc). Namely, consider w ∈ B(z, ε/2), i.e., w ∈ M_{mn}(B) with ∥z^(m) − w∥ < ε/2. Then we have

∥Im z^(m) − Im w∥ ≤ ∥z^(m) − w∥ < ε/2,

and thus Im z^(m) − Im w < (ε/2)·1, or

Im w > Im z^(m) − (ε/2)·1 = (Im z)^(m) − (ε/2)·1 ≥ ε·1 − (ε/2)·1 = (ε/2)·1,

hence w ∈ H⁺(M_{mn}(B)).
(iii) H⁺(B_nc) is clearly non-empty.

Theorem 4.5.
Let (A, B, E) be an operator-valued C∗-probability space and X = X∗ ∈ A. Then the Cauchy transform

G_X: H⁺(B_nc) → H⁻(B_nc), z ↦ id ⊗ E[(z − 1 ⊗ X)^{−1}]

is a fully matricial function.

Proof. (i) First we should check that G_X^(n) sends H⁺(M_n(B)) to H⁻(M_n(B)). For this, consider z ∈ H⁺(M_n(B)), i.e., Im z ≥ ε·1. Then we have Im(z − 1 ⊗ X) = Im z ≥ ε·1, and thus z − 1 ⊗ X ∈ H⁺(M_n(A)); then Proposition 4.3 tells us that (z − 1 ⊗ X)^{−1} ∈ H⁻(M_n(A)). Now we apply id ⊗ E: M_n(A) → M_n(B). By our assumption that (A, B, E) is an operator-valued C∗-probability space, we have that E: A → B is positive. Since E is a conditional expectation this implies that it is completely positive, i.e., all its amplifications id ⊗ E are also positive. (Note that positivity of a linear map from A to B does in general not imply complete positivity; one needs some more structure, like conditional expectations.) So this implies then that id ⊗ E: H⁻(M_n(A)) → H⁻(M_n(B)) and finally we have

G_X^(n)(z) = id ⊗ E[(z − 1 ⊗ X)^{−1}] ∈ H⁻(M_n(B)).

(ii) It is clear that G_X respects intertwinings; compare Example 3.13 (2).
(iii) It remains to see uniform local boundedness. Consider z ∈ H⁺(M_n(B)), i.e., Im z ≥ ε·1. As in the proof of Proposition 4.3, we write

(z − 1 ⊗ X)^{−1} = Im(z)^{−1/2} [i·1 + Im(z)^{−1/2}·(Re(z) − 1 ⊗ X)·Im(z)^{−1/2}]^{−1} Im(z)^{−1/2},

where the middle factor is the inverse of i·1 plus a selfadjoint operator and has thus norm at most 1; this yields

∥(z − 1 ⊗ X)^{−1}∥ ≤ ∥Im(z)^{−1/2}∥² = ∥Im(z)^{−1}∥.

Since id ⊗ E has, as a normalized completely positive mapping, norm 1, we thus have

∥G_X^(n)(z)∥ = ∥id ⊗ E[(z − 1 ⊗ X)^{−1}]∥ ≤ ∥(z − 1 ⊗ X)^{−1}∥ ≤ ∥Im(z)^{−1}∥ ≤ 1/ε,

since Im z ≥ ε·1. Now we are ready to consider w ∈ B(z, ε/2), say w ∈ M_{mn}(B). According to the calculations in the proof of Proposition 4.4 we have

Im w ≥ Im z^(m) − (ε/2)·1 ≥ (ε/2)·1,

and thus

∥G_X(w)∥ ≤ ∥Im(w)^{−1}∥ ≤ 2/ε;

hence we have a local uniform bound.

Remark 4.6. (1) In the scalar-valued case, i.e., B = ℂ, all relevant information about distributions, i.e., probability measures, is encoded in the Cauchy transform; in particular we have:
(i) weak convergence of probability measures corresponds to pointwise convergence of the Cauchy transforms;
(ii) there are precise characterizations when an analytic function is a Cauchy transform.
(2) There are kind of analogues of this in the operator-valued case. Of course, now we are essentially encoding information about moments. Note that in the scalar-valued case moments describe probability measures uniquely if the latter are compactly supported, which corresponds to bounded operators. In the operator-valued case we restrict for now to bounded operators (in our C∗-probability spaces), thus to the non-commutative analogue of compactly supported measures. In the scalar case we can deal with any probability measure (via analytic tools, not via moments); in the operator-valued case the unbounded situation is quite unclear.
(3) Note that a compactly supported measure is characterized on the level of moments by
(i) positive definiteness of moments, in the sense that ∫ |p(t)|² dµ(t) ≥ 0 for all p ∈ ℂ[t];
(ii) and exponential boundedness of moments: if supp µ ⊂ [−M, M], then

|m_n| = |∫ t^n dµ(t)| ≤ ∫_{−M}^{M} |t|^n dµ(t) ≤ M^n.

We will now define non-commutative distributions abstractly, via moments with such properties.
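The two properties established in the proof of Theorem 4.5 — that G_X^(n) takes values in the lower half-plane and obeys the norm bound ∥G_X^(n)(z)∥ ≤ ∥Im(z)^{−1}∥ — can be checked numerically. The following sketch makes illustrative assumptions not taken from the notes: B = M_2(ℂ), A = M_2(ℂ) ⊗ M_N(ℂ), E = id ⊗ tr_N with tr_N the normalized trace, and X = 1 ⊗ W for a sampled Wigner matrix W.

```python
import numpy as np

# Numerical sketch of Theorem 4.5 (illustrative model, see lead-in).
rng = np.random.default_rng(1)
N = 200
A = rng.standard_normal((N, N))
W = (A + A.T) / np.sqrt(2 * N)          # selfadjoint, spectrum ≈ [-2, 2]

def id_tr(M, n, N):
    """The amplified expectation id ⊗ tr_N on M_n(C) ⊗ M_N(C)."""
    out = np.empty((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            out[i, j] = np.trace(M[i*N:(i+1)*N, j*N:(j+1)*N]) / N
    return out

z = np.array([[1 + 2j, 0.5], [0.5, -1 + 1j]])   # Im z = diag(2, 1) >= 1, so z ∈ H^+(M_2(C))
Im_z = (z - z.conj().T) / 2j

resolvent = np.linalg.inv(np.kron(z, np.eye(N)) - np.kron(np.eye(2), W))
G = id_tr(resolvent, 2, N)               # G_X^{(2)}(z) = id ⊗ E[(z - 1 ⊗ X)^{-1}]
Im_G = (G - G.conj().T) / 2j

assert np.max(np.linalg.eigvalsh(Im_G)) < 0                            # G(z) ∈ H^-(M_2(C))
assert np.linalg.norm(G, 2) <= np.linalg.norm(np.linalg.inv(Im_z), 2)  # norm bound
```

Both assertions hold exactly (not just asymptotically), since W is a concrete selfadjoint matrix and the inequalities in the proof are deterministic.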
Definition 4.7. (1) Let B be a unital algebra. We denote by B⟨x⟩ the polynomials in the formal variable x with coefficients from B, i.e., the free product of ℂ⟨x⟩ and B, with amalgamation over ℂ·1. Elements in B⟨x⟩ are thus linear combinations of monomials of the form b_0 x b_1 x ⋯ b_{k−1} x b_k for k ∈ ℕ_0, b_0, …, b_k ∈ B. The elements in B, corresponding to k = 0, are the constant polynomials. If B is a ∗-algebra, then B⟨x⟩ becomes a ∗-algebra, too, by declaring x∗ = x, i.e.,

(b_0 x b_1 x ⋯ b_{k−1} x b_k)∗ = b_k∗ x b_{k−1}∗ ⋯ x b_1∗ x b_0∗.

(2) If B is a unital C∗-algebra, then a B-valued distribution is a linear map µ: B⟨x⟩ → B such that:
(i) µ is unital, µ(1) = 1;
(ii) µ is a B-B-bimodule map, i.e., µ(b p(x) b′) = b µ(p(x)) b′ for all p(x) ∈ B⟨x⟩, b, b′ ∈ B;
(iii) µ is completely positive, i.e., µ^(n)(p(x)∗ p(x)) ≥ 0 for all n ∈ ℕ and all p(x) ∈ M_n(B⟨x⟩), where µ^(n) is, as usual, the amplification id ⊗ µ.
We denote the set of all B-valued distributions by Σ_B. µ ∈ Σ_B is exponentially bounded if there exists M > 0 such that for all n ∈ ℕ and all b_1, …, b_n ∈ B

∥µ(x b_1 x b_2 ⋯ x b_n x)∥ ≤ M^{n+1} ∥b_1∥ ⋯ ∥b_n∥.

We write then µ ∈ Σ_B⁰.
(3) If (A, B, E) is a B-valued C∗-probability space and X = X∗ ∈ A, then the (B-valued) distribution µ_X: B⟨x⟩ → B of X is given by µ_X(p(x)) = E[p(X)] for all p(x) ∈ B⟨x⟩.

Remark 4.8. Σ_B⁰ should consist of all possible B-valued distributions of selfadjoint random variables X in B-valued C∗-probability spaces. That µ_X ∈ Σ_B⁰ for such X is clear (see Exercise 13); that we also have the other direction is the main content of the following theorem of Popa and Vinnikov [PV].

Theorem 4.9 (Popa, Vinnikov 2013). For a unital C∗-algebra B the following are equivalent for a linear map µ: B⟨x⟩ → B.
(i) µ ∈ Σ_B⁰.
(ii) There exists a B-valued C∗-probability space (A, B, E) and a selfadjoint X ∈ A such that µ_X = µ.

Rough sketch of the proof. In the scalar-valued case we realize X via left multiplication by x on ℂ⟨x⟩ via a GNS-like construction. Now we do an operator-valued version of this, i.e., we put on B⟨x⟩ a B-valued inner product by

⟨p(x), q(x)⟩_µ := µ(p(x)∗ q(x)) ∈ B.

This gives on B⟨x⟩ a (ℂ-valued) norm ∥p(x)∥_µ := ∥⟨p(x), p(x)⟩_µ∥_B^{1/2}. Completing B⟨x⟩ with respect to this gives a Banach space B⟨x⟩_µ. B⟨x⟩ acts on this space via left multiplications. Checking a couple of technical details shows then that this action is bounded and adjointable and thus generates a C∗-algebra A. Let X ∈ A be multiplication with x. We also have a conditional expectation E: A → B given by E[A] := ⟨1, A1⟩_µ. With respect to this, X has distribution µ:

E[b_0 X b_1 ⋯ b_n X b_{n+1}] = ⟨1, b_0 X b_1 ⋯ b_n X b_{n+1} 1⟩_µ = µ(b_0 x b_1 ⋯ b_n x b_{n+1}).

4.3 Moments and Cauchy transform

Remark 4.10. (1) Since G_X depends only on the distribution µ_X of X, we can also write G_X = G_µ for µ_X = µ.
(2) As in the classical case the moments of X should be the coefficients in the power series expansion of G_X in z^{−1} about infinity. To formulate this nicely, it is better to go over to the function H_X(z) := G_X(z^{−1}).

Proposition 4.11.
Let (A, B, E) be a B-valued C∗-probability space and X = X∗ ∈ A. Then the function

H_X: H⁻(B_nc) → H⁻(B_nc), z ↦ H_X(z) := G_X(z^{−1})

is a fully matricial function which has a fully matricial extension to a uniform neighbourhood of 0, and we have

E[b_1 X b_2 ⋯ b_{n−1} X b_n] = ∂^n H_X(0, …, 0)♯(b_1, b_2, …, b_n)

(with n + 1 arguments 0 on the right-hand side).

Proof.
We have, uniformly in all n (where we just write X instead of 1 ⊗ X and E instead of id ⊗ E):

H_X(z) = G_X(z^{−1}) = E[(z^{−1} − X)^{−1}] = z·E[(1 − Xz)^{−1}] = z Σ_{k≥0} E[(Xz)^k],

where the sum converges uniformly for ∥z∥ < 1/∥X∥. Thus H_X has an extension to B(0, 1/∥X∥).
Note (see Exercise 7) that we have in general that ∂^n H_X(0, …, 0)♯(b_1, …, b_n) is the upper right entry of H_X(z), where z ∈ M_{n+1}(B) is the matrix with b_1, …, b_n on the superdiagonal and zeros elsewhere:

z := ( 0  b_1  0  ⋯  0 ; 0  0  b_2  ⋱  ⋮ ; ⋮  ⋱  ⋱  ⋱  0 ; 0  ⋯  0  0  b_n ; 0  ⋯  0  0  0 ).

Since z is nilpotent we can use the expansion H_X(z) = z Σ_{k≥0} E[(Xz)^k] to calculate H_X(z) in this case; the series will stop after the term k = n. Let us evaluate the cases n = 1 and n = 2.

n = 1:
H_X( ( 0  b_1 ; 0  0 ) ) = ( 0  b_1 ; 0  0 ), and thus ∂H_X(0, 0)♯b_1 = b_1 = E[b_1].

n = 2: here the series stops already after k = 1, since z(Xz)^k = 0 for k ≥ 2, and so

H_X( ( 0  b_1  0 ; 0  0  b_2 ; 0  0  0 ) ) = ( 0  b_1  0 ; 0  0  b_2 ; 0  0  0 ) + ( 0  0  b_1 E[X] b_2 ; 0  0  0 ; 0  0  0 ) = ( 0  b_1  b_1 E[X] b_2 ; 0  0  b_2 ; 0  0  0 ),

and thus ∂²H_X(0, 0, 0)♯(b_1, b_2) = b_1 E[X] b_2 = E[b_1 X b_2]. The case of general n works in the same way.

Remark 4.12. (1) In addition to the analyticity property from Proposition 4.11, our Cauchy transforms G_X have also a specific leading order for z → ∞, namely G_X(z) = z^{−1} + ⋯, or H_X(z) = z + ⋯, or more precisely: z_k G_X^(n)(z_k) → 1 in M_n(B) for any sequence (z_k)_k in M_n(B) for which ∥z_k^{−1}∥ ↘
0. Those properties are sufficient to characterize Cauchy transforms G_µ for µ ∈ Σ_B⁰, as shown in the following theorem of John Williams [Wil].
(2) Recall first the classical scalar-valued version: Let g: ℂ⁺ → ℂ⁻ be an analytic function such that
(i) iy g(iy) → 1 for ℝ ∋ y → ∞,
(ii) and h(z) := g(1/z) has an analytic continuation to a neighborhood of 0.
Then there exists a (uniquely determined) compactly supported Borel probability measure µ on ℝ such that g = G_µ, i.e.,

g(z) = ∫ 1/(z − t) dµ(t).

Note that without (ii) this gives a characterization of G_µ for arbitrary probability measures on ℝ.

Theorem 4.13 (Williams 2017). Let B be a unital C∗-algebra and g = (g^(n))_{n∈ℕ} be a fully matricial function g: H⁺(B_nc) → H⁻(B_nc) such that
(i) for any n ∈ ℕ and for any sequence (z_k)_{k∈ℕ} with z_k ∈ M_n(B) which satisfies lim_{k→∞} ∥z_k^{−1}∥ = 0 we have lim_{k→∞} z_k g^(n)(z_k) = 1 in M_n(B);
(ii) the fully matricial function h = (h^(n))_{n∈ℕ}, with h^(n)(z) := g^(n)(z^{−1}), has a fully matricial extension to a uniform neighborhood of 0.
Then g = G_µ for some µ ∈ Σ_B⁰.

Sketch of proof. According to Proposition 4.11 we define the distribution by

µ(b_1 x b_2 ⋯ b_{n−1} x b_n) := ∂^n h(0, …, 0)♯(b_1, b_2, …, b_n).

One has to check that this has all the properties required in Definition 4.7 for Σ_B⁰. Exponential boundedness comes from uniform boundedness of h; furthermore we have

µ(b) = ∂h(0, 0)♯b = d/dt h(tb)|_{t=0} = lim_{t→0} h(tb)/t = lim_{t→0} b·(tb)^{−1} g((tb)^{−1}) = b,

by (i), and thus: µ|_B = id. From this and complete positivity the bimodule property follows by general arguments.
The main problem is to show the positivity property. We reduce the problem to the scalar-valued version by applying states. For this note that b ∈ B is positive if and only if φ(b) ≥ 0 for all states φ: B → ℂ. Hence we consider, for a state φ,

φ(g(ξ·1)): ℂ⁺ → ℂ⁻

as a function in ξ; it satisfies the classical characterizing properties of a Cauchy transform, hence

φ(g(ξ·1)) = ∫ 1/(ξ − t) dµ_φ(t)

for some probability measure µ_φ. But the coefficients in the expansion about ∞ for this are φ(E[X^k]); hence the E[X^k] are under all φ a positive definite sequence in ℂ, and thus the E[X^k] themselves are positive definite in B. In order to get this also for general moments in B⟨x⟩ one has to consider matrix versions of this and apply states φ to the (1,1)-entry of matrices in M_n(B).

5 Operator-Valued Freeness
In order to be able to say something more on operator-valued distributions we needmore structure in the distribution. The most prominent case is given by variableswhich are free. It is crucial that we have an operator-valued version of free prob-ability theory, which behaves nicely with respect to matrix amplifications. Thisoperator-valued freeness will be presented here and will play a main role in most ofthe coming chapters. Operator-valued free probability theory is, as its scalar-valuedversion, due to Voiculescu [Voi95]. Our presentation of operator-valued freeness ismainly based on [Sp, MSp].
Definition 5.1. (1) Let (A, B, E) be an operator-valued probability space. Subalgebras B ⊂ A_i ⊂ A, i ∈ I, are called free if E[a_1 ⋯ a_k] = 0 whenever we have:
○ k ∈ ℕ;
○ a_j ∈ A_{i_j}, with i_j ∈ I, for all j = 1, …, k;
○ E[a_j] = 0 for all j = 1, …, k;
○ i_1 ≠ i_2 ≠ i_3 ≠ ⋯ ≠ i_{k−1} ≠ i_k (neighboring elements are from different subalgebras).
Instead of free we will also say freely independent, or more precisely free with respect to E or free (with amalgamation) over B or similar phrases.
(2) Random variables X_i ∈ A, i ∈ I, are free if the corresponding subalgebras

B⟨X_i⟩ := algebra generated by X_i and B = { p(X_i) | p(x) ∈ B⟨x⟩ }

are free.

Proposition 5.2. If A_i, i ∈ I, are free, then E is on the algebra generated by all A_i determined by the restrictions E|_{A_i} for all i ∈ I and by the freeness condition.

Proof. The algebra generated by all A_i consists of elements which are linear combinations of a_1 ⋯ a_k where k ∈ ℕ, a_j ∈ A_{i_j} with i_j ∈ I; we can also assume that i_1 ≠ i_2 ≠ ⋯ ≠ i_k. Consider such a_1 ⋯ a_k. We have to show that E[a_1 ⋯ a_k] is determined by E|_{A_i} (i ∈ I). We do this by induction on k. The cases k = 0 (E[1] = 1) and k = 1 (a_1 ∈ A_{i_1}) are clear.
Consider now general k. We put a_j° := a_j − E[a_j] ∈ A_{i_j} (note that E[a_j] ∈ B ⊂ A_{i_j}); then E[a_j°] = 0. We get then

E[a_1 ⋯ a_k] = E[(a_1° + E[a_1]) ⋯ (a_k° + E[a_k])] = E[a_1° ⋯ a_k°] + rest.

The first term vanishes by the definition of freeness and the rest-term is a sum of terms of smaller length, which are already determined by the induction hypothesis.
Example 5.3. (1) Consider a_1 ∈ A_1 and a_2 ∈ A_2. Then we have

0 = E[(a_1 − E[a_1])(a_2 − E[a_2])] = E[a_1 a_2] − E[a_1·E[a_2]] − E[E[a_1]·a_2] + E[E[a_1]·E[a_2]].

The three last terms are actually all equal to E[a_1]·E[a_2], which leads to

E[a_1 a_2] = E[a_1]·E[a_2].

(2) Consider a_1, ã_1 ∈ A_1 and a_2 ∈ A_2. Then we have

0 = E[(a_1 − E[a_1])(a_2 − E[a_2])(ã_1 − E[ã_1])] = E[a_1 a_2 ã_1] − E[a_1·E[a_2]·ã_1] + six other terms which cancel.

Thus we obtain

E[a_1 a_2 ã_1] = E[a_1 E[a_2] ã_1].

This cannot be factorized further, as E[a_2] ∈ B does in general not commute with a_1 or ã_1. However, this is okay, as E[a_2] ∈ B and hence a_1 E[a_2] ã_1 ∈ A_1, so E[a_1 E[a_2] ã_1] is a moment which is determined by E[a_2] and by E|_{A_1}.
(3) For a_1, ã_1 ∈ A_1 and a_2, ã_2 ∈ A_2 one calculates in the same way

E[a_1 a_2 ã_1 ã_2] = E[a_1 E[a_2] ã_1]·E[ã_2] + E[a_1]·E[a_2 E[ã_1] ã_2] − E[a_1] E[a_2] E[ã_1] E[ã_2].

Remark 5.4. (1) If B = ℂ and E = φ, then φ(a_2) commutes with everything and we can factorize the final results, like

φ(a_1 a_2 ã_1) = φ(a_1 φ(a_2) ã_1) = φ(a_1 ã_1) φ(a_2),

and we get the formulas from usual (scalar-valued) free probability.
(2) Note: on the level of moments, operator-valued freeness works like scalar-valued freeness, but one has to keep the original order of the elements.
(3) Note also that with respect to E: A → B the "non-commutative scalars" B are free from any subalgebra.
(4) For a random variable X ∈ A, the restriction of E to B⟨X⟩ is exactly the information about the moments of X.
Hence Proposition 5.2 says in this case that the joint moments of free variables X_i (i ∈ I) are determined by the moments of the individual variables. For example, for X and Y free we have

E[XbY] = E[X]·b·E[Y] = E[Xb]·E[Y] = E[X]·E[bY]

and

E[X b_1 Y b_2 X] = E[X b_1·E[Y]·b_2 X],

where E[Y] is a moment of Y and the whole right-hand side is then a moment of X.
(5) Note that Proposition 5.2 gives us essentially a free product construction on an algebraic level. Since we want to do our constructions on an analytic C∗-probability level, we should extend our abstract notion of B-valued distributions from Definition 4.7 from the case of one variable to the multivariate case.

5.2 B-valued joint distributions

Definition 5.5. (1) Let B be a unital C∗-algebra. We denote by B⟨x_i; i ∈ I⟩ the non-commutative polynomials in the formal variables x_i (i ∈ I) with coefficients from B; they are linearly spanned by monomials of the form

b_0 x_{i_1} b_1 x_{i_2} ⋯ b_{k−1} x_{i_k} b_k

with k ∈ ℕ_0; b_0, …, b_k ∈ B; i_1, …, i_k ∈ I. This becomes a ∗-algebra by declaring x_i∗ = x_i for all i ∈ I.
(2) A B-valued (joint) distribution is a linear map µ: B⟨x_i; i ∈ I⟩ → B such that
(i) µ(1) = 1;
(ii) µ is a B-B-bimodule map;
(iii) µ is completely positive, i.e., id ⊗ µ(p∗p) ≥ 0 for all n ∈ ℕ and p = p(x_i; i ∈ I) ∈ M_n(B⟨x_i; i ∈ I⟩);
(iv) µ is exponentially bounded, i.e., there exists M > 0 such that for all k ∈ ℕ, b_1, …, b_{k−1} ∈ B, i_1, …, i_k ∈ I the following holds:

∥µ(x_{i_1} b_1 x_{i_2} ⋯ b_{k−1} x_{i_k})∥ ≤ M^k ∥b_1∥ ⋯ ∥b_{k−1}∥.

We denote

Σ_B^I := { µ satisfying (i), (ii), (iii) },  Σ_B^{I,0} := { µ ∈ Σ_B^I satisfying also (iv) }.

(3) If (A, B, E) is a B-valued C∗-probability space and X_i = X_i∗ ∈ A for all i ∈ I, then the B-valued joint distribution µ_{(X_i; i∈I)} ∈ Σ_B^{I,0} is given by

µ_{(X_i; i∈I)}(p(x_i; i ∈ I)) := E[p(X_i; i ∈ I)] for all p(x_i; i ∈ I) ∈ B⟨x_i; i ∈ I⟩.
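The factorization formulas of Example 5.3 can be probed numerically via asymptotic freeness of independent Wigner matrices with respect to the normalized trace. The following sketch tests the scalar-valued form of Example 5.3 (2), φ(a_1 a_2 ã_1) = φ(a_1 ã_1) φ(a_2); the concrete choices a_1 = ã_1 = P and a_2 = Q² are purely illustrative assumptions.

```python
import numpy as np

# Numerical sketch: scalar factorization for free elements, modeled by two
# independent Wigner matrices (asymptotically free w.r.t. phi = normalized trace).
rng = np.random.default_rng(2)
N = 400
P = rng.standard_normal((N, N)); P = (P + P.T) / np.sqrt(2 * N)
Q = rng.standard_normal((N, N)); Q = (Q + Q.T) / np.sqrt(2 * N)
phi = lambda M: np.trace(M) / N

lhs = phi(P @ Q @ Q @ P)        # phi(a1 a2 ã1) with a1 = ã1 = P, a2 = Q^2
rhs = phi(P @ P) * phi(Q @ Q)   # phi(a1 ã1) phi(a2)
print(lhs, rhs)                 # both ≈ 1 for large N
```

The two values agree up to O(1/N) fluctuations, while the naive full factorization φ(a_1)φ(a_2)φ(ã_1) would give ≈ 0 here.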
Theorem 5.6.
For a unital C∗-algebra B and for a linear map µ: B⟨x_i; i ∈ I⟩ → B the following are equivalent.
(i) µ ∈ Σ_B^{I,0}.
(ii) There exist a B-valued C∗-probability space (A, B, E) and X_i = X_i∗ ∈ A for each i ∈ I such that µ = µ_{(X_i; i∈I)}.
"Proof". This can be done as in the proof of Theorem 4.9, or it can also be reduced (at least for |I| < ∞) directly to Theorem 4.9 with the usual matrix trick by taking a diagonal matrix X, where the X_i are sitting on the diagonal.

5.3 Compatibility of operator-valued freeness with matrix amplifications

Proposition 5.7.
Let (A, B, E) be an operator-valued probability space and let B ⊂ A_i ⊂ A, i ∈ I, be free over B. Then, for any n ∈ ℕ, in the operator-valued probability space (M_n(A), M_n(B), id ⊗ E), the subalgebras M_n(B) ⊂ M_n(A_i) ⊂ M_n(A), i ∈ I, are free over M_n(B).

Proof. Consider A_j ∈ M_n(A_{i_j}) such that i_1 ≠ i_2 ≠ ⋯ ≠ i_r and id ⊗ E[A_j] = 0 for j = 1, …, r. We have to show that id ⊗ E[A_1 ⋯ A_r] = 0. Write A_j = (a^(j)_{kl})_{k,l=1}^n with a^(j)_{kl} ∈ A_{i_j}. Then id ⊗ E[A_j] = (E[a^(j)_{kl}])_{k,l} = 0 means that E[a^(j)_{kl}] = 0 for all k, l. For A := A_1 ⋯ A_r = (a_{kl})_{k,l=1}^n we have

a_{kl} = Σ_{r_1, …, r_{r−1}=1}^n a^(1)_{k r_1} a^(2)_{r_1 r_2} ⋯ a^(r)_{r_{r−1} l},

with a^(1)_{k r_1} ∈ A_{i_1}, a^(2)_{r_1 r_2} ∈ A_{i_2}, …, a^(r)_{r_{r−1} l} ∈ A_{i_r}. For each fixed choice of r_1, …, r_{r−1}, the factors in the product are coming alternatingly from different subalgebras and each is centred under E. Hence, by the freeness of the A_i, we get

E[a_{kl}] = Σ E[a^(1)_{k r_1} a^(2)_{r_1 r_2} ⋯ a^(r)_{r_{r−1} l}] = 0,

and thus id ⊗ E[A] = (E[a_{kl}])_{k,l=1}^n = 0.

Remark 5.8. (1) Note that M_n(A) is also a B-valued probability space with respect to tr ⊗ E, where tr denotes the normalized trace on M_n(ℂ). We are not claiming freeness in this space — this is actually not true in general.
For example, consider a scalar-valued probability space (A, φ). Then M_2(A) is both a scalar-valued probability space (with respect to tr ⊗ φ) and an operator-valued probability space (with respect to id ⊗ φ). Freeness with respect to φ goes only over to freeness with respect to id ⊗ φ, but not with respect to tr ⊗ φ. For example, if a_1, ã_1 ∈ A_1 and a_2, ã_2 ∈ A_2 are free in A, then for

A_1 = ( a_1  0 ; 0  ã_1 ) ∈ M_2(A_1)  and  A_2 = ( a_2  0 ; 0  ã_2 ) ∈ M_2(A_2)

we have A_1 A_2 = ( a_1 a_2  0 ; 0  ã_1 ã_2 ) and

id ⊗ φ[A_1 A_2] = ( φ(a_1 a_2)  0 ; 0  φ(ã_1 ã_2) ) = ( φ(a_1)φ(a_2)  0 ; 0  φ(ã_1)φ(ã_2) ) = ( φ(a_1)  0 ; 0  φ(ã_1) )·( φ(a_2)  0 ; 0  φ(ã_2) ) = id ⊗ φ[A_1]·id ⊗ φ[A_2];

on the scalar-valued level, on the other side, we have in general:

tr ⊗ φ(A_1 A_2) = ½[φ(a_1)φ(a_2) + φ(ã_1)φ(ã_2)] ≠ ¼[φ(a_1) + φ(ã_1)]·[φ(a_2) + φ(ã_2)] = tr ⊗ φ(A_1)·tr ⊗ φ(A_2).

(2) Note however that, even if in the end we are only interested in moments with respect to tr ⊗ E, it is good to know something about the moments with respect to id ⊗ E, since those are related by tr ⊗ E = tr ∘ [id ⊗ E]; i.e., instead of going directly down to B,

M_n(A) —(tr ⊗ E)→ B,

we can also decompose this into two steps:

M_n(A) —(id ⊗ E)→ M_n(B) —(tr)→ B.

This simple observation will be crucial for our later investigations!
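The dichotomy of Remark 5.8 (1) can be observed numerically. The following sketch models the free pairs by two independent Wigner matrices P, Q (asymptotically free with respect to φ = tr_N); the particular choices a_1 = P, ã_1 = P², a_2 = Q², ã_2 = Q are illustrative assumptions only.

```python
import numpy as np

# Numerical sketch of Remark 5.8 (1): multiplicativity survives under id ⊗ phi
# (entrywise on diagonal matrices) but fails under tr ⊗ phi.
rng = np.random.default_rng(3)
N = 400
P = rng.standard_normal((N, N)); P = (P + P.T) / np.sqrt(2 * N)
Q = rng.standard_normal((N, N)); Q = (Q + Q.T) / np.sqrt(2 * N)
phi = lambda M: np.trace(M) / N

# A1 = diag(a1, ã1), A2 = diag(a2, ã2) with a1 = P, ã1 = P², a2 = Q², ã2 = Q.
# id ⊗ phi is multiplicative entry by entry:
assert abs(phi(P @ Q @ Q) - phi(P) * phi(Q @ Q)) < 0.1          # phi(a1 a2) ≈ phi(a1)phi(a2)
assert abs(phi(P @ P @ Q) - phi(P @ P) * phi(Q)) < 0.1          # phi(ã1 ã2) ≈ phi(ã1)phi(ã2)

# tr ⊗ phi is NOT multiplicative:
trphi_A1A2 = (phi(P @ Q @ Q) + phi(P @ P @ Q)) / 2                         # ≈ 0
trphi_A1_A2 = ((phi(P) + phi(P @ P)) / 2) * ((phi(Q @ Q) + phi(Q)) / 2)    # ≈ 1/4
print(trphi_A1A2, trphi_A1_A2)
```

The two tr ⊗ φ values differ by roughly 1/4, confirming that freeness does not pass to the scalar-valued trace on the amplified space.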
5.4 Structure of formulas for mixed moments in free variables

Remark 5.9. (1) We have to understand better the structure of the formulas for mixed moments in free variables. This is analogous to the scalar-valued case; in particular, non-crossing partitions will feature prominently. For the relevant definitions and notations in relation with partitions and kernels of multi-indices we refer to Chapter 2 of the Free Probability Lecture Notes.
(2) As in the scalar-valued case, we get for "non-crossing moments" a kind of factorization into the moments of the individual subalgebras; however, we have now to respect the nestings of the blocks. This is just an iteration of the "factorization" from Example 5.3,

E[a_1 a_2 ã_1] = E[a_1·E[a_2]·ã_1]  for {a_1, ã_1} free from a_2.   (5.1)

For example, consider {a_1, a_2, a_3}, {e_1, e_2}, c, d which are free with respect to E. Then we can iterate the factorization (5.1) as follows:

E[a_1 e_1 c e_2 a_2 d a_3] = E[(a_1 E[e_1 c e_2] a_2) d a_3] = E[a_1 E[e_1 E[c] e_2] a_2 E[d] a_3],

using E[e_1 c e_2] = E[e_1 E[c] e_2] in the inner step. We will denote this "factorization" by E_π[a_1, e_1, c, e_2, a_2, d, a_3] for the non-crossing partition π = {{1, 5, 7}, {2, 4}, {3}, {6}} ∈ NC(7) determined by the grouping of the variables.
(3) Note that also for "crossing moments" only non-crossing factorizations show up in the formula expressing them in the individual moments, like in part (3) of Example 5.3, for {a_1, ã_1} free from {a_2, ã_2}:

E[a_1 a_2 ã_1 ã_2] = E[a_1 E[a_2] ã_1]·E[ã_2] + E[a_1]·E[a_2 E[ã_1] ã_2] − E[a_1] E[a_2] E[ã_1] E[ã_2].

This is quite relevant in the operator-valued case; whereas in the scalar-valued situation the meaning of a crossing term like

φ_π(a_1, a_2, ã_1, ã_2) = φ(a_1 ã_1)·φ(a_2 ã_2)

is clear, there is no canonical definition for E_π[a_1, a_2, ã_1, ã_2] in the operator-valued case:

E[a_1 ã_1]·E[a_2 ã_2] ≠ E[a_2 ã_2]·E[a_1 ã_1] in general,

and there is no nested version which respects the order of the variables.

Definition 5.10. (1) Let B ⊂ A be an inclusion of unital subalgebras.
A B-balanced map T: Aⁿ → B is a ℂ-multilinear map which satisfies also the following conditions for all a_1, …, a_n ∈ A, b, b′ ∈ B, k = 1, …, n − 1:

T(b a_1, a_2, …, a_n b′) = b T(a_1, a_2, …, a_n) b′,
T(a_1, …, a_k b, a_{k+1}, …, a_n) = T(a_1, …, a_k, b a_{k+1}, …, a_n).

(2) For a given sequence T_n: Aⁿ → B (n ∈ ℕ) of B-balanced maps, we define the corresponding multiplicative maps T_π (n ∈ ℕ, π ∈ NC(n)) recursively on the number of blocks by: T_{1_n} := T_n for all n ∈ ℕ, where 1_n denotes the partition with one block; and, for

π = σ ∪ {(p + 1, p + 2, …, p + q)} ∈ NC(n), with (p + 1, …, p + q) an interval block,

we set

T_π(a_1, …, a_n) := T_σ(a_1, …, a_p·T_q(a_{p+1}, …, a_{p+q}), a_{p+q+1}, …, a_n).

Note that T_π is also B-balanced.

Example 5.11. For π = {{1, 10}, {2, 5, 9}, {3, 4}, {6}, {7, 8}} ∈ NC(10) we have

T_π(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_{10}) = T_2(a_1·T_3(a_2·T_2(a_3, a_4), a_5·T_1(a_6)·T_2(a_7, a_8), a_9), a_{10}).

Proposition 5.12.
Let (A, B, E) be a B-valued probability space and let B ⊂ A_i ⊂ A, i ∈ I, be free with respect to E. We denote, for n ∈ ℕ, by E_n: Aⁿ → B the B-balanced map given by E_n(a_1, a_2, …, a_n) := E[a_1 a_2 ⋯ a_n] and by E_π, for all n ∈ ℕ, π ∈ NC(n), the corresponding multiplicative maps. Consider now a_j ∈ A_{i_j} for j = 1, …, k. If ker i ∈ P(k) is non-crossing, then

E[a_1 a_2 ⋯ a_k] = E_{ker i}(a_1, a_2, …, a_k).

Proof. By iteration of (5.1); for example,

E[a_1 a_2 ã_1] = E[a_1·E[a_2]·ã_1] = E_π(a_1, a_2, ã_1) for π = {{1, 3}, {2}} and {a_1, ã_1} free from a_2.

5.5 Positivity of free product constructions

Proposition 5.13.
Let (A, B, E) be a B-valued probability space. Assume that B is a unital C∗-algebra and A a ∗-algebra. Let ∗-subalgebras B ⊂ A_i ⊂ A, i ∈ I, be free with respect to E and assume that A is generated by all A_i, i ∈ I, as an algebra. If E is positive restricted to each A_i, then it is also positive on A. (Recall that "positive" on a ∗-algebra A means that E[aa∗] ≥ 0 for all a ∈ A.)

Proof. (i) As in the proof of Proposition 5.2 one can see by recursion that each element in A can be written as a linear combination of elements of the form a_1 ⋯ a_n with
- n ∈ ℕ_0 (n = 0 corresponding to elements of B);
- a_k ∈ A_{i_k};
- i_1 ≠ i_2 ≠ ⋯ ≠ i_n;
- E[a_k] = 0 for k = 1, …, n.
If a is a sum of such elements and we want to argue that E[aa∗] ≥ 0, then we have to understand E applied to a product of two such elements.
(ii) So let us consider two such elements a_1 ⋯ a_n and ã_1 ⋯ ã_m as above, with a_k ∈ A_{i_k} and ã_l ∈ A_{j_l}. Then we have

E[a_1 ⋯ a_n ã_m∗ ⋯ ã_1∗] = δ_{nm} E[a_1 E[a_2 ⋯ E[a_n ã_n∗] ⋯ ã_2∗] ã_1∗],

which is only different from 0 if i_k = j_k for all k = 1, …, n. As an example for the derivation of the above formula consider n = m = 3:

E[a_1 a_2 a_3 ã_3∗ ã_2∗ ã_1∗] = E[a_1 a_2·((a_3 ã_3∗)° + E[a_3 ã_3∗])·ã_2∗ ã_1∗] = E[a_1 a_2 E[a_3 ã_3∗] ã_2∗ ã_1∗] = E[a_1 E[a_2 E[a_3 ã_3∗] ã_2∗] ã_1∗],

where in the middle step the term with (a_3 ã_3∗)° vanishes as an alternating product of centred elements, and in the last step the same centering argument is applied to a_2 E[a_3 ã_3∗] ã_2∗. Hence, for the calculation of E[aa∗], it suffices to consider a which are sums of products of the same length and the same i-pattern.
(iii) Consider

a = Σ_{k=1}^r a_1^(k) ⋯ a_n^(k),

where n ∈ ℕ, r ∈ ℕ, a_j^(k) ∈ A_{i_j} for all k = 1, …, r with i_1 ≠ i_2 ≠ ⋯ ≠ i_n and E[a_j^(k)] = 0 for all j = 1, …, n and k = 1, …, r. Then we have

E[aa∗] = Σ_{k,l=1}^r E[a_1^(k) ⋯ E[a_{n−1}^(k) E[a_n^(k) a_n^(l)∗] a_{n−1}^(l)∗] ⋯ a_1^(l)∗].

The matrix (E[a_n^(k) a_n^(l)∗])_{k,l} is a positive matrix in M_r(B) since E is completely positive (see Exercise 9). But since B and thus also M_r(B) is a C∗-algebra, this means that we can write this positive matrix as BB∗ for some B = (b^(k)_{r_n})_{k,r_n=1}^r ∈ M_r(B), which yields then concretely that

E[a_n^(k) a_n^(l)∗] = Σ_{r_n=1}^r b^(k)_{r_n} b^(l)∗_{r_n} for all k, l = 1, …, r.

Thus we can continue our above calculation as follows:

E[aa∗] = Σ_{k,l=1}^r E[a_1^(k) ⋯ E[a_{n−1}^(k)·Σ_{r_n=1}^r b^(k)_{r_n} b^(l)∗_{r_n}·a_{n−1}^(l)∗] ⋯ a_1^(l)∗] = Σ_{k,l=1}^r Σ_{r_n=1}^r E[a_1^(k) ⋯ E[(a_{n−1}^(k) b^(k)_{r_n})(a_{n−1}^(l) b^(l)_{r_n})∗] ⋯ a_1^(l)∗].

Again, (E[(a_{n−1}^(k) b^(k)_{r_n})(a_{n−1}^(l) b^(l)_{r_n})∗])_{k,l} is a positive matrix in M_r(B) and its entries can thus be written in the form

E[(a_{n−1}^(k) b^(k)_{r_n})(a_{n−1}^(l) b^(l)_{r_n})∗] = Σ_{r_{n−1}=1}^r b^(k)_{r_{n−1},r_n} b^(l)∗_{r_{n−1},r_n}

for some b^(k)_{r_{n−1},r_n} ∈ B. Iterating this leads finally to

E[aa∗] = Σ_{k,l=1}^r Σ_{r_1=1}^r ⋯ Σ_{r_n=1}^r b^(k)_{r_1,…,r_n} b^(l)∗_{r_1,…,r_n} = Σ_{r_1,…,r_n} (Σ_{k=1}^r b^(k)_{r_1,…,r_n})(Σ_{l=1}^r b^(l)_{r_1,…,r_n})∗ ≥ 0.

Theorem 5.14.
Let $\mathcal B$ be a unital $C^*$-algebra. Let $\mu_I\in\Sigma_{I,\mathcal B}$ be a joint distribution on $\mathcal B\langle x_i;i\in I\rangle$ and $\mu_J\in\Sigma_{J,\mathcal B}$ be a joint distribution on $\mathcal B\langle y_j;j\in J\rangle$, with $I\cap J=\emptyset$. Then there exists a uniquely determined $\mu\in\Sigma_{I\cup J,\mathcal B}$ on $\mathcal B\langle x_i,y_j;i\in I,j\in J\rangle$ such that:
○ $\mu$ restricted to $\mathcal B\langle x_i;i\in I\rangle$ is $\mu_I$ and $\mu$ restricted to $\mathcal B\langle y_j;j\in J\rangle$ is $\mu_J$;
○ $\mathcal B\langle x_i;i\in I\rangle$ and $\mathcal B\langle y_j;j\in J\rangle$ are free with respect to $\mu$.
We write then $\mu=\mu_I*\mu_J$.

Proof. As a linear map we can define $\mu$ (uniquely!) by the knowledge of $\mu_I$ and $\mu_J$ and the freeness condition, by writing each element in $\mathcal B\langle x_i,y_j;i\in I,j\in J\rangle$ as a linear combination of alternating products of centred elements from $\mathcal B\langle x_i;i\in I\rangle$ and from $\mathcal B\langle y_j;j\in J\rangle$. On all such products $\mu$ is set to $0$; only on constant terms $b\in\mathcal B$ it is $\mu(b)=b$.
One has then to check the properties (i)-(iv) from Definition 5.5 in order to see that $\mu\in\Sigma_{I\cup J,\mathcal B}$. (i) and (ii) are clear. (iii) on the base level is just Proposition 5.13; that it is also true for the matrix amplifications follows from the same proposition, if we take also into account that freeness between $\mathcal B\langle x_i;i\in I\rangle$ and $\mathcal B\langle y_j;j\in J\rangle$ goes over to matrices as well, by Proposition 5.7. For (iv) we have to see that we also get exponential bounds for mixed moments in the $x_i$ and $y_j$, if they are free and if we have such bounds for the $x_i$, $i\in I$, and for the $y_j$, $j\in J$, separately. We will see this later, when we have developed more theory for the structure of such mixed moments; see Example 9.5.

Corollary 5.15.
Let $\mathcal B$ be a unital $C^*$-algebra. For each $p=p(x_i;i\in I)\in\mathcal B\langle x_i;i\in I\rangle$ with $p=p^*$ we have a corresponding operation $p^\square$ on $\Sigma_{\mathcal B}$ given by
$$p^\square\colon\underbrace{\Sigma_{\mathcal B}\times\cdots\times\Sigma_{\mathcal B}}_{|I|\text{-times}}\to\Sigma_{\mathcal B},\qquad(\mu_i)_{i\in I}\mapsto p^\square(\mu_i;i\in I),$$
where $p^\square(\mu_i;i\in I)$ is the distribution of $p(x_i;i\in I)$ with respect to $*_{i\in I}\,\mu_i$.

Remark. (1) Note that via matrix amplifications we can also do the same for all selfadjoint $p\in M_n(\mathcal B\langle x_i;i\in I\rangle)$.
(2) $\square$ is the generic symbol for an operation with free variables, to be used with care and imagination; for example, we have the free convolution $\mu_1\boxplus\mu_2$ for $p(x_1,x_2)=x_1+x_2$, the free anti-commutator $\{\mu_1\square\mu_2\}$ for $p(x_1,x_2)=x_1x_2+x_2x_1$, and the free commutator $[\mu_1\square\mu_2]$ for $p(x_1,x_2)=i(x_1x_2-x_2x_1)$.
(3) In the scalar-valued case, $\mathcal B=\mathbb C$, all those operations $p^\square$ are on the level of compactly supported probability measures. In the Free Probability Lecture Notes we saw how to deal with $\mu_1\boxplus\mu_2$, but we could not address general $p^\square$. We will see later that in our operator-valued context we have tools for dealing with such general $p^\square$.

6 Operator-Valued Free Central Limit Theorem and Operator-Valued Semicircular Elements
Our benchmark distribution of free semicircular variables from Section 1.3 corresponds on the operator-valued level to an operator-valued semicircular element. This element also arises canonically in the abstract operator-valued theory as the limit distribution in a free central limit theorem; furthermore, this operator-valued distribution has a very concrete and nice description both on a combinatorial level (via moments) and on an analytic level (via an explicit equation for its operator-valued Cauchy transform).
Remark 6.1. (1) A central limit theorem asks about the limit distribution of
$$D_{1/\sqrt N}\bigl(\underbrace{\mu\boxplus\cdots\boxplus\mu}_{N\text{ times}}\bigr)\xrightarrow{N\to\infty}\;?$$
where $D_{1/\sqrt N}$ denotes dilation by a factor $1/\sqrt N$. In terms of random variables the question can be stated as
$$\frac{X_1+\cdots+X_N}{\sqrt N}\xrightarrow{N\to\infty}\;?$$
if the $X_i$ are free and identically distributed (f.i.d.). The relevant convergence is "in distribution", which means that moments converge. Since moments are elements in $\mathcal B$, we also have to specify the type of convergence there – we will usually take convergence in norm in $\mathcal B$.
(2) The relevant information about the input distribution is the second moment (first moments are assumed to be zero); in the operator-valued case the second moment is given by a mapping $\eta\colon\mathcal B\to\mathcal B$ with $\eta(b):=E[XbX]$. In a $C^*$-setting $E$, and thus also $\eta$, must be completely positive: for $(b_{ij})_{i,j=1}^n\in M_n(\mathcal B)$ we have
$$\mathrm{id}\otimes E\bigl[1\otimes X\cdot(b_{ij})_{i,j=1}^n\cdot1\otimes X\bigr]=\bigl(\underbrace{E[Xb_{ij}X]}_{\eta(b_{ij})}\bigr)_{i,j=1}^n=\mathrm{id}\otimes\eta\bigl((b_{ij})_{i,j=1}^n\bigr),$$
and thus $\mathrm{id}\otimes\eta(bb^*)=\mathrm{id}\otimes E[(1\otimes X\cdot b)(1\otimes X\cdot b)^*]\ge0$. Every completely positive $\eta$ can show up as second moment of a $\mu\in\Sigma_{\mathcal B}$; see Exercise 16.
(3) Much of the calculation for the central limit theorem and the description of the limit is similar to the scalar-valued situation (see Chapter 2 of the Free Probability Lecture Notes). Let us first check the calculation of the moments in the limit. We consider $(X_i)_{i\in\mathbb N}$ which are f.i.d. with respect to $E$. We also assume that
○ the $X_i$ are centred: $E[X_i]=0$ for all $i\in\mathbb N$;
○ the second moments are given by $\eta\colon\mathcal B\to\mathcal B$: $E[X_ibX_i]=\eta(b)$ for all $i\in\mathbb N$ and all $b\in\mathcal B$.
Then we put $S_N:=(X_1+\cdots+X_N)/\sqrt N$ and calculate its moments.
We have
$$E[S_Nb_1S_Nb_2\cdots S_Nb_{k-1}S_N]=\frac1{N^{k/2}}\sum_{i\colon[k]\to[N]}E[X_{i(1)}b_1X_{i(2)}b_2\cdots X_{i(k-1)}b_{k-1}X_{i(k)}]$$
$$=\frac1{N^{k/2}}\sum_{\pi\in\mathcal P(k)}\;\sum_{\substack{i\colon[k]\to[N]\\ \ker i=\pi}}\underbrace{E[X_{i(1)}b_1X_{i(2)}b_2\cdots X_{i(k-1)}b_{k-1}X_{i(k)}]}_{=:g(\pi),\text{ depends only on }\ker i\text{ by Prop.\ 5.2}}=\frac1{N^{k/2}}\sum_{\pi\in\mathcal P(k)}g(\pi)\cdot\underbrace{\#\{i\colon[k]\to[N]\mid\ker i=\pi\}}_{\sim N^{|\pi|}}.$$
Now observe that if $\pi$ has a singleton, then $g(\pi)=0$, because we have $E[X_i]=0$. Hence only $\pi\in\mathcal P(k)$ without singletons contribute; for those we have necessarily $|\pi|\le k/2$. Now we have enough information to go to the limit $N\to\infty$: there only $\pi$ with $|\pi|=k/2$ survive, i.e., pair partitions $\pi\in\mathcal P_2(k)$. If $\pi$ is crossing, then the definition of freeness (together with stripping of interval pairs) gives $g(\pi)=0$; here is an example which illustrates this, for the crossing pair partition $\pi=\{\{1,3\},\{2,6\},\{4,5\}\}$:
$$g(\pi)=E[X_1b_1X_2b_2X_1b_3X_3b_4X_3b_5X_2]=E\bigl[X_1b_1\,X_2b_2\,X_1\,b_3\underbrace{E[X_3b_4X_3]}_{\in\mathcal B}b_5\,X_2\bigr]=0,$$
since after replacing the interval pair by $E[X_3b_4X_3]\in\mathcal B$ the remaining product is alternating in centred free elements. Hence
$$\lim_{N\to\infty}E[S_Nb_1S_Nb_2\cdots S_Nb_{k-1}S_N]=\sum_{\pi\in NC_2(k)}g(\pi).$$
Up to this point we just repeated the arguments for the scalar-valued case. But now there will be a difference, namely $g(\pi)$ is not the same for all $\pi\in NC_2(k)$. We have
$$g(\{1,2\})=E[X_1b_1X_1]=\eta(b_1),$$
$$g(\{1,2\},\{3,4\})=E[X_1b_1X_1b_2X_2b_3X_2]=E\bigl[\underbrace{E[X_1b_1X_1]}_{\eta(b_1)}b_2\underbrace{E[X_2b_3X_2]}_{\eta(b_3)}\bigr]=\eta(b_1)\,b_2\,\eta(b_3),$$
$$g(\{1,4\},\{2,3\})=E[X_1b_1X_2b_2X_2b_3X_1]=E\bigl[X_1b_1\underbrace{E[X_2b_2X_2]}_{\eta(b_2)}b_3X_1\bigr]=\eta\bigl(b_1\eta(b_2)b_3\bigr).$$
Thus the limit variable $S$ has moments
$$E[Sb_1S]=\eta(b_1),\qquad E[Sb_1Sb_2Sb_3S]=\eta(b_1)\,b_2\,\eta(b_3)+\eta\bigl(b_1\eta(b_2)b_3\bigr),$$
and in general
$$E[Sb_1S\cdots Sb_{k-1}S]=\sum_{\pi\in NC_2(k)}\eta_\pi(b_1,\dots,b_{k-1}),$$
where $\eta_\pi\colon\mathcal B^{k-1}\to\mathcal B$ is the $\mathbb C$-multilinear map given by $\eta_\pi(b_1,\dots,b_{k-1})=E_\pi[X_{i(1)}b_1,X_{i(2)}b_2,\dots,X_{i(k-1)}b_{k-1},X_{i(k)}]$ for an $i$ with $\ker i=\pi$. Note that even if all $b_1,\dots,b_{k-1}$ are equal to $1$, the contributions of the $\eta_\pi(1,1,\dots,1)$ are different in general:
$$\eta_{\{1,2\},\{3,4\}}(1,1,1)=\eta(1)\cdot1\cdot\eta(1)=\eta(1)^2,\qquad\eta_{\{1,4\},\{2,3\}}(1,1,1)=\eta\bigl(1\cdot\eta(1)\cdot1\bigr)=\eta\bigl(\eta(1)\bigr).$$
Note that $\eta$ does not need to be unital: $\eta(1)\neq1$ in general.

Theorem and Definition 6.2.
Let $(\mathcal A,\mathcal B,E)$ be a $\mathcal B$-valued $C^*$-probability space. Consider selfadjoint $X_i\in\mathcal A$, $i\in\mathbb N$, which are f.i.d. (free and identically distributed) with
○ $E[X_i]=0$ for all $i\in\mathbb N$;
○ $E[X_ibX_i]=\eta(b)$ for all $i\in\mathbb N$ and $b\in\mathcal B$, for a completely positive $\eta\colon\mathcal B\to\mathcal B$.
Put $S_N:=(X_1+\cdots+X_N)/\sqrt N$. Then $\mu_{S_N}$ converges in distribution for $N\to\infty$ to $\nu_\eta\in\Sigma_{\mathcal B}$, which is given by
$$\nu_\eta(b_0xb_1\cdots b_{k-1}xb_k)=\sum_{\pi\in NC_2(k)}b_0\,\eta_\pi(b_1,\dots,b_{k-1})\,b_k\tag{6.1}$$
for $k\in\mathbb N$ and $b_0,\dots,b_k\in\mathcal B$. In particular, this says that all odd moments are zero. Such a distribution $\nu_\eta\in\Sigma_{\mathcal B}$, given by (6.1), is called a $\mathcal B$-valued semicircular distribution with covariance $\eta$. A selfadjoint element $S$ with $\mu_S=\nu_\eta$ is called a ($\mathcal B$-valued) semicircular element.

Remark 6.3. (1) Note that this definition is compatible with amplifications: if $S$ is a semicircular element in $(\mathcal A,\mathcal B,E)$ with covariance $\eta\colon\mathcal B\to\mathcal B$, then $1\otimes S$ is a semicircular element in $(M_n(\mathcal A),M_n(\mathcal B),\mathrm{id}\otimes E)$ with covariance $\mathrm{id}\otimes\eta\colon M_n(\mathcal B)\to M_n(\mathcal B)$. For a more general version of this see also Exercise 17.
(2) Let us check that indeed $\nu_\eta\in\Sigma_{\mathcal B}$, i.e., that we have positivity and exponential boundedness. One can do this by constructing bounded operators on the full Fock space which have $\nu_\eta$ as distribution (for this see Exercise 19). We do it here more abstractly.
(i) Positivity: since positivity is preserved in a central limit, we only need a distribution $\mu\in\Sigma_{\mathcal B}$ which has first moment zero and second moment given by $\eta$. In Exercise 16 we construct such a distribution, an operator-valued Bernoulli element.
(ii) Exponential boundedness: we have to estimate the norm of
$$\nu_\eta(xb_1\cdots b_{k-1}x)=\sum_{\pi\in NC_2(k)}\eta_\pi(b_1,\dots,b_{k-1})$$
for $k=2m$ even. Note first that $\eta$ as a positive map is bounded, i.e., $\|\eta(b)\|\le\|\eta\|\cdot\|b\|$ for all $b\in\mathcal B$, where $\|\eta\|<\infty$. This implies that we have for each $\pi\in NC_2(2m)$
$$\|\eta_\pi(b_1,\dots,b_{k-1})\|\le\|\eta\|^m\cdot\|b_1\|\cdots\|b_{2m-1}\|.$$
As an illustration for this let us have a look at the estimates for the two contributions of order 4:
$$\|\eta_{\{1,2\},\{3,4\}}(b_1,b_2,b_3)\|=\|\eta(b_1)\cdot b_2\cdot\eta(b_3)\|\le\|\eta(b_1)\|\cdot\|b_2\|\cdot\|\eta(b_3)\|\le\|\eta\|^2\cdot\|b_1\|\cdot\|b_2\|\cdot\|b_3\|$$
and
$$\|\eta_{\{1,4\},\{2,3\}}(b_1,b_2,b_3)\|=\|\eta(b_1\cdot\eta(b_2)\cdot b_3)\|\le\|\eta\|\cdot\|b_1\|\cdot\|\eta(b_2)\|\cdot\|b_3\|\le\|\eta\|^2\cdot\|b_1\|\cdot\|b_2\|\cdot\|b_3\|.$$
Thus – by also using the fact that the number of elements of $NC_2(2m)$ is given by the $m$-th Catalan number, which is smaller than $4^m$ – we can now get our exponential bound:
$$\|\nu_\eta(xb_1\cdots b_{k-1}x)\|\le\#NC_2(2m)\cdot\|\eta\|^m\cdot\|b_1\|\cdots\|b_{2m-1}\|\le\underbrace{4^m\|\eta\|^m}_{(4\|\eta\|)^m}\cdot\|b_1\|\cdots\|b_{2m-1}\|.$$
(3) If $S$ is $\mathcal B$-valued semicircular, then
$$1\otimes S=\begin{pmatrix}S&&0\\&\ddots&\\0&&S\end{pmatrix}$$
is also semicircular, over $M_m(\mathcal B)$. This is true more generally: if we have free semicircular elements over $\mathcal B$ and put linear combinations of them as entries into a selfadjoint $m\times m$-matrix, then this is an $M_m(\mathcal B)$-valued semicircular element. The proof can be done by using our free central limit theorem. Let us elaborate on this via the example
$$S=\begin{pmatrix}0&S_1\\S_1&S_2\end{pmatrix},$$
where $S_1$ and $S_2$ are free and semicircular over $\mathcal B$, with covariances $\eta_1$ and $\eta_2$, respectively. Then we can realize $S_1$ and $S_2$ as
$$S_1=\lim_{N\to\infty}\frac{X_1+\cdots+X_N}{\sqrt N},\qquad S_2=\lim_{N\to\infty}\frac{Y_1+\cdots+Y_N}{\sqrt N},$$
where all $X_i,Y_j$ are free and $E[X_i]=0=E[Y_j]$, $E[X_ibX_i]=\eta_1(b)$, $E[Y_jbY_j]=\eta_2(b)$. This gives us for $S$ the realization
$$S=\lim_{N\to\infty}\frac1{\sqrt N}\Bigl[\begin{pmatrix}0&X_1\\X_1&Y_1\end{pmatrix}+\cdots+\begin{pmatrix}0&X_N\\X_N&Y_N\end{pmatrix}\Bigr].$$
The summands in the last sum are f.i.d. with respect to $\mathrm{id}\otimes E$ with vanishing first moment, and thus, by our central limit theorem, $S$ is an $M_2(\mathcal B)$-valued semicircular element. Its covariance $\eta$ is given by the second moment
$$\eta\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}=\mathrm{id}\otimes E\Bigl[\begin{pmatrix}0&S_1\\S_1&S_2\end{pmatrix}\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}\begin{pmatrix}0&S_1\\S_1&S_2\end{pmatrix}\Bigr]=\begin{pmatrix}\eta_1(b_{22})&\eta_1(b_{21})\\\eta_1(b_{12})&\eta_1(b_{11})+\eta_2(b_{22})\end{pmatrix}.$$

6.3 Equation for the Cauchy transform of the semicircle

Remark 6.4. In order to derive an equation for the Cauchy transform of $\nu_\eta$ we are looking for recursions among the moments. Consider, with $\mu_S=\nu_\eta$,
$$E[Sb_1Sb_2\cdots b_{2m-1}S]=\sum_{\pi\in NC_2(2m)}\eta_\pi(b_1,\dots,b_{2m-1}).$$
We write $\pi\in NC_2(2m)$ in the form $\pi=(1,l)\cup\pi_1\cup\pi_2$, where necessarily $l=2k$ is even, $\pi_1\in NC_2(\{2,\dots,2k-1\})$ and $\pi_2\in NC_2(\{2k+1,\dots,2m\})$. In this parametrization we can express $\eta_\pi$ as
$$\eta_\pi(b_1,\dots,b_{2m-1})=\eta\bigl(E_{\pi_1}[b_1S,\dots,Sb_{2k-1}]\bigr)\cdot E_{\pi_2}[b_{2k}S,\dots,b_{2m-1}S].$$
Thus
$$E[Sb_1\cdots b_{2m-1}S]=\sum_{k=1}^m\;\sum_{\substack{\pi_1\in NC_2(2(k-1))\\\pi_2\in NC_2(2(m-k))}}\eta\bigl(E_{\pi_1}[b_1S,\dots,Sb_{2k-1}]\bigr)\cdot E_{\pi_2}[b_{2k}S,\dots,b_{2m-1}S]=\sum_{k=1}^m\eta\bigl(E[b_1S\cdots Sb_{2k-1}]\bigr)\cdot E[b_{2k}S\cdots b_{2m-1}S].$$
Consider now the operator-valued Cauchy transform (on the base level)
$$G:=G_S\colon\mathbb H^+(\mathcal B)\to\mathbb H^-(\mathcal B);\qquad z\mapsto G(z)=E[(z-S)^{-1}].$$
For large $\|z\|$ we have
$$G(z)=\sum_{m\ge0}z^{-1}E[(Sz^{-1})^m]=z^{-1}+z^{-1}\sum_{m\ge1}E[(Sz^{-1})^{2m}]=z^{-1}+z^{-1}\sum_{m\ge1}\sum_{k=1}^m\eta\bigl(z^{-1}E[(Sz^{-1})^{2(k-1)}]\bigr)\cdot z^{-1}E[(Sz^{-1})^{2(m-k)}]$$
$$=z^{-1}+z^{-1}\cdot\eta\Bigl(\sum_{k\ge1}z^{-1}E[(Sz^{-1})^{2(k-1)}]\Bigr)\cdot\Bigl(\sum_{m\ge0}z^{-1}E[(Sz^{-1})^{2m}]\Bigr)=z^{-1}+z^{-1}\cdot\eta(G(z))\cdot G(z),$$
or equivalently
$$zG(z)=1+\eta(G(z))\cdot G(z).\tag{6.2}$$
So we conclude: $G(z)$ satisfies Equation (6.2) for large $\|z\|$; by analytic extension it must then satisfy (6.2) also on all of $\mathbb H^+(\mathcal B)$. The same calculation and arguments work also for all matricial amplifications of $G$.

Remark 6.5. (1) In the case $\mathcal B=\mathbb C$ and with the normalization $\eta(z)=z$ ($z\in\mathbb C$) – corresponding to $\varphi(S^2)=1$ – we get the equation for the Cauchy transform $G_S\colon\mathbb H^+(\mathbb C)\to\mathbb H^-(\mathbb C)$ of a scalar-valued semicircle:
$$zG(z)=1+G(z)^2.\tag{6.3}$$
This can be solved explicitly as
$$G(z)=\frac{z\pm\sqrt{z^2-4}}2,$$
where we have to choose the "$-$" sign, since we have for Cauchy transforms $\lim_{y\to\infty}iyG(iy)=1$; see Remark 4.12. From this explicit form of the Cauchy transform one can then derive the semicircle density via the Stieltjes inversion formula. Note that of the two solutions of (6.3) only one, namely $G(z)$, lies in the right space $\mathbb H^-(\mathbb C)$; the other solution is in $\mathbb H^+(\mathbb C)$.
(2) How can we deal with (6.2) for general $\mathcal B$ and $\eta$? Note first that (6.2) is, in the case $\mathcal B=M_n(\mathbb C)$, actually a system of quadratic equations for the entries of the $n\times n$-matrix $G(z)$. There are no explicit solutions nor a general theory for such systems.
(3) Usually there can be many solutions of such equations; we are, however, interested in a solution which lies in $\mathbb H^-(\mathcal B)$. To get an idea, consider the very simple example: $\mathcal B=M_2(\mathbb C)$, $\eta=\mathrm{id}$, $z=z_1\oplus z_2$ with $z_1,z_2\in\mathbb C$, and we are just looking for solutions of the form $G(z)=w=w_1\oplus w_2$ with $w_1,w_2\in\mathbb C$. Then (6.2) decouples into
$$z_1w_1=1+w_1^2,\qquad z_2w_2=1+w_2^2.$$
Hence we have two solutions for $w_1$ and two solutions for $w_2$:
$$w_1^\pm=\frac{z_1\pm\sqrt{z_1^2-4}}2,\qquad w_2^\pm=\frac{z_2\pm\sqrt{z_2^2-4}}2.$$
This yields four possible solutions for $w$, of which only $w_1^-\oplus w_2^-$ is in $\mathbb H^-(M_2(\mathbb C))$. We want to show that this is true in general: of the many possible solutions there is exactly one in $\mathbb H^-(\mathcal B)$.
(4) The idea to see this is to rewrite the Equation (6.2) as a fixed point equation:
$$zG(z)=1+\eta(G(z))\cdot G(z)\iff z=G(z)^{-1}+\eta(G(z))\iff G(z)=[z-\eta(G(z))]^{-1},$$
i.e., with $F_z\colon w\mapsto[z-\eta(w)]^{-1}$ we have that $G(z)$ is a fixed point of $F_z$. To see the existence and uniqueness of the fixed point, $F_z$ should be a contraction. For large $z$ (i.e., small $\|z^{-1}\|$) this is true in operator norm. For general $z\in\mathbb H^+(\mathcal B)$ the operator norm does not work any more, but one gets a contraction in an "analytic" metric. The following is a kind of general version of the Schwarz Lemma or Denjoy-Wolff Theorem (for the latter, see 5.6 and Assignment 9 of the Free Probability Lecture Notes). See also [Har, Kra] for nice expositions around the Earle–Hamilton Theorem.
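The scalar-valued case in part (1) of the preceding remark can be checked numerically. The sketch below is not from the notes: it writes the root as $\sqrt{z-2}\sqrt{z+2}$ with principal branches, which is one standard way to select the solution lying in $\mathbb H^-(\mathbb C)$ on all of $\mathbb C\setminus[-2,2]$, and then recovers the semicircle density via Stieltjes inversion.

```python
import numpy as np

# G(z) = (z - sqrt(z-2)sqrt(z+2))/2 solves z G = 1 + G^2 and maps the
# upper half-plane into the lower one (the other root does not).
def cauchy_semicircle(z: complex) -> complex:
    return (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

# check the quadratic equation (6.3) and that G stays in the lower half-plane
for z in (1j, -1.7 + 0.3j, 2.5 + 0.01j):
    g = cauchy_semicircle(z)
    assert abs(z * g - 1 - g**2) < 1e-10 and g.imag < 0

# normalization of a Cauchy transform: lim_{y->oo} iy G(iy) = 1
y = 1e3
assert abs(1j * y * cauchy_semicircle(1j * y) - 1) < 1e-5

# Stieltjes inversion recovers the semicircle density sqrt(4-x^2)/(2 pi)
x, eps = 0.5, 1e-8
density = -cauchy_semicircle(x + 1j * eps).imag / np.pi
print(density, np.sqrt(4 - x**2) / (2 * np.pi))
```

The two printed numbers agree up to an error of order $\varepsilon$, illustrating how the density of the limit distribution is read off from boundary values of $G$.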
Theorem 6.6 (Earle, Hamilton 1968). Let $D$ be a non-empty domain in a complex Banach space $X$ and let $h\colon D\to D$ be a bounded holomorphic function. If $h(D)$ lies strictly inside $D$ – i.e., there is some $\varepsilon>0$ such that $B_\varepsilon(h(x))\subset D$ whenever $x\in D$ – then $h$ is a strict contraction in some (namely, the Carathéodory–Riffen–Finsler) metric $\rho$, and thus has a unique fixed point in $D$. Furthermore, there is a constant $m>0$ such that one has $\rho(x,y)\ge m\|x-y\|$ for all $x,y\in D$, and thus $(h^n(x))_{n\in\mathbb N}$ converges also in norm, for any $x\in D$, to this fixed point.
We want to apply this to our fixed point equation for the semicircular distribution. In the next proposition we check that the assumptions of the Earle–Hamilton Theorem are satisfied in this case. We follow here the original work of Helton, Rashidi Far, Speicher [HRS].
Proposition 6.7.
Let $\mathcal B$ be a unital $C^*$-algebra and $\eta\colon\mathcal B\to\mathcal B$ a positive linear map. For fixed $z\in\mathbb H^+(\mathcal B)$ we define the map $F_z\colon w\mapsto F_z(w):=[z-\eta(w)]^{-1}$. Then we have:
(i) $F_z\colon\mathbb H^-(\mathcal B)\to\mathbb H^-(\mathcal B)$.
(ii) $F_z$ is bounded, with $\|F_z(w)\|\le\|(\operatorname{Im}z)^{-1}\|$ for all $w\in\mathbb H^-(\mathcal B)$.
(iii) For $R>0$ we put $\mathbb H^-_R(\mathcal B):=\{w\in\mathbb H^-(\mathcal B)\mid\|w\|<R\}$. Then, for $R>\|(\operatorname{Im}z)^{-1}\|$, we have that $F_z(\mathbb H^-_R(\mathcal B))$ lies strictly inside $\mathbb H^-_R(\mathcal B)$.

Proof. (i) For $w\in\mathbb H^-(\mathcal B)$ we have, by the positivity of $\eta$, that $\eta(w)\in\mathbb H^-(\mathcal B)$, and thus $-\eta(w)\in\mathbb H^+(\mathcal B)$. But then we have, for $z\in\mathbb H^+(\mathcal B)$, that also $z-\eta(w)\in\mathbb H^+(\mathcal B)$. Taking the inverse moves us then into $\mathbb H^-(\mathcal B)$.
(ii) In the proof of Theorem 4.5 we have seen that $\|z^{-1}\|\le\|(\operatorname{Im}z)^{-1}\|$ for $z\in\mathbb H^+(\mathcal B)$, and thus also
$$\|[z-\eta(w)]^{-1}\|\le\|[\operatorname{Im}(z-\eta(w))]^{-1}\|.$$
In order to estimate this further, note
$$\operatorname{Im}(z-\eta(w))=\operatorname{Im}z-\underbrace{\operatorname{Im}\eta(w)}_{\le0}\ge\operatorname{Im}z>0,$$
which implies $0<[\operatorname{Im}(z-\eta(w))]^{-1}\le(\operatorname{Im}z)^{-1}$, and thus finally $\|[\operatorname{Im}(z-\eta(w))]^{-1}\|\le\|(\operatorname{Im}z)^{-1}\|$.
(iii) By (ii), for $R>\|(\operatorname{Im}z)^{-1}\|$ we have $F_z\colon\mathbb H^-_R(\mathcal B)\to\mathbb H^-_R(\mathcal B)$. We have to see that $F_z(w)$ stays away from the boundary of $\mathbb H^-_R(\mathcal B)$. For the part $\|w\|=R$ this is clear; there it stays away at least by an amount $R-\|(\operatorname{Im}z)^{-1}\|$. In order to see that it also stays away from the "real axis" we need an estimate for $\operatorname{Im}F_z(w)$, uniform in $w\in\mathbb H^-_R(\mathcal B)$. We have
$$\operatorname{Im}F_z(w)=\frac1{2i}\bigl[F_z(w)-F_z(w)^*\bigr]=F_z(w)^*\,\underbrace{\frac{F_z(w)^{*-1}-F_z(w)^{-1}}{2i}}_{=\operatorname{Im}(z^*-\eta(w)^*)\le\operatorname{Im}z^*}\,F_z(w)\le F_z(w)^*\cdot\operatorname{Im}z^*\cdot F_z(w)=-F_z(w)^*\cdot\operatorname{Im}z\cdot F_z(w).$$
Let us write the last term, without the minus sign, in the form
$$F_z(w)^*\cdot\operatorname{Im}z\cdot F_z(w)=\bigl[F_z(w)^{-1}\cdot(\operatorname{Im}z)^{-1}\cdot F_z(w)^{*-1}\bigr]^{-1}.$$
We estimate now
$$F_z(w)^{-1}\cdot(\operatorname{Im}z)^{-1}\cdot F_z(w)^{*-1}\le\|F_z(w)^{-1}\|^2\cdot\|(\operatorname{Im}z)^{-1}\|\cdot1=\|z-\eta(w)\|^2\cdot\|(\operatorname{Im}z)^{-1}\|\cdot1\le(\|z\|+\|\eta\|\cdot R)^2\cdot\|(\operatorname{Im}z)^{-1}\|\cdot1,$$
and thus, by taking the inverse and by noting that $F_z(w)^{-1}\cdot(\operatorname{Im}z)^{-1}\cdot F_z(w)^{*-1}$ is positive:
$$F_z(w)^*\cdot\operatorname{Im}z\cdot F_z(w)\ge\frac1{(\|z\|+\|\eta\|\cdot R)^2\cdot\|(\operatorname{Im}z)^{-1}\|}\cdot1.$$
Putting everything together gives then the wanted estimate
$$\operatorname{Im}F_z(w)\le-\frac1{(\|z\|+\|\eta\|\cdot R)^2\cdot\|(\operatorname{Im}z)^{-1}\|}\cdot1,$$
which is independent of $w\in\mathbb H^-_R(\mathcal B)$.

Theorem 6.8 (Helton, Rashidi Far, Speicher 2007). Let $\mathcal B$ be a unital $C^*$-algebra and $\eta\colon\mathcal B\to\mathcal B$ a positive linear map. For fixed $z\in\mathbb H^+(\mathcal B)$ there exists exactly one solution $w\in\mathbb H^-(\mathcal B)$ to
$$zw=1+\eta(w)\cdot w.\tag{6.4}$$
This $w$ is the limit of the iterates $w_n=F_z^n(w_0)$ for any $w_0\in\mathbb H^-(\mathcal B)$. Furthermore, we have that $\|w\|\le\|(\operatorname{Im}z)^{-1}\|$ and
$$\operatorname{Im}w\le-\frac1{\bigl(\|z\|+\|\eta\|\cdot\|(\operatorname{Im}z)^{-1}\|\bigr)^2\cdot\|(\operatorname{Im}z)^{-1}\|}\cdot1.$$

Proof.
By the Earle–Hamilton Theorem 6.6, each $\mathbb H^-_R(\mathcal B)$ with $R>\|(\operatorname{Im}z)^{-1}\|$ contains exactly one fixed point of $F_z$, i.e., a solution to (6.4). (Note that our map $F_z$ is holomorphic.) For any $w_0\in\mathbb H^-(\mathcal B)$ we choose $R$ such that $w_0\in\mathbb H^-_R(\mathcal B)$ (i.e., $R>\|w_0\|$); then Earle–Hamilton guarantees that $F_z^n(w_0)$ converges in $\mathbb H^-_R(\mathcal B)$ to $w$.

Remark 6.9. (1) Clearly, this solution $w$ from Theorem 6.8 must be the value $G(z)$ of the Cauchy transform of our operator-valued semicircular element $S$ with covariance $\eta$.
(2) The linearity of $\eta$ is not essential for the arguments; one can generalize Theorem 6.8 in the same way to the case where $\eta\colon\mathbb H^+(\mathcal B)\to\mathbb H^+(\mathcal B)$ is an analytic and bounded map.
(3) The theorem does not give estimates for the speed of convergence. In particular, for small $\operatorname{Im}z$, the convergence can be very slow. One can usually improve this by taking averages of the iterates; for example, replace $w\mapsto F_z(w)$ by
$$w\mapsto G_z(w):=\frac{w+F_z(w)}2.$$
$G_z$ has the same fixed point as $F_z$ and maps $\mathbb H^-_R(\mathcal B)$ strictly into its interior. Thus, by Earle–Hamilton, the sequences $(G_z^n(w_0))_{n\in\mathbb N}$ converge also (and usually faster) to the wanted fixed point of $F_z$.

7 Matrices of Semicirculars and Matrix-Valued Semicirculars (and Block Random Matrices)
Here we want to be a bit more concrete about the relation between matrices of freesemicirculars and matrix-valued semicircular elements. We will here also encounterthe idea that we can consider our matrices both as scalar-valued and as operator-valued elements. Understanding the relation between these two points of view will becrucial for applications of operator-valued free probability to random matrix modelswith some more structure, like block matrices.
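This dual point of view can be previewed by a simulation. The sketch below rests on an assumption stated here explicitly: independent GUE random matrices approximate free standard semicirculars (this is the asymptotic freeness used later in Remark 7.7). It estimates, for the example $S=\begin{pmatrix}0&S_1\\S_1&S_2\end{pmatrix}$ from Remark 6.3 with standard scalar-valued $S_1,S_2$ (so $\eta_i=\mathrm{id}$), the covariance $\eta(b)=\mathrm{id}\otimes E[SbS]$, whose predicted value is $\begin{pmatrix}b_{22}&b_{21}\\b_{12}&b_{11}+b_{22}\end{pmatrix}$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, samples = 150, 40

def gue(n):
    # GUE matrix normalized so that tr(X^2)/n is approximately 1
    a = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
    return (a + a.conj().T) / np.sqrt(2)

b = np.array([[1.0, 2.0], [3.0, 4.0]])          # a test point b in M_2(C)
acc = np.zeros((2, 2), dtype=complex)
for _ in range(samples):
    x, y = gue(N), gue(N)
    s = np.block([[np.zeros((N, N)), x], [x, y]])
    m = s @ np.kron(b, np.eye(N)) @ s           # S (b tensor 1) S
    # id (x) E: take the normalized trace of each N x N block
    acc += np.array([[np.trace(m[i*N:(i+1)*N, j*N:(j+1)*N]) / N
                      for j in range(2)] for i in range(2)])
eta_b = acc / samples
print(np.round(eta_b.real, 2))   # close to [[4, 3], [2, 5]]
```

For $b=\begin{pmatrix}1&2\\3&4\end{pmatrix}$ the prediction is $\begin{pmatrix}4&3\\2&5\end{pmatrix}$, and the simulated partial traces reproduce this up to small random-matrix fluctuations.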
Remark 7.1. In Remark 6.3 we have seen that matrices of free semicirculars are matrix-valued semicirculars. We restrict here to the special case where $\mathcal B=\mathbb C$, i.e., the entries of our matrices are scalar-valued free semicirculars. Let us first give the precise statement for this.

Proposition 7.2. Let $(\mathcal A,\varphi)$ be a $C^*$-probability space and $S_1,\dots,S_d$ be free standard semicirculars (i.e., $\varphi(S_i)=0$, $\varphi(S_i^2)=1$). For $n\ge1$ and selfadjoint $b_1,\dots,b_d\in M_n(\mathbb C)$ we consider
$$S:=b_1\otimes S_1+\cdots+b_d\otimes S_d\in M_n(\mathbb C)\otimes\mathcal A\mathrel{\hat=}M_n(\mathcal A).$$
Then $S$ is, in the matrix-valued $C^*$-probability space $(M_n(\mathcal A),M_n(\mathbb C),\mathrm{id}\otimes\varphi)$, a matrix-valued semicircular element with covariance $\eta\colon M_n(\mathbb C)\to M_n(\mathbb C)$ given by
$$\eta(b)=\sum_{j=1}^d b_jbb_j.$$
The proof of this is an assignment, Exercise 21.

7.2 Treating matrix-valued semicirculars as scalar-valued variables
Remark 7.3. (1) We are now, however, interested in $S$ as a scalar-valued random variable in the $C^*$-probability space $(M_n(\mathcal A),\mathrm{tr}\otimes\varphi)$, i.e., instead of the operator-valued Cauchy transform
$$G_S\colon\mathbb H^+(M_n(\mathbb C))\to\mathbb H^-(M_n(\mathbb C)),\qquad b\mapsto G_S(b)=\mathrm{id}\otimes\varphi[(b-S)^{-1}]$$
we need the scalar-valued Cauchy transform
$$g_S\colon\mathbb H^+(\mathbb C)\to\mathbb H^-(\mathbb C),\qquad z\mapsto g_S(z)=\mathrm{tr}\otimes\varphi[(z-S)^{-1}].$$
Note that for $z\in\mathbb C$ we clearly have $g_S(z)=\mathrm{tr}[G_S(z\cdot1)]$. So if we can calculate $G_S$, we can from this also get $g_S$.
(2) Note that being semicircular on an operator-valued level does in general not imply being semicircular on a scalar-valued level. Let us check this in the next example.

Example 7.4. Consider, for $\alpha,\beta\in\mathbb R$,
$$S=\begin{pmatrix}\alpha S_1&0\\0&\beta S_2\end{pmatrix}=\begin{pmatrix}\alpha&0\\0&0\end{pmatrix}\otimes S_1+\begin{pmatrix}0&0\\0&\beta\end{pmatrix}\otimes S_2.$$
Then $S$ is for all $\alpha,\beta$ an $M_2(\mathbb C)$-valued semicircular element. However, on the scalar level we have the second moment
$$\mathrm{tr}\otimes\varphi[S^2]=\frac12\bigl(\alpha^2\varphi(S_1^2)+\beta^2\varphi(S_2^2)\bigr)=\frac12(\alpha^2+\beta^2);$$
and if $S$ were semicircular, then its fourth moment would have to be given by twice the square of this, i.e., by $2(\mathrm{tr}\otimes\varphi[S^2])^2=(\alpha^2+\beta^2)^2/2$. On the other hand, we can calculate the fourth moment directly as
$$\mathrm{tr}\otimes\varphi[S^4]=\frac12\bigl(\alpha^4\varphi(S_1^4)+\beta^4\varphi(S_2^4)\bigr)=\alpha^4+\beta^4.$$
But $\alpha^4+\beta^4=(\alpha^2+\beta^2)^2/2$ holds only for $|\alpha|=|\beta|$. Thus, in general, semicircularity is not preserved; but there are special cases where it is.

Theorem 7.5. Consider unital $C^*$-algebras $\mathcal D\subset\mathcal B\subset\mathcal A$ with conditional expectations $E_{\mathcal B}\colon\mathcal A\to\mathcal B$ and $E_{\mathcal D}\colon\mathcal A\to\mathcal D$ which are compatible in the sense that $E_{\mathcal D}\circ E_{\mathcal B}=E_{\mathcal D}$. Consider a $\mathcal B$-valued semicircular element $S\in\mathcal A$, with covariance $\eta\colon\mathcal B\to\mathcal B$ given by $\eta(b)=E_{\mathcal B}[SbS]$. If $\eta(\mathcal D)\subset\mathcal D$, then $S$ is also a $\mathcal D$-valued semicircular element, with covariance given by the restriction of $\eta$ to $\mathcal D$.

Example 7.6. Before we prove this let us reconsider Example 7.4; there $\mathcal D=\mathbb C$, $\mathcal B=M_2(\mathbb C)$, $E_{\mathcal D}=\mathrm{tr}\otimes\varphi$, $E_{\mathcal B}=\mathrm{id}\otimes\varphi$, and $\eta\colon\mathcal B\to\mathcal B$ is given by
$$\eta\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}=\mathrm{id}\otimes\varphi\begin{pmatrix}\alpha^2S_1b_{11}S_1&\alpha\beta S_1b_{12}S_2\\\beta\alpha S_2b_{21}S_1&\beta^2S_2b_{22}S_2\end{pmatrix}=\begin{pmatrix}\alpha^2b_{11}&0\\0&\beta^2b_{22}\end{pmatrix}.$$
To check whether $\eta(\mathcal D)\subset\mathcal D$ we just have to see whether $\eta(1)\in\mathbb C\cdot1$; but
$$\eta(1)=\begin{pmatrix}\alpha^2&0\\0&\beta^2\end{pmatrix}\in\mathbb C\cdot1\iff\alpha^2=\beta^2.$$
One might note that $\eta$ always maps into the diagonal matrices $\tilde{\mathcal D}$, and thus in this case $S$ is always a $\tilde{\mathcal D}$-valued semicircular. For another example of the application of Theorem 7.5 see Exercise 22.

Proof of Theorem 7.5.
We have the Cauchy transforms $G(b)=E_{\mathcal B}[(b-S)^{-1}]$ for $b\in\mathbb H^+(\mathcal B)$ and $g(d)=E_{\mathcal D}[(d-S)^{-1}]$ for $d\in\mathbb H^+(\mathcal D)$. Note that $\mathbb H^+(\mathcal D)\subset\mathbb H^+(\mathcal B)$ and that
$$g(d)=E_{\mathcal D}\underbrace{E_{\mathcal B}[(d-S)^{-1}]}_{=G(d)}=E_{\mathcal D}[G(d)].$$
The main claim is to see that
$$G(d)\in\mathcal D\qquad\text{for all }d\in\mathbb H^+(\mathcal D);\tag{7.1}$$
then we have that $g(d)=G(d)$ for all $d\in\mathbb H^+(\mathcal D)$, and the equation
$$bG(b)=1+\eta(G(b))\cdot G(b)\qquad(b\in\mathbb H^+(\mathcal B))$$
restricts for $b=d\in\mathbb H^+(\mathcal D)$ to
$$dg(d)=1+\eta(g(d))\cdot g(d),$$
which shows that $g$ is the Cauchy transform of a $\mathcal D$-valued semicircular element with covariance $\eta|_{\mathcal D}$. So it remains to prove (7.1). We know, by Theorem 6.8, that we get $G(d)\in\mathbb H^-(\mathcal B)$ as the limit of iterates $w_n=F_d^n(w_0)$ for arbitrary $w_0\in\mathbb H^-(\mathcal B)$, with $F_d(w):=(d-\eta(w))^{-1}$. Now note that since $\eta$ maps $\mathcal D$ to $\mathcal D$, the map $F_d$ also maps $\mathcal D$ to $\mathcal D$; hence if we choose $w_0\in\mathbb H^-(\mathcal D)\subset\mathbb H^-(\mathcal B)$ (as we are free to do), all iterates $w_n$, and thus also their limit $G(d)$, are in $\mathcal D$.

Remark 7.7. Note the relevance of this for random matrices. If $X_1^{(N)},\dots,X_d^{(N)}$ are independent Gaussian $N\times N$ random matrices, then we know (see Chapter 6 of the Free Probability Lecture Notes) that for $N\to\infty$
$$(X_1^{(N)},\dots,X_d^{(N)})\to(S_1,\dots,S_d)$$
in distribution. But this implies that for $N\to\infty$
$$b_1\otimes X_1^{(N)}+\cdots+b_d\otimes X_d^{(N)}\to S=b_1\otimes S_1+\cdots+b_d\otimes S_d$$
in distribution with respect to $\mathrm{tr}_n\otimes\mathrm{tr}_N$ and $\mathrm{tr}_n\otimes\varphi$, respectively. The matrices on the left side are $nN\times nN$ block random matrices, considered as scalar-valued random variables. Thus the scalar-valued distribution of $S$ gives us the asymptotic eigenvalue distribution of the block matrices. See Exercise 20 for an example of this.

8 Polynomials in Free Semicirculars and Linearization
Going over to matrices over a non-commutative algebra gives surprising flexibility in dealing with problems in the algebra. In particular, one can rewrite non-linear problems in the algebra as linear problems in the matrices. This linearization idea has tremendous impact in our context; it allows us to reduce the calculation of polynomials in free variables to the calculation of operator-valued free convolutions. We follow here quite closely the presentation in [HMS, MSp].
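The algebraic heart of the linearization trick is the identity $p=-uq^{-1}v$ for affine $u,q,v$. As a small computational check (not part of the notes), the sketch below represents noncommutative polynomials as dictionaries mapping words (tuples of letters) to coefficients, and verifies the factorization that will be used for $p=x_1x_2+x_2x_1+x_1^2$, with $u=(x_1,\;x_2+\tfrac12x_1)$, $q=\begin{pmatrix}0&-1\\-1&0\end{pmatrix}$ (which is its own inverse) and $v=u^T$.

```python
from fractions import Fraction
from collections import defaultdict

# noncommutative polynomials as {word (tuple of letters): coefficient}
def pmul(a, b):
    out = defaultdict(Fraction)
    for wa, ca in a.items():
        for wb, cb in b.items():
            out[wa + wb] += ca * cb            # word concatenation = nc product
    return {w: c for w, c in out.items() if c}

def padd(*ps):
    out = defaultdict(Fraction)
    for p in ps:
        for w, c in p.items():
            out[w] += c
    return {w: c for w, c in out.items() if c}

def scal(c, p):
    return {w: Fraction(c) * cv for w, cv in p.items()}

x1, x2, one = {('x1',): Fraction(1)}, {('x2',): Fraction(1)}, {(): Fraction(1)}

# p = x1 x2 + x2 x1 + x1^2
p = padd(pmul(x1, x2), pmul(x2, x1), pmul(x1, x1))

# linearization data: u = (x1, x2 + x1/2), q^{-1} = [[0,-1],[-1,0]], v = u^T
u = [x1, padd(x2, scal(Fraction(1, 2), x1))]
v = [x1, padd(x2, scal(Fraction(1, 2), x1))]
qinv = [[{}, scal(-1, one)], [scal(-1, one), {}]]

uq = [padd(*(pmul(u[k], qinv[k][j]) for k in range(2))) for j in range(2)]
muqv = scal(-1, padd(*(pmul(uq[j], v[j]) for j in range(2))))
print(muqv == p)  # True
```

Since the letters never commute in this representation, the check confirms the factorization as an identity in $\mathbb C\langle x_1,x_2\rangle$, not just for commuting substitutions.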
Remark 8.1. (1) In Proposition 7.2 we saw that we can deal with linear matrices
$$S=b_1\otimes S_1+\cdots+b_d\otimes S_d\qquad(b_1,\dots,b_d\in M_n(\mathbb C))$$
in free semicirculars $S_1,\dots,S_d$. Note that we can also consider "affine" matrices by adding a constant $b_0\otimes1\mathrel{\hat=}b_0\in M_n(\mathbb C)$, since this gives only a shift in the argument of the Cauchy transform:
$$G_{b_0+S}(b)=\mathrm{id}\otimes\varphi[(b-(b_0+S))^{-1}]=G_S(b-b_0).$$
Since we consider selfadjoint random variables we need $b_0=b_0^*$, and thus we have $\operatorname{Im}(b-b_0)=\operatorname{Im}b$, so that $b-b_0\in\mathbb H^+(M_n(\mathbb C))$. So we can calculate $G_S(b-b_0)$ (at least numerically) and from this also the scalar-valued Cauchy transform
$$g_{b_0+S}(z)=\mathrm{tr}[G_{b_0+S}(z\cdot1)]=\mathrm{tr}[G_S(z\cdot1-b_0)].$$
(2) In Corollary 5.15 we saw that also for an arbitrary selfadjoint polynomial $p\in\mathbb C\langle x_1,\dots,x_d\rangle$ the distribution of this polynomial applied to our free semicirculars, $p(S_1,\dots,S_d)$, is uniquely determined; however, up to now it is not clear how to calculate it. We will now see that we can do this by relating this problem to a corresponding problem for affine matrices.

Example 8.2. Let us consider the example $p(x_1,x_2)=x_1x_2+x_2x_1+x_1^2$, i.e.,
$$P:=p(S_1,S_2)=S_1S_2+S_2S_1+S_1^2.$$
Note that $P=P^*$. The distribution of $P$ is given by its Cauchy transform $G_P(z)=\varphi[(z-P)^{-1}]$, for $z\in\mathbb H^+(\mathbb C)$. We lift the problem of calculating the inverse now from the ground level $\mathbb C$ to matrices, by finding there a factorization of $P$ into affine terms:
$$-P=\begin{pmatrix}S_1&S_2+\tfrac12S_1\end{pmatrix}\cdot\begin{pmatrix}0&-1\\-1&0\end{pmatrix}\cdot\begin{pmatrix}S_1\\S_2+\tfrac12S_1\end{pmatrix}.$$
Let us denote
$$U:=\begin{pmatrix}S_1&S_2+\tfrac12S_1\end{pmatrix},\qquad Q^{-1}:=\begin{pmatrix}0&-1\\-1&0\end{pmatrix},\qquad V:=\begin{pmatrix}S_1\\S_2+\tfrac12S_1\end{pmatrix};$$
then we have $P=-UQ^{-1}V$, where $U,Q,V$ are affine in $S_1$ and $S_2$. This does not directly give a factorization for $P^{-1}$, since $U$ and $V$ are not invertible, but we get a factorization of a lifted version of $z-P$ into invertible factors:
$$\begin{pmatrix}z-P&0\\0&-Q\end{pmatrix}=\begin{pmatrix}1&-UQ^{-1}\\0&1\end{pmatrix}\cdot\begin{pmatrix}z&-U\\-V&-Q\end{pmatrix}\cdot\begin{pmatrix}1&0\\-Q^{-1}V&1\end{pmatrix}.$$
Since the first and third factors are always invertible,
$$\begin{pmatrix}1&A\\0&1\end{pmatrix}^{-1}=\begin{pmatrix}1&-A\\0&1\end{pmatrix},\qquad\begin{pmatrix}1&0\\B&1\end{pmatrix}^{-1}=\begin{pmatrix}1&0\\-B&1\end{pmatrix},$$
we have
$$\begin{pmatrix}(z-P)^{-1}&0\\0&-Q^{-1}\end{pmatrix}=\begin{pmatrix}z-P&0\\0&-Q\end{pmatrix}^{-1}=\begin{pmatrix}1&0\\Q^{-1}V&1\end{pmatrix}\cdot\begin{pmatrix}z&-U\\-V&-Q\end{pmatrix}^{-1}\cdot\begin{pmatrix}1&UQ^{-1}\\0&1\end{pmatrix}.$$
If we put
$$\hat P:=\begin{pmatrix}0&U\\V&Q\end{pmatrix},\qquad\Lambda(z):=\begin{pmatrix}z&0&0\\0&0&0\\0&0&0\end{pmatrix},$$
then we have
$$\begin{pmatrix}(z-P)^{-1}&0\\0&-Q^{-1}\end{pmatrix}=\begin{pmatrix}[(\Lambda(z)-\hat P)^{-1}]_{1,1}&*\\ *&*\end{pmatrix}$$
(where $[A]_{1,1}$ denotes the $(1,1)$-entry of the $3\times3$ matrix $A$), i.e., $(z-P)^{-1}=[(\Lambda(z)-\hat P)^{-1}]_{1,1}$, and hence
$$G_P(z)=\varphi[(z-P)^{-1}]=\varphi\bigl\{[(\Lambda(z)-\hat P)^{-1}]_{1,1}\bigr\}=\bigl\{\underbrace{\mathrm{id}\otimes\varphi[(\Lambda(z)-\hat P)^{-1}]}_{G_{\hat P}(\Lambda(z))}\bigr\}_{1,1}.$$
Note that
$$\hat P=\begin{pmatrix}0&S_1&S_2+\tfrac12S_1\\S_1&0&-1\\S_2+\tfrac12S_1&-1&0\end{pmatrix}$$
is an affine matrix in free semicirculars, thus an $M_3(\mathbb C)$-valued semicircular element shifted by a constant, for which we can calculate its $M_3(\mathbb C)$-valued Cauchy transform $G_{\hat P}(b)$. Note also that $\Lambda(z)$ is not in $\mathbb H^+(M_3(\mathbb C))$, so our theory from Chapter 6 for solving for $G_{\hat P}(\Lambda(z))$ does not apply directly. But since, by the above calculation, $\Lambda(z)-\hat P$ is invertible, the function $G_{\hat P}$ is holomorphic, hence continuous, in a neighbourhood of $b=\Lambda(z)$, and thus we have
$$G_{\hat P}(\Lambda(z))=\lim_{\varepsilon\searrow0}G_{\hat P}(\Lambda_\varepsilon(z)),\qquad\text{where}\qquad\Lambda_\varepsilon(z):=\underbrace{\begin{pmatrix}z&0&0\\0&i\varepsilon&0\\0&0&i\varepsilon\end{pmatrix}}_{\in\mathbb H^+(M_3(\mathbb C))\text{ for all }\varepsilon>0}\xrightarrow{\varepsilon\searrow0}\Lambda(z).$$

Remark 8.3. The main ingredient in the above calculation was that we can factorize our polynomial as $P=-UQ^{-1}V$, with affine $U,Q,V$. This works for all polynomials and is actually independent of having semicircular elements as variables. Let us now do the general case on the level of formal variables, $\mathbb C\langle x_1,\dots,x_d\rangle$.

Definition 8.4.
Let $p\in\mathbb C\langle x_1,\dots,x_d\rangle$ be given. A matrix
$$\hat p=\begin{pmatrix}0&u\\v&q\end{pmatrix}\in M_n(\mathbb C\langle x_1,\dots,x_d\rangle),$$
where
○ $n\in\mathbb N$,
○ $q\in M_{n-1}(\mathbb C\langle x_1,\dots,x_d\rangle)$ is invertible as a matrix over the polynomials,
○ $u\in M_{1,n-1}(\mathbb C\langle x_1,\dots,x_d\rangle)$ is a row and $v\in M_{n-1,1}(\mathbb C\langle x_1,\dots,x_d\rangle)$ is a column,
is called a linearization of $p$ if the following two conditions are satisfied:
(i) $\hat p$ is an affine matrix in $x_1,\dots,x_d$, i.e., there are $b_0,b_1,\dots,b_d\in M_n(\mathbb C)$ such that $\hat p=b_0\otimes1+b_1\otimes x_1+\cdots+b_d\otimes x_d$;
(ii) $p=-uq^{-1}v$.

Theorem 8.5.
For any polynomial $p\in\mathbb C\langle x_1,\dots,x_d\rangle$ there exists a linearization. If $p$ is selfadjoint, then there is also a selfadjoint linearization.

Remark 8.6. Such linearizations are not unique; it is interesting to find minimal ones, where the matrix size $n$ is as small as possible. Our algorithm in the following proof will not produce minimal linearizations in general.

Proof of Theorem 8.5. (1) For monomials we have linearizations:
(i) for degree 0, $p=\alpha$ ($\alpha\in\mathbb C$):
$$\hat p=\begin{pmatrix}0&\alpha\\1&-1\end{pmatrix}\in M_2(\mathbb C);\qquad\text{as }\alpha=-\alpha\cdot(-1)^{-1}\cdot1;$$
(ii) for degree 1, $p=\alpha x_i$:
$$\hat p=\begin{pmatrix}0&\alpha x_i\\1&-1\end{pmatrix};\qquad\text{as }\alpha x_i=-\alpha x_i\cdot(-1)^{-1}\cdot1;$$
(iii) for degree $k\ge2$, $p=\alpha x_{i_1}\cdots x_{i_k}$: take the $k\times k$ matrix with $\alpha x_{i_1},x_{i_2},\dots,x_{i_k}$ on the antidiagonal and $-1$'s directly below it,
$$\hat p=\begin{pmatrix}0&\cdots&0&\alpha x_{i_1}\\ \vdots&&x_{i_2}&-1\\0&\iddots&\iddots&\\x_{i_k}&-1&&\end{pmatrix};$$
to be concrete, consider for example $k=3$:
$$\hat p=\begin{pmatrix}0&0&\alpha x_{i_1}\\0&x_{i_2}&-1\\x_{i_3}&-1&0\end{pmatrix}.$$
Such $q$ are always invertible; e.g.
$$\begin{pmatrix}x_{i_2}&-1\\-1&0\end{pmatrix}^{-1}=\begin{pmatrix}0&-1\\-1&-x_{i_2}\end{pmatrix},$$
and we have
$$-\begin{pmatrix}0&\alpha x_{i_1}\end{pmatrix}\cdot\begin{pmatrix}0&-1\\-1&-x_{i_2}\end{pmatrix}\cdot\begin{pmatrix}0\\x_{i_3}\end{pmatrix}=\alpha x_{i_1}x_{i_2}x_{i_3}.$$
(2) If we have linearizations $\begin{pmatrix}0&u_i\\v_i&q_i\end{pmatrix}$ for polynomials $p_i$ ($i=1,\dots,r$), then their sum $p_1+\cdots+p_r$ has the linearization
$$\begin{pmatrix}0&u_1&u_2&\cdots&u_r\\v_1&q_1&0&\cdots&0\\v_2&0&q_2&&\vdots\\ \vdots&\vdots&&\ddots&0\\v_r&0&\cdots&0&q_r\end{pmatrix}.$$
Thus we can build linearizations for any polynomial out of linearizations for its monomials.
(3) If $p$ has a linearization $\begin{pmatrix}0&u\\v&q\end{pmatrix}$, then $p^*$ has the linearization $\begin{pmatrix}0&v^*\\u^*&q^*\end{pmatrix}$. If $p=p^*$ we can take a linearization of $p=\frac12(p+p^*)$. The construction in (2), however, does not give a selfadjoint $\hat p$; instead we take
$$\frac12\begin{pmatrix}0&u&v^*\\u^*&0&q^*\\v&q&0\end{pmatrix}.$$

Remark 8.7. The linearization and the calculations for expressing the Cauchy transform of $P$ in terms of the Cauchy transform of $\hat P$ are independent of the concrete nature of our random variables; neither freeness nor being semicircular is important. Thus we have the following.

Theorem 8.8. Let $(\mathcal A,\varphi)$ be a $C^*$-probability space and consider selfadjoint $X_1,\dots,X_d\in\mathcal A$. For a selfadjoint polynomial $p\in\mathbb C\langle x_1,\dots,x_d\rangle$ let
$$\hat p=b_0\otimes1+b_1\otimes x_1+\cdots+b_d\otimes x_d,\qquad\text{with }b_0,b_1,\dots,b_d\in M_n(\mathbb C),$$
be a selfadjoint linearization of $p$. Put $P:=p(X_1,\dots,X_d)\in\mathcal A$ and
$$\hat P=\hat p(X_1,\dots,X_d)=b_0\otimes1+b_1\otimes X_1+\cdots+b_d\otimes X_d\in M_n(\mathcal A).$$
Then we have for $z\in\mathbb H^+(\mathbb C)$
$$G_P(z)=[G_{\hat P}(\Lambda(z))]_{1,1}=\lim_{\varepsilon\searrow0}[G_{\hat P}(\Lambda_\varepsilon(z))]_{1,1}$$
with
$$\Lambda(z)=\begin{pmatrix}z&0&\cdots&0\\0&0&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\0&0&\cdots&0\end{pmatrix}\qquad\text{and}\qquad\Lambda_\varepsilon(z)=\begin{pmatrix}z&0&\cdots&0\\0&i\varepsilon&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\0&0&\cdots&i\varepsilon\end{pmatrix}\in\mathbb H^+(M_n(\mathbb C)).$$

Remark 8.9. Thus, in order to deal with polynomials in our variables, we need to understand linear matrices in the variables. If the variables are free semicirculars we understand linear matrices in them, as they are operator-valued semicirculars. But how about general free variables: if $X_1,\dots,X_d$ are free, what can we say about
$$X=b_0\otimes1+b_1\otimes X_1+\cdots+b_d\otimes X_d?$$
Note:
(i) $b_0\otimes1$ gives only a shift in the argument of $G_X$, and is thus easy to deal with;
(ii) the operator-valued Cauchy transform of $b_i\otimes X_i$ is determined (theoretically and numerically) in terms of the Cauchy transform of $X_i$;
(iii) if $X_1,\dots,X_d$ are free in $(\mathcal A,\varphi)$, then $b_1\otimes X_1,\dots,b_d\otimes X_d$ are, by Proposition 5.7, free in $(M_n(\mathcal A),M_n(\mathbb C),\mathrm{id}\otimes\varphi)$; hence we have to understand how to deal with sums of free variables on an operator-valued level – i.e., we need to have a closer look at how to describe operator-valued free convolution.

9 Combinatorial and Analytic Description of Operator-Valued Freeness: Free Cumulants and R-Transforms

Up to now we looked at moments and Cauchy transforms. As in the scalar-valued case it is advantageous to go over to cumulants and $R$-transforms. Much of the theory is, modulo "respecting the nesting", the same as in the scalar-valued case; see Chapters 3 and 4 of the Free Probability Lecture Notes. We are not going to give proofs of the operator-valued statements, but we urge the reader (for example, in Exercise 24) to check that the scalar-valued arguments are not affected by the requirement that we now have to respect the nesting.

Definition 9.1.
Let (A, B, E) be an operator-valued probability space. We denote by E_n, for n ∈ N, the B-balanced map

  E_n : A^n → B; (a_1, …, a_n) ↦ E_n(a_1, a_2, …, a_n) := E[a_1 a_2 ⋯ a_n],

and by E_π, for n ∈ N and π ∈ NC(n), the corresponding multiplicative maps E_π : A^n → B; see Definition 5.10. Then we define the corresponding (operator-valued) free cumulants κ_n : A^n → B by

  κ_n(a_1, …, a_n) := Σ_{π ∈ NC(n)} μ(π, 1_n) E_π(a_1, …, a_n),   (9.1)

where μ is the Möbius function of NC(n).

Remark 9.2. The κ_n are also B-balanced and, with their multiplicative extensions κ_π, Equation (9.1) is equivalent to

  E[a_1 ⋯ a_n] = E_n(a_1, …, a_n) = Σ_{π ∈ NC(n)} κ_π(a_1, …, a_n).

Example 9.3. (1) For n = 1:

  E[a_1] = κ_1(a_1).

(2) For n = 2:

  E[a_1 a_2] = κ_2(a_1, a_2) + κ_1(a_1) κ_1(a_2) = κ_2(a_1, a_2) + E[a_1] E[a_2],

and thus

  κ_2(a_1, a_2) = E[a_1 a_2] − E[a_1] E[a_2].

(3) For n = 3:

  E[a_1 a_2 a_3] = κ_3(a_1, a_2, a_3) + κ_1(a_1) κ_2(a_2, a_3) + κ_2(a_1, a_2) κ_1(a_3) + κ_2(a_1 κ_1(a_2), a_3) + κ_1(a_1) κ_1(a_2) κ_1(a_3).

The interesting term here is the nested one, for the partition {(1,3), (2)}:

  κ_2(a_1 κ_1(a_2), a_3) = E[a_1 E[a_2] a_3] − E[a_1 E[a_2]] · E[a_3] = E[a_1 E[a_2] a_3] − E[a_1] · E[a_2] · E[a_3].

This leads in the end to

  κ_3(a_1, a_2, a_3) = E[a_1 a_2 a_3] − E[a_1] · E[a_2 a_3] − E[a_1 a_2] · E[a_3] − E[a_1 E[a_2] a_3] + 2 E[a_1] · E[a_2] · E[a_3].

As in the scalar-valued case (compare 3.23 and 3.24 of the Free Probability Lecture Notes) one proves the following characterization of freeness.
Theorem 9.4 (freeness ≙ vanishing of mixed cumulants). Let (A, B, E) be a B-valued probability space and (κ_n)_{n ∈ N} the corresponding free cumulants.
(1) Consider subalgebras B ⊂ A_i ⊂ A for i ∈ I. Then the following are equivalent.
(i) The subalgebras A_i, i ∈ I, are free with respect to E.
(ii) Mixed cumulants in the subalgebras vanish, i.e., κ_n(a_1, …, a_n) = 0 whenever: n ≥ 2; a_j ∈ A_{i_j} for j = 1, …, n; and there exist 1 ≤ k, l ≤ n such that i_k ≠ i_l.
(2) Consider random variables X_i ∈ A for i ∈ I. Then the following are equivalent.
(i) The random variables X_i, i ∈ I, are free with respect to E.
(ii) Mixed cumulants in the random variables vanish, i.e.,

  κ_n(X_{i_1} b_1, X_{i_2} b_2, …, X_{i_{n−1}} b_{n−1}, X_{i_n}) = 0

whenever: n ≥ 2; i_1, …, i_n ∈ I; there exist 1 ≤ k, l ≤ n such that i_k ≠ i_l; and b_1, …, b_{n−1} ∈ B.

Example 9.5. This yields then formulas for the calculation of mixed moments; those formulas show that mixed moments of free variables are exponentially bounded if this is true for each of the variables, thus providing the missing argument for our proof of Theorem 5.14.

As a concrete calculation, consider for X and Y free the following mixed moment:

  E[XYXY] = Σ_{π ∈ NC(4)} κ_π(X, Y, X, Y).

Because of the vanishing of mixed cumulants in X and Y, only those non-crossing π whose blocks do not couple the X-positions {1, 3} with the Y-positions {2, 4} make a contribution; so we can continue with

  E[XYXY] = κ_{(1)(2,4)(3)}(X, Y, X, Y) + κ_{(1,3)(2)(4)}(X, Y, X, Y) + κ_{(1)(2)(3)(4)}(X, Y, X, Y)
          = κ_1(X) · κ_2(Y κ_1(X), Y) + κ_2(X κ_1(Y), X) · κ_1(Y) + κ_1(X) · κ_1(Y) · κ_1(X) · κ_1(Y)
          = E[X] · (E[Y E[X] Y] − E[Y E[X]] · E[Y]) + (E[X E[Y] X] − E[X E[Y]] · E[X]) · E[Y] + E[X] · E[Y] · E[X] · E[Y]
          = E[X] · E[Y E[X] Y] + E[X E[Y] X] · E[Y] − E[X] · E[Y] · E[X] · E[Y].

This recovers the formula from Example 5.3 (3).
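The last formula can be checked against random matrices: GUE matrices are asymptotically free from deterministic diagonal matrices, and for the scalar-valued case φ = tr with tr(X) ≈ 0 the formula reduces to tr(XYXY) ≈ tr(X²)·tr(Y)². A minimal numerical sketch (the matrix size, the seed, and the concrete diagonal matrix are our own ad hoc choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# GUE matrix, normalized so that tr(X^2) ~ 1
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
X = (H + H.conj().T) / (2 * np.sqrt(N))

# deterministic diagonal matrix Y with eigenvalues 1 and 2
Y = np.diag([1.0] * (N // 2) + [2.0] * (N // 2))

tr = lambda A: np.trace(A).real / N  # normalized trace

# asymptotic freeness + vanishing of mixed cumulants predict (since tr(X) ~ 0):
# tr(XYXY) ~ tr(X^2) * tr(Y)^2
lhs = tr(X @ Y @ X @ Y)
rhs = tr(X @ X) * tr(Y) ** 2
print(abs(lhs - rhs))  # small for large N
```

Both sides are close to tr(X²)·tr(Y)² = 1·(3/2)² = 2.25, up to fluctuations of order 1/N.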
Proposition 9.6.
Let (A, B, E) be a B-valued probability space with corresponding cumulants (κ_n)_{n ∈ N}. Consider, for n ∈ N, random variables X_1, …, X_n ∈ A and b_1, …, b_{n−1} ∈ B. Then we have

  E[X_1 b_1 X_2 b_2 ⋯ X_{n−1} b_{n−1} X_n]
   = Σ_{s=1}^{n} Σ_{1=j_1<j_2<⋯<j_s≤n} κ_s( X_{j_1} E[b_{j_1} X_{j_1+1} ⋯ X_{j_2−1} b_{j_2−1}], X_{j_2} E[b_{j_2} X_{j_2+1} ⋯ X_{j_3−1} b_{j_3−1}], …, X_{j_s} ) × E[b_{j_s} X_{j_s+1} ⋯ b_{n−1} X_n].

9.2 Operator-valued R-transform

Theorem 9.7.
Let (A, B, E) be a B-valued C*-probability space and X = X* ∈ A. Consider the following fully matricial functions on a uniform neighborhood of 0, given via the coefficients in the power series expansions about 0:
○ the Cauchy transform G_X, given via H_X(z) = G_X(z^{−1}), by

  ∂^n H_X(0, …, 0)♯(b_1, …, b_n) = E[b_1 X b_2 ⋯ b_{n−1} X b_n];

○ the R-transform R_X, given by

  ∂^n R_X(0, …, 0)♯(b_1, …, b_n) = κ_{n+1}(X b_1, X b_2, …, X b_n, X).

Then we have on suitable domains

  z G(z) = 1 + R[G(z)] · G(z),   (9.2)

and G and R determine each other via (9.2).

Remark 9.8. (0) Note that R has a constant term, whereas H starts with the linear term; on the base level we have the power series expansions, for z = b ∈ B,

  H_X(b) = b + b E[X] b + b E[X b X] b + b E[X b X b X] b + ⋯

and

  R_X(b) = κ_1(X) + κ_2(X b, X) + κ_3(X b, X b, X) + ⋯.

(1) Note that with G = (G^(n))_{n ∈ N} and R = (R^(n))_{n ∈ N}, (9.2) means that there exists R > 0 such that for all n ∈ N we have

  z G^(n)(z) = 1 + R^(n)[G^(n)(z)] · G^(n)(z) for invertible z ∈ M_n(B) with ∥z^{−1}∥ < 1/R.

(2) For our applications to polynomials in Theorem 8.8,

  G_P(z) = lim_{ε↘0}[G_{P̂}(Λ_ε(z))]_{1,1},

we actually only need the base level n = 1 of the fully matricial function G_{P̂}.

(3) Since mixed cumulants in free variables vanish, we have for free X_1, X_2 that

  κ_{n+1}((X_1+X_2)b_1, (X_1+X_2)b_2, …, (X_1+X_2)b_n, (X_1+X_2))
   = κ_{n+1}(X_1 b_1, X_1 b_2, …, X_1 b_n, X_1) + κ_{n+1}(X_2 b_1, X_2 b_2, …, X_2 b_n, X_2),

and thus

  R_{X_1+X_2}(z) = R_{X_1}(z) + R_{X_2}(z) for ∥z∥ sufficiently small.

This allows us in principle to express G_{X_1+X_2} in terms of G_{X_1} and G_{X_2}: for i = 1, 2 we get from G_{X_i} its R-transform R_{X_i} via (9.2); then we get easily the R-transform of the sum, R_{X_1+X_2} = R_{X_1} + R_{X_2}, and use again (9.2) (now in the other direction) to get from this G_{X_1+X_2}. There is, however, a problem with this, namely that (9.2) can usually not be solved explicitly and there is also no good numerical algorithm for dealing with (9.2).
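For B = C and X a standard semicircular element everything in Theorem 9.7 is explicit: R_X(z) = z (only κ₂ = 1 is non-zero) and G_X(z) = (z − √(z² − 4))/2, so relation (9.2) can be verified numerically. A small sketch (the branch of the square root is chosen as usual for Cauchy transforms):

```python
import numpy as np

def G_semicircle(z):
    # Cauchy transform of the standard semicircle law; the branch is
    # chosen so that G maps the upper half-plane to the lower half-plane
    w = np.sqrt(z * z - 4)
    if w.imag * z.imag < 0:
        w = -w
    return (z - w) / 2

R = lambda b: b  # R-transform of the semicircle: kappa_2 = 1, all other cumulants 0

for z in [1.3 + 0.7j, -2.1 + 0.4j, 0.5 + 2.0j]:
    G = G_semicircle(z)
    # relation (9.2): z G(z) = 1 + R(G(z)) * G(z)
    assert abs(z * G - (1 + R(G) * G)) < 1e-12
```

Here (9.2) is just the quadratic equation G(z)² − zG(z) + 1 = 0 of the semicircle law.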
Hence, as in the scalar-valued case, we will rewrite the R-transform approach into the "subordination" language.

The subordination description of operator-valued free convolution yields, as in the scalar-valued case, algorithms which can be analytically controlled. Combining this with the linearization idea then solves the problem of calculating the distribution of polynomials in free variables, which in turn can be used to calculate the asymptotic eigenvalue distribution of polynomials in random matrices. We follow here the presentation from [MSp], referring for the proof of the main statement to the original paper [BMS].
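As a preview of the subordination algorithm in the simplest scalar-valued situation (B = C): for μ₁ = μ₂ = ½(δ₋₁ + δ₊₁) one has G_j(w) = w/(w² − 1), hence h_j(w) = G_j(w)^{−1} − w = −1/w, and iterating the fixed point equation for the subordination function ω₁ recovers numerically that μ₁ ⊞ μ₂ is the arcsine distribution on [−2, 2]. A minimal sketch (the number of iterations and the offset ε are ad hoc choices):

```python
import math

def h(w):
    # h(w) = 1/G(w) - w for G(w) = w/(w^2 - 1),
    # i.e. for the symmetric Bernoulli measure (delta_{-1} + delta_{+1})/2
    return (w * w - 1) / w - w  # = -1/w

def omega1(z, iterations=2000):
    # fixed point iteration w -> z + h2(z + h1(w)); here h1 = h2 = h
    w = z
    for _ in range(iterations):
        w = z + h(z + h(w))
    return w

def G_free_conv(z):
    # Cauchy transform of the free convolution via subordination:
    # G(z) = G1(omega1(z))
    w = omega1(z)
    return w / (w * w - 1)

# density at x = 0 via Stieltjes inversion; the arcsine density
# 1/(pi*sqrt(4 - x^2)) equals 1/(2*pi) at x = 0
eps = 0.05
density0 = -G_free_conv(0 + 1j * eps).imag / math.pi
print(density0)  # close to 1/(2*pi) ~ 0.159
```

The same loop, with matrix-valued h_j, is the algorithm behind the operator-valued results below.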
Remark 10.1. (1) We want to describe X_1 + X_2, for X_1 and X_2 free, in a subordinated form via

  G_{X_1+X_2}(z) = G_{X_1}(ω_1(z)) and G_{X_1+X_2}(z) = G_{X_2}(ω_2(z))

for some subordination functions ω_1, ω_2. Let us check, on a formal level, the properties of those (compare also 5.1 of the Free Probability Lecture Notes):

  ω_1(z) = G_{X_1}^{⟨−1⟩}(G_{X_1+X_2}(z)).

Note that z G(z) = 1 + R[G(z)] · G(z) means that (for z = G^{⟨−1⟩}(b))

  G^{⟨−1⟩}(b) · b = 1 + R(b) · b, i.e., G^{⟨−1⟩}(b) = b^{−1} + R(b).

Put G_1 = G_{X_1}, G_2 = G_{X_2}, G = G_{X_1+X_2}, and the same for R. Then we have

  ω_1(z) = G_1^{⟨−1⟩}(G(z)) = G(z)^{−1} + R_1(G(z)) and ω_2(z) = G(z)^{−1} + R_2(G(z)),

and thus, using R_1(G(z)) + R_2(G(z)) = R[G(z)] and G(z)^{−1} + R[G(z)] = G^{⟨−1⟩}(G(z)) = z,

  ω_1(z) + ω_2(z) = G(z)^{−1} + R_1(G(z)) + R_2(G(z)) + G(z)^{−1}
                  = z + G(z)^{−1}
                  = z + G_2(ω_2(z))^{−1}
                  = z + F_2(ω_2(z)),

and thus

  ω_1(z) = z + F_2(ω_2(z)) − ω_2(z) = z + h_2(ω_2(z)),

where we put

  F_j(z) := G_j(z)^{−1} and h_j(z) := F_j(z) − z = G_j(z)^{−1} − z.

So we have ω_1(z) = z + h_2(ω_2(z)) and, by symmetry, ω_2(z) = z + h_1(ω_1(z)). Inserting the second equation into the first finally gives

  ω_1(z) = z + h_2(z + h_1(ω_1(z))).

This is a fixed point equation for ω_1(z), which can be used for calculating ω_1(z) via iterations.

(2) The crucial point is that the fixed point equation can be used to define ω_1(z) (and, in the same way, ω_2(z)) not just on some suitably chosen domain, but on all of H^+(B). To show the convergence of the iterates on all of H^+(B) one uses again the Earle–Hamilton Theorem.

(3) To make the formal calculations above rigorous is much harder than in the scalar-valued case (in particular, as the involved domains are harder to control), but it can be done. We only give the final result from [BMS].
It would be nice to find a simpler, more streamlined proof of this theorem.

Theorem 10.2 (Belinschi, Mai, Speicher 2017). Let (A, B, E) be an operator-valued C*-probability space and consider selfadjoint X_1, X_2 ∈ A which are free with respect to E. Then there exists a unique pair of Fréchet analytic maps ω_1, ω_2 : H^+(B) → H^+(B) such that
(i) Im ω_j(z) ≥ Im z for all z ∈ H^+(B) and j = 1, 2;
(ii) for all z ∈ H^+(B):

  F_1(ω_1(z)) + z = F_2(ω_2(z)) + z = ω_1(z) + ω_2(z);

(iii) for all z ∈ H^+(B):

  G_1(ω_1(z)) = G_2(ω_2(z)) = G(z).

Moreover, if z ∈ H^+(B), then ω_1(z) is the unique fixed point of the map

  f_z : H^+(B) → H^+(B), f_z(w) := h_2(h_1(w) + z) + z,

and ω_1(z) = lim_{n→∞} f_z^{∘n}(w) for any w ∈ H^+(B). The same statements hold for ω_2, where f_z is replaced by w ↦ h_1(h_2(w) + z) + z.

Remark 10.3. This can then be used, together with the linearization idea, to compute numerically distributions of polynomials in free variables. This has relevance for the asymptotic eigenvalue distribution of random matrices. Assume that X_1^(N), …, X_d^(N) are N × N random matrices which are asymptotically free, i.e.,

  (X_1^(N), …, X_d^(N)) → (X_1, …, X_d) for N → ∞, where X_1, …, X_d are free.

Then, for any polynomial p ∈ C⟨x_1, …, x_d⟩, we also have

  p(X_1^(N), …, X_d^(N)) → p(X_1, …, X_d) for N → ∞,

and the distribution of the limit can be calculated via linearization and operator-valued free convolution. Note the following typical situations for asymptotically free random matrices:
(i) independent GUE random matrices are asymptotically free;
(ii) GUE random matrices are asymptotically free from deterministic (e.g., diagonal) matrices;
(iii) "randomly rotated" matrices are asymptotically free: for D_1^(N), D_2^(N) deterministic (e.g., diagonal) matrices and U_N Haar unitary N × N random matrices, we have that D_1^(N) and U_N D_2^(N) U_N* are asymptotically free; so, in particular, asymptotically the eigenvalue distribution of p(D_1^(N), U_N D_2^(N) U_N*) is given by the distribution of p(X_1, X_2), where X_1 and X_2 are free and μ_{D_1^(N)} → μ_{X_1} and μ_{D_2^(N)} → μ_{X_2}.

Example 10.4. Let us compare, for the polynomial p(x, y) = xy + yx + x², the distribution of asymptotically free random matrices with the limit distribution, which we calculate by our linearization and operator-valued convolution machinery.
(1) Consider first, for N = 4000, a GUE(N) matrix A_N and a deterministic diagonal matrix X_N with 2000 eigenvalues −2, 1000 eigenvalues −1 and 1000 eigenvalues 1. We compare the histogram of the N eigenvalues of p(X_N, A_N) with the distribution (red curve) of p(X, S), where S and X are free, S is a semicircular element and X has distribution μ_X = ½ δ_{−2} + ¼ δ_{−1} + ¼ δ_{+1}.

[Figure: histogram of the eigenvalues of p(X_N, A_N) versus the density of p(X, S).]

(2) Consider now, again for N = 4000, two deterministic diagonal matrices X_N and Y_N; Y_N has 2000 eigenvalues 1 and 2000 eigenvalues 3, and X_N is the same as before, i.e., a diagonal matrix with 2000 eigenvalues −2, 1000 eigenvalues −1 and 1000 eigenvalues 1. In addition we take now a Haar unitary random matrix U_N and compare the histogram of the N eigenvalues of p(X_N, U_N Y_N U_N*) with the distribution (red curve) of p(X, Y), where X and Y are free, with distributions μ_X = ½ δ_{−2} + ¼ δ_{−1} + ¼ δ_{+1} and μ_Y = ½ (δ_1 + δ_3).
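A sketch of how the histogram in part (1) can be produced (the matrix size is scaled down from the text; the plotting line is left as a comment):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000  # scaled down from the text

# GUE matrix A_N, normalized so that its spectrum fills [-2, 2]
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = (H + H.conj().T) / (2 * np.sqrt(N))

# deterministic diagonal X_N: eigenvalues -2, -1, 1 with weights 1/2, 1/4, 1/4
X = np.diag([-2.0] * (N // 2) + [-1.0] * (N // 4) + [1.0] * (N // 4))

# p(x, y) = xy + yx + x^2, evaluated in the two matrices
P = X @ A + A @ X + X @ X
eigs = np.linalg.eigvalsh(P)

# the histogram of eigs approximates the distribution of p(X, S);
# e.g. its mean is close to phi(x^2) = 2.5, since phi(s) = 0
print(eigs.mean())
# import matplotlib.pyplot as plt; plt.hist(eigs, bins=50, density=True)
```

The red curve of the text is the density computed by the subordination/linearization machinery; the histogram above only provides the random matrix side of the comparison.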
[Figure: histogram of the eigenvalues of p(X_N, U_N Y_N U_N*) versus the density of p(X, Y).]

The linearization idea is usually (i.e., in contexts other than free probability) used for dealing with rational functions, not just polynomials. Thus it looks feasible to try to extend our results to rational functions. We will follow quite closely [HMS], where one can also find more information on the history of the linearization idea and more details on non-commutative rational functions.
Remark 11.1. Recall the idea of the linearization of a polynomial. For P = p(X_1, …, X_d) ∈ A we need to find U, Q, V such that
○ U, Q, V are affine matrices in X_1, …, X_d;
○ Q is invertible;
○ P = −U Q^{−1} V.
Then the linearization

  P̂ = ( 0 U ; V Q )

knows a lot about P, namely

  G_P(z) = [G_{P̂}(Λ(z))]_{1,1}.  (11.1)

Question: Can we linearize more general "functions"?
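Before turning to inverses, the polynomial linearization itself can be checked numerically: for the monomial p = x₁x₂x₃ with its 3 × 3 linearization from the proof of Theorem 8.5, both P = −UQ^{−1}V and the Schur complement identity behind (11.1) hold for arbitrary matrices substituted for the variables. A minimal sketch (sizes and the evaluation point z are ad hoc choices):

```python
import numpy as np

rng = np.random.default_rng(2)
m = 4  # size of the matrices substituted for the variables

def herm():
    H = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return (H + H.conj().T) / 2

X1, X2, X3 = herm(), herm(), herm()
I, O = np.eye(m), np.zeros((m, m))

# linearization of p = x1*x2*x3:
#         ( 0    0   x1 )
# p_hat = ( 0   x2   -1 )
#         ( x3  -1    0 )
u = np.block([O, X1])                 # block row U
q = np.block([[X2, -I], [-I, O]])     # block Q, always invertible
v = np.block([[O], [X3]])             # block column V

P = X1 @ X2 @ X3
assert np.allclose(-u @ np.linalg.inv(q) @ v, P)  # P = -U Q^{-1} V

# identity behind (11.1): [ (Lambda(z) - p_hat)^{-1} ]_{1,1} = (z - P)^{-1}
z = 1.5 + 2.0j
p_hat = np.block([[O, O, X1], [O, X2, -I], [X3, -I, O]])
Lam = np.block([[z * I, O, O], [O, O, O], [O, O, O]])
top_left = np.linalg.inv(Lam - p_hat)[:m, :m]
assert np.allclose(top_left, np.linalg.inv(z * np.eye(m) - P))
```

The second assertion is exactly the Schur complement computation that the following theorem turns into a general principle.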
Example 11.2. Note that in P = −UQ^{−1}V the inverse shows up, which suggests that we might also linearize inverses. Try the simplest case, P = X^{−1}. We can write this as

  P = −1 · (−X)^{−1} · 1, i.e., U = 1, Q = −X, V = 1, all of size 1 × 1;

thus

  P̂ = ( 0 1 ; 1 −X ) ∈ M_2(A).

(P̂ is here also selfadjoint!) If we assume that Q = −X is invertible, then this satisfies all properties of our linearization, and (11.1) allows us to calculate the distribution of X^{−1} via the linear matrix P̂. (Of course, in this case of one variable we would calculate the distribution of X^{−1} from the distribution of X just via ordinary function calculus.)

Note that in this case the invertibility of Q is not just an algebraic issue, which is true for all p(X), but depends on the existence of p(X) = X^{−1} for the concretely considered X. We have to be careful that the existence of P implies the existence of all inverses which show up in our calculations. The basic ingredient for all this is the following well-known formula.

Theorem 11.3 (Schur complement formula). Let A be a complex unital algebra. Let matrices a ∈ M_k(A), b ∈ M_{k,l}(A), c ∈ M_{l,k}(A), d ∈ M_l(A) be given and assume that d is invertible in M_l(A). Then the following are equivalent.
(i) ( a b ; c d ) ∈ M_{k+l}(A) is invertible.
(ii) The Schur complement a − bd^{−1}c is invertible in M_k(A).
If those are satisfied, then we have

  ( a b ; c d )^{−1} = ( (a − bd^{−1}c)^{−1} ∗ ; ∗ ∗ ).

Proof.
We have

  ( a b ; c d ) = ( 1 bd^{−1} ; 0 1 ) ( a − bd^{−1}c 0 ; 0 d ) ( 1 0 ; d^{−1}c 1 ).  (11.2)

Since the first and the third factor are always invertible, the invertibility of the left-hand side is equivalent to the invertibility of

  ( a − bd^{−1}c 0 ; 0 d ),

which in turn is equivalent to the invertibility of a − bd^{−1}c (since d is invertible by assumption). The formula for the inverse follows by taking the inverse of (11.2).

Definition 11.4.
Let r be a rational expression in the formal variables x_1, …, x_d. A linear representation ρ = (u, q, v) of r consists of
○ an affine matrix q in the variables x_1, …, x_d, of size n × n for some n ∈ N,
○ a 1 × n matrix u over C,
○ and an n × 1 matrix v over C,
such that we have for any unital algebra A and any X_1, …, X_d ∈ A: whenever r(X_1, …, X_d) makes sense in A (i.e., all inverses appearing in r must exist in A), then q(X_1, …, X_d) is also invertible in M_n(A) and we have

  r(X_1, …, X_d) = −u q(X_1, …, X_d)^{−1} v.

Theorem 11.5.
Let r be a selfadjoint rational expression and ρ = (u, q, v) a selfadjoint linear representation of r (i.e., u = v*, q = q*). Consider a C*-probability space (A, φ) and selfadjoint random variables X_1, …, X_d ∈ A such that r(X_1, …, X_d) is defined in A (necessarily as a bounded operator). Then, with

  R̂ := ( 0 u ; v q(X_1, …, X_d) ) ∈ M_{n+1}(A),

we have for all z ∈ H^+(C)

  G_{r(X_1,…,X_d)}(z) = lim_{ε↘0}[G_{R̂}(Λ_ε(z))]_{1,1}.

Proof.
Compare also Example 8.2. We have

  Λ(z) − R̂ = ( z −u ; −v −q(X_1, …, X_d) );

by the definition of a linear representation, q(X_1, …, X_d) is invertible; so, by the Schur complement formula 11.3, Λ(z) − R̂ is invertible if and only if

  z − u(−q(X_1, …, X_d))^{−1} v = z − r(X_1, …, X_d)

is invertible, and then

  [(Λ(z) − R̂)^{−1}]_{1,1} = (z − r(X_1, …, X_d))^{−1}.

Applying φ to this, and taking into account the continuity in ε as in Example 8.2, gives the statement on Cauchy transforms.

For every rational expression one can build a linear representation according to the following algorithm.
(1) Scalars λ ∈ C and variables x_j have the respective linear representations

  ( (0 1), ( λ −1 ; −1 0 ), ( 0 ; 1 ) ) and ( (0 1), ( x_j −1 ; −1 0 ), ( 0 ; 1 ) ).

(2) If (u_1, q_1, v_1) is a representation of r_1 and (u_2, q_2, v_2) is a representation of r_2, then representations for r_1 + r_2 and for r_1 · r_2 are respectively given by

  ( (u_1 u_2), ( q_1 0 ; 0 q_2 ), ( v_1 ; v_2 ) ) and ( (u_1 0), ( q_1 v_1 u_2 ; 0 q_2 ), ( 0 ; v_2 ) ).

(3) If (u, q, v) is a representation of r ≠ 0, then

  ( (1 0), ( 0 u ; −v q ), ( 1 ; 0 ) )

is a representation of r^{−1}.

Proof.
Let us just check (3). We have to see: if r^{−1}(X_1, …, X_d) makes sense (i.e., r(X_1, …, X_d) is invertible in A), then

  ( 0 u ; −v q(X_1, …, X_d) )

is invertible. Since r(X_1, …, X_d) makes sense, q(X_1, …, X_d) is invertible (by the definition of a linear representation) and, by the Schur complement formula 11.3, the matrix above is invertible if and only if its Schur complement

  0 − u q(X_1, …, X_d)^{−1}(−v) = u q(X_1, …, X_d)^{−1} v = −r(X_1, …, X_d)

is invertible; but this is the case by our assumption; and then we have, still by 11.3,

  −(1 0) ( 0 u ; −v q(X_1, …, X_d) )^{−1} ( 1 ; 0 ) = −[( 0 u ; −v q(X_1, …, X_d) )^{−1}]_{1,1} = −(−r(X_1, …, X_d))^{−1} = r(X_1, …, X_d)^{−1}.

Example 11.7. Let us apply the above algorithm to r(x, y) = [x^{−1} + y^{−1}]^{−1}. Starting from the representations of the variables x and y from step (1), step (3) gives for x^{−1} the representation

  ( (1 0 0), ( 0 0 1 ; 0 x −1 ; −1 −1 0 ), ( 1 ; 0 ; 0 ) ),

and analogously for y^{−1}. By step (2), the sum x^{−1} + y^{−1} then has the representation with u = (1 0 0 1 0 0), v = (1 ; 0 ; 0 ; 1 ; 0 ; 0) and the 6 × 6 affine matrix

  q = ( 0   0   1   0   0   0 )
      ( 0   x  −1   0   0   0 )
      ( −1 −1   0   0   0   0 )
      ( 0   0   0   0   0   1 )
      ( 0   0   0   0   y  −1 )
      ( 0   0   0  −1  −1   0 ).

Finally, applying step (3) once more, the inverse [x^{−1} + y^{−1}]^{−1} has the representation ( (1 0 0 0 0 0 0), ( 0 u ; −v q ), ( 1 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ) ) with the 7 × 7 affine matrix

  ( 0 u ; −v q ) = ( 0   1   0   0   1   0   0 )
                   ( −1  0   0   1   0   0   0 )
                   ( 0   0   x  −1   0   0   0 )
                   ( 0  −1  −1   0   0   0   0 )
                   ( −1  0   0   0   0   0   1 )
                   ( 0   0   0   0   0   y  −1 )
                   ( 0   0   0   0  −1  −1   0 ).

Evaluating rational expressions in operators will typically lead to unbounded operators. Here we will see that we can also say quite a bit about such a situation. Actually, understanding what is going on there is crucial for getting a grasp on one of the most basic regularity questions about non-commutative distributions: the absence of atoms in the distribution of polynomials or rational functions of our operators. The material here relies on the original work [MSY], where one can also find more details about the algebraic description of non-commutative rational functions as the "free skew field", and about the "fullness" of matrices in this context.
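The three construction steps are straightforward to implement once the variables are evaluated at concrete matrices; the following sketch uses the convention r = −u·q^{−1}·v throughout (the sign choices are one self-consistent variant of the algorithm) and rebuilds r = (x^{−1} + y^{−1})^{−1}:

```python
import numpy as np

def var_rep(X):
    # representation of a variable x, evaluated at the matrix X
    m = X.shape[0]
    I, O = np.eye(m), np.zeros((m, m))
    return np.block([O, I]), np.block([[X, -I], [-I, O]]), np.block([[O], [I]])

def add_rep(r1, r2):
    (u1, q1, v1), (u2, q2, v2) = r1, r2
    Z = np.zeros((q1.shape[0], q2.shape[1]))
    return (np.block([u1, u2]),
            np.block([[q1, Z], [Z.T, q2]]),
            np.block([[v1], [v2]]))

def inv_rep(r):
    # step (3): representation of r^{-1} from a representation of r
    u, q, v = r
    m = u.shape[0]
    I, O = np.eye(m), np.zeros((m, q.shape[1]))
    return (np.block([I, O]),
            np.block([[np.zeros((m, m)), u], [-v, q]]),
            np.block([[I], [O.T]]))

def evaluate(r):
    u, q, v = r
    return -u @ np.linalg.inv(q) @ v

rng = np.random.default_rng(3)
m = 3
A = rng.standard_normal((m, m)); X = A @ A.T + m * np.eye(m)  # positive definite
B = rng.standard_normal((m, m)); Y = B @ B.T + m * np.eye(m)

r = inv_rep(add_rep(inv_rep(var_rep(X)), inv_rep(var_rep(Y))))
expected = np.linalg.inv(np.linalg.inv(X) + np.linalg.inv(Y))
assert np.allclose(evaluate(r), expected)
print(r[1].shape)  # the affine matrix q has size 7m x 7m
```

Positive definite X and Y are chosen so that all the intermediate inverses exist, as required by Definition 11.4.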
Remark 12.1. Note that we have to restrict to X_1, …, X_d ∈ A for which r(X_1, …, X_d) is defined in A, for a rational expression r. Up to now we considered this in a C*-algebra A, which means that r(X_1, …, X_d) has to exist in A, i.e., as a bounded operator. Can we weaken this?

Example 12.2. Let X : Ω → R be a classical real-valued random variable, defined on a probability space (Ω, 𝔄, P). When does Y := X^{−1} = 1/X make sense as a random variable? Since our functions are defined only almost everywhere, we need

  μ_X({0}) = P(X = 0) = 0.  (12.1)

If we consider X as a multiplication operator on L²(Ω, P), then (12.1) says that the kernel

  ker(X) := {f ∈ L²(Ω) | Xf = 0}

is trivial, i.e., ker(X) = {0}. Under this condition X^{−1} exists, but it might be an unbounded operator, namely if 0 is in the spectrum σ(X) of X.
(1) For example, let X = S be a semicircular variable with distribution dμ_S(t) = (2π)^{−1}√(4 − t²) dt. We can realize X as a multiplication operator on the interval [−2, 2]; i.e., (Xf)(t) = tf(t) for f ∈ L²([−2, 2], μ_S). Then 0 ∈ σ(S) and S^{−1} does not exist as a bounded operator, but makes sense as an unbounded operator: (X^{−1}f)(t) = t^{−1}f(t) for those f such that t ↦ t^{−1}f(t) is in L²([−2, 2], μ_S). Note that we only need injectivity of X (i.e., ker(X) = {0}) to ensure "surjectivity", i.e., that the image of X is dense, so that we can invert it there. This is like for matrices, but of course is not true for general infinite-dimensional operators.
(2) Without injectivity we have no chance of making sense of X^{−1}, even as an unbounded operator. E.g., if μ_X = ½(δ_0 + δ_1), there is no X^{−1}.

Definition 12.3.
Let M ⊂ B(H) be a von Neumann algebra. A densely defined and closed unbounded operator X on H is affiliated with M if for every unitary U ∈ M′ (where M′ is the commutant of M) we have UX = XU. [Equivalently, in the polar decomposition X = U|X| we have U ∈ M and |X| is affiliated with M, i.e., all spectral projections of |X| are in M.] We write M̃ for the set of operators affiliated with M.

Example 12.4. (1) If M = B(H), then M̃ consists of all unbounded densely defined and closed operators on H; for dim H = ∞ this is a nasty object without much structure.
(2) If M = L^∞(μ), then M̃ is the ∗-algebra of all μ-measurable functions.
(3) If M is a finite von Neumann algebra (i.e., it has a faithful normal trace τ), then the situation is as nice as in the classical commutative case or in the case of matrices; namely, then M̃ is a ∗-algebra, and for X ∈ M̃ the inverse X^{−1} ∈ M̃ exists if and only if X is injective, i.e., ker(X) = {0}. [Those are results of Murray and von Neumann.]

Remark 12.5. (1) Note that the case of a finite von Neumann algebra is the relevant one for us; our C*-probability spaces (A, φ) are usually W*-probability spaces (M, τ), where M is a von Neumann algebra and τ is a trace. In particular, free semicirculars S_1, …, S_d live in a finite von Neumann algebra. More generally, limits of random matrices do so, since our φ, as the limit of traces on matrices, is necessarily also a trace.
(2) If we are in a finite von Neumann algebra setting (M, τ), then we can replace A in Theorem 11.5 by M̃ and thus also treat r(X_1, …, X_d) which are defined as unbounded affiliated operators. Via our linearization r ≙ (u, q, v) this requirement on the existence of r(X_1, …, X_d) as an unbounded operator is the same as the existence of the inverse of q(X_1, …, X_d) as an unbounded operator; and then we still have r(X_1, …, X_d) = −u q(X_1, …, X_d)^{−1} v.
(3) This raises the question whether there are operators X_1, …, X_d for which
(i) all rational expressions r(X_1, …, X_d) are defined as unbounded operators, or
(ii) all inverses of q(X_1, …, X_d) exist as unbounded operators.
Note that we have to specify more precisely which r and q we mean.
(i) We have to make sure that we never invert 0; thus 0^{−1} is not allowed as r; but there can also be more subtle versions of this, like (yx x^{−1} − y)^{−1} or {x − [x^{−1} + (y^{−1} − x)^{−1}]^{−1} − xyx}^{−1}.
(ii) The q arising in our linearization algorithm are full in the following sense: q has no proper rectangular factorization in matrices over C⟨x_1, …, x_d⟩, i.e., if we can factorize q ∈ M_n(C⟨x_1, …, x_d⟩) as q = q_1 q_2 with q_1 ∈ M_{n,r}(C⟨x_1, …, x_d⟩) and q_2 ∈ M_{r,n}(C⟨x_1, …, x_d⟩), then we necessarily have r ≥ n. Note: if q is not full, then q(X_1, …, X_d) = q_1(X_1, …, X_d) · q_2(X_1, …, X_d); since q_1(X_1, …, X_d), as an n × r matrix with r < n, cannot have dense image, q(X_1, …, X_d) cannot have dense image either and hence, in the finite von Neumann algebra setting, must have a kernel. So we clearly need fullness as a requirement for the considered q.

Theorem 12.6 (Mai, Speicher, Yin 2019). Let (M, τ) be a tracial W*-probability space and consider X_1, …, X_d ∈ M. Then the following are equivalent.
(i) For all meaningful rational expressions r ≠ 0, the operator r(X_1, …, X_d) exists as an unbounded operator in M̃ and is invertible in M̃.
(ii) For all full affine q ∈ M_n(C⟨x_1, …, x_d⟩) the operator q(X_1, …, X_d) ∈ M_n(M) is invertible in M_n(M̃).
(iii) Δ(X_1, …, X_d) = d, which means the following: if we have finite rank operators T_1, …, T_d on L²(M, τ) such that Σ_{k=1}^{d}[T_k, X_k] = 0, then necessarily T_1 = ⋯ = T_d = 0.

Remark 12.7. (1) The equivalence between (i) and (ii) is more or less the linearization idea; the relation between (ii) and (iii) relies on the following. Consider a linear and selfadjoint

  R̂ = b^(0) ⊗ 1 + b^(1) ⊗ X_1 + ⋯ + b^(d) ⊗ X_d

with b^(0), b^(1), …, b^(d) ∈ M_n(C) selfadjoint, and assume we have an element f = (f_1, …, f_n), with f_i ∈ L²(M) for i = 1, …, n, in the kernel of R̂, i.e., R̂f = 0; then put

  T_k := Σ_{i,j=1}^{n} b_{ij}^(k) ⟨·, f_i⟩ f_j  (k = 0, 1, …, d).

Those T_0, T_1, …, T_d are finite rank operators, and R̂f = 0 translates into

  T_0 + Σ_{k=1}^{d} X_k T_k = 0.

Since the T_k are selfadjoint, we get by taking adjoints

  T_0 + Σ_{k=1}^{d} T_k X_k = 0.

By taking the difference of those two equations we then have

  Σ_{k=1}^{d} (T_k X_k − X_k T_k) = Σ_{k=1}^{d} [T_k, X_k] = 0.

The theorem holds also for non-selfadjoint X_i; the arguments are then getting more involved.
(2) It is not obvious how to check whether Δ(X_1, …, X_d) = d is satisfied or not. However, there are a couple of free probability tools to decide this, like "maximality of free entropy dimension" or "existence of a dual system". So we know, for example, that Δ(S_1, …, S_d) = d for free semicirculars S_1, …, S_d.

The above gives us directly some strong implications about the absence of atoms in the distribution of polynomials, or even rational functions, of operators which satisfy Δ(X_1, …, X_d) = d. Let us formulate this just for the most prominent case of free semicirculars.

Corollary 12.8.
Let (M, τ) be a finite W*-probability space and S_1, …, S_d ∈ M free semicirculars.
(1) For any meaningful rational expression r the operator r(S_1, …, S_d) ∈ M̃ exists as an unbounded operator. If r = r* and r is not constant, then μ_{r(S_1,…,S_d)} has no atoms.
(2) For any full q ∈ M_n(C⟨x_1, …, x_d⟩) the operator q(S_1, …, S_d) is invertible in M_n(M̃). If q = q*, then μ_{q(S_1,…,S_d)}, with respect to (M_n(M), tr_n ⊗ τ), has no atom at 0.

In Examples 1.7 and 1.8 we saw two realizations of the most important non-commutative distribution, namely n free semicircular elements. In this assignment you are asked to familiarize yourself with the meaning of this. For the notion of freeness you might watch Lectures 1 and 2 from the class "Free Probability Theory" from last term or read the corresponding Chapter 1 of the class notes. For random matrices you might watch Lectures 17 and 18 or read Chapter 6.

Exercise 1.
Let S_1, …, S_n be the operators on the full Fock space from Example 1.7.
(i) Saying that each S ∈ {S_i | 1 ≤ i ≤ n} is a semicircular variable means that its odd moments are zero and its even moments are given by the Catalan numbers, i.e.,

  φ(S^{2k+1}) = 0, φ(S^{2k}) = C_k = (1/(k+1)) · (2k choose k).

Check the latter for small k, i.e., show that

  φ(S²) = 1, φ(S⁴) = 2, φ(S⁶) = 5, φ(S⁸) = 14.

(ii) Saying that the S_1, …, S_n are free means that special mixed moments vanish. Show this for the following special cases:

  φ(S_1 S_2 S_1 S_2) = 0, φ((S_1² − 1)(S_2² − 1)(S_1² − 1)) = 0.

Exercise 2.
Let X_i^(N) be the independent Gaussian random matrices from Example 1.8. Familiarize yourself with computer programs (e.g., matlab) to produce random matrices and to calculate and plot histograms of their eigenvalues.
(i) Saying that, for each i, X_i^(N) is asymptotically a semicircular variable means that for large N the eigenvalue distribution of the N eigenvalues of such a matrix is close to the semicircle distribution. Check this by producing a histogram for a 1000 × 1000 random matrix.
(ii) Saying that X_1^(N), …, X_n^(N) are asymptotically free means that special mixed moments (with respect to the normalized trace tr) are, for large N, close to zero. Check this numerically for the following special cases:

  tr(ABAB), tr((A² − 1)(B² − 1)(A² − 1)),

where A and B are two independent 1000 × 1000 Gaussian random matrices.

Exercise 3.
Let (C, φ) be a non-commutative probability space. Put A := M_n(C) (the n × n matrices over the algebra C), B := M_n(C) (the n × n matrices over the complex numbers), and E := id ⊗ φ : A → B.
(i) Show that (A, B, E) is an operator-valued probability space.
(ii) Assume that (C, φ) is a C*-probability space. Show that (A, B, E) is then an operator-valued C*-probability space.
(iii) Show that in the C*-case we also have: if φ is faithful, then E is also faithful. [Faithful means: E(A*A) = 0 implies A = 0.]
(iv) Assume that φ is a trace, i.e., φ(ab) = φ(ba) for all a, b ∈ C. Does then also E have the tracial property? Give a proof or a counterexample!

Exercise 4.
Let B be a unital algebra. Consider a collection of functions F = (F_m)_{m ∈ N},

  F_m : M_m(B) → M_m(B), z ↦ F_m(z).

(i) We say that F respects direct sums if

  F_{m_1+m_2}( z_1 0 ; 0 z_2 ) = ( F_{m_1}(z_1) 0 ; 0 F_{m_2}(z_2) )

for all m_1, m_2 ∈ N, z_1 ∈ M_{m_1}(B), z_2 ∈ M_{m_2}(B).
(ii) We say that F respects similarities if

  F_m(S z S^{−1}) = S F_m(z) S^{−1}

for all m ∈ N, z ∈ M_m(B) and all invertible S ∈ M_m(C).
(iii) We say that F respects intertwinings if for all n, m ∈ N, z_1 ∈ M_n(B), z_2 ∈ M_m(B), T ∈ M_{n,m}(C) (the latter are the n × m matrices with complex entries) we have the following:

  z_1 T = T z_2 ⟹ F_n(z_1) T = T F_m(z_2).

Prove that [(i) and (ii)] is equivalent to (iii).
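For a concrete matricial function these properties can be tested numerically; the sketch below checks (i), (ii), and an instance of (iii) for F_m(z) = z² + z (the choice of F and all sizes are our own):

```python
import numpy as np

def F(z):
    # a simple non-commutative function: F_m(z) = z^2 + z on each M_m
    return z @ z + z

rng = np.random.default_rng(4)
n, m = 2, 3
z1 = rng.standard_normal((n, n))
z2 = rng.standard_normal((m, m))

# (i) respects direct sums
direct_sum = np.block([[z1, np.zeros((n, m))], [np.zeros((m, n)), z2]])
expected = np.block([[F(z1), np.zeros((n, m))], [np.zeros((m, n)), F(z2)]])
assert np.allclose(F(direct_sum), expected)

# (ii) respects similarities
S = rng.standard_normal((m, m)) + 5 * np.eye(m)  # invertible
assert np.allclose(F(S @ z2 @ np.linalg.inv(S)), S @ F(z2) @ np.linalg.inv(S))

# (iii) respects intertwinings: with T = [I 0] one has z1 T = T (z1 (+) z2)
T = np.block([np.eye(n), np.zeros((n, m))])
assert np.allclose(z1 @ T, T @ direct_sum)
assert np.allclose(F(z1) @ T, T @ F(direct_sum))
```

The intertwining instance used here (a corner embedding T) is exactly the kind of T one combines with similarities in the proof of the equivalence.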
Exercise 5.
Prove the second item from the proof of Lemma 3.6: let f be a non-commutative function; then we have for z_1 ∈ M_n(B), z_2 ∈ M_m(B) that

  ∂f(z_1, z_2)♯(w_1 + w_2) = ∂f(z_1, z_2)♯w_1 + ∂f(z_1, z_2)♯w_2

for all w_1, w_2 ∈ M_{n,m}(B).

Exercise 6.
Let r ∈ N and b_1, b_2, …, b_{r+1} ∈ B be given and consider the monomial

  f(z) = b_1 z b_2 z b_3 z ⋯ b_r z b_{r+1}.

(i) Show that f = (f_m)_{m ∈ N} is a non-commutative function. (For this, also give first the precise definition of all f_m : M_m(B) → M_m(B).)
(ii) Calculate the first and second order derivatives of f, i.e.,

  ∂f(z_1, z_2)♯w and ∂²f(z_1, z_2, z_3)♯(w_1, w_2).

Exercise 7.
For a non-commutative function f we define the mappings ∂^{k−1}f(z_1, …, z_k)♯(w_1, …, w_{k−1}) by

  f ( z_1  w_1                )     ( f(z_1)  ∂f(z_1,z_2)♯w_1  ∂²f(z_1,z_2,z_3)♯(w_1,w_2)  ⋯  ∂^{k−1}f(z_1,…,z_k)♯(w_1,…,w_{k−1}) )
    (      z_2  w_2           )  =  (         f(z_2)           ∂f(z_2,z_3)♯w_2             ⋯  ∂^{k−2}f(z_2,…,z_k)♯(w_2,…,w_{k−1}) )
    (           ⋱    ⋱        )     (                          ⋱                               ⋮                                   )
    (                z_{k−1}  w_{k−1} )  (                                                     ∂f(z_{k−1},z_k)♯w_{k−1}            )
    (                     z_k )     (                                                          f(z_k)                             ).

Show that for each N ∈ N we have the expansion

  f(z + tw) = Σ_{k=0}^{N} t^k ∂^k f(z, …, z)♯(w, …, w) + t^{N+1} ∂^{N+1} f(z, …, z, z + tw)♯(w, …, w)

for all m ∈ N, z, w ∈ M_m(B) and t ∈ C. You can assume for this that ∂^{k−1}f(z_1, …, z_k)♯(w_1, …, w_{k−1}) is linear in the arguments w_i.
Hint: It might be helpful to consider the matrix

  y := ( z  tw              )
       (    z  tw           )
       (       ⋱    ⋱       )
       (            z  tw   )
       (               z+tw )

and to observe that

  y · ( 0 ; ⋮ ; 0 ; 1 ) = ( 0 ; ⋮ ; 0 ; 1 ) · (z + tw).

Exercise 8.
Consider the $C^*$-algebra $M_n(\mathbb{C})$ of $n \times n$ matrices over $\mathbb{C}$. We define its upper half-plane by
\[\mathbb{H}^+(M_n(\mathbb{C})) := \{ b \in M_n(\mathbb{C}) \mid \exists\, \varepsilon > 0 : \operatorname{Im}(b) \geq \varepsilon 1 \}, \qquad \text{where } \operatorname{Im}(b) := \frac{b - b^*}{2i}.\]
(i) In the case $n = 2$, show that in fact
\[\mathbb{H}^+(M_2(\mathbb{C})) = \left\{ \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \,\middle|\, \operatorname{Im}(b_{11}) > 0,\ \operatorname{Im}(b_{11}) \operatorname{Im}(b_{22}) > \tfrac{1}{4} \lvert b_{12} - \overline{b_{21}} \rvert^2 \right\}.\]
(ii) For general $n \in \mathbb{N}$, prove: if a matrix $b \in M_n(\mathbb{C})$ belongs to $\mathbb{H}^+(M_n(\mathbb{C}))$, then all eigenvalues of $b$ lie in the complex upper half-plane $\mathbb{H}^+(\mathbb{C})$. Is the converse also true?
Let $\mathcal{A}$ and $\mathcal{B}$ be unital $C^*$-algebras. A linear map $\Phi \colon \mathcal{A} \to \mathcal{B}$ is called completely positive if all matrix amplifications $\Phi \otimes \operatorname{id}_n \colon M_n(\mathcal{A}) \to M_n(\mathcal{B})$ are positive. Exercise 9.
Show that the following are equivalent:
(i) $\Phi \colon \mathcal{A} \to \mathcal{B}$ is completely positive.
(ii) For each $n \in \mathbb{N}$ and all $a_1, \dots, a_n \in \mathcal{A}$ the matrix $(\Phi(a_i a_j^*))_{i,j=1}^n \in M_n(\mathcal{B})$ is positive.
Exercise 10. Show that the transpose map on $2 \times 2$ matrices,
\[M_2(\mathbb{C}) \to M_2(\mathbb{C}), \qquad \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \mapsto \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix},\]
is positive, but not completely positive. Exercise 11.
Show that a positive conditional expectation $E \colon \mathcal{A} \to \mathcal{B}$ is completely positive. What does this tell us about the complete positivity of states $\varphi \colon \mathcal{A} \to \mathbb{C}$?
Hint: For this you can use the following characterization: a matrix $(b_{ij})_{i,j=1}^n \in M_n(\mathcal{B})$ is positive if and only if we have
\[\sum_{i,j=1}^n b_i b_{ij} b_j^* \geq 0 \qquad \text{for all } b_1, \dots, b_n \in \mathcal{B}.\]
Exercise 12. (i) Let $(\mathcal{A}, \mathcal{B}, E)$ be a $\mathcal{B}$-valued $C^*$-probability space. Consider a "constant" selfadjoint random variable $b = b^* \in \mathcal{B} \subset \mathcal{A}$. Calculate the fully matricial Cauchy transform of $b$.
(ii) Consider a $C^*$-probability space $(\mathcal{A}, \varphi)$ as a special case of an operator-valued $C^*$-probability space, where $\mathcal{B} = \mathbb{C}$. Consider a selfadjoint $X = X^* \in \mathcal{A}$. Its distribution $\mu_X$ is then a probability measure on $\mathbb{R}$. Express the fully matricial $\mathbb{C}$-valued Cauchy transform $G_X$ in terms of $\mu_X$.
(iii) Assume that $X_1$ and $X_2$ are classical (thus commuting) bounded selfadjoint random variables. Hence they have a classical joint distribution, which is a probability measure on $\mathbb{R}^2$ with compact support. Consider now the $2 \times 2$ matrix
\[X = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}.\]
The $M_2(\mathbb{C})$-valued Cauchy transform of $X$, as a fully matricial function, should now be determined in terms of this classical data. Make this concrete! Exercise 13.
Show the following easy direction of Theorem 4.9: Let $(\mathcal{A}, \mathcal{B}, E)$ be an operator-valued $C^*$-probability space and $X = X^* \in \mathcal{A}$. Show that $\mu_X \in \Sigma_{\mathcal{B}}$. Exercise 14.
Let $\Phi \colon \mathcal{A} \to \mathcal{B}$ be a completely positive map between two unital $C^*$-algebras with $\Phi(1) = 1$. Show that $\Phi$ satisfies the following kind of Cauchy–Schwarz inequality: for $a \in \mathcal{A}$ we have
\[\Phi(a)^* \Phi(a) \leq \Phi(a^* a).\]
Hint: Consider the positive matrix
\[\begin{pmatrix} 1 & a \\ a^* & a^* a \end{pmatrix}.\]
Exercise 15.
Let $X$ and $Y$ be free in an operator-valued probability space $(\mathcal{A}, \mathcal{B}, E)$. Calculate the mixed moment $E[X b_1 Y b_2 X b_3 Y]$, for $b_1, b_2, b_3 \in \mathcal{B}$, in terms of moments of $X$ and of $Y$. Exercise 16.
Let $\eta \colon \mathcal{B} \to \mathcal{B}$ be a completely positive map on the unital $C^*$-algebra $\mathcal{B}$. We want to construct an operator $X$ which has $\eta$ as its second moment; this will be a kind of operator-valued Bernoulli element. For this we consider the degenerate Fock space
\[\mathcal{F} := \mathcal{B} \oplus \mathcal{B} x \mathcal{B} \subset \mathcal{B}\langle x \rangle,\]
equipped with the $\mathcal{B}$-valued inner product $\langle \cdot, \cdot \rangle \colon \mathcal{F} \times \mathcal{F} \to \mathcal{B}$, given by linear extension of
\[\langle b_1 + b_2 x b_3, \tilde b_1 + \tilde b_2 x \tilde b_3 \rangle := b_1^* \tilde b_1 + b_3^* \eta(b_2^* \tilde b_2) \tilde b_3.\]
On $\mathcal{F}$ we define the creation operator $l^*$ by
\[l^* b = x b, \qquad l^* (b_1 x b_2) = 0,\]
and the annihilation operator $l$ by
\[l b = 0, \qquad l (b_1 x b_2) = \eta(b_1) b_2.\]
Let $\mathcal{A}$ be the $*$-algebra which is generated by $l$ and by elements $b \in \mathcal{B}$ acting as multiplication operators on $\mathcal{F}$. We also put
\[E \colon \mathcal{A} \to \mathcal{B}, \qquad A \mapsto E[A] := \langle 1, A 1 \rangle.\]
(i) Show that the inner product is positive and that $l$ and $l^*$ are adjoints of each other.
(ii) Show that $E$ is positive.
(iii) Show that the second moment of the selfadjoint operator $X = l + l^*$ is given by $\eta$.
(iv) What is the formula for a general moment of $X$? Exercise 17.
Let $S \in \mathcal{A}$ be a $\mathcal{B}$-valued semicircular element with covariance $\eta \colon \mathcal{B} \to \mathcal{B}$. Fix $n \in \mathbb{N}$ and $b \in M_n(\mathcal{B})$. Consider now
\[\hat S := b (1 \otimes S) b^* = b \begin{pmatrix} S & & \\ & \ddots & \\ & & S \end{pmatrix} b^* \in M_n(\mathcal{A}).\]
Show that $\hat S$ is an $M_n(\mathcal{B})$-valued semicircular element and calculate its covariance $\hat\eta \colon M_n(\mathcal{B}) \to M_n(\mathcal{B})$. Compare also Remark 6.3.
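If one works out $\hat\eta$, a natural guess (which the exercise asks you to verify) is $\hat\eta(\beta) = b\,\eta_n(b^* \beta b)\,b^*$, where $\eta_n$ applies $\eta$ entrywise. The following hypothetical Monte Carlo sketch checks this in the simplest case $\mathcal{B} = \mathbb{C}$ with $S$ standard semicircular (so $\eta = \operatorname{id}$ and the guess reduces to $\hat\eta(\beta) = (bb^*)\beta(bb^*)$), by replacing $S$ with a GUE random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, samples = 2, 300, 20

b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
beta = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def gue(N):
    # GUE matrix normalized so that the normalized trace of A^2 tends to 1
    g = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
    return (g + g.conj().T) / np.sqrt(2)

def block_trace(M):
    # the conditional expectation id_n ⊗ tr_N : M_n(M_N(C)) -> M_n(C)
    return np.trace(M.reshape(n, N, n, N), axis1=1, axis2=3) / N

I = np.eye(N)
est = np.zeros((n, n), dtype=complex)
for _ in range(samples):
    A = gue(N)
    # Shat = b (1 ⊗ S) b*, with S replaced by the GUE matrix A
    Shat = np.kron(b, I) @ np.kron(np.eye(n), A) @ np.kron(b.conj().T, I)
    est += block_trace(Shat @ np.kron(beta, I) @ Shat) / samples

predicted = (b @ b.conj().T) @ beta @ (b @ b.conj().T)
print(np.max(np.abs(est - predicted)))  # should be small, of order 1/N
```

The names and the normalization here are assumptions for illustration only; the exercise of course asks for a proof for general $\mathcal{B}$ and $\eta$, not a numerical check.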
Exercise 18.
Assume that we have $X_i^{(1)}$ ($i \in \mathbb{N}$) which are f.i.d. (free and identically distributed), with first moment zero and second moment given by a covariance $\eta_1 \colon \mathcal{B} \to \mathcal{B}$; and that we have $X_i^{(2)}$ ($i \in \mathbb{N}$) which are f.i.d. with first moment zero and second moment given by a covariance $\eta_2 \colon \mathcal{B} \to \mathcal{B}$. According to the operator-valued version of the free central limit theorem we know then that the normalized sum of the $X_i^{(1)}$ converges to an operator-valued semicircular element $S_1$ with covariance $\eta_1$ and that the normalized sum of the $X_i^{(2)}$ converges to an operator-valued semicircular element $S_2$ with covariance $\eta_2$.
Assume now that the $X_i^{(1)}$ and $X_i^{(2)}$ are realized in the same $C^*$-probability space and are also free for each $i$. Then the joint distribution of $(X_i^{(1)}, X_i^{(2)})$ converges to the joint distribution of the pair $(S_1, S_2)$. Convince yourself that our argument (from the Free Probability Lecture Notes, Assignment 3, Exercise 4) for the scalar-valued case that freeness goes over to the limit remains valid in the operator-valued case. Thus we get in the limit two semicircular elements which are free.
By repeating the calculation in our proof of the central limit theorem, Theorem 6.2, for this multivariate setting, derive the formula for mixed moments of two free semicircular elements $S_1$ and $S_2$, with covariance mappings $\eta_1$ and $\eta_2$, respectively. Exercise 19.
Let $\eta \colon \mathcal{B} \to \mathcal{B}$ be a completely positive map on the $C^*$-algebra $\mathcal{B}$. We want to construct a semicircular operator $X$ which has $\nu_\eta$ as its distribution. This operator will be constructed on an operator-valued version of the full Fock space. The latter is nothing but our polynomials $\mathcal{B}\langle x \rangle$, equipped with the $\mathcal{B}$-valued inner product
\[\langle b_1 x b_2 x \cdots b_n x b_{n+1}, \tilde b_1 x \tilde b_2 x \cdots \tilde b_m x \tilde b_{m+1} \rangle := \delta_{nm}\, b_{n+1}^* \eta\bigl(b_n^* \cdots \eta(b_2^* \eta(b_1^* \tilde b_1) \tilde b_2) \cdots \tilde b_n\bigr) \tilde b_{n+1}.\]
On this full Fock space $\mathcal{F}$ we define again a creation operator $l^*$, now given by
\[l^*\, b_1 x b_2 \cdots x b_{n+1} := x b_1 x b_2 \cdots x b_{n+1},\]
and an annihilation operator $l$, given by
\[l b = 0, \qquad l\, b_1 x b_2 x \cdots x b_{n+1} := (\eta(b_1) b_2) x \cdots x b_{n+1}.\]
Elements from $\mathcal{B}$ act on $\mathcal{F}$ by left multiplication. For $\mathcal{A}$ we take now the $*$-algebra which is generated by $l$ and by all multiplication operators from $\mathcal{B}$. Furthermore, we put
\[E \colon \mathcal{A} \to \mathcal{B}, \qquad A \mapsto E[A] := \langle 1, A 1 \rangle.\]
(i) Show that the inner product on $\mathcal{F}$ is positive and that $l$ and $l^*$ are adjoints of each other.
(ii) Show that $E$ is positive.
(iii) Calculate explicitly the second and the fourth moments of $X := l + l^*$.
(iv) Prove that $X = l + l^*$ has semicircular distribution $\nu_\eta$. Exercise 20.
Let $S_1$ and $S_2$ be two free (scalar-valued) standard semicircular elements and consider
\[S := \begin{pmatrix} 0 & S_1 \\ S_1 & S_2 \end{pmatrix}.\]
We have seen in item (3) of Remark 6.3 that $S$ is then an $M_2(\mathbb{C})$-valued semicircular element whose covariance function $\eta \colon M_2(\mathbb{C}) \to M_2(\mathbb{C})$ is given by
\[\eta \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} b_{22} & b_{21} \\ b_{12} & b_{11} + b_{22} \end{pmatrix}.\]
Refresh your memory on the relation between free semicircular elements and independent GUE random matrices (for example, from Chapter 6 of the Free Probability Lecture Notes). From this it follows that $S$ is the limit of a random matrix
\[X_N = \begin{pmatrix} 0 & A_N \\ A_N & B_N \end{pmatrix},\]
where $A_N$ and $B_N$ are independent GUE random matrices. (If $A_N$ and $B_N$ are $N \times N$ matrices, then $X_N$ is of course a $2N \times 2N$ matrix.) Since
\[g(z) = \operatorname{tr} E[(z - S)^{-1}] = \operatorname{tr} G(z)\]
is the scalar-valued Cauchy transform of $S$ with respect to $\operatorname{tr} \circ E$ (tr is here the normalized trace over $2 \times 2$ matrices), we can obtain the density of the limiting eigenvalue distribution of $X_N$ by first calculating the $M_2(\mathbb{C})$-valued Cauchy transform $G(z)$ of $S$ and then taking the trace of this. For invoking the Cauchy–Stieltjes inversion formula, we should calculate this for $z$ close to the real axis.
(i) We know that the operator-valued Cauchy transform (on the ground level) $G(b)$ satisfies the matrix equation
\[b G(b) = 1 + \eta(G(b)) G(b).\]
This is true for all $b \in \mathbb{H}^+(M_2(\mathbb{C}))$, but we are here only interested in arguments of the form $b = z 1$, where $z \in \mathbb{H}^+(\mathbb{C})$. Try to solve this equation (exactly or numerically) for $z \in \mathbb{H}^+(\mathbb{C})$ close to the real axis, so that you can produce from this a density for the scalar-valued distribution of $S$.
(ii) Realize for large $N$ the random matrix $X_N$ and calculate histograms for its eigenvalue distribution. Compare this with the result from (i). Exercise 21.
Prove Proposition 7.2: Let $(\mathcal{A}, \varphi)$ be a $C^*$-probability space and $S_1, \dots, S_d \in \mathcal{A}$ free standard semicirculars (i.e., $\varphi(S_i) = 0$ and $\varphi(S_i^2) = 1$). For $n \geq 1$ and selfadjoint $b_1, \dots, b_d \in M_n(\mathbb{C})$ we consider
\[S := b_1 \otimes S_1 + \cdots + b_d \otimes S_d \in M_n(\mathcal{A}).\]
Then $S$ is in the matrix-valued $C^*$-probability space $(M_n(\mathcal{A}), M_n(\mathbb{C}), \operatorname{id} \otimes \varphi)$ a matrix-valued semicircular element with covariance $\eta \colon M_n(\mathbb{C}) \to M_n(\mathbb{C})$ given by
\[\eta(b) = \sum_{j=1}^d b_j b b_j.\]
Exercise 22.
Let $S_{ij}$ for $i \geq j$ be free standard semicircular elements, and put $S_{ij} = S_{ji}$ for $i < j$. Furthermore, let $\alpha_{ij} \in \mathbb{R}$ with $\alpha_{ij} = \alpha_{ji}$ be given. Then we consider
\[S := (\alpha_{ij} S_{ij})_{i,j=1}^n.\]
Show that $S$ is an $M_n(\mathbb{C})$-valued semicircular element. Give, by relying on Theorem 7.5, a criterion to decide whether $S$ is also a scalar-valued semicircular element. Use this to decide whether the following are scalar-valued semicircular elements (for $S_1, \dots, S_6$ free standard semicirculars):
\[S = \begin{pmatrix} S_1 & S_2 & 0 \\ S_2 & S_3 & 0 \\ 0 & 0 & S_4 \end{pmatrix} \qquad \text{or} \qquad \tilde S = \begin{pmatrix} S_1 & S_2 & S_3 \\ S_2 & S_4 & S_5 \\ S_3 & S_5 & S_6 \end{pmatrix}.\]
Exercise 23. Check your conclusion from the last exercise numerically by producing histograms, for large $N$, of the eigenvalue distributions of
\[X^{(N)} = \begin{pmatrix} X_1^{(N)} & X_2^{(N)} & 0 \\ X_2^{(N)} & X_3^{(N)} & 0 \\ 0 & 0 & X_4^{(N)} \end{pmatrix} \qquad \text{or} \qquad \tilde X^{(N)} = \begin{pmatrix} X_1^{(N)} & X_2^{(N)} & X_3^{(N)} \\ X_2^{(N)} & X_4^{(N)} & X_5^{(N)} \\ X_3^{(N)} & X_5^{(N)} & X_6^{(N)} \end{pmatrix},\]
where $X_1^{(N)}, \dots, X_6^{(N)}$ are independent GUE($N$) random matrices. Exercise 24.
Prove the recursion between moments and free cumulants from Proposition 9.6, by checking that the arguments from the scalar-valued case work also in the operator-valued situation.
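For a quick cross-check of such a recursion one can specialize to the scalar-valued case $\mathcal{B} = \mathbb{C}$, where (with all $\mathcal{B}$-arguments equal to $1$) it takes the well-known form $m_n = \sum_{s=1}^{n} \kappa_s \sum_{i_1 + \cdots + i_s = n - s} m_{i_1} \cdots m_{i_s}$. A minimal Python sketch of this scalar shadow, tested on two standard examples:

```python
def moments_from_free_cumulants(kappa, nmax):
    """Scalar moment-cumulant recursion:
    m_0 = 1,  m_n = sum_{s=1}^{n} kappa(s) * sum_{i_1+...+i_s = n-s} m_{i_1}*...*m_{i_s}."""
    m = [1]
    for n in range(1, nmax + 1):
        def split(total, s):
            # sum over i_1 + ... + i_s = total of m[i_1]*...*m[i_s]
            if s == 0:
                return 1 if total == 0 else 0
            return sum(m[i] * split(total - i, s - 1) for i in range(total + 1))
        m.append(sum(kappa(s) * split(n - s, s) for s in range(1, n + 1)))
    return m

# semicircular: kappa_2 = 1, all other free cumulants vanish -> Catalan numbers
semi = moments_from_free_cumulants(lambda s: 1 if s == 2 else 0, 8)
print(semi)  # [1, 0, 1, 0, 2, 0, 5, 0, 14]

# free Poisson with rate 1: all kappa_s = 1 -> m_n counts NC(n), again Catalan
poisson = moments_from_free_cumulants(lambda s: 1, 4)
print(poisson)  # [1, 1, 2, 5, 14]
```

In the operator-valued situation the products $m_{i_1} \cdots m_{i_s}$ are replaced by nested $\mathcal{B}$-valued moments with the $b_i$'s inserted at the right places, which is exactly the point of the exercise.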
Exercise 25.
Write down explicitly the linearization for a monomial of degree $k$. Exercise 26.
Find a linearization $\hat p$ of the polynomial
\[p(x, y) = xy + yx - y.\]
Bonus Questions:
Exercise 27.
Calculate, via linearization and numerical calculation of the corresponding operator-valued semicircular or of the corresponding operator-valued free convolution, the distribution of $p(X, Y) = XY + YX - Y$, where
○ $X$ and $Y$ are free standard semicircular elements, or
○ $X$ and $Y$ are free random variables, with $\mu_X = \frac{1}{2}(\delta_0 + \delta_1)$, $\mu_Y = \frac{1}{2}(\delta_{-1} + \delta_1)$.
Exercise 28. Realize $X$ and $Y$, as given in Exercise 27, (asymptotically) via large $N \times N$ random matrices $X_N$ and $Y_N$, and produce histograms of the eigenvalue distribution of $p(X_N, Y_N)$. Compare the results with the calculations from Exercise 27.

Some Off-the-Record Remarks

The preceding presentation has hopefully convinced the reader that we have developed powerful analytic tools to deal with non-commutative distributions and that we have reached a deep understanding of many facets of this non-commutative world. Let us reconsider what we have achieved so far. We have different ways to describe non-commutative distributions, namely
○ by presenting concrete operators on Hilbert spaces,
○ by describing the joint moments or the joint cumulants of the operators,
○ by giving the Cauchy transform of the distributions,
○ or by describing the classical distribution of all polynomials (or maybe even all rational functions) in the operators.
We consider a situation nice and well-understood when we have something to say about all of those ways, and usually progress comes from being able to switch between the different points of view. In particular, we should be able to get our hands on the Cauchy transform and distributions of polynomials. In the case of free variables, so in particular for free semicirculars, we are in such a nice situation. Also if we move away from free variables many of our tools still apply and lead to quite non-trivial statements.
In particular, the statements about the absence of atoms in polynomials for operators which have maximal $\Delta$ are of this type, and in the continuation of such investigations we have many more qualitative results on regularity properties of polynomials in such variables; like, for example, in the recent work [BM] on Hölder continuity of the distribution function of such polynomials. In this chapter we want to point out that there are of course also situations where the situation is not so satisfactory, and that we still hope for many more exciting discoveries in the non-commutative territory.

Is there anything special about distributions of generators of non-embeddable von Neumann algebras?

By the refutation [JNVWY] of the Connes embedding problem we know now that there are tracial von Neumann algebras which cannot be embedded into an ultrapower of the hyperfinite factor; or, to put it more in our language: there are operators in a tracial $W^*$-probability space whose joint moments cannot be approximated well by moments of matrices. Up to now nobody was able to construct explicit examples of such objects. Can our theory of non-commutative distributions say anything about the distribution of such operators? Can we address them by any of the above mentioned ways to deal with non-commutative distributions? Let us have a look.
○ We do not know any concrete operators; that's of course what we would like to find!
○ Neither do we know any candidates for joint moments or joint cumulants. Since positivity is always an issue here, it is not only the problem of coming up with moments which are unreachable for matrices, but one also needs arguments guaranteeing that those are really moments of selfadjoint operators.
○ Again, we would have to come up with a Cauchy transform which is not reachable by Cauchy transforms of matrices (and which is indeed a Cauchy transform, so satisfies Theorem 4.13).
It is hard to imagine how to get one without writing it down concretely, or maybe at least writing down an equation for it. Actually, a "random" Cauchy transform might do the job, but it is not clear how to make this rigorous.
○ This is even more unclear; without having knowledge about the Cauchy transform it seems quite unlikely to get a grasp on other functions of the variables.
So, for the moment, there is nothing we have to offer from our non-commutative distribution perspective and we can only hope for some more insights.
The q-Gaussian operators

Since we had no place to start in the preceding case it is not surprising that we could not say anything. So one might still have the hope that given some concrete operators, of which we have at least some knowledge, we should have good chances of saying something about their Cauchy transform and then maybe also about the distribution of polynomials in them. Here comes an example which shows that even then the situation is not so promising. This is a deformation of the situation of free semicirculars, but as they have no freeness in them we have problems with getting a grasp on their Cauchy transform.
The q-Gaussian distribution, also known as q-semicircular distribution, was introduced in [BSp, BKS] in the context of non-commutative probability. Let us review some basic definitions. In the following $q \in [-1, 1]$ is fixed. Consider a Hilbert space $\mathcal{H}$. The following is a q-deformation of the constructions from Example 1.7. On the algebraic full Fock space $\bigoplus_{n \geq 0} \mathcal{H}^{\otimes n}$ – where $\mathcal{H}^{\otimes 0} = \mathbb{C}\Omega$ with a norm one vector $\Omega$, called "vacuum" – we define a q-deformed inner product as follows:
\[\langle h_1 \otimes \cdots \otimes h_n, g_1 \otimes \cdots \otimes g_m \rangle_q = \delta_{nm} \sum_{\sigma \in S_n} \prod_{r=1}^n \langle h_r, g_{\sigma(r)} \rangle\, q^{i(\sigma)},\]
where
\[i(\sigma) = \#\{(k, l) \mid 1 \leq k < l \leq n;\ \sigma(k) > \sigma(l)\}\]
is the number of inversions of a permutation $\sigma \in S_n$. In [BSp] it was shown that this inner product is positive definite, and has a kernel only for $q = \pm 1$. The $q$-Fock space is then defined as the completion of the algebraic full Fock space with respect to this inner product:
\[\mathcal{F}_q(\mathcal{H}) = \overline{\bigoplus_{n \geq 0} \mathcal{H}^{\otimes n}}^{\,\langle \cdot, \cdot \rangle_q}.\]
In the cases $q = \pm 1$ one has to divide out the kernel before taking the completion. For $h \in \mathcal{H}$ we define the q-creation operator $a^*(h)$, given by
\[a^*(h)\, \Omega = h, \qquad a^*(h)\, h_1 \otimes \cdots \otimes h_n = h \otimes h_1 \otimes \cdots \otimes h_n.\]
Its adjoint (with respect to the q-inner product), the q-annihilation operator $a(h)$, is given by
\[a(h)\, \Omega = 0, \qquad a(h)\, h_1 \otimes \cdots \otimes h_n = \sum_{r=1}^n q^{r-1} \langle h, h_r \rangle\, h_1 \otimes \cdots \otimes h_{r-1} \otimes h_{r+1} \otimes \cdots \otimes h_n.\]
[Never mind that we have switched here the convention whether the creation or the annihilation operator gets the $*$. There are two conflicting traditions, one from physics, where creation goes with the $*$, and one from operator theory where, in the case $q = 0$, the left shift $l$, and not its adjoint $l^*$, is the basic isometry. Since we are now more on the physics side, our inner product has also become linear in its second argument.]
Those operators satisfy the q-commutation relations
\[a(f) a^*(g) - q\, a^*(g) a(f) = \langle f, g \rangle \cdot 1 \qquad (f, g \in \mathcal{H}).\]
For $q = 1$ these are the classical bosonic commutation relations, for $q = -1$ the fermionic anticommutation relations, and for $q = 0$ the free relations. For $-1 \leq q < 1$ the operators $a^*(f)$ are bounded.
Let $\xi_1, \dots, \xi_n$ be an orthonormal system of vectors in $\mathcal{H}$; then we consider the selfadjoint operators $X_i := a(\xi_i) + a^*(\xi_i)$ ($i = 1, \dots, n$). For $\varphi$ we take again the vacuum expectation state $\varphi(A) := \langle \Omega, A\Omega \rangle$. We are now interested in the non-commutative distribution $\mu_{X_1, \dots, X_n}$ of the operators $X_1, \dots, X_n$ in the $C^*$-probability space $(B(\mathcal{F}_q(\mathcal{H})), \varphi)$. We call this the (multivariate) q-Gaussian distribution. For $q = 0$ this is the distribution of $n$ free semicirculars. The q-deformation has still some of the features of the $q = 0$ situation; in particular, we have a q-deformed Wick formula: for any $\varepsilon \colon \{1, \dots, k\} \to \{1, \dots, n\}$ we have
\[\varphi(X_{\varepsilon(1)} \cdots X_{\varepsilon(k)}) = \sum_{\substack{\pi \in \mathcal{P}_2(k) \\ \pi \leq \ker \varepsilon}} q^{\operatorname{cr}(\pi)},\]
where $\operatorname{cr}(\pi)$ denotes the number of crossings of the pair-partition $\pi$, i.e., the number of pairs of blocks which have a crossing.
So, this looks quite good: we have a nice realization of the q-Gaussian distribution by very concrete operators and we have nice combinatorial formulas for all joint moments. But does this mean that we understand this non-commutative distribution well? Unfortunately, not really. In particular, we do not get a hold on its operator-valued Cauchy transform. Following our general strategy of going over from tuples of non-commuting operators to one operator-valued operator, we put our operators $X_1, \dots, X_n$ on the diagonal of an $n \times n$ matrix
\[X = \begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_n \end{pmatrix}. \tag{13.1}\]
Understanding the distribution of $(X_1, \dots, X_n)$ is now the same as understanding the $\mathcal{B}$-valued distribution of $X$, where we have put $\mathcal{B} := M_n(\mathbb{C})$; the matrix $X$ is what we would call an operator-valued q-semicircular element. In order to deal with this we should understand the $\mathcal{B}$-valued Cauchy transform $G_X = (G_X^{(k)})_{k \in \mathbb{N}}$. For a full understanding we need its structure as a fully matricial function with all its matrix amplifications, but for many applications even the knowledge just on the base level would be very helpful.
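The q-deformed Wick formula can be implemented directly, at least for the moments of a single q-Gaussian $X = a(\xi) + a^*(\xi)$ (i.e., $\varepsilon$ constant, so that the condition $\pi \leq \ker \varepsilon$ is vacuous): enumerate all pair-partitions of $\{1, \dots, 2k\}$ and count crossings. A small sketch:

```python
from itertools import combinations

def pair_partitions(elems):
    # all pair-partitions of the (ordered) list elems
    if not elems:
        yield []
        return
    first = elems[0]
    for j in range(1, len(elems)):
        rest = elems[1:j] + elems[j + 1:]
        for rest_pairing in pair_partitions(rest):
            yield [(first, elems[j])] + rest_pairing

def crossings(pairing):
    # number of crossing pairs of blocks: (a,b), (c,d) with a < c < b < d
    return sum(1 for (a, b), (c, d) in combinations(pairing, 2)
               if a < c < b < d or c < a < d < b)

def q_gaussian_moment(two_k, q):
    # phi(X^{2k}) = sum over pair-partitions pi of q^{cr(pi)}
    return sum(q ** crossings(p) for p in pair_partitions(list(range(two_k))))

print([q_gaussian_moment(2 * k, 0) for k in range(1, 5)])  # [1, 2, 5, 14] (Catalan)
print([q_gaussian_moment(2 * k, 1) for k in range(1, 5)])  # [1, 3, 15, 105] ((2k-1)!!)
```

For $q = 0$ only the non-crossing pair-partitions survive and one recovers the Catalan moments of the semicircle; for $q = 1$ all pairings contribute and one gets the classical Gaussian moments. Intermediate $q$ interpolates between the two, as befits a deformation.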
But here we are stuck. We do not have any nice concrete analytic description of this Cauchy transform.
From the situation for $q = 0$, the case of free semicirculars, one might have got the impression that the one-dimensional and the multivariate case are not so different after all. In that case the quadratic equation for the Cauchy transform of the scalar-valued semicircular distribution was replaced by a corresponding operator-valued quadratic equation. The latter was of course harder than its scalar-valued counterpart, but we could still deal with it. This might give the impression that also in the case of general $q$ we should be able to extend results from the $n = 1$ to the $n > 1$ case. However, this hope is in vain: one understands the case $n = 1$ for general $q$ quite well, but all the nice structure there does not extend into the operator-valued regime.
For $n = 1$, the q-Gaussian distribution is a probability measure on the interval $[-2/\sqrt{1-q}, 2/\sqrt{1-q}]$, with analytic formulas for its density, see Theorem 1.10 in [BKS]. For its Cauchy transform $G$ we do not have an algebraic equation, but we know a good continued fraction expansion of the form
\[G(z) = \cfrac{1}{z - \cfrac{1}{z - \cfrac{1+q}{z - \cfrac{1+q+q^2}{z - \ddots}}}}.\]
The naive guess that one might also have a corresponding operator-valued version of such a continued fraction expansion is unfortunately not true. Whereas in the scalar case any probability measure has a continued fraction expansion for its Cauchy transform, this does not hold any more in the operator-valued setting (see [AW]), and it is easy to check that the matrix $X$ in (13.1) for the q-Gaussian distribution is one of the basic examples where this fails.
So in a sense, at the moment our machinery for operator-valued Cauchy transforms has unfortunately nothing to offer for dealing with q-Gaussian distributions. Of course, Cauchy transforms are not everything and we have also other approaches and tools to understand non-commutative distributions. In particular, there has been quite some progress [GSh, Jek19] in our understanding of the q-Gaussian distributions, by describing them as free Gibbs states and using non-commutative versions of transport to relate different $q$'s. Combined with [BM] this gives then also regularity properties of polynomials in q-Gaussian operators. What is missing, compared to the free case, is a way to calculate the distribution of polynomials in q-Gaussians. But this would be the content of another lecture series ...

Bibliography

[AW] M. Anshelevich and J. Williams: Operator-valued Jacobi parameters and examples of operator-valued distributions. Bulletin des Sciences Mathématiques
145 (2018), 1–37.
[BM] M. Banna and T. Mai: Hölder continuity of cumulative distribution functions for non-commutative polynomials under finite free Fisher information. Journal of Functional Analysis 279 (2020), 108710.
[BMS] S. Belinschi, T. Mai, and R. Speicher: Analytic subordination theory of operator-valued free additive convolution and the solution of a general random matrix problem. Journal für die reine und angewandte Mathematik 732 (2017), 21–53.
[BSp] M. Bożejko and R. Speicher: An example of a generalized Brownian motion. Communications in Mathematical Physics 137 (1991), 519–531.
[BKS] M. Bożejko, B. Kümmerer, and R. Speicher: q-Gaussian processes: non-commutative and classical aspects. Communications in Mathematical Physics 185 (1997), 129–154.
[GSh] A. Guionnet and D. Shlyakhtenko: Free monotone transport. Inventiones mathematicae 197 (2014), 613–661.
[Jek19] D. Jekel: An elementary approach to free entropy theory for convex potentials. Analysis & PDE 13 (2020), 2289–2374.
[JNVWY] Z. Ji, A. Natarajan, T. Vidick, J. Wright, and H. Yuen: MIP* = RE. arXiv:2001.04383 (2020).
[KVV] D. Kaliuzhnyi-Verbovetskyi and V. Vinnikov: Foundations of Free Non-commutative Function Theory. Mathematical Surveys and Monographs, Vol. 199, American Mathematical Society, 2014.
[Kra] S. Krantz: The Carathéodory and Kobayashi metrics and applications in complex analysis. The American Mathematical Monthly 115 (2008), 304–329.
[MS] J. Mingo and R. Speicher: Free Probability and Random Matrices. Fields Institute Monographs, Vol. 35, Springer, New York, 2017.
[PV] M. Popa and V. Vinnikov: Non-commutative functions and the non-commutative free Lévy–Hinčin formula. Advances in Mathematics 236 (2013), 131–157.
[Spe] R. Speicher: Combinatorial Theory of the Free Product with Amalgamation and Operator-Valued Free Probability Theory. Memoirs of the AMS, Vol. 627, American Mathematical Society, 1998.
[Tay] J. Taylor: A general framework for a multi-operator functional calculus. Advances in Mathematics 9 (1972), 183–252.
[Voi95] D. Voiculescu: Operations on certain non-commutative operator-valued random variables. Astérisque 232 (1995), 243–275.
[Voi00] D. Voiculescu: The coalgebra of the free difference quotient $\partial_{X:B}$ and free probability. International Mathematics Research Notices 2000, no. 2, 79–106.