Tracial smooth functions of non-commuting variables and the free Wasserstein manifold

Abstract

We formulate a free probabilistic analog of the Wasserstein manifold on \mathbb{R}^d (the formal Riemannian manifold of smooth probability densities on \mathbb{R}^d), and we use it to study smooth non-commutative transport of measure. The points of the free Wasserstein manifold \mathscr{W}(\mathbb{R}^{*d}) are smooth tracial non-commutative functions V with quadratic growth at \infty, which correspond to minus the log-density in the classical setting. The space of smooth tracial non-commutative functions used here is a new one whose definition and basic properties we develop in the paper; they are scalar-valued functions of self-adjoint d-tuples from arbitrary tracial von Neumann algebras that can be approximated by trace polynomials. The space of non-commutative diffeomorphisms \mathscr{D}(\mathbb{R}^{*d}) acts on \mathscr{W}(\mathbb{R}^{*d}) by transport, and the basic relationship between tangent vectors for \mathscr{D}(\mathbb{R}^{*d}) and tangent vectors for \mathscr{W}(\mathbb{R}^{*d}) is described using the Laplacian L_V associated to V and its pseudo-inverse \Psi_V (when defined). Following similar arguments to arXiv:1204.2182, arXiv:1701.00132, and arXiv:1906.10051 in the new setting, we give a rigorous proof for the existence of smooth transport along any path t \mapsto V_t when V is sufficiently close to (1/2) \sum_j \operatorname{tr}(x_j^2), as well as smooth triangular transport.



David Jekel, Wuchen Li, and Dimitri Shlyakhtenko

January 19, 2021

Abstract

We formulate a free probabilistic analog of the Wasserstein manifold on $\mathbb{R}^d$ (the formal Riemannian manifold of smooth probability densities on $\mathbb{R}^d$), and we use it to study smooth non-commutative transport of measure. The points of the free Wasserstein manifold $\mathscr{W}(\mathbb{R}^{*d})$ are smooth tracial non-commutative functions $V$ with quadratic growth at $\infty$, which correspond to minus the log-density in the classical setting. The space of smooth tracial non-commutative functions used here is a new one whose definition and basic properties we develop in the paper; they are scalar-valued functions of self-adjoint $d$-tuples from arbitrary tracial von Neumann algebras that can be approximated by trace polynomials. The space of non-commutative diffeomorphisms $\mathscr{D}(\mathbb{R}^{*d})$ acts on $\mathscr{W}(\mathbb{R}^{*d})$ by transport, and the basic relationship between tangent vectors for $\mathscr{D}(\mathbb{R}^{*d})$ and tangent vectors for $\mathscr{W}(\mathbb{R}^{*d})$ is described using the Laplacian $L_V$ associated to $V$ and its pseudo-inverse $\Psi_V$ (when defined).

Following similar arguments to [35, 26, 41] in the new setting, we give a rigorous proof for the existence of smooth transport along any path $t \mapsto V_t$ when $V$ is sufficiently close to $(1/2) \sum_j \operatorname{tr}(x_j^2)$, as well as smooth triangular transport. The two main ingredients are (1) the construction of $\Psi_V$ through the heat semigroup and (2) the theory of free Gibbs laws, that is, non-commutative laws maximizing the free entropy minus the expectation with respect to $V$. We show for instance that every $V$ with bounded first and second derivative has a free Gibbs law, and for generic $V$, the free Gibbs law is unique. We conclude with a mostly heuristic discussion of the smooth structure on $\mathscr{W}(\mathbb{R}^{*d})$ and hence of the free heat equation, optimal transport equations, incompressible Euler equation, and inviscid Burgers' equation.

Contents

$C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$
3.3 Continuity and differentiability properties
3.4 Composition
3.5 An inverse function theorem
Non-commutative smooth functions: connections
$*$-algebra $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, its trace, and its log-determinant
4.5 Large $N$ limits of differential operators on $M_N(\mathbb{C})^d_{\mathrm{sa}}$
$X(X, X', t)$
6.2 The semigroup $e^{tL_{x,J}}$
6.3 Kernel projection and pseudo-inverse of the Laplacian
6.4 Differential equation and continuity properties
$E_{x,V}$ and conditional expectations
8.4 Triangular transport

Voiculescu's free probability treats tracial von Neumann algebras as a non-commutative analog of probability spaces, and studies an analog of probabilistic independence, called free independence, which relates to free products of these von Neumann algebras. Free probability also describes the large $N$ behavior of certain probability distributions on $N \times N$ matrices, and more generally $d$-tuples of $N \times N$ matrices. Free probability uses both complex-analytic and combinatorial tools, and relates to the large $N$ representation theory of unitary, orthogonal, and symmetric groups. For background, see e.g. [78, 89, 5].

Voiculescu's theory of free entropy [79, 80, 81, 83] is the beginning of free information theory.
As in classical information theory, there are versions of entropy and Fisher's information, which satisfy inequalities similar to the classical entropy and Fisher information. Voiculescu actually initiated two approaches to free entropy theory. The first approach uses matricial microstates, or $d$-tuples of matrices that approximate the behavior of the $d$-tuple of operators we want to study; the microstates free entropy describes the lim sup exponential growth rate of the volume of the microstate spaces [80]. Thus, free entropy is the rate function for a (still partially conjectural) large deviation principle in random matrix theory; see [7]. The second "infinitesimal approach" defines free entropy via the free Fisher information and perturbation by freely independent semicircular families (the free version of Gaussian random variables) [81].

Our main motivation is to find a free version of the Wasserstein manifold. The classical Wasserstein manifold $\mathscr{P}(\mathbb{R}^d)$ is a formal infinite-dimensional Riemannian manifold whose points are smooth probability densities $\rho$, which has many natural properties [49, 52, 75]. By taking the infimum of the lengths of smooth curves in the manifold, the Riemannian metric gives rise to the ($L^2$) Wasserstein distance, which describes the $L^2$ distance between an optimal transport map $f$ from $\mu$ to $\nu$ and the identity function [75]. The gradient structure of $\mathscr{P}(\mathbb{R}^d)$ describes the differentiation with respect to $\rho$ of certain functionals on the space of probability measures [61], and the evolution of a measure under Brownian diffusion turns out to be the gradient flow of the entropy functional [44, 62]. Furthermore, the tangent manifold of $\mathscr{P}(\mathbb{R}^d)$ has a symplectic structure [49], which connects to the geodesic equations on this space. With suitable modifications, one can connect these results to the construction of hydrodynamic equations, including the compressible Euler equation, Schrödinger equation, Schrödinger bridge problem, and mean field games [22, 52].
The field of transport information geometry is active, and the study of the Hessian operators on the Wasserstein manifold is useful in studying fluid dynamics and formulating functional inequalities [53, 54, 75].

Although a Wasserstein manifold has never been systematically described for multivariable free probability, some of the key ideas of information geometry have been present as motivation throughout the development of free information theory. This includes the relationship between entropy and Fisher information [79, 81], Talagrand inequalities [11, 39, 37], and the relationship between entropy and transport of measure [80, § § […] $N$ limit of transport of measure on the space of $N \times N$ matrices [41, 42]. Non-commutative ideas have been generalized beyond the setting of tracial von Neumann algebras [70, 57, 58].

For a single variable, free entropy has been studied as a functional on the Wasserstein manifold of $\mathbb{R}$, and the relationship between optimal transport for probability measures on $\mathbb{R}$ and optimal transport for random matrix models is better understood [10, 39, 55]. The setting of several non-commuting variables is significantly more challenging, as is apparent for instance from the open problems about free entropy (see [85]).

We also point out that Wasserstein geometry is currently being developed in the field of quantum information theory. Quantum information theory is another non-commutative analog of classical information theory, though it is distinct from free probability theory; for instance, in quantum information theory, a probability density is replaced with a positive operator of trace 1, but in free probability the non-commutative laws do not have a direct analog of density (only of log-density, as we will see). Related to quantum information are the analogs of the Wasserstein manifold which deal with matrix-valued densities on $\mathbb{R}^d$ or other classical manifolds [60, 21, 19].
For comparison, free probability describes the large $N$ behavior of ordinary probability distributions on the space of $N \times N$ matrices.

We define the free Wasserstein manifold as a space of certain "smooth (minus) log-densities," which are smooth scalar-valued functions of several non-commuting self-adjoint operators (see § […] $V$ in terms of perturbations of $V$, and we describe the relationship between tangent vectors and infinitesimal transport maps through a Laplacian operator $L_V$ associated to $V$ and its pseudo-inverse. Following the same strategy as [26] (but in a different technical framework), we give a rigorous treatment in the case of log-densities $V$ that are sufficiently close to the quadratic $V_0(x_1, \dots, x_d) = (1/2) \sum_{j=1}^d \operatorname{tr}(x_j^2)$, which leads to a free transport result similar to [35, 26] as well as a new $\mathrm{C}^*$ version of the triangular transport results of [41, 42]. We conclude by stating versions of the heat equation, Wasserstein geodesic equation, incompressible Euler equation, and inviscid Burgers' equation in our tracial non-commutative framework.

The results in this paper, even though they are technically new, have a large overlap with previous work such as [35, 26, 41], and this is because our goals are largely expository.
The free Wasserstein manifold has been treated in prior work only as motivation or as interpretation a posteriori of analytically rigorous results. We want to bring it to center stage as a unifying framework that simultaneously provides a heuristic and a proof strategy for rigorous results, playing a similar role to that of the classical Wasserstein manifold in [62]. With the benefit of hindsight, we strive to organize and present the proofs in the most natural way possible.

The end goals of defining the Wasserstein manifold and constructing transport for potentials close to $(1/2) \sum_j \operatorname{tr}(x_j^2)$ seem modest compared to the wealth of knowledge that exists about the classical Wasserstein manifold. However, as in [35, 26, 41, 42], even results that are basic in the classical setting require a lot of technical preparation in the free setting. When developing the classical Wasserstein manifold, people already had a clear understanding of smooth functions, measure and probability theory, and partial differential equations. By contrast, there is not even a well-established definition of smooth functions for several non-commuting real variables. Thus, in § §4, we define new spaces of tracial non-commutative smooth functions of several self-adjoint operators in a tracial von Neumann algebra. Like [26], the functions are based on trace polynomials, but the approach to defining the norms is completely different.

Another technical difficulty is that there is no direct analog of density in the free setting. We only know how to pass from a log-density $V$ to a non-commutative law $\mu_V$ through free entropy/random matrix theory or through the heat semigroup associated to $V$ (and the related stochastic differential equations), and in fact we will combine both of these approaches in this paper (see § §). In §7, we define free Gibbs laws for $V$ as the maximizers of free entropy minus the expectation of $V$, giving for the first time a proof of their existence and properties directly from the definition of free entropy, as motivated by [85, §].

In §9, we formulate several differential equations of interest for free transport information geometry and operator algebras, including the geodesic equation and gradient flow on the Wasserstein manifold and the compressible Euler equation. Our framework allows for a closer resemblance of these equations with their classical analogs than previously understood, because it includes a natural description of scalar-valued smooth functions of several operators. Of course, the rigorous study of these equations will be another undertaking, and we do not expect all the results from the classical setting to carry over in the same level of generality. Nonetheless, it is a crucial first step to clarify the connection between the classical and free versions of an equation and what it would mean for a smooth function to solve the equation.

In the remainder of the introduction, § § § § […]

We will set up the free Wasserstein manifold as follows:

• We define a space $\operatorname{tr}(C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d}))$ of scalar-valued smooth functions of several self-adjoint operators in a tracial von Neumann algebra. Another space $C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ provides the analog of smooth functions $\mathbb{R}^d \to \mathbb{R}^d$ (a.k.a. vector fields on $\mathbb{R}^d$).

• The free Wasserstein manifold $\mathscr{W}(\mathbb{R}^{*d})$ is defined as the space of $V \in \operatorname{tr}(C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d}))$ such that $V$ is bounded above and below by a quadratic function, that is, $a + bV_0 \le V \le a' + b'V_0$ for some constants with $b, b' > 0$, where $V_0(x) = (1/2) \sum_{j=1}^d \operatorname{tr}(x_j^2)$.

• The tangent space to $\mathscr{W}(\mathbb{R}^{*d})$ consists of $\operatorname{tr}(C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d}))$ functions with some bounds on the first and second derivatives.

• For $V \in \operatorname{tr}(C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d}))$, we define the associated free Gibbs laws as non-commutative laws that maximize a certain entropy functional. A free Gibbs law $\nu$ must satisfy the integration-by-parts relation $\nu(\nabla_V^* h) = 0$ for any vector field $h$, where $\nabla_V^*$ is the free analog of the divergence operator associated to $V$. If there is a unique law satisfying this equation, we denote it by $\mu_V$.

• The Riemannian metric at $V$ for two tangent vectors $W_1$ and $W_2$ is given by $\nu_V(\langle \nabla L_V^{-1} W_1, \nabla L_V^{-1} W_2 \rangle)$, where $L_V = -\nabla_V^* \nabla$ is a Laplacian operator associated to $V$, whenever the above expression makes sense.

• We show rigorously that the definition makes sense for $V$ sufficiently close to the quadratic $V_0$.

We have the following definitions and results relating to non-commutative transport of measure:

• We define an analog of diffeomorphisms of $\mathbb{R}^d$, as well as a construction of certain diffeomorphisms as flows along vector fields. A Lie bracket on vector fields is defined analogously to the classical case.

• For a diffeomorphism $f$ and a potential $V$, there is a push-forward defined by $f_* V = V \circ f^{-1} - \log \Delta(\partial f^{-1})$, where $\log \Delta$ is an analog of the log-determinant. The push-forward defines an action of the diffeomorphism group on the Wasserstein manifold.

• With certain assumptions on $V$, if there is a unique free Gibbs law $\mu_V$, then $f_* \mu_V$ is the unique free Gibbs law for $f_* V$ (see Proposition 7.14).

• Given a one-parameter group of diffeomorphisms $f_t$ generated by a vector field $h$, the tangent vector $(d/dt)|_{t=0} (f_t)_* V$ is given by $\nabla_V^* h$.

• Conversely, for a tangent vector $W$, a possible vector field $h$ for producing transport is given by $\nabla (-L_V)^{-1} W$, provided that the latter makes sense.
• When $V$ is sufficiently close to the quadratic, we can make this relationship between tangent vectors and infinitesimal transport rigorous. Thus, for any continuously differentiable path $t \mapsto V_t$ of potentials close to the quadratic, we can naturally produce a family of transport maps $f_t$ with $(f_t)_* V_0 = V_t$ (see Theorem 8.3).

• We can also arrange that the transport maps $f_t$ are lower-triangular functions in the sense that for $j = 1, \dots, d$, the $j$th coordinate of $f_t(x_1, \dots, x_d)$ depends only on $x_1, \dots, x_j$ (see Theorem 8.22).

The last result on triangular transport is a partial analog of classical triangular transport of measure studied in [13]. It has the following consequence for operator algebras, which is in some sense an improvement of the triangular transport results from [41, 42]: it asserts an isomorphism of $\mathrm{C}^*$-algebras, not only of $\mathrm{W}^*$-algebras, but it also has stronger smoothness hypotheses on $V$. Of course, this theorem also overlaps with the previous results on free transport from [35, 26], which were not triangular. For the precise statement, see Corollary 8.24.

Theorem. Let $V \in \operatorname{tr}(C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d}))_{\mathrm{sa}}$ be sufficiently close to $V_0(x) = (1/2) \sum_j \operatorname{tr}(x_j^2)$ (more precisely, assume that the first and second derivatives are sufficiently close and the third derivative is uniformly bounded). Let $\mu_V$ be the associated free Gibbs law, and let $(\mathcal{A}, \tau)$ be the tracial $\mathrm{W}^*$-algebra associated to $\mu_V$, with the canonical generators $X = (X_1, \dots, X_d)$. Let $(\mathcal{B}, \sigma)$ be the tracial $\mathrm{W}^*$-algebra generated by a standard free semicircular family $S = (S_1, \dots, S_d)$. Then there exists an isomorphism of tracial von Neumann algebras $\phi : (\mathcal{A}, \tau) \to (\mathcal{B}, \sigma)$ such that for each $j = 1, \dots, d$, we have $\phi(\mathrm{C}^*(X_1, \dots, X_j)) = \mathrm{C}^*(S_1, \dots, S_j)$.

In the final section, we present several differential equations related to the free Wasserstein manifold for future study, including the following:

• We differentiate the functional $V \mapsto \mu_V(f)$ for $V \in \mathscr{W}(\mathbb{R}^{*d})$.

• We explain how the non-commutative heat equation $\dot{V}_t = L_{V_t} V_t$ represents the gradient flow of free entropy, similar to the classical case [61].

• We state the free version of the geodesic equations on $\mathscr{W}(\mathbb{R}^{*d})$, which are $\dot{V}_t = L_{V_t} \phi_t$ and $\dot{\phi}_t = -(1/2) \langle \nabla \phi_t, \nabla \phi_t \rangle_{\operatorname{tr}}$. We show that smooth solutions satisfy $V_t = (\mathrm{id} + t \nabla \dot{\phi})_* V$. We also show that the path $t \mapsto \mu_{V_t}$ is a minimal curve in the $L^2$-coupling distance on the space of non-commutative laws.

• We state a non-commutative incompressible Euler equation with respect to a potential $V$ in a similar spirit to [88]. Similar to the classical case [6], this represents the geodesic equation on the group of non-commutative diffeomorphisms that preserve $V$. Similarly, the geodesic equation on the entire non-commutative diffeomorphism group is the non-commutative inviscid Burgers' equation.

Our formulation of the free Wasserstein manifold is closely linked with random matrix theory and free Gibbs laws.
An important branch of random matrix theory studies probability measures $\mu^{(N)}$ on $M_N(\mathbb{C})^d_{\mathrm{sa}}$ (the space of $d$-tuples of self-adjoint $N \times N$ matrices) of the form
\[ d\mu^{(N)}_f(X) = \text{constant} \cdot e^{-N^2 \operatorname{tr}_N(f(X))} \, dX. \]
Here $X = (X_1, \dots, X_d) \in M_N(\mathbb{C})^d_{\mathrm{sa}}$; $\operatorname{tr}_N$ denotes the normalized trace $(1/N) \operatorname{Tr}$ on $M_N(\mathbb{C})$, and $dX$ is Lebesgue measure on $M_N(\mathbb{C})^d_{\mathrm{sa}}$, which we view as a real inner product space of dimension $dN^2$ with the inner product $\langle X, Y \rangle = \sum_{j=1}^d \operatorname{tr}_N(X_j Y_j)$; and $f$ is a non-commutative polynomial in $d$ variables such that $\operatorname{tr}_N(f(X))$ is real for $X \in M_N(\mathbb{C})^d_{\mathrm{sa}}$. More generally, we can consider
\[ d\mu^{(N)}_V(X) = \text{constant} \cdot e^{-N^2 V(X)} \, dX, \]
where $V$ is a trace polynomial, that is, a formal linear combination of terms of the form $\operatorname{tr}(f_1) \cdots \operatorname{tr}(f_k)$ for some $k \in \mathbb{N}$ and non-commutative polynomials $f_1, \dots, f_k$. Such models were first studied for a single matrix in [16] and then for multiple matrices in [26]. Here $V$ is evaluated on some $X \in M_N(\mathbb{C})^d_{\mathrm{sa}}$ by replacing each term $\operatorname{tr}(f_j)$ by $\operatorname{tr}_N(f_j(X))$. This more general class of trace polynomials is quite natural because, as shown by Procesi [65], every polynomial function $M_N(\mathbb{C})^d_{\mathrm{sa}} \to \mathbb{R}$ (that is, polynomial with respect to the real and imaginary parts of the matrix entries) that is invariant under conjugation by unitary matrices must be given by a trace polynomial. For prior work relating trace polynomials with random matrix theory, see [66, 69, 20, 28, 47, 48, 26].

The measure $\mu^{(N)}_V$ is an element of the classical Wasserstein manifold $\mathscr{P}(M_N(\mathbb{C})^d_{\mathrm{sa}})$ since it has a smooth density. The density itself does not have a large $N$ limit since there is an $N^2$ in the exponent. However, $-1/N^2$ times the log of the density is precisely $V$, which is dimension-independent by assumption. This leads us to the following heuristic for studying the free Wasserstein manifold: Reparametrize $\mathscr{P}(M_N(\mathbb{C})^d_{\mathrm{sa}})$ in terms of $V = -(1/N^2) \log \rho$ instead of in terms of the density $\rho$.
Compute the Riemannian metric (and whatever other objects of differential equations we wish to study) in terms of $V$ rather than $\rho$. Then study the behavior of this object as $N \to \infty$. The reparametrization in terms of the log-density for the classical Wasserstein manifold $\mathscr{P}(\mathbb{R}^d)$ is explained in §.

[…] $V$, consider two different trace polynomials $W_1$ and $W_2$. Then the curves $t \mapsto V + tW_j$ represent tangent vectors in $\mathscr{P}(M_N(\mathbb{C})^d_{\mathrm{sa}})$. Since $V + tW_j$ is considered up to an additive constant, assume that $\int W_j \, d\mu^{(N)}_V = 0$. It follows from the computations in § […] $\mathscr{P}(M_N(\mathbb{C})^d_{\mathrm{sa}})$ is given by
\[ \int \langle \nabla (L^{(N)}_V)^{-1} W_1, \nabla (L^{(N)}_V)^{-1} W_2 \rangle \, d\mu^{(N)}_V, \tag{1.1} \]
where
\[ L^{(N)}_V f = \frac{1}{N^2} \Delta f - \langle \nabla V, \nabla f \rangle. \]
If $f$ is a trace polynomial, then $\nabla f$ is dimension-independent and $(1/N^2) \Delta f$ on $M_N(\mathbb{C})^d_{\mathrm{sa}}$ is given by a trace polynomial which converges coefficient-wise as $N \to \infty$ to some trace polynomial $Lf$; see [20, § § § […] $L^{(N)}_V$ above is dimension-independent for our random matrix setting. The Riemannian metric for the free Wasserstein manifold should heuristically be the large $N$ limit of (1.1).

Several ingredients are desirable to make this heuristic precise:

(1) We want to understand the large $N$ behavior of $\mu^{(N)}_V$.

(2) We want a notion of "trace $C^\infty$ functions" that generalizes trace polynomials, such that $L_V$ is well-defined on any trace $C^\infty$ function. Of course, we will replace the trace polynomials in the definition with these smooth functions.

(3) We want to study the pseudo-inverse of $L_V$ on the space of trace smooth functions (and we hope that the kernel and cokernel are 1-dimensional).

Let us discuss each of these questions in more detail.

(1) In the case where $V$ is a perturbation of the quadratic, prior work has shown that $\int f \, d\mu^{(N)}_V$ converges almost surely to some deterministic limit [33, 34, 40]. This limit is described in terms of a tuple $X$ of self-adjoint operators from a von Neumann algebra $\mathcal{A}$ equipped with a (faithful, normal) tracial linear functional $\tau : \mathcal{A} \to \mathbb{C}$.
We have $\int f \, d\mu^{(N)}_V \to f(X)$ for all scalar-valued trace polynomials $f$, where the evaluation of $f$ on $X$ is given in the same way as the evaluation on a tuple of matrices, with $\tau$ instead of $\operatorname{tr}_N$. In fact, the evaluation $f(X)$ is completely determined by $\tau(p(X))$ when $p$ is a non-commutative polynomial. Thus, the (bulk) large $N$ behavior of $\mu^{(N)}_V$ is described by the non-commutative law of $X$, that is, the linear functional $\mathbb{C}\langle x_1, \dots, x_d \rangle \to \mathbb{C}$ given by $p \mapsto \tau(p(X))$.

For more general $V$, a sufficient condition for such convergence to happen would be if there were a unique non-commutative law $\nu$ that maximized $\chi(\nu) - \nu(V)$, where $\chi$ is Voiculescu's microstates free entropy. We discuss this approach in §.

[…] $M_N(\mathbb{C})^d_{\mathrm{sa}} \to \mathbb{C}$. Indeed, the gradient of such a function will be a map $M_N(\mathbb{C})^d_{\mathrm{sa}} \to M_N(\mathbb{C})^d$, which is a $d$-tuple of operator-valued trace polynomials $M_N(\mathbb{C})^d_{\mathrm{sa}} \to M_N(\mathbb{C})$. The operator-valued trace polynomials are linear combinations of terms such as $f_0 \operatorname{tr}(f_1) \cdots \operatorname{tr}(f_k)$ where $f_0, \dots, f_k$ are non-commutative polynomials. Of course, since $f_0$ can be 1, any scalar-valued trace polynomial can be viewed as an operator-valued trace polynomial, and thus we can pass to the more general consideration of operator-valued trace polynomials. If $f$ is an operator-valued trace polynomial, and if $X, Y_1, \dots, Y_k$ are in $M_N(\mathbb{C})^d_{\mathrm{sa}}$, then the iterated directional derivative
\[ \frac{d}{dt_1}\bigg|_{t_1 = 0} \cdots \frac{d}{dt_k}\bigg|_{t_k = 0} f(X + t_1 Y_1 + \cdots + t_k Y_k) \]
defines an operator-valued trace polynomial in $X, Y_1, \dots, Y_k$ that is multilinear in $Y_1, \dots, Y_k$.

We define $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^k)$ as the completion of the space of operator-valued trace polynomials in $X, Y_1, \dots, Y_k$ that are multilinear in $Y_1, \dots, Y_k$, with respect to a certain family of seminorms $\|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^k), R}$ for $R > 0$. Here for each radius $R$, the seminorm $\|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^k), R}$ is defined as follows: Fix a tracial von Neumann algebra $(\mathcal{A}, \tau)$ and $\alpha, \alpha_1, \dots, \alpha_k \in [1, \infty]$ with $1/\alpha = 1/\alpha_1 + \cdots + 1/\alpha_k$. Take the supremum of $\|f(X)[Y_1, \dots, Y_k]\|_{L^\alpha(\mathcal{A}, \tau)}$ over $X$ in an operator norm ball of radius $R$ and $Y_j$ in the unit ball of $L^{\alpha_j}(\mathcal{A}, \tau)$. Then take the supremum over $(\mathcal{A}, \tau)$ and $\alpha, \alpha_1, \dots, \alpha_k$.

Then $C^k_{\mathrm{tr}}(\mathbb{R}^{*d})$ is defined as the space of functions whose derivatives of order $k' \le k$ are in $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^{k'})$. On $C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d})$, differentiation and composition are well-defined, and there is a Laplacian operator $L_V$ that describes the large $N$ behavior of $L^{(N)}_V$.

Remark. Our space $C^k_{\mathrm{tr}}(\mathbb{R}^{*d})$ is closely related to the definition in [26] of trace $C^k$ functions on the operator norm ball of radius $R$. However, the definition in [26] was more complicated because it involved separating out different types of terms in the derivative and using Haagerup tensor norms. The norms used in this paper have some of the same desirable properties, such as good behavior under conditional expectations and the ability to control the Lipschitz norms of a function with respect to $\|\cdot\|_2$. The definition in [26] also had some unavoidable complexity due to working in the setting of operator-valued free probability, which replaced the scalars $\mathbb{C}$ with some von Neumann algebra $\mathcal{B}$.

(3) We study the pseudo-inverse of $L_V$ rigorously in the case where $V$ is sufficiently close to a quadratic. The strategy is the same as previous works such as [10, 34, 35, 26]. In fact, the results about the expectation with respect to $\nu_V$ discussed above in (1) and the results about the pseudo-inverse $\Psi_V$ both follow from the study of the heat semigroup $e^{tL_V}$. Indeed, we hope to obtain the expectation map $E_V : C_{\mathrm{tr}}(\mathbb{R}^{*d}) \to \mathbb{C}$ associated to $\nu_V$ as
\[ E_V f = \lim_{t \to \infty} e^{tL_V} f \]
and the pseudo-inverse of $L_V$ as
\[ \Psi_V f = \int_0^\infty (e^{tL_V} - E_V) f \, dt. \]
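As a formal sanity check of the two formulas for $E_V$ and $\Psi_V$, one can verify that $\Psi_V$ inverts $-L_V$ modulo the expectation. The sketch below reads the integral as taken over $t \in [0, \infty)$ and assumes, without justification, that $e^{tL_V} f$ converges to $E_V f$ fast enough for the integral to converge, that $L_V$ may be moved inside the integral, and that $L_V$ annihilates constants; it is an illustrative computation, not the paper's proof.

```latex
\begin{align*}
L_V \Psi_V f
  &= \int_0^\infty L_V \bigl( e^{tL_V} f - E_V f \bigr) \, dt
     && \text{(assumed: $L_V$ passes under the integral)} \\
  &= \int_0^\infty \frac{d}{dt} \bigl( e^{tL_V} f \bigr) \, dt
     && \text{(semigroup property; $L_V (E_V f) = 0$ because $E_V f$ is a constant)} \\
  &= \lim_{t \to \infty} e^{tL_V} f - f
     && \text{(fundamental theorem of calculus, $e^{0 \cdot L_V} f = f$)} \\
  &= E_V f - f .
\end{align*}
```

Under these assumptions, $-L_V \Psi_V f = f - E_V f$, so $-\Psi_V$ acts as an inverse of $L_V$ on functions with zero expectation.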
The most explicit known method of constructing the heat semigroup in the free setting is using free stochastic differential equations, as in the papers cited above. Let $(\mathcal{A}, \tau)$ be a tracial $\mathrm{W}^*$-algebra and $X \in \mathcal{A}^d_{\mathrm{sa}}$. Let $X(X, t)$ be a stochastic process solving the equation
\[ dX(X, t) = dS(t) - \nabla_x V(X(X, t)) \, dt, \qquad X(X, 0) = X, \]
where $(S(t))_{t \in [0, \infty)}$ is a free Brownian motion in $d$ variables, freely independent of $X$. Then we define $(e^{tL_V} f)(X) = E_{\mathcal{A}} f(X(X, t))$ for $X \in \mathcal{A}^d_{\mathrm{sa}}$.

We prove in § […] $V$, the resulting stochastic process and the heat semigroup are smooth functions of $X$ and depend continuously on $V$. This argument is closely parallel to [26, § […] $d'$-tuple of variables $X'$. This is what enables us to prove the triangular transport theorem in § […] $L_V$ will be invertible for arbitrary $V \in \mathscr{W}(\mathbb{R}^{*d})$. As we discuss in § […] $d = 1$ case shows that in general the Laplacian might have a kernel of dimension larger than 1 when acting on the $L^2$ space associated to the free Gibbs law.

We conclude the discussion by pointing out an (at first) counterintuitive feature of our definition of $\mathscr{W}(\mathbb{R}^{*d})$: There could in principle be many different functions $V$ satisfying Assumptions 5.14 and 5.16 which produce the same non-commutative law $\mu_V$. This is unavoidable because if $\mu_V$ is realized by a $d$-tuple of bounded operators with norm $< R$, then we could perturb $V$ outside the ball of radius $R$ and end up with the same law $\mu_V$.

Besides perturbing $V$ outside the "support" of $\mu_V$, there is another way in which such degeneracy can arise, which is easier to describe from the point of view of the tangent space. The Riemannian metric $\langle \cdot, \cdot \rangle_V$ could have a very large kernel in $T_V \mathscr{W}(\mathbb{R}^{*d})$. Indeed, suppose $(\mathcal{A}, \tau)$ is the tracial von Neumann algebra associated to the GNS representation of $\mu_V$ and $X$ is the canonical generating tuple (see Proposition 2.18). Then for tangent vectors $\dot{V}$ and $\dot{W}$, we have
\[ \langle \dot{V}, \dot{W} \rangle_V = \langle (\nabla \Psi_V \dot{V})^{\mathcal{A}, \tau}(X), (\nabla \Psi_V \dot{W})^{\mathcal{A}, \tau}(X) \rangle_\tau. \]
Thus, $\dot{V}$ will be in the kernel of $\langle \cdot, \cdot \rangle_V$ if and only if $\nabla \Psi_V \dot{V}$ evaluates to zero on $X$.
There are many functions in $C_{\mathrm{tr}}(\mathbb{R}^{*d})$ which evaluate to zero on $X$; for instance, for any trace polynomial $f$, there will be a non-commutative polynomial $g$ with $f^{\mathcal{A}, \tau}(X) = g^{\mathcal{A}, \tau}(X)$.

The fact that $\mu_V$ does not uniquely determine $V$ might seem like a defect in the definition. In the classical case, the space of probability measures on $\mathbb{R}^d$ is the completion of smooth positive densities with respect to a certain topology. But to obtain some space of non-commutative laws from the free Wasserstein manifold defined here, one has to first quotient out by the equivalence relation that $V \sim W$ if $\mu_V = \mu_W$, that is, we must use a separation-completion rather than a completion.

A heuristic explanation for this degeneration is that the random matrix models often have exponential concentration of measure as $N \to \infty$ (see e.g. [36]). Although the measures $\mu^{(N)}_V$ are supported on all of $M_N(\mathbb{C})^d_{\mathrm{sa}}$, their mass concentrates on much smaller sets, namely the matricial microstate spaces of Voiculescu. Due to the concentration of measure, one must be very careful about the normalization of various quantities associated to $V$ and $\mu^{(N)}_V$. For instance, we earlier gave the formula $\int \langle \nabla (L^{(N)}_V)^{-1} W_1, \nabla (L^{(N)}_V)^{-1} W_2 \rangle \, d\mu^{(N)}_V$ for the Riemannian metric, which turns out to be dimension-independent, but the metric could also be written as
\[ N^2 \int (-L^{(N)}_V)^{-1} W_1 \cdot W_2 \, d\mu^{(N)}_V. \]
[…] $\int (-L^{(N)}_V)^{-1} W_1 \cdot W_2 \, d\mu^{(N)}_V$ goes to zero as $N \to \infty$ (because both $(-L^{(N)}_V)^{-1} W_1$ and $W_2$ are close to their mean, which is zero, with high probability). Thus, the Riemannian metric cannot be defined by this formula in the large-$N$ limit.

The choice to work with globally defined functions in $C_{\mathrm{tr}}(\mathbb{R}^{*d})^d$ rather than only their projections in $L^2(\mu_V)^d$ enables us to more easily apply the ideas of classical analysis.
This is conceptually similar to how one might study functions on some small and complicated compact subset $K$ of $\mathbb{R}^d$ by first analyzing those which extend to smooth functions in a neighborhood of $K$. Prior work on free transport such as [35] and [26] has also used functions that are globally defined (at least on some operator-norm ball) rather than only on the specific $d$-tuple of operators realizing the law $\mu_V$. Since degeneration is unavoidable in any case, we might as well frame the Wasserstein manifold in terms of the globally defined functions that are more analytically tractable rather than attempting to sort out the difficult technical question of exactly how much degeneration occurs.

Besides, as seen in [40, 41, 42] as well as § […] $V$ sufficiently close to $(1/2) \sum_j \operatorname{tr}(x_j^2)$, various functions $f^{(N)}$ on $M_N(\mathbb{C})^d_{\mathrm{sa}}$ associated to $\mu^{(N)}_V$ will as $N \to \infty$ be asymptotically close to corresponding non-commutative functions $f$ in $C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ everywhere (uniformly on each operator-norm ball) rather than only on the microstate spaces associated to $\mu_V$. These results are better than we might expect; due to concentration of measure, there is no way to deduce them simply from studying the "bulk behavior", or knowing the $L^2(\mu^{(N)}_V)$-norms of non-commutative functions on $M_N(\mathbb{C})^d_{\mathrm{sa}}$ as $N \to \infty$. Another way to describe this phenomenon is that the $C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ functions carry more information about the large $N$ behavior of the random matrix models than could be detected from the non-commutative law $\mu_V$ alone. However, it is unclear to what extent this generalizes when $V$ is not close to $(1/2) \sum_j \operatorname{tr}(x_j^2)$ or not uniformly convex.

Another difficulty in framing the free Wasserstein manifold is that the non-commutative laws $\mu_V$ associated to our smooth potentials $V \in \mathscr{W}(\mathbb{R}^{*d})$ might not be dense in the space of all non-commutative laws.
Certainly, we can only approximate non-commutative laws that can be approximated by the non-commutative laws of matrix tuples (or laws whose associated von Neumann algebras are Connes-embeddable); and we now know that not all tracial $W^*$-algebras are Connes-embeddable due to the recent work on related problems in quantum information theory [43]. But even after we restrict our attention to Connes-embeddable von Neumann algebras, it is unlikely that an arbitrary potential $V \in \mathscr{W}(\mathbb{R}^{*d})$ can be approximated by other potentials $W$ such that $L_W$ has a one-dimensional kernel, in light of the counterexamples in the single-matrix setting (see 5.4).

In §2, we explain background material and terminology, including facts about $C^*$-algebras and von Neumann algebras that will be used throughout the paper.

In §3, we define spaces of tracial non-commutative $C^k$ functions, and describe their basic properties, such as the chain rule for composition. In §4, we relate non-commutative functions with smooth functional calculus for self-adjoint operators, and we describe differential operators on non-commutative smooth functions that mimic the gradient and Laplacian of trace polynomial functions on $M_N(\mathbb{C})^d_{sa}$.

In §5, we define the free Wasserstein manifold $\mathscr{W}(\mathbb{R}^{*d})$, the diffeomorphism group $\mathscr{D}(\mathbb{R}^{*d})$, and the action $\mathscr{D}(\mathbb{R}^{*d}) \curvearrowright \mathscr{W}(\mathbb{R}^{*d})$ by transport.

In §6, we analyze the heat semigroup, expectation, and pseudo-inverse associated to the Laplacian $L_V$ when $V$ is sufficiently close to the quadratic potential $\frac{1}{2} \sum_j \operatorname{tr}(x_j^2)$. In particular, we construct an operator $\Psi_V$ such that $-\Psi_V L_V f = f - E_V(f)$, where $E_V$ is the expectation functional (which will turn out to agree with $\mu_V$).

In §7, we discuss a version of Voiculescu's free entropy $\chi$ defined on (a slight generalization of) non-commutative laws. We show that for certain $V$ (with quadratic growth at $\infty$ but not necessarily convex), there always exist non-commutative laws $\nu$ maximizing $\chi(\nu) - \nu(V)$, called free Gibbs laws. Any free Gibbs law must satisfy the equation $\nu(\nabla_V^* h) = 0$ (Proposition 7.15). Finally, when $\partial V$ and $\partial^2 V$ are bounded, this equation implies that $\nu$ can be realized by a $d$-tuple of bounded operators (Theorem 7.18).

In §8, we establish the existence of smooth transport for $V$ sufficiently close to $\frac{1}{2} \sum_j \operatorname{tr}(x_j^2)$. More precisely, for any continuously differentiable path $t \mapsto V_t$ with $V_t$ sufficiently close to $\frac{1}{2} \sum_j \operatorname{tr}(x_j^2)$, there is a path $t \mapsto f_t$ of diffeomorphisms with $(f_t)_* V_0 = V_t$, and our choice of $t \mapsto f_t$ is "infinitesimally optimal" (Theorem 8.3). In the remainder of §8, we adapt the technique to prove triangular transport (Theorem 8.22) by studying conditional expectations and transport. An important tool for the enterprise, which is interesting in its own right, is a precise connection between non-commutative functions and functions on $N \times N$ matrices in the large-$N$ limit. In particular, similar to [41, 42], we show that a certain conditional expectation operator describes the large-$N$ limit of conditional expectations for the matrix models.

Finally, we conclude with a mostly heuristic discussion of the smooth structure on $\mathscr{W}(\mathbb{R}^{*d})$ and hence of the free heat equation, optimal transport equations, incompressible Euler equation, and inviscid Burgers' equation.

We thank Alice Guionnet, Yoann Dabrowski, and Wilfrid Gangbo for various useful discussions. In particular, we have used many ideas of the joint work of Dabrowski, Guionnet, and Shlyakhtenko [26]. Moreover, Jekel would like to thank Guionnet and Dabrowski for enlightening discussions about free Gibbs laws and non-commutative smooth functions at the École Normale Supérieure Lyon in March 2020, as well as the Mathematisches Forschungsinstitut Oberwolfach for travel support for that visit. Jekel was also supported by the NSF postdoctoral grant DMS-2002826. Wuchen Li was supported by start-up funding from the University of South Carolina.

We recall some standard definitions and results about $C^*$-algebras and von Neumann algebras, non-commutative laws, and free independence. For background material on $C^*$-algebras and von Neumann algebras, see e.g. [45, 46].

Definition 2.1 ($*$-algebra). A unital $*$-algebra (over $\mathbb{C}$) is a unital algebra $\mathcal{A}$ over $\mathbb{C}$ equipped with a skew-linear involution $* : \mathcal{A} \to \mathcal{A}$ satisfying $(ab)^* = b^* a^*$. We call $a^*$ the adjoint of $a$, and we say $a$ is self-adjoint if $a^* = a$. We denote by $\mathcal{A}_{sa}$ the set of self-adjoint elements (which is a vector space over $\mathbb{R}$).

Definition 2.2 ($C^*$-algebra). Let $B(H)$ denote the $*$-algebra of bounded operators on a Hilbert space $H$ (where the $*$-operation is the adjoint in the usual sense). A (unital) $C^*$-algebra is a unital $*$-subalgebra of $B(H)$ that is closed with respect to the operator norm.

Definition 2.3 ($W^*$-algebra). The $\sigma$-weak operator topology ($\sigma$-WOT) on $B(H)$ is the topology generated by all maps $B(H) \to \mathbb{C}$ of the form
\[ T \mapsto \sum_{j=1}^\infty \langle \xi_j, T \xi_j \rangle, \]
where $(\xi_j)_{j \in \mathbb{N}}$ is a sequence of vectors with $\sum_j \|\xi_j\|^2 < \infty$. (Equivalently, the $\sigma$-WOT is the weak-$*$ topology on $B(H)$ obtained from viewing it as the dual of the space of trace-class operators.) A von Neumann algebra or $W^*$-algebra is a unital $*$-subalgebra of $B(H)$ that is closed in the $\sigma$-WOT.

Definition 2.4 (States and traces). If $\mathcal{A}$ is a unital $*$-algebra, then a linear functional $\phi : \mathcal{A} \to \mathbb{C}$ is said to be positive if $\phi(a^* a) \geq 0$ for all $a \in \mathcal{A}$, unital if $\phi(1) = 1$, tracial if $\phi(ab) = \phi(ba)$ for $a, b \in \mathcal{A}$, and faithful if $\phi(a^* a) = 0$ implies $a = 0$. If $\mathcal{A}$ is a $W^*$-algebra, then $\phi$ is said to be normal if it is continuous with respect to the $\sigma$-WOT. A state is a unital positive functional, and a trace is a unital positive tracial functional.

Definition 2.5 (Tracial $C^*$ and $W^*$-algebras). A tracial $C^*$-algebra is a pair $(\mathcal{A}, \tau)$ where $\mathcal{A}$ is a $C^*$-algebra and $\tau$ is a faithful trace. A tracial $W^*$-algebra is a pair $(\mathcal{A}, \tau)$ where $\mathcal{A}$ is a $W^*$-algebra and $\tau$ is a faithful normal trace.
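As a concrete illustration of Definitions 2.4 and 2.5, the matrix algebra $M_N(\mathbb{C})$ with the normalized trace is a tracial $W^*$-algebra. The following is a minimal sketch in plain Python that checks the unital, tracial, and positivity properties numerically on random $3 \times 3$ matrices. The helper names (`matmul`, `adjoint`, `normalized_trace`, `random_matrix`) are illustrative choices and do not come from the paper.

```python
import random

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose, i.e. the *-operation on M_N(C)."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def normalized_trace(a):
    """tr_N(a) = (1/N) * sum of the diagonal entries."""
    n = len(a)
    return sum(a[i][i] for i in range(n)) / n

def random_matrix(n, rng):
    """Matrix with independent complex Gaussian entries (illustrative only)."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
            for _ in range(n)]

rng = random.Random(0)
a = random_matrix(3, rng)
b = random_matrix(3, rng)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# unital: tr_N(1) = 1
unital = normalized_trace(identity)
# tracial: tr_N(ab) - tr_N(ba) should vanish up to rounding
tracial_gap = abs(normalized_trace(matmul(a, b)) - normalized_trace(matmul(b, a)))
# positive and faithful: tr_N(a* a) is the normalized sum of |a_ij|^2
positivity = normalized_trace(matmul(adjoint(a), a))
```

Here `positivity` equals $(1/N) \sum_{i,j} |a_{ij}|^2$, which vanishes only when $a = 0$; this is the faithfulness condition in the matrix case.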
Definition 2.6 (Homomorphisms). A $*$-homomorphism from one $*$-algebra to another is a linear map which respects multiplication and the $*$-operation. A $*$-homomorphism of unital $*$-algebras is called unital if it preserves $1$. A $*$-homomorphism of $W^*$-algebras is said to be normal if it is $\sigma$-WOT continuous. An isomorphism of tracial $C^*$-algebras is a $*$-isomorphism that preserves the trace; we make the same definition for tracial $W^*$-algebras but with the added requirement that the map and its inverse are normal.

Lemma 2.7 (Properties of $*$-homomorphisms). Any $*$-homomorphism of $C^*$-algebras is contractive and any injective $*$-homomorphism is isometric.

For any tracial $C^*$-algebra, there is a non-commutative analog of the $L^\alpha$ spaces for $\alpha \in [0, \infty]$ (we use $\alpha$ rather than $p$ to reserve the letter $p$ for polynomials), and they satisfy the non-commutative Hölder's inequality.

Definition 2.8 (Non-commutative $L^\alpha$ norms). Let $(\mathcal{A}, \tau)$ be a tracial $C^*$-algebra. For $\alpha \in [0, \infty]$ and $X \in \mathcal{A}^d$, we write
\[ \|X\|_\alpha = \begin{cases} \left( \sum_{j=1}^d \tau\big( (X_j^* X_j)^{\alpha/2} \big) \right)^{1/\alpha}, & \alpha < \infty, \\ \max_j \|X_j\|_\infty, & \alpha = \infty. \end{cases} \]
Here $(X_j^* X_j)^{\alpha/2}$ is defined by functional calculus.

Lemma 2.9 (Non-commutative Hölder's inequality). Let $\alpha, \alpha_1, \dots, \alpha_n \in [0, \infty]$ with $1/\alpha = \sum_{j=1}^n 1/\alpha_j$. Let $(\mathcal{A}, \tau)$ be a tracial $C^*$-algebra and let $a_1, \dots, a_n \in \mathcal{A}$. Then
\[ \|a_1 \cdots a_n\|_\alpha \leq \|a_1\|_{\alpha_1} \cdots \|a_n\|_{\alpha_n}. \]
Also, we have $\lim_{\alpha \to \infty} \|X\|_\alpha = \|X\|_\infty$ for $X \in \mathcal{A}^d$.

Modulo renormalization of the trace, the inequality for matrices follows from the treatment of trace-class operators in [72]; see especially Thm. 1.15 and Thm. 2.8, as well as the references cited on p. 31. For the setting of von Neumann algebras, a convenient proof can be found in [23, Thm. 2.4 - 2.6]; for an overview and further history see [64].

Definition 2.10 (Conditional expectation). Let $\mathcal{A}$ be a $C^*$-algebra and $\mathcal{B}$ a unital $C^*$-subalgebra.
A conditional expectation $E : \mathcal{A} \to \mathcal{B}$ is a linear map such that
(1) $E$ is positive, that is, it maps any operator of the form $a^* a \in \mathcal{A}$ to an operator of the form $b^* b \in \mathcal{B}$.
(2) $E$ is a $\mathcal{B}$-$\mathcal{B}$-bimodule map, that is, $E[b_1 a b_2] = b_1 E[a] b_2$ for $a \in \mathcal{A}$ and $b_1, b_2 \in \mathcal{B}$.
(3) $E|_{\mathcal{B}} = \operatorname{id}$.

The following result about tracial $W^*$-algebras is well-known.

Lemma 2.11 (Conditional expectations for tracial $W^*$-algebras). Let $(\mathcal{A}, \tau)$ be a tracial $W^*$-algebra and let $\mathcal{B}$ be a $W^*$-subalgebra. Then there exists a unique trace-preserving conditional expectation $E : \mathcal{A} \to \mathcal{B}$, and this $E$ is $\sigma$-WOT continuous. For each $a \in \mathcal{A}$, the conditional expectation $E[a]$ is characterized by the condition that $\tau(E[a] b) = \tau(ab)$ for all $b \in \mathcal{B}$. Moreover, $\|E[X]\|_\alpha \leq \|X\|_\alpha$ for any $X \in \mathcal{A}^d$ and $\alpha \in [1, \infty]$.

Next, we describe the space of non-commutative laws. A non-commutative law is the analog of a linear functional $\mathbb{C}[x_1, \dots, x_d] \to \mathbb{R}$ given by $f \mapsto \int f \, d\mu$ for some compactly supported measure on $\mathbb{R}^d$. Instead of $\mathbb{C}[x_1, \dots, x_d]$, we use the non-commutative polynomial algebra in $d$ variables.

Definition 2.12 (Non-commutative polynomial algebra). We denote by $\mathbb{C}\langle x_1, \dots, x_d \rangle$ the universal unital algebra generated by variables $x_1, \dots, x_d$. As a vector space, $\mathbb{C}\langle x_1, \dots, x_d \rangle$ has a basis consisting of all products $x_{i_1} \cdots x_{i_\ell}$ for $\ell \geq 0$ and $i_1, \dots, i_\ell \in \{1, \dots, d\}$. We equip $\mathbb{C}\langle x_1, \dots, x_d \rangle$ with the unique $*$-operation such that $x_j^* = x_j$.

Definition 2.13 (Non-commutative law). A linear functional $\lambda : \mathbb{C}\langle x_1, \dots, x_d \rangle \to \mathbb{C}$ is said to be exponentially bounded if there exists $R > 0$ such that $|\lambda(x_{i_1} \cdots x_{i_\ell})| \leq R^\ell$ for all $\ell \in \mathbb{N}$ and $i_1, \dots, i_\ell \in \{1, \dots, d\}$, and in this case we say $R$ is an exponential bound for $\lambda$. A non-commutative law is a unital, positive, tracial, exponentially bounded linear functional $\lambda : \mathbb{C}\langle x_1, \dots, x_d \rangle \to \mathbb{C}$.
We denote the space of non-commutative laws by $\Sigma_d$, and we equip it with the weak-$*$ topology (that is, the topology of pointwise convergence on $\mathbb{C}\langle x_1, \dots, x_d \rangle$). We denote by $\Sigma_{d,R}$ the subset of $\Sigma_d$ consisting of non-commutative laws with exponential bound $R$.

Observation 2.14. The space $\Sigma_{d,R}$ is compact and metrizable.

Observation 2.15. Let $\mathcal{A}$ be a $*$-algebra and $X = (X_1, \dots, X_d) \in \mathcal{A}^d_{sa}$. Then there is a unique $*$-homomorphism $\rho_X : \mathbb{C}\langle x_1, \dots, x_d \rangle \to \mathcal{A}$ such that $\rho_X(x_j) = X_j$ for $j = 1, \dots, d$.

Definition 2.16 (Non-commutative law of a $d$-tuple). Let $(\mathcal{A}, \tau)$ be a tracial $C^*$-algebra. Let $X = (X_1, \dots, X_d) \in \mathcal{A}^d_{sa}$. Then we define $\lambda_X : \mathbb{C}\langle x_1, \dots, x_d \rangle \to \mathbb{C}$ by $\lambda_X = \tau \circ \rho_X$.

Observation 2.17. If $(\mathcal{A}, \tau)$ and $X$ are as above, then $\lambda_X$ is a non-commutative law with exponential bound $\|X\|_\infty$. Conversely, if $R$ is an exponential bound for $\lambda_X$, then
\[ \|X\|_\infty = \max_j \lim_{n \to \infty} \left[ \tau(X_j^{2n}) \right]^{1/2n} \leq R. \]
Hence, $\|X\|_\infty$ is the smallest exponential bound for $\lambda_X$ and in particular it is uniquely determined by $\lambda_X$.

In the case of a single operator $X$, we can apply the spectral theorem to show that there is a unique probability measure $\mu_X$ on $\mathbb{R}$ satisfying
\[ \int_{\mathbb{R}} f \, d\mu_X = \tau(f(X)) \quad \text{for } f \in C(\mathbb{R}). \]
Since $X$ is bounded, $\mu_X$ is compactly supported and thus it makes sense to evaluate it on polynomials. If $p$ is a polynomial, then $\lambda_X[p] = \int_{\mathbb{R}} p \, d\mu_X$. Thus, $\lambda_X$ is simply the linear functional on polynomials corresponding to the spectral distribution.

We use the notation $\lambda_X$ in particular when $\mathcal{A} = M_N(\mathbb{C})$. We denote by $\operatorname{tr}_N$ the normalized trace $(1/N) \operatorname{Tr}$ on $M_N(\mathbb{C})$; recall that this is the unique (unital) trace on $M_N(\mathbb{C})$. Thus, for any $X \in M_N(\mathbb{C})^d_{sa}$, a non-commutative law $\lambda_X$ is unambiguously specified by the previous definition. In the $d = 1$ case, the non-commutative law is given by the empirical spectral distribution. Note that when $X$ is a random $d$-tuple of matrices, we will use the notation $\lambda_X$ by default to refer to the empirical non-commutative law, that is, the (random) non-commutative law of $X$ with respect to $\operatorname{tr}_N$.

The next proposition shows that any non-commutative law can be realized by a self-adjoint $d$-tuple in some tracial $C^*$ or $W^*$-algebra. This is a version of the Gelfand-Naimark-Segal construction (or GNS construction). A proof can be found in [5, Proposition 5.2.14(d)].
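To make the matrix case of Definition 2.16 concrete, the following sketch evaluates the non-commutative law $\lambda_X = \operatorname{tr}_N \circ \rho_X$ of a pair of self-adjoint $2 \times 2$ matrices on monomials. Encoding a monomial $x_{i_1} \cdots x_{i_\ell}$ as a tuple of 0-based indices, and the function name `law`, are assumptions made for illustration only.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def law(matrices, word):
    """Evaluate lambda_X on the monomial encoded by `word`.

    `word` is a tuple of 0-based indices (an illustrative encoding), so
    (0, 1, 0) stands for the monomial x_1 x_2 x_1.
    """
    n = len(matrices[0])
    product = identity(n)
    for index in word:
        product = matmul(product, matrices[index])
    return sum(product[i][i] for i in range(n)) / n   # normalized trace tr_N

# Two self-adjoint 2x2 matrices, each of operator norm 1.
x1 = [[0.0, 1.0], [1.0, 0.0]]
x2 = [[1.0, 0.0], [0.0, -1.0]]
pair = [x1, x2]

moment_empty = law(pair, ())             # lambda_X(1), unital
moment_square = law(pair, (0, 0))        # lambda_X(x_1^2)
moment_mixed = law(pair, (0, 1))         # lambda_X(x_1 x_2)
moment_alternating = law(pair, (0, 1, 0, 1))
moment_rotated = law(pair, (1, 0, 1, 0)) # cyclic rotation of the previous word
```

The last two moments agree because $\lambda_X$ is tracial, and every moment has absolute value at most $1 = \|X\|_\infty^\ell$, in line with the exponential bound of Observation 2.17.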

Proposition 2.18 (GNS construction for non-commutative laws). Let $\lambda \in \Sigma_{d,R}$. Then we may define a semi-inner product on $\mathbb{C}\langle x_1, \dots, x_d \rangle$ by
\[ \langle p, q \rangle_\lambda = \lambda(p^* q). \]
Let $H_\lambda$ be the separation-completion of $\mathbb{C}\langle x_1, \dots, x_d \rangle$ with respect to this inner product, that is, the completion of $\mathbb{C}\langle x_1, \dots, x_d \rangle / \{p : \lambda(p^* p) = 0\}$, and let $[p]$ denote the equivalence class of a polynomial $p$ in $H_\lambda$. There is a unique unital $*$-homomorphism $\pi : \mathbb{C}\langle x_1, \dots, x_d \rangle \to B(H_\lambda)$ satisfying $\pi(p)[q] = [pq]$ for $p, q \in \mathbb{C}\langle x_1, \dots, x_d \rangle$. Moreover, $\|\pi(x_j)\| \leq R$.

Let $X_j = \pi(x_j)$, let $X = (X_1, \dots, X_d)$, and let $C^*(X)$ and $W^*(X)$ denote respectively the $C^*$ and $W^*$-algebras generated by the image of $\pi$. Define $\tau : W^*(X) \to \mathbb{C}$ by $\tau(T) = \langle [1], T[1] \rangle_\lambda$. Then $\tau$ is a faithful normal trace on $W^*(X)$ and in particular a faithful trace on $C^*(X)$.

Definition 2.19. In the situation of the previous proposition, we call $C^*(X)$ and $W^*(X)$ the $C^*$ and $W^*$-algebras associated to $\lambda$.

The operator algebras associated to $\lambda$ are canonical in the sense that any other construction would yield an isomorphic $W^*$ or $C^*$-algebra. The following lemma can be deduced from the well-known properties of the GNS representation associated to a faithful trace $\tau$ on a $C^*$ or $W^*$-algebra $\mathcal{A}$ (which gives the so-called standard form of a tracial $W^*$-algebra).

Lemma 2.20. Let $(\mathcal{A}, \tau)$ and $(\mathcal{B}, \sigma)$ be tracial $C^*$-algebras. Let $X \in \mathcal{A}^d_{sa}$ and $Y \in \mathcal{B}^d_{sa}$ such that $\lambda_X = \lambda_Y$. Let $C^*(X)$ and $C^*(Y)$ be the $C^*$-subalgebras of $\mathcal{A}$ and $\mathcal{B}$ generated by $X$ and $Y$ respectively. Then there is a unique tracial $C^*$-isomorphism $\rho : C^*(X) \to C^*(Y)$ such that $\rho(X_j) = Y_j$. The same result holds with tracial $W^*$-algebras rather than tracial $C^*$-algebras.

For background material on free independence, see e.g. [89, 59, 56].

Definition 2.21 (Free independence). Let $\mathcal{A}$ be a $*$-algebra and $\tau : \mathcal{A} \to \mathbb{C}$ a trace. Then unital $*$-subalgebras $(\mathcal{A}_i)_{i \in I}$ are said to be freely independent if $\tau(a_1 \cdots a_\ell) = 0$ whenever $a_1 \in \mathcal{A}_{i_1}, \dots, a_\ell \in \mathcal{A}_{i_\ell}$ are such that $\tau(a_j) = 0$ for each $j$ and $i_1 \neq i_2 \neq \dots \neq i_\ell$. Similarly, if $I$ is an index set and $X_i$ is a $d_i$-tuple of operators in $\mathcal{A}$ for each $i \in I$, we say that $(X_i)_{i \in I}$ are freely independent if the $*$-algebras $\mathcal{A}_i$ generated by $X_i$ are freely independent.

Lemma 2.22 (Free independence determines joint moments). Let $(\mathcal{A}, \tau)$ be a $*$-algebra with a trace. Suppose that $X_i = (X_{i,1}, \dots, X_{i,d_i})$ is a $d_i$-tuple of self-adjoint operators for each $i$ in some index set $I$, such that $(X_i)_{i \in I}$ are freely independent. Then for any non-commutative polynomial $p$ in $(X_i)_{i \in I}$, the trace $\tau(p((X_i)_{i \in I}))$ is uniquely determined from the traces $\tau(q(X_i))$ for $q \in \mathbb{C}\langle x_1, \dots, x_{d_i} \rangle$ and $i \in I$. In fact, there is a universal formula for $\tau(p((X_i)_{i \in I}))$ using sums and products of the traces $\tau(q(X_i))$ that does not depend on the particular $\mathcal{A}$ and $\tau$. In particular, (if $I$ is finite) the non-commutative law of $(X_i)_{i \in I}$ is uniquely determined by $(\lambda_{X_i})_{i \in I}$.

For proof, see [89, Proposition 2.5.5].
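The simplest instance of the universal formula in Lemma 2.22 can be worked out by hand. The following derivation is a sketch under the assumption that $a \in \mathcal{A}_1$ and $b \in \mathcal{A}_2$, where $\mathcal{A}_1$ and $\mathcal{A}_2$ are freely independent unital $*$-subalgebras; the symbols $a$, $b$, $a_1$, $a_2$ are illustrative and not notation from the paper.

```latex
% Centering: a - \tau(a) and b - \tau(b) have trace zero and lie in
% \mathcal{A}_1 and \mathcal{A}_2 respectively, so Definition 2.21 gives
\tau\big( (a - \tau(a)) (b - \tau(b)) \big) = 0.
% Expanding by linearity and using \tau(1) = 1:
\tau(ab) - \tau(a)\tau(b) - \tau(a)\tau(b) + \tau(a)\tau(b) = 0
\quad\Longrightarrow\quad
\tau(ab) = \tau(a)\,\tau(b).
% For a_1, a_2 \in \mathcal{A}_1 and b \in \mathcal{A}_2, traciality reduces
% a word of length three to the previous case:
\tau(a_1 b a_2) = \tau(a_2 a_1 b) = \tau(a_2 a_1)\,\tau(b) = \tau(a_1 a_2)\,\tau(b).
```

Longer alternating words are handled the same way: center each factor, expand, and apply the identity recursively to the shorter words that appear.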

Lemma 2.23 (Free conditional expectations). Let $X \in \mathcal{A}^d_{sa}$ and $Y \in \mathcal{A}^{d'}_{sa}$ be freely independent in $(\mathcal{A}, \tau)$. Let $E_{W^*(X)} : \mathcal{A} \to W^*(X)$ be the unique trace-preserving conditional expectation. If $p(X, Y)$ is a non-commutative polynomial of $X$ and $Y$, then $E_{W^*(X)}[p(X, Y)]$ is a non-commutative polynomial of $X$. Furthermore, the coefficients are given by a universal formula in terms of sums and products of traces of non-commutative polynomials in $X$ and traces of non-commutative polynomials in $Y$.

See [56].

Lemma 2.24 (Free products). Let $(\mathcal{A}_1, \tau_1), \dots, (\mathcal{A}_n, \tau_n)$ be tracial $W^*$-algebras. Then there exists a tracial $W^*$-algebra
\[ (\mathcal{A}, \tau) = (\mathcal{A}_1 * \cdots * \mathcal{A}_n, \tau_1 * \cdots * \tau_n) \]
with canonical trace-preserving inclusions $\iota_j : (\mathcal{A}_j, \tau_j) \to (\mathcal{A}, \tau)$ such that $\mathcal{A}$ is the $W^*$-algebra generated by the images $\iota_1(\mathcal{A}_1), \dots, \iota_n(\mathcal{A}_n)$ and these images are freely independent. The free product is commutative and associative up to a canonical isomorphism.

For proof, refer to [89, Propositions 1.5.5 and 2.5.3] or [59, Lectures 6-7].

Definition 2.25 (Standard semicircular family). A $d$-tuple $S = (S_1, \dots, S_d)$ from $(\mathcal{A}, \tau)$ is said to be a standard semicircular family if $S_1, \dots, S_d$ are freely independent and the spectral measure of each $S_j$ with respect to $\tau$ is $(1/2\pi) \sqrt{4 - t^2} \, 1_{[-2,2]}(t) \, dt$.

Lemma 2.26 (Free Brownian motion). There exists a tracial $W^*$-algebra $(\mathcal{B}, \sigma)$ and self-adjoint $d$-tuples $(S(t))_{t \in [0, \infty)}$ from $\mathcal{B}$ such that
(1) $S(0) = 0$;
(2) $(S(t_2) - S(t_1)) / (t_2 - t_1)^{1/2}$ is a standard semicircular family for each $t_1 < t_2$;
(3) $S(t_1) - S(t_0), \dots, S(t_m) - S(t_{m-1})$ are freely independent whenever $t_0 < t_1 < \cdots < t_m$;
(4) $(\mathcal{B}, \sigma)$ is generated as a $W^*$-algebra by $(S(t))_{t \in [0, \infty)}$.
Moreover, $(\mathcal{B}, \sigma)$ and $(S(t))_{t \in [0, \infty)}$ are unique up to a $W^*$-isomorphism that preserves the generators. We call $S(t)$ a $d$-variable free Brownian motion.

For proof, refer to [73, §5] or [89].

2.2 The classical Wasserstein manifold and log-density coordinates

To motivate our construction of the free Wasserstein manifold, we briefly review the classical Wasserstein manifold and discuss an alternate coordinate system based on minus the log-density rather than the density itself, as was done to some extent in [49] and [62]. In the following, $M$ will be a Riemannian manifold of dimension $d$. We denote by $\langle v, w \rangle$ the inner product of two tangent vectors $v$ and $w$ at some point $x \in M$ with respect to the Riemannian metric, by $d_M$ the geodesic distance on $M$, and by $dx$ the canonical volume form associated to the Riemannian metric. In this discussion, we will mostly assume that $M$ is compact because it makes the analysis simpler; for instance, the rigorous formulation of $\mathcal{P}(M)$ as a Fréchet manifold is easiest when $M$ is compact, see e.g. [49]. However, readers who are less familiar with Riemannian geometry may focus on the case $M = \mathbb{R}^d$ to understand the computations. Our non-commutative Wasserstein manifold is the analog of the case $M = \mathbb{R}^d$.

Definition 2.27 (Wasserstein manifold). We define the manifold of probability densities or Wasserstein manifold of $M$ by
\[ \mathcal{P}(M) := \left\{ \rho \in C^\infty(M; \mathbb{R}) : \rho > 0, \ \int_M \rho \, dx = 1 \right\}. \]
For each density $\rho$, the tangent space is defined by
\[ T_\rho \mathcal{P}(M) := \left\{ \sigma \in C^\infty(M; \mathbb{R}) : \int_M \sigma \, dx = 0 \right\}. \]

The Riemannian metric for $\mathcal{P}(M)$ is defined in terms of the elliptic differential operator $\Delta_\rho : C^\infty(M) \to C^\infty(M)$ given by
\[ \Delta_\rho f := \nabla^\dagger(\rho \nabla f) = \rho \Delta f + \langle \nabla \rho, \nabla f \rangle, \]
where $\nabla^\dagger$ denotes the divergence operator from vector fields on $M$ to smooth functions. When $M$ is compact, $\Delta_\rho$ defines an unbounded self-adjoint operator on $L^2(dx)$ with $\Delta_\rho \leq 0$. The kernel is the space of constant functions and its orthogonal complement in $L^2(dx)$ is the space of functions $\sigma$ with $\int \sigma \, dx = 0$. Thanks to the theory of elliptic PDE, there is a pseudo-inverse operator $\Delta_\rho^{-1} : C^\infty(M) \to C^\infty(M)$ satisfying $\Delta_\rho^{-1} f = g$ if and only if $\int_M g \, dx = 0$ and $\Delta_\rho g = f - \int_M f \, dx$.

Definition 2.28 (Riemannian metric on $\mathcal{P}(M)$). Let $M$ be compact. For each $\rho \in \mathcal{P}(M)$, we define a Riemannian metric $\langle \cdot, \cdot \rangle_{T_\rho \mathcal{P}(M)}$ on the tangent space by
\[ \langle \sigma_1, \sigma_2 \rangle_{T_\rho \mathcal{P}(M)} := \int_M \sigma_1(x) (-\Delta_\rho^{-1} \sigma_2)(x) \, dx, \]
or equivalently (using integration by parts),
\[ \langle \sigma_1, \sigma_2 \rangle_{T_\rho \mathcal{P}(M)} := \int_M \langle \nabla(\Delta_\rho^{-1} \sigma_1), \nabla(\Delta_\rho^{-1} \sigma_2) \rangle \, \rho(x) \, dx. \]

Next, we define alternative coordinates in terms of minus the log-density, and we compute the Riemannian metric in these new coordinates.
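The equivalence of the two formulas in Definition 2.28 is an integration by parts, and it survives discretization if the divergence and gradient are discretized compatibly. The following sketch checks this on the circle $M = \mathbb{R}/2\pi\mathbb{Z}$ with a finite-difference grid. The particular density, the tangent vector `sigma`, and the grid size are arbitrary choices made for illustration, and the one-dimensional solver is specific to this example rather than a method from the paper.

```python
import math

n = 200
h = 2 * math.pi / n
nodes = [i * h for i in range(n)]
edges = [(i + 0.5) * h for i in range(n)]   # edge i lies between node i and node i+1

def density(x):
    # Positive and integrates to 1 over the circle (illustrative choice).
    return (1.0 + 0.5 * math.cos(x)) / (2 * math.pi)

rho_edge = [density(x) for x in edges]

# A tangent vector sigma: a function with discrete mean zero.
sigma = [math.sin(x) + 0.3 * math.cos(2 * x) for x in nodes]
mean = sum(sigma) / n
sigma = [s - mean for s in sigma]

# Solve the discrete equation Delta_rho g = sigma, where the flux
# q_i = rho_edge[i] * (g[i+1] - g[i]) / h satisfies (q_i - q_{i-1}) / h = sigma_i.
partial = []
running = 0.0
for s in sigma:
    running += h * s
    partial.append(running)

# The constant c makes g periodic: the increments of g must sum to zero.
c = -sum(q / r for q, r in zip(partial, rho_edge)) / sum(1.0 / r for r in rho_edge)
flux = [q + c for q in partial]

g = [0.0]
for i in range(n - 1):
    g.append(g[i] + h * flux[i] / rho_edge[i])

# First formula: integral of sigma * (-Delta_rho^{-1} sigma) dx.
metric_first = -h * sum(s * gi for s, gi in zip(sigma, g))
# Second formula: integral of |grad(Delta_rho^{-1} sigma)|^2 rho dx.
metric_second = h * sum(
    r * ((g[(i + 1) % n] - g[i]) / h) ** 2 for i, r in enumerate(rho_edge)
)
```

Adding a constant to `g` does not change either quantity because `sigma` has mean zero, so the normalization $\int_M g \, dx = 0$ of the pseudo-inverse is not needed for this check. The two values agree up to rounding by discrete summation by parts.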

Definition 2.29 (Log-density manifold). Let
\[ \mathscr{W}(M) := \left\{ V \in C^\infty(M, \mathbb{R}) : \int_M e^{-V} \, dx = 1 \right\} \]
and
\[ T_V \mathscr{W}(M) := \left\{ W \in C^\infty(M, \mathbb{R}) : \int_M W e^{-V} \, dx = 0 \right\}. \]

Lemma 2.30 (Change of coordinates between density and log-density). Let $M$ be compact. There is a bijection $\mathcal{E} : \mathscr{W}(M) \to \mathcal{P}(M)$ given by $V \mapsto e^{-V}$. The corresponding map $d\mathcal{E}_V : T_V \mathscr{W}(M) \to T_\rho \mathcal{P}(M)$ is $W \mapsto -W e^{-V}$. Moreover, the Riemannian metric on $\mathcal{P}(M)$ corresponds to the Riemannian metric on $\mathscr{W}(M)$ given by
\[ \langle W_1, W_2 \rangle_{T_V \mathscr{W}(M)} := -\int_M W_1 (L_V^{-1} W_2) e^{-V} \, dx = \int_M \langle \nabla(L_V^{-1} W_1), \nabla(L_V^{-1} W_2) \rangle e^{-V} \, dx, \]
where
\[ L_V f := \Delta f - \langle \nabla f, \nabla V \rangle \]
and $L_V^{-1}$ is the pseudo-inverse of $L_V$ given by
\[ L_V(L_V^{-1} f) = L_V^{-1}(L_V f) = f - \int_M f e^{-V} \, dx, \qquad L_V^{-1}(1) = 0. \]

Proof. $\mathcal{E}$ defines a bijection since the inverse is given by $\rho \mapsto -\log \rho$. A tangent vector $W \in T_V \mathscr{W}(M)$ represents the equivalence class of the path $t \mapsto V + tW$ in $\mathscr{W}(M)$. The corresponding path in $\mathcal{P}(M)$ is $t \mapsto e^{-(V + tW)}$. Differentiating at $t = 0$ yields $-W e^{-V}$, hence this is the corresponding element of $T_\rho \mathcal{P}(M)$. Note that
\[ \Delta_{e^{-V}} f = e^{-V} \Delta f - e^{-V} \langle \nabla V, \nabla f \rangle = e^{-V} L_V f, \]
and that $e^{-V} f$ integrates to zero with respect to $dx$ if and only if $f$ integrates to zero with respect to $e^{-V} dx$. Hence,
\[ \Delta_{e^{-V}}^{-1}(e^{-V} f) = L_V^{-1} f, \]
so
\[ \langle d\mathcal{E}_V(W_1), d\mathcal{E}_V(W_2) \rangle_{T_{e^{-V}} \mathcal{P}(M)} = -\int_M e^{-V} W_1 \, \Delta_{e^{-V}}^{-1}[e^{-V} W_2] \, dx = -\int_M W_1 L_V^{-1}(W_2) e^{-V} \, dx. \]
Using integration by parts, this is equivalent to $\int_M \langle \nabla L_V^{-1}(W_1), \nabla L_V^{-1}(W_2) \rangle e^{-V} \, dx$.

We point out that $L_V$ defines a self-adjoint unbounded operator on $L^2(e^{-V})$ with $L_V \leq 0$. In fact, $L_V = -\nabla_V^* \nabla$, where
\[ \nabla_V^* f := -\nabla^\dagger f + \langle f, \nabla V \rangle \]
when $f$ is a vector field on $M$. When $M$ is compact, the kernel of $L_V$ is precisely the space of constant functions. The operator $L_V$ seems more intrinsic than $\Delta_\rho$ since it is defined directly in terms of the measure $e^{-V} dx$ rather than $dx$.

Smooth transport of measure, or in other words, the transport action of the diffeomorphism group of $M$ on $\mathcal{P}(M)$, is of central importance for our work. Let $\mathscr{D}(M)$ denote the group of diffeomorphisms of the compact Riemannian manifold $M$, where the group operation is composition. We can consider $\mathscr{D}(M)$ as an infinite-dimensional Lie group. The corresponding Lie algebra is the algebra of smooth vector fields on $M$, which we denote by $\operatorname{Vect}(M)$, and the exponential map sends a vector field $f$ to the diffeomorphism obtained from the flow along $f$ at time $1$. The Lie bracket for the Lie algebra of vector fields is known as the Poisson bracket, and it coincides (up to varying sign conventions) with taking the commutator of the differential operators associated to vector fields.
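For intuition about the statement that $L_V \leq 0$ with kernel the constants, consider the Gaussian case on $M = \mathbb{R}$ (which is not compact, so this is only a heuristic illustration). Assuming $V(x) = x^2/2 + \frac{1}{2} \log(2\pi)$, the operator becomes $L_V f = f'' - x f'$, and the probabilists' Hermite polynomials satisfy $L_V \mathrm{He}_n = -n \, \mathrm{He}_n$. The following sketch verifies this exactly with integer coefficient lists; the representation of a polynomial as a list `[c0, c1, c2, ...]` and all function names are illustrative choices.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    """Derivative of the polynomial c0 + c1 x + c2 x^2 + ..."""
    return trim([k * p[k] for k in range(1, len(p))] or [0])

def times_x(p):
    """Multiply a polynomial by x."""
    return trim([0] + list(p))

def subtract(p, q):
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return trim([a - b for a, b in zip(p, q)])

def ou_generator(p):
    """L_V p = p'' - x p' for the Gaussian potential V(x) = x^2/2 + const."""
    first = derivative(p)
    second = derivative(first)
    return subtract(second, times_x(first))

hermite_2 = [-1, 0, 1]        # He_2(x) = x^2 - 1
hermite_3 = [0, -3, 0, 1]     # He_3(x) = x^3 - 3x

image_2 = ou_generator(hermite_2)       # equals -2 * He_2
image_3 = ou_generator(hermite_3)       # equals -3 * He_3
image_const = ou_generator([5])         # constants lie in the kernel
```

The eigenvalues $0, -1, -2, \dots$ are non-positive, and only the constants are annihilated, which matches the description of $L_V$ above in this special case.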

Observation 2.31 (Transport action). There is a group action $\mathscr{D}(M) \curvearrowright \mathcal{P}(M)$ given by
\[ (f, \rho) \mapsto f_* \rho := (\rho \circ f^{-1}) \, |\det df^{-1}|, \]
or in other words, the push-forward of the measure $\rho \, dx$ by the function $f$ is $(f_* \rho) \, dx$. The corresponding action $\mathscr{D}(M) \curvearrowright \mathscr{W}(M)$ is given by
\[ (f, V) \mapsto f_* V := V \circ f^{-1} - \log |\det df^{-1}|. \]

Lemma 2.32 (Differential of transport action). Fix $\rho \in \mathcal{P}(M)$, and consider the map $S : \mathscr{D}(M) \to \mathcal{P}(M)$ given by $f \mapsto f_* \rho$. Then the differential satisfies
\[ dS_{\operatorname{id}} : \operatorname{Vect}(M) \to T_\rho \mathcal{P}(M) : h \mapsto -\nabla^\dagger(\rho h) = -\langle \nabla \rho, h \rangle - \rho \nabla^\dagger h. \]
Fix $V \in \mathscr{W}(M)$, and consider the map $T : \mathscr{D}(M) \to \mathscr{W}(M)$ given by $f \mapsto f_* V$. Then the differential satisfies
\[ dT_{\operatorname{id}} : \operatorname{Vect}(M) \to T_V \mathscr{W}(M) : h \mapsto -\nabla_V^* h = \nabla^\dagger h - \langle h, \nabla V \rangle. \]

Proof. Let $f_t$ be a path of diffeomorphisms with $f_0 = \operatorname{id}$ and $\dot{f}_0 = h$. Then using the product rule,
\[ \frac{d}{dt}\Big|_{t=0} \big( (\rho \circ f_t^{-1}) \, |\det df_t^{-1}| \big) = -\langle \nabla \rho, h \rangle - \rho \operatorname{Tr}(dh) = -\nabla^\dagger(\rho h) \]
and
\[ \frac{d}{dt}\Big|_{t=0} \big( V \circ f_t^{-1} - \log |\det df_t^{-1}| \big) = -\langle \nabla V, h \rangle + \operatorname{Tr}(dh) = -\nabla_V^* h. \]

If $M$ is compact, then the action of $\mathscr{D}(M)$ on $\mathcal{P}(M)$ is transitive [29]. Moreover, if we fix some $\rho$, then the map $f \mapsto f_* \rho$ is a submersion $\mathscr{D}(M) \to \mathcal{P}(M)$, which can be used to define local coordinates on $\mathcal{P}(M)$ [49]. Moreover, the map $-\nabla_V^* : \operatorname{Vect}(M) \to C^\infty(M)$ modulo constants has a right-inverse given by $\nabla L_V^{-1}$ since $-\nabla_V^* \nabla L_V^{-1} f = f - \int f e^{-V} \, dx$. Thus, $\nabla L_V^{-1}$ transforms a change in $V$ into an infinitesimal transport map. We shall use this idea to construct families of transport maps along paths in the free Wasserstein manifold.

The stabilizer in $\mathscr{D}(M)$ of some $V \in \mathscr{W}(M)$ is the group $\mathscr{D}(M, V)$ of diffeomorphisms that preserve the measure $e^{-V} dx$. If $h \in \operatorname{Vect}(M)$, then $\exp(th)$ preserves $V$ for all $t$ if and only if $\nabla_V^* h = 0$. Hence, the Lie algebra for the stabilizer consists of divergence-free vector fields with respect to $V$, which is the orthogonal complement in $L^2(e^{-V} dx)$ of the space of gradients. For each $V$, we can define an inner product on vector fields by integrating the Riemannian metric of $M$ with respect to the measure $e^{-V} dx$, and this can be extended to a right-invariant Riemannian metric on the diffeomorphism group. Geodesic equations on $\mathscr{D}(M)$ and $\mathscr{D}(M, V)$ yield respectively the inviscid Burgers' equation and incompressible Euler's equation [6]; we formulate the non-commutative versions later in the paper.

We next consider differentiation of functionals on $\mathcal{P}(M)$ or $\mathscr{W}(M)$.

Definition 2.33 (Wasserstein differential and gradient). For a functional $F : \mathcal{P}(M) \to \mathbb{R}$, we denote the differential (when defined) by $\delta_\rho F(\rho) : T_\rho \mathcal{P}(M) \to \mathbb{R}$. Moreover, $\operatorname{grad}_\rho F(\rho)$ is the unique element of $T_\rho \mathcal{P}(M)$ satisfying
\[ \langle \operatorname{grad}_\rho F(\rho), \sigma \rangle_{T_\rho \mathcal{P}(M)} = \delta_\rho F(\rho)[\sigma]. \]
For functionals $F$ on $\mathscr{W}(M)$, we make the analogous definitions of $\delta_V F$ and $\operatorname{grad}_V F$.

Often, the functionals are given by integration of some function of $\rho$ over $M$, and then the gradients are computed using integration by parts. We illustrate this technique on one of the most important functionals, the entropy functional
\[ h(\rho) := \int -\rho \log \rho \, dx. \]

Lemma 2.34 (Wasserstein gradient of entropy). We have
\[ \operatorname{grad}_\rho [h(\rho)] = \Delta \rho \]
and
\[ \operatorname{grad}_V [h(e^{-V})] = L_V V. \]

Proof. Consider the perturbation $\rho + t\sigma$ for some $\sigma \in T_\rho \mathcal{P}(M)$. Note that
\[ \frac{d}{dt}\Big|_{t=0} \int -(\rho + t\sigma) \log(\rho + t\sigma) \, dx = -\int \sigma (1 + \log \rho) \, dx = \int \Delta_\rho(-\Delta_\rho^{-1} \sigma)(1 + \log \rho) \, dx = \int \Delta_\rho(1 + \log \rho) \, (-\Delta_\rho)^{-1} \sigma \, dx. \]
Then note that $\Delta_\rho(1 + \log \rho) = \nabla^\dagger(\rho \nabla \log \rho) = \nabla^\dagger \nabla \rho = \Delta \rho$.

Similarly, consider $W \in T_V \mathscr{W}(M)$. Let $h = \nabla L_V^{-1} W$ and let $V_t = \exp(th)_* V$, so that $\dot{V}_0 = -\nabla_V^* h = W$. Since $h(e^{-V}) = \int e^{-V} V \, dx$, we have
\[ \frac{d}{dt}\Big|_{t=0} \int e^{-V_t} V_t \, dx = \int W (1 - V) e^{-V} \, dx = \int L_V(L_V^{-1} W)(1 - V) e^{-V} \, dx = \int L_V(1 - V) \, L_V^{-1} W \, e^{-V} \, dx = -\int (L_V V)(L_V^{-1} W) e^{-V} \, dx = \langle L_V V, W \rangle_{T_V \mathscr{W}(M)}. \]
Hence, $\operatorname{grad}_V [h(e^{-V})] = L_V V$. Alternatively, we can deduce this from the computation for $\mathcal{P}(M)$ and the relation that $-e^{-V} L_V V = \Delta[e^{-V}]$.

Hence, as observed by Otto [61], the upward gradient flow on $\mathcal{P}(M)$ for the entropy functional is described by the heat equation $\dot{\rho} = \Delta \rho$. The corresponding equation on $\mathscr{W}(M)$ is $\dot{V} = L_V V$.

Next, we discuss Hamiltonian flows on $\mathscr{W}(M)$ and in particular the geodesic equation. Hamiltonian flows on the tangent manifold $TM$ are related to the natural symplectic form
on $TM$ coming from the Riemannian metric on $M$. While we could write the Hamiltonian flows either in terms of the density $\rho$ or the log-density $V$, we will focus on the log-density case since it is less standard and more relevant to our work. It will be convenient for us to reparametrize the tangent space $T_V \mathscr{W}(M)$ using $\phi = L_V^{-1} W$ as our coordinate. More precisely, write
\[ T_V' \mathscr{W}(M) = C^\infty(M, \mathbb{R}) / \mathbb{R}, \]
where $\mathbb{R}$ denotes the constant functions. Then $L_V$ sends $T_V' \mathscr{W}(M)$ onto $T_V \mathscr{W}(M)$, and the Riemannian metric on $T_V' \mathscr{W}(M)$ is the Dirichlet inner product with respect to $e^{-V} dx$, that is,
\[ \langle \phi_1, \phi_2 \rangle_{T_V' \mathscr{W}(M)} = \int \langle \nabla \phi_1, \nabla \phi_2 \rangle e^{-V} \, dx. \]
Let $T' \mathscr{W}(M)$ be the corresponding tangent bundle
\[ T' \mathscr{W}(M) = \mathscr{W}(M) \times C^\infty(M, \mathbb{R}) / \mathbb{R}. \]
We denote by $\operatorname{grad}_V' F(V) = L_V^{-1} \operatorname{grad}_V F(V)$ the gradient of $F(V)$ expressed in these new coordinates.

Definition 2.35 (Hamiltonian flow). Let $H : T' \mathscr{W}(M) \to \mathbb{R} : (V, \phi) \mapsto H(V, \phi)$. We call $V$ the position variable and $\phi$ the momentum variable. Then the Hamiltonian flow associated to $H$ is the pair of equations
\[ \begin{cases} \dot{V}_t = L_{V_t} \operatorname{grad}_\phi' H(V, \phi), \\ \dot{\phi}_t = -\operatorname{grad}_V' H(V, \phi), \end{cases} \]
where $t \mapsto (V_t, \phi_t)$ is a path in $T' \mathscr{W}(M)$ and the dot denotes the time derivative. The $L_{V_t}$ term is included to transform $T_V' \mathscr{W}(M)$ to $T_V \mathscr{W}(M)$ and thus to interpret the tangent vector as the rate of change of $V$.

Lemma 2.36. Let $F : \mathscr{W}(M) \to \mathbb{R}$. The Hamiltonian flow associated to
\[ H(V, \phi) := \frac{1}{2} \langle \phi, \phi \rangle_{T_V' \mathscr{W}(M)} + F(V) \]
is
\[ \begin{cases} \dot{V}_t = L_{V_t} \phi, \\ \dot{\phi}_t = -\frac{1}{2} \langle \nabla \phi, \nabla \phi \rangle - \operatorname{grad}_V' F(V). \end{cases} \]

Proof. It is clear that $\operatorname{grad}_\phi' H(V, \phi) = \phi$. To compute $\operatorname{grad}_V' [\langle \phi, \phi \rangle_{T_V' \mathscr{W}(M)}]$, consider $\psi \in T_V' \mathscr{W}(M)$, and the corresponding vector $L_V \psi \in T_V \mathscr{W}(M)$. Let $t \mapsto V_t$ be some path such that $\dot{V}_0 = L_V \psi$. Note that
\[ \frac{d}{dt}\Big|_{t=0} \langle \phi, \phi \rangle_{T_{V_t}' \mathscr{W}(M)} = \frac{d}{dt}\Big|_{t=0} \int_M \langle \nabla \phi, \nabla \phi \rangle e^{-V_t} \, dx = \int_M \langle \nabla \phi, \nabla \phi \rangle (-L_V \psi) e^{-V} \, dx = \int_M \langle \nabla \langle \nabla \phi, \nabla \phi \rangle, \nabla \psi \rangle e^{-V} \, dx = \langle \langle \nabla \phi, \nabla \phi \rangle, \psi \rangle_{T_V' \mathscr{W}(M)}. \]
With this computation in hand, we obtain
\[ \operatorname{grad}_V' H(V, \phi) = \frac{1}{2} \langle \nabla \phi, \nabla \phi \rangle + \operatorname{grad}_V' F(V), \]
which yields the asserted equations for the Hamiltonian flow.

We remark that if $F(V) = 0$, then the Wasserstein Hamiltonian flow is the geodesic equation on $\mathscr{W}(M)$, which is closely related to the study of optimal transport; we will discuss the non-commutative version later in the paper. Hamiltonian flows with nonzero $F$ often arise as Nash equilibria in mean field games.

While there is not a universally agreed-upon analog of $C^\infty$ functions of several self-adjoint operators, it has at least become clear that in the random matrix setting these functions should include trace polynomials. Trace polynomials were first studied from an algebraic viewpoint since they give all the unitarily invariant polynomials over $n \times n$ matrices for every $n$ [67, 65, 51, 68]. Their applications to Brownian motion and matrix groups and to probability theory are evident from [66, 69, 20, 28, 47, 48, 26].

Trace polynomials are functions of several self-adjoint operators obtained by mixing non-commutative polynomials with applications of the trace from the ambient von Neumann algebra. Let $\mathbb{C}\langle x_1, \dots, x_d \rangle$ be the $*$-algebra of non-commutative polynomials (Definition 2.12). Any non-commutative polynomial $p$ can be evaluated on self-adjoint $d$-tuples in a tracial $C^*$-algebra. If $(\mathcal{A}, \tau)$ is a tracial $C^*$-algebra and $X = (X_1, \dots, X_d) \in \mathcal{A}^d_{sa}$, then we write $p(X) = \rho_X(p)$, where $\rho_X$ is the unique $*$-homomorphism $\mathbb{C}\langle x_1, \dots, x_d \rangle \to \mathcal{A}$ mapping $x_j$ to $X_j$. Then $X \mapsto p(X)$ defines a function $p^{\mathcal{A},\tau} : \mathcal{A}^d_{sa} \to \mathcal{A}$. Moreover, there is a function $(\operatorname{tr}(p))^{\mathcal{A},\tau} : \mathcal{A}^d_{sa} \to \mathbb{C}$ given by $X \mapsto \tau(p(X))$. In fact, $(\operatorname{tr}(p))^{\mathcal{A},\tau}(X)$ depends only on the non-commutative law $\lambda_X$ and defines a continuous function on the space of laws $\Sigma_d$ (by definition of non-commutative laws). We obtain the algebra of scalar-valued trace polynomials $\operatorname{TrP}^0_d$ by taking sums and products of functions of the form $\operatorname{tr}(p)$, for instance, linear combinations of products such as $\operatorname{tr}(x_i x_j) \operatorname{tr}(x_k)$ and $\operatorname{tr}(x_i) \operatorname{tr}(x_j x_k x_l)$. In fact, using the Stone-Weierstrass theorem, this algebra is dense in $C(\Sigma_{d,R})$ (see [42, Proposition 13.6.3]).

These scalar-valued trace polynomials sit inside a larger algebra $\operatorname{TrP}_d$ obtained by multiplying scalar-valued trace polynomials and non-commutative polynomials, which would contain for instance linear combinations of terms such as $\operatorname{tr}(x_i x_j) \, x_k$ and $\operatorname{tr}(x_i) \, x_j x_k x_l$. The space of trace polynomials is defined algebraically as follows.

Deﬁnition 3.1.

We deﬁne tr( C h x , . . . , x d i ) to be the vector space C h x , . . . , x d i / Span { pq − qp : p, q ∈ C h x , . . . , x d , y , . . . , y k i . Then TrP ( R ∗ d ) is deﬁned to be the symmetric tensor algebra over tr( C h x , . . . , x d i ) modulo the relationtr(1) = 1. We also deﬁne TrP( x , . . . , x d ) = TrP ( x , . . . , x d ) ⊗ C h x , . . . , x d i ∗ -algebras.18or p ∈ C h x , . . . , x d i , we denote the corresponding element of tr( C h x , . . . , x d i ) by tr( p ). Elements in thealgebra TrP( x , . . . , x d ) will be written as linear combinations of expressions such as tr( p ) . . . tr( p n ) p . Notethat C h x , . . . , x d i has a natural Z d ≥ -grading by the degrees in each variable. The quotient tr( C h x , . . . , x d i )is deﬁned by relations pq − qp = 0, and it suﬃces to take p and q monomials, so that pq − qp is in a single gradedcomponent. Therefore, tr( C h x , . . . , x d i ) inherits the Z d ≥ -grading, and the same is true for TrP ( x , . . . , x d )and TrP( x , . . . , x d ). We also identify TrP ( x , . . . , x d ) with the subalgebra TrP ( x , . . . , x d ) ⊗ x , . . . , x d ).Just as commutative polynomials in d variables can be interpreted as functions R d → R , a trace poly-nomial f deﬁnes a function A dsa → A for every tracial C ∗ -algebra ( A , τ ). This is done through evaluationmaps which naturally extend the evaluation maps on C h x , . . . , x d i . Deﬁnition 3.2.

Let $(\mathcal{A},\tau)$ be a tracial $\mathrm{C}^*$-algebra, and let $X_1,\dots,X_d \in \mathcal{A}$ be self-adjoint. Then we define the evaluation map $\operatorname{ev}^{\mathcal{A},\tau}_{X_1,\dots,X_d} : \operatorname{TrP}(x_1,\dots,x_d) \to \mathcal{A}$ as the unique $*$-homomorphism satisfying
\[
\operatorname{ev}^{\mathcal{A},\tau}_{X_1,\dots,X_d}\bigl(p(x_1,\dots,x_d)\bigr) = p(X_1,\dots,X_d),
\]
\[
\operatorname{ev}^{\mathcal{A},\tau}_{X_1,\dots,X_d}\bigl(\operatorname{tr}(p(x_1,\dots,x_d))\bigr) = \tau\bigl(p(X_1,\dots,X_d)\bigr)\,1.
\]

To see that this is well-defined, first note that $p \mapsto \tau(p(X_1,\dots,X_d))\,1$ defines a linear map $\operatorname{tr}(\mathbb{C}\langle x_1,\dots,x_d\rangle) \to \mathcal{A}$ since $\tau$ is invariant under cyclic symmetry. Using the universal property of the symmetric tensor algebra, we obtain a map $\operatorname{TrP}^0(x_1,\dots,x_d) \to \mathcal{A}$. Finally, we tensor this map with the well-known evaluation map $\mathbb{C}\langle x_1,\dots,x_d\rangle \to \mathcal{A}$ to obtain a map $\operatorname{TrP}(x_1,\dots,x_d) \to \mathcal{A}$.

Definition 3.3.

With $(\mathcal{A},\tau)$ a tracial $\mathrm{C}^*$-algebra and $f \in \operatorname{TrP}(x_1,\dots,x_d)$, we define $f^{\mathcal{A},\tau} : \mathcal{A}_{\mathrm{sa}}^d \to \mathcal{A}$ by
\[
f^{\mathcal{A},\tau}(X_1,\dots,X_d) = \operatorname{ev}^{\mathcal{A},\tau}_{X_1,\dots,X_d}(f).
\]

Thus, a trace polynomial $f$ defines a function $\mathcal{A}_{\mathrm{sa}}^d \to \mathcal{A}$. We next explain how to differentiate the function $f^{\mathcal{A},\tau}$, and this will motivate the construction of non-commutative $C^k$ functions. Given $f : \mathcal{A}_{\mathrm{sa}}^d \to \mathcal{A}$ for some tracial $\mathrm{C}^*$-algebra, we define $\partial_j f : \mathcal{A}_{\mathrm{sa}}^d \times \mathcal{A}_{\mathrm{sa}} \to \mathcal{A}$ by
\[
\partial_j f(X_1,\dots,X_d)[Y] = \frac{d}{dt}\Big|_{t=0} f(X_1,\dots,X_{j-1}, X_j + tY, X_{j+1},\dots,X_d) \tag{3.1}
\]
whenever the limit defining the derivative exists in norm. (Of course, this definition makes sense for maps between Banach spaces in general, and one could also consider differentiation in the weak topology.) Similarly, for $j_2 \in \{1,\dots,d\}$, we can view $\partial_{j_1} f(X)[Y_1]$ as a function of $d+1$ variables, and then take a second directional derivative with respect to the $j_2$-th variable in another direction $Y_2$. In general, we denote the iterated directional derivatives of order $k$ by
\[
\partial_{j_k} \cdots \partial_{j_1} f(X_1,\dots,X_d)[Y_1,\dots,Y_k]
\]
for $j_1,\dots,j_k \in \{1,\dots,d\}$ and $X_1,\dots,X_d$ and $Y_1,\dots,Y_k$ in $\mathcal{A}_{\mathrm{sa}}$.

We claim that if $f \in \operatorname{TrP}(x_1,\dots,x_d)$, then the directional derivative $\partial_j(f^{\mathcal{A},\tau})(X)[Y]$ is given by $g^{\mathcal{A},\tau}(X,Y)$ for some trace polynomial $g$ that is independent of $(\mathcal{A},\tau)$. In fact, we will describe abstract differentiation operators on the algebra $\operatorname{TrP}(x_1,\dots,x_d)$ such that the abstract derivatives of $f$ evaluate to the directional derivatives of $f^{\mathcal{A},\tau}$ for every $(\mathcal{A},\tau)$. Since a trace polynomial is smooth in the sense of Fréchet differentiation, the $k$th directional derivatives of a function $f(X_1,\dots,X_d)$ in directions $(Y_1,\dots,Y_k)$ will be multilinear in $(Y_1,\dots,Y_k)$. Hence, the $k$th directional derivatives ought to be given by trace polynomials in $(x_1,\dots,x_d,y_1,\dots,y_k)$ that are multilinear in $(y_1,\dots,y_k)$, which motivates the following definition.

Definition 3.4.

We define $\operatorname{TrP}(x_1,\dots,x_d; y_1,\dots,y_\ell)$ to be the subspace of $\operatorname{TrP}(x_1,\dots,x_d,y_1,\dots,y_\ell)$ consisting of trace polynomials that are linear in each $y_j$, that is, it is the sum of the graded components with grading in $\mathbb{Z}_{\geq 0}^d \times \{1\}^\ell$. An element $f \in \operatorname{TrP}(x_1,\dots,x_d; y_1,\dots,y_\ell)$ will often be denoted $f(x_1,\dots,x_d)[y_1,\dots,y_\ell]$ rather than $f(x_1,\dots,x_d,y_1,\dots,y_\ell)$.

Of course, if $f \in \operatorname{TrP}(x_1,\dots,x_d; y_1,\dots,y_k)$, then $f^{\mathcal{A},\tau}$ defines a map $\mathcal{A}_{\mathrm{sa}}^{d+k} \to \mathcal{A}$ that is multilinear in the last $k$ variables. To define the abstract derivative operators, we start with the case of first-order derivatives.

Lemma 3.5. There is a unique linear operator $\partial_{x_j} : \operatorname{TrP}(x_1,\dots,x_d) \to \operatorname{TrP}(x_1,\dots,x_d; y)$ satisfying
\[
\partial_{x_j}(x_j)[y] = y,
\]
\[
\partial_{x_j}(x_i)[y] = 0 \quad \text{for } i \neq j,
\]
\[
\partial_{x_j}[\operatorname{tr}(p(x))][y] = \operatorname{tr}\bigl(\partial_{x_j}[p(x)][y]\bigr) \quad \text{for } p \in \mathbb{C}\langle x_1,\dots,x_d\rangle,
\]
\[
\partial_{x_j}[f(x)g(x)][y] = \partial_{x_j}f(x)[y]\, g(x) + f(x)\, \partial_{x_j}g(x)[y].
\]

Proof.

First, for a monomial $p(x) = x_{j(1)} \cdots x_{j(k)}$, define
\[
\partial_{x_j} p(x) = \sum_{i : j(i) = j} x_{j(1)} \cdots x_{j(i-1)}\, y\, x_{j(i+1)} \cdots x_{j(k)}.
\]
Since monomials are a basis for $\mathbb{C}\langle x_1,\dots,x_d\rangle$, this extends to a linear operator $\mathbb{C}\langle x_1,\dots,x_d\rangle \to \mathbb{C}\langle x_1,\dots,x_d,y\rangle$. Then observe that if $q$ is cyclically equivalent to $p$, then $\partial_{x_j} q$ is cyclically equivalent to $\partial_{x_j} p$. Thus, $\partial_{x_j}$ also defines a map $\operatorname{tr}(\mathbb{C}\langle x_1,\dots,x_d\rangle) \to \operatorname{tr}(\mathbb{C}\langle x_1,\dots,x_d,y\rangle)$. Recall that a basis for $\operatorname{TrP}(x_1,\dots,x_d)$ is given by elements of the form $\operatorname{tr}(p_1)\cdots\operatorname{tr}(p_n)\,p_0$, where $p_1,\dots,p_n$ are monomials up to cyclic symmetry and $p_0$ is a monomial. Thus, there is a unique linear operator $\operatorname{TrP}(x_1,\dots,x_d) \to \operatorname{TrP}(x_1,\dots,x_d,y)$ satisfying
\[
\partial_{x_j}[\operatorname{tr}(p_1)\cdots\operatorname{tr}(p_n)\,p_0] = \sum_{i=1}^n \operatorname{tr}(\partial_{x_j} p_i) \prod_{i' \in [n] \setminus \{i\}} \operatorname{tr}(p_{i'})\; p_0 + \prod_{i=1}^n \operatorname{tr}(p_i)\; \partial_{x_j} p_0
\]
whenever $p_0,\dots,p_n$ are monomials. We leave it as an exercise to check that this operator $\partial_{x_j}$ satisfies all the desired properties and is uniquely determined by those properties, and moreover that it maps into $\operatorname{TrP}(x_1,\dots,x_d; y)$.

Remark 3.6. The action of $\partial_{x_j}$ can be described in words as “find each occurrence of $x_j$ and replace it by $y$ and then add the resulting trace polynomials.” For instance, with $d = 2$, $j = 1$,
\[
\partial_{x_1}[\operatorname{tr}(x_1 x_2)\operatorname{tr}(x_2)\,x_1^2][y] = \operatorname{tr}(y x_2)\operatorname{tr}(x_2)\,x_1^2 + \operatorname{tr}(x_1 x_2)\operatorname{tr}(x_2)\,y x_1 + \operatorname{tr}(x_1 x_2)\operatorname{tr}(x_2)\,x_1 y.
\]

To define higher order derivatives, note that $\operatorname{TrP}(x_1,\dots,x_d,y_1,\dots,y_k)$ is isomorphic to $\operatorname{TrP}(x_1,\dots,x_{d+k})$, and hence for $j = 1,\dots,d$, we can define
\[
\partial_{x_j} : \operatorname{TrP}(x_1,\dots,x_d,y_1,\dots,y_k) \to \operatorname{TrP}(x_1,\dots,x_d,y_1,\dots,y_k; y_{k+1}),
\]
where $y_{k+1}$ stands for the extra variable $y$ that is introduced when differentiating. In fact, this operator maps
\[
\operatorname{TrP}(x_1,\dots,x_d; y_1,\dots,y_k) \to \operatorname{TrP}(x_1,\dots,x_d; y_1,\dots,y_{k+1}).
\]

Lemma 3.7.

Let $f \in \operatorname{TrP}(x_1,\dots,x_d; y_1,\dots,y_\ell)$, and let $(\mathcal{A},\tau)$ be a tracial $\mathrm{C}^*$-algebra. Then
\[
\partial_{j_k} \cdots \partial_{j_1}(f^{\mathcal{A},\tau})(X_1,\dots,X_d)[Y_1,\dots,Y_{k+\ell}] = (\partial_{x_{j_k}} \cdots \partial_{x_{j_1}} f)^{\mathcal{A},\tau}(X_1,\dots,X_d)[Y_1,\dots,Y_{k+\ell}]
\]
for $X_1,\dots,X_d, Y_1,\dots,Y_{k+\ell} \in \mathcal{A}_{\mathrm{sa}}$.

Proof. By induction, it suffices to prove the case where $k = 1$. Then, since a function in $\operatorname{TrP}(x_1,\dots,x_d; y_1,\dots,y_\ell)$ can be viewed as a function of $d + \ell$ variables, we can assume without loss of generality that $\ell = 0$ by changing $d$ if necessary. Hence, it suffices to show that for $f \in \operatorname{TrP}(x_1,\dots,x_d)$,
\[
\partial_j(f^{\mathcal{A},\tau})(X_1,\dots,X_d)[Y] = (\partial_{x_j} f)^{\mathcal{A},\tau}(X_1,\dots,X_d)[Y].
\]
The two sides of the equation agree when $f(x_1,\dots,x_d) = x_i$ for some $i$, hence they agree for non-commutative monomials using the Leibniz rule and for non-commutative polynomials by linearity. Then because both $\partial_{x_j}$ and the directional derivative operations commute with the application of the trace, the relation also holds for $f \in \operatorname{tr}(\mathbb{C}\langle x_1,\dots,x_d\rangle)$. Finally, by the Leibniz rule, it extends to all of $\operatorname{TrP}(x_1,\dots,x_d)$.

3.2 The spaces $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$

Now we are ready to define a certain non-commutative analog of $C^k$ functions. These are, roughly speaking, functions whose derivatives up to order $k$ can be approximated by trace polynomials. But we must first decide what norm to use for the approximation, and there are many possible choices. Thus, we will first give some motivation for our definitions. What is most important is for the resulting function spaces to have good closure properties; for instance, closure under addition, multiplication, and more generally composition.

The first derivative of a trace polynomial $f$ in $(x_1,\dots,x_d)$ is a trace polynomial in $(x_1,\dots,x_d,y)$ that is linear in $y$. Thus, $\partial_{x_j} f(X_1,\dots,X_d)$ defines a linear map $\mathcal{A} \to \mathcal{A}$ for each tracial $\mathrm{C}^*$-algebra $\mathcal{A}$ and $X_1,\dots,X_d$ in $\mathcal{A}_{\mathrm{sa}}$.
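The statement of Lemma 3.7 can be sanity-checked numerically in a matrix algebra. The following is a minimal sketch, assuming $(\mathcal{A},\tau) = (M_2(\mathbb{C}), \operatorname{tr}_2)$ with the normalized trace and the illustrative trace polynomial $f = \operatorname{tr}(x_1 x_2)\operatorname{tr}(x_2)\,x_1^2$; the helper names (`matmul`, `axpy`, `scale`, `tau`, `d1_f`) and the matrices are hypothetical choices made for this example. It compares the abstract derivative, obtained by replacing each occurrence of $x_1$ by $y$ and adding, against a central finite difference of the directional derivative (3.1).

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def axpy(a, b, s=1.0):
    """Return a + s * b entrywise."""
    n = len(a)
    return [[a[i][j] + s * b[i][j] for j in range(n)] for i in range(n)]

def scale(s, a):
    """Multiply every entry of a by the scalar s."""
    return [[s * v for v in row] for row in a]

def tau(a):
    """Normalized trace."""
    return sum(a[i][i] for i in range(len(a))) / len(a)

def f(x1, x2):
    """Evaluate f = tr(x1 x2) tr(x2) x1^2."""
    return scale(tau(matmul(x1, x2)) * tau(x2), matmul(x1, x1))

def d1_f(x1, x2, y):
    """Abstract derivative in x1: replace each occurrence of x1 by y and add."""
    t2 = tau(x2)
    from_trace = scale(tau(matmul(y, x2)) * t2, matmul(x1, x1))
    from_square = scale(tau(matmul(x1, x2)) * t2,
                        axpy(matmul(y, x1), matmul(x1, y)))
    return axpy(from_trace, from_square)

# Illustrative self-adjoint matrices (arbitrary choices).
X1 = [[1.0, 2 - 1j], [2 + 1j, -0.5]]
X2 = [[0.3, 1j], [-1j, 2.0]]
Y = [[0.5, 1 + 2j], [1 - 2j, -1.0]]

# Central finite difference approximating d/dt f(X1 + tY, X2) at t = 0.
t = 1e-5
central = scale(1 / (2 * t),
                axpy(f(axpy(X1, Y, t), X2), f(axpy(X1, Y, -t), X2), -1.0))
exact = d1_f(X1, X2, Y)
error = max(abs(central[i][j] - exact[i][j])
            for i in range(2) for j in range(2))
print(error)
```

The printed discrepancy should be small, limited only by the finite-difference step and rounding. A check of this kind does not replace the algebraic proof; it only illustrates that the abstract operator $\partial_{x_1}$ evaluates to the directional derivative in one concrete $(\mathcal{A},\tau)$.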
It is natural to consider the norm of $\partial_{x_j} f(X_1,\dots,X_d)$ as a linear map with respect to the operator norm of $\mathcal{A}$. However, $\mathcal{A}$ also has a 2-norm with respect to the trace (Definition 2.8). The 2-norm is important in the study of von Neumann algebras since it allows us to apply Hilbert space theory. Moreover, the 2-norm on $M_n(\mathbb{C})$ is a rescaling of the standard Euclidean norm on $M_n(\mathbb{C}) \cong \mathbb{C}^{n^2}$. Thus, we also want to take into consideration the norm of $\partial_{x_j} f(X_1,\dots,X_d)$ with respect to the 2-norm,
\[
\sup\{\|\partial_{x_j} f(X_1,\dots,X_d)[Y]\|_2 : \|Y\|_2 \leq 1\}.
\]

Higher order derivatives will be multilinear forms $\mathcal{A}_{\mathrm{sa}}^k \to \mathcal{A}$. For instance, one term might be the multilinear form $f(x_1,x_2)[y_1,y_2,y_3] = x_1 y_1 x_2 x_1 y_2 y_3$. If $X_1, X_2 \in \mathcal{A}_{\mathrm{sa}}$, then $f(X_1,X_2)$ will not be bounded as a map from $(\mathcal{A}_{\mathrm{sa}}, \|\cdot\|_2)^3 \to (\mathcal{A}, \|\cdot\|_2)$. However, by the non-commutative Hölder's inequality (Lemma 2.9), if $\alpha, \alpha_1, \alpha_2, \alpha_3 \in [1,\infty]$ satisfy $1/\alpha = 1/\alpha_1 + 1/\alpha_2 + 1/\alpha_3$, then we have
\[
\|X_1 Y_1 X_2 X_1 Y_2 Y_3\|_\alpha \leq \|X_1\|_\infty^2 \|X_2\|_\infty \|Y_1\|_{\alpha_1} \|Y_2\|_{\alpha_2} \|Y_3\|_{\alpha_3},
\]
where $\|Y\|_\alpha = \tau((Y^*Y)^{\alpha/2})^{1/\alpha}$ for $\alpha < \infty$ and $\|Y\|_\infty$ is the operator norm.

These considerations will lead to the definition of the space $C^k_{\mathrm{tr}}(\mathbb{R}^{*d})$, which we think of as an analog of the classical space $C^k(\mathbb{R}^d)$. Before explaining the formal definition, let us first discuss the notation and type of object we aim to describe. The symbol $\mathbb{R}^{*d}$ does not have a literal meaning, but it expresses the idea of a function of $d$ free real (that is, self-adjoint) variables. The derivatives of these functions will live in certain spaces of functions of self-adjoint variables which output multilinear forms. Thus, for instance for $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})$, the total derivative $\partial^k f$ will define for each $(\mathcal{A},\tau)$ a function of $d$-tuples $\mathbf{X}, \mathbf{Y}_1,\dots,\mathbf{Y}_k$ which is real-multilinear in the last $k$ arguments (i.e. a $k$-multilinear function of $\mathbf{Y}_1,\dots,\mathbf{Y}_k$ that depends on $\mathbf{X}$). Here, for the sake of compact notation, we want to denote a tuple $(X_1,\dots,X_d) \in \mathcal{A}_{\mathrm{sa}}^d$ by a single letter $\mathbf{X}$, akin to the common notation for vectors in $\mathbb{R}^d$. Thus the derivative $\partial^k f$ will collect all the partial derivatives of $f$ of order $k$ (discussed in the previous section) into a single gadget.

Although in many applications the variables $\mathbf{X}$ and $\mathbf{Y}_1,\dots,\mathbf{Y}_\ell$ will be vectors with the same number of components, we will on some occasions need each of them to have a different number of components. The space $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ will describe functions which assign to each $(\mathcal{A},\tau)$ and each $\mathbf{X}$ in $\mathcal{A}_{\mathrm{sa}}^d$ a multilinear form $\mathcal{A}_{\mathrm{sa}}^{d_1} \times \cdots \times \mathcal{A}_{\mathrm{sa}}^{d_\ell} \to \mathcal{A}^{d'}$.

The entries of the output vector are not restricted to be self-adjoint; thus, this is the non-commutative analog of functions from $\mathbb{R}^d$ to the space of $\mathbb{R}$-multilinear maps $\mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_\ell} \to \mathbb{C}^{d'}$. Moreover, just as an $\mathbb{R}$-multilinear map $\mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_\ell} \to \mathbb{C}^{d'}$ extends to a unique $\mathbb{C}$-multilinear map $\mathbb{C}^{d_1} \times \cdots \times \mathbb{C}^{d_\ell} \to \mathbb{C}^{d'}$, any $\mathbb{R}$-multilinear map $\mathcal{A}_{\mathrm{sa}}^{d_1} \times \cdots \times \mathcal{A}_{\mathrm{sa}}^{d_\ell} \to \mathcal{A}^{d'}$ extends uniquely to a $\mathbb{C}$-multilinear map $\mathcal{A}^{d_1} \times \cdots \times \mathcal{A}^{d_\ell} \to \mathcal{A}^{d'}$. We will define norms of multilinear forms using the “complexified” versions since they are slightly better behaved (although it only makes a difference up to a constant factor). Now let us give the precise definitions.

Definition 3.8.

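If $\Lambda : \mathcal{A}^{d_1} \times \cdots \times \mathcal{A}^{d_\ell} \to \mathcal{A}^{d'}$ is a $\mathbb{C}$-multilinear form and $\alpha, \alpha_1,\dots,\alpha_\ell \in [1,\infty]$, then we define
\[
\|\Lambda\|_{\alpha;\alpha_1,\dots,\alpha_\ell} = \sup\{\|\Lambda[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell]\|_\alpha : \mathbf{Y}_1 \in \mathcal{A}^{d_1},\dots,\mathbf{Y}_\ell \in \mathcal{A}^{d_\ell},\ \|\mathbf{Y}_1\|_{\alpha_1} \leq 1,\dots,\|\mathbf{Y}_\ell\|_{\alpha_\ell} \leq 1\}.
\]
We also define
\[
\|\Lambda\|_{\mathscr{M}^\ell,\mathrm{tr}} = \sup\{\|\Lambda\|_{\alpha;\alpha_1,\dots,\alpha_\ell} : \alpha^{-1} = \alpha_1^{-1} + \cdots + \alpha_\ell^{-1}\}.
\]
Note that in the case $\ell = 0$, the multilinear form reduces to an element of $\mathcal{A}^{d'}$ and $\|\Lambda\|_{\mathscr{M}^0,\mathrm{tr}} = \|\Lambda\|_\infty$.

Observation 3.9. Every $\mathbf{Y} \in \mathcal{A}^d$ can be written uniquely as $\operatorname{Re}(\mathbf{Y}) + i \operatorname{Im}(\mathbf{Y})$, where $\operatorname{Re}(\mathbf{Y})$ and $\operatorname{Im}(\mathbf{Y}) \in \mathcal{A}_{\mathrm{sa}}^d$, and we have $\|\operatorname{Re}(\mathbf{Y})\|_\alpha, \|\operatorname{Im}(\mathbf{Y})\|_\alpha \leq \|\mathbf{Y}\|_\alpha$. Therefore, expanding each argument into its real and imaginary parts, we have
\[
2^{-\ell} \|\Lambda\|_{\alpha;\alpha_1,\dots,\alpha_\ell} \leq \sup\{\|\Lambda[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell]\|_\alpha : \mathbf{Y}_1 \in \mathcal{A}_{\mathrm{sa}}^{d_1},\dots,\mathbf{Y}_\ell \in \mathcal{A}_{\mathrm{sa}}^{d_\ell},\ \|\mathbf{Y}_1\|_{\alpha_1} \leq 1,\dots,\|\mathbf{Y}_\ell\|_{\alpha_\ell} \leq 1\} \leq \|\Lambda\|_{\alpha;\alpha_1,\dots,\alpha_\ell}.
\]

Definition 3.10.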

Suppose that $(\mathcal{A},\tau)$ is a tracial $\mathrm{C}^*$-algebra and $f : \mathcal{A}_{\mathrm{sa}}^d \times \mathcal{A}_{\mathrm{sa}}^{d_1} \times \cdots \times \mathcal{A}_{\mathrm{sa}}^{d_\ell} \to \mathcal{A}^{d'}$ is a function that is real-multilinear in the last $\ell$ arguments. Then we define
\[
\|f\|_{\mathscr{M}^\ell,\mathrm{tr},R} = \sup\{\|f(\mathbf{X})\|_{\mathscr{M}^\ell,\mathrm{tr}} : \mathbf{X} \in \mathcal{A}_{\mathrm{sa}}^d,\ \|\mathbf{X}\|_\infty \leq R\}.
\]
In the case $\ell = 0$, we write it simply as $\|f\|_{\mathrm{tr},R}$.

The seminorm of a function $f$ in $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ with radius $R$ will be defined below essentially as the supremum of $\|f^{\mathcal{A},\tau}\|_{\mathscr{M}^\ell,\mathrm{tr},R}$ over tracial $\mathrm{C}^*$-algebras $(\mathcal{A},\tau)$, but there is a small technical issue that the classes of tracial $\mathrm{C}^*$-algebras and of tracial $\mathrm{W}^*$-algebras are not sets. However, this issue is easily resolved as follows (for a moment, we assume a greater background knowledge about operator algebras): There does exist a set $\mathbb{W}$ of isomorphism class representatives for tracial $\mathrm{W}^*$-algebras that are separable in $\sigma$-WOT. This is because a separable tracial $\mathrm{W}^*$-algebra with a choice of a countable set of self-adjoint generators is equivalent to a non-commutative law in countably many variables, that is, a unital, positive, tracial, exponentially bounded linear map $\mathbb{C}\langle x_j : j \in \mathbb{N}\rangle \to \mathbb{C}$. These linear functionals evidently form a set. Isomorphism between the $\mathrm{W}^*$-algebras defines an equivalence relation on the space of laws, hence we can define $\mathbb{W}$ as the set of equivalence classes. Of course, if we take the supremum over separable tracial $\mathrm{W}^*$-algebras, the supremum is the same as if we used all tracial $\mathrm{W}^*$-algebras, since $\|f(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell]\|_\alpha$ can be evaluated using only the $\sigma$-WOT-separable subalgebra $\mathrm{W}^*(\mathbf{X}; \mathbf{Y}_1,\dots,\mathbf{Y}_\ell)$ and its trace. Moreover, it is the same as the supremum over all tracial $\mathrm{C}^*$-algebras, since any tracial $\mathrm{C}^*$-algebra can be completed to a tracial $\mathrm{W}^*$-algebra through the Gelfand-Naimark-Segal construction.

Definition 3.11.

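We denote by $\operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ the vector space of $d'$-tuples $g$ of trace polynomials in the variables
\[
\mathbf{x} = (x_1,\dots,x_d), \quad \mathbf{y}_1 = (y_{1,1},\dots,y_{1,d_1}), \quad \dots, \quad \mathbf{y}_\ell = (y_{\ell,1},\dots,y_{\ell,d_\ell})
\]
that are multilinear in $\mathbf{y}_1,\dots,\mathbf{y}_\ell$ (as above).

We observe that for every $g \in \operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$, we have
\[
\sup_{(\mathcal{A},\tau) \in \mathbb{W}} \|g^{\mathcal{A},\tau}\|_{\mathscr{M}^\ell,\mathrm{tr},R} < \infty.
\]
To verify this, it suffices to check the case $d' = 1$. By linearity, we reduce to the case where $g = p_0 \operatorname{tr}(p_1) \cdots \operatorname{tr}(p_n)$, where $p_0,\dots,p_n$ are non-commutative monomials in $\mathbf{x} = (x_1,\dots,x_d)$ and $\mathbf{y}_1,\dots,\mathbf{y}_\ell$, such that each $\mathbf{y}_j$ occurs exactly once in the entire expression. When evaluating this function on $\mathbf{X}$ and $\mathbf{Y}_1 \in \mathcal{A}_{\mathrm{sa}}^{d_1},\dots,\mathbf{Y}_\ell \in \mathcal{A}_{\mathrm{sa}}^{d_\ell}$ for some $(\mathcal{A},\tau) \in \mathbb{W}$, one estimates the result by applying the non-commutative Hölder's inequality to $\tau(p_i)$ for each $i$, using $\|\mathbf{Y}_j\|_{\alpha_j}$ and $\|\mathbf{X}\|_\infty$ for each occurrence of $X_i$ (and $\|\mathbf{X}\|_\infty$ in turn is bounded by $R$).

Definition 3.12.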

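We define $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ as the set of tuples $(f^{\mathcal{A},\tau})_{(\mathcal{A},\tau) \in \mathbb{W}}$ of functions $f^{\mathcal{A},\tau} : \mathcal{A}_{\mathrm{sa}}^d \times \mathcal{A}_{\mathrm{sa}}^{d_1} \times \cdots \times \mathcal{A}_{\mathrm{sa}}^{d_\ell} \to \mathcal{A}^{d'}$ that are real-multilinear in the last $\ell$ variables and such that for every $R > 0$ and $\epsilon > 0$, there exists a $d'$-tuple $g \in \operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ such that
\[
\sup_{(\mathcal{A},\tau) \in \mathbb{W}} \|f^{\mathcal{A},\tau} - g^{\mathcal{A},\tau}\|_{\mathscr{M}^\ell,\mathrm{tr},R} < \epsilon.
\]
We also define
\[
\|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'},R} = \sup_{(\mathcal{A},\tau) \in \mathbb{W}} \|f^{\mathcal{A},\tau}\|_{\mathscr{M}^\ell,\mathrm{tr},R}.
\]
Since writing out $\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}$ is rather cumbersome, we will also use the shorthand $\|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'},R}$ when the dimensions $d_1,\dots,d_\ell$ are understood from context. Finally, we write
\[
C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^\ell(\mathbb{R}^{*d})) = C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\underbrace{\mathbb{R}^{*d},\dots,\mathbb{R}^{*d}}_{\ell})).
\]

Evidently, there is a canonical linear map
\[
\operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'} \to C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}.
\]
In fact, this map is injective. For any trace polynomial $f$, it makes sense to evaluate $f^{M_N(\mathbb{C}),\operatorname{tr}_N}$ on arbitrary matrix $d$-tuples (not necessarily self-adjoint), although this does not respect the $*$-operation. Let $\mathcal{B}$ be an orthonormal basis for $M_N(\mathbb{C})^d_{\mathrm{sa}}$ as a real inner product space, hence also an orthonormal basis for $M_N(\mathbb{C})^d$ as a complex inner product space. For any trace polynomial $f$ and $b \in \mathcal{B}$, the function $g(\mathbf{X}) = \langle b, f^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X}) \rangle_{\operatorname{tr}_N}$ is a complex analytic function in the coefficients $z_b = \langle b, \mathbf{X} \rangle$. Hence, by analytic continuation, it is uniquely determined by the values of $g$ when $z_b \in \mathbb{R}$, that is, by $g$ restricted to self-adjoint $d$-tuples. Since this is true for each basis element $b$, we see that if $f^{M_N(\mathbb{C}),\operatorname{tr}_N} = 0$ for self-adjoint $\mathbf{X}$, then it is zero for arbitrary $d$-tuples of $N \times N$ matrices. If a trace polynomial $f$ satisfies $f^{M_N(\mathbb{C}),\operatorname{tr}_N} = 0$ for all $N$, then $f$ must equal zero by [65, Corollary 4.4]. Hence if $f^{\mathcal{A},\tau} = g^{\mathcal{A},\tau}$ for all $(\mathcal{A},\tau) \in \mathbb{W}$, then $f = g$ as trace polynomials, which is what we wanted to prove. While this is not essential to any of our main results, it is notationally and conceptually convenient to treat $\operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ as a dense subspace of $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$.

The following observations are straightforward exercises:

• $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$ is a Fréchet space with respect to the family of seminorms $\|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell),R}$ for $R > 0$ (it suffices to use a sequence of values of $R$ which tends to $\infty$).

• If $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$, then it makes sense to evaluate $f$ on $(\mathbf{X}, \mathbf{Y}_1,\dots,\mathbf{Y}_\ell) \in \mathcal{A}_{\mathrm{sa}}^d \times \mathcal{A}^{d_1} \times \cdots \times \mathcal{A}^{d_\ell}$ for any tracial $\mathrm{C}^*$-algebra $(\mathcal{A},\tau)$. Indeed, we restrict to the $\mathrm{C}^*$-algebra generated by $\mathbf{X}$ and $\mathbf{Y}_1,\dots,\mathbf{Y}_\ell$, then complete it to a tracial $\mathrm{W}^*$-algebra.

• Given such an $(\mathcal{A},\tau)$ and $\mathbf{X}, \mathbf{Y}_1,\dots,\mathbf{Y}_\ell$, the evaluation $f(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell]$ is always a $d'$-tuple from the $\mathrm{C}^*$-algebra generated by $\mathbf{X}, \mathbf{Y}_1,\dots,\mathbf{Y}_\ell$ because $f$ can be approximated in $\|\cdot\|_{\mathscr{M}^\ell,\mathrm{tr},R}$ by trace polynomials. Moreover, the value of $f(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell]$ only depends on $\tau|_{\mathrm{C}^*(\mathbf{X},\mathbf{Y}_1,\dots,\mathbf{Y}_\ell)}$.

• There is a unique $*$-operation on $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ that is continuous and extends the $*$-operation on trace polynomials. This is given by
\[
(f^*)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell] = \bigl(f^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1^*,\dots,\mathbf{Y}_\ell^*]\bigr)^*.
\]
Moreover, the $*$-operation is isometric with respect to each of the seminorms $\|\cdot\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'},R}$ for $R > 0$.

Definition 3.13.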

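For $k \in \mathbb{N} \cup \{\infty\}$, we define $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^\ell)^{d'}$ as the set of tuples $f = (f^{\mathcal{A},\tau})_{(\mathcal{A},\tau) \in \mathbb{W}}$ such that for $k' \leq k$, there exists a function
\[
f_{k'} \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}, \underbrace{\mathbb{R}^{*d},\dots,\mathbb{R}^{*d}}_{k'}))^{d'}
\]
such that for every $(\mathcal{A},\tau) \in \mathbb{W}$, for $\mathbf{X} \in \mathcal{A}_{\mathrm{sa}}^d$, $\mathbf{Y}_1 \in \mathcal{A}_{\mathrm{sa}}^{d_1},\dots,\mathbf{Y}_\ell \in \mathcal{A}_{\mathrm{sa}}^{d_\ell}$, and $\mathbf{Y}_{\ell+1},\dots,\mathbf{Y}_{\ell+k'} \in \mathcal{A}_{\mathrm{sa}}^d$, we have
\[
\frac{d}{dt_{k'}}\Big|_{t_{k'}=0} \cdots \frac{d}{dt_1}\Big|_{t_1=0} f^{\mathcal{A},\tau}(\mathbf{X} + t_1 \mathbf{Y}_{\ell+1} + \cdots + t_{k'} \mathbf{Y}_{\ell+k'})[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell] = f_{k'}^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_{\ell+k'}].
\]
In other words, each iterated directional derivative of $f$ in some $(\mathcal{A},\tau)$ is given by some function in $C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+k'})^{d'}$, which is independent of the choice of $(\mathcal{A},\tau)$. For each $k' \leq k$, the function $f_{k'}$ is uniquely determined, and we will denote this function by $\partial^{k'} f$.

The following observations are immediate:

• If $f = (f^{\mathcal{A},\tau})_{(\mathcal{A},\tau) \in \mathbb{W}} \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$, and if $k' \leq k$, then $\partial^{k'} f$ is an element of $C^{k-k'}_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+k'})^{d'}$.

• Every element of $\operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ defines an element of $C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$.

• $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ is a Fréchet space with the topology given by the seminorms $\|\partial^{k'} f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+k'})^{d'},R}$ for $R > 0$ and $k' \leq k$.

• If $k' \leq k$, then $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'} \subseteq C^{k'}_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$, and the inclusion map is continuous.

• If $\tilde{d} \leq d$, then there is a continuous inclusion $C^k_{\mathrm{tr}}(\mathbb{R}^{*\tilde{d}}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'} \to C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ given by sending $f$ to the function $(X_1,\dots,X_d) \mapsto f(X_1,\dots,X_{\tilde{d}})$.

It is often convenient to work with bounded functions so as not to worry about growth conditions at $\infty$. Thus, we define the following $BC^k_{\mathrm{tr}}$ spaces.

Definition 3.14.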

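For $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$, we define
\[
\|f\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}} := \sup_{R > 0} \|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'},R}.
\]
For $k \in \mathbb{N} \cup \{\infty\}$, we define $BC^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$ as the set of $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$ such that
\[
\|\partial^{k'} f\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+k'})^{d'}} < \infty \quad \text{for } k' \in \mathbb{N} \text{ with } k' \leq k.
\]

We equip $BC^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ with the topology given by these seminorms. If $k < \infty$, there are only finitely many of these seminorms, so the topology is given by a single norm. Note that this topology on $BC^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$ is stronger than the subspace topology from $C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$. Moreover, $BC^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$ is a Banach space for $k \in \mathbb{N}$ and a Fréchet space for $k = \infty$.

Remark 3.15. At this point, it may not be clear whether there are any nontrivial functions in $BC^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$. However, it turns out that these functions are quite abundant. It follows from Proposition 4.13 below that if $\phi : \mathbb{R} \to \mathbb{R}$ is a function whose Fourier transform satisfies $\int_{\mathbb{R}} |s^n \widehat{\phi}(s)|\,ds < \infty$ for all $n$, then an element of $BC^\infty_{\mathrm{tr}}(\mathbb{R})$ is defined by applying $\phi$ to self-adjoint operators through functional calculus. Furthermore, it follows from Theorem 3.19 below that $BC^\infty_{\mathrm{tr}}$ functions are closed under composition (hence also under multiplication). Moreover, if $f \in BC^\infty_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$, then so is $\operatorname{tr}(f)$.

3.3 Continuity and differentiability properties

Functions in $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ have the following continuity property, which is a type of uniform continuity for $\mathbf{X}$ in the $\|\cdot\|_\infty$-ball of radius $R$.

Lemma 3.16.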

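Let $f = (f^{\mathcal{A},\tau})_{(\mathcal{A},\tau) \in \mathbb{W}} \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$. Then for every $R > 0$ and $\epsilon > 0$, there exists a $\delta > 0$ such that for every $(\mathcal{A},\tau) \in \mathbb{W}$, if $\mathbf{X}$ and $\mathbf{X}' \in \mathcal{A}_{\mathrm{sa}}^d$ with $\|\mathbf{X}\|_\infty \leq R$ and $\|\mathbf{X}'\|_\infty \leq R$ and $\|\mathbf{X} - \mathbf{X}'\|_\infty < \delta$, then $\|f^{\mathcal{A},\tau}(\mathbf{X}) - f^{\mathcal{A},\tau}(\mathbf{X}')\|_{\mathscr{M}^\ell,\mathrm{tr}} < \epsilon$.

Proof. First, consider the case where $f \in \operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$. Let $\mathbf{X}$ and $\mathbf{X}'$ be self-adjoint $d$-tuples from $(\mathcal{A},\tau)$ with $\|\mathbf{X}\|_\infty \leq R$ and $\|\mathbf{X}'\|_\infty \leq R$ and $\|\mathbf{X} - \mathbf{X}'\|_\infty < \delta$. Let $\alpha, \alpha_1,\dots,\alpha_\ell \in [1,\infty]$ with $1/\alpha = 1/\alpha_1 + \cdots + 1/\alpha_\ell$, and let $\mathbf{Y}_1 \in \mathcal{A}^{d_1},\dots,\mathbf{Y}_\ell \in \mathcal{A}^{d_\ell}$ with $\|\mathbf{Y}_j\|_{\alpha_j} \leq 1$. It follows from Lemma 3.7 that
\[
\frac{d}{dt} f^{\mathcal{A},\tau}((1-t)\mathbf{X} + t\mathbf{X}')[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell] = (\partial f)^{\mathcal{A},\tau}((1-t)\mathbf{X} + t\mathbf{X}')[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell, \mathbf{X}' - \mathbf{X}].
\]
Since $\|(1-t)\mathbf{X} + t\mathbf{X}'\|_\infty \leq R$ for $t \in [0,1]$, we have
\[
\|(\partial f)^{\mathcal{A},\tau}((1-t)\mathbf{X} + t\mathbf{X}')[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell, \mathbf{X}' - \mathbf{X}]\|_\alpha \leq \|\partial f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+1})^{d'},R} \|\mathbf{Y}_1\|_{\alpha_1} \cdots \|\mathbf{Y}_\ell\|_{\alpha_\ell} \|\mathbf{X}' - \mathbf{X}\|_\infty \leq \|\partial f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+1})^{d'},R}\, \delta.
\]
Hence, integrating over $t \in [0,1]$,
\[
\|f^{\mathcal{A},\tau}(\mathbf{X}')[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell] - f^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell]\|_\alpha \leq \|\partial f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+1})^{d'},R}\, \delta.
\]
This implies the desired uniform continuity property for $f \in \operatorname{TrP}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$.

In general, if $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$, then there is a sequence of trace polynomials $f^{(n)}$ that converge to $f$ in $C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$. For a given $R > 0$, this implies that $f^{(n)} \to f$ with respect to $\|\cdot\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'},R}$. The uniform continuity property asserted in the lemma holds for $f$ by the principle that uniform continuity is preserved under uniform limits.

Next, we discuss how the non-commutative derivatives defined in this paper relate to the more standard notions of Fréchet differentiation for functions between Banach spaces. While this discussion is of interest in its own right, it is also helpful for our proof of the chain rule in the next section, since it allows us to deduce properties of $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$ from the better known properties of Fréchet derivatives.

Let $\mathcal{X}$ and $\mathcal{Y}$ be Banach spaces over $\mathbb{R}$, and let $f : \mathcal{X} \to \mathcal{Y}$. We say that $f$ is Fréchet-differentiable at $x_0 \in \mathcal{X}$ if there is a bounded linear map $T : \mathcal{X} \to \mathcal{Y}$ such that
\[
\lim_{x \to x_0} \frac{\|f(x) - f(x_0) - T(x - x_0)\|}{\|x - x_0\|} = 0.
\]
This $T$ is unique and is denoted $Df(x_0)$. We say that $f$ is Fréchet-$C^1$ if $f$ is Fréchet-differentiable at every point and $x \mapsto Df(x)$ is a continuous function $\mathcal{X} \to L(\mathcal{X},\mathcal{Y})$, where $L(\mathcal{X},\mathcal{Y})$ is the Banach space of bounded linear transformations $\mathcal{X} \to \mathcal{Y}$. By induction, we say that $f$ is Fréchet-$C^k$ if it is Fréchet-differentiable at every point and $Df$ is Fréchet-$C^{k-1}$. We say that $f$ is Fréchet-$C^\infty$ if it is Fréchet-$C^k$ for every $k \in \mathbb{N}$.

If $f$ is Fréchet-$C^k$, then the $k$th-order Fréchet derivatives $D^k f$ are multilinear maps $\mathcal{X}^k \to \mathcal{Y}$ defined as follows. For $k = 2$, note that $D(Df)(x)$ is an element of $L(\mathcal{X}, L(\mathcal{X},\mathcal{Y}))$. But a linear map from $\mathcal{X}$ to $L(\mathcal{X},\mathcal{Y})$ is equivalent to a bilinear map $\mathcal{X} \times \mathcal{X} \to \mathcal{Y}$. The operator norm on $L(\mathcal{X}, L(\mathcal{X},\mathcal{Y}))$ agrees with the norm on bilinear forms
\[
\|\Lambda\| = \sup\{\|\Lambda[x_1,x_2]\| : \|x_1\|, \|x_2\| \leq 1\}.
\]
In a similar way, let $\mathscr{M}^k(\mathcal{X},\mathcal{Y})$ be the space of $k$-linear forms $\mathcal{X}^k \to \mathcal{Y}$. Then the $k$-fold application of $D$ to a Fréchet-$C^k$ function $f$ produces a function $D^k f$ from $\mathcal{X}$ to $\mathscr{M}^k(\mathcal{X},\mathcal{Y})$.

The spaces $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ can be described alternatively as follows.

Lemma 3.17.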

Let $f = (f^{\mathcal{A},\tau})_{(\mathcal{A},\tau) \in \mathbb{W}}$ be a tuple of functions $\mathcal{A}_{\mathrm{sa}}^d \times \mathcal{A}_{\mathrm{sa}}^{d_1} \times \cdots \times \mathcal{A}_{\mathrm{sa}}^{d_\ell} \to \mathcal{A}^{d'}$ that is multilinear in the last $\ell$ variables. Then $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ if and only if the following hold:

(1) For each $(\mathcal{A},\tau)$, $f^{\mathcal{A},\tau}$ is a Fréchet-$C^k$ function $\mathcal{A}_{\mathrm{sa}}^d \to \mathscr{M}^\ell(\mathcal{A}_{\mathrm{sa}}, \mathcal{A}^{d'})$, where $\mathcal{A}_{\mathrm{sa}}^d$ and $\mathcal{A}^{d'}$ are viewed as Banach spaces with respect to $\|\cdot\|_\infty$.

(2) For $k' \leq k$, there exists $f_{k'} \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}, \underbrace{\mathbb{R}^{*d},\dots,\mathbb{R}^{*d}}_{k'}))^{d'}$ such that for all $(\mathcal{A},\tau) \in \mathbb{W}$, $D^{k'}(f^{\mathcal{A},\tau}) = f_{k'}^{\mathcal{A},\tau}$.

Proof. Suppose that $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$. By Definition 3.13, this means that all the iterated directional derivatives of order $k' \leq k$ exist and are given by functions $f_{k'}$ in $C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+k'})^{d'}$. Now observe that for each $(\mathcal{A},\tau)$, the function $f_{k'}^{\mathcal{A},\tau}$ defines a continuous map from $\mathcal{A}_{\mathrm{sa}}^d$ to the space of multilinear forms
\[
\mathcal{A}_{\mathrm{sa}}^{d_1} \times \cdots \times \mathcal{A}_{\mathrm{sa}}^{d_\ell} \times \mathcal{A}_{\mathrm{sa}}^d \times \cdots \times \mathcal{A}_{\mathrm{sa}}^d \to \mathcal{A}^{d'}
\]
endowed with $\|\cdot\|_{\infty;\infty,\dots,\infty}$. This follows from Lemma 3.16 because for a multilinear form $\Lambda$ with $\ell + k'$ arguments, we have $\|\Lambda\|_{\infty;\infty,\dots,\infty} \leq \|\Lambda\|_{\mathscr{M}^{\ell+k'},\mathrm{tr}}$. Once we have this continuity, it is a standard argument to show that $f^{\mathcal{A},\tau}$ is Fréchet-$C^k$; this is a generalization of the well-known fact from multivariable calculus that if a function has continuous iterated directional derivatives up to order $k$, then it is $C^k$.

The converse direction of the lemma is immediate. Indeed, the combination of statements (1) and (2) is stronger than Definition 3.13 since Fréchet-differentiability implies the existence of directional derivatives.

The equality of mixed partials generalizes to the setting of Fréchet differentiation: If $f$ is a Fréchet-$C^k$ function, then $D^k f$ is a symmetric multilinear form, that is, it is invariant under permutation of the arguments. For $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$ and $\sigma$ in the symmetric group $\operatorname{Perm}(\ell)$, we denote by $f_\sigma \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{\sigma^{-1}(1)}},\dots,\mathbb{R}^{*d_{\sigma^{-1}(\ell)}}))^{d'}$ the function given by
\[
(f_\sigma)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_\ell] = f^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_{\sigma^{-1}(1)},\dots,\mathbf{Y}_{\sigma^{-1}(\ell)}].
\]
This defines a right action of $\operatorname{Perm}(\ell)$ on $C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d'}$, and this action is isometric for each seminorm $\|\cdot\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^\ell),R}$.

Equality of mixed partials means that if $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d'}$, then $(\partial^k f)_\sigma = \partial^k f$ for every permutation $\sigma$ that only affects the last $k$ elements (that is, the indices corresponding to the multilinear arguments introduced by differentiation).

In this section, we will discuss composition of functions in $C^{k,\ell}_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}$ and the chain rule. The first lemma describes composition in our spaces of non-commutative continuous functions.

Lemma 3.18.

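Let $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d'}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_n}))^{d''}$ for some $n, d' \in \mathbb{N}$ and $d'', d_1,\dots,d_n \in \mathbb{N}$. Let $g \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$ for some $d \in \mathbb{N}$. For each $m = 1,\dots,n$, let $h_m \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{m,1}},\dots,\mathbb{R}^{*d_{m,\ell_m}}))^{d_m}$ for some $\ell_m \in \mathbb{N}$ and $d_{m,1},\dots,d_{m,\ell_m}$. Let $L_m = \ell_1 + \cdots + \ell_m$. Then there exists a (unique) function
\[
f(g)[h_1,\dots,h_n] \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{1,1}},\dots,\mathbb{R}^{*d_{1,\ell_1}},\dots\dots,\mathbb{R}^{*d_{n,1}},\dots,\mathbb{R}^{*d_{n,\ell_n}}))^{d''}
\]
given by
\[
(f(g)[h_1,\dots,h_n])^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_{L_n}] := f^{\mathcal{A},\tau}(g^{\mathcal{A},\tau}(\mathbf{X}))\bigl[h_1^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_{L_1}],\dots,h_n^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_{L_{n-1}+1},\dots,\mathbf{Y}_{L_n}]\bigr].
\]
Moreover, if we fix $R > 0$ and if $R' = \|g\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'},R}$, then
\[
\|f(g)[h_1,\dots,h_n]\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{L_n})^{d''},R} \leq \|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d'},\mathscr{M}^n)^{d''},R'} \|h_1\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell_1}),R} \cdots \|h_n\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell_n}),R}.
\]
Moreover, the composition map
\[
C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}} \times \prod_{m=1}^n C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{m,1}},\dots,\mathbb{R}^{*d_{m,\ell_m}}))^{d_m} \to C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{1,1}},\dots,\mathbb{R}^{*d_{1,\ell_1}},\dots\dots,\mathbb{R}^{*d_{n,1}},\dots,\mathbb{R}^{*d_{n,\ell_n}}))^{d''}
\]
is jointly continuous.

Proof. Let $F = f(g)[h_1,\dots,h_n]$. Fix $R$ and let $R'$ be as above. We begin by proving the inequality that for each $(\mathcal{A},\tau)$,
\[
\|F^{\mathcal{A},\tau}\|_{\mathscr{M}^{L_n},\mathrm{tr},R} \leq \|f^{\mathcal{A},\tau}\|_{\mathscr{M}^n,\mathrm{tr},R'} \|h_1^{\mathcal{A},\tau}\|_{\mathscr{M}^{\ell_1},\mathrm{tr},R} \cdots \|h_n^{\mathcal{A},\tau}\|_{\mathscr{M}^{\ell_n},\mathrm{tr},R}. \tag{3.2}
\]
Let $\alpha, \alpha_1,\dots,\alpha_{L_n} \in [1,\infty]$ such that
\[
\frac{1}{\alpha} = \frac{1}{\alpha_1} + \cdots + \frac{1}{\alpha_{L_n}}.
\]
Let $\beta_1,\dots,\beta_n$ be given by
\[
\frac{1}{\beta_m} = \sum_{j=1}^{\ell_m} \frac{1}{\alpha_{L_{m-1}+j}}
\]
(with the convention $L_0 = 0$). Let $\mathbf{X} \in \mathcal{A}_{\mathrm{sa}}^d$ with $\|\mathbf{X}\|_\infty \leq R$. For each $m \leq n$ and $j \leq \ell_m$, let $\mathbf{Y}_{L_{m-1}+j} \in \mathcal{A}^{d_{m,j}}$ such that $\|\mathbf{Y}_i\|_{\alpha_i} \leq 1$ for $i = 1,\dots,L_n$. Note that $\|g^{\mathcal{A},\tau}(\mathbf{X})\|_\infty \leq \|g\|_{\mathrm{tr},R} \leq R'$. Hence,
\[
\|F^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_{L_n}]\|_\alpha \leq \|f^{\mathcal{A},\tau}\|_{\mathscr{M}^n,\mathrm{tr},R'} \|h_1^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1,\dots,\mathbf{Y}_{L_1}]\|_{\beta_1} \cdots \|h_n^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_{L_{n-1}+1},\dots,\mathbf{Y}_{L_n}]\|_{\beta_n}.
\]
Moreover, for each $m$, by the definition of $\beta_m$ and of $\|h_m^{\mathcal{A},\tau}\|_{\mathscr{M}^{\ell_m},\mathrm{tr},R}$, we have
\[
\|h_m^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_{L_{m-1}+1},\dots,\mathbf{Y}_{L_m}]\|_{\beta_m} \leq \|h_m^{\mathcal{A},\tau}\|_{\mathscr{M}^{\ell_m},\mathrm{tr},R} \|\mathbf{Y}_{L_{m-1}+1}\|_{\alpha_{L_{m-1}+1}} \cdots \|\mathbf{Y}_{L_m}\|_{\alpha_{L_m}} \leq \|h_m^{\mathcal{A},\tau}\|_{\mathscr{M}^{\ell_m},\mathrm{tr},R}.
\]
Therefore, (3.2) holds.

Now let us prove that $F \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{1,1}},\dots,\mathbb{R}^{*d_{1,\ell_1}},\dots\dots,\mathbb{R}^{*d_{n,1}},\dots,\mathbb{R}^{*d_{n,\ell_n}}))^{d''}$. We proceed in several steps.

(1) Suppose that $f$, $g$, and the $h_m$'s are all trace polynomials. Then clearly $F$ is a trace polynomial.

(2) Next, suppose that $f$ and the $h_m$'s are trace polynomials, while $g$ is in $C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$. Let $g^{(N)} \in \operatorname{TrP}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$ such that $g^{(N)} \to g$ in $C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$ as $N \to \infty$. If we fix $R > 0$, then $R^* := \sup_N \|g^{(N)}\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'},R} < \infty$. Applying Lemma 3.16 with the radius $R^*$, we see that
\[
\lim_{N \to \infty} \sup_{(\mathcal{A},\tau) \in \mathbb{W}} \|f^{\mathcal{A},\tau}((g^{(N)})^{\mathcal{A},\tau}) - f^{\mathcal{A},\tau}(g^{\mathcal{A},\tau})\|_{\mathscr{M}^n,\mathrm{tr},R} = 0.
\]
Let $F^{(N)}$ be defined analogously to $F$ except using $g^{(N)}$ instead of $g$. By the same argument as (3.2),
\[
\|(F^{(N)})^{\mathcal{A},\tau} - F^{\mathcal{A},\tau}\|_{\mathscr{M}^{L_n},\mathrm{tr},R} \leq \|f^{\mathcal{A},\tau}((g^{(N)})^{\mathcal{A},\tau}) - f^{\mathcal{A},\tau}(g^{\mathcal{A},\tau})\|_{\mathscr{M}^n,\mathrm{tr},R} \|h_1^{\mathcal{A},\tau}\|_{\mathscr{M}^{\ell_1},\mathrm{tr},R} \cdots \|h_n^{\mathcal{A},\tau}\|_{\mathscr{M}^{\ell_n},\mathrm{tr},R}.
\]
Hence,
\[
\lim_{N \to \infty} \sup_{(\mathcal{A},\tau) \in \mathbb{W}} \|(F^{(N)})^{\mathcal{A},\tau} - F^{\mathcal{A},\tau}\|_{\mathscr{M}^{L_n},\mathrm{tr},R} = 0,
\]
so that $F \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{1,1}},\dots,\mathbb{R}^{*d_{1,\ell_1}},\dots\dots,\mathbb{R}^{*d_{n,1}},\dots,\mathbb{R}^{*d_{n,\ell_n}}))^{d''}$ because this space is complete with respect to the family of seminorms.

(3) Next, suppose $f$ is a trace polynomial, while $g \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$ and $h_m \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{m,1}},\dots,\mathbb{R}^{*d_{m,\ell_m}}))^{d_m}$. We approximate $h_m$ by trace polynomials $h_m^{(N)}$ as $N \to \infty$. Then using (3.2), we conclude that the function $F^{(N)}$ obtained from composing $f$ with $g$ and $h_m^{(N)}$ converges to $F$ with respect to the seminorms used to define $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_{1,1}},\dots,\mathbb{R}^{*d_{1,\ell_1}},\dots\dots,\mathbb{R}^{*d_{n,1}},\dots,\mathbb{R}^{*d_{n,\ell_n}}))^{d''}$, hence $F$ is in this space.

(4) Finally, we consider the general case. In the last step we approximate $f$ by trace polynomials $f^{(N)}$ as $N \to \infty$. The argument is similar to the previous step, so we leave the details as an exercise.

Finally, to prove continuity, it suffices to show that given $f$, $g$, $h_1,\dots,h_n$ and given $R$ and $\epsilon > 0$, there exist $R'$, $\delta$, $\delta'$, and $\eta_1,\dots,\eta_n$ such that if
\[
\|f' - f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d'},\mathscr{M}^n),R'} < \delta, \quad \|g' - g\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}),R} < \delta', \quad \|h'_m - h_m\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell_m}),R} < \eta_m,
\]
then
\[
\|F' - F\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{L_n})^{d''},R} < \epsilon.
\]
Let $R' = \|g\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'},R} + 1$. Then by choosing $\delta'$ small enough, we can guarantee that $\|g'\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'},R}$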

Let k ∈ N ∪ {∞} and n ∈ N ) . Let f ∈ C k tr ( R ∗ d ′ , M ( R ∗ d , . . . , R ∗ d n )) d ′′ for some d ′ ∈ N and d ′′ , d , . . . , d n ∈ N . Let g ∈ C (tr R ∗ d ) d ′ sa for some d ∈ N . For each m = 1 , . . . , n , let h m ∈ C k tr ( R ∗ d , M ( R ∗ d m, , . . . , R ∗ d m,ℓm )) d m for some ℓ m ∈ N and d m, , . . . , d m,ℓ m . Let L m = ℓ + · · · + ℓ m . Then f ( g ) h , . . . , h n ] ∈ C k tr ( R ∗ d , M ( R ∗ d , , . . . , R ∗ d ,ℓ , . . . . . . , R ∗ d m, , . . . , R ∗ d n,ℓn )) d ′′ , and for k ′ ≤ k , we have ∂ k ′ [ f ( g ) h , . . . , h n ]] = k ′ X j =0 X ( B ,...,B n ,B ′ ,...,B ′ j ) partition of [ L n + k ′ ] , min B ′ < ··· < min B ′ j (cid:16) ∂ j f ( g ) ∂ | B | h , . . . , ∂ | B n | h n , ∂ | B ′ | g , . . . , ∂ | B ′ j | g ] (cid:17) σ , where σ is the permutation given by ( σ (1) , . . . , σ ( L n + k ′ )) = ( I , . . . , I n , B , . . . , B n , B ′ , . . . , B ′ j ) , where I m = {| B | + · · · + | B m | + L m +1 + 1 , . . . , L m +1 + | B | + · · · + | B m | + L m } , and where each of the sets I i , B i , and B ′ i is interpreted in the deﬁnition of σ as a list of elements in orderfrom least to greatest. Here the blocks B , . . . , B n , B ′ , . . . , B ′ j are regarded as an ordered tuple rather thana set, so that the same partition (set of blocks) can occur several times. Moreover, the composition map C k tr ( R ∗ d ) d ′ sa × n Y m =1 C k tr ( R ∗ d , M ( R ∗ d m, , . . . , R ∗ d m,ℓm )) d m → C k tr ( R ∗ d , M ( R ∗ d , , . . . , R ∗ d ,ℓ , . . . . . . , R ∗ d m, , . . . , R ∗ d n,ℓn )) d ′′ is jointly continuous.Remark . It is immediate from the theorem that the BC k tr spaces are also closed under composition. Proof.

Fix ( A , τ ) ∈ W . Then by iteratively applying the chain rule for Fr´echet- C k functions (which isstandard), we obtain the formula asserted above with f A ,τ , g A ,τ , and h A ,τ rather than f , g , and h m .Because of Lemma 3.18, the resulting derivative is an element of C tr ( R ∗ d , M L n + k ′ ) d .To explain the formula, note that when we apply ∂ iteratively k ′ times, the operator ∂ at each stagecould “hit” three diﬀerent things:(1) It could diﬀerentiate ∂ j f ( g ) by the chain rule which will change it to ∂ j +1 f ( g ) and produce another term ∂ g , which we append as the ( j + 1)th argument for ∂ j +1 f ( g ) (thus, setting t j +1 = 0).282) It could diﬀerentiate an already existing term ∂ t i g that is one of the multilinear arguments (which wasoriginally produced by step (1)).(3) It could diﬀerentiate one of the multilinear arguments ∂ s m h m .We arrive at the formula by keeping track of all these possibilities. Here B m represents the set of time indiceswhen h m is diﬀerentiated and B ′ i represents the set of indices in which the i th derivative of g is appended anddiﬀerentiated. Since the copies are appended in order, we have min B ′ < · · · < min B ′ j . The ﬁrst L n inputvectors into ∂ k ′ [ f ( g ) h , . . . , h n ]] are supposed to represent the multilinear arguments in the positions thatalready existed at stage 0; or in other words, Y L m − +1 , . . . , Y L m should be plugged into the ﬁrst ℓ m placesof h m for each m , which is the index set I m . The permutation σ is deﬁned to put these vectors into thecorrect locations, and the same for the tangent vectors corresponding to diﬀerentiation of the terms of theform h i or ∂ i g .Continuity of the composition operation follows from formula for derivatives and the continuity in Lemma3.18. Corollary 3.21. C k tr ( R ∗ d ) is a ∗ -algebra.Proof. We already explained the ∗ -operation on C k tr ( R ∗ d ). 
If $f$ and $g$ are self-adjoint, then the product $fg$ is the same as $h(f, g)$, where $h(x_1, x_2) = x_1 x_2 \in \operatorname{TrP}(\mathbb{R}^{*2})$. Since $h$ is $C^\infty_{\mathrm{tr}}$, it follows from Theorem 3.19 that if $f$ and $g$ are $C^k_{\mathrm{tr}}$ and self-adjoint, then $fg$ is $C^k_{\mathrm{tr}}$. The restriction to self-adjoint $f$ and $g$ can be removed by decomposing a general element into its real and imaginary (that is, self-adjoint and anti-self-adjoint) parts.

Corollary 3.22.

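There is a continuous map
$$\operatorname{tr}: C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell})) \to C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$$
defined by
$$(\operatorname{tr}(f))^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell] = \tau\bigl(f^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell]\bigr).$$
Moreover, $\partial^{k'}[\operatorname{tr}(f)] = \operatorname{tr}[\partial^{k'} f]$ for $k' \leq k$.

Proof. The trace $\operatorname{tr}$ can be viewed as an element $g$ of $C^\infty_{\mathrm{tr}}(\mathbb{R}^{*0}, \mathscr{M}(\mathbb{R}^{*1}))$ that is given by $g^{\mathcal{A},\tau}[Y] = \tau(Y)$. Recall that $|\tau(X)| \leq \|X\|_\alpha$ for every $\alpha \in [1, \infty]$, and hence $\|g\|_{C_{\mathrm{tr}}(\mathbb{R}^{*0}, \mathscr{M}^1),R} = 1$ for all $R$. Also, $\partial^k g = 0$ for $k \geq 1$. For $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^\ell)_{\mathrm{sa}}$, we define $\operatorname{tr}(f) := g[f]$. Then the relation $\partial^{k'}[\operatorname{tr}(f)] = \operatorname{tr}[\partial^{k'} f]$ follows from the chain rule. A general $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$ can be broken into its self-adjoint and anti-self-adjoint parts, and thus the map $\operatorname{tr}$ can be extended to all of $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$.

As a consequence, if $f, g \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}$, we can define a new function $\langle f, g \rangle_{\mathrm{tr}} \in \operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))$ by
$$\langle f, g \rangle_{\mathrm{tr}}^{\mathcal{A},\tau}(\mathbf{X}) = \langle f^{\mathcal{A},\tau}(\mathbf{X}), g^{\mathcal{A},\tau}(\mathbf{X}) \rangle_\tau.$$
In particular, we will denote by $\langle \mathbf{x}, \mathbf{x} \rangle_{\mathrm{tr}}$ the function whose evaluation on $(\mathcal{A}, \tau)$ and $\mathbf{X}$ is $\|\mathbf{X}\|_2^2$.

The following result is a version of the inverse function theorem. Although it would be possible to prove inverse function theorems on an operator norm ball, it is sufficient for our purposes to use the "cheap" global version that comes from a contraction mapping principle.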
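As a purely classical illustration of this contraction-mapping idea (not the non-commutative construction itself), the following sketch inverts a scalar function $f: \mathbb{R} \to \mathbb{R}$ satisfying $|f'(x) - 1| \leq K < 1$ by iterating $x \mapsto y + (x - f(x))$, which is the pointwise form of the recursion $g_{n+1} = \operatorname{id} + (\operatorname{id} - f) \circ g_n$ used in the proof of Proposition 3.23 below. The function name `invert_by_contraction`, the test function $f(x) = x + \tfrac{1}{2}\sin x$, and the tolerance are illustrative assumptions, not taken from the paper.

```python
import math


def invert_by_contraction(f, y, K, tol=1e-12, max_iter=10_000):
    """Approximate the solution x of f(x) = y.

    Assumes (hypothetically) that id - f is K-Lipschitz with 0 <= K < 1,
    so that x -> y + (x - f(x)) is a K-contraction on the real line.
    """
    if not 0 <= K < 1:
        raise ValueError("the contraction constant K must satisfy 0 <= K < 1")
    x = y  # corresponds to the starting function g_0 = id
    for _ in range(max_iter):
        x_next = y + (x - f(x))  # g_{n+1}(y) = y + (id - f)(g_n(y))
        # A posteriori bound for contractions:
        # |x_next - x_limit| <= K / (1 - K) * |x_next - x|.
        if abs(x_next - x) <= tol * (1 - K):
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


def f_example(x):
    # Illustrative choice: f'(x) - 1 = 0.5 * cos(x), so K = 0.5 works.
    return x + 0.5 * math.sin(x)


y = 2.0
x = invert_by_contraction(f_example, y, K=0.5)
residual = abs(f_example(x) - y)
print("approximate inverse:", x, "residual:", residual)
```

The stopping rule uses only the contraction constant, mirroring the remark in the proof that the rate of convergence depends only on $K$ and on the size of $\operatorname{id} - f$.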

Proposition 3.23 (Global inverse function theorem). Let $k \geq 1$ and let $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$. Suppose that for some $0 < K < K'$, we have
$$\|\partial f - K' \operatorname{Id}\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^1)^d} \leq K.$$
Then there exists a (unique) $g \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ such that $f \circ g = g \circ f = \operatorname{id}$.

Let us denote this function by $f^{-1}$. For given $0 < K < K'$, we have continuity of the map
$$f \mapsto f^{-1}: \{ f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}} : \|\partial f - K' \operatorname{Id}\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^1)^d} \leq K \} \to C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}},$$
where we use the subspace topology from $C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ on the domain.

Proof. By substituting $(1/K') f$ for $f$ and $g(K'(\cdot))$ for $g$, we may assume without loss of generality that $K' = 1$. Define $g_0 = \operatorname{id}$ and inductively
$$g_{n+1} = \operatorname{id} + (\operatorname{id} - f) \circ g_n.$$
Note that
$$\|(\operatorname{id} - f)^{\mathcal{A},\tau}(\mathbf{X}) - (\operatorname{id} - f)^{\mathcal{A},\tau}(\mathbf{Y})\|_\infty \leq K \|\mathbf{X} - \mathbf{Y}\|_\infty$$
for $\mathbf{X}, \mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}$ and any $(\mathcal{A}, \tau) \in \mathbb{W}$. It follows that
$$\|(\operatorname{id} - f) \circ h - (\operatorname{id} - f) \circ h'\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d,R} \leq K \|h - h'\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d,R}$$
for $h, h' \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ and $R > 0$. In particular, for $R > 0$,
$$\|g_{n+1} - g_n\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}},R} \leq K^n \|g_1 - g_0\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}},R} = K^n \|\operatorname{id} - f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}},R}.$$
Hence, $g_n$ converges as $n \to \infty$ to some $g \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$, which must also satisfy $g = \operatorname{id} + (\operatorname{id} - f) \circ g$, or in other words $f \circ g = \operatorname{id}$. Since $\operatorname{id} - f$ is $K$-Lipschitz on $\mathcal{A}^d_{\mathrm{sa}}$ for any $(\mathcal{A}, \tau)$ and $K < 1$, it follows that $f^{\mathcal{A},\tau}$ is injective. Thus, in the relation $f^{\mathcal{A},\tau} \circ g^{\mathcal{A},\tau} \circ f^{\mathcal{A},\tau} = f^{\mathcal{A},\tau}$, we may cancel $f^{\mathcal{A},\tau}$ on the left-hand side and thus obtain $g \circ f = \operatorname{id}$. Since the rate of convergence in $\|\cdot\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d,R}$ only depends on $K$ and $\|\operatorname{id} - f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}},R}$, it follows that $g$ depends continuously on $f$.

Note that by the chain rule and induction, $g_n \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$, and we have for $1 \leq k' \leq k$ that
$$\partial^{k'} g_{n+1} = \sum_{j=1}^{k'} \sum_{\substack{(B_1, \dots, B_j) \text{ partition of } [k'] \\ \min B_1 < \dots < \min B_j}} (\partial^j(\operatorname{id} - f) \circ g_n)[\partial^{|B_1|} g_n, \dots, \partial^{|B_j|} g_n].$$
We claim that $\partial^{k'} g_n$ converges as $n \to \infty$. We first describe the candidate limit functions $g^{(k')}$ as fixed points of the equation where we substitute $g^{(k')}$ for $\partial^{k'} g_n$ and $\partial^{k'} g_{n+1}$. Of course $g^{(0)}$ will simply be $g$. Separating out the $j = 1$ term on the right-hand side, this equation becomes
$$g^{(k')} = (\operatorname{Id} - \partial f \circ g)\, g^{(k')} - \sum_{j=2}^{k'} \sum_{\substack{(B_1, \dots, B_j) \text{ partition of } [k'] \\ \min B_1 < \dots < \min B_j}} (\partial^j f \circ g)[g^{(|B_1|)}, \dots, g^{(|B_j|)}].$$
Since $\|\operatorname{Id} - \partial f\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^1)^d} \leq K < 1$, it follows that the right-hand side is $K$-contractive as a function of $g^{(k')}$. Thus, we may construct the functions $g^{(k')}$ by induction on $k'$; assuming the previous terms have been defined, $g^{(k')}$ is obtained by iteration of the right-hand side, starting with the function $\operatorname{Id}$ for $k' = 1$ and $0$ for $k' > 1$. The rate of convergence of the iterates with respect to $\|\cdot\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}),R}$ is controlled completely by the constant $K$, the norms of the derivatives of $f$ on the ball of radius $R' := \|g\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d})^d,R}$, and the norms of the previous terms $g^{(j)}$ on the ball of radius $R$. In particular, it follows that $g^{(k')} \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^{k'}(\mathbb{R}^{*d}))^d$ depends continuously on $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$, using induction on $k'$. Indeed, once we know the claim for $j < k'$, the iterates for $g^{(k')}$ depend continuously on $f$, and the preceding remarks show that for each $R$, the rate of convergence will be uniform on some open set in $C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ containing $f$.

To finish the proof, it only remains to show that $g$ is in $C^k_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ and $\partial^{k'} g = g^{(k')}$ for $k' \leq k$. To this end, it suffices to show that $\partial^{k'} g_n \to g^{(k')}$ as $n \to \infty$. We proceed by induction on $k'$, the case $k' = 0$ having already been proved. Subtracting the relations for $\partial^{k'} g_{n+1}$ and $g^{(k')}$, we get
$$\partial^{k'} g_{n+1} - g^{(k')} = (\operatorname{Id} - \partial f \circ g_n)(\partial^{k'} g_n - g^{(k')}) - (\partial f \circ g_n - \partial f \circ g)\, g^{(k')} + \sum_{j=2}^{k'} \sum_{\substack{(B_1, \dots, B_j) \text{ partition of } [k'] \\ \min B_1 < \dots < \min B_j}} \Bigl[ (\partial^j(\operatorname{id} - f) \circ g_n)[\partial^{|B_1|} g_n, \dots, \partial^{|B_j|} g_n] - (\partial^j(\operatorname{id} - f) \circ g)[g^{(|B_1|)}, \dots, g^{(|B_j|)}] \Bigr].$$
Let $\epsilon_{n,R}$ be the norm of $(\partial f \circ g_n - \partial f \circ g)\, g^{(k')}$ plus the norms of the terms in the summation. By the induction hypothesis and by continuity of composition, $\epsilon_{n,R} \to 0$ as $n \to \infty$, and we also have
$$\|\partial^{k'} g_{n+1} - g^{(k')}\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^{k'}(\mathbb{R}^{*d}))^d,R} \leq K \|\partial^{k'} g_n - g^{(k')}\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^{k'}(\mathbb{R}^{*d}))^d,R} + \epsilon_{n,R}.$$
A straightforward induction on $n$ shows that
$$\|\partial^{k'} g_n - g^{(k')}\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^{k'}(\mathbb{R}^{*d}))^d,R} \leq K^n \|\partial^{k'} g_0 - g^{(k')}\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^{k'}(\mathbb{R}^{*d}))^d,R} + \sum_{m=0}^{n} K^m \epsilon_{n-m,R}.$$
Clearly, the first term on the right-hand side goes to zero as $n \to \infty$. For the second term, note that the bi-infinite sequence $(\mathbf{1}_{m \leq n}\, \epsilon_{n-m,R})_{m,n}$ is bounded and $\lim_{n \to \infty} \mathbf{1}_{m \leq n}\, \epsilon_{n-m,R} = 0$ for each $m$. Because $\sum_{m=0}^\infty K^m < \infty$, the dominated convergence theorem implies that
$$\lim_{n \to \infty} \sum_{m=0}^{n} K^m \epsilon_{n-m,R} = \lim_{n \to \infty} \sum_{m=0}^{\infty} K^m \mathbf{1}_{m \leq n}\, \epsilon_{n-m,R} = 0.$$
Thus, $\partial^{k'} g_n \to g^{(k')}$ as desired.

The trace map in Corollary 3.22 leads to the following definition.

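Definition 4.1. We denote the image of $\operatorname{tr}$ in $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$ by $\operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell})))$.

Observation 4.2. Let $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$. Then the following are equivalent:
(1) $f \in \operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell})))$,
(2) $f^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell] \in \mathbb{C}$ for every $(\mathcal{A}, \tau)$ and all self-adjoint tuples $\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell$ from $\mathcal{A}$,
(3) $f = \operatorname{tr}(f)$.

Thus, $\operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^\ell))$ may be viewed as the subspace of $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$ consisting of scalar-valued functions. Similarly, $f \in \operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell})))$ is self-adjoint if and only if $f^{\mathcal{A},\tau}$ is real-valued for every $(\mathcal{A}, \tau) \in \mathbb{W}$.

Non-commutative laws can be characterized as certain linear functionals on $C_{\mathrm{tr}}(\mathbb{R}^{*d})$. To state this result, we use the following definitions.

Definition 4.3. We say that $f \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})$ is positive if $f^{\mathcal{A},\tau}(\mathbf{X}) \geq 0$ in $\mathcal{A}$ for every $(\mathcal{A}, \tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$. We say that a map $\Phi: C^k_{\mathrm{tr}}(\mathbb{R}^{*d}) \to C_{\mathrm{tr}}(\mathbb{R}^{*d})$ is positive if it maps positive elements to positive elements.

Definition 4.4. Let $A$ be an algebra. We say that a map $\Phi: C^k_{\mathrm{tr}}(\mathbb{R}^{*d}) \to A$ is multiplicative over $\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$ if $\Phi(fg) = \Phi(f)\Phi(g)$ whenever $f \in \operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))$.

Lemma 4.5.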

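The following three sets are in bijection with each other:
(1) the space $\Sigma_d$ of non-commutative laws $\lambda$,
(2) the set of continuous positive algebra homomorphisms $\rho: \operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d})) \to \mathbb{C}$,
(3) the set of continuous unital positive maps $\Phi: C_{\mathrm{tr}}(\mathbb{R}^{*d}) \to \mathbb{C}$ that are multiplicative over $\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$ and satisfy $\Phi = \Phi \circ \operatorname{tr}$.

The bijections are given by
$$\lambda = \rho \circ \operatorname{tr}|_{\mathbb{C}\langle x_1, \dots, x_d \rangle}, \qquad \lambda = \Phi|_{\mathbb{C}\langle x_1, \dots, x_d \rangle}, \qquad \Phi = \rho \circ \operatorname{tr}, \qquad \rho = \Phi|_{\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))}.$$

Proof. First, we show the bijection between (2) and (3). Note that $\operatorname{tr}$ is a continuous unital positive map $C_{\mathrm{tr}}(\mathbb{R}^{*d}) \to \operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$ that is multiplicative over $\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$. Hence, if $\rho$ satisfies (2), then $\Phi = \rho \circ \operatorname{tr}$ satisfies (3). Conversely, if $\Phi$ satisfies (3), then $\Phi|_{\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))}$ satisfies (2), and the maps $\rho \mapsto \rho \circ \operatorname{tr}$ and $\Phi \mapsto \Phi|_{\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))}$ are mutually inverse.

Next, we show the bijection between (1) and (2). If $\rho$ satisfies (2), then let $\lambda(p) = \rho(\operatorname{tr}(p))$ for $p \in \mathbb{C}\langle x_1, \dots, x_d \rangle$. Since $\rho$ is an algebra homomorphism, it is unital and hence $\lambda(1) = 1$. Also, $\lambda(pq) = \lambda(qp)$ since $\operatorname{tr}(pq) = \operatorname{tr}(qp)$ in $C_{\mathrm{tr}}(\mathbb{R}^{*d})$. Thirdly, $\operatorname{tr}(p^* p)$ is positive in $\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$, hence $\lambda(p^* p) \geq 0$. Finally, since $\rho$ is continuous, there exist $R > 0$ and $\delta > 0$ such that
$$\|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}),R} \leq \delta \implies |\rho(\operatorname{tr}(f))| < 1.$$
Taking $p(\mathbf{x}) = x_{i_1} \cdots x_{i_\ell}$, we have $\|p\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}),R} = R^\ell$ and hence
$$|\lambda(p)| = |\rho(\operatorname{tr}(p))| \leq \frac{R^\ell}{\delta}.$$
Since this holds for all $\ell$, we know $\lambda$ is exponentially bounded and hence is a non-commutative law.

Conversely, suppose that $\lambda$ is a non-commutative law in $\Sigma_{d,R}$. Let $\mathbf{X}$ be a $d$-tuple of self-adjoint operators in $(\mathcal{A}, \tau)$ which realize the law $\lambda$. Then define $\rho: \operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d})) \to \mathbb{C}$ by $\rho(f) = f^{\mathcal{A},\tau}(\mathbf{X})$. Clearly, $\rho$ is a positive homomorphism, and $\rho$ is continuous since $|\rho(f)| \leq \|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}),R}$.

Now, let us show that the maps $\lambda \mapsto \rho$ and $\rho \mapsto \lambda$ described above are mutually inverse. If we start with $\lambda$ and define $\rho(f) = f^{\mathcal{A},\tau}(\mathbf{X})$ using $\mathcal{A}$, $\tau$, and $\mathbf{X}$ as above, then $\rho(\operatorname{tr}(p)) = \tau(p(\mathbf{X})) = \lambda(p)$. On the other hand, suppose we start with $\rho$ and let $\lambda = \rho \circ \operatorname{tr}|_{\mathbb{C}\langle x_1, \dots, x_d \rangle}$. Let $\mathbf{X}$ be a tuple realizing the law $\lambda$. Then clearly $\rho(\operatorname{tr}(p)) = \tau(p(\mathbf{X}))$. Since $\rho$ is a homomorphism, it follows that $\rho(f) = f^{\mathcal{A},\tau}(\mathbf{X})$ holds for all scalar-valued trace polynomials. But the trace polynomials are dense in $C_{\mathrm{tr}}(\mathbb{R}^{*d})$, and hence this equality holds for all $f$.

This lemma allows us to describe the push-forward of non-commutative laws by functions $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$. Indeed, if $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$, then there is a continuous positive homomorphism $\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d'})) \to \operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$ given by $g \mapsto g \circ f$. Continuity follows because $f$ is bounded in $\|\cdot\|_\infty$ on each $\|\cdot\|_\infty$-ball. If $\rho$ is a positive homomorphism $\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d})_{\mathrm{sa}}) \to \mathbb{C}$, then $f_* \rho := \rho \circ f$, that is, $(f_* \rho)(g) = \rho(g \circ f)$, is a continuous positive homomorphism $\operatorname{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d'})) \to \mathbb{C}$. Since the continuous homomorphisms are in bijection with non-commutative laws, there is a corresponding push-forward operation $f_*: \Sigma_d \to \Sigma_{d'}$. Furthermore, the push-forward map $f_*$ is characterized by the property that for every $(\mathcal{A}, \tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$, we have $\lambda_{f(\mathbf{X})} = f_* \lambda_{\mathbf{X}}$.

Push-forwards of non-commutative laws lead naturally to inclusions and isomorphisms of tracial C$^*$- and W$^*$-algebras. The next observation is immediate from Lemma 2.20.

Observation 4.6.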

Let $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$. Let $\mu \in \Sigma_d$, let $(\mathcal{A}_1, \tau_1)$ be the W$^*$ GNS representation of $\mu$, and let $\mathbf{X} \in (\mathcal{A}_1)^d_{\mathrm{sa}}$ be the canonical generators having the non-commutative law $\mu$. Similarly, let $(\mathcal{A}_2, \tau_2)$ be the GNS representation for $f_* \mu$ with its canonical generators $\mathbf{Y} \in (\mathcal{A}_2)^{d'}_{\mathrm{sa}}$. Then there is a unique inclusion map $\iota: (\mathcal{A}_2, \tau_2) \to (\mathcal{A}_1, \tau_1)$ of tracial W$^*$-algebras such that $\iota(\mathbf{Y}) = f^{\mathcal{A}_1,\tau_1}(\mathbf{X})$. We also have $\iota(\mathrm{C}^*(\mathbf{Y})) \subseteq \mathrm{C}^*(\mathbf{X})$.

Observation 4.7. Consider the same situation as above, and suppose there exists a function $g \in C_{\mathrm{tr}}(\mathbb{R}^{*d'})^d_{\mathrm{sa}}$ such that $g(\mathbf{Y}) = \mathbf{X}$ (identifying $\mathbf{Y}$ with $\iota(\mathbf{Y}) = f^{\mathcal{A}_1,\tau_1}(\mathbf{X})$). Then $\iota$ is an isomorphism of tracial W$^*$-algebras, which also restricts to an isomorphism $\mathrm{C}^*(\mathbf{Y}) \to \mathrm{C}^*(\mathbf{X})$.

Observation 4.8. Suppose that $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d})^{d'}_{\mathrm{sa}}$ and $g \in C_{\mathrm{tr}}(\mathbb{R}^{*d'})^d_{\mathrm{sa}}$ satisfy $f \circ g = \operatorname{id}$ and $g \circ f = \operatorname{id}$. Let $\mu \in \Sigma_d$. Then by the previous observations there is an isomorphism of the tracial W$^*$-algebras associated to $\mu$ and $f_* \mu$ respectively, which also restricts to an isomorphism of the C$^*$-algebras associated to the two laws.

Remark 4.9. If $f$ and $g$ as above satisfy $f \circ g = \operatorname{id}$ and $g \circ f = \operatorname{id}$, then we must have $d = d'$. This is because $f$ defines a homeomorphism $M_N(\mathbb{C})^d_{\mathrm{sa}} \to M_N(\mathbb{C})^{d'}_{\mathrm{sa}}$ for every $N$, so the claim follows from the invariance of domain theorem in topology (and in fact, we would only need the homeomorphism for a single value of $N$ to reach this conclusion). However, if we only assume that $g(f(\mathbf{X})) = \mathbf{X}$ for a particular $d$-tuple of operators $\mathbf{X}$, then it is a difficult question whether $d$ must equal $d'$, and the answer will likely depend on the properties of the tuple $\mathbf{X}$.

4.2 One-variable functional calculus

Lemma 4.10. If $\phi \in C(\mathbb{R})$, then we may define an element $f \in C_{\mathrm{tr}}(\mathbb{R})$ by $f^{\mathcal{A},\tau}(X) = \phi(X)$ for every $(\mathcal{A}, \tau) \in \mathbb{W}$ and $X \in \mathcal{A}_{\mathrm{sa}}$.

Proof. Let $(\phi^{(N)})_{N \in \mathbb{N}}$ be a sequence of polynomials which converge to $\phi$ uniformly on compact subsets of $\mathbb{R}$. By the spectral mapping theorem, for any $(\mathcal{A}, \tau)$ and any self-adjoint operator $X$ in $\mathcal{A}$ with $\|X\| \leq R$, we have
$$\|\phi^{(N)}(X) - \phi(X)\|_\infty \leq \sup_{t \in [-R, R]} |\phi^{(N)}(t) - \phi(t)|.$$
Hence, the sequence of polynomials $\phi^{(N)}(x) \in \mathbb{C}[x] \subseteq C_{\mathrm{tr}}(\mathbb{R})$ converges in $C_{\mathrm{tr}}(\mathbb{R})$ to some function $f$, which clearly must satisfy $f^{\mathcal{A},\tau}(X) = \phi(X)$ for self-adjoint $X$ in $(\mathcal{A}, \tau)$.

Definition 4.11.

Given $\phi \in C(\mathbb{R})$, we denote the corresponding element of $C_{\mathrm{tr}}(\mathbb{R})$ by $\phi(x)$, where $x$ is the same formal variable used for defining the trace polynomials in $C_{\mathrm{tr}}(\mathbb{R})$. Similarly, for $j \leq d$, we may define an element $\phi(x_j)$ in $C_{\mathrm{tr}}(\mathbb{R}^{*d})$ as the element sending a self-adjoint tuple $(X_1, \dots, X_d)$ in $(\mathcal{A}, \tau)$ to $\phi(X_j)$.

Under what conditions is $\phi(x) \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d})$? Peller, Aleksandrov, and Nazarov have studied the free difference quotients of functions on the real line for the sake of understanding the perturbations of self-adjoint operators [63, 2, 1, 4, 3], and concluded that Besov spaces are natural spaces of functions on $\mathbb{R}$ that lead to operator $C^k$ functions. However, we do not need the full strength of their results, and we will be content to directly apply one of the key basic ideas, Fourier decomposition, to our current context. We begin by describing the non-commutative derivatives of the complex exponential $e^{itx} \in C_{\mathrm{tr}}(\mathbb{R})$ for each $t \in \mathbb{R}$. In the formula for derivatives, we recall that the theory of Riemann integration is valid for continuous functions on polytopes taking values in a Fréchet space, with all the same proofs that are learned in undergraduate calculus.

Lemma 4.12.

For each t ∈ R , the function e itx is in BC ∞ tr ( R ) and satisﬁes k ∂ k [ e itx ] k BC tr ( R , M k ) ≤ t k . (4.1) The derivatives are given explicitly as follows. Let ∆ k denote the simplex ∆ k := { ( s , . . . , s k ) : s j ≥ , s + · · · + s k = 1 } , and let ρ k be the standard uniform probability measure on ∆ k . Then ∂ k [ e itx ][ y , . . . , y k ] = ( it ) k k ! X σ ∈ Perm( k ) Z ∆ k e its x y σ (1) e its x . . . y σ ( k ) e its k x dρ k ( s , . . . , s k ) . (4.2) Here y , . . . , y k denote the formal variables occurring as multilinear arguments of the derivative, and theintegral is interpreted as a Riemann integral with values in the Fr´echet space C tr ( R , M k ) .Proof. First, we prove the formula for the derivative. Consider the projection map π k : R k +1 → R k onto theﬁrst k coordinates. Note that π k gives an aﬃne bijection from ∆ k onto the simplex { s j ≥ , s + · · · + s k − ≤ } , and therefore this map is measure-preserving up to a constant factor. The Lebesgue measure on R k assigns total mass 1 /k ! to the simplex π k (∆ k ) and hence the formula for the derivatives is equivalent to ∂ k [ e itx ][ y , . . . , y k ] = ( it ) k X σ ∈ Perm( k ) Z π k (∆ k ) e its x y σ (1) e its x . . . y σ ( k ) e it (1 − s + ··· + s k − ) x ds . . . ds k − . (4.3)We prove this by induction. First, consider k = 1. For n ∈ N , the function x n is in C tr ( R ) with k x n k C tr ( R ) ,R = R n . Moreover, using the product rule, ∂ [ x n ][ y ] = n − X m =0 x n − − m yx m ,

33o clearly k ∂ [ x n ] k C tr ( R , M ) ,R ≤ nR n − . It follows that the series ∞ X n =0 n ! ( itx ) n converges in C ( R ). This series must agree with e itx since they agree when evaluating on any self-adjointoperator X . We thus have ∂ [ e itx ][ y ] = ∞ X n =0 ( it ) n n ! n − X m =0 x n − − m yx m = X ℓ,m ≥ ( it ) ℓ + m +1 ( ℓ + m + 1)! x ℓ yx m = it X ℓ,m ≥ ℓ + m + 1)! ( itx ) ℓ y ( itx ) m . Observe that by repeated integration by parts Z ℓ ! s ℓ m ! (1 − s ) m ds = Z ℓ + 1)! s ℓ +1 m − − s ) m − ds = · · · = Z ℓ + m )! s ℓ + m ds = 1( ℓ + m + 1)! , so that ∂ [ e itx ][ y ] = it X ℓ,m ≥ (cid:18)Z ℓ ! s ℓ m ! (1 − s ) m ds (cid:19) ( itx ) ℓ y ( itx ) m = it Z X ℓ,m ≥ ℓ ! ( itsx ) ℓ m ! ( it (1 − s ) x ) m ds = it Z e itsx ye it (1 − s ) x ds. Note that ( itsx ) ℓ y ( itx (1 − s )) m is an element of C tr ( R , M ) that depends continuously on s and norm on the R -ball is bounded by ( | t | R ) ℓ + m . This implies uniform convergence of the series and hence the summationand integration are deﬁned over C tr ( R , M ) and exchangeable. This prove (4.2) and hence (4.3) in the case k = 1.For the induction step, assume (4.2) holds for k . Then by applying the product rule inside the integral,we evaluate ∂ k +1 [ e itx ][ y , . . . , y k , y k +1 ] as( it ) k k ! X σ ∈ Perm( k ) Z ∆ k k X ℓ =0 e its x y σ (1) . . . e its ℓ − x y σ ( ℓ ) ∂ [ e its ℓ ][ y k +1 ] y σ ( ℓ +1) e its ℓ +1 x . . . y σ ( k ) e its k x dρ k ( s , . . . , s k ) . Using the k = 1 case, ∂ [ e its ℓ ][ y k +1 ] = its ℓ Z e its ℓ ux y k +1 e ist ℓ (1 − u ) x du = it Z s ℓ e itvx y k +1 e it ( s ℓ − v ) x dv. We substitute this into the above equation. Then we observe for any function φ on ∆ k +1 , we have k ! Z ∆ k Z s ℓ φ ( s , . . . , s k , s ℓ − v ) dv dρ k ( s , . . . , s k ) = ( k + 1)! Z ∆ k +1 φ ( s , . . . , s k +1 ) dρ k +1 ( s , . . . , s k +1 ) , which follows using the parametrization of ∆ k by π k (∆ k ). 
Also, recall that ρ k is permutation invariant.Thus, ∂ k +1 [ e itx ][ y , . . . , y k , y k +1 ] becomes( it ) k +1 ( k + 1)! X σ ∈ Perm( k ) k X ℓ =0 Z ∆ k +1 e its x y σ (1) . . . e its ℓ − x y σ ( ℓ ) e its ℓ x y k +1 e its ℓ +1 x y σ ( ℓ +1) . . . e its k x y σ ( k ) e its k +1 x dρ k +1 ( s , . . . , s k +1 ) . It is a straightforward combinatorial manipulation to reduce this to (4.2) for k + 1; the idea is that bychoosing a permutation σ ∈ Perm( k ) and then inserting k + 1 at every possible position before, between, orafter the existing elements, we achieve every permutation of k + 1 elements.Now note that for any operator X , e itX is unitary. This implies that k e itx k BC tr ( R ) = 1. By substitutingthis into (4.2), we get (4.1). 34he role of the Fourier transform is to decompose a function on R into a linear combination of complexexponentials. For φ ∈ L ( R ), the Fourier transform is given by b φ ( s ) = Z R e − πist φ ( t ) dt. If b φ ∈ L ( R ), then we have the Fourier inversion formula φ ( t ) = Z R e πits b φ ( s ) ds. The Fourier transform extends to a well-deﬁned operator on the space of tempered distributions and inparticular is well-deﬁned for any continuous function of polynomial growth at ∞ . We also have b φ ′ ( s ) = 2 πis b φ ( s )for all tempered distributions. In particular, this implies that if s k b φ ( s ) is in L ( R ), then ( d/dt ) k φ is in BC ( R ). In fact, we will show a similar property for the non-commutative derivatives of φ ( x ) in C tr ( R ). Proposition 4.13.

Let $k \in \mathbb{N}$.
(1) Suppose that $\phi \in BC(\mathbb{R})$ and that $\int_{\mathbb{R}} (1 + |s|^k) |\widehat{\phi}(s)| \, ds$ is finite. Then $\phi(x) \in BC^k_{\mathrm{tr}}(\mathbb{R})$ with
$$\|\partial^\ell \phi(x)\|_{BC_{\mathrm{tr}}(\mathbb{R}, \mathscr{M}^\ell)} \leq \int_{\mathbb{R}} |(2\pi i s)^\ell \widehat{\phi}(s)| \, ds$$
for each $\ell \leq k$.
(2) If $\phi \in C^{k+2}(\mathbb{R})$, then $\phi(x) \in C^k_{\mathrm{tr}}(\mathbb{R})$.

Proof. (1) In light of (4.1), we have for every $R > 0$ and $\ell \leq k$ that
$$\|\partial^\ell(e^{2\pi i s x})\|_{C_{\mathrm{tr}}(\mathbb{R}, \mathscr{M}^\ell),R} \leq |2\pi s|^\ell.$$
Moreover, the map $s \mapsto \partial^\ell[e^{2\pi i s x}]$ from $\mathbb{R}$ to $C_{\mathrm{tr}}(\mathbb{R}, \mathscr{M}^\ell)$ is continuous by continuity of composition in Lemma 3.18, and $\widehat{\phi}$ is continuous. Thus, the improper Riemann integral
$$\int_{\mathbb{R}} \partial^\ell[e^{2\pi i s x}] \, \widehat{\phi}(s) \, ds = \lim_{S \to \infty} \int_{-S}^{S} \partial^\ell[e^{2\pi i s x}] \, \widehat{\phi}(s) \, ds$$
is well-defined in $C_{\mathrm{tr}}(\mathbb{R}, \mathscr{M}^\ell)$ for each $\ell \leq k$. Equivalently, the improper Riemann integral $\int_{\mathbb{R}} e^{2\pi i s x} \widehat{\phi}(s) \, ds$ is well-defined in $C^k_{\mathrm{tr}}(\mathbb{R})$. By evaluating this on any self-adjoint operator $X$ and using the spectral decomposition of $X$, we see that $\phi(x) = \int_{\mathbb{R}} e^{2\pi i s x} \widehat{\phi}(s) \, ds$ in $C_{\mathrm{tr}}(\mathbb{R})$. Therefore, $\phi(x) \in C^k_{\mathrm{tr}}(\mathbb{R})$. Also,
$$\partial^\ell[\phi(x)] = \int_{\mathbb{R}} \partial^\ell[e^{2\pi i s x}] \, \widehat{\phi}(s) \, ds,$$
so that $\|\partial^\ell[\phi(x)]\|_{C_{\mathrm{tr}}(\mathbb{R}, \mathscr{M}^\ell),R} \leq \int_{\mathbb{R}} |(2\pi i s)^\ell \widehat{\phi}(s)| \, ds$ for all $R$, which implies that $\phi(x) \in BC^k_{\mathrm{tr}}(\mathbb{R})$.

(2) Since the definition of $C^k_{\mathrm{tr}}(\mathbb{R})$ requires approximation of $\phi(x)$ and its derivatives on each operator norm ball, it suffices to show that $\phi(x)$ agrees with a $C^k_{\mathrm{tr}}(\mathbb{R})$ function on each operator norm ball. Fix $R$, and let $\psi \in C^{k+2}_c(\mathbb{R})$ be such that $\psi|_{[-R,R]} = \phi|_{[-R,R]}$. Clearly, $\psi(x)$ agrees with $\phi(x)$ on the operator norm ball of radius $R$. Note that $s^\ell \widehat{\psi}(s)$ is bounded for $\ell \leq k + 2$. In particular, $(1 + |s|^k) |\widehat{\psi}(s)|$ is bounded by a constant times $1/(1 + s^2)$, and hence it is integrable. Thus, (1) shows that $\psi(x) \in C^k_{\mathrm{tr}}(\mathbb{R})$ as required.

The following is a technical variant of the previous proposition which we will use later in the proof of Theorem 7.18. The point is that we can control $\partial \phi(x)$ with only information about $\widehat{\phi'}$ and not $\widehat{\phi}$.

Lemma 4.14.

Suppose that φ ∈ C ( R ) with polynomial growth at ∞ . If s b φ ( s ) is in C ( R ) ∩ L ( R ) , then φ ( x ) ∈ C ( R ) with ∂φ ( x ) ∈ BC tr ( R , M ( R ∗ )) . roof. Note that for any

R >

0, (1 − e − Rs ) b φ ( s ) is in C ( R ) ∩ L ( R ). Thus, we may deﬁne φ R ( t ) = Z R e πits (1 − e − Rs ) b φ ( s ) ds. Thus, c φ R ( s ) = (1 − e − Rs ) b φ ( s ) and c φ ′ R ( s ) = 2 πis (1 − e − Rs ) b φ ( s ). Because 2 πis b φ ( s ) is in L ( R ) ∩ C ( R ), wehave 2 πis (1 − e − Rs ) b φ ( s ) → πis b φ ( s ) in L ( R ) as R → ∞ . In particular, it follows that φ ′ R → φ ′ uniformly,hence φ R − φ R (0) → φ − φ (0) uniformly on compact sets, and so φ R ( x ) − φ R (0) + φ (0) → φ ( x ) in C tr ( R ).Now because 2 πis b φ R ( s ) → πis b φ ( s ) in L ( R ), we see in particular that 2 πis b φ R ( s ) is Cauchy in L ( R ) as R → ∞ , and hence ∂φ R ( x ) is Cauchy in BC tr ( R , M ( R ∗ )) as R → ∞ , and thus converges to some limit.The limit must give the Fr´echet derivative of φ ( x ) and hence φ ∈ C ( R ) and ∂φ ∈ BC tr ( R ). A function f ∈ tr( C ( R ∗ d )) deﬁnes for each ( A , τ ) ∈ W a map A d sa → C . Since A d sa is contained in the Hilbertspace L ( A , τ ) d sa , it makes sense at least formally to speak of the gradient of f . In fact, taking A = M N ( C )with its canonical trace tr N , we obtain a C function f M N ( C ) , tr N : M N ( C ) d sa → C , which certainly has agradient with respect to the inner product coming from tr N . The rigorous construction of the gradient infact makes sense for f ∈ tr( C ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ ))). We start with an auxiliary technical lemma. Lemma 4.15.

There is a Fr´echet-space isomorphism

Φ : tr( C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d ))) → C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ )) d such that Φ( g ) is the unique element satisfying g A ,τ ( X )[ Y , . . . , Y ℓ , Y ] = h Y , Φ( g ) A ,τ ( X )[ Y , . . . , Y ℓ ] i τ . (4.4) Furthermore, we have k Φ( g ) k C tr ( R ∗ d , M ℓ ) ,R ≤ k g k C tr ( R ∗ d , M ℓ +1 ) , R ≤ d k Φ( g ) k C tr ( R ∗ d , M ℓ ) ,R (4.5) Finally, for k ∈ N ∪ {∞} , Φ maps tr( C k tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d )) isomorphically (as Fr´echet spaces)onto C k tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ ) d , and it satisﬁes for k ′ ≤ k that ∂ k ′ (Φ( g )) = Φ(( ∂ k ′ g ) σ ) , (4.6) where σ is the permutation of { , . . . , ℓ + 1 + k ′ } that moves ℓ + 1 to the last position and leaves the otherindices in the same order.Proof. Consider a trace polynomial g in C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ )) that is expressed as a product ofmonomials τ ( g ( x , y , . . . , y ℓ )) . . . τ ( g k ( x , y , . . . , y ℓ )) τ ( h ( x , y , . . . , y ℓ ) y i h ( x , y , . . . , y ℓ )) , such that the overall expression is multilinear in y , . . . , y ℓ , y , where y = ( y , . . . , y d ). Then setΦ( g ) = (0 , . . . , | {z } i − , h h , , . . . , | {z } d − i ) . Straightforward computation checks that Φ( g ) satisﬁes (4.4). The map Φ extends to all trace polynomialsby linearity. Next, we must be pass to the completion C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d )).To this end, we ﬁrst show (4.5) in the special case where g is a trace polynomial. Let ( A , τ ) ∈ W , let X ∈ A d sa with k X k ∞ ≤ R , let α , α , . . . , α ℓ ∈ [1 , ∞ ] with 1 /α = 1 /α + · · · + 1 /α ℓ , and let Y j ∈ A d j with k Y j k α j ≤

1. Let 1 /α + 1 /β = 1, and let Y ∈ A d with k Y k β ≤

1. Then |h Y , Φ( g ) A ,τ ( X )[ Y , . . . , Y ℓ ] i τ | = | g A ,τ ( X )[ Y , . . . , Y ℓ , Y ] | ≤ k g k C tr ( R ∗ d , M ℓ +1 ) ,R . Y was arbitrary with k Y k β ≤

1, we have k Φ( g ) A ,τ ( X )[ Y , . . . , Y ℓ ] k α ≤ k g k C tr ( R ∗ d , M ℓ +1 ) ,R . Then taking the supremum over X , Y , . . . , Y ℓ and α , α , . . . , α ℓ satisfying the conditions given above, wehave k Φ( g ) k C tr ( R ∗ d , M ℓ ) d ,R ≤ k g k C tr ( R ∗ d , M ℓ +1 ) ,R . Conversely, to estimate g in terms of Φ( g ), let ( A , τ ) and X be as above and consider α , α , . . . , α ℓ , β with1 /α = 1 /α + · · · + 1 /α ℓ + 1 /β . For j = 1, . . . , ℓ , let Y j ∈ A d j with k Y j k α j ≤ Y ∈ A d with k Y k β ≤

1. Let β ′ be such that 1 /α + · · · + 1 /α ℓ + 1 /β ′ = 1. Then β ′ ≤ β and hence k Y k β ′ ≤ d k Y k β ≤ d .Since g A ,τ ( X )[ Y , . . . , Y ℓ , Y ] is a scalar, its norm in L α ( A , τ ) is equal to its absolute value, hence | g A ,τ ( X )[ Y , . . . , Y ℓ ] | = |h Y , Φ( g ) A ,τ ( X )[ Y , . . . , Y ℓ ] i τ | ≤ d k Φ( g ) k C tr ( R ∗ d , M ℓ ) ,R . Hence, (4.5) holds when f is a trace polynomial. It follows that the map Φ extends to the unique maptr( C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d )) → C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ ) d and that this map (still denoted by Φ) is injective. To see that Φ is surjective, let h ∈ C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ )) d .Let g ∈ tr( C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d ))) be given by g A ,τ ( X )[ Y , . . . , Y ℓ , Y ] = h Y , h A ,τ ( X )[ Y , . . . , Y ℓ ] i τ . Then Φ( g ) = h . So Φ is a linear isomorphism. Continuity of Φ and Φ − is clear from (4.5).Finally, (4.6) is checked directly using the characterization (4.4) of Φ, and it follows that Φ mapstr( C k tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d ))) isomorphically onto C k tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ )) d . Deﬁnition 4.16.

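For $f \in \operatorname{tr}(C^1_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell})))$, we define $\nabla f := \Phi(\partial f)$, where $\Phi$ is the map in the previous lemma. Equivalently, $\nabla f$ is characterized by the relation that for every $(\mathcal{A}, \tau)$, for $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$, $\mathbf{Y}_1 \in \mathcal{A}^{d_1}_{\mathrm{sa}}, \dots, \mathbf{Y}_\ell \in \mathcal{A}^{d_\ell}_{\mathrm{sa}}$, and $\mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}$, we have
$$(\partial f)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \mathbf{Y}] = \langle \mathbf{Y}, \nabla f^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell] \rangle_\tau.$$
The previous lemma implies in particular that for each $R > 0$,
$$\|\nabla f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^\ell),R} \leq \|\partial f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^{\ell+1}),R} \leq d \, \|\nabla f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^\ell),R}. \tag{4.7}$$
Also, for $k \in \mathbb{N} \cup \{\infty\}$, we have $f \in \operatorname{tr}(C^{k+1}_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell})))$ if and only if $\nabla f$ is in $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))^d$.

Intuition for the gradient comes from the following special cases.

Remark 4.17. Suppose that $f(x) = \tau(\phi(x))$ for some $C^1$ function $\phi: \mathbb{R} \to \mathbb{C}$. Then we claim that $f \in \operatorname{tr}(C^1_{\mathrm{tr}}(\mathbb{R}^{*d}))$ and $\nabla f(x) = \phi'(x)$. To prove this, first consider the case where $\phi(t) = t^n$. Then
$$\partial f^{\mathcal{A},\tau}(X)[Y] = \sum_{j=0}^{n-1} \tau(X^j Y X^{n-1-j}) = \tau(n X^{n-1} Y) = \tau(\phi'(X) Y),$$
so that $\nabla f^{\mathcal{A},\tau}(X) = \phi'(X)$. By linearity, the same holds whenever $\phi$ is a polynomial. Finally, if $\phi$ is $C^1$, then there exist polynomials $\phi_N$ such that $\phi_N \to \phi$ and $\phi_N' \to \phi'$ uniformly on compact subsets of $\mathbb{R}$. Hence, $\nabla[\tau(\phi_N(x))] = \phi_N'(x) \to \phi'(x)$ in $C_{\mathrm{tr}}(\mathbb{R})$, which implies that $\partial[\tau(\phi_N(x))]$ converges in $C_{\mathrm{tr}}(\mathbb{R}, \mathscr{M}(\mathbb{R}))$. The limit clearly gives $\partial[\tau(\phi(x))]$, hence $\nabla[\tau(\phi(x))] = \phi'(x)$ as desired.

Remark 4.18. Suppose that $f(\mathbf{x}) = \tau(p(\mathbf{x}))$ for some non-commutative polynomial $p$. Then $\nabla f$ as defined in Definition 4.16 is the same as the cyclic gradient of the non-commutative polynomial $p$ introduced by Voiculescu in [81, 84, 86]. For further explanation, see [20], [28, § ...].

... $(M_N(\mathbb{C}), \operatorname{tr}_N)$. Recall that $M_N(\mathbb{C})^d_{\mathrm{sa}}$ with the inner product coming from $\operatorname{tr}_N$ is a real inner-product space of dimension $dN^2$, and hence can be mapped by a linear isometry onto $\mathbb{R}^{dN^2}$. Hence, the classical gradient, divergence, Jacobian, and Hessian all make sense for $M_N(\mathbb{C})^d_{\mathrm{sa}}$. If $f \in \operatorname{tr}(C^1_{\mathrm{tr}}(\mathbb{R}^{*d}))$, then $f^{M_N(\mathbb{C}),\operatorname{tr}_N}: M_N(\mathbb{C})^d_{\mathrm{sa}} \to \mathbb{C}$ has its gradient given by $(\nabla f)^{M_N(\mathbb{C}),\operatorname{tr}_N}$.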
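For a concrete finite-dimensional check of this statement, the sketch below takes $d = 1$, $N = 2$, and $f = \operatorname{tr}_N(x^3)$, whose gradient according to the remark above on $f = \tau(\phi(x))$ should be $3x^2$. It compares a central finite difference of $f$ along a direction $Y$ with the inner product $\langle Y, 3X^2 \rangle_{\operatorname{tr}_N} = \operatorname{tr}_N(Y \cdot 3X^2)$. The helper names, the particular matrices `X` and `Y` (real symmetric, for simplicity), and the step size are illustrative assumptions rather than anything taken from the paper.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, p):
    """p-th power of a square matrix, for an integer p >= 0."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = matmul(result, a)
    return result


def tr_n(a):
    """Normalized trace tr_N(a) = (1/N) * sum of diagonal entries."""
    n = len(a)
    return sum(a[i][i] for i in range(n)) / n


def shifted(x, eps, y):
    """Return x + eps * y entrywise."""
    n = len(x)
    return [[x[i][j] + eps * y[i][j] for j in range(n)] for i in range(n)]


def f(x):
    """f(X) = tr_N(X^3)."""
    return tr_n(matpow(x, 3))


def grad_f(x):
    """Candidate gradient 3 X^2, i.e. phi'(X) for phi(t) = t^3."""
    return [[3.0 * v for v in row] for row in matpow(x, 2)]


def inner(a, b):
    """<a, b>_{tr_N} = tr_N(a b) for real symmetric a and b."""
    return tr_n(matmul(a, b))


X = [[1.0, 0.5], [0.5, -2.0]]   # real symmetric, hence self-adjoint
Y = [[0.3, -1.0], [-1.0, 0.7]]  # direction of differentiation
eps = 1e-5

finite_difference = (f(shifted(X, eps, Y)) - f(shifted(X, -eps, Y))) / (2 * eps)
predicted = inner(Y, grad_f(X))
print("finite difference:", finite_difference, "predicted:", predicted)
```

Because $f$ is a cubic polynomial in the perturbation parameter, the central difference agrees with the predicted directional derivative up to an error of order `eps**2` plus rounding.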
Moreover, if $f \in C^1_{\mathrm{tr}}(\mathbb{R}^{*d})^d$, then the Jacobian matrix of $f^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X})$ corresponds to the linear transformation $(\partial f)^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X}): M_N(\mathbb{C})^d_{\mathrm{sa}} \to M_N(\mathbb{C})^d$.

It is natural to ask whether the divergence also has an analog defined on $C^1_{\mathrm{tr}}(\mathbb{R}^{*d})^d$. Recall that if $f: \mathbb{R}^d \to \mathbb{C}^d$, then $\operatorname{div}(f) = \sum_{j=1}^d \partial_j f_j$. The divergence is the trace of the Jacobian matrix $Df$ (that is, the Fréchet derivative). Moreover, it can be expressed in probabilistic terms as follows. Let $Z$ be a standard Gaussian (random) vector in $\mathbb{R}^d$. Then
$$\operatorname{div}(f)(x) = \operatorname{Tr}(Df(x)) = E[\langle Z, Df(x) Z \rangle].$$
Now the analog of the standard Gaussian vector in free probability is a standard semicircular family $\mathbf{S} = (S_1, \dots, S_d)$, where the $S_j$'s are freely independent of each other and each $S_j$ has the spectral measure $(1/2\pi)\sqrt{4 - t^2}\,\mathbf{1}_{[-2,2]}(t)\,dt$. Let $(\mathcal{B},\sigma)$ be the tracial $W^*$-algebra generated by the standard semicircular family $\mathbf{S}$. Then we want to define, for $f \in C^1_{\mathrm{tr}}(\mathbb{R}^{*d})^d$,
$$\operatorname{div}(f)^{\mathcal{A},\tau}(\mathbf{X}) = \langle \mathbf{S}, \partial f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X})[\mathbf{S}] \rangle_{\tau*\sigma},$$
where $(\mathcal{A}*\mathcal{B}, \tau*\sigma)$ denotes the $W^*$-algebraic free product of $(\mathcal{A},\tau)$ and $(\mathcal{B},\sigma)$. As in the case of the gradient, we will phrase the definition in greater generality to work with multilinear forms. As in the study of the gradient, we begin with an auxiliary technical lemma.

Lemma 4.19.

Let $\ell \in \mathbb{N}$ and $d, d', d_1, \dots, d_\ell \in \mathbb{N}$. Let $(\mathcal{B},\sigma)$ be the tracial $W^*$-algebra generated by a standard semicircular family $\mathbf{S}$.

(1) There exists a unique continuous map
$$\Upsilon: C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}, \mathbb{R}^{*d}))^{d'} \to C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))^{d'}$$
satisfying
$$\Upsilon(f)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell] = E_{\mathcal{A}}[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \mathbf{S}, \mathbf{S}]], \tag{4.8}$$
where $E_{\mathcal{A}}: \mathcal{A}*\mathcal{B} \to \mathcal{A}$ is the unique trace-preserving conditional expectation.

(2) We have $\|\Upsilon(f)\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell})^{d'},R} \leq \|f\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+2})^{d'},R}$.

(3) For $k \in \mathbb{N} \cup \{\infty\}$, $\Upsilon$ maps $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}, \mathbb{R}^{*d}))$ into $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$, and we have for $k' \leq k$ that
$$\partial^{k'}(\Upsilon(f)) = \Upsilon((\partial^{k'} f)_\sigma),$$
where $\sigma$ is the permutation of $\{1, \dots, \ell + k' + 2\}$ that moves the elements $\ell + 1$ and $\ell + 2$ to the end and keeps the others in the same order.

Proof. First, we show that if $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}, \mathbb{R}^{*d}))^{d'}$ is a trace polynomial, then there is a trace polynomial $\Upsilon(f)$ satisfying (4.8) (which is clearly uniquely determined by this relation). We may consider each coordinate $1, \dots, d'$ individually and thus assume without loss of generality that $d' = 1$. By linearity, it suffices to consider the case where $f = \mathrm{tr}(p_1) \cdots \mathrm{tr}(p_n)\, q$, where $p_1, \dots, p_n, q$ are non-commutative monomials (and $f$ satisfies the appropriate multilinearity conditions). We then consider the following cases. To make the discussion clearer, we shall assume the polynomial is evaluated on some $(\mathcal{A},\tau)$, $\mathbf{X}$, $\mathbf{Y}_1$, . . .
, $\mathbf{Y}_\ell$, and $\mathbf{S}$ as in (4.8) when referring to the different arguments of the function, but of course the statements are equally valid for all instances of $(\mathcal{A},\tau)$, $\mathbf{X}$, and so forth.

(a) Suppose that one of the monomials $p_j$ is linear in $\mathbf{S}$, or more precisely, it contains one occurrence of $S_i$ for one value of $i$. Then it will evaluate to zero by free independence. Thus, we may take $\Upsilon(f) = 0$.

(b) Similarly, if one of the monomials $p_j$ contains an occurrence of $S_i$ and $S_j$ for $i \neq j$, then it has the form
$$g_1(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)\, S_i\, g_2(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)\, S_j\, g_3(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell),$$
where the $g_j$'s are non-commutative monomials. By free independence, the trace will be zero, and hence we may again take $\Upsilon(f) = 0$.

(c) Suppose that one of the monomials $p_j$ contains two occurrences of $S_i$ for some $i$. Then it has the form
$$g_1(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)\, S_i\, g_2(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)\, S_i\, g_3(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell),$$
where the $g_j$'s are non-commutative monomials. By free independence, the trace is $\mathrm{tr}(g_1 g_3)\,\mathrm{tr}(g_2)$ evaluated on $\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell$. Thus, $\Upsilon(f)$ is obtained from $f$ by replacing $\mathrm{tr}(p_j)$ with $\mathrm{tr}(g_1 g_3)\,\mathrm{tr}(g_2)$.

(d) Suppose that $q$ contains an occurrence of $S_i$ and an occurrence of $S_j$ for $i \neq j$. Then using free independence (similar to case (b)), we see that $E_{\mathcal{A}}[q(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \mathbf{S}, \mathbf{S})] = 0$, so we can take $\Upsilon(f) = 0$.

(e) Suppose that $q$ contains two occurrences of $S_i$ for some $i$. Then we can write $q(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \mathbf{S}, \mathbf{S})$ as
$$g_1(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)\, S_i\, g_2(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)\, S_i\, g_3(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell).$$
Since the remaining terms in $f$ are scalar-valued, they can be factored out of the conditional expectation $E_{\mathcal{A}}$. The conditional expectation onto $\mathcal{A}$ of $q(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \mathbf{S}, \mathbf{S})$ will be
$$g_1(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)\, \tau[g_2(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell)]\, g_3(\mathbf{X}, \mathbf{Y}_1, \dots, \mathbf{Y}_\ell).$$
Hence, $\Upsilon(f)$ will be obtained from $f$ by replacing $q$ by $g_1 g_3\, \mathrm{tr}(g_2)$.

Next, let us prove (2) for the trace polynomial case.
In all the above computations with free independence, we only had to use the first and second moments of $\mathbf{S}$ with respect to the trace $\sigma$. Thus, we would have gotten the same result if we took $S_1, \dots, S_d$ to be freely independent operators, each of which has as its spectral distribution the Bernoulli measure $(1/2)(\delta_{-1} + \delta_1)$. In particular, for these operators $\|\mathbf{S}\|_\infty = 1$. Thus, (2) follows directly from our definitions of the norms.

Then using (2), we can extend the claim about existence of $\Upsilon(f)$ satisfying (4.8) from the case of trace polynomial $f$ to general $f \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}, \mathbb{R}^{*d}))^{d'}$. The extended map $\Upsilon$ clearly still satisfies (2), which in turn implies it is continuous.

Finally, to prove (3), the equality $\partial^{k'}(\Upsilon(f)) = \Upsilon((\partial^{k'} f)_\sigma)$ can be checked directly from (4.8) since the substitution of $\mathbf{S}$ into two places commutes with the operation of Fréchet differentiation. But the relation $\partial^{k'}(\Upsilon(f)) = \Upsilon((\partial^{k'} f)_\sigma)$ implies that $\Upsilon$ maps $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}, \mathbb{R}^{*d}))$ into $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$.

Remark 4.20. In the proof, we saw that the "cross terms" that mix $S_i$ and $S_j$ for $i \neq j$ will cancel. Thus, we can in fact rewrite $\Upsilon$ as
$$\Upsilon(f)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell] = \sum_{j=1}^d E_{\mathcal{A}}[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X})[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \tilde{\mathbf{S}}_j, \tilde{\mathbf{S}}_j]],$$
where $\tilde{\mathbf{S}}_j = (0, \dots, 0, S_j, 0, \dots, 0)$, with $S_j$ occurring in the $j$th position.

Definition 4.21.

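We define the divergence $\nabla^\dagger: C^1_{\mathrm{tr}}(\mathbb{R}^{*d})^d \to \mathrm{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$ by $\nabla^\dagger = \Upsilon \circ \Phi^{-1} \circ \partial$, where $\Phi$ is as in Lemma 4.15 and $\Upsilon$ is as in Lemma 4.19. In other words,
$$\nabla^\dagger(f)^{\mathcal{A},\tau}(\mathbf{X}) = \langle \mathbf{S}, \partial f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X})[\mathbf{S}] \rangle_{\tau*\sigma},$$
where $(\mathcal{B},\sigma)$ is the tracial $W^*$-algebra generated by a standard semicircular family $\mathbf{S} = (S_1, \dots, S_d)$.

We can more generally define a similar operation on multilinear forms.

Definition 4.22.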

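For $\ell \in \mathbb{N}$ and $d, d_1, \dots, d_\ell \in \mathbb{N}$, we define
$$\partial^\dagger: C^1_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d})) \to C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$$
by $\partial^\dagger = \Upsilon \circ \partial$.

This leads to the definition of the free Laplacian.

Definition 4.23.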

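Define
$$L: C^2_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))^{d'} \to C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))^{d'}$$
by $L := \partial^\dagger \partial$.

Observation 4.24. If $f \in \mathrm{tr}(C^2_{\mathrm{tr}}(\mathbb{R}^{*d}))$, we have $Lf = \nabla^\dagger \nabla f$.

Remark 4.25. We shall state an analog of the classical fact that the divergence is the trace of the Jacobian and the Laplacian is the trace of the Hessian after we discuss the trace on $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ in the next section.

Remark 4.26. There is a generalization of all the above differential operators to functions that depend not only on $\mathbf{X}$ but also on an auxiliary variable $\mathbf{X}'$. More precisely, let $\ell \in \mathbb{N}$, let $d, d', d'' \in \mathbb{N}$, and let $d_1, \dots, d_\ell \in \mathbb{N}$. Then we may consider $d''$-tuples of functions of $(\mathcal{A},\tau)$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$, $\mathbf{X}' \in \mathcal{A}^{d'}_{\mathrm{sa}}$, and $\mathbf{Y}_j \in \mathcal{A}^{d_j}$. Let
$$\partial_{\mathbf{x}}: C^1_{\mathrm{tr}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell})) \to C_{\mathrm{tr}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}))$$
be the operation of differentiation with respect to the first $d$ variables, which are represented by the formal variable $\mathbf{x} = (x_1, \dots, x_d)$. Lemma 4.15 generalizes to define an isomorphism
$$\Phi: \mathrm{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}))) \to C_{\mathrm{tr}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))^d,$$
and hence Definition 4.16 generalizes to define $\nabla_{\mathbf{x}}$. Moreover, Lemma 4.19 generalizes to define a map
$$C_{\mathrm{tr}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}, \mathbb{R}^{*d}, \mathbb{R}^{*d}))^{d''} \to C_{\mathrm{tr}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))^{d''}$$
by
$$\Upsilon(f)^{\mathcal{A},\tau}(\mathbf{X}, \mathbf{X}')[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell] = E_{\mathcal{A}}[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X}, \mathbf{X}')[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \mathbf{S}, \mathbf{S}]].$$
Hence, we can define $\partial^\dagger_{\mathbf{x}}$ and $L_{\mathbf{x}}$ analogously to $\partial^\dagger$ and $L$. Finally, if $L_{\mathbf{x}'}$ denotes the Laplacian with respect to the last $d'$ variables rather than the first $d$ variables, and if $L$ denotes the Laplacian with respect to the entire collection of variables $(\mathbf{x}, \mathbf{x}')$, we have
$$L_{\mathbf{x}} + L_{\mathbf{x}'} = L.$$
This follows from Remark 4.20.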
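To make the definitions of $\nabla^\dagger$ and $L$ concrete, here is a hedged worked computation for two simple choices of $f$. The examples are illustrative assumptions, not taken from the original text. They use only the formula $\nabla^\dagger(f)(\mathbf{X}) = \langle \mathbf{S}, \partial f(\mathbf{X})[\mathbf{S}] \rangle_{\tau * \sigma}$, the relation $Lf = \nabla^\dagger \nabla f$ from Observation 4.24, and the standard free-independence moment rules $(\tau*\sigma)(S a S b) = \tau(a)\tau(b)$ and $(\tau*\sigma)(S^2 a) = \tau(a)$ for $a, b \in \mathcal{A}$ and a standard semicircular $S$ free from $\mathcal{A}$.

```latex
% Example 1 (illustrative): the quadratic potential in d variables.
\[
V(\mathbf{x}) = \tfrac12 \sum_{j=1}^d \operatorname{tr}(x_j^2), \qquad
\nabla V(\mathbf{x}) = \mathbf{x}, \qquad
\partial \nabla V(\mathbf{X})[\mathbf{Y}] = \mathbf{Y},
\]
\[
L V(\mathbf{X}) = \langle \mathbf{S}, \mathbf{S} \rangle_{\tau*\sigma}
              = \sum_{j=1}^d \sigma(S_j^2) = d .
\]
% Example 2 (illustrative): one variable, f(x) = (1/4) tr(x^4), so that grad f = x^3.
\begin{align*}
\partial(\nabla f)(X)[Y] &= Y X^2 + X Y X + X^2 Y, \\
L f(X) &= (\tau*\sigma)\bigl(S \cdot S X^2\bigr)
        + (\tau*\sigma)\bigl(S \cdot X S X\bigr)
        + (\tau*\sigma)\bigl(S \cdot X^2 S\bigr) \\
       &= \tau(X^2) + \tau(X)^2 + \tau(X^2),
\end{align*}
% Hence, as a scalar-valued trace polynomial:
\[
L\bigl[\tfrac14 \operatorname{tr}(x^4)\bigr] = 2\operatorname{tr}(x^2) + \operatorname{tr}(x)^2 .
\]
```

The middle term of Example 2 is an instance of case (c) in the proof of Lemma 4.19: the monomial $S X S X$ has the form $g_1 S g_2 S g_3$ with $g_1 = 1$, $g_2 = X$, $g_3 = X$, and its trace becomes $\mathrm{tr}(g_1 g_3)\,\mathrm{tr}(g_2) = \mathrm{tr}(x)^2$.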
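$*$-algebra $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, its trace, and its log-determinant

In this section, we endow $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ with the structure of a tracial $*$-algebra, which we view as a tracial non-commutative analog of $C(\mathbb{R}^d, M_d(\mathbb{C}))$ with the pointwise adjoint and trace operations.

Recall that if $F \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, then for each $(\mathcal{A},\tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$, $F^{\mathcal{A},\tau}(\mathbf{X})$ defines a (complex) linear transformation $\mathcal{A}^d \to \mathcal{A}^d$. Moreover, for $F, G \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, we have
$$(FG)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}] = F^{\mathcal{A},\tau}(\mathbf{X})[G^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}]].$$
By Lemma 3.18, $FG \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, and more generally, by Theorem 3.19, if $F$ and $G$ are in $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, then so is $FG$. In other words, $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ is an algebra under this multiplication. The identity element of $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ is the function $\mathrm{Id}$ given by $\mathrm{Id}(\mathbf{x})[\mathbf{y}] = \mathbf{y}$. (We use the lowercase $\mathrm{id}$ to denote the identity function in $C_{\mathrm{tr}}(\mathbb{R}^{*d})^d$.)

In fact, for $k \in \mathbb{N}$, $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ behaves like a Banach algebra in the following way. This will be useful for proving smoothness of functions defined by ...

Lemma 4.27.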

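Let $k \in \mathbb{N}$. For $F \in C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, define
$$\|F\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R} = \sum_{j=0}^k \frac{1}{j!} \|\partial^j F\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{j+1})^d,R}.$$
Then
$$\|FG\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R} \leq \|F\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R}\, \|G\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R}.$$
Proof.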

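Let $k' \leq k$. We apply the formula from Theorem 3.19 to compute $\partial^{k'}[FG]$ by taking $n = 1$ and $f = F$ and $g = \mathrm{id}$ and $h = G$. Note that $|B_i'| = 1$ and hence $|B| = k' - j$. Since the blocks $B_i'$ must have their minimal elements ordered, they are uniquely determined by the choice of the block $B$. Thus,
$$\partial^{k'}[FG] = \sum_{B \subseteq \{2, \dots, k'+1\}} \partial^{k'-|B|} F[\partial^{|B|} G, \mathrm{Id}, \dots, \mathrm{Id}]_\sigma,$$
where $\sigma$ is the permutation sending $1$ to $1$ and mapping $2, \dots, 1 + |B|$ onto $B$ and sending the rest of $2 + |B|, \dots, 1 + k'$ in order onto the remaining points in $[k' + 1]$. For each $j \leq k'$, there are $\binom{k'}{j}$ choices of $B$ with $|B| = j$, which results in the estimate
$$\|\partial^{k'}[FG]\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'+1})^d,R} \leq \sum_{j=0}^{k'} \binom{k'}{j} \|\partial^{k'-j} F\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'-j+1})^d,R}\, \|\partial^j G\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{j+1})^d,R}.$$
Hence,
$$\begin{aligned}
\|FG\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R}
&= \sum_{k'=0}^k \frac{1}{k'!} \|\partial^{k'}[FG]\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'+1})^d,R} \\
&\leq \sum_{k'=0}^k \sum_{j=0}^{k'} \frac{1}{(k'-j)!\, j!} \|\partial^{k'-j} F\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'-j+1})^d,R}\, \|\partial^j G\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{j+1})^d,R} \\
&\leq \left( \sum_{i=0}^k \frac{1}{i!} \|\partial^i F\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{i+1})^d,R} \right) \left( \sum_{j=0}^k \frac{1}{j!} \|\partial^j G\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{j+1})^d,R} \right) \\
&= \|F\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R}\, \|G\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R}.
\end{aligned}$$

Next, we claim that $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ is a $*$-algebra with respect to some involution $\star$ that is compatible with the ... $*$ by pointwise application of $*$, that is, $(F^*)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}] = F^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}]^*$ for $\mathbf{X}, \mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}$. However, this involution is analogous to applying entrywise complex conjugation to a matrix rather than taking the adjoint. To prevent ambiguity, we will use the symbol $\star$ for the new adjoint operation.

Lemma 4.28. There exists a unique involution $\star$ on $C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ such that for every $(\mathcal{A},\tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$ and $\mathbf{Y}_1, \mathbf{Y}_2 \in \mathcal{A}^d$, we have
$$\langle (F^\star)^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_1], \mathbf{Y}_2 \rangle_\tau = \langle \mathbf{Y}_1, F^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}_2] \rangle_\tau.$$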
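(4.9)

Moreover, $\star$ defines a continuous map $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d \to C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ for every $k$, with
$$\|\partial^k F^\star\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k+1})^d,R} = \|\partial^k F\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k+1})^d,R} \quad \text{for } R > 0, \tag{4.10}$$
and hence for $k \in \mathbb{N}$ and $R > 0$,
$$\|F^\star\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R} = \|F\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)^d,R}. \tag{4.11}$$
We also have
$$(FG)^\star = G^\star F^\star. \tag{4.12}$$
Proof.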

For each k ∈ N , letΦ : tr( C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d | {z } k +2 ))) → C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d | {z } k +1 )) d be as in Lemma 4.15. Let σ be the element of Perm( k + 2) that switches the last 2 indices. Then we deﬁneΩ : C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d | {z } k +1 )) d → C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d | {z } k +1 )) d by Ω( F ) := Φ(Φ − ( F ) ∗ σ ) , and F ✶ = Ω( F ) in the case where k = 1. Note that by Lemma 4.15, Ω is a continuous involution. By directcomputation, for any k , for any ( A , τ ) ∈ W and X , Y , Y , . . . , Y k +1 ∈ A d , we have h Ω( F ) A ,τ ( X )[ Y , . . . , Y k +1 ] , Y i τ = h Y k +1 , F A ,τ ( X )[ Y , . . . , Y k , Y ] i τ , and hence in particular (4.9) holds. Moreover, for any k , if 1 /α = 1 /α + · · · + 1 /α k +1 and 1 /α + 1 /β = 1,then k Ω( F ) A ,τ ( X ) k α ; α ,...,α k = sup {k Ω( F ) A ,τ ( X )[ Y , . . . , Y k +1 ] k τ,α : k Y j k τ,α j ≤ } = sup {h Y , Ω( F ) A ,τ ( X )[ Y , . . . , Y k +1 ] i τ : k Y k β ≤ , k Y j k τ,α j ≤ } = sup {h F A ,τ ( X )[ Y , Y , . . . , Y k +1 ] , Y i τ : k Y k β ≤ , k Y j k τ,α j ≤ } = k F A ,τ ( X ) k (1 − /α ) − ; β,α ,...,α k +1 . It follows that k Ω( F ) k C tr ( R ∗ d , M k +1 ) d ,R = k F k C tr ( R ∗ d , M k +1 ) d ,R for all R . Then we observe that ∂ [Ω( F )] = Ω[( ∂ F ) σ ], and hence by induction ∂ j [Ω( F )] is Ω of a permutationof ∂ j F whenever F is a C j tr function. It follows that ✶ , which is the k = 1 case of Ω, satisﬁes (4.10) and(4.11). Finally, to show (4.12), note that by (4.9), we have for any ( A , τ ), X , Y , Y ∈ A d sa that h Y , [( F G ) ✶ ] A ,τ ( X )[ Y ] i τ = h Y , [ G ✶ F ✶ ] A ,τ ( X )[ Y ] i τ . By linearity, the same relation holds if Y is taken from A d rather than A d sa . This implies that [( F G ) ✶ ] A ,τ ( X )[ Y ] = Y , [ G ✶ F ✶ ] A ,τ ( X )[ Y ], and since ( A , τ ), X , and Y were arbitrary (4.12) holds.Next, we construct a trace functional on C tr ( R ∗ d , M ( R ∗ d )) d .42 emma 4.29. 
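There exists a unique linear functional $\mathrm{Tr}_\#: C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d \to \mathrm{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$ satisfying
$$[\mathrm{Tr}_\#(F)]^{\mathcal{A},\tau}(\mathbf{X}) = \langle \mathbf{S}, F^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X})[\mathbf{S}] \rangle_{\tau*\sigma} \tag{4.13}$$
for $(\mathcal{A},\tau) \in \mathbb{W}$, where $(\mathcal{B},\sigma)$ is the tracial $W^*$-algebra generated by a standard free semicircular family $\mathbf{S} = (S_1, \dots, S_d)$. We have
$$\mathrm{Tr}_\#(F^\star) = \mathrm{Tr}_\#(F)^* \tag{4.14}$$
and
$$\mathrm{Tr}_\#(FG) = \mathrm{Tr}_\#(GF). \tag{4.15}$$
Furthermore, $\mathrm{Tr}_\#$ maps $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$ into $\mathrm{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))$ for each $k \in \mathbb{N} \cup \{\infty\}$, and we have for $k' \leq k$ that
$$\|\partial^{k'} \mathrm{Tr}_\#(F)\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'}),R} \leq d\, \|\partial^{k'} F\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'+1})^d,R}. \tag{4.16}$$
Proof.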

We deﬁne Tr ( F ) = Υ ◦ Φ − ( F ) where Φ is as in Lemma 4.15 and Υ is as in Lemma 4.19. Then(4.13) is veriﬁed from the deﬁnitions of Φ and Υ. The relation (4.14) follows because h S , F A∗B ,τ ∗ σ ( X )[ S ] i τ ∗ σ = h ( F ✶ ) A∗B ,τ ∗ σ ( X )[ S ] , S i τ ∗ σ = h S , ( F ✶ ) A∗B ,τ ∗ σ ( X )[ S ] i τ ∗ σ . The claim about C k tr functions and (4.16) follows from (4.5) and (4.6) together with Lemma 4.19 (2) and (3).It remains to prove (4.15). By density and by continuity of the composition operations, it suﬃces toconsider elements F , G of C tr ( R ∗ d , M ( R ∗ d )) d given by trace polynomials. Then there are trace polynomials F i,j,k,ℓ fo i, j ∈ [ d ] and k = 1 , . . . , K and ℓ = 1 , . . . , A , τ ), F A ,τi ( X )[ Y ] = K X k =1 d X j =1 (cid:16) F A ,τi,j,k, ( X ) Y j F A ,τi,j,k, ( X ) + F A ,τi,j,k, ( X ) τ ( F A ,τi,j,k, ( X ) Y j ) (cid:17) and similarly we may write G A ,τi ( X )[ Y ] = K ′ X k ′ =1 d X j =1 (cid:16) G A ,τi,j,k ′ , ( X ) Y j G A ,τi,j,k ′ , ( X ) + G A ,τi,j,k ′ , ( X ) τ ( G A ,τi,j,k ′ , ( X ) Y j ) (cid:17) . By free independence, ( τ ∗ σ )( F A ,τi,j,k, ( X ) S j ) = 0so that F A∗B ,τ ∗ σi ( X )[ S ] = F A ,τi ( X )[ Y ] = K X k =1 d X j =1 F A ,τi,j,k, ( X ) S j F A ,τi,j,k, ( X ) . Again using free independence, we have( τ ∗ σ )( G A ,τi,m,k ′ , ( X ) F A ,τm,j,k, ( X ) S j F A ,τm,j,k, ( X )) = 0 . Hence, (cid:0) G A∗B ,τ ∗ σ ( X )[ F A∗B ,τ ∗ σ ( X )[ S ]] (cid:1) i = X k,k ′ d X j =1 d X m =1 G A ,τi,m,k ′ , ( X ) F A ,τm,j,k, ( X ) S j F A ,τm,j,k, ( X ) G A ,τi,m,k ′ , ( X ) , and thus h S , G A∗B ,τ ∗ σ ( X )[ F A∗B ,τ ∗ σ ( X )[ S ]] i τ ∗ σ = X k,k ′ d X i,j,m =1 ( τ ∗ σ ) h S i G A ,τi,m,k ′ , ( X ) F A ,τm,j,k, ( X ) S j F A ,τm,j,k, ( X ) G A ,τi,m,k ′ , ( X ) i . i = j , then the trace of the expression in the sum is zero by free independence. 
Moreover, the i = j canbe evaluated using free independence as follows: X k,k ′ d X j,m =1 ( τ ∗ σ ) h S j G A ,τi,m,k ′ , ( X ) F A ,τm,j,k, ( X ) S j F A ,τm,j,k, ( X ) G A ,τi,m,k ′ , ( X ) i = X k,k ′ d X j,m =1 τ h G A ,τi,m,k ′ , ( X ) F A ,τm,j,k, ( X ) i τ h F A ,τm,j,k, ( X ) G A ,τi,m,k ′ , ( X ) i . This expression is invariant if we switch F and G , by applying traciality of τ and interchanging the indices j and m . Thus, (4.15) holds.We will next discuss the log-determinant described by the trace Tr on C tr ( R ∗ d , M ( R ∗ d )) d . It is easiestto deﬁne this trace in terms of the Fuglede-Kadison determinant on tracial W ∗ -algebras. To this end, let usinterpret the trace Tr in terms of traces on a C ∗ -algebra.Observe that for each ( A , τ ) ∈ W and each X ∈ A d sa with k X k ∞ ≤ R , the function F ( X ) deﬁnes abounded linear transformation π A ,τ X ( F ) : L ( A , τ ) d → L ( A , τ ) d with k π A ,τ X ( F ) k ≤ k F k C tr ( R ∗ d , M ( R ∗ d )) d ,R . We deﬁne a C ∗ -semi-norm on C tr ( R ∗ d , M ( R ∗ d )) d by k F k C ∗ ,R = sup {k π A ,τ X ( F ) k : ( A , τ ) ∈ W , X ∈ A d sa , k X k ∞ ≤ R } . The separation-completion of C tr ( R ∗ d , M ( R ∗ d )) d with respect to this seminorm is thus a C ∗ -algebra. Wewill (temporarily) denote this C ∗ -algebra by C R and the quotient map C tr ( R ∗ d , M ( R ∗ d )) d → C R by π R .Letting ( B , σ ) be the tracial W ∗ -algebra generated by a free semicircular family S , we have | Tr ( F ) A ,τ ( X ) | = h S , π A∗B ,τ ∗ σ X ( F ) S i τ ∗ σ ≤ d k π A∗B ,τ ∗ σ X ( F ) k . Thus, F (1 /d ) Tr ( F ) A ,τ passes to a well-deﬁned trace tr A ,τ X on the C ∗ -algebra C R . In particular, afterconstructing the GNS representation of C R associated to tr A ,τ X , we can obtain a tracial W ∗ -algebra as theWOT-closure of the image of this representation.For an algebra A , let GL ( A ) denote the group of invertible elements. 
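For $F \in GL(C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d)$ and $(\mathcal{A},\tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$ with $\|\mathbf{X}\|_\infty \leq R$, consider the Fuglede-Kadison log-determinant
$$\log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F) := d\, \mathrm{tr}^{\mathcal{A},\tau}_{\mathbf{X}}\bigl( \log \pi_R(F^\star F)^{1/2} \bigr).$$
It follows from the work of Fuglede and Kadison [31, Theorem 1, property 1°] that
$$\log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(FG) = \log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F) + \log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(G).$$
Our goal is to show that if $F$ is in $GL(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d)$, then the log-determinant defines a function in $\mathrm{tr}(C_{\mathrm{tr}}(\mathbb{R}^{*d}))$. We will use the path-connectedness of the general linear group.

Lemma 4.30.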

Let k ∈ N ∪ {∞} . Then GL ( C k tr ( R ∗ d , M ) d ) is path-connected.Proof. Let tr ∈ C ∞ tr ( R ∗ d , M ( R ∗ d )) d denote the function tr ( x )[ y ] = (tr( y ) , . . . , tr( y d )). Note that tr tr = tr and tr ✶ = tr .There is a ∗ -homomorphism φ : M d ( C ) → C ∞ tr ( R ∗ d , M ( R ∗ d )) given by φ ( M )( x ) =  d X j =1 m ,j x j , . . . , d X j =1 m d,j x j  . Since φ ( M ) commutes with the self-adjoint idempotent tr , the ∗ -algebra N generated by φ ( M d ( C )) and tr is isomorphic to M d ( C ) ⊕ M d ( C ), where M ⊕ M corresponds to M (Id − tr ) + M tr . Thus, GL ( N ) ispath-connected. 44t remains to show that every F in C k tr ( R ∗ d , M ( R ∗ d )) d is path-connected to some element of GL ( N ). For t ∈ [0 , F ( t id) be the composition of F with t id. By Theorem 3.19, t F ( t id) is a continuous function[0 , → C k tr ( R ∗ d , M ( R ∗ d )) d . Since F F ( t id) is a ∗ -homomorphism, F ( t id) ∈ GL ( C k tr ( R ∗ d , M ( R ∗ d )) d )for all t . Hence, F is path-connected to F (0) = F ◦ (0 id) in GL ( C k tr ( R ∗ d , M ( R ∗ d )) d ). In the case where F is a trace polynomial, it is easy to check that F (0) ∈ N since all the monomials involving x will dis-appear when we compose with the zero function. Since N is closed, it follows that F (0) ∈ N for all F ∈ GL ( C k tr ( R ∗ d , M ( R ∗ d )) d ). Proposition 4.31.

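Let $k \in \mathbb{N} \cup \{\infty\}$. Then there exists a unique map
$$\log \Delta_\#: GL(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d) \to \mathrm{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))$$
such that for each $(\mathcal{A},\tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$, we have
$$(\log \Delta_\#(F))^{\mathcal{A},\tau}(\mathbf{X}) = \log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F).$$
Moreover, $\log \Delta_\#$ is a continuous group homomorphism with respect to multiplication in the domain and addition in the codomain.

Proof.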

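The claim for $k = \infty$ will follow if we can prove it for $k < \infty$, so assume $k < \infty$. Let $F \in GL(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d)$, and fix $R > 0$. Since there is a continuous path from $F$ to $\mathrm{Id}$, we can write $F = F_1 \cdots F_n$ with $\|F_j^\star F_j - \mathrm{Id}\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}(\mathbb{R}^{*d}))^d,R} < 1$. Then by additivity of the Fuglede-Kadison determinant, for each $(\mathcal{A},\tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}$ with $\|\mathbf{X}\|_\infty \leq R$, we have
$$\log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F) = \sum_{j=1}^n \log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F_j).$$
Since $\|F_j^\star F_j - \mathrm{Id}\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}(\mathbb{R}^{*d}))^d,R} < 1$, the series
$$\log(F_j^\star F_j) = -\sum_{m=1}^\infty \frac{1}{m} (\mathrm{Id} - F_j^\star F_j)^m$$
converges with respect to $\|\cdot\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}(\mathbb{R}^{*d}))^d,R}$. Since the representation $\pi^{\mathcal{A},\tau}_{\mathbf{X}}$ is bounded in norm by $\|\cdot\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}(\mathbb{R}^{*d}))^d,R}$ and respects analytic functional calculus, and since the log-determinant is defined through $\log \pi_R(F^\star F)^{1/2} = \tfrac12 \log \pi_R(F^\star F)$, we have
$$\log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F_j) = -\sum_{m=1}^\infty \frac{1}{2m} \bigl( \mathrm{Tr}_\#[(\mathrm{Id} - F_j^\star F_j)^m] \bigr)^{\mathcal{A},\tau}(\mathbf{X}).$$
Because of convergence of the series
$$-\sum_{j=1}^n \sum_{m=1}^\infty \frac{1}{2m} \mathrm{Tr}_\#[(\mathrm{Id} - F_j^\star F_j)^m] \tag{4.17}$$
in $\|\cdot\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}(\mathbb{R}^{*d}))^d,R}$, it follows that $\log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F)$ is a Fréchet-$C^k$ function of $\mathbf{X}$ on the ball of radius $R$, and that this function, as well as its derivatives up to order $k$, can be approximated on the ball of radius $R$ of every $(\mathcal{A},\tau) \in \mathbb{W}$ by functions in $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$, where the approximation of the $k'$th derivative occurs with respect to $\|\cdot\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'}),R}$. Since this holds for every $R$, we conclude that $\log \Delta^{\mathcal{A},\tau}_{\mathbf{X}}(F)$ defines a function $\log \Delta_\#(F)$ in $\mathrm{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))$.

The fact that $\log \Delta_\#(FG) = \log \Delta_\#(F) + \log \Delta_\#(G)$ follows immediately from additivity of the Fuglede-Kadison determinant. Next, to prove continuity of $\log \Delta_\#$, it suffices to check continuity at the point $\mathrm{Id}$. Fix $R > 0$. Then in a neighborhood of $\mathrm{Id}$, the power series expansion of $\log$ converges uniformly with respect to $\|\cdot\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}(\mathbb{R}^{*d}))^d,R}$, and hence in this neighborhood $\log \Delta_\#(F^\star F)$ and its derivatives up to order $k$ depend continuously on $F$ with respect to $\|\cdot\|_{C^k_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}(\mathbb{R}^{*d}))^d,R}$ in the domain and $\sum_{k'=0}^k \|\partial^{k'}(\cdot)\|_{C_{\mathrm{tr}}(\mathbb{R}^{*d},\mathscr{M}^{k'}),R}$ in the target space.

The following gives an explicit formula for $\partial \log \Delta_\#(F)$ which is helpful for assessing the boundedness properties of the derivative.

Lemma 4.32.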

Let $F \in GL(C^1_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d)$ and let $G$ be the inverse of $F$. For $(\mathcal{A},\tau) \in \mathbb{W}$ and $\mathbf{X}, \mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}$, we have
$$\partial[\log \Delta_\#(F)]^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}] = \tfrac12 \bigl\langle \mathbf{S}, [G\, \partial F + G^\star\, \partial F^\star]^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X})[\mathbf{S}, \mathbf{Y}] \bigr\rangle_{\tau*\sigma},$$
where $(\mathcal{B},\sigma)$ is the tracial $W^*$-algebra generated by a family of freely independent operators $\mathbf{S}$ each of which has mean zero and variance $1$. In particular, if $G \in BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))$ and $\partial F \in BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}^2)$, then $\partial[\log \Delta_\#(F)] \in BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))$.

Proof. Let us compute the directional derivatives. Fix $(\mathcal{A},\tau) \in \mathbb{W}$. Let $\mathbf{X}$ and $\mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}$, and let
$$\Phi(t) = \pi^{\mathcal{A}*\mathcal{B},\tau*\sigma}_{\mathbf{X} + t\mathbf{Y}}(F).$$
Note that for $\mathbf{Z} \in \mathcal{A}^d_{\mathrm{sa}}$,
$$\frac{d}{dt}[\Phi(t)\mathbf{Z}] = \partial F^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X} + t\mathbf{Y})[\mathbf{Z}, \mathbf{Y}].$$
Note that $\partial F^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{X} + t\mathbf{Y})[-, \mathbf{Y}]$ defines a bounded operator on $L^2(\mathcal{A},\tau)^d$ which depends continuously on $t$, and hence $\Phi(t)$ is differentiable in the operator norm. In particular, for $t$ in a neighborhood of zero, $\Phi(t)$ is contained in some interval of the form $[\epsilon, R - \epsilon]$. We can compute $(d/dt)|_{t=0} \log \Phi(t)^* \Phi(t)$ using the power series for $\log$ centered at $R$. If we also apply the fact that $\langle \mathbf{S}, (-)\mathbf{S} \rangle_{\tau*\sigma}$ is tracial on the algebra generated by $\Phi(0)$ and $\Phi'(0)$ (for the same reason that $\mathrm{Tr}_\#$ is a trace), we obtain
$$\begin{aligned}
\frac{d}{dt}\Big|_{t=0} \Bigl\langle \mathbf{S}, \tfrac12 \log \Phi(t)^* \Phi(t)\, \mathbf{S} \Bigr\rangle_{\tau*\sigma}
&= \tfrac12 \Bigl\langle \mathbf{S}, (\Phi(0)^* \Phi(0))^{-1} \frac{d}{dt}\Big|_{t=0} [\Phi(t)^* \Phi(t)]\, \mathbf{S} \Bigr\rangle_{\tau*\sigma} \\
&= \tfrac12 \bigl\langle \mathbf{S}, \Phi(0)^{-1} (\Phi(0)^*)^{-1} [\Phi'(0)^* \Phi(0) + \Phi(0)^* \Phi'(0)]\, \mathbf{S} \bigr\rangle_{\tau*\sigma} \\
&= \tfrac12 \bigl\langle \mathbf{S}, [(\Phi(0)^*)^{-1} \Phi'(0)^* + \Phi(0)^{-1} \Phi'(0)]\, \mathbf{S} \bigr\rangle_{\tau*\sigma},
\end{aligned}$$
where the last equality follows using traciality. This reduces to the asserted formula. The boundedness statement then follows by inspection from the formula and the definitions of the norms.

Large $N$ limits of differential operators on $M_N(\mathbb{C})^d_{\mathrm{sa}}$

We have defined non-commutative analogs of the gradient, divergence, and Laplacian as well as the trace on matrix-valued functions. Note that if $f \in C^1_{\mathrm{tr}}(\mathbb{R}^{*d})^d$, then $\partial f$ is the analog of the Jacobian, and we have
$$\nabla^\dagger f = \mathrm{Tr}_\#(\partial f).$$
For $f \in \mathrm{tr}(C^2_{\mathrm{tr}}(\mathbb{R}^{*d}))$, the analog of the Hessian matrix would be $\partial \nabla f$, and it is straightforward to check that
$$Lf = \mathrm{Tr}_\#(\partial \nabla f).$$
Let us now explain how the differential operators on non-commutative smooth functions describe in some sense the large $N$ limit of differential operators on $M_N(\mathbb{C})^d_{\mathrm{sa}}$. We have already seen that if $f \in \mathrm{tr}(C^1_{\mathrm{tr}}(\mathbb{R}^{*d}))$, then $(\nabla f)^{M_N(\mathbb{C}),\mathrm{tr}_N}$ is the classical gradient of $f^{M_N(\mathbb{C}),\mathrm{tr}_N}$ as a function on the $dN^2$-dimensional inner product space $M_N(\mathbb{C})^d_{\mathrm{sa}}$, where the inner product is the one defined by $\mathrm{tr}_N$. If $f \in C^1_{\mathrm{tr}}(\mathbb{R}^{*d})^d$, then the classical divergence of $f^{M_N(\mathbb{C}),\mathrm{tr}_N}$ does not equal $(\nabla^\dagger f)^{M_N(\mathbb{C}),\mathrm{tr}_N}$ precisely, but they agree asymptotically as $N \to \infty$ in the following sense.

Lemma 4.33.

Let $f \in C^1_{\mathrm{tr}}(\mathbb{R}^{*d})^d$. Let $\operatorname{div}(f^{M_N(\mathbb{C}),\mathrm{tr}_N})$ denote the classical divergence of $f^{M_N(\mathbb{C}),\mathrm{tr}_N}$ as a function on the inner product space $M_N(\mathbb{C})^d_{\mathrm{sa}}$. Then for every $R > 0$,
$$\lim_{N \to \infty} \left\| \frac{1}{N^2} \operatorname{div}(f^{M_N(\mathbb{C}),\mathrm{tr}_N}) - (\nabla^\dagger f)^{M_N(\mathbb{C}),\mathrm{tr}_N} \right\|_{\mathrm{tr},R} = 0,$$
where $\|\cdot\|_{\mathrm{tr},R}$ is as in Definition 3.10 for $\mathcal{A} = M_N(\mathbb{C})$. Or more explicitly,
$$\lim_{N \to \infty} \sup \left\{ \left\| \frac{1}{N^2} \operatorname{div}(f^{M_N(\mathbb{C}),\mathrm{tr}_N})(\mathbf{X}) - (\nabla^\dagger f)^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X}) \right\|_\infty : \mathbf{X} \in M_N(\mathbb{C})^d_{\mathrm{sa}},\ \|\mathbf{X}\|_\infty \leq R \right\} = 0.$$
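A hedged sanity check of Lemma 4.33 in a case where the classical and free divergences differ at finite $N$: take $d = 1$ and the illustrative function $f(x) = \mathrm{tr}(x)\,x$, which is an assumption chosen for this example and not taken from the original text. The computation uses an orthonormal basis $\mathcal{E}$ of $M_N(\mathbb{C})_{\mathrm{sa}}$ for the inner product $\langle A, B \rangle = \mathrm{tr}_N(AB)$, which has $N^2$ elements, together with Parseval's identity in that real inner product space.

```latex
% Illustrative example (not from the paper): f(x) = tr(x) x with d = 1.
\[
\partial f(X)[Y] = \operatorname{tr}(Y)\, X + \operatorname{tr}(X)\, Y .
\]
% Free side: tr(S) = 0 and tr(S^2) = 1 in the free product, so
\[
(\nabla^\dagger f)(X) = \bigl\langle S,\ \operatorname{tr}(S)\, X + \operatorname{tr}(X)\, S \bigr\rangle_{\tau*\sigma}
                      = \operatorname{tr}(X) .
\]
% Matrix side: sum over the N^2 elements E of the orthonormal basis.
\begin{align*}
\operatorname{div}\bigl(f^{M_N(\mathbb{C}),\operatorname{tr}_N}\bigr)(X)
  &= \sum_{E \in \mathcal{E}} \operatorname{tr}_N\bigl(E\,[\operatorname{tr}_N(E)\, X + \operatorname{tr}_N(X)\, E]\bigr) \\
  &= \sum_{E \in \mathcal{E}} \langle E, 1 \rangle \langle E, X \rangle
     + \operatorname{tr}_N(X) \sum_{E \in \mathcal{E}} \operatorname{tr}_N(E^2) \\
  &= \operatorname{tr}_N(X) + N^2 \operatorname{tr}_N(X) .
\end{align*}
% Difference after normalizing by 1/N^2:
\[
\frac{1}{N^2}\operatorname{div}\bigl(f^{M_N(\mathbb{C}),\operatorname{tr}_N}\bigr)(X)
  - (\nabla^\dagger f)^{M_N(\mathbb{C}),\operatorname{tr}_N}(X)
  = \frac{\operatorname{tr}_N(X)}{N^2},
\qquad
\sup_{\|X\|_\infty \le R} \left| \frac{\operatorname{tr}_N(X)}{N^2} \right| = \frac{R}{N^2} \to 0 .
\]
```

In this example the discrepancy comes entirely from the term $\mathrm{tr}(Y)\,X$, whose contribution vanishes on the free side because $\mathbf{S}$ is centered; it is of order $1/N^2$ uniformly on operator-norm balls, consistent with the lemma.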

Of course, the previous lemma also applies to the Laplacian of functions $f \in \mathrm{tr}(C^2_{\mathrm{tr}}(\mathbb{R}^{*d}))$ since the Laplacian is the divergence of the gradient. More generally, given $f \in C^2_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$, note that $f^{M_N(\mathbb{C}),\mathrm{tr}_N}$ is a map from $M_N(\mathbb{C})^d_{\mathrm{sa}}$ to the vector space of multilinear forms $M_N(\mathbb{C})^{d_1}_{\mathrm{sa}} \times \cdots \times M_N(\mathbb{C})^{d_\ell}_{\mathrm{sa}} \to M_N(\mathbb{C})$. The classical Laplacian of vector-valued functions on a real inner product space is defined as the sum of the second directional derivatives over an orthonormal basis (which is the same as choosing a vector basis for the target space and computing the Laplacian coordinatewise). As per Remark 4.26, we will state the next lemma more generally in the case of the Laplacian with respect to a subset of the variables.

Lemma 4.34.

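Let $f \in C^2_{\mathrm{tr}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$. Let $\Delta_{\mathbf{x}}$ denote the Laplacian with respect to $\mathbf{x}$ of a function of variables $(\mathbf{x}, \mathbf{x}') \in M_N(\mathbb{C})^d_{\mathrm{sa}} \times M_N(\mathbb{C})^{d'}_{\mathrm{sa}}$. Then for every $R > 0$, we have
$$\lim_{N \to \infty} \left\| \frac{1}{N^2} \Delta_{\mathbf{x}}[f^{M_N(\mathbb{C}),\mathrm{tr}_N}] - [L_{\mathbf{x}} f]^{M_N(\mathbb{C}),\mathrm{tr}_N} \right\|_{\mathscr{M}^\ell,\mathrm{tr},R} = 0,$$
where $\|\cdot\|_{\mathscr{M}^\ell,\mathrm{tr},R}$ is as in Definition 3.10.

Because the Laplacian and the divergence are both defined in terms of the map $\Upsilon$ in Lemma 4.19 (and its generalization in Remark 4.26), Lemmas 4.33 and 4.34 will follow from relating $\Upsilon$ to the trace map in the finite-dimensional setting, as we will do in Lemma 4.36.

We begin with some notation. Let $d, d', \ell \in \mathbb{N}$ and $d'', d_1, \dots, d_\ell \in \mathbb{N}$. Let
$$\mathscr{M}(M_N(\mathbb{C})^{d_1}_{\mathrm{sa}}, \dots, M_N(\mathbb{C})^{d_\ell}_{\mathrm{sa}}; M_N(\mathbb{C})^{d''})$$
denote the space of real-multilinear forms $M_N(\mathbb{C})^{d_1}_{\mathrm{sa}} \times \cdots \times M_N(\mathbb{C})^{d_\ell}_{\mathrm{sa}} \to M_N(\mathbb{C})^{d''}$.

Let $\mathcal{E}$ be an orthonormal basis of $M_N(\mathbb{C})^d_{\mathrm{sa}}$. Then we define
$$\Upsilon^{(N)}: \mathscr{M}(M_N(\mathbb{C})^{d_1}_{\mathrm{sa}}, \dots, M_N(\mathbb{C})^{d_\ell}_{\mathrm{sa}}, M_N(\mathbb{C})^d_{\mathrm{sa}}, M_N(\mathbb{C})^d_{\mathrm{sa}}; M_N(\mathbb{C})^{d''}) \to \mathscr{M}(M_N(\mathbb{C})^{d_1}_{\mathrm{sa}}, \dots, M_N(\mathbb{C})^{d_\ell}_{\mathrm{sa}}; M_N(\mathbb{C})^{d''})$$
by
$$(\Upsilon^{(N)} \Lambda)[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell] = \sum_{\mathbf{E} \in \mathcal{E}} \Lambda[\mathbf{Y}_1, \dots, \mathbf{Y}_\ell, \mathbf{E}, \mathbf{E}]. \tag{4.18}$$

Lemma 4.35.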

Let Υ ( N ) be as above and let Υ : C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d , R ∗ d )) d ′′ → C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ by given by (Υ f ) A ,τ ( X , X ′ )[ Y , . . . , Y ℓ ] = (Υ f ) A∗B ,τ ∗ σ ( X , X ′ )[ Y , . . . , Y ℓ , S , S ] , where ( B , σ ) is the tracial W ∗ -algebra generated by a standard semicircular d -tuple S . Then for f ∈ C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d , R ∗ d )) d ′′ , for every R > , lim N →∞ (cid:13)(cid:13)(cid:13) Υ ( N ) f M N ( C ) , tr N − (Υ f ) M N ( C ) , tr N (cid:13)(cid:13)(cid:13) M ℓ , tr ,R = 0 . (4.19) Proof.

Note that we can also write(Υ ( N ) Λ)[ Y , . . . , Y ℓ ] = E Λ[ Y , . . . , Y ℓ , Z , Z ] , (4.20)where Z is a standard Gaussian random vector in M N ( C ) d sa , that is, a Gaussian random vector with meanzero and covariance matrix I . In this case S ( N ) = (1 /N ) Z is Gaussian unitary ensemble. It is well-knownthat E k S ( N ) k ∞ ≤ C for some constant independent of N (and in fact much more is true); see Lemma 8.15 and the references citedin the discussion preceding that lemma. It follows that for Λ ∈ M ( M N ( C ) d sa , . . . , M N ( C ) d ℓ sa , M N ( C ) d sa , M N ( C ) d sa ; M N ( C ) d ′′ ),we have k Υ ( N ) Λ k M ℓ , tr ≤ C k Λ k M ℓ +2 , tr . In particular, for f ∈ C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d , R ∗ d )) d ′′ , we have k Υ ( N ) f M N ( C ) , tr N k M ℓ , tr ,R ≤ C k f k C tr ( R ∗ d , M ℓ +2 ) d ′′ ,R . f ∈ C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ , R ∗ d , R ∗ d )) d ′′ , forinstance for those given by trace polynomials. Furthermore, it suﬃces to consider the case d ′′ = 1 since wecan handle each coordinate of f individually.To evaluate Υ ( N ) for trace polynomials, we use the following magic formula:1 N X E ∈E AE i BE j C = E AS ( N ) i BS ( N ) j C = δ i = j A tr N ( B ) C for A, B, C ∈ M N ( C ) . (4.21)This can be proved, for instance, by direct computation using the orthonormal basis E = { N / E j,j } Nj =1 ∪ { ( N/ / ( E j,k + E k,j ) } j

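Lemma 4.36. Let $F \in C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d$. Then $F^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X})$ defines a linear transformation $M_N(\mathbb{C})^d \to M_N(\mathbb{C})^d$, which has a well-defined trace $\operatorname{Tr}(F^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X}))$. Then for each $R > 0$,
$$\lim_{N \to \infty} \sup_{\substack{\mathbf{X} \in M_N(\mathbb{C})^d_{\mathrm{sa}} \\ \|\mathbf{X}\|_\infty \leq R}} \left| \frac{1}{N^2} \operatorname{Tr}[F^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X})] - [\mathrm{Tr}_\#(F)]^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X}) \right| = 0.$$
Similarly, for each $F \in GL(C_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))^d)$ and for every $R > 0$, we have
$$\lim_{N \to \infty} \sup_{\substack{\mathbf{X} \in M_N(\mathbb{C})^d_{\mathrm{sa}} \\ \|\mathbf{X}\|_\infty \leq R}} \left| \frac{1}{N^2} \log \bigl| \det[F^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X})] \bigr| - [\log \Delta_\#(F)]^{M_N(\mathbb{C}),\mathrm{tr}_N}(\mathbf{X}) \right| = 0.$$
Proof.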

The first claim is immediate since the trace was defined in terms of $\Upsilon$ in Lemma 4.29. The claim about the log-determinant follows by expressing the log-determinant as the trace of some function as in the proof of Proposition 4.31; see (4.17).

We also have the following refinement which allows for uniform convergence on $\|\cdot\|_2$-balls if $\partial F$ is bounded.

Lemma 4.37.

Let F ∈ C ( R ∗ d , M ) d with ∂ F ∈ BC tr ( R ∗ d , M ) d . Then for each R > , lim N →∞ sup X ∈ M N ( C ) d sa k X k ≤ R (cid:12)(cid:12)(cid:12)(cid:12) N Tr[ F M N ( C ) , tr N ( X )] − [Tr ( F )] M N ( C ) , tr N ( X ) (cid:12)(cid:12)(cid:12)(cid:12) = 0 . Similarly, if F ∈ GL ( C ( R ∗ d , M )) d with inverse given by G , and if G ∈ BC tr ( R ∗ d , M ( R ∗ d )) d and ∂ F ∈ BC tr ( R ∗ d , M ( R ∗ d , R ∗ d )) d , then lim N →∞ sup X ∈ M N ( C ) d sa k X k ≤ R (cid:12)(cid:12)(cid:12)(cid:12) N log (cid:12)(cid:12) det[ F M N ( C ) , tr N ( X )] (cid:12)(cid:12) − [log ∆ ( F )] M N ( C ) , tr N ( X ) (cid:12)(cid:12)(cid:12)(cid:12) = 0 . Proof.

Fix

R > R ′ >

0. Let φ R ′ ( t ) = max( − R ′ , min( t, R ′ )). For X ∈ ( A , τ ), we have k φ R ′ ( X ) − X k τ, ≤ R ′ k X k τ, . Hence, letting g A ,τR ′ ( X ) = ( φ R ′ ( X ) , . . . , φ R ′ ( X d )), we have k g A ,τR ′ ( X ) − X k τ, ≤ R ′ k X k τ, . Now F ( g R ′ ) ∈ C tr ( R ∗ d , M ) d . Moreover, if S ∈ ( A , τ ) with k S k ∞ ≤ k X k ≤ R , then |h S , F A ,τ ( g A ,τR ′ ( X ))[ S ] i τ − h S , F A ,τ ( X )[ S ] i τ | ≤ k ∂ F k BC tr ( R ∗ d , M ) k S k ∞ k g A ,τR ′ ( X ) − X k ≤ R R ′ k ∂ F k BC tr ( R ∗ d , M ) . In particular, this implies that for each N sup X ∈ M N ( C ) d sa k X k ≤ R (cid:12)(cid:12)(cid:12)(cid:12) N Tr[ F M N ( C ) , tr N ( X )] − N Tr[( F ◦ g ) M N ( C ) , tr N ( X )] (cid:12)(cid:12)(cid:12)(cid:12) ≤≤ R R ′ k ∂ F k BC tr ( R ∗ d , M ) and a similar error bound holds for the error when replacing F with F ◦ g R ′ in Tr . Since k g A ,τR ′ ( X ) k ∞ ≤ R ′ ,we have lim N →∞ sup X ∈ M N ( C ) d sa k X k ≤ R (cid:12)(cid:12)(cid:12)(cid:12) N Tr[( F ◦ g R ′ ) M N ( C ) , tr N ( X )] − [Tr ( F ◦ g )] M N ( C ) , tr N ( X ) (cid:12)(cid:12)(cid:12)(cid:12) = 0 . N →∞ sup X ∈ M N ( C ) d sa k X k ≤ R (cid:12)(cid:12)(cid:12)(cid:12) N Tr[ F M N ( C ) , tr N ( X )] − [Tr ( F )] M N ( C ) , tr N ( X ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ R R ′ k ∂ F k BC tr ( R ∗ d , M ) . Since R ′ was arbitrary, we are ﬁnished proving the ﬁrst claim. The proof of the second claim is similar usingLemma 4.32. This section will give the deﬁnition of the free Wasserstein manifold W ( R ∗ d ) consisting of non-commutativelog-densities V , the non-commutative diﬀeormorphism group D ( R ∗ d ), and the transport action D ( R ∗ d ) y W ( R ∗ d ). It will explain as many results as can be proved by computation, and then sketch other ideas thatwill be carried out rigorously when V is suﬃciently close to the quadratic function (1 / h x , x i tr . Deﬁnition 5.1.

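We define the free Wasserstein manifold $\mathscr{W}(\mathbb{R}^{*d})$ to be the set of $V \in \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))$ such that $a \|\mathbf{x}\|_2^2 + b \leq V \leq a' \|\mathbf{x}\|_2^2 + b'$ for some $a, a' > 0$ and $b, b' \in \mathbb{R}$, considered modulo additive constants.

Definition 5.2.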

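We define the tangent space $T_V \mathscr{W}(\mathbb{R}^{*d})$ as the set of equivalence classes of continuously differentiable paths $t \mapsto V_t$ from some interval $(-\epsilon, \epsilon)$ to $\operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))_{\mathrm{sa}}$ such that $V_0 = V$ modulo constants and such that $a \|\mathbf{x}\|_2^2 + b \leq V \leq a' \|\mathbf{x}\|_2^2 + b'$ for some $a, a' > 0$ and $b, b' \in \mathbb{R}$. Here $t \mapsto V_t$ and $t \mapsto W_t$ are considered to be equivalent if $\dot{V}_0 = \dot{W}_0$ modulo constants. Here "continuously differentiable" is interpreted in terms of the Fréchet topology on $\operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))_{\mathrm{sa}}$.

Definition 5.3.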

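For $k \in \mathbb{N} \cup \{\infty\}$, we define $\operatorname{Diff}_{\operatorname{tr}}^k(\mathbb{R}^{*d})$ as the space of functions $\mathbf{f} \in C_{\operatorname{tr}}^k(\mathbb{R}^{*d})_{\mathrm{sa}}^d$ such that $\mathbf{f}$ has an inverse function $\mathbf{f}^{-1} \in C_{\operatorname{tr}}^k(\mathbb{R}^{*d})_{\mathrm{sa}}^d$. Similarly, we define $\operatorname{BDiff}_{\operatorname{tr}}^k(\mathbb{R}^{*d})$ as the space of functions $\mathbf{f} \in \operatorname{Diff}_{\operatorname{tr}}^k(\mathbb{R}^{*d})$ such that $\partial \mathbf{f}, \dots, \partial^k \mathbf{f}$ and $\partial \mathbf{f}^{-1}, \dots, \partial^k \mathbf{f}^{-1}$ are bounded. We also use the notation $\operatorname{Diff}_{\operatorname{tr}}(\mathbb{R}^{*d}) = \operatorname{Diff}_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})$ and $\operatorname{BDiff}_{\operatorname{tr}}(\mathbb{R}^{*d}) = \operatorname{BDiff}_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})$.

Observation 5.4.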

It follows from the chain rule that $\operatorname{Diff}_{\operatorname{tr}}^k(\mathbb{R}^{*d})$ and $\operatorname{BDiff}_{\operatorname{tr}}^k(\mathbb{R}^{*d})$ are groups under composition.

Definition 5.5.

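Let $\mathscr{D}(\mathbb{R}^{*d}) := \operatorname{Diff}_{\operatorname{tr}}(\mathbb{R}^{*d}) \cap \operatorname{BDiff}_{\operatorname{tr}}^1(\mathbb{R}^{*d})$. We define $T_{\mathbf{f}} \mathscr{D}(\mathbb{R}^{*d})$ as the set of continuously differentiable paths $t \mapsto \mathbf{f}_t$ from some interval $(-\epsilon, \epsilon)$ to $\mathscr{D}(\mathbb{R}^{*d})$ such that $\mathbf{f}_0 = \mathbf{f}$, the derivatives $\partial \mathbf{f}_t$ and $\partial \mathbf{f}_t^{-1}$ are uniformly bounded, and the maps $t \mapsto \mathbf{f}_t$ and $t \mapsto \mathbf{f}_t^{-1}$ are continuously differentiable $(-\epsilon, \epsilon) \to C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$. Here $t \mapsto \mathbf{f}_t$ and $t \mapsto \mathbf{g}_t$ are considered equivalent if $\dot{\mathbf{f}}_0 = \dot{\mathbf{g}}_0$.

Lemma 5.6.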

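There is a group action $\mathscr{D}(\mathbb{R}^{*d}) \curvearrowright \mathscr{W}(\mathbb{R}^{*d})$ given by
$$(\mathbf{f}, V) \mapsto \mathbf{f}_* V := V \circ \mathbf{f}^{-1} - \log \Delta_{\#}(\partial \mathbf{f}^{-1}).$$
More generally, this formula defines an action $\operatorname{Diff}_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*d}) \curvearrowright \operatorname{tr}(C_{\operatorname{tr}}^k(\mathbb{R}^{*d}))_{\mathrm{sa}}$.

Proof. First, note that if $V \in \operatorname{tr}(C_{\operatorname{tr}}^k(\mathbb{R}^{*d}))_{\mathrm{sa}}$ and $\mathbf{f} \in \operatorname{Diff}_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*d})$, then $\mathbf{f}_* V \in \operatorname{tr}(C_{\operatorname{tr}}^k(\mathbb{R}^{*d}))_{\mathrm{sa}}$. Indeed, Theorem 3.19 shows that $V \circ \mathbf{f}^{-1} \in \operatorname{tr}(C_{\operatorname{tr}}^k(\mathbb{R}^{*d}))_{\mathrm{sa}}$, and Proposition 4.31 shows that $\log \Delta_{\#}(\partial \mathbf{f}^{-1}) \in \operatorname{tr}(C_{\operatorname{tr}}^k(\mathbb{R}^{*d}))_{\mathrm{sa}}$.

To show that $\mathbf{f}_*(\mathbf{g}_* V) = (\mathbf{f} \circ \mathbf{g})_* V$, observe that
$$\begin{aligned}
V \circ (\mathbf{f} \circ \mathbf{g})^{-1} - \log \Delta_{\#}(\partial (\mathbf{f} \circ \mathbf{g})^{-1})
&= (V \circ \mathbf{g}^{-1}) \circ \mathbf{f}^{-1} - \log \Delta_{\#}\bigl((\partial \mathbf{g}^{-1} \circ \mathbf{f}^{-1}) \# \partial \mathbf{f}^{-1}\bigr) \\
&= \bigl(V \circ \mathbf{g}^{-1} - \log \Delta_{\#}(\partial \mathbf{g}^{-1})\bigr) \circ \mathbf{f}^{-1} - \log \Delta_{\#}(\partial \mathbf{f}^{-1}).
\end{aligned}$$

To complete the proof that $\mathscr{D}(\mathbb{R}^{*d})$ acts on $\mathscr{W}(\mathbb{R}^{*d})$, it suffices to show that if $\mathbf{f} \in \operatorname{BDiff}_{\operatorname{tr}}^1(\mathbb{R}^{*d})$ and $V \in \operatorname{tr}(C_{\operatorname{tr}}(\mathbb{R}^{*d}))_{\mathrm{sa}}$ satisfies $a \langle \mathbf{x}, \mathbf{x} \rangle_{\operatorname{tr}} + b \leq V \leq a' \langle \mathbf{x}, \mathbf{x} \rangle_{\operatorname{tr}} + b'$, then $\mathbf{f}_* V$ satisfies similar bounds. Now $\partial \mathbf{f}^{-1}$ and its inverse $\partial \mathbf{f} \circ \mathbf{f}^{-1}$ are both bounded. This implies a uniform bound, independent of $R$, on the $\mathrm{C}^*$-norms $\|\partial \mathbf{f}^{-1}\|_{\mathrm{C}^*, R}$ and $\|(\partial \mathbf{f}^{-1})^{-1}\|_{\mathrm{C}^*, R}$ used in the definition of $\log \Delta_{\#}$. Hence, $\log \Delta_{\#}(\partial \mathbf{f}^{-1})$ is bounded. Thus, it remains to show that $V \circ \mathbf{f}^{-1}$ has quadratic upper and lower bounds. But note that $\mathbf{f}^{-1}$ and $\mathbf{f}$ both have bounded first derivative, and thus they are both uniformly Lipschitz with respect to $\|\cdot\|_2$. More precisely, for all $(\mathcal{A}, \tau) \in \mathbb{W}$ and $\mathbf{X} \in \mathcal{A}_{\mathrm{sa}}^d$,
$$-\|\mathbf{f}^{-1}(0)\|_2 + \frac{1}{\|\partial \mathbf{f}\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d}, \mathscr{M}^1)}} \|\mathbf{X}\|_2 \leq \|(\mathbf{f}^{-1})^{\mathcal{A}, \tau}(\mathbf{X})\|_2 \leq \|\mathbf{f}^{-1}(0)\|_2 + \|\partial \mathbf{f}^{-1}\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d}, \mathscr{M}^1)} \|\mathbf{X}\|_2.$$
Substituting this into the given bounds for $V$ completes the argument.

The group action $\mathscr{D}(\mathbb{R}^{*d}) \curvearrowright \mathscr{W}(\mathbb{R}^{*d})$ produces a map from $T_{\mathrm{id}}(\mathscr{D}(\mathbb{R}^{*d}))$ to $T_V \mathscr{W}(\mathbb{R}^{*d})$. This transformation from "infinitesimal transport maps" to perturbations of $V$ is described as follows. For the classical analog, see [49, Theorem 3.5].

Lemma 5.7.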

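Let $(-\epsilon, \epsilon) \to \mathscr{D}(\mathbb{R}^{*d}) : t \mapsto \mathbf{f}_t$ be a tangent vector at $\mathrm{id}$ in $\mathscr{D}(\mathbb{R}^{*d})$, and let $V \in \mathscr{W}(\mathbb{R}^{*d})$. Then $t \mapsto V_t := (\mathbf{f}_t)_* V$ is a tangent vector at $V$ in $\mathscr{W}(\mathbb{R}^{*d})$. Moreover, we have $\dot{V}_0 = -\nabla_V^* \dot{\mathbf{f}}_0$, where
$$\nabla_V^* \mathbf{h} := -\operatorname{Tr}_{\#}(\partial \mathbf{h}) + \partial V \# \mathbf{h}.$$

Proof.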

Let g t = f − t . Note that ˙ V t = ∂V ( g t )[ ˙ g t ], which depends continuously on t in tr( C ∞ tr ( R ∗ d ) , M ( R ∗ d ))by Theorem 3.19. Next, we claim that ddt log ∆ ( ∂ g t ) = Tr ( ∂ ˙ g t ∂ f t ◦ g t ) . Let g s,t = g s ◦ g − t . Then for small δ ∈ R , we have ∂ g t + δ = ( ∂ g t + δ,t ◦ g t ) ∂ g t , hence log ∆ ( ∂ g t + δ ) − log ∆ ( ∂ g t ) = (log ∆ ∂ g t + δ,t ) ◦ g t . Note g t + δ,t → id in C tr ( R ∗ d ) d as δ → ddδ (cid:12)(cid:12)(cid:12)(cid:12) δ =0 g t + δ,t = ˙ g t ◦ g − t . For each

R > k >

0, the series expansionlog ∆ ( ∂ g t + δ,t ) = − ∞ X m =1 m Tr [(Id − ( ∂ g t + δ,t ) ✶ ∂ g t + δ,t ) m ]converges in k·k C k ( R ∗ d , M ( R ∗ d )) d ,R for suﬃciently small δ . Therefore, ddδ (cid:12)(cid:12)(cid:12)(cid:12) δ =0 log ∆ ( ∂ g t + δ,t ) = 12 Tr (cid:18) ddδ (cid:12)(cid:12)(cid:12)(cid:12) δ =0 g t + δ,t ) ✶ ∂ g t + δ,t (cid:19) = 12 Tr (cid:18) ddδ (cid:12)(cid:12)(cid:12)(cid:12) δ =0 ( ∂ g t + δ,t + ( ∂ g t + δ,t ) ✶ ) (cid:19) . Now ∂ g A ,τt + δ,t ( X ) maps A d sa → A d sa for any ( A , τ ). Therefore, if ( B , σ ) is the tracial W ∗ -algebra generated bya semicircular d -tuple S , then ∂ g A∗B ,σ ∗ τt + δ,t ( X )[ S ] is self-adjoint and hence h S , ∂ g A∗B ,σ ∗ τt + δ,t ( X )[ S ] i τ ∗ σ = h ∂ g A∗B ,σ ∗ τt + δ,t ( X )[ S ] , S i τ ∗ σ = h S , (( ∂ g t + δ,t ) ✶ ) A∗B ,σ ∗ τ ( X )[ S ] i τ ∗ σ . Hence, Tr (( ∂ g t + δ,t ) ✶ ) = Tr ( ∂ g t + δ,t ), which implies that ddδ (cid:12)(cid:12)(cid:12)(cid:12) δ =0 log ∆ ( ∂ g t + δ,t ) = Tr (cid:18) ddδ (cid:12)(cid:12)(cid:12)(cid:12) δ =0 ∂ g t + δ,t (cid:19) = Tr ( ∂ ( ˙ g t ◦ g − t ))= Tr ( ∂ g t ◦ g − t ∂ ( g − t )) . ddt log ∆ ( ∂ g t ) = Tr ( ∂ ˙ g t ◦ g − t ∂ ( g − t )) ◦ g t = Tr ( ∂ ˙ g t ∂ f t ◦ g t ) . This is continuous in t by Theorem 3.19 and Proposition 4.31. Hence, t log ∆ ( ∂ g t ) is continuouslydiﬀerentiable as desired. The above computations also show that˙ V = ddt (cid:12)(cid:12)(cid:12)(cid:12) t =0 [ V ◦ g t − log ∆ ( ∂ g t )] = ∂V g − Tr ( ∂ ˙ g ) = − ∂V f + Tr ( ∂ ˙ f ) = −∇ ∗ V ˙ f . Given a tangent vector t f t of the identity in D ( R ∗ d ), the function ˙ f ∈ C tr ( R ∗ d ) d sa can be viewed as a d -dimensional vector ﬁeld. The next lemma describes how to construct paths of diﬀeomorphisms as the ﬂowof a family of vector ﬁelds. Lemma 5.8.

Let t h t be a continuous map [0 , T ] → C ( R ∗ d ) d sa such that k ∂ h t k BC tr ( R ∗ d , M ) d is boundedby a constant M . Then there exist continuous maps t f t and t g t from [0 , T ] to C ( R ∗ d ) d sa satisfying f t = id + Z t h u ◦ f u du g t = id − Z t h t − u ◦ g u du and f t ◦ g t = g t ◦ f t = id and k ∂ f t k BC tr ( R ∗ d , M ) d ≤ e Mt , k ∂ g t k BC tr ( R ∗ d , M ) d ≤ e Mt . Furthermore, for k ≥ , if t h t is a continuous map into C k tr ( R ∗ d ) d sa , then so are t f t and t g t . If inaddition k ∂ k ′ h t k BC tr ( R ∗ d , M k ′ ) d is bounded for each ≤ k ′ ≤ k , then the same holds for f t and g t .Proof. We focus ﬁrst on the function f t and its derivatives. We construct the solution f t through Picarditeration. Let f t, = id f t,n +1 = id + Z t h u ◦ f u,n du. To see why this makes sense, recall that for any continuous function γ from [0 , T ] into a Fr´echet space Y ,the Riemann integral R T γ is well-deﬁned. The proof is the same as for Riemann integration of R d -valuedfunctions; one simply has to use the uniform continuity of γ with respect to each of the seminorms in Y .Similarly, R t γ is continuously diﬀerentiable with derivative equal to γ . Now C tr ( R ∗ d ) d sa is a Fr´echet spaceand the composition operation is continuous, so by induction f t,n is a well-deﬁned and continuous function[0 , T ] → C tr ( R ∗ d ) d sa .Next, since ∂ h u is bounded by M for all u , we know that for every ( A , τ ) ∈ W , the function h A ,τu : A d sa →A d sa is M -Lipschitz with respect to k·k ∞ . It follows that k h u ◦ f u,n − h u ◦ f u,n − k C tr ( R ∗ d ) ,R ≤ M k f u,n − f u,n − k C tr ( R ∗ d ) ,R . Therefore, k f t,n +1 − f t,n k C tr ( R ∗ d ) ,R ≤ M Z t k f u,n − f u,n − k C tr ( R ∗ d ) ,R du. By induction, k f t,n +1 − f t,n k C tr ( R ∗ d ) ,R ≤ M n t n n ! sup t ∈ [0 ,T ] k f t, − id k C tr ( R ∗ d ,R ) . R , the right-hand side goes to zero. 
Hence, f t,n converges to some function f t in C tr ( R ∗ d ) as n → ∞ uniformly for all t , which satisﬁes the integral equation as desired.For k ≥

1, suppose that t h t is a continuous map into C k tr ( R ∗ d ) d , and we will show that t f t is aswell. Because the composition operation on C k tr functions is continuous, we obtain by the chain rule that for n ∈ N , ∂ f t,n +1 = Id + Z t ( ∂ h u ◦ f u,n ) ∂ f u,n du and for 2 ≤ k ′ ≤ k , ∂ k ′ f t,n +1 = k ′ X j =1 X B ,...,B j partition of [ k ′ ]min B < ··· < min B j Z t ( ∂ j h u ◦ f u,n ) ∂ | B | f u,n , . . . , ∂ | B j | f u,n ] du. We want to show that ∂ k ′ f t,n converges as n → ∞ in order to conclude that f t is in C k tr ( R ∗ d ) d sa .First, we construct the limiting functions. For 1 ≤ k ′ ≤ k , we claim that there is a continuous function t f ( k ′ ) t from [0 , T ] to C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d )) (here the multilinear form has k ′ arguments) that satisﬁes f (1) t = Id + Z t ( ∂ h u ◦ f u ) ∂ f (1) u du (5.1)and for 2 ≤ k ′ ≤ k , f ( k ′ ) t,n +1 = k ′ X j =1 X B ,...,B j partition of [ k ′ ]min B < ··· < min B j Z t ( ∂ j h u ◦ f u ) f ( | B | ) u , . . . , f ( | B j | ) u ] du. (5.2)We proceed by strong induction. Let k ′ ≥ ≤ ℓ < k ′ . Note that theright-hand side only has one term which depends on f ( k ′ ) u , namely the term ( ∂ h u ◦ f u ) f ( k ′ ) u for j = 1. Allthe other terms f ( | B i | ) u are already deﬁned by inductive hypothesis and bounded in k·k C tr ( R ∗ d , M | Bi | ) d ,R . Since ∂ h u is bounded by M , the right-hand side is thus M -Lipschitz in f ( k ′ ) u with respect to k·k C tr ( R ∗ d , M k ′ ) d ,R .Thus, a solution f ( k ′ ) u exists by Picard iteration by the same argument as we used for f t .Let f (0) t = f t . Next, we show by strong induction on k ′ that for each R >

0, we have ∂ k ′ f t,n → f ( k ′ ) t in k·k C tr ( R ∗ d , M k ′ ) as n → ∞ uniformly for t ∈ [0 , T ]. Suppose k ′ ≥ ℓ < k ′ . Fix R > ∂ k ′ f t,n +1 − f ( k ′ ) t = Z t ( ∂ h u ◦ f u,n ) ∂ k ′ f t,n − f ( k ′ ) t ) du + Z t ( ∂ h u ◦ f u − ∂ h u − ◦ f u,n ) f ( k ′ ) t du + k ′ X j =2 X B ,...,B j partition of [ k ′ ]min B < ··· < min B j Z t ( ∂ j h u ◦ f u,n ) ∂ | B | f u,n , . . . , ∂ | B j | f u,n ] − k ′ X j =2 X B ,...,B j partition of [ k ′ ]min B < ··· < min B j ( ∂ j h u ◦ f u ) f ( | B | ) u , . . . , f ( | B j | ) u ] du. n ≥

1, let ǫ n,R = sup t ∈ [0 ,T ] (cid:13)(cid:13)(cid:13) ( ∂ h u ◦ f u − ∂ h u − ◦ f u,n ) f ( k ′ ) t (cid:13)(cid:13)(cid:13) C tr ( R ∗ d , M k ′ ) ,R + k ′ X j =2 X B ,...,B j partition of [ k ′ ]min B < ··· < min B j (cid:13)(cid:13)(cid:13) ( ∂ j h u ◦ f u,n ) ∂ | B | f u,n , . . . , ∂ | B j | f u,n ] − ( ∂ j h u ◦ f u ) f ( | B | ) u , . . . , f ( | B j | ) u ] (cid:13)(cid:13)(cid:13) C tr ( R ∗ d , M k ′ ) d ,R . By the inductive hypothesis and continuity of composition, we have ǫ n,R → n → ∞ . We have k ∂ k ′ f t, − f ( k ′ ) t k C tr ( R ∗ d , M k ′ ) ,R ≤ sup u ∈ [0 ,T ] k f ( k ′ ) u k C tr ( R ∗ d , M k ′ ) d ,R =: K and k ∂ k ′ f t,n +1 − f ( k ′ ) t k C tr ( R ∗ d , M k ′ ) d ,R ≤ Z t (cid:16) M k ∂ k ′ f t,n − f ( k ′ ) t k C tr ( R ∗ d , M k ′ ) d ,R + ǫ n,R (cid:17) du. A straightforward induction on n shows that k ∂ k ′ f t,n − f ( k ′ ) t k C tr ( R ∗ d , M k ′ ) ,R ≤ KM n t n n ! + n X ℓ =1 ǫ n − ℓ,R M ℓ t ℓ ℓ ! . Let ǫ n,R = 0 for n ≤

0. Then n X ℓ =1 ǫ n − ℓ,R M ℓ t ℓ ℓ ! = ∞ X ℓ =1 ǫ n − ℓ,R M ℓ t ℓ ℓ ! → n → ∞ using the dominated convergence theorem because ( ǫ n − ℓ,R ) n,ℓ ∈ N is bounded and ǫ n − ℓ,R → n → ∞ and P ∞ m =1 ( M t ) m /m ! converges. Therefore, k ∂ k ′ f t,n − f ( k ′ ) t k C tr ( R ∗ d , M k ′ ) ,R → n → ∞ as desired.Because ∂ k ′ f t,n → f ( k ′ ) t as n → ∞ for each k ′ ≤ k , we conclude that f t ∈ C k tr ( R ∗ d ) d and ∂ k ′ f t = f ( k ′ t ) for k ′ ≤ k . We already showed that f ( k ′ ) t depends continuously on t in C tr ( R ∗ d , M ( R ∗ d , . . . , R ∗ d )) d and therefore t f t is a continuous map from [0 , T ] into C k tr ( R ∗ d ) d .The bound k f t k C tr ( R ∗ d , M ) d ≤ e Mt follows from (5.1) by the same argument as Gr¨onwall’s inequality inclassical ordinary diﬀerential equations. Similarly, if ∂ k ′ h t is uniformly bounded for each k ′ ≤ k , then onecan obtain a Gr¨onwall-type bound and (5.2) to show that ∂ k ′ f t is uniformly bounded for k ′ ≤ k . We leavethe details to the reader.It remains to show that the same claims hold for g t as for f t . By applying the foregoing argument to asubinterval of [0 , T ], we obtain functions f t,s for s, t ∈ [0 , T ] such that t f t,s is continuous and f t,s = id + Z ts h u ◦ f u,s du. Also, f t,s ∈ C ( R ∗ d ) d sa and k ∂ f t,s k BC tr ( R ∗ d , M ) d ≤ e M | t − s | . One can verify from the integral equations that f t ,t ◦ f t ,t = f t ,t , which is a standard idea in ordinary diﬀerential equations. In particular, since f t = f t, ,the inverse function is given by g t = f ,t , which satisﬁes the integral equation asserted in the propositionafter switching the order of the endpoints in the Riemann integral. Remark . Of course, the lemma applies equally well to negative time intervals. It also works for unboundedtime intervals with the hypotheses and conclusions modiﬁed to state uniform bounds on each compact timeinterval rather than for all time.An important special case is when h is independent of t . 
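Let $\mathbf{h} \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$ with $\partial \mathbf{h}$ bounded. Then there is a one-parameter group $(\mathbf{f}_t)_{t \in \mathbb{R}}$ in $G$ solving the equation
$$\mathbf{f}_t = \mathrm{id} + \int_0^t \mathbf{h} \circ \mathbf{f}_u \, du.$$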
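As a rough illustration of this one-parameter group, the following sketch treats only the classical commutative analog with $d = 1$, where the vector field is an ordinary real function with bounded derivative. It approximates $f_t(x) = x + \int_0^t h(f_u(x))\,du$ by Picard iteration with a trapezoid rule on a uniform grid, and can be used to check the group law $f_{s+t} = f_s \circ f_t$ numerically. The names `picard_flow` and `flow`, the grid size, and the iteration count are assumptions made for this example; none of them come from the paper, and the sketch says nothing about the non-commutative setting.

```python
import math


def picard_flow(h, x0, t, steps=200, iterations=30):
    """Approximate u -> f_u(x0) on a uniform grid of [0, t] (t may be negative).

    Classical scalar analog only: h is a real function with bounded derivative.
    Each pass applies the Picard map f -> x0 + int_0^u h(f_v) dv,
    with the integral replaced by a trapezoid rule (an assumption of this sketch).
    """
    dt = t / steps
    path = [x0] * (steps + 1)  # zeroth iterate: the identity map
    for _ in range(iterations):
        values = [h(p) for p in path]  # h evaluated along the current iterate
        integral = 0.0
        new_path = [x0]
        for i in range(steps):
            integral += 0.5 * dt * (values[i] + values[i + 1])
            new_path.append(x0 + integral)
        path = new_path
    return path


def flow(h, x0, t, steps=200, iterations=30):
    """Return the approximate value of f_t(x0), i.e. exp(t h) applied to x0."""
    return picard_flow(h, x0, t, steps, iterations)[-1]


if __name__ == "__main__":
    x0, s, t = 0.3, 0.7, 0.5
    composed = flow(math.sin, flow(math.sin, x0, s), t)
    direct = flow(math.sin, x0, s + t)
    print("f_t(f_s(x0)) =", composed)
    print("f_{s+t}(x0)  =", direct)
```

For $h = \sin$ the exact classical flow is $f_t(x) = 2 \arctan(\tan(x/2)\, e^t)$, which gives an independent check on the approximation; flowing for time $t$ and then $-t$ should return the starting point up to discretization error.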

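In the spirit of Lie theory, we will denote $\mathbf{f}_t$ by $\exp(t \mathbf{h})$. This description of one-parameter subgroups naturally gives rise to a Lie bracket on $C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$ analogous to the classical Lie bracket on vector fields associated to the classical diffeomorphism group of $\mathbb{R}^d$ (also known as the Poisson bracket). Suppose $\mathbf{h}_1, \mathbf{h}_2 \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$ have bounded first derivatives. Then using continuity of $t \mapsto \exp(t \mathbf{h})$ and the differential equation above, one can compute that
$$\exp(t \mathbf{h}_1) \circ \exp(t \mathbf{h}_2) \circ \exp(-t \mathbf{h}_1) \circ \exp(-t \mathbf{h}_2) = \mathrm{id} + t^2 [\mathbf{h}_1, \mathbf{h}_2] + o(t^2),$$
where
$$[\mathbf{h}_1, \mathbf{h}_2] := \partial \mathbf{h}_1 \# \mathbf{h}_2 - \partial \mathbf{h}_2 \# \mathbf{h}_1,$$
and where $o(t^2)$ means $o(t^2)$ with respect to each of the seminorms in $C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$. It is an exercise to check that the Lie bracket is a continuous map $C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d \times C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d \to C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$ and satisfies the Jacobi identity. In the special case of non-commutative polynomials and power series, this Lie bracket was studied in [85].

For $\mathbf{h} \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$, let $\partial_{\mathbf{h}} : C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}) \to C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})$ be the map $\partial_{\mathbf{h}} f := \partial f \# \mathbf{h}$. It follows from the product rule (which is a special case of Theorem 3.19) that $\partial_{\mathbf{h}}(fg) = (\partial_{\mathbf{h}} f) \cdot g + f \cdot (\partial_{\mathbf{h}} g)$, that is, $\partial_{\mathbf{h}}$ is a derivation on the algebra $C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})$. We also have
$$\partial_{\mathbf{h}_1} \partial_{\mathbf{h}_2} f = \partial(\partial f \# \mathbf{h}_2) \# \mathbf{h}_1 = \partial^2 f \# [\mathbf{h}_2, \mathbf{h}_1] + \partial f \# \partial \mathbf{h}_2 \# \mathbf{h}_1,$$
where $\partial^2 f \# [\mathbf{h}_2, \mathbf{h}_1]$ denotes the bilinear form $\partial^2 f$ applied to the pair $(\mathbf{h}_2, \mathbf{h}_1)$ and is symmetric in its two arguments. Hence
$$(\partial_{\mathbf{h}_1} \partial_{\mathbf{h}_2} - \partial_{\mathbf{h}_2} \partial_{\mathbf{h}_1}) f = -\partial_{[\mathbf{h}_1, \mathbf{h}_2]} f.$$
In other words, $\mathbf{h} \mapsto -\partial_{\mathbf{h}}$ is a Lie algebra homomorphism from $C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$ to the Lie algebra of derivations on $C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})$.

The next lemma describes how the flows $(\mathbf{f}_t)$ of Lemma 5.8 will act upon some $V \in \operatorname{tr}(C_{\operatorname{tr}}^1(\mathbb{R}^{*d}))_{\mathrm{sa}}$. This is the basic computation that underlies our results about free transport.

Lemma 5.10.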

Let t V t be continuously diﬀerentiable map [0 , T ] → tr( C ( R ∗ d )) sa and let ˙ V t be its timederivative. Let t h t be a continuous map [0 , T ] → C ( R ∗ d ) d sa with k ∂ h t k BC tr ( R ∗ d , M ) d bounded, and let f t be the solution from Lemma 5.8 to the equation f t = id + Z t h u ◦ f u du. (5.3) Then we have in tr( C tr ( R ∗ d )) that ddt [( f − t ) ∗ V t ] = ( ˙ V t + ∇ ∗ V t h t ) ◦ f t . (5.4) In particular, V t = ( f t ) ∗ V modulo constants for all t if and only if −∇ ∗ V t h t = ˙ V t modulo constants for all t .Proof. For s, t ∈ [0 , T ], let f t,s solve the equation f t,s = id + Z ts h u ◦ f u,s du. Then for t ∈ [0 , T ] and ǫ ∈ R such that t + ǫ ∈ [0 , T ], we have f t + ǫ = f t + ǫ,t ◦ f t . Moreover,( f − t ) ∗ V t = V t ◦ f t − log ∆ ∂ f t , and ( f − t + ǫ ) ∗ V t + ǫ = V t + ǫ ◦ f t + ǫ,t ◦ f t − log ∆ ∂ f t + ǫ,t ◦ f t − log ∆ ∂ f t . Therefore, ( f − t + ǫ ) ∗ V t + ǫ − ( f − t ) ∗ V t = (cid:16) ( V t + ǫ − V t ) ◦ f t + ǫ,t + [ V t ◦ f t + ǫ,t − V t ] − log ∆ ∂ f t + ǫ,t (cid:17) ◦ f t . (5.5)55y continuity of composition (see Lemma 3.18), we havelim ǫ → V t + ǫ − V t ǫ ◦ f t + ǫ,t = ˙ V t ◦ f t,t = ˙ V in tr( C tr ( R ∗ d )) . Next, note that V t ◦ f t + ǫ,t − V t = Z ǫ h∇ V t ◦ f t + u,t , h t + u ◦ f t + u,t i tr du. This identity is veriﬁed by plugging in a speciﬁc ( A , τ ) and X , and then using the diﬀerential equation for f t + u,t with respect to u and the chain rule for Banach-valued functions. However, the Riemann integral onthe right-hand side is deﬁned in tr( C tr ( R ∗ d )). Using continuity of composition and the fundamental theoremof calculus, lim ǫ → ǫ Z ǫ h∇ V t ◦ f t + u,t , h t + u ◦ f t + u,t i tr du = lim u → h∇ V t ◦ f t + u,t , h t + u ◦ f t + u,t i tr = h∇ V t , h t i tr . Finally, for the third term on the right-hand side of (5.5), we have similarly to the proof of Proposition 7.15that lim ǫ → ǫ log ∆ ∂ f t + ǫ,t = Tr ( ∂ h t ) . 
Altogether, lim ǫ → ǫ (cid:16) ( f − t + ǫ ) ∗ V t + ǫ − ( f − t ) ∗ V t (cid:17) = (cid:16) ˙ V t + h∇ V t , h t i tr − Tr ( ∂ h t ) (cid:17) ◦ f t , which proves (5.4). The ﬁnal claim of the Proposition follows immediately.The case where h is independent of t is worthy of special note, since it gives a description of one-parametersubgroups of D ( R ∗ d ) that stabilize some V ∈ W ( R ∗ d ) (the analog of measure-preserving transformations). Corollary 5.11.

Let V ∈ tr( C ( R ∗ d )) sa , and let h ∈ C tr ( R ∗ d ) d sa with ∂ h ∈ BC tr ( R ∗ d , M ( R ∗ d )) d . Let f t = id + R t h ◦ f u du . Then ( f t ) ∗ V = V for all t if and only if ∇ ∗ V h = 0 .Remark . Voiculescu [84, § µ . If there is a law µ V canonically associated to V (as described below), then V may notbe uniquely determined by µ V , and thus preserving µ V is a weaker condition than preserving V .Note that the stabilizer D ( R ∗ d , V ) := { f ∈ D ( R ∗ d ) : f ∗ V = V } is a subgroup that is closed under limitswith respect to convergence of f and f − in C ( R ∗ d ) d . Based on Corollary 5.11, the tangent space of thesubgroup D ( R ∗ d , V ) at the identity should naturally be identiﬁed with (a subspace of) ker( ∇ ∗ V ) ⊆ C ∞ tr ( R ∗ d ) d sa .Thus, we expect that ker( ∇ ∗ V ) is closed under Lie brackets. To give a rigorous justiﬁcation for this, we observethe following identity. Lemma 5.13.

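For $V \in \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))_{\mathrm{sa}}$ and $\mathbf{h}_1, \mathbf{h}_2 \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})_{\mathrm{sa}}^d$,
$$\nabla_V^* [\mathbf{h}_1, \mathbf{h}_2] = \partial(\nabla_V^* \mathbf{h}_1) \# \mathbf{h}_2 - \partial(\nabla_V^* \mathbf{h}_2) \# \mathbf{h}_1.$$

Proof.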

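Fix $(\mathcal{A}, \tau) \in \mathbb{W}$. Let $(\mathcal{B}, \sigma)$ be the tracial $\mathrm{W}^*$-algebra generated by a freely independent standard semicircular $d$-tuple $\mathbf{S}$. Then
$$\begin{aligned}
\nabla_V^* (\partial \mathbf{h}_1 \# \mathbf{h}_2)^{\mathcal{A}, \tau}(\mathbf{X})
&= -\langle \mathbf{S}, \partial(\partial \mathbf{h}_1 \# \mathbf{h}_2)^{\mathcal{A} * \mathcal{B}, \tau * \sigma}(\mathbf{X})[\mathbf{S}] \rangle_{\tau * \sigma} + (\partial V \# \partial \mathbf{h}_1 \# \mathbf{h}_2)^{\mathcal{A}, \tau}(\mathbf{X}) \\
&= -\langle \mathbf{S}, (\partial \mathbf{h}_1 \# \partial \mathbf{h}_2)^{\mathcal{A} * \mathcal{B}, \tau * \sigma}(\mathbf{X})[\mathbf{S}] \rangle_{\tau * \sigma} - \langle \mathbf{S}, \partial^2 \mathbf{h}_1^{\mathcal{A} * \mathcal{B}, \tau * \sigma}(\mathbf{X})[\mathbf{h}_2^{\mathcal{A}, \tau}(\mathbf{X}), \mathbf{S}] \rangle_{\tau * \sigma} \\
&\qquad + (\partial V \# \partial \mathbf{h}_1 \# \mathbf{h}_2)^{\mathcal{A}, \tau}(\mathbf{X}) \\
&= -\operatorname{Tr}_{\#}(\partial \mathbf{h}_1 \# \partial \mathbf{h}_2)^{\mathcal{A}, \tau}(\mathbf{X}) - \langle \mathbf{S}, \partial^2 \mathbf{h}_1^{\mathcal{A} * \mathcal{B}, \tau * \sigma}(\mathbf{X})[\mathbf{S}, \mathbf{h}_2^{\mathcal{A} * \mathcal{B}, \tau * \sigma}(\mathbf{X})] \rangle_{\tau * \sigma} \\
&\qquad + (\partial V \# \partial \mathbf{h}_1)^{\mathcal{A}, \tau}(\mathbf{X})[\mathbf{h}_2^{\mathcal{A}, \tau}(\mathbf{X})] \\
&= -\operatorname{Tr}_{\#}(\partial \mathbf{h}_1 \# \partial \mathbf{h}_2)^{\mathcal{A}, \tau}(\mathbf{X}) + \partial(\nabla_V^* \mathbf{h}_1)^{\mathcal{A}, \tau}(\mathbf{X})[\mathbf{h}_2^{\mathcal{A}, \tau}(\mathbf{X})].
\end{aligned}$$
Therefore,
$$\nabla_V^* (\partial \mathbf{h}_1 \# \mathbf{h}_2) = -\operatorname{Tr}_{\#}(\partial \mathbf{h}_1 \# \partial \mathbf{h}_2) + \partial(\nabla_V^* \mathbf{h}_1) \# \mathbf{h}_2.$$
When we subtract $\nabla_V^* (\partial \mathbf{h}_2 \# \mathbf{h}_1)$ from $\nabla_V^* (\partial \mathbf{h}_1 \# \mathbf{h}_2)$, the terms $\operatorname{Tr}_{\#}(\partial \mathbf{h}_1 \# \partial \mathbf{h}_2)$ and $\operatorname{Tr}_{\#}(\partial \mathbf{h}_2 \# \partial \mathbf{h}_1)$ cancel.

5.3 The Laplacian and the Riemannian metric

Recall that the Riemannian metric on the classical Wasserstein manifold is given by
$$\int \langle \nabla L_V^{-1} \dot{V}, \nabla L_V^{-1} \dot{W} \rangle \, d\mu_V$$
for two tangent vectors $\dot{V}$ and $\dot{W}$ at the point $V$ such that $\int \dot{V} \, d\mu_V = \int \dot{W} \, d\mu_V = 0$. To define the Riemannian metric in the free case, we must describe how to associate a non-commutative law $\mu_V$ to some $V \in \mathscr{W}(\mathbb{R}^{*d})$ as well as how to invert $L_V$ on the space of functions with expectation zero. As this section is primarily concerned with formal computation, we will state the necessary ingredients as hypotheses.

There are several ways to approach the problem of associating a non-commutative law $\mu_V$ to a potential $V$. We will assume here that $\mu_V$ is characterized by $\nabla_V^* \mathbf{h}$ having expectation zero for all $\mathbf{h} \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d$, a relation known as the Dyson-Schwinger equation. The analogous property in the classical setting is that
$$\int \bigl( \langle \nabla V, \mathbf{h} \rangle - \operatorname{div}(\mathbf{h}) \bigr) \, d\mu = 0,$$
which holds for the Gibbs measure $d\mu(x) = e^{-V(x)} \, dx / \int e^{-V}$ for the potential $V$ using integration by parts. In §7, we will argue that for many choices of $V$, there exist non-commutative laws satisfying the Dyson-Schwinger equation.

Assumption 5.14.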

Suppose that $V \in \mathscr{W}(\mathbb{R}^{*d})$ and there is a unique non-commutative law $\mu_V \in \Sigma_d$ that satisfies the Dyson-Schwinger equation
$$\tilde{\mu}_V[\nabla_V^* \mathbf{h}] = 0 \tag{5.6}$$
for $\mathbf{h} \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d$, where $\tilde{\mu}_V$ is the positive homomorphism $\operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})) \to \mathbb{C}$ corresponding to $\mu_V$.

The second hypothesis is invertibility of the Laplacian associated to $V$, which we will discuss in §6 for $V$ close to $(1/2) \sum_j \operatorname{tr}(x_j^2)$.

Definition 5.15.

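For $V \in \mathscr{W}(\mathbb{R}^{*d})$, we define $L_V : \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})) \to \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))$ by
$$L_V f := -\nabla_V^* \nabla f = \operatorname{Tr}_{\#}(\partial \nabla f) - \partial V \# \nabla f.$$

Assumption 5.16.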

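Suppose Assumption 5.14 holds and there is a continuous linear transformation $\Psi_V : \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})) \to \ker(\tilde{\mu}_V) \subseteq \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))$ such that $-L_V \Psi_V f = -\Psi_V L_V f = f - \tilde{\mu}_V(f)$.

Definition 5.17.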

Suppose that $V \in \mathscr{W}(\mathbb{R}^{*d})$ satisfies Assumptions 5.14 and 5.16. Then we define a formal Riemannian metric $\langle \cdot, \cdot \rangle_V$ on $T_V \mathscr{W}(\mathbb{R}^{*d})$ by
$$\langle \dot{V}, \dot{W} \rangle_{T_V \mathscr{W}(\mathbb{R}^{*d})} = \tilde{\mu}_V\bigl( \langle \nabla \Psi_V \dot{V}, \nabla \Psi_V \dot{W} \rangle_{\operatorname{tr}} \bigr),$$
where by abuse of notation $\dot{V}$ represents an equivalence class of paths $t \mapsto V_t$ in the tangent space with $\dot{V}_0 = \dot{V}$.

The operator $\Psi_V$ has another use besides defining the Riemannian metric. We saw in Lemma 5.7 that a vector field $\mathbf{h}$, viewed as a tangent vector to $\mathrm{id}$ in $\mathscr{D}(\mathbb{R}^{*d})$, produces a tangent vector $\dot{V} = -\nabla_V^* \mathbf{h}$ to $V$ in $\mathscr{W}(\mathbb{R}^{*d})$. The operator $\Psi_V$ allows us to reverse this transformation, since for any $\dot{V}$, the vector field $-\nabla \Psi_V \dot{V}$ satisfies
$$\dot{V} = -\nabla_V^* (-\nabla \Psi_V \dot{V}).$$
Furthermore, if we go from a vector field $\mathbf{h}$ by $\nabla_V^*$ to a perturbation $\dot{V} = -\nabla_V^* \mathbf{h}$ and then back by $-\nabla \Psi_V$ to a vector field $\nabla \Psi_V \nabla_V^* \mathbf{h}$, then we see that any vector field is equivalent modulo $\ker(\nabla_V^*)$ to a gradient. The operator
$$P_V = \nabla \Psi_V \nabla_V^* : C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d \to C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d$$
thus represents the "projection of vector fields onto gradients", and $1 - P_V$ is the free version of the Leray projection in fluid dynamics. The operators $L_V$, $\nabla$, $\nabla_V^*$, $\Psi_V$, and $P_V$ satisfy the following relations.

Proposition 5.18. Suppose that $V \in \mathscr{W}(\mathbb{R}^{*d})$ satisfies Assumptions 5.14 and 5.16. Consider the operators
$$\mathbb{C} \xrightarrow{\iota} \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})) \xrightarrow{\nabla} C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d,$$
where $\iota$ maps a scalar to the corresponding constant function, and
$$C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d \xrightarrow{\nabla_V^*} \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})) \xrightarrow{E_V} \mathbb{C}.$$
Then
(1) $\ker(\nabla) = \ker(L_V) = \iota(\mathbb{C})$.
(2) $\operatorname{Im}(\nabla_V^*) = \operatorname{Im}(L_V) = \ker(\tilde{\mu}_V)$.
(3) $-L_V \Psi_V L_V = L_V$ and $-\Psi_V L_V \Psi_V = \Psi_V$.
(4) $P_V^2 = P_V$.
(5) Every $\mathbf{f} \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d$ can be uniquely written as $\mathbf{f} = \nabla g + \mathbf{h}$ where $g \in \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))$ and $\nabla_V^* \mathbf{h} = 0$. Here $\nabla g = P_V \mathbf{f}$.

Proof. (1) Clearly, $\iota(\mathbb{C}) \subseteq \ker(\nabla) \subseteq \ker(L_V)$. Conversely, if $f \in \ker(L_V)$, then $f = -\Psi_V L_V f + \tilde{\mu}_V f = \tilde{\mu}_V f \in \iota(\mathbb{C})$.

(2) Clearly, $\operatorname{Im}(L_V) \subseteq \operatorname{Im}(\nabla_V^*)$. Moreover, (5.6) says precisely that $\operatorname{Im}(\nabla_V^*) \subseteq \ker(E_V)$. Finally, if $f \in \ker(E_V)$, then $f = -L_V \Psi_V f + E_V f = \nabla_V^* \nabla \Psi_V f + 0$.

(3) Note that $-L_V \Psi_V L_V f = L_V(f - \tilde{\mu}_V(f)) = L_V f$ and $-\Psi_V L_V \Psi_V f = \Psi_V f - \tilde{\mu}_V(\Psi_V f) = \Psi_V f$ since $\operatorname{Im}(\Psi_V) \subseteq \ker(\tilde{\mu}_V)$.

(4) Note that $\nabla \Psi_V \nabla_V^* \nabla \Psi_V \nabla_V^* = -\nabla \Psi_V L_V \Psi_V \nabla_V^* = \nabla \Psi_V \nabla_V^*$.

(5) To show existence, fix $\mathbf{f}$ and let $g = \Psi_V \nabla_V^* \mathbf{f}$ and $\mathbf{h} = \mathbf{f} - \nabla g = (1 - P_V) \mathbf{f}$. Then $\nabla_V^* \mathbf{h} = \nabla_V^* \mathbf{f} - \nabla_V^* \nabla \Psi_V \nabla_V^* \mathbf{f} = (1 + L_V \Psi_V) \nabla_V^* \mathbf{f} = E_V \nabla_V^* \mathbf{f} = 0$. For uniqueness, note that $P_V \mathbf{f}$ must equal $\nabla g$, and hence $\mathbf{h}$ must equal $(1 - P_V) \mathbf{f}$.

In the classical setting, $P_V$ is the orthogonal projection of the space of vector fields onto the space of gradients in $L^2$. Thus, $P_V \mathbf{h}$ is a vector field which will produce the same perturbation of $V$ through the transport action, and which has a smaller $L^2$ norm than the original vector field. That is, $P_V$ is an infinitesimal version of optimal transport. For the same idea to apply in the free setting, we would like to show that $\ker(\nabla_V^*)$ and $\operatorname{Im}(\nabla)$ are orthogonal with respect to $\tilde{\mu}_V$.

Although this is merely an integration by parts computation in the classical case, the same approach does not directly work in the free setting because (despite our choice of notation) $\nabla_V^*$ is not actually the adjoint of $\nabla$. Rather, it is the large $N$ limit of $1/N^2$ times the adjoint of $\nabla$ on $L^2(\mu_V^{(N)})$, where $\mu_V^{(N)}$ is the measure on $M_N(\mathbb{C})_{\mathrm{sa}}^d$ with density proportional to $e^{-N^2 V}$. The adjointness relation as written does not make sense in the large $N$ limit because of the factor of $1/N^2$.

There is another natural heuristic for why $\ker(\nabla_V^*)$ and $\operatorname{Im}(\nabla)$ are orthogonal. If $\mathbf{h} \in \ker(\nabla_V^*)$ with appropriate boundedness assumptions, then $\mathbf{h}$ should generate a one-parameter group of measure-preserving transformations $\mathbf{f}_t$ for $V$ by Corollary 5.11. If we differentiate the equation $\tilde{\mu}_V[g \circ \mathbf{f}_t] = \tilde{\mu}_V[g]$ at $t = 0$, we get $\tilde{\mu}_V[\langle \nabla g, \mathbf{h} \rangle_{\operatorname{tr}}] = 0$. However, to make a rigorous argument, it is easier to directly use the Lie bracket identity of Lemma 5.13 (related to the group of measure-preserving transformations) together with the Dyson-Schwinger equation.

Proposition 5.19.

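Suppose that $V$ satisfies Assumption 5.14, and in (3) - (5) suppose also that $V$ satisfies Assumption 5.16.
(1) $\tilde{\mu}_V[\langle \nabla \nabla_V^* \mathbf{h}_1, \mathbf{h}_2 \rangle_{\operatorname{tr}}] = \tilde{\mu}_V[\langle \mathbf{h}_1, \nabla \nabla_V^* \mathbf{h}_2 \rangle_{\operatorname{tr}}]$ for $\mathbf{h}_1, \mathbf{h}_2 \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d$.
(2) $\tilde{\mu}_V[\langle \nabla L_V g_1, \nabla g_2 \rangle_{\operatorname{tr}}] = \tilde{\mu}_V[\langle \nabla g_1, \nabla L_V g_2 \rangle_{\operatorname{tr}}]$ for $g_1, g_2 \in \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))$.
(3) $\tilde{\mu}_V[\langle \nabla \Psi_V g_1, \nabla g_2 \rangle_{\operatorname{tr}}] = \tilde{\mu}_V[\langle \nabla g_1, \nabla \Psi_V g_2 \rangle_{\operatorname{tr}}]$ for $g_1, g_2 \in \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))$.
(4) If $g \in \operatorname{tr}(C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d}))$ and $\mathbf{h} \in \ker(\nabla_V^*)$, then $\tilde{\mu}_V[\langle \nabla g, \mathbf{h} \rangle_{\operatorname{tr}}] = 0$.
(5) $\tilde{\mu}_V[\langle P_V \mathbf{h}_1, \mathbf{h}_2 \rangle_{\operatorname{tr}}] = \tilde{\mu}_V[\langle \mathbf{h}_1, P_V \mathbf{h}_2 \rangle_{\operatorname{tr}}]$ for $\mathbf{h}_1, \mathbf{h}_2 \in C_{\operatorname{tr}}^{\infty}(\mathbb{R}^{*d})^d$.

Proof. (1) By complex-linearity, it suffices to consider the case when $\mathbf{h}_1$ and $\mathbf{h}_2$ are self-adjoint. By Lemma 5.13, we have
$$\nabla_V^* [\mathbf{h}_1, \mathbf{h}_2] = \langle \nabla \nabla_V^* \mathbf{h}_1, \mathbf{h}_2 \rangle_{\operatorname{tr}} - \langle \nabla \nabla_V^* \mathbf{h}_2, \mathbf{h}_1 \rangle_{\operatorname{tr}}.$$
When we apply $\tilde{\mu}_V$, the left-hand side evaluates to zero by (5.6), hence
$$\tilde{\mu}_V[\langle \nabla \nabla_V^* \mathbf{h}_1, \mathbf{h}_2 \rangle_{\operatorname{tr}}] = \tilde{\mu}_V[\langle \nabla \nabla_V^* \mathbf{h}_2, \mathbf{h}_1 \rangle_{\operatorname{tr}}] = \tilde{\mu}_V[\langle \mathbf{h}_1, \nabla \nabla_V^* \mathbf{h}_2 \rangle_{\operatorname{tr}}],$$
since $\mathbf{h}_1$ and $\nabla \nabla_V^* \mathbf{h}_2$ are self-adjoint (which follows since $\nabla_V^* \mathbf{h}_2$ is real-valued).

(2) Substitute $\mathbf{h}_j = \nabla g_j$ into (1) and apply $\nabla_V^* \nabla = -L_V$.

(3) Substitute $\Psi_V g_j$ for $g_j$ in (2) and note that $\nabla L_V \Psi_V g_j = \nabla[\tilde{\mu}_V[g_j] - g_j] = -\nabla g_j$.

(4) Note
$$\begin{aligned}
\tilde{\mu}_V[\langle \nabla g, \mathbf{h} \rangle_{\operatorname{tr}}] &= \tilde{\mu}_V[\langle \nabla \nabla_V^* \nabla \Psi_V g, \mathbf{h} \rangle_{\operatorname{tr}}] \\
&= \tilde{\mu}_V[\langle \nabla \Psi_V g, \nabla \nabla_V^* \mathbf{h} \rangle_{\operatorname{tr}}] \\
&= 0.
\end{aligned}$$

(5) Since $P_V \mathbf{h}_1 \in \operatorname{Im}(\nabla)$ and $(1 - P_V) \mathbf{h}_2 \in \ker(\nabla_V^*)$, they are orthogonal with respect to $\tilde{\mu}_V \circ \langle \cdot, \cdot \rangle_{\operatorname{tr}}$. Therefore,
$$\tilde{\mu}_V[\langle P_V \mathbf{h}_1, \mathbf{h}_2 \rangle_{\operatorname{tr}}] = \tilde{\mu}_V[\langle P_V \mathbf{h}_1, P_V \mathbf{h}_2 \rangle_{\operatorname{tr}}].$$
By symmetrical reasoning, this equals $\tilde{\mu}_V[\langle \mathbf{h}_1, P_V \mathbf{h}_2 \rangle_{\operatorname{tr}}]$.

In contrast to the situation with $\nabla$, the adjoint of the operator $\partial$ can be understood directly from the Dyson-Schwinger equation. The following lemma is related to computations in [71, Proposition 21].

Lemma 5.20.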

Let V satisfy Asssumptions 5.14 and 5.16. Deﬁne ∂ ∗ V : C ( R ∗ d , M ( R ∗ d )) d → C tr ( R ∗ d ) d by ∂ ∗ V F = F ∇ V − ∂ † F . Then we have for f ∈ C ( R ∗ d ) d and F ∈ C ( R ∗ d , M ( R ∗ d )) d , we have ˜ µ V h f , ∂ ∗ V F i tr = ˜ µ V Tr [( ∂ f ) ✶ F ] . Remark . We can deﬁne an semi-inner product on C ∞ tr ( R ∗ d ) d by ( f , g ) E V h f , g i tr . We can also deﬁnea semi-inner product on C ∞ tr ( R ∗ d , M ( R ∗ d )) d by ( F , G ) E V Tr ( F G ). The lemma then says that ∂ ∗ V isformally the adjoint of ∂ with respect to these inner products. Proof.

We apply (5.6) with h = ( F ✶ f ) ∗ . Observe that ∂V h = h∇ V, h i tr = h h ∗ , ∇ V i tr = h F ✶ f , ∇ V i tr = h f , F ∇ V i tr . Next, we compute Tr ( ∂ h ). Let Φ and Υ be the maps in Lemmas 4.15 and 4.19 respectively. Then( A , τ ) ∈ W and X , Y ∈ A d sa , we haveΦ − ( h ) A ,τ ( X )[ Y ] = h Y , h A ,τ ( X ) i τ = h h A ,τ ( X ) ∗ , Y i τ = h ( F ✶ ) A ,τ ( X )[ f A ,τ ( X )] , Y i τ = h f A ,τ ( X ) , F A ,τ ( X )[ Y ] i τ . Now Tr ( ∂ h ) = Υ(Φ − ( ∂ h )) = Υ( ∂ Φ − ( h ))) , g σ ) = Υ( g ) when σ is the permutation thatswitches the last two indices. Let ( B , σ ) be generated by a standard semicircular d -tuple S . Using ourprevious expression for Φ − ( h ), we haveΥ( ∂ Φ − ( h )) A ,τ ( X ) = ddt (cid:12)(cid:12)(cid:12) t =0 h f A∗B ,τ ∗ σ ( X + t S ) , F A∗B ,τ ∗ σ ( X + t S )[ S ] i τ ∗ σ = h ∂ f A∗B ,τ ∗ σ ( X )[ S ] , F A∗B ,τ ∗ σ ( X )[ S ] i τ ∗ σ + h f A∗B ,τ ∗ σ ( X ) , ∂ F A∗B ,τ ∗ σ ( X )[ S , S ] i τ ∗ σ = h S , ( ∂ f ✶ F ) A∗B ,τ ∗ σ ( X )[ S ] i τ ∗ σ + h f A ,τ ( X ) , E A ∂ F A∗B ,τ ∗ σ ( X )[ S , S ] i τ = Tr [( ∂ f ) ✶ F ] A ,τ ( X ) + h f A ,τ ( X ) , ( ∂ † F ) A ,τ ( X ) i τ . Thus, we get Tr ( ∂ h ) = Tr [( ∂ f ) ✶ F ] + h f , ∂ † F i tr . So the Dyson-Schwinger equation yields E V h f , F ∇ V i tr = E V Tr [( ∂ f ) ✶ F ] + E V h f , ∂ † F i tr , which is the desired equality. A natural strategy to produce transport maps from one point V to another V in W ( R ∗ d ) is as follows.Suppose we are given a path t V t from [0 ,

1] into the free Wasserstein manifold. Suppose all the V t ’ssatisfy Assumptions 5.14 and 5.16. Assume without loss generality that · V t has expectation zero under µ V t .Let h t = −∇ Ψ V t ˙ V t , so that −∇ ∗ V t h t = ˙ V t . Let f t solve the equation f t = id + R t h u ◦ f u du . Then ( f t ) ∗ V should equal V t for all t . Of course, carrying this out rigorously requires additional analytic assumptions.The remainder of the paper will show that Assumptions 5.14 and 5.16 hold and the transport strategycan be carried out rigorously for potentials V ∈ C ∞ tr ( R ∗ d ) of the form V ( x ) = (1 / P j tr( x j ) + W ( x ) suchthat ∂W is uniformly bounded and ∂ ∇ W is uniformly bounded by a constant strictly less than 1. Moreprecisely, § L V , and from there the associated expectation E V : tr( C tr ( R ∗ d )) → C and the pseudo-inverse Ψ V of the Laplacian L V . These results will imply that V satisﬁes Assumption 5.16, and that there is a unique law µ V satisfying ˜ µ V ( L V f ) = 0 for all f ∈ tr( C ( R ∗ d )).However, this alone does not imply that µ V satisﬁes (5.6).Next, § V , that is, non-commutative lawmaximizing a certain free entropy functional. These results will imply that if ∂W and ∂ W are bounded(here there are no restrictions on the constant), then there exists a non-commutative law ν satisfying theDyson-Schwinger equation ˜ ν [ ∇ ∗ V h ] = 0 for all suﬃciently smooth h . Hence, in the situation where ∂ ∇ W isuniformly smaller than 1, we have existence and uniqueness of a law µ V satisfying (5.6), or in other words, V satisﬁes Assumption 5.14.In order to execute the strategy for constructing transport, we need h t = −∇ ∗ V t Ψ V t ˙ V t to have uniformlybounded ﬁrst derivative and to depend continuously on t in order to apply Lemmas 5.8 and 5.10. Thus,in our construction of Ψ V in §

6, we have to estimate the derivatives of Ψ V f and show that Ψ V f dependscontinuously on V and f jointly. The continuity property of course increases the amount of technical work,but it follows quite naturally from a detailed analysis of the heat semigroup provided that we uniform boundson ∂V and ∂ ∇ V . On the other hand, to get h t to have bounded ﬁrst derivative with our methods requiresus to assume that ∂ V t is bounded and that ∂ ˙ V t and ∂ ˙ V t are bounded.We ﬁnally conclude that ( f ) ∗ µ V = µ V , and this yields an isomorphism of the C ∗ and W ∗ -algebrasassociated to µ V and µ V . In § ∂ V − Id, we construct transport functions h t and f t which are triangular, in the sense that f t ( x , . . . , x d ) = ( f t, ( x ) , f t, ( x , x ) , . . . , f t,d ( x , . . . , x d )) . This produces a triangular isomorphism of C ∗ and W ∗ -algebras.It is natural to ask what the minimal assumptions are on V and V to obtain isomorphisms of theassociated C ∗ and W ∗ -algebras. First, although we assume that V ∈ tr( C ∞ tr ( R ∗ d )) throughout, the proof60ould work just as well if V is merely in tr( C ( R ∗ d )) (with of course the required bounds on the derivatives).We did not wish to get mired down with writing the precise smoothness assumptions needed for each result.In any case, the smoothness assumptions needed in this proof may not be optimal. For instance, vonNeumann algebraic triangular transport was constructed in [41, 42] using only assumptions on the ﬁrst twoderivatives of V . We do not know whether this is suﬃcient for C ∗ -algebraic triangular transport.More generally, do we expect such results to hold for functions V which are not perturbations of aquadratic, and especially those which are not even convex? 
Unfortunately, the C ∗ -isomorphism can fail evenfor d = 1 with V ∈ tr( C ∞ tr ( R ∗ d )).Random matrix theorists have carried out a detailed analysis of the case (among others) where d = 1and V ( X ) = tr( f ( X )) for some smooth f : R → R ; see [17, 10, 15, 14, 16]. Of course, by § V willbe in tr( C ∞ tr ( R ∗ d )). As in [10, § f ( t ) = t / − ct , or V ( x ) = tr( x ) / − c tr( x ). Let µ ( N ) bethe associated measure on M N ( C ) sa , and let X ( N ) be a random matrix chosen according to this measure. Itwas shown that for large enough c , the empirical spectral distribution of X ( N ) converges in probability to ameasure ρ on R whose support is the disjoint union of two closed intervals. If X is a self-adjoint operator in( A , τ ) with spectral distribution ρ , then C ∗ ( X ) ∼ = C [0 , ⊕ C [0 , ∗ -algebra generated by a self-adjoint operator S with the semicircular distribution.As a side note, the function tr( x ) / − c tr( x ) is not a bounded perturbation of (1 /

2) tr( x ), hence notamong the class of functions studied in this paper. However, one can easily modify the function t / − ct near ∞ so that it is a bounded perturbation of some constant times t . If this modiﬁcation is close enoughto ∞ , and the values of the modiﬁed function remain suﬃciently large in that region, then the support of thelimiting distribution can be forced to stay inside a bounded set where the function was not changed (usingsimilar techniques as [10, § § ρ because of [17,Theorem 1]. Similarly, one could consider a function such as f ( t ) = t / ae − bt for large constants a and b . By choosing the coeﬃcients correctly, one could presumably produce similar behavior to t / − ct in thatthe limiting empirical spectral distribution would have a support with two components.Such examples are an obstruction to C ∗ transport results for free Gibbs laws for general V . These exam-ples will in fact fail Assumptions 5.14 and 5.16. Indeed, by reweighting the pieces of µ V on each component ofthe support, one can obtain a continuum of measures that satisfy the Dyson-Schwinger equation, although itturns out that often there is still a unique maximizer of entropy. Moreover, if we consider a smooth function f on R that is constant on each component of the support, then ∇ ( f ( x )) = f ′ ( x ) will evaluate to zero in L of the free Gibbs law for V . Although this is not technically the same as ∇ ( f ( x )) being zero in C tr ( R ∗ d ) d ,this behavior still suggests an obstacle to inverting L V modulo constant functions. 
On the other hand, [14] and [16] were able to invert the Laplacian on $L^2$ modulo a finite-dimensional kernel (still for a single matrix). It is an intriguing possibility that something like this could work for the multi-matrix setting and lead to a transport result that applies as long as $\mathbf{h}_t$ is in a certain subspace of $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*d})_{sa}^d$ complementary to the kernel of $L_{V_t}$.

We also remark that since $W^*$-isomorphism is weaker than $C^*$-isomorphism, there could be situations in which the former is possible even when the latter is not. In the case of a single self-adjoint operator, topological obstructions, such as disconnected support, disappear when we pass from the algebra of continuous functions to the $L^\infty$ space. On the other hand, it is known that finite free entropy for a non-commutative law is not sufficient to guarantee $W^*$-isomorphism with the law of a semicircular family. However, we do not know of any counterexamples to having a $W^*$-isomorphism between $\mu_V$ and the law of a free semicircular family for any smooth $V$ with quadratic growth at $\infty$. Voiculescu conjectured such a $W^*$-isomorphism for a certain class of potentials in [87].

L V As we saw in § §

5, the Laplacian associated to V plays an important role in converting betweenperturbations of V and inﬁnitesimal transport maps, both in the classical case and in the non-commutativecase. Recall that for V ∈ tr( C ∞ tr ( R ∗ d )), the associated Laplacian is deﬁned by L V f = Lf − d X j =1 ∂ x j f ∇ x j V. k ∈ N ∪ {∞} , this operator is a continuous linear transformation C k +2tr ( R ∗ d ) → C k tr ( R ∗ d ).We seek suﬃcient conditions for L V to have a one-dimensional kernel and a well-behaved pseudo-inverseΨ V . We will use this in § V satisﬁes Assumption 5.16. As discussed in § V is close in a certain sense to the quadratic (1 / h x , x i tr .Following similar ideas to [10, 7, 25, 34, 35, 26] and especially [26], since we cannot work directly with thedensity in the free setting, we will instead recover E V and Ψ V from the heat semigroup ( e tL V ) t ∈ [0 , ∞ ) , whichin turn will be constructed from a free stochastic process X ( X , t ) solving the equation d X ( X , t ) = d S ( t ) − ∇ x V ( X ( X , t )) dt, X ( X ,

0) = X , where $(\mathbf{S}(t))_{t \in [0,\infty)}$ is a free Brownian motion in $d$ variables, freely independent of $\mathbf{X}$. We remark that the technical development of free SDE theory owes a great deal to the work of Biane [8], Biane and Speicher [9, 10], and Dabrowski [25, 24], although due to the simple nature of the SDE considered here, we opt for a self-contained treatment which does not require any background in free stochastic analysis.

In fact, the SDE construction only depends on $V$ through its gradient $\nabla V$, and nothing about the construction of the SDE and heat semigroup requires us to use a gradient. Hence, we will prove the results with $\nabla V$ replaced by a function $\mathbf{J} \in C_{\operatorname{tr}}^\infty(\mathbb{R}^{*d})_{sa}^d$ which is sufficiently close to the identity function. As motivation, note that in the case where $\mathbf{J} = \nabla V$, the condition $\|\partial \mathbf{J} - \mathrm{Id}\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))} < 1$ means that the Hessian of $V$ is within 1 of $\mathrm{Id}$. In the classical world, this implies that $V$ is uniformly convex. Definition 6.1.

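For constants $c \in (0,1)$ and $a \in \mathbb{R}$, we define
$$\mathcal{J}_{a,c}^d = \left\{ \mathbf{J} \in C_{\operatorname{tr}}^\infty(\mathbb{R}^{*d})_{sa}^d : \|\mathbf{J} - \mathrm{id}\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d})^d} \le a, \ \|\partial \mathbf{J} - \mathrm{Id}\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d}))} \le 1 - c \right\}.$$
We also define
$$L_{\mathbf{J}} f = Lf - \partial f \# \mathbf{J}.$$
Thus, in particular, the earlier operator $L_V$ would equal $L_{\nabla V}$ in this notation. This will not cause any confusion because $V$ and $\nabla V$ are different types of objects: $V$ is a scalar-valued function while $\nabla V$ is a $d$-tuple of operator-valued functions (which cannot be purely scalar-valued since $\nabla V - \mathrm{id}$ is bounded). A precise statement of our results is as follows. Definition 6.2.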

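Let $\mathbf{J} \in \mathcal{J}_{a,c}^d$. Let $(\mathcal{A},\tau)$ be a tracial $W^*$-algebra, let $(\mathcal{B},\sigma)$ be the tracial $W^*$-algebra generated by a $d$-tuple of self-adjoint free Brownian motions $(S_1(t),\dots,S_d(t))$ for $t \in [0,\infty)$, and let $(\mathcal{A}*\mathcal{B},\tau*\sigma)$ be the tracial free product of $(\mathcal{A},\tau)$ and $(\mathcal{B},\sigma)$. For $\mathbf{X} = (X_1,\dots,X_d) \in \mathcal{A}_{sa}^d$, let $\mathcal{X}(\mathbf{X},t) = \mathcal{X}^{\mathcal{A},\tau}(\mathbf{X},t)$ be the solution to the integral equation
$$\mathcal{X}(\mathbf{X},t) = \mathbf{X} + \mathbf{S}(t) - \frac{1}{2}\int_0^t \mathbf{J}(\mathcal{X}(\mathbf{X},u))\,du$$
(which we will show is well-defined in Lemma 6.10). Note that $\mathcal{X}$ is a function $\mathcal{A}_{sa}^d \times [0,\infty) \to (\mathcal{A}*\mathcal{B})_{sa}^d$. For $f \in C_{\operatorname{tr}}(\mathbb{R}^{*d})$, we define
$$(e^{tL_{\mathbf{J}}} f)^{\mathcal{A},\tau}(\mathbf{X}) = E_{\mathcal{A}}\left[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}(\mathbf{X},t))\right],$$
where $E_{\mathcal{A}}: \mathcal{A}*\mathcal{B} \to \mathcal{A}$ is the unique trace-preserving conditional expectation.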
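To get a feel for the integral equation in this definition, here is a minimal numerical sketch in a scalar, commutative toy setting; it is not the free-probabilistic construction itself. It assumes the drift enters as $-\tfrac{1}{2}\mathbf{J}$, in line with the $e^{-t/2}$ decay in estimate (6.2), replaces the free Brownian motion by a fixed smooth driving path `s`, and uses a hypothetical drift `J(x) = x + 0.5 * tanh(x)`, so that $|J - \mathrm{id}| \le 0.5$ and $|J' - 1| \le 0.5$. The names `picard_iterates`, `J`, and `s` are illustrative assumptions. The code runs a Picard iteration of the kind used in the proof of Lemma 6.10 and records the sup-norm gap between successive iterates.

```python
import math

def picard_iterates(x0, s, J, dt, n_iter):
    """Picard iterates for x(t) = x0 + s(t) - 0.5 * int_0^t J(x(u)) du on a uniform grid."""
    x = [x0] * len(s)              # zeroth iterate: the constant path x0
    iterates = [x]
    for _ in range(n_iter):
        drift = [J(v) for v in x]
        integral = [0.0]           # trapezoid rule for the running integral of the drift
        for a, b in zip(drift, drift[1:]):
            integral.append(integral[-1] + 0.5 * (a + b) * dt)
        x = [x0 + sk - 0.5 * ik for sk, ik in zip(s, integral)]
        iterates.append(x)
    return iterates

def J(x):
    # Hypothetical drift close to the identity (an assumption for illustration).
    return x + 0.5 * math.tanh(x)

T, n = 2.0, 2001
dt = T / (n - 1)
grid = [k * dt for k in range(n)]
s = [math.sin(3.0 * t) for t in grid]   # stand-in for a driving path with s(0) = 0

iterates = picard_iterates(1.0, s, J, dt, 25)
# Sup-norm distance between consecutive iterates; it should decay factorially fast.
gaps = [max(abs(b - a) for a, b in zip(u, v)) for u, v in zip(iterates, iterates[1:])]
print(gaps[0], gaps[5], gaps[-1])
```

In this toy model the gaps shrink roughly like $(\text{const}\cdot t)^n / n!$, which is the same mechanism that drives the convergence argument for the Picard iterates in the operator-valued setting.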

Theorem 6.3.

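Let $\mathbf{J} \in \mathcal{J}_{a,c}^d$ for some $a \in \mathbb{R}$ and $c \in (0,1)$. Let $f \in C_{\operatorname{tr}}^k(\mathbb{R}^{*d})$.
(1) We have $e^{tL_{\mathbf{J}}} f \in C_{\operatorname{tr}}^k(\mathbb{R}^{*d})$.
(2) As $t \to \infty$, the function $e^{tL_{\mathbf{J}}} f$ converges in $C_{\operatorname{tr}}^k(\mathbb{R}^{*d})$ to a constant $E_{\mathbf{J}} f$.
(3) The integral $\Psi_{\mathbf{J}} f = \int_0^\infty [e^{tL_{\mathbf{J}}} - E_{\mathbf{J}}] f \, dt$ makes sense as an improper Riemann integral in $C_{\operatorname{tr}}^k(\mathbb{R}^{*d})$.
(4) We have $-L_{\mathbf{J}} \Psi_{\mathbf{J}} + E_{\mathbf{J}} = -\Psi_{\mathbf{J}} L_{\mathbf{J}} + E_{\mathbf{J}} = \mathrm{id}$ as operators $C_{\operatorname{tr}}^k(\mathbb{R}^{*d}) \to C_{\operatorname{tr}}^k(\mathbb{R}^{*d})$.

More generally, we allow $\mathbf{J}$ and $f$ to depend on an auxiliary variable $\mathbf{x}'$. We will furthermore allow the function $f$ to be in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ for some $\ell \in \mathbb{N}$ and $d_1, \dots, d_\ell$, and $d'' \in \mathbb{N}$. The more general definition of the heat semigroup is as follows. Definition 6.4.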

Consider formal variables x = ( x , . . . , x d ) and x ′ = ( x ′ , . . . , x ′ d ′ ). Let π ( x , x ′ ) = x and π ′ ( x , x ′ ) = x ′ . Moreover, let Π( x , x ′ )[ y , y ′ ] = y and Π ′ ( x , x ′ )[ y , y ′ ] = y ′ , where y is a d -tuple and y ′ is a d ′ -tuple. Then deﬁne J d,d ′ a,b = { J ∈ C ∞ tr ( R ∗ ( d + d ′ ) ) d sa } Deﬁnition 6.5.

Let J ∈ J d,d ′ a,b . Let ( A , τ ) be a tracial W ∗ -algebra, let ( B , σ ) be the tracial W ∗ -algebragenerated by a d -tuple of self-adjoint free Brownian motions ( S ( t ) , . . . , S d ( t )) for t ∈ [0 , ∞ ), and let ( A∗B , τ ∗ σ ) be the tracial free product of ( A , τ ) and ( B , σ ). For X = ( X , . . . , X d ) ∈ A d sa and X ′ = ( X ′ , . . . , X ′ d ′ ) ∈A d ′ sa , let X A ,τ ( X, X ′ , t ) be the solution to the integral equation X ( X , X ′ , t ) = X + S ( t ) + Z t J ( X ( X , X ′ , u ) , X ′ ) du (which we will show is well-deﬁned in Lemma 6.10). Note that X A ,τ is a function A d + d ′ sa × [0 , ∞ ) → ( A ∗ B ) d sa .For f ∈ C k tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ , we deﬁne( e tL x , J f ) A ,τ ( X , X ′ )[ Y , . . . , Y ℓ ] = E A [ f A∗B ,τ ∗ σ ( X ( X, X ′ , t ) , X ′ )[ Y , . . . , Y ℓ ]] , where E A : A ∗ B → A is the unique trace-preserving conditional expectation.We refer to Propositions 6.22 and 6.26 for the precise generalizations of Theorem 6.3 to the conditionalsetting. X ( X , X ′ , t ) The bulk of the technical work to prove Theorem 6.3 lies in showing that X is a “ C ∞ tr function of ( X , X ′ )and S ” in a certain sense. Once we prove that, it is relatively easy to deduce that if f is a C k tr function of( X , X ′ ), then so is e tL x , J , as we will do in § § X A ,τ ( X , X ′ , t ) depends on X and X ′ as well as the free Brownian motion S ( t ), and thus wewant to deﬁne a similar space to C k tr ( R ∗ ( d + d ′ ) ) which also allows dependence on a freely independent freeBrownian motion. Since of course we will need to study the space-derivatives of X A ,τ ( X , X ′ , t ) of arbitraryorders, this involves deﬁning analogs of C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ that also allow dependence on S ( t ). For simplicity, we call the tuple of formal variables x rather than ( x , x ′ ) in the deﬁnition. Deﬁnition 6.6.

Let $\mathbf{s}$ denote a collection of formal self-adjoint variables $(s_j(t))_{t \in [0,\infty),\, j \in [d]}$ and let $\mathbf{x}$ denote a collection of formal self-adjoint variables $x_1, \dots, x_{d'}$. We denote by $\operatorname{TrP}_{\mathbf{s}}(\mathbb{R}^{*d'}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$ the space of trace polynomials in the formal variables $x_1, \dots, x_{d'}$, $\{\mathbf{s}(t)\}_{t \in [0,\infty)}$, and $\mathbf{y}_1, \dots, \mathbf{y}_\ell$ (where $\mathbf{y}_j$ is a $d_j$-tuple) that are real-multilinear in $\mathbf{y}_1, \dots, \mathbf{y}_\ell$. Definition 6.7.

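With $\mathbf{x}$ and $\mathbf{s}$ as above, suppose that $f = (f^{\mathcal{A},\tau})_{(\mathcal{A},\tau) \in \mathbb{W}}$ is a tuple of functions where
$$f^{\mathcal{A},\tau}: (\mathcal{A}*\mathcal{B})_{sa}^{d'} \times (\mathcal{A}*\mathcal{B})_{sa}^{d_1} \times \cdots \times (\mathcal{A}*\mathcal{B})_{sa}^{d_\ell} \to (\mathcal{A}*\mathcal{B})^{d''}$$
is a function which is real-multilinear in the last $\ell$ variables. We say that $f \in C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*d'}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ if for every $R > 0$ and $\epsilon > 0$, there exists a $g \in \operatorname{TrP}_{\mathbf{s}}(\mathbb{R}^{*d'}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ such that for every $(\mathcal{A},\tau)$ we have
$$\sup\left\{ \|f^{\mathcal{A},\tau}(\mathbf{X}) - g^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathbf{S},\mathbf{X})\|_{\mathscr{M}^\ell,\operatorname{tr}} : \mathbf{X} \in (\mathcal{A}*\mathcal{B})_{sa}^{d'} \text{ with } \|\mathbf{X}\|_\infty \le R \right\} < \epsilon.$$
We equip $C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*d'}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ with the Fréchet topology given by the seminorms
$$\|f\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*d'},\mathscr{M}^\ell),R} := \sup_{(\mathcal{A},\tau) \in \mathbb{W}} \sup\left\{ \|f^{\mathcal{A},\tau}(\mathbf{X})\|_{\mathscr{M}^\ell,\operatorname{tr}} : \mathbf{X} \in (\mathcal{A}*\mathcal{B})_{sa}^{d'} \text{ with } \|\mathbf{X}\|_\infty \le R \right\}$$
for $R > 0$. Definition 6.8.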

Let k ∈ N ∪ {∞} . Suppose that f = ( f A ,τ ) ( A ,τ ) ∈ W is a tuple of functions where f A ,τ : ( A ∗ B ) d ′ sa × ( A ∗ B ) d × · · · × ( A ∗ B ) d ℓ → ( A ∗ B ) d ′′ is a function which is real-multilinear in the last ℓ variables. We say that f ∈ C k tr , S ( R ∗ d ′ , M ℓ ) d ′ if for every k ′ ∈ N with k ′ ≤ k , there exists g k ′ ∈ C tr , S ( R ∗ d ′ , M ℓ + k ′ ) d ′ such that for every ( A , τ ) ∈ W , ∂ k ′ f A ,τ = g A ,τk ′ as functions ( A∗B ) d + ℓ + k ′ sa → A∗B . We equip C k tr , S ( R ∗ d ′ , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ with the family of seminorms k ∂ k ′ f k C tr , S ( R ∗ d ′ , M ℓ + k ′ ) d ′′ ,R for k ′ ≤ k and j , . . . , j k ′ ∈ [ d ′ ] and R > Proposition 6.9.

Lemma 3.18 and Theorem 3.19 hold with each space $C_{\operatorname{tr}}^k(\mathbb{R}^{*d'}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ replaced by $C_{\operatorname{tr},\mathcal{S}}^k(\mathbb{R}^{*d'}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$.

The proof of this proposition is exactly the same as for the original statements, and so we leave the details to the reader. Now we are ready to define the solution to the integral equation. We continue to use $\mathbf{S}$ to denote a $d$-tuple of free Brownian motions. Lemma 6.10.

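For each $(\mathcal{A},\tau)$, there exists a unique function $\mathcal{X}^{\mathcal{A},\tau}: (\mathcal{A}*\mathcal{B})_{sa}^{d+d'} \times [0,\infty) \to (\mathcal{A}*\mathcal{B})_{sa}^d$ that is continuous in $t$ and satisfies
$$\mathcal{X}^{\mathcal{A},\tau}(\mathbf{X},\mathbf{X}',t) = \mathbf{X} + \mathbf{S}(t) - \frac{1}{2}\int_0^t \mathbf{J}^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(\mathbf{X},\mathbf{X}',u),\mathbf{X}')\,du. \tag{6.1}$$
Moreover, $\mathcal{X}$ defines a continuous map $[0,\infty) \to C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')})_{sa}^d$ which satisfies
$$\|\mathcal{X}(\cdot,t)\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')})^d,R} \le e^{-t/2}(R+2) + (1 - e^{-t/2})\,\|\mathbf{J} - \pi\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*(d+d')})^d}. \tag{6.2}$$
Proof.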

Deﬁne Picard iterates inductively by X A ,τ ( X , X ′ , t ) = X X A ,τn +1 ( X , X ′ , t ) = S ( t ) − Z t J A∗B ,τ ∗ σ ( X A ,τn ( X , X ′ , u ) , X ′ ) du. We will show by induction X A ,τn is well-deﬁned and that t

7→ X n ( · , t ) is a continuous map [0 , ∞ ) → C tr ( R ∗ ( d + d ′ ) ) d sa . The base case is immediate. For the induction step, recall that composition is a con-tinuous operation by Lemma 3.18 / Proposition 6.9, and hence J ( X n ( x , x ′ , t ) , x ′ ) deﬁnes a continuous map[0 , ∞ ) → C tr ( R ∗ ( d + d ′ ) ) d sa . Thus, it makes sense to integrate from 0 to t using Riemann integration forfunctions taking values in a Fr´echet space, and of course the output will again be a continuous function[0 , ∞ ) → C tr ( R ∗ ( d + d ′ ) ) d sa (the argument is the same as in [42, § X n +1 deﬁnes such a continuousfunction as desired. 64ext, we prove convergence of the Picard iterates as k → ∞ . Because ∂ x J − Π is globally bounded by c ,it follows that J A∗B ,τ ∗ σ is (1 + c )-Lipschitz in X (with respect to k·k ∞ ). This implies that for k ≥ kX A ,τn +1 ( X , X ′ , t ) − X A ,τn ( X , X ′ , t ) k ∞ ≤ c Z t kX A ,τn ( X , X ′ , u ) − X A ,τn − ( X , X ′ , u ) k ∞ du, so that kX n +1 ( · , t ) − X n ( · , t ) k C tr ( R ∗ ( d + d ′ ) ) d sa ,R ≤ c Z t kX n ( · , u ) − X n − ( · , u ) k C tr ( R ∗ ( d + d ′ ) ) d sa ,R du. (6.3)Let C ( t, R ) = sup u ∈ [0 ,t ] kX ( · , u ) − X ( · , u ) k C tr ( R ∗ ( d + d ′ ) ) d sa ,R . Then a straightforward induction argument shows that kX n +1 ( · , t ) − X n ( · , t ) k C tr ( R ∗ ( d + d ′ ) ) d sa ,R ≤ C ( T, R ) (1 + c ) k t n n n ! , for t ∈ [0 , T ]. This implies the convergence of X k in C tr ( R ∗ ( d + d ′ ) ) d sa uniformly for t ∈ [0 , T ] as k → ∞ . Thus,the limit X is a solution to the integral equation satisfying the desired continuity property.Note that we have asserted the uniqueness claim in a weaker setting than that of continuous functions[0 , ∞ ) → C tr ( R ∗ ( d + d ′ ) ) d sa . Indeed, we claim that for a ﬁxed ( A , τ ) and initial condition X , the trajectorydeﬁned by the integral equation is unique. 
This follows from the Picard-Lindel¨of theory because J A∗B ,τ ∗ σ isLipschitz in X .Finally, to prove (6.2), the idea is to “diﬀerentiate” e t/ X A ,τ ( X , X ′ , t ) with respect to t . One can ﬁnda stochastic diﬀerential equation for e t/ X ( X , X ′ , t ) using free ˆIto calculus and then use standard SDEtechniques to estimate it. However, let us give this argument in an elementary language that does notrequire knowledge of free SDE.Fix t and n , and let t j = jt/n for j = 0, . . . , n . Then X A ,τ ( X , X ′ , t j ) − X A ,τ ( X , X ′ , t j − ) = S ( t j ) − S ( t j − ) − Z t j t j − J A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , u ) , X ′ ) du. Let K = J − π . By continuity of X in t , we have Z t j t j − X A ,τ ( X , X ′ , u ) du = ( t/n ) X A ,τ ( X , X ′ , t j ) + o (1 /n ) , where the error estimate holds uniformly for k ( X , X ′ ) k ≤ R and is independent of j . Thus,(1 + t/ n ) X A ,τ ( X , X ′ , t j ) − X A ,τ ( X , X ′ , t j − )= S ( t j ) − S ( t j − ) − Z t j t j − K A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , u ) , X ′ ) du + o (1 /n ) . Note that 1 + t/n = e t/ n + o (1 /n ) and hence e t/ n X A ,τ ( X , X ′ , t j ) − X A ,τ ( X , X ′ , t j − )= S ( t j ) − S ( t j − ) − Z t j t j − e ( u − t j − ) / K A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , u ) , X ′ ) du + o (1 /n ) . Now multiply by e t j − / and sum from j = 1 to n to obtain e t/ X A ,τ ( X , X ′ , t ) − X = n X j =1 e t j − / [ S ( t j ) − S ( t j − )] + Z t e u/ K A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , u ) , X ′ ) du + o (1) , (6.4)65here the error estimate o (1) holds uniformly as n → ∞ for k ( X , X ′ ) k ∞ ≤ R (and in fact independently of( A , τ )). Note that n X j =1 e t j − / [ S ( t j ) − S ( t j − )]is a standard free semicircular d -tuple where each coordinate has variance n X j =1 e t j − ( t j − t j − ) ≤ Z t e u/ du = e t − ≤ e t . Hence, (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) n X j =1 e t j − / [ S ( t j ) − S ( t j − )] (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∞ ≤ e t/ . 
We also have (cid:13)(cid:13)(cid:13)(cid:13)Z t e u/ K A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , u ) , X ′ ) du (cid:13)(cid:13)(cid:13)(cid:13) ≤ (1 − e − t/ ) k K k BC tr ( R ∗ ( d + d ′ ) ) d . Thus, upon taking n → ∞ in (6.4), we obtain the desired estimate.Since t

7→ X ( · , t ) is a continuous map [0 , ∞ ) → C tr , S ( R ∗ ( d + d ′ ) ) d sa , we can deﬁne the Riemann integral Z t J ( X ( · , u ) , π ′ ) du, where J ( X ( · , u ) , π ′ ) denotes the function in C tr , S ( R ∗ ( d + d ′ ) ) d sa given by composing X ( · , u ) and π ′ in theprescribed manner. Here we rely on the fact that the Riemann integrals are deﬁned for continuous functionsfrom [0 , t ] to a Fr´echet space; the proof of this fact is exactly the same as for functions taking values in R d or in a Banach space, except that one must prove convergence of the Riemann-sum approximations withrespect to each of a countable family of seminorms rather than only one norm. It follows that the identity X ( · , t ) = S ( t ) − Z t J ( X , π ′ ) du holds in C tr , S ( R ∗ ( d + d ′ ) ) d sa . Similarly, t

7→ X ( · , t ) − S ( t ) is a continuously diﬀerentiable function [0 , ∞ ) → C tr , S ( R ∗ ( d + d ′ ) ) d . It will be convenient in the rest of the section to view our equations as integral / diﬀerentialequations in C tr , S ( R ∗ ( d + d ′ ) ) d sa rather than equations for functions on A d + d ′ sa for every ( A , τ ) separately.The next lemma will be used to construct the process ∂ X ( · , t ). Lemma 6.11.

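Let $t \mapsto \mathcal{F}(\cdot,t)$ be a continuous function $[0,\infty) \to C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^d$, and let $\mathcal{G}_0 \in C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^d$. Then there exists a unique continuous $\mathcal{G}: [0,\infty) \to C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')}, \mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^d$ satisfying
$$\mathcal{G}(\cdot,0) = \mathcal{G}_0, \tag{6.5}$$
$$\frac{d}{dt}\mathcal{G}(\cdot,t) = -\frac{1}{2}\,\partial_{\mathbf{x}}\mathbf{J}(\mathcal{X}(\cdot,t),\pi') \# \mathcal{G}(\cdot,t) + \mathcal{F}(\cdot,t). \tag{6.6}$$
Moreover, we have
$$\|\mathcal{G}(\cdot,t)\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R} \le e^{-ct/2}\left( \|\mathcal{G}_0\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R} + \int_0^t e^{cu/2}\,\|\mathcal{F}(\cdot,u)\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R}\,du \right). \tag{6.7}$$

Proof. Recall our assumption that $\mathbf{J} = \pi + \mathbf{K}$ with
$$\|\partial_{\mathbf{x}}\mathbf{K}\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^1)^d} \le 1 - c.$$
Hence,
$$\|\partial_{\mathbf{x}}\mathbf{J}\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^1)^d} \le 2 - c.$$
It follows that for each $t$, the right-hand side of the differential equation depends in a Lipschitz manner upon $\mathcal{G}(\cdot,t)$ with respect to $\|\cdot\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R}$ for every $R > 0$, with the Lipschitz constant being $(2-c)/2$. Hence the Picard-Lindelöf method yields a unique solution. Writing $\mathbf{J} = \pi + \mathbf{K}$, we also obtain
$$\frac{d}{dt}\mathcal{G}(\cdot,t) + \frac{1}{2}\mathcal{G}(\cdot,t) = -\frac{1}{2}\,\partial_{\mathbf{x}}\mathbf{K}(\mathcal{X}(\cdot,t),\pi') \# \mathcal{G}(\cdot,t) + \mathcal{F}(\cdot,t).$$
Hence, upon multiplying by $e^{t/2}$ and using the given bound for $\partial_{\mathbf{x}}\mathbf{K}$, we obtain
$$\left\| \frac{d}{dt}\left[ e^{t/2}\mathcal{G}(\cdot,t) \right] \right\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R} \le \frac{1}{2}(1-c)\left\| e^{t/2}\mathcal{G}(\cdot,t) \right\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R} + e^{t/2}\,\|\mathcal{F}(\cdot,t)\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R}.$$
Using Grönwall's inequality,
$$\left\| e^{t/2}\mathcal{G}(\cdot,t) \right\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R} \le e^{(1-c)t/2}\left( \|\mathcal{G}_0\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R} + \int_0^t e^{-(1-c)u/2}\, e^{u/2}\,\|\mathcal{F}(\cdot,u)\|_{C_{\operatorname{tr},\mathcal{S}}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^d,R}\,du \right).$$
Since $e^{-t/2} e^{(1-c)t/2} = e^{-ct/2}$ and $e^{-(1-c)u/2} e^{u/2} = e^{cu/2}$, this simplifies to the desired estimate (6.7).

Next, we explain how to differentiate $\mathcal{G}(\cdot,t)$ with respect to $(\mathbf{x},\mathbf{x}')$ in the situation of Lemma 6.11 when $\mathcal{F}$ is a $C_{\operatorname{tr},\mathcal{S}}^1$ function. This will allow us to show that $\mathcal{X}(\cdot,t)$ is $C_{\operatorname{tr},\mathcal{S}}^\infty$ by induction. Lemma 6.12.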

Let t

7→ F ( · , t ) be a continuous function [0 , ∞ ) → C tr , S ( R ∗ d + d ′ , M ( R ∗ d , . . . , R ∗ d ℓ )) d , andlet G ∈ C S ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d . Then the solution G in Lemma 6.11 is a continuous function [0 , ∞ ) → C tr , S ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d , and we have ddt ∂ G ( · , t ) = − ∂ x J ( X ( · , t ) , π ′ ) ∂ G ( · , t ) − ∂ [ ∂ x J ( X ( · , t ) , π ′ )] G ( · , t ) , Π] + ∂ F ( · , t ) . (6.8) Proof.

We claim that for each t , the right hand side of (6.6) depends in a Lipschitz manner upon G ( · , t )in C , S ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d . More precisely, if we subtract the right hand side of (6.6) for twodiﬀerent functions G and G ′ , then k·k C tr , S ( R ∗ ( d + d ′ ) , M ℓ ) d ,R + k ∂ ·k C tr , S ( R ∗ ( d + d ′ ) , M ℓ +1 ) d ,R of the diﬀerence isbounded by a constant times kG ( · , t ) − G ′ ( · , t ) k C S ( R ∗ ( d + d ′ ) , M ℓ ) d ,R + k ∂ G ( · , t ) − ∂ G ′ ( · , t ) k C S ( R ∗ ( d + d ′ ) , M ℓ ) d ,R . We already explained in the proof of Lemma 6.11 how to estimate the diﬀerence with respect to k·k C tr , S ( R ∗ ( d + d ′ ) , M ℓ ) d ,R .To estimate the diﬀerence of the derivative, note that applying ∂ to the right-hand side of (6.6) results in theright-hand side of (6.8), which clearly depends in a Lipschitz manner upon G ( · , t ) and ∂ G ( · , t ) with respectto k·k C tr , S ( R ∗ ( d + d ′ ) , M ℓ ) d ,R , and the Lipschitz constant is independent of t . Here we note that k ∂ [ ∂ x ∇ X V ( X ( X , X ′ , t ) , X ′ )] k C tr ( R ∗ ( d + d ′ ) , M ) ,R ≤ k ∂ [ ∂ x ∇ X V ] k C tr ( R ∗ ( d + d ′ ) , M ) ,R ′ , where R ′ = max( R + 2 , k∇ X W k BC tr ( R ∗ ( d + d ′ ) ) d ) using (6.2).Because of the Lipschitz property, the Picard-Lindel¨of method shows that the equation (6.6) has a solutionin C , S ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d . This must agree with the solution in C tr , S ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d from Lemma 6.11. Then by applying ∂ to both sides, we obtain (6.8).67 emma 6.13. The function X from Lemma 6.10 is a continuous map [0 , ∞ ) → C ∞ tr , S ( R ∗ ( d + d ′ ) ) d sa . 
Moreover,there exist constants C k, J ,R such that k ∂ k X ( · , t ) k C tr , S ( R ∗ ( d + d ′ ) , M k ) d ,R ≤ C k, J ,R (6.9) for k ≥ and polynomials p k, J ,R : R → R such that p k, J ,R has degree k and k ∂ x ∂ k X ( · , t ) k C tr , S ( R ∗ ( d + d ′ ) , M k +1 ) d ,R ≤ e − ct/ p k, J ,R ( t ) (6.10) for k ≥ .Proof. Let π ( X , X ′ ) = X and π ′ ( X , X ′ ) = X ′ . We claim that for each k ≥ t ∂ k X ( · , t ) is a continuousfunction [0 , ∞ ) → BC tr , S ( R ∗ ( d + d ′ ) , M ( R ∗ ( d + d ′ ) , . . . , R ∗ ( d + d ′ ) | {z } k )) d and it satisﬁes ddt ∂ k X ( · , t ) = − k X k ′ =0 k − k ′ X j =0 (cid:18) j + k ′ j (cid:19) X ( B ,...,B j )partition of [ j ]min( B ) < ··· < min( B j ) k ! X σ ∈ Perm([ k ]) ∂ k ′ x ′ ∂ j x J ( X ( · , t ) , π ′ ) ∂ | B | X ( · , t ) , . . . , ∂ | B j | X ( · , t ) , Π ′ , . . . , Π ′ | {z } k ′ ] σ . (6.11)We will deduce this from Lemma 6.12 by induction.Let us make a few preliminary comments on the form of the above equation before we show the termsare well-deﬁned. We obtained (6.11) by formally repeatedly diﬀerentiating the equation for X ( · , t ) usingthe chain rule. More precisely, we diﬀerentiated the composition of J with ( X ( · , t ) , π ′ ), and evaluated thederivative of the inner function as ( ∂ X ( · , t ) , π ′ ), and then expressed the result in terms of these two pieces.We moved the occurrences of π ′ to the right for each term. In order not to worry about which order toplug in the tangent vectors, we symmetrized over Perm( k ), which is valid because the k th derivative is asymmetric k -linear form.On the right-hand side of (6.11), the term with k ′ = 0, j = 1, and B = [ k ] is exactly ∂ x J ( X ( · , t ) , π ′ ) ∂ k X ( · , t ) , and all the other terms only involve lower-order derivatives of X ( · , t ). We will denote the sum of all theseother terms by F ( k ) ( · , t ).Now we prove by induction on k that X ( · , t ) deﬁnes a continuous map[0 , ∞ ) → C tr , S ( R ∗ ( d + d ′ ) , M ( R ∗ ( d + d ′ ) , . . . 
, R ∗ ( d + d ′ ) | {z } k )) d (and hence F ( k ) is also well-deﬁned) and that X satisﬁes the formula (6.11) and the estimate (6.9).For the base case k = 1, let G : [0 , ∞ ) → C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d )) d be the solution to G ( · ,

0) = π, G ( · , t ) = − ∂ x J ( X ( · , t ) , π ′ ) G ( · , t )] + ∂ x ′ J ( X ( · , t ) , π ′ ) ′ . The solution exists by applying Lemma 6.11 with F given by ∂ x ′ J ( X ( · , t ) , π ′ ) ′ = ∂ x ′ K ( X ( · , t ) , π ′ ) ′ , which is bounded by a constant C ′ , J by assumption. Thus, by (6.7), we have kG ( · , t ) k BC tr , S ( R ∗ ( d + d ′ ) ) d ≤ e − ct/ (cid:18) c C ′ , J ( e ct/ − (cid:19) , C , J .To complete the base case, we need to show G = ∂ X . Let X n be the Picard iterate as in the proofof Lemma 6.10. Using continuity of the composition operation on C functions, we see that X n is in C , S ( R ∗ ( d + d ′ ) ) d sa , and we have ∂ X n +1 ( · , t ) = π − Z t ∂ x J ( X n ( · , u ) , π ′ ) ∂ X n ( · , u ) + ∂ x ′ J ( X n ( · , u ) , π ′ ) ′ du. By the same token as (6.3), we have kX n +1 ( · , t ) − X ( · , t ) k C tr ( R ∗ ( d + d ′ ) ) d sa ,R ≤ c Z t kX n ( · , u ) − X ( · , u ) k C tr ( R ∗ ( d + d ′ ) ) d sa ,R du. In a similar way, we have k ∂ X n +1 ( · , t ) − F ( · , t ) k C tr ( R ∗ ( d + d ′ ) ) d ,R ≤ Z t (cid:18) − c k ∂ X n ( · , u ) − F ( · , u ) k C tr , S ( R ∗ ( d + d ′ ) , M ) d ,R + k ∂ x ∂ J k C tr ( R ∗ ( d + d ′ ) , M ) d ,R kX n ( · , u ) − X ( · , u ) k C tr , S ( R ∗ ( d + d ′ ) ) d ,R ( C , J + 1) (cid:19) du where the ﬁrst error term comes from swapping out ∂ X n for F and the second comes from swapping out X for X n inside ∂ x ∇ X . Altogether the function φ n,R ( t ) := kX n +1 ( · , t ) − X ( · , t ) k C tr ( R ∗ ( d + d ′ ) ) d sa ,R + k ∂ X n +1 ( · , t ) − F ( · , t ) k C tr ( R ∗ ( d + d ′ ) ) d ,R satisﬁes φ n +1 ,R ( t ) ≤ K Z t φ n,R ( u ) du for some constant K that depends only on W , and this implies that φ n,R → n → ∞ . Thus, ∂ X n converges to F in C tr ( R ∗ ( d + d ′ ) , M ) d as n → ∞ . It follows that X is in C ( R ∗ ( d + d ′ ) ) d sa and ∂ X = F .For the induction step, suppose the claim holds for k −

1, so that ddt ∂ k − X ( · , t ) = − ∂ x J ( X ( · , t ) , π ′ ) ∂ k − X ( · , t ) + F ( k ) ( X , X ′ , t ) . Then by Lemma 6.12, we deduce that ∂ k − X is in BC , S ( R ∗ ( d + d ′ ) , M k − ) d (and depends continuously on t ) and that ∂ k X satisﬁes the diﬀerential equation computed by applying ∂ termwise to both sides. Thiscomputation of derivatives results in (6.11). Next, by our induction hypothesis the spatial derivatives of X ( · , t ) of order < k satisfy (6.9). This implies that F ( k ) is bounded in BC tr , S ( R ∗ ( d + d ′ ) , M k ) d by someconstant C ′ k, J ,R independent of t , because the derivatives of X of order < k are bounded on each ball ofradius R , and so are the derivatives of J ( X , π ′ ). Now we apply (6.7) with G = ∂ k X , noting that G = 0 for k ≥

2, and thus conclude that k ∂ k X ( · , t ) k C tr , S ( R ∗ ( d + d ′ ) , M k ) d ,R ≤ e − ct/ Z t e cu/ C ′ k, J ,,R du ≤ c C ′ k, J ,R =: C k, J ,R . To (6.10), we again proceed by induction on k . We can deduce an equation for ∂ x ∂ k X ( · , t ) from (6.11).It has the same type of terms as (6.11) except that each term has one multilinear argument of the form ∂ j X replaced by ∂ x ∂ j X . As before, one of the terms is − ∂ x J ( X ( · , t ) , π ′ ) ∂ x ∂ k X ( · , t ) , while all the other terms involve lower-order derivatives of X . We separate this term out, and denote thesum of the remaining terms by H ( k ) ( · , t ). 69or the base case k = 0, we have H (0) = 0 and ∂ x X ( · ,

0) = id d . Thus, using (6.7) with F = H (0) , we get k ∂ x X ( · , t ) k BC tr , S ( R ∗ ( d + d ′ ) , M ) d ≤ e − ct/ . Thus, the claim holds with p ,W,R ( t ) = 1.For the induction step, let k ≥

2, and suppose the claim holds for k −

1. Observe that H ( k ) is boundedin orm · C tr , S ( R ∗ ( d + d ′ ) , M k ) d ,R by e − ct/ p ′ k, J ,R ( t ) for some polynomial p ′ k, J ,R of degree k −

1. This is veriﬁed byusing the induction hypothesis for (6.10) on each occurrence of ∂ x ∂ j X in H ( k ) (there being one occurrenceper summand) and applying (6.9) to all the other terms. Then we apply (6.7) to ∂ x ∂ k X , noting that itvanishes when t = 0, and thus obtain k ∂ x ∂ k X ( · , t ) k C tr , S ( R ∗ ( d + d ′ ) , M k +1 ) d ,R ≤ e − ct/ Z t e cu/ e − cu/ p ′ k, J ,R ( u ) du =: e − ct/ p k, J ,R ( t ) . This completes the inductive step and hence veriﬁes (6.10).
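The decay estimates in the proof above all rest on the Grönwall-type bound (6.7). As a sanity check, the following sketch tests that bound in a scalar, commutative toy model, which is an assumption made purely for illustration and not the operator-valued setting of the paper. Here the hypothetical function `rate` plays the role of $\partial_{\mathbf{x}}\mathbf{J}$ along the trajectory and takes values in $[c, 2-c]$ with $c = 0.5$, `forcing` plays the role of $\mathcal{F}$, and the helper names `solve_linear_ode` and `gronwall_bound` are made up for this example.

```python
import math

def solve_linear_ode(g0, rate, forcing, T, n):
    """Classical RK4 for g'(t) = -0.5 * rate(t) * g(t) + forcing(t) on [0, T]."""
    dt = T / n
    def rhs(t, g):
        return -0.5 * rate(t) * g + forcing(t)
    g, out = g0, [g0]
    for k in range(n):
        t = k * dt
        k1 = rhs(t, g)
        k2 = rhs(t + 0.5 * dt, g + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, g + 0.5 * dt * k2)
        k4 = rhs(t + dt, g + dt * k3)
        g += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        out.append(g)
    return out

def gronwall_bound(g0, c, forcing, T, n):
    """Scalar analog of (6.7): e^{-ct/2} (|g0| + int_0^t e^{cu/2} |forcing(u)| du)."""
    dt = T / n
    w = [math.exp(0.5 * c * k * dt) * abs(forcing(k * dt)) for k in range(n + 1)]
    acc, out = 0.0, [abs(g0)]
    for k in range(1, n + 1):
        acc += 0.5 * (w[k - 1] + w[k]) * dt   # trapezoid rule for the weighted integral
        out.append(math.exp(-0.5 * c * k * dt) * (abs(g0) + acc))
    return out

c = 0.5
rate = lambda t: 1.0 + (1.0 - c) * math.sin(5.0 * t)   # stays within [c, 2 - c]
forcing = lambda t: math.cos(t)
T, n = 10.0, 10000

g = solve_linear_ode(1.0, rate, forcing, T, n)
bound = gronwall_bound(1.0, c, forcing, T, n)
# Largest amount by which |g| exceeds the bound; it should not be positive beyond rounding.
worst = max(abs(a) - b for a, b in zip(g, bound))
print(worst)
```

With `forcing` set to zero and initial value 1, the same comparison reduces to $|g(t)| \le e^{-ct/2}$, the scalar counterpart of the base case of (6.10).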

Remark . From the proof, it is apparent that C , J ,R is independent of R . Moreover, for k >

1, theconstant C k, J ,R only depends on k ∂ k ′ ( J − π ) k C tr ( R ∗ d , M k ′ ) d ,R ′ for k ′ ≤ k , where R ′ = max( R + 2 , k J − π k BC tr ( R ∗ ( d + d ′ ) , M ) ). In particular, if J − π ∈ BC k tr ( R ∗ ( d + d ′ ) , M ), then ∂ X ∈ BC tr , S ( R ∗ ( d + d ′ ) , M ). e tL x , J Next, we explain results about the heat semigroup parallel to [26, § X , we use the following result about conditionalexpectations. Lemma 6.15.

Let k ∈ N ∪{∞} . Let d , d ′ , d ′′ ∈ N and ℓ ∈ N and d , . . . , d ℓ ∈ N . Let S be a d -variable freeBrownian motion, and let ( B , σ ) be the associated W ∗ -algebra. Let F ∈ C k tr , S ( R ∗ d ′ , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ .Recall that F A ,τ : ( A ∗ B ) d ′ sa × ( A ∗ B ) d sa × · · · × ( A ∗ B ) d ℓ → ( A ∗ B ) d ′′ , and let F A ,τ ( X )[ Y , . . . , Y ℓ ] = E A (cid:2) F A ,τ ( X )[ Y , . . . , Y ℓ ] (cid:3) for all X , Y , . . . , Y ℓ ∈ A d sa . Then F = ( F A ,τ ) ( A ,τ ) ∈ W is in C k tr ( R ∗ d , M ℓ ) d ′′ and for each k ′ ≤ k and R > , k ∂ k ′ F k C tr ( R ∗ d , M ℓ + k ′ ) d ′′ ,R ≤ k ∂ k ′ Fk C tr , S ( R ∗ d , M ℓ + k ′ ) d ′′ ,R . Proof.

Fix ( A , τ ). Recall that E A : A ∗ B → A is a linear map which is bounded map with respect to k·k ∞ ,the chain rule for Fr´echet diﬀerentiation implies that F A ,τ is Fr´echet- C k and that for k ′ ≤ k , ∂ k ′ F A ,τ ( X )[ Y , . . . , Y ℓ + k ′ ] = E A [ ∂ k ′ F A ,τ ( X )[ Y , . . . , Y ℓ + k ′ ]] . Since E A is a contraction with respect to the non-commutative L α norm for every α ∈ [1 , ∞ ], we have k ∂ k ′ F k M ℓ + k ′ , tr ,R ≤ k ∂ k ′ Fk M ℓ + k ′ , tr ,R for every R >

0. Note that this estimate is independent of ( A , τ ).For each k ′ , R >

0, and ǫ >

0, there exists some trace polynomial g ∈ TrP ∫ ( R ∗ d ′ , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ such that k ∂ k ′ F A ,τ − g ( A ,τ ) k M ℓ , tr ,R ≤ ǫ for all ( A , τ ) ∈ W . Now g is really a trace polynomial in the variables x , y , . . . , y ℓ + k ′ and S ( t ), . . . , S ( t m ) for some ﬁnitelymany times 0 < t < · · · < t m . We can rewrite this trace polynomial in terms X , the Y j ’s and the freely70ndependent increments S ( t j ) − S ( t j − ) for j = 1, . . . , m , where t := 0, that is, there is a trace polynomial b g ∈ TrP( R ∗ ( d ′ + md ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′′ ) such that g A ,τ ( X )[ Y , . . . , Y ℓ + k ′ ] = b g A∗B ,τ ∗ σ ( X , ( t − t ) − / ( S ( t ) −S ( t )) , . . . , ( t m − t m − ) − / ( S ( t m ) −S ( t m − ))[ Y , . . . , Y ℓ + k ′ ] . Now ( t − t ) − / ( S ( t ) − S ( t )) , . . . , ( t m − t m − ) − / ( S ( t m ) − S ( t m − ) is a standard free semicircular dm -tuple. Lemma 2.23 implies that there is a trace polynomial h ∈ TrP( R ∗ d ′ , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ such that E A [ g A ,τ ( X )[ Y , . . . , Y ℓ + k ′ ] = h A ,τ ( X )[ Y , . . . , Y ℓ + k ′ ]for every ( A , τ ) ∈ W and for every X , Y , . . . , Y ℓ + k ′ ∈ A d sa . Then k ∂ k ′ F A ,τ − h A ,τ k M ℓ + k ′ , tr ,R ≤ ǫ for all ( A , τ ) ∈ W , and hence ∂ k ′ F ∈ C tr ( R ∗ d , M ℓ + k ′ ) d ′′ . This holds for k ′ ≤ k and therefore F ∈ C k tr ( R ∗ d , M ℓ ) d ′′ . Remark . In fact, in the above argument, one can compute h explicitly from g by studying the actionon trace polynomials of the heat semigroup associated to the ﬂat free Laplacian L as in [20, § § § dm inputs of the function g where the free semicircularfamily is located. Lemma 6.17.

Let $k \in \mathbb{N} \cup \{\infty\}$. Then for $f \in C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$, we have $e^{tL_{x,J}} f \in C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Moreover, fix $R > 0$, and let $R' = \max(R + 2, \|\nabla_X W\|)$. Then for $k' \leq k$,
$$\bigl\|\partial^{k'}[e^{tL_{x,J}} f]\bigr\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+k'})^{d''},R} \leq C_{k',J,R} \sum_{j=1}^{k'} \|\partial^j f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')})^{d''},R'}, \tag{6.12}$$
where $C_{k',J,R}$ is a constant depending only on $k'$ and $W$ and $R$. Also, if $\partial_x f \in C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$, then for $k' \leq k$,
$$\bigl\|\partial_x \partial^{k'}[e^{tL_{x,J}} f]\bigr\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+k'+1})^{d''},R} \leq e^{-ct} p_{k',J,R}(t) \sum_{j=1}^{k'} \|\partial_x \partial^j f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')})^{d''},R'}, \tag{6.13}$$
where $p_{k',J,R}$ is a polynomial of degree $k'$ depending only on $k'$ and $W$ and $R$.

Remark 6.18. These are not the same constants and polynomials from Lemma 6.13, but they are derived from them.

Proof.

Since C k tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ ⊆ C k tr , S ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′′ , we may view f as anelement of the latter space. By Lemma 6.13, X ∈ C ∞ tr , S ( R ∗ ( d + d ′ ) ) d and hence by Proposition 6.9, f ( X ( · , t ) , π ′ )is a function in C k tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ . So by Lemma 6.15, we e tL x , J f ∈ C k tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ .To prove (6.12), observe that by similar reasoning as in (6.11), ∂ k ′ [ f ( X ( · , t ) , π ′ )] = k ′ X k ∗ =0 k ′ − k ∗ X j =0 (cid:18) j + k ∗ j (cid:19) X ( B ,...,B j )partition of [ k ′ − k ∗ ]min( B ) < ··· < min( B j ) k ′ ! X σ ∈ Perm([ k ′ ]) ∂ k ∗ x ′ ∂ j x f ( X ( · , t ) , π ′ ) , . . . , Id | {z } ℓ , ∂ | B | X ( X , X ′ , t ) , . . . , ∂ | B j | X ( X , X ′ , t ) , Π ′ , . . . , Π ′ | {z } k ∗ ] σ . (6.14)71t follows from (6.2) that kX ( · , t ) k C tr , S ( R ∗ ( d + d ′ ) ) d ,R ≤ R ′ , and the same estimate holds for ( X ( · , t ) , π ′ ) since R ′ > R . Thus, using (6.9), we can bound ∂ k ′ [ f ( X ( · , t ) , π ′ )]by the right-hand side of (6.9), and then apply Lemma 6.15 to ﬁnish the proof of (6.12). The proof of (6.13)is similar using (6.10) instead of (6.9). Lemma 6.19.

For $s, t \geq 0$ and $f \in C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$, we have $e^{sL_{x,J}}[e^{tL_{x,J}} f] = e^{(s+t)L_{x,J}} f$.

Proof.

Fix ( A , τ ), let ( B , σ ) be a freely independent algebra generated by the Brownian motion S , andlet ( B , σ ) be another freely independent copy of ( B , σ ) generated by another free Brownian motion S . Foreach algebra ( A , τ ), and j = 1 ,

2, let X j be the solution to (6.1) with S j instead of S . Then[ e sL x , J [ e tL x , J f ]] A ,τ ( X , X ′ )= E A [[ e tL x , J f ] A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , s ) , X ′ )]= E A ◦ E A∗B [ f A∗B ∗B ,τ ∗ σ ∗ σ ( X A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , s ) , X ′ , t ) , X ′ )] . Let S ( u ) = ( S ( u ) , u ∈ [0 , s ] , S (2 s ) + S ( u − s ) , u ∈ [2 s, ∞ ) , and let S ( u ) = S ( u + 2 s ). Let ( B , σ ) and ( B , σ ) be the associated tracial W ∗ -algebras. Note that B ∗ B = B ∗ B . Since the X and X ′ are from A sa , we have X A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , s ) , X ′ , t ) = X A ,τ ( X , X ′ , s + t )) , because the ﬂowing for time 2 s along (6.1) with S and then for time 2 t with S is the same as ﬂowingfor time 2 s + 2 t with S . Now E A ◦ E A∗B is equal to the unique trace-preserving conditional expectation

A ∗ B ∗ ˜ B → A . Thus, this agrees with ﬁrst taking the conditional expectation from

A ∗ B ∗ B onto A ∗ B and then onto A . Now X A ,τ ( X , X ′ , s + t )) is in A ∗ B already and hence the above expression reduces to E A [ f A∗B ,τ ∗ σ ( X A ,τ ( X , X ′ , s + 2 t ) , X ′ )] = [ e ( s + t ) L x , J f ] A ,τ ( X , X ′ ) . Lemma 6.20.

Let $f \in C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^{d''}$. Then $t \mapsto e^{tL_{x,J}} f$ is a continuous function $[0,\infty) \to C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$.

Proof.

By Lemma 6.13, $\mathcal{X}$ is a continuous map $[0,\infty) \to C_{\operatorname{tr},\mathcal{S}}^\infty(\mathbb{R}^{*(d+d')})_{\operatorname{sa}}^d$. By continuity of composition in Theorem 3.19 / Proposition 6.9, $t \mapsto f(\mathcal{X}(\cdot,t),\pi')$ defines a continuous map $[0,\infty) \to C_{\operatorname{tr},\mathcal{S}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Using Lemma 6.15, continuity is preserved when we apply the conditional expectation to obtain the heat semigroup.

Our next goal is to construct a "kernel projection" $E_{x,J}$ and pseudo-inverse $\Psi_{x,J}$ for the Laplacian $L_{x,J}$. The operator $E_{x,J}$ is obtained as the limit of $e^{tL_{x,J}}$ as $t \to \infty$.

Lemma 6.21.

Let f ∈ C k tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ , let R > , and let R ′ = max( R + 2 , k J − π k C tr ( R ∗ d ) d ,R ) . Then for k ′ ≤ k , k ∂ k ′ f − ∂ k e tL x , J f k C tr ( R ( d + d ′ ) , M ℓ + k ) d ′′ ,R ≤ C k ′ , J ,R R ′ k ′ X j =0 k ∂ x ∂ j f k C tr ( R ∗ ( d + d ′ ) , M ℓ + j ) d ′′ ,R ′ , (6.15) where C k, J ,R is a constant depending only on k and J and R . roof. Using Lemma 6.15, we have k ∂ k f − ∂ k e tL x , J f k C tr ( R ( d + d ′ ) , M ℓ + k ) d ′′ ,R ≤ k ∂ k f − ∂ k [ f ◦ ( X , π ′ )] k C tr , S ( R ( d + d ′ ) , M ℓ + k ) d ′′ ,R Recall that ∂ k ′ [ f ( X ( · , t )] is given by (6.14). Let us ﬁrst control the terms where ∂ k ∗ x ′ ∂ j x f has some multilinearargument of the form ∂ m X with m ≥

2. Of course, this can only happen if j ≥

1, which means f isdiﬀerentiated with respect to x at least once. Using (6.9), we can bound the term ∂ k ′ x ′ ∂ j x f ( X ( · , t ) , X ′ ) , . . . , Id | {z } ℓ , ∂ | B | X ( · , t ) , . . . , ∂ | B j | X ( · , t ) , Π ′ , . . . , Π ′ | {z } k ′ ] σ by a constant times the sum of the norms of ∂ x ∂ j f for j ≤ k ′ −

1. This produces a bound of the same formas the right-hand side of (6.15) since j ≥ ≤ R ′ .The remaining terms of (6.14) are those where | B i | = 1 for all i . This implies that j + k ∗ = k ′ , and hencethese terms add up to k ′ X j =0 (cid:18) k ′ j (cid:19) k ′ ! X σ ∈ Perm([ k ′ ]) ∂ k ∗ x ′ ∂ j x f ( X ( · , t ) , π ′ ) , . . . , Id | {z } ℓ , ∂ X ( · , t ) , . . . , ∂ X ( · , t ) | {z } j , Π ′ , . . . , Π ′ | {z } k ′ − j ] σ . (6.16)When t = 0, this reduces to k ′ X j =0 (cid:18) k ′ j (cid:19) k ′ ! X σ ∈ Perm([ k ′ ]) ∂ k ∗ x ′ ∂ j x f , . . . , Id | {z } ℓ , Π , . . . , Π | {z } j , Π ′ , . . . , Π ′ | {z } k ′ − j ] σ = ∂ k f . (6.17)Thus, to complete the proof, it suﬃces to estimate the diﬀerence between (6.16) and (6.17) by the right-handside of (6.15). Now (6.17) is obtained from (6.16) by swapping out each ∂ X for π and swapping out X for X inside ∂ k f .By (6.9), ∂ X ( · , t ) is bounded by a constant. Hence, when swapping out each ∂ X for π , the error isbounded by the right-hand side of (6.15) as desired. Finally, we must replace ∂ k ′ f ( X ( · , t ) , π ′ ) by ∂ k f . Given( A , τ ), if k ( X , X ′ ) k ∞ ≤ R , then X A ,τ ( X , X ′ , t ) is also bounded by R ′ . Thus, the error can be controlled in k·k C tr , S ( R ∗ d , M ℓ + k ′ ) d ′′ ,R by k ∂ x ∂ k ′ f k C tr ( R ∗ ( d + d ′ ) , M ℓ + k ′ ) d ′′ ,R ′ kX ( · , t ) − π k C tr ( R ∗ ( d + d ′ ) ) d ,R . Then using Lemma 6.10, we have kX ( · , t ) − π k C tr , S ( R ∗ ( d + d ′ ) ) d ,R ≤ R ′ . Thus, we can bound the error by the right-hand side of (6.15) as desired.
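The final replacement step can be checked by hand in the simplest case $k' = 0$. The sketch below assumes only that $f$ is Fréchet-$C^1$ in $x$ and that both $X$ and $X_t$ have norm at most $R'$; the shorthand $X_t$ for $\mathcal{X}^{\mathcal{A},\tau}(X,X',t)$ and the generic norm $\|\cdot\|$ are introduced here for illustration and are not the paper's notation.

```latex
% Illustration only: X_t abbreviates the process started at X and run to time t.
% Fundamental theorem of calculus along the segment from X to X_t:
\[
  f(X_t, X') - f(X, X')
  = \int_0^1 \partial_x f\bigl(X + s (X_t - X), X'\bigr)[X_t - X] \, ds .
\]
% Every point X + s (X_t - X) is a convex combination of X and X_t, so it is a
% self-adjoint tuple of norm at most R'.  Taking norms gives
\[
  \bigl\| f(X_t, X') - f(X, X') \bigr\|
  \;\le\; \sup_{\|(Z, X')\|_\infty \le R'} \bigl\| \partial_x f(Z, X') \bigr\|
  \cdot \| X_t - X \| ,
\]
% where \| \partial_x f(Z, X') \| is the operator norm of the linear map
% Y \mapsto \partial_x f(Z, X')[Y] for whichever norm is placed on tuples.
```

This is the mechanism behind the product bound at the end of the proof: one factor is a derivative of $f$ on the larger ball of radius $R'$, the other is the displacement $\mathcal{X}(\cdot,t) - \pi$.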

Proposition 6.22.

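There exists a unique continuous operator
$$E_{x,J}: C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''} \to C_{\operatorname{tr}}(\mathbb{R}^{*d'},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$$
such that
$$(E_{x,J} f) \circ \pi' = \lim_{t \to \infty} e^{tL_{x,J}} f \quad \text{in } C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}. \tag{6.18}$$
For $k \in \mathbb{N} \cup \{\infty\}$, the operator $E_{x,J}$ maps $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^{d''}$ into $C_{\operatorname{tr}}^k(\mathbb{R}^{*d'},\mathscr{M}^\ell)^{d''}$. It satisfies
$$\|\partial^{k'} E_{x,J} f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*d'},\mathscr{M}^\ell)^{d''},R} \leq C_{k',J,R} \sum_{j=1}^{k'} \|\partial^j f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*d'},\mathscr{M}^\ell)^{d''},R'} \tag{6.19}$$
for $k' \leq k$, where $R' = \max(R + 2, \|J - \pi\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*(d+d')})^d})$. Finally, the limit (6.18) holds in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ whenever $f \in C_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ (or more generally whenever $f$ lies in the closure of $C_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$).

Remark 6.23. Unfortunately, we have not proved that $C_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ is dense in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$.

Proof.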

First, suppose that $f \in C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Let
$$R' = \max(R + 2, \|J - \pi\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*(d+d')})^d}), \qquad R'' = \max(R' + 2, \|J - \pi\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*(d+d')})^d}).$$
Then for $t \geq s \geq 0$,
$$\begin{aligned}
\sum_{j=1}^k \bigl\|\partial^j[e^{tL_{x,J}} f] - \partial^j[e^{sL_{x,J}} f]\bigr\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+j})^{d''},R}
&\leq C_{k,J,R} R' \sum_{j=1}^k \bigl\|\partial_x \partial^j[e^{sL_{x,J}} f]\bigr\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+j})^{d''},R'} \\
&\leq e^{-cs} p_{k,J}(s) R' \sum_{j=1}^k \|\partial_x \partial^j f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+j})^{d''},R''},
\end{aligned} \tag{6.20}$$
where the first inequality, for some constant $C_{k,J,R}$, follows from Lemma 6.19 and (6.15), and the second inequality, for some polynomial $p_{k,J}$, follows from (6.13). (As before, the constants and polynomials here are not the same ones as in the previous lemmas.) Because of the $e^{-cs}$ term, the difference goes to zero as $s, t \to \infty$, and thus $e^{tL_{x,J}} f$ is Cauchy with respect to each of the seminorms in $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. So the limit $Tf := \lim_{t \to \infty} e^{tL_{x,J}} f$ exists in $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Let
$$[E_{x,J} f]^{\mathcal{A},\tau}(X') = [Tf]^{\mathcal{A},\tau}(0, X').$$
Note that $E_{x,J} f \in C_{\operatorname{tr}}^\infty(\mathbb{R}^{*d'},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Because of (6.13), we see that $\partial_x Tf = 0$, and therefore
$$Tf = Tf(0, \pi') = E_{x,J} f(\pi').$$
So we have proved existence of the limit for $f \in C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$.

Next, note that $\operatorname{TrP}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''} \subseteq C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ is dense in $C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. By (6.12), the operators $e^{tL_{x,J}}$ on $C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ are equicontinuous. Thus, since the limit as $t \to \infty$ exists on a dense subset, it exists everywhere. Thus, $E_{x,J}$ is a well-defined continuous operator on $C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$.

Similarly, (6.12) shows that the operators $e^{tL_{x,J}}$ are equicontinuous on $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Using (6.20), we see that if $f \in C_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$, then the limit of $e^{tL_{x,J}} f$ exists in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ as $t \to \infty$, and hence the same holds in the closure by equicontinuity.

Proposition 6.24.

Let J ∈ J da,c for some c ∈ (0 , and a ∈ R . Then e tL x , J and E x , J : C tr ( R ∗ ( d + d ′ ) ) → C tr ( R ∗ d ′ ) are multiplicative over tr( C tr ( R ∗ ( d + d ′ ) )) , they are positive, and they satisfy e tL x , J ◦ tr = tr ◦ e tL x , J and E J ◦ tr = tr ◦ E J .Remark . In particular, in the case d ′ = 0, we see that E J deﬁnes a non-commutative law by Lemma4.5. This turns out to be one method to obtain the law µ V associated to a potential V when ∇ V ∈ J da,c , aswe will explain in § Proof.

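To prove multiplicativity for the heat semigroup, let $\phi \in \operatorname{tr}(C_{\operatorname{tr}}(\mathbb{R}^{*d}))$ and $f \in C_{\operatorname{tr}}(\mathbb{R}^{*d})$. Then
$$\begin{aligned}
e^{tL_{x,J}}[\phi f]^{\mathcal{A},\tau}(X,X')
&= E_{\mathcal{A}}\bigl[\phi^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',t),X') \, f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',t),X')\bigr] \\
&= \phi^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',t),X') \, E_{\mathcal{A}}\bigl[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',t),X')\bigr] \\
&= e^{tL_{x,J}}[\phi]^{\mathcal{A},\tau}(X,X') \, e^{tL_{x,J}}[f]^{\mathcal{A},\tau}(X,X'),
\end{aligned}$$
since $\phi^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',t),X')$ is scalar-valued and thus can be pulled out of the conditional expectation onto $\mathcal{A}$. The multiplicativity property for $E_{x,J}$ follows by taking $t \to \infty$.

The positivity property is immediate because $e^{tL_{x,J}} f$ is obtained by evaluating $f$ on some operator and then applying conditional expectation.

The trace-preserving property follows by similar reasoning. Indeed,
$$\begin{aligned}
[\operatorname{tr}(e^{tL_{x,J}} f)]^{\mathcal{A},\tau}(X,X')
&= \tau\bigl[E_{\mathcal{A}} f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}(X,X',t),X')\bigr] \\
&= (\tau*\sigma)\bigl[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}(X,X',t),X')\bigr] \\
&= E_{\mathcal{A}}\bigl[[\operatorname{tr}(f)]^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}(X,X',t),X')\bigr] \\
&= [e^{tL_{x,J}}[\operatorname{tr}(f)]]^{\mathcal{A},\tau}(X,X').
\end{aligned}$$
The trace-preserving property for $E_{x,J}$ follows by taking $t \to \infty$.

Proposition 6.26.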

Let $R' = \max(2 + R, \|\nabla_X W\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*(d+d')})^d})$.

(1) For $f \in C_{\operatorname{tr}}^1(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$, the integral
$$\Psi_{x,J} f := \int_0^\infty e^{tL_{x,J}}(f - E_{x,J} f \circ \pi') \, dt := \lim_{T \to \infty} \int_0^T e^{tL_{x,J}}(f - E_{x,J} f \circ \pi') \, dt$$
exists as an improper Riemann integral in $C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$.

(2) $\Psi_{x,J}$ maps $C_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ into $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ and satisfies
$$\sum_{j=0}^k \|\partial^j \Psi_{x,J} f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+j})^{d''},R} \leq C_{k,J,R} \sum_{j=0}^k \|\partial_x \partial^j f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+j})^{d''},R'}$$
for some constants $C_{k,J,R}$.

(3) We also have that if $\partial_x f$ is in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell},\mathbb{R}^{*d}))^{d''}$, then
$$\sum_{j=0}^k \|\partial_x \partial^j \Psi_J f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+j+1})^{d''},R} \leq C'_{k,J,R} \sum_{j=0}^k \|\partial_x \partial^j f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+j+1})^{d''},R'}$$
for some constants $C'_{k,J,R}$. In particular, in the $d' = 0$ case where there is no $x'$, the operator, which we will denote $\Psi_J$, maps $C_{\operatorname{tr}}^k(\mathbb{R}^{*d},\mathscr{M}^\ell)^{d''}$ into itself.

Proof. We shall prove (1) and (2) at the same time. Let $k \geq 0$ and $f \in C_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^{d''}$. Then by Proposition 6.22, $E_{x,J} f \circ \pi'$ is in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^{d''}$. Because $t \mapsto e^{tL_{x,J}} f$ is a continuous function $[0,\infty) \to C_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$, the Riemann integral
$$\int_0^T e^{tL_{x,J}}(f - E_{x,J} f \circ \pi') \, dt$$
is well-defined in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Then using (6.20) and taking $t \to \infty$, we see that
$$\sum_{j=1}^k \bigl\|\partial^j[e^{sL_{x,J}} f] - \partial^j[E_{x,J} f \circ \pi']\bigr\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+j})^{d''},R} \leq e^{-cs} p_{k,J}(s) R' \sum_{j=1}^k \|\partial_x \partial^j f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}^{\ell+j})^{d''},R''},$$
which implies convergence of the integral in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ as $T \to \infty$ with the bounds asserted in (2). In particular, by taking $k = 0$, we obtain (1).

(3) Using (6.13), the improper integral $\int_0^\infty \partial_x \partial^j e^{tL_J} f \, dt$ converges in $C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell},\mathbb{R}^{*d}))^{d''}$ for $j = 1, \dots, k$, and we have
$$\left\|\int_0^\infty \partial^j \partial_x e^{tL_{x,J}} f \, dt\right\|_{C_{\operatorname{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+j+1})^{d''},R} \leq \int_0^\infty e^{-ct} p_{k,J}(t) \, dt \; \sum_{j'=0}^j \|\partial^{j'} \partial_x f\|_{C_{\operatorname{tr}}(\mathbb{R}^{*d},\mathscr{M}^{\ell+j'+1})^{d''},R'},$$
where $R'$ is as above. Convergence of the integral in $C_{\operatorname{tr}}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell},\mathbb{R}^{*d},\mathbb{R}^{*(d+d')},\dots,\mathbb{R}^{*(d+d')}))^{d''}$ implies that for a fixed $(\mathcal{A},\tau)$, the integral
$$\int_0^\infty \partial_x \partial^j [e^{tL_{x,J}} f]^{\mathcal{A},\tau}(X,X') \, dt$$
converges uniformly for $X \in \mathcal{A}_{\operatorname{sa}}$ with $\|X\|_\infty \leq R$, for each $j = 1, \dots, k$. Uniform convergence implies that we can exchange integration with Fréchet differentiation. This shows that
$$\partial^j \partial_x [\Psi_J f]^{\mathcal{A},\tau} = \int_0^\infty \partial^j \partial_x [e^{tL_J} f]^{\mathcal{A},\tau} \, dt.$$
Since this holds for all $(\mathcal{A},\tau)$, we have
$$\partial^j \partial_x [\Psi_J f] = \int_0^\infty \partial^j \partial_x [e^{tL_J} f] \, dt$$
for $j = 0, \dots, k$. This proves the desired estimate.

Remark 6.27. In (2), the constants $C_{k,J,R}$ only depend on $R$ and on the norms of the derivatives up to order $k + 1$ of $J$ on the ball of radius $R'$. In (3), the constants $C_{k,J,R}$ only depend on the norms of the derivatives of $J - \pi$ up to order $k + 1$ on the ball of radius $R'$, and there is no direct dependence on $R$, i.e. no dependence on $R$ other than through these norms. In particular, if $J \in BC_{\operatorname{tr}}^{k+1}(\mathbb{R}^{*d})$, then $\sup_R C_{k,J,R} < \infty$.

Proposition 6.28.

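Let $k \in \mathbb{N} \cup \{\infty\}$, and let $f \in C_{\operatorname{tr}}^{k+2}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Let $F(X,t) = e^{tL_{x,J}} f(X)$. Then $F$ defines a differentiable map $[0,\infty) \to C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$, and
$$\frac{d}{dt} F = L_{x,J} F = L_x F - \partial_x F \,\#\, J.$$

Proof.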

By considering each coordinate of $f$ separately, it suffices to consider the case $d'' = 1$. We will first prove differentiability in a weak sense and then deduce the stronger statement by general tricks.

We claim that for $f \in C_{\operatorname{tr}}^2(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$ and $(\mathcal{A},\tau) \in \mathbb{W}$ and for $(X,X')$ and $Y_1, \dots, Y_\ell$ in $\mathcal{A}_{\operatorname{sa}}^{d+d'}$, we have
$$\lim_{\delta \to 0^+} \frac{(e^{\delta L_{x,J}} f)^{\mathcal{A},\tau}(X,X')[Y_1,\dots,Y_\ell] - f^{\mathcal{A},\tau}(X,X')[Y_1,\dots,Y_\ell]}{\delta} = [L_{x,J} f]^{\mathcal{A},\tau}(X,X')[Y_1,\dots,Y_\ell] \tag{6.21}$$
with respect to $\|\cdot\|_\infty$. By (6.1), we have
$$\mathcal{X}^{\mathcal{A},\tau}(X,X',\delta) = X + \mathcal{S}(2\delta) - \int_0^\delta J^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',u),X') \, du.$$
Given continuity of $\mathcal{X}^{\mathcal{A},\tau}$ in $t$, it follows that
$$\mathcal{X}^{\mathcal{A},\tau}(X,X',\delta) = X + \mathcal{S}(2\delta) - \delta J^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X,X') + o(\delta).$$
Since $f$ is a Fréchet-$C^2$ function and $\mathcal{S}(2\delta)$ is $O(\delta^{1/2})$, we have the Taylor expansion
$$\begin{aligned}
f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',\delta),X')[Y_1,\dots,Y_\ell]
&= f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X,X')[Y_1,\dots,Y_\ell] \\
&\quad - \delta \, \partial_x f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X,X')[Y_1,\dots,Y_\ell, J^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X,X')] \\
&\quad + \partial_x f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X,X')[Y_1,\dots,Y_\ell,\mathcal{S}(2\delta)] \\
&\quad + \tfrac{1}{2} \partial_x^2 f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X,X')[Y_1,\dots,Y_\ell,\mathcal{S}(2\delta),\mathcal{S}(2\delta)] + o(\delta).
\end{aligned}$$
The first two terms on the right-hand side are already in $\mathcal{A}$. When we apply the expectation $E_{\mathcal{A}}$, the term that is linear in $\mathcal{S}(2\delta)$ vanishes using free independence, while the term that is quadratic in $\mathcal{S}(2\delta)$ (by our very definition of $L_x$ in Definitions 4.21 and 4.23) produces $\delta (L_x f)^{\mathcal{A},\tau}(X,X')[Y_1,\dots,Y_\ell]$. Together with the drift term, the first-order part of the expansion is therefore
$$\delta (L_{x,J} f)^{\mathcal{A},\tau}(X,X')[Y_1,\dots,Y_\ell].$$
This establishes (6.21).

Now we begin the main argument. By Lemma 6.20, $t \mapsto F(\cdot,t)$ is a continuous function from $[0,\infty)$ to $C_{\operatorname{tr}}^{k+2}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$, and hence $t \mapsto L_{x,J} F(\cdot,t)$ is a continuous function from $[0,\infty)$ to $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$. This follows from continuity of
$$L_x: C_{\operatorname{tr}}^{k+2}(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell})) \to C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$$
together with continuity of $f \mapsto \partial_x f \,\#\, J$, which holds by continuity of composition. Therefore, we may define
$$G(\cdot,t) = f + \int_0^t L_{x,J} F(\cdot,u) \, du$$
as a Riemann integral with values in $C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$. By the fundamental theorem of calculus, $G$ is differentiable as a function $[0,\infty) \to C_{\operatorname{tr}}^k(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))$ with derivative equal to $L_{x,J} F(\cdot,t)$. Therefore, it suffices to show that $G = F$.

Fix $(\mathcal{A},\tau)$, let $(X,X') \in \mathcal{A}_{\operatorname{sa}}^{d+d'}$, let $t \in [0,\infty)$, and let $\phi$ be a state on $\mathcal{A}$. We will prove that
$$\phi \circ (F - G)^{\mathcal{A},\tau}(X,X',t) = 0. \tag{6.22}$$
As in the proof of the mean value theorem, consider the function $\beta: [0,t] \to \mathbb{R}$ given by
$$\beta(u) = \phi\Bigl((F - G)^{\mathcal{A},\tau}(X,X',u) - \frac{u}{t}(F - G)^{\mathcal{A},\tau}(X,X',t)\Bigr).$$
Note that $\beta(0) = \beta(t) = 0$ and $\beta$ is continuous. Moreover, by (6.21) applied to $e^{uL_{x,J}} f$, we have
$$\lim_{\delta \to 0^+} \frac{1}{\delta}\bigl((F - G)^{\mathcal{A},\tau}(X,X',u+\delta) - (F - G)^{\mathcal{A},\tau}(X,X',u)\bigr) = (L_{x,J} F(\cdot,u) - L_{x,J} F(\cdot,u))^{\mathcal{A},\tau} = 0.$$
This implies that $\beta$ is right-differentiable with right-derivative given by
$$-\frac{1}{t} \phi \circ (F - G)^{\mathcal{A},\tau}(X,X',t).$$
Since $\beta(0) = \beta(t) = 0$ and $\beta$ is continuous, it must achieve a maximum at some point in $[0,t)$, which implies that the right-derivative satisfies
$$-\frac{1}{t} \phi \circ (F - G)^{\mathcal{A},\tau}(X,X',t) \leq 0.$$
By the same token, $\beta$ achieves a minimum at some point in $[0,t)$, so the opposite inequality holds as well, which proves (6.22).

Proposition 6.29.

Let $J \in \mathcal{J}^d_{a,b}$. Then the operators $\{e^{tL_{x,J}}\}_{t \in [0,\infty)}$, $L_{x,J}$, $E_{x,J}[-] \circ \pi'$, and $\Psi_{x,J}$ all commute as operators on $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$. Moreover,
$$L_{x,J}[E_{x,J} f \circ \pi'] = 0 \tag{6.23}$$
and
$$(-L_{x,J} \Psi_{x,J} + E_{x,J}) f = f. \tag{6.24}$$

Proof.

By Lemma 6.19, the operators $\{e^{tL_{x,J}}\}_{t \in [0,\infty)}$ form a semigroup, and hence they all commute with each other. This implies that
$$e^{tL_{x,J}} \, \frac{e^{sL_{x,J}} - 1}{s} f = \frac{e^{sL_{x,J}} - 1}{s} \, e^{tL_{x,J}} f.$$
When we take $s \to 0^+$, by Proposition 6.28 and the continuity of $e^{tL_{x,J}}$ as an operator on $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}^\ell)^{d''}$, we obtain that $e^{tL_{x,J}}$ and $L_{x,J}$ commute.

Similarly, since $e^{tL_{x,J}} f \to E_{x,J} f \circ \pi'$ as $t \to \infty$, we see that the operators $e^{sL_{x,J}}$ and $L_{x,J}$ commute with $E_{x,J}[-] \circ \pi'$.

Next, for each $T \in [0,\infty)$, the operator
$$f \mapsto \int_0^T [e^{tL_{x,J}} f - E_{x,J} f \circ \pi'] \, dt$$
commutes with $e^{sL_{x,J}}$, $L_{x,J}$, and $E_{x,J}[-] \circ \pi'$, as one can see easily from Riemann sum approximations. Then taking $T \to \infty$, we see that $\Psi_{x,J}$ commutes with all these operators.

To prove (6.23), observe that $E_{x,J} f \circ \pi'$ is a function that only depends on $X'$, and hence it is in the kernel of $\nabla_X$ and $\partial_x$, and hence in the kernel of $L_{x,J}$.

To prove (6.24), observe that using (6.23) and the previous proposition, we have
$$-L_{x,J} \int_0^T [e^{tL_{x,J}} f - E_{x,J} f] \, dt = -\int_0^T L_{x,J}[e^{tL_{x,J}} f] \, dt = -\int_0^T \frac{d}{dt}[e^{tL_{x,J}} f] \, dt = f - e^{TL_{x,J}} f.$$
As we take $T \to \infty$, the right-hand side approaches $f - E_{x,J} f$ in $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ by Proposition 6.22. Moreover, as in the proof of Proposition 6.26, $\int_0^T [e^{tL_{x,J}} f - E_{x,J} f] \, dt$ converges in $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ as $T \to \infty$ to $\Psi_{x,J} f$, and hence
$$-L_{x,J} \Psi_{x,J} f = f - E_{x,J} f,$$
which rearranges to (6.24).

Proposition 6.30.

Let $T$ be any one of the operators $\{e^{tL_{x,J}}\}_{t \in [0,\infty)}$, $L_{x,J}$, $E_{x,J}[-] \circ \pi'$, and $\Psi_{x,J}$. Then for $f \in C_{\operatorname{tr}}^\infty(\mathbb{R}^{*(d+d')},\mathscr{M}(\mathbb{R}^{*d_1},\dots,\mathbb{R}^{*d_\ell}))^{d''}$ and $g \in C_{\operatorname{tr}}^\infty(\mathbb{R}^{*d'})$, we have
$$T[f \cdot (g \circ \pi')] = T[f] \cdot (g \circ \pi'), \qquad T[(g \circ \pi') \cdot f] = (g \circ \pi') \cdot T[f]. \tag{6.25}$$

Proof.

Note that for $(\mathcal{A},\tau) \in \mathbb{W}$ and $(X,X') \in \mathcal{A}_{\operatorname{sa}}^{d+d'}$,
$$\begin{aligned}
e^{tL_{x,J}}[f \cdot (g \circ \pi')]^{\mathcal{A},\tau}(X,X')
&= E_{\mathcal{A}}\bigl[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',t),X') \, g^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X')\bigr] \\
&= E_{\mathcal{A}}\bigl[f^{\mathcal{A}*\mathcal{B},\tau*\sigma}(\mathcal{X}^{\mathcal{A},\tau}(X,X',t),X')\bigr] \, g^{\mathcal{A},\tau}(X') \\
&= e^{tL_{x,J}}[f]^{\mathcal{A},\tau}(X,X') \, g^{\mathcal{A},\tau}(X'),
\end{aligned}$$
since $g^{\mathcal{A}*\mathcal{B},\tau*\sigma}(X') = g^{\mathcal{A},\tau}(X') \in \mathcal{A}$. The same reasoning holds when $g$ is on the left side of $f$, which proves (6.25) in the case $T = e^{tL_{x,J}}$. In other words, $e^{tL_{x,J}}$ is a bimodule map over $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*d'})$. Since the identity is a bimodule map, and bimodule maps are closed under linear combinations and limits (hence also derivatives and integrals with respect to $t$), we see that $L_{x,J}$, $E_{x,J}[-] \circ \pi'$, and $\Psi_{x,J}$ are also bimodule maps over $C_{\operatorname{tr}}^\infty(\mathbb{R}^{*d'})$. This proves (6.25).

The final observation that we will make is about continuous dependence of $\Psi_{x,J}$ on $J$. This result has a similar purpose to [26, Lemma 44].

Proposition 6.31.

Fix c ∈ (0 , and a ∈ (0 , ∞ ) . Let T J be one of the operators e tL V W, X , E x , J [ − ] ◦ π ′ , L x , J , or Ψ x , J . Then ( W, f ) T W f deﬁnes a continuous map J d,d ′ a,c × C ∞ tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ → C ∞ tr ( R ∗ d ′ , M ( R ∗ d , ˙ , R ∗ d ℓ )) d ′′ , where J d,d ′ a,c is equipped with the subspace topology from C tr ( R ∗ ( d + d ′ ) ) d sa .Proof. First, let us prove that X depends continuously on J in J d,d ′ a,c . Speciﬁcally, we will show that for J ∈ J d,d ′ a,c and T >

0, and for every k and ǫ > R >

0, there is a neighborhood U of J in J d,d ′ a,c suchthat J ∈ U implies that sup t ∈ [0 ,T ] k ∂ k X ( · , t ) − ∂ k X ( · , t ) k C tr , S ( R ∗ ( d + d ′ ) , M k ) d ,R < ǫ, where X and X are the processes corresponding to J and J respectively.As one might expect, the argument proceeds by induction on k using Gr¨onwall’s inequality with thediﬀerential equations for ∂ k X . For k = 0, by (6.1), we obtain X ( · , t ) − X ( · , t ) = − Z t J ( X ( · , u ) , π ′ ) − J ( X ( · , u ) , π ′ ) du − Z t ( J − J )( X ( · , u ) , π ′ ) du. In the second term on the right-hand side, the integrand is bounded in k·k C tr , S ( R ∗ ( d + d ′ ) ) d ,R by (1 / k J − J k C tr ( R ∗ ( d + d ′ ) ) d ,R ′ where R ′ = max( R + 2 , a ) using (6.2). In the ﬁrst term on the right-hand side, theintegrand is bounded in k·k C tr , S ( R ∗ ( d + d ′ ) ) d ,R by 2 − c times kX ( · , u ) − X ( · , u ) k C tr ( R ∗ ( d + d ′ ) ) d , . Thus, usingGr¨onwall’s inequality, we get a bound of the desired form for k = 0.For the induction step, the argument uses (6.11) instead of (6.1). As in the proof of Lemma 6.13, weseparate out the terms ∂ x J j ( X , π ′ ) ∂ k X j . All each of the other terms have approximately the same valuein BC tr ( R ∗ ( d + d ′ ) , M k ) d for J and J by the induction hypothesis (using an argument where we swap outeach X in the product for an X iteratively). Then we use Gr¨onwall’s inequality. The details are left as anexercise.Now that we proved our claim about continuous dependence of X on J , observe that by continuity ofcomposition, f ( X , π ′ ) in C ∞ tr , S ( R ∗ ( d + d ′ ) , M ℓ ) d ′′ depends continuously on ( f , W ). Then by Lemma 6.15, weobtain the continuity of e tL x , J f asserted in the proposition.Next, we prove continuity of ( J , f ) E J , X f ◦ π ′ . 
From our argument about the continuous dependence of X on W , we can deduce that for each J and k , there is a neighborhood U ⊆ J d,d ′ a,c such that the constants C k, J ,R in Lemma 6.13 are uniformly bounded for J ∈ U . Tracing through our previous arguments, it followsthat the constants in Proposition 6.22 are also uniformly bounded for J in a neighborhood of J . Therefore,we can conclude from Proposition 6.22 the following: For each J ∈ J d,d ′ a,c and f ∈ C ∞ tr ( R ∗ ( d + d ′ ) , M ℓ ) d ′′ ,there exists neighborhoods U ⊆ W a,c and

V ⊆ C ∞ tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′′ such that the convergenceof e tL x , J f as t → ∞ is uniform for ( J , f ) ∈ U × V . Since continuity is preserved under locally uniform limits,we have that ( J , f ) E x , J f ◦ π ′ is continuous in the sense asserted by this proposition.In a similar way, using the continuity of ( J , f ) e tL x , J f (which is uniform for t ∈ [0 , T ]) and ( J , f ) E V + W, X f ◦ π , we obtain the continuity of ( J , f ) Ψ x , J . Finally, the continuity of ( J , f ) L x , J f can bechecked directly from the deﬁnition since L x , J f is obtained by diﬀerentiation and multiplication. The last section described one method of associating a non-commutative law to a potential V . Namely,if V ∈ tr( C ∞ tr ( R ∗ d )) such that ∇ V ∈ J da,c , the non-commutative law is obtained from the expectationfunctional. E V := E ∇ V : tr( C tr ( R ∗ d )) → C .In this section, we describe another approach based on free entropy, which works in greater generality.For certain potentials V , we show the existence of free Gibbs laws , that is, non-commutative laws maximizing79 ( ν ) − ˜ ν ( V ), where χ is a variant of Voiculescu’s free entropy (Proposition 7.11). This idea was suggestedby the results and comments in [85, § V satisﬁes a certain integration-by-parts relation (Proposition 7.15) and we deduce an exponentialbound for ν directly from this equation (Theorem 7.18). Finally, we show in Proposition 7.19 that “most”potentials V with bounded ﬁrst and second derivative have a unique free Gibbs law. Free Gibbs laws for a potential V will be deﬁned as the maximizers of a certain entropy functional χ ωV .This is a variant of Voiculescu’s microstates free entropy χ that uses limits along an ultraﬁlter. We alsoslightly modify Voiculescu’s framework. 
Rather than assuming a priori that the non-commutative laws arise from bounded operators, we allow ourselves to work with something like measures of finite variance, or more precisely, linear functionals defined on a space $\mathcal{C}$ of test functions with quadratic growth at $\infty$. Thus, we will work with matricial microstate spaces that do not have any operator-norm cutoff.

In the end, we will show that for $V$ satisfying certain bounds on the first and second derivative, the free Gibbs laws are automatically given as the non-commutative laws of bounded operators. Thus, the space $\mathcal{C}$ is mostly a technical artifice. We will therefore allow ourselves an ad hoc definition of $\mathcal{C}$ for the sake of making the statements and proofs cleaner.

Let $V \in \operatorname{tr}(C_{\operatorname{tr}}(\mathbb{R}^{*d}))$ be given by
$$V^{\mathcal{A},\tau}(X) = \frac{1}{2} \sum_{j=1}^d \tau(X_j^* X_j) = \frac{1}{2} \|X\|_2^2.$$
Note that if $g \in C_{\operatorname{tr}}^1(\mathbb{R}^{*d})$ has bounded first derivatives, then $g$ is $\|\cdot\|_2$-Lipschitz; more precisely, for all $(\mathcal{A},\tau) \in \mathbb{W}$ and $X, Y \in \mathcal{A}_{\operatorname{sa}}^d$, we have
$$\|g^{\mathcal{A},\tau}(X) - g^{\mathcal{A},\tau}(Y)\|_2 \leq \|\partial g\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d},\mathscr{M}^1)} \|X - Y\|_2.$$
In particular, $\operatorname{tr}(g^* g)$ is bounded by a constant times $1 + V$. Hence, if $g$ and $h$ are in $C_{\operatorname{tr}}^1(\mathbb{R}^{*d})$ and have bounded first derivative, then $\operatorname{tr}(gh)/(1 + V)$ is bounded.

We define $\mathcal{C}$ to be the set of $f \in \operatorname{tr}(C_{\operatorname{tr}}(\mathbb{R}^{*d}))$ such that $f/(1 + V) \in \operatorname{tr}(BC_{\operatorname{tr}}(\mathbb{R}^{*d}))$ and such that $f/(1 + V)$ is the limit in $\operatorname{tr}(BC_{\operatorname{tr}}(\mathbb{R}^{*d}))$ of a sequence $f_n/(1 + V)$, where each $f_n$ is a linear combination of functions of the form $\operatorname{tr}(gh)$, where $g$ and $h \in C_{\operatorname{tr}}^1(\mathbb{R}^{*d})$ have bounded first derivatives. We equip $\mathcal{C}$ with the norm
$$\|f\|_{\mathcal{C}} = \|f/(1 + V)\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d})},$$
which makes $\mathcal{C}$ into a Banach space. Note that $V \in \mathcal{C}$, since $V = \frac{1}{2} \sum_{j=1}^d \operatorname{tr}(x_j^2)$ and $x_j$ has bounded first derivative. Clearly, $\mathcal{C}$ also contains $\operatorname{tr}(g) = \operatorname{tr}(1g)$ for any $g \in C_{\operatorname{tr}}^1(\mathbb{R}^{*d})$ with bounded first derivative.

Remark 7.1.
In fact, the property that elements of the form $\operatorname{tr}(gh)$, where $g$ and $h$ have bounded first derivatives, span a dense subspace of $\mathcal{C}$ is only needed at the end of the proof of Theorem 7.18. The rest of the results of this section would hold with $\mathcal{C}$ replaced with the larger space of functions $f$ such that $f/(1 + V)$ is bounded.

The next lemma describes how non-commutative laws give rise to linear functionals on $\mathcal{C}$.

Lemma 7.2.

Let $\mathcal{C}^\star$ denote the Banach-space dual of $\mathcal{C}$. There is an injective map $I: \Sigma_d \to \mathcal{C}^\star$ given by
\[ I(\lambda)(f) = f^{\mathcal{A},\tau}(\mathbf{X}), \]
where $\mathbf{X}$ is a $d$-tuple of operators in $(\mathcal{A},\tau)$ which realizes the law $\lambda$. We also have
\[ \|I(\lambda)\|_{\mathcal{C}^\star} = 1 + \frac{1}{2}\sum_{j=1}^d \lambda(x_j^2). \tag{7.1} \]
For each $R > 0$, $I|_{\Sigma_{d,R}}$ is a homeomorphism onto its image with respect to the weak-$\star$ topologies on $\Sigma_{d,R}$ and $\mathcal{C}^\star$.

Proof. To see that $I$ is injective, suppose that $\lambda$ and $\mu \in \Sigma_{d,R}$ for some $R$ and $I(\lambda) = I(\mu)$. Let $\phi \in C_c^\infty(\mathbb{R};\mathbb{R})$ with $\phi(t) = t$ for $|t| \leq R$. If $p$ is a non-commutative polynomial in $d$ variables, then $f := \operatorname{tr}(p(\phi, \dots, \phi))$ is in $\operatorname{tr}(BC_{\operatorname{tr}}(\mathbb{R}^{*d}))$, hence $f \in \mathcal{C}$. Since $f = \operatorname{tr}(p)$ on the ball of radius $R$, we have
\[ \lambda(p) = I(\lambda)(f) = I(\mu)(f) = \mu(p). \]

Next, to show (7.1), note that if $f \in \mathcal{C}$ with $\|f\|_{\mathcal{C}} \leq 1$, then $|f| \leq 1 + V_0$ and hence
\[ |I(\lambda)(f)| \leq I(\lambda)(1 + V_0) = 1 + \frac{1}{2}\sum_{j=1}^d \lambda(x_j^2), \]
while on the other hand equality is clearly achieved for $f = 1 + V_0$.

Finally, we show that $I|_{\Sigma_{d,R}}$ is a weak-$\star$ homeomorphism onto its image. Consider a net $\lambda_i$ and a potential limit point $\lambda$. Let $\nu_i$ and $\nu$ be the corresponding homomorphisms $\operatorname{tr}(C_{\operatorname{tr}}(\mathbb{R}^{*d})) \to \mathbb{C}$. If $\lambda_i \to \lambda$ in the weak-$\star$ topology, then $\nu_i(f) \to \nu(f)$ for every scalar-valued trace polynomial $f$ and hence for every $f \in \operatorname{tr}(C_{\operatorname{tr}}(\mathbb{R}^{*d}))$ by density. Since $I(\lambda_i) = \nu_i|_{\mathcal{C}}$ and $I(\lambda) = \nu|_{\mathcal{C}}$, we have $I(\lambda_i) \to I(\lambda)$ in the weak-$\star$ topology. Conversely, if $I(\lambda_i) \to I(\lambda)$ in the weak-$\star$ topology, then $\lambda_i \to \lambda$ in the weak-$\star$ topology because we can compute $\lambda_i(p)$ as $I(\lambda_i)(\operatorname{tr}(p(\phi, \dots, \phi)))$, where $\phi$ is a cut-off function as in the first part of the proof.

We will denote the closure of $I(\Sigma_d)$ in $\mathcal{C}^\star$ by $\mathcal{E}$. By the Banach-Alaoglu theorem, weak-$\star$ closed and bounded subsets of $\mathcal{C}^\star$ (and in particular of $\mathcal{E}$) are weak-$\star$ compact, which will become important later for proving the existence of maximizers of $\chi_V$. Indeed, using Voiculescu's original definition of $\chi$, it is possible to find a maximizer on $\Sigma_{d,R}$ (laws where the operator norm is bounded by $R$) because it is compact, but it is not clear whether we obtain a global maximum over $\Sigma_d$ (without using external information). On the other hand, compactness of the space of laws in $\mathcal{E}$ with "second moment" bounded by $R$ is enough to obtain a global maximizer in Proposition 7.11 below.

Remark 7.3. Unfortunately, the price we pay for such compactness is that there exist "spurious" laws in $\mathcal{E}$ that do not arise from any $d$-tuple of operators in $L^2(\mathcal{A},\tau)$ for $(\mathcal{A},\tau) \in \mathbb{W}$. Examples can be constructed as follows. Let $\mathbf{X}^{(n)}$ be some $d$-tuple of operators such that $X_j^{(n)}$ has spectral measure supported in $\mathbb{R} \setminus [-n,n]$ and has second moment equal to 1. By compactness, the sequence $(I(\lambda_{\mathbf{X}^{(n)}}))_{n \in \mathbb{N}}$ has a weak-$\star$ limit point $\nu \in \mathcal{E}$.
Then $\nu(\operatorname{tr}(x_j^2)) = 1$ but $\nu(\operatorname{tr}(\phi(x_j))) = 0$ for every $\phi \in C_c(\mathbb{R})$, which is impossible if $\nu$ arose from a $d$-tuple in $L^2$ of a tracial $\mathrm{W}^*$-algebra.

Free entropy will be defined as the exponential growth rate of microstate spaces. When studying the exponential growth rates, we do not know whether the limits in question exist; see [85, § [...]] [...] $M_N(\mathbb{C})$ as $N \to \infty$; see [30, § [...]] [...] $\beta\mathbb{N}$ denote the Stone-Čech compactification of $\mathbb{N}$. Recall that $\beta\mathbb{N}$ is a compact space containing $\mathbb{N}$ as an open dense subset, and any function from $\mathbb{N}$ into a compact Hausdorff space $\Omega$ extends uniquely to a continuous function $\beta\mathbb{N} \to \Omega$. In particular, if $(a^{(N)})_{N \in \mathbb{N}}$ is a bounded sequence of complex numbers, and if $\omega \in \beta\mathbb{N}$, then $\lim_{N \to \omega} a^{(N)}$ exists. Similarly, for any sequence in $[-\infty,\infty]$, the limit along the ultrafilter exists in $[-\infty,\infty]$.

Definition 7.4. For $\mathcal{U} \subseteq \mathcal{C}^\star$, we define the microstate space
\[ \Gamma^{(N)}(\mathcal{U}) = \{\mathbf{X} \in M_N(\mathbb{C})_{sa}^d : I(\lambda_{\mathbf{X}}) \in \mathcal{U}\}. \]

Definition 7.5.

Let $V \in \mathcal{C}$ be such that $V \geq aV_0 + b$ for some $a > 0$ and $b \in \mathbb{R}$. Then we define a probability measure $\mu_V^{(N)}$ on $M_N(\mathbb{C})_{sa}^d$ by
\[ d\mu_V^{(N)}(\mathbf{X}) = \frac{1}{Z_V^{(N)}} e^{-N^2 V^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X})}\, d\mathbf{X}, \qquad Z_V^{(N)} = \int_{M_N(\mathbb{C})_{sa}^d} e^{-N^2 V^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X})}\, d\mathbf{X}. \]
Here $d\mathbf{X}$ denotes Lebesgue measure on $M_N(\mathbb{C})_{sa}^d$, which is a real inner product space of dimension $dN^2$ with respect to $\langle \cdot, \cdot \rangle_2$ and hence has a canonical Lebesgue measure obtained by mapping it onto $\mathbb{R}^{dN^2}$ by a linear isometry. Note that the lower bound for $V$ implies that $e^{-N^2 V}$ is integrable on $M_N(\mathbb{C})_{sa}^d$.

Definition 7.6.

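Let $V$ be as above, let $\nu \in \mathcal{C}^\star$, and let $\omega \in \beta\mathbb{N} \setminus \mathbb{N}$. We define
\[ \chi_V^\omega(\nu) = \inf_{\text{open } \mathcal{U} \ni \nu}\ \limsup_{N \to \omega} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{U})), \]
where the infimum is taken over all weak-$\star$ neighborhoods $\mathcal{U}$ of $\nu$ in $\mathcal{C}^\star$.

Observation 7.7. If $\mathcal{U} \subseteq \mathcal{V}$, then $\mu_V^{(N)}(\Gamma^{(N)}(\mathcal{U})) \leq \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{V}))$. Hence, $\chi_V^\omega(\nu)$ is the limit of the net $\limsup_{N \to \omega} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{U}))$ as $\mathcal{U}$ tends to $\{\nu\}$, that is, the limit of the net over the directed system of neighborhoods of $\nu$ ordered by reverse inclusion.

Definition 7.8.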

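We say that $\nu \in \mathcal{C}^\star$ is a free Gibbs law for $V$ with respect to $\omega$ if it maximizes $\chi_V^\omega$.

Proposition 7.9. Let $V \in \mathcal{C}$ with $V(\mathbf{X}) \geq (a/2)\|\mathbf{X}\|_2^2 + b$ for some $a > 0$ and $b \in \mathbb{R}$. Let $\omega \in \beta\mathbb{N} \setminus \mathbb{N}$.

(1) We have $\chi_V^\omega(\nu) \leq 0$.

(2) $\chi_V^\omega$ is upper semi-continuous on $\mathcal{C}^\star$ with respect to the weak-$\star$ topology.

(3) If $\chi_V^\omega(\nu) > -\infty$, then $\nu$ must be in $\mathcal{E}$, that is, the weak-$\star$ closure of $I(\Sigma_d)$. In particular, we have $\nu(1) = 1$, $\nu(f) \geq 0$ for every nonnegative $f \in \mathcal{C}$, and $\nu(fg) = \nu(f)\nu(g)$ whenever $f$, $g$, and $fg$ are in $\mathcal{C}$.

Proof. (1) This is immediate since $\mu_V^{(N)}$ is a probability measure.

(2) For each weak-$\star$ open set $\mathcal{U} \subseteq \mathcal{C}^\star$, define
\[ \chi_{V,\mathcal{U}}^\omega(\nu) = \begin{cases} \lim_{N \to \omega} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{U})), & \nu \in \mathcal{U}, \\ \infty, & \nu \notin \mathcal{U}. \end{cases} \]
Thus, $\chi_{V,\mathcal{U}}^\omega$ only takes two values, one of which is $\infty$. Since $\mathcal{U}$ is open, $\chi_{V,\mathcal{U}}^\omega$ is upper semi-continuous. Observe that $\chi_V^\omega = \inf_{\text{open } \mathcal{U}} \chi_{V,\mathcal{U}}^\omega$, hence $\chi_V^\omega$ is upper semi-continuous as the infimum of a family of upper semi-continuous functions.

(3) Let $\mathcal{E}$ be the weak-$\star$ closure of $I(\Sigma_d)$. Then $\mathcal{C}^\star \setminus \mathcal{E}$ is an open set. Since $I(\lambda_{\mathbf{X}}) \in \mathcal{E}$ for every matrix tuple $\mathbf{X}$, we have $\Gamma^{(N)}(\mathcal{C}^\star \setminus \mathcal{E}) = \varnothing$. Hence, if $\nu \in \mathcal{C}^\star \setminus \mathcal{E}$, we have
\[ \chi_V^\omega(\nu) \leq \lim_{N \to \omega} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{C}^\star \setminus \mathcal{E})) = -\infty. \]
Thus, by contrapositive, if $\chi_V^\omega(\nu) > -\infty$, then $\nu \in \mathcal{E}$.

Clearly, if $\nu \in I(\Sigma_d)$, then $\nu(1) = 1$, $\nu(f) \geq 0$ for $f \geq 0$, and $\nu(fg) = \nu(f)\nu(g)$ whenever $f$, $g$, and $fg$ are in $\mathcal{C}$. Since these conditions are given by equalities or non-strict inequalities of quantities that are weak-$\star$ continuous functions of $\nu$, they also hold for $\nu$ in the closure of $I(\Sigma_d)$.

Proposition 7.10.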

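Suppose that $V \in \mathcal{C}$ and $V \geq aV_0 + b$ for some $a > 0$ and $b \in \mathbb{R}$, and let $\omega \in \beta\mathbb{N} \setminus \mathbb{N}$. Then $\frac{1}{N^2} \log Z_V^{(N)} + d \log N$ is bounded as $N \to \infty$. Moreover, the quantity
\[ \chi_V^\omega(\nu) + \nu(V) + \lim_{N \to \omega} \left( \frac{1}{N^2} \log Z_V^{(N)} + d \log N \right) \tag{7.2} \]
is independent of $V$, so long as $V \geq aV_0 + b$ for some $a > 0$ and $b \in \mathbb{R}$. Denoting this quantity by $\chi^\omega(\nu)$, we have
\[ \chi^\omega(\nu) \leq \frac{d}{2} \log^+ \frac{2\nu(V_0)}{d} + \frac{d}{2} \log 2\pi e, \tag{7.3} \]
where $\log^+(t) = \max(0, \log t)$.

Proof. Recall that $V_0(\mathbf{X}) = (1/2)\|\mathbf{X}\|_2^2$. Let $\sigma_{d,a}^{(N)}$ be the Gaussian measure on $M_N(\mathbb{C})_{sa}^d$ given by
\[ d\sigma_{d,a}^{(N)}(\mathbf{X}) = \frac{1}{Z_{aV_0}^{(N)}} e^{-N^2 a V_0(\mathbf{X})}\, d\mathbf{X}, \quad \text{where } Z_{aV_0}^{(N)} = \int e^{-N^2 a V_0(\mathbf{X})}\, d\mathbf{X}. \]
Since $M_N(\mathbb{C})_{sa}^d$ is a real inner product space of dimension $dN^2$, we have from a well-known computation that
\[ Z_{aV_0}^{(N)} = \left( \sqrt{2\pi / N^2 a} \right)^{dN^2} = \frac{(2\pi)^{dN^2/2}}{a^{dN^2/2} N^{dN^2}}, \]
hence
\[ \frac{1}{N^2} \log Z_{aV_0}^{(N)} + d \log N = \frac{d}{2} \log \frac{2\pi}{a}. \]
We assumed that $V \in \mathcal{C}$ and $V \geq aV_0 + b$. Since $V \in \mathcal{C}$, we also have $V \leq AV_0 + B$ for some $A > 0$ and $B \in \mathbb{R}$. Thus,
\[ e^{-N^2 A V_0} e^{-N^2 B} \leq e^{-N^2 V} \leq e^{-N^2 a V_0} e^{-N^2 b}. \]
Hence, $Z_{AV_0}^{(N)} e^{-N^2 B} \leq Z_V^{(N)} \leq Z_{aV_0}^{(N)} e^{-N^2 b}$ and
\[ -B + \frac{d}{2} \log \frac{2\pi}{A} \leq \frac{1}{N^2} \log Z_V^{(N)} + d \log N \leq -b + \frac{d}{2} \log \frac{2\pi}{a}, \]
which proves the first claim about boundedness.

Next, to show that (7.2) is independent of $V$, consider two potentials $V_1$ and $V_2$ satisfying the given assumptions. Let $\mathcal{U}$ be a weak-$\star$ neighborhood of $\nu$ in $\mathcal{C}^\star$ such that $\psi(V_1 - V_2)$ is bounded for $\psi \in \mathcal{U}$. Then
\begin{align*}
\mu_{V_1}^{(N)}(\Gamma^{(N)}(\mathcal{U})) &= \frac{1}{Z_{V_1}^{(N)}} \int_{\Gamma^{(N)}(\mathcal{U})} e^{-N^2 V_1(\mathbf{X})}\, d\mathbf{X} \\
&\leq \frac{1}{Z_{V_1}^{(N)}} e^{N^2 \sup_{\psi \in \mathcal{U}} \psi(V_2 - V_1)} \int_{\Gamma^{(N)}(\mathcal{U})} e^{-N^2 V_2(\mathbf{X})}\, d\mathbf{X} \\
&= \frac{Z_{V_2}^{(N)}}{Z_{V_1}^{(N)}} e^{N^2 \sup_{\psi \in \mathcal{U}} \psi(V_2 - V_1)} \mu_{V_2}^{(N)}(\Gamma^{(N)}(\mathcal{U})).
\end{align*}
Thus,
\[ \frac{1}{N^2} \log \mu_{V_1}^{(N)}(\Gamma^{(N)}(\mathcal{U})) + \frac{1}{N^2} \log Z_{V_1}^{(N)} + d \log N \leq \frac{1}{N^2} \log \mu_{V_2}^{(N)}(\Gamma^{(N)}(\mathcal{U})) + \frac{1}{N^2} \log Z_{V_2}^{(N)} + d \log N + \sup_{\psi \in \mathcal{U}} \psi(V_2 - V_1). \]
Taking the limit $N \to \omega$ and then the limit as $\mathcal{U}$ shrinks to $\nu$ (see Observation 7.7), we have
\[ \chi_{V_1}^\omega(\nu) + \lim_{N \to \omega} \left( \frac{1}{N^2} \log Z_{V_1}^{(N)} + d \log N \right) \leq \chi_{V_2}^\omega(\nu) + \lim_{N \to \omega} \left( \frac{1}{N^2} \log Z_{V_2}^{(N)} + d \log N \right) + \nu(V_2 - V_1). \]
Add $\nu(V_1)$ to both sides and observe that the same result holds with $V_1$ and $V_2$ switched, which proves that (7.2) yields the same value for $V_1$ and $V_2$.

To prove (7.3), we will use the potential $V_0$ for the computation of $\chi^\omega$. The associated measure $\mu_{V_0}^{(N)} = \sigma_{d,1}^{(N)}$ gives a Gaussian random variable $S^{(N)}$ in $M_N(\mathbb{C})_{sa}^d$ with mean zero and covariance matrix $N^{-2} I$. Now for $R > 1$, the change of variables $\mathbf{X} = R\mathbf{Y}$ gives
\begin{align*}
\int_{\|\mathbf{X}\|_2 > d^{1/2} R} e^{-N^2 \|\mathbf{X}\|_2^2 / 2}\, d\mathbf{X} &= \int_{\|\mathbf{Y}\|_2 > d^{1/2}} R^{dN^2} e^{-N^2 R^2 \|\mathbf{Y}\|_2^2 / 2}\, d\mathbf{Y} \\
&= R^{dN^2} \int_{\|\mathbf{Y}\|_2 > d^{1/2}} e^{-N^2 \|\mathbf{Y}\|_2^2 / 2} e^{-N^2 (R^2 - 1) \|\mathbf{Y}\|_2^2 / 2}\, d\mathbf{Y} \\
&\leq R^{dN^2} e^{-dN^2 (R^2 - 1)/2} \int_{M_N(\mathbb{C})_{sa}^d} e^{-N^2 \|\mathbf{X}\|_2^2 / 2}\, d\mathbf{X},
\end{align*}
so that
\[ \sigma_{d,1}^{(N)}(\{\mathbf{X} : \|\mathbf{X}\|_2 \geq d^{1/2} R\}) \leq \left( R e^{-(R^2 - 1)/2} \right)^{dN^2}. \]
(This can also be deduced from the Chernoff bound for the chi-squared distribution.) Hence, for $R > 1$,
\[ \frac{1}{N^2} \log \sigma_{d,1}^{(N)}(\{\mathbf{X} : \|\mathbf{X}\|_2 \geq d^{1/2} R\}) \leq d \left( \log R - \frac{1}{2}(R^2 - 1) \right). \]
Let $\nu \in \mathcal{C}^\star$ and assume that $R_0 = \sqrt{2\nu(V_0)/d} > 1$. Let $1 < R < \sqrt{2\nu(V_0)/d}$. Then let $\mathcal{U} = \{\psi \in \mathcal{C}^\star : 2\psi(V_0)/d > R^2\}$. Thus, $\Gamma^{(N)}(\mathcal{U}) = \{\mathbf{X} \in M_N(\mathbb{C})_{sa}^d : \|\mathbf{X}\|_2 > d^{1/2} R\}$. Hence,
\[ \chi_{V_0}^\omega(\nu) \leq \lim_{N \to \omega} \frac{1}{N^2} \log \sigma_{d,1}^{(N)}(\Gamma^{(N)}(\mathcal{U})) \leq d \left( \log^+ R - \frac{1}{2}(R^2 - 1) \right). \tag{7.4} \]
Taking $R \to \sqrt{2\nu(V_0)/d}$, we obtain
\[ \chi_{V_0}^\omega(\nu) \leq \frac{d}{2} \log^+ \frac{2\nu(V_0)}{d} - \nu(V_0) + \frac{d}{2}. \tag{7.5} \]
In the case where $\nu(V_0) = d/2$, there is nothing to prove since the right-hand side is zero. The case where $\nu(V_0) < d/2$ can be handled using the estimate
\[ \mu_{V_0}^{(N)}(\{\mathbf{X} : \|\mathbf{X}\|_2 < d^{1/2} R\}) \leq R^{dN^2} e^{-dN^2 (R^2 - 1)/2} \]
for $R < 1$, which is obtained by a similar change of variables as we used for the other case. Now (7.3) follows easily from (7.5) because
\[ \lim_{N \to \omega} \left( \frac{1}{N^2} \log Z_{V_0}^{(N)} + d \log N \right) = \frac{d}{2} \log 2\pi. \]

Proposition 7.11.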

Let V ∈ C with V ≥ aV + b , and let ω ∈ β N \ N . Then χ ωV achieves a maximum of zeroin C ⋆ . If F is a weak- ⋆ closed subset of C ⋆ , then χ ωV achieves a maximum on F . inf U⊇F open lim N → ω N log µ ( N ) V (Γ ( N ) ( U )) = max ν ∈F χ ωV ( ν ) . (7.6) In particular, the maximum of χ ωV over C ⋆ is achieved and the maximum is zero. Thus, a free Gibbs lawexists for V with respect to ω .Proof. Let F be a given closed set, and let us prove that the maximum is achieved in F . If χ ωV is identically −∞ on F , then there is nothing to prove, so assume that ν ∈ F with χ ωV ( ν ) > −∞ .84n order to restrict our attention to a compact set, we ﬁrst exclude a neighborhood of ∞ from achievingthe maximum. By our assumption that V ≥ aV + b , we have by the similar reasoning as in the previousproposition that µ ( N ) V (Γ ( N ) ( ν ( V ) > dR )) ≤ Z ( N ) aV + b Z ( N ) V µ ( N ) aV + b (Γ ( N ) ( ν ( V ) > dR )) ≤ Z ( N ) aV + b Z ( N ) V µ ( N ) V (Γ ( N ) ( ν ( V ) > adR ))and hence for R > a − / ,lim N → ω N log µ ( N ) V (cid:16) Γ ( N ) ( ν ( V ) > daR / (cid:17) ≤ lim N → ω N log Z ( N ) aV + b Z ( N ) V + d (cid:18) log a / R −

12 ( aR − (cid:19) . Let C = lim N → ω N log Z ( N ) aV + b Z ( N ) V , which exists by the previous proposition. Fix R suﬃciently large that C + d (log a / R − (1 / aR − <χ ωV ( ν ).Let E be the weak- ⋆ closure of I (Σ d ), and let K = F ∩ E ∩ { ν : ν ( V ) ≤ daR / } . (7.7)Then K is weak- ⋆ closed. Moreover, K is contained in the ball of radius (1 + M ) in C ⋆ . Indeed, if k f k C ≤ − (1 + V ) ≤ Re f ≤ (1 + V ). Since ν ∈ E , it is unital and positive and hence − (1 + ν ( V )) ≤ Re ν ( f ) ≤ ν ( V ) . Since we can multiply f by any complex number of modulus 1, we have | ν ( f ) | ≤ M . By Banach-Alaoglu,the ball of radius 1 + M is weak- ⋆ compact, hence K is weak- ⋆ compact.Since χ ωV is weak- ⋆ upper semi-continuous, it achieves a maximum on K . In fact, this is the maximumover all of F . Indeed, if ν is not in E , then χ ωV ( ν ) = −∞ . Moreover, if ν ( V ) > daR /

2, then by our choiceof R , χ ωV ( ν ) ≤ C + log a / R −

12 ( aR − < χ ωV ( ν ) ≤ max K χ ωV . Thus, the maximum over K is the maximum over F .Next, we prove (7.6). The inequality ≥ is immediate because every neighborhood U of F is also aneighborhood of each ν ∈ F . To prove the opposite inequality, ﬁx M > max F χ ωV . (Here the maximum of χ ωV on F is allowed to be −∞ .) Choose R suﬃciently large that C + log a / R − ( aR − < M , and let K be given again by (7.7). For each ν ∈ K , there is a neighborhood U ν such thatlim N → ω N log µ ( N ) V (Γ ( N ) ( U ν )) < M. By compactness, we may choose ﬁnitely many ν , . . . , ν k such that the neighborhoods U j = U ν j cover K .Let U = { ν : ν ( V ) > daR / } , U = E c ∪ k [ j =0 U j . Since Γ ( N ) ( E c ) = ∅ , we have Γ ( N ) ( U ) = k [ j =0 Γ ( N ) ( U j ) . j = 0, . . . , k , we have lim N → ω (1 /N ) log µ ( N ) V (Γ ( N ) ( U j )) < M , so for N suﬃciently close to ω , µ ( N ) V (Γ ( N ) ( U j )) < e − N M . Thus, µ ( N ) V (Γ ( N ) ( U )) < ( k + 1) e − N M . This implies that lim N → ω N log µ ( N ) V (Γ ( N ) ( U )) ≤ M . Since M > max F χ ωV was arbitrary, (7.6) holds.By considering F = C ⋆ , we see that χ ωV achieves a maximum. Moreover,0 = lim N → ω µ ( N ) V (Γ ( N ) ( C ⋆ )) ≤ max χ ωV ≤ . Corollary 7.12.

If there is a unique free Gibbs law $\nu$ for $V$ with respect to $\omega$, then for every weak-$\star$ neighborhood $\mathcal{U}$ of $\nu$, we have
\[ \lim_{N \to \omega} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{U})^c) < 0. \]

Proof. Note that $\mathcal{U}^c$ is closed and so $\chi_V^\omega$ achieves a maximum on this set, which must be strictly less than $\chi_V^\omega(\nu) = 0$. Hence, the claim follows from the previous proposition.

Next, we will prove a change-of-variables formula for free entropy for $\nu \in \mathcal{C}^\star$. Since $\nu$ is only in $\mathcal{C}^\star$ rather than $\Sigma_d$, we will assume that the transport function $f$ and its inverse have bounded derivatives. We begin by describing the action of diffeomorphisms on $\mathcal{C}$ and $\mathcal{C}^\star$, along the same lines as Lemma 5.6.

Lemma 7.13.

(1) There is a right group action $\mathcal{C} \times \operatorname{BDiff}(\mathbb{R}^{*d}) \to \mathcal{C}$ given by $(h, f) \mapsto h \circ f$. Each element of $\operatorname{BDiff}(\mathbb{R}^{*d})$ induces a Banach-space automorphism of $\mathcal{C}$.

(2) There is a left group action of $\operatorname{BDiff}(\mathbb{R}^{*d})$ on $\mathcal{C}^\star$ by weak-$\star$ homeomorphisms given by $(f_* \nu)(h) = \nu(h \circ f)$.

(3) There is a left group action of $\operatorname{BDiff}(\mathbb{R}^{*d})$ on the set of potentials $V \in \mathcal{C}$ satisfying $V \geq aV_0 + b$ for some $a > 0$ and $b \in \mathbb{R}$, given by $f_* V = V \circ f^{-1} - \log \Delta(\partial f^{-1})$.

Proof. (1) Let $f \in \operatorname{BDiff}(\mathbb{R}^{*d})$. If $g, h \in C_{\operatorname{tr}}^1(\mathbb{R}^{*d})$ have bounded first derivatives, then so do $g \circ f$ and $h \circ f$. Thus, $\operatorname{tr}(gh) \circ f \in \mathcal{C}$. Recall that linear combinations of functions of the form $\operatorname{tr}(gh)$ are dense in $\mathcal{C}$ by definition. Thus, to show that precomposition with $f$ maps $\mathcal{C}$ into $\mathcal{C}$, it suffices to show that
\[ \|(u \circ f)/(1 + V_0)\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d})} \leq C \|u/(1 + V_0)\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d})} \]
for some constant $C$. Because $f$ is $\|\cdot\|_2$-Lipschitz, we obtain $\|f(\mathbf{X})\|_2 \leq a' \|\mathbf{X}\|_2 + b'$ for some constants $a'$ and $b'$. It follows that $1 + V_0 \circ f \leq C(1 + V_0)$ for some $C > 0$, so that $1/(1 + V_0) \leq C/(1 + V_0 \circ f)$, which implies the desired bound. The associativity property of this action is clear. It follows that the action of $f$ defines a Banach-space automorphism of $\mathcal{C}$.

(2) The map $f_*: \mathcal{C}^\star \to \mathcal{C}^\star$ is simply the adjoint of the map $h \mapsto h \circ f$ and thus it is weak-$\star$ continuous. Since the same considerations apply to $f^{-1}$, the inverse map $(f^{-1})_*$, which is the adjoint of $h \mapsto h \circ f^{-1}$, is also weak-$\star$ continuous.

(3) This follows by similar reasoning as Lemma 5.6. Note that $\log \Delta(\partial f^{-1})$ has bounded first derivative and therefore is in $\mathcal{C}$.

Proposition 7.14.

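Let $V \in \mathcal{C}$ with $V \geq aV_0 + b$ for some $a > 0$ and $b \in \mathbb{R}$, let $\nu \in \mathcal{C}^\star$, and let $f \in \operatorname{BDiff}(\mathbb{R}^{*d})$. Then we have the following relations:
\[ \lim_{N \to \omega} \frac{1}{N^2} \log \frac{Z_{f_* V}^{(N)}}{Z_V^{(N)}} = 0, \tag{7.8} \]
\[ \chi_{f_* V}^\omega(f_* \nu) = \chi_V^\omega(\nu), \tag{7.9} \]
\[ \chi^\omega(f_* \nu) = \chi^\omega(\nu) + \nu[\log \Delta(\partial f)]. \tag{7.10} \]
In particular, $\nu$ is a free Gibbs law for $V$ if and only if $f_* \nu$ is a free Gibbs law for $f_* V$ (both with respect to the given $\omega$), and hence $V$ has a unique free Gibbs law if and only if $f_* V$ has a unique free Gibbs law.

Proof. As an intermediate step to proving (7.8) and (7.9), we will show that for $\nu \in \mathcal{C}^\star$, we have
\[ \chi_V^\omega(\nu) = \chi_{f_* V}^\omega(f_* \nu) + \lim_{N \to \omega} \frac{1}{N^2} \log \frac{Z_{f_* V}^{(N)}}{Z_V^{(N)}}. \tag{7.11} \]
Let $\mathcal{U}$ be a neighborhood of $f_* \nu$ in $\mathcal{C}^\star$ and let $\mathcal{V} = (f_*)^{-1}(\mathcal{U})$, which is a neighborhood of $\nu$. Let $g = f^{-1}$. Observe that by change of variables,
\begin{align*}
\int_{\Gamma^{(N)}(\mathcal{V})} e^{-N^2 V^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X})}\, d\mathbf{X} &= \int_{\Gamma^{(N)}(\mathcal{U})} e^{-N^2 (V \circ g)^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X})} \left| \det[\partial g]^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X}) \right| d\mathbf{X} \\
&= \int_{\Gamma^{(N)}(\mathcal{U})} \exp\left( -N^2 \left( (V \circ g)^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X}) - \frac{1}{N^2} \log \left| \det[\partial g]^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X}) \right| \right) \right) d\mathbf{X}.
\end{align*}
By choosing $\mathcal{U}$ small enough, we may guarantee that $\|\mathbf{X}\|_2$ is uniformly bounded on $\Gamma^{(N)}(\mathcal{U})$ independently of $N$. Hence, since $\partial g \in BC_{\operatorname{tr}}(\mathbb{R}^{*d}, \mathscr{M}^1)^d$, by Lemma 4.37, we have
\[ \lim_{N \to \omega} \sup_{\mathbf{X} \in \Gamma^{(N)}(\mathcal{U})} \left| \frac{1}{N^2} \log \left| \det[\partial g]^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X}) \right| - (\log \Delta(\partial g))^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X}) \right| = 0. \]
Therefore,
\[ \lim_{N \to \omega} \frac{1}{N^2} \left( \log \int_{\Gamma^{(N)}(\mathcal{V})} e^{-N^2 V^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X})}\, d\mathbf{X} - \log \int_{\Gamma^{(N)}(\mathcal{U})} e^{-N^2 (f_* V)^{M_N(\mathbb{C}),\operatorname{tr}_N}(\mathbf{X})}\, d\mathbf{X} \right) = 0. \]
This implies
\[ \lim_{N \to \omega} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{V})) = \lim_{N \to \omega} \frac{1}{N^2} \log \mu_{f_* V}^{(N)}(\Gamma^{(N)}(\mathcal{U})) + \lim_{N \to \omega} \frac{1}{N^2} \log \frac{Z_{f_* V}^{(N)}}{Z_V^{(N)}}. \]
Then we take the limit as $\mathcal{U}$ shrinks to $f_* \nu$, which is equivalent to $\mathcal{V}$ shrinking to $\nu$, since $f_*$ is a weak-$\star$ homeomorphism. This yields (7.11).

By Proposition 7.11, the maximum of $\chi_V^\omega$ and the maximum of $\chi_{f_* V}^\omega$ are both equal to zero. This fact, together with (7.11) and the fact that $f_*$ is a bijection on $\mathcal{C}^\star$, implies (7.8). Then substituting (7.8) back into (7.11) produces (7.9). Next, from the definition of $\chi^\omega$ in (7.2) together with (7.8) and (7.9), we have
\begin{align*}
\chi^\omega(f_* \nu) &= \chi^\omega(\nu) - \nu(V) + (f_* \nu)(f_* V) \\
&= \chi^\omega(\nu) - \nu(V) + (f_* \nu)(V \circ g) - (f_* \nu)(\log \Delta(\partial g)) \\
&= \chi^\omega(\nu) - \nu(\log \Delta(\partial g \circ f)) \\
&= \chi^\omega(\nu) + \nu(\log \Delta(\partial f)),
\end{align*}
since $(f_* \nu)(V \circ g) = \nu(V)$ and $\partial g \circ f$ is the inverse of $\partial f$, and this proves (7.10). Then from (7.9), it follows immediately that $\nu$ is a free Gibbs law for $V$ if and only if $f_* \nu$ is a free Gibbs law for $f_* V$.

Next, by applying the change-of-variables formula to diffeomorphisms obtained from flows along vector fields, we will show that any maximizer of $\chi_V^\omega$ must satisfy a certain "integration-by-parts" relation.

Proposition 7.15.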

Let V ∈ C ∩ tr( C ( R ∗ d )) sa satisﬁes | ∂V A ,τ ( X )[ Y ] | ≤ ( a + b k X k ) k Y k ∞ | ∂ V A ,τ ( X )[ Y , Y ] | ≤ ( a + b k X k ) k Y k ∞ k Y k ∞ for some constants a , b , a , b > . Suppose that ν is a free Gibbs law for V with respect to ω . Then forall h ∈ C ( R ∗ d ) d with ∂ h ∈ BC ( R ∗ d , M ) d , we have ν ( ∂V h − Tr ( ∂ h )) = 0 . (7.12)87 emark . The hypotheses are chosen so that if V satisﬁes the hypotheses and g ∈ BDiﬀ ( R ∗ d ), then V ◦ g also satisﬁes the hypotheses. This is straightforward to verify from the fact that log ∆ g − has boundedﬁrst and second derivatives, while ∂ ( V ◦ g − ) = ∂V ( g − ) ∂ g − and ∂ ( V ◦ g − ) = ∂ V ( g − ) ∂ g − , ∂ g − ] + ∂V ( g − ) ∂ g − . Furthermore, the hypotheses are satisﬁed in the case where ∇ V − id is bounded and ∂ V is bounded, whichis the case we usually focus on in this paper. Proof of Proposition 7.15.

By linearity, it suﬃces to prove (7.12) in the case where h is self-adjoint.Let f t and g t be the functions constructed by Lemma 5.8 by taking h t ≡ h , and note that f t ∈ BDiﬀ ( R ∗ d ).Hence, by (7.10), χ ω (( f t ) ∗ ν ) = χ ω ( ν ) + ν (log ∆ ( ∂ f t ))Since ν is a free Gibbs law for V , we have χ ωV (( f t ) ∗ ν ) ≤ χ ωV ( ν ). Since χ ωV ( ν ) is equal to χ ω ( ν ) − ν ( V ) plus aconstant, this amounts to0 ≤ ( f t ) ∗ ν ( V ) − ν ( V ) − ν (log ∆ ( ∂ f t )) = ν ( V ◦ f t − V − log ∆ ( ∂ f t )) . We claim that lim t → + ( V ◦ f t − V − log ∆ ( ∂ f t )) = ∂V h − Tr ( ∂ h ) in C . (7.13)To prove this, let us ﬁrst derive error bounds for the Taylor expansion of t f t as t → + . Note that k f t − id k BC tr ( R ∗ d ) d ≤ Z t k h ◦ f u k BC tr ( R ∗ d ) d du ≤ t k h k BC tr ( R ∗ d ) d . This implies that k h ◦ f t − h k BC tr ( R ∗ d ) d ≤ k ∂ h k BC tr ( R ∗ d , M ) d k f t − id k BC tr ( R ∗ d ) d ≤ t k ∂ h k BC tr ( R ∗ d , M ) d k h k BC tr ( R ∗ d ) d . Hence, k f t − id − t h k BC tr ( R ∗ d ) d ≤ Z t k h ◦ f u − h k BC tr ( R ∗ d ) d du ≤ t k ∂ h k BC tr ( R ∗ d , M ) d k h k BC tr ( R ∗ d ) d . By Taylor expansion, we have V ◦ f t − V = ∂V f t − id] + 12 Z ∂ V ∗ ((1 − s ) id + s f t ) f t − id , f t − id] ds. Since ∂ f t is bounded, we have k f A ,τt ( X ) k ≤ a + b k X k for some constants a and b . Hence, (cid:12)(cid:12)(cid:12)(cid:12)Z ∂ V ∗ ((1 − s ) id + s f t ) A ,τ ( X ) f A ,τt ( X ) − X , f A ,τt ( X ) − X ] ds (cid:12)(cid:12)(cid:12)(cid:12) ≤ ( a + b (1+ a + b k X k ) ) t k h k BC tr ( R ∗ d ) d sa . Therefore, this term is O ( t ) in C . So computing the limit of (1 /t )( V ◦ f t − V ) in C is equivalent to computingthe limit of (1 /t ) ∂V f t − id). Our earlier estimates show that f t − id t → h in BC tr ( R ∗ d ) d . Combining this with our hypothesis on ∂V , we get thatlim t → + t ( V ◦ f t − V ) = lim t → + t h∇ V, f t − id i tr = h∇ V, h i tr = ∂V h in C . ∂ f t − Id = Z t ( ∂ h ◦ f u ) ∂ f u du. 
Recall k ∂ f t k BC tr ( R ∗ d , M ) d ≤ exp( t k h k BC tr ( R ∗ d , M ) d ) . Plugging this into the integral, we obtain ∂ f t = Id + O ( t ) in BC tr ( R ∗ d , M ) d . Then because ∂ h is bounded, we get ∂ h ◦ f u ∂ f u = ∂ h O ( u )and thus ∂ f t − I = Z t ( ∂ h + O ( u )) du = t∂ h + O ( t )in BC tr ( R ∗ d , M ( R ∗ d )) d . If the right-hand side is strictly smaller than 1, then we may evaluatelog ∆ ( ∂ f t ) = 12 Tr ∞ X m =1 ( − m +1 m (( ∂ f t ) ✶ ∂ f t − I ) m ! = t Tr ( ∂ h + ∂ h ✶ ) + O ( t ) . Therefore, by the same reasoning as in Lemma 5.7lim t → + t log ∆ ( ∂ f t ) = Tr ( ∂ h ) in BC tr ( R ∗ d , M ) d , and hence the same limit also holds in C . This completes the proof of (7.13).It follows from (7.13) that ν ( V ◦ f t − V + log ∆ ( ∂ f t )) ≥ . But the same argument applies with − h instead of h , so that (7.12) holds. The equation (7.12) is sometimes called the

Dyson-Schwinger equation. Together with the considerations of the previous section, it leads to Corollary 7.17 below. In the classical setting, this relation can be proved directly using integration by parts.
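For orientation, the classical counterpart of (7.12) can be written out explicitly. The sketch below assumes a probability density proportional to $e^{-V}$ on $\mathbb{R}^d$ and a smooth vector field $h$ with enough decay that the divergence theorem produces no boundary term; the normalizing constant $Z$ and the symbol $\operatorname{div}$ are notation introduced here for illustration only.

```latex
% Classical analog of the Dyson-Schwinger equation (illustrative sketch).
% Assumptions: d\mu(x) = Z^{-1} e^{-V(x)} dx on R^d, and h is a smooth
% vector field such that h e^{-V} is integrable with integrable divergence.
\begin{align*}
\operatorname{div}\bigl(h\, e^{-V}\bigr)
  &= \bigl(\operatorname{div} h - \langle \nabla V, h \rangle\bigr)\, e^{-V}, \\
\int_{\mathbb{R}^d} \bigl(\langle \nabla V, h \rangle - \operatorname{div} h\bigr)\, d\mu
  &= -\frac{1}{Z} \int_{\mathbb{R}^d} \operatorname{div}\bigl(h\, e^{-V}\bigr)\, dx
   = 0.
\end{align*}
```

In this analogy, $\langle \nabla V, h \rangle$ corresponds to the term pairing $\partial V$ with $h$ in (7.12), and $\operatorname{div} h$ corresponds to $\operatorname{Tr}(\partial h)$.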

Corollary 7.17.

Let $\mathcal{E}$ be the weak-$\star$ closure of $I(\Sigma_d)$ in $\mathcal{C}^\star$. Suppose that there is a unique $\nu \in \mathcal{E}$ satisfying (7.12). Then for every neighborhood $\mathcal{U}$ of $\nu$ in $\mathcal{C}^\star$, we have
\[ \limsup_{N \to \infty} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{U})^c) < 0. \]
More generally, if $f \in \mathcal{C}$ and $\nu(f) = c$ for every $\nu$ satisfying (7.12), then for every $\epsilon > 0$, we have
\[ \limsup_{N \to \infty} \frac{1}{N^2} \log \mu_V^{(N)}(\{\mathbf{X} : |f(\mathbf{X}) - c| \geq \epsilon\}) < 0. \]

Proof. For each $\omega \in \beta\mathbb{N} \setminus \mathbb{N}$, a free Gibbs law must satisfy (7.12). Thus, $\nu$ is the unique free Gibbs law with respect to $\omega$, so that for each neighborhood $\mathcal{U}$ of $\nu$, we have
\[ \lim_{N \to \omega} \frac{1}{N^2} \log \mu_V^{(N)}(\Gamma^{(N)}(\mathcal{U})^c) < 0. \tag{7.14} \]
But since this holds for every $\omega$, it must also hold for the $\limsup$ as $N \to \infty$. For the second claim, let $\mathcal{U} = \{\nu : |\nu(f) - c| < \epsilon\}$. For each $\omega$, the entropy $\chi_V^\omega$ achieves a maximum on $\mathcal{U}^c$ that is strictly less than zero. Thus, (7.14) also holds, and we conclude as before.

Amazingly, for a potential $V_0 + W$ with $\partial W$ and $\partial^2 W$ bounded, the Dyson-Schwinger equation is enough to guarantee that an element of $\mathcal{E}$ actually agrees with a law in $\Sigma_d$ with an explicit bound on the "support radius."

Theorem 7.18.

Let k ≥ . Let V = V + W ∈ tr( C ( R ∗ d )) with ∂W ∈ BC ( R ∗ d , M ( R ∗ d )) . Suppose that ν ∈ E satisﬁes ν ( ∂V h − Tr ( ∂ h )) = 0 for h ∈ C k tr ( R ∗ d ) d with ∂ h ∈ BC k tr ( R ∗ d , M ( R ∗ d )) d . (7.15) Then there exists ( A , τ ) ∈ W and X ∈ A d sa such that ν = I ( λ X ) and k X k ∞ ≤ C (cid:16) k ∂W k BC tr ( R ∗ d , M ) + q k ∂W k BC tr ( R ∗ d , M ) + 4 (cid:17) , (7.16) where C is a universal constant. Moreover, (7.15) holds for all h ∈ C k tr ( R ∗ d ) .Proof. GNS Construction:

Let B be the set of functions f ∈ BC tr ( R ∗ d ) such that f is uniformly k·k -continuous on each k·k -ball. Note that B is a C ∗ -subalgebra of BC tr ( R ∗ d ). Moreover, we may deﬁne a trace τ on B by τ ( f ) = ν [tr( f )] , which makes sense because tr( f ) ∈ C . Let H τ be the GNS Hilbert space associated to B and τ , that is, theseparation-completion of B with respect to h· , ·i τ . Let π τ : B → B ( H τ ) be the GNS representation. Recall τ passes to a well-deﬁned faithful trace on π τ ( B ), and π τ ( B ) can be completed to a W ∗ -algebra A ⊆ B ( H τ ),and we will denote the associated trace also by τ by a slight abuse of notation. Bump functions:

Let ρ ∈ C ∞ c ( R ) be a nonnegative symmetric function supported in [ − ,

1] whichintegrates to 2. Then let ψ ( t ) = R t ρ , so that b ψ ( s ) = b ρ ( s ) / πis . As in § ψ ( x j ) denote the functionin C tr ( R ∗ d ) given by [ ψ ( x j )] A ,τ ( X ) = ψ ( X j ) for X ∈ ( A ) d sa for ( A , τ ) ∈ W ; here x j denotes a formalself-adjoint variable while X j denotes an operator from ( A , τ ) as in our notation for trace polynomials. Itfollows from Lemma 4.14 that ψ ( x j ) ∈ C ( R ∗ d ), and we have k ∂ψ ( x j ) k BC tr ( R , M ) ≤ Z R | b ρ ( s ) | ds. In particular, ψ is uniformly k·k -Lipschitz and hence ψ ( x j ) is in C for j = 1, . . . , d . Let φ R ( t ) = R ( ψ ( t/R + 1) + 1)Note that φ R ≥

0. Since φ R is deﬁned by scaling and translation of ψ , we obtain that ∂ [ φ R ( x j )] = ∂ψ ( x j /R + 1) , and hence k ∂φ R ( x j ) k BC tr ( R ∗ , M ) ≤ Z R | b ρ ( s ) | ds. So φ R ( x j ) ∈ B . In fact, since ρ ∈ C ∞ c ( R ), we have φ R ( x j ) ∈ BC ∞ tr ( R ∗ d ). Application of Dyson-Schwinger equation:

Recall that V = V + W , hence ∇ V = ∇ V + ∇ W =id + ∇ W , and thus for n ∈ N , we have ν ( h x j , φ R ( x j ) n i tr ) = ν ( h∇ x j W, φ R ( x j ) n i tr ) + ν (Tr ( ∂ ( φ R ( x j ) n ))) . (7.17)Note that x j φ R ( x j ) n is obtained by applying a C ∞ c ( R ) function to x j and hence is in C . Thus, ν ( h x j , φ R ( x j ) n i tr ) = ν (tr( x j φ R ( x j ) n )) = τ ( x j φ R ( x j ) n )) . Also, φ R ( t ) ≤ t k ρ k L ∞ ( R ) ≤ t k b ρ k L ( R ) , so that φ R ( t ) n +1 ≤ k b ρ k L ( R ) tφ R ( t ) n , which implies that τ ( φ R ( x j ) n +1 ) ≤ k b ρ k L ( R ) τ ( x j φ R ( x j ) n +1 ) = ν ( h x j , φ R ( x j ) n i tr ) . ν ( h∇ x j W, φ R ( x j ) n i tr ) = τ ( ∇ x j W · φ R ( x j ) n ) ≤ k∇ x j W k B τ ( φ R ( x j ) n ) ≤ k ∂W k BC tr ( R ∗ d , M ) , where we have used the fact that φ R ( x j ) n ≥ ∗ -algebra B . Finally, for the second term on theright-hand side of (7.17), observe that by the product rule (which follows from the chain rule Theorem 3.19), ∂ ( φ R ( x j ) n )) = n − X i =0 φ R ( x j ) i ∂ [ φ R ( x j )] φ R ( x j ) n − − i . For f, g ∈ BC tr ( R ∗ d ), we may deﬁne an element E i,j ⊗ f ⊗ g ∈ BC tr ( R ∗ d , M ) d [ E i,j ⊗ f ⊗ g ] A ,τ ( X )[ Y ] i ′ = δ i = i ′ f ( X ) Y j g ( X ) . Note that ( E i,j ⊗ f ⊗ g ) ✶ = E j,i ⊗ f ∗ ⊗ g ∗ and[ E i,j ⊗ f ⊗ g ] E i ′ ,j ′ ⊗ f ⊗ g ] = δ j = i ′ E i,j ′ ⊗ f f ⊗ g g , and Tr ( E i,j ⊗ f ⊗ g ) = δ i = j tr( f ) tr( g ) , which follows from a straightforward computation with free independence. In particular, since φ R ( x j ) ispositive in BC tr ( R ∗ d ), we can write E j,j ⊗ φ R ( x j ) i ⊗ φ R ( x j ) n − − i = [ E j,j ⊗ φ R ( x j ) i/ ⊗ φ R ( x j ) ( n − − i ) / ] , which is positive in BC tr ( R ∗ d , M ). 
Since this is positive and (1 /d ) Tr deﬁnes a tr( BC tr ( R ∗ d ))-valued traceon BC tr ( R ∗ d , M ) d , we obtainTr (cid:0) φ R ( x j ) i ∂ [ φ R ( x j )] φ R ( x j ) n − − i (cid:1) = Tr (cid:0) [ E j,j ⊗ φ R ( x j ) i ⊗ φ R ( x j ) n − − i ] ∂ [ φ R ( x j )] (cid:1) ≤ k ∂ [ φ R ( x j )] k BC tr ( R ∗ d , M ) d Tr (cid:0) E j,j ⊗ φ R ( x j ) i ⊗ φ R ( x j ) n − − i (cid:1) ≤ k b ρ k L ( R ) tr( φ R ( x j ) i ) tr( φ R ( x j ) n − − i ) , where the inequality holds in tr( BC tr ( R ∗ d )). Then using positivity of ν , we have ν (Tr ( ∂ ( φ R ( x j ) n ))) = ν Tr n − X i =0 φ R ( x j ) i ∂ [ φ R ( x j )] φ R ( x j ) n − − i !! ≤ k b ρ k L ( R ) ν n − X i =0 tr( φ R ( x j ) i ) tr( φ R ( x j ) n − − i ! = k b ρ k L ( R ) n − X i =0 τ ( φ R ( x j ) i ) τ ( φ R ( x j ) n − − i . Putting all these inequalities together, (7.17) implies τ ( φ R ( x j ) n +1 ) ≤ k b ρ k L ( R ) k ∂W k BC tr ( R ∗ d , M ) τ ( φ R ( x j ) n ) + k b ρ k L ( R ) n − X i =0 τ ( φ R ( x j ) i ) τ ( φ R ( x j ) n − − i ! . (7.18) Combinatorial estimate:

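We use a similar trick as in [9, proof of Theorem 3.2.1]. Recall that the Catalan numbers are given by
\[ C_n = \frac{1}{n+1} \binom{2n}{n} \]
for $n \geq 0$, and they satisfy the recursive formula
\[ C_{n+1} = \sum_{j=0}^n C_j C_{n-j}. \]
Moreover, $C_n$ is the $2n$th moment of the semicircular measure $\frac{1}{2\pi} \sqrt{4 - x^2}\, \mathbf{1}_{[-2,2]}(x)\, dx$, so that in particular $C_n \leq 4^n$.

Let $M = \|\partial W\|_{BC_{\operatorname{tr}}(\mathbb{R}^{*d}, \mathscr{M}^1)}$, and let
\[ R_0 = \|\widehat{\rho}\|_{L^1(\mathbb{R})} \frac{M + \sqrt{M^2 + 4}}{2}, \]
so that
\[ R_0^2 = \|\widehat{\rho}\|_{L^1(\mathbb{R})} M R_0 + \|\widehat{\rho}\|_{L^1(\mathbb{R})}^2. \]
We claim that for $n \in \mathbb{N}$, we have
\[ \tau(\phi_R(x_j)^n) \leq R_0^n C_n. \]
The base case $n = 0$ is trivial. For the induction step, using (7.18), we get
\begin{align*}
\tau(\phi_R(x_j)^{n+1}) &\leq \|\widehat{\rho}\|_{L^1(\mathbb{R})} \left( M \tau(\phi_R(x_j)^n) + \|\widehat{\rho}\|_{L^1(\mathbb{R})} \sum_{i=0}^{n-1} \tau(\phi_R(x_j)^i) \tau(\phi_R(x_j)^{n-1-i}) \right) \\
&\leq \|\widehat{\rho}\|_{L^1(\mathbb{R})} \left( M R_0^n C_n + \|\widehat{\rho}\|_{L^1(\mathbb{R})} \sum_{i=0}^{n-1} R_0^{n-1} C_i C_{n-1-i} \right) \\
&= \|\widehat{\rho}\|_{L^1(\mathbb{R})} \left( M R_0 + \|\widehat{\rho}\|_{L^1(\mathbb{R})} \right) R_0^{n-1} C_n \\
&= R_0^{n+1} C_n \leq R_0^{n+1} C_{n+1}.
\end{align*}
Here the third line uses the recursion $\sum_{i=0}^{n-1} C_i C_{n-1-i} = C_n$, and the fourth uses the identity for $R_0^2$ above. This completes the induction step. This implies that $\tau(\phi_R(x_j)^n) \leq (4R_0)^n$ for all $n$, and hence
\[ \|\pi_\tau(\phi_R(x_j))\| \leq 4R_0. \]

Choice of operators: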

We claim that if ζ ∈ C c ( R ) with supp( ζ ) ⊆ (4 R , ∞ ), then π τ ( ζ ( x j )) = 0. Tosee this, let R = inf supp( ζ ) > R . Note that for n ∈ N , | ζ | ≤ k ζ k C ( R ) [ R, ∞ ) ≤ k ζ k C ( R ) φ nR R n . Hence, τ ( | ζ ( x j ) | ) ≤ k ζ k C ( R ) τ ( φ R ( x j ) n ) R n ≤ k ζ k C ( R ) (cid:18) R R (cid:19) n . Taking n → ∞ , we see that τ ( ζ ( x j ) ∗ ζ ( x j )) = 0 and hence π τ ( ζ ( x j )) = 0.The same reasoning can be applied with − x substituted for x since the ( − id) ∗ φ will satisfy the Dyson-Schwinger equation with − ∂W ◦ ( − id). Thus, we also have π τ ( ζ ( x j )) = 0 when supp( ζ ) ⊆ ( −∞ , − R ).Let X j = π τ [ η ( x j )] where η ∈ C ∞ c ( R ; R ) is some function with η ( t ) = t for | t | < R + ǫ , for some ǫ > X j is independent of the particular choice of η .Moreover, for any ǫ >

0, we can arrange that k η k C ( R ) ≤ R + ǫ , hence k X j k ≤ k ζ ( x j ) k B ≤ R + ǫ . Since ǫ was arbitrary, we have k X j k ≤ R , which proves (7.16) with C = 2 k b ρ k L ( R ) . Agreement of ν and I ( λ X ) on functions with bounded derivative: We claim that ν [ f ] = f A ,τ ( X )for f ∈ tr( C ( R ∗ d )) with ∂f bounded.Let η ∈ C ∞ c ( R ; R ) with η ( t ) = t for t in a neighborhood of [ − R , R ]. Since π τ is a ∗ -homomorphism,we have for any p ∈ C h x , . . . , x d i that π τ ( p ( η ( x ) , . . . , η ( x d ))) = p ( X , . . . , X d ) , ν [tr( p ( η ( x ) , . . . , η ( x d )))] = τ ( p ( η ( x ) , . . . , η ( x d ))) = τ ( p ( X , . . . , X d )) . Since ν is multiplicative on B and λ X is also multiplicative, it follows that ν [ f ( η ( x ) , . . . , η ( x d ))] = f ( X , . . . , X d )whenever f ∈ tr(TrP( R ∗ d )).Next, consider f ( η ( x ) , . . . , η ( x d )) where f ∈ tr( C ( R ∗ d )) with ∂f bounded. If we choose R > k η k C ( R ) ,then we can approximate f uniformly on the k·k ∞ -ball of radius R by trace polynomials ( f n ) n ∈ N . Since k η ( x j ) k BC tr ( R ∗ d ) < R , this implies that f n ( η ( x ) , . . . , η ( x d )) approximates f ( η ( x ) , . . . , η ( x n )) in tr( BC tr ( R ∗ d ))(and hence in C ), and therefore in this case we still have the identity ν [ f ( η ( x ) , . . . , η ( x d ))] = f ( X , . . . , X d ) = f ( η ( X ) , . . . , η ( X d )) . Keeping f ﬁxed, we use a sequence of functions η R to approximate the identity. We can arrange that η ( t ) is between 0 and t for all t ∈ R and η ( t ) = t for | t | ≤

1. Then let η R ( t ) = Rη ( t/R ). Note that forself-adjoint operator Y from ( A , τ ), we have k η R ( Y ) − Y k ≤ k Y k R .
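The pointwise facts behind this cut-off estimate can be spelled out. The sketch below uses only the stated properties of $\eta$ (it lies between $0$ and $t$, and $\eta(t) = t$ for $|t| \leq 1$). The indicator notation $\mathbf{1}_{\{|t| > R\}}$ is introduced here, and the last line is one possible reading of the operator estimate, under the assumption that an $L^1$-type norm is intended on the left-hand side.

```latex
% Pointwise properties of \eta_R(t) = R\,\eta(t/R) (illustrative sketch).
% Assumption in the last line: the operator estimate is read in the L^1 norm.
\begin{align*}
\eta_R(t) &= t \quad \text{for } |t| \leq R, \\
|\eta_R(t)| &\leq |t| \quad \text{for all } t \in \mathbb{R}, \\
|\eta_R(t) - t| &\leq |t|\, \mathbf{1}_{\{|t| > R\}} \leq \frac{t^2}{R}, \\
\tau\bigl(|\eta_R(Y) - Y|\bigr) &\leq \frac{\tau(Y^2)}{R} = \frac{\|Y\|_2^2}{R}
  \quad \text{(by functional calculus for self-adjoint } Y\text{)}.
\end{align*}
```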

Since ∂f is bounded, we know that f is uniformly k·k -continuous, and hence as R → ∞ , we have f ( η R ( x ) , . . . , η R ( x d )) → f ( x , . . . , x d )uniformly on k·k -balls. Also f is uniformly k·k -continuous and hence for all ( A , τ ) ∈ W , we have k f A ,τ ( Y ) k ≤ A (1 + V A ,τ ( Y )) / for some constant A . We also have k f A ,τ ( η ( Y ) , . . . , η ( Y d )) k ≤ A (1 + V A ,τ ( Y )) / since k η ( Y j ) k ≤ k Y j k . Thus, | f A ,τ ( η R ( Y ) , . . . , η R ( X d )) − f A ,τ ( Y , . . . , Y d ) | / (1 + V A ,τ ( Y )) is bounded by 2 A (1 + V A ,τ ( Y )) − / , which can be made arbitrarily small outside of k·k -ball(independently of ( A , τ )). Therefore,11 + V ( x ) | f ( η R ( x ) , . . . , η R ( x d )) − f ( x , . . . , x d ) | → BC tr ( R ∗ d )). This means that f ( η R ( x ) , . . . , η R ( x d )) → f ( x , . . . , x d ) in C , and therefore, ν ( f ( x , . . . , x d )) = lim R →∞ ν ( f ( η ( x ) , . . . , η ( x d ))) = f A ,τ ( X , . . . , X d ) .I ( λ X ) satisﬁes the Dyson-Schwinger equation (7.15) : Let

R > R . Let ζ ∈ C ∞ c ( R , [0 , − R, R ]. Suppose that h ∈ BC k tr ( R ∗ d ) d . Then d X j =1 ν ◦ tr[ x j h j + h j ∇ x j W ] − ν (Tr ( ∂ h ) = 0 . Because ν ◦ tr agrees with I ( λ X ) on BC ( R ∗ d ) and because ∇ x j W h j and Tr ( ∂ h ) are in BC ( R ∗ d ), wehave ν ◦ tr[ h j ∇ x j W ] = ( h j ∇ x j W ) A ,τ ( X ) ν (Tr ( ∂ h )) = Tr ( ∂ h ) A ,τ ( X ) . The only term that remains to substitute is tr[ x j h j ]. But note that ν ◦ tr[ x j (1 − ζ ( x j )) h j ] ≤ ( ν ◦ tr[ x j ]) / ( ν ◦ tr[(1 − ζ ( x j )) h ∗ j h j (1 − ζ ( x j ))]) / = ( ν ◦ tr( x j )) / ( τ ((1 − ζ ( X j ))( h ∗ j h j ) A ,τ ( X )(1 − ζ ( X j )))) / = 0because (1 − ζ ( x j )) h ∗ j h j (1 − ζ ( x j )) has bounded ﬁrst derivative since h j and ∂h j are bounded. Therefore, ν ◦ tr[ x j h j ] = ν ◦ tr[ ζ ( x j ) x j h j ] = τ [ ζ ( X j ) X j h A ,τj ( X )] = τ [ X j h A ,τj ( X )] , (7.19)93here we have used the fact that ζ ( x j ) x j h j ∈ BC ( R ∗ d ) and ζ ( X j ) X j = X j . This establishes (7.15) when h ∈ BC k tr ( R ∗ d ).However, using smooth cut-oﬀ functions, every h ∈ C k tr ( R ∗ d ) agrees on the ball of radius R with somefunction g in BC k tr ( R ∗ d ). It follows from the deﬁnition of Fr´echet diﬀerentiation that ∂ h = ∂ g on the openball of radius R . Hence, both sides of (7.15) are the same for h and for g . So I ( λ X ) satisﬁes (7.15) for all h ∈ C tr ( R ∗ d ) d as desired. In particular, the last claim of the theorem will be proved as soon as we knowthat ν = I ( λ X ). Agreement of ν and I ( λ X ) on C : Let ζ be as above. Using (7.15) for ν , we have ν ◦ tr[ x j ] = ν ◦ tr[ x j ∇ x j V ( x )] − ν ◦ tr[ x j ∇ x j W ( x )] = 1 − ν ◦ tr[ x j ∇ x j W ( x )] . since ∂ ( x ) = Id ∈ BC ∞ tr ( R ∗ d , M ( R ∗ d )). The same holds for I ( λ X ) because it also satisﬁes (7.15). Hence, ν ◦ tr[ x j ] − ν ◦ tr[ ζ ( x j ) x j ] = ν ( x j ) − τ ( X j ) = − ν ( x j ∇ x j W ( x )) + τ [ X j ∇ x j W ( X )] . 
Because the function h j = ∇ x j W is bounded and has bounded ﬁrst derivative, (7.19) applies and shows that τ [ X j ∇ x j W ( X )] = ν ◦ tr[ x j ∇ x j W ( X )] . Therefore, ν ◦ tr[ x j ] = ν ◦ tr[ ζ ( x j ) x j ]. Now tr[ ζ ( x j ) x j ] ≤ tr[ ζ ( x j ) x j ] ≤ tr[ x j ], hence ν ◦ tr[ ζ ( x j ) x j ] is equalto the common value of ν ◦ tr[ x j ] and ν ◦ tr[ ζ ( x j ) x j ]. This implies that ν ◦ tr[( x j − ζ ( x j ) x j ) ] = ν ◦ tr[ x j − ζ ( x j ) x j + ζ ( x j ) x j ] = 0 . Now suppose that g, h ∈ C ( R ∗ d ) have bounded ﬁrst derivative. Then writing z ( x ) = ( x ζ ( x ) , . . . , x ζ ( x d )),we have ν ◦ tr[( g ( x ) − g ( z ( x ))) ] ≤ k ∂g k BC tr ( R ∗ d , M ( R ∗ d )) d X j =1 ν ◦ tr[( x j − ζ ( x j ) x j ) ] = 0 . The same holds for h . Hence, because of the Cauchy-Schwarz inequality, ν ◦ tr[ gh ] = ν ◦ tr[( g ◦ z )( h ◦ z )] = τ ( g ( X ) h ( X )) . Because linear combinations of functions like tr[ gh ] are dense in C by deﬁnition, it follows that ν and I ( λ X )agree on all of C . We shall show in the next section that for perturbations of V , there is a unique law satisfying the Dyson-Schwinger equation, and hence in particular a unique free Gibbs law for every ultraﬁlter ω . But we pausehere to ﬁrst establish a more general result that for each ω , generic potentials V with bounded ﬁrst andsecond derivatives have a unique free Gibbs law with respect to ω . Proposition 7.19.

Fix ω ∈ β N \ N and k ≥ and C , C > . Consider the space V kC ,C := { V + W : W ∈ tr( C k tr ( R ∗ d )) sa with k ∂ j − ∇ W k BC tr ( R ∗ d , M j − ) d ≤ C j for j = 1 , } , equipped with the subspace topology inherited from tr( C k tr ( R ∗ d )) . Then the set of V ∈ V kC ,C which have aunique free Gibbs law with respect to ω is a dense G δ -set. Recall that a G δ set in a topological space is a countable intersection of open sets. Moreover, the Bairecategory theorem states that in a complete metric space, a countable intersection of dense open sets is dense.Such a set is often called generic . Also, note that V kC ,C is a complete metric space. Indeed, since thetopology of tr( C k tr ( R ∗ d )) is deﬁned by a countable family of seminorms, it is metrizable. It is straightforwardto check that V kC ,C is a closed subset of tr( C k tr ( R ∗ d )), hence complete.94 emark . As far as we know, χ ωV may depend in general on ω , and hence so does the dense G δ set inthe proposition. The proof would apply equally well to the entropy χ V deﬁned by using the lim sup ratherthan limit as N → ω in the deﬁnition. However, then the condition of being a free Gibbs law (maximizer of χ V ) only implies convergence of the random matrix models along a subsequence of µ ( N ) V .To prove the proposition, we do not in fact need to use the Baire category theorem. Rather, if a potentialdoes not have a unique free Gibbs law, we will perturb it using the following lemma. Lemma 7.21.

Let λ ∈ Σ d . Then there exists f ∈ tr( BC ∞ tr ( R ∗ d )) sa such that f A ,τ ( X ) ≥ for all ( A , τ ) ∈ W and X ∈ A d sa , and f A ,τ ( X ) = 0 if and only if λ X = λ .Proof. Let R be an exponential bound for λ , so that λ ∈ Σ d,R . Let R ′ > R . Let φ ∈ C ∞ ( R ) be a functionsuch that φ ( t ) = t on [ − R, R ] and φ ′ is nonnegative, symmetric, and supported in [ − R ′ , R ′ ]. Similar to thebump function construction in the proof of Theorem 7.18, Lemma 4.14 implies that φ ( x j ) ∈ BC ∞ tr ( R ∗ d ). Weclaim that the sum f ( x ) = X m ≥ m ! X i ,...,i m ∈{ ,...,d } | tr( φ ( x i ) . . . φ ( x i m )) − λ ( x i . . . x i m ) | converges in tr( BC ∞ tr ( R ∗ d )). For each k > g ∈ BC ∞ tr ( R ∗ d ), k g k BC k tr ( R ∗ d ) = k X j =0 j ! k ∂ j g k BC k tr ( R ∗ d ) . By the same reasoning as in Lemma 4.27, we have k g g k BC k tr ( R ∗ d ) ≤ k g k BC k tr ( R ∗ d ) k g k BC k tr ( R ∗ d ) . In particular, k (tr( φ ( x i ) . . . φ ( x i m )) − λ ( x i . . . x i m )) k BC k tr ( R ∗ d ) ≤ k tr( φ ( x i ) . . . φ ( x i m )) − λ ( x i . . . x i m ) k BC k tr ( R ∗ d ) ≤ (cid:16) k tr( φ ( x i ) . . . φ ( x i m ) k BC k tr ( R ∗ d ) + | λ ( x i . . . x i m ) | (cid:17) ≤ (cid:16) k φ ( x ) k mBC k tr ( R ∗ d ) + R m (cid:17) . Note that X m ≥ m ! X i ,...,i m ∈{ ,...,d } (cid:16) k φ ( x ) k mBC k tr ( R ∗ d ) + R m (cid:17) = X m ≥ m ! d m (cid:16) k φ ( x ) k mBC k tr ( R ∗ d ) + R m (cid:17) < ∞ . Therefore, the sum deﬁning f converges in tr( BC k tr ( R ∗ d )) for every k , which means it converges in tr( BC ∞ tr ( R ∗ d )).Clearly, f ≥

0. If $f^{\mathcal{A},\tau}(X) = 0$, then $\tau(\phi(X_{i_1}) \cdots \phi(X_{i_m})) = \lambda(x_{i_1} \cdots x_{i_m})$ for all $m$ and $i_1, \dots, i_m \in \{1, \dots, d\}$. Thus, the tuple $Y = (\phi(X_1), \dots, \phi(X_d))$ satisfies $\lambda_Y = \lambda$. In particular, $\|Y_j\| \leq R$. Recall that $\phi$ is an increasing function and $\phi' = 1$ on $[-R, R]$, and therefore $|\phi(t)| > R$ whenever $|t| > R$. By the spectral mapping theorem, the only way that $\|\phi(X_j)\|$ can be less than or equal to $R$ is if $\|X_j\| \leq R$. Hence, $\phi(X_j) = X_j$, and so $\lambda_X = \lambda$.

Proof of Proposition 7.19.

By Theorem 7.18, there exists

R > C such that every freeGibbs law for any V ∈ V kC ,C is in I (Σ d,R ).We claim that any open subset U of V kC ,C contains some potential which has a unique free Gibbs lawwith respect to ω . Let V + W ∈ U . Fix t ∈ (0 ,

1) suﬃciently close to 1 that V + tW ∈ U , and note that k t∂ j W k BC tr ( R ∗ d , M j ) < C j for j = 1, 2. Let I ( λ ) be some free Gibbs law for V + tW . Let f be as in Lemma7.21 for λ . By choosing ǫ > V + tW + ǫf is in U .We claim that I ( λ ) is the unique free Gibbs law for V = V + tW + ǫf . Recall that χ ωV ( ν ) = χ ω ( ν ) − ν ( V ) + K K . Any free Gibbs law has the form I ( µ ) for some µ ∈ Σ d,R . Now χ ω ( I ( µ )) − I ( µ )[ V + tW ] ≤ χ ω ( I ( λ )) − I ( λ )[ V + tW ]By our choice of f , I ( µ )[ − f ] ≤ I ( λ )[ − f ]with equality if and only if µ = λ . It follows that I ( λ ) is the unique maximizer of χ ωV .It remains to show that the set of V which have a unique free Gibbs law is a G δ set. Recall that Σ d,R is compact and metrizable, so let ρ be a metric. Let V ∈ V kC ,C , let G ( V ) ⊆ Σ d,R be the set of λ such that I ( λ ) is a free Gibbs law. By upper semi-continuity of χ ωV , the space of free Gibbs laws is closed in C ⋆ , hencein light of Lemma 4.5, G ( V ) is closed in Σ d,R . Let U n = { V ∈ V kC ,C : G ( V ) ⊆ B /n ( µ ) for some µ ∈ Σ d,R } , where B /n ( µ ) is the open ball of radius n in Σ d,R with respect to the metric ρ . Observe that V ∈ T ∞ n =1 U n if and only if the set G ( V ) has diameter zero if and only if V has a unique free Gibbs law.We claim that U n is open. Fix V ∈ U n . Let µ ∈ Σ d,R such that G ( V ) ⊆ B /n ( µ ). Note that Σ d,R \ B /n ( µ )is compact, hence its image in C ⋆ is a closed set, so χ ωV achieves a maximum, which must be strictly lessthan zero since all the free Gibbs laws are in B /n ( µ ). Call the maximum − ǫ . Let I ( λ ) be a free Gibbs lawfor V . Then sup ν ∈ Σ d,R \ B /n ( µ ) ( χ ω ( I ( ν )) − I ( ν )[ V ]) ≤ χ ω ( I ( λ )) − I ( λ )[ V ] − ǫ. If V ′ ∈ V kC ,C such that k V ′ − V k C tr ( R ∗ d ) ,R ≤ ǫ/

3, thensup ν ∈ Σ d,R \ B /n ( µ ) χ ω ( I ( ν )) − I ( ν )[ V ′ ] ≤ sup ν ∈ Σ d,R \ B /n ( µ ) χ ω ( I ( ν )) − I ( ν )[ V ] + ǫ ≤ χ ω ( I ( λ )) − I ( λ )[ V ] − ǫ ≤ χ ω ( I ( λ )) − I ( λ )[ V ′ ] − ǫ . Hence, for V ′ in a neighborhood of V , the elements of Σ d,R \ B /n ( µ ) are not free Gibbs laws, which impliesthat G ( V ′ ) ⊆ B /n ( µ ), so V ′ ∈ U n . Thus, U n is open as desired. In this section, we will combine the results of § § V suﬃcientlyclose to P j tr( x j ). If V satisﬁes ∇ V ∈ J da,c (see Deﬁnition 6.1). In §

6, we constructed an expectation map $E_V := E_{\nabla V}$. We will also use the notation $L_V$, $e^{tL_V}$, and $\Psi_V$ rather than $L_{\nabla V}$, $e^{tL_{\nabla V}}$, and $\Psi_{\nabla V}$. We will show in Proposition 8.1 that $E_V$ describes the unique free Gibbs law for $V$. Then Theorem 8.3 will complete the strategy of 5.4 to construct transport.

We use the same strategy to prove a more refined result (Theorem 8.22), which produces triangular smooth transport, and hence triangular isomorphisms of $\mathrm{C}^*$- and $\mathrm{W}^*$-algebras. Several of the necessary ingredients, such as a conditional version of the Dyson-Schwinger equation, cannot be deduced directly from the results of §

7. We rely instead upon the relationship between $E_{\mathbf{x},V}$ and conditional expectations from random matrix theory and operator algebras, which is also of interest in its own right.

Proposition 8.1.

Let V satisfy ∇ V ∈ J da,c for some a ∈ R and c ∈ (0 , . Then E V | C is the unique elementof C ⋆ satisfying (7.12) . In particular, for any ω ∈ β N \ N , it is the unique free Gibbs law for V with respectto ω . roof. Let ν ∈ C ⋆ satisfy (7.15). By Theorem 7.18, ν = I ( λ ) for some λ ∈ Σ d,R for some R >

0, andthe corresponding homomorphism ˜ λ : tr( C tr ( R ∗ d )) → C satisﬁes the Dyson-Schwinger equation for allsmooth test functions. If f ∈ tr( C ∞ tr ( R ∗ d )), then Proposition 6.26 we have Ψ V f ∈ tr( C ∞ tr ( R ∗ d )) and hence ∇ (Ψ V f ) ∈ C ∞ tr ( R ∗ d ) d . Thus, by (7.15),0 = ˜ λ [ ∇ ∗ V ∇ (Ψ V f )] = ˜ λ [ f − E V [ f ]] = ˜ λ ( f ) − E V ( f ) . Therefore, ˜ λ [ f ] = E V [ f ] for all smooth f . By density, this extends to all of tr( C tr ( R ∗ d )). Hence, ˜ λ = E V and ν = E V | C . Corollary 8.2. If V satisﬁes ∇ V ∈ J da,c for some c > and a ∈ R , then for every f ∈ tr( C ( R ∗ d )) with ∂f bounded and for every ǫ > , we have lim sup N →∞ N log µ ( N ) V ( { X : | f ( X ) − E V ( f ) | ≥ ǫ } ) < . As a consequence of (5.6) and Proposition 6.29, any such V satisﬁes Assumptions 5.14 and 5.16. Hence, allthe properties of Propositions 5.18 and 5.19 hold. Now we give a rigorous proof of transport for log-densitiesclose to the quadratic, and in fact “inﬁnitesimally optimal” transport. Theorem 8.3.

Let V t = (1 / k x , x k tr + W t , where t W t be a continuously diﬀerentiable path [0 , T ] → tr( C ∞ tr ( R ∗ d )) sa . Suppose that k ∂ k − ∇ W k BC tr ( R ∗ d , M k − ) d ≤ C k for k = 1 , , , k ∂ k − ∇ ˙ W k BC tr ( R ∗ d , M k − ) d ≤ C ′ k for k = 1 , , for constants C , C , C , C ′ , C ′ ∈ [0 , ∞ ) such that C < . Let V t = k x k , tr + W t . Let h t = −∇ Ψ V t ˙ V t . Then the solution f t to (5.3) satisﬁes ( f t ) ∗ V = V t modulo constants for all t . Moreover, this choice of h t minimizes Z T E V t k h t k , tr dt among all maps t h t satisfying the hypotheses of Lemma 5.10 with ( f t ) ∗ V = V t for all t .Remark . The last condition says that the transport is “inﬁnitesimally optimal.”

Proof.

Note that ∇ V t ∈ J dC , − C , and thus Proposition 6.26 constructs a pseudo-inverse Ψ V t for − L V t . Let h t = −∇ Ψ V t ˙ W t . We apply Proposition 6.26 (3) and Remark 6.27, observing that ∂ x reduces to ∂ since there is no x ′ . Because ∇ W t ∈ BC ( R ∗ d ) d sa , we have k ∂ Ψ V t ˙ W t k BC tr ( R ∗ d , M ) ,R ≤ constant X k =0 k ∂ k ˙ W t k BC tr ( R ∗ d , M ) ,R ′ , which is bounded by a constant, and similarly ∂ Ψ V t ˙ W t is bounded by a constant. Therefore, ∂ h t and ∂ h t are bounded by constants. By Lemma 5.8, there is a family of diﬀeomorphisms f t satisfying ˙ f t = h t ◦ f t and f = id. Note that −∇ ∗ V t h t = ∇ ∗ V t ∇ Ψ V t ˙ W t = ˙ W t modulo constants. Therefore, by Lemma 5.10, we have( f t ) ∗ V = V t modulo constants.Finally, consider another possible choice of functions ˜ h t . If the ﬂow generated by ˜ h t transports V to V t modulo constants, then by the previous proposition, we must have ∇ ∗ V t ˜ h t = ˙ W t = ∇ ∗ V t h t modulo constants.Since ˜ h t − h t is in the kernel of ∇ ∗ V , it is orthogonal with respect to E V t to any gradient by Proposition 5.19(4), and in particular orthogonal to h t . Hence, E V t k h t k , tr ≤ E V t (cid:13)(cid:13)(cid:13) ˜ h t (cid:13)(cid:13)(cid:13) , tr , which shows the desired optimality condition. 97heorem 8.3 directly implies certain isomorphisms of W ∗ - and C ∗ -algebras. The following results areclosely related to those of [35, 26, 41, 42]. Observation 8.5.

Suppose that V and V ∈ tr( C ∞ tr ( R ∗ d ))) such that V j = (1 / k x k + W j with k ∂ k W j k BC tr ( R ∗ d , M k ) < C k for k = 1 , , , with C < . Then the path W t = (1 − t ) W + tW satisﬁes the assumptions of Theorem 8.3. Hence, by thetheorem, there exists some f ∈ C ∞ tr ( R ∗ d ) d sa such that f ∗ µ V = µ V . Because f is given by solving the ODE (5.3) , the function f also has an inverse g ∈ C ∞ tr ( R ∗ d ) d sa . In particular, by Observation 4.8, there is a tracial W ∗ isomorphism between the GNS representations of µ V and µ V which also restricts to an isomorphism ofthe associated C ∗ -algebras. Corollary 8.6.

Suppose that V = (1 / k x k + W where ∂W ∈ tr( BC tr ( R ∗ d , M ( R ∗ d ))) and k ∂ W k BC tr ( R ∗ d , M ) < . Then the GNS representation of µ V is isomorphic to the tracial W ∗ -algebra generated by a standard semi-circular family S = ( S , . . . , S d ) , and the isomorphism restricts to an isomorphism of the C ∗ -algebras. Although the construction of E x ,V nowhere used matrix approximations, we will use the matrix approx-imations to prove various relations among diﬀerent conditional expectation maps. Even in the previoussection, we could only prove the properties of Proposition 5.18 after knowing the Dyson-Schwinger equation E V ∇ ∗ V h = 0 for h ∈ BC ( R ∗ d ) d . The Dyson-Schwinger equation in turn was deduced from the fact that thefree Gibbs law maximized the free entropy χ ωV . But free entropy is deﬁned in terms of matricial microstates.Hence, even our previous results depended on matrix approximation.As we do not yet know a good deﬁnition for conditional microstate entropy, our results in the conditionalsetting will rely on the random matrix models in a more explicit fashion. As in [40, 41, 42], we will view thefunctions in C tr ( R ∗ d ) as large- N asymptotic descriptions of certain sequences of functions on M N ( C ) d sa . Forthis reason, we desire a function f to be uniquely determined by knowing its restrictions f M N ( C ) , tr N for all N . Thus, we must restrict our attention to tracial W ∗ -algebras that can be approximated by matrices in acertain sense.We say that ( A , τ ) is Connes-approximable if for every d and every X ∈ A d sa , there exists a sequence of d -tuples X ( N ) ∈ M N ( C ) d that converges in non-commutative law to X . It is well-known in von Neumannalgebras that this is equivalent to the embeddability of ( A , τ ) into the ultrapower ( R , τ R ) ω for some ω ∈ β N \ N . 
However, recent work has shown that not every von Neumann algebra has this property [43].

The space $C_{\mathrm{tr}}(\mathbb{R}^{*d})$ by definition consists of tuples of functions on $d$-tuples for any separable tracial $\mathrm{W}^*$-algebra, since we used a set of isomorphism class representatives of such tracial $\mathrm{W}^*$-algebras to define the norm. However, the same constructions can be performed using some subclass of tracial $\mathrm{W}^*$-algebras. When we replace the set of representatives $\mathbb{W}$ with a set of representatives $\mathbb{W}_{\mathrm{app}}$ for Connes-approximable tracial $\mathrm{W}^*$-algebras, we obtain analogous spaces to $C^k_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$, which we will denote $C^k_{\mathrm{tr},\mathrm{app}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$, where the subscript app stands for "approximable."

All the results in the paper work with $C^k_{\mathrm{tr}}$ replaced with $C^k_{\mathrm{tr},\mathrm{app}}$. For §

6, one of course has to define the Connes-approximable versions of $C_{\mathrm{tr},\mathcal{S}}$, where $\mathcal{S}$ is a Brownian motion. It is well-known that if $(\mathcal{A}, \tau)$ is Connes-embeddable and if $(\mathcal{B}, \sigma)$ is the tracial $\mathrm{W}^*$-algebra generated by the free Brownian motion $\mathcal{S}$, then $(\mathcal{A}, \tau) * (\mathcal{B}, \sigma)$ is Connes-embeddable [82, Proposition 3.3].

The next lemma shows that functions in $C^k_{\mathrm{tr},\mathrm{app}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))$ are uniquely determined by their values on matrix tuples. The proof may be obvious to those familiar with folklore about Connes-approximability, but nonetheless we will explain the argument here for the sake of completeness.

Lemma 8.7.

Given f ( N ) : M N ( C ) dsa × M N ( C ) d × · · · × M N ( C ) d ℓ → M N ( C ) d ′ that is multilinear in the last ℓ arguments, deﬁne as in 3.10 k f ( N ) k M ℓ , tr ,R = sup {k f ( N ) ( X ) k M ℓ , tr : X ∈ M N ( C ) d sa , k X k ≤ R } . Let f ∈ C tr , app ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′ . Then k f k C tr , app ( R ∗ d ) d ′ ,R = sup N k f M N ( C ) , tr N k M ℓ , tr ,R = lim N →∞ k f M N ( C ) , tr N k M ℓ , tr ,R . roof. Note that it suﬃces to prove both equalities when f is a trace polynomial, since any f ∈ C tr , app ( R ∗ d , M ( R ∗ d , . . . , R ∗ d ℓ )) d ′ can be approximated in k·k C tr , app ( R ∗ d , M ℓ ) d ′ ,R by trace polynomials and this norm clearly dominates the ma-trix version on the right-hand side. Now given some Connes-approximable A , some α , α , . . . , α ℓ with1 /α = 1 /α + · · · +1 /α ℓ , and some X , Y , . . . , Y ℓ ∈ A d sa , we may choose some matrix tuples X ( N ) ∈ M N ( C ) d sa , Y ( N )1 ∈ M N ( C ) d , . . . , Y ℓ ∈ M N ( C ) d ℓ sa such that X ( N ) and the real and imaginary parts of Y ( N ) j convergein joint non-commutative law to X and the real and imginary parts of Y , . . . , Y ℓ . By applying a cut-oﬀfunction to X ( N ) and the real and imaginary parts of Y ( N ) j , we may also assume that k X ( N ) k ≤ R and k Y ( N ) j k ≤ k Y j k . Convergence in law also implies convergence of the L β norms of X ( N ) , Y ( N )1 , . . . , Y ( N ) ℓ to those of the corresponding operators for β ∈ [1 , ∞ ). Using convergence in law again, we also havelim N →∞ k f ( N ) ( X ( N ) )[ Y ( N )1 , . . . , Y ( N ) ℓ ] k β = k f ( X )[ Y , . . . , Y ℓ ] k β for β ∈ [1 , ∞ )and because the ∞ -norm can be recovered as the limit of the β -norms as β → ∞ , we have k f ( X )[ Y , . . . , Y ℓ ] k ∞ ≤ lim inf N →∞ k f ( N ) ( X ( N ) )[ Y ( N )1 , . . . , Y ( N ) ℓ ] k ∞ . 
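The fact that the $\infty$-norm is the limit of the $\beta$-norms as $\beta \to \infty$ can be illustrated numerically. The following Python sketch computes the normalized Schatten norms $(\operatorname{tr}_N |A|^\beta)^{1/\beta}$ of a random matrix and compares them with the operator norm. The function name `beta_norm` and the choice of test matrix are illustrative assumptions, not part of the paper.

```python
import numpy as np

def beta_norm(A, beta):
    """Normalized Schatten norm (tr_N |A|^beta)^(1/beta), computed from singular values.

    The singular values are rescaled by the largest one to avoid overflow for large beta.
    """
    s = np.linalg.svd(A, compute_uv=False)
    s_max = s.max()
    if s_max == 0.0:
        return 0.0
    return s_max * np.mean((s / s_max) ** beta) ** (1.0 / beta)

rng = np.random.default_rng(1)
A = rng.normal(size=(30, 30))
op_norm = np.linalg.norm(A, 2)  # operator norm = largest singular value

for beta in (1, 2, 8, 32, 128, 512):
    print(f"beta = {beta:4d}: beta-norm = {beta_norm(A, beta):.6f}, operator norm = {op_norm:.6f}")
```

For an $N \times N$ matrix the normalized $\beta$-norm lies between $N^{-1/\beta}\|A\|_\infty$ and $\|A\|_\infty$, which is why the values increase toward the operator norm.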
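This implies that
$$\|f\|_{C_{\mathrm{tr},\mathrm{app}}(\mathbb{R}^{*d}, \mathscr{M}^\ell)^{d'},R} \leq \liminf_{N \to \infty} \|f^{M_N(\mathbb{C}),\mathrm{tr}_N}\|^{(N)}_{\mathscr{M}^\ell,\mathrm{tr},R} \leq \limsup_{N \to \infty} \|f^{M_N(\mathbb{C}),\mathrm{tr}_N}\|^{(N)}_{\mathscr{M}^\ell,\mathrm{tr},R} \leq \sup_N \|f^{M_N(\mathbb{C}),\mathrm{tr}_N}\|^{(N)}_{\mathscr{M}^\ell,\mathrm{tr},R} \leq \|f\|_{C_{\mathrm{tr},\mathrm{app}}(\mathbb{R}^{*d}, \mathscr{M}^\ell)^{d'},R}.$$

Next, we define a precise notion of an element of $C_{\mathrm{tr},\mathrm{app}}(\mathbb{R}^{*d}, \mathscr{M}^\ell)$ describing the large $N$ limit of a sequence of functions on $M_N(\mathbb{C})^d_{\mathrm{sa}}$.

Definition 8.8.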

Let $f^{(N)} : M_N(\mathbb{C})^{d}_{\mathrm{sa}} \times M_N(\mathbb{C})^{d_1}_{\mathrm{sa}} \times \cdots \times M_N(\mathbb{C})^{d_\ell}_{\mathrm{sa}} \to M_N(\mathbb{C})^{d'}$ and let $f \in C_{\mathrm{tr},\mathrm{app}}(\mathbb{R}^{*d}, \mathscr{M}(\mathbb{R}^{*d_1}, \dots, \mathbb{R}^{*d_\ell}))^{d'}$. We say that $(f^{(N)})_{N \in \mathbb{N}}$ is asymptotic to $f$, or $f^{(N)} \rightsquigarrow f$, if
$$\lim_{N \to \infty} \|f^{(N)} - f^{M_N(\mathbb{C}),\mathrm{tr}_N}\|_{\mathrm{tr},R} = 0.$$

Remark 8.9. In the case $\ell = 0$, the error is measured in $\|\cdot\|_\infty$ uniformly on operator norm balls. This condition is stronger than the one in [41] and [42], which measured the error in $\|\cdot\|_2$.

Remark 8.10. It follows from Lemma 8.7 that the condition $f^{(N)} \rightsquigarrow f$ uniquely determines $f$.

Lemma 8.11.

Let f ∈ C tr ( R ∗ d ′ , M ( R ∗ d , . . . , R ∗ d n )) d ′′ for some n, d ′ ∈ N and d ′′ , d , . . . , d n ∈ N . Let g ∈ C (tr R ∗ d ) d ′ sa for some d ∈ N . For each m = 1 , . . . , n , let h m ∈ C tr ( R ∗ d , M ( R ∗ d m, , . . . , R ∗ d m,ℓm ) d m forsome ℓ m ∈ N and d m, , . . . , d m,ℓ m . Similarly, let f ( N ) : M N ( C ) ∗ d ′ sa × M N ( C ) d × · · · × M N ( C ) d n → M N ( C ) d ′′ g ( N ) : M N ( C ) d sa → M N ( C ) d ′ sa h ( N ) m : M N ( C ) d ′ sa × M N ( C ) d m, × M N ( C ) d m,ℓm → M N ( C ) d ′′ , where f ( N ) and h ( N ) m are multilinear in the last n and ℓ m arguments respectively. If f ( N ) f , g ( N ) g ,and h ( N ) m h m for each m , then f ( N ) ( g ( N ) )[ h ( N )1 , . . . , h ( N ) n ] f ( g )[ h , . . . , h n ] . The proof is essentially the same as the proof of continuity of composition in Lemma 3.18, hence we leavethe details to the reader. 99 .3 E x ,V and conditional expectations Deﬁnition 8.12.

For each choice of C , C , C >

0, let V d,C ,C ,C be the set of functions V = k x k + W ∈ tr( C ∞ tr ( R ∗ d )) sa satisfying k ∂ k − ∇ W k BC tr ( R ∗ d , M k ) d ≤ C k for k = 1, 2, 3.For V ∈ V d,C ,C ,C , we will denote the expectation E x , ∇ x V from § E x ,V . In this subsection,we will show that the expectation map E x ,V describes the large N limit of classical conditional expectationsassociated to the measures µ ( N ) V .Given a potential V ( N ) : M N ( C ) d + d ′ sa → R such that e − N V ( N ) is integrable, we deﬁne dµ V ( N ) ( X , X ′ ) = e − N V ( N ) ( X , X ′ ) d X d X ′ R M N ( C ) d + d ′ sa e − N V ( N ) ( X , X ′ d X d X ′ . Moreover, we deﬁne the conditional distribution dµ V ( N ) ( X | X ′ ) = e − N V ( N ) ( X , X ′ ) d X R M N ( C ) d sa e − N V ( N ) ( X , X ′ ) d X . If f ( N ) : M N ( C ) d + d ′ sa × M N ( C ) d sa × · · · × M N ( C ) d ℓ sa → M N ( C ) d is real-multilinear in the last ℓ arguments, weset E x ,V ( N ) [ f ( N ) ]( X ′ )[ Y , . . . , Y ℓ ] = R M N ( C ) d sa f ( N ) ( X )[ Y ′ , . . . , Y ℓ ] e − N V ( N ) ( X , X ′ ) d X R M N ( C ) d sa e − N V ( N ) ( X , X ′ ) d X . This describes the conditional expectation of f ( N ) ( X , X ′ ) given X ′ , when ( X , X ′ ) is a random variable withthe distribution µ ( N ) . Note that the subscript x denotes integration with respect to x , hence conditioningon x ′ . Theorem 8.13.

Let V ∈ V C ,C ,C for some C < . Let V ( N ) : M N ( C ) d + d ′ sa → R such that(1) V ( N ) is invariant under conjugation of X , . . . , X d + d ′ by a ﬁxed unitary U .(2) V ( N ) is a C function and ∇ V ( N ) ∇ V .(3) V ( N ) ( X ) − c k X k is convex and V ( N ) ( X ) − C k X k is concave for some < c < C .Let f ∈ C tr ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′′ , and let f ( N ) : M N ( C ) ( d + d ′ )sa × M N ( C ) d sa × · · · × M N ( C ) d ℓ sa → M N ( C ) d ′′ sa with f ( N ) f and k f ( N ) ( X , X ′ ) k M ℓ , tr ≤ K e K k ( X , X ′ ) k ∞ for some constants K and K . Then E x ,V ( N ) [ f ( N ) ] E x ,V [ f ] . Remark . If we take V ( N ) = V M N ( C ) , tr N , then the hypotheses (1), (2), (3) are automatically satisﬁed.For the condition (3), we set c = 1 − C and C = 1 + C where C = k ∂ ∇ V − Id k BC tr ( R ∗ ( d + d ′ ) , M ) .Since the asymptotic approximation relation relies on approximation for each operator norm ball,we will have to truncate the conditional distribution µ V ( N ) ( X | X ′ ) to operator-norm balls. The followinglemma from [42] relies on concentration of measure (see e.g. [32], [50], [12], [5, § ǫ -net argument (see [74, § V ( N ) satisﬁes (3). For the proof, referto [42, p. 277]. The constant R on p. 277 is the R ′ in the lemma statement here. Lemma 8.15.

Suppose that V ( N ) : M N ( C ) d + d ′ sa satisﬁes assumptions (1), (2), and (3) of the theorem, andlet K > and R > . Then there is some constant R ′ such that lim N →∞ sup k X ′ k ∞ ≤ R Z k x k ∞ ≥ R ′ e K k X k ∞ dµ ( N ) ( X | X ′ ) = 0 . roof of Theorem 8.13. First, consider the case where f ( N ) is exactly equal to f M N ( C ) , tr N and f ∈ BC , app ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′′ ∩ C ∞ tr , app ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′′ . Let g = Ψ x ,V f . Recall that f = E x ,V [ f ] ◦ π ′ − L x ,V g , and hence E x ,V ( N ) [ f M N ( C ) , tr N ] − E x ,V [ f ] M N ( C ) , tr N = E x ,V ( N ) [ L x ,V g M N ( C ) , tr N ] . For a function h on M N ( C ) d + d ′ sa × ( M N ( C ) d + d ′ sa ) ℓ , let L x ,V ( N ) h = 1 N ∆ x h − ∂ x h ∇ x V ( N ) . Because of Lemma 4.36, we have 1 N ∆ x [ g M N ( C ) , tr N ] L x g . Similarly, using Lemma 8.11, we have ∂ x g M N ( C ) , tr N ∇ x V ( N ) ∂ x g ∇ x V. Thus, L x ,V ( N ) g M N ( C ) , tr N L x ,V g . Note that because of integration by parts Z M N ( C ) d sa L x ,V ( N ) g M N ( C ) , tr N ( X , X ′ ) dµ ( N ) ( X | X ′ ) = 0 . (8.1)Fix R >

0, and let R ′ be the radius associated to R as in Lemma 8.15, and let M = max( R, R ′ ). Becauseof assumption (3), ∇ V ( N ) is C -Lipschitz with respect to k·k . Since k∇ V ( N ) (0) k is bounded as N → ∞ ,we have k∇ x j V ( N ) ( X , X ′ ) k ≤ A + B k ( X , X ′ ) k for some constants A and B . But it follows from [42, Lemma 11.5.4] that k∇ x j V ( N ) ( X , X ′ ) − tr N ( ∇ x j V ( N ) ( X , X ′ )) k ∞ ≤ B ′ k ( X , X ′ ) k ∞ for some constant B ′ . Thus, overall, k∇ x j V ( N ) ( X , X ′ ) − tr N ( ∇ x j V ( N ) ( X , X ′ )) k ∞ ≤ A + ( Bd + B ′ ) k ( X , X ′ ) k ∞ . Moreover, note that ∂ f M N ( C ) , tr N t and (1 /N )∆ f M N ( C ) , tr N are uniformly bounded for every N and ( X , X ′ )and t since ∂ f t and ∂ f t is uniformly bounded. Therefore, using Lemma 8.15, we see thatlim N →∞ sup k X ′ k≤ R Z k X k ∞ ≥ M k ( L x ,V ( N ) g M N ( C ) , tr N ( X , X ′ ) − [ L x ,V g ] M N ( C ) , tr N ( X , X ′ ) k M ℓ , tr dµ V ( N ) ( X | X ′ ) = 0 . Meanwhile, we can estimate the same integral over k X k ∞ ≤ M by using the condition that L x ,V ( N ) g M N ( C ) , tr N L x ,V f t , and thus putting the two pieces together,lim N →∞ sup k X ′ k≤ R Z k ( L x ,V ( N ) g M N ( C ) , tr N ( X , X ′ ) − [ L x ,V g ] M N ( C ) , tr N ( X , X ′ ) k M ℓ , tr dµ V ( N ) ( X | X ′ ) = 0 . Since R was arbitrary, it follows that E x ,V ( N ) [ L x ,V ( N ) [ g M N ( C ) , tr N ] − [ L x ,V g ] M N ( C ) , tr N ] E x ,V ( N ) [ f M N ( C ) , tr N ] E x ,V [ f ] . For the more general case, suppose that f ( N ) f and that f ( N ) satisﬁes the given operator norm bounds.Fix R and let M be as above and also let M ′ = max( M, R + 2 , C ). If ǫ >

0, then we may choose some g ∈ C ∞ tr , app ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′′ ∩ BC tr , app ( R ∗ ( d + d ′ ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ′′ with k g − f k C tr ( R ∗ ( d + d ′ ) ) d ,M < ǫ (here g can be taken to be a trace polynomial composed with a smoothcut-oﬀ function in ( X , X ′ )). Then observe that k E x ,V f − E x ,V g k C tr ( R ∗ d ′ , M ℓ ) d ,R ≤ ǫ using Proposition 6.22and the deﬁnition of M ′ . Moreover,lim sup N →∞ sup k x ′ k ∞ ≤ R Z k x k ∞ ≤ M k f ( N ) ( X , X ′ ) − g M N ( C ) , tr N ( X , X ′ ) k M ℓ , tr dµ V ( N ) ( X | X ′ ) ≤ ǫ, while the integral over k X k tr N , ∞ > M can be estimated using Lemma 8.15. Hence,lim sup N →∞ k E x ,V ( N ) [ f ( N ) ] − E x ,V [ f ] k M ℓ , tr ≤ ǫ, and since R and ǫ were arbitrary, we are done.Next, given a potential V ( x , x ′ ) in V d + d ′ C ,C ,C , we want to describe the “marginal potential” b V ( x ′ ) for thedistribution of x ′ , that is, the function describing the large N limit of the log of the marginal density of µ ( N ) V for x ′ . Choose V ( N ) as in Theorem 8.13. We can deﬁne the marginal potential b V ( N ) ( X ′ ) = − N log Z e − N V ( N ) ( X , X ′ ) d X A straightforward computation shows that ∇ b V ( N ) ( X ′ ) = E x ,V ( N ) [ ∇ x ′ V ( N ) ] . Now it follows from the previous theorem that ∇ b V ( N ) E x ,V [ ∇ x ′ V ] . Our next goal is to show that E x ,V [ ∇ x ′ V ] is the gradient of some function b V ∈ tr( C ∞ tr ( R ∗ d )). To this end,we use the following lemma. Lemma 8.16.

Let g ∈ C k tr ( R ∗ d ) d sa . If there exist C functions f ( N ) : M N ( C ) d sa → R such that ∇ f ( N ) g ,then there exists f ∈ tr( C k +1tr ( R ∗ d )) sa such that ∇ f = g . This f is unique up to an additive constant. Italso satisﬁes f ( N ) − f ( N ) (0) f − f (0) .Proof. We may deﬁne a function h ( x , x , x ) in tr( C tr ( R ∗ d )) by h ( x , x , x ) = X j =1 Z h g ( t x j + (1 − t ) x j +1 ) , x j − x j +1 i tr dt, where the index j +1 is reduced modulo 3. The function h is intuitively the path integral of g around a trianglewith vertices x , x , x . Here x , x , x are formal variables, and thus h g ( t x j + (1 − t ) x j +1 ) , x j − x j +1 i is anelement of tr( C tr ( R ∗ d )). Moreover, it depends continuously on t in this space by continuity of composition.It follows that the Riemann integral of these functions is deﬁned.Next, let h ( N ) ( X , X , X ) = X j =1 Z h∇ f ( N ) ( t X j + (1 − t ) X j +1 ) , X j − X j +1 i tr N dt, X , X , X represent elements of M N ( C ) d sa . It is straightforward to show that since ∇ f ( N ) g , wehave h ( N ) h . But because ∇ f ( N ) is a gradient, we have h ( N ) = 0. Therefore, h = 0.Deﬁne f ( x ) = Z h g ( t x ) , x i tr dt. Given that h = 0, we have for any ( A , τ ) and any X , X , X ∈ A d sa that0 = h A ,τ (0 , X , X ) = f A ,τ ( X ) − f A ,τ ( X ) + Z h g A ,τ ( t X + (1 − t ) X ) , X − X i τ dt. It follows easily that ∇ f = g .Moreover, f is unique up to an additive constant because f A ,τ ( X ) − f A ,τ (0) can be evaluated by integrat-ing the ∇ f A ,τ along the path from 0 to X . Similarly, since f ( N ) ( X ) − f ( N ) (0) = R h∇ f ( N ) ( t X ) , X i tr N dt ,we obtain f ( N ) − f ( N ) (0) f − f (0).Finally, observe that if g = ∇ f ∈ C k tr ( R ∗ d ) d , then f ∈ tr( C k +1tr ( R ∗ d )) Proposition 8.17.

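Let $V \in \mathcal{V}_{d+d',C_1,C_2,C_3}$ for some $C_2 < 1$. Then there exists $\widehat{V} \in \operatorname{tr}(C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d'}))_{\mathrm{sa}}$, unique up to an additive constant, such that
$$\nabla \widehat{V} = E_{\mathbf{x},V}[\nabla_{\mathbf{x}'} V].$$
Furthermore, we have $\widehat{V} \in \mathcal{V}_{d',C_1',C_2',C_3'}$ for some constants $C_1'$, $C_2'$, and $C_3'$ depending only on $C_1$, $C_2$, and $C_3$, where specifically
$$C_1' = C_1, \qquad C_2' = \frac{C_2(1 + C_2)}{1 - C_2}.$$

Proof.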

Let V ( N ) = V M N ( C ) , tr N , so that ∇ V ( N ) ∇ V . By Theorem 8.13 and Remark 8.14, we have E x ,V ( N ) [ ∇ x ′ V ( N ) ] E x ,V [ ∇ x ′ V ]We know that E x ,V ( N ) [ ∇ x ′ V ( N ) ] = ∇ b V ( N ) for the function b V ( N ) discussed above. Hence, by Lemma 8.16,there exists b V ∈ C ∞ tr with ∇ b V = E x ,V [ ∇ x ′ V ], which is unique up to an additive constant.Next, we must show that b V ∈ V d ′ ,C ′ ,C ′ ,C ′ . Let W = V − (1 / h x , x i tr − (1 / h x ′ , x ′ i tr and c W = V − (1 / h x ′ , x ′ i tr . Note that ∇ x ′ V ( x , x ′ ) = x ′ + ∇ x ′ W ( x , x ′ ) and ∇ b V ( x ′ ) = x ′ + ∇ c W ( x ′ ). Thus, since E x ,V [ x ′ ] = x ′ , we have ∇ c W = E x ,V [ ∇ x ′ W ].Now recall that e tL x ,V f is obtained as a conditional expectation of the function f ( X , π ′ ), and hence k e tL x ,V ∇ x ′ W k BC tr ( R ∗ ( d + d ′ ) ) d ′ ≤ k∇ x ′ W k BC tr ( R ∗ ( d + d ′ ) ) d ′ . Taking t → ∞ , we get k∇ c W k BC tr ( R ∗ ( d + d ′ ) ) d ′ ≤ C .Next, recall that the process X from § ∂ x ′ X ( · , t ) = Z t [ ∂ x ∇ x V ( X ( · , u ) , π ′ ) ∂ x ′ X + ∂ x ′ ∇ x V ( X ( · , u ) , π ′ )] du. In The proof of the base case of Lemma 6.13, we applied Lemma 6.11 to get a bound for this function. The c from that proof is here 1 − C and the constant C ′ , J = C ′ , ∇ x W is here C . Thus, k ∂ x ′ X ( · , t ) k BC tr , S ( R ∗ ( d + d ′ ) , M ) ≤ e − (1 − C ) t (cid:18) C − C ( e (1 − C ) t − (cid:19) . It follows as in the proof of Lemma 6.17 that k ∂ x ′ e tL x ,V ∇ x ′ W k BC tr ( R ∗ ( d + d ′ ) , M ) ≤ e − (1 − C ) t/ (cid:18) C − C ( e (1 − C ) t/ − (cid:19) k ∂ x ∇ x ′ W k BC tr ( R ∗ ( d + d ′ ) , M ) + k ∂ x ′ ∇ x ′ W k BC tr ( R ∗ ( d + d ′ ) , M ) . t → ∞ , we obtain k ∂ x ′ E x ,V ∇ x ′ W k BC tr ( R ∗ d ′ , M ) ≤ C − C · C + C = C (1 + C )1 − C . The existence of C ′ follows by similar reasoning, which we leave as an exercise. Proposition 8.18.

Consider variables x , x ′ , x ′′ which are d , d ′ , and d ′′ -tuples respectively. Let V ∈ V d + d ′ + d ′′ ,C ,C ,C for some C < √ − . Let b V be the marginal potential for ( x ′ , x ′′ ) . Then E x ′ , b V ◦ E x ,V [ f ] = E ( x , x ′ ) ,V [ f ] for f ∈ C tr , app ( R ∗ ( d + d ′ + d ′′ ) , M ( R ∗ d , . . . , R ∗ d ℓ ) d ∗ .Proof. By Proposition 6.31, it suﬃces to prove the relation for f in a dense subset of C tr , app ( R ∗ ( d + d ′ + d ′′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) d ∗ . In particular, we may restrict our attention to bounded f .Let V ( N ) = V M N ( C ) , tr N which satisﬁes the assumptions of Theorem 8.13 with c = 1 − C and C = 1 + C .Let b V ( N ) be the marginal potential for ( X ′ , X ′′ ), which satisﬁes ∇ b V ( N ) = E x ,V ( N ) [ ∇ x ′ , x ′′ V ( N ) ] . By Theorem 8.13, ∇ b V ( N ) ∇ b V .

By Proposition 8.17, b V ∈ V d ′ + d ′′ ,C ′ ,C ′ ,C ′ with C ′ = C (1 + C ) / (1 − C ). Note that C ′ < C < √ − b V ( N ) ( X ) − ( c/ k X k is convexand b V ( N ) ( X ) − ( C/ k X k is concave for the same constants c and C that worked for V ( N ) . (Note equation(4.18) of [18] should read D = A − BC − B ∗ . Of course, if the block 2 × D is the same scalar multiple of the appropriately sizedidentity matrix.) Overall, we conclude that b V ( N ) and b V also satisfy the hypotheses of Theorem 8.13.Now let f ( N ) = f M N ( C ) , tr N . Then by Theorem 8.13 applied to V and V ( N ) , we have E x ,V ( N ) [ f ( N ) ] E x ,V [ f ] . Note that these functions are uniformly bounded because we assumed f was bounded. By Theorem 8.13applied to b V and b V ( N ) , we have E x , b V ( N ) ◦ E x ′ ,V ( N ) [ f ( N ) ] E x , b V ◦ E x ,V [ f ] . From the well-known properties of classical conditional expectations, E x , b V ( N ) ◦ E x ′ ,V ( N ) [ f ( N ) ] = E ( x , x ′ ) ,V ( N ) [ f ( N ) ] . By another application of Theorem 8.13, E ( x , x ′ ) ,V ( N ) [ f ( N ) ] E ( x , x ′ ) ,V [ f ] . Therefore, E x , b V ◦ E x ,V [ f ] = E ( x , x ′ ) ,V [ f ] as desired.As a corollary, in the situation of the previous proposition, we will get the same answer the marginalpotential for x ′′ whether we compute it from V or from b V . There is a variant of the previous propositionthat does not explicitly refer to b V and hence works whenever C < roposition 8.19. Consider variables x , x ′ , x ′′ which are d , d ′ , and d ′′ -tuples respectively. Fix ℓ ≥ and d , . . . , d ℓ ∈ N . Let ι be the canonical inclusion map ι : C tr , app ( R ∗ ( d ′ + d ′′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) → C tr , app ( R ∗ ( d ′ + d ′′ ) , M ( R ∗ d , . . . , R ∗ d ℓ )) , obtained by viewing a function of ( x ′ , x ′′ ) as a function of ( x , x ′ , x ′′ ) . Let V ∈ V d + d ′ + d ′′ ,C ,C ,C for some C < . 
Then E ( x , x ′ ) ,V ◦ ι ◦ E x ,V [ f ] = E ( x , x ′ ) ,V [ f ] for f ∈ C tr , app ( R ∗ ( d + d ′ + d ′′ ) , M ( R ( ∗ d , . . . , R ∗ d ℓ )) d . The proof of the proposition is similar to the previous one. Use the fact that the analogous result holdsfor the classical conditional expectation maps associated to V ( N ) and then take the large N limit usingTheorem 8.13. We leave the details to the reader.The next proposition relates the map E x ,V to W ∗ -algebraic conditional expectations. This result issimilar to [42, Theorem 15.1.7]. The only diﬀerence is that we have a smaller space of non-commutativefunctions, and hence we are able to make conclusions about the C ∗ -algebras, not only the W ∗ -algebras. Proposition 8.20.

Let V ∈ V d + d ′ ,C ,C ,C where C < √ − . Let ( A , τ ) be a tracial W ∗ -algebra withself-adjoint generators ( X , X ′ ) satisfying τ ( f A ,τ ( X , X ′ )) = E V [tr( f ( X , X ′ ))] for f ∈ C tr , app ( R ∗ ( d + d ′ ) ) . Then we have E W ∗ ( X ′ ) [ f A ,τ ( X , X ′ )] = ( E x ,V [ f ]) A ,τ ( X ′ ) for f ∈ C tr , app ( R ∗ ( d + d ′ ) ) , where W ∗ ( X ′ ) is the W ∗ -subalgebra of A generated by X ′ and E W ∗ ( X ′ ) : A → W ∗ ( X ′ ) is the unique trace-preserving conditional expectation. Furthermore, E W ∗ ( X ′ ) maps C ∗ ( X , X ′ ) into C ∗ ( X ′ ) .Proof. Let f ∈ C tr ( R ∗ ( d + d ′ ) ) and g ∈ C tr ( R ∗ d ′ ). Then using 6.25 and Proposition 8.19, τ [( E x ,V [ f ]) A ,τ ( X ′ ) g A ,τ ( X ′ )] = τ [( E x ,V [ f ] g ) A ,τ ( X ′ )]= τ [ E x ,V [ f · ( g ◦ π ′ )] A ,τ ( X ′ )]= E V [ E x ,V [ f · ( g ◦ π ′ )] ◦ π ′ ]= E V [ f · ( g ◦ π ′ )]= τ ( f A ,τ ( X , X ′ ) g A ,τ ( X ′ )) . Since this holds for all g , it holds in particular for non-commutative polynomials. Non-commutative poly-nomials in X ′ are dense in W ∗ ( X ′ ) with respect to the weak operator topology. Thus, the above relationshows that ( E x ,V [ f ]) A ,τ ( X ′ ) equals the conditional expectation of f A ,τ ( X , X ′ ) onto W ∗ ( X ′ ).Because E x ,V [ f ] ∈ C tr ( R ∗ d ′ ), the operator E x ,V [ f ] A ,τ ( X ′ ) is in C ∗ ( X ′ ). Hence, E W ∗ ( X ′ ) [ f A ,τ ( X , X ′ )] ∈ C ∗ ( X ′ ) whenever f ∈ C tr ( R ∗ ( d + d ′ ) ). But elements of the form f A ,τ ( X , X ′ ) are dense in C ∗ ( X , X ′ ), andtherefore, E W ∗ ( X ′ ) maps C ∗ ( X , X ′ ) into C ∗ ( X ′ ). In this section, we will prove a triangular transport result similar to [41, Theorem 8.11]. However, in theboth the hypotheses and conclusion we will use C ∞ tr functions rather than k·k -Lipschitz functions, and thusour new result yields triangular isomorphisms of the C ∗ -algebras generated by our non-commutative randomvariables, not only the W ∗ -algebras. 
Moreover, our current result constructs triangular transport at the infinitesimal level and thus allows us to construct a family of transport maps along any path of potentials $V_t$ that are sufficiently close to the quadratic, whereas [41] performed the transport one variable at a time and at each stage only used a path obtained by freely convolving the distribution with a freely independent semicircular family.

Definition 8.21. For $j \leq d$, let $\iota_{j,d} : C_{\mathrm{tr,app}}(\mathbb{R}^{*j}) \to C_{\mathrm{tr,app}}(\mathbb{R}^{*d})$ be the canonical inclusion $\iota_{j,d}(f)(x_1, \dots, x_d) = f(x_1, \dots, x_j)$. A function $\mathbf{f} = (f_1, \dots, f_d) \in C_{\mathrm{tr,app}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ is said to be lower-triangular if $f_j \in \iota_{j,d}(C_{\mathrm{tr,app}}(\mathbb{R}^{*j}))$ for every $j = 1, \dots, d$, or in other words $f_j$ is a function of $x_1, \dots, x_j$ alone.

Theorem 8.22.

Fix C , C , C with C < √ − , and let t V t be a continuously diﬀerentiable path [0 , T ] → V dC ,C ,C (where diﬀerentiation again occurs with respect to the topology on tr( C ∞ tr ( R ∗ d )) sa ), andassume that k ∂ k ˙ V k BC tr ( R ∗ d , M k ) ≤ C ′ k for k = 1 , . Then there exists a family of triangular functions ( f t,s ) s,t ∈ [0 ,T ] in C ∞ tr ( R ∗ d ) d sa such that f u,t ◦ f t,s = f u,s for s, t, u ∈ [0 , T ] and ( f t,s ) ∗ V s = V t for s, t ∈ [0 , T ] . Similar to the proof of Theorem 8.3, we rely on Lemma 5.10, and thus we will ﬁrst construct a triangularfunction h satisfying L ∗ V h = φ for a given V and φ . Lemma 8.23.

Fix C , C , C with C < √ − . Then for V ∈ V dC ,C ,C , there exists a linear operator T V : tr( C ∞ tr , app ( R ∗ d )) sa → C ∞ tr , app ( R ∗ d ) d sa such that the following conditions hold:(1) T V φ is a lower-triangular for every φ .(2) ∇ ∗ V T V φ = φ − E V ( φ ) .(3) We have k T V φ k BC tr , app ( R ∗ d ) d + k ∂T V φ k BC tr , app ( R ∗ d , M ) d ≤ constant( C , C , C , d ) (cid:0) k ∂φ k BC tr , app ( R ∗ d , M ) + k ∂ φ k BC tr , app ( R ∗ d , M ) (cid:1) . (4) We have continuity of the map V dC ,C ,C × tr( C ∞ tr , app ( R ∗ d )) sa → C ∞ tr , app ( R ∗ d ) d sa : ( V, φ ) T V φ. Proof.

Let V j be the marginal potential on the variables x , . . . , x j obtained from V given by ∇ V j = E x j +1 ,...,x d ,V [ ∇ x ,...,x j V ] , with the normalization V j (0) = 0. Note that V j ∈ V jC ′ ,C ′ ,C ′ with C ′ = 2 C / (1 − C ) < C < /

2. Therefore, the pseudoinverse operators Ψ V j are deﬁned byProposition 6.26.To simplify notation, we will view C tr ( R ∗ j ) as a subset of C tr ( R ∗ d ) using the canonical inclusion ι j,d .Given φ ∈ tr( C tr ( R ∗ d )) sa , we deﬁne functions h j ∈ C tr ( R ∗ j ) sa inductively by h j = ∇ x j Ψ x j ,V j E x j +1 ,...,x d ,V ( φ ) − j − X i =1 ∂ x i V j h i ! . (8.2)It makes sense to apply Ψ x j ,V j to E x j +1 ,...,x d ,V ( φ ) − P j − i =1 ∇ ∗ x i ,V j h i since the latter is a function of x , . . . , x j . We set T V φ = ( h , . . . , h d ). Clearly, T V is a linear operator and satisﬁes (1) by construction, and nowwe shall check that it has the other desired properties.(2) Observe that ∇ ∗ x j ,V h j = ∂ x j V h j − L x j h j = ∂ x j ( V − V j ) h j + ∂ x j V j h j − L x j h j = ∂ x j ( V − V j ) h j + ∇ ∗ x j ,V j h j . ∇ ∗ x j ,V h j = ∇ ∗ x j ,V ∇ x j Ψ x j ,V j E x j +1 ,...,x d ,V ( φ ) − j − X i =1 ∂ x i V j h i ! = (1 − E x j ,V j ) " E x j +1 ,...,x d ,V ( φ ) − j − X i =1 ∂ x i V j h i = E x j +1 ,...,x d ,V ( φ ) − E x j ,...,x d ,V ( φ ) − j − X i =1 ∂ x i ( V j − V j − ) h i , where we have observed that for i ≤ j − E x j ,V j [ ∂ x i V j h i ] = E x j ,V j [ ∂ x i V j ] h i = ∂ x i V j − h i , since h i does not depend on x j . Therefore, ∇ ∗ V h = d X j =1 ∇ ∗ x j ,V h j = d X j =1 ∂ x j ( V − V j ) h j − d X j =1 j − X i =1 ∂ x i ( V j − V j − ) h i + d X j =1 (cid:0) E x j +1 ,...,x d ,V ( φ ) − E x j ,...,x d ,V ( φ ) (cid:1) = d X j =1 ∂ x j ( V − V j ) h j − d − X i =1 d X j = i +1 ∂ x i ( V j − V j − ) h i + (cid:0) φ − E V ( φ ) (cid:1) = d X j =1 ∂ x j ( V − V j ) h j − d − X i =1 ∂ x i ( V − V i ) h i + (cid:0) φ − E V ( φ ) (cid:1) = φ − E V ( φ ) . 
(3) Because V j ∈ V jC ′ ,C ′ ,C ′ , it follows from Proposition 6.26 and Remark 6.27 that for some constants K , K , K depending only on C , C , C , we have X k =0 k ∂ k h j k BC tr , app ( R ∗ d , M k ) = X k =0 (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ∂ k ∂ x j Ψ x j ,V j E x j +1 ,...,x d ,V ( φ ) − j − X i =1 ∂ x i V j h i !(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) BC tr , app ( R ∗ d , M k ) ≤ K X k =0 (cid:13)(cid:13) ∂ k ∂ x j E x j +1 ,...,x d ,V ( φ ) (cid:13)(cid:13) BC tr , app ( R ∗ d , M k ) + K j − X i =1 1 X k =0 (cid:13)(cid:13) ∂∂ x j [ ∂ x i V j h i ] (cid:13)(cid:13) BC tr , app ( R ∗ d , M k ) ≤ K X k =0 (cid:13)(cid:13) ∂ k φ (cid:13)(cid:13) BC tr , app ( R ∗ d , M k ) + K j − X i =1 1 X k =0 (cid:13)(cid:13) ∂ [ ∂ x j ∂ x i V j h i ] (cid:13)(cid:13) BC tr , app ( R ∗ d , M k ) ≤ K X k =0 (cid:13)(cid:13) ∂ k φ (cid:13)(cid:13) BC tr , app ( R ∗ d , M k ) + K j − X i =1 1 X k =0 (cid:13)(cid:13) ∂ k h i (cid:13)(cid:13) BC tr , app ( R ∗ d , M k ) where we have used Proposition 6.22 and the fact that ∂ x j h i = 0. Based on this inequality, it is easy tocheck by induction that each h j satisﬁes the desired bounds.(4) By Proposition 6.31, V j depends continuously on V . Similarly, by applying Proposition 6.31 to eachpart of (8.2), we see by induction that h j depends continuously on ( V, φ ).107 roof of Theorem 8.22.

Let $\mathbf{h}_t = -T_{V_t} \dot{V}_t$, where $T_{V_t}$ is as in the previous lemma. The lemma implies that $t \mapsto \mathbf{h}_t$ is continuous and $\partial \mathbf{h}_t$ is bounded. Thus, Lemma 5.8 shows that there are functions $\mathbf{f}_{t,s}$ satisfying
\[ \mathbf{f}_{t,s} = \mathrm{id} + \int_s^t \mathbf{h}_u \circ \mathbf{f}_{u,s} \, du. \]
Because $\mathbf{h}_t$ is lower-triangular, so is $\mathbf{f}_{t,s}$ (for instance because the Picard iterates are lower-triangular). From basic results on ODE, the functions satisfy the asserted properties under composition. Finally, by Lemma 5.10, since $-\nabla_{V_t}^* \mathbf{h}_t = \dot{V}_t$ modulo constants, we have $(\mathbf{f}_{t,s})_* V_s = V_t$ modulo constants for every $s, t \in [0, T]$.

The operator algebraic consequences of this theorem are similar to Observation 8.5 and Corollary 8.6.

Corollary 8.24.

Let V ∈ V dC ,C ,C with C < √ − , and let X be a d -tuple of non-commutative randomvariables that generate a tracial W ∗ -algebra ( A , τ ) such that E V [ f ] = τ ( f ( X )) . Let S be a standard free semicircular d -tuple that generates the tracial W ∗ -algebra ( B , σ ) ∼ = L ( F d ) . Thenthere exists a tracial W ∗ -isomorphism φ : ( A , τ ) → ( B , σ ) such that φ (C ∗ ( X , . . . , X j )) = C ∗ ( S , . . . , S j )) for j = 1 , . . . , d. In particular, for each j = 1 , . . . , d , C ∗ ( X , . . . , X d ) is the internal reduced free product of C ∗ ( X , . . . , X j ) and C ∗ ( φ − ( S j +1 ) , . . . , φ − ( S d )) . In this section, we compute the derivatives of certain functions on W ( R ∗ d ). If F is a function from W ( R ∗ d ) to some topological vector space, then we will denote formal derivative withrespect to V in tangent directions ˙ V , . . . , ˙ V k by δ k F ( V )[ ˙ V , . . . , ˙ V k ] . If F : W ( R ∗ d ) → C and there is a function G mapping a potential V to some element G ( V ) ∈ T V W ( R ∗ d )that satisﬁes δ F ( V )[ ˙ V ] = h ˙ V , G ( V ) i T V W ( R ∗ d ) , then it is natural we say that G ( V ) is a gradient for F . Due to the degeneracy of h· , ·i T V W ( R ∗ d we do notexpect gradients to be unique. However, there may turn out in some circumstances to be a canonical choiceof gradient that the describes the large N limit of what happens for the random matrix models in the senseof § V ˜ µ V ( g ) for a ﬁxed g ∈ tr( C ∞ tr ( R ∗ d )). Thenext lemma is a precise version of the statement that δ [˜ µ V ( g )][ ˙ V ] = h ˙ V , L V g i T V W ( R ∗ d ) , or that V L V g is a gradient for the expectation functional of g . Proposition 9.1.

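Suppose $t \mapsto V_t$ is a tangent vector to $V = V_0$ in $\mathscr{W}(\mathbb{R}^{*d})$. Assume that each $V_t$ satisfies Assumption 5.14 and that $V$ satisfies Assumption 5.16 and that for some fixed $R$, we have $\mu_{V_t} \in \Sigma_{d,R}$ for all $t$. Then
\[ \frac{d}{dt}\Big|_{t=0} \tilde{\mu}_{V_t}(g) = -\tilde{\mu}_V[\langle \nabla \dot{V}, \nabla \Psi_V g \rangle_{\mathrm{tr}}] = \langle \dot{V}, L_V g \rangle_{T_V \mathscr{W}(\mathbb{R}^{*d})}. \]

Remark 9.2. Our previous results show that all the assumptions of the proposition are satisfied if $\nabla V_t$ is uniformly bounded and $\partial \nabla V_t$ is uniformly bounded by a constant strictly less than 1.

Proof of Proposition 9.1. Since $V$ satisfies Assumption 5.16, we have
\[ g - \tilde{\mu}_V[g] = \nabla_V^* \nabla \Psi_V g = \nabla_{V_t}^* \nabla \Psi_V g - \langle \nabla V_t - \nabla V, \nabla \Psi_V g \rangle_{\mathrm{tr}}. \]
When we apply $\tilde{\mu}_{V_t}$, the term $\nabla_{V_t}^* \nabla \Psi_V g$ will vanish, and thus,
\[ \tilde{\mu}_{V_t}[g] - \tilde{\mu}_V[g] = -\tilde{\mu}_{V_t}[\langle \nabla V_t - \nabla V, \nabla \Psi_V g \rangle_{\mathrm{tr}}]. \]
Now $\nabla V_t - \nabla V \to 0$ in $C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d})^d$ as $t \to 0$. Since we assumed $\mu_{V_t} \in \Sigma_{d,R}$ for all $t$, this implies that $\tilde{\mu}_{V_t}[\langle \nabla V_t - \nabla V, \nabla \Psi_V g \rangle_{\mathrm{tr}}] \to 0$. Since $g$ was an arbitrary smooth scalar-valued function, we therefore have $\mu_{V_t} \to \mu_V$ as $t \to 0$. Thus,
\[ \lim_{t \to 0} \frac{\tilde{\mu}_{V_t}[g] - \tilde{\mu}_V[g]}{t} = -\lim_{t \to 0} \tilde{\mu}_{V_t}\left[ \left\langle \frac{\nabla V_t - \nabla V}{t}, \nabla \Psi_V g \right\rangle_{\mathrm{tr}} \right] = -\tilde{\mu}_V[\langle \nabla \dot{V}, \nabla \Psi_V g \rangle_{\mathrm{tr}}]. \]
It follows from Proposition 5.19 (3) that $-\langle \nabla \dot{V}, \nabla \Psi_V g \rangle_{\mathrm{tr}} = -\langle \nabla \Psi_V \dot{V}, \nabla g \rangle_{\mathrm{tr}} = \langle \dot{V}, L_V g \rangle_V$.

Remark 9.3. There is another heuristic in terms of infinitesimal transport for why this identity is true. Suppose that $V_t = (\mathbf{f}_t)_* V$. Then we expect that $\mu_{V_t} = (\mathbf{f}_t)_* \mu_V$. Hence,
\[ \frac{d}{dt}\Big|_{t=0} \tilde{\mu}_{V_t}(g) = \frac{d}{dt}\Big|_{t=0} \tilde{\mu}_V(g \circ \mathbf{f}_t) = \tilde{\mu}_V[\langle \dot{\mathbf{f}}, \nabla g \rangle_{\mathrm{tr}}] = \tilde{\mu}_V[\langle P_V \dot{\mathbf{f}}, \nabla g \rangle_{\mathrm{tr}}] = -\tilde{\mu}_V[\langle \nabla \Psi_V \dot{V}, \nabla g \rangle_{\mathrm{tr}}]. \]

Definition 9.4.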

The heat flow for non-commutative log-densities is the equation
\[ \dot{V}_t = L_{V_t} V_t = L V_t - \langle \nabla V_t, \nabla V_t \rangle_{\mathrm{tr}} \]
for some smooth map $t \mapsto V_t : [0, \infty) \to \mathscr{W}(\mathbb{R}^{*d})$, where $\dot{V}_t$ denotes the time-derivative.

As in [40], this equation describes the large $N$ limit of the equation that a function $V_t^{(N)}$ on $M_N(\mathbb{C})^d_{\mathrm{sa}}$ satisfies when $\partial_t [e^{-N^2 V_t^{(N)}}] = (1/N) \Delta [e^{-N^2 V_t^{(N)}}]$. Following the classical works of [61] and [62], we will explain why the heat equation can be viewed as the gradient flow on $\mathscr{W}(\mathbb{R}^{*d})$ of the entropy functional.

Fix $\omega \in \beta\mathbb{N} \setminus \mathbb{N}$. For $V$ satisfying Assumption 5.14, we can consider the functional $\mathcal{X}(V) := \chi^\omega(\mu_V)$. More properly, in the notation of §7, we should write $I(\mu_V)$ rather than $\mu_V$, but since the meaning is clear, we will simplify the notation hereafter. The functional $\mathcal{X}$ is the analog of the classical entropy of the free Gibbs law associated to a potential $V$; for a precise relation between the free entropy and classical entropy of random matrix models, see [40] or [42]. The expected formula for the derivative of $\mathcal{X}$ is
\[ \delta \mathcal{X}(V)[\dot{V}] = \langle L_V V, \dot{V} \rangle_{T_V \mathscr{W}(\mathbb{R}^{*d})}, \]
that is, $V \mapsto L_V V$ is a gradient for $\mathcal{X}$. We will only prove this in the case where the tangent vector $t \mapsto V_t$ is given by transport.

Proposition 9.5.

Let $V \in \mathscr{W}(\mathbb{R}^{*d})$ with bounded first and second derivatives and let $V_t = (\mathbf{f}_t)_* V$, where $t \mapsto \mathbf{f}_t$ is a tangent vector to $\mathrm{id}$. Suppose that $V_t$ satisfies Assumption 5.14 for all $t$, and assume that $\partial \mathbf{f}_t$ and $\partial \mathbf{f}_t^{-1}$ are bounded. Then
\[ \frac{d}{dt}\Big|_{t=0} \mathcal{X}(V_t) = \langle L_V V, \dot{V} \rangle_{T_V \mathscr{W}(\mathbb{R}^{*d})}. \]

Proof. By Theorem 7.18, any free Gibbs law for $V$ is actually a non-commutative law (it is exponentially bounded). Since $\mu_V$ is the unique non-commutative law satisfying the Dyson-Schwinger equation by assumption, it is the unique free Gibbs law for $V$. Hence, by Proposition 7.14, $(\mathbf{f}_t)_* \mu_V$ is the unique free Gibbs law for $V_t$, so it satisfies the Dyson-Schwinger equation and thus $(\mathbf{f}_t)_* \mu_V = \mu_{V_t}$. By Proposition 7.14 again,
\[ \chi^\omega((\mathbf{f}_t)_* \mu_V) = \chi^\omega(\mu_V) + \tilde{\mu}_V[\log \Delta(\partial \mathbf{f}_t)]. \]
Therefore,
\begin{align*}
\frac{d}{dt}\Big|_{t=0} \chi^\omega((\mathbf{f}_t)_* \mu_V) &= \tilde{\mu}_V \circ \operatorname{Tr}(\partial \dot{\mathbf{f}}) \\
&= \tilde{\mu}_V[\langle \nabla V, \dot{\mathbf{f}} \rangle_{\mathrm{tr}}] \\
&= \tilde{\mu}_V[\langle \nabla V, P_V \dot{\mathbf{f}} \rangle_{\mathrm{tr}}] \\
&= -\tilde{\mu}_V[\langle \nabla V, \nabla \Psi_V \dot{V} \rangle_{\mathrm{tr}}] \\
&= \langle L_V V, \dot{V} \rangle_{T_V \mathscr{W}(\mathbb{R}^{*d})}.
\end{align*}

Using $L_V V$ as a (conjectural) gradient of the entropy functional $\mathcal{X}(V)$, the (upward) gradient flow of $\mathcal{X}(V)$ is given by the heat equation $\dot{V}_t = L_{V_t} V_t$. Solutions to the corresponding equation on $M_N(\mathbb{C})^d_{\mathrm{sa}}$ were studied in the large $N$ limit by [40] under the assumption that $V$ was uniformly convex and semi-concave. The equation was viewed there as a "mixture" of the flat heat equation $\dot{V}_t = L V_t$, which can be solved explicitly using free Brownian motion, and the Hamilton-Jacobi equation $\dot{V}_t = -\langle \nabla V_t, \nabla V_t \rangle_{\mathrm{tr}}$, which can be solved using the Hopf-Lax inf-convolution semigroup. The earlier approach of Dabrowski [24] applied the Clark-Ocone formula to study the solution on matrices through a stochastic optimization problem. In the non-commutative setting, there are subtle technical questions about which stochastic processes to optimize over (and in particular, in which von Neumann algebra these stochastic processes live).

The derivative of entropy along this gradient flow is computed in the same way as for the classical Wasserstein manifold, namely,
\[ \frac{d}{dt} \mathcal{X}(V_t) = \langle L_{V_t} V_t, \dot{V}_t \rangle_{T_{V_t} \mathscr{W}(\mathbb{R}^{*d})} = \langle L_{V_t} V_t, L_{V_t} V_t \rangle_{T_{V_t} \mathscr{W}(\mathbb{R}^{*d})} = \tilde{\mu}_{V_t}[\langle \nabla V_t, \nabla V_t \rangle_{\mathrm{tr}}]. \]
The right-hand side (under suitable assumptions) is the free Fisher information of $V_t$; see [42, 16.2]. This is the motivation for Voiculescu's definition of the free Fisher information and free entropy $\chi^*$ in [81]. Of course, it is challenging to make this computation rigorous for general $V$; for further discussion, see [7], [24], [40], [42].

Since $\dot{V}_t = L_{V_t} V_t = -\nabla_{V_t}^* \nabla V_t$, in light of Lemma 5.10, there is a natural family of transport maps $\mathbf{f}_t$ associated to the path $t \mapsto V_t$ given by
\[ \mathbf{f}_t = \mathrm{id} + \int_0^t \nabla V_u \circ \mathbf{f}_u \, du. \]
These equations were used in [41] and [42, §17] to construct transport in the non-commutative setting. Of course, the classical analog of these equations has been well-studied, since it comes naturally out of Lafferty's insight that the transport provides local coordinates for the Wasserstein manifold [49, §3] and Otto's result that the heat equation is the gradient flow of the entropy functional [61]. The transport maps arising from the gradient flow were also used by Otto and Villani in their proof of the Talagrand inequality [62, Theorem 1].

More generally, one can write down the gradient flow of the relative entropy functional
\[ \mathcal{X}_W(V) := \chi^\omega_W(\mu_V) = \chi^\omega(\mu_V) - \tilde{\mu}_V(W). \]
Using Proposition 9.1, the natural guess is that
\[ \delta \mathcal{X}_W(V)[\dot{V}] = \langle L_V V, \dot{V} \rangle_{T_V \mathscr{W}(\mathbb{R}^{*d})} - \langle \dot{V}, L_V W \rangle_{T_V \mathscr{W}(\mathbb{R}^{*d})}, \]
that is, that $V \mapsto L_V[V - W]$ is a gradient for $\mathcal{X}_W$. The gradient flow thus becomes
\[ \dot{V}_t = L_{V_t}[V_t - W] = L V_t - L W - \langle \nabla V_t, \nabla V_t \rangle_{\mathrm{tr}} + \langle \nabla W, \nabla V_t \rangle_{\mathrm{tr}}, \]
and the vector field for constructing transport is $\nabla[V_t - W]$. It would be very interesting to study this equation when $V \in \mathscr{W}(\mathbb{R}^{*d})$ is arbitrary and $W$ is close to $(1/2) \langle \mathbf{x}, \mathbf{x} \rangle_{\mathrm{tr}}$ in order to obtain a "transport" proof that $W$ satisfies the non-commutative Talagrand inequality, parallel to [62]; for an SDE proof of the free Talagrand inequality, see [37].

The case where $W = (1/2) \langle \mathbf{x}, \mathbf{x} \rangle_{\mathrm{tr}}$ was studied in [41, 42], and in fact the conditional version of the equation was used to construct triangular transport to the Gaussian case. That paper was able to show $\mathrm{W}^*$ triangular transport using functions that were only approximated in uniform $\|\cdot\|_2$ by trace polynomials rather than in uniform $\|\cdot\|_\infty$. However, since many of the ingredients for that argument have been proved here with the new function spaces $C^k_{\mathrm{tr}}(\mathbb{R}^{*d})$, it is likely that the same argument would work to produce $\mathrm{C}^*$ triangular transport under the assumption that $\|\partial \nabla V - \mathrm{Id}\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M})}$ is bounded by some universal constant smaller than 1. That is, it is unnecessary to assume bounds on the third derivatives to obtain the result of Corollary 8.24.

Definition 9.6 (Geodesic equation). The geodesic equation on $\mathscr{W}(\mathbb{R}^{*d})$ is the pair of equations
\[ \begin{cases} \dot{V}_t = L_{V_t} \phi_t \\ \dot{\phi}_t = -\tfrac{1}{2} \langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}}. \end{cases} \]
The first equation is called the continuity equation and the second one is called the Hamilton-Jacobi equation.

This equation arises as the large $N$ limit of the geodesic equation for densities $e^{-N^2 V^{(N)}}$ on $M_N(\mathbb{C})^d_{\mathrm{sa}}$ after expressing it in log-density coordinates and using the normalized Laplacian $(1/N)\Delta$ and renormalization of time. Moreover, we could formally derive it as a Hamiltonian flow as in the classical case (Lemma 2.36), relying on Proposition 9.1 to differentiate $\tilde{\mu}_V[\langle \nabla \phi, \nabla \phi \rangle_{\mathrm{tr}}]$ with respect to $V$. At present, in order to highlight the connections with optimal transport, we will give a heuristic derivation based on minimizing length, which is closely parallel to the classical case (and also related to the Hamiltonian formulation).

Consider a smooth path $[0, T] \to \mathscr{W}(\mathbb{R}^{*d}) : t \mapsto V_t$ such that $V_t$ satisfies Assumptions 5.14 and 5.16. With appropriate continuity assumptions, it makes sense to write down
\[ \int_0^T \langle \dot{V}_t, \dot{V}_t \rangle_{T_{V_t} \mathscr{W}(\mathbb{R}^{*d})} \, dt. \]
If the curve $t \mapsto V_t$ is a geodesic, then it should minimize this quantity over all paths with the start and end points $V_0$ and $V_T$. Assume that $\tilde{\mu}_{V_t}[\dot{V}_t] = 0$, and let $\phi_t = -\Psi_{V_t} \dot{V}_t$ (plus an arbitrary constant), so that $L_{V_t} \phi_t = \dot{V}_t$. Then
\[ \int_0^T \langle \dot{V}_t, \dot{V}_t \rangle_{T_{V_t} \mathscr{W}(\mathbb{R}^{*d})} \, dt = \int_0^T \tilde{\mu}_{V_t}[\langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}}] \, dt. \]
Assume we can solve the equation $\dot{\mathbf{f}}_t = \nabla \phi_t \circ \mathbf{f}_t$ to obtain a path of diffeomorphisms $\mathbf{f}_t$ satisfying $V_t = (\mathbf{f}_t)_* V_0$ as in Lemma 5.10. This implies under appropriate assumptions that $(\mathbf{f}_t)_* \mu_{V_0} = \mu_{V_t}$ by the same reasoning as in Proposition 6.28. Then note that
\begin{align*}
\int_0^T \tilde{\mu}_{V_t}[\langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}}] \, dt &= \int_0^T ((\mathbf{f}_t)_* \tilde{\mu}_{V_0})[\langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}}] \, dt \\
&= \int_0^T \tilde{\mu}_{V_0}[\langle \nabla \phi_t \circ \mathbf{f}_t, \nabla \phi_t \circ \mathbf{f}_t \rangle_{\mathrm{tr}}] \, dt \\
&= \int_0^T \tilde{\mu}_{V_0}[\langle \dot{\mathbf{f}}_t, \dot{\mathbf{f}}_t \rangle_{\mathrm{tr}}] \, dt.
\end{align*}
Now we could have replaced $\nabla \phi_t$ by an arbitrary vector field $\mathbf{h}_t$ satisfying $-\nabla_{V_t}^* \mathbf{h}_t = \dot{V}_t$, and then the diffeomorphisms $\mathbf{g}_t$ generated as the flow along $\mathbf{h}_t$ would also satisfy $(\mathbf{g}_t)_* V_0 = V_t$.
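However, since $\ker(\nabla_{V_t}^*)$ and $\operatorname{Im}(\nabla)$ are orthogonal with respect to $\mu_{V_t}$, we would have
\[ \int_0^T \tilde{\mu}_{V_0}[\langle \dot{\mathbf{g}}_t, \dot{\mathbf{g}}_t \rangle_{\mathrm{tr}}] \, dt = \int_0^T \tilde{\mu}_{V_t}[\langle \mathbf{h}_t, \mathbf{h}_t \rangle_{\mathrm{tr}}] \, dt \geq \int_0^T \tilde{\mu}_{V_t}[\langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}}] \, dt. \]
Thus, we expect that $\mathbf{f}_t$ minimizes $\int_0^T \tilde{\mu}_{V_0}[\langle \dot{\mathbf{f}}_t, \dot{\mathbf{f}}_t \rangle_{\mathrm{tr}}] \, dt$ among all paths $\mathbf{f}_t$ of diffeomorphisms satisfying $\mathbf{f}_0 = \mathrm{id}$ and $(\mathbf{f}_T)_* V_0 = V_T$.

Next, we use minimality to show that $\ddot{\mathbf{f}}_t = 0$ in $L^2(\mu_{V_0})$. Let $t \mapsto \mathbf{h}_t$ be a smooth map $[0, T] \to C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$ such that $\partial \mathbf{h}_t$ and $\partial^2 \mathbf{h}_t$ are uniformly bounded and $\mathbf{h}_0 = \mathbf{h}_T = 0$. Let $\mathbf{g}_{t,\epsilon}$ be diffeomorphisms given by
\[ \frac{d}{d\epsilon} \mathbf{g}_{t,\epsilon} = \mathbf{h}_t \circ \mathbf{g}_{t,\epsilon}, \qquad \mathbf{g}_{t,0} = \mathrm{id}, \]
or in other words $\mathbf{g}_{t,\epsilon} = \exp(\epsilon \mathbf{h}_t)$. Note that $\mathbf{g}_{0,\epsilon} = \mathbf{g}_{T,\epsilon} = \mathrm{id}$. Using e.g. the integral equation for $\mathbf{g}_{t,\epsilon}$, one can show that $(t, \epsilon) \mapsto \mathbf{g}_{t,\epsilon}$ and $(t, \epsilon) \mapsto \mathbf{g}_{t,\epsilon} \circ \mathbf{f}_t$ are continuously differentiable maps into $C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$, similar to classical ODE results on smooth dependence. Therefore, by minimality,
\begin{align*}
0 &= \frac{d}{d\epsilon}\Big|_{\epsilon=0} \int_0^T \tilde{\mu}_{V_0}\left[ \left\langle \frac{d}{dt}[\mathbf{g}_{t,\epsilon} \circ \mathbf{f}_t], \frac{d}{dt}[\mathbf{g}_{t,\epsilon} \circ \mathbf{f}_t] \right\rangle_{\mathrm{tr}} \right] dt \\
&= 2 \int_0^T \tilde{\mu}_{V_0}\left[ \left\langle \frac{d}{dt} \frac{d}{d\epsilon}\Big|_{\epsilon=0} [\mathbf{g}_{t,\epsilon} \circ \mathbf{f}_t], \dot{\mathbf{f}}_t \right\rangle_{\mathrm{tr}} \right] dt \\
&= 2 \int_0^T \tilde{\mu}_{V_0}\left[ \left\langle \frac{d}{dt}[\mathbf{h}_t \circ \mathbf{f}_t], \dot{\mathbf{f}}_t \right\rangle_{\mathrm{tr}} \right] dt \\
&= -2 \int_0^T \tilde{\mu}_{V_0}\left[ \langle \mathbf{h}_t \circ \mathbf{f}_t, \ddot{\mathbf{f}}_t \rangle_{\mathrm{tr}} \right] dt,
\end{align*}
using integration by parts. Since $\mathbf{h}_t$ is arbitrary except for its values at the endpoints and since $\mathbf{f}_t$ is invertible, we get that $\ddot{\mathbf{f}}_t = 0$ in $L^2(\mu_{V_0})$ for $t \in (0, T)$.

Due to degeneracy of the metric, this does not imply that $\ddot{\mathbf{f}}_t = 0$ in $C_{\mathrm{tr}}(\mathbb{R}^{*d})^d$. Nonetheless, since the same computations work for the random matrix models, we will proceed in faith to impose the condition $\ddot{\mathbf{f}}_t = 0$. By computation,
\begin{align*}
\ddot{\mathbf{f}}_t &= \frac{d}{dt}[\nabla \phi_t \circ \mathbf{f}_t] \\
&= \nabla \dot{\phi}_t \circ \mathbf{f}_t + [\partial \nabla \phi_t \circ \mathbf{f}_t] \dot{\mathbf{f}}_t \\
&= \nabla \dot{\phi}_t \circ \mathbf{f}_t + [\partial \nabla \phi_t \circ \mathbf{f}_t][\nabla \phi_t \circ \mathbf{f}_t] \\
&= [\nabla \dot{\phi}_t + \partial \nabla \phi_t \nabla \phi_t] \circ \mathbf{f}_t.
\end{align*}
Hence,
\[ \nabla \dot{\phi}_t + \partial \nabla \phi_t \nabla \phi_t = 0. \tag{9.1} \]
But note that $\nabla \langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}} = 2 \partial \nabla \phi_t \nabla \phi_t$, which follows from the computation
\begin{align*}
\langle \nabla \langle \nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X}), \nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X}) \rangle_\tau, \mathbf{Y} \rangle_\tau &= \partial[\langle \nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X}), \nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X}) \rangle_\tau][\mathbf{Y}] \\
&= 2 \langle \nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X}), \partial \nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X})[\mathbf{Y}] \rangle_\tau \\
&= 2 \langle \partial \nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X})[\nabla \phi_t^{\mathcal{A},\tau}(\mathbf{X})], \mathbf{Y} \rangle_\tau,
\end{align*}
where we use the fact that $(\nabla \partial \phi)^{\star} = \nabla \partial \phi$ since $\phi$ is real-valued. Therefore, (9.1) becomes
\[ \nabla\left[ \dot{\phi}_t + \tfrac{1}{2} \langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}} \right] = 0. \]
Thus, we can modify $\phi_t$ by an additive constant (depending on $t$) to achieve that $\dot{\phi}_t = -(1/2) \langle \nabla \phi_t, \nabla \phi_t \rangle_{\mathrm{tr}}$. This is exactly the Hamilton-Jacobi equation, so our derivation is complete.

If $\phi_t$ satisfies the Hamilton-Jacobi equation, the same computations show that $\ddot{\mathbf{f}}_t = 0$, and hence $\mathbf{f}_t = \mathrm{id} + t \dot{\mathbf{f}}_0 = \mathrm{id} + t \nabla \phi_0$. Thus, $V_t = (\mathrm{id} + t \nabla \phi_0)_* V_0$ for some $\phi_0$, or in other words, a path in $\mathscr{W}(\mathbb{R}^{*d})$ that solves the geodesic equation is a displacement interpolation just as in the classical case.

However, does such a displacement interpolation actually minimize the Riemannian distance? If $\mathbf{f}_t$ is any family of transport maps with $\mathbf{f}_0 = \mathrm{id}$ and $(\mathbf{f}_T)_* V_0 = V_T$, then (still assuming the validity of $(\mathbf{f}_T)_* \mu_{V_0} = \mu_{V_T}$)
\begin{align*}
\tilde{\mu}_{V_0}[\langle \mathbf{f}_T - \mathrm{id}, \mathbf{f}_T - \mathrm{id} \rangle_{\mathrm{tr}}]^{1/2} &\leq \int_0^T \tilde{\mu}_{V_0}[\langle \dot{\mathbf{f}}_t, \dot{\mathbf{f}}_t \rangle_{\mathrm{tr}}]^{1/2} \, dt \\
&\leq T^{1/2} \left( \int_0^T \tilde{\mu}_{V_0}[\langle \dot{\mathbf{f}}_t, \dot{\mathbf{f}}_t \rangle_{\mathrm{tr}}] \, dt \right)^{1/2} \\
&= T^{1/2} \left( \int_0^T \langle \dot{V}_t, \dot{V}_t \rangle_{T_{V_t} \mathscr{W}(\mathbb{R}^{*d})} \, dt \right)^{1/2},
\end{align*}
and equality is achieved when $\dot{\mathbf{f}}_t$ is constant. Hence, to show that a family of transport maps $\mathbf{f}_t$ is minimal, it suffices to show that $\tilde{\mu}_{V_0}[\langle \mathbf{f}_T - \mathrm{id}, \mathbf{f}_T - \mathrm{id} \rangle_{\mathrm{tr}}]^{1/2}$ is minimal among all $\mathbf{f}$ with $\mathbf{f}_* \mu_{V_0} = \mu_{V_T}$. And this is a much stronger condition since we could easily have $\mathbf{f}_* \mu_{V_0} = \mu_{V_0}$ without $\mathbf{f}_* V_0 = V_0$ due to the degeneracy of the Riemannian metric.

The quantity $\tilde{\mu}_{V_0}[\langle \mathbf{f} - \mathrm{id}, \mathbf{f} - \mathrm{id} \rangle_{\mathrm{tr}}]^{1/2}$ is related to the non-commutative $L^2$ Wasserstein distance of [11] defined as follows.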

Definition 9.7 (Non-commutative $L^2$ coupling distance). For $\mu$ and $\nu \in \Sigma_d$, we define
\[ d_{W,2}(\mu, \nu) = \inf\{ \|\mathbf{X} - \mathbf{Y}\|_2 : \mathbf{X}, \mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}, \ (\mathcal{A}, \tau) \in \mathbb{W}, \ \lambda_{\mathbf{X}} = \mu, \ \lambda_{\mathbf{Y}} = \nu \}. \]
If $(\mathcal{A}, \tau)$ and $\mathbf{X}, \mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}$ achieve the infimum above, then they are called an optimal coupling of $\mu$ and $\nu$.

Remark 9.8. The existence of optimal couplings is immediate from compactness. Indeed, let $\Pi(\mu, \nu)$ be the set of $\pi \in \Sigma_{2d}$ such that the marginals on the first and last $d$ coordinates are $\mu$ and $\nu$ respectively. Then $\Pi(\mu, \nu)$ is contained in $\Sigma_{2d,R}$ and is compact. Because $\pi \mapsto \langle \mathbf{x} - \mathbf{y}, \mathbf{x} - \mathbf{y} \rangle_\pi^{1/2}$ is a continuous function on $\Pi(\mu, \nu)$, it achieves a minimum. However, it is challenging in the non-commutative case to prove any regularity for the optimal coupling, and indeed we know that there are many non-isomorphic diffuse tracial $\mathrm{W}^*$-algebras, so we do not expect optimal couplings to be given by transport functions in general.

Returning to our geodesic $V_t = (\mathrm{id} + t \nabla \phi_0)_* V_0$, we want to show that $\mathrm{id} + t \nabla \phi_0$ provides an optimal coupling between $\mu_{V_0}$ and $\mu_{V_t}$, where $\mu_{V_t} = (\mathrm{id} + t \nabla \phi_0)_* \mu_{V_0}$. In fact, since the potential $V_t$ and the interpolation $\mathrm{id} + t \nabla \phi_0$ are no longer important for the proof, let us proceed more generally. Forgetting about $V_t$ and renaming $(1/2) \langle \mathbf{x}, \mathbf{x} \rangle_{\mathrm{tr}} + t \phi_0$ as $\phi$, it suffices to show that if $\partial \nabla \phi$ is close enough to $\mathrm{Id}$, then $\nabla \phi$ provides an optimal coupling between $\mu$ and $(\nabla \phi)_* \mu$ for every non-commutative law $\mu$. That is the content of the next proposition. This is a non-commutative version of one of the easier implications of the Monge-Kantorovich characterization of transport, and it holds without any assumption that $\mu$ is a free Gibbs law or even Connes-approximable.

Proposition 9.9 (Optimality of certain transport maps). Let $\phi \in \operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))_{\mathrm{sa}}$ for some $k \geq 2$. Suppose that for some $K > 0$, we have $\|\partial \nabla \phi - K \, \mathrm{Id}\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M})^d} < K$. Then for every $\mu \in \Sigma_d$, we have
\[ d_{W,2}(\mu, (\nabla \phi)_* \mu)^2 = \tilde{\mu}[\langle \nabla \phi - \mathrm{id}, \nabla \phi - \mathrm{id} \rangle_{\mathrm{tr}}]. \]
In other words, if $\mathbf{X}$ is a self-adjoint $d$-tuple from $(\mathcal{A}, \tau) \in \mathbb{W}$, then $\mathbf{X}$ and $\nabla \phi^{\mathcal{A},\tau}(\mathbf{X})$ are an optimal coupling of $\lambda_{\mathbf{X}}$ and $(\nabla \phi)_* \lambda_{\mathbf{X}}$.

In the proof, we "reverse-engineer" the Monge-Kantorovich duality. We must first construct the Legendre transform $\psi$ of $\phi$. The Legendre transform in the classical setting is a convex function given by
\[ \psi(x) = \sup_y [\langle x, y \rangle - \phi(y)]. \]
If $\phi$ is smooth and strictly convex, then the supremum for $\psi(x)$ is achieved at $y = (\nabla \phi)^{-1}(x)$.
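The classical statement can be checked numerically in one variable. The following is only a minimal sketch and is not taken from the paper: the potential $\phi(y) = y^2/2 + 0.1 \cos y$ and every function name below are illustrative assumptions. Since $|\phi'' - 1| \leq 0.1 < 1$, this $\phi$ is a scalar analog of the hypothesis $\|\partial \nabla \phi - K \, \mathrm{Id}\| < K$ with $K = 1$, and $\phi'$ can be inverted by a contraction iteration. The sketch compares the Legendre transform computed by inverting $\phi'$ with a brute-force supremum over a grid.

```python
import math


def phi(y):
    # Illustrative strictly convex potential (an assumption, not from the paper).
    return 0.5 * y * y + 0.1 * math.cos(y)


def grad_phi(y):
    # Derivative of phi; note |phi''(y) - 1| <= 0.1 < 1.
    return y - 0.1 * math.sin(y)


def inverse_grad_phi(x, iterations=60):
    # Solve y - 0.1*sin(y) = x via the contraction y <- x + 0.1*sin(y),
    # whose Lipschitz constant is 0.1.
    y = x
    for _ in range(iterations):
        y = x + 0.1 * math.sin(y)
    return y


def legendre_via_inverse(x):
    # psi(x) = x*g(x) - phi(g(x)) with g = (phi')^{-1}.
    y = inverse_grad_phi(x)
    return x * y - phi(y)


def legendre_via_grid(x, half_width=10.0, points=200001):
    # Brute-force supremum of x*y - phi(y) over a uniform grid in y.
    step = 2.0 * half_width / (points - 1)
    return max(
        x * y - phi(y)
        for y in (-half_width + i * step for i in range(points))
    )


if __name__ == "__main__":
    for x in (-1.5, 0.0, 0.7):
        print(x, legendre_via_inverse(x), legendre_via_grid(x))
```

The two columns printed by the demo should agree up to the grid resolution, and a finite-difference derivative of `legendre_via_inverse` should match `inverse_grad_phi`, which is the scalar form of $\nabla \psi = (\nabla \phi)^{-1}$.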
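Hence, the cheapest way to obtain a smooth Legendre transform for a smooth non-commutative function $\phi$ is to invert $\nabla \phi$.

Lemma 9.10 (Smooth non-commutative Legendre transform). Let $\phi \in \operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))_{\mathrm{sa}}$ for some $k \geq 2$. Suppose that for some $K > 0$, we have $\|\partial \nabla \phi - K \, \mathrm{Id}\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M})^d} < K$. Let $\mathbf{g}$ be the inverse of $\nabla \phi$ as in Proposition 3.23, and let $\psi$ be given by
\[ \psi^{\mathcal{A},\tau}(\mathbf{Y}) := \langle \mathbf{Y}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) \rangle_\tau - \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})). \tag{9.2} \]
Then for all $(\mathcal{A}, \tau) \in \mathbb{W}$ and $\mathbf{Y} \in \mathcal{A}^d_{\mathrm{sa}}$, we have
\[ \psi^{\mathcal{A},\tau}(\mathbf{Y}) = \sup_{\mathbf{X} \in \mathcal{A}^d_{\mathrm{sa}}} \left[ \langle \mathbf{Y}, \mathbf{X} \rangle_\tau - \phi^{\mathcal{A},\tau}(\mathbf{X}) \right]. \tag{9.3} \]
Moreover, $\nabla \psi = \mathbf{g}$ and hence $\psi \in \operatorname{tr}(C^k_{\mathrm{tr}}(\mathbb{R}^{*d}))$.

Proof. Fix $(\mathcal{A}, \tau)$ and $\mathbf{Y}, \mathbf{Z} \in \mathcal{A}^d_{\mathrm{sa}}$. Let $h : \mathbb{R} \to \mathbb{R}$ be given by
\[ h(t) = \langle \mathbf{Y}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) + t \mathbf{Z} \rangle_\tau - \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) + t \mathbf{Z}). \]
Then
\[ h'(t) = \langle \mathbf{Y}, \mathbf{Z} \rangle_\tau - \langle \nabla \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) + t \mathbf{Z}), \mathbf{Z} \rangle_\tau \]
and
\[ h''(t) = -\langle \partial \nabla \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) + t \mathbf{Z})[\mathbf{Z}], \mathbf{Z} \rangle_\tau. \]
Because $\|\partial \nabla \phi - K \, \mathrm{Id}\|_{\mathrm{tr}} < K$, we obtain $h''(t) \leq 0$, so $h$ is concave. Also, since $\nabla \phi \circ \mathbf{g} = \mathrm{id}$, we have $h'(0) = 0$. Therefore, $h$ is maximized at $t = 0$, so that
\[ \langle \mathbf{Y}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) + \mathbf{Z} \rangle_\tau - \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) + \mathbf{Z}) \leq \langle \mathbf{Y}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) \rangle_\tau - \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})) = \psi^{\mathcal{A},\tau}(\mathbf{Y}). \]
By substituting $\mathbf{X} - \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})$ for $\mathbf{Z}$, we obtain (9.3).

Next, by direct computation,
\begin{align*}
\langle \nabla \psi^{\mathcal{A},\tau}(\mathbf{Y}), \mathbf{Z} \rangle_\tau &= \partial[\langle \mathbf{Y}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) \rangle_\tau - \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}))][\mathbf{Z}] \\
&= \langle \mathbf{Z}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) \rangle_\tau + \langle \mathbf{Y}, \partial \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})[\mathbf{Z}] \rangle_\tau - \langle \nabla \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})), \partial \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})[\mathbf{Z}] \rangle_\tau \\
&= \langle \mathbf{Z}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) \rangle_\tau,
\end{align*}
where the last two terms cancel because $\nabla \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})) = \mathbf{Y}$. Hence, $\nabla \psi = \mathbf{g}$, and so $\psi$ is $C^k_{\mathrm{tr}}$ by the chain rule.

Proof of Proposition 9.9.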

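Let $\mathbf{X}$ be a self-adjoint $d$-tuple from $(\mathcal{A}, \tau)$ with non-commutative law $\mu$, and let $\nu = (\nabla \phi)_* \mu$. As in the previous lemma, let $\mathbf{g} = (\nabla \phi)^{-1}$ and let $\psi$ be the Legendre transform of $\phi$. Writing $\mathbf{Y} = \nabla \phi^{\mathcal{A},\tau}(\mathbf{X})$, so that $\mathbf{X} = \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})$, we have
\[ \langle \mathbf{Y}, \mathbf{X} \rangle_\tau = \langle \mathbf{Y}, \mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y}) \rangle_\tau = \psi^{\mathcal{A},\tau}(\mathbf{Y}) + \phi^{\mathcal{A},\tau}(\mathbf{g}^{\mathcal{A},\tau}(\mathbf{Y})) = \psi^{\mathcal{A},\tau}(\mathbf{Y}) + \phi^{\mathcal{A},\tau}(\mathbf{X}). \]
If $\mathbf{X}'$ and $\mathbf{Y}'$ are any other $d$-tuples from some $(\mathcal{B}, \sigma)$ with the same laws as $\mathbf{X}$ and $\mathbf{Y}$, then by (9.3)
\[ \langle \mathbf{Y}', \mathbf{X}' \rangle_\sigma \leq \psi^{\mathcal{B},\sigma}(\mathbf{Y}') + \phi^{\mathcal{B},\sigma}(\mathbf{X}') = \psi^{\mathcal{A},\tau}(\mathbf{Y}) + \phi^{\mathcal{A},\tau}(\mathbf{X}) = \langle \mathbf{Y}, \mathbf{X} \rangle_\tau, \]
where we have used the fact that evaluation of $\phi$ and $\psi$ only depends on the non-commutative law of the argument. Therefore, the coupling $\mathbf{X}, \mathbf{Y}$ maximizes the inner product and therefore minimizes the $L^2$-distance (since $\|\mathbf{X}\|_2$ and $\|\mathbf{Y}\|_2$ are uniquely determined by the fixed laws $\mu$ and $\nu$). Hence, we have an optimal coupling.

Remark 9.11. Proposition 9.9 partially answers a question of [35]. That work considered a law $\mu_V$ with $V = \operatorname{tr}(f)$ for some non-commutative power series $f$ on an operator-norm ball of some radius $R$, and showed the existence of another power series $g$ such that $(\mathrm{id} + \nabla \operatorname{tr}(g))_* \sigma = \mu_V$, where $\sigma$ is the law of a semicircular family. Moreover, $\operatorname{tr}(g)$ goes to zero in a certain power-series norm as $\operatorname{tr}(f)$ goes to zero. The paper did not settle whether the transport map constructed there was optimal, but we can prove this with Proposition 9.9 if $\operatorname{tr}(g)$ is small enough. Let $\gamma : \mathbb{R} \to [-R, R]$ be a smooth compactly supported function with $\gamma(t) = t$ for $t \in [-R, R]$. If $\operatorname{tr}(g)$ is sufficiently small, then $\phi = \operatorname{tr}(g) \circ (\gamma(x_1), \dots, \gamma(x_d))$ will satisfy $\|\partial \nabla \phi\|_{BC_{\mathrm{tr}}(\mathbb{R}^{*d}, \mathscr{M})^d} < 1$. Hence, Proposition 9.9 shows that $\mathrm{id} + \nabla \phi$ defines an optimal coupling between $\sigma$ and $\mu$.

9.4 Incompressible Euler equation and inviscid Burgers' equation

Definition 9.12.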

Let $V$ satisfy Assumptions 5.14 and 5.16. Let $P_V = \nabla \Psi_V \nabla_V^*$, and let $\Pi_V = 1 - P_V$ be the Leray projection. The (tracial non-commutative) incompressible Euler equation is the equation
\[
\begin{cases}
\dot{u}_t = -\Pi_V[\partial u_t \, u_t], \\
\nabla_V^* u_t = 0.
\end{cases}
\]
This equation was formulated in the framework of non-commutative polynomials (and from there a certain completion of the space) in [88]. Here Voiculescu imitated the approach of Arnold in the classical setting. Arnold related the incompressible Euler equation to the geodesic equation on the group of diffeomorphisms on some Riemannian manifold that preserve a given measure; more precisely, if $t \mapsto f_t$ is the geodesic, then $u_t = \dot{f}_t \circ f_t^{-1}$, that is, the right-shift of $\dot{f}_t$ to a tangent vector at $\mathrm{id}$.

The non-commutative incompressible Euler equation could be derived by normalizing the classical incompressible Euler equation on $M_N(\mathbb{C})^d_{\mathrm{sa}}$, but we will give a direct heuristic based on geodesics minimizing length, similar to the earlier derivation of the geodesic equation on $\mathscr{W}(\mathbb{R}^{*d})$. Recall that $\mathscr{D}(\mathbb{R}^{*d}, V)$ is the group of non-commutative diffeomorphisms $f$ with $f_* V = V$. A semi-inner product can be defined on $T_{\mathrm{id}} \mathscr{D}(\mathbb{R}^{*d}, V)$ by
\[
\langle h_1, h_2 \rangle_{T_{\mathrm{id}} \mathscr{D}(\mathbb{R}^{*d}, V)} = \tilde{\mu}_V[\langle h_1, h_2 \rangle_{\mathrm{tr}}].
\]
We extend this to a right-invariant formal Riemannian metric on $\mathscr{D}(\mathbb{R}^{*d}, V)$. Since the diffeomorphisms are elements of the vector space $C^d_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$, we can view tangent vectors at $f$ concretely as elements of $C_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$, and the right-shift by $f^{-1}$ of a tangent vector $h$ at $f$ produces the tangent vector $h \circ f^{-1}$ at $\mathrm{id}$. Since $f$ preserves $V$ and hence (again under some reasonable assumptions) $\mu_V$, the Riemannian metric at an arbitrary point is given by the same formula as at $\mathrm{id}$.

Suppose that $[0, T] \to \mathscr{D}(\mathbb{R}^{*d}, V) : t \mapsto f_t$ minimizes the integral
\[
\int_0^T \tilde{\mu}_V[\langle \dot{f}_t, \dot{f}_t \rangle_{\mathrm{tr}}] \, dt
\]
over all paths with the same start and end points.
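The claim that the metric at an arbitrary point is given by the same formula as at $\mathrm{id}$ can be spelled out as follows. This is only a sketch and is not taken from the source: it assumes that $\tilde{\mu}_V[\phi]$ denotes the evaluation of a scalar-valued tracial function $\phi$ on a tuple with law $\mu_V$, and that $f_* \mu_V = \mu_V$ (hence also $(f^{-1})_* \mu_V = \mu_V$) for $f \in \mathscr{D}(\mathbb{R}^{*d}, V)$. For tangent vectors $h_1, h_2$ at $f$:

```latex
\begin{align*}
\langle h_1, h_2 \rangle_{T_f \mathscr{D}(\mathbb{R}^{*d}, V)}
&:= \langle h_1 \circ f^{-1}, h_2 \circ f^{-1} \rangle_{T_{\mathrm{id}} \mathscr{D}(\mathbb{R}^{*d}, V)}
   && \text{(right-invariant extension)} \\
&= \tilde{\mu}_V\bigl[ \langle h_1, h_2 \rangle_{\mathrm{tr}} \circ f^{-1} \bigr]
   && \text{(definition at } \mathrm{id}\text{)} \\
&= \tilde{\mu}_V\bigl[ \langle h_1, h_2 \rangle_{\mathrm{tr}} \bigr]
   && \text{(assumed: } (f^{-1})_* \mu_V = \mu_V\text{)}.
\end{align*}
```

Taking $h_1 = h_2 = \dot{f}_t$ explains why the energy integrand can be written as $\tilde{\mu}_V[\langle \dot{f}_t, \dot{f}_t \rangle_{\mathrm{tr}}]$ without an explicit right-shift by $f_t^{-1}$.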
Let $u_t = \dot{f}_t \circ f_t^{-1}$, so that $\dot{f}_t = u_t \circ f_t$ and $\nabla_V^* u_t = 0$ by Lemma 5.10 since $f_t$ preserves $V$. Let $h_t$ be another time-dependent vector field with bounded first derivative such that $h_0 = h_T = 0$ and $\nabla_V^* h_t = 0$. Let $g_{t,\epsilon} = \exp(\epsilon h_t)$, and note that $g_{t,\epsilon}$ is in $\mathscr{D}(\mathbb{R}^{*d}, V)$ by Corollary 5.11, hence $g_{t,\epsilon} \circ f_t$ is another candidate for the minimizer. Thus, as in the previous section,
\begin{align*}
0 &= \frac{d}{d\epsilon}\Big|_{\epsilon=0} \int_0^T \tilde{\mu}_V\left[ \left\langle \frac{d}{dt}[g_{t,\epsilon} \circ f_t], \frac{d}{dt}[g_{t,\epsilon} \circ f_t] \right\rangle_{\mathrm{tr}} \right] dt \\
&= 2 \int_0^T \tilde{\mu}_V\left[ \left\langle \frac{d}{dt} \frac{d}{d\epsilon}\Big|_{\epsilon=0} [g_{t,\epsilon} \circ f_t], \dot{f}_t \right\rangle_{\mathrm{tr}} \right] dt \\
&= -2 \int_0^T \tilde{\mu}_V\left[ \left\langle h_t \circ f_t, \ddot{f}_t \right\rangle_{\mathrm{tr}} \right] dt \\
&= -2 \int_0^T \tilde{\mu}_V\left[ \left\langle h_t, \ddot{f}_t \circ f_t^{-1} \right\rangle_{\mathrm{tr}} \right] dt.
\end{align*}
Now $h_t$ was arbitrary with $\nabla_V^* h_t = 0$ and $h_0 = h_T = 0$. Although we have not proved that elements of $\ker(\nabla_V^*)$ with bounded derivative are dense in $\ker(\nabla_V^*)$, we proceed under the assumption that $\ddot{f}_t \circ f_t^{-1}$ is orthogonal to $\ker(\nabla_V^*)$. Then, despite the degeneracy of the Riemannian metric, we posit that $\ddot{f}_t \circ f_t^{-1}$ is a gradient, or that $\Pi_V[\ddot{f}_t \circ f_t^{-1}] = 0$. But note that
\[
\ddot{f}_t = \frac{d}{dt}[u_t \circ f_t] = \dot{u}_t \circ f_t + (\partial u_t \circ f_t)(u_t \circ f_t),
\]
hence
\[
\Pi_V[\dot{u}_t + \partial u_t \, u_t] = 0.
\]
Moreover, $\Pi_V \dot{u}_t = \dot{u}_t$ because $\nabla_V^* u_t = 0$ for all $t$, so this is the incompressible Euler equation.

One can also proceed using Arnold's framework for geodesics on Lie groups with a right-invariant Riemannian metric. He showed that the angular velocity $u_t$ of a geodesic must satisfy $\dot{u}_t = -B(u_t, u_t)$, where $B$ is the bilinear form on the Lie algebra defined by $\langle [h_1, h_2], h_3 \rangle = \langle B(h_3, h_1), h_2 \rangle$. This was the approach followed by Voiculescu [88] in the non-commutative setting. We present here a version of [88, Lemma 1] for tracial non-commutative smooth functions.

Lemma 9.13.

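Let $V$ satisfy Assumptions 5.14 and 5.16. For $h_1, h_2 \in \ker(\nabla_V^*)$, let
\[
B(h_1, h_2) := \Pi_V[\partial h_1 \, h_2 + (\partial h_2)^\star h_1].
\]
Then for $h_1, h_2, h_3 \in \ker(\nabla_V^*) \cap C^\infty_{\mathrm{tr}}(\mathbb{R}^{*d})^d_{\mathrm{sa}}$, we have
\[
\tilde{\mu}_V[\langle [h_1, h_2], h_3 \rangle_{\mathrm{tr}}] = \tilde{\mu}_V[\langle B(h_3, h_1), h_2 \rangle_{\mathrm{tr}}].
\]
Moreover, $B(h, h) = \Pi_V[\partial h \, h]$.

Proof.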

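Note that
\[
\nabla \langle h_2, h_3 \rangle_{\mathrm{tr}} = (\partial h_2)^\star h_3 + (\partial h_3)^\star h_2,
\]
which we can see from evaluating at some $X \in \mathcal{A}^d_{\mathrm{sa}}$ and pairing with a tangent vector $Y \in \mathcal{A}^d$. By Proposition 5.19, $h_1$ is orthogonal to gradients. Thus,
\[
\tilde{\mu}_V[\langle h_1, (\partial h_2)^\star h_3 \rangle_{\mathrm{tr}} + \langle h_1, (\partial h_3)^\star h_2 \rangle_{\mathrm{tr}}] = 0.
\]
Therefore, writing $[h_1, h_2] = \partial h_1 \, h_2 - \partial h_2 \, h_1$,
\begin{align*}
\tilde{\mu}_V[\langle [h_1, h_2], h_3 \rangle_{\mathrm{tr}}]
&= \tilde{\mu}_V[\langle \partial h_1 \, h_2 - \partial h_2 \, h_1, h_3 \rangle_{\mathrm{tr}}] \\
&= \tilde{\mu}_V[\langle h_2, (\partial h_1)^\star h_3 \rangle_{\mathrm{tr}}] - \tilde{\mu}_V[\langle h_1, (\partial h_2)^\star h_3 \rangle_{\mathrm{tr}}] \\
&= \tilde{\mu}_V[\langle (\partial h_1)^\star h_3, h_2 \rangle_{\mathrm{tr}}] + \tilde{\mu}_V[\langle h_1, (\partial h_3)^\star h_2 \rangle_{\mathrm{tr}}] \\
&= \tilde{\mu}_V[\langle (\partial h_1)^\star h_3 + \partial h_3 \, h_1, h_2 \rangle_{\mathrm{tr}}].
\end{align*}
Since $h_2$ is in the kernel of $\nabla_V^*$, we have $h_2 = \Pi_V h_2$. After inserting the $\Pi_V$ into the equation, we can move it to the other side of the inner product by Proposition 5.19 (5) to obtain $\tilde{\mu}_V[\langle B(h_3, h_1), h_2 \rangle_{\mathrm{tr}}]$.

For the second claim, note that $(\partial h)^\star h = \tfrac{1}{2} \nabla \langle h, h \rangle_{\mathrm{tr}}$, and thus it is killed by $\Pi_V$. The only remaining term is $\Pi_V[\partial h \, h]$.

The formula $\dot{u}_t = -B(u_t, u_t)$ clearly gives the same incompressible Euler equation.

We remark that the geodesic equation on $\mathscr{D}(\mathbb{R}^{*d})$ can be derived in a similar way. Fixing $V$, we can define a right-invariant Riemannian metric by $\langle h_1, h_2 \rangle_{T_f \mathscr{D}(\mathbb{R}^{*d})} = \tilde{\mu}_V[\langle h_1 \circ f^{-1}, h_2 \circ f^{-1} \rangle_{\mathrm{tr}}]$. The minimality condition results in $\ddot{f}_t \circ f_t^{-1}$ being zero in $L^2(\mu_V)^d$. We posit that $\ddot{f}_t$ is actually zero. Since $\ddot{f}_t = (\dot{u}_t + \partial u_t \, u_t) \circ f_t$, this results in the equation $\dot{u}_t = -\partial u_t \, u_t$, where $u_t = \dot{f}_t \circ f_t^{-1}$. This is the tracial non-commutative inviscid Burgers' equation. The case where $u_t = \nabla \phi_t$ gives exactly the Wasserstein geodesics.

References

[1] A. B. Aleksandrov and V. V. Peller. Functions of perturbed unbounded self-adjoint operators. Operator Bernstein type inequalities.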

Indiana University Mathematics Journal, 59:1451–1490, 04 2010.

[2] A. B. Aleksandrov and V. V. Peller. Operator Hölder-Zygmund functions. Advances in Mathematics, 224(3):910–966, 2010.

[3] A. B. Aleksandrov and V. V. Peller. Multiple operator integrals, Haagerup and Haagerup-like tensor products, and operator ideals. Bulletin of the London Mathematical Society, 49(3):463–479, 2017.

[4] A. B. Aleksandrov, F. L. Nazarov, and V. V. Peller. Functions of noncommuting self-adjoint operators under perturbation and estimates of triple operator integrals. Advances in Mathematics, 295:1–52, 2016.

[5] Greg W. Anderson, Alice Guionnet, and Ofer Zeitouni. An Introduction to Random Matrices. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2009.

[6] Vladimir I. Arnold. Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l'hydrodynamique des fluides parfaits. Ann. Inst. Fourier (Grenoble), 16:319–361, 1966.

[7] P. Biane, M. Capitaine, and A. Guionnet. Large deviation bounds for matrix Brownian motion. Inventiones Mathematicae, 152:433–459, 2003.

[8] Philippe Biane. Free Brownian motion, free stochastic calculus and random matrices. In Dan-Virgil Voiculescu, editor, Free Probability Theory, volume 12 of Fields Institute Communications, pages 1–19. American Mathematical Society, Providence, 1997.

[9] Philippe Biane and Roland Speicher. Stochastic calculus with respect to free Brownian motion and analysis on Wigner space. Probab. Theory Relat. Fields, 112:373–409, 1998.

[10] Philippe Biane and Roland Speicher. Free diffusions, free entropy and free Fisher information. Annales de l'Institut Henri Poincaré (B) Probability and Statistics, 37(5):581–606, 2001.

[11] Philippe Biane and Dan-Virgil Voiculescu. A free probability analogue of the Wasserstein metric on the trace-state space. Geometric and Functional Analysis, 11:1125–1138, 2001.

[12] S. G. Bobkov and M. Ledoux. From Brunn-Minkowski to Brascamp-Lieb and to logarithmic Sobolev inequalities. Geometric and Functional Analysis, 10:1028–1052, 2000.

[13] V. I. Bogachev, A. V. Kolesnikov, and K. V. Medvedev. Triangular transformations of measures. Sbornik Mathematics, 196(3):309–335, 2005.

[14] Gaëtan Borot and Alice Guionnet. Asymptotic expansion of β matrix models in the multi-cut regime. Preprint, arXiv:1303.1045, 2013.

[15] Gaëtan Borot and Alice Guionnet. Asymptotic expansion of β matrix models in the one-cut regime. Communications in Mathematical Physics, 317(2):447–483, 2013.

[16] Gaëtan Borot, Alice Guionnet, and Karol K. Kozlowski. Large-N Asymptotic Expansion for Mean Field Models with Coulomb Gas Interaction. International Mathematics Research Notices, 2015(20):10451–10524, 01 2015.

[17] A. Boutet de Monvel, L. Pastur, and M. Shcherbina. On the statistical mechanics approach in the random matrix theory: Integrated density of states. Journal of Statistical Physics, 79:585–611, 05 1995.

[18] Herm Jan Brascamp and Elliott H. Lieb. On extensions of the Brunn-Minkowski and Prékopa-Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation. Journal of Functional Analysis, 22:366–389, 1976.

[19] Yann Brenier and Dimitry Vorotnikov. On optimal transport for matrix-valued measures. SIAM J. Math. Anal., 52(3):2849–2873, 2020.

[20] Guillaume Cébron. Free convolution operators and free Hall transform. Journal of Functional Analysis, 265(11):2645–2708, 2013.

[21] Y. Chen, T. T. Georgiou, and A. Tannenbaum. Matrix optimal mass transport: A quantum mechanical approach. IEEE Transactions on Automatic Control, 63(8):2612–2619, 2018.

[22] Shui-Nee Chow, Wuchen Li, and Haomin Zhou. A discrete Schrödinger equation via optimal transport on graphs. Journal of Functional Analysis, 276(8):2440–2469, 2019.

[23] Ricardo Correa da Silva. Lecture notes on non-commutative $L^p$-spaces. arXiv:1803.02390, 2018.

[24] Yoann Dabrowski. A Laplace principle for Hermitian Brownian motion and free entropy I: the convex functional case. arXiv:1604.06420, 2017.

[25] Yoann Dabrowski. A non-commutative path space approach to stationary free stochastic differential equations. arXiv:1006.4351, 2010.

[26] Yoann Dabrowski, Alice Guionnet, and Dimitri Shlyakhtenko. Free transport for convex potentials. arXiv:1701.00132, 2016.

[27] Benjamin Dadoun and Pierre Youssef. Maximal correlation and monotonicity of free entropy. arXiv:2011.03045, 2020.

[28] Bruce K. Driver, Brian C. Hall, and Todd Kemp. The large-$n$ limit of the Segal-Bargmann transform on $U_n$. Journal of Functional Analysis, 265(11):2585–2644, 2013.

[29] D. G. Ebin and J. E. Marsden. Groups of diffeomorphisms and the flow of an incompressible fluid. Ann. of Math. (2), 92:102–163, 1970.

[30] Ilias Farah, Bradd Hart, and David Sherman. Model theory of operator algebras II: model theory. Israel Journal of Mathematics, 201(1):477–505, 2014.

[31] Bent Fuglede and Richard V. Kadison. Determinant theory in finite factors. Ann. Math. (2), 55(3):520–530, 05 1952.

[32] Leonard Gross. Logarithmic Sobolev inequalities. American Journal of Mathematics, 97(4):1061–1083, 1975.

[33] Alice Guionnet and Edouard Maurel-Segala. Combinatorial aspects of random matrix models. Latin American Journal of Probability and Statistics (ALEA), 1:241–279, 2006.

[34] Alice Guionnet and Dimitri Shlyakhtenko. Free diffusions and matrix models with strictly convex interaction. Geometric and Functional Analysis, 18(6):1875–1916, 03 2009.

[35] Alice Guionnet and Dimitri Shlyakhtenko. Free monotone transport. Inventiones Mathematicae, 197(3):613–661, 09 2014.

[36] Alice Guionnet and Ofer Zeitouni. Concentration of the spectral measure for large matrices. Electronic Communications in Probability, 5:119–136, 2000.

[37] F. Hiai and Y. Ueda. Free transportation cost inequalities for noncommutative multi-variables. Infinite Dimensional Analysis, Quantum Probability, and Related Topics, 9:391–412, 2006.

[38] Fumio Hiai. Free analog of pressure and its Legendre transform. Comm. Math. Phys., 255(1):229–252, 2005.

[39] Fumio Hiai, Dénes Petz, and Yoshimichi Ueda. Free transportation cost inequalities via random matrix approximation. Probab. Theory Related Fields, 130(2):199–221, 2004.

[40] David Jekel. An elementary approach to free entropy theory for convex potentials. arXiv:1805.08814, 2018. To appear in Analysis and PDE Journal.

[41] David Jekel. Conditional expectation, entropy, and transport for convex Gibbs laws in free probability. International Mathematics Research Notices IMRN, 2020, 2020.

[42] David Jekel. Evolution equations in non-commutative probability. PhD thesis, University of California, Los Angeles, 2020.

[43] Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright, and Henry Yuen. MIP*=RE. arXiv:2001.04383.

[44] Richard Jordan, David Kinderlehrer, and Felix Otto. The variational formulation of the Fokker–Planck equation. SIAM Journal on Mathematical Analysis, 29(1):1–17, 1998.

[45] Richard V. Kadison and John R. Ringrose. Fundamentals of the Theory of Operator Algebras, Volume I: Elementary Theory, volume 15 of Graduate Studies in Mathematics. American Mathematical Society, Providence, 1997.

[46] Richard V. Kadison and John R. Ringrose. Fundamentals of the Theory of Operator Algebras, Volume II: Advanced Theory, volume 16 of Graduate Studies in Mathematics. American Mathematical Society, Providence, 1997.

[47] Todd Kemp. The large-$n$ limits of Brownian motions on $GL(n)$. International Mathematics Research Notices, 2016(13):4012–4057, 2016.

[48] Todd Kemp. Heat kernel empirical laws on $U(n)$ and $GL(n)$. Journal of Theoretical Probability, 30(2):397–451, 2017.

[49] John D. Lafferty. The density manifold and configuration space quantization. Transactions of the American Mathematical Society, 305(2):699–741, 1988.

[50] Michel Ledoux. A heat semigroup approach to concentration on the sphere and on a compact Riemannian manifold. Geometric and Functional Analysis, 2(2):221–224, 06 1992.

[51] Uri Leron. Trace identities and polynomial identities of $n \times n$ matrices. Journal of Algebra, 42:369–377, 1976.

[52] Wuchen Li. Transport information geometry: Riemannian calculus on probability simplex. arXiv:1803.06360 [math], 2018.

[53] Wuchen Li. Diffusion Hypercontractivity via Generalized Density Manifold. arXiv:1907.12546 [cs, math], 2019.

[54] Wuchen Li. Hessian metric via transport information geometry. arXiv:2003.10526, 2020.

[55] Mylène Maïda and Édouard Maurel-Segala. Free transport-entropy inequalities for nonconvex potentials and application to concentration for random matrices. Probab. Theory Related Fields, 159:329–356, 2014.

[56] James A. Mingo and Roland Speicher. Free probability and random matrices, volume 35 of Fields Institute Monographs. Springer-Verlag, New York, 2017.

[57] Brent Nelson. Free monotone transport without a trace. Communications in Mathematical Physics, 334(3):1245–1298, 2015.

[58] Brent Nelson. Free transport for finite depth subfactor planar algebras. Journal of Functional Analysis, 268(9):2586–2620, 2015.

[59] Alexandru Nica and Roland Speicher. Lectures on the Combinatorics of Free Probability, volume 335 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 2006.

[60] L. Ning, T. T. Georgiou, and A. Tannenbaum. On matrix-valued Monge–Kantorovich optimal mass transport. IEEE Transactions on Automatic Control, 60(2):373–382, 2015.

[61] Felix Otto. The geometry of dissipative evolution equations: the porous medium equation. Communications in Partial Differential Equations, 26(1-2):101–174, 2001.

[62] Felix Otto and Cédric Villani. Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. Journal of Functional Analysis, 173(2):361–400, 2000.

[63] Vladimir V. Peller. Multiple operator integrals and higher operator derivatives. Journal of Functional Analysis, 233:515–544, 04 2006.

[64] Gilles Pisier and Quanhua Xu. Non-commutative $L^p$-spaces. In William B. Johnson and Joram Lindenstrauss, editors, Handbook of the geometry of Banach spaces, volume 2, pages 1459–1517. Elsevier, 2003.

[65] Claudio Procesi. The invariant theory of $n \times n$ matrices. Advances in Mathematics, 19:306–381, 1976.

[66] E. M. Rains. Combinatorial properties of Brownian motion on the compact classical groups. Journal of Theoretical Probability, 10(3):659–679, 1997.

[67] Yu P. Razmyslov. Trace identities of full matrix algebras over a field of characteristic zero. Mathematics of the USSR-Izvestiya, 8(4):727, 1974.

[68] Yu P. Razmyslov. Trace identities and central polynomials in the matrix superalgebras $M_{n,k}$. Mathematics of the USSR-Sbornik, 56(1):187, 1987.

[69] A. Sengupta. Traces in two-dimensional QCD: the large-N limit. In Traces in number theory, geometry and quantum fields, volume 38 of Aspects of Mathematics, pages 193–212. Vieweg, 2008.

[70] Dimitri Shlyakhtenko. Free Fisher information for non-tracial states. Pacific J. Math, 211:375–390, 2003.

[71] Dimitri Shlyakhtenko. Lower estimates on microstates free entropy dimension. Analysis & PDE, 2(2):119–146, 2009.

[72] Barry Simon. Trace Ideals and Their Applications. Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2nd edition, 2005.

[73] Roland Speicher. A new example of 'independence' and 'white noise'. Probability Theory and Related Fields, 84(2):141–159, 1990.

[74] Terence Tao. An Introduction to Random Matrix Theory, volume 132 of Graduate Texts in Mathematics. American Mathematical Society, 2012.

[75] Cédric Villani. Optimal Transport: Old and New, volume 338 of Grundlehren der Mathematischen Wissenschaften. Springer, Berlin, 2009.

[76] Dan-Virgil Voiculescu. Symmetries of some reduced free product $C^*$-algebras. In Huzihiro Araki, Calvin C. Moore, Şerban-Valentin Stratila, and Dan Voiculescu, editors, Operator Algebras and their Connections with Topology and Ergodic Theory, pages 556–588. Springer, Berlin, Heidelberg, 1985.

[77] Dan-Virgil Voiculescu. Addition of certain non-commuting random variables. Journal of Functional Analysis, 66(3):323–346, 1986.

[78] Dan-Virgil Voiculescu. Limit laws for random matrices and free products. Inventiones Mathematicae, 104(1):201–220, Dec 1991.

[79] Dan-Virgil Voiculescu. The analogues of entropy and Fisher's information in free probability, I. Communications in Mathematical Physics, 155(1):71–92, 1993.

[80] Dan-Virgil Voiculescu. The analogues of entropy and of Fisher's information in free probability, II. Inventiones Mathematicae, 118:411–440, 1994.

[81] Dan-Virgil Voiculescu. The analogues of entropy and of Fisher's information in free probability V. Inventiones Mathematicae, 132:189–227, 1998.

[82] Dan-Virgil Voiculescu. A strengthened asymptotic freeness result for random matrices with applications to free entropy. International Mathematics Research Notices, 1998(1):41–63, 1998.

[83] Dan-Virgil Voiculescu. The analogues of entropy and of Fisher's information measure in free probability theory, VI: Liberation and mutual free information. Advances in Mathematics, 146:101–166, 1999.

[84] Dan-Virgil Voiculescu. Cyclomorphy. Int. Math. Res. Not. IMRN, 2002(6), 2002.

[85] Dan-Virgil Voiculescu. Free entropy. Bulletin of the London Mathematical Society, 34:257–278, 2002.

[86] Dan-Virgil Voiculescu. Free analysis questions I: duality transform for the coalgebra of $\partial_{X:B}$. International Mathematics Research Notices, 2004(16):793–822, 2004.

[87] Dan-Virgil Voiculescu. Symmetries arising from free probability theory. In Pierre Cartier, Bernard Julia, Pierre Moussa, and Pierre Vanhove, editors, Frontiers in Number Theory, Physics, and Geometry I, pages 231–243. Springer Berlin Heidelberg, 2006.

[88] Dan-Virgil Voiculescu. A hydrodynamic exercise in free probability: setting up free Euler equations. Preprint, arXiv:1902.02442, 2019.

[89] Dan-Virgil Voiculescu, Kenneth J. Dykema, and Alexandru Nica.

Free Random Variables , volume 1 of