THE GAUSSIAN ENTROPY MAP IN VALUED FIELDS
Yassine El Maazouz
Abstract. — We exhibit the analog of the entropy map for multivariate Gaussian distributions on local fields. As in the real case, the image of this map lies in the supermodular cone and it determines the distribution of the valuation vector. More generally, this map can be defined for non-archimedean valued fields whose valuation group is an additive subgroup of the real line, and it remains supermodular. We also explicitly compute the image of this map in dimension 3.
1. Introduction and notation

1.1. Real Multivariate Gaussian distributions. —
In probability theory and statistics, classical (or Euclidean) Gaussian distributions appear naturally in many contexts, for example as the universal limit distribution in the central limit theorem. For a positive integer d, multivariate Gaussian distributions on R^d are determined by their mean µ ∈ R^d and their positive semi-definite covariance matrix Σ ∈ R^{d×d}. Hence the natural parameter space for centered (i.e. with zero mean) Gaussian distributions on R^d is the positive semi-definite cone in R^{d×d}, which we denote by

PSD_d := { Σ ∈ Sym_d(R) : ⟨x, Σx⟩ ≥ 0 for all x ∈ R^d },

where Sym_d(R) is the space of real symmetric matrices in R^{d×d} and ⟨·,·⟩ is the usual inner product on R^d. Non-degenerate Gaussian distributions are those whose covariance matrix Σ is positive definite, i.e. Σ ∈ PD_d, where

PD_d := { Σ ∈ Sym_d(R) : ⟨x, Σx⟩ > 0 for all nonzero x ∈ R^d }.

There is no shortage of instances where the PSD cone appears in probability and statistics [SU10], optimization [MS19, Chapter 12] and combinatorics [Goe97].

Mathematics Subject Classification. —
Key words. — Entropy; Probability; Gaussian measures; Non-Archimedean valuation; Local fields; Bruhat-Tits building; Conditional independence.

The author would like to thank the Max Planck Institute for Mathematics in the Sciences for its hospitality while working on this project. He would also like to thank Bernd Sturmfels and Ian Le for valuable discussions. The author is grateful to Avinash Kulkarni for the numerous and valuable exchanges while writing this paper. Many thanks also to Rida Ait El Mansour and Adam Quinn Jaffe for their remarks on early drafts of this manuscript.
The positive definite cone has a pleasant group-theoretic structure in the sense that its elements are in one-to-one correspondence with left cosets of the orthogonal group O_d(R) in the general linear group GL_d(R). The map sending the coset A.O_d(R) ∈ GL_d(R)/O_d(R) to AA^T ∈ PD_d is a bijection. This underscores the fact that multivariate Gaussians are tightly linked to the linearity and orthogonality structures that the Euclidean space R^d enjoys.

An important concept in statistics, probability, and information theory is the notion of entropy, which is a measure of uncertainty and disorder in a distribution (see [Wan08]). The entropy of a centered multivariate Gaussian with covariance matrix Σ is given, up to an additive constant, by

h(Σ) = −(1/2) log(|det(Σ)|) = −(1/2) log(det(Σ)).

If X is a random vector in R^d with non-degenerate centered Gaussian distribution given by a covariance matrix Σ ∈ PD_d, then for any subset I of [d] := {1, 2, . . . , d} the vector X_I of coordinates of X indexed by I is also a random vector with non-degenerate Gaussian measure on R^{|I|}. Moreover, its covariance matrix is Σ_I = (Σ_{i,j})_{i,j∈I} ∈ R^{|I|×|I|}, so we can define the entropy h_I(Σ) of X_I as

h_I(Σ) := h(Σ_I) = −(1/2) log(det(Σ_I)).

The collection of entropy values (h_I(Σ))_{I⊂[d]} satisfies the inequalities

(1) h_I(Σ) + h_J(Σ) ≤ h_{I∩J}(Σ) + h_{I∪J}(Σ)

for any two subsets I, J ⊂ [d]. This is thanks to what is known as Koteljanskii's inequalities [Kot63] on the determinants of positive definite matrices, i.e.

(2) det(Σ_I) det(Σ_J) ≥ det(Σ_{I∩J}) det(Σ_{I∪J}).

In the language of polyhedral geometry this means that the image of the entropy map

(3) H : PD_d −→ R^{2^d}, Σ ↦ (h_I(Σ))_{I⊂[d]}

lies inside the supermodular cone S_d in R^{2^d}. This is the polyhedral cone specified by the inequalities in (1), i.e.

S_d := { x = (x_I)_{I⊂[d]} ∈ R^{2^d} : x_∅ = 0 and x_I + x_J ≤ x_{I∩J} + x_{I∪J} for all I, J ⊂ [d] }.

Since x_∅ = 0 for x ∈ S_d, we can see S_d as a cone in R^{2^d − 1}.

In this paper we deal with multivariate Gaussian distributions on local fields, and more generally on non-archimedean valued fields; see Example 4.1 for a discussion. In particular we shall exhibit an analog of this entropy map that satisfies the same set of inequalities (1). More precisely, we prove the following:

Theorem 1.2. —
The push-forward measure of a multivariate Gaussian measure on a local field by the valuation map is given by a tropical polynomial. The coefficients of this tropical polynomial are exactly the entropies given by the entropy map of this measure. Moreover, these coefficients are supermodular. The entropy map is still well defined on non-archimedean valued fields in general, and remains supermodular. It induces a probability measure on the Euclidean space R^d. The support of this measure is a polyhedral complex.

This solves Conjecture 21 in [MT]. We shall break this result down into several pieces, namely Theorems 2.6 and 3.2 for the local field case, and the discussion in Section 4 for the general non-archimedean valued field case.

Let us now set things up for our discussion of multivariate Gaussians on fields with a non-archimedean valuation. There is an extensive literature on valued fields in number theory [Ser13, Wei13, EP05], analysis [vR78, Sch84, Sch07], representation theory [CR66], mathematical physics [VVZ94, Khr13], and probability [Eva01, EL07, AZ01].

Let K be a field with an additive non-archimedean valuation val : K −→ R ∪ {+∞} with valuation group Γ := val(K^×). The valuation map val defines an equivalence class of exponential valuations, or absolute values |·|, on K via |x| := a^{−val(x)} (where a ∈ (1, ∞)), and hence also a topology on K. The valuation val is called discrete if its valuation group Γ is a discrete subgroup of R, which, by scaling val suitably, we can always assume to be Z (we then call val a normalized valuation). In the discrete valuation case we fix a uniformizer π of K, i.e. an element π ∈ K with val(π) = 1. We denote by O := { x ∈ K : val(x) ≥ 0 } the valuation ring of K; this is a local ring with unique maximal ideal m := { x ∈ K : val(x) > 0 } and residue field k := O/m. When the valuation is discrete, the ideal m is generated in O by π, i.e. m = πO. A typical example is the p-adic number field Q_p where p is prime, or F_q((t)), the field of Laurent series in one variable with coefficients in the finite field F_q.

When K is a local field (i.e. a finite algebraic extension of Q_p or F_q((t))), the valuation group Γ is discrete in R, and k is finite. There then exists a unique Haar measure µ on K such that µ(O) = 1. As shown by Evans [Eva01], one can define multivariate Gaussian measures on K^d using non-archimedean orthogonality. It turns out that these measures are precisely the uniform distributions on O-submodules of K^d. The non-degenerate Gaussians on K^d are then parameterized by full rank submodules of K^d, which are called lattices. We can think of these as an analogue of the positive definite covariance matrices in the real case. In the language of group theorists, one can then think of the Bruhat-Tits building for the special linear group SL_d(K) [AB08] as the parameter space for non-degenerate Gaussians up to scalar multiplication.

One motivation for this paper is the search for a suitable definition of tropical Gaussian measures [Tra20]. Tropical stochastics has been an active research area in recent years and has diverse applications, from phylogenetics [LMY, YZZ19] to game theory [AGG12] and economics [BK13, TY19]. One approach to defining tropical Gaussians is to tropicalize Gaussian measures on a valued field. We show in Section 2 that tropicalizing multivariate Gaussians on local fields yields probability measures on the lattice Z^d. We also show that the tropicalized measures are determined by the entropy map via a tropical polynomial. In Section 3 we show the supermodularity of the entropy map and give a couple of examples. In Section 4, we explain why orthogonality is not a suitable way to define Gaussian measures when the field K has a dense valuation group or when the residue field is not finite. Nevertheless, we will see that the entropy map is still well defined and remains supermodular, and we compute its image when d = 3.
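Before moving to the non-archimedean setting, the classical statements above are easy to check numerically. The following sketch (plain Python with exact rational arithmetic; the matrix below is an arbitrary illustrative choice, not one from the text) verifies Koteljanskii's inequality (2), which is equivalent to the supermodularity (1) of the entropy vector, for every pair of subsets:

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by Laplace expansion along the first row (exact Fractions)."""
    n = len(M)
    if n == 0:
        return Fraction(1)  # empty principal minor: det(Sigma_emptyset) = 1
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(n))

def principal_minor(S, I):
    """det(Sigma_I) for a tuple of indices I."""
    return det([[S[i][j] for j in I] for i in I])

# An illustrative symmetric, diagonally dominant (hence positive definite) matrix.
S = [[Fraction(4), Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(1), Fraction(1), Fraction(5)]]
d = 3
subsets = [c for r in range(d + 1) for c in combinations(range(d), r)]

# Koteljanskii: det(S_I) det(S_J) >= det(S_{I meet J}) det(S_{I join J}).
koteljanskii_holds = all(
    principal_minor(S, I) * principal_minor(S, J)
    >= principal_minor(S, tuple(sorted(set(I) & set(J))))
       * principal_minor(S, tuple(sorted(set(I) | set(J))))
    for I in subsets for J in subsets)
```

Exact fractions are used so that the inequalities are checked without floating-point error; the same check with `log`-entropies would require a tolerance.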
2. The entropy map of local field Gaussian distributions
Let d ≥ 1 be an integer. We call a lattice in K^d any O-submodule Λ := ⊕_{i=1}^d O a_i generated by a basis (a_1, . . . , a_d) of K^d. The basis (a_1, . . . , a_d) that generates Λ is not unique. We can write Λ = A O^d, where A is the matrix with columns a_1, . . . , a_d, which is then called a representative of Λ. The elements U of the group GL_d(K) that leave O^d invariant (i.e. U O^d = O^d) are exactly the matrices U ∈ GL_d(O) with entries in O whose inverse has all entries in O. The group GL_d(O) then plays the role of O_d(R): it is an analogue of the orthogonal group [ER19, Theorem 2.4]. Then, like covariance matrices, lattices are in one-to-one correspondence with the left cosets GL_d(K)/GL_d(O), and any two representatives of a lattice are elements of the same left coset.

In this section we assume that K is a local field and we consider a lattice Λ in K^d. We recall that there is a unique Haar measure µ^{⊗d} on K^d which is the product measure induced by µ on K. Since K is a local field, the residue field k is finite and its cardinality |k| = q := p^r is a power of some prime p, where r ≥ 1. In this case we define the absolute value associated to val as |x| = q^{−val(x)}. Letting A be a representative of the lattice Λ, we can define the entropy h(Λ) of the lattice Λ as

h(Λ) = val(det(A)).

This is a well defined quantity since any other representative of Λ is of the form AU, where U ∈ GL_d(O) and det(U) ∈ O^× is a unit, so val(det(U)) = 0. This definition lines up with the definition in the real case because val(x) = −log_q(|x|), where |·| is the absolute value on K, so we get

h(Λ) = val(det(A)) = −log_q(|det(A)|).

The following lemma relates the entropy h(Λ) of a lattice Λ to its measure µ^{⊗d}(Λ).

Lemma 2.1. —
For any lattice Λ in K^d, we have µ^{⊗d}(Λ) = q^{−h(Λ)}.

Proof.
Let A be a representative of Λ. Thanks to the non-archimedean singular value decomposition (see [Eva02, Theorem 3.1]), we can write A = UDV, where U, V ∈ GL_d(O) are two orthogonal matrices and D is a diagonal matrix. Then we have Λ = UD.O^d. Since orthogonal linear transformations in K^d preserve the measure, we have µ^{⊗d}(Λ) = µ^{⊗d}(D.O^d). Let α_1, . . . , α_d be the diagonal entries of D. Then we have µ^{⊗d}(Λ) = µ^{⊗d}(⊕_{i=1}^d α_i O) = q^{−val(α_1)−···−val(α_d)}. But val(α_1) + · · · + val(α_d) = val(det(A)) = h(Λ). Hence the desired result.

The Gaussian measure P_Λ, given by a lattice Λ, is the uniform measure on Λ. It is the measure whose density f_Λ (with respect to the Haar measure µ^{⊗d}) is given by

f_Λ(x) = 1_Λ(x)/µ^{⊗d}(Λ) = q^{h(Λ)} 1_Λ(x),

where 1_Λ is the indicator function of the set Λ.

Proposition 2.2. —
The quantity h(Λ) is the differential entropy of the Gaussian measure P_Λ, i.e.

h(Λ) = ∫_{K^d} log_q(f_Λ(x)) P_Λ(dx).

Proof.
We can compute the above integral as follows:

∫_{K^d} log_q(f_Λ(x)) P_Λ(dx) = ∫_{K^d} log_q(f_Λ(x)) f_Λ(x) µ^{⊗d}(dx)
 = ∫_Λ −log_q(µ^{⊗d}(Λ)) f_Λ(x) µ^{⊗d}(dx)
 = ∫_Λ h(Λ) f_Λ(x) µ^{⊗d}(dx)
 = h(Λ).

For a subset I of [d] := {1, 2, . . . , d} we denote by Λ_I the image of Λ under the projection onto the space K^{|I|} of coordinates indexed by I. This is also a lattice in the space K^{|I|}. So, for any subset I ⊂ [d], we can define the entropy h_I(Λ) of the lattice Λ_I. We can then define the entropy map

H : GL_d(K)/GL_d(O) −→ R^{2^d}, Λ ↦ (h_I(Λ))_{I⊂[d]},

where h_∅(Λ) = 0 by convention. If A is a representative of Λ with columns a_1, . . . , a_d, then the lattice Λ_I is the lattice generated over O by the vectors a_{i,I}, which are the sub-vectors of the a_i's with coordinates indexed by I. So we can compute h_I(Λ) from the matrix A by

(4) h_I(Λ) = min_{J⊂[d], |J|=|I|} val(det(A_{I×J})),

where A_{I×J} is the matrix extracted from A by taking the rows indexed by I and the columns indexed by J. More precisely, A_{I×J} = (A_{i,j})_{i∈I, j∈J}.

Now let X be a K^d-valued random variable with Gaussian distribution P_Λ given by Λ. That is, for any measurable set B in the Borel sigma-algebra of K^d,

P_Λ(X ∈ B) = µ^{⊗d}(Λ ∩ B)/µ^{⊗d}(Λ),

and let V := val(X) be its image under coordinate-wise valuation. Notice that, since P_Λ(X_i = 0) = 0 for any i ∈ {1, . . . , d}, the vector V is almost surely in Z^d. By definition, the distribution of V is the push-forward of the distribution of X by val. We are interested in the distribution of the valuation vector V, and to determine it we compute its tail distribution function Q_Λ, which is defined on R^d as

Q_Λ(v) := P_Λ(V ≥ v) for any v ∈ R^d,

where ≥ is the coordinate-wise partial order on R^d. Since V takes values in Z^d, this function is completely determined by its values for v ∈ Z^d.
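Formula (4) translates directly into code. The sketch below (plain Python; the function names and the representative matrix are my own illustrative choices) computes the entropy vector of a lattice over Q_p from a rational representative by minimizing p-adic valuations of minors:

```python
import math
from fractions import Fraction
from itertools import combinations

def val_p(x, p):
    """p-adic valuation of a rational number, with val(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    v, n, m = 0, x.numerator, x.denominator
    while n % p == 0:
        n //= p; v += 1
    while m % p == 0:
        m //= p; v -= 1
    return v

def det(M):
    """Exact determinant by Laplace expansion (fine for small d)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def entropy_vector(A, p):
    """h_I = min over |J| = |I| of val_p(det(A_{I x J})), as in formula (4)."""
    d = len(A)
    H = {(): 0}
    for r in range(1, d + 1):
        for I in combinations(range(d), r):
            H[I] = min(val_p(det([[A[i][j] for j in J] for i in I]), p)
                       for J in combinations(range(d), r))
    return H

# A small illustrative representative over Q_5.
p = 5
A = [[Fraction(2), Fraction(5)],
     [Fraction(0), Fraction(25)]]
H = entropy_vector(A, p)  # {(): 0, (0,): 0, (1,): 2, (0, 1): 2}
```

Enumerating all minors is exponential in d; the Hermite-form algorithm discussed later in the paper avoids this.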
For a vector v ∈ Z^d, let us define the lattice π^v as the lattice in K^d generated by the basis π^{v_1}e_1, . . . , π^{v_d}e_d, where e_1, . . . , e_d is the standard basis of K^d. We then have

(5) Q_Λ(v) = P_Λ(X ∈ π^v) = µ^{⊗d}(Λ ∩ π^v)/µ^{⊗d}(Λ) = q^{−h(Λ∩π^v)}/q^{−h(Λ)} = q^{−h(Λ∩π^v)+h(Λ)}.
We also have the following equation:

(6) Q_Λ(v) = 1/[Λ : Λ ∩ π^v].

Definition 2.3. — We define the logarithmic tail distribution function ϕ_Λ : Z^d → Z as

ϕ_Λ(v) := −log_q(Q_Λ(v)) = h(Λ ∩ π^v) − h(Λ) = log_q([Λ : Λ ∩ π^v]).

The first equality in Definition 2.3 is due to equation (5), and the second equality holds thanks to equation (6).

Before we state the main results in this section, let us start by establishing a useful lemma concerning the action of GL_d(K) on the set of lattices.

Lemma 2.4. —
For any two lattices Λ, Λ′ there exists an element g ∈ GL_d(K) such that g.Λ and g.Λ′ are both diagonal lattices.

Proof. It suffices to show this when Λ is the standard lattice Λ = O^d. Let A ∈ GL_d(K) be a representative of Λ′. Thanks to the non-archimedean singular value decomposition (see [Eva02, Theorem 3.1]), there exist a diagonal matrix D ∈ GL_d(K) and U, V ∈ GL_d(O) such that A = UDV. Hence we deduce that Λ′ = UD O^d. Picking g = U^{−1} yields g.Λ = U^{−1}O^d = O^d and g.Λ′ = D O^d.

This is in fact a property of buildings: any two chambers belong to a common apartment [AB08]. Next, we introduce a technical tool that we will use in the proof of our first result.
Definition 2.5. — For any ℓ ∈ {0, . . . , d} we define the ℓ-distance φ_ℓ(Λ, Λ′) of two lattices Λ, Λ′ as the minimum of val(det(x_1, . . . , x_ℓ, y_1, . . . , y_k)) among all possible choices of x_1, . . . , x_ℓ ∈ Λ and y_1, . . . , y_k ∈ Λ′, where k = d − ℓ.

Since for any g ∈ GL_d(K), x_1, . . . , x_ℓ ∈ Λ and y_1, . . . , y_k ∈ Λ′ we have

val(det(gx_1, . . . , gx_ℓ, gy_1, . . . , gy_k)) = val(det(x_1, . . . , x_ℓ, y_1, . . . , y_k)) + val(det(g)),

we can see that φ_ℓ satisfies the following property:

φ_ℓ(g.Λ, g.Λ′) = φ_ℓ(Λ, Λ′) + val(det(g)).

We then deduce that the quantity φ_ℓ(Λ, Λ′) − h(Λ′) is invariant under the action of GL_d(K), i.e. for any g ∈ GL_d(K) we have

φ_ℓ(g.Λ, g.Λ′) − h(g.Λ′) = φ_ℓ(Λ, Λ′) − h(Λ′).

When the second lattice Λ′ = π^v is diagonal and Λ has representative A ∈ GL_d(K), the optimum over the vectors x_1, . . . , x_ℓ and y_1, . . . , y_k is attained when the vectors x_1, . . . , x_ℓ are among the columns a_1, . . . , a_d of A and the vectors y_1, . . . , y_k are among the vectors π^{v_i}e_i, where (e_i)_{1≤i≤d} is the standard basis of K^d. Expanding the determinant along the columns π^{v_j}e_j removes the rows they occupy, so we deduce that φ_ℓ(Λ, π^v) can be computed as follows:

φ_ℓ(Λ, π^v) = min_{I,J⊂[d], |I|=|J|=ℓ} ( val(det(A_{I×J})) + Σ_{j∉I} v_j ).

So we also get

(7) φ_ℓ(Λ, π^v) − h(π^v) = min_{I,J⊂[d], |I|=|J|=ℓ} ( val(det(A_{I×J})) − Σ_{j∈I} v_j ).

In the special case Λ = π^a, for a ∈ Z^d, the determinant of A_{I×J} in the above optimization problem vanishes whenever J ≠ I, since we can choose A to be diagonal. So we get the following:

φ_ℓ(π^a, π^v) − h(π^v) = min_{I⊂[d], |I|=ℓ} ( Σ_{i∈I} a_i − Σ_{i∈I} v_i ).

Theorem 2.6. —
The logarithmic tail distribution function ϕ_Λ is a tropical polynomial on Z^d, given by

(8) ϕ_Λ(v) = max_{I⊂[d]} (v_I − h_I(Λ)),

where v_I := Σ_{i∈I} v_i.

Proof.
First we show this for a diagonal lattice Λ = π^a, where a ∈ Z^d. For any v ∈ Z^d, let a ∨ v be the vector with coordinates max(a_i, v_i). We have π^a ∩ π^v = π^{a∨v}, so we get the entropies h(π^a) = Σ_{i=1}^d a_i and h(π^a ∩ π^v) = h(π^{a∨v}) = Σ_{i=1}^d max(a_i, v_i). Hence we have

ϕ_Λ(v) = h(π^a ∩ π^v) − h(π^a) = max_{I⊂[d]} ( Σ_{i∈I} v_i + Σ_{i∉I} a_i ) − Σ_{i=1}^d a_i = max_{I⊂[d]} (v_I − a_I),

and h_I(π^a) = a_I. So the theorem holds for diagonal lattices. To see why it also holds for a general lattice Λ, first notice that in the diagonal case Λ = π^a we have

ϕ_Λ(v) = −min_{ℓ=0,...,d} ( φ_ℓ(Λ, π^v) − h(π^v) ).

Secondly, notice that the right hand side of the previous equation is invariant under the action of GL_d(K). So for g ∈ GL_d(K),

min_{ℓ=0,...,d} ( φ_ℓ(g.Λ, g.π^v) − h(g.π^v) ) = min_{ℓ=0,...,d} ( φ_ℓ(Λ, π^v) − h(π^v) ).

By Definition 2.3, we have ϕ_Λ(v) = log_q([Λ : Λ ∩ π^v]) = log_q([g.Λ : g.Λ ∩ g.π^v]). Now fix a general lattice Λ and v ∈ Z^d. By Lemma 2.4, there exists g ∈ GL_d(K) such that g.Λ and g.π^v are both diagonal, so

ϕ_Λ(v) = log_q([g.Λ : g.Λ ∩ g.π^v]) = −min_{ℓ=0,...,d} ( φ_ℓ(g.Λ, g.π^v) − h(g.π^v) ) = −min_{ℓ=0,...,d} ( φ_ℓ(Λ, π^v) − h(π^v) ).

Hence, we deduce, thanks to equation (7), that

ϕ_Λ(v) = −min_{ℓ=0,...,d} min_{I,J⊂[d], |I|=|J|=ℓ} ( val(det(A_{I×J})) − Σ_{j∈I} v_j ).

We can simplify this thanks to equation (4) to get the desired equation (8).
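For diagonal lattices the two sides of (8) can be compared directly, since h(π^a ∩ π^v) = Σ_i max(a_i, v_i). A quick sketch (plain Python; the exponent vector a and the sample points are illustrative choices of mine):

```python
from itertools import combinations

d = 3
a = (1, 0, 2)  # the diagonal lattice Lambda = pi^a in K^3
subsets = [I for r in range(d + 1) for I in combinations(range(d), r)]
h = {I: sum(a[i] for i in I) for I in subsets}  # h_I(pi^a) = a_I

def phi_direct(v):
    """phi(v) = h(pi^a intersect pi^v) - h(pi^a), using pi^a ∩ pi^v = pi^{a∨v}."""
    return sum(max(ai, vi) for ai, vi in zip(a, v)) - sum(a)

def phi_tropical(v):
    """The tropical polynomial of Theorem 2.6: max over I of (v_I - h_I)."""
    return max(sum(v[i] for i in I) - h[I] for I in subsets)

samples = [(0, 0, 0), (2, 1, 3), (-1, 5, 0), (4, -2, 1), (1, 0, 2)]
agree = all(phi_direct(v) == phi_tropical(v) for v in samples)
```

The two evaluations agree because Σ_i max(0, v_i − a_i) equals the maximum of Σ_{i∈I}(v_i − a_i) over subsets I, attained at I = { i : v_i > a_i }.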
So the distribution of the random vector of valuations V is given by a tropical polynomial ϕ_Λ via its tail distribution function Q_Λ. The coefficients of this polynomial are exactly the entropies h_I(Λ). Now we prove a couple of interesting properties of ϕ_Λ, namely how the coefficients h_I(Λ) behave under diagonal scaling and permutation of the coordinates of the random vector X. To this end, let us denote by D_a = diag(a_1, . . . , a_d) the diagonal matrix with coefficients a_i ∈ K, and by P^σ the permutation matrix corresponding to a permutation σ of [d], i.e. P^σ_{i,j} = 1 when j = σ(i) and 0 otherwise.

Lemma 2.7. —
Let Λ be a lattice in K^d, a ∈ K^d and σ a permutation of [d]. We have the following: h_I(D_a Λ) = h_I(Λ) + Σ_{i∈I} val(a_i) and h_I(P^σ Λ) = h_{σ(I)}(Λ).

Proof. For I ⊂ [d], we have h_I(D_a Λ) = min_{|J|=|I|} val(det((D_a A)_{I×J})), where A is any representative of Λ. Since the rows of D_a A are the rows of A multiplied by the scalars a_i, we deduce that det((D_a A)_{I×J}) = det(A_{I×J}) ∏_{i∈I} a_i, and hence we get

h_I(D_a Λ) = h_I(Λ) + Σ_{i∈I} val(a_i).

Similarly we can see the effect that a permutation of the coordinates of X has on the vector of entropies H(Λ) = (h_I(Λ))_{I⊂[d]}.
3. Supermodularity of the entropy map
As in the case of real Gaussians, we would like the vector of entropies H(Λ) := (h_I(Λ))_{I⊂[d]} to take values in the supermodular cone S_d, as conjectured in [MT]. As a first step towards proving this result, notice that the previous lemma implies that if Λ is a lattice such that H(Λ) ∈ S_d, then for any diagonal matrix D_a we still have H(D_a Λ) ∈ S_d, and H(P^σ Λ) ∈ S_d for any permutation σ.

Definition-Proposition 3.1. — Every lattice Λ in K^d has a representative A in Hermite normal form, i.e. a matrix A in GL_d(K) satisfying the following conditions:
(i) A is lower triangular.
(ii) For any 1 ≤ j < i ≤ d we have either val(A_{i,j}) < val(A_{j,j}) or A_{i,j} = 0.
(iii) The diagonal coefficients A_{i,i} are of the form A_{i,i} = π^{a_i} for some a_i ∈ Z.

Now we can state the second result of this section, concerning the supermodularity of the entropy map. But before we do that, we give an equivalent definition of the supermodular cone:

S_d = { (x_I)_{I⊂[d]} ∈ R^{2^d} : x_∅ = 0 and x_{Ii} + x_{Ij} ≤ x_I + x_{Iij} for any I ⊂ [d] and i ≠ j ∈ [d] \ I },

where we write Ii instead of I ∪ {i}. These are the facet-defining inequalities of the cone S_d, and there are d(d−1)2^{d−3} of them. See [KVV10] and references therein.
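The facet description is easy to enumerate. The sketch below (plain Python; helper name is my own) generates the index data of the inequalities x_{Ii} + x_{Ij} ≤ x_I + x_{Iij} and confirms the count d(d−1)2^{d−3} for small d:

```python
from itertools import combinations

def facet_inequalities(d):
    """Index data (I, {i, j}) for the facet inequalities
    x_{Ii} + x_{Ij} <= x_I + x_{Iij} of the supermodular cone S_d."""
    facets = []
    for i, j in combinations(range(d), 2):          # choose the pair {i, j}
        rest = [k for k in range(d) if k not in (i, j)]
        for r in range(len(rest) + 1):              # choose I inside [d] \ {i, j}
            for I in combinations(rest, r):
                facets.append((I, (i, j)))
    return facets

counts = {d: len(facet_inequalities(d)) for d in (3, 4, 5)}
# counts == {3: 6, 4: 24, 5: 80}, matching d(d-1)2^{d-3}
```

The count is C(d,2) choices of the pair times 2^{d−2} choices of I, which equals d(d−1)2^{d−3}.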
Theorem 3.2. — The image of the map H : Λ ↦ (h_I(Λ))_{I⊂[d]} lies in the supermodular cone S_d, i.e. for any subset I ⊂ [d] with |I| ≤ d − 2 and i ≠ j ∈ [d] \ I,

h_{Ii}(Λ) + h_{Ij}(Λ) ≤ h_I(Λ) + h_{Iij}(Λ).
Proof.
We prove this by induction on d. The result is trivial for d = 1, 2. Assume that it holds for lattices in K^r for any r < d, where d ≥ 3. Let Λ be a lattice in K^d and A its Hermite normal form. For any I ⊂ [d] of size |I| < d − 2, the inequality h_{Ii}(Λ) + h_{Ij}(Λ) ≤ h_I(Λ) + h_{Iij}(Λ) holds for any i ≠ j not in I thanks to the induction hypothesis. This is because, when |I| ≤ d − 3, we are working in the lattice Λ_{Iij}, which is a lattice in dimension less than d. It then suffices to show the inequality when I has size d − 2. By Lemma 2.7 we can assume that I = {1, . . . , d − 2}, i = d − 1 and j = d (if not, we can just act on Λ by a suitable permutation matrix). Let us write the matrix A as follows:

A =
⎛ π^{a_1}                                      ⎞
⎜ *        π^{a_2}                             ⎟
⎜ ⋮                  ⋱                         ⎟
⎜ *        · · ·     π^{a_{d−2}}               ⎟
⎜ *        · · ·     *       π^{a_{d−1}}       ⎟
⎝ *        · · ·     *       x        π^{a_d}  ⎠

with zeros above the diagonal. Recall that since A is the Hermite form of Λ we have val(x) < a_{d−1} or x = 0. Now we have

h_{Ii}(Λ) = a_1 + · · · + a_{d−1},   h_{Ij}(Λ) = a_1 + · · · + a_{d−2} + min(val(x), a_d),
h_I(Λ) = a_1 + · · · + a_{d−2},   h_{Iij}(Λ) = a_1 + · · · + a_d.

The inequality h_{Ii}(Λ) + h_{Ij}(Λ) ≤ h_I(Λ) + h_{Iij}(Λ) then holds simply because min(val(x), a_d) ≤ a_d, and this finishes the proof.

This theorem underlines another similarity between the local field Gaussians defined in [Eva01] and classical multivariate Gaussian measures. From Lemma 2.7 we can see that acting on Λ by a diagonal matrix just moves the point H(Λ) ∈ S_d in parallel to the lineality space of the cone S_d, that is, the biggest vector space contained in S_d.

The classical entropy map is tightly related to conditional independence. More precisely, if Σ ∈ PD_d and X is a Gaussian vector with covariance matrix Σ, then for any I ⊂ [d] and i ≠ j not in I, the variables X_i and X_j are independent given the vector X_I if and only if h_{Ii}(Σ) + h_{Ij}(Σ) = h_I(Σ) + h_{Iij}(Σ), and we write

X_i ⊥⊥ X_j | X_I ⟺ h_{Ii}(Σ) + h_{Ij}(Σ) = h_I(Σ) + h_{Iij}(Σ).
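After exponentiating, the equivalence above becomes a determinant identity that can be checked exactly. In the sketch below (plain Python; the matrices are my own illustrative choice), Σ = K⁻¹ for a tridiagonal concentration matrix K, so (Σ⁻¹)₁₃ = 0 and X₁ ⊥⊥ X₃ | X₂ holds, while X₁ ⊥⊥ X₂ | X₃ does not:

```python
from fractions import Fraction

def det(M):
    """Exact determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minor(S, I):
    return det([[S[i][j] for j in I] for i in I])

# Sigma = K^{-1} for K = [[2,1,0],[1,2,1],[0,1,2]]; K[0][2] = 0 encodes
# the conditional independence X_1 _||_ X_3 | X_2.
S = [[Fraction(3, 4), Fraction(-1, 2), Fraction(1, 4)],
     [Fraction(-1, 2), Fraction(1), Fraction(-1, 2)],
     [Fraction(1, 4), Fraction(-1, 2), Fraction(3, 4)]]

# X_1 _||_ X_3 | X_2  <=>  det(S_12) det(S_23) = det(S_2) det(S_123):
ci_equality = (minor(S, [0, 1]) * minor(S, [1, 2])
               == minor(S, [1]) * minor(S, [0, 1, 2]))
# For the pair (1, 2) given 3 (0-indexed: rows 0, 1 given 2) the
# supermodular inequality is strict, so no conditional independence:
strict = (minor(S, [0, 2]) * minor(S, [1, 2])
          > minor(S, [2]) * minor(S, [0, 1, 2]))
```

Working with determinants instead of logarithms keeps the check exact.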
This means that the conditional independence models are exactly the inverse images by H of the faces of S_d [Stu09, Proposition 4.1]. It turns out that, in the local field setting, the non-archimedean entropy map H defined in (3) also encodes conditional independence information on the coordinates of the random Gaussian vector X, as stated in the following proposition.

Proposition 3.3. —
Assume d ≥ 2, and let I be a subset of [d] and i ≠ j ∈ [d] \ I two distinct integers. Let Λ be a lattice in K^d and X a random Gaussian vector with distribution given by Λ. Then the conditional independence statement X_i ⊥⊥ X_j | X_I holds if and only if

h_{Ii}(Λ) + h_{Ij}(Λ) = h_I(Λ) + h_{Iij}(Λ).
Proof.
Using Lemma 2.7 we reduce to the case I = [r], where r ≤ d − 2, i = r + 1 and j = i + 1. Let A = (a_{i,j}) be the unique representative in Hermite form of Λ. We claim that X_i ⊥⊥ X_j | X_I if and only if a_{j,i} = 0. To see why, let Z = A^{−1}X, which is a Gaussian vector whose distribution is the uniform distribution on O^d. We have X_i = a_{i,1}Z_1 + · · · + a_{i,i}Z_i and X_j = a_{j,1}Z_1 + · · · + a_{j,j}Z_j. Since Z_I = A^{−1}_{I×I}X_I, given X_I we know Z_I and vice-versa. Hence X_i ⊥⊥ X_j | X_I holds if and only if (a_{j,i}Z_i + a_{j,j}Z_j) ⊥⊥ Z_i. This happens if and only if the vectors (1, 0) and (a_{j,i}, a_{j,j}) in K^2 are orthogonal (see [Eva01]). This is equivalent to val(a_{j,j}) ≤ val(a_{j,i}), which means that a_{j,i} = 0 since A is in Hermite form. On the other hand, since A is lower triangular, we have the following:

h_I(Λ) = val(det(A_{I×I})),   h_{Ii}(Λ) = h_I(Λ) + val(a_{i,i}),
h_{Ij}(Λ) = h_I(Λ) + min(val(a_{j,i}), val(a_{j,j})),   h_{Iij}(Λ) = h_I(Λ) + val(a_{i,i}) + val(a_{j,j}).

So the equality h_{Ii}(Λ) + h_{Ij}(Λ) = h_I(Λ) + h_{Iij}(Λ) holds if and only if val(a_{j,j}) ≤ val(a_{j,i}); since A is the Hermite form of Λ, this happens if and only if a_{j,i} = 0. In combination with the calculation above, this finishes the proof.

In other terms, the conditional independence statement X_i ⊥⊥ X_j | X_I holds if and only if the entropy vector H(Λ) = (h_I(Λ))_{I⊂[d]} is on the face of the polyhedral cone S_d cut out by the equation h_{Ii}(Λ) + h_{Ij}(Λ) = h_I(Λ) + h_{Iij}(Λ). This gives an analogue of [Stu09, Proposition 4.1].
Corollary 3.4. — The Gaussian conditional independence models are exactly those subsets of lattices that arise as inverse images of the faces of S_d under the map H.

Proof.
Follows immediately from the previous proposition.

This underlines the importance of the map H, and also gives reason to think that the suitable analogue of the positive definite cone for local fields is the set of lattices, or more precisely the Bruhat-Tits building [AB08, MT]. A hard question in information theory for classical multivariate Gaussians is to describe the image of the entropy map [Stu09]. This problem turns out to be difficult in this setting as well.
Problem 3.5. — Characterize the image of the entropy map H and describe how it intersects the faces of S_d. What can you say about the fibers of this map?

Remark 3.6. — We recall that for any d ≥ 1 the image im(H) is invariant under the action of the symmetric group and under translation in parallel to the lineality space of S_d. This is thanks to Lemma 2.7. We will provide an answer to Problem 3.5 when d = 2, 3 at the end of Section 4.

We now provide an algorithm to compute the entropy vector H(Λ), i.e. the coefficients of the polynomial ϕ_Λ. This relies on computing the Hermite form rather than directly solving the optimization problems given by equation (4).

Algorithm 1: Computing H(Λ)
Input: A full rank matrix A = (a_1, . . . , a_n) ∈ K^{d×n} with n ≥ d generating Λ
Output:
The entropy vector H(Λ)

for I ⊂ [d] do
  Compute the Hermite form A_I of Λ_I.
  h_I(Λ) ← val(det(A_I)) (sum of the valuations of the diagonal elements of A_I)
end
H(Λ) ← (h_I(Λ))_{I⊂[d]}
return H(Λ).

A Julia implementation of a variant of this algorithm (Remark 3.9) and supplementary materials are available at

(9) https://github.com/yassineELMAAZOUZ/Local_field-Gaussians.

We now discuss a couple of low-dimensional examples when K = Q_p.

Example 3.7. — Let Λ be the lattice represented by

A = ( 1    0  )
    ( p   p^2 ).

The coefficients h_I(Λ) of the polynomial ϕ_Λ can be computed from the representative A using Algorithm 1, and we have

h_∅(Λ) = 0, h_1(Λ) = 0, h_2(Λ) = 1, h_{1,2}(Λ) = 2,

and then we get

ϕ_Λ(v_1, v_2) = max(0, v_1, v_2 − 1, v_1 + v_2 − 2).

The independence statement X_1 ⊥⊥ X_2 does not hold since the inequality h_1(Λ) + h_2(Λ) ≤ h_{1,2}(Λ) is strict.

Figure 1. Tropical curve of ϕ_Λ (with vertices (0, 1) and (1, 2)) and its regular triangulation of the square, for Example 3.7.
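The entropies of Example 3.7 can also be recovered from formula (4) instead of the Hermite form. The sketch below (plain Python; A = (1 0; p p²) is a Hermite-form representative consistent with the entropies stated in the example) recomputes H(Λ) and evaluates ϕ_Λ:

```python
import math
from fractions import Fraction
from itertools import combinations

def val_p(x, p):
    """p-adic valuation of a rational; val(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    v, n, m = 0, x.numerator, x.denominator
    while n % p == 0:
        n //= p; v += 1
    while m % p == 0:
        m //= p; v -= 1
    return v

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

p = 2  # any prime gives the same valuations here
A = [[Fraction(1), Fraction(0)],
     [Fraction(p), Fraction(p ** 2)]]

def h(I):
    """h_I via formula (4), minimizing over column sets J with |J| = |I|."""
    if not I:
        return 0
    if len(I) == 1:
        return min(val_p(A[I[0]][j], p) for j in range(2))
    return val_p(det2(A), p)

H = {I: h(I) for r in range(3) for I in combinations(range(2), r)}

def phi(v1, v2):
    """phi(v) = max(0, v1 - h_1, v2 - h_2, v1 + v2 - h_12)."""
    return max(0, v1 - H[(0,)], v2 - H[(1,)], v1 + v2 - H[(0, 1)])
```

The computed vector matches the example: H = (0, 0, 1, 2), so ϕ(v₁, v₂) = max(0, v₁, v₂ − 1, v₁ + v₂ − 2).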
Example 3.8. — Let Λ be the lattice represented by

A = ( 1    0    0  )
    ( 1   p^2   0  )
    ( 1    p   p^2 ).

The polynomial ϕ_Λ can be computed again using Algorithm 1, and we get

h_∅(Λ) = 0,
h_1(Λ) = 0, h_2(Λ) = 0, h_3(Λ) = 0,
h_{1,2}(Λ) = 2, h_{1,3}(Λ) = 1, h_{2,3}(Λ) = 1,
h_{1,2,3}(Λ) = 4.

So we deduce that

ϕ_Λ(v) = max(0, v_1, v_2, v_3, v_1 + v_2 − 2, v_1 + v_3 − 1, v_2 + v_3 − 1, v_1 + v_2 + v_3 − 4).

We can easily check that the supermodularity inequalities are satisfied. Also, none of the conditional independence statements X_i ⊥⊥ X_j | X_k holds for {i, j, k} = {1, 2, 3}, since the point H(Λ) is in the interior of the cone S_3, i.e. all the inequalities h_{ki}(Λ) + h_{kj}(Λ) ≤ h_k(Λ) + h_{ijk}(Λ) are strict.

(a) Tropical variety of ϕ_Λ. (b) Regular subdivision of the Newton polytope of ϕ_Λ.

Figure 2.
Tropical geometry of the lattice Λ of Example 3.8.

Remark 3.9. — For any lattice Λ, there exists a maximal (for inclusion) diagonal lattice inside Λ and a minimal diagonal lattice containing Λ. Let us denote these two lattices by π^a and π^b respectively, where a ≥ b ∈ Z^d. So we have the inclusions π^a ⊂ Λ ⊂ π^b. It is not difficult to see that the region of linearity corresponding to the monomial v_1 + · · · + v_d − h_{[d]}(Λ) in the tropical polynomial ϕ_Λ(v) is the orthant R_{≥a} := { x ∈ R^d : x ≥ a }. Similarly, the region of linearity corresponding to the constant monomial 0 is the orthant R_{≤b} := { x ∈ R^d : x ≤ b }. From this, we can deduce the following recursive relation:

h_{[d]}(Λ) = h_{[d−1]}(Λ) + a_d.

This iterative way of computing the entropy map H(Λ) is slightly more efficient than Algorithm 1, where we have to compute the whole Hermite form of Λ_I for every I ⊂ [d]. This iterative algorithm is the one implemented in (9).
4. The entropy map on non-archimedean fields
In this section we generalize some of the results in Section 2 to the case where K isa field with a non-archimedean valuation.When the residue field k of K is infinite or the valuation group Γ is dense in R , theprobabilistic framework we had in Section 2 is no longer valid. More precisely, we losethe local compactness and we no longer have a Haar measure on K . First we providea list of interesting valued fields one can consider which have different mathematicalinterests. Example 4.1 (Examples of valued fields) . — – The fields R (( t )) or C (( t )) of Laurent series with complex or real coefficients.These are fields with an infinite residue field but still in discrete valuation Γ = Z . – The fields R = ∪ n ≥ R (( t /n )) and C = ∪ n ≥ C (( t /n )) of Puiseux seriesin t . In this case the valuation group Γ = Q is dense in R . – Another interesting field is the field of generalized Puiseux series K which hasvaluation group Γ = R . This fields consists of formal series f = (cid:80) α ∈ R a α t α where supp( f ) := { α ∈ R : a α (cid:54) = 0 } is either finite or has + ∞ as the only accumulationpoint. See [ ABGJ18 ] and references therein. – All the previous fields were in in equal characteristic with their residue fields.Interesting examples in mixed characteristic are Q p the algebraic closure of Q p and the field of p -adic complex numbers C p (completion of Q p ). They both havevaluation group Γ = Q .We define the entropy map H of a lattice as in Section 2, i.e for any I ⊂ [ d ] , h I (Λ) := min | J | = | I | val(det( A I × J )) , where A is a representative of Λ . We can still define a Hermite representative of Λ . Definition 4.2 . — Every lattice Λ in K d has a representative A in Hermite normalform , i.e. 
a matrix A in GL_d(K) satisfying the following conditions:
(i) A is lower triangular.
(ii) For any 1 ≤ j < i ≤ d we have either val(A_{i,j}) < val(A_{j,j}) or A_{i,j} = 0.

The same argument used in Theorem 3.2 shows that the image of H still lies in the supermodular cone S_d. In this setting however, since the valuation group can be dense in R, the image is not necessarily contained in S_d ∩ Z^{2^d − 1}. As in Section 2, the map H fails to be surjective when d ≥ 3. The algorithm we provide in (9) computes the map H when K = ∪_{n≥1} Q((t^{1/n})) is the field of Puiseux series over Q.

Now we show that the only distribution on the field of Laurent series K = R((t)) that satisfies the definition suggested in [Eva01, Definition 4.1] is the Dirac measure at 0. Let P be such a probability measure. First, we recall that if X is a random variable with distribution P, then for any a ∈ O_K^× the random variables X and aX have the same distribution, and we write X =^d aX. In particular, for any a ∈ R^× we have X =^d aX.
Proposition 4.3 . —
The probability distribution P is the Dirac measure at 0.

Proof. — We can write the power series expansion of X as

X = X_V t^V + X_{V+1} t^{V+1} + · · · ,

where V ∈ Z is the random valuation of X. Hence for a ∈ R^× we have

aX = aX_V t^V + aX_{V+1} t^{V+1} + · · · ,

and we deduce that X_k =^d aX_k for any k ≥ V and a ∈ R^×. Since the only real random variable whose distribution is invariant under scaling by every a ∈ R^× is 0, we deduce that X_k = 0 almost surely for all k ≥ V. Hence X = 0 almost surely, which finishes the proof.

Using a variant of this argument, it is not difficult to see that a similar problem arises when we try to define Gaussian measures by orthogonality for all the fields listed in Example 4.1. It is not immediately clear how to fix this problem and find a suitable definition for Gaussian measures on non-archimedean valued fields.
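The scaling-invariance fact used in this proof can be spelled out as a short side computation (ours, not from the original text): for a real random variable Y with Y =^d aY for every a ∈ R^×, and any ε > 0,

```latex
\mathbb{P}(|Y| > \varepsilon)
  \;=\; \mathbb{P}(|aY| > \varepsilon)
  \;=\; \mathbb{P}\!\left(|Y| > \tfrac{\varepsilon}{|a|}\right)
  \;\xrightarrow[\;|a| \to 0\;]{}\; 0,
```

so P(|Y| > ε) = 0 for every ε > 0, i.e. Y = 0 almost surely; applied to each coefficient Y = X_k this yields the claim.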
Problem 4.4 . — Is there a suitable definition for Gaussian measures on the fieldslisted in Example 4.1?
Remark 4.5. — We can define a probability measure on R^d induced by Λ via its tail distribution Q_Λ as in Section 2. One can see that the support of this distribution is trop(Λ) := val(Λ ∩ (K^×)^d), the image under the valuation map of the points of Λ with no zero coordinates. In general this is a polyhedral complex in R^d in which each edge is parallel to some e_I := Σ_{i∈I} e_i. The following figure is a drawing of trop(Λ) for a lattice in K^3 when the valuation group is Γ = R.

Figure 3.
The polyhedral complex trop(Λ) for a lattice Λ in K^3.

To conclude this section, we give a partial answer to Problem 3.5 when d = 2, 3 and the valuation group is R. So, let K be the field of generalized Puiseux series in the variable t with real coefficients (in this case the valuation group Γ is equal to R).

Proposition 4.6. —
For d = 2, the image im(H) of the entropy map H is exactly S_2.

Proof. — For Λ with representative

( t^a 0 ; t^b t^{b+δ} )

with a, b ∈ R and δ ≥ 0, we have H(Λ) = (a, b, a + b + δ). So H is indeed surjective onto S_2.

For d = 3, the cone S_3 ⊂ R^7 has a lineality space L of dimension 3. Since both S_3 and im(H) are stable under translations in L (see Remark 3.6 and Lemma 2.7 on diagonal scaling of lattices), they are fully determined by their projection onto a complement of L. Let us write vectors x of R^7 in the form

x = (x_1, x_2, x_3; x_{12}, x_{13}, x_{23}; x_{123}),

and let us project S_3 and im(H) onto the linear space W ⊂ R^7 of vectors of the form

x = (0, x_2, x_3; 0, x_{13}, x_{23}; 0),

which is a complement of L in R^7. We write a vector of W as (x_2, x_3; x_{13}, x_{23}), or simply as (w, x, y, z) to lighten notation. Let P and C denote the projections of im(H) and S_3 respectively onto the space W. From Section 3, we clearly have P ⊂ C.

The projection C of S_3 onto W is a polyhedral cone that does not contain any lines; in the language of polyhedral geometry, such a cone is called a pointed cone. Moreover, the dimension of this projection is 4. It is defined in W by the inequalities

(10)    w ≤ 0,    x ≤ y,    w + x ≤ z,    y ≤ 0,    z ≤ w,    y + z ≤ x.

This defines C as a pointed cone over a bipyramid (see Figure 4). On the other hand, any lattice Λ in K^3 can be represented, up to diagonal scaling, by a representative with Hermite form of the shape

( 1 0 0 ; ∗ 1 0 ; ∗ ∗ 1 ).

The entropy vector of a lattice Λ with such a Hermite normal form is of the shape

H(Λ) = (0, h_2, h_3; 0, h_{13}, h_{23}; 0).

This corresponds to the projection of im(H) to W parallel to L. So the projection P of im(H) onto W is the set

P = { H(Λ) : Λ given by a matrix of the shape above with entries in K }.
For a lattice Λ with representative

A = ( 1 0 0 ; a 1 0 ; b c 1 ),

where a, b, c ∈ K have negative or zero valuation (see Definition 3.1), the point H(Λ) in W is given by

w = h_2(Λ) = val(a),
x = h_3(Λ) = min(val(b), val(c)),
y = h_{13}(Λ) = val(c),
z = h_{23}(Λ) = min(val(ac − b), val(a)).
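These four formulas can be sanity-checked numerically. The Python sketch below is our own illustration (the names `coords` and `random_element` are hypothetical, and Laurent polynomials over Z stand in for the field K): it samples a, b, c of nonpositive valuation, computes (w, x, y, z) via the displayed formulas, and verifies that the point always satisfies the inequalities (10); note that z must be computed from ac − b itself, since cancellations can raise its valuation.

```python
import random

def lval(f):
    """Valuation of a Laurent polynomial {exponent: coefficient}."""
    support = [e for e, c in f.items() if c != 0]
    return min(support) if support else float("inf")

def lmul(f, g):
    h = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            h[e1 + e2] = h.get(e1 + e2, 0) + c1 * c2
    return h

def lsub(f, g):
    h = dict(f)
    for e, c in g.items():
        h[e] = h.get(e, 0) - c
    return h

def coords(a, b, c):
    """The point (w, x, y, z) = (h_2, h_3, h_13, h_23) of H(Λ) in W."""
    w = lval(a)
    x = min(lval(b), lval(c))
    y = lval(c)
    z = min(lval(lsub(lmul(a, c), b)), lval(a))  # cancellation-aware
    return w, x, y, z

def random_element(rng):
    """Random Laurent polynomial with valuation in {-3, ..., 0}."""
    v = rng.randint(-3, 0)
    return {e: rng.randint(1, 5) for e in range(v, v + 3)}

rng = random.Random(0)
for _ in range(2000):
    w, x, y, z = coords(random_element(rng), random_element(rng), random_element(rng))
    # the six inequalities (10) cutting out the cone C
    assert w <= 0 and y <= 0 and x <= y
    assert w + x <= z and z <= w and y + z <= x
```

Taking a = c = t^{-1} and b = t^{-2} exhibits the cancellation: ac − b = 0, so z = min(∞, val(a)) = val(a).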
One can check that, for any choice of a, b, c ∈ K with negative or zero valuation, the above coordinates satisfy the inequalities in (10). With the constraints on the valuations of a, b, c, and from this parametric representation of P, we can see that points of P have to satisfy the inequalities

w ≤ 0,    x ≤ y,    y ≤ 0.

The only part that remains is to determine the inequalities involving the last variable z. The ambiguity comes from the fact that cancellations can happen in ac − b, which might affect val(ac − b) and hence also z. But, separating the cases where val(ac) = val(b) and val(ac) ≠ val(b), we get the following three sets of inequalities that together describe P:

{ w ≤ 0,  x ≤ w + y,  y ≤ 0,  z = x },
{ w ≤ 0,  x ≤ y,  y ≤ 0,  y + w ≤ x,  z = y + w },
{ w ≤ 0,  y ≤ 0,  x = y + w,  z ≤ w,  x ≤ z }.

We can then see that P is a polyhedral complex of dimension 3 inside C. More precisely, P is the union of three pointed polyhedral cones of dimension 3 inside C, which is a cone of dimension 4. The following figure depicts the intersections of P and C with the hyperplane w + x + y + z + 1 = 0 (eliminating the variable w):

(a) P ∩ { w + x + y + z + 1 = 0 }    (b) C ∩ { w + x + y + z + 1 = 0 }

Figure 4.
Intersections of P and C with the hyperplane x + y + z + w = −1.

Corollary 4.7. — The entropy map H : GL_d(K)/GL_d(O) → S_d is not surjective when d ≥ 3.

Proof. —
Follows from the previous discussion.
We expect this result to hold in every dimension, i.e., the image im(H) is a polyhedral complex whose facets are polyhedral cones of dimension d(d+1)/2 inside S_d, which is of dimension 2^d − 1.
5. Conclusion
In conclusion, there are many similarities between the classical theory of Gaussian distributions on Euclidean spaces and the theory of Gaussian measures on local fields as defined by Evans in [Eva01]. In this paper we have exhibited another similarity in terms of differential entropy. This gives reason to think that the suitable non-archimedean analog of the positive definite cone is indeed the set of lattices, or more precisely, in the language of group theorists, the Bruhat–Tits building for SL_d. This analogy can still be carried out for non-archimedean valued fields in general. However, when the field K has a dense valuation group or an infinite residue field, we lose the probabilistic interpretation and thus also the notion of entropy.

References

[AB08] Peter Abramenko and Kenneth S. Brown. Buildings: Theory and Applications, volume 248. Springer Science & Business Media, 2008.
[ABGJ18] Xavier Allamigeon, Pascal Benchimol, Stéphane Gaubert, and Michael Joswig. Log-barrier interior point methods are not strongly polynomial. SIAM Journal on Applied Algebra and Geometry, 2(1):140–178, 2018.
[AGG12] Marianne Akian, Stéphane Gaubert, and Alexander Guterman. Tropical polyhedra are equivalent to mean payoff games. International Journal of Algebra and Computation, 22(1):1250001, 2012.
[AZ01] Sergio Albeverio and Xuelei Zhao. A decomposition theorem for Lévy processes on local fields. J. Theoret. Probab., 14(1):1–19, 2001.
[BK13] Elizabeth Baldwin and Paul Klemperer. Tropical geometry to analyse demand. Unpublished paper, 2013.
[CR66] Charles W. Curtis and Irving Reiner. Representation Theory of Finite Groups and Associative Algebras, volume 356. American Mathematical Soc., 1966.
[EL07] Steven N. Evans and Tye Lidman. Expectation, conditional expectation and martingales in local fields. Electronic Journal of Probability, 12(17):498–515, 2007.
[EP05] Antonio J. Engler and Alexander Prestel. Valued Fields. Springer Science & Business Media, 2005.
[ER19] Steven N. Evans and Daniel Raban. Rotatable random sequences in local fields. Electronic Communications in Probability, 24, 2019.
[Eva01] Steven N. Evans. Local fields, Gaussian measures, and Brownian motions. Topics in Probability and Lie Groups: Boundary Theory, 28:11–50, 2001.
[Eva02] Steven N. Evans. Elementary divisors and determinants of random matrices over a local field. Stochastic Processes and their Applications, 102(1):89–102, 2002.
[Goe97] Michel X. Goemans. Semidefinite programming in combinatorial optimization. Mathematical Programming, 79(1-3):143–161, 1997.
[Khr13] Andrei Y. Khrennikov. p-Adic Valued Distributions in Mathematical Physics, volume 309. Springer Science & Business Media, 2013.
[Kot63] D. M. Koteljanskii. A property of sign-symmetric matrices. Amer. Math. Soc. Transl. Ser., 2(27):19–23, 1963.
[KVV10] Jeroen Kuipers, Dries Vermeulen, and Mark Voorneveld. A generalization of the Shapley–Ichiishi result. International Journal of Game Theory, 39(4):585–602, 2010.
[LMY] Bo Lin, Anthea Monod, and Ruriko Yoshida. Tropical foundations for probability & statistics on phylogenetic tree space. arXiv:1805.12400.
[MS19] Mateusz Michałek and Bernd Sturmfels. Invitation to Nonlinear Algebra. Graduate Studies in Mathematics, American Mathematical Society, 2019.
[MT] Yassine El Maazouz and Ngoc Mai Tran. Statistics of Gaussians on local fields and their tropicalizations. arXiv:1909.00559.
[Sch84] W. H. Schikhof. Ultrametric Calculus (Cambridge Studies in Advanced Mathematics, 4). Cambridge University Press, Cambridge, 1984.
[Sch07] Wilhelmus Hendricus Schikhof. Ultrametric Calculus: An Introduction to p-adic Analysis, volume 4. Cambridge University Press, 2007.
[Ser13] Jean-Pierre Serre. Local Fields, volume 67. Springer Science & Business Media, 2013.
[Stu09] Bernd Sturmfels. Open problems in algebraic statistics. In Emerging Applications of Algebraic Geometry, pages 351–363. Springer, 2009.
[SU10] Bernd Sturmfels and Caroline Uhler. Multivariate Gaussian, semidefinite matrix completion, and convex algebraic geometry. Ann. Inst. Statist. Math., 62(4):603–638, 2010.
[Tra20] Ngoc M. Tran. Tropical Gaussians: a brief survey. Algebraic Statistics, 11, 2020.
[TY19] Ngoc Mai Tran and Josephine Yu. Product-mix auctions and tropical geometry. Math. Oper. Res., 44(4):1396–1411, 2019.
[vR78] Arnoud C. M. van Rooij. Non-Archimedean Functional Analysis. Dekker, New York, 1978.
[VVZ94] Vasilii Sergeevich Vladimirov, Igor Vasilievich Volovich, and Evgenii Igorevich Zelenov. p-adic Analysis and Mathematical Physics. World Scientific, 1994.
[Wan08] Qiuping A. Wang. Probability distribution and entropy as a measure of uncertainty. Journal of Physics A: Mathematical and Theoretical, 41(6):065004, 2008.
[Wei13] André Weil. Basic Number Theory, volume 144. Springer Science & Business Media, 2013.
[YZZ19] Ruriko Yoshida, Leon Zhang, and Xu Zhang. Tropical principal component analysis and its application to phylogenetics. Bull. Math. Biol., 81(2):568–597, 2019.
October 25, 2020