Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More
Yeganeh Alimohammadi, Nima Anari, Kirankumar Shiragur, Thuy-Duong Vuong
Stanford University, {yeganeh,anari,shiragur,tdvuong}@stanford.edu
February 5, 2021
Abstract
We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum [Jer87] to be #P-hard; Jerrum also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes.

In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them, robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari, Liu, and Oveis Gharan [ALO20], providing a new tool for establishing spectral independence based on the geometry of polynomials. As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding [Går59], who showed that homogeneous polynomials that have no roots in a half-plane must be log-concave over the positive orthant.
1 Introduction

Let µ : ([n] choose k) → R_{≥0} be a density function on the family of subsets of size k out of a ground set of n elements, which defines a probability distribution P[S] ∝ µ(S). The goal of this work is to establish properties of µ that translate into efficient algorithms for sampling from this distribution, and by classical equivalences between approximate counting and sampling [JVV86], into algorithms for approximately computing the normalizing constant, i.e., the partition function ∑_S µ(S). We study a family of local Markov chains that can be used to approximately sample from such a distribution.

Definition 1 (Down-Up Random Walks). For a density µ : ([n] choose k) → R_{≥0} and an integer ℓ ≤ k, we define the k ↔ ℓ down-up random walk as the sequence of random sets S_0, S_1, . . . generated by the following algorithm:

for t = 0, 1, . . . do
    Select T_t uniformly at random from subsets of size ℓ of S_t.
    Select S_{t+1} with probability ∝ µ(S_{t+1}) from supersets of size k of T_t.

Figure 1: A symmetric sector around the positive real axis. Sector-stability of a polynomial means that if all variables are chosen from the interior of the sector, the polynomial does not vanish.

This random walk is time-reversible, always has µ as its stationary distribution, and moreover has positive real eigenvalues [see, e.g., ALO20]. The special case ℓ = k − 1 corresponds to the standard Glauber dynamics, while ℓ = k − O(1) yields multi-site variants. The walk can be implemented efficiently as long as k − ℓ = O(1) and we have oracle access to µ. This is because the number of supersets of T_t is at most n^{k−ℓ} = poly(n), so we can enumerate over all of them in polynomial time.

Our main result establishes a formal connection between roots of the generating polynomial of µ, defined below, and rapid mixing of the k ↔ ℓ down-up walks.

Definition 2 (Generating Polynomial). To a density µ : ([n] choose k) → R_{≥0} we associate a multivariate generating polynomial g_µ ∈ R[z_1, . . . , z_n], which encodes µ in its coefficients:

g_µ(z_1, . . . , z_n) := ∑_S µ(S) ∏_{i ∈ S} z_i.

Note that g_µ is a polynomial with nonnegative coefficients, and as such, it has no roots (z_1, . . . , z_n) ∈ R^n_{>0}. We consider polynomials that not only avoid roots on the positive real axis, but also avoid roots in a neighborhood of it, that is, a sector of the complex plane centered around R_{>0}.

Definition 3 (Sector-Stability). For an open sector Γ ⊆ C centered around the positive real axis in the complex plane, see Fig. 1, we call a polynomial g(z_1, . . . , z_n) sector-stable if

z_1, . . . , z_n ∈ Γ ⟹ g(z_1, . . . , z_n) ≠ 0.

Our main result shows that sector-stability, when Γ has constant aperture, implies rapid mixing of the k ↔ ℓ down-up random walk for an appropriately chosen ℓ = k − O(1).

Theorem 4.
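A minimal sketch of the k ↔ ℓ down-up walk of Definition 1, assuming oracle access to µ (here a Python dict from frozensets to weights; `ground_set`, `k`, and `ell` are illustrative parameters, not notation from the paper):

```python
import itertools
import random

def down_up_step(S, mu, ground_set, ell):
    """One step of the k <-> ell down-up walk: drop to size ell, then
    resample a size-k superset with probability proportional to mu."""
    k = len(S)
    # Down step: keep a uniformly random subset T of size ell.
    T = set(random.sample(sorted(S), ell))
    # Up step: enumerate all size-k supersets of T (at most n^{k-ell} of them).
    outside = [v for v in ground_set if v not in T]
    candidates = [frozenset(T | set(extra))
                  for extra in itertools.combinations(outside, k - ell)]
    weights = [mu.get(C, 0.0) for C in candidates]
    return set(random.choices(candidates, weights=weights)[0])

# Toy density on 2-subsets of {0,1,2,3}; the walk's stationary law is mu.
mu = {frozenset(S): 1.0 for S in itertools.combinations(range(4), 2)}
S = {0, 1}
for _ in range(10):
    S = down_up_step(S, mu, range(4), ell=1)
assert len(S) == 2
```

Note that each step only evaluates µ on the poly(n)-many candidate supersets, which is what makes the walk implementable with oracle access alone.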
Suppose that the density µ : ([n] choose k) → R_{≥0} has a generating polynomial that is sector-stable with respect to a sector Γ of aperture Ω(1). Then for an appropriate value of ℓ = k − O(1), the k ↔ ℓ down-up random walk has relaxation time k^{O(1)}.

As a reminder, for a time-reversible Markov chain with positive eigenvalues, the relaxation time is the inverse of the spectral gap [LP17]. A corollary of polynomially bounded relaxation time is that for starting points with not-terribly-small probability, the mixing time can be polynomially bounded as well.
Corollary 5 ([see, e.g., LP17]). Suppose µ has a sector-stable generating polynomial for a sector of constant aperture, and let ℓ = k − O(1) be the value promised by Theorem 4. If the k ↔ ℓ down-up random walk is started from S_0, then

t_mix(ε) ≤ O( k^{O(1)} · log( 1 / (ε · P_µ[S_0]) ) ),

where t_mix(ε) is the smallest time t such that S_t is ε-close in total variation distance to the distribution defined by µ.

As our main application, we obtain efficient algorithms to approximately sample/count (weighted) matchings and matchings of a given size in planar graphs. We discuss this and other applications in Sections 1.1 to 1.3. We then discuss the techniques we use and related work in Sections 1.4 to 1.6.
Matchings in graphs have been a rich source of intriguing algorithmic questions. The celebrated blossom algorithm of Edmonds [Edm65], which finds a maximum-sized matching in a general graph, has been partially credited with the creation of the notion of polynomial time algorithms [Koz06]. An entirely different class of algorithms for finding matchings, based on connections to determinants, was introduced by Lovász [Lov79] and developed further by Karp, Upfal, and Wigderson [KUW86] and Mulmuley, Vazirani, and Vazirani [MVV87]; these determinant-based algorithms have played a central role in the study of parallel algorithms and derandomization [see, e.g., FGT19].

Matchings have also played a central role in counting complexity. The problem of counting perfect matchings of a given graph was shown by Valiant [Val79] to be complete for the class #P, yielding strong evidence that it cannot be solved in polynomial time. This was the first major result of its kind, demonstrating hardness of counting for a problem whose search version, i.e., the problem of distinguishing zero and nonzero counts, is polynomial-time solvable.

Given the hardness of exact counting [Val79], the main focus in subsequent work has been on approximate counting. Unlike combinatorial optimization problems, which often admit nontrivial approximation factors, for a wide range of counting problems the approximation factor achievable in polynomial time can either be made as small as 1 + ε, in fact for inverse-polynomially small ε, or it has to be super-polynomially large [SJ89]. Therefore, the gold standard for approximate counting is a fully polynomial time randomized approximation scheme or FPRAS; this is a randomized algorithm whose output is a (1 + ε)-factor approximation to the count with high probability, running in time poly(n, 1/ε).

In a breakthrough, Jerrum and Sinclair [JS89] established an FPRAS for counting matchings of all sizes on unweighted graphs.
It has been a major open problem to design an FPRAS for counting matchings of a given size or perfect matchings. In a celebrated result, Jerrum, Sinclair, and Vigoda [JSV04] designed an FPRAS for these problems on the important subclass of bipartite graphs; bipartite graphs are an important subclass because of their connection to the permanent of matrices. However, designing an FPRAS to count matchings of a given size on general graphs remains open [see, e.g., ŠVW18].

Besides the class of bipartite graphs, there is another major tractable class for counting perfect matchings. Motivated by models in statistical mechanics, Temperley and Fisher [TF61] and Kasteleyn [Kas61] related the number of perfect matchings in 2-dimensional lattices to a specific determinant, obtaining exact formulae for these counts. Later, Kasteleyn [Kas67] generalized this to all planar graphs, obtaining a polynomial time algorithm for exactly counting perfect matchings in such graphs. At a high level, this algorithm finds a suitable signing of the adjacency matrix, a.k.a. the Tutte matrix, ensuring its determinant is the square of the number of perfect matchings.

While both bipartite and planar graphs form tractable classes for (approximately/exactly) counting perfect matchings, see Figs. 2 and 3, there is a major difference between the two when it comes to non-perfect matchings. The problem of counting k-matchings, matchings with exactly k edges, is no harder than counting perfect matchings in general. In a general graph on n nodes, one can add n − 2k dummy nodes connected to everything else, see Fig. 4, and count perfect matchings in the modified graph; the result is (n − 2k)! times the number of k-matchings. This strategy extends to counting k-matchings in bipartite graphs as well. However, in the case of planar graphs, the dummy nodes destroy planarity. This is not just a coincidence.
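The dummy-node reduction above can be checked by brute force on a small graph (a sketch; `matchings` and `perfect_matchings` are simple enumerations written only for illustration):

```python
import itertools
from math import factorial

def matchings(edges, size):
    """Count sets of `size` pairwise-disjoint edges."""
    count = 0
    for M in itertools.combinations(edges, size):
        verts = [v for e in M for v in e]
        if len(verts) == len(set(verts)):
            count += 1
    return count

def perfect_matchings(edges, n):
    return matchings(edges, n // 2) if n % 2 == 0 else 0

# 4-cycle on vertices 0..3; count k-matchings for k = 1.
n, k = 4, 1
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Reduction: add n - 2k dummy nodes adjacent to every original node.
dummies = list(range(n, n + (n - 2 * k)))
aug = cycle + [(v, d) for v in range(n) for d in dummies]

lhs = perfect_matchings(aug, n + len(dummies))
rhs = factorial(n - 2 * k) * matchings(cycle, k)
assert lhs == rhs  # (n - 2k)! times the number of k-matchings
```

Here the 4-cycle has four 1-matchings, and the augmented graph on 6 vertices has 2! · 4 = 8 perfect matchings, matching the claimed identity.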
Jerrum [Jer87] showed that while perfect matchings can be counted exactly in polynomial time on planar graphs, counting k-matchings on such graphs is #P-hard, adding to the mystery of determinant-based counting algorithms. Nevertheless, Jerrum [Jer87] raised the possibility of approximately counting k-matchings in polynomial time, i.e., designing an FPRAS. As the main application of our results, we resolve this question affirmatively.

Theorem 6.
There is a randomized algorithm that receives a planar graph on n nodes and a number k, and outputs a (1 + ε)-approximation to the number of k-matchings with high probability, running in time poly(n, 1/ε).

More generally, our results apply to the setting of weighted graphs, a.k.a. monomer-dimer systems. Suppose that a given graph G = (V, E) has edge weights w : E → R_{≥0} and vertex weights λ : V → R_{≥0}. Then define the weight of a matching M as

weight(M) := ∏_{e ∈ M} w(e) · ∏_{v ∉ V(M)} λ(v),

where e ranges over the dimers, i.e., the matching edges, and v ranges over the monomers, i.e., the vertices not matched in M. Normalizing these weights defines a probability distribution over matchings, and approximating the normalizing factor, a.k.a. the partition function, is known to be equivalent to approximately sampling from this distribution [JVV86]. It was shown by Jerrum and Sinclair [JS89] how to approximately sample/count from monomer-dimer systems in general graphs when the edge weights w(e) are polynomially bounded and there are no vertex weights λ(v); these assumptions on the weights are quite strong, despite their seemingly innocuous appearance. Approximately sampling/counting from monomer-dimer systems with no restriction on the weights remains a key challenge.

Computing statistics of monomer-dimer systems on 2-dimensional lattices, and more generally planar graphs, was originally studied in statistical physics [Kas61; TF61; Kas67]. However, the determinant-based algorithms found could only solve the case of zero monomer weights: ∀v : λ(v) = 0. Here we remove this restriction, at the expense of approximation.

Theorem 7.
There is an algorithm that receives a planar graph G = (V, E) on n vertices and weights w : E → R_{≥0} and λ : V → R_{≥0}, and outputs a random matching M whose distribution is ε-close in total variation distance to the monomer-dimer distribution induced by w, λ. The running time of this algorithm is poly(n, log(1/ε)).

Our results do not rely strongly on planarity. In fact, Theorems 6 and 7 extend to any downward-closed family of graphs for which perfect matchings can be counted efficiently. Examples that go beyond planar graphs include certain minor-free graphs [EV19] and small genus graphs [GL99].

The key insight that enables Theorems 6 and 7 is that we show local random walks on the set of monomers, or terminals of the matching M, mix rapidly on all graphs. Monomer-dimer systems and k-matchings each induce a distribution on subsets S of the vertices of the graph if we only view the unmatched (or dually, matched) vertices, i.e., the monomers. On planar graphs, the weight of each such set S can be computed efficiently, up to a global normalizing factor:

µ(S) := ∑ { weight(M) | M is a perfect matching on the complement of S }.

We remark that by the results of [Jer87], approximation appears to be necessary, at least for the counting problem.
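The monomer density µ(S) above can be sketched by brute-force enumeration on a small unweighted graph (all w(e) = λ(v) = 1; `perfect_matching_count` is an illustrative stand-in for the efficient determinant-based planar counter):

```python
import itertools

def perfect_matching_count(edges, vertices):
    """Brute-force count of perfect matchings on `vertices`."""
    if len(vertices) % 2:
        return 0
    k = len(vertices) // 2
    usable = [e for e in edges if e[0] in vertices and e[1] in vertices]
    count = 0
    for M in itertools.combinations(usable, k):
        touched = [v for e in M for v in e]
        if len(set(touched)) == 2 * k:
            count += 1
    return count

def mu(S, edges, V):
    """Monomer weight of S: perfect matchings on the complement of S."""
    return perfect_matching_count(edges, V - S)

V = {0, 1, 2, 3}
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
# S = {0, 2}: vertices 1 and 3 remain, but (1, 3) is not an edge.
assert mu({0, 2}, cycle, V) == 0
# S = {}: the two perfect matchings of the 4-cycle.
assert mu(set(), cycle, V) == 2
```

Normalizing these values of µ(S) over all monomer sets S gives exactly the distribution the multi-site Glauber dynamics targets.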
We show how to sample a set S with probability approximately following the above, by running a multi-site Glauber dynamics on S for polynomially many steps. The rapid mixing of this random walk, combined with known equivalences between approximate sampling and approximate counting [MVV87], implies Theorems 6 and 7.

We prove Theorems 6 and 7 by showing sector-stability of the corresponding generating polynomials and then applying Theorem 4. We show sector-stability by starting from results of Heilmann and Lieb [HL72], who characterized regions of root-freeness for unconstrained non-homogeneous monomer-dimer systems, and applying a set of tools we build that show sector-stability degrades gracefully under a number of operations, like conditioning on cardinality or homogenization.

Lemma 8.
Suppose that a graph G = (V, E) is given with edge weights w : E → R_{≥0} and vertex weights λ : V → R_{≥0}, which together define a weight on matchings weight(M) = ∏_{e ∈ M} w(e) ∏_{v ∉ V(M)} λ(v). For any k, the following polynomial, encoding k-matchings, is sector-stable for a sector of aperture π/2:

g(z_1, . . . , z_n) = ∑_{M matching of size k} weight(M) ∏_{v ∉ V(M)} z_v.

Additionally, the following homogeneous polynomial in 2n variables, encoding all matchings, is sector-stable for a sector of aperture π/2:

g(z_1, . . . , z_n, z′_1, . . . , z′_n) = ∑_{M matching} weight(M) ∏_{v ∉ V(M)} z_v ∏_{v ∈ V(M)} z′_v.

Remark 9. Techniques developed by Jerrum and Sinclair [JS89] allow one to tune the weights in monomer-dimer systems to make the probability mass of k-matchings inverse-polynomially large. In turn, combining these techniques with rejection sampling, Theorem 6 can be derived from Theorem 7. Nevertheless, our techniques directly solve the sampling problem for k-matchings, monomer-dimer systems, and even monomer-dimer systems restricted to k-matchings, without the need to resort to weight-tuning.

Determinantal point processes (DPPs) are elegant probabilistic models used to capture the relationship between items within a subset drawn from a large universe of items. A DPP is formally defined with the help of an n × n positive semidefinite matrix L ⪰ 0, where a subset S ⊆ [n] is chosen with probability given by the principal minors of L: P[S] ∝ det(L_{S,S}).

Determinantal point processes were first studied in 1975 by Macchi [Mac75], who was motivated by the study of fermion processes in quantum mechanics. Since then, DPPs have been very well-studied and have found applications in many areas such as physics [CMO19; Sos02], random matrix theory [Joh05], combinatorics [BBL09] (random spanning trees [BP93], non-intersecting paths [Ste90]), and recently in machine learning. Within machine learning, DPPs have been used in several applications such as document summarization [Cha+15; LB12], recommender systems [GPK16], and many others [Aff+14; KT11; KSG08]. Due to their broad and practical applications, algorithmic questions occurring in DPPs have received a lot of attention, and efficient algorithms for DPP learning [Aff+14; Bor09; KT12; LMR15] and sampling [AOR16; RK15; LJS16; Hou+06] have been provided.

Kulesza and Taskar [KT11; KT12] studied an extension of DPPs where the samples are conditioned on having a fixed size k. These so-called k-DPPs are formally defined with the help of an n × n positive semidefinite matrix L ⪰ 0, where a subset S ∈ ([n] choose k) of size k is chosen with probability given by the k × k principal minors of L:

P[S] = det(L_{S,S}) / ∑_{T ∈ ([n] choose k)} det(L_{T,T}).

The authors in [KT11; KT12] used k-DPPs to attack problems such as the image search task, where the goal is to output a diverse set of image results, of desired cardinality, in response to a search query. Almost all prior work on DPPs assumes the underlying matrix L is symmetric and positive semidefinite (PSD), and the understanding of nonsymmetric DPPs (where L does not have to be symmetric) remains sparse.
For nonsymmetric matrices L that are guaranteed to have nonnegative principal minors, the nonsymmetric DPP can still be defined by P[S] ∝ det(L_{S,S}).

Nonsymmetric DPPs are important as they allow one to model both repulsive and attractive relationships between items, providing significantly improved modeling power. For applications of nonsymmetric DPPs see [Gar+19], where the authors use nonsymmetric DPPs to effectively recover correlation structure within data, particularly for data that contains large disjoint collections of items where the items within the same collection have positive correlation while those across different collections are negatively correlated. Brunel [Bru18] also studied learning certain subclasses of nonsymmetric DPPs. Due to their enhanced expressive power and potential new applications, the study of nonsymmetric DPPs has been an active area of research in the past few years.

The question of sampling from nonsymmetric k-DPPs is known to be polynomial-time tractable. Indeed, the counting question, that is, computing the sum of principal minors, can be done exactly, even when restricted to k × k principal minors. However, these naive algorithms are cumbersome to run in practice, as they require at least n × n matrix multiplication time. A similar barrier existed for symmetric DPPs, but Markov-chain-based sampling from k-DPPs for symmetric L provided one way to get around this barrier [AOR16; LJS16], yielding algorithms that run in O(n · poly(k)) time.

As an application of our results, we provide the first efficient Markov-chain-based algorithm to sample from a wide class of nonsymmetric k-DPPs. Our algorithm works for any nonsymmetric matrix L satisfying L + L^⊺ ⪰ 0. These matrices are the sum of a skew-symmetric matrix and a symmetric PSD matrix; this class of matrices L, which are automatically guaranteed to have nonnegative principal minors, defines the main class of nonsymmetric DPPs studied in the literature [Gar+19].

Theorem 10.
For any matrix L ∈ R^{n×n} satisfying L + L^⊺ ⪰ 0 and cardinality k ≥ 1, consider the distribution µ : ([n] choose k) → R_{≥0} defined by µ(S) ∝ det(L_{S,S}). Then the k ↔ (k − 1) down-up random walk for µ has relaxation time poly(k).

Note that each step of this random walk can be implemented using O(n) computations of k × k principal minors of L. So this results in a mixing time of O(n · poly(k) · log(1/P[S_0])), which can be much faster than n × n matrix multiplication time. To the best of our knowledge, our work is the first to establish that natural Markov chains can be used for the task of sampling from nonsymmetric k-DPPs. Unsurprisingly, we show this result by proving sector-stability of the corresponding generating polynomial.

Lemma 11.
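One step of the k ↔ (k − 1) walk for a nonsymmetric k-DPP can be sketched with O(n) principal-minor evaluations (a toy implementation; a practical one would update determinants incrementally rather than recompute each minor from scratch):

```python
import numpy as np

rng = np.random.default_rng(0)

def minor(L, S):
    S = sorted(S)
    return np.linalg.det(L[np.ix_(S, S)])

def kdpp_step(L, S):
    """One k <-> (k-1) down-up step for mu(S) proportional to det(L_{S,S})."""
    S = list(S)
    # Down step: drop a uniformly random element.
    S.remove(rng.choice(S))
    # Up step: re-add an element with probability proportional to the new minor.
    n = L.shape[0]
    candidates = [i for i in range(n) if i not in S]
    weights = np.array([minor(L, S + [i]) for i in candidates])
    new = rng.choice(candidates, p=weights / weights.sum())
    return frozenset(S + [int(new)])

# L = skew-symmetric part + PSD part, so L + L^T >= 0 and all principal
# minors are nonnegative, as required.
A = rng.standard_normal((5, 5))
L = (A - A.T) + 2.0 * np.eye(5)
S = frozenset({0, 1})
for _ in range(20):
    S = kdpp_step(L, S)
assert len(S) == 2
```

Each step touches only n − k + 1 candidate minors, which is the source of the O(n · poly(k)) per-step cost mentioned above.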
For any matrix L ∈ R^{n×n} satisfying L + L^⊺ ⪰ 0 and number k, the following polynomial is sector-stable w.r.t. a sector of aperture π/2:

g(z_1, . . . , z_n) = ∑_{S ∈ ([n] choose k)} det(L_{S,S}) ∏_{i ∈ S} z_i.

Suppose that µ : ([n] choose k) → R_{≥0} is a density where g_µ is stable with respect to a half-plane in C, i.e., stable w.r.t. the sector {z ∈ C | Re(z) > 0}. Distributions with this property are called strongly Rayleigh, and they have been widely studied in the literature [see, e.g., BBL09]. Strongly Rayleigh distributions include determinantal point processes, certain classes of matroids, results of the symmetric exchange process, and more [see, e.g., BBL09]. Motivated by the important problems of computing mixed discriminants and counting intersections of matroids, several works [AO17; SV17; Cel+16; KD16] have studied the problem of sampling from such µ subject to a partition constraint. That is, given a partition T_1 ∪ T_2 ∪ · · · ∪ T_s = [n] and numbers c_1, . . . , c_s ∈ Z_{≥0}, the question is to sample S ∼ µ conditioned on the constraint ∀i : |S ∩ T_i| = c_i.

If we allow arbitrarily large s, this problem becomes as hard as (approximately) computing the mixed discriminant, for which no FPRAS is known. If one defines the same problem for distributions µ that have a log-concave generating polynomial, then partition-constrained sampling is as hard as sampling from the intersection of two matroids; this is again an important open problem, which remains unsolved.

Given the importance of the partition-constrained distributions mentioned above, a natural question is: are there assumptions on the partitions that allow for an FPRAS or approximate sampling? Celis, Deshpande, Kathuria, Straszak, and Vishnoi [Cel+16] obtained such a positive result when the number of partitions s is a constant and, importantly, when g_µ can be computed exactly (as is the case for determinantal distributions). They relied on polynomial interpolation to achieve this result.
However, for many strongly Rayleigh distributions µ, we can only approximately compute g_µ. As a further application of our results, we show how to sample from partition-constrained µ, as long as the number of partitions is O(1); our algorithm only requires having oracle access to µ, as opposed to g_µ. We do this by showing that the local random walks on the partition-constrained µ still mix rapidly, by relying on Theorem 4 and showing sector-stability for the conditioned distribution.

Lemma 12.
Suppose that µ has a sector-stable generating polynomial with respect to the sector {z ∈ C | Re(z) > 0}. Then the partition-constrained distribution for O(1)-many partitions is sector-stable w.r.t. a sector of Ω(1) aperture.

As a corollary of the ability to approximately compute the partition function for µ subject to partition constraints, we show how to approximately compute mixed derivatives of real-stable polynomials g_µ, where the number of distinct derivative directions is O(1). Note that without this restriction to O(1) directions, this problem becomes as hard as computing mixed discriminants.

Corollary 13.
Let g(z_1, · · · , z_n) be a homogeneous real-stable polynomial with nonnegative coefficients. Suppose we are given oracle access to the coefficients of g, and we are also given a term with nonzero coefficient. Then there is an FPRAS that can approximately compute mixed derivatives of g along positive directions, as long as the number of unique directions is O(1). That is, given v_1, · · · , v_s, x ∈ R^n_{≥0} with s = O(1) and a tuple (c_1, · · · , c_s) ∈ Z^s_{≥0}, we can efficiently approximate

∂^{c_1}_{v_1} · · · ∂^{c_s}_{v_s} g |_{z=x}.

Here ∂_v is simply the operator v_1 ∂_{z_1} + · · · + v_n ∂_{z_n}.

In order to prove Theorem 4, we build on a recent line of work leveraging high-dimensional expanders for sampling problems [Ana+19; AL20; ALO20; CLV20b; Fen+20; Che+20; CLV20a]. Specifically, we use the framework dubbed spectral independence by Anari, Liu, and Oveis Gharan [ALO20]. In this framework, one views a target distribution µ as a weighted hypergraph or simplicial complex. Establishing a certain notion of high-dimensional expansion would then imply fast mixing of natural random walks that converge to µ [DK17; KM16; LLP17; KO18; AL20]. Reinterpreting the notion of high-dimensional expansion needed for rapid mixing, Anari, Liu, and Oveis Gharan [ALO20] showed how properties of pairwise correlations in the distribution µ, and certain distributions derived from µ, can imply rapid mixing of natural local random walks, see Definition 1.

The spectral independence framework can be applied to the problem of sampling from a distribution on size-k subsets of a ground set of n elements, given up to a global normalizing factor by a function

µ : ([n] choose k) → R_{≥0}.

In many cases the domain of µ can be adapted to be of the form ([n] choose k) [see, e.g., ALO20]. For concreteness, let us look at the distribution of monomers in a monomer-dimer system on the graph G = (V, E).
Not all monomer sets have the same size, but we can view each set S ⊆ V as a subset of size |V| chosen from V × {0, 1}:

S ↦ {(v, 0) | v ∉ S} ∪ {(v, 1) | v ∈ S}.

This gives us a distribution µ : ((V × {0, 1}) choose |V|) → R_{≥0}. Note that in the case of k-matchings, the monomer set is already of a fixed size, and there is no need for this transformation.

Anari, Liu, and Oveis Gharan [ALO20], based on earlier work of Alev and Lau [AL20], showed that rapid mixing of natural local random walks converging to µ can be established as long as pairwise correlations of µ (and certain distributions derived from µ) are spectrally bounded. More precisely, consider the correlation matrix defined below.

Definition 14 (Correlation Matrix). For a distribution µ over subsets S of a ground set [n], define the correlation matrix Ψ ∈ R^{n×n} as the matrix having entries

Ψ_{i,j} := P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S].

The entries of the matrix Ψ measure pairwise correlations, or in other words, deviations from pairwise independence. The key behind the spectral independence framework is to show that the maximum eigenvalue of Ψ is O(1). Note that Ψ is always similar to a symmetric matrix and therefore has real eigenvalues [ALO20]. More precisely, one needs to show this not just for the distribution µ, but also for conditioned versions of it. We remark that in earlier work, a variant of the correlation matrix has appeared where the entries are instead given by P[j ∈ S | i ∈ S] − P[j ∈ S | i ∉ S], but these two variants are intimately connected, and for homogeneous distributions one can go from eigenvalue bounds of one to the other.

Definition 15 (Conditioned Distribution). For a distribution µ defined over subsets of a ground set [n] and T ⊆ [n], define µ_T to be the distribution of S ∼ µ conditioned on the event T ⊆ S.

One has to show that the correlation matrix has bounded eigenvalues for every T where µ_T is well-defined. The main challenge in all applications of this framework is bounding these eigenvalues.
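A brute-force computation of the correlation matrix Ψ of Definition 14 for a toy density (a sketch; here µ is the uniform distribution on 2-subsets of a 3-element ground set, for which every entry of Ψ can also be worked out by hand):

```python
import itertools
import numpy as np

def correlation_matrix(mu, n):
    """Psi[i, j] = P[j in S | i in S] - P[j in S] for S ~ mu."""
    Z = sum(mu.values())
    p = [sum(w for S, w in mu.items() if j in S) / Z for j in range(n)]
    Psi = np.zeros((n, n))
    for i in range(n):
        Zi = sum(w for S, w in mu.items() if i in S)
        for j in range(n):
            pij = sum(w for S, w in mu.items() if i in S and j in S) / Zi
            Psi[i, j] = pij - p[j]
    return Psi

n = 3
mu = {frozenset(S): 1.0 for S in itertools.combinations(range(n), 2)}
Psi = correlation_matrix(mu, n)

# The uniform 2-subset distribution is negatively correlated: diagonal
# entries are 1/3, off-diagonal entries are -1/6, eigenvalues are real,
# and every row has small l1 norm.
eigs = np.linalg.eigvals(Psi)
assert np.allclose(eigs.imag, 0)
assert max(abs(Psi[i]).sum() for i in range(n)) <= 1.0 + 1e-9
```

This is exactly the kind of row-sum (ℓ1) bound on Ψ that the spectral independence framework requires, here verified numerically on a case where it is obvious.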
Roughly speaking, prior work has managed to use three categories of techniques to establish eigenvalue bounds, discussed below.

Trickle-Down. Oppenheim [Opp18] showed that an eigenvalue bound on Ψ for µ_{{1}}, µ_{{2}}, . . . , µ_{{n}} also implies an eigenvalue bound on Ψ for the distribution µ, under some mild additional conditions. This enables an inductive approach to bounding the eigenvalues of Ψ, starting from µ_T for large sets T (i.e., of size close to k) and trickling down to smaller sets. However, with each application of this result the eigenvalue bound deteriorates, and the induction cannot be completed. A notable exception to this deterioration of the bounds are distributions related to matroids [Ana+19], but as was observed by Alev and Lau [AL20], for almost any distribution beyond matroids, one has to employ additional tricks to make this induction useful for sampling.

Negative Correlation.
Some distributions have negative entries in Ψ everywhere except on the diagonal; this property is known as negative correlation [see, e.g., BBL09]. Most notably, the uniform distribution on spanning trees, balanced matroids, and determinantal point processes all have negative correlation [FM92; BBL09]. When negative correlations exist, the ℓ1 norm of the rows of Ψ, and consequently its maximum eigenvalue, can be bounded by O(1) [ALO20]. For non-homogeneous distributions that satisfy negative correlation, related statements hold, as was shown recently by Eldan and Shamir [ES20].

We remark that in some works using the spectral independence framework, the matrix Ψ is defined slightly differently, with entries of the form P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S | i ∉ S], but these matrices are directly related, and we believe it is more natural to consider the definition presented here.

Figure 5: The two vertices are either both monomers or neither is. Therefore they are positively correlated.
Figure 6: Only two matchings, one with odd edges and one with even edges, appear in the monomer-dimer system. The endpoints have long-range correlation.
Figure 7: Informally, the number of vertices strongly correlated with any given vertex is bounded.
Correlation Decay.
When µ is a distribution defined on an underlying graph, e.g., spin systems, which are distributions on random assignments σ : V → [q] of q spins to the vertices of a graph, one can define a class of properties under the umbrella term "correlation decay". Informally, these properties imply that for distant vertices u, v, the values of σ(u), σ(v) are almost independent of each other. Naturally, this is very useful for bounding the entries, and consequently the eigenvalues, of the matrix Ψ. While correlation decay properties were already known to yield efficient sampling/counting algorithms, when combined with the spectral independence framework they resulted in algorithms with truly polynomial running times (compared to prior results, which often needed extra assumptions such as boundedness of the degrees in the graph) for several problems like the hardcore model [ALO20], two-spin systems [CLV20b], and random colorings [Che+20; FGT19].

Unfortunately, in the case of the monomer distribution in monomer-dimer systems, none of these methods appear to work. As demonstrated in Figs. 5 and 6, we can have both positive and long-range correlations. Nevertheless, we show that the correlation matrix is still bounded, see Fig. 7.
Theorem 16.
Suppose that µ : ([n] choose k) → R_{≥0} is a density whose generating polynomial is sector-stable w.r.t. a sector of aperture Ω(1). Then the ℓ1 norm of any row in the correlation matrix Ψ is bounded by O(1):

∀i : ∑_j | P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S] | ≤ O(1).

Note that a bound on the ℓ1 norm of the rows is also a bound on the maximum eigenvalue [see, e.g., ALO20]. Combining this with sector-stability of various distributions, e.g., the monomer distribution, results in specific bounds on the correlation matrix.

Corollary 17.
Let µ be the distribution of monomers in uniformly random k-matchings, or more generally monomer-dimer systems with arbitrary weights (possibly restricted to k-matchings). Then the ℓ1 norms of the rows of the correlation matrix Ψ are bounded by a universal constant:

∀i : ∑_j | P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S] | ≤ O(1).

Our main technical contribution is introducing a new technique for establishing spectral independence based on the roots of the partition function in the complex plane. We remark that for the special case of unweighted monomer-dimer systems, a form of correlation decay does exist [Bay+07].

1.5 Techniques and Related Work: Sector-Stability and Fractional Log-Concavity

The study of roots of polynomials associated with distributions has a very long history, most notably in statistical physics, where having roots near the positive real axis is recognized as an indicator of phase transition. This is because roots indicate singularity of log g_µ, and many physical observables are related to log g_µ and its derivatives, which can change rapidly near singularities [see, e.g., YL52]. For monomer-dimer systems, Heilmann and Lieb [HL72] established a crucial property for the roots of the polynomial defined below:

∑_{M matching} weight(M) ∏_{v ∉ V(M)} z_v.

Here, for each matching M, we multiply its weight by the variables z_v for v ranging over the monomers. Heilmann and Lieb [HL72] formally showed that if we plug in z_1, . . . , z_n ∈ C such that Re(z_1), . . . , Re(z_n) >
0, then the above expression will not result in zero. This is the crucial property that Lemma 8 and conse-quently Theorems 6 and 7 rely on. This property is also known as Hurwitz-stability [see, e.g., BB09].Note that the polynomial defined by Heilmann and Lieb [HL72] is not homogeneous, i.e., it does notcorrespond to a distribution on ( [ n ] k ) . Unfortunately, homogenization does not preserve Hurwitz-stability;similarly we do not get Hurwitz-stability if we only include matchings M of a particular size. We establishthe weaker, but more robust, notion of sector-stability for these polynomials. Instead, we show thatmonomer distributions, when homogenized or conditioned on size, cannot have roots in a wide enoughsector in Lemma 8.A special case of sector-stability, when the sector is the entire right-half-plane, is equivalent to Hurwitz-stability. For homogeneous polynomials, Hurwitz-stability is the same as another widely studied propertycalled real stability, or more generally, the so-called half-plane property [BBL09]. Under this special notionof sector-stability, the distribution µ is known to exhibit negative correlations [BBL09], and rapid mixingof local random walks for µ had already been established [FM92; AOR16]. Outside of this special case,negative correlation no longer holds. But we show that correlations are still bounded in Theorem 16.As mentioned before, real-stability, a special case of sector-stability for homogeneous polynomials, is awell-studied property of the generating polynomial g µ that already implied efficient sampling/countingalgorithms for µ [see, e.g., AOR16]. However, recent works have shone light on a generalization of real-stability, that does not involve root locations. Anari, Liu, Oveis Gharan, and Vinzant [Ana+19] establishedthat if log g µ ( z , . . . , z n ) is concave, viewed as a function over R n ≥ , then k ↔ ( k − ) down-up randomwalks for sampling from µ would rapidly mix. 
This class of log-concave polynomials has been instrumental in resolving several long-standing questions about matroids [Ana+18; Ana+19; BH19]. Log-concave polynomials are a proper superset of real-stable polynomials, at least in the homogeneous case; this containment was first shown by Gårding [Går59], and this important result has been instrumental in the development of hyperbolic programming [Gül97]. A natural question that arises is whether there is an analogous generalization of log-concavity that is a superset of sector-stable polynomials.

We define such a natural property, which we call fractional log-concavity. We show that in a "local sense" it is actually equivalent to spectral independence of the distribution µ, and then show that sector-stability implies fractional log-concavity, establishing an extension of the result of Gårding [Går59].

Definition 18 (Fractional Log-Concavity). We call the polynomial g_µ(z_1, . . . , z_n) fractionally log-concave with parameter α ∈ [0, 1], if log g_µ(z_1^α, . . . , z_n^α) is concave, viewed as a function over R^n≥0.

Note that for α = 1, this is the same as log-concavity. We show the following local equivalence between spectral independence and fractional log-concavity.
Proposition 19.
Suppose that µ : ([n] choose k) → R≥0 is a distribution, and define the n × n correlation matrix Ψ as

Ψ_{i,j} := P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S].

Then the maximum eigenvalue of Ψ is bounded by O(1) if and only if the polynomial g_µ is fractionally log-concave around the point z = (1, . . . , 1) for a parameter α > Ω(1).

A priori, this equivalence concerns only the special point (1, . . . , 1). However, sector-stability is preserved under the change of variables z_i ↦ λ_i z_i, where λ_1, . . . , λ_n are positive reals; this is because sectors in the complex plane are preserved under such scalings. This allows us to map any point in R^n≥0 to the special point (1, . . . , 1). Using this we establish an extension of the result of Gårding [Går59].

Theorem 20. Suppose that g_µ is sector-stable for a sector of aperture Ω(1). Then g_µ is fractionally log-concave for a parameter α ≥ Ω(1).

As a corollary of this result we prove bounds similar to those obtained by Anari, Oveis Gharan, and Vinzant [AOV18], relating the entropy of fractionally log-concave, and consequently sector-stable, distributions with the sum of their marginal entropies; see Section 6.

While fractional log-concavity around the point (1, . . . , 1) is equivalent to a bound on the eigenvalues of the correlation matrix Ψ, it does not imply a bound for the conditioned distributions µ_T. However, fractional log-concavity at all points in R^n≥0 does. This is because the polynomial for conditional distributions µ_T can be obtained as the following limit (w.l.o.g. taking T = {1, . . . , |T|}):

g_{µ_T} ∝ lim_{λ→∞} g_µ(λz_1, . . . , λz_{|T|}, z_{|T|+1}, . . . , z_n) / λ^{|T|}.

Scaling the variables or the polynomial, and taking limits, all preserve fractional log-concavity.

Corollary 21. If µ : ([n] choose k) → R≥0 has a fractionally log-concave generating polynomial with parameter α = Ω(1), or a sector-stable polynomial with a sector of aperture Ω(1), then for all conditioned distributions µ_T, the correlation matrix has maximum eigenvalue O(1).

This work establishes a number of examples of fractionally log-concave polynomials, but all of our examples are also sector-stable. We leave the question of finding examples of fractionally log-concave polynomials beyond sector-stability to future work. However, we make the following concrete conjecture, in line with a conjecture of Mihail and Vazirani [MV89] on the expansion of 0/1 polytopes.
Conjecture 22.
Suppose that µ is the uniform distribution on a subset of the hypercube F ⊆ {0, 1}^n, such that the convex hull conv(F) has edges of bounded length O(1). Then we conjecture that the polynomial ∑_{S∈F} µ(S) ∏_{i∈S} z_i is fractionally log-concave for a parameter α > Ω(1).

Matroids are a special case of this conjecture, and their log-concavity has already been established [AOV18]. However, this conjecture is considerably more general, encompassing combinatorial objects such as delta-matroids, Coxeter matroids, and more [BGW03].
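Fractional log-concavity (Definition 18) can be spot-checked numerically on small examples. The following sketch is our own illustration, not part of the paper: it takes the bases-generating polynomial of the uniform matroid U(2, 4), which is log-concave (α = 1) by [AOV18], and verifies via finite differences that the Hessian of log g at the all-ones point is negative semidefinite. Function names and tolerances here are our own choices.

```python
import numpy as np
from itertools import combinations

def g(z):
    # Bases-generating polynomial of the uniform matroid U(2, 4):
    # the elementary symmetric polynomial e_2(z_1, ..., z_4).
    return sum(z[i] * z[j] for i, j in combinations(range(4), 2))

def hessian_of_log(f, x, h=1e-4):
    """Central finite-difference Hessian of log f at the point x."""
    n = len(x)
    lf = lambda y: np.log(f(y))
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i], np.eye(n)[j]
            H[i, j] = (lf(x + h * e_i + h * e_j) - lf(x + h * e_i - h * e_j)
                       - lf(x - h * e_i + h * e_j) + lf(x - h * e_i - h * e_j)) / (4 * h * h)
    return H

H = hessian_of_log(g, np.ones(4))
eigs = np.linalg.eigvalsh(H)  # all <= 0 means log g is concave at this point
```

For α < 1 one would apply the same check to z ↦ g(z_1^α, . . . , z_n^α). Note that Definition 18 requires concavity over all of R^n≥0, so a single-point check like this can refute, but never certify, fractional log-concavity.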
All of our sampling algorithms are obtained as instantiations of the k ↔ ℓ down-up random walk, for some ℓ = k − O(1), applied to an appropriate formulation of the target distribution µ; see Definition 1. Unlike prior applications of spectral independence, we have to consider the k ↔ ℓ random walk with k − ℓ > 1. For example, consider the distribution of monomers in a monomer-dimer system. As we have established, we view this distribution over size-|V| subsets of the ground set V × {0, 1}, where V is the set of vertices. The k ↔ (k − 1) random walk then becomes the following procedure, known as the (single-site) Glauber dynamics:

Start with monomer set S_0
for t = 0, 1, . . . do
    Select vertex v ∈ V uniformly at random
    Select S_{t+1} between S_t − {v} and S_t ∪ {v} randomly with probability ∝ µ(resulting set)

[Figure 8: The k ↔ (k − 2) random walk on monomers avoids the parity issue. In each round two vertices can change their membership in the monomer set. This is an instance of the multi-site Glauber dynamics.]

It is not hard to see that the cardinality of all monomer sets in a graph has a constant parity. This means that no transition is possible from a monomer set S to another set S′ that differs from it in exactly one vertex. Therefore the k ↔ (k − 1) walk produces the constant sequence S_0, S_1 = S_0, . . . and obviously does not mix. Note, however, that considering a higher value of k − ℓ gets around this parity issue; see Fig. 8. We show that fractional log-concavity, and consequently sector-stability, imply rapid mixing of the k ↔ ℓ random walk for some ℓ = k − O(1). The following is the result of slight modifications of arguments by Alev and Lau [AL20].

Theorem 23.
Suppose that µ : ([n] choose k) → R≥0 has a fractionally log-concave generating polynomial with parameter α = Ω(1). Then for some ℓ = k − O(1), the k ↔ ℓ random walk started at the set S_0 gets ǫ-close in total variation distance to the distribution µ in time

t_mix(ǫ) = O(k^{O(1)} · log(1/(ǫ · P_µ[S_0]))).

One has to be careful that log(1/P_µ[S_0]) is not too large in applications. This is achieved by making sure that S_0 has at least a 2^{−poly(n)} probability under µ. In all distributions we study in this paper, this can be achieved easily. For example, in the case of monomer-dimer distributions, by running a maximum-weight matching algorithm, we can find a matching M having the maximum possible weight under the monomer-dimer distribution. Because the number of matchings is at most 2^{poly(n)}, we can safely use the monomer set of this matching as the starting point S_0.

Acknowledgments. We thank Michał Dereziński and Paul Liu for illuminating discussions about existing results on determinantal point processes. We also thank Alexander Barvinok for pointing us to existing results related to sector-stability.
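To make the k ↔ ℓ down-up walk (Definition 1, Theorem 23) concrete, here is a brute-force sketch of our own: one step drops k − ℓ uniformly random elements and then resamples a k-superset with probability proportional to µ. Everything is enumerated explicitly (exponential in n), so this is for illustration only; the helper names are ours. We verify that µ is stationary for the resulting transition matrix.

```python
import numpy as np
from itertools import combinations
from math import comb

def down_up_step(n, k, l, mu):
    """Exact transition matrix of the k <-> l down-up walk for a density mu
    indexed by the k-subsets of [n] in lexicographic order."""
    K = list(combinations(range(n), k))
    L = list(combinations(range(n), l))
    D = np.zeros((len(K), len(L)))  # drop k - l uniformly random elements
    U = np.zeros((len(L), len(K)))  # resample a k-superset proportional to mu
    for a, S in enumerate(K):
        for b, T in enumerate(L):
            if set(T) <= set(S):
                D[a, b] = 1 / comb(k, l)
                U[b, a] = mu[a]
    U /= U.sum(axis=1, keepdims=True)  # normalize: U(T, S) = mu(S) / sum over S' >= T
    return D @ U

# Toy example: the uniform distribution over 2-subsets of [4], with l = 1.
mu = np.ones(comb(4, 2)) / comb(4, 2)
P = down_up_step(4, 2, 1, mu)
```

Since the down and up operators are in detailed balance w.r.t. µ (Proposition 43 below), the product is a Markov chain with µ as its stationary distribution, which the stationarity check µP = µ confirms on this toy instance.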
We use Z≥0 to denote the set of nonnegative integers {0, 1, . . .}. For a subset S of R^n, we use conv(S) to denote the convex hull of S. We use [n] to denote {1, . . . , n}. For a set U we let (U choose k) denote the family of k-element subsets of U. When n is clear from context, we use 1_S ∈ R^n to denote the indicator vector of the set S ⊆ [n], having a coordinate of 0 everywhere except for elements of S, where the coordinate is 1.

For two measures µ, ν defined on the same state space Ω, we define their total variation distance as

d_TV(µ, ν) = (1/2) ∑_{ω∈Ω} |µ(ω) − ν(ω)| = max{P_µ[S] − P_ν[S] | S ⊆ Ω}.

The total variation distance is a special case of a more general class of "distance measures" called f-divergences.

Definition 24 (f-Divergence). For a convex function f : R≥0 → R, define the f-divergence between two distributions µ and ν on the same state space as follows:

D_f(ν ‖ µ) = E_{ω∼µ}[f(ν(ω)/µ(ω))] − f(E_{ω∼µ}[ν(ω)/µ(ω)]).

Note that by Jensen's inequality this quantity is always nonnegative. Also notice that if µ and ν are normalized distributions, the second term is just f(1). In this work we will mostly deal with the case of f(x) = x², where D_f(·‖·) is also known as the variance. However, we state some results in full generality in terms of arbitrary f-divergences, in the hope that they will find use in future work.

A Markov chain on a state space Ω is defined by a row-stochastic matrix P ∈ R^{Ω×Ω}. We view distributions µ on Ω as row vectors, and as such µP is the distribution after one transition according to P, if we started from a sample of µ. A stationary distribution µ for the Markov chain P is one that satisfies µP = µ. Under mild assumptions on P (ergodicity), stationary distributions are unique and the distribution νP^t converges to this stationary distribution as t → ∞ [LP17].
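Definition 24 is straightforward to compute directly. The sketch below (our own toy example, with made-up distributions) evaluates the f-divergence for f(x) = x², i.e., the variance of the density ratio under µ, and also applies a row-stochastic matrix to both arguments, which by the data processing inequality (Proposition 26 below) can only shrink the divergence.

```python
import numpy as np

def f_divergence(nu, mu, f):
    """D_f(nu || mu) = E_mu[f(nu/mu)] - f(E_mu[nu/mu])  (Definition 24).
    Assumes mu > 0 wherever nu > 0."""
    r = nu / mu
    return float(mu @ f(r) - f(mu @ r))

mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.2, 0.3, 0.5])
chi2 = f_divergence(nu, mu, lambda x: x ** 2)  # variance of nu/mu under mu

# Any row-stochastic matrix contracts the divergence (data processing).
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
contracted = f_divergence(nu @ P, mu @ P, lambda x: x ** 2)
```

For normalized distributions E_µ[ν/µ] = 1, so with f(x) = x² the divergence reduces to ∑_ω ν(ω)²/µ(ω) − 1, the χ²-divergence.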
We refer the reader to [LP17] for a detailed treatment of Markov chain analysis.

A popular method for the analysis of Markov chains is via functional inequalities, often inequalities relating f-divergences before and after one transition of the Markov chain. We are specifically interested in contraction of the f-divergence. We state this contraction for (potentially non-square) row-stochastic operators for generality.

Definition 25.
We say that a row-stochastic matrix P ∈ R^{Ω×Ω′} contracts f-divergence w.r.t. a background distribution µ : Ω → R≥0 by a factor of α if for all other distributions ν : Ω → R≥0, we have

D_f(νP ‖ µP) ≤ α · D_f(ν ‖ µ).

We remark that all row-stochastic operators P have contraction with factor 1, and this property is only useful for α < 1.

Proposition 26 (Data Processing Inequality). For all row-stochastic matrices P ∈ R^{Ω×Ω′} and all distributions µ, ν : Ω → R≥0, we have

D_f(νP ‖ µP) ≤ D_f(ν ‖ µ).

For a Markov chain P, we define the mixing time from a starting distribution ν as the first time t such that νP^t gets close to the stationary distribution µ:

t_mix(P, ν, ǫ) = min{t | d_TV(νP^t, µ) ≤ ǫ}.

We drop P and ν if they are clear from context. If ν is the Dirac measure on a single point ω, we write t_mix(P, ω, ǫ) for the mixing time. When the mixing time is referenced without mentioning ǫ, we imagine that ǫ is set to a reasonably small constant (such as 1/4). This is justified by the fact that the growth of the mixing time in terms of 1/ǫ can be at most logarithmic [LP17].

Contraction inequalities, combined with companion inequalities relating d_TV and f-divergences, allow one to bound the mixing time of a Markov chain. In particular, for f(x) = x², one has the relationship

d_TV(ν, µ) ≤ O(√(D_{x²}(ν ‖ µ))),

and as a result we get

Proposition 27 ([see, e.g., LP17]). Suppose that a Markov chain P with stationary distribution µ has α-factor contraction in D_{x²}(·‖·). Then the mixing time of P started from a point ω satisfies

t_mix(P, ω, ǫ) ≤ O(log(1/(ǫ P_µ[ω])) / log(1/α)) ≤ O((1/(1 − α)) · log(1/(ǫ · P_µ[ω]))).

We use the following classic result from elementary complex analysis [see, e.g., Lan13].
Lemma 28 (Schwarz's lemma). Let D = {z ∈ C | |z| < 1} be the open unit disk in the complex plane C centered at the origin, and let f : D → C be a holomorphic map such that f(0) = 0 and |f(z)| ≤ 1 on D. Then |f′(0)| ≤ 1.

Theorem 29 (Courant–Fischer Theorem). Let A ∈ R^{n×n} be a Hermitian matrix with eigenvalues λ_1 ≥ λ_2 ≥ · · · ≥ λ_n. Then

λ_k(A) = min_U max_v ⟨v, Av⟩,

where the minimum is taken over all (n − k + 1)-dimensional subspaces U ⊆ R^n and the maximum is taken over all vectors v ∈ U with ⟨v, v⟩ = 1.

Theorem 30.
Let A ∈ R^{n×m}, B ∈ R^{m×n} where m ≥ n. Then the spectrum of BA (as a multiset) is precisely the union of the spectrum of AB (as a multiset) with m − n copies of 0.

We use F[z_1, . . . , z_n] to denote n-variate polynomials with coefficients from F, where we usually take F to be R or C. We denote the degree of a polynomial g by deg(g). We call a polynomial homogeneous of degree k if all nonzero terms in it are of degree k. We define a λ-scaling, or an external field of λ ∈ F^n applied to a polynomial g, to be the polynomial g(λ_1 z_1, . . . , λ_n z_n). If g was the generating polynomial of a distribution µ, we denote the same scaling applied to µ by λ ⋆ µ.

The main workhorse behind our main results are polynomials that avoid roots in certain regions of the complex plane.

Definition 31 (Stability). For an open subset U ⊆ C^n, we call a polynomial g ∈ C[z_1, . . . , z_n] U-stable if

(z_1, . . . , z_n) ∈ U ⟹ g(z_1, . . . , z_n) ≠ 0.

By convention, we also call the identically-zero polynomial U-stable. This ensures that limits of U-stable polynomials are U-stable. For convenience, when n is clear from context, we abbreviate stability w.r.t. regions of the form U × U × · · · × U, where U ⊆ C, simply as U-stability.

Our choice of the region U in this work is a product of open sectors in the complex plane.

Definition 32 (Sectors). We name the open sector of aperture απ centered around the positive real axis Γ_α:

Γ_α := {exp(x + iy) | x ∈ R, y ∈ (−απ/2, απ/2)}.

With these definitions, Definition 3 is the same as Γ_α-stability for a suitable parameter α. Note that Γ_1 is the right-half-plane, and Γ_1-stability is the same as the classically studied Hurwitz-stability [see, e.g., Brä07]. Another closely related notion is that of real-stability, where the region U is the upper-half-plane {z | Im(z) > 0} [see, e.g., BBL09]. Note that for homogeneous polynomials, stability w.r.t. U is the same as stability w.r.t. any rotation/scaling of U; so Hurwitz-stability and real-stability are the same for homogeneous polynomials.

Consider an open half-plane H_θ = {e^{−iθ} z | Im(z) > 0} ⊆ C. A polynomial g(z_1, · · · , z_n) ∈ C[z_1, · · · , z_n] is H_θ-stable if g does not have roots in H_θ^n. We call H_0 and H_{π/2} the upper half-plane and right half-plane respectively. We say g is Hurwitz-stable if it is H_{π/2}-stable. We say g is real-stable if it is H_0-stable and has real coefficients. We observe that for homogeneous polynomials, the definition of H_θ-stability is equivalent for all angles θ.

Lemma 33 (Lemma 2.3, [BB09]). Suppose that f_j ∈ C[z_1, · · · , z_n] for all j ∈ N is U-stable for an open set U ⊆ C^n and that f is the limit, uniformly on compact sets, of the sequence {f_j}_{j∈N}. Then f is either U-stable or identically equal to 0.

In particular, if f_j has bounded degree for all j ∈ N, and the sequence {f_j}_{j∈N} converges to f coefficient-wise, then f_j converge to f uniformly on compact sets.

Proposition 34 (Polarization, [BBL09]). For an element κ of N^n let

R_κ[z_1, · · · , z_n] = {polynomials in R[z_i]_{1≤i≤n} of degree at most κ_i in z_i for all i},
R^a_κ[z_{ij}] = {multi-affine polynomials in R[z_{ij}]_{1≤i≤n, 1≤j≤κ_i}}.

The polarization map Π↑_κ is a linear map that sends the monomial z^α = ∏_{i=1}^n z_i^{α_i} to the product

(1/(κ choose α)) ∏_{i=1}^n (elementary symmetric polynomial of degree α_i in the variables {z_{ij}}_{1≤j≤κ_i}),

where (κ choose α) = ∏_{i=1}^n (κ_i choose α_i). A polynomial g ∈ R_κ[z_i]_{1≤i≤n} with nonnegative coefficients is real-stable if and only if its polarization Π↑_κ(g) is also real-stable.

Taking the polarization of z^k with κ = n, we obtain the following well-known result.

Corollary 35.
For k ≤ n, the k-th elementary symmetric polynomial in n variables e_k(z_1, · · · , z_n) is real-stable/Hurwitz-stable.

The following theorems will be useful in the proof of Theorem 10.
Theorem 36.
Let g(z_1, · · · , z_n) ∈ R[z_1, · · · , z_n] be Hurwitz-stable. Let g_e (resp. g_o) be the even (resp. odd) part of g, i.e., the sum of terms c_α z^α whose total degree |α| is even (resp. odd). Then g_e and g_o are either identically 0 or Hurwitz-stable.

Proof. We have g = g_e + g_o. Replace z_j with iy_j where y_j ∈ H_0. Let h({y_j}_{j=1}^n) := g({iy_j}_{j=1}^n), h_e({y_j}_{j=1}^n) := g_e({iy_j}_{j=1}^n) and h_o({y_j}_{j=1}^n) := i^{−1} g_o({iy_j}_{j=1}^n); then h_e, h_o are polynomials with real coefficients, and h is upper-half-plane stable.

We have h = h_e + i h_o, and this is the unique way to write h as h_1 + i h_2 where the h_j are polynomials with real coefficients, for j ∈ {1, 2}. By [BB09, Corollary 2.4], h_e and h_o are real-stable or identically 0. Thus g_e, g_o are Hurwitz-stable or identically 0.

Theorem 37 ([BBL09], Proposition 3.2). Let A_1, · · · , A_n be (complex) positive semi-definite matrices and let B be a (complex) Hermitian matrix, all matrices being of the same size m × m.
1. The polynomial f(z_1, · · · , z_n) = det(z_1 A_1 + · · · + z_n A_n + B) is either identically zero or real-stable;
2. If B is also positive semi-definite then f has all non-negative coefficients.

Lemma 38.
Consider A ∈ R^{n×n} such that A + A^T is positive semi-definite. Let f(z_1, · · · , z_n) = ∑_{S⊆[n]} z^{[n]\S} det(A_{S,S}). Then f has non-negative coefficients, and is either identically 0 or Hurwitz-stable.

Proof. Clearly, A + A^T is positive semi-definite, so A is a P_0-matrix (see [Gar+19, Lemma 1]), i.e., all principal minors of A are nonnegative. The coefficients of f are principal minors of A, and are thus nonnegative.

Let D = (A + A^T)/2 and X = (A − A^T)/2. Note that X is skew-symmetric, thus B := iX is a Hermitian matrix, and D is positive semi-definite. Applying Theorem 37 with A_j = diag(e_j) for j ∈ [n], where e_j is the j-th standard basis vector, A_{n+1} = D and B = iX, we get that g(z_1, · · · , z_n, z_{n+1}) := det(∑_{i=1}^n z_i A_i + z_{n+1} D + iX) is either identically 0 or real-stable.

Let w_j := i^{−1} z_j, Z = ∑_{i=1}^n z_i A_i = diag(z_1, · · · , z_n) and W = diag(w_1, · · · , w_n). We can rewrite

g(z_1, · · · , z_n, i) = det(Z + iD + iX) = det(iW + iA) = i^n det(W + A) = i^n ∑_{S⊆[n]} w^{[n]\S} det(A_{S,S}) = i^n f(w_1, · · · , w_n).

If g ≡ 0 then f ≡ 0. Suppose g ≢ 0. Fix arbitrary w_1, · · · , w_n in the right half-plane H_{π/2}. Observe that z_j = i w_j is in the upper half-plane H_0. Real-stability of g implies f(w_1, · · · , w_n) = i^{−n} g(z_1, · · · , z_n, i) ≠ 0. Thus f is Hurwitz-stable.

We also need the following for the proof of Theorem 7.

Theorem 39 ([HL72]). Consider a graph G = G(V, E) on n vertices with edge weights w : E → R≥0 and vertex weights λ : V → R≥0. For S ⊆ V, let m_S := ∑_M weight(M) = ∑_M (∏_{e∈M} w(e) ∏_{v∈V\S} λ(v)), where the sum is taken over all perfect matchings M of S. The following polynomial is Hurwitz-stable:

f(z_1, · · · , z_n) = ∑_{S⊆V} z^{[n]\S} m_S.

A matroid M = (E, I) is a structure consisting of a finite ground set E and a non-empty collection I of independent subsets of E satisfying:
1. If S ⊆ T and T ∈ I, then S ∈ I.
2. If S, T ∈ I and |T| > |S|, then there exists an element i ∈ T \ S such that S ∪ {i} ∈ I.

The rank of a matroid is the size of the largest independent set of that matroid. If M has rank r, any set S ∈ I of size r is called a basis of M. Let B_M ⊆ I denote the set of bases of M. The set of bases B_M of a matroid uniquely defines M.

We say a matroid M is strongly Rayleigh, or satisfies the weak half-plane property, if f(z_1, · · · , z_n) = ∑_{S∈B_M} z^S is real-stable.

For a partition T_1, · · · , T_s of [n] and a tuple (c_1, · · · , c_s) ∈ N^s, the partition matroid M associated with (T_1, · · · , T_s) and (c_1, · · · , c_s) is defined by B_M = {S ⊆ [n] | |S ∩ T_i| = c_i ∀i}.

Here we establish sufficient conditions for rapid mixing of the k ↔ ℓ down-up random walks as defined in Definition 1.

Remark 40.
Our arguments in this section are small tweaks of the local-to-global contraction analyses already found in prior work of Alev and Lau [AL20] and Cryan, Guo, and Mousa [CGM19]; the origin of these types of arguments goes back to the study of high-dimensional expanders [KM16; DK17; KO18], and more sophisticated variants useful in the context of Markov chain analysis can be found in recent works of Chen, Liu, and Vigoda [CLV20b; CLV20a] and Guo and Mousa [GM20]. For the mixing time bounds in this work, the analysis of Alev and Lau [AL20] and the framework built on it by Anari, Liu, and Oveis Gharan [ALO20], dubbed "spectral independence," suffices; however, we choose to state a general local-to-global contraction analysis not found explicitly in prior work, in the hope that it will find use in future applications.

For a distribution µ : ([n] choose k) → R≥0, our goal is to analyze the mixing time of the k ↔ ℓ down-up random walk. We will do this by establishing contraction of f-divergence in these random walks. Similar to prior results on local-to-global analysis of high-dimensional expanders, our goal is to show that "local" contraction of f-divergence (where the down-up walks are applied to a "localization" of µ) implies "global" contraction of f-divergence.

The down-up walks can be written as the composition of two row-stochastic operators known aptly as the down and up operators.

Definition 41 (Down Operator). For a ground set [n] and cardinalities k ≥ ℓ, define the row-stochastic down operator D_{k→ℓ} ∈ R^{([n] choose k) × ([n] choose ℓ)} as

D_{k→ℓ}(S, T) = 1/(k choose ℓ) if T ⊆ S, and 0 otherwise.

This operator, applied to a random set S, produces a uniformly random subset T of size ℓ. The down operators compose in the way one expects, i.e., D_{k→ℓ} D_{ℓ→m} = D_{k→m}. Note that the down operator has no dependence on µ. In contrast, the up operator defined below depends on µ and is actually designed to be the time-reversal of the down operator w.r.t. the background measure µ.
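The composition identity D_{k→ℓ} D_{ℓ→m} = D_{k→m} stated above can be checked mechanically on a small ground set. The sketch below is our own sanity check (the enumeration is exponential and for illustration only):

```python
import numpy as np
from itertools import combinations
from math import comb

def down_operator(n, k, l):
    """Row-stochastic matrix D_{k->l}: from a k-subset of [n], keep a
    uniformly random l-subset. Rows/columns in lexicographic order."""
    rows = list(combinations(range(n), k))
    cols = list(combinations(range(n), l))
    D = np.zeros((len(rows), len(cols)))
    for a, S in enumerate(rows):
        for b, T in enumerate(cols):
            if set(T) <= set(S):
                D[a, b] = 1 / comb(k, l)
    return D

# D_{4->2} composed with D_{2->1} equals D_{4->1} on the ground set [6].
lhs = down_operator(6, 4, 2) @ down_operator(6, 2, 1)
rhs = down_operator(6, 4, 1)
```

The identity holds because dropping two elements and then one more, each uniformly at random, selects the same uniformly random sub-subset as dropping three elements at once.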
Definition 42 (Up Operator). For a ground set [n], cardinalities k ≥ ℓ, and density µ : ([n] choose k) → R≥0, define the up operator U_{ℓ→k} ∈ R^{([n] choose ℓ) × ([n] choose k)} as

U_{ℓ→k}(T, S) = µ(S)/∑_{S′⊇T} µ(S′) if T ⊆ S, and 0 otherwise.

If we name µ_k = µ and more generally let µ_ℓ be µ_k D_{k→ℓ}, then the down and up operators satisfy detailed balance (time-reversibility) w.r.t. the measures µ_k, µ_ℓ. In other words we have

µ_k(S) D_{k→ℓ}(S, T) = µ_ℓ(T) U_{ℓ→k}(T, S).

This property ensures that compositions of the down and up operators have the appropriate µ as a stationary distribution, are time-reversible, and have nonnegative real eigenvalues.

Proposition 43 ([see, e.g., KO18; AL20; ALO20]). The operators D_{k→ℓ} U_{ℓ→k} and U_{ℓ→k} D_{k→ℓ} both define Markov chains that are time-reversible and have nonnegative eigenvalues. Moreover, µ_k and µ_ℓ are respectively their stationary distributions.

Our goal is to show that the down-up walk contracts f-divergence by a multiplicative factor. To this end, it is enough to show contraction of f-divergence under D_{k→ℓ}; this is because, by the data processing inequality (Proposition 26), the operator U_{ℓ→k} cannot increase the f-divergence.

The key ingredient in local-to-global arguments is the "local contraction" assumption. Here, one assumes that D_{2→1} contracts f-divergences w.r.t. the background measure µ_2. The goal is to go from this assumption, and similar ones for conditionings of µ (see Definition 15), to contraction of f-divergence for D_{k→ℓ}. This is the natural "f-divergence" generalization of the notion of local spectral expansion and its implications for global expansion [see KO18].

First we define the notion of the link of the distribution µ w.r.t. a set T [see, e.g., KO18]. This notion is almost the same as the notion of conditioned distributions µ_T (see Definition 15), except we remove the set T as well.

Definition 44.
For a distribution µ : ([n] choose k) → R≥0 and a set T ⊆ [n] of size at most k, we define the link of T to be the distribution µ_{−T} : (([n] − T) choose (k − |T|)) → R≥0 which describes the law of the set S − T, where S is sampled from µ conditioned on the event S ⊇ T.

Next we define the notion of local f-divergence contraction for a distribution µ.

Definition 45 (Local f-Divergence Contraction). For a distribution µ : ([n] choose k) → R≥0 and a set T of size at most k − 2, define the local contraction at T to be the smallest number α(T) ≥ 0 such that D_{2→1} contracts f-divergences w.r.t. µ_{−T} D_{(k−|T|)→2} by a factor of α(T). That is, α(T) is the smallest number such that for all ν : (([n] − T) choose 2) → R≥0 we have

D_f(ν D_{2→1} ‖ µ_{−T} D_{(k−|T|)→1}) ≤ α(T) · D_f(ν ‖ µ_{−T} D_{(k−|T|)→2}).

We now show that local contraction of f-divergence results in a bound on the contraction of D_{k→ℓ} operators.

Theorem 46.
Suppose that µ : ([n] choose k) → R≥0 has local f-divergence contraction with contraction factors α(T). Define β(T) = α(T)/(1 − α(T)). For a set T ⊆ [n] of size m define

γ_T := E_{e_1,...,e_m ∼ uniformly random permutation of T}[β(∅) β({e_1}) · · · β({e_1, . . . , e_m})].

Then the operator D_{k→ℓ} has contraction factor at least

1 − 1/max{k · γ_T | T ∈ ([n] choose ℓ−1)}.

Proof. Consider an arbitrary distribution ν : ([n] choose k) → R≥0. The f-divergence D_f(ν ‖ µ) is a difference of two terms, both involving expectations over samples S ∼ µ:

D_f(ν ‖ µ) = E_{S∼µ}[f(ν(S)/µ(S))] − f(E_{S∼µ}[ν(S)/µ(S)]).

Our strategy is to write this difference as a telescoping sum of differences, where elements of S are revealed one by one.

Consider the following process. We sample a set S ∼ µ and uniformly at random permute its elements to obtain X_1, . . . , X_k. Define the random variable

τ_i = f(E[ν(S)/µ(S) | X_1, . . . , X_i]) = f(∑_{S′∋X_1,...,X_i} ν(S′) / ∑_{S′∋X_1,...,X_i} µ(S′)) = f(ν D_{k→i}({X_1, . . . , X_i}) / µ D_{k→i}({X_1, . . . , X_i})).

Note that τ_i is a "function" of X_1, . . . , X_i. It is not hard to see that

D_f(ν ‖ µ) = E[τ_k] − E[τ_0] = ∑_{i=0}^{k−1} (E[τ_{i+1}] − E[τ_i]).

A convenient fact about this telescoping sum is that to obtain D_f(ν D_{k→ℓ} ‖ µ D_{k→ℓ}), one has to just sum over the first ℓ terms instead of k:

D_f(ν D_{k→ℓ} ‖ µ D_{k→ℓ}) = E[τ_ℓ] − E[τ_0] = ∑_{i=0}^{ℓ−1} (E[τ_{i+1}] − E[τ_i]).

This is because the set {X_1, . . . , X_ℓ} is distributed according to µ D_{k→ℓ}. So our goal of showing that D_{k→ℓ} has contraction boils down to showing that the last k − ℓ terms in the telescoping sum are sufficiently large compared to the rest.

Consider applying the assumption of local contraction to the link of the set T = {X_1, . . . , X_i}. From this one can extract that

E[τ_{i+1} | X_1, . . . , X_i] − E[τ_i | X_1, . . . , X_i] ≤ α(T) · (E[τ_{i+2} | X_1, . . . , X_i] − E[τ_i | X_1, . . . , X_i]).

Defining ∆_i = τ_{i+1} − τ_i, the above can be rewritten as

E[∆_i | X_1, . . . , X_i] ≤ α({X_1, . . . , X_i}) · E[∆_i + ∆_{i+1} | X_1, . . . , X_i].

Rearranging yields

E[∆_i | X_1, . . . , X_i] ≤ (α({X_1, . . . , X_i})/(1 − α({X_1, . . . , X_i}))) · E[∆_{i+1} | X_1, . . . , X_i] ≤ β({X_1, . . . , X_i}) · E[∆_{i+1} | X_1, . . . , X_i].

From this we obtain that the quantities

∆_i · β(∅) · β({X_1}) · · · β({X_1, . . . , X_{i−1}})

form a submartingale; this means that we have

E[∆_ℓ · β(∅) · · · β({X_1, . . . , X_{ℓ−1}})] ≥ E[∆_0].

Now, consider an alternative process for generating the ordering X_1, X_2, . . . , X_k. First select S ∼ µ, and partition it into two sets: T of size ℓ − 1, and S − T of size k − ℓ + 1. We then randomly shuffle T and let X_1, . . . , X_{ℓ−1} be the result, and then randomly shuffle S − T and let X_ℓ, . . . , X_k be the result. This process is equivalent to randomly shuffling all elements of S.

The key insight is that ∆_ℓ is only a function of the unordered set T and the ordering of S − T. However, the other factor β(∅) · · · β({X_1, . . . , X_{ℓ−1}}) is only a function of the ordering chosen for T and not of S − T. This means that conditioned on T, these two quantities are independent and we get

E[∆_ℓ · β(∅) · · · β({X_1, . . . , X_{ℓ−1}})] = E_T[E[∆_ℓ | T] · E[β(∅) · · · β({X_1, . . . , X_{ℓ−1}}) | T]].

From the definition of γ_T, we obtain

E[∆_ℓ · β(∅) · · · β({X_1, . . . , X_{ℓ−1}})] ≤ E[∆_ℓ] · max{γ_T | T ∈ ([n] choose ℓ−1)}.

Combining with previous inequalities we obtain

E[∆_ℓ] ≥ E[∆_0] / max{γ_T | T ∈ ([n] choose ℓ−1)}.

Similar inequalities can be obtained with ∆_0 replaced by ∆_1, ∆_2, . . . in the above arguments (with potentially better factors than γ_T, but we ignore this potential improvement). Averaging over these k inequalities we obtain

E[∆_ℓ] ≥ E[∆_0 + · · · + ∆_{k−1}] / max{k · γ_T | T ∈ ([n] choose ℓ−1)} = D_f(ν ‖ µ) / max{k · γ_T | T ∈ ([n] choose ℓ−1)}.

It just remains to note that

D_f(ν ‖ µ) − D_f(ν D_{k→ℓ} ‖ µ D_{k→ℓ}) = E[∆_ℓ + · · · + ∆_{k−1}] ≥ E[∆_ℓ].

Here we used nonnegativity of E[∆_i], which follows from convexity of f and Jensen's inequality. Combining the previous two inequalities and rearranging the terms yields the desired result.

Remark 47. We remark that, similar to prior works, in this paper we only deal with the case where the α(T) contraction factors only depend on the size |T|.
However, we suspect the more general statement we proved here to be useful in potential future applications of this method, especially to distributions µ that "factorize" into two independent distributions when conditioned on an element; some potential examples include distributions over chains in a poset. In these scenarios, the order of conditioning on the elements matters, and we hope that by having E_{orderings}[β(∅) β({e_1}) · · · β({e_1, . . . , e_m})] instead of max_{orderings}{β(∅) β({e_1}) · · · β({e_1, . . . , e_m})}, we get more tractable results.

From this point on, we deal with cases where α(T), β(T) only depend on the cardinality |T|, and as such we write them as α_i, β_i, where i = |T|. Consequently, the global contraction factor we obtained can be rewritten as

1 − 1/(k β_0 β_1 · · · β_{ℓ−1}).

Remark 48. A similar, slightly better, contraction factor can be obtained when β(T) only depends on |T|. In these cases one can simply use E[∆_i] ≤ β_i · E[∆_{i+1}] and obtain the contraction

E[∆_0 + · · · + ∆_{ℓ−1}] / E[∆_0 + · · · + ∆_{k−1}] ≤ (1 + β_{ℓ−1} + · · · + β_1 · · · β_{ℓ−1}) / (1 + β_{k−1} + · · · + β_1 · · · β_{k−1}).

This is essentially the same bound found by Chen, Liu, and Vigoda [CLV20a] and Guo and Mousa [GM20], and the analysis is essentially the same as those at its core. However, this slightly better bound does not produce any meaningful improvement in the mixing time bounds we get in this work, and for simplicity we use the more naive bound.

While it might seem that β_0 · · · β_{ℓ−1} can get exponentially large, in the case of distributions that satisfy spectral independence [ALO20] this product remains polynomially bounded. In particular, one can show [see, e.g., ALO20; CLV20a] that if the correlation matrix (see Definition 14) has O(1)-bounded eigenvalues for the distribution µ and all of its conditionings, then β_i ≃ (1 − O(1/(k − i)))^{−1}.
In particular, as long as k − i is larger than a constant (hidden in the O -notation), then β i is finite an can be roughly approximatedby e O ( ( k − i )) . Thus for k − ℓ larger than an appropriate constant, we have the bound β β · · · β k − ℓ ≃ exp (cid:18) O (cid:18) k + k − + · · · + ℓ (cid:19)(cid:19) ≤ exp ( O ( log k )) = poly ( k ) . In this section, we prove Theorem 16.
Definition 49 (Signed Pairwise Influence/Correlation Matrix). Let µ be a probability distribution over 2^[n] with generating polynomial f(z_1, ..., z_n) = ∑_{S⊆[n]} µ(S) z^S. Let the signed pairwise influence matrix Ψ^inf_µ ∈ R^{n×n} be defined by

    Ψ^inf_µ(i, j) = 0 if j = i, and Ψ^inf_µ(i, j) = P[j | i] - P[j | ī] otherwise,

where P[j | i] = P_{T∼µ}[j ∈ T | i ∈ T], P[j] = P_{T∼µ}[j ∈ T], and P[j | ī] = P_{T∼µ}[j ∈ T | i ∉ T]. Let the correlation matrix Ψ^cor_µ ∈ R^{n×n} be defined by

    Ψ^cor_µ(i, j) = 1 - P[i] if j = i, and Ψ^cor_µ(i, j) = P[j | i] - P[j] otherwise.

In Definition 49, we use the convention that the entry Ψ^inf(i, j) (resp. Ψ^cor(i, j)) is set to 0 if P[j | i] or P[j | ī] (resp. P[j | i]) is not well-defined, i.e., when P[i] = 0 or P[i] = 1. The matrix Ψ^inf_µ was first introduced in [ALO20]. All eigenvalues of Ψ^inf_µ and Ψ^cor_µ are real [see, e.g., ALO20].

We show that Ω(1)-aperture sector-stability of the generating polynomial of µ implies an O(1) bound on the row norms of Ψ^inf_µ and Ψ^cor_µ. The high-level idea is to write the ℓ_1-norm of a row of Ψ^inf as the derivative at 0 of a holomorphic function that maps the unit disk to itself, and then use Schwarz's Lemma (Lemma 28) to derive a bound.

Theorem 50. Consider a multi-affine polynomial f ∈ R_{≥0}[z_1, ..., z_n] that is Γ_α-stable with α ≤ 1. Let µ : 2^[n] → R_{≥0} be the distribution generated by f. Then Ψ^inf_µ and Ψ^cor_µ have bounded row norms. Specifically,

    ∑_j |Ψ^inf_µ(i, j)| ≤ 2/α - 1 and ∑_j |Ψ^cor_µ(i, j)| ≤ 2/α.

As a corollary, the same bounds hold for the maximum eigenvalues, i.e., λ_max(Ψ^inf_µ) ≤ 2/α - 1 and λ_max(Ψ^cor_µ) ≤ 2/α.

Proof. If we can show the first statement, the second follows from

    P[j | i] - P[j] = P[j | i] - (P[j | i] P[i] + P[j | ī] P[ī]) = (1 - P[i]) (P[j | i] - P[j | ī]),

which gives

    ∑_j |Ψ^cor_µ(i, j)| ≤ (1 - P[i]) (1 + ∑_{j≠i} |P[j | i] - P[j | ī]|) ≤ 2/α.

Fix a row i. W.l.o.g., assume i = n. Let h = ∂_i f and g = f|_{z_i = 0}. We can assume w.l.o.g. that neither g nor h is the zero polynomial; if either were, the row would be identically 0 and the statement trivial. Let S := {j ∈ [n] \ {i} | P[j | i] - P[j | ī] < 0}; then

    ∑_{j≠i} |Ψ^inf_µ(i, j)| = ∑_{j∈S} (P[j | ī] - P[j | i]) - ∑_{j∉S} (P[j | ī] - P[j | i]).   (1)

Note that P[j | i] = ∂_j h(~1)/h(~1) and P[j | ī] = ∂_j g(~1)/g(~1). Define ~z ∈ R^{n-1} by

    z_j = y if j ∈ S, and z_j = y^{-1} otherwise.

Let h̄(y) = h(~z) and ḡ(y) = g(~z). Note that ∑_{j∈S} ∂_j h(~1) - ∑_{j∉S} ∂_j h(~1) = h̄'(1), and the same goes for ḡ. This is because for each monomial z^U = z^{U∩S} z^{U\S}, we have

    ( ∑_{j∈S} ∂_j z^U - ∑_{j∉S} ∂_j z^U ) |_{~z=~1} = |U ∩ S| - |U \ S| = ( y^{|U∩S|} (y^{-1})^{|U\S|} )' |_{y=1}.

Therefore,

    ∑_{j≠n} |Ψ^inf_µ(n, j)| = ∑_{j∈S} ( ∂_j g(~1)/g(~1) - ∂_j h(~1)/h(~1) ) - ∑_{j∉S} ( ∂_j g(~1)/g(~1) - ∂_j h(~1)/h(~1) ) = ḡ'(1)/ḡ(1) - h̄'(1)/h̄(1) = (log ḡ - log h̄)' |_{y=1} = φ'(0),   (2)

where φ(x) = log( ḡ(e^x)/h̄(e^x) ) - log( ḡ(1)/h̄(1) ).
Note that φ maps 0 to itself. Let D, H ⊆ C be the centered (open) unit disk and the (open) right half-plane, respectively. For any set Ω ⊆ C, we let Ω̄ denote its closure. The Möbius transformation T : x ↦ (x - 1)/(x + 1) is a conformal map from H onto D. For an angle θ ∈ (0, π), let Ω_θ := {x ∈ C | |Im(x)| < θ} and ϕ_θ : Ω_θ → D, x ↦ T(exp(πx/(2θ))). Note that ϕ_θ(0) = T(1) = 0,

    ϕ'_θ(0) = T'(1) · π/(2θ) = π/(4θ), and (ϕ_θ^{-1})'(0) = 1/ϕ'_θ(0) = 4θ/π.

To bound |φ'(0)|, we show that φ maps Ω_{απ/2} to Ω_{π - απ/2}. Then, for all small ǫ > 0, the map φ̃ := ϕ_{π - απ/2 + ǫ} ∘ φ ∘ ϕ_{απ/2}^{-1} is a holomorphic function that takes the centered unit disk to itself. We use Schwarz's Lemma to bound |φ̃'(0)|, then use this to bound |φ'(0)|.

Let θ := απ/2. Consider x ∈ Ω_θ. Note that the function x ↦ e^x maps Ω_θ to S_α. Also, ḡ(e^x)/h̄(e^x) ∉ -S̄_α: otherwise ḡ(e^x) + z · h̄(e^x) = 0 for some z ∈ S_α, i.e., f vanishes at a point with all coordinates in S_α, which contradicts the S_α-sector-stability of f. In particular, ḡ(e^x)/h̄(e^x) never takes a negative real value, thus the function log( ḡ(e^x)/h̄(e^x) ) is holomorphic, and as argued, |Im(log( ḡ(e^x)/h̄(e^x) ))| ≤ π - θ. Additionally, since g and h have nonnegative coefficients and are not the zero polynomial, ḡ(1) and h̄(1) are positive reals and log( ḡ(1)/h̄(1) ) is a real number. Therefore,

    |Im(φ(x))| = |Im(log( ḡ(e^x)/h̄(e^x) ))| ≤ π - θ.

Hence, φ maps Ω_θ to Ω_{π - θ + ǫ} for every ǫ > 0. Consider the holomorphic map φ̃ = ϕ_{π - θ + ǫ} ∘ φ ∘ ϕ_θ^{-1} that takes D to itself. Since φ and the maps ϕ_* all take 0 to itself, so does φ̃. By Schwarz's Lemma (Lemma 28), |φ̃'(0)| ≤ 1. On the other hand,

    φ̃'(0) = ϕ'_{π - θ + ǫ}(0) · φ'(0) · (ϕ_θ^{-1})'(0) = ( π/(4(π - θ + ǫ)) ) · φ'(0) · (4θ/π) = ( θ/(π - θ + ǫ) ) · φ'(0),

thus |φ'(0)| ≤ (π + ǫ)/θ - 1. Taking ǫ → 0 gives |φ'(0)| ≤ π/θ - 1 = 2/α - 1. Substituting back into (2) gives the desired bound.
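The matrices in Theorem 50 are easy to evaluate exactly for small state spaces. The following illustrative sketch (our own, not from the paper's algorithms) computes Ψ^inf_µ and Ψ^cor_µ by brute-force enumeration for the disjoint-blocks distribution, whose generating polynomial z_1···z_k + z_{k+1}···z_{2k} is Γ_{1/k}-sector-stable, and checks the row-norm bounds 2/α − 1 and 2/α with α = 1/k:

```python
def influence_matrices(n, prob):
    # prob: dict mapping frozenset -> probability; returns (Psi_inf, Psi_cor).
    def marginals(cond_i=None, present=None):
        tot, p = 0.0, [0.0] * n
        for S, w in prob.items():
            if cond_i is not None and ((cond_i in S) != present):
                continue
            tot += w
            for j in S:
                p[j] += w
        return None if tot == 0 else [x / tot for x in p]

    P = marginals()
    psi_inf = [[0.0] * n for _ in range(n)]
    psi_cor = [[0.0] * n for _ in range(n)]
    for i in range(n):
        p_in, p_out = marginals(i, True), marginals(i, False)
        for j in range(n):
            if j == i:
                psi_cor[i][j] = 1.0 - P[i]
            else:
                if p_in is not None and p_out is not None:
                    psi_inf[i][j] = p_in[j] - p_out[j]  # P[j|i] - P[j|i-bar]
                if p_in is not None:
                    psi_cor[i][j] = p_in[j] - P[j]      # P[j|i] - P[j]
    return psi_inf, psi_cor

# mu uniform over two disjoint blocks of size k (the example also used in the
# tightness discussion below).
k = 3
n = 2 * k
mu = {frozenset(range(k)): 0.5, frozenset(range(k, n)): 0.5}
psi_inf, psi_cor = influence_matrices(n, mu)
inf_row_norm = max(sum(abs(x) for x in row) for row in psi_inf)
cor_row_norm = max(sum(abs(x) for x in row) for row in psi_cor)
# With alpha = 1/k, Theorem 50 predicts <= 2k - 1 and <= 2k; the first is tight.
assert abs(inf_row_norm - (2 * k - 1)) < 1e-9
assert cor_row_norm <= 2 * k + 1e-9
print(inf_row_norm, cor_row_norm)
```

For this distribution the Ψ^inf row norm equals 2k − 1 exactly, matching the theorem's bound.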
Remark 51. Theorem 50's bounds on ‖Ψ^inf_µ‖_∞, ‖Ψ^inf_µ‖, and ‖Ψ^cor_µ‖_∞ are tight, even for homogeneous µ. For example, consider

    f_µ(z_1, ..., z_{rk}) = ∑_{i=0}^{r-1} ∏_{j=ik+1}^{(i+1)k} z_j.

For r = 2, we have

    Ψ^inf_µ = [ J_k, -J_k ; -J_k, J_k ] - I_{2k},

and ‖Ψ^inf_µ‖_∞ = ‖Ψ^inf_µ‖ = 2k - 1. For arbitrary r we get Ψ^cor_µ = diag(J_k, ..., J_k) - (1/r) J_{rk}, with J being the all-ones matrix, and

    ‖Ψ^cor_µ‖_∞ = k(1 - 1/r) + (r - 1)k/r = 2k(1 - 1/r) → 2k as r → ∞.

The bound on ‖Ψ^cor_µ‖ is tight in general; for example, consider f(z_1, ..., z_k) = ǫ z_1 ··· z_k + (1 - ǫ) for small ǫ > 0. It is, however, not tight for homogeneous distributions µ.

In this section, we show how certain natural operations affect the sector-stability of polynomials. In Corollary 60, we show that the degree-k part of a Hurwitz-stable (i.e., Γ_1-stable) polynomial is Γ_{1/2}-stable. In Theorem 64, we show that given a homogeneous real-stable polynomial g, the sum of terms in g whose (T_1, ..., T_k)-degree is equal to (c_1, ..., c_k) is Γ_{1/2^k}-stable. These results are important ingredients in the proofs of Theorems 7 and 10 and Corollary 13.

Proposition 52.
The following operations preserve Γ_α-sector-stability:

1. Specialization: g(z_1, ..., z_n) ↦ g(a, z_2, ..., z_n), where a ∈ Γ̄_α.

2. Scaling: g ↦ g ⋆ λ, if λ_i ∈ R_{≥0} for all i ∈ [n].

3. Dual: g ↦ g*, where g(z) = ∑_{S⊆[n]} c_S z^S and g*(z_1, ..., z_n) := ∑_{S⊆[n]} c_S z^{[n]\S}.

Proof. Part 1 for a ∈ Γ_α holds by definition; for a on the closed boundary of Γ_α (including a = 0 and the limit a → ∞) it follows from Lemma 33 by taking limits. Part 2 holds by the definition of sector-stability. For part 3,

    g*(z_1, ..., z_n) = z_1 ··· z_n · g(z_1^{-1}, ..., z_n^{-1}) ≠ 0 for z_1, ..., z_n ∈ Γ_α,

where we use the Γ_α-stability of g and the fact that z_1^{-1}, ..., z_n^{-1} are also in Γ_α.

Lemma 53 (Homogenization). If the multi-affine polynomial g(z_1, ..., z_n) := ∑_{S⊆[n]} c_S z^S is Γ_α-stable, then its homogenization

    g^hom(z_1, ..., z_n, w_1, ..., w_n) := ∑_{S⊆[n]} c_S z^S w^{[n]\S}

is multi-affine, homogeneous of degree n, and Γ_{α/2}-stable.

Proof. One can rewrite g^hom as

    g^hom(z_1, ..., z_n, w_1, ..., w_n) = w_1 ··· w_n · g(z_1/w_1, ..., z_n/w_n).

For any z_1, ..., z_n, w_1, ..., w_n ∈ Γ_{α/2}, we have z_i/w_i ∈ Γ_α for all i ∈ [n], thus the RHS is nonzero by the Γ_α-stability of g.

Corollary 54.
Consider a graph G = (V, E) on n vertices with edge weights w : E → R_{≥0} and vertex weights λ : V → R_{≥0}. For S ⊆ V, let

    m_S := ∑_M weight(M) = ∑_M ( ∏_{e∈M} w(e) · ∏_{v∉S} λ(v) ),

where the sum is taken over all perfect matchings M of S. The following polynomial is Γ_{1/2}-stable:

    f(z_1, ..., z_n, y_1, ..., y_n) = ∑_{S⊆V} y^S z^{[n]\S} m_S.

The class of sector-stable polynomials was studied in [SS19], where the authors proved that symmetrization preserves the sector-stability of univariate polynomials with nonnegative coefficients. Given a univariate complex polynomial p(z) = a_n z^n + ... + a_1 z + a_0, its symmetrization with n variables is defined as

    P(z_1, ..., z_n) = ∑_{k=0}^n ( a_k / binom(n, k) ) S_k(z_1, ..., z_n),

where S_k(z_1, ..., z_n) = ∑_{1 ≤ i_1 < ... < i_k ≤ n} z_{i_1} ··· z_{i_k}. By definition, P(z, ..., z) = p(z). We call (z_1, ..., z_n) a solution of p if P(z_1, ..., z_n) = 0. Define a closed set Ω ⊆ C* to be a locus holder of p if every solution of p has a point in Ω. Call a minimal-by-inclusion locus holder Ω a locus of p. For examples and properties of locus holders see [SS14]. Note that any polynomial is stable with respect to the complement of its locus. The next result shows that the symmetrization of a univariate sector-stable polynomial with nonnegative coefficients is sector-stable. Note that this result is not true if we drop the assumption of nonnegative coefficients.

Proposition 55 (Theorem 1.1 [SS19]). Let p(z) be a univariate Γ_α-sector-stable polynomial with nonnegative coefficients. Then the complement of Γ_α is a locus holder of p(z); equivalently, the symmetrization of p is Γ_α-sector-stable.

Given a polynomial g and a set S ⊆ [n], the S-degree of a monomial is its total degree with respect to the variables indexed by S, and we write g^S_k for the sum of the terms of g whose S-degree equals k. When the set S and g are specified, let k_max, k_min be the maximum and minimum S-degree among the monomials of g.

Lemma 56.
Let U := ∏_i Γ_{α_i} ⊆ C^n and S ⊆ [n]. If g ∈ C[z_1, ..., z_n] is U-stable, then g^S_{k_max} and g^S_{k_min} are also U-stable.

Proof. We may re-index the z_i so that S = [t] for some t ≤ n; w.l.o.g., assume that this is already done. For simplicity of notation, below we omit the superscript S. Observe that U is open, and g_{k_max}, g_{k_min} are not identically zero, by definition. For λ ∈ R_{>0}, let

    g_λ(z_1, ..., z_n) := λ^{-k_max} g(λ z_1, ..., λ z_t, z_{t+1}, ..., z_n) = g_{k_max}(z_1, ..., z_n) + ∑_{k=k_min}^{k_max-1} g_k(z_1, ..., z_n) / λ^{k_max - k}.

Clearly, g_λ is U-stable, and lim_{λ→∞} g_λ = g_{k_max}, so by Lemma 33, g_{k_max} is U-stable. Similarly, g_{k_min} = lim_{λ→0^+} λ^{-k_min} g(λ z_1, ..., λ z_t, z_{t+1}, ..., z_n) is U-stable.

As a consequence, we can prove that partial derivatives preserve sector-stability.

Corollary 57.
If p(z_1, ..., z_n) is a multiaffine sector-stable polynomial, then the partial derivative of p with respect to any variable z_i, i ∈ [n], which we denote by ∂_i p, is sector-stable (or identically zero).

Remark 58. In general, taking derivatives of non-multiaffine polynomials does not preserve sector-stability. For example, let x, y, z_1, ..., z_n be variables and consider the polynomial

    p = (x z_1 + y z_2)(x z_2 + y z_3) ··· (x z_n + y z_1).

This is Γ_{1/2}-sector-stable. Now differentiate with respect to each z_i once, and then set each z_i to zero. What remains is x^n + y^n, which is only Γ_{1/n}-sector-stable.

Theorem 59 (Hurwitz-stable intersected with one partition constraint). Suppose g(z_1, ..., z_n) is a Γ_1-stable polynomial with constant parity (the degrees of all monomials are even, or all odd). Then g_k is Γ_{1/2}-stable or identically 0. More precisely, for k ∈ [k_min, k_max] with k ≡ k_max (mod 2), g_k is Γ_{1/2}-stable.

Proof. Lemma 56 with S = [n] and U = Γ_1^n implies that g_{k_max} and g_{k_min} are Γ_1-stable. W.l.o.g., we assume k_max > k_min ≥ 0; otherwise there is nothing to prove. Fix arbitrary z_1, ..., z_n ∈ Γ_{1/2}. Let h(z_0) = z_0^{-k_min} g(z_1 z_0, z_2 z_0, ..., z_n z_0). Note that h(0) = g_{k_min}(z_1, ..., z_n) ≠ 0 by the Γ_1-stability of g_{k_min}. Note also that, by the constant-parity assumption, all terms of h have even degree in z_0, and the highest-degree term is g_{k_max}(z_1, ..., z_n) z_0^{k_max - k_min}, with g_{k_max}(z_1, ..., z_n) ≠ 0 by the Γ_1-stability of g_{k_max}. Since only even powers of z_0 appear, substituting z_0^2 = y yields a polynomial h̃(y) with h̃(z_0^2) = h(z_0); we claim h̃(y) ≠ 0 for y ∈ S̄_1 ∪ {0}. Indeed, h̃(0) = h(0) ≠ 0, and for y ∈ S̄_1 \ {0} we can take z_0 = y^{1/2} ∈ S̄_{1/2}, so that (z_i z_0)_{i=1}^n ∈ Γ_1^n and h̃(y) = h(z_0) ≠ 0 by the Γ_1-stability of g.

Let λ_1, ..., λ_d be the roots of h̃(y), where d := deg(h̃) = (k_max - k_min)/2. As argued, λ_i ∈ C \ (S̄_1 ∪ {0}), the open half-plane {z | Re(z) < 0}. Fix k ∈ [k_min, k_max) with k ≡ k_max (mod 2). Up to sign,

    g_k(z_1, ..., z_n) = g_{k_max}(z_1, ..., z_n) · e_t(λ_1, ..., λ_d) with t := (k_max - k)/2 ∈ N,

and by the half-plane stability of the elementary symmetric polynomials (Corollary 35), e_t(λ_1, ..., λ_d) ≠ 0. Hence g_k(z_1, ..., z_n) ≠ 0.

The next corollary yields sampling for nonsymmetric DPPs given by a matrix A ∈ R^{n×n} where A + A^T is PSD, and sampling from monomer-dimer systems of a fixed size.

Corollary 60.
Suppose g(z_1, ..., z_n) ∈ R[z_1, ..., z_n] is Γ_1-stable. Then g_k is either identically 0 or Γ_{1/2}-stable.

Proof. Define the even and odd parts g_e and g_o of g as in Theorem 36. Unless identically zero, g_e and g_o are Γ_1-stable by Theorem 36. The claim follows by applying Theorem 59 to g_e (resp. g_o) if k is even (resp. odd).

Lemma 38 and Corollary 60 together imply the following corollaries.

Corollary 61.
Consider A ∈ R^{n×n} where A + A^T is positive semi-definite. Then

    f_k(z_1, ..., z_n) = ∑_{S ∈ ([n] choose k)} det(A_{S,S}) z^{[n]\S} and its dual f*_k(z_1, ..., z_n) = ∑_{S ∈ ([n] choose k)} det(A_{S,S}) z^S

are either identically 0 or Γ_{1/2}-stable, and have nonnegative real coefficients.

Corollary 62. Consider a graph G = (V, E) on n vertices. For S ⊆ V, let m_S be the number of perfect matchings of S. Then

    f_k(z_1, ..., z_n) = ∑_{S ∈ ([n] choose k)} m_S z^{[n]\S}

and its dual f*_k are either identically 0 or Γ_{1/2}-stable.

Lemma 63.
Suppose that p(x, y) is a homogeneous polynomial with coefficients in C, of the form p(x, y) = ∑_i c_i x^i y^{d-i}. If p is (Γ̄_α × Γ̄_β)-stable for α + β ≥ 1, then the sequence of c_i has no holes (zeros in between nonzeros).

Proof. We may as well assume that c_0, c_d ≠ 0; otherwise we can factor out extra powers of x and y. Let g(z) = p(z, 1). Then g does not vanish on Γ̄_1 \ {0}: every z ∈ Γ̄_1 \ {0} can be written as x/y for some x ∈ Γ̄_α and y ∈ Γ̄_β, and then g(z) = p(x/y, 1) = p(x, y)/y^d ≠ 0. Since moreover g has no zero root, all roots of g lie in the open half-plane {z | Re(z) < 0}. But then c_{d-i}/c_d is, up to sign, the i-th elementary symmetric polynomial of the roots of g. Since the elementary symmetric polynomials are half-plane-stable for every open half-plane, all coefficients of g must be nonzero.

Theorem 64.
Suppose that g(z_1, ..., z_n) is a homogeneous Γ_1-stable polynomial. Let T_1, ..., T_k be a partition of [n] and (c_1, ..., c_k) ∈ Z^k_{≥0}. Define the (T_1, ..., T_k)-degree of a monomial z_1^{t_1} ··· z_n^{t_n} as (∑_{i∈T_1} t_i, ∑_{i∈T_2} t_i, ..., ∑_{i∈T_k} t_i). Let h be the sum of terms in g whose (T_1, ..., T_k)-degree is equal to (c_1, ..., c_k). Then h is either identically zero, or Γ_{1/2^k}-stable.

Proof. Let h_i be the polynomial obtained from g by retaining the terms whose (T_1, ..., T_i)-degree is (c_1, ..., c_i). Then h_0 = g and h_k = h. If h_i ≡ 0 for some i, then h_k ≡ 0; so w.l.o.g. we assume h_i ≢ 0 for all i. We will inductively prove that h_i is ( ∏_{j ∈ T_1∪···∪T_i} Γ_α × ∏_{j ∈ T_{i+1}∪···∪T_k} Γ_{β_i} )-stable for α = 1/2^k and β_i = 1 - (2^i - 1)/2^k (so that β_0 = 1 and β_k = α). Let ∏_i := ∏_{j ∈ T_1∪···∪T_i} Γ_α × ∏_{j ∈ T_{i+1}∪···∪T_k} Γ_{β_i}. Note that ∏_{i+1} ⊆ ∏_i for all i.

Note that β_0 = 1, and by assumption g = h_0 is Γ_1-stable, i.e., ∏_0-stable. So it is enough to prove the induction step. Assume the statement is true for h_i and let us prove it for h_{i+1}. Fix (z_1, ..., z_n) ∈ ∏_{i+1}. We will show h_{i+1}(z_1, ..., z_n) ≠ 0. Note that we can get h_{i+1} from h_i by retaining the terms whose T_{i+1}-degree is c_{i+1}. Take two variables x and y, and look at the polynomial p(x, y) = h_i(u_1, ..., u_n), where

    u_j := z_j if j ∈ T_1 ∪ ··· ∪ T_i, u_j := x z_j if j ∈ T_{i+1}, u_j := y z_j if j ∈ T_{i+2} ∪ ··· ∪ T_k.

Note that p is a homogeneous polynomial (of some degree d). This is because h_i is homogeneous in the variables from T_{i+1} ∪ ··· ∪ T_k. Let c_max, c_min be the maximum and minimum T_{i+1}-degree in h_i, respectively. Note that the coefficient of x^c y^{d-c} in p(x, y) is exactly h^{T_{i+1}}_{i,c}(z_1, ..., z_n), where h^{T_{i+1}}_{i,c} is the sum of terms in h_i whose T_{i+1}-degree is c. We will show that the coefficients of x^c y^{d-c} in p(x, y) are nonzero for all c ∈ [c_min, c_max]. This immediately implies the stability of h_{i+1}, as c_{i+1} ∈ [c_min, c_max] since h_{i+1} ≢ 0.

For c ∈ {c_min, c_max}, the polynomial h^{T_{i+1}}_{i,c} is ∏_i-stable by the inductive assumption on h_i and Lemma 56, thus h^{T_{i+1}}_{i,c}(z_1, ..., z_n) ≠ 0 as (z_1, ..., z_n) ∈ ∏_{i+1} ⊆ ∏_i. For the remaining c ∈ (c_min, c_max) we will use Lemma 63. Let x ∈ Γ̄_{β_i - α} and y ∈ Γ̄_{β_i - β_{i+1}}. These choices make sure that x z_j ∈ Γ_{β_i} and y z_j ∈ Γ_{β_i} for the appropriate indices j. By the inductive assumption, we then have p(x, y) ≠ 0. So p is (Γ̄_{β_i - α} × Γ̄_{β_i - β_{i+1}})-stable. If this stability satisfies the assumptions of Lemma 63, we are done. So it is enough to check (β_i - α) + (β_i - β_{i+1}) ≥ 1:

    2β_i - α - β_{i+1} = 2(1 - (2^i - 1)/2^k) - 1/2^k - (1 - (2^{i+1} - 1)/2^k) = 1 + (-2^{i+1} + 2 - 1 + 2^{i+1} - 1)/2^k = 1.

Conjecture 65.
With the same assumptions as in Theorem 64, h is either identically zero or Γ_{Ω(1/k)}-stable.

For any distribution µ, define its Newton polytope newt(µ) as the convex hull of its support:

    newt(µ) := conv({1_S : µ(S) > 0}).

Next, we show that the ℓ_1 edge lengths of the Newton polytope of a Γ_{1/k}-sector-stable distribution are O(k).

Lemma 66. Let µ : 2^[n] → R_{≥0} be a Γ_{1/k}-sector-stable distribution. Then the ℓ_1-length of every edge of newt(µ) is at most 2k.

Proof. First, we show that for any face F of newt(µ) there exists a Γ_{1/k}-sector-stable polynomial with support equal to the face F. Since F is a face of newt(µ), there exists some vector w = (w_1, ..., w_n) such that F = argmax{ ⟨w, x⟩ | x ∈ newt(µ) }. Let g_µ be the generating polynomial of µ; then

    g_µ(t^{w_1} z_1, ..., t^{w_n} z_n) = ∑_{α ∈ Z^n_{≥0}} coeff_g(z^α) t^{⟨w, α⟩} z^α.

Now, if we take the limit t → ∞, only the terms supported on F survive, i.e.,

    t^{-max⟨w, x⟩} g(t^{w_1} z_1, ..., t^{w_n} z_n) = ∑_{α ∈ F} coeff_g(z^α) z^α + ∑_{α ∉ F} t^{-δ_α} coeff_g(z^α) z^α

for some δ_α > 0, and

    g_F := lim_{t→∞} t^{-max⟨w, x⟩} g(t^{w_1} z_1, ..., t^{w_n} z_n) = ∑_{α ∈ F} coeff_g(z^α) z^α.

Note that linear scaling of the variables, and taking limits in t, preserve sector-stability; therefore g_F is a sector-stable polynomial. By applying the same argument again, we can restrict the Newton polytope to lower-dimensional faces while preserving sector-stability, until we reach an edge. As a result, the polynomial corresponding to each edge must also be Γ_{1/k}-sector-stable.

Now, assume to the contrary that there exists an edge (α, α') of newt(µ) with ℓ_1-length more than 2k. Then we would have that

    g_{(α,α')}(z) = a z^α + b z^{α'} = z^α (a + b z^{α'-α})

is Γ_{1/k}-sector-stable, where a = coeff_g(z^α) and b = coeff_g(z^{α'}) are nonzero. If |α - α'|_1 > 2k, we can set z_i ∈ Γ_{1/k} for i ∈ supp(α) ∆ supp(α') so that z^{α'-α} takes any value in C \ {0}; in particular, a + b z^{α'-α} = 0 for some such choice. Therefore g_{(α,α')} is not Γ_{1/k}-stable, a contradiction.

In this section, we first show that any sector-stable polynomial is also a fractionally log-concave polynomial. Then, by analyzing properties of fractionally log-concave polynomials, we show that the entropy of the marginals gives a constant-factor approximation of the entropy of fractionally log-concave distributions.
This leads to a multiplicative approximation of the logarithm of the size of the support of a sector-stable polynomial (see Lemma 73). Our techniques are a natural generalization of the results obtained by Anari, Oveis Gharan, and Vinzant [AOV18]. See also [ES20] for recent alternative techniques for proving similar entropy-based inequalities. An immediate consequence of our results is a multiplicative approximation of the logarithm of the size of the support of the monomer-dimer model (Corollary 75).
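Before the formal statements, here is a small numerical illustration (our own sketch, not part of the paper) of the two inequalities at play: sub-additivity of entropy, and the fractional-log-concavity lower bound. The example distribution is uniform over the matchable vertex sets of a 4-vertex path; by the results of this paper its generating polynomial is Γ_{1/2}-sector-stable, hence 1/4-fractionally log-concave, and the constant 1/8 below instantiates the resulting 2/α-approximation with α = 1/4:

```python
import math

# Uniform distribution over vertex sets of the path 0-1-2-3 that admit a
# perfect matching (the empty set counts, with one empty matching).
support = [frozenset(), frozenset({0, 1}), frozenset({1, 2}),
           frozenset({2, 3}), frozenset({0, 1, 2, 3})]
mu = {S: 1 / len(support) for S in support}

H = -sum(p * math.log(p) for p in mu.values())              # entropy of mu
marg = [sum(p for S, p in mu.items() if i in S) for i in range(4)]
Hb = lambda q: 0.0 if q in (0.0, 1.0) else -q * math.log(q) - (1 - q) * math.log(1 - q)
H_marg = sum(Hb(q) for q in marg)                            # entropy of marginals

assert H <= H_marg + 1e-12          # sub-additivity (holds for any distribution)
assert H >= (1 / 8) * H_marg - 1e-12  # 2/alpha-approximation with alpha = 1/4
print(H, H_marg)
```

Here H = log 5 ≈ 1.609 while the entropy of marginals is ≈ 2.692, so the two agree up to a small constant factor, as the results below guarantee.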
Lemma 67. For α ∈ (0, 1/2], if a polynomial f ∈ R_{≥0}[z_1, ..., z_n] is Γ_{2α}-sector-stable, then f is α-fractionally log-concave.

Proof. Let µ : 2^[n] → R_{≥0} be the distribution generated by f. First we claim that it is enough to prove fractional log-concavity at ~1. For an arbitrary vector ~v ∈ R^n_{>0}, let f_v({z_i}) = f({v_i^α z_i}). Note that f_v is sector-stable, and

    ∇² log f({z_i^α}) |_{~v} = D_{~v} ( ∇² log f_v({z_i^α}) |_{~1} ) D_{~v},

where D_{~v} = diag{v_i^{-1}}. So we may proceed by replacing f with f_v.

Let H = ∇² log f({z_i^α}) |_{~1}; then

    H_ij = α(α - 1) P[i] - α² P[i]² if j = i, and H_ij = α² (P[i ∧ j] - P[i] P[j]) otherwise,

where P[i ∧ j] := P_{S∼µ}[i, j ∈ S]. Since the row norms of Ψ^cor_µ are bounded by 1/α (Theorem 50, applied with aperture 2α), its maximum eigenvalue λ_max(Ψ^cor_µ) is at most 1/α. Writing D := diag(P[i]), we have

    Ψ^cor_µ = (1/α²) D^{-1} H + (1/α) I,

and D Ψ^cor_µ is symmetric; therefore λ_max(Ψ^cor_µ) ≤ 1/α implies -(1/α²) H = (1/α) D - D Ψ^cor_µ ⪰ 0, i.e., H ⪯ 0. Hence log f({z_i^α}) is concave.

Remark 68. Observe that the proof of Lemma 67 shows that λ_max(Ψ^cor_µ) ≤ 1/α is equivalent to the α-fractional log-concavity of µ.

Lemmas 53 and 67 imply that for a Γ_α-sector-stable µ : 2^[n] → R_{≥0}, the homogenization µ^hom of µ (a distribution on n-element subsets of [n] ∪ {1̄, ..., n̄}) is Γ_{α/2}-sector-stable and α/4-fractionally log-concave. In Lemma 69, we prove the stronger statement that µ^hom is α/2-fractionally log-concave.

Lemma 69.
Consider a distribution µ : 2^[n] → R_{≥0} that is generated by a Γ_α-sector-stable polynomial f. Let ν := µ^hom be the homogenization of µ. Then λ_max(Ψ^cor_ν) ≤ 2/α or, equivalently (by Remark 68), the homogenization f^hom of f is α/2-fractionally log-concave.

Proof. Let Ω = {1, ..., n} and Ω̄ = {1̄, ..., n̄}. For a set S ⊆ Ω, let S̄ ⊆ Ω̄ be {ī | i ∈ S}. Recall that

    ν(U) = µ(U ∩ Ω) if U = S ∪ (Ω̄ \ S̄) for some S ⊆ Ω, and ν(U) = 0 otherwise.

Let P[i] := P_{U∼ν}[i ∈ U] = P_{S∼µ}[i ∈ S] and P[ī] := P_{U∼ν}[ī ∈ U] = P_{S∼µ}[i ∉ S]. Note that Ψ^cor_ν(i, i) = -Ψ^cor_ν(i, ī) = P[ī] and Ψ^cor_ν(ī, ī) = -Ψ^cor_ν(ī, i) = P[i]. For i ≠ j, we can write

    Ψ^cor_ν(i, j) = -Ψ^cor_ν(i, j̄) = P[ī] Ψ^inf_µ(i, j) and Ψ^cor_ν(ī, j̄) = -Ψ^cor_ν(ī, j) = P[i] Ψ^inf_µ(i, j).

Let D := diag(P[i])_{i=1}^n and D̄ := diag(P[ī])_{i=1}^n. We can rewrite Ψ^cor_ν as a block matrix in terms of the matrix A := Ψ^inf_µ + I as follows:

    Ψ^cor_ν = [ D̄A, -D̄A ; -DA, DA ].

We consider the left eigenvectors of Ψ^cor_ν. Recall that all eigenvalues of Ψ^inf_µ are real. Let v_1, ..., v_n ∈ R^n be a basis of left eigenvectors of Ψ^inf_µ, with corresponding eigenvalues λ_1(Ψ^inf_µ) ≥ ··· ≥ λ_n(Ψ^inf_µ). For i ∈ [n], let w_i ∈ R^{2n} be the concatenation of v_i and -v_i, i.e., w_i^t := [v_i^t, -v_i^t]. Then the {w_i} are linearly independent, and are left eigenvectors of Ψ^cor_ν with eigenvalues {λ_i + 1}, since

    w_i^t Ψ^cor_ν = [v_i^t, -v_i^t] [ D̄A, -D̄A ; -DA, DA ] = [v_i^t (D̄ + D) A, -v_i^t (D̄ + D) A] = [v_i^t A, -v_i^t A] = (λ_i + 1) w_i^t,

where we used D + D̄ = I. On the other hand, for i ∈ [n], consider the vector u_i ∈ R^{2n} defined by u_i^t := [e_i^t D, e_i^t D̄], where e_i is the i-th standard basis vector of R^n. Observe that u_i ≠ 0, since P[i] or P[ī] must be nonzero, and u_i^t Ψ^cor_ν = 0. Moreover, the {u_i} are linearly independent.

Now, let W, U be the n-dimensional subspaces of R^{2n} spanned by {w_i} and {u_i} respectively. We show that W ∩ U = {0}, and conclude that the vectors {u_i} ∪ {w_i} are linearly independent and form a basis of (left) eigenvectors of Ψ^cor_ν. Hence, the spectrum of Ψ^cor_ν is the union of {λ_i + 1}_{i=1}^n and n copies of 0. In particular,

    λ_max(Ψ^cor_ν) ≤ λ_1(Ψ^inf_µ) + 1 ≤ 2/α.

Indeed, suppose w ∈ W ∩ U. We can write w^t = [y^t, -y^t] for some y ∈ R^n and w^t = [x^t D, x^t D̄] for some x ∈ R^n. Then

    0 = y(i) - y(i) = w(i) + w(i + n) = x(i) P[i] + x(i) P[ī] = x(i),

where we use y(i) (resp. x(i)) to denote the i-th entry of the vector y (resp. x). Now all entries of x are 0, so w = 0.

Remark 70. Lemma 69 is tight. For example, take f_µ(x_1, x_2) = x_1 + x_2, which is Γ_1-stable; then λ_max(Ψ^cor_{µ^hom}) = 2.

Given a distribution µ over a finite set Ω, define its entropy as

    H(µ) = -∑_{ω∈Ω} µ(ω) log µ(ω).

Recall that the marginal probability of an element i ∈ Ω, denoted µ(i), is the probability that i is in a sample from µ, i.e., µ(i) = P_{S∼µ}[i ∈ S]. For any probability distribution, by the sub-additivity of entropy, the entropy of the marginals is an upper bound on the entropy: ∑_i H(µ(i)) ≥ H(µ), where H(µ(i)) denotes the binary entropy of the marginal. The next lemma, which is analogous to Theorem 5.2 in [AOV18], leads to a lower bound on the entropy of fractionally log-concave distributions.

Lemma 71.
For any α-fractionally log-concave distribution µ : 2^[n] → R_{≥0} with marginal probabilities µ(1), ..., µ(n), we have

    H(µ) ≥ α ∑_i µ(i) log(1/µ(i)).

Proof. Let g_µ be the generating polynomial of the distribution µ. Define

    f(z_1, ..., z_n) = log g_µ( z_1^α/µ(1)^α, ..., z_n^α/µ(n)^α ).

Since µ is α-fractionally log-concave, log g_µ(z_1^α, ..., z_n^α) is a concave function, and scaling preserves concavity; therefore f(z_1, ..., z_n) is concave. Now, let X be the random indicator vector of a set chosen according to µ, i.e., P[X = 1_S] = µ(S). Then, by Jensen's inequality, f(E[X]) ≥ E[f(X)]. Note that

    f(E[X]) = f(µ(1), ..., µ(n)) = log g_µ( µ(1)^α/µ(1)^α, ..., µ(n)^α/µ(n)^α ) = log g_µ(1, ..., 1) = 0,

and

    f(1_S) = log( ∑_{T⊆S} µ(T) / ∏_{i∈T} µ(i)^α ) ≥ log( µ(S) / ∏_{i∈S} µ(i)^α ) = log µ(S) + α ∑_{i∈S} log(1/µ(i)),

where the inequality is true by the monotonicity of log. Hence,

    E[f(X)] = ∑_S µ(S) f(1_S) ≥ ∑_S µ(S) log µ(S) + α ∑_S µ(S) ∑_{i∈S} log(1/µ(i)) = -H(µ) + α ∑_i µ(i) log(1/µ(i)).

Combining with f(E[X]) = 0 ≥ E[f(X)] yields H(µ) ≥ α ∑_i µ(i) log(1/µ(i)).

Given a probability distribution µ : 2^[n] → R_{≥0}, the dual probability distribution µ* is defined so that the probability of each set equals that of its complement under µ, i.e., for any set S ⊆ [n], µ*(S) = µ([n] \ S).

Corollary 72. If µ and its dual µ* are α-fractionally log-concave, then ∑_i H(µ(i)) is a 2/α-approximation of H(µ). In particular, if µ is Γ_α-sector-stable, then µ and its dual µ* are α/2-fractionally log-concave (see Proposition 52, part 3, and Lemma 67); therefore ∑_i H(µ(i)) is a 4/α-approximation of H(µ).

Proof. For any probability distribution µ we have H(µ) ≤ ∑_i H(µ(i)), so it is enough to prove H(µ) ≥ (α/2) ∑_i H(µ(i)). By Lemma 71 we have

    H(µ) ≥ α ∑_i µ(i) log(1/µ(i)) and H(µ*) ≥ α ∑_i (1 - µ(i)) log(1/(1 - µ(i))).

Since µ and µ* are duals, H(µ) = H(µ*). Therefore,

    2 H(µ) = H(µ) + H(µ*) ≥ α ( ∑_i µ(i) log(1/µ(i)) + ∑_i (1 - µ(i)) log(1/(1 - µ(i))) ) = α ∑_i H(µ(i)).

Given a distribution µ, let F_µ be its support. We want to show how to approximate log |F_µ| when µ is fractionally log-concave. Previously, this result was shown for log-concave polynomials in [AOV18].

Lemma 73.
Consider F ⊆ ([n] choose k). Let F* := {[n] \ S | S ∈ F}. Let

    β := max{ ∑_i p_i log(1/p_i) | p ∈ conv(F) } and β* := max{ ∑_i q_i log(1/q_i) | q ∈ conv(F*) }.

Assume there exist α-fractionally log-concave polynomials g and h with supp(g) = F and supp(h) = F*. Then β + β* is an α/2-approximation of log |F|, i.e.,

    β + β* ≥ log |F| ≥ (β + β*) · α/2.

In particular, if there exists a Γ_α-sector-stable polynomial g with supp(g) = F, then β + β* is an α/4-approximation of log |F|.

Note that β and β* can be efficiently computed via a convex program (see, e.g., [AOV18, Theorem 2.10]). The following lemma states that any point in conv(F_µ), where µ is fractionally log-concave, equals the vector of marginals of some fractionally log-concave distribution.

Lemma 74. Consider F ⊆ ([n] choose k). Suppose there exists an α-fractionally log-concave polynomial g with supp(g) = F. For any (p_1, ..., p_n) ∈ conv(F), there exists ν with supp(ν) ⊆ F such that ∑_S ν(S) z^S is α-fractionally log-concave and ν(i) = p_i for all i ∈ [n]. Consequently,

    max{ ∑_i p_i log(1/p_i) | p ∈ conv(F) } = max{ ∑_i µ(i) log(1/µ(i)) | µ ∈ V },

where V is the set of α-fractionally log-concave µ with supp(µ) ⊆ F.

The proof is very similar to [AOV18, Theorem 2.10, Corollary 2.11]: given any ~p = (p_1, ..., p_n) ∈ newt(g), one can find a vector λ ∈ R^n_{≥0} such that the distribution generated by g ⋆ λ has marginals equal to ~p, and external fields preserve fractional log-concavity. Now, we are ready to prove Lemma 73.

Proof of Lemma 73. Let ν and ν* be the uniform distributions over F and F* respectively. For a set family F', let V_{F'} be the set of α-fractionally log-concave µ with supp(µ) ⊆ F'. Since V_F is non-empty, by Lemma 74,

    β = max_{µ ∈ V_F} { ∑_i µ(i) log(1/µ(i)) } and β* = max_{µ ∈ V_{F*}} { ∑_i µ(i) log(1/µ(i)) }.

For S ∈ {F, F*}, let µ^argmax_S = argmax_{µ ∈ V_S} ∑_i µ(i) log(1/µ(i)). We have

    log(|F|) = H(ν) ≤ ∑_i H(ν(i)) = ∑_i ( ν(i) log(1/ν(i)) + (1 - ν(i)) log(1/(1 - ν(i))) ) ≤ β + β*,

where the last inequality follows from the fact that (ν(i))_{i=1}^n ∈ conv(F) and (1 - ν(i))_{i=1}^n ∈ conv(F*). On the other hand, since the uniform distribution over a discrete set maximizes entropy among distributions supported on it,

    log(|F|) = H(ν) ≥ H(µ^argmax_F) ≥ α ∑_i µ^argmax_F(i) log(1/µ^argmax_F(i)) = αβ,

where the second inequality follows from Lemma 71. Analogously,

    log(|F*|) = H(ν*) ≥ H(µ^argmax_{F*}) ≥ α ∑_i µ^argmax_{F*}(i) log(1/µ^argmax_{F*}(i)) = αβ*.

Summing these two inequalities and using |F| = |F*|, we get log(|F|) ≥ (β + β*) · α/2.

Corollaries 54 and 62 and Lemma 73 together imply the following corollary.

Corollary 75.
Consider a graph G = (V, E). Let V^M be the family of sets S ⊆ V that have a perfect matching. For k ≤ n/2, let V^M_k be the family of vertex sets of size 2k that have a perfect matching. Then we can efficiently compute an 8-multiplicative-approximation of log |V^M| and of log |V^M_k|.

Analogously, Corollary 61 and Lemma 73 together imply the following.

Corollary 76. Consider a matrix L ∈ R^{n×n} such that L + L^T is positive semi-definite. Let V^L be the family of sets S ⊆ [n] such that det(L_{S,S}) ≠ 0. For k ≤ n, let V^L_k be the family of sets S ∈ ([n] choose k) such that det(L_{S,S}) ≠ 0. Then we can efficiently compute an 8-multiplicative-approximation of log |V^L| and of log |V^L_k|.

Remark 77. In Lemma 66, we showed that the convex hull of the support of a Γ_α-sector-stable polynomial has edge lengths bounded by O(1/α). We can show a similar result for α-fractionally log-concave polynomials. We leave the problem of characterizing the supports of α-fractionally log-concave polynomials to future work.

In this section, we state and prove the formal version of Corollary 13. This result gives an efficient algorithm to compute mixed derivatives of real-stable polynomials. The time complexity of the algorithm depends on the bit complexity of the coefficients of the polynomial and the number of partial derivatives. As a result, we obtain an FPRAS to compute the sum of the coefficients of the monomials corresponding to a partition matroid with constantly many parts. Without the assumption on the coefficients, the best known result gives an e^r-approximation factor, where r is the rank of the matroid (see [SV17]).

Lemma 78.
Let f ∈ R ≥ [ z , · · · , z n ] be a homogeneous real-stable polynomial whose maximum degree in z i is κ i .Let κ : = ∑ ni = κ i . For v , · · · , v k , x ∈ R n ≥ with k = O ( ) , we can compute ∂ c v · · · ∂ c k v k f | ~ z = x in polynomial time in κ and b, where b ≥ is the bit complexity of the coefficients of f and the entries of v , · · · , v k , x i.e., these entriesare in between [ − b , 2 b ] . Proof.
W.l.o.g., we can assume $f$ is homogeneous multiaffine; otherwise we replace $f$ with its polarization $f^\uparrow \in \mathbb{R}_{\geq 0}[(z_{ij})_{i \in [n], j \in [\kappa_i]}]$ (see Proposition 34). Note that $f^\uparrow$ is a homogeneous multiaffine polynomial in $\kappa$ variables and has the same degree as $f$. Moreover, $\partial_v f = \big(\sum_{i=1}^n \sum_{j=1}^{\kappa_i} v_i \frac{\partial}{\partial z_{ij}}\big) f^\uparrow$. Each call to the oracle $\mathcal{O}_{f^\uparrow}$ for $f^\uparrow$ can be implemented using one call to the oracle $\mathcal{O}_f$ for $f$. The bounds on the coefficients of $f$ imply that the coefficients of $f^\uparrow$ are bounded between $2^{-\kappa^{O(1)} b}$ and $2^{\kappa^{O(1)} b}$. Therefore, in the remainder of the proof we assume that the polynomial $f$ is multiaffine, homogeneous, real-stable, and $\kappa = n$.

We divide the proof into two main steps. In the first step, we map the polynomial $f$ to another polynomial $g$ such that:

1. $g$ is a homogeneous multiaffine real-stable polynomial in $n' = O(n)$ variables, of degree $d = \deg(g) = \deg(f)$.
2. $D^{c_1}_{v_1} \cdots D^{c_k}_{v_k}(f)|_{x_1, \ldots, x_n} = D^{c_1}_{w_1} \cdots D^{c_k}_{w_k}(g)|_{x'_1, \ldots, x'_{n'}} \in \mathbb{R}$, where $x'_i \geq 0$ for all $i \in [n']$. The vectors $w_i \in \{0,1\}^{n'}$ correspond to subsets $T_i \subseteq [n']$; further, these sets $T_i$ are disjoint.

Note that $D^{c_1}_{w_1} \cdots D^{c_k}_{w_k}(g)$ is exactly $h(x'_1, \ldots, x'_{n'})$, where $h(z_1, \ldots, z_{n'})$ is the sum of the terms in $g$ whose $(T_1, \ldots, T_k, T_{k+1})$-degree is $(c_1, \ldots, c_k, c_{k+1})$, where $T_{k+1} = [n'] \setminus \bigcup_{i=1}^k T_i$ and $c_{k+1} = \deg(g) - \sum_{i=1}^k c_i$.

In the second step, we (approximately) sample from the distribution $\mu$ generated by $h(x'_1 z_1, \ldots, x'_{n'} z_{n'})$. A routine sampling-to-counting argument then allows computing an approximation of $h(x'_1, \ldots, x'_{n'})$. Theorem 64 and Lemma 67 imply that $h$ is $\alpha$-fractionally log-concave for $\alpha = \frac{1}{k+1}$. Let $\Delta = \alpha^{-1}$ and $\ell = \lceil \Delta \rceil$; the $\ell$-step down-up walk has eigenvalue gap $\geq n^{-(\Delta + 1)}$. The bound on the coefficients of $f$ implies an upper bound of $\kappa^{O(1)} b$ on $\log \frac{1}{\min_{S \in \operatorname{supp}(\mu)} \mu(S)}$; thus the random walk starting from any $S \in \operatorname{supp}(\mu)$ mixes in $n^{O(k)} \kappa^{O(1)} b$ steps. We can use $\mathcal{O}_g$ to obtain a starting state in $\operatorname{supp}(\mu)$. Each step of the random walk can be implemented using polynomially many calls to $\mathcal{O}_g$.

For $t \leq n$ and $v_i > 0$, $i \in [t]$, it is easy to see that
$$\Big(\sum_{i=1}^t v_i \partial_i\Big)^c f(x_1, \ldots, x_n) = \Big(\sum_{i=1}^t \partial_i\Big)^c f(v_1 x_1, \ldots, v_t x_t, x_{t+1}, \ldots, x_n).$$
For $j \in [k]$, consider $w_j \in \{0,1\}^n$ where $w_{ji} = \mathbb{1}[v_{ji} \neq 0]$. For $i \in [n]$, let $x'_i := x_i \prod_{j : v_{ji} \neq 0} v_{ji} \geq 0$. We have
$$D^{c_1}_{v_1} \cdots D^{c_k}_{v_k}(f)|_{x_1, \ldots, x_n} = D^{c_1}_{w_1} \cdots D^{c_k}_{w_k}(f)|_{x'_1, \ldots, x'_n}.$$

Consider the linear transformation $T : \mathbb{R}[z_i]_{i \in [n]} \to \mathbb{R}[z_{i,j}]_{i \in [n], j \in [k]}$ obtained by substituting $z_i := \sum_{j=1}^k z_{i,j}$ for $i \in [n]$, and define $g = T(f)$. Clearly, $g$ is multiaffine, homogeneous, and real-stable. We next show that
$$T(D^{c_1}_{w_1} \cdots D^{c_k}_{w_k} f)|_{x_1, \ldots, x_n} = D^{c_1}_{\tilde w_1} \cdots D^{c_k}_{\tilde w_k}(g)|_{\{\tilde x_{i,j}\}_{i \in [n], j \in [k]}},$$
where $\tilde x_{i,j} = x_i$ for all $j \in [k]$, and $\tilde w_{j,(i,j')} = w_{ji}$ if $j' = j$ and $0$ otherwise; note that $f$ and $g$ are multiaffine. To prove the above equality, we only need to verify it for multiaffine monomials. Fix a multiaffine monomial $m$ and $j \in [k]$. We check that
$$T(w_{j i_1} \cdots w_{j i_{c_j}} \partial_{i_1} \cdots \partial_{i_{c_j}} m) = \tilde w_{j,(i_1,j)} \cdots \tilde w_{j,(i_{c_j},j)} \Big(\prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t, j}}\Big) T(m).$$
This immediately implies $T(D^{c_j}_{w_j} \tilde f) = D^{c_j}_{\tilde w_j} T(\tilde f)$ for any multiaffine polynomial $\tilde f$. The desired equality then follows by induction.

First, $w_{j i_1} \cdots w_{j i_{c_j}} = \tilde w_{j,(i_1,j)} \cdots \tilde w_{j,(i_{c_j},j)}$. We can factor them out and prove $T(\partial_{i_1} \cdots \partial_{i_{c_j}} m) = \prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t,j}} T(m)$. Now, if the $i_t$ are not distinct, then the LHS and RHS are both $0$, since $m$ and $T(m)$ are multiaffine. If $z_{i_t}$ does not divide $m$ for some $t$, then both the LHS and RHS are $0$. Now, write $m = m_0 \prod_{t=1}^{c_j} z_{i_t}$ for some monomial $m_0$ containing only variables in $[n] \setminus \{i_1, \ldots, i_{c_j}\}$. Clearly, the LHS is $T(m_0)$. The RHS is
$$\prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t,j}} \left( T(m_0) \prod_{t=1}^{c_j} \Big(\sum_{j'=1}^k z_{i_t,j'}\Big) \right) = T(m_0) \prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t,j}} \prod_{t=1}^{c_j} \Big(\sum_{j'=1}^k z_{i_t,j'}\Big) = T(m_0).$$

References

[Aff+14] Raja Hafiz Affandi, Emily B. Fox, Ryan P. Adams, and Ben Taskar.
Learning the Parameters of Determinantal Point Process Kernels. arXiv preprint (2014).
[AL20] Vedat Levi Alev and Lap Chi Lau. “Improved analysis of higher order random walks and applications”. In: Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing. 2020, pp. 1198–1211.
[ALO20] Nima Anari, Kuikui Liu, and Shayan Oveis Gharan. “Spectral Independence in High-Dimensional Expanders and Applications to the Hardcore Model”. In: Proceedings of the 61st IEEE Annual Symposium on Foundations of Computer Science. IEEE Computer Society, Nov. 2020.
[Ana+18] Nima Anari, Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-Concave Polynomials III: Mason’s Ultra-Log-Concavity Conjecture for Independent Sets of Matroids”. In: CoRR abs/1811.01600 (2018).
[Ana+19] Nima Anari, Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-concave polynomials II: high-dimensional walks and an FPRAS for counting bases of a matroid”. In: Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing. 2019, pp. 1–12.
[AO17] Nima Anari and Shayan Oveis Gharan. “A Generalization of Permanent Inequalities and Applications in Counting and Optimization”. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing. ACM, June 2017, pp. 384–396.
[AOR16] Nima Anari, Shayan Oveis Gharan, and Alireza Rezaei. “Monte Carlo Markov Chain Algorithms for Sampling Strongly Rayleigh Distributions and Determinantal Point Processes”. In: Proceedings of the 29th Conference on Learning Theory. Vol. 49. JMLR Workshop and Conference Proceedings. PMLR, June 2016, pp. 103–115.
[AOV18] Nima Anari, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-Concave Polynomials I: Entropy and a Deterministic Approximation Algorithm for Counting Bases of Matroids”. In: Proceedings of the 59th IEEE Annual Symposium on Foundations of Computer Science. IEEE Computer Society, Oct. 2018.
[Bay+07] Mohsen Bayati, David Gamarnik, Dimitriy A. Katz, Chandra Nair, and Prasad Tetali. “Simple deterministic approximation algorithms for counting matchings”. In: Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San Diego, California, USA, June 11-13, 2007. Ed. by David S. Johnson and Uriel Feige. ACM, 2007, pp. 122–127.
[BB09] Julius Borcea and Petter Brändén. “The Lee-Yang and Pólya-Schur programs. I. Linear operators preserving stability”. In: Inventiones Mathematicae (2009).
[BBL09] Julius Borcea, Petter Brändén, and Thomas Liggett. “Negative dependence and the geometry of polynomials”. In: Journal of the American Mathematical Society (2009).
[BGW03] Alexandre V. Borovik, Israel M. Gelfand, and Neil White. Coxeter Matroids. Springer, 2003, pp. 151–197.
[BH19] Petter Brändén and June Huh. “Lorentzian polynomials”. In: arXiv preprint arXiv:1902.03719 (2019).
[Bor09] Alexei Borodin. Determinantal point processes. arXiv preprint (2009).
[BP93] Robert Burton and Robin Pemantle. “Local Characteristics, Entropy and Limit Theorems for Spanning Trees and Domino Tilings Via Transfer-Impedances”. In: The Annals of Probability (1993).
[Brä07] Petter Brändén. “Polynomials with the half-plane property and matroid theory”. In: Advances in Mathematics (2007).
Advances in Neural Information Processing Systems. Ed. by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett. Vol. 31. Curran Associates, Inc., 2018, pp. 7365–7374.
[Cel+16] L. Elisa Celis, Amit Deshpande, Tarun Kathuria, Damian Straszak, and Nisheeth K. Vishnoi. “On the complexity of constrained determinantal point processes”. In: arXiv preprint arXiv:1608.00554 (2016).
[CGM19] Mary Cryan, Heng Guo, and Giorgos Mousa. “Modified log-Sobolev inequalities for strongly log-concave distributions”. In: Proceedings of the 60th IEEE Annual Symposium on Foundations of Computer Science. IEEE, 2019, pp. 1358–1370.
[Cha+15] Wei-Lun Chao, Boqing Gong, Kristen Grauman, and Fei Sha. “Large-Margin Determinantal Point Processes”. In: Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence. UAI ’15. Amsterdam, Netherlands: AUAI Press, 2015, pp. 191–200.
[Che+20] Zongchen Chen, Andreas Galanis, Daniel Štefankovič, and Eric Vigoda. “Rapid mixing for colorings via spectral independence”. In: arXiv preprint arXiv:2007.08058 (2020).
[CLV20a] Zongchen Chen, Kuikui Liu, and Eric Vigoda. “Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion”. In: arXiv preprint arXiv:2011.02075 (2020).
[CLV20b] Zongchen Chen, Kuikui Liu, and Eric Vigoda. “Rapid Mixing of Glauber Dynamics up to Uniqueness via Contraction”. In: arXiv preprint arXiv:2004.09083 (2020).
[CMO19] Fabio Deelan Cunden, Satya N. Majumdar, and Neil O’Connell. “Free fermions and α-determinantal processes”. In: Journal of Physics A: Mathematical and Theoretical 52.16 (Mar. 2019), pp. 165–202.
[DK17] Irit Dinur and Tali Kaufman. “High dimensional expanders imply agreement expanders”. In: Proceedings of the 58th IEEE Annual Symposium on Foundations of Computer Science. IEEE, 2017, pp. 974–985.
[Edm65] Jack Edmonds. “Paths, trees, and flowers”. In: Canadian Journal of Mathematics
17 (1965), pp. 449–467.
[ES20] Ronen Eldan and Omer Shamir. “Log concavity and concentration of Lipschitz functions on the Boolean hypercube”. In: arXiv preprint arXiv:2007.13108 (2020).
[EV19] David Eppstein and Vijay V. Vazirani. “NC Algorithms for Computing a Perfect Matching, the Number of Perfect Matchings, and a Maximum Flow in One-Crossing-Minor-Free Graphs”. In: The 31st ACM Symposium on Parallelism in Algorithms and Architectures. 2019, pp. 23–30.
[Fen+20] Weiming Feng, Heng Guo, Yitong Yin, and Chihao Zhang. “Rapid mixing from spectral independence beyond the Boolean domain”. In: arXiv preprint arXiv:2007.08091 (2020).
[FGT19] Stephen Fenner, Rohit Gurjar, and Thomas Thierauf. “Bipartite perfect matching is in quasi-NC”. In: SIAM Journal on Computing (2019).
Proceedings of the twenty-fourth annual ACM symposium on Theory of computing. 1992, pp. 26–38.
[Gar+19] Mike Gartrell, Victor-Emmanuel Brunel, Elvis Dohmatob, and Syrine Krichene. “Learning Nonsymmetric Determinantal Point Processes”. In: ArXiv abs/1905.12962 (2019).
[Går59] Lars Gårding. “An inequality for hyperbolic polynomials”. In: Journal of Mathematics and Mechanics (1959), pp. 957–965.
[GL99] Anna Galluccio and Martin Loebl. “On the theory of Pfaffian orientations. I. Perfect matchings and permanents”. In: The Electronic Journal of Combinatorics (1999), R6.
[GM20] Heng Guo and Giorgos Mousa. “Local-to-Global Contraction in Simplicial Complexes”. In: arXiv preprint arXiv:2012.14317 (2020).
[GPK16] Mike Gartrell, Ulrich Paquet, and Noam Koenigstein. “Bayesian Low-Rank Determinantal Point Processes”. In:
Proceedings of the 10th ACM Conference on Recommender Systems. RecSys ’16. Boston, Massachusetts, USA: Association for Computing Machinery, 2016, pp. 349–356.
[Gül97] Osman Güler. “Hyperbolic polynomials and interior point methods for convex programming”. In: Mathematics of Operations Research (1997).
Comm. Math. Phys.
Probab. Surveys.
[Jer87] Mark Jerrum. “Two-dimensional monomer-dimer systems are computationally intractable”. In: Journal of Statistical Physics (1987).
Mathematical Statistical Physics, Session LXXXIII: Lecture Notes of the Les Houches Summer School. 2005, pp. 1–56.
[JS89] Mark Jerrum and Alistair Sinclair. “Approximating the permanent”. In: SIAM Journal on Computing (1989).
Journal of the ACM (JACM).
Theoretical Computer Science 43 (1986), pp. 169–188.
[Kas61] Pieter W. Kasteleyn. “The statistics of dimers on a lattice: I. The number of dimer arrangements on a quadratic lattice”. In: Physica (1961).
Graph theory and theoretical physics (1967), pp. 43–110.
[KD16] Tarun Kathuria and Amit Deshpande. “On sampling and greedy map inference of constrained determinantal point processes”. In: arXiv preprint arXiv:1607.01551 (2016).
[KM16] Tali Kaufman and David Mass. “High dimensional random walks and colorful expansion”. In: arXiv preprint arXiv:1604.02947 (2016).
[KO18] Tali Kaufman and Izhar Oppenheim. “High order random walks: Beyond spectral gap”. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2018.
[Koz06] Dexter C. Kozen. Theory of computation. Springer Science & Business Media, 2006.
[KSG08] Andreas Krause, Ajit Singh, and Carlos Guestrin. “Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies”. In: J. Mach. Learn. Res. (2008).
[KT11] Alex Kulesza and Ben Taskar. “K-DPPs: Fixed-Size Determinantal Point Processes”. In: Proceedings of the 28th International Conference on Machine Learning. ICML ’11. Bellevue, Washington, USA: Omnipress, 2011, pp. 1193–1200.
[KT12] Alex Kulesza and Ben Taskar. “Determinantal Point Processes for Machine Learning”. In: Foundations and Trends® in Machine Learning (2012).
[KUW86] Richard M. Karp, Eli Upfal, and Avi Wigderson. “Constructing a perfect matching is in random NC”. In: Combinatorica (1986).
[Lan13] Serge Lang. Complex Analysis. Vol. 103. Springer Science & Business Media, 2013.
[LB12] Hui Lin and Jeff Bilmes. “Learning Mixtures of Submodular Shells with Application to Document Summarization”. In: Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence. UAI ’12. Catalina Island, CA: AUAI Press, 2012, pp. 479–490.
[LJS16] Chengtao Li, Stefanie Jegelka, and Suvrit Sra. “Fast DPP Sampling for Nyström with Application to Kernel Methods”. In: Proceedings of the 33rd International Conference on Machine Learning - Volume 48. ICML ’16. New York, NY, USA: JMLR.org, 2016, pp. 2061–2070.
[LLP17] Eyal Lubetzky, Alex Lubotzky, and Ori Parzanchevski. “Random walks on Ramanujan complexes and digraphs”. In: arXiv preprint arXiv:1702.05452 (2017).
[LMR15] Frederic Lavancier, Jesper Moller, and Ege Rubak. “Determinantal point process models and statistical inference”. In: Journal of the Royal Statistical Society. Series B (Statistical Methodology) (2015).
[Lov79] László Lovász. “On determinants, matchings, and random algorithms”. In: Fundamentals of Computation Theory, FCT 1979, Proceedings of the Conference on Algebraic, Arithmetic, and Categorial Methods in Computation Theory, Berlin/Wendisch-Rietz, Germany, September 17-21, 1979. Ed. by Lothar Budach. Akademie-Verlag, Berlin, 1979, pp. 565–574.
[LP17] David A. Levin and Yuval Peres.
Markov chains and mixing times. Vol. 107. American Mathematical Soc., 2017.
[Mac75] Odile Macchi. “The Coincidence Approach to Stochastic Point Processes”. In: Advances in Applied Probability (1975).
[MV89] Milena Mihail and Umesh Vazirani. “On the expansion of 0-1 polytopes”. In: Journal of Combinatorial Theory, Series B, to appear (1989).
[MVV87] Ketan Mulmuley, Umesh V. Vazirani, and Vijay V. Vazirani. “Matching is as easy as matrix inversion”. In: Proceedings of the nineteenth annual ACM symposium on Theory of computing. 1987, pp. 345–354.
[Opp18] Izhar Oppenheim. “Local spectral expansion approach to high dimensional expanders part I: Descent of spectral gaps”. In: Discrete & Computational Geometry (2018).
Information and Computation.
Ann. Probab.
[SS14] Blagovest Sendov and Hristo S. Sendov. “Loci of complex polynomials, part I”. In: Transactions of the American Mathematical Society 366 (2014), pp. 5155–5184.
[SS19] Blagovest Sendov and Hristo Sendov. “Duality between loci of complex polynomials and the zeros of polar derivatives”. In: Mathematical Proceedings of the Cambridge Philosophical Society (2019).
[Ste90] John R. Stembridge. “Nonintersecting paths, pfaffians, and plane partitions”. In: Advances in Mathematics (1990). doi: https://doi.org/10.1016/0001-8708(90)90070-4.
[SV17] Damian Straszak and Nisheeth K. Vishnoi. “Real Stable Polynomials and Matroids: Optimization and Counting”. In: STOC 2017. Montreal, Canada: Association for Computing Machinery, 2017, pp. 370–383.
[ŠVW18] Daniel Štefankovič, Eric Vigoda, and John Wilmes. “On counting perfect matchings in general graphs”. In: Latin American Symposium on Theoretical Informatics. Springer, 2018, pp. 873–885.
[TF61] Harold N. V. Temperley and Michael E. Fisher. “Dimer problem in statistical mechanics - an exact result”. In: Philosophical Magazine (1961).
Theoretical Computer Science