Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More
Yeganeh Alimohammadi, Nima Anari, Kirankumar Shiragur, Thuy-Duong Vuong
Stanford University, {yeganeh,anari,shiragur,tdvuong}@stanford.edu
February 5, 2021
Abstract
We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum [Jer87] to be #P-hard; Jerrum also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes.

In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them, robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari, Liu, and Oveis Gharan [ALO20], providing a new tool for establishing spectral independence based on the geometry of polynomials. As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding [Går59], who showed that homogeneous polynomials that have no roots in a half-plane must be log-concave over the positive orthant.
1 Introduction

Let µ : ([n] choose k) → R_{≥0} be a density function on the family of subsets of size k out of a ground set of n elements, which defines a probability distribution P[S] ∝ µ(S). The goal of this work is to establish properties of µ that translate into efficient algorithms for sampling from this distribution, and by classical equivalences between approximate counting and sampling [JVV86], into algorithms for approximately computing the normalizing constant, i.e., the partition function ∑_S µ(S). We study a family of local Markov chains that can be used to approximately sample from such a distribution.

Definition 1 (Down-Up Random Walks). For a density µ : ([n] choose k) → R_{≥0} and an integer ℓ ≤ k, we define the k ↔ ℓ down-up random walk as the sequence of random sets S_0, S_1, . . . generated by the following algorithm:

for t = 0, 1, . . . do
    Select T_t uniformly at random from subsets of size ℓ of S_t.
    Select S_{t+1} with probability ∝ µ(S_{t+1}) from supersets of size k of T_t.

Figure 1: A symmetric sector around the positive real axis. Sector-stability of a polynomial means that if all variables are chosen from the interior of the sector, the polynomial does not vanish.

This random walk is time-reversible, always has µ as its stationary distribution, and moreover has positive real eigenvalues [see, e.g., ALO20]. The special case ℓ = k − 1 corresponds to the standard Glauber dynamics, while ℓ = k − O(1) yields multi-site variants. The walk can be implemented efficiently as long as k − ℓ = O(1) and we have oracle access to µ. This is because the number of supersets of T_t is at most n^{k−ℓ} = poly(n), so we can enumerate over all of them in polynomial time.

Our main result establishes a formal connection between roots of the generating polynomial of µ, defined below, and rapid mixing of the k ↔ ℓ down-up walks.

Definition 2 (Generating Polynomial). To a density µ : ([n] choose k) → R_{≥0} we associate a multivariate generating polynomial g_µ ∈ R[z_1, . . . , z_n], which encodes µ in its coefficients:

g_µ(z_1, . . . , z_n) := ∑_S µ(S) ∏_{i ∈ S} z_i.

Note that g_µ is a polynomial with nonnegative coefficients, and as such, it has no roots (z_1, . . . , z_n) ∈ R^n_{>0}. We consider polynomials that not only avoid roots on the positive real axis, but also avoid roots in a neighborhood of it, that is, a sector of the complex plane centered around R_{>0}.

Definition 3 (Sector-Stability). For an open sector Γ ⊆ C centered around the positive real axis in the complex plane, see Fig. 1, we call a polynomial g(z_1, . . . , z_n) sector-stable if

z_1, . . . , z_n ∈ Γ ⟹ g(z_1, . . . , z_n) ≠ 0.

Our main result shows that sector-stability, when Γ has constant aperture, implies rapid mixing of the k ↔ ℓ down-up random walk for an appropriately chosen ℓ = k − O(1).

Theorem 4.
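A minimal sketch of the k ↔ ℓ down-up walk of Definition 1, assuming oracle access to µ (here a Python dict from frozensets to weights; `ground_set`, `k`, and `ell` are illustrative parameters, not notation from the paper):

```python
import itertools
import random

def down_up_step(S, mu, ground_set, ell):
    """One step of the k <-> ell down-up walk: drop to size ell, then
    resample a size-k superset with probability proportional to mu."""
    k = len(S)
    # Down step: keep a uniformly random subset T of size ell.
    T = set(random.sample(sorted(S), ell))
    # Up step: enumerate all size-k supersets of T (at most n^{k-ell} of them).
    outside = [v for v in ground_set if v not in T]
    candidates = [frozenset(T | set(extra))
                  for extra in itertools.combinations(outside, k - ell)]
    weights = [mu.get(C, 0.0) for C in candidates]
    return set(random.choices(candidates, weights=weights)[0])

# Toy density on 2-subsets of {0,1,2,3}; the walk's stationary law is mu.
mu = {frozenset(S): 1.0 for S in itertools.combinations(range(4), 2)}
S = {0, 1}
for _ in range(10):
    S = down_up_step(S, mu, range(4), ell=1)
assert len(S) == 2
```

Note that each step only evaluates µ on the poly(n)-many candidate supersets, which is what makes the walk implementable with oracle access alone.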
Suppose that the density µ : ([n] choose k) → R_{≥0} has a generating polynomial that is sector-stable with respect to a sector Γ of aperture Ω(1). Then for an appropriate value of ℓ = k − O(1), the k ↔ ℓ down-up random walk has relaxation time k^{O(1)}.

As a reminder, for a time-reversible Markov chain with positive eigenvalues, the relaxation time is the inverse of the spectral gap [LP17]. A corollary of polynomially bounded relaxation time is that for starting points with not-terribly-small probability, the mixing time can be polynomially bounded as well.
Corollary 5 ([see, e.g., LP17]). Suppose µ has a sector-stable generating polynomial for a sector of constant aperture, and let ℓ = k − O(1) be the value promised by Theorem 4. If the k ↔ ℓ down-up random walk is started from S_0, then

t_mix(ε) ≤ O( k^{O(1)} · log( 1 / (ε · P_µ[S_0]) ) ),

where t_mix(ε) is the smallest time t such that S_t is ε-close in total variation distance to the distribution defined by µ.

As our main application, we obtain efficient algorithms to approximately sample/count (weighted) matchings and matchings of a given size in planar graphs. We discuss this and other applications in Sections 1.1 to 1.3. We then discuss the techniques we use and related work in Sections 1.4 to 1.6.
Matchings in graphs have been a rich source of intriguing algorithmic questions. The celebrated blossom algorithm of Edmonds [Edm65], which finds a maximum-sized matching in a general graph, has been partially credited with the creation of the notion of polynomial time algorithms [Koz06]. An entirely different class of algorithms for finding matchings, based on connections to determinants, was introduced by Lovász [Lov79] and developed further by Karp, Upfal, and Wigderson [KUW86] and Mulmuley, Vazirani, and Vazirani [MVV87]; these determinant-based algorithms have played a central role in the study of parallel algorithms and derandomization [see, e.g., FGT19].

Matchings have also played a central role in counting complexity. The problem of counting perfect matchings of a given graph was shown by Valiant [Val79] to be complete for the class #P, yielding strong evidence that it cannot be solved in polynomial time. This was the first major result of its kind, demonstrating hardness of counting for a problem whose search version, i.e., the problem of distinguishing zero and nonzero counts, is polynomial-time solvable.

Given the hardness of exact counting [Val79], the main focus in subsequent work has been on approximate counting. Unlike combinatorial optimization problems, which often admit nontrivial approximation factors, for a wide range of counting problems the approximation factor achievable in polynomial time can either be made as small as 1 + ε, in fact for inverse-polynomially small ε, or it has to be super-polynomially large [SJ89]. Therefore, the gold standard for approximate counting is a fully polynomial time randomized approximation scheme or FPRAS; this is a randomized algorithm whose output is a (1 + ε)-factor approximation to the count with high probability, running in time poly(n, 1/ε).

In a breakthrough, Jerrum and Sinclair [JS89] established an FPRAS for counting matchings of all sizes on unweighted graphs.
It has been a major open problem to design an FPRAS for counting matchings of a given size or perfect matchings. In a celebrated result, Jerrum, Sinclair, and Vigoda [JSV04] designed an FPRAS for these problems on the important subclass of bipartite graphs; bipartite graphs are an important subclass because of their connection to the permanent of matrices. However, designing an FPRAS to count matchings of a given size on general graphs remains open [see, e.g., ŠVW18].

Besides the class of bipartite graphs, there is another major tractable class for counting perfect matchings. Motivated by models in statistical mechanics, Temperley and Fisher [TF61] and Kasteleyn [Kas61] related the number of perfect matchings in 2-dimensional lattices to a specific determinant, obtaining exact formulae for these counts. Later, Kasteleyn [Kas67] generalized this to all planar graphs, obtaining a polynomial time algorithm for exactly counting perfect matchings in such graphs. At a high level, this algorithm finds a suitable signing of the adjacency matrix, a.k.a. the Tutte matrix, ensuring its determinant is the square of the number of perfect matchings.

While both bipartite and planar graphs form tractable classes for (approximately/exactly) counting perfect matchings, see Figs. 2 and 3, there is a major difference between the two when it comes to non-perfect matchings. The problem of counting k-matchings, matchings with exactly k edges, is no harder than counting perfect matchings in general. In a general graph on n nodes, one can add n − 2k dummy nodes connected to everything else, see Fig. 4, and count perfect matchings in the modified graph; the result is (n − 2k)! times the number of k-matchings. This strategy extends to counting k-matchings in bipartite graphs as well. However, in the case of planar graphs, the dummy nodes destroy planarity. This is not just a coincidence.
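The dummy-node reduction above can be checked by brute force on a small graph (a sketch; `matchings` and `perfect_matchings` are simple enumerations written only for illustration):

```python
import itertools
from math import factorial

def matchings(edges, size):
    """Count sets of `size` pairwise-disjoint edges."""
    count = 0
    for M in itertools.combinations(edges, size):
        verts = [v for e in M for v in e]
        if len(verts) == len(set(verts)):
            count += 1
    return count

def perfect_matchings(edges, n):
    return matchings(edges, n // 2) if n % 2 == 0 else 0

# 4-cycle on vertices 0..3; count k-matchings for k = 1.
n, k = 4, 1
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Reduction: add n - 2k dummy nodes adjacent to every original node.
dummies = list(range(n, n + (n - 2 * k)))
aug = cycle + [(v, d) for v in range(n) for d in dummies]

lhs = perfect_matchings(aug, n + len(dummies))
rhs = factorial(n - 2 * k) * matchings(cycle, k)
assert lhs == rhs  # (n - 2k)! times the number of k-matchings
```

Here the 4-cycle has four 1-matchings, and the augmented graph on 6 vertices has 2! · 4 = 8 perfect matchings, matching the claimed identity.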
Jerrum [Jer87] showed that while perfect matchings can be counted exactly in polynomial time on planar graphs, counting k-matchings on such graphs is #P-hard, adding to the mystery of determinant-based counting algorithms. Nevertheless, Jerrum [Jer87] raised the possibility of approximately counting k-matchings in polynomial time, i.e., designing an FPRAS. As the main application of our results, we resolve this question affirmatively.

Theorem 6.
There is a randomized algorithm that receives a planar graph on n nodes and a number k, and outputs a (1 + ε)-approximation to the number of k-matchings with high probability, running in time poly(n, 1/ε).

More generally, our results apply to the setting of weighted graphs, a.k.a. monomer-dimer systems. Suppose that a given graph G = (V, E) has edge weights w : E → R_{≥0} and vertex weights λ : V → R_{≥0}. Then define the weight of a matching M as

weight(M) := ∏_{e ∈ M} w(e) · ∏_{v ∉ V(M)} λ(v),

where e ranges over the dimers, i.e., the matching edges, and v ranges over the monomers, i.e., the vertices not matched in M. Normalizing these weights defines a probability distribution over matchings, and approximating the normalizing factor, a.k.a. the partition function, is known to be equivalent to approximately sampling from this distribution [JVV86]. It was shown by Jerrum and Sinclair [JS89] how to approximately sample/count from monomer-dimer systems in general graphs when the edge weights w(e) are polynomially bounded and there are no vertex weights λ(v); these assumptions on the weights are quite strong, despite their seemingly innocuous appearance. Approximately sampling/counting from monomer-dimer systems with no restriction on the weights remains a key challenge.

Computing statistics of monomer-dimer systems on 2-dimensional lattices, and more generally planar graphs, was originally studied in statistical physics [Kas61; TF61; Kas67]. However, the determinant-based algorithms found could only solve the case of zero monomer weights: ∀v : λ(v) = 0. Here we remove this restriction, at the expense of approximation.

Theorem 7.
There is an algorithm that receives a planar graph G = (V, E) on n vertices and weights w : E → R_{≥0} and λ : V → R_{≥0}, and outputs a random matching M whose distribution is ε-close in total variation distance to the monomer-dimer distribution induced by w, λ. The running time of this algorithm is poly(n, log(1/ε)).

Our results do not rely strongly on planarity. In fact, Theorems 6 and 7 extend to any downward-closed family of graphs for which perfect matchings can be counted efficiently. Examples that go beyond planar graphs include certain minor-free graphs [EV19] and small genus graphs [GL99].

The key insight that enables Theorems 6 and 7 is that we show local random walks on the set of monomers, or terminals of the matching M, mix rapidly on all graphs. Monomer-dimer systems and k-matchings each induce a distribution on subsets S of the vertices of the graph if we only view the unmatched (or dually, matched) vertices, i.e., the monomers. On planar graphs, the weight of each such set S can be computed efficiently, up to a global normalizing factor:

µ(S) := ∑ { weight(M) | M is a perfect matching on the complement of S }.

We remark that by the results of [Jer87], approximation appears to be necessary, at least for the counting problem.
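The monomer density µ(S) above can be sketched by brute-force enumeration on a small unweighted graph (all w(e) = λ(v) = 1; `perfect_matching_count` is an illustrative stand-in for the efficient determinant-based planar counter):

```python
import itertools

def perfect_matching_count(edges, vertices):
    """Brute-force count of perfect matchings on `vertices`."""
    if len(vertices) % 2:
        return 0
    k = len(vertices) // 2
    usable = [e for e in edges if e[0] in vertices and e[1] in vertices]
    count = 0
    for M in itertools.combinations(usable, k):
        touched = [v for e in M for v in e]
        if len(set(touched)) == 2 * k:
            count += 1
    return count

def mu(S, edges, V):
    """Monomer weight of S: perfect matchings on the complement of S."""
    return perfect_matching_count(edges, V - S)

V = {0, 1, 2, 3}
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
# S = {0, 2}: vertices 1 and 3 remain, but (1, 3) is not an edge.
assert mu({0, 2}, cycle, V) == 0
# S = {}: the two perfect matchings of the 4-cycle.
assert mu(set(), cycle, V) == 2
```

Normalizing these values of µ(S) over all monomer sets S gives exactly the distribution the multi-site Glauber dynamics targets.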
We show how to sample a set S with probability approximately following the above, by running a multi-site Glauber dynamics on S for polynomially many steps. The rapid mixing of this random walk, combined with known equivalences between approximate sampling and approximate counting [MVV87], implies Theorems 6 and 7.

We prove Theorems 6 and 7 by showing sector-stability of the corresponding generating polynomials and then applying Theorem 4. We show sector-stability by starting from results of Heilmann and Lieb [HL72], who characterized regions of root-freeness for unconstrained non-homogeneous monomer-dimer systems, and applying a set of tools we build that show sector-stability degrades gracefully under a number of operations, like conditioning on cardinality or homogenization.

Lemma 8.
Suppose that a graph G = (V, E) is given with edge weights w : E → R_{≥0} and vertex weights λ : V → R_{≥0}, which together define a weight on matchings weight(M) = ∏_{e ∈ M} w(e) ∏_{v ∉ V(M)} λ(v). For any k, the following polynomial, encoding k-matchings, is sector-stable for a sector of aperture π/2:

g(z_1, . . . , z_n) = ∑_{M matching of size k} weight(M) ∏_{v ∉ V(M)} z_v.

Additionally, the following homogeneous polynomial in 2n variables, encoding all matchings, is sector-stable for a sector of aperture π/2:

g(z_1, . . . , z_n, z′_1, . . . , z′_n) = ∑_{M matching} weight(M) ∏_{v ∉ V(M)} z_v ∏_{v ∈ V(M)} z′_v.

Remark 9. Techniques developed by Jerrum and Sinclair [JS89] allow one to tune the weights in monomer-dimer systems to make the probability mass of k-matchings inverse-polynomially large. In turn, combining these techniques with rejection sampling, Theorem 6 can be derived from Theorem 7. Nevertheless, our techniques directly solve the sampling problem for k-matchings, monomer-dimer systems, and even monomer-dimer systems restricted to k-matchings, without the need to resort to weight-tuning.

Determinantal point processes (DPPs) are elegant probabilistic models used to capture the relationship between items within a subset drawn from a large universe of items. A DPP is formally defined with the help of an n × n positive semidefinite matrix L ⪰ 0, where a subset S ⊆ [n] is chosen with probability given by the principal minors of L: P[S] ∝ det(L_{S,S}).

Determinantal point processes were first studied in 1975 by Macchi [Mac75], who was motivated by the study of fermion processes in quantum mechanics. Since then, DPPs have been very well-studied and have found applications in many areas such as physics [CMO19; Sos02], random matrix theory [Joh05], combinatorics [BBL09] (random spanning trees [BP93], non-intersecting paths [Ste90]), and recently in machine learning. Within machine learning, DPPs have been used in several applications such as document summarization [Cha+15; LB12], recommender systems [GPK16], and many others [Aff+14; KT11; KSG08]. Due to their broad and practical applications, algorithmic questions occurring in DPPs have received a lot of attention, and efficient algorithms for DPP learning [Aff+14; Bor09; KT12; LMR15] and sampling [AOR16; RK15; LJS16; Hou+06] have been provided.

Kulesza and Taskar [KT11; KT12] studied an extension of DPPs where the samples are conditioned on having a fixed size k. These so-called k-DPPs are formally defined with the help of an n × n positive semidefinite matrix L ⪰ 0, where a subset S ∈ ([n] choose k) of size k is chosen with probability given by the k × k principal minors of L:

P[S] = det(L_{S,S}) / ∑_{T ∈ ([n] choose k)} det(L_{T,T}).

The authors in [KT11; KT12] used k-DPPs to attack problems such as the image search task, where the goal is to output a diverse set of image results, of desired cardinality, in response to a search query. Almost all prior work on DPPs assumes the underlying matrix L is symmetric and positive semidefinite (PSD), and the understanding of nonsymmetric DPPs (where L does not have to be symmetric) remains sparse.
For nonsymmetric matrices L that are guaranteed to have nonnegative principal minors, the nonsymmetric DPP can still be defined by P[S] ∝ det(L_{S,S}).

Nonsymmetric DPPs are important as they allow one to model both repulsive and attractive relationships between items, providing significantly improved modeling power. For applications of nonsymmetric DPPs see [Gar+19], where the authors use nonsymmetric DPPs to effectively recover correlation structure within data, particularly for data that contains large disjoint collections of items where the items within the same collection have positive correlation while those across different collections are negatively correlated. Brunel [Bru18] also studied learning certain subclasses of nonsymmetric DPPs. Due to their enhanced expressive power and potential new applications, the study of nonsymmetric DPPs has been an active area of research in the past few years.

The question of sampling from nonsymmetric k-DPPs is known to be polynomial-time tractable. Indeed, the counting question, that is, computing the sum of principal minors, can be done exactly, even when restricted to k × k principal minors. However, these naive algorithms are cumbersome to run in practice, as they require at least n × n matrix multiplication time. A similar barrier existed for symmetric DPPs, but Markov-chain-based sampling from k-DPPs for symmetric L provided one way to get around this barrier [AOR16; LJS16], yielding algorithms that run in O(n · poly(k)) time.

As an application of our results, we provide the first efficient Markov-chain-based algorithm to sample from a wide class of nonsymmetric k-DPPs. Our algorithm works for any nonsymmetric matrix L satisfying L + L^⊺ ⪰ 0. These matrices are the sum of a skew-symmetric matrix and a symmetric PSD matrix; this class of matrices L, which are automatically guaranteed to have nonnegative principal minors, defines the main class of nonsymmetric DPPs studied in the literature [Gar+19].

Theorem 10.
For any matrix L ∈ R^{n×n} satisfying L + L^⊺ ⪰ 0 and cardinality k ≥ 1, consider the distribution µ : ([n] choose k) → R_{≥0} defined by µ(S) ∝ det(L_{S,S}). Then the k ↔ (k − 1) down-up random walk for µ has relaxation time poly(k).

Note that each step of this random walk can be implemented using O(n) computations of k × k principal minors of L. So this results in a mixing time of O(n · poly(k) · log(1/P[S_0])), which can be much faster than n × n matrix multiplication time. To the best of our knowledge, our work is the first to establish that natural Markov chains can be used for the task of sampling from nonsymmetric k-DPPs. Unsurprisingly, we show this result by proving sector-stability of the corresponding generating polynomial.

Lemma 11.
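One step of the k ↔ (k − 1) walk for a nonsymmetric k-DPP can be sketched with O(n) principal-minor evaluations (a toy implementation; a practical one would update determinants incrementally rather than recompute each minor from scratch):

```python
import numpy as np

rng = np.random.default_rng(0)

def minor(L, S):
    S = sorted(S)
    return np.linalg.det(L[np.ix_(S, S)])

def kdpp_step(L, S):
    """One k <-> (k-1) down-up step for mu(S) proportional to det(L_{S,S})."""
    S = list(S)
    # Down step: drop a uniformly random element.
    S.remove(rng.choice(S))
    # Up step: re-add an element with probability proportional to the new minor.
    n = L.shape[0]
    candidates = [i for i in range(n) if i not in S]
    weights = np.array([minor(L, S + [i]) for i in candidates])
    new = rng.choice(candidates, p=weights / weights.sum())
    return frozenset(S + [int(new)])

# L = skew-symmetric part + PSD part, so L + L^T >= 0 and all principal
# minors are nonnegative, as required.
A = rng.standard_normal((5, 5))
L = (A - A.T) + 2.0 * np.eye(5)
S = frozenset({0, 1})
for _ in range(20):
    S = kdpp_step(L, S)
assert len(S) == 2
```

Each step touches only n − k + 1 candidate minors, which is the source of the O(n · poly(k)) per-step cost mentioned above.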
For any matrix L ∈ R^{n×n} satisfying L + L^⊺ ⪰ 0 and number k, the following polynomial is sector-stable w.r.t. a sector of aperture π/2:

g(z_1, . . . , z_n) = ∑_{S ∈ ([n] choose k)} det(L_{S,S}) ∏_{i ∈ S} z_i.

Suppose that µ : ([n] choose k) → R_{≥0} is a density where g_µ is stable with respect to a half-plane in C, i.e., stable w.r.t. the sector {z ∈ C | Re(z) > 0}. Distributions with this property are called strongly Rayleigh, and they have been widely studied in the literature [see, e.g., BBL09]. Strongly Rayleigh distributions include determinantal point processes, certain classes of matroids, results of the symmetric exchange process, and more [see, e.g., BBL09]. Motivated by the important problems of computing mixed discriminants and counting intersections of matroids, several works [AO17; SV17; Cel+16; KD16] have studied the problem of sampling from such µ subject to a partition constraint. That is, given a partition T_1 ∪ T_2 ∪ · · · ∪ T_s = [n] and numbers c_1, . . . , c_s ∈ Z_{≥0}, the question is to sample S ∼ µ conditioned on the constraint ∀i : |S ∩ T_i| = c_i.

If we allow arbitrarily large s, this problem becomes as hard as (approximately) computing the mixed discriminant, for which no FPRAS is known. If one defines the same problem for distributions µ that have a log-concave generating polynomial, then partition-constrained sampling is as hard as sampling from the intersection of two matroids; this is again an important open problem, which remains unsolved.

Given the importance of the partition-constrained distributions mentioned above, a natural question is: are there assumptions on the partitions that allow for an FPRAS or approximate sampling? Celis, Deshpande, Kathuria, Straszak, and Vishnoi [Cel+16] obtained such a positive result when the number of partitions s is a constant and, importantly, when g_µ can be computed exactly (as is the case for determinantal distributions). They relied on polynomial interpolation to achieve this result.
However, for many strongly Rayleigh distributions µ, we can only approximately compute g_µ. As a further application of our results, we show how to sample from partition-constrained µ, as long as the number of partitions is O(1); our algorithm only requires having oracle access to µ, as opposed to g_µ. We do this by showing that the local random walks on the partition-constrained µ still mix rapidly, by relying on Theorem 4 and showing sector-stability for the conditioned distribution.

Lemma 12.
Suppose that µ has a sector-stable generating polynomial with respect to the sector {z ∈ C | Re(z) > 0}. Then the partition-constrained distribution for O(1)-many partitions is sector-stable w.r.t. a sector of Ω(1) aperture.

As a corollary of the ability to approximately compute the partition function for µ subject to partition constraints, we show how to approximately compute mixed derivatives of real-stable polynomials g_µ, where the number of distinct derivative directions is O(1). Note that without this restriction to O(1) directions, this problem becomes as hard as computing mixed discriminants.

Corollary 13.
Let g(z_1, · · · , z_n) be a homogeneous real-stable polynomial with nonnegative coefficients. Suppose we are given oracle access to the coefficients of g, and we are also given a term with nonzero coefficient. Then there is an FPRAS that can approximately compute mixed derivatives of g along positive directions, as long as the number of unique directions is O(1). That is, given v_1, · · · , v_s, x ∈ R^n_{≥0} with s = O(1) and a tuple (c_1, · · · , c_s) ∈ Z^s_{≥0}, we can efficiently approximate

∂^{c_1}_{v_1} · · · ∂^{c_s}_{v_s} g |_{z=x}.

Here ∂_v is simply the operator v_1 ∂_{z_1} + · · · + v_n ∂_{z_n}.

In order to prove Theorem 4, we build on a recent line of work leveraging high-dimensional expanders for sampling problems [Ana+19; AL20; ALO20; CLV20b; Fen+20; Che+20; CLV20a]. Specifically, we use the framework dubbed spectral independence by Anari, Liu, and Oveis Gharan [ALO20]. In this framework, one views a target distribution µ as a weighted hypergraph or simplicial complex. Establishing a certain notion of high-dimensional expansion would then imply fast mixing of natural random walks that converge to µ [DK17; KM16; LLP17; KO18; AL20]. Reinterpreting the notion of high-dimensional expansion needed for rapid mixing, Anari, Liu, and Oveis Gharan [ALO20] showed how properties of pairwise correlations in the distribution µ, and certain distributions derived from µ, can imply rapid mixing of natural local random walks, see Definition 1.

The spectral independence framework can be applied to the problem of sampling from a distribution on size-k subsets of a ground set of n elements, given up to a global normalizing factor by a function

µ : ([n] choose k) → R_{≥0}.

In many cases the domain of µ can be adapted to be of the form ([n] choose k) [see, e.g., ALO20]. For concreteness, let us look at the distribution of monomers in a monomer-dimer system on the graph G = (V, E).
Not all monomer sets have the same size, but we can view each set S ⊆ V as a subset of size |V| chosen from V × {0, 1}:

S ↦ {(v, 0) | v ∉ S} ∪ {(v, 1) | v ∈ S}.

This gives us a distribution µ : ((V × {0, 1}) choose |V|) → R_{≥0}. Note that in the case of k-matchings, the monomer set is already of a fixed size, and there is no need for this transformation.

Anari, Liu, and Oveis Gharan [ALO20], based on earlier work of Alev and Lau [AL20], showed that rapid mixing of natural local random walks converging to µ can be established as long as pairwise correlations of µ (and certain distributions derived from µ) are spectrally bounded. More precisely, consider the correlation matrix defined below.

Definition 14 (Correlation Matrix). For a distribution µ over subsets S of a ground set [n], define the correlation matrix Ψ ∈ R^{n×n} as the matrix having entries

Ψ_{i,j} := P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S].

The entries of the matrix Ψ measure pairwise correlations, or in other words, deviations from pairwise independence. The key behind the spectral independence framework is to show that the maximum eigenvalue of Ψ is O(1). Note that Ψ is always similar to a symmetric matrix and therefore has real eigenvalues [ALO20]. More precisely, one needs to show this not just for the distribution µ, but also for conditioned versions of it. We remark that in earlier work, a variant of the correlation matrix has appeared where the entries are instead given by P[j ∈ S | i ∈ S] − P[j ∈ S | i ∉ S], but these two variants are intimately connected, and for homogeneous distributions one can go from eigenvalue bounds of one to the other.

Definition 15 (Conditioned Distribution). For a distribution µ defined over subsets of a ground set [n] and T ⊆ [n], define µ_T to be the distribution of S ∼ µ conditioned on the event T ⊆ S.

One has to show that the correlation matrix has bounded eigenvalues for every T where µ_T is well-defined. The main challenge in all applications of this framework is bounding these eigenvalues.
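A brute-force computation of the correlation matrix Ψ of Definition 14 for a toy density (a sketch; here µ is the uniform distribution on 2-subsets of a 3-element ground set, for which every entry of Ψ can also be worked out by hand):

```python
import itertools
import numpy as np

def correlation_matrix(mu, n):
    """Psi[i, j] = P[j in S | i in S] - P[j in S] for S ~ mu."""
    Z = sum(mu.values())
    p = [sum(w for S, w in mu.items() if j in S) / Z for j in range(n)]
    Psi = np.zeros((n, n))
    for i in range(n):
        Zi = sum(w for S, w in mu.items() if i in S)
        for j in range(n):
            pij = sum(w for S, w in mu.items() if i in S and j in S) / Zi
            Psi[i, j] = pij - p[j]
    return Psi

n = 3
mu = {frozenset(S): 1.0 for S in itertools.combinations(range(n), 2)}
Psi = correlation_matrix(mu, n)

# The uniform 2-subset distribution is negatively correlated: diagonal
# entries are 1/3, off-diagonal entries are -1/6, eigenvalues are real,
# and every row has small l1 norm.
eigs = np.linalg.eigvals(Psi)
assert np.allclose(eigs.imag, 0)
assert max(abs(Psi[i]).sum() for i in range(n)) <= 1.0 + 1e-9
```

This is exactly the kind of row-sum (ℓ1) bound on Ψ that the spectral independence framework requires, here verified numerically on a case where it is obvious.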
Roughly speaking, prior work has managed to use three categories of techniques to establish eigenvalue bounds, discussed below.

Trickle-Down. Oppenheim [Opp18] showed that an eigenvalue bound on Ψ for µ_{{1}}, µ_{{2}}, . . . , µ_{{n}} also implies an eigenvalue bound on Ψ for the distribution µ, under some mild additional conditions. This enables an inductive approach to bounding the eigenvalues of Ψ, starting from µ_T for large sets T (i.e., of size close to k) and trickling down to smaller sets. However, with each application of this result the eigenvalue bound deteriorates, and the induction cannot be completed. A notable exception to this deterioration of the bounds are distributions related to matroids [Ana+19], but as was observed by Alev and Lau [AL20], for almost any distribution beyond matroids, one has to employ additional tricks to make this induction useful for sampling.

Negative Correlation.
Some distributions have negative entries in Ψ everywhere except on the diagonal; this property is known as negative correlation [see, e.g., BBL09]. Most notably, the uniform distribution on spanning trees, balanced matroids, and determinantal point processes all have negative correlation [FM92; BBL09]. When negative correlations exist, the ℓ1 norm of the rows of Ψ, and consequently its maximum eigenvalue, can be bounded by O(1) [ALO20]. For non-homogeneous distributions that satisfy negative correlation, related statements hold, as was shown recently by Eldan and Shamir [ES20].

We remark that in some works using the spectral independence framework, the matrix Ψ is defined slightly differently, with entries of the form P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S | i ∉ S], but these matrices are directly related, and we believe it is more natural to consider the definition presented here.

Figure 5: The two vertices are either both monomers or neither is. Therefore they are positively correlated.
Figure 6: Only two matchings, one with odd edges and one with even edges, appear in the monomer-dimer system. The endpoints have long-range correlation.
Figure 7: Informally, the number of vertices strongly correlated with any given vertex is bounded.
Correlation Decay.
When µ is a distribution defined on an underlying graph, e.g., spin systems, which are distributions on random assignments σ : V → [q] of q spins to the vertices of a graph, one can define a class of properties under the umbrella term "correlation decay". Informally, these properties imply that for distant vertices u, v, the values of σ(u), σ(v) are almost independent of each other. Naturally, this is very useful for bounding the entries, and consequently the eigenvalues, of the matrix Ψ. While correlation decay properties were already known to yield efficient sampling/counting algorithms, when combined with the spectral independence framework they resulted in algorithms with truly polynomial running times (compared to prior results, which often needed extra assumptions such as boundedness of the degrees in the graph) for several problems like the hardcore model [ALO20], two-spin systems [CLV20b], and random colorings [Che+20; FGT19].

Unfortunately, in the case of the monomer distribution in monomer-dimer systems, none of these methods appear to work. As demonstrated in Figs. 5 and 6, we can have both positive and long-range correlations. Nevertheless, we show that the correlation matrix is still bounded, see Fig. 7.
Theorem 16.
Suppose that µ : ([n] choose k) → R_{≥0} is a density whose generating polynomial is sector-stable w.r.t. a sector of aperture Ω(1). Then the ℓ1 norm of any row in the correlation matrix Ψ is bounded by O(1):

∀i : ∑_j | P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S] | ≤ O(1).

Note that a bound on the ℓ1 norm of the rows is also a bound on the maximum eigenvalue [see, e.g., ALO20]. Combining this with sector-stability of various distributions, e.g., the monomer distribution, results in specific bounds on the correlation matrix.

Corollary 17.
Let µ be the distribution of monomers in uniformly random k-matchings, or more generally monomer-dimer systems with arbitrary weights (possibly restricted to k-matchings). Then the ℓ1 norms of the rows of the correlation matrix Ψ are bounded by a universal constant:

∀i : ∑_j | P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S] | ≤ O(1).

Our main technical contribution is introducing a new technique for establishing spectral independence based on the roots of the partition function in the complex plane. We remark that for the special case of unweighted monomer-dimer systems, a form of correlation decay does exist [Bay+07].

1.5 Techniques and Related Work: Sector-Stability and Fractional Log-Concavity

The study of roots of polynomials associated with distributions has a very long history, most notably in statistical physics, where having roots near the positive real axis is recognized as an indicator of phase transition. This is because roots indicate singularity of log g_µ, and many physical observables are related to log g_µ and its derivatives, which can change rapidly near singularities [see, e.g., YL52]. For monomer-dimer systems, Heilmann and Lieb [HL72] established a crucial property for the roots of the polynomial defined below:

∑_{M matching} weight(M) ∏_{v ∉ V(M)} z_v.

Here, for each matching M, we multiply its weight by the variables z_v for v ranging over the monomers. Heilmann and Lieb [HL72] formally showed that if we plug in z_1, . . . , z_n ∈ C such that Re(z_1), . . . , Re(z_n) >
0, then the above expression will not result in zero. This is the crucial property that Lemma 8 and conse-quently Theorems 6 and 7 rely on. This property is also known as Hurwitz-stability [see, e.g., BB09].Note that the polynomial defined by Heilmann and Lieb [HL72] is not homogeneous, i.e., it does notcorrespond to a distribution on ( [ n ] k ) . Unfortunately, homogenization does not preserve Hurwitz-stability;similarly we do not get Hurwitz-stability if we only include matchings M of a particular size. We establishthe weaker, but more robust, notion of sector-stability for these polynomials. Instead, we show thatmonomer distributions, when homogenized or conditioned on size, cannot have roots in a wide enoughsector in Lemma 8.A special case of sector-stability, when the sector is the entire right-half-plane, is equivalent to Hurwitz-stability. For homogeneous polynomials, Hurwitz-stability is the same as another widely studied propertycalled real stability, or more generally, the so-called half-plane property [BBL09]. Under this special notionof sector-stability, the distribution µ is known to exhibit negative correlations [BBL09], and rapid mixingof local random walks for µ had already been established [FM92; AOR16]. Outside of this special case,negative correlation no longer holds. But we show that correlations are still bounded in Theorem 16.As mentioned before, real-stability, a special case of sector-stability for homogeneous polynomials, is awell-studied property of the generating polynomial g µ that already implied efficient sampling/countingalgorithms for µ [see, e.g., AOR16]. However, recent works have shone light on a generalization of real-stability, that does not involve root locations. Anari, Liu, Oveis Gharan, and Vinzant [Ana+19] establishedthat if log g µ ( z , . . . , z n ) is concave, viewed as a function over R n ≥ , then k ↔ ( k − ) down-up randomwalks for sampling from µ would rapidly mix. 
This class of log-concave polynomials has been instrumental in resolving several long-standing questions about matroids [Ana+18; Ana+19; BH19]. Log-concave polynomials are a proper superset of real-stable polynomials, at least in the homogeneous case; this containment was first shown by Gårding [Går59], and this important result has been instrumental in the development of hyperbolic programming [Gül97]. A natural question that arises is whether there is an analogous generalization of log-concavity that is a superset of sector-stable polynomials.

We define such a natural property, which we call fractional log-concavity. We show that in a "local sense" it is actually equivalent to spectral independence of the distribution µ, and then show that sector-stability implies fractional log-concavity, establishing an extension of the result of Gårding [Går59].

Definition 18 (Fractional Log-Concavity). We call the polynomial g_µ(z_1, . . . , z_n) fractionally log-concave with parameter α ∈ [0, 1], if log g_µ(z_1^α, . . . , z_n^α) is concave, viewed as a function over R^n≥0.

Note that for α = 1, this is the same as log-concavity. We show the following local equivalence between spectral independence and fractional log-concavity.
Proposition 19.
Suppose that µ : ([n] choose k) → R≥0 is a distribution, and define the n × n correlation matrix Ψ as

Ψ_{i,j} := P_{S∼µ}[j ∈ S | i ∈ S] − P_{S∼µ}[j ∈ S].

Then the maximum eigenvalue of Ψ is bounded by O(1) if and only if the polynomial g_µ is fractionally log-concave around the point z = (1, . . . , 1) for a parameter α > Ω(1).

A priori, this equivalence concerns only the special point (1, . . . , 1). However, sector-stability is preserved under the change of variables z_i ↦ λ_i z_i, where λ_1, . . . , λ_n are positive reals; this is because sectors in the complex plane are preserved under such scalings. This allows us to map any point in R^n≥0 to the special point (1, . . . , 1). Using this we establish an extension of the result of Gårding [Går59].

Theorem 20. Suppose that g_µ is sector-stable for a sector of aperture Ω(1). Then g_µ is fractionally log-concave for a parameter α ≥ Ω(1).

As a corollary of this result we prove bounds similar to those obtained by Anari, Oveis Gharan, and Vinzant [AOV18], relating the entropy of fractionally log-concave, and consequently sector-stable, distributions with the sum of their marginal entropies; see Section 6.

While fractional log-concavity around the point (1, . . . , 1) is equivalent to a bound on the eigenvalues of the correlation matrix Ψ, it does not imply a bound for the conditioned distributions µ_T. However, fractional log-concavity at all points in R^n≥0 does. This is because the polynomial for conditional distributions µ_T can be obtained as the following limit (w.l.o.g. taking T = {1, . . . , |T|}):

g_{µ_T} ∝ lim_{λ→∞} g_µ(λz_1, . . . , λz_{|T|}, z_{|T|+1}, . . . , z_n) / λ^{|T|}.

Scaling the variables or the polynomial, and taking limits, all preserve fractional log-concavity.

Corollary 21. If µ : ([n] choose k) → R≥0 has a fractionally log-concave generating polynomial with parameter α = Ω(1), or a sector-stable polynomial with a sector of aperture Ω(1), then for all conditioned distributions µ_T, the correlation matrix has maximum eigenvalue O(1).

This work establishes a number of examples of fractionally log-concave polynomials, but all of our examples are also sector-stable. We leave the question of finding examples of fractionally log-concave polynomials beyond sector-stability to future work. However, we make the following concrete conjecture, in line with a conjecture of Mihail and Vazirani [MV89] on the expansion of 0/1 polytopes.
Conjecture 22.
Suppose that µ is the uniform distribution on a subset of the hypercube F ⊆ {0, 1}^n, such that the convex hull conv(F) has edges of bounded length O(1). Then we conjecture that the polynomial ∑_{S∈F} µ(S) ∏_{i∈S} z_i is fractionally log-concave for a parameter α > Ω(1).

Matroids are a special case of this conjecture, and their log-concavity has already been established [AOV18]. However, this conjecture is considerably more general, encompassing combinatorial objects such as delta-matroids, Coxeter matroids, and more [BGW03].
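Fractional log-concavity (Definition 18) can be spot-checked numerically on small examples. The following sketch is our own illustration, not part of the paper: it takes the bases-generating polynomial of the uniform matroid U(2, 4), which is log-concave (α = 1) by [AOV18], and verifies via finite differences that the Hessian of log g at the all-ones point is negative semidefinite. Function names and tolerances here are our own choices.

```python
import numpy as np
from itertools import combinations

def g(z):
    # Bases-generating polynomial of the uniform matroid U(2, 4):
    # the elementary symmetric polynomial e_2(z_1, ..., z_4).
    return sum(z[i] * z[j] for i, j in combinations(range(4), 2))

def hessian_of_log(f, x, h=1e-4):
    """Central finite-difference Hessian of log f at the point x."""
    n = len(x)
    lf = lambda y: np.log(f(y))
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i], np.eye(n)[j]
            H[i, j] = (lf(x + h * e_i + h * e_j) - lf(x + h * e_i - h * e_j)
                       - lf(x - h * e_i + h * e_j) + lf(x - h * e_i - h * e_j)) / (4 * h * h)
    return H

H = hessian_of_log(g, np.ones(4))
eigs = np.linalg.eigvalsh(H)  # all <= 0 means log g is concave at this point
```

For α < 1 one would apply the same check to z ↦ g(z_1^α, . . . , z_n^α). Note that Definition 18 requires concavity over all of R^n≥0, so a single-point check like this can refute, but never certify, fractional log-concavity.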
All of our sampling algorithms are obtained as instantiations of the k ↔ ℓ down-up random walk, for some ℓ = k − O(1), applied to an appropriate formulation of the target distribution µ; see Definition 1. Unlike prior applications of spectral independence, we have to consider the k ↔ ℓ random walk with k − ℓ > 1. For example, consider the distribution of monomers in a monomer-dimer system. As we have established, we view this distribution over size-|V| subsets of the ground set V × {0, 1}, where V is the set of vertices. The k ↔ (k − 1) random walk then becomes the following procedure, known as the (single-site) Glauber dynamics:

Start with monomer set S_0
for t = 0, 1, . . . do
    Select vertex v ∈ V uniformly at random
    Select S_{t+1} between S_t − {v} and S_t ∪ {v} randomly with probability ∝ µ(resulting set)

[Figure 8: The k ↔ (k − 2) random walk on monomers avoids the parity issue. In each round two vertices can change their membership in the monomer set. This is an instance of the multi-site Glauber dynamics.]

It is not hard to see that the cardinality of all monomer sets in a graph has a constant parity. This means that no transition is possible from a monomer set S to another set S′ that differs from it in exactly one vertex. Therefore the k ↔ (k − 1) walk produces the constant sequence S_0, S_1 = S_0, . . . and obviously does not mix. Note, however, that considering a higher value of k − ℓ gets around this parity issue; see Fig. 8. We show that fractional log-concavity, and consequently sector-stability, imply rapid mixing of the k ↔ ℓ random walk for some ℓ = k − O(1). The following is the result of slight modifications of arguments by Alev and Lau [AL20].

Theorem 23.
Suppose that µ : ([n] choose k) → R≥0 has a fractionally log-concave generating polynomial with parameter α = Ω(1). Then for some ℓ = k − O(1), the k ↔ ℓ random walk started at the set S_0 gets ǫ-close in total variation distance to the distribution µ in time

t_mix(ǫ) = O(k^{O(1)} · log(1/(ǫ · P_µ[S_0]))).

One has to be careful that log(1/P_µ[S_0]) is not too large in applications. This is achieved by making sure that S_0 has at least a 2^{−poly(n)} probability under µ. In all distributions we study in this paper, this can be achieved easily. For example, in the case of monomer-dimer distributions, by running a maximum-weight matching algorithm, we can find a matching M having the maximum possible weight under the monomer-dimer distribution. Because the number of matchings is at most 2^{poly(n)}, we can safely use the monomer set of this matching as the starting point S_0.

Acknowledgments. We thank Michał Dereziński and Paul Liu for illuminating discussions about existing results on determinantal point processes. We also thank Alexander Barvinok for pointing us to existing results related to sector-stability.
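To make the k ↔ ℓ down-up walk (Definition 1, Theorem 23) concrete, here is a brute-force sketch of our own: one step drops k − ℓ uniformly random elements and then resamples a k-superset with probability proportional to µ. Everything is enumerated explicitly (exponential in n), so this is for illustration only; the helper names are ours. We verify that µ is stationary for the resulting transition matrix.

```python
import numpy as np
from itertools import combinations
from math import comb

def down_up_step(n, k, l, mu):
    """Exact transition matrix of the k <-> l down-up walk for a density mu
    indexed by the k-subsets of [n] in lexicographic order."""
    K = list(combinations(range(n), k))
    L = list(combinations(range(n), l))
    D = np.zeros((len(K), len(L)))  # drop k - l uniformly random elements
    U = np.zeros((len(L), len(K)))  # resample a k-superset proportional to mu
    for a, S in enumerate(K):
        for b, T in enumerate(L):
            if set(T) <= set(S):
                D[a, b] = 1 / comb(k, l)
                U[b, a] = mu[a]
    U /= U.sum(axis=1, keepdims=True)  # normalize: U(T, S) = mu(S) / sum over S' >= T
    return D @ U

# Toy example: the uniform distribution over 2-subsets of [4], with l = 1.
mu = np.ones(comb(4, 2)) / comb(4, 2)
P = down_up_step(4, 2, 1, mu)
```

Since the down and up operators are in detailed balance w.r.t. µ (Proposition 43 below), the product is a Markov chain with µ as its stationary distribution, which the stationarity check µP = µ confirms on this toy instance.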
We use Z≥0 to denote the set of nonnegative integers {0, 1, . . .}. For a subset S of R^n, we use conv(S) to denote the convex hull of S. We use [n] to denote {1, . . . , n}. For a set U we let (U choose k) denote the family of k-element subsets of U. When n is clear from context, we use 1_S ∈ R^n to denote the indicator vector of the set S ⊆ [n], having a coordinate of 0 everywhere except for elements of S, where the coordinate is 1.

For two measures µ, ν defined on the same state space Ω, we define their total variation distance as

d_TV(µ, ν) = (1/2) ∑_{ω∈Ω} |µ(ω) − ν(ω)| = max{P_µ[S] − P_ν[S] | S ⊆ Ω}.

The total variation distance is a special case of a more general class of "distance measures" called f-divergences.

Definition 24 (f-Divergence). For a convex function f : R≥0 → R, define the f-divergence between two distributions µ and ν on the same state space as follows:

D_f(ν ‖ µ) = E_{ω∼µ}[f(ν(ω)/µ(ω))] − f(E_{ω∼µ}[ν(ω)/µ(ω)]).

Note that by Jensen's inequality this quantity is always nonnegative. Also notice that if µ and ν are normalized distributions, the second term is just f(1). In this work we will mostly deal with the case of f(x) = x², where D_f(·‖·) is also known as the variance. However, we state some results in full generality in terms of arbitrary f-divergences, in the hope that they will find use in future work.

A Markov chain on a state space Ω is defined by a row-stochastic matrix P ∈ R^{Ω×Ω}. We view distributions µ on Ω as row vectors, and as such µP is the distribution after one transition according to P, if we started from a sample of µ. A stationary distribution µ for the Markov chain P is one that satisfies µP = µ. Under mild assumptions on P (ergodicity), stationary distributions are unique and the distribution νP^t converges to this stationary distribution as t → ∞ [LP17].
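Definition 24 is straightforward to compute directly. The sketch below (our own toy example, with made-up distributions) evaluates the f-divergence for f(x) = x², i.e., the variance of the density ratio under µ, and also applies a row-stochastic matrix to both arguments, which by the data processing inequality (Proposition 26 below) can only shrink the divergence.

```python
import numpy as np

def f_divergence(nu, mu, f):
    """D_f(nu || mu) = E_mu[f(nu/mu)] - f(E_mu[nu/mu])  (Definition 24).
    Assumes mu > 0 wherever nu > 0."""
    r = nu / mu
    return float(mu @ f(r) - f(mu @ r))

mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.2, 0.3, 0.5])
chi2 = f_divergence(nu, mu, lambda x: x ** 2)  # variance of nu/mu under mu

# Any row-stochastic matrix contracts the divergence (data processing).
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
contracted = f_divergence(nu @ P, mu @ P, lambda x: x ** 2)
```

For normalized distributions E_µ[ν/µ] = 1, so with f(x) = x² the divergence reduces to ∑_ω ν(ω)²/µ(ω) − 1, the χ²-divergence.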
We refer the reader to [LP17] for a detailed treatment of Markov chain analysis.

A popular method for the analysis of Markov chains is via functional inequalities, often inequalities relating f-divergences before and after one transition of the Markov chain. We are specifically interested in contraction of the f-divergence. We state this contraction for (potentially non-square) row-stochastic operators for generality.

Definition 25.
We say that a row-stochastic matrix P ∈ R^{Ω×Ω′} contracts f-divergence w.r.t. a background distribution µ : Ω → R≥0 by a factor of α if for all other distributions ν : Ω → R≥0, we have

D_f(νP ‖ µP) ≤ α · D_f(ν ‖ µ).

We remark that all row-stochastic operators P have contraction with factor 1, and this property is only useful for α < 1.

Proposition 26 (Data Processing Inequality). For all row-stochastic matrices P ∈ R^{Ω×Ω′} and all distributions µ, ν : Ω → R≥0, we have

D_f(νP ‖ µP) ≤ D_f(ν ‖ µ).

For a Markov chain P, we define the mixing time from a starting distribution ν as the first time t such that νP^t gets close to the stationary distribution µ:

t_mix(P, ν, ǫ) = min{t | d_TV(νP^t, µ) ≤ ǫ}.

We drop P and ν if they are clear from context. If ν is the Dirac measure on a single point ω, we write t_mix(P, ω, ǫ) for the mixing time. When the mixing time is referenced without mentioning ǫ, we imagine that ǫ is set to a reasonably small constant (such as 1/4). This is justified by the fact that the growth of the mixing time in terms of 1/ǫ can be at most logarithmic [LP17].

Contraction inequalities, combined with companion inequalities relating d_TV and f-divergences, allow one to bound the mixing time of a Markov chain. In particular, for f(x) = x², one has the relationship

d_TV(ν, µ) ≤ O(√(D_{x²}(ν ‖ µ))),

and as a result we get

Proposition 27 ([see, e.g., LP17]). Suppose that a Markov chain P with stationary distribution µ has α-factor contraction in D_{x²}(·‖·). Then the mixing time of P started from a point ω satisfies

t_mix(P, ω, ǫ) ≤ O(log(1/(ǫ P_µ[ω])) / log(1/α)) ≤ O((1/(1 − α)) · log(1/(ǫ · P_µ[ω]))).

We use the following classic result from elementary complex analysis [see, e.g., Lan13].
Lemma 28 (Schwarz's lemma). Let D = {z ∈ C | |z| < 1} be the open unit disk in the complex plane C centered at the origin, and let f : D → C be a holomorphic map such that f(0) = 0 and |f(z)| ≤ 1 on D. Then |f′(0)| ≤ 1.

Theorem 29 (Courant–Fischer Theorem). Let A ∈ R^{n×n} be a Hermitian matrix with eigenvalues λ_1 ≥ λ_2 ≥ · · · ≥ λ_n. Then

λ_k(A) = min_U max_v ⟨v, Av⟩,

where the minimum is taken over all (n − k + 1)-dimensional subspaces U ⊆ R^n and the maximum is taken over all vectors v ∈ U with ⟨v, v⟩ = 1.

Theorem 30.
Let A ∈ R^{n×m}, B ∈ R^{m×n} where m ≥ n. Then the spectrum of BA (as a multiset) is precisely the union of the spectrum of AB (as a multiset) with m − n copies of 0.

We use F[z_1, . . . , z_n] to denote n-variate polynomials with coefficients from F, where we usually take F to be R or C. We denote the degree of a polynomial g by deg(g). We call a polynomial homogeneous of degree k if all nonzero terms in it are of degree k. We define a λ-scaling, or an external field of λ ∈ F^n applied to a polynomial g, to be the polynomial g(λ_1 z_1, . . . , λ_n z_n). If g was the generating polynomial of a distribution µ, we denote the same scaling applied to µ by λ ⋆ µ.

The main workhorse behind our main results are polynomials that avoid roots in certain regions of the complex plane.

Definition 31 (Stability). For an open subset U ⊆ C^n, we call a polynomial g ∈ C[z_1, . . . , z_n] U-stable if

(z_1, . . . , z_n) ∈ U ⟹ g(z_1, . . . , z_n) ≠ 0.

By convention, we also call the identically-zero polynomial U-stable. This ensures that limits of U-stable polynomials are U-stable. For convenience, when n is clear from context, we abbreviate stability w.r.t. regions of the form U × U × · · · × U, where U ⊆ C, simply as U-stability.

Our choice of the region U in this work is a product of open sectors in the complex plane.

Definition 32 (Sectors). We name the open sector of aperture απ centered around the positive real axis Γ_α:

Γ_α := {exp(x + iy) | x ∈ R, y ∈ (−απ/2, απ/2)}.

With these definitions, Definition 3 is the same as Γ_α-stability for a suitable parameter α. Note that Γ_1 is the right-half-plane, and Γ_1-stability is the same as the classically studied Hurwitz-stability [see, e.g., Brä07]. Another closely related notion is that of real-stability, where the region U is the upper-half-plane {z | Im(z) > 0} [see, e.g., BBL09]. Note that for homogeneous polynomials, stability w.r.t. U is the same as stability w.r.t. any rotation/scaling of U; so Hurwitz-stability and real-stability are the same for homogeneous polynomials.

Consider an open half-plane H_θ = {e^{−iθ} z | Im(z) > 0} ⊆ C. A polynomial g(z_1, · · · , z_n) ∈ C[z_1, · · · , z_n] is H_θ-stable if g does not have roots in H_θ^n. We call H_0 and H_{π/2} the upper half-plane and right half-plane respectively. We say g is Hurwitz-stable if it is H_{π/2}-stable. We say g is real-stable if it is H_0-stable and has real coefficients. We observe that for homogeneous polynomials, the definition of H_θ-stability is equivalent for all angles θ.

Lemma 33 (Lemma 2.3, [BB09]). Suppose that f_j ∈ C[z_1, · · · , z_n] for all j ∈ N is U-stable for an open set U ⊆ C^n and that f is the limit, uniformly on compact sets, of the sequence {f_j}_{j∈N}. Then f is either U-stable or identically equal to 0.

In particular, if f_j has bounded degree for all j ∈ N, and the sequence {f_j}_{j∈N} converges to f coefficient-wise, then f_j converge to f uniformly on compact sets.

Proposition 34 (Polarization, [BBL09]). For an element κ of N^n let

R_κ[z_1, · · · , z_n] = {polynomials in R[z_i]_{1≤i≤n} of degree at most κ_i in z_i for all i},
R^a_κ[z_{ij}] = {multi-affine polynomials in R[z_{ij}]_{1≤i≤n, 1≤j≤κ_i}}.

The polarization map Π↑_κ is a linear map that sends the monomial z^α = ∏_{i=1}^n z_i^{α_i} to the product

(1/(κ choose α)) ∏_{i=1}^n (elementary symmetric polynomial of degree α_i in the variables {z_{ij}}_{1≤j≤κ_i}),

where (κ choose α) = ∏_{i=1}^n (κ_i choose α_i). A polynomial g ∈ R_κ[z_i]_{1≤i≤n} with nonnegative coefficients is real-stable if and only if its polarization Π↑_κ(g) is also real-stable.

Taking the polarization of z^k with κ = n, we obtain the following well-known result.

Corollary 35.
For k ≤ n, the k-th elementary symmetric polynomial in n variables e_k(z_1, · · · , z_n) is real-stable/Hurwitz-stable.

The following theorems will be useful in the proof of Theorem 10.
Theorem 36.
Let g(z_1, · · · , z_n) ∈ R[z_1, · · · , z_n] be Hurwitz-stable. Let g_e (resp. g_o) be the even (resp. odd) part of g, i.e., the sum of terms c_α z^α whose total degree |α| is even (resp. odd). Then g_e and g_o are either identically 0 or Hurwitz-stable.

Proof. We have g = g_e + g_o. Replace z_j with iy_j where y_j ∈ H_0. Let h({y_j}_{j=1}^n) := g({iy_j}_{j=1}^n), h_e({y_j}_{j=1}^n) := g_e({iy_j}_{j=1}^n) and h_o({y_j}_{j=1}^n) := i^{−1} g_o({iy_j}_{j=1}^n); then h_e, h_o are polynomials with real coefficients, and h is upper-half-plane stable.

We have h = h_e + i h_o, and this is the unique way to write h as h_1 + i h_2 where the h_j are polynomials with real coefficients, for j ∈ {1, 2}. By [BB09, Corollary 2.4], h_e and h_o are real-stable or identically 0. Thus g_e, g_o are Hurwitz-stable or identically 0.

Theorem 37 ([BBL09], Proposition 3.2). Let A_1, · · · , A_n be (complex) positive semi-definite matrices and let B be a (complex) Hermitian matrix, all matrices being of the same size m × m.
1. The polynomial f(z_1, · · · , z_n) = det(z_1 A_1 + · · · + z_n A_n + B) is either identically zero or real-stable;
2. If B is also positive semi-definite then f has all non-negative coefficients.

Lemma 38.
Consider A ∈ R^{n×n} such that A + A^T is positive semi-definite. Let f(z_1, · · · , z_n) = ∑_{S⊆[n]} z^{[n]\S} det(A_{S,S}). Then f has non-negative coefficients, and is either identically 0 or Hurwitz-stable.

Proof. Clearly, A + A^T is positive semi-definite, so A is a P_0-matrix (see [Gar+19, Lemma 1]), i.e., all principal minors of A are nonnegative. The coefficients of f are principal minors of A, and are thus nonnegative.

Let D = (A + A^T)/2 and X = (A − A^T)/2. Note that X is skew-symmetric, thus B := iX is a Hermitian matrix, and D is positive semi-definite. Applying Theorem 37 with A_j = diag(e_j) for j ∈ [n], where e_j is the j-th standard basis vector, A_{n+1} = D and B = iX, we get that g(z_1, · · · , z_n, z_{n+1}) := det(∑_{i=1}^n z_i A_i + z_{n+1} D + iX) is either identically 0 or real-stable.

Let w_j := i^{−1} z_j, Z = ∑_{i=1}^n z_i A_i = diag(z_1, · · · , z_n) and W = diag(w_1, · · · , w_n). We can rewrite

g(z_1, · · · , z_n, i) = det(Z + iD + iX) = det(iW + iA) = i^n det(W + A) = i^n ∑_{S⊆[n]} w^{[n]\S} det(A_{S,S}) = i^n f(w_1, · · · , w_n).

If g ≡ 0 then f ≡ 0. Suppose g ≢ 0. Fix arbitrary w_1, · · · , w_n in the right half-plane H_{π/2}. Observe that z_j = i w_j is in the upper half-plane H_0. Real-stability of g implies f(w_1, · · · , w_n) = i^{−n} g(z_1, · · · , z_n, i) ≠ 0. Thus f is Hurwitz-stable.

We also need the following for the proof of Theorem 7.

Theorem 39 ([HL72]). Consider a graph G = G(V, E) on n vertices with edge weights w : E → R≥0 and vertex weights λ : V → R≥0. For S ⊆ V, let m_S := ∑_M weight(M) = ∑_M (∏_{e∈M} w(e) ∏_{v∈V\S} λ(v)), where the sum is taken over all perfect matchings M of S. The following polynomial is Hurwitz-stable:

f(z_1, · · · , z_n) = ∑_{S⊆V} z^{[n]\S} m_S.

A matroid M = (E, I) is a structure consisting of a finite ground set E and a non-empty collection I of independent subsets of E satisfying:
1. If S ⊆ T and T ∈ I, then S ∈ I.
2. If S, T ∈ I and |T| > |S|, then there exists an element i ∈ T \ S such that S ∪ {i} ∈ I.

The rank of a matroid is the size of the largest independent set of that matroid. If M has rank r, any set S ∈ I of size r is called a basis of M. Let B_M ⊆ I denote the set of bases of M. The set of bases B_M of a matroid uniquely defines M.

We say a matroid M is strongly Rayleigh, or satisfies the weak half-plane property, if f(z_1, · · · , z_n) = ∑_{S∈B_M} z^S is real-stable.

For a partition T_1, · · · , T_s of [n] and a tuple (c_1, · · · , c_s) ∈ N^s, the partition matroid M associated with (T_1, · · · , T_s) and (c_1, · · · , c_s) is defined by B_M = {S ⊆ [n] | |S ∩ T_i| = c_i ∀i}.

Here we establish sufficient conditions for rapid mixing of the k ↔ ℓ down-up random walks as defined in Definition 1.

Remark 40.
Our arguments in this section are small tweaks of the local-to-global contraction analyses already found in prior work of Alev and Lau [AL20] and Cryan, Guo, and Mousa [CGM19]; the origin of these types of arguments goes back to the study of high-dimensional expanders [KM16; DK17; KO18], and more sophisticated variants useful in the context of Markov chain analysis can be found in recent works of Chen, Liu, and Vigoda [CLV20b; CLV20a] and Guo and Mousa [GM20]. For the mixing time bounds in this work, the analysis of Alev and Lau [AL20] and the framework built on it by Anari, Liu, and Oveis Gharan [ALO20], dubbed "spectral independence," suffices; however, we choose to state a general local-to-global contraction analysis not found explicitly in prior work, in the hope that it will find use in future applications.

For a distribution µ : ([n] choose k) → R≥0, our goal is to analyze the mixing time of the k ↔ ℓ down-up random walk. We will do this by establishing contraction of f-divergence in these random walks. Similar to prior results on local-to-global analysis of high-dimensional expanders, our goal is to show that "local" contraction of f-divergence (where the down-up walks are applied to a "localization" of µ) implies "global" contraction of f-divergence.

The down-up walks can be written as the composition of two row-stochastic operators known aptly as the down and up operators.

Definition 41 (Down Operator). For a ground set [n] and cardinalities k ≥ ℓ, define the row-stochastic down operator D_{k→ℓ} ∈ R^{([n] choose k) × ([n] choose ℓ)} as

D_{k→ℓ}(S, T) = 1/(k choose ℓ) if T ⊆ S, and 0 otherwise.

This operator, applied to a random set S, produces a uniformly random subset T of size ℓ. The down operators compose in the way one expects, i.e., D_{k→ℓ} D_{ℓ→m} = D_{k→m}. Note that the down operator has no dependence on µ. In contrast, the up operator defined below depends on µ and is actually designed to be the time-reversal of the down operator w.r.t. the background measure µ.
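The composition identity D_{k→ℓ} D_{ℓ→m} = D_{k→m} stated above can be checked mechanically on a small ground set. The sketch below is our own sanity check (the enumeration is exponential and for illustration only):

```python
import numpy as np
from itertools import combinations
from math import comb

def down_operator(n, k, l):
    """Row-stochastic matrix D_{k->l}: from a k-subset of [n], keep a
    uniformly random l-subset. Rows/columns in lexicographic order."""
    rows = list(combinations(range(n), k))
    cols = list(combinations(range(n), l))
    D = np.zeros((len(rows), len(cols)))
    for a, S in enumerate(rows):
        for b, T in enumerate(cols):
            if set(T) <= set(S):
                D[a, b] = 1 / comb(k, l)
    return D

# D_{4->2} composed with D_{2->1} equals D_{4->1} on the ground set [6].
lhs = down_operator(6, 4, 2) @ down_operator(6, 2, 1)
rhs = down_operator(6, 4, 1)
```

The identity holds because dropping two elements and then one more, each uniformly at random, selects the same uniformly random sub-subset as dropping three elements at once.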
Definition 42 (Up Operator). For a ground set [n], cardinalities k ≥ ℓ, and density µ : ([n] choose k) → R≥0, define the up operator U_{ℓ→k} ∈ R^{([n] choose ℓ) × ([n] choose k)} as

U_{ℓ→k}(T, S) = µ(S)/∑_{S′⊇T} µ(S′) if T ⊆ S, and 0 otherwise.

If we name µ_k = µ and more generally let µ_ℓ be µ_k D_{k→ℓ}, then the down and up operators satisfy detailed balance (time-reversibility) w.r.t. the measures µ_k, µ_ℓ. In other words we have

µ_k(S) D_{k→ℓ}(S, T) = µ_ℓ(T) U_{ℓ→k}(T, S).

This property ensures that compositions of the down and up operators have the appropriate µ as a stationary distribution, are time-reversible, and have nonnegative real eigenvalues.

Proposition 43 ([see, e.g., KO18; AL20; ALO20]). The operators D_{k→ℓ} U_{ℓ→k} and U_{ℓ→k} D_{k→ℓ} both define Markov chains that are time-reversible and have nonnegative eigenvalues. Moreover, µ_k and µ_ℓ are respectively their stationary distributions.

Our goal is to show that the down-up walk contracts f-divergence by a multiplicative factor. To this end, it is enough to show contraction of f-divergence under D_{k→ℓ}; this is because, by the data processing inequality (Proposition 26), the operator U_{ℓ→k} cannot increase the f-divergence.

The key ingredient in local-to-global arguments is the "local contraction" assumption. Here, one assumes that D_{2→1} contracts f-divergences w.r.t. the background measure µ_2. The goal is to go from this assumption, and similar ones for conditionings of µ (see Definition 15), to contraction of f-divergence for D_{k→ℓ}. This is the natural "f-divergence" generalization of the notion of local spectral expansion and its implications for global expansion [see KO18].

First we define the notion of the link of the distribution µ w.r.t. a set T [see, e.g., KO18]. This notion is almost the same as the notion of conditioned distributions µ_T (see Definition 15), except we remove the set T as well.

Definition 44.
For a distribution µ : ([n] choose k) → R≥0 and a set T ⊆ [n] of size at most k, we define the link of T to be the distribution µ_{−T} : (([n] − T) choose (k − |T|)) → R≥0 which describes the law of the set S − T, where S is sampled from µ conditioned on the event S ⊇ T.

Next we define the notion of local f-divergence contraction for a distribution µ.

Definition 45 (Local f-Divergence Contraction). For a distribution µ : ([n] choose k) → R≥0 and a set T of size at most k − 2, define the local contraction at T to be the smallest number α(T) ≥ 0 such that D_{2→1} contracts f-divergences w.r.t. µ_{−T} D_{(k−|T|)→2} by a factor of α(T). That is, α(T) is the smallest number such that for all ν : (([n] − T) choose 2) → R≥0 we have

D_f(ν D_{2→1} ‖ µ_{−T} D_{(k−|T|)→1}) ≤ α(T) · D_f(ν ‖ µ_{−T} D_{(k−|T|)→2}).

We now show that local contraction of f-divergence results in a bound on the contraction of D_{k→ℓ} operators.

Theorem 46.
Suppose that µ : ([n] choose k) → R≥0 has local f-divergence contraction with contraction factors α(T). Define β(T) = α(T)/(1 − α(T)). For a set T ⊆ [n] of size m define

γ_T := E_{e_1,...,e_m ∼ uniformly random permutation of T}[β(∅) β({e_1}) · · · β({e_1, . . . , e_m})].

Then the operator D_{k→ℓ} has contraction factor at least

1 − 1/max{k · γ_T | T ∈ ([n] choose ℓ−1)}.

Proof. Consider an arbitrary distribution ν : ([n] choose k) → R≥0. The f-divergence D_f(ν ‖ µ) is a difference of two terms, both involving expectations over samples S ∼ µ:

D_f(ν ‖ µ) = E_{S∼µ}[f(ν(S)/µ(S))] − f(E_{S∼µ}[ν(S)/µ(S)]).

Our strategy is to write this difference as a telescoping sum of differences, where elements of S are revealed one by one.

Consider the following process. We sample a set S ∼ µ and uniformly at random permute its elements to obtain X_1, . . . , X_k. Define the random variable

τ_i = f(E[ν(S)/µ(S) | X_1, . . . , X_i]) = f(∑_{S′∋X_1,...,X_i} ν(S′) / ∑_{S′∋X_1,...,X_i} µ(S′)) = f(ν D_{k→i}({X_1, . . . , X_i}) / µ D_{k→i}({X_1, . . . , X_i})).

Note that τ_i is a "function" of X_1, . . . , X_i. It is not hard to see that

D_f(ν ‖ µ) = E[τ_k] − E[τ_0] = ∑_{i=0}^{k−1} (E[τ_{i+1}] − E[τ_i]).

A convenient fact about this telescoping sum is that to obtain D_f(ν D_{k→ℓ} ‖ µ D_{k→ℓ}), one has to just sum over the first ℓ terms instead of k:

D_f(ν D_{k→ℓ} ‖ µ D_{k→ℓ}) = E[τ_ℓ] − E[τ_0] = ∑_{i=0}^{ℓ−1} (E[τ_{i+1}] − E[τ_i]).

This is because the set {X_1, . . . , X_ℓ} is distributed according to µ D_{k→ℓ}. So our goal of showing that D_{k→ℓ} has contraction boils down to showing that the last k − ℓ terms in the telescoping sum are sufficiently large compared to the rest.

Consider applying the assumption of local contraction to the link of the set T = {X_1, . . . , X_i}. From this one can extract that

E[τ_{i+1} | X_1, . . . , X_i] − E[τ_i | X_1, . . . , X_i] ≤ α(T) · (E[τ_{i+2} | X_1, . . . , X_i] − E[τ_i | X_1, . . . , X_i]).

Defining ∆_i = τ_{i+1} − τ_i, the above can be rewritten as

E[∆_i | X_1, . . . , X_i] ≤ α({X_1, . . . , X_i}) · E[∆_i + ∆_{i+1} | X_1, . . . , X_i].

Rearranging yields

E[∆_i | X_1, . . . , X_i] ≤ (α({X_1, . . . , X_i})/(1 − α({X_1, . . . , X_i}))) · E[∆_{i+1} | X_1, . . . , X_i] ≤ β({X_1, . . . , X_i}) · E[∆_{i+1} | X_1, . . . , X_i].

From this we obtain that the quantities

∆_i · β(∅) · β({X_1}) · · · β({X_1, . . . , X_{i−1}})

form a submartingale; this means that we have

E[∆_ℓ · β(∅) · · · β({X_1, . . . , X_{ℓ−1}})] ≥ E[∆_0].

Now, consider an alternative process for generating the ordering X_1, X_2, . . . , X_k. First select S ∼ µ, and partition it into two sets: T of size ℓ − 1, and S − T of size k − ℓ + 1. We then randomly shuffle T and let X_1, . . . , X_{ℓ−1} be the result, and then randomly shuffle S − T and let X_ℓ, . . . , X_k be the result. This process is equivalent to randomly shuffling all elements of S.

The key insight is that ∆_ℓ is only a function of the unordered set T and the ordering of S − T. However, the other factor β(∅) · · · β({X_1, . . . , X_{ℓ−1}}) is only a function of the ordering chosen for T and not of S − T. This means that conditioned on T, these two quantities are independent and we get

E[∆_ℓ · β(∅) · · · β({X_1, . . . , X_{ℓ−1}})] = E_T[E[∆_ℓ | T] · E[β(∅) · · · β({X_1, . . . , X_{ℓ−1}}) | T]].

From the definition of γ_T, we obtain

E[∆_ℓ · β(∅) · · · β({X_1, . . . , X_{ℓ−1}})] ≤ E[∆_ℓ] · max{γ_T | T ∈ ([n] choose ℓ−1)}.

Combining with previous inequalities we obtain

E[∆_ℓ] ≥ E[∆_0] / max{γ_T | T ∈ ([n] choose ℓ−1)}.

Similar inequalities can be obtained with ∆_0 replaced by ∆_1, ∆_2, . . . in the above arguments (with potentially better factors than γ_T, but we ignore this potential improvement). Averaging over these k inequalities we obtain

E[∆_ℓ] ≥ E[∆_0 + · · · + ∆_{k−1}] / max{k · γ_T | T ∈ ([n] choose ℓ−1)} = D_f(ν ‖ µ) / max{k · γ_T | T ∈ ([n] choose ℓ−1)}.

It just remains to note that

D_f(ν ‖ µ) − D_f(ν D_{k→ℓ} ‖ µ D_{k→ℓ}) = E[∆_ℓ + · · · + ∆_{k−1}] ≥ E[∆_ℓ].

Here we used nonnegativity of E[∆_i], which follows from convexity of f and Jensen's inequality. Combining the previous two inequalities and rearranging the terms yields the desired result.

Remark 47. We remark that, similar to prior works, in this paper we only deal with the case where the α(T) contraction factors only depend on the size |T|.
However, we suspect the more general statement we proved here to be useful in potential future applications of this method, especially to distributions µ that "factorize" into two independent distributions when conditioned on an element; some potential examples include distributions over chains in a poset. In these scenarios, the order of conditioning on the elements matters, and we hope that by having E_{orderings}[β(∅) β({e_1}) · · · β({e_1, . . . , e_m})] instead of max_{orderings}{β(∅) β({e_1}) · · · β({e_1, . . . , e_m})}, we get more tractable results.

From this point on, we deal with cases where α(T), β(T) only depend on the cardinality |T|, and as such we write them as α_i, β_i, where i = |T|. Consequently, the global contraction factor we obtained can be rewritten as

1 − 1/(k β_0 β_1 · · · β_{ℓ−1}).

Remark 48. A similar, slightly better, contraction factor can be obtained when β(T) only depends on |T|. In these cases one can simply use E[∆_i] ≤ β_i · E[∆_{i+1}] and obtain the contraction

E[∆_0 + · · · + ∆_{ℓ−1}] / E[∆_0 + · · · + ∆_{k−1}] ≤ (1 + β_{ℓ−1} + · · · + β_1 · · · β_{ℓ−1}) / (1 + β_{k−1} + · · · + β_1 · · · β_{k−1}).

This is essentially the same bound found by Chen, Liu, and Vigoda [CLV20a] and Guo and Mousa [GM20], and the analysis is essentially the same as those at its core. However, this slightly better bound does not produce any meaningful improvement in the mixing time bounds we get in this work, and for simplicity we use the more naive bound.

While it might seem that β_0 · · · β_{ℓ−1} can get exponentially large, in the case of distributions that satisfy spectral independence [ALO20] this product remains polynomially bounded. In particular, one can show [see, e.g., ALO20; CLV20a] that if the correlation matrix (see Definition 14) has O(1)-bounded eigenvalues for the distribution µ and all of its conditionings, then β_i ≃ (1 − O(1/(k − i)))^{−1}.
In particular, as long as k − i is larger than a constant (hidden in the O -notation), then β i is finite an can be roughly approximatedby e O ( ( k − i )) . Thus for k − ℓ larger than an appropriate constant, we have the bound β β · · · β k − ℓ ≃ exp (cid:18) O (cid:18) k + k − + · · · + ℓ (cid:19)(cid:19) ≤ exp ( O ( log k )) = poly ( k ) . In this section, we prove Theorem 16.
Definition 49 (Signed Pairwise Influence/Correlation Matrix). Let µ be a probability distribution over 2^[n] with generating polynomial f(z_1, ..., z_n) = ∑_{S⊆[n]} µ(S) z^S. Let the signed pairwise influence matrix Ψ^inf_µ ∈ R^{n×n} be defined by

    Ψ^inf_µ(i, j) = 0 if j = i, and Ψ^inf_µ(i, j) = P[j | i] - P[j | ī] otherwise,

where P[j | i] = P_{T∼µ}[j ∈ T | i ∈ T], P[j] = P_{T∼µ}[j ∈ T], and P[j | ī] = P_{T∼µ}[j ∈ T | i ∉ T]. Let the correlation matrix Ψ^cor_µ ∈ R^{n×n} be defined by

    Ψ^cor_µ(i, j) = 1 - P[i] if j = i, and Ψ^cor_µ(i, j) = P[j | i] - P[j] otherwise.

In Definition 49, we use the convention that the entry Ψ^inf(i, j) (resp. Ψ^cor(i, j)) is set to 0 if P[j | i] or P[j | ī] (resp. P[j | i]) is not well-defined, i.e., when P[i] = 0 or P[i] = 1. The matrix Ψ^inf_µ was first introduced in [ALO20]. All eigenvalues of Ψ^inf_µ and Ψ^cor_µ are real [see, e.g., ALO20].

We show that Ω(1)-aperture sector-stability of the generating polynomial of µ implies an O(1) bound on the row norms of Ψ^inf_µ and Ψ^cor_µ. The high-level idea is to write the ℓ_1-norm of a row of Ψ^inf as the derivative at 0 of a holomorphic function that maps the unit disk to itself, and then use Schwarz's Lemma (Lemma 28) to derive a bound.

Theorem 50. Consider a multi-affine polynomial f ∈ R_{≥0}[z_1, ..., z_n] that is Γ_α-stable with α ≤ 1. Let µ : 2^[n] → R_{≥0} be the distribution generated by f. Then Ψ^inf_µ and Ψ^cor_µ have bounded row norms. Specifically,

    ∑_j |Ψ^inf_µ(i, j)| ≤ 2/α - 1 and ∑_j |Ψ^cor_µ(i, j)| ≤ 2/α.

As a corollary, the same bounds hold for the maximum eigenvalues, i.e., λ_max(Ψ^inf_µ) ≤ 2/α - 1 and λ_max(Ψ^cor_µ) ≤ 2/α.

Proof. If we can show the first statement, the second follows from

    P[j | i] - P[j] = P[j | i] - (P[j | i] P[i] + P[j | ī] P[ī]) = (1 - P[i]) (P[j | i] - P[j | ī]),

which gives

    ∑_j |Ψ^cor_µ(i, j)| ≤ (1 - P[i]) (1 + ∑_{j≠i} |P[j | i] - P[j | ī]|) ≤ 2/α.

Fix a row i. W.l.o.g., assume i = n. Let h = ∂_i f and g = f|_{z_i = 0}. We can assume w.l.o.g. that neither g nor h is the zero polynomial; if either were, the row would be identically 0 and the statement trivial. Let S := {j ∈ [n] \ {i} | P[j | i] - P[j | ī] < 0}; then

    ∑_{j≠i} |Ψ^inf_µ(i, j)| = ∑_{j∈S} (P[j | ī] - P[j | i]) - ∑_{j∉S} (P[j | ī] - P[j | i]).   (1)

Note that P[j | i] = ∂_j h(~1)/h(~1) and P[j | ī] = ∂_j g(~1)/g(~1). Define ~z ∈ R^{n-1} by

    z_j = y if j ∈ S, and z_j = y^{-1} otherwise.

Let h̄(y) = h(~z) and ḡ(y) = g(~z). Note that ∑_{j∈S} ∂_j h(~1) - ∑_{j∉S} ∂_j h(~1) = h̄'(1), and the same goes for ḡ. This is because for each monomial z^U = z^{U∩S} z^{U\S}, we have

    ( ∑_{j∈S} ∂_j z^U - ∑_{j∉S} ∂_j z^U ) |_{~z=~1} = |U ∩ S| - |U \ S| = ( y^{|U∩S|} (y^{-1})^{|U\S|} )' |_{y=1}.

Therefore,

    ∑_{j≠n} |Ψ^inf_µ(n, j)| = ∑_{j∈S} ( ∂_j g(~1)/g(~1) - ∂_j h(~1)/h(~1) ) - ∑_{j∉S} ( ∂_j g(~1)/g(~1) - ∂_j h(~1)/h(~1) ) = ḡ'(1)/ḡ(1) - h̄'(1)/h̄(1) = (log ḡ - log h̄)' |_{y=1} = φ'(0),   (2)

where φ(x) = log( ḡ(e^x)/h̄(e^x) ) - log( ḡ(1)/h̄(1) ).
Note that φ maps 0 to itself. Let D, H ⊆ C be the centered (open) unit disk and the (open) right half-plane, respectively. For any set Ω ⊆ C, we let Ω̄ denote its closure. The Möbius transformation T : x ↦ (x - 1)/(x + 1) is a conformal map from H onto D. For an angle θ ∈ (0, π), let Ω_θ := {x ∈ C | |Im(x)| < θ} and ϕ_θ : Ω_θ → D, x ↦ T(exp(πx/(2θ))). Note that ϕ_θ(0) = T(1) = 0,

    ϕ'_θ(0) = T'(1) · π/(2θ) = π/(4θ), and (ϕ_θ^{-1})'(0) = 1/ϕ'_θ(0) = 4θ/π.

To bound |φ'(0)|, we show that φ maps Ω_{απ/2} to Ω_{π - απ/2}. Then, for all small ǫ > 0, the map φ̃ := ϕ_{π - απ/2 + ǫ} ∘ φ ∘ ϕ_{απ/2}^{-1} is a holomorphic function that takes the centered unit disk to itself. We use Schwarz's Lemma to bound |φ̃'(0)|, then use this to bound |φ'(0)|.

Let θ := απ/2. Consider x ∈ Ω_θ. Note that the function x ↦ e^x maps Ω_θ to S_α. Also, ḡ(e^x)/h̄(e^x) ∉ -S̄_α: otherwise ḡ(e^x) + z · h̄(e^x) = 0 for some z ∈ S_α, i.e., f vanishes at a point with all coordinates in S_α, which contradicts the S_α-sector-stability of f. In particular, ḡ(e^x)/h̄(e^x) never takes a negative real value, thus the function log( ḡ(e^x)/h̄(e^x) ) is holomorphic, and as argued, |Im(log( ḡ(e^x)/h̄(e^x) ))| ≤ π - θ. Additionally, since g and h have nonnegative coefficients and are not the zero polynomial, ḡ(1) and h̄(1) are positive reals and log( ḡ(1)/h̄(1) ) is a real number. Therefore,

    |Im(φ(x))| = |Im(log( ḡ(e^x)/h̄(e^x) ))| ≤ π - θ.

Hence, φ maps Ω_θ to Ω_{π - θ + ǫ} for every ǫ > 0. Consider the holomorphic map φ̃ = ϕ_{π - θ + ǫ} ∘ φ ∘ ϕ_θ^{-1} that takes D to itself. Since φ and the maps ϕ_* all take 0 to itself, so does φ̃. By Schwarz's Lemma (Lemma 28), |φ̃'(0)| ≤ 1. On the other hand,

    φ̃'(0) = ϕ'_{π - θ + ǫ}(0) · φ'(0) · (ϕ_θ^{-1})'(0) = ( π/(4(π - θ + ǫ)) ) · φ'(0) · (4θ/π) = ( θ/(π - θ + ǫ) ) · φ'(0),

thus |φ'(0)| ≤ (π + ǫ)/θ - 1. Taking ǫ → 0 gives |φ'(0)| ≤ π/θ - 1 = 2/α - 1. Substituting back into (2) gives the desired bound.
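The matrices in Theorem 50 are easy to evaluate exactly for small state spaces. The following illustrative sketch (our own, not from the paper's algorithms) computes Ψ^inf_µ and Ψ^cor_µ by brute-force enumeration for the disjoint-blocks distribution, whose generating polynomial z_1···z_k + z_{k+1}···z_{2k} is Γ_{1/k}-sector-stable, and checks the row-norm bounds 2/α − 1 and 2/α with α = 1/k:

```python
def influence_matrices(n, prob):
    # prob: dict mapping frozenset -> probability; returns (Psi_inf, Psi_cor).
    def marginals(cond_i=None, present=None):
        tot, p = 0.0, [0.0] * n
        for S, w in prob.items():
            if cond_i is not None and ((cond_i in S) != present):
                continue
            tot += w
            for j in S:
                p[j] += w
        return None if tot == 0 else [x / tot for x in p]

    P = marginals()
    psi_inf = [[0.0] * n for _ in range(n)]
    psi_cor = [[0.0] * n for _ in range(n)]
    for i in range(n):
        p_in, p_out = marginals(i, True), marginals(i, False)
        for j in range(n):
            if j == i:
                psi_cor[i][j] = 1.0 - P[i]
            else:
                if p_in is not None and p_out is not None:
                    psi_inf[i][j] = p_in[j] - p_out[j]  # P[j|i] - P[j|i-bar]
                if p_in is not None:
                    psi_cor[i][j] = p_in[j] - P[j]      # P[j|i] - P[j]
    return psi_inf, psi_cor

# mu uniform over two disjoint blocks of size k (the example also used in the
# tightness discussion below).
k = 3
n = 2 * k
mu = {frozenset(range(k)): 0.5, frozenset(range(k, n)): 0.5}
psi_inf, psi_cor = influence_matrices(n, mu)
inf_row_norm = max(sum(abs(x) for x in row) for row in psi_inf)
cor_row_norm = max(sum(abs(x) for x in row) for row in psi_cor)
# With alpha = 1/k, Theorem 50 predicts <= 2k - 1 and <= 2k; the first is tight.
assert abs(inf_row_norm - (2 * k - 1)) < 1e-9
assert cor_row_norm <= 2 * k + 1e-9
print(inf_row_norm, cor_row_norm)
```

For this distribution the Ψ^inf row norm equals 2k − 1 exactly, matching the theorem's bound.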
Remark 51. Theorem 50's bounds on ‖Ψ^inf_µ‖_∞, ‖Ψ^inf_µ‖, and ‖Ψ^cor_µ‖_∞ are tight, even for homogeneous µ. For example, consider

    f_µ(z_1, ..., z_{rk}) = ∑_{i=0}^{r-1} ∏_{j=ik+1}^{(i+1)k} z_j.

For r = 2, we have

    Ψ^inf_µ = [ J_k, -J_k ; -J_k, J_k ] - I_{2k},

and ‖Ψ^inf_µ‖_∞ = ‖Ψ^inf_µ‖ = 2k - 1. For arbitrary r we get Ψ^cor_µ = diag(J_k, ..., J_k) - (1/r) J_{rk}, with J being the all-ones matrix, and

    ‖Ψ^cor_µ‖_∞ = k(1 - 1/r) + (r - 1)k/r = 2k(1 - 1/r) → 2k as r → ∞.

The bound on ‖Ψ^cor_µ‖ is tight in general; for example, consider f(z_1, ..., z_k) = ǫ z_1 ··· z_k + (1 - ǫ) for small ǫ > 0. It is, however, not tight for homogeneous distributions µ.

In this section, we show how certain natural operations affect the sector-stability of polynomials. In Corollary 60, we show that the degree-k part of a Hurwitz-stable (i.e., Γ_1-stable) polynomial is Γ_{1/2}-stable. In Theorem 64, we show that given a homogeneous real-stable polynomial g, the sum of terms in g whose (T_1, ..., T_k)-degree is equal to (c_1, ..., c_k) is Γ_{1/2^k}-stable. These results are important ingredients in the proofs of Theorems 7 and 10 and Corollary 13.

Proposition 52.
The following operations preserve Γ_α-sector-stability:

1. Specialization: g(z_1, ..., z_n) ↦ g(a, z_2, ..., z_n), where a ∈ Γ̄_α.

2. Scaling: g ↦ g ⋆ λ, if λ_i ∈ R_{≥0} for all i ∈ [n].

3. Dual: g ↦ g*, where g(z) = ∑_{S⊆[n]} c_S z^S and g*(z_1, ..., z_n) := ∑_{S⊆[n]} c_S z^{[n]\S}.

Proof. Part 1 for a ∈ Γ_α holds by definition; for a on the closed boundary of Γ_α (including a = 0 and the limit a → ∞) it follows from Lemma 33 by taking limits. Part 2 holds by the definition of sector-stability. For part 3,

    g*(z_1, ..., z_n) = z_1 ··· z_n · g(z_1^{-1}, ..., z_n^{-1}) ≠ 0 for z_1, ..., z_n ∈ Γ_α,

where we use the Γ_α-stability of g and the fact that z_1^{-1}, ..., z_n^{-1} are also in Γ_α.

Lemma 53 (Homogenization). If the multi-affine polynomial g(z_1, ..., z_n) := ∑_{S⊆[n]} c_S z^S is Γ_α-stable, then its homogenization

    g^hom(z_1, ..., z_n, w_1, ..., w_n) := ∑_{S⊆[n]} c_S z^S w^{[n]\S}

is multi-affine, homogeneous of degree n, and Γ_{α/2}-stable.

Proof. One can rewrite g^hom as

    g^hom(z_1, ..., z_n, w_1, ..., w_n) = w_1 ··· w_n · g(z_1/w_1, ..., z_n/w_n).

For any z_1, ..., z_n, w_1, ..., w_n ∈ Γ_{α/2}, we have z_i/w_i ∈ Γ_α for all i ∈ [n], thus the RHS is nonzero by the Γ_α-stability of g.

Corollary 54.
Consider a graph G = (V, E) on n vertices with edge weights w : E → R_{≥0} and vertex weights λ : V → R_{≥0}. For S ⊆ V, let

    m_S := ∑_M weight(M) = ∑_M ( ∏_{e∈M} w(e) · ∏_{v∉S} λ(v) ),

where the sum is taken over all perfect matchings M of S. The following polynomial is Γ_{1/2}-stable:

    f(z_1, ..., z_n, y_1, ..., y_n) = ∑_{S⊆V} y^S z^{[n]\S} m_S.

The class of sector-stable polynomials was studied in [SS19], where the authors proved that symmetrization preserves the sector-stability of univariate polynomials with nonnegative coefficients. Given a univariate complex polynomial p(z) = a_n z^n + ... + a_1 z + a_0, its symmetrization with n variables is defined as

    P(z_1, ..., z_n) = ∑_{k=0}^n ( a_k / binom(n, k) ) S_k(z_1, ..., z_n),

where S_k(z_1, ..., z_n) = ∑_{1 ≤ i_1 < ... < i_k ≤ n} z_{i_1} ··· z_{i_k}. By definition, P(z, ..., z) = p(z). We call (z_1, ..., z_n) a solution of p if P(z_1, ..., z_n) = 0. Define a closed set Ω ⊆ C* to be a locus holder of p if every solution of p has a point in Ω. Call a minimal-by-inclusion locus holder Ω a locus of p. For examples and properties of locus holders see [SS14]. Note that any polynomial is stable with respect to the complement of its locus. The next result shows that the symmetrization of a univariate sector-stable polynomial with nonnegative coefficients is sector-stable. Note that this result is not true if we drop the assumption of nonnegative coefficients.

Proposition 55 (Theorem 1.1 [SS19]). Let p(z) be a univariate Γ_α-sector-stable polynomial with nonnegative coefficients. Then the complement of Γ_α is a locus holder of p(z); equivalently, the symmetrization of p is Γ_α-sector-stable.

Given a polynomial g and a set S ⊆ [n], the S-degree of a monomial is its total degree with respect to the variables indexed by S, and we write g^S_k for the sum of the terms of g whose S-degree equals k. When the set S and g are specified, let k_max, k_min be the maximum and minimum S-degree among the monomials of g.

Lemma 56.
Let U := ∏_i Γ_{α_i} ⊆ C^n and S ⊆ [n]. If g ∈ C[z_1, ..., z_n] is U-stable, then g^S_{k_max} and g^S_{k_min} are also U-stable.

Proof. We may re-index the z_i so that S = [t] for some t ≤ n; w.l.o.g., assume that this is already done. For simplicity of notation, below we omit the superscript S. Observe that U is open, and g_{k_max}, g_{k_min} are not identically zero, by definition. For λ ∈ R_{>0}, let

    g_λ(z_1, ..., z_n) := λ^{-k_max} g(λ z_1, ..., λ z_t, z_{t+1}, ..., z_n) = g_{k_max}(z_1, ..., z_n) + ∑_{k=k_min}^{k_max-1} g_k(z_1, ..., z_n) / λ^{k_max - k}.

Clearly, g_λ is U-stable, and lim_{λ→∞} g_λ = g_{k_max}, so by Lemma 33, g_{k_max} is U-stable. Similarly, g_{k_min} = lim_{λ→0^+} λ^{-k_min} g(λ z_1, ..., λ z_t, z_{t+1}, ..., z_n) is U-stable.

As a consequence, we can prove that partial derivatives preserve sector-stability.

Corollary 57.
If p(z_1, ..., z_n) is a multiaffine sector-stable polynomial, then the partial derivative of p with respect to any variable z_i, i ∈ [n], which we denote by ∂_i p, is sector-stable (or identically zero).

Remark 58. In general, taking derivatives of non-multiaffine polynomials does not preserve sector-stability. For example, let x, y, z_1, ..., z_n be variables and consider the polynomial

    p = (x z_1 + y z_2)(x z_2 + y z_3) ··· (x z_n + y z_1).

This is Γ_{1/2}-sector-stable. Now differentiate with respect to each z_i once, and then set each z_i to zero. What remains is x^n + y^n, which is only Γ_{1/n}-sector-stable.

Theorem 59 (Hurwitz-stable intersected with one partition constraint). Suppose g(z_1, ..., z_n) is a Γ_1-stable polynomial with constant parity (the degrees of all monomials are even, or all odd). Then g_k is Γ_{1/2}-stable or identically 0. More precisely, for k ∈ [k_min, k_max] with k ≡ k_max (mod 2), g_k is Γ_{1/2}-stable.

Proof. Lemma 56 with S = [n] and U = Γ_1^n implies that g_{k_max} and g_{k_min} are Γ_1-stable. W.l.o.g., we assume k_max > k_min ≥ 0; otherwise there is nothing to prove. Fix arbitrary z_1, ..., z_n ∈ Γ_{1/2}. Let h(z_0) = z_0^{-k_min} g(z_1 z_0, z_2 z_0, ..., z_n z_0). Note that h(0) = g_{k_min}(z_1, ..., z_n) ≠ 0 by the Γ_1-stability of g_{k_min}. Note also that, by the constant-parity assumption, all terms of h have even degree in z_0, and the highest-degree term is g_{k_max}(z_1, ..., z_n) z_0^{k_max - k_min}, with g_{k_max}(z_1, ..., z_n) ≠ 0 by the Γ_1-stability of g_{k_max}. Since only even powers of z_0 appear, substituting z_0^2 = y yields a polynomial h̃(y) with h̃(z_0^2) = h(z_0); we claim h̃(y) ≠ 0 for y ∈ S̄_1 ∪ {0}. Indeed, h̃(0) = h(0) ≠ 0, and for y ∈ S̄_1 \ {0} we can take z_0 = y^{1/2} ∈ S̄_{1/2}, so that (z_i z_0)_{i=1}^n ∈ Γ_1^n and h̃(y) = h(z_0) ≠ 0 by the Γ_1-stability of g.

Let λ_1, ..., λ_d be the roots of h̃(y), where d := deg(h̃) = (k_max - k_min)/2. As argued, λ_i ∈ C \ (S̄_1 ∪ {0}), the open half-plane {z | Re(z) < 0}. Fix k ∈ [k_min, k_max) with k ≡ k_max (mod 2). Up to sign,

    g_k(z_1, ..., z_n) = g_{k_max}(z_1, ..., z_n) · e_t(λ_1, ..., λ_d) with t := (k_max - k)/2 ∈ N,

and by the half-plane stability of the elementary symmetric polynomials (Corollary 35), e_t(λ_1, ..., λ_d) ≠ 0. Hence g_k(z_1, ..., z_n) ≠ 0.

The next corollary yields sampling for nonsymmetric DPPs given by a matrix A ∈ R^{n×n} where A + A^T is PSD, and sampling from monomer-dimer systems of a fixed size.

Corollary 60.
Suppose g(z_1, ..., z_n) ∈ R[z_1, ..., z_n] is Γ_1-stable. Then g_k is either identically 0 or Γ_{1/2}-stable.

Proof. Define the even and odd parts g_e and g_o of g as in Theorem 36. Unless identically zero, g_e and g_o are Γ_1-stable by Theorem 36. The claim follows by applying Theorem 59 to g_e (resp. g_o) if k is even (resp. odd).

Lemma 38 and Corollary 60 together imply the following corollaries.

Corollary 61.
Consider A ∈ R^{n×n} where A + A^T is positive semi-definite. Then

    f_k(z_1, ..., z_n) = ∑_{S ∈ ([n] choose k)} det(A_{S,S}) z^{[n]\S} and its dual f*_k(z_1, ..., z_n) = ∑_{S ∈ ([n] choose k)} det(A_{S,S}) z^S

are either identically 0 or Γ_{1/2}-stable, and have nonnegative real coefficients.

Corollary 62. Consider a graph G = (V, E) on n vertices. For S ⊆ V, let m_S be the number of perfect matchings of S. Then

    f_k(z_1, ..., z_n) = ∑_{S ∈ ([n] choose k)} m_S z^{[n]\S}

and its dual f*_k are either identically 0 or Γ_{1/2}-stable.

Lemma 63.
Suppose that p(x, y) is a homogeneous polynomial with coefficients in C, of the form p(x, y) = ∑_i c_i x^i y^{d-i}. If p is (Γ̄_α × Γ̄_β)-stable for α + β ≥ 1, then the sequence of c_i has no holes (zeros in between nonzeros).

Proof. We may as well assume that c_0, c_d ≠ 0; otherwise we can factor out extra powers of x and y. Let g(z) = p(z, 1). Then g does not vanish on Γ̄_1 \ {0}: every z ∈ Γ̄_1 \ {0} can be written as x/y for some x ∈ Γ̄_α and y ∈ Γ̄_β, and then g(z) = p(x/y, 1) = p(x, y)/y^d ≠ 0. Since moreover g has no zero root, all roots of g lie in the open half-plane {z | Re(z) < 0}. But then c_{d-i}/c_d is, up to sign, the i-th elementary symmetric polynomial of the roots of g. Since the elementary symmetric polynomials are half-plane-stable for every open half-plane, all coefficients of g must be nonzero.

Theorem 64.
Suppose that g(z_1, ..., z_n) is a homogeneous Γ_1-stable polynomial. Let T_1, ..., T_k be a partition of [n] and (c_1, ..., c_k) ∈ Z^k_{≥0}. Define the (T_1, ..., T_k)-degree of a monomial z_1^{t_1} ··· z_n^{t_n} as (∑_{i∈T_1} t_i, ∑_{i∈T_2} t_i, ..., ∑_{i∈T_k} t_i). Let h be the sum of terms in g whose (T_1, ..., T_k)-degree is equal to (c_1, ..., c_k). Then h is either identically zero, or Γ_{1/2^k}-stable.

Proof. Let h_i be the polynomial obtained from g by retaining the terms whose (T_1, ..., T_i)-degree is (c_1, ..., c_i). Then h_0 = g and h_k = h. If h_i ≡ 0 for some i, then h_k ≡ 0; so w.l.o.g. we assume h_i ≢ 0 for all i. We will inductively prove that h_i is ( ∏_{j ∈ T_1∪···∪T_i} Γ_α × ∏_{j ∈ T_{i+1}∪···∪T_k} Γ_{β_i} )-stable for α = 1/2^k and β_i = 1 - (2^i - 1)/2^k (so that β_0 = 1 and β_k = α). Let ∏_i := ∏_{j ∈ T_1∪···∪T_i} Γ_α × ∏_{j ∈ T_{i+1}∪···∪T_k} Γ_{β_i}. Note that ∏_{i+1} ⊆ ∏_i for all i.

Note that β_0 = 1, and by assumption g = h_0 is Γ_1-stable, i.e., ∏_0-stable. So it is enough to prove the induction step. Assume the statement is true for h_i and let us prove it for h_{i+1}. Fix (z_1, ..., z_n) ∈ ∏_{i+1}. We will show h_{i+1}(z_1, ..., z_n) ≠ 0. Note that we can get h_{i+1} from h_i by retaining the terms whose T_{i+1}-degree is c_{i+1}. Take two variables x and y, and look at the polynomial p(x, y) = h_i(u_1, ..., u_n), where

    u_j := z_j if j ∈ T_1 ∪ ··· ∪ T_i, u_j := x z_j if j ∈ T_{i+1}, u_j := y z_j if j ∈ T_{i+2} ∪ ··· ∪ T_k.

Note that p is a homogeneous polynomial (of some degree d). This is because h_i is homogeneous in the variables from T_{i+1} ∪ ··· ∪ T_k. Let c_max, c_min be the maximum and minimum T_{i+1}-degree in h_i, respectively. Note that the coefficient of x^c y^{d-c} in p(x, y) is exactly h^{T_{i+1}}_{i,c}(z_1, ..., z_n), where h^{T_{i+1}}_{i,c} is the sum of terms in h_i whose T_{i+1}-degree is c. We will show that the coefficients of x^c y^{d-c} in p(x, y) are nonzero for all c ∈ [c_min, c_max]. This immediately implies the stability of h_{i+1}, as c_{i+1} ∈ [c_min, c_max] since h_{i+1} ≢ 0.

For c ∈ {c_min, c_max}, the polynomial h^{T_{i+1}}_{i,c} is ∏_i-stable by the inductive assumption on h_i and Lemma 56, thus h^{T_{i+1}}_{i,c}(z_1, ..., z_n) ≠ 0 as (z_1, ..., z_n) ∈ ∏_{i+1} ⊆ ∏_i. For the remaining c ∈ (c_min, c_max) we will use Lemma 63. Let x ∈ Γ̄_{β_i - α} and y ∈ Γ̄_{β_i - β_{i+1}}. These choices make sure that x z_j ∈ Γ_{β_i} and y z_j ∈ Γ_{β_i} for the appropriate indices j. By the inductive assumption, we then have p(x, y) ≠ 0. So p is (Γ̄_{β_i - α} × Γ̄_{β_i - β_{i+1}})-stable. If this stability satisfies the assumptions of Lemma 63, we are done. So it is enough to check (β_i - α) + (β_i - β_{i+1}) ≥ 1:

    2β_i - α - β_{i+1} = 2(1 - (2^i - 1)/2^k) - 1/2^k - (1 - (2^{i+1} - 1)/2^k) = 1 + (-2^{i+1} + 2 - 1 + 2^{i+1} - 1)/2^k = 1.

Conjecture 65.
With the same assumptions as in Theorem 64, h is either identically zero or Γ_{Ω(1/k)}-stable.

For any distribution µ, define its Newton polytope newt(µ) as the convex hull of its support:

    newt(µ) := conv({1_S : µ(S) > 0}).

Next, we show that the ℓ_1 edge lengths of the Newton polytope of a Γ_{1/k}-sector-stable distribution are O(k).

Lemma 66. Let µ : 2^[n] → R_{≥0} be a Γ_{1/k}-sector-stable distribution. Then the ℓ_1-length of every edge of newt(µ) is at most 2k.

Proof. First, we show that for any face F of newt(µ) there exists a Γ_{1/k}-sector-stable polynomial with support equal to the face F. Since F is a face of newt(µ), there exists some vector w = (w_1, ..., w_n) such that F = argmax{ ⟨w, x⟩ | x ∈ newt(µ) }. Let g_µ be the generating polynomial of µ; then

    g_µ(t^{w_1} z_1, ..., t^{w_n} z_n) = ∑_{α ∈ Z^n_{≥0}} coeff_g(z^α) t^{⟨w, α⟩} z^α.

Now, if we take the limit t → ∞, only the terms supported on F survive, i.e.,

    t^{-max⟨w, x⟩} g(t^{w_1} z_1, ..., t^{w_n} z_n) = ∑_{α ∈ F} coeff_g(z^α) z^α + ∑_{α ∉ F} t^{-δ_α} coeff_g(z^α) z^α

for some δ_α > 0, and

    g_F := lim_{t→∞} t^{-max⟨w, x⟩} g(t^{w_1} z_1, ..., t^{w_n} z_n) = ∑_{α ∈ F} coeff_g(z^α) z^α.

Note that linear scaling of the variables, and taking limits in t, preserve sector-stability; therefore g_F is a sector-stable polynomial. By applying the same argument again, we can restrict the Newton polytope to lower-dimensional faces while preserving sector-stability, until we reach an edge. As a result, the polynomial corresponding to each edge must also be Γ_{1/k}-sector-stable.

Now, assume to the contrary that there exists an edge (α, α') of newt(µ) with ℓ_1-length more than 2k. Then we would have that

    g_{(α,α')}(z) = a z^α + b z^{α'} = z^α (a + b z^{α'-α})

is Γ_{1/k}-sector-stable, where a = coeff_g(z^α) and b = coeff_g(z^{α'}) are nonzero. If |α - α'|_1 > 2k, we can set z_i ∈ Γ_{1/k} for i ∈ supp(α) ∆ supp(α') so that z^{α'-α} takes any value in C \ {0}; in particular, a + b z^{α'-α} = 0 for some such choice. Therefore g_{(α,α')} is not Γ_{1/k}-stable, a contradiction.

In this section, we first show that any sector-stable polynomial is also a fractionally log-concave polynomial. Then, by analyzing properties of fractionally log-concave polynomials, we show that the entropy of the marginals gives a constant-factor approximation of the entropy of fractionally log-concave distributions.
This leads to a multiplicative approximation of the logarithm of the size of the support of a sector-stable polynomial (see Lemma 73). Our techniques are a natural generalization of the results obtained by Anari, Oveis Gharan, and Vinzant [AOV18]. See also [ES20] for recent alternative techniques for proving similar entropy-based inequalities. An immediate consequence of our results is a multiplicative approximation of the logarithm of the size of the support of the monomer-dimer model (Corollary 75).
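Before the formal statements, here is a small numerical illustration (our own sketch, not part of the paper) of the two inequalities at play: sub-additivity of entropy, and the fractional-log-concavity lower bound. The example distribution is uniform over the matchable vertex sets of a 4-vertex path; by the results of this paper its generating polynomial is Γ_{1/2}-sector-stable, hence 1/4-fractionally log-concave, and the constant 1/8 below instantiates the resulting 2/α-approximation with α = 1/4:

```python
import math

# Uniform distribution over vertex sets of the path 0-1-2-3 that admit a
# perfect matching (the empty set counts, with one empty matching).
support = [frozenset(), frozenset({0, 1}), frozenset({1, 2}),
           frozenset({2, 3}), frozenset({0, 1, 2, 3})]
mu = {S: 1 / len(support) for S in support}

H = -sum(p * math.log(p) for p in mu.values())              # entropy of mu
marg = [sum(p for S, p in mu.items() if i in S) for i in range(4)]
Hb = lambda q: 0.0 if q in (0.0, 1.0) else -q * math.log(q) - (1 - q) * math.log(1 - q)
H_marg = sum(Hb(q) for q in marg)                            # entropy of marginals

assert H <= H_marg + 1e-12          # sub-additivity (holds for any distribution)
assert H >= (1 / 8) * H_marg - 1e-12  # 2/alpha-approximation with alpha = 1/4
print(H, H_marg)
```

Here H = log 5 ≈ 1.609 while the entropy of marginals is ≈ 2.692, so the two agree up to a small constant factor, as the results below guarantee.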
Lemma 67. For α ∈ (0, 1/2], if a polynomial f ∈ R_{≥0}[z_1, ..., z_n] is Γ_{2α}-sector-stable, then f is α-fractionally log-concave.

Proof. Let µ : 2^[n] → R_{≥0} be the distribution generated by f. First we claim that it is enough to prove fractional log-concavity at ~1. For an arbitrary vector ~v ∈ R^n_{>0}, let f_v({z_i}) = f({v_i^α z_i}). Note that f_v is sector-stable, and

    ∇² log f({z_i^α}) |_{~v} = D_{~v} ( ∇² log f_v({z_i^α}) |_{~1} ) D_{~v},

where D_{~v} = diag{v_i^{-1}}. So we may proceed by replacing f with f_v.

Let H = ∇² log f({z_i^α}) |_{~1}; then

    H_ij = α(α - 1) P[i] - α² P[i]² if j = i, and H_ij = α² (P[i ∧ j] - P[i] P[j]) otherwise,

where P[i ∧ j] := P_{S∼µ}[i, j ∈ S]. Since the row norms of Ψ^cor_µ are bounded by 1/α (Theorem 50, applied with aperture 2α), its maximum eigenvalue λ_max(Ψ^cor_µ) is at most 1/α. Writing D := diag(P[i]), we have

    Ψ^cor_µ = (1/α²) D^{-1} H + (1/α) I,

and D Ψ^cor_µ is symmetric; therefore λ_max(Ψ^cor_µ) ≤ 1/α implies -(1/α²) H = (1/α) D - D Ψ^cor_µ ⪰ 0, i.e., H ⪯ 0. Hence log f({z_i^α}) is concave.

Remark 68. Observe that the proof of Lemma 67 shows that λ_max(Ψ^cor_µ) ≤ 1/α is equivalent to the α-fractional log-concavity of µ.

Lemmas 53 and 67 imply that for a Γ_α-sector-stable µ : 2^[n] → R_{≥0}, the homogenization µ^hom of µ (a distribution on n-element subsets of [n] ∪ {1̄, ..., n̄}) is Γ_{α/2}-sector-stable and α/4-fractionally log-concave. In Lemma 69, we prove the stronger statement that µ^hom is α/2-fractionally log-concave.

Lemma 69.
Consider a distribution µ : 2^[n] → R_{≥0} that is generated by a Γ_α-sector-stable polynomial f. Let ν := µ^hom be the homogenization of µ. Then λ_max(Ψ^cor_ν) ≤ 2/α or, equivalently (by Remark 68), the homogenization f^hom of f is α/2-fractionally log-concave.

Proof. Let Ω = {1, ..., n} and Ω̄ = {1̄, ..., n̄}. For a set S ⊆ Ω, let S̄ ⊆ Ω̄ be {ī | i ∈ S}. Recall that

    ν(U) = µ(U ∩ Ω) if U = S ∪ (Ω̄ \ S̄) for some S ⊆ Ω, and ν(U) = 0 otherwise.

Let P[i] := P_{U∼ν}[i ∈ U] = P_{S∼µ}[i ∈ S] and P[ī] := P_{U∼ν}[ī ∈ U] = P_{S∼µ}[i ∉ S]. Note that Ψ^cor_ν(i, i) = -Ψ^cor_ν(i, ī) = P[ī] and Ψ^cor_ν(ī, ī) = -Ψ^cor_ν(ī, i) = P[i]. For i ≠ j, we can write

    Ψ^cor_ν(i, j) = -Ψ^cor_ν(i, j̄) = P[ī] Ψ^inf_µ(i, j) and Ψ^cor_ν(ī, j̄) = -Ψ^cor_ν(ī, j) = P[i] Ψ^inf_µ(i, j).

Let D := diag(P[i])_{i=1}^n and D̄ := diag(P[ī])_{i=1}^n. We can rewrite Ψ^cor_ν as a block matrix in terms of the matrix A := Ψ^inf_µ + I as follows:

    Ψ^cor_ν = [ D̄A, -D̄A ; -DA, DA ].

We consider the left eigenvectors of Ψ^cor_ν. Recall that all eigenvalues of Ψ^inf_µ are real. Let v_1, ..., v_n ∈ R^n be a basis of left eigenvectors of Ψ^inf_µ, with corresponding eigenvalues λ_1(Ψ^inf_µ) ≥ ··· ≥ λ_n(Ψ^inf_µ). For i ∈ [n], let w_i ∈ R^{2n} be the concatenation of v_i and -v_i, i.e., w_i^t := [v_i^t, -v_i^t]. Then the {w_i} are linearly independent, and are left eigenvectors of Ψ^cor_ν with eigenvalues {λ_i + 1}, since

    w_i^t Ψ^cor_ν = [v_i^t, -v_i^t] [ D̄A, -D̄A ; -DA, DA ] = [v_i^t (D̄ + D) A, -v_i^t (D̄ + D) A] = [v_i^t A, -v_i^t A] = (λ_i + 1) w_i^t,

where we used D + D̄ = I. On the other hand, for i ∈ [n], consider the vector u_i ∈ R^{2n} defined by u_i^t := [e_i^t D, e_i^t D̄], where e_i is the i-th standard basis vector of R^n. Observe that u_i ≠ 0, since P[i] or P[ī] must be nonzero, and u_i^t Ψ^cor_ν = 0. Moreover, the {u_i} are linearly independent.

Now, let W, U be the n-dimensional subspaces of R^{2n} spanned by {w_i} and {u_i} respectively. We show that W ∩ U = {0}, and conclude that the vectors {u_i} ∪ {w_i} are linearly independent and form a basis of (left) eigenvectors of Ψ^cor_ν. Hence, the spectrum of Ψ^cor_ν is the union of {λ_i + 1}_{i=1}^n and n copies of 0. In particular,

    λ_max(Ψ^cor_ν) ≤ λ_1(Ψ^inf_µ) + 1 ≤ 2/α.

Indeed, suppose w ∈ W ∩ U. We can write w^t = [y^t, -y^t] for some y ∈ R^n and w^t = [x^t D, x^t D̄] for some x ∈ R^n. Then

    0 = y(i) - y(i) = w(i) + w(i + n) = x(i) P[i] + x(i) P[ī] = x(i),

where we use y(i) (resp. x(i)) to denote the i-th entry of the vector y (resp. x). Now all entries of x are 0, so w = 0.

Remark 70. Lemma 69 is tight. For example, take f_µ(x_1, x_2) = x_1 + x_2, which is Γ_1-stable; then λ_max(Ψ^cor_{µ^hom}) = 2.

Given a distribution µ over a finite set Ω, define its entropy as

    H(µ) = -∑_{ω∈Ω} µ(ω) log µ(ω).

Recall that the marginal probability of an element i ∈ Ω, denoted µ(i), is the probability that i is in a sample from µ, i.e., µ(i) = P_{S∼µ}[i ∈ S]. For any probability distribution, by the sub-additivity of entropy, the entropy of the marginals is an upper bound on the entropy: ∑_i H(µ(i)) ≥ H(µ), where H(µ(i)) denotes the binary entropy of the marginal. The next lemma, which is analogous to Theorem 5.2 in [AOV18], leads to a lower bound on the entropy of fractionally log-concave distributions.

Lemma 71.
For any α-fractionally log-concave distribution µ : 2^[n] → R_{≥0} with marginal probabilities µ(1), ..., µ(n), we have

    H(µ) ≥ α ∑_i µ(i) log(1/µ(i)).

Proof. Let g_µ be the generating polynomial of the distribution µ. Define

    f(z_1, ..., z_n) = log g_µ( z_1^α/µ(1)^α, ..., z_n^α/µ(n)^α ).

Since µ is α-fractionally log-concave, log g_µ(z_1^α, ..., z_n^α) is a concave function, and scaling preserves concavity; therefore f(z_1, ..., z_n) is concave. Now, let X be the random indicator vector of a set chosen according to µ, i.e., P[X = 1_S] = µ(S). Then, by Jensen's inequality, f(E[X]) ≥ E[f(X)]. Note that

    f(E[X]) = f(µ(1), ..., µ(n)) = log g_µ( µ(1)^α/µ(1)^α, ..., µ(n)^α/µ(n)^α ) = log g_µ(1, ..., 1) = 0,

and

    f(1_S) = log( ∑_{T⊆S} µ(T) / ∏_{i∈T} µ(i)^α ) ≥ log( µ(S) / ∏_{i∈S} µ(i)^α ) = log µ(S) + α ∑_{i∈S} log(1/µ(i)),

where the inequality is true by the monotonicity of log. Hence,

    E[f(X)] = ∑_S µ(S) f(1_S) ≥ ∑_S µ(S) log µ(S) + α ∑_S µ(S) ∑_{i∈S} log(1/µ(i)) = -H(µ) + α ∑_i µ(i) log(1/µ(i)).

Combining with f(E[X]) = 0 ≥ E[f(X)] yields H(µ) ≥ α ∑_i µ(i) log(1/µ(i)).

Given a probability distribution µ : 2^[n] → R_{≥0}, the dual probability distribution µ* is defined so that the probability of each set equals that of its complement under µ, i.e., for any set S ⊆ [n], µ*(S) = µ([n] \ S).

Corollary 72. If µ and its dual µ* are α-fractionally log-concave, then ∑_i H(µ(i)) is a 2/α-approximation of H(µ). In particular, if µ is Γ_α-sector-stable, then µ and its dual µ* are α/2-fractionally log-concave (see Proposition 52, part 3, and Lemma 67); therefore ∑_i H(µ(i)) is a 4/α-approximation of H(µ).

Proof. For any probability distribution µ we have H(µ) ≤ ∑_i H(µ(i)), so it is enough to prove H(µ) ≥ (α/2) ∑_i H(µ(i)). By Lemma 71 we have

    H(µ) ≥ α ∑_i µ(i) log(1/µ(i)) and H(µ*) ≥ α ∑_i (1 - µ(i)) log(1/(1 - µ(i))).

Since µ and µ* are duals, H(µ) = H(µ*). Therefore,

    2 H(µ) = H(µ) + H(µ*) ≥ α ( ∑_i µ(i) log(1/µ(i)) + ∑_i (1 - µ(i)) log(1/(1 - µ(i))) ) = α ∑_i H(µ(i)).

Given a distribution µ, let F_µ be its support. We want to show how to approximate log |F_µ| when µ is fractionally log-concave. Previously, this result was shown for log-concave polynomials in [AOV18].

Lemma 73.
Consider F ⊆ ([n] choose k). Let F* := {[n] \ S | S ∈ F}. Let

    β := max{ ∑_i p_i log(1/p_i) | p ∈ conv(F) } and β* := max{ ∑_i q_i log(1/q_i) | q ∈ conv(F*) }.

Assume there exist α-fractionally log-concave polynomials g and h with supp(g) = F and supp(h) = F*. Then β + β* is an α/2-approximation of log |F|, i.e.,

    β + β* ≥ log |F| ≥ (β + β*) · α/2.

In particular, if there exists a Γ_α-sector-stable polynomial g with supp(g) = F, then β + β* is an α/4-approximation of log |F|.

Note that β and β* can be efficiently computed via a convex program (see, e.g., [AOV18, Theorem 2.10]). The following lemma states that any point in conv(F_µ), where µ is fractionally log-concave, equals the vector of marginals of some fractionally log-concave distribution.

Lemma 74. Consider F ⊆ ([n] choose k). Suppose there exists an α-fractionally log-concave polynomial g with supp(g) = F. For any (p_1, ..., p_n) ∈ conv(F), there exists ν with supp(ν) ⊆ F such that ∑_S ν(S) z^S is α-fractionally log-concave and ν(i) = p_i for all i ∈ [n]. Consequently,

    max{ ∑_i p_i log(1/p_i) | p ∈ conv(F) } = max{ ∑_i µ(i) log(1/µ(i)) | µ ∈ V },

where V is the set of α-fractionally log-concave µ with supp(µ) ⊆ F.

The proof is very similar to [AOV18, Theorem 2.10, Corollary 2.11]: given any ~p = (p_1, ..., p_n) ∈ newt(g), one can find a vector λ ∈ R^n_{≥0} such that the distribution generated by g ⋆ λ has marginals equal to ~p, and external fields preserve fractional log-concavity. Now, we are ready to prove Lemma 73.

Proof of Lemma 73. Let ν and ν* be the uniform distributions over F and F* respectively. For a set family F', let V_{F'} be the set of α-fractionally log-concave µ with supp(µ) ⊆ F'. Since V_F is non-empty, by Lemma 74,

    β = max_{µ ∈ V_F} { ∑_i µ(i) log(1/µ(i)) } and β* = max_{µ ∈ V_{F*}} { ∑_i µ(i) log(1/µ(i)) }.

For S ∈ {F, F*}, let µ^argmax_S = argmax_{µ ∈ V_S} ∑_i µ(i) log(1/µ(i)). We have

    log(|F|) = H(ν) ≤ ∑_i H(ν(i)) = ∑_i ( ν(i) log(1/ν(i)) + (1 - ν(i)) log(1/(1 - ν(i))) ) ≤ β + β*,

where the last inequality follows from the fact that (ν(i))_{i=1}^n ∈ conv(F) and (1 - ν(i))_{i=1}^n ∈ conv(F*). On the other hand, since the uniform distribution over a discrete set maximizes entropy among distributions supported on it,

    log(|F|) = H(ν) ≥ H(µ^argmax_F) ≥ α ∑_i µ^argmax_F(i) log(1/µ^argmax_F(i)) = αβ,

where the second inequality follows from Lemma 71. Analogously,

    log(|F*|) = H(ν*) ≥ H(µ^argmax_{F*}) ≥ α ∑_i µ^argmax_{F*}(i) log(1/µ^argmax_{F*}(i)) = αβ*.

Summing these two inequalities and using |F| = |F*|, we get log(|F|) ≥ (β + β*) · α/2.

Corollaries 54 and 62 and Lemma 73 together imply the following corollary.

Corollary 75.
Consider a graph G = (V, E). Let V^M be the family of sets S ⊆ V that have a perfect matching. For k ≤ n/2, let V^M_k be the family of vertex sets of size 2k that have a perfect matching. Then we can efficiently compute an 8-multiplicative-approximation of log |V^M| and of log |V^M_k|.

Analogously, Corollary 61 and Lemma 73 together imply the following.

Corollary 76. Consider a matrix L ∈ R^{n×n} such that L + L^T is positive semi-definite. Let V^L be the family of sets S ⊆ [n] such that det(L_{S,S}) ≠ 0. For k ≤ n, let V^L_k be the family of sets S ∈ ([n] choose k) such that det(L_{S,S}) ≠ 0. Then we can efficiently compute an 8-multiplicative-approximation of log |V^L| and of log |V^L_k|.

Remark 77. In Lemma 66, we showed that the convex hull of the support of a Γ_α-sector-stable polynomial has edge lengths bounded by O(1/α). We can show a similar result for α-fractionally log-concave polynomials. We leave the problem of characterizing the supports of α-fractionally log-concave polynomials to future work.

In this section, we state and prove the formal version of Corollary 13. This result gives an efficient algorithm to compute mixed derivatives of real-stable polynomials. The time complexity of the algorithm depends on the bit complexity of the coefficients of the polynomial and the number of partial derivatives. As a result, we obtain an FPRAS to compute the sum of the coefficients of the monomials corresponding to a partition matroid with constantly many parts. Without the assumption on the coefficients, the best known result gives an e^r-approximation factor, where r is the rank of the matroid (see [SV17]).

Lemma 78.
Let f ∈ R ≥ [ z , · · · , z n ] be a homogeneous real-stable polynomial whose maximum degree in z i is κ i .Let κ : = ∑ ni = κ i . For v , · · · , v k , x ∈ R n ≥ with k = O ( ) , we can compute ∂ c v · · · ∂ c k v k f | ~ z = x in polynomial time in κ and b, where b ≥ is the bit complexity of the coefficients of f and the entries of v , · · · , v k , x i.e., these entriesare in between [ − b , 2 b ] . Proof.
W.l.o.g., we can assume $f$ is homogeneous multiaffine; otherwise we replace $f$ with its polarization $f^\uparrow \in \mathbb{R}_{\geq 0}[(z_{ij})_{i \in [n], j \in [\kappa_i]}]$ (see Proposition 34). Note that $f^\uparrow$ is a homogeneous multiaffine polynomial in $\kappa$ variables and has the same degree as $f$. Moreover, $\partial_v f = \big(\sum_{i=1}^n \sum_{j=1}^{\kappa_i} v_i \frac{\partial}{\partial z_{ij}}\big) f^\uparrow$. Each call to the oracle $\mathcal{O}_{f^\uparrow}$ for $f^\uparrow$ can be implemented using one call to the oracle $\mathcal{O}_f$ for $f$. The bounds on the coefficients of $f$ imply that the coefficients of $f^\uparrow$ are bounded between $2^{-\kappa^{O(1)} b}$ and $2^{\kappa^{O(1)} b}$. Therefore, in the remainder of the proof we assume that the polynomial $f$ is multiaffine, homogeneous, real-stable, and $\kappa = n$.

We divide the proof into two main steps. In the first step, we map the polynomial $f$ to another polynomial $g$ such that:

1. $g$ is a homogeneous multiaffine real-stable polynomial in $n' = O(n)$ variables, of degree $d = \deg(g) = \deg(f)$.
2. $D^{c_1}_{v_1} \cdots D^{c_k}_{v_k}(f)|_{x_1, \ldots, x_n} = D^{c_1}_{w_1} \cdots D^{c_k}_{w_k}(g)|_{x'_1, \ldots, x'_{n'}} \in \mathbb{R}$, where $x'_i \geq 0$ for all $i \in [n']$. The vectors $w_i \in \{0,1\}^{n'}$ correspond to subsets $T_i \subseteq [n']$; further, these sets $T_i$ are disjoint.

Note that $D^{c_1}_{w_1} \cdots D^{c_k}_{w_k}(g)$ is exactly $h(x'_1, \ldots, x'_{n'})$, where $h(z_1, \ldots, z_{n'})$ is the sum of the terms in $g$ whose $(T_1, \ldots, T_k, T_{k+1})$-degree is $(c_1, \ldots, c_k, c_{k+1})$, where $T_{k+1} = [n'] \setminus \bigcup_{i=1}^k T_i$ and $c_{k+1} = \deg(g) - \sum_{i=1}^k c_i$.

In the second step, we (approximately) sample from the distribution $\mu$ generated by $h(x'_1 z_1, \ldots, x'_{n'} z_{n'})$. A routine sampling-to-counting argument then allows computing an approximation of $h(x'_1, \ldots, x'_{n'})$. Theorem 64 and Lemma 67 imply that $h$ is $\alpha$-fractionally log-concave for $\alpha = \frac{1}{k+1}$. Let $\Delta = \alpha^{-1}$ and $\ell = \lceil \Delta \rceil$; the $\ell$-step down-up walk has eigenvalue gap $\geq n^{-(\Delta + 1)}$. The bound on the coefficients of $f$ implies an upper bound of $\kappa^{O(1)} b$ on $\log \frac{1}{\min_{S \in \operatorname{supp}(\mu)} \mu(S)}$; thus the random walk starting from any $S \in \operatorname{supp}(\mu)$ mixes in $n^{O(k)} \kappa^{O(1)} b$ steps. We can use $\mathcal{O}_g$ to obtain a starting state in $\operatorname{supp}(\mu)$. Each step of the random walk can be implemented using polynomially many calls to $\mathcal{O}_g$.

For $t \leq n$ and $v_i > 0$, $i \in [t]$, it is easy to see that
$$\Big(\sum_{i=1}^t v_i \partial_i\Big)^c f(x_1, \ldots, x_n) = \Big(\sum_{i=1}^t \partial_i\Big)^c f(v_1 x_1, \ldots, v_t x_t, x_{t+1}, \ldots, x_n).$$
For $j \in [k]$, consider $w_j \in \{0,1\}^n$ where $w_{ji} = \mathbb{1}[v_{ji} \neq 0]$. For $i \in [n]$, let $x'_i := x_i \prod_{j : v_{ji} \neq 0} v_{ji} \geq 0$. We have
$$D^{c_1}_{v_1} \cdots D^{c_k}_{v_k}(f)|_{x_1, \ldots, x_n} = D^{c_1}_{w_1} \cdots D^{c_k}_{w_k}(f)|_{x'_1, \ldots, x'_n}.$$

Consider the linear transformation $T : \mathbb{R}[z_i]_{i \in [n]} \to \mathbb{R}[z_{i,j}]_{i \in [n], j \in [k]}$ obtained by substituting $z_i := \sum_{j=1}^k z_{i,j}$ for $i \in [n]$, and define $g = T(f)$. Clearly, $g$ is multiaffine, homogeneous, and real-stable. We next show that
$$T(D^{c_1}_{w_1} \cdots D^{c_k}_{w_k} f)|_{x_1, \ldots, x_n} = D^{c_1}_{\tilde w_1} \cdots D^{c_k}_{\tilde w_k}(g)|_{\{\tilde x_{i,j}\}_{i \in [n], j \in [k]}},$$
where $\tilde x_{i,j} = x_i$ for all $j \in [k]$, and $\tilde w_{j,(i,j')} = w_{ji}$ if $j' = j$ and $0$ otherwise; note that $f$ and $g$ are multiaffine. To prove the above equality, we only need to verify it for multiaffine monomials. Fix a multiaffine monomial $m$ and $j \in [k]$. We check that
$$T(w_{j i_1} \cdots w_{j i_{c_j}} \partial_{i_1} \cdots \partial_{i_{c_j}} m) = \tilde w_{j,(i_1,j)} \cdots \tilde w_{j,(i_{c_j},j)} \Big(\prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t, j}}\Big) T(m).$$
This immediately implies $T(D^{c_j}_{w_j} \tilde f) = D^{c_j}_{\tilde w_j} T(\tilde f)$ for any multiaffine polynomial $\tilde f$. The desired equality then follows by induction.

First, $w_{j i_1} \cdots w_{j i_{c_j}} = \tilde w_{j,(i_1,j)} \cdots \tilde w_{j,(i_{c_j},j)}$. We can factor them out and prove $T(\partial_{i_1} \cdots \partial_{i_{c_j}} m) = \prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t,j}} T(m)$. Now, if the $i_t$ are not distinct, then the LHS and RHS are both $0$, since $m$ and $T(m)$ are multiaffine. If $z_{i_t}$ does not divide $m$ for some $t$, then both the LHS and RHS are $0$. Now, write $m = m_0 \prod_{t=1}^{c_j} z_{i_t}$ for some monomial $m_0$ containing only variables in $[n] \setminus \{i_1, \ldots, i_{c_j}\}$. Clearly, the LHS is $T(m_0)$. The RHS is
$$\prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t,j}} \left( T(m_0) \prod_{t=1}^{c_j} \Big(\sum_{j'=1}^k z_{i_t,j'}\Big) \right) = T(m_0) \prod_{t=1}^{c_j} \frac{\partial}{\partial z_{i_t,j}} \prod_{t=1}^{c_j} \Big(\sum_{j'=1}^k z_{i_t,j'}\Big) = T(m_0).$$

References

[Aff+14] Raja Hafiz Affandi, Emily B. Fox, Ryan P. Adams, and Ben Taskar.
Learning the Parameters of Determinantal Point Process Kernels. arXiv preprint (2014).
[AL20] Vedat Levi Alev and Lap Chi Lau. “Improved analysis of higher order random walks and applications”. In: Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing. 2020, pp. 1198–1211.
[ALO20] Nima Anari, Kuikui Liu, and Shayan Oveis Gharan. “Spectral Independence in High-Dimensional Expanders and Applications to the Hardcore Model”. In: Proceedings of the 61st IEEE Annual Symposium on Foundations of Computer Science. IEEE Computer Society, Nov. 2020.
[Ana+18] Nima Anari, Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-Concave Polynomials III: Mason’s Ultra-Log-Concavity Conjecture for Independent Sets of Matroids”. In: CoRR abs/1811.01600 (2018).
[Ana+19] Nima Anari, Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-concave polynomials II: high-dimensional walks and an FPRAS for counting bases of a matroid”. In: Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing. 2019, pp. 1–12.
[AO17] Nima Anari and Shayan Oveis Gharan. “A Generalization of Permanent Inequalities and Applications in Counting and Optimization”. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing. ACM, June 2017, pp. 384–396.
[AOR16] Nima Anari, Shayan Oveis Gharan, and Alireza Rezaei. “Monte Carlo Markov Chain Algorithms for Sampling Strongly Rayleigh Distributions and Determinantal Point Processes”. In: Proceedings of the 29th Conference on Learning Theory. Vol. 49. JMLR Workshop and Conference Proceedings. PMLR, June 2016, pp. 103–115.
[AOV18] Nima Anari, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-Concave Polynomials I: Entropy and a Deterministic Approximation Algorithm for Counting Bases of Matroids”. In: Proceedings of the 59th IEEE Annual Symposium on Foundations of Computer Science. IEEE Computer Society, Oct. 2018.
[Bay+07] Mohsen Bayati, David Gamarnik, Dimitriy A. Katz, Chandra Nair, and Prasad Tetali. “Simple deterministic approximation algorithms for counting matchings”. In: Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San Diego, California, USA, June 11-13, 2007. Ed. by David S. Johnson and Uriel Feige. ACM, 2007, pp. 122–127.
[BB09] Julius Borcea and Petter Brändén. “The Lee-Yang and Pólya-Schur programs. I. Linear operators preserving stability”. In: Inventiones Mathematicae (2009).
[BBL09] Julius Borcea, Petter Brändén, and Thomas Liggett. “Negative dependence and the geometry of polynomials”. In: Journal of the American Mathematical Society (2009).
[BGW03] Alexandre V. Borovik, Israel M. Gelfand, and Neil White. Coxeter Matroids. Springer, 2003, pp. 151–197.
[BH19] Petter Brändén and June Huh. “Lorentzian polynomials”. In: arXiv preprint arXiv:1902.03719 (2019).
[Bor09] Alexei Borodin. Determinantal point processes. arXiv preprint (2009).
[BP93] Robert Burton and Robin Pemantle. “Local Characteristics, Entropy and Limit Theorems for Spanning Trees and Domino Tilings Via Transfer-Impedances”. In: The Annals of Probability (1993).
[Brä07] Petter Brändén. “Polynomials with the half-plane property and matroid theory”. In: Advances in Mathematics (2007).
Advances in Neural Information Processing Systems. Ed. by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett. Vol. 31. Curran Associates, Inc., 2018, pp. 7365–7374.
[Cel+16] L. Elisa Celis, Amit Deshpande, Tarun Kathuria, Damian Straszak, and Nisheeth K. Vishnoi. “On the complexity of constrained determinantal point processes”. In: arXiv preprint arXiv:1608.00554 (2016).
[CGM19] Mary Cryan, Heng Guo, and Giorgos Mousa. “Modified log-Sobolev inequalities for strongly log-concave distributions”. In: Proceedings of the 60th IEEE Annual Symposium on Foundations of Computer Science. IEEE, 2019, pp. 1358–1370.
[Cha+15] Wei-Lun Chao, Boqing Gong, Kristen Grauman, and Fei Sha. “Large-Margin Determinantal Point Processes”. In: Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence. UAI ’15. Amsterdam, Netherlands: AUAI Press, 2015, pp. 191–200.
[Che+20] Zongchen Chen, Andreas Galanis, Daniel Štefankovič, and Eric Vigoda. “Rapid mixing for colorings via spectral independence”. In: arXiv preprint arXiv:2007.08058 (2020).
[CLV20a] Zongchen Chen, Kuikui Liu, and Eric Vigoda. “Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion”. In: arXiv preprint arXiv:2011.02075 (2020).
[CLV20b] Zongchen Chen, Kuikui Liu, and Eric Vigoda. “Rapid Mixing of Glauber Dynamics up to Uniqueness via Contraction”. In: arXiv preprint arXiv:2004.09083 (2020).
[CMO19] Fabio Deelan Cunden, Satya N. Majumdar, and Neil O’Connell. “Free fermions and α-determinantal processes”. In: Journal of Physics A: Mathematical and Theoretical 52.16 (Mar. 2019), pp. 165–202.
[DK17] Irit Dinur and Tali Kaufman. “High dimensional expanders imply agreement expanders”. In: Proceedings of the 58th IEEE Annual Symposium on Foundations of Computer Science. IEEE, 2017, pp. 974–985.
[Edm65] Jack Edmonds. “Paths, trees, and flowers”. In: Canadian Journal of Mathematics
17 (1965), pp. 449–467.
[ES20] Ronen Eldan and Omer Shamir. “Log concavity and concentration of Lipschitz functions on the Boolean hypercube”. In: arXiv preprint arXiv:2007.13108 (2020).
[EV19] David Eppstein and Vijay V. Vazirani. “NC Algorithms for Computing a Perfect Matching, the Number of Perfect Matchings, and a Maximum Flow in One-Crossing-Minor-Free Graphs”. In: The 31st ACM Symposium on Parallelism in Algorithms and Architectures. 2019, pp. 23–30.
[Fen+20] Weiming Feng, Heng Guo, Yitong Yin, and Chihao Zhang. “Rapid mixing from spectral independence beyond the Boolean domain”. In: arXiv preprint arXiv:2007.08091 (2020).
[FGT19] Stephen Fenner, Rohit Gurjar, and Thomas Thierauf. “Bipartite perfect matching is in quasi-NC”. In: SIAM Journal on Computing (2019).
Proceedings of the twenty-fourth annual ACM symposium on Theory of computing. 1992, pp. 26–38.
[Gar+19] Mike Gartrell, Victor-Emmanuel Brunel, Elvis Dohmatob, and Syrine Krichene. “Learning Nonsymmetric Determinantal Point Processes”. In: ArXiv abs/1905.12962 (2019).
[Går59] Lars Gårding. “An inequality for hyperbolic polynomials”. In: Journal of Mathematics and Mechanics (1959), pp. 957–965.
[GL99] Anna Galluccio and Martin Loebl. “On the theory of Pfaffian orientations. I. Perfect matchings and permanents”. In: The Electronic Journal of Combinatorics (1999), R6.
[GM20] Heng Guo and Giorgos Mousa. “Local-to-Global Contraction in Simplicial Complexes”. In: arXiv preprint arXiv:2012.14317 (2020).
[GPK16] Mike Gartrell, Ulrich Paquet, and Noam Koenigstein. “Bayesian Low-Rank Determinantal Point Processes”. In:
Proceedings of the 10th ACM Conference on Recommender Systems. RecSys ’16. Boston, Massachusetts, USA: Association for Computing Machinery, 2016, pp. 349–356.
[Gül97] Osman Güler. “Hyperbolic polynomials and interior point methods for convex programming”. In: Mathematics of Operations Research (1997).
Comm. Math. Phys.
Probab. Surveys.
[Jer87] Mark Jerrum. “Two-dimensional monomer-dimer systems are computationally intractable”. In: Journal of Statistical Physics (1987).
Mathematical Statistical Physics, Session LXXXIII: Lecture Notes of the Les Houches Summer School. 2005, pp. 1–56.
[JS89] Mark Jerrum and Alistair Sinclair. “Approximating the permanent”. In: SIAM Journal on Computing (1989).
Journal of the ACM (JACM).
Theoretical Computer Science 43 (1986), pp. 169–188.
[Kas61] Pieter W. Kasteleyn. “The statistics of dimers on a lattice: I. The number of dimer arrangements on a quadratic lattice”. In: Physica (1961).
Graph theory and theoretical physics (1967), pp. 43–110.
[KD16] Tarun Kathuria and Amit Deshpande. “On sampling and greedy map inference of constrained determinantal point processes”. In: arXiv preprint arXiv:1607.01551 (2016).
[KM16] Tali Kaufman and David Mass. “High dimensional random walks and colorful expansion”. In: arXiv preprint arXiv:1604.02947 (2016).
[KO18] Tali Kaufman and Izhar Oppenheim. “High order random walks: Beyond spectral gap”. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2018.
[Koz06] Dexter C. Kozen. Theory of computation. Springer Science & Business Media, 2006.
[KSG08] Andreas Krause, Ajit Singh, and Carlos Guestrin. “Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies”. In: J. Mach. Learn. Res. (2008).
[KT11] Alex Kulesza and Ben Taskar. “K-DPPs: Fixed-Size Determinantal Point Processes”. In: Proceedings of the 28th International Conference on Machine Learning. ICML ’11. Bellevue, Washington, USA: Omnipress, 2011, pp. 1193–1200.
[KT12] Alex Kulesza and Ben Taskar. “Determinantal Point Processes for Machine Learning”. In: Foundations and Trends® in Machine Learning (2012).
[KUW86] Richard M. Karp, Eli Upfal, and Avi Wigderson. “Constructing a perfect matching is in random NC”. In: Combinatorica (1986).
[Lan13] Serge Lang. Complex Analysis. Vol. 103. Springer Science & Business Media, 2013.
[LB12] Hui Lin and Jeff Bilmes. “Learning Mixtures of Submodular Shells with Application to Document Summarization”. In: Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence. UAI ’12. Catalina Island, CA: AUAI Press, 2012, pp. 479–490.
[LJS16] Chengtao Li, Stefanie Jegelka, and Suvrit Sra. “Fast DPP Sampling for Nyström with Application to Kernel Methods”. In: Proceedings of the 33rd International Conference on Machine Learning - Volume 48. ICML ’16. New York, NY, USA: JMLR.org, 2016, pp. 2061–2070.
[LLP17] Eyal Lubetzky, Alex Lubotzky, and Ori Parzanchevski. “Random walks on Ramanujan complexes and digraphs”. In: arXiv preprint arXiv:1702.05452 (2017).
[LMR15] Frederic Lavancier, Jesper Moller, and Ege Rubak. “Determinantal point process models and statistical inference”. In: Journal of the Royal Statistical Society. Series B (Statistical Methodology) (2015).
[Lov79] László Lovász. “On determinants, matchings, and random algorithms”. In: Fundamentals of Computation Theory, FCT 1979, Proceedings of the Conference on Algebraic, Arithmetic, and Categorial Methods in Computation Theory, Berlin/Wendisch-Rietz, Germany, September 17-21, 1979. Ed. by Lothar Budach. Akademie-Verlag, Berlin, 1979, pp. 565–574.
[LP17] David A. Levin and Yuval Peres.
Markov chains and mixing times. Vol. 107. American Mathematical Soc., 2017.
[Mac75] Odile Macchi. “The Coincidence Approach to Stochastic Point Processes”. In: Advances in Applied Probability (1975).
[MV89] Milena Mihail and Umesh Vazirani. “On the expansion of 0-1 polytopes”. In: Journal of Combinatorial Theory, Series B, to appear (1989).
[MVV87] Ketan Mulmuley, Umesh V. Vazirani, and Vijay V. Vazirani. “Matching is as easy as matrix inversion”. In: Proceedings of the nineteenth annual ACM symposium on Theory of computing. 1987, pp. 345–354.
[Opp18] Izhar Oppenheim. “Local spectral expansion approach to high dimensional expanders part I: Descent of spectral gaps”. In: Discrete & Computational Geometry (2018).
Information and Computation.
Ann. Probab.
[SS14] Blagovest Sendov and Hristo S. Sendov. “Loci of complex polynomials, part I”. In: Transactions of the American Mathematical Society 366 (2014), pp. 5155–5184.
[SS19] Blagovest Sendov and Hristo Sendov. “Duality between loci of complex polynomials and the zeros of polar derivatives”. In: Mathematical Proceedings of the Cambridge Philosophical Society (2019).
[Ste90] John R. Stembridge. “Nonintersecting paths, pfaffians, and plane partitions”. In: Advances in Mathematics (1990). doi: https://doi.org/10.1016/0001-8708(90)90070-4.
[SV17] Damian Straszak and Nisheeth K. Vishnoi. “Real Stable Polynomials and Matroids: Optimization and Counting”. In: STOC 2017. Montreal, Canada: Association for Computing Machinery, 2017, pp. 370–383.
[ŠVW18] Daniel Štefankovič, Eric Vigoda, and John Wilmes. “On counting perfect matchings in general graphs”. In: Latin American Symposium on Theoretical Informatics. Springer, 2018, pp. 873–885.
[TF61] Harold N. V. Temperley and Michael E. Fisher. “Dimer problem in statistical mechanics - an exact result”. In: Philosophical Magazine (1961).
Theoretical Computer Science