LAGRANGE INVERSION

IRA M. GESSEL

Abstract. We give a survey of the Lagrange inversion formula, including different versions and proofs, with applications to combinatorial and formal power series identities.

1. Introduction

The Lagrange inversion formula is one of the fundamental formulas of combinatorics. In its simplest form it gives a formula for the power series coefficients of the solution $f(x)$ of the functional equation $f(x) = xG(f(x))$ in terms of coefficients of powers of $G$. Functional equations of this form often arise in combinatorics, and our interest is in these applications rather than in other areas of mathematics.

There are many generalizations of Lagrange inversion: multivariable forms [28], $q$-analogues [22, 23, 25, 71], noncommutative versions [6, 7, 23, 56], and others [29, 43, 45]. In this paper we discuss only ordinary one-variable Lagrange inversion, but in greater detail than elsewhere in the literature.

In section 2 we give a thorough discussion of some of the many different forms of Lagrange inversion, prove that they are equivalent to each other, and work through some simple examples involving Catalan and ballot numbers. We address a number of subtle issues that are overlooked in most accounts of Lagrange inversion (and which some readers may want to skip). In section 3 we describe applications of Lagrange inversion to identities involving binomial coefficients, Catalan numbers, and their generalizations. In section 4, we give several proofs of Lagrange inversion, some of which are combinatorial. A number of exercises giving additional results are included.

An excellent introduction to Lagrange inversion can be found in Chapter 5 of Stanley's Enumerative Combinatorics, Volume 2. Other expository accounts of Lagrange inversion can be found in Hofbauer [35], Bergeron, Labelle, and Leroux [5, Chapter 3], Sokal [68], and Merlini, Sprugnoli, and Verri [51].

1.1. Formal power series.

Although Lagrange inversion is often presented as a theorem of analysis (see, e.g., Whittaker and Watson [76, pp. 132–133]), we will work only with formal power series and formal Laurent series. A good account of formal power series can be found in Niven [55]; we sketch here some of the basic facts.

Given a coefficient ring $C$, which for us will always be an integral domain containing the rational numbers, the ring $C[[x]]$ of formal power series in the variable $x$ with coefficients in $C$ is the set of all "formal sums" $\sum_{n=0}^{\infty} c_n x^n$, where $c_n \in C$, with termwise addition and with multiplication defined as one would expect using distributivity: $\sum_{n=0}^{\infty} a_n x^n \cdot \sum_{n=0}^{\infty} b_n x^n = \sum_{n=0}^{\infty} c_n x^n$, where $c_n = \sum_{i=0}^{n} a_i b_{n-i}$. Differentiation of formal power series is also defined termwise. A series $\sum_{n=0}^{\infty} c_n x^n$ has a multiplicative inverse if and only if $c_0$ is invertible in $C$.

We may also consider the ring of formal Laurent series $C((x))$ whose elements are formal sums $\sum_{n=n_0}^{\infty} c_n x^n$ for some integer $n_0$, i.e., formal sums $\sum_{n=-\infty}^{\infty} c_n x^n$ in which only finitely many negative powers of $x$ have nonzero coefficients. Henceforth we will omit the word "formal" and speak of power series and Laurent series.

We can iterate the power series and Laurent series ring constructions, obtaining, for example, the ring $C((x))[[y]]$ of power series in $y$ whose coefficients are Laurent series in $x$. In any (possibly iterated) power series or Laurent series ring we will say that a set $\{f_\alpha\}$ of series is summable if for any monomial $m$ in the variables, the coefficient of $m$ is nonzero in only finitely many $f_\alpha$. In this case the sum $\sum_\alpha f_\alpha$ is well-defined and we will say that $\sum_\alpha f_\alpha$ is summable. If we write $\sum_\alpha f_\alpha$ as an iterated sum, then the order of summation is irrelevant.
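The termwise product rule and the invertibility criterion above can be illustrated with truncated series. The following is a minimal sketch in Python, assuming a series is stored as a list of its first few coefficients; the helper names `mul` and `inverse` and the truncation order `N` are illustrative choices and are not notation from this paper.

```python
from fractions import Fraction

def mul(a, b, n):
    """First n coefficients of the product: c_k = sum_i a_i * b_(k-i)."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)
                if i < len(a) and k - i < len(b)) for k in range(n)]

def inverse(a, n):
    """First n coefficients of 1/a; the constant term a[0] must be invertible."""
    if a[0] == 0:
        raise ZeroDivisionError("constant term is not invertible")
    inv = [Fraction(1) / a[0]]
    for k in range(1, n):
        # Solve sum_{i=0..k} a_i * inv_(k-i) = 0 for inv_k.
        s = sum(a[i] * inv[k - i] for i in range(1, min(k, len(a) - 1) + 1))
        inv.append(-s / a[0])
    return inv

N = 8
one_minus_x = [Fraction(1), Fraction(-1)]
geom = inverse(one_minus_x, N)       # 1/(1 - x) = 1 + x + x^2 + ...
check = mul(one_minus_x, geom, N)    # should be the series 1
print(geom, check)
```

Here the series $x$ itself, with constant term 0, has no inverse in $C[[x]]$, and `inverse([Fraction(0), Fraction(1)], N)` raises an error accordingly; its inverse $x^{-1}$ lives in the Laurent series ring $C((x))$.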
If $f(x) = \sum_n c_n x^n$ is a Laurent series in $C((x))$ and $u \in C$, where $C$ may be a power series or Laurent series ring, then we say that the substitution of $u$ for $x$ is admissible if $f(u) = \sum_n c_n u^n$ is summable, and similarly for multivariable substitutions. Admissible substitutions are homomorphisms. If $u$ is a power series or Laurent series $g(x)$ then $f(g(x))$, if summable, is called the composition of $f$ and $g$. If $f(x) = c_1 x + c_2 x^2 + \cdots$, where $c_1$ is invertible in $C$, then there is a unique power series $g(x) = c_1^{-1} x + \cdots$ such that $f(g(x)) = x$; this implies that $g(f(x)) = x$. We call $g(x)$ the compositional inverse of $f(x)$ and write $g(x) = f(x)^{\langle -1\rangle}$. For simplicity, we will always assume that if $f(x) = c_1 x + c_2 x^2 + \cdots$, where $c_1 \ne 0$, then $c_1$ is invertible. (Since $C$ is an integral domain, we can always adjoin $c_1^{-1}$ to $C$ if necessary.)

The iterated power series rings $C[[x]][[y]]$ and $C[[y]][[x]]$ are essentially the same, in that both consist of all sums $\sum_{m,n \ge 0} c_{m,n} x^m y^n$. We may therefore write $C[[x, y]]$ for either of these rings. However, the iterated Laurent series rings $C((x))[[y]]$, $C((y))[[x]]$, and $C[[y]]((x))$ are all different: in the first we have $(x - y)^{-1} = \sum_{n=0}^{\infty} y^n / x^{n+1}$, in the second we have $(x - y)^{-1} = -\sum_{n=0}^{\infty} x^n / y^{n+1}$, and in the third $x - y$ is not invertible.

It is sometimes convenient to work with power series in infinitely many variables; for example, we may consider the power series $\sum_{n=0}^{\infty} r_n t^n$ where the $r_n$ are independent indeterminates.
Although we don't give a formal definition of these series, they behave, in our applications, exactly as expected.

We use the notation $[x^n] f(x)$ to denote the coefficient of $x^n$ in the Laurent series $f(x)$. An important fact about the coefficient operator that we will use often, without comment, is that $[x^n]\, x^k f(x) = [x^{n-k}] f(x)$.

The binomial coefficient $\binom{a}{k}$ is defined to be $a(a-1)\cdots(a-k+1)/k!$ if $k$ is a nonnegative integer and 0 otherwise. Thus the binomial theorem $(1 + x)^a = \sum_{k=0}^{\infty} \binom{a}{k} x^k$ holds for all $a$.

2. The Lagrange inversion formula

2.1. Forms of Lagrange inversion.

We will give several proofs of the Lagrange inversion formula in section 4. Here we state several different forms of Lagrange inversion and show that they are equivalent.

Theorem 2.1.1.

Let $R(t)$ be a power series not involving $x$. Then there is a unique power series $f = f(x)$ such that $f(x) = xR(f(x))$, and for any Laurent series $\phi(t)$ and $\psi(t)$ not involving $x$ and any integer $n$ we have

\[ [x^n]\, \phi(f) = \frac{1}{n}\, [t^{n-1}]\, \phi'(t) R(t)^n, \quad \text{where } n \ne 0, \tag{2.1.1} \]
\[ [x^n]\, \phi(f) = [t^n] \bigl(1 - tR'(t)/R(t)\bigr) \phi(t) R(t)^n, \tag{2.1.2} \]
\[ \phi(f) = \sum_n x^n\, [t^n]\, (1 - xR'(t))\, \phi(t) R(t)^n, \tag{2.1.3} \]
\[ [x^n]\, \frac{\psi(f)}{1 - xR'(f)} = [t^n]\, \psi(t) R(t)^n, \tag{2.1.4} \]
\[ [x^n]\, \frac{\psi(f)}{1 - fR'(f)/R(f)} = [t^n]\, \psi(t) R(t)^n. \tag{2.1.5} \]

We show here that these formulas are equivalent in the sense that any one of them is easily derivable from any other; proofs of these formulas are given in section 4. It is clear that (2.1.4) and (2.1.5) are equivalent since $x = f/R(f)$. Taking $\psi(t) = (1 - tR'(t)/R(t))\,\phi(t)$ shows that (2.1.2) and (2.1.5) are equivalent.

To derive (2.1.3) from (2.1.4), we rewrite (2.1.4) as

\[ \frac{\psi(f)}{1 - xR'(f)} = \sum_n x^n\, [t^n]\, \psi(t) R(t)^n. \tag{2.1.6} \]

Until now, we have assumed that $\phi(t)$ and $\psi(t)$ do not involve $x$. We leave it to the reader to see that in (2.1.6) this assumption can be removed. Then (2.1.3) follows from (2.1.6) by setting $\psi(t) = (1 - xR'(t))\,\phi(t)$, and similarly (2.1.6) follows from (2.1.3).

Although we allow $R$ to be an arbitrary power series in Theorem 2.1.1, if $R$ has constant term 0 then $f(x) = 0$, so we may assume now that $R$ has a nonzero constant term, and thus $f$ and $R(f)$ are nonzero. Then the equation $f(x) = xR(f(x))$ may be rewritten as $f/R(f) = x$. So if we set $g(t) = t/R(t)$ then we have $g(f) = x$, and thus $g = f^{\langle -1\rangle}$.
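Formulas (2.1.1) and (2.1.4) can be sanity-checked numerically on truncated series. The sketch below is only an illustration under stated assumptions: the choices $R(t) = 1 + t + t^2$, $\phi(t) = 1 + 2t + t^3$, and $\psi(t) = t + t^2$ are arbitrary polynomial examples (not taken from the paper), the helper names are hypothetical, and $f$ is found by iterating $f \leftarrow xR(f)$, which fixes one more coefficient on each pass.

```python
from fractions import Fraction

N = 9  # work modulo x^N (and modulo t^N)

def pad(p):
    """Coefficient list of a polynomial, padded with zeros to length N."""
    return [Fraction(c) for c in p] + [Fraction(0)] * (N - len(p))

def mul(a, b):
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N)]

def power(a, n):
    out = pad([1])
    for _ in range(n):
        out = mul(out, a)
    return out

def inverse(a):
    inv = [Fraction(1) / a[0]]
    for n in range(1, N):
        inv.append(-sum(a[i] * inv[n - i] for i in range(1, n + 1)) / a[0])
    return inv

def compose(p, f):
    """p(f) for a polynomial p (short coefficient list), by Horner's rule."""
    out = pad([])
    for c in reversed(p):
        out = mul(out, f)
        out[0] += c
    return out

def derivative(p):
    return [n * c for n, c in enumerate(p)][1:]

R = [1, 1, 1]        # assumed example: R(t) = 1 + t + t^2
phi = [1, 2, 0, 1]   # assumed example: phi(t) = 1 + 2t + t^3
psi = [0, 1, 1]      # assumed example: psi(t) = t + t^2

# Solve f = x R(f) by fixed-point iteration.
f = pad([])
for _ in range(N):
    f = [Fraction(0)] + compose(R, f)[:N - 1]

# (2.1.1): [x^n] phi(f) = (1/n) [t^(n-1)] phi'(t) R(t)^n for n != 0.
lhs_211 = compose(phi, f)
rhs_211 = [Fraction(1, n) * mul(pad(derivative(phi)), power(pad(R), n))[n - 1]
           for n in range(1, N)]
ok_211 = lhs_211[1:] == rhs_211

# (2.1.4): [x^n] psi(f) / (1 - x R'(f)) = [t^n] psi(t) R(t)^n.
x_Rprime_f = [Fraction(0)] + compose(derivative(R), f)[:N - 1]
denom = [1 - x_Rprime_f[0]] + [-c for c in x_Rprime_f[1:]]
lhs_214 = mul(compose(psi, f), inverse(denom))
rhs_214 = [mul(pad(psi), power(pad(R), n))[n] for n in range(N)]
ok_214 = lhs_214 == rhs_214

print(f, ok_211, ok_214)
```

Only nonnegative $n$ below the truncation order are compared, and $\phi$, $\psi$ are polynomials here, so this exercises a special case of the theorem rather than the full Laurent series statement.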
It is sometimes convenient to rewrite the formulas of Theorem 2.1.1 using $g$, rather than $R$. Since $1 - tR'(t)/R(t) = tg'(t)/g(t)$, formula (2.1.2) takes on a slightly simpler form (which will be useful later on) when expressed in terms of $g$ rather than $R$:

\[ [x^n]\, \phi(f) = [t^{n-1}]\, \frac{\phi(t) g'(t)}{g(t)} \left(\frac{t}{g(t)}\right)^{n} = [t^{-1}]\, \frac{\phi(t) g'(t)}{g(t)^{n+1}}. \tag{2.1.7} \]

For future use, we note also the corresponding form for (2.1.5):

\[ [x^{n-1}]\, \frac{\psi(f)}{f g'(f)} = [t^n]\, \psi(t) \left(\frac{t}{g(t)}\right)^{n} = [t^0]\, \frac{\psi(t)}{g(t)^n}. \tag{2.1.8} \]

To show that (2.1.1) and (2.1.2) are equivalent, using (2.1.7) in place of (2.1.2), we show that

\[ [t^{-1}]\, \frac{\phi'(t)}{g(t)^n} = n\, [t^{-1}]\, \frac{\phi(t) g'(t)}{g(t)^{n+1}}. \tag{2.1.9} \]

But

\[ \frac{\phi'(t)}{g(t)^n} - n\, \frac{\phi(t) g'(t)}{g(t)^{n+1}} = \frac{d}{dt}\, \frac{\phi(t)}{g(t)^n}, \]

and the coefficient of $t^{-1}$ in the derivative of any Laurent series is 0, so (2.1.9) follows. This shows that (2.1.1) and (2.1.2) are equivalent if $n \ne 0$. If $\phi$ is a power series, then the coefficient of $x^0$ in $\phi(f)$ is simply the constant term in $\phi$, but if $\phi$ is a more general Laurent series, then the constant term in $\phi(f)$ is not so obvious, and cannot be determined by (2.1.1). In equation (2.2.8) we will give a formula for the constant term in $\phi(f)$ for all $\phi$.

The case $\phi(t) = t^k$ of (2.1.1), with $R(t) = t/g(t)$, may be written

\[ [x^n]\, f^k = \frac{k}{n}\, [t^{n-k}] \left(\frac{t}{g(t)}\right)^{n} = \frac{k}{n}\, [t^{-k}]\, g(t)^{-n}. \tag{2.1.10} \]

In other words, if $g = f^{\langle -1\rangle}$, and for all integers $k$, $f^k = \sum_n a_{n,k} x^n$ and $g^k = \sum_n b_{n,k} x^n$, then

\[ a_{n,k} = \frac{k}{n}\, b_{-k,-n} \tag{2.1.11} \]

for $n \ne 0$. Equation (2.1.11) is known as the Schur-Jabotinsky theorem. (See Schur [66, equation (10)] and Jabotinsky [37, Theorem II].)

2.2. Polynomials.

We now give a slightly more general form of Lagrange inversion based on the fact that if two polynomials agree at infinitely many values then they are identically equal. This will imply that to prove our Lagrange formulas for all $n$, it is sufficient to prove them in the case in which $n$ is a positive integer. (Some proofs require this restriction.)

By linearity, the formulas of Section 2.1 are implied by the special cases in which $\phi(t)$ and $\psi(t)$ are of the form $t^k$ for some integer $k$, and these special cases (especially $k = 0$ and $k = 1$) are particularly important. These special cases of (2.1.1), (2.1.4), and (2.1.5) may be written

\[ [x^n]\, f^k = \frac{k}{n}\, [t^{n-k}]\, R(t)^n, \quad \text{where } n \ne 0, \tag{2.2.1} \]
\[ [x^n]\, \frac{f^k}{1 - xR'(f)} = [t^{n-k}]\, R(t)^n, \tag{2.2.2} \]
\[ [x^n]\, \frac{f^k}{1 - fR'(f)/R(f)} = [t^{n-k}]\, R(t)^n. \tag{2.2.3} \]

In these formulas, let us assume that $R(t)$ has constant term 1. (It is not hard to modify our approach to take care of the more general situation in which the constant term of $R(t)$ is invertible.) Then the coefficient of $x$ in $f(x)$ is 1, so $f(x)/x$ has constant term 1. If we set $n = m + k$ in (2.2.1), (2.2.2), and (2.2.3) then the results may be written

\[ [x^m]\, (f/x)^k = \frac{k}{m+k}\, [t^m]\, R(t)^{m+k}, \quad \text{where } m + k \ne 0, \tag{2.2.4} \]
\[ [x^m]\, \frac{(f/x)^k}{1 - xR'(f)} = [t^m]\, R(t)^{m+k}, \tag{2.2.5} \]
\[ [x^m]\, \frac{(f/x)^k}{1 - fR'(f)/R(f)} = [t^m]\, R(t)^{m+k}. \tag{2.2.6} \]

It is easy to see that in each of these equations, for fixed $m$ both sides are polynomials in $k$. Thus if these equalities hold whenever $k$ is a positive integer, then they hold as identities of polynomials in $k$. Moreover, although (2.2.4) is invalid for $k = -m$, if $m > 0$ we may take the limit as $k \to -m$ with l'Hôpital's rule to obtain

\[ [x^m]\, (f/x)^{-m} = [x^0]\, f^{-m} = -m\, [t^m] \log R. \tag{2.2.7} \]

(Note that (2.2.7) does not hold for $m = 0$.)
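The claim that (2.2.4) persists for negative integers $k$, together with the limiting case (2.2.7), can be tested on a truncated example. This is a minimal sketch, assuming the sample series $R(t) = 1 + t + t^2$ (an arbitrary choice with constant term 1) and hypothetical helper names; it checks finitely many coefficients and is not a proof.

```python
from fractions import Fraction

N = 8  # work modulo x^N (and modulo t^N)

def mul(a, b):
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N)]

def inverse(a):
    inv = [Fraction(1) / a[0]]
    for n in range(1, N):
        inv.append(-sum(a[i] * inv[n - i] for i in range(1, n + 1)) / a[0])
    return inv

def power(a, k):
    """a^k for any integer k; negative k needs an invertible constant term."""
    base = a if k >= 0 else inverse(a)
    out = [Fraction(1)] + [Fraction(0)] * (N - 1)
    for _ in range(abs(k)):
        out = mul(out, base)
    return out

def log(a):
    """log of a series with constant term 1, using (log a)' = a'/a."""
    da = [(n + 1) * a[n + 1] for n in range(N - 1)] + [Fraction(0)]
    q = mul(da, inverse(a))
    return [Fraction(0)] + [q[n - 1] / n for n in range(1, N)]

def compose(p, f):
    """p(f) by Horner's rule, for a series f with zero constant term."""
    out = [Fraction(0)] * N
    for c in reversed(p):
        out = mul(out, f)
        out[0] += c
    return out

# Assumed example: R(t) = 1 + t + t^2.
R = [Fraction(1)] * 3 + [Fraction(0)] * (N - 3)

# g = f/x satisfies g = R(x g); iterate from g = 1.
g = [Fraction(1)] + [Fraction(0)] * (N - 1)
for _ in range(N):
    g = compose(R, [Fraction(0)] + g[:N - 1])

# (2.2.4) for positive and negative k, skipping the excluded case m + k = 0.
ok_224 = all(power(g, k)[m] == Fraction(k, m + k) * power(R, m + k)[m]
             for k in range(-3, 4) for m in range(N) if m + k != 0)

# (2.2.7): the excluded case, [x^m] (f/x)^(-m) = -m [t^m] log R for m > 0.
log_R = log(R)
ok_227 = all(power(g, -m)[m] == -m * log_R[m] for m in range(1, N))

print(ok_224, ok_227)
```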
By linearity, (2.2.7) yields a supplement to (2.1.1) that takes care of the case $n = 0$:

\[ [x^0]\, \phi(f) = [t^0]\, \phi(t) + [t^{-1}]\, \phi'(t) \log R. \tag{2.2.8} \]

We can also differentiate (2.2.4) with respect to $k$ and then set $k = 0$ to obtain

\[ [x^m] \log(f/x) = \frac{1}{m}\, [t^m]\, R(t)^m, \quad \text{for } m \ne 0. \tag{2.2.9} \]

Returning to (2.2.1)–(2.2.3), we see that if they hold when $n$ and $k$ are positive integers, then they also hold when $n$ and $k$ are arbitrary integers. (Note that if $n < k$ then everything is zero.)

2.3. A simple example: Catalan numbers.

The Catalan numbers $C_n$ may be defined by the equation

\[ c(x) = 1 + x\, c(x)^2 \tag{2.3.1} \]

for their generating function $c(x) = \sum_{n=0}^{\infty} C_n x^n$. The quadratic equation (2.3.1) has two solutions, $\bigl(1 \pm \sqrt{1 - 4x}\bigr)/(2x)$, but only the minus sign gives a power series, so

\[ c(x) = \frac{1 - \sqrt{1 - 4x}}{2x}. \]

Unfortunately (2.3.1) is not of the form $f(x) = xR(f(x))$, so we cannot apply directly any of the versions of Lagrange inversion that we have seen so far.

One way to apply Lagrange inversion is to set $f(x) = c(x) - 1$, so that $f = x(1 + f)^2$. We may then apply Theorem 2.1.1 to the case $R(t) = (1 + t)^2$. The equation $f = x(1 + f)^2$ has the solution

\[ f(x) = c(x) - 1 = x\, c(x)^2 = \frac{1 - \sqrt{1 - 4x}}{2x} - 1. \]

Then (2.1.1) with $\phi(t) = (1 + t)^k$ gives for $n > 0$,

\[ [x^n]\, c(x)^k = [x^n]\, (1 + f)^k = \frac{1}{n}\, [t^{n-1}]\, k(1 + t)^{k-1} (1 + t)^{2n} = \frac{k}{n} \binom{2n + k - 1}{n - 1}. \]

Thus since the constant term in $c(x)^k$ is 1, we have

\[ c(x)^k = 1 + \sum_{n=1}^{\infty} \frac{k}{n} \binom{2n + k - 1}{n - 1} x^n. \]

The sum may also be written

\[ c(x)^k = \sum_{n=0}^{\infty} \frac{k}{2n + k} \binom{2n + k}{n} x^n = \sum_{n=0}^{\infty} \frac{k}{n + k} \binom{2n + k - 1}{n} x^n. \tag{2.3.2} \]

These formulas are valid for all $k$ except where $n = -k/2$ in the first sum or $n = -k$ in the second sum in (2.3.2). These coefficients are called ballot numbers, and for $k = 1$, (2.3.2) gives the usual formula for the Catalan numbers, $C_n = \frac{1}{n+1} \binom{2n}{n}$.

Equation (2.1.2) with $\phi(t) = (1 + t)^k$ gives a formula for the ballot number as a difference of two binomial coefficients,

\[ [x^n]\, c(x)^k = [t^n]\, \frac{1 - t}{1 + t}\, (1 + t)^k (1 + t)^{2n} = [t^n]\, (1 - t)(1 + t)^{2n + k - 1} = \binom{2n + k - 1}{n} - \binom{2n + k - 1}{n - 1}, \]

and equation (2.1.3) with $\phi(t) = (1 + t)^k$ gives another such formula,

\[ c(x)^k = \sum_n x^n\, [t^n] \bigl( (1 + t)^{2n + k} - 2x (1 + t)^{2n + k + 1} \bigr) = \sum_n x^n \left[ \binom{2n + k}{n} - 2 \binom{2n + k - 1}{n - 1} \right]. \]

Finally, (2.2.9) gives

\[ [x^m] \log(f/x) = [x^m]\, 2 \log c(x) = \frac{1}{m}\, [t^m]\, (1 + t)^{2m} = \frac{1}{m} \binom{2m}{m}, \]

so

\[ \log c(x) = \sum_{m=1}^{\infty} \frac{1}{2m} \binom{2m}{m} x^m. \]

Since $R(t) = (1 + t)^2$, we have $R'(t) = 2(1 + t)$, so $1 - xR'(f) = 1 - 2x\,c(x) = \sqrt{1 - 4x}$. Thus (2.1.4), with $\psi(t) = (1 + t)^k$, gives

\[ [x^n]\, \frac{c(x)^k}{\sqrt{1 - 4x}} = [t^n]\, (1 + t)^k (1 + t)^{2n} = \binom{2n + k}{n}, \]

so

\[ \frac{c(x)^k}{\sqrt{1 - 4x}} = \sum_{n=0}^{\infty} \binom{2n + k}{n} x^n. \tag{2.3.3} \]

Equating coefficients of $x^n$ in $c(x)^k c(x)^l = c(x)^{k+l}$, and using (2.3.2), gives the convolution identity

\[ \sum_{i + j = n} \frac{k}{2i + k} \binom{2i + k}{i} \cdot \frac{l}{2j + l} \binom{2j + l}{j} = \frac{k + l}{2n + k + l} \binom{2n + k + l}{n}. \]

Similarly, using (2.3.2) and (2.3.3) we get

\[ \sum_{i + j = n} \frac{k}{2i + k} \binom{2i + k}{i} \binom{2j + l}{j} = \binom{2n + k + l}{n}. \]

These convolution identities are special cases of identities discussed in section 3.3.
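The Catalan and ballot number formulas of this section are easy to confirm for small cases. The following sketch, with hypothetical helper names and an arbitrary truncation order, solves $c = 1 + x c^2$ by fixed-point iteration and compares powers of $c(x)$ with (2.3.2), with the difference-of-binomials form, with (2.3.3), and with the first convolution identity for the sample values $k = 2$, $l = 3$. It takes the standard expansion $1/\sqrt{1 - 4x} = \sum_n \binom{2n}{n} x^n$ as given.

```python
from fractions import Fraction
from math import comb

N = 10  # number of coefficients kept

def mul(a, b):
    """Product of two length-N coefficient lists, truncated to N terms."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N)]

def power(a, k):
    out = [1] + [0] * (N - 1)
    for _ in range(k):
        out = mul(out, a)
    return out

# Solve c = 1 + x c^2; each pass fixes one more coefficient.
c = [0] * N
for _ in range(N):
    c = [1] + mul(c, c)[:N - 1]

def ballot(n, k):
    """Coefficient of x^n in c(x)^k according to the first sum in (2.3.2)."""
    return Fraction(k, 2 * n + k) * comb(2 * n + k, n)

def ballot_difference(n, k):
    """The same coefficient as a difference of two binomial coefficients."""
    lower = comb(2 * n + k - 1, n - 1) if n >= 1 else 0
    return comb(2 * n + k - 1, n) - lower

powers_ok = all(power(c, k)[n] == ballot(n, k) == ballot_difference(n, k)
                for k in range(1, 5) for n in range(N))

# (2.3.3): coefficients of c(x)^k / sqrt(1 - 4x) are binom(2n + k, n).
central = [comb(2 * n, n) for n in range(N)]   # series of 1 / sqrt(1 - 4x)
sqrt_ok = all(mul(power(c, k), central)[n] == comb(2 * n + k, n)
              for k in range(0, 4) for n in range(N))

# First convolution identity, for the sample values k = 2, l = 3.
k, l = 2, 3
conv_ok = all(sum(ballot(i, k) * ballot(n - i, l) for i in range(n + 1))
              == ballot(n, k + l) for n in range(N))

print(c, powers_ok, sqrt_ok, conv_ok)
```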

Exercise 2.3.1.

Derive these formulas for $c(x)$ in other ways by applying Lagrange inversion to the equations $f = x/(1 - f)$ and $f = x(1 + f^2)$.

2.4. A generalization.

There is another way to apply Lagrange inversion to the equation $c(x) = 1 + x\,c(x)^2$ that, while very simple, has far-reaching consequences. Consider the equation

\[ F = z(1 + xF^2), \tag{2.4.1} \]

where we think of $F$ as a power series in $z$ with coefficients that are polynomials in $x$. We may apply (2.2.1) to (2.4.1) to get

\[ [z^n]\, F^k = \frac{k}{n}\, [t^{n-k}]\, (1 + xt^2)^n. \]

The right side is 0 unless $n - k$ is even, and for $n = 2m + k$ we have

\[ [z^{2m+k}]\, F^k = \frac{k}{2m + k}\, [t^{2m}]\, (1 + xt^2)^{2m+k} = \frac{k}{2m + k} \binom{2m + k}{m} x^m. \]

Thus we have (if $k$ is an integer but not a negative even integer)

\[ F^k = \sum_{m=0}^{\infty} \frac{k}{2m + k} \binom{2m + k}{m} x^m z^{2m+k}. \tag{2.4.2} \]

Now let $c$ be the result of setting $z = 1$ in $F$ (an admissible substitution), so by (2.4.2), we have

\[ c^k = \sum_{m=0}^{\infty} \frac{k}{2m + k} \binom{2m + k}{m} x^m, \]

and by (2.4.1) we have $c = 1 + xc^2$. Moreover, as we have seen before, $c = 1 + xc^2$ has a unique power series solution.

The same idea works much more generally, but we must take care that the substitution is admissible. For example, we can solve $f = x(1 + f^2)$ by Lagrange inversion, but we cannot set $x = 1$ in the solution.

The case in which the coefficients of $R(t)$ are indeterminates is easy to deal with.

Theorem 2.4.1.

Suppose that $R(t) = \sum_{n=0}^{\infty} r_n t^n$, where the $r_n$ are indeterminates. Then there is a unique power series $f$ satisfying $f = R(f)$. If $\phi(t)$ is a power series then

\[ \phi(f) = \phi(0) + \sum_{n=1}^{\infty} \frac{1}{n}\, [t^{n-1}]\, \phi'(t) R(t)^n, \tag{2.4.3} \]

and for any Laurent series $\phi(t)$ and $\psi(t)$ we have

\[ \phi(f) = [t^0]\, \phi(t) + [t^{-1}]\, \phi'(t) \log(R/r_0) + \sum_{n \ne 0} \frac{1}{n}\, [t^{n-1}]\, \phi'(t) R(t)^n, \tag{2.4.4} \]
\[ \phi(f) = \sum_n [t^n] \bigl(1 - tR'(t)/R(t)\bigr) \phi(t) R(t)^n = \sum_n [t^n]\, (1 - R'(t))\, \phi(t) R(t)^n, \tag{2.4.5} \]
\[ \frac{\psi(f)}{1 - R'(f)} = \frac{\psi(f)}{1 - fR'(f)/R(f)} = \sum_n [t^n]\, \psi(t) R(t)^n. \tag{2.4.6} \]

In (2.4.5) and (2.4.6) the sum is over all integers $n$.

Proof. These formulas follow from equations (2.1.1) to (2.1.5) on making the admissible substitution $x = 1$, where for (2.4.4) we have included the correction term given by (2.2.8), modified to take into account that the constant term of $R(t)$ is $r_0$ rather than 1. Uniqueness follows by equating coefficients of the monomials $r_0^{i_0} r_1^{i_1} \cdots$ on both sides of $f = R(f)$, which gives a recurrence that determines them uniquely. □

We would like to relax the requirement in Theorem 2.4.1 that the $r_n$ be indeterminates. To do this, we can take any of the formulas of Theorem 2.4.1 and apply any admissible substitution for the $r_n$. For example, the following result, while not the most general possible, is sometimes useful.

Theorem 2.4.2.

Suppose that $R(t) = \sum_{n=0}^{\infty} r_n t^n$, where the coefficients lie in some power series ring $C[[u_1, u_2, \ldots]]$, and that each $r_n$ with $n > 0$ is divisible by some $u_i$. Then there is a unique power series $f$ satisfying $f = R(f)$, and formulas (2.4.3) to (2.4.6) hold. □

We note that more generally, any admissible substitution will yield a solution of $f = R(f)$ for which these formulas hold, but uniqueness is not guaranteed. For example, the equation $f = x + yf^2$ has the unique power series solution

\[ f = \frac{1 - \sqrt{1 - 4xy}}{2y} = \sum_{n=0}^{\infty} \frac{1}{n + 1} \binom{2n}{n} x^{n+1} y^n. \]

The admissible substitution $y = 1$ gives $f = \tfrac{1}{2}\bigl(1 - \sqrt{1 - 4x}\bigr)$ as a power series solution of the equation $f = x + f^2$. However, the equation $f = x + f^2$ has another power series solution, $f = \tfrac{1}{2}\bigl(1 + \sqrt{1 - 4x}\bigr)$.

Exercise 2.4.3.

The equation $f = x + f^2$ has two power series solutions, $f = \tfrac{1}{2}\bigl(1 \pm \sqrt{1 - 4x}\bigr)$. However, according to Theorem 2.1.1, the equivalent equation $f = x/(1 - f)$ has only one power series solution. Explain the discrepancy.

2.5. Explicit formulas for the coefficients.

It is sometimes useful to have an explicit formula for the coefficients of $f^k$ where $f = xR(f)$. With $R(t) = \sum_{n=0}^{\infty} r_n t^n$, if we expand $R(t)^n$ by the multinomial theorem then (2.2.1) gives

\[ f^k = \sum_{\substack{n_0 + n_1 + \cdots = n \\ n_1 + 2n_2 + 3n_3 + \cdots = n - k}} \frac{k\,(n - 1)!}{n_0!\, n_1! \cdots}\; r_0^{n_0} r_1^{n_1} \cdots x^n. \tag{2.5.1} \]

We might also want to express the coefficients of $f^k$ in terms of the coefficients of $g = f^{\langle -1\rangle}$. Suppose that $g(x) = x - g_2 x^2 - g_3 x^3 - \cdots$, where the minus signs and the assumption that the coefficient of $x$ in $g$ is 1 make our formula simpler with no real loss of generality. Then (2.2.1) gives

\[ [x^m]\, f^k = \frac{k}{m}\, [t^{m-k}] \left( \frac{1}{1 - g_2 t - g_3 t^2 - \cdots} \right)^{m}. \]

Expanding by the binomial theorem and simplifying gives

\[ f^k = \sum_{\substack{n_2 + n_3 + \cdots = n - m \\ n_2 + 2n_3 + \cdots = m - k}} \frac{k\,(n - 1)!}{m!\, n_2!\, n_3! \cdots}\; g_2^{n_2} g_3^{n_3} \cdots x^m, \]

where the sum is over all $m$, $n$, and $n_2, n_3, \ldots$ satisfying the two subscripted equalities. If we replace $m$ with $n_1$ then we may write the formula as

\[ f^k = \sum_{\substack{n_1 + n_2 + n_3 + \cdots = n \\ 2n_2 + 3n_3 + \cdots = n - k}} \frac{k\,(n - 1)!}{n_1!\, n_2!\, n_3! \cdots}\; x^{n_1} g_2^{n_2} g_3^{n_3} \cdots \tag{2.5.2} \]

and we see that the coefficients here are exactly the same as the coefficients in (2.5.1) (with $n_0 = 0$).

Exercise 2.5.1.

Explain the connection between (2.5.1) and (2.5.2) without using Lagrange inversion.

2.6. Derivative formulas.

Lagrange inversion, especially in its analytic formulations, isoften stated in terms of derivatives. We give here several derivative forms of Lagrangeinversion.

Theorem 2.6.1.

Let $G(t) = \sum_{n=0}^{\infty} g_n t^n$, where the $g_i$ are indeterminates. Then there is a unique power series $f$ satisfying $f = x + G(f)$, and for any power series $\phi(t)$ and $\psi(t)$ we have

\[ \phi(f) = \sum_{m=0}^{\infty} \frac{d^m}{dx^m} \left( \phi(x) \bigl(1 - G'(x)\bigr) \frac{G(x)^m}{m!} \right), \tag{2.6.1} \]
\[ \phi(f) = \phi(x) + \sum_{m=1}^{\infty} \frac{d^{m-1}}{dx^{m-1}} \left( \phi'(x)\, \frac{G(x)^m}{m!} \right), \tag{2.6.2} \]

and

\[ \frac{\psi(f)}{1 - G'(f)} = \sum_{m=0}^{\infty} \frac{d^m}{dx^m} \left( \psi(x)\, \frac{G(x)^m}{m!} \right). \tag{2.6.3} \]

Proof.

We first prove (2.6.3). By (2.4.6) we have

\[ \frac{\psi(f)}{1 - G'(f)} = \sum_{n=0}^{\infty} [t^n]\, \psi(t) (x + G(t))^n = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} [t^n] \binom{n}{m} x^{n-m} \psi(t) G(t)^m. \]

So to prove (2.6.3), it suffices to prove that

\[ \frac{d^m}{dx^m} \left( \psi(x)\, \frac{G(x)^m}{m!} \right) = \sum_{n=0}^{\infty} [t^n] \binom{n}{m} x^{n-m} \psi(t) G(t)^m. \]

We show more generally that for any power series $\alpha(t)$ we have

\[ \frac{d^m}{dx^m}\, \frac{\alpha(x)}{m!} = \sum_{n=0}^{\infty} [t^n] \binom{n}{m} x^{n-m} \alpha(t). \tag{2.6.4} \]

But by linearity, it is sufficient to prove (2.6.4) for the case $\alpha(t) = t^j$, where both sides are equal to $\binom{j}{m} x^{j-m}$.

Next, (2.6.1) follows from (2.6.3) by taking $\psi(t) = (1 - G'(t))\,\phi(t)$.

Finally, we derive (2.6.2) from (2.6.1). Writing $D$ for $d/dx$, $\phi$ for $\phi(x)$, and $G$ for $G(x)$, we have

\[ D^m \left( \phi\, \frac{G^m}{m!} \right) = D^{m-1} \left( \frac{D(\phi G^m)}{m!} \right) = D^{m-1} \left( \phi'\, \frac{G^m}{m!} + \phi G'\, \frac{G^{m-1}}{(m-1)!} \right). \]

Thus

\[ D^{m-1} \left( \phi'\, \frac{G^m}{m!} \right) = D^m \left( \phi\, \frac{G^m}{m!} \right) - D^{m-1} \left( \phi G'\, \frac{G^{m-1}}{(m-1)!} \right), \]

so

\[ \phi + \sum_{m=1}^{\infty} D^{m-1} \left( \phi'\, \frac{G^m}{m!} \right) = \sum_{m=0}^{\infty} D^m \left( \phi\, \frac{G^m}{m!} \right) - \sum_{m=0}^{\infty} D^m \left( \phi G'\, \frac{G^m}{m!} \right). \qquad \square \]

As before, applying an admissible substitution allows more general coefficients to be used.

As an application of these formulas, let us take $G(x) = zH(x)$ and consider the formula

\[ \phi(f) \cdot \frac{\psi(f)}{1 - zH'(f)} = \frac{\phi(f)\psi(f)}{1 - zH'(f)}. \]
Applying (2.6.2) and (2.6.3) to the left side and (2.6.3) to the right, and then equating coefficients of $z^n$, gives the convolution identity

\[ \sum_{m=0}^{n} \binom{n}{m} \frac{d^{m-1}}{dx^{m-1}} \bigl( \phi'(x) H(x)^m \bigr)\, \frac{d^{n-m}}{dx^{n-m}} \bigl( \psi(x) H(x)^{n-m} \bigr) = \frac{d^n}{dx^n} \bigl( \phi(x) \psi(x) H(x)^n \bigr), \tag{2.6.5} \]

and similarly, expanding $\phi(f)\psi(f)$ in two ways using (2.6.2) gives

\[ \sum_{m=0}^{n} \binom{n}{m} \frac{d^{m-1}}{dx^{m-1}} \bigl( \phi'(x) H(x)^m \bigr)\, \frac{d^{n-m-1}}{dx^{n-m-1}} \bigl( \psi'(x) H(x)^{n-m} \bigr) = \frac{d^{n-1}}{dx^{n-1}} \bigl( (\phi(x)\psi(x))' H(x)^n \bigr). \tag{2.6.6} \]

Here $d^{m-1} \phi'(x)/dx^{m-1}$ for $m = 0$ is to be interpreted as $\phi(x)$. Formulas (2.6.5) and (2.6.6) were found by Cauchy [10]; a formula equivalent to (2.6.5) had been found earlier by Pfaff [58]. A detailed historical discussion of these identities and generalizations has been given by Johnson [40]; see also Chu [12, 14] and Abel [2]. Our approach to these identities has also been given by Huang and Ma [36].

Exercise 2.6.2.

With the notation of Theorem 2.6.1, show that for any positive integer $k$,

\[ G(f)^k = \sum_{m=0}^{\infty} \frac{k}{(m + k)\, m!}\, \frac{d^m}{dx^m}\, G(x)^{m+k}. \]

3. Applications

In this section we describe some applications of Lagrange inversion.

3.1. A rational function expansion.

It is surprising that Lagrange inversion can give interesting results when the solution to the equation to be solved is rational.

We consider the equation

\[ f = 1 + a + abf, \quad \text{with solution} \quad f = \frac{1 + a}{1 - ab}. \]

We apply (2.4.6) with $R(t) = 1 + a + abt$ and $\psi(t) = t^r (1 + bt)^s$. Here we have $1 + bf = (1 + b)/(1 - ab)$ and $1 - R'(f) = 1 - ab$. Then

\[ \frac{(1 + a)^r (1 + b)^s}{(1 - ab)^{r+s+1}} = \sum_n [t^n]\, t^r (1 + bt)^s (1 + a + abt)^n = \sum_{n,i} [t^n] \binom{n}{i} a^i t^r (1 + bt)^{s+i} \]
\[ = \sum_{n,i,j} [t^n] \binom{n}{i} a^i \binom{s + i}{j} b^j t^{r+j} = \sum_{i,j} \binom{r + j}{i} \binom{s + i}{j} a^i b^j. \]

For another approach to this identity, see Gessel and Stanton [26].

3.2. The tree function.

In applying Lagrange inversion, the nicest examples are those in which the series $R(t)$ has the property that there is a simple formula for the coefficients of $R(t)^n$, and these simple formulas usually come from the exponential function or the binomial theorem. In this section we discuss the simplest case, in which $R(t) = e^t$. Later, in section 3.5, we discuss a more complicated example involving the exponential function.

Let $T(x)$ be the power series satisfying

\[ T(x) = x e^{T(x)}. \]

Equivalently, $T(x) = (x e^{-x})^{\langle -1\rangle}$ and thus $T(x e^{-x}) = x$. Then by the properties of exponential generating functions (see, e.g., Stanley [70, Chapter 5]), $T(x)$ is the exponential generating function for rooted trees and $e^{T(x)} = T(x)/x$ is the exponential generating function for forests of rooted trees. We shall call $T(x)$ the tree function and we shall call $F(x) = e^{T(x)}$ the forest function. The tree function is closely related to the Lambert $W$ function [16, 17], which may be defined by $W(x) = -T(-x)$. Although the Lambert $W$ function is better known, we will state our results in terms of the tree and forest functions.

Applying (2.2.1) with $R(t) = e^t$ gives

\[ T(x) = \sum_{n=1}^{\infty} n^{n-1} \frac{x^n}{n!} \tag{3.2.1} \]

and more generally,

\[ \frac{T(x)^k}{k!} = \sum_{n=k}^{\infty} k\, n^{n-k-1} \binom{n}{k} \frac{x^n}{n!} \tag{3.2.2} \]

for all positive integers $k$, and

\[ F(x)^k = e^{kT(x)} = \sum_{n=0}^{\infty} k (n + k)^{n-1} \frac{x^n}{n!} \tag{3.2.3} \]

for all $k$. Equation (3.2.2) implies that there are $k\, n^{n-k-1} \binom{n}{k}$ forests of $k$ rooted trees on $n$ vertices, and equation (3.2.3) implies that there are $k(n + k)^{n-1}$ forests with vertex set $\{1, 2, \ldots, n + k\}$ in which the roots are $1, 2, \ldots, k$.

An interesting special case of (3.2.3) is $k = -1$, which may be rearranged to

\[ F(x) = \left( 1 - \sum_{n=1}^{\infty} (n - 1)^{n-1} \frac{x^n}{n!} \right)^{-1}. \tag{3.2.4} \]

Equation (3.2.4) may be interpreted in terms of prime parking functions [70, Exercise 5.49f, p. 95; Solution, p. 141].

Applying (2.2.6) gives

\[ \frac{F(x)^k}{1 - T(x)} = \sum_{n=0}^{\infty} (n + k)^n \frac{x^n}{n!}. \tag{3.2.5} \]

A. Lacasse [46, p. 90] conjectured an identity that may be written as

\[ U(x)^3 - U(x)^2 = \sum_{n=0}^{\infty} n^{n+1} \frac{x^n}{n!}, \tag{3.2.6} \]

where

\[ U(x) = \sum_{n=0}^{\infty} n^n \frac{x^n}{n!}. \]

Proofs of Lacasse's conjecture were given by Chen et al. [11], Prodinger [59], Sun [73], and Younsi [80]. We will prove (3.2.6) by showing that both sides are equal to $T(x)/(1 - T(x))^3$. To do this, we first note that the right side of (3.2.6) is

\[ \frac{d}{dx} \sum_{n=0}^{\infty} (n - 1)^n \frac{x^n}{n!} = \frac{d}{dx}\, \frac{e^{-T(x)}}{1 - T(x)} = \frac{e^{-T(x)}\, T(x)\, T'(x)}{(1 - T(x))^2}. \tag{3.2.7} \]

Differentiating (3.2.1) with respect to $x$ gives

\[ T'(x) = \sum_{n=0}^{\infty} (n + 1)^n \frac{x^n}{n!}, \]

which by (3.2.5) is equal to $e^{T(x)}/(1 - T(x))$. Thus (3.2.7) is equal to $T(x)/(1 - T(x))^3$. But by (3.2.5), $U(x) = 1/(1 - T(x))$, from which it follows easily that $U(x)^3 - U(x)^2 = T(x)/(1 - T(x))^3$.

The series $T(x)$ and $U(x)$ were studied by Zvonkine [79], who showed that $D^k U(x)$ and $D^k U(x)^2$, where $D = x\,d/dx$, are polynomials in $U(x)$.

The series $\sum_{n=0}^{\infty} (n + k)^{n+m} x^n/n!$, where $m$ is an arbitrary integer, can be expressed in terms of $T(x)$. (The case $k = 0$ has been studied by Smiley [67].) We first deal with the case in which $m$ is negative.

Theorem 3.2.1.

Let $l$ be a positive integer. Then for some polynomial $p_l(u)$ of degree $l - 1$, with coefficients that are rational functions of $k$, we have

\[ \sum_{n=0}^{\infty} (n + k)^{n-l} \frac{x^n}{n!} = e^{kT(x)}\, p_l(T(x)). \tag{3.2.8} \]

The first three polynomials $p_l(u)$ are

\[ p_1(u) = \frac{1}{k}, \qquad p_2(u) = \frac{1}{k^2} - \frac{u}{k(k + 1)}, \qquad p_3(u) = \frac{1}{k^3} - \frac{2k + 1}{k^2 (k + 1)^2}\, u + \frac{u^2}{k(k + 1)(k + 2)}. \]

Before proving Theorem 3.2.1 let us check (3.2.8) for $l = 1$ and $l = 2$. The case $l = 1$ is equivalent to (3.2.3). For $l = 2$, we have

\[ e^{kT(x)} = F(x)^k = k \sum_{n=0}^{\infty} (n + k)^{n-1} \frac{x^n}{n!} \]

and

\[ e^{kT(x)}\, T(x) = F(x)^k \cdot xF(x) = xF(x)^{k+1} = x(k + 1) \sum_{n=0}^{\infty} (n + k + 1)^{n-1} \frac{x^n}{n!} = (k + 1) \sum_{n=0}^{\infty} n (n + k)^{n-2} \frac{x^n}{n!}. \]

Thus

\[ e^{kT(x)} \left( \frac{1}{k^2} - \frac{T(x)}{k(k + 1)} \right) = \frac{1}{k} \sum_{n=0}^{\infty} \bigl( (n + k)^{n-1} - n (n + k)^{n-2} \bigr) \frac{x^n}{n!} = \sum_{n=0}^{\infty} (n + k)^{n-2} \frac{x^n}{n!}. \]

The general case can be proved in a similar way. (See Exercise 3.2.4.) However, it is instructive to take a different approach, using finite differences (cf. Gould [31]), that we will use again in section 3.3.

Let $s$ be a function defined on the nonnegative integers. The shift operator $E$ takes $s$ to the function $Es$ defined by $(Es)(n) = s(n + 1)$. We denote by $I$ the identity operator that takes $s$ to itself and by $\Delta$ the difference operator $E - I$, so $(\Delta s)(n) = s(n + 1) - s(n)$. It is easily verified that if $s$ is a polynomial of degree $d > 0$ with leading coefficient $L$ then $\Delta s$ is a polynomial of degree $d - 1$ with leading coefficient $dL$, and if $s$ is a constant then $\Delta s = 0$. The $k$th difference of $s$ is the function $\Delta^k s$. Thus if $s$ is a polynomial of degree $d$ with leading coefficient $L$ then $\Delta^k s = 0$ for $k > d$ and $\Delta^d s$ is the constant $d!\,L$.

Since the operators $E$ and $I$ commute, we can expand $\Delta^k = (E - I)^k$ by the binomial theorem to obtain

\[ (\Delta^k s)(n) = \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} s(n + i). \]

We may summarize the result of this discussion in the following lemma.

Lemma 3.2.2.
Let $s$ be a polynomial of degree $d$ with leading coefficient $L$. Then

\[ \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} s(n + i) \]

is 0 if $k > d$ and is the constant $d!\,L$ for $k = d$. □

Proof of Theorem 3.2.1.

If we set $x = ue^{-u}$ in Theorem 3.2.1 and use the fact that $T(ue^{-u}) = u$, we see that Theorem 3.2.1 is equivalent to the formula

\[ e^{-ku} \sum_{n=0}^{\infty} (n + k)^{n-l} \frac{(ue^{-u})^n}{n!} = p_l(u). \]

We have

\[ e^{-ku} \sum_{n=0}^{\infty} (n + k)^{n+m} \frac{(ue^{-u})^n}{n!} = \sum_{n=0}^{\infty} (n + k)^{n+m} \frac{u^n}{n!}\, e^{-(n+k)u} = \sum_{j=0}^{\infty} \frac{u^j}{j!} \sum_{n=0}^{j} (-1)^{j-n} \binom{j}{n} (n + k)^{j+m}. \tag{3.2.9} \]

If $m = -l$ is a negative integer and $j \ge l$ then $(n + k)^{j+m} = (n + k)^{j-l}$ is a polynomial in $n$ of degree less than $j$, so by Lemma 3.2.2 the inner sum in (3.2.9) is 0. Thus (3.2.8) follows, with

\[ p_l(u) = \sum_{j=0}^{l-1} \frac{u^j}{j!} \sum_{n=0}^{j} (-1)^{j-n} \binom{j}{n} (n + k)^{-(l-j)}. \qquad \square \]

We cannot set $k = 0$ in (3.2.8), since the $n = 0$ term on the left is $k^{-l}$. However it is not hard to evaluate $\sum_{n=1}^{\infty} n^{n-l} x^n/n!$.

Theorem 3.2.3.

Let $q_l(u)$ be the result of setting $k = 1$ in $p_l(u)$. Then

\[ \sum_{n=1}^{\infty} n^{n-l} \frac{x^n}{n!} = T(x)\, q_l(T(x)). \]

Proof.

We have

\[ \sum_{n=1}^{\infty} n^{n-l} \frac{x^n}{n!} = x \sum_{n=0}^{\infty} (n + 1)^{n-l} \frac{x^n}{n!}. \]

By (3.2.8) with $k = 1$ this is $x e^{T(x)} q_l(T(x)) = T(x)\, q_l(T(x))$. □

Exercise 3.2.4.

Prove Theorem 3.2.1 by finding a formula for the coefficient $t_j(n)$ of $x^n/n!$ in $e^{kT(x)} T(x)^j$ and showing that $(n + k)^{n-l}$ can be expressed as a linear combination, with coefficients that are rational functions of $k$, of $t_0(n), \ldots, t_{l-1}(n)$.

Next we consider $\sum_{n=0}^{\infty} (n + k)^{n+m} x^n/n!$ where $m$ is a nonnegative integer. (We evaluated the case $k = 0$, $m = 1$ in our discussion of Lacasse's conjecture.)

Theorem 3.2.5. Let $m$ be a nonnegative integer. Then there exists a polynomial $r_m(u, k)$, with integer coefficients, of degree $m$ in $u$ and degree $m$ in $k$, such that

\[ \sum_{n=0}^{\infty} (n + k)^{n+m} \frac{x^n}{n!} = \frac{e^{kT(x)}\, r_m(T(x), k)}{(1 - T(x))^{2m+1}}. \]

The first three polynomials $r_m(u, k)$ are

\[ r_0(u, k) = 1, \qquad r_1(u, k) = k + (1 - k)u, \qquad r_2(u, k) = k^2 + (1 + 3k - 2k^2)u + (2 - 3k + k^2)u^2. \]

Proof sketch.

We give here a sketch of a proof that tells us something interesting about the polynomials $r_m(u,k)$; for a more direct approach see Exercise 3.2.6.

As in the proof of Theorem 3.2.1 we set $x = ue^{-u}$ and consider the sum on the left side of (3.2.9). Following Carlitz [9], we define the weighted Stirling numbers of the second kind $R(n,j,k)$ by
$$R(n,j,k) = \frac{1}{j!}\sum_{i=0}^{j} (-1)^{j-i}\binom{j}{i}(k+i)^n,$$
so that
$$\sum_{n=0}^{\infty} R(n,j,k)\,\frac{x^n}{n!} = e^{kx}\,\frac{(e^x-1)^j}{j!}. \tag{3.2.10}$$
(For $k = 0$, $R(n,j,k)$ reduces to the ordinary Stirling number of the second kind $S(n,j)$.) Equation (3.2.10) implies that $R(n,j,k)$ is a polynomial in $k$ with integer coefficients. Then (3.2.9) is equal to $\sum_{j=0}^{\infty} R(j+m,j,k)\,u^j$. It is not hard to show that for fixed $m$, $R(j+m,j,k)$ is a polynomial in $j$ of degree $2m$. Thus
$$\sum_{j=0}^{\infty} R(j+m,j,k)\,u^j = \frac{r_m(u,k)}{(1-u)^{2m+1}}$$
for some polynomial $r_m(u,k)$ of degree at most $2m$. We omit the proof that $r_m(u,k)$ actually has degree $m$ in $u$. □

For $k = 1$, the coefficients of $r_m(u,1)$ are positive integers, sometimes called second-order Eulerian numbers; see, for example, [32, p. 270] and [24].
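The proof sketch of Theorem 3.2.5 lends itself to a numerical sanity check. The following Python sketch (the names `weighted_stirling` and `r_poly` are ours, introduced only for illustration) computes Carlitz's $R(j+m,j,k)$ from its defining sum and multiplies the series $\sum_j R(j+m,j,k)u^j$ by $(1-u)^{2m+1}$. If the theorem is read as above, the product should be the polynomial $r_m(u,k)$, so every coefficient beyond $u^m$ should vanish.

```python
from fractions import Fraction
from math import comb, factorial


def weighted_stirling(n, j, k):
    """Carlitz's weighted Stirling number R(n, j, k), from its defining sum."""
    total = sum((-1) ** (j - i) * comb(j, i) * (k + i) ** n for i in range(j + 1))
    return Fraction(total, factorial(j))


def r_poly(m, k, terms=10):
    """First `terms` coefficients of (1 - u)^(2m+1) * sum_j R(j+m, j, k) u^j."""
    series = [weighted_stirling(j + m, j, k) for j in range(terms)]
    # Coefficients of (1 - u)^(2m+1).
    factor = [(-1) ** i * comb(2 * m + 1, i) for i in range(2 * m + 2)]
    return [
        sum(factor[i] * series[d - i] for i in range(min(d, 2 * m + 1) + 1))
        for d in range(terms)
    ]


if __name__ == "__main__":
    for m in range(4):
        print(m, r_poly(m, k=3))
```

For $k = 3$ the first three coefficient lists can be compared with $r_0$, $r_1$, $r_2$ as given in the theorem; for $k = 1$ the nonzero coefficients are the second-order Eulerian numbers mentioned above.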

Exercise 3.2.6.

Give an inductive proof of Theorem 3.2.5 using the fact that if $W(m,k) = \sum_{n=0}^{\infty} (n+k)^{n+m}x^n/n!$ then $dW(m,k)/dx = W(m+1,k+1)$.

We can get convolution identities by applying (3.2.3) and (3.2.5) to
$$\frac{F(x)^{k+l}}{1-T(x)} = F(x)^k\cdot\frac{F(x)^l}{1-T(x)}$$
and $F(x)^{k+l} = F(x)^k F(x)^l$. The first identity yields
$$(n+k+l)^n = \sum_{i=0}^{n} \binom{n}{i}\,k(i+k)^{i-1}(n-i+l)^{n-i},$$
and the second yields
$$(k+l)(n+k+l)^{n-1} = \sum_{i=0}^{n} \binom{n}{i}\,k(i+k)^{i-1}\,l(n-i+l)^{n-i-1}.$$
Note that these are identities of polynomials in $k$ and $l$. If we set $k = x$ and $l = y - n$ in the first formula we get the nicer looking
$$(x+y)^n = \sum_{i=0}^{n} \binom{n}{i}\,x(x+i)^{i-1}(y-i)^{n-i}. \tag{3.2.11}$$
Replacing $x$ with $x/z$ and $y$ with $y/z$ in (3.2.11), and multiplying through by $z^n$, gives the homogeneous form
$$(x+y)^n = \sum_{i=0}^{n} \binom{n}{i}\,x(x+iz)^{i-1}(y-iz)^{n-i}, \tag{3.2.12}$$
which was proved by N. H. Abel in 1826 [1]. Note that for $z = 0$, (3.2.12) reduces to the binomial theorem. Riordan [64, pp. 18–27] gives a comprehensive account of Abel's identity and its generalizations, though he does not use Lagrange inversion.

Exercise 3.2.7. (Chu [13].) Prove Abel's identity (3.2.11) using finite differences. (Start by expanding $(y-i)^{n-i} = [(x+y)-(x+i)]^{n-i}$ by the binomial theorem.)

3.3. Fuss-Catalan numbers.

The Fuss-Catalan (or Fuß-Catalan) numbers of order $p$ (also called generalized Catalan numbers) are the numbers $\frac{1}{pn+1}\binom{pn+1}{n} = \frac{1}{(p-1)n+1}\binom{pn}{n}$, which reduce to Catalan numbers for $p = 2$. They were first studied by N. Fuss in 1791 [21]. As we shall see, they are the coefficients of the power series $c_p(x)$ satisfying the functional equation
$$c_p(x) = 1 + x\,c_p(x)^p, \tag{3.3.1}$$
or equivalently,
$$c_p(x) = \frac{1}{1 - x\,c_p(x)^{p-1}},$$
as was shown using Lagrange inversion by Liouville [49].

An account of these generating functions can be found in Graham, Knuth, and Patashnik [32, pp. 200–204].

It follows easily from (3.3.1) that
$$c_p(x) - 1 = x\,c_p(x)^p = \left(\frac{x}{(1+x)^p}\right)^{\langle -1\rangle}, \tag{3.3.2}$$
$$x\,c_p(x)^{p-1} = \bigl(x(1-x)^{p-1}\bigr)^{\langle -1\rangle}, \qquad x\,c_p(x^{p-1}) = (x - x^p)^{\langle -1\rangle},$$
and
$$c_p'(x) = \frac{c_p(x)^p}{1 - p\,x\,c_p(x)^{p-1}}.$$
Lagrange inversion gives
$$c_p(x)^k = \sum_{n=0}^{\infty} \frac{k}{pn+k}\binom{pn+k}{n}x^n \tag{3.3.3}$$
for all $k$. With $R(t) = 1 + xt^p$ we have $R'(t) = pxt^{p-1}$, so
$$1 - R'(c_p(x)) = 1 - p\,x\,c_p(x)^{p-1} = 1 - p\,\frac{c_p(x)-1}{c_p(x)} = 1 - p + p\,c_p(x)^{-1},$$
and thus by (2.4.6),
$$\sum_{n=0}^{\infty} \binom{pn+k}{n}x^n = \frac{c_p(x)^k}{1 - p\,x\,c_p(x)^{p-1}} = \frac{c_p(x)^{k+1}}{1 - (p-1)(c_p(x)-1)}. \tag{3.3.4}$$
Equivalently,
$$\sum_{n=0}^{\infty} \binom{pn+k}{n}\left(\frac{x}{(1+x)^p}\right)^n = \frac{(1+x)^{k+1}}{1-(p-1)x}$$
and
$$\sum_{n=0}^{\infty} \binom{pn+k}{n}\bigl(x(1-x)^{p-1}\bigr)^n = \frac{1}{(1-px)(1-x)^k}.$$

The convolution identities obtained from (3.3.3) and (3.3.4), known as Rothe-Hagen identities [65, 33, 30], are
$$\sum_{i+j=n} \frac{k}{pi+k}\binom{pi+k}{i}\cdot\frac{l}{pj+l}\binom{pj+l}{j} = \frac{k+l}{pn+k+l}\binom{pn+k+l}{n}$$
and
$$\sum_{i+j=n} \frac{k}{pi+k}\binom{pi+k}{i}\binom{pj+l}{j} = \binom{pn+k+l}{n}.$$

Exercise 3.3.1.

Show that $c_{-p}(x) = 1/c_{p+1}(-x)$.

Exercise 3.3.2.

Prove that $c_{p+q}(x) = c_p\bigl(x\,c_{p+q}(x)^q\bigr)$ (a) combinatorially, (b) algebraically, (c) using Lagrange inversion.

Exercise 3.3.3.

Prove that (cid:0) xc p ( x a ) b (cid:1) (cid:104)− (cid:105) = xc ab − p +1 ( − x a ) b (a) algebraically (b) using Lagrange inversion. In particular, as noted by Dennis Stanton, if f ( x ) = xc ( x ) then f ( x ) (cid:104)− (cid:105) = − f ( − x ). Exercise 3.3.4. (Mansour and Sun [50, Example 5.6], Sun [72].) Show that11 − x c (cid:18) x (1 − x ) (cid:19) = c ( x ) . Exercise 3.3.5.

Prove (3.3.3) and (3.3.4) by finite differences.
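Before attempting a proof, the two formulas named in this exercise can be checked numerically. The sketch below is only an illustration, with helper names (`mul`, `power`, `fuss_catalan_series`, `check_fuss_catalan`) chosen by us: it builds the series $c_p(x)$ by iterating the functional equation $c_p(x) = 1 + x\,c_p(x)^p$, then compares coefficients with (3.3.3) and with (3.3.4) cleared of its denominator. It assumes $p$ and $k$ are positive integers.

```python
from fractions import Fraction
from math import comb


def mul(a, b, order):
    """Product of two power series (coefficient lists), truncated after x^order."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a[: order + 1]):
        for j, bj in enumerate(b[: order + 1 - i]):
            out[i + j] += ai * bj
    return out


def power(a, e, order):
    """a(x)^e for a nonnegative integer e, truncated after x^order."""
    result = [1] + [0] * order
    for _ in range(e):
        result = mul(result, a, order)
    return result


def fuss_catalan_series(p, order):
    """Coefficients of c_p(x), found by iterating c <- 1 + x * c^p."""
    c = [1] + [0] * order
    for _ in range(order):  # each pass fixes one more coefficient
        c = [1] + power(c, p, order)[:order]
    return c


def check_fuss_catalan(p, k, order=8):
    """True if (3.3.3) and (3.3.4) agree with c_p(x)^k through x^order."""
    c = fuss_catalan_series(p, order)
    ck = power(c, k, order)
    formula_333 = [Fraction(k, p * n + k) * comb(p * n + k, n) for n in range(order + 1)]
    binomials = [comb(p * n + k, n) for n in range(order + 1)]
    # (3.3.4) without its denominator: (sum binom x^n) * (1 - p x c^(p-1)) = c^k
    denom = [1] + [-p * t for t in power(c, p - 1, order)[:order]]
    return ck == formula_333 and mul(binomials, denom, order) == ck


if __name__ == "__main__":
    print(fuss_catalan_series(3, 6))
    print(all(check_fuss_catalan(p, k) for p in range(1, 5) for k in range(1, 4)))
```

A finite check of this kind is of course no substitute for the proof the exercise asks for; it only guards against misreading the formulas.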

Exercise 3.3.6. (Chu [13].) Prove the Hagen-Rothe identities by finite differences.

Next, we prove Jensen's formula [39]
$$\sum_{l=0}^{n} \binom{j+pl}{l}\binom{r-pl}{n-l} = \sum_{i=0}^{n} \binom{j+r-i}{n-i}p^i. \tag{3.3.5}$$
By (3.3.4) we have
$$\sum_{l=0}^{\infty} \binom{pl+j}{l}x^l \sum_{m=0}^{\infty} \binom{pm+k}{m}x^m = \frac{c_p(x)^{j+k}}{\bigl(1 - p\,x\,c_p(x)^{p-1}\bigr)^2} = \frac{c_p(x)^{j+k}}{1 - p\,x\,c_p(x)^{p-1}}\sum_{i=0}^{\infty} p^i x^i c_p(x)^{(p-1)i}$$
$$= \sum_{i=0}^{\infty} p^i x^i\,\frac{c_p(x)^{j+k+(p-1)i}}{1 - p\,x\,c_p(x)^{p-1}} = \sum_{i=0}^{\infty} p^i x^i \sum_{m=0}^{\infty} \binom{pm+j+k+(p-1)i}{m}x^m = \sum_{n=0}^{\infty} x^n \sum_{i=0}^{n} \binom{pn+j+k-i}{n-i}p^i.$$
Equating coefficients of $x^n$ on both sides gives
$$\sum_{l=0}^{n} \binom{pl+j}{l}\binom{p(n-l)+k}{n-l} = \sum_{i=0}^{n} \binom{pn+j+k-i}{n-i}p^i.$$
Setting $k = r - pn$ gives (3.3.5).

We also have analogues of Theorems 3.2.1 and 3.2.5 for Fuss-Catalan numbers.

Theorem 3.3.7.

Let $i$ and $j$ be nonnegative integers with $i < j$. Then
$$\sum_{n=0}^{\infty} \frac{(pn+i)!}{n!\,((p-1)n+j)!}\,\frac{x^n}{(1+x)^{pn+i+1}}$$
is a polynomial $u_{i,j}(x)$ in $x$ of degree $j-i-1$.

Proof. We have
$$\sum_{n=0}^{\infty} \frac{(pn+i)!}{n!\,((p-1)n+j)!}\,\frac{x^n}{(1+x)^{pn+i+1}} = \sum_{n=0}^{\infty} \frac{(pn+i)!}{n!\,((p-1)n+j)!}\,x^n \sum_{l=0}^{\infty} (-1)^l\binom{pn+i+l}{l}x^l$$
$$= \sum_{m=0}^{\infty} x^m \sum_{n=0}^{m} \frac{(pn+i)!}{n!\,((p-1)n+j)!}\,(-1)^{m-n}\binom{(p-1)n+i+m}{m-n}.$$
For $m \ge j-i$, the coefficient of $x^m$ may be rearranged to
$$\frac{\bigl(m-(j-i)\bigr)!}{m!}\sum_{n=0}^{m} (-1)^{m-n}\binom{m}{n}\binom{(p-1)n+i+m}{m-(j-i)}.$$
The sum is the $m$th difference of a polynomial of degree less than $m$ and is therefore 0. For $m = j-i-1$, the coefficient of $x^{j-i-1}$ reduces to
$$\frac{1}{m!}\sum_{n=0}^{m} (-1)^{m-n}\binom{m}{n}\frac{1}{(p-1)n+j},$$
which is nonzero by the well-known identity
$$\sum_{n=0}^{m} (-1)^n\binom{m}{n}\frac{a}{n+a} = \binom{m+a}{a}^{-1},$$
so the degree of the polynomial is not less than $j-i-1$. □

The first few values of these polynomials are
$$u_{i,i+1}(x) = \frac{1}{i+1},$$
$$u_{i,i+2}(x) = \frac{1}{(i+1)(i+2)} - \frac{p-1}{(i+2)(p+i+1)}\,x,$$
$$u_{i,i+3}(x) = \frac{1}{(i+1)(i+2)(i+3)} - \frac{(p-1)(p+2i+4)}{(i+2)(i+3)(p+i+1)(p+i+2)}\,x + \frac{(p-1)^2}{(i+3)(p+i+2)(2p+i+1)}\,x^2.$$

As a simple example of Theorem 3.3.7, the number of 2-stack-sortable permutations of $\{1, 2, \ldots, n\}$ is
$$a_n = \frac{2\,(3n)!}{(n+1)!\,(2n+1)!} = \frac{4\,(3n)!}{n!\,(2n+2)!}$$
(see [57, Sequence A000139]), so by Theorem 3.3.7, with $p = 3$, $i = 0$, and $j = 2$, $\sum_{n=0}^{\infty} a_n x^n/(1+x)^{3n+1}$ is a polynomial of degree 1, which is easily computed to be $2 - x$. Then by (3.3.2), we find that
$$\sum_{n=0}^{\infty} a_n x^n = 3c_3(x) - c_3(x)^2,$$
which can be checked directly from (3.3.3).

There is a result similar to Theorem 3.3.7 for $i \ge j$, which we state without proof.

Theorem 3.3.8.

Let $i$ and $j$ be nonnegative integers with $i \ge j$. Then
$$\bigl(1-(p-1)x\bigr)^{2(i-j)+1}\sum_{n=0}^{\infty} \frac{(pn+i)!}{n!\,((p-1)n+j)!}\,\frac{x^n}{(1+x)^{pn+i+1}}$$
is a polynomial in $x$ of degree at most $i-j$. □

Exercise 3.3.9.

Prove Theorem 3.3.8.

3.4. Narayana and Fuss-Narayana numbers.

The Narayana numbers may be defined by $N(n,i) = \frac{1}{n}\binom{n}{i}\binom{n}{i-1}$ for $n \ge 1$. They have many combinatorial interpretations, in terms of Dyck paths, ordered trees, binary trees, and noncrossing partitions.

It is not hard to see from the formula for Narayana numbers that $N(n,i) = N(n,n+1-i)$. A generating function for the Narayana numbers that exhibits this symmetry is given by the solution to the equation
$$f = (1+xf)(1+yf). \tag{3.4.1}$$
Lagrange inversion gives
$$f^k = \sum_{n=k}^{\infty} \frac{k}{n}\,[t^{n-k}]\,(1+xt)^n(1+yt)^n = \sum_{n=k}^{\infty} \frac{k}{n}\,[t^{n-k}] \sum_{i,j} \binom{n}{i}\binom{n}{j}x^i y^j t^{i+j} = \sum_{i,j=0}^{\infty} \frac{k}{i+j+k}\binom{i+j+k}{i}\binom{i+j+k}{j}x^i y^j.$$
In particular
$$f = \sum_{i,j=0}^{\infty} \frac{1}{i+j+1}\binom{i+j+1}{i}\binom{i+j+1}{i+1}x^i y^j = \sum_{n=1}^{\infty}\sum_{i=0}^{n-1} N(n,i+1)\,x^i y^{n-i-1}.$$
Equation (3.4.1) can be solved explicitly to give
$$f = \frac{1-x-y-\sqrt{(1-x-y)^2-4xy}}{2xy}.$$

Exercise 3.4.1.

Prove that
$$\frac{(1+xf)^r(1+yf)^s}{\sqrt{(1-x-y)^2-4xy}} = \sum_{i,j} \binom{r+i+j}{i}\binom{s+i+j}{j}x^i y^j$$
and
$$(1+xf)^r(1+yf)^s = \sum_{i,j} \left[\binom{r+i+j-1}{i}\binom{s+i+j}{j} - \binom{r+i+j}{i}\binom{s+i+j-1}{j-1}\right]x^i y^j = \sum_{i,j} \frac{rs+ri+sj}{(r+i+j)(s+i+j)}\binom{r+i+j}{i}\binom{s+i+j}{j}x^i y^j.$$
The first formula is equivalent to a well-known generating function for Jacobi polynomials; see Carlitz [8].

We may generalize (3.4.1) to
$$f = (1+x_1 f)^{r_1}(1+x_2 f)^{r_2}\cdots(1+x_m f)^{r_m}, \tag{3.4.2}$$
for which Lagrange inversion gives
$$f^k = \sum_{n=k}^{\infty}\ \sum_{i_1+\cdots+i_m=n-k} \frac{k}{n}\binom{r_1 n}{i_1}\cdots\binom{r_m n}{i_m}x_1^{i_1}\cdots x_m^{i_m}. \tag{3.4.3}$$
These numbers reduce to Catalan numbers for $k = m = 1$, $r_1 = 2$ and to Narayana numbers for $k = 1$, $m = 2$, $r_1 = r_2 = 1$. For $k = 1$, $m = 2$, with one of the exponents $r_1$, $r_2$ equal to 1, they are sometimes called Fuss-Narayana numbers; see Armstrong [3], Cigler [15], Edelman [19], Eu and Fu [20], and Wang [75]. The numbers for $k = 1$, $r_i = 1$ for all $i$ have been called generalized Fuss-Narayana numbers by Lenczewski and Sałapata [48]; they have also been studied by Edelman [19], Stanley [69], and Xu [78]. The case $k = 1$, $m = 3$ in which two of the exponents are negatives of each other and the third is 1 was considered by Krattenthaler [44, equation (31)]. We note that if $r_1 = \cdots = r_m$, then $f$ is a symmetric function of $x_1, \ldots, x_m$, and this symmetric function arises in the study of algebraic aspects of parking functions [69].

If we set $r_i = -s_i$ in (3.4.2) and replace $x_i$ with $-x_i$, and $f$ with $g$, then (3.4.2) becomes
$$g = \frac{1}{(1-x_1 g)^{s_1}(1-x_2 g)^{s_2}\cdots(1-x_m g)^{s_m}}, \tag{3.4.4}$$
and, with the formula $\binom{-a}{i} = (-1)^i\binom{a+i-1}{i}$, (3.4.3) becomes
$$g^k = \sum_{n=k}^{\infty}\ \sum_{i_1+\cdots+i_m=n-k} \frac{k}{n}\binom{s_1 n+i_1-1}{i_1}\cdots\binom{s_m n+i_m-1}{i_m}x_1^{i_1}\cdots x_m^{i_m}. \tag{3.4.5}$$
These numbers reduce to Catalan numbers for $k = m = s_1 = 1$. For $s_1 = \cdots = s_m = 1$ they have been considered by Aval [4] and (for $k = 1$) by Stanley [69].

Of special interest are the cases of (3.4.2) that reduce to a quadratic equation, since in these cases there are simple explicit formulas for $f$. If we take $m = 3$, $r_1 = r_2 = 1$, and $r_3 = -1$, with $x_1 = x$, $x_2 = y$, and $x_3 = -z$, the equation becomes
$$f = \frac{(1+xf)(1+yf)}{1-zf},$$
with the solution
$$f = \frac{1-x-y-\sqrt{(1-x-y)^2-4xy-4z}}{2(xy+z)},$$
and (3.4.3) gives
$$f^k = \sum_{n=k}^{\infty}\ \sum_{i_1+i_2+i_3=n-k} \frac{k}{n}\binom{n}{i_1}\binom{n}{i_2}\binom{n+i_3-1}{i_3}x^{i_1}y^{i_2}z^{i_3}.$$
Another case of (3.4.2) that reduces to a quadratic is $f = \sqrt{(1+xf)(1+yf)}$; we leave the details to the reader.

3.5. Raney's equation.

G. Raney [62] considered the equation
$$f = \sum_{m=1}^{\infty} A_m e^{B_m f}, \tag{3.5.1}$$
in which $f$ is a power series in the indeterminates $A_m$ and $B_m$. He used Prüfer's correspondence to give a combinatorial derivation of a formula for the coefficients in this power series.

We can use Lagrange inversion to give a formula for the coefficients of $f$.

Theorem 3.5.1.

Let $f$ be the power series in $A_m$ and $B_m$ satisfying (3.5.1), and let $k$ be a positive integer. Let $i_1, i_2, \ldots$ and $j_1, j_2, \ldots$ be nonnegative integers, only finitely many of which are nonzero. If $i_1 + i_2 + \cdots = k + j_1 + j_2 + \cdots$ then the coefficient of $A_1^{i_1}A_2^{i_2}\cdots B_1^{j_1}B_2^{j_2}\cdots$ in $f^k$ is
$$k\,\frac{(i_1+i_2+\cdots-1)!}{i_1!\,i_2!\cdots}\,\frac{i_1^{j_1}}{j_1!}\,\frac{i_2^{j_2}}{j_2!}\cdots,$$
and if $i_1 + i_2 + \cdots \ne k + j_1 + j_2 + \cdots$ then the coefficient is zero.

Proof. Applying equation (2.4.3) to (3.5.1) gives
$$f^k = \sum_{n=k}^{\infty} \frac{k}{n}\,[t^{n-k}]\left(\sum_m A_m e^{B_m t}\right)^n = \sum_{n=k}^{\infty} \frac{k}{n}\,[t^{n-k}] \sum_{i_1+i_2+\cdots=n} \frac{n!}{i_1!\,i_2!\cdots}\,A_1^{i_1}A_2^{i_2}\cdots e^{i_1B_1t}e^{i_2B_2t}\cdots$$
$$= \sum_{n=k}^{\infty} \frac{k}{n}\,[t^{n-k}] \sum_{i_1+i_2+\cdots=n} \frac{n!}{i_1!\,i_2!\cdots}\,A_1^{i_1}A_2^{i_2}\cdots \sum_{j_1} \frac{(i_1B_1t)^{j_1}}{j_1!}\sum_{j_2} \frac{(i_2B_2t)^{j_2}}{j_2!}\cdots$$
$$= \sum_{n=k}^{\infty}\ \sum_{\substack{i_1+i_2+\cdots=n\\ j_1+j_2+\cdots=n-k}} k\,\frac{(n-1)!}{i_1!\,i_2!\cdots}\,A_1^{i_1}A_2^{i_2}\cdots B_1^{j_1}B_2^{j_2}\cdots\,\frac{i_1^{j_1}}{j_1!}\,\frac{i_2^{j_2}}{j_2!}\cdots,$$
and the formula follows. □

A combinatorial derivation of Raney's formula has also been given by D. Knuth [42, Section 2.3.4.4].

4. Proofs

In this section we give several proofs of the Lagrange inversion formula.

4.1. Residues.

The simplest proof of Lagrange inversion is due to Jacobi [38]. We define the residue $\operatorname{res} f(x)$ of a Laurent series $f(x) = \sum_n f_n x^n$ to be $f_{-1}$.

Jacobi proved the following change of variables formula for residues:

Theorem 4.1.1.

Let $f$ be a Laurent series and let $g(x) = \sum_{n=1}^{\infty} g_n x^n$ be a power series with $g_1 \ne 0$. Then $\operatorname{res} f(x) = \operatorname{res} f(g(x))\,g'(x)$.

Proof.

By linearity, it is sufficient to prove the formula when $f(x) = x^k$ for some integer $k$. If $k \ne -1$ then $\operatorname{res} x^k = 0$ and
$$\operatorname{res} g(x)^k g'(x) = \operatorname{res} \frac{d}{dx}\,\frac{g(x)^{k+1}}{k+1} = 0,$$
since the residue of a derivative is 0. If $k = -1$ then $\operatorname{res} x^k = 1$ and
$$\operatorname{res} g(x)^k g'(x) = \operatorname{res} \frac{g'(x)}{g(x)} = \operatorname{res} \frac{g_1 + 2g_2x + \cdots}{g_1x + g_2x^2 + \cdots} = \operatorname{res} \frac{1}{x}\cdot\frac{g_1 + 2g_2x + \cdots}{g_1 + g_2x + \cdots} = 1. \qquad □$$

Jacobi's paper [38] contains a multivariable generalization of Theorem 4.1.1; see also Gessel [28] and Xin [77].

Now let $f(x)$ and $g(x)$ be compositional inverses. Then for any Laurent series $\phi$,
$$[x^n]\,\phi(f) = \operatorname{res} \frac{\phi(f)}{x^{n+1}} = \operatorname{res} \frac{\phi(f(g))\,g'}{g^{n+1}} = \operatorname{res} \frac{\phi(x)\,g'}{g^{n+1}}.$$
This is equation (2.1.7), which we have already seen is equivalent to the other forms of Lagrange inversion.
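The residue formula just derived can be turned into a small computation. The sketch below is an illustration under our own conventions, not code from the paper: a series is a list of coefficients, $g = [0, g_1, g_2, \ldots]$ with $g_1 \ne 0$, and the helper names `mul`, `reciprocal`, and `inverse_coefficient` are hypothetical. Taking $\phi(x) = x^k$ and writing $g(x) = x\,h(x)$, the residue of $x^k g'/g^{n+1}$ is the coefficient of $x^{n-k}$ in $g'(x)\,h(x)^{-(n+1)}$, which is what the function returns as $[x^n] f^k$.

```python
from fractions import Fraction


def mul(a, b, order):
    """Product of two power series (coefficient lists), truncated after x^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a[: order + 1]):
        for j, bj in enumerate(b[: order + 1 - i]):
            out[i + j] += ai * bj
    return out


def reciprocal(h, order):
    """1/h(x) through x^order; requires h[0] != 0."""
    inv = [Fraction(1) / h[0]] + [Fraction(0)] * order
    for n in range(1, order + 1):
        s = sum(h[i] * inv[n - i] for i in range(1, min(n, len(h) - 1) + 1))
        inv[n] = -s / h[0]
    return inv


def inverse_coefficient(g, n, k=1):
    """[x^n] f^k, where f is the compositional inverse of g = [0, g1, g2, ...].

    Uses [x^n] f^k = res x^k g'(x) / g(x)^(n+1) = [x^(n-k)] g'(x) h(x)^-(n+1),
    with g(x) = x h(x).  Only terms of g up to x^(n-k+1) influence the answer.
    """
    order = n - k
    if order < 0:
        return Fraction(0)
    h = g[1:]
    dg = [i * g[i] for i in range(1, len(g))]  # coefficients of g'(x)
    h_inv = reciprocal(h, order)
    h_power = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(n + 1):
        h_power = mul(h_power, h_inv, order)
    return mul(dg, h_power, order)[order]


if __name__ == "__main__":
    # g(x) = x - x^2, whose inverse is x times the Catalan generating function.
    print([inverse_coefficient([0, 1, -1], n) for n in range(1, 8)])
```

With $g(x) = x - x^2$ the printed values are Catalan numbers, in line with the examples of section 2.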

Exercise 4.1.2.

Let $f$ be a Laurent series and let $g(x) = \sum_{n=m}^{\infty} g_n x^n$ be a Laurent series with $g_m \ne 0$. Show that if $f(g(x))$ is well-defined as a Laurent series then $m \operatorname{res} f = \operatorname{res} f(g(x))\,g'(x)$.

Exercise 4.1.3. (Hirzebruch [34]; see also Kneezel [41].) Use the change of variables formula (Theorem 4.1.1) to show that the unique power series $f(x)$ satisfying
$$\operatorname{res}\left(\frac{f(x)}{x}\right)^n = 1$$
for all $n \ge 1$ is $f(x) = x/(1-e^{-x})$.

4.2. Induction.

In this proof and the next we consider the equation $f = xR(f)$, where $R(t)$ is a power series. If $R(t)$ has no constant term then $f = 0$ and the formulas are trivial. So we may assume that $R(t)$ has a nonzero constant term, and thus $f$ exists and is unique, since it is the compositional inverse of $x/R(x)$.

We now give an inductive proof of (2.1.2): for any power series $\phi(t)$,
$$[x^n]\,\phi(f) = [t^n]\left(1 - \frac{tR'(t)}{R(t)}\right)\phi(t)R(t)^n. \tag{4.2.1}$$
(As noted in section 2, this implies that (4.2.1) holds more generally when $\phi(t)$ is a Laurent series.)

We first take care of the case in which $\phi(t) = 1$. The case $\phi(t) = 1$, $n = 0$ is trivial. If $\phi(t) = 1$ and $n > 0$, we have
$$\left(1 - \frac{tR'(t)}{R(t)}\right)R(t)^n = R(t)^n - tR'(t)R(t)^{n-1} = R(t)^n - \frac{t}{n}\,\frac{d}{dt}R(t)^n. \tag{4.2.2}$$
Now for any power series $u(t)$,
$$[t^n]\left(u(t) - \frac{t}{n}\,u'(t)\right) = 0.$$
With (4.2.2), this proves (4.2.1) for $\phi(t) = 1$, $n > 0$.

Now we prove the formula (4.2.1) by induction on $n$. It is clear that (4.2.1) holds for $n = 0$. Now let us suppose that for some nonnegative integer $m$, (4.2.1) holds for all $\phi$ when $n = m$. We now want to show that (4.2.1) holds for all $\phi$ when $n = m+1$. By linearity and the case $\phi(t) = 1$, it is enough to prove (4.2.1) for $n = m+1$ and $\phi(t) = t^k$, where $k \ge 1$. In this case we have
$$[x^{m+1}]\,f^k = [x^{m+1}]\,f^{k-1}\cdot xR(f) = [x^m]\,f^{k-1}R(f) = [t^m]\left(1 - \frac{tR'(t)}{R(t)}\right)t^{k-1}R(t)\,R(t)^m = [t^{m+1}]\left(1 - \frac{tR'(t)}{R(t)}\right)t^kR(t)^{m+1}.$$

4.3. Factorization.

Another proof is based on a version of the "factor theorem": if $f = xR(f)$ then $t - f$ divides $t - xR(t)$. This proof is taken from Gessel [27] but it is similar to Lagrange's original proof [47].

We will prove (2.2.1), which gives a formula for $f^k$ where $f = f(x)$ satisfies $f = xR(f)$. First we recall Taylor's theorem for power series: if $P(t)$ is a power series in $t$, and $\alpha$ is an element of the coefficient ring, then
$$P(t) = \sum_{n=0}^{\infty} \frac{(t-\alpha)^n}{n!}\,P^{(n)}(\alpha), \tag{4.3.1}$$
as long as this sum is summable. (The case $P(t) = t^m$ of (4.3.1) is just the binomial theorem, and the general case follows by linearity.)

Now let us apply (4.3.1) with $P(t) = t - xR(t)$ and $\alpha = f$, where $f = xR(f)$ so that $P(f) = 0$. Then we have
$$t - xR(t) = 0 + (t-f)\bigl(1 - xR'(f)\bigr) + (t-f)^2 S(x,t) = (t-f)\,Q(x,t), \tag{4.3.2}$$
where $S(x,t)$ is a power series and $Q(x,t)$ is a power series with constant term 1.

Equation (4.3.2) is an identity in the ring $\mathbb{C}[[x,t]]$, which is naturally embedded in the ring $\mathbb{C}((t))[[x]]$ of power series in $x$ with coefficients that are Laurent series in $t$. In this ring, series like $\sum_{n=0}^{\infty} (x/t)^n$ are allowed, even though they have infinitely many negative powers of $t$, since the coefficient of any power of $x$ is a Laurent series in $t$. We now do some computations in $\mathbb{C}((t))[[x]]$.

By (4.3.2), we have
$$1 - xR(t)/t = (1 - f/t)\,Q(x,t).$$
Since $xR(t)/t$ and $f/t$ are divisible by $x$ and $Q(x,t)$ is a power series in $x$ and $t$ with constant term 1, we may take logarithms to obtain
$$-\log\bigl(1 - xR(t)/t\bigr) = -\log(1 - f/t) - \log Q(x,t). \tag{4.3.3}$$
Note that $\log Q(x,t)$ is a power series in $x$ and $t$, and so has no negative powers of $t$. Now we equate coefficients of $x^n t^{-k}$ on both sides of (4.3.3), where $n$ and $k$ are both positive integers. On the left we have $[t^{-k}]\,(R(t)/t)^n/n = [t^{n-k}]\,R(t)^n/n$ and on the right we have $[x^n]\,f^k/k$. Thus,
$$[x^n]\,f^k = \frac{k}{n}\,[t^{n-k}]\,R(t)^n,$$
which is (2.2.1).

Exercise 4.3.1. (Gessel [27].) Derive (2.2.2) similarly.

4.4. Combinatorial proofs.

There are several different combinatorial proofs of Lagrange inversion. They all interpret the solution $f$ of $f = xR(f)$ or $f = R(f)$ as counting certain trees. Here $f$ may be interpreted as either an ordinary or exponential generating function and thus different types of trees may be involved.

In ordinary generating function proofs, $f$ will count unlabeled ordered trees (also called plane trees), which are rooted trees in which the children of each vertex are linearly ordered. (See Figure 1.) More generally, $f^k$ will count $k$-tuples of ordered trees, which we also call forests of ordered trees.

Figure 1. An ordered tree

If $f = xR(f)$, where $R(t) = \sum_{i=0}^{\infty} r_i t^i$, then the coefficient of $x^n$ in $f$ is the sum of the weights of the ordered trees with a total of $n$ vertices, where the weight of a tree is the product of the weights of its vertices and the weight of a vertex with $i$ children is $r_i$. (For example, the tree of Figure 1 has weight r r .) So to give a combinatorial proof of formula (2.2.1), we show that the sum of the weights of all $k$-tuples of ordered trees with a total of $n$ vertices is $(k/n)\,[t^{n-k}]\,R(t)^n$, or equivalently by (2.5.1), the number of $k$-tuples of ordered trees in which $n_i$ vertices have $i$ children for each $i$ is equal to
$$\frac{k}{n}\binom{n}{n_0, n_1, n_2, \ldots} \tag{4.4.1}$$
if
$$n = \sum_i n_i \quad\text{and}\quad n - k = \sum_i i\,n_i, \tag{4.4.2}$$
and is zero otherwise. W. T. Tutte [74] gave the case $k = 1$ of (4.4.1), which he derived (in a roundabout way) from Lagrange inversion.

We can also work with exponential generating functions. One way to do this is to consider the equation $f = x\sum_{i=0}^{\infty} s_i f^i/i!$ where $x$ is the exponential variable and the $s_n$ are weights. Then by the properties of exponential generating functions (see, e.g., [70, Chapter 5]) $f$ counts labeled rooted trees where a vertex with $i$ children is weighted $s_i$. More precisely, the coefficient of $x^n/n!$ in $f^k/k!$ is the sum of the weights of all forests of $k$ rooted trees with vertex set $[n] = \{1, 2, \ldots, n\}$. Then Lagrange inversion in the form (2.5.1) is equivalent to the assertion that the number of forests of $k$ rooted trees with vertex set $[n]$ in which $n_i$ vertices have $i$ children is equal to
$$\frac{(n-1)!}{(k-1)!\,0!^{n_0}\,1!^{n_1}\,2!^{n_2}\cdots}\binom{n}{n_0, n_1, \ldots} \tag{4.4.3}$$
with the same conditions on $n_0, n_1, \ldots$ as before. (See Stanley [70, p. 30, Corollary 5.3.5].)

Another exponential generating function approach is to consider the equation
$$f = \sum_{i=0}^{\infty} s_i\,\frac{f^i}{i!}$$
where we work with exponential generating functions in the variables $s_0, s_1, \ldots$. Then by the properties of multivariable exponential generating functions, the coefficient of $\frac{s_0^{n_0}}{n_0!}\frac{s_1^{n_1}}{n_1!}\cdots$ in $f^k/k!$ is the number of forests of $k$ rooted trees in which for each $i$, $n_i$ vertices have $i$ children and these vertices are labeled $1, 2, \ldots, n_i$. (Since vertices with the same label have different numbers of children, there are no nontrivial label-preserving automorphisms of these forests.) For example, such a forest with $k = 2$, $n_0 = 5$, $n_2 = 3$, and $n_i = 0$ for $i \notin \{0, 2\}$ is shown in Figure 2.

Figure 2. A forest with $n_0 = 5$, $n_2 = 3$

Then Lagrange inversion in the form (2.5.1) (with $x = 1$) is equivalent to the assertion that the number of such forests is
$$\frac{(n-1)!}{(k-1)!\,0!^{n_0}\,1!^{n_1}\,2!^{n_2}\cdots}, \tag{4.4.4}$$
where $n = n_0 + n_1 + \cdots$ and $n_1 + 2n_2 + 3n_3 + \cdots = n - k$.

4.5. Raney's proof.

The earliest combinatorial proof of Lagrange inversion is that of Raney [61]. We sketch here a proof that is based on Raney's, though the details are different. (See also Stanley [70, pp. 31–35 and 39–40].) We define the suffix code $c(T)$ for an ordered tree $T$ to be a sequence of nonnegative integers defined recursively: if the root $r$ of $T$ has $j$ children, and the trees rooted at the children of $r$ are $T_1, \ldots, T_j$, then $c(T)$ is the concatenation $c(T_1)\cdots c(T_j)\,j$. More generally, the suffix code for a $k$-tuple $(T_1, \ldots, T_k)$ of ordered trees is the concatenation $c(T_1)\cdots c(T_k)$. We define the reduced code of a tree or $k$-tuple of trees to be the sequence obtained from the suffix code by subtracting 1 from each entry.

For example, the suffix code of the $k$-tuple of trees in Figure 3 is $0\ 0\ 2\ 0\ 1$ and the reduced code is $\bar{1}\ \bar{1}\ 1\ \bar{1}\ 0$, where $\bar{1}$ denotes $-1$.

Figure 3. A forest of ordered trees

Lemma 4.5.1.
(i) A forest of ordered trees is uniquely determined by its reduced code.
(ii) A sequence $a_1 a_2 \cdots a_n$ of integers greater than or equal to $-1$ is the reduced code of an ordered $k$-forest if and only if $a_1 + \cdots + a_n = -k$ and $a_1 + \cdots + a_i$ is negative for $i = 1, \ldots, n$. □

We also need a lemma, due to Raney, that generalizes the "cycle lemma" of Dvoretzky and Motzkin [18]. It can be proved by induction, or in other ways. (See, e.g., Stanley [70, pp. 32–33].)
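The suffix code and reduced code of Lemma 4.5.1 are easy to experiment with. In the Python sketch below (our own illustration; the nested-list representation and the function names are assumptions, not notation from the paper), an ordered tree is the list of its child subtrees, so a leaf is `[]`. The example forest is the one whose suffix code the text quotes as 0 0 2 0 1; its shape is inferred from that code, since the figure itself is not reproduced here. The function `decode` rebuilds a forest from a reduced code with a stack, which illustrates part (i), and `is_reduced_code` tests the condition in part (ii).

```python
def suffix_code(tree):
    """Suffix code of an ordered tree given as a nested list of its children."""
    code = []
    for child in tree:
        code.extend(suffix_code(child))
    code.append(len(tree))  # the root's number of children comes last
    return code


def reduced_code(forest):
    """Reduced code of a tuple of ordered trees: suffix code entries minus 1."""
    code = []
    for tree in forest:
        code.extend(suffix_code(tree))
    return [entry - 1 for entry in code]


def decode(reduced):
    """Rebuild the forest from a reduced code, or return None if it is invalid."""
    stack = []
    for a in reduced:
        children = a + 1
        if children < 0 or children > len(stack):
            return None
        cut = len(stack) - children
        subtrees = stack[cut:]
        del stack[cut:]
        stack.append(subtrees)
    return stack


def is_reduced_code(seq, k):
    """Condition (ii): entries >= -1, total -k, and every partial sum negative."""
    partial = 0
    for a in seq:
        if a < -1:
            return False
        partial += a
        if partial >= 0:
            return False
    return partial == -k


if __name__ == "__main__":
    forest = [[[], []], [[]]]  # a root with two leaves, then a root with one leaf
    code = reduced_code(forest)
    print(code, decode(code) == forest, is_reduced_code(code, 2))
```

The stack height after reading $a_1 \cdots a_i$ is $-(a_1 + \cdots + a_i)$, which is why the decoder succeeds exactly when all partial sums are negative.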

Lemma 4.5.2.

Let $a_1 \cdots a_n$ be a sequence of integers greater than or equal to $-1$ with sum $-k < 0$. Then there are exactly $k$ integers $i$, with $1 \le i \le n$, such that the sequence $a_i \cdots a_n a_1 \cdots a_{i-1}$ has all partial sums negative. □

We can now prove Lagrange inversion. We want to prove that the sum of the weights of all $k$-forests with $n$ vertices is $k/n$ times
$$[t^{n-k}]\,R(t)^n = [t^{-k}]\left(\frac{R(t)}{t}\right)^n,$$
where $R(t) = \sum_{n=0}^{\infty} s_n t^n$. Let us define the weight of a sequence $a_1 \cdots a_n$ of integers to be the product $s_{a_1+1}\cdots s_{a_n+1}$. Then by Lemma 4.5.1, the sum of the weights of all $k$-forests with $n$ vertices is the sum of the weights of all sequences of integers of length $n$, with entries greater than or equal to $-1$, with sum $-k$, and with all partial sums negative. It is clear that $[t^{-k}]\,(R(t)/t)^n$ is the sum of the weights of all sequences of integers of length $n$, with entries greater than or equal to $-1$, and with sum $-k$. But by Lemma 4.5.2, a proportion $k/n$ of these sequences have all partial sums negative.

4.6. Proofs by labeled trees.

We can derive Lagrange inversion by counting labeled trees with the following result, which seems to have first been proved by Moon [52].

Theorem 4.6.1.

Let $m$ be a positive integer and let $d_1, d_2, \ldots, d_m$ be positive integers. Then the number of (unrooted) trees with vertex set $[m]$ in which vertex $i$ has degree $d_i$ is the multinomial coefficient
$$\binom{m-2}{d_1-1, \ldots, d_m-1} \tag{4.6.1}$$
if $\sum_{i=1}^{m} d_i = 2(m-1)$ and is 0 otherwise.

We will prove Theorem 4.6.1 a little later; we first look at some of its consequences. A corollary of Theorem 4.6.1 allows us to count forests of rooted trees:

Corollary 4.6.2.

Let $e_1, e_2, \ldots, e_n$ be nonnegative integers with $e_1 + e_2 + \cdots + e_n = n - k$, and let $k$ be a positive integer. Then the number of forests of $k$ rooted trees, with vertex set $[n]$, in which vertex $i$ has $e_i$ children is
$$\binom{n-1}{k-1, e_1, \ldots, e_n}.$$

Proof. Let $T$ be a tree on $[n+1]$ in which vertex $n+1$ has degree $k$, and vertex $i$ has degree $e_i + 1$ for $1 \le i \le n$ (so that $\sum_{i=1}^{n} e_i = n - k$). Removing vertex $n+1$ from $T$ and rooting the resulting component trees at the neighbors of $n+1$ in $T$ gives a forest $F$ of $k$ rooted trees in which vertex $i$ has $e_i$ children, and this operation gives a bijection from trees on $[n+1]$ in which vertex $n+1$ has degree $k$ and vertex $i$ has degree $e_i + 1$ for $1 \le i \le n$ to the set of rooted forests of $k$ trees on $[n]$ in which vertex $i$ has $e_i$ children. The result then follows from Theorem 4.6.1. □

It follows from Corollary 4.6.2 that the number of forests of $k$ rooted trees in which for each $i$, $n_i$ vertices have $i$ children and these vertices are labeled $1, 2, \ldots, n_i$ is given by (4.4.4), which as we saw earlier, is equivalent to Lagrange inversion.

Now to count $k$-forests on $[n]$ in which $n_i$ vertices have $i$ children, we first assign the number of children to each element of $[n]$, which can be done in $\binom{n}{n_0, n_1, \ldots}$ ways. For each assignment, the number of forests is
$$\frac{(n-1)!}{(k-1)!\,0!^{n_0}\,1!^{n_1}\cdots}.$$
Multiplying these factors gives (4.4.3).

We now present sketches of three proofs of Theorem 4.6.1. They all depend on the fact that a tree with at least two vertices must have at least one leaf (vertex of degree 1).

The first proof uses Prüfer's correspondence [60]. Given a tree $T$ with vertex set $[n]$, let $a_1$ be the least leaf of $T$ and let $b_1$ be the unique neighbor of $a_1$ in $T$. Let $T_1$ be the result of removing $a_1$ and its incident edge from $T$. Let $a_2$ be the least leaf of $T_1$ and let $b_2$ be its neighbor in $T_1$. Continue in this way to define $b_1, \ldots, b_{n-2}$. Then the Prüfer code of $T$ is the sequence $(b_1, b_2, \ldots, b_{n-2})$.
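The construction of the Prüfer code described above can be sketched in a few lines of Python. This is an illustration only; the function name `prufer_code` and the edge-list input format are our own choices.

```python
def prufer_code(n, edges):
    """Prüfer code of a tree on vertices 1..n, given as a list of edges (a, b)."""
    neighbors = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    code = []
    for _ in range(n - 2):
        # Least leaf of the current tree, and its unique neighbor.
        leaf = min(v for v in neighbors if len(neighbors[v]) == 1)
        (neighbor,) = neighbors[leaf]
        code.append(neighbor)
        # Remove the leaf and its incident edge.
        neighbors[neighbor].remove(leaf)
        del neighbors[leaf]
    return code


if __name__ == "__main__":
    # Vertex 3 has degree 3, vertex 4 has degree 2, the others are leaves.
    print(prufer_code(5, [(1, 3), (2, 3), (3, 4), (4, 5)]))
```

In the example, vertex 3 appears twice and vertex 4 once in the output, one less than their degrees, while the leaves do not appear at all.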
It can be shown that the map that takes a tree to its Prüfer code is a bijection from the set of trees with vertex set $[n]$, for $n \ge 2$, to the set of sequences $b_1 b_2 \ldots b_{n-2}$ of elements of $[n]$, with the property that a vertex of degree $d$ appears $d - 1$ times in the code. Thus the multinomial coefficient (4.6.1) counts the codes of the trees in which vertex $i$ has degree $d_i$.

The second proof is by induction, and is due to Moon [53] (see also [54, p. 13]). For $m \ge 2$ let $T(m; d_1, \ldots, d_m)$ be the number of trees on $[m]$ in which vertex $i$ has degree $d_i$ and let $U(m; d_1, \ldots, d_m)$ be the multinomial coefficient (4.6.1), which is 0 if any $d_i$ is less than 1 or if $\sum_i d_i \ne 2(m-1)$. It is easy to check that $T(m; d_1, \ldots, d_m)$ is equal to $U(m; d_1, \ldots, d_m)$ for $m = 2$. Now suppose that $n > 2$ and that $T(m; d_1, \ldots, d_m)$ is equal to $U(m; d_1, \ldots, d_m)$ for $m = n - 1$ and all $d_1, \ldots, d_m$. We next observe that if $d_n = 1$ then
$$T(n; d_1, \ldots, d_n) = \sum_{i=1}^{n-1} T(n-1; d_1, \ldots, d_i - 1, \ldots, d_{n-1}) \tag{4.6.2}$$
since every tree in which $n$ is a leaf is obtained by joining vertex $n$ with an edge to some vertex of a tree on $[n-1]$. The multinomial coefficients $U$ satisfy the same recurrence, so by the inductive hypothesis $T(n; d_1, \ldots, d_n)$ is equal to $U(n; d_1, \ldots, d_n)$. We have assumed that $d_n = 1$, but since $T(n; d_1, \ldots, d_n)$ is symmetric in $d_1, \ldots, d_n$ and every tree has at least one leaf, the result holds without this assumption.

Rényi [63] (see also Moon [54, p. 13]) gave an elegant variation of this proof. He showed by induction that for $m \ge 2$ the number of trees on $[m]$ with degree sequence $(d_1, \ldots, d_m)$, i.e., trees in which vertex $i$ has degree $d_i$, is the coefficient of $x_1^{d_1-1}\cdots x_m^{d_m-1}$ in $(x_1 + \cdots + x_m)^{m-2}$. The result clearly holds for $m = 2$, so suppose that $n > 2$ and that the result holds for $m = n - 1$. To show that it holds for $m = n$, we note that every tree on $[n]$ has at least one leaf, so by symmetry, we may assume, without loss of generality, that $d_n = 1$. Thus we need only show that the number of trees on $[n]$ with degree sequence $(d_1, \ldots, d_{n-1}, 1)$ is the coefficient of $x_1^{d_1-1}\cdots x_{n-1}^{d_{n-1}-1}$ in
$$(x_1 + \cdots + x_{n-1})^{n-2} = (x_1 + \cdots + x_{n-1})\cdot(x_1 + \cdots + x_{n-1})^{n-3}.$$
But every tree on $[n]$ in which vertex $n$ is a leaf is obtained by joining vertex $n$ with an edge to some vertex of a tree on $[n-1]$. By the inductive hypothesis, the contribution from the trees in which vertex $n$ is joined to vertex $i$ is $x_i(x_1 + \cdots + x_{n-1})^{n-3}$, and the result follows.

Acknowledgment.

I would like to thank Sateesh Mane and an anonymous referee for helpfulcomments.

References [1] N. H. Abel,

Beweis eines Ausdruckes, von welchem die Binomial-Formel ein einzelner Fall ist , J. ReineAngew. Math. (1826), 159–160.[2] Ulrich Abel, A generalization of the Leibniz rule , Amer. Math. Monthly (2013), no. 10, 924–928.[3] Drew Armstrong,

Generalized noncrossing partitions and combinatorics of Coxeter groups , Mem. Amer.Math. Soc. (2009), no. 949, x+159.[4] Jean-Christophe Aval,

Multivariate Fuss-Catalan numbers , Discrete Math. (2008), no. 20, 4660–4669.[5] F. Bergeron, G. Labelle, and P. Leroux,

Combinatorial Species and Tree-like Structures , Encyclopediaof Mathematics and its Applications, vol. 67, Cambridge University Press, Cambridge, 1998, Translatedfrom the 1994 French original by Margaret Readdy. With a foreword by Gian-Carlo Rota.[6] Christian Brouder, Alessandra Frabetti, and Christian Krattenthaler,

Non-commutative Hopf algebraof formal diﬀeomorphisms , Adv. Math. (2006), no. 2, 479–524.[7] Jean-Paul Bultel,

Combinatorial properties of the noncommutative Fa`a di Bruno algebra , J. AlgebraicCombin. (2013), no. 2, 243–273.[8] L. Carlitz, The generating function for the Jacobi polynomial , Rend. Sem. Mat. Univ. Padova (1967),86–88.[9] , Weighted Stirling numbers of the ﬁrst and second kind. I , Fibonacci Quart. (1980), no. 2,147–162.[10] Augustin-Louis Cauchy, Exercises de math´ematiques, vol. 1 , De Bure fr`eres, 1826, Application du calculdes r´esidus `a la sommation de plusieurs suites. Oeuvres compl`etes d’Augustin Cauchy. S´erie 2, tome 6,pp. 62–73.[11] William Y. C. Chen, Janet F. F. Peng, and Harold R. L. Yang,

Decomposition of triply rooted trees ,Electron. J. Combin. (2013), no. 2, Paper 10, 10 pp.[12] Wenchang Chu, Leibniz inverse series relations and Pfaﬀ-Cauchy derivative identities , Rend. Mat. Appl.(7) (2009), no. 2, 209–221.[13] , Elementary proofs for convolution identities of Abel and Hagen-Rothe , Electron. J. Combin. (2010), no. 1, Note 24, 5 pp.[14] , Derivative inverse series relations and Lagrange expansion formula , Int. J. Number Theory (2013), no. 4, 1001–1013.

[15] J. Cigler, Some remarks on Catalan families, European J. Combin. (1987), no. 3, 261–267.

[16] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, On the Lambert W function, Adv. Comput. Math. (1996), no. 4, 329–359.

[17] Robert M. Corless, David J. Jeffrey, and Donald E. Knuth, A sequence of series for the Lambert W function, Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation (Kihei, HI), ACM, New York, 1997, pp. 197–204.

[18] A. Dvoretzky and Th. Motzkin, A problem of arrangements, Duke Math. J. (1947), 305–313.

[19] Paul H. Edelman, Chain enumeration and noncrossing partitions, Discrete Math. (1980), no. 2, 171–180.

[20] Sen-Peng Eu and Tung-Shan Fu, Lattice paths and generalized cluster complexes, J. Combin. Theory Ser. A (2008), no. 7, 1183–1210.

[21] Nicolao Fuss, Solutio quaestionis, quot modis polygonum n laterum in polygona m laterum, per diagonales resolvi queat, Nova Acta Academiae Sci. Petropolitanae (1791), 243–251.

[22] Adriano M. Garsia, A q-analogue of the Lagrange inversion formula, Houston J. Math. (1981), no. 2, 205–237.

[23] Ira Gessel, A noncommutative generalization and q-analog of the Lagrange inversion formula, Trans. Amer. Math. Soc. (1980), no. 2, 455–482.

[24] Ira Gessel and Richard P. Stanley, Stirling polynomials, J. Combinatorial Theory Ser. A (1978), no. 1, 24–33.

[25] Ira Gessel and Dennis Stanton, Applications of q-Lagrange inversion to basic hypergeometric series, Trans. Amer. Math. Soc. (1983), no. 1, 173–201.

[26] Ira Gessel and Dennis Stanton, Short proofs of Saalschütz’s and Dixon’s theorems, J. Combin. Theory Ser. A (1985), no. 1, 87–90.

[27] Ira M. Gessel, A factorization for formal Laurent series and lattice path enumeration, J. Combin. Theory Ser. A (1980), no. 3, 321–337.

[28] Ira M. Gessel, A combinatorial proof of the multivariable Lagrange inversion formula, J. Combin. Theory Ser. A (1987), no. 2, 178–195.

[29] Ira M. Gessel and Gilbert Labelle, Lagrange inversion for species, J. Combin. Theory Ser. A (1995), no. 1, 95–117.

[30] H. W. Gould, Final analysis of Vandermonde’s convolution, Amer. Math. Monthly (1957), 409–415.

[31] H. W. Gould, Euler’s formula for nth differences of powers, Amer. Math. Monthly (1978), no. 6, 450–467.

[32] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, Concrete Mathematics: A Foundation for Computer Science, second ed., Addison-Wesley Publishing Company, Reading, MA, 1994.

[33] Johann G. Hagen, S. J., Synopsis der Hoeheren Mathematik: Arithmetische und Algebraische Analyse, vol. 1, Felix L. Dames, Berlin, 1891.

[34] F. Hirzebruch, The signature theorem: reminiscences and recreation, Prospects in mathematics (Proc. Sympos., Princeton Univ., Princeton, N.J., 1970), Princeton Univ. Press, Princeton, N.J., 1971, pp. 3–31. Ann. of Math. Studies, No. 70.

[35] Josef Hofbauer, Lagrange-inversion, Sém. Lothar. Combin. (1982), Art. B06a.

[36] Jianfeng Huang and Xinrong Ma, Two elementary applications of the Lagrange expansion formula, J. Math. Res. Appl. (2015), no. 3, 263–270.

[37] Eri Jabotinsky, Representation of functions by matrices. Application to Faber polynomials, Proc. Amer. Math. Soc. (1953), 546–553.

[38] C. G. J. Jacobi, De resolutione aequationum per series infinitas, Journal für die reine und angewandte Mathematik (1830), 257–286, Gesammelte Werke, vol. 6, pp. 26–61, G. Reimer, Berlin (1891), reprinted by Chelsea Publishing Company, New York (1969).

[39] J. L. W. V. Jensen, Sur une identité d’Abel et sur d’autres formules analogues, Acta Math. (1902), 307–318.

[40] Warren P. Johnson, The Pfaff/Cauchy derivative identities and Hurwitz type extensions, Ramanujan J. (2007), no. 1–3, 167–201.

[41] Dan Kneezel, Hirzebruch’s motivation of the Todd class, MathOverflow, http://mathoverflow.net/q/60478 (version: 2011-04-03).

[42] Donald E. Knuth, The Art of Computer Programming. Vol. 1: Fundamental Algorithms, third ed., Addison-Wesley, Reading, MA, 1997.

[43] Ch. Krattenthaler, Operator methods and Lagrange inversion: a unified approach to Lagrange formulas, Trans. Amer. Math. Soc. (1988), no. 2, 431–465.

[44] Christian Krattenthaler, The F-triangle of the generalised cluster complex, Topics in Discrete Mathematics, Algorithms Combin., vol. 26, Springer, Berlin, 2006, pp. 93–126.

[45] Gilbert Labelle, Some new computational methods in the theory of species, Combinatoire énumérative (Montreal, Que., 1985/Quebec, Que., 1985), Lecture Notes in Math., vol. 1234, Springer, Berlin, 1986, pp. 192–209.

[46] Alexandre Lacasse, Bornes PAC-Bayes et algorithmes d’apprentissage, Ph.D. thesis, Université Laval, Québec, 2010.

[47] Joseph Louis Lagrange, Nouvelle méthode pour résoudre les équations littérales par le moyen des séries, Mémoires de l’Académie Royale des Sciences et Belles-Lettres de Berlin (1770), 251–326, Oeuvres complètes, tome 3, Paris, 1867, 5–73.

[48] Romuald Lenczewski and Rafał Sałapata, Multivariate Fuss-Narayana polynomials and their application to random matrices, Electron. J. Combin. (2013), no. 2, Paper 41, 14 pp.

[49] J. Liouville, Remarques sur un mémoire de N. Fuss, J. Math. Pures Appl. (1843), 391–394.

[50] Toufik Mansour and Yidong Sun, Bell polynomials and k-generalized Dyck paths, Discrete Appl. Math. (2008), no. 12, 2279–2292.

[51] D. Merlini, R. Sprugnoli, and M. C. Verri, Lagrange inversion: when and how, Acta Appl. Math. (2006), no. 3, 233–249 (2007).

[52] J. W. Moon, The second moment of the complexity of a graph, Mathematika (1964), 95–98.

[53] J. W. Moon, Enumerating labelled trees, Graph Theory and Theoretical Physics, Academic Press, London, 1967, pp. 261–272.

[54] J. W. Moon, Counting Labelled Trees, Canadian Mathematical Monographs, No. 1, Canadian Mathematical Congress, Montreal, Que., 1970.

[55] Ivan Niven, Formal power series, Amer. Math. Monthly (1969), 871–889.

[56] Jean-Christophe Novelli and Jean-Yves Thibon, Noncommutative symmetric functions and Lagrange inversion, Adv. in Appl. Math. (2008), no. 1, 8–35.

[57] The On-Line Encyclopedia of Integer Sequences, published electronically at http://oeis.org, 2016.

[58] J. F. Pfaff, Allgemeine Summation einer Reihe, worinn höhere Differenziale vorkommen, Archiv der reinen und angewandten Mathematik (1795), 337–47.

[59] Helmut Prodinger, An identity conjectured by Lacasse via the tree function, Electron. J. Combin. (2013), no. 3, Paper 7, 3 pp.

[60] Heinz Prüfer, Neuer Beweis eines Satzes über Permutationen, Arch. Math. Phys (1918), 742–744.

[61] George N. Raney, Functional composition patterns and power series reversion, Trans. Amer. Math. Soc. (1960), 441–451.

[62] George N. Raney, A formal solution of $\sum_{i=1}^{\infty} A_i e^{B_i X} = X$, Canad. J. Math. (1964), 755–762.

[63] Alfréd Rényi, On the enumeration of trees, Combinatorial Structures and their Applications (Proc. Calgary Internat. Conf., Calgary, Alta., 1969), Gordon and Breach, New York, 1970, pp. 355–360.

[64] John Riordan, Combinatorial Identities, John Wiley & Sons, Inc., New York-London-Sydney, 1968.

[65] Henrico Augusto Rothe, Formulae De Serierum Reversione Demonstratio Universalis Signis Localibus Combinatorio-Analyticorum Vicariis Exhibita, Sommer, Leipzig, 1793.

[66] Issai Schur, On Faber polynomials, Amer. J. Math. (1945), 33–41.

[67] Leonard M. Smiley, Completion of a rational function sequence of Carlitz, 2000, arXiv:math/0006106 [math.CO].

[68] Alan D. Sokal, A ridiculously simple and explicit implicit function theorem, Sém. Lothar. Combin. (2009/11), Art. B61Ad, 21 pp.

[69] Richard P. Stanley, Parking functions and noncrossing partitions, Electron. J. Combin. (1997), no. 2, Research Paper 20, 4 pp., The Wilf Festschrift (Philadelphia, PA, 1996).

[70] Richard P. Stanley, Enumerative Combinatorics. Vol. 2, Cambridge Studies in Advanced Mathematics, vol. 62, Cambridge University Press, Cambridge, 1999, With a foreword by Gian-Carlo Rota and appendix 1 by Sergey Fomin.

[71] Dennis Stanton, Recent results for the q-Lagrange inversion formula, Ramanujan revisited (Urbana-Champaign, Ill., 1987), Academic Press, Boston, MA, 1988, pp. 525–536.

[72] Yidong Sun, A simple bijection between binary trees and colored ternary trees, Electron. J. Combin. (2010), no. 1, Note 20, 5 pp.

[73] Yidong Sun, A simple proof of an identity of Lacasse, Electron. J. Combin. (2013), no. 2, Paper 11, 3 pp.

[74] William T. Tutte, The number of planted plane trees with a given partition, Amer. Math. Monthly (1964), no. 3, 272–277.

[75] Chao-Jen Wang, Applications of the Goulden-Jackson Cluster Method to Counting Dyck paths by Occurrences of Subwords, Ph.D. thesis, Brandeis University, 2011.

[76] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, Third Edition, Cambridge University Press, 1920.

[77] Guoce Xin, A residue theorem for Malcev-Neumann series, Adv. in Appl. Math. (2005), no. 3, 271–293.

[78] Dapeng Xu, Generalizations of two-stack-sortable permutations, 2002, arXiv:math/0209313 [math.CO].

[79] Malik Younsi, An algebra of power series arising in the intersection theory of moduli spaces of curves and in the enumeration of ramified coverings of the sphere, 2004, arXiv:math/0403092 [math.AG].

[80] Malik Younsi, Proof of a combinatorial conjecture coming from the PAC-Bayesian machine learning theory, 2012, arXiv:1209.0824 [math.CO].

Department of Mathematics, Brandeis University, Waltham, MA 02453-2700

E-mail address : [email protected]@brandeis.edu