Further results on the estimation of dynamic panel logit models with fixed effects

Abstract

Kitazawa (2013, 2016) showed that the common parameters in the panel logit AR(1) model with strictly exogenous covariates and fixed effects are estimable at the root-n rate using the Generalized Method of Moments. Honoré and Weidner (2020) extended his results in various directions: they found additional moment conditions for the logit AR(1) model and also considered estimation of logit AR(p) models with p>1. In this note we prove a conjecture in their paper and show that 2^{T}-2T of their moment functions for the logit AR(1) model are linearly independent and span the set of valid moment functions, which is a 2^{T}-2T -dimensional linear subspace of the 2^{T} -dimensional vector space of real valued functions over the outcomes y element of {0,1}^{T}. We also prove that when p=2 and T element of {3,4,5}, there are, respectively, 2^{T}-4(T-1) and 2^{T}-(3T-2) linearly independent moment functions for the panel logit AR(2) models with and without covariates.


Hugo Kruiniger*
Durham University

This version: 27 October 2020

JEL classification: C12, C13, C23.
Keywords: dynamic panel logit models, exogenous regressors, fixed effects.

* Address: [email protected]; Department of Economics, 1 Mill Hill Lane, Durham DH1 3LB, England. I thank N. Peyerimhoff for helpful comments. All remaining errors are mine.

Proof of a conjecture in Honoré and Weidner (2020)

We adopt the notation of Honoré and Weidner (2020). Their conjecture on p. 17 is that for $\gamma \neq 0$ (and arbitrary $y_0$, $x$ and $\beta$; the index $i$ is omitted) any moment function $m_{y_0}(y,x,\beta,\gamma) = w(y_1,\dots,y_{t-1})\, m^{(a/b)(t,s,r)}_{y_0}(y,x,\beta,\gamma)$ for the panel logit AR(1) model with strictly exogenous regressors and $T \geq 3$ can be written as

$$m_{y_0}(y,x,\beta,\gamma) = \sum_{t=1}^{T-2} \sum_{s=t+1}^{T-1} \left[ w^{(a)}_{y_0}(t,s,y_1,\dots,y_{t-1},x,\beta,\gamma)\, m^{(a)(t,s,T)}_{y_0}(y,x,\beta,\gamma) + w^{(b)}_{y_0}(t,s,y_1,\dots,y_{t-1},x,\beta,\gamma)\, m^{(b)(t,s,T)}_{y_0}(y,x,\beta,\gamma) \right]$$

with weights $w^{(a/b)}_{y_0}(t,s,y_1,\dots,y_{t-1},x,\beta,\gamma) \in \mathbb{R}$ that are uniquely determined by the function $m_{y_0}(\cdot,x,\beta,\gamma)$.

We will prove this conjecture by showing (i) that the set of valid moment functions is a linear subspace of the $2^T$-dimensional vector space of real valued functions over the outcomes $y \in \{0,1\}^T$ that has a dimension of at most $2^T - 2T$, and (ii) that the $2^T - 2T$ functions of the form $w_{y_1,\dots,y_{t-1}}(y_1,\dots,y_{t-1})\, m^{(a/b)(t,s,T)}_{y_0}(y,x,\beta,\gamma)$, where $w_{y_1,\dots,y_{t-1}}(y_1,\dots,y_{t-1}) : \{0,1\}^{t-1} \to \{0,1\}$ are $2^{t-1}$ linearly independent indicator functions and $1 \leq t < s < T$, are linearly independent and span this subspace.

Proof:

Recall that

$$\Pr(Y_i = y_i \mid Y_{i0} = y_{i0}, X_i = x_i, A_i = \alpha_i) \equiv p_{y_{i0}}(y_i, x_i, \beta_0, \gamma_0, \alpha_i) = \prod_{t=1}^{T} \frac{1}{1 + \exp[(1 - 2y_{it})(x_{it}'\beta_0 + y_{i,t-1}\gamma_0 + \alpha_i)]}.$$

We drop the index $i$. A valid moment function $m_{y_0}(y,x,\beta,\gamma)$ satisfies $E[m_{y_0}(Y,X,\beta_0,\gamma_0) \mid Y_0 = y_0, X = x, A = \alpha] = 0$ for all $\alpha \in \mathbb{R}$, or equivalently

$$\sum_{y \in \{0,1\}^T} p_{y_0}(y,x,\beta_0,\gamma_0,\alpha)\, m_{y_0}(y,x,\beta_0,\gamma_0) = 0 \quad \text{for all } \alpha \in \mathbb{R}.$$

Let $T \geq 2$ and $\alpha_1 < \alpha_2 < \dots < \alpha_{2T}$. Define the $2T \times 2^T$ matrix $\bar{P}$ with typical element $\bar{P}_{g,h} = p_{y_0}(y,x,\beta_0,\gamma_0,\alpha_g)$ for $g \in \{1,2,\dots,2T\}$ and $h = 1 + 2^0 y_1 + 2^1 y_2 + \dots + 2^{T-1} y_T$. Let $P_{g,t} = \exp(x_t'\beta_0 + \alpha_g)$ and $y_S = \sum_{t=1}^{T} y_t$. Define the $2T \times 2^T$ matrix $\breve{P}$ with typical element

$$\breve{P}_{g,h} = P_{g,T}^{y_T} \prod_{t=1}^{T-1} \left( \frac{P_{g,t}(1 + P_{g,t+1})}{1 + P_{g,t+1}e^{\gamma_0}} \right)^{y_t}$$

for $g \in \{1,2,\dots,2T\}$ and $h = 1 + 2^0 y_1 + 2^1 y_2 + \dots + 2^{T-1} y_T$. Note that $\breve{P} = D \bar{P} \breve{D}$ for some nonsingular diagonal matrices $D = D(x,\beta_0,\gamma_0,\alpha)$ and $\breve{D} = \breve{D}(\gamma_0)$. Hence $rk(\breve{P}) = rk(\bar{P})$.

We now show (i). If the model does not contain covariates, i.e., if $\beta = 0$, then $\breve{P}$ does not depend on $x$ and there exist $2^T - rk(\breve{P})$ linearly independent moment functions, which will not depend on $x$. Furthermore, the number of linearly independent moment functions available for the model without covariates is at least as large as the number of linearly independent moment functions available for the model that does include them, i.e., that allows $\beta \neq 0$. In the appendix we show that $rk(\breve{P}) \geq 2T$ irrespective of whether $\beta = 0$ or $\beta \neq 0$; that is, we prove Lemma 1, which states that the $2T$ columns of $\breve{P}$ corresponding to vectors $y$ with either the first $k$ or the last $k$ elements equal to 1 and the remaining elements (if any) equal to 0, for $k = 0,1,2,\dots,T$, are linearly independent. Recall that $rk(\breve{P}) = rk(\bar{P})$. It follows that claim (i) is correct.
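The rank claim behind (i) can be checked numerically for small $T$ in the model without covariates. The following is a minimal sketch, not the author's code: the helper names (`exact_rank`, `ar1_prob`, `ar1_rank`) and the choice $e^{\alpha_g} = g$ are illustrative assumptions, and exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction
from itertools import product

def exact_rank(rows):
    """Rank of a matrix of Fractions via Gaussian elimination (no rounding error)."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            f = m[r][col] / m[rank][col]
            if f != 0:
                m[r] = [x - f * z for x, z in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def ar1_prob(y, y0, a, c):
    """Pr(Y = y | y0, alpha) for beta = 0, with a = exp(alpha) and c = exp(gamma)."""
    p, prev = Fraction(1), y0
    for yt in y:
        z = a * (c if prev else 1)        # exp(alpha + gamma * y_{t-1})
        p *= (z if yt else 1) / (1 + z)   # logit probability of y_t
        prev = yt
    return p

def ar1_rank(T, c, y0=0):
    """Rank of the 2T x 2^T matrix of outcome probabilities (rows: alpha_g, columns: y)."""
    a_values = [Fraction(g) for g in range(1, 2 * T + 1)]  # assumed values of exp(alpha_g)
    outcomes = list(product((0, 1), repeat=T))
    return exact_rank([[ar1_prob(y, y0, a, c) for y in outcomes] for a in a_values])
```

With $e^{\gamma} = 2$, `ar1_rank(3, Fraction(2))` evaluates to 6, which equals $2T$ and leaves $2^3 - 6 = 2$ moment functions; with $e^{\gamma} = 1$ (i.e., $\gamma = 0$) the rank drops to $T + 1 = 4$.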
We now show (ii). It is easily seen that the moment functions $w_{y_1,\dots,y_{t'-1}}(y_1,\dots,y_{t'-1}) \times m^{(a/b)(t',s',T)}_{y_0}(y,x,\beta,\gamma)$ with $t' < s' < T$ are linearly independent of $w_{y_1,\dots,y_{t-1}}(y_1,\dots,y_{t-1}) \times m^{(a/b)(t,s,T)}_{y_0}(y,x,\beta,\gamma)$ with $t < s < T$ and $(t,s) \neq (t',s')$, because only the former depend on $\exp[\pm z_{t',s'}(y_0,y,x,\beta,\gamma)]$, where $z_{t',s'}(y_0,y,x,\beta,\gamma) = (x_{t'} - x_{s'})'\beta + \gamma(y_{t'-1} - y_{s'-1})$. This is still true when $\beta = 0$. Furthermore, for any $t'$ and $s'$ with $t' < s' < T$, the moment functions $w_{y_1,\dots,y_{t'-1}}(y_1,\dots,y_{t'-1})\, m^{(a/b)(t',s',T)}_{y_0}$ are linearly independent, because the $w_{y_1,\dots,y_{t'-1}}(y_1,\dots,y_{t'-1})$ are linearly independent indicator functions and $m^{(a)(t',s',T)}_{y_0}$ and $m^{(b)(t',s',T)}_{y_0}$ are linearly independent. Hence the $2^T - 2T$ functions of the form $w_{y_1,\dots,y_{t-1}}(y_1,\dots,y_{t-1})\, m^{(a/b)(t,s,T)}_{y_0}(y,x,\beta,\gamma)$ are linearly independent. They are also valid moment functions. It follows that they span a $(2^T - 2T)$-dimensional linear subspace of the $2^T$-dimensional vector space of real valued functions over the outcomes $y \in \{0,1\}^T$ that contains the valid moment functions.

Remark 1: The analysis above is also valid when there are no covariates, i.e., $\beta = 0$.

Note: More generally, any $2T$ columns of $\breve{P}$ will be linearly independent if they correspond to the following $2T$ $y$-vectors: the two $y$-vectors that satisfy $y_S = 0$ or $y_S = T$, and for each $k \in \{1,2,\dots,T-1\}$ two $y$-vectors that satisfy $y_S = k$, one with $y_T = 0$ and the other with $y_T = 1$.

Remark 2: If $\beta \neq 0$, then $\breve{P}$ depends on the $x_i$ and it may well be the case that $rk(\breve{P}) > 2T$. However, this does not contradict the existence of $2^T - 2T$ linearly independent moment functions, because for each individual $i$ they will depend on $x_i$. On the other hand, when $\beta = 0$, then $\breve{P}$ does not depend on the $x_i$ and hence there exist $2^T - rk(\breve{P})$ linearly independent moment functions, which will not depend on the $x_i$.
This and part (ii) of the proof in turn imply that when $\beta = 0$, then $rk(\breve{P}) = 2T$.

Remark 3: It follows from the result under (i) that there are no valid moment functions when $T = 2$. In other words, GMM estimation of the panel logit AR(1) model with fixed effects and possibly strictly exogenous covariates is not possible for $T = 2$. Our proof is more general than that of Honoré and Weidner (2020) for this claim because we also cover the case where the values of $\alpha$ can only be finite. In their proof, Honoré and Weidner (2020) chose two of the four different values of $\alpha$ equal to $\pm\infty$, which leads to probabilities that are equal to 1 for the events where all elements of $y$ are either zero or one. This unnecessarily restricts the moment functions a priori. In contrast, we also allow all the probabilities of observing a $y$-vector with only zeros or only ones to be less than 1.

Remark 4: The analysis above can also be extended to panel logit AR($p$) models with fixed effects and $p > 1$.

Note (case $\gamma = 0$): In that case $\breve{P}_{g,h} = \prod_{t=1}^{T} P_{g,t}^{y_t}$. When also $\beta = 0$, $\breve{P}$ is equal to a matrix with columns from a Vandermonde matrix of rank $T + 1$. It follows that when $\gamma = 0$, the set of valid moment functions is a linear subspace of the $2^T$-dimensional vector space of real valued functions over the outcomes $y \in \{0,1\}^T$ that has at most dimension $2^T - (T+1)$, and in particular that when $T = 2$, there exists at most one valid moment condition.

When $p = 2$, we have

$$\Pr(Y_i = y_i \mid Y_{i0} = y_{i0}, Y_{i,-1} = y_{i,-1}, X_i = x_i, A_i = \alpha_i) \equiv p_{y_i^{(0)}}(y_i, x_i, \beta_0, \gamma_0, \alpha_i) = \prod_{t=1}^{T} \frac{\exp\left[ y_{it} \left( x_{it}'\beta_0 + \sum_{l=1}^{2} y_{i,t-l}\gamma_l + \alpha_i \right) \right]}{1 + \exp\left( x_{it}'\beta_0 + \sum_{l=1}^{2} y_{i,t-l}\gamma_l + \alpha_i \right)},$$

where $\gamma_0 = (\gamma_1, \gamma_2)'$. We drop the index $i$. Let us redefine $\bar{P}$ as a $2^T \times 2^T$ matrix with typical element $\bar{P}_{g,h} = p_{y^{(0)}}(y,x,\beta_0,\gamma_0,\alpha_g)$ for $g \in \{1,2,\dots,2^T\}$ and $h = 1 + 2^0 y_1 + 2^1 y_2 + \dots + 2^{T-1} y_T$, and let us redefine $\breve{P}$ as a $2^T \times 2^T$ matrix with typical element

$$\breve{P}_{g,h} = P_{g,T}^{y_T} \times \prod_{t=2}^{T-1} \left( P_{g,t-1} \left( \frac{(1+P_{g,t})(1+P_{g,t+1})}{(1+P_{g,t}e^{\gamma_1})(1+P_{g,t+1}e^{\gamma_2})} \right)^{1-y_{t-2}} \left( \frac{(1+P_{g,t}e^{\gamma_2})(1+P_{g,t+1})}{(1+P_{g,t}e^{\gamma_1+\gamma_2})(1+P_{g,t+1}e^{\gamma_2})} \right)^{y_{t-2}} \right)^{y_{t-1}} \times \left( P_{g,T-1} \left( \frac{1+P_{g,T}}{1+P_{g,T}e^{\gamma_1}} \right)^{1-y_{T-2}} \left( \frac{1+P_{g,T}e^{\gamma_2}}{1+P_{g,T}e^{\gamma_1+\gamma_2}} \right)^{y_{T-2}} \right)^{y_{T-1}}$$

for $g \in \{1,2,\dots,2^T\}$ and $h = 1 + 2^0 y_1 + 2^1 y_2 + \dots + 2^{T-1} y_T$. Note that with these new definitions of $\bar{P}$ and $\breve{P}$, we still have $\breve{P} = D \bar{P} \breve{D}$ for some nonsingular diagonal matrices $D = D(x,\beta_0,\gamma_0,\alpha)$ and $\breve{D} = \breve{D}(\gamma_0)$.

The formula for $\breve{P}_{g,h}$ suggests that a second conjecture of Honoré and Weidner (2020), henceforth H&W, is plausible, namely that the number of linearly independent moment functions for the general panel logit AR($p$) models with covariates is given by $l = 2^T - (T+1-p)2^p$. When $p$ increases by one, the number of factors under the product sign $\prod$ decreases by one (this number equals $T - 2$ when $p = 2$), which explains the $(T+1-p)$ part of the formula, while the number of possible values for a $p$-tuple $(y_t, y_{t-1}, \dots, y_{t-p+1})$, namely $2^p$, doubles.

To prove H&W's second conjecture for $p > 1$, one can in principle follow a similar proof strategy as for the case where $p = 1$. However, when $p > 1$, things are a bit more complicated. As H&W demonstrate, when $p > 1$, the number of linearly independent moment functions for the general panel logit AR($p$) model is smaller than the number of linearly independent moment functions for the panel logit AR($p$) model without covariates (i.e., with $\beta = 0$). One can relatively easily establish the latter number for different values of $T$ by using a proof strategy similar to that for the case $p = 1$. The difference between the two numbers of moment functions is equal to the number of linearly independent "special" moment functions that are only valid for "special" versions of the model, e.g. the model with $\beta = 0$, but not for the general model.
Thus by subtracting the number of these special moment functions from the total number of linearly independent moment functions for the model with $\beta = 0$, one obtains the number of linearly independent moment functions for the general model.

H&W claim that they have found all moment functions for the general model when $T \leq 5$. However, their claim is premature as they have not shown that there cannot be more than $l$ moment functions for the general model when $T \leq 5$. We have shown this for $p = 1$ (and any $T$) and we will show this in the appendix for $p = 2$ and $T \leq 5$.

Note: H&W have found one moment function for the panel logit AR(2) model with $\beta = 0$ (given the value of $y^{(0)}$) when $T = 3$, which is a special case of a moment function that is only valid when $x_2 = x_3$. However, they have not shown that when $T = 3$, there is only one moment function for this model.

For the panel logit AR(2) model without covariates (i.e., with $\beta = 0$), one can show that $rk(\breve{P}) = 4(T-1) - (T-2) = 3T - 2$, so that there are $2^T - (3T-2)$ linearly independent moment functions available for this model. One can easily obtain these by solving the system $\hat{P}_{[3T-2]} \bar{M} = 0$, where $\hat{P}_{[3T-2]} = \hat{P}_{[3T-2]}(e^{\gamma_1}, e^{\gamma_2})$ is a $(3T-2) \times 2^T$ matrix that consists of (any) $3T - 2$ rows of $\bar{P}\breve{D}$, each evaluated at a different value of $\alpha_g$, and $\bar{M}$ is a $2^T \times (2^T - (3T-2))$ matrix with $rk(\bar{M}) = 2^T - (3T-2)$. The $2^T - (3T-2)$ columns of $\bar{M}$ span the nullspace of $\bar{P}\breve{D}$, which is the space of valid moment functions for the panel logit AR(2) model without covariates. A proof strategy for the claim that $rk(\breve{P}) = 3T - 2$ is given in the appendix.

References

[1] Honoré, B. E., and M. Weidner, 2020, Moment Conditions for Dynamic Panel Logit Models with Fixed Effects, arXiv:2005.05942v3 [econ.EM], 21 Jun 2020.
[2] Kitazawa, Y., 2013, Exploration of dynamic fixed effects logit models from a traditional angle, Discussion paper No. 60, Kyushu Sangyo University, Faculty of Economics.
[3] Kitazawa, Y., 2016, Root-N consistent estimations of time dummies for the dynamic fixed effects logit models: Monte Carlo illustrations, Discussion paper No. 72, Kyushu Sangyo University, Faculty of Economics.

Appendix

Lemma 1

The $2T$ columns of $\breve{P}$ corresponding to vectors $y$ with either the first $k$ or the last $k$ elements equal to 1 and the remaining elements (if any) equal to 0, for $k = 0,1,2,\dots,T$, are linearly independent a.s. (almost surely) for any $T \geq 2$.

Proof:

We will prove this Lemma by showing that the square matrix $\tilde{P}_{2T}$ (sometimes simply denoted by $\tilde{P}$ for some value of $T$) that contains the first $2T$ rows of these $2T$ columns of $\breve{P}$ has full rank for any $T \geq 2$. We first consider the case where $\beta = 0$. We define the elements of the matrix $\tilde{P}_{2T}$ as follows.

If $y = (1,\dots,1,0,\dots,0)'$ with the first $k$ entries equal to 1 and $0 \leq k \leq T-1$, then

$$\tilde{P}_{2T,g,h} = \left( \frac{e^{\alpha_g}(1 + e^{\alpha_g})}{1 + e^{\alpha_g+\gamma}} \right)^{k}$$

for any $g \in \{1,2,\dots,2T\}$ and for $h = 2k+1$. If $y = (0,\dots,0,1,\dots,1)'$ with the last $k+1$ entries equal to 1 and $0 \leq k \leq T-1$, then

$$\tilde{P}_{2T,g,h} = e^{\alpha_g} \left( \frac{e^{\alpha_g}(1 + e^{\alpha_g})}{1 + e^{\alpha_g+\gamma}} \right)^{k}$$

for any $g \in \{1,2,\dots,2T\}$ and for $h = 2(k+1)$. We also write $\tilde{P}_{g,h}$ for $\tilde{P}_{2T,g,h}$.

Let $D_{2T} = \mathrm{diag}(1 + e^{\alpha_1+\gamma}, 1 + e^{\alpha_2+\gamma}, \dots, 1 + e^{\alpha_{2T}+\gamma})$. Note that $\det(D_{2T}) \neq 0$.

We will prove the Lemma by induction. When $T = 2$, we consider the $4 \times 4$ matrix

$$D_4 \tilde{P}_4 = \begin{pmatrix} 1 + e^{\alpha_1+\gamma} & e^{\alpha_1}(1 + e^{\alpha_1+\gamma}) & e^{\alpha_1}(1 + e^{\alpha_1}) & e^{2\alpha_1}(1 + e^{\alpha_1}) \\ 1 + e^{\alpha_2+\gamma} & e^{\alpha_2}(1 + e^{\alpha_2+\gamma}) & e^{\alpha_2}(1 + e^{\alpha_2}) & e^{2\alpha_2}(1 + e^{\alpha_2}) \\ 1 + e^{\alpha_3+\gamma} & e^{\alpha_3}(1 + e^{\alpha_3+\gamma}) & e^{\alpha_3}(1 + e^{\alpha_3}) & e^{2\alpha_3}(1 + e^{\alpha_3}) \\ 1 + e^{\alpha_4+\gamma} & e^{\alpha_4}(1 + e^{\alpha_4+\gamma}) & e^{\alpha_4}(1 + e^{\alpha_4}) & e^{2\alpha_4}(1 + e^{\alpha_4}) \end{pmatrix},$$

and it is easily verified that $rank(D_4 \tilde{P}_4) = 4$ a.s. (Recall that $\gamma \neq 0$, note that any linear combination of the first two columns of $D_4 \tilde{P}_4$ depends on $\gamma$, and conclude that the $k$-th column of $D_4 \tilde{P}_4$ cannot be written as a linear combination of the $k-1$ columns on its left-hand side for $k = 2,\dots,4$.) Since $\det(D_4) \neq 0$, it follows that $rank(\tilde{P}_4) = 4$ a.s.

Assuming that the Lemma is correct for $T = S + 2$ for some $S \in \mathbb{N}$, we will now prove that it is also correct for $T = S + 3$.

The $2(S+3) \times 2(S+3)$ matrix $\tilde{P} = \tilde{P}_{2(S+3)}$ contains the $2(S+2) \times 2(S+2)$ matrix $\tilde{P}_{2(S+2)}$ (in the north-west corner) and two more rows and columns:

$$\tilde{P}_{2(S+3)} = \begin{pmatrix} \tilde{P}_{2(S+2)} & [\breve{P}_{g,2^{S+2}}]_{1 \leq g \leq 2(S+2)} & [\breve{P}_{g,2^{S+3}}]_{1 \leq g \leq 2(S+2)} \\ [\tilde{P}_{2S+5,h}]_{1 \leq h \leq 2(S+2)} & \breve{P}_{2S+5,2^{S+2}} & \breve{P}_{2S+5,2^{S+3}} \\ [\tilde{P}_{2S+6,h}]_{1 \leq h \leq 2(S+2)} & \breve{P}_{2S+6,2^{S+2}} & \breve{P}_{2S+6,2^{S+3}} \end{pmatrix},$$

where $\breve{P}$ is a $2(S+3) \times 2^{S+3}$ matrix.

We can partition $D^{S+2}_{2(S+3)} \tilde{P} = D^{S+2}_{2(S+3)} \tilde{P}_{2(S+3)}$ as

$$\begin{pmatrix} D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)} & B \\ C & F \end{pmatrix}.$$

Let $M \equiv D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)} - B F^{-1} C$. Then it follows from a standard result regarding the determinants of partitioned matrices that $\det(D^{S+2}_{2(S+3)} \tilde{P}) = \det(F)\det(M)$.

It is easily checked that $F$ has full rank, i.e., $rank(F) = 2$:

$$F = \begin{pmatrix} (e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}}))^{S+2} & e^{\alpha_{2S+5}}(e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}}))^{S+2} \\ (e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}}))^{S+2} & e^{\alpha_{2S+6}}(e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}}))^{S+2} \end{pmatrix},$$

so $\det(F) \neq 0$ because $e^{\alpha_{2S+6}} - e^{\alpha_{2S+5}} \neq 0$.

It is also easily shown that $M = D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)} - B F^{-1} C$ is invertible. It follows from Leibniz's formula for determinants (or from Laplace's expansion of the determinant, which uses cofactors and minors) that $\det(M)$ is equal to a polynomial in the elements of $M$. This polynomial can be rewritten as a sum of terms that includes the term $\det(D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)})$, we have $\det(D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)}) \neq 0$ a.s., and (the sum of) all the other terms in this sum is a.s. incapable of canceling out $\det(D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)})$, as we now show.

Let $Q = B F^{-1} C \equiv \tilde{Q}/\det(F)$. Then $Q_{g,h} = B_{g,\cdot} F^{-1} C_{\cdot,h} = \tilde{Q}_{g,h}/\det(F)$ with

$$\tilde{Q}_{g,h} = (1 + e^{\alpha_g+\gamma})^{S+2} \begin{pmatrix} \breve{P}_{g,2^{S+2}} & \breve{P}_{g,2^{S+3}} \end{pmatrix} \begin{pmatrix} e^{\alpha_{2S+6}}(e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}}))^{S+2} & -e^{\alpha_{2S+5}}(e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}}))^{S+2} \\ -(e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}}))^{S+2} & (e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}}))^{S+2} \end{pmatrix} \begin{pmatrix} \tilde{P}_{2S+5,h}(1 + e^{\alpha_{2S+5}})^{S+2} \\ \tilde{P}_{2S+6,h}(1 + e^{\alpha_{2S+6}})^{S+2} \end{pmatrix}$$

$$= \begin{pmatrix} (e^{\alpha_g}(1 + e^{\alpha_g}))^{S+2} & e^{\alpha_g}(e^{\alpha_g}(1 + e^{\alpha_g}))^{S+2} \end{pmatrix} \begin{pmatrix} e^{\alpha_{2S+6}}(e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}}))^{S+2} & -e^{\alpha_{2S+5}}(e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}}))^{S+2} \\ -(e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}}))^{S+2} & (e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}}))^{S+2} \end{pmatrix} \begin{pmatrix} (e^{\alpha_{2S+5}})^{\delta} \left( \frac{e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}})}{1 + e^{\alpha_{2S+5}+\gamma}} \right)^{k} (1 + e^{\alpha_{2S+5}})^{S+2} \\ (e^{\alpha_{2S+6}})^{\delta} \left( \frac{e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}})}{1 + e^{\alpha_{2S+6}+\gamma}} \right)^{k} (1 + e^{\alpha_{2S+6}})^{S+2} \end{pmatrix}$$

for some $k \in \{0,1,\dots,S+1\}$ and some $\delta \in \{0,1\}$. Omitting the factor $(e^{\alpha_g}(1 + e^{\alpha_g})(1 + e^{\alpha_{2S+5}})(1 + e^{\alpha_{2S+6}}))^{S+2}$,

$$\tilde{Q}_{g,h} \propto \begin{pmatrix} 1 & e^{\alpha_g} \end{pmatrix} \begin{pmatrix} e^{\alpha_{2S+6}}(e^{\alpha_{2S+6}})^{S+2} & -e^{\alpha_{2S+5}}(e^{\alpha_{2S+5}})^{S+2} \\ -(e^{\alpha_{2S+6}})^{S+2} & (e^{\alpha_{2S+5}})^{S+2} \end{pmatrix} \begin{pmatrix} (e^{\alpha_{2S+5}})^{\delta} \left( \frac{e^{\alpha_{2S+5}}(1 + e^{\alpha_{2S+5}})}{1 + e^{\alpha_{2S+5}+\gamma}} \right)^{k} \\ (e^{\alpha_{2S+6}})^{\delta} \left( \frac{e^{\alpha_{2S+6}}(1 + e^{\alpha_{2S+6}})}{1 + e^{\alpha_{2S+6}+\gamma}} \right)^{k} \end{pmatrix}$$

$$= e^{\delta\alpha_{2S+6}} e^{(S+2)\alpha_{2S+5}} (e^{\alpha_g} - e^{\alpha_{2S+5}}) \left( \frac{e^{\alpha_{2S+6}}(e^{\alpha_{2S+6}} + 1)}{e^{\gamma+\alpha_{2S+6}} + 1} \right)^{k} - e^{\delta\alpha_{2S+5}} e^{(S+2)\alpha_{2S+6}} (e^{\alpha_g} - e^{\alpha_{2S+6}}) \left( \frac{e^{\alpha_{2S+5}}(e^{\alpha_{2S+5}} + 1)}{e^{\gamma+\alpha_{2S+5}} + 1} \right)^{k}.$$

Note that the expression for $\tilde{Q}_{g,h}$ cannot be rewritten as an expression that is divisible by $e^{\alpha_{2S+6}} - e^{\alpha_{2S+5}}$, and hence that the expressions for all elements of $Q$ are ratios with the factor $e^{\alpha_{2S+6}} - e^{\alpha_{2S+5}}$ in the denominator.

We conclude that $\det(M)$ can be written as the sum of $\det(D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)})$ and one other term, which collects all remaining terms in the aforementioned expansion of $\det(M)$. This other term is a ratio with the factor $e^{\alpha_{2S+6}} - e^{\alpha_{2S+5}}$ raised to some positive power appearing in the denominator (as a common factor). The same factor also appears in the numerator, but raised to lower positive powers than its power in the denominator, so that its presence in the numerator does not completely cancel out this factor in the denominator. However, none of the elements of $D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)}$ depend on $e^{\alpha_{2S+5}}$ or $e^{\alpha_{2S+6}}$. It follows that $\det(M) \neq 0$ a.s. and that $\tilde{P} = \tilde{P}_{2(S+3)}$ is invertible a.s. (as we have already seen that $\det(F) \neq 0$), i.e., $rank(\tilde{P}_{2(S+3)}) = 2(S+3)$ a.s.

Another way of seeing this is that $\det(M)$ can be expressed as a ratio with a numerator that is a polynomial in $e^{\alpha_g}$ for $g = 1,2,\dots,2(S+2)$, in $e^{\gamma}$ and, unless the second term ("the other term") in the aforementioned sum of two terms is zero (in which case $\det(M) = \det(D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)}) \neq 0$ a.s.), also in $e^{\alpha_{2S+5}}$ and $e^{\alpha_{2S+6}}$. Hence $\det(M) = 0$ if and only if this numerator equals zero. Given values of $e^{\alpha_g}$ for $g = 1,2,\dots,2(S+2)$ and $e^{\gamma}$, the numerator is a polynomial in $e^{\alpha_{2S+5}}$ and $e^{\alpha_{2S+6}}$ with a finite number of roots. As the values of $\alpha_g$, $g = 1,2,\dots,2(S+3)$, and $\gamma \neq 0$ can be assumed to be randomly drawn from some continuous distribution(s), the probability that the values of $e^{\alpha_{2S+5}}$ and $e^{\alpha_{2S+6}}$ coincide with these roots is negligible. It follows that $\Pr(\det(M) \neq 0) = 1$ and hence that $\Pr(\det(\tilde{P}_{2(S+3)}) \neq 0) = 1$.

The arguments generalize to the case where $\beta \neq 0$. Q.E.D.
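The statement of Lemma 1 for $\beta = 0$ can be illustrated for small $T$ by building $\tilde{P}_{2T}$ from its two defining formulas and checking that its determinant is nonzero. This is only a sketch under stated assumptions: the function names and the values $e^{\alpha_g} = g$ are hypothetical choices for illustration, and a nonzero determinant at one parameter point illustrates, but does not prove, the almost-sure claim.

```python
from fractions import Fraction

def exact_det(matrix):
    """Determinant of a square matrix of Fractions via Gaussian elimination."""
    m = [list(r) for r in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * z for x, z in zip(m[r], m[col])]
    return det

def lemma1_matrix(T, c):
    """The 2T x 2T matrix of Lemma 1 for beta = 0, with c = exp(gamma)."""
    rows = []
    for g in range(1, 2 * T + 1):
        a = Fraction(g)                     # assumed value of exp(alpha_g)
        r = a * (1 + a) / (1 + a * c)       # common ratio in both column formulas
        row = []
        for k in range(T):
            row.append(r ** k)              # column h = 2k + 1 (first k entries are 1)
            row.append(a * r ** k)          # column h = 2(k + 1) (last k + 1 entries are 1)
        rows.append(row)
    return rows
```

For $e^{\gamma} = 2$ the determinant is nonzero for $T = 2$ and $T = 3$, whereas for $e^{\gamma} = 1$ (i.e., $\gamma = 0$) two columns coincide and the determinant is zero, which is why the Lemma requires $\gamma \neq 0$.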

An alternative proof of the claim that $rk(\bar{P}) = 2T$ for the panel logit AR(1) model without covariates (i.e., with $\beta = 0$):

Consider the $2T \times 2^T$ matrix $\ddot{P}$ with typical element

$$\ddot{P}_{g,h} = (1 + P_{g,1}e^{\gamma})^{T-1} P_{g,1}^{y_S} \times \prod_{t=1}^{T-1} \left( \frac{1 + P_{g,1}}{1 + P_{g,1}e^{\gamma}} \right)^{y_t}$$

for $g \in \{1,2,\dots,2T\}$ and $h = 1 + 2^0 y_1 + 2^1 y_2 + \dots + 2^{T-1} y_T$. Note that $\ddot{P} = \ddot{D}\breve{P}$ for some nonsingular diagonal matrix $\ddot{D} = \ddot{D}(\gamma,\alpha)$ and that the columns of $\ddot{P}$ correspond to different polynomials in $P_{g,1}$ up to order $2T - 1$, with all intermediate powers occurring somewhere inside $\ddot{P}$. It follows that $rk(\bar{P}) = rk(\ddot{P})$ is equal to the rank of a matrix that consists of linear combinations of the columns of a Vandermonde matrix that is based on powers of $P_{g,1}$ and has rank $2T$. Hence $rk(\bar{P}) = rk(\ddot{P}) \leq 2T$. To prove that $rk(\bar{P}) = rk(\ddot{P}) = 2T$, it suffices to show that $rk(\ddot{P}) \geq 2T$. This can be done by selecting the same $2T$ columns of $\ddot{P}$ as those of $\breve{P}$ that underlie the definition of the matrix $\tilde{P}$ that is used in the proof of Lemma 1. Of course, it follows from Lemma 1, $rk(\ddot{D}) = 2T$ and $\ddot{P} = \ddot{D}\breve{P}$ that $rk(\ddot{P}) \geq 2T$.

Footnote (to the proof of Lemma 1): We have not investigated whether the second term (expression) in the sum for $\det(M)$ is zero. If this were the case, we would have $\det(M) = \det(D^{S+2}_{2(S+2)} \tilde{P}_{2(S+2)}) \neq 0$ a.s., i.e., $\det(M) \neq 0$ a.s., which is what we want to show.

Analysis for the panel logit AR(2) model:

Proof strategy for the claim that $rk(\bar{P}) = 3T - 2$ for the panel logit AR(2) model without covariates (i.e., with $\beta = 0$):

When $y_0 = 1$, we consider $\ddot{P}$ with typical element

$$\ddot{P}_{g,h} = (1 + P_{g,1}e^{\gamma_1})^{\lfloor 0.5T - 0.5 \rfloor} (1 + P_{g,1}e^{\gamma_2})^{\lfloor 0.5T - 1 \rfloor} (1 + P_{g,1}e^{\gamma_1+\gamma_2})^{T-1} P_{g,1}^{y_S} \left( \left( \frac{1 + P_{g,1}}{1 + P_{g,1}e^{\gamma_1}} \right)^{1-y_{T-2}} \left( \frac{1 + P_{g,1}e^{\gamma_2}}{1 + P_{g,1}e^{\gamma_1+\gamma_2}} \right)^{y_{T-2}} \right)^{y_{T-1}} \times \prod_{t=2}^{T-1} \left( \left( \frac{(1 + P_{g,1})^2}{(1 + P_{g,1}e^{\gamma_1})(1 + P_{g,1}e^{\gamma_2})} \right)^{1-y_{t-2}} \left( \frac{1 + P_{g,1}}{1 + P_{g,1}e^{\gamma_1+\gamma_2}} \right)^{y_{t-2}} \right)^{y_{t-1}}.$$

Note that $\ddot{P} = \ddot{D}\breve{P}$ for some nonsingular diagonal matrix $\ddot{D} = \ddot{D}(\gamma,\alpha)$ and that the columns of $\ddot{P}$ correspond to different polynomials in $P_{g,1}$ up to order $3(T-1)$, with all intermediate powers occurring somewhere inside $\ddot{P}$. It follows that $rk(\bar{P}) = rk(\ddot{P})$ is equal to the rank of a matrix that consists of linear combinations of the columns of a Vandermonde matrix that is based on powers of $P_{g,1}$ and has rank $3T - 2$. Hence $rk(\bar{P}) = rk(\ddot{P}) \leq 3T - 2$. To prove that $rk(\bar{P}) = rk(\ddot{P}) = 3T - 2$, it suffices to show that $rk(\ddot{P}) \geq 3T - 2$. This can be done by selecting $3T - 2$ columns of $\ddot{P}$ and showing that they are linearly independent, similarly to the proof of Lemma 1.

When $y_0 = 0$, we consider $\ddot{P}$ with typical element

$$\ddot{P}_{g,h} = (1 + P_{g,1}e^{\gamma_1})^{\lfloor 0.5T \rfloor} (1 + P_{g,1}e^{\gamma_2})^{\lfloor 0.5T - 0.5 \rfloor} (1 + P_{g,1}e^{\gamma_1+\gamma_2})^{T-2} P_{g,1}^{y_S} \left( \left( \frac{1 + P_{g,1}}{1 + P_{g,1}e^{\gamma_1}} \right)^{1-y_{T-2}} \left( \frac{1 + P_{g,1}e^{\gamma_2}}{1 + P_{g,1}e^{\gamma_1+\gamma_2}} \right)^{y_{T-2}} \right)^{y_{T-1}} \times \prod_{t=2}^{T-1} \left( \left( \frac{(1 + P_{g,1})^2}{(1 + P_{g,1}e^{\gamma_1})(1 + P_{g,1}e^{\gamma_2})} \right)^{1-y_{t-2}} \left( \frac{1 + P_{g,1}}{1 + P_{g,1}e^{\gamma_1+\gamma_2}} \right)^{y_{t-2}} \right)^{y_{t-1}}.$$

The same argument applies: $\ddot{P} = \ddot{D}\breve{P}$ for some nonsingular diagonal matrix $\ddot{D} = \ddot{D}(\gamma,\alpha)$, the columns of $\ddot{P}$ correspond to different polynomials in $P_{g,1}$ up to order $3(T-1)$ with all intermediate powers occurring somewhere inside $\ddot{P}$, and hence $rk(\bar{P}) = rk(\ddot{P}) \leq 3T - 2$. To prove that $rk(\bar{P}) = rk(\ddot{P}) = 3T - 2$, it again suffices to show that $rk(\ddot{P}) \geq 3T - 2$, which can be done by selecting $3T - 2$ columns of $\ddot{P}$ and showing that they are linearly independent, similarly to the proof of Lemma 1.

Proof of the second conjecture of H&W (2020) for $p = 2$ and $T \in \{3,4,5\}$:

We have followed the proof strategy above to show that when $p = 2$ and $\beta = 0$, then $rk(\bar{P}) = 3T - 2$ for $T \in \{3,4,5\}$ and any $y_0 \in \{0,1\}$. In particular, we have used Mathematica to verify that when $p = 2$ and $\beta = 0$, then $rk(\ddot{P}) = 3T - 2$ for $T \in \{3,4,5\}$ and any $y_0 \in \{0,1\}$.

We note that when $p = 2$, $T = 3$ and $x_2 = x_3$, there is (at least) one extra moment function relative to the number of linearly independent moment functions for the general model (given the value of $y^{(0)}$), cf. H&W (2020), who found one extra moment function for this case. By analogy, when $p = 2$, $T = 4$ and $x_2 = x_3 = x_4$, there will be (at least) two extra moment functions relative to the number of linearly independent moment functions for the general model (given the value of $y^{(0)}$); and when $p = 2$, $T = 5$ and $x_2 = x_3 = x_4 = x_5$, there will be (at least) three extra moment functions relative to the number of linearly independent moment functions for the general model (given the value of $y^{(0)}$).

H&W (2020) also found $l = 2^T - 4(T-1)$ linearly independent moment functions for the general model (given the value of $y^{(0)}$) when $p = 2$ and $T \in \{3,4,5\}$, so there are at least $l$ of them in these cases. Hence the number of linearly independent "general" and "special" moment functions is at least $2^T - 4(T-1) + (T-2) = 2^T - (3T-2)$. However, this number cannot be larger than the number of linearly independent moment functions for the model without covariates (i.e., with $\beta = 0$), which is $2^T - (3T-2)$. We conclude that when $p = 2$ and $T \in \{3,4,5\}$, there are $2^T - (3T-2) - (T-2) = 2^T - 4(T-1) = l$ linearly independent moment functions for the general model (given the value of $y^{(0)}$).
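The rank computation that the text reports doing in Mathematica can be mimicked for the AR(2) model without covariates. The sketch below is an illustrative assumption rather than a reproduction of that computation: it uses exact rational arithmetic, hypothetical helper names, $e^{\alpha_g} = g$ for $g = 1, \dots, 2^T$, and works with the matrix of outcome probabilities, which has the same rank as $\ddot{P}$ because the two differ only by nonsingular diagonal scalings.

```python
from fractions import Fraction
from itertools import product

def exact_rank(rows):
    """Rank of a matrix of Fractions via Gaussian elimination (no rounding error)."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            f = m[r][col] / m[rank][col]
            if f != 0:
                m[r] = [x - f * z for x, z in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def ar2_prob(y, y0, ym1, a, c1, c2):
    """Pr(Y = y | y0, y_{-1}, alpha) for beta = 0; a = exp(alpha), c_l = exp(gamma_l)."""
    p, lag1, lag2 = Fraction(1), y0, ym1
    for yt in y:
        z = a * (c1 if lag1 else 1) * (c2 if lag2 else 1)
        p *= (z if yt else 1) / (1 + z)   # logit probability of y_t
        lag1, lag2 = yt, lag1             # shift the two lags forward
    return p

def ar2_rank(T, c1, c2, y0=0, ym1=0):
    """Rank of the 2^T x 2^T matrix of outcome probabilities (rows: alpha_g, columns: y)."""
    a_values = [Fraction(g) for g in range(1, 2 ** T + 1)]  # assumed values of exp(alpha_g)
    outcomes = list(product((0, 1), repeat=T))
    return exact_rank([[ar2_prob(y, y0, ym1, a, c1, c2) for y in outcomes]
                       for a in a_values])
```

With $e^{\gamma_1} = 2$ and $e^{\gamma_2} = 3$, `ar2_rank(3, Fraction(2), Fraction(3))` evaluates to 7, which equals $3T - 2$ and leaves $2^3 - 7 = 1$ moment function for the model without covariates. The same function can be called with $T = 4$ or $T = 5$ and with `y0=1` to examine the other cases discussed above.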