Precise effective masses from density functional perturbation theory
Jonathan Laflamme Janssen, Yannick Gillet, Samuel Poncé, Alexandre Martin, Marc Torrent, Xavier Gonze
PPrecise effective masses from density functional perturbation theory
J. Laflamme Janssen, Y. Gillet, S. Ponc´e, A. Martin, M. Torrent, and X. Gonze European Theoretical Spectroscopy Facility and Institute of Condensed Matter and Nanosciences,Universit´e catholique de Louvain, Chemin des ´etoiles 8,bte L07.03.01, B-1348 Louvain-la-neuve, Belgium. ∗ CEA, DAM, DIF. F-91297 Arpajon, France (Dated: August 22, 2017)The knowledge of effective masses is a key ingredient to analyze numerous properties of semi-conductors, like carrier mobilities, (magneto-)transport properties, or band extrema characteristicsyielding carrier densities and density of states. Currently, these masses are usually calculated usingfinite-difference estimation of density functional theory (DFT) electronic band curvatures. How-ever, finite differences require an additional convergence study and are prone to numerical noise.Moreover, the concept of effective mass breaks down at degenerate band extrema. We assess theformer limitation by developing a method that allows to obtain the Hessian of DFT bands directly,using density functional perturbation theory (DFPT). Then, we solve the latter issue by adaptingthe concept of ‘transport equivalent effective mass’ to the k · ˆ p framework. The numerical noiseinherent to finite-difference methods is thus eliminated, along with the associated convergence study.The resulting method is therefore more general, more robust and simpler to use, which makes itespecially appropriate for high-throughput computing. After validating the developed techniques,we apply them to the study of silicon, graphane, and arsenic. The formalism is implemented intothe ABINIT software and supports the norm-conserving pseudopotential approach, the projectoraugmented-wave method, and the inclusion of spin-orbit coupling. The derived expressions alsoapply to the ultrasoft pseudopotential method. PACS numbers: 72.20.Fr,71.18.+y
I. INTRODUCTION
Accurate and precise ab initio effective masses are de-sirable for the description of, e.g., transport properties,optoelectronic properties, cyclotron frequencies and Lan-dau levels . In particular, this topic has recently seenrenewed interest motivated by optoelectronics andthermoelectric applications .Effective masses enter the above quantities as a de-scription of the band dispersion around an extrema.They appear within the context of the k · ˆ p theory, whichis a second-order perturbation expansion of eigenenergieswith respect to the electronic wave vector k , assuming anHamiltonian with local potential V ( r , r (cid:48) ) = V ( r − r (cid:48) ). Inthis framework, they are defined as the inverse second-order expansion coefficient. Thus, for non-degeneratebands, the effective mass are inverse curvatures of bandsin one dimension and inverse Hessian of bands in threedimensions.In the first-principle context, these second-orderderivatives of eigenenergies are usually obtained throughfinite-difference calculations or integrations overthe Brillouin zone of DFT results. Such calculationsrequire a convergence on the finite-difference parameter(or the k -point grid density) and are prone to numeri-cal noise. These extra convergences lead to additionalwork and possible precision issues. Moreover, since DFTeigenvalues show limited agreement with experimental ∗ Corresponding author: lafl[email protected] eigenenergies (see e.g. § , the precision issues have yet to beaddressed. Thus, within this work, we focus on the lat-ter problem, i.e. the calculation of precise DFT effectivemasses.We note that circumventing the use of finite differencesis already possible using Wannier functions . However,the issues of additional work and precision remain tosome extent. Indeed, the Wannier function optimizationprocedure can get stuck in a local minima and, to pre-vent this, the user needs to choose the starting functionswith some care. Thus, a method avoiding any such taskand the associated precision issues remains desirable.Another difficulty in the calculation of effective massesis the treatment of degeneracies. Indeed, subtleties arisewhen one considers the k · ˆ p framework in the context ofdegenerate perturbation theory . In one dimension,the perturbation coefficients become matrices (deriva-tives of Hamiltonian matrix elements within the degen-erate subspace) instead of scalars. Still, obtaining theeffective masses remains a simple matter of diagonaliz-ing the second-order matrix and attributing the inverseeigenvalues to the effective masses. However, in three di-mensions, the second-order expansion coefficient becomesa matrix of tensors (i.e. the Hessian of the Hamiltonianin the degenerate subspace), as first noted by Luttingerand Kohn . Since it is not possible to diagonalize such amatrix of tensors, it would appear that simple individualquantities cannot be attributed to degenerate bands fordescribing their dispersion at second order. a r X i v : . [ c ond - m a t . m t r l - s c i ] A ug Dresselhaus, Kip, and Kittel assessed this issue forthe top valence band of crystals with diamond structure.Starting with symmetry group arguments to justify theresults of Luttinger and Kohn , they added spin-orbitcoupling and obtained a simple formula that describesdegenerate band dispersions individually within the k · ˆ p framework . While it successfully describes the com-plex directional dependence of the band curvature forthese degenerate bands, it is limited to electronic statesbelonging to a specific symmetry group and to latticeswith cubic symmetry. Moreover, the determination ofe.g. transport properties from these parameters is lessconventional than from effective mass tensors. A moregeneral and convenient formalism would thus be wel-come.Mecholsky and coworkers recently proposed such aformalism for the specialized case of transport tensor cal-culations. They first justify rigorously the breakdownof the concept of effective mass tensor for degeneratebands and argue that degenerate band curvatures shouldinstead be described as a function of spherical angles f ( θ, φ ). Then, they derive the relation between f ( θ, φ )and transport tensors, assuming a parabolic band ex-tremum. Finally, they define a ‘transport equivalent ef-fective mass’ tensor that generates the same contribu-tion to transport tensors than f ( θ, φ ). This tensor hasthe benefit of being well defined for parabolic band ex-trema in any material and being straightforward to usefor transport properties calculations; however it does notdescribe band dispersions anymore.In this work, we first eliminate the necessity of car-rying out finite-difference differentiation of eigenenergiesto obtain effective masses. To do so, we derive analyt-ical expressions for the second-order derivative of non-degenerate eigenenergies within the density functionalperturbation theory (DFPT) framework , which hasalready been used successfully to compute various deriva-tives with respect to the electronic wave vector k in thepast . More specifically, in Section II, we first derivethe relation between the effective mass tensor and thederivatives of the Hamiltonian. Then, we differentiatethe relevant contributions of the Hamiltonian expressedin a plane wave basis set. We start with the modifiedkinetic energy used to smooth the total energy depen-dence on primitive cell size . Then, we proceed tothe non-local potential involved in the norm-conservingpseudopotential (NCPP) , projector augmented-wave(PAW) , and ultrasoft pseudopotential (USPP) meth-ods. Finally, we consider the spin-orbit coupling contri-bution at the end of the Section.Also, since the present work targets particularly thehigh-throughput design of materials with optimizedtransport properties , we adapt the ‘transport equiva-lent effective mass’ formalism of Mecholsky and cowork-ers to the DFPT context for the description of degener-ate bands in Section III. To do so, we first generalize theformalism of Luttinger and Kohn for the description ofdegenerate states in the k · ˆ p framework to the NCPP and PAW contexts. Then, we bridge the gap betweenthis generalized formalism and the ‘transport equivalenteffective mass’ formalism.We validate in Section IV our implementation withinthe ABINIT software by comparison with finite-difference calculations. Then, we apply it to a semi-conductor (silicon), a 2D material (graphane) and asemimetal ( α -arsenic).Atomic units are used throughout. II. THE NON-DEGENERATE CASEA. Effective mass in the DFPT framework
Within density functional theory (DFT), Schr¨odinger’sequation for periodic systems isˆ H | ψ n k (cid:105) = (cid:18) ˆ p V (cid:19) | ψ n k (cid:105) = ε n k | ψ n k (cid:105) , (1)where ˆ H is the Hamiltonian, | ψ n k (cid:105) and ε n k are its eigen-states and eigenenergies, n is the band index, k is Bloch’swave vector, ˆ p = − i ˆ ∇ is the momentum operator and ˆ V is a local potential (cid:104) r | ˆ V | r (cid:48) (cid:105) = V ( r ) δ ( r , r (cid:48) ) . (2)Using Bloch’s theorem, the wavefunction can be ex-pressed as the product of a crystal periodic function | u n k (cid:105) and a phase | ψ n k (cid:105) = e i k · ˆ r | u n k (cid:105) . (3)Eq. (1) then becomesˆ H k | u n k (cid:105) = (cid:18) k + 2 k · ˆ p + ˆ p V (cid:19) | u n k (cid:105) = ε n k | u n k (cid:105) , (4)where we have defined k -dependent operators ˆ O k asˆ O k = (cid:52) e − i k · ˆ r ˆ Oe i k · ˆ r , (5)and where the eigenstates are orthonormalized (cid:104) u n k | u n (cid:48) k (cid:105) = δ nn (cid:48) . (6)We now consider the situation where band n is non-degenerate at k and where | u n k (cid:105) and ε n k are known. TheTaylor expansion of the band dispersion around k yields ε n k + δ k = ε n k + (cid:88) α ε αn k δk α + 12 (cid:88) αβ δk α ε αβn k δk β + O ( δk ) , (7)where Greek letters α, β, γ, ... stand for Cartesian direc-tions { x, y, z } and where derivatives of any quantity X with respect to a Cartesian component of the wave vector k are noted X α = (cid:52) ∂X∂k α ; X αβ = (cid:52) ∂ X∂k α ∂k β . (8)Within perturbation theory, we wish to obtain deriva-tives of observables from derivatives of the Hamiltonian.Thus, we first project Eq. (4) on (cid:104) u n k | ε n k = (cid:104) u n k | ˆ H k | u n k (cid:105) , (9)and then differentiate with respect to k α , which yieldsthe first derivative appearing in Eq. (7) ε αn k = (cid:104) u n k | ˆ H α k | u n k (cid:105) + (cid:16) (cid:104) u αn k | ˆ H k | u n k (cid:105) + c . c . (cid:17) , = (cid:104) u n k | ˆ H α k | u n k (cid:105) + ε n k ( (cid:104) u αn k | u n k (cid:105) + c . c . ) , = (cid:104) u n k | ˆ H α k | u n k (cid:105) . (10)The last two relations have been obtained in the spirit ofthe Hellmann-Feynman force theorem , using respec-tively Eq. (4) and the derivative of the normalizationcondition (Eq. (6))( (cid:104) u αn k | u n k (cid:105) + c . c . ) = 0 , (11)where c . c . stands for the complex conjugate of the previ-ous term. The second-order derivative thus reads ε αβn k = (cid:104) u n k | ˆ H αβ k | u n k (cid:105) + (cid:16) (cid:104) u βn k | ˆ H α k | u n k (cid:105) + c . c . (cid:17) . (12)Defining the following complementary projectorsˆ P k = (cid:52) | u n k (cid:105) (cid:104) u n k | ; ˆ Q k = (cid:52) − ˆ P k , (13)we realize that | ˆ P k u βn k (cid:105) does not contribute to ε αβn k ε αβn k = (cid:104) u n k | ˆ H αβ k | u n k (cid:105) + (cid:16) (cid:104) ˆ Q k u βn k | ˆ H α k | u n k (cid:105) + c . c . (cid:17) + (cid:16) (cid:104) u βn k | u n k (cid:105) (cid:104) u n k | ˆ H α k | u n k (cid:105) + c . c . (cid:17) = (cid:104) u n k | ˆ H αβ k | u n k (cid:105) + (cid:16) (cid:104) ˆ Q k u βn k | ˆ H α k | u n k (cid:105) + c . c . (cid:17) , (14)since ( (cid:104) u βn k | u n k (cid:105) (cid:104) u n k | ˆ H α k | u n k (cid:105) +c . c . ) = ε αn k ( (cid:104) u βn k | u n k (cid:105) +c . c . ) = 0, following Eqs. (10) and (11).Now, | u αn k (cid:105) still has to be expressed in terms of deriva-tives of the Hamiltonian. To do so, we differentiateEq. (4) with respect to k ˆ H α k | u n k (cid:105) + ˆ H k | u αn k (cid:105) = ε αn k | u n k (cid:105) + ε n k | u αn k (cid:105) , ⇒ (cid:16) ˆ H k − ε n k (cid:17) | u αn k (cid:105) = − (cid:16) ˆ H α k − ε αn k (cid:17) | u n k (cid:105) , (15)and apply ˆ Q k to the left (cid:16) ˆ H k − ε n k (cid:17) | ˆ Q k u αn k (cid:105) = − ˆ Q k ˆ H α k | u n k (cid:105) , (16)which allows to deduce | ˆ Q k u αn k (cid:105) from ˆ H α k and unper-turbed quantities.Directly solving Eq. (16) using linear algebra tech-niques such as conjugate gradients can be unstable be-cause the left-hand side operator ˆ H k − ε n k is not, in gen-eral, positive definite . It is more practical to invert the operator ˆ H k − ε n k in Eq. (16) then use Eqs. (4) and (13)to obtain the usual sum-over-state expression | ˆ Q k u αn k (cid:105) = (cid:88) n (cid:48) (cid:54) = n | u n (cid:48) k (cid:105) (cid:104) u n (cid:48) k | ˆ H α k | u n k (cid:105) ε n k − ε n (cid:48) k . (17)Substituting Eq. (17) in Eq. (14) gives ε αβn k = (cid:104) u n k | ˆ H αβ k | u n k (cid:105) + (cid:18) (cid:88) n (cid:48) (cid:54) = n (cid:104) u n k | ˆ H β k | u n (cid:48) k (cid:105) (cid:104) u n (cid:48) k | ˆ H α k | u n k (cid:105) ε n k − ε n (cid:48) k + c . c . (cid:19) . (18)While the above expression is numerically easier tohandle than Eq. (16), it has the downside being muchless efficient. Indeed, Eq. (17) exhibits a notoriouslyslow convergence with the number of states included inthe summation . However, it is possible to combine thetechnical ease of Eq. (17) with the efficiency of Eq. (16)by using the former to obtain the contribution of the ac-tive space (i.e. bands up to the highest one N for whicheigenenergies derivatives are desired) to | ˆ Q k u αn k (cid:105) and thelatter to obtain the contribution of the complementarysubspace (band index above N ) to | ˆ Q k u αn k (cid:105) . Indeed,this strategy guarantees the left-hand side operator ofEq. (16) to be positive definite (thus allowing the use of,e.g., conjugated gradients), while minimizing the numberof bands treated using Eq. (17).Defining the projectorˆ Q N k = (cid:52) (cid:88) n (cid:48) >N | u n (cid:48) k (cid:105) (cid:104) u n (cid:48) k | , (19)we can write the second-order eigenenergies in the formdescribed above ε αβn k = (cid:104) u n k | ˆ H αβ k | u n k (cid:105) + (cid:16) (cid:104) ˆ Q N k u βn k | ˆ H α k | u n k (cid:105) + c . c . (cid:17) + (cid:18) N (cid:88) n (cid:48) (cid:54) = n (cid:104) u n k | ˆ H β k | u n (cid:48) k (cid:105) (cid:104) u n (cid:48) k | ˆ H α k | u n k (cid:105) ε n k − ε n (cid:48) k + c . c . (cid:19) , (20)with ˆ Q N k | u αn k (cid:105) given by the projection of Eq. (15) onbands above N (cid:16) ˆ H k − ε n k (cid:17) | ˆ Q N k u αn k (cid:105) = − ˆ Q N k ˆ H α k | u n k (cid:105) . (21)Once ε αβn k is known, one can obtain the effective massfrom the usual expression [ M − n k ] αβ = (cid:52) ε αβn k , (22)for non-degenerate ε n k .In the case of an Hamiltonian with a local potential, asdescribed in Eqs. (2) and (4), the perturbed Hamiltonianreduces to ˆ H α k = ( k + ˆ p ) α ; ˆ H αβ k = δ αβ . (23)For practical reasons, reduced coordinates are oftenused internally by DFT codes instead of Cartesian coor-dinates. We therefore provide in Appendix A the relationbetween derivatives with respect to k in both coordinatesystems. B. Derivatives of kinetic energy operator withcutoff smearing
To avoid discontinuities in the total energy with re-spect to primitive cell size, one can modify the kineticenergy to ensure that the number of available degreesof freedom for energy minimization varies continuouslywith cell size (see Appendix B). In the ABINIT soft-ware, this kinetic energy reads (see Eqs. (B2) and (B3)) (cid:104) G | ˆ T k | G (cid:48) (cid:105) = 12 ( k + G ) δ GG (cid:48) p ( x ) , (24)where p ( x ) is a function that is one for most of the planewaves, but diverges when ( k + G ) becomes close tothe plane wave kinetic energy cut-off E c . Its accurateformulation is given in Appendix B. The quantity x isequal to x = (cid:52) E c − ( k + G ) E s , (25)where the parameter E s is the energy range around thecutoff energy E c where the occupations start to be forcedtowards 0, i.e. E s can be interpreted as a smearing of thecutoff energy.The present method implements the derivatives of thismodified kinetic energy (cid:104) G | ˆ T α k | G (cid:48) (cid:105) = (cid:18) p ( x ) −
12 ( k + G ) p (cid:48) ( x ) E s (cid:19) ( k + G ) α δ GG (cid:48) , (26) (cid:104) G | ˆ T αβ k | G (cid:48) (cid:105) = (cid:18) p ( x ) −
12 ( k + G ) p (cid:48) ( x ) E s (cid:19) δ αβ δ GG (cid:48) + (cid:18)
12 ( k + G ) p (cid:48)(cid:48) ( x ) E s − p (cid:48) ( x ) E s (cid:19) ( k + G ) α ( k + G ) β δ GG (cid:48) , (27)where p (cid:48) ( x ) stands for the derivative of p ( x ) with respectto x . Their reduced coordinate version can then be ob-tained by using the reverse of Eqs. (A5) and (A6). C. Pseudopotentials and derivatives of associatednon-local operators
The potential V ( r ) appearing in a DFT Hamiltonianinvolves the Coulomb potential generated by the nucleiof the simulated system. It therefore has sharp features,which are cumbersome to represent accurately with aplane wave basis set. Numerous methods have beendeveloped to alleviate this problem, among which twoare supported in the present implementation: the norm-conserving pseudopotential (NCPP) and projector-augmented wave (PAW) methods. We derive the rel-evant expressions in the PAW framework, since it general-izes the NCPP framework . We then obtain the NCPP expressions by carrying out the appropriate simplifica-tions.Since the relationship between all-electrons wavefunc-tions | ψ n k (cid:105) and pseudo wavefunctions | ˜ ψ n k (cid:105) (Eq. (C5))has the same form in the ultrasoft pseudopotential for-malism (USPP) and the PAW formalism, it re-sults that the present Section applies to both PAW andUSPP (see Ref. 28 for a more detailed discussion of this).This also allows us to build upon existing DFPT devel-opments within the USPP framework .We offer a short PAW reminder in Appendix C to putEqs. (28)-(33) below into context.In PAW, Eq. (4) becomes (see Eq. (C8))ˆ˜ H k | ˜ u n k (cid:105) = ε n k ˆ˜1 k | ˜ u n k (cid:105) , (28)where (see Eqs. (C10), (C11), and (C12))ˆ˜ H k = ˆ H k + ˆ D k = ˆ H k + (cid:88) R ij e − i k · ˆ r | ˜ p R i (cid:105) D R ij (cid:104) ˜ p R j | e i k · ˆ r , (29)ˆ˜1 k = 1 + (cid:88) R ij e − i k · ˆ r | ˜ p R i (cid:105) S R ij (cid:104) ˜ p R j | e i k · ˆ r , (30)and where (cid:104) ˜ p R i | , D R ij , and S R ij are defined in Eqs. (C3),(C13), and (C14), respectively.The relations required to carry out the differentiationof the non-local part of the Hamiltonian ˆ D k are (seeEqs. (C22), (C20), and (C19)) (cid:104) G | ˆ D k | G (cid:48) (cid:105) = (cid:88) R ij (cid:104) K | ¯ p R i (cid:105) D R ij (cid:104) ¯ p R j | K (cid:48) (cid:105) , (31) (cid:104) K | ¯ p R i (cid:105) = 4 πi l i Y l i m i ( ˆ K ) ˜ P R i ( K ) e − i G · R , (32)˜ P R i ( K ) = (cid:90) s c ds s ˜ P R i ( s ) j l i ( Ks ) , (33)where K , s , Y lm , j l ( ks ), s c , ˜ P R i ( s ), ˜ P R i ( K ), and (cid:104) K | ¯ p R i (cid:105) are defined in Appendix C .Since the D R ij have no dependence on k , the deriva-tives of (cid:104) K | ¯ p R i (cid:105) suffice to obtain those of (cid:104) G | ˆ D k | G (cid:48) (cid:105) .Thus, (cid:104) G | ˆ D α k | G (cid:48) (cid:105) = (cid:88) R ij (cid:104) K | ¯ p R i (cid:105) α D R ij (cid:104) ¯ p R j | K (cid:48) (cid:105) + (cid:104) K | ¯ p R i (cid:105) D R ij (cid:104) ¯ p R j | K (cid:48) (cid:105) α , (34) (cid:104) G | ˆ D αβ k | G (cid:48) (cid:105) = (cid:88) R ij (cid:104) K | ¯ p R i (cid:105) αβ D R ij (cid:104) ¯ p R j | K (cid:48) (cid:105) + (cid:104) K | ¯ p R i (cid:105) D R ij (cid:104) ¯ p R j | K (cid:48) (cid:105) αβ + (cid:16) (cid:104) K | ¯ p R i (cid:105) α D R ij (cid:104) ¯ p R j | K (cid:48) (cid:105) β + α ↔ β (cid:17) , (35)where α ↔ β stands for the transpose of the previousterm (with respect to α and β ), where (cid:104) K | ¯ p R i (cid:105) α =4 πi l i e − i G · R (cid:16) Y αl i m i ( ˆ K ) ˜ P R i ( K )+ Y l i m i ( ˆ K ) ˜ P α R i ( K ) (cid:17) , (36) (cid:104) K | ¯ p R i (cid:105) αβ = 4 πi l i e − i G · R (cid:16) Y αβl i m i ( ˆ K ) ˜ P R i ( K ) + Y l i m i ( ˆ K ) ˜ P αβ R i ( K )+ (cid:0) Y αl i m i ( ˆ K ) ˜ P β R i ( K ) + α ↔ β (cid:1)(cid:17) , (37)with Y αlm ( ˆ K ) ( Y αβlm ( ˆ K )) the Cartesian component α ( αβ )of the gradient (Hessian) of spherical harmonics Y lm , andwhere ˜ P α R i ( K ) = (cid:90) s c ds s ˜ P R i ( s ) j (cid:48) l i ( Ks ) s K α K , (38)˜ P αβ R i ( K ) = (cid:90) s c ds s ˜ P R i ( s ) (cid:18) j (cid:48)(cid:48) l i ( Ks ) s K α K K β K + j (cid:48) l i ( Ks ) K s (cid:16) δ αβ − K α K K β K (cid:17)(cid:19) , (39)with j (cid:48) l ( x ) and j (cid:48)(cid:48) l ( x ) the first and second derivativesof spherical Bessel functions with respect to their ar-gument x , respectively. The calculation of the deriva-tives of (cid:104) G | ˆ˜1 k | G (cid:48) (cid:105) is conceptually identical to that of (cid:104) G | ˆ D k | G (cid:48) (cid:105) and the final result is identical to Eqs. (34)and (35), with the substitution D R ij → S R ij . Also, toobtain the NCPP version of this section, one simply hasto substitute ˆ˜1 k → S R ij → H α k and ˆ˜1 α k (as per Eqs. (34), (36), (38) and theiranalogs for ˆ˜1 k ) in ABINIT was already done by C. Au-douze and co-workers . Therefore, we only had to im-plement the second-order quantities ˆ˜ H αβ k and ˆ˜1 αβ k (as perEqs. (35), (37), (39) and their analogs for ˆ˜1 k ) in the code. D. Spin-orbit coupling
A simple, approximate way to take into account spin-orbit coupling (SOC) within the PAW framework is tocalculate the coupling only in the PAW augmentation re-gions, which is what is done in the ABINIT software .The hypotheses underlying this approximation, and itspractical validity, have been discussed in Ref. 45. Notethat since it involves the PAW augmentation regions, itcannot be applied to the NCPP and USPP methods. Within this approximation, the PAW Hamiltonian ofEq. (29) becomesˆ˜ H k = ˆ H k + (cid:88) R ijσσ (cid:48) e − i k · ˆ r | ˜ p R iσ (cid:105) (cid:0) D R ij δ σσ (cid:48) + D SO R ijσσ (cid:48) (cid:1) (cid:104) ˜ p R jσ (cid:48) | e i k · ˆ r , (40)where σ denotes the spin component of the wavefunctionacted upon by the projector and where D SO R ijσσ (cid:48) are thematrix elements of the spin-orbit Hamiltonian betweenall electron partial waves | φ R iσ (cid:105) .We note that Eq. (40) has the same k -dependence thanthe non-local operators studied in Section II C. There-fore, since we already implemented the correspondingderivatives (Eqs. (34)-(39)) and since the ground-statespin-orbit Hamiltonian (i.e. D SO R ijσσ (cid:48) ) was already avail-able in the ABINIT software, we added SOC support toour effective mass implementation by introducing spinors | ˜ ψ n k (cid:105) → | ˜ ψ n k σ (cid:105) in Section II A-II C and substituting D R ij δ σσ (cid:48) → D R ij δ σσ (cid:48) + D SO R ijσσ (cid:48) . For NCPP, the equa-tions that allow to treat the SOC within DFPT have beenpresented in Ref. 46 and applied in the case of phonons.The present formalism might easily be generalized to thiscase, although this has not been part of the present work. III. THE DEGENERATE CASE ANDTRANSPORT EQUIVALENT EFFECTIVEMASSESA. Effective mass tensor and degeneracy
In degenerate perturbation theory, the corrections toobservables ˆ O at successive orders ( i ) with respect to avariable x are calculated O ( i ) nn (cid:48) k = ∂ i (cid:104) u n (cid:48) k | ˆ O k | u n k (cid:105) /∂x i until one finds an order at which the degenerescence islifted O ( i ) nn (cid:48) k (cid:54) = C ( i ) k δ nn (cid:48) , n and n (cid:48) labelling states withinthe degenerate subspace and C ( i ) k being a proportional-ity constant. Thus, Section II A applies if there is nodegenerescence at second order, i.e. if the degeneracy islifted at 0th or 1st order. We now consider the case wherethe degeneracy is maintained at 0th and 1st order (e.g. adegenerate band extrema), which is a case often encoun-tered in important technological materials, like the III-Vor II-VI semiconductors. Eq. (7) can then be generalizedto (see Eq. (IV.9) of Ref. 17) ε nn (cid:48) k + δ k = ε { d } k δ nn (cid:48) + (cid:88) α ε α { d } k δ nn (cid:48) δk α + 12 (cid:88) αβ δk α ε αβnn (cid:48) k δk β + O ( δk ); n, n (cid:48) ∈ { d } (41)where { d } represents the degenerate subspace and with ε n k = ε { d } k and ε αn k = ε α { d } k ∀ n ∈ { d } . In such cases, af-ter obtaining the ε αβnn (cid:48) k matrix within the degenerate sub-space, one must diagonalize it to find the relevant eigen-states, eigenenergies, and the associated effective masses.The first step is therefore to generalize ε αβn k to ε αβnn (cid:48) k .We will also consider in this subsection the more generalcase of PAW (where ˆ˜1 k is present, see Subsection II C),from which the norm-conserving expressions can be ob-tained by substituting ˆ˜1 k →
1. Moreover, as discussedat the beginning of Section II C, the equations derivedin this Section for PAW also apply to the USPP methodwithin the parallel gauge (see Refs. 24, 41, and 47 formore details on this gauge). This Section thus general-izes Eqs. (20) and (21) to both the degenerate case andthe PAW (USPP) formalism(s).We will systematically express ε { d } k as ε n k + ε n (cid:48) k and ε α { d } k as ε αn k + ε αn (cid:48) k in order to be explicitly symmetric in n, n (cid:48) and to facilitate the comparison with other PAWexpressions within the parallel gauge (see e.g. Eq. (78)of Ref. 41).We obtain analytically the derivatives of ε nn (cid:48) k fromthe derivatives of the Hamiltonian and overlap operator,starting from ε nn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H k | ˜ u n k (cid:105) (42)and differentiating with respect to k α ε αnn (cid:48) k = (cid:104) ˜ u αn (cid:48) k | ˆ˜ H k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H k | ˜ u αn k (cid:105) . (43)Using Eq. (28) and evaluating at the degenerate point k yields ε αnn (cid:48) k = ε n k + ε n (cid:48) k (cid:16) (cid:104) ˜ u αn (cid:48) k | ˆ˜1 k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜1 k | ˜ u αn k (cid:105) (cid:17) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k | ˜ u n k (cid:105) . (44)Using the derivative of the PAW version of the orthonor-malization condition (from Eqs. (C5), (C9), (3), and (5)) δ nn (cid:48) = (cid:104) ψ n (cid:48) k | ψ n k (cid:105) = (cid:104) ˜ u n (cid:48) k | ˆ˜1 k | ˜ u n k (cid:105) (45) ⇒ (cid:104) ˜ u αn (cid:48) k | ˆ˜1 k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜1 α k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜1 k | ˜ u αn k (cid:105) , (46)Eq. (44) becomes ε αnn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k α k | ˜ u n k (cid:105) . (47)We now proceed to the second-order derivative, start-ing from Eq. (43). Differentiating with respect to k β yields ε αβnn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H αβ k | ˜ u n k (cid:105) + (cid:16) (cid:104) ˜ u αβn (cid:48) k | ˆ˜ H k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H k | ˜ u αβn k (cid:105) (cid:17) + (cid:16) (cid:104) ˜ u αn (cid:48) k | ˆ˜ H β k | ˜ u n k (cid:105) + (cid:104) ˜ u αn (cid:48) k | ˆ˜ H k | ˜ u βn k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k | ˜ u βn k (cid:105) (cid:17) + α ↔ β. (48)Using Eq. (28) and the second-order derivative of theorthonormalization condition (Eq. (45)) (cid:104) ˜ u n (cid:48) k | ˆ˜1 αβ k | ˜ u n k (cid:105) + (cid:16) (cid:104) ˜ u αβn (cid:48) k | ˆ˜1 k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜1 k | ˜ u αβn k (cid:105) (cid:17) + (cid:16) (cid:104) ˜ u αn (cid:48) k | ˆ˜1 β k | ˜ u n k (cid:105) + (cid:104) ˜ u αn (cid:48) k | ˆ˜1 k | ˜ u βn k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜1 α k | ˜ u βn k (cid:105) (cid:17) + α ↔ β = 0 (49)as well as evaluating at the degenerate point k leads to ε αβnn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H αβ k − ε n k + ε n (cid:48) k αβ k | ˜ u n k (cid:105) + (cid:16) (cid:104) ˜ u αn (cid:48) k | ˆ˜ H β k − ε n k + ε n (cid:48) k β k | ˜ u n k (cid:105) + (cid:104) ˜ u αn (cid:48) k | ˆ˜ H k − ε n k + ε n (cid:48) k k | ˜ u βn k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k α k | ˜ u βn k (cid:105) (cid:17) + α ↔ β. (50)Since the analytical derivatives of ˆ˜ H k and ˆ˜1 k are avail-able (see Section II C), the only task left is to find | ˜ u αn k (cid:105) or, more precisely, to generalize Eqs. (20) and (21) to thedegenerate case and the PAW formalism. To handle thedegeneracy, we separate the components of | ˜ u αn k (cid:105) lying inthe degenerate subspace { d } from the rest | ˜ u αn k (cid:105) = ˆ˜ P k | ˜ u αn k (cid:105) + ˆ˜ Q k | ˜ u αn k (cid:105) , (51)with ˆ˜ P k = (cid:52) (cid:88) n (cid:48) ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜1 k , (52)ˆ˜ Q k = (cid:52) (cid:88) n/ ∈{ d } | ˜ u n k (cid:105) (cid:104) ˜ u n k | ˆ˜1 k . (53)Substituting Eq. (51) in Eq. (50) and carrying out somealgebra presented in Appendix D, we obtain (see Eq. (D4) ε αβnn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H αβ k − ε n k + ε n (cid:48) k αβ k | ˜ u n k (cid:105) + (cid:16) (cid:104) ˆ˜ Q k ˜ u αn (cid:48) k − δ ˜ u αn k | ˆ˜ H β k − ε n k + ε n (cid:48) k β k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k α k | ˆ˜ Q k ˜ u βn k − δ ˜ u βn k (cid:105) + (cid:104) ˆ˜ Q k ˜ u αn (cid:48) k − δ ˜ u αn k | ˆ˜ H k − ε n k + ε n (cid:48) k k | ˆ˜ Q k ˜ u βn k −
12 ˜ u βn k (cid:105) (cid:17) + α ↔ β, (54)where we have defined (see Eq. (D3)) | δ ˜ u αn k (cid:105) = (cid:52) (cid:88) n (cid:48) ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜1 α k | ˜ u n k (cid:105) . (55)We are now left with the task of finding | ˆ˜ Q k ˜ u αn k (cid:105) withinPAW. To do so, we first differentiate Eq. (28)ˆ˜ H α k | ˜ u n k (cid:105) + ˆ˜ H k | ˜ u αn k (cid:105) = ε αn k ˆ˜1 k | ˜ u n k (cid:105) + ε n k ˆ˜1 α k | ˜ u n k (cid:105) + ε n k ˆ˜1 k | ˜ u αn k (cid:105) . (56)Re-arranging the terms and applying (cid:80) n (cid:48) / ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | on both sides yields (cid:88) n (cid:48) / ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜ H k − ε n k ˆ˜1 k | ˜ u αn k (cid:105) = − (cid:88) n (cid:48) / ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k ˆ˜1 α k | ˜ u n k (cid:105) , (57)where we have used the orthonormality condition ofEq. (45). Finally, using Eq. (28), dividing by ε n (cid:48) k − ε n k ,and using Eq. (52) yieldsˆ˜ Q k | ˜ u αn k (cid:105) = − (cid:88) n (cid:48) / ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k ˆ˜1 α k | ˜ u n k (cid:105) ε n (cid:48) k − ε n k . (58)For reasons that have already been mentioned in theparagraph after Eq. (18), we calculate the contributionto ˆ˜ Q k | ˜ u αn k (cid:105) stemming from the subspace generated bythe bands explicitly treated in the calculation (the ac-tive subspace n ≤ N ) using a sum-over-states approachwhile we solve a linear equation to obtain the contribu-tion stemming from the complementary subspaceˆ˜ Q N k = (cid:88) n>N | ˜ u n k (cid:105) (cid:104) ˜ u n k | ˆ˜1 k . (59) We thus split ˆ˜ Q k | ˜ u αn k (cid:105) into these two contributions andadd the − | δ ˜ u αn k (cid:105) contribution required to calculateEq. (54). Also, since it is more convenient to calculateand store the symmetrized matrix elements of ˆ˜ H α k − ε n k ˆ˜1 α k (i.e. (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k ˆ˜1 α k | ˜ u n k (cid:105) ), we re-express the partof Eq. (58) that stems from the active subspace ( n ≤ N )so that it uses these symmetric matrix elements. We ob-tainˆ˜ Q k | ˜ u αn k (cid:105) − | δ ˜ u αn k (cid:105) = − N (cid:88) n (cid:48) / ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k ˆ˜1 α k | ˜ u n k (cid:105) ε n (cid:48) k − ε n k + | ˆ˜ Q N k ˜ u αn k (cid:105) − | δ ˜ u αNn k (cid:105) , (60)where we have defined | δ ˜ u αNn k (cid:105) = (cid:52) N (cid:88) n (cid:48) | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜1 α k | ˜ u n k (cid:105) (61)(following the notation introduced in Eq. (42) of Ref. 41,with an additional N index) and | ˆ˜ Q N k ˜ u αn k (cid:105) = − (cid:88) n (cid:48) >N | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k ˆ˜1 α k | ˜ u n k (cid:105) ε n (cid:48) k − ε n k . (62)To obtain the last contribution from a linear equationproblem, we start from Eq. (62), use Eq. (59), multiply by ε n (cid:48) k − ε n k , then use Eq. (28) and the conjugate transposeof Eq. (59)ˆ˜ Q † N k (cid:16) ˆ˜ H k − ε n k ˆ˜1 k (cid:17) | ˆ˜ Q N k ˜ u αn k (cid:105) = − ˆ˜ Q † N k (cid:16) ˆ˜ H α k − ε n k ˆ˜1 α k (cid:17) | ˜ u n k (cid:105) . (63)Now, substituting Eq. (60) into Eq. (54) and simplify-ing using Eqs. (59), (61), (28), and (45) yields ε αβnn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H αβ k − ε n k + ε n (cid:48) k αβ k | ˜ u n k (cid:105) + (cid:16) (cid:104) ˆ˜ Q N k ˜ u αn (cid:48) k + δ ˜ u αNn (cid:48) k | ˆ˜ H k − ε n k + ε n (cid:48) k k | ˆ˜ Q N k ˜ u βn k + δ ˜ u βNn k (cid:105) + (cid:104) ˆ˜ Q N k ˜ u αn (cid:48) k + δ ˜ u αNn (cid:48) k | ˆ˜ H β k − ε n k + ε n (cid:48) k β k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k α k | ˆ˜ Q N k ˜ u βn k + δ ˜ u βNn k (cid:105) + N (cid:88) n (cid:48)(cid:48) / ∈{ d } (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n (cid:48) k + ε n (cid:48)(cid:48) k α k | ˜ u n (cid:48)(cid:48) k (cid:105) ε n k − ε n (cid:48)(cid:48) k (cid:104) ˜ u n (cid:48)(cid:48) k | ˆ˜ H β k − ε n (cid:48)(cid:48) k + ε n k β k | ˜ u n k (cid:105) (cid:17) + α ↔ β. (64)Eqs. (64) and (63) are the extension of Eqs. (20) and (21)to degeneracy and PAW (USPP) that we were seeking.By substituting ˆ˜1 k → ε αβnn (cid:48) k = (cid:104) u n (cid:48) k | ˆ H αβ k | u n k (cid:105) + (cid:16) (cid:104) ˆ Q N k u αn (cid:48) k | ˆ H k − ε n k + ε n (cid:48) k | ˆ Q N k u βn k (cid:105) + (cid:104) ˆ Q N k u αn (cid:48) k | ˆ H β k | u n k (cid:105) + (cid:104) u n (cid:48) k | ˆ H α k | ˆ Q N k u βn k (cid:105) + N (cid:88) n (cid:48)(cid:48) / ∈{ d } (cid:104) u n (cid:48) k | ˆ H α k | u n (cid:48)(cid:48) k (cid:105) (cid:104) u n (cid:48)(cid:48) k | ε n k − ε n (cid:48)(cid:48) k ˆ H β k | u n k (cid:105) (cid:17) + α ↔ β, (65)where | ˆ Q N k u αn k (cid:105) is given by Eq. (21). The precedingexpression can be simplified using Eqs. (62) and (4) toyield ε αβnn (cid:48) k = (cid:104) u n (cid:48) k | ˆ H αβ k | u n k (cid:105) + (cid:104) u n (cid:48) k | ˆ H α k | ˆ Q N k u βn k (cid:105) + (cid:104) ˆ Q N k u βn (cid:48) k | ˆ H α k | u n k (cid:105) + (cid:32) N (cid:88) n (cid:48)(cid:48) / ∈{ d } (cid:104) u n (cid:48) k | ˆ H α k | u n (cid:48)(cid:48) k (cid:105) (cid:104) u n (cid:48)(cid:48) k | ˆ H β k | u n k (cid:105) ε n k − ε n (cid:48)(cid:48) k + α ↔ β (cid:33) , (66)which simplifies to Eq. (20) when degeneracy is lifted at0th order. B. Transport equivalent effective mass fromdegenerate perturbation theory
However, deducing transport properties from ε αβnn (cid:48) k asper Eqs. (63) and (64) is more involved than in thenon-degenerate case, where we simply had [ M − n k ] αβ = (cid:52) ε αβn k (Eq. (22)). It is indeed not possible to obtain effectivemass tensors that describe the dispersion of the individ-ual bands: this would involve diagonalizing ε αβnn (cid:48) k withrespect to nn (cid:48) , but each matrix element in this case isa tensor ε nn (cid:48) k (with [ ε nn (cid:48) k ] αβ = (cid:52) ε αβnn (cid:48) k ) and it is not ingeneral possible to diagonalize a matrix of tensors. It ishowever still possible to circumvent this complication by associating ε αβnn (cid:48) k to a set of N deg ‘transport equivalentmass’ tensors ¯ M n k , which generate the same contri-bution to transport properties as the less intuitive ε αβnn (cid:48) k does. To make this association, we begin by transformingthe tensorial matrix elements into scalar quantities. Thisis easily done by picking a direction ( θ, φ ) in spherical co-ordinates along which we describe the bands dispersion ε nn (cid:48) k + q = ε { d } k + 12 (cid:88) αβ q α ε αβnn (cid:48) k q β = ε { d } k + q (cid:88) αβ ˆ q α ( θ, φ ) ε αβnn (cid:48) k ˆ q β ( θ, φ ) (67)= (cid:52) ε { d } k + q f nn (cid:48) k ( θ, φ ) , where we have supposed that the degenerate point k isalso a band extrema ε αnn (cid:48) k = 0, where q represents aposition in the Brillouin zone with respect to the bandextrema k , and where we have defined the curvature f nn (cid:48) k ( θ, φ ) of the matrix elements ε nn (cid:48) k at the band ex-trema along the direction ( θ, φ ). We now deal with anangular-dependent matrix of scalars f nn (cid:48) k ( θ, φ ), whichcan be diagonalized for each combination of ( θ, φ ) con-sidered in the calculation (cid:88) n (cid:48) f nn (cid:48) k ( θ, φ ) | ν n (cid:48) k ( θ, φ ) (cid:105) = f n k ( θ, φ ) | ν n k ( θ, φ ) (cid:105) . (68)The resulting eigenvalues f n k ( θ, φ ) and eigenvectors | ν n k ( θ, φ ) (cid:105) yield the diagonalized version of Eq. (67) ε n k + q = ε { d } k + q f n k ( θ, φ ) (69)= (cid:52) ε { d } k + q m n k ( θ, φ ) , (70)where we have defined the angular-dependent effectivemass m n k ( θ, φ ).Once these angular dependent quantities are availablefor each band in the degenerate set, it becomes possi-ble possible to apply the idea of Ref. 22. The latterassociates to f n k ( θ, φ ) a ‘transport equivalent mass’ ten-sors ¯ M n k that generate the same contribution to trans-port properties. This association supposes that the re-laxation time approximation to Boltzmann’s transportequation holds. It furthermore supposes that the re-laxation time depends on the energy only and that thisdependence can be factored out of the tensor (i.e. a re-laxation time of the form U n τ n k ( ε )). It also requires that¯ M n k be calculated at a band extrema ( ε αnn (cid:48) k = 0), whichis why we have already supposed that the degeneratepoint k is also a band extrema in Eq. (67). For conve-nience, we summarize in Appendix E the derivation ofthis association, generalized to an energy dependent re- laxation time tensor τ n k ( ε ) and specialized to the case ofconductivity σ for concision. Since the 2D case presentssome peculiarities with respect to the 3D case, we also ex-tend the concept of ‘transport equivalent effective mass’to the 2D case in Appendix F.The prescription to obtain ¯ M n k from f n k ( θ, φ ) beginswith the calculation of the angular dependence of theelectronic velocity (see Eq. (29) of Ref. 22 or Eq. (E6) ofpresent work)¯ v n k (ˆ q )= (cid:52) f n k ( θ, φ ) sin( θ ) cos( φ ) + ∂f n k ∂θ cos( θ ) cos( φ ) − ∂f n k ∂φ sin( φ )sin( θ ) f n k ( θ, φ ) sin( θ ) sin( φ ) + ∂f n k ∂θ cos( θ ) sin( φ ) + ∂f n k ∂φ cos( φ )sin( θ ) f n k ( θ, φ ) cos( θ ) − ∂f n k ∂θ sin( θ ) . (71)Then, one can deduce from ¯ v n k (ˆ q ) and f n k ( θ, φ ) a tenso-rial quantity C n k representing the angular contributionof extrema k of band n to transport properties (e.g. σ )(see Eqs. (E9), (E10) and (E11), which are analogous toEqs. (34), (35) and (36) of Ref. 22) C n k = (cid:90) π dφ (cid:90) π dθ sin( θ ) ¯ v n k ( θ, φ )¯ v Tn k ( θ, φ )2 | f n k ( θ, φ ) | / . (72)In the present implementation, we use Gauss-Legendrequadrature to numerically integrate with respect to θ and φ . Finally, one has to rotate the Cartesian axes to diag-onalize this tensor U Tn k C n k U n k = (cid:52) C n k x C n k y
00 0 C n k z , (73)where U n k is the desired rotation, then deduce the com-ponents of ¯ M n k in the rotated system (using Eq. (38) ofRef. 22 or Eq. (E15) of present work) and rotate back tothe original Cartesian axes¯ M n k = (cid:18) π (cid:19) U n k C n k y C n k z C n k x C n k z
00 0 C n k x C n k y U Tn k . (74)Note that the partial derivatives with respect to spher- ical angles of f n k ( θ, φ ) can easily be obtained analytically f n k ( θ, φ )= (cid:88) n (cid:48) n (cid:48)(cid:48) ∈{ d } (cid:104) ν n k ( θ, φ ) | u n (cid:48) k (cid:105) f n (cid:48) n (cid:48)(cid:48) k ( θ, φ ) (cid:104) u n (cid:48)(cid:48) k | ν n k ( θ, φ ) (cid:105) , = (cid:52) (cid:104) ν n k ( θ, φ ) | ˆ f k ( θ, φ ) | ν n k ( θ, φ ) (cid:105) , (75) ⇒ ∂f n k ∂θ = ∂ (cid:104) ν n k | ∂θ ˆ f k | ν n k (cid:105) + (cid:104) ν n k | ∂ ˆ f k ∂θ | ν n k (cid:105) + (cid:104) ν n k | ˆ f k ∂ | ν n k (cid:105) ∂θ , = (cid:104) ν n k | ∂ ˆ f k ∂θ | ν n k (cid:105) + f n k (cid:8)(cid:8)(cid:8)(cid:8)(cid:8)(cid:8) ∂ (cid:104) ν n k | ν n k (cid:105) ∂θ , = (cid:88) n (cid:48) n (cid:48)(cid:48) ∈{ d } (cid:104) ν n k ( θ, φ ) | u n (cid:48) k (cid:105) ∂f n (cid:48) n (cid:48)(cid:48) k ( θ, φ ) ∂θ (cid:104) u n (cid:48)(cid:48) k | ν n k ( θ, φ ) (cid:105) , (76)with (see Eq. (67)) ∂f nn (cid:48) k ( θ, φ ) ∂θ = (cid:88) αβ ∂ ˆ q α ( θ, φ ) ∂θ ε αβnn (cid:48) k ˆ q β ( θ, φ )+ ˆ q α ( θ, φ ) ε αβnn (cid:48) k ∂ ˆ q β ( θ, φ ) ∂θ , (77)and with ˆ q T = (sin θ cos φ, sin θ sin φ, cos θ ). The analo-gous expressions for φ derivatives are easily obtained bysubstituting ∂θ → ∂φ .It is also interesting to note that Eq. (72) diverges ina 2D system, i.e. a system where f n k ( θ, φ ) → IV. RESULTSA. Numerical validation by direct comparison ofDFPT against finite-difference effective masses inthe case of silicon
To check our implementation, we assessed its agree-ment with finite-difference calculations. Finite-differencederivatives of the form ε ααn k were computed using second-order derivative formulae of order 2 and 8 calculated ona regularly spaced grid from Ref. 48 ∂ X∂h (cid:12)(cid:12)(cid:12) h i = X h i +1 − X h i + X h i − ∆ h + O (∆ h ) (78)and ∂ X∂h (cid:12)(cid:12)(cid:12) h i = (cid:16) − X h i +4 + X h i +3 − X h i +2 + X h i +1 − X h i + X h i − − X h i − + X h i − − X h i − (cid:17)(cid:14) ∆ h + O (∆ h ) , (79)where X stands for any function of the independent vari-able h computed on a regular grid of spacing ∆ h , withits points labeled by the index i and where O (∆ h n ) de-notes the order n of the error. Also, derivatives of theform ε αβn k where α (cid:54) = β were obtained by substituting theexpression for the first-order derivative along the first di-rection α into the expression for the first-order derivativealong the second direction β . The first-order derivativeformulae used were those of Ref. 48 for precision order 2and 8 with a regularly spaced grid ∂X∂h (cid:12)(cid:12)(cid:12) h i = X h i +1 − X h i − ∆ h + O (∆ h ) (80)and ∂X∂h (cid:12)(cid:12)(cid:12) h i = (cid:16) − X h i +4 + X h i +3 − X h i +2 + X h i +1 − X h i − + X h i − − X h i − + X h i − (cid:17)(cid:14) ∆ h + O (∆ h ) . (81)The resulting expressions for ε αβn k i (where α (cid:54) = β ) at pre-cision order 2 and 8 require 4 and 64 evaluations of ε n k i ,respectively.In both DFPT and finite-difference calculations, thePAW formalism was used along with spin-orbit cou-pling within the PAW augmentation regions (see Sec-tion II D). A cutoff energy of 20 Ha for the plane wavebasis and 40 Ha for the PAW double grid along with a6 6 6 Monkhorst-Pack grid for k -point integrations wereused to ensure that the calculated effective masses wereconverged to four significant digits. The local-density TABLE I. Comparison of effective masses for the two non-degenerate valence bands and the first conduction band ofsilicon at Γ . Due to the cubic symmetry, the effective massestensors are proportional to the identity, so that only one scalaris reported. We use order 2 and order 8 finite differences aswell as DFPT to compute the masses. The agreement betweenthe finite-difference methods and DFPT is provided (DFPT -FD 2 and DFPT - FD 8, respectively). Results are providedin atomic units ( m e = 1).Band Γ v split-off Γ (cid:48) v split-off Γ c FD order 2 1.161 530 -0.222 58 FD order 8 1.161 530 -0.222 58 DFPT 1.161 530 -0.222 58 DFPT - FD 2 -1E-7 8E-7 -3E-7DFPT - FD 8 -9E-8 -5E-8 -5E-8 approximation of Perdew and Wang was used, for co-herence with available effective-mass calculations in theliterature. The cell parameter was fully relaxed, yielding5 . Γ ( Γ v , split-off Γ (cid:48) v , and split-off Γ c ). Since the Γ point exhibit cubic symmetry, the effective mass tensorsbecome proportional to the identity at this point, so thatonly one scalar needs to be reported in this case.Also, for degenerate bands, it is still possible to ob-tain scalar effective masses along specific directions (seeSection III B). Thus, in Table II, we compare the val-ues provided by the two methods for the scalar effectivemasses of the top valence bands of silicon at Γ in theCartesian directions (100), (111), and (110).Finally, to assess the quality of the finite differences,we also compare the results obtained using order 2 andorder 8 finite differences.The results in Tables I and II show good agreement be-tween DFPT and order 8 finite differences (agreement inthe range 10 − − − ). Furthermore, we observe that thefinite-difference results converge (with increasing order)towards the DFPT results. This observation suggeststhat the difference between DFPT and order 8 finite dif-ferences can be attributed to the numerical precision ofthe latter method.To further support this attribution, it is interestingto study the convergence behavior of the finite-differencecalculations with respect to the finite-difference parame-ter. We show such a convergence study in Figure 1 forthe first conduction band of silicon at Γ .We first note that the order 8 finite-difference resultsaround the converged value vary by an amount similar tothe agreement between finite difference and DFPT, whichfurther supports the attribution of the discrepancies to1 TABLE II. Comparison of effective masses for the two de-generate valence bands of silicon at Γ ( Γ (cid:48) v ), usually referredto as ‘light hole’ and ‘heavy hole’ bands. We provide scalareffective masses along the Cartesian directions (100), (111),and (110) since the concept of effective mass tensor is notsuitable for degenerate bands. We use order 2 and order 8 fi-nite differences as well as DFPT to compute the masses. Theagreement between the finite-difference methods and DFPTis provided (DFPT - FD 2 and DFPT - FD 8, respectively).Results are provided in atomic units ( m e = 1).Direction (100) (111) (110) light hole FD order 2 -0.188 25 -0.129 7
44 50 -0.136 67
FD order 8 -0.188 25 -0.129 7
39 62 -0.136 67
DFPT -0.188 25 -0.129 7
39 40 -0.136 67
DFPT - FD 2 2E-6 5E-6 5E-6DFPT - FD 8 6E-8 2E-7 1E-7 heavy hole
FD order 2 -0.253 93 -0.648 381 -0.517 2
32 33
FD order 8 -0.253 93 -0.648 381 -0.517 2
48 27
DFPT -0.253 93 -0.648 381 -0.517 2
49 81
DFPT - FD 2 1E-6 -4E-7 -2E-5DFPT - FD 8 -2E-8 -2E-7 -2E-6 the finite-difference method.Also, we note the characteristic V-shape of the con-vergence. Indeed, there is an optimal value of the finite-difference parameter that yields the best compromise be-tween numerical noise and sampling of the band closeto the extrema (where it is parabolic). Thus, finite-difference calculations of effective masses require this sup-plementary convergence study. In contrast, DFPT is de-void of a parameter analogous to the finite-difference pa-rameter and thus does not require such a convergencestudy.
B. Effective mass for selected materials
1. Silicon
Now that our implementation has been validated, wecompare our results for silicon to those available in the lit-erature. Our calculations use the parameters specified inthe first paragraph of Section IV A, along with 500 pointsGauss-Legendre quadrature integration for the computa-tion of the transport effective mass tensor (see Eq. (72))and the spherically averaged effective mass. The con-duction band minima was determined to be at 0.42252(0.42253) on the Γ − X path with (without) SOC andthe effective mass was calculated at this point.For the non-degenerate bands, we simply compare theeffective mass tensor with other results in the literature.In the degenerate case, the literature tends to focus onscalar effective mass along the standard Cartesian direc- − − − − − − − − Finite difference parameter (red. coord.)10 − − − − − − − − E rr o r on e ff ec ti v e m a ss ( a . u . ) Finite differences order 2Finite differences order 8Converged parameter for order 2Converged parameter for order 8
FIG. 1. (Color online) Convergence study with respect tothe finite-difference parameter for the effective mass of thefirst conduction band of silicon at Γ . The converged value m i is taken to be the one where | m i +1 − m i | + | m i − m i − | is smallest, the index i labelling the successive calculations inthe convergence study. Only the absolute difference betweenthe calculated values and this converged value is shown forclarity. Note that this places the converged value outside thelogarithmic scale (0 = 10 −∞ ). tions (100) (cid:107) Γ − X , (111) (cid:107) Γ − L , and (110) (cid:107) Γ − K ;the comparison is therefore based on these results.However, these comparisons do not validate the imple-mentation of the transport equivalent formalism (see Sec-tion III B). Thus, we also provide results for this quantityand compare them to the only others results currentlyavailable (Ref. 22).Finally, we also assess the importance of taking intoaccount the SOC in effective-mass calculations by alsoproviding PAW results without SOC. The results are pro-vided in Table III.Let us first assess the impact of SOC. As expected, theconduction band is almost not influenced by the inclu-sion of SOC. However, this is not the case for the valencebands. Close examination of the split-off band immedi-ately suggests that this qualitative change of behavioris related to the degeneracy of the bands. Indeed, thesplit-off band is not degenerate when SOC is included,which allows its description by an effective mass tensor.The cubic symmetry of the Γ point then makes this ten-sor proportional to the identity, which makes the split-offeffective mass spherically symmetric.In contrast, neglecting SOC makes the split-off banddegenerate with the two other valence bands, which pre-vents its description by an effective mass tensor and thusrequires the formalism of Section III B. Moreover, in-creasing the degenerate subspace dimension from 2 to3 adds two off-diagonal coupling terms in f nn (cid:48) k (seeEq. (68)), which breaks the spherical symmetry of themass and cause the qualitative changes observed in Ta-ble III.2 TABLE III. Effective masses of silicon at the valence bandmaximum (VBM) ( Γ (cid:48) v ) and the conduction band minimum(CBM) (located at 0 . X with SOC and 0 . X with-out SOC). When degenerate, transport equivalent and spher-ically averaged effective masses are given, along with effec-tive masses along standard Cartesian directions. When non-degenerate, the effective mass tensor is given. At Γ , the cubicsymmetry makes the tensor proportional to the identity, sothat only one scalar needs to be reported. On the Γ − X path, Cartesian coordinates diagonalize the tensor and the 2directions perpendicular to the path are identical, so that onlytwo scalars need to be reported (the diagonal components ofthe tensor (cid:107) and ⊥ to Γ − X ). Results are provided in atomicunits ( m e = 1).Band This work Theory Experimentno SOC SOC [22] [50] [51] [52] split-off Eff. mass - 0.2226 - 0.22 - -Trans. eqv. 0.1336 0.2226 - - - -Sph. avg. 0.1101 0.2226 - - - -(100) 0.1671 0.2226 - - - -(111) 0.0938 0.2226 - - - -(110) 0.1054 0.2226 - - - - light hole
Trans. eqv. 0.5715 0.1559 0.1731 - - -Sph. avg. 0.3026 0.1431 0.1531 - - -(100) 0.2578 0.1882 - 0.18 0.171 -(111) 0.6495 0.1297 - 0.13 0.160 -(110) 0.2578 0.1367 - 0.14 0.163 - heavy hole
Trans. eqv. 10.60 0.7294 1.1567 - - -Sph. avg. 0.7405 0.4416 0.5322 - - -(100) 0.2578 0.2540 - 0.26 0.46 -(111) 0.6495 0.6484 - 0.67 0.56 -(110) 2.703 0.5173 - 0.54 0.53 -
CBM (cid:107) to Γ − X ⊥ to Γ − X This attribution can be more formally proven by (un-physically) treating the split-off band in a SOC calcu-lation as degenerate with the two other valence bandswithin the formalism of Section III B. Since our imple-mentation decides whether two bands are degenerate ornot using a numerical threshold on their eigenenergy dif-ference, we carried out the preceding test by increasingthis threshold to a value above the spin-orbit splitting.We thus obtained transport equivalent effective masses of0.1335, 0.5708, and 10.58, in very good agreement withthe values obtained without SOC (respectively 0.1336,0.5715, and 10.60; see Table III), which demonstratesthe validity of our interpretation.Yet, this analysis does not settle whether both resultsare correct and really reflect the curvature of the valence
TABLE IV. Comparison of effective masses for the threedegenerate valence bands of silicon (without SOC) at Γ . Weprovide scalar effective masses along the Cartesian directions(100), (111), and (110) since the concept of effective masstensor is not suitable for degenerate bands. We use order8 finite differences and DFPT to compute the masses. Theagreement between the finite-difference methods and DFPTis provided. Results are provided in atomic units ( m e = 1).Direction (100) (111) (110) Band 2
FD order 8 -0.167 180 4 -0.093 807 8 -0.105 36 DFPT -0.167 180 4 -0.093 807 8 -0.105 36 DFPT - FD 8 -4E-8 -3E-8 -5E-8
Band 3
FD order 8 -0.257 803 -0.649 497 -0.257 803 8 DFPT -0.257 803 -0.649 497 -0.257 803 8 DFPT - FD 8 3E-8 1E-7 2E-8
Band 4
FD order 8 -0.257 803 82 -0.649 497 -2.702 563 DFPT -0.257 803 82 -0.649 497 -2.702 563 DFPT - FD 8 4E-9 2E-7 5E-7 bands extrema. For the results with SOC, we have al-ready demonstrated in Table II that the DFPT resultswere indeed correct (i.e. that they agreed very well withfinite-difference calculations). Carrying out the same testfor the valence bands without SOC (see Table IV) revealsthe same agreement, which proves that the qualitativechange of behavior of the effective masses predicted byour implementation is correct.To explain this fact, we plot the valence bands withand without SOC around the Γ point in the (111) andthe (110) directions in Figure 2. Examining e.g. the lighthole band (middle one) with SOC in the (111) directionreveals a change of regime in the effective mass aroundthe SOC energy splitting. Indeed, it can be seen that thecurvature is relatively high at Γ , but then changes ratherrapidly and settles to a lower value around 0 . L − . L .This suggests that the effective mass evolve from the SOCvalue at low energy to the non-SOC value at higher ener-gies (with respect to the SOC splitting). This behavior isnot surprising, since one would expect that treating thesplit-off band as degenerate with the two other valencebands would become a good approximation at energieswhere the SOC splitting becomes negligible. However, inthe case of silicon, this occurs relatively far from the Γ point, so that the validity of the quadratic expansion ofthe eigenenergies becomes questionable. Thus, we post-pone further analysis of this question to the next section,where a material with much smaller SOC splitting of thevalence bands is studied (graphane) (see in particularFigure 4).Since we have established the critical role of SOC inthe proper description of the effective masses at degener-ate band extrema, we now compare our results including3 . K ΓΓΓ L − . − . − . − . − . − . . E n e r gy ( e V ) PAW w/o SOCPAW with SOC
FIG. 2. (Color online) PAW calculation of the valence bandsof silicon around Γ in the directions (110) (cid:107) K and (111) (cid:107) L ,with and without SOC. For the light hole band (middle band)in direction L with SOC, we see that the effective mass has achange of regime around the spin-orbit energy splitting. Afterthis change of regime, the effective mass of the band with SOCtends to the mass of the band without SOC. SOC with results from the literature (Table III), whichall include SOC. As could be expected, the agreementwith Ref. 50 (DFT results) is good and globally betterthan with Ref. 51 and 52 (experimental results). How-ever, the agreement of the ‘transport equivalent effectivemass’ with Ref. 22 is surprisingly poor.To assess this issue, we first replaced f n k ( θ, φ ) (seeEq. (68)) in our implementation by the fit used in Ref. 22(see their Eq. (16), or Eq. (63) of Ref. 18 for the originalversion) f n k ( θ, φ ) = A n k ± (cid:114) B n k + C n k sin ( θ ) (cid:16) cos ( θ ) + sin ( θ ) sin ( φ ) cos ( φ ) (cid:17) , (82)where the parameters A n k , B n k , and C n k are indepen-dent of n for a given pair of degenerate bands (it isonly the ± sign that distinguishes the two bands). Us-ing the values obtained by Ref. 22 for the parametersof the top valence band of silicon at Γ ( A = 4 . B = 0 . C = 5 . f n k ( θ, φ ) (in Eq. (71))yielded the same results, both for our implementation’s f n k ( θ, φ ) (see Eq. (68)) and Eq. (82)’s. Therefore, thediscrepancy does not stem from the implementation ofthe transport effective mass formalism (from Eq. (68) tothe end of Section III B).Thus, two possibilities remain: either Eq. (82) does notfit well f n k ( θ, φ ) or the underlying results for f n k ( θ, φ )differ. To discriminate between the two, we fitted Eq. (82) to our own results and calculated the transportequivalent effective mass tensor from this fit. The pa-rameters we obtained ( A = − . B = 0 . C = 5 . − difference) tothe one we obtained directly with our implementation.Thus, the discrepancy between the results of Ref. 22 andthe present study can be traced back to the numericalresults obtained for f n k ( θ, φ ).This is in line with the explanation given in Ref. 22 forthe poor agreement between their fit and their DFT val-ues for f n k ( θ, φ ), i.e. the nonparabolicity of the bandsintroduced errors in their finite-difference calculationsof f n k ( θ, φ ). Indeed, it is known that coupling be-tween bands (which cause the band warping of degen-erate states) can also cause strong nonparabolicity ofthe bands . Provided that such strong nonparabol-icity occur for a substantial portion of the directions( θ, φ ), finite-difference methods would become unsuitablefor computing f n k ( θ, φ ) in a calculation of ¯ M n k . Thiswould explain the poor agreement between our resultsand Ref. 22’s as well as why Eq. (82) fit perfectly ourresults while it is not the case for Ref. 22’s. It also illus-trates the convenience and reliability of direct (DFPT)calculations of f n k ( θ, φ ) with respect to finite-differencecomputations.
2. Graphane
Graphane has emerged in recent years as a newtwo-dimensional material with promising properties .However, the effective masses in this material have re-ceived little attention. Indeed, to the author’s knowl-edge, only the effective mass of the conduction band hasbeen roughly assessed . We thus decided to investigatethis topic from our first-principles framework.The grafting of one hydrogen per carbon atom ongraphene to produce graphane can be done followingmany different patterns . However, we focus here onthe most stable one: the so-called ‘chair’ configuration,where two hydrogen atoms attached to neighbouring car-bon atoms are located on opposite sides of the graphenesheet. This form of graphane has the same primitive cellas graphene, barring the addition of the two hydrogenatoms and a slight distortion of the carbon-carbon bonds.Thus, it also features the same Brillouin zone. However,in the case of graphane, the valence band maximum andthe conduction band minimum occur at Γ . Moreover,the valence band is doubly degenerate.In our simulations, we used both PAW with SOC (aspresented in Section II D) and NCPP without SOC (tovalidate it, since it is not used elsewhere in this work).To ensure convergence (effective masses precise to 3 sig-nificant digits), the NCPP simulations were carried outusing a cutoff energy of 40 Ha for the plane wave basiswhile, for the PAW case, a cutoff energy of 20 Ha forthe plane wave basis and 40 Ha for the PAW double gridwere used. Also, in both cases, a 8 8 1 k -point grid, an4 TABLE V. Effective masses for the conduction band mini-mum (CBM) and the two-fold degenerate valence band max-imum (VBM) of ‘chair’ graphane. The ‘light hole’ (lh) and‘heavy hole’ (hh) bands designate respectively the lower andupper bands in the inset of Figure 3. Results were obtainedusing NCPP calculations without SOC (labelled NCPP) aswell as PAW calculations with SOC (labelled PAW). Toexplain the large difference between these calculations, wetreated the valence bands as degenerate in a PAW calculationwith SOC (labelled PAW+deg). We also carried out finite-difference calculations (order 8) and obtained their differencewith respect to DFPT calculations (labelled ∆@NCPP and∆@PAW) using the methodology of Section IV A. Also, sincethe Γ point exhibits hexagonal symmetry, the tensors at thispoint are proportional to the identity, so that effective massescan be reported using a single number. Finally, degeneratebands are handled using the formalism of Appendix F sincethe formalism of Appendix E becomes unstable for 2D mate-rials. Results are provided in atomic units ( m e = 1).Band NCPP ∆@NCPP PAW ∆@PAW PAW+degVBM, lh -0.271 -4E-7 -0.373 8E-8 -0.267VBM, hh -0.616 1E-6 -0.372 8E-8 -0.613CBM 1.012 2E-7 1.012 5E-8 1.012 interlayer spacing of 22.5 ˚A, and the local-density approx-imation of Perdew and Wang were used. Moreover, ineach case, the structure was fully relaxed.The hexagonal structure of graphane is described by2 primitive vectors of equal lengths with a 120 ◦ anglebetween them, with the atoms placed at positions (0 , , ) (the other car-bon and hydrogen). Moreover, by symmetry, all the C-Cbonds and C-H bonds have the same length. This leavesonly three quantities to be reported to define the struc-ture of graphane: the primitive vector length a , the C-Cbond length and the C-H bond length. Our NCPP struc-tural relaxation yielded a =2.495 ˚A, C-C=1.509 ˚A, and C-H=1.110 ˚A while PAW yielded a =2.504 ˚A, C-C=1.515 ˚A,and C-H=1.117 ˚A. The latter result compares very wellwith the other PAW LDA result reported in the litera-ture ( a =2.504 ˚A, C-C=1.537 ˚A, and C-H=1.110 ˚A).Since graphane is a 2D material, the 3D formalismfor the ‘transport equivalent’ effective mass presented inSection III B and Appendix E becomes numerically un-stable. We discuss this issue and adapt the formalism ofAppendix E to the 2D case in Appendix F. Thus, it iswith the latter formalism that we handled the degeneratevalence band extremum in graphane.Since the Γ point exhibits hexagonal symmetry, thetensors at this point are proportional to the identity,which leaves only one quantity per band to be reported.We summarize these results in Table V. Also, for the de-generate 2D case, a scaling factor of the transport tensor c nk needs to be taken into account (see Eq. (F15)). How-ever, since it is found to be 1 .
000 for the valence band ofgraphane, it is omitted from Table V.We observe good agreement between all results for the conduction effective mass. Moreover, our results agreewell with Ref. 57, which reports an effective mass of 1 forthe conduction band of graphane using first-principlescalculations. However, the comparison is more complexin the case of the valence bands.The first issue is that valence bands are degeneratein the NCPP calculations, while the PAW calculationswith SOC lift the degeneracy by 8.71 meV (see Figure 3).Thus, the nature of the effective masses is not the samefor NCPP and PAW with SOC: the former are ‘transportequivalent’ effective masses while the latter are plain ef-fective masses. Still, examination of the angular depen-dent effective masses m n k ( θ, φ ) (see Eq. (70)) for bothdegenerate valence bands in the NCPP calculations re-veals that they exhibit spherical symmetry (i.e. there isno warping). In this case, ‘transport equivalent’ effectivemasses coincide with plain effective masses, so that directcomparison between NCPP and PAW with SOC resultsis possible.Yet, as in the case of silicon, the results with and with-out SOC strongly differ. In Table V, we prove that thisdifference can be attributed to the extra coupling termpresent in f nn (cid:48) k (see Eq. (68)) when the bands are con-sidered degenerate. We do so by increasing the numer-ical threshold for degeneracy to a value above the SOCsplitting in the PAW (with SOC) calculation (see thePAW+deg column). As expected, the results obtainedare in very good agreement with the NCPP ones.We then prove that both results (NCPP and PAW withSOC) really reflects the curvature of the bands at Γ bycomparing them with finite-difference calculations, usingthe methodology of Section IV A. For clarity, only the dif-ference between DFPT and finite-difference results is re-ported in Table V. The good agreement confirms both theaccuracy of NCPP and PAW with SOC effective masses(and validate our NCPP DFPT implementation).In the case of silicon, observation of the band structuresuggested that the effective masses with SOC evolve to-wards the values without SOC when moving away fromthe band extrema. However, such an observation is moredifficult to make in the case of graphane. Indeed, directexamination of the band structure (Figure 3) does notallow to see much besides a rigid shift of a valence bandwhen SOC in included. Thus, to more precisely investi-gate the issue, we calculated the scalar effective massesin the Γ − M direction at different points along the Γ − M line, with and without SOC. The results are presented inFigure 4.The results confirm what was suggested in the case ofsilicon: the masses with SOC evolve towards the masseswithout SOC, with a transition that occurs around k =0 . M , which coincides with an energy scale that iscomparable to the SOC energy splitting. Indeed, forPAW calculations with SOC, ε lh k − ε lh Γ = 9 . ε hh k − ε hh Γ = 7 . K ΓΓΓ
M K − − − − E n e r gy ( e V ) PAW w/o SOCPAW with SOCFermi energy0.05 K ΓΓΓ M − . . FIG. 3. (Color online) Band structure of graphane withinthe PAW formalism, with and without SOC. The inset showsthe 8.71 meV spin-orbit splitting of the valence bands. .
00 0 .
02 0 .
04 0 .
06 0 . k (in fraction of M ) − . − . − . − . − . − . − . − . S ca l a r e ff ec ti v e m a ss i nd i r ec ti on M Without SOC (degenerate bands)With SOC (non-degenerate bands)
FIG. 4. (Color online) PAW calculations of scalar effectivemasses of graphane in the Γ − M direction at different pointon the Γ − M line, with (red squares) and without (blue dots)SOC. relevant magnitude of k or ε n k around the band extrema(e.g. the doping level, gate voltage, ...) when computingeffective masses. A possible solution would be to mergethe formalism of Kane models into the formalism ofSection III.
3. Arsenic α -arsenic (or grey arsenic) is a semimetal with the A7crystal structure . The latter structure has a trigonalprimitive cell with 2 atoms per cell. The Brillouin zoneof α -As is described in Section II and Appendix A ofRef. 60. It features a non-degenerate ellipsoidal electronpocket at L and a non-degenerate hole pocket of complex shape around T .The fact that an effective mass at a non-degenerateband extrema is a tensor (see, e.g., Eq. (22)) implies thatthe angle-dependent effective mass m n k ( θ, φ ) should bean ellipsoid. Consequently, the effective mass formalismis inappropriate for the description of the hole pockets in α -As, since they are far from an ellipsoidal shape. Wetherefore focus on the electron pockets, which originatefrom the L band.In the literature on α -As effective masses ,Cartesian directions in the reciprocal space are conven-tionally chosen to be the ‘trigonal’ axis ((1 1 1) in reducedcoordinates) for the z direction and the ‘binary’ axis ((01 -1) in reduced coordinates) for the x axis, which leaves y (the ‘bisectrix’ axis) to be ˆ z × ˆ x .One of the principal axes of the effective mass ellipsoid m lies along the ‘binary’ axis x , which leaves the twoother m and m to be in the yz plane . Theorientation of the largest of these two principal effectivemasses m is conventionally described in term of the ‘tilt’angle it forms with the z axis. Positive angles denote arotation from the z axis towards the y axis. We use thesame convention here.In our calculations, the PAW formalism was used withand without SOC (see Section II D). A cutoff energy of30 Ha for the plane wave basis and 60 Ha for the PAWdouble grid along with a 30 30 30 k -point grid were usedto ensure fully converged calculations (effective massesprecise to 3 significant digits). Also, the local-densityapproximation of Perdew and Wang was used and thestructure was fully relaxed. The trigonal primitive vectorwere found to be 3.9783 ˚A long with 56.441 ◦ angles be-tween each of them and the atomic positions were foundto be 0 . · (1 , ,
1) and 0 . · (1 , ,
1) in reducedcoordinates.We compare our results for m , m , m , and the tiltangle with results from the literature in Table VI. Tofurther assess the nonparabolicity of the band, we alsocompute three scalar effective masses (with SOC) fromthe band extrema and the points of the Fermi surfacelying on the effective mass principal axes.We immediately note the disagreement between theDFPT and the other results for m and m . In partic-ular, the disagreement between our DFPT results andour scalar effective masses (FD@ ε F ) directs suspicion to-wards band nonparabolicity. To assess this, we plot theband structure with and without SOC around L alongthese directions in Figure 5. Indeed, both band struc-ture are clearly non-parabolic along these two directionsand, in particular, have a curvature along m that changesign at a finite wave vector k . This is in line with thestrong nonparabolicity also observed by Ref. 60.We also note that, for our scalar effective masses(FD@ ε F ), the agreement with other theoretical results(Ref. 61) is reasonably good. Indeed, the agreement isas good as could be expected, given that Ref. 61 usesan empirical pseudopotential approach, that our DFPTeffective mass principal axes are likely to be slightly dif-6 TABLE VI. DFPT effective masses for the electron pocket of α -As at L = (0 . , ,
0) from the present implementation andcomparison with the literature. PAW results with and with-out SOC are provided (labeled SOC and no SOC, respec-tively). To assess the nonparabolicity of the band extremagenerating the electron pockets, we also compute effectivemasses (with SOC) along the effective mass principal axesby fitting a parabola to the band extremum and the pointof the Fermi surface lying on the axis (the resulting massesare labeled FD@ ε F ). The convention for the tilt angle is thesame as in Refs. 61, 63–65. Results are provided in atomicunits ( m e = 1).This work Theory Experimentno SOC SOC FD@ ε F [61] [65] [64] [63] m -0.0448 -0.0709 0.0867 0.11 0.134 0.121 0.163 m m ◦ ◦ - 80 ◦ ◦ ◦ ◦ ← m . L .
01 0 . → m Distance from L (in k L k ) − . − . . . . E n e r gy ( e V ) PAW w/o SOCPAW with SOC
FIG. 5. (Color online) Band structure of α -As with andwithout SOC around L in the directions of the two DFPT ef-fective mass principal axes that exhibit strong nonparabolicity( m and m ). We observe a strong nonparabolicity in the m direction. Also, the m direction looks reasonably parabolicwithout SOC, but changes sign due to a band crossing at fi-nite wave vector. When SOC is included, the m directionexhibits strong nonparabolicity due to an avoided crossing. ferent from the Fermi surface principal axes and that theFermi surface deviates slightly from an ellipsoid .On the other hand, we observe that DFPT agrees withthe scalar effective mass results (FD@ ε F ) for m , whichsuggests that the effective mass is parabolic in this di-rection. In this case (as well as for the tilt angle, whichis incidentally defined with respect to m ), we also ob-serve good overall agreement with experimental andtheoretical results.We conclude from this that effective masses (as de-fined in Eq. (22), i.e. in the k · ˆ p and DFPT sense) are a dangerous concept to use in metals, even if they have aslittle charge carriers as α -As. Indeed, in the present case,the range where a quadratic expansion reliably describesthe band dispersion is much smaller than the electronpockets. However, it is possible that some coupling be-tween the bands in Figure 5 would explain the strongnonparabolicities observed. Thus, including the formal-ism of Kane models into the formalism of Section IIImay enhance the description of α -As (and semimetals ingeneral). V. CONCLUSION
Up to now, effective masses were usually calculatedusing finite-difference estimation of density functionaltheory (DFT) electronic band curvatures . Theonly option available to circumvent their use relies onWannier functions . However, finite differences requireadditional convergence studies and can lead to precisionissues while Wannier functions require careful selection ofthe starting functions. In contrast, the present DFPT-based method allows to obtain DFT effective masses withhigh precision without any such additional work. It istherefore more suitable for e.g. high-throughput mate-rial design.Moreover, it is known that the concept of effective massbreaks down at degenerate band extrema due to the non-analyticity of the band structure at such a point. Whilethis issue is usually addressed with a fitting procedurethat aims at accurately describing the band structure atthis point , it would be more convenient to directly de-termine a metric of the performance of the material ina design context. Since the most appropriate metric isusually some transport tensor, the concept of ‘transportequivalent mass tensor’ becomes quite suitable for ourobjective and has been integrated in our DFPT-basedmethod. This concept allows one to compute an effec-tive mass-like tensor which gives the right contribution totransport tensors when used, even if the band extremumcannot be described by a tensor. This makes our methodmore general and even simpler to use.The developed techniques were validated by compari-son with finite-difference calculations and excellent agree-ment was observed. Then, applying our method to thestudy of silicon, graphane, and arsenic, we found resultscoherent with previous studies, thus further validatingour method. Actually, the agreement with Ref. 22 was,at first sight, not very good, as seen in Section IV B 1, buta careful analysis pointed to the superiority of the DFPTversus finite-difference approach to extract a ‘transportequivalent effective mass’.Still, our simulations (especially in the case ofgraphane) suggest that neither the non-degenerate for-malism of Section II nor the degenerate formalism of Sec-tion III is suitable when the energy scale relevant to theproblem (e.g. doping level, gate voltage, ...) is compara-ble to the energy separation between the band of interest7and its closest neighbour. Thus, it would be interestingto merge the formalism developed by Kane into theformalism of Section III in future developments.Also, a substantial difference between DFT and experi-mental results remains. Provided that recent studies haveexposed the substantial impact of electron-electron inter-actions on the calculated effective masses , it becomesinteresting to investigate approximate schemes to includethese interactions in the calculations. Furthermore, itwould be interesting to investigate the impact of electron-phonon interaction not only on the band gap , butalso on the effective masses. VI. ACKNOWLEDGMENTS
This work was supported by the FRQNT througha postdoctoral research fellowship (J.L.J.) as wellas the FRS-FNRS through Scientific Stay grant No.2014/V 6/5/010-IB (J.L.J.), a FRIA fellowship (S.P.),and a FNRS fellowship (Y.G.). Also, we would liketo thank Yann Pouillon and Jean-Michel Beuken fortheir valuable technical support and help with the testand build system of ABINIT. Computational resourceshave been provided by the supercomputing facilities ofthe Universit´e catholique de Louvain (CISM/UCL) andthe Consortium des ´Equipements de Calcul Intensif enF´ed´eration Wallonie Bruxelles (CECI) funded by theFonds de la Recherche Scientifique de Belgique (FRS-FNRS) under Grant No. 2.5020.11.
Appendix A: Derivatives in reduced coordinates
Defining { a , a , a } as the real space and { b , b , b } as the reciprocal space primitive vectors, with a α · b β =2 πδ αβ , noting vectors in reduced coordinates as ˘ v , anddefining [ A ] αβ = (cid:52) [ a β ] α ; [ B ] αβ = (cid:52) [ b β ] α , (A1)we have r = A ˘ r ; k = B ˘ k ; A T B = 2 π , (A2)from which we deduce the inverse transformation˘ k = 12 π A T k . (A3)We can now deduce using Eq. (A3) the transformationthat links derivatives with respect to Cartesian compo-nents of k (noted X α ) to derivatives with respect to re-duced coordinates (noted ˘ X α ) δX = (cid:88) α X α [ δ k ] α = (cid:88) α ˘ X α [ δ ˘ k ] α , (A4) ⇒ X α = (cid:88) β [ A ] αβ π ˘ X β . (A5) Similarly for second-order derivatives, we obtain X αβ = (cid:88) γδ [ A ] αγ π ˘ X γδ [ A T ] δβ π , (A6)and can thus retrieve e.g. ε αβn k from ˘ ε αβn k . Appendix B: Kinetic energy with cutoff smearing
A known issue when optimizing the primitive cell sizein a plane wave implementation of DFT are the spuri-ous discontinuous drops in total energy that occur whenincreasing the cell size. Indeed, dilating the real spacelattice contracts the reciprocal space one. This contrac-tion increases discontinuously the number of plane waveslocated inside a sphere defined by the cutoff energy E c .This discontinuous increase of the size of the plane wavebasis set translates into a discontinuous increase of thenumber of degrees of freedom available for the minimiza-tion of the total energy, which causes discontinuous dropswhen increasing the cell size.A possible solution to this problem is to force the ef-fective number of degrees of freedom to increase con-tinuously as the cell size increases. A numerical wayto achieve this was first proposed in Ref. 29 and in-volves modifying the kinetic energy close to E c so that itsmoothly becomes large (cid:104) G | ˆ T k | G (cid:48) (cid:105) = δ GG (cid:48) (cid:20)
12 ( k + G ) + A (cid:18) (cid:16) / k + G ) − E c E s (cid:17)(cid:19)(cid:21) , (B1)where erf( x ) is the error function and where E s and A are adjustable parameters. Thus, the weights of the planewaves become smoothly small close to E c and the changein the number of degrees of freedom with varying cell sizecan be made more continuous. This idea not only makesthe total energy smoother with respect to cell size butalso provides, as a side effect, a very good approximationof Pulay stress within the kinetic energy term .Within ABINIT, the implementation of this idea takesa slightly different form. Rather than becoming smoothlylarge at E c , the kinetic energy rises asymptotically to in-finity (see Figure 6), so that the change in the number ofdegrees of freedom with varying cell size becomes com-pletely continuous. The mathematical formulation of thisidea takes the form (cid:104) G | ˆ T k | G (cid:48) (cid:105) = 12 ( k + G ) δ GG (cid:48) p ( x ) , (B2)where x = (cid:52) E c − ( k + G ) E s , (B3)and where p ( x ) → + ∞ as x → + . The deviation fromthe physical expression ( k + G ) starts at E c − E s , i.e.8 . . . . . . p . ( k + G ) / E c . . . . . . . . . h G | ˆ T k | G i / E c Kinetic energyModified kinetic energyCutoff energy E c Deviation onset E c − E s FIG. 6. Within ABINIT, the kinetic energy rises asymptot-ically to infinity so that the change in the number of degreesof freedom with varying cell size is continuous. The onset ofthe deviation from the physical expression ( k + G ) occursat the energy E c − E s . The first and second-order derivativesof the kinetic energy remain continuous at this onset. p ( x ) starts deviating from 1 below x = 1. The parameter E s is therefore the energy range around the cutoff energy E c where the occupations start to be forced towards 0,i.e. E s can be interpreted as a smearing of the cutoffenergy.To avoid numerical issues, p ( x ) is chosen so the firstand second-order derivatives of the kinetic energy remaincontinuous at the onset of this deviation, that is, justabove the plane wave kinetic energy E c − E s . Moreover,its inverse approaches quadratically zero at E c . Amongthe possible functions with this behaviour, the followingspecific form for p ( x ) was chosen and implemented withinABINIT p ( x )= (cid:52) (cid:40) < x, x (3+ x (1+ x ( − x ))) if 0 < x ≤ . (B4)Together, Eqs. (B2), (B3), and (B4) define the modi-fied kinetic energy implemented within ABINIT and formthe starting point from which its derivatives (Eqs. (26)and (27)) are obtained. Appendix C: PAW reminder
The basic idea of PAW is to transform the eigenfunc-tions | ψ n k (cid:105) of the Hamiltonian ˆ H into ‘pseudo’ wave-functions | ˜ ψ n k (cid:105) , which are smoother and thus accu-rately described by a smaller plane wave basis. Sincethe sharp features of the wavefunction usually occurnear the nucleus, the transformation only needs to dif-fer from identity within an ‘augmentation’ region Ω R around each nucleus position R within the primitive cell.Within these augmentation regions, an orthonormal basis {| φ R i (cid:105)} that contains the sharp components of the wave-functions | ψ n k (cid:105) is chosen, where the index i labels thefunctions | φ R i (cid:105) belonging to a given atomic site R . Thetransformation can then be defined as the replacementof these components {| φ R i (cid:105)} of the wavefunctions | ψ n k (cid:105) by those of another orthonormal basis {| ˜ φ R i (cid:105)} , which aresmoother inside the augmentation region Ω R and identi-cal outside. More formally, this idea translates into | ˜ ψ n k (cid:105) = | ψ n k (cid:105) + (cid:88) R i (cid:16) | ˜ φ R i (cid:105) − | φ R i (cid:105) (cid:17) (cid:104) φ R i | ψ n k (cid:105) . (C1)At the end of the calculation, we can recover the fullwavefunctions with the inverse transformation | ψ n k (cid:105) = | ˜ ψ n k (cid:105) + (cid:88) R i (cid:16) | φ R i (cid:105) − | ˜ φ R i (cid:105) (cid:17) (cid:104) ˜ φ R i | ˜ ψ n k (cid:105) . (C2)In the PAW method, the orthonormality constraint im-posed to the bases {| φ R i (cid:105)} and {| ˜ φ R i (cid:105)} is relaxed into acompleteness constraint. This gives additional degrees offreedom to make the basis {| ˜ φ R i (cid:105)} even smoother. How-ever, this implies that (cid:104) ˜ φ R i | ˜ φ R j (cid:105) = I R ij is not in generalthe identity matrix δ ij and thus (cid:80) i | ˜ φ R i (cid:105) (cid:104) ˜ φ R i | is not theidentity operator.To solve this issue, we introduce the projectors | ˜ p R i (cid:105) ,defined by (cid:104) ˜ p R i | ˜ φ R j (cid:105) = (cid:52) δ ij , (C3)which allows to express the identity as (cid:80) i | ˜ φ R i (cid:105) (cid:104) ˜ p R i | andthus Eq. (C2) becomes | ψ n k (cid:105) = | ˜ ψ n k (cid:105) + (cid:88) R i (cid:16) | φ R i (cid:105) − | ˜ φ R i (cid:105) (cid:17) (cid:104) ˜ p R i | ˜ ψ n k (cid:105) , (C4)= (cid:52) ˆ τ | ˜ ψ n k (cid:105) , (C5)where ˆ τ is a linear, but not unitary, transformation.For PAW to be computationally advantageous, the | ˜ ψ n k (cid:105) (and not the | ψ n k (cid:105) ) must be the variational pa-rameter used in the calculation. Ideally, one should avoidusing the | ψ n k (cid:105) whenever possible, and thus directly de-duce the desired observables O from the | ˜ ψ n k (cid:105) O n k = (cid:104) ψ n k | ˆ O | ψ n k (cid:105) , = (cid:104) ˜ ψ n k | ˆ τ † ˆ O ˆ τ | ˜ ψ n k (cid:105) , = (cid:52) (cid:104) ˜ ψ n k | ˆ˜ O | ˜ ψ n k (cid:105) , (C6)where the transformed operator ˆ˜ O takes the formˆ˜ O = (cid:32) (cid:88) R i | ˜ p R i (cid:105) (cid:16) (cid:104) φ R i | − (cid:104) ˜ φ R i | (cid:17)(cid:33) ˆ O (cid:88) R (cid:48) j (cid:16) | φ R (cid:48) j (cid:105) − | ˜ φ R (cid:48) j (cid:105) (cid:17) (cid:104) ˜ p R (cid:48) j | , = ˆ O + (cid:88) R ij | ˜ p R i (cid:105) (cid:16) (cid:104) φ R i | ˆ O | φ R j (cid:105) − (cid:104) ˜ φ R i | ˆ O | ˜ φ R j (cid:105) (cid:17) (cid:104) ˜ p R j | . (C7)9The last relation is valid only for semi-local operatorsˆ O (that is, operators which depend only locally on thevalue and derivatives with respect to r of the wavefunc-tions). Moreover, it supposes that the bases {| φ R i (cid:105)} and {| ˜ φ R i (cid:105)} are complete and that the augmentation regionsΩ R do not overlap. However, in practical calculations,small bases usually suffice to achieve good accuracy anda small overlap of the augmentation regions can be tol-erated without substantial loss of precision.In the case of periodic solids, we wish to use the | ˜ u n k (cid:105) (instead of the | ˜ ψ n k (cid:105) ) directly as the variational param-eter. To do so, we must transform Sch¨odinger’s equation(Eq. (4)) to its PAW version. Starting from Eq. (1), thenusing Eqs. (C5), (C6) and, finally, applying Eqs. (3), (5)to the PAW quantities, we obtainˆ H | ψ n k (cid:105) = ε n k | ψ n k (cid:105) , ⇒ ˆ˜ H | ˜ ψ n k (cid:105) = ε n k ˆ˜1 | ˜ ψ n k (cid:105) , ⇒ ˆ˜ H k | ˜ u n k (cid:105) = ε n k ˆ˜1 k | ˜ u n k (cid:105) , (C8)where we have defined the overlap operatorˆ˜1= (cid:52) ˆ τ † ˆ τ , (C9)and where the k -dependent operators ˆ˜ H k and ˆ˜1 k havethe following form, following Eq. (C7)ˆ˜ H k = ˆ H k + (cid:88) R ij e − i k · ˆ r | ˜ p R i (cid:105) D R ij (cid:104) ˜ p R j | e i k · ˆ r (C10)= (cid:52) ˆ H k + ˆ D k , (C11)ˆ˜1 k = 1 + (cid:88) R ij e − i k · ˆ r | ˜ p R i (cid:105) S R ij (cid:104) ˜ p R j | e i k · ˆ r , (C12)with D R ij = (cid:52) (cid:16) (cid:104) φ R i | ˆ H | φ R j (cid:105) − (cid:104) ˜ φ R i | ˆ H | ˜ φ R j (cid:105) (cid:17) , (C13) S R ij = (cid:52) (cid:16) (cid:104) φ R i | φ R j (cid:105) − (cid:104) ˜ φ R i | ˜ φ R j (cid:105) (cid:17) . (C14)As stated by Eqs. (20) and (21) for the norm-conserving case and as we demonstrate in Section III (seeEqs. (64) and (63)) for the PAW case, we need the firstand second-order derivatives of ˆ˜ H k and ˆ˜1 k to computethe effective masses. From Eqs. (C13), (C14), and thefact that we use a plane wave basis set, this means thatwe need expressions for (cid:104) G | ˆ D k | G (cid:48) (cid:105) and (cid:104) G | ˆ˜1 k | G (cid:48) (cid:105) . Weobtain (cid:104) G | ˆ D k | G (cid:48) (cid:105) = (cid:88) R ij (cid:104) G | e − i k · ˆ r | ˜ p R i (cid:105) D R ij (cid:104) ˜ p R j | e i k · ˆ r | G (cid:48) (cid:105) = (cid:88) R ij (cid:104) k + G | ˜ p R i (cid:105) D R ij (cid:104) ˜ p R j | k + G (cid:48) (cid:105) . (C15)Since the projectors | ˜ p R i (cid:105) stem from Schr¨odinger’s equa-tion with a spherical potential , they can be expressed as the product of their radial part times spherical har-monics (cid:104) r | ˜ p R i (cid:105) = (cid:52) ˜ P R i ( s ) s Y l i m i (ˆ s ) , (C16)where we have defined ˜ P R i ( s ), where s = (cid:52) r − R and where Y lm are the spherical harmonics. Thus, we can calculate (cid:104) k + G | ˜ p R i (cid:105)(cid:104) k + G | ˜ p R i (cid:105) = (cid:90) s c ds s ˜ P R i ( s ) s (cid:90) d ˆ s Y l i m i (ˆ s ) e − i ( k + G ) · ( s + R ) , (C17)where s c is the radius of the augmentation regions Ω R and ˆ s is the normalized version of s . This can be simpli-fied, using the identity e i k · s = 4 π ∞ (cid:88) l =0 i l j l ( ks ) l (cid:88) m = − l Y lm (ˆ s ) Y lm (ˆ q ) , (C18)where j l ( ks ) are spherical Bessel functions. Eq. (C17)thus becomes (cid:104) K | ˜ p R i (cid:105) = 4 πi l i e − i K · R Y l i m i ( ˆ K ) (cid:90) s c ds s ˜ P R i ( s ) j l i ( Ks ) , (C19)= (cid:52) πi l i e − i K · R Y l i m i ( ˆ K ) ˜ P R i ( K ) , (C20)= (cid:52) πi l i e − i K · R ¯ P R i ( K ) , (C21)where K = (cid:52) k + G . Substituting Eq. (C21) into Eq. (C15)gives (cid:104) G | ˆ D k | G (cid:48) (cid:105) = (cid:88) R ij πi l i ¯ P R i ( K ) e − i ( (cid:1) k + G ) · R D R ij e i ( (cid:1) k + G (cid:48) ) · R ¯ P ∗ R j ( K (cid:48) )4 π ( i l j ) ∗ , = (cid:88) R ij (cid:104) K | ¯ p R i (cid:105) D R ij (cid:104) ¯ p R j | K (cid:48) (cid:105) , (C22)where we have defined (cid:104) K | ¯ p R i (cid:105) = (cid:52) πi l i ¯ P R i ( K ) e − i G · R . (C23)Together, Eqs. (C19)-(C22) form the starting point forthe calculation of the derivatives of the non-local part ofthe Hamiltonian ˆ D k in Section II C. Appendix D: Derivation of Eq. (54) from Eq. (50)
Substituting Eq. (51) in Eq. (50), using Eq. (52), sim-plifying using Eq. (28), using Eq. (47), and invoking the0degeneracy at 1st order ( ε αnn (cid:48) k = ε α { d } k δ nn (cid:48) ) yields ε αβnn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H αβ k − ε n k + ε n (cid:48) k αβ k | ˜ u n k (cid:105) + (cid:16) (cid:104) ˆ˜ Q k ˜ u αn (cid:48) k | ˆ˜ H β k − ε n k + ε n (cid:48) k β k | ˜ u n k (cid:105) + (cid:104) ˆ˜ Q k ˜ u αn (cid:48) k | ˆ˜ H k − ε n k + ε n (cid:48) k k | ˆ˜ Q k ˜ u βn k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k α k | ˆ˜ Q k ˜ u βn k (cid:105) + (cid:104) ˜ u αn (cid:48) k | ˆ˜1 k | ˜ u n k (cid:105) ε β { d } k + ε α { d } k (cid:104) ˜ u n (cid:48) k | ˆ˜1 k | ˜ u βn k (cid:105) (cid:17) + α ↔ β. (D1)It is more convenient to reformulate the second linefrom the end of this expression so that it can be mergedwith the second and fourth lines. To do so, we rearrangethe terms included in α ↔ β and invoke the degeneracyto first order ε βn k = ε βn (cid:48) k ∀ n, n (cid:48) ∈ { d } , so that Eq. (46)can be used, and obtain (cid:16) (cid:104) ˜ u αn (cid:48) k | ˆ˜1 k | ˜ u n k (cid:105) ε βn k + ε αn (cid:48) k (cid:104) ˜ u n (cid:48) k | ˆ˜1 k | ˜ u βn k (cid:105) (cid:17) + α ↔ β = − (cid:16) (cid:104) ˜ u n (cid:48) k | ˆ˜1 α k | ˜ u n k (cid:105) ε βn k + ε αn (cid:48) k (cid:104) ˜ u n (cid:48) k | ˆ˜1 β k | ˜ u n k (cid:105) (cid:17) + α ↔ β, = − (cid:16) (cid:104) δ ˜ u αn (cid:48) k | ˆ˜ H β k − ε n k + ε n (cid:48) k β k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k α k | δ ˜ u βn k (cid:105) (cid:17) + α ↔ β, (D2) where we have defined | δ ˜ u αn k (cid:105) = (cid:52) (cid:88) n (cid:48) ∈{ d } | ˜ u n (cid:48) k (cid:105) (cid:104) ˜ u n (cid:48) k | ˆ˜1 α k | ˜ u n k (cid:105) , (D3)and where we have used ε αnn (cid:48) k = ε α { d } k δ nn (cid:48) , Eq. (47), andEq. (52) for the last equality of Eq. (D2). Substitutingthis result in Eq. (D1) and using the fact that we canadd any component within the degenerate subspace tothe wavefunctions on the third line of this equation (sincethey don’t contribute to the final result as per Eq. (28)),we obtain ε αβnn (cid:48) k = (cid:104) ˜ u n (cid:48) k | ˆ˜ H αβ k − ε n k + ε n (cid:48) k αβ k | ˜ u n k (cid:105) + (cid:16) (cid:104) ˆ˜ Q k ˜ u αn (cid:48) k − δ ˜ u αn k | ˆ˜ H β k − ε n k + ε n (cid:48) k β k | ˜ u n k (cid:105) + (cid:104) ˜ u n (cid:48) k | ˆ˜ H α k − ε n k + ε n (cid:48) k α k | ˆ˜ Q k ˜ u βn k − δ ˜ u βn k (cid:105) + (cid:104) ˆ˜ Q k ˜ u αn (cid:48) k − δ ˜ u αn k | ˆ˜ H k − ε n k + ε n (cid:48) k k | ˆ˜ Q k ˜ u βn k − δ ˜ u βn k (cid:105) (cid:17) + α ↔ β. (D4)Eq. (D4) is the intermediate point (see Eq. (54)) fromwhich the final expression for ε αβnn (cid:48) k (Eq. (64)) is obtainedin Section III A. Appendix E: ‘Transport equivalent effective masstensor’ ¯ M n k from band curvature f n k ( θ, φ ) We summarize in this appendix the idea of Ref. 22,which associates to f n k ( θ, φ ) a ‘transport equivalent masstensors’ ¯ M n k that generates the same contribution totransport properties. The association holds within therelaxation time approximation to Boltzmann’s transportequation with an energy dependent relaxation time of the form τ n k ( ε ) = U n τ n k ( ε ), i.e. where the en-ergy dependence can be factored out of the tensor. Italso requires that ¯ M n k be calculated at a band extrema( ε αnn (cid:48) k = 0). In the current appendix, we generalize theoriginal demonstration to an energy dependent relaxationtime τ n k ( ε ) (where the energy dependence may not befactored out of the tensor) but specialize to the case ofconductivity σ for concision.With these assumptions, Boltzmann’s transport equa-1tion becomes g n k = f ( T, ε n k − µ ) − ∂f∂ε (cid:12)(cid:12)(cid:12)(cid:12) ε n k (cid:16) τ n k ( ε n k ) v n k (cid:17) · (cid:18) − e E − ∇ µ − ε n k − µT ∇ T (cid:19) , (E1)with T the temperature, µ the chemical potential, f ( T, ε − µ ) the Fermi-Dirac distribution, v n k the elec-tronic velocity, − e the electronic charge, E the electricfield, and g n k the (out of equilibrium) occupation num-bers of the electrons. We can then calculate the resultingcurrent density j = − e (cid:88) n (cid:90) d k π v n k g n k , (E2) then deduce the conductivity σ = ∂ j ∂ E = − e (cid:88) n (cid:90) d k π ∂f∂ε (cid:12)(cid:12)(cid:12)(cid:12) ε n k v n k v Tn k τ Tn k ( ε n k ) . (E3)Since we are at a band extrema (located at k ), we havea dispersion of the form ε n k + q = ε n k + f n k ( θ, φ ) q . (E4)We can now obtain v n k from the band curvature f n k ( θ, φ ) v n k = ∂ε n k ∂ k = q v n k (ˆ q ) , (E5)where ˆ q is the unit vector along the direction θ, φ inspherical coordinates and where the quantity ¯ v n k (ˆ q )takes the following form in Cartesian coordinates¯ v n k (ˆ q )= (cid:52) f n k ( θ, φ ) sin( θ ) cos( φ ) + ∂f n k ∂θ cos( θ ) cos( φ ) − ∂f n k ∂φ sin( φ )sin( θ ) f n k ( θ, φ ) sin( θ ) sin( φ ) + ∂f n k ∂θ cos( θ ) sin( φ ) + ∂f n k ∂φ cos( φ )sin( θ ) f n k ( θ, φ ) cos( θ ) − ∂f n k ∂θ sin( θ ) . (E6)Supposing that the parabolic dispersion of the bandsholds wherever ∂f∂ε is non-negligible, the conductivity σ takes the following form σ = − e (cid:88) n (cid:90) d k π ∂f∂ε (cid:12)(cid:12)(cid:12)(cid:12) ε n k q v n k (ˆ q ) q v Tn k (ˆ q ) τ Tn k ( ε n k ) , (E7)which, using the substitution ε = (cid:52) ε n k + sign( ε n k − µ ) | f n k ( θ, φ ) | q , (E8)can be split into a product σ = (cid:88) n (cid:88) k C n k K n k , (E9)where the sum over k runs over the different extrema ofband n . This product distinguishes the integral over theenergy ε K n k = (cid:52) − e / π sign( ε n k − µ ) (cid:90) sign( ε n k − µ ) ∞ ε n k dε | ε − ε n k | / ∂f∂ε τ Tn k ( ε ) (E10)and the integral over the spherical angles ( θ, φ ) C n k = (cid:52) (cid:90) π dφ (cid:90) π dθ sin( θ ) ¯ v n k ( θ, φ )¯ v Tn k ( θ, φ )2 | f n k ( θ, φ ) | / . (E11) When f n k ( θ, φ ) is an ellipsoid, i.e. when the band dis-persion can be described by an effective mass tensor M n k ,and if we choose the Cartesian axes to be along the el-lipsoid principal axes M n k = m n k x m n k y
00 0 m n k z , (E12)then we have the relation f n k ( θ, φ ) = ˆ q T ( θ, φ ) M n k ˆ q ( θ, φ ) . (E13)Calculating f n k ( θ, φ ) in terms of m n k x , m n k y , m n k z , θ, and φ , substituting in ¯ v n k ( θ, φ ) (Eq. (E6)), then into C n k (Eq. (E11)) and finally carrying out analytically theintegration over θ, φ yields[ C n k ] ij = 8 π √ m n k x m n k y m n k z m n k i δ ij , (E14)which allows to deduce m n k i from C n k m n k i = [ C n k ] jj [ C n k ] kk (8 π/ ; i (cid:54) = j (cid:54) = k (cid:54) = i. (E15)2 Appendix F: ‘Transport equivalent effective masstensor’ ¯ M n k from band curvature f n k ( θ, φ ) in 2D When m n k z → ∞ , we observe from Eq. (E14) thatthe x and y matrix elements of C n k diverge. Thus, theprocedure do find ¯M n k described in Appendix E becomesnumerically unstable for 2D systems. To solve this issue,we adapt the formalism of Appendix E to the 2D context.The conductivity σ then becomes σ = (cid:88) n (cid:88) k C n k K n k , (F1)with the energy part K n k = (cid:52) − e π (cid:90) ±∞ ε n k ± dε | ε − ε n k | ∂f∂ε τ Tn k ( ε ) (F2)and the angular part C n k = (cid:52) (cid:90) π dφ ¯ v n k ( φ )¯ v Tn k ( φ )2 | f n k ( φ ) | , (F3)where¯ v n k ( φ )= (cid:52) (cid:32) f n k ( φ ) cos( φ ) − ∂f n k ∂φ sin( φ )2 f n k ( φ ) sin( φ ) + ∂f n k ∂φ cos( φ ) (cid:33) , (F4)and where σ and τ n k ( ε ) are 2D tensors.When f n k ( φ ) is an ellipse, i.e. when the band disper-sion can be described by an effective mass tensor M n k ,and if we choose the Cartesian axes to be along the prin-cipal axes M n k = (cid:32) m n k x m n k y (cid:33) , (F5)then we have the relation f n k ( φ ) = ˆ q T ( φ ) M n k ˆ q ( φ ) . (F6)We deduce from Eqs. (F3) and (F4) the tensor C n k re-sulting from Eqs. (F5) and (F6)[ C n k ] ij = 2 π √ m n k x m n k y m n k i δ ij . (F7)A distinct feature of Eq. (F7) with respect to the 3D case(Eq. (E14)) is that C n k is determined from the ratioof m n k x and m n k y only and is not influenced by theirmagnitude.It is easier to get an intuitive understanding of thisfact if we consider a one-band system with a minimumat Γ with M = m ∗ , τ = τ ⇒ σ = σ , and work ata constant Fermi energy with respect to the extremum(minimum or maximum). We then have σ = ne τm ∗ , (F8) with n = (cid:52) − (cid:90) d k (2 π ) D ∂f∂ε ∝ ( m ∗ ) D/ , (F9)with D the dimensionality of the system considered. Wesee that for the specific case of 2D systems, a cancellationoccurs between the carrier density n and the effectivemass m ∗ in Eq. (F8). Therefore, m ∗ does not influencethe conductivity σ in 2D or, in the more general case ofEq. (F3), rescaling f n k ( φ ) (or, equivalently, ¯M n k ) doesnot influence C n k . Thus, the scale of ¯M n k can be setarbitrarily.Reciprocally, ¯M n k does not influence the scale of C n k ,as per Eq. (F7). This feature of 2D tensorial effectivemasses does not hold true for general (i.e. warped) f n k ( φ ). There is therefore one degree of freedom of C n k (its scale) that ¯M n k fails to determine. Care must there-fore be taken when one computes transport quantitiesfrom ¯M n k in 2D. Once C n k has been diagonalized, onemust extract its scaling c n k U Tn k C n k U n k = (cid:32) C n k x C n k y (cid:33) = 2 πc n k (cid:113) C n k x C n k y (cid:113) C n k y C n k x (F10)and preserve this information. Then, substituting c n k → C n k compatible with atensorial effective mass, which allows direct comparisonwith Eq.(F7) (cid:114) C n k y C n k x = (cid:114) m n k x m n k y , (F11)which still leaves the scale of ¯M n k undetermined.Within this implementation, we choose to set the av-erage curvature associated with ¯M n k (through Eq. (F6))to the average curvature of the associated band extrema¯ f n k (cid:18) m n k x + 1 m n k y (cid:19) = ¯ f n k = (cid:52) π (cid:90) π dφf n k ( φ ) , (F12)so that we recover ¯M n k = M n k when there is no warp-ing (i.e. when the effective mass can be described by atensor). This allows to set specific values for m n k x and m n k y m n k x = 12 ¯ f n k (cid:18) C n k y C n k x (cid:19) ; (F13) m n k y = m n k x C n k x C n k y . (F14)As discussed above Eq. (F10), when these values are usedto obtain a transport quantity, one must remember to3multiply the final result by the scaling factor c n k obtainedin Eq. (F10) c n k = (cid:52) (cid:112) C n k x C n k y π , (F15)since tensorial ¯M n k are unable to account for it. For instance, rather than using Eq. (F7), which applies onlyto tensorial effective masses (i.e. to non-warped bands),one should substitute Eq. (F11) into (F10) U Tn k C n k U n k = 2 πc n k (cid:113) m n k y m n k x (cid:113) m n k x m n k y . (F16) C. Kittel,
Introduction to Solid State Physics (Wiley,2004). N. W. Ashcroft and N. D. Mermin,
Solid state physics (Harcourt College Publishers, Orlando, 1976). M. P. Marder,
Condensed Matter Physics (John Wiley &Sons, Hoboken, NJ, USA, 2010). P. Yu and M. Cardona,
Fundamentals of Semiconductors ,Physics and Materials Properties (Springer Science & Busi-ness Media, 2010). M. R. Filip, C. Verdi, and F. Giustino, Journal of PhysicalChemistry C , 25209 (2015). G. Hautier, A. Miglio, D. Waroquiers, G.-M. Rignanese,and X. Gonze, Chemistry of Materials , 5447 (2014). G. Hautier, A. Miglio, G. Ceder, G.-M. Rignanese, andX. Gonze, Nature Communications , 2292 (2013). Y.-S. Kim, M. Marsman, G. Kresse, F. Tran, and P. Blaha,Physical Review B , 205212 (2010). Y.-S. Kim, K. Hummer, and G. Kresse, Physical ReviewB , 035203 (2009). A. N. Chantis, M. van Schilfgaarde, and T. Kotani, Phys-ical Review Letters , 086405 (2006). C. B. Geller, W. Wolf, S. Picozzi, A. Continenza, R. Asahi,W. Mannstadt, A. J. Freeman, and E. Wimmer, AppliedPhysics Letters , 368 (2001). S. Wang, Z. Wang, W. Setyawan, N. Mingo, and S. Cur-tarolo, Physical Review X , 021012 (2011). K. Hummer, A. Gr¨uneis, and G. Kresse, Physical ReviewB , 195211 (2007). G. K. H. Madsen and D. J. Singh, Computer Physics Com-munications , 67 (2006). R. M. Martin,
Electronic Structure , Basic Theory andPractical Methods (Cambridge University Press, Cam-bridge, UK, 2004). J. R. Yates, X. Wang, D. Vanderbilt, and I. Souza, Phys-ical Review B , 195121 (2007). J. M. Luttinger and W. Kohn, Physical Review , 869(1955). G. Dresselhaus, A. F. Kip, and C. Kittel, Physical Review , 368 (1955). E. Kane, Journal of Physics and Chemistry of Solids , 82(1956). E. Kane, Journal of Physics and Chemistry of Solids , 249(1957). See Eq. (63) of Ref. 18 or Eq. (82) of the present work. N. A. Mecholsky, L. Resca, I. L. Pegg, and M. Fornari,Physical Review B , 155131 (2014). S. Baroni, P. Giannozzi, and A. Testa, Physical ReviewLetters , 1861 (1987). X. Gonze, Physical Review A , 1096 (1995). X. Gonze, Physical Review B , 10337 (1997). X. Gonze and C. Lee, Physical Review B , 10355 (1997). P. Ghosez and X. Gonze, Journal of Physics-Condensed Matter , 9179 (2000). P. Umari, X. Gonze, and A. Pasquarello, Physical ReviewB , 235102 (2004). M. Bernasconi, G. L. Chiarotti, P. Focher, S. Scandolo,E. Tosatti, and M. Parrinello, Journal of Physics andChemistry of Solids , 501 (1995). X. Gonze, B. Amadon, P. M. Anglade, J. M. Beuken,F. Bottin, P. Boulanger, F. Bruneval, D. Caliste, R. Cara-cas, M. Cˆot´e, T. Deutsch, L. Genovese, P. Ghosez, M. Gi-antomassi, S. Goedecker, D. R. Hamann, P. Hermet,F. Jollet, G. Jomard, S. Leroux, M. Mancini, S. Mazevet,M. J. T. Oliveira, G. Onida, Y. Pouillon, T. Rangel, G. M.Rignanese, D. Sangalli, R. Shaltaf, M. Torrent, M. J. Ver-straete, G. Zerah, and J. W. Zwanziger, Computer PhysicsCommunications , 2582 (2009). D. R. Hamann, M. Schluter, and C. Chiang, Physical Re-view Letters , 1494 (1979). P. E. Bl¨ochl, Physical Review B , 17953 (1994). D. Vanderbilt, Physical Review B , 7892 (1990). More specifically, we generalize their Eq. (IV.9). X. Gonze, F. Jollet, F. Abreu Araujo, D. Adams,B. Amadon, T. Applencourt, C. Audouze, J. M. Beuken,J. Bieder, A. Bokhanchuk, E. Bousquet, F. Bruneval,D. Caliste, M. Cˆot´e, F. Dahm, F. Da Pieve, M. Delaveau,M. Di Gennaro, B. Dorado, C. Espejo, G. Geneste,L. Genovese, A. Gerossier, M. Giantomassi, Y. Gillet,D. R. Hamann, L. He, G. Jomard, J. Laflamme Janssen,S. Le Roux, A. Levitt, A. Lherbier, F. Liu, I. Lukace-vie, A. Martin, C. Martins, M. J. T. Oliveira, S. Ponc´e,Y. Pouillon, T. Rangel, G. M. Rignanese, A. H. Romero,B. Rousseau, O. Rubel, A. A. Shukri, M. Stankovski,M. Torrent, M. J. van Setten, B. Van Troeye, M. J. Ver-straete, D. Waroquiers, J. Wiktor, B. Xu, A. Zhou, andJ. W. Zwanziger, “Recent developments in the ABINITsoftware package,” (2016), accepted for publication inComputer Physics Communications. R. Feynman and R. P. Feynman, Physical Review , 340(1939). While Eq. (12) do not look symmetric with respect to αβ at first glance, using Eqs. (13) and (17) along with the factthat | ˆ P k u n k (cid:105) does not contribute to ε αβn k (see Eq. (14) andassociated discussion) reveals that the expression is indeedsymmetric. M. R. Hestenes and E. Stiefel, Journal of Research of theNarional Bureau of Standards , 2379 (1952). G. H. Golub and C. F. Van Loan,
Matrix Computations (The Johns Hopkins University Press, London, UK, 2012). X. Gonze, P. Boulanger, and M. Cˆot´e, Annalen der Physik , 168 (2011). C. Audouze, F. Jollet, M. Torrent, and X. Gonze, PhysicalReview B , 035105 (2008). K. Miwa, Physical Review B , 094304 (2011). A. Dal Corso, Physical Review B , 235118 (2001). For convenience, we still summarize the definitions here: K = (cid:52) k + G , ˆ K is the unit vector in the direction of K , s = (cid:52) r − R , Y lm are the spherical harmonics, j l ( ks ) are sphericalBessel functions, s c is the radius of the PAW augmentationregions (see Eq. (C17)), ˜ P R i ( s ) is defined in Eq. (C16)(see also Eq. (C19)), ˜ P R i ( K ) is defined in Eq. (C20), and (cid:104) K | ¯ p R i (cid:105) differs from (cid:104) K | ˜ p R i (cid:105) only trough the substitution K → G in Eq. (C19) (see Eqs. (C19)-(C23)). A. Dal Corso, Physical Review B , 085135 (2012). M. J. Verstraete, M. Torrent, F. Jollet, G. Zerah, andX. Gonze, Physical Review B , 045119 (2008). C. Audouze, F. Jollet, M. Torrent, and X. Gonze, PhysicalReview B , 235101 (2006). B. Fornberg, Mathematics of computation , 699 (1988). J. P. Perdew and Y. Wang, Phys. Rev. , 13244 (1992). L. E. Ramos, L. K. Teles, L. M. R. Scolfaro, J. L. P.Castineira, A. L. Rosa, and J. R. Leite, Physical ReviewB , 165210 (2001). R. N. Dexter and B. Lax, Physical Review , 223 (1954). J. C. Hensel, H. Hasegawa, and M. Nakayama, PhysicalReview , A225 (1965). M. Willatzen and L. C. Lew Yan Voon,
The k p Method (Springer Berlin Heidelberg, Berlin, Heidelberg, 2009). E. Vishnyakova, B. E. Brinson, L. B. Alemany, M. Verma,and W. E. Billups, Chemistry-A European Journal ,1452 (2016). M. Inagaki and F. Kang, Journal of Materials ChemistryA , 13193 (2014). C. Zhou, S. Chen, J. Lou, J. Wang, Q. Yang, C. Liu,D. Huang, and T. Zhu, Nanoscale Research Letters ,26 (2014). O. D. Restrepo, K. E. Krymowski, J. Goldberger, andW. Windl, New Journal of Physics , 105009 (2014). C. He, C. X. Zhang, L. Z. Sun, N. Jiao, K. W. Zhang, andJ. Zhong, Physica Status Solidi - Rapid Research Letters , 427 (2012). R. W. G. Wyckoff,
Crystal Structures (Interscience, 1960). L. M. Falicov and S. Golin, Physical Review , A871(1965). P. J. Lin and L. M. Falicov, Physical Review , 441(1966). P. K. Silas, P. D. Haynes, and J. R. Yates, Physical ReviewB , 134103 (2013). M. G. Priestley, L. R. Windmiller, J. B. Ketterson, andY. Eckstein, Physical Review , 671 (1967). C. S. Ih and D. N. Langenberg, Physical Review B , 1425(1970). G. S. Cooper and A. W. Lawson, Physical Review B ,3261 (1971). X. Gonze, J. P. Michenaud, and J. P. Vigneron, PhysicalReview B , 11827 (1990). S. Ponc´e, Y. Gillet, J. Laflamme Janssen, A. Marini,M. Verstraete, and X. Gonze, The Journal of ChemicalPhysics , 102813 (2015). G. Antonius, S. Ponc´e, E. Lantagne-Hurtubise, G. Auclair,X. Gonze, and M. Cˆot´e, Physical Review B , 085137(2015). S. Ponc´e, G. Antonius, Y. Gillet, P. Boulanger,J. Laflamme Janssen, A. Marini, M. Cˆot´e, and X. Gonze,Physical Review B , 214304 (2014). G. Antonius, S. Ponc´e, P. Boulanger, M. Cˆot´e, andX. Gonze, Physical Review Letters , 215501 (2014). S. Ponc´e, G. Antonius, P. Boulanger, E. Cannuccia,A. Marini, M. Cˆot´e, and X. Gonze, Computational Mate-rials Science , 341 (2014). P. B. Allen and V. Heine, Journal of Physics C-Solid StatePhysics , 2305 (1976). P. B. Allen and M. Cardona, Physical Review B , 1495(1981). P. Gomes Dacosta, O. H. Nielsen, and K. Kunc, Journalof Physics C-Solid State Physics19