Analysis of Kohn-Sham Eigenfunctions Using a Convolutional Neural Network in Simulations of the Metal-insulator Transition in Doped Semiconductors
Yosuke Harashima, Tomohiro Mano, Keith Slevin, and Tomi Ohtsuki

Institute of Materials and Systems for Sustainability, Nagoya University, Nagoya, Aichi 464-8601, Japan
Physics Division, Sophia University, Chiyoda, Tokyo 102-8554, Japan
Department of Physics, Osaka University, Toyonaka, Osaka 560-0043, Japan

(Dated: September 8, 2020)

Machine learning has recently been applied to many problems in condensed matter physics. A common aim of many proposals is to save computational cost by training the machine with data from a simple example and then using the machine to make predictions for a more complicated example. Convolutional neural networks (CNN), which are one of the tools of machine learning, have proved to work well for assessing eigenfunctions in disordered systems. Here we apply a CNN to assess Kohn-Sham eigenfunctions obtained in density functional theory (DFT) simulations of the metal-insulator transition of a doped semiconductor. We demonstrate that a CNN that has been trained using eigenfunctions from a simulation of a doped semiconductor that neglects electron spin successfully predicts the critical concentration when presented with eigenfunctions from simulations that include spin.
I. INTRODUCTION
Machine learning has proven to be very useful in condensed matter physics. It has been applied to the classification of phases in spin systems, interacting systems and disordered systems, as well as topological systems. There are also suggestions to use machine learning for calculating atomistic potentials and performing materials search.

Recent work on the Anderson transition in three dimensions has shown that a CNN can detect the transition point from the spatial profile of the eigenfunction intensity. Though the precision of the estimate of the critical point was less than that achieved with finite size scaling (FSS), the CNN predicted the critical point from the simulation of only a single system size. It also showed generalisation capability; once trained for Anderson's model of localisation, it was successfully applied to quantum percolation without further training. This suggests that a suitably trained CNN might also be usefully applied to study other transitions, for example, the metal-insulator transition in doped semiconductors.

A metal-insulator transition as a function of doping concentration is observed in numerous semiconductors when they are doped with impurities. This transition is thought to be a zero temperature continuous quantum phase transition in which both disorder due to the random positions of the impurities in the semiconductor and interactions between electrons play important roles.

Recently there have been attempts to better understand this transition, and, in particular, its critical phenomena, by studying the FSS of multifractal measures calculated from eigenfunctions obtained by using DFT simulations of doped semiconductors. In these studies, it has been found that the eigenfunction at the Fermi level is Anderson localised at low doping concentration but becomes delocalised when the doping concentration is sufficiently high. Supposing this coincides with the metal-insulator transition in the doped semiconductor, this provides an estimate of the critical concentration.

In Refs. 30 and 31 the role of electron spin was ignored. However, the true spin configuration is expected to be paramagnetic with the local magnetic moment randomly distributed. Unfortunately, DFT calculations which include the electron spin are considerably more time-consuming than calculations for spinless electrons. It would save computational time if data from simulations with spinless electrons could be used to train the CNN. However, this would only be useful if such a CNN has the necessary generalisation capability.

In this paper, we demonstrate that a CNN that has been trained using Kohn-Sham eigenfunctions for spinless electrons successfully predicts the critical concentration when presented with Kohn-Sham eigenfunctions obtained in calculations that include spin. We also check whether such a CNN successfully predicts the critical concentration when presented with Kohn-Sham eigenfunctions from simulations with spinless electrons for compensated semiconductors, but find that it does not.

II. MODELS AND METHODS
The metal-insulator transition in a doped semiconductor can be studied theoretically, at a certain level of approximation, using a model in which electrons with an effective mass $m^{*}$ move in an effective medium with relative dielectric constant $\varepsilon_r$. This leads to the consideration of the Hamiltonian

$$H = -\frac{1}{2m^{*}}\sum_{i}\nabla_i^{2} - \frac{1}{\varepsilon_r}\sum_{i,I}\frac{Z_I}{|\vec{r}_i-\vec{R}_I|} + \frac{1}{2\varepsilon_r}\sum_{i\neq j}\frac{1}{|\vec{r}_i-\vec{r}_j|} + \frac{1}{2\varepsilon_r}\sum_{I\neq J}\frac{Z_I Z_J}{|\vec{R}_I-\vec{R}_J|}. \tag{1}$$

Here, $\vec{r}_i$ and $\vec{R}_I$ are the positions of the electrons and the impurity ions, respectively, and $Z_I$ is the ionic charge value, which is $+1$ for donors and $-1$ for acceptors. We consider cubic systems of linear size $L$. In each system, $N_D$ donor impurity ions and $N_A$ acceptor impurity ions are randomly distributed (see Fig. 1). The corresponding concentrations are

$$n_D = N_D / V, \tag{2}$$

and

$$n_A = N_A / V, \tag{3}$$

with

$$V = L^{3} \tag{4}$$

being the volume of the system.

As in previous studies, for a given configuration of the impurities, we attempt to find the ground state of Eq. (1) for

$$N = N_D - N_A \tag{5}$$

electrons, using the Kohn-Sham formulation of DFT. This involves finding the self-consistent solutions of the Kohn-Sham equations,

$$\left(-\frac{1}{2m^{*}}\nabla^{2} + V^{\sigma}_{\mathrm{eff}}\right)\psi^{\sigma}_i(\vec{r}) = \epsilon^{\sigma}_i\,\psi^{\sigma}_i(\vec{r}). \tag{6}$$

Here, $\psi^{\sigma}_i$ are the Kohn-Sham eigenfunctions, $\epsilon^{\sigma}_i$ the Kohn-Sham eigenvalues, and $\sigma$ is the spin. The electron density of the ground state $n(\vec{r})$ is then the sum of the spin resolved electron densities

$$n(\vec{r}) = n^{\uparrow}(\vec{r}) + n^{\downarrow}(\vec{r}), \tag{7}$$

where

$$n^{\sigma}(\vec{r}) = \sum_{i:\,\mathrm{occupied}} \left|\psi^{\sigma}_i(\vec{r})\right|^{2}. \tag{8}$$

The effective potentials $V^{\sigma}_{\mathrm{eff}}$ are the sum of three terms

$$V^{\sigma}_{\mathrm{eff}} = V_{\mathrm{ext}} + V_{\mathrm{Hartree}} + V^{\sigma}_{\mathrm{XC}}. \tag{9}$$

The first term is the external potential due to the impurity ions. The second term is the Hartree potential of the electrons. The third term $V^{\sigma}_{\mathrm{XC}}$ is the exchange-correlation potential

$$V^{\sigma}_{\mathrm{XC}} = \frac{\delta E_{\mathrm{XC}}\left[n^{\uparrow}, n^{\downarrow}\right]}{\delta n^{\sigma}}. \tag{10}$$

Here, $E_{\mathrm{XC}}$ is the exchange-correlation energy, which is a functional of the spin up and spin down electron densities, or equivalently the electron density $n(\vec{r})$ and the spin density

$$\zeta(\vec{r}) = \frac{n^{\uparrow}(\vec{r}) - n^{\downarrow}(\vec{r})}{n(\vec{r})}. \tag{11}$$

We use the local density approximation

$$E_{\mathrm{XC}} \approx E^{\mathrm{LDA}}_{\mathrm{XC}} = \int d^{3}r\; \epsilon_{\mathrm{XC}}\left(n(\vec{r}), \zeta(\vec{r})\right) n(\vec{r}), \tag{12}$$

with the form of $\epsilon_{\mathrm{XC}}$ given in Refs. 36 and 37. Solving these equations self-consistently, we obtain the eigenfunctions $\psi^{\sigma}_i$. We focus on the highest occupied Kohn-Sham eigenfunction, i.e. the occupied eigenstate with the largest eigenvalue. For brevity in what follows we denote this eigenfunction simply as $\psi$.

We train the CNN so that it can correctly distinguish the localised and delocalised phases from the eigenfunction. The input is the intensity $|\psi|^{2}$ and the output is the probability $p_{\mathrm{loc}}$ that the eigenfunction is in the localised phase. We perform supervised training, i.e., we prepare a correctly labelled data set (training data) in advance to optimise the weight parameters of the CNN. The hyper-parameters of the network structure are similar to the ones used in Refs. 3 and 24 for Anderson's model of localisation in three dimensions.
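As a small illustration of Eqs. (7), (8) and (11), the following sketch builds the spin resolved densities and the spin density from arrays of occupied orbitals on a real-space grid. It is only a sketch under stated assumptions: the function name `spin_densities`, the array layout (orbital index first) and the random test orbitals are hypothetical and are not taken from the simulations described here.

```python
import numpy as np


def spin_densities(psi_up, psi_dn):
    """Return (n, zeta) on the grid, following Eqs. (7), (8) and (11).

    psi_up, psi_dn: arrays of shape (n_occupied, nx, ny, nz) holding the
    occupied orbitals of each spin channel (hypothetical layout).
    """
    # Eq. (8): spin resolved density is the sum of |psi|^2 over occupied states.
    n_up = np.sum(np.abs(psi_up) ** 2, axis=0)
    n_dn = np.sum(np.abs(psi_dn) ** 2, axis=0)
    # Eq. (7): total density.
    n = n_up + n_dn
    # Eq. (11): spin density; set to zero where the total density vanishes.
    zeta = np.divide(n_up - n_dn, n, out=np.zeros_like(n), where=n > 0)
    return n, zeta


# Hypothetical test orbitals (random numbers, not Kohn-Sham solutions).
rng = np.random.default_rng(0)
orbitals = rng.normal(size=(3, 4, 4, 4))
no_orbitals = np.zeros((0, 4, 4, 4))

# Fully spin-polarised case: only spin-up orbitals are occupied.
n_pol, zeta_pol = spin_densities(orbitals, no_orbitals)
# Unpolarised case: identical spin-up and spin-down orbitals.
n_unpol, zeta_unpol = spin_densities(orbitals, orbitals)
```

With only one spin channel occupied the spin density equals one everywhere, which is the complete spin-polarisation condition used for the spinless training data; with identical channels it vanishes.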
Training data is prepared by simulating a system of spinless electrons without acceptors, i.e., by solving the Kohn-Sham equations subject to the constraint of complete spin-polarisation

$$\zeta(\vec{r}) = 1 \tag{13}$$

and $N_A = 0$. […] as well as quantum percolation.

In multifractal finite size scaling the system size dependence of the effective multifractal exponent $\tilde{\alpha}$ is analysed. This exponent is defined as

$$\tilde{\alpha} \equiv \frac{\lambda^{3}\langle S\rangle}{\ln\lambda}, \tag{14}$$

where $S$ and $\lambda$ are defined as follows. Calculation of Eq. (14) involves coarse-grained eigenfunction intensities. A three dimensional cubic system of linear size $L$ is divided into boxes (indexed by the label $k$) of linear size $l$ and the eigenfunction intensities are integrated over each box

$$\mu_k \equiv \int_{k} d^{3}r\, \left|\psi(\vec{r})\right|^{2}. \tag{15}$$

The ratio of the box size to the system size is denoted by

$$\lambda \equiv \frac{l}{L}. \tag{16}$$

The quantity $S$ is obtained by summing over all the boxes as follows

$$S \equiv \sum_{k} \ln\mu_k. \tag{17}$$

The angular brackets $\langle\cdots\rangle$ denote an ensemble average.

As discussed in Refs. 38 and 39, for a system with dimensionality $d$ (here $d = 3$) and with $\lambda$ held fixed, we expect $\tilde{\alpha}\to\infty$ as $L\to\infty$ in the localised or insulating phase, while $\tilde{\alpha}\to d$ as $L\to\infty$ in the delocalised or metallic phase. At the critical point, the system size dependence of $\tilde{\alpha}$ disappears. We can, therefore, find the critical point as a crossing between curves for systems of two different sizes $L$ with the box size adjusted such that the value of $\lambda$ is the same for both curves (for example, see Fig. 4).

It should be noted that the true multifractal exponent $\alpha$ is obtained only in the limit that $\lambda\to 0$. This is the reason for the term “effective” (and the tilde) above.

FIG. 1. Example of impurity distributions and Kohn-Sham eigenfunctions. Blue and green dots are donor and acceptor ions, respectively. Red shading indicates the square of the highest occupied Kohn-Sham eigenfunction. On the left a spin-up eigenfunction, and in the centre a spin-down eigenfunction. On the right a Kohn-Sham eigenfunction for a spinless compensated sample.

III. RESULTS AND DISCUSSION
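Since the multifractal estimate defined in Section II serves as the reference for the CNN predictions, a minimal numerical sketch of Eqs. (14)-(17) may be helpful. It assumes the intensity $|\psi|^{2}$ is sampled on a cubic grid whose size is a multiple of the box size; the function names `box_log_sum` and `alpha_tilde`, the grid size and the two test intensities (a uniform state and an exponentially decaying one) are hypothetical choices for illustration, not the data used in this work.

```python
import numpy as np


def box_log_sum(intensity, box):
    """S of Eq. (17) for one eigenfunction intensity on an (n, n, n) grid."""
    intensity = np.asarray(intensity, dtype=float)
    n = intensity.shape[0]
    if intensity.shape != (n, n, n) or n % box != 0:
        raise ValueError("need a cubic grid whose size is a multiple of box")
    nb = n // box
    # Normalise so that the box probabilities mu_k sum to one.
    prob = intensity / intensity.sum()
    # Eq. (15): integrate the intensity over each box of linear size `box`.
    mu = prob.reshape(nb, box, nb, box, nb, box).sum(axis=(1, 3, 5))
    return np.log(mu).sum()


def alpha_tilde(intensities, box):
    """Effective exponent of Eq. (14), averaging S over the given samples."""
    n = np.asarray(intensities[0]).shape[0]
    lam = box / n  # Eq. (16)
    mean_s = np.mean([box_log_sum(w, box) for w in intensities])
    return lam**3 * mean_s / np.log(lam)


# Hypothetical test intensities on a 16^3 grid with boxes of 4^3 points.
n, box = 16, 4
uniform = np.ones((n, n, n))
x = np.arange(n) - n / 2
r = np.sqrt(x[:, None, None] ** 2 + x[None, :, None] ** 2 + x[None, None, :] ** 2)
localised = np.exp(-2.0 * r / 1.5)  # decays away from the grid centre

alpha_uniform = alpha_tilde([uniform], box)
alpha_localised = alpha_tilde([localised], box)
```

For the uniform intensity every box carries weight $\lambda^{3}$, so the sketch returns $\tilde{\alpha} = d = 3$, the metallic limit quoted in Section II, while any non-uniform intensity gives a larger value.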
For ease of comparison with the well studied case of Si, in what follows we set $m^{*} = […]\,m_e$ and $\varepsilon_r = […]$, which are the appropriate values for electrons in Si. However, for Eq. (1), this amounts only to a re-scaling of the units and does not affect the analysis in any fundamental way.

A. Spinless electron model of doped semiconductor
We consider two ways of training the CNN. The first is with a labelled data set of eigenfunctions of Anderson's model of localisation. The second is with a labelled data set of Kohn-Sham eigenfunctions for a spinless model of a doped semiconductor.

For Anderson's model of localisation the training set and the CNN structure are the same as in Ref. 24 except the input system size, which is $42\times 42\times 42$ in the present case. For the doped semiconductor the training set consists of the highest occupied Kohn-Sham eigenfunctions obtained in simulations of 1,000 samples each for doping concentrations of $n_D \approx […]\ \mathrm{cm}^{-3}$, which is in the insulating regime, and $n_D \approx […]\ \mathrm{cm}^{-3}$, which is in the metallic regime, and system size $L \approx […]$ […] the correct critical concentration for the semiconductor. The CNN trained with the model of the doped semiconductor naturally gives the correct critical concentration.

[Figure 2: $p_{\mathrm{loc}}$ versus $n_D$; legend: Anderson model, Doped semiconductor (Spinless); markers at $n_c^{\mathrm{MFA}}$ and $n_c^{\mathrm{CNN}}$.]

FIG. 2. The probability $p_{\mathrm{loc}}$ that a Kohn-Sham eigenfunction is localised as a function of the doping concentration $n_D$. We compare the probabilities reported by two CNNs: one trained with data for Anderson's model of localisation, and the other with data for a spinless uncompensated doped semiconductor. The system size is $L \approx […]$ […] $p_{\mathrm{loc}} = 0.5$. Dashed lines are guides to the eye. The CNN trained with data for a spinless uncompensated doped semiconductor predicts a critical concentration $n_c^{\mathrm{CNN}} \approx […]\ \mathrm{cm}^{-3}$. The critical concentration of multifractal analysis (MFA) $n_c^{\mathrm{MFA}} \approx […]\ \mathrm{cm}^{-3}$ is also shown for comparison.

[Figure 3: $p_{\mathrm{loc}}$ versus $n_D$; marker at $n_c^{\mathrm{CNN}}$.]

FIG. 3. The probability $p_{\mathrm{loc}}$ that a Kohn-Sham eigenfunction is localised as a function of the doping concentration $n_D$ for a system with spin. The CNN used has been trained with data for the spinless uncompensated doped semiconductor. The prediction for the critical concentration $n_c \approx […]\ \mathrm{cm}^{-3}$ is taken as the concentration where $p_{\mathrm{loc}} = 0.5$.

[Figure 4: $\tilde{\alpha}$ versus $n_D$ for two system sizes; marker at $n_c^{\mathrm{MFA}}$.]

FIG. 4. Estimation of the critical concentration for a system with spin using multifractal analysis. Two system sizes were simulated, $L \approx […]$ and $L \approx […]$. […] $n_c \approx […]\ \mathrm{cm}^{-3}$ is in good agreement with the value found using the CNN (see Fig. 3).

B. Model of doped semiconductor including electron spin
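The CNN estimates of the critical concentration are read off as the concentration at which $p_{\mathrm{loc}}$ crosses 0.5. One simple way to locate such a crossing from a set of discrete data points is sketched below. The use of linear interpolation between neighbouring points is an assumption made for illustration, the helper name `crossing_concentration` is hypothetical, and the numbers in the example are invented and are not results of this work.

```python
import numpy as np


def crossing_concentration(n_d, p_loc, level=0.5):
    """Concentration at which p_loc first crosses `level`.

    n_d: concentrations in increasing order; p_loc: ensemble averaged
    probabilities at those concentrations. Linear interpolation between the
    two bracketing points is an assumption of this sketch.
    """
    n_d = np.asarray(n_d, dtype=float)
    diff = np.asarray(p_loc, dtype=float) - level
    # Indices i where the curve crosses (or touches) the level in [i, i + 1].
    idx = np.nonzero(diff[:-1] * diff[1:] <= 0.0)[0]
    if idx.size == 0:
        raise ValueError("p_loc does not cross the requested level")
    i = idx[0]
    if diff[i] == diff[i + 1]:
        return n_d[i]  # both points lie exactly on the level
    t = diff[i] / (diff[i] - diff[i + 1])
    return n_d[i] + t * (n_d[i + 1] - n_d[i])


# Invented example data in arbitrary concentration units.
example_n_d = [1.0, 2.0, 3.0, 4.0]
example_p_loc = [0.9, 0.7, 0.3, 0.1]
n_c_example = crossing_concentration(example_n_d, example_p_loc)
```

Applied to the difference of two $\tilde{\alpha}$ curves with `level=0.0`, the same helper would locate the crossing used in the multifractal analysis.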
We use a CNN, trained as before with a labelled data set for a spinless model of a doped semiconductor, to assess Kohn-Sham eigenfunctions obtained from a model of a doped semiconductor that includes spin, that is, a model in which the condition of complete spin polarisation, Eq. (13), is removed. Since the spin configuration must also be optimised, more iterations are required to find the self-consistent solutions of the Kohn-Sham equations. When presented with the resulting eigenfunctions the CNN reports a probability that an eigenfunction is localised. We average this over the highest occupied spin-up and spin-down eigenfunctions and denote the result $p_{\mathrm{loc}}$.

The results obtained after averaging $p_{\mathrm{loc}}$ over an ensemble of 20 samples with system size $L \approx$ […] $p_{\mathrm{loc}} =$ […] $n_c \approx […]\ \mathrm{cm}^{-3}$. In Fig. 4 we show the effective multifractal exponent $\tilde{\alpha}$ as a function of donor concentration for two system sizes $L \approx$ […] $\mathrm{cm}^{-3}$. This is in good agreement with the prediction of the CNN.

[Figure 5: $p_{\mathrm{loc}}$ versus $n_D$; marker at $n_c^{\mathrm{CNN}}$.]

FIG. 5. The probability $p_{\mathrm{loc}}$ that a Kohn-Sham eigenfunction is localised as a function of the doping concentration $n_D$ for a spinless compensated doped semiconductor. The compensation is fixed at 50% and the probability is plotted as a function of the concentration of donor impurities $n_D$. The CNN used has been trained with data for a spinless uncompensated doped semiconductor. The prediction for the critical concentration $n_c \approx […]\ \mathrm{cm}^{-3}$ is taken as the concentration where $p_{\mathrm{loc}} = 0.5$.

[Figure 6: $\tilde{\alpha}$ versus $n_D$ for two system sizes; marker at $n_c^{\mathrm{MFA}}$.]

FIG. 6. Estimation of the critical concentration for a spinless compensated system using multifractal analysis. Two system sizes were simulated, $L \approx […]$ and $L \approx […]$. […] $n_c \approx […]\ \mathrm{cm}^{-3}$ is significantly less than the value found using the CNN (see Fig. 5).

C. Spinless model of a compensated doped semiconductor
The system now includes randomly distributed donor and acceptor ions. The compensation ratio is fixed at 50%,

$$\frac{n_A}{n_D} = 0.5. \tag{18}$$

We focus on the effect of compensation and neglect the spin degree of freedom. The probability reported by the CNN trained with data for the spinless uncompensated system is plotted in Fig. 5 as a function of the donor concentration. The system size is $L \approx […]$ Å and the number of samples is 59. The CNN predicts a critical concentration of $n_c \approx […]\ \mathrm{cm}^{-3}$. The multifractal analysis is shown in Fig. 6. The estimated critical concentration is $n_c \approx […]\ \mathrm{cm}^{-3}$. It is clear that the CNN significantly overestimates the critical concentration.

IV. CONCLUSION
In this paper, we investigated the generalisation capability of a CNN to determine the critical concentration of the metal-insulator transition in a model of a doped semiconductor. The results are mixed. A CNN trained with Kohn-Sham eigenfunctions from DFT calculations in a spinless model of an uncompensated doped semiconductor correctly assesses eigenfunctions from a model of a doped semiconductor with spin. However, the same CNN fails to assess correctly eigenfunctions from a spinless model of a compensated doped semiconductor.

Nevertheless, the fact that a CNN trained with eigenfunctions from a spinless model of a doped semiconductor can be applied to assess eigenfunctions for a model of a doped semiconductor that includes spin may be useful. The self-consistent calculations involved in finding the Kohn-Sham eigenfunctions in models with spin are considerably more demanding since the spin density must also be optimised. Since CNNs require large training data sets to be useful, this means that considerable time could potentially be saved by training with a spinless model.
ACKNOWLEDGMENTS
This work was partly supported by JSPS KAKENHI Grant Nos. JP17K18763, 16H06345, and 19H00658. The computation was partly conducted using the facilities of the Supercomputer Centre, the Institute for Solid State Physics, the University of Tokyo.

[1] P. Mehta, M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, and D. J. Schwab, Phys. Rep. , 1 (2019).
[2] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Rev. Mod. Phys. , 045002 (2019).
[3] T. Ohtsuki and T. Mano, J. Phys. Soc. Jpn. , 022001 (2020).
[4] J. Carrasquilla and R. G. Melko, Nat. Phys. , 431 (2017).
[5] E. P. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, Nat. Phys. , 435 (2017).
[6] P. Broecker, J. Carrasquilla, R. G. Melko, and S. Trebst, Sci. Rep. , 8823 (2017).
[7] T. Ohtsuki and T. Ohtsuki, J. Phys. Soc. Jpn. , 123706 (2016).
[8] T. Ohtsuki and T. Ohtsuki, J. Phys. Soc. Jpn. , 044708 (2017).
[9] Y. Zhang and E.-A. Kim, Phys. Rev. Lett. , 216401 (2017).
[10] Y. Zhang, R. G. Melko, and E.-A. Kim, Phys. Rev. B , 245119 (2017).
[11] N. Yoshioka, Y. Akagi, and H. Katsura, Phys. Rev. B , 205110 (2018).
[12] H. Araki, T. Mizoguchi, and Y. Hatsugai, Phys. Rev. B , 085406 (2019).
[13] T. Mano and T. Ohtsuki, J. Phys. Soc. Jpn. , 123704 (2019).
[14] J. Behler and M. Parrinello, Phys. Rev. Lett. , 146401 (2007).
[15] A. Takahashi, A. Seko, and I. Tanaka, Phys. Rev. Materials , 063801 (2017).
[16] W. Li, Y. Ando, and S. Watanabe, J. Phys. Soc. Jpn. , 104004 (2017).
[17] A. P. Bartók, J. Kermode, N. Bernstein, and G. Csányi, Phys. Rev. X , 041048 (2018).
[18] H. Babaei, R. Guo, A. Hashemi, and S. Lee, Phys. Rev. Materials , 074603 (2019).
[19] J. Byggmästar, A. Hamedani, K. Nordlund, and F. Djurabekova, Phys. Rev. B , 144105 (2019).
[20] K. Takahashi and Y. Tanaka, Comput. Mater. Sci. , 364 (2016).
[21] T. Yamashita, N. Sato, H. Kino, T. Miyake, K. Tsuda, and T. Oguchi, Phys. Rev. Materials , 013803 (2018).
[22] T. Fukazawa, Y. Harashima, Z. Hou, and T. Miyake, Phys. Rev. Materials , 053807 (2019).
[23] Y. Harashima, K. Tamai, S. Doi, M. Matsumoto, H. Akai, N. Kawashima, M. Ito, N. Sakuma, A. Kato, T. Shoji, and T. Miyake, arXiv:2007.14101 (2020).
[24] T. Mano and T. Ohtsuki, J. Phys. Soc. Jpn. , 113704 (2017), https: // doi.org / / JPSJ.86.113704.
[25] P. W. Anderson, Phys. Rev. , 1492 (1958).
[26] T. F. Rosenbaum, K. Andres, G. A. Thomas, and R. N. Bhatt, Phys. Rev. Lett. , 1723 (1980).
[27] H. Stupp, M. Hornung, M. Lakner, O. Madel, and H. v. Löhneysen, Phys. Rev. Lett. , 2634 (1993).
[28] H. v. Löhneysen, Ann. Phys. , 599 (2011).
[29] K. M. Itoh, M. Watanabe, Y. Ootuka, E. E. Haller, and T. Ohtsuki, J. Phys. Soc. Jpn. , 173 (2004).
[30] Y. Harashima and K. Slevin, Int. J. Mod. Phys. Conf. Ser. , 90 (2012).
[31] Y. Harashima and K. Slevin, Phys. Rev. B , 205108 (2014).
[32] E. G. Carnio, N. D. M. Hine, and R. A. Römer, Phys. Rev. B , 081201 (2019).
[33] E. G. Carnio, N. D. Hine, and R. A. Römer, Physica E: Low Dimens. Syst. Nanostruct. , 141 (2019).
[34] P. Hohenberg and W. Kohn, Phys. Rev. , B864 (1964).
[35] W. Kohn and L. J. Sham, Phys. Rev. , A1133 (1965).
[36] O. Gunnarsson, B. I. Lundqvist, and J. W. Wilkins, Phys. Rev. B , 1319 (1974).
[37] J. F. Janak, V. L. Moruzzi, and A. R. Williams, Phys. Rev. B , 1257 (1975).
[38] A. Rodriguez, L. J. Vasquez, K. Slevin, and R. A. Römer, Phys. Rev. Lett. , 046403 (2010).
[39] A. Rodriguez, L. J. Vasquez, K. Slevin, and R. A. Römer, Phys. Rev. B , 134209 (2011).
[40] L. Ujfalusi and I. Varga, Phys. Rev. B , 184206 (2015).
[41] J. Lindinger and A. Rodríguez, Phys. Rev. B , 134202 (2017).
[42] L. Ujfalusi and I. Varga, Phys. Rev. B90