Data mining and accelerated electronic structure theory as a tool in the search for new functional materials
C. Ortiz, ∗ O. Eriksson, and M. Klintenberg † Department of Physics, Uppsala University, Box 530, SE-751 21 Uppsala, Sweden (Dated: October 10, 2018)
Data mining is a recognized predictive tool in a variety of areas ranging from bioinformatics and drug design to crystal structure prediction. In the present study, an electronic structure implementation has been combined with structural data from the Inorganic Crystal Structure Database to generate results for highly accelerated electronic structure calculations of about 22,000 inorganic compounds. It is shown how data mining algorithms employed on the database can identify new functional materials with desired materials properties, resulting in a prediction of 136 novel materials with potential for use as detector materials for ionizing radiation. The methodology behind the automatized ab-initio approach is presented, results are tabulated and a version of the complete database is made available at the internet web site http://gurka.fysik.uu.se/ESP/ (Ref.1).
Sensors, solar cells, advanced batteries, and magnetic strips in credit cards are examples of functional materials present in everyday life. One important task for research in materials science is the continuous improvement and discovery of new such advanced materials.
Ab-initio electronic structure calculations as a tool for predicting materials properties have steadily increased in use over the years [2] and today play an important role owing to the relatively inexpensive and versatile guidance they offer. There are currently some 8000 studies published annually with this method.

Electronic structure theory applied in materials research is typically done in a fashion where a calculation follows, or accompanies, an experimental result. Knowledge on an atomistic level is thus gained, which can help in understanding the experimental results [3]. In some rarer cases the theoretical calculations predict a materials property which may subsequently be realized experimentally. An example of the latter is the newly proposed tetragonally distorted FeCo alloy with exceptional out-of-plane magnetic anisotropy [4, 5]. These materials simulations are done in a one-by-one mode, where one calculation accompanies one experiment. However, with an increasing demand for accelerated speed in finding or predicting new materials, this may not be the most efficient approach. Alternatives to this methodology have indeed been discussed: for instance, numerical algorithms which obey evolutionary principles borrowed from biology have been applied to find structural data of compounds and alloys [6], and, in a somewhat similar study, formation probabilities derived from correlations mined from experimental data were used to guide ab-initio calculations for unknown structures [7].

In this article a method for automatizing the generation of a new database with electronic structure results for a large number (22,000) of inorganic compounds is presented. The necessary structural information for the ab-initio calculations is extracted from the Inorganic Crystal Structure Database (ICSD) [8].
The electronic structure results are generated within the Local Density Approximation (LDA) of Density Functional Theory (DFT) in combination with a highly accurate full-potential linear muffin-tin orbital (FP-LMTO) method [9].

We describe how data mining algorithms can be applied to the database when searching for any particular class of potentially new advanced materials. This may involve semiconductors with tailored band gaps for solar cells and light harvesting, i.e. materials with desired optical properties, and magnetic compounds for energy conversion or magnetocaloric applications. To demonstrate in detail the power of data mining algorithms with automatic electronic structure calculations we also give a full account of the search and identification of 136 novel compounds with potential use as radiation detector materials. In addition, we demonstrate a high success rate of the algorithm by an unbiased identification of several known, successful cerium activated materials.

Radiation detection systems are generally used in areas such as biomedical imaging, nuclear security, nuclear non-proliferation and treaty verification, and in industry. The limiting factors for the performance of these systems are found within the detector material, and improvements are desired in properties like energy resolution for isotope identification in nuclear security, temporal resolution in biomedical imaging, or simply effective detection in small sized systems.

Standard simulations involve to a large extent manual work, where input data files must be generated. We have circumvented this time consuming step by a fully automatic method, where all data files are computer generated according to certain materials-specific criteria. In addition we have designed a software system which, once all necessary input files have been generated, carries out the simulation automatically (optimization of parameters, truncation criteria, decision making, etc.).
It should also be noted that the algorithm employed has to undertake several steps of learning in order to decide on a proper set of parameters. For instance, it was found that the definition of the base geometry and basis set used in these calculations [9] needed to be updated after several trial calculations, in order to ensure accuracy of the electronic structure and total energy of the material. Hence all steps of the simulations of this work have been made by artificial intelligence and high performance computation.

The algorithm which executes first principles calculations, with general rules for the computational details as described in Supplementary Information 1, has been applied to some 22,000 compounds from the ICSD database, and the results of these calculations are available on the web site of Ref.1. There are a number of electronic structure databases available on the web, e.g. [10, 11, 12, 13]; however, all are at least two orders of magnitude smaller and have a different focus. The web-based databases all complement each other.

It should be noted that the crystallographic data from the ICSD originate from different experimental settings, with small variations, giving rise to slightly different electronic structure results, and that several entries in ICSD must be disregarded because at least one site has non-trivial occupancy. The control files for our calculations are available upon request.

The crystallographic data needed to construct the control files involve information about the cell geometry, Bravais lattice, and the coordinates for each atom and space group. This information is available from the ICSD database [8] in, for example, the CIF (crystallographic information file) format. With access to the CIF files the coordinates can then be unfolded and transformed to the minimal Bravais lattice [14].
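As a minimal sketch of the structural pre-screening described above, the Python fragment below reads the cell parameters from CIF-formatted text and rejects entries in which a site is only partially occupied. The function names (`parse_cell`, `has_trivial_occupancy`), the layout of the site records, the tolerance and the example numbers are illustrative assumptions and not the authors' implementation; only the data names `_cell_length_*` and `_cell_angle_*` come from the CIF format itself.

```python
import re

# Standard CIF data names for the unit cell (lengths in angstrom, angles in degrees).
CELL_TAGS = (
    "_cell_length_a", "_cell_length_b", "_cell_length_c",
    "_cell_angle_alpha", "_cell_angle_beta", "_cell_angle_gamma",
)


def parse_cell(cif_text):
    """Return {tag: value} for the cell tags found in a CIF string.

    Standard uncertainties such as the "(5)" in "4.123(5)" are stripped.
    """
    cell = {}
    for line in cif_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in CELL_TAGS:
            cell[parts[0]] = float(re.sub(r"\(\d+\)$", "", parts[1]))
    return cell


def has_trivial_occupancy(sites, tol=1e-6):
    """True if every site is fully occupied (hypothetical site records)."""
    return all(abs(site["occupancy"] - 1.0) <= tol for site in sites)


# Toy example with made-up numbers (not an ICSD entry).
example_cif = """
_cell_length_a 4.123(5)
_cell_length_b 4.123(5)
_cell_length_c 6.500
_cell_angle_alpha 90
_cell_angle_beta 90
_cell_angle_gamma 120
"""
example_cell = parse_cell(example_cif)
example_sites = [
    {"label": "A1", "occupancy": 1.0},
    {"label": "B1", "occupancy": 0.5},  # partially occupied site
]
keep_entry = has_trivial_occupancy(example_sites)  # entry would be disregarded
```

A full workflow would also need the atomic coordinates and the space group, and the reduction to the minimal Bravais lattice; those steps are not sketched here.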
Our approach can make use of any electronic structure method and can be applied to any compilation of structural geometries, even hypothetical ones which have not yet been identified.

For each entry in Ref.1 the electronic structure results are presented as figures illustrating band structure, density of states (DOS), partial DOS, and charge density contour maps; furthermore, properties like density, total energy, Fermi energy and band gap (if available) are also listed. Note that the density is calculated using the experimental lattice parameters. Optimizing the lattice parameters, i.e. calculating the bulk modulus, as well as performing spin polarized calculations will be subjects of future work.

We now proceed with a detailed example of how mining algorithms applied to the electronic structure information in Ref.1 may be used for identifying novel scintillator materials. The general philosophy of the mining algorithm is to compare specific electronic structure related properties of a larger set of compounds (i.e. the data in Ref.1) to a peer group, which is known to have desired properties connected to a certain functionality of the material. We focus here on suitable candidates for nuclear radiation detector materials, and have chosen two sub-groups of materials: (1) cerium activated scintillating materials, and (2) activated semiconductor materials, e.g. Ga doped ZnO (ZnO:Ga) [15]. In identifying principles of data mining for these materials, we consider experimental information regarding characteristic electronic properties for known cerium materials that show 5d to 4f luminescence, as well as known semiconductor materials which have been found to have encouraging materials properties. The data mining results in 136 candidate materials proposed for further investigation.

A first desirable property for the materials of interest for this study is high detection probability in small sized units, which is associated with the number of available electrons per unit volume.
A high density and high atomic number (Z) are therefore desired [16]. Moreover, a short attenuation length is needed and it is also advantageous that photons scatter mainly through the photoelectric channel. These two properties can be characterized with the photoelectric attenuation length (PAL). PAL is the ratio between the calculated attenuation length, $\lambda = FW / (\rho \, [\sigma_{pe} + \sigma_{C}])$, of the incoming radiation in the material and the calculated fraction of the photoelectric cross section ($\sigma_{pe}$) in the sum of the photoelectric and Compton scattering ($\sigma_{C}$) cross sections, i.e. the ratio $\sigma_{pe} / [\sigma_{pe} + \sigma_{C}]$, at some energy, e.g. 511 keV, which is the energy scale relevant for positron emission tomography (PET) [16, 17]. FW is the formula weight and $\rho$ is the calculated density of the material. The atomic masses and the photoelectric and Compton scattering cross sections are measured, element specific quantities and are listed in Supplementary Information 1. PAL summarizes the attenuation length and the efficiency of the photoelectric scattering channel: the lower the PAL value, the higher the chance that an incoming γ-ray is absorbed in the material after a short distance by the photoelectric effect, which makes the material more relevant to our study.

We show in Fig.1 the distribution and cumulative sums of materials densities and the PALs. A high density requirement is imposed as ρ > . / cm , and we find 4,602 materials satisfying this criterion. As an upper limit for the PAL, the value of a well-known detector material is used, i.e. Tl doped NaI with $\mathrm{PAL}_{\mathrm{NaI:Tl}} = 17$ cm. This limit is satisfied for about 87% of the materials also satisfying the high density condition and the selection is reduced to 3,983 entries.

We are now left with some four thousand compounds which according to the density and PAL criteria might be suitable detector materials.
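The PAL definition above can be written out as a short Python sketch. It follows the formula exactly as stated in the text and therefore omits any unit conversion factors; the inputs are assumed to be given in mutually consistent units so that the attenuation length comes out in cm. The function names are illustrative, the density threshold `rho_min` is left as a caller-supplied parameter, and the numbers in the example are made up.

```python
def attenuation_length(fw, rho, sigma_pe, sigma_c):
    """lambda = FW / (rho * (sigma_pe + sigma_C)), as written in the text."""
    return fw / (rho * (sigma_pe + sigma_c))


def photoelectric_fraction(sigma_pe, sigma_c):
    """Share of the photoelectric channel: sigma_pe / (sigma_pe + sigma_C)."""
    return sigma_pe / (sigma_pe + sigma_c)


def pal(fw, rho, sigma_pe, sigma_c):
    """Photoelectric attenuation length: lambda divided by the photoelectric fraction."""
    lam = attenuation_length(fw, rho, sigma_pe, sigma_c)
    return lam / photoelectric_fraction(sigma_pe, sigma_c)


def passes_density_and_pal(rho, pal_value, rho_min, pal_max=17.0):
    """First two screening steps; pal_max = 17 cm is the NaI:Tl reference value."""
    return rho > rho_min and pal_value < pal_max


# Made-up inputs, chosen only to make the arithmetic easy to follow.
toy_pal = pal(fw=100.0, rho=5.0, sigma_pe=2.0, sigma_c=3.0)
```

With these toy inputs the attenuation length is 4 and the photoelectric fraction is 0.4, so the PAL is 10. Algebraically the sum of cross sections cancels, leaving PAL = FW / (ρ σ_pe), which shows that a large photoelectric cross section and a high density both lower the PAL.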
The synthesis and testing of thousands of candidate compounds is clearly not realistic, nor is it feasible with standard, manually controlled computational methods to go through such a large body of potential materials to try to identify successful compounds. Hence, more efficient methods are needed, where efficient algorithms for calculating the electronic structure must be combined with data mining techniques. Using a peer group of materials (discussed below) the data mining algorithms can learn selection rules related to electronic structure properties as given by the band structure or the density of states. We will here use the LDA band gap ($E_g$), the width of the highest valence band (vbw), the width of the lowest conduction band (cbw), the width of the highest occupied electron state in the valence band (dEe), and the width of the lowest available state in the conduction band (dEh) to further narrow down the list of candidate materials. The definitions of these properties are shown in Fig.2. It should be noted that the parameter vbw measures how delocalized the highest energy band of the valence states is. In the same way cbw measures the degree of delocalization of the lowest band amongst the conduction band states. The remaining two parameters, dEe and dEh, give related information, and stand in direct proportion to the effective mass of the highest electron state and the lowest hole state, respectively.

FIG. 1: Profiles for the density (upper graph) and PAL (lower graph) distributions in energy (full line, scale is to the left of the figures) and their cumulative sum (dashed line, scale is to the right of the figures).
As a matter of fact our mining algorithm could have made use of effective electron and hole masses, instead of dEe and dEh, without any change in the resulting list of identified materials. When considering the band gaps a distinction is made between a direct and an indirect gap material, depending on whether or not the highest energy in the valence band is found at the same point in reciprocal space as the lowest energy in the conduction bands.

FIG. 2: Illustration of band widths and dispersion relations used in the mining algorithms. Because the values for the lowest of CB and the highest of VB occur at different positions in reciprocal space, an indirect band gap (Eg) is shown. By integrating the total density of states around the Fermi level the width of the last occupied electron (dEe) and the width of the first empty state containing one electron (dEh) are found. vbw and cbw represent the width of the valence and conduction band, respectively.
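To make the band parameters concrete, the following Python sketch extracts the band gap, the gap type, vbw and cbw from the dispersion of the highest valence band and the lowest conduction band. The input layout (lists of `(k_label, energy_in_eV)` pairs), the function name and the toy energies are assumptions made for illustration; dEe and dEh, which require an integration of the density of states, are not covered.

```python
def band_edge_summary(valence_band, conduction_band):
    """Summarize band edges from two bands sampled at the same k-points.

    Each band is a list of (k_label, energy_eV) pairs: the highest valence
    band and the lowest conduction band, respectively.
    """
    k_vbm, e_vbm = max(valence_band, key=lambda point: point[1])
    k_cbm, e_cbm = min(conduction_band, key=lambda point: point[1])
    return {
        "Eg": e_cbm - e_vbm,
        # Direct if both extrema sit at the same point in reciprocal space.
        "gap_type": "direct" if k_vbm == k_cbm else "indirect",
        "vbw": e_vbm - min(e for _, e in valence_band),
        "cbw": max(e for _, e in conduction_band) - e_cbm,
    }


# Toy bands (made-up energies in eV, three k-points).
toy_valence = [("G", -0.5), ("X", 0.0), ("L", -0.25)]
toy_conduction = [("G", 1.0), ("X", 2.0), ("L", 1.5)]
toy_summary = band_edge_summary(toy_valence, toy_conduction)
```

In this toy case the valence band maximum lies at X and the conduction band minimum at G, so the gap of 1.0 eV is indirect, with vbw = 0.5 eV and cbw = 1.0 eV.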
The remaining step in the mining algorithm is to make a comparison of a profile of a selected property, e.g. the valence band and conduction band widths (vbw and cbw, respectively), for the approximately four thousand compounds in our database which satisfy the density and PAL criteria, with the corresponding profile for a peer group of materials. This peer group must, of course, have desired properties for the specific functionality desired in the study in question. For semiconducting materials the peer group is composed of well-known materials from Ref.18.

The profiles of vbw and cbw are shown in Fig.3. In the case of semiconducting materials for use as scintillators, the mining algorithm concludes from the peer group that the vbw value should be greater than 0.4 eV. Analogously, cbw must be greater than 0.9 eV. Fig.4 shows the distributions of dEe and dEh, respectively. From the profile of the peer group of semiconductors, the lower limit for the dEe rule is set to 0.02 eV. Analogously the lower limit for the dEh rule is set to 0.03 eV.

FIG. 3: Profiles for the vbw (upper graph) and cbw (lower graph) distributions in energy. The thick full line shows the distribution for all compounds in Ref.1 (the scale is on the right of each plot). The thin full line and the dashed line show the profiles for cerium and semiconductor materials, respectively, defining the peer groups (the scale is on the left of each plot).

FIG. 4: Profiles for the dEe (upper graph) and dEh (lower graph) distributions in energy. The thick full line shows the distribution for all compounds in Ref.1 (the scale is on the right of each plot). The thin full line and the dashed line show the profiles for cerium and semiconductor materials, respectively, defining the peer groups (the scale is on the left of each plot).

For cerium activated scintillating materials a more complex process is known to take place, and for this reason somewhat different mining rules are used. The scintillating process can be thought to occur as follows: a tri-valent cerium atom captures a hole and goes to the tetra-valent state ($\mathrm{Ce}^{3+} + h^{+} \rightarrow \mathrm{Ce}^{4+}$). A subsequent capture of an electron takes the tetra-valent cerium atom to an excited tri-valent state ($\mathrm{Ce}^{4+} + e^{-} \rightarrow (\mathrm{Ce}^{3+})^{*}$). The process can be thought of as an excitation of the Ce 4f-electron into the Ce 5d-state. The band gap of the host material must be large enough to properly accommodate the Ce 4f and 5d states [19]. Finally, the excited state of the tri-valent Ce atom relaxes to the ground state ($(\mathrm{Ce}^{3+})^{*} \rightarrow \mathrm{Ce}^{3+} + h\nu$), with the emission of a photon ($h\nu$), ideally around 3 eV, which can be detected with conventional photo-electronics. To properly accommodate the Ce 4f and 5d states a large band gap, > [...] of the measured one. Hence it is quite possible to use the calculated band gap as a screening parameter, as long as one makes use of calculated gaps both for the peer group and the group of compounds one performs the data mining for. Fig.5 shows the band gap profiles for all compounds with a calculated LDA band gap in Ref.1 (thick full line), for peer groups of Ce doped materials showing cerium 5d → 4f emission (thin full line), semiconducting materials (dashed line), and additionally a profile for Ce doped materials which also contain sulfur (thin full line with circle markers). The reason for introducing the latter profile is that a detailed analysis of the materials listed in Ref.20 shows that host materials containing sulfur allow cerium 5d → 4f emission even though the band gap is smaller compared to Ce materials without sulfur. Therefore the LDA band gap mining rule sets the lower limit of 3 eV for Ce activated host compounds, provided that this number is relaxed down to [...]

FIG. 5: Profile for the LDA band gap distribution for all compounds in Ref.1 (thick full line, the scale is on the right of the figure). Most known cerium activated materials that emit light have an LDA band gap larger than 3 eV (thin full line, scale is to the left of the figure). The thin full line with circle markers shows that cerium activated materials where the host contains sulfur can emit light even though the band gap is small (scale is to the left of the figure). Semiconducting materials from the peer group are shown as a dashed line (scale is to the left of the figure).

[...] cbw is large. The list of predicted materials in Table 1 of Supplementary Information 3 has similar band gaps and values of cbw as these known compounds. However, it should be noted that in the group of CuI, ZnO and CdS, only CuI is found in Table 1 of Supplementary Information 3, because of the very stringent set of criteria imposed in the data mining procedure (in particular, it is the requirement of a high materials density which excludes ZnO and CdS from our list). These criteria are designed to substantially improve on the currently used materials in scintillator applications. Hence the predicted list of materials has the potential to be substantially better than what is used in current technology. We end the discussion of semiconducting scintillators with a short analysis of the crystal chemistry of the identified materials. First we note that space groups 129 (tetragonal) and 62 (orthorhombic) are the most common crystal structures found in Table 1 of Supplementary Information 3, and that materials with space groups 141 (tetragonal), 164 (trigonal) and 225 (fcc) are also rather frequent.
Secondly, we note that oxides constitute over 50 % of these materials. We also observe that ionic bonds are present in almost all the materials listed in Table 1 (as well as in Table 2) of Supplementary Information 3. This is clearly a significant result, which is worthwhile to pursue experimentally, and is a unique feature of our study, since it would be difficult to draw this conclusion with any other technique.

Applying the appropriate mining algorithm for Ce activated compounds to all materials of Ref.1 identifies some 70 candidate materials. We note again that the rules learned by the mining algorithm are optimized to identify new materials with superior properties compared to known systems. Table 2 in Supplementary Information 2 shows how the number of candidate compounds is reduced as the data mining progresses, and it is clear that the band gap rule stands out as the most efficient. The resulting list of compounds identified as potential hosts for Ce (Ce activated scintillator materials) is given in Table 2 of Supplementary Information 3. For cerium activation, compounds containing Gd, Lu, Y, La and Sc are known to often be efficient scintillators, the reason being that these elements do not introduce electron/hole traps and therefore allow efficient energy transfer to the cerium dopant. Hence, it is gratifying that among the materials in Table 2 of Supplementary Information 3, many are indeed rare-earth (RE) based. The quantum efficiency of the cerium 5d to 4f transition is largely dictated by the chemical bonding between the cerium and host atoms, a property which is ideal to investigate with electronic structure methods. The fact that the crystal structure as well as the type of host cation are important parameters is reflected in that most useful cerium activated compounds today are orthosilicates, aluminates, phosphates or simple metal halides.
We note that several orthosilicates, aluminates, phosphates or simple metal halides are predicted in Table 2 of Supplementary Information 3.

As regards crystal structure we note that space group 62 constitutes a large group of Ce doped compounds (this space group was also prevalent for the semiconducting scintillators). Space groups 14 (monoclinic) and 225 (fcc) are also common in our table of Ce doped scintillators. Again, and as noted above, the presence of ionic compounds is clear from this table.

It might be argued that a theoretical prediction of novel materials with improved properties is uncertain if it is not accompanied by an experimental verification. To address this argument without doing any real synthesis and measurements it is useful to consider the following. Suppose some of the most successful/useful cerium activated detector materials, e.g. LaBr₃, LaCl₃, LaF₃, Lu₂SiO₅, Gd₂SiO₅, LuPO₄, YAlO₃ and CeF₃, had never been discovered; would the present method be capable of identifying them? To answer this question we removed the materials listed above from the peer group, thus forcing the mining algorithm to learn from a smaller peer group, to see if our algorithms would identify these well known compounds. The results of this analysis are indeed very encouraging. Four out of the eight materials are immediately identified (LaF₃, Lu₂SiO₅, Gd₂SiO₅ and LuPO₄). LaBr₃ and CeF₃ do not appear in our list because they have too low a density, and the compounds LaCl₃ and YAlO₃ are also excluded since they have too high a PAL value. Should the density cut-off be set to 5.0, LaBr₃ and CeF₃ would also have been identified as well as several more interesting compounds, e.g. Ce doped Y Si O .
If we had used a higher value of the PAL in the screening process, we would also have included LaCl₃ and YAlO₃ (as well as several other compounds), but it should be noted that the too high PAL value of these two compounds is known to make them less attractive as scintillator materials, even though they have positive features such as low cost and ease of synthesis. The exercise described above shows that our mining algorithm and electronic structure method have the desired accuracy for identifying novel materials with desired properties.

Inspection of Table 2 in Supplementary Information 3 reveals several compounds of special interest. AsLuO and ClGdO stand out in this group, especially because the first is isoelectronic to LuPO₄ and the latter is related to the compound BrGdO, which is a known highly luminous phosphor. In fact lanthanide oxyhalides doped with cerium are interesting because the La and Lu versions are also well-known luminous phosphor materials. We note that the successful materials discussed in the previous paragraph all have large LDA band gaps, and this fact indicates that the following materials also deserve special attention: AlO Tb, Al Gd O Sr, Ba O Ru and O SrYb .

The materials predicted here are the sole result of theoretical modeling and are found by using a data mining algorithm which uses material properties of a peer group of already well-known materials. Obviously the method presented here can be employed to identify materials with other properties, for instance novel materials for fuel cell and battery applications, super hard compounds and magnetic nano-devices with tailor-made transport properties.

We thank Dr. B. Sanyal, Dr. D. Åberg, and Dr. B. Sadigh for helpful discussions on electronic structure theory, and Dr. M. J. Weber, Dr. S. E. Derenzo, and Dr. W. W.
Moses for helpful discussions on radiation detection. This work has been sponsored in part by NNSA/na22; HSARPA; Stiftelsen för internationalisering av högre utbildning och forskning (STINT); Vetenskapsrådet (VR); Kungliga vetenskapsakademin (KVA); SNIC/SNAC and the Göran Gustafsson Stiftelse.

∗ Partly affiliated at Lawrence Berkeley National Laboratory, Berkeley, California during years 2002-2005.
† Corresponding author: [email protected]; Partly affiliated at LBNL during years 1998-2005.

[1] http://gurka.fysik.uu.se/esp/
[2] M. Cohen. "The theory of real materials". Annual Review of Material Science, 30:1–26, 2000.
[3] G. B. Olson. "Materials by Designing a New Material World". Science, 288(5468):993–998, 2000.
[4] G. Andersson, T. Burkert, P. Warnicke, M. Björck, B. Sanyal, C. Chacon, C. Zlotea, L. Nordström, P. Nordblad, and O. Eriksson. "Perpendicular magnetocrystalline anisotropy in tetragonally distorted Fe-Co alloys". Physical Review Letters, 96(3):037205, 2006.
[5] T. Burkert, L. Nordström, O. Eriksson, and O. Heinonen. "Giant magnetic anisotropy in tetragonal FeCo alloys". Physical Review Letters, 93(2):027203, 2004.
[6] G. L. W. Hart, V. Blum, M. J. Walorski, and A. Zunger. "Evolutionary approach for determining first-principles hamiltonians". Nature Materials, 4(5):391–394, 2005.
[7] C. C. Fischer, K. J. Tibbetts, D. Morgan, and G. Ceder. "Predicting crystal structure by merging data mining with quantum mechanics". Nature Materials, 5:641–646, 2006.
[8] ICSD Inorganic Crystal Structure Database, FIZ Karlsruhe.
[9] J. M. Wills, O. Eriksson, M. Alouani, and D. L. Price. "Electronic Structure and Physical Properties of solids: The uses of the LMTO method". Springer Verlag, Berlin, 2000.
[10] http://databases.fysik.dtu.dk/
[11] http://caldb.nims.go.jp/
[12]
[13] http://ptp.ipap.jp/link?PTPS/138/755/
[14] C. J. Bradley and A. P. Cracknell. "The Mathematical Theory of Symmetry in Solids: Representation theory for point groups and space groups". Clarendon Press, Oxford, 1972.
[15] W. Lehmann. "Edge emission of n-type conducting ZnO and CdS". Solid-State Electronics, 9:1107–1110, 1966.
[16] G. F. Knoll. Radiation Detection and Measurement. John Wiley and Sons, 2000.
[17] M. M. Atalla. Scientific Rept. no. 8, page AD0447260, 1964.
[18] D. R. Lide, editor. CRC Handbook of Chemistry and Physics. Taylor and Francis, Boca Raton, FL. Internet version, 88th edition, 2007.
[19] W. Yen, M. Raukas, S. Basun, W. van Schaik, and U. Happek. "Optical and photoconductive properties of cerium-doped crystalline solids". Journal of Luminescence, 69:287–294, 1996.
[20] P. Dorenbos. "The 5d level positions of the trivalent lanthanides in inorganic compounds". Journal of Luminescence, 91:155–176, 2000.
APPENDIX A: SUPPLEMENTARY INFORMATION 1
Any ab-initio method requires initial input data for the atomic species and their relative position in the crystal as well as information about truncation in the expansion of wavefunctions, density and potential. The structural data are in this work extracted from the ICSD [8]. Additionally, for the FP-LMTO method used here [9] we need to define:

• A muffin-tin radius, $R_{MT}$, optimized to be the largest value for non-overlapping neighboring spheres. The initial value for $R_{MT}$ is set to be the ionic radius. The electronic structure calculation is iterated three times and between each iteration $R_{MT}$ is set to the smallest value for which the potential between each atom pair reaches a maximum.
• An upper limit, $l_{cut}$, for the expansion of the angular part of the wavefunctions inside the muffin-tin, which we set equal to the highest populated orbital in the valence, plus one (e.g. for sp-bonded materials we use s, p and d orbitals as basis functions).
• The expansion of density and potential inside the muffin-tin radius is done up to $l_{cut} = 8$.
• The grid for the sampling of the irreducible Brillouin zone and the Fourier mesh for expanding the density and potential in the interstitial are all set inversely proportional to the lengths of the crystal axes and to include all high symmetry points. Hence for smaller cells we make use of a higher number of k-points whereas for larger cells we use a smaller number of k-points.
• For each atom a selection must be made of which electronic states should be categorized as chemically inert core states, and which are considered to be chemically active valence orbitals. We have here made a conventional choice which is listed in the table below.
• In the table below we also list for each atom the experimental cross sections for Compton scattering ($\sigma_{C}$) and the photoelectric effect ($\sigma_{pe}$), which are used for the calculation of the PAL, defined in the main text.
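Two of the rules in the list above lend themselves to a short Python sketch: the choice of $l_{cut}$ for the wavefunction expansion and a k-point mesh that scales inversely with the lengths of the crystal axes. The function names and the constant `target` (a hypothetical product of mesh divisions and axis length, in angstrom) are assumptions for illustration; the actual proportionality constant is not given in the text, and the requirement that the mesh includes all high symmetry points is not handled here.

```python
import math

# Angular momentum quantum number for each orbital label.
ORBITAL_L = {"s": 0, "p": 1, "d": 2, "f": 3}


def l_cut(valence_orbitals):
    """Highest populated valence angular momentum plus one."""
    return max(ORBITAL_L[orbital] for orbital in valence_orbitals) + 1


def k_mesh(a, b, c, target=30.0):
    """Mesh divisions inversely proportional to the axis lengths.

    `target` is a hypothetical constant: divisions ~ target / axis length,
    rounded up, with at least one division per axis.
    """
    return tuple(max(1, math.ceil(target / length)) for length in (a, b, c))


# An sp-bonded material gets s, p and d basis functions (l = 0, 1, 2).
sp_cut = l_cut("sp")
# A short axis gets many divisions, a long axis few (lengths in angstrom).
toy_mesh = k_mesh(3.0, 6.0, 30.0)
```

For the toy cell the mesh is 10 x 5 x 1, which illustrates the statement that smaller cells receive more k-points and larger cells fewer.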
TABLE I: Definition of atomic configurations including atomic number (Z), symbol (atom), core and valence electron configuration. The atomic masses as well as photoelectric and Compton cross sections are also listed.

Z    Atom    Core    Valence    Atomic mass    σ_pe    σ_C
[table rows not recoverable from the source]

APPENDIX B: SUPPLEMENTARY INFORMATION 2
Parameter      Limit            No. remaining
Ref.1                           22,283
Density (a)    >                4,602
PAL            < 17 cm          3,983
Gap type       direct           334
Band gap       0.4 < E_g <
vbw            > 0.4 eV
cbw            > 0.9 eV
dEe (b)        > 0.02 eV
dEh            > 0.03 eV

(a) An upper limit of 13.0 g/cm³ is applied for the density.
(b) 21 out of the 104 compounds lack values for dEe and dEh, which means that only 2 and 12 compounds are removed by these two constraints, respectively.

TABLE II: Results of the mining algorithm for wide-gap semiconductor materials. A final list of 66 compounds is obtained.

Parameter      Limit            No. remaining
Ref.1                           22,283
Density (a)    >
PAL            < 17 cm          3,982
Dope site (b)  yes              1,825
LDA gap        > . < E_g < .
vbw            >
cbw            >

(a) An upper limit of 13.0 g/cm³ is applied for the density.
(b) Compounds that pass this test must have a 3+ site or selected 2+ site. At least one of the following elements needs to be present: La, Ce, Gd, Y, Lu, Sc, Be, Mg, Ca, Sr, Ba, Al, Ga, In, Tl, As, Sb, Bi. If Pr, Nd, Pm, Sm, Dy, Ho, Er, Tm, or an element with Z > 83 is present the compound is excluded.

TABLE III: Results of the mining algorithm for host materials with Ce-activation when applied to Ref.1. A final list of 60 compounds is obtained, which becomes a list of 70 compounds if vbw and cbw are ignored and the sulfur containing small band gap compounds are included.
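The step-by-step reduction reported in these tables can be sketched as a sequence of filters that records how many compounds remain after each rule. The sketch below uses the semiconductor limits stated in the main text (PAL < 17 cm, direct gap, vbw > 0.4 eV, cbw > 0.9 eV, dEe > 0.02 eV, dEh > 0.03 eV) and the upper density limit of 13.0 g/cm³. Everything else is an assumption made for illustration: the function names, the record layout, the caller-supplied lower density limit `rho_min`, the choice to let compounds without dEe or dEh values pass those two rules, the omission of the band gap window, and the toy compounds.

```python
def above(key, limit, missing_passes=False):
    """Build a rule 'value > limit'; None marks a value that is not available."""
    def rule(compound):
        value = compound.get(key)
        if value is None:
            return missing_passes
        return value > limit
    return rule


def semiconductor_rules(rho_min):
    """Ordered (name, rule) pairs; rho_min is a caller-supplied assumption."""
    return [
        ("density", lambda c: rho_min < c["rho"] < 13.0),
        ("PAL", lambda c: c["pal"] < 17.0),
        ("gap type", lambda c: c["gap_type"] == "direct"),
        ("vbw", above("vbw", 0.4)),
        ("cbw", above("cbw", 0.9)),
        ("dEe", above("dEe", 0.02, missing_passes=True)),
        ("dEh", above("dEh", 0.03, missing_passes=True)),
    ]


def run_funnel(compounds, rules):
    """Apply the rules in order and record the number of survivors per step."""
    survivors = list(compounds)
    counts = [("start", len(survivors))]
    for name, rule in rules:
        survivors = [c for c in survivors if rule(c)]
        counts.append((name, len(survivors)))
    return survivors, counts


# Toy records (made-up values, not entries of Ref.1).
base = dict(rho=8.0, pal=2.5, gap_type="direct", vbw=0.5, cbw=1.2, dEe=0.05, dEh=0.05)
toy_compounds = [
    dict(base, name="A", dEh=None),            # passes; dEh not available
    dict(base, name="B", rho=3.0),             # fails the density rule
    dict(base, name="C", pal=20.0),            # fails the PAL rule
    dict(base, name="D", gap_type="indirect"), # fails the gap type rule
    dict(base, name="E", vbw=0.3),             # fails the vbw rule
    dict(base, name="F", dEe=0.01),            # fails the dEe rule
]
toy_survivors, toy_counts = run_funnel(toy_compounds, semiconductor_rules(rho_min=6.0))
```

Recording the count after every rule makes it easy to see which rule removes the most candidates, which is the kind of comparison the tables above are used for in the main text.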
APPENDIX C: SUPPLEMENTARY INFORMATION 3
TABLE IV: Semiconducting materials
Material  Spgrp  ρ  PAL  E gap  Gap type  wbw/cbw  ICSD no.
AgHg O P  55  8.2  2.5  1.35  direct  0.46/1.81  2208
AgI Tl  140  7.1  3.7  1.14  direct  0.46/1.77  23159
AgLaOS  129  6.6  9.4  1.18  direct  1.30/1.96  15530
Ag HgO  96  9.3  2.8  0.51  direct  0.43/1.35  280333
Ag S  14  7.3  12.0  0.57  direct  0.61/2.07  44507
Ag LiO  72  7.1  11.9  0.65  direct  0.64/1.31  4204
AlO Tl  166  7.3  2.3  1.59  direct  0.65/0.70  29010
AsLuO  141  6.9  5.0  3.40  direct  0.84/1.92  2506
As Eu O  139  6.9  5.3  0.78  direct  1.01/1.37  1222
AuBr  138  8.2  2.5  1.42  direct  1.30/1.01  200287
AuBr  141  8.2  2.4  1.68  direct  0.55/1.10  200286
AuCl  141  7.8  2.2  1.30  direct  0.84/1.12  6052
AuI  138  8.3  2.5  1.42  direct  1.10/0.72  24268
AuLiS  70  7.0  2.5  1.34  direct  0.87/1.37  280534
Au S Tl  59  10.2  1.5  0.86  direct  0.62/0.80  51235
BaO  129  8.2  6.1  1.84  direct  1.36/3.81  15301
BaSe  221  6.6  9.8  1.08  direct  3.48/6.67  52695
BiFO  129  9.3  1.6  2.55  direct  1.13/1.65  24096
BiIO  129  9.7  1.9  0.70  direct  1.49/2.78  29145
BiO Sb  15  8.5  2.5  2.53  direct  0.53/0.89  75901
Bi O Te  64  9.1  2.0  1.84  direct  0.72/1.23  6239
Bi Cu Pb S  26  7.0  2.3  0.45  direct  0.65/0.82  95926
BrFPb  129  7.7  2.4  2.49  direct  1.14/1.60  30288
BrTl  221  7.5  2.4  1.75  direct  1.81/4.11  61532
BrTl  225  6.6  2.7  1.92  direct  2.20/3.00  61519
Br STl  128  7.4  2.3  1.74  direct  0.86/0.67  40521
Br HgTl  128  7.0  2.7  1.81  direct  0.60/0.82  9325
CO Pb  62  6.6  2.5  2.97  direct  0.46/0.90  36164
CaHgO  166  6.5  2.9  2.42  direct  0.41/1.72  80717
CdHgO  12  9.5  2.3  0.60  direct  0.61/2.52  74848
CdI Tl  128  6.9  3.1  1.78  direct  0.43/0.87  60756
ClFPb  129  7.2  2.3  3.04  direct  1.01/1.53  30287
ClO P Pb  176  7.2  2.4  2.44  direct  0.49/0.69  24238
ClO PbSb  63  7.0  3.1  1.64  direct  0.76/1.85  86229
Cl Hg O  57  9.6  1.6  0.60  direct  0.64/1.42  83225
Cl STl  128  7.1  2.1  1.46  direct  1.05/0.80  35289
CrHg O  15  8.9  1.8  0.96  direct  0.44/0.81  81605
CsI  221  9.0  5.5  1.54  direct  5.34/9.04  56524
CuI  129  6.9  10.6  0.98  direct  1.63/3.48  78268
Eu O Si  62  6.7  5.8  3.91  direct  0.53/1.81  1510
FIPb  129  7.4  2.6  1.50  direct  1.10/1.44  279599
FInO  70  6.6  13.1  1.62  direct  0.77/4.03  2521
FTl  139  8.4  1.7  1.82  direct  2.87/5.95  9893
FTl  28  9.0  1.6  1.37  direct  1.12/2.14  16112
FTl  69  8.5  1.7  1.73  direct  2.94/6.06  30268
F Hg  225  9.3  1.8  0.41  direct  0.50/4.25  33614
F SiTl  163  6.8  2.4  3.10  direct  0.43/1.49  68021
Gd InSe  58  7.2  7.4  0.67  direct  0.65/1.08  280242
HfO  225  10.4  2.3  3.71  direct  1.69/1.03  53033
HfO Pb  55  10.2  1.7  2.27  direct  0.43/1.15  52030
HgI Tl  128  7.2  2.7  1.19  direct  0.49/1.14  14018
HgO Ti  161  8.7  2.4  1.25  direct  0.57/1.01  19005
HgO W  15  9.2  2.0  2.20  direct  0.44/0.73  280911
Hg O Se  14  8.0  2.3  2.25  direct  0.67/0.65  412302
Hg N O  14  7.5  2.2  1.70  direct  0.57/0.66  59156
Hg O Si  12  9.1  1.8  1.56  direct  0.47/1.01  69123
ISe Tl  140  8.6  1.9  0.68  direct  0.74/0.74  49524
ITl  225  6.6  2.8  1.81  direct  1.89/2.96  60491
I STl  128  7.2  2.4  1.61  direct  0.65/0.79  29265
I SeTl  128  7.4  2.4  1.53  direct  0.64/0.82  40520
O Sn  136  6.9  11.8  0.52  direct  1.27/5.15  39178
O Sn  58  7.4  11.1  1.47  direct  1.08/5.42  56675
O SbTl  163  7.1  3.0  1.88  direct  0.77/0.96  4123
O W  7  7.4  3.1  1.50  direct  0.42/1.64  84144
O Pb  117  8.7  1.6  1.16  direct  0.63/1.61  29094
O Pb  135  8.9  1.6  0.64  direct  0.73/1.51  22325

TABLE V: Cerium activated materials.
Material  Spgrp  ρ  PAL  E gap  Gap type  wbw/cbw  ICSD no.
AlLaO  167  6.5  9.9  3.68  in-direct  0.29/0.66  90536
AlO Tb  62  7.5  5.3  6.38  direct  0.32/2.38  84422
Al Gd O Sr  139  6.9  7.2  5.90  direct  0.31/3.27  33580
Al F Pb  108  6.8  2.7  4.56  in-direct  0.24/0.26  203224
Al F Pb  140  6.7  2.8  4.18  in-direct  0.21/0.41  96597
Al F Pb  87  6.7  2.8  4.37  in-direct  0.19/0.33  80105
AsBiO  88  7.7  2.6  3.03  in-direct  0.25/0.70  30636
AsLuO  141  6.9  5.0  3.40  direct  0.84/1.92  2506
BGaO Pb  62  6.9  3.2  3.28  in-direct  0.58/0.90  279600
BLuO  167  6.9  3.9  5.27  in-direct  0.34/0.84  16525
BaBeLa O  14  6.6  7.9  3.78  in-direct  0.36/0.16  65292
BaF  62  6.7  8.6  5.52  in-direct  0.92/2.43  41651
BaO Tb  62  7.3  5.3  4.45  direct  1.02/1.96  86736
BaO Tb  62  7.8  4.5  3.89  in-direct  0.15/1.83  78661
BaO Tb Zn  62  7.8  5.1  3.72  direct  0.38/1.15  69721
Ba Ce . O Sb  139  6.5  8.6  4.00  in-direct  0.49/0.81  72522
Ba EuO Sb  225  7.0  7.0  4.12  in-direct  0.30/1.26  38330
Ba GdO Sb  225  7.1  6.8  3.49  direct  0.37/1.56  38331
Ba O SbTb  225  7.2  6.5  4.18  direct  0.32/1.27  38332
Ba O SbYb  225  7.5  5.4  4.19  direct  0.35/1.44  38336
Ba O TaYb  225  8.2  3.7  3.49  in-direct  0.38/0.83  91001
Ba O WZn  225  7.7  5.0  3.37  in-direct  0.31/0.56  24983
Ba NiO Ru  194  6.8  10.6  3.97  direct  0.09/1.73  50832
Ba O Ru  64  6.7  9.9  4.72  in-direct  0.12/0.74  90902
BeF Pb  62  6.8  2.7  4.34  in-direct  0.19/0.53  24568
BiF  62  7.9  2.0  3.81  in-direct  0.58/0.68  1269
Bi Ge O  220  7.1  2.6  3.26  in-direct  0.22/0.32  39231
Bi O Si  220  6.8  2.4  3.58  in-direct  0.37/0.24  84519
BiO V  15  7.0  2.8  3.38  in-direct  0.39/0.51  31549
BrGdO  129  6.8  6.4  4.16  direct  1.08/0.99  41071
CaLu O  62  8.1  3.2  3.49  in-direct  0.42/1.23  15125
CaO Ta  182  7.6  3.3  3.08  in-direct  0.18/0.38  1854
CaO Yb  62  8.0  3.4  4.38  direct  0.22/1.65  27312
CaO Ta  62  7.4  3.5  3.15  in-direct  0.17/0.80  24091
CeO  225  7.2  6.8  5.62  in-direct  0.48/1.38  28753
Ce O  164  6.5  7.1  3.61  in-direct  0.43/1.36  96197
ClGdO  129  6.7  5.8  5.13  direct  0.72/1.13  59232
F La  139  7.0  8.5  5.40  in-direct  0.88/0.40  96133
F SnTl  164  6.8  2.9  4.03  direct  0.78/1.29  410801
F SiTl  163  6.8  2.4  3.10  direct  0.43/1.49  68021
GaLaO  161  7.0  10.4  3.27  in-direct  0.18/0.78  51039
GaLaO  167  6.9  10.5  3.01  in-direct  0.21/1.02  51286
GaLaO  62  7.2  10.1  3.24  direct  0.51/0.56  79662
Gd GeO  14  7.1  5.9  3.76  direct  0.24/1.16  61372
Gd O  206  7.6  4.4  3.20  in-direct  0.23/1.50  40473
Gd O Si  14  6.8  5.7  4.71  in-direct  0.17/0.78  27728
Gd O Sb  217  6.6  7.3  3.16  in-direct  0.13/0.17  65147
Ge Lu O  92  7.4  4.6  3.55  in-direct  0.12/1.53  39929
Ge Lu O  13  7.4  4.1  3.85  in-direct  0.16/1.38  39790
InO Ta  13  8.3  3.9  3.54  in-direct  0.24/0.60  72569
KO W Y  15  6.6  4.6  3.32  direct  0.18/0.22  90378
LaO Yb  33  8.2  3.8  4.30  direct  0.25/0.35  30399
La LiO Sb  14  6.5  9.0  3.82  in-direct  0.21/0.74  72202
La O  150  6.6  7.5  3.16  in-direct  0.75/1.13  56166
La O  164  6.5  7.6  3.74  in-direct  0.38/0.83  96196
LuO Rb  166  7.6  4.2  3.48  in-direct  0.53/1.83  15164
LuO P  141  6.5  4.7  5.54  direct  0.89/1.57  79761
Lu O  206  9.4  2.4  3.77  in-direct  0.28/1.21  40471
Lu O Si  14  7.9  3.3  4.65  in-direct  0.24/1.12  89624
O SrTa  182  7.8  3.3  3.07  in-direct  0.19/0.37  79704
O Sb Yb  217  7.0  5.4  3.15  in-direct  0.09/0.25  20945
O STl  62  6.8  2.4  3.68  in-direct  0.31/0.61  27440
O SeTl  62  7.0  2.5  3.46  in-direct  0.22/0.58  73411
O SrYb  62  8.4  3.5  4.78  direct  0.19/1.65  15123
AgLaOS  129  6.6  9.4  1.18  direct  1.30/1.96  15530
AsBrHg S  186  6.6  3.1  1.28  in-direct  0.56/1.34  280330
BiBrS  62  6.5  2.8  1.47  in-direct  0.61/1.32  31389
BiIS  62  6.8  2.8  1.16  in-direct  0.62/1.34  23631
Bi S3