Manifolds of quasi constant SOAP and ACSF fingerprints
Behnam Parsaeifard (1, 2) and Stefan Goedecker (1, 2)
(1) Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
(2) National Center for Computational Design and Discovery of Novel Materials (MARVEL), Switzerland
Atomic fingerprints are commonly used for the characterization of local environments of atoms in machine learning and other contexts. In this work we study the behaviour of the fingerprints under finite changes of atomic positions and demonstrate the existence of manifolds of quasi constant fingerprints for two widely used fingerprints, namely the smooth overlap of atomic positions (SOAP) and Behler-Parrinello atom-centered symmetry functions (ACSF). These manifolds are found numerically by following eigenvectors of the sensitivity matrix with quasi zero eigenvalues. The existence of such manifolds in ACSF and SOAP is a result of the two- and three-body nature of the fingerprint. No such surfaces can be found for the Overlap matrix (OM) many-body fingerprint.
INTRODUCTION
Atomic environment fingerprints encode information about the chemical environment, such as bond lengths to neighboring atoms or coordination numbers [1–4]. The Cartesian coordinates of the atoms in a system are not a useful fingerprint, since the invariance of the energy under certain operations is not encoded in such a fingerprint. A fingerprint should be invariant under uniform translations, rotations, and permutations of identical atoms in the system. Two well-known fingerprints that are commonly used in machine learning of the potential energy surface are the smooth overlap of atomic positions (SOAP) and Behler-Parrinello atom-centered symmetry functions (ACSF). ACSF consists of radial and angular symmetry functions. The radial symmetry functions ($G$) are sums of two-body terms and describe the radial environment of an atom. The angular symmetry functions are sums of three-body terms and describe the angular environment of an atom [5, 6].

In the SOAP scheme, a Gaussian is centered on each atom within the cutoff distance around the reference atom $k$. The resulting density of atoms, multiplied by a cutoff function that goes smoothly to zero at the cutoff radius over some characteristic width, is then expanded in terms of orthogonal radial functions $g_n(r)$ and spherical harmonics $Y_{lm}(\theta, \phi)$ as $\rho^k(\mathbf{r}) = \sum_{nlm} c^k_{nlm}\, g_n(r)\, Y_{lm}(\theta, \phi)$. The vector containing all $p^k_{nn'l}$, defined as $p^k_{nn'l} = \sqrt{\frac{8\pi^2}{2l+1}} \sum_m c^k_{nlm} (c^k_{n'lm})^*$, with $n, n' \le n_{\max}$ and $l \le l_{\max}$, is the SOAP fingerprint vector of atom $k$ [7].

The ACSF and SOAP fingerprints are thus both based on two- and three-body functions.

In contrast, the OM fingerprint is based on the diagonalization of the overlap matrix between the atoms within the cutoff sphere around the reference atom.
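To make the two-body character of the radial ACSF terms concrete, the following minimal sketch evaluates a single Behler-type radial symmetry function with a cosine cutoff. It is only an illustration and not the QUIPPY implementation used in this work; the geometry, the parameter values (`eta`, `r_shift`) and the function names are assumptions chosen for the example.

```python
import math

def cosine_cutoff(r, r_cut):
    """Cosine cutoff: 1 at r = 0 and smoothly 0 for r >= r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)

def radial_symmetry_function(center, neighbours, eta, r_shift, r_cut):
    """Sum of two-body terms exp(-eta (r - r_shift)^2) f_c(r) over all neighbours."""
    total = 0.0
    for pos in neighbours:
        r = math.dist(center, pos)  # only the distance to the centre enters
        total += math.exp(-eta * (r - r_shift) ** 2) * cosine_cutoff(r, r_cut)
    return total

# Hypothetical NH3-like geometry in Angstrom (illustrative values only).
center = (0.0, 0.0, 0.0)
neighbours = [(0.94, 0.0, -0.38), (-0.47, 0.81, -0.38), (-0.47, -0.81, -0.38)]

g_ref = radial_symmetry_function(center, neighbours, eta=1.0, r_shift=0.0, r_cut=4.0)

# Rotate the first neighbour about the z axis through the centre.
# Its distance to the centre is unchanged, but the bond angles are not.
angle = 0.7  # arbitrary rotation angle in radians
x, y, z = neighbours[0]
rotated = (x * math.cos(angle) - y * math.sin(angle),
           x * math.sin(angle) + y * math.cos(angle),
           z)
g_moved = radial_symmetry_function(center, [rotated] + neighbours[1:],
                                   eta=1.0, r_shift=0.0, r_cut=4.0)

print(g_ref, g_moved)
```

Because only distances to the central atom enter, the two printed values agree to rounding error even though the bond angles differ. Resolving such a movement requires the angular (three-body) terms.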
In the OM scheme we place a minimal basis set of Gaussian orbitals on each atom within the cutoff radius and then calculate the overlap matrix between all the atoms. The resulting overlap matrix is modulated by a smooth cutoff function. The eigenvalues of the resulting matrix form the fingerprint vector of the reference atom in OM [8, 9]. We will show that, because of the two- and three-body nature of ACSF and SOAP, these fingerprints are insensitive to some movements. We find manifolds of constant fingerprint for these fingerprints, whereas it turns out that the OM is a true many-body fingerprint for which no manifold of constant fingerprint can be found. The software QUIPPY [10] is used to calculate both the ACSF and SOAP fingerprints.

In a recent investigation [11] of the SOAP fingerprint it was found that for the CH4 molecule there are two distinct configurations which give rise to exactly the same fingerprint. In this work we go one step further and show that there exist even manifolds of constant fingerprint.

METHODOLOGY
We introduced the sensitivity matrix in [12] to study the behaviour of atomic fingerprints under infinitesimal changes in the coordinates. The square of the fingerprint distance between a reference configuration of atoms, $R_0$, and a configuration $R$ displaced by $\Delta R$ can be expanded in a Taylor series as

$$ (F(R) - F_0)^2 = \sum_{\alpha,\beta} \Delta R_\alpha \left( \sum_i g_{i,\alpha}\, g_{i,\beta} \right) \Delta R_\beta \qquad (1) $$

where $F_0$ is the fingerprint of the reference configuration, $F(R)$ is the fingerprint of the displaced configuration, and $g_{i,\alpha}$ is the gradient of the $i$-th component of the fingerprint vector with respect to the Cartesian component $\alpha$ of the position vector $R$, i.e.

$$ g_{i,\alpha} = \left. \frac{\partial F_i}{\partial R_\alpha} \right|_{R = R_0} \qquad (2) $$

The symmetric $3N \times 3N$ matrix $S_{\alpha,\beta} = \sum_i g_{i,\alpha}\, g_{i,\beta}$ is the sensitivity matrix, where $N$ is the number of atoms (the reference atom and its neighbours) for which the sensitivity matrix is calculated. Since $S$ is symmetric, its eigenvalues are real and its eigenvectors form a complete basis set, i.e.

$$ S = \sum_{i=1}^{3N} \lambda_i\, |v_i\rangle \langle v_i| \qquad (3) $$

Since we can write any arbitrary displacement as $\Delta R = \sum_i c_i |v_i\rangle$ with $c_i = \langle v_i | \Delta R \rangle$, the square of the fingerprint distance can be written in terms of the eigenvalue contributions as

$$ (F(R) - F_0)^2 = \sum_i \lambda_i\, c_i^2 \qquad (4) $$

This means that the eigenvectors of the sensitivity matrix are displacement modes and the eigenvalues show how much the fingerprint changes under these movements. The sensitivity matrix must have 6 zero eigenvalues. The associated eigenvectors describe 3 uniform translations and 3 rotations of the atoms. In the following we will only consider displacements that do not contain any translations or rotations. For a unit displacement along the $i$-th eigenvector of the sensitivity matrix, the fingerprint distance $\Delta F$ is $\sqrt{\lambda_i}$.
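The construction of Eqs. (1) to (4) can be illustrated numerically. The sketch below is not the code used in this work: it assumes a toy, purely two-body radial fingerprint (`toy_fingerprint`, a hypothetical stand-in for ACSF or SOAP), a hypothetical distorted four-atom geometry, and central finite differences instead of analytic gradients. The step size `h` and the threshold `1e-8` are likewise assumptions.

```python
import numpy as np

def toy_fingerprint(flat_positions, etas=(0.5, 1.0, 2.0, 4.0), r_cut=4.0):
    """Toy two-body fingerprint of atom 0: radial sums over its neighbours."""
    pos = flat_positions.reshape(-1, 3)
    r = np.linalg.norm(pos[1:] - pos[0], axis=1)
    f_cut = np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)
    return np.array([np.sum(np.exp(-eta * r**2) * f_cut) for eta in etas])

def sensitivity_matrix(fingerprint, flat_positions, h=1e-5):
    """S = g^T g with g[i, a] = dF_i/dR_a, estimated by central differences."""
    n = flat_positions.size
    grad = np.empty((fingerprint(flat_positions).size, n))
    for a in range(n):
        step = np.zeros(n)
        step[a] = h
        grad[:, a] = (fingerprint(flat_positions + step)
                      - fingerprint(flat_positions - step)) / (2.0 * h)
    return grad.T @ grad

# Hypothetical distorted NH3-like geometry in Angstrom (atom 0 is the centre).
positions = np.array([[0.00, 0.00, 0.00],
                      [0.90, 0.05, -0.30],
                      [-0.50, 0.95, -0.40],
                      [-0.60, -1.00, -0.50]]).ravel()

S = sensitivity_matrix(toy_fingerprint, positions)
eigvals, eigvecs = np.linalg.eigh(S)    # ascending eigenvalues, orthonormal modes
eigvals_scaled = eigvals / eigvals[-1]  # scale so that the largest eigenvalue is one

# Number of (numerically) vanishing eigenvalues, trivial modes included.
n_soft = int(np.sum(eigvals_scaled < 1e-8))

# Small step along the stiffest mode: the fingerprint distance should be
# close to eps * sqrt(lambda_max), in line with Eq. (4).
v_max = eigvecs[:, -1]
eps = 1e-4
delta_f = np.linalg.norm(toy_fingerprint(positions + eps * v_max)
                         - toy_fingerprint(positions))
predicted = eps * np.sqrt(eigvals[-1])

print(n_soft, delta_f, predicted)
```

The toy fingerprint depends on the 12 Cartesian coordinates only through the three distances to the central atom, so its gradient has rank at most 3 and at least 12 − 3 = 9 eigenvalues of $S$ vanish: the 6 translations and rotations plus at least 3 angular modes that a purely two-body fingerprint cannot see.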
So if the sensitivity matrix has more than 6 zero eigenvalues, there will be infinitesimal displacements that leave the fingerprint invariant. If more than 6 zero eigenvalues exist not only at a point but on a manifold, one can follow these zero modes and in this way obtain finite displacements that leave the fingerprint invariant. It is actually not necessary that the eigenvalue is exactly zero. If there exists a manifold on which one eigenvalue is smaller by several orders of magnitude than the other eigenvalues, there will be finite movements that leave the fingerprint nearly constant compared to other movements. Fingerprints which are constant or numerically quasi constant are clearly problematic: such a fingerprint can no longer distinguish structural differences that give rise to different physical and chemical properties.

In this work we used $n_{\max} = l_{\max} = 16$ as well as a Gaussian width of 0.5 Å for SOAP. For ACSF we used 10 radial and 48 angular symmetry functions with standard parameters as in [12]. The same cutoff of 4.0 Å is used for both SOAP and ACSF. Our observations show, however, that the existence of the manifold as well as its trajectory does not change significantly for other choices of parameters. We have scaled the sensitivity matrix for all fingerprints such that the largest eigenvalue is one.

RESULTS
We now apply the method explained in the previous section to three small molecules, namely H2O, NH3, and CH4, to investigate the manifold of constant fingerprint. In the case of H2O the reference atom O has only two neighbours, i.e. any fingerprint with three-body terms is sufficient to fully characterize its local environment. As expected, no manifold of constant fingerprint can be found with SOAP and ACSF for the O atom in H2O. For NH3 and CH4, where higher-order many-body terms come into play, the situation is however different and manifolds of constant fingerprint exist. In the right panel of Fig. 1a we show the constant-fingerprint trajectory for NH3. The rings shown in the figure are superpositions of many frames along such a trajectory that leaves the fingerprint invariant. In each frame the atoms are represented by small spheres, such that the superposition of all the atomic positions describes the movement. The smallest eigenvalue of the sensitivity matrix was in this case about 6 orders of magnitude smaller than the largest one. The blue triangles in the left panel of Fig. 1a represent the fingerprint distance between the N atom in the configurations on the manifold and the N atom in the initial configuration. Even though the configurations undergo finite movements on these rings, the SOAP fingerprint distance is less than […] among all the configurations on the ring. This is very small compared to random movements of comparable amplitude, which lead to fingerprint distances of about […], as shown by the red dots in Fig. 1.

Fig. 1b shows the result for the ACSF. The manifold is also ring shaped and its diameter is very similar to the case of SOAP. The ratios between the fingerprint distances for equal-amplitude movements on and off the manifold are also very similar. For the trajectory shown in Fig.
1b the smallest eigenvalue was also 6 orders of magnitude smaller than the largest one and the fingerprint changed by less than […] along the trajectory, whereas for random displacements with the same amplitude the distance changed by up to […].

Figure 1: The manifold of constant fingerprint of atom N in NH3 for a) SOAP and b) ACSF. The right column shows the rotation- and translation-free movements of the 4 atoms on the manifold and the left column shows the fingerprint distances of the N atom along the trajectory. The distances obtained with OM are compared with the SOAP/ACSF distances. The red circles indicate the fingerprint distances of the N atom in randomly displaced configurations (by a maximum of 0.02 Å) and the blue triangles the distances on the manifold. The diameter of the rings in both SOAP and ACSF is about 0.1 Å.

The same kind of manifolds exist for the carbon atom in CH4. Fig. 2 shows manifolds of constant fingerprint for the central C atom in a CH4 molecule with both SOAP and ACSF. The diameter of the largest ring is about 0.15 Å for both SOAP and ACSF. The ratio of the largest to the smallest eigenvalue is about […] for both SOAP and ACSF, and the fingerprints change by less than […] for ACSF and less than […] for SOAP.

Figure 2: The manifold of constant fingerprint of the carbon atom in CH4 for a) SOAP and b) ACSF. The right column shows the rotation- and translation-free movements of the five atoms on the manifold and the left column shows the fingerprint distances of the C atom along the trajectory. The distances obtained with the OM are compared with the SOAP/ACSF distances. The red circles indicate the fingerprint distances of the C atom in randomly displaced configurations (by a maximum of 0.04 Å) and the blue triangles the distances in configurations on the manifold. The diameter of the largest ring is about 0.15 Å for both SOAP and ACSF.

It is worthwhile to mention that such manifolds cannot be found for all initial configurations. In the case of NH3 and CH4 they do not exist, for instance, for the high-symmetry ground-state configurations. But for any moderate distortion of the ground state we were always able to find them. We could not find such a manifold for any of the studied molecules and configurations with the OM fingerprint, i.e. the OM fingerprint always recognizes structural differences. In fact, we use the OM fingerprint in Fig. 1 and Fig. 2 to detect differences between the atomic environments on the manifold of the SOAP and ACSF fingerprints.

In conclusion, by following the eigenvectors of the sensitivity matrix corresponding to small eigenvalues we can find manifolds of constant fingerprint for SOAP and ACSF. We observed that such manifolds can be found for fingerprints which are restricted to three-body terms, such as ACSF and SOAP, while it is not possible to find any such manifold for a many-body fingerprint like OM.

This research was performed within the NCCR MARVEL, funded by the Swiss National Science Foundation. The calculations were performed on the computational resources of the Swiss National Supercomputer (CSCS) under project s963 and on the Scicore computing center of the University of Basel.

[1] M. Eickenberg, G. Exarchakis, M. Hirn, S. Mallat, and L. Thiry, The Journal of Chemical Physics, 241732 (2018).
[2] A. S. Christensen, L. A. Bratholm, F. A. Faber, and O. Anatole von Lilienfeld, The Journal of Chemical Physics, 044107 (2020).
[3] J. S. Smith, R. Zubatyuk, B. Nebgen, N. Lubbers, K. Barros, A. E. Roitberg, O. Isayev, and S. Tretiak, Scientific Data, 1 (2020).
[4] M.-P. V. Christiansen, H. L. Mortensen, S. A. Meldgaard, and B. Hammer, The Journal of Chemical Physics, 044107 (2020), https://doi.org/10.1063/5.0015571.
[5] J. Behler and M. Parrinello, Physical Review Letters, 146401 (2007).
[6] J. Behler, The Journal of Chemical Physics, 074106 (2011).
[7] A. P. Bartók, R. Kondor, and G. Csányi, Physical Review B, 184115 (2013).
[8] L. Zhu, M. Amsler, T. Fuhrer, B. Schaefer, S. Faraji, S. Rostami, S. A. Ghasemi, A. Sadeghi, M. Grauzinyte, C. Wolverton, et al., The Journal of Chemical Physics, 034203 (2016).
[9] A. Sadeghi, S. A. Ghasemi, B. Schaefer, S. Mohr, M. A. Lill, and S. Goedecker, The Journal of Chemical Physics, 184118 (2013).
[10] N. Bernstein, G. Csanyi, and J. Kermode, "Quip and quippy documentation".
[11] S. N. Pozdnyakov, M. J. Willatt, A. P. Bartók, C. Ortner, G. Csányi, and M. Ceriotti, Physical Review Letters, 166001 (2020).
[12] B. Parsaeifard, D. S. De, A. S. Christensen, F. A. Faber, E. Kocer, S. De, J. Behler, A. von Lilienfeld, and