Deep learning regression for inverse quantum scattering
A. C. Maioli∗
(Dated: September 22, 2020)

∗ [email protected]

In this work we study inverse quantum scattering via deep learning regression, implemented with a Multilayer Perceptron. A step-by-step method is provided to obtain the potential parameters. A circular boundary-wall potential was chosen to exemplify the method. A detailed discussion of the training is provided. An investigation with noisy data is presented, and it is observed that the neural network remains useful for predicting the potential parameters.

I. INTRODUCTION
Machine Learning is a collection of powerful tools that predict parameters or classify features based on experimental or synthetic data. A plethora of applications exists, such as the reconstruction of porous media [1], feature selection by mutual information [2], percolation and fracture propagation in disordered solids [3], the behavior of Ising spin lattices [4], a model for turbulent fluxes that recovers spontaneous zonal flow [5], the classification of complex features in diffraction images [6], and much more [7–9].

Recently, two-dimensional quantum scattering has been receiving attention. E. de Prunelé gave a formulation for non-isotropic interactions localized on a circle [10]. Maioli et al. found analytic solutions for the wave function scattered by circular and elliptic billiards [11, 12] and presented a scattering two-potential formalism [13], in which they used the boundary-wall potential introduced by M. G. E. da Luz et al. [14]. Furthermore, F. M. Zanetti et al. explored the eigenstates and scattering solutions for billiards using the Boundary Wall Method (BWM) [15], which is useful for finding analytic solutions for the T matrix. Along these lines, the BWM provides a significant way to study quantum scattering and electromagnetic wave propagation for TE or TM modes, due to the analogy between the two physical phenomena [16]. On the other hand, inverse scattering problems play a significant role in applied physics, such as the reconstruction of medium properties [17]. In this scenario, G. Ariel and H. Diamant [18] showed a method to infer the entropy from the structure factor (which can be obtained by quantum scattering), T. Tyni numerically investigated two-dimensional inverse scattering with the aid of Saito's uniqueness theorem [19], and G. Fotopoulos and M. Harju [20] studied how to retrieve the singularities of an unknown potential using the Born approximation.

The goal of this work is to provide a simple method that obtains the potential parameters from the scattering data. This type of inverse problem is extremely frequent in scattering physics; in machine learning vocabulary, it is designated a regression problem. The method consists in choosing a potential that models the physical system and then generating synthetic data to train a neural network. To illustrate the method we choose the circular boundary-wall potential. Employing a neural network to solve a regression problem is known to work exceedingly well, and the results improve as one adds more hidden layers. However, it can be computationally exhausting, and the network's parameters can be hard to converge due to the vanishing gradient problem; we show how to avoid this difficulty. The trained neural network can predict the correct results even when the input data has noise and the training set does not.

This paper is organized as follows. In Section II we present the method in detail, including how the synthetic data was generated (subsection II B) and how the neural network was trained (subsection II D). In Section III it is shown that the trained neural network can predict the correct values for the potential parameters. Finally, we conclude the discussion in Section IV.

II. THE METHOD
The main idea is to provide a fast way to find the potential parameters from the scattering cross length l(k) obtained in two-dimensional quantum scattering. The scattering cross length is the two-dimensional analog of the scattering cross section; the usual formulation can be found in [21–23], and a comparison between 2D and 3D formulas in [24]. The method comprises a few simple steps, and some hints follow the example selected throughout this work. The steps are:

1. Choose the potential that suits the desired physical system.

2. Generate synthetic data that will be the input of the neural network. One can use the scattering cross length and other physical information, such as the particle's mass, Planck's constant, etc. There is no need to worry about noise at this step. The output is the potential parameters.

3. Build a neural network. The size of the input will be the number of physical quantities necessary to perform the regression.

4. Train the neural network with synthetic data.
A. First Step: Boundary wall potential

Here we use a circular boundary-wall potential, defined as a line integral

V(\mathbf{r}) = \int_C \gamma(s)\, \delta(\mathbf{r} - \mathbf{r}(s))\, ds,  (1)

where γ(s) is the strength function, which we set to a constant, γ(s) = γ; C is a circle of radius R; and δ is the two-dimensional Dirac delta. Writing the two-dimensional delta in polar coordinates, δ(r − r(s)) = δ(r − R)δ(θ − s)/r, the potential becomes a Riemann integral,

V(\mathbf{r}) = \gamma R \int_{-\pi}^{\pi} \frac{\delta(r - R)}{r}\, \delta(\theta - s)\, ds.  (2)

One can see that the parameters γ and R uniquely define this type of potential; therefore they are the ones we need to predict.

B. Second Step: Synthetic data
In this subsection we present an expression for the scattering cross length l(k), which will be employed to generate the synthetic data. It is obtained from the analytic solution of the Lippmann-Schwinger equation outside the circle (r > R) [11],

\psi(\mathbf{r}) = J_0(kr) + u_0 H^{(1)}_0(kr) + 2\sum_{n=1}^{\infty} i^n \left[ J_n(kr) + u_n H^{(1)}_n(kr) \right] \cos[n(\theta - \alpha)],  (3)

where J_n and H^{(1)}_n are the Bessel and Hankel functions of the first kind of order n, respectively, α is the angle between the wave vector k of the plane wave and the x-axis, and

u_n = \frac{2\pi R \gamma \sigma J_n^2(kR)}{1 - 2\pi R \gamma \sigma J_n(kR) H^{(1)}_n(kR)},  (4)

where σ = −i m/(2ℏ²). For the sake of simplicity, we set α = 0; then, using the relations i^n J_n(kr) = i^{−n} J_{−n}(kr) and i^n H^{(1)}_n(kr) = i^{−n} H^{(1)}_{−n}(kr), one can rewrite eq. (3) as

\psi(\mathbf{r}) = e^{ikx} + \sum_{n=-\infty}^{\infty} i^n u_n H^{(1)}_n(kr)\, e^{in\theta},  (5)

where the sum of Bessel functions was identified with the plane-wave exponential. Along these lines, one can use the asymptotic expansion of the Hankel function,

H^{(1)}_n(kr) \approx \sqrt{\frac{2}{\pi k r}}\, e^{-i\pi/4}\, e^{ikr}\, e^{-in\pi/2},  (6)

and then it is easy to find the scattering amplitude f(θ) using

\psi(\mathbf{r}) \approx e^{ikx} + \frac{e^{ikr}}{\sqrt{r}} f(\theta),  (7)

therefore

f(\theta) = \sqrt{\frac{2}{\pi k}}\, e^{-i\pi/4} \sum_{n=-\infty}^{\infty} u_n e^{in\theta}.  (8)

For central potentials it is useful to apply partial wave analysis,

f(\theta) = \sum_{n=-\infty}^{\infty} f_n e^{in\theta},  (9)

where

f_n = \sqrt{\frac{2}{\pi}}\, e^{i\pi/4}\, \frac{1}{\sqrt{k}}\, e^{i\delta_n} \sin\delta_n,  (10)

and δ_n is the phase shift. One can find an analytic expression for the phase shift by combining eqs. (8), (9) and (10),

\delta_n = \frac{\log(1 + 2u_n)}{2i},  (11)

and a relation for the scattering cross length,

l(k) = \frac{4}{k} \sum_{n=-\infty}^{\infty} \sin^2(\delta_n) = -\frac{4}{k} \sum_{n=-\infty}^{\infty} \mathrm{Re}[u_n],  (12)

where Re[u_n] stands for the real part of u_n.
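Since eqs. (4), (11) and (12) are what the data generation rests on, a short numerical sketch may help. The following Python snippet is our own illustration (not code from the paper): it evaluates u_n, the phase shift δ_n, and the truncated cross length, assuming m = ℏ = 1 as in the next subsection; the function names are ours.

```python
# Minimal sketch of eqs. (4), (11) and (12); our own illustration, assuming
# m = hbar = 1 and a symmetric truncation |n| <= n_max of the partial-wave sum.
import numpy as np
from scipy.special import jv, hankel1

def u_n(n, k, R, gamma, m=1.0, hbar=1.0):
    """Partial-wave coefficient u_n of eq. (4), with sigma = -i m / (2 hbar^2)."""
    sigma = -1j * m / (2.0 * hbar**2)
    a = 2.0 * np.pi * R * gamma * sigma * jv(n, k * R)
    return a * jv(n, k * R) / (1.0 - a * hankel1(n, k * R))

def phase_shift(n, k, R, gamma):
    """Phase shift delta_n of eq. (11)."""
    return np.log(1.0 + 2.0 * u_n(n, k, R, gamma)) / 2j

def cross_length(k, R, gamma, n_max=20):
    """Scattering cross length l(k), eq. (12) truncated as in eq. (13)."""
    ns = np.arange(-n_max, n_max + 1)
    return -(4.0 / k) * sum(u_n(n, k, R, gamma).real for n in ns)
```

One can verify numerically that |1 + 2u_n| = 1 for real γ, so the phase shifts of eq. (11) come out real, as they should for a unitary S matrix.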
1. Data details
For chosen γ and R, l(k) is computed for several values of k, beginning at k_min = 0.02 and ending at k_max = 3 with increments Δk = 0.005, taking m = ℏ = 1. The series of eq. (12) was truncated at n_max = 20,

l(k) = -\frac{4}{k} \sum_{n=-20}^{20} \mathrm{Re}[u_n].  (13)

FIG. 1. A schematic representation of the neural network. The input has 603 values, defined by x = (m, ℏ, k_min, k_max, Δk, n_max, l(k_min), ..., l(k_max))^T. The output contains two values, R and γ. Each hidden layer has 804 neurons, and there are 15 hidden layers.

FIG. 2. Plot of the scattering cross length. The blue (gray) full line corresponds to the true values R = 2 and γ = 2, and the red (black) dashed line to the "predicted" values γ ≈ 1.92 and R ≈ 1.98 obtained via the trained neural network.
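As a sanity check, the curve of Fig. 2 can be regenerated with the cross_length sketch above. Note that the step Δk = 0.005 is inferred here from the 603-component input of Fig. 1 (6 scalars plus 597 samples of l(k)).

```python
# Sampling l(k) on the grid of eq. (13) for the parameters of Fig. 2
# (R = gamma = 2); reuses cross_length() from the sketch above.
import numpy as np

k_grid = np.arange(0.02, 3.0 + 1e-9, 0.005)   # 597 points from k_min to k_max
l_curve = np.array([cross_length(k, R=2.0, gamma=2.0) for k in k_grid])
```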
So, one synthetic example is the group of 603 values x = (m, ℏ, k_min, k_max, Δk, n_max, l(k_min), ..., l(k_max))^T. Those values are organized as a column vector x and are the input of the neural network. We generate 55100 synthetic examples for different values of R and γ, where R spans from 0.11 to 2 and γ from 0.11 to 3, both in increments of 0.01.
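A possible construction of these input vectors is sketched below; make_example is a hypothetical helper of ours, and the exact grid endpoints are our assumption, chosen so that 190 × 290 = 55100 parameter pairs result.

```python
# Hedged sketch of the synthetic-data construction: each example pairs the
# 603-component input x with the target y = (R, gamma). The R and gamma grids
# are assumptions consistent with the 55100 examples quoted in the text.
import numpy as np

def make_example(R, gamma, k_min=0.02, k_max=3.0, dk=0.005, n_max=20,
                 m=1.0, hbar=1.0):
    k_grid = np.arange(k_min, k_max + dk / 2, dk)
    l_vals = [cross_length(k, R, gamma, n_max) for k in k_grid]
    x = np.concatenate(([m, hbar, k_min, k_max, dk, n_max], l_vals))
    return x, np.array([R, gamma])

R_grid = np.arange(0.11, 2.0 + 1e-9, 0.01)    # 190 values (assumed endpoints)
g_grid = np.arange(0.11, 3.0 + 1e-9, 0.01)    # 290 values (assumed endpoints)
dataset = [make_example(R, g) for R in R_grid for g in g_grid]
```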
C. Third Step: Build a neural network
Choosing a specific neural network to implement a regression problem is a decisive matter, owing to the trade-off between the computational time to execute the program and the personal time one is willing to spend to obtain the solution. Among several types of neural networks (such as Recurrent Neural Networks, Modular Neural Networks, Convolutional Neural Networks, and more), we choose a Multilayer Perceptron because it has a simple set-up and provides excellent results. The number of hidden layers in this work (15) is justified in subsection II D. Usually, the more hidden layers in the network, the better the results, until it starts to overfit. For the number of hidden neurons, one may use some rules:

• The number of hidden neurons should be between the size of the input layer and the size of the output layer.

• The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.

• The number of hidden neurons should be less than twice the size of the input layer.

These rules of thumb appear in [25]. The number chosen in this example was the size of the input plus one-third of it (804), and the activation function was the logistic sigmoid.
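The paper does not name a software framework, so as one possible realization, the network of Fig. 1 could be assembled in PyTorch as follows; build_mlp is our own helper, reused by the layer-growing loop of the next subsection.

```python
# One possible realization of the Fig. 1 architecture (framework is our
# choice): 603 inputs, hidden layers of 804 logistic-sigmoid neurons each,
# and a linear 2-neuron output for (R, gamma).
import torch.nn as nn

def build_mlp(n_hidden_layers, n_in=603, n_hidden=804, n_out=2):
    layers, width = [], n_in
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(width, n_hidden), nn.Sigmoid()]  # logistic sigmoid
        width = n_hidden
    layers.append(nn.Linear(width, n_out))   # linear output layer
    return nn.Sequential(*layers)

model = build_mlp(15)   # the 15-hidden-layer network used in this work
```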
D. Fourth Step: The Training
In order to train the neural network, the synthetic data was randomly separated into three groups, namely the training set, the validation set, and the test set. The test set has 20% of the total number of synthetic examples. The remaining 80% was allocated between the training set and the validation set: 30% of it for the validation set and 70% for the training set. This separation is important to check the accuracy of the network. The error (loss or cost) function J employed is the mean squared difference,

J(\mathbf{y}, \mathbf{y}') = \frac{1}{N} \sum_{j=1}^{N} (y_j - y'_j)^2,  (14)

where y = (y_1, ..., y_N)^T is the network output, N = 2 is the size of the output, and y′ = (y′_1, ..., y′_N)^T is the desired output, in other words, the γ and R used to produce x.

FIG. 3. Scatter plots of the noisy scattering cross length for four different noise widths w.

The training method is stochastic gradient descent with a batch size of 100 examples, where it is important to apply an adaptive learning rate that is invariant to diagonal rescaling of the gradients. However, one should avoid training the full network directly, because the vanishing gradient problem leads to a network with high bias. It is known that a cascade-correlation learning architecture [26] solves this problem. The procedure consists in training the network several times, first with only one hidden layer; then one adds another hidden layer while keeping the weights learned previously. At each training, one must check the convergence of the error over the validation set. If the error calculated over the validation set increases (over the iterations of one training), then the network is overfitting; to solve this problem, decrease the number of hidden neurons. Finally, it is imperative to apply the network to the test set at the end of each training, so one can visualize the error decreasing until it reaches the desired value. In this work we stop at 15 hidden layers, where the error over the test set is sufficiently small. One can go further (more hidden layers), but this is enough for the purpose of this work.

After checking the convergence of the parameters, we repeat the training with all the synthetic data. As an example, Fig. 2 shows the scattering cross length calculated with γ = 2 and R = 2 (blue full line). It is provided to the neural network as an input, and the network "predicts" the values γ ≈ 1.92 and R ≈ 1.98 (red dashed line), with percentage relative differences

\frac{|\gamma - p_\gamma|}{\gamma} \approx 0.04, \qquad \frac{|R - p_R|}{R} \approx 0.01,  (15)

where p_γ and p_R stand for the "predicted" values obtained by the neural network.
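A minimal sketch of this layer-by-layer schedule follows, under the same assumptions as the previous snippets (PyTorch, build_mlp from above, and tensors X_tr, Y_tr, X_val, Y_val holding the splits described in the text); Adam stands in for the adaptive learning rate mentioned above.

```python
# Hedged sketch of the greedy growth schedule: train with one hidden layer,
# then repeatedly add a layer while keeping the weights already learned,
# monitoring the validation error at every stage.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_stage(model, X_tr, Y_tr, X_val, Y_val, epochs=50):
    opt = torch.optim.Adam(model.parameters())   # adaptive learning rate
    loss_fn = nn.MSELoss()                       # eq. (14)
    loader = DataLoader(TensorDataset(X_tr, Y_tr), batch_size=100, shuffle=True)
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    with torch.no_grad():                        # rising val error => overfitting
        return model, loss_fn(model(X_val), Y_val).item()

model, _ = train_stage(build_mlp(1), X_tr, Y_tr, X_val, Y_val)
for depth in range(2, 16):                       # grow up to 15 hidden layers
    deeper = build_mlp(depth)
    # keep every previously learned layer except the old output layer
    for old, new in zip(model[:-1], deeper):
        if isinstance(old, nn.Linear):
            new.load_state_dict(old.state_dict())
    model, val_err = train_stage(deeper, X_tr, Y_tr, X_val, Y_val)
```

The key design point is that only the freshly added hidden layer and the output layer start from random weights at each stage, so the gradients never have to propagate through an entire untrained 15-layer stack.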
III. PREDICTION WITH NOISE
The trained neural network can predict accurate values of the parameters even when the input data has noise. Synthetic data l(k) was generated and Gaussian white noise of different widths was added. The noisy scattering cross length is plotted, together with its respective prediction, in Fig. 3 to elucidate the procedure. The four plots correspond to the same scattering cross length (the same as presented in Fig. 2); their only difference is the noise width. Along these lines, each example from Fig. 3 has a correct prediction for the potential parameters. Here we consider a correct prediction to be a percentage relative difference of less than 10% for all the parameters. Then, one thousand examples were generated for each noise width, with the parameters randomly selected in the intervals R ∈ [0.1, 2] and γ ∈ [0.1, 3].
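This test can be sketched as follows, reusing make_example and the trained model from the snippets above; the parameter ranges are the assumed ones, and the noise is added only to the l(k) entries of x.

```python
# Hedged sketch of the noise study: for a given width w, draw random (R, gamma),
# corrupt l(k) with Gaussian white noise, and count predictions whose relative
# error is below 10% for both parameters simultaneously.
import numpy as np
import torch

def correct_fraction(model, w, n_examples=1000, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_examples):
        R, g = rng.uniform(0.1, 2.0), rng.uniform(0.1, 3.0)   # assumed ranges
        x, y = make_example(R, g)
        x[6:] += rng.normal(0.0, w, size=x.size - 6)  # noise on l(k) entries only
        with torch.no_grad():
            p = model(torch.tensor(x, dtype=torch.float32)).numpy()
        hits += np.all(np.abs(p - y) / y < 0.10)
    return hits / n_examples
```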
FIG. 4. Percentage of correct predictions for each noise width w. A correct prediction is any example with a percentage relative difference of less than 10% for both parameters simultaneously.

The percentage of correct predictions for each noise width w is shown in Fig. 4: the number of accurate predictions decreases as the noise increases.

IV. CONCLUSION
In this work we have shown how a simple neural network can predict correct values for potential parameters. We chose a circular boundary-wall potential due to the existence of analytic solutions for the wave function and the scattering cross length. However, the vast majority of potentials have no analytic solution for the wave function or the scattering cross length (or scattering cross section in 3D problems); consequently, one can obtain them via numerical methods (boundary integral methods) or approximate ones (Born approximation). The neural network is able to determine the parameters even with a noisy input.

[1] L. Mosser, O. Dubrule, and M. J. Blunt, Reconstruction of three-dimensional porous media using generative adversarial neural networks, Phys. Rev. E, 043309 (2017).
[2] N. Kwak and Chong-Ho Choi, Input feature selection by mutual information based on Parzen window (2002).
[3] S. Kamrava, P. Tahmasebi, M. Sahimi, and S. Arbabi, Phase transitions, percolation, fracture of materials, and deep learning, Phys. Rev. E, 011001 (2020).
[4] E. d. M. Koch, A. d. M. Koch, N. Kastanos, and L. Cheng, Short-sighted deep learning, Phys. Rev. E, 013307 (2020).
[5] R. A. Heinonen and P. H. Diamond, Turbulence model reduction by deep learning, Phys. Rev. E, 061201 (2020).
[6] J. Zimmermann, B. Langbehn, R. Cucini, M. Di Fraia, P. Finetti, A. C. LaForge, T. Nishiyama, Y. Ovcharenko, P. Piseri, O. Plekan, K. C. Prince, F. Stienkemeier, K. Ueda, C. Callegari, T. Möller, and D. Rupp, Deep neural networks for classifying complex features in diffraction images, Phys. Rev. E, 063309 (2019).
[7] R. A. Vargas-Hernández, Y. Guan, D. H. Zhang, and R. V. Krems, Bayesian optimization for the inverse scattering problem in quantum reaction dynamics, New Journal of Physics, 022001 (2019).
[8] H. M. Yao, W. E. I. Sha, and L. Jiang, Two-step enhanced deep learning approach for electromagnetic inverse scattering problems, IEEE Antennas and Wireless Propagation Letters, 2254 (2019).
[9] A. Palffy, J. Dong, J. F. P. Kooij, and D. M. Gavrila, CNN based road user detection using the 3D radar cube, IEEE Robotics and Automation Letters, 1263 (2020).
[10] E. de Prunelé, Two-dimensional quantum scattering by non-isotropic interactions localized on a circle, applications to open billiards, Journal of Mathematical Physics, 102102 (2018).
[11] A. C. Maioli and A. G. M. Schmidt, Exact solution to Lippmann-Schwinger equation for a circular billiard, Journal of Mathematical Physics, 122102 (2018).
[12] A. C. Maioli and A. G. Schmidt, Exact solution to the Lippmann-Schwinger equation for an elliptical billiard, Physica E: Low-dimensional Systems and Nanostructures, 51 (2019).
[13] A. C. Maioli and A. G. M. Schmidt, Two-dimensional scattering by boundary-wall and linear potentials, Physica Scripta, 10.1088/1402-4896/ab57e6 (2019).
[14] M. G. E. da Luz, A. S. Lupu-Sax, and E. J. Heller, Quantum scattering from arbitrary boundaries, Physical Review E, 2496 (1997).
[15] F. M. Zanetti, E. Vicentini, and M. G. da Luz, Eigenstates and scattering solutions for billiard problems: A boundary wall approach, Annals of Physics, 1644 (2008).
[16] F. M. Zanetti, M. L. Lyra, F. A. B. F. de Moura, and M. G. E. da Luz, Resonant scattering states in 2D nanostructured waveguides: a boundary wall approach, Journal of Physics B: Atomic, Molecular and Optical Physics, 025402 (2009).
[17] G. Rizzuti and A. Gisolf, An iterative method for 2D inverse scattering problems by alternating reconstruction of medium properties and wavefields: theory and application to the inversion of elastic waveforms, Inverse Problems, 035003 (2017).
[18] G. Ariel and H. Diamant, Inferring entropy from structure, Phys. Rev. E, 022110 (2020).
[19] T. Tyni, Numerical results for Saito's uniqueness theorem in inverse scattering theory, Inverse Problems, 065002 (2020).
[20] G. Fotopoulos and M. Harju, Inverse scattering with fixed observation angle data in 2D, Inverse Problems in Science and Engineering, 1492 (2017).
[21] I. R. Lapidus, Quantum-mechanical scattering in two dimensions, American Journal of Physics, 45 (1982).
[22] P. A. Maurone and T. K. Lim, More on two-dimensional scattering, American Journal of Physics, 856 (1983).
[23] S. K. Adhikari, Quantum scattering in two dimensions, American Journal of Physics, 362 (1986), 1601.02657.
[24] E. de Prunelé, Solvable quantum mechanical model in two-dimensional space, Journal of Physics A: Mathematical and General, 12469 (2006).
[25] J. Heaton, Artificial Intelligence for Humans, Volume 3: Deep Learning and Neural Networks, Artificial Intelligence for Humans (CreateSpace Independent Publishing Platform, 2015).
[26] S. Fahlman and C. Lebiere, The cascade-correlation learning architecture, Advances in Neural Information Processing Systems 2.