Identifying extra high frequency gravitational waves generated from oscillons with cuspy potentials using deep neural networks
PAPER • OPEN ACCESS
To cite this article: Li-Li Wang et al 2019 New J. Phys. 21 043005
https://doi.org/10.1088/1367-2630/ab1310
Li-Li Wang, Jin Li, Nan Yang and Xin Li
Department of Physics, Chongqing University, Chongqing 401331, People's Republic of China
Department of Electronical Information Science and Technology, Xingtai University, Xingtai 054001, People's Republic of China
Author to whom any correspondence should be addressed.
E-mail: [email protected], [email protected], [email protected] and [email protected]

Keywords: extra high frequency gravitational waves, deep neural networks, signal classification, parameter estimation

Abstract
During oscillations of the cosmological inflaton around the minimum of a cuspy potential after inflation, the production of extra high frequency gravitational waves (HFGWs) (∼GHz) has recently been demonstrated. Based on the electromagnetic resonance system for detecting such extra HFGWs, we adopt a new data processing scheme to identify the corresponding GW signal, namely the transverse perturbative photon flux (PPF). In order to overcome the low efficiency and high interference of traditional data processing methods, we adopt deep learning to extract the PPF and to estimate some source parameters. Deep learning provides an effective method for classification and prediction tasks. We also adopt anti-overfitting techniques and adjust several hyperparameters in the course of the study, which improves the performance of the classifier and predictor to a certain extent. Here a convolutional neural network (CNN) is used to implement the deep learning process concretely. We investigate how the classification accuracy varies with the ratio between the numbers of positive and negative samples: when this ratio exceeds 0.11, the accuracy can reach 100%. Besides, we also investigate the classification accuracy for different amplitudes of the extra HFGWs. As a predictor, the mean relative error of the parameter estimation decreases as the amplitude of the extra HFGWs increases; in particular, when the amplitude h(t) is in 10^−…–10^−…, the mean relative error reaches around 0.014. On the contrary, the mean relative error increases with increasing frequency over 10^…–10^… Hz; at the optimal resonance frequency of 5 × 10^9 Hz, the mean relative error is 0.12. We also study how the mean relative error varies with the waist radius W of the Gaussian beam; its optimal value is 0.138 when W is approximately in (0.05 m, 0.1 m).
Compared with classifiers and predictors using other machine learning algorithms, the deep CNN achieves higher accuracy and lower error on our datasets.
1. Introduction
GWs, as one of the predictions of general relativity, have been discussed intensively in astronomy and theoretical physics. Several GW signals emitted from coalescences of binary black holes and binary neutron stars have now been verified, all of which fall in aLIGO's frequency band (…–… Hz) [–]. Beyond those sources, GWs can also arise from many others, including core-collapse supernovae [], rotating neutron stars [], coalescing stellar binaries [–], coalescing massive black hole binaries [–] and magnetars [20, 21], which lie in other frequency regions. Therefore GW detectors for different frequency bands have been designed and will come into operation successively; for instance, pulsar timing arrays (10^−…–10^−… Hz) [–] and space-based interferometers such as eLISA (10^−…–10^… Hz) []. In recent years, it has been shown that inflaton oscillations around the minimum of a cuspy potential after inflation [] and parametric resonance of the inflaton field with other matter fields in preheating or at the end of inflation [] could produce extra HFGWs at 10^…–10^… Hz with dimensionless amplitude h ∼ 10^−…–10^−….
ACCEPTED FOR PUBLICATION
25 March 2019
PUBLISHED

Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. © 2019 The Author(s). Published by IOP Publishing Ltd on behalf of the Institute of Physics and Deutsche Physikalische Gesellschaft.

The source of extra HFGWs (i.e. inflaton oscillations around the minimum of a cuspy potential after inflation) is our study object. An EM resonance system for detecting extra HFGWs, regarded as a supplement to current GW projects, was proposed by Professor Li []. Its basic principle rests on the electrodynamics equations in curved spacetime [], by which a static background magnetic field in fluctuating curved spacetime generates a perturbative EM field. Such an EM field carries the GW's information and can interact with an artificially applied background EM field, generating PPF in the direction perpendicular to the GW propagation. Because the PPF reflects the existence of extra HFGWs, it is taken as our signal. Unfortunately, owing to the weak amplitude of the extra HFGWs, the PPF is always submerged in the background EM field and noise. In the traditional signal processing scheme [], a special material named fractal membranes is used, in theory, to diminish the transverse background photon fluxes (BPFs) in a specific area so as to ensure a sufficient signal-to-noise ratio (SNR) there; but this introduces a great deal of potential new electromagnetic noise, so it is difficult to extract the PPF by the traditional method. Here we utilize deep learning, one of the most advanced techniques in machine learning, to extract the signal and estimate the corresponding GW source parameters without fractal membranes. Deep learning is more expressive than traditional methods in data analysis. It has been used widely in engineering applications and has achieved great success in recent years, for example in deep generative models, machine translation, attention in deep models, one-shot learning, style transfer and deep unsupervised learning [–].
In our case, each dataset is a one-dimensional time series [39, 40]. The effect of a GW is always at low SNR. Directly handling the raw one-dimensional series avoids losing information, compared with converting it to a 2D image. Therefore in this paper we focus on the application of a CNN, which reduces computational cost through weight sharing and small kernels, to recognize the PPF generated by extra HFGWs submerged in stationary Gaussian white noise, shot noise and noise from the inhomogeneity of the background static magnetic field. The deep learning process can be divided into two parts: (1) classification, in which data containing the signal (PPF) are extracted and labelled as a GW event, while the remaining data are classified as noise; (2) prediction, in which some parameters of the successfully classified GW events are estimated. Recently, using CNNs to recognize GWs in the aLIGO frequency band and estimate source parameters has achieved great success []. It was shown that deep filtering significantly outperforms conventional machine learning algorithms, with results consistent with those yielded by matched filtering [39, 40]. At present, the samples are not from true events, because real-world GW signals in the extra high frequency band (∼GHz) discussed in this paper have not been detected yet. We therefore generate simulated data for this work. This research helps introduce new technology and lays a theoretical foundation for future experiments.

This paper is organized as follows. In section 2, we introduce our simulated data and the CNN we designed; the noises considered are stationary Gaussian white noise, shot noise and noise produced by the inhomogeneity of the background magnetic field. In section 3, the effect of the positive-to-negative sample ratio in the training datasets on the deep CNN's accuracy is investigated; by tuning the hyperparameters, the classification accuracy as a function of GW amplitude is obtained, and we also compare the accuracy of the deep CNN with that of other machine learning algorithms. In section 4, the ability of our CNN as a predictor is discussed by adjusting the hyperparameters, and we compare the deep CNN with predictors using baseline machine learning methods for parameter estimation. Conclusions and remarks are presented in section 5.
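The weight-sharing property that makes the CNN economical on one-dimensional series, as described above, can be illustrated with a minimal sketch (a generic NumPy example, not the authors' Mathematica implementation; the input and kernel values are arbitrary toy data):

```python
import numpy as np

def conv1d(x, kernel, stride=1):
    """Valid 1D cross-correlation: one small kernel slides over the
    whole series, so only len(kernel) weights are learned regardless
    of the input length (weight sharing)."""
    n = (len(x) - len(kernel)) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + len(kernel)], kernel)
                     for i in range(n)])

x = np.arange(8, dtype=float)        # toy 1x8 input series
k = np.array([1.0, 0.0, -1.0])       # 3-tap difference kernel
y = conv1d(x, k)
print(y)   # 6 outputs produced by just 3 shared weights
```

A full CNN stacks several such convolutions with nonlinearities and pooling; the point here is only that the parameter count is set by the kernel size, not by the series length.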
2. The EM resonant response to the extra HFGWs and obtaining datasets
The EM response system consists of a background EM field (i.e. a Gaussian beam (GB)) and a static magnetic field []. The EM response process includes two stages: (1) the extra HFGWs propagating along the z-axis interact with the static magnetic field and generate a transverse (i.e. along the x-axis and y-axis) first-order perturbative EM field [–]. In fact, the propagation direction of the extra HFGW is not always along the z-axis of our observation direction, so we add an intersection-angle term to the calculation, which is discussed in equation (5); (2) when the frequency ν_g of the GW equals the frequency ν_e of the background EM field, the interaction between the transverse first-order perturbative EM field and the GB generates a transverse first-order PPF. The PPF is a physical effect of the extra HFGWs, which can be observed by a photon counter in the transverse direction of the extra HFGW.

In general, a GB in the fundamental frequency mode can be expressed as []

ψ = [ψ0 / (1 + z²/z_R²)^(1/2)] exp(−r²/W(z)²) exp{i[(k_e z − ω_e t) − tan⁻¹(z/z_R) + k_e r²/(2R(z)) + δ]},   (1)

where ψ0 denotes the amplitude of the electric field component of the GB, r² = x² + y², k_e = 2π/λ_e and ω_e = k_e c; λ_e, k_e and ω_e are the wavelength, wave number and angular frequency of the EM field of the GB, respectively. z_R = πW0²/λ_e is the Rayleigh size of the GB. W0 denotes the waist radius of the GB, i.e. the minimum spot radius along the z-axis; it can be adjusted within the region (0.05 m, 0.1 m) as appropriate. W(z) = W0(1 + z²/z_R²)^(1/2), and R(z) = z(1 + z_R²/z²) represents the curvature radius of the wave fronts of the GB on the z-axis. Note that the z-axis is the symmetry axis of the GB (i.e. the propagation axis of the GB). δ denotes the phase difference between the GB and the resonant components of the extra HFGWs. The static magnetic field B̄_y points along the y-axis and is located in a fixed region, −l₁ ⩽ z ⩽ l₂ along the z-axis, where l₁ = l₂ = …. Thus we set the components of the background electric field as E_x = ψ and E_z = 0.
Then, the other components of the background EM field follow from Maxwell's equations:

E_y = −∫ (∂E_x/∂x) dy,   (2)

B_z = (i/ω_e)(∂E_y/∂x − ∂E_x/∂y).   (3)
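The beam-geometry quantities defined above can be checked numerically; this sketch assumes a 5 GHz beam and a waist W0 = 0.06 m (the value used in the figure 1 example), evaluating the Rayleigh size, spot radius and wavefront curvature radius:

```python
import math

c = 2.998e8          # speed of light, m/s
nu_e = 5e9           # GB frequency, Hz (assumed: the 5 GHz resonance)
lam_e = c / nu_e     # wavelength lambda_e of the GB
W0 = 0.06            # waist radius, m (value quoted for figure 1)

z_R = math.pi * W0 ** 2 / lam_e     # Rayleigh size z_R = pi W0^2 / lambda_e

def W(z):
    """Spot radius W(z) = W0 (1 + z^2/z_R^2)^(1/2)."""
    return W0 * math.sqrt(1 + (z / z_R) ** 2)

def R(z):
    """Wavefront curvature radius R(z) = z (1 + z_R^2/z^2)."""
    return z * (1 + (z_R / z) ** 2)

print(lam_e, z_R, W(z_R), R(z_R))
# At z = z_R the spot radius has grown by sqrt(2) and R takes its
# minimum value R = 2 z_R, as expected for a Gaussian beam.
```

For centimetre wavelengths the Rayleigh size is of order 0.1–1 m, so the receiving surface sits well within the collimated region of the beam.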
From [], one finds that the PPF along the y-axis has no observable effect because it has the same distribution as the background EM field. Therefore, in this paper we concentrate only on the PPF along the x-axis as the transverse PPF, owing to its distinctly different spatial distribution from that of the background EM field. The BPF density pointing along the x-axis can then be represented as

n_x⁽⁰⁾ = (1/(ħω_e μ0)) ⟨E_y⁽⁰⁾ B_z⁽⁰⁾⟩,   (4)

where μ0 is the vacuum permeability and the angular brackets denote the average over time. The expressions for E_y and B_z are given in (2) and (3) above. According to Maxwell's equations in curved spacetime and the spacetime fluctuation caused by GWs, when extra HFGWs pointing along a direction at an intersection angle θ to the z-axis are immersed in the static magnetic field, a transverse first-order perturbative EM field is generated, whose electric component in the extra-high-frequency condition is given as follows []:

Ẽ_y⁽¹⁾ = (1/2) i h(t) k_g B̄_y c (z + l) exp[i(k_g z cos θ − ω_g t)] + (i/4) h(t) B̄_y c exp[i(k_g z cos θ − ω_g t)],   (5)

where h(t) is the amplitude of the extra HFGWs, and k_g and ω_g are their wave number and angular frequency, which equal the wave number k_e and angular frequency ω_e of the background EM field. Averaging over the whole range of θ, Ẽ_y⁽¹⁾ acquires an overall numerical factor from integrating over the θ term []. In [], the amplitude of the extra HFGWs is distributed in the region 10^−…–10^−…. The background EM field coupled with the first-order perturbative EM field generates first-order perturbative energy fluxes. The first-order PPF density along the x-axis can then be expressed as

n_x⁽¹⁾ = (1/(ħω_e μ0)) ⟨Ẽ_y⁽¹⁾ B_z⁽⁰⁾⟩,   (6)

where the expressions for B_z and Ẽ_y are provided in (3) and (5). Then, according to formulae (4) and (6), the BPF and PPF along the x-axis can be obtained as

N_x⁽⁰⁾ = ∬_Δs n_x⁽⁰⁾ dy dz,   (7)

N_x⁽¹⁾ = ∬_Δs n_x⁽¹⁾ dy dz,   (8)

where y and z range over finite intervals along the y-axis and z-axis respectively, and Δs is a 'typical receiving surface' on the yoz plane, with an integration area of around 0.03 m². As shown in [], although the strength of the BPF in most areas is much larger than that of the PPF, they have distinct distributions. Thus the BPF can be removed from the data by taking the difference of the photon numbers with the magnetic field switched on and off. After eliminating the BPF from the interaction between the background electric and magnetic fields, there remain thermal noise, shot noise [], quantum fluctuation noise, stationary Gaussian white noise and noise produced by the inhomogeneous background magnetic field [] in curved spacetime. In this paper, we take stationary Gaussian white noise, shot noise and noise from the inhomogeneity of the background magnetic field into account. We choose the parameters of the EM resonance system as follows: the power of the GB P = 10 W, the amplitude of the GB ψ0 ≈ …×10^−…, and the background static magnetic field B̄_y =
10 T, and δ = π. According to the above principles, we can simulate datasets which include (1) PPF with the three types of noise mentioned above as positive samples, and (2) pure noise as negative samples. The energies of the PPF, the stationary Gaussian white noise, the shot noise, the noise from the inhomogeneity of the background magnetic field, and the PPF mixed with the overall noise are shown in figure 1(a). Compared to the stationary Gaussian white noise, the influence of the shot noise and of the noise from the inhomogeneous background magnetic field is not dominant. The training and testing datasets are normalized by taking the natural logarithm and dividing by the respective maximum. The positive and negative samples of the training and testing sets after normalization are illustrated in figures 1(b) and (c) respectively, which shows that our positive and negative points are well mixed and cannot be discriminated in an obvious way.

Figure 1. (a) The energy distribution of one dataset with SNR = …. The shot noise, the noise from the inhomogeneity of the background magnetic field (BMF), the pure transverse first-order PPF, the stationary Gaussian white noise (GWN), and the PPF mixed with the overall noise are shown with the cyan dashed curve, crimson dashed-dotted curve, black dotted curve, orange solid curve and magenta solid curve respectively. Here, the dimensionless amplitude of the extra HFGW is h(t) = 10^−…, the frequency of the extra HFGW is 10^… Hz, and the waist radius W of the GB is 0.06 m. (b) The positive and negative samples of the training and validation sets after normalization (note: there are 12 000 training sets and 4000 validation sets) are shown with blue and orange points respectively. (c) The positive and negative samples of the testing datasets after normalization (there are 4000 datasets) are shown with blue and orange points respectively.

Our aim is to recognize the datasets including PPF from those containing only pure noise. Firstly, we designed our deep CNN as in figure 2.
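The normalization step described above (natural logarithm, then division by the per-set maximum) can be sketched as follows; the synthetic `energy` array is a purely illustrative stand-in for one simulated photon-flux series, not the paper's actual data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one simulated energy series: positive values spanning
# several orders of magnitude, like the PPF-plus-noise datasets.
energy = 10.0 ** rng.uniform(1.0, 6.0, size=1024)

def normalize(series):
    """Take the natural logarithm, then divide by the maximum,
    mapping the series into (0, 1] as done for the training sets."""
    logged = np.log(series)
    return logged / np.max(logged)

x = normalize(energy)
print(x.min(), x.max())   # the maximum is exactly 1 after scaling
```

The logarithm compresses the dynamic range so that both the strong background-dominated bins and the weak signal bins contribute on a comparable scale to the network's input.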
Secondly, in order to find the optimal weights and biases of our CNN, we feed it a large number of datasets, called training sets, to train it. Meanwhile, during training we apply validation sets for anti-overfitting. Finally, the accuracy of the classifier and the mean relative error of the predictor are obtained by applying the trained deep CNN to the testing datasets. Note that the training, validation and testing sets were chosen to be disjoint, having different source parameters, as shown in figure 3. In the training process, we tune the hyperparameters according to the loss curves of the training and validation datasets on the Mathematica platform, and choose the optimal values that minimize the gap between the training loss and the validation loss.
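The tuning criterion just described — preferring hyperparameter settings that minimize the gap between training and validation loss — can be sketched generically; the loss values below are made-up placeholders, not the paper's measurements:

```python
# Hypothetical records mapping hyperparameter settings to
# (training loss, validation loss), e.g. from separate training runs.
runs = {
    ("dropout=0.5", "batch=100"): (0.020, 0.024),
    ("dropout=0.0", "batch=100"): (0.005, 0.080),   # overfits: large gap
    ("dropout=0.5", "batch=200"): (0.030, 0.033),
}

def best_by_generalization_gap(results):
    """Pick the run whose |validation - training| loss gap is smallest,
    mirroring the anti-overfitting criterion used in the paper."""
    return min(results, key=lambda k: abs(results[k][1] - results[k][0]))

best = best_by_generalization_gap(runs)
print(best)   # -> ("dropout=0.5", "batch=200"), whose gap is 0.003
```

In practice one would also require the validation loss itself to be low, not only the gap; the sketch isolates the gap criterion because that is the one stated in the text.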
3. The application of deep learning for classification

Here we investigate the classification accuracy of the deep CNN for 12 000 training datasets in the amplitude region 10^−…–10^−…. In order to achieve optimal results with our CNN, we spent considerable effort on tuning the hyperparameters. In this paper, we take the accuracy, i.e. the fraction of samples classified correctly, as the metric measuring classifier performance. Using the CNN architecture shown in figure 2, the results with various

Figure 2.
Architecture of our convolutional neural network used for classification. The input is one of our training datasets and the output is one of two classes, i.e. True or False. For prediction we simply remove the softmax layer after the 15th layer and use the mean relative error as the loss function.

Figure 3.
Our simulated training and testing datasets vary with the amplitude h(t) and frequency ν_g = ω_g/2π of the extra HFGWs, and with the waist radius W of the GB. On the Mathematica platform, the validation sets can be chosen automatically from the training datasets. Here we set the ratios of training, validation and testing datasets to 60%, 20% and 20% respectively.

hyperparameters are shown in figure 5, where one can find that the classification accuracy of the CNN with certain specific hyperparameters can reach 100% in the amplitude region 10^−…–10^−…, i.e. noisy signals with amplitudes of 10^−…–10^−… can be recognized. Some anti-overfitting methods, such as DropoutLayer and L2Regularization, and two further hyperparameters, MaxTrainingRounds (i.e. the number of iterations) and BatchSize (i.e. the number of samples trained in a batch), are considered. The feature maps of our deep CNN are shown in figure 4, which demonstrates that the CNN indeed extracts the key features of the pure signal and effectively discriminates positive from negative samples. However, it is inevitable that the ratio of positive to negative samples is severely imbalanced in the real world, especially since GW events in the high frequency band have not been detected yet. As shown in figure 6, when the ratio of positive to negative samples is 0.03, the accuracy reaches 97.63%, and the corresponding AUC (i.e. the area under the receiver operating characteristic curve) is 1. Once the ratio exceeds 0.11, the accuracy goes up to 100%. In some GW research, the same number of noisy GW signals and pure noise samples was adopted [
39, 40]. DropoutLayer and L2Regularization are the main anti-overfitting methods adopted in this paper, and they improve the accuracy of the deep CNN to a certain extent. In our results, from figure 5(a), when the DropoutLayer probability is set to 0.5, the accuracy of the deep CNN reaches 1 over the amplitude range h(t) ∼ 10^−…–10^−…; thus, in the following, the dropout ratio is fixed at 0.5. For L2Regularization, a regularization coefficient of 0 may be preferable for our case in each interval of the amplitude, as shown in figure 5(b). Furthermore, we also study the two other hyperparameters, MaxTrainingRounds and BatchSize. As shown in figures 5(c) and (d), when their values are set to 150 and 100 respectively, the accuracy of the classifiers reaches 1 over the whole amplitude range. As a comparison, we also investigate the performance on our datasets of commonly used machine learning methods, including Random Forest, Support Vector Machine, k-Nearest Neighbors, Neural Networks,

Figure 4. (a) From left to right, the feature maps of the pure signal (PPF) after the 1st, 2nd and 3rd convolution layers respectively; (b) from left to right, the feature maps of a negative sample after the 1st, 2nd and 3rd convolution layers respectively; (c) from left to right, the feature maps of a positive sample after the 1st, 2nd and 3rd convolution layers respectively.

Logistic Regression and Naive Bayes. Here the Neural Network is a simple feedforward network with the input and output layers connected by one hidden layer. For the different machine learning methods, the AUC values are 1, 0.5648, 0.5545, 0.5530, 0.5493, 0.5417 and 0.5156 for deep CNN, Naive Bayes, Logistic Regression, Neural Network, Nearest Neighbors, Support Vector Machine and Random Forest respectively (see figure 7(b)). Therefore, in our case the traditional machine learning classifiers are not as reliable as the deep CNN.

Figure 5.
(a)–(d) represent the classification accuracy of our designed deep convolutional neural network varying with GW amplitude (h(t) ∼ 10^−…–10^−…) for different values of DropoutLayer, L2Regularization, MaxTrainingRounds and BatchSize respectively. Here, 12 000 training samples, 4000 validation samples and 4000 testing samples are used, and they are disjoint.

Figure 6.
The accuracy of the deep convolutional neural network for classification after training on 10 000 samples, some of which are positive and the rest negative. The horizontal axis represents the ratio of positive to negative samples.
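The two metrics used in figures 6 and 7 — accuracy and the area under the ROC curve — can be computed directly from classifier scores, as in this dependency-free sketch (the labels and scores are toy values, not the paper's outputs):

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative (ties count one half) -- the Mann-Whitney formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]   # one positive ranked low
print(accuracy(y_true, y_score), auc(y_true, y_score))
```

Unlike accuracy, the AUC is insensitive to the positive-to-negative ratio, which is why it is the more informative metric for the imbalanced-sample experiment of figure 6.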
Figure 7. Left panel: an accuracy comparison of different machine learning methods as classifiers, trained on our datasets in the amplitude region h(t) ∼ 10^−…–10^−…. Here, 12 000 training samples, 4000 validation samples and 4000 testing samples are used, and they are disjoint. Right panel: the ROC curves of the different machine learning methods in the amplitude region h(t) ∼ 10^−…–10^−….

Figure 8. (a)–(d) represent the mean relative error obtained by the deep convolutional neural network for estimating the amplitude in the region h(t) ∼ 10^−…–10^−…, varying with DropoutLayer, L2Regularization, MaxTrainingRounds and BatchSize respectively. Here, 12 000 training samples, 4000 validation samples and 4000 testing samples are used, and they are disjoint.

4. Parameter estimation with deep learning

In this paper, we focus only on extra HFGWs generated by inflaton oscillations around the minimum of a cuspy potential after inflation []. The extra HFGWs are distributed in the amplitude region 10^−…–10^−… and the frequency band 10^…–10^… Hz. In this section, we estimate three parameters: the dimensionless amplitude h(t)

Figure 9.
This is the mean relative error obtained by various machine learning algorithms for estimating the amplitude in the region h(t) ∼ 10^−…–10^−…. Here, 12 000 training samples, 4000 validation samples and 4000 testing samples are used, and they are disjoint.

Figure 10. (a)–(d) represent the mean relative error obtained by the deep convolutional neural network for estimating the frequency in the region 10^…–10^… Hz with different DropoutLayer, L2Regularization, MaxTrainingRounds and BatchSize respectively. Here, 12 000 training samples, 4000 validation samples and 4000 testing samples are used, and they are disjoint.

and the frequency ν_g = ω_g/2π of the extra HFGWs, and the waist radius W of the GB. Taking the mean relative error as the loss function, we perform the parameter estimation using a CNN similar to that shown in figure 2.

For the estimation of the GW amplitude: the amplitude is an important characteristic property of a GW, which strongly affects the possibility of recognition. By adopting deep learning, we can find an optimal detection range for the extra HFGWs. Following the same hyperparameter-tuning process as in classification, we obtain the results shown in figure 8. It can be seen that the mean relative error decreases, on the whole, as the amplitude increases (see figure 8(a)). In the optimal case (i.e. with DropoutLayer left at its default), we further investigate how L2Regularization, MaxTrainingRounds and BatchSize affect it (see figures 8(b)–(d)). In summary, the predictor with DropoutLayer and L2Regularization omitted, and with MaxTrainingRounds and BatchSize set to 700 and 100 respectively, compresses the mean relative error to less than 0.018 over the whole amplitude region 10^−…–10^−….

Compared with predictors using common machine learning methods, such as Gaussian Process, k-Nearest Neighbors, Linear Regression, Neural Network and Random Forest, the predictor adopting the deep CNN algorithm is more expressive.
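The loss used by all three predictors is the mean relative error; a minimal definition, with toy true and predicted values rather than the paper's data, is:

```python
def mean_relative_error(true_vals, predicted):
    """Average of |prediction - truth| / |truth| over the test set,
    the quantity reported in figures 8-12."""
    return sum(abs(p - t) / abs(t)
               for t, p in zip(true_vals, predicted)) / len(true_vals)

# Toy example: three amplitudes each recovered to within 2%.
truth = [1.00e-30, 2.00e-30, 4.00e-30]
preds = [1.02e-30, 2.04e-30, 4.08e-30]
print(mean_relative_error(truth, preds))   # ≈ 0.02, i.e. 2%
```

Being dimensionless, this loss lets errors on quantities of very different scales (amplitudes of order 10^−30, frequencies of order GHz, radii of order 0.1 m) be compared on the same footing.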
For our data sets, the predictor produced by the deep CNN yields the lowest mean relative error over the entire amplitude range 10−–10−, as shown in figure 9, where it is less than 0.016.
For the estimation of the GW frequency: extra HFGWs span a broad frequency band νg ∼ 10–10 Hz. Here we focus on the band from 10 to 10 Hz. On the whole, the mean relative error of the deep CNN increases with frequency in 10–10 Hz, as shown in figure 10. In figure 10(a) the errors of these predictors are clearly influenced by the DropoutLayer, and its optimal setting is the default. So, with the DropoutLayer at its default, we examine the other hyperparameters further and find that the L2Regularization and MaxTrainingRounds have little effect on the performance of the CNN. From figures 10(b)–(d), the predictor suits our data sets best with the L2Regularization left at its default, MaxTrainingRounds and BatchSize set to 500 and 200 respectively, and the DropoutLayer omitted. Through adjusting these hyperparameters, the mean relative errors of the predictors vary from 0.05 to 0.45 over the frequency band.
Figure 11.
This is the mean relative error obtained by various machine learning algorithms for estimating frequency in the region 10–10 Hz. Here, 12 000 training samples, 4000 validation samples and 4000 testing samples are used, and they are disjoint.
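The "mean relative error" reported in figures 9–11 can be written down directly as a one-line metric. A minimal sketch, with illustrative values rather than the paper's data:

```python
def mean_relative_error(predicted, true):
    """Average of |pred - true| / |true| over all samples."""
    if len(predicted) != len(true):
        raise ValueError("prediction/target length mismatch")
    return sum(abs(p - t) / abs(t) for p, t in zip(predicted, true)) / len(true)

# Example: estimated amplitudes versus injected values (hypothetical numbers).
true_amps = [1.0e-29, 2.0e-29, 5.0e-29]
pred_amps = [1.1e-29, 1.9e-29, 5.2e-29]
err = mean_relative_error(pred_amps, true_amps)  # average of 10%, 5% and 4%
```

Because the metric is relative, it is scale-invariant, which is what makes it usable as a single loss across amplitudes spanning several orders of magnitude.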
Figure 12.
This is the mean relative error obtained by various machine learning algorithms for estimating waist radius in the region 0.05–0.1 m.
At the optimal resonance frequency of the extra-HFGWs detector, 5 × Hz [ ], the mean relative error of the optimal deep CNN is around 0.12. In the same way, a comparison between the deep CNN and predictors using other machine-learning methods is shown in figure 11. Our predictor successfully measures the frequency of given noisy signals with a relatively low error.
For the estimation of the waist radius of the GB: the waist radius W describes the minimum spot radius of the GB. We investigate the mean relative error of the waist radius from 0.05 m to 0.1 m using the deep CNN and other machine-learning algorithms. The exact result is shown in figure 12. Compared to the other predictors, the mean relative error of the deep CNN is the lowest, at 0.138. Thus, there will be observable effects as long as the waist radius is set in the region (0.05 m, 0.1 m).
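The comparison above, ranking predictors by their mean relative error on a held-out test set, can be sketched with simple stand-in baselines. This is a hedged illustration on synthetic data, not the paper's PPF data sets: a 1-nearest-neighbour predictor and a least-squares linear fit take the place of the classical machine-learning methods, and the toy task is to recover a waist radius W in [0.05, 0.1] m from a single noisy feature.

```python
import random

def mre(pred, true):
    # Mean relative error, as used throughout the paper's comparisons.
    return sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)

def nearest_neighbour(train_x, train_y, x):
    # Predict the target of the closest training feature (1-NN regression).
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

def linear_fit(train_x, train_y):
    # Ordinary least-squares fit y = my + slope * (x - mx).
    n = len(train_x)
    mx = sum(train_x) / n
    my = sum(train_y) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
    sxx = sum((x - mx) ** 2 for x in train_x)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

random.seed(0)
# Synthetic regression task (illustrative only): a noisy observable tracking W.
train_W = [random.uniform(0.05, 0.1) for _ in range(200)]
train_f = [w + random.gauss(0.0, 0.002) for w in train_W]
test_W  = [random.uniform(0.05, 0.1) for _ in range(100)]
test_f  = [w + random.gauss(0.0, 0.002) for w in test_W]

knn_pred = [nearest_neighbour(train_f, train_W, f) for f in test_f]
lin = linear_fit(train_f, train_W)
lin_pred = [lin(f) for f in test_f]

errors = {"1-NN": mre(knn_pred, test_W), "Linear": mre(lin_pred, test_W)}
```

In the paper, the deep CNN and the listed classical predictors are all scored with the same metric on the same disjoint test set, so a table like `errors` is the basis of figures 9, 11 and 12.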
5. Conclusion and remarks
Through the interaction between a static magnetic field and extra HFGWs, which generates a transverse first-order perturbative EM field, the transverse PPF can be produced as a special EM effect of the extra HFGWs generated by inflaton oscillations around the minimum of a cuspy potential after inflation. The amplitude of the extra HFGWs lies in the region (10−, 10−) and the frequency ranges from 10 to 10 Hz. Through the application of a deep CNN to extracting the GW signal and to the corresponding parameter estimation, the efficiency of deep-learning technology has been demonstrated. A deep CNN can be trained on a large number of training data sets and, after appropriate training, carries out classification and prediction efficiently. In the training process, we use a DropoutLayer and L2Regularization to avoid overfitting. Our deep CNN successfully classifies and predicts on our data sets. In this paper, we also discuss how the ratio of positive to negative samples affects the classification accuracy of the classifier. One finds that the accuracy reaches 97.63% when the ratio exceeds 0.03; in particular, when the ratio exceeds 0.11, the accuracy reaches 100%. Moreover, the classification accuracy over the whole amplitude region 10−–10− is close to 100% when training on 12 000 training sets. Through analysis of several hyperparameters, including the DropoutLayer, L2Regularization, MaxTrainingRounds and BatchSize, the accuracy of the deep-CNN classifier is higher than that of the other commonly used classifiers over the whole amplitude range. Therefore, extra HFGWs could be extracted from raw noisy data sets with a high confidence level by a deep CNN. A GW with a stronger amplitude yields a relatively high SNR and is much easier to recognize with this scheme. The mean relative error decreases as the amplitude of the extra HFGWs increases; signals with amplitudes ranging from 10− to 10− are easier to recover.
Fortunately, the amplitudes of HFGWs predicted by several classical cosmological models lie in this region, such as the quintessential inflationary models, some string cosmology scenarios and the nano piezoelectric crystal array. Through tuning some hyperparameters, the optimal architecture of the deep CNN can be fixed. One finds that both the classifier and the predictor built from the deep CNN perform better than traditional machine-learning methods. Therefore, the PPF generated from extra HFGWs can be distinguished within one-dimensional noisy data sets with a high confidence level by deep learning. Our results indicate that the deep-learning technique can improve the operability of extra-HFGWs classification and parameter estimation.
Acknowledgments
This work has been supported by the National Natural Science Foundation of China (Grant Nos. 11873001, 11775038 and 11847301), by the Fundamental Research Funds for the Central Universities (Grant Nos. 106112017CDJXFLX0014 and 2019CDJDWL0005), and by the Natural Science Foundation of Chongqing (No. cstc2018jcyjAX0767).
ORCID iDs
Jin Li https://orcid.org/
References
[ ] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) Phys. Rev. Lett.
[ ] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) Phys. Rev. X
[ ] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) Phys. Rev. Lett.
[ ] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) Phys. Rev. Lett.
[ ] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) Astrophys. J. Lett.
[ ] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) Phys. Rev. Lett.
[ ] Abbott B P et al (LIGO Scientific Collaboration and Virgo Collaboration) Phys. Rev. Lett.
[ ] Ferrari V, Matarrese S and Schneider R 1999 Mon. Not. R. Astron. Soc.
[ ] Regimbau T and de Freitas Pacheco J A 2001 Astron. Astrophys.
[ ] Schneider R et al Mon. Not. R. Astron. Soc.
[ ] Farmer A J and Phinney E S 2003 Mon. Not. R. Astron. Soc.
[ ] Regimbau T and de Freitas Pacheco J A 2006 Astrophys. J.
[ ] Regimbau T and Mandic V 2008 Class. Quantum Grav.
[ ] Regimbau T and Hughes S A 2009 Phys. Rev. D
[ ] Zhu X J, Howell E, Regimbau T, Blair D and Zhu Z H 2011 Astrophys. J.
[ ] Marassi S, Schneider R, Corvino G, Ferrari V and Portegies Zwart S 2011 Phys. Rev. D
[ ] Rosado P A 2011 Phys. Rev. D
[ ] Sesana A et al Astrophys. J.
[ ] Sesana A, Vecchio A and Colacino C N 2008 Mon. Not. R. Astron. Soc.
[ ] Regimbau T and de Freitas Pacheco J A 2006 Astron. Astrophys.
[ ] Wu C J, Mandic V and Regimbau T 2013 Phys. Rev. D
[ ] Hobbs G 2008 Class. Quantum Grav.
[ ] Nan R et al Int. J. Mod. Phys. D
[ ] Shannon R M et al Science
[ ] Lentati L et al (EPTA Collaboration) Mon. Not. R. Astron. Soc.
[ ] Arzoumanian Z et al (NANOGrav Collaboration) Astrophys. J.
[ ] Amaro Seoane P et al Class. Quantum Grav.
[ ] Liu J, Guo Z K, Cai R G and Shiu G 2018 Phys. Rev. Lett.
[ ] Easther R, Giblin J T Jr and Lim E A 2007 Phys. Rev. Lett.
[ ] Li F Y, Baker R M L Jr, Fang Z Y, Stephenson G V and Chen Z Y 2008 Eur. Phys. J. C
[ ] Nishizawa A and Hayama K 2013 Phys. Rev. D
[ ] Li J, Zhang L, Lin K and Wen H 2016 Int. J. Theor. Phys.
[ ] Minar M R and Naher J 2018 (https://doi.org//RG.2.2.24831.10403)
[ ] Alsohybe N T, Dahan N A and Ba-Alwi F M 2017 Curr. J. Appl. Sci. Technol.
[ ] Feng J T et al
[ ] Ma S M, Sun X, Wang Y Z and Lin J Y 2018 arXiv:1805.04871v1
[ ] Lee J B et al
[ ] Yao Y et al
[ ] George D and Huerta E A 2018 Phys. Rev. D
[ ] George D and Huerta E A 2018 Phys. Lett. B
[ ] Li F Y, Yang N, Fang Z Y, Baker R M L Jr, Stephenson G V and Wen H 2009 Phys. Rev. D
[ ] Li J, Lin K, Li F Y and Zhong Y H 2011 Gen. Relativ. Gravit.
[ ] Wen H, Li F Y and Fang Z Y 2014 Phys. Rev. D
[ ] Wen H, Li F Y, Fang Z Y and Beckwith A 2014 Eur. Phys. J. C
[ ] Wang L L and Li J 2018 Gravit. Cosmol.