Artificial neural network to estimate the refractive index of a liquid infiltrating a chiral sculptured thin film
Patrick D. McAtee, Satish T.S. Bukkapatnam, Akhlesh Lakhtakia
Patrick D. McAtee (a,*), Satish T.S. Bukkapatnam (b), and Akhlesh Lakhtakia (a)
(a) The Pennsylvania State University, Department of Engineering Science and Mechanics, University Park, PA 16802, USA
(b) Texas A&M University, Department of Industrial and Systems Engineering, College Station, TX 77843, USA
Abstract
We theoretically expanded the capabilities of optical sensing based on surface plasmon resonance in a prism-coupled configuration by incorporating artificial neural networks (ANNs). We used calculations modeling the situation in which an index-matched substrate with a metal thin film and a porous chiral sculptured thin film (CSTF) deposited successively on it is affixed to the base of a triangular prism. When a fluid is brought in contact with the exposed face of the CSTF, the latter is infiltrated. As a result of infiltration, the traversal of light entering one slanted face of the prism and exiting the other slanted face of the prism is affected. We trained two ANNs with differing structures using reflectance data generated from simulations to predict the refractive index of the infiltrant fluid. The best predictions were a result of training the ANN with the simpler structure. With realistic simulated noise, the performance of this ANN is robust.
The ability to accurately detect small concentrations of chemicals and biochemicals, whether toxic or benign, is highly prized in chemical, pharmaceutical, medical, environmental, and food industries [1,2]. Introduction of pathogens and toxins into the human body can arise from intentional contamination of essential infrastructure [3] as well as from unforeseen consequences of their otherwise necessary applications [4]. Even chemicals traditionally thought of as non-toxic in their bulk form, such as gold, could have harmful effects when ingested as nanoparticles [5–7]. Also, the concentrations of various chemicals in solutions and dispersions need to be determined in research as well as industrial laboratories [1, 8, 9].

Sensors of chemicals and biosensors are designed to operate on the basis of several different phenomena, including electrochemical [10, 11], optical [12–14], piezoelectric [15], gravimetric [16], and pyroelectric [17]. Our focus here lies on optical sensors, of which several types exist [18, 19].

One commonly used optical-sensing technique relies on surface plasmon resonance (SPR), which can occur when light interacts with free electrons at a metal/dielectric interface [8]. As a result, a surface-plasmon-polariton (SPP) wave is excited. Changing the relative permittivity of the dielectric material will change the characteristics of the SPP wave, thus allowing for sensing [20]. Let us consider an SPP wave propagating along the $x$ axis guided by the metal/dielectric interface $z = 0$ and suppose that the metal and the partnering dielectric material are isotropic and homogeneous. The metal of relative permittivity $\varepsilon_{\rm met}$ fills the half-space $z < 0$, and the dielectric material of relative permittivity $\varepsilon_{\rm diel}$ fills the half-space $z > 0$. The complex-valued wavenumber of the SPP wave guided by the interface $z = 0$ is given by

$$q = k \sqrt{\frac{\varepsilon_{\rm diel}\,\varepsilon_{\rm met}}{\varepsilon_{\rm diel} + \varepsilon_{\rm met}}}, \tag{1}$$

where $k$ is the free-space wavenumber. This SPP wave is $p$ polarized [21].

In order to evoke SPR in practice, many geometric configurations have been devised to couple incident light to the SPP wave guided by the metal/dielectric interface. The most commonly implemented configuration is a prism-coupled configuration called the Turbadar–Kretschmann–Raether (TKR) configuration [22,23], wherein a metal film of thickness $L_{\rm met}$ and a dielectric film of thickness $L_{\rm diel}$ are successively deposited onto one face of a substrate, thus establishing a sensor chip [31]. The substrate has the same refractive index as a prism of refractive index $n_{\rm prism}$ which exceeds $n_{\rm diel} = \sqrt{\varepsilon_{\rm diel}}$, our assumption being that both $n_{\rm prism}$ and $n_{\rm diel}$ are real and positive. Typically, the prism has a cross-section of a 45°–90°–45° triangle. The second face of the substrate is affixed to the hypotenuse of the prism using an index-matching fluid. A monochromatic $p$-polarized plane wave of free-space wavelength $\lambda$ and intensity $I$ is incident onto one slanted face of the prism at an angle $\phi$ with respect to the normal to that face. The refracted plane wave is incident on the substrate/metal interface (effectively, the prism/metal interface) at an angle $\theta$ with respect to the normal to the interface; in principle, $\theta \in [0^\circ, 90^\circ)$. The plane wave is reflected and exits the other slanted face of the prism. The intensity $I_r$ of the exiting plane wave is measured as a function of $\theta$ by a photodetector, and thereby the reflectance $R = I_r/I$ is deduced as a function of $\theta$. A sharp dip in the graph of $R$ vs. $\theta$ indicates the excitation of an SPP wave at a specific angle denoted by $\theta_{\rm SPP}$, provided that $\sin\theta_{\rm SPP} > n_{\rm diel}/n_{\rm prism}$.
This angle is characteristic of $\varepsilon_{\rm diel}$ and is related to the SPP wavenumber as follows:

$${\rm Re}(q) \simeq k\, n_{\rm prism} \sin\theta_{\rm SPP}. \tag{2}$$

The angle $\theta_{\rm SPP}$ changes with $n_{\rm diel}$. This is the principle of SPR-based sensing, with the partnering dielectric material being the material whose refractive index is the quantity sensed.

Only one dip indicating SPR appears at a fixed free-space wavelength $\lambda = 2\pi/k$, because the chosen metal/dielectric interface can guide just one SPP wave. Only one SPP wave can be excited at a fixed $\lambda$ even if the partnering dielectric material is anisotropic [24, 25].

Theory [20] and subsequent experiments [26–28], however, have confirmed that a periodically nonhomogeneous dielectric material, whether isotropic or anisotropic, partnering a metal in the TKR configuration can support the existence of multiple SPP-wave modes at a fixed $\lambda$. The different SPP-wave modes have their peak field within the partnering dielectric material at different distances from the interface [20]. If the periodically nonhomogeneous dielectric material is porous and is infiltrated by a fluid of refractive index $n_{\rm inf}$, theory shows [29] and experiment has confirmed [30, 31] that the locations of all SPP-wave modes in the graph of $R$ vs. $\theta$ shift with $n_{\rm inf}$, thereby enabling optical-sensing applications.

In addition to SPP-wave modes, waveguide modes [32, 33] may also manifest in the graph of $R$ vs. $\theta$. These waveguide modes differ from SPP-wave modes in that they can be bound to more than one interface, and therefore depend on the thickness of the partnering dielectric material [34].

There could even exist signatures of undiscovered phenomena in the graph of $R$ vs. $\theta$. Therefore, when deducing what $n_{\rm inf}$ generated a specific graph of $R$ vs. $\theta$, it is best to make use of all the features of the graph.
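As a rough numerical illustration of Eqs. (1) and (2) for an isotropic, homogeneous partnering dielectric material, the following Python sketch estimates the angle of the SPR dip. The metal permittivity and prism index used here are illustrative assumptions, not values taken from this paper, and the function names are hypothetical.

```python
import cmath
import math


def spp_relative_wavenumber(eps_diel, eps_met):
    """Return q/k from Eq. (1) for an isotropic, homogeneous metal/dielectric pair."""
    return cmath.sqrt(eps_diel * eps_met / (eps_diel + eps_met))


def spp_angle_deg(eps_diel, eps_met, n_prism):
    """Estimate theta_SPP in degrees from Eq. (2): Re(q) ~ k n_prism sin(theta_SPP)."""
    s = spp_relative_wavenumber(eps_diel, eps_met).real / n_prism
    if not 0.0 < s < 1.0:
        raise ValueError("prism index too low to match the SPP wavenumber")
    return math.degrees(math.asin(s))


# Illustrative values only (assumptions, not the paper's parameters).
EPS_MET = complex(-50.0, 18.0)   # assumed relative permittivity of the metal
N_PRISM = 1.78                   # assumed refractive index of the prism

for n_diel in (1.33, 1.34):
    # A small change in n_diel shifts the estimated dip angle.
    print(n_diel, spp_angle_deg(n_diel ** 2, EPS_MET, N_PRISM))
```

The sketch covers only the single-SPP-wave case of Eqs. (1) and (2); it does not model the multiple SPP-wave modes or waveguide modes that arise with a periodically nonhomogeneous partnering material.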
This calls for the use of artificial neural networks (ANNs), which do not require understanding of the various underlying phenomena to discern a complicated quantitative relationship between them [36].

Currently, ANNs and other machine-learning algorithms are being applied to a multitude of tasks in engineering and medicine [37–46], including inverse optical design [47, 48]. Although ANNs and machine learning have been applied to SPR biosensors [49–53], their use in scenarios to exploit the excitation of multiple guided-wave modes (including multiple SPP-wave modes and waveguide modes) as well as of other polarization-dependent features in the graph of $R$ vs. $\theta$ provides a novel avenue for optical sensing.

The plan of this paper is as follows. Section 2 provides brief introductions to: (i) the calculation of reflectances when a porous chiral sculptured thin film (CSTF) [29,30,54,55] is used as the partnering dielectric material in the TKR configuration, and (ii) our application of ANNs to that optical-sensing scenario [56]. In Sec. 3, we give the parameters of two different ANNs devised by us and the data used to train and test each ANN. Section 4 details the performance of each ANN, and Sec. 5 contains a discussion of the implications of the numerical results.

Suppose that a CSTF of thickness $L_{\rm diel}$ is used as the partnering dielectric material in the TKR configuration and the planar metal/prism interface is identified as the plane $z = 0$. The CSTF comprises closely nested nanohelixes of a dielectric material that were grown parallel to each other by the process of physical vapor deposition. The CSTF is infiltrated by a fluid of refractive index $n_{\rm inf}$ which also fills the half space $z > L_{\rm met} + L_{\rm diel}$. The prism material is taken to fill the half space $z < 0$. A schematic is provided in Fig. 1.

Figure 1: Schematic of the TKR configuration used to calculate $R$ vs. $\theta$ when the partnering dielectric material is a CSTF infiltrated by a fluid of refractive index $n_{\rm inf}$. All layers extend infinitely transverse to the $z$ axis. The fluid extends to $+\infty$ in the $z$ direction and the prism extends to $-\infty$ in the $z$ direction.

The anisotropy and nonhomogeneity of the CSTF are macroscopically quantified by the relative permittivity dyadic [29, 54]

$$\underline{\underline{\varepsilon}}_{\rm diel}(z) = \underline{\underline{S}}_z(z) \cdot \underline{\underline{S}}_y(\chi) \cdot \underline{\underline{\varepsilon}}^{o}_{\rm ref} \cdot \underline{\underline{S}}_y^{-1}(\chi) \cdot \underline{\underline{S}}_z^{-1}(z). \tag{3}$$

Here, the local relative permittivity dyadic

$$\underline{\underline{\varepsilon}}^{o}_{\rm ref} = \hat{\mathbf{u}}_x\hat{\mathbf{u}}_x\,\varepsilon_b + \hat{\mathbf{u}}_y\hat{\mathbf{u}}_y\,\varepsilon_c + \hat{\mathbf{u}}_z\hat{\mathbf{u}}_z\,\varepsilon_a \tag{4}$$

in the material frame captures the local orthorhombicity of the CSTF; the dyadic

$$\underline{\underline{S}}_z(z) = \hat{\mathbf{u}}_z\hat{\mathbf{u}}_z + \left(\hat{\mathbf{u}}_x\hat{\mathbf{u}}_x + \hat{\mathbf{u}}_y\hat{\mathbf{u}}_y\right)\cos\left(\frac{\pi z}{\Omega}\right) + h\left(\hat{\mathbf{u}}_y\hat{\mathbf{u}}_x - \hat{\mathbf{u}}_x\hat{\mathbf{u}}_y\right)\sin\left(\frac{\pi z}{\Omega}\right) \tag{5}$$

captures the rotation of the local relative permittivity dyadic $\underline{\underline{\varepsilon}}_{\rm ref} = \underline{\underline{S}}_y(\chi) \cdot \underline{\underline{\varepsilon}}^{o}_{\rm ref} \cdot \underline{\underline{S}}_y^{-1}(\chi)$ in the laboratory frame about the $z$ axis, with $h \in \{-1, 1\}$ denoting the structural handedness and $2\Omega$ the period; and the dyadic

$$\underline{\underline{S}}_y(\chi) = \hat{\mathbf{u}}_y\hat{\mathbf{u}}_y + \left(\hat{\mathbf{u}}_x\hat{\mathbf{u}}_x + \hat{\mathbf{u}}_z\hat{\mathbf{u}}_z\right)\cos\chi + \left(\hat{\mathbf{u}}_z\hat{\mathbf{u}}_x - \hat{\mathbf{u}}_x\hat{\mathbf{u}}_z\right)\sin\chi \tag{6}$$

represents the locally aciculate morphology of the CSTF, with $\chi > 0^\circ$ denoting the rise angle of the nanohelixes relative to the $xy$ plane. Both $\underline{\underline{\varepsilon}}_{\rm ref}$ and $\underline{\underline{\varepsilon}}^{o}_{\rm ref}$ have the same eigenvalues (denoted by $\varepsilon_a$, $\varepsilon_b$, and $\varepsilon_c$), but their eigenvectors differ. All three parameters $\varepsilon_{a,b,c}$ of the fluid-infiltrated CSTF depend not only on $\lambda$ but also on $n_{\rm inf}$ (which is itself dependent on $\lambda$) [54].
If $\varepsilon_{a,b,c}$ are known for $n_{\rm inf} = 1$ at a specific value of $\lambda$, then a combination of inverse and forward Bruggeman homogenization formalisms can be used to deduce their values for $n_{\rm inf} \neq 1$ at the same $\lambda$ [29].

The electric field phasor of the plane wave incident on the prism/metal interface can be written as [54]

$$\mathbf{E}_{\rm inc}(\mathbf{r}) = \left\{ a_s\left(-\hat{\mathbf{u}}_x\sin\psi + \hat{\mathbf{u}}_y\cos\psi\right) + a_p\left[-\left(\hat{\mathbf{u}}_x\cos\psi + \hat{\mathbf{u}}_y\sin\psi\right)\cos\theta + \hat{\mathbf{u}}_z\sin\theta\right]\right\} \times \exp\left[i k n_{\rm prism}\left(x\cos\psi + y\sin\psi\right)\sin\theta\right] \exp\left(i k n_{\rm prism} z\cos\theta\right), \tag{7}$$

where $i = \sqrt{-1}$, $a_s$ is the amplitude of the $s$-polarized component, and $a_p$ is the amplitude of the $p$-polarized component. The incidence direction is specified by the angles $\theta \in [0^\circ, 90^\circ)$ and $\psi \in [0^\circ, 360^\circ)$. The electric field phasor of the reflected plane wave can be written as [54]

$$\mathbf{E}_{\rm ref}(\mathbf{r}) = \left\{ r_s\left(-\hat{\mathbf{u}}_x\sin\psi + \hat{\mathbf{u}}_y\cos\psi\right) + r_p\left[\left(\hat{\mathbf{u}}_x\cos\psi + \hat{\mathbf{u}}_y\sin\psi\right)\cos\theta + \hat{\mathbf{u}}_z\sin\theta\right]\right\} \times \exp\left[i k n_{\rm prism}\left(x\cos\psi + y\sin\psi\right)\sin\theta\right] \exp\left(-i k n_{\rm prism} z\cos\theta\right), \tag{8}$$

where $r_s$ is the amplitude of the $s$-polarized component and $r_p$ is the amplitude of the $p$-polarized component. The reflectance is defined as

$$R = \frac{|r_s|^2 + |r_p|^2}{|a_s|^2 + |a_p|^2}. \tag{9}$$

The procedure to compute $r_s$ and $r_p$, and therefore $R$, from $a_s$ and $a_p$ as functions of $\lambda$, $\theta$, and $\psi$ is available in detail elsewhere [20]. The procedure to obtain $\varepsilon_{a,b,c}$ as functions of $\lambda$ and $n_{\rm inf}$ is also available in detail elsewhere [29, 54].

ANNs are machine-learning algorithms of a specific type [36]. Machine-learning algorithms are constructed such that they improve in performance over time for a specific task without being explicitly programmed for that task. Typically, an ANN comprises multiple nodes (neurons) organized in several layers arranged in a hierarchy. Each node in a given layer is interconnected with all the nodes in the adjacent layers. These interconnections are represented as numerical values called weights. The designated first layer serves as the input layer and the designated last layer as the output layer.
Each node in a given layer computes a linear combination of the values of the nodes in the previous layer, each value being multiplied by the weight of the corresponding interconnection. In order to account for possible non-linear processes, the linear combination is fed into an activation function, such as the sigmoid or rectifier function. A neuron with no activation function is called a linear neuron, and an ANN composed exclusively of linear neurons will fit a linear model to the data.

ANNs require training and testing before implementation. An ANN is trained on a data set $\mathcal{R}$ of column vectors denoted by $\mathbf{R}_j$, $j \in [1, J]$. $\mathbf{R}_j$ consists of $K$ scalars $R_{jk}$, $k \in [1, K]$. In addition, there are label vectors $\mathbf{n}_j$, each consisting of $L$ scalars $n_{j\ell}$, $\ell \in [1, L]$. Every $\mathbf{R}_j$ is accompanied by a unique $\mathbf{n}_j$, but some of the labels may share the same value when considering noisy data. As an example, two different sensor chips employed in the TKR configuration may produce slightly different reflectance signatures given the same infiltrating liquid, due to differences in sensor-chip quality from the fabrication process.

For a specific $j = j'$, $\mathbf{R}_{j'}$ is fed into the input layer, $R_{j'k}$ being fed to the node labeled $k$ in that layer. With the weights randomized, the label vector $\mathbf{n}^\star_{j'}$ is predicted by the ANN and an error value is calculated based on some predefined error function of $\mathbf{n}_j$ and $\mathbf{n}^\star_j$. In general, $\mathbf{n}^\star_j \neq \mathbf{n}^\star_{j'}$. This error function can be expressed as a function of the weights, since $\mathbf{n}^\star_j$ is a function of the weights. Typically, an ANN uses a gradient-descent method [35] to find the weights such that the average error over $\mathcal{R}$ is sufficiently small. The ANN is then tested on a data set $\bar{\mathcal{R}}$.

In this paper, every testing vector and its elements are identified by the addition of an overbar to the symbol for the corresponding training vector and its elements.
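The training procedure described above can be illustrated with a deliberately small sketch: a single linear neuron (no activation function) whose weights start from random values and are adjusted by batch gradient descent on the mean-squared error. The synthetic data, the learning rate, the epoch count, and all function names are assumptions made for illustration; this is not the network or the data used in this paper.

```python
import random


def train_linear_neuron(inputs, labels, learning_rate=0.2, epochs=5000, seed=0):
    """Fit label ~ w . x + b by batch gradient descent on the mean-squared error."""
    rng = random.Random(seed)
    num_features = len(inputs[0])
    # Weights are randomized at the start of training.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(num_features)]
    bias = 0.0
    count = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0] * num_features
        grad_b = 0.0
        for x, n_true in zip(inputs, labels):
            n_pred = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = n_pred - n_true
            for k in range(num_features):
                grad_w[k] += 2.0 * err * x[k] / count
            grad_b += 2.0 * err / count
        # Step against the gradient of the average error.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def predict(weights, bias, x):
    """Linear combination of the inputs; no activation function is applied."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


# Synthetic stand-in for K = 3 reflectance values per vector and L = 1 label.
TRUE_W, TRUE_B = [0.2, -0.1, 0.05], 1.33
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(20)]
y = [sum(w * xi for w, xi in zip(TRUE_W, x)) + TRUE_B for x in X]

w_fit, b_fit = train_linear_neuron(X, y)
worst_error = max(abs(predict(w_fit, b_fit, x) - t) for x, t in zip(X, y))
```

Because the labels here are an exactly linear function of the inputs, a linear neuron can reproduce them; with real reflectance data the residual error would not vanish in this way.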
In addition, we use $\mathcal{R}_{ab}$ with $a$ and $b$ as placeholders to denote the polarization state (represented by the ratio of $a_s$ to $a_p$) and the angle $\psi$, respectively, of the incident plane wave for which the reflectance data are obtained. This notation is explained in Tables 1 and 2. Next, all styles and fonts of uppercase 'R' represent reflectance data calculated for various polarization states, angles $\theta$, and angles $\psi$. Finally, all styles and fonts of lowercase 'n' represent refractive-index data (i.e., $n_{\rm inf}$) corresponding to the reflectance data. Let us note that $n_{\rm inf}$ denotes the refractive index of the infiltrating liquid in general, whether used for training, testing, both, or neither.

Theorems of machine learning suggest that an algorithm tailored to the needs of the specific application must be sought [57, 58]. Therefore, we trained two ANNs with differing structures for various $\mathcal{R}$ with label vectors $\mathbf{n}_j$ and tested the ANNs for various $\bar{\mathcal{R}}$ with the label vectors $\bar{\mathbf{n}}_j$. For this work, both $\mathbf{n}_j$ and $\bar{\mathbf{n}}_j$ comprise just one element each (i.e., $L = 1$) and therefore are denoted simply as $n_j$ and $\bar{n}_j$, respectively.

Table 1: Label $a$ and polarization state of the incident plane wave in the prism.

    a    a_s           a_p            polarization state
    1    0             1              linear (p)
    2    1             0              linear (s)
    3    1/\sqrt{2}    1/\sqrt{2}     linear (p and s)
    4    1/\sqrt{2}    -1/\sqrt{2}    linear (p and s)
    5    1/\sqrt{2}    i/\sqrt{2}     circular
    6    1/\sqrt{2}    -i/\sqrt{2}    circular

Table 2: Label $b$ and angle $\psi$ chosen for the incident plane wave in the prism.

    b    \psi (deg)
    1    0
    2    18
    3    36
    4    54
    5    72
    6    90

After fixing $\lambda = 635$ nm, various $\mathcal{R}$ were calculated using a CSTF with half-period $\Omega = 200$ nm and thickness $L_{\rm diel} = 1200$ nm made of titanium oxide. When $n_{\rm inf} = 1$, we set $\varepsilon_a = 2.\ldots$, $\varepsilon_b = 3.\ldots$, $\varepsilon_c = 2.\ldots$, and $\chi = 37.\ldots^\circ$, as provided by actual measurements on columnar thin films [59]. Values of $\varepsilon_{a,b,c}$ for $n_{\rm inf} > \ldots$ $n_{\rm met} = \sqrt{\varepsilon_{\rm met}} = 0.\ldots i$ $\ldots$ $L_{\rm met} = 30$ nm [30]. The prism was chosen to be made of SF11 glass so that $n_{\rm prism} = 1.\ldots$ $n_{\rm water} = 1.\ldots$ $\mathcal{R}$ and $\bar{\mathcal{R}}$ for $n_{\rm inf} \in [1.\ldots, \ldots 4]$ for equal increments $\Delta n_{\rm inf} = 0.\ldots$ $n_{\rm inf} \approx \ldots$ $J$. All $\mathcal{R}$ and $\bar{\mathcal{R}}$ were calculated for $\phi \in [-\ldots^\circ, \ldots^\circ]$ in steps of $\Delta\phi = 0.\ldots^\circ$, which can be related [31] to $\theta$ by the standard law of refraction, yielding

$$\theta = 45^\circ + \sin^{-1}\left(n_{\rm prism}^{-1}\sin\phi\right)$$

for the prism with a 45°–90°–45° triangle as its cross section. All data were computed using Mathematica® running on a laptop computer with a 4-core 2.4-GHz processor and 16 GB of RAM. With $\phi$, $n_{\rm inf}$, and $b$ fixed, the average estimated computation time was $\sim\ldots 06$ s for the set of six polarization states specified in Table 1.
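The relation between the external angle $\phi$ and the internal angle $\theta$ quoted above follows from the law of refraction at the slanted face of a 45°–90°–45° prism, and can be sketched in Python as follows. The prism index used in the example is an assumed illustrative value, and the function names are hypothetical.

```python
import math


def theta_from_phi(phi_deg, n_prism):
    """theta = 45 deg + asin(sin(phi) / n_prism) for a 45-90-45 prism."""
    refracted = math.asin(math.sin(math.radians(phi_deg)) / n_prism)
    return 45.0 + math.degrees(refracted)


def phi_from_theta(theta_deg, n_prism):
    """Inverse mapping; valid while |n_prism * sin(theta - 45 deg)| <= 1."""
    s = n_prism * math.sin(math.radians(theta_deg - 45.0))
    return math.degrees(math.asin(s))


N_PRISM = 1.78  # assumed illustrative refractive index, not the paper's value

for phi in (-30.0, 0.0, 30.0):
    # Normal incidence on the slanted face (phi = 0) corresponds to theta = 45 deg.
    print(phi, theta_from_phi(phi, N_PRISM))
```

Because refraction compresses the angular range inside the prism, a given span of $\phi$ maps to a narrower span of $\theta$ centered on 45°.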
SPP-wave modes are capable of being excited by incident light of an arbitrary polarization state in the prism-coupled configuration when a periodically nonhomogeneous dielectric material is partnering a metal thin film [20]. However, experiments have shown that using $p$-polarized incident light results in a higher sensitivity than using $s$-polarized incident light, based on the definition [31]

$$\rho = \frac{\theta_{\rm SPP}(n_{{\rm inf}2}) - \theta_{\rm SPP}(n_{{\rm inf}1})}{n_{{\rm inf}2} - n_{{\rm inf}1}} \tag{10}$$

of sensitivity, where $n_{{\rm inf}\ell}$, $\ell \in \{1, 2\}$, is the refractive index of an infiltrant fluid labeled $\ell$ and $\theta_{\rm SPP}(n_{{\rm inf}\ell})$ is the value of $\theta_{\rm SPP}$ for $n_{{\rm inf}\ell}$. One might then infer that $\mathcal{R}$ from $p$-polarized incident light will yield better training for an ANN. However, since $\rho$ is focused on SPP-wave modes but not on waveguide modes and other phenomena, as mentioned in Sec. 1, we conjectured that $s$-polarized incident light as well as incident light of other polarization states may also be useful for ANN training.

Therefore, our first group of simulations used $\mathcal{R}$ calculated for six different polarization states of incident light with $\psi = 0^\circ$. Consistent with Tables 1 and 2, the six training data sets are denoted by $\mathcal{R}_{a1}$, $a \in [1, 6]$, with $J = 10$ and $K = 250$; thus, 10 values of $n_j$ and 250 values of $\theta$ were used. For $\bar{\mathcal{R}}_{a1}$, $a \in [1, 6]$, $J = 200$ and $K = 250$.

To determine what value of the angle $\psi$ is the most effective, training data sets $\mathcal{R}_{1b}$, $b \in [1, 6]$, were used with $J = 10$ and $K = 250$. For the counterpart testing data sets $\bar{\mathcal{R}}_{1b}$, $b \in [1, 6]$, $J = 200$ and $K = 250$.

Even if one polarization state is more effective than the others, the ANN still may benefit from the inclusion of reflectance data from the other polarization states. We use the notation $\mathfrak{R}_{cb}$ to mean reflectance data that includes polarization states labeled $a \in \{1, 2, \ldots, c\}$; thus, $\mathfrak{R}_{cb} = \mathcal{R}_{1b} \cup \mathcal{R}_{2b} \cup \cdots \cup \mathcal{R}_{cb}$ and $\mathfrak{R}_{11} \equiv \mathcal{R}_{11}$. Fixing $\psi = 0^\circ$, we focused on $\mathfrak{R}_{c1}$. For $\mathfrak{R}_{11}$, $\mathfrak{R}_{21}$, $\mathfrak{R}_{31}$, $\mathfrak{R}_{41}$, $\mathfrak{R}_{51}$, and $\mathfrak{R}_{61}$, we have $J = 10$ and $K = 250$, 500, 750, 1000, 1250, and 1500, respectively. With $J = 200$ for $\bar{\mathfrak{R}}_{c1}$, $K = 250$, 500, 750, 1000, 1250, and 1500 for $c = 1$, 2, 3, 4, 5, and 6, respectively.

With similar reasoning as in Sec. 3.1.3, we write $\mathsf{R}_{ac}$ to mean reflectance data that includes values of $\psi$ corresponding to $b \in \{1, 2, \ldots, c\}$; thus, $\mathsf{R}_{ac} = \mathcal{R}_{a1} \cup \mathcal{R}_{a2} \cup \cdots \cup \mathcal{R}_{ac}$ and $\mathsf{R}_{11} \equiv \mathfrak{R}_{11} \equiv \mathcal{R}_{11}$. Fixing our attention only on incident $p$-polarized light, we focused on $\mathsf{R}_{1c}$. For $\mathsf{R}_{11}$, $\mathsf{R}_{12}$, $\mathsf{R}_{13}$, $\mathsf{R}_{14}$, $\mathsf{R}_{15}$, and $\mathsf{R}_{16}$, $J = 10$ and $K = 250$, 500, 750, 1000, 1250, and 1500, respectively. Each $\bar{\mathsf{R}}_{1c}$ was constituted with $J = 200$ and $K = 250$, 500, 750, 1000, 1250, and 1500 for $c = 1$, 2, 3, 4, 5, and 6, respectively.

Of the multitude of reflectance data sets that can be determined for combinations of polarization states and $\psi$, we only investigated

$$\mathcal{R}_{\rm all} = \left(\mathcal{R}_{11} \cup \mathcal{R}_{12} \cup \cdots \cup \mathcal{R}_{16}\right) \cup \left(\mathcal{R}_{21} \cup \mathcal{R}_{22} \cup \cdots \cup \mathcal{R}_{26}\right) \cup \cdots \cup \left(\mathcal{R}_{61} \cup \mathcal{R}_{62} \cup \cdots \cup \mathcal{R}_{66}\right). \tag{11}$$

For this training data set, $J = 10$ and $K = 9000$. For the corresponding testing data set $\bar{\mathcal{R}}_{\rm all}$, $J = 200$ and $K = 9000$.

For the data set that yielded the best ANN performance, noise was added to that set to simulate experimental data. The corresponding noisy data set is denoted by $\tilde{\mathcal{R}}$ carrying the indices of the set that yielded the best performance. As shown in Sec. 4, the best-performance set was $\mathcal{R}_{16}$. The noise was simulated as a random number between $-0.025$ and $0.025$ added to each element $R_{jk} \in \mathbf{R}_j \in \mathcal{R}_{16}$ for 10 different instances. The upper and lower bounds of the random numbers were chosen based on the average absolute difference of reflectance data between four different 30-nm-thick aluminum films implemented in a specific TKR apparatus, as shown in Fig. 2. Thus, for $\tilde{\mathcal{R}}_{16}$, $J = 10 \times 10 = 100$ and $K = 250$. For the corresponding testing data set, $J = 200 \times 10 = 2000$ and $K = 250$.

Figure 2: Reflectance as a function of $\theta$ measured for four different samples in the TKR configuration, each with a 30-nm-thick aluminum film and air as the partnering dielectric material. Each data point from a given plot was compared to the data point for the same $\theta$ from all other plots. The absolute difference of all these measurements was averaged and calculated to be 0.05. This magnitude was split about 0, yielding $-0.025$ for the lower bound and $0.025$ for the upper bound of random numbers used to add noise to $\mathcal{R}_{16}$.

Up until this stage, each training and testing data set has included reflectance data from $\theta$ ranging from about 30° to about 60°. With inspiration from conventional SPR occurring at the interface of a metal and a homogeneous dielectric material, we devised a data set denoted by $\mathcal{R}_{\rm SPR}$ with $\theta$ ranging between 0° and 70° in steps of 0.1°, with $a = 1$, $b = 1$, $J = 10$, $K = 701$, and with no noise added. In order to ensure that only the occurrences of SPR were captured in $\mathcal{R}_{\rm SPR}$, the reflectances were calculated with $L_{\rm diel} = 1200$ nm and $L_{\rm diel} = 1600$ nm, and only those dips that differed by less than $0.\ldots^\circ$ for the two values of $L_{\rm diel}$ were retained [30, 31]. Thus most entries in the column vectors in $\mathcal{R}_{\rm SPR}$ were zero. The corresponding testing data set $\bar{\mathcal{R}}_{\rm SPR}$ was constituted with $J = 200$ and $K = 701$. This data set would allow us to determine the relative efficacy of SPR alone for training ANNs.

All training and testing was done using MATLAB® running on a laptop computer with a 4-core 2.4-GHz processor and 16 GB of RAM. In all, training took no more than a day. Each training data set was used for two separate ANN structures. The first type of ANN structure, denoted by ANN$_1$, consisted of three layers: an input layer, one hidden layer, and an output layer. The hidden layer contained 100 nodes with no activation function (linear neurons) and the output layer contained one node. The second ANN structure, denoted by ANN$_2$, consisted of four layers: an input layer, two hidden layers, and an output layer. The two hidden layers each contained 100 nodes with rectifier activation functions, and the output layer contained one node. The input layer for both ANN$_1$ and ANN$_2$ consisted of one vector. The size of this vector was either 250, 500, 750, 1000, 1250, 1500, 9000, or 701 depending on the specific data set being used for training. The stochastic gradient-descent with momentum method [61] was chosen for optimization with an initial learning rate of 0.01 and the error function defined as the mean-squared error. The maximum number of epochs was 10,000.

The mean $\mu$, standard deviation $\delta$, median $\sigma$, maximum $M$, and minimum $m$ of $|n^\star_j - \bar{n}_j|$ for each testing data set were used to assess the performance of each ANN. In addition, for several instances of training for any given ANN with a particular structure and training data set, the performance of that ANN given the same testing data set may vary because the weights are randomized at the start of each training instance. Therefore, given a particular training data set, we trained each ANN for ten instances and averaged the aforementioned statistical measures over all of the training instances. Thus, hereafter, the symbols $\mu$, $\delta$, $\sigma$, $M$, and $m$ denote performance measures averaged over ten training instances. Ideally, all five performance measures should be as close to 0 as possible.

Values of all five performance measures for every training data set for ANN$_1$ are listed in Table 3. This table is divided into six blocks, one each for $\mathcal{R}_{a1}$, $\mathcal{R}_{1b}$, $\mathfrak{R}_{c1}$, $\mathsf{R}_{1c}$, $\mathcal{R}_{\rm all}$, and $\tilde{\mathcal{R}}_{16}$ (1, 2, 3, 4, 5, and 6, respectively). Within each of the first four blocks, those training sets which yielded the lowest value for each performance measure are highlighted by a colored background. In addition, the overall lowest value for each performance measure is identified in boldface.

Table 3: Values for $\mu$, $\delta$, $\sigma$, $M$, and $m$ for ANN$_1$. Note that $\mathcal{R}_{11}$, $\mathfrak{R}_{11}$, and $\mathsf{R}_{11}$ are equivalent by virtue of the definitions in Sec. 3.1.

Values of all five performance measures for every training data set for ANN$_2$ are listed in Table 4. This table is divided into six blocks, one each for $\mathcal{R}_{a1}$, $\mathcal{R}_{1b}$, $\mathfrak{R}_{c1}$, $\mathsf{R}_{1c}$, $\mathcal{R}_{\rm all}$, and $\tilde{\mathcal{R}}_{16}$ (1, 2, 3, 4, 5, and 6, respectively). Within each of the first four blocks, those training sets which yielded the lowest value for each performance measure are highlighted by a colored background. In addition, the overall lowest value for each performance measure is identified in boldface.

Table 4: Values for $\mu$, $\delta$, $\sigma$, $M$, and $m$ for ANN$_2$. Note that $\mathcal{R}_{11}$, $\mathfrak{R}_{11}$, and $\mathsf{R}_{11}$ are equivalent by virtue of the definitions in Sec. 3.1.

Training of ANN$_1$ for each $\mathcal{R}$ provided a linear function to predict $n_{\rm inf}$. By having three out of five performance measures as the lowest, we may define $\mathcal{R}_{16}$ as providing the best training for ANN$_1$. We also note that changing $\psi$ does not change the performance of ANN$_1$ as much as changing the polarization state. The same statement holds for the addition of more $\psi$ values versus the addition of more polarization states.

Training of ANN$_2$ for each $\mathcal{R}$ provided a non-linear function to predict $n_{\rm inf}$. By having four out of five performance measures as the lowest, a single training data set likewise emerges as providing the best training for ANN$_2$. If we compare ANN$_2$ trained with that set against ANN$_1$ trained with $\mathcal{R}_{16}$, the latter pair performs better, despite the former training set containing more data and the latter ANN having a simpler structure.

Training of ANN$_1$ with $\mathcal{R}_{\rm SPR}$ yielded $\mu = 220.\ldots \times 10^{-\ldots}$, $\delta = 156.\ldots \times 10^{-\ldots}$, $\sigma = 211.\ldots \times 10^{-\ldots}$, $m = 0.\ldots \times 10^{-\ldots}$, and $M = 507.\ldots \times 10^{-\ldots}$, while training of ANN$_2$ with $\mathcal{R}_{\rm SPR}$ yielded $\mu = 251.\ldots \times 10^{-\ldots}$, $\delta = 145.\ldots \times 10^{-\ldots}$, $\sigma = 251.\ldots \times 10^{-\ldots}$, $m = 2.\ldots \times 10^{-\ldots}$, and $M = 499.\ldots \times 10^{-\ldots}$. Overall, both sets of measures are significantly worse compared to training with any other $\mathcal{R}$.
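The five performance measures used in these comparisons, together with the uniform-noise model described in Sec. 3, can be sketched in Python as follows. The use of the population standard deviation is an assumption (the text does not say which estimator was used), and the function names and sample numbers are hypothetical.

```python
import random
import statistics


def performance_measures(predicted, actual):
    """Mean, standard deviation, median, maximum, and minimum of |n_pred - n_true|."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return {
        "mean": statistics.fmean(errors),
        "std": statistics.pstdev(errors),   # assumed: population standard deviation
        "median": statistics.median(errors),
        "max": max(errors),
        "min": min(errors),
    }


def average_over_instances(per_instance):
    """Average each measure over several training instances."""
    keys = per_instance[0].keys()
    return {k: statistics.fmean(m[k] for m in per_instance) for k in keys}


def add_uniform_noise(vector, half_width=0.025, rng=random):
    """Add an independent uniform random number in [-half_width, half_width] to each entry."""
    return [value + rng.uniform(-half_width, half_width) for value in vector]


# Hypothetical predictions and labels, for illustration only.
demo = performance_measures([1.0, 2.0, 4.0], [1.0, 1.0, 1.0])
noisy = add_uniform_noise([0.2, 0.5, 0.9], rng=random.Random(0))
```

Averaging the measures over instances mirrors the ten training instances described in the text, since each instance starts from different random weights.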
We conclude that sensing only with SPR data seriously undermines the ability of an ANN to correctly predict $n_{\rm inf}$.

When noise consistent with actual experimental data was added, we found that the performance of a simple ANN with a small number of training examples yielded no more than a $0.\ldots$ $n_{\rm inf}$ based on the fact that $M = 36.\ldots \times 10^{-\ldots}$ for ANN$_1$ trained with $\tilde{\mathcal{R}}_{16}$. Thus, use of ANNs provides a level of immunity to measurement noise.

A typical method for sensing is via SPR using the TKR configuration. We have simulated reflectance data from a liquid-infiltrated CSTF partnering a metal thin film in the TKR configuration and used these data to train two ANNs with differing structures. The performance measures of the ANNs for many training instances were compared. The various training data sets contained reflectance data calculated for various combinations of the polarization state and the angle $\psi$. Some of the training data were complicated by realistic noise. We stress here that our work pertains directly to the sensing of $n_{\rm inf}$ and only indirectly to the identification of the infiltrant fluid, functionalization [8] being required for the latter purpose.

One main conclusion is that $n_{\rm inf}$ can best be predicted from $p$-polarized light with $\psi = 90^\circ$, with an ANN having no activation function. This instance represents a best-case scenario. It will require the use of either a triangular prism with a very broad base or a hemispherical prism because the value of $\psi$ is very high.

Another main conclusion of this paper is that the inclusion of other reflectance data in addition to SPR data greatly improves the performance of an ANN. Given the simplicity and heuristic choice of the ANN structure and the relatively small number of training examples compared to the testing examples used for this work, we are optimistic that significant improvement in the performance can be achieved in the future by adding more training examples and refining the ANN structure.
Eventually, the application of ANNs may engender an era of simultaneous multianalyte sensing [30]. We also expect our ANN methodology to apply when SPP waves are manipulated by active functional materials and phase-change materials for enhanced sensitivity [62, 63].

Appendix A: MATLAB® codes for Artificial Neural Networks

The variables n and mb represent the input layer size and mini-batch size, respectively. The mini-batch size was that of the training data set used for that instance.

A.1: ANN$_1$

    % One hidden layer of 100 linear neurons (no activation function)
    layers = [ ...
        sequenceInputLayer(n)
        fullyConnectedLayer(100)
        fullyConnectedLayer(1)
        regressionLayer];
    options = trainingOptions('sgdm','InitialLearnRate',0.01, ...
        'MaxEpochs',10000,...
        'MiniBatchSize',mb)

A.2: ANN$_2$

    % Two hidden layers of 100 neurons, each followed by a rectifier (ReLU)
    layers = [ ...
        sequenceInputLayer(n)
        fullyConnectedLayer(100)
        reluLayer
        fullyConnectedLayer(100)
        reluLayer
        fullyConnectedLayer(1)
        regressionLayer];
    options = trainingOptions('sgdm','InitialLearnRate',0.01, ...
        'MaxEpochs',10000,...
        'MiniBatchSize',mb)

Acknowledgments. The research of P. D. McAtee and A. Lakhtakia is funded by the Charles Godfrey Binder Endowment at the Pennsylvania State University.
References

[1] C. McDonagh, C. S. Burke, and B. D. MacCraith, “Optical chemical sensors,” Chemical Reviews, 400–422 (2008). [doi: 10.1021/cr068102g].
[2] A. P. F. Turner, “Biosensors: sense and sensibility,” Chemical Society Reviews, 3184–3196 (2013). [doi: 10.1039/c3cs35528d].
[3] D. V. Lim, J. M. Simpson, E. A. Kearns, and M. F. Kramer, “Current and developing technologies for monitoring agents of bioterrorism and biowarfare,” Clinical Microbiology Reviews, 583–607 (2005). [doi: 10.1128/CMR.18.4.583-607.2005].
[4] N. Verma and A. Bhardwaj, “Biosensor technology for pesticides – A review,” Applied Biochemistry and Biotechnology, 3093–3119 (2014). [doi: 10.1007/s12010-015-1489-2].
[5] A. Orlando, M. Colombo, D. Prosperi, F. Corsi, A. Panariti, I. Rivolta, M. Masserini, and E. Cazzaniga, “Evaluation of gold nanoparticles biocompatibility: a multiparametric study on cultured endothelial cells and macrophages,” Journal of Nanoparticle Research, 58 (2016). [doi: 10.1007/s11051-016-3359-4].
[6] I. Fratoddi, I. Venditti, C. Cametti, and M. V. Russo, “How toxic are gold nanoparticles? The state of the art,” Nano Research, 1771–1799 (2015). [doi: 10.1007/s12274-014-0697-3].
[7] A. M. Alkilany and C. J. Murphy, “Toxicity and cellular uptake of gold nanoparticles: what have we learned so far?,” Journal of Nanoparticle Research, 2313–2333 (2010). [doi: 10.1007/s11051-010-9911-8].
[8] J. Homola, “Surface plasmon resonance sensors for detection of chemical and biological species,” Chemical Reviews, 462–493 (2008). [doi: 10.1021/cr068107d].
[9] H. Malekzad, P. S. Zangabad, H. Mohammad, M. Sadroddini, Z. Jafari, N. Mahlooji, S. Abbaspour, S. Gholami, M. G. Houshangi, R. Pashazadeh, A. Beyzavi, M. Karimi, and M. R. Hamblin, “Noble metal nanostructures in optical biosensors: Basics, and their introduction to anti-doping detection,” Trends in Analytical Chemistry, 116–135 (2018). [doi: 10.1016/j.trac.2017.12.006].
[10] J. R. Stetter and J. Li, “Amperometric gas sensors – A review,” Chemical Reviews, 352–366 (2008). [doi: 10.1021/cr0681039].
[11] S. Cosnier, Electrochemical Biosensors, Pan Stanford, New York (2015).
[12] I. Abdulhalim, M. Zourob, and A. Lakhtakia, “Surface plasmon resonance for biosensing: A mini-review,” Electromagnetics, 214–242 (2008). [doi: 10.1080/02726340801921650].
[13] T. Taliercio, F. G.-P. Flores, F. B. Barho, M. J. Milla–Rodrigo, M. Bomers, L. Cerutti, and E. Tournié, “Plasmonic bio-sensing based on highly doped semiconductors,” Proceedings of SPIE, 103530S (2017). [doi: 10.1117/12.2274303].
[14] M. Arjmand, H. Saghafifar, M. Alijanianzadeh, and M. Soltanolkotabi, “A sensitive tapered-fiber optic biosensor for the label-free detection of organophosphate pesticides,” Sensors and Actuators B: Chemical, 523–532 (2017). [doi: 10.1016/j.snb.2017.04.121].
[15] P. Skládal, “Piezoelectric biosensors,” Trends in Analytical Chemistry, 127–133 (2016). [doi: 10.1016/j.trac.2015.12.009].
[16] M. DeMiguel–Ramos, B. Díaz–Durán, J.-M. Escolano, M. Barba, T. Mirea, J. Olivares, M. Clement, and E. Iborra, “Gravimetric biosensor based on a 1.3 GHz AlN shear-mode solidly mounted resonator,” Sensors and Actuators B: Chemical, 1282–1288 (2017). [doi: 10.1016/j.snb.2016.09.079].
[17] A. Davidson, A. Buis, and I. Glesk, “Toward novel wearable pyroelectric temperature sensor for medical applications,” IEEE Sensors Journal, 6682–6689 (2017). [doi: 10.1109/JSEN.2017.2744181].
[18] A. Rasooly and K. E. Herold (eds.), Biosensors and Biodetection: Methods and Protocols, Volume 503: Optical-Based Detectors, Humana Press, New York (2009).
[19] M. Zourob and A. Lakhtakia (eds.), Optical Guided-wave Chemical and Biosensors, Vols. 1 and 2, Springer, Heidelberg, Germany (2010).
[20] J. A. Polo Jr., T. G. Mackay, and A. Lakhtakia, Electromagnetic Surface Waves: A Modern Perspective, Elsevier, Waltham, Massachusetts (2013).
[21] H. J. Simon, D. E. Mitchell, and J. G. Watson, “Surface plasmons in silver films – a novel undergraduate experiment,” American Journal of Physics, 630–636 (1975). [doi: 10.1119/1.9764].
[22] T. Turbadar, “Complete absorption of light by thin metal films,” Proceedings of the Physical Society, 40–44 (1959). [doi: 10.1088/0370-1328/73/1/307].
[23] E. Kretschmann and H. Raether, “Radiative decay of non radiative surface plasmons excited by light,” Zeitschrift für Naturforschung A, 2135–2136 (1968). [doi: 10.1515/zna-1968-1247].
[24] G. J. Sprokel, “The reflectivity of a liquid crystal cell in a surface plasmon experiment,” Molecular Crystals and Liquid Crystals, 39–45 (1981). [doi: 10.1080/00268948108073551].
[25] G. J. Sprokel, R. Santo, and J. D. Swalen, “Determination of the surface tilt angle by attenuated total reflection,” Molecular Crystals and Liquid Crystals, 29–38 (1981). [doi: 10.1080/00268948108073550].
[26] Devender, D. P. Pulsifer, and A. Lakhtakia, “Multiple surface plasmon polariton waves,” Electronics Letters, 1137–1138 (2009). [doi: 10.1117/1.3249629].
[27] A. Lakhtakia, Y.-J. Jen, and C.-F. Lin, “Multiple trains of same-color surface plasmon-polaritons guided by the planar interface of a metal and a sculptured nematic thin film. Part III: Experimental evidence,” Journal of Nanophotonics, 033506 (2009). [doi: 10.1117/1.3249629].
[28] T. H. Gilani, N. Dushkina, W. L. Freeman, M. Z. Numan, D. N. Talwar, and D. P. Pulsifer, “Surface plasmon resonance due to the interface of a metal and a chiral sculptured thin film,” Optical Engineering, 120503 (2010). [doi: 10.1117/1.3525282].
[29] T. G. Mackay and A. Lakhtakia, “Modeling chiral sculptured thin films as platforms for surface-plasmonic-polaritonic optical sensing,” IEEE Sensors Journal, 273–280 (2012). [doi: 10.1109/JSEN.2010.2067448].
[30] S. E. Swiontek, D. P. Pulsifer, and A. Lakhtakia, “Optical sensing of analytes in aqueous solutions with multiple surface-plasmon-polariton-wave platform,” Scientific Reports, 1409 (2013). [doi: 10.1038/srep01409].
[31] S. E. Swiontek and A.
Lakhtakia, “Influence of silver-nanoparticle layer in a chiral sculptured thinfilm for surface-multiplasmonic sensing of analytes in aqueous solution,” Journal of Nanophotonics ,033008 (2016). [doi: 10.1117/1.JNP.10.033008].[32] D. Marcuse, Theory of Dielectric Optical Waveguides , Academic Press, San Diego, California (1991).[33] T. Khaleque and R. Magnusson, “Light management through guided-mode resonances in thin-film siliconsolar cells,”
Journal of Nanophotonics , 083995 (2014). [doi: 10.1117/1.JNP.8.083995].[34] L. Liu, M. Faryad, A. S. Hall, G. D. Barber, S. Erten, T. E. Mallouk, A. Lakhtakia, and T. S.Mayer, “Experimental excitation of multiple surface plasmon-polariton waves and waveguide modesin a one-dimensional photonic crystal atop a two-dimensional metal grating,” Journal of Nanophotonics , 093593 (2015). [doi: 10.1117/1.JNP.9.093593].[35] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propa-gation,” In: D. E. Rumelhart and J. L. McClelland (eds), Parallel Distributed Processing: Explorationsin the Microstructure of Cognition. Volume 1: Foundations , pp. 318–362, MIT Press, Cambridge, Mas-sachusetts (1986).[36] I. Goodfellow, Y. Bengio, and A. Courville,
Deep Learning , MIT Press, Cambridge, Massachusetts(2016). 1237] S. T. S. Bukkapatnam, A. Lakhtakia, and S. R. T. Kumara, “Chaotic neurons for on-line qualitycontrol in manufacturing,”
International Journal of Advanced Manufacturing Technology , 95–100(1997). [doi: 10.1007/BF01225755].[38] S. T. S. Bukkapatnam, S. R. T. Kumara, and A. Lakhtakia, “Fractal estimation of flank wear in turning,” Journal of Dynamic Systems, Measurement, and Control , 89–94 (2000). [doi: 10.1115/1.482446].[39] S. T. S. Bukkapatnam, S. R. T. Kumara, and A. Lakhtakia, “Analysis of acoustic emission sig-nals in machining,”
Journal of Manufacturing Science and Engineering , 568–576 (1999). [doi:10.1115/1.2833058].[40] S. T. S. Bukkapatnam, A. Lakhtakia, and S. R. T. Kumara, “Analysis of sensor signals showsturning on a lathe exhibits low-dimensional chaos,”
Physical Review E , 2375–2387 (1995).[doi:10.1103/PhysRevE.52.2375].[41] J. J. Braun, Y. Glina, J. K. Su, and T. J. Dasey, “Computational intelligence in biological sensing,”
Proceedings of SPIE , 111–122 (2004). [doi: 10.1117/12.541046].[42] S. Chakrabartty and Y. Liu, “Towards reliable multi-pathogen biosensors using high-dimensional en-coding and decoding techniques,”
Proceedings of SPIE , 703514 (2008). [doi: 10.1117/12.799358].[43] V. A. Saetchnikov, E. A. Tcherniavskaia, G. Schweiger, and A. Ostendorf, “Classification of antibi-otics by neural network analysis of optical resonance data of whispering gallery modes in dielectricmicrospheres,”
Proceedings of SPIE , 84240Q (2012). [doi: 10.1117/12.920397].[44] P. H. Rogers, K. D. Benkstein, and S. Semancik, “Machine learning applied to chemical analysis:Sensing multiple biomarkers in simulated breath using a temperature-pulsed electronic-nose,”
AnalyticalChemistry , 9774–9781 (2012). [doi: 10.1021/ac301687j].[45] N. Maleki, S. Kashanian, E. Maleki, and M. Nazari, “A novel enzyme based biosensor for catecholdetection in water samples using artificial neural network,” Biochemical Engineering Journal , 1–11(2017). [doi: 10.1016/j.bej.2017.09.005].[46] K. N. Mutter, “Hopfield neural network and optical fiber sensor as intelligent heart rate monitor,”
Proceedings of SPIE , 104564T (2018). [doi: 10.1117/12.2283012].[47] W. Ma, F. Cheng, and Y. Liu, “Deep-learning-enabled on-demand design of chiral metamaterials,”
ACSNano , 6326–6334 (2018). [doi: 10.1021/acsnano.8b03569].[48] D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design ofnanophotonics structures,” ACS Photonics , 1365–1369 (2018). [doi: 10.1021/acsphotonics.7b01377].[49] M. R. H. Nezhad, J. Tashkhourian, J. Khodaveisi, and M. R. Khoshi, “Simultaneous colorimetricdetermination of dopamine and ascorbic acid based on the surface plasmon resonance band of colloidalsilver nanoparticles using artificial neural networks,” Analytical Methods , 1263–1269 (2010). [doi:10.1039/C0AY00302F].[50] J. Ma, Y. Cao, K. Liu, X. Huang, J. Jiang, T. Wang, M. Xue, P. Chang, and T. Liu, ‘ “A simpledemodulation algorithm for optical SPR sensor based on all-phase low-pass filters,” Proceedings ofSPIE , 106180N (2018). [doi: 10.1117/12.2281236].[51] J. Khodaveisi, S. Dadfarnia, A. M. H. Shabani, M. R. Moghadam, and M. R. H. Nezhad, “Artificialneural network assisted kinetic spectrophotometric technique for simultaneous determination of parac-etamol and p-aminophenol in pharmaceutical samples using localized surface plasmon resonance bandof silver nanoparticles,”
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy ,474–480 (2015). [doi: 10.1016/j.saa.2014.11.094].1352] S. Yu, J. Wang, T. Zhang, R. Zhou, J. Dai, Y. Zhou, and K. Xu, “Performance optimization forplasmonic refractive index sensor based on machine learning ,”
Proceedings of SPIE , 110482X(2019). [doi: 10.1117/12.2519699].[53] F. Bahrami, M. Maisonneuve, M. Meunier, J. S. Aitchison, and M. Mojahedi, “An improved refractiveindex sensor based on genetic optimization of plasmon waveguide resonance,”
Optics Express , 20863–20872 (2013). [doi: 10.1364/OE.21.020863].[54] A. Lakhtakia, “Enhancement of optical activity of chiral sculptured thin films by suitable infiltration ofvoid regions,” Optik , 145–148 (2001). [doi: 10.1078/0030-4026-00024].[55] A. Lakhtakia, “Erratum: Enhancement of optical activity of chiral sculptured thin films by suitableinfiltration of void regions,”
Optik , 544 (2001). [doi: 10.1078/0030-4026-00024].[56] P. D. McAtee, S. T. S. Bukkapatnam, and A. Lakhtakia, “Artificial neural network to predict therefractive index of a liquid infiltrating a chiral sculptured thin film,”
Proceedings of SPIE ,107280G (2018). [doi: 10.1117/12.2321355].[57] D. H. Wolpert, “The lack of a priori distinctions between learning algorithms,”
Neural Computation ,1341–1390 (1996). [doi: 10.1162/neco.1996.8.7.1341].[58] D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization,” IEEE Transactions onEvolutionary Computation , 67–82 (1997). [doi: 10.1109/4235.585893].[59] I. Hodgkinson, Q. h. Wu, and J. Hazel, “Empirical equations for the principal refractive indices andcolumn angle of obliquely deposited films of tantalum oxide, titanium oxide, and zirconium oxide,” Applied Optics , 2653–2659 (1998). [doi: 10.1364/AO.37.002653].[60] https://refractiveindex.info/?shelf=main&book=Au&page=Johnson (accessed on July 11 2018).[61] J. Patterson and A. Gibson, Deep Learning , O’Reilly Media, Sebastopol, California (2017).[62] D. Rodrigo, O. Limaj, D. Janner, D. Etezadi, F. J. Garc´ıa de Abajo, V. Pruneri, and H. Altug,“Mid-infrared plasmonic biosensing with graphene,”
Science , 165–168 (2015). [doi: 10.1126/sci-ence.aab2051].[63] K. V. Sreekanth, Q. Ouyang, S. Sreejith, S. Zeng, W. Lishu, E. Ilker, W. Dong, M. ElKabbash, Y.Ting C. T. Lim, M. Hinczewski, G. Strangi, K.-T. Yong, R. E. Simpson, and R. Singh, “Phase-change-material-based low-loss visible-frequency hyperbolic metamaterials for ultrasensitive label-free biosens-ing,”
Advanced Optical Materials7